Journal of Functional Analysis 257 (2009) 1–19 www.elsevier.com/locate/jfa
Hardy type inequality and application to the stability of degenerate stationary waves Shuichi Kawashima a,∗ , Kazuhiro Kurata b a Faculty of Mathematics, Kyushu University, Fukuoka 812-8581, Japan b Department of Mathematics and Information Sciences, Tokyo Metropolitan University, Hachioji, Tokyo 192-03, Japan
Received 6 August 2008; accepted 7 April 2009
Communicated by C. Villani
Abstract

This paper is concerned with the asymptotic stability of degenerate stationary waves for viscous conservation laws in the half space. It is proved that the solution converges to the corresponding degenerate stationary wave at the rate $t^{-\alpha/4}$ as $t \to \infty$, provided that the initial perturbation is in the weighted space $L^2_\alpha = L^2(\mathbb{R}_+; (1+x)^\alpha)$ for $\alpha < \alpha_c(q) := 3 + 2/q$, where $q$ is the degeneracy exponent. This restriction on $\alpha$ is best possible in the sense that the corresponding linearized operator cannot be dissipative in $L^2_\alpha$ for $\alpha > \alpha_c(q)$. Our stability analysis is based on the space–time weighted energy method combined with a Hardy type inequality with the best possible constant. © 2009 Elsevier Inc. All rights reserved.

Keywords: Viscous conservation laws; Degenerate stationary waves; Asymptotic stability; Hardy inequality
* Corresponding author. E-mail addresses: [email protected] (S. Kawashima), [email protected] (K. Kurata).
0022-1236/$ – see front matter © 2009 Elsevier Inc. All rights reserved. doi:10.1016/j.jfa.2009.04.003

1. Introduction

We study the stability problem of degenerate stationary waves for viscous conservation laws in the half space $x > 0$:
\[ u_t + f(u)_x = u_{xx}, \qquad u(0,t) = -1, \qquad u(x,0) = u_0(x). \tag{1.1} \]
Here the initial function is assumed to satisfy $u_0(x) \to 0$ as $x \to \infty$, and $f(u)$ is a smooth function of the form
\[ f(u) = \frac{1}{q}(-u)^{q+1}\bigl(1 + g(u)\bigr), \qquad f''(u) > 0 \ \text{ for } -1 \le u < 0, \tag{1.2} \]
where $q$ is a positive integer (degeneracy exponent) and $g(u) = O(|u|)$ for $u \to 0$. Since $f(0) = f'(0) = 0$ and $f(u)$ is strictly convex for $-1 \le u < 0$, we see that $f(u) > 0$ for $-1 \le u < 0$. In particular, we have $1 + g(u) > 0$ for $-1 \le u \le 0$. In this situation, the corresponding stationary problem admits a unique solution $\phi(x)$ (called degenerate stationary wave), which verifies
\[ \phi_x = f(\phi), \qquad \phi(0) = -1, \qquad \phi(x) \to 0 \ \text{ as } x \to \infty. \tag{1.3} \]
We see easily that $\phi(x)$ behaves like $\phi(x) \sim -(1+x)^{-1/q}$. In particular, we have $\phi(x) = -(1+x)^{-1/q}$ when $g(u) \equiv 0$. To discuss the stability of the degenerate stationary wave $\phi(x)$, we introduce the perturbation $v$ by $u(x,t) = \phi(x) + v(x,t)$ and rewrite the problem (1.1) as
\[ v_t + \bigl(f(\phi + v) - f(\phi)\bigr)_x = v_{xx}, \qquad v(0,t) = 0, \qquad v(x,0) = v_0(x), \tag{1.4} \]
where $v_0(x) = u_0(x) - \phi(x)$, and $v_0(x) \to 0$ as $x \to \infty$.

The stability of degenerate stationary waves was first studied in [15]. It was proved in [15] that if the initial perturbation $v_0(x)$ is in the weighted space $L^2_\alpha$, then the perturbation $v(x,t)$ decays in $L^2$ at the rate $t^{-\alpha/4}$ as $t \to \infty$, provided that $\alpha < \alpha_*(q)$, where $\alpha_*(q) := \bigl(q + 1 + \sqrt{3q^2 + 4q + 1}\bigr)/q$. The decay rate $t^{-\alpha/4}$ obtained in [15] would be optimal but the restriction $\alpha < \alpha_*(q)$ was not very sharp. The main purpose of this paper is to relax this restriction. Indeed, by employing the space–time weighted energy method in [15] and by applying a Hardy type inequality with the best possible constant (see Proposition 2.3), we show the same decay rate $t^{-\alpha/4}$ under the weaker restriction $\alpha < \alpha_c(q) := 3 + 2/q$ (see Theorem 4.1). Notice that $\alpha_*(q) < \alpha_c(q)$. It is interesting to note that a similar restriction on the weight is imposed also for the stability of degenerate shock profiles (see [10]).

We remark that our stability result for degenerate stationary waves is completely different from those for the non-degenerate case. In fact, for non-degenerate stationary waves, we have the better decay rate $t^{-\alpha/2}$ for the perturbation without any restriction on $\alpha$. See [4–6,14,16] for the details. See also [2,7,9,11] for the related stability results for stationary waves.

In this paper we also discuss the dissipativity of the following linearized operator associated with (1.4):
\[ Lv = v_{xx} - \bigl(f'(\phi)v\bigr)_x. \tag{1.5} \]
In a simpler situation including the case $g(u) \equiv 0$ in (1.2), we show that the operator $L$ is uniformly dissipative in $L^2_\alpha$ for $\alpha < \alpha_c(q)$ but cannot be dissipative for $\alpha > \alpha_c(q)$ (see Theorem 3.5). This suggests that the exponent $\alpha_c(q)$ is the critical exponent of the stability problem of degenerate stationary waves. This result on the characterization of the dissipativity of $L$ is an improvement on the previous one in [15] and is established by using a Hardy type inequality with the best possible constant (see Proposition 2.3).

The contents of this paper are as follows. In Section 2 we introduce several Hardy type inequalities and discuss the best possibility of their constants. In Section 3 we discuss the dissipativity of the operator $L$ in (1.5) in weighted $L^2$ spaces. Finally in Section 4, we study the nonlinear stability of degenerate stationary waves.

Notations. For $1 \le p \le \infty$, $L^p = L^p(\mathbb{R}_+)$ denotes the usual Lebesgue space on $\mathbb{R}_+ = (0,\infty)$ with the norm $\|\cdot\|_{L^p}$. Let $s$ be a nonnegative integer. Then the Sobolev space $W^{s,p} = W^{s,p}(\mathbb{R}_+)$ is defined by $W^{s,p} = \{u \in L^p;\ \partial_x^k u \in L^p \text{ for } k \le s\}$ with the norm $\|\cdot\|_{W^{s,p}}$. When $p = 2$, we write $H^s = W^{s,2}$.

Next we introduce weighted spaces. Let $w = w(x) > 0$ be a weight function defined on $[0,\infty)$ such that $w \in C^0[0,\infty)$. Then, for $1 \le p < \infty$, we denote by $L^p(w)$ the weighted $L^p$ space on $\mathbb{R}_+$ equipped with the norm
\[ \|u\|_{L^p(w)} := \Bigl(\int_0^\infty |u(x)|^p w(x)\,dx\Bigr)^{1/p}. \tag{1.6} \]
The corresponding weighted Sobolev space $W^{s,p}(w)$ is defined by $W^{s,p}(w) = \{u \in L^p(w);\ \partial_x^k u \in L^p(w) \text{ for } k \le s\}$ with the norm $\|\cdot\|_{W^{s,p}(w)}$. Also, we denote by $W_0^{1,p}(w)$ the completion of $C_0^\infty(\mathbb{R}_+)$ with respect to the norm
\[ \|u\|_{W_0^{1,p}(w)} := \|\partial_x u\|_{L^p(w)} = \Bigl(\int_0^\infty |\partial_x u(x)|^p w(x)\,dx\Bigr)^{1/p}. \tag{1.7} \]
When $p = 2$, we write $H^s(w) = W^{s,2}(w)$ and $H_0^1(w) = W_0^{1,2}(w)$. In the special case where $w = (1+x)^\alpha$ with $\alpha \in \mathbb{R}$, these weighted spaces are abbreviated as
\[ L^p_\alpha = L^p\bigl((1+x)^\alpha\bigr), \quad W^{s,p}_\alpha = W^{s,p}\bigl((1+x)^\alpha\bigr), \quad W^{1,p}_{\alpha,0} = W_0^{1,p}\bigl((1+x)^\alpha\bigr), \]
\[ H^s_\alpha = H^s\bigl((1+x)^\alpha\bigr), \quad H^1_{\alpha,0} = H_0^1\bigl((1+x)^\alpha\bigr). \]
Let $k$ be a nonnegative integer. Then, for an interval $I \subset [0,\infty)$ and a Banach space $X$ on $\mathbb{R}_+$, $C^k(I; X)$ denotes the space of $k$-times continuously differentiable functions on $I$ with values in $X$.

Finally, letters $C$ and $c$ in this paper denote positive generic constants which may vary from line to line.
2. Hardy type inequality

Hardy's inequality was first introduced by Hardy [1] and its best possible constant was given by Landau [8]. Here we introduce several Hardy type inequalities which will be used in this paper. The first one is found in [13] but its best possible constant is not given explicitly there.

Proposition 2.1. Let $\psi \in C^1[0,\infty)$ and assume either
(1) $\psi > 0$, $\psi_x > 0$ and $\psi(x) \to \infty$ for $x \to \infty$; or
(2) $\psi < 0$, $\psi_x > 0$ and $\psi(x) \to 0$ for $x \to \infty$.
Then we have
\[ \int_0^\infty v^2 \psi_x\,dx \le 4\int_0^\infty v_x^2\,\psi^2/\psi_x\,dx \tag{2.1} \]
for $v \in C_0^\infty(\mathbb{R}_+)$ and hence for $v \in H_0^1(w)$ with $w = \psi^2/\psi_x$. Here 4 is the best possible constant, and there is no function $v \in H_0^1(w)$, $v \ne 0$, which attains the equality in (2.1).

Proof. Let $v \in C_0^\infty(\mathbb{R}_+)$. A simple calculation gives
\[ (v^2\psi)_x = v^2\psi_x + 2vv_x\psi = \frac{1}{2}v^2\psi_x + \frac{1}{2}(v + 2v_x\psi/\psi_x)^2\psi_x - 2v_x^2\,\psi^2/\psi_x. \tag{2.2} \]
Integrating (2.2) in $x$, we obtain
\[ \int_0^\infty v^2\psi_x\,dx + \int_0^\infty (v + 2v_x\psi/\psi_x)^2\psi_x\,dx = 4\int_0^\infty v_x^2\,\psi^2/\psi_x\,dx, \tag{2.3} \]
which gives the desired inequality (2.1).

It follows from (2.3) that the equality in (2.1) holds if and only if $v + 2v_x\psi/\psi_x \equiv 0$. This gives $v = C_1|\psi|^{-1/2}$ for some constant $C_1$. But, if $C_1 \ne 0$, this $v$ is not in $H_0^1(w)$ with $w = \psi^2/\psi_x$. In fact, in the case (1), we have $v_x = -\frac{1}{2}C_1\psi^{-3/2}\psi_x$ and hence
\[ \int_0^\infty v_x^2 w\,dx = \frac{1}{4}C_1^2\int_0^\infty \psi^{-1}\psi_x\,dx = \frac{1}{4}C_1^2\bigl[\log\psi(x)\bigr]_{x=0}^{x=\infty} = \infty. \]
The case (2) can be treated similarly. Thus we conclude that there is no function $v \in H_0^1(w)$, $v \ne 0$, which attains the equality in (2.1).
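Finally, we show the best possibility of the constant 4 in (2.1). The following proof is based on the computation in [8]. First, we consider the case (1). Let us fix $a > 0$. Let $\varepsilon > 0$ be a small parameter and put
\[ v^\varepsilon(x) = \begin{cases} 0, & 0 \le x < a, \\ (x-a)\,\psi(x)^{-1/2-\varepsilon}, & a < x < a+1, \\ \psi(x)^{-1/2-\varepsilon}, & a+1 < x. \end{cases} \tag{2.4} \]
Then we have
\[ \int_0^\infty (v^\varepsilon)^2\psi_x\,dx = \int_a^{a+1}(x-a)^2\psi^{-1-2\varepsilon}\psi_x\,dx + \int_{a+1}^\infty \psi^{-1-2\varepsilon}\psi_x\,dx =: I_1 + I_2. \]
Here we see that $I_1 = O(1)$ for $\varepsilon \to 0$ and
\[ I_2 = \Bigl[-\frac{1}{2\varepsilon}\psi(x)^{-2\varepsilon}\Bigr]_{x=a+1}^{x=\infty} = \frac{1}{2\varepsilon}\psi(a+1)^{-2\varepsilon}. \]
On the other hand, we have
\[ v^\varepsilon_x(x) = \begin{cases} 0, & 0 \le x < a, \\ \psi(x)^{-1/2-\varepsilon} - (1/2+\varepsilon)(x-a)\,\psi(x)^{-3/2-\varepsilon}\psi_x(x), & a < x < a+1, \\ -(1/2+\varepsilon)\,\psi(x)^{-3/2-\varepsilon}\psi_x(x), & a+1 < x. \end{cases} \]
Therefore we get
\[ \int_0^\infty (v^\varepsilon_x)^2\,\psi^2/\psi_x\,dx = \int_a^{a+1}\bigl(\psi^{-1/2-\varepsilon} - (1/2+\varepsilon)(x-a)\psi^{-3/2-\varepsilon}\psi_x\bigr)^2\psi^2/\psi_x\,dx + (1/2+\varepsilon)^2\int_{a+1}^\infty \psi^{-1-2\varepsilon}\psi_x\,dx =: J_1 + J_2. \]
Here we find that $J_1 = O(1)$ for $\varepsilon \to 0$ and $J_2 = (1/2+\varepsilon)^2\frac{1}{2\varepsilon}\psi(a+1)^{-2\varepsilon}$. Consequently, we obtain
\[ \frac{\int_0^\infty (v^\varepsilon_x)^2\,\psi^2/\psi_x\,dx}{\int_0^\infty (v^\varepsilon)^2\psi_x\,dx} = \frac{O(1) + (1/2+\varepsilon)^2\frac{1}{2\varepsilon}\psi(a+1)^{-2\varepsilon}}{O(1) + \frac{1}{2\varepsilon}\psi(a+1)^{-2\varepsilon}} = \frac{O(\varepsilon) + (1/2+\varepsilon)^2\psi(a+1)^{-2\varepsilon}}{O(\varepsilon) + \psi(a+1)^{-2\varepsilon}} \longrightarrow \frac{1}{4} \]
for $\varepsilon \to 0$. This shows that 4 in (2.1) is the best possible constant.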
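The limit computed above can be observed numerically. The following is a minimal sketch, not part of the original argument: it assumes the sample weight $\psi(x) = 1 + x$ (which satisfies case (1)) and $a = 1$, evaluates $I_1$ and $J_1$ by Simpson quadrature, and uses the closed forms of $I_2$ and $J_2$ from the proof. The names `simpson` and `hardy_ratio` are hypothetical helpers introduced only for this illustration.

```python
def simpson(func, lo, hi, n=2000):
    """Composite Simpson rule with n (even) subintervals."""
    h = (hi - lo) / n
    total = func(lo) + func(hi)
    for i in range(1, n):
        total += (4 if i % 2 else 2) * func(lo + i * h)
    return total * h / 3


def hardy_ratio(eps, a=1.0):
    """Ratio of int (v_x)^2 psi^2/psi_x dx to int v^2 psi_x dx for v = v^eps in (2.4).

    Assumption: psi(x) = 1 + x, so psi_x = 1 (an illustrative choice only).
    """
    psi = lambda x: 1.0 + x
    s = 0.5 + eps
    # Transition layer a < x < a+1, integrated numerically (the O(1) terms).
    i1 = simpson(lambda x: (x - a) ** 2 * psi(x) ** (-1 - 2 * eps), a, a + 1)
    j1 = simpson(
        lambda x: (psi(x) ** (-s) - s * (x - a) * psi(x) ** (-s - 1)) ** 2
        * psi(x) ** 2,
        a, a + 1)
    # Tail x > a+1, using the closed forms I_2 and J_2 from the proof.
    i2 = psi(a + 1) ** (-2 * eps) / (2 * eps)
    j2 = s ** 2 * i2
    return (j1 + j2) / (i1 + i2)


if __name__ == "__main__":
    for eps in (1e-1, 1e-2, 1e-3, 1e-4):
        print(eps, hardy_ratio(eps))
```

In this sketch the ratio stays above 1/4, as (2.1) requires, and decreases toward 1/4 as $\varepsilon$ shrinks, because the divergent tail terms dominate the bounded transition-layer terms.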
In the case (2), to show the best possibility of the constant in (2.1), we may take a test function $v^\varepsilon(x)$ as
\[ v^\varepsilon(x) = \begin{cases} 0, & 0 \le x < a, \\ (x-a)\,(-\psi(x))^{-1/2-\varepsilon}, & a < x < a+1, \\ (-\psi(x))^{-1/2-\varepsilon}, & a+1 < x, \end{cases} \]
but we omit the details. This completes the proof of Proposition 2.1. □

We have the $L^p$ version of Proposition 2.1.
Proposition 2.2. Let $\psi$ be the same as in Proposition 2.1. Let $1 < p < \infty$. Then we have
\[ \int_0^\infty |v|^p\psi_x\,dx \le p^p\int_0^\infty |v_x|^p\,|\psi|^p/\psi_x^{p-1}\,dx \tag{2.5} \]
for $v \in C_0^\infty(\mathbb{R}_+)$ and hence for $v \in W_0^{1,p}(w)$ with $w = |\psi|^p/\psi_x^{p-1}$. Here $p^p$ is the best possible constant, and there is no function $v \in W_0^{1,p}(w)$, $v \ne 0$, which attains the equality in (2.5).

Proof. Let $1 < p < \infty$ and $v \in C_0^\infty(\mathbb{R}_+)$. A simple calculation gives
\[ (|v|^p\psi)_x = |v|^p\psi_x + p|v|^{p-2}vv_x\psi = \frac{1}{p}|v|^p\psi_x - \frac{1}{p}\,p^p|v_x|^p\,|\psi|^p/\psi_x^{p-1} + R, \tag{2.6} \]
where
\[ R = \Bigl(1 - \frac{1}{p}\Bigr)|v|^p\psi_x + \frac{1}{p}\,p^p|v_x|^p\,|\psi|^p/\psi_x^{p-1} + p|v|^{p-2}vv_x\psi. \]
Integrating (2.6) in $x$, we obtain
\[ \int_0^\infty |v|^p\psi_x\,dx + p\int_0^\infty R\,dx = p^p\int_0^\infty |v_x|^p\,|\psi|^p/\psi_x^{p-1}\,dx. \tag{2.7} \]
Here we see that
\[ -p|v|^{p-2}vv_x\psi \le p|v|^{p-1}|v_x||\psi| = \bigl(|v|^{p-1}\psi_x^{(p-1)/p}\bigr)\bigl(p|v_x||\psi|/\psi_x^{(p-1)/p}\bigr) \le \Bigl(1 - \frac{1}{p}\Bigr)|v|^p\psi_x + \frac{1}{p}\,p^p|v_x|^p\,|\psi|^p/\psi_x^{p-1}, \]
where we have used the Young inequality $AB \le (1 - 1/p)A^{p/(p-1)} + (1/p)B^p$ for $A = |v|^{p-1}\psi_x^{(p-1)/p}$ and $B = p|v_x||\psi|/\psi_x^{(p-1)/p}$. Thus we have $R \ge 0$. This together with (2.7) gives the desired inequality (2.5).

It follows from (2.7) that the equality in (2.5) holds if and only if $R \equiv 0$. This is the case where $vv_x\psi \le 0$ and $|v|^p\psi_x \equiv p^p|v_x|^p\,|\psi|^p/\psi_x^{p-1}$. This is equivalent to $pv_x\psi \equiv -v\psi_x$ and hence we have $v = C_1|\psi|^{-1/p}$ for some constant $C_1$. A simple computation shows that when $C_1 \ne 0$, this $v$ is not in $W_0^{1,p}(w)$ with $w = |\psi|^p/\psi_x^{p-1}$. Thus we conclude that there is no function $v \in W_0^{1,p}(w)$, $v \ne 0$, which attains the equality in (2.5).

The best possibility of the constant $p^p$ is proved in the same way as in the proof of Proposition 2.1. For example, in the case (1), we take the test function as
\[ v^\varepsilon(x) = \begin{cases} 0, & 0 \le x < a, \\ (x-a)\,\psi(x)^{-1/p-\varepsilon}, & a < x < a+1, \\ \psi(x)^{-1/p-\varepsilon}, & a+1 < x, \end{cases} \tag{2.8} \]
where $a > 0$ is fixed and $\varepsilon > 0$ is a small parameter. Then we see that
\[ \frac{\int_0^\infty |v^\varepsilon_x|^p\,|\psi|^p/\psi_x^{p-1}\,dx}{\int_0^\infty |v^\varepsilon|^p\psi_x\,dx} = \frac{O(1) + (1/p+\varepsilon)^p\frac{1}{p\varepsilon}\psi(a+1)^{-p\varepsilon}}{O(1) + \frac{1}{p\varepsilon}\psi(a+1)^{-p\varepsilon}} = \frac{O(\varepsilon) + (1/p+\varepsilon)^p\psi(a+1)^{-p\varepsilon}}{O(\varepsilon) + \psi(a+1)^{-p\varepsilon}} \longrightarrow \frac{1}{p^p} \]
for $\varepsilon \to 0$. This shows that $p^p$ in (2.5) is the best possible constant. Thus the proof of Proposition 2.2 is complete. □

The following variant of Proposition 2.1 is useful in our application.

Proposition 2.3. Let $\phi \in C^1[0,\infty)$, $\phi < 0$, $\phi_x > 0$, and $\phi(x) \to 0$ for $x \to \infty$. Let $\sigma \in \mathbb{R}$ with $\sigma \ne 0$, and define the weight functions $w$ and $w_1$ by
\[ w = (-\phi)^{-\sigma+1}/\phi_x, \qquad w_1 = (-\phi)^{-\sigma-1}\phi_x. \tag{2.9} \]
Then we have
\[ \int_0^\infty v^2 w_1\,dx \le \frac{4}{\sigma^2}\int_0^\infty v_x^2 w\,dx \tag{2.10} \]
for $v \in H_0^1(w)$. Here $4/\sigma^2$ is the best possible constant, and there is no function $v \in H_0^1(w)$, $v \ne 0$, which attains the equality in (2.10).
Proof. Let $\sigma > 0$. In this case we put $\psi = (-\phi)^{-\sigma} > 0$. Then we have $\psi_x = \sigma(-\phi)^{-\sigma-1}\phi_x > 0$ and $\psi(x) \to \infty$ as $x \to \infty$. This corresponds to the case (1) of Proposition 2.1. Since $\psi^2/\psi_x = (1/\sigma)(-\phi)^{-\sigma+1}/\phi_x$, by applying Proposition 2.1, we have
\[ \sigma\int_0^\infty v^2(-\phi)^{-\sigma-1}\phi_x\,dx \le \frac{4}{\sigma}\int_0^\infty v_x^2(-\phi)^{-\sigma+1}/\phi_x\,dx. \]
This gives (2.10) and hence the proof of Proposition 2.3 is complete for $\sigma > 0$. When $\sigma < 0$, we put $\psi = -(-\phi)^{-\sigma} < 0$. Then, applying the case (2) of Proposition 2.1, we get the desired conclusion also for $\sigma < 0$. This completes the proof of Proposition 2.3. □

As a simple corollary of Proposition 2.3, we have:

Corollary 2.4. Let $\alpha \in \mathbb{R}$ with $\alpha \ne 1$. Then we have
\[ \|v\|_{L^2_{\alpha-2}} \le \frac{2}{|\alpha - 1|}\|v_x\|_{L^2_\alpha} \tag{2.11} \]
for $v \in H^1_{\alpha,0} = H_0^1((1+x)^\alpha)$. Here the constant $2/|\alpha - 1|$ is the best possible, and there is no function $v \in H^1_{\alpha,0}$, $v \ne 0$, which attains the equality in (2.11).
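Proof. Let $\phi = -(1+x)^{-1/q}$ with $q > 0$. Then we see that $\phi < 0$, $\phi_x = (1/q)(1+x)^{-1/q-1} = (1/q)(-\phi)^{q+1} > 0$, and $\phi(x) \to 0$ as $x \to \infty$. Now we apply Proposition 2.3. Since
\[ w = (-\phi)^{-\sigma+1}/\phi_x = q(-\phi)^{-\sigma-q} = q(1+x)^{\sigma/q+1}, \qquad w_1 = (-\phi)^{-\sigma-1}\phi_x = \frac{1}{q}(-\phi)^{-\sigma+q} = \frac{1}{q}(1+x)^{\sigma/q-1}, \tag{2.12} \]
we have from (2.10) that
\[ \frac{1}{q}\int_0^\infty v^2(1+x)^{\sigma/q-1}\,dx \le \frac{4q}{\sigma^2}\int_0^\infty v_x^2(1+x)^{\sigma/q+1}\,dx \]
for $v \in H^1_{\sigma/q+1,0}$. Thus we have
\[ \|v\|^2_{L^2_{\sigma/q-1}} \le \frac{4q^2}{\sigma^2}\|v_x\|^2_{L^2_{\sigma/q+1}} \tag{2.13} \]
for $v \in H^1_{\sigma/q+1,0}$. This together with $\sigma = (\alpha - 1)q$ gives the desired inequality (2.11). This completes the proof. □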
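As a sanity check of (2.11), both sides can be evaluated numerically for one concrete function. The sketch below is illustrative only: the test function $v(x) = xe^{-x}$, the sample exponents, the truncation of the integrals at $x = 60$, and the helper names `simpson` and `hardy_sides` are all assumptions made for this example, not taken from the paper.

```python
import math


def simpson(func, lo, hi, n):
    """Composite Simpson rule with n (even) subintervals."""
    h = (hi - lo) / n
    total = func(lo) + func(hi)
    for i in range(1, n):
        total += (4 if i % 2 else 2) * func(lo + i * h)
    return total * h / 3


def hardy_sides(alpha, x_max=60.0, n=60000):
    """Return (lhs, rhs) of (2.11) for the sample function v(x) = x*exp(-x).

    lhs = ||v||_{L^2_{alpha-2}},  rhs = 2/|alpha-1| * ||v_x||_{L^2_alpha}.
    The integrals over (0, infinity) are truncated at x_max, which is
    harmless here because of the exponential decay of v.
    """
    v = lambda x: x * math.exp(-x)
    vx = lambda x: (1.0 - x) * math.exp(-x)
    lhs_sq = simpson(lambda x: v(x) ** 2 * (1.0 + x) ** (alpha - 2), 0.0, x_max, n)
    rhs_sq = simpson(lambda x: vx(x) ** 2 * (1.0 + x) ** alpha, 0.0, x_max, n)
    return math.sqrt(lhs_sq), 2.0 / abs(alpha - 1.0) * math.sqrt(rhs_sq)


if __name__ == "__main__":
    for alpha in (2.0, 3.0, 5.0):
        lhs, rhs = hardy_sides(alpha)
        print(alpha, lhs, rhs, lhs <= rhs)
```

For $\alpha = 3$ the squared norms can also be computed by hand from $\int_0^\infty x^n e^{-2x}\,dx = n!/2^{n+1}$, giving $0.625$ on the left and $2.125$ on the right, which the quadrature should reproduce.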
3. Dissipativity of the linearized operator

We discuss the dissipativity of the operator $L$ defined by (1.5) in the weighted space $L^2(w)$. For this purpose, we first review the basic properties of the degenerate stationary wave $\phi(x)$ (see [9] for the details).

Lemma 3.1. Suppose that $f(u)$ satisfies (1.2). Then the stationary wave $\phi(x)$, which is a solution of (1.3), verifies the following properties: $\phi \in C^\infty[0,\infty)$, and
\[ -1 \le \phi(x) < 0, \qquad \phi_x(x) > 0, \qquad \phi(x) \to 0 \ \text{ for } x \to \infty. \tag{3.1} \]
Moreover, we have
\[ c(1+x)^{-1/q} \le -\phi(x) \le C(1+x)^{-1/q}. \tag{3.2} \]
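The two-sided bound (3.2) can be illustrated by integrating the stationary problem (1.3) numerically. This is a rough sketch under stated assumptions: the classical fourth-order Runge–Kutta scheme, the step size, the sample values $q = 2$ and $g(u) = -u$, and the helper names `stationary_wave` and `decay_ratio_bounds` are choices made for this example only.

```python
def stationary_wave(q, g, x_max=50.0, h=0.01):
    """Integrate phi_x = f(phi), phi(0) = -1, by classical RK4.

    f(u) = (1/q) * (-u)**(q+1) * (1 + g(u)) as in (1.2); q is a positive
    integer and g is a user-supplied (assumed) perturbation.
    Returns a list of (x, phi(x)) pairs on the grid x = i*h.
    """
    def f(u):
        return (-u) ** (q + 1) * (1.0 + g(u)) / q

    phi = -1.0
    out = [(0.0, phi)]
    steps = int(round(x_max / h))
    for i in range(1, steps + 1):
        k1 = f(phi)
        k2 = f(phi + 0.5 * h * k1)
        k3 = f(phi + 0.5 * h * k2)
        k4 = f(phi + h * k3)
        phi += h * (k1 + 2 * k2 + 2 * k3 + k4) / 6.0
        out.append((i * h, phi))
    return out


def decay_ratio_bounds(points, q):
    """Min and max of (1+x)^(1/q) * (-phi(x)) over the grid, cf. (3.2)."""
    ratios = [(1.0 + x) ** (1.0 / q) * (-phi) for x, phi in points]
    return min(ratios), max(ratios)


if __name__ == "__main__":
    # g = 0: compare with the explicit wave phi = -(1+x)^(-1/q).
    print(decay_ratio_bounds(stationary_wave(2, lambda u: 0.0), 2))
    # g(u) = -u: the ratio should stay between two positive constants.
    print(decay_ratio_bounds(stationary_wave(2, lambda u: -u), 2))
```

For $g \equiv 0$ the ratio is identically 1 up to discretization error. For the sample $g(u) = -u$, a comparison argument with the equations $s_x = -(1/q)s^{q+1}$ and $s_x = -(2/q)s^{q+1}$ for $s = -\phi$ suggests the ratio lies in $[2^{-1/q}, 1]$, consistent with (3.2).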
Now, let $w > 0$ be a weight function depending only on $x$ such that $w \in C^2[0,\infty)$. We calculate the inner product $\langle Lv, v\rangle_{L^2(w)}$ for $v \in C_0^\infty(\mathbb{R}_+)$, where
\[ \langle u, v\rangle_{L^2(w)} := \int_0^\infty uvw\,dx. \tag{3.3} \]
We multiply (1.5) by $v$. Then a simple computation gives
\[ (Lv)v = \Bigl(vv_x - \frac{1}{2}f'(\phi)v^2\Bigr)_x - v_x^2 - \frac{1}{2}f''(\phi)\phi_x v^2. \]
Multiplying by $w$, we obtain
\[ (Lv)vw = \Bigl\{\Bigl(vv_x - \frac{1}{2}f'(\phi)v^2\Bigr)w - \frac{1}{2}v^2w_x\Bigr\}_x - v_x^2w + \frac{1}{2}v^2\bigl(w_{xx} + w_xf'(\phi) - wf''(\phi)\phi_x\bigr). \tag{3.4} \]
Now we choose the weight function $w$ and the corresponding $w_1$ in terms of the degenerate stationary wave $\phi$ by (2.9), where $\sigma \in \mathbb{R}$. Then we have $w = (-\phi)^{-\sigma+1}/f(\phi)$ by $\phi_x = f(\phi)$. Differentiating this expression with respect to $x$ and using $\phi_x = f(\phi)$ several times, we find by direct computations that
\[ w_x = (\sigma-1)(-\phi)^{-\sigma} - (-\phi)^{-\sigma+1}f'(\phi)/f(\phi), \]
\[ w_{xx} = \sigma(\sigma-1)(-\phi)^{-\sigma-1}f(\phi) - (\sigma-1)(-\phi)^{-\sigma}f'(\phi) - (-\phi)^{-\sigma+1}\bigl(f''(\phi) - f'(\phi)^2/f(\phi)\bigr). \]
Consequently, we arrive at the expression
\[ w_{xx} + w_xf'(\phi) - wf''(\phi)\phi_x = \sigma(\sigma-1)(-\phi)^{-\sigma-1}f(\phi) - 2(-\phi)^{-\sigma+1}f''(\phi) = \bigl\{\sigma(\sigma-1) - 2(-\phi)^2f''(\phi)/f(\phi)\bigr\}(-\phi)^{-\sigma-1}f(\phi) = 2\bigl(c_1(\sigma) - r(\phi)\bigr)w_1, \tag{3.5} \]
where $w_1$ is given in (2.9) and
\[ c_1(\sigma) := \sigma(\sigma-1)/2 - q(q+1), \qquad r(u) := (-u)^2f''(u)/f(u) - q(q+1). \tag{3.6} \]
Substituting (3.5) into (3.4) and integrating with respect to $x$, we get the following conclusion.

Claim 3.2. Let $\phi$ be the degenerate stationary wave and define the weight functions $w$ and $w_1$ by (2.9) with $\sigma \in \mathbb{R}$. Then the operator $L$ defined in (1.5) verifies
\[ \langle Lv, v\rangle_{L^2(w)} = -\|v_x\|^2_{L^2(w)} + c_1(\sigma)\|v\|^2_{L^2(w_1)} - \int_0^\infty v^2r(\phi)w_1\,dx \tag{3.7} \]
for $v \in C_0^\infty(\mathbb{R}_+)$ and hence for $v \in H_0^1(w)$, where $c_1(\sigma)$ and $r(\phi)$ are given in (3.6).

To discuss the dissipativity of $L$, we need to estimate the term $r(\phi)$ in (3.7). By straightforward computations, using (3.6) and (1.2), we see that
\[ r(u) = (-u)\bigl\{(-u)g''(u) - 2(q+1)g'(u)\bigr\}/\bigl(1 + g(u)\bigr). \tag{3.8} \]
This shows that $r(u) = O(|u|)$ for $u \to 0$. In particular, we have $r(u) \equiv 0$ if $g(u) \equiv 0$. With these preparations, we have the following result on the characterization of the dissipativity of $L$.

Theorem 3.3. Assume (1.2). Let $\phi$ be the degenerate stationary wave and $L$ be the operator defined in (1.5). Let $w$ and $w_1$ be the weight functions in (2.9) with the parameter $\sigma \in \mathbb{R}$. Then we have:

(1) Let $-2q < \sigma < 2(q+1)$. Then, under the additional assumption that $r(u) \ge 0$ for $-1 \le u \le 0$, the operator $L$ is uniformly dissipative in $L^2(w)$. Namely, there is a positive constant $\delta$ such that
\[ \langle Lv, v\rangle_{L^2(w)} \le -\delta\bigl(\|v_x\|^2_{L^2(w)} + \|v\|^2_{L^2(w_1)}\bigr) \tag{3.9} \]
for $v \in H_0^1(w)$.

(2) Let $\sigma > 2(q+1)$ or $\sigma < -2q$. Then the operator $L$ cannot be dissipative in $L^2(w)$. Namely, we have $\langle Lv, v\rangle_{L^2(w)} > 0$ for some $v \in H_0^1(w)$ with $v \ne 0$.

Remark 3.4. In (1) of this theorem, we have assumed that $r(u) \ge 0$ for $-1 \le u \le 0$. In view of (3.8), this additional condition is satisfied if $g(u)$ in (1.2) is of the form $g(u) = G(-u)$, where $G'(\eta) \ge 0$ and $G''(\eta) \ge 0$ for $0 \le \eta \le 1$. The simplest example of such a $g(u)$ is $g(u) = (-u)^m$ with a nonnegative integer $m$.
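The formula (3.8) and the sign condition of Remark 3.4 can be cross-checked for the sample family $g(u) = (-u)^m$. The sketch below is illustrative only: it evaluates $r(u)$ once from the definition in (3.6), using hand-computed derivatives of $f$, and once from (3.8), and compares the two. The restriction to integers $m \ge 1$ and the function names `r_from_definition` and `r_from_formula` are assumptions of this example.

```python
def r_from_definition(s, q, m):
    """r(u) = (-u)^2 f''(u)/f(u) - q(q+1) from (3.6), with s = -u in (0, 1].

    Assumes g(u) = (-u)^m, so f(u) = (1/q) * (s^(q+1) + s^(q+m+1)), and
    uses d^2/du^2 s^k = k (k-1) s^(k-2).
    """
    f = (s ** (q + 1) + s ** (q + m + 1)) / q
    f2 = ((q + 1) * q * s ** (q - 1)
          + (q + m + 1) * (q + m) * s ** (q + m - 1)) / q
    return s ** 2 * f2 / f - q * (q + 1)


def r_from_formula(s, q, m):
    """r(u) from (3.8): (-u) * ((-u) g'' - 2(q+1) g') / (1 + g), s = -u."""
    g = s ** m
    g1 = -m * s ** (m - 1)            # g'(u)
    g2 = m * (m - 1) * s ** (m - 2)   # g''(u)
    return s * (s * g2 - 2 * (q + 1) * g1) / (1.0 + g)


if __name__ == "__main__":
    for q in (1, 2, 3):
        for m in (1, 2, 3):
            for s in (0.1, 0.5, 1.0):
                print(q, m, s, r_from_definition(s, q, m), r_from_formula(s, q, m))
```

Both expressions reduce to $m(m + 2q + 1)s^m/(1 + s^m)$ with $s = -u$, which is nonnegative on $0 < s \le 1$, in line with the assumption $r(u) \ge 0$ used in part (1) of Theorem 3.3.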
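Proof. The proof is based on the equality (3.7) in Claim 3.2 and the Hardy type inequality (2.10) in Proposition 2.3. Let $-2q < \sigma < 2(q+1)$. This is equivalent to $c_1(\sigma) < \sigma^2/4$. Therefore we can choose $\delta > 0$ so small that $\delta(1 + \sigma^2/4) \le \sigma^2/4 - c_1(\sigma)$. Since $r(\phi) \ge 0$ by the additional assumption on $r(u)$ and since $(\sigma^2/4)\|v\|^2_{L^2(w_1)} \le \|v_x\|^2_{L^2(w)}$ by the Hardy type inequality (2.10), we have from (3.7) that
\[ \langle Lv, v\rangle_{L^2(w)} \le -\|v_x\|^2_{L^2(w)} + c_1(\sigma)\|v\|^2_{L^2(w_1)} = -\delta\|v_x\|^2_{L^2(w)} - (1-\delta)\|v_x\|^2_{L^2(w)} + c_1(\sigma)\|v\|^2_{L^2(w_1)} \]
\[ \le -\delta\|v_x\|^2_{L^2(w)} - \bigl\{(1-\delta)\sigma^2/4 - c_1(\sigma)\bigr\}\|v\|^2_{L^2(w_1)} \le -\delta\bigl(\|v_x\|^2_{L^2(w)} + \|v\|^2_{L^2(w_1)}\bigr) \]
for $v \in C_0^\infty(\mathbb{R}_+)$ and hence for $v \in H_0^1(w)$, where we used the fact that $(1-\delta)\sigma^2/4 - c_1(\sigma) \ge \delta$. This completes the proof of the uniform dissipative case (1).

Next we consider the case where $\sigma > 2(q+1)$; the case $\sigma < -2q$ can be treated similarly and we omit the argument in this latter case. When $\sigma > 2(q+1)$, we have $c_1(\sigma) > \sigma^2/4$. Then we choose $\delta > 0$ so small that $c_1(\sigma) \ge \sigma^2/4 + 3\delta$. Since $r(u) = O(|u|)$ for $u \to 0$ and $\phi(x) \to 0$ for $x \to \infty$, we take $a = a(\delta) > 0$ so large that $|r(\phi)| \le \delta$ for $x \ge a$. For this choice of $a$ and for $\varepsilon > 0$, we take a test function $v^\varepsilon$ as in (2.4):
\[ v^\varepsilon(x) = \begin{cases} 0, & 0 \le x < a, \\ (x-a)\,(-\phi(x))^{\sigma(1/2+\varepsilon)}, & a < x < a+1, \\ (-\phi(x))^{\sigma(1/2+\varepsilon)}, & a+1 < x. \end{cases} \tag{3.10} \]
Then we have
\[ \Bigl|\int_0^\infty (v^\varepsilon)^2r(\phi)w_1\,dx\Bigr| \le \delta\int_a^\infty (v^\varepsilon)^2w_1\,dx = \delta\|v^\varepsilon\|^2_{L^2(w_1)}, \]
so that we have from (3.7) that
\[ \langle Lv^\varepsilon, v^\varepsilon\rangle_{L^2(w)} \ge -\|v^\varepsilon_x\|^2_{L^2(w)} + \bigl(c_1(\sigma) - \delta\bigr)\|v^\varepsilon\|^2_{L^2(w_1)}. \tag{3.11} \]
Here, a direct computation shows that
\[ \|v^\varepsilon\|^2_{L^2(w_1)} = \int_a^{a+1}(x-a)^2(-\phi)^{2\sigma\varepsilon-1}\phi_x\,dx + \int_{a+1}^\infty (-\phi)^{2\sigma\varepsilon-1}\phi_x\,dx = O(1) + \frac{1}{2\sigma\varepsilon}\bigl(-\phi(a+1)\bigr)^{2\sigma\varepsilon} \]
for $\varepsilon \to 0$, where the term denoted by $O(1)$ depends on $\delta$. Similarly, we have
\[ \|v^\varepsilon_x\|^2_{L^2(w)} = O(1) + \sigma^2(1/2+\varepsilon)^2\frac{1}{2\sigma\varepsilon}\bigl(-\phi(a+1)\bigr)^{2\sigma\varepsilon} \]
for $\varepsilon \to 0$. Consequently, we obtain
\[ \frac{\|v^\varepsilon_x\|^2_{L^2(w)}}{\|v^\varepsilon\|^2_{L^2(w_1)}} = \frac{O(1) + \sigma^2(1/2+\varepsilon)^2\frac{1}{2\sigma\varepsilon}(-\phi(a+1))^{2\sigma\varepsilon}}{O(1) + \frac{1}{2\sigma\varepsilon}(-\phi(a+1))^{2\sigma\varepsilon}} = \frac{O(\varepsilon) + \sigma^2(1/2+\varepsilon)^2(-\phi(a+1))^{2\sigma\varepsilon}}{O(\varepsilon) + (-\phi(a+1))^{2\sigma\varepsilon}} \longrightarrow \frac{\sigma^2}{4} \]
for $\varepsilon \to 0$. Thus we have $\|v^\varepsilon_x\|^2_{L^2(w)}/\|v^\varepsilon\|^2_{L^2(w_1)} \le \sigma^2/4 + \delta$ for a suitably small $\varepsilon = \varepsilon(\delta) > 0$. Consequently, we have from (3.11) that
\[ \frac{\langle Lv^\varepsilon, v^\varepsilon\rangle_{L^2(w)}}{\|v^\varepsilon\|^2_{L^2(w_1)}} \ge -\frac{\|v^\varepsilon_x\|^2_{L^2(w)}}{\|v^\varepsilon\|^2_{L^2(w_1)}} + \bigl(c_1(\sigma) - \delta\bigr) \ge -\bigl(\sigma^2/4 + \delta\bigr) + \bigl(c_1(\sigma) - \delta\bigr) \ge \delta. \]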
This completes the proof of the non-dissipative case (2). Thus the proof of Theorem 3.3 is complete. □

Finally in this section, we consider the special case where $g(u) \equiv 0$ so that $f(u) = \frac{1}{q}(-u)^{q+1}$. In this case the degenerate stationary wave is given explicitly by $\phi(x) = -(1+x)^{-1/q}$ and the operator $L$ in (1.5) is reduced to $L_0$ below:
\[ L_0v = v_{xx} + \frac{q+1}{q}\Bigl(\frac{v}{1+x}\Bigr)_x. \tag{3.12} \]
For this simplest case, we have the complete characterization of the dissipativity of the operator $L_0$.

Theorem 3.5. Let $\alpha_c(q) := 3 + 2/q$. Then we have the complete characterization of the dissipativity of the operator $L_0$ given in (3.12):

(1) Let $-1 < \alpha < \alpha_c(q)$. Then $L_0$ is uniformly dissipative in $L^2_\alpha$. Namely, there is a positive constant $\delta$ such that
\[ \langle L_0v, v\rangle_{L^2_\alpha} \le -\delta\bigl(\|v_x\|^2_{L^2_\alpha} + \|v\|^2_{L^2_{\alpha-2}}\bigr) \tag{3.13} \]
for $v \in H^1_{\alpha,0}$.

(2) Let $\alpha = \alpha_c(q)$ or $\alpha = -1$. Then $L_0$ is strictly dissipative in $L^2_\alpha$. Namely, we have $\langle L_0v, v\rangle_{L^2_\alpha} < 0$ for $v \in H^1_{\alpha,0}$ with $v \ne 0$.

(3) Let $\alpha > \alpha_c(q)$ or $\alpha < -1$. Then $L_0$ cannot be dissipative in $L^2_\alpha$. Namely, we have $\langle L_0v, v\rangle_{L^2_\alpha} > 0$ for some $v \in H^1_{\alpha,0}$ with $v \ne 0$.
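The trichotomy in Theorem 3.5 comes down to the sign of $\sigma^2/4 - c_1(\sigma)$ with $\sigma = (\alpha - 1)q$, where $\sigma^2/4$ is the Hardy constant from (2.10) and $c_1(\sigma)$ is given in (3.6). The following sketch evaluates that sign in exact rational arithmetic; the function names `c1`, `alpha_c` and `classify` are hypothetical helpers introduced for this illustration.

```python
from fractions import Fraction


def c1(sigma, q):
    """c_1(sigma) = sigma (sigma - 1)/2 - q (q + 1), as in (3.6)."""
    return sigma * (sigma - 1) / 2 - q * (q + 1)


def alpha_c(q):
    """Critical exponent alpha_c(q) = 3 + 2/q for a positive integer q."""
    return 3 + Fraction(2, q)


def classify(alpha, q):
    """Classify L_0 in L^2_alpha by the sign of sigma^2/4 - c_1(sigma)."""
    sigma = (Fraction(alpha) - 1) * q
    gap = sigma ** 2 / 4 - c1(sigma, q)
    if gap > 0:
        return "uniformly dissipative"
    if gap == 0:
        return "strictly dissipative"
    return "not dissipative"


if __name__ == "__main__":
    for q in (1, 2, 3):
        for alpha in (-2, -1, 0, 3, alpha_c(q), alpha_c(q) + 1):
            print(q, alpha, classify(alpha, q))
```

Since $\sigma^2/4 - c_1(\sigma) = -(\sigma + 2q)(\sigma - 2q - 2)/4$, the gap vanishes exactly at $\sigma = -2q$ and $\sigma = 2(q+1)$, that is, at $\alpha = -1$ and $\alpha = \alpha_c(q)$.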
Proof. Consider the case where $f(u) = \frac{1}{q}(-u)^{q+1}$ with $g(u) \equiv 0$. In this case we have $\phi(x) = -(1+x)^{-1/q}$ and $L = L_0$. Moreover, noting that $r(u) \equiv 0$, we have as a counterpart of (3.7),
\[ \langle L_0v, v\rangle_{L^2(w)} = -\|v_x\|^2_{L^2(w)} + c_1(\sigma)\|v\|^2_{L^2(w_1)}, \tag{3.14} \]
where $w$ and $w_1$ are the weight functions defined in (2.9) with $\sigma \in \mathbb{R}$, and $c_1(\sigma)$ is given in (3.6). In our special case, these weight functions are given explicitly by (2.12), so that we have
\[ \langle L_0v, v\rangle_{L^2(w)} = q\langle L_0v, v\rangle_{L^2_{\sigma/q+1}}, \qquad \|v_x\|^2_{L^2(w)} = q\|v_x\|^2_{L^2_{\sigma/q+1}}, \qquad \|v\|^2_{L^2(w_1)} = \frac{1}{q}\|v\|^2_{L^2_{\sigma/q-1}}. \tag{3.15} \]
Now we put $\sigma = (\alpha - 1)q$. First, let $-1 < \alpha < \alpha_c(q)$. This corresponds to the case where $-2q < \sigma < 2(q+1)$, for which we have $c_1(\sigma) < \sigma^2/4$. Therefore, applying to (3.14) the same arguments as in (1) of Theorem 3.3, we obtain
\[ \langle L_0v, v\rangle_{L^2(w)} \le -\delta\bigl(\|v_x\|^2_{L^2(w)} + \|v\|^2_{L^2(w_1)}\bigr) \]
for some $\delta > 0$. This inequality together with the relations in (3.15) (with $\sigma = (\alpha - 1)q$) shows the uniform dissipativity of $L_0$ in $L^2_\alpha$.

Second, let $\alpha = \alpha_c(q)$ or $\alpha = -1$. This corresponds to the case where $\sigma = 2(q+1)$ or $\sigma = -2q$. In this case we have $c_1(\sigma) = \sigma^2/4$. On the other hand, we have $(\sigma^2/4)\|v\|^2_{L^2(w_1)} \le \|v_x\|^2_{L^2(w)}$ by the Hardy type inequality (2.10). Consequently, we get from (3.14) that $\langle L_0v, v\rangle_{L^2(w)} \le 0$. Here the equality holds if and only if $(\sigma^2/4)\|v\|^2_{L^2(w_1)} = \|v_x\|^2_{L^2(w)}$. However, we know from Proposition 2.3 that such a $v \ne 0$ does not exist in $H_0^1(w)$. Thus we conclude that $\langle L_0v, v\rangle_{L^2(w)} < 0$ for $v \in H_0^1(w)$ with $v \ne 0$, which together with (3.15) (with $\sigma = (\alpha - 1)q$) proves the strict dissipativity of $L_0$ in $L^2_\alpha$.

Finally, let $\alpha > \alpha_c(q)$ or $\alpha < -1$. Then we have $\sigma > 2(q+1)$ or $\sigma < -2q$ and hence $c_1(\sigma) > \sigma^2/4$. Therefore, applying to (3.14) the same arguments as in (2) of Theorem 3.3, we find that $\langle L_0v, v\rangle_{L^2(w)} > 0$ for some $v \in H_0^1(w)$ with $v \ne 0$. This together with (3.15) (with $\sigma = (\alpha - 1)q$) gives the desired conclusion of (3) of Theorem 3.5. This completes the proof. □

4. Nonlinear stability

The aim of this section is to prove the following stability result for the nonlinear problem (1.4) that is a refinement of the result in [15].

Theorem 4.1. Assume (1.2). Suppose that $v_0 \in L^2_\alpha \cap L^\infty$ for some $\alpha$ with $1 \le \alpha < \alpha_c(q) := 3 + 2/q$. Then there is a positive constant $\delta_1$ such that if $\|v_0\|_{L^2_1} \le \delta_1$, then the problem (1.4) has a unique global solution $v \in C^0([0,\infty); L^2_\alpha \cap L^p)$ for each $p$ with $2 \le p < \infty$. Moreover, the solution verifies the decay estimate
\[ \|v(t)\|_{L^p} \le C\bigl(\|v_0\|_{L^2_\alpha} + \|v_0\|_{L^\infty}\bigr)(1+t)^{-\alpha/4-\nu} \tag{4.1} \]
for $t \ge 0$, where $2 \le p < \infty$, $\nu = (1/2)(1/2 - 1/p)$, and $C$ is a positive constant.
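For concrete parameters, the exponent in (4.1) is simple arithmetic. The sketch below computes $\alpha/4 + \nu$ exactly, taking the admissible range as $1 \le \alpha < 3 + 2/q$ (the critical exponent stated in the abstract and in Proposition 4.2); the function name `decay_exponent` is a hypothetical helper for this illustration.

```python
from fractions import Fraction


def decay_exponent(alpha, p, q):
    """Exponent alpha/4 + nu in (4.1), with nu = (1/2) * (1/2 - 1/p).

    Raises ValueError outside the range 1 <= alpha < 3 + 2/q, 2 <= p,
    where q is a positive integer and p is an integer.
    """
    alpha = Fraction(alpha)
    if not (1 <= alpha < 3 + Fraction(2, q)):
        raise ValueError("alpha must satisfy 1 <= alpha < 3 + 2/q")
    if p < 2:
        raise ValueError("p must satisfy 2 <= p < infinity")
    nu = Fraction(1, 2) * (Fraction(1, 2) - Fraction(1, p))
    return alpha / 4 + nu


if __name__ == "__main__":
    print(decay_exponent(3, 2, 1))  # L^2 rate for alpha = 3, q = 1
    print(decay_exponent(3, 4, 1))  # L^4 rate for alpha = 3, q = 1
```

For $p = 2$ the correction $\nu$ vanishes and the exponent is the $L^2$ rate $\alpha/4$; as $p$ grows, $\nu$ increases toward $1/4$.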
A key to the proof of Theorem 4.1 is to show the following space–time weighted energy inequality.

Proposition 4.2. Assume the same conditions as in Theorem 4.1. Let $v$ be a solution to the problem (1.4) with the initial data $v_0 \in L^2_\alpha \cap L^\infty$, where $1 \le \alpha < \alpha_c(q) := 3 + 2/q$. Then there is a positive constant $\delta_2$ such that if $\|v_0\|_{L^2_1} \le \delta_2$, then we have
\[ \|v(t)\|_{L^2_1} \le C\|v_0\|_{L^2_1}. \tag{4.2} \]
Moreover, we have the following space–time weighted energy inequality:
\[ (1+t)^\gamma\|v(t)\|^2_{L^2_\beta} + \int_0^t (1+\tau)^\gamma\bigl(\|v_x(\tau)\|^2_{L^2_\beta} + \|v(\tau)\|^2_{L^2_{\beta-2}}\bigr)\,d\tau \le C\|v_0\|^2_{L^2_\beta} + \gamma C\int_0^t (1+\tau)^{\gamma-1}\|v(\tau)\|^2_{L^2_\beta}\,d\tau \tag{4.3} \]
for any $\gamma \ge 0$ and $\beta$ with $0 \le \beta \le \alpha$, where the constant $C$ is independent of $\gamma$ and $\beta$.

Proof. The main part of the proof of this proposition is to derive the following space–time weighted energy inequality:
\[ (1+t)^\gamma\|v(t)\|^2_{L^2_\beta} + \int_0^t (1+\tau)^\gamma\bigl(\|v_x(\tau)\|^2_{L^2_\beta} + \|v(\tau)\|^2_{L^2_{\beta-2}}\bigr)\,d\tau \le C\|v_0\|^2_{L^2_\beta} + \gamma C\int_0^t (1+\tau)^{\gamma-1}\|v(\tau)\|^2_{L^2_\beta}\,d\tau + CS^\gamma_\beta(t) \tag{4.4} \]
for any $\gamma \ge 0$ and $\beta$ with $0 \le \beta \le \alpha$, where $1 \le \alpha < \alpha_c(q) := 3 + 2/q$, $C$ is a constant independent of $\gamma$ and $\beta$, and
\[ S^\gamma_\beta(t) = \int_0^t (1+\tau)^\gamma\|v(\tau)\|^3_{L^3_{\beta-1}}\,d\tau. \tag{4.5} \]
Once (4.4) is obtained, we can show the desired estimates (4.2) and (4.3) as follows. We observe that
\[ S^\gamma_\beta(t) \le C\int_0^t (1+\tau)^\gamma\|v(\tau)\|_{L^2_1}\bigl(\|v_x(\tau)\|^2_{L^2_\beta} + \|v(\tau)\|^2_{L^2_{\beta-2}}\bigr)\,d\tau, \tag{4.6} \]
which is an easy consequence of the following inequality (see [15] for the details):
\[ \|v\|^3_{L^3_{\beta-1}} \le C\|v\|_{L^2_1}\|v\|_{L^2_{\beta-2}}\bigl(\|v_x\|_{L^2_\beta} + \|v\|_{L^2_{\beta-2}}\bigr), \]
where $\beta \in \mathbb{R}$. Now we put $\gamma = 0$ and $\beta = 1$ in (4.4) and define $V(t) \ge 0$ by
\[ V(t)^2 = \sup_{0 \le \tau \le t}\|v(\tau)\|^2_{L^2_1} + \int_0^t \bigl(\|v_x(\tau)\|^2_{L^2_1} + \|v(\tau)\|^2_{L^2_{-1}}\bigr)\,d\tau. \]
Since $S^0_1(t) \le CV(t)^3$ by (4.6), we get the inequality $V(t)^2 \le C\|v_0\|^2_{L^2_1} + CV(t)^3$. This can be solved as $V(t) \le C\|v_0\|_{L^2_1}$, provided that $\|v_0\|_{L^2_1}$ is suitably small. Thus we obtain
\[ \|v(t)\|^2_{L^2_1} + \int_0^t \bigl(\|v_x(\tau)\|^2_{L^2_1} + \|v(\tau)\|^2_{L^2_{-1}}\bigr)\,d\tau \le C\|v_0\|^2_{L^2_1}, \tag{4.7} \]
which gives the uniform estimate (4.2). Consequently, we have
\[ S^\gamma_\beta(t) \le C\|v_0\|_{L^2_1}\int_0^t (1+\tau)^\gamma\bigl(\|v_x(\tau)\|^2_{L^2_\beta} + \|v(\tau)\|^2_{L^2_{\beta-2}}\bigr)\,d\tau. \]
Substituting this estimate into (4.4) and assuming that $\|v_0\|_{L^2_1}$ is suitably small (say, $\|v_0\|_{L^2_1} \le \delta_2$), we arrive at the desired energy inequality (4.3).

It remains to prove the inequality (4.4). To this end, we recall the following uniform estimate:
\[ \|v(t)\|_{L^\infty} \le M_\infty, \tag{4.8} \]
where $M_\infty = \|v_0\|_{L^\infty} + 2$. This is an easy consequence of the maximum principle (see [5] for the details).

Proof of (4.4) for $\beta = 0$. The proof is based on the time weighted $L^2$ energy method. We multiply Eq. (1.4) by $v$. This yields
\[ \Bigl(\frac{1}{2}v^2\Bigr)_t + (F - vv_x)_x + v_x^2 + G = 0, \tag{4.9} \]
where
\[ F = \bigl(f(\phi+v) - f(\phi)\bigr)v - \int_0^v \bigl(f(\phi+\eta) - f(\phi)\bigr)\,d\eta, \qquad G = \int_0^v \bigl(f'(\phi+\eta) - f'(\phi)\bigr)\,d\eta \cdot \phi_x. \tag{4.10} \]
We note that
\[ F = \frac{1}{2}f'(\phi)v^2 + O(|v|^3), \qquad G = \frac{1}{2}f''(\phi)\phi_xv^2 + \phi_xO(|v|^3) \tag{4.11} \]
for $v \to 0$. Also, we observe that
\[ \phi_x = f(\phi) = \frac{1}{q}(-\phi)^{q+1}\bigl(1 + O(|\phi|)\bigr), \qquad f''(\phi) = (q+1)(-\phi)^{q-1}\bigl(1 + O(|\phi|)\bigr) \]
for $|\phi| \to 0$ and that $f''(\phi) > 0$ by (1.2). Therefore, noting (3.2) and (4.8), we have from (4.11) that
\[ G \ge c(1+x)^{-2}v^2 - C(1+x)^{-1-1/q}|v|^3 \tag{4.12} \]
for any $x \in \mathbb{R}_+$. We integrate (4.9) over $\mathbb{R}_+$ and substitute (4.12) into the resulting equality. This gives
\[ \frac{1}{2}\frac{d}{dt}\|v\|^2_{L^2} + \|v_x\|^2_{L^2} + c\|v\|^2_{L^2_{-2}} \le C\|v\|^3_{L^3_{-1}}. \]
We multiply this inequality by $(1+t)^\gamma$ and integrate with respect to $t$. This yields the desired inequality (4.4) for $\beta = 0$.

Proof of (4.4) for $\beta > 0$. We apply the space–time weighted energy method employed in [15] (see also [3]). Let $w > 0$ be a smooth weight function depending only on $x$, which will be specified later. We multiply (4.9) by $w$, obtaining
\[ \Bigl(\frac{1}{2}v^2w\Bigr)_t + \Bigl\{(F - vv_x)w + \frac{1}{2}v^2w_x\Bigr\}_x + v_x^2w - \Bigl(\frac{1}{2}v^2w_{xx} + Fw_x - Gw\Bigr) = 0. \tag{4.13} \]
Here, using (4.11), we have
\[ \frac{1}{2}v^2w_{xx} + Fw_x - Gw = \frac{1}{2}v^2\bigl(w_{xx} + w_xf'(\phi) - wf''(\phi)\phi_x\bigr) + R, \tag{4.14} \]
where $R = w_xO(|v|^3) - w\phi_xO(|v|^3)$ for $v \to 0$. Notice that the coefficient $w_{xx} + w_xf'(\phi) - wf''(\phi)\phi_x$ in (4.14) is just the same as the one that appeared in (3.4). Now we choose the weight function $w$ and the corresponding $w_1$ by (2.9) with $\sigma = (\beta - 1)q$, where $0 \le \beta \le \alpha$ and $1 \le \alpha < \alpha_c(q) := 3 + 2/q$. Then we have (3.5) with $\sigma = (\beta - 1)q$. Substituting these expressions into (4.13) and integrating over $\mathbb{R}_+$, we obtain
\[ \frac{1}{2}\frac{d}{dt}\|v\|^2_{L^2(w)} + \|v_x\|^2_{L^2(w)} - c_1(\sigma)\|v\|^2_{L^2(w_1)} + \int_0^\infty v^2r(\phi)w_1\,dx = \int_0^\infty R\,dx, \tag{4.15} \]
where $c_1(\sigma)$ and $r(\phi)$ are given in (3.6) with $\sigma = (\beta - 1)q$. Here our weight functions in (2.9) are
\[ w = q(-\phi)^{-\sigma-q}/\bigl(1 + g(\phi)\bigr), \qquad w_1 = \frac{1}{q}(-\phi)^{-\sigma+q}\bigl(1 + g(\phi)\bigr). \]
Therefore, noting (3.2), we see that
\[ w \sim (1+x)^{\sigma/q+1} = (1+x)^\beta, \qquad w_1 \sim (1+x)^{\sigma/q-1} = (1+x)^{\beta-2}, \tag{4.16} \]
where the symbol $\sim$ means the equivalence. This implies that the norms $\|\cdot\|_{L^2(w)}$ and $\|\cdot\|_{L^2(w_1)}$ are equivalent to $\|\cdot\|_{L^2_\beta}$ and $\|\cdot\|_{L^2_{\beta-2}}$, respectively.

We estimate (4.15) similarly as in (1) of Theorem 3.3. To this end, we note that $\sigma_1 \le \sigma \le \sigma_2$, where $\sigma_1 = -q$ and $\sigma_2 = (\alpha - 1)q$. Since $c_1(\sigma) < \sigma^2/4$ for $-2q < \sigma < 2(q+1)$ and since $-2q < \sigma_1 < \sigma_2 < 2(q+1)$, we can choose $\delta > 0$ so small that
\[ \delta \le \min_{\sigma_1 \le \sigma \le \sigma_2}\frac{\sigma^2/4 - c_1(\sigma)}{2 + \sigma^2/4}. \]
Notice that this $\delta$ is independent of $\beta$. For this choice of $\delta$, we take $a = a(\delta) > 0$ so large that $|r(\phi)| \le \delta$ for $x \ge a$. Then we have
\[ \Bigl|\int_0^\infty v^2r(\phi)w_1\,dx\Bigr| \le \delta\|v\|^2_{L^2(w_1)} + C\|v\|^2_{L^2_{-2}}, \]
where $C$ is a constant satisfying $C \ge (1+x)^2|r(\phi)|w_1$ for $0 \le x \le a$. Also, using the Hardy type inequality $(\sigma^2/4)\|v\|^2_{L^2(w_1)} \le \|v_x\|^2_{L^2(w)}$ in (2.10), we have
\[ \|v_x\|^2_{L^2(w)} - c_1(\sigma)\|v\|^2_{L^2(w_1)} = \delta\|v_x\|^2_{L^2(w)} + (1-\delta)\|v_x\|^2_{L^2(w)} - c_1(\sigma)\|v\|^2_{L^2(w_1)} \]
\[ \ge \delta\|v_x\|^2_{L^2(w)} + \bigl\{(1-\delta)\sigma^2/4 - c_1(\sigma)\bigr\}\|v\|^2_{L^2(w_1)} \ge \delta\|v_x\|^2_{L^2(w)} + 2\delta\|v\|^2_{L^2(w_1)}, \]
where we used the fact that $(1-\delta)\sigma^2/4 - c_1(\sigma) \ge 2\delta$. On the other hand, using (4.8), we see that $|R| \le C(|w_x| + w\phi_x)|v|^3$. Moreover, a straightforward computation shows that $|w_x| + w\phi_x \le C(1+x)^{\beta-1}$. Substituting all these estimates into (4.15), we obtain
\[ \frac{1}{2}\frac{d}{dt}\|v\|^2_{L^2(w)} + \delta\bigl(\|v_x\|^2_{L^2(w)} + \|v\|^2_{L^2(w_1)}\bigr) \le C\|v\|^2_{L^2_{-2}} + C\|v\|^3_{L^3_{\beta-1}}, \tag{4.17} \]
where $\delta$ and $C$ are independent of $\beta$. We multiply this inequality by $(1+t)^\gamma$ and integrate with respect to $t$. By virtue of (4.16), we have
\[ (1+t)^\gamma\|v(t)\|^2_{L^2_\beta} + \int_0^t (1+\tau)^\gamma\bigl(\|v_x(\tau)\|^2_{L^2_\beta} + \|v(\tau)\|^2_{L^2_{\beta-2}}\bigr)\,d\tau \le C\|v_0\|^2_{L^2_\beta} + \gamma C\int_0^t (1+\tau)^{\gamma-1}\|v(\tau)\|^2_{L^2_\beta}\,d\tau + C\int_0^t (1+\tau)^\gamma\|v(\tau)\|^2_{L^2_{-2}}\,d\tau + CS^\gamma_\beta(t), \tag{4.18} \]
where the constant $C$ is independent of $\gamma$ and $\beta$. Here the third term on the right-hand side of (4.18) was already estimated by (4.4) with $\beta = 0$. Hence we have proved (4.4) also for $0 < \beta \le \alpha$. This completes the proof of Proposition 4.2. □

Proof of Theorem 4.1. The proof is essentially the same as that of Theorem 3.3 of [15] so that we only give an outline of the proof of the decay estimate (4.1). First, we show the following $L^2$ decay estimate by applying the induction argument to the space–time weighted energy inequality (4.3):
\[ (1+t)^j\|v(t)\|^2_{L^2_{\alpha-2j}} + \int_0^t (1+\tau)^j\bigl(\|v_x(\tau)\|^2_{L^2_{\alpha-2j}} + \|v(\tau)\|^2_{L^2_{\alpha-2j-2}}\bigr)\,d\tau \le C\|v_0\|^2_{L^2_\alpha} \tag{4.19} \]
for each integer $j$ with $0 \le j \le [\alpha/2]$. To see this, we put $\gamma = j$ and $\beta = \alpha - 2j$ in (4.3), obtaining
\[ (1+t)^j\|v(t)\|^2_{L^2_{\alpha-2j}} + \int_0^t (1+\tau)^j\bigl(\|v_x(\tau)\|^2_{L^2_{\alpha-2j}} + \|v(\tau)\|^2_{L^2_{\alpha-2j-2}}\bigr)\,d\tau \le C\|v_0\|^2_{L^2_{\alpha-2j}} + jC\int_0^t (1+\tau)^{j-1}\|v(\tau)\|^2_{L^2_{\alpha-2j}}\,d\tau. \tag{4.20} \]
Then, applying the induction with respect to the integer $j$ with $0 \le j \le [\alpha/2]$, we obtain (4.19). On the other hand, when $\alpha/2$ is not an integer, we have
\[ (1+t)^\gamma\|v(t)\|^2_{L^2} + \int_0^t (1+\tau)^\gamma\bigl(\|v_x(\tau)\|^2_{L^2} + \|v(\tau)\|^2_{L^2_{-2}}\bigr)\,d\tau \le C\|v_0\|^2_{L^2_\alpha}(1+t)^{\gamma-\alpha/2} \tag{4.21} \]
for any $\gamma$ with $\gamma > \alpha/2$. This can be shown by using (4.3) with $\gamma > \alpha/2$, $\beta = 0$ and (4.19) with $j = [\alpha/2]$ together with Nishikawa's technique in [12]. For the details, see the proof of Proposition 2.6 of [15].

Thus we have shown the following $L^2$ decay estimate:
\[ \|v(t)\|_{L^2} \le C\|v_0\|_{L^2_\alpha}(1+t)^{-\alpha/4}. \tag{4.22} \]
The desired $L^p$ decay estimate (4.1) is then obtained by the time weighted $L^p$ energy method. In fact, under the additional smallness condition on $\|v\|_{L^2_1}$, we have
\[ (1+t)^\gamma\|v(t)\|^p_{L^p} + \int_0^t (1+\tau)^\gamma\bigl(\|(|v|^{p/2})_x(\tau)\|^2_{L^2} + \|v(\tau)\|^p_{L^p_{-2}}\bigr)\,d\tau \le C\|v_0\|^p_{L^p} + C\|v_0\|^p_{L^2_\alpha}(1+t)^{\gamma-(\alpha/4+\nu)p}, \tag{4.23} \]
where $2 \le p < \infty$, $\nu = (1/2)(1/2 - 1/p)$ and $\gamma > (\alpha/4 + \nu)p$. We omit the details and refer the reader to the proof of Theorem 2.3 of [15]. This completes the proof of Theorem 4.1. □

References

[1] G.H. Hardy, Note on a theorem of Hilbert, Math. Z. 6 (1920) 314–317.
[2] Y. Kagei, S. Kawashima, Stability of planar stationary solutions to the compressible Navier–Stokes equation on the half space, Comm. Math. Phys. 266 (2006) 401–430.
[3] S. Kawashima, A. Matsumura, Asymptotic stability of traveling waves solutions of systems for one-dimensional gas motion, Comm. Math. Phys. 101 (1985) 97–127.
[4] S. Kawashima, S. Nishibata, M. Nishikawa, Asymptotic stability of stationary waves for two-dimensional viscous conservation laws in half plane, Discrete Contin. Dyn. Syst. Suppl. (2003) 469–476.
[5] S. Kawashima, S. Nishibata, M. Nishikawa, Lp energy method for multi-dimensional viscous conservation laws and application to the stability of planar waves, J. Hyperbolic Differ. Equ. 1 (2004) 581–603.
[6] S. Kawashima, S. Nishibata, M. Nishikawa, Asymptotic stability of stationary waves for multi-dimensional viscous conservation laws in half space, preprint, 2004.
[7] S. Kawashima, S. Nishibata, P. Zhu, Asymptotic stability of the stationary solution to the compressible Navier–Stokes equations in the half space, Comm. Math. Phys. 240 (2003) 483–500.
[8] E. Landau, A note on a theorem concerning series of positive terms, J. London Math. Soc. 1 (1926) 38–39.
[9] T.-P. Liu, A. Matsumura, K. Nishihara, Behavior of solutions for the Burgers equations with boundary corresponding to rarefaction waves, SIAM J. Math. Anal. 29 (1998) 293–308.
[10] A. Matsumura, K. Nishihara, Asymptotic stability of traveling waves for scalar viscous conservation laws with non-convex nonlinearity, Comm. Math. Phys. 165 (1994) 83–96.
[11] T. Nakamura, S. Nishibata, T. Yuge, Convergence rate of solutions toward stationary solutions to the compressible Navier–Stokes equation in a half line, J. Differential Equations 241 (1) (2007) 94–111.
[12] M. Nishikawa, Convergence rate to the traveling wave for viscous conservation laws, Funkcial. Ekvac. 41 (1998) 107–132.
[13] B. Opic, A. Kufner, Hardy-Type Inequalities, Pitman Res. Notes in Math. Ser., vol. 219, Longman Scientific & Technical, 1990.
[14] Y. Ueda, Asymptotic stability of stationary waves for damped wave equations with a nonlinear convection term, Adv. Math. Sci. Appl. 18 (1) (2008) 329–343.
[15] Y. Ueda, T. Nakamura, S. Kawashima, Stability of degenerate stationary waves for viscous gases, Arch. Ration. Mech. Anal., in press.
[16] Y. Ueda, T. Nakamura, S. Kawashima, Stability of planar stationary waves for damped wave equations with nonlinear convection in multi-dimensional half space, Kinetic Related Models 1 (2008) 49–64.
Journal of Functional Analysis 257 (2009) 20–46 www.elsevier.com/locate/jfa
The planar algebra of group-type subfactors ✩ Dietmar Bisch, Paramita Das, Shamindra Kumar Ghosh ∗ Vanderbilt University, Department of Mathematics, SC 1326, Nashville, TN 37240, USA Received 11 August 2008; accepted 25 March 2009
Communicated by D. Voiculescu
Abstract If G is a countable, discrete group generated by two finite subgroups H and K and P is a II_1 factor with an outer G-action, one can construct the group-type subfactor P^H ⊂ P ⋊ K introduced by Haagerup and the first author to obtain numerous examples of infinite depth subfactors whose standard invariant has exotic growth properties. We compute the planar algebra of this subfactor and prove that any subfactor with an abstract planar algebra of “group type” arises from such a subfactor. The action of Jones’ planar operad is determined explicitly. © 2009 Elsevier Inc. All rights reserved. Keywords: Planar algebra; Planar operad; Subfactor; IRF model
1. Introduction The technique of composing subfactors pioneered in [4] led to a zoo of exotic subfactors of the hyperfinite II_1 factor. In particular, the first examples of irreducible, amenable subfactors that are not strongly amenable (in the sense of [16]) were constructed in this way in [4]. The idea is simple: Let H and K be two finite groups with an outer action on a II_1 factor P (e.g. the hyperfinite II_1 factor) and consider the composition of the fixed-point subfactor P^H ⊂ P with the crossed product subfactor P ⊂ P ⋊ K to obtain what we will call here a group-type subfactor P^H ⊂ P ⋊ K. If P is hyperfinite, analytical properties of this subfactor, such as amenability
The authors were supported by NSF under Grant No. DMS-0301173.
* Corresponding author.
E-mail addresses:
[email protected] (D. Bisch),
[email protected] (P. Das),
[email protected] (S.K. Ghosh). 0022-1236/$ – see front matter © 2009 Elsevier Inc. All rights reserved. doi:10.1016/j.jfa.2009.03.014
D. Bisch et al. / Journal of Functional Analysis 257 (2009) 20–46
and property (T) in the sense of Popa [16,18], were proved to be equivalent to the corresponding properties (amenability, property (T) in the sense of Kazhdan) of the group G generated by H and K in the outer automorphism group of P [4,7]. Note that a group-type subfactor is obtained from the two fixed-point subfactors P H ⊂ P and P K ⊂ P by performing the basic construction of [10] to one of them. A group-type subfactor is therefore an invariant for the relative position of the two fixed-point subfactors. For more on this, see [12]. The main invariant for a subfactor is the so-called standard invariant (see for instance [9,8,13]). It is a very sophisticated mathematical object that can be portrayed in a number of seemingly quite different ways. For example, it has descriptions as a certain category of bimodules ([14], see also [3]), as a lattice of algebras [17], or as a planar algebra [11]. Jones’ planar algebra technology has become a very efficient tool to capture and analyze the standard invariant of a subfactor. Composition of subfactors was the motivating idea that led to the results in [5]. It was proved there that two standard invariants without extra structure, i.e. consisting of just the Temperley– Lieb algebras, can always be composed freely to form a new standard invariant – namely the one generated by the Fuss–Catalan algebras of [5]. This concept of free composition was then pushed much further in [6], where it is shown that any two planar algebras arising from subfactors can be composed freely to form a new subfactor planar algebra. The principal graphs of a subfactor encode the algebraic information contained in the standard invariant [9], and their structure determines the growth properties of the invariant. It was shown in [4] that subfactors whose standard invariants have very exotic growth properties exist by constructing concrete group-type subfactors. The principal graphs of these subfactors were computed there. 
In this paper we go a step further and give a concrete description of the standard invariant (or equivalently the planar algebra) of the group-type subfactors. We concentrate on the case when the group G, generated by H and K in the automorphism group of P , has an outer action on P . Note that any group G generated by two finite subgroups H and K has such an action on the hyperfinite II1 factor (for instance by a Bernoulli shift action). We find a description of the planar algebra of these group-type subfactors that is reminiscent of an IRF (interaction round a face) model in statistical mechanics. The general case, where G is generated by H and K in the outer automorphism group of P , and hence may or may not lift to an action of G on P , is much more elaborate and will be treated in a separate paper. Here is a more detailed outline of the sections of this paper. We review in Section 2 the basic notions of planar algebras. In Section 3, we define an abstract planar algebra P associated to a countable, discrete group G and two of its finite subgroups H and K which generate G. The vector spaces underlying the planar algebra are spanned by alternating words in H and K that multiply to the identity element. The action of Jones’ planar operad is given explicitly by a particular labelling of planar tangles that can be viewed as an IRF-like model. It takes some work to show that the action of planar tangles is well defined and preserves composition. The latter is achieved by showing that the action respects composition with certain elementary tangles that generate any annular tangle. We then analyze the filtered ∗-algebra structure of P and determine the action of Jones projection and conditional expectation tangles. In Section 4 we compute the basic construction and higher relative commutants of a grouptype subfactor P H ⊂ P K. 
We exhibit a nice basis which is used in Section 5 to prove that the abstract group-type planar algebra of Section 3 is indeed isomorphic to the concrete one computed in Section 4. Moreover, we prove in Section 5 that if the standard invariant of an
arbitrary subfactor is isomorphic to a group-type planar algebra, then the subfactor is indeed of group-type. This is proved using results on intermediate subfactors from [2,6,1]. 2. Planar algebra basics In this section, we will give a very brief overview of planar algebras; the reader is encouraged to see [11] for details. Let us first describe the key ingredients that constitute a planar tangle. • There is an external disc, several (possibly none) internal discs and a collection of disjoint smooth curves. • To each disc – internal or external, we attach a non-negative integer. This integer will be referred to as the ‘color’ of the disc. If a disc has color k > 0, there will be 2k points on the boundary of the disc marked 1, 2, . . . , 2k counted clockwise, starting with a distinguished marked point, which is decorated with ‘∗’. A disc having color 0 will have no marked points on its boundary. • Each of the curves is either closed, or joins a marked point on the boundary of a disc to another such point, meeting the boundary of the disc transversally. Each marked point must be the endpoint of exactly one curve. • The whole picture has to be planar, in the sense that there should be no crossing of curves or overlapping of discs. • Finally, we will not distinguish between pictures obtained from one another by planar isotopy preserving the ∗’s. The data of such a picture will be termed as a planar k-tangle, where k refers to the color of its external disc. Remark 2.1. We can induce a black-and-white shading on the complement of the union of the internal discs and curves in the external disc by specifying that the region between the last and first marked points be left unshaded. This leaves a scope for ambiguity in the case of 0-discs, thus, 0-discs may be of two types depending on whether the region surrounding their boundary is shaded or unshaded. 
Given a planar k-tangle T – one of whose internal discs have color ki – and a ki -tangle S, one can define the k-tangle T ◦i S by isotoping S so that its boundary, together with the marked points and the ‘∗’ coincides with that of Di and then erasing the boundary of Di . The collection of tangles – along with the composition defined thus – is called the colored operad of planar tangles. A planar algebra is a collection of vector spaces {Pk }k0 such that every k-tangle T that has b internal discs D1 , D2 , . . . , Db having colors k1 , k2 , . . . , kb respectively gives rise to a multilinear map ZT : Pn1 × Pn2 × · · · × Pnb → Pn0 . The collection of maps is required to be compatible with substitution of tangles and renumbering of internal discs. 3. An abstract planar algebra In this section we will abstractly define a planar algebra which will be identified in Section 5 to be the one corresponding to the group-type subfactor P H ⊂ P K of [4].
Fig. 1. Example of faces in a tangle.
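Let G be a group generated by two of its finite subgroups H and K, and let e denote the identity element of G. Let us define

$$S_n = \begin{cases} \{e\}, & \text{if } n = 0,\\ \underbrace{K \times H \times K \times H \times \cdots}_{n \text{ factors}}, & \text{if } n \geq 1, \end{cases} \qquad S = \bigsqcup_{n \geq 0} S_n, \qquad L_n = \begin{cases} K, & \text{if } n \text{ is even},\\ H, & \text{otherwise.} \end{cases}$$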
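The definition of S_n can be made concrete on a small example. The following is a minimal sketch under assumptions that are not taken from the paper: G is the symmetric group on three letters, K and H are the order-two subgroups generated by two transpositions, and the helper names (`words`, `multiply`, `trivial_words`) are hypothetical. It enumerates the alternating words in S_n and selects those that multiply to the identity.

```python
from itertools import product

# Assumed toy example (not from the paper): G = Sym(3), permutations as tuples.
E = (0, 1, 2)            # identity permutation of {0, 1, 2}
K = [E, (0, 2, 1)]       # subgroup generated by the transposition (1 2)
H = [E, (1, 0, 2)]       # subgroup generated by the transposition (0 1)

def mul(p, q):
    """Composition p∘q of two permutations given as tuples."""
    return tuple(p[q[i]] for i in range(len(q)))

def words(n):
    """Elements of S_n = K x H x K x H x ... (n factors); S_0 = {e}."""
    if n == 0:
        return [(E,)]
    factors = [K if i % 2 == 0 else H for i in range(n)]
    return list(product(*factors))

def multiply(word):
    """Product of the entries of a word, taken from left to right."""
    out = E
    for g in word:
        out = mul(out, g)
    return out

def trivial_words(n):
    """Words in S_n whose entries multiply to the identity element."""
    return [w for w in words(n) if multiply(w) == E]
```

In this toy case `trivial_words(2 * n)` lists the alternating words of length 2n that multiply to the identity, which is the kind of set used later in the section to span the vector spaces of the planar algebra.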
Let μ : S → G be the multiplication map. With the above notation, we are ready to define the planar algebra but first we would need some terminology. Terminology. By a face in an unlabelled tangle T , we will mean a connected component of D0 \ [( bi=1 Di ) ∪ S] where D0 is the external disc, Di is the ith internal disc for i = 1, 2, . . . , b and S is the set of strings of (an element in the isotopy class of) T . By an opening in a tangle, we will mean the subset of the boundary of a disc lying between two consecutive marked points. An opening will be called internal (resp., external) if it is a subset of the boundary of the internal (resp., external) disc. Note that the boundary of a generic face may be disconnected due to the presence of loops or networks inside it (see Fig. 1). The set of connected components of the boundary of each face will have a single outer component and several (possibly none) inner component(s). Definition 3.1. A state f on a tangle T is a function f : {all openings in T } → H K such that following holds:
(i) $f(\alpha) \in \begin{cases} K, & \text{if the face containing } \alpha \text{ is shaded},\\ H, & \text{otherwise.} \end{cases}$
(ii) Triviality on the outer component of the boundary of a face: Let α1 , α2 , . . . , αm be the openings on the outer component of the boundary of a face counted clockwise, then we must have
Fig. 2. An example of a state on an unlabelled tangle.
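$$f(\alpha_1)^{\eta_1} f(\alpha_2)^{\eta_2} \cdots f(\alpha_m)^{\eta_m} = e \quad \text{where} \quad \eta_i = \begin{cases} +1, & \text{if } \alpha_i \text{ is an external opening},\\ -1, & \text{otherwise.} \end{cases}$$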
(iii) Triviality on internal discs: f induces a map $\partial f : \{D_0, D_1, \dots, D_b\} \to \bigsqcup_{n \geq 0} S_n$ defined by

$$\partial f(D_i) = \bigl(f(\alpha^{(i)}_1), f(\alpha^{(i)}_2), \dots, f(\alpha^{(i)}_{2n_i})\bigr) \in S_{2n_i}$$

where $\alpha^{(i)}_1, \alpha^{(i)}_2, \dots, \alpha^{(i)}_{2n_i}$ are consecutive openings counted clockwise such that $\alpha^{(i)}_1$ is the opening between the first and the second marked points of $\partial D_i$. We demand that $\mu(\partial f(D_i)) = e$ for all i = 1, 2, . . . , b.

Note that triviality on the external disc and every boundary component of every face follows (see Remark 3.2 for a proof of this fact). The above definition also applies to networks (positively or negatively oriented). Fig. 2 illustrates conditions (ii) and (iii) of Definition 3.1; triviality on internal discs gives

$$b_1 b_2 b_3 b_4 = c_1 c_2 c_3 c_4 = d_1 d_2 d_3 d_4 d_5 d_6 d_7 d_8 = f_1 f_2 = e$$

and triviality on the outer component of the boundaries of faces gives

$$a_1 b_1^{-1} = a_2 b_2^{-1} = a_3 d_5^{-1} d_1^{-1} b_3^{-1} = c_4^{-1} f_2^{-1} = c_2 = d_2^{-1} d_4^{-1} = d_3 = a_4 a_6 b_4^{-1} d_8^{-1} d_6^{-1} = d_7 = a_5 = e.$$

Note that the above relations also imply $a_1 a_2 a_3 a_4 a_5 a_6 = e$, and $c_1^{-1} c_3^{-1} f_1^{-1} = e$.
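As a hedged check of the first implied relation, the equalities stated for Fig. 2 can be combined purely algebraically. The sketch below uses only the relations as written in the text, without reference to the figure itself.

```latex
% Rewriting the face relations:
%   a_1 = b_1,  a_2 = b_2,  a_3 = b_3 d_1 d_5,  a_5 = e,  a_4 a_6 = d_6 d_8 b_4,
%   d_3 = d_7 = e,  and  d_2^{-1} d_4^{-1} = e  gives  d_2 d_4 = e.
\begin{align*}
d_1 d_5 d_6 d_8 &= d_1 (d_2 d_4) d_5 d_6 d_8 = d_1 d_2 d_3 d_4 d_5 d_6 d_7 d_8 = e,\\
a_1 a_2 a_3 a_4 a_5 a_6 &= b_1 b_2 (b_3 d_1 d_5)(a_4 a_6)
  = b_1 b_2 b_3 (d_1 d_5 d_6 d_8) b_4 = b_1 b_2 b_3 b_4 = e.
\end{align*}
```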
As the computation involving Fig. 2 suggests, triviality on inner boundaries of a face and on the external disc are actually consequences of the definition of a state. This is made precise in the following remark. Remark 3.2. Let f be a state on a tangle or a network. Then the following conditions hold: (ii) Triviality on every inner component of the boundary of a face: For every inner component of the boundary of a face with openings α1 , α2 , . . . , αm counted clockwise, we have f (α1 )f (α2 ) · · · f (αm ) = e. (iii) Triviality on the external disc: This is just a restatement of condition (iii) in Definition 3.1 applied to the external disc only if its color is greater than zero. We prove the above remark using planar graphs. Without loss of generality, we may start with a tangle or a network which is connected. By a network or a tangle (with non-zero color on its external disc) being connected, we mean that the union of the boundaries of all the discs and strings is a connected set; a 0-tangle is said to be connected if the network obtained after removing its external disc is connected. To each tangle or network, we associate a planar graph whose vertex set is the set of all marked points on the internal and external discs, and the edges are the openings and the strings. Further, we make this graph a directed one in such a way that the directions on the edges arising from the openings are induced by clockwise orientation on the boundary of the discs, and the remaining edges (coming from the strings) are free to have any direction. Any state f assigns group elements to edges arising from openings; we label each of the remaining edges by e and that is why we did not put any restriction on the direction of such edges. Note that the definition of the state implies the following condition on the group labelled planar directed graph: Triviality on the boundary each bounded face of the graph1 : If g1 , g2 , . . . 
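, $g_m$ are the group elements assigned to consecutive edges around a face read clockwise, then

$$g_1^{\eta_1} g_2^{\eta_2} \cdots g_m^{\eta_m} = e \quad \text{where} \quad \eta_i = \begin{cases} +1, & \text{if the } i\text{th edge induces clockwise orientation in the face},\\ -1, & \text{otherwise.} \end{cases}$$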
To establish Remark 3.2, it is enough to prove triviality on the boundary of the unbounded face. To see this, we first consider a pair of bounded faces which have at least one vertex or edge in common. If this pair is considered as a separate graph, then using triviality on each face, it is easy to check triviality on the boundary of the unbounded face of this pair. One can then use this fact inductively to deduce the desired result for the whole graph. Next, we give the definition of the planar algebra. Let the set of states on a tangle T be denoted by S(T ). 1 Since we started with a connected tangle or network, the associated planar graph will be connected; in particular, boundary of each face of the graph will be connected.
Fig. 3. Assigning scalars to maxima or minima.
The vector spaces. For n ≥ 0, define $P_n = \mathbb{C}\{s \in S_{2n} : \mu(s) = e\}$.

Action of tangles. Let T be an unlabelled tangle with (possibly zero) internal disc(s) $D_1, D_2, \dots, D_b$ and external disc $D_0$, where the color of $D_i$ is $n_i$. Then T defines a multilinear map, denoted by $Z_T : P_{n_1} \times P_{n_2} \times \cdots \times P_{n_b} \to P_{n_0}$. We will define $Z_T(s_1, s_2, \dots, s_b) \in P_{n_0}$, where $s_i \in S_{2n_i}$ such that $\mu(s_i) = e$. In fact, we will just prescribe the coefficient of $s_0 \in S_{2n_0}$ (such that $\mu(s_0) = e$) in the expansion of $Z_T(s_1, s_2, \dots, s_b)$ in terms of the canonical basis mentioned above. We choose and fix a representative in the isotopy class of T and call it the standard form of T, denoted by $\tilde{T}$. It is assumed to satisfy the following properties:
• $\tilde{T}$ is in rectangular form, meaning that all of its discs are replaced by boxes and it is placed in $\mathbb{R}^2$ in such a way that the boundaries of the boxes are parallel to the co-ordinate axes.
• The first marked point on the boundary of each box is on the top left corner.
• The collection of strings has finitely many local maxima or minima.
• The external box can be sliced by horizontal lines in such a way that each maximum, minimum and internal box lies in a different slice.
To each local maximum or minimum of a string with end-points, we assign a scalar according to Fig. 3. Let p(T) denote the product of all scalars arising from the local maxima or minima and $n_+(T)$ (resp., $n_-(T)$) be the number of non-empty connected positively (resp., negatively) oriented network(s) in the tangle $\tilde{T}$. Then, the coefficient $Z_T(s_1, s_2, \dots, s_b)|_{s_0}$ of $s_0$ in $Z_T(s_1, s_2, \dots, s_b)$ is given by

$$p(T)\,|H|^{n_+(T)}\,|K|^{n_-(T)}\,\bigl|\{f \in S(T) : \partial f(D_i) = s_i \text{ for all } i = 0, 1, \dots, b\}\bigr|.$$

Note that there could be several standard form representatives for T. However, one standard form representative for T can be transformed into another by a finite sequence of moves of the following three types:
I. Horizontal or vertical sliding of boxes. II. Wiggling of the strings. III. Rotation of an internal box by a multiple of 360◦ . It is easy to check that the above three moves do not alter the number of connected networks and keep |{f ∈ S(T ): ∂f (Di ) = si for all i = 0, 1, . . . , b}| unaltered. So, it remains to show that p(T ) is unchanged under the moves. Type I moves are the easiest because they do not generate any new local maxima or minima. In each of the moves of type II and III, there arises pair(s) of local maximum and minimum in such a way that the two scalars assigned to each pair are inverses of each other; as a result, p(T ) remains unchanged. Action of the tangles preserve composition. For S an n0 -tangle with internal discs D1 , D2 , . . . , Db having colors n1 , n2 , . . . , nb respectively, and T an nj -tangle for some j ∈ {1, 2, . . . , b}, we would like to show ZS◦Dj T = ZS ◦ (idPn1 × · · · ZT · · · × idPnb ). For this, we first identify a set E of annular tangles (with the distinguished internal disc as D1 ) which we call elementary tangles, namely: (i) Capping tangles:
with col(D1 ) = n 1, col(D0 ) = n − 1, and 1 i 2n − 1 (col(D1 ) = 1 just means that there are no strings connecting the internal disc to the external disc). (ii) Cap inclusion tangles:
with col(D1 ) = n 0, col(D0 ) = n + 1, and 1 i 2n + 1.
If i = 1 in (i) or (ii) then the pictures should be interpreted as simply not having the bunch of (i − 1) straight strings joining the marked points of the internal disc and the corresponding points of the external disc. (iii) Left inclusion tangles:
with col($D_1$) = n ≥ 0, col($D_0$) = n + 2.
(iv) Disc inclusion tangle:
(iv′) Disc inclusion tangle:
with col($D_1$) = col($D_0$) = n ≥ 0, col($D_2$) = q, where p ≥ 0, q ≥ 0, r ≥ 0 such that p + q + r = n and p is even.

Note that any annular tangle can be expressed as a composition of the elementary tangles. To see this, express the annular tangle in standard form and cut it into horizontal strips, each of which contains at most one internal disc or one local maximum or minimum. Now, the strip containing the distinguished internal disc inside the annular tangle can be obtained by composition of elementary tangles of type (iii) and type (ii) (more specifically, the inclusion tangles); one can then glue the other strips consecutively one after the other along the lines of cutting to get back the original tangle. Each such gluing operation is given by composition of an elementary tangle of type (i), (ii), (iv) or (iv′). So, to prove that the action of the tangles preserves composition, it is enough to prove $Z_{E \circ_{D_1} T} = Z_E \circ Z_T$ (resp., $Z_{E \circ_{D_1} T} = Z_E \circ (Z_T \times \mathrm{id}_{P_q})$) for any tangle T and any $E \in \mathcal{E}$ of type (i), (ii) or (iii) (resp. (iv) or (iv′)) whenever the composition makes sense.
We fix an n-tangle T with internal discs D1 , D2 , . . . , Db with colors n1 , n2 , . . . , nb respectively, and an n0 -tangle E ∈ E such that both T and E are in standard forms and color of D1 in E is n. Let us consider the standard form on E ◦D1 T induced by the standard forms of E and T . Our goal is to show: ZE◦D1 T (s1 , s2 , . . . , sb )|s0 =
$$\sum_{s \in S_{2n} \text{ s.t. } \mu(s) = e} Z_E(s)|_{s_0}\, Z_T(s_1, s_2, \dots, s_b)|_s$$

if E is of type (i), (ii) or (iii), and

$$Z_{E \circ_{D_1} T}(s_1, s_2, \dots, s_b, t)|_{s_0} = \sum_{s \in S_{2n} \text{ s.t. } \mu(s) = e} Z_E(s, t)|_{s_0}\, Z_T(s_1, s_2, \dots, s_b)|_s$$
if E is of type (iv) or (iv′), where $s_j \in S_{2n_j}$ for 0 ≤ j ≤ b and $t \in S_q$ such that $\mu(s_j) = e = \mu(t)$ for all j. An interesting situation arises when we pick elementary tangles of type (i), since composition of tangles in this case may lead to a change in the number of connected networks. The reasoning in the other cases is either similar or straightforward. For E being a type (i) elementary tangle, the above equation is equivalent to:

$$p(E \circ_{D_1} T)\,|H|^{n_+(E \circ_{D_1} T)}\,|K|^{n_-(E \circ_{D_1} T)}\,\bigl|\{f \in S(E \circ_{D_1} T) : \partial f(D_j) = s_j \text{ for } 1 \leq j \leq b,\ \partial f(D_0) = s_0\}\bigr|$$
$$= p(E)\,p(T)\,|H|^{n_+(E) + n_+(T)}\,|K|^{n_-(E) + n_-(T)} \times \sum_{s \in S_{2n} \text{ s.t. } \mu(s) = e} \bigl|\{f \in S(E) : \partial f(D_1) = s,\ \partial f(D_0) = s_0\}\bigr|\,\bigl|\{f \in S(T) : \partial f(D_j) = s_j \text{ for } 1 \leq j \leq b,\ \partial f(D_0') = s\}\bigr|$$

where $D_0'$ denotes the external disc of T. First, observe that $p(E \circ_{D_1} T) = p(E)p(T)$. To show the equality of the remaining scalars, we consider the following two cases.

Case 1. The string which connects the ith and the (i + 1)th points on $D_1$ in E does not produce any new network in $E \circ_{D_1} T$ other than those that are already present in T. Clearly, $n_\varepsilon(E \circ_{D_1} T) = n_\varepsilon(T)$ (since no new network appears in $E \circ_{D_1} T$) and $n_\varepsilon(E) = 0$ for $\varepsilon \in \{+, -\}$. A typical example of such a case can be viewed in the following picture, where we label the openings on the internal discs of T by group elements coming from the coordinates of $s_j$ for 1 ≤ j ≤ b, and the openings on $D_0$ of E by coordinates of $s_0 = (g_1, g_2, \dots, g_{2n-2})$. Using triviality on the boundary of faces in the definition of a state, we get
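$$\bigl|\{f \in S(E) : \partial f(D_1) = s,\ \partial f(D_0) = s_0\}\bigr| \neq 0 \iff \bigl|\{f \in S(E) : \partial f(D_1) = s,\ \partial f(D_0) = s_0\}\bigr| = 1$$
$$\iff s = \bigl(g_1, \dots, g_{i-2}, (g_{i-1} g), e, g^{-1}, g_i, \dots, g_{2n-2}\bigr) \in S_{2n} \text{ for some } g \in L_i.$$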
Fig. 4.
Define $s^g = (g_1, \dots, g_{i-2}, (g_{i-1} g), e, g^{-1}, g_i, \dots, g_{2n-2})$ for $g \in L_i$. So, it is enough to check

$$\bigl|\{f \in S(E \circ_{D_1} T) : \partial f(D_j) = s_j \text{ for } 1 \leq j \leq b,\ \partial f(D_0) = s_0\}\bigr| = \sum_{g \in L_i} \bigl|\{f \in S(T) : \partial f(D_j) = s_j \text{ for } 1 \leq j \leq b,\ \partial f(D_0') = s^g\}\bigr|.$$

Carefully observing Fig. 4 and using triviality on the boundary of faces once again, we get

$$\bigl|\{f \in S(E \circ_{D_1} T) : \partial f(D_j) = s_j \text{ for } 1 \leq j \leq b,\ \partial f(D_0) = s_0\}\bigr| \neq 0 \text{ (equivalently, equals 1)}$$
$$\implies \bigl|\{f \in S(T) : \partial f(D_j) = s_j \text{ for } 1 \leq j \leq b,\ \partial f(D_0') = s^g\}\bigr| = \delta_{g,\ b_1^{\eta_1} b_2^{\eta_2} \cdots b_l^{\eta_l}}$$

where $\eta_j = \pm 1$ according as the corresponding opening is external or internal. Conversely, if

$$\sum_{g \in L_i} \bigl|\{f \in S(T) : \partial f(D_j) = s_j \text{ for } 1 \leq j \leq b,\ \partial f(D_0') = s^g\}\bigr|$$

is non-zero, then it has to equal 1 since g must be $b_1^{\eta_1} b_2^{\eta_2} \cdots b_l^{\eta_l}$ by triviality on the boundary of faces in T; from the unique state on T which makes the above sum non-zero, one can easily induce a well-defined state on $E \circ_{D_1} T$, and hence

$$\bigl|\{f \in S(E \circ_{D_1} T) : \partial f(D_j) = s_j \text{ for } 1 \leq j \leq b,\ \partial f(D_0) = s_0\}\bigr| = 1.$$

This completes the proof of Case 1.
Fig. 5.
Case 2. The string which connects the ith and the (i + 1)th points on $D_1$ in E produces a new (connected) network in $E \circ_{D_1} T$ other than those that are already present in T. First, let us assume that the new network is positively oriented, equivalently, i is odd. Clearly, $n_-(E \circ_{D_1} T) = n_-(T)$ (since no new negatively oriented network appears in $E \circ_{D_1} T$), and $n_+(E \circ_{D_1} T) = n_+(T) + 1$. Further, assume that col($D_0$) ≥ 1. In this case, $n_\varepsilon(E) = 0$ for $\varepsilon \in \{+, -\}$. A typical example of this case can be viewed in the following picture, where we label the openings on the internal discs of T by group elements coming from the coordinates of $s_j$ for 1 ≤ j ≤ b, and the openings on $D_0$ of E by coordinates of $s_0 = (g_1, g_2, \dots, g_{2n-2})$. So, in this case, it is enough to check

$$|H|\,\bigl|\{f \in S(E \circ_{D_1} T) : \partial f(D_j) = s_j \text{ for } 1 \leq j \leq b,\ \partial f(D_0) = s_0\}\bigr| = \sum_{g \in H} \bigl|\{f \in S(T) : \partial f(D_j) = s_j \text{ for } 1 \leq j \leq b,\ \partial f(D_0') = s^g\}\bigr|.$$

If

$$\{f \in S(E \circ_{D_1} T) : \partial f(D_j) = s_j \text{ for } 1 \leq j \leq b,\ \partial f(D_0) = s_0\}$$

is non-empty (equivalently, singleton), from Fig. 5, we have $a_1 a_2 \cdots a_k = e$ and $g_{i-1} b_1^{\eta_1} b_2^{\eta_2} \cdots b_l^{\eta_l} = e$, where $\eta_j = \pm 1$ according as the corresponding opening is external or internal. For any $g \in H$, define $f^g$ by setting $\partial f(D_j) = s_j$ for 1 ≤ j ≤ b and $\partial f(D_0') = s^g$. To check whether $f^g$ is a state, we consider the face in T appearing in Fig. 5; triviality on the boundary of this face is given by the equation $g^{-1} b_1^{\eta_1} b_2^{\eta_2} \cdots b_l^{\eta_l} (g_{i-1} g) a_1 a_2 \cdots a_k = e$, which indeed holds. Triviality on all other discs or faces is induced by the existence of the state on $E \circ_{D_1} T$. Thus,

$$\bigl|\{f \in S(T) : \partial f(D_j) = s_j \text{ for } 1 \leq j \leq b,\ \partial f(D_0') = s^g\}\bigr| = 1$$
for all $g \in H$. Conversely, if the right-hand side is non-zero, then there exists $g \in H$ such that $f^g$ (defined earlier) is a state on T. Analyzing Fig. 5, we get $g^{-1} b_1^{\eta_1} b_2^{\eta_2} \cdots b_l^{\eta_l} (g_{i-1} g) a_k^{-1} \cdots a_2^{-1} a_1^{-1} = e$. Note that the opening between the ith and the (i + 1)th points of the disc $D_0'$ of T is assigned e. Now, if we consider the network appearing in Fig. 5 separately, then we have triviality on each of its internal faces and discs (induced by $f^g$ being a state); by Remark 3.2(ii), we also have $a_1 a_2 \cdots a_k = e$ (triviality on the internal boundary of the external face of the network). This implies $b_1^{\eta_1} b_2^{\eta_2} \cdots b_l^{\eta_l} g_{i-1} = e$; as a result, $f^h$ is a state for every $h \in H$. So, if the right-hand side is non-zero, then it has to be |H|; moreover, we get $b_1^{\eta_1} b_2^{\eta_2} \cdots b_l^{\eta_l} g_{i-1} = e$, which plays an important role in showing that

$$\bigl|\{f \in S(E \circ_{D_1} T) : \partial f(D_j) = s_j \text{ for } 1 \leq j \leq b,\ \partial f(D_0) = s_0\}\bigr| \neq 0 \text{ (equivalently, equals 1).}$$

This finishes the proof for the case where i is odd. For i even, the proof is exactly similar, except that one has to interchange |H| and |K|. The subcase that deserves separate treatment is when col($D_0$) = 0. In this case, $n_+(E) = 1$ and $|\{f \in S(E) : \partial f(D_1) = s\}| = \delta_{s,(e,e)}$. Therefore, it is enough to show

$$\bigl|\{f \in S(E \circ_{D_1} T) : \partial f(D_j) = s_j \text{ for } 1 \leq j \leq b\}\bigr| = \bigl|\{f \in S(T) : \partial f(D_j) = s_j \text{ for } 1 \leq j \leq b,\ \partial f(D_0') = (e, e)\}\bigr|.$$
The proof of the equality of the two sides is similar and is left to the reader. We now analyze the filtered ∗-algebra structure of P and the action of Jones projection tangles and conditional expectation tangles, which will be useful in Section 5 to show that P is isomorphic to the planar algebra arising from a group-type subfactor. We start by fixing some notation. Define

$$\tilde{S}_n = \begin{cases} \{e\}, & \text{if } n = 0,\\ \underbrace{\cdots \times H \times K \times H \times K}_{n \text{ factors}}, & \text{if } n \geq 1, \end{cases} \qquad T_n = \begin{cases} \{e\}, & \text{if } n = 0,\\ \underbrace{H \times K \times H \times K \times \cdots}_{n \text{ factors}}, & \text{if } n \geq 1. \end{cases}$$

Define $\tilde{\ } : S_n \to \tilde{S}_n$ by $(s_1, s_2, \dots, s_n)^{\sim} = (s_n^{-1}, \dots, s_2^{-1}, s_1^{-1})$ for $(s_1, s_2, \dots, s_n) \in S_n$ and let $\tilde{\ } : \tilde{S}_n \to S_n$ denote its inverse.

Remark 3.3. We describe below the main structural features of the planar algebra P.
(i) Identity:
$$1_{P_n} = \begin{cases} e, & \text{if } n = 0,\\ \sum_{s \in S_{n-1}} (s, e, \tilde{s}, e), & \text{if } n \geq 1. \end{cases}$$
(ii) ∗-structure: Define ∗ on P by defining on the basis as
−1 −1 s ∗ = s2n−1 , . . . , s2−1 , s1−1 , s2n where s = (s1 , s2 , . . . , s2n ) ∈ S2n such that μ(s) = e for n 1, and then extend conjugate linearly. Clearly, ∗ is an involution. One also needs to verify whether the action of a tangle T preserves ∗, that is, ZT ∗ ◦ (∗ × · · · × ∗) = ∗ ◦ ZT ; in particular, ZT ∗ (s1∗ , . . . , sb∗ )|s0∗ =
ZT (s1 , . . . , sb )|s0 . It is enough to check this equation for the cases when T has no internal disc or closed loops, and when T is an elementary tangle. The actual verification in each of these cases is completely routine and is left to the reader. (iii) Multiplication: (a1 , l1 , b1 , h1 ) · (a2 , l2 , b2 , h2 ) = δb1 ,a2 (a1 , l1 l2 , b2 , h2 h1 ) where ai ∈ Sn−1 , bi ∈ S n−1 , li ∈ Ln−1 , hi ∈ H such that μ(ai , li , bi , hi ) = e for i = 1, 2 and n 1 (where we consider the elements ai and bi to be void in the case of n = 1). (iv) Inclusion:
$$P_n \ni s \mapsto \sum_{l_1, l_2 \in L_{n-1} \text{ such that } l_1 l_2 = s_n} (s_1, s_2, \dots, s_{n-1}, l_1, e, l_2, s_{n+1}, \dots, s_{2n}) \in P_{n+1}$$
where s = (s1 , s2 , . . . , s2n ) ∈ S2n such that μ(s) = e for n 1. (v) Jones Projection Tangle: For P2 ,
=
|K| e, h, e, h−1 |H | h∈H
and for Pn+1 for n > 1,
=
|Ln−1| |Ln |
s ∈ Sn−2 l1 , l2 , l3 ∈ Ln s.t. l1 l2 l3 = e
(s, l1 , e, l2 , e, l3 , s˜ , e).
(vi) Conditional Expectation Tangle from Pn+1 onto Pn : For n 1, let s1 ∈ Sn−1 , s2 ∈ S n−1 , m1 , m2 ∈ Ln−1 , l ∈ Ln , h ∈ H such that μ(s1 , m1 m2 , s2 , h) = e. Then,
= δl,e
|Ln | (s1 , m1 m2 , s2 , h) |Ln−1|
and for n = 0,
= δh,e δk,e
|K| e. |H |
(vii) Conditional Expectation Tangle from Pn onto P1,n : For n 2 let k1 , k2 ∈ K, t ∈ T2n−3 , h ∈ H such that μ(k1 , t, k2 , h) = e. Then,
= δh,e
|H | |K|
k ,k ∈K s.t. k k =k2 k1
and for n = 1,
= δh,e δk,e
|H | e. |K|
(k , t, k , e)
4. Iterated basic construction and higher relative commutants of group-type subfactors Let P be a II1 factor and G a countable discrete group with an outer action on P , and suppose G is generated by two of its finite subgroups H and K. Consider the associated group-type subfactor P H ⊂ P K. In this section we will give a concrete realization of the Jones tower associated to this subfactor and compute its higher relative commutants. See also [19] for related results. First, let us recall the following characterization of the basic construction of a finite index subfactor [15]. Lemma 4.1. Let N ⊂ M be a finite index subfactor with E : M → N being the trace-preserving conditional expectation, B be a II1 factor containing M as a subfactor and f be a projection in B satisfying: (i) f xf = E(x)f for all x ∈ M, (ii) B is the algebra generated by M and f . Then, B is isomorphic to the basic construction M1 of N ⊂ M. Second, we recall some basic facts and notations for the crossed product construction. Unless otherwise specified, we will reserve the symbol e for the identity element of a group. The crossed product P K can be realized as the von Neumann subalgebra of L(l 2 (K) ⊗ L2 (P )) (∼ = MK (C) ⊗ L(L2 (P ))) generated by the images of P and K in the following way:
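$$P \ni x \mapsto \sum_{k \in K} E_{k,k} \otimes k^{-1}(x) \in M_K(\mathbb{C}) \otimes \mathcal{L}(L^2(P)), \qquad (4.1)$$
$$K \ni k \mapsto \lambda_k \otimes 1 \in M_K(\mathbb{C}) \otimes \mathcal{L}(L^2(P)) \qquad (4.2)$$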
where we set the convention of considering k(x) as the element of P obtained by applying the automorphism k on x (in P) and $\lambda_k$ is the matrix in $M_K(\mathbb{C})$ corresponding to left multiplication by k. Consequently, the following commutation relation holds in $P \rtimes K$:

$$k x k^{-1} = k(x) \quad \text{for all } x \in P,\ k \in K. \qquad (4.3)$$
However, $P \rtimes K$ can also be viewed as the vector space generated by elements of the form $\sum_{k \in K} x_k k$, $x_k \in P$, where the multiplication structure is given by the relation (4.3). The unique trace on $P \rtimes K$ is given by

$$\operatorname{tr}\Bigl(\sum_{k \in K} x_k k\Bigr) = \operatorname{tr}(x_e)$$

and the unique trace-preserving conditional expectation is given by

$$E^{P \rtimes K}_P\Bigl(\sum_{k \in K} x_k k\Bigr) = x_e.$$
If P K denotes the fixed point subalgebra of P , then P K is isomorphic to the basic construction of P K ⊂ P where the Jones projection is given by
36
D. Bisch et al. / Journal of Functional Analysis 257 (2009) 20–46
e1 =
1 k∈P K |K| k∈K
implementing the conditional expectation: 1 k(x) ∈ P K |K|
EPP K (x) =
for all x ∈ P .
k∈K
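The basic construction of $P \subset P \rtimes K$ is isomorphic to $M_K(\mathbb{C}) \otimes P$ where the inclusion $P \rtimes K \hookrightarrow M_K(\mathbb{C}) \otimes P$ is given by the maps (4.1), (4.2), and the corresponding Jones projection is given by $e_2 = E_{e,e} \otimes 1 \in M_K(\mathbb{C}) \otimes P$ implementing the conditional expectation $E^{P \rtimes K}_{P}$. The next element in the tower of basic construction is given by $M_K(\mathbb{C}) \otimes (P \rtimes K)$ where the inclusion $M_K(\mathbb{C}) \otimes P \hookrightarrow M_K(\mathbb{C}) \otimes (P \rtimes K)$ is induced by the inclusion $P \subset P \rtimes K$ and the Jones projection is given by
\[ e_3 = \frac{1}{|K|} \sum_{k \in K} \rho_{k^{-1}} \otimes k \in M_K(\mathbb{C}) \otimes (P \rtimes K) \tag{4.4} \]
implementing the conditional expectation:
\[ E^{M_K(\mathbb{C}) \otimes P}_{P \rtimes K}(E_{k_1,k_2} \otimes x) = \frac{1}{|K|}\, k_1 x k_2^{-1} \in P \rtimes K \quad \text{for all } x \in P,\ k_1, k_2 \in K, \tag{4.5} \]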
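where $\rho_k$ is the matrix in $M_K(\mathbb{C})$ corresponding to right multiplication by $k$.

Coming back to the context of group-type subfactors we consider the unital inclusions $P^H \hookrightarrow P \rtimes K \hookrightarrow M_K(\mathbb{C}) \otimes (P \rtimes H)$ where the second inclusion factors through $M_K(\mathbb{C}) \otimes P$ in the obvious way.

Lemma 4.2. $M_K(\mathbb{C}) \otimes (P \rtimes H)$ is the basic construction for $P^H \subset P \rtimes K$ with Jones projection $e_1 = E_{e,e} \otimes \frac{1}{|H|} \sum_{h \in H} h$.

Proof. We need to show that conditions (i) and (ii) of Lemma 4.1 are satisfied. To show (i), let us assume that $\tilde{x} = \sum_{k \in K} x_k k$ denotes a typical element of $P \rtimes K$,
\begin{align*}
e_1 \tilde{x} e_1
&= \Bigl(E_{e,e} \otimes \frac{1}{|H|} \sum_{h \in H} h\Bigr)\, \tilde{x}\, \Bigl(E_{e,e} \otimes \frac{1}{|H|} \sum_{h' \in H} h'\Bigr) \\
&= \Bigl(E_{e,e} \otimes \frac{1}{|H|} \sum_{h \in H} h\Bigr) \Bigl(\sum_{k' \in K} \sum_{k \in K} E_{k',k'} \lambda_k \otimes k'^{-1}(x_k)\Bigr) \Bigl(E_{e,e} \otimes \frac{1}{|H|} \sum_{h' \in H} h'\Bigr) \\
&= \sum_{k \in K} E_{e,e} \lambda_k E_{e,e} \otimes \frac{1}{|H|^2} \sum_{h \in H} \sum_{h' \in H} h x_k h' \\
&= E_{e,e} \otimes \frac{1}{|H|^2} \sum_{h \in H} \sum_{h' \in H} h(x_e)\, h h'
\end{align*}
whereas
\begin{align*}
E(\tilde{x}) e_1
&= \Bigl(\sum_{k \in K} E_{k,k} \otimes k^{-1}\Bigl(\frac{1}{|H|} \sum_{h \in H} h(x_e)\Bigr)\Bigr) \Bigl(E_{e,e} \otimes \frac{1}{|H|} \sum_{h' \in H} h'\Bigr) \\
&= E_{e,e} \otimes \frac{1}{|H|^2} \sum_{h \in H} \sum_{h' \in H} h(x_e)\, h'.
\end{align*}
Therefore, LHS = RHS.

To show (ii), it is enough to show that elements of the form $E_{k_1,k_2} \otimes xh$ for $x \in P$, $h \in H$, $k_1, k_2 \in K$ are in the algebraic span of $P \rtimes K$ and $e_1$. Let us denote the Jones projection in $P \rtimes H$ corresponding to the inclusion $P^H \subset P$ by $f = \frac{1}{|H|} \sum_{h \in H} h$. Thus, $e_1 = E_{e,e} \otimes f \in M_K(\mathbb{C}) \otimes P \rtimes H$. This implies
\[ P e_1 P = E_{e,e} \otimes P f P = E_{e,e} \otimes P \rtimes H \subset M_K(\mathbb{C}) \otimes P \rtimes H \]
where $P$ in the left-hand side is identified with its image inside $M_K(\mathbb{C}) \otimes P \rtimes H$ (namely, the prescription given by (4.1)). Thus the algebraic span of $P \rtimes K$ and $e_1$ contains elements of the type $E_{e,e} \otimes xh$ for $x \in P$, $h \in H$. To obtain elements of the form $E_{k_1,k_2} \otimes xh$, note that the relation $\lambda_{k_1} E_{e,e} \lambda_{k_2^{-1}} = E_{k_1,k_2}$ holds in $M_K(\mathbb{C})$. □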
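The matrix-unit relation used at the end of the proof can be checked numerically in a small case. The following is a minimal sketch, assuming the symmetric group $S_3$ as a stand-in for the finite group $K$; the helper names (`compose`, `inverse`, `matrix_unit`, `left_mult`) are illustrative and not from the paper. It also checks that the average of the left translations is a self-adjoint idempotent, which is the matrix-level shadow of the projection $f = \frac{1}{|H|}\sum_h h$; it does not model the factor $P$ itself.

```python
import numpy as np
from itertools import permutations, product

# Assumption: S_3 stands in for the finite group K.
# A group element is a tuple p, read as the map i -> p[i].
elements = list(permutations(range(3)))
index = {p: i for i, p in enumerate(elements)}
identity = tuple(range(3))
n = len(elements)

def compose(p, q):
    """Group product pq (apply q first, then p)."""
    return tuple(p[q[i]] for i in range(3))

def inverse(p):
    """Inverse permutation, obtained as the argsort of p."""
    return tuple(sorted(range(3), key=lambda i: p[i]))

def matrix_unit(a, b):
    """E_{a,b}: sends the basis vector delta_b to delta_a."""
    E = np.zeros((n, n))
    E[index[a], index[b]] = 1.0
    return E

def left_mult(k):
    """lambda_k: sends delta_g to delta_{kg}."""
    L = np.zeros((n, n))
    for g in elements:
        L[index[compose(k, g)], index[g]] = 1.0
    return L

# lambda_{k1} E_{e,e} lambda_{k2^{-1}} = E_{k1,k2} for every pair (k1, k2).
relation_holds = all(
    np.array_equal(
        left_mult(k1) @ matrix_unit(identity, identity) @ left_mult(inverse(k2)),
        matrix_unit(k1, k2),
    )
    for k1, k2 in product(elements, repeat=2)
)

# The averaged left translations form a self-adjoint idempotent.
f = sum(left_mult(k) for k in elements) / n
f_is_projection = np.allclose(f @ f, f) and np.allclose(f, f.T)

print(relation_holds, f_is_projection)
```

A non-abelian group is used on purpose, so that the order of the factors in $\lambda_{k_1} E_{e,e} \lambda_{k_2^{-1}}$ is actually tested.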
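Lemma 4.3. $M_K(\mathbb{C}) \otimes M_H(\mathbb{C}) \otimes (P \rtimes K)$ is the basic construction for $P \rtimes K \subset M_K(\mathbb{C}) \otimes (P \rtimes H)$ where the Jones projection is given by $e_2 = \frac{1}{|K|} \sum_{k \in K} \rho_{k^{-1}} \otimes E_{e,e} \otimes k$.

Proof. The conditional expectation $E^{M_K(\mathbb{C}) \otimes (P \rtimes H)}_{P \rtimes K}$ is the composition
\[ E^{M_K(\mathbb{C}) \otimes P}_{P \rtimes K} \circ E^{M_K(\mathbb{C}) \otimes (P \rtimes H)}_{M_K(\mathbb{C}) \otimes P}. \]
Therefore, $E(E_{k_1,k_2} \otimes xh) = \delta_{h,e} \frac{1}{|K|} k_1 x k_2^{-1}$. To show condition (i) of Lemma 4.1,
\begin{align*}
e_2 (E_{k_1,k_2} \otimes xh) e_2
&= \frac{1}{|K|^2} \Bigl(\sum_{k' \in K} \rho_{k'^{-1}} \otimes E_{e,e} \otimes k'\Bigr) \Bigl(E_{k_1,k_2} \otimes \sum_{h' \in H} E_{h',h'} \lambda_h \otimes h'^{-1}(x)\Bigr) \Bigl(\sum_{k'' \in K} \rho_{k''^{-1}} \otimes E_{e,e} \otimes k''\Bigr) \\
&= \delta_{h,e} \frac{1}{|K|^2} \sum_{k',k'' \in K} \rho_{k'^{-1}} E_{k_1,k_2} \rho_{k''^{-1}} \otimes E_{e,e} \lambda_h E_{e,e} \otimes k' x k'' \\
&= \delta_{h,e} \frac{1}{|K|^2} \sum_{k',k'' \in K} E_{k_1 k'^{-1},\, k_2 k''} \otimes E_{e,e} \otimes k' x k''
\end{align*}
whereas
\begin{align*}
E(E_{k_1,k_2} \otimes xh) e_2
&= \delta_{h,e} \frac{1}{|K|^2} \Bigl(\sum_{h' \in H} \sum_{k \in K} \lambda_{k_1} E_{k,k} \lambda_{k_2^{-1}} \otimes E_{h',h'} \otimes h'^{-1} k^{-1}(x)\Bigr) \Bigl(\sum_{k' \in K} \rho_{k'^{-1}} \otimes E_{e,e} \otimes k'\Bigr) \\
&= \delta_{h,e} \frac{1}{|K|^2} \sum_{k,k' \in K} \lambda_{k_1} E_{k,k} \lambda_{k_2^{-1}} \rho_{k'^{-1}} \otimes E_{e,e} \otimes k^{-1}(x) k' \\
&= \delta_{h,e} \frac{1}{|K|^2} \sum_{k,k' \in K} E_{k_1 k,\, k_2 k k'} \otimes E_{e,e} \otimes k^{-1} x k k'
\end{align*}
and the two sides are the same after renaming the indices.

To show condition (ii), note that $M_K(\mathbb{C}) \otimes (P \rtimes K)$ is algebraically generated by $M_K(\mathbb{C}) \otimes P$ and $\frac{1}{|K|} \sum_{k \in K} \rho_{k^{-1}} \otimes k$ (by the remarks preceding Eq. (4.4)). The following holds in $M_K(\mathbb{C}) \otimes M_H(\mathbb{C}) \otimes (P \rtimes K)$ because of this fact and the way $P$ sits inside $M_H(\mathbb{C}) \otimes P$ (since in the second tensor component we get expressions of the form $\sum_{h,h' \in H} E_{h,h} E_{e,e} E_{h',h'}$ which reduces to $E_{e,e}$):
\begin{align*}
& (M_K(\mathbb{C}) \otimes P)\, e_2\, (M_K(\mathbb{C}) \otimes P) = M_K(\mathbb{C}) \otimes E_{e,e} \otimes (P \rtimes K) \\
\Rightarrow\ & (M_K(\mathbb{C}) \otimes P \rtimes H)\, e_2\, (M_K(\mathbb{C}) \otimes P \rtimes H) = M_K(\mathbb{C}) \otimes M_H(\mathbb{C}) \otimes (P \rtimes K)
\end{align*}
where the last implication is again due to the relation $\lambda_{h_1} E_{e,e} \lambda_{h_2^{-1}} = E_{h_1,h_2}$ in $M_H(\mathbb{C})$. □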
Thus, we have the first two levels in the tower of basic construction:
\[ P^H \subset P \rtimes K \subset M_K(\mathbb{C}) \otimes (P \rtimes H) \subset M_{K \times H}(\mathbb{C}) \otimes (P \rtimes K) \]
where we identify $M_{K \times H}(\mathbb{C})$ with $M_K(\mathbb{C}) \otimes M_H(\mathbb{C})$. The next levels in the tower are obvious generalizations and we gather everything in the following proposition.

Proposition 4.4. Let $G$ be a group acting outerly on the II$_1$ factor $P$ and assume $G$ is generated by two of its finite subgroups $H$ and $K$. Then the $n$th element of the tower of basic construction of the group-type subfactor $N = P^H \subset P \rtimes K = M$ is given by $M_n \cong M_{S_n}(\mathbb{C}) \otimes (P \rtimes L_n)$ where the inclusion of $M_n$ inside $M_{n+1}$ is as follows:
\[ M_n \ni E_{s,t} \otimes x \mapsto \sum_{l \in L_n} E_{s,t} \otimes E_{l,l} \otimes l^{-1}(x) \in M_{n+1} \quad \text{for all } x \in P,\ s, t \in S_n, \]
\[ M_n \ni E_{s,t} \otimes l \mapsto E_{s,t} \otimes \lambda_l \otimes e \in M_{n+1} \quad \text{for all } l \in L_n,\ s, t \in S_n, \]
and the $n$th Jones projection is:
\[ M_n \ni e_n = \begin{cases} \dfrac{1}{|L_n|} \displaystyle\sum_{l \in L_n} I_{M_{S_{n-2}}} \otimes \rho_{l^{-1}} \otimes E_{e,e} \otimes l, & \text{if } n > 1, \\[2ex] \dfrac{1}{|H|} \displaystyle\sum_{h \in H} E_{e,e} \otimes h, & \text{if } n = 1. \end{cases} \]
Proof. We use induction. The case of $n = 1$ is a little different from the rest and is proved in Lemma 4.2 and the $n = 2$ case is proved in Lemma 4.3. Suppose the statement of the above proposition holds up to a level $n$ ($> 2$). Now, the subfactor $M_{n-1} \subset M_n$ is isomorphic to:
\[ M_{S_{n-1}} \otimes P \rtimes L_{n-1} \subset M_{S_{n-1}} \otimes M_{L_{n-1}} \otimes P \rtimes L_n \]
where we identify $M_{S_{n-1}} \otimes M_{L_{n-1}}$ with $M_{S_n}$ and the inclusion is induced by identity over $M_{S_{n-1}}$ tensored with the inclusion of the subfactor $P \rtimes L_{n-1} \subset M_{L_{n-1}} \otimes P \rtimes L_n$. Using Lemma 4.3 for $K = L_{n-1}$ and $H = L_n$, it is clear that the statement of the proposition holds for level $n + 1$. □

Remark 4.5. The formula for the unique trace-preserving conditional expectation is:
\[ E^{M_n}_{M_{n-1}}(E_{s_1,s_2} \otimes E_{m_1,m_2} \otimes xl) = \delta_{l,e} \frac{1}{|L_n|} E_{s_1,s_2} \otimes m_1 x m_2^{-1} \]
where $s_1, s_2 \in S_{n-1}$, $m_1, m_2 \in L_{n-1}$, $l \in L_n$ and $x \in P$, and the unique trace on $M_n$ is given by
\[ \operatorname{tr}_{M_n}(E_{s_1,s_2} \otimes xl) = \frac{1}{|S_n|} \delta_{l,e}\, \delta_{s_1,s_2} \operatorname{tr}_M(x) \]
where $s_1, s_2 \in S_n$, $l \in L_n$ and $x \in P$.

We will now compute the higher relative commutants using the above model of the Jones tower. To this end, we need the following two lemmas, where we denote the set of automorphisms of $M \supset N$ that fix elements of $N$ pointwise by $\operatorname{Gal}(N \subset M)$.

Lemma 4.6. Let $N \subset M$ be a finite index subfactor and $\theta \in \operatorname{Gal}(N \subset M)$, then the bimodule ${}_M L^2(\theta)_M$ (where the module is $L^2(M)$ with usual left action of $M$ but right action is twisted by $\theta$) is a 1-dimensional irreducible sub-bimodule of ${}_M L^2(M_1)_M$.

Proof. Define $u_\theta : L^2(M) \to L^2(M)$ by $u_\theta(x\Omega) = \theta(x)\Omega$ for $x \in M$ where $\Omega$ is the cyclic and separating vector in the GNS construction with respect to the canonical trace tr. Note that $u_\theta(n_1 \cdot x\Omega \cdot n_2) = u_\theta(n_1 x n_2 \Omega) = n_1 \cdot \theta(x)\Omega \cdot n_2$ for $n_1, n_2 \in N$, $x \in M$. This implies $u_\theta \in N' \cap M_1$. Now define $T : {}_M L^2(\theta)_M \to {}_M L^2(M_1)_M$ by $T(x\Omega) = x u_\theta \Omega_1$ for $x \in M$. It is completely routine to check that $T$ is a well-defined $M$-$M$ linear isometry and we leave this to the reader. □

Corollary 4.7. $H = \operatorname{Gal}(P^H \subset P)$.

Proof. Clearly $H \subset \operatorname{Gal}(P^H \subset P)$. Let $\theta \in \operatorname{Gal}(P^H \subset P)$. Note that ${}_P L^2(P \rtimes H)_P \cong \bigoplus_{h \in H} {}_P L^2(h)_P$. Thus by Lemma 4.6, ${}_P L^2(\theta)_P \cong {}_P L^2(h)_P$ for some $h \in H$. This implies $\theta h^{-1} \in \operatorname{Inn}(P) \cap \operatorname{Gal}(P^H \subset P) = \{\mathrm{id}_P\}$ since $P^H \subset P$ is irreducible. Hence, $\theta \in H$. □
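Lemma 4.8. Let $N \subset M$ be an irreducible subfactor, i.e. $N' \cap M \cong \mathbb{C}$ and $\theta \in \operatorname{Aut}(M)$. For $x \in M$, the following are equivalent:
(i) $x \neq 0$ and $x\theta(y) = yx$ for all $y \in N$,
(ii) $x_0 := \frac{x}{\|x\|} \in U(M)$ and $\operatorname{Ad}_{x_0} \circ\, \theta \in \operatorname{Gal}(N \subset M)$.

Proof. (ii) $\Rightarrow$ (i) part is easy. For (i) $\Rightarrow$ (ii), note that we also have $\theta(y)x^* = x^*y$ for all $y \in N$. Thus, $x^*x \in \theta(N)' \cap M$ and $xx^* \in N' \cap M$. Since $N' \cap M \cong \mathbb{C}$, $xx^* = x^*x = \|x\|^2$. Hence, $x_0 \in U(M)$ and $x_0 \theta(y) x_0^* = y$ for all $y \in N$. This implies $\operatorname{Ad}_{x_0} \circ\, \theta \in \operatorname{Gal}(N \subset M)$. □

Proposition 4.9. For the group-type subfactor $N = P^H \subset P \rtimes K = M$, the relative commutants $N' \cap M_n$ and $M' \cap M_n$ are given by
\[ N' \cap M_n \cong \begin{cases} \mathbb{C}, & \text{if } n = -1, \\ \operatorname{span}\bigl\{ E_{s_1,s_2} \otimes l \bigm| s_1, s_2 \in S_n,\ l \in L_n,\ \mu(s_1) l \mu(s_2)^{-1} \in H \bigr\}, & \text{if } n \geq 0, \end{cases} \]
\[ M' \cap M_n \cong \begin{cases} \mathbb{C}, & \text{if } n = 0, \\ \operatorname{span}\Bigl\{ \sum_{k \in K} E_{k,kk_0} \otimes E_{t_1,t_2} \otimes l \Bigm| t_1, t_2 \in T_{n-1},\ \mu(t_1) l \mu(t_2)^{-1} \in K,\ k_0 = \mu(t_1) l \mu(t_2)^{-1} \Bigr\}, & \text{if } n \geq 1. \end{cases} \]

Proof. We compute the relative commutants in relation to the concrete model of the basic construction described in Proposition 4.4. Consider the inclusion
\[ N = P^H \ni x \mapsto \sum_{s \in S_n} E_{s,s} \otimes \mu(s)^{-1}(x) \in M_{S_n}(\mathbb{C}) \otimes (P \rtimes L_n) = M_n. \]
Let
\[ w = \sum_{s_1,s_2 \in S_n} \sum_{l \in L_n} E_{s_1,s_2} \otimes x^l_{s_1,s_2}\, l \in N' \cap M_n, \quad \text{for some } x^l_{s_1,s_2} \in P. \]
Then,
\begin{align*}
& wy = yw \quad \text{for all } y \in N \\
\Leftrightarrow\ & \sum_{s_1,s_2 \in S_n} \sum_{l \in L_n} E_{s_1,s_2} \otimes x^l_{s_1,s_2}\, l\, \mu(s_2)^{-1}(y) = \sum_{s_1,s_2 \in S_n} \sum_{l \in L_n} E_{s_1,s_2} \otimes \mu(s_1)^{-1}(y)\, x^l_{s_1,s_2}\, l \quad \text{for all } y \in N \\
\Leftrightarrow\ & x^l_{s_1,s_2} \bigl(l \mu(s_2)^{-1}\bigr)(y) = \mu(s_1)^{-1}(y)\, x^l_{s_1,s_2} \quad \text{for all } y \in N,\ s_1, s_2 \in S_n,\ l \in L_n \\
\Leftrightarrow\ & \mu(s_1)\bigl(x^l_{s_1,s_2}\bigr) \bigl(\mu(s_1) l \mu(s_2)^{-1}\bigr)(y) = y\, \mu(s_1)\bigl(x^l_{s_1,s_2}\bigr) \quad \text{for all } y \in N,\ s_1, s_2 \in S_n,\ l \in L_n.
\end{align*}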
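Now, by Lemma 4.8 and Corollary 4.7, for $s_1, s_2 \in S_n$ and $l \in L_n$,
\[ x^l_{s_1,s_2} \neq 0 \ \Leftrightarrow\ \operatorname{Ad}_{x_0} \circ\, \mu(s_1) l \mu(s_2)^{-1} \in H \quad \text{where } x_0 = \frac{\mu(s_1)(x^l_{s_1,s_2})}{\|\mu(s_1)(x^l_{s_1,s_2})\|}. \]
Moreover, $x^l_{s_1,s_2} \neq 0 \Rightarrow \operatorname{Ad}_{x_0} \in G \cap \operatorname{Inn}(P) = \{\mathrm{id}_P\}$. Since $P$ is a factor, $x_0 \in \mathbb{C}1$. Thus, $x^l_{s_1,s_2}$ will be non-zero only if $\mu(s_1) l \mu(s_2)^{-1} \in H$ and in such cases $x^l_{s_1,s_2}$ will be a scalar multiple of identity. Hence, $N' \cap M_n$ is spanned by the linearly independent set $\{E_{s_1,s_2} \otimes l : s_1, s_2 \in S_n,\ l \in L_n,\ \mu(s_1) l \mu(s_2)^{-1} \in H\}$.

For $M' \cap M_n$ where $n \geq 1$, we consider the inclusion
\[ M = P \rtimes K \supset P \ni x \mapsto \sum_{s \in S_n} E_{s,s} \otimes \mu(s)^{-1}(x) \in M_{S_n}(\mathbb{C}) \otimes (P \rtimes L_n) = M_n, \]
\[ M = P \rtimes K \supset K \ni k \mapsto \lambda_k \otimes I_{T_{n-1}} \otimes 1 \in M_{S_n}(\mathbb{C}) \otimes (P \rtimes L_n) = M_n. \]
Let
\[ w = \sum_{s_1,s_2 \in S_n} \sum_{l \in L_n} E_{s_1,s_2} \otimes x^l_{s_1,s_2}\, l \in M' \cap M_n, \quad \text{for some } x^l_{s_1,s_2} \in P. \]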
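Using arguments similar to those for the case of $N' \cap M_n$ it follows that
\begin{align*}
& wy = yw \quad \text{for all } y \in P \\
\Leftrightarrow\ & \mu(s_1)\bigl(x^l_{s_1,s_2}\bigr) \bigl(\mu(s_1) l \mu(s_2)^{-1}\bigr)(y) = y\, \mu(s_1)\bigl(x^l_{s_1,s_2}\bigr) \quad \text{for all } y \in P,\ s_1, s_2 \in S_n,\ l \in L_n,
\end{align*}
and hence $x^l_{s_1,s_2} \neq 0 \Leftrightarrow \mu(s_1) l \mu(s_2)^{-1} = e$ since $P$ is a factor; further, in such cases, $x^l_{s_1,s_2}$ is a scalar multiple of 1. Now, writing $\sum$ for the sum over $k_1, k_2 \in K$, $t_1, t_2 \in T_{n-1}$, $l \in L_n$ in the last three steps,
\begin{align*}
& wk = kw \quad \text{for all } k \in K \\
\Leftrightarrow\ & k^{-1} w k = w \quad \text{for all } k \in K \\
\Leftrightarrow\ & \bigl(\lambda_k^{-1} \otimes I_{T_{n-1}} \otimes 1\bigr) \Bigl(\sum_{s_1,s_2 \in S_n} \sum_{l \in L_n} E_{s_1,s_2} \otimes x^l_{s_1,s_2}\, l\Bigr) \bigl(\lambda_k \otimes I_{T_{n-1}} \otimes 1\bigr) = \sum_{s_1,s_2 \in S_n} \sum_{l \in L_n} E_{s_1,s_2} \otimes x^l_{s_1,s_2}\, l \quad \text{for all } k \in K \\
\Leftrightarrow\ & \sum \lambda_k^{-1} E_{k_1,k_2} \lambda_k \otimes E_{t_1,t_2} \otimes x^l_{(k_1,t_1),(k_2,t_2)}\, l = \sum E_{k_1,k_2} \otimes E_{t_1,t_2} \otimes x^l_{(k_1,t_1),(k_2,t_2)}\, l \quad \text{for all } k \in K \\
\Leftrightarrow\ & \sum E_{k^{-1}k_1,\, k^{-1}k_2} \otimes E_{t_1,t_2} \otimes x^l_{(k_1,t_1),(k_2,t_2)}\, l = \sum E_{k_1,k_2} \otimes E_{t_1,t_2} \otimes x^l_{(k_1,t_1),(k_2,t_2)}\, l \quad \text{for all } k \in K \\
\Leftrightarrow\ & x^l_{(k_1,t_1),(k_2,t_2)} = x^l_{(kk_1,t_1),(kk_2,t_2)} \quad \text{for all } k, k_1, k_2 \in K,\ t_1, t_2 \in T_{n-1},\ l \in L_n.
\end{align*}
Finally, combining the conditions that we get from considering commutation of $w$ with elements of $P$ and $K$, we can express $w$ as a linear combination of elements of the form:
\[ \sum_{k \in K} E_{kk_1,kk_2} \otimes E_{t_1,t_2} \otimes l \]
where $k_1, k_2 \in K$, $t_1, t_2 \in T_{n-1}$, $l \in L_n$ such that $k_1 \mu(t_1) l \mu(t_2)^{-1} k_2^{-1} = e$. Equivalently, $w$ can be realized as a linear combination of
\[ \sum_{k \in K} E_{kk_0^{-1},k} \otimes E_{t_1,t_2} \otimes l = \sum_{k \in K} E_{k,kk_0} \otimes E_{t_1,t_2} \otimes l \]
where $t_1, t_2 \in T_{n-1}$, $l \in L_n$ such that $\mu(t_1) l \mu(t_2)^{-1} \in K$ and $k_0 = \mu(t_1) l \mu(t_2)^{-1}$. □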
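Remark 4.10. (i) The set
\[ \bigl\{ E_{s_1,s_2} \otimes l \bigm| s_1, s_2 \in S_n,\ l \in L_n,\ \mu(s_1) l \mu(s_2)^{-1} \in H \bigr\} \]
\[ \Bigl(\text{resp. } \Bigl\{ \sum_{k \in K} E_{k,kk_0} \otimes E_{t_1,t_2} \otimes l \Bigm| t_1, t_2 \in T_{n-1},\ \mu(t_1) l \mu(t_2)^{-1} \in K,\ k_0 = \mu(t_1) l \mu(t_2)^{-1} \Bigr\}\Bigr) \]
forms a basis of $N' \cap M_n$ (resp. $M' \cap M_n$).
(ii) The unique trace-preserving conditional expectation from $N' \cap M_n$ onto $M' \cap M_n$ is given by
\[ E^{N' \cap M_n}_{M' \cap M_n}(E_{k_1,k_2} \otimes E_{t_1,t_2} \otimes l) = \delta_{k_1\mu(t_1)l,\, k_2\mu(t_2)} \frac{1}{|K|} \sum_{k \in K} E_{k,kk_1^{-1}k_2} \otimes E_{t_1,t_2} \otimes l \]
where $k_1, k_2 \in K$, $t_1, t_2 \in T_{n-1}$, $l \in L_n$ such that $k_1 \mu(t_1) l \mu(t_2)^{-1} k_2^{-1} \in H$. To see (ii), we need to check that for $t_1', t_2' \in T_{n-1}$ and $k_0 = \mu(t_1') l' \mu(t_2')^{-1} \in K$,
\begin{align*}
& \operatorname{tr}\Bigl( \Bigl(\sum_{k' \in K} E_{k',k'k_0} \otimes E_{t_1',t_2'} \otimes l'\Bigr) (E_{k_1,k_2} \otimes E_{t_1,t_2} \otimes l) \Bigr) \\
&= \delta_{l'l,e}\, \delta_{t_2',t_1}\, \delta_{t_1',t_2}\, \delta_{k_2k_0,k_1} \frac{1}{|S_n|} \\
&= \delta_{k_1\mu(t_1)l,\, k_2\mu(t_2)}\, \delta_{l'l,e}\, \delta_{t_1',t_2}\, \delta_{t_2',t_1}\, \delta_{e,\, k_0k_1^{-1}k_2} \frac{1}{|S_n|} \\
&= \operatorname{tr}\Bigl( \Bigl(\sum_{k' \in K} E_{k',k'k_0} \otimes E_{t_1',t_2'} \otimes l'\Bigr)\, \delta_{k_1\mu(t_1)l,\, k_2\mu(t_2)} \frac{1}{|K|} \sum_{k \in K} E_{k,kk_1^{-1}k_2} \otimes E_{t_1,t_2} \otimes l \Bigr)
\end{align*}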
which is a routine computation using the second part of Remark 4.5. 5. The planar algebra of group-type subfactors In this section, our main goal is to show that the planar algebra defined in Section 3, is isomorphic to the one arising from the group-type subfactor P H ⊂ P K. Conversely, we will show that any subfactor whose standard invariant is given by such planar algebra, is indeed of this type. We will use the following well-known fact regarding isomorphism of two planar algebras and Theorem 4.2.1 of [11] describing the planar algebra arising from an extremal finite index subfactor. Fact. Let P 1 and P 2 be two planar algebras. Then P 1 ∼ = P 2 if and only if there exists a vector 1 2 space isomorphism ψ : P → P such that: (i) ψ preserves the filtered algebra structure, (ii) ψ preserves the actions of all possible Jones projection tangles and the (two types of) conditional expectation tangles. If P1 and P2 are ∗-planar algebras, then we consider such ψ that are ∗-preserving to be a ∗-planar algebra isomorphism. Let us denote the planar algebra in Section 3 by P BH and the one arising from the group-type subfactor N = P H ⊂ P K = M by P N ⊂M . Theorem 5.1. P N ⊂M ∼ = P BH . Proof. By Theorem 4.2.1 of [11] we have that PnN ⊂M = N ∩ Mn where the nth Jones projection N ∩M
N ⊂M is δEN ∩Mn+1 tangle is given by δen , the conditional expectation tangle from PnN ⊂M onto Pn−1 n N ∩M
N ⊂M and the conditional expectation tangle from PnN ⊂M onto P1,n is δEM ∩Mn+1 . n+1 Define the map
ψ : P N ⊂M → P BH ∪
∪
by ψn (Es1 ,s2 ⊗ l) = (s1 , l, s˜2 , h)
ψn : PnN ⊂M → PnBH where n 0, s1 , s2 ∈ Sn , l ∈ Ln such that μ(s1 )lμ(s2 )−1 ∈ H and μ(s1 )lμ(s2 )−1 h = e.
Clearly, ψ is a vector space isomorphism by definition. In order to check that ψ is a filtered ∗-algebra isomorphism, we use Remark 3.3(i), (ii), (iii) and (iv). For instance, to show that ψn is an algebra homomorphism, we need to show ψn (Es1 ,s2 ⊗ l1 · Es3 ,s4 ⊗ l2 ) = ψn (Es1 ,s2 ⊗ l1 ) · ψn (Es3 ,s4 ⊗ l2 )
⇐ δs2 ,s3 s1 , l1 l2 , s˜4 , μ(s4 )l2−1 l1−1 μ(s1 )−1
= s1 , l1 , s˜2 , μ(s2 )l1−1 μ(s1 )−1 · s3 , l2 , s˜4 , μ(s4 )l2−1 μ(s3 )−1 which indeed holds by Remark 3.3(iii). Now, it remains to be shown that ψ preserves the action of Jones projection tangles and the two types of conditional expectation tangles. For this, we use Remark 3.3(v), (vi) and (vii). Proof of each of the three kinds of tangles is completely routine; however, we will discuss the action of conditional expectation tangle in details. Let us consider the conditional expectation tangle from n + 1 (color of the internal disc) to color n (color of the external disc) applied to the element N ⊂M Es1 ,s2 ⊗ Em1 ,m2 ⊗ l ∈ Pn+1 = N ∩ Mn ; by Jones’s theorem (4.2.1 of [11]) and Remark 4.5, the output should be
|Ln | Es ,s ⊗ m1 m−1 2 |Ln+1| 1 2 |Ln | ψn s1 , m1 m−1 −→ δl,e , s˜2 , μ(s2 )m2 m−1 μ(s1 )−1 . 2 1 |Ln+1|
δl,e
On the other hand, by Remark 3.3(vi), the conditional expectation tangle applied to −1 −1 ψn+1 (Es1 ,s2 ⊗ Em1 ,m2 ⊗ l) = (s1 , m1 , l −1 , m−1 2 , s˜2 , μ(s2 )m2 lm1 μ(s1 ) ) is given by & −1 −1 n| δl,e |L|Ln+1| (s1 , m1 m−1 2 , s˜2 , μ(s2 )m2 m1 μ(s1 ) ). This completes the proof.
2
Corollary 5.2. Given any countable discrete group G generated by two of its finite subgroups, there exists a hyperfinite subfactor with standard invariant described by P BH . Moreover, P BH is a spherical C ∗ planar algebra. Proof. The proof follows from the fact that any countable discrete group G has an outer action on the hyperfinite II1 factor. 2 Theorem 5.3. Given a subfactor N ⊂ M with standard invariant isomorphic to P BH , there exists an intermediate subfactor N ⊂ P ⊂ M and outer actions of H and K on P such that (N ⊂ M) (P H ⊂ P K). Proof. Let P N ⊂M denote the planar algebra of N ⊂ M formed by its relative commutants and φ : P N ⊂M → P BH be a planar algebra isomorphism. Consider the element q = φ −1 (e, e, e, e) ∈ N ∩ M1 . Clearly, (i) q is a projection, (ii) qe1 = e1 , and (iii) EM (q) = |K|−1 1. Using action of tangles and the planar algebra isomorphism φ, it also follows that
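(iv) [tangle identity shown as a diagram in the original; the figure is not recoverable from the extracted text].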
The conditions (i)–(iv) assert that q is an intermediate subfactor projection as described in [2]. Define P = M ∩ {q} . To show P is a factor, first note that
P ∩ P ⊂ N ∩ P = N ∩ M ∩ {q} = φ P1BH ∩ (e, e, e, e) . So, it is enough to show that P1BH ∩ {(e, e, e, e)} = C1. If x = {(e, e, e, e)} , then (e, e, e, e) · x = x · (e, e, e, e)
⇒
g∈K∩H (g, g
−1 )
∈ P1BH ∩
λg e, e, g, g −1 = λg g, e, e, g −1 .
g∈K∩H
g∈K∩H
Hence, λg = δg,e λe for all g ∈ K ∩ H and x ∈ C1. It remains to establish that N (resp. M) is the fixed-point subalgebra (resp. crossed-product algebra) of P with respect to an outer action of the group H (resp. K). It is easy to prove (see [11]) that if the standard invariant of a subfactor N˜ ⊂ M˜ is given by the planar algebra corresponding to the fixed-point subfactor (resp. crossed-product subfactor) with respect to a ˜ then there exists an outer action of G ˜ on M˜ (resp. N˜ ) such that N˜ (resp. M) ˜ finite group G, ˜ is isomorphic to fixed-point subalgebra (crossed-product algebra) of the action. Again, if N ⊂ P˜ ⊂ M˜ is an intermediate subfactor and q˜ is its corresponding intermediate subfactor projection, ˜ is given by the range of the idempotent tangle then the planar algebra of N˜ ⊂ P˜ (resp. P˜ ⊂ M)
[idempotent tangle diagram] or [idempotent tangle diagram] (resp. [idempotent tangle diagram] or [idempotent tangle diagram]) according as $n$ is even or odd ([6], see also [1]).
Getting back to our context, to get the planar algebra of $N \subset P$ (resp. $P \subset M$), the elements in the image of the above idempotent tangles are given by the words with letters coming from $K$ and $H$ alternately where every element coming from $K$ (respectively $H$) must necessarily be $e$. Such a planar algebra is the same as $P^{BH}$ with $K = \{e\}$ (resp. $H = \{e\}$); by Theorem 5.1 this is indeed the planar algebra corresponding to fixed-point subfactor (resp. crossed-product subfactor) with respect to $H$ (resp. $K$). □

References
[1] B. Bhattacharyya, Z. Landau, Intermediate standard invariants and intermediate planar algebras, preprint.
[2] D. Bisch, A note on intermediate subfactors, Pacific J. Math. 163 (2) (1994) 201–216.
[3] D. Bisch, Bimodules, higher relative commutants and the fusion algebra associated to a subfactor, in: Operator Algebras and Their Applications, in: Fields Inst. Commun., vol. 13, Amer. Math. Soc., Providence, RI, 1997, pp. 13–63.
[4] D. Bisch, U. Haagerup, Composition of subfactors: New examples of infinite depth subfactors, Ann. Sci. École Norm. Sup. 29 (1996) 329–383.
[5] D. Bisch, V. Jones, Algebras associated to intermediate subfactors, Invent. Math. 128 (1) (1997) 89–157.
[6] D. Bisch, V. Jones, The free product of planar algebras, and subfactors, in preparation.
[7] D. Bisch, S. Popa, Examples of subfactors with Property (T) standard invariant, Geom. Funct. Anal. 9 (1999) 215–225.
[8] D. Evans, Y. Kawahigashi, Quantum Symmetries on Operator Algebras, Oxford Math. Monogr., Oxford Science Publications, Oxford University Press, New York, 1998, xvi+829 pp.
[9] F. Goodman, P. de la Harpe, V. Jones, Coxeter Graphs and Towers of Algebras, Math. Sci. Res. Inst. Publ., vol. 14, Springer-Verlag, New York, 1989, x+288 pp.
[10] V. Jones, Index for subfactors, Invent. Math. 72 (1983) 1–25.
[11] V. Jones, Planar algebras, arxiv:math/9909027v1.
[12] V. Jones, Two subfactors and the algebraic decomposition of bimodules over II1 factors, preprint.
[13] V. Jones, V.S. Sunder, Introduction to Subfactors, London Math. Soc. Lecture Note Ser., vol. 234, ISBN 0-521-58420-5, 1997.
[14] A. Ocneanu, Quantized groups, string algebras and Galois theory for algebras, in: Operator Algebras and Applications, vol. 2, in: London Math. Soc. Lecture Note Ser., vol. 136, Cambridge Univ. Press, Cambridge, 1988, pp. 119–172.
[15] M. Pimsner, S. Popa, Iterating the basic construction, Trans. Amer. Math. Soc. 310 (1) (1988) 127–133.
[16] S. Popa, Classification of amenable subfactors of type II, Acta Math. 172 (1994) 163–255.
[17] S. Popa, An axiomatization of the lattice of higher relative commutants of a subfactor, Invent. Math. 120 (1995) 427–445.
[18] S. Popa, Some properties of the symmetric enveloping algebra of a subfactor, with applications to amenability and property T, Doc. Math. 4 (1999) 665–744.
[19] J.-M. Vallin, Relative matched pairs of finite groups from depth two inclusions of von Neumann algebras to quantum groupoids, J. Funct. Anal. 254 (2) (2008) 2040–2068.
Journal of Functional Analysis 257 (2009) 47–87 www.elsevier.com/locate/jfa
Noncommutative ball maps J. William Helton a,1 , Igor Klep b,c,2 , Scott McCullough d,∗,3 , Nick Slinglend a a Department of Mathematics, University of California, San Diego, United States b Univerza v Ljubljani, Fakulteta za matematiko in fiziko, Slovenia c Univerza v Mariboru, Fakulteta za naravoslovje in matematiko, Slovenia d Department of Mathematics, University of Florida, Gainesville, FL, United States
Received 8 October 2008; accepted 10 March 2009
Communicated by N. Kalton
Abstract In this paper, we analyze problems involving matrix variables for which we use a noncommutative algebra setting. To be more specific, we use a class of functions (called NC analytic functions) defined by power series in noncommuting variables and evaluate these functions on sets of matrices of all dimensions; we call such situations dimension-free. These types of functions have recently been used in the study of dimension-free linear system engineering problems. In this paper we characterize NC analytic maps that send dimension-free matrix balls to dimension-free matrix balls and carry the boundary to the boundary; such maps we call “NC ball maps”. We find that up to normalization, an NC ball map is the direct sum of the identity map with an NC analytic map of the ball into the ball. That is, “NC ball maps” are very simple, in contrast to the classical result of D’Angelo on such analytic maps in C. Another mathematically natural class of maps carries a variant of the noncommutative distinguished boundary to the boundary, but on these our results are limited. We shall be interested in several types of noncommutative balls, conventional ones, but also balls defined by constraints called Linear Matrix Inequalities (LMI). What we do here is a small piece of the bigger puzzle of understanding how LMIs behave with respect to noncommutative change of variables. © 2009 Elsevier Inc. All rights reserved.
* Corresponding author.
E-mail addresses:
[email protected] (J.W. Helton),
[email protected] (I. Klep),
[email protected] (S. McCullough),
[email protected] (N. Slinglend). 1 Research supported by NSF grants DMS-0700758, DMS-0757212, and the Ford Motor Co. 2 Supported by the Slovenian Research Agency (project No. Z1-9570-0101-06). 3 Research supported by the NSF grant DMS-0758306. 0022-1236/$ – see front matter © 2009 Elsevier Inc. All rights reserved. doi:10.1016/j.jfa.2009.03.008
J.W. Helton et al. / Journal of Functional Analysis 257 (2009) 47–87
Keywords: Noncommutative analytic function; Complete isometry; Ball map; Linear matrix inequality
1. Introduction In the introduction we will state some of our main results. For this we need to start with the definitions of NC polynomials (Section 1.1) and NC analytic maps (Section 1.2). We then proceed to define NC ball maps in Section 1.3, where we explain what it means for an NC ball map to map ball to ball with boundary to boundary. After that we can and do state our main results classifying NC ball maps in Sections 1.3 and 1.4. Finally, the introduction concludes by considering two types of generalizations, the first being to balls defined by LMIs, the second being to NC analytic maps carrying special sets on the boundary of a ball to the boundary of a ball. Before continuing with the more detailed introduction, we pause to offer some perspective and mention related significant contributions. Ball maps form a distinguished subset of the space of noncommutative analytic functions on a noncommutative domain and we direct the interested reader to the elegant general theory of noncommutative analytic functions developed in the articles [12–14,29,30,16,17] for a more complete account than found here and to the work of Popescu on free analytic functions. Some of these references are [22,27,23,24,26]. Noncommutative rational, Schur class and analytic functions are also intimately related to systems theory. A small sample of the references includes [2,3,10,12,18]. The noncommutative balls that we consider are modeled on g × g matrices, and in the special case that g = 1, they correspond to those studied by Popescu for operator, not just matrix, coefficients. Precomposition by an automorphism of the domain preserves ball maps and such automorphisms are studied at various levels of generality in [28,17]. Linear ball maps are an important special case, and these are identified with completely isometric mappings from one operator space into another. 
The books [19,20,6] provide comprehensive introductions to the theory of operator systems, spaces, and algebras, and the papers [5] and [4] treat very generally complete isometries into a C-star algebra.

1.1. Words and NC polynomials

Let $g', g \in \mathbb{N}$. We write $\langle x \rangle$ for the monoid freely generated by $x$, i.e., $\langle x \rangle$ consists of words in the $g'g$ letters $x_{11}, \ldots, x_{1g}, x_{21}, \ldots, x_{g'g}$ (including the empty word $\emptyset$ which plays the role of the identity 1). Let $\mathbb{C}\langle x \rangle$ denote the associative $\mathbb{C}$-algebra freely generated by $x$, i.e., the elements of $\mathbb{C}\langle x \rangle$ are polynomials in the noncommuting variables $x$ with coefficients in $\mathbb{C}$. Its elements are called NC polynomials. An element of the form $aw$ where $0 \neq a \in \mathbb{C}$ and $w \in \langle x \rangle$ is called a monomial and $a$ its coefficient. Hence words are monomials whose coefficient is 1. Let $x^* = (x_{11}^*, \ldots, x_{g'g}^*)$ denote another set of $g'g$ symbols. We shall also consider the free algebra $\mathbb{C}\langle x, x^* \rangle$ that comes equipped with the natural involution $x_{ij} \mapsto x_{ij}^*$. For example,
\[ \bigl(1 + i\, x_{11} x_{23}^* x_{34}^2\bigr)^* = 1 - i\, x_{34}^{*2} x_{23} x_{11}^*. \]
(Here $i$ denotes the imaginary unit $\sqrt{-1}$.)
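1.1.1. NC matrix polynomials

A matrix-valued NC polynomial is an NC polynomial with matrix coefficients. We shall use the phrase scalar NC polynomial if we want to emphasize the absence of matrix constructions. Often when the context makes the usage clear we drop adjectives such as scalar, 1 × 1, matrix polynomial, matrix of polynomials and the like.

1.1.2. Polynomial evaluations

If $p$ is an NC polynomial in $x$ and $X \in (\mathbb{C}^{n \times n})^{g' \times g}$, the evaluation $p(X)$ is defined by simply replacing $x_{ij}$ by $X_{ij}$. For example, if $p(x) = A x_{11} x_{21}$, where
\[ A = \begin{bmatrix} -4 & 3 & 2 \\ 2 & -1 & 0 \end{bmatrix}, \]
then
\[ p\left( \begin{bmatrix} 0 & 1 \\ 1 & 0 \end{bmatrix}, \begin{bmatrix} 1 & 0 \\ 0 & -1 \end{bmatrix} \right) = A \otimes \left( \begin{bmatrix} 0 & 1 \\ 1 & 0 \end{bmatrix} \begin{bmatrix} 1 & 0 \\ 0 & -1 \end{bmatrix} \right) = \begin{bmatrix} 0 & 4 & 0 & -3 & 0 & -2 \\ -4 & 0 & 3 & 0 & 2 & 0 \\ 0 & -2 & 0 & 1 & 0 & 0 \\ 2 & 0 & -1 & 0 & 0 & 0 \end{bmatrix}. \]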
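The evaluation rule above can be reproduced with a Kronecker product. The following is a minimal sketch using NumPy; the function name `evaluate_monomial` and its argument layout are illustrative choices, not notation from the paper. It evaluates a single matrix-coefficient monomial $A\,w$ at a tuple of $n \times n$ matrices as $A \otimes w(X)$.

```python
import numpy as np

def evaluate_monomial(coeff, factors, n):
    """Evaluate coeff * (word) at matrices: returns coeff (x) (product of factors).

    `factors` lists the n-by-n matrices substituted for the letters of the
    word, in order; an empty list corresponds to the empty word, i.e. I_n.
    """
    word_value = np.eye(n, dtype=coeff.dtype)
    for F in factors:
        word_value = word_value @ F
    return np.kron(coeff, word_value)

A = np.array([[-4, 3, 2],
              [2, -1, 0]])
X11 = np.array([[0, 1],
                [1, 0]])
X21 = np.array([[1, 0],
                [0, -1]])

# p(x) = A x11 x21 evaluated at (X11, X21)
p_X = evaluate_monomial(A, [X11, X21], n=2)
print(p_X)

# The constant polynomial p(x) = A evaluates to A (x) I_n.
const_X = evaluate_monomial(A, [], n=2)
```

With the matrices of the worked example, `p_X` is the 4 × 6 matrix displayed above, and `const_X` equals `np.kron(A, np.eye(2))`.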
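On the other hand, if $p(x) = A$ and $X \in (\mathbb{C}^{n \times n})^{g' \times g}$, then $p(X) = A \otimes I_n$. The tensor product in the expressions above is the usual (Kronecker) tensor product of matrices. Thus we have reserved the tensor product notation for the tensor product of matrices and have eschewed the strong temptation of using $A \otimes x_k$ in place of $A x_k$ when $x_k$ is one of the noncommuting variables.

1.2. Definition of NC analytic functions

An elegant theory of noncommutative analytic functions is developed in the articles [12–14,29,30]; see also [22]. What we need in this article are specializations of definitions of these papers. In this section we summarize the definitions and properties needed in the sequel. For $d', d \in \mathbb{N}$ define
\[ B_{d' \times d} := \bigcup_{n=1}^{\infty} \bigl\{ X \in (\mathbb{C}^{n \times n})^{d' \times d} \bigm| I_{dn} - X^*X \succeq 0 \bigr\}, \tag{1.1} \]
\[ \operatorname{int} B_{d' \times d} := \bigcup_{n=1}^{\infty} \bigl\{ X \in (\mathbb{C}^{n \times n})^{d' \times d} \bigm| I_{dn} - X^*X \succ 0 \bigr\}, \tag{1.2} \]
\[ \partial B_{d' \times d} := \bigcup_{n=1}^{\infty} \bigl\{ X \in (\mathbb{C}^{n \times n})^{d' \times d} \bigm| \|X\| = 1 \bigr\}, \tag{1.3} \]
\[ M_{d' \times d} := \bigcup_{n=1}^{\infty} (\mathbb{C}^{n \times n})^{d' \times d}. \tag{1.4} \]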
We shall occasionally use the notation
\[ B_{d' \times d}(N) = \bigl\{ X = [X_{j,\ell}]_{j,\ell=1}^{d',d} \bigm| X_{j,\ell} \in \mathbb{C}^{N \times N},\ \|X\| \leq 1 \bigr\}, \]
\[ M_{d' \times d}(N) = \bigl\{ X = [X_{j,\ell}]_{j,\ell=1}^{d',d} \bigm| X_{j,\ell} \in \mathbb{C}^{N \times N} \bigr\}. \]
Given $g', g \in \mathbb{N}$, the noncommutative (NC) $\varepsilon$-neighborhood of 0 in $\mathbb{C}^{g' \times g}$ is the (disjoint) union $\bigcup_{N \in \mathbb{N}} \{X \in M_{g' \times g}(N) \mid \|X\| < \varepsilon\}$. An open NC domain $D$ containing 0 (in its interior) is a union $\bigcup_N D_N$ of open sets $D_N \subseteq M_{g' \times g}(N)$ which is closed with respect to direct sums and such that there is an $\varepsilon > 0$ such that $D$ contains the NC $\varepsilon$-neighborhood of 0. A $d' \times d$ NC analytic function $f$ on an open NC domain $D$ containing 0 is defined as follows:
(1) $f$ has an NC power series, for which there exists an NC $\varepsilon > 0$ neighborhood of 0 on which it is convergent. That is,
\[ f = \sum_{w \in \langle x \rangle} a_w w \tag{1.5} \]
for $a_w \in \mathbb{C}^{d' \times d}$ and for every $N \in \mathbb{N}$ and every $g' \times g$-tuple of square matrices $X \in B_{g' \times g}$ with $\|X\| < \varepsilon$ the series
\[ f(X) = \sum_{w \in \langle x \rangle} a_w \otimes w(X) \tag{1.6} \]
converges. We interpret convergence for a given $X$ as conditional convergence of the series
\[ \sum_{\alpha=0}^{\infty} \sum_{|w| = \alpha} a_w \otimes w(X). \]
Thus the order of summation is over the homogeneous parts of the power series expansion. Thus with $f^{(\alpha)}$ equal to the $\alpha$ homogeneous part in the NC power series expansion of $f$, the series converges for a given $X$ provided
\[ \sum_{\alpha=0}^{\infty} f^{(\alpha)}(X) \tag{1.7} \]
converges. Since both $a_w$ and $w(X)$ are matrices, the particular norm topology chosen has no influence on convergence. The radius $\varepsilon$ of this ball of convergence (or sometimes, by abuse of notation, the ball itself) will be called the series radius.
(2) If $a : W \to D$ is a matrix-valued function analytic on a domain $W$ in $\mathbb{C}^N$, the composition $f \circ a$ is a matrix-valued analytic function on $W$ and continuous to $\partial W$.

Remark 1.1. Popescu [22] has a notion of free analytic function in $g'$ variables (that is, $g = 1$) based upon power series expansions like that in (1.7). His definition allows for operator coefficients, but on the other hand requires convergence of the NC power series on all of $\operatorname{int} B_{g'}$ (the
NC 1-neighborhood of 0 in $\mathbb{C}^{g'}$). It turns out that for bounded NC analytic functions with matrix coefficients the two notions are the same, see Lemma 6.1. Here we have avoided extending the theory of free analytic functions to $B_{g' \times g}$, looking forward to working on more general domains in the future.

1.2.1. Properties of NC analytic functions

Proposition 1.2. Let $D$ be an NC domain containing 0.
(i) The sum of two d × d NC analytic functions on $D$ is a d × d NC analytic function on $D$.
(ii) The product of two d × d NC analytic functions on $D$ is a d × d NC analytic function on $D$.
(iii) The composition of two NC analytic functions is an NC analytic function. More precisely, if $f : D \to D'$ is a d1 × d1 NC analytic function, where $D'$ is an NC domain with $0 \in D'$, and $h$ is a d2 × d2 NC analytic function on $D'$, then $h \circ f$ is a d1 d2 × d1 d2 NC analytic function on $D$.

Proof. Properties (i) and (ii) are standard and we only consider (iii). The fact that $h \circ f$ admits an NC power series as in (1.5) was observed e.g. in [14,29]. The composition property (2) of Section 1.2 is easily checked. □

More is said about properties of NC analytic functions in Section 6.

1.3. NC ball maps f and their classification when f(0) = 0

A function $f : \operatorname{int} B_{g' \times g} \to M_{d' \times d}$ which is NC analytic will often be called an NC analytic function on the ball $B_{g' \times g}$ and denoted $f : B_{g' \times g} \to M_{d' \times d}$. An NC analytic function $f : B_{g' \times g} \to B_{d' \times d}$ mapping the boundary to the boundary is called an NC ball map. The notion of $f$ mapping boundary to boundary is a bit complicated (because of convergence issues) so requires explanation. For a given $X \in B_{g' \times g}(N)$, define the function $f_X : \mathbb{D} \to M_{d' \times d}(N)$ by $z \mapsto f(zX)$. (Here $\mathbb{D}$ denotes the unit disc $\mathbb{D} = \{z \in \mathbb{C} \mid |z| < 1\}$ in the complex plane.) If
\[ \lim_{r \nearrow 1} f_X(re^{it}) \]
exists, denote that limit by $f(e^{it}X)$. The function $f$ maps the boundary to the boundary if whenever $\|X\| = 1$ and $f(e^{it}X)$ exists, then $\|f(e^{it}X)\| = 1$. Since $f$ is bounded, Fatou's theorem implies that for each $X \in B_{g' \times g}$ the limit $f_X(e^{it}) = f(e^{it}X)$ exists for almost every $t$. If $f$ is an NC ball map, $X \in \partial B_{g' \times g}$ and $f(X)$ is defined, then a (nonzero) vector $\gamma$ such that $\|f(X)\gamma\| = \|\gamma\|$ is called a binding vector and this property binding.
J.W. Helton et al. / Journal of Functional Analysis 257 (2009) 47–87
Our main result on NC ball maps which map 0 to 0 is:

Theorem 1.3. Let $h : \mathcal{B}_{g'\times g} \to \mathcal{B}_{d'\times d}$ be an NC ball map with $h(0) = 0$. Then there exist unitaries $U : \mathbb{C}^{d} \to \mathbb{C}^{d}$ and $V : \mathbb{C}^{d'} \to \mathbb{C}^{d'}$ such that

$$h(x) = V \begin{pmatrix} x & 0 \\ 0 & \tilde h(x) \end{pmatrix} U^*, \tag{1.8}$$

where $\tilde h : \mathcal{B}_{g'\times g} \to \mathcal{B}_{(d'-g')\times(d-g)}$ is an NC analytic contraction-valued map with $\tilde h(0) = 0$. Conversely, every NC analytic $h$ satisfying (1.8) for unitaries $U, V$ and an NC analytic contraction-valued map $\tilde h$ fixing the origin is an NC ball map $\mathcal{B}_{g'\times g} \to \mathcal{B}_{d'\times d}$ sending 0 to 0.

The proof of the theorem is completed in Section 4. The general result is built from the linear version of the theorem, Theorem 3.3, which appears as Corollary 3.6 in [5]. See also [4].

As an illustration of Theorem 1.3 we describe a special case. For convenience, we adopt the notation $\mathcal{B}_{g'}$ for $\mathcal{B}_{g'\times 1}$.

Corollary 1.4. If $h : \mathcal{B}_{g'} \to \mathcal{B}_{d'}$ is an NC ball map with $h(0) = 0$, then $h$ is linear and there is a unique isometry $M \in \mathbb{C}^{d'\times g'}$ such that $h = Mx$. In particular, if $d' < g'$ then no such NC ball maps exist.

Proof. When $h$ maps $\mathcal{B}_{g'}$ to $\mathcal{B}_{d'}$ the $\tilde h(x)$ column is gone. Moreover,

$$M = V^* \begin{pmatrix} I \\ 0 \end{pmatrix}$$

is an isometry. □
1.4. NC ball maps f when f(0) is not necessarily 0

In the previous section we treated NC ball maps $f$ with $f(0) = 0$, an assumption we drop in this section. The strategy is to compose $f$ with a bianalytic automorphism of an NC ball to reduce the problem to the $f(0) = 0$ setting. Section 1.4.1 contains information on bianalytic mappings on an NC ball, while the main results appear in Section 1.4.2.

1.4.1. Linear fractional transformations

For a given $d'\times d$ scalar matrix $v$ with $\|v\| < 1$, define $F_v : \mathcal{B}_{d'\times d} \to \mathcal{B}_{d'\times d}$ by

$$F_v(u) := v - (I_{d'} - vv^*)^{1/2}\, u\, (I_d - v^*u)^{-1} (I_d - v^*v)^{1/2}. \tag{1.9}$$
Of course it must be shown that $F_v$ actually takes values in $\mathcal{B}_{d'\times d}$. This is done in Lemma 1.6 below. Linear fractional transformations such as $F_v$ are common in circuit and system theory, since they are associated with energy conserving pieces of a circuit (cf. [31]).

Lemma 1.5. Suppose $\mathcal{D}$ is an open NC domain containing 0. If $u : \mathcal{D} \to \mathcal{B}_{d'\times d}$ is NC analytic, then $F_v(u(x))$ is an NC analytic function (in $x$) on $\mathcal{D}$.

Proof. See Section 5. □
Notice that if $d' = d = 1$, then $v$ is a scalar and $u$ is a scalar NC analytic function, hence

$$F_v(u) = (v - u)(1 - u\bar v)^{-1} = (1 - u\bar v)^{-1}(v - u).$$

Now fix $v \in \mathbb{D}$ and consider the map $\mathbb{D} \to \mathbb{C}$, $u \mapsto F_v(u)$. This map is a linear fractional map that maps the unit disc to the unit disc, maps the unit circle to the unit circle, and maps $v$ to 0. The geometric interpretation of the map in NC variables in (1.9) is similar. Suppose we fix $N \in \mathbb{N}$ and $V \in \mathcal{B}_{d'\times d}(N)$ with $\|V\| < 1$ and consider the map

$$U \mapsto F_V(U). \tag{1.10}$$
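The scalar statements above (the case $d' = d = 1$, $N = 1$) are easy to check numerically. The following is a minimal sketch, not part of the paper: the function name `linear_fractional` and the sample points are illustrative choices, and the checks only confirm the stated properties at a few points rather than proving them.

```python
import cmath
import math


def linear_fractional(v: complex, u: complex) -> complex:
    """Scalar case of (1.9): F_v(u) = (v - u) / (1 - conj(v) * u)."""
    return (v - u) / (1 - v.conjugate() * u)


v = 0.3 + 0.4j   # |v| = 0.5 < 1
u = -0.2 + 0.6j  # a sample point of the open unit disc

# F_v maps v to 0 and 0 to v.
assert abs(linear_fractional(v, v)) < 1e-12
assert abs(linear_fractional(v, 0j) - v) < 1e-12

# F_v is its own inverse: F_v(F_v(u)) = u.
assert abs(linear_fractional(v, linear_fractional(v, u)) - u) < 1e-12

# Points of the disc stay in the disc; points of the circle stay on the circle.
assert abs(linear_fractional(v, u)) < 1
for k in range(8):
    boundary_point = cmath.exp(2j * math.pi * k / 8)
    assert abs(abs(linear_fractional(v, boundary_point)) - 1) < 1e-12
```

The matrix case of (1.9) needs matrix square roots and inverses, so it is left out of this sketch.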
The first part of Lemma 1.6 tells us that the map defined in (1.10) maps the unit ball of $d'\times d$-tuples of $N\times N$ matrices to the unit ball of $d'\times d$-tuples of $N\times N$ matrices, carrying the boundary to the boundary. The third part of Lemma 1.6 tells us that $F_V(V) = 0$; that is, the map given in (1.10) takes $V$ to 0.

Lemma 1.6. Suppose that $N \in \mathbb{N}$ and $V \in \mathcal{B}_{d'\times d}(N)$ with $\|V\| < 1$.
(1) $U \mapsto F_V(U)$ maps $\mathcal{B}_{d'\times d}(N)$ into itself with boundary to the boundary.
(2) If $U \in \mathcal{B}_{d'\times d}(N)$, then $F_V(F_V(U)) = U$.
(3) $F_V(V) = 0$ and $F_V(0) = V$.

Proof. See Section 5. □
1.4.2. Classification of NC ball maps

General NC ball maps, those where $f(0)$ is not necessarily 0, are described using the linear fractional transformation $F$.

Theorem 1.7. Let $f : \mathcal{B}_{g'\times g} \to \mathcal{B}_{d'\times d}$ be an NC ball map with $f(0) \notin \partial\mathcal{B}_{d'\times d}$. Then

$$f(x) = F_{f(0)}\big(\varphi(x)\big), \tag{1.11}$$

where

$$\varphi(x) = F_{f(0)}\big(f(x)\big) = V \begin{pmatrix} x & 0 \\ 0 & \tilde\varphi(x) \end{pmatrix} U^* \tag{1.12}$$

for some unitaries $U : \mathbb{C}^{d} \to \mathbb{C}^{d}$ and $V : \mathbb{C}^{d'} \to \mathbb{C}^{d'}$ and an NC analytic contraction-valued map $\tilde\varphi$ with $\|\tilde\varphi(0)\| < 1$. Conversely, every NC analytic $f$ satisfying (1.11) and (1.12) for unitaries $U, V$ and $\tilde\varphi$ as above is an NC ball map $f : \mathcal{B}_{g'\times g} \to \mathcal{B}_{d'\times d}$ with $f(0) \notin \partial\mathcal{B}_{d'\times d}$.

Proof. Define $\varphi(x) := F_{f(0)}(f(x))$. Then $\varphi(0) = 0$ by Lemma 1.6(3). By Lemma 1.5, $\varphi$ is an NC analytic map, and by Lemma 1.6(1) it is contraction-valued and carries the boundary to the boundary. Hence it is an NC ball map sending 0 to 0 and is thus classified by Theorem 1.3. Moreover, Eq. (1.11) is implied by Lemma 1.6(2). The converse easily follows from Lemmas 1.5 and 1.6. □

The results of Sections 1.3 and 1.4 are treated in Part I of this paper.
1.5. More generality

In this subsection we extend the main results presented so far in two directions. The first concerns LMIs. Our interest will be in properties of the set of all solutions to a given LMI. In Section 1.5.2 we will define what we mean by an LMI, then show that the set of solutions to a "monic" LMI equals a general type of matrix ball we call a pencil ball. Ultimately we would like to study maps from pencil balls to pencil balls, and this paper is a beginning which handles the special case where the domain pencil ball is the ordinary NC ball $\mathcal{B}_{g'\times g}$ (see Corollary 1.10). Eventually we hope to understand which NC analytic changes of variables take one LMI to another. Work is in progress on such problems.

In the next generalization we do not have applications in mind, but do something that is mathematically natural. A basic notion in several complex variables is the Shilov or distinguished boundary. A natural problem is to classify NC analytic functions mapping the ball to the ball and carrying the distinguished boundary to the boundary. Classification of linear maps of this type proves to be an interesting challenge, tackled in Sections 7 and 9. For NC analytic maps we introduce the semi-distinguished boundary (a set larger than the distinguished boundary) and study the NC analytic functions mapping the semi-distinguished boundary to the boundary. All of this we do only for balls of vectors, rather than balls of matrices, that is, for $g = 1$.

1.5.1. Linear pencils

Let

$$L(x) := A_{11}x_{11} + \cdots + A_{g'g}x_{g'g} \tag{1.13}$$
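denote an NC analytic truly linear pencil in $x$. If the matrices $A_{ij}$ that are used to define it are in $\mathbb{C}^{d'\times d}$, then $L(x)$ is called a $d'\times d$ linear pencil. As an example, for $g' = 2$ and $g = 1$, with

$$A_{11} = \begin{pmatrix} 1 & 2 \\ 3 & 4 \end{pmatrix}, \qquad A_{21} = \begin{pmatrix} 0 & 1 \\ -1 & 0 \end{pmatrix},$$

the linear pencil is

$$L(x) = \begin{pmatrix} x_{11} & 2x_{11} + x_{21} \\ 3x_{11} - x_{21} & 4x_{11} \end{pmatrix}.$$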
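To make the evaluation of a pencil on matrices concrete, here is a small sketch for the example above. It assumes the convention $L(X) = A_{11}\otimes X_{11} + A_{21}\otimes X_{21}$; the paper's formula (3.1) places the tensor factors in the opposite order, which differs only by a reordering of basis vectors. The helper names `kron`, `mat_add` and `evaluate_pencil` are illustrative, not from the paper.

```python
def kron(a, b):
    """Kronecker product of two matrices given as nested lists."""
    return [[x * y for x in row_a for y in row_b] for row_a in a for row_b in b]


def mat_add(a, b):
    """Entrywise sum of two matrices of the same shape."""
    return [[x + y for x, y in zip(row_a, row_b)] for row_a, row_b in zip(a, b)]


A11 = [[1, 2], [3, 4]]
A21 = [[0, 1], [-1, 0]]


def evaluate_pencil(x11, x21):
    """Return A11 (x) X11 + A21 (x) X21 for n-by-n matrices X11 and X21."""
    return mat_add(kron(A11, x11), kron(A21, x21))


# For n = 1 this reproduces the displayed 2x2 matrix with x11 = 5, x21 = 7.
assert evaluate_pencil([[5]], [[7]]) == [[5, 17], [8, 20]]

# For n = 2 each entry of the 2x2 display becomes a 2x2 block.
value = evaluate_pencil([[1, 0], [0, 0]], [[0, 1], [0, 0]])
assert value == [[1, 0, 2, 1], [0, 0, 0, 0], [3, -1, 4, 0], [0, 0, 0, 0]]
```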
1.5.2. Linear matrix inequalities and (pencil) balls

Let $\tilde L$ be a $d\times d$ monic symmetric linear pencil. The positivity domain of $\tilde L$ is defined to be

$$\mathcal{D}_{\tilde L} := \{ X \in M_{g'\times g} \mid \tilde L(X) \succeq 0 \}.$$

In other words, it is the set of all solutions to the LMI $\tilde L(X) \succeq 0$. We wish to analyze this solution set, and we can do so using results on balls which we have already obtained. Now we describe $\mathcal{D}_{\tilde L}$ as a type of ball [11]. To do this write $\tilde L$ as $\tilde L = I + L + L^*$, where $L$ is a $d'\times d$ NC analytic truly linear pencil; then to $L(x)$ we associate the (pencil) ball

$$\mathcal{B}_L := \bigcup_{n=1}^{\infty} \{ X \in M_{g'\times g}(n) \mid I_{dn} - L(X)^*L(X) \succeq 0 \} = \bigcup_{n=1}^{\infty} \{ X \in M_{g'\times g}(n) \mid \|L(X)\| \le 1 \}. \tag{1.14}$$
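Observe that $\mathcal{B}_{g'\times g} = \mathcal{B}_L$ for

$$L(x) = \sum_{i,j} E_{ij}x_{ij},$$

with $E_{ij}$ being the elementary $g'\times g$ matrix with 1 located at position $(i, j)$.

Lemma 1.8. For $X \in M_{g'\times g}$,

$$\begin{pmatrix} 0 & X \\ 0 & 0 \end{pmatrix} \in \mathcal{D}_{\tilde L} \quad\text{iff}\quad X \in \mathcal{B}_L. \tag{1.15}$$

Furthermore,

$$\begin{pmatrix} 0 & X \\ 0 & 0 \end{pmatrix} \in \partial\mathcal{D}_{\tilde L} \quad\text{iff}\quad X \in \partial\mathcal{B}_L. \tag{1.16}$$

Proof. By definition,

$$\tilde L\begin{pmatrix} 0 & X \\ 0 & 0 \end{pmatrix} = \begin{pmatrix} I & L(X) \\ L(X)^* & I \end{pmatrix}.$$

□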
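The step from the block matrix in the proof of Lemma 1.8 to the norm condition defining $\mathcal{B}_L$ is the standard Schur complement argument. The factorization below is a worked version of that step; it is a general linear-algebra identity, not something specific to the paper.

```latex
\[
\begin{pmatrix} I & L(X) \\ L(X)^* & I \end{pmatrix}
=
\begin{pmatrix} I & 0 \\ L(X)^* & I \end{pmatrix}
\begin{pmatrix} I & 0 \\ 0 & I - L(X)^*L(X) \end{pmatrix}
\begin{pmatrix} I & L(X) \\ 0 & I \end{pmatrix}.
\]
```

The outer factors are invertible and adjoint to one another, so the left-hand side is positive semidefinite exactly when $I - L(X)^*L(X) \succeq 0$, that is, when $\|L(X)\| \le 1$.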
1.5.3. Pencil ball maps

Now we turn to $\mathcal{B}_{g'\times g} \to \mathcal{B}_L$ maps. As a generalization of NC ball map, given a linear pencil $L$, an NC analytic mapping $f : \mathcal{B}_{g'\times g} \to \mathcal{B}_L$ will be called a pencil ball map provided $\|L(f(X))\| = 1$ whenever $\|X\| = 1$ and $f(X)$ is defined. Lemma 1.8 tells us that understanding pencil ball maps is equivalent to understanding maps on the sets of solutions to certain types of LMIs.

Theorem 1.9. Let $L$ be a $d'\times d$ NC analytic truly linear pencil and $f : \mathcal{B}_{g'\times g} \to \mathcal{B}_L$ a pencil ball map with $f(0) = 0$. Write $h := L \circ f$. Then there exist unitaries $U : \mathbb{C}^{d} \to \mathbb{C}^{d}$ and $V : \mathbb{C}^{d'} \to \mathbb{C}^{d'}$ such that

$$h(x) = V \begin{pmatrix} x & 0 \\ 0 & \tilde h(x) \end{pmatrix} U^*, \tag{1.17}$$

where $\tilde h$ is an NC analytic contraction-valued map.

Proof. Follows easily by applying Theorem 1.3 to $h$. □
Corollary 1.10. Let $L$ be a $d'\times d$ NC analytic truly linear pencil and $f : \mathcal{B}_{g'\times g} \to \mathcal{B}_L$ a pencil ball map with $\|L \circ f(0)\| < 1$. Then

$$L \circ f(x) = F_{L\circ f(0)}\big(\varphi(x)\big), \tag{1.18}$$

where $\varphi(x) = F_{L\circ f(0)}(L \circ f(x))$ is an NC ball map $\mathcal{B}_{g'\times g} \to \mathcal{B}_{d'\times d}$ taking 0 to 0 and is therefore completely described by Theorem 1.3.

Proof. Apply Theorem 1.7 to $L \circ f(x)$. □
1.5.4. Semi-distinguished pencil ball maps

Many of our proofs work, with little extra effort, for a class of functions more general than pencil ball maps. These involve the notion of distinguished boundary, which we now define. The Shilov boundary or distinguished boundary of $\mathcal{B}_{g'\times g}(N)$ is the smallest closed subset $\Delta$ of $\mathcal{B}_{g'\times g}(N)$ with the following property: for $f : \mathcal{B}_{g'\times g}(N) \to \mathbb{C}^K$ analytic and continuous to the boundary $\partial\mathcal{B}_{g'\times g}(N)$, for any $X \in \mathcal{B}_{g'\times g}(N)$ we have

$$\|f(X)\| \le \max_{U \in \Delta} \|f(U)\|. \tag{1.19}$$

In other words, the maximum of $\|f\|$ over $\mathcal{B}_{g'\times g}(N)$ occurs in the distinguished boundary. We refer the reader to [15, p. 145] or [8, Ch. 4] for more details. It is a theorem [1, p. 77] that the distinguished boundary of $\mathcal{B}_{g'}(N)$ is

$$\{ X \in \mathcal{B}_{g'}(N) \mid X^*X = I \}.$$

Accordingly, we let $\partial_{\mathrm{dist}}\mathcal{B}_{g'}$ denote the disjoint union of these distinguished boundaries and call this the distinguished boundary of $\mathcal{B}_{g'}$. A further discussion of distinguished boundaries for $\mathcal{B}_{g'\times g}$ is in Section 6.3. An NC analytic function $f : \mathcal{B}_{g'} \to \mathcal{B}_L$ satisfying $f(0) \notin \partial\mathcal{B}_L$ and

$$f(\partial_{\mathrm{dist}}\mathcal{B}_{g'}) \subseteq \partial\mathcal{B}_L \tag{1.20}$$
is called a distinguished pencil ball map. Here, (1.20) means that for every isometry $X$ for which $\lim_{\delta \nearrow 1} f(\delta X)$ exists, this limit lies in $\partial\mathcal{B}_L$. A natural open question is: classify distinguished pencil ball maps. Our proof of Theorem 8.1 does something like this but a little weaker. A key distinction between the semi-distinguished maps and the case treated earlier in Theorems 1.3 and 1.9 occurs with linear distinguished ball maps. These we find much harder to classify than linear NC ball maps, which we leave as an interesting open question.

Definition 1.11. The semi-distinguished boundary of $\mathcal{B}_{g'}$ is defined to be

$$\partial^{1/2}_{\mathrm{dist}}\mathcal{B}_{g'} := \bigcup_{n=1}^{\infty} \Big\{ X \in \mathcal{B}_{g'}(n) \;\Big|\; X^*X \text{ is a projection of dimension} \ge \tfrac{1}{2}n \Big\}.$$
An NC analytic function $f : \mathcal{B}_{g'} \to \mathcal{B}_L$ satisfying $f(0) \notin \partial\mathcal{B}_L$ and

$$f\big(\partial^{1/2}_{\mathrm{dist}}\mathcal{B}_{g'}\big) \subseteq \partial\mathcal{B}_L \tag{1.21}$$

is called a semi-distinguished pencil ball map. Here, (1.21) means that for every $X \in \partial^{1/2}_{\mathrm{dist}}\mathcal{B}_{g'}$ for which $\lim_{\delta \nearrow 1} f(\delta X)$ exists, this limit lies in $\partial\mathcal{B}_L$. The study of semi-distinguished pencil ball maps is the subject of Part II of this article. For semi-distinguished pencil ball maps we get a weak version of the pencil ball map classification Theorem 1.9; see Theorem 8.1.

Part I. Binding

2. Models for NC contractions

Let $S$ denote the ($g'$-tuple of) shift(s) on noncommutative Fock space $\mathcal{F}_{g'}$. The Hilbert space $\mathcal{F}_{g'}$ is the Hilbert space with orthonormal basis consisting of words $\langle x \rangle$ in $g'$ NC variables $x = (x_1, \ldots, x_{g'})$. Then $S_j w = x_j w$ for a word $w \in \langle x \rangle$, and $S_j$ extends by linearity and continuity to $\mathcal{F}_{g'}$. The key properties we need about $S$ are:

$$S_j^* S_\ell = \delta_{j\ell} I \quad \text{for } j, \ell = 1, \ldots, g', \qquad I - \sum_{j=1}^{g'} S_j S_j^* = P_0, \tag{2.1}$$
where $P_0$ is the (rank one) projection onto the span of the empty word. A column contraction is a $g'$-tuple of square matrices (operators),

$$X = \begin{pmatrix} X_1 \\ \vdots \\ X_{g'} \end{pmatrix},$$

such that $I - X^*X = I - \sum_j X_j^*X_j \succeq 0$. If $X$ acts on a finite dimensional space, then $X$ is a column contraction if and only if $X \in \mathcal{B}_{g'}$. In general, $X$ is a column contraction if and only if $X^*$ is a row contraction. Row contractions (and so column contractions too) are well studied, e.g. by Popescu and also Arveson. A strict column contraction is a column contraction $X$ for which there is an $\varepsilon > 0$ such that $I - \sum_j X_j^*X_j \succeq \varepsilon$. If $X$ is acting on a finite dimensional space, this last condition is equivalent to $I - \sum_j X_j^*X_j \succ 0$, i.e., $X \in \operatorname{int}\mathcal{B}_{g'}$.

Column contractions are modeled by $S^*$, which is the content of Lemma 2.1 below and a major motivation for these definitions. We do not use this property of the $S_j$ until proving Theorem 6.2.

Lemma 2.1. (See [7,21].) If $X$ is a strict column contraction acting on a Hilbert space $\mathcal{H}$, then there is a Hilbert space $\mathcal{K}$ and an isometry $V : \mathcal{H} \to \mathcal{K} \otimes \mathcal{F}_{g'}$ such that $VX = (I \otimes S^*)V$; i.e., for each $j$, $VX_j = (I \otimes S_j^*)V$, and in particular, for each word $w \in \langle x \rangle$, $V w(X) = (I \otimes w(S^*))V$. Here $I$ is the identity on $\mathcal{K}$. Further, if $X \in \mathcal{B}_{g'}(N)$ (so is a tuple of matrices), then the dimension of $\mathcal{K}$ can be assumed to be at most $N$.
A natural generalization of the $g'$-tuple of shifts on Fock space to $M_{g'\times g}$ and its (sequence of) ball(s) is

$$X = [\, S_j^* \otimes S_\ell \,]_{j,\ell=1}^{g',g}.$$

(A word of caution: we have abused notation by using $S_j$ to denote shifts on both $\mathcal{F}_{g'}$ and $\mathcal{F}_g$.) The operator $X$ should be compared to the reconstruction operator in [27]. Though we do not know if $X$ serves as a universal model for $\mathcal{B}_{g'\times g}$ in the same way that $S$ does for $\mathcal{B}_{g'}$, it does serve as a type of boundary for NC analytic functions. The statement of the results requires approximating $X$ by matrices.

The operator (not matrix) $X$ acts upon $\mathcal{F}_{g'\times g}$, the Hilbert space with orthonormal basis consisting of words in $g'g$ NC variables $x = (x_{j,\ell})_{j,\ell=1}^{g',g}$. Given a natural number $n$, let $\mathcal{F}_g(n)$ denote the span of words of length at most $n$ in $\mathcal{F}_g$, and set $\mathcal{F}_{g'\times g}(n) = \mathcal{F}_{g'}(n) \otimes \mathcal{F}_g(n)$. Let $X_n$ denote the compression of $X$ to the (semi-invariant finite dimensional) subspace $\mathcal{F}_{g'\times g}(n)$.

Lemma 2.2. Let $P_n$ denote the projection onto the complement of the span of $\emptyset$ in $\mathcal{F}_g(n)$ (and also in $\mathcal{F}_{g'}(n)$) and let $Q_n$ denote the projection onto the complement of the span of $\{w \mid w \text{ is a word of length } n\}$ in $\mathcal{F}_g(n)$ (and also in $\mathcal{F}_{g'}(n)$). Then:

$$X_n^* X_n = I_g \otimes P_n \otimes Q_n, \qquad X_n X_n^* = I_{g'} \otimes Q_n \otimes P_n. \tag{2.2}$$
Remark 2.3. In view of the definition of $\mathcal{B}_{g'\times g}$, it is natural to think of an NC analytic function $h$ on $\mathcal{B}_{g'\times g}$ as a function of the $g'g$ variables $x_{j,\ell}$, $1 \le j \le g'$ and $1 \le \ell \le g$. In turn, a monomial $m$ in $(x_{j,\ell})$ can be viewed as a homogeneous monomial $u \otimes v$, where $u$ and $v$ are monomials of the same length (same as the length of $m$), and $u$ and $v$ are monomials in NC variables $y_j$ ($1 \le j \le g'$) and $z_\ell$ ($1 \le \ell \le g$) respectively. In this way,

$$h = \sum_{\alpha} \sum_{|u|=|v|=\alpha} a_{u\otimes v}\, u \otimes v = \sum_{\alpha} h^{(\alpha)}.$$

For instance, the monomial $x_{23}x_{41}$ is identified with $y_2y_4 \otimes z_3z_1$.

We want to evaluate NC analytic functions $\mathcal{B}_{g'\times g} \to M_{d'\times d}$ on $X_n$, which is a norm one matrix, thereby causing power series convergence difficulties. However, evaluating NC analytic functions on nilpotent tuples $X \in \mathcal{B}_{g'\times g}$ behaves especially well. Here a tuple $X$ is called nilpotent of order $\beta$ if $w(X) = 0$ for every word $w$ of length $\beta$.

Lemma 2.4. If $f : \mathcal{B}_{g'\times g} \to M_{d'\times d}$ is NC analytic and $X \in \mathcal{B}_{g'\times g}$ is nilpotent of order $\beta$, then $f(X)$ is defined and moreover,

$$f(X) = \sum_{\alpha<\beta} f^{(\alpha)}(X).$$

In particular, if $f$ is an NC ball map, $f(0) = 0$, and $Y \in \partial\mathcal{B}_{g'\times g}$ is nilpotent of order two, then $f(Y) = f^{(1)}(Y)$.
Proof. Let $X \in \mathcal{B}_{g'\times g}$ be given and let $r$ denote the series radius for $f$. For $z \in \mathbb{D}$ with $|z| < r$ the power series expansion for $f(zX)$ converges. The nilpotent hypothesis gives

$$f(zX) = \sum_{\alpha<\beta} f^{(\alpha)}(X)\, z^{\alpha}.$$

Since $f(zX)$ is analytic for $|z| < 1$ and is equal to the polynomial on the right-hand side above for $|z| < r$, equality holds for all $z$. If $Y \in \partial\mathcal{B}_{g'\times g}$ and $Y$ is nilpotent of order two, the argument above shows

$$f(zY) = \sum_{\alpha \le 1} f^{(\alpha)}(Y)\, z^{\alpha}.$$

Moreover, the assumption $f(0) = 0$ implies $f^{(0)} = 0$. Choosing $z = 1$ gives $f(Y) = f^{(1)}(Y)$. □
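The truncation in Lemma 2.4 can be seen in the simplest possible setting. The sketch below is an illustration under simplifying assumptions, not the paper's setup: it uses a single variable, the function $f(x) = (1-x)^{-1} = \sum_\alpha x^\alpha$, and a strictly upper triangular $3\times 3$ matrix, which is nilpotent of order 3. Exact rational arithmetic is used so that the checks are equalities; all helper names are hypothetical.

```python
from fractions import Fraction


def mat_mul(a, b):
    """Product of two matrices given as nested lists."""
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))] for i in range(len(a))]


def mat_add(a, b):
    return [[x + y for x, y in zip(ra, rb)] for ra, rb in zip(a, b)]


def mat_sub(a, b):
    return [[x - y for x, y in zip(ra, rb)] for ra, rb in zip(a, b)]


def identity(n):
    return [[Fraction(int(i == j)) for j in range(n)] for i in range(n)]


def truncated_series(x, order):
    """Sum of x**alpha over alpha < order: the homogeneous parts of (1 - x)^{-1}."""
    n = len(x)
    total = [[Fraction(0)] * n for _ in range(n)]
    power = identity(n)
    for _ in range(order):
        total = mat_add(total, power)
        power = mat_mul(power, x)
    return total


half, third = Fraction(1, 2), Fraction(1, 3)
X = [[0, half, third], [0, 0, half], [0, 0, 0]]  # strictly upper triangular

# X is nilpotent of order 3: every word of length 3 vanishes.
zero = [[0] * 3 for _ in range(3)]
assert mat_mul(mat_mul(X, X), X) == zero

# The finite sum over alpha < 3 already inverts 1 - X.
f_of_X = truncated_series(X, 3)
assert mat_mul(mat_sub(identity(3), X), f_of_X) == identity(3)
```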
Lemma 2.5.
(a) Suppose $p$ is an NC polynomial of degree $N$ with $\mathbb{C}^{d'\times d}$ coefficients in $g'g$ variables and $p(0) = 0$.
  (1) If $0 \preceq I - X_n^*X_n - p(X_n)^*p(X_n)$ for each $n \le N$, then $p = 0$.
  (2) If $0 \preceq I - X_nX_n^* - p(X_n)p(X_n)^*$ for each $n \le N$, then $p = 0$.
(b) Suppose $h : \mathcal{B}_{g'\times g} \to M_{d'\times d}$ is NC analytic. If $h(X_n) = 0$ for each $n$, then $h = 0$.

Proof. (a) Write

$$p = \sum_{\alpha=0}^{m} p^{(\alpha)}$$

as in Remark 2.3. In particular,

$$p^{(\alpha)} = \sum_{|u|=|v|=\alpha} a_{u\otimes v}\, u \otimes v,$$

and $a_{u\otimes v} \in \mathbb{C}^{d'\times d}$. By hypothesis $a_\emptyset = 0$, so that $p^{(0)} = 0$. Now suppose $p^{(k)} = 0$ for $k < n$. Let $w$ be a word of length $n$ and $\gamma \in \mathbb{C}^{g}$ be given. From Lemma 2.2, we have

$$0 = (I - X_n^*X_n)\, \gamma \otimes w \otimes \emptyset.$$
Hence,

$$0 = p(X_n)\,\gamma \otimes w \otimes \emptyset = p^{(n)}(X_n)\,\gamma \otimes w \otimes \emptyset = \sum_{|v|=n} p_{w\otimes v}\,\gamma \otimes \emptyset \otimes v.$$

Thus $p_{w\otimes v} = 0$ and it follows that $p^{(n)} = 0$.

(b) This proof is similar. Here is a brief outline. First note that $0 = h(0)$. Let $r$ denote the series radius for $h$. Fix $N$. For $|z| < r$ and for any $n \le N$, by Lemma 2.4 we have (since $X_n$ is nilpotent of order $n \le N$)

$$h(X_n) = \sum_{\alpha=1}^{N} h^{(\alpha)}(X_n).$$

If we now let $p$ denote the polynomial $\sum_{\alpha=1}^{N} h^{(\alpha)}$ of degree $N$, it follows from (a) that $p = 0$. Since this is true for all $N$, we see $h = 0$. □

3. NC isometries

This section has two parts. The first shows that the linear part of an NC ball map is an NC ball map, i.e., it is what is commonly known as a complete isometry. The second subsection classifies these linear NC ball maps. Recall that an NC analytic function $f : \mathcal{B}_{g'\times g} \to \mathcal{B}_{d'\times d}$ is an NC ball map provided it is NC analytic and contraction-valued in the interior of $\mathcal{B}_{g'\times g}$ and, for $X \in \mathcal{B}_{g'\times g}(N)$ with $\|X\| = 1$, $\|f(e^{it}X)\| = 1$ for almost every $t \in \mathbb{R}$.

3.1. Pencil ball maps have isometric derivatives
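A linear mapping $\psi : \mathbb{C}^{g'\times g} \to \mathbb{C}^{d'\times d}$ is completely determined by its action on the matrix units $E_{j,\ell} \in \mathbb{C}^{g'\times g}$ with a 1 in the $(j, \ell)$ position and 0 elsewhere. The mapping $\psi$ then naturally extends to a mapping, still denoted $\psi$, on $\mathbb{C}^{n\times n} \otimes \mathbb{C}^{g'\times g}$ by the formula

$$\psi\big([\, X_{j,\ell} \,]_{j,\ell}\big) = \sum_{j,\ell} X_{j,\ell} \otimes \psi(E_{j,\ell}) \in \mathbb{C}^{n\times n} \otimes \mathbb{C}^{d'\times d}. \tag{3.1}$$

For notational simplicity, the formula above is written $\psi(X)$. The mapping $\psi$ is completely isometric if $\|\psi(X)\| = \|X\|$ for each $X \in \mathbb{C}^{n\times n} \otimes \mathbb{C}^{g'\times g}$ and each $n$, and is completely contractive if $\|\psi(X)\| \le \|X\|$ for all $X$.

Proposition 3.1. Suppose $f : \mathcal{B}_{g'\times g} \to \mathcal{B}_{d'\times d}$ is an NC analytic map with $f(0) = 0$. If $f$ is an NC ball map, then $f^{(1)}$, the linear part of $f$, is a complete isometry.

Proof. We start by observing that, in view of Lemma 2.4,

$$f^{(1)}\begin{pmatrix} 0 & X \\ 0 & 0 \end{pmatrix} = f\begin{pmatrix} 0 & X \\ 0 & 0 \end{pmatrix} \tag{3.2}$$

for every $X \in \mathcal{B}_{g'\times g}$.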
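If $f$ is an NC ball map, then for $X \in \partial\mathcal{B}_{g'\times g}$

$$1 = \|X\| = \left\| \begin{pmatrix} 0 & X \\ 0 & 0 \end{pmatrix} \right\| = \left\| f\begin{pmatrix} 0 & X \\ 0 & 0 \end{pmatrix} \right\| \tag{3.3}$$

by the binding property. Now by (3.2),

$$\left\| f\begin{pmatrix} 0 & X \\ 0 & 0 \end{pmatrix} \right\| = \left\| f^{(1)}\begin{pmatrix} 0 & X \\ 0 & 0 \end{pmatrix} \right\| = \left\| \begin{pmatrix} 0 & f^{(1)}(X) \\ 0 & 0 \end{pmatrix} \right\| = \|f^{(1)}(X)\|. \tag{3.4}$$

From (3.3) and (3.4) we obtain $\|f^{(1)}(X)\| = 1$ for all $X$ with $\|X\| = 1$. □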
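Remark 3.2. This remark does not contribute to the proofs; rather, it is for the sake of reconciling the definitions of complete isometries and contractions given here with what is typically found in the literature (cf. [19]). Often a completely contractive (resp. isometric) mapping $\psi : \mathbb{C}^{g'\times g} \to \mathbb{C}^{d'\times d}$ is defined as follows. Given $n$, let $(\mathbb{C}^{g'\times g})^{n\times n}$ denote the $n\times n$ matrices with entries from $\mathbb{C}^{g'\times g}$ and define $1_n \otimes \psi : (\mathbb{C}^{g'\times g})^{n\times n} \to (\mathbb{C}^{d'\times d})^{n\times n}$ by

$$1_n \otimes \psi\big([\, Y_{\alpha,\beta} \,]_{\alpha,\beta=1}^{n}\big) = \big[\, \psi(Y_{\alpha,\beta}) \,\big]_{\alpha,\beta=1}^{n}.$$

In this definition, the block matrix $Y = [\, Y_{\alpha,\beta} \,]_{\alpha,\beta=1}^{n}$ is written as

$$Y = \sum E_{\alpha,\beta} \otimes Y_{\alpha,\beta},$$

where $E_{\alpha,\beta} \in \mathbb{C}^{n\times n}$ are the $n\times n$ matrix units. Evaluating $\psi$ on $Y$ becomes

$$1_n \otimes \psi(Y) = \sum E_{\alpha,\beta} \otimes \psi(Y_{\alpha,\beta}).$$

By using the matrix units basis $E_{j,\ell}$ of $\mathbb{C}^{g'\times g}$, $Y$ can be rewritten as

$$Y = \sum X_{j,\ell} \otimes E_{j,\ell}$$

for some $X_{j,\ell}$. Evaluating $1_n \otimes \psi$ on $Y$ expressed as above gives Eq. (3.1). Passing between these two expressions for $Y$ is known as the canonical shuffle in [19]. Letting $A_{j,\ell} = \psi(E_{j,\ell})$, Eq. (3.1) becomes

$$\psi(X) = \sum X_{j,\ell} \otimes A_{j,\ell}.$$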
3.2. Completely isometric maps on $\mathbb{C}^{g'\times g}$

The following theorem, which classifies completely isometric maps on $\mathbb{C}^{g'\times g}$, is the main result of this section. It appears as Corollary 3.4 in [5] (see also [4]). For the reader's convenience, we provide an elementary self-contained proof.
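Theorem 3.3. A linear mapping $\psi : \mathbb{C}^{g'\times g} \to \mathbb{C}^{d'\times d}$ is completely isometric if and only if there exist unitaries $U : \mathbb{C}^{d} \to \mathbb{C}^{d}$, $V : \mathbb{C}^{d'} \to \mathbb{C}^{d'}$ and a completely contractive (linear) mapping $\varphi : \mathbb{C}^{g'\times g} \to \mathbb{C}^{(d'-g')\times(d-g)}$ such that

$$\psi(Y) = V \begin{pmatrix} Y & 0 \\ 0 & \varphi(Y) \end{pmatrix} U^*.$$

Throughout this subsection let $\psi : \mathbb{C}^{g'\times g} \to \mathbb{C}^{d'\times d}$ denote a completely isometric mapping. It is convenient to make use of the matrix units in $\mathbb{C}^{g'\times g}$. Let $\{e_j\}$ and $\{e'_j\}$ denote the standard basis for $\mathbb{C}^{g}$ and $\mathbb{C}^{g'}$ respectively. Let $A_{j,\ell} = \psi(e'_j e_\ell^*)$ for $1 \le j \le g'$ and $1 \le \ell \le g$ be as in Remark 3.2. We have represented $\psi$ in terms of the matrix

$$A = [\, A_{j,\ell} \,]_{j,\ell=1}^{g',g} \in \big(\mathbb{C}^{d'\times d}\big)^{g'\times g}.$$

This matrix has the formal block transpose given by $A^* = [\, A_{\ell,j} \,]_{j,\ell}$.

Lemma 3.4. If $\psi$ is completely contractive, then $A^*$ is a contraction.

Proof. Choose $X = \sum_{j,\ell=1}^{g',g} e'_j e_\ell^* \otimes e_\ell (e'_j)^*$. Direct computation reveals that $X^*X = I_{g'g}$ and thus the block matrix $X$ is a contraction. Hence $\psi(X) = A^*$ is also a contraction. □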
Remark 3.5.
(1) That the converse of Lemma 3.4 is not true in general can be seen by considering the mapping $\psi : \mathbb{C}^{2\times 2} \to \mathbb{C}$ defined by $\psi(e_j e_\ell^*) = \delta_{j\ell}$. In this case, $A^* = I_2$, but $\psi(E_{11} + E_{22}) = 2$, so that $\psi$ is not even contraction-valued.
(2) For $g = 1$ the converse does hold. We leave this as an exercise for the interested reader.
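The counterexample in Remark 3.5(1) is small enough to write out. In the sketch below, which is illustrative only, the map $\psi$ is the trace on $2\times 2$ matrices; the names `psi`, `matrix_unit` and `block_matrix` are hypothetical. The operator norm of the identity matrix is 1, a standard fact that the code states in a comment rather than computes.

```python
def psi(y):
    """psi(e_j e_l^*) = delta_{jl}, extended linearly: the trace of a 2x2 matrix."""
    return y[0][0] + y[1][1]


def matrix_unit(j, l):
    """The 2x2 matrix unit with a 1 in position (j, l), indices starting at 0."""
    return [[int((r, c) == (j, l)) for c in range(2)] for r in range(2)]


# The matrix A = [psi(E_{jl})] is the 2x2 identity, so it equals its own
# block transpose and is a contraction.
block_matrix = [[psi(matrix_unit(j, l)) for l in range(2)] for j in range(2)]
assert block_matrix == [[1, 0], [0, 1]]

# E11 + E22 is the identity, which has operator norm 1, yet psi sends it to 2.
identity2 = [[1, 0], [0, 1]]
assert psi(identity2) == 2
```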
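Proposition 3.6. A completely contractive mapping $\psi : \mathbb{C}^{g'\times g} \to \mathbb{C}^{d'\times d}$ is a complete isometry if and only if there is a set $\{f_1, \ldots, f_g\} \subseteq \mathbb{C}^{d}$ of unit vectors satisfying

$$\langle A_{\alpha,s} f_u, A_{\beta,t} f_v \rangle = \begin{cases} 1 & \text{if } (\alpha, s, t) = (\beta, u, v), \\ 0 & \text{otherwise.} \end{cases} \tag{3.5}$$

Here $1 \le u, v \le g$, $1 \le s, t \le g$, and $1 \le \alpha, \beta \le g'$.

The following lemma is an important ingredient in the proof.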
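Lemma 3.7. Under the hypotheses of Proposition 3.6, the set $\{f_1, \ldots, f_g\}$ is orthonormal. Moreover,

$$h_\alpha = A_{\alpha,j} f_j \in \mathbb{C}^{d'} \quad (1 \le \alpha \le g') \tag{3.6}$$

is independent of $j$.

Proof. Let $f_j$ be a set of unit vectors satisfying Eq. (3.5). Notice first that, for fixed $j$, the set $\{A_{\alpha,j} f_j \mid 1 \le \alpha \le g'\}$ is an orthonormal set. Let $\mathcal{S}_j$ denote the span of this set. Given $j, \ell$ and $\alpha$,

$$A_{\alpha,j} f_j = \sum_\beta c_\beta A_{\beta,\ell} f_\ell + \zeta$$

for some $\zeta$ orthogonal to $\mathcal{S}_\ell$ (and where the dependence of the coefficients $c_\beta$ on $\alpha, j, \ell$ has been suppressed). Taking the inner product with $A_{\gamma,\ell} f_\ell$, it follows that $c_\beta = 1$ if $\beta = \alpha$ and $c_\beta = 0$ otherwise; i.e.,

$$A_{\alpha,j} f_j = A_{\alpha,\ell} f_\ell + \zeta.$$

On the other hand, both $A_{\alpha,j} f_j$ and $A_{\alpha,\ell} f_\ell$ are unit vectors and thus $\zeta = 0$. Hence, $A_{\alpha,j} f_j$ is independent of $j$ and $h_\alpha = A_{\alpha,j} f_j$ is unambiguously defined. Since $A_{\alpha,j}$ is a contraction (as it is, by definition, $\varphi(E_{\alpha,j})$) and since $\|f_j\| = 1$ and $\|A_{\alpha,j} f_j\| = 1$, it follows that $f_j = A^*_{j,\alpha} h_\alpha$, and is thus independent of $\alpha$. Using this last claim, consider, for $j \ne \ell$,

$$2 \ge \big\| \big(A^*_{j,\alpha} + e^{it} A^*_{\ell,\alpha}\big) h_\alpha \big\|^2 = 2 + 2\operatorname{Re} e^{it} \langle f_j, f_\ell \rangle.$$

It follows that $\langle f_j, f_\ell \rangle = 0$. Here we have used that

$$\varphi\big(e_j e_\alpha^* \otimes e_\alpha e_j^* + e^{-it} e_\ell e_\alpha^* \otimes e_\alpha e_\ell^*\big) = A_{\alpha,j} + e^{-it} A_{\alpha,\ell}$$

is a contraction, $\|[\, 1 \;\; e^{it} \,]\|^2 = 2$, and $\|h_\alpha\| = 1$. □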
Proof of Proposition 3.6. Suppose such $f$'s exist. Let $X \in \mathbb{C}^{g'\times g} \otimes \mathbb{C}^{n\times n}$ with $\|X\| = 1$ be given. Thus $X$ is a contraction and there is a unit vector $x = \sum x_j \otimes e_j$ such that $\|Xx\| = 1$. In particular,

$$\sum x_t^* X_{\alpha,t}^* X_{\alpha,s} x_s = 1.$$

Thus,

$$\Big\langle \psi(X)^*\psi(X) \sum_u x_u \otimes f_u, \sum_v x_v \otimes f_v \Big\rangle = \sum f_v^* A_{\beta,t}^* A_{\alpha,s} f_u \; x_v^* X_{\beta,t}^* X_{\alpha,s} x_u = \sum x_t^* X_{\alpha,t}^* X_{\alpha,s} x_s = 1.$$

Of course we also must be careful to check, in view of the orthonormality of $\{f_1, \ldots, f_g\}$ of Lemma 3.7,

$$\Big\langle \sum_u x_u \otimes f_u, \sum_v x_v \otimes f_v \Big\rangle = \sum_u x_u^* x_u = 1.$$
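Thus, if $\|X\| = 1$, then $\|\psi(X)\| \ge 1$. Since $\psi$ is assumed to be a contraction, the proof that $\psi$ is completely isometric follows.

Let us now turn to the converse. Suppose $\psi$ is completely isometric. Fix $\alpha$ and choose $X = \sum_\ell e'_\alpha (e_\ell)^* \otimes e_\alpha (e_\ell)^*$. Then,

$$XX^* = g\, e'_\alpha (e'_\alpha)^* \otimes e_\alpha e_\alpha^*.$$

Thus, $\varphi(X) = \sum_\ell A_{\alpha,\ell} \otimes e_\alpha (e_\ell)^*$ has norm at most $\sqrt{g}$. Equivalently, the block row $[\, A_{\alpha,1} \;\cdots\; A_{\alpha,g} \,]$ has norm at most $\sqrt{g}$. Suppose now that

$$h = \begin{pmatrix} h_1 \\ \vdots \\ h_g \end{pmatrix}$$

is a unit vector and $\big\| \sum_s A_{\alpha,s} h_s \big\|^2 = g$. Then, using the fact that each $A_{\alpha,s}$ is a contraction,

$$g = \Big\| \sum_s A_{\alpha,s} h_s \Big\|^2 = \sum_{s,t} \langle A_{\alpha,s} h_s, A_{\alpha,t} h_t \rangle \le \sum_{s,t} \|A_{\alpha,s} h_s\|\,\|A_{\alpha,t} h_t\| \le \sum_{s,t} \|h_s\|\,\|h_t\| = \Big( \sum_s \|h_s\| \Big)^2 \le g \|h\|^2 = g.$$

The Cauchy–Schwarz inequality was used in two of the inequalities. Because equality prevails in the end, we must have equality in the inequalities. Therefore, $\|h_s\|^2 = \frac{1}{g}$ for each $s$ and moreover,

$$\langle A_{\alpha,s} h_s, A_{\alpha,t} h_t \rangle = \frac{1}{g}$$

for each $s, t$.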
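Choose $X = \sum_{j,\ell} e'_j (e_\ell)^* \otimes e'_j (e_\ell)^*$ and note $\|X\|^2 = g'g$. Then $\varphi(X) = \sum_{j,\ell} A_{j,\ell} \otimes e'_j e_\ell^*$ has norm squared exactly $g'g$. In particular, there is a unit vector

$$f = \begin{pmatrix} f_1 \\ \vdots \\ f_g \end{pmatrix}$$

such that $\|\varphi(X) f\|^2 = g'g$. Hence

$$g'g = \sum_\alpha \Big\| \sum_s A_{\alpha,s} f_s \Big\|^2.$$

From the paragraph above, each block row $[\, A_{\alpha,1} \;\cdots\; A_{\alpha,g} \,]$ has norm squared at most $g$, and thus for each $\alpha$ we must have $\big\| \sum_s A_{\alpha,s} f_s \big\|^2 = g$. Again in view of the preceding paragraph, it follows that $\|f_s\|^2 = \frac{1}{g}$ for each $s$ and moreover

$$\langle A_{\alpha,s} f_s, A_{\alpha,t} f_t \rangle = \frac{1}{g} \tag{3.7}$$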
for every $\alpha, s, t$. Fix $\alpha$. Applying the matrix $A^*$ of Lemma 3.4 to the vector $f_1 \otimes e_\alpha$ produces the vector

$$\begin{pmatrix} A_{\alpha,1} f_1 \\ A_{\alpha,2} f_1 \\ \vdots \\ A_{\alpha,g} f_1 \end{pmatrix}.$$

Since the first entry has norm $\frac{1}{\sqrt{g}}$ and the whole vector itself has norm at most $\frac{1}{\sqrt{g}}$, it follows that $A_{\alpha,s} f_1 = 0$ whenever $s \ne 1$. Applying the same argument to the other indices $u$ shows

$$A_{\alpha,s} f_u = 0 \quad \text{for } s \ne u. \tag{3.8}$$

For the final ingredient, fix $\alpha \ne \beta$ and let

$$Y = e'_\alpha e_1^* \otimes e_1^* + e'_\beta e_2^* \otimes e_2^*.$$

Since $Y^*Y = e_1 e_1^* \otimes e_1 e_1^* + e_2 e_2^* \otimes e_2 e_2^*$, $Y$ is a contraction. Therefore,

$$\varphi(Y) = A_{\alpha,1} \otimes e_1^* + A_{\beta,2} \otimes e_2^*$$

is also a contraction. Let $F(t) = f_1 \otimes e_1 + e^{it} f_2 \otimes e_2$. With these notations, $\varphi(Y)F(t) = A_{\alpha,1} f_1 + e^{it} A_{\beta,2} f_2$, which gives the second equality in

$$2 = \frac{1}{2\pi} \int \Big( 2 + e^{-it} \langle A_{\alpha,1} f_1, A_{\beta,2} f_2 \rangle + e^{it} \langle A_{\beta,2} f_2, A_{\alpha,1} f_1 \rangle \Big)\, dt = \frac{1}{2\pi} \int \|\varphi(Y)F(t)\|^2\, dt \le 2.$$

The inequality is a consequence of the hypothesis $\|\varphi(Y)\| \le 1$ and $\|F(t)\|^2 = 2$. It follows that $\|\varphi(Y)F(t)\|^2 = 2$ for every $t$ and thus $\langle A_{\alpha,1} f_1, A_{\beta,2} f_2 \rangle = 0$ whenever $\alpha \ne \beta$. Repeating the argument with other indices shows

$$\langle A_{\alpha,s} f_s, A_{\beta,t} f_t \rangle = 0 \quad \text{if } \alpha \ne \beta. \tag{3.9}$$

(Here $s = t$ is ok so long as $\alpha \ne \beta$.) Combining Eqs. (3.7), (3.8), and (3.9) gives the desired (3.5). □
3.2.1. Characterization of complete isometries

In this subsection, Theorem 3.3 is deduced from Proposition 3.6. We begin with a lemma which follows readily from Lemma 2.5.

Lemma 3.8. Suppose the linear map $\Sigma : \mathbb{C}^{g'\times g} \to \mathbb{C}^{d'\times d}$ has the form

$$\Sigma(x) = \begin{pmatrix} x & \sigma_1(x) \\ \sigma_2(x) & \sigma_3(x) \end{pmatrix}.$$

If $\Sigma$ is completely contractive, then $\sigma_1 = 0$ and $\sigma_2 = 0$.

Proof. For a given $n$ we have

$$0 \preceq I - \Sigma(X_n)^*\Sigma(X_n) = \begin{pmatrix} I - X_n^*X_n - \sigma_2(X_n)^*\sigma_2(X_n) & * \\ * & * \end{pmatrix}.$$

Thus the upper left-hand corner in the block matrix above is positive semidefinite and Lemma 2.5 implies $\sigma_2 = 0$. Reversing the order of the products shows $\sigma_1 = 0$. □

Proof of Theorem 3.3. If $\psi$ has the given form, then $\psi$ is evidently completely isometric. Conversely, suppose $\psi$ is completely isometric. Let $f_j$ be a set of unit vectors satisfying Eq. (3.5). By Lemma 3.7, the set $\{f_1, \ldots, f_g\}$ is orthonormal and moreover, $h_\alpha = A_{\alpha,j} f_j$ is independent of $j$ and $\{h_1, \ldots, h_{g'}\}$ is also an orthonormal set.
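Let

$$F = [\, f_1 \;\cdots\; f_g \,], \qquad H = [\, h_1 \;\cdots\; h_{g'} \,].$$

The mappings $F, H$ are isometries $\mathbb{C}^{g} \to \mathbb{C}^{d}$ and $\mathbb{C}^{g'} \to \mathbb{C}^{d'}$ respectively. Further, for given $\beta, u$, Eq. (3.5) gives

$$h_\beta^* \Big( \sum_{\alpha,s} x_{\alpha,s} A_{\alpha,s} \Big) f_u = \sum_{\alpha,s} x_{\alpha,s}\, h_\beta^* A_{\alpha,s} f_u = x_{\beta,u}\, h_\beta^* A_{\beta,u} f_u = x_{\beta,u}.$$

It follows that $H^*\varphi(x)F = x$. This proves the first part of this direction of the theorem. The isometries $H$ and $F$ extend to unitaries $V$ and $U$ respectively, which produces the representation

$$\varphi(x) = V \begin{pmatrix} x & \sigma_1 \\ \sigma_2 & \sigma_3 \end{pmatrix} U^*,$$

where the block matrix $\Sigma = \begin{pmatrix} x & \sigma_1 \\ \sigma_2 & \sigma_3 \end{pmatrix}$ is completely contractive since the same is true for $\varphi$. Now Lemma 3.8 completes the proof. □

4. Proof of Theorem 1.3

In this section we prove Theorem 1.3. Accordingly, suppose $h : \mathcal{B}_{g'\times g} \to \mathcal{B}_{d'\times d}$ is an NC ball map and $h(0) = 0$. From Proposition 3.1, $h^{(1)}$, the linear part of $h$, is a complete isometry. By Theorem 3.3, there exist unitaries $U$ and $V$ and a completely contractive mapping $\tilde h^{(1)}$ such that

$$h^{(1)}(x) = V \begin{pmatrix} x & 0 \\ 0 & \tilde h^{(1)}(x) \end{pmatrix} U^*. \tag{4.1}$$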
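We claim that $V h(x) U^*$ is of the desired form (1.8). For the sake of convenience we replace $h(x)$ by $V^* h(x) U$. For $X \in \mathcal{B}_{g'\times g}(N)$ consider

$$\mathbb{D} \to M_{d'\times d}(N), \qquad z \mapsto h(zX),$$

which is analytic (in $z$). This is a function of one complex variable, so the classical Schwarz lemma applies. Hence for all $0 \le \delta < 1$ and $0 \le \theta \le 2\pi$ we have

$$0 \preceq \delta^2 I - h(\delta e^{i\theta} X)^* h(\delta e^{i\theta} X). \tag{4.2}$$

If $\delta$ is in the series radius, we may write

$$h(\delta e^{i\theta} X) = h^{(1)}(\delta e^{i\theta} X) + h^{(\infty)}(\delta e^{i\theta} X) = \sum_{\alpha=1}^{\infty} h^{(\alpha)}(\delta e^{i\theta} X).$$

We integrate (4.2) for such $\delta$ to obtain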
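$$\begin{aligned}
0 &\preceq \frac{1}{2\pi} \int_0^{2\pi} \Big( \delta^2 I - h(\delta e^{i\theta} X)^* h(\delta e^{i\theta} X) \Big)\, d\theta \\
&= \delta^2 I - \delta^2 h^{(1)}(X)^* h^{(1)}(X) - \frac{1}{2\pi} \int_0^{2\pi} h^{(\infty)}(\delta e^{i\theta} X)^* h^{(\infty)}(\delta e^{i\theta} X)\, d\theta \\
&= \delta^2 I - \delta^2 h^{(1)}(X)^* h^{(1)}(X) - \sum_{\alpha=2}^{\infty} \delta^{2\alpha} h^{(\alpha)}(X)^* h^{(\alpha)}(X),
\end{aligned} \tag{4.3}$$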
where the last equality uses the homogeneity (of order $\alpha$) of $h^{(\alpha)}$. Fix an $\alpha \ge 2$ and write $\delta^{\alpha-1} h^{(\alpha)} = \begin{pmatrix} b_1 & b_2 \\ b_3 & b_4 \end{pmatrix}$ for NC analytic polynomials $b_j$. Then by Eqs. (4.1) and (4.3) and because the $b_j$ are polynomials,

$$\begin{aligned}
0 &\preceq \begin{pmatrix} I & 0 \\ 0 & I \end{pmatrix} - \begin{pmatrix} X^*X & 0 \\ 0 & \tilde h^{(1)}(X)^* \tilde h^{(1)}(X) \end{pmatrix} - \begin{pmatrix} b_1(X)^* & b_3(X)^* \\ b_2(X)^* & b_4(X)^* \end{pmatrix} \begin{pmatrix} b_1(X) & b_2(X) \\ b_3(X) & b_4(X) \end{pmatrix} \\
&= \begin{pmatrix} I - X^*X & 0 \\ 0 & I - \tilde h^{(1)}(X)^* \tilde h^{(1)}(X) \end{pmatrix} - \begin{pmatrix} b_1(X)^* b_1(X) + b_3(X)^* b_3(X) & b_1(X)^* b_2(X) + b_3(X)^* b_4(X) \\ b_2(X)^* b_1(X) + b_4(X)^* b_3(X) & b_2(X)^* b_2(X) + b_4(X)^* b_4(X) \end{pmatrix}.
\end{aligned} \tag{4.4}$$

It follows that $I - X_n^* X_n - b_j(X_n)^* b_j(X_n) \succeq 0$ for $j = 1, 3$ and all $n$. Lemma 2.5 thus implies $b_1 = 0$ and $b_3 = 0$. We now multiply in the other order (consider, say, $X_n X_n^*$ instead of $X_n^* X_n$) to conclude that $b_2 = 0$ (also $b_1 = 0$, but that we already knew). This shows $h$ has the desired form and completes the proof.

5. Linear fractional transformation of a ball

It is well known that the bianalytic maps on the unit disk $\mathbb{D}$ are exactly the linear fractional maps. These act transitively on the unit disk. That is, if $w, z \in \mathbb{D}$, then there is a linear fractional map $F$ which maps $w$ to $z$. It is standard in classical several complex variables that this generalizes to special domains in $\mathbb{C}^n$ [8]. In this section we give basic properties of linear fractional maps on $\mathcal{B}_{d'\times d}$.

Given a $d'\times d$ matrix $v$ with $\|v\| < 1$, define $F_v : \mathcal{B}_{d'\times d} \to \mathcal{B}_{d'\times d}$ by

$$F_v(u) := v - (I_{d'} - vv^*)^{1/2}\, u\, (I_d - v^*u)^{-1} (I_d - v^*v)^{1/2}. \tag{5.1}$$
Lemma 5.1. Suppose $\mathcal{D}$ is an open NC domain containing 0. If $u : \mathcal{D} \to \mathcal{B}_{d'\times d}$ is NC analytic, then $F_v(u(x))$ is an NC analytic function (in $x$) on $\mathcal{D}$.
Proof. Since $v$ is a matrix with $\|v\| < 1$, the expressions $(I_{d'} - vv^*)^{1/2}$ and $(I_d - v^*v)^{1/2}$ are constant NC analytic functions. As sums and products of NC analytic functions are NC analytic, it suffices to show that $\zeta := (I_d - v^*u)^{-1}$ is NC analytic. Note that $v^*u$ is NC analytic on $\mathcal{D}$. Thus $\zeta$, being the composition of the NC analytic function $(1 - z)^{-1}$ on $\mathbb{D}$ and the NC analytic function $v^*u$ on $\mathcal{D}$, is NC analytic as well. □

For the convenience of the reader, we now give the basic, and known (see for instance [25]), properties of $F$ in a lemma generalizing Lemma 1.6. For this we define $\mathcal{U}_k$ to be the set of all $U \in \mathcal{B}_{d'\times d}(N)$ which are isometric on a space of dimension at least $Nk$. For example, $\mathcal{U}_d$ denotes the isometries in $\mathcal{B}_{d'\times d}(N)$.

Lemma 5.2. Suppose that $N \in \mathbb{N}$ and $V \in \mathcal{B}_{d'\times d}(N)$ with $\|V\| < 1$.
(1) $U \mapsto F_V(U)$ maps the unit ball $\mathcal{B}_{d'\times d}(N)$ into itself with boundary to the boundary. Furthermore, for each $k \le d$, $\mathcal{U}_k$ maps onto $\mathcal{U}_k$.
(2) If $U \in \mathcal{B}_{d'\times d}(N)$, then $F_V(F_V(U)) = U$.
(3) $F_V(V) = 0$ and $F_V(0) = V$.

Proof. The proof is motivated by linear system theory, but an understanding of system theory is not needed to read the proof. Let $y \in \mathbb{C}^{Nd}$ be given. Define

$$i = \begin{pmatrix} i_1 \\ i_2 \end{pmatrix} = \begin{pmatrix} (I - V^*V)^{-1/2} (I - V^*U) y \\ -Uy \end{pmatrix} \in \mathbb{C}^{Nd} \oplus \mathbb{C}^{Nd'}.$$

Let $M$ denote the matrix

$$M := \begin{pmatrix} (I - V^*V)^{1/2} & -V^* \\ V & (I - VV^*)^{1/2} \end{pmatrix}. \tag{5.2}$$

Straightforward computation shows $M$ is unitary; i.e., $M^*M = I = MM^*$. Let

$$o = \begin{pmatrix} o_1 \\ o_2 \end{pmatrix} = Mi = \begin{pmatrix} y \\ (I - VV^*)^{-1/2} (V - U) y \end{pmatrix} \in \mathbb{C}^{Nd} \oplus \mathbb{C}^{Nd'}.$$

The relation $V (I - V^*V)^{-1/2} = (I - VV^*)^{-1/2} V$ was used in computing $Mi$. Since $M$ is unitary,

$$\|i_1\|^2 + \|i_2\|^2 = \|o_1\|^2 + \|o_2\|^2. \tag{5.3}$$

On the other hand, computations give $F_V(U) i_1 = o_2$. Combining the last two equations gives

$$\|i_1\|^2 - \|F_V(U) i_1\|^2 = \|o_1\|^2 - \|i_2\|^2 = \|y\|^2 - \|Uy\|^2 \ge 0. \tag{5.4}$$
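The bookkeeping with $i$, $o$ and $M$ can be checked in the scalar case $d' = d = N = 1$, where every square root is an ordinary real number. The following sketch is illustrative only; the function name `energy_balance` and the dictionary keys are hypothetical, and the sample values are arbitrary points with $|v| < 1$ and $|u| \le 1$.

```python
import math


def energy_balance(v: complex, u: complex, y: complex) -> dict:
    """Scalar versions of the quantities around (5.2)-(5.4)."""
    c = math.sqrt(1 - abs(v) ** 2)             # (1 - |v|^2)^{1/2}
    m = [[c, -v.conjugate()], [v, c]]           # the matrix M of (5.2)
    i1 = (1 - v.conjugate() * u) * y / c
    i2 = -u * y
    o1 = m[0][0] * i1 + m[0][1] * i2            # first entry of M i
    o2 = m[1][0] * i1 + m[1][1] * i2            # second entry of M i
    f = v - c * u * c / (1 - v.conjugate() * u)  # F_v(u) from (5.1)
    return {
        "o1": o1,
        "o2": o2,
        "F_i1": f * i1,
        "lhs_53": abs(i1) ** 2 + abs(i2) ** 2,
        "rhs_53": abs(o1) ** 2 + abs(o2) ** 2,
        "lhs_54": abs(i1) ** 2 - abs(f * i1) ** 2,
        "rhs_54": abs(y) ** 2 - abs(u * y) ** 2,
    }


v, u, y = 0.3 + 0.4j, -0.2 + 0.6j, 1.5 - 0.5j
r = energy_balance(v, u, y)
c = math.sqrt(1 - abs(v) ** 2)

assert abs(r["o1"] - y) < 1e-12                    # o1 = y
assert abs(r["o2"] - (v - u) * y / c) < 1e-12      # o2 = (1-|v|^2)^{-1/2} (v-u) y
assert abs(r["F_i1"] - r["o2"]) < 1e-12            # F_v(u) i1 = o2
assert abs(r["lhs_53"] - r["rhs_53"]) < 1e-12      # (5.3)
assert abs(r["lhs_54"] - r["rhs_54"]) < 1e-12      # (5.4)
assert r["rhs_54"] >= 0
```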
Since the mapping $y \mapsto i_1 = (I - V^*V)^{-1/2} (I - V^*U) y$ is onto, the matrix $F_V(U)$ is a contraction and the first part of item (1) of the lemma is proved. To prove the second part of item (1), notice that from Eq. (5.4) and the fact that both $F_V(U)$ and $U$ are contractions, the dimension of the space on which $F_V(U)$ is isometric is the same as the dimension of the space on which $U$ is isometric.

We now turn to the proof of item (2). Define

$$F := F_V(U) = V - (I - VV^*)^{1/2}\, U\, (I - V^*U)^{-1} (I - V^*V)^{1/2}.$$

First notice that

$$\begin{aligned}
I - V^*F &= I - V^*V + V^* (I - VV^*)^{1/2}\, U\, (I - V^*U)^{-1} (I - V^*V)^{1/2} \\
&= I - V^*V + (I - V^*V)^{1/2}\, V^*U\, (I - V^*U)^{-1} (I - V^*V)^{1/2} \\
&= (I - V^*V)^{1/2} (I - V^*U)(I - V^*U)^{-1} (I - V^*V)^{1/2} + (I - V^*V)^{1/2}\, V^*U\, (I - V^*U)^{-1} (I - V^*V)^{1/2} \\
&= (I - V^*V)^{1/2} (I - V^*U)^{-1} (I - V^*V)^{1/2}.
\end{aligned}$$

So

$$(I - V^*F)^{-1} = (I - V^*V)^{-1/2} (I - V^*U) (I - V^*V)^{-1/2}.$$

We use this and elementary calculations to obtain

$$\begin{aligned}
F_V(F) &= V - (I - VV^*)^{1/2}\, F\, (I - V^*F)^{-1} (I - V^*V)^{1/2} \\
&= V - (I - VV^*)^{1/2}\, F\, (I - V^*V)^{-1/2} (I - V^*U) \\
&= V - (I - VV^*)^{1/2}\, V\, (I - V^*V)^{-1/2} (I - V^*U) + (I - VV^*) U \\
&= V - V (I - V^*U) + U - VV^*U = U.
\end{aligned}$$

For (3), compute

$$\begin{aligned}
F_V(V) &= V - (I - VV^*)^{1/2}\, V\, (I - V^*V)^{-1} (I - V^*V)^{1/2} \\
&= V - (I - VV^*)^{1/2}\, V\, (I - V^*V)^{-1/2} \\
&= V - V (I - V^*V)^{1/2} (I - V^*V)^{-1/2} = 0.
\end{aligned}$$

□

Part II. Clinging

In this, and the sections to follow, we turn our attention to semi-distinguished ball maps introduced in Section 1.5.4. In particular, attention is restricted to the NC domains $\mathcal{B}_{g'}$.
6. NC functions revisited

This section gives several basic facts about NC analytic functions on the ball, most of which are used in the remainder of the paper. We feel several of the main results here also are of interest in their own right. A few of the results are included purely for their own sake.

6.1. Series radius of convergence

This section shows that NC power series expansions of NC analytic functions on a ball have good convergence properties. As a consequence of this convergence, bounded NC analytic functions are free analytic in the sense of Popescu [22].

Lemma 6.1. If $h : \mathcal{B}_g \to \mathcal{B}_{d' \times d}$ is an NC analytic function, with NC power series expansion
\[
h = \sum_w a_w w,
\]
then
\[
\sum_w \|a_w\|^2 \le d'.
\]
Moreover, if $Z$ is a strict column contraction acting on a separable Hilbert space or if $Z = I \otimes S^*$ where $S$ is the shift of Fock space $\mathcal{F}_g$, and $z \in \mathbb{D}$, then
\[
h(zZ) = \sum_w a_w \otimes (zZ)^w
\]
converges absolutely, $h(zZ)$ is a contraction and $z \mapsto h(zZ)$ is an analytic function on $\mathbb{D}$.

Proof. Let $S$ denote the shifts introduced in Section 2. Let $\mathcal{F}_g(n)$ denote the span of the words of length at most $n$ in the NC Fock space $\mathcal{F}_g$. Let $W_n : \mathcal{F}_g(n) \to \mathcal{F}_g$ denote the inclusion. Thus, for any finite dimensional Hilbert space $K$,
\[
I \otimes S_j(n) = (I \otimes W_n^*)(I \otimes S_j)(I \otimes W_n)
\]
is the compression of $I \otimes S_j$ to the (semi-invariant finite dimensional) subspace $K \otimes \mathcal{F}_g(n)$. Here $I$ is the identity on $K$. In view of the hypotheses (and since the $S_j(n)$ are nilpotent of order $n$),
\[
h\big(S(n)^*\big) = \sum_{|w| \le n} a_w \otimes w\big(S(n)^*\big).
\]
Thus, for any vector $\gamma \in \mathbb{C}^{d'}$,
\[
\|\gamma\|^2 \ge \big\| h\big(S(n)^*\big)^* (\gamma \otimes \emptyset) \big\|^2
 = \Big\| \sum_{|w| \le n} a_w^*\gamma \otimes w \Big\|^2
 = \sum_{|w| \le n} \|a_w^*\gamma\|^2.
\]
It follows that
\[
d' \ge \sum_j \sum_{|w| \le n} \|a_w^* e_j\|^2,
\]
where $\{e_1, \dots, e_{d'}\}$ is an orthonormal basis for $\mathbb{C}^{d'}$. (Note that the sums over $j$ of the terms on the right-hand side are the squares of the Hilbert–Schmidt norms of the $a_w^*$.) Since $\|a_w\|^2 = \|a_w^*\|^2 \le \sum_j \|a_w^* e_j\|^2$, it follows that
\[
d' \ge \sum_w \|a_w\|^2.
\]
Consequently, if $|z| < 1$ and $Z = (Z_1, \dots, Z_g)$ is a $g$-tuple of operators on Hilbert space (potentially infinite dimensional) satisfying $\sum_j Z_j^* Z_j \preceq I$, then
\[
h(zZ) := \sum_w a_w \otimes (zZ)^w
\]
converges (absolutely).

A favorite choice is $Z = I \otimes S^*$. For $|z| < 1$, we have that $(I \otimes W_n W_n^*)\, h(zI \otimes S^*)\, (I \otimes W_n W_n^*)$ converges in the SOT to $h(zI \otimes S^*)$. On the other hand, $(I \otimes W_n^*)\, h(zI \otimes S^*)\, (I \otimes W_n) = h(zI \otimes S(n)^*)$, which is assumed to be a contraction. Thus, $h(zI \otimes S^*)$ is a contraction. For a general strict column contraction $X$, represent $X$ as $VX = (I \otimes S^*)V$ by Lemma 2.1. For $|z| < 1$, it follows that $h(I \otimes zS^*)V = Vh(zX)$ and hence $\|h(zX)\| \le 1$. □

6.2. The NC Schwarz lemma

The classical Schwarz lemma from complex variables states the following: if $f : \mathbb{D} \to \mathbb{D}$ is analytic and $f(0) = 0$, then $|f(z)| \le |z|$ for $z \in \mathbb{D}$. There are several ways to extend this to NC analytic functions; for example, Popescu [22, Theorem 2.4] gives one, and other results of this type can be found in [28]. In this subsection we give two extensions of our own.

Theorem 6.2. Suppose $f : \mathcal{B}_g \to \mathcal{B}_{d' \times d}$ is an NC analytic function on $\mathcal{B}_g$. If $f(0) = 0$ and $\|f(X)\| \le 1$ for each $X \in \operatorname{int} \mathcal{B}_g$, then
\[
X^*X - f(X)^* f(X) \succeq 0, \tag{6.1}
\]
for $X \in \operatorname{int} \mathcal{B}_g$.

Proof. The proof relies on the model for column contractions and the convergence result for bounded NC analytic functions $f$ of Lemma 6.1, which allows us to evaluate bounded NC analytic functions on operators, not just matrices. Since $f$ maps into $\mathcal{B}_{d' \times d}$, if $|z| < 1$, then
\[
I - f(zS^*)^* f(zS^*) \succeq 0, \tag{6.2}
\]
by Lemma 6.1. Thus,
\[
\sum_j S_j S_j^* + P_0 - f(zS^*)^* f(zS^*) \succeq 0. \tag{6.3}
\]
(Here $P_0$ is the projection onto the span of the empty word.) From $f(0) = 0$, we obtain $f(S^*)P_0 = 0$. Hence (6.3) transforms into
\begin{align*}
&(I - P_0)\Big( \sum_j S_j S_j^* + P_0 - f(zS^*)^* f(zS^*) \Big)(I - P_0)
 + P_0 \Big( \sum_j S_j S_j^* + P_0 - f(zS^*)^* f(zS^*) \Big) P_0 \\
&\qquad = (I - P_0)\Big( \sum_j S_j S_j^* + P_0 - f(zS^*)^* f(zS^*) \Big)(I - P_0) + P_0 \succeq 0. \tag{6.4}
\end{align*}
As
\[
(I - P_0)\Big( \sum_j S_j S_j^* + P_0 - f(zS^*)^* f(zS^*) \Big)(I - P_0)
 = (I - P_0)\Big( \sum_j S_j S_j^* - f(zS^*)^* f(zS^*) \Big)(I - P_0)
\]
and
\[
P_0 \Big( \sum_j S_j S_j^* - f(zS^*)^* f(zS^*) \Big) = 0
 = \Big( \sum_j S_j S_j^* - f(zS^*)^* f(zS^*) \Big) P_0,
\]
(6.4) is equivalent to
\[
\sum_j S_j S_j^* - f(zS^*)^* f(zS^*) \succeq 0, \tag{6.5}
\]
for $|z| < 1$. Replacing $S$ by $I \otimes S$ in the argument above yields
\[
I \otimes \sum_j S_j S_j^* - f(zI \otimes S^*)^* f(zI \otimes S^*) \succeq 0. \tag{6.6}
\]
Given $X \in \mathcal{B}_g$ with $\|X\| < 1$, we can write $VX = (I \otimes S^*)V$, where $I$ is the identity on a finite dimensional Hilbert space, by Lemma 2.1. Moreover, by Lemma 6.1, for $|z| < 1$, $Vf(zX) = f(zI \otimes S^*)V$. Multiply (6.6) by $V^*$ on the left and $V$ on the right to obtain
\[
V^* \Big( I \otimes \sum_j S_j S_j^* - f(zI \otimes S^*)^* f(zI \otimes S^*) \Big) V
 = X^*X - f(zX)^* f(zX) \succeq 0,
\]
for $|z| < 1$. Since $\|X\| < 1$, letting $z \nearrow 1$ completes the proof. □
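As a sanity check on Theorem 6.2, inequality (6.1) can be verified by hand for a simple NC polynomial. The example below is an illustration of ours, not taken from the text: take $g = 2$, $d' = d = 1$ and $f(x) = x_1 x_2$, so that $f(0) = 0$ and $\|f(X)\| \le \|X_1\|\,\|X_2\| \le 1$ on the ball.

```latex
\begin{align*}
X^*X - f(X)^* f(X)
&= X_1^*X_1 + X_2^*X_2 - X_2^* X_1^* X_1 X_2 \\
&= X_1^*X_1 + X_2^* \,(I - X_1^*X_1)\, X_2 \;\succeq\; 0.
\end{align*}
% Both summands are positive semidefinite, because
% X_1^*X_1 \preceq X_1^*X_1 + X_2^*X_2 \preceq I for a column contraction X.
```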
Remark 6.3. Popescu [22, Theorem 2.4] formulates and proves a Schwarz lemma for free analytic functions, which in our context implies that if $f$ is a contraction-valued NC analytic function with $f(0) = 0$, then $\|f(X)\| \le \|X\|$ for $\|X\| < 1$ and further, $\sum_{|w| = \alpha} a_w a_w^* \preceq I$ for all $\alpha$. (This inequality remains true even with operator coefficients $a_w$.)
A classical complex variables statement equivalent to Schwarz's lemma is the following: if $f : \mathbb{D} \to \mathbb{D}$ is analytic and $f(0) = 0$, then $h(z) = f(z)/z$ is also analytic and $h : \mathbb{D} \to \mathbb{D}$. We give a noncommutative analog of this result, which does not appear to be an immediate consequence of Theorem 6.2.

Theorem 6.4. Suppose that $H = [\, H_1 \ \cdots \ H_g \,]$ is a row of $d \times d$ NC analytic functions on $\mathcal{B}_g$. If for each $X \in \operatorname{int} \mathcal{B}_g$,
\[
\|H(X)X\| = \Big\| \sum_j H_j(X) X_j \Big\| \le 1, \tag{6.7}
\]
i.e.,
\[
I - \big(H(X)X\big)^* H(X)X \succeq 0, \tag{6.8}
\]
then for each $X \in \operatorname{int} \mathcal{B}_g$
\[
I - H(X)H(X)^* \succeq 0. \tag{6.9}
\]
Equivalently, $\|H(X)\| \le 1$.

Proof. This proof depends upon both Lemmas 2.1 and 6.1. Let $G(x) = H(x)x$. The hypotheses imply $G : \mathcal{B}_g \to M_{d \times d}$ is contraction-valued. Hence Lemma 6.1 applies. Denote the power series expansions for $H_j$ by
\[
H_j = \sum_\alpha h_j^{(\alpha)}.
\]
It follows that the power series expansion (by homogeneous terms) for $G$ is then
\[
G = \sum_\alpha \sum_j h_j^{(\alpha)} x_j.
\]
Hence, also by Lemma 6.1, for each $j$ the power series expansion for $H_j$ converges for any strict column contraction $Z$ (even for operators on an infinite dimensional Hilbert space) and for such $Z$,
\[
G(Z) = \sum_j H_j(Z) Z_j.
\]
In particular, for $|z| < 1$ and $Z = zI \otimes S^*$ (where $S$ is as in Lemma 2.1 and $I$ is the identity on a finite dimensional Hilbert space),
\[
G(zS^*) = \sum_j H_j(zI \otimes S^*)\, S_j^*.
\]
Because $\|G(zI \otimes S^*)\| \le 1$,
\begin{align*}
0 &\preceq I - G(zI \otimes S^*)\, G(zI \otimes S^*)^* \\
&= I - \sum_{j,\ell} H_j(zI \otimes S^*)\,(I \otimes S_j^*)(I \otimes S_\ell)\, H_\ell(zI \otimes S^*)^* \\
&= I - \sum_j H_j(zI \otimes S^*)\, H_j(zI \otimes S^*)^*. \tag{6.10}
\end{align*}
Let $X \in \operatorname{int} \mathcal{B}_g$ be a strict column contraction acting on a finite dimensional space. Express $X = V^*(I \otimes S^*)V$ according to Lemma 2.1, where $I$ is the identity on a finite dimensional Hilbert space. For every NC analytic polynomial $f$ and $|z| < 1$, $f(zX) = V^* f(zI \otimes S^*) V$. Hence the same holds true for NC analytic functions and in particular, $H_j(zI \otimes S^*)V = V H_j(zX)$. Thus, applying $V$ on the right and $V^*$ on the left of Eq. (6.10) gives
\[
0 \preceq I - \sum_j H_j(zX) H_j(zX)^*.
\]
Letting $z \nearrow 1$ concludes the proof. □
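To see what Theorem 6.4 says in the simplest situation, consider (as an illustration of ours, not from the text) a constant scalar row $H = [\, a \ \ b \,]$ with $a, b \in \mathbb{C}$ not both zero, so $g = 2$ and $d = 1$. Evaluating at scalar points already shows how hypothesis (6.7) forces conclusion (6.9):

```latex
% Scalar test points in the interior of the ball, for 0 < t < 1:
\begin{align*}
x(t) &= \frac{t}{\sqrt{|a|^2 + |b|^2}} \begin{pmatrix} \bar a \\ \bar b \end{pmatrix},
\qquad \|x(t)\| = t < 1, \\
H\,x(t) &= a\,x_1(t) + b\,x_2(t) = t \sqrt{|a|^2 + |b|^2}.
\end{align*}
% Hypothesis (6.7) gives t * sqrt(|a|^2 + |b|^2) <= 1 for every t < 1, hence
\[
|a|^2 + |b|^2 \le 1,
\qquad\text{i.e.}\qquad
1 - HH^* = 1 - |a|^2 - |b|^2 \ge 0,
\]
% which is (6.9) for this constant H.
```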
6.3. The distinguished boundary for $\mathcal{B}_{g' \times g}$

Fix $N$. The distinguished (Shilov) boundary of the algebra $A(\mathcal{B}_{g' \times g}(N))$, the functions which are analytic in $\operatorname{int} \mathcal{B}_{g' \times g}(N)$ and continuous on $\mathcal{B}_{g' \times g}(N)$, is the smallest closed subset of $\mathcal{B}_{g' \times g}(N)$ such that each element of $A(\mathcal{B}_{g' \times g}(N))$ takes its maximum on that subset. That a smallest, as opposed to simply minimal, such set exists is a standard fact in complex analysis and the theory of uniform algebras; see [15, p. 145] or [8, Ch. 4] for more details. While not needed in the sequel, the following known result explains the distinguished terminology in the definitions of distinguished isometry and semi-distinguished pencil ball map.

Proposition 6.5. The distinguished boundary of $A(\mathcal{B}_{g' \times g}(N))$ is $\{X \in \mathcal{B}_{g' \times g}(N) \mid X^*X = I\}$.

That the distinguished boundary of $A(\mathcal{B}_{g' \times g}(N))$ must be contained in $\{X \in \mathcal{B}_{g' \times g}(N) \mid X^*X = I\}$ follows readily from Lemma 5.2; see Proposition 6.6. For the fact that no smaller set can serve as a distinguished boundary, we refer the reader to [1, p. 77].

Proposition 6.6. Fix $N \in \mathbb{N}$. If $f : \mathcal{B}_{g' \times g}(N) \to \mathbb{C}^{d' \times d}$ is continuous and analytic in $\operatorname{int} \mathcal{B}_{g' \times g}(N)$, then for any $X \in \mathcal{B}_{g' \times g}(N)$ we have
\[
\|f(X)\| \le \max_{U \in \mathcal{U}_k} \|f(U)\| \tag{6.11}
\]
for any $0 < k \le \min\{g', g\}$. Thus if $f(X) = 0$ for all $X \in \mathcal{B}_{g' \times g}(N)$ such that $X^*X = I$ (if $g' \ge g$) or $XX^* = I$ (if $g' < g$), then $f = 0$. For example, if $g' \ge g$, then the set of isometries $\mathcal{U}_g$ contains the distinguished boundary of $\mathcal{B}_{g' \times g}(N)$.
Proof. First suppose $f : \mathcal{B}_{g' \times g}(N) \to \mathbb{C}$ (so that $d = 1 = d'$). Pick any $U \in \mathcal{U}_k$. By the maximum principle, the function $h(z) = f(zU)$ takes its maximum value on $|z| = 1$.

Now we use linear fractional automorphisms of the ball to prove that such an inequality holds for any $X$ in the interior of $\mathcal{B}_{g' \times g}(N)$. Select $\mathcal{F}$ as in Lemma 5.2 which maps $0$ to $X$. Then $h(Z) := f(\mathcal{F}(Z))$ is analytic and maps $0$ to $f(X)$. The previous paragraph applies to give
\[
|f(X)| = |h(0)| \le \max_{|z| = 1} |h(zU)| = \max_{|z| = 1} \big| f\big(\mathcal{F}(zU)\big) \big|.
\]
By Lemma 5.2(1), $\mathcal{F}(zU) \in \mathcal{U}_k$ for $|z| = 1$; so we have proved that the maximum of $f$ occurs on $\mathcal{U}_k$.

To prove the statement for matrix-valued $f$, simply note that given unit vectors $\gamma \in \mathbb{C}^d$ and $\eta \in \mathbb{C}^{d'}$, the function $F(X) = \eta^* f(X)\gamma$ takes its maximum on $\mathcal{U}_k$. It follows that
\[
|F(X)| \le \max_{U \in \mathcal{U}_k} \|f(U)\|.
\]
Since $\gamma$ and $\eta$ are arbitrary, the result follows. □
Remark 6.7. This proposition has more content for larger $k$ and in particular $k = \min\{g, g'\}$ is optimal.

6.4. Matrix Linksnullstellensatz

For scalar NC analytic polynomials there is an elegant Linksnullstellensatz whose proof is due to Bergman, cf. [9]. Now we generalize it to matrices with entries which are NC analytic polynomials.

Theorem 6.8. Given an $m \times d$ matrix $P$ over $\mathbb{C}\langle x \rangle$ and an $n \times d$ matrix $Q$ over $\mathbb{C}\langle x \rangle$, suppose that $P(X)v = 0$ implies $Q(X)v = 0$ for every matrix $g$-tuple $X$ and vector $v$. Then for some $G \in \mathbb{C}\langle x \rangle^{n \times m}$ we obtain $Q = GP$.

Proof. The rows of a matrix $A$ will be denoted by $A_j = [\, a_{j1} \ a_{j2} \ \cdots \ a_{jd} \,]$. In particular, $P_j$ is a $1 \times d$ matrix over $\mathbb{C}\langle x \rangle$. Let $V_d = \mathbb{C}\langle x \rangle^{1 \times d}$ denote the left $\mathbb{C}\langle x \rangle$-module of $1 \times d$ matrices of polynomials. Note $P_j \in V_d$. Let $I_d$ be the $\mathbb{C}\langle x \rangle$-submodule of $V_d$ generated by the $P_j$, i.e.,
\[
I_d = \Big\{ \sum_j r_j P_j \ \Big|\ r_j \in \mathbb{C}\langle x \rangle \Big\}.
\]
$I_d$ is the smallest subspace of $V_d$ containing the $P_j$ and invariant with respect to $M_j$ = left multiplication by $x_j$ (for each $j$). Note that $M_j$ determines a well-defined linear mapping $Y_j$ on the quotient: $Y_j : V_d / I_d \to V_d / I_d$. Let $W_k$ denote the image of polynomials of degree at most $k$ in the quotient $V_d / I_d$. These spaces are finite dimensional and $W_{k-1} \subseteq W_k$. So $W_{k-1}$ is complemented in $W_k$.
Choose $N$ larger than the maximal degree of all polynomials in $P$ and $Q$. Define $X_j = Y_j : W_{N-1} \to W_N$ and extend $X_j$ to a linear mapping $W_N \to W_N$ in any way (on a complementary subspace). Let $v_j$ denote the element of $W_N$ determined by the row with the polynomial $1$ in the $j$-th entry and $0$ elsewhere. Define $v = \bigoplus_j v_j \in W_N^d$. For a polynomial $q$, $q(X)v_j = [\, 0 \ \cdots \ 0 \ q \ 0 \ \cdots \ 0 \,]$ ($j$-th spot). Hence $Q_j(X)v = Q_j$. A similar statement is true for $P_j$; i.e., $P_j(X)v = P_j \in I_d$ and so $P_j(X)v = 0$. So $Q_j(X)v = Q_j$ is $0$ too, which means $Q_j \in I_d$. Thus there exist $G_{js}$ such that
\[
Q_j = \sum_s G_{js} P_s.
\]
Hence $Q = GP$, as desired. □
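The "links" (left) in the name matters: in Theorem 6.8 the factor $G$ multiplies $P$ on the left. A small illustration of ours, with $d = m = n = 1$, $g = 2$ and $P = x_1$, contrasts a polynomial that lies in the left ideal generated by $x_1$ with one that does not.

```latex
% Q = x_2 x_1 vanishes wherever P = x_1 does, and Q = G P with G = x_2:
\[
X_1 v = 0 \;\Longrightarrow\; X_2 X_1 v = 0,
\qquad x_2 x_1 = (x_2)\, x_1 .
\]
% Q' = x_1 x_2 fails the hypothesis of the theorem. Take
\[
X_1 = \begin{pmatrix} 0 & 0 \\ 0 & 1 \end{pmatrix}, \quad
X_2 = \begin{pmatrix} 0 & 0 \\ 1 & 0 \end{pmatrix}, \quad
v = \begin{pmatrix} 1 \\ 0 \end{pmatrix}:
\qquad
X_1 v = 0, \quad X_1 X_2 v = \begin{pmatrix} 0 \\ 1 \end{pmatrix} \neq 0 .
\]
% Accordingly x_1 x_2 is not of the form G x_1 for an NC polynomial G.
```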
7. The linear part of semi-distinguished ball maps

Having established preliminary results, we turn our attention to semi-distinguished ball maps, introduced in Section 1.5.4. First we show that semi-distinguished ball maps have very distinctive linear parts. Then we set about giving properties of these linear maps.

A linear map $L : \mathbb{C}^g \to \mathbb{C}^{d' \times d}$ is a distinguished isometry if it maps the distinguished boundary of $\mathcal{B}_g$ to the boundary of $\mathcal{B}_{d' \times d}$; i.e., if for each $X \in \mathcal{B}_g$ with $X^*X = I$ we have that $\|L(X)\| = 1$. In this case a (nonzero) vector $\gamma$ such that $\|L(X)\gamma\| = \|\gamma\|$ is called a clinging vector and this property clinging.

Proposition 7.1. If $f$ is a semi-distinguished ball map, then $f^{(1)}$, the linear part of $f$, is a distinguished isometry.

Proof. The proof is the same as that of Proposition 3.1. □

7.1. Properties of distinguished isometries

The remainder of this section is devoted to giving properties of distinguished isometries.

Proposition 7.2. Let $L : \mathbb{C}^g \to \mathbb{C}^{d' \times d}$ be a linear map.
(1) $L$ is a distinguished isometry if and only if
\[
\Delta_L(X) := (X_1^*X_1 + \cdots + X_g^*X_g) \otimes I_d - L^*(X)L(X) \succeq 0 \tag{7.1}
\]
and clings (i.e., $\Delta_L(X)$ is always positive semidefinite and never positive definite).
(2) If $L$ is completely isometric, then it is a distinguished isometry. The converse is not true.

Proof. For the implication ($\Rightarrow$) in (1), given any $X_i$, choose a $W$ satisfying $W^*W = X_1^*X_1 + \cdots + X_g^*X_g$. Note that it suffices to show (7.1) on a dense subset of $\mathcal{B}_g$. Thus we may assume that $W$ is invertible. Then $(X_1W^{-1})^* X_1W^{-1} + \cdots + (X_gW^{-1})^* X_gW^{-1} = I$, so by assumption, $I - L^*(XW^{-1})L(XW^{-1}) \succeq 0$ and it binds. Since $L$ is truly linear, we multiply this inequality with $W^*$ on the left and with $W$ on the right: $W^*W - L^*(X)L(X) \succeq 0$ and it binds. The converse ($\Leftarrow$) is obvious.
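The first part of (2) is trivial. To finish the proof it suffices to exhibit an example of a distinguished isometry which is not a complete isometry. Consider $L(x, y) = Ax + By$ with
\[
A = \begin{bmatrix}
1 & 0 & 0 \\
0 & \tfrac{\sqrt{2}}{2} & 0 \\
0 & 0 & 0 \\
0 & 0 & \tfrac{\sqrt{2}}{2}
\end{bmatrix},
\qquad
B = \begin{bmatrix}
0 & 0 & 0 \\
\tfrac{\sqrt{2}}{2} & 0 & 0 \\
0 & 1 & 0 \\
0 & 0 & \tfrac{\sqrt{2}}{2}
\end{bmatrix}.
\]
For $X = \begin{bmatrix} 1 & 0 \\ 0 & 0 \end{bmatrix}$ and $Y = \begin{bmatrix} 0 & 0 \\ 1 & 0 \end{bmatrix}$,
\[
\left\| \begin{bmatrix} X \\ Y \end{bmatrix} \right\| = \sqrt{2} > \sqrt{\tfrac{3}{2}} = \|L(X, Y)\|.
\]
This shows that $L$ is not a complete isometry. It remains to be seen that $L$ satisfies (7.1). We compute
\[
\Delta_L(x, y) = \begin{bmatrix}
\tfrac12 y^*y & -\tfrac12 y^*x & 0 \\
-\tfrac12 x^*y & \tfrac12 x^*x & 0 \\
0 & 0 & \tfrac12 (x - y)^*(x - y)
\end{bmatrix}.
\]
The top left $2 \times 2$ block of $\Delta_L(x, y)$ can be factored as
\[
\begin{bmatrix} -y^* x^{-*} & 1 \\ 1 & 0 \end{bmatrix}
\begin{bmatrix} \tfrac12 x^*x & 0 \\ 0 & 0 \end{bmatrix}
\begin{bmatrix} -x^{-1} y & 1 \\ 1 & 0 \end{bmatrix}.
\]
This immediately implies that for invertible $X$, $\Delta_L(X, Y)$ is always positive semidefinite and never positive definite. For noninvertible $X$ the same holds true by a standard density argument. □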
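The norm comparison in this example can be checked by hand. The sketch below uses only the displayed formula for $\Delta_L(x, y)$ (writing $\Delta_L$ for the quantity defined in (7.1)) and the $2 \times 2$ matrix units $E_{ij}$; the intermediate steps are ours and are not spelled out in the text.

```latex
% X = E_{11} and Y = E_{21}, so that
\[
X^*X = Y^*Y = E_{11}, \qquad X^*Y = Y^*X = 0, \qquad (X - Y)^*(X - Y) = 2E_{11}.
\]
% Norm of the column (X; Y):
\[
\left\| \begin{bmatrix} X \\ Y \end{bmatrix} \right\|^2 = \|X^*X + Y^*Y\| = \|2E_{11}\| = 2 .
\]
% Norm of L(X, Y), from L^*L = (X^*X + Y^*Y) \otimes I_3 - \Delta_L(X, Y):
\begin{align*}
\Delta_L(X, Y) &= \operatorname{diag}\big( \tfrac12 E_{11},\ \tfrac12 E_{11},\ E_{11} \big), \\
L(X, Y)^* L(X, Y) &= \operatorname{diag}\big( \tfrac32 E_{11},\ \tfrac32 E_{11},\ E_{11} \big),
\qquad \|L(X, Y)\|^2 = \tfrac32 .
\end{align*}
```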
Remark 7.3. By way of contrast, every contractive $L : \mathbb{C}^g \to \mathbb{C}^{d' \times d}$ is completely contractive. For related results see Section 9.

7.1.1. The Gram representation

A powerful tool used is a matrix representation of a quadratic NC polynomial. A key property of this representation is that matrix positivity of the quadratic NC polynomial is equivalent to the positive semidefiniteness of the representing matrix. The following lemma is needed to establish this.

Lemma 7.4. For large enough $n$ the set
\[
\big\{ Xw \ \big|\ X \in (\mathbb{C}^{n \times n})^g,\ w \in \mathbb{C}^n \big\} \tag{7.2}
\]
is all of $\mathbb{C}^{ng}$.

Proof. Given $w, x_1, \dots, x_g \in \mathbb{C}^n$ with $w \ne 0$, choose $X_j \in \mathbb{C}^{n \times n}$ such that $X_j w = x_j$. For instance $X_j = x_j \dfrac{w^*}{\|w\|_2^2}$ will do. □

Note Lemma 7.4 is true even with a fixed $w \ne 0$ and parametrizing over all $X$.
Proposition 7.5. Let
\[
p = \sum_{1 \le i, j \le g} x_i^* B_{ij} x_j
\]
be a homogeneous quadratic NC polynomial with $B_{ij} \in \mathbb{C}^{d \times d}$. Then there is a unique matrix $G \in (\mathbb{C}^{d \times d})^{g \times g}$ with
\[
p = x^* G x. \tag{7.3}
\]
Moreover, $p(X) = \sum_{i,j} B_{ij} \otimes X_i^* X_j$ is positive semidefinite for all $N \in \mathbb{N}$ and all $X \in (\mathbb{C}^{N \times N})^g$ iff $G \succeq 0$.

Proof. In $d \times d$ block form, $G = [\, B_{ij} \,]_{i,j}$. If $G \succeq 0$, then $G = H^*H$ for some matrix $H$. Hence $p = (Hx)^*(Hx)$ is a sum of hermitian squares, so $p(X) \succeq 0$ for all $N \in \mathbb{N}$ and $X \in (\mathbb{C}^{N \times N})^g$. The converse follows from Lemma 7.4. □

7.1.2. Orthotropicity

In this subsection we establish a basic property of distinguished isometries $L$ (that is, of those $L$ for which $\Delta_L$ is positive semidefinite and clinging), which we call orthotropicity. A $d' \times d$ linear pencil $L = A_1 x_1 + \cdots + A_g x_g : \mathbb{C}^g \to \mathbb{C}^{d' \times d}$ is called orthotropic if for every $X \in \mathbb{C}^g$ and $w \in \mathbb{C}^d$ satisfying $\|L(X)w\| = \|w\|$, the vector $L(X)w$ is orthogonal to the image of $L(X^\perp)$.

Proposition 7.6. Every distinguished isometry is orthotropic.

To continue our analysis of distinguished isometries we write $L$ in a special form. We multiply $L$ with a unitary $V$ on the left and a unitary $U^*$ on the right. Thus without loss of generality, $A_1$ is the block matrix
\[
\begin{bmatrix} 1 & 0 \\ 0 & (A_1)_{22} \end{bmatrix} \tag{7.4}
\]
and $A_j$ for $j \ge 2$ equals
\[
\begin{bmatrix} 0 & (A_j)_{12} \\ (A_j)_{21} & (A_j)_{22} \end{bmatrix}. \tag{7.5}
\]
Proof of Proposition 7.6. Suppose $L = \sum_{i=1}^g A_i x_i$ is a distinguished isometry and without loss of generality write $L$ in the special form described above. Clearly, orthotropicity of $L$ is equivalent to $(A_j)_{12} = 0$ for $j \ge 2$. In order to prove this we set all variables except for $X_1$, $X_j$ to $0$. For convenience we use $X, Y$ (resp. $x, y$) instead of $X_1, X_j$ (resp. $x_1, x_j$) and $A, B$ instead of $A_1, A_j$. Thus
\[
L(x, y) = \begin{bmatrix}
x & B_{12}(I_{d-1} \otimes y) \\
B_{21}\, y & A_{22}(I_{d-1} \otimes x) + B_{22}(I_{d-1} \otimes y)
\end{bmatrix}.
\]
A straightforward computation shows we can represent $\Delta_L(x, y) = x^*x + y^*y - L(x, y)^* L(x, y)$ as $\begin{bmatrix} x \\ y \end{bmatrix}^* G \begin{bmatrix} x \\ y \end{bmatrix}$ (cf. Proposition 7.5), where $\begin{bmatrix} x \\ y \end{bmatrix}$ stands for
\[
\begin{bmatrix} x \\ y \end{bmatrix} =
\begin{bmatrix}
x & 0 \\
0 & I_{d-1} \otimes x \\
y & 0 \\
0 & I_{d-1} \otimes y
\end{bmatrix}
\]
and
\[
G = \begin{bmatrix}
0 & 0 & 0 & -B_{12} \\
0 & I - A_{22}^* A_{22} & -A_{22}^* B_{21} & -A_{22}^* B_{22} \\
0 & -B_{21}^* A_{22} & 1 - B_{21}^* B_{21} & -B_{21}^* B_{22} \\
-B_{12}^* & -B_{22}^* A_{22} & -B_{22}^* B_{21} & I - B_{12}^* B_{12} - B_{22}^* B_{22}
\end{bmatrix}.
\]
If $\Delta_L(X, Y)$ is positive semidefinite for all $X, Y$, then by Proposition 7.5, $G$ is positive semidefinite. In particular, $B_{12} = 0$, since the $(1,1)$ entry of $G$ is $0$. (Note if $\Delta_L(X, Y)$ is only positive semidefinite for scalars $X, Y$, then $B_{12}$ need not be $0$.)

Alternative proof of $B_{12} = 0$. By density, we may assume $Y$ is invertible. $\Delta_L(X, Y)$ multiplied on the right by $\begin{bmatrix} Y^{-1} & 0 \\ 0 & I \end{bmatrix}$ and on the left by the adjoint of the same matrix yields $M_2 := \begin{bmatrix} m_{11} & m_{12} \\ m_{21} & m_{22} \end{bmatrix}$ for
\begin{align*}
m_{11} &= 1 - B_{21}^* B_{21}, \\
m_{12} &= -B_{21}^* A_{22}(I_{d-1} \otimes X) - B_{21}^* B_{22}(I_{d-1} \otimes Y) - Y^{-*} X^* B_{12}(I_{d-1} \otimes Y), \\
m_{21} &= m_{12}^* = -(I_{d-1} \otimes X^*) A_{22}^* B_{21} - (I_{d-1} \otimes Y^*) B_{22}^* B_{21} - (I_{d-1} \otimes Y^*) B_{12}^* X Y^{-1}, \\
m_{22} &= I_{d-1} \otimes (X^*X + Y^*Y) - (I_{d-1} \otimes Y^*) B_{12}^* B_{12} (I_{d-1} \otimes Y) \\
&\qquad - \big( (I_{d-1} \otimes X^*) A_{22}^* + (I_{d-1} \otimes Y^*) B_{22}^* \big) \big( A_{22}(I_{d-1} \otimes X) + B_{22}(I_{d-1} \otimes Y) \big). \tag{7.6}
\end{align*}
Consider $m_{12}$ and note that $Y^{-*} X^* B_{12}(I_{d-1} \otimes Y) = Y^{-*} X^* Y B_{12}$. Suppose $B_{12} \ne 0$. Then it is easy to construct $X = X(\varepsilon)$, $Y = Y(\varepsilon)$ sending this term to $\infty$ as $\varepsilon \to 0$ while keeping all the remaining terms bounded. This contradiction yields $B_{12} = 0$. □

8. Characterization of semi-distinguished ball maps

The following theorem summarizes what we know about semi-distinguished pencil ball maps. Both the hypotheses and the conclusions are weaker than those of Theorem 1.9. The relationship between Theorem 1.9 and Theorem 8.1 is made precise by Corollary 8.2 of this section.

Theorem 8.1. Let $L$ be a $d' \times d$ NC analytic truly linear pencil and $f : \mathcal{B}_g \to \mathcal{B}_L$ a semi-distinguished pencil ball map with $f(0) = 0$. Clearly, $h := L \circ f$ maps $\mathcal{B}_g \to \mathcal{B}_{d' \times d}$. Write $h$ as $h = h^{(1)} + h^{(\infty)}$, where $h^{(1)}$ is the linear homogeneous component in the NC power series expansion of $h$ and $h^{(\infty)} = \sum_{\alpha=2}^\infty h^{(\alpha)}$. Then there is a unique contraction $M \in (\mathbb{C}^{d' \times d})^g$ and a unique nontrivial subspace $S \subseteq \mathbb{C}^{dg}$ such that:
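(1) $h^{(1)}(x) = Mx$, $M|_S$ is an isometry and $M\Pi_{S^\perp}$ is a strict contraction.
(2) Each $h^{(\alpha)}(x)$ for $\alpha \ge 2$ is of the form $P_\alpha \Pi_{S^\perp} x$ for a matrix $P_\alpha$ of NC polynomials.
(3) For the formal NC power series $P^{(\infty)} := \sum_{\alpha=2}^\infty P_\alpha$, $P^{(\infty)}(X)v := \sum_{\alpha=2}^\infty P_\alpha(X)v$ converges for $v \in S^\perp \otimes \mathbb{C}^N$ and $X \in (\mathbb{C}^{N \times N})^g$ in an NC $\varepsilon$-neighborhood of $0$. Also, $h^{(\infty)}(x) = P^{(\infty)}(x)\Pi_{S^\perp} x$.
(4) $\|(M \otimes I_N + P^{(\infty)}(X))\Pi_{S^\perp \otimes \mathbb{C}^N}\| \le 1$ for $X \in (\mathbb{C}^{N \times N})^g$ with $\|X\| < 1$, and $(M \otimes I_N)(S \otimes \mathbb{C}^N)$ is orthogonal to $(M \otimes I_N)(S^\perp \otimes \mathbb{C}^N)$, to $P_\alpha(X)(S^\perp)$ for all $\alpha \ge 2$ and to $P^{(\infty)}(X)(S^\perp)$.

Proof. Let $\Delta_{h^{(1)}}(x) := x^*x - h^{(1)}(x)^* h^{(1)}(x) = x^* G x$ be as in Proposition 7.5, where $G \succeq 0$. Write $h^{(1)}(x) = Mx$ and note that $G = I - M^*M$. Let $S := \ker G = \ker(I - M^*M) = \operatorname{range}(I - M^*M)^\perp$. By the clinging property, $S \ne \{0\}$. By definition, $M|_S$ is an isometry. Conversely, if $v$ satisfies $\|Mv\| = \|v\|$, then $\langle Mv, Mv \rangle = \langle v, v \rangle$ and hence $\langle v, (I - M^*M)v \rangle = 0$. Since $M$ is a contraction, $I - M^*M$ is positive semidefinite. Thus $(I - M^*M)v = 0$, that is, $v \in S$. This proves (1) and also the uniqueness of $M$ and $S$.

For (2) fix $N \ge 1$ and let $X \in \mathcal{B}_g(N)$ such that $\|X\| = 1$ be given. By Eq. (6.1) of Schwarz's lemma (Theorem 6.2) applied to $h(zX)$, $|z| < 1$, for all $0 \le \delta < 1$ and $0 \le \theta \le 2\pi$ we have
\[
0 \preceq \delta^2 X^*X - h(\delta e^{i\theta} X)^* h(\delta e^{i\theta} X). \tag{8.1}
\]
If $\delta$ is in the series radius, we may write $h(\delta e^{i\theta} X) = h^{(1)}(\delta e^{i\theta} X) + h^{(\infty)}(\delta e^{i\theta} X) = \sum_{\alpha=1}^\infty h^{(\alpha)}(\delta e^{i\theta} X)$. We integrate (8.1) to obtain
\begin{align*}
0 &\preceq \frac{1}{2\pi} \int_0^{2\pi} \Big( \delta^2 X^*X - h(\delta e^{i\theta} X)^* h(\delta e^{i\theta} X) \Big)\, d\theta \\
&= \delta^2 X^*X - \delta^2 h^{(1)}(X)^* h^{(1)}(X) - \frac{1}{2\pi} \int_0^{2\pi} h^{(\infty)}(\delta e^{i\theta} X)^* h^{(\infty)}(\delta e^{i\theta} X)\, d\theta \\
&= \delta^2 X^*X - \delta^2 h^{(1)}(X)^* h^{(1)}(X) - \sum_{\alpha=2}^\infty h^{(\alpha)}(\delta X)^* h^{(\alpha)}(\delta X). \tag{8.2}
\end{align*}
By Proposition 7.2, $\Delta_{h^{(1)}}(X) \succeq 0$ with clinging. Thus by (8.2), for every $w$ satisfying
\[
\big( X^*X - h^{(1)}(X)^* h^{(1)}(X) \big) w = 0 \tag{8.3}
\]
we have $h^{(\alpha)}(\delta X)w = 0$ for $\alpha \ge 2$ and $\delta$ in the series radius. In particular, by Proposition 7.5, (8.3) is equivalent to $\sqrt{G}\, x w = 0$ and this implies that $h^{(\alpha)}(X)w = 0$ for $\alpha \ge 2$ and every $X$ in the series radius. By a scaling argument ($h^{(\alpha)}$ is homogeneous), the same holds true for every $X$ and $w$. Hence the matrix NC Nullstellensatz, Theorem 6.8, applies and implies that there is a matrix of NC polynomials $\tilde{P}_\alpha$ with $\tilde{P}_\alpha(x) \sqrt{G}\, x = h^{(\alpha)}(x)$. Since $\sqrt{G} = \sqrt{G}(\Pi_S + \Pi_{S^\perp}) = \sqrt{G}\, \Pi_{S^\perp}$, we set $P_\alpha = \tilde{P}_\alpha \sqrt{G}$. Then $h^{(\alpha)}(x) = P_\alpha(x) \Pi_{S^\perp} x$.

(3) The second part is clear and for the first statement we refer the reader to [14].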
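(4) Let $v \in S$ and $w \in S^\perp$. Then
\[
\langle Mv, Mw \rangle = \langle M^*Mv, w \rangle = \langle v, w \rangle = 0 \tag{8.4}
\]
since $(I - M^*M)v = 0$. This shows that $M(S^\perp)$ is orthogonal to $M(S)$. For the strengthening with tensor products, let $s_i \in S$, $t_j \in S^\perp$, $v_i, u_j \in \mathbb{C}^N$. Then
\begin{align*}
\Big\langle (M \otimes I_N) \sum_i s_i \otimes v_i,\ (M \otimes I_N) \sum_j t_j \otimes u_j \Big\rangle
&= \Big\langle \sum_i (Ms_i \otimes v_i),\ \sum_j (Mt_j \otimes u_j) \Big\rangle \\
&= \sum_{i,j} \langle Ms_i, Mt_j \rangle \langle v_i, u_j \rangle = 0.
\end{align*}
Let $h(x) = \tilde{h}(x)x$, where
\[
\tilde{h}(x) = M + \sum_{\alpha \ge 2} P_\alpha(x) \Pi_{S^\perp}.
\]
By Theorem 6.4 (applied with $H = \tilde{h}$), $\|\tilde{h}(X)\| \le 1$ for all $X$ with $\|X\| < 1$. Rewrite $\tilde{h}(x)$ as
\[
\tilde{h}(x) = M\Pi_S + \Big( M + \sum_{\alpha \ge 2} P_\alpha(x) \Big) \Pi_{S^\perp}. \tag{8.5}
\]
Both summands have norm $\le 1$ for $X$ with $\|X\| < 1$. In particular,
\[
\Big\| \Big( M \otimes I_N + \sum_{\alpha \ge 2} P_\alpha(X) \Big) \Pi_{S^\perp \otimes \mathbb{C}^N} \Big\| \le 1,
\]
as desired.

Clearly, $\tilde{h}(x)|_S = M|_S$ is an isometry and thus $\tilde{h}(X)(S \otimes \mathbb{C}^N)$ is orthogonal to $\tilde{h}(X)(S^\perp \otimes \mathbb{C}^N) = (M \otimes I_N + P^{(\infty)}(X))(S^\perp \otimes \mathbb{C}^N)$. Since $(M \otimes I_N)(S \otimes \mathbb{C}^N)$ is orthogonal to $(M \otimes I_N)(S^\perp \otimes \mathbb{C}^N)$, this implies $(M \otimes I_N)(S \otimes \mathbb{C}^N) \perp P^{(\infty)}(X)(S^\perp \otimes \mathbb{C}^N)$. Suppose $w \in (M \otimes I_N)(S \otimes \mathbb{C}^N)$. Then $w^* P^{(\infty)}(tX)(S^\perp \otimes \mathbb{C}^N) = \{0\}$ for small enough $t$. Then
\[
0 = w^* P^{(\infty)}(tX) = w^* \sum_{\alpha \ge 2} t^\alpha P_\alpha(X) = \sum_{\alpha \ge 2} t^\alpha\, w^* P_\alpha(X)
\]
on $S^\perp \otimes \mathbb{C}^N$ implies $(M \otimes I_N)(S \otimes \mathbb{C}^N) \perp P_\alpha(X)(S^\perp \otimes \mathbb{C}^N)$. □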
Let us note in passing that under the conditions of the previous theorem, (8.3) implies $h^{(\infty)}(X)w = 0$ for all $X \in \mathcal{B}_g$. Indeed, let us consider the analytic function $z \mapsto h^{(\infty)}(zX)w$ on $\mathbb{D}$. Clearly, (8.3) holds for $X$ replaced by $\delta X$ due to homogeneity. If $\delta$ is in the series radius,
then by the NC power series expansion and the lemma,
\[
h^{(\infty)}(\delta X)w = \sum_{\alpha=2}^\infty h^{(\alpha)}(\delta X)w = 0.
\]
Thus by analytic continuation, $h^{(\infty)}(X)w = 0$.

Next we give a corollary which makes the relationship between Theorems 8.1 and 1.9 clearer.

Corollary 8.2. Keep the assumptions and notation from Theorem 8.1. If, in addition, $f$ is a pencil ball map, then $M$ is a complete isometry. Conversely, $h = L \circ f$ satisfying (1), (2), (3), (4) for a complete isometry $M$ is an NC ball map $\mathcal{B}_g \to \mathcal{B}_{d' \times d}$ sending $0$ to $0$.

Proof. Suppose $f$ is a pencil ball map. Then $h^{(1)}$ is a (linear) NC ball map by Proposition 3.1. Hence $h^{(1)} = Mx$ with $M$ a complete isometry (see Theorem 3.3).

For the converse, suppose $h$ satisfies (1)–(4). By (1) and (3), $h(x) = \tilde{h}(x)x$, where $\tilde{h}(x)$ is given by
\[
\tilde{h}(x) = M\Pi_S + \Big( M + \sum_{\alpha \ge 2} P_\alpha(x) \Big) \Pi_{S^\perp}.
\]
(1) and (4) imply that $\tilde{h}(X)$ is for $X \in \mathcal{B}_g$ an orthogonal sum of two contractions, thus $\tilde{h}(X)$ is a contraction for $X \in \mathcal{B}_g$, i.e., $\|X\| \le 1$. Hence $h(X) = \tilde{h}(X)X$ is a contraction.

For the binding property of $h$ we use that $M$ is a complete isometry. Let $e$ denote the distinguished vector associated with $M$, that is, $A_j^* A_i e = \delta_i^j e$ if $M = [\, A_1 \ \cdots \ A_g \,]$ (cf. Proposition 3.6). By (1), $h^{(1)}(X)$ binds at $e \otimes w$, where $w$ is a binding vector for $I - X^*X$, i.e., $X^*Xw = w$. This concludes the proof since $h(X)(e \otimes w) = h^{(1)}(X)(e \otimes w)$ by (2) and (3). □

9. Further analysis of distinguished isometries

We have successfully classified complete isometries, see Theorem 3.3. Distinguished isometries are more challenging and a few sample results are provided below.

Theorem 9.1. Suppose $L$ is an orthotropic linear pencil in 2 variables. If $\Delta_L(X_1, X_2) \succeq 0$ for all $X_1, X_2 \in \mathbb{C}^{n \times n}$, and clings for all scalar $X_1, X_2 \in \mathbb{C}$, then $\Delta_L(X_1, X_2)$ clings for all $X_1, X_2 \in \mathbb{C}^{n \times n}$.

Remark 9.2. We conjecture based on inconclusive computer experiments that Theorem 9.1 is false in 3 variables.

9.1. Equations which reformulate the clinging property

Throughout this subsection $L$ will denote an orthotropic $d' \times d$ linear pencil in $g$ variables. We assume it clings for $X \in \mathbb{C}^g$. Let
\[
\Delta_L(x) = x^*x - L(x)^* L(x) = x^* G x
\]
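be the Gram representation as in Proposition 7.5. Given $G \in (\mathbb{C}^{d \times d})^{g \times g}$ we call the linear subspace of its kernel spanned by all the vectors of the form
\[
\begin{bmatrix} \alpha_1 v \\ \vdots \\ \alpha_g v \end{bmatrix} \in \ker G
\]
the scalar binding kernel $N_0$. (Since $\Delta_L$ clings for $X \in \mathbb{C}^g$, for every $\alpha_1, \dots, \alpha_g \in \mathbb{C}$ there exists a $0 \ne v \in \mathbb{C}^d$ with $\begin{bmatrix} \alpha_1 v \\ \vdots \\ \alpha_g v \end{bmatrix} \in \ker G$.)

Fix a basis
\[
\left\{ \eta_i := \begin{bmatrix} \alpha_{i,1} v_i \\ \vdots \\ \alpha_{i,g} v_i \end{bmatrix} \ \middle|\ i = 1, \dots, t + m \right\} \subseteq \mathbb{C}^{gd}
\]
for the scalar binding kernel $N_0$ of $G$. We assume that $\{v_1, \dots, v_t\}$ is a maximal linearly independent set and that
\[
v_j = \sum_{i=1}^t \gamma_{ji} v_i
\]
for $j = t + 1, \dots, t + m$. Let $X_1, \dots, X_g \in \mathbb{C}^{n \times n}$. We assume that $X_1$ is invertible and define $Z_i := X_1^{-1} X_i$. (Matrix) binding at $X$ is equivalent to the existence (for all $Z_i$) of a nontrivial solution to $\Delta_L(I_n, Z_2, \dots, Z_g)v = 0$. This is implied by the existence of $r_i \in \mathbb{C}^n$ for which there is a nonzero $v \in \mathbb{C}^{dn}$ such that
\[
\begin{bmatrix} (I_d \otimes I_n)v \\ (I_d \otimes Z_2)v \\ \vdots \\ (I_d \otimes Z_g)v \end{bmatrix}
 = \sum_{i=1}^{t+m} \eta_i \otimes r_i. \tag{9.1}
\]
In particular,
\begin{align*}
v = \sum_{i=1}^{t+m} v_i \otimes r_i
&= \sum_{i=1}^t v_i \otimes r_i + \sum_{j=t+1}^{t+m} \sum_{i=1}^t \gamma_{ji}\, v_i \otimes r_j \\
&= \sum_{i=1}^t v_i \otimes \underbrace{\Big( r_i + \sum_{j=t+1}^{t+m} \gamma_{ji} r_j \Big)}_{\Gamma_i(r)}.
\end{align*}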
Similarly, $(I_d \otimes Z_k)v = \sum_{i=1}^t v_i \otimes \Gamma_i(Z_k r)$. Using this in (9.1) yields
\[
\sum_{i=1}^t v_i \otimes \Gamma_i(Z_k r) = \sum_{i=1}^t v_i \otimes \Gamma_i\big(r \operatorname{diag}(\alpha_k)\big),
\]
where $r = [\, r_1 \ \cdots \ r_{t+m} \,]$ and $\operatorname{diag}(\alpha_k)$ is the diagonal matrix with $\alpha_{i,k}$ as its $(i, i)$ entry. Linear independence of the $v_1, \dots, v_t$ gives $\Gamma_i(Z_k r - r \operatorname{diag}(\alpha_k)) = 0$ for all $k = 2, \dots, g$ and $i = 1, \dots, t$. Thus for all these $i, k$:
\[
(Z_k - \alpha_{i,k}) r_i + \sum_{j=t+1}^{t+m} \gamma_{ji} (Z_k - \alpha_{j,k}) r_j = 0. \tag{9.2}
\]
Hence if all the $Z_k - \alpha_{i,k}$ are invertible,
\[
r_i = -\sum_{j=t+1}^{t+m} b_{ij}\big(\operatorname{diag}(\alpha_k), Z_k\big)\, r_j, \tag{9.3}
\]
where $b_{ij}(\operatorname{diag}(\alpha_k), Z) := \gamma_{ji} (Z - \alpha_{i,k})^{-1} (Z - \alpha_{j,k})$. The equations derived so far reformulate the clinging property; the following lemma says precisely how.

Lemma 9.3. Consider the following conditions:
(i) For each $Z_2, \dots, Z_g$ the system of Eqs. (9.2) has a solution $r_1, \dots, r_{t+m}$;
(ii) $\Delta_L$ clings.
Then (i) $\Rightarrow$ (ii) and if $N_0 = \ker G$, then (ii) $\Rightarrow$ (i).

Proof. Follows from the computations given above. □
9.2. The general case and the proof of Theorem 9.1

Now we give a theorem more general than Theorem 9.1 that implies Theorem 9.1.

Theorem 9.4. Suppose $t(g - 2) < m$. If $\Delta_L(X) \succeq 0$ for all $X \in (\mathbb{C}^{n \times n})^g$ and clings for all scalar $X \in \mathbb{C}^g$, then $\Delta_L(X)$ clings for all $X \in (\mathbb{C}^{n \times n})^g$.

Proof. We assume all $b_{ij}(\operatorname{diag}(\alpha_k), Z_k)$ exist, i.e., all $Z_k - \alpha_{i,k}$ are invertible. This causes no loss of generality: the set of all matrix $g$-tuples that make $\Delta_L$ cling is closed and our condition implies clinging on a dense subset. Eq. (9.3) gives $r_i = r_i(k)$ as a function of $k$. By Lemma 9.3 we need to show that for every choice of $Z_i$ the system (9.2) has a solution, i.e., $r_i(2) = r_i(3) = \cdots = r_i(g)$ for all $i = 1, \dots, t$.
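This yields $tn(g - 2)$ homogeneous equations in $mn$ unknowns. Thus if $m > t(g - 2)$ this system will always have a nontrivial solution. □

Proof of Theorem 9.1. Fix a basis
\[
\left\{ \eta_i = \begin{bmatrix} \alpha_i v_i \\ \beta_i v_i \end{bmatrix} \ \middle|\ i = 1, \dots, t + m \right\}
\]
for the scalar binding kernel $N_0$, where $\{v_1, \dots, v_t\}$ is a maximal linearly independent set. In view of Theorem 9.4 it suffices to show that $m > 0$. Suppose $m = 0$ and choose $\alpha, \beta$ with $\frac{\alpha}{\beta} \ne \frac{\alpha_i}{\beta_i}$ for all $i$. By scalar binding, there is a nonzero vector $u$ with $\begin{bmatrix} \alpha u \\ \beta u \end{bmatrix} \in N_0$, i.e., for some $\lambda_i$:
\[
\begin{bmatrix} \alpha u \\ \beta u \end{bmatrix}
 = \sum_{i=1}^t \lambda_i \eta_i
 = \sum_{i=1}^t \lambda_i \begin{bmatrix} \alpha_i v_i \\ \beta_i v_i \end{bmatrix}.
\]
Hence
\[
\beta \sum_{i=1}^t \lambda_i \alpha_i v_i = \alpha \sum_{i=1}^t \lambda_i \beta_i v_i
\]
and thus by the linear independence of the $v_i$, $\beta \lambda_i \alpha_i = \alpha \lambda_i \beta_i$ for all $i = 1, \dots, t$. As at least one $\lambda_j$ is nonzero, this implies
\[
\frac{\alpha}{\beta} = \frac{\alpha_j}{\beta_j},
\]
contrary to our assumption. Thus $m > 0$, as desired. □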
Acknowledgments

The authors thank Victor Vinnikov and Dima Kalyuzhnyi-Verbovetskiĭ for helping us with NC analytic functions.

References

[1] L.A. Aĭzenberg, Carleman's Formulas in Complex Analysis: Theory and Applications, Kluwer, 1993.
[2] J.A. Ball, V. Bolotnikov, Q. Fang, Schur-class multipliers on the Fock space: de Branges–Rovnyak reproducing kernel spaces and transfer-function realizations, in: Operator Theory, Structured Matrices, and Dilations, in: Theta Ser. Adv. Math., vol. 7, 2007, pp. 85–114.
[3] J.A. Ball, G. Groenewald, T. Malakorn, Bounded real lemma for structured noncommutative multidimensional linear systems and robust control, Multidimens. Syst. Signal Process. 17 (2006) 119–150.
[4] D.P. Blecher, D.M. Hay, Complete isometries – An illustration of non-commutative functional analysis, in: Proceedings of 4th Conference on Function Spaces, in: Contemp. Math., vol. 328, 2003, pp. 85–97.
[5] D.P. Blecher, D.M. Hay, Completely isometric maps into C-star algebras, preprint http://arxiv.org/abs/math/0203182.
[6] D.P. Blecher, C. Le Merdy, Operator Algebras and Their Modules – An Operator Space Approach, Cambridge University Press, 2004.
[7] A.E. Frazho, Complements to models for noncommuting operators, J. Funct. Anal. 59 (3) (1984) 445–461.
[8] S. Helgason, Differential Geometry Lie Groups and Symmetric Spaces, Academic Press, 1978.
[9] J.W. Helton, S.A. McCullough, A Positivstellensatz for non-commutative polynomials, Trans. Amer. Math. Soc. 356 (2004) 3721–3737.
[10] J.W. Helton, S.A. McCullough, M. Putinar, V. Vinnikov, Convex matrix inequalities versus linear matrix inequalities, IEEE Trans. Automat. Control, in press.
[11] J.W. Helton, S.A. McCullough, V. Vinnikov, Noncommutative convexity arises from linear matrix inequalities, J. Funct. Anal. 240 (2006) 105–191.
[12] D. Kalyuzhnyi-Verbovetskiĭ, Carathéodory interpolation on the non-commutative polydisk, J. Funct. Anal. 229 (2005) 241–276.
[13] D. Kalyuzhnyi-Verbovetskiĭ, V. Vinnikov, Non-commutative positive kernels and their matrix evaluations, Proc. Amer. Math. Soc. 134 (2006) 805–816.
[14] D. Kalyuzhnyi-Verbovetskiĭ, V. Vinnikov, Foundations of noncommutative function theory, in preparation.
[15] S.G. Krantz, Function Theory of Several Complex Variables, Wiley, 1982.
[16] P.S. Muhly, B. Solel, Schur class functions and automorphism of Hardy algebras, Doc. Math. 13 (2008) 365–411.
[17] P.S. Muhly, B. Solel, The Poisson kernel for Hardy algebras, Complex Anal. Oper. Theory, in press.
[18] E. Nelson, The distinguished boundary of the unit operator ball, Proc. Amer. Math. Soc. 12 (1961) 994–995.
[19] V. Paulsen, Completely Bounded Maps and Operator Algebras, Cambridge University Press, 2002.
[20] G. Pisier, Introduction to Operator Space Theory, Cambridge University Press, 2003.
[21] G. Popescu, Isometric dilations for infinite sequences of noncommuting operators, Trans. Amer. Math. Soc. 316 (1989) 523–536.
[22] G. Popescu, Free holomorphic functions on the unit ball of $B(H)^n$, J. Funct. Anal. 241 (2006) 268–333.
[23] G. Popescu, Free holomorphic functions and interpolation, Math. Ann. 342 (2008) 1–30.
[24] G. Popescu, Free pluriharmonic majorants and commutant lifting, J. Funct. Anal. 255 (2008) 891–939.
[25] G. Popescu, Hyperbolic geometry on the unit ball of $B(H)^n$ and dilation theory, Indiana Univ. Math. J. 57 (2008) 2891–2930.
[26] G. Popescu, Noncommutative transforms and free pluriharmonic functions, Adv. Math. 220 (2009) 831–893.
[27] G. Popescu, Unitary invariants in multivariable operator theory, Mem. Amer. Math. Soc., in press.
[28] G. Popescu, Free holomorphic automorphisms of the unit ball of $B(H)^n$, J. Reine Angew. Math., http://arxiv.org/abs/0810.0451, in press.
[29] D.-V. Voiculescu, Free analysis questions I: Duality transform for the coalgebra of $\partial_{X:B}$, Int. Math. Res. Not. 16 (2004) 793–822.
[30] D.-V. Voiculescu, Free analysis questions II: The Grassmannian completion and the series expansions at the origin, preprint http://arxiv.org/abs/0806.0361.
[31] M.R. Wohlers, Lumped and Distributed Passive Networks. A Generalized and Advanced Viewpoint, Academic Press, 1969.
Journal of Functional Analysis 257 (2009) 88–121 www.elsevier.com/locate/jfa
Uniform K-homology theory Ján Špakula Mathematisches Institut, Universität Münster, Einsteinstr. 62, 48149 Münster, Germany Received 17 October 2008; accepted 9 February 2009 Available online 25 February 2009 Communicated by Alain Connes
Abstract
We define a uniform version of analytic K-homology theory for separable, proper metric spaces. Furthermore, we define an index map from this theory into the K-theory of uniform Roe C∗-algebras, analogous to the coarse assembly map from analytic K-homology into the K-theory of Roe C∗-algebras. We show that our theory has a Mayer–Vietoris sequence. We prove that for a torsion-free countable discrete group Γ, the direct limit of the uniform K-homology of the Rips complexes of Γ, $\lim_{d\to\infty} K_*^u(P_d\Gamma)$, is isomorphic to $K_*^{\mathrm{top}}(\Gamma, \ell^\infty\Gamma)$, the left-hand side of the Baum–Connes conjecture with coefficients in $\ell^\infty\Gamma$. In particular, this provides a computation of the uniform K-homology groups for some torsion-free groups. As an application of uniform K-homology, we prove a criterion for amenability in terms of vanishing of a “fundamental class”, in the spirit of similar criteria in uniformly finite homology and K-theory of uniform Roe algebras.
© 2009 Elsevier Inc. All rights reserved.
MSC: primary 46L80
Keywords: Analytic K-homology; Coarse assembly map; Uniform Roe algebra
E-mail address: [email protected]. 0022-1236/$ – see front matter © 2009 Elsevier Inc. All rights reserved. doi:10.1016/j.jfa.2009.02.008
1. Introduction
The analytic K-homology theory of a second countable locally compact Hausdorff topological space X (see e.g. [8]) can be understood as an attempt to organize the elliptic differential operators over the space X into an abelian group. The (higher) indices of these operators can be interpreted as K-theory elements over C∗X, the Roe C∗-algebra [12]. The Coarse Baum–Connes
and coarse Novikov conjectures assert certain properties of this index (or coarse assembly) map μ : K∗ (X) → K∗ (C ∗ X), and have applications in geometry (see e.g. [12,15]). Also, the Coarse Baum–Connes conjecture can be viewed as an algorithm to compute the K-theory of Roe C∗ -algebras. In this spirit, the work presented here is setting up a framework for obtaining an algorithm to compute the K-theory groups of uniform Roe C∗ -algebras. In this paper, we define a refined version of analytic K-homology theory. We loosely follow the exposition [8] of analytic K-homology. The main idea is to quantify “how well approximable by finite rank operators” are various compact operators appearing in the definition of a Fredholm module. Our theory, compared to the classical K-homology, has some advantages (the theory becomes sensitive to some coarse properties, for instance amenability), but also some disadvantages (the K-theory of uniform Roe algebras tends to be uncountable if nonzero). The theory exhibits similarities to the uniformly finite homology theory of Block and Weinberger [3,4], which should be connected to our theory via a Chern character map. This is analogous to the Chern map from analytic K-homology into the locally finite homology groups. Using estimates from [11], we show that some elliptic operators coming from geometry give rise to uniform K-homology classes. Furthermore, we construct an index map μu from uniform K-homology into the K-theory of uniform Roe C∗ -algebras. The original example of a coarse index theorem [11] is actually carried out in this uniform context. We prove that amenability of a metric space is equivalent to non-vanishing of a “fundamental class” in the uniform K-homology of the space. Our criterion is parallel to similar criteria in the uniformly finite homology [3] and K-theory of uniform Roe algebras [6]. Our proof borrows ideas from both of these papers. 
In the case when the space in question is a Cayley graph of a countable torsion-free group Γ, we show that $\lim_{d\to\infty} K_*^u(P_d\Gamma)$, the direct limit of uniform K-homologies of its Rips complexes, is naturally isomorphic to $K_*^{\mathrm{top}}(\Gamma, \ell^\infty\Gamma)$, the left-hand side of the Baum–Connes conjecture for the group Γ with coefficients in $\ell^\infty(\Gamma)$. This is analogous to a result of Yu [14], where he shows the equivalence of the Coarse Baum–Connes conjecture and the Baum–Connes conjecture with coefficients in $\ell^\infty(\Gamma, \mathcal{K})$. Yu's statement is true without any assumption on torsion; it is open whether the torsion-free assumption can be dropped in the uniform setting. On the other hand, since the Baum–Connes conjecture with commutative coefficients is known for a number of torsion-free groups, this result provides a computation of some uniform K-homology groups.
The structure of this paper is as follows: Section 2 introduces the uniform K-homology groups, and in Section 3 we prove that certain Dirac-type differential operators give rise to uniform K-homology classes. Sections 4 and 5 are devoted to proving the Mayer–Vietoris sequence in our theory. We turn to coarse geometry and the index map in Sections 7–9. The connection between the uniform K-homology of a group Γ and the Baum–Connes conjecture with coefficients in $\ell^\infty\Gamma$ is shown in Section 10. In the final Section 11, we prove our criterion for amenability.
2. Uniform K-homology groups
In this paper, the spaces are separable proper metric spaces, unless explicitly specified otherwise. Throughout the paper, X shall stand for such a space, and d will denote its metric. Finally, to avoid set-theoretic difficulties, all Hilbert spaces are assumed to be separable. Recall the definition of Fredholm modules, the representatives of cycles in the classical analytic K-homology theory.
Definition 2.1. Let (H, φ, S) be a triple, where H is a Hilbert space, φ : C0(X) → B(H) is a ∗-homomorphism and S ∈ B(H) is an operator. We say that such a triple is a 0-Fredholm module (or even Fredholm module), if for every f ∈ C0(X) the following hold:
• (Fredholmness) (1 − S∗S)φ(f) ∈ K(H) and (1 − SS∗)φ(f) ∈ K(H),
• (pseudolocality) [S, φ(f)] ∈ K(H).
Similarly, we say that a triple (H, φ, P) as above is a 1-Fredholm module (or odd Fredholm module), if for every f ∈ C0(X):
• (P² − 1)φ(f) ∈ K(H) and (P − P∗)φ(f) ∈ K(H),
• (pseudolocality) [P, φ(f)] ∈ K(H).
Remark 2.2. We can also formulate the Fredholmness condition for even Fredholm modules in another form, which is more convenient for the setting of differential operators: A triple (H, φ, T) forms an even Fredholm module, if H is Z2-graded, φ(f) is of degree 0 (i.e. even) for all f ∈ C0(X) and T ∈ B(H) is a pseudolocal operator of degree 1 (odd), satisfying that (T² − 1)φ(f) ∈ K(H) and (T∗ − T)φ(f) ∈ K(H) for all f ∈ C0(X).
We modify this concept, defining “uniform Fredholm modules”, which shall represent elements in the uniform K-homology theory. We introduce uniformity by “quantifying” the compactness of an operator in the following way: given ε > 0 we try to approximate our compact operator within ε by a finite rank operator with the smallest possible rank. In the definition of a Fredholm module, instead of just one compact operator, we really have a collection of compacts, depending on f ∈ C0(X), and we require a uniform bound on the ranks of ε-approximants for fixed R (a “scale” in the metric of X). This consideration is sufficient to ensure uniformity on the large scale. However, we want (certain) first order differential operators to give rise to uniform K-homology classes.
The approximation properties of compacts arising from the pseudolocality condition really depend not only on the support of f but also on its derivative (just consider an operator [D, f]), and so we also need to build in some local control. Specifically, for a metric space X and R, L ≥ 0 we denote
$$C_R(X) = \{ f \in C_c(X) \mid \operatorname{diam}(\operatorname{supp}(f)) \le R \text{ and } \|f\|_\infty \le 1 \},$$
$$C_{R,L}(X) = \{ f \in C_R(X) \mid f \text{ is } L\text{-continuous} \}.$$
We say that f : X → Y is L-continuous, if there exists a nondecreasing function α : [0, ∞] → [0, ∞) with $\alpha'(0) \ge \frac{1}{L}$, such that for any x, y ∈ X we have d(x, y) ≤ α(s) ⇒ d(f(x), f(y)) ≤ s. Loosely, one could formulate the condition as “locally L-Lipschitz”. In particular, if a function is L-Lipschitz, then it is L-continuous (with $\alpha(s) = \frac{1}{L}s$). The converse is true for instance when X is a geodesic space. Hence for practical purposes we can replace this condition with just L-Lipschitz. We use the notion of L-continuity to emphasize the local side of being Lipschitz.
The reason for introducing L-continuity is the following: if X is a manifold and f ∈ $C_{R,L}(X)$ is differentiable at x ∈ X, then the norm of the derivative df of f at x is at most L. This observation is used in a crucial way in Section 3, when proving that Dirac-type operators produce uniform K-homology classes. If one doesn't require the theory to include such classes, it is possible to just ignore L's and l-'s throughout the paper.
Furthermore $\bigcup_{L \ge 0} C_{R,L}(X)$ is dense in $C_R(X)$. This is completely analogous to saying that (once) differentiable functions are dense in the space of all continuous functions. The proof is outlined at the end of this section, in Lemma 2.18.
In the following definition, we introduce the uniformity conditions. We list two versions: one without the “L-dependency” and one featuring L.
Definition 2.3 (Uniform approximability). Let H be a Hilbert space, X a metric space and φ : C0(X) → B(H) a ∗-homomorphism. For ε, M > 0, an operator T ∈ B(H) is said to be (ε, M)-approximable, if there is an operator k of rank at most M, such that ‖T − k‖ < ε. Let E(·) (or E(f)) stand for an expression with operators in B(H) and terms φ(·) (or φ(f)). (For instance E(·) = Tφ(·) or E(f) = [T, φ(f)].)
• For ε, M, R > 0, an expression E(·) is said to be (ε, R, M; φ)-approximable, if for each f ∈ $C_R(X)$, E(f) is (ε, M)-approximable.
• For ε, R, L, M > 0, an expression E(·) is said to be (ε, R, L, M; φ)-approximable, if for each f ∈ $C_{R,L}(X)$, E(f) is (ε, M)-approximable.
• An expression E(·) is uniformly approximable, if for every R ≥ 0, ε > 0 there exists M > 0, such that E(·) is (ε, R, M; φ)-approximable. Furthermore, we write $E_1(\cdot) \sim_{ua} E_2(\cdot)$, if the difference $E_1(\cdot) - E_2(\cdot)$ is uniformly approximable.
• An expression E(·) is l-uniformly approximable, if for every R, L ≥ 0, ε > 0 there exists M > 0, such that E(·) is (ε, R, L, M; φ)-approximable. Furthermore, we write $E_1(\cdot) \sim_{lua} E_2(\cdot)$, if the difference $E_1(\cdot) - E_2(\cdot)$ is l-uniformly approximable.
We introduce some special cases of uniform approximability:
• We say that an operator T ∈ B(H) is uniform, if Tφ(·) and φ(·)T are uniformly approximable (i.e. $T\phi(f) \sim_{ua} 0 \sim_{ua} \phi(f)T$). We also say that T is (ε, R, M; φ)-uniform, if both operators φ(f)T, Tφ(f) are (ε, R, M; φ)-approximable.
• An operator T ∈ B(H) is said to be uniformly pseudolocal, if [T, φ(·)] is uniformly approximable (i.e. $[T, \phi(f)] \sim_{ua} 0$).
• An operator T ∈ B(H) is said to be l-uniformly pseudolocal, if [T, φ(·)] is l-uniformly approximable (i.e. $[T, \phi(f)] \sim_{lua} 0$).
Remark 2.4. The property of being uniformly pseudolocal is obviously stronger than that of being l-uniformly pseudolocal. In the former, we can obtain a bound M on ranks of approximants, which is independent of L (local condition), and depends only on R (support condition) and of course on ε.
Remark 2.5. The notion of an “l-uniform” operator is in fact equivalent to the notion of a uniform operator given above. More precisely, if Tφ(·) and φ(·)T are l-uniformly approximable, then they are in fact just uniformly approximable, i.e. we can get a bound on M independent of L. In other words, checking uniformity of an operator on “nice” functions is sufficient. Indeed, for every f ∈ $C_R(X)$ we can construct a function $\tilde f \in C_{R+1,1}(X)$, such that $f\tilde f = f$. Now given R, ε > 0, if M is the constant such that Tφ(·) and φ(·)T are (ε, R + 1, 1, M; φ)-approximable, then $\phi(f)T = \phi(f)\phi(\tilde f)T$ and $T\phi(f) = T\phi(\tilde f)\phi(f)$ are (ε, R, M; φ)-approximable. Such an $\tilde f$ can be constructed for instance as $\tilde f(x) = \max(0, 1 - d(x, \operatorname{supp}(f)))$. One easily checks that this function is 1-Lipschitz, and so $\tilde f \in C_{\operatorname{diam}(\operatorname{supp}(f))+1,1}(X)$.
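The cutoff function from Remark 2.5 is easy to experiment with. The following is a minimal sketch on a finite sample of the real line; the grid, the tent function f and the helper names are assumptions made only for illustration, and the support of f is approximated by the grid points where f is nonzero.

```python
# Sketch of the cutoff from Remark 2.5 on a finite sample of the real line.
# The grid, the function f and all helper names are illustrative assumptions.

def dist_to_set(x, points):
    """Distance from x to a finite nonempty set of reals."""
    return min(abs(x - p) for p in points)

def make_cutoff(support):
    """Return f_tilde(x) = max(0, 1 - d(x, supp f))."""
    return lambda x: max(0.0, 1.0 - dist_to_set(x, support))

grid = [i / 10 for i in range(-30, 31)]       # sample of [-3, 3]
f = lambda x: max(0.0, 1.0 - abs(x))          # tent function supported in [-1, 1]
support = [x for x in grid if f(x) != 0.0]    # grid approximation of supp(f)
f_tilde = make_cutoff(support)

# f * f_tilde = f on the whole grid, since f_tilde equals 1 wherever f is nonzero.
products_agree = all(abs(f(x) * f_tilde(x) - f(x)) < 1e-12 for x in grid)

# f_tilde is 1-Lipschitz on the grid.
is_lipschitz = all(
    abs(f_tilde(x) - f_tilde(y)) <= abs(x - y) + 1e-12
    for x in grid for y in grid
)
```

The point of the construction is that the Lipschitz constant of the cutoff is 1 regardless of how rough f is, which is why the bound M in the remark no longer depends on L.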
In view of the previous remark, we can completely disregard the constant L appearing in the definition above, when we work with uniform operators only. This is so for instance in Sections 6–11.
Definition 2.6. Let (H, φ, S) be a 0-Fredholm module. It is said to be uniform, if S is l-uniformly pseudolocal and the operators 1 − SS∗, 1 − S∗S are uniform. Let (H, φ, Q) be a 1-Fredholm module. It is said to be uniform, if Q is l-uniformly pseudolocal and the operators 1 − Q² and Q − Q∗ are uniform.
Remark 2.7. By using “uniform Fredholm module” (without 0- or 1-) in a statement we shall mean that the statement applies to both 0- and 1-uniform Fredholm modules.
Remark 2.8. If we are given a Hilbert space H together with a ∗-homomorphism φ : C0(X) → B(H) (i.e. an action of C0(X) on H), we say that (H, φ), or just H, is an X-module. When no confusion about φ can arise, we identify f ∈ C0(X) with φ(f) ∈ B(H). Similarly, we omit “φ” from (ε, R, M; φ), etc.
Example 2.9 (Fundamental class). Let Y be a uniformly discrete space. Let S be the unilateral shift operator on $\ell^2\mathbb{N}$ (i.e. a Fredholm operator with index 1). Denote $H = \ell^2 Y \otimes \ell^2\mathbb{N}$, and set $\tilde S = \operatorname{diag}(S) \in B(H)$. Endow H with the multiplication action of C0(Y). More precisely, define φ : C0(Y) → B(H) by φ(f)(ζ(y)) = f(y)ζ(y), for $\zeta : Y \to \ell^2\mathbb{N}$, a square summable function, and y ∈ Y, f ∈ C0(Y). It is easy to check that $(H, \phi, \tilde S)$ is a 0-uniform Fredholm module for Y ($\tilde S$ is actually uniformly pseudolocal). This module plays a pivotal role in our characterization of amenability in Section 11.
The following example is concerned with the K-homology classes coming from elliptic differential operators on manifolds.
Example 2.10. Let M be a complete Riemannian manifold and S a smooth complex vector bundle over M. Let D be a symmetric elliptic differential operator acting on sections of S. Let χ : R → R be a chopping function (an odd smooth function), χ(t) > 0 for t > 0, χ(t) → ±1 for t → ±∞.
Denote H = L2 (M, S) and let ρ : C0 (M) → B(H ) be the multiplication action. It is proved in [8, Section 10.8], that (H, ρ, χ(D)) is a Fredholm module (whether odd or even depends on the dimension of M). Assuming that M has bounded geometry and that the operator D is “geometric” (e.g. has finite propagation speed), this Fredholm module is actually uniform. The proof is outlined in Section 3. We now proceed towards the definition of uniform K-homology groups. Definition 2.11. A collection (H, φt , St ), t ∈ [0, 1], of uniform Fredholm modules is a homotopy, if: • t → St is continuous in norm,
• the C∗-algebras Θ(φt) ⊂ B(H) generated by all the uniform operators with finite propagation¹ are all the same for all t ∈ [0, 1].
By an operator homotopy we mean a homotopy as above, with the restriction that φt = φ0 for all t ∈ [0, 1].
Remark 2.12. The second condition above is satisfied if for instance the φt's fulfill
• there exists R > 0, such that for f, g ∈ C0(X) with d(supp(f), supp(g)) ≥ R, we have φs(f)φt(g) = 0 for all s, t ∈ [0, 1],
• for every s, t ∈ [0, 1] and R > 0, there are R′ and M, so that every φt(f), f ∈ $C_R(X)$, is within a rank-M operator from one of the form φs(g), g ∈ $C_{R'}(X)$.
We now proceed as in [8, Section 8.2] in defining a K-homology theory. Given two uniform Fredholm modules, we can clearly form their direct sum, which becomes again a uniform Fredholm module.
Definition 2.13 ($K_*^u$). We define the uniform K-homology group $K_i^u(X)$, i = 0, 1, to be an abelian group generated by the unitary equivalence classes of uniform i-Fredholm modules (H, φ, S) with the following relations:
• if two uniform Fredholm modules x, y are homotopic, we declare [x] = [y],
• for two uniform Fredholm modules x, y, we set [x ⊕ y] = [x] + [y].
Recall that a Fredholm module (H, φ, S) is called degenerate, if the conditions in the definition hold exactly, that is (1 − S∗S) = (1 − SS∗) = [φ(f), S] = 0 for all f ∈ C0(X) for the 0-version; and S − S∗ = S² − 1 = [φ(f), S] = 0 for all f ∈ C0(X) for the 1-version. The $K_*^u$ class of a degenerate Fredholm module is 0: the proof of the analogous result for K-homology [8, 8.2.8] carries over verbatim.
The additive inverse of [(H, φ, S)] ∈ $K_0^u(X)$ is [(H, φ, −S∗)]. Similarly, the additive inverse of [(H, φ, P)] ∈ $K_1^u(X)$ is [(H, φ, −P)]. Again, the proof of these facts is just as [8, proof of 8.2.10]. For instance,
$$\begin{pmatrix} \cos t\, S & \sin t\, I \\ \sin t\, I & -\cos t\, S^* \end{pmatrix}, \quad t \in [0, \tfrac{\pi}{2}],$$
is a homotopy showing that
$$[(H, \phi, S)] + [(H, \phi, -S^*)] = \left[\left(H \oplus H,\ \phi \oplus \phi,\ \begin{pmatrix} 0 & I \\ I & 0 \end{pmatrix}\right)\right] = 0 \in K_0^u(X).$$
It follows from the facts in the last two paragraphs, that every element of $K_*^u(X)$ can be represented as a class of a single uniform Fredholm module. Furthermore, [x] = [y] in $K_*^u(X)$ if and only if there exists a degenerate Fredholm module z, such that x ⊕ z and y ⊕ z are unitarily equivalent to a pair of homotopic uniform Fredholm modules. In this case, we say that x and y are stably homotopic. Therefore, we may reformulate the definition of $K_*^u(X)$ as follows:
Proposition 2.14. The group $K_i^u(X)$ is canonically isomorphic to the semigroup of stable homotopy equivalence classes of uniform i-Fredholm modules.
¹ Recall (see 6.4) that T ∈ B(H) has finite propagation, if there exists R ≥ 0, such that for any f, g ∈ C0(X) whose supports are at least R apart, we have φt(f)Tφt(g) = 0.
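As a sanity check on the rotation homotopy used above for additive inverses, one can verify directly that the Fredholmness defects stay under control along the path. The name $F_t$ is introduced here only for convenience and does not appear in the original; the computation is a routine block-matrix multiplication.

```latex
F_t = \begin{pmatrix} \cos t\, S & \sin t\, I \\ \sin t\, I & -\cos t\, S^* \end{pmatrix},
\qquad
F_t^* = \begin{pmatrix} \cos t\, S^* & \sin t\, I \\ \sin t\, I & -\cos t\, S \end{pmatrix},
\\[6pt]
F_t F_t^* = \begin{pmatrix} \cos^2 t\, SS^* + \sin^2 t\, I & 0 \\ 0 & \sin^2 t\, I + \cos^2 t\, S^*S \end{pmatrix},
\\[6pt]
1 - F_t F_t^* = \cos^2 t \begin{pmatrix} 1 - SS^* & 0 \\ 0 & 1 - S^*S \end{pmatrix},
\qquad
1 - F_t^* F_t = \cos^2 t \begin{pmatrix} 1 - S^*S & 0 \\ 0 & 1 - SS^* \end{pmatrix}.
```

In particular, if $1 - SS^*$ and $1 - S^*S$ are uniform, then so are $1 - F_tF_t^*$ and $1 - F_t^*F_t$ for every t, and at $t = \pi/2$ both defects vanish, consistent with the endpoint being degenerate.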
The uniform K-homology is not functorial under continuous maps in general; we need two extra conditions in order to obtain functoriality: one handling the large scale and one taking care of the local phenomena.
Definition 2.15. (See also 6.1.) A (not necessarily continuous) map g : X → Z between metric spaces X and Z is said to be uniformly cobounded, if for any r ≥ 0, we have
$$R_g(r) := \sup_{z \in Z} \operatorname{diam}\bigl(g^{-1}(B(z, r))\bigr) < \infty.$$
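To get a feel for the quantity $R_g(r)$ in Definition 2.15, here is a small sketch that evaluates it for maps between finite subsets of the integers. The function name, the choice of closed balls and the two example maps are assumptions made for illustration only.

```python
# Illustration of Definition 2.15 for finite subsets of the real line.
# Assumption: B(z, r) is taken to be the closed ball {w : |w - z| <= r}.

def cobounded_modulus(g, X, Z, r):
    """Return max over z in Z of diam g^{-1}(B(z, r)), for finite X and Z."""
    best = 0
    for z in Z:
        preimage = [x for x in X if abs(g(x) - z) <= r]
        if preimage:
            best = max(best, max(preimage) - min(preimage))
    return best

X = list(range(-50, 51))

halve = lambda x: x // 2        # preimages of points have diameter at most 1
collapse = lambda x: 0          # sends everything to a single point

Z_halve = sorted({halve(x) for x in X})
Z_collapse = [0]

modulus_halve_r0 = cobounded_modulus(halve, X, Z_halve, 0)
modulus_halve_r1 = cobounded_modulus(halve, X, Z_halve, 1)
modulus_collapse = cobounded_modulus(collapse, X, Z_collapse, 0)
```

For the halving map the value depends only on r and not on the size of X, which is the behaviour the definition asks for; for the collapsing map the value equals the diameter of X, so on an unbounded space it would be infinite.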
Observe that an L-continuous uniformly cobounded map g : X → Z descends to a homomorphism on the uniform K-homology groups $g_* : K_*^u(X) \to K_*^u(Z)$ by the following observation: Take a uniform Fredholm module (H, φ : C0(X) → B(H), S) representing a $K_*^u(X)$-element. We denote by $\tilde g : C_0(Z) \to C_0(X)$ the induced ∗-homomorphism. Then there is a ∗-homomorphism $\phi \circ \tilde g : C_0(Z) \to B(H)$. By uniform coboundedness, we obtain that if $f \in C_R(Z)$, then $\tilde g(f) \in C_{R_g(R)}(X)$. By L-continuity, $f \in C_{R,L'}(Z)$ implies $\tilde g(f) \in C_{R_g(R),LL'}(X)$. Hence the uniformity requirements transfer and $(H, \phi \circ \tilde g, S)$ becomes a uniform Fredholm module representing a $K_*^u(Z)$-element. We define $g_*[(H, \phi, S)] = [(H, \phi \circ \tilde g, S)]$.
We now prove a simple lemma analogous to a similar statement in the classical K-homology:
Lemma 2.16. If (H, φ, T) is a uniform Fredholm module and K ∈ B(H) is uniform, then (H, φ, T) and (H, φ, T + K) are operator homotopic.
Proof. We need to show that (H, φ, T + tK), t ∈ [0, 1], are uniform Fredholm modules. Fix ε, R, L > 0 and let M be such that all the operators K, [T, φ(f)], (1 − T∗T)φ(f) and (1 − TT∗)φ(f) (or (1 − T²)φ(f) and (T − T∗)φ(f) in the 1-case) are (ε, M)-approximable for f ∈ $C_{R,L}(X)$. First, for f ∈ $C_{R,L}(X)$, we have that [T + tK, φ(f)] = [T, φ(f)] + tKφ(f) − tφ(f)K, which is clearly (3ε, 3M)-approximable. Hence the pseudolocality requirement is satisfied.
Let us now deal with the 0-case. Examine the following expression:
1 − (T + tK)(T + tK)∗ = (1 − TT∗) − tKT∗ − tTK∗ − t²KK∗.
Taking f ∈ $C_{R,L}(X)$ and multiplying the previous formula by φ(f) on the right, each of the elements (1 − TT∗)φ(f), tTK∗φ(f), t²KK∗φ(f) is going to be (‖T‖‖K‖ε, M)-approximable by assumption. We can rewrite the remaining term as follows: tKT∗φ(f) = tKφ(f)T∗ + tK[T∗, φ(f)], and so it is (2‖T‖‖K‖ε, R, 2M)-approximable. Therefore, (1 − (T + tK)(T + tK)∗)φ(f) is (5‖T‖‖K‖ε, 5M)-approximable. It is clear that similar considerations can be applied to 1 − (T + tK)∗(T + tK) as well.
Finally, we deal with the 1-case. Let f ∈ $C_{R,L}(X)$. Observe that ((T + tK) − (T∗ + tK∗))φ(f) is (2ε, 2M)-approximable. Furthermore,
(1 − (T + tK)²)φ(f) = (1 − T²)φ(f) − tTKφ(f) − t²K²φ(f) − tKφ(f)T − tK[T, φ(f)]
and this last expression is (5‖T‖‖K‖ε, 5M)-approximable. □
As a first application of the previous lemma, we make the following observation:
Remark 2.17. We may always assume that a $K_1^u$-element is represented by a uniform 1-Fredholm module (H, φ, Q) with Q selfadjoint. This is because for any Q, the operator $\tfrac12(Q + Q^*)$ is selfadjoint and $Q - \tfrac12(Q + Q^*) = \tfrac12(Q - Q^*)$ is uniform. Moreover, the procedure of replacing Q by a selfadjoint operator can be applied to whole homotopies.
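The constants appearing in the proof of Lemma 2.16 come from two elementary bookkeeping facts that follow directly from Definition 2.3, reading “rank-M” as “rank at most M”. The operators A, B, C and the approximants $k_A$, $k_B$ below are generic and are introduced here only for illustration.

```latex
\text{Suppose } \|A - k_A\| < \varepsilon_1,\ \operatorname{rank} k_A \le M_1
\quad\text{and}\quad
\|B - k_B\| < \varepsilon_2,\ \operatorname{rank} k_B \le M_2 . \text{ Then}
\\[4pt]
\|(A + B) - (k_A + k_B)\| \le \|A - k_A\| + \|B - k_B\| < \varepsilon_1 + \varepsilon_2,
\qquad \operatorname{rank}(k_A + k_B) \le M_1 + M_2,
\\[4pt]
\|CA - Ck_A\| \le \|C\|\,\|A - k_A\|,
\qquad \operatorname{rank}(Ck_A) \le M_1,
\\[4pt]
\|AC - k_AC\| \le \|A - k_A\|\,\|C\|,
\qquad \operatorname{rank}(k_AC) \le M_1 .
```

So a sum of an $(\varepsilon_1, M_1)$-approximable and an $(\varepsilon_2, M_2)$-approximable operator is $(\varepsilon_1 + \varepsilon_2, M_1 + M_2)$-approximable, and multiplying by a nonzero bounded operator C on either side scales ε by $\|C\|$ without increasing the rank bound.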
We finish the section with a lemma promised earlier.
Lemma 2.18. Let X be a metric space. Given any compactly supported continuous function f : X → C and ε > 0, there exists L > 0 and an L-continuous function g : X → C, such that $\|f - g\|_\infty < \varepsilon$.
Proof. Without loss of generality we can assume that f(X) ⊂ [0, 1]. Take an integer N, such that $\frac{1}{N} < \varepsilon$, and set $U_n = f^{-1}[0, \frac{n}{N}]$, n = 0, ..., N. Then $U_0 \subset U_1 \subset \cdots \subset U_N = X$ are closed sets. By uniform continuity, there exists δ > 0, such that d(x, y) < δ implies $|f(x) - f(y)| < \frac{1}{N}$. This implies that $N_\delta(U_n) \subset U_{n+1}$. Define g : X → R as follows: for x ∈ X, let n(x) be such that $x \in U_{n(x)}$, but $x \notin U_{n(x)-1}$ (where we set $U_{-1} = \emptyset$). Now set
$$g(x) = \frac{n(x)-1}{N} + \frac{1}{N} \cdot \min\Bigl(1, \frac{1}{\delta}\, d(x, U_{n(x)-1})\Bigr)$$
if n(x) > 0, and g(x) = 0 if n(x) = 0. It is clear from the construction that $\|f - g\|_\infty \le \frac{1}{N} < \varepsilon$ and it is easy to verify that g is $\frac{1}{N\delta}$-continuous, where $\frac{1}{N\delta} < \frac{\varepsilon}{\delta}$. □
3. Dirac-type operators
In this section, we outline the proof of the fact that “geometric” operators on complete Riemannian manifolds with bounded geometry give rise to uniform Fredholm modules. The hard work was already done in [11], where it is shown that such geometric operators have index defined in the algebraic K-theory of the algebra $U_{-\infty}(M)$ of operators given by smooth uniformly bounded kernels, the precursor of the uniform Roe algebra.
Recall the setting: Let M be a complete Riemannian manifold (without boundary) and S a Clifford bundle over M. More precisely, denote by Cliff(M) the complexified bundle of Clifford algebras $\mathrm{Cliff}(T_xM)$ (equipped with a natural connection), and let S be a smooth complex vector bundle over M equipped with an action of Cliff(M) and a compatible connection. The bundle S is graded, if in addition it is equipped with an involution anticommuting with the Clifford action of tangent vectors (see Remark 2.2).
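Returning briefly to Lemma 2.18: the construction in its proof is concrete enough to run on a sampled space. The sketch below follows the proof step by step on a grid in [0, 3]; the grid, the tent function f, and the choices N = 10 and δ = 1/N (valid here because this particular f is 1-Lipschitz) are assumptions made for illustration.

```python
# Sketch of the step construction from the proof of Lemma 2.18 on a sampled interval.
# Grid, test function, N and delta are illustrative assumptions.

def approximate(f, X, N, delta):
    """Return g as a dict x -> g(x), with U_n = {x in X : f(x) <= n/N}."""
    U = [[x for x in X if f(x) <= n / N] for n in range(N + 1)]
    g = {}
    for x in X:
        # n(x): smallest n with x in U_n (exists because f takes values in [0, 1]).
        n = next(k for k in range(N + 1) if f(x) <= k / N)
        if n == 0:
            g[x] = 0.0
        else:
            d = min(abs(x - u) for u in U[n - 1])      # d(x, U_{n(x)-1})
            g[x] = (n - 1) / N + (1 / N) * min(1.0, d / delta)
    return g

X = [i / 100 for i in range(301)]                 # sample of [0, 3]
f = lambda x: max(0.0, 1.0 - abs(x - 1.5))        # values in [0, 1], 1-Lipschitz
N = 10
delta = 1 / N                                     # |x - y| < delta gives |f(x) - f(y)| < 1/N

g = approximate(f, X, N, delta)
sup_error = max(abs(f(x) - g[x]) for x in X)
```

On this sample the sup-norm error is bounded by 1/N, matching the estimate in the proof; the sketch does not attempt to verify the L-continuity claim.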
A “geometric” operator will be a first-order differential operator D defined by the composition Γ(S) → Γ(T∗M ⊗ S) → Γ(TM ⊗ S) → Γ(S), where the arrows are given by the connection, metric and Clifford multiplication, respectively. In local coordinates, this operator has the form
$$D = \sum_k e_k \frac{\partial}{\partial x_k}.$$
The signature and Dirac operators are of this type. The main properties of these operators are that they are elliptic, and have finite propagation (in the sense that there exists a constant C, such that $\operatorname{supp}(e^{itD}\xi) \subset N_{Ct}(\operatorname{supp}(\xi))$ for all $\xi \in \Gamma_c(S)$).² Denote H = L²(S) and let ρ : C0(M) → B(H) be the multiplication action. Let χ : R → R be a chopping function (an odd smooth function, χ(t) > 0 for t > 0, χ(t) → ±1 for t → ±∞). Then (H, ρ, χ(D)) is a Fredholm module (see [8, Sections 10.6 and 10.8]). This is true in a more general context, namely for any first-order elliptic differential operator on a complex smooth vector bundle. However, to obtain uniformity, a bounded geometry assumption and some analysis from [11] are required.
² Recall that $N_\delta(Y)$ denotes the δ-neighborhood of a set Y; and Γ(S) denotes the set of smooth sections of S.
Following [11, Section 2], we say that a Riemannian manifold M has bounded geometry, if it has positive injectivity radius and the curvature tensor is uniformly bounded, as are all its covariant derivatives. A bundle S has bounded geometry, if its curvature tensor, as well as all its covariant derivatives, are uniformly bounded. By [11, Proposition 2.4], bounded geometry can be detected by the existence of nice coordinate patches, such that the Christoffel symbols comprise a bounded set in the Fréchet structure on $C^\infty$.
For the record, let us collect all the assumptions and the conclusion into a theorem.
Theorem 3.1. Let D be a geometric operator (as described above) on a Clifford bundle S over a complete Riemannian manifold with bounded geometry. For any chopping function χ, the triple (L²(S), ρ, χ(D)) is a uniform Fredholm module.
The idea of the proof (which will be made more precise afterward) is as follows: it is proved in [11, Theorem 5.5], that if ϕ ∈ C0(R) satisfies $|\varphi^{(k)}(t)| \le C_k(1 + |t|)^{m-k}$, then ϕ(D) extends to a continuous map between Sobolev spaces $W^r \to W^{r-m}$ for any r. Now a bounded piece of our manifold can be transferred to a torus. The Fourier coefficients of a $W^{-k}$-function on a torus decay faster than $s \mapsto \frac{1}{s^k}$. Hence the finite rank approximants to the inclusion $W^{r-m} \to W^r$ can be constructed just by truncating the Fourier series, and knowing the rate of decay of the coefficients tells us how large a rank we need for a given ε > 0, independently of the position of our bounded piece in the manifold. Putting the facts together, $\varphi(D) : W^r \to W^{r-m} \to W^r$ is uniformly approximable.
In order to cite [11, Theorem 5.5], we need to introduce some notation.
Define (global) Sobolev spaces $W^k(S)$ as the completion of $\Gamma_c(S)$ in the norm
$$\|\xi\|_k = \bigl(\|\xi\|^2 + \|D\xi\|^2 + \cdots + \|D^k\xi\|^2\bigr)^{1/2}.$$
Furthermore, if L ⊂ M, denote
$$\|\xi\|_{k,L} = \inf\bigl\{\|\zeta\|_k \;\big|\; \zeta \in W^k(S),\ \xi = \zeta \text{ on a neighborhood of } L\bigr\}.$$
An operator $A : W^k(S) \to W^l(S)$ is called quasilocal, if there exists a function μ : R+ → R+, such that μ(r) → 0 as r → ∞ and for each K ⊂ M and each $\xi \in W^k(S)$ supported within K one has
$$\|A\xi\|_{l,\,M \setminus N_r(K)} \le \mu(r)\,\|\xi\|_k.$$
We call μ a dominating function for A. Finally, we set $S^m(\mathbb{R})$ to be the set of functions $\varphi \in C^\infty(\mathbb{R})$, which satisfy inequalities of the form
$$|\varphi^{(k)}(\lambda)| < C_k\,(1 + |\lambda|)^{m-k}$$
and define the Schwartz space
$$S(\mathbb{R}) = \bigcap_m S^m(\mathbb{R}).$$
Theorem 3.2. (See [11, Theorem 5.5].) Let D be a geometric operator on a Clifford bundle S over a complete manifold M with bounded geometry. If $\varphi \in S^m(\mathbb{R})$, then ϕ(D) continuously extends to a quasilocal operator $W^r(S) \to W^{r-m}(S)$.
Proof of Theorem 3.1. Fix now a function $\varphi \in S^m(\mathbb{R})$ ($m \le -1$). We are going to show that ϕ(D) is a uniform operator.³ By the above theorem, there is a dominating function μ for ϕ(D), and ϕ(D) extends to a bounded operator $L^2(S) \to W^{-m}(S)$. Fix now also ε > 0 and R > 0. Pick any open subset U ⊂ M with diam(U) ≤ R. Consider now the restriction of ϕ(D) to sections supported on U (denoted $L^2(S|_U)$). This is sufficient to obtain uniformity, since ϕ(D) is selfadjoint. Since μ(r) → 0, there is $r_0 > 0$, such that $\mu(r_0) < \varepsilon/2$. Now decompose
$$\varphi(D)|_{L^2(S|_U)} : L^2(S|_U) \to W^{-m}(S|_{N_{r_0}(U)}) \oplus W^{-m}(S|_{M \setminus N_{r_0}(U)}).$$
By quasilocality, the second component has norm at most ε/2. Hence the argument is finished by proving that the restrictions of ϕ(D) to $L^2(S|_U) \to W^{-m}(S|_{N_{r_0}(U)}) \to L^2(S|_{N_{r_0}(U)})$ are approximable by finite rank operators, such that the ranks depend only on ε > 0 and R ≥ diam(U).
We can now reduce to the case of a torus with a trivial bundle. This just follows from a partition of unity argument and the existence of nice coordinate patches (from bounded geometry). Also note that for a given R, there is a uniform bound on how many of these patches are needed to cover any subset of M with diameter less than $R + 2r_0$.
On the torus $T^n$ with the trivial bundle $E = T^n \times \mathbb{C}^n$, we can use Fourier series. Denote by $P_N : L^2(E) \to L^2(E)$ the orthogonal projection given by replacing the $\vec q$-Fourier coefficient ($\vec q \in \mathbb{Z}^n$) of a function by 0 if $|\vec q| > N$ (in other words, we truncate the Fourier series at N). The absolute values of the Fourier coefficients of a function in $W^{-m}(E)$ decrease at least as fast as $|\vec q|^m$. Consequently, the finite-rank maps $W^{-m}(E) \hookrightarrow L^2(E) \xrightarrow{P_N} L^2(E)$ approximate the inclusion $W^{-m}(E) \hookrightarrow L^2(E)$ in norm for $m \le -1$. Moreover, for a given ε > 0, the rank of an ε-approximant depends only on ε and m. This concludes the proof of the fact that ϕ(D) is uniform if $\varphi \in S^m(\mathbb{R})$ with $m \le -1$.
The passage from $\varphi \in S^m(\mathbb{R})$, $m \le -1$, to ϕ ∈ C0(R) is by the usual approximation argument (together with the fact that uniform operators form a C∗-algebra, see 4.2). Summarizing, for ϕ ∈ C0(R) we have that ϕ(D) is a uniform operator.
Now if χ is any chopping function, then χ(D)² − 1 = (χ² − 1)(D) and χ² − 1 ∈ C0(R), hence the Fredholmness condition follows from the previous argument. Furthermore, the difference of two chopping functions is also in C0(R), and so we are free to choose one particular chopping function (we choose $\chi(t) = \frac{t}{\sqrt{1+t^2}}$) to prove that χ(D) is l-uniformly pseudolocal. We apply a useful formula from [9, Lemma 4.4]:
$$\chi(D) = \frac{2}{\pi} \int_0^\infty \frac{D}{1 + \lambda^2 + D^2}\, d\lambda$$
(convergence in the strong topology), so that
3 Note that the notion of a uniform operator from [11] is different from ours.
$$[\rho(f), \chi(D)] = \frac{2}{\pi} \int_0^\infty \frac{1}{1 + \lambda^2 + D^2} \Bigl( (1 + \lambda^2)\,[\rho(f), D] + D\,[\rho(f), D]\,D \Bigr) \frac{1}{1 + \lambda^2 + D^2}\, d\lambda.$$
Fix ε > 0 and L > 0. We have estimates
• $\bigl\| \frac{D}{1 + \lambda^2 + D^2} \bigr\| \le \frac{1}{2\lambda}$,
• for a smooth f ∈ $C_{R,L}(M)$, [ρ(f), D] is the multiplication operator by the derivative of f, and so we have that ‖[ρ(f), D]‖ ≤ L.
Consequently, the integral in the last display converges in norm; and there exist k > 0 and $\lambda_1, \ldots, \lambda_k$, such that the integral can be approximated within ε > 0 by the sum of the integrands with $\lambda = \lambda_1, \ldots, \lambda_k$. Now each of the operators $\frac{D}{1 + \lambda^2 + D^2}$, $\frac{1}{1 + \lambda^2 + D^2}$ is uniform by the previous considerations: $\frac{t}{1 + \lambda^2 + t^2}, \frac{1}{1 + \lambda^2 + t^2} \in S^{-1}(\mathbb{R})$. This finishes the proof. □
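The integral formula for χ(D) used in the proof above is the functional-calculus version of a scalar identity, which can be checked numerically. The sketch below treats only the scalar case, with D replaced by a real number t; the function names, the substitution λ = tan θ and the midpoint rule are choices made here for illustration and are not taken from [9].

```python
import math

def chi_via_integral(t, n=20000):
    """Midpoint rule for (2/pi) * integral_0^inf t / (1 + lam^2 + t^2) d(lam).

    The substitution lam = tan(theta) maps [0, inf) to [0, pi/2) and gives
    d(lam) = (1 + lam^2) d(theta), so the transformed integrand stays bounded.
    """
    h = (math.pi / 2) / n
    total = 0.0
    for i in range(n):
        theta = (i + 0.5) * h
        lam_sq = math.tan(theta) ** 2
        total += t * (1.0 + lam_sq) / (1.0 + lam_sq + t * t)
    return (2 / math.pi) * total * h

def chi(t):
    """The chopping function chosen in the proof: t / sqrt(1 + t^2)."""
    return t / math.sqrt(1.0 + t * t)

sample_points = [-3.0, -0.5, 0.0, 0.7, 2.0]
max_gap = max(abs(chi_via_integral(t) - chi(t)) for t in sample_points)
```

The two sides agree to within the quadrature error at the sampled points, which is consistent with the closed form of the integral, namely $t \cdot \frac{\pi}{2\sqrt{1+t^2}}$ before the factor $\frac{2}{\pi}$ is applied.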
We finish the section with an observation, which can be applied to obtain uniform Fredholm modules for non-geometric elliptic operators. We assume that a finitely generated discrete group Γ acts cocompactly on M (this assumption implies that M has bounded geometry), and that D commutes with this action. The informal reason for uniformity is that D “looks the same” on each translate of a fundamental domain (which is bounded), and so the approximation properties of D at any place of M are the same as those over a fixed fundamental domain. In this case, just knowing that ϕ(D) is locally compact for ϕ ∈ C0(R) upgrades to:
Claim 1. For any ϕ ∈ C0(R), the operator ϕ(D) is uniform.
Proof. For a given R > 0, we can find a bounded open set U ⊂ M, such that the collection $\{U^\gamma\}_{\gamma \in \Gamma}$ covers M and has Lebesgue number at least R. Construct a continuous function f : M → [0, 1], which is 1 on U and 0 outside a small neighborhood of U. Then for any function g ∈ $C_R(M)$ there is a γ ∈ Γ, such that $g \cdot f^\gamma = g$ (by $f^\gamma$ we denote the translate of f by γ). Then $\rho(g)\varphi(D) = \rho(g f^\gamma)\varphi(D) = \rho(g)\rho(f^\gamma)\varphi(D^\gamma) = \rho(g)(\rho(f)\varphi(D))^\gamma$. Hence (ε, N)-approximability of ρ(g)ϕ(D) is not worse than that of ρ(f)ϕ(D) (which is a compact operator, independent of g). This proves that ϕ(D) is uniform. □
Pseudolocality can now be deduced from the claim in the same way as in the geometric case, provided that [ρ(f), D] is bounded independently of f ∈ $C_{R,L}(M)$.
4. Dual algebras
In analytic K-homology, one can use Voiculescu's theorem and a standard normalizing procedure to express K-homology as the K-theory of a certain C∗-algebra. In this section, we first work on a fixed X-module (H, φ) to obtain a similar isomorphism for the “partial” uniform K-homology groups (Proposition 4.3). To work around Voiculescu's theorem, we express the uniform K-homology as a direct limit of “partial” uniform K-homology groups (Proposition 4.9).
J. Špakula / Journal of Functional Analysis 257 (2009) 88–121
Definition 4.1. Let $H$ be a Hilbert space and let $\phi : C_0(X) \to B(H)$ be a ∗-representation. We define $\Psi^0_\phi(X) \subset B(H)$ to be the set of all l-uniformly pseudolocal operators in $B(H)$ and $\Psi^{-1}_\phi(X) \subset B(H)$ to be the set of all uniform operators. Furthermore, we denote $D^u_\phi(X) = \Psi^0_{\phi\oplus 0}(X) \subset B(H \oplus H)$.
Lemma 4.2. Let $H$ be a Hilbert space and $\phi : C_0(X) \to B(H)$ a ∗-representation. Then $\Psi^0_\phi(X) \subset B(H)$ is a C∗-algebra. Likewise, $\Psi^{-1}_\phi(X) \subset \Psi^0_\phi(X)$ is a C∗-algebra. Furthermore, $\Psi^{-1}_\phi(X)$ is a closed two-sided ideal of $\Psi^0_\phi(X)$.
Proof. We show that $\Psi^0_\phi(X)$ is norm-closed. Assume that $T \in B(H)$ is approximable by l-uniformly pseudolocal operators. Take $\varepsilon > 0$ and $R, L \ge 0$. By assumption, there is an l-uniformly pseudolocal operator $S \in B(H)$ such that $\|T - S\| < \varepsilon/4$. Let $M$ be such that $S$ is $(\varepsilon/2, R, L, M; \phi)$-approximable. Hence for any $f \in C_{R,L}(X)$ there exists $k \in B(H)$ with $\operatorname{rank}(k) \le M$ such that $\|[\phi(f), S] - k\| < \varepsilon/2$. Consequently,
\[ \|[\phi(f), T] - k\| \le \|[\phi(f), T - S]\| + \|[\phi(f), S] - k\| < \varepsilon. \]
In other words, $[\phi(f), T]$ is $(\varepsilon, M)$-approximable. The proof that norm-limits of uniform operators are again uniform is analogous. The identity $[\phi(f), ST] = [\phi(f), S]T + S[\phi(f), T]$ implies that $\Psi^0_\phi(X)$ is closed under multiplication. Likewise, using the identity $\phi(f)ST = [\phi(f), S]T + S\phi(f)T$ we obtain that $\Psi^{-1}_\phi(X)$ is an ideal of $\Psi^0_\phi(X)$ (here we use Remark 2.5). □
For a fixed X-module $(H, \phi)$, define a group $K^u_*(X; \phi)$ in a similar manner as $K^u_*(X)$, except that we consider only (unitary equivalence classes of) uniform Fredholm modules whose Hilbert spaces and $C_0(X)$-actions are direct sums (finite or countably infinite) of $(H \oplus H, \phi \oplus 0)$. A glance at the proofs for $K^u_*(X)$ shows that $K^u_*(X; \phi)$ can also be characterized as a group of (unitary equivalence classes of) uniform Fredholm modules over the sums of $(H \oplus H, \phi \oplus 0)$, with homotopies also taken within this category (see 2.14).
Fix $(H, \phi)$ for the time being, and let us define a homomorphism $\varphi_0 : K_1(D^u_\phi(X)) \to K^u_0(X; \phi)$ as follows: if $U \in M_n(D^u_\phi(X))$ is a unitary representing a $K_1$-class, we set $\varphi_0([U]) = [(H^{2n}, (\phi \oplus 0)^n, U)]$. It is immediate that $(H^{2n}, (\phi \oplus 0)^n, U)$ is a uniform Fredholm module. Since homotopies of unitaries translate into operator homotopies of Fredholm modules and the operations on $K_1$ and $K^u_0$ are both direct sums, we see that $\varphi_0$ is a group homomorphism. Analogously, we obtain a homomorphism $\varphi_1 : K_0(D^u_\phi(X)) \to K^u_1(X; \phi)$ by assigning to a projection $Q \in M_n(D^u_\phi(X))$ the triple $(H^{2n}, (\phi \oplus 0)^n, 2Q - 1)$. It is again easy to check that this triple is a uniform 1-Fredholm module. Since the operations on $K_0$ and $K^u_1$ are both direct sums and homotopies translate to homotopies, we do get a group homomorphism.
Proposition 4.3. The maps $\varphi_* : K_{1-*}(D^u_\phi(X)) \to K^u_*(X; \phi)$ defined above are isomorphisms.
The proof follows the usual route of showing that elements of $K^u_*(X; \phi)$ have nice representatives (cf. [8, Sections 8.3 and 8.4]). It is done in the following three lemmas.
Lemma 4.4. Any element of $K^u_*(X; \phi)$ may be represented by a uniform Fredholm module of the form $(H^{2n}, (\phi \oplus 0)^n, S)$, where $\|S\| \le 1$. Furthermore, homotopies can also be assumed to have this property.
Proof. This is a standard cutting argument. We first deal with the even case. Take any representative $(H^{2n}, (\phi \oplus 0)^n, S)$. Consider the matrix $\tilde S = \begin{pmatrix} 0 & S \\ S^* & 0 \end{pmatrix}$. It represents an odd selfadjoint operator in $B(H^{4n})$ whose square differs from 1 by a uniform operator. Take the cutting function $c : \mathbb{R} \to \mathbb{R}$ given by
\[ c(t) = \begin{cases} -1, & \text{if } t < -1, \\ t, & \text{if } -1 \le t \le 1, \\ 1, & \text{if } t > 1. \end{cases} \]
By functional calculus, $c(\tilde S)$ is again an odd selfadjoint operator (since $c$ is odd), but with $\|c(\tilde S)\| \le 1$. Denote by $T$ the upper right corner of $c(\tilde S)$. Then $\|T\| \le 1$, and $T - S$ is uniform. The last statement can be seen by referring to the theorem on the essential spectrum of selfadjoint operators. The proof is completed by applying Lemma 2.16. The odd case is even more straightforward, since we may take a representative $(H^{2n}, (\phi \oplus 0)^n, P)$ with $P = P^*$. Hence we can apply the cutting directly to $P$ and replace it by $c(P)$. The same procedures can be applied to whole homotopies. □
Lemma 4.5. Any element of $K^u_0(X; \phi)$ may be represented by a uniform 0-Fredholm module of the form $(H^{2n}, (\phi \oplus 0)^n, S)$, where $S$ is a unitary. Furthermore, the homotopies can also be assumed to have this property.
Proof. Take a representative $(H^{2n}, (\phi \oplus 0)^n, S)$ such that $\|S\| \le 1$. For simplicity, assume $n = 1$, so that $S = \begin{pmatrix} T & S_{12} \\ S_{21} & S_{22} \end{pmatrix}$, $T, S_{ij} \in B(H)$. It follows that $\|T\| \le 1$, so the operator
\[ U = \begin{pmatrix} T & -\sqrt{1 - TT^*} \\ \sqrt{1 - T^*T} & T^* \end{pmatrix} \]
is well defined and unitary. Since $S$ is l-uniformly pseudolocal, $T$ is l-uniformly pseudolocal and for any $\varepsilon > 0$, $R, L \ge 0$ there exists $M > 0$ such that $\phi(f)S_{12}$ and $S_{21}\phi(f)$ are $(\varepsilon, M)$-approximable for all $f \in C_{R,L}(X)$. Using this and uniformity of $1 - SS^*$ and $1 - S^*S$, we conclude that $1 - T^*T$ and $1 - TT^*$ are uniform. Since $\Psi^{-1}_\phi(X)$ is a C∗-algebra, their square roots are uniform as well. Consequently, $S - U$ is uniform, and another application of Lemma 2.16 finishes the proof. Again, we can apply this procedure to the whole homotopy. □
Lemma 4.6. Any class in $K^u_1(X; \phi)$ can be represented by a uniform 1-Fredholm module of the form $(H^{2n}, (\phi \oplus 0)^n, P)$, where $P^2 = 1$.
Proof. We proceed similarly as in the previous lemma. Choose a representative $(H^{2n}, (\phi \oplus 0)^n, P)$ such that $P = P^*$ and $\|P\| \le 1$. For simplicity, we assume that $n = 1$, and so $P = \begin{pmatrix} Q & P_{12} \\ P_{21} & P_{22} \end{pmatrix}$, where $Q, P_{ij} \in B(H)$. It follows that $Q$ is also selfadjoint and contractive. Therefore, the operator
\[ O = \begin{pmatrix} Q & \sqrt{1 - Q^2} \\ \sqrt{1 - Q^2} & -Q \end{pmatrix} \]
is selfadjoint with $O^2 = 1$.
As in the previous proof, we obtain that $1 - Q^2$ is uniform and that $P - O$ is uniform as well. This finishes the proof. □
Let us now turn to the relationship between the groups $K^u_*(X, \phi)$ for different $\phi$. We shall need another definition (which is more general than what we need at the moment, but full generality will be required later):
Definition 4.7. Let $X$ and $Z$ be spaces, let $\varphi : C_0(X) \to C_0(Z)$ be a ∗-homomorphism, and let $\phi_X : C_0(X) \to B(H_X)$ and $\phi_Z : C_0(Z) \to B(H_Z)$ be ∗-representations. We say that an isometry $V : H_Z \to H_X$ uniformly covers $\varphi$ if for every $\varepsilon > 0$, $R, L \ge 0$ there exists $M \ge 0$ such that $V^*\phi_X(f)V - \phi_Z(\varphi(f))$ is $(\varepsilon, M)$-approximable for every $f \in C_{R,L}(X)$. In short, $V^*\phi_X(\cdot)V \sim_{lua} \phi_Z(\varphi(\cdot))$.
We introduce a relation $\prec$ on the set $\mathcal{X}$ of (unitary equivalence classes of) ∗-representations $\phi$ of $C_0(X)$ on some (separable) Hilbert space, which turns it into a directed system. We define the relation $\prec$ by declaring that $(H, \phi) \prec (E, \rho)$ (or just $\phi \prec \rho$) if and only if there exists an isometry $V_{\phi,\rho} : H \to E$ which uniformly covers the identity map $\mathrm{id} : C_0(X) \to C_0(X)$. The reflexivity of $\prec$ is obvious and the transitivity becomes clear after a momentary reflection on the definition of uniform covering. Furthermore, for $\phi, \rho \in \mathcal{X}$, we easily see that $\phi \prec \phi \oplus \rho$ and $\rho \prec \phi \oplus \rho$.
If $\phi \prec \rho$, then we obtain a homomorphism $i_{V_{\phi,\rho}} : K^u_*(X, \phi) \to K^u_*(X, \rho)$ using Proposition 4.3 and the fact that $\mathrm{Ad}(V_{\phi,\rho})$ maps $\Psi^0_\phi(X)$ into $\Psi^0_\rho(X)$ (where $\mathrm{Ad}(V)$ is defined as $\mathrm{Ad}(V)(T) = VTV^*$ and it is a ∗-homomorphism when $V$ is an isometry). This fact is a special case (when $Z = X$ and $\pi = \mathrm{id}$) of Lemma 5.4 from the next section, where we prove a more general statement requiring new notation. The set of groups $K^u_*(X, \phi)$, together with the maps $i_{V_{\phi,\rho}}$, becomes a directed system indexed by $\mathcal{X}$. The next lemma ensures that we may arbitrarily choose (and fix that choice of) an isometry $V_{\phi,\rho}$ for each pair $\phi \prec \rho$.
Lemma 4.8. We adopt the notation from Definition 4.7. If two isometries $V_1, V_2 : H_Z \to H_X$ uniformly cover $\varphi$, then the induced maps on K-theory are the same:
\[ \mathrm{Ad}(V_1)_* = \mathrm{Ad}(V_2)_* : K_*\bigl(\Psi^0_{\phi_Z}(Z)\bigr) \to K_*\bigl(\Psi^0_{\phi_X}(X)\bigr). \]
(Note that by the proof of Lemma 5.4, the maps $\mathrm{Ad}(V_i)$ really map $\Psi^0_{\phi_Z}(Z)$ into $\Psi^0_{\phi_X}(X)$.)
This lemma is analogous to the second part of [8, Lemma 5.2.4] and the proof carries over verbatim. This lemma also implies that $\prec$ becomes antisymmetric when it descends to the groups $K^u_*(X, \phi)$. For each $\phi$ there is an obvious homomorphism $j_\phi : K^u_*(X; \phi) \to K^u_*(X)$. It is also clear that the maps $j_\phi$ commute with the maps $i_{V_{\phi,\rho}}$, which allows us to state the final proposition of this section:
Proposition 4.9. With the notation above,
\[ K^u_*(X) = \varinjlim_{\phi \in \mathcal{X}} j_\phi\bigl(K^u_*(X, \phi)\bigr). \]
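The matrix constructions in the proofs of Lemmas 4.5 and 4.6 can be sanity-checked in the simplest possible setting. The following Python sketch treats only the scalar case, where $T$ and $Q$ are replaced by numbers of modulus at most 1, so that $\sqrt{1 - TT^*} = \sqrt{1 - T^*T}$ is an ordinary real square root. The scalar reduction and all helper names (`unitary_dilation`, `symmetry_dilation`, `matmul`, `adjoint`, `is_identity`) are assumptions made here for illustration; they are not part of the original argument.

```python
import math

def matmul(a, b):
    """Multiply two square matrices given as nested lists."""
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def adjoint(a):
    """Conjugate transpose of a square matrix given as nested lists."""
    n = len(a)
    return [[complex(a[j][i]).conjugate() for j in range(n)] for i in range(n)]

def unitary_dilation(t):
    """Scalar model of U from the proof of Lemma 4.5, for complex t with |t| <= 1."""
    s = math.sqrt(1.0 - abs(t) ** 2)  # stands in for sqrt(1 - T T*) = sqrt(1 - T* T)
    return [[t, -s], [s, complex(t).conjugate()]]

def symmetry_dilation(q):
    """Scalar model of O from the proof of Lemma 4.6, for real q with |q| <= 1."""
    s = math.sqrt(1.0 - q ** 2)
    return [[q, s], [s, -q]]

def is_identity(m, tol=1e-12):
    """Check that m is the identity matrix up to the tolerance tol."""
    n = len(m)
    return all(abs(m[i][j] - (1.0 if i == j else 0.0)) < tol
               for i in range(n) for j in range(n))

u = unitary_dilation(0.3 + 0.4j)
o = symmetry_dilation(-0.6)
print(is_identity(matmul(u, adjoint(u))), is_identity(matmul(adjoint(u), u)))
print(is_identity(matmul(o, o)))
```

In the operator case the same algebra goes through because $T\sqrt{1 - T^*T} = \sqrt{1 - TT^*}\,T$; the scalar model only illustrates the bookkeeping of signs and adjoints.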
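5. Mayer–Vietoris sequence
The goal of this section is to prove the Mayer–Vietoris sequence for uniform K-homology groups:
Theorem 5.1. Let $A, B \subset X$ be closed subsets of $X$ such that $A \cup B = X$, $\operatorname{int}(A \cap B) \neq \emptyset$ and $d(A \setminus B, B \setminus A) > 0$.⁴ Then there is a 6-term exact sequence
\[
\begin{array}{ccccc}
K^u_0(A \cap B) & \longrightarrow & K^u_0(A) \oplus K^u_0(B) & \longrightarrow & K^u_0(X) \\
\uparrow & & & & \downarrow \\
K^u_1(X) & \longleftarrow & K^u_1(A) \oplus K^u_1(B) & \longleftarrow & K^u_1(A \cap B).
\end{array}
\]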
Before outlining the proof, we need a definition:
Definition 5.2. Given a Hilbert space $H$ and a ∗-representation $\phi : C_0(X) \to B(H)$, we let $\Psi^0_\phi(X, Z) \subset \Psi^0_\phi(X)$ be the set of all operators $T \in \Psi^0_\phi(X)$ which are uniform on $X \setminus Z$, that is, such that for every $\varepsilon > 0$, $R \ge 0$, there exists $M > 0$ such that for every $f \in C_R(X)$ with $f|_Z = 0$ we have that $\phi(f)T$ and $T\phi(f)$ are $(\varepsilon, M)$-approximable. Also, we set $D^u_\phi(X, Z) = \Psi^0_{\phi\oplus 0}(X, Z) \subset B(H \oplus H)$.
Note that a proof similar to the proof of Lemma 4.2 yields that $\Psi^0_\phi(X, Z)$ is a closed two-sided ideal of $\Psi^0_\phi(X)$.
Proof of 5.1. The strategy is to first use the C∗-algebra Mayer–Vietoris sequence (with $\phi$ fixed), and then apply Propositions 4.3, 4.9 and the Excision Lemma 5.3 to obtain the result. Keeping the notation from 5.1, we have that $D^u_\phi(X, A) \cap D^u_\phi(X, B) = D^u_\phi(X, A \cap B)$ directly from the definitions, and $D^u_\phi(X, A) + D^u_\phi(X, B) = D^u_\phi(X)$ (by a partition of unity argument⁵). Subsequently, from the C∗-algebra Mayer–Vietoris sequence, we get that
\[
\begin{array}{ccccc}
K_0(D^u_\phi(X, A \cap B)) & \longrightarrow & K_0(D^u_\phi(X, A)) \oplus K_0(D^u_\phi(X, B)) & \longrightarrow & K_0(D^u_\phi(X)) \\
\uparrow & & & & \downarrow \\
K_1(D^u_\phi(X)) & \longleftarrow & K_1(D^u_\phi(X, A)) \oplus K_1(D^u_\phi(X, B)) & \longleftarrow & K_1(D^u_\phi(X, A \cap B))
\end{array}
\]
is exact. The general Mayer–Vietoris sequence now follows by “taking the direct limit”, i.e. using naturality of our constructions, Proposition 4.9 and the Excision Lemma 5.3. □
⁴ This last condition just expresses the requirement that “the overlap of A and B does not get arbitrarily thin”. It is used only in the next footnote.
⁵ Take $f, g \in C_b(X)$ with $f + g = 1$, $f|_{X \setminus A} = 0$ and $g|_{X \setminus B} = 0$, such that $f, g$ are L-continuous for some $L$ (this is possible since $d(A \setminus B, B \setminus A) > 0$). Write $T = T\phi(f) + T\phi(g)$. Now if $h|_A = 0$, then $T\phi(f)\phi(h) = 0$ and $\phi(h)T\phi(f) = [\phi(h), T]\phi(f) + T\phi(h)\phi(f)$.
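It remains to deal with the excision lemma. For the rest of this section, we shall denote by $X$ a proper metric space, by $Z \subseteq X$ a closed subset of $X$ and by $\pi : C_0(X) \to C_0(Z)$ the restriction homomorphism.
Lemma 5.3 (Excision lemma). There is a natural isomorphism
\[ \varinjlim_{\phi} K_*\bigl(D^u_\phi(X, Z)\bigr) \cong \varinjlim_{\phi_Z} K_*\bigl(D^u_{\phi_Z}(Z)\bigr). \]
By virtue of 4.9, we may say that the “relative uniform K-homology” $K^u_*(X, Z)$ is isomorphic to $K^u_*(Z)$.
Proof. The strategy is to obtain a commutative diagram (notation will be introduced in the course of the proof)
\[
\begin{array}{ccccccc}
 & & K_*\bigl(\Psi^0_{\phi_X}(X, Z)\bigr) & & \xrightarrow{\ \mathrm{Ad}(SW)\ } & & K_*\bigl(\Psi^0_{\phi'_X}(X, Z)\bigr) \\
 & \nearrow{\scriptstyle \mathrm{Ad}(V)} & & \searrow{\scriptstyle \mathrm{Ad}(W)} & & \nearrow{\scriptstyle \mathrm{Ad}(S)} & \\
K_*\bigl(\Psi^0_{\phi_Z}(Z)\bigr) & & \xrightarrow{\ \mathrm{Ad}(WV)\ } & & K_*\bigl(\Psi^0_{\phi'_Z}(Z)\bigr) & &
\end{array}
\tag{1}
\]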
starting with the following data: a representation $\phi_X : C_0(X) \to B(H_X)$, a representation $\phi_Z : C_0(Z) \to B(H_Z)$ and an isometry $V : H_Z \to H_X$ which uniformly covers $\pi$ (this gives the first arrow in (1)). In the diagram, the horizontal arrows shall uniformly cover the identity (on the level of K-theory), and the diagonals heading up will uniformly cover $\pi$. This would establish the lemma.
Let us now explain how we can arrange the starting data. If we start with a ∗-representation $\phi_X : C_0(X) \to B(H_X)$, it induces a Borel measure on $X$, and extends to a ∗-representation (also denoted by $\phi_X$) of $L^\infty(X)$. In particular, we may restrict $\phi_X$ to a representation $\phi_Z : C_0(Z) \to B(H_X)$ and let $H_Z = \chi_Z H_X$. Then the inclusion $V : H_Z \to H_X$ actually exactly covers $\pi$, i.e. $V^*\phi_X(f)V = \phi_Z(\pi(f)) = \phi(\chi_Z f)$ for all $f \in C_0(X)$. Conversely, starting with a ∗-representation $\phi_Z : C_0(Z) \to B(H_Z)$, we obtain a ∗-representation $\phi_X = \phi_Z \circ \pi$ of $C_0(X)$, so that we can put $H_X = H_Z$ and $V = \mathrm{id}$. The rest of the proof is devoted to obtaining a diagram (1) from given $\phi_X$, $\phi_Z$ and $V$ uniformly covering $\pi$. We accomplish our goal similarly to [8, Proof of 3.5.7].
Lemma 5.4. Let $\phi_X : C_0(X) \to B(H_X)$, $\phi_Z : C_0(Z) \to B(H_Z)$ be ∗-representations and let $V : H_Z \to H_X$ be an isometry which uniformly covers $\pi$. Then $\mathrm{Ad}(V)\bigl(\Psi^0_{\phi_Z}(Z)\bigr) \subset \Psi^0_{\phi_X}(X, Z)$. (The adjoint map $\mathrm{Ad}$ is defined as $\mathrm{Ad}(V)(T) = VTV^*$, and it is a ∗-homomorphism since $V$ is an isometry.)
Proof. We first show that $VV^* \in \Psi^0_{\phi_X}(X, Z)$. Decompose $H_X = VV^*H_X \oplus (1 - VV^*)H_X$. With respect to this decomposition $VV^* = \begin{pmatrix} 1 & 0 \\ 0 & 0 \end{pmatrix}$, and we denote $\phi_X = \begin{pmatrix} \phi_{11} & \phi_{12} \\ \phi_{21} & \phi_{22} \end{pmatrix}$. The fact that $VV^*$ is $\phi_X$-uniformly pseudolocal is equivalent to $\phi_{12}(\cdot)$ and $\phi_{21}(\cdot)$ being l-uniformly approximable.
Using the covering assumption,
\[
\begin{aligned}
\phi_{11}(f^*f) = VV^*\phi_X(f^*f)VV^* &\sim_{lua} V\phi_Z\bigl(\pi(f^*f)\bigr)V^* = V\phi_Z\bigl(\pi(f)\bigr)^*\phi_Z\bigl(\pi(f)\bigr)V^* \\
&\sim_{lua} VV^*\phi_X(f)^*VV^*\phi_X(f)VV^* = \phi_{11}(f)^*\phi_{11}(f).
\end{aligned}
\]
Since $\phi_X$ is a ∗-homomorphism, we have
\[ \phi_{21}(f)^*\phi_{21}(f) = \phi_{11}(f^*f) - \phi_{11}(f)^*\phi_{11}(f) \tag{2} \]
for each $f \in C_0(X)$. In other words, $\phi_{21}(\cdot)^*\phi_{21}(\cdot)$ is l-uniformly approximable. Using the spectral theorem for compact selfadjoint operators,⁶ also $\sqrt{\phi_{21}(\cdot)^*\phi_{21}(\cdot)} = |\phi_{21}(\cdot)|$ is l-uniformly approximable. Let $\phi_{21}(f) = u(f)|\phi_{21}(f)|$ denote the polar decomposition. From this formula, it follows that $\phi_{21}(f)$ is l-uniformly approximable as well.
To show that $VV^*$ is uniform on $X \setminus Z$, it suffices to note that in addition to $\phi_{12}(\cdot)$ and $\phi_{21}(\cdot)$ being l-uniformly approximable, we also have $\phi_{11}(f) = VV^*\phi_X(f)VV^* \sim_{lua} V\phi_Z(\pi(f))V^* = 0$ for $f \in C_0(X \setminus Z)$. We have shown that $VV^* \in \Psi^0_{\phi_X}(X, Z)$. From this, we easily get that $\mathrm{Ad}(V)$ maps $\Psi^0_{\phi_Z}(Z)$ into $\Psi^0_{\phi_X}(X, Z)$. □
Let $\sigma : C_0(Z) \to C_0(X)$ be a completely positive lift of $\pi$ that satisfies
• if $f \in C_R(X)$ then $\operatorname{supp}(\sigma(\pi(f))) \subset \{x \in X \mid d(x, \operatorname{supp}(f)) \le 1\}$,
• there exists $L'$ such that if $f$ is L-continuous then $\sigma(f)$ is $(L + L')$-continuous.
In particular, if $g \in C_R(Z)$ then $\sigma(g) \in C_{R+2}(X)$. Such a lift exists.⁷ Now $\phi_X\sigma : C_0(Z) \to B(H_X)$ is a completely positive map, so by Stinespring's theorem, there exist a Hilbert space $H'$ and maps $\rho_{12}, \rho_{21}, \rho_{22}$ such that
\[ \phi'_Z = \begin{pmatrix} \phi_X\sigma & \rho_{12} \\ \rho_{21} & \rho_{22} \end{pmatrix} : C_0(Z) \to B(H_X \oplus H') \]
is a ∗-homomorphism. Denote by $W : H_X \to H_X \oplus H'$ the obvious inclusion.
Claim 2. $\mathrm{Ad}(W)$ maps $\Psi^0_{\phi_X}(X, Z)$ into $\Psi^0_{\phi'_Z}(Z)$. Furthermore $WV$ uniformly covers $\mathrm{id} : C_0(Z) \to C_0(Z)$. In other words, $V^*W^*\phi'_Z(\cdot)WV - \phi_Z(\cdot)$ is l-uniformly approximable on $C_0(Z)$.
Proof. Decomposing into matrices shows that $\mathrm{Ad}(W)(T)$ belongs to $\Psi^0_{\phi'_Z}(Z)$ if and only if both $T\rho_{12}(\cdot)$ and $\rho_{21}(\cdot)T$ are l-uniformly approximable. Since $\phi'_Z$ is a ∗-homomorphism, $\rho_{21}(f)^*\rho_{21}(f) \in \phi_X(C_0(X \setminus Z))$ for all $f \in C_0(Z)$, cf. (2). Hence $\rho_{21}(f)^*\rho_{21}(f)T$ is l-uniformly approximable. Consequently, $(\rho_{21}(f)T)^*(\rho_{21}(f)T)$ is l-uniformly approximable as well, and it follows by the argument in the proof of Lemma 5.4 that $\rho_{21}(f)T$ itself is as well. This finishes the first part. To see that $WV$ uniformly covers $\mathrm{id}$ on $C_0(Z)$, just observe that for $f \in C_0(Z)$, we have $V^*W^*\phi'_Z(f)WV = V^*\phi_X(\sigma(f))V \sim_{lua} \phi_Z(\pi(\sigma(f))) = \phi_Z(f)$ by the assumption on $V$. □
The next step is to consider the Hilbert space $H'_X = H_X \oplus (H_X \oplus H')$ with the ∗-representation $\phi'_X = \phi_X \oplus \phi'_Z\pi$ of $C_0(X)$. Denote by $S : H_X \oplus H' \to H'_X$ the inclusion ($H_X$ is included as the second $H_X$ summand).
⁶ If $k \in K$ is selfadjoint, then for $\varepsilon > 0$ we can approximate $k$ by a rank-$M$ operator, where $M$ is the sum of dimensions of eigenspaces corresponding to all eigenvalues $\lambda$ with $|\lambda| > \varepsilon$.
⁷ Note that a positive map between commutative C∗-algebras is automatically completely positive, and a nice positive linear lift can be constructed using a linear basis and an Urysohn lemma-type construction. The L-continuity can also be arranged.
Claim 3. $S$ uniformly covers $\pi$. $\mathrm{Ad}(SW)$ is homotopic to a ∗-homomorphism which uniformly covers $\mathrm{id} : C_0(X) \to C_0(X)$. Hence we are in the position to iterate the construction we have done so far to obtain a commutative diagram (1).
Proof. In fact, $S$ actually exactly covers $\pi$, since $S^*\phi'_X S = \phi'_Z\pi$. Continuing with the second part of the claim, note that $SW$ includes $H_X$ into $H_X \oplus H_X \oplus H'$ as the second copy of $H_X$. If we denote by $Y : H_X \to H_X \oplus H_X \oplus H'$ the inclusion as the first summand, then $\mathrm{Ad}(Y)$ exactly covers $\mathrm{id} : C_0(X) \to C_0(X)$. Furthermore, $\mathrm{Ad}(SW)$ and $\mathrm{Ad}(Y)$ are homotopic via the homotopy of ∗-homomorphisms
\[
A_t : T \mapsto \begin{pmatrix}
\sin^2(\tfrac{\pi}{2}t)\,T & \sin(\tfrac{\pi}{2}t)\cos(\tfrac{\pi}{2}t)\,T & 0 \\
\sin(\tfrac{\pi}{2}t)\cos(\tfrac{\pi}{2}t)\,T & \cos^2(\tfrac{\pi}{2}t)\,T & 0 \\
0 & 0 & 0
\end{pmatrix}, \qquad t \in [0, 1].
\]
It remains to verify that $A_t$ maps $\Psi^0_{\phi_X}(X, Z)$ into $\Psi^0_{\phi'_X}(X, Z)$. To this end, it is enough to observe that if $T \in \Psi^0_{\phi_X}(X, Z)$, then
\[ \tilde T = \begin{pmatrix} T & T & 0 \\ T & T & 0 \\ 0 & 0 & 0 \end{pmatrix} \in B(H_X \oplus H_X \oplus H') \]
is $\phi'_X$-l-uniformly pseudolocal and uniform on $C_0(X \setminus Z)$. For $f \in C_0(X)$, we compute
\[
\bigl[\tilde T, \phi'_X(f)\bigr] = \begin{pmatrix}
T\phi_X(f) - \phi_X(f)T & T\phi_X\sigma\pi(f) - \phi_X(f)T & 0 \\
T\phi_X(f) - \phi_X\sigma\pi(f)T & T\phi_X\sigma\pi(f) - \phi_X\sigma\pi(f)T & 0 \\
0 & 0 & 0
\end{pmatrix}.
\]
It is now clear that for showing l-pseudolocality of $\tilde T$ it suffices to see that $T\phi_X(f) - \phi_X\sigma\pi(f)T = [T, \phi_X(f)] + \phi_X(f - \sigma\pi(f))T$ is l-uniformly approximable. But $f - \sigma\pi(f) \in C_0(X \setminus Z)$, hence the assertion follows from the assumptions on $T$ and the lift $\sigma$. Similarly
\[
\tilde T\phi'_X(f) = \begin{pmatrix}
T\phi_X(f) & T\phi_X\sigma\pi(f) & 0 \\
T\phi_X(f) & T\phi_X\sigma\pi(f) & 0 \\
0 & 0 & 0
\end{pmatrix}
\]
and the uniformness of $\tilde T$ on $C_0(X \setminus Z)$ follows from the observation that $\pi(f) = 0$ for $f \in C_0(X \setminus Z)$. This finishes the proof of Lemma 5.3. □ □
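The rotation homotopy $A_t$ from the proof of Claim 3 has the form $T \mapsto v_t T v_t^*$ with $v_t = (\sin(\tfrac{\pi}{2}t), \cos(\tfrac{\pi}{2}t), 0)^{\mathsf T}$, which is why each $A_t$ is multiplicative. The sketch below checks this numerically in a scalar model, where $T$ is replaced by a complex number; this reduction and the helper names (`a_t`, `matmul`, `close`) are assumptions made for illustration only.

```python
import math

def a_t(t, x):
    """Scalar model of A_t(T): the 3x3 matrix from the proof of Claim 3 with T = x."""
    s = math.sin(math.pi * t / 2)
    c = math.cos(math.pi * t / 2)
    return [[s * s * x, s * c * x, 0.0],
            [s * c * x, c * c * x, 0.0],
            [0.0, 0.0, 0.0]]

def matmul(a, b):
    """Multiply two square matrices given as nested lists."""
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def close(a, b, tol=1e-12):
    """Entrywise comparison of two square matrices up to the tolerance tol."""
    n = len(a)
    return all(abs(a[i][j] - b[i][j]) < tol for i in range(n) for j in range(n))

x, y = 2.0 - 1.0j, 0.5 + 3.0j
for t in (0.0, 0.25, 0.5, 1.0):
    # Multiplicativity: A_t(x) A_t(y) = A_t(x y), since sin^2 + cos^2 = 1.
    assert close(matmul(a_t(t, x), a_t(t, y)), a_t(t, x * y))

# Endpoints: t = 0 puts x in the second diagonal slot (the model of Ad(SW)),
# t = 1 puts x in the first diagonal slot (the model of Ad(Y)).
print(close(a_t(0.0, x), [[0, 0, 0], [0, x, 0], [0, 0, 0]]))
print(close(a_t(1.0, x), [[x, 0, 0], [0, 0, 0], [0, 0, 0]]))
```

The check says nothing about pseudolocality or uniformity of $A_t(T)$; it only illustrates that the path connects the two corner embeddings through ∗-homomorphisms.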
6. Coarse geometry and C∗-algebras
The first part of this section is devoted to a review of basic notions from coarse geometry. The second part recalls the definitions of C∗-algebras reflecting the coarse structure: (uniform) Roe C∗-algebras.
Coarse geometry studies large-scale behavior of spaces. While it is possible to give an abstract definition of a coarse structure (see [13]), for our purposes it is sufficient and more straightforward to assume that our spaces are endowed with a metric. The appropriate notion of maps in the “coarse category” is the following:
Definition 6.1. A (not necessarily continuous) map $g : X \to Z$ between metric spaces $X$ and $Z$ is said to be coarse if:
• For any $r \ge 0$ there exists $R \ge 0$ such that $d(x_1, x_2) \le r$ implies $d(g(x_1), g(x_2)) \le R$ for $x_1, x_2 \in X$. An equivalent condition is that there exists a non-decreasing function $\rho_+ : \mathbb{R}_+ \to \mathbb{R}_+$ such that $d(g(x_1), g(x_2)) \le \rho_+(d(x_1, x_2))$.
• For any $r \ge 0$ we have $\operatorname{diam}(g^{-1}(B(z, r))) < \infty$ for all $z \in Z$. This condition is referred to as being cobounded.
Furthermore, we say that $g$ is uniformly cobounded if for any $r \ge 0$ we have
\[ R_g(r) := \sup_{z \in Z} \operatorname{diam}\bigl(g^{-1}(B(z, r))\bigr) < \infty. \]
When working in the “coarse category”, we may choose a nice representative in the class of coarsely equivalent spaces:
Definition 6.2. A metric space $Y$ is said to be uniformly discrete if there is $\delta > 0$ such that $d(x, y) \ge \delta$ whenever $x \neq y \in Y$. Furthermore, $Y$ is said to have bounded geometry if for any $r \ge 0$ we have
\[ \sup_{y \in Y} |B(y, r)| < \infty. \]
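To make the two quantities in Definition 6.2 concrete, the following sketch computes the separation constant and the maximal ball cardinality on a finite patch of the integer lattice $\mathbb{Z}^2$ with the Euclidean metric. A finite set trivially satisfies both conditions, so the patch only serves as a stand-in for the infinite lattice; the function names `separation` and `max_ball_size` and the choice of example are ours.

```python
import math
from itertools import combinations

def separation(points):
    """Largest delta with d(x, y) >= delta for all distinct x, y in a finite sample."""
    return min(math.dist(p, q) for p, q in combinations(points, 2))

def max_ball_size(points, r):
    """Maximum over y of |B(y, r)| (closed balls), computed inside the finite sample."""
    return max(sum(1 for q in points if math.dist(p, q) <= r) for p in points)

# An 11 x 11 patch of the lattice Z^2 centered at the origin.
grid = [(i, j) for i in range(-5, 6) for j in range(-5, 6)]

print(separation(grid))        # 1.0
print(max_ball_size(grid, 1))  # 5
print(max_ball_size(grid, 2))  # 13
```

For the full lattice the same values are attained at every point, which is what bounded geometry asks for: the bound depends on $r$ but not on the center $y$.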
When switching between discrete and “continuous” spaces, the following concept has proved to be useful:
Definition 6.3. Let $X$ be a metric space and let $d \ge 0$. The Rips complex $P_d(X)$ is a simplicial polyhedron defined as follows:
• the vertex set of $P_d(X)$ is $X$,
• any $q + 1$ vertices $x_0, x_1, \dots, x_q$ span a simplex of $P_d(X)$ if and only if $d(x_i, x_j) \le d$ for all $i, j \in \{0, \dots, q\}$.
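Definition 6.3 translates directly into a brute-force enumeration for a finite metric space. The sketch below lists the simplices of $P_d(X)$ up to a chosen dimension for points in Euclidean space; the function name `rips_simplices`, the `max_dim` cutoff and the example of the unit square are assumptions made for illustration, and the enumeration is only practical for very small point sets.

```python
import math
from itertools import combinations

def rips_simplices(points, d, max_dim=2):
    """Simplices of the Rips complex P_d(points), as tuples of vertex indices.

    A set of q + 1 vertices spans a q-simplex iff all pairwise distances are <= d.
    Only simplices of dimension <= max_dim are listed.
    """
    n = len(points)
    near = [[math.dist(points[i], points[j]) <= d for j in range(n)]
            for i in range(n)]
    simplices = []
    for q in range(max_dim + 1):
        for verts in combinations(range(n), q + 1):
            if all(near[i][j] for i, j in combinations(verts, 2)):
                simplices.append(verts)
    return simplices

square = [(0, 0), (1, 0), (1, 1), (0, 1)]
print(len(rips_simplices(square, 1.0)))  # 8: four vertices and four sides
print(len(rips_simplices(square, 1.5)))  # 14: four vertices, six edges, four triangles
```

At scale 1 the diagonals (of length $\sqrt 2$) are excluded, so the complex is the boundary of the square; at scale 1.5 every pair is joined and all triangles appear.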
Note that if $X$ has bounded geometry, $P_d(X)$ is locally finite and finite dimensional. We endow it with the geodesic metric.
We now define C∗-algebras which reflect large-scale behavior of metric spaces. Let $Y$ be a uniformly discrete metric space with bounded geometry. We consider the Hilbert space $\ell^2(Y) \otimes \ell^2(\mathbb{N}) \cong \ell^2(Y \times \mathbb{N})$ (or $\ell^2(Y)$), and represent bounded operators $T$ on it as matrices $T = (t_{yx})_{x,y \in Y}$ with entries $t_{yx}$ in $B(\ell^2(\mathbb{N}))$ (or $\mathbb{C}$ respectively).
Definition 6.4 (Finite propagation: Discrete version). We say that an operator $T = (t_{yx}) \in B(\ell^2(Y \times \mathbb{N}))$ (or $B(\ell^2(Y))$) has finite propagation if there exists $R \ge 0$ such that $t_{yx} = 0$ whenever $d(x, y) > R$. The smallest such $R$ is called the propagation of $T$ and denoted by $\operatorname{propagation}(T)$.
Definition 6.5. We say that $T$ is locally compact if $t_{yx} \in \mathcal{K}(\ell^2(\mathbb{N}))$ for all $x, y \in Y$. (This condition is void in the case $T \in B(\ell^2(Y))$.) We say that $T$ has uniformly bounded coefficients if there exists $C > 0$ such that $\|t_{yx}\| \le C$ for all $x, y \in Y$.
Definition 6.6. The norm-closure of the algebra of all finite propagation operators with uniformly bounded coefficients in $B(\ell^2(Y))$ is said to be the uniform Roe C∗-algebra of $Y$, denoted by $C^*_u Y$. We denote by $C^*_k(Y)$ the norm-closure of the algebra of all locally compact finite propagation operators $T = (t_{yx})$ with uniformly bounded coefficients in $B(\ell^2(Y \times \mathbb{N}))$ which satisfy the additional condition that the set $\{t_{yx} \mid x, y \in Y\} \subset \mathcal{K}(\ell^2(\mathbb{N}))$ is compact in the norm topology on $\mathcal{K}(\ell^2(\mathbb{N}))$.
Remark 6.7. The additional condition in the previous definition merely says that up to $\varepsilon$, we have only finitely many entries $t_{yx}$. Another way of stating this condition is that for each $\varepsilon > 0$ there exists $M \ge 0$ such that each $t_{xy}$, $x, y \in Y$, is at distance at most $\varepsilon$ from a rank-$M$ operator.
Remark 6.8. The C∗-algebra $C^*_u Y$ is not functorial under coarse uniformly cobounded maps, as the examples of one-point and two-point spaces show. Nevertheless, coarsely equivalent spaces have Morita equivalent uniform Roe C∗-algebras, see [5]. This corresponds to the fact that $C^*_k(Y)$ is functorial under such maps.
We now cite a proposition which provides an estimate on the norm of an operator in terms of its entries:
Proposition 6.9. (See [13].) Let $Y$ be a uniformly discrete space with bounded geometry, and let $t = (t_{yz})_{y,z \in Y}$ be a matrix with entries $t_{yz} \in \mathcal{K}(H)$ [or $t_{yz} \in \mathbb{C}$]. For every $P > 0$ there is $C > 0$ such that if $t$ has propagation at most $P$, we have $\|t\| \le C \sup_{y,z} \|t_{yz}\|$, with the operator norm in $B(\ell^2 Y \otimes H)$ [or $B(\ell^2 Y)$ respectively].
To finish the section, we show that as far as K-theory of uniform Roe algebras is concerned, we may work with $C^*_k(Y)$.
Lemma 6.10. Let $Y$ be a uniformly discrete metric space with bounded geometry. Then $C^*_k(Y) \cong C^*_u Y \otimes \mathcal{K}$.
Proof. We show that $C^*_u Y \otimes \mathcal{K}(\ell^2(\mathbb{N}))$ is dense in $C^*_k(Y)$ (with the obvious inclusion). Pick $T = (t_{yx}) \in C^*_k(Y)$ and $\varepsilon > 0$. Denote the propagation of $T$ by $p$. By Proposition 6.9, there is a constant $C > 0$ such that if $S = (s_{yx})$ is a matrix of compacts with propagation at most $p$, then $\|S\| \le C \sup_{x,y \in Y} \|s_{yx}\|$. Since $\{t_{yx} \mid x, y \in Y\}$ is compact, there is an $\varepsilon/C$-net $t_1, \dots, t_m$ in it. Then clearly $T$ is within $\varepsilon$ of an operator of the form $T_1 \otimes t_1 + \cdots + T_m \otimes t_m$, where each $T_i \in C^*_u Y$. This shows the density, which implies that $C^*_k(Y)$ and $C^*_u Y \otimes \mathcal{K}(\ell^2(\mathbb{N}))$ are actually isomorphic. □
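The notions of propagation (Definition 6.4) and the norm estimate of Proposition 6.9 can be illustrated with scalar matrices over the finite set $Y = \{0, \dots, n-1\} \subset \mathbb{Z}$. The sketch below checks that propagation is subadditive under products and evaluates one admissible bound for the operator norm. Proposition 6.9 is quoted from [13] without a formula for $C$; the Schur test used here (norm at most the square root of the maximal row sum times the maximal column sum of $|t_{yz}|$) is a standard estimate that we substitute for illustration, and all function names are ours.

```python
import math

def band(n, width):
    """Matrix over Y = {0, ..., n-1} with t[y][x] = 1 iff |x - y| <= width."""
    return [[1.0 if abs(x - y) <= width else 0.0 for x in range(n)]
            for y in range(n)]

def propagation(t):
    """Largest |x - y| with t[y][x] != 0 (0 for the zero matrix)."""
    n = len(t)
    return max((abs(x - y) for y in range(n) for x in range(n) if t[y][x] != 0),
               default=0)

def matmul(a, b):
    """Multiply two square matrices given as nested lists."""
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def schur_bound(t):
    """Schur test: sqrt(max row sum * max column sum) of |t[y][x]| bounds the norm."""
    n = len(t)
    rows = max(sum(abs(v) for v in row) for row in t)
    cols = max(sum(abs(t[y][x]) for y in range(n)) for x in range(n))
    return math.sqrt(rows * cols)

def apply(t, v):
    """Matrix-vector product."""
    return [sum(t[y][x] * v[x] for x in range(len(v))) for y in range(len(t))]

def norm(v):
    """Euclidean norm of a vector."""
    return math.sqrt(sum(abs(c) ** 2 for c in v))

s, t = band(8, 1), band(8, 2)
print(propagation(s), propagation(t), propagation(matmul(s, t)))  # 1 2 3

ones = [1.0] * 8
print(norm(apply(t, ones)) <= schur_bound(t) * norm(ones))  # True
```

For a matrix of propagation at most $P$ with entries bounded by $c$, each row and column has at most $N = \sup_y |B(y, P)|$ nonzero entries, so the Schur bound is at most $N c$; this is one way to see why the constant in such an estimate depends only on $P$ and the geometry of $Y$.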
7. Finite propagation representatives
In this section, we prove that any class in a uniform K-homology group can be represented by a uniform Fredholm module with the operator having finite propagation. The proof follows the outline of the proof of the analogous result in analytic K-homology.
Definition 7.1. An open cover of $X$ is said to
• have finite multiplicity, if for any $R \ge 0$ there is $K \ge 0$ such that any ball with radius $R$ intersects at most $K$ elements of the cover;
• be uniformly bounded, if there is a common upper bound for all the diameters of members of the cover.
Remark 7.2. Any space $X$ with bounded geometry admits uniformly bounded covers with finite multiplicity. However, bounded geometry alone produces such covers with a possibly large bound on the diameters of the cover members. Consequently, a priori the propagation might not be made arbitrarily small (see the proof of the next proposition). In order to achieve small propagation, we need some small-scale (topological) assumption; for instance, finite covering dimension would suffice.
Definition 7.3 (Finite propagation: Continuous version). Let $H$ be a Hilbert space and let $\phi : C_0(X) \to B(H)$ be a ∗-representation. We say that $T \in B(H)$ has finite propagation if there exists $R > 0$ such that $\phi(f)T\phi(g) = 0$ for every $f, g \in C_0(X)$ with $d(\operatorname{supp}(f), \operatorname{supp}(g)) \ge R$.
Proposition 7.4. Each uniform K-homology element over a space $X$ with bounded geometry can be represented by a uniform Fredholm module $(H, \phi, S)$, where $S$ is a finite propagation operator. Furthermore, we may assume that homotopies go through finite propagation operators as well.
Proof. Let $(H, \phi, T)$ be a uniform Fredholm module. Take a uniformly bounded open cover $(U_i)_{i \in I}$ with finite multiplicity, and let $(\varphi_i^2)_{i \in I}$ be a continuous partition of unity subordinate to $(U_i)_{i \in I}$. By replacing the sets $U_i$ by $N_\delta(U_i)$, the $\delta$-neighborhoods for a fixed $\delta > 0$, and obtaining a partition of unity for the cover $(N_\delta(U_i))_i$, we can assume that all the $\varphi_i$ are $L_0$-continuous for some $L_0 \ge 0$. Denote $S = \sum_{i \in I} \varphi_i T \varphi_i$. This operator has finite propagation (which is bounded from above by $\sup_i \operatorname{diam}(U_i)$). We prove that $(H, \phi, S)$ is a uniform Fredholm module which represents the same uniform K-homology element as $(H, \phi, T)$.
Fix $\varepsilon > 0$ and $R, L > 0$. Let $M$ be such that $[T, \phi(\cdot)]$ is $(\varepsilon, R, 2\max(L_0, L), M; \phi)$-approximable and that $T\phi(\cdot)$ and $\phi(\cdot)T$ are $(\varepsilon, R, M; \phi)$-approximable. Denote $S' = S - T = \sum_{i \in I} \varphi_i[T, \varphi_i]$. By the finite multiplicity assumption, there is $M_1$ such that any ball with radius $R$ intersects at most $M_1$ sets $U_i$. Take $f \in C_R(X)$ and consider $fS' = \sum_i f\varphi_i[T, \varphi_i]$. This sum has at most $M_1$ nonzero terms, and each of them is $(\varepsilon, M)$-approximable, hence $fS'$ itself is $(M_1\varepsilon, MM_1)$-approximable. Similarly for $f \in C_{R,L}(X)$,
\[
\begin{aligned}
S'f = \sum_i \varphi_i[T, \varphi_i]f &= \sum_i \bigl(\varphi_i T \varphi_i f - \varphi_i^2 Tf\bigr) = \sum_i \bigl(\varphi_i T \varphi_i f - \varphi_i^2 fT\bigr) + \sum_i \varphi_i^2[f, T] \\
&= \sum_i \varphi_i[T, f\varphi_i] + [f, T].
\end{aligned}
\]
The last term is $(\varepsilon, M)$-approximable by assumption, and again at most $M_1$ terms in the sum are nonzero, and all of them are $(\varepsilon, M)$-approximable. Consequently, $S'f$ is $((M_1 + 1)\varepsilon, M(M_1 + 1))$-approximable. Therefore we have proved that $S'$ is uniform. Applying Lemma 2.16 finishes the first part of the proof. For the part on homotopies, we just need to observe that the formula $\sum_{i \in I} \varphi_i T \varphi_i$ produces a continuous family if we vary $T$ continuously, thanks to the finite multiplicity of the chosen cover. □
8. Another picture of uniform Roe algebras
The definition of $C^*_k(Y)$ as given in Section 6 inherently uses the standard basis of the auxiliary Hilbert space $\ell^2\mathbb{N}$. In this section, we develop a picture of $C^*_k(Y)$ starting with a general X-module $(H, \phi)$, instead of the concrete one ($\ell^2 Y \otimes \ell^2\mathbb{N}$ with the multiplication action). Furthermore, this model allows us to translate from “continuous” spaces $X$ (which are needed in order to observe more than just 0-dimensional phenomena in (uniform) K-homology) to their discrete models $Y \subset X$ (which are supposed to be the targets of the index/assembly map). Let us fix a metric space $X$ for the rest of this section.
Definition 8.1. We say that $Y \subset X$ is a quasi-lattice if $Y$ with the induced metric is a uniformly discrete space with bounded geometry which is coarsely equivalent to $X$. We say that a collection $(V_y)_{y \in Y}$ of subsets of $X$ is a quasi-latticing partition if each $V_y$ is open, $V_x \cap V_y = \emptyset$ if $x \neq y$, $X = \bigcup_{y \in Y} V_y$, $\sup_{y \in Y} \operatorname{diam}(V_y) < \infty$ and for every $\varepsilon > 0$, $\sup_{y \in Y} |\{z \in Y \mid V_z \cap \operatorname{Nbhd}_\varepsilon(V_y) \neq \emptyset\}| < \infty$.
Remark 8.2. Not all spaces $X$ have a quasi-lattice, but those with “bounded geometry” in any reasonable sense do. Furthermore, once there is a quasi-lattice, it is easy to produce quasi-latticing partitions (for instance by means of the “pick the closest point in $Y$” map).
Example 8.3. A useful example to have in mind is that of a graph $X$ (with edges attached), with $Y$ being its 0-skeleton. More generally, the 0-skeleton of a uniformly locally finite simplicial polyhedron (endowed with the geodesic metric) is a quasi-lattice.
Recall that any ∗-homomorphism $\phi : C_0(X) \to B(H)$ induces a Borel measure on $X$, and extends to a representation (also denoted by $\phi$) of $L^\infty(X)$. We shall use this fact without mentioning it explicitly throughout this section.
Definition 8.4 (Bases choice). Given a metric space $X$, we define a bases choice $\mathcal{A}$ for $X$ to be a 5-tuple $(Y, (V_y)_{y \in Y}, H, \phi, \{S_y\}_{y \in Y})$, where
• $Y \subset X$ is a quasi-lattice of $X$,
• $(V_y)_{y \in Y}$ is a quasi-latticing partition of $X$,
• $H$ is a Hilbert space, $\phi : C_0(X) \to B(H)$ a non-degenerate ∗-representation,⁸
• $S_y = (e^y_i)_{i=1}^{N_y}$ is a basis of $H_y = \phi(\chi_{V_y})H$ (where we allow $N_y \in \mathbb{N} \cup \{\infty\}$ and we put by convention $S_y = \emptyset$ if $H_y = \{0\}$).
⁸ A representation $\phi : C_0(X) \to B(H)$ is non-degenerate if $[\phi(C_0(X))]^\perp = \{0\}$.
Such a bases choice determines a (possibly non-surjective) isometry $u_{\mathcal{A}} : H = \bigoplus_y H_y \to \ell^2(Y \times \mathbb{N})$.
Definition 8.5. Let $X$ be a metric space, $Y \subset X$ a quasi-lattice, $(V_y)_{y \in Y}$ a quasi-latticing partition, and let $\mathcal{A}_i = (Y, (V_y)_{y \in Y}, H_i, \phi_i, \{S^i_y\}_{y \in Y})$, $i = 1, \dots, k$, be bases choices. Define the C∗-algebra $C^*_k(X, \mathcal{A}_1, \dots, \mathcal{A}_k) \subset B(\bigoplus_{i=1}^k H_i)$ as the closure of the algebra of the operators $T \in B(\bigoplus_{i=1}^k H_i)$ satisfying the following conditions:
• $T$ has finite propagation,
• there exists $M \ge 0$ such that each “entry” $T_{j,i;y,x} : \phi_i(\chi_{V_x})H_i \to \phi_j(\chi_{V_y})H_j$ only uses the first $M$ basis vectors from the bases $S^i_x$, $S^j_y$.
There is an injective ∗-homomorphism $\mathrm{Ad}(u_{\mathcal{A}_1} \oplus \cdots \oplus u_{\mathcal{A}_k}) : C^*_k(X, \mathcal{A}_1, \dots, \mathcal{A}_k) \to M_k(C^*_k(Y))$. We call the C∗-algebra $C^*_k(X, \mathcal{A}) \subset B(H)$ the $\mathcal{A}$-realization of $C^*_k(Y)$.
Remark 8.6. Note that $C^*_k(X, \mathcal{A})$ is isomorphic only to a subalgebra of $C^*_k(Y)$ in general, but if each $S_y$ is infinite, then $C^*_k(Y)$ and $C^*_k(X, \mathcal{A})$ are isomorphic. Define $\operatorname{supp}(\mathcal{A}) = \{y \in Y \mid S_y \neq \emptyset\}$. If $\operatorname{supp}(\mathcal{A})$ is coarsely equivalent to $Y$, we have that $K_*(C^*_k(Y)) \cong K_*(C^*_k(X, \mathcal{A}))$. More precisely, $C^*_k(Y)$ and $C^*_k(X, \mathcal{A})$ are Morita equivalent. Indeed, $M_\infty(C^*_k(X, \mathcal{A})) \cong M_\infty(C^*_u(\operatorname{supp}(\mathcal{A})))$; for Morita equivalence of $C^*_u(\operatorname{supp}(\mathcal{A}))$ and $C^*_u Y$ we refer to [5].
We continue by defining a relation between tuples of bases choices, in order to be able to get an inductive limit of realizations of $C^*_k(Y)$. We begin with a notion similar to an inclusion between a pair of bases choices.
Definition 8.7. Fix a quasi-lattice $Y \subset X$. Let $\mathcal{A}_i = (Y, (V^i_y)_{y \in Y}, H_i, \phi_i, \{S^i_y\})$, $i = 1, 2$, be bases choices. We shall write $\mathcal{A}_1 \subseteq \mathcal{A}_2$ if the following conditions are satisfied:
• For each $y \in Y$, $\phi(\chi_{V^1_y})H_1$ is isometric to a subspace of $\phi(\chi_{V^2_y})H_2$ via an isometry $v_y$.
• Each $v_y$ maps the $n$th vector in the basis $S^1_y$ to the $n$th vector in the basis $S^2_y$.
A weaker version of $\subseteq$, denoted now by A1 A2, is defined in the same manner, except the last condition is replaced by
• for all $k \in \mathbb{N}$ there is $l \in \mathbb{N}$ such that for all $y \in Y$ the $v_y$-images of the first $k$ vectors of $S^1_y$ are in the linear span of the first $l$ vectors of $S^2_y$.
We now extend this inclusion to lists. Given two lists of bases choices $(\mathcal{A}_1, \dots, \mathcal{A}_k)$ and $(\mathcal{A}'_1, \dots, \mathcal{A}'_l)$ for $X$ with respect to $Y$, we declare $(\mathcal{A}'_1, \dots, \mathcal{A}'_l) \prec (\mathcal{A}_1, \dots, \mathcal{A}_k)$ if there is an
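injective function $\sigma : \{1, \dots, l\} \to \{1, \dots, k\}$ such that $\mathcal{A}'_i \subseteq \mathcal{A}_{\sigma(i)}$ for all $i = 1, \dots, l$. If this happens, then there is a natural embedding $i : C^*_k(X, \mathcal{A}'_1, \dots, \mathcal{A}'_l) \to C^*_k(X, \mathcal{A}_1, \dots, \mathcal{A}_k)$ (implemented by the ∗-homomorphism $\mathrm{Ad}(V)$, where $V = \bigoplus_y v_y$ is the isometric embedding of the appropriate Hilbert spaces). This embedding commutes with maps between matrix algebras over $C^*_k(Y)$ as follows:
\[
\begin{array}{ccc}
C^*_k(X, \mathcal{A}'_1, \dots, \mathcal{A}'_l) & \longrightarrow & C^*_k(X, \mathcal{A}_1, \dots, \mathcal{A}_k) \\
\Big\downarrow{\scriptstyle \mathrm{Ad}(u_{\mathcal{A}'_1} \oplus \cdots \oplus u_{\mathcal{A}'_l})} & & \Big\downarrow{\scriptstyle \mathrm{Ad}(u_{\mathcal{A}_1} \oplus \cdots \oplus u_{\mathcal{A}_k})} \\
M_l(C^*_k(Y)) & \xrightarrow{\ h_\sigma \otimes \mathrm{id}\ } & M_k(C^*_k(Y)).
\end{array}
\tag{3}
\]
By $h_\sigma : M_l(\mathbb{C}) \to M_k(\mathbb{C})$ we denote the embedding of matrix algebras determined by $\sigma$. More precisely, $h_\sigma$ is the linear extension of the following assignment of matrix units: $M_l(\mathbb{C}) \ni e_{ij} \mapsto e_{\sigma(i)\sigma(j)} \in M_k(\mathbb{C})$. Furthermore, if we assume that $\mathcal{A}'_i = \mathcal{A}_{\sigma(i)}$ for $i = 1, \dots, l$, and if $\operatorname{supp}(\mathcal{A}_j)$ is coarsely equivalent to $Y$ for each $j = 1, \dots, k$, then the top horizontal map induces an isomorphism on K-theory. This is a straightforward generalization of Remark 8.6.
Note that for any bases choice $\mathcal{A} = (Y, (V_y)_{y \in Y}, H, \phi, \{S_y\}_{y \in Y})$, there is another one $\mathcal{A}'$ with $\mathcal{A} \subseteq \mathcal{A}'$ such that $\operatorname{supp}(\mathcal{A}') = Y$. This can be arranged by choosing the Hilbert space of $\mathcal{A}'$ to be $H' = H \oplus \ell^2(Y \times \mathbb{N})$, the direct sum action of $C_0(X)$ and a suitable choice of bases $S'_y$. The previous discussion, together with Lemma 6.10, culminates in the following proposition:
Proposition 8.8. Let $X$ be a metric space and let $Y \subset X$ be a quasi-lattice. The collection $\mathcal{X}$ of all finite lists $(\mathcal{A}_1, \dots, \mathcal{A}_k)$ of bases choices for $X$ with $Y$ fixed forms a directed system. There is an isomorphism
\[ \eta : \varinjlim_{\mathcal{X}} K_*\bigl(C^*_k(X, \mathcal{A}_1, \dots, \mathcal{A}_k)\bigr) \xrightarrow{\ \cong\ } K_*\bigl(C^*_u Y\bigr). \]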
The following lemma shows that given a finite propagation uniform operator $T$ on an $X$-module $H$, we can always find a bases choice $\mathcal{A}$, such that $T \in C_k^*(X, \mathcal{A})$.

Lemma 8.9. Let $X$ be a metric space, let $Y \subset X$ be a quasi-lattice and let $(V_y)_{y\in Y}$ be a quasi-latticing partition of $X$. Let $H$ be a Hilbert space and let $\phi : C_0(X) \to B(H)$ be a $*$-homomorphism. Given a finite collection $T_1, \dots, T_k \in B(H)$ of uniform operators with finite propagation, there exists a bases choice $\mathcal{A}$, such that $T_i \in C_k^*(X, \mathcal{A})$ for all $i = 1, \dots, k$.

Proof. For simplicity, assume that we are given just one $T \in B(H)$ to deal with (it will be clear that we can follow the procedure outlined below simultaneously for finitely many operators). Denote $H_y = \phi(\chi_{V_y})H$ and $T_{yz} = \phi(\chi_{V_y}) T \phi(\chi_{V_z}) \in B(H_z, H_y)$. Since $T$ has finite propagation and $Y$ is uniformly discrete, there is a $K$, such that there are at most $K$ nonzero entries in each column and row of the matrix $(T_{yz})_{y,z\in Y}$.
Fix $\varepsilon_1 = 1$ and take $R > \sup_{y\in Y} \mathrm{diam}(V_y)$. It follows from the assumption that there exists $M$, such that each $T_{yz}$ is $(\varepsilon_1, M)$-approximable. Therefore, for each $y \in Y$, there are $2M$ orthonormal vectors $e^y_1, \dots, e^y_{2M} \in H_y$, for which there are $2M \times 2M$-matrices which in these (partial) bases represent operators $s_y \in B(H_y)$ with $\|T_{yy} - s_y\| < \varepsilon_1$.

Fix $y \in Y$ for a while and consider the "column" $(T_{yz})_{z\in Y}$. Each of them is $(\varepsilon_1, M)$-approximable, but not necessarily by a matrix in the partial basis $e^y_1, \dots, e^y_{2M}$ chosen so far. By adding at most $M$ vectors to the chosen partial bases for $H_y$ and $H_z$ respectively, we can ensure that $T_{yz}$ will be $(\varepsilon_1, M)$-approximable in the partial bases of $H_y$ and $H_z$. We can do this for each nonzero $T_{yz}$, $z \in Y$, resulting in a partial basis for $H_y$ having at most $(2 + K)M$ elements, and partial bases for the $H_z$'s having at most $3M$ elements. Doing this process for all $y \in Y$ results in choosing partial bases for each $H_y$ having at most $(2 + 2K)M$ elements, now with the property that each $T_{yz}$ is $(\varepsilon_1, M)$-approximable with matrices in the chosen partial bases.

To finish the construction, we choose a sequence of $\varepsilon_n > 0$ converging to 0 and do the above described process for each $n$, always just adding the newly chosen partial bases to the previous ones. Hence, we have constructed $\mathcal{A} = \mathcal{A}(Y)$. The fact that $T \in C_k^*(X, \mathcal{A})$ follows easily from the construction and the estimate 6.9. □

In fact, we can improve the previous lemma to finite collections of uniform operators which do not necessarily have finite propagation, but are only approximable by finite propagation ones. To carry out the argument, we are going to use the weaker relation on bases choices (see Definition 8.7). Note that if $\mathcal{A}_1$ is related to $\mathcal{A}_2$ by this weaker relation, then $C_k^*(X, \mathcal{A}_1) \subset C_k^*(X, \mathcal{A}_2)$: Let $w \in C_k^*(X, \mathcal{A}_1)$ be a finite propagation operator, and let $M$ be a bound on the number of basis vectors from $S^1_y$ which are used in each entry $w_{yz}$ of $w$.
By the last condition in the definition of the weaker relation (Definition 8.7), there is a number $M'$, such that for each $y \in Y$, the first $M$ vectors of $S^1_y$ are in the linear span of the first $M'$ vectors of $S^2_y$. Consequently, entries $w_{yz}$ use only the first $M'$ vectors of the bases $S^2_y$, and so $w \in C_k^*(X, \mathcal{A}_2)$.

Lemma 8.10. Let $X$ be a metric space, let $Y \subset X$ be a quasi-lattice and let $(V_y)_{y\in Y}$ be a quasi-latticing partition of $X$. Let $H$ be a Hilbert space and let $\phi : C_0(X) \to B(H)$ be a $*$-homomorphism. Given a finite collection $T_1, \dots, T_k$ in $\Theta(\phi)$, the C*-algebra generated by uniform operators with finite propagation, there exists a bases choice $\mathcal{A}$, such that $T_i \in C_k^*(X, \mathcal{A})$ for all $i = 1, \dots, k$.

We isolate a part of the proof of the above lemma as another lemma, as it is useful by itself.

Lemma 8.11. Let $X$ be a metric space, let $Y \subset X$ be a quasi-lattice and let $(V_y)_{y\in Y}$ be a quasi-latticing partition of $X$. Let $H$ be a Hilbert space and let $\phi : C_0(X) \to B(H)$ be a $*$-homomorphism. Assume that we are given a countable collection $\mathcal{A}_1, \dots, \mathcal{A}_n, \dots$ of bases choices of the form $(Y, (V_y)_{y\in Y}, H, \phi, \cdot)$. Then there exists a bases choice $\mathcal{A}$ of the same form, such that each $\mathcal{A}_i$, $i \ge 1$, is related to $\mathcal{A}$ by the weaker relation of Definition 8.7.

Proof. Denote $\mathcal{A}_n = (Y, (V_y)_{y\in Y}, H, \phi, \{S^n_y\}_{y\in Y})$. We now define bases $S_y$ out of $S^n_y$ (and put $\mathcal{A} = (Y, (V_y)_{y\in Y}, H, \phi, \{S_y\}_{y\in Y})$). Fix $y \in Y$ and enumerate the orthonormal basis $S^n_y$ of the Hilbert space $\phi(\chi_{V_y})H$ as $(e^n_i)_{i\ge 1}$. We make one basis out of this sequence as follows: we fix a bijection $\alpha : \mathbb{N} \times \mathbb{N} \to \mathbb{N}$ (for instance $\alpha(n, i) = \tfrac{1}{2}(n + i - 1)(n + i - 2) + i$; say we think of $\mathbb{N} \times \mathbb{N}$ as the lattice points in the first quadrant of the plane, and we enumerate the points along the diagonals going from "top-left" to "right-bottom"). Let $(\beta_1, \beta_2) : \mathbb{N} \to \mathbb{N} \times \mathbb{N}$ be its inverse. Now take the sequence of vectors $k \mapsto e^{\beta_1(k)}_{\beta_2(k)}$, and apply the Gram–Schmidt orthogonalization process to it. We obtain a new basis $S_y$, which obviously has the following property: for each $n \ge 1$ and $i \ge 1$, the vectors $e^n_1, \dots, e^n_i$ are in the linear span of the first $\alpha(n, i)$ basis vectors of the new basis. A quick glance at the definition of the weaker relation for bases choices shows that $\mathcal{A}$ is as required. □

Proof of Lemma 8.10. For simplicity, we concentrate on the case that $k = 1$, i.e. when we are given one operator $T \in \Theta(\phi)$. Note that $T$ is uniform by the argument of Lemma 4.2. By assumption, $T$ is approximable by a sequence $T_n$ of uniform operators with finite propagation. For each $T_n$, there is a bases choice $\mathcal{A}_n = (Y, (V_y)_{y\in Y}, H, \phi, \{S^n_y\}_{y\in Y})$, such that $T_n \in C_k^*(X, \mathcal{A}_n)$. Applying the previous lemma yields a bases choice $\mathcal{A}$, to which each $\mathcal{A}_n$, $n \ge 1$, is related by the weaker relation of Definition 8.7. Since $C_k^*(X, \mathcal{A}_n) \subset C_k^*(X, \mathcal{A})$, $T_n$ is a sequence of operators in $C_k^*(X, \mathcal{A})$ which converges to $T$. This finishes the proof. □

9. The uniform index map

In the usual analytic K-homology, there is the index map (often also called the coarse assembly map) from the K-homology $K_*(X)$ of a space $X$ to the K-theory of its Roe algebra $K_*(C^* X)$. But since Roe algebras of coarsely equivalent spaces are isomorphic, the target group of the index map can be understood as the K-theory $K_*(C^* Y)$ of the Roe algebra of any quasi-lattice $Y \subset X$. The quickest way to define this map in the usual case is to use the reformulation of the K-homology as K-theory of a dual algebra (see [8, Theorem 8.4.3] and Section 4 for an analogous result in the uniform case) and then the 6-term exact sequence in K-theory, whose boundary maps become the assembly maps. For details of this construction, see for instance [8, Section 12.3]. The goal of this section is to construct a similar index/assembly map in the uniform setting.
More precisely, we define a homomorphism $\mu_u : K^u_*(X) \to K_*(C_u^* Y)$ for a quasi-lattice $Y \subset X$ in a metric space $X$. However, instead of the C*-algebra route, we take a more hands-on approach.

In this paragraph, we recall a formula for the usual assembly map. If $(H, \phi, S)$ is a 0-Fredholm module, we can define its index as follows: denote
\[
W = \begin{pmatrix} 1 & S \\ 0 & 1 \end{pmatrix}
\begin{pmatrix} 1 & 0 \\ -S^* & 1 \end{pmatrix}
\begin{pmatrix} 1 & S \\ 0 & 1 \end{pmatrix}
\begin{pmatrix} 0 & -1 \\ 1 & 0 \end{pmatrix}
\in M_2(B(H)).
\]
This is an invertible element of $M_2(B(H))$. Then put $\mathrm{ind}(S) = W \begin{pmatrix} 1 & 0 \\ 0 & 0 \end{pmatrix} W^{-1} \in M_2(B(H))$. Concretely,
\[
\mathrm{ind}(S) = \begin{pmatrix}
SS^* + (1 - SS^*)SS^* & S(1 - S^*S) + (1 - SS^*)S(1 - S^*S) \\
S^*(1 - SS^*) & (1 - S^*S)^2
\end{pmatrix}.
\]
A simple computation shows that $\mathrm{ind}(S)$ is actually an idempotent in $M_2(B(H))$. Furthermore, $\partial(H, \phi, S) = [\mathrm{ind}(S)] - \bigl[\begin{smallmatrix} 1 & 0 \\ 0 & 0 \end{smallmatrix}\bigr]$ is a $K_0$-class in the K-theory group of the appropriate algebra, modulo which $S$ is invertible. For example, starting with a finite propagation $S$, one gets $\partial(H, \phi, S)$ in $K_0(C^* X)$, the K-theory of the Roe C*-algebra.
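The idempotency of ind(S) can be checked numerically on finite matrices. The sketch below is an illustration only, not part of the construction in the text: it takes the four entries of the concrete formula above, replaces the operator S by a small real matrix (so the adjoint is the transpose, an assumption of the sketch), and verifies that the resulting block matrix squares to itself. All function names are hypothetical.

```python
def matmul(a, b):
    """Product of two matrices given as lists of rows."""
    return [[sum(a[i][m] * b[m][j] for m in range(len(b)))
             for j in range(len(b[0]))] for i in range(len(a))]


def add(a, b):
    return [[x + y for x, y in zip(ra, rb)] for ra, rb in zip(a, b)]


def sub(a, b):
    return [[x - y for x, y in zip(ra, rb)] for ra, rb in zip(a, b)]


def transpose(a):
    return [list(col) for col in zip(*a)]


def identity(n):
    return [[1 if i == j else 0 for j in range(n)] for i in range(n)]


def truncated_shift(n):
    """Finite stand-in for a shift: e_j -> e_{j+1}, last basis vector -> 0."""
    return [[1 if i == j + 1 else 0 for j in range(n)] for i in range(n)]


def ind(s):
    """Assemble the 2x2 block matrix ind(S) from the concrete formula."""
    n = len(s)
    s_star = transpose(s)               # adjoint of a real matrix
    ss = matmul(s, s_star)              # S S*
    p = sub(identity(n), ss)            # 1 - S S*
    q = sub(identity(n), matmul(s_star, s))  # 1 - S* S
    a = add(ss, matmul(p, ss))                          # top-left entry
    b = add(matmul(s, q), matmul(matmul(p, s), q))      # top-right entry
    c = matmul(s_star, p)                               # bottom-left entry
    d = matmul(q, q)                                    # bottom-right entry
    top = [ra + rb for ra, rb in zip(a, b)]
    bottom = [rc + rd for rc, rd in zip(c, d)]
    return top + bottom


if __name__ == "__main__":
    for s in ([[1, 2], [3, 4]], truncated_shift(3)):
        e = ind(s)
        print(matmul(e, e) == e)
```

Integer entries keep the arithmetic exact, so the comparison is an equality test rather than a floating-point tolerance check. On finite matrices the class [ind(S)] minus the class of the corner projection carries no index information; the sketch only illustrates the algebraic identity.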
Starting with a 1-Fredholm module $(H, \phi, Q)$, its index can be constructed using the formula $\mathrm{ind}(Q) = \exp\bigl(-2\pi i \tfrac{Q+1}{2}\bigr) \in B(H)$. The operator $\mathrm{ind}(Q)$ is invertible,$^{9}$ but even if we start with a finite propagation $Q$, $\mathrm{ind}(Q)$ might not have finite propagation. However, it is approximable by finite propagation invertibles in this case, hence still gives a class $[\mathrm{ind}(Q)] \in K_1(C^* X)$.

Let us now turn to the uniform case. Fix a quasi-lattice $Y \subset X$. We define $\mu_u : K^u_*(X) \to K_*(C_u^* Y)$ in the following propositions:

Proposition 9.1 (Uniform index map, even case). Let $(H, \phi, S)$ be a 0-uniform Fredholm module with $S$ having finite propagation and $\phi$ being non-degenerate. For any quasi-lattice $Y \subset X$, there exists a bases choice $\mathcal{A} = (Y, (V_y)_{y\in Y}, H, \phi, \{(e^y_i)_{i\in\mathbb{N}}\}_{y\in Y})$, such that $\mathrm{ind}(S) \in M_2(B(H))$ is an idempotent that actually belongs to $C_k^*(X, \mathcal{A}, \mathcal{A})$. Furthermore, we can define a group homomorphism $\mu_u : K^u_0(X) \to K_0(C_u^* Y)$ by
\[
\mu_u(H, \phi, S) = \eta_*\Bigl([\mathrm{ind}(S)] - \bigl[\begin{smallmatrix} 1 & 0 \\ 0 & 0 \end{smallmatrix}\bigr]\Bigr) \in K_0(C_u^* Y),
\]
i.e. the right-hand side does not depend on the choices made. Recall that $\eta$ is described in Proposition 8.8.

Proposition 9.2 (Uniform index map, odd case). Let $(H, \phi, Q)$ be a 1-uniform Fredholm module with $Q$ having finite propagation and $\phi$ being non-degenerate. For any quasi-lattice $Y \subset X$ there exists a bases choice $\mathcal{A} = (Y, (V_y)_{y\in Y}, H, \phi, \{(e^y_i)_{i\in\mathbb{N}}\}_{y\in Y})$, such that $\mathrm{ind}(Q) \in B(H)$ is an invertible that actually belongs to $C_k^*(X, \mathcal{A})^+$. Furthermore, the map $\mu_u : K^u_1(X) \to K_1(C_u^* Y)$ defined by $\mu_u[H, \phi, Q] = \eta_*[\mathrm{ind}(Q)] \in K_1(C_u^* Y)$ is a group homomorphism.

Proof of the 0-case. Picking any quasi-latticing partition $(V_y)_{y\in Y}$, the existence of a suitable $\mathcal{A}$ follows from Lemma 8.9, applied to the four entries of $\mathrm{ind}(S)$, which are uniform and have finite propagation. It is clear that our construction of the index preserves direct sums. Also, the index of a degenerate element gives zero in K-theory. Indeed, if $(H, \phi, S)$ is a degenerate 0-Fredholm module, then $\phi(f)\,\mathrm{ind}(S) = \begin{pmatrix} \phi(f) & 0 \\ 0 & 0 \end{pmatrix}$ for any $f \in C_0(X)$, so by using a partition of unity we obtain that $\mathrm{ind}(S) = \begin{pmatrix} 1 & 0 \\ 0 & 0 \end{pmatrix}$.

Thus, to finish the proof, we need to show the independence of the index on the choice of $\mathcal{A}$, and under homotopies of uniform Fredholm modules. Our proof for homotopies includes the argument for choices of $\mathcal{A}$, since we can just take a constant homotopy, and choose different bases choices at the endpoints. We shall now outline the proof for homotopies. Assume that we are given a homotopy $(H, \phi_t, S_t)$ of uniform Fredholm modules. We assume that all $S_t$ have finite propagation (see Proposition 7.4), so that the index as we have defined it can be constructed. Note that the requirements on $\phi_t$ ensure that $B = \Theta(\phi_t)$, the C*-algebra generated by all $\phi$-uniform operators with $\phi$-finite propagation, does not depend on $t$.

$^{9}$ When we talk about invertibles in a non-unital C*-algebra, we mean that they are invertible in the unitization.
By applying the index formula to $S_t$, we obtain a norm-continuous path of projections in $M_2(B) \subset M_2(B(H))$. For the sake of simplicity, let us assume that we have a norm-continuous path of projections $(T_t)$ in $B$ itself. Choose $\mathcal{A}_0$ and $\mathcal{A}_1$ to be bases choices corresponding to $(H, \phi_0)$ and $(H, \phi_1)$ respectively, such that $T_i \in C_k^*(X, \mathcal{A}_i)$, $i = 0, 1$. Now we are in the position to apply the following Lemma 9.3, which finishes the proof for the even case. □

Lemma 9.3. Let $H$ be a Hilbert space, $\phi : C_0(X) \to B(H)$ a $*$-representation. Denote $B = \Theta(\phi) \subset B(H)$, the C*-algebra generated by $\phi$-uniform operators with $\phi$-finite propagation. Assume that $T_t$, $t \in [0, 1]$, is a homotopy of projections in $B$, and that $\mathcal{A}_0$ and $\mathcal{A}_1$ are two bases choices, such that $T_i \in C_k^*(X, \mathcal{A}_i)$.$^{10}$ Then $[T_0] = [T_1] \in K_0(C_k^*(X, \mathcal{A}_0, \mathcal{A}_1))$.

$^{10}$ For any bases choice $\mathcal{A}$, $C_k^*(X, \mathcal{A}) \subset B$. The uniformity of $T \in C_k^*(X, \mathcal{A})$ follows from the formula $f = \sum_y f \chi_{V_y}$. Note that for fixed $R \ge 0$ and $f \in C_R(X)$, there is a uniform bound on the number of nonzero terms in the sum by bounded geometry.

Proof. Since $T_t$ is a homotopy of projections in a C*-algebra $B$, there exists an invertible element $v_0 \in B$ with $\|v_0\| = 1$, such that $T_1 = v_0^{-1} T_0 v_0$ (see e.g. [2, Proposition 4.3.2]). Note that $v_0$ might not have finite propagation, so we will need to make some approximations further on. The images of $T_0$ and $T_1$ under the inclusions of $C_k^*(X, \mathcal{A}_0)$ and $C_k^*(X, \mathcal{A}_1)$ into $C_k^*(X, \mathcal{A}_0, \mathcal{A}_1) \subset M_2(B(H))$ are the operators $\begin{pmatrix} T_0 & 0 \\ 0 & 0 \end{pmatrix}$ and $\begin{pmatrix} 0 & 0 \\ 0 & T_1 \end{pmatrix}$. These two projections are Murray–von Neumann equivalent by the elements $x = \begin{pmatrix} 0 & T_0 v_0 \\ 0 & 0 \end{pmatrix}$ and $y = \begin{pmatrix} 0 & 0 \\ v_0^{-1} T_0 & 0 \end{pmatrix}$. To finish the argument, we must show that $x, y \in C_k^*(X, \mathcal{A}_0, \mathcal{A}_1)$.

For the rest of the proof, we will think of $M_k = M_k(\mathbb{C})$ as $B(\mathrm{span}(e_1, \dots, e_k))$ in $\mathcal{K}(\ell^2\mathbb{N})$, where $e_1, e_2, \dots$ is the standard basis of $\ell^2\mathbb{N}$. Let $A \subset B(\ell^2(Y) \otimes \ell^2\mathbb{N})$ be the algebra of all finite propagation matrices $(t_{yz})_{y,z\in Y}$ for which there exists $k \in \mathbb{N}$ with $t_{yz} \in M_k$ for all $y, z \in Y$. Then $C_k^*(Y)$ is the norm closure of $A$.

We shall give a proof that $y \in C_k^*(X, \mathcal{A}_0, \mathcal{A}_1)$; a proof for $x$ is analogous. Denote $u_0 = u_{\mathcal{A}_0}$ and $u_1 = u_{\mathcal{A}_1}$. We need to show that $y \in \mathrm{Ad}(u_0 \oplus u_1)(M_2(C_k^*(Y)))$. This will follow from the following statement: For any $\varepsilon > 0$, there exists $p \in A$, such that $\|p - u_1 v_0^{-1} T_0 u_0^*\| < \varepsilon$.

By the choice of $\mathcal{A}_0$ and $\mathcal{A}_1$, we know that there are $\hat{s}_0, \hat{s}_1 \in B(H)$, such that $u_0 \hat{s}_0 u_0^*,\ u_1 \hat{s}_1 u_1^* \in A$, $\|\hat{s}_0 - T_0\| < \varepsilon$ and $\|\hat{s}_1 - v_0^{-1} T_0 v_0\| < \varepsilon$. Note that $\hat{s}_0$ and $\hat{s}_1$ have finite propagation. Furthermore, there exists an invertible element $v \in B$ with finite propagation, norm 1, and $\|v - v_0\| < \varepsilon$ and $\|v^{-1} - v_0^{-1}\| < \varepsilon$. It follows that $\|v \hat{s}_1 v^{-1} - T_0\| < 3\varepsilon$. At this moment, the setting is as follows: we have finite propagation operators $v$, $\hat{s}_0$, $\hat{s}_1$ and $T_0$, such that $\|\hat{s}_0 - T_0\| < \varepsilon$, $\|v \hat{s}_1 v^{-1} - T_0\| < 3\varepsilon$.

Claim 4. There exists $p \in A$, such that $\|p - u_1 \hat{s}_1 v^{-1} u_0^*\| < 4\varepsilon$.

Proof of claim. Combining the two inequalities with $T_0$ gives
\[
4\varepsilon > \|\hat{s}_0 - v \hat{s}_1 v^{-1}\| = \|v^{-1} \hat{s}_0 - \hat{s}_1 v^{-1}\| \ge \|u_1 v^{-1} u_0^*\, u_0 \hat{s}_0 u_0^* - u_1 \hat{s}_1 u_1^*\, u_1 v^{-1} u_0^*\|.
\]
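Denoting $w = u_1 v^{-1} u_0^* \in B(\ell^2(Y) \otimes \ell^2(\mathbb{N}))$, $s_0 = u_0 \hat{s}_0 u_0^* \in A$, $s_1 = u_1 \hat{s}_1 u_1^* \in A$, we obtain $\|w s_0 - s_1 w\| < 4\varepsilon$. Note that $w$ has finite propagation. Let $k$ be such that all entries of $s_0$ and $s_1$ belong to $M_k$. We split the standard basis of $\ell^2(Y) \otimes \ell^2\mathbb{N}$ into two sets $B_1$ (first $k$ vectors from each $\{y\} \otimes \ell^2\mathbb{N}$) and $B_2$ (the other basis vectors). With respect to this decomposition, we can write $s_0 = \begin{pmatrix} s'_{11} & 0 \\ 0 & 0 \end{pmatrix}$, $s_1 = \begin{pmatrix} s_{11} & 0 \\ 0 & 0 \end{pmatrix}$ and $w = \begin{pmatrix} w_{11} & w_{12} \\ w_{21} & w_{22} \end{pmatrix}$. Consequently,
\[
4\varepsilon > \|w s_0 - s_1 w\|
= \left\| \begin{pmatrix} w_{11} s'_{11} & 0 \\ w_{21} s'_{11} & 0 \end{pmatrix} - \begin{pmatrix} s_{11} w_{11} & s_{11} w_{12} \\ 0 & 0 \end{pmatrix} \right\|
\ge \left\| \begin{pmatrix} 0 & s_{11} w_{12} \\ 0 & 0 \end{pmatrix} \right\|
= \|s_{11} w_{12}\|.
\]
Hence $\|s_{11} w_{12}\| < 4\varepsilon$. Denoting $p = \begin{pmatrix} s_{11} w_{11} & 0 \\ 0 & 0 \end{pmatrix}$, we immediately see that $p \in A$ and
\[
\|s_1 w - p\| = \left\| \begin{pmatrix} 0 & s_{11} w_{12} \\ 0 & 0 \end{pmatrix} \right\| < 4\varepsilon. \qquad \Box
\]
Returning to the proof of the lemma, we conclude
\[
\|p - u_1 v_0^{-1} T_0 u_0^*\| \le \|p - u_1 v^{-1} T_0 u_0^*\| + \varepsilon\|T_0\| \le \|p - u_1 \hat{s}_1 v^{-1} u_0^*\| + \varepsilon\|T_0\| + 3\varepsilon < 4C\varepsilon + \varepsilon\|T_0\| + 3\varepsilon.
\]
This finishes the proof. □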
Proof of the 1-case. The operator $\mathrm{ind}(Q) - 1 = \exp\bigl(-2\pi i \tfrac{Q+1}{2}\bigr) - 1 \in B(H)$ is uniform ($P = \tfrac{Q+1}{2}$ satisfies $P^2 \sim_{ua} P$ and so $\exp(-2\pi i P) - 1 \sim_{ua} P(\exp(-2\pi i) - 1) = 0$), but might not have finite propagation. However, from the formula for $\mathrm{ind}(Q)$ and finite propagation of $Q$ it follows that $\mathrm{ind}(Q) - 1 \in \Theta(\phi)$, and so the existence of a suitable $\mathcal{A}$ follows from Lemma 8.10 (after we have fixed some quasi-latticing partition $(V_y)_{y\in Y}$).

We reduce the independence of the index on homotopies to independence on bases choices. Taking a homotopy $(H, \phi_t, Q_t)$ of 1-uniform Fredholm modules, we assume that all $Q_t$ have finite propagation. It follows that $U_t = \mathrm{ind}(Q_t)$, $t \in [0, 1]$, is a homotopy of invertibles in $B^+ = \Theta(\phi_0)^+$. Since the set of invertibles is open, by a standard compactness argument we can assume that the homotopy is piecewise-linear. Hence, it is sufficient to assume that we have just one linear path of invertibles from (say) $U_0$ to $U_1$ in $B^+$, and that we are given two bases choices $\mathcal{A}_0$ and $\mathcal{A}_1$, such that $U_i \in C_k^*(X, \mathcal{A}_i)^+$, $i = 0, 1$. Applying Lemma 8.11 gives a bases choice $\mathcal{A}$, to which each $\mathcal{A}_i$, $i = 0, 1$, is related by the weaker relation of Definition 8.7. Hence $U_0$, $U_1$ and the whole (linear) homotopy between them is actually in $C_k^*(X, \mathcal{A})^+$. So $[U_0] = [U_1] \in K_1(C_k^*(X, \mathcal{A}))$, and the assertion will follow from the independence of the index on the choice of a bases choice.

We find ourselves in the following situation: we are given an invertible $U = 1 + K$, $K \in B = \Theta(\phi)$, and two bases choices $\mathcal{A}_0$, $\mathcal{A}_1$, such that $K \in C_k^*(X, \mathcal{A}_i)$, $i = 0, 1$. We will think of $M_k = M_k(\mathbb{C})$ as $B(\mathrm{span}(e_1, \dots, e_k))$ in $\mathcal{K}(\ell^2\mathbb{N})$, where $e_1, e_2, \dots$ is the standard basis of $\ell^2\mathbb{N}$. Let $A \subset B(\ell^2(Y) \otimes \ell^2\mathbb{N})$ be the algebra of all finite propagation matrices $(t_{yz})_{y,z\in Y}$ for which there exists $k \in \mathbb{N}$ with $t_{yz} \in M_k$ for all $y, z \in Y$. Then $C_k^*(Y)$ is the norm closure of $A$. Denote $u_0 = u_{\mathcal{A}_0}$ and $u_1 = u_{\mathcal{A}_1}$. We will prove that $\begin{pmatrix} U & 0 \\ 0 & 1 \end{pmatrix} \sim \begin{pmatrix} 1 & 0 \\ 0 & U \end{pmatrix} \in C_k^*(X, \mathcal{A}_0, \mathcal{A}_1)^+$. The standard rotation homotopy between these two matrices has the form
\[
1 + \begin{pmatrix} \sin^2(\tfrac{\pi}{2} t) & \cos(\tfrac{\pi}{2} t)\sin(\tfrac{\pi}{2} t) \\ \cos(\tfrac{\pi}{2} t)\sin(\tfrac{\pi}{2} t) & \cos^2(\tfrac{\pi}{2} t) \end{pmatrix} K,
\]
and so it is sufficient to prove that actually $\begin{pmatrix} 0 & K \\ 0 & 0 \end{pmatrix}$ and $\begin{pmatrix} 0 & 0 \\ K & 0 \end{pmatrix} \in C_k^*(X, \mathcal{A}_0, \mathcal{A}_1)$. Equivalently, that $u_0 K u_1^*$ and $u_1 K u_0^* \in A$.
Pick $\varepsilon > 0$. Since $K \in C_k^*(X, \mathcal{A}_i)$, $i = 0, 1$, there exist $\hat{s}_0, \hat{s}_1 \in B(H)$ with finite propagation, such that $s_i := u_i \hat{s}_i u_i^* \in A$ and $\|\hat{s}_i - K\| < \varepsilon$ for $i = 0, 1$. Since $K \in B$, there exists an operator $\hat{K} \in B$ with finite propagation, such that $\|K - \hat{K}\| < \varepsilon$. Consequently, $\|\hat{s}_i - \hat{K}\| < 2\varepsilon$ for $i = 0, 1$. At this moment, we can apply the proof of Claim 4 above (with $v = 1$ and $T_0 = K$, otherwise verbatim), to obtain $p \in A$, such that $\|p - u_1 K u_0^*\| < 8\varepsilon$. Letting $\varepsilon \to 0$, we obtain that $u_1 K u_0^* \in A$. An analogous proof shows also $u_0 K u_1^* \in A$. We are done. □

10. On the Baum–Connes conjecture with $\ell^\infty$-coefficients

As an application of the Mayer–Vietoris sequence for uniform K-homology, we exhibit a connection with the Baum–Connes conjecture. Yu [14] proved that for a discrete group $\Gamma$, the Baum–Connes conjecture [1] for $\Gamma$ with coefficients in $\ell^\infty(\Gamma, \mathcal{K})$ is equivalent to the Coarse Baum–Connes conjecture for $\Gamma$. The right-hand side in both conjectures is the K-theory of the Roe C*-algebra $C^*\Gamma \cong \ell^\infty(\Gamma, \mathcal{K}) \rtimes_r \Gamma$. The core of Yu's proof is showing the left-hand sides are the same, i.e. that top K∗ Γ, ∞ (Γ, K ) = lim KK∗Γ C0 ρ −1 () , ∞ (Γ, K ) ∼ = lim K∗ (Pd Γ ), ⊂BΓ, compact
d→∞
where ρ : EΓ → BΓ denotes the quotient map. We prove an analogous statement for uniform K-homology in certain cases: Theorem 10.1. If Γ is a torsion-free countable discrete group, then lim KK∗Γ C0 ρ −1 () , ∞ Γ ∼ = lim K∗u (Pd Γ ). ⊂BΓ, compact
d→∞
As a consequence, this provides a computation of limd→∞ K∗u (Pd Γ ) for torsion-free discrete groups for which the Baum–Connes conjecture with commutative coefficients is known, for instance Zn or the free groups (see e.g. [7,10]). We consider a discrete group Γ endowed with a proper, left-invariant metric. Such a metric makes Γ into a uniformly discrete space with bounded geometry. There are such metrics on every discrete group, and any two such are quasi-isometric. For instance, if Γ is finitely generated, the word metric provides an example of such a metric. Proof of 10.1. First, realize that for countable discrete groups lim KK∗Γ C0 ρ −1 () , ∞ Γ ∼ = lim KK∗Γ C0 (Pd Γ ), ∞ Γ . ⊂BΓ, compact
d→∞
The rest of the proof is devoted to showing that the right-hand side above is in fact isomorphic to limd→∞ K∗u (Pd Γ ), by using the Mayer–Vietoris sequence. We proceed similarly as in [14, Proof of Theorem 2.7]. Let X be a Γ -invariant subset of ρ −1 (), where ⊂ BΓ is compact. We construct a homomorphism ψ : K∗u (X) → KK∗Γ C0 (X), ∞ Γ
as follows: given a uniform Fredholm module $(H, \phi, F)$ for $X$, we let $H' = H \otimes \ell^\infty\Gamma \cong \ell^\infty(\Gamma, H)$, a Hilbert module over $\ell^\infty\Gamma$. The group $\Gamma$ acts on it by translations. Furthermore, we define $\phi' : C_0(X) \to B(H')$ by $(\phi'(f)\xi)(\gamma) = \phi(\gamma^* f)(\xi(\gamma))$ for $f \in C_0(X)$, $\xi \in H' \cong \ell^\infty(\Gamma, H)$, $\gamma \in \Gamma$, where $\gamma^*$ denotes the action of $\gamma$ on $C_0(X)$. Finally, we put $F' \in B(H')$ to be the operator given by $(F'\xi)(\gamma) = F(\xi(\gamma))$. It is straightforward to check that the triple $(H', \phi', F')$ is a Fredholm $\Gamma$-module. Since $\Gamma$ acts on $X$ by isometries, note that $\gamma^* f$ has the same support and the same $L$-continuity as $f$, and so the size of matrices which approximate expressions like $(F^2 - 1)\phi(\gamma^* f)$ does not depend on $\gamma$, only on $f$. We let the image of $[(H, \phi, F)]$ under $\psi$ be $[(H', \phi', F')]$. It is immediate that this assignment is well-defined, and it describes a group homomorphism.

We prove that $\psi$ is an isomorphism for $X = P_d\Gamma$. For $X = P_d\Gamma$, there exists a finite cover $\{U_i\}_{i=1}^m$ of $X$, such that each $U_i$ is a space of the form $\Gamma \times Y$, where $Y$ is contractible, compact, with diameter at most $\tfrac{1}{2}$. Such a cover can be constructed by considering a sufficiently fine barycentric subdivision of the finite simplicial complex $P_d\Gamma/\Gamma$, and then pulling it back to $P_d\Gamma$ by $\rho$. We now use the Mayer–Vietoris sequences for both $KK^\Gamma_*$ and $K^u_*$ (Theorem 5.1) simultaneously for showing that $\psi$ is an isomorphism for $X = P_d\Gamma$. Note that $\psi$ commutes with all the involved Mayer–Vietoris sequences. This, together with an induction process, reduces the general case to the case when $X = \Gamma \times Y$, where $Y$ is as above.

By [14, Lemma 2.3], $KK^\Gamma_*(C_0(\Gamma \times Y), \ell^\infty\Gamma)$ can be identified with $KK_*(C_0(Y), \ell^\infty\Gamma)$. Under this identification, the map $\psi : K^u_*(\Gamma \times Y) \to KK_*(C_0(Y), \ell^\infty\Gamma)$ can be understood as follows: Given a uniform Fredholm module $(H, \phi, F)$ for $\Gamma \times Y$, denote $H_\gamma = \phi(\chi_{\{\gamma\}\times Y})H$. Then $H'$, built from the family $(H_\gamma)_{\gamma\in\Gamma}$, is naturally a Hilbert module over $\ell^\infty\Gamma$ (with the coordinate-wise inner product). Furthermore, we let $\phi' : C_0(Y) \to B(H')$ be defined as $((\phi'(f))\xi)(\gamma) = \phi(\chi_{\{\gamma\}\times Y} \cdot f)(\xi(\gamma))$ for $f \in C_0(Y)$, $\xi \in H'$, $\gamma \in \Gamma$. Finally, define $F' \in B(H')$ to act on the $\gamma$-th coordinate by $\phi(\chi_{\{\gamma\}\times Y}) F \phi(\chi_{\{\gamma\}\times Y})$. With this notation, $\psi$ assigns the Fredholm module $[(H', \phi', F')]$ to $[(H, \phi, F)]$. Now it is easy to see that $\psi$ is an isomorphism, since we may assume that $F$ has propagation at most $\tfrac{1}{2}$, by Proposition 7.4, Remark 7.2 and the properties of the space $\Gamma \times Y$. This concludes the proof. □

Remark 10.2. From the constructions in the above proof, the isomorphism commutes with the uniform index map. Moreover, from the definition of the index map it is clear that in fact it coincides with the Baum–Connes map with coefficients in $\ell^\infty\Gamma$ when $\Gamma$ is torsion-free.

Corollary 10.3. The statement that
\[
\mu_u : \lim_{d\to\infty} K^u_*(P_d\Gamma) \to K_*(C_u^*\Gamma)
\]
is an isomorphism for a torsion-free countable discrete group $\Gamma$ (an analogue of the Coarse Baum–Connes conjecture) is equivalent to the Baum–Connes conjecture for $\Gamma$ with coefficients in $\ell^\infty\Gamma$.
Remark 10.4. It is likely that the conclusion of Theorem 10.1 holds without any assumption on torsion. However, that would require at least some degree of homotopy invariance of the uniform K-homology, which would allow us to pass between $E\Gamma$ and $\underline{E}\Gamma$ on the K-homology side (cf. [14, Lemma 2.10]).

11. Amenability

As an application of uniform K-homology, we prove a criterion for amenability. It is analogous to similar criteria in the context of uniformly finite homology [3] and K-theory of uniform Roe algebras [6]. Both of these can be interpreted as saying that a space (which is uniformly discrete and has bounded geometry) is amenable if and only if its "fundamental class" is nontrivial in the appropriate group. In [3], it is the usual fundamental class in the 0th uniformly finite homology group; in [6] it is the class of the identity operator [1] in the $K_0$ group of the uniform Roe algebra. Our criterion (Theorem 11.2) has the same form. Recall (Følner's) definition of amenability (see [3, Section 3]).

Definition 11.1. Let $Y$ be a uniformly discrete metric space. For a set $U \subset Y$, we define its $r$-boundary by
\[
\partial_r U = \{\, y \in Y \mid d(y, U) \le r \text{ and } d(y, Y \setminus U) \le r \,\}.
\]
We say that $Y$ is amenable, if for any $r, \delta > 0$, there exists a finite set $U \subset Y$, such that
\[
\frac{|\partial_r U|}{|U|} < \delta.
\]

Note that this definition is equivalent to the usual definition of amenability of groups (existence of an invariant mean) for spaces arising as Cayley graphs of discrete groups. However, we do not require the Følner sets to exhaust the whole space, and so we need to be cautious when applying this to general metric spaces. For instance, taking any uniformly discrete metric space $Y$, one can make it amenable by attaching an infinite "spaghetti" to it, i.e. an infinite ray. Also note that any "coarse disjoint union of finite spaces" is also amenable in this sense, since for a given $r > 0$, we can always select a finite piece $U$ of the space, which is at least $r$-far from the rest of the space, hence making $\partial_r U = \emptyset$. In particular, this applies to expanders.

Let $X$ be a graph (with the edges attached) and let $Y$ be its vertex set. Recall the definition of the fundamental class $\mathcal{S} \in K^u_0(X)$ (see Example 2.9). Let $H = \ell^2 Y \otimes \ell^2\mathbb{N}$, and endow $H$ with the multiplication action of $C_0(X)$. Let $S \in B(\ell^2\mathbb{N})$ be the unilateral shift. Let $\tilde{S} = \mathrm{diag}(S) \in B(H)$ and finally denote $\mathcal{S} = [(H, \phi, \tilde{S})]$. It is easy to see that $\mathcal{S} \in K^u_0(X)$, and that $\mathrm{ind}(\tilde{S}) = 1 \otimes p_0 \in B(\ell^2 Y \otimes \ell^2\mathbb{N})$, where $p_0$ is a rank one projection (onto $\mathbb{C}e_1 \subset \ell^2\mathbb{N}$). We also denote by $0 \in K^u_0(Y)$ the trivial element.

Theorem 11.2. Let $X$ be a connected graph with the vertex set $Y$. Then $Y$ is amenable if and only if $\mathcal{S} \ne 0$ in $K^u_0(X)$. More generally, if $X$ is not connected, then $Y$ is amenable if and only if there exists $C \ge 0$, such that $\mathcal{S} \ne 0$ in $K^u_0(P_C(Y))$ (recall that $P_C(Y)$ denotes the Rips complex of $Y$, see Definition 6.3).

Remark 11.3. Note that the technical assumption that $Y$ is a graph is not too restrictive, since every metric space with bounded geometry is coarsely equivalent to a graph.
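Definition 11.1 can be made concrete on the integers with the usual distance. The following is a minimal sketch under stated assumptions: the names `r_boundary` and `folner_data_on_integers` are hypothetical, the set U is an interval of n consecutive integers, and the infinite space is replaced by a finite window large enough to contain every point within distance r of U. It counts the r-boundary of U so that the Følner ratio can be read off.

```python
def r_boundary(u_set, window, r, dist):
    """Points y of the window with d(y, U) <= r and d(y, complement of U) <= r.

    The window is a finite stand-in for the whole space Y (assumption of this
    sketch); it must contain every point of Y within distance r of U.
    """
    u_set = set(u_set)
    outside = [z for z in window if z not in u_set]
    boundary = set()
    for y in window:
        near_u = any(dist(y, u) <= r for u in u_set)
        near_complement = any(dist(y, z) <= r for z in outside)
        if near_u and near_complement:
            boundary.add(y)
    return boundary


def folner_data_on_integers(n, r):
    """Return (|boundary of U|, |U|) for U = {0, ..., n-1} inside the integers."""
    u_set = range(n)
    window = range(-2 * r, n + 2 * r)
    boundary = r_boundary(u_set, window, r, lambda a, b: abs(a - b))
    return len(boundary), n


if __name__ == "__main__":
    for n in (10, 100, 1000):
        size_boundary, size_u = folner_data_on_integers(n, 3)
        print(n, size_boundary / size_u)
```

For fixed r the boundary of an interval has a size independent of n once n is at least 2r, so the ratio tends to 0 as the interval grows; this is the Følner condition for the integers.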
Proof. If $Y$ is amenable, then $\mu_u(\mathcal{S}) = [1] \ne [0] = \mu_u(0) \in K_0(C_u^* Y)$ by [6], and so $\mathcal{S} \ne 0$. For the convenience of the reader, let us sketch this part of Elek's proof. The idea is that if $Y$ is amenable, then using Følner sets $B_n$, one can construct a trace on $C_u^* Y$ as an ultralimit of functions $f_n(T) = \frac{1}{|B_n|} \sum_{x \in B_n} t_{xx}$. This trace then distinguishes $[1]$ from $[0]$ in $K_0(C_u^* Y)$.

Let us turn to the reverse implication. Assume that $Y$ is not amenable. We will construct a homotopy connecting $\mathcal{S}$ and $0$ in $K^u_0(X)$. First, we describe a "building block". Denote $I = [0, 1]$. Denote $T_0 = \begin{pmatrix} 1 & 0 \\ 0 & S \end{pmatrix}$ and $T_1 = \begin{pmatrix} S & 0 \\ 0 & 1 \end{pmatrix} \in B(H_I)$, where $H_I = \ell^2\mathbb{N} \oplus \ell^2\mathbb{N}$. Let the action $\psi$ of $C(I)$ on $H_I$ be $\psi(f)(\eta \oplus \xi) = f(0)\eta \oplus f(1)\xi$. Let us show a homotopy $(H_I, \psi_t, T_t)$ between $(H_I, \psi, T_0)$ and $(H_I, \psi, T_1)$. Define
\[
\psi_t(f)(\eta \oplus \xi) = \begin{cases}
f(0)\eta \oplus f(1 - 3t)\xi, & 0 \le t \le \tfrac{1}{3}, \\
f(0)\eta \oplus f(0)\xi, & \tfrac{1}{3} \le t \le \tfrac{2}{3}, \\
f(0)\eta \oplus f(3t - 2)\xi, & \tfrac{2}{3} \le t \le 1,
\end{cases}
\]
and
\[
T_t = \begin{cases}
T_0, & 0 \le t \le \tfrac{1}{3}, \\
\alpha_t \begin{pmatrix} 1 & 0 \\ 0 & S \end{pmatrix} \alpha_t^*, & \tfrac{1}{3} \le t \le \tfrac{2}{3}, \\
T_1, & \tfrac{2}{3} \le t \le 1,
\end{cases}
\]
where $\alpha_t = \begin{pmatrix} \cos(\tfrac{\pi}{2}(3t-1)) & \sin(\tfrac{\pi}{2}(3t-1)) \\ -\sin(\tfrac{\pi}{2}(3t-1)) & \cos(\tfrac{\pi}{2}(3t-1)) \end{pmatrix}$ is the rotation homotopy. It is clear that the operators $\begin{pmatrix} S^k & 0 \\ 0 & S^l \end{pmatrix}$ and $\begin{pmatrix} S^{k-1} & 0 \\ 0 & S^{l+1} \end{pmatrix}$ (on the same Hilbert space with the same action of $C(I)$) are homotopic as well.

Now we turn to $Y \subset X$. Assuming non-amenability of $Y$ and applying [3, Theorem 3.1 and Lemma 2.4], for each $y \in Y$ there exists a "tail", i.e. a sequence $(z^y_i)_{i\ge 0} \subset Y$, such that $z^y_0 = y$, $C = \sup_{y,i} d(z^y_i, z^y_{i+1}) < \infty$, satisfying the condition that in every ball of a fixed radius, the number of tails passing through is uniformly bounded. In the case when $X$ is connected, we can reduce the general $C$ to the case $C = 1$, i.e. to the situation when the tails actually follow the edges of $X$. We may achieve this just by refining the tails, without violating the condition on the uniform bound on tails passing through balls, since $Y$ has bounded geometry. If we do not assume connectedness, we may get by working with the Rips complex $P_C(Y)$ instead of $X = P_1(Y)$, since any two points at distance at most $C$ are connected by an edge in $P_C(Y)$.

Consequently, it is possible to partition the collection of edges contained in all tails $((z^y_i, z^y_{i+1}))_{y\in Y,\, i\in\mathbb{N}}$ (we allow for multiplicities) into finitely many parts $A_1, \dots, A_k$, such that no two edges from the same part share a common vertex. The idea of the rest of the construction is to "send" $\tilde{S}$ along the tails off to infinity, thus connecting $\tilde{S}$ with 1. This is done in $k$ steps. In step $j$, we simultaneously apply the building block construction to each of the edges in $A_j$ (this is possible by the choice of $A_j$), thus "transferring" one $S$ along each of those edges. After each step, we obtain a diagonal matrix in $B(H)$ with various powers of $S$ on the diagonal. The whole homotopy begins with $\tilde{S}$, and ends with 1, since after all $k$ steps the $S$ from each $y \in Y$ was shifted away from $y$ along the tail. □
Acknowledgment

The author would like to thank Guoliang Yu for helpful and enlightening conversations and never-ending encouragement.

References

[1] P. Baum, A. Connes, N. Higson, Classifying space for proper actions and K-theory of group C*-algebras, in: C*-Algebras: 1943–1993, San Antonio, TX, 1993, in: Contemp. Math., vol. 167, Amer. Math. Soc., Providence, RI, 1994, pp. 240–291.
[2] B. Blackadar, K-Theory for Operator Algebras, second ed., Math. Sci. Res. Inst. Publ., vol. 5, Cambridge Univ. Press, Cambridge, 1998.
[3] J. Block, S. Weinberger, Aperiodic tilings, positive scalar curvature, and amenability of spaces, J. Amer. Math. Soc. 5 (4) (1992) 907–918.
[4] J. Block, S. Weinberger, Large scale homology theories and geometry, in: Geometric Topology, Athens, GA, 1993, in: AMS/IP Stud. Adv. Math., vol. 2, Amer. Math. Soc., Providence, RI, 1997, pp. 522–569.
[5] J. Brodzki, G.A. Niblo, N.J. Wright, Property A, partial translation structures, and uniform embeddings in groups, J. Lond. Math. Soc. (2) 76 (2) (2007) 479–497.
[6] G. Elek, The K-theory of Gromov's translation algebras and the amenability of discrete groups, Proc. Amer. Math. Soc. 125 (9) (1997) 2551–2553.
[7] N. Higson, G.G. Kasparov, E-theory and KK-theory for groups which act properly and isometrically on Hilbert space, Invent. Math. 144 (1) (2001) 23–74.
[8] N. Higson, J. Roe, Analytic K-Homology, Oxford Math. Monogr., Oxford Univ. Press, Oxford Sci. Publ., Oxford, 2000.
[9] G.G. Kasparov, Equivariant KK-theory and the Novikov conjecture, Invent. Math. 91 (1) (1988) 147–201.
[10] V. Lafforgue, K-théorie bivariante pour les algèbres de Banach et conjecture de Baum–Connes, Invent. Math. 149 (1) (2002) 1–95.
[11] J. Roe, An index theorem on open manifolds. I, II, J. Differential Geom. 27 (1) (1988) 87–113, 115–136.
[12] J. Roe, Index Theory, Coarse Geometry, and Topology of Manifolds, CBMS Reg. Conf. Ser. Math. 90, Published for the Conference Board of the Mathematical Sciences, Washington, DC, 1996.
[13] J. Roe, Lectures on Coarse Geometry, Univ. Lecture Ser. 31, Amer. Math. Soc., Providence, RI, 2003.
[14] G. Yu, Baum–Connes conjecture and coarse geometry, K-Theory 9 (3) (1995) 223–231.
[15] G. Yu, The coarse Baum–Connes conjecture for spaces which admit a uniform embedding into Hilbert space, Invent. Math. 139 (1) (2000) 201–240.
Journal of Functional Analysis 257 (2009) 122–148 www.elsevier.com/locate/jfa
Stable manifolds for nonuniform polynomial dichotomies ✩ António J.G. Bento ∗ , César Silva Departamento de Matemática, Universidade da Beira Interior, 6201-001 Covilhã, Portugal Received 17 October 2008; accepted 29 January 2009 Available online 24 February 2009 Communicated by L. Gross
Abstract

We establish the existence of smooth stable manifolds in Banach spaces for sufficiently small perturbations of a new type of dichotomy that we call nonuniform polynomial dichotomy. This new dichotomy is more restrictive in the "nonuniform part" but allows the "uniform part" to obey a polynomial law instead of an exponential (more restrictive) law. We consider two families of perturbations. For one of the families we obtain local Lipschitz stable manifolds and for the other family, assuming more restrictive conditions on the perturbations and their derivatives, we obtain C^1 global stable manifolds. Finally we present an example of a family of nonuniform polynomial dichotomies and apply our results to obtain stable manifolds for some perturbations of this family.

© 2009 Elsevier Inc. All rights reserved.

Keywords: Invariant manifolds; Nonautonomous dynamics; Nonuniform polynomial dichotomies
1. Introduction The existence of invariant manifolds is one of the key features in the theory of nonuniform hyperbolicity. The concept of nonuniform hyperbolicity introduced by Pesin [20–22] (see [1,2] for a description of the current status of the theory) is a generalization of the classical concept of (uniform) hyperbolicity. In the nonuniform hyperbolic context the rates of expansion and
Supported by Centro de Matemática da Universidade da Beira Interior.
* Corresponding author.
E-mail addresses:
[email protected] (A.J.G. Bento),
[email protected] (C. Silva). URL: http://www.mat.ubi.pt/~csilva (C. Silva). 0022-1236/$ – see front matter © 2009 Elsevier Inc. All rights reserved. doi:10.1016/j.jfa.2009.01.032
A.J.G. Bento, C. Silva / Journal of Functional Analysis 257 (2009) 122–148
contraction are allowed to vary from point to point. The stable manifold theorem for nonuniformly hyperbolic trajectories obtained by Pesin [20] in the finite-dimensional setting is an elaboration of the classical work of Perron. Since then other proofs have been obtained: in [24] Ruelle gave a proof based on the study of perturbations of products of matrices occurring in Oseledets’ multiplicative ergodic theorem [18]. The proof given by Pugh and Shub in [23] is based on the classical work of Hadamard and uses graph transform techniques. There are also versions of the stable manifold theorem for dynamical systems in infinite-dimensional spaces. In [25] Ruelle established a corresponding version in Hilbert spaces under some compactness assumptions, following his approach in [24]. In [17] Mañé considered transformations in Banach spaces under some compactness and invertibility assumptions that include the case of differentiable maps with compact derivative at each point. The results of Mañé were generalized by Thieullen in [27] for a family of transformations satisfying some asymptotic compactness. The existence of invariant manifolds is also an important subject in the context of exponential dichotomies introduced by Perron in [19]. There is a substantial amount of literature concerning the existence of stable and unstable manifolds for exponential dichotomies (see for example [26] and the references given there). As mentioned before, the notion of nonuniform hyperbolicity is a generalization of the concept of (uniform) hyperbolicity. Similarly, there is also a concept of nonuniform exponential dichotomy, introduced by Barreira and Valls [9,6], that is a weaker (and therefore more general) version of the classical notion of exponential dichotomy. In the discrete time setting, Barreira and Valls obtained $C^1$ stable manifolds for nonuniformly exponential dichotomies in finite dimension in [7].
Building on this result, Barreira, Silva and Valls were able in [3] to establish the existence of $C^k$ local manifolds for $C^k$ perturbations, using an induction process and considering a more geometric approach based on the linear extension of the dynamics. Assuming some exponential decay of the derivatives along the orbits, the same authors established in [4] the existence of $C^1$ global manifolds for perturbations of nonuniformly exponential dichotomies in Banach spaces. Continuous time versions of these results were obtained by Barreira and Valls in [5,6,8,11]. In a recent work Barreira and Valls [12] considered a generalization of the concept of nonuniform exponential dichotomies that they call $\rho$-nonuniform exponential dichotomy, with $\rho$ an increasing differentiable function from $\mathbb{R}_0^+$ into $\mathbb{R}_0^+$ such that
\[ \lim_{t\to+\infty} \frac{\log t}{\rho(t)} = 0. \tag{1} \]
With this generalization, Barreira and Valls replaced the asymptotic rates $e^{ct}$ that appear in the nonuniform exponential case by the asymptotic rates $e^{c\rho(t)}$. Barreira and Valls established, in a finite-dimensional space, the existence of $\rho$-nonuniform exponential dichotomies and trichotomies for a general family of nonautonomous linear differential equations $v' = A(t)v$, where $A(t)$ are matrices in some block form. To achieve this Barreira and Valls used an adapted type of Lyapunov exponent. In this work we consider a different kind of nonuniform dichotomy where the rates of expansion and contraction are allowed to vary polynomially. Naturally, the nonuniform parts must vary at most polynomially (see (2) and (3)). We thus consider a new type of behavior: there are families of nonuniform polynomial dichotomies that are not nonuniform exponential dichotomies and vice versa. Even in the more general case of $\rho$-nonuniform exponential dichotomy, it is not possible to have a polynomial behavior because, to have a polynomial behavior, $\rho(t)$ would have
to increase at most logarithmically, and by (1) this is not possible. Note that our case allows situations for which the Lyapunov exponent defined in [10, Section 8] for Hilbert spaces is zero for all v ∈ E1 (see Section 2 for the definition of E1 ). The main results of this paper are stable manifolds theorems for perturbations of nonuniform polynomial dichotomies. In Section 3 we get local Lipschitz stable manifolds and in Section 4 we get global C 1 stable manifolds. The reason for the difference in the regularity of the manifolds obtained is that to get C 1 manifolds in the local case we would have to consider perturbations that are zero outside a ball of increasingly small radius and it is not known in the infinite-dimensional setting how to obtain appropriate cutoff functions (see the comment in [11, p. 2]). The content of the paper is the following: in Section 2 we introduce some notations, the main definitions and also establish a technical lemma used several times; then, respectively in Sections 3 and 4, we obtain a local and a global stable manifold theorem; finally we present in Section 5 examples of families of nonuniform polynomial dichotomies and apply our results to obtain stable manifolds for some perturbations of these families. 2. Notation and preliminaries Let B(X) be the space of bounded linear operators in the Banach space X. Given a sequence (An )n∈N of invertible operators of B(X) we define A(m, n) =
$A_{m-1} \cdots A_n$ if $m > n$, and $A(m, n) = \mathrm{Id}$ if $m = n$.
We say that $(A_n)_{n\in\mathbb{N}}$ admits a nonuniform polynomial dichotomy if there exist projections $P_n\colon X \to X$ for $n \in \mathbb{N}$ such that
\[ P_m A(m, n) = A(m, n) P_n, \qquad m, n \in \mathbb{N}, \]
and constants $a < 0 \le b$, $\varepsilon \ge 0$ and $D \ge 1$ such that for every $m \ge n$,
\[ \|A(m, n) P_n\| \le D (m - n + 1)^{a} n^{\varepsilon}, \tag{2} \]
\[ \|A(m, n)^{-1} Q_m\| \le D (m - n + 1)^{-b} m^{\varepsilon}, \tag{3} \]
where $Q_m = \mathrm{Id} - P_m$ is the complementary projection. When $\varepsilon = 0$ we say that we have a uniform polynomial dichotomy or simply a polynomial dichotomy. In these conditions we define, for each $n \in \mathbb{N}$, the linear subspaces $E_n = P_n(X)$ and $F_n = Q_n(X)$. Without loss of generality, we always identify the spaces $E_n \times F_n$ and $E_n \oplus F_n$ as the same space and we equip these spaces with the norm given by $\|(x, y)\| = \|x\| + \|y\|$ for $(x, y) \in E_n \times F_n$. We are going to address the problem of existence of stable manifolds of the dynamics given by
\[ F(m, n) = \begin{cases} (A_{m-1} + f_{m-1}) \circ \cdots \circ (A_n + f_n) & \text{if } m > n, \\ \mathrm{Id} & \text{if } m = n, \end{cases} \tag{4} \]
where (An )n∈N admits a nonuniform polynomial dichotomy and fn : X → X are perturbations that verify some conditions to be specified later.
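To make conditions (2) and (3) concrete, the following Python sketch checks them numerically for a toy diagonal example. The example itself is an assumption made for illustration and is not taken from the paper: we take $X = \mathbb{R}^2$, $P_n$ the projection onto the first coordinate, and $A_n = \mathrm{diag}(((n+1)/n)^a, ((n+1)/n)^b)$, so that $A(m,n)$ acts as $(m/n)^a$ on $E_n$ and as $(m/n)^b$ on $F_n$. The names `transfer` and `satisfies_dichotomy` are hypothetical helpers.

```python
def transfer(rate, m, n):
    """Scalar A(m, n) = A_{m-1} ... A_n for A_k = ((k+1)/k)**rate (equals 1 if m == n)."""
    out = 1.0
    for k in range(n, m):
        out *= ((k + 1) / k) ** rate
    return out


def satisfies_dichotomy(a, b, eps, D=1.0, max_n=30, max_gap=60, tol=1e-9):
    """Check bounds (2) and (3) on a finite grid of pairs n <= m (toy diagonal example)."""
    for n in range(1, max_n + 1):
        for m in range(n, n + max_gap + 1):
            gap = m - n + 1
            stable = transfer(a, m, n)              # plays the role of ||A(m,n) P_n||
            unstable_inv = 1.0 / transfer(b, m, n)  # plays the role of ||A(m,n)^{-1} Q_m||
            if stable > D * gap ** a * n ** eps * (1 + tol):
                return False
            if unstable_inv > D * gap ** (-b) * m ** eps * (1 + tol):
                return False
    return True


if __name__ == "__main__":
    print(satisfies_dichotomy(a=-1.0, b=0.5, eps=1.0))  # bounds hold on the grid
    print(satisfies_dichotomy(a=-1.0, b=0.5, eps=0.0))  # fails with D = 1 and eps = 0
```

In this toy case the bounds hold on the grid with $D = 1$ and $\varepsilon = \max\{-a, b\}$, while the check with $\varepsilon = 0$ and $D = 1$ fails, which illustrates the role of the nonuniform factors $n^{\varepsilon}$ and $m^{\varepsilon}$. A finite grid check is only a sanity test, not a proof.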
Given $n \in \mathbb{N}$ and $v_n = (\xi, \eta) \in E_n \times F_n$, for each $m > n$ we write $v_m = F(m, n)(v_n) = (x_m, y_m) \in E_m \times F_m$. Writing $f_m = (g_m, h_m)$ where $g_m = P_{m+1} f_m$ and $h_m = Q_{m+1} f_m$ for each $m \ge n$, the trajectory $(v_m)_{m \ge n}$ satisfies the following equations:
\[ x_m = A(m, n)\xi + \sum_{k=n}^{m-1} A(m, k+1)\, g_k(x_k, y_k), \tag{5} \]
\[ y_m = A(m, n)\eta + \sum_{k=n}^{m-1} A(m, k+1)\, h_k(x_k, y_k). \tag{6} \]
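Equations (5) and (6) are the discrete variation of constants formula for the iteration $v_{k+1} = A_k v_k + f_k(v_k)$. The sketch below compares both sides numerically in a two-dimensional diagonal toy model. The linear parts $A_k = \mathrm{diag}(((k+1)/k)^a, ((k+1)/k)^b)$ and the perturbations `g`, `h` are assumptions chosen only for illustration; they do not come from the paper.

```python
import math


def transfer(rate, m, n):
    """Scalar A(m, n) = A_{m-1} ... A_n for A_k = ((k+1)/k)**rate (equals 1 if m == n)."""
    out = 1.0
    for k in range(n, m):
        out *= ((k + 1) / k) ** rate
    return out


def g(k, x, y):
    """Hypothetical stable component of the perturbation f_k."""
    return 0.1 * x * y / k


def h(k, x, y):
    """Hypothetical unstable component of the perturbation f_k."""
    return 0.1 * x * x / k


def trajectory(n, m, xi, eta, a, b):
    """Iterate v_{k+1} = A_k v_k + f_k(v_k) from time n to time m."""
    xs, ys = [xi], [eta]
    for k in range(n, m):
        x, y = xs[-1], ys[-1]
        xs.append(transfer(a, k + 1, k) * x + g(k, x, y))
        ys.append(transfer(b, k + 1, k) * y + h(k, x, y))
    return xs, ys


def variation_of_constants(n, m, xi, eta, a, b):
    """Right-hand sides of (5) and (6), evaluated along the computed trajectory."""
    xs, ys = trajectory(n, m, xi, eta, a, b)
    x_m = transfer(a, m, n) * xi + sum(
        transfer(a, m, k + 1) * g(k, xs[k - n], ys[k - n]) for k in range(n, m))
    y_m = transfer(b, m, n) * eta + sum(
        transfer(b, m, k + 1) * h(k, xs[k - n], ys[k - n]) for k in range(n, m))
    return (x_m, y_m), (xs[-1], ys[-1])


if __name__ == "__main__":
    formula, iterate = variation_of_constants(3, 12, 0.1, 0.05, a=-1.0, b=0.5)
    print(formula, iterate)
```

Both pairs agree up to rounding, since the formula follows from the iteration by induction on $m$.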
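In what follows we are going to use Dirichlet series. For every $\alpha < -1$, we denote by $\lambda_\alpha$ the sum of the Dirichlet series $\sum_{k=1}^{\infty} k^{\alpha}$. The following lemma will be used several times.
Lemma 1. Let $m, n \in \mathbb{N}$ with $m > n$, $a < 0$, $q > 0$ and $\varepsilon \ge 0$.
(a) If $aq + \varepsilon < -1$, then the following inequality holds:
\[ \sum_{k=n}^{m-1} (m-k)^{a} (k+1)^{\varepsilon} (k-n+1)^{aq+a} \le 2^{\varepsilon-a} \lambda_{aq+\varepsilon} (m-n+1)^{a} n^{\varepsilon}. \tag{7} \]
(b) If $\varepsilon > 0$, then the following inequality holds:
\[ \sum_{k=n}^{m-1} (m-k)^{a} (k-n+1)^{a} (k+1)^{\varepsilon} k^{-3\varepsilon-1} \le 2^{\varepsilon-a} \lambda_{-2\varepsilon-1} (m-n+1)^{a}. \tag{8} \]
Proof. (a) Because the sum of the factors of $(m-k)(k-n+1)$ is constant and $a < 0$, it follows that
\[ (m-k)^{a} (k-n+1)^{a} \le (m-n)^{a} \le 2^{-a} (m-n+1)^{a}, \qquad k = n, \ldots, m-1. \tag{9} \]
On the other hand, we have
\[ (k+1)^{\varepsilon} = \left(1 + \frac{n}{k-n+1}\right)^{\varepsilon} (k-n+1)^{\varepsilon} \le 2^{\varepsilon} n^{\varepsilon} (k-n+1)^{\varepsilon}, \qquad k = n, \ldots, m-1. \]
Therefore
\[ \begin{aligned} \sum_{k=n}^{m-1} (m-k)^{a} (k+1)^{\varepsilon} (k-n+1)^{aq+a} &\le 2^{\varepsilon-a} (m-n+1)^{a} n^{\varepsilon} \sum_{k=n}^{m-1} (k-n+1)^{aq+\varepsilon} \\ &\le 2^{\varepsilon-a} \lambda_{aq+\varepsilon} (m-n+1)^{a} n^{\varepsilon}. \end{aligned} \]
(b) It follows immediately from (9) and $(k+1)^{\varepsilon} \le 2^{\varepsilon} k^{\varepsilon}$. □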
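As a quick sanity check of Lemma 1, the following sketch evaluates both sides of (7) and (8) on a finite grid. It is only an illustration under stated assumptions: the parameter values are arbitrary, the helper names are hypothetical, and $\lambda_\alpha$ is replaced by a partial sum of the Dirichlet series, which is a lower bound for $\lambda_\alpha$ and therefore makes the test slightly stricter than the lemma.

```python
def partial_lambda(alpha, terms=20000):
    """Partial sum of sum_{k>=1} k**alpha (alpha < -1); a lower bound for lambda_alpha."""
    return sum(k ** alpha for k in range(1, terms + 1))


def lhs_7(m, n, a, q, eps):
    return sum((m - k) ** a * (k + 1) ** eps * (k - n + 1) ** (a * q + a)
               for k in range(n, m))


def lhs_8(m, n, a, eps):
    return sum((m - k) ** a * (k - n + 1) ** a * (k + 1) ** eps * k ** (-3 * eps - 1)
               for k in range(n, m))


def check_lemma1(a=-1.5, q=2.0, eps=0.5, max_n=15, max_gap=40):
    """Return True if (7) and (8) hold for all 1 <= n <= max_n and n < m <= n + max_gap."""
    assert a < 0 and q > 0 and eps > 0 and a * q + eps < -1
    lam_7 = partial_lambda(a * q + eps)
    lam_8 = partial_lambda(-2 * eps - 1)
    for n in range(1, max_n + 1):
        for m in range(n + 1, n + max_gap + 1):
            rhs_7 = 2 ** (eps - a) * lam_7 * (m - n + 1) ** a * n ** eps
            rhs_8 = 2 ** (eps - a) * lam_8 * (m - n + 1) ** a
            if lhs_7(m, n, a, q, eps) > rhs_7 or lhs_8(m, n, a, eps) > rhs_8:
                return False
    return True


if __name__ == "__main__":
    print(check_lemma1())
    print(check_lemma1(a=-2.0, q=1.0, eps=0.25))
```

The partial sum uses many more terms than the largest value of $m - n$ (and of $m - 1$) in the grid, so it still dominates the finite sums that appear in the proof.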
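3. Local stable manifolds
In this section we assume that there are constants $c > 0$ and $q > 1$ such that the functions $f_n$ in the perturbed dynamics (4) verify the following conditions:
\[ f_n(0) = 0, \tag{10} \]
\[ \|f_n(u) - f_n(v)\| \le c \|u - v\| \left(\|u\| + \|v\|\right)^{q} \tag{11} \]
for every $n \in \mathbb{N}$ and $u, v \in X$. Making $v = 0$ in (11) we have
\[ \|f_n(u)\| \le c \|u\|^{q+1} \tag{12} \]
for every $u \in X$. We denote by $B_n(r)$ the open ball of $E_n$ centered at zero and with radius $r > 0$. The initial condition at time $n$ will be taken in $B_n(\delta n^{-\beta})$ for some $\delta, \beta > 0$. We denote by $X_\beta$ the space of sequences $(\varphi_n)_{n\in\mathbb{N}}$ of continuous functions $\varphi_n\colon B_n(\delta n^{-\beta}) \to F_n$ such that
\[ \varphi_n(0) = 0, \tag{13} \]
\[ \|\varphi_n(\xi) - \varphi_n(\bar\xi)\| \le \|\xi - \bar\xi\| \tag{14} \]
for every $\xi, \bar\xi \in B_n(\delta n^{-\beta})$. For each $\varphi = (\varphi_n)_{n\in\mathbb{N}} \in X_\beta$ we define
\[ \|\varphi\| = \sup\left\{ \frac{\|\varphi_n(\xi)\|}{\|\xi\|} : n \in \mathbb{N} \text{ and } \xi \in B_n(\delta n^{-\beta}) \setminus \{0\} \right\}. \]
Clearly $\|\varphi\| \le 1$, and given $n \in \mathbb{N}$ and $\xi \ne 0$, we have
\[ \|\varphi_n(\xi)\| \le \delta n^{-\beta} \frac{\|\varphi_n(\xi)\|}{\|\xi\|} \le \delta \|\varphi\| \le \delta \]
for every $\varphi \in X_\beta$. This readily implies that $X_\beta$ is a complete metric space with the distance induced by $\|\cdot\|$. We also consider the space $X^*_\beta$ of sequences $\varphi = (\varphi_n)_{n\in\mathbb{N}}$ with $\varphi_n\colon E_n \to F_n$ such that the sequence $(\varphi_n|_{B_n(\delta n^{-\beta})})_{n\in\mathbb{N}}$ is in $X_\beta$ and, for each $n \in \mathbb{N}$,
\[ \varphi_n(\xi) = \varphi_n\!\left(\frac{\delta n^{-\beta}\, \xi}{\|\xi\|}\right) \quad \text{whenever } \xi \notin B_n(\delta n^{-\beta}). \]
There is a one-to-one correspondence between the sequences in $X_\beta$ and in $X^*_\beta$ because for each sequence of functions $\varphi = (\varphi_n)_{n\in\mathbb{N}} \in X_\beta$ there is a unique extension $\bar\varphi = (\bar\varphi_n)_{n\in\mathbb{N}}$ such that each $\bar\varphi_n$ is a Lipschitz extension of $\varphi_n$ to the closure $\overline{B_n(\delta n^{-\beta})}$. Clearly $X^*_\beta$ is also a complete metric space with the metric induced by $X^*_\beta \ni \varphi \mapsto \|\varphi|_{X_\beta}\|$. Furthermore, one can easily verify that given $\varphi \in X^*_\beta$ and $n \in \mathbb{N}$ we have
\[ \|\varphi_n(x) - \varphi_n(y)\| \le 2 \|x - y\| \tag{15} \]
for every $x, y \in E_n$.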
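Given $\beta, \delta > 0$ and $(\varphi_n)_{n\in\mathbb{N}} \in X_\beta$, for each $n \in \mathbb{N}$, we consider the graph
\[ V_{n,\delta,\beta} = \left\{ (\xi, \varphi_n(\xi)) : \xi \in B_n(\delta n^{-\beta}) \right\}. \tag{16} \]
We now present the main result of this section.
Theorem 1 (Local stable manifolds). Let $(A_n)_{n\in\mathbb{N}}$ be a sequence of invertible bounded linear operators, acting on a Banach space $X$, that admits a nonuniform polynomial dichotomy satisfying (2) and (3) for some $D \ge 1$, $a < 0 \le b$ and $\varepsilon > 0$, and let $(f_n)_{n\in\mathbb{N}}$ be a sequence of functions, acting on $X$, that verifies (10) and (11) for some $c > 0$ and some $q > 1$. If $aq + \varepsilon + 1 < 0$ and $a + \beta < 0$ hold with $\beta = \varepsilon(1 + 2/q)$, then, for every $C > D$, choosing $\delta$ sufficiently small, there is a unique $\varphi \in X_\beta$ such that
\[ F(m, n)\left(V_{n,\delta/C,\beta+\varepsilon}\right) \subset V_{m,\delta,\beta} \quad \text{for every } m \ge n, \tag{17} \]
with $F(m, n)$ given by (4) and $V_{n,\delta/C,\beta+\varepsilon}$ and $V_{m,\delta,\beta}$ given by (16). Furthermore, for every $m \ge n$ and $\xi, \bar\xi \in B_n(\delta n^{-(\beta+\varepsilon)}/C)$ we have
\[ \left\| F(m, n)(\xi, \varphi_n(\xi)) - F(m, n)(\bar\xi, \varphi_n(\bar\xi)) \right\| \le 2C (m-n+1)^{a} n^{\varepsilon} \|\xi - \bar\xi\|. \tag{18} \]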
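The hypotheses of Theorem 1 involve only the exponents $a$, $\varepsilon$ and $q$. The small helper below, whose name and return format are hypothetical, computes $\beta = \varepsilon(1 + 2/q)$ and evaluates the two conditions of the theorem, together with the quantity $\varepsilon(q+2) - \beta q$, which vanishes for this choice of $\beta$.

```python
def theorem1_exponents(a, eps, q):
    """Evaluate the exponent conditions of Theorem 1 for given a < 0, eps > 0, q > 1."""
    beta = eps * (1 + 2 / q)
    return {
        "beta": beta,
        "summable": a * q + eps + 1 < 0,      # condition a*q + eps + 1 < 0
        "decay": a + beta < 0,                # condition a + beta < 0
        "n_exponent": eps * (q + 2) - beta * q,  # equals 0 for this beta
    }


if __name__ == "__main__":
    print(theorem1_exponents(a=-2.0, eps=0.5, q=2.0))
    print(theorem1_exponents(a=-0.5, eps=0.5, q=2.0))  # both conditions fail here
```

This is only bookkeeping for the exponents; it says nothing about how small $\delta$ has to be.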
We call each $V_{n,\delta,\beta}$ a stable manifold. In view of the forward invariance in (17), each trajectory starting in $V_{n,\delta/C,\beta+\varepsilon}$ must be in $V_{m,\delta,\beta}$. Thus, for every $(\xi, \varphi_n(\xi)) \in V_{n,\delta/C,\beta+\varepsilon}$, using Eqs. (5) and (6), we have to prove that
\[ x_m(\xi) = A(m, n)\xi + \sum_{k=n}^{m-1} A(m, k+1)\, g_k\big(x_k(\xi), \varphi_k(x_k(\xi))\big), \tag{19} \]
\[ \varphi_m(x_m(\xi)) = A(m, n)\varphi_n(\xi) + \sum_{k=n}^{m-1} A(m, k+1)\, h_k\big(x_k(\xi), \varphi_k(x_k(\xi))\big), \tag{20} \]
and
\[ \|x_m(\xi)\| \le \delta m^{-\beta} \tag{21} \]
for every $\xi \in B_n(\delta n^{-(\beta+\varepsilon)}/C)$ and every $m > n$, where
\[ F(m, n)(\xi, \varphi_n(\xi)) = \big(x_m(\xi), \varphi_m(x_m(\xi))\big) \in E_m \times F_m. \]
The idea of the proof of Theorem 1 is to solve Eqs. (19) and (20) separately using in each case the Banach fixed point theorem. For this we first establish, using the Banach fixed point theorem on a suitable space $B$, that for every $\varphi \in X_\beta$ there is a unique sequence of functions $x^\varphi = (x_m)_{m \ge n} \in B$ that verifies (19) and (21). To prove Eq. (20), we first prove that this equation
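is equivalent to another one and we solve this second equation applying the Banach fixed point theorem in the space $X^*_\beta$. Let $B = B_{n,\beta}$ be the space of all sequences $x = (x_m)_{m \ge n}$ of functions $x_m\colon B_n(\delta n^{-\beta}) \to E_m$ such that for every $m \ge n$ and every $\xi, \bar\xi \in B_n(\delta n^{-\beta})$ we have
\[ x_n(\xi) = \xi, \qquad x_m(0) = 0, \tag{22} \]
\[ \|x_m(\xi) - x_m(\bar\xi)\| \le C (m-n+1)^{a} n^{\varepsilon} \|\xi - \bar\xi\|, \tag{23} \]
for some constant $C > D$. Making $\bar\xi = 0$ in (23) we obtain the following estimate:
\[ \|x_m(\xi)\| \le C (m-n+1)^{a} n^{\varepsilon} \|\xi\| \le C\delta (m-n+1)^{a} n^{\varepsilon-\beta} \tag{24} \]
for every $m \ge n$ and every $\xi \in B_n(\delta n^{-\beta})$. This space $B$ allows us to estimate the speed of decay of the stable component of the solution along the graphs given by $\varphi$. In fact, if $\xi \in B_n(\delta n^{-(\beta+\varepsilon)}/C)$, then (21) holds because $a + \beta < 0$:
\[ \|x_m(\xi)\| \le C (m-n+1)^{a} n^{\varepsilon} \|\xi\| \le \delta (m-n+1)^{a} n^{-\beta} \le \delta m^{-\beta}. \]
For every $x \in B$, we define
\[ \|x\| = \sup\left\{ \frac{\|x_m(\xi)\|}{(m-n+1)^{a} n^{\varepsilon}} : m \ge n,\ \xi \in B_n(\delta n^{-\beta}) \right\} \tag{25} \]
and with the metric induced by (25), $B$ is a complete metric space. Given $\varphi = (\varphi_n)_{n\in\mathbb{N}} \in X^*_\beta$ and $x = (x_m)_{m \ge n} \in B$ we write
\[ \varphi^*_m = \varphi_m \circ x_m \quad \text{and} \quad f^*_m(\xi) = f_m\big(x_m(\xi), \varphi^*_m(\xi)\big). \]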
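The last step in the estimate used above to obtain (21), namely $(m-n+1)^{a} n^{-\beta} \le m^{-\beta}$, relies on $a + \beta < 0$ together with the elementary fact $(m-n+1)\,n \ge m$. The sketch below checks this step on a finite grid; the function name and the grid sizes are hypothetical choices made for illustration.

```python
def decay_step_holds(a, beta, max_n=60, max_gap=200, tol=1e-12):
    """Check (m-n+1)**a * n**(-beta) <= m**(-beta) for 1 <= n <= max_n, n <= m <= n + max_gap."""
    for n in range(1, max_n + 1):
        for m in range(n, n + max_gap + 1):
            left = (m - n + 1) ** a * n ** (-beta)
            right = m ** (-beta)
            if left > right * (1 + tol):
                return False
    return True


if __name__ == "__main__":
    print(decay_step_holds(a=-1.0, beta=0.5))   # a + beta < 0: the step holds on the grid
    print(decay_step_holds(a=-0.2, beta=0.5))   # a + beta > 0: the step fails (n = 1, m = 2)
```

The second call shows that the inequality can fail once $a + \beta > 0$, which is why the hypothesis $a + \beta < 0$ appears in Theorem 1.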
Lemma 2. Given $\delta > 0$ sufficiently small, for each $\varphi \in X^*_\beta$ and $n \in \mathbb{N}$ there exists a unique sequence $x = x^\varphi \in B$ satisfying Eq. (19) for every $m \ge n$ and $\xi \in B_n(\delta n^{-\beta})$.
Proof. We define an operator $J$ in $B$ by $(Jx)_n(\xi) = \xi$, and for each $m > n$ by
\[ (Jx)_m(\xi) = A(m, n)\xi + \sum_{k=n}^{m-1} A(m, k+1)\, g_k\big(x_k(\xi), \varphi_k(x_k(\xi))\big). \]
One can easily verify from (22), (13) and (10) that $(Jx)_m(0) = 0$ for every $m \ge n$. From the definition of the operator it follows that
\[ \|(Jx)_m(\xi) - (Jx)_m(\bar\xi)\| \le \|A(m, n)P_n\xi - A(m, n)P_n\bar\xi\| + \sum_{k=n}^{m-1} \|A(m, k+1)P_{k+1}\| \cdot \|f^*_k(\xi) - f^*_k(\bar\xi)\|. \tag{26} \]
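Using (2) we have
\[ \|A(m, n)P_n\xi - A(m, n)P_n\bar\xi\| \le \|A(m, n)P_n\| \, \|\xi - \bar\xi\| \le D (m-n+1)^{a} n^{\varepsilon} \|\xi - \bar\xi\| \]
and from (2), (11), (15), (23) and (7) we also have
\[ \begin{aligned} &\sum_{k=n}^{m-1} \|A(m, k+1)P_{k+1}\| \cdot \|f^*_k(\xi) - f^*_k(\bar\xi)\| \\ &\quad\le cD \sum_{k=n}^{m-1} (m-k)^{a} (k+1)^{\varepsilon} \big( \|x_k(\xi) - x_k(\bar\xi)\| + \|\varphi_k(x_k(\xi)) - \varphi_k(x_k(\bar\xi))\| \big) \\ &\qquad\quad \times \big( \|x_k(\xi)\| + \|\varphi_k(x_k(\xi))\| + \|x_k(\bar\xi)\| + \|\varphi_k(x_k(\bar\xi))\| \big)^{q} \\ &\quad\le cD \sum_{k=n}^{m-1} (m-k)^{a} (k+1)^{\varepsilon} \, 3\|x_k(\xi) - x_k(\bar\xi)\| \big( 3\|x_k(\xi)\| + 3\|x_k(\bar\xi)\| \big)^{q} \\ &\quad\le c\, 2^{q} (3C)^{q+1} D \delta^{q} n^{\varepsilon(q+1)-\beta q} \|\xi - \bar\xi\| \sum_{k=n}^{m-1} (m-k)^{a} (k+1)^{\varepsilon} (k-n+1)^{aq+a} \\ &\quad\le 2^{q+\varepsilon-a} c (3C)^{q+1} D \delta^{q} \lambda_{aq+\varepsilon} (m-n+1)^{a} n^{\varepsilon(q+2)-\beta q} \|\xi - \bar\xi\|. \end{aligned} \]
Since $\beta = \varepsilon(1 + 2/q)$, the exponent $\varepsilon(q+2) - \beta q$ is zero. Choosing $\delta$ sufficiently small, it follows from (26) that
\[ \|(Jx)_m(\xi) - (Jx)_m(\bar\xi)\| \le C (m-n+1)^{a} n^{\varepsilon} \|\xi - \bar\xi\| \]
and this implies the inclusion $J(B) \subset B$. We now show that $J$ is a contraction for the metric induced by (25). Let $x, y \in B$. Then
\[ \|(Jx)_m(\xi) - (Jy)_m(\xi)\| \le \sum_{k=n}^{m-1} \|A(m, k+1)P_{k+1}\| \, \|\alpha_k\|, \]
where $\alpha_k = f_k\big(x_k(\xi), \varphi_k(x_k(\xi))\big) - f_k\big(y_k(\xi), \varphi_k(y_k(\xi))\big)$. By (11) and (15) we have
\[ \begin{aligned} \|\alpha_k\| &\le c \big( \|x_k(\xi) - y_k(\xi)\| + \|\varphi_k(x_k(\xi)) - \varphi_k(y_k(\xi))\| \big) \\ &\quad \times \big( \|x_k(\xi)\| + \|\varphi_k(x_k(\xi))\| + \|y_k(\xi)\| + \|\varphi_k(y_k(\xi))\| \big)^{q} \\ &\le 3^{q+1} c \|x_k(\xi) - y_k(\xi)\| \big( \|x_k(\xi)\| + \|y_k(\xi)\| \big)^{q} \end{aligned} \]
and, using (24), it follows that
\[ \begin{aligned} \|\alpha_k\| &\le 2^{q} 3^{q+1} c C^{q} \delta^{q} (k-n+1)^{aq} n^{(\varepsilon-\beta)q} \|x_k(\xi) - y_k(\xi)\| \\ &\le 2^{q} 3^{q+1} c C^{q} \delta^{q} (k-n+1)^{aq+a} n^{\varepsilon(q+1)-\beta q} \|x - y\|. \end{aligned} \tag{27} \]
Hence, from (2) and (27), we have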
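\[ \begin{aligned} \|(Jx)_m(\xi) - (Jy)_m(\xi)\| &\le 2^{q} 3^{q+1} c C^{q} D \delta^{q} n^{\varepsilon(q+1)-\beta q} \|x - y\| \sum_{k=n}^{m-1} (m-k)^{a} (k+1)^{\varepsilon} (k-n+1)^{aq+a} \\ &\le 2^{q-a+\varepsilon} 3^{q+1} c C^{q} D \delta^{q} \lambda_{aq+\varepsilon} (m-n+1)^{a} n^{\varepsilon(q+2)-\beta q} \|x - y\| \end{aligned} \]
and choosing $\delta$ sufficiently small it follows that
\[ \|(Jx)_m(\xi) - (Jy)_m(\xi)\| \le \mu (m-n+1)^{a} n^{\varepsilon} \|x - y\| \]
with $\mu < 1$. Therefore, $\|Jx - Jy\| \le \mu \|x - y\|$, and $J$ is a contraction in $B$ provided that $\delta$ is sufficiently small. Because $B$ is complete, by the Banach fixed point theorem, the map $J$ has a unique fixed point $x^\varphi$ in $B$, which is thus the desired sequence. □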
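We now represent by $(x^\varphi_{n,k})_{k \ge n} \in B_{n,\beta}$ the unique sequence given by Lemma 2.
Lemma 3. Given $\delta > 0$ sufficiently small and $\varphi \in X^*_\beta$ the following properties hold:
(1) If for every $n \in \mathbb{N}$, $m \ge n$ and $\xi \in B_n(\delta n^{-\beta})$ the identity (20) holds, then
\[ \varphi_n(\xi) = -\sum_{k=n}^{\infty} A(k+1, n)^{-1} h_k\big(x^\varphi_{n,k}(\xi), \varphi_k(x^\varphi_{n,k}(\xi))\big) \tag{28} \]
for every $n \in \mathbb{N}$, $m \ge n$ and $\xi \in B_n(\delta n^{-\beta})$.
(2) If for every $n \in \mathbb{N}$, $m \ge n$ and $\xi \in B_n(\delta n^{-\beta})$ Eq. (28) holds, then (20) holds for every $\xi \in B_n(\delta n^{-(\beta+\varepsilon)}/C)$.
Proof. First we prove that the series in (28) is convergent. From (3), (12), (15) and (24), we conclude that
\[ \begin{aligned} &\sum_{k=n}^{\infty} \left\| A(k+1, n)^{-1} h_k\big(x^\varphi_{n,k}(\xi), \varphi_k(x^\varphi_{n,k}(\xi))\big) \right\| \\ &\quad\le \sum_{k=n}^{\infty} \left\| A(k+1, n)^{-1} Q_{k+1} f_k\big(x^\varphi_{n,k}(\xi), \varphi_k(x^\varphi_{n,k}(\xi))\big) \right\| \\ &\quad\le \sum_{k=n}^{\infty} D (k-n+2)^{-b} (k+1)^{\varepsilon} c \big( \|x^\varphi_{n,k}(\xi)\| + \|\varphi_k(x^\varphi_{n,k}(\xi))\| \big)^{q+1} \\ &\quad\le cD \sum_{k=n}^{\infty} (k-n+2)^{-b} (k+1)^{\varepsilon} \big( 3C\delta (k-n+1)^{a} n^{\varepsilon-\beta} \big)^{q+1} \\ &\quad\le c (3C\delta)^{q+1} D n^{(\varepsilon-\beta)(q+1)} \sum_{k=n}^{\infty} (k-n+1)^{aq+a-b} (k+1)^{\varepsilon} \\ &\quad\le 2^{\varepsilon} c (3C\delta)^{q+1} D n^{\varepsilon(q+2)-\beta(q+1)} \sum_{k=n}^{\infty} (k-n+1)^{aq+a-b+\varepsilon} < \infty. \end{aligned} \]
Let us suppose that (20) holds. Then by $A(m, n)^{-1} A(m, k+1) = A(k+1, n)^{-1}$, Eq. (20) can be written in the following equivalent form:
\[ \varphi_n(\xi) = A(m, n)^{-1} \varphi_m\big(x^\varphi_{n,m}(\xi)\big) - \sum_{k=n}^{m-1} A(k+1, n)^{-1} h_k\big(x^\varphi_{n,k}(\xi), \varphi_k(x^\varphi_{n,k}(\xi))\big). \tag{29} \]
Using (3), (15) and (24), we have
\[ \begin{aligned} \left\| A(m, n)^{-1} \varphi_m\big(x^\varphi_{n,m}(\xi)\big) \right\| &= \left\| A(m, n)^{-1} Q_m \varphi_m\big(x^\varphi_{n,m}(\xi)\big) \right\| \\ &\le 2D (m-n+1)^{-b} m^{\varepsilon} \|x^\varphi_{n,m}(\xi)\| \\ &\le 2D (m-n+1)^{-b} m^{\varepsilon} C\delta (m-n+1)^{a} n^{\varepsilon-\beta} \\ &\le 2CD\delta (m-n+1)^{a-b+\varepsilon} n^{2\varepsilon-\beta} \end{aligned} \]
and this converges to zero when $m \to \infty$. Hence, letting $m \to \infty$ in (29) we obtain the identity (28). We now assume that for every $n \in \mathbb{N}$, $m \ge n$ and $\xi \in B_n(\delta n^{-\beta})$ the identity (28) holds. If $\xi \in B_n(\delta n^{-(\beta+\varepsilon)}/C)$ then, since $a + \beta < 0$, we get
\[ \|x^\varphi_{n,m}(\xi)\| \le C (m-n+1)^{a} n^{\varepsilon} \|\xi\| \le \delta (m-n+1)^{a} n^{-\beta} \le \delta m^{-\beta}. \tag{30} \]
Then
\[ A(m, n)\varphi_n(\xi) = -\sum_{k=n}^{\infty} A(m, k+1)\, h_k\big(x^\varphi_{n,k}(\xi), \varphi_k(x^\varphi_{n,k}(\xi))\big), \]
and thus it follows from (28), the uniqueness of the sequences $x^\varphi$ and (30) that
\[ \begin{aligned} &A(m, n)\varphi_n(\xi) + \sum_{k=n}^{m-1} A(m, k+1)\, h_k\big(x^\varphi_{n,k}(\xi), \varphi_k(x^\varphi_{n,k}(\xi))\big) \\ &\quad= -\sum_{k=m}^{\infty} A(m, k+1)\, h_k\big(x^\varphi_{n,k}(\xi), \varphi_k(x^\varphi_{n,k}(\xi))\big) \\ &\quad= -\sum_{k=m}^{\infty} A(m, k+1)\, h_k\big(x^\varphi_{m,k}(x^\varphi_{n,m}(\xi)), \varphi_k(x^\varphi_{m,k}(x^\varphi_{n,m}(\xi)))\big) \\ &\quad= \varphi_m\big(x^\varphi_{n,m}(\xi)\big). \end{aligned} \]
This proves the lemma. □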
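Lemma 4. Given $\delta > 0$ sufficiently small, for each $\varphi, \psi \in X^*_\beta$, $n \in \mathbb{N}$, $m \ge n$ and $\xi \in B_n(\delta n^{-\beta})$ we have
\[ \|x^\varphi_m(\xi) - x^\psi_m(\xi)\| \le \frac{C}{2} (m-n+1)^{a} \|\xi\| \, \|\varphi - \psi\|. \tag{31} \]
Proof. Putting
\[ \gamma_k = f_k\big(x^\varphi_k(\xi), \varphi_k(x^\varphi_k(\xi))\big) - f_k\big(x^\psi_k(\xi), \psi_k(x^\psi_k(\xi))\big), \tag{32} \]
by (11) it follows that
\[ \begin{aligned} \|\gamma_k\| &\le c \big( \|x^\varphi_k(\xi) - x^\psi_k(\xi)\| + \|\varphi_k(x^\varphi_k(\xi)) - \psi_k(x^\psi_k(\xi))\| \big) \\ &\quad \times \big( \|x^\varphi_k(\xi)\| + \|\varphi_k(x^\varphi_k(\xi))\| + \|x^\psi_k(\xi)\| + \|\psi_k(x^\psi_k(\xi))\| \big)^{q} \\ &\le c \big( \|x^\varphi_k(\xi) - x^\psi_k(\xi)\| + \|\varphi_k(x^\varphi_k(\xi)) - \psi_k(x^\psi_k(\xi))\| \big) \big( 3\|x^\varphi_k(\xi)\| + 3\|x^\psi_k(\xi)\| \big)^{q} \\ &\le c (6C\delta)^{q} \big( \|x^\varphi_k(\xi) - x^\psi_k(\xi)\| + \|\varphi_k(x^\varphi_k(\xi)) - \psi_k(x^\psi_k(\xi))\| \big) (k-n+1)^{aq} n^{(\varepsilon-\beta)q} \end{aligned} \]
and because
\[ \begin{aligned} \|\varphi_k(x^\varphi_k(\xi)) - \psi_k(x^\psi_k(\xi))\| &\le \|\varphi_k(x^\varphi_k(\xi)) - \varphi_k(x^\psi_k(\xi))\| + \|\varphi_k(x^\psi_k(\xi)) - \psi_k(x^\psi_k(\xi))\| \\ &\le \|x^\varphi_k(\xi) - x^\psi_k(\xi)\| + \|\varphi_k(x^\psi_k(\xi)) - \psi_k(x^\psi_k(\xi))\| \\ &\le \|x^\varphi_k(\xi) - x^\psi_k(\xi)\| + \|\varphi - \psi\| \, \|x^\psi_k(\xi)\| \end{aligned} \]
we have
\[ \|\gamma_k\| \le c (6C\delta)^{q} \big( 2\|x^\varphi_k(\xi) - x^\psi_k(\xi)\| + \|\varphi - \psi\| \, \|x^\psi_k(\xi)\| \big) (k-n+1)^{aq} n^{(\varepsilon-\beta)q}. \tag{33} \]
Using (33), we are going to prove this lemma by induction on $m$. For $m = n$ the result follows immediately because $x^\varphi_n(\xi) = \xi = x^\psi_n(\xi)$. Suppose that (31) is true for $n, \ldots, m-1$. Then for $k = n, \ldots, m-1$ we have
\[ \begin{aligned} 2\|x^\varphi_k(\xi) - x^\psi_k(\xi)\| + \|\varphi - \psi\| \, \|x^\psi_k(\xi)\| &\le C (k-n+1)^{a} \|\xi\| \, \|\varphi - \psi\| + C (k-n+1)^{a} n^{\varepsilon} \|\xi\| \, \|\varphi - \psi\| \\ &\le 2C (k-n+1)^{a} n^{\varepsilon} \|\xi\| \, \|\varphi - \psi\| \end{aligned} \]
and this implies by (33) that
\[ \|\gamma_k\| \le 2 c C^{q+1} (6\delta)^{q} \|\xi\| \, \|\varphi - \psi\| (k-n+1)^{aq+a} n^{\varepsilon(q+1)-\beta q}. \]
Hence we have
\[ \begin{aligned} \|x^\varphi_m(\xi) - x^\psi_m(\xi)\| &\le \sum_{k=n}^{m-1} \|A(m, k+1)P_{k+1}\| \, \|\gamma_k\| \\ &\le 2 c C^{q+1} D (6\delta)^{q} n^{\varepsilon(q+1)-\beta q} \|\xi\| \, \|\varphi - \psi\| \sum_{k=n}^{m-1} (m-k)^{a} (k+1)^{\varepsilon} (k-n+1)^{aq+a} \\ &\le 2^{1+\varepsilon-a} c C^{q+1} D (6\delta)^{q} \lambda_{aq+\varepsilon} (m-n+1)^{a} n^{\varepsilon(q+2)-\beta q} \|\varphi - \psi\| \, \|\xi\|. \end{aligned} \]
Choosing $\delta > 0$ sufficiently small and using the fact that $\varepsilon(q+2) - \beta q = 0$ we obtain (31). □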
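Lemma 5. Given $\delta > 0$ sufficiently small there is a unique $\varphi \in X^*_\beta$ such that
\[ \varphi_n(\xi) = -\sum_{k=n}^{\infty} A(k+1, n)^{-1} h_k\big(x^\varphi_k(\xi), \varphi_k(x^\varphi_k(\xi))\big) \]
for every $n \in \mathbb{N}$ and every $\xi \in B_n(\delta n^{-\beta})$.
Proof. We consider the operator $\Phi$ defined for each $\varphi \in X^*_\beta$ by
\[ (\Phi\varphi)_n(\xi) = -\sum_{k=n}^{\infty} A(k+1, n)^{-1} h_k\big(x^\varphi_k(\xi), \varphi_k(x^\varphi_k(\xi))\big), \tag{34} \]
where $x^\varphi = (x^\varphi_k)_{k \ge n}$ is the unique sequence given by Lemma 2. Since $x^\varphi \in B$ we have $x^\varphi_m(0) = 0$ for $m \ge n$. It follows from (10), (22), (13) and (34) that $(\Phi\varphi)_n(0) = 0$ for each $n \in \mathbb{N}$. Furthermore, given $\xi, \bar\xi \in B_n(\delta n^{-\beta})$, by (3) and (11), we have
\[ \begin{aligned} \|(\Phi\varphi)_n(\xi) - (\Phi\varphi)_n(\bar\xi)\| &\le \sum_{k=n}^{\infty} \|A(k+1, n)^{-1} Q_{k+1}\| \cdot \|f^*_k(\xi) - f^*_k(\bar\xi)\| \\ &\le cD \sum_{k=n}^{\infty} (k-n+2)^{-b} (k+1)^{\varepsilon} \, 3\|x_k(\xi) - x_k(\bar\xi)\| \big( 3\|x_k(\xi)\| + 3\|x_k(\bar\xi)\| \big)^{q} \\ &\le 3 c C^{q+1} D (6\delta)^{q} n^{\varepsilon(q+1)-\beta q} \|\xi - \bar\xi\| \sum_{k=n}^{\infty} (k-n+2)^{-b} (k+1)^{\varepsilon} (k-n+1)^{aq+a} \\ &\le 2^{q+\varepsilon} c (3C)^{q+1} D \delta^{q} n^{\varepsilon(q+2)-\beta q} \|\xi - \bar\xi\| \sum_{k=n}^{\infty} (k-n+1)^{aq+a-b+\varepsilon} \\ &\le 2^{\varepsilon}\, 3 c C^{q+1} D (6\delta)^{q} \lambda_{aq+a-b+\varepsilon} \|\xi - \bar\xi\|. \end{aligned} \]
Hence, choosing $\delta > 0$ sufficiently small (independently of $\varphi$, $n$ and $\xi$) we have (14). Therefore $\Phi(X^*_\beta) \subset X^*_\beta$.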
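We now show that $\Phi$ is a contraction. Given $\varphi, \psi \in X^*_\beta$ and $n \in \mathbb{N}$, let $x^\varphi$ and $x^\psi$ be the unique sequences given by Lemma 2 respectively for $\varphi$ and $\psi$. By (33) (see (32) for the definition of $\gamma_k$) and Lemma 4 we have
\[ \|\gamma_k\| \le 2 c C^{q+1} (6\delta)^{q} \|\xi\| \, \|\varphi - \psi\| (k-n+1)^{aq+a} n^{\varepsilon(q+1)-\beta q} \]
and this inequality and (3) imply
\[ \begin{aligned} \|(\Phi\varphi)_n(\xi) - (\Phi\psi)_n(\xi)\| &\le \sum_{k=n}^{\infty} \|A(k+1, n)^{-1} Q_{k+1}\| \, \|\gamma_k\| \\ &\le 2 c C^{q+1} D (6\delta)^{q} \|\xi\| \, \|\varphi - \psi\| \, n^{\varepsilon(q+1)-\beta q} \sum_{k=n}^{\infty} (k-n+2)^{-b} (k+1)^{\varepsilon} (k-n+1)^{aq+a} \\ &\le 2^{1+\varepsilon} c C^{q+1} D (6\delta)^{q} \|\xi\| \, \|\varphi - \psi\| \, n^{\varepsilon(q+2)-\beta q} \sum_{k=n}^{\infty} (k-n+1)^{aq+a-b+\varepsilon} \\ &\le 2^{1+\varepsilon} c C^{q+1} D (6\delta)^{q} \|\xi\| \, \lambda_{aq+a-b+\varepsilon} \|\varphi - \psi\|. \end{aligned} \]
Hence, choosing $\delta > 0$ sufficiently small, we have
\[ \|(\Phi\varphi)_n(\xi) - (\Phi\psi)_n(\xi)\| \le \mu \|\xi\| \, \|\varphi - \psi\| \]
for some $\mu < 1$. This implies that $\|\Phi\varphi - \Phi\psi\| \le \mu \|\varphi - \psi\|$ and this means that $\Phi$ is a contraction in $X^*_\beta$. Therefore the map $\Phi$ has a unique fixed point $\varphi$ in $X^*_\beta$ that is the desired sequence. □
We are now in a position to prove Theorem 1.
Proof of Theorem 1. By Lemma 2, for each $\varphi \in X^*_\beta$ there is a unique sequence $x^\varphi \in B$ satisfying (19). It remains to solve (20) with $x = x^\varphi$. By Lemma 3, this is equivalent to solving (28). Finally, by Lemma 5, there is a unique solution of (28). This establishes the existence of the stable manifolds for $\delta > 0$ sufficiently small. Moreover, for each $n \in \mathbb{N}$, $m \ge n$ and $\xi, \bar\xi \in B_n(\delta n^{-(\beta+\varepsilon)}/C)$ it follows from (14) that
\[ \begin{aligned} \left\| F(m, n)(\xi, \varphi_n(\xi)) - F(m, n)(\bar\xi, \varphi_n(\bar\xi)) \right\| &\le \|x_m(\xi) - x_m(\bar\xi)\| + \|\varphi^*_m(\xi) - \varphi^*_m(\bar\xi)\| \\ &\le 2 \|x_m(\xi) - x_m(\bar\xi)\| \\ &\le 2C (m-n+1)^{a} n^{\varepsilon} \|\xi - \bar\xi\|. \end{aligned} \]
Hence we obtain (18) and the theorem is proved. □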
4. Global stable manifolds
We say that a function is of class $C^{1,1}$ if it is of class $C^1$ and its derivative is Lipschitz. We will consider that the perturbations $f_n$ in (4) are of class $C^{1,1}$ and we assume that there exists $\delta > 0$ such that, for every $n \in \mathbb{N}$ and $u, v \in X$,
\[ f_n(0) = 0, \qquad d_0 f_n = 0, \tag{35} \]
\[ \|d_u f_n\| \le \delta n^{-3\varepsilon-1}, \tag{36} \]
\[ \|d_u f_n - d_v f_n\| \le \delta n^{-3\varepsilon-1} \|u - v\|, \tag{37} \]
with the same $\varepsilon$ as in (2) and (3). In this section, for technical reasons, we have to assume that $\varepsilon > 0$. It is easy to verify that all the results in this section remain true in the case $\varepsilon = 0$ if we replace the exponent in (36) and (37) by $-1 - \gamma$ with $\gamma > 0$. Let $X$ be the space of sequences of $C^{1,1}$ functions $\varphi_n\colon E_n \to F_n$ such that, for every $n \in \mathbb{N}$ and $x, y \in E_n$,
\[ \varphi_n(0) = 0, \qquad d_0 \varphi_n = 0, \tag{38} \]
\[ \|d_x \varphi_n\| \le 1, \tag{39} \]
\[ \|d_x \varphi_n - d_y \varphi_n\| \le \|x - y\|. \tag{40} \]
Given $(\varphi_n)_{n\in\mathbb{N}} \in X$ we consider the graphs
\[ V_n = \left\{ (\xi, \varphi_n(\xi)) : \xi \in E_n \right\} \tag{41} \]
that we call global stable manifolds. We have the following result.
Theorem 2 (Global stable manifolds). Let $(A_n)_{n\in\mathbb{N}}$ be a sequence of bounded invertible linear operators, acting on a Banach space $X$, that admits a nonuniform polynomial dichotomy satisfying (2) and (3) for some $D \ge 1$, $a < 0 \le b$ and $\varepsilon > 0$, and let $(f_n)_{n\in\mathbb{N}}$ be a sequence of functions, acting on $X$, that verify (35), (36) and (37) for some $\delta > 0$. If
\[ a + \varepsilon < b, \tag{42} \]
then, choosing $\delta > 0$ sufficiently small, there exists a unique sequence $(\varphi_n)_{n\in\mathbb{N}} \in X$ such that
(a) $F(m, n)(V_n) = V_m$ for every $m \ge n$, where $F(m, n)$ is given by (4) and $V_n$ and $V_m$ are given by (41);
(b) $V_n$ is a $C^{1,1}$ manifold with $T_0 V_n = E_n$ for each $n \in \mathbb{N}$;
(c) there exists $K > 0$ such that for $m \ge n$ and $\xi, \bar\xi \in E_n$ we have
\[ \|F(m, n)(v) - F(m, n)(\bar v)\| \le K (m-n+1)^{a} n^{\varepsilon} \|\xi - \bar\xi\|, \tag{43} \]
\[ \|d_v F(m, n) - d_{\bar v} F(m, n)\| \le K (m-n+1)^{a} n^{2\varepsilon} \|\xi - \bar\xi\|, \tag{44} \]
where $v = (\xi, \varphi_n(\xi))$ and $\bar v = (\bar\xi, \varphi_n(\bar\xi))$.
The proof of this theorem follows the same lines as the proof of Theorem 1, although with different spaces of sequences of functions. As in Theorem 1, we have to solve the following equations:
\[ x_m(\xi) = A(m, n)\xi + \sum_{k=n}^{m-1} A(m, k+1)\, g_k\big(x_k(\xi), \varphi_k(x_k(\xi))\big), \tag{45} \]
\[ \varphi_m(x_m(\xi)) = A(m, n)\varphi_n(\xi) + \sum_{k=n}^{m-1} A(m, k+1)\, h_k\big(x_k(\xi), \varphi_k(x_k(\xi))\big), \tag{46} \]
for every $\xi \in E_n$ and every $m > n$, where
\[ F(m, n)(\xi, \varphi_n(\xi)) = \big(x_m(\xi), \varphi_m(x_m(\xi))\big) \in E_m \times F_m, \]
and prove that these solutions verify (43) and (44). Given $n \in \mathbb{N}$ and $C > D$ (note that $C > 1$), let $B = B_n$ be the space of sequences of $C^{1,1}$ functions $x_m\colon E_n \to E_m$ that, for every $m \ge n$ and $\xi, \bar\xi \in E_n$, satisfy the following conditions:
\[ x_n(\xi) = \xi, \qquad x_m(0) = 0, \tag{47} \]
\[ \|d_\xi x_m\| \le C (m-n+1)^{a} n^{\varepsilon}, \tag{48} \]
\[ \|d_\xi x_m - d_{\bar\xi} x_m\| \le C (m-n+1)^{a} n^{2\varepsilon} \|\xi - \bar\xi\|. \tag{49} \]
From the mean value theorem, (47) and (48) it follows that
\[ \|x_m(\xi)\| \le C (m-n+1)^{a} n^{\varepsilon} \|\xi\| \tag{50} \]
for every $\xi \in E_n$. This allows us to equip $B$ with the metric induced by
\[ \|x\| = \sup\left\{ \frac{\|x_m(\xi)\| (m-n+1)^{-a} n^{-\varepsilon}}{\|\xi\|} : m \ge n,\ \xi \in E_n \setminus \{0\} \right\} \tag{51} \]
for $x = (x_m)_{m \ge n} \in B$. Note that $\|x\| \le C$ for each $x \in B$.
Proposition 1. The space $B$ is a complete metric space with the metric induced by (51).
Proof. Let $(x_k)_k = ((x_{m,k})_{m \ge n})_k$ be a Cauchy sequence in $B$ with respect to the metric induced by (51). Then for each $m \ge n$ the sequence $(x_{m,k}|_{B_n(r)})_{k\in\mathbb{N}}$ is a Cauchy sequence with respect to the supremum norm in the space of bounded functions from $B_n(r)$ into $E_m$. Here $B_n(r)$ is the open ball of $E_n$ centered at $0$ and with radius $r$. Therefore, for each $m \ge n$, there exists a function $y_m\colon B_n(r) \to E_m$ such that $(x_{m,k}|_{B_n(r)})_{k\in\mathbb{N}}$ converges to $y_m$ in the space of bounded functions from $B_n(r)$ into $E_m$ equipped with the supremum norm. For every $\xi \in B_n(r)$, from (50), (48) and (49) we get
\[ \|x_{m,k}(\xi)\| \le C (m-n+1)^{a} n^{\varepsilon} r, \]
\[ \|d_\xi x_{m,k}\| \le C (m-n+1)^{a} n^{\varepsilon}, \]
\[ \|d_\xi x_{m,k} - d_{\bar\xi} x_{m,k}\| \le C (m-n+1)^{a} n^{2\varepsilon} \|\xi - \bar\xi\|. \]
Putting $b = C (m-n+1)^{a} n^{\varepsilon} \max\{n^{\varepsilon}, r\}$, it follows that, for each $m \ge n$ and $k \in \mathbb{N}$, $x_{m,k} \in C_b^{1,1}(B_n(r), E_m)$. Here $C_b^{1,1}(B_n(r), E_m)$ is the space of $C^{1,1}$ functions from $B_n(r)$ into $E_m$ such that the norm defined by
\[ \|u\|_{1,1} = \max\left\{ \|u\|_\infty, \|du\|_\infty, L(du) \right\} \]
is less than or equal to $b$, where $\|\cdot\|_\infty$ is the supremum norm and
\[ L(u) = \sup\left\{ \frac{\|u(x) - u(y)\|}{\|x - y\|} : x, y \in B_n(r) \text{ with } x \ne y \right\}. \]
It follows from the generalization of Henry’s lemma (see [15, p. 151]) given by Elbialy [14] (for related results see also [16] and [13]) that $y_m \in C_b^{1,1}(B_n(r), E_m)$ and
\[ (d_\xi x_{m,k})_{k\in\mathbb{N}} \text{ converges pointwise to } d_\xi y_m \text{ when } k \to \infty \tag{52} \]
for every $\xi \in B_n(r)$. By the uniqueness of each function $y_m$ in the ball $B_n(r)$, we can obtain a function $\bar y_m \in C^{1,1}(E_n, E_m)$ such that $\bar y_m|_{B_n(r)} = y_m$ for each $r > 0$. Using (52) we can easily verify that $(\bar y_m)_m \in B$. Moreover, since $(x_k)_k$ is a Cauchy sequence, for each $\kappa > 0$ there exists $p \in \mathbb{N}$ such that if $k, m > p$ then
\[ \|x_{q,k}(\xi) - x_{q,m}(\xi)\| \le \kappa (q-n+1)^{a} n^{\varepsilon} \|\xi\| \tag{53} \]
for every $q \ge n$ and $\xi \in E_n$. Letting $m \to \infty$ in (53) we obtain
\[ \|x_{q,k}(\xi) - \bar y_q(\xi)\| \le \kappa (q-n+1)^{a} n^{\varepsilon} \|\xi\|, \]
and thus $(x_k)_k$ converges to $\bar y = (\bar y_q)_{q \ge n}$ in the space $B$. □
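Given $\varphi = (\varphi_n)_{n\in\mathbb{N}} \in X$ and $x = (x_m)_{m \ge n} \in B$ we write
\[ \varphi^*_m = \varphi_m \circ x_m \quad \text{and} \quad f^*_m(\xi) = f_m\big(x_m(\xi), \varphi^*_m(\xi)\big). \]
Lemma 6. For each $\varphi \in X$, $n \in \mathbb{N}$, $x \in B$, $\xi, \bar\xi \in E_n$, and $m \ge n$ we have
\[ \|x_m(\xi) - x_m(\bar\xi)\| \le C (m-n+1)^{a} n^{\varepsilon} \|\xi - \bar\xi\|, \tag{54} \]
\[ \|d_\xi \varphi^*_m\| \le C (m-n+1)^{a} n^{\varepsilon}, \tag{55} \]
\[ \|d_\xi f^*_m\| \le 2C\delta m^{-3\varepsilon-1} (m-n+1)^{a} n^{\varepsilon}, \tag{56} \]
\[ \|\varphi^*_m(\xi) - \varphi^*_m(\bar\xi)\| \le C (m-n+1)^{a} n^{\varepsilon} \|\xi - \bar\xi\|, \tag{57} \]
\[ \|d_\xi \varphi^*_m - d_{\bar\xi} \varphi^*_m\| \le 2C^{2} (m-n+1)^{a} n^{2\varepsilon} \|\xi - \bar\xi\|, \tag{58} \]
\[ \|d_\xi f^*_m - d_{\bar\xi} f^*_m\| \le 7C^{2} \delta m^{-3\varepsilon-1} (m-n+1)^{a} n^{2\varepsilon} \|\xi - \bar\xi\|. \tag{59} \]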
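Proof. By the mean value theorem and (48) we have
\[ \|x_m(\xi) - x_m(\bar\xi)\| \le \sup_{r\in[0,1]} \|d_{\xi + r(\bar\xi - \xi)} x_m\| \cdot \|\xi - \bar\xi\| \le C (m-n+1)^{a} n^{\varepsilon} \|\xi - \bar\xi\| \]
and this establishes (54). Eq. (55) follows immediately from (39) and (48). In fact
\[ \|d_\xi \varphi^*_m\| \le \|d_{x_m(\xi)} \varphi_m\| \cdot \|d_\xi x_m\| \le C (m-n+1)^{a} n^{\varepsilon}. \]
By (36), (48) and (55) we have
\[ \begin{aligned} \|d_\xi f^*_m\| &\le \left\| \partial^{1,0}_{(x_m(\xi), \varphi^*_m(\xi))} f_m \right\| \cdot \|d_\xi x_m\| + \left\| \partial^{0,1}_{(x_m(\xi), \varphi^*_m(\xi))} f_m \right\| \cdot \|d_\xi \varphi^*_m\| \\ &\le 2C\delta m^{-3\varepsilon-1} (m-n+1)^{a} n^{\varepsilon} \end{aligned} \]
and this proves (56). Using the mean value theorem and (55) we obtain (57). To prove (58) we note that by (40) and (54) we get
\[ \|d_{x_m(\xi)} \varphi_m - d_{x_m(\bar\xi)} \varphi_m\| \le \|x_m(\xi) - x_m(\bar\xi)\| \le C (m-n+1)^{a} n^{\varepsilon} \|\xi - \bar\xi\|, \]
and, since $C > 1$, it follows from (48), (39) and (49) that
\[ \begin{aligned} \|d_\xi \varphi^*_m - d_{\bar\xi} \varphi^*_m\| &\le \|d_{x_m(\xi)} \varphi_m - d_{x_m(\bar\xi)} \varphi_m\| \cdot \|d_\xi x_m\| + \|d_{x_m(\bar\xi)} \varphi_m\| \cdot \|d_\xi x_m - d_{\bar\xi} x_m\| \\ &\le C^{2} (m-n+1)^{2a} n^{2\varepsilon} \|\xi - \bar\xi\| + C (m-n+1)^{a} n^{2\varepsilon} \|\xi - \bar\xi\| \\ &\le 2C^{2} (m-n+1)^{a} n^{2\varepsilon} \|\xi - \bar\xi\|. \end{aligned} \]
Finally we are going to prove (59). Writing
\[ F_{\lambda,\mu} = \partial^{\lambda,\mu}_{(x_m(\xi), \varphi^*_m(\xi))} f_m - \partial^{\lambda,\mu}_{(x_m(\bar\xi), \varphi^*_m(\bar\xi))} f_m \]
where $\lambda, \mu \in \{0, 1\}$ and $\lambda + \mu = 1$, we have
\[ \begin{aligned} \|d_\xi f^*_m - d_{\bar\xi} f^*_m\| &\le \|F_{1,0}\| \, \|d_\xi x_m\| + \|F_{0,1}\| \, \|d_\xi \varphi^*_m\| + \left\| \partial^{1,0}_{(x_m(\bar\xi), \varphi^*_m(\bar\xi))} f_m \right\| \cdot \|d_\xi x_m - d_{\bar\xi} x_m\| \\ &\quad + \left\| \partial^{0,1}_{(x_m(\bar\xi), \varphi^*_m(\bar\xi))} f_m \right\| \cdot \|d_\xi \varphi^*_m - d_{\bar\xi} \varphi^*_m\|. \end{aligned} \tag{60} \]
Using (37), (54) and (57) we obtain
\[ \begin{aligned} \|F_{\lambda,\mu}\| &\le \delta m^{-3\varepsilon-1} \|x_m(\xi) - x_m(\bar\xi)\| + \delta m^{-3\varepsilon-1} \|\varphi^*_m(\xi) - \varphi^*_m(\bar\xi)\| \\ &\le 2C\delta m^{-3\varepsilon-1} (m-n+1)^{a} n^{\varepsilon} \|\xi - \bar\xi\|. \end{aligned} \]
By the former inequalities, (48), (55), (36), (49) and (58) it follows from (60) that
\[ \begin{aligned} \|d_\xi f^*_m - d_{\bar\xi} f^*_m\| &\le 4C^{2} \delta m^{-3\varepsilon-1} (m-n+1)^{2a} n^{2\varepsilon} \|\xi - \bar\xi\| \\ &\quad + \left(C + 2C^{2}\right) \delta m^{-3\varepsilon-1} (m-n+1)^{a} n^{2\varepsilon} \|\xi - \bar\xi\|. \end{aligned} \]
Since $C > 1$ and $a < 0$, this yields (59) and the lemma is proved. □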
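Lemma 7. Given $\delta > 0$ sufficiently small, for each $\varphi \in X$ and $n \in \mathbb{N}$ there exists a unique sequence $x = x^\varphi \in B$ satisfying (45) for every $m > n$ and $\xi \in E_n$.
Proof. We consider an operator $J$ in $B$ defined by $(Jx)_n(\xi) = \xi$ and by
\[ (Jx)_m(\xi) = A(m, n)\xi + \sum_{k=n}^{m-1} A(m, k+1)\, g_k\big(x_k(\xi), \varphi_k(x_k(\xi))\big) \]
for $m > n$. For every $m \ge n$, one can easily verify that $(Jx)_m$ is a function of class $C^1$ and, from (47), (38) and (35), that $(Jx)_m(0) = 0$. From
\[ d_\xi (Jx)_m = A(m, n)P_n + \sum_{k=n}^{m-1} A(m, k+1)P_{k+1}\, d_\xi f^*_k \]
it follows from (2), (56) and (8) that
\[ \begin{aligned} \|d_\xi (Jx)_m\| &\le \|A(m, n)P_n\| + \sum_{k=n}^{m-1} \|A(m, k+1)P_{k+1}\| \, \|d_\xi f^*_k\| \\ &\le D (m-n+1)^{a} n^{\varepsilon} + \sum_{k=n}^{m-1} D (m-k)^{a} (k+1)^{\varepsilon} \, 2C\delta (k-n+1)^{a} n^{\varepsilon} k^{-3\varepsilon-1} \\ &\le D (m-n+1)^{a} n^{\varepsilon} + 2CD\delta n^{\varepsilon} \sum_{k=n}^{m-1} (m-k)^{a} (k-n+1)^{a} (k+1)^{\varepsilon} k^{-3\varepsilon-1} \\ &\le D (m-n+1)^{a} n^{\varepsilon} + 2^{1+\varepsilon-a} CD\delta \lambda_{-2\varepsilon-1} (m-n+1)^{a} n^{\varepsilon}. \end{aligned} \]
Choosing $\delta$ sufficiently small (independently of $\varphi$, $x$, $n$, $m$ and $\xi$) and since $C > D$ we have
\[ \|d_\xi (Jx)_m\| \le C (m-n+1)^{a} n^{\varepsilon}. \tag{61} \]
Proceeding in a similar manner, by (2), (59) and (8) it follows that
\[ \begin{aligned} \|d_\xi (Jx)_m - d_{\bar\xi} (Jx)_m\| &\le \sum_{k=n}^{m-1} \|A(m, k+1)P_{k+1}\| \cdot \|d_\xi f^*_k - d_{\bar\xi} f^*_k\| \\ &\le \sum_{k=n}^{m-1} D (m-k)^{a} (k+1)^{\varepsilon} \, 7C^{2} \delta k^{-3\varepsilon-1} (k-n+1)^{a} n^{2\varepsilon} \|\xi - \bar\xi\| \\ &\le 7 \cdot 2^{\varepsilon-a} C^{2} D\delta \lambda_{-2\varepsilon-1} (m-n+1)^{a} n^{2\varepsilon} \|\xi - \bar\xi\| \end{aligned} \]
and choosing $\delta$ sufficiently small (independently of $\varphi$, $x$, $n$, $m$, $\xi$ and $\bar\xi$) we have
\[ \|d_\xi (Jx)_m - d_{\bar\xi} (Jx)_m\| \le C (m-n+1)^{a} n^{2\varepsilon} \|\xi - \bar\xi\|. \tag{62} \]
Therefore from (61) and (62) we conclude that $J(B) \subset B$. We now show that $J$ is a contraction in $B$ with the metric induced by (51). Let $x, y \in B$. By (36), the mean value theorem and (39), for $k \ge n$ we have
\[ \begin{aligned} \|\alpha_k\| &:= \left\| f_k\big(x_k(\xi), \varphi_k(x_k(\xi))\big) - f_k\big(y_k(\xi), \varphi_k(y_k(\xi))\big) \right\| \\ &\le \delta k^{-3\varepsilon-1} \big( \|x_k(\xi) - y_k(\xi)\| + \|\varphi_k(x_k(\xi)) - \varphi_k(y_k(\xi))\| \big) \\ &\le 2\delta k^{-3\varepsilon-1} \|x_k(\xi) - y_k(\xi)\| \\ &\le 2\delta k^{-3\varepsilon-1} (k-n+1)^{a} n^{\varepsilon} \|\xi\| \, \|x - y\|. \end{aligned} \]
Using (2), the former inequality and (8) we obtain
\[ \begin{aligned} \|(Jx)_m(\xi) - (Jy)_m(\xi)\| &\le \sum_{k=n}^{m-1} \|A(m, k+1)P_{k+1}\| \, \|\alpha_k\| \\ &\le \sum_{k=n}^{m-1} D (m-k)^{a} (k+1)^{\varepsilon} \, 2\delta k^{-3\varepsilon-1} (k-n+1)^{a} n^{\varepsilon} \|\xi\| \, \|x - y\| \\ &\le \delta\theta (m-n+1)^{a} n^{\varepsilon} \|\xi\| \, \|x - y\| \end{aligned} \]
with $\theta = 2^{1+\varepsilon-a} D \lambda_{-2\varepsilon-1}$. Therefore, $\|Jx - Jy\| \le \delta\theta \|x - y\|$, and for $\delta > 0$ sufficiently small $J$ is a contraction in $B$. By Proposition 1, the map $J$ has a unique fixed point $x^\varphi$ in $B$, which is thus the desired sequence. □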
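The contraction constant in the proof of Lemma 7 is $\delta\theta$ with $\theta = 2^{1+\varepsilon-a} D \lambda_{-2\varepsilon-1}$, so the contraction step needs $\delta < 1/\theta$. The sketch below gives a rough numerical value for this threshold. It is an illustration only: the function names are hypothetical, $\lambda_{-2\varepsilon-1}$ is bounded from above by a partial sum plus an integral tail, and the bounds (61) and (62) impose further smallness conditions on $\delta$ that are not computed here.

```python
import math


def zeta_upper(s, terms=10000):
    """Upper bound for sum_{k>=1} k**(-s), s > 1: partial sum plus the integral tail bound."""
    assert s > 1
    head = sum(k ** (-s) for k in range(1, terms + 1))
    tail = terms ** (1 - s) / (s - 1)  # integral of x**(-s) over [terms, infinity)
    return head + tail


def contraction_threshold(a, eps, D):
    """Return (theta, 1/theta) with theta = 2**(1+eps-a) * D * lambda_{-2*eps-1}."""
    assert a < 0 and eps > 0 and D >= 1
    theta = 2 ** (1 + eps - a) * D * zeta_upper(1 + 2 * eps)
    return theta, 1.0 / theta


if __name__ == "__main__":
    theta, delta_max = contraction_threshold(a=-1.0, eps=0.5, D=1.0)
    print(theta, delta_max)
```

Because `zeta_upper` overestimates $\lambda_{-2\varepsilon-1}$, the returned value of $1/\theta$ is slightly conservative, which is the safe direction for a smallness condition.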
For the next lemma we need to represent by (xn,k )kn ∈ Bn the unique sequence given by Lemma 7. Lemma 8. Given δ > 0 sufficiently small, for each ϕ ∈ X the following properties are equivalent: ϕ
(1) for every n ∈ N, m > n, and ξ ∈ En , the identity (46) holds with x = (xn,k )kn ; (2) for every n ∈ N, and ξ ∈ En we have ϕn (ξ ) = −
∞ k=n
ϕ ϕ
A(k + 1, n)−1 hk xn,k (ξ ), ϕk xn,k (ξ ) .
(63)
A.J.G. Bento, C. Silva / Journal of Functional Analysis 257 (2009) 122–148
141
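Proof. We first show that the series in (63) converges. By the mean value theorem, using (36), (39) and (50) we obtain
\begin{align*}
\bigl\|f_k\bigl(x^{\varphi}_{n,k}(\xi), \varphi_k(x^{\varphi}_{n,k}(\xi))\bigr)\bigr\|
&\le \delta k^{-3\varepsilon-1} \bigl( \|x^{\varphi}_{n,k}(\xi)\| + \|\varphi_k(x^{\varphi}_{n,k}(\xi))\| \bigr) \\
&\le 2\delta k^{-3\varepsilon-1} \|x^{\varphi}_{n,k}(\xi)\| \\
&\le 2\delta C k^{-3\varepsilon-1} (k-n+1)^a n^{\varepsilon} \|\xi\|.
\end{align*}
From (3), we conclude that
\begin{align*}
\sum_{k=n}^{\infty} \bigl\|A(k+1,n)^{-1} h_k\bigl(x^{\varphi}_{n,k}(\xi), \varphi_k(x^{\varphi}_{n,k}(\xi))\bigr)\bigr\|
&= \sum_{k=n}^{\infty} \bigl\|A(k+1,n)^{-1} Q_{k+1} f_k\bigl(x^{\varphi}_{n,k}(\xi), \varphi_k(x^{\varphi}_{n,k}(\xi))\bigr)\bigr\| \\
&\le \sum_{k=n}^{\infty} D(k-n+2)^{-b} (k+1)^{\varepsilon}\, 2\delta C k^{-3\varepsilon-1} (k-n+1)^a n^{\varepsilon} \|\xi\| \\
&\le 2^{1+\varepsilon} \delta C D n^{\varepsilon} \|\xi\| \sum_{k=n}^{\infty} (k-n+1)^{a-b} k^{-2\varepsilon-1} \\
&\le 2^{1+\varepsilon} \delta C D n^{\varepsilon} \|\xi\| \sum_{k=n}^{\infty} (k-n+1)^{a-b-2\varepsilon-1} < \infty.
\end{align*}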
If the first property holds, the identity $A(m,n)^{-1} A(m,k+1) = A(k+1,n)^{-1}$ allows us to write (46) in the following equivalent form
\[
\varphi_n(\xi) = A(m,n)^{-1} \varphi_m\bigl(x^{\varphi}_{n,m}(\xi)\bigr) - \sum_{k=n}^{m-1} A(k+1,n)^{-1} h_k\bigl(x^{\varphi}_{n,k}(\xi), \varphi_k(x^{\varphi}_{n,k}(\xi))\bigr). \tag{64}
\]
By (3) and (39), it follows that
\begin{align*}
\bigl\|A(m,n)^{-1} \varphi_m\bigl(x^{\varphi}_{n,m}(\xi)\bigr)\bigr\|
&= \bigl\|A(m,n)^{-1} Q_m \varphi_m\bigl(x^{\varphi}_{n,m}(\xi)\bigr)\bigr\| \\
&\le D(m-n+1)^{-b} m^{\varepsilon} \|x^{\varphi}_{n,m}(\xi)\| \\
&\le D(m-n+1)^{-b} m^{\varepsilon}\, C(m-n+1)^a n^{\varepsilon} \|\xi\| \\
&\le CD(m-n+1)^{a-b+\varepsilon} n^{2\varepsilon} \|\xi\|.
\end{align*}
Therefore by (42) we have $A(m,n)^{-1} \varphi_m(x^{\varphi}_{n,m}(\xi)) \to 0$ when m → ∞ and letting m → ∞ in (64) we obtain (63). We now assume that (63) holds. Then
\[
A(m,n)\varphi_n(\xi) = -\sum_{k=n}^{\infty} A(m,k+1) h_k\bigl(x^{\varphi}_{n,k}(\xi), \varphi_k(x^{\varphi}_{n,k}(\xi))\bigr),
\]
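and thus
\[
A(m,n)\varphi_n(\xi) + \sum_{k=n}^{m-1} A(m,k+1) h_k\bigl(x^{\varphi}_{n,k}(\xi), \varphi_k(x^{\varphi}_{n,k}(\xi))\bigr)
= -\sum_{k=m}^{\infty} A(m,k+1) h_k\bigl(x^{\varphi}_{n,k}(\xi), \varphi_k(x^{\varphi}_{n,k}(\xi))\bigr). \tag{65}
\]
It follows from (63) that the right-hand side of (65) is $\varphi_m(x^{\varphi}_{n,m}(\xi))$, and Eq. (46) is satisfied with $x = x^{\varphi}$. □

We now equip the space X with the metric induced by
\[
\|\varphi\| = \sup\left\{ \frac{\|\varphi_n(x)\|}{\|x\|} : n \in \mathbb{N} \text{ and } x \in E_n \setminus \{0\} \right\}. \tag{66}
\]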
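Proposition 2. The space X is a complete metric space with the metric induced by (66).

The proof of Proposition 2 is completely analogous to the proof of Proposition 1 and thus it is omitted.

Lemma 9. Given δ > 0 sufficiently small, for each ϕ, ψ ∈ X, n ∈ N, m ≥ n, and ξ ∈ E_n we have
\[
\|x^{\varphi}_m(\xi) - x^{\psi}_m(\xi)\| \le \frac{C}{2} (m-n+1)^a n^{\varepsilon} \|\xi\| \cdot \|\varphi-\psi\|.
\]
Proof. Putting
\[
\gamma_k = \bigl\|f_k\bigl(x^{\varphi}_k(\xi), \varphi_k(x^{\varphi}_k(\xi))\bigr) - f_k\bigl(x^{\psi}_k(\xi), \psi_k(x^{\psi}_k(\xi))\bigr)\bigr\|, \tag{67}
\]
by the mean value theorem and (36) we obtain
\[
\gamma_k \le \delta k^{-3\varepsilon-1} \bigl( \|x^{\varphi}_k(\xi) - x^{\psi}_k(\xi)\| + \|\varphi_k(x^{\varphi}_k(\xi)) - \psi_k(x^{\psi}_k(\xi))\| \bigr).
\]
Using again the mean value theorem, (39) and (50) we have
\begin{align*}
\|\varphi_k(x^{\varphi}_k(\xi)) - \psi_k(x^{\psi}_k(\xi))\|
&\le \|\varphi_k(x^{\varphi}_k(\xi)) - \psi_k(x^{\varphi}_k(\xi))\| + \|\psi_k(x^{\varphi}_k(\xi)) - \psi_k(x^{\psi}_k(\xi))\| \\
&\le \|\varphi-\psi\| \cdot \|x^{\varphi}_k(\xi)\| + \|x^{\varphi}_k(\xi) - x^{\psi}_k(\xi)\| \\
&\le C(k-n+1)^a n^{\varepsilon} \|\varphi-\psi\| \, \|\xi\| + \|x^{\varphi}_k(\xi) - x^{\psi}_k(\xi)\|
\end{align*}
and this implies that
\[
\gamma_k \le \delta k^{-3\varepsilon-1} \bigl( 2\|x^{\varphi}_k(\xi) - x^{\psi}_k(\xi)\| + C(k-n+1)^a n^{\varepsilon} \|\varphi-\psi\| \, \|\xi\| \bigr). \tag{68}
\]
We are now going to prove the lemma by induction. For k = n the result follows immediately. Suppose that the result is true for k = n, …, m − 1. Then for k = n, …, m − 1 we get from (68) and the induction hypothesis that
\begin{align*}
\gamma_k &\le \delta k^{-3\varepsilon-1} \bigl( C(k-n+1)^a n^{\varepsilon} \|\varphi-\psi\| \, \|\xi\| + C(k-n+1)^a n^{\varepsilon} \|\varphi-\psi\| \, \|\xi\| \bigr) \\
&\le 2\delta C (k-n+1)^a n^{\varepsilon} k^{-3\varepsilon-1} \|\varphi-\psi\| \, \|\xi\|.
\end{align*}
Hence by (2), the last inequality and (8) it follows that
\begin{align*}
\|x^{\varphi}_m(\xi) - x^{\psi}_m(\xi)\|
&\le \sum_{k=n}^{m-1} \|A(m,k+1)P_{k+1}\| \, \gamma_k \\
&\le \sum_{k=n}^{m-1} D(m-k)^a (k+1)^{\varepsilon}\, 2\delta C (k-n+1)^a n^{\varepsilon} k^{-3\varepsilon-1} \|\varphi-\psi\| \, \|\xi\| \\
&\le 2^{1+\varepsilon-a} CD\delta\lambda_{-2\varepsilon-1} (m-n+1)^a n^{\varepsilon} \|\varphi-\psi\| \, \|\xi\|.
\end{align*}
Choosing δ such that $2^{1+\varepsilon-a} D\delta\lambda_{-2\varepsilon-1} < 1/2$ we have
\[
\|x^{\varphi}_m(\xi) - x^{\psi}_m(\xi)\| \le \frac{C}{2} (m-n+1)^a n^{\varepsilon} \|\varphi-\psi\| \, \|\xi\|
\]
and this concludes the proof of the lemma. □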
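Lemma 10. Given δ > 0 sufficiently small, there is a unique ϕ ∈ X such that (63) holds for every n ∈ N and ξ ∈ E_n.

Proof. We consider the operator Φ defined for each ϕ ∈ X by
\[
(\Phi\varphi)_n(\xi) = -\sum_{k=n}^{\infty} A(k+1,n)^{-1} h_k\bigl(x^{\varphi}_k(\xi), \varphi_k(x^{\varphi}_k(\xi))\bigr),
\]
where $x^{\varphi} = (x^{\varphi}_k)_{k \ge n}$ is the unique sequence given by Lemma 7. One can easily verify that each function $(\Phi\varphi)_n$ is of class $C^1$. Since $x^{\varphi} \in B$ we have $x^{\varphi}_m(0) = 0$, m ≥ n. It follows from (38) and (35) that $(\Phi\varphi)_n(0) = 0$ for each n ∈ N. We also have $d_0(\Phi\varphi)_n = 0$ for each n ∈ N because $d_0 f_k = 0$, k ∈ N, $\varphi_m(0) = 0$ and $x^{\varphi}_m(0) = 0$, m ≥ n. Furthermore, by (3) and (56) we have
\begin{align*}
\|d_\xi (\Phi\varphi)_n\|
&\le \sum_{k=n}^{\infty} \|A(k+1,n)^{-1} Q_{k+1}\| \cdot \|d_\xi f_k^*\| \\
&\le 2CD\delta \sum_{k=n}^{\infty} (k-n+2)^{-b} (k+1)^{\varepsilon} k^{-3\varepsilon-1} (k-n+1)^a n^{\varepsilon} \\
&\le 2^{1+\varepsilon} CD\delta \sum_{k=n}^{\infty} (k-n+2)^{-b} (k-n+1)^a k^{-\varepsilon-1} \\
&\le 2^{1+\varepsilon} CD\delta \sum_{k=n}^{\infty} (k-n+1)^{a-b-\varepsilon-1} < \infty.
\end{align*}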
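Hence, for δ > 0 sufficiently small (independently of ϕ, n and ξ) we have $\|d_\xi (\Phi\varphi)_n\| \le 1$ for every n ∈ N. On the other hand, given ξ, ξ̄ ∈ E_n it follows from (59) that
\begin{align*}
\|d_\xi (\Phi\varphi)_n - d_{\bar\xi} (\Phi\varphi)_n\|
&\le \sum_{k=n}^{\infty} \|A(k+1,n)^{-1} Q_{k+1}\| \cdot \|d_\xi f_k^* - d_{\bar\xi} f_k^*\| \\
&\le 7C^2 D\delta n^{2\varepsilon} \|\xi-\bar\xi\| \sum_{k=n}^{\infty} (k-n+2)^{-b} (k+1)^{\varepsilon} k^{-3\varepsilon-1} (k-n+1)^a \\
&\le 7\cdot 2^{\varepsilon} C^2 D\delta \|\xi-\bar\xi\| \sum_{k=n}^{\infty} (k-n+1)^{a-b-1} \\
&\le \|\xi-\bar\xi\|
\end{align*}
provided that δ > 0 is sufficiently small (independently of ϕ, n, ξ and ξ̄). Hence, Φ(X) ⊂ X. We now show that Φ is a contraction. Given ϕ, ψ ∈ X and n ∈ N, let $x^{\varphi}$ and $x^{\psi}$ be the unique sequences given by Lemma 7 respectively for ϕ and ψ. By (68) (see (67) for the definition of $\gamma_k$) and Lemma 9 we have
\begin{align*}
\|(\Phi\varphi)_n(\xi) - (\Phi\psi)_n(\xi)\|
&\le \sum_{k=n}^{\infty} \|A(k+1,n)^{-1} Q_{k+1}\| \, \gamma_k \\
&\le \sum_{k=n}^{\infty} \|A(k+1,n)^{-1} Q_{k+1}\| \, \delta k^{-3\varepsilon-1} \\
&\qquad \times \bigl( C(k-n+1)^a n^{\varepsilon} \|\xi\| \cdot \|\varphi-\psi\| + C(k-n+1)^a n^{\varepsilon} \|\xi\| \cdot \|\varphi-\psi\| \bigr) \\
&\le 2CD\delta \|\xi\| \cdot \|\varphi-\psi\| \sum_{k=n}^{\infty} (k-n+2)^{-b} (k+1)^{\varepsilon} k^{-3\varepsilon-1} (k-n+1)^a n^{\varepsilon} \\
&\le 2^{1+\varepsilon} CD\delta \|\xi\| \cdot \|\varphi-\psi\| \sum_{k=n}^{\infty} (k-n+1)^{a-b-\varepsilon-1}.
\end{align*}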
Therefore, choosing δ > 0 such that $\lambda := 2^{1+\varepsilon} CD\delta\lambda_{a-b-\varepsilon-1} < 1$, it follows that $\|\Phi\varphi - \Phi\psi\| \le \lambda \|\varphi-\psi\|$, and with this λ we obtain a contraction in X. By Proposition 2, the map Φ has a unique fixed point ϕ in X that is the desired sequence. □

Proof of Theorem 2. By Lemma 7, for each ϕ ∈ X there is a unique sequence $x^{\varphi} \in B$ satisfying identity (45). It remains to solve the identity (46) with $x = x^{\varphi}$. By Lemma 8, this is the same as solving (63). Finally, by Lemma 10, there is a unique solution of (63). This establishes the existence of the stable manifolds for δ > 0 sufficiently small. Since the functions $\varphi_n$ are of class $C^{1,1}$ the graphs $V_n$ are $C^{1,1}$ manifolds. Finally, for each n ∈ N, m ≥ n, ξ, ξ̄ ∈ E_n, it follows from (54) and (57) that
\begin{align*}
\|F(m,n)(v) - F(m,n)(\bar v)\|
&\le \|x_m(\xi) - x_m(\bar\xi)\| + \|\varphi^*_m(\xi) - \varphi^*_m(\bar\xi)\| \\
&\le 2C(m-n+1)^a n^{\varepsilon} \|\xi-\bar\xi\|
\end{align*}
and from (49) and (58) that
\begin{align*}
\|d_v F(m,n) - d_{\bar v} F(m,n)\|
&\le \|d_\xi x_m - d_{\bar\xi} x_m\| + \|d_\xi \varphi^*_m - d_{\bar\xi} \varphi^*_m\| \\
&\le (C + 2C^2)(m-n+1)^a n^{2\varepsilon} \|\xi-\bar\xi\|,
\end{align*}
where $v = (\xi, \varphi_n(\xi))$ and $\bar v = (\bar\xi, \varphi_n(\bar\xi))$. This completes the proof of the theorem. □
5. Examples

In this section we give examples of nonuniform polynomial dichotomies and two families of perturbations, the first one verifying our assumptions in Section 3 and the second one verifying the assumptions of Section 4. Let a < 0 ≤ b and ε ≥ 0. We are going to construct a sequence of linear operators $A_n : \mathbb{R}^2 \to \mathbb{R}^2$ given by diagonal matrices
\[
A_n = \begin{pmatrix} a_n & 0 \\ 0 & b_n \end{pmatrix}
\]
with positive entries in the diagonal and that verify (2) and (3) with D = 1 and projections given by $P_n(x,y) = (x,0)$ and $Q_n(x,y) = (0,y)$. Because $\|A(2,1)P_1\| = a_1$, from (2) we must have $a_1 \le 2^a$. Therefore we put $a_1 = 2^a$. Using again (2) we must have
\[
\|A(3,2)P_2\| \le 2^a 2^{\varepsilon} \quad\text{and}\quad \|A(3,1)P_1\| \le 3^a.
\]
Since $\|A(3,2)P_2\| = a_2$ and $\|A(3,1)P_1\| = a_2 a_1$, we put
\[
a_2 = \min\left\{ 2^a 2^{\varepsilon}, \frac{3^a}{a_1} \right\}.
\]
Using the same arguments we put
\[
a_3 = \min\left\{ 2^a 3^{\varepsilon}, \frac{3^a 2^{\varepsilon}}{a_2}, \frac{4^a}{a_2 a_1} \right\}.
\]
Hence the values of $a_n$ are given recursively by $a_1 = 2^a$ and
\[
a_{n+1} = \min\left\{ 2^a (n+1)^{\varepsilon}, \frac{3^a n^{\varepsilon}}{a_n}, \ldots, \frac{(n+1)^a 2^{\varepsilon}}{a_n \cdots a_2}, \frac{(n+2)^a}{a_n \cdots a_1} \right\}.
\]
Using similar arguments, from (3) we set $b_1 = 2^b 2^{-\varepsilon}$ and
\[
b_{n+1} = \max\left\{ \frac{2^b}{(n+2)^{\varepsilon}}, \frac{3^b}{(n+2)^{\varepsilon} b_n}, \ldots, \frac{(n+1)^b}{(n+2)^{\varepsilon} b_n \cdots b_2}, \frac{(n+2)^b}{(n+2)^{\varepsilon} b_n \cdots b_1} \right\}.
\]
It follows immediately from the construction that $(A_n)$ admits a nonuniform polynomial dichotomy. For instance, if ε = −a = b we can easily verify that
\[
A_n = \begin{pmatrix} \left(\frac{n+1}{n}\right)^a & 0 \\ 0 & 1 \end{pmatrix}.
\]
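The recursive construction above can be checked numerically. The following is only a sketch under stated assumptions: it takes condition (2) to read ‖A(m, k)P_k‖ ≤ (m − k + 1)^a k^ε and condition (3) to read ‖A(m, k)^{-1}Q_m‖ ≤ (m − k + 1)^{-b} m^ε with D = 1 (the forms in which they are used in the proofs above), and the function name `build_dichotomy_entries` is hypothetical.

```python
import math


def build_dichotomy_entries(a, b, eps, count):
    """Return ([a_1..a_count], [b_1..b_count]) for the diagonal operators A_n.

    Assumed bounds (with D = 1), for all m > k >= 1:
        a_{m-1} * ... * a_k <= (m - k + 1)**a * k**eps
        b_{m-1} * ... * b_k >= (m - k + 1)**b * m**(-eps)
    Each new entry is the largest a_{m-1} (smallest b_{m-1}) compatible
    with every bound that involves it.
    """
    a_vals, b_vals = [], []
    for m in range(2, count + 2):      # this pass produces a_{m-1} and b_{m-1}
        stable, unstable = [], []
        prod_a = prod_b = 1.0          # products a_{m-2}...a_k and b_{m-2}...b_k
        for k in range(m - 1, 0, -1):  # k = m-1, m-2, ..., 1
            stable.append((m - k + 1) ** a * k ** eps / prod_a)
            unstable.append((m - k + 1) ** b * m ** (-eps) / prod_b)
            if k >= 2:                 # extend the products down to index k-1
                prod_a *= a_vals[k - 2]
                prod_b *= b_vals[k - 2]
        a_vals.append(min(stable))
        b_vals.append(max(unstable))
    return a_vals, b_vals


if __name__ == "__main__":
    # Special case eps = -a = b, where the text gives a_n = ((n+1)/n)^a, b_n = 1.
    a_vals, b_vals = build_dichotomy_entries(-0.5, 0.5, 0.5, 5)
    for n, (an, bn) in enumerate(zip(a_vals, b_vals), start=1):
        print(n, an, ((n + 1) / n) ** -0.5, bn)
```

Running the script prints, for each n, the computed a_n next to the closed form ((n+1)/n)^a so the two columns can be compared directly; the quadratic loop is chosen for readability rather than speed.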
For the local case considered in Section 3 we have the perturbations $f_n : \mathbb{R}^2 \to \mathbb{R}^2$ given by
\[
f_n(x,y) = (x^{q+1}, y^{q+1}) \quad\text{or}\quad f_n(x,y) = (y^{q+1}, x^{q+1})
\]
with q ∈ N. Obviously we have $f_n(0,0) = (0,0)$ and from the inequalities
\[
\sqrt{a^2 + b^2} \le |a| + |b| \le \sqrt{2}\sqrt{a^2 + b^2}
\]
and
\[
|a^{q+1} - b^{q+1}| = \Bigl| (a-b) \sum_{k=0}^{q} a^{q-k} b^k \Bigr| \le |a-b| \sum_{k=0}^{q} |a|^{q-k} |b|^k \le |a-b| \bigl(|a| + |b|\bigr)^q
\]
it follows that
\begin{align*}
\|f_n(x,y) - f_n(u,v)\|
&= \sqrt{(x^{q+1} - u^{q+1})^2 + (y^{q+1} - v^{q+1})^2} \\
&\le |x^{q+1} - u^{q+1}| + |y^{q+1} - v^{q+1}| \\
&\le |x-u| \bigl(|x| + |u|\bigr)^q + |y-v| \bigl(|y| + |v|\bigr)^q \\
&\le \bigl(|x-u| + |y-v|\bigr) \bigl(|x| + |u| + |y| + |v|\bigr)^q \\
&\le \sqrt{2}\, \|(x,y) - (u,v)\| \bigl(|x| + |u| + |y| + |v|\bigr)^q \\
&\le 2^{(q+1)/2} \|(x,y) - (u,v)\| \bigl(\|(x,y)\| + \|(u,v)\|\bigr)^q.
\end{align*}
This proves that the family of perturbations $f_n$ satisfies (10) and (11). Therefore if we have aq + ε + 1 < 0 and a + β < 0, using Theorem 1 with $(A_n)$ and $(f_n)$ as above, we get local Lipschitz stable manifolds for the dynamics defined by (4).

Let $g : \mathbb{R} \to \mathbb{R}$ be the function defined by $g(t) = t^2 e^{-t^2}$. It is easy to see that
\[
|g'(t)| = \bigl| 2t(1 - t^2) e^{-t^2} \bigr| \le 1 \tag{69}
\]
and
\[
|g'(t) - g'(s)| \le 2|t-s| \tag{70}
\]
for every t, s ∈ R. Then if $f_n : \mathbb{R}^2 \to \mathbb{R}^2$ is the function defined by
\[
f_n(x,y) = \frac{\delta}{2} n^{-3\varepsilon-1} \bigl(g(x), g(y)\bigr) \quad\text{or}\quad f_n(x,y) = \frac{\delta}{2} n^{-3\varepsilon-1} \bigl(g(y), g(x)\bigr),
\]
from (69) and (70) it follows immediately that the functions $f_n$ satisfy (35), (36) and (37). A closer look at the proofs of Lemmas 7, 9 and 10 allows us to conclude, taking C = 2D, that it is sufficient to have
\[
\delta < \frac{1}{14 \cdot 2^{\varepsilon-a} D^3 \lambda_{-\varepsilon-1}},
\]
and in our example, since D = 1 and $\lambda_{-\varepsilon-1} < 1 + 1/\varepsilon$, it is enough to have
\[
\delta < \frac{1}{14 \cdot 2^{\varepsilon-a} (1 + 1/\varepsilon)}.
\]
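The bounds (69) and (70) and the resulting threshold for δ can be sanity-checked numerically. The sketch below is an illustration only, not part of the argument: it samples g' and g'' on a finite grid (so it tests, rather than proves, the inequalities), it uses the fact that (70) follows from |g''| ≤ 2 by the mean value theorem, and the helper names `dg`, `d2g` and `delta_threshold` are hypothetical.

```python
import math


def g(t):
    """g(t) = t^2 e^{-t^2}."""
    return t * t * math.exp(-t * t)


def dg(t):
    """First derivative: g'(t) = 2t(1 - t^2) e^{-t^2}."""
    return 2.0 * t * (1.0 - t * t) * math.exp(-t * t)


def d2g(t):
    """Second derivative: g''(t) = (2 - 10 t^2 + 4 t^4) e^{-t^2}."""
    return (2.0 - 10.0 * t * t + 4.0 * t ** 4) * math.exp(-t * t)


def delta_threshold(a, eps):
    """Upper bound 1 / (14 * 2^(eps - a) * (1 + 1/eps)) for delta, with D = 1."""
    return 1.0 / (14.0 * 2.0 ** (eps - a) * (1.0 + 1.0 / eps))


# Sample both derivatives on [-8, 8]; outside this interval they are negligible.
grid = [i / 1000.0 for i in range(-8000, 8001)]
max_dg = max(abs(dg(t)) for t in grid)    # compare with the bound 1 in (69)
max_d2g = max(abs(d2g(t)) for t in grid)  # compare with the constant 2 in (70)

if __name__ == "__main__":
    print("max |g'|  on grid:", max_dg)
    print("max |g''| on grid:", max_d2g)
    print("delta threshold for a = -1, eps = 1:", delta_threshold(-1.0, 1.0))
```

Any δ strictly below `delta_threshold(a, eps)` then meets the sufficient condition stated above for the chosen a and ε.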
With these $(f_n)$ and with $(A_n)$ as above, using Theorem 2 we conclude that the dynamics given by (4) has global stable $C^1$ manifolds when a + ε < b and ε > 0.
Journal of Functional Analysis 257 (2009) 149–193 www.elsevier.com/locate/jfa

Banach spaces without minimal subspaces

Valentin Ferenczi a, Christian Rosendal b,∗,1

a Departamento de Matemática, Instituto de Matemática e Estatística, Universidade de São Paulo, 05311-970 São Paulo, SP, Brazil
b Department of Mathematics, Statistics, and Computer Science, University of Illinois at Chicago, 322 Science and Engineering Offices (M/C 249), 851 S. Morgan Street, Chicago, IL 60607-7045, USA

Received 21 October 2008; accepted 23 January 2009
Available online 20 February 2009
Communicated by K. Ball

Abstract
We prove three new dichotomies for Banach spaces à la W.T. Gowers’ dichotomies. The three dichotomies characterise respectively the spaces having no minimal subspaces, having no subsequentially minimal basic sequences, and having no subspaces crudely finitely representable in all of their subspaces. We subsequently use these results to make progress on Gowers’ program of classifying Banach spaces by finding characteristic spaces present in every space. Also, the results are used to embed any partial order of size ℵ1 into the subspaces of any space without a minimal subspace ordered by isomorphic embeddability.
© 2009 Elsevier Inc. All rights reserved.
Keywords: Minimal Banach spaces; Dichotomies; Classification of Banach spaces
Contents
1. Introduction
2. Preliminaries
2.1. Notation, terminology, and conventions
2.2. Gowers’ block sequence game
2.3. A trick and a lemma
3. Tightness
3.1. Tight bases
3.2. A generalised asymptotic game
3.3. A game for minimality
3.4. A dichotomy for minimality
4. Tightness with constants and crude stabilisation of local structure
5. Local block minimality, asymptotic structure and a dichotomy for containing $c_0$ or $\ell_p$
6. Tightness by range and subsequential minimality
7. Chains and strong antichains
8. Refining Gowers’ dichotomies
9. Open problems
References

* Corresponding author.
E-mail addresses: [email protected] (V. Ferenczi), [email protected] (C. Rosendal).
1 Partially supported by NSF grant DMS 0556368 and by FAPESP.
0022-1236/$ – see front matter © 2009 Elsevier Inc. All rights reserved. doi:10.1016/j.jfa.2009.01.028
1. Introduction

In the paper [20], W.T. Gowers initiated a celebrated classification theory for Banach spaces. Since the task of classifying all (even separable) Banach spaces up to isomorphism is extremely complicated (just how complicated is made precise in [11]), one may settle for a loose classification of Banach spaces up to subspaces, that is, look for a list of classes of Banach spaces such that:
(a) each class is pure, in the sense that if a space belongs to a class, then every subspace belongs to the same class, or maybe, in the case when the properties defining the class depend on a basis of the space, every block subspace belongs to the same class,
(b) the classes are inevitable, i.e., every Banach space contains a subspace in one of the classes,
(c) any two classes in the list are disjoint,
(d) belonging to one class gives a lot of information about operators that may be defined on the space or on its subspaces.
We shall refer to this list as the list of inevitable classes of Gowers. Many classical problems are related to this classification program, as for example the question whether every Banach space contains a copy of $c_0$ or $\ell_p$, solved in the negative by B.S. Tsirelson in 1974 [42], or the unconditional basic sequence problem, also solved negatively by Gowers and B. Maurey in 1993 [21]. Ultimately one would hope to establish such a list so that any classical space appears in one of the classes, and so that belonging to that class would yield most of the properties which are known for that space. For example, any property, which is known for Tsirelson’s space, is also true for any of its block subspaces. So Tsirelson’s space is a pure space, and, as such, should appear in one of the classes with a reasonable amount of its properties. Also, presumably the nicest among the classes would consist of the spaces isomorphic to $c_0$ or $\ell_p$, 1 ≤ p < ∞.
After the discovery by Gowers and Maurey of the existence of hereditarily indecomposable (or HI) spaces, i.e., spaces such that no subspace may be written as the direct sum of infinite-dimensional subspaces [21], Gowers proved that every Banach space contains either an HI subspace or a subspace with an unconditional basis [19]. These were the first two examples of inevitable classes. We shall call this dichotomy the first dichotomy of Gowers. He then used his famous Ramsey or determinacy theorem [20] to refine the list by proving that any Banach space contains a subspace with a basis such that either no two disjointly supported block subspaces are isomorphic, or such that any two subspaces have further subspaces which are isomorphic. He
called the second property quasi minimality. This second dichotomy divides the class of spaces with an unconditional basis into two subclasses (up to passing to a subspace). Finally, recall that a space is minimal if it embeds into any of its subspaces. A quasi minimal space which does not contain a minimal subspace is called strictly quasi minimal, so Gowers again divided the class of quasi minimal spaces into the class of strictly quasi minimal spaces and the class of minimal spaces. Obviously the division between minimal and strictly quasi minimal spaces is not a real dichotomy, since it does not provide any additional information. The main result of this paper is to provide the missing dichotomy for minimality, which we shall call the third dichotomy. A first step in that direction was obtained by A.M. Pełczar, who showed that any strictly quasi minimal space contains a further subspace with the additional property of not containing any subsymmetric sequence [35]. The first author proved that the same holds if one replaces subsymmetric sequences by embedding homogeneous sequences (any subspace spanned by a subsequence contains an isomorphic copy of the whole space) [10]. A crucial step in the proofs of [35] and [10] is the notion of asymptoticity. An asymptotic game of length k in a space E with a basis is a game where I plays integers $n_i$ and II plays block vectors $x_i$ supported after $n_i$, and where the outcome is the length k sequence $(x_i)$. Asymptotic games have been studied extensively and the gap between finite-dimensional and infinite-dimensional phenomena was usually bridged by fixing a constant and letting the length of the game tend to infinity. For example, a basis is asymptotic $\ell_p$ if there exists C such that for any k, I has a winning strategy in the length k asymptotic game so that the outcome is C-equivalent to the unit vector basis of $\ell_p^k$. In [35] it is necessary to consider asymptotic games of infinite length, which are defined in an obvious manner.
The outcome is then an infinite block sequence. The proof of the theorem in [35] is based on the obvious fact that if a basic sequence $(e_i)$ is subsymmetric and $(x_i)$ is a block sequence of $(e_i)$, then II has a strategy in the infinite asymptotic game in $E = [e_i]$ to ensure that the outcome is equivalent to $(x_i)$. In [10] a similar fact for embedding homogeneous basic sequences is obtained, but the proof is more involved and a more general notion of asymptoticity must be used. Namely, a generalised asymptotic game in a space E with a basis $(e_i)$ is a game where I plays integers $n_i$ and II plays integers $m_i$ and vectors $x_i$ such that $\operatorname{supp}(x_i) \subseteq [n_1, m_1] \cup \cdots \cup [n_i, m_i]$, and the outcome is the sequence $(x_i)$, which may no longer be a block basis. The second author analysed infinite asymptotic games in [38] (a previous study had also been undertaken by E. Odell and T. Schlumprecht in [34]), showing that the most obvious necessary conditions are, in fact, also sufficient for II to have a strategy to play inside a given set. This was done through relating the existence of winning strategies to a property of subspaces spanned by vectors of the basis with indices in some intervals of integers. Now the methods of [38] extend to the setting of generalised asymptotic games and motivate the following definition. A space Y is tight in a basic sequence $(e_i)$ if there is a sequence of successive intervals $I_0 < I_1 < I_2 < \cdots$ of N such that for all infinite subsets A ⊆ N, we have
\[
Y \not\sqsubseteq \Bigl[ e_n \Bigm| n \notin \bigcup_{i \in A} I_i \Bigr],
\]
where $Y \sqsubseteq X$ denotes that Y embeds into X. In other words, any embedding of Y into $[e_i]$ has a “large” image with respect to subsequences of the basis $(e_i)$ and cannot avoid an infinite number of the subspaces $[e_n]_{n \in I_i}$. We then define a tight basis as a basis such that every subspace is tight in it and a tight space as a space with a tight basis.
As we shall prove in Lemma 3.7, using the techniques of [38], essentially a block subspace $Y = [y_i]$ is not tight in $(e_i)$, when II has a winning strategy in the generalised asymptotic game in $[e_i]$ for producing a sequence equivalent to $(y_i)$. This relates the notion of tight bases to the methods of [10], and by extending these methods we prove the main result of this paper:

Theorem 1.1 (3rd dichotomy). Let E be a Banach space without minimal subspaces. Then E has a tight subspace.

Theorem 1.1 extends the theorems of [35,10], since it is clear that a tight space cannot contain a subsymmetric or even embedding homogeneous block sequence. This dichotomy also provides an improvement to the list of Gowers: a strictly quasi minimal space must contain a tight quasi minimal subspace. Example 3.6 shows that this is a non-trivial refinement of the unconditional and strictly quasi minimal class, and Corollary 4.3 states that Tsirelson’s space is tight. Theorem 1.1 also refines the class of HI spaces in the list, i.e., every HI space contains a tight subspace, although it is unknown whether the HI property for a space with a basis does not already imply that the basis is tight. Our actual examples of tight spaces turn out to satisfy one of two stronger forms of tightness. The first is called tightness with constants. A basis $(e_n)$ is tight with constants when for every infinite-dimensional space Y, the sequence of successive intervals $I_0 < I_1 < \cdots$ of N witnessing the tightness of Y in $(e_n)$ may be chosen so that $Y \not\sqsubseteq_K [e_n \mid n \notin I_K]$ for each K. This is the case for Tsirelson’s space. The second kind of tightness is called tightness by range. Here the range, range x, of a vector x is the smallest interval of integers containing its support, and the range of a block subspace $[x_n]$ is $\bigcup_n \operatorname{range} x_n$.
A basis $(e_n)$ is tight by range when for every block subspace $Y = [y_n]$, the sequence of successive intervals $I_0 < I_1 < \cdots$ of N witnessing the tightness of Y in $(e_n)$ may be defined by $I_k = \operatorname{range} y_k$ for each k. This is equivalent to no two block subspaces with disjoint ranges being comparable. In a companion paper [14], we show that tightness by range is satisfied by an HI space and also by a space with unconditional basis both constructed by Gowers. It turns out that there are natural dichotomies between each of these strong forms of tightness and respective weak forms of minimality. For the first notion, we define a space X to be locally minimal if for some constant K, X is K-crudely finitely representable in any of its subspaces. Notice that local minimality is easily incompatible with tightness with constants. Using an equivalent form of Gowers’ game, as defined by J. Bagaria and J. López-Abad [4], we prove:

Theorem 1.2 (5th dichotomy). Any Banach space E contains a subspace with a basis that is either tight with constants or is locally minimal.

The ideas involved in the notion of local minimality also make sense for block representability, which allows us to connect these notions with asymptoticity of basic sequences. Proving a simple dichotomy for when a space contains an asymptotically $\ell_p$ subspace, we are led to the following dichotomy for when a Banach space contains a copy of either $c_0$ or $\ell_p$.

Theorem 1.3 (The $c_0$ and $\ell_p$ dichotomy). Suppose X is a Banach space not containing a copy of $c_0$ nor of $\ell_p$, 1 ≤ p < ∞. Then X has a subspace Y with a basis such that either
(1) $\forall M \; \exists n \; \forall U_1, \ldots, U_{2n} \subseteq Y \; \exists u_i \in S_{U_i} \; \bigl( u_1 < \cdots < u_{2n} \;\&\; (u_{2i-1})_{i=1}^{n} \not\sim_M (u_{2i})_{i=1}^{n} \bigr)$.
(2) For all block bases $(z_n)$ of $Y = [y_n]$ there are intervals $I_1 < I_2 < I_3 < \cdots$ such that $(z_n)_{n \in I_K}$ is not K-equivalent to a block sequence of $(y_n)_{n \notin I_K}$.

Here, of course, the variables range over infinite-dimensional spaces. Property (1) indicates some lack of homogeneity and (2) some lack of minimality. It is interesting to see which conditions the various examples of Banach spaces not containing $c_0$ or $\ell_p$ satisfy; obviously, Tsirelson’s space and its dual satisfy (2) and indeed (2) is the only option for spaces being asymptotic $\ell_p$. On the other hand, Schlumprecht’s space S satisfies (1). There is also a dichotomy concerning tightness by range. This direction for refining the list of inevitable classes of spaces was actually suggested by Gowers in [20]. P. Casazza proved that if a space X has a shrinking basis such that no block sequence is even–odd (the odd subsequence is equivalent to the even subsequence), then X is not isomorphic to a proper subspace, see [17]. So any Banach space contains either a subspace, which is not isomorphic to a proper subspace, or is saturated with even–odd block sequences, and, in the second case, we may find a further subspace in which Player II has a winning strategy to produce even–odd sequences in the game of Gowers associated to his Ramsey theorem. This fact was observed by Gowers, but it was unclear to him what to deduce from the property in the second case. We answer this question by using Gowers’ theorem to obtain a dichotomy which on one side contains tightness by range, which is a slightly stronger property than the Casazza property. On the other side, we define a space X with a basis $(x_n)$ to be subsequentially minimal if every subspace of X contains an isomorphic copy of a subsequence of $(x_n)$. This last property is satisfied by Tsirelson’s space and will also be shown to be incompatible with tightness by range.

Theorem 1.4 (4th dichotomy).
Any Banach space E contains a subspace with a basis that is either tight by range or is subsequentially minimal.

It is easy to check that the second case in Theorem 1.4 may be improved to the following hereditary property of a basis $(x_n)$, that we call sequential minimality: every block sequence of $[x_n]$ has a further block sequence $(y_n)$ such that every subspace of $[x_n]$ contains a copy of a subsequence of $(y_n)$. The five dichotomies and the interdependence of the properties involved can be visualised in the following diagram.

Unconditional basis      ∗∗ 1st dichotomy ∗∗      Hereditarily indecomposable
        ⇑                                                    ⇓
Tight by support         ∗∗ 2nd dichotomy ∗∗      Quasi minimal
        ⇓                                                    ⇑
Tight by range           ∗∗ 4th dichotomy ∗∗      Sequentially minimal
        ⇓                                                    ⇑
Tight                    ∗∗ 3rd dichotomy ∗∗      Minimal
        ⇑                                                    ⇓
Tight with constants     ∗∗ 5th dichotomy ∗∗      Locally minimal
From a different point of view, coming from combinatorics and descriptive set theory, Theorem 1.1 also has important consequences for the isomorphic classification of separable Banach spaces. To explain this, suppose that X is a Banach space and $SB_\infty(X)$ is the class of all infinite-dimensional subspaces of X. Then the relation of isomorphic embeddability induces a partial order on the set of biembeddability classes of $SB_\infty(X)$ and we denote this partial order by P(X). Many questions about the isomorphic structure of X translate directly into questions about the structure of P(X), e.g., X has a minimal subspace if and only if P(X) has a minimal element and X is quasi minimal if and only if P(X) is downwards directed. In some sense, a space can be said to be pure in case the complexity of P(X) does not change by passing to subspaces and Gowers [20, Problem 7.9], motivated by this, asked for a classification of, or at least strong structural information about, the partial orders P for which there is a Banach space X saturated with subspaces Y ⊆ X such that P ≅ P(Y). A simple diagonalisation easily shows that such P either consist of a single point (corresponding to a minimal space) or are uncountable, and, using methods of descriptive set theory and metamathematics, this was successively improved in [12] and [37] to either |P| = 1 or P having a continuum size antichain. Using a strengthening of Theorem 1.1, we are now able to show that such P, for which |P| > 1, have an extremely complex structure by embedding any partial order of size at most ℵ1 into them. For A, B ⊆ N, we write A ⊆∗ B to mean that A \ B is finite.

Theorem 1.5. Given a Banach space X, let P(X) be the set of all biembeddability classes of infinite-dimensional subspaces of X, partially ordered under isomorphic embeddability. Let P be a poset for which there exists a Banach space X such that X is saturated with subspaces Y such that P(Y) ≅ P. Then either |P| = 1, or ⊆∗ embeds into P.
In the second case it follows that (a) any partial order of size at most $\aleph_1$ embeds into $P$, and (b) any closed partial order on a Polish space embeds into $P$.

From the point of view of descriptive set theory, it is more natural to study another problem, part of which was originally suggested to us by G. Godefroy some time ago. Namely, the space $SB_\infty(X)$, for $X$ separable, can easily be made into a standard Borel space using the Effros–Borel structure. In this way, the relations of isomorphism, $\cong$, and isomorphic embeddability, $\sqsubseteq$, become analytic relations on $SB_\infty(X)$ whose complexities can be measured through the notion of Borel reducibility. We obtain Theorem 1.5 as a consequence of some finer results formulated in this language and that are of independent interest.

In Section 8, we put all the dichotomies together in order to make progress on the loose classification mentioned above. In connection with this, we shall also rely on work by A. Tcaciuc [41], who proved a dichotomy for containing a strongly asymptotic $\ell_p$ basis, i.e., a basis such that finite families of disjointly supported (but not necessarily successive) normalised blocks supported "far enough" are uniformly equivalent to the basis of $\ell_p^n$. Using just the first four dichotomies, in Theorem 8.3 we find 6 classes of inevitable spaces, 4 of which are known to be non-empty, while if we use all 5 dichotomies plus Tcaciuc's, we find 19 classes. Out of these, 8 are known to be non-empty, though for 4 of the examples, we will need the results of a companion paper [14] where these are constructed and investigated. The resulting classification gives fairly detailed knowledge about the various types of inevitable spaces, though much work remains to be done. In particular, the new dichotomies explain some of the structural differences between the wealth of new exotic spaces constructed
in the wake of the seminal paper of Gowers and Maurey [21]. It seems an interesting task to determine which of the remaining 11 of the 19 cases are non-empty.

2. Preliminaries

2.1. Notation, terminology, and conventions

We shall in the following almost exclusively deal with infinite-dimensional Banach spaces, so to avoid repeating this, we will always assume our spaces to be infinite-dimensional. The spaces can also safely be assumed to be separable, but this will play no role and is not assumed. Moreover, all spaces will be assumed to be over the field of real numbers $\mathbb{R}$, though the results hold without modification for complex spaces too.

Suppose $E$ is a Banach space with a normalised Schauder basis $(e_n)$. Then, by a standard Skolem hull construction, there is a countable subfield $\mathbb{F}$ of $\mathbb{R}$ containing the rational numbers $\mathbb{Q}$ such that for any finite linear combination $\lambda_0 e_0 + \lambda_1 e_1 + \cdots + \lambda_n e_n$ with $\lambda_i \in \mathbb{F}$, we have $\|\lambda_0 e_0 + \lambda_1 e_1 + \cdots + \lambda_n e_n\| \in \mathbb{F}$. This means that any $\mathbb{F}$-linear combination of $(e_n)$ can be normalised, while remaining an $\mathbb{F}$-linear combination. Thus, as the set of $\mathbb{Q}$-linear, and hence also $\mathbb{F}$-linear, combinations of $(e_n)$ is dense in $E$, the set of normalised $\mathbb{F}$-linear combinations of $(e_n)$ is also dense in the unit sphere $S_E$.

A block vector is a normalised finite linear combination $x = \lambda_0 e_0 + \lambda_1 e_1 + \cdots + \lambda_n e_n$ where $\lambda_i \in \mathbb{F}$. We insist on blocks being normalised and $\mathbb{F}$-linear and will be explicit on the few occasions that we deal with non-normalised blocks. The restriction to $\mathbb{F}$-linear combinations is no real loss of generality, but instead has the effect that there are only countably many blocks. We denote by $D$ the set of blocks. The support, $\operatorname{supp} x$, of a block $x = \lambda_0 e_0 + \lambda_1 e_1 + \cdots + \lambda_n e_n$ is the set of $i \in \mathbb{N}$ such that $\lambda_i \neq 0$ and the range, $\operatorname{range} x$, is the smallest interval $I \subseteq \mathbb{N}$ containing $\operatorname{supp} x$.
A block (sub)sequence, block basis, or blocking of $(e_n)$ is an infinite sequence $(x_n)$ of blocks such that $\operatorname{supp} x_n < \operatorname{supp} x_{n+1}$ for all $n$ and a block subspace is the closed linear span of a block sequence. Notice that if $X$ is a block subspace, then the associated block sequence $(x_n)$ such that $X = [x_n]$ is uniquely defined up to the choice of signs $\pm x_n$. So we shall sometimes confuse block sequences and block subspaces. For two block subspaces $X = [x_n]$ and $Y = [y_n]$, write $Y \leq X$ if $Y \subseteq X$, or, equivalently, $y_n \in \operatorname{span}(x_i)$ for all $n$. Also, let $Y \leq^* X$ if there is some $N$ such that $y_n \in \operatorname{span}(x_i)$ for all $n \geq N$.

When we work with block subspaces of some basis $(e_n)$, we will assume that we have chosen the same countable subfield $\mathbb{F}$ of $\mathbb{R}$ for all block sequences $(x_n)$ of $(e_n)$, and hence a vector in $[x_n]$ is a block of $(x_n)$ if and only if it is a block of $(e_n)$, so no ambiguity occurs.

We consider the set $bb(e_n)$ of block sequences of $(e_n)$ as a closed subset of $D^{\mathbb{N}}$, where $D$ is equipped with the discrete topology. In this way, $bb(e_n)$ is a Polish, i.e., separable, completely metrisable space. If $\Delta = (\delta_n)$ is a sequence of positive real numbers, which we denote by $\Delta > 0$, and $A \subseteq bb(e_n)$, we designate by $A_\Delta$ the set

$$A_\Delta = \big\{ (y_n) \in bb(e_n) \;\big|\; \exists (x_n) \in A \ \forall n \ \|x_n - y_n\| < \delta_n \big\}.$$
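The support, range, and ordering conventions for blocks above can be sketched in a few lines. This is only an illustration under stated assumptions: a block is stored as a hypothetical dictionary from basis index to rational coefficient, norms (and hence normalisation) are ignored, and the helper names `supp`, `block_range`, and `is_block_sequence_prefix` are ours, not the paper's.

```python
from fractions import Fraction

def supp(block):
    """Support of a block: the indices carrying a non-zero coefficient."""
    return {i for i, coeff in block.items() if coeff != 0}

def block_range(block):
    """Range of a block: the smallest interval of N containing its support."""
    s = supp(block)
    return range(min(s), max(s) + 1)

def is_block_sequence_prefix(blocks):
    """Check supp x_n < supp x_{n+1} for consecutive blocks of a finite prefix."""
    return all(max(supp(x)) < min(supp(y)) for x, y in zip(blocks, blocks[1:]))

# Hypothetical blocks, stored as {index: coefficient}; norms are ignored here.
x0 = {0: Fraction(1, 2), 3: Fraction(-1, 3), 4: Fraction(0)}
x1 = {5: Fraction(1), 9: Fraction(2, 7)}

print(sorted(supp(x0)))                    # [0, 3]
print(list(block_range(x0)))               # [0, 1, 2, 3]
print(is_block_sequence_prefix([x0, x1]))  # True
print(is_block_sequence_prefix([x1, x0]))  # False
```

Only finite prefixes can be checked this way; an actual block sequence is infinite.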
If $A$ is an infinite subset of $\mathbb{N}$, we denote by $[A]$ the space of infinite subsets of $A$ with the topology inherited from $2^A$. Also, if $a \subseteq \mathbb{N}$ is finite,

$$[a, A] = \big\{ B \in [\mathbb{N}] \;\big|\; a \subseteq B \subseteq a \cup (A \cap [\max a + 1, \infty[) \big\}.$$

We shall sometimes confuse infinite subsets of $\mathbb{N}$ with their increasing enumeration. So if $A \subseteq \mathbb{N}$ is infinite, we denote by $A_n$ the $(n+1)$st element of $A$ in its increasing enumeration (we start counting at 0).

A Banach space $X$ embeds into $Y$ if $X$ is isomorphic to a closed subspace of $Y$. Since we shall work with the embeddability relation as a mathematical object itself, we prefer to use the slightly non-standard notation $X \sqsubseteq Y$ to denote that $X$ embeds into $Y$.

Given two Banach spaces $X$ and $Y$, we say that $X$ is crudely finitely representable in $Y$ if there is a constant $K$ such that for any finite-dimensional subspace $F \subseteq X$ there is an embedding $T : F \to Y$ with constant $K$, i.e., $\|T\| \cdot \|T^{-1}\| \leq K$. Also, if $X = [x_n]$ and $Y = [y_n]$ are spaces with bases, we say that $X$ is crudely block finitely representable in $Y$ if for some constant $K$ and all $k$, there are (not necessarily normalised) blocks $z_0 < \cdots < z_k$ of $(y_n)$ such that $(x_0, \ldots, x_k) \sim_K (z_0, \ldots, z_k)$.

Two Banach spaces are said to be incomparable if neither one embeds into the other, and totally incomparable if no subspace of one is isomorphic to a subspace of the other.

We shall on several occasions use non-trivial facts about the Tsirelson space and its $p$-convexifications, for which our reference is [8], and also facts from descriptive set theory that can all be found in [24]. For classical facts in Banach space theory we refer to [27].

2.2. Gowers' block sequence game

A major ingredient in several of our proofs will be the following equivalent version of Gowers' game due to J. Bagaria and J. López-Abad [4].

Suppose $E = [e_n]$ is given. Player I and II alternate in choosing blocks $x_0 < x_1 < x_2 < \cdots$ and $y_0 < y_1 < y_2 < \cdots$ as follows: Player I plays in the $k$th round of the game a block $x_k$ such that $x_{k-1} < x_k$. In response to this, II either chooses to pass, and thus play nothing in the $k$th round, or plays a block $y_i \in [x_{l+1}, \ldots, x_k]$, where $l$ was the last round in which II played a block.

$$\begin{array}{lllll}
\mathrm{I} & x_0 \ \cdots \ x_{k_0} & & x_{k_0+1} \ \cdots \ x_{k_1} & \\
\mathrm{II} & & y_0 \in [x_0, \ldots, x_{k_0}] & & y_1 \in [x_{k_0+1}, \ldots, x_{k_1}]
\end{array}$$

We thus see I as constructing a block sequence $(x_i)$, while II chooses a block subsequence $(y_i)$. This block subsequence $(y_i)$ is then called the outcome of the game. (Potentially the blocking could be finite, but the winning condition can be made such that II loses unless it is infinite.)

We now have the following fundamental theorem of Gowers (though he only proves it for real scalars, it is clear that his proof is valid for the field $\mathbb{F}$ too).

Theorem 2.1. (See W.T. Gowers [20].) Suppose $(e_n)$ is a Schauder basis and $A \subseteq bb(e_i)$ is an analytic set such that any $(x_i) \in bb(e_i)$ has a block subsequence $(y_i)$ belonging to $A$. Then for all $\Delta > 0$, there is a block subsequence $(v_i) \in bb(e_i)$ such that II has a strategy to play in $A_\Delta$ if I is restricted to play blockings of $(v_i)$.
2.3. A trick and a lemma

We gather here a couple of facts that will be used repeatedly later on.

We shall on several occasions use coding with inevitable subsets of the unit sphere of a Banach space, as was first done by López-Abad in [28]. So let us recall here the relevant facts and set up a framework for such codings.

Suppose $E$ is an infinite-dimensional Banach space with a basis not containing a copy of $c_0$. Then by the solution to the distortion problem by Odell and Schlumprecht [31] there is a block subspace $[x_n]$ of $E$ and two closed subsets $F_0$ and $F_1$ of the unit sphere of $[x_n]$ such that $\operatorname{dist}(F_0, F_1) = \delta > 0$ and such that for all block bases $(y_n)$ of $(x_n)$ there are block vectors $v$ and $u$ of $(y_n)$ such that $v \in F_0$ and $u \in F_1$. In this case we say that $F_0$ and $F_1$ are positively separated, inevitable, closed subsets of $S_{[x_n]}$.

We can now use the sets $F_0$ and $F_1$ to code infinite binary sequences, i.e., elements of $2^{\mathbb{N}}$, in the following manner. If $(z_n)$ is a block sequence of $(x_n)$ such that for all $n$, $z_n \in F_0 \cup F_1$, we let $\varphi((z_n)) = \alpha \in 2^{\mathbb{N}}$ be defined by

$$\alpha_n = \begin{cases} 0, & \text{if } z_n \in F_0; \\ 1, & \text{if } z_n \in F_1. \end{cases}$$

Since the sets $F_0$ and $F_1$ are positively separated, this coding is fairly rigid and can be extended to block sequences $(v_n)$ such that $\operatorname{dist}(v_n, F_0 \cup F_1) < \frac{\delta}{2}$ by letting $\varphi((v_n)) = \beta \in 2^{\mathbb{N}}$ be defined by

$$\beta_n = \begin{cases} 0, & \text{if } \operatorname{dist}(v_n, F_0) < \frac{\delta}{2}; \\ 1, & \text{if } \operatorname{dist}(v_n, F_1) < \frac{\delta}{2}. \end{cases}$$

In this way we have that if $(z_n)$ and $(v_n)$ are block sequences with $z_n \in F_0 \cup F_1$ and $\|v_n - z_n\| < \frac{\delta}{2}$ for all $n$, then $\varphi((z_n)) = \varphi((v_n))$.

One can now use elements of Cantor space $2^{\mathbb{N}}$ to code other objects in various ways. For example, let $H$ denote the set of finite non-empty sequences $(q_0, q_1, \ldots, q_n)$ of rationals with $q_n \neq 0$. Then, as $H$ is countable, we can enumerate it as $h_0, h_1, \ldots$.
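The decoding rule for $\varphi$, and its insensitivity to perturbations smaller than $\delta/2$, can be illustrated in a toy setting. The sketch below is not the Banach space construction: it assumes, purely for illustration, that vectors are points of the Euclidean plane and that $F_0$ and $F_1$ are hypothetical finite sets; inevitability is not modelled at all, and the function names are ours.

```python
import math

def dist_to_set(v, F):
    """Distance from the point v to the finite set F (Euclidean toy model)."""
    return min(math.dist(v, f) for f in F)

def phi(vectors, F0, F1, delta):
    """Decode a finite sequence of vectors into bits with tolerance delta/2."""
    bits = []
    for v in vectors:
        if dist_to_set(v, F0) < delta / 2:
            bits.append(0)
        elif dist_to_set(v, F1) < delta / 2:
            bits.append(1)
        else:
            raise ValueError("vector is not within delta/2 of F0 or F1")
    return bits

# Hypothetical positively separated sets; delta is their distance.
F0 = [(1.0, 0.0)]
F1 = [(0.0, 1.0)]
delta = math.dist(F0[0], F1[0])

z = [(1.0, 0.0), (0.0, 1.0), (1.0, 0.0)]    # exact code words
v = [(0.9, 0.1), (0.1, 0.9), (1.0, -0.2)]   # each within delta/2 of z_n

print(phi(z, F0, F1, delta))  # [0, 1, 0]
print(phi(v, F0, F1, delta))  # [0, 1, 0]
```

Because the two sets are at distance `delta`, no point can be within `delta / 2` of both, which is why testing `F0` first does not introduce ambiguity.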
If now $(y_n)$ and $(v_n)$ are block sequences with $\varphi((v_n)) = 0^{n_0} 1 0^{n_1} 1 0^{n_2} 1 \cdots$, then $(v_n)$ codes an infinite sequence $\Psi((v_n), (y_n)) = (u_n)$ of finite linear combinations of $(y_n)$ by the following rule:

$$u_k = q_0 y_0 + q_1 y_1 + \cdots + q_m y_m,$$

where $h_{n_k} = (q_0, \ldots, q_m)$.

We should then notice three things about this type of coding:

– It is inevitable, i.e., for all block sequences $(y_n)$ of $(x_n)$ and $\alpha \in 2^{\mathbb{N}}$, there is a block sequence $(v_n)$ of $(y_n)$ with $\varphi((v_n)) = \alpha$.
– It is continuous, i.e., to know an initial segment of $(u_n) = \Psi((v_n), (y_n))$, we only need to know initial segments of $(v_n)$ and of $(y_n)$.
– It is stable under small perturbations, i.e., given $\epsilon > 0$, we can find some $\Delta = (\delta_n)$ only depending on $\epsilon$ and the basis constant of $(x_n)$ with the following property. Assume that $(v_n)$ and $(y_n)$ are block bases of $(x_n)$ with $v_n \in F_0 \cup F_1$ for all $n$ and such that $\Psi((v_n), (y_n)) = (u_n)$ is a block sequence of $(y_n)$ with $\frac{1}{2} < \|u_n\| < 2$. Then whenever $(v'_n)$ and $(y'_n)$ are other block sequences of $(x_n)$ with $\|v_n - v'_n\| < \frac{\delta}{2}$ and $\|y_n - y'_n\| < \delta_n$ for all $n$, the sequence $\Psi((v'_n), (y'_n)) = (u'_n)$ will be a block sequence of $(y'_n)$ that is $(1+\epsilon)$-equivalent to $(u_n)$.

One can of course consider codings of other objects than sequences of vectors and, depending on the coding, obtain similar continuity and stability properties.

The inevitability of the coding is often best used in the following form.

– Suppose $B$ is a set of pairs $((y_n), \alpha)$, where $(y_n)$ is a block sequence of $(x_n)$ and $\alpha \in 2^{\mathbb{N}}$, such that for all block sequences $(z_n)$ of $(x_n)$ there is a further block sequence $(y_n)$ and an $\alpha$ such that $((y_n), \alpha) \in B$. Then for all block sequences $(z_n)$ of $(x_n)$ there is a further block sequence $(y_n)$ such that for all $n$, $y_{2n+1} \in F_0 \cup F_1$ and $((y_{2n}), \varphi((y_{2n+1}))) \in B$.

To see this, let $(z_n)$ be given and notice that by the inevitability of the coding there is a block sequence $(w_n)$ of $(z_n)$ such that $w_{3n+1} \in F_0$ and $w_{3n+2} \in F_1$. Pick now a block sequence $(v_n)$ of $(w_{3n})$ and an $\alpha$ such that $((v_n), \alpha) \in B$. Notice now that between $v_n$ and $v_{n+1}$ there are block vectors $w_{3i_n+1}$ and $w_{3i_n+2}$ of $(z_n)$ belonging to $F_0$, respectively $F_1$. Thus, if we let $y_{2n} = v_n$ and set

$$y_{2n+1} = \begin{cases} w_{3i_n+1}, & \text{if } \alpha_n = 0; \\ w_{3i_n+2}, & \text{if } \alpha_n = 1, \end{cases}$$

then $((y_{2n}), \varphi((y_{2n+1}))) \in B$.

Lemma 2.2. Let $(x^0_n) \geq (x^1_n) \geq (x^2_n) \geq \cdots$ be a decreasing sequence of block bases of a basic sequence $(x^0_n)$. Then there exists a block basis $(y_n)$ of $(x^0_n)$ such that $(y_n)$ is $\sqrt{K}$-equivalent with a block basis of $(x^K_n)$ for every $K \geq 1$.

Proof. Let $c(L)$ be a constant depending on the basis constant of $(x^0_n)$ such that if two block bases differ in at most $L$ terms, then they are $c(L)$-equivalent. Find now a sequence $L_1 \leq L_2 \leq \cdots$ of non-negative integers tending to $+\infty$ such that $c(L_K) \leq \sqrt{K}$. We can now easily construct an infinite block basis $(y_n)$ of $(x^0_n)$ such that for all $K \geq 1$ at most the first $L_K$ terms of $(y_n)$ are not blocks of $(x^K_n)_{n = L_K + 1}^{\infty}$.
Then $(y_n)$ differs from a block basis of $(x^K_n)$ in at most $L_K$ terms and hence is $\sqrt{K}$-equivalent with a block basis of $(x^K_n)$. □

3. Tightness

3.1. Tight bases

The following definition is central to the rest of the paper.

Definition 3.1. Consider a Banach space $E$ with a basis $(e_n)$ and let $Y$ be an arbitrary Banach space. We say that $Y$ is tight in the basis $(e_n)$ if there is a sequence of successive non-empty intervals $I_0 < I_1 < I_2 < \cdots$ of $\mathbb{N}$ such that for all infinite subsets $A \subseteq \mathbb{N}$, we have

$$Y \not\sqsubseteq \Big[ e_n \;\Big|\; n \notin \bigcup_{i \in A} I_i \Big].$$
In other words, if $Y$ embeds into $[e_n]_{n \in B}$, then $B \subseteq \mathbb{N}$ intersects all but finitely many intervals $I_i$.

We say that $(e_n)$ is tight if every infinite-dimensional Banach space $Y$ is tight in $(e_n)$. Finally, an infinite-dimensional Banach space $X$ is tight if it has a tight basis.

Also, the following more analytical criterion will prove to be useful. For simplicity, denote by $P_I$ the canonical projection onto $[e_n]_{n \in I}$.

Lemma 3.2. Let $X$ be a Banach space, $(e_n)$ a basis for a space $E$, and $(I_n)$ finite intervals such that $\min I_n \to \infty$ as $n \to \infty$ and for all infinite $A \subseteq \mathbb{N}$,

$$X \not\sqsubseteq [e_n]_{n \notin \bigcup_{k \in A} I_k}.$$

Then whenever $T : X \to [e_n]$ is an embedding, we have $\liminf_k \|P_{I_k} T\| > 0$.

Proof. Suppose towards a contradiction that $T : X \to E$ is an embedding such that for some infinite $A \subseteq \mathbb{N}$,

$$\lim_{\substack{k \to \infty \\ k \in A}} \|P_{I_k} T\| = 0.$$
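Then, by passing to an infinite subset of $A$, we can suppose that $\sum_{k \in A} \|P_{I_k} T\| < \frac{1}{2} \|T^{-1}\|^{-1}$ and that the intervals $(I_n)_{n \in A}$ are disjoint. Thus, the sequence of operators $(P_{I_k} T)_{k \in A}$ is absolutely summable and therefore the operator $\sum_{k \in A} P_{I_k} T : X \to E$ exists and has norm $< \frac{1}{2} \|T^{-1}\|^{-1}$. But then for $x \in X$ we have

$$\Big\| \sum_{k \in A} P_{I_k} T x \Big\| \leq \Big\| \sum_{k \in A} P_{I_k} T \Big\| \cdot \|x\| \leq \frac{1}{2\|T^{-1}\|} \|x\| \leq \frac{1}{2\|T^{-1}\|} \|T^{-1}\| \cdot \|Tx\| = \frac{1}{2} \|Tx\|,$$

and hence also

$$\Big\| \Big( T - \sum_{k \in A} P_{I_k} T \Big) x \Big\| \geq \|Tx\| - \Big\| \sum_{k \in A} P_{I_k} T x \Big\| \geq \|Tx\| - \frac{1}{2} \|Tx\| = \frac{1}{2} \|Tx\|.$$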
So $T - \sum_{k \in A} P_{I_k} T$ is still an embedding of $X$ into $E$. But this is impossible as $T - \sum_{k \in A} P_{I_k} T$ takes values in $[e_n]_{n \notin \bigcup_{k \in A} I_k}$. □

Proposition 3.3. A tight Banach space contains no minimal subspaces.

Proof. Suppose $(e_n)$ is a tight basis for a space $E$ and let $Y$ be any subspace of $E$. Pick a block subspace $X = [x_n]$ of $E$ that embeds into $Y$. Since $Y$ is tight in $(e_n)$, we can find a sequence of intervals $(I_i)$ such that $Y$ does not embed into $[e_n]_{n \in B}$ whenever $B \subseteq \mathbb{N}$ is disjoint from an infinite number of intervals $I_i$. By passing to a subsequence $(z_n)$ of $(x_n)$, we obtain a space $Z = [z_n]$ that is a subspace of some $[e_n]_{n \in B}$ where $B \subseteq \mathbb{N}$ is disjoint from an infinite number of intervals $I_i$, and hence $Y$ does not embed into $Z$. Since $Z$ embeds into $Y$, this shows that $Y$ is not minimal. □
The classical example of a space without minimal subspaces is Tsirelson's space $T$ and it is not too difficult to show that $T$ is tight. This will be proved later on as a consequence of a more general result. Any block sequence of a tight basis is easily seen to be tight. And also:

Proposition 3.4. If $E$ is a tight Banach space, then every shrinking basic sequence in $E$ is tight.

Proof. Suppose $(e_n)$ is a tight basis for $E$ and $(f_n)$ is a shrinking basic sequence in $E$. Let $Y$ be an arbitrary space and find intervals $I_0 < I_1 < \cdots$ associated to $Y$ for $(e_n)$, i.e., for all infinite subsets $A \subseteq \mathbb{N}$, we have $Y \not\sqsubseteq [e_n \mid n \notin \bigcup_{i \in A} I_i]$. We notice that, since $(e_n)$ is a basis, we have for all $m$

$$\big\| P_{I_k}|_{[f_i \mid i \leq m]} \big\| \xrightarrow[k \to \infty]{} 0, \tag{1}$$

and, since $(f_n)$ is shrinking and the $P_{I_k}$ have finite rank, we have for all $k$

$$\big\| P_{I_k}|_{[f_i \mid i > m]} \big\| \xrightarrow[m \to \infty]{} 0. \tag{2}$$
Using alternately (1) and (2), we can construct integers $k_0 < k_1 < \cdots$ and intervals $J_0 < J_1 < \cdots$ such that

$$\big\| P_{I_{k_n}}|_{[f_i \mid i \notin J_n]} \big\| < \frac{2}{n+1}.$$

To see this, suppose $k_{n-1}$ and $J_{n-1}$ have been defined and find some large $k_n > k_{n-1}$ such that

$$\big\| P_{I_{k_n}}|_{[f_i \mid i \leq \max J_{n-1}]} \big\| \leq \frac{1}{n+1}.$$

Now, choose $m$ large enough that

$$\big\| P_{I_{k_n}}|_{[f_i \mid i > m]} \big\| \leq \frac{1}{n+1},$$

and set $J_n = [\max J_{n-1} + 1, m]$. Then $\| P_{I_{k_n}}|_{[f_i \mid i \notin J_n]} \| < \frac{2}{n+1}$.

It follows that if $A \subseteq \mathbb{N}$ is infinite and $T : Y \to [f_i]_{i \notin \bigcup_{n \in A} J_n}$ is an embedding, then $\lim_{n \in A} \|P_{I_{k_n}} T\| = 0$, which contradicts Lemma 3.2. So $(J_n)$ witnesses that $Y$ is tight in $(f_n)$. □
Corollary 3.5. If a tight Banach space $X$ is reflexive, then every basic sequence in $X$ is tight.

Notice that, since $c_0$ and $\ell_1$ are minimal, we have by the classical theorem of James, that if $X$ is a tight Banach space with an unconditional basis, then $X$ is reflexive and so every basic sequence in $X$ is tight.

Example 3.6. The symmetrisation $S(T^{(p)})$ of the $p$-convexification $T^{(p)}$ of Tsirelson's space, $1 < p < +\infty$, does not contain a minimal subspace, yet it is not tight.
Proof. Since $S(T^{(p)})$ is saturated with isomorphic copies of subspaces of $T^{(p)}$ and $T^{(p)}$ does not contain a minimal subspace, it follows that $S(T^{(p)})$ does not have a minimal subspace. The canonical basis $(e_n)$ of $S(T^{(p)})$ is symmetric, therefore $S(T^{(p)})$ is not tight in $(e_n)$ and so $(e_n)$ is not tight. By reflexivity, no basis of $S(T^{(p)})$ is tight. □

3.2. A generalised asymptotic game

Suppose $X = [x_n]$ and $Y = [y_n]$ are two Banach spaces with bases. We define the game $H_{Y,X}$ with constant $C \geq 1$ between two players I and II as follows: I will in each turn play a natural number $n_i$, while II will play a not necessarily normalised block vector $u_i \in X$ and a natural number $m_i$ such that

$$u_i \in X[n_0, m_0] + \cdots + X[n_i, m_i],$$

where, for ease of notation, we write $X[k, m]$ to denote $[x_n]_{k \leq n \leq m}$. Diagrammatically,

$$\begin{array}{lcccccccc}
\mathrm{I} & n_0 & & n_1 & & n_2 & & n_3 & \cdots \\
\mathrm{II} & & u_0, m_0 & & u_1, m_1 & & u_2, m_2 & & u_3, m_3 \ \cdots
\end{array}$$

We say that the sequence $(u_i)_{i \in \mathbb{N}}$ is the outcome of the game and say that II wins the game if $(u_i) \sim_C (y_i)$.

For simplicity of notation, if $X = [x_n]$ is a space with a basis, $Y$ a Banach space, $I_0 < I_1 < I_2 < \cdots$ a sequence of non-empty intervals of $\mathbb{N}$ and $K$ is a constant, we write

$$Y \sqsubseteq_K (X, I_i)$$

if there is an infinite set $A \subseteq \mathbb{N}$ containing 0 such that

$$Y \sqsubseteq_K \Big[ x_n \;\Big|\; n \notin \bigcup_{i \in A} I_i \Big],$$

i.e., $Y$ embeds with constant $K$ into the subspace of $X$ spanned by $(x_n)_{n \notin \bigcup_{i \in A} I_i}$. Also, write $Y \sqsubseteq (X, I_i)$ if there is an infinite set $A \subseteq \mathbb{N}$ such that $Y \sqsubseteq [x_n \mid n \notin \bigcup_{i \in A} I_i]$. Notice that in the latter case we can always demand that $0 \in A$ by perturbing the embedding with a finite rank operator.

It is clear that if $Y = [y_n]$ and II has a winning strategy in the game $H_{Y,X}$ with constant $K$, then for any sequence of intervals $(I_i)$, $Y \sqsubseteq_K (X, I_i)$. Modulo the determinacy of open games, the next lemma shows that the converse holds up to a perturbation.

Lemma 3.7. Suppose $X = [x_n]$ is a space with a basis and $K, \epsilon$ are positive constants such that for all block bases $Y$ of $X$ there is a winning strategy for I in the game $H_{Y,X}$ with constant $K + \epsilon$. Then there is a Borel function $f : bb(X) \to [\mathbb{N}]$ such that for all $Y$ if $I_j = [f(Y)_{2j}, f(Y)_{2j+1}]$, then $Y \not\sqsubseteq_K (X, I_j)$.
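The bookkeeping in the game $H_{Y,X}$ and the passage from an enumerated set $f(Y)$ to the intervals $I_j$ can be sketched as follows. This is a combinatorial illustration only, under the assumption that each move $u_i$ of II is represented by its support with respect to $(x_n)$; the winning condition $(u_i) \sim_C (y_i)$ is not modelled, and the function names are hypothetical.

```python
def legal_position(moves):
    """Check the support constraint of H_{Y,X} on a finite position.

    moves is a list of triples (n_i, supp_u_i, m_i): I plays n_i, then II
    plays u_i (given by its support) and m_i. The rule
    u_i in X[n_0, m_0] + ... + X[n_i, m_i] becomes a containment of supports.
    """
    allowed = set()
    for n, u_support, m in moves:
        allowed |= set(range(n, m + 1))   # indices of X[n_i, m_i]
        if not set(u_support) <= allowed:
            return False
    return True

def intervals_from_enumeration(values):
    """Pair consecutive elements: I_j = [values[2j], values[2j+1]]."""
    return [(values[2 * j], values[2 * j + 1]) for j in range(len(values) // 2)]

print(legal_position([(2, {2, 3}, 4), (7, {3, 8}, 9)]))  # True
print(legal_position([(2, {2}, 4), (7, {5}, 9)]))        # False: 5 is not allowed
print(intervals_from_enumeration([1, 3, 4, 8, 10, 11]))  # [(1, 3), (4, 8), (10, 11)]
```

Note that II may reuse earlier windows: in the first example the second move has support meeting both $[2, 4]$ and $[7, 9]$.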
Proof. Notice that the game $H_{Y,X}$ is open for player I and, in fact, if $D_{K+\epsilon}$ denotes the set of blocks $u$ with $\frac{1}{K+\epsilon} \leq \|u\| \leq K + \epsilon$, then the set

$$A = \big\{ (Y, p) \in bb(X) \times (\mathbb{N} \times D_{K+\epsilon} \times \mathbb{N})^{\mathbb{N}} \;\big|\; \text{either } p \text{ is a legal run of the game } H_{Y,X} \text{ with constant } K + \epsilon \text{ in which I wins or } p \text{ is not a legal run of the game } H_{Y,X} \big\}$$

is Borel and has open sections $A_Y = \{ p \in (\mathbb{N} \times D_{K+\epsilon} \times \mathbb{N})^{\mathbb{N}} \mid (Y, p) \in A \}$. Also, since there are no rules for the play of I in $H_{Y,X}$, $A_Y$ really corresponds to the winning plays for I in $H_{Y,X}$ with constant $K + \epsilon$. By assumption, I has a winning strategy to play in $A_Y$ for all $Y$, and so by the theorem on strategic uniformisation (see (35.32) in [24]), there is a Borel function $\sigma : Y \mapsto \sigma_Y$ that to each $Y$ associates a winning strategy for I in the game $H_{Y,X}$ with constant $K + \epsilon$.

Now let $\Delta = (\delta_n)$ be a sequence of positive reals such that for all $2KC$-basic sequences of blocks $(w_n)$ of $X$ with $\frac{1}{K} \leq \|w_n\| \leq K$ (where $C$ is the basis constant of $X$) and sequences of vectors $(u_n)$, if for all $n$, $\|w_n - u_n\| < \delta_n$, then $(w_n) \sim_{\sqrt{1 + \epsilon/K}} (u_n)$. We also choose sets $D_n$ of finite (not necessarily normalised) blocks with the following properties:

– for each finite $d \subseteq \mathbb{N}$, the number of vectors $u \in D_n$ such that $\operatorname{supp} u = d$ is finite,
– for all block vectors $w$ with $\frac{1}{K} \leq \|w\| \leq K$, there is some $u \in D_n$ with $\operatorname{supp} w = \operatorname{supp} u$ such that $\|w - u\| < \delta_n$.

This is possible since the $K$-ball in $[x_i]_{i \in d}$ is totally bounded for all finite $d \subseteq \mathbb{N}$. So for all $2KC$-basic sequences $(w_n)$ of blocks with $\frac{1}{K} \leq \|w_n\| \leq K$, there is some $(u_n) \in \prod_n D_n$ such that $\operatorname{supp} w_n = \operatorname{supp} u_n$ and $\|w_n - u_n\| < \delta_n$ for all $n$, whence $(w_n) \sim_{\sqrt{1 + \epsilon/K}} (u_n)$.

Suppose now that $Y = [y_n]$ is given. For each $p = (n_0, u_0, m_0, \ldots, n_i, u_i, m_i)$, where $u_j \in D_j$ for all $j$ and

$$\begin{array}{lccccccc}
\mathrm{I} & n_0 & & n_1 & & \cdots & n_i & \\
\mathrm{II} & & u_0, m_0 & & u_1, m_1 & \cdots & & u_i, m_i
\end{array}$$

is a legal position in the game $H_{Y,X}$ in which I has played according to $\sigma_Y$, we write $p < k$ if $n_j, u_j, m_j < k$ for all $j \leq i$. Notice that for all $k$ there are only finitely many such $p$ with $p < k$, so we can define
$$\alpha(k) = \max\big( k, \max\{ \sigma_Y(p) \mid p < k \} \big)$$

and set $I_k = [k, \alpha(k)]$. Clearly, the sequence $(I_k)$ can be computed in a Borel fashion from $Y$. The $I_k$ are not necessarily successive, but their minimal elements tend to $\infty$, so to prove the lemma it is enough to show that $Y$ does not $K$-embed into $[x_n]$ avoiding an infinite number of $I_k$ including $I_0$.

Suppose now for a contradiction that $A \subseteq \mathbb{N}$ is infinite, $0 \in A$ and $y_i \mapsto w_i$ is a $K$-embedding into $[x_n \mid n \notin \bigcup_{k \in A} I_k]$. By perturbing the embedding slightly, we can suppose that the $w_i$ are blocks such that $\frac{1}{K} \leq \|w_i\| \leq K$ and we still have a $K\sqrt{1 + \epsilon/K}$-embedding. Using the defining properties of $D_i$, we find $u_i \in D_i$ such that $\|w_i - u_i\| < \delta_i$ and $\operatorname{supp} w_i = \operatorname{supp} u_i$ for all $i$, whereby $(u_i) \sim_{\sqrt{1 + \epsilon/K}} (w_i) \sim_{K\sqrt{1 + \epsilon/K}} (y_i)$, and therefore $(u_i) \sim_{K + \epsilon} (y_i)$.

We now proceed to define natural numbers $n_i$, $m_i$, and $a_i \in A$ such that for $p_i = (n_0, u_0, m_0, \ldots, n_i, u_i, m_i)$, we have
(i) $a_0 = 0$ and $[0, n_0[ \; \subseteq I_{a_0}$,
(ii) $m_i = a_{i+1} - 1$,
(iii) $p_i$ is a legal position in $H_{Y,X}$ in which I has played according to $\sigma_Y$,
(iv) $]m_i, n_{i+1}[ \; \subseteq I_{a_{i+1}}$.
Let $a_0 = 0$ and $n_0 = \sigma_Y(\emptyset) = \alpha(0)$, whence $I_{a_0} = [0, \alpha(0)] = [0, n_0]$. Find $a_1$ such that $n_0, u_0, a_0 < a_1$ and set $m_0 = a_1 - 1$. Then $p_0 = (n_0, u_0, m_0)$ is a legal position in $H_{Y,X}$ in which I has played according to $\sigma_Y$, $p_0 < a_1$, so $n_1 = \sigma_Y(n_0, u_0, m_0) \leq \alpha(a_1)$, and therefore $]m_0, n_1[ \; \subseteq I_{a_1} = [a_1, \alpha(a_1)]$.

Now suppose by induction that $n_0, \ldots, n_i$ and $a_0, \ldots, a_i$ have been defined. Since $[0, n_0[ \; \subseteq I_{a_0}$ and $]m_j, n_{j+1}[ \; \subseteq I_{a_{j+1}}$ for all $j < i$, we have

$$u_i \in X[n_0, m_0] + \cdots + X[n_{i-1}, m_{i-1}] + X[n_i, \infty[.$$

Find some $a_{i+1}$ greater than all of $n_0, \ldots, n_i$, $u_0, \ldots, u_i$, $a_0, \ldots, a_i$ and let $m_i = a_{i+1} - 1$. Then

$$u_i \in X[n_0, m_0] + \cdots + X[n_{i-1}, m_{i-1}] + X[n_i, m_i]$$

and $p_i = (n_0, u_0, m_0, \ldots, n_i, u_i, m_i)$ is a legal position played according to $\sigma_Y$. Since $p_i < a_{i+1}$ also

$$n_{i+1} = \sigma_Y(n_0, u_0, m_0, \ldots, n_i, u_i, m_i) \leq \alpha(a_{i+1}).$$

Thus $]m_i, n_{i+1}[ \; \subseteq I_{a_{i+1}} = [a_{i+1}, \alpha(a_{i+1})]$.

Now since $p_0 \subseteq p_1 \subseteq p_2 \subseteq \cdots$, we can let $p = \bigcup_i p_i$ and see that $p$ is a run of the game in which I followed the strategy $\sigma_Y$ and II has played $(u_i)$. Since $\sigma_Y$ is winning for I, this implies that $(u_i) \not\sim_{K + \epsilon} (y_i)$, contradicting our assumption. □

Lemma 3.8. Suppose $X = [x_n]$ is a space with a basis and $Y$ is a space such that for all constants $K$ there are intervals $I_0^{(K)} < I_1^{(K)} < I_2^{(K)} < \cdots$ such that $Y \not\sqsubseteq_K (X, I_j^{(K)})$. Then there are intervals $J_0 < J_1 < J_2 < \cdots$ such that $Y \not\sqsubseteq (X, J_j)$. Moreover, the intervals $(J_j)$ can be computed in a Borel manner from $(I_i^{(K)})_{i,K}$.

Proof. By induction we can construct intervals $J_0 < J_1 < J_2 < \cdots$ such that $J_n$ contains one interval from each of $(I_i^{(1)}), \ldots, (I_i^{(n)})$ and if $M = \min J_n - 1$ and $K = n \cdot c(M)$, then $\max J_n > \max I_0^{(K)} + M$, where $c(M)$ is a constant such that if two subsequences of $(x_n)$ differ in at most $M$ terms then they are $c(M)$-equivalent. It then follows that if $A \subseteq \mathbb{N}$ is infinite, then

$$Y \not\sqsubseteq [x_n]_{n \notin \bigcup_{i \in A} J_i}.$$
To see this, suppose towards a contradiction that $A \subseteq \mathbb{N}$ is infinite and that for some integer $N$,

$$Y \sqsubseteq_N [x_n]_{n \notin \bigcup_{i \in A} J_i}.$$

Pick then $a \in A$ such that $a \geq N$ and set $M = \min J_a - 1$ and $K = a \cdot c(M)$. Define an isomorphic embedding $T$ from

$$\Big[ x_n \;\Big|\; n \notin \bigcup_{i \in A} J_i \Big]$$

into

$$\Big[ x_n \;\Big|\; n \notin \bigcup_{i \in A} J_i \ \&\ n > \max J_a \Big] + \Big[ x_n \;\Big|\; \max I_0^{(K)} < n \leq \max J_a \Big]$$

by setting

$$T(x_n) = \begin{cases} x_n, & \text{if } n > \max J_a; \\ x_{\max I_0^{(K)} + n + 1}, & \text{if } n \leq M. \end{cases}$$

This is possible since $\max J_a > \max I_0^{(K)} + M$. Also, since $T$ only changes at most $M$ vectors from $(x_n)$, it is a $c(M)$-embedding. Therefore, by composing with $T$ and using that $N \cdot c(M) \leq a \cdot c(M) \leq K$, we see that

$$Y \sqsubseteq_K \Big[ x_n \;\Big|\; n \notin \bigcup_{i \in A} J_i \ \&\ n > \max J_a \Big] + \Big[ x_n \;\Big|\; \max I_0^{(K)} < n \leq \max J_a \Big].$$

In particular, as almost all $J_i$ contain an interval $I_l^{(K)}$, we can find an infinite set $B \subseteq \mathbb{N}$ containing 0 such that

$$Y \sqsubseteq_K \Big[ x_n \;\Big|\; n \notin \bigcup_{i \in B} I_i^{(K)} \Big],$$

which is a contradiction. □
Lemma 3.9. Let $E = [e_n]$ be given and suppose that for all block subspaces $Z \leq E$ and constants $C$ there is a block subspace $X \leq Z$ such that for all block subspaces $Y \leq X$, I has a winning strategy in the game $H_{Y,X}$ with constant $C$. Then there is a block subspace $X \leq E$ and a Borel function $f : bb(X) \to [\mathbb{N}]$ such that for all normalised block bases $Y \leq X$, if we set $I_j = [f(Y)_{2j}, f(Y)_{2j+1}]$, then $Y \not\sqsubseteq (X, I_j)$.

Proof. Using the hypothesis inductively together with Lemma 3.7, we can construct a sequence $X_0 \geq X_1 \geq X_2 \geq \cdots$ of block subspaces $X_K$ and corresponding Borel functions $f_K : bb(X_K) \to [\mathbb{N}]$ such that for all $V \leq X_K$ if $I_j = [f_K(V)_{2j}, f_K(V)_{2j+1}]$, then $V \not\sqsubseteq_{K^2} (X_K, I_j)$. Pick by Lemma 2.2 some block $X_\infty$ of $X_0$ that is $\sqrt{K}$-equivalent with a block sequence $Z_K$ of $X_K$ for every $K \geq 1$. Then for any block sequence $Y$ of $X_\infty$ and any $K \geq 1$ there is some block sequence $V \leq Z_K \leq X_K$ such that $Y$ is $\sqrt{K}$-equivalent with $V$. Let $(I_j)$ be the intervals given by $f_K(V)$ so that $V \not\sqsubseteq_{K^2} (X_K, I_j)$. We can then in a Borel way from $(I_j)$ construct intervals $(J_j)$ such that $V \not\sqsubseteq_{K^2} (Z_K, J_j)$ and therefore also $Y \not\sqsubseteq_K (X_\infty, J_j)$. This means that there are Borel functions $g_K : bb(X_\infty) \to [\mathbb{N}]$ such that for all $Y \leq X_\infty$ if $J_j^K(Y) = [g_K(Y)_{2j}, g_K(Y)_{2j+1}]$, then $Y \not\sqsubseteq_K (X_\infty, J_j^K(Y))$. Using Lemma 3.8 we can now in a Borel manner in $Y$ define intervals $L_0^Y < L_1^Y < \cdots$ such that

$$Y \not\sqsubseteq (X_\infty, L_j^Y).$$

Letting $f : bb(X_\infty) \to [\mathbb{N}]$ be the Borel function corresponding to $Y \mapsto (L_j^Y)$, we have our result. □

As will be clear in Section 7 it can be useful to have a version of tightness that not only assures us that certain intervals exist, but also tells us how to obtain these. Thus, we call a basis $(e_n)$ continuously tight if there is a continuous function $f : bb(e_n) \to [\mathbb{N}]$ such that for all normalised block bases $X$, if we set $I_j = [f(X)_{2j}, f(X)_{2j+1}]$, then

$$X \not\sqsubseteq ([e_n], I_j),$$

i.e., $X$ does not embed into $[e_n]$ avoiding an infinite number of the intervals $I_j$.

We shall now improve Lemma 3.9 to conclude continuous tightness from its hypothesis.

Lemma 3.10. Let $E = [e_n]$ be given and suppose that for all block subspaces $Z \leq E$ and constants $C$ there is a block subspace $X \leq Z$ such that for all block subspaces $Y \leq X$, I has a winning strategy in the game $H_{Y,X}$ with constant $C$. Then there is a continuously tight block subspace $X \leq E$.

Proof. We observe that $E$ does not contain a copy of $c_0$. Indeed if $Z$ is a block subspace of $E$ spanned by a block sequence which is $C$-equivalent to the unit vector basis of $c_0$, then for any $Y \leq X \leq Z$, II has a winning strategy in the game $H_{Y,X}$ with constant $C^2$. We shall then use codings with inevitable subsets. So find first a block subspace $Z$ of $E$ such that there are inevitable, positively separated, closed subsets $F_0$ and $F_1$ of $S_Z$. By Lemma 3.9, we can find a further block subspace $V$ of $Z$ and a Borel function $g : bb(V) \to [\mathbb{N}]$ such that for all $Y \leq V$, if $I_j = [g(Y)_{2j}, g(Y)_{2j+1}]$, then $Y \not\sqsubseteq (V, I_j)$. Define the set
$$A = \big\{ (y_n) \in bb(V) \;\big|\; y_{2n} \in F_0 \Leftrightarrow n \notin g((y_{2n+1})) \ \text{and} \ y_{2n} \in F_1 \Leftrightarrow n \in g((y_{2n+1})) \big\}.$$

Obviously, $A$ is Borel and, using inevitability, one can check that any block basis of $V$ contains a further block basis in $A$. Thus, by Gowers' Determinacy Theorem, we have that for all $\Delta > 0$ there is a block sequence $X$ of $V$ such that II has a strategy to play into $A_\Delta$ when I plays block subspaces of $X$. Choosing $\Delta > 0$ sufficiently small, this easily implies that for some block basis $X$ of $E$, there is a continuous function $h : bb(X) \to bb(X) \times [\mathbb{N}]$ that to each $W \leq X$ associates a pair $(Y, (I_n))$ consisting of a block sequence $Y$ of $W$ and a sequence of intervals $(I_n)$ such that $Y \not\sqsubseteq (V, I_j)$. Notice now that continuously in the sequence $(I_j)$, we can construct intervals $(J_j)$ such that $Y \not\sqsubseteq (X, J_j)$ and hence also $W \not\sqsubseteq (X, J_j)$. So the continuous function $f : bb(X) \to [\mathbb{N}]$ corresponding to $W \mapsto (J_j)$ witnesses the continuous tightness of $X$. □

We shall need the following consequence of continuous tightness in Section 7.
Lemma 3.11. Suppose $(e_n)$ is continuously tight. Then there is a continuous function $f : [\mathbb{N}] \to [\mathbb{N}]$ such that for all $A, B \in [\mathbb{N}]$, if $B$ is disjoint from an infinite number of intervals $[f(A)_{2i}, f(A)_{2i+1}]$, then $[e_n]_{n \in A}$ does not embed into $[e_n]_{n \in B}$.

Proof. It is enough to notice that the function $h : [\mathbb{N}] \to bb(e_n)$ given by $h(A) = (e_n)_{n \in A}$ is continuous. So when composed with the function witnessing continuous tightness we have the result. □

3.3. A game for minimality

For $L$ and $M$ two block subspaces of $E$, define the infinite game $G_{L,M}$ with constant $C \geq 1$ between two players as follows. In each round I chooses a subspace $E_i \subseteq L$ spanned by a finite block sequence of $L$, a normalised block vector $u_i \in E_0 + \cdots + E_i$, and an integer $m_i$. In the first round II plays an integer $n_0$, and in all subsequent rounds II plays a subspace $F_i$ spanned by a finite block sequence of $M$, a (not necessarily normalised) block vector $v_i \in F_0 + \cdots + F_i$ and an integer $n_{i+1}$. Moreover, we demand that $n_i \leq E_i$ and $m_i \leq F_i$. Diagrammatically,

$$\begin{array}{lcccccc}
\mathrm{I} & & \begin{array}{c} n_0 \leq E_0 \subseteq L \\ u_0 \in E_0,\ m_0 \end{array} & & \begin{array}{c} n_1 \leq E_1 \subseteq L \\ u_1 \in E_0 + E_1,\ m_1 \end{array} & & \cdots \\[2ex]
\mathrm{II} & n_0 & & \begin{array}{c} m_0 \leq F_0 \subseteq M \\ v_0 \in F_0,\ n_1 \end{array} & & \begin{array}{c} m_1 \leq F_1 \subseteq M \\ v_1 \in F_0 + F_1,\ n_2 \end{array} & \cdots
\end{array}$$
The outcome of the game is the pair of infinite sequences $(u_i)$ and $(v_i)$ and we say that II wins the game if $(u_i) \sim_C (v_i)$.

Lemma 3.12. Suppose that $X$ and $Y$ are block subspaces of $E$ and that player II has a winning strategy in the game $H_{Y,X}$ with constant $C$. Then II has a winning strategy in the game $G_{Y,X}$ with constant $C$.

Proof. We shall in fact prove that II has a winning strategy in a game that is obviously harder for her to win. Namely, we shall suppose that II always plays $n_i = 0$, which obviously puts fewer restrictions on the play of I. Moreover, we do not require I to play the finite-dimensional spaces $E_i$, which therefore also puts fewer restrictions on I in subsequent rounds. Therefore, we shall suppress all mention of $E_i$ and $n_i$ and only require that the $u_i$ are block vectors in $Y$.

While playing the game $G_{Y,X}$, II will keep track of an auxiliary play of the game $H_{Y,X}$ in the following way. In the game $G_{Y,X}$ we have the following play

$$\begin{array}{lccccc}
\mathrm{I} & u_0 \in Y,\ m_0 & & u_1 \in Y,\ m_1 & & \cdots \\[1ex]
\mathrm{II} & & \begin{array}{c} m_0 \leq F_0 \subseteq X \\ v_0 \in F_0 \end{array} & & \begin{array}{c} m_1 \leq F_1 \subseteq X \\ v_1 \in F_0 + F_1 \end{array} & \cdots
\end{array}$$

We write each vector $u_i = \sum_{j=0}^{k_i} \lambda^i_j y_j$ and may for simplicity of notation assume that $k_i < k_{i+1}$. The auxiliary run of $H_{Y,X}$ that II will keep track of is as follows, where II plays according to her winning strategy for $H_{Y,X}$.
\[
\begin{array}{l|ccccccc}
\mathrm{I} & m_0 & \cdots & m_0 & m_1 & \cdots & m_1 & \cdots \\ \hline
\mathrm{II} & w_0, p_0 & \cdots & w_{k_0}, p_{k_0} & w_{k_0+1}, p_{k_0+1} & \cdots & w_{k_1}, p_{k_1} & \cdots
\end{array}
\]

To compute the $v_i$ and $F_i$ in the game $G_{Y,X}$, II will refer to the play of $H_{Y,X}$ and set

\[
v_i = \sum_{j=0}^{k_i} \lambda^i_j w_j,
\]

and let $F_i = X[m_i, \max\{p_{k_{i-1}+1}, \ldots, p_{k_i}\}]$. It is not difficult to see that $m_i \le F_i \subseteq X$, $v_i \in F_0 + \cdots + F_i$, and that the $F_i$ and $v_i$ only depend on $u_0, \ldots, u_i$ and $m_0, \ldots, m_i$. Thus this describes a strategy for II in $G_{Y,X}$ and it suffices to verify that it is a winning strategy. But since II follows her strategy in $H_{Y,X}$, we know that $(w_i) \sim_C (y_i)$ and therefore, since $u_i$ and $v_i$ are defined by the same coefficients over respectively $(y_i)$ and $(w_i)$, we have that $(v_i) \sim_C (u_i)$. □

3.4. A dichotomy for minimality

We are now in a position to prove the central result of this paper.

Theorem 3.13 (3rd dichotomy). Let $E$ be a Banach space with a basis $(e_i)$. Then either $E$ contains a minimal block subspace or a continuously tight block subspace.

Proof. Suppose that $E$ has no continuously tight block basic sequence. By Lemma 3.10, we can, modulo passing to a block subspace, suppose that for some constant $C$ and for all block subspaces $X \le E$ there is a further block subspace $Y \le X$ such that I has no winning strategy in the game $H_{Y,X}$ with constant $C$. By the determinacy of open games, this implies that for all block subspaces $X \le E$ there is a further block subspace $Y \le X$ such that II has a winning strategy in the game $H_{Y,X}$ with constant $C$.

A state is a pair $(a, b)$ with $a, b \in (D \times F)^{<\omega}$, where $F$ is the set of subspaces spanned by finite block sequences and $D$ the set of not necessarily normalised blocks, such that $|a| = |b|$ or $|a| = |b| + 1$. The set $S$ of states is countable, and corresponds to the possible positions of a game $G_{L,M}$ after a finite number of moves have been made, restricted to elements that affect the outcome of the game from that position (i.e., the $m_i$'s and $n_i$'s are forgotten). For each state $s = (a, b)$ we will define the game $G_{L,M}(s)$ in a manner similar to the game $G_{L,M}$, depending on whether $|a| = |b|$ or $|a| = |b| + 1$.
To avoid excessive notation we do this via two examples. If $a = (a_0, A_0, a_1, A_1)$, $b = (b_0, B_0, b_1, B_1)$, the game $G_{L,M}(s)$ will start with II playing some integer $n_2$, then I playing $(u_2, E_2, m_2)$ with $n_2 \le E_2 \subseteq L$ and $u_2 \in A_0 + A_1 + E_2$, II playing $(v_2, F_2, n_3)$ with $m_2 \le F_2 \subseteq M$ and $v_2 \in B_0 + B_1 + F_2$, etc., and the outcome of the game will be the pair of infinite sequences $(a_0, a_1, u_2, \ldots)$ and $(b_0, b_1, v_2, \ldots)$. If $a = (a_0, A_0, a_1, A_1)$, $b = (b_0, B_0)$, the game $G_{L,M}(s)$ will start with I playing some integer $m_1$, then II playing $(v_1, F_1, n_2)$ with $m_1 \le F_1 \subseteq M$ and $v_1 \in B_0 + F_1$, I playing $(u_2, E_2, m_2)$
with $n_2 \le E_2 \subseteq L$ and $u_2 \in A_0 + A_1 + E_2$, etc., and the outcome of the game will be the pair of infinite sequences $(a_0, a_1, u_2, \ldots)$ and $(b_0, v_1, v_2, \ldots)$.

The following lemma is well known and easily proved by a simple diagonalisation.

Lemma 3.14. Let $N$ be a countable set and let $\mu\colon bb(E) \to \mathcal{P}(N)$ satisfy either

\[
V \le^* W \ \Rightarrow\ \mu(V) \subseteq \mu(W)
\]

or

\[
V \le^* W \ \Rightarrow\ \mu(V) \supseteq \mu(W).
\]

Then there exists a stabilising block subspace $V_0 \le E$, i.e., such that $\mu(V) = \mu(V_0)$ for any $V \le^* V_0$.

Let now $\tau\colon bb(E) \to \mathcal{P}(S)$ be defined by

\[
s \in \tau(M) \iff \exists L \le M \text{ such that player II has a winning strategy in } G_{L,M}(s).
\]

By the asymptotic nature of the game we see that $M' \le^* M \Rightarrow \tau(M') \subseteq \tau(M)$, and therefore there exists $M_0 \le E$ which is stabilising for $\tau$. We then define a map $\rho\colon bb(E) \to \mathcal{P}(S)$ by setting

\[
s \in \rho(L) \iff \text{player II has a winning strategy in } G_{L,M_0}(s).
\]

Again $L' \le^* L \Rightarrow \rho(L') \supseteq \rho(L)$ and therefore there exists $L_0 \le M_0$ which is stabilising for $\rho$. Finally, the reader will easily check that $\rho(L_0) = \tau(L_0) = \tau(M_0)$, see, e.g., [35] or [10].

Lemma 3.15. For every $M \le L_0$, II has a winning strategy for the game $G_{L_0,M}$.

Proof. Fix $M$ a block subspace of $L_0$. We begin by showing that $(\emptyset, \emptyset) \in \tau(L_0)$. To see this, we notice that as $L_0 \le E$, there is a $Y \le L_0$ such that II has a winning strategy for $H_{Y,L_0}$ and thus, by Lemma 3.12, also a winning strategy in $G_{Y,L_0}$ with constant $C$. So $(\emptyset, \emptyset) \in \tau(L_0)$.

We will show that for all states $\big((u_0, E_0, \ldots, u_i, E_i), (v_0, F_0, \ldots, v_i, F_i)\big) \in \tau(L_0)$, there is an $n$ such that for all $n \le E \subseteq L_0$ and $u \in E_0 + \cdots + E_i + E$, we have $\big((u_0, E_0, \ldots, u_i, E_i, u, E), (v_0, F_0, \ldots, v_i, F_i)\big) \in \tau(L_0)$. Similarly, we show that for all states $\big((u_0, E_0, \ldots, u_{i+1}, E_{i+1}), (v_0, F_0, \ldots, v_i, F_i)\big) \in \tau(L_0)$ and for all $m$ there are $m \le F \subseteq M$ and $v \in F_0 + \cdots + F_i + F$ such that $\big((u_0, E_0, \ldots, u_{i+1}, E_{i+1}), (v_0, F_0, \ldots, v_i, F_i, v, F)\big) \in \tau(L_0)$.
Since the winning condition of $G_{L_0,M}$ is closed, this clearly shows that II has a winning strategy in $G_{L_0,M}$ (except for the integers $m$ and $n$, $\tau(L_0)$ is a winning quasi strategy for II).

So suppose that $s = \big((u_0, E_0, \ldots, u_i, E_i), (v_0, F_0, \ldots, v_i, F_i)\big) \in \tau(L_0) = \rho(L_0)$. Then II has a winning strategy in $G_{L_0,M_0}(s)$ and hence there is an $n$ such that for all $n \le E \subseteq L_0$ and $u \in E_0 + \cdots + E_i + E$, II has a winning strategy in $G_{L_0,M_0}(s')$, where $s' = \big((u_0, E_0, \ldots, u_i, E_i, u, E), (v_0, F_0, \ldots, v_i, F_i)\big)$. So $s' \in \rho(L_0) = \tau(L_0)$.

Similarly, if $s = \big((u_0, E_0, \ldots, u_{i+1}, E_{i+1}), (v_0, F_0, \ldots, v_i, F_i)\big) \in \tau(L_0) = \tau(M)$ and $m$ is given, then as II has a winning strategy for $G_{L,M}(s)$ for some $L \le M$, there are $m \le F \subseteq M$ and $v \in F_0 + \cdots + F_i + F$ such that II has a winning strategy in $G_{L,M}(s')$, where $s' = \big((u_0, E_0, \ldots, u_{i+1}, E_{i+1}), (v_0, F_0, \ldots, v_i, F_i, v, F)\big)$. So $s' \in \tau(M) = \tau(L_0)$. □
Choose now $Y = [y_i] \le L_0$ such that II has a winning strategy in $H_{Y,L_0}$. We shall show that any block subspace $M$ of $L_0$ contains a $C^2$-isomorphic copy of $Y$, which implies that $Y$ is $(C^2 + \epsilon)$-minimal for any $\epsilon > 0$. To see this, notice that, since II has a winning strategy in $H_{Y,L_0}$, player I has a strategy in the game $G_{L_0,M}$ to produce a sequence $(u_i)$ that is $C$-equivalent with the basis $(y_i)$. Moreover, we can ask that I plays $m_i = 0$. Using her winning strategy for $G_{L_0,M}$, II can then respond by producing a sequence $(v_i)$ in $M$ such that $(v_i) \sim_C (u_i)$. So $(v_i) \sim_{C^2} (y_i)$ and $Y \sqsubseteq_{C^2} M$. □

Finally we observe that by modifying the notion of embedding in the definition of a tight basis, we obtain variations of our dichotomy theorem with a weaker form of tightness on one side and a stronger form of minimality on the other.

Theorem 3.16. Every Banach space with a basis contains a block subspace $E = [e_n]$ which satisfies one of the two following properties:
(1) For any $[y_i] \le E$, there exists a sequence $(I_i)$ of successive intervals such that for any infinite subset $A$ of $\mathbb{N}$, the basis $(y_i)$ does not embed into $[e_n]_{n \notin \bigcup_{i\in A} I_i}$ as a sequence of disjointly supported blocks, resp. as a permutation of a block sequence, resp. as a block sequence.
(2) For any $[y_i] \le E$, $(e_n)$ is equivalent to a sequence of disjointly supported blocks of $[y_i]$, resp. $(e_n)$ is permutatively equivalent to a block sequence of $[y_i]$, resp. $(e_n)$ is equivalent to a block sequence of $[y_i]$.

The case of block sequences immediately implies the theorem of Pełczar [35]. The fact that the canonical basis of $T^*$ is strongly asymptotically $\ell_\infty$ implies easily that it is tight for “embedding as a sequence of disjointly supported blocks” although $T^*$ is minimal in
the usual sense. We do not know of other examples of spaces combining one form of minimality with another form of tightness in the above list.

4. Tightness with constants and crude stabilisation of local structure

We shall now consider a stronger notion of tightness, which is essentially local in nature. Let $E$ be a space with a basis $(e_n)$. There is a particularly simple case when a sequence $(I_i)$ of intervals associated to a subspace $Y$ characterises the tightness of $Y$ in $(e_n)$. This is when for all integer constants $K$, $Y \not\sqsubseteq_K [e_n]_{n \notin I_K}$. This property has the following useful reformulations.

Proposition 4.1. Let $E$ be a space with a basis $(e_n)$. The following are equivalent:
(1) For any block sequence $(y_n)$ there are intervals $I_0 < I_1 < I_2 < \cdots$ such that for all $K$, $[y_n]_{n\in I_K} \not\sqsubseteq_K [e_n]_{n \notin I_K}$.
(2) For any space $Y$, there are intervals $I_0 < I_1 < I_2 < \cdots$ such that for all $K$, $Y \not\sqsubseteq_K [e_n]_{n \notin I_K}$.
(3) No space embeds uniformly into the tail subspaces of $E$.
(4) There is no $K$ and no subspace of $E$ which is $K$-crudely finitely representable in any tail subspace of $E$.

A basis satisfying properties (1), (2), (3), (4), as well as the space it generates, will be said to be tight with constants.

Proof. The implications (1) ⇒ (2) ⇒ (3) are clear. To prove (3) ⇒ (4) assume some subspace $Y$ of $E$ is $K$-crudely finitely representable in any tail subspace of $E$. Without loss of generality, we may assume that $Y = [y_n]$ is a block subspace of $E$. We pick a subsequence $(z_n)$ of $(y_n)$ in the following manner. Let $z_0 = y_0$, and if $z_0, \ldots, z_{k-1}$ have been chosen, we choose $z_k$ supported far enough on the basis $(e_n)$, so that $[z_0, \ldots, z_{k-1}]$ has a $2K$-isomorphic copy in $[e_n \mid k \le n < \min(\operatorname{supp} z_k)]$. It follows that for any $k$, $Z = [z_n]$ has an $M$-isomorphic copy in the tail subspace $[e_n \mid n \ge k]$ for some $M$ depending only on $K$ and the constant of the basis $(e_n)$.

To prove (4) ⇒ (1), let $c(L)$ be a constant such that if two block sequences differ in at most $L$ terms, then they are $c(L)$-equivalent.
Now assume (4) holds and let $(y_n)$ be a block sequence of $(e_n)$. Suppose also that $I_0 < \cdots < I_{K-1}$ have been chosen. By (4) applied to $Y = [y_n]_{n=\max I_{K-1}+1}^{\infty}$, we can then find $m$ and $l > \max I_{K-1}$ such that $[y_n]_{n=\max I_{K-1}+1}^{l}$ does not $K \cdot c(\max I_{K-1} + 1)$-embed into $[e_n]_{n=m}^{\infty}$. Let now $I_K = [\max I_{K-1} + 1, l + m]$ and notice that, as $[y_n]_{n=\max I_{K-1}+1}^{l} \subseteq [y_n]_{n\in I_K}$, we have that $[y_n]_{n\in I_K}$ does not $K \cdot c(\max I_{K-1} + 1)$-embed into $[e_n]_{n=m}^{\infty}$. Also, since the sequences

\[
(e_n)_{n=m}^{\infty} \quad\text{and}\quad (e_n)_{n=0}^{\max I_{K-1}} \,{}^\frown\, (e_n)_{n=\max I_{K-1}+1+m}^{\infty}
\]

only differ in $\max I_{K-1} + 1$ many terms, $[y_n]_{n\in I_K}$ does not $K$-embed into

\[
[e_n]_{n=0}^{\max I_{K-1}} + [e_n]_{n=\max I_{K-1}+1+m}^{\infty},
\]

and thus not into the subspace $[e_n]_{n \notin I_K}$ either. □
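The last step of the proof above composes two isomorphism constants. The following sketch spells this out under the assumption that "$K$-embeds" means the existence of an isomorphism $T$ onto its image with $\|T\|\,\|T^{-1}\| \le K$; the names $T$, $S$ and $L$ are introduced here for illustration only and are not taken from the original text.

```latex
% Sketch only. Assumption: "K-embeds" means there is an isomorphic embedding T
% with \|T\| \|T^{-1}\| \le K. The symbols T, S and L are illustrative.
Let $L = \max I_{K-1} + 1$ and suppose, towards a contradiction, that
\[
  T \colon [y_n]_{n \in I_K} \to [e_n]_{n=0}^{L-1} + [e_n]_{n=L+m}^{\infty},
  \qquad \|T\|\,\|T^{-1}\| \le K .
\]
Let $S$ be the isomorphism sending the sequence
$(e_0, \dots, e_{L-1}, e_{L+m}, e_{L+m+1}, \dots)$ termwise onto
$(e_m, e_{m+1}, \dots)$. These two sequences differ only in their first $L$
terms, so $\|S\|\,\|S^{-1}\| \le c(L)$. Then
\[
  \|ST\|\,\|(ST)^{-1}\| \le \|S\|\,\|S^{-1}\|\,\|T\|\,\|T^{-1}\| \le K \cdot c(L),
\]
so $[y_n]_{n \in I_K}$ would $K \cdot c(L)$-embed into $[e_n]_{n=m}^{\infty}$,
contradicting the choice of $m$ and $l$.
```

This is why the non-embedding is first arranged with the larger constant $K \cdot c(\max I_{K-1} + 1)$: the factor $c(\cdot)$ is exactly what is lost when passing between the two sequences.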
It is worth noticing that a basis $(e_n)$, tight with constants, is necessarily continuously tight. For a simple argument shows that in order to find the intervals $I_K$ satisfying (1) above, one only needs to know a beginning of the block sequence $(y_n)$ and hence the intervals can be found continuously in $(y_n)$. From Proposition 4.1 we also deduce that any block basis or shrinking basic sequence in the span of a tight with constants basis is again tight with constants.

There is a huge difference between the fact that no subspace of $E$ is $K$-crudely finitely representable in all tails of $E$ and the fact that no space is $K$-crudely finitely representable in all tails of $E$. For example, we shall see that while the former holds for Tsirelson's space, by Dvoretzky's theorem (see e.g. [16]), $\ell_2$ is always finitely representable in any Banach space.

Recall that a basis $(e_n)$ is said to be strongly asymptotically $\ell_p$, $1 \le p \le +\infty$ [9], if there exist a constant $C$ and a function $f\colon \mathbb{N} \to \mathbb{N}$ such that for any $n$, any family of $n$ unit vectors which are disjointly supported in $[e_k \mid k \ge f(n)]$ is $C$-equivalent to the canonical basis of $\ell_p^n$.

Proposition 4.2. Let $E$ be a Banach space with a strongly asymptotically $\ell_p$ basis $(e_n)$, $1 \le p < +\infty$, and not containing a copy of $\ell_p$. Then $(e_n)$ is tight with constants.

Proof. Assume that some Banach space $Y$ embeds with constant $K$ in any tail subspace of $E$. We may assume that $Y$ is generated by a block sequence $(y_n)$ of $E$ and, since any strongly asymptotically $\ell_p$ basis is unconditional, $(y_n)$ is unconditional. By renorming $E$ we may assume it is 1-unconditional. By a result of W.B. Johnson [23], for any $n$ there is a constant $d(n)$ such that $(y_0, \ldots, y_n)$ is $2K$-equivalent to a sequence of vectors in the linear span of $d(n)$ disjointly supported unit vectors in any tail subspace of $E$, in particular in $[e_k \mid k \ge f(d(n))]$. Therefore $[y_0, \ldots, y_n]$ $2KC$-embeds into $\ell_p$.
This means that $Y$ is crudely finitely representable in $\ell_p$ and therefore embeds into $L_p$, and since $(y_n)$ is unconditional asymptotically $\ell_p$, that $Y$ contains a copy of $\ell_p$ (details of the last part of this proof may be found in [9]). □

Corollary 4.3. Tsirelson's space $T$ and its convexifications $T^{(p)}$, $1 < p < +\infty$, are tight with constants.

Observe that on the contrary, the dual $T^*$ of $T$, which is strongly asymptotically $\ell_\infty$ and does not contain a copy of $c_0$, is minimal and therefore does not contain any tight subspace.

Suppose a space $X$ is crudely finitely representable in all of its subspaces. Then there is some constant $K$ and a subspace $Y$ such that $X$ is $K$-crudely finitely representable in all of the subspaces of $Y$. For if not, we would be able to construct a sequence of basic sequences $(x_n^K)$ in $X$ such that $(x_n^{K+1})$ is a block sequence of $(x_n^K)$ and such that $X$ is not $K^2$-crudely finitely representable in $[x_n^K]$. By Lemma 2.2, we can then find a block sequence $(y_n)$ of $(x_n^0)$ that is $\sqrt{K}$-equivalent with a block sequence of $(x_n^K)$ for any $K$, and hence if $X$ were $K$-crudely finitely representable in $[y_n]$ for some $K$, then it would also be $K^{3/2}$-crudely finitely representable in $[x_n^K]$, which is a contradiction.
When a space $X$ is $K$-crudely finitely representable in any of its subspaces for some $K$, we say that $X$ is locally minimal. For example, by the universality properties of $c_0$, any space with an asymptotically $\ell_\infty$ basis is locally minimal.

Theorem 4.4 (5th dichotomy). Let $E$ be an infinite-dimensional Banach space with basis $(e_n)$. Then there is a block sequence $(x_n)$ satisfying one of the following two properties, which are mutually exclusive and both possible.
(1) $(x_n)$ is tight with constants,
(2) $[x_n]$ is locally minimal.

Proof. If $E$ contains $c_0$, the result is trivial. So suppose not and find by the solution to the distortion problem a block sequence $(y_n)$ and inevitable, positively separated, closed subsets $F_0$ and $F_1$ of the unit sphere of $[y_n]$. Define for each integer $K \ge 1$ the set

\[
A_K = \big\{ (z_n) \le (y_n) \;\big|\; z_{2n} \in F_0 \cup F_1 \text{ and } (z_{2n}) \text{ codes by 0's and 1's a block sequence } (v_n) \text{ of } (z_{2n+1}) \text{ such that for all } N,\ [v_n] \sqsubseteq_K [z_{2n+1}]_{n \ge N} \text{ and moreover } 1/2 < \|v_n\| < 2 \big\}.
\]

Clearly $A_K$ is analytic, so we can apply Gowers' Determinacy Theorem to get one of two cases:
(i) either there is a block sequence $(x_n)$ and a $K$ such that player II has a strategy to play inside $(A_K)_\Delta$ whenever I plays a block sequence of $(x_n)$, where $\Delta$ will be determined later,
(ii) or we can choose inductively a sequence of block sequences $(x_n^K)$ such that $(x_n^{K+1}) \le (x_n^K)$ and such that no block sequence of $(x_n^K)$ belongs to $A_K$.

Consider first case (ii). Set $w_n = x^n_{2n}$ and choose now further block sequences $(x_n)$ and $(h_n)$ of $(w_n)$ such that

\[
x_0 < h_0 < h_1 < x_1 < h_2 < h_3 < x_2 < \cdots
\]

and $h_{2n} \in F_0$, $h_{2n+1} \in F_1$. We claim that $(x_n)$ is tight with constants. If not, we can find some block sequence $(u_n)$ of $(x_n)$ and a $K$ such that $[u_n]$ embeds with constant $K$ into any tail subspace of $[x_n]$. By passing to tails of $(x_n)$ and of $(u_n)$, we can suppose that $(x_n)$ is a block sequence of $(x_n^K)$, $(u_n)$ is a block sequence of $(x_n)$ and $[u_n]$ $K$-embeds into all tails of $[x_n]$. By filling in with appropriate $h_i$ between $x_n$ and $x_{n+1}$, we can now produce a block sequence $(z_n)$ of $(x_n^K)$ such that $(z_{2n})$ codes by 0's and 1's the block sequence $(u_n)$ of $(z_{2n+1})$ with the property that for all $N$, $[u_n] \sqsubseteq_K [z_{2n+1}]_{n \ge N}$. In other words, we have produced a block sequence of $(x_n^K)$ belonging to $A_K$, which is impossible. Thus, $(x_n)$ is tight with constants.

Consider now case (i) instead and let II play according to her strategy. We suppose that $\Delta = (\delta_i)$ is chosen sufficiently small so that $\delta_i < \operatorname{dist}(F_0, F_1)/3$ and if two block sequences are $\Delta$-close then they are 2-equivalent. Let $(y_n) \in (A_K)_\Delta$ be the response by II to the sequence $(x_n)$ played by I and let $(z_n) \in A_K$ be such that $\|z_n - y_n\| < \delta_n$ for all $n$. Then $(z_{2n})$ codes by 0's and 1's a block sequence $(v_n)$ of $(z_{2n+1})$. Let $(u_n)$ be the block sequence of $(y_{2n+1})$ constructed in the same way as $(v_n)$ is constructed over $(z_{2n+1})$. We claim that $(u_n)$ is $4K$-crudely finitely representable in any block subspace of $[x_n]$.
For this, let $[u_0, \ldots, u_m]$ be given and suppose that $(f_n)$ is any block sequence of $(x_n)$. Find a large $k$ such that $(z_0, z_2, \ldots, z_{2k})$ codes the block sequence $(v_0, \ldots, v_m)$ of $(z_1, \ldots, z_{2k+1})$ and let $l$ be large enough so that when I has played $x_0, \ldots, x_l$ then II has played $y_0, \ldots, y_{2k+1}$. Consider now the game in which player I plays $x_0, x_1, \ldots, x_l, f_{l+1}, f_{l+2}, \ldots$. Then, following the strategy, II will play a block sequence $y_0, \ldots, y_{2k+1}, g_{2k+2}, g_{2k+3}, \ldots \in (A_K)_\Delta$. So let $(h_n) \in A_K$ be such that $\|h_n - y_n\| < \delta_n$ for all $n \le 2k + 1$ and $\|h_n - g_n\| < \delta_n$ for all $n \ge 2k + 2$. For $n \le k$, we have, as $\|h_{2n} - z_{2n}\| < 2\delta_n < \tfrac{2}{3}\operatorname{dist}(F_0, F_1)$, that $h_{2n} \in F_i \Leftrightarrow z_{2n} \in F_i$. Also, $(h_{2n+1})_{n=0}^{k}$ and $(y_{2n+1})_{n=0}^{k}$ are 2-equivalent and $(h_{2n+1})_{n=k+1}^{\infty}$ and $(g_{2n+1})_{n=k+1}^{\infty}$ are 2-equivalent, so $(h_{2n})$ will code a block sequence $(w_n)$ of $(h_{2n+1})$ such that $(w_0, \ldots, w_m)$ is 2-equivalent to $(u_0, \ldots, u_m)$. Moreover, since $(h_n) \in A_K$, $[w_n]$ will $K$-embed into every tail subspace of $[h_{2n+1}]$, and hence $2K$-embed into every tail subspace of $[g_{2n+1}]$. Therefore, since $(g_{2n+1})$ is a block sequence of $(f_n)$, $[u_0, \ldots, u_m]$ will $4K$-embed into $[f_n]$, which proves the claim. It follows that $[u_n]$ is locally minimal, which proves the theorem. □

Local minimality can be reformulated in a way that makes the relation to local theory clearer. For this, let $\mathcal{F}_n$ be the metric space of all $n$-dimensional Banach spaces up to isometry equipped with the Banach–Mazur metric

\[
d(X, Y) = \inf\big( \log(\|T\| \cdot \|T^{-1}\|) \;\big|\; T\colon X \to Y \text{ is an isomorphism} \big).
\]

Then for every Banach space $X$, the set of $n$-dimensional $Y$ that are almost isometrically embeddable into $X$ form a closed subset $(X)_n$ of $\mathcal{F}_n$. It is well known that this set $(X)_n$ does not always stabilise, i.e., there is not necessarily a subspace $Y \subseteq X$ such that for all further subspaces $Z \subseteq Y$, $(Z)_n = (Y)_n$.
However, if instead $X$ comes equipped with a basis and for all block subspaces $Y$ we let $\{Y\}_n$ be the set of all $n$-dimensional spaces that are almost isometrically embeddable into all tail subspaces of $Y$, then one can easily stabilise $\{Y\}_n$ on subspaces. Such considerations are for example the basis for [30].

Theorem 4.4 gives a dichotomy for when one can stabilise the set $(X)_n$ in a certain way, which we could call crude. Namely, $X$ is locally minimal if and only if there is some constant $K$ such that for all subspaces $Y$ of $X$ and all $n$, $d_H((X)_n, (Y)_n) \le K$, where $d_H$ is the Hausdorff distance. So by Theorem 4.4, the local structure stabilises crudely on a subspace if and only if a space is not saturated by basic sequences tight with constants.

Often it is useful to have a bit more than local minimality. So we say that a basis $(e_n)$ is locally block minimal if it is $K$-crudely block finitely representable in all of its block bases for some $K$. As with crude finite representability we see that there then must be a constant $K$ and a block basis $(y_n)$ such that $(e_n)$ is $K$-crudely block finitely representable in all block subspaces of $(y_n)$. We now have the following version of Theorem 4.4 for finite block representability.

Theorem 4.5. Let $(e_n)$ be a Schauder basis. Then $(e_n)$ has a block basis $(x_n)$ with one of the following two properties, which are mutually exclusive and both possible.
(1) For all block bases $(y_n)$ of $(x_n)$ there are intervals $I_1 < I_2 < I_3 < \cdots$ such that $(y_n)_{n\in I_K}$ is not $K$-equivalent to a block sequence of $(x_n)_{n \notin I_K}$,
(2) $(x_n)$ is locally block minimal.

Finally we note that there exist tight spaces which do not admit subspaces which are tight with constants:

Example 4.6. There exists a reflexive, tight, locally block minimal Banach space.

Proof. E. Odell and T. Schlumprecht [33] have built a reflexive space $OS$ with a basis such that every monotone basis is block finitely representable in any block subspace of $OS$. It is in particular locally block minimal and therefore contains no basic sequence which is tight with constants. We do not know whether the space $OS$ is tight. Instead, we notice that since the summing basis of $c_0$ is block finitely representable in any block subspace of $OS$, $OS$ cannot contain an unconditional block sequence. By Gowers' 1st dichotomy it follows that some block subspace of $OS$ is HI, and by the 3rd dichotomy (Theorem 3.13) and the fact that HI spaces do not contain minimal subspaces, that some further block subspace is tight, which completes the proof. □

It is unknown whether there is an unconditional example with the above property. There exists an unconditional version of $OS$ [32], but it is unclear whether it has no minimal subspaces. However, the dual of a space constructed by Gowers in [17] can be shown to be both tight and locally minimal.

Example 4.7. (See [14].) There exists a space with an unconditional basis which is tight and locally minimal.

5. Local block minimality, asymptotic structure and a dichotomy for containing $c_0$ or $\ell_p$

Recall that a basis $(e_n)$ is said to be asymptotically $\ell_p$ (in the sense of Tsirelson's space) if there is a constant $C$ such that for all normalised block sequences $n < x_1 < \cdots < x_n$, $(x_i)_{i=1}^{n}$ is $C$-equivalent with the standard unit vector basis of $\ell_p^n$.
When $(x_n)$ is asymptotically $\ell_p$ and some block subspace $[y_n]$ of $[x_n]$ is $K$-crudely block finitely representable in all tail subsequences of $(x_n)$, then it is clear that $(y_n)$ must actually be equivalent with the unit vector basis of $\ell_p$, or $c_0$ for $p = \infty$. So this shows that for asymptotically $\ell_p$ bases $(x_n)$, either $[x_n]$ contains an isomorphic copy of $\ell_p$ or $c_0$, or $(x_n)$ itself satisfies condition (1) of Theorem 4.5. This is the counterpart of Proposition 4.2 for block sequences. As an example, we mention that, since $T^*$ does not contain $c_0$, but has a strongly asymptotically $\ell_\infty$ basis, it thus satisfies (1).

This small observation indicates that one can characterise when $\ell_p$ or $c_0$ embeds into a Banach space by characterising when a space contains an asymptotic $\ell_p$ space. We first prove a dichotomy for having an asymptotic $\ell_p$ subspace, for the proof of which we need the following lemma.

Lemma 5.1. Suppose $(e_n)$ is a basic sequence such that for some $C$ and all $n$ and normalised block sequences $n < y_1 < y_2 < \cdots < y_{2n}$, we have $(y_{2i-1})_{i=1}^{n} \sim_C (y_{2i})_{i=1}^{n}$. Then $(e_n)$ has an asymptotic $\ell_p$ subsequence for some $1 \le p \le \infty$.
Proof. By the theorem of Brunel and Sucheston [5], we can, by passing to a subsequence of $(e_n)$, suppose that $(e_n)$ generates a spreading model, i.e., we can assume that for all integers $n < l_1 < l_2 < \cdots < l_n$ and $n < k_1 < k_2 < \cdots < k_n$ we have

\[
(e_{l_1}, \ldots, e_{l_n}) \sim_{1+\frac{1}{n}} (e_{k_1}, \ldots, e_{k_n}).
\]

Now suppose that $e_n < y_1 < y_2 < \cdots < y_n$ and $e_n < z_1 < \cdots < z_n$ are normalised block sequences of $(e_{2i})$. Then there are $n < l_1 < l_2 < \cdots < l_n$ and $n < k_1 < k_2 < \cdots < k_n$ such that

\[
e_n < y_1 < e_{l_1} < y_2 < e_{l_2} < \cdots < y_n < e_{l_n}
\quad\text{and}\quad
e_n < z_1 < e_{k_1} < z_2 < e_{k_2} < \cdots < z_n < e_{k_n},
\]

so $(y_i) \sim_C (e_{l_i}) \sim_{1+\frac{1}{n}} (e_{k_i}) \sim_C (z_i)$ and $(y_i) \sim_{2C^2} (z_i)$. Thus, asymptotically all finite normalised block sequences are $2C^2$-equivalent. By Krivine's theorem [25], there is some $\ell_p$ that is block finitely representable in $(e_{2i})$ and hence asymptotically all finite normalised block sequences are equivalent to $\ell_p^n$ of the correct dimension and $(e_{2i})$ is asymptotic $\ell_p$. □

Theorem 5.2. Suppose $X$ is a Banach space with a basis. Then $X$ has a block subspace $W$, which is either asymptotic $\ell_p$, for some $1 \le p \le +\infty$, or such that

\[
\forall M\ \exists n\ \forall U_1, \ldots, U_{2n} \subseteq W\ \exists u_i \in S_{U_i}\ \big( u_1 < \cdots < u_{2n} \ \&\ (u_{2i-1})_{i=1}^{n} \not\sim_M (u_{2i})_{i=1}^{n} \big).
\]

Proof. Assume first that for some $M$ and $V \subseteq X$ we have

\[
\forall n\ \forall Y \subseteq V\ \exists Z \subseteq Y\ \forall z_1 < \cdots < z_{2n} \in S_Z \quad (z_{2i-1})_{i=1}^{n} \sim_M (z_{2i})_{i=1}^{n}.
\]

Then we can inductively define $V \supseteq Z_1 \supseteq Z_2 \supseteq Z_3 \supseteq \cdots$ such that for each $n$,

\[
\forall z_1 < \cdots < z_{2n} \in S_{Z_n} \quad (z_{2i-1})_{i=1}^{n} \sim_M (z_{2i})_{i=1}^{n}.
\]

Diagonalising over this sequence, we can find a block subspace $W = [w_n]$ such that for all $m \ge n$, $w_m \in Z_n$. Therefore, if $n < z_1 < \cdots < z_{2n}$ is a sequence of normalised blocks of $(w_n)$, then $(z_{2i-1})_{i=1}^{n} \sim_M (z_{2i})_{i=1}^{n}$. By Lemma 5.1, it follows that $W$ has an asymptotic $\ell_p$ subspace.

So suppose on the contrary that

\[
\forall M\ \forall V \subseteq X\ \exists n\ \exists Y \subseteq V\ \forall Z \subseteq Y\ \exists z_1 < \cdots < z_{2n} \in S_Z \quad (z_{2i-1})_{i=1}^{n} \not\sim_M (z_{2i})_{i=1}^{n}.
\]

Now find some small $\Delta > 0$ such that two normalised block sequences of $X$ that are $\Delta$-close are $\sqrt{2}$-equivalent. Applying the determinacy theorem of Gowers to $\Delta$ and the sets

\[
A(M, n) = \big\{ (z_i) \;\big|\; (z_{2i-1})_{i=1}^{n} \not\sim_M (z_{2i})_{i=1}^{n} \big\},
\]

we have that

\[
\forall M\ \forall V \subseteq X\ \exists n\ \exists Z \subseteq V \ \text{II has a strategy to play normalised } z_1 < \cdots < z_{2n} \text{ such that } (z_{2i-1})_{i=1}^{n} \not\sim_{M/2} (z_{2i})_{i=1}^{n}
\]

when I is restricted to playing blocks of $Z$. Using this we can inductively define a sequence $X \supseteq W_1 \supseteq W_2 \supseteq W_3 \supseteq \cdots$ such that for all $M$ there is an $n = n(2M)$ such that II has a strategy to play normalised $z_1 < \cdots < z_{2n}$ such that $(z_{2i-1})_{i=1}^{n} \not\sim_M (z_{2i})_{i=1}^{n}$ whenever I is restricted to playing blocks of $W_M$. Diagonalising over this sequence, we find some $W \subseteq X$ such that for all $M$, $W \subseteq^* W_M$. So for all $M$ there is $n$ such that II has a strategy to play normalised $z_1 < \cdots < z_{2n}$ such that $(z_{2i-1})_{i=1}^{n} \not\sim_M (z_{2i})_{i=1}^{n}$ whenever I is restricted to playing blocks of $W$. Letting player I play a segment of the block basis of $U_i$ until II plays a vector, we easily see that whenever $U_1, \ldots, U_{2n} \subseteq W$, there are $z_i \in S_{U_i}$ such that $z_1 < \cdots < z_{2n}$ and $(z_{2i-1})_{i=1}^{n} \not\sim_M (z_{2i})_{i=1}^{n}$. This finishes the proof. □

Using this, we can now prove the main result of this section.

Theorem 5.3 (The $c_0$ and $\ell_p$ dichotomy). Suppose $X$ is a Banach space not containing a copy of $c_0$ nor of $\ell_p$, $1 \le p < \infty$. Then $X$ has a subspace $Y$ with a basis satisfying one of the following properties.
(i) $\forall M\ \exists n\ \forall U_1, \ldots, U_{2n} \subseteq Y\ \exists u_i \in S_{U_i}\ \big( u_1 < \cdots < u_{2n} \ \&\ (u_{2i-1})_{i=1}^{n} \not\sim_M (u_{2i})_{i=1}^{n} \big)$.
(ii) For all block bases $(z_n)$ of $Y = [y_n]$ there are intervals $I_1 < I_2 < I_3 < \cdots$ such that $(z_n)_{n\in I_K}$ is not $K$-equivalent to a block sequence of $(y_n)_{n \notin I_K}$.

Notice that both (i) and (ii) trivially imply that $Y$ cannot contain a copy of $c_0$ or $\ell_p$, since (i) implies some lack of homogeneity and (ii) some lack of minimality. However, we do not know if there are any spaces satisfying both (i) and (ii) or if, on the contrary, these two properties are incompatible.

Proof. If $X$ has no block subspace satisfying (i), then it must have an asymptotically $\ell_p$ block subspace $Y$ for some $1 \le p \le \infty$. Let $C$ be the constant of asymptoticity. Suppose now that $Z = [z_n]$ is a further block subspace.
Since $Z$ is not isomorphic to $\ell_p$, this means that for all $L$ and $K$, there is some $M$ such that $(z_n)_{n=L}^{M}$ is not $KC$-equivalent to $\ell_p^{M-L+1}$ and hence not $K$-equivalent with a normalised block sequence of $(y_n)_{n=M}^{\infty}$ either. So if $I_1 < I_2 < \cdots < I_{K-1}$ have been defined, to define $I_K$, we let $N = \max I_{K-1} + 1$ and find $M$ such that $(z_n)_{n=2N-1}^{M}$ is not $K$-equivalent with a normalised block sequence of $(y_n)_{n=M}^{\infty}$. It follows that $(z_n)_{n=N}^{M}$ is not $K$-equivalent with a normalised block sequence of $(y_1, y_2, \ldots, y_{N-1}, y_{M+1}, y_{M+2}, \ldots)$. Letting $I_K = [N, M]$ we have the result. □

We should mention that G. Androulakis, N. Kalton and Tcaciuc [1] have extended Tcaciuc's dichotomy from [41] to a dichotomy characterising containment of $\ell_p$ and $c_0$. The result above implies theirs and moreover provides additional information.
6. Tightness by range and subsequential minimality

Theorem 1.1 shows that if one allows oneself to pass to a basis for a subspace, one can find a basis in which there is a close connection between subspaces spanned by block bases and subspaces spanned by subsequences. Thus, for example, if the basis is tight there can be no space embedding into all the subspaces spanned by subsequences of the basis. On the other hand, any block basis in Tsirelson's space $T$ is equivalent to a subsequence of the basis, and actually every subspace of a block subspace $[x_n]$ in $T$ contains an isomorphic copy of a subsequence of $(x_n)$. In fact, this phenomenon has a deeper explanation and we shall now proceed to show that the connection between block sequences and subsequences can be made even closer.

Lemma 6.1. If $(e_n)$ is a basis for a space not containing $c_0$, then for all finite intervals $(I_n)$ such that $\min I_n \to \infty$ as $n \to \infty$ and all subspaces $Y$, there is a further subspace $Z$ such that $\|P_{I_k}|_Z\| \to 0$ as $k \to \infty$.

Proof. By a standard perturbation argument, we can suppose that $Y$ is generated by a normalised block basis $(y_n)$. Let $K$ be the basis constant of $(e_n)$. As $\min I_n \to \infty$ and each $I_n$ is finite, we can choose a subsequence $(v_n)$ of $(y_n)$ such that for all $k$ the interval $I_k$ intersects the range of at most one vector $v_m$ from $(v_n)$. Now, since $c_0$ does not embed into $[e_n]$, no tail sequence of $(v_n)$ can satisfy an upper $c_0$ estimate. This implies that for all $N$ and $\delta > 0$ there is a normalised vector

\[
z = \sum_{i=N}^{N'} \eta_i v_i,
\]

where $|\eta_i| < \delta$. Using this, we now construct a normalised block sequence $(z_n)$ of $(v_n)$ such that there are $m(0) < m(1) < \cdots$ and $\alpha_i$ with

\[
z_n = \sum_{i=m(n)}^{m(n+1)-1} \alpha_i v_i
\]

and $|\alpha_i| < \frac{1}{n}$ whenever $m(n) \le i < m(n+1)$.

Now suppose $u = \sum_j \lambda_j z_j$ and $k$ are given. Then there is at most one vector $z_n$ whose range intersects the interval $I_k$. Also, there is at most one vector $v_p$ from the support of $z_n$ whose range intersects $I_k$. Therefore,

\[
\|P_{I_k}(u)\| = \|P_{I_k}(\lambda_n z_n)\| = \|P_{I_k}(\lambda_n \alpha_p v_p)\| \le 2K \|\lambda_n \alpha_p v_p\| \le |\lambda_n| \cdot \frac{2K}{n} \le \frac{4K^2}{n} \|u\|.
\]

It follows that $\|P_{I_k}|_{[z_l]}\| \le \frac{4K^2}{n_k}$, where $n_k$ is such that $I_k$ intersects the range of $z_{n_k}$ (or $n_k = k$ if $I_k$ intersects the range of no $z_n$). Since $\min I_k \to \infty$ as $k \to \infty$ and $(z_n)$ is a block basis, $n_k \to \infty$ when $k \to \infty$, and hence $\|P_{I_k}|_{[z_l]}\| \to 0$ as $k \to \infty$. □
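The constants $2K$ and $4K^2$ in the estimate for $\|P_{I_k}(u)\|$ come from two standard facts about basis projections. The following sketch traces them; the partial-sum projections $Q_n$ onto $[z_0, \ldots, z_n]$ (with $Q_{-1} = 0$) and the endpoints $a, b$ are notation introduced here for illustration and do not appear in the original proof.

```latex
% Sketch only. Q_n, a and b are illustrative notation, not from the source.
For an interval $I = [a,b]$ we have $P_I = P_{[0,b]} - P_{[0,a-1]}$, so
\[
  \|P_I\| \le \|P_{[0,b]}\| + \|P_{[0,a-1]}\| \le 2K .
\]
Since $(z_j)$ is a normalised block basis of $(e_n)$, its basis constant is at
most $K$, and hence for $u = \sum_j \lambda_j z_j$,
\[
  |\lambda_n| = \|\lambda_n z_n\| = \|(Q_n - Q_{n-1})u\| \le 2K\,\|u\| .
\]
Combining these with $\|v_p\| = 1$ and $|\alpha_p| < 1/n$ gives
\[
  \|P_{I_k}(u)\| \le 2K\,|\lambda_n|\,|\alpha_p|
  \le 2K \cdot 2K\,\|u\| \cdot \frac{1}{n} = \frac{4K^2}{n}\,\|u\| .
\]
```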
Our next result should be contrasted with the construction by Pełczyński [36] of a basis $(f_i)$ such that every basis is equivalent with a subsequence of it, and hence such that every space contains an isomorphic copy of a subsequence. We shall see that for certain spaces $E$ such constructions cannot be done relative to the subspaces of $E$ provided that we demand that $(f_n)$ lies in $E$ too. Recall that two Banach spaces are said to be incomparable if neither of them embeds into the other.

Proposition 6.2. Suppose that $(e_n)$ is a basis such that any two block subspaces with disjoint ranges are incomparable. Suppose also that $(f_n)$ is either a block basis or a shrinking basic sequence in $[e_n]$. Then $[e_n]$ is saturated with subspaces $Z$ such that no subsequence of $(f_n)$ embeds into $Z$.

Proof. Let $Y$ be an arbitrary subspace of $[e_n]$. Suppose first that $(f_n)$ is a normalised shrinking basic sequence. Then, by taking a perturbation of $(f_n)$, we can suppose that each $f_n$ is a finite block vector of $(e_i)$ and, moreover, that $\min \operatorname{range}(f_n) \to \infty$. Let $I_n = \operatorname{range}(f_n)$. Fix an infinite set $N \subseteq \mathbb{N}$. Then for all infinite subsets $A \subseteq N$ there is an infinite subset $B \subseteq A$ such that $(f_n)_{n\in B}$ is a block sequence and hence, since block subspaces of $(e_n)$ with disjoint ranges are incomparable, $[f_n]_{n\in B} \not\sqsubseteq [e_n]_{n \notin \bigcup_{i\in B} I_i}$, and so also $[f_n]_{n\in N} \not\sqsubseteq [e_n]_{n \notin \bigcup_{i\in A} I_i}$. Applying Lemma 3.2 to $X = [f_n]_{n\in N}$ and $(I_n)_{n\in N}$, this implies that for all embeddings $T\colon [f_n]_{n\in N} \to [e_n]_{n\in\mathbb{N}}$, we have $\liminf_{k\in N} \|P_{I_k} T\| > 0$. So find by Lemma 6.1 a subspace $Z \subseteq Y$ such that $\|P_{I_k}|_Z\| \to 0$ as $k \to \infty$. Then no subsequence of $(f_n)_{n\in\mathbb{N}}$ embeds into $Z$.

The argument in the case $(f_n)$ is a block basis is similar. We set $I_n = \operatorname{range} f_n$ and repeat the argument above. □

We notice that in the above proof we actually have a measure for how “flat” a subspace $Z$ of $[e_n]$ needs to be in order that the subsequences of $(f_n)$ cannot embed into $Z$. Namely, it suffices that $\|P_{I_k}|_Z\| \to 0$ as $k \to \infty$.

We should also mention that, by similar but simpler arguments, one can show that if $(e_n)$ is a basis such that any two disjoint subsequences span incomparable spaces, then some subspace of $[e_n]$ fails to contain any isomorphic copy of a subsequence of $(e_n)$.

The assumption in Proposition 6.2 that block subspaces with disjoint ranges are incomparable is easily seen to be equivalent to the following property of a basis $(e_n)$, that we call tight by range. If $(y_n)$ is a block sequence of $(e_n)$ and $A \subseteq \mathbb{N}$ is infinite, then

\[
[y_n]_{n\in\mathbb{N}} \not\sqsubseteq \Big[ e_n \;\Big|\; n \notin \bigcup_{i\in A} \operatorname{range} y_i \Big].
\]

Thus, $(e_n)$ is tight by range if it is tight and for all block sequences $(y_n)$ of $(e_n)$ the corresponding sequence of intervals $I_i$ is given by $I_i = \operatorname{range} y_i$. This property is also weaker than disjointly supported subspaces being incomparable, which we shall call tight by support. It is trivial to see that a basis, tight by range, is continuously tight.

We say that a basic sequence $(e_n)$ is subsequentially minimal if any subspace of $[e_n]$ contains an isomorphic copy of a subsequence of $(e_n)$. It is clearly a weak form of minimality.

In [26] the authors study another notion in the context of certain partly modified mixed Tsirelson spaces that they also call subsequential minimality. According to their definition, a basis $(e_n)$ is subsequentially minimal if any block basis has a further block basis equivalent to a subsequence of $(e_n)$. However, in all their examples the basis $(e_n)$ is weakly null and it is easily
seen that whenever this is the case the two definitions agree. They also define $(e_n)$ to be strongly non-subsequentially minimal if any block basis contains a further block basis that has no further block basis equivalent to a subsequence of $(e_n)$. By Proposition 6.2, this is seen to be weaker than tightness by range. We shall now proceed to show a dichotomy between tightness by range and subsequential minimality.

Theorem 6.3 (4th dichotomy). Let $E$ be a Banach space with a basis $(e_n)$. Then there exists a block sequence $(x_n)$ of $(e_n)$ with one of the following properties, which are mutually exclusive and both possible:
(1) Any two block subspaces of $[x_n]$ with disjoint ranges are incomparable.
(2) The basic sequence $(x_n)$ is subsequentially minimal.

Arguably Theorem 6.3 is not a dichotomy in Gowers’ sense, since property (2) is not hereditary: for example the universal basis of Pełczyński [36] satisfies (2) while admitting subsequences with property (1). However, it follows obviously from Theorem 6.3 that any basis $(e_n)$ either has a block basis such that any two block subspaces with disjoint ranges are incomparable or has a block basis $(x_n)$ that is hereditarily subsequentially minimal, i.e., such that any block basis has a further block basis that is subsequentially minimal. Furthermore, by an easy improvement of our proof or directly by Gowers’ second dichotomy, if the first case of Theorem 6.3 fails, then one can also suppose that $[x_n]$ is quasi minimal. We shall call a basis $(x_n)$ sequentially minimal if it is both hereditarily subsequentially minimal and quasi minimal. This is equivalent to any block basis of $(x_n)$ having a further block basis $(y_n)$ such that every subspace of $[x_n]$ contains an equivalent copy of a subsequence of $(y_n)$. We may therefore see Theorem 6.3 as providing a dichotomy between tightness by range and sequential minimality.
Before giving the proof of Theorem 6.3, we first need to state an easy consequence of the definition of Gowers’ game.

Lemma 6.4. Let $E$ be a space with a basis and assume II has a winning strategy in Gowers’ game in $E$ to play in some set $B$. Then there is a non-empty tree $T$ of finite block sequences such that $[T] \subseteq B$ and for all $(y_0, \dots, y_m) \in T$ and all block sequences $(z_n)$ there is a block $y_{m+1}$ of $(z_n)$ such that $(y_0, \dots, y_m, y_{m+1}) \in T$.

Proof. Suppose $\sigma$ is the strategy for II. We define a pruned tree $T$ of finite block bases $(y_0, \dots, y_m)$ and a function $\psi$ associating to each $(y_0, \dots, y_m) \in T$ a sequence $(z_0, \dots, z_k)$ such that for some $k_0 < \cdots < k_m = k$,
$$\begin{array}{ccccc} \mathbf{I} & z_0 \cdots z_{k_0} & z_{k_0+1} \cdots z_{k_1} & \cdots & z_{k_{m-1}+1} \cdots z_{k_m} \\ \mathbf{II} & y_0 & y_1 & \cdots & y_m \end{array}$$
has been played according to $\sigma$.
• The empty sequence $\emptyset$ is in $T$ and $\psi(\emptyset) = \emptyset$.
• If $(y_0, \dots, y_m) \in T$ and $\psi(y_0, \dots, y_m) = (z_0, \dots, z_k)$,
then we let $(y_0, \dots, y_m, y_{m+1}) \in T$ if there are some $z_k < z_{k+1} < \cdots < z_l$ and $k_0 < \cdots < k_m = k$ such that
$$\begin{array}{ccccc} \mathbf{I} & z_0 \cdots z_{k_0} & z_{k_0+1} \cdots z_{k_1} & \cdots & z_{k_m+1} \cdots z_l \\ \mathbf{II} & y_0 & y_1 & \cdots & y_{m+1} \end{array}$$
has been played according to $\sigma$ and in this case we let $\psi(y_0, \dots, y_m, y_{m+1}) = (z_0, \dots, z_k, z_{k+1}, \dots, z_l)$ be some such sequence.

Now, if $(y_0, y_1, y_2, \dots)$ is such that $(y_0, \dots, y_m) \in T$ for all $m$, then $\psi(\emptyset) \subseteq \psi(y_0) \subseteq \psi(y_0, y_1) \subseteq \cdots$ and $(y_i)$ is the play of II according to the strategy $\sigma$ in response to $(z_i) = \bigcup_n \psi(y_0, \dots, y_n)$ being played by I. So $[T] \subseteq B$. It also follows by the construction that for each $(y_0, \dots, y_m) \in T$ and block sequence $(z_i)$ there is a block $y_{m+1}$ of $(z_i)$ such that $(y_0, \dots, y_m, y_{m+1}) \in T$. □

We now pass to the proof of Theorem 6.3.

Proof. If $E$ contains $c_0$ the theorem is trivial. So suppose not. By the solution to the distortion problem and by passing to a subspace, we can suppose there are two positively separated inevitable closed subsets $F_0$ and $F_1$ of the unit sphere of $E$, i.e., such that $\operatorname{dist}(F_0, F_1) > 0$ and every block basis has block vectors belonging to both $F_0$ and $F_1$. Suppose that $(e_n)$ has no block sequence satisfying (1). Then for all block sequences $(x_n)$ there are further block sequences $(y_n)$ and $(z_n)$ with disjoint ranges such that $[y_n] \sqsubseteq [z_n]$. We claim that there is a block sequence $(f_n)$ and a constant $K$ such that for all block sequences $(x_n)$ of $(f_n)$ there are further block sequences $(y_n)$ and $(z_n)$ with disjoint ranges such that $[y_n] \sqsubseteq_K [z_n]$. If not, we can construct a sequence of block sequences $(f_n^K)$ such that $(f_n^{K+1})$ is a block of $(f_n^K)$ and such that any two block sequences of $(f_n^K)$ with disjoint ranges are $K^2$-incomparable. By Lemma 2.2, we then find a block sequence $(g_n)$ of $(e_n)$ that is $\sqrt{K}$-equivalent with a block sequence of $(f_n^K)$ for every $K \ge 1$. Find now block subspaces $(y_n)$ and $(z_n)$ of $(g_n)$ with disjoint ranges and a $K$ such that $[y_n] \sqsubseteq_{\sqrt{K}} [z_n]$.
Then $(g_n)$ is $\sqrt{K}$-equivalent with a block sequence of $(f_n^K)$ and hence we can find $K^{3/2}$-comparable block subspaces of $(f_n^K)$ with disjoint ranges, contradicting our assumption. So suppose $(f_n)$ and $K$ are chosen as in the claim. Then for all block sequences $(x_n)$ of $(f_n)$ we can find an infinite set $B \subseteq \mathbb{N}$ and a block sequence $(y_n)$ of $(x_n)$ such that $[y_n]_{n\in B}$ $K$-embeds into $[y_n]_{n\notin B}$. We claim that any block basis of $(f_n)$ has a further block basis in the following set of normalised block bases of $(f_n)$:
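$$A = \big\{ (y_n) \;\big|\; \forall n\ y_{2n} \in F_0 \cup F_1 \ \&\ \exists^\infty n\ y_{2n} \in F_0 \ \&\ [y_{2n+1}]_{y_{2n}\in F_0} \sqsubseteq_K [y_{2n+1}]_{y_{2n}\in F_1} \big\}.$$
To see this, suppose that $(x_n)$ is a block sequence of $(f_n)$ and let $(z_n)$ be a block sequence of $(x_n)$ such that $z_{3n} \in F_0$ and $z_{3n+1} \in F_1$. We can now find an infinite set $B \subseteq \mathbb{N}$ and a block sequence $(v_n)$ of $(z_{3n+2})$ such that $[v_n]_{n\in B} \sqsubseteq_K [v_n]_{n\notin B}$. Let now $y_{2n+1} = v_n$ and notice that we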
can choose $y_{2n} = z_i \in F_0$ for $n \in B$ and $y_{2n} = z_i \in F_1$ for $n \notin B$ such that $y_0 < y_1 < y_2 < \cdots$. Then $(y_n) \in A$.

Choose now a sequence $\Delta = (\delta_n)$ of positive reals, $\delta_n < \operatorname{dist}(F_0, F_1)/3$, such that if $(x_n)$ and $(y_n)$ are block bases of $(e_n)$ with $\|x_n - y_n\| < \delta_n$, then $(x_n) \sim_2 (y_n)$. Since $A$ is clearly analytic, it follows by Gowers’ determinacy theorem that for some block basis $(x_n)$ of $(f_n)$, II has a winning strategy to play in $A_\Delta$ whenever I plays a block basis of $(x_n)$. We now show that some block basis $(v_n)$ of $(x_n)$ is such that any subspace of $[v_n]$ contains a sequence $2K$-equivalent to a subsequence of $(v_n)$, which will give us case (2).

Pick first by Lemma 6.4 a non-empty tree $T$ of finite block sequences of $(x_n)$ such that $[T] \subseteq A_\Delta$ and for all $(u_0, \dots, u_m) \in T$ and all block sequences $(z_n)$ there is a block $u_{m+1}$ of $(z_n)$ such that $(u_0, \dots, u_m, u_{m+1}) \in T$. Since $T$ is countable, we can construct inductively a block sequence $(v_n)$ of $(x_n)$ such that for all $(u_0, \dots, u_m) \in T$ there is some $v_n$ with $(u_0, \dots, u_m, v_n) \in T$.

We claim that $(v_n)$ works. For if $(z_n)$ is any block sequence of $(v_n)$, we construct inductively a sequence $(u_n) \in A_\Delta$ as follows. Using inductively the extension property of $T$, we can construct an infinite block sequence $(h_n^0)$ of $(z_n)$ that belongs to $[T]$. Since $[T] \subseteq A_\Delta$, there is a shortest initial segment $(u_0, \dots, u_{2k_0}) \in T$ of $(h_n^0)$ such that $d(u_{2k_0}, F_0) < \delta_{2k_0}$. Pick now a term $u_{2k_0+1}$ from $(v_n)$ such that $(u_0, \dots, u_{2k_0}, u_{2k_0+1}) \in T$. Again, using the extension property of $T$, there is an infinite block sequence $(h_n^1)$ of $(z_n)$ such that $(u_0, \dots, u_{2k_0}, u_{2k_0+1})^\frown (h_n^1)_n \in [T]$. Also, as $[T] \subseteq A_\Delta$, there is a shortest initial segment $(u_0, \dots, u_{2k_0}, u_{2k_0+1}, \dots, u_{2k_1}) \in T$ of $(u_0, \dots, u_{2k_0}, u_{2k_0+1})^\frown (h_n^1)_n$ that properly extends $(u_0, \dots, u_{2k_0}, u_{2k_0+1})$ and such that $d(u_{2k_1}, F_0) < \delta_{2k_1}$. We then pick a term $u_{2k_1+1}$ of $(v_n)$ such that $(u_0, \dots, u_{2k_1}, u_{2k_1+1}) \in T$.
We continue in the same fashion. At infinity, we then have a block sequence $(u_n) \in A_\Delta$ and integers $k_0 < k_1 < \cdots$ such that $d(u_{2n}, F_0) < \delta_{2n}$ if and only if $n = k_i$ for some $i$ and such that for every $i$, $u_{2k_i+1}$ is a term of $(v_n)$. Let now $(w_n) \in A$ be such that $\|w_n - u_n\| < \delta_n$. Then, as $\delta_n < \operatorname{dist}(F_0, F_1)/3$, we have that $w_{2n} \in F_0$ if and only if $n = k_i$ for some $i$ and $w_{2n} \in F_1$ otherwise. Moreover, as $(w_n) \in A$,
$$[w_{2k_i+1}]_{i\in\mathbb{N}} = [w_{2n+1}]_{w_{2n}\in F_0} \sqsubseteq_K [w_{2n+1}]_{w_{2n}\in F_1} = [w_{2n+1}]_{n\neq k_i}.$$
So by the choice of $\delta_n$ we have
$$[u_{2k_i+1}]_{i\in\mathbb{N}} \sqsubseteq_2 [w_{2k_i+1}]_{i\in\mathbb{N}} \sqsubseteq_K [w_{2n+1}]_{n\neq k_i} \sqsubseteq_2 [u_{2n+1}]_{n\neq k_i}.$$
Since $[u_{2n+1}]_{n\neq k_i}$ is a subspace of $[z_n]$ and $(u_{2k_i+1})$ a subsequence of $(v_n)$ this finishes the proof. □

If, for some constant $C$, all subspaces of $[e_n]$ contain a $C$-isomorphic copy of a subsequence of $(e_n)$, we say that $(e_n)$ is subsequentially $C$-minimal. Our proof shows that condition (2) in
Theorem 6.3 may be improved to “For some constant $C$ the basic sequence $(x_n)$ is subsequentially $C$-minimal”.

We notice that if $(x_n)$ is hereditarily subsequentially minimal, then there is some $C$ and a block sequence $(v_n)$ of $(x_n)$ such that $(v_n)$ is hereditarily subsequentially $C$-minimal with the obvious definition. To see this, we first notice that by Proposition 6.2, $(x_n)$ can have no block basis $(y_n)$ such that further block subspaces with disjoint ranges are incomparable. So, by the proof of Theorem 6.3, for any block basis $(y_n)$ there is a constant $C$ and a further block basis $(z_n)$ which is subsequentially $C$-minimal. A simple diagonalisation using Lemma 2.2 now shows that by passing to a block $(v_n)$ the $C$ can be made uniform. Recall that Gowers also proved that a quasi minimal space must contain a further subspace which is $C$-quasi minimal [20].

We also indicate a variation on Theorem 6.3, relating the Casazza property to a slightly stronger form of sequential minimality. This answers the original problem of Gowers left open in [20], which was mentioned in Section 1. This variation is probably of less interest than Theorem 6.3 because the Casazza property does not seem to imply tightness and also because the stronger form of sequential minimality may look somewhat artificial (although it is satisfied by Tsirelson’s space and is reminiscent of Schlumprecht’s notion of Class 1 space [40]). We say that two block sequences $(x_n)$ and $(y_n)$ alternate if either $x_0 < y_0 < x_1 < y_1 < \cdots$ or $y_0 < x_0 < y_1 < x_1 < \cdots$.

Proposition 6.5. Let $E$ be a Banach space with a basis $(e_n)$. Then there exists a block sequence $(x_n)$ with one of the following properties, which are exclusive and both possible:
(1) $[x_n]$ has the Casazza property, i.e., no alternating block sequences in $[x_n]$ are equivalent.
(2) There exists a family $B$ of block sequences saturating $[x_n]$ and such that any two block sequences in $B$ have subsequences which alternate and are equivalent.
In particular, in case (2), $E$ contains a block subspace $U = [u_n]$ such that for every block sequence of $U$, there is a further block sequence equivalent to, and alternating with, a subsequence of $(u_n)$.

Proof. If $(e_n)$ does not have a block sequence satisfying (1), then any block sequence of $(e_n)$ has a further block sequence in $A = \{(y_n) \mid (y_{2n}) \sim (y_{2n+1})\}$. Let $\Delta$ be small enough so that $A_\Delta = A$. By Gowers’ theorem, let $(x_n)$ be some block sequence of $(e_n)$ so that II has a winning strategy to play in $A_\Delta$ whenever I plays a block sequence of $(x_n)$. Let $T$ be the associated tree given by Lemma 6.4. By construction, for any block sequence $(z_n)$ of $(x_n)$, we may find a further block sequence $(v_n)$ such that for any $(y_0, \dots, y_m) \in T$, there exists some $v_n$ with $(y_0, \dots, y_m, v_n) \in T$. We set $f((z_n)) = (v_n)$ and $B = \{f((z_n)) \mid (z_n) \text{ is a block sequence of } (x_n)\}$. Given $(v_n)$ and $(w_n)$ in $B$, it is then clear that we may find subsequences $(v'_n)$ and $(w'_n)$ so that $(v'_0, w'_0, v'_1, w'_1, \dots) \in T$, and therefore $(v'_n)$ and $(w'_n)$ are equivalent. □

We may also observe that there is no apparent relation between tightness by range and tightness with constants. Indeed Tsirelson’s space is tight with constants and sequentially minimal. Similarly, Example 4.7 is tight by support and therefore by range, but is locally minimal. Using the techniques of [3], one can construct a space combining the two forms of tightness.

Example 6.6. (See [14].) There exists a space with a basis which is tight with constants and tight by range.
Finally, if a space $X$ is locally minimal and equipped with a basis which is tight by support and therefore unconditional (such as Example 4.7), then the reader will easily check the following. The canonical basis of $X \oplus X$ is tight (for a block subspace $Y = [y_n]$ of $X \oplus X$ use the sequence of intervals associated to the ranges of $y_n$ with respect to the canonical 2-dimensional decomposition of $X \oplus X$), but neither tight by range nor with constants. However, a more interesting question remains open: does there exist a tight space which does not contain a basic sequence which is tight by range or with constants?

There is a natural strengthening of sequential minimality that has been considered in the literature, namely, the blocking principle (also known as the shift property in [7]) due to Casazza, Johnson, and Tzafriri [6]. It is known that for a normalized unconditional basis $(e_n)$ the following properties are equivalent (see, e.g., [12]).
(1) Any block sequence $(x_n)$ spans a complemented subspace of $[e_n]$.
(2) For any block sequence $(x_n)$, $(x_{2n}) \sim (x_{2n+1})$.
(3) For any block sequence $(x_n)$ and integers $k_n \in \operatorname{supp} x_n$, $(x_n) \sim (e_{k_n})$.
Moreover, any of the above properties necessarily hold uniformly. We say that $(e_n)$ satisfies the blocking principle if the above properties hold for $(e_n)$. The following proposition can be proved along the lines of the proof of the minimality of $T^*$ in [6] (Theorem 14).

Proposition 6.7. Let $(e_n)$ be an unconditional basis satisfying the blocking principle and spanning a locally minimal space. Then $(e_n)$ spans a minimal space.

Thus, by the 5th dichotomy (Theorem 4.4), we have

Corollary 6.8. Let $(e_n)$ be an unconditional basis satisfying the blocking principle. Then there is a subsequence $(f_n)$ of $(e_n)$ such that either $[f_n]$ is minimal or $(f_n)$ is tight with constants.

7. Chains and strong antichains

The results in this section are in response to a question of Gowers from his fundamental study [20] and concern what types of quasi orders can be realised as the set of (infinite-dimensional) subspaces of a fixed Banach space under the relation of isomorphic embeddability.

Problem 7.1. (See Problem 7.9 in [20].) Given a Banach space $X$, let $P(X)$ be the set of all equivalence classes of subspaces of $X$, partially ordered by isomorphic embeddability. For which posets $P$ does there exist a Banach space $X$ such that every subspace $Y$ of $X$ contains a further subspace $Z$ with $P(Z) = P$?

Gowers noticed himself that by a simple diagonalisation argument any such poset $P(X)$ must either have a minimal element, corresponding to a minimal space, or be uncountable. We shall now use our notion of tightness to show how to attack this problem in a uniform way and improve on several previous results.

Suppose $X$ is a separable Banach space and let $SB(X)$ denote the set of all closed linear subspaces of $X$. We equip $SB(X)$ with the so-called Effros–Borel structure, which is the $\sigma$-algebra generated by sets of the form
$$\{Y \in SB(X) \mid Y \cap U \neq \emptyset\},$$
where $U$ is an open subset of $X$. In this way, $SB(X)$ becomes a standard Borel space, i.e., isomorphic as a measurable space to the real line equipped with its Borel algebra. We refer to the measurable subsets of $SB(X)$ as Borel sets. Let also $SB_\infty(X)$ be the subset of $SB(X)$ consisting of all infinite-dimensional subspaces of $X$. Then $SB_\infty(X)$ is a Borel subset of $SB(X)$ and hence a standard Borel space in its own right.

Definition 7.2. Suppose that $X$ is a separable Banach space and $E$ is an analytic equivalence relation on a Polish space $Z$. We say that $X$ has an $E$-antichain, if there is a Borel function $f : Z \to SB(X)$ such that for $x, y \in Z$
(1) if $x E y$, then $f(x)$ and $f(y)$ are biembeddable,
(2) if $x \not\mathrel{E} y$, then $f(x)$ and $f(y)$ are incomparable.
We say that $X$ has a strong $E$-antichain if there is a Borel function $f : Z \to SB(X)$ such that for $x, y \in Z$
(1) if $x E y$, then $f(x)$ and $f(y)$ are isomorphic,
(2) if $x \not\mathrel{E} y$, then $f(x)$ and $f(y)$ are incomparable.

For example, if $=_{\mathbb{R}}$ is the equivalence relation of identity on $\mathbb{R}$, then $=_{\mathbb{R}}$-antichains and strong $=_{\mathbb{R}}$-antichains simply correspond to a perfect antichain in the usual sense, i.e., an uncountable Borel set of pairwise incomparable subspaces. Also, having a strong $E$-antichain implies, in particular, that $E$ Borel reduces to the isomorphism relation between the subspaces of $X$. The main result of [11] reformulated in this language states that if $E_{\Sigma_1^1}$ denotes the complete analytic equivalence relation, then $C[0, 1]$ has a strong $E_{\Sigma_1^1}$-antichain.

We will now prove a result that simultaneously improves on two results due respectively to the first and the second author. In [12], the authors proved that a Banach space not containing a minimal space must contain a perfect set of non-isomorphic subspaces. This result was improved by Rosendal in [37], in which it was shown that if a space does not contain a minimal subspace it must contain a perfect set of pairwise incomparable spaces.
And Ferenczi proved in [10] that if $X$ is a separable space without minimal subspaces, then $E_0$ Borel reduces to the isomorphism relation between the subspaces of $X$. Recall that $E_0$ is the equivalence relation defined on $2^{\mathbb{N}}$ by $x E_0 y$ if and only if $\exists m\ \forall n \ge m\ x_n = y_n$.

Theorem 7.3. Let $X$ be a separable Banach space. Then $X$ either contains a minimal subspace or has a strong $E_0$-antichain.

Proof. Suppose $X$ has no minimal subspace. By Theorem 3.13 and Lemma 3.11, we can find a basic sequence $(e_n)$ in $X$ and a continuous function $f : [\mathbb{N}] \to [\mathbb{N}]$ such that for all $A, B \in [\mathbb{N}]$, if $B$ is disjoint from an infinite number of intervals $[f(A)_{2i}, f(A)_{2i+1}]$, then $[e_n]_{n\in A}$ does not embed into $[e_n]_{n\in B}$. We claim that there is a continuous function $h : 2^{\mathbb{N}} \to [\mathbb{N}]$ such that
(1) if $x E_0 y$, then $|h(x) \setminus h(y)| = |h(y) \setminus h(x)| < \infty$,
(2) if $x \not\mathrel{E_0} y$, then $[e_n]_{n\in h(x)}$ and $[e_n]_{n\in h(y)}$ are incomparable spaces.
This will clearly finish the proof using the fact that subspaces of the same finite codimension in a common superspace are isomorphic.
We will construct a partition of $\mathbb{N}$ into intervals
$$I_0^0 < I_0^1 < I_0^2 < I_1^0 < I_1^1 < I_1^2 < \cdots$$
such that if we set $J_n^0 = I_n^0 \cup I_n^2$ and $J_n^1 = I_n^1$, the following conditions hold:
(1) for all $n$, $|J_n^0| = |J_n^1|$,
(2) if $s \in 2^n$, $a = J_0^{s_0} \cup J_1^{s_1} \cup \cdots \cup J_{n-1}^{s_{n-1}} \cup I_n^0$, and $A \in [a, \mathbb{N}]$, then for some $i$, $[f(A)_{2i}, f(A)_{2i+1}] \subseteq I_n^0$,
(3) if $s \in 2^n$, $a = J_0^{s_0} \cup J_1^{s_1} \cup \cdots \cup J_{n-1}^{s_{n-1}} \cup I_n^1$, and $A \in [a, \mathbb{N}]$, then for some $i$, $[f(A)_{2i}, f(A)_{2i+1}] \subseteq I_n^1$.

Assuming this is done, for $x \in 2^{\mathbb{N}}$ we set $h(x) = J_0^{x_0} \cup J_1^{x_1} \cup \cdots$. Then for all $n$ there is an $i$ such that
$$[f(h(x))_{2i}, f(h(x))_{2i+1}] \subseteq J_n^{x_n}.$$
Therefore, if $x \not\mathrel{E_0} y$, then $h(y) = J_0^{y_0} \cup J_1^{y_1} \cup \cdots$ is disjoint from an infinite number of $J_n^{x_n}$ and thus also from an infinite number of intervals $[f(h(x))_{2i}, f(h(x))_{2i+1}]$, whence $[e_n]_{n\in h(x)}$ does not embed into $[e_n]_{n\in h(y)}$. Similarly, $[e_n]_{n\in h(y)}$ does not embed into $[e_n]_{n\in h(x)}$. On the other hand, if $x E_0 y$, then clearly $|h(x) \setminus h(y)| = |h(y) \setminus h(x)| < \infty$.

It therefore only remains to construct the intervals $I_n^i$. So suppose by induction that $I_0^0 < I_0^1 < I_0^2 < \cdots < I_{n-1}^0 < I_{n-1}^1 < I_{n-1}^2$ have been chosen (the initial step being $n = 0$) such that the conditions are satisfied. Let $m = \max J_{n-1}^0 + 1 = \max I_{n-1}^2 + 1$. For each $s \in 2^n$ and $a = J_0^{s_0} \cup J_1^{s_1} \cup \cdots \cup J_{n-1}^{s_{n-1}}$, there are by continuity of $f$ some $k_s > m$, some interval $m \le M_s \le k_s$ and an integer $i_s$ such that for all $A \in [a \cup [m, k_s], \mathbb{N}]$, we have
$$[f(A)_{2i_s}, f(A)_{2i_s+1}] = M_s.$$
Let now $k = \max_{s\in 2^n} k_s$ and $I_n^0 = [m, k]$. Then if $s \in 2^n$ and $a = J_0^{s_0} \cup \cdots \cup J_{n-1}^{s_{n-1}}$, we have for all $A \in [a \cup I_n^0, \mathbb{N}]$ some $i$ such that
$$[f(A)_{2i}, f(A)_{2i+1}] \subseteq I_n^0.$$
Again for each $s \in 2^n$ and $a = J_0^{s_0} \cup J_1^{s_1} \cup \cdots \cup J_{n-1}^{s_{n-1}}$ there are by continuity of $f$ some $l_s > k + 1$, some interval $k + 1 \le L_s \le l_s$ and an integer $j_s$ such that for all $A \in [a \cup [k + 1, l_s], \mathbb{N}]$, we have
$$[f(A)_{2j_s}, f(A)_{2j_s+1}] = L_s.$$
Let now $l = \max_{s\in 2^n} l_s + k$ and $I_n^1 = [k + 1, l]$. Then if $s \in 2^n$ and $a = J_0^{s_0} \cup \cdots \cup J_{n-1}^{s_{n-1}}$, we have for all $A \in [a \cup I_n^1, \mathbb{N}]$ some $j$ such that
$$[f(A)_{2j}, f(A)_{2j+1}] \subseteq I_n^1.$$
Finally, we simply let $I_n^2 = [l + 1, l + |I_n^1| - |I_n^0|]$. This finishes the construction. □
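The counting behind condition (1) and the $E_0$ case can be tried out on a toy layout. The Python sketch below is only an illustration under stated assumptions: the lists `len0` and `len1` are hypothetical interval lengths standing in for the lengths that the continuity of $f$ would dictate, and the function names are not from the paper. It lays out $I_n^0 < I_n^1 < I_n^2$ with $|I_n^2| = |I_n^1| - |I_n^0|$ and computes a finite truncation of $h(x)$.

```python
def build_intervals(len0, len1):
    """Lay out consecutive intervals I_n^0 < I_n^1 < I_n^2 for n = 0, 1, ...

    len0[n] and len1[n] are hypothetical lengths of I_n^0 and I_n^1
    (with len0[n] <= len1[n]).  I_n^2 gets length len1[n] - len0[n], so
    that J_n^0 = I_n^0 ∪ I_n^2 and J_n^1 = I_n^1 have the same size.
    """
    J0, J1, start = [], [], 0
    for a, b in zip(len0, len1):
        assert 0 < a <= b
        i0 = set(range(start, start + a))
        i1 = set(range(start + a, start + a + b))
        i2 = set(range(start + a + b, start + a + b + (b - a)))
        J0.append(i0 | i2)
        J1.append(i1)
        start += a + b + (b - a)
    return J0, J1


def h(x, J0, J1):
    """Finite truncation of h(x) = J_0^{x_0} ∪ J_1^{x_1} ∪ ... for a 0/1 tuple x."""
    out = set()
    for n, bit in enumerate(x):
        out |= J1[n] if bit else J0[n]
    return out
```

For two 0/1 strings that differ in only finitely many coordinates, the sets `h(x) - h(y)` and `h(y) - h(x)` computed this way have the same finite size, which is the situation in which the proof appeals to subspaces of equal finite codimension. The sketch does not model conditions (2) and (3), which depend on the function $f$.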
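Definition 7.4. We define a quasi order $\subseteq^*$ and a partial order $\subseteq_0$ on the space $[\mathbb{N}]$ of infinite subsets of $\mathbb{N}$ by the following conditions:
$$A \subseteq^* B \iff A \setminus B \text{ is finite}$$
and
$$A \subseteq_0 B \iff A = B \ \text{ or } \ \exists n \in B \setminus A:\ A \subseteq B \cup [0, n[.$$
Also, if $(a_n)$ and $(b_n)$ are infinite sequences of integers, we let
$$(a_n) \le^* (b_n) \iff \forall^\infty n\ a_n \le b_n.$$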
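As a sanity check on Definition 7.4, one can restrict to infinite sets that share a common tail, say $A = a \cup [N, \infty)$ and $B = b \cup [N, \infty)$ with $a, b \subseteq [0, N)$. The following Python sketch makes that assumption (it is a simplification for illustration, and the function names are hypothetical). On this class $A \setminus B$ and $B \setminus A$ are the finite sets $a \setminus b$ and $b \setminus a$, so $\subseteq_0$ can be tested directly and compared with the reformulation "the greatest element of $A \triangle B$ belongs to $B$".

```python
from itertools import combinations


def subsets(n):
    """All subsets of {0, ..., n-1} as frozensets."""
    return [frozenset(c) for r in range(n + 1) for c in combinations(range(n), r)]


def sub0_by_definition(a, b):
    """A ⊆_0 B per Definition 7.4, for A = a ∪ [N, ∞) and B = b ∪ [N, ∞).

    Because the tails coincide, B minus A equals b - a and A minus B
    equals a - b, so only the finite parts matter.
    """
    if a == b:
        return True
    # A ⊆ B ∪ [0, n[ holds exactly when every element of A minus B is below n.
    return any(all(x < n for x in a - b) for n in b - a)


def sub0_by_max(a, b):
    """Reformulation: A = B, or the greatest element of A △ B lies in B."""
    return a == b or max(a ^ b) in b


def check(n):
    """Compare both formulations on all pairs of subsets of {0, ..., n-1}."""
    sets = subsets(n)
    for a in sets:
        for b in sets:
            assert sub0_by_definition(a, b) == sub0_by_max(a, b)
            # On this class any two sets are comparable under ⊆_0 ...
            assert sub0_by_definition(a, b) or sub0_by_definition(b, a)
            # ... and the relation is antisymmetric.
            if sub0_by_definition(a, b) and sub0_by_definition(b, a):
                assert a == b
    return len(sets)
```

Calling `check(4)` runs through all pairs of subsets of $\{0, 1, 2, 3\}$. Since any two sets in this class differ by a finite set, they are all $\subseteq^*$-equivalent, so the check only probes how $\subseteq_0$ orders a single $\subseteq^*$-class.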
We notice that $\subseteq_0$ is a partial order refining the quasi order $\subseteq^*$, namely, whenever $A \subseteq^* B$ we let $A \subseteq_0 B$ if $B \not\subseteq^* A$ or $A = B$ or $A \triangle B$ admits a greatest element which belongs to $B$.

Proposition 7.5.
(1) Any closed partial order on a Polish space Borel embeds into $\subseteq_0$.
(2) Any partial order on a set of size at most $\aleph_1$ embeds into $\subseteq_0$.
(3) The quasi order $\subseteq^*$ embeds into $\subseteq_0$, but does not Borel embed.
(4) And finally $\subseteq_0$ Borel embeds into $\subseteq^*$.
Proof. (1) By an unpublished result of A. Louveau [29], any closed partial order on a Polish space Borel embeds into $(\mathcal{P}(\mathbb{N}), \subseteq)$. And if we let $(J_n)$ be a partition of $\mathbb{N}$ into countably many infinite subsets, we see that $(\mathcal{P}(\mathbb{N}), \subseteq)$ Borel embeds into $\subseteq^*$ and $\subseteq_0$ by the mapping $A \mapsto \bigcup_{n\in A} J_n$.

(2) & (3) It is well known that any partial order of size at most $\aleph_1$ embeds into $\subseteq^*$ and if we let $s : [\mathbb{N}] \to [\mathbb{N}]$ be any function such that $|A \triangle B| < \infty \Leftrightarrow s(A) = s(B)$ and $|A \triangle s(A)| < \infty$, i.e., $s$ is a selector for $E_0$, then $s$ embeds $\subseteq^*$ into $\subseteq_0$. To see that there cannot be a Borel embedding of $\subseteq^*$ into $\subseteq_0$, we notice that if $h : [\mathbb{N}] \to [\mathbb{N}]$ was a Borel function such that $A \subseteq^* B \Leftrightarrow h(A) \subseteq_0 h(B)$, then, in particular, $|A \triangle B|$ is finite $\Leftrightarrow h(A) = h(B)$, contradicting that $E_0$ is a non-smooth equivalence relation on $[\mathbb{N}]$.

(4) To see that $\subseteq_0$ Borel embeds into $\subseteq^*$, we define for an infinite subset $A$ of $\mathbb{N}$ a sequence of integers $g(A) = (a_n)$ by
$$a_n = \sum_{i\in A\cap[0,n]} 2^i.$$
Suppose now that $g(A) = (a_n)$ and $g(B) = (b_n)$. Then for each $n$,
$$a_n = b_n \iff A \cap [0, n] = B \cap [0, n]$$
and
$$a_n < b_n \iff \exists m \in B \setminus A,\ m \le n,\ A \cap [0, n] \subseteq B \cup [0, m[.$$
Thus, we have $a_n = b_n$ for infinitely many $n$ if and only if $A = B$, and if $a_n < b_n$ for infinitely many $n$, then either $B \setminus A$ is infinite or for some $m \in B \setminus A$ we have $A \subseteq B \cup [0, m[$. Moreover, if $B \setminus A$ is infinite, then for infinitely many $n$, $a_n < b_n$. So
$$B \not\subseteq^* A \ \Rightarrow\ (b_n) \not\le^* (a_n) \ \Rightarrow\ (B \not\subseteq^* A \text{ or } A \subseteq_0 B),$$
and thus by contraposition
$$(b_n) \le^* (a_n) \ \Rightarrow\ B \subseteq^* A.$$
Also, if $(b_n) \not\le^* (a_n)$, then $B \not\subseteq^* A$ or $A \subseteq_0 B$, so if moreover $B \subseteq_0 A$, we would have $A \subseteq_0 B$ and hence $A = B$, contradicting $g(B) = (b_n) \not\le^* (a_n) = g(A)$. Thus,
$$B \subseteq_0 A \ \Rightarrow\ (b_n) \le^* (a_n).$$
To see that also
$$(b_n) \le^* (a_n) \ \Rightarrow\ B \subseteq_0 A,$$
notice that if $(b_n) \le^* (a_n)$ but $B \not\subseteq_0 A$, then, as $B \subseteq^* A$, we must have $A \subseteq_0 B$ and hence $(a_n) \le^* (b_n)$. But then $a_n = b_n$ for almost all $n$ and thus $A = B$, contradicting $B \not\subseteq_0 A$. Therefore,
$$B \subseteq_0 A \iff (b_n) \le^* (a_n),$$
and we thus have a Borel embedding of $\subseteq_0$ into the quasi order $\le^*$ on the space $\mathbb{N}^{\mathbb{N}}$. It is well known and easy to see that this latter Borel embeds into $\subseteq^*$ and hence so does $\subseteq_0$. □

Proposition 7.6. Any Banach space without a minimal subspace contains a subspace with an F.D.D. $(F_n)$ satisfying one of the two following properties:
(a) if $A, B \subseteq \mathbb{N}$ are infinite, then
$$\sum_{n\in A} F_n \sqsubseteq \sum_{n\in B} F_n \iff A \subseteq^* B,$$
(b) if $A, B \subseteq \mathbb{N}$ are infinite, then
$$\sum_{n\in A} F_n \sqsubseteq \sum_{n\in B} F_n \iff A \subseteq_0 B.$$
Proof. Suppose $X$ is a Banach space without a minimal subspace. Then by Theorem 3.13, we can find a continuously tight basic sequence $(e_n)$ in $X$. Using the infinite Ramsey theorem for analytic sets, we can also find an infinite set $D \subseteq \mathbb{N}$ such that
(i) either for all infinite $B \subseteq D$, $[e_i]_{i\in B}$ embeds into its hyperplanes,
(ii) or for all $B \subseteq D$, $[e_i]_{i\in B}$ is not isomorphic to a proper subspace.
And, by Lemma 3.11, we can after renumbering the sequence $(e_n)_{n\in D}$ as $(e_n)_{n\in\mathbb{N}}$ suppose that there is a continuous function $f : [\mathbb{N}] \to [\mathbb{N}]$ such that for $A, B \in [\mathbb{N}]$, if $B$ is disjoint from an infinite number of intervals $[f(A)_{2i}, f(A)_{2i+1}]$, then $[e_n]_{n\in A}$ does not embed into $[e_n]_{n\in B}$.

We now construct a partition of $\mathbb{N}$ into intervals $I_0 < I_1 < I_2 < \cdots$ such that the following conditions hold:
– for all $n$, $|I_0 \cup \cdots \cup I_{n-1}| < |I_n|$,
– if $A \in [\mathbb{N}]$ and $I_n \subseteq A$, then for some $i$, $[f(A)_{2i}, f(A)_{2i+1}] \subseteq I_n$.
Suppose by induction that $I_0 < I_1 < \cdots < I_{n-1}$ have been chosen such that the conditions are satisfied. Let $m = \max I_{n-1} + 1$. For each $a \subseteq [0, m[$ there are by continuity of $f$ some $l_a > m$, some interval $m \le M_a \le l_a$ and an integer $i_a$ such that for all $A \in [a \cup [m, l_a], \mathbb{N}]$, we have $[f(A)_{2i_a}, f(A)_{2i_a+1}] = M_a$. Let now $l > \max_{a\subseteq[0,m[} l_a$ be such that $|I_0 \cup \cdots \cup I_{n-1}| < l - m$, and set $I_n = [m, l[$. Then if $a \subseteq [0, m[$, we have for all $A \in [a \cup I_n, \mathbb{N}]$ some $i$ such that $[f(A)_{2i}, f(A)_{2i+1}] \subseteq I_n$, which ends the construction.

Let now $F_n = [e_i]_{i\in I_n}$. Clearly, $\sum_{i=0}^{n-1} \dim F_i < \dim F_n$, and if $A \setminus B$ is infinite and we let $A^* = \bigcup_{n\in A} I_n$ and $B^* = \bigcup_{n\in B} I_n$, then $B^*$ will be disjoint from an infinite number of the intervals defined by $f(A^*)$ and hence $\sum_{n\in A} F_n = [e_n]_{n\in A^*}$ does not embed into $\sum_{n\in B} F_n = [e_n]_{n\in B^*}$.

In case of (i) we have that for all infinite $C \subseteq \mathbb{N}$,
$$(e_n)_{n\in C} \sqsubseteq (e_n)_{n\in C'} \sqsubseteq (e_n)_{n\in C''} \sqsubseteq (e_n)_{n\in C'''} \sqsubseteq \cdots,$$
where $D'$ denotes $D \setminus \min D$. So, in particular, for any infinite $A \subseteq \mathbb{N}$, $\sum_{n\in A} F_n$ embeds into all of its finite codimensional subspaces and thus if $A \setminus B$ is finite, then $\sum_{n\in A} F_n \sqsubseteq \sum_{n\in B} F_n$. This gives us (a).

In case (ii), if $A \subseteq_0 B$ but $B \not\subseteq_0 A$, we have, as $\dim F_n > \sum_{i=0}^{n-1} \dim F_i$, that $\sum_{n\in A} F_n$ embeds as a proper subspace of $\sum_{n\in B} F_n$. Conversely, if $\sum_{n\in A} F_n \sqsubseteq \sum_{n\in B} F_n$, then $A \setminus B$ is finite and so either $A \subseteq_0 B$ or $B \subseteq_0 A$. But if $B \subseteq_0 A$ and $A \not\subseteq_0 B$, then $\sum_{n\in B} F_n$ embeds as a proper subspace into $\sum_{n\in A} F_n$ and thus also into itself, contradicting (ii). Thus, $A \subseteq_0 B$. So assuming (ii) we have the equivalence in (b). □
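The two coordinatewise equivalences for the coding $g(A) = (a_n)$, $a_n = \sum_{i\in A\cap[0,n]} 2^i$, used in the proof of Proposition 7.5(4), involve only $A \cap [0, n]$ and $B \cap [0, n]$, so they can be checked exhaustively for small $n$. The Python sketch below does this. It is an illustration rather than part of the argument, the function names are hypothetical, and the parameter `extra` (how far the test sets may extend beyond $n$) is an assumption made to keep the search finite.

```python
from itertools import combinations


def g(A, n):
    """n-th term a_n of g(A): the sum of 2**i over i in A ∩ [0, n]."""
    return sum(2 ** i for i in A if i <= n)


def less_witness(A, B, n):
    """Right-hand side of the equivalence for a_n < b_n:
    some m in B minus A with m <= n and A ∩ [0, n] ⊆ B ∪ [0, m[."""
    An = {i for i in A if i <= n}
    return any(
        m <= n and all(i in B or i < m for i in An)
        for m in B - A
    )


def check(n, extra=2):
    """Test both equivalences on all pairs of subsets of [0, n + extra]."""
    universe = range(n + 1 + extra)
    sets = [set(c) for r in range(len(universe) + 1)
            for c in combinations(universe, r)]
    for A in sets:
        An = {i for i in A if i <= n}
        for B in sets:
            Bn = {i for i in B if i <= n}
            assert (g(A, n) == g(B, n)) == (An == Bn)
            assert (g(A, n) < g(B, n)) == less_witness(A, B, n)
    return len(sets)
```

Here `check(3)` compares all pairs of subsets of $\{0, \dots, 5\}$ at index $n = 3$. The underlying reason is that $a_n$ is the integer whose binary digits are the indicator of $A \cap [0, n]$, so comparing $a_n$ and $b_n$ amounts to asking which set contains the greatest element of the symmetric difference below $n + 1$.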
We may observe that Tsirelson’s space satisfies case (a) of Proposition 7.6, while case (b) is verified by Gowers–Maurey’s space, or more generally by any space of type (1) to (4). By Propositions 7.5 and 7.6 we now have the following result.

Theorem 7.7. Let $X$ be an infinite-dimensional separable Banach space without a minimal subspace and let $SB_\infty(X)$ be the standard Borel space of infinite-dimensional subspaces of $X$ ordered by the relation of isomorphic embeddability. Then $\subseteq_0$ Borel embeds into $SB_\infty(X)$ and by consequence
(a) any partial order of size at most $\aleph_1$ embeds into $SB_\infty(X)$,
(b) any closed partial order on a Polish space Borel embeds into $SB_\infty(X)$.

We notice that this proves a strong dichotomy for the partial orders of Problem 7.1, namely, either they must be of size 1 or must contain any partial order of size at most $\aleph_1$ and any closed partial order on a Polish space. In particular, in the second case we have well-ordered chains of length $\omega_1$ and also $\mathbb{R}$-chains. This completes the picture of [13].

8. Refining Gowers’ dichotomies

We recall the list of inevitable classes of subspaces contained in a Banach space given by Gowers in [20]. Remember that a space is said to be quasi minimal if any two subspaces have a common $\sqsubseteq$-minorant, and strictly quasi minimal if it is quasi minimal but does not contain a minimal subspace. Also two spaces are incomparable in case neither of them embeds into the other, and totally incomparable if no space embeds into both of them.

Theorem 8.1. (See Gowers [20].) Let $X$ be an infinite-dimensional Banach space. Then $X$ contains a subspace $Y$ with one of the following properties, which are all possible and mutually exclusive.
(i) $Y$ is hereditarily indecomposable,
(ii) $Y$ has an unconditional basis such that any two disjointly supported block subspaces are incomparable,
(iii) $Y$ has an unconditional basis and is strictly quasi minimal,
(iv) $Y$ has an unconditional basis and is minimal.
Here the condition of (ii) that any two disjointly supported block subspaces are incomparable, i.e., tightness by support, is equivalent to the condition that any two such subspaces are totally incomparable or just non-isomorphic. Theorem 1.1 improves the list of Gowers in case (iii). Indeed, any strictly quasi minimal space contains a tight subspace, but the space $S(T^{(p)})$, $1 < p < +\infty$, is strictly quasi minimal and not tight: it is saturated with subspaces of $T^{(p)}$, which is strictly quasi minimal, and, as was already observed, it is not tight because its canonical basis is symmetric. Concerning case (i), properties of HI spaces imply that any such space contains a tight subspace, but it remains open whether every HI space with a basis is tight.

Question 8.2. Is every HI space with a basis tight?
Using Theorems 1.1 and 1.4, we refine the list of inevitable spaces of Gowers to 6 main classes as follows.

Theorem 8.3. Let $X$ be an infinite-dimensional Banach space. Then $X$ contains a subspace $Y$ with one of the following properties, which are all mutually exclusive.
(1) $Y$ is hereditarily indecomposable and has a basis such that any two block subspaces with disjoint ranges are incomparable,
(2) $Y$ is hereditarily indecomposable and has a basis which is tight and sequentially minimal,
(3) $Y$ has an unconditional basis such that any two disjointly supported block subspaces are incomparable,
(4) $Y$ has an unconditional basis such that any two block subspaces with disjoint ranges are incomparable, and is quasi minimal,
(5) $Y$ has an unconditional basis which is tight and sequentially minimal,
(6) $Y$ has an unconditional basis and is minimal.

We conjecture that the space of Gowers and Maurey is of type (1), although we have no proof of this fact. Instead, in [14] we prove that an asymptotically unconditional HI space constructed by Gowers [18] is of type (1). We do not know whether type (2) spaces exist. If they do, they may be thought of as HI versions of type (5) spaces, i.e., of Tsirelson like spaces, so one might look for an example in the family initiated by the HI asymptotically $\ell_1$ space of Argyros and Deliyanni, whose “ground” space is a mixed Tsirelson’s space based on the sequence of Schreier families [2]. The first example of type (3) was built by Gowers [17] and further analysed in [22]. Other examples are constructed in [14]. Type (4) means that for any two block subspaces $Y$ and $Z$ with disjoint ranges, $Y$ does not embed into $Z$, but some further block subspace $Y'$ of $Y$ does ($Y'$ therefore has disjoint support but not disjoint range from $Z$). It is unknown whether there exist spaces of type (4).
Gowers sketched the proof of a weaker result, namely the existence of a strictly quasi minimal space with an unconditional basis and with the Casazza property, i.e., such that for no block sequence the sequence of odd vectors is equivalent to the sequence of even vectors, but his example was never actually checked. Alternatively, results of [26, Section 4] suggest that a mixed Tsirelson space example might be looked for. The main example of a space of type (5) is Tsirelson’s space. Actually since spaces of type (1) to (4) are either HI or satisfy the Casazza property, they are never isomorphic to a proper subspace. Therefore, for example, spaces with a basis saturated with block subspaces isomorphic to their hyperplanes must contain a subspace of type (5) or (6). So our results may reinforce the idea that Tsirelson’s space is the canonical example of a classical space without a minimal subspace. It is worth noting that as a consequence of the theorem of James, spaces of type (3), (4) and (5) are always reflexive. Using some of the additional dichotomies, one can of course refine this picture even further. We shall briefly consider how this can be done using the 5th dichotomy plus a stabilisation theorem of A. Tcaciuc [41] generalising a result of [15]. We state a slightly stronger version of the theorem of Tcaciuc than what is proved in his paper and also point out that there is an unjustified use of a recent result of Junge, Kutzarova and Odell in his paper; their result only holds for $1 \le p < \infty$. Tcaciuc’s theorem states that any Banach
space contains either a strongly asymptotically $\ell_p$ subspace, $1 \le p \le +\infty$, or a subspace $Y$ such that
\[ \forall M\ \exists n\ \forall U_1, \dots, U_{2n} \subseteq Y\ \exists x_i \in S_{U_i} \quad (x_{2i-1})_{i=1}^n \not\sim_M (x_{2i})_{i=1}^n, \]
where the $U_i$ range over infinite-dimensional subspaces of $Y$. The second property in this dichotomy will be called uniform inhomogeneity. As strongly asymptotically $\ell_p$ bases are unconditional, while the HI property is equivalent to uniform inhomogeneity with n = 2 for all M, Tcaciuc’s dichotomy is only relevant for spaces with an unconditional basis. When combining Theorem 8.3, Tcaciuc’s result, Proposition 4.2 (see also [9]), the 5th dichotomy, the fact that asymptotically $\ell_\infty$ spaces are locally minimal, and the classical theorem of James, we obtain 19 inevitable classes of spaces and examples for 8 of them. The class (2) is divided into two subclasses and the class (4) into four subclasses, which are not made explicit here for lack of an example of a space of type (2) or (4) to begin with. Recall that the spaces contained in any of the 12 subclasses of type (1)–(4) are never isomorphic to their proper subspaces, and in this sense these subclasses may be labeled “exotic”. On the contrary “classical”, “pure” spaces must belong to one of the 7 subclasses of type (5)–(6).
Theorem 8.4. Any infinite-dimensional Banach space contains a subspace with a basis of one of the following types:

Type   Properties                                                                                   Examples
(1a)   HI, tight by range and with constants                                                        ?
(1b)   HI, tight by range, locally minimal                                                          $G^*$
(2)    HI, tight, sequentially minimal                                                              ?
(3a)   tight by support and with constants, uniformly inhomogeneous                                 ?
(3b)   tight by support, locally minimal, uniformly inhomogeneous                                   $G_u^*$
(3c)   tight by support, strongly asymptotically $\ell_p$, $1 \le p < \infty$                       $X_u$
(3d)   tight by support, strongly asymptotically $\ell_\infty$                                      $X_u^*$
(4)    unconditional basis, quasi minimal, tight by range                                           ?
(5a)   unconditional basis, tight with constants, sequentially minimal, uniformly inhomogeneous     ?
(5b)   unconditional basis, tight, sequentially and locally minimal, uniformly inhomogeneous        ?
(5c)   tight with constants, sequentially minimal, strongly asymptotically $\ell_p$, $1 \le p < \infty$   $T$, $T^{(p)}$
(5d)   tight, sequentially minimal, strongly asymptotically $\ell_\infty$                           ?
(6a)   unconditional basis, minimal, uniformly inhomogeneous                                        $S$
(6b)   minimal, reflexive, strongly asymptotically $\ell_\infty$                                    $T^*$
(6c)   isomorphic to $c_0$ or $\ell_p$, $1 \le p < \infty$                                          $c_0$, $\ell_p$

We know of no space close to being of type (5a) or (5b). A candidate for (5d) could be the dual of some partly modified mixed Tsirelson’s space not satisfying the blocking principle
(see [26]). Schlumprecht’s space $S$ [39] does not contain an asymptotically $\ell_p$ subspace, therefore it contains a uniformly inhomogeneous subspace, which implies by minimality that $S$ itself is of type (6a). The definition and analysis of the spaces $G^*$, $G_u^*$, $X_u$ and $X_u^*$ can be found in [14]. For completeness we should mention that R. Wagner has also proved a dichotomy between asymptotic unconditionality and a strong form of the HI property [43]. His result could be used to further refine the cases of type (1) and (2).
9. Open problems
Problem 9.1.
(1) Does there exist a tight Banach space admitting a basis which is not tight?
(2) Does there exist a tight, locally block minimal and unconditional basis?
(3) Find a locally minimal and tight Banach space with finite cotype.
(4) Does there exist a tight Banach space which does not contain a basic sequence that is either tight by range or tight with constants? In other words, does there exist a locally and sequentially minimal space without a minimal subspace?
(5) Suppose $[e_n]$ is sequentially minimal. Does there exist a block basis all of whose subsequences are subsequentially minimal?
(6) Is every HI space with a basis tight?
(7) Is every tight basis continuously tight?
(8) Do there exist spaces of type (2), (4), (5a), (5b), (5d)?
(9) Suppose $(e_n)$ is tight with constants. Does $(e_n)$ have a block sequence that is (strongly) asymptotically $\ell_p$ for some $1 \le p < \infty$?
(10) Does there exist a separable HI space $X$ such that $\subseteq^*$ Borel embeds into $SB_\infty(X)$?
(11) If $X$ is a separable Banach space without a minimal subspace, does $\subseteq^*$ Borel embed into $SB_\infty(X)$? What about more complicated quasi orders, in particular, the complete analytic quasi order $\le_{\Sigma_1^1}$?
References
[1] G. Androulakis, N. Kalton, A. Tcaciuc, On Banach spaces containing $\ell_p$ or $c_0$, preprint.
[2] S. Argyros, I. Deliyanni, Examples of asymptotic $\ell_1$ Banach spaces, Trans. Amer. Math. Soc. 349 (1997) 973–995.
[3] S. Argyros, I. Deliyanni, D. Kutzarova, A. Manoussakis, Modified mixed Tsirelson spaces, J. Funct. Anal. 159 (1998) 43–109.
[4] J. Bagaria, J. López-Abad, Weakly Ramsey sets in Banach spaces, Adv. Math. 160 (2) (2001) 133–174.
[5] A. Brunel, L. Sucheston, On B-convex Banach spaces, Math. Systems Theory 7 (4) (1974) 294–299.
[6] P.G. Casazza, W.B. Johnson, L. Tzafriri, On Tsirelson’s space, Israel J. Math. 47 (2–3) (1984) 81–98.
[7] P.G. Casazza, N.J. Kalton, Unconditional bases and unconditional finite-dimensional decompositions in Banach spaces, Israel J. Math. 95 (1996) 349–373.
[8] P.G. Casazza, T. Shura, Tsirelson’s Space, Lecture Notes in Math., vol. 1363, Springer-Verlag, Berlin, 1989.
[9] S. Dilworth, V. Ferenczi, D. Kutzarova, E. Odell, On strongly asymptotically $\ell_p$ spaces and minimality, J. London Math. Soc. (2) 75 (2007) 409–419.
[10] V. Ferenczi, Minimal subspaces and isomorphically homogeneous sequences in a Banach space, Israel J. Math. 156 (2006) 125–140.
[11] V. Ferenczi, A. Louveau, C. Rosendal, The complexity of classifying separable Banach spaces up to isomorphism, J. London Math. Soc., in press.
[12] V. Ferenczi, C. Rosendal, Ergodic Banach spaces, Adv. Math. 195 (1) (2005) 259–282.
[13] V. Ferenczi, C. Rosendal, Complexity and homogeneity in Banach spaces, in: Beata Randrianantoanina, Narcisse Randrianantoanina (Eds.), Banach Spaces and Their Applications in Mathematics, Walter de Gruyter, Berlin, 2007, pp. 83–110.
[14] V. Ferenczi, C. Rosendal, Banach spaces without minimal subspaces – Examples, preprint.
[15] T. Figiel, R. Frankiewicz, R.A. Komorowski, C. Ryll-Nardzewski, Selecting basic sequences in φ-stable Banach spaces, Dedicated to Professor Aleksander Pełczyński on the occasion of his 70th birthday, Studia Math. 159 (3) (2003) 499–515.
[16] T. Figiel, J. Lindenstrauss, V.D. Milman, The dimension of almost spherical sections of convex bodies, Acta Math. 139 (1977) 53–94.
[17] W.T. Gowers, A solution to Banach’s hyperplane problem, Bull. London Math. Soc. 26 (6) (1994) 523–530.
[18] W.T. Gowers, A hereditarily indecomposable space with an asymptotic unconditional basis, in: Geometric Aspects of Functional Analysis, Israel, 1992–1994, in: Oper. Theory Adv. Appl., vol. 77, Birkhäuser, Basel, 1995, pp. 112–120.
[19] W.T. Gowers, A new dichotomy for Banach spaces, Geom. Funct. Anal. 6 (6) (1996) 1083–1093.
[20] W.T. Gowers, An infinite Ramsey theorem and some Banach space dichotomies, Ann. of Math. (2) 156 (3) (2002) 797–833.
[21] W.T. Gowers, B. Maurey, The unconditional basic sequence problem, J. Amer. Math. Soc. 6 (4) (1993) 851–874.
[22] W.T. Gowers, B. Maurey, Banach spaces with small spaces of operators, Math. Ann. 307 (1997) 543–568.
[23] W.B. Johnson, A reflexive Banach space which is not sufficiently Euclidean, Studia Math. 55 (1978) 201–205.
[24] A.S. Kechris, Classical Descriptive Set Theory, Springer-Verlag, New York, 1995.
[25] J.L. Krivine, Sous-espaces de dimension finie des espaces de Banach réticulés, Ann. of Math. (2) 104 (1) (1976) 1–29.
[26] D. Kutzarova, D.H. Leung, A. Manoussakis, W.K. Tang, Minimality properties of Tsirelson type spaces, Studia Math. 187 (3) (2008) 233–263.
[27] J. Lindenstrauss, L. Tzafriri, Classical Banach Spaces, Springer-Verlag, New York, Heidelberg, Berlin, 1979.
[28] J. López-Abad, Coding into Ramsey sets, Math. Ann. 332 (4) (2005) 775–794.
[29] A. Louveau, Closed orders and their vicinity, preprint, 2001.
[30] B. Maurey, V. Milman, N. Tomczak-Jaegermann, Asymptotic infinite-dimensional theory of Banach spaces, in: Geometric Aspects of Functional Analysis, Israel, 1992–1994, in: Oper. Theory Adv. Appl., vol. 77, Birkhäuser, Basel, 1995, pp. 149–175.
[31] E. Odell, T. Schlumprecht, The distortion problem, Acta Math. 173 (2) (1994) 259–281.
[32] E. Odell, T. Schlumprecht, On the richness of the set of p’s in Krivine’s theorem, in: Geometric Aspects of Functional Analysis, Israel, 1992–1994, in: Oper. Theory Adv. Appl., vol. 77, Birkhäuser, Basel, 1995, pp. 177–198.
[33] E. Odell, T. Schlumprecht, A Banach space block finitely universal for monotone bases, Trans. Amer. Math. Soc. 352 (4) (2000) 1859–1888.
[34] E. Odell, T. Schlumprecht, Trees and branches in Banach spaces, Trans. Amer. Math. Soc. 354 (10) (2002) 4085–4108.
[35] A.M. Pełczar, Subsymmetric sequences and minimal spaces, Proc. Amer. Math. Soc. 131 (3) (2003) 765–771.
[36] A. Pełczyński, Universal bases, Studia Math. 32 (1969) 247–268.
[37] C. Rosendal, Incomparable, non-isomorphic and minimal Banach spaces, Fund. Math. 183 (3) (2004) 253–274.
[38] C. Rosendal, Infinite asymptotic games, Ann. Inst. Fourier, in press.
[39] T. Schlumprecht, An arbitrarily distortable Banach space, Israel J. Math. 76 (1–2) (1991) 81–95.
[40] T. Schlumprecht, unpublished notes.
[41] A. Tcaciuc, On the existence of asymptotic-$\ell_p$ structures in Banach spaces, Canad. Math. Bull. 50 (4) (2007) 619–631.
[42] B.S. Tsirelson, Not every Banach space contains $\ell_p$ or $c_0$, Funct. Anal. Appl. 8 (1974) 138–141.
[43] R. Wagner, Finite high-order games and an inductive approach towards Gowers’s dichotomy, in: Proceedings of the International Conference “Analyse & Logique”, Mons, 1997, Ann. Pure Appl. Logic 111 (1–2) (2001) 39–60.
Journal of Functional Analysis 257 (2009) 194–218 www.elsevier.com/locate/jfa
Orbits in symmetric spaces ✩ F. Sukochev a,∗ , D. Zanin b a School of Mathematics and Statistics, University of New South Wales, Kensington, NSW 2052, Australia b School of Computer Science, Engineering and Mathematics, Flinders University, Bedford Park, SA 5042, Australia
Received 10 November 2008; accepted 21 January 2009 Available online 20 February 2009 Communicated by N. Kalton
Abstract We characterize those elements in fully symmetric spaces on the interval (0, 1) or on the semi-axis (0, ∞) whose orbits are the norm-closed convex hull of their extreme points. © 2009 Elsevier Inc. All rights reserved. Keywords: Symmetric function spaces; Orbits; Closed convex hull of extreme points
✩ Research was partially supported by the ARC.
* Corresponding author.
E-mail addresses: [email protected] (F. Sukochev), [email protected] (D. Zanin).
0022-1236/$ – see front matter © 2009 Elsevier Inc. All rights reserved. doi:10.1016/j.jfa.2009.01.025
1. Introduction
The following semigroups of bounded linear operators play a fundamental role in the interpolation theory of linear operators for the couple $(L_1, L_\infty)$ of Lebesgue measurable functions on intervals (0, 1) and (0, ∞). The semigroup of absolute contractions, or admissible operators (see e.g. [10, II.3.4])
\[ \Sigma := \{T : L_1 + L_\infty \to L_1 + L_\infty : \max(\|T\|_{L_1 \to L_1}, \|T\|_{L_\infty \to L_\infty}) \le 1\}, \]
the semigroup of substochastic operators (see e.g. [2, p. 107])
\[ \Sigma_+ := \{0 \le T \in \Sigma\} \]
and, in the case of the interval (0, 1), the semigroup of doubly stochastic operators
\[ \Sigma' := \Big\{0 \le T \in \Sigma_+ : \int_0^1 (Tx)(s)\,ds = \int_0^1 x(s)\,ds,\ \forall x \ge 0,\ T1 = 1\Big\} \]
(see e.g. [13]). If $x \in L_1 + L_\infty$ (respectively, $0 \le x \in L_1 + L_\infty$ or $0 \le x \in L_1(0, 1)$) we denote by $\Omega(x)$ (respectively, $\Omega_+(x)$ and $\Omega'(x)$) the orbit of $x$ with respect to the semigroups $\Sigma$ (respectively, $\Sigma_+$ and $\Sigma'$). A Banach function space $E$ (on (0, 1) or (0, ∞), see [2, pp. 2–3]) is called an exact interpolation space if every $T \in \Sigma$ maps $E$ into itself and $\|T\|_{E \to E} \le 1$, or alternatively, if $\Omega(x) \subset E$ and $\|y\|_E \le \|x\|_E$ for every $x \in E$ and every $y \in \Omega(x)$. The class of exact interpolation spaces admits an equivalent description in terms of (sub)majorization in the sense of Hardy, Littlewood and Pólya. Recall that if $x, y \in L_1 + L_\infty$, then $y$ is said to be submajorized by $x$ in the sense of Hardy, Littlewood and Pólya, written $y \prec\prec x$, if and only if
\[ \int_0^t y^*(s)\,ds \le \int_0^t x^*(s)\,ds, \qquad t \ge 0. \]
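The submajorization test above can be tried out on step functions. The following is a minimal sketch, assuming both functions are piecewise constant on the same number of cells of equal length, so that the rearrangement $x^*$ reduces to sorting absolute values and the integrals reduce to partial sums; the names `decreasing_rearrangement` and `is_submajorized` are illustrative and do not come from the paper.

```python
from itertools import accumulate

def decreasing_rearrangement(values):
    """Discrete analogue of x*: absolute values sorted in non-increasing order."""
    return sorted((abs(v) for v in values), reverse=True)

def is_submajorized(y, x, tol=1e-12):
    """Check y << x for step functions on equal cells.

    Every partial sum of y* must be at most the matching partial sum of x*.
    Assumes len(y) == len(x) (same partition into equal cells).
    """
    ys = accumulate(decreasing_rearrangement(y))
    xs = accumulate(decreasing_rearrangement(x))
    return all(a <= b + tol for a, b in zip(ys, xs))

x = [4.0, 0.0, 1.0, 3.0]
y = [2.0, 2.0, 2.0, 2.0]  # the average of x over all four cells

print(is_submajorized(y, x))  # True: averaging produces a submajorized function
print(is_submajorized(x, y))  # False: the largest value of x exceeds that of y
```

Here `y` also has the same total mass as `x`, so in this discrete picture it is majorized by `x`, not merely submajorized.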
Here, $x^*$ denotes the non-increasing right-continuous rearrangement of $x$ given by
\[ x^*(t) = \inf\{s \ge 0 : m(\{|x| \ge s\}) \le t\} \]
and $m$ is Lebesgue measure. If $0 \le x, y \in L_1$, then we say that $y$ is majorized by $x$ (written $y \prec x$) if and only if $y \prec\prec x$ and $\|y\|_1 = \|x\|_1$. A Banach function space $E$ is said to be fully symmetric if and only if $x \in E$, $y \in L_1 + L_\infty$, $y \prec\prec x$ imply $y \in E$ and $\|y\|_E \le \|x\|_E$. The classical Calderón–Mityagin theorem (see [4,10,2]) gives an alternative description of the sets $\Omega(x)$, $x \in L_1 + L_\infty$, and $\Omega_+(x)$, $0 \le x \in L_1 + L_\infty$, as follows:
\[ \Omega(x) = \{y \in L_1 + L_\infty : y \prec\prec x\}, \qquad \Omega_+(x) = \{0 \le y \in L_1 + L_\infty : y \prec\prec x\} \]
and (in the case of the interval (0, 1) and $0 \le x \in L_1(0, 1)$)
\[ \Omega'(x) = \{0 \le y \in L_1 : y \prec x\}, \]
which shows, in particular, that the classes of exact interpolation spaces and fully symmetric spaces coincide. Let a fully symmetric Banach function space $E$ be fixed. The principal aim of the paper is to give conditions for a given $0 \le x \in E$ which are necessary and sufficient for each of the sets $\Omega_+(x)$, $\Omega'(x)$ to be the norm closure of the convex hull of their extreme points. If $E = L_1(0, 1)$, then it has been shown by Ryff (see [13]) that if $0 \le x \in E$, then the orbit $\Omega'(x)$ is weakly compact and so, due to the Krein–Milman theorem, the orbit $\Omega'(x)$ is the weak (and hence norm)-closed convex hull of its extreme points. It follows from the results of [7] that the set $\Omega'(x)$ is weakly compact in any separable symmetric space $E$. Hence, $\Omega'(x)$ is the weak (and hence norm)-closed convex hull of its extreme points in any separable symmetric space $E$. If a fully symmetric space $E$ is not separable, then it is not the case in general that orbits are weakly compact. A trivial example is given by the orbit $\Omega(\chi_{[0,1]})$ in the fully symmetric non-separable
space $L_\infty(0, 1)$. Indeed, it is obvious that this orbit coincides with the unit ball of $L_\infty(0, 1)$ and the latter is not weakly compact since the space $L_\infty(0, 1)$ is non-reflexive. Nonetheless, it is an interesting question to give necessary and sufficient conditions that the orbit of a given element should be the norm-closed convex hull of its extreme points. This question was considered by Braverman and Mekler (see [3]) in the case of the interval (0, 1) and orbits $\Omega(x)$. They showed that for every fully symmetric space $E$ on (0, 1) satisfying the condition
\[ \lim_{\tau \to \infty} \frac{1}{\tau} \|\sigma_\tau\|_{E \to E} = 0 \tag{1} \]
the orbit $\Omega(x)$ is indeed the norm-closed convex hull of the set of its extreme points, for every $x \in E$ (see [3, Theorem 3.1]). Here $\sigma_\tau$ denotes the usual dilation operator (see the following section for definition and properties). They showed as well that the converse assertion is valid in case that $E$ is a Marcinkiewicz space on (0, 1). As explained below, this converse assertion, however, fails for arbitrary fully symmetric spaces. We show (Theorem 21) that if $E$ is a fully symmetric space on (0, 1) and if $0 \le x \in E$, then $\Omega'(x)$ is the norm-closed convex hull of its extreme points if and only if
\[ \varphi(x) := \lim_{\tau \to \infty} \frac{1}{\tau} \|\sigma_\tau(x^*)\|_E = 0. \tag{2} \]
As shown in Corollary 27 this implies the result of Braverman and Mekler. In Appendix A, we demonstrate that the conditions (1) and (2) are distinct in the class of Orlicz spaces. If $E$ is an Orlicz space, then it is the case that (2) holds, and so by Theorems 21 and 22 for every $0 \le x \in E$, the sets $\Omega'(x)$, $\Omega_+(x)$ and $\Omega(x)$ are the norm-closed convex hulls of their extreme points. However, there are non-separable Orlicz spaces $E$ which fail condition (1). In Appendix A, we also introduce the notion of symmetric and fully symmetric functionals. The latter are a “commutative” counterpart of the Dixmier traces appearing in non-commutative geometry (see e.g. [5]). Symmetric and fully symmetric functionals are extensively studied in [8,9] (see also [5] and references therein). Note, however, that our terminology differs from that used in the articles just cited. A subclass of Marcinkiewicz spaces admitting symmetric functionals which fail to be fully symmetric is described in [9]. It follows from our results that any symmetric functional on a fully symmetric space satisfying (2) is automatically fully symmetric. In particular, this implies that an Orlicz space does not possess any singular symmetric functionals (see Proposition 34). This latter result strengthens the result of [8, Theorem 3.1] that an Orlicz space does not possess any singular fully symmetric functionals. Results similar to Theorems 21 and 22 hold also for fully symmetric spaces $E$ on the semi-axis (see Theorems 23–26). The main results of this article are contained in Section 4. In the following section we present some definitions from the theory of symmetric spaces, as some of our results hold in a slightly more general setting than that of fully symmetric spaces. For more details on the latter theory we refer to [10,11,2]. Section 3 treats various properties of the functional $\varphi$ and its modifications needed in Section 4.
We would like to emphasize the difference between geometric properties of the orbit $\Omega(x)$ and those of $\Omega'(x)$ and $\Omega_+(x)$. This is especially noticeable in the description of the respective sets of their extreme points. The extreme points of the sets $\Omega(x)$ and $\Omega'(x)$, $x \ge 0$, are well known (see [14,6]) and are given by
\[ \operatorname{extr} \Omega(x) = \{y \in L_1 + L_\infty : y^* = x^*\}, \qquad \operatorname{extr} \Omega'(x) = \{0 \le y \in L_1 : y^* = x^*\}, \]
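whereas the description of extreme points of the set $\Omega_+(x)$, $x \ge 0$, given by $\operatorname{extr} \Omega_+(x) = \{0 \le y \in L_1 + L_\infty : y^* = x^* \chi_{[0,\beta]} \text{ for some } \beta \le \infty\}$ when $x^*(\infty) := \lim_{t \to \infty} x^*(t) = 0$, and by extr $\Omega_+(x)$ = {$0 \le y \in L_1 + L_\infty$ : $y^* = x^* \chi_{[0,\beta]}$ for some $\beta \le \infty$ and yχ{y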
0, is somewhat less known, so we present in Appendix A a careful exposition of the latter equalities.
2. Preliminaries
Let $L_0$ be the space of Lebesgue measurable functions either on (0, 1) or (0, ∞) finite almost everywhere (with identification $m$-a.e.). Here $m$ is Lebesgue measure. Define $S_0$ as the subset of $L_0$ which consists of all functions $x$ such that $m(\{|x| > s\})$ is finite for some $s$. Let $E$ be a Banach space of real-valued Lebesgue measurable functions either on (0, 1) or (0, ∞) (with identification $m$-a.e.). $E$ is said to be an ideal lattice if $x \in E$ and $|y| \le |x|$ implies that $y \in E$ and $\|y\|_E \le \|x\|_E$. The ideal lattice $E \subseteq S_0$ is said to be a symmetric space if for every $x \in E$ and every $y$ the assumption $y^* = x^*$ implies that $y \in E$ and $\|y\|_E = \|x\|_E$. If $E = E(0, 1)$ is a symmetric space on (0, 1), then $L_\infty \subseteq E \subseteq L_1$. If $E = E(0, \infty)$ is a symmetric space on (0, ∞), then $L_1 \cap L_\infty \subseteq E \subseteq L_1 + L_\infty$. A symmetric space $E$ is said to be fully symmetric if and only if $x \in E$, $y \in L_1 + L_\infty$, $y \prec\prec x$ imply $y \in E$ and $\|y\|_E \le \|x\|_E$. We now gather some additional terminology from the theory of symmetric spaces that will be needed in the sequel. Suppose $E$ is a symmetric space. Following [3], $E$ will be called strictly symmetric if and only if whenever $x, y \in E$ and $y \prec\prec x$ then $\|y\|_E \le \|x\|_E$. It is clear that if $E$ is fully symmetric then $E$ is strictly symmetric, but the converse assertion is not valid. The norm $\|\cdot\|_E$ is called a Fatou norm if, for every sequence $x_n \uparrow x \in E$, it follows that $\|x_n\|_E \uparrow \|x\|_E$. This is equivalent to the assertion that the unit ball of $E$ is closed with respect to almost everywhere convergence. It is well known that if the norm on $E$ is a Fatou norm then $E$ is strictly symmetric. If $\tau > 0$, the dilation operator $\sigma_\tau$ is defined by setting $(\sigma_\tau(x))(s) = x(s/\tau)$, $s > 0$, in the case of the semi-axis. In the case of the interval (0, 1) the operator $\sigma_\tau$ is defined by
\[ (\sigma_\tau x)(s) = \begin{cases} x(s/\tau), & s \le \min\{1, \tau\}, \\ 0, & \tau < s \le 1. \end{cases} \]
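The definition of the dilation operator on the interval (0, 1) can be sketched directly in code. The snippet below is an illustration only, assuming functions on (0, 1] are represented as Python callables; the helper name `dilate` is hypothetical. It checks the composition rule for two dilations with factors at least 1 at a few sample points.

```python
def dilate(x, tau):
    """Return sigma_tau x on (0, 1], where x is a callable defined on (0, 1]."""
    def dilated(s):
        if not 0.0 < s <= 1.0:
            raise ValueError("s must lie in (0, 1]")
        # x(s / tau) on (0, min{1, tau}], zero on (tau, 1]
        return x(s / tau) if s <= min(1.0, tau) else 0.0
    return dilated

x = lambda s: 1.0 - s  # a sample non-increasing function on (0, 1]

# Stretching (tau >= 1): sigma_3 sigma_2 x agrees with sigma_6 x.
f = dilate(dilate(x, 2.0), 3.0)
g = dilate(x, 6.0)
print(all(abs(f(s) - g(s)) < 1e-12 for s in (0.1, 0.5, 1.0)))

# Compression (tau < 1): the function is squeezed into (0, tau] and vanishes beyond.
h = dilate(x, 0.5)
print(h(0.25), h(0.75))
```

For factors below 1 the truncation at $\tau$ loses information, which is why the composition check is only run with both factors at least 1.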
The operators $\sigma_\tau$ ($\tau \ge 1$) satisfy the semigroup property $\sigma_{\tau_1} \sigma_{\tau_2} = \sigma_{\tau_1 \tau_2}$. If $E$ is a symmetric space and if $\tau > 0$, then the dilation operator $\sigma_\tau$ is a bounded operator on $E$ and $\|\sigma_\tau\|_{E \to E} \le \max\{1, \tau\}$. We will need also the notion of a partial averaging operator (see [3]). Let $A = \{A_k\}$ be a (finite or infinite) sequence of disjoint sets of finite measure and denote by $\mathfrak{A}$ the collection of all such sequences. Denote by $A_\infty$ the complement of $\bigcup_k A_k$. The partial averaging operator is defined by
\[ P(x|A) = \sum_k \frac{1}{m(A_k)} \Big( \int_{A_k} x(s)\,ds \Big) \chi_{A_k} + x \chi_{A_\infty}. \]
Note that we do not require $A_\infty$ to have finite measure. Every partial averaging operator is a contraction both in $L_1$ and $L_\infty$. Hence, $P(\cdot|A)$ is also a contraction in $E$. In the case of the interval (0, 1), $P(\cdot|A)$ is a doubly stochastic operator in the sense of [13]. Since $P(\cdot|A) \in \Sigma$, then $P(x|A) \in \Omega(x)$ (respectively, $P(x|A) \in \Omega'(x)$ if $x \in L_1$) for every $A \in \mathfrak{A}$. As will be seen, elements of the form $P(x|A)$ play a central role. The following properties of rearrangements can be found in [10]. If $x, y \in L_1 + L_\infty$, then
\[ (x + y)^* \prec\prec x^* + y^* \tag{3} \]
and
\[ (x^* - y^*) \prec\prec (x - y)^*. \tag{4} \]
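A discrete version of the partial averaging operator makes the inclusion $P(x|A) \in \Omega(x)$ easy to check by hand. The sketch below assumes a step function on cells of equal length, stored as a list, with each set $A_k$ given as a list of cell indices; `partial_average` and `rearranged_partial_sums` are illustrative names, not notation from the paper.

```python
from itertools import accumulate

def partial_average(x, blocks):
    """Discrete P(x|A): replace x by its mean on each block, keep it elsewhere.

    `blocks` is a list of disjoint lists of cell indices (the sets A_k);
    cells not covered by any block play the role of A_infinity.
    """
    out = list(x)
    for block in blocks:
        mean = sum(x[i] for i in block) / len(block)
        for i in block:
            out[i] = mean
    return out

def rearranged_partial_sums(v):
    """Partial sums of the decreasing rearrangement of |v|."""
    return list(accumulate(sorted((abs(t) for t in v), reverse=True)))

x = [5.0, 1.0, 0.0, 2.0, 4.0]
blocks = [[0, 1], [3, 4]]          # cell 2 is left untouched
p = partial_average(x, blocks)

print(p)                            # [3.0, 3.0, 0.0, 3.0, 3.0]
print(rearranged_partial_sums(p))   # [3.0, 6.0, 9.0, 12.0, 12.0]
print(rearranged_partial_sums(x))   # [5.0, 9.0, 11.0, 12.0, 12.0]
```

Each partial sum for `p` is bounded by the corresponding one for `x`, and the totals agree, which is the discrete counterpart of $P(x|A) \prec x$ for non-negative $x$.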
Let us recall some classical examples of fully symmetric spaces. Let $\psi$ be a concave increasing continuous function. The Marcinkiewicz space $M_\psi$ is the linear space of those functions $x \in S_0$ for which
\[ \|x\|_{M_\psi} = \sup_t \frac{1}{\psi(t)} \int_0^t x^*(s)\,ds < \infty. \]
Equipped with the norm $\|x\|_{M_\psi}$, $M_\psi$ is a fully symmetric space with Fatou norm. Let $M(t)$ be a convex function on [0, ∞) such that $M(t) > 0$ for all $t > 0$ and such that
\[ 0 = M(0) = \lim_{t \to 0} \frac{M(t)}{t} = \lim_{t \to \infty} \frac{t}{M(t)}. \]
Denote by $L_M$ the Orlicz space on [0, ∞) (see e.g. [11,10]) endowed with the norm
\[ \|x\|_{L_M} = \inf\Big\{\lambda : \lambda > 0,\ \int_0^\infty M\big(|x(t)|/\lambda\big)\,dt \le 1\Big\}. \]
Equipped with the norm $\|x\|_{L_M}$, $L_M$ is a fully symmetric space with Fatou norm.
(5)
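The Orlicz norm defined above is an infimum over $\lambda$, so for concrete functions it can be approximated by bisection. The following is a minimal sketch, assuming $x$ is a step function given by its values and the lengths of the cells on which they are taken; the function name `luxemburg_norm`, the tolerance, and the choice $M(t) = t^2$ in the usage lines are illustrative assumptions.

```python
import math

def luxemburg_norm(values, widths, M, tol=1e-12):
    """Approximate inf{lam > 0 : sum_i widths[i] * M(|values[i]| / lam) <= 1}.

    Relies on M being convex with M(0) = 0, so that the modular below is
    non-increasing in lam and bisection applies.
    """
    if all(v == 0 for v in values):
        return 0.0

    def modular(lam):
        return sum(w * M(abs(v) / lam) for v, w in zip(values, widths))

    lo, hi = 0.0, 1.0
    while modular(hi) > 1.0:      # enlarge hi until it is admissible
        hi *= 2.0
    while hi - lo > tol * hi:     # shrink the bracket around the infimum
        mid = 0.5 * (lo + hi)
        if modular(mid) <= 1.0:
            hi = mid
        else:
            lo = mid
    return hi

# With M(t) = t^2 the Orlicz norm is the usual L_2 norm.
norm = luxemburg_norm([3.0, 4.0], [1.0, 1.0], lambda t: t * t)
print(abs(norm - math.hypot(3.0, 4.0)) < 1e-9)
```

The comparison with the $L_2$ norm is only a sanity check of the bisection; other admissible $M$ generally have no closed form for the norm.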
For further properties of Marcinkiewicz and Orlicz spaces, we refer to [10–12]. For 0 x ∈ L1 + L∞ , we set Q+ (x) = Conv extr Ω+ (x) . For 0 x ∈ L1 , we set Q (x) = Conv extr Ω (x) . For 0 x ∈ L1 + L∞ , we set Q (x) = Conv{y ∗ = x ∗ , yχ{y
(6)
2
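Proof. Fix $\varepsilon > 0$. It follows from [10, II.2.1] that
\[ \int_0^t x^*(s)\,ds \le \varepsilon + \int_{e_1} x(s)\,ds, \qquad \int_0^t y^*(s)\,ds \le \varepsilon + \int_{e_2} y(s)\,ds \]
for some $e_1$ and $e_2$ with $m(e_i) = t$. However,
\[ \int_{e_1} x(s)\,ds + \int_{e_2} y(s)\,ds \le \int_{e_1 \cup e_2} (x + y)(s)\,ds \le \sup_{m(e) = 2t} \int_e (x + y)(s)\,ds = \int_0^{2t} (x + y)^*(s)\,ds. \]
Note that
\[ \int_0^{2t} u(s)\,ds = \int_0^t (2\sigma_{1/2} u)(s)\,ds. \]
□
Lemma 2. If $x, y \in L_1 + L_\infty$ and $y \prec\prec x$, then
\[ (\sigma_\tau(y))^* \le \sigma_\tau(y^*) \prec\prec \sigma_\tau(x^*). \]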
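Proof. Set $d_y(s) = m(t : |y(t)| > s)$. In the case of the semi-axis, $d_{\sigma_\tau y} = \tau d_y = d_{\sigma_\tau(y^*)}$. In the case of the interval (0, 1), $d_{\sigma_\tau y} \le \tau d_y$ and $d_{\sigma_\tau(y^*)} = \min\{1, \tau d_y\}$. Hence, $d_{\sigma_\tau y} \le d_{\sigma_\tau(y^*)}$ and so $(\sigma_\tau(y))^* \le \sigma_\tau(y^*)$. Finally,
\[ \int_0^t \sigma_\tau(y^*)(s)\,ds = \tau \int_0^{t/\tau} y^*(s)\,ds \le \tau \int_0^{t/\tau} x^*(s)\,ds = \int_0^t \sigma_\tau(x^*)(s)\,ds. \]
□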
The next lemma introduces the dilation functional $\varphi$ on $E$, which is a priori non-linear. The behavior of the functional $\varphi$ on the positive part $E_+$ of $E$ provides the key to our main question.
Lemma 3. For every $x \in E$ the following limit exists and is finite.
\[ \varphi(x) = \lim_{s \to \infty} \frac{1}{s} \|\sigma_s(x^*)\|_E, \qquad x \in E. \tag{7} \]
If, in addition, $E = E(0, \infty)$, then the following limits exist and are finite.
\[ \varphi_{fin}(x) = \lim_{s \to \infty} \frac{1}{s} \|\sigma_s(x^*) \chi_{[0,1]}\|_E, \qquad x \in E, \tag{8} \]
\[ \varphi_{cut}(x) = \lim_{s \to \infty} \frac{1}{s} \|\sigma_s(x^*) \chi_{[0,s]}\|_E, \qquad x \in E. \tag{9} \]
The following properties hold.
(i) If $E$ is symmetric, then $\varphi(y) \le \varphi(x)$ provided that $x, y \in E$ satisfy $y^* \le x^*$.
(ii) If $E$ is symmetric, then $\varphi(x) \le \|x\|_E$ for every $x \in E$.
(iii) If $E$ is strictly symmetric, then $\varphi(y) \le \varphi(x)$ provided that $x, y \in E$ satisfy $y \prec\prec x$.
(iv) If $E$ is symmetric, then $\varphi(\sigma_\tau(x^*)) = \tau \varphi(x)$, $\tau > 0$.
(v) If $E$ is strictly symmetric, then $\varphi$ is norm-continuous.
(vi) If $E$ is strictly symmetric, then $\varphi$ is convex.
If, in addition, $E = E(0, \infty)$, then $\varphi_{fin}$ also satisfies (i)–(vi), while $\varphi_{cut}$ satisfies (i), (ii), (iii), (v) and (vi). If, in addition, $E \not\subset L_1$, then $\varphi_{cut}$ also satisfies (iv).
Proof. We prove that the function $s \mapsto \frac{1}{s} \|\sigma_s x^*\|_E$ is decreasing. Let $s_2 > s_1$. We have $s_2 = s_3 s_1$ and $s_3 > 1$. Therefore,
\[ \frac{1}{s_2} \|\sigma_{s_3} \sigma_{s_1}(x^*)\|_E \le \frac{\|\sigma_{s_3}\|_{E \to E}}{s_2} \|\sigma_{s_1}(x^*)\|_E \le \frac{1}{s_1} \|\sigma_{s_1}(x^*)\|_E, \]
since $\|\sigma_{s_3}\|_{E \to E} \le s_3$. It follows immediately that the limit in (7) exists.
(i) Trivial.
(ii) This follows from the fact that $\|\sigma_s(x^*)\|_E \le s \|x\|_E$.
(iii) Since $y \prec\prec x$, it follows that $\sigma_s(y^*) \prec\prec \sigma_s(x^*)$. Since $E$ is strictly symmetric, it follows that $\|\sigma_s(y^*)\|_E \le \|\sigma_s(x^*)\|_E$. Therefore,
\[ \varphi(y) = \lim_{s \to \infty} \frac{1}{s} \|\sigma_s(y^*)\|_E \le \lim_{s \to \infty} \frac{1}{s} \|\sigma_s(x^*)\|_E = \varphi(x). \]
(iv) Applying the semigroup property of the dilation operators $\sigma_\tau$,
\[ \lim_{s \to \infty} \frac{1}{s} \|\sigma_s \sigma_\tau(x^*)\|_E = \tau \lim_{s \to \infty} \frac{1}{s\tau} \|\sigma_{s\tau}(x^*)\|_E = \tau \varphi(x). \]
(v) By the triangle inequality,
\[ \big| \|\sigma_s(x^*)\|_E - \|\sigma_s(y^*)\|_E \big| \le \|\sigma_s(x^* - y^*)\|_E. \]
Using (4) and Lemma 2 one can obtain $\sigma_s(x^* - y^*) \prec\prec \sigma_s((x - y)^*)$. Since $E$ is strictly symmetric,
\[ \big| \|\sigma_s(x^*)\|_E - \|\sigma_s(y^*)\|_E \big| \le \|\sigma_s((x - y)^*)\|_E. \]
Now, one can divide by $s$ and let $s \to \infty$. Therefore, $|\varphi(x) - \varphi(y)| \le \varphi(x - y) \le \|x - y\|_E$.
(vi) It follows from (3) and Lemma 2 that $\sigma_s((x + y)^*) \prec\prec \sigma_s(x^*) + \sigma_s(y^*)$. Therefore,
\[ \varphi(x + y) = \lim_{s \to \infty} \frac{1}{s} \|\sigma_s((x + y)^*)\|_E \le \lim_{s \to \infty} \frac{1}{s} \|\sigma_s(x^*) + \sigma_s(y^*)\|_E \le \varphi(x) + \varphi(y). \]
Existence and properties (i)–(vi) of $\varphi_{fin}$ can be proved in a similar way. Existence and properties (i), (ii), (iii), (v), (vi) of $\varphi_{cut}$ can be proved in a similar way. Let us prove (iv) for $\varphi_{cut}$.
(iv) Assume $E \not\subset L_1$. By Lemma 4 below, $\varphi(x^* \chi_{[\tau^{-1},1]}) = \varphi_{cut}(x^* \chi_{[\tau^{-1},1]}) = 0$. Hence,
\[ \varphi(x^* \chi_{[0,\tau^{-1}]}) \le \varphi(x^* \chi_{[0,1]}) \le \varphi(x^* \chi_{[0,\tau^{-1}]}) + \varphi(x^* \chi_{[\tau^{-1},1]}) = \varphi(x^* \chi_{[0,\tau^{-1}]}). \]
Therefore,
\[ \varphi_{cut}(\sigma_\tau(x^*)) = \varphi(\sigma_\tau(x^* \chi_{[0,\tau^{-1}]})) = \tau \varphi(x^* \chi_{[0,\tau^{-1}]}) = \tau \varphi_{cut}(x). \]
□
Lemma 4. If $E = E(0, 1)$ is a symmetric space on (0, 1) and $x \in L_\infty$, then $\varphi(x) = 0$. If $E = E(0, \infty)$ is a symmetric space on (0, ∞) and $x \in L_\infty \cap E$, then $\varphi_{fin}(x) = 0$. If $E = E(0, \infty) \not\subset L_1$ and $x \in E \cap L_\infty$, then $\varphi_{cut}(x) = 0$. In particular, the functional $\varphi$ vanishes on every separable space $E = E(0, 1)$.
Proof. Clearly, $\varphi(x) = \varphi(x^* \chi_{[0,1]}) \le \|x\|_\infty \varphi(\chi_{[0,1]})$ in the first case. Similarly, $\varphi_{fin}(x) \le \|x\|_\infty \varphi_{fin}(\chi_{[0,1]})$ ($\varphi_{cut}(x) \le \|x\|_\infty \varphi_{cut}(\chi_{[0,1]})$) in the second (third) case. It is clear that $\varphi(\chi_{[0,1]}) = 0$ ($\varphi_{fin}(\chi_{[0,1]}) = 0$) in the first (second) case. Also, $E \not\subset L_1$ implies that $\|\chi_{[0,n]}\|_E = o(n)$ and, therefore, $\varphi_{cut}(\chi_{[0,1]}) = 0$. The assertion follows immediately. □
Lemma 5. Let $E$ be a strictly symmetric space. For functions $0 \le x_1, \dots, x_k \in E$ and numbers $\lambda_1, \dots, \lambda_k \ge 0$
\[ \varphi\Big(\sum_{i=1}^k \lambda_i x_i\Big) = \varphi\Big(\sum_{i=1}^k \lambda_i x_i^*\Big). \]
If $E = E(0, \infty)$, then the same is valid for $\varphi_{fin}$. If, in addition, $E \not\subset L_1$, then the same is valid for $\varphi_{cut}$.
Proof. Applying the inequality (6) $n$ times, we have for positive functions $x_1, \dots, x_{2^n}$
\[ x_1^* + \dots + x_{2^n}^* \prec\prec 2^n \sigma_{2^{-n}}(x_1 + \dots + x_{2^n}). \]
Therefore, by Lemma 3(iii),
\[ \varphi(x_1^* + \dots + x_{2^n}^*) \le \varphi\big(2^n \sigma_{2^{-n}}(x_1 + \dots + x_{2^n})\big). \]
By Lemma 3(iv), $2^k \varphi(\sigma_{2^{-k}}(z^*)) = \varphi(z^*)$. Therefore,
\[ \varphi(x_1^* + \dots + x_{2^n}^*) \le \varphi(x_1 + \dots + x_{2^n}). \]
The converse inequality follows trivially from (3) and Lemma 3(iii). The assertion of the lemma follows now from Lemma 3(v). □
Note that $y$ and $z$ in the proposition below are arbitrary, that is, $y$, $z$ do not necessarily belong to $Q_+(x)$.
Proposition 6. Let $E$ be a symmetric space equipped with a Fatou norm. If $0 \le x \in E$, then in each of the following cases there exists a decomposition $x = y + z$, such that $y, z \ge 0$ and such that the following assertions hold.
(i) If $E = E(0, 1)$, then $\varphi(x) = \varphi(y) = \varphi(z)$.
(ii) If $E = E(0, \infty)$ and $\varphi_{cut}(x) = 0$, then $\varphi(x) = \varphi(y) = \varphi(z)$.
(iii) If $E = E(0, \infty)$, then $\varphi_{fin}(x) = \varphi_{fin}(y) = \varphi_{fin}(z)$.
(iv) If $E = E(0, \infty)$, then $\varphi_{cut}(x) = \varphi_{cut}(y) = \varphi_{cut}(z)$.
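Proof. We will prove only the first assertion. The proofs of the third and fourth assertions are exactly the same. The proof of the second assertion requires replacement of the interval $[\frac{1}{m}, \frac{1}{n}]$ with the interval $[n, m]$. We may assume that $x = x^*$. Fix $n \in \mathbb{N}$. The sequence $\sigma_n(x \chi_{[\frac{1}{m}, \frac{1}{n}]})$ converges to $\sigma_n(x \chi_{[0, \frac{1}{n}]})$ almost everywhere when $m \to \infty$. By the definition of Fatou norm,
\[ \|\sigma_n(x \chi_{[\frac{1}{m}, \frac{1}{n}]})\|_E \to_m \|\sigma_n(x \chi_{[0, \frac{1}{n}]})\|_E. \]
For each $n \in \mathbb{N}$, one can select $f(n) > n$, such that
\[ \|\sigma_n(x \chi_{[\frac{1}{f(n)}, \frac{1}{n}]})\|_E \ge \Big(1 - \frac{1}{n}\Big) \|\sigma_n(x \chi_{[0, \frac{1}{n}]})\|_E. \]
Fix some $n_0$ and set $n_k = f^k(n_0)$, $k \in \mathbb{N}$. Here, $f^k = f \circ \dots \circ f$ ($k$ times). Define
\[ y = \sum_{k=0}^\infty x \chi_{[\frac{1}{n_{2k+1}}, \frac{1}{n_{2k}}]}, \qquad z = \sum_{k=1}^\infty x \chi_{[\frac{1}{n_{2k}}, \frac{1}{n_{2k-1}}]}. \]
It is clear that
\[ \frac{1}{n_{2k}} \|\sigma_{n_{2k}}(y^*)\|_E \ge \frac{1}{n_{2k}} \|\sigma_{n_{2k}}(y)\|_E \ge \frac{1}{n_{2k}} \|\sigma_{n_{2k}}(x \chi_{[\frac{1}{n_{2k+1}}, \frac{1}{n_{2k}}]})\|_E. \tag{10} \]
By definition of $n_k$,
\[ \frac{1}{n_{2k}} \|\sigma_{n_{2k}}(x \chi_{[\frac{1}{n_{2k+1}}, \frac{1}{n_{2k}}]})\|_E \ge \Big(1 - \frac{1}{n_{2k}}\Big) \frac{1}{n_{2k}} \|\sigma_{n_{2k}}(x \chi_{[0, \frac{1}{n_{2k}}]})\|_E. \tag{11} \]
It follows from (10) and (11) that
\[ \frac{1}{n_{2k}} \|\sigma_{n_{2k}}(y^*)\|_E \ge \Big(1 - \frac{1}{n_{2k}}\Big) \frac{1}{n_{2k}} \|\sigma_{n_{2k}}(x \chi_{[0, \frac{1}{n_{2k}}]})\|_E \ge \Big(1 - \frac{1}{n_{2k}}\Big) \varphi(x \chi_{[0, \frac{1}{n_{2k}}]}). \tag{12} \]
By Lemma 4, $\varphi(x \chi_{[\frac{1}{n_{2k}}, 1]}) = 0$. Since $\varphi$ is convex, then
\[ \varphi(x \chi_{[0, \frac{1}{n_{2k}}]}) \le \varphi(x) \le \varphi(x \chi_{[0, \frac{1}{n_{2k}}]}) + \varphi(x \chi_{[\frac{1}{n_{2k}}, 1]}) = \varphi(x \chi_{[0, \frac{1}{n_{2k}}]}). \tag{13} \]
It follows from (12) and (13) that
\[ \frac{1}{n_{2k}} \|\sigma_{n_{2k}}(y^*)\|_E \ge \Big(1 - \frac{1}{n_{2k}}\Big) \varphi(x). \]
Passing to the limit, we obtain $\varphi(y) \ge \varphi(x)$. The converse inequality is obvious. Hence, $\varphi(y) = \varphi(x) = \varphi(z)$, and this completes the proof of the proposition. □
Lemma 7. If the space $E$ is strictly symmetric, then $\varphi(y) = \varphi(x)$ for every $y \in Q'(x)$. If, in addition, $E = E(0, \infty)$, then $\varphi_{fin}(y) = \varphi_{fin}(x)$ for every $y \in Q'(x)$. If $E \not\subset L_1$, then $\varphi_{cut}(y) = \varphi_{cut}(x)$ for every $y \in Q'(x)$.
Proof. Let
\[ z = \sum_{i=1}^s \lambda_i x_i, \]
where $\lambda_i \ge 0$, $\sum_{i=1}^s \lambda_i = 1$, $x_i \ge 0$ and $x_i^* = x$. By Lemma 5, we obtain $\varphi(z) = \varphi(x)$. However, $y \in Q'(x)$ can be approximated by such $z$. Since $\varphi$ is continuous in strictly symmetric spaces, the lemma follows readily. The proofs are the same in the cases of $\varphi_{fin}$ and $\varphi_{cut}$. □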
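If $A$ is a convex set, then a function $\theta : A \to \mathbb{R}$ is called additive homogeneous if and only if
\[ \theta(\alpha y_1 + \beta y_2) = \alpha \theta(y_1) + \beta \theta(y_2), \qquad y_1, y_2 \in A,\ \alpha, \beta \in \mathbb{R}_+. \]
Proposition 8. Let $E$ be a strictly symmetric space and $x \in E$. Then the following assertions hold.
(i) If $E = E(0, 1)$, then $\varphi$ is additive homogeneous on $Q_+(x)$.
(ii) If $E = E(0, \infty)$, then $\varphi_{fin}$ is additive homogeneous on $Q_+(x)$.
(iii) If $E \not\subset L_1$, then $\varphi_{cut}$ is additive homogeneous on $Q_+(x)$.
Proof. We will only prove the first assertion. The proofs of the other two assertions are exactly the same. Let $y \in \operatorname{Conv}(\operatorname{extr}(\Omega_+(x)))$, so that
\[ y = \sum_{i=1}^m \lambda_i x_i, \]
where $\lambda_i \ge 0$, $\sum_{i=1}^m \lambda_i = 1$, $x_i \ge 0$ and $x_i^* = x^* \chi_{[0,\beta_i]}$. Denote $z = \sum_{i=1}^m \lambda_i x^* \chi_{[0,\beta_i]}$ and $u = \sum_{\beta_i > 0} \lambda_i x^* \chi_{[0,1]}$. By Lemma 5, $\varphi(y) = \varphi(z)$. Since $|z - u| \in L_\infty$, then $\varphi(|u - z|) = 0$ by Lemma 4. By the triangle inequality,
\[ \varphi(u) \le \varphi(z) + \varphi(|u - z|) = \varphi(z) \le \varphi(u) + \varphi(|u - z|) = \varphi(u). \]
Hence, $\varphi(y) = \varphi(u) = (\sum_{\beta_i > 0} \lambda_i) \varphi(x)$. It is clear that the last expression is additive homogeneous on the set $\operatorname{Conv}(\operatorname{extr}(\Omega_+(x)))$. By Lemma 3, the functional $\varphi$ is continuous on $Q_+(x)$. Hence, it is additive homogeneous on the set $Q_+(x)$. □
Proposition 9. Let $E = E(0, \infty)$ be a symmetric space on the semi-axis equipped with a Fatou norm. Suppose that $E \not\subset L_1$ and $x \in E$. If $\Omega_+(x) = Q_+(x)$, then $\varphi$ is additive homogeneous on $\Omega_+(x)$.
Proof. It follows from Proposition 8 that $\varphi_{cut}$ is additive homogeneous on $Q_+(x)$. By assumption, $\Omega_+(x) = Q_+(x)$. Hence, $\varphi_{cut}$ is additive homogeneous on $\Omega_+(x)$. It follows now from Proposition 6(iv) that $\varphi_{cut}(x) = 0$. This assertion and Lemma 2 imply that $\varphi(x^* \chi_{[0,\beta]}) = 0$ for every finite $\beta$. Let $y \in \operatorname{Conv}(\operatorname{extr}(\Omega_+(x)))$. Hence,
\[ y = \sum_{i=1}^m \lambda_i x_i, \]
where $\lambda_i \ge 0$, $\sum_{i=1}^m \lambda_i = 1$, $x_i \ge 0$ and $x_i^* = x^* \chi_{[0,\beta_i]}$. By convexity of $\varphi$,
\[ \varphi(y) \le \varphi\Big(\sum_{\beta_i \in [0,\infty)} \lambda_i x_i\Big) + \varphi\Big(\sum_{\beta_i = \infty} \lambda_i x_i\Big). \]
However,
\[ 0 \le \varphi\Big(\sum_{\beta_i \in [0,\infty)} \lambda_i x_i\Big) \le \sum_{\beta_i \in [0,\infty)} \lambda_i \varphi(x^* \chi_{[0,\beta_i]}) = 0. \]
It then follows that
\[ \varphi(y) \le \varphi\Big(\sum_{\beta_i = \infty} \lambda_i x_i\Big). \]
The converse inequality is obvious. By Lemma 5,
\[ \varphi(y) = \varphi\Big(\sum_{\beta_i = \infty} \lambda_i x_i\Big) = \varphi\Big(\sum_{\beta_i = \infty} \lambda_i x_i^*\Big) = \sum_{\beta_i = \infty} \lambda_i \varphi(x). \]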
Clearly, the last expression is additive homogeneous on $\mathrm{Conv}(\mathrm{extr}(\Omega_+(x)))$. Hence, the functional $\varphi$ is additive homogeneous on $Q_+(x) = \Omega_+(x)$. □

Lemma 10. Let $E = E(0,\infty)$ be a strictly symmetric space on $(0,\infty)$ and $x \in E$. Suppose that $E \not\subseteq L_1$. If $P(x|\mathcal{A}) \in Q'(x)$ for every $\mathcal{A}$, then $\varphi_{\mathrm{cut}}(x) = 0$.

Proof. Suppose that $x = x^*$. Set $\mathcal{A} = \{[0,1]\}$ and $y = P(x|\mathcal{A}) \in E \cap L_\infty$. By the assumption, $y \in Q'(x)$. By Lemmas 7 and 4, $\varphi_{\mathrm{cut}}(x) = \varphi_{\mathrm{cut}}(y) = 0$. □

Lemma 11. Let $E$ and $x$ be as in Lemma 10. If $L_\infty \subseteq E$, then $\varphi(x) = 0$.

Proof. Due to the choice of $E$, we have $1 \in E$. However, $\sigma_\tau(1) = 1$ implies $\varphi(1) = 0$. Thus, for every $z \in E \cap L_\infty$, we have $\varphi(z) = 0$. However, for every $x \in E$, we have $\varphi(x^*\chi_{[0,1]}) = 0$ due to Lemma 10. Hence,
\[ 0 \le \varphi(x) = \varphi(x^*) \le \varphi(x^*\chi_{[0,1]}) + \varphi(x^*\chi_{[1,\infty)}) = 0 + 0 = 0. \]
□

Lemma 12. Let $E$ and $x$ be as in Lemma 10. If $y \in E \cap L_\infty$ and if
\[ \omega(y) := \limsup_{t\to\infty} \frac{\int_0^t y^*(s)\,ds}{\int_0^t x^*(s)\,ds}, \]
then $\varphi(y) = \omega(y)\varphi(x)$. In particular, if in addition $\varphi(x) > 0$, then $\omega(y) < \infty$.

Proof. Fix $\varepsilon > 0$. There exists $T > 0$ such that for every $t > T$,
\[ \int_0^t y^*(s)\,ds \le (\omega(y) + \varepsilon) \int_0^t x^*(s)\,ds. \]
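It then follows that $y \prec\prec (\omega(y)+\varepsilon)(x^* + C\chi_{[0,T]})$ for some constant $C$. By Lemma 3(iii), $\varphi(y) \le (\omega(y)+\varepsilon)\varphi(x^* + C\chi_{[0,T]})$. By Lemma 4, $\varphi(C\chi_{[0,T]}) = 0$ and, therefore, $\varphi(x^* + C\chi_{[0,T]}) = \varphi(x)$. Hence $\varphi(y) \le \omega(y)\varphi(x)$. Now, fix $\omega < \omega(y)$. There exists a sequence $t_k \to \infty$ such that
\[ \int_0^{t_k} y^*(s)\,ds \ge \omega \int_0^{t_k} x^*(s)\,ds. \]
Without loss of generality, $t_0 = 0$. Set $u = P(x^*|\mathcal{A})$, where $\mathcal{A} = \{[t_k, t_{k+1})\}$. It then follows that $\omega u \prec\prec y$ and $\omega\varphi(u) \le \varphi(y)$. However, $u \in Q'(x)$ and $\varphi(u) = \varphi(x)$ due to Lemma 7. Hence $\omega(y)\varphi(x) \le \varphi(y)$. □

Proposition 13. Let $E = E(0,\infty)$ be a symmetric space on the semi-axis and let $x \in E$. If $\varphi(x) = 0$, then $x\chi_A \in Q'(x)$ for every Lebesgue measurable subset $A \subseteq (0,\infty)$.

Proof. Let $[0,\infty) = B \cup C$, where $B$, $C$ are disjoint sets such that $m(B) = m(A)$ and $m(C) = \infty$. Fix a partition $C = \bigcup_{i=1}^{n+1} C_i$, where $m(C_i) = m(\mathbb{R}_+ \setminus A)$, $1 \le i \le n$. Let $\gamma : B \to A$ and $\gamma_i : C_i \to \mathbb{R}_+ \setminus A$, $1 \le i \le n$, be measure-preserving transformations. Define functions $x_n^i$, $1 \le i \le n$, by the following construction. Set $x_n^i\chi_B = x \circ \gamma$, $x_n^i|_{C_i} = x \circ \gamma_i$ and $x_n^i|_{C_j} = 0$ if $i \ne j$. Clearly, $x_n^i \sim x$ and
\[ \Bigl\| (x\chi_A)\circ\gamma - \frac{1}{n}\sum_{i=1}^{n} x_n^i \Bigr\|_E = \frac{1}{n}\bigl\|\sigma_n(x\chi_{[0,\infty)\setminus A})\bigr\|_E \le \frac{1}{n}\bigl\|\sigma_n(x^*)\bigr\|_E \to 0. \]
Hence, $(x\chi_A)\circ\gamma \in Q'(x)$. Thus, $x\chi_A \in Q'(x)$. □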
Corollary 14. Let $E = E(0,\infty)$ be a symmetric space on the semi-axis. If $\varphi(x) = 0$, then $y\chi_A \in Q'(x)$ for every $y \in Q'(x)$.

Proof. It follows from the assumption and Lemma 7 that $\varphi(y) = \varphi(x) = 0$. Proposition 13 implies that $y\chi_A \in Q'(y)$. Since $y\chi_A \in Q'(y)$ and $y \in Q'(x)$, Lemma 16 implies $y\chi_A \in Q'(x)$. □

An assertion somewhat similar to the lemma below is contained in [3, Lemma 1.3].

Lemma 15. Assume that $x \in E$ satisfies the conditions of Proposition 13. If $y \in Q'(x)$ and $0 \le z \le y$, then $z \in Q'(x)$.

Proof. Define sets $e_n^i$, $i = 1, \ldots, n$, by
\[ e_n^i = \Bigl\{ t : \frac{i-1}{n}\,y(t) \le z(t) \le \frac{i}{n}\,y(t) \Bigr\}. \]
Define functions $y_n^k$, $k = 1, \ldots, n$, as $y_n^k = y \sum_{k<(i+n)/2} \chi_{e_n^i}$. By Corollary 14, $y_n^k \in Q'(x)$. Put
\[ s_n = \frac{1}{n}\sum_{k=1}^{n} y_n^k \in Q'(x). \]
Clearly,
\[ \bigl| s_n(t) - (y(t) + z(t))/2 \bigr| \le \frac{2y(t)}{n}, \qquad \forall t \in e_n^i. \]
Hence, $s_n \to (y+z)/2$ in norm. Therefore, $(y+z)/2 \in Q'(x)$. We can repeat this procedure $n$ times and obtain $2^{-n}((2^n - 1)z + y) \in Q'(x)$. Therefore, $z \in Q'(x)$. □

The following assertion seems to be known. We include the details of the proof for lack of a convenient reference.

Lemma 16. Let $E$ be a symmetric space either on $(0,1)$ or on $(0,\infty)$ and $x \in E$. If $y \in Q'(z)$ and $z \in Q'(x)$, then $y \in Q'(x)$.

Proof. Without loss of generality, $y = y^*$, $z = z^*$ and $x = x^*$. Let $y \in Q'(z)$. Hence, for every $\varepsilon > 0$, one can find $n \in \mathbb{N}$, $\lambda_i \in \mathbb{R}_+$ and measurable functions $z_i \sim z$, $i = 1, \ldots, n$, such that $\sum_{i=1}^{n} \lambda_i = 1$ and
\[ \Bigl\| y - \sum_{i=1}^{n} \lambda_i z_i \Bigr\|_E \le \varepsilon. \]
One can find measure-preserving transformations $\gamma_i$ such that $\|z_i - z\circ\gamma_i\|_{L_1\cap L_\infty} \le \varepsilon$. Hence,
\[ \Bigl\| y - \sum_{i=1}^{n} \lambda_i\, z\circ\gamma_i \Bigr\|_E \le 2\varepsilon. \]
However, $z \in Q'(x)$. Consequently, arguing in a similar way, one can find $m \in \mathbb{N}$, $\mu_j \in \mathbb{R}_+$ and measure-preserving transformations $\delta_j$, $1 \le j \le m$, such that $\sum_{j=1}^{m} \mu_j = 1$ and
\[ \Bigl\| z - \sum_{j=1}^{m} \mu_j\, x\circ\delta_j \Bigr\|_E \le 2\varepsilon. \]
Therefore,
\[ \Bigl\| y - \sum_{i=1}^{n}\sum_{j=1}^{m} \lambda_i\mu_j\, x\circ\gamma_i\circ\delta_j \Bigr\|_E \le 4\varepsilon \]
and this suffices to complete the proof. □
Remark 17. The collection of sets {Q(x), x ∈ E} also satisfies the transitivity property expressed in Lemma 16. We do not know whether this is the case for the collection {Q+ (x), x ∈ E}.
4. Main results

The implication (ii) ⇒ (i) in the following theorem is an almost verbatim repetition of the argument given in [3, Lemma 3.1] for the case of finite measure. For the convenience of the reader, we present here a proof of the most important case.

Theorem 18. (a) Let $E$ be a fully symmetric space and $x \in E$. If $E = E(0,1)$ or $E = E(0,\infty)$ and $E \not\subseteq L_1$, then the following conditions are equivalent.
(i) $P(x|\mathcal{A}) \in Q'(x)$ for every $\mathcal{A} \in \mathfrak{A}$.
(ii) $\varphi(x) = 0$.
(b) If $E = E(0,\infty)$ and $E \subseteq L_1$, then the following conditions are equivalent.
(i) $P(x|\mathcal{A}) \in Q'(x)$ for every $\mathcal{A} \in \mathfrak{A}$.
(ii) $\varphi_{\mathrm{fin}}(x) = 0$.

Proof. (a) (i) ⇒ (ii) Let $E = E(0,1)$ and $x = x^*$. Set $\mathcal{A} = \{[0,1]\}$ and $y = P(x|\mathcal{A})$. By assumption, $y \in Q'(x)$. By Lemmas 7 and 4, $\varphi(x) = \varphi(y) = 0$.

Let $E = E(0,\infty)$ and $L_\infty \subseteq E \not\subseteq L_1$. The assertion is proved in Lemma 11.

Let $E = E(0,\infty)$ and $L_\infty \not\subseteq E \not\subseteq L_1$. Suppose that $x = x^*$ and $\varphi(x) > 0$. Set $\mathcal{B} = \{[0,1]\}$, $\psi' = P(x|\mathcal{B})$ and $\psi(t) = \int_0^t \psi'(s)\,ds$. By Lemma 7, $\varphi(\psi') = \varphi(x)$. Let $y \in E \cap L_\infty$. It follows from Lemma 12 that $\omega(y) < \infty$. Therefore, $y \in M_\psi$. Hence, $E \cap L_\infty \subseteq M_\psi$. Since $E$ is fully symmetric and $\psi' \in E \cap L_\infty$, then $M_\psi \subseteq E \cap L_\infty$. Therefore, $E \cap L_\infty = M_\psi$. If $u = 2\sigma_{1/2}\psi'$, then $\varphi(u) = \varphi(\psi')$ by Lemma 3(v). Hence $\omega(u)\varphi(x) = \varphi(x)$ and $\omega(u) = 1$. However,
\[ \omega(u) = \limsup_{t\to\infty} \frac{\int_0^t 2x(2s)\,ds}{\int_0^t x(s)\,ds} = \limsup_{t\to\infty} \frac{\psi(2t)}{\psi(t)}. \]
Thus,
\[ \lim_{t\to\infty} \frac{\psi(2t)}{\psi(t)} = 1. \tag{14} \]
Let $G$ be the set defined by
\[ G = \Bigl\{ y \in E : \exists C,\ \sup_{t\ge 1} \frac{y^*(t)}{\psi'(Ct)} < \infty \Bigr\}. \]
Note that our set $G$ differs from the one introduced in [10]. If $y_1, y_2 \in G$, then $y_i^*(t) \le C_i\psi'(Ct)$ for $t \ge \tfrac12$. It then follows that
\[ (y_1+y_2)^*(t) \le y_1^*\Bigl(\frac{t}{2}\Bigr) + y_2^*\Bigl(\frac{t}{2}\Bigr) \le (C_1+C_2)\,\psi'\Bigl(\frac{Ct}{2}\Bigr). \]
In particular, $G$ is a linear set and $\mathrm{Conv}(\{y^* = x^*\}) \subseteq G$. If the condition (14) holds, then there exists a sequence $t_k$ such that $t_0 = 0$, $t_1 = 1$ and for every $k$
\[ \frac{\psi(t_{k+1}) - \psi(t_k)}{t_{k+1} - t_k} \ge \frac{2}{3}\,\frac{\psi(\frac12 t_{k+1})}{t_{k+1}}. \]
Set $\mathcal{A} = \{[t_k, t_{k+1}]\}$ and $z = P(x|\mathcal{A})$. It follows from the construction given in [10] that $\|(z-y)\chi_{[\frac12 t_k, t_k]}\|_{M_\psi} \ge \frac14$ for every $y \in G$ and every sufficiently large $k$. However, $\|(y-z)\chi_{[\frac12 t_k, t_k]}\|_{L_\infty} \to 0$. Since $M_\psi = E \cap L_\infty$, then $\|(z-y)\chi_{[\frac12 t_k, t_k]}\|_E \ge \frac14$ for sufficiently large $k$. In particular, $\|y - z\|_E \ge \frac14$. Hence, $\mathrm{dist}_E(z, G) \ge \frac14$ and $\mathrm{dist}_E(z, Q'(x)) \ge \frac14$. This contradicts the assumption that $P(x|\mathcal{A}) \in Q'(x)$.

(a) (ii) ⇒ (i) Let $E = E(0,1)$ or $E = E(0,\infty) \not\subseteq L_1$. We will prove the assertion for the case when $\mathcal{A} = \{[0,1]\}$. The general proof is similar. Without loss of generality, $x$ decreases on $[0,1]$. Define functions $x_n^i$, $i = 0, \ldots, n-1$, such that (i) $x_n^i = x$ outside $(0,1)$ and (ii) $x_n^i(t) = x((t + \frac{i}{n}) \bmod 1)$ if $t \in (0,1)$. Set $x_n(t) = x(t - \frac{i}{n})$ if $\frac{i}{n} \le t \le \frac{i+1}{n}$, $0 \le i \le n-1$, and $x_n(t) = 0$ if $t \ge 1$. Clearly, $x_n^i \sim x$ and $(x_n)^* \le \sigma_n(x^*)$. We will show that
\[ \int_0^1 x(s)\,ds - \frac{1}{n}\sum_{i=0}^{n-1} x\Bigl(\Bigl(t + \frac{i}{n}\Bigr) \bmod 1\Bigr) \le \int_0^{1/n} x(s)\,ds \]
and
\[ \int_0^1 x(s)\,ds - \frac{1}{n}\sum_{i=0}^{n-1} x\Bigl(\Bigl(t + \frac{i}{n}\Bigr) \bmod 1\Bigr) \ge -\frac{1}{n}\,x_n(t). \]
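We will prove only the first inequality. The proof of the second one is identical. Without loss of generality, $t \in [0, \frac1n]$. Clearly,
\[ \frac{1}{n}\,x\Bigl(t + \frac{i}{n}\Bigr) \ge \int_{\frac{i+1}{n}}^{\frac{i+2}{n}} x(s)\,ds \]
for $i = 0, \ldots, n-2$. Hence,
\[ \int_0^1 x(s)\,ds - \frac{1}{n}\sum_{i=0}^{n-1} x\Bigl(t + \frac{i}{n}\Bigr) = \int_0^{1/n} x(s)\,ds - \frac{1}{n}\,x\Bigl(t + \frac{n-1}{n}\Bigr) - \sum_{i=0}^{n-2}\Bigl( \frac{1}{n}\,x\Bigl(t + \frac{i}{n}\Bigr) - \int_{\frac{i+1}{n}}^{\frac{i+2}{n}} x(s)\,ds \Bigr) \le \int_0^{1/n} x(s)\,ds. \]
Therefore,
\[ \Bigl| \int_0^1 x(s)\,ds - \frac{1}{n}\sum_{i=0}^{n-1} x_n^i(t) \Bigr| \le \frac{1}{n}\,z_n(t), \qquad t \in [0,1], \]
where $z_n = x_n + \bigl(\int_0^1 x_n(s)\,ds\bigr)\chi_{[0,1]}$. Obviously, $z_n \prec\prec 2x_n \le 2\sigma_n(x^*)$ and, therefore, $\|z_n\|_E \le 2\|\sigma_n(x^*)\|_E$. It then follows that
\[ \Bigl\| P(x|\mathcal{A}) - \frac{1}{n}\sum_{i=0}^{n-1} x_n^i \Bigr\|_E \le \frac{2}{n}\,\|\sigma_n(x^*)\|_E \to 0. \]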
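The two Riemann-sum bounds used in the proof of (a) (ii) ⇒ (i) can be checked numerically. The following is only an illustrative sketch, not part of the argument: the test function $x(s) = s^{-1/2}$ and the helper names `x`, `shifted_average` and `check_bounds` are assumptions made for this example.

```python
import math

def x(s):
    # Hypothetical decreasing integrable function on (0, 1]; its integral over (0, 1) is 2.
    return 1.0 / math.sqrt(s)

def shifted_average(t, n):
    # (1/n) * sum_{i=0}^{n-1} x((t + i/n) mod 1): the average of the n shifted copies of x.
    return sum(x((t + i / n) % 1.0) for i in range(n)) / n

def check_bounds(t, n, tol=1e-12):
    # Assumes 0 < t < 1/n, so that x_n(t) = x(t) in the notation of the proof.
    integral_01 = 2.0                    # int_0^1 s^(-1/2) ds
    integral_0_1n = 2.0 / math.sqrt(n)   # int_0^{1/n} s^(-1/2) ds
    diff = integral_01 - shifted_average(t, n)
    upper_ok = diff <= integral_0_1n + tol   # first inequality
    lower_ok = diff >= -x(t) / n - tol       # second inequality
    return upper_ok and lower_ok

results = [check_bounds(frac / n, n)
           for n in (2, 4, 16, 64)
           for frac in (0.25, 0.5, 0.999)]
print(all(results))
```

For a decreasing integrable $x$ both inequalities are guaranteed by the proof, so the check is expected to succeed for every admissible pair $(t, n)$.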
(b) (i) ⇒ (ii) Let $E = E(0,\infty)$ and $E \subset L_1$. Set $A = \{s : x(s) \ge 1\}$ and $\mathcal{A} = \{A\}$. Set $y = P(x|\mathcal{A}) \in E \cap L_\infty$. Lemma 4 implies that $\varphi(y) = 0$. By the assumption, $y \in Q'(x)$. By Lemma 7, $\varphi(x) = \varphi(y) = 0$.

(b) (ii) ⇒ (i) The assertion follows from Theorem 23. □

The following proposition is the core technical result of the article. In the case of the interval $(0,1)$ it may be found in [3, Lemma 3.2]. However, our proof is more general, simpler and shorter. We consider functions of the form
\[ x = \sum_{i\in\mathbb{Z}} x_i\chi_{[a_{i-1},a_i]}, \qquad y = \sum_{i\in\mathbb{Z}} y_i\chi_{[a_{i-1},a_i]}, \tag{15} \]
where $\{a_i\}_{i\in\mathbb{Z}}$ is an increasing sequence (possibly finite or one-sidedly infinite).

Proposition 19. Let $y = y^*$ and $x = x^*$ be functions of the form (15) either on $(0,1)$ or on $(0,\infty)$. If $y \prec\prec x$, then there exists a countable collection $\{\Delta_k\}_{k\in K}$ of disjoint sets, where $\Delta_k = I_k \cup J_k$ with intervals $I_k$ and $J_k$ of finite measure, such that
(i) the functions $x$ and $y$ are constant on the intervals $I_k$ and $J_k$, and the interval $I_k$ lies to the left of $J_k$, $k \in K$;
(ii) $y|_{\Delta_k} \prec x|_{\Delta_k}$, $k \in K$;
(iii) $y(t) \le x(t)$ if $t \notin \bigcup_{k\in K} \Delta_k$.
If, in addition, $x$ and $y$ are functions on $(0,1)$ and $\int_0^1 y(s)\,ds = \int_0^1 x(s)\,ds$, then $y(t) = x(t)$ if $t \notin \bigcup_{k\in K} \Delta_k$.
Proof. There exists a subsequence $\{a_{m_i}\}_{i\in\mathcal{I}}$ (possibly finite or one-sidedly infinite) such that $\{x < y\} = \bigcup_{i\in\mathcal{I}} [a_{m_i-1}, a_{m_i}]$. Since $y \prec\prec x$, we have
\[ \int_0^t (x-y)_+(s)\,ds - \int_0^t (y-x)_+(s)\,ds = \int_0^t x(s)\,ds - \int_0^t y(s)\,ds \ge 0. \]
For each $i \in \mathcal{I}$, denote by $b_i$ the minimal $t > 0$ such that
\[ \int_0^t (x-y)_+(s)\,ds = \int_0^{a_{m_i}} (y-x)_+(s)\,ds. \]
Clearly, for every $i \in \mathcal{I}$,
\[ \int_0^{a_{m_i-1}} (x-y)_+(s)\,ds = \int_0^{a_{m_i}} (x-y)_+(s)\,ds \ge \int_0^{a_{m_i}} (y-x)_+(s)\,ds. \]
Hence, $b_i \le a_{m_i-1}$. For each $i \in \mathcal{I}$, the set $[b_{i-1}, b_i] \cap \{x > y\}$ is a finite union $\bigcup_{j=1}^{n_i} I_i^j$ of disjoint intervals on which each of $x$ and $y$ is finite. By the definition of $b_i$, we have
\[ \int_{a_{m_i-1}}^{a_{m_i}} (y-x)_+(s)\,ds = \int_{b_{i-1}}^{b_i} (x-y)_+(s)\,ds = \sum_{j=1}^{n_i} \int_{I_i^j} (x-y)_+(s)\,ds. \]
Set $K = \{(i,j) : 1 \le j \le n_i,\ i \in \mathcal{I}\}$. If $k = (i,j) \in K$, set $I_k = I_i^j$ and
\[ J_k = J_i^j = \bigl[ a_{m_i-1} + (y_{m_i} - x_{m_i})^{-1} c_i^{j-1},\ a_{m_i-1} + (y_{m_i} - x_{m_i})^{-1} c_i^{j} \bigr], \]
where
\[ c_i^j = \sum_{l=1}^{j} \int_{I_i^l} (x-y)_+(s)\,ds, \qquad i \in \mathcal{I},\ 0 \le j \le n_i. \]
Using the fact that $x$ and $y$ are constant on the interval $[a_{m_i-1}, a_{m_i}]$, we obtain $J_k \subset [a_{m_i-1}, a_{m_i}]$ and $\bigcup_{j=1}^{n_i} J_i^j = [a_{m_i-1}, a_{m_i}]$.

(i) Both $x$ and $y$ are constant on $I_k$ and $J_k$, $k \in K$. Since $b_i \le a_{m_i-1}$ for each $i \in \mathcal{I}$, then $I_k$ lies to the left of $J_k$ for $k \in K$. It then follows from (i) that
\[ \int_{I_k} (x-y)_+(s)\,ds = \int_{J_k} (y-x)_+(s)\,ds, \qquad k \in K. \tag{16} \]
(ii) Since $x|_{I_k} \ge y|_{I_k}$ and $x|_{J_k} \le y|_{J_k}$ for all $k \in K$, the assertion follows directly from (i) and (16).
(iii) The set $\{y > x\} = \bigcup_{i\in\mathcal{I}} \bigcup_{j=1}^{n_i} J_i^j \subseteq \bigcup_{k\in K} \Delta_k$. The last assertion is immediate. □
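Corollary 20. Let $E$ be a fully symmetric space either on the interval $(0,1)$ or on the semi-axis. If $x$, $y$ and $\mathcal{B} = \{\Delta_k\}_{k\in K}$ are as in Proposition 19 and $y(t) = x(t)$ if $t \notin \bigcup_k \Delta_k$, then $y$ can be arbitrarily well approximated in the norm of $E$ by convex combinations of functions of the form $P(x|\mathcal{A})$, $\mathcal{A} \in \mathfrak{A}$.

Proof. Set $\lambda_k = (y|_{I_k} - y|_{J_k})/(x|_{I_k} - x|_{J_k})$, $k \in K$. Since $y|_{\Delta_k} \prec x|_{\Delta_k}$, it is not difficult to verify that $\lambda_k \in [0,1]$, $k \in K$. Further, a simple calculation shows that $y = (1-\lambda_k)P(x|\mathcal{B}) + \lambda_k x$ on $\Delta_k$, $k \in K$. As is well known, every $[0,1]$-valued sequence can be uniformly approximated by convex combinations of $\{0,1\}$-valued sequences.

Fix $\varepsilon > 0$. There exists $\mu \in l_\infty(K)$ with $\mu = \sum_{i=1}^{n} \theta_i\chi_{D_i}$ for some $n \in \mathbb{N}$, $0 \le \theta_i \in \mathbb{R}$ and $D_i \subseteq K$ such that $\sum_{i=1}^{n} \theta_i = 1$ and $\|\lambda - \mu\|_\infty \le \varepsilon$. Set $z = (1-\mu_k)P(x|\mathcal{B}) + \mu_k x$ on $\Delta_k$, $k \in K$, and $z = x$ outside $\bigcup_{k\in K} \Delta_k$. It is clear that $|y-z|\chi_{\Delta_k} = |\lambda_k - \mu_k|\,|x - P(x|\mathcal{B})|\chi_{\Delta_k}$, $k \in K$, and $|y-z| = \sum_{k\in K} |y-z|\chi_{\Delta_k} \le 2\varepsilon(x + P(x|\mathcal{B}))$. Therefore, $\|y-z\|_E \le 2\varepsilon\|x\|_E$. Set $F_i = \bigcup_{k\in D_i} \Delta_k$ and $\mathcal{A}_i = \{\Delta_k\}_{k\notin D_i} \in \mathfrak{A}$, $1 \le i \le n$. It is then clear that
\[ z = \sum_{i=1}^{n} \theta_i\bigl( (1-\chi_{F_i})P(x|\mathcal{B}) + \chi_{F_i}x \bigr) = \sum_{i=1}^{n} \theta_i P(x|\mathcal{A}_i). \]
□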
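The "simple calculation" behind the identity $y = (1-\lambda_k)P(x|\mathcal{B}) + \lambda_k x$ on a single set $I_k \cup J_k$ can be illustrated with exact rational arithmetic. This is a sketch under stated assumptions: the numerical values, the interval lengths and the helper names `averaged` and `convex_weight` are hypothetical choices for the example, not taken from the paper.

```python
from fractions import Fraction as F

def averaged(x_I, x_J, len_I, len_J):
    # Value of P(x|B) on I_k ∪ J_k: the mean of x over the two intervals.
    return (len_I * x_I + len_J * x_J) / (len_I + len_J)

def convex_weight(x_I, x_J, y_I, y_J):
    # lambda_k = (y|I_k - y|J_k) / (x|I_k - x|J_k)
    return (y_I - y_J) / (x_I - x_J)

# Hypothetical data: x = 5 on I_k and 1 on J_k; y = 4 on I_k and 2 on J_k;
# both intervals have length 1.
x_I, x_J, y_I, y_J = F(5), F(1), F(4), F(2)
len_I, len_J = F(1), F(1)

# x and y have the same integral over I_k ∪ J_k, as majorization of the
# restrictions requires.
assert len_I * x_I + len_J * x_J == len_I * y_I + len_J * y_J

lam = convex_weight(x_I, x_J, y_I, y_J)
mean = averaged(x_I, x_J, len_I, len_J)
y_I_rebuilt = (1 - lam) * mean + lam * x_I
y_J_rebuilt = (1 - lam) * mean + lam * x_J
print(lam, y_I_rebuilt, y_J_rebuilt)  # prints: 1/2 4 2
```

The same computation works for unequal interval lengths, since $y_I - m = \lambda_k (x_I - m)$ and $y_J - m = \lambda_k (x_J - m)$ whenever $x$ and $y$ share the mean $m$ over $I_k \cup J_k$.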
4.1. The case that $E \subseteq L_1$

Theorem 21. Let $E = E(0,1)$ be a fully symmetric space on the interval $(0,1)$. If $x \in E$, then the following statements are equivalent.
(i) $\Omega'(x) = Q'(x)$.
(ii) $\varphi(x) = 0$.

Proof. (i) ⇒ (ii) Suppose that $Q'(x) = \Omega'(x)$. Set $\mathcal{A} = \{[0,1]\}$ and $y = P(x|\mathcal{A})$. Clearly, $y \in \Omega'(x) = Q'(x)$. Lemma 7 implies that $\varphi(x) = \varphi(y)$. Lemma 4 implies $\varphi(y) = 0$. The assertion is proved.

(ii) ⇒ (i) Let $x = x^*$ and $0 \le y \in \Omega'(x)$. In this case, $y = y^*\circ\gamma$ for some measure-preserving transformation $\gamma$ (see [15] or [2, Theorem 7.5, p. 82]). Without loss of generality, we may assume that $y = y^*$. Fix $\varepsilon > 0$. Set
\[ s_n(\varepsilon) = \inf\{ s : y(s) \le y(1) + n\varepsilon \}, \qquad n \in \mathbb{N}. \]
Let $\mathcal{A}_\varepsilon$ be the partition determined by the points $s_n(\varepsilon)$, $n \in \mathbb{N}$. Set $u = P(y|\mathcal{A}_\varepsilon)$ and $z = P(x|\mathcal{A}_\varepsilon)$. The functions $u$ and $z$ satisfy the condition $u \prec z$ and are of the form given in (15). By Lemma 3(iii), $\varphi(z) \le \varphi(x) = 0$. By Theorem 18, $P(z|\mathcal{A}) \in Q'(z)$ for every $\mathcal{A} \in \mathfrak{A}$. It follows now from Corollary 20 that $u \in Q'(z)$. However, $z \in Q'(x)$ by Theorem 18. Therefore, by Lemma 16, $u \in Q'(x)$. However, $\|y - u\|_{L_\infty} \le \varepsilon$. Since $\varepsilon$ is arbitrary, $y \in Q'(x)$. □

Theorem 22. Let $E = E(0,1)$ be a fully symmetric space on the interval $(0,1)$. If $x \in E$ and $\varphi(x) = 0$, then $\Omega_+(x) = Q_+(x)$. If, in addition, the norm on $E$ is a Fatou norm, then the converse assertion also holds.
Proof. Suppose that $\varphi(x) = 0$ and let $y \in \Omega_+(x)$. Hence, there exists $s_0 \in [0,1]$ such that $\int_0^{s_0} x^*(s)\,ds = \int_0^1 y^*(s)\,ds$. Set $z = x^*\chi_{[0,s_0]}$. By Theorem 21, $y \in Q'(z)$. Hence, $y \in Q'(z) \subseteq Q_+(x)$.

By Proposition 6, there exist $0 \le y, z \in E$ such that $x = y + z$ and $\varphi(x) = \varphi(y) = \varphi(z)$. By Proposition 8, $\varphi(x) = \varphi(y) + \varphi(z)$. Consequently, $\varphi(x) = 0$. □

Now, consider the case that $E = E(0,\infty)$.

Theorem 23. Let $E = E(0,\infty)$ be a fully symmetric space on the semi-axis. If $E \subseteq L_1$ and $x \in E$, then the following assertions are equivalent.
(i) $\Omega'(x) = Q'(x)$.
(ii) $\varphi_{\mathrm{fin}}(x) = 0$.

Proof. (i) ⇒ (ii) Let $x = x^*$ and suppose that $Q'(x) = \Omega'(x)$. Set $\mathcal{A} = \{[0,1]\}$ and $y = P(x|\mathcal{A})$. Clearly, $y \in \Omega'(x) = Q'(x)$. Lemma 7 implies that $\varphi_{\mathrm{fin}}(x) = \varphi_{\mathrm{fin}}(y)$. Lemma 4 implies $\varphi_{\mathrm{fin}}(y) = 0$. The assertion is proved.

(ii) ⇒ (i) Let $x = x^*$ and $0 \le y \in \Omega'(x)$. It follows from [10, Lemma II.2.1] that for every fixed $\varepsilon > 0$ there exists a measure-preserving transformation $\gamma$ such that $\|y - y^*\circ\gamma\|_E \le \varepsilon$. Without loss of generality, we may assume that $y = y^*$. For every $S > 0$,
\[ \frac{1}{\tau}\bigl\|(\sigma_\tau x)\chi_{[0,S]}\bigr\|_E \le \frac{S}{\tau}\bigl\|(\sigma_\tau x)\chi_{[0,1]}\bigr\|_E \to 0. \]
(a) Suppose first that $\mathrm{supp}(x) = \mathrm{supp}(y) = (0,\infty)$. Fix $\varepsilon > 0$. There exists $T$ such that
\[ \|x\chi_{[T,\infty)}\|_{L_1\cap L_\infty} \le \varepsilon, \qquad \|y\chi_{[T,\infty)}\|_{L_1\cap L_\infty} \le \varepsilon. \]
Clearly, $\int_0^T x(s)\,ds < \int_0^\infty x(s)\,ds$. Hence, there exists $S \ge T$ such that $\int_0^S y(s)\,ds = \int_0^T x(s)\,ds$. By Theorem 21, $y\chi_{[0,S]} \in Q'(x\chi_{[0,T]})$. Hence, $y \in Q'(x) + y\chi_{(S,\infty)} - Q'(x\chi_{(T,\infty)})$ and, therefore, $\mathrm{dist}(y, Q'(x)) \le 2\varepsilon$. Since $\varepsilon$ is arbitrary, $y \in Q'(x)$.

(b) Suppose now that $m(\mathrm{supp}(x)) < \infty$ or $m(\mathrm{supp}(y)) < \infty$. Fix $z = z^* \in L_1 \cap L_\infty$ with infinite support. It is clear that $(y + \varepsilon z) \in \Omega'(x + \varepsilon z)$, $\varepsilon > 0$. By assumption and Lemma 4, $\varphi_{\mathrm{fin}}(x + \varepsilon z) = 0$. Hence, using (a) above, it follows that $(y + \varepsilon z) \in Q'(x + \varepsilon z) \subset Q'(x) + \varepsilon Q'(z)$. Hence, $\mathrm{dist}(y, Q'(x)) \le \varepsilon$ for every $\varepsilon > 0$ and, therefore, $y \in Q'(x)$. □

Theorem 24. Let $E = E(0,\infty)$ be a fully symmetric space on $(0,\infty)$ such that $E \subseteq L_1$. If $0 \le x \in E$ and $\varphi_{\mathrm{fin}}(x) = 0$, then $\Omega_+(x) = Q_+(x)$. If, in addition, the norm on $E$ is a Fatou norm, then the converse assertion also holds.

Proof. Let $\varphi_{\mathrm{fin}}(x) = 0$ and $y \in \Omega_+(x)$. As in Theorem 23, we may assume $y = y^*$. Fix $\varepsilon > 0$. There exists $T > 0$ such that
\[ \|x\chi_{[T,\infty)}\|_{L_1\cap L_\infty} \le \varepsilon, \qquad \|y\chi_{[T,\infty)}\|_{L_1\cap L_\infty} \le \varepsilon. \]
Select $S \le T$ such that
\[ \int_0^S x^*(s)\,ds = \int_0^T y^*(s)\,ds. \]
Clearly, $y\chi_{[0,T]} \in \Omega'(x^*\chi_{[0,S]})$. By Theorem 21, $y\chi_{[0,T]} \in Q'(x^*\chi_{[0,S]}) \subseteq Q_+(x)$. Hence, $y \in Q_+(x)$.

By Proposition 6, there exist $0 \le y, z \in E$ such that $x = y + z$ and $\varphi_{\mathrm{fin}}(x) = \varphi_{\mathrm{fin}}(y) = \varphi_{\mathrm{fin}}(z)$. By Proposition 8, $\varphi_{\mathrm{fin}}(x) = \varphi_{\mathrm{fin}}(y) + \varphi_{\mathrm{fin}}(z)$. Consequently, $\varphi_{\mathrm{fin}}(x) = 0$. □

4.2. The case that $E \not\subseteq L_1$

Theorem 25. Let $E = E(0,\infty)$ be a fully symmetric space on the semi-axis and let $x \in E$. If $\varphi(x) = 0$, then $\Omega_+(x) = Q'(x)$.

Proof. Let us assume first that $y = y^* \in \Omega_+(x)$. Fix $\varepsilon > 0$. Set
\[ t_n(\varepsilon) = 1 + n\varepsilon, \qquad s_n(\varepsilon) = \inf\{ s : y(s) \le y(1) + n\varepsilon \}, \qquad s_{-n}(\varepsilon) = \sup\{ s : y(s) \ge y(1) - n\varepsilon \}. \]
Let $\mathcal{A}_\varepsilon$ be the partition determined by the points $s_{\pm n}(\varepsilon)$, $t_n(\varepsilon)$. Set $u = P(y|\mathcal{A}_\varepsilon)$ and $z = P(x|\mathcal{A}_\varepsilon)$. The functions $u$ and $z$ satisfy the conditions $u \prec\prec z$ and (15). Set
\[ v = u\sum_{k\in K}\chi_{\Delta_k} + z\chi_{(0,\infty)\setminus\bigcup_{k\in K}\Delta_k}, \]
where the collection $\{\Delta_k\}_{k\in K}$ is given by Proposition 19. By Lemma 3(iii), $\varphi(z) \le \varphi(x) = 0$. By Theorem 18, $P(z|\mathcal{A}) \in Q'(z)$ for every $\mathcal{A} \in \mathfrak{A}$. It follows now from Corollary 20 that $v \in Q'(z)$. Since $u \le v$, it follows from Lemma 15 that $u \in Q'(z)$. Theorem 18 implies that $z \in Q'(x)$. By Lemma 16, $u \in Q'(x)$. However,
\[ \mathrm{dist}(y, Q'(x)) \le \|y - u\|_E \le \|y - P(y|\mathcal{A}_\varepsilon)\|_{L_1\cap L_\infty} \le \varepsilon(1 + y(1)). \]
Since $\varepsilon$ is arbitrary, $y \in Q'(x)$.

Let now $y \in \Omega_+(x)$ be arbitrary. By [10, Lemma II.2.1 and Theorem II.2.1], for every fixed $\varepsilon > 0$, there exist $y_1 \in E$, $y_2 \in E$ with $y = y_1 + y_2$ and a measure-preserving transformation $\gamma$ such that $0 \le y_1 \le y^*\circ\gamma$ and $\|y_2\|_E \le \varepsilon$. Since we already proved that $y^* \in Q'(x)$, the assertion follows immediately. □

Theorem 26. Let $E = E(0,\infty)$ be a fully symmetric space on the semi-axis. Suppose that $E \not\subseteq L_1$ and $x \in E$. If $\varphi(x) = 0$, then the set $\Omega_+(x)$ is the norm-closed convex hull of its extreme points. If, in addition, the norm on $E$ is a Fatou norm, then the converse assertion also holds.

Proof. Suppose that $\varphi(x) = 0$. Applying Theorem 25 and noting the embedding $Q'(x) \subseteq Q_+(x)$ yields the assertion.
Conversely, by Proposition 6(iv), there exist $0 \le y_1, z_1 \in E$ such that $x = y_1 + z_1$ and $\varphi_{\mathrm{cut}}(x) = \varphi_{\mathrm{cut}}(y_1) = \varphi_{\mathrm{cut}}(z_1)$. By assumption, $\Omega_+(x) = Q_+(x)$ and so $y_1, z_1 \in Q_+(x)$. By Proposition 8, $\varphi_{\mathrm{cut}}(x) = \varphi_{\mathrm{cut}}(y_1) + \varphi_{\mathrm{cut}}(z_1)$. Consequently, $\varphi_{\mathrm{cut}}(x) = 0$. By Proposition 6(ii), there exist $0 \le y_2, z_2 \in E$ such that $x = y_2 + z_2$ and $\varphi(x) = \varphi(y_2) = \varphi(z_2)$. Again, by the assumption, we have $y_2, z_2 \in Q_+(x)$ and therefore, by Proposition 9, we have $\varphi(x) = \varphi(y_2) + \varphi(z_2)$. Consequently, $\varphi(x) = 0$. □

Acknowledgments

We would like to thank Peter Dodds for many helpful comments on the content of this paper and lengthy discussions of earlier drafts. We also thank Sergei Astashkin, Aleksandr Sedaev and Evgenii Semenov for their interest.

Appendix A

A.1. An application to the case of orbits $\Omega(x)$

The following consequence of Theorem 22 is essentially due to Braverman and Mekler [3].

Corollary 27. If $\varphi(x) = 0$, then $\Omega(x)$ is the norm-closed convex hull of its extreme points.

Proof. Let $x = x^*$ and $y \in \Omega(x)$. Clearly, $y = u\cdot|y|$, where $|u| = 1$ a.e. and $|y| \in \Omega_+(x)$. Fix $\varepsilon > 0$. By Theorem 22, there exist $n \in \mathbb{N}$, scalars $\lambda_{n,i}, \beta_{n,i} \in [0,1]$ and functions $x_{n,i} \sim x\chi_{[0,\beta_{n,i}]}$ such that $\sum_{i=1}^{n} \lambda_{n,i} = 1$ and
\[ \Bigl\| |y| - \sum_{i=1}^{n} \lambda_{n,i}x_{n,i} \Bigr\|_E \le \varepsilon. \]
There exist measure-preserving transformations $\gamma_{n,i}$, $1 \le i \le n$ (see [15]), such that $x_{n,i} = (x^*\chi_{[0,\beta_{n,i}]})\circ\gamma_{n,i}$. Set $x^1_{n,i} = u\cdot x\circ\gamma_{n,i}$ and $x^2_{n,i} = u\cdot(x\chi_{[0,\beta_{n,i}]} - x\chi_{[\beta_{n,i},1]})\circ\gamma_{n,i}$, $1 \le i \le n$. It is clear that $x^1_{n,i}, x^2_{n,i} \sim x$, $1 \le i \le n$, and
\[ \Bigl\| y - \frac{1}{2}\sum_{i=1}^{n} \lambda_{n,i}x^1_{n,i} - \frac{1}{2}\sum_{i=1}^{n} \lambda_{n,i}x^2_{n,i} \Bigr\|_E \le \varepsilon. \]
□
A.2. Extreme points of the orbit $\Omega_+(x)$

The following theorem is due to Ryff (see [14]).

Theorem 28. If $0 \le x \in L_1(0,1)$, then $y \in \mathrm{extr}(\Omega'(x))$ if and only if $y^* = x^*$.

Corollary 29. If $0 \le x \in L_1(0,1)$, then $y \in \mathrm{extr}(\Omega_+(x))$ if and only if $y^* = x^*\chi_{[0,\beta]}$ for some $\beta \ge 0$.

Proof. Indeed, if $\int_0^\beta x^*(s)\,ds = \int_0^1 y^*(s)\,ds$, then $y \in \Omega'(x^*\chi_{[0,\beta]})$. Therefore, if $y \in \mathrm{extr}(\Omega_+(x))$, then obviously $y \in \mathrm{extr}(\Omega'(x^*\chi_{[0,\beta]}))$ and the assertion follows immediately from Theorem 28.
If $y^* = x^*\chi_{[0,\beta]}$ and $y = \frac12(u_1 + u_2)$ with $u_i \in \Omega_+(x)$, then $\int_0^t u_i^*(s)\,ds = \int_0^t x^*(s)\,ds$ for $t \in [0,\beta]$ and $\mathrm{supp}(u_i) = \mathrm{supp}(y)$. Therefore, $(u_1 + u_2)^* = u_1^* + u_2^*$. It follows now from [10, (II.2.19)] that $u_1 = u_2$. □

Lemma 30. If $0 \le x \in L_1 + L_\infty$ and $y \in \mathrm{extr}(\Omega_+(x))$, then yχ{y 0, then either y ∗ = x ∗ χ[0,β] for some β ∈ [0, ∞) or y ∗ = x ∗ and yχ{y 0 and find t2 t1 t1 + t2 ∗ ∗ such that 0 x (s) ds = 0 y (s) ds. Clearly, $y^*\chi_{[0,t_1]} \prec x^*\chi_{[0,t_2]}$ and $y^*\chi_{[t_1,\infty)} \prec\prec x^*\chi_{[t_2,\infty)}$. If $y^*\chi_{[0,t_1]} = \frac12(u_1 + u_2)$ with $u_1, u_2 \in \Omega'(x^*\chi_{[0,t_2]})$, then set $y_i = u_i\chi_{[0,t_1]} + y^*\chi_{[t_1,\infty)}$. We claim $y_i \prec\prec x$. Indeed, if $e \subset (0,\infty)$ and $m(e) < \infty$, then $e = e_1 \cup e_2$ with $e_1 \subset [0,t_1]$ and $e_2 \subset [t_1,\infty)$. Therefore,
\[ \int_e y_i(s)\,ds = \int_{e_1} u_i(s)\,ds + \int_{e_2} y^*(s)\,ds \le \int_0^{m(e_1)} u_i^*(s)\,ds + \int_{t_1}^{t_1+m(e_2)} y^*(s)\,ds \le \int_0^{\min\{t_2,\,m(e_1)\}} x^*(s)\,ds + \int_{t_2}^{t_2+m(e_2)} x^*(s)\,ds \le \int_0^{m(e)} x^*(s)\,ds. \]
Hence, $y_i \in \Omega_+(x)$ and $y = \frac12(y_1 + y_2)$. Thus, $y \notin \mathrm{extr}(\Omega_+(x))$. Therefore, $y^*\chi_{[0,t_1]} \in \mathrm{extr}(\Omega'(x^*\chi_{[0,t_2]}))$. By Theorem 28, $y^* = x^*$ on $[0,t_2]$. The assertion follows now from Lemma 30. The converse assertion is easy. □

Corollary 32. If $x \in L_1(0,\infty)$, then $0 \le y \in \mathrm{extr}(\Omega'(x))$ if and only if $y^* = x^*$.

The proof is identical to that of Corollary 31.

A.3. Marcinkiewicz spaces with trivial functional $\varphi$

It follows from Lemma 3 and the definition of Marcinkiewicz space that $\varphi = 0$ if and only if $\varphi(\psi') = 0$. It is now easy to derive that in the case of the interval $(0,1)$ this is equivalent to the condition
\[ \liminf_{t\to 0} \frac{\psi(2t)}{\psi(t)} > 1. \]
In the case of the semi-axis, the condition
\[ \liminf_{t\to\infty} \frac{\psi(2t)}{\psi(t)} > 1 \]
needs to be added.

A.4. A comparison of conditions (1) and (2) in Orlicz spaces

Let $M$ be a convex function satisfying (5) and let $L_M$ be the corresponding Orlicz space on $(0,1)$. The following proposition shows that $L_M$ always satisfies condition (2).

Proposition 33. We have $\varphi(x) = 0$ for every $x \in L_M$.

Proof. Using the description of relatively weakly compact subsets in $L_M$ given in [1] (see also [12, p. 144]) we see that for every $0 \le y \in L_M$
\[ n\int_0^{1/n} M\Bigl(\frac{1}{n}\,y(s)\Bigr)\,ds \to 0. \]
We are going to prove that $\frac1n\|\sigma_n x\|_{L_M} \to 0$ for every $x \in L_M$. Assume the contrary. Let $\|\sigma_n x\|_{L_M} \ge n\alpha$ for some $0 \le x \in L_M$, some $\alpha > 0$ and for arbitrarily large $n \ge 1$. By the definition of the norm $\|\cdot\|_{L_M}$, we have
\[ \int_0^1 M\Bigl(\frac{1}{n\alpha}\,\sigma_n x(s)\Bigr)\,ds \ge 1. \]
Hence,
\[ n\int_0^{1/n} M\Bigl(\frac{1}{n}\,y(s)\Bigr)\,ds \ge 1 \]
with $y = \alpha^{-1}x \in L_M$. A contradiction. □
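The step from the integral over $(0,1)$ to the integral over $(0,1/n)$ in the proof of Proposition 33 is a change of variables. As a sketch, assuming the usual dilation convention $(\sigma_n x)(s) = x(s/n)$ for $s \in (0,1)$ (this convention is not restated in this part of the paper), the substitution $s = nu$ gives:

```latex
\[
\int_0^1 M\Bigl(\frac{1}{n\alpha}\,(\sigma_n x)(s)\Bigr)\,ds
 = \int_0^1 M\Bigl(\frac{1}{n\alpha}\,x\Bigl(\frac{s}{n}\Bigr)\Bigr)\,ds
 = n\int_0^{1/n} M\Bigl(\frac{1}{n}\cdot\frac{x(u)}{\alpha}\Bigr)\,du
 = n\int_0^{1/n} M\Bigl(\frac{1}{n}\,y(u)\Bigr)\,du,
 \qquad y = \alpha^{-1}x .
\]
```

With this identity, the lower bound $\ge 1$ for the left-hand side along a sequence of arbitrarily large $n$ is incompatible with the convergence to $0$ stated at the start of the proof.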
However, there exists an Orlicz space $L_M$ which fails to satisfy condition (1). For the definition of the Boyd indices $1 \le p_E \le q_E \le \infty$ of a fully symmetric space $E$, we refer the reader to [11, 2.b.1 and p. 132]. It is clear that condition (1) holds for a fully symmetric space $E$ if and only if $p_E > 1$. However (see e.g. [11]), an Orlicz space $L_M$ is separable if and only if $q_{L_M} < \infty$. It is well known that there exists a non-separable Orlicz space $L_M$ with $p_{L_M} = 1$.
A.5. An application to symmetric functionals

Let $E$ be a fully symmetric space. A positive functional $f \in E^*$ is said to be symmetric (respectively, fully symmetric) if $f(y) = f(x)$ (respectively, $f(y) \le f(x)$) for all $0 \le x, y \in E$ such that $y^* = x^*$ (respectively, $y \prec\prec x$). We refer to [8,5] and references therein for the exposition of the theory of singular fully symmetric functionals and their applications. Recently, symmetric functionals which fail to be fully symmetric were constructed in [9] on some Marcinkiewicz spaces. However, for Orlicz spaces the situation is different. The following proposition shows that a symmetric functional on an Orlicz space on the interval $(0,1)$ is necessarily fully symmetric.

Proposition 34. Any symmetric functional on $L_M$ is fully symmetric.

Proof. Let $\omega \in E^*$ be symmetric. It is clear that $\omega(x^*\chi_{[0,\beta]}) \le \omega(x)$ for $x \ge 0$. Therefore, $\omega(y) \le \omega(x)$ for $y \in \mathrm{Conv}\{y^* = x^*\chi_{[0,\beta]}\}$. Since $\omega$ is continuous, we have $\omega(y) \le \omega(x)$ for $y \in Q_+(x)$. By Theorem 22 and Proposition 33, we have $Q_+(x) = \Omega_+(x)$, and so $\omega$ is a fully symmetric functional on $L_M$. □

Corollary 35. Any singular symmetric functional on $L_M$ vanishes.

Proof. Indeed, there are no fully symmetric singular functionals on $L_M$ (see [8, Theorem 3.1]). □

We also formulate the following hypothesis: If $E$ is a fully symmetric space, then the functional $\varphi$ vanishes if and only if there are no singular symmetric functionals on $E$.

References

[1] T. Ando, Weakly compact sets in Orlicz spaces, Canad. J. Math. 14 (1962) 170–176.
[2] C. Bennett, R. Sharpley, Interpolation of Operators, Pure Appl. Math., vol. 129, Academic Press Inc., Boston, MA, 1988.
[3] M.Sh. Braverman, A.A. Mekler, The Hardy–Littlewood property for symmetric spaces, Siberian Math. J. 18 (1977) 371–385.
[4] A.-P. Calderón, Spaces between L1 and L∞ and the theorem of Marcinkiewicz, Studia Math. 26 (1966) 273–299.
[5] A.L. Carey, F.A. Sukochev, Dixmier traces and their applications in noncommutative geometry, Russian Math. Surveys 61 (2006) 45–110.
[6] V.I. Chilin, A.V. Krygin, F.A. Sukochev, Extreme points of convex fully symmetric sets of measurable operators, Integral Equations Operator Theory 15 (1992) 186–226.
[7] P.G. Dodds, F.A. Sukochev, G. Schlüchtermann, Weak compactness criteria in symmetric spaces of measurable operators, Math. Proc. Cambridge Philos. Soc. 131 (2001) 363.
[8] P.G. Dodds, B. de Pagter, E.M. Semenov, F.A. Sukochev, Symmetric functionals and singular traces, Positivity 2 (1998) 47–75.
[9] N. Kalton, F. Sukochev, Rearrangement-invariant functionals with applications to traces on symmetrically normed ideals, Canad. Math. Bull. 51 (2008) 67–80.
[10] S.G. Krein, Ju.I. Petunin, E.M. Semenov, Interpolation of Linear Operators, Nauka, Moscow, 1978 (in Russian); English translation in: Transl. Math. Monogr., vol. 54, Amer. Math. Soc., Providence, RI, 1982.
[11] J. Lindenstrauss, L. Tzafriri, Classical Banach Spaces I and II: Sequence Spaces; Function Spaces, Springer, 1996.
[12] M.M. Rao, Z.D. Ren, Theory of Orlicz Spaces, Monographs and Textbooks in Pure and Applied Mathematics, vol. 146, Marcel Dekker Inc., New York, 1991.
[13] J.V. Ryff, Orbits of L1-functions under doubly stochastic transformations, Trans. Amer. Math. Soc. 117 (1965) 92–100.
[14] J.V. Ryff, Extreme points of some convex subsets of L1(0, 1), Proc. Amer. Math. Soc. 18 (1967) 1026–1034.
[15] J.V. Ryff, Measure preserving transformations and rearrangements, J. Math. Anal. Appl. 31 (1970) 449–458.
Journal of Functional Analysis 257 (2009) 219–242 www.elsevier.com/locate/jfa
Some s-numbers of an integral operator of Hardy type on $L^{p(\cdot)}$ spaces

D.E. Edmunds a,∗, J. Lang b,∗, A. Nekvinda c,1

a Department of Mathematics, Mantell Building, University of Sussex, Brighton BN1 9RF, UK
b Department of Mathematics, The Ohio State University, 100 Math Tower, 231 West 18th Avenue, Columbus, OH 43210-1174, USA
c Department of Mathematics, Faculty of Civil Engineering, Czech Technical University, Thákurova 7, 16629 Prague 6, Czech Republic

Received 25 October 2008; accepted 23 February 2009
Available online 6 March 2009
Communicated by C. Kenig

Abstract

Let $I = [a,b] \subset \mathbb{R}$, let $p : I \to (1,\infty)$ be either a step-function or strong log-Hölder continuous on $I$, let $L^{p(\cdot)}(I)$ be the usual space of Lebesgue type with variable exponent $p$, and let $T : L^{p(\cdot)}(I) \to L^{p(\cdot)}(I)$ be the operator of Hardy type defined by $Tf(x) = \int_a^x f(t)\,dt$. For any $n \in \mathbb{N}$, let $s_n$ denote the $n$th approximation, Gelfand, Kolmogorov or Bernstein number of $T$. We show that
\[ \lim_{n\to\infty} n s_n = \frac{1}{2\pi}\int_I \bigl( p'(t)\,p(t)^{p(t)-1} \bigr)^{1/p(t)} \sin\bigl(\pi/p(t)\bigr)\,dt, \]
where $p'(t) = p(t)/(p(t)-1)$. The proof hinges on estimates of the norm of the embedding id of $L^{q(\cdot)}(I)$ in $L^{r(\cdot)}(I)$, where $q, r : I \to (1,\infty)$ are measurable, bounded away from 1 and $\infty$, and such that, for some $\varepsilon \in (0,1)$, $r(x) \le q(x) \le r(x) + \varepsilon$ for all $x \in I$. It is shown that $\min(1, |I|^{\varepsilon}) \le \|\mathrm{id}\| \le \varepsilon|I| + \varepsilon^{-\varepsilon}$, a result that has independent interest.

* Corresponding author.
E-mail addresses: [email protected] (D.E. Edmunds), [email protected] (J. Lang), [email protected] (A. Nekvinda). URL: http://www.math.ohio-state.edu/~lang/ (J. Lang). 1 The third author was supported by grant no. MSM 201/08/0383 of the Grant Agency of the Czech Republic. 0022-1236/$ – see front matter © 2009 Elsevier Inc. All rights reserved. doi:10.1016/j.jfa.2009.02.015
D.E. Edmunds et al. / Journal of Functional Analysis 257 (2009) 219–242
© 2009 Elsevier Inc. All rights reserved. Keywords: Hardy-type operator; Compactness; s-Numbers; Lp(·)
1. Introduction

Let $I = [a,b]$ be a compact interval in the real line and let $T$ be the operator of Hardy type given by
\[ Tf(x) := \int_a^x f(t)\,dt \qquad (x \in I). \tag{1.1} \]
It is well known that if $p \in (1,\infty)$, then $T$ is a compact map from $L^p(I)$ to $L^p(I)$ (see, for example, [2, Chapter 2, §3]); moreover, if $s_n(T)$ stands for the $n$th approximation, Bernstein, Gelfand or Kolmogorov number of $T$, then
\[ \lim_{n\to\infty} n s_n(T) = \gamma_p (b-a)/2, \tag{1.2} \]
where $\gamma_p = \pi^{-1} p^{1/p'} (p')^{1/p} \sin(\pi/p)$ and $p' = p/(p-1)$. We refer to [2] and [7] for details of this and similar results for more general operators of Hardy type. The position when $T$ is viewed as a map from $L^p(I)$ to $L^q(I)$ and $p \ne q$ is less simple, but nevertheless genuine asymptotic results similar to (1.2) have been obtained for various s-numbers of $T$ in particular circumstances: see [3] and [5].

The focus of the present paper is on the behavior of s-numbers of the map $T$ when it acts from the variable exponent space $L^{p(\cdot)}(I)$ to $L^{p(\cdot)}(I)$. Here $p : I \to (1,\infty)$ and by $L^{p(\cdot)}(I)$ is meant the space of all real-valued functions $f$ on $I$ such that for some $\lambda > 0$,
\[ \int_I |f(x)/\lambda|^{p(x)}\,dx < \infty; \]
endowed with the norm
\[ \|f\|_{p(\cdot)} := \inf\Bigl\{ \lambda > 0 : \int_I |f(x)/\lambda|^{p(x)}\,dx \le 1 \Bigr\} \tag{1.3} \]
it is a Banach space. Because of their natural occurrence in various significant physical contexts (see [12]), these spaces (which are particular cases of Musielak–Orlicz spaces) have been intensively studied in recent years, considerable emphasis being placed on the properties on them of such classical operators in harmonic analysis as the Hardy–Littlewood maximal operator. Our main result is a direct analogue of (1.2): if $p$ is either a step-function or a strong log-Hölder continuous function (see Definition 4.10 and note that any Lipschitz or Hölder function $p(\cdot)$ is a strong log-Hölder continuous function), then
\[ \lim_{n\to\infty} n s_n(T) = \frac{1}{2\pi}\int_I \bigl( p'(x)\,p(x)^{p(x)-1} \bigr)^{1/p(x)} \sin\bigl(\pi/p(x)\bigr)\,dx, \tag{1.4} \]
where $s_n(T)$ is the $n$th approximation, Gelfand, Kolmogorov or Bernstein number of $T : L^{p(\cdot)}(I) \to L^{p(\cdot)}(I)$. So far as we are aware, this is the first result concerning the s-numbers of operators acting on spaces with variable exponent, despite the clear importance of these numbers and the considerable literature devoted to them in the context of classical Lebesgue spaces. A key step in the proof is the following two-sided estimate of the norm of the embedding id of $L^{q(\cdot)}(I)$ in $L^{p(\cdot)}(I)$ when, for $\varepsilon \in (0,1)$, $p(x) \le q(x) \le p(x) + \varepsilon$ $(x \in I)$:
\[ \min(1, |I|^{\varepsilon}) \le \|\mathrm{id}\| \le \varepsilon|I| + \varepsilon^{-\varepsilon}. \]
This has intrinsic interest, being a sharp improvement of the classical embedding theorem for $L^{p(\cdot)}$ spaces due to Kováčik and Rákosník [6].

2. Preliminaries

Throughout the paper $I$ will stand for a compact interval $[a,b]$ in the real line $\mathbb{R}$, and given any measurable subset $E$ of $I$, the Lebesgue measure of $E$ will be denoted by $|E|$ and the characteristic function of $E$ by $\chi_E$. By $\mathcal{M}(I)$ is meant the family of all extended scalar-valued (real or complex) measurable functions on $I$, and $\mathcal{P}(I)$ will stand for the subset of $\mathcal{M}(I)$ consisting of all those functions $p(\cdot)$, with values in $(1,\infty)$, such that
\[ 1 < p_- := \operatorname*{ess\,inf}_{x\in I} p(x) \le p_+ := \operatorname*{ess\,sup}_{x\in I} p(x) < \infty. \]
For all $f \in \mathcal{M}(I)$, define
\[ \rho_{p(\cdot)}(f) = \int_I |f(x)|^{p(x)}\,dx \]
and
\[ \|f\|_{p(\cdot),I} = \|f\|_{p(\cdot)} = \inf\{ \lambda > 0 : \rho_{p(\cdot)}(f/\lambda) \le 1 \}. \]
The generalised Lebesgue space $L^{p(\cdot)}(I)$ (or space with variable exponent) is the set $L^{p(\cdot)}(I) := \{ f : \|f\|_{p(\cdot)} < \infty \}$, equipped with the norm $\|\cdot\|_{p(\cdot)}$; it is routine to verify that it is a Banach space; indeed, it is a Banach function space. We refer to [6] for an account of the fundamental properties of these spaces and in particular for the following basic embedding theorem, in which by $X \to Y$ we mean that the Banach space $X$ is continuously embedded in the Banach space $Y$.

Theorem 2.1. Let $p(\cdot), q(\cdot) \in \mathcal{P}(I)$ be such that for all $x \in I$, $p(x) \le q(x)$. Then $L^{q(\cdot)}(I) \to L^{p(\cdot)}(I)$.
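The norm $\|f\|_{p(\cdot)}$ is defined implicitly through the modular $\rho_{p(\cdot)}$, so it can be helpful to see how it might be approximated in practice. The sketch below is an illustration only: the midpoint quadrature, the bisection scheme and the names `modular` and `luxemburg_norm` are assumptions made for this example and do not come from the paper.

```python
def modular(f, p, a, b, lam, steps=2000):
    # Midpoint-rule approximation of rho_{p(.)}(f / lam) = int_a^b |f(x)/lam|^{p(x)} dx.
    h = (b - a) / steps
    total = 0.0
    for k in range(steps):
        xk = a + (k + 0.5) * h
        total += abs(f(xk) / lam) ** p(xk)
    return total * h

def luxemburg_norm(f, p, a, b, iters=60):
    # The modular is non-increasing in lam, so bisection can locate the
    # smallest lam with modular(f / lam) <= 1.
    lo, hi = 1e-12, 1.0
    while modular(f, p, a, b, hi) > 1.0:
        hi *= 2.0
    for _ in range(iters):
        mid = 0.5 * (lo + hi)
        if modular(f, p, a, b, mid) > 1.0:
            lo = mid
        else:
            hi = mid
    return hi

# Constant exponent: for f = 1 on [0, 2] and p = 2 the norm is the L^2 norm sqrt(2).
norm_const = luxemburg_norm(lambda x: 1.0, lambda x: 2.0, 0.0, 2.0)
# Hypothetical variable exponent p(x) = 2 + x on [0, 1]; for f = 1 the modular
# equals 1 exactly at lam = 1, so the norm is 1.
norm_var = luxemburg_norm(lambda x: 1.0, lambda x: 2.0 + x, 0.0, 1.0)
print(norm_const, norm_var)
```

The accuracy is limited by the quadrature, so for exponents or functions with jumps (for example step-function exponents) the grid would need to be aligned with the discontinuities.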
In what follows we consider the Hardy operator $T$ acting on a space $L^{p(\cdot)}(I)$ with variable exponent. To establish its compactness we use the following well-known result concerning its behaviour on classical Lebesgue spaces.

Theorem 2.2. Let $r, s \in (1, \infty)$. Then $T$ maps $L^r(I)$ compactly into $L^s(I)$.

This follows from [2], Theorems 2.3.1 and 2.3.4, for example. Now the compactness of $T$ on spaces with variable exponent follows quickly.

Lemma 2.3. Let $1 < c < d < \infty$ and suppose that $p(\cdot), q(\cdot) \in \mathcal{P}(I)$ are such that for all $x \in I$ we have $p(x), q(x) \in (c, d)$. Then $T$ maps $L^{p(\cdot)}(I)$ compactly into $L^{q(\cdot)}(I)$.

Proof. By Theorem 2.1, $L^{p(\cdot)}(I)$ and $L^d(I)$ are continuously embedded in $L^c(I)$ and $L^{q(\cdot)}(I)$, respectively. Moreover, by Theorem 2.2, $T$ maps $L^c(I)$ compactly into $L^d(I)$. The result now follows by composition of these maps. □

More detailed information about the compactness properties of $T$ is provided by the approximation, Bernstein, Gelfand and Kolmogorov numbers, and we next recall the definition of these quantities. Let $X$ and $Y$ be Banach spaces and let $S : X \to Y$ be compact and linear. Then given any $n \in \mathbb{N}$, the $n$th approximation number of $S$ is defined to be
\[
a_n(S) = \inf \|S - F\|,
\]
where the infimum is taken over all bounded linear maps $F : X \to Y$ with rank less than $n$; the $n$th Bernstein number of $S$ is
\[
b_n(S) = \sup_{X_n} \inf_{x \in X_n \setminus \{0\}} \|Sx\|_Y / \|x\|_X,
\]
where the supremum is taken over all $n$-dimensional subspaces $X_n$ of $X$; the $n$th Gelfand number of $S$ is
\[
c_n(S) = \inf\bigl\{\|S J_M^X\| : M \text{ is a linear subspace of } X,\ \operatorname{codim} M < n\bigr\},
\]
where $J_M^X$ is the embedding map from $M$ to $X$; and the $n$th Kolmogorov number of $S$ is
\[
d_n(S) = \inf_{X_n} \sup_{0 < \|f\|_X \le 1} \inf_{g \in X_n} \|Sf - g\|_Y / \|f\|_X,
\]
where the outer infimum is taken over all $n$-dimensional subspaces $X_n$ of $X$. Further details of these numbers and their basic properties will be found in [1,9] and [11]; for the moment we simply note that the approximation numbers are the largest of them. We recall that not all these s-numbers have the multiplicative property detailed in [1], p. 72: the Bernstein numbers fail to have it (see [10]). However, every s-number $s_n$ satisfies the inequality
\[
s_n(R \circ T \circ S) \le \|R\|\, s_n(T)\, \|S\| \tag{2.1}
\]
for arbitrary and appropriately composed bounded linear maps $R$, $S$ and $T$.
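To determine the properties of $T$ we introduce certain functions that will play a key role in our analysis.

Definition 2.4. Let $p(\cdot), q(\cdot) \in \mathcal{P}(I)$, let $J = (c, d) \subset I$, and let $\varepsilon > 0$. We define
\[
A_{p(\cdot),q(\cdot)}(J) = \inf_{y \in J} \sup\Bigl\{ \Bigl\| \int_y^{\cdot} f \Bigr\|_{q(\cdot),J} : \|f\|_{p(\cdot),J} \le 1 \Bigr\},
\]
\[
B_{p(\cdot)}(J) = \inf_{y \in J} \sup\Bigl\{ \Bigl\| \int_y^{\cdot} f \Bigr\|_{p_J^+,J} : \|f\|_{p_J^-,J} \le 1 \Bigr\},
\]
\[
C_{p(\cdot),q(\cdot)}(J) = \sup\bigl\{ \|Tf\|_{q(\cdot),J} : \|f\|_{p(\cdot),J} \le 1,\ (Tf)(c) = (Tf)(d) = 0 \bigr\},
\]
\[
D_{p(\cdot)}(J) = \sup\bigl\{ \|Tf\|_{p_J^-,J} : \|f\|_{p_J^+,J} \le 1,\ (Tf)(c) = (Tf)(d) = 0 \bigr\},
\]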
where pJ− = inf{p(x); x ∈ J } and pJ+ = sup{p(x); x ∈ J }. Corresponding to these functions we define NAp(·),q(·) (ε) to be the minimum of all those n ∈ N such that I can be written as I = nj=1 Ij , where each Ij is a closed sub-interval of I , |Ii ∩ Ij | = 0 (i = j ) and Ap(·),q(·) (Ij ) ε for every j. The quantities NBp(·) (ε), NCp(·),q(·) (ε), NDp(·) (ε) are defined in an exactly similar way. We shall write Ap(·) (J ) = Ap(·),p(·) (J ) and Cp(·) (J ) = Cp(·),p(·) (J ), denoting these quantities by Ap , Cp respectively if p(x) = p is a constant function. When p(x) = p and q(x) = q are constant functions then we will write Ap,q (J ) = Ap(·),q(·) (J ) and Cp,q (J ) = Cp(·),q(·) (J ). Functions of this kind were introduced in previous work on the s-numbers of Hardy-type operators in the context of classical Lebesgue spaces (see, for example, [2,3,5] and [7]), and in fact for that situation we have the following result. Lemma 2.5. When J = (c, d) ⊂ I, and p is a constant function, so that p(x) = p ∈ (1, ∞) for all x ∈ I, 1/p |J | , Ap (J ) = Bp (J ) = p p p−1 2πp where πp =
2π p sin(π/p) .
The following lemma was proved in [13]. Lemma 2.6. Let J = (c, d) ⊂ I, and p, q ∈ (1, ∞). Then 1− p1 + q1
Tf q,J (p + q) sup = f f p,J
and the extremals are the non-zero multiples of cosp,q (πp,q x/2). This leads us to the following result.
1− p1 + q1
(p )1/q q 1/p |J | B(1/p , 1/q)
,
Lemma 2.7. Let J = (c, d) ⊂ I, and p, q ∈ (1, ∞). Then Ap,q (J ) = Cp,q (J ) =
1− p1 + q1
(p + q)
1− p1 + q1
(p )1/q q 1/p |J | 2B(1/p , 1/q)
1− p1 + q1
:= B(p, q)|J |
(2.2)
From Lemma 2.5, by using techniques from [4,7,8] and with the help of the well-known inequality bn (T ) min cn (T ), dn (T ) max cn (T ), dn (T ) an (T ),
(2.3)
we obtain the following theorem. Theorem 2.8. Let p be as in Lemma 2.5 and let T be viewed as a map from Lp (I ) to itself. Then for all n ∈ N, an (T ) = cn (T ) = dn (T ) = bn (T ) = where πp =
(p p p−1 )1/p |I |, πp n
2π p sin(π/p) .
It is known that under the conditions of the last lemma, A(J ) depends continuously on the right-hand endpoint of J ; that is, with a slight abuse of notation, the function A(c, ·) is continuous. A similar result holds for non-constant p: this is formulated in the next lemma together with the corresponding results for B, C and D. Lemma 2.9. Let p(·), q(·) ∈ P(I ). Then the functions Ap(·),q(·) (c, t), Bp(·) (c, t), Cp(·),q(·) (c, t) and Dp(·) (c, t) of the variable t are non-decreasing and continuous. Analogously the functions Ap(·),q(·) (t, d), Bp(·) (t, d), Cp(·),q(·) (t, d) and Dp(·) (t, d) are non-decreasing and continuous. Proof. We start with A := Ap(·),q(·) . First we prove that A(c, d) A(c, d + h) when h 0. Clearly · A(c, d + h) = inf sup f (t) dt y∈(c,d+h)
; f p(·),(c,d+h) 1
q(·),(c,d+h)
y
· = min inf sup f (t) dt y∈(c,d)
y
· inf sup f (t) dt y∈(d,d+h) y
:= min{X, Y }.
; f p(·),(c,d+h) 1 , q(·),(c,d+h)
; f p(·),(c,d+h) 1 q(·),(c,d+h)
Now · X inf sup f (t) dt y∈(c,d) y
; f p(·),(c,d) 1 = A(c, d) q(·),(c,d)
and · Y inf sup f (t) dt y∈(d,d+h)
; f p(·),(c,d) 1
q(·),(c,d)
y
· sup f (t) dt
; f p(·),(c,d) 1
q(·),(c,d)
d
· inf sup f (t) dt y∈(c,d)
; f p(·),(c,d) = A(c, d),
q(·),(c,d)
y
which gives A(c, d + h) A(c, d). Let us prove the continuity of A. By Hölder’s inequality (see [6]) we have, for some α 1 (independent of f , x and y), x f (t) dt α1p (·),(y,x) f p(·),(y,x) y
and considering 1p (·),(y,x) as a function of x we obtain 1p (·),(y,x)
q(·),(d,d+h)
1p (·),(c,d+h) 1q(·),(d,d+h)
which gives A(c, d) A(c, d + h)
· = inf sup f (t) dt y∈(c,d+h) y
· inf sup f (t) dt y∈(c,d+h) y
· + f (t) dt y
; f p(·),(c,d+h) 1 q(·),(c,d+h)
q(·),(c,d)
; f p(·),(c,d+h) 1 q(·),(d,d+h)
· inf sup f (t) dt y∈(c,d+h)
q(·),(c,d)
y
+ α 1p (·),(y,x) f p(·),(y,x)
q(·),(d,d+h)
; f p(·),(c,d+h) 1
· inf sup f (t) dt y∈(c,d+h)
q(·),(c,d)
y
+ α 1p (·),(y,x)
q(·),(d,d+h)
; f p(·),(c,d+h) 1
· inf sup f (t) dt y∈(c,d+h) y
; f p(·),(c,d+h) 1
q(·),(c,d)
+ α1p (·),(c,d+h) 1q(·),(d,d+h) ·
inf sup f (t) dt ; f p(·),(c,d+h) 1 y∈(c,d) y
q(·),(c,d)
+ α1p (·),(c,d+h) 1q(·),(d,d+h) ·
= inf sup f (t) dt ; f p(·),(c,d) 1 y∈(c,d) y
q(·),(c,d)
+ α1p (·),(c,d+h) 1q(·),(d,d+h) = A(c, d) + α1p (·),(c,d+h) 1q(·),(d,d+h) . Since q(x) ∈ P(I ) we know that 1q(·),(d,d+h) → 0 as h → 0 and so, A(c, ·) is rightcontinuous. Left-continuity is proved in a corresponding manner, and the continuity of A(c, ·) follows. The arguments for B, C and D are similar. 2 As an immediate consequence of this and Lemma 2.3 we have Lemma 2.10. Let p(·) ∈ P(I ). Then T : Lp(·) (I ) → Lp(·) (I ) is compact and for all ε > 0 the quantities NAp(·) (ε), NBp(·) (ε), NCp(·) (ε) and NDp(·) (ε) are finite. We now have Lemma 2.11. Let p(·) be as in the last lemma. Then given any N ∈ N, there exists a unique ε > 0 such that NA (ε) = N, and there is a non-overlapping covering of I by intervals IAi (i = 1, . . . , N) such that A(IAi ) = ε for i = 1, . . . , N. The same result holds when A is replaced by B, C, D. Proof. The existence follows from the continuity properties established in Lemma 2.9.
i N For uniqueness, observe that given two non-overlapping coverings of I , {IAi }N i=1 and {JA }i=1 , j there are m, j, k, l such that IAm ⊂ JA and JAk ⊂ IAl . Now, assuming A(IAi ) = ε1 , A(JAi ) = ε2 we obtain ε1 ε2 ε1 by the monotonicity of A. 2
3. The case when p(·) is a step-function

Let $\{J_i\}_{i=1}^m$ be a disjoint covering of $I$ by intervals and let $p$ be the step-function defined by
\[
p(x) = \sum_{i=1}^{m} \chi_{J_i}(x)\, p_i, \tag{3.1}
\]
where each pi belongs to (1, ∞). For simplicity, in this section we shall write A instead of Ap(·) ; B, C, D will have the analogous meaning. Lemma 3.1. Let p(·) be the step-function given by (3.1). Then T : Lp(·) (I ) → Lp(·) (I ) is compact and for sufficiently small ε > 0, (i) bNC (ε)−m (T ) > ε, (ii) aNA (ε)+2m−1 (T ) < ε. Proof. Let ε > 0. The compactness of T follows from Lemma 2.10, as does the finiteness of NA (ε) and NC (ε). (i) By the continuity of C(c, ·), there exists a set of non-overlapping intervals {Ii : i = , . . . , NC (ε)} covering I and such that C(Ii ) = ε whenever 1 i < NC (ε) and C(INC (ε) ) ε. Let η ∈ (0, ε). Then corresponding to each i, 1 i < NC (ε), there is a function fi such that supp fi ⊂ Ii := (ai , ai+1 ), fi p(·) = 1, ε − η < Tfi p(·) ε and (Tf )(ai ) = (Tf )(ai+1 ) = 0. By {Iik }M k=1 we denote the set of those intervals Ii , 1 i < NC (ε), each of which is contained in one of the intervals Jl from the definition (3.1) of p(·). Then NC (ε) − m M NC (ε). Define by XM = f =
M
αir fir ; αir ∈ R
r=1
an M-dimensional subspace of Lp(·) (I ). Note that since p(·) is constant on Iir , p(x) = pir on Iir . Choose 0 = f ∈ XM . With λ0 := Tf p we have M Tf (x) p(x) 1 dx λ0 I
r=1I
ir
Tf (x) p(x) dx λ 0
=
M M pir pir 1 pir ε − η pir Tf (x) dx f (x) dx λ0 λ0 r=1
=
M f (x) p(x) dx = λ /(ε − η) 0 r=1I
=
r=1
Iir
∪M r=1 Iir
ir
f (x) p(x) dx. λ /(ε − η) 0
Iir
f (x) p(x) dx λ /(ε − η) 0
I
Hence f p,I Tf p,I /(ε − η), and so bNC (ε)−m (T ) bM (T ) ε − η. NA (ε) be a set of non(ii) This follows a pattern similar to that of (i). This time we let {Ii }i=1 overlapping intervals covering I for which A(Ii ) = ε for i = 1, . . . , NA (ε)−1 and A(INA (ε) ) ε. By {Ii+ }M i=1 we denote the family of all non-empty intervals for which there exist j and k such that Ii+ = Ij ∩ Jk . Clearly NA (ε) M NA (ε) + 2(m − 1). Let η > 0. Then given any i ∈ {1, 2, . . . , M} there exists yi ∈ Ii+ such that · sup f yi
p,Ii+
: f p,I + = 1 ε + η. i
Define
Pε (f ) =
yi M
f (x) dx χI + . i
i=1
a
Plainly Pε is a linear map from Lp(·) (I ) to Lp(·) (I ) with rank M. Let pi be the constant value of p(·) on Ii+ . Then we have for any λ0 ∈ (0, ∞) and f ∈ Lp(·) (I ), M (T − Pε )f (x) p(x) dx = λ0 i=1
I
Ii+
M
x p(x) M yi f −pi dx = λ0 λ 0
−pi
λ0
i=1
(ε + η)pi
i=1
Now choose λ0 = (1 − η)(T − Pε )f p,I . Then
|f |pi dx = Ii+
x pi f dx
Ii+ yi
f (x) p(x) dx. λ /(ε + η) 0 I
(T − Pε )f (x) p(x) 1< dx λ0 I
I
f (x) p(x) dx, λ /(ε + η) 0
from which we see that f p(·),I > (1 − η)(T − Pε )f p(·),I /(ε + η), so that ε + η (T − Pε )f p(·),I > . 1−η f p(·),I The proof is completed on letting η → 0.
2
Lemma 3.2. Let $p(\cdot)$ be the step-function given by (3.1). Then
\[
\lim_{\varepsilon \to 0} \varepsilon N(\varepsilon) = \frac{1}{2\pi} \int_I \bigl(p'(x)\, p(x)^{p(x)-1}\bigr)^{1/p(x)} \sin\bigl(\pi/p(x)\bigr)\,dx,
\]
where $N$ stands for $N_A$, $N_B$, $N_C$ or $N_D$.

Proof. Simply use the fact that $p(\cdot)$ is a step-function together with Lemmas 2.5 and 2.11. □

Finally we can give the main result of this section.

Theorem 3.3. Let $p(\cdot)$ be the step-function given by (3.1). Then for the compact map $T : L^{p(\cdot)}(I) \to L^{p(\cdot)}(I)$ we have
\[
\lim_{n \to \infty} n\, s_n(T) = \frac{1}{2\pi} \int_I \bigl(p'(x)\, p(x)^{p(x)-1}\bigr)^{1/p(x)} \sin\bigl(\pi/p(x)\bigr)\,dx,
\]
where $s_n$ denotes the $n$th approximation, Gelfand, Kolmogorov or Bernstein number of $T$.

Proof. Using Lemma 3.1 together with the inequalities (2.3), we have
\[
\varepsilon N_A(\varepsilon) \ge a_{N_A(\varepsilon)+2m-1}(T)\, N_A(\varepsilon) \ge b_{N_A(\varepsilon)+2m-1}(T)\, N_A(\varepsilon)
\]
and
\[
\varepsilon N_C(\varepsilon) \le b_{N_C(\varepsilon)-m}(T)\, N_C(\varepsilon).
\]
Now use Lemma 3.2 to obtain the result for the approximation and Bernstein numbers. The rest follows from (2.3) again. □
4. The case when p(·) is strongly log-Hölder-continuous

To obtain a result in this case similar to that of Theorem 3.3 the idea is to approximate $p$ by step-functions. This requires that control be kept of the changes in the various norms when $p$ is replaced by an approximating function, and we begin by giving such a result, which has independent interest. Let $p(\cdot), q(\cdot) \in \mathcal{P}(I)$ be such that for some $\varepsilon \in (0, 1)$,
\[
1 < p(x) \le q(x) \le p(x) + \varepsilon \quad \text{for all } x \in I. \tag{4.1}
\]
We know from Theorem 2.1 that $L^{q(\cdot)}(I)$ is continuously embedded in $L^{p(\cdot)}(I)$; denote by $\|\mathrm{id}\|$ the norm of the corresponding embedding. Our object is to obtain upper and lower bounds for $\|\mathrm{id}\|$ in terms of $\varepsilon$.

Lemma 4.1. Suppose that $p(\cdot)$ and $q(\cdot)$ satisfy (4.1) and that $f \in \mathcal{M}(I)$ is such that $\int_I |f(x)|^{q(x)}\,dx \le 1$. Then
\[
\int_I |f(x)|^{p(x)}\,dx \le \varepsilon|I| + \varepsilon^{-\varepsilon}.
\]
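Proof. Set $I_1 = \{x \in I : |f(x)| < \varepsilon\}$, $I_2 = \{x \in I : \varepsilon \le |f(x)| \le 1\}$ and $I_3 = \{x \in I : 1 < |f(x)|\}$. Then
\[
\int_I |f(x)|^{p(x)}\,dx = \sum_{j=1}^{3} \int_{I_j} |f(x)|^{p(x)}\,dx = \sum_{j=1}^{3} A_j, \quad \text{say}.
\]
Evidently
\[
A_1 \le \int_{I_1} \varepsilon^{p(x)}\,dx \le \int_{I_1} \varepsilon\,dx \le \varepsilon|I| \tag{4.2}
\]
and
\[
A_3 \le \int_{I_3} |f(x)|^{q(x)}\,dx. \tag{4.3}
\]
Since $\varepsilon \le |f(x)| \le 1$ on $I_2$ and $\varepsilon < 1$ we have, by (4.1),
\[
\varepsilon^{\varepsilon} \le \varepsilon^{q(x)-p(x)} \le |f(x)|^{q(x)-p(x)} \le 1 \quad \text{on } I_2,
\]
and so
\[
1 \le |f(x)|^{p(x)-q(x)} \le \varepsilon^{-\varepsilon}.
\]
Hence
\[
A_2 = \int_{I_2} |f(x)|^{q(x)}\, |f(x)|^{p(x)-q(x)}\,dx \le \varepsilon^{-\varepsilon} \int_{I_2} |f(x)|^{q(x)}\,dx. \tag{4.4}
\]
Now (4.2), (4.3) and (4.4) give
\[
\begin{aligned}
\int_I |f(x)|^{p(x)}\,dx
&\le \varepsilon|I| + \varepsilon^{-\varepsilon} \int_{I_2} |f(x)|^{q(x)}\,dx + \int_{I_3} |f(x)|^{q(x)}\,dx \\
&\le \varepsilon|I| + \varepsilon^{-\varepsilon} \Bigl( \int_{I_2} |f(x)|^{q(x)}\,dx + \int_{I_3} |f(x)|^{q(x)}\,dx \Bigr) \\
&\le \varepsilon|I| + \varepsilon^{-\varepsilon} \int_I |f(x)|^{q(x)}\,dx \le \varepsilon|I| + \varepsilon^{-\varepsilon},
\end{aligned}
\]
as required. □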
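Lemma 4.2. Suppose that $p(\cdot)$ and $q(\cdot)$ satisfy (4.1). Then $\|\mathrm{id}\| \le \varepsilon|I| + \varepsilon^{-\varepsilon}$.

Proof. Observe that $K := \varepsilon|I| + \varepsilon^{-\varepsilon} > 1$. Given any $f$ such that $\int_I |f(x)|^{q(x)}\,dx \le 1$, by Lemma 4.1 we see that
\[
\int_I |f(x)/K|^{p(x)}\,dx \le \int_I \bigl|f(x)/K^{1/p(x)}\bigr|^{p(x)}\,dx = K^{-1} \int_I |f(x)|^{p(x)}\,dx \le \bigl(\varepsilon|I| + \varepsilon^{-\varepsilon}\bigr)/K = 1.
\]
Thus $\|\mathrm{id}\| \le K$. □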
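We now turn to lower bounds.

Lemma 4.3. Suppose that $p(\cdot)$ and $q(\cdot)$ satisfy (4.1) and that $|I| \ge 1$. Then $\|\mathrm{id}\| \ge 1$.

Proof. Define a function $g$ by $g(x) = |I|^{-1/q(x)}$ ($x \in I$). Then $\int_I |g(x)|^{q(x)}\,dx = 1$. Since $|I|^{-p(x)/q(x)} \ge |I|^{-1}$, we have for each $\lambda \in (0, 1)$,
\[
\int_I |g(x)/\lambda|^{p(x)}\,dx = \int_I \frac{|I|^{-p(x)/q(x)}}{\lambda^{p(x)}}\,dx \ge \int_I \frac{|I|^{-1}}{\lambda^{p(x)}}\,dx \ge \int_I \frac{|I|^{-1}}{\lambda}\,dx = \frac{1}{\lambda} > 1.
\]
Hence $\|\mathrm{id}\| \ge \lambda$ for each $\lambda \in (0, 1)$, and so $\|\mathrm{id}\| \ge 1$. □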
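Lemma 4.4. Suppose that $p(\cdot)$ and $q(\cdot)$ satisfy (4.1) and that $|I| < 1$. Then $\|\mathrm{id}\| \ge |I|^{\varepsilon}$.

Proof. Again we consider the function $g(x) = |I|^{-1/q(x)}$: $\int_I |g(x)|^{q(x)}\,dx = 1$. Since
\[
|I|^{1 - \frac{p(x)}{q(x)}} = |I|^{\frac{q(x)-p(x)}{q(x)}} \ge |I|^{\varepsilon/q(x)} \ge |I|^{\varepsilon},
\]
we have
\[
\int_I |g(x)|^{p(x)}\,dx = \int_I |I|^{-\frac{p(x)}{q(x)}}\,dx = |I|^{-1} \int_I |I|^{1 - \frac{p(x)}{q(x)}}\,dx \ge |I|^{\varepsilon}.
\]
Thus, for each positive $\lambda < |I|^{\varepsilon}$,
\[
\int_I |g(x)/\lambda|^{p(x)}\,dx > \int_I \Bigl| \frac{g(x)}{|I|^{\varepsilon}} \Bigr|^{p(x)}\,dx \ge \int_I \Bigl| \frac{g(x)}{|I|^{\varepsilon/p(x)}} \Bigr|^{p(x)}\,dx = |I|^{-\varepsilon} \int_I |g(x)|^{p(x)}\,dx \ge |I|^{-\varepsilon} |I|^{\varepsilon} = 1.
\]
It follows that $\|\mathrm{id}\| \ge \lambda$ for each $\lambda < |I|^{\varepsilon}$, which gives the result. □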
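Putting these results together we have the following theorem and corollary.

Theorem 4.5. Suppose that $p(\cdot)$ and $q(\cdot)$ satisfy (4.1). Then
\[
\min\bigl(1, |I|^{\varepsilon}\bigr) \le \|\mathrm{id}\| \le \varepsilon|I| + \varepsilon^{-\varepsilon}.
\]

Corollary 4.6. Let $p(\cdot) \in \mathcal{P}(I)$ and suppose that for each $n \in \mathbb{N}$, $q_n(\cdot) \in \mathcal{P}(I)$ and $\varepsilon_n > 0$, where $\lim_{n\to\infty} \varepsilon_n = 0$, and for all $n \in \mathbb{N}$ and all $x \in I$,
\[
1 < p(x) \le q_n(x) \le p(x) + \varepsilon_n.
\]
Denote by $\mathrm{id}_n$ the natural embedding of $L^{q_n(\cdot)}(I)$ in $L^{p(\cdot)}(I)$. Then
\[
\lim_{n\to\infty} \|\mathrm{id}_n\| = 1.
\]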
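Corollary 4.6 rests on the fact that both bounds in Theorem 4.5 tend to 1 as $\varepsilon \to 0$. A small numerical sketch can make this squeeze visible; the function name `embedding_norm_bounds` and the sample values of $\varepsilon$ and $|I|$ are illustrative assumptions, not taken from the paper.

```python
def embedding_norm_bounds(eps, length):
    """Return the lower and upper bounds of Theorem 4.5 for ||id||.

    eps    : the gap parameter in (4.1), required to lie in (0, 1)
    length : the Lebesgue measure |I| of the interval, required to be positive
    """
    if not 0.0 < eps < 1.0:
        raise ValueError("eps must lie in (0, 1)")
    if length <= 0.0:
        raise ValueError("interval length must be positive")
    lower = min(1.0, length ** eps)
    upper = eps * length + eps ** (-eps)
    return lower, upper


# Both bounds approach 1 as eps decreases (here with |I| = 0.5, an arbitrary choice).
for eps in (0.5, 0.1, 0.01, 0.001):
    lower, upper = embedding_norm_bounds(eps, 0.5)
    print(f"eps={eps:<6} lower={lower:.6f} upper={upper:.6f}")
```

The upper bound tends to 1 because $\varepsilon^{-\varepsilon} = e^{-\varepsilon \ln \varepsilon} \to 1$ and $\varepsilon|I| \to 0$.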
In the next, we prove a few technical lemmas. Lemma 4.7. Let δ > 0 and let J ⊂ I be an interval and p(·), q(·) ∈ P(J ). Assume p(x) q(x) p(x) + δ in J . Then −2 2 δ|J | + δ −δ Ap(·)+δ,p(·) (J ) Aq(·) (J ) δ|J | + δ −δ Ap(·),p(·)+δ (J ). Proof. Set B1 = f ; f q(·) 1 ,
B2 = f ; f p(·) δ|J | + δ −δ ,
where the norms are with respect to the interval J . By Theorem 4.5 we have f p(·) (δ|J | + δ −δ )f q(·) which gives B1 ⊂ B2 and
· Aq(·) (J ) = inf sup f y∈J y
· ; f q(·) 1 = inf sup f y∈J
q(·)
y
· −δ inf sup δ|J | + δ f y∈J
y
; f ∈ B1 q(·)
; f ∈ B2 p(·)+δ
· f −δ 2 = δ|J | + δ inf sup δ|J | + δ −δ y∈J
p(·)+δ
y
· −δ 2 = δ|J | + δ inf sup g y∈J y
f ; −δ δ|J | + δ
1
p(·)
; gp(·) 1
p(·)+δ
2 = δ|J | + δ −δ Ap(·),p(·)+δ (J ) The second part of the inequality can be proved analogously.
2
Lemma 4.8. Let an interval J ⊂ I, with |J | 1, and p ∈ (1, ∞) be given. Then there exists a bounded positive function η defined on (0, 1), with η(δ) → 0 as δ → 0, such that if p(·), q(·) ∈ P(J ) with p p(x) p + δ,
p q(x) p + δ
in J,
then Ap(·) (J ) 1 + η(δ) |J |−2δ . 1 − η(δ) |J |2δ Aq(·) (J ) Proof. It suffices to prove only the right-hand part of the inequality. By Lemma 4.7 and (2.2) we have 1
1
4 Ap,p+δ (J ) 4 B(p, p + δ) |J |1− p + p+δ Ap(·) (J ) δ|J | + δ −δ = δ|J | + δ −δ 1 + p1 Aq(·) (J ) Ap+δ,p (J ) B(p + δ, p) |J |1− p+δ −2δ 4 B(p, p + δ) 4 B(p, p + δ) −2δ |J | p(p+δ) δ|J | + δ −δ |J | . = δ|J | + δ −δ B(p + δ, p) B(p + δ, p)
Since 4 B(p, p + δ) lim δ|J | + δ −δ = 1, δ→0 B(p + δ, p) we can choose η(δ) such that
4 B(p, p + δ) η(δ) := max δ, δ|J | + δ −δ −1 B(p + δ, p) to establish our assertion.
2
Lemma 4.9. Let δ > 0, a1 < b1 a2 < b2 and Ji = (ai , bi ) ⊂ I , i = 1, 2. Assume that f1 , f2 are functions on I such that supp fi ⊂ Ji , i = 1, 2, and Tf1 p(·),J1 > δ. Then T (f1 − f2 )
p(·),I
> δ.
Proof. Since T (f1 /δ)p(·),J1 > 1 we have p(x) b1 x f1 (t) dt dx > 1. δ
a1
a1
Then p(x) b b x T (f1 − f2 )(x) p(x) f (t) − f (t) 1 2 dt dx = dx δ δ a
a
a
a1 =
b1 ... +
a
a2 ... +
a1
b2 ... +
b1
b ... +
a2
... b2
p(x) b1x f1 (t) − f2 (t) dt dx δ a1
a
p(x) b1x f1 (t) = dt dx > 1 δ a1
and so T (f1 − f2 )p(·),I > δ.
a
2
Next we recall the well-known concept of a log-Hölder continuous function, which is widely used in the theory of variable exponent spaces. Following current terminology we shall say that $p(\cdot)$ is log-Hölder continuous if there is a positive constant $L$ such that
\[
-|p(x) - p(y)| \ln |x - y| \le L \quad \text{for all } x, y \in I \text{ with } 0 < |x - y| < \tfrac{1}{2}.
\]
In what follows we will require a slightly stronger condition on the function $p(\cdot)$ defined on $I$. We remind the reader that $I = [a, b]$ is a compact interval.

Definition 4.10. Let $p(\cdot) \in \mathcal{P}(I)$. We say that $p(\cdot)$ is strong log-Hölder continuous (and write $p(\cdot) \in SLH(I)$) if there is an increasing continuous function $\psi(t)$ defined on $[0, |I|]$ such that $\lim_{t \to 0^+} \psi(t) = 0$ and
\[
-|p(x) - p(y)| \ln |x - y| \le \psi\bigl(|x - y|\bigr) \quad \text{for all } x, y \in I \text{ with } 0 < |x - y| < 1/2. \tag{4.5}
\]
It is easy to see that any Lipschitz or Hölder function $p(\cdot)$ is in $SLH(I)$.
Proposition 4.11. Let p(·) ∈ P(I ) be strong log-Hölder continuous on I. Then 1 lim εN(ε) = ε→0 2π
1/p(x) p (x)p(x)p(x)−1 sin π/p(x) dx,
I
where N stands for NAp(·) or NCp(·) . Proof. We prove only the case N = NAp(·) , the case N = NCp(·) follows by a simple modification. Let N ∈ N. By Lemma 2.11 there exists a constant εN > 0 and a set of non-overlapping intervals N {IiN }N i=1 covering I such that Ap(·) (Ii ) = εN for every i. Define a step-function qN (x) by qN (x) =
N
p +N χI N (x).
i=1
Ii
i
and set δN,i = p +N − p −N . Ii
Ii
Then p(x) qN (x) p(x) + δN,i
for all i = 1, 2, . . . , N.
Claim 1. εN → 0 as N → ∞. Proof. Clearly, εN is non-increasing. Assume for a moment that there exists δ > 0 such that εN > δ for all N . Fix N and denote IiN := Ii = (ai , ai+1 ). Since Ap(·),Ii > δ there are fi , with · supp fi ⊂ Ii , such that fi p(·),Ii 1 and ai fi p(·),Ii = Tfi p(·),Ii > δ for i = 1, . . . , N . By Lemma 4.9, T (fi − fj ) > δ for i < j p(·),I and so, we have found N functions f1 , f2 , . . . , fN from the unit ball such that T (fi − fj ) > δ. p(·),I The fact that N can be arbitrary contradicts the compactness of T .
2
Claim 2. limN →∞ max{|IiN |; i = 1, 2, . . . , N} = 0. Proof. Assume the contrary. Then there are sequences Nk , ik ∈ {1, 2, . . . , Nk } and an interval J such that J ⊂ IiNk k and so, N εNk = Ap(·) Iik k Ap(·) (J ) > 0 which contradicts the fact that εN → 0.
2
Claim 3. There is a sequence βN 1 such that 2δ −2δ −1 εN IiN N,i AqN (·) IiN βN εN IiN N,i βN holds for all i ∈ {1, 2, . . . , N}. Proof. Since p − qN (x), p(x) p − + δN,i on IiN we have by Lemma 4.8, 2δ −2δ Ap(·) (IiN ) 1 − η(δN,i ) IiN N,i 1 + η(δN,i ) IiN N,i . N AqN (·) (Ii ) Now, using εN = Ap(·) (IiN ) we have N 2δN,i N −2δN,i εN εN I I AqN (·) IiN i 1 + η(δN,i ) 1 − η(δN,i ) i and the assertion follows.
2
Claim 4. The inequality N −δN,i N I eψ(|Ii |) i holds for all N and i ∈ {1, 2, . . . , N}. Proof. Fix IiN . Since p(·) ∈ SLH(I ) we know that p(·) is a continuous function on I . Because p +N − p −N = δN,i there are points x, y ∈ IiN with |p(x) − p(y)| = δN,i . Using (4.5) we obtain Ii
Ii
N −δN,i N I |x − y|−|p(x)−p(y)| eψ(|x−y|) eψ(|Ii |) . i
2
Claim 5. There is a constant C > 0 such that the inequality C −1 εN IiN CεN holds for all N and i ∈ {1, 2, . . . , N}. Proof. Remark that qN (·) = p −N + δN,i := rN,i is a constant function on IiN and so, by Lemma 2.7,
Ii
AqN (·) IiN = B(rN,i , rN,i )IiN . It is easy to see that there is a > 0 such that a −1 B(rN,i , rN,i ) a holds for all N and i ∈ {1, 2, . . . , N}. Using Claim 4 we have N −2δN,i N I e2ψ(|Ii |) e2ψ(|I |) := K, i
and by Claim 3, −1 K −1 βN εN B(rN,i , rN,i )IiN KβN εN . Hence −1 a −1 K −1 βN εN IiN aKβN εN . Since βN 1, the assertion follows.
2
Now, since by Claim 1, εN → 0 we know by Claim 5 that max{|IiN |; i = 1, 2, . . . , N } → 0 as N → ∞ and by Claim 4, N −2δN,i N N I e2ψ(|Ii |) e2ψ(max{|Ii |; i=1,2,...,N }) := γN 1. i Setting αN = βN γN we obtain αN 1, and by Claim 2 we have −1 αN εN AqN (·) IiN αN εN . Moreover, by Claim 5 we have N εN = C
N i=1
C −1 εN C
N N I = C|I | i
i=1
which gives, by (4.6), N N −1 −1 NεN αN αN εN − εN −1 = AqN (·) IiN − N εN i=1
i=1
N
(αN εN − εN ) = N εN (αN − 1).
i=1
Consequently, N N AqN (·) Ii − N εN → 0 as N → ∞. i=1
On the other hand we have by Lemma 2.5 (recall again that qN (·) is constant on IiN ), N
N 1/ 1 qN (·)qN (·)qN (·)−1 qN (·) sin π/qN (·) IiN AqN (·) IiN = 2π i=1 i=1 1/p(x) 1 → p (x)p(x)p(x)−1 sin π/p(x) dx, 2π I
(4.6)
and so,
1 lim N εN = N →∞ 2π
1/p(x) p (x)p(x)p(x)−1 sin π/p(x) dx.
I
Since εN is monotone it is not difficult to see limN →∞ N εN = limε→0 εN (ε) and consequently 1 lim εN(ε) = ε→0 2π
p (x)p(x)p(x)−1
1/p(x)
sin π/p(x) dx.
2
I
Given p(·) ∈ SLH(I ), we construct step-functions that are approximations to p(·). Let N ∈ N and use Lemma 2.11, applied to the function D := Dp(·) : there exists ε > 0 such that ND (ε) = N and there are non-overlapping intervals IiD (i = 1, . . . , N ) that cover I and are such that D(IiD ) = ε for i = 1, . . . , N. Define + pD,N (x) =
N i=1
pI+D χI D (x), i
− pD,N (x) =
i
N
pI−D χI D (x);
i=1
i
i
+ − step-functions pB,N (·) and pB,N (·) are defined in an exactly similar way, with the function B in place of D and with intervals IiB arising from the use of that part of Lemma 2.11 related to B.
Lemma 4.12. Let p(·) ∈ P(I ) and N ∈ N. Let ε > 0 correspond to N in the sense of Lemma 2.11, applied to B, so that NB (ε) = N , and write − p − (x) = pB,N (x),
+ p + (x) = pB,N (x),
− + where pB,N (·) and pB,N (·) are defined as indicated above. Then
− + aN +1 T : Lp (·) (I ) → Lp (·) (I ) ε. Proof. In the notation of Lemma 2.11, there are intervals IiB such that B(IiB ) = ε for i = 1, . . . , N. For each i there exists yi ∈ IiB such that · B B Ii = sup f yi
: f p− ,I B 1 . i
p + ,IiB
Define N
yi
PN f (x) =
f (y) dy · χI B (x); i
i=1 a
plainly PN has rank N. Let f ∈ Lp
− (·)
(I ) and set λ0 = εf p− ,I .
(4.7)
Then − N f (x) p (x) 1= dx = λ0 /ε i=1
I
− f (x) p (x) dx. λ /ε 0
IiB
Recall that on IiB the functions p − (·) and p + (·) have constant values pi− , pi+ , say, respectively, with pi+ /pi− 1. Thus − p+ /p− p+ /p− N N i i i i p − f (x) pi pi+ i f (x) = (ε/λ0 ) dx . 1 λ /ε dx 0 i=1
i=1
IiB
IiB
Use of the fact that p+ 1/p+ x 1/p− i i i p − i f (y) ε = sup / dy f (y) dy dx f IiB yi
IiB
now gives N + (1/λ0 )pi 1 i=1
x p + N i f (y) dy dx = i=1
IiB yi
x + yi f (y) dy pi dx λ0
IiB
+ (T − PN )(f )(x) p (x) dx, = λ0 I
from which it follows that (T − PN )f p+ ,I λ0 . Using the definition (4.7) of λ0 we see that (T − PN )f + εf p− ,I , p ,I and so aN +1 (T : Lp
− (·)
(I ) → Lp
+ (·)
(I )) ε, as claimed.
2
We next obtain a lower estimate for the Bernstein numbers. Lemma 4.13. Let p(·) ∈ P(I ) and N ∈ N. Let ε > 0 correspond to N in the sense of Lemma 2.11, applied to D, so that ND (ε) = N, and write − (x), p − (x) = pD,N
+ p + (x) = pD,N (x),
− + where pD,N (·) and pD,N (·) are defined as indicated above. Then
+ − bN T : Lp (·) (I ) → Lp (·) (I ) ε. Proof. In the notation of Lemma 2.11, there are intervals IiD such that D(IiD ) = ε for i = + 1, . . . , N. Since T is compact, for each i there exists fi ∈ Lp (IiD ) with supp fi ⊂ IiD , Tfi p− ,I D /fi p+ ,I D = ε, i
(4.8)
i
and Tfi (ci ) = Tfi (ci+1 ) = 0, where ci and ci+1 are the endpoints of IiD . On each IiD the functions p − (·) and p + (·) are constant; denote these constant values by pi− and pi+ , respectively and note that pi− /pi+ 1. Set XN = f =
N
αi fi ; αi ∈ R .
i=1
Then dim XN = N . Choose any non-zero f ∈ XN and set λ0 = εf p+ ,I . Then 1=
+ f (x) p (x) dx λ /ε 0 I
+ N f (x) p (x) = dx λ /ε i=1
IiD
0
+ p− /p+ N i i f (x) pi λ /ε dx 0 i=1
IiD
p N i p + pi− i f (x) = (ε/λ0 ) dx
−
i=1
=
/pi+
IiD
N + − αi fi (x)pi dx)pi− /pi+ . (ε/λ0 )pi i=1
IiD
Use of (4.8) now shows that pi− N N − − Tf (x) p pi i T (αi fi )(x) 1 (1/λ0 ) dx = λ dx = i=1
IiD
from which it follows that
i=1
IiD
0
I
p − Tf (x) (x) dx λ 0
+ − ε bN T : Lp (·) (I ) → Lp (·) (I ) , and the proof is complete.
2
Theorem 4.14. Let p(·) ∈ P(I ) be continuous on I . For all N ∈ N denote by εN numbers satisfying N = NB (εN ). Then there are sequences KN , LN with KN → 1, LN → 1 as N → ∞ such that (i) aN +1 (T : Lp(·) (I ) → Lp(·) (I )) KN εN , (ii) bN (T : Lp(·) (I ) → Lp(·) (I )) LN εN . Proof. First we prove (i). In view of the property (2.1) of the approximation numbers, aN +1 (T : Lp(·) (I ) → Lp(·) (I )) is majorized by − p(·) − id : L (I ) → LpB,N (I ) N − + + pB,N × aN +1 T : LpB,N (I ) → LpB,N (I ) × id+ (I ) → Lp(·) (I ), N :L where id− and id+ are the obvious embedding maps. When N → ∞, since |IiB | → 0 and p(·) is continuous, it is clear that p(·) − p − (·) → 0 and p(·) − p + (·) → 0. B,N B,N ∞ ∞ + − (·), pB,N (·) are the same as in Lemma 4.12.) Thus by Corollary 4.6, (Here IiB , pB,N
− p(·) − id : L (I ) → LpB,N (I ) → 1, N
+ p+ id : L B,N (I ) → Lp(·) (I ) → 1 N
as N → ∞. The result now follows from Lemma 4.12. To prove (ii) we follow the idea of the proof of (i) with the help of Lemma 4.13.
2
Theorem 4.15. Let $p \in SLH(I)$. Then
\[
\lim_{n\to\infty} n\, s_n(T) = \frac{1}{2\pi} \int_I \bigl(p'(t)\, p(t)^{p(t)-1}\bigr)^{1/p(t)} \sin\frac{\pi}{p(t)}\,dt,
\]
where $s_n$ denotes the $n$th approximation, Gelfand, Kolmogorov or Bernstein number of $T$.

Proof. Use Theorem 4.14, Proposition 4.11 and the inequality (2.3). □
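The limit in Theorem 4.15 is an explicit integral of the exponent, so it can be evaluated numerically for a given $p(\cdot)$. The sketch below is illustrative only: the name `limit_constant` and the midpoint rule are assumptions of this example, and $p'(t)$ is read as the conjugate exponent $p(t)/(p(t)-1)$, the usual convention in this setting.

```python
import math


def limit_constant(p, a, b, n=10000):
    """Midpoint-rule value of
        (1 / (2*pi)) * int_a^b (p'(t) * p(t)**(p(t) - 1))**(1 / p(t)) * sin(pi / p(t)) dt,
    with p'(t) = p(t) / (p(t) - 1) taken to be the conjugate exponent (an assumption
    of this sketch). Requires p(t) > 1 on [a, b].
    """
    h = (b - a) / n
    total = 0.0
    for k in range(n):
        pt = p(a + (k + 0.5) * h)
        if pt <= 1.0:
            raise ValueError("the exponent must exceed 1")
        conjugate = pt / (pt - 1.0)
        total += (conjugate * pt ** (pt - 1.0)) ** (1.0 / pt) * math.sin(math.pi / pt)
    return total * h / (2.0 * math.pi)


# Constant exponent p = 2 on [0, 1]: the integrand equals 2, so the value is 1/pi.
print(limit_constant(lambda t: 2.0, 0.0, 1.0), 1.0 / math.pi)
# A Lipschitz (hence SLH) exponent, chosen arbitrarily for illustration.
print(limit_constant(lambda t: 2.0 + t, 0.0, 1.0))
```

For $p \equiv 2$ the value $|I|/\pi$ agrees with the classical behaviour of the singular values of the integration operator on $L^2$, which serves as a consistency check.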
It is not difficult to combine the proofs of Theorem 3.3 and Theorem 4.15 to obtain the following theorem, which contains both these results.

Theorem 4.16. Let $J_i$, $i = 1, 2, \ldots, m$, be a finite decomposition of $I$. Assume that $p(\cdot)$ is such that $p(\cdot) \in SLH(J_i)$ for each $i \in \{1, 2, \ldots, m\}$. Then
\[
\lim_{n\to\infty} n\, s_n(T) = \frac{1}{2\pi} \int_I \bigl(p'(t)\, p(t)^{p(t)-1}\bigr)^{1/p(t)} \sin\frac{\pi}{p(t)}\,dt,
\]
where $s_n$ denotes the $n$th approximation, Gelfand, Kolmogorov or Bernstein number of $T$.

References

[1] D.E. Edmunds, W.D. Evans, Spectral Theory and Differential Operators, Oxford University Press, 1987.
[2] D.E. Edmunds, W.D. Evans, Hardy Operators, Function Spaces and Embeddings, Springer-Verlag, Berlin, 2004.
[3] D.E. Edmunds, J. Lang, Approximation numbers and Kolmogorov widths of Hardy-type operators in a non-homogeneous case, Math. Nachr. 279 (2006) 727–742.
[4] D.E. Edmunds, J. Lang, Behaviour of the approximation numbers of a Sobolev embedding in the one-dimensional case, J. Funct. Anal. 206 (2004) 149–166.
[5] D.E. Edmunds, J. Lang, Bernstein widths of Hardy-type operators in a non-homogeneous case, J. Math. Anal. Appl. 325 (2007) 1060–1076.
[6] O. Kováčik, J. Rákosník, On spaces $L^{p(x)}(\Omega)$ and $W^{k,p(x)}(\Omega)$, Czechoslovak Math. J. 41 (1991) 592–618.
[7] J. Lang, Improved estimates for the approximation numbers of Hardy-type operators, J. Approx. Theory 121 (2006) 61–70.
[8] J. Lang, Estimates for n-widths of the Hardy-type operators, J. Approx. Theory 140 (2) (2006) 141–146.
[9] G.G. Lorentz, V. Golitschek, Yu. Makovoz, Constructive Approximation: Advanced Problems, Springer-Verlag, Berlin, 1996.
[10] A. Pietsch, Bad properties of the Bernstein numbers, Studia Math. 184 (2008) 263–269.
[11] A. Pinkus, n-Widths in Approximation Theory, Springer-Verlag, Berlin, 1985.
[12] M. Růžička, Electrorheological Fluids: Modeling and Mathematical Theory, Lecture Notes in Math., vol. 1748, Springer-Verlag, Berlin, 2000.
[13] E. Schmidt, Über die Ungleichung, welche die Integrale über eine Potenz einer Funktion und über eine andere Potenz ihrer Ableitung verbindet, Math. Ann. 117 (1940) 301–326 (in German).
Journal of Functional Analysis 257 (2009) 243–270 www.elsevier.com/locate/jfa
Computing the first eigenvalue of the p-Laplacian via the inverse power method Rodney Josué Biezuner ∗ , Grey Ercole, Eder Marinho Martins Departamento de Matemática – ICEx, Universidade Federal de Minas Gerais, Av. Antônio Carlos 6627, Caixa Postal 702, 30161-970, Belo Horizonte, MG, Brazil Received 28 October 2008; accepted 13 January 2009 Available online 12 February 2009 Communicated by H. Brezis
Abstract In this paper, we discuss a new method for computing the first Dirichlet eigenvalue of the p-Laplacian inspired by the inverse power method in finite dimensional linear algebra. The iterative technique is independent of the particular method used in solving the p-Laplacian equation and therefore can be made as efficient as the latter. The method is validated theoretically for any ball in Rn if p > 1 and for any bounded domain in the particular case p = 2. For p > 2 the method is validated numerically for the square. © 2009 Elsevier Inc. All rights reserved. Keywords: p-Laplacian; First eigenvalue; Comparison principle; Power method
* Corresponding author. E-mail addresses: [email protected] (R.J. Biezuner), [email protected] (G. Ercole), [email protected] (E.M. Martins). 0022-1236/$ – see front matter © 2009 Elsevier Inc. All rights reserved. doi:10.1016/j.jfa.2009.01.023

1. Introduction

In finite dimensional linear algebra, the power method is often used in order to compute the first eigenvalue of invertible linear operators defined on Euclidean spaces (see [9], for instance). Briefly, given an invertible linear operator $A$, one picks a vector $x$ and forms the sequence
\[
x,\ Ax,\ A^2 x,\ \ldots.
\]
In order to produce this sequence, it is not necessary to get the powers of $A$ explicitly, since each vector in the sequence can be obtained from the previous one by multiplying it by $A$. It is easy to show that the sequence converges, up to scaling, to the dominant eigenvector. Thus the largest eigenvalue can be found. In order to obtain the first (and therefore smallest) eigenvalue, one considers instead powers of $A^{-1}$, a method which is sometimes called the inverse power method or inverse iteration. More specifically, if $\{y_n\}_{n\in\mathbb{N}}$ denotes the normalized sequence of vectors produced by the inverse power method, then the first eigenvalue $\lambda_1$ of $A$ can be explicitly given as (see [5])
\[
\lambda_1 = \lim_{n\to\infty} y_n \cdot A^{-1} y_n,
\]
where · denotes the Euclidean inner product. In this work we carry this idea further, to nonlinear operators in infinite dimensional spaces, in order to develop a numerical method to compute the first eigenvalue of the nonlinear degenerate elliptic p-Laplacian operator. In order to describe the technique, first we establish some notation and recall some well-known results. Throughout this paper, $\Omega$ will denote a smooth bounded region in $\mathbb{R}^N$, $N \ge 1$, and $\Delta_p$ will denote the p-Laplacian operator, that is, $\Delta_p u = \operatorname{div}(|\nabla u|^{p-2}\nabla u)$ for $1 < p < \infty$. We consider the Dirichlet eigenvalue problem for the p-Laplacian
$$-\Delta_p u = \lambda |u|^{p-2}u \quad \text{in } \Omega, \tag{1}$$
$$u = 0 \quad \text{on } \partial\Omega. \tag{2}$$
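As a small illustration of the finite dimensional inverse power method recalled at the start of this introduction, the following sketch applies inverse iteration to a 2×2 symmetric matrix. The matrix, the starting vector and the helper names are illustrative assumptions, not taken from the paper. The eigenvalue estimate used here is the ratio of norms of consecutive iterates, chosen because it is the linear analogue of the sequence $(\nu_n)$ introduced later, rather than the inner-product formula quoted from [5].

```python
import math

def solve_2x2(a, b):
    """Solve a @ x = b for a 2x2 matrix a by Cramer's rule."""
    (a11, a12), (a21, a22) = a
    det = a11 * a22 - a12 * a21
    return [(b[0] * a22 - a12 * b[1]) / det,
            (a11 * b[1] - a21 * b[0]) / det]

def norm(v):
    """Euclidean norm of a vector."""
    return math.sqrt(sum(c * c for c in v))

def inverse_power(a, x0, iterations=40):
    """Inverse iteration: x_{n+1} = A^{-1} x_n, renormalized at each step.

    Returns the norm ratio |x_n| / |A^{-1} x_n| from the last step
    (an estimate of the smallest eigenvalue) and the normalized iterate.
    """
    x = list(x0)
    estimate = None
    for _ in range(iterations):
        x_next = solve_2x2(a, x)            # one application of A^{-1}
        estimate = norm(x) / norm(x_next)   # ratio of consecutive norms
        scale = norm(x_next)
        x = [c / scale for c in x_next]     # renormalize to avoid underflow
    return estimate, x

# Illustrative matrix with eigenvalues 1 and 3 (an assumption for this sketch).
A = [[2.0, 1.0], [1.0, 2.0]]
lam, vec = inverse_power(A, [1.0, 0.0])
print(lam, vec)
```

The renormalization does not change the limit of the ratio; it only keeps the iterates at unit size.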
As is well known, the first eigenvalue $\lambda_p(\Omega)$ is positive, simple, can be variationally characterized, and the corresponding eigenfunction belongs to the Hölder space $C^{1,\alpha}(\Omega)$, does not change sign and therefore can be taken positive. If $p = 2$, when $\Delta_p$ becomes the Laplacian operator $\Delta$, the value of $\lambda_p(\Omega)$ is well known for domains with simple geometry; for more general domains it can be determined by several methods. For $p > 1$ and $N = 1$, the value of $\lambda_p(\Omega)$ is known: if $\Omega = (a, b)$, then
$$\lambda_p(\Omega) = (p-1)\left(\frac{\pi_p}{b-a}\right)^p,$$
where
$$\pi_p := 2\int_0^1 \frac{ds}{\sqrt[p]{1-s^p}} = 2\,\frac{\pi/p}{\sin(\pi/p)}.$$
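The one-dimensional formula above is easy to evaluate directly. The sketch below is only an illustration of that closed form; the function names `pi_p` and `lambda_p_interval` are assumptions introduced here, not notation from the paper.

```python
import math

def pi_p(p):
    """Closed form pi_p = 2 * (pi/p) / sin(pi/p), for p > 1."""
    if p <= 1:
        raise ValueError("p must be greater than 1")
    return 2.0 * (math.pi / p) / math.sin(math.pi / p)

def lambda_p_interval(p, a, b):
    """First Dirichlet eigenvalue of the p-Laplacian on the interval (a, b)."""
    return (p - 1.0) * (pi_p(p) / (b - a)) ** p

# For p = 2 the formula reduces to the classical value (pi / (b - a))^2.
value = lambda_p_interval(2.0, 0.0, 1.0)
print(value)
```

For $p = 2$ one has $\pi_2 = \pi$, so the sketch recovers the familiar first eigenvalue $\pi^2$ of the Laplacian on the unit interval.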
However, if $p \ne 2$ and $N \ge 2$, the value of $\lambda_p(\Omega)$ is not explicitly known, not even for simple domains such as a ball or a square. In such cases, there are few available numerical methods for finding $\lambda_p(\Omega)$. In the absence of an exact value or even a good approximation for $\lambda_p(\Omega)$, lower bounds play an important role in its estimation, being of special interest in the literature (upper bounds are more easily obtainable from the variational characterization of $\lambda_p(\Omega)$). An important lower bound for $\lambda_p(\Omega)$ (see [7]) is $\lambda_p(B)$, where $B \subset \mathbb{R}^N$ is a ball centered at the origin with the same N-dimensional Lebesgue measure as $\Omega$. We propose a method based on the inverse power method for obtaining $\lambda_p(\Omega)$ and prove its applicability in the case when $\Omega = B$ (without loss of generality we choose B to be the unit ball).
We believe this specific result is very relevant and certainly will contribute to research on quasilinear problems in which spherical geometry or the estimation of $\lambda_p(\Omega)$ are important. Moreover, the main results that support the method are valid for a general domain (even unbounded) and thus they lead us to conjecture that our method is applicable to a more general class of domains. In the remainder of this introduction we describe in more detail the main results of this paper and our conjecture.
Let $W_0^{1,p}(\Omega)$ denote the standard Sobolev space with norm $\|u\|_{W_0^{1,p}(\Omega)} = \|\nabla u\|_{L^p(\Omega)}$. We recall that $u \in W_0^{1,p}(\Omega)$ is a weak solution to the homogeneous Dirichlet problem
$$-\Delta_p u = f \ \text{ in } \Omega, \qquad u = 0 \ \text{ on } \partial\Omega, \tag{3}$$
for a given function $f \in L^{p'}(\Omega)$, where $p' = p/(p-1)$ denotes the conjugate exponent of $p$, if
$$\int_\Omega |\nabla u|^{p-2}\nabla u \cdot \nabla v \, dx = \int_\Omega f v \, dx \tag{4}$$
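for every test function $v \in W_0^{1,p}(\Omega)$. It is well known that if $f \in C^0(\Omega)$ then the corresponding weak solution $u$ belongs to $C^{1,\alpha}(\Omega)$ and the inverse operator $(-\Delta_p)^{-1} : C^0(\Omega) \to W_0^{1,p}(\Omega) \cap C^{1,\alpha}(\Omega) \to C^0(\Omega)$ is continuous and compact.
The technique runs as follows. First, iteratively define a sequence of functions $(\phi_n)_{n\in\mathbb{N}} \subset W_0^{1,p}(\Omega) \cap C^{1,\alpha}(\Omega)$ by setting $\phi_0 \equiv 1$ and, for $n = 1, 2, 3, \ldots$, letting $\phi_n$ be the solution to the Dirichlet problem
$$-\Delta_p \phi_n = \phi_{n-1}^{p-1} \ \text{ in } \Omega, \qquad \phi_n = 0 \ \text{ on } \partial\Omega. \tag{5}$$
Then, for $n \ge 1$, define the following sequences of real numbers
$$\gamma_n := \inf_\Omega \left(\frac{\phi_n}{\phi_{n+1}}\right)^{p-1} \tag{6}$$
and
$$\Gamma_n := \sup_\Omega \left(\frac{\phi_n}{\phi_{n+1}}\right)^{p-1} = \left\|\frac{\phi_n}{\phi_{n+1}}\right\|_{L^\infty(\Omega)}^{p-1}. \tag{7}$$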
We show that sequence (6) is indeed well defined and bounded above by the first Dirichlet eigenvalue of the p-Laplacian λp . This result in itself is of particular importance, since lower bounds for λp are hard to obtain. For sequence (7), we give strong evidence that it is well defined for at least all sufficiently large n (we prove this is true for balls and for general domains in the special case p = 2) and that it is bounded below by λp . We also show that sequence (6) is increasing (which implies among other things that we can obtain successively better lower bounds for λp ),
whereas sequence (7) is decreasing. Thus, the limit $\gamma := \lim \gamma_n$ exists and is finite, the limit $\Gamma := \lim \Gamma_n$ exists and is likely finite (it is definitely finite in some cases), and they satisfy $\gamma \le \lambda_p \le \Gamma$. We conjecture that
$$\lambda_p = \gamma = \Gamma. \tag{8}$$
If this conjecture is true, we also show that the first eigenfunction might be construed as the limit of a certain scaling of sequence $(\phi_n)$. We are able to show that this indeed happens in the special cases of balls, for general $p$, and general domains, for $p = 2$. In the latter case we use the Hilbert space structure of the space $W_0^{1,2}(\Omega)$ and some well-known results for generalized Fourier expansions in eigenfunctions of the Laplacian. Needless to say, such an argument does not work for $p \ne 2$. However, numerical experiments indicate that for general $p$ we do have $\gamma = \Gamma$ and that these values approach known values of $\lambda_p$ obtained by other authors using different techniques. We also consider the sequence of numbers
$$\nu_n = \left(\frac{\|\phi_n\|_{L^p(\Omega)}}{\|\phi_{n+1}\|_{L^p(\Omega)}}\right)^{p-1} \tag{9}$$
and show that it is bounded below by $\lambda_p$ and above by the sequence $(\Gamma_n)$, so that it is also our (independent) conjecture that
$$\lambda_p = \nu := \lim \nu_n. \tag{10}$$
In numerical experiments, it is observed that the convergence of this sequence is significantly faster than the convergence of the above two sequences. This paper is organized as follows. In Section 2, we prove the monotonicity of sequences $(\gamma_n)$ and $(\Gamma_n)$, and that the first eigenvalue is an upper bound for sequence $(\gamma_n)$. We also study sequence $(\nu_n)$ and find that the first eigenvalue is a lower bound for this sequence as well as for sequence $(\Gamma_n)$. In Section 3, we show that if conjecture (8) is true, a certain scaled limit of sequence $(\phi_n)$ might approach the first eigenfunction. Section 4 gives a complete proof of conjecture (8) in the case of N-dimensional balls. In Section 5, we prove the convergence of all three sequences to the first eigenvalue in the case $p = 2$. Section 6 shows that in principle this technique can be extended to yield higher eigenvalues and corresponding eigenfunctions, at least in the $p = 2$ case. Finally, in Section 7 we describe the numerical experiments performed and their results.
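2. Behavior of sequences $(\gamma_n)$, $(\Gamma_n)$ and $(\nu_n)$
In the following, we denote by $\|u\|_\infty$ and $\|u\|_p$ respectively the $L^\infty$ and $L^p$ norms of a function $u$ in $\Omega$. Let $(\phi_n)_{n\in\mathbb{N}} \subset W_0^{1,p}(\Omega) \cap C^{1,\alpha}(\Omega)$ be the sequence defined by $\phi_0 \equiv 1$ and, for $n \ge 1$, $\phi_n$ is the solution to the Dirichlet problem
$$-\Delta_p \phi_n = \phi_{n-1}^{p-1} \ \text{ in } \Omega, \qquad \phi_n = 0 \ \text{ on } \partial\Omega. \tag{11}$$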
2.1. Sequence $(\gamma_n)$
Set
$$\gamma_n := \inf_\Omega \left(\frac{\phi_n}{\phi_{n+1}}\right)^{p-1}.$$
First we show that sequence $(\gamma_n)$ is well defined. In the sequel we will resort several times to the well-known comparison principle for the p-Laplacian (see [4], for instance). For ease of reference we state it here.
Proposition 2.1 (Comparison principle). Let $u_1, u_2 \in C^{1,\alpha}(\Omega)$ satisfy
$$-\Delta_p u_1 \le -\Delta_p u_2 \ \text{ in } \Omega, \qquad u_1 \le u_2 \ \text{ on } \partial\Omega.$$
Then, $u_1 \le u_2$ in $\Omega$.
Proposition 2.2. The sequence $(\phi_n)_{n\in\mathbb{N}}$ satisfies
$$0 < \phi_n \le \|\phi_1\|_\infty \phi_{n-1} \ \text{ in } \Omega$$
for every $n \ge 1$.
Proof. First we show that $\phi_n$ is positive by using a simple argument involving the comparison principle. The comparison principle already implies that $\phi_n \ge 0$. Fix an arbitrary $x_0 \in \Omega$ and let $R > 0$ be such that the ball $B_R(x_0)$ centered at $x_0$ with radius $R$ is contained in $\Omega$. We will show that $\phi_n(x) > 0$ for all $x \in B_R(x_0)$. Let $\psi_1$ be the solution to the Dirichlet problem
$$-\Delta_p \psi_1 = 1 \ \text{ in } B_R(x_0), \qquad \psi_1 = 0 \ \text{ on } \partial B_R(x_0).$$
It is well known that the solution to this problem is radially symmetric, that is, $\psi_1(x) = v_1(r)$ for $r = |x - x_0|$, where $v_1$ satisfies the problem
$$-\left(r^{N-1}|v_1'|^{p-2}v_1'\right)' = r^{N-1} \ \text{ for } 0 < r < R, \qquad v_1'(0) = v_1(R) = 0.$$
Integrating the differential equation we find
$$v_1(r) = \int_r^R \left(\int_0^\theta \left(\frac{s}{\theta}\right)^{N-1} ds\right)^{\frac{1}{p-1}} d\theta$$
and hence $v_1 > 0$ for $0 < r < R$. Thus, since
$$-\Delta_p \phi_1 = -\Delta_p \psi_1 \ \text{ in } B_R(x_0), \qquad \phi_1 \ge 0 = \psi_1 \ \text{ on } \partial B_R(x_0),$$
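it follows from the comparison principle that $\phi_1 \ge \psi_1 > 0$ in $B_R(x_0)$. For $n \ge 2$ define the sequence $(\psi_n)$ iteratively by
$$-\Delta_p \psi_{n+1} = \psi_n^{p-1} \ \text{ in } B_R(x_0), \qquad \psi_{n+1} = 0 \ \text{ on } \partial B_R(x_0).$$
As before, $\psi_{n+1}(x) = v_{n+1}(r)$ where $v_{n+1}$ is the solution to the (nonlinear) problem
$$-\left(r^{N-1}|v_{n+1}'|^{p-2}v_{n+1}'\right)' = r^{N-1}v_n^{p-1} \ \text{ for } 0 < r < R, \qquad v_{n+1}'(0) = v_{n+1}(R) = 0,$$
so that
$$\psi_{n+1}(x) = \int_{|x-x_0|}^R \left(\int_0^\theta \left(\frac{s}{\theta}\right)^{N-1} v_n(s)^{p-1}\, ds\right)^{\frac{1}{p-1}} d\theta > 0 \ \text{ for } x \in B_R(x_0).$$
Assuming by induction that $\phi_n \ge \psi_n$ in $B_R(x_0)$, it follows that
$$-\Delta_p \phi_{n+1} = \phi_n^{p-1} \ge \psi_n^{p-1} = -\Delta_p \psi_{n+1} \ \text{ in } B_R(x_0), \qquad \phi_{n+1} \ge 0 = \psi_{n+1} \ \text{ on } \partial B_R(x_0),$$
which implies that $\phi_{n+1} \ge \psi_{n+1}$ in $B_R(x_0)$. Therefore, we conclude that $\phi_{n+1} \ge \psi_{n+1} > 0$ in $B_R(x_0)$ for all $n$. Since $x_0$ is arbitrary, it follows that $\phi_n > 0$ in $\Omega$ for all $n$. It remains to show that $\phi_n \le \|\phi_1\|_\infty \phi_{n-1}$. Again we use an induction argument together with the comparison principle. Trivially, $\phi_1 \le \|\phi_1\|_\infty = \|\phi_1\|_\infty \phi_0$. Assume $\phi_n \le \|\phi_1\|_\infty \phi_{n-1}$.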
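We have
$$-\Delta_p \phi_{n+1} = \phi_n^{p-1} \le \|\phi_1\|_\infty^{p-1}\phi_{n-1}^{p-1} = -\Delta_p\left(\|\phi_1\|_\infty \phi_n\right) \ \text{ in } \Omega, \qquad \phi_{n+1} = 0 = \|\phi_1\|_\infty \phi_n \ \text{ on } \partial\Omega,$$
whence $\phi_{n+1} \le \|\phi_1\|_\infty \phi_n$. □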
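As a consequence of Proposition 2.2, it follows that
$$\|\phi_{n+1}\|_\infty \le \|\phi_1\|_\infty \|\phi_n\|_\infty \quad \text{and} \quad \frac{\phi_n}{\phi_{n+1}} \ge \frac{1}{\|\phi_1\|_\infty} > 0.$$
Therefore, the sequence
$$\gamma_n = \inf_\Omega \left(\frac{\phi_n}{\phi_{n+1}}\right)^{p-1} \tag{12}$$
is well defined. Observe that
$$\gamma_0 = \inf_\Omega \left(\frac{1}{\phi_1}\right)^{p-1} = \frac{1}{\|\phi_1\|_\infty^{p-1}}. \tag{13}$$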
Next we show that $(\gamma_n)$ is an increasing sequence, bounded above by the first eigenvalue. We will need the following lemma:
Lemma 2.3. Let $\Omega \subset \mathbb{R}^N$ be a smooth bounded region and $h \in C^0(\Omega)$ a nonnegative function. If $u \in W_0^{1,p}(\Omega) \cap C^{1,\alpha}(\Omega)$ is a positive solution to the Dirichlet problem
$$-\Delta_p u = \lambda_p u^{p-1} + h \ \text{ in } \Omega, \qquad u = 0 \ \text{ on } \partial\Omega, \tag{14}$$
then $h \equiv 0$ in $\Omega$ and consequently $u$ is a positive eigenfunction corresponding to the first eigenvalue $\lambda_p$.
Proof. This proof is adapted from [1, Theorem 2.4] and based on the inequality
$$|\nabla w|^p \ge |\nabla u|^{p-2}\nabla u \cdot \nabla\left(\frac{w^p}{u^{p-1}}\right), \tag{15}$$
valid for all differentiable functions $u, w$ in $\Omega$ that satisfy $u > 0$ and $w \ge 0$; this inequality follows from Picone's identity.
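Multiplying Eq. (14) by any $v \in W_0^{1,p}(\Omega)$ and integrating on $\Omega$, we have
$$\int_\Omega |\nabla u|^{p-2}\nabla u \cdot \nabla v \, dx = \int_\Omega \left(\lambda_p u^{p-1} + h\right) v \, dx. \tag{16}$$
Let $u_p \in W_0^{1,p}(\Omega) \cap C^{1,\alpha}(\Omega)$ be a positive eigenfunction corresponding to $\lambda_p$. Applying (15) with $w = u_p$ and integrating on $\Omega$, we find that
$$\int_\Omega |\nabla u_p|^p \, dx \ge \int_\Omega |\nabla u|^{p-2}\nabla u \cdot \nabla\left(\frac{u_p^p}{u^{p-1}}\right) dx.$$
Now, since $u > 0$ in $\Omega$, by Hopf's lemma we have $u_p^p/u^{p-1} \in W_0^{1,p}(\Omega)$. Therefore, we can apply (16) to conclude that
$$\int_\Omega |\nabla u_p|^p \, dx \ge \int_\Omega \left(\lambda_p u^{p-1} + h\right)\frac{u_p^p}{u^{p-1}} \, dx.$$
Thus,
$$0 = \int_\Omega |\nabla u_p|^p \, dx - \lambda_p \int_\Omega u_p^p \, dx \ge \int_\Omega h\,\frac{u_p^p}{u^{p-1}} \, dx \ge 0,$$
which implies that $h \equiv 0$. □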
Proposition 2.4. For all $n \ge 2$ the following hold:
(i) $\gamma_0 < \lambda_p$.
(ii) $\gamma_0 \le \gamma_n \le \gamma_{n+1} < \lambda_p$.
(iii) There exists $\gamma := \lim \gamma_n$ and $\gamma_0 \le \gamma \le \lambda_p$.
Proof. Property (iii) is a direct consequence of (i) and (ii). We prove (i) by a contradiction argument. Assume $\gamma_0 \ge \lambda_p$. Then, setting
$$h = 1 - \lambda_p \phi_1^{p-1},$$
it follows by (13) that
$$h \ge 1 - \gamma_0 \phi_1^{p-1} \ge 0.$$
Write
$$-\Delta_p \phi_1 = 1 = \lambda_p \phi_1^{p-1} + h \ \text{ in } \Omega, \qquad \phi_1 = 0 \ \text{ on } \partial\Omega.$$
From Lemma 2.3, we conclude that $h \equiv 0$ and so $\lambda_p \phi_1^{p-1} \equiv 1$. This implies that $\phi_1$ is constant, contradicting $-\Delta_p \phi_1 = 1$ in $\Omega$. In order to prove (ii), observe that $\gamma_0 \le \gamma_n$ follows immediately from Proposition 2.2:
$$\gamma_0 = \frac{1}{\|\phi_1\|_\infty^{p-1}} \le \left(\frac{\phi_n}{\phi_{n+1}}\right)^{p-1}.$$
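The monotonicity of the sequence $(\gamma_n)$ can be shown by using the comparison principle. We have by definition
$$-\Delta_p \phi_n = \phi_{n-1}^{p-1} \ge \gamma_{n-1}\phi_n^{p-1} = -\Delta_p\left(\gamma_{n-1}^{1/(p-1)}\phi_{n+1}\right) \ \text{ in } \Omega, \qquad \phi_n = 0 = \phi_{n+1} \ \text{ on } \partial\Omega,$$
whence
$$\phi_n \ge \gamma_{n-1}^{1/(p-1)}\phi_{n+1},$$
and, therefore,
$$\gamma_n = \inf_\Omega \left(\frac{\phi_n}{\phi_{n+1}}\right)^{p-1} \ge \gamma_{n-1}.$$
Finally, in order to verify that $\gamma_n < \lambda_p$ we use again a contradiction argument. Suppose that $\gamma_n \ge \lambda_p$ for some $n$. Then,
$$\lambda_p \phi_{n+1}^{p-1} \le \gamma_n \phi_{n+1}^{p-1} \le \left(\frac{\phi_n}{\phi_{n+1}}\right)^{p-1}\phi_{n+1}^{p-1} = \phi_n^{p-1},$$
the second inequality being a consequence of (12). Thus,
$$h := \phi_n^{p-1} - \lambda_p \phi_{n+1}^{p-1} \ge 0$$
in $\Omega$. Since
$$-\Delta_p \phi_{n+1} = \phi_n^{p-1} = \lambda_p \phi_{n+1}^{p-1} + h \ \text{ in } \Omega, \qquad \phi_{n+1} = 0 \ \text{ on } \partial\Omega,$$
it follows from Lemma 2.3 that $h \equiv 0$. Thus,
$$\phi_n^{p-1} = \lambda_p \phi_{n+1}^{p-1} \tag{17}$$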
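and hence
$$\lambda_p \equiv \left(\frac{\phi_n}{\phi_{n+1}}\right)^{p-1} = \inf_\Omega \left(\frac{\phi_n}{\phi_{n+1}}\right)^{p-1} = \gamma_n.$$
Furthermore, it also follows from (17) that
$$\phi_{n-1}^{p-1} = -\Delta_p \phi_n = -\Delta_p\left(\lambda_p^{1/(p-1)}\phi_{n+1}\right) = \lambda_p\left(-\Delta_p \phi_{n+1}\right) = \lambda_p \phi_n^{p-1},$$
whence
$$\lambda_p \equiv \left(\frac{\phi_{n-1}}{\phi_n}\right)^{p-1} = \inf_\Omega \left(\frac{\phi_{n-1}}{\phi_n}\right)^{p-1} = \gamma_{n-1}.$$
Proceeding recursively we obtain $\lambda_p = \gamma_0$, which contradicts (i). □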
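As a consequence of Propositions 2.2 and 2.4(i), we obtain the following behavior for the sequence $(\phi_n)$:
Corollary 2.5. $\dfrac{\phi_n}{\|\phi_1\|_\infty^n} \to 0$ monotonically and uniformly.
Proof. Set
$$w_n := \frac{\phi_n}{\|\phi_1\|_\infty^n},$$
so that, by (13),
$$-\Delta_p w_n = \gamma_0 w_{n-1}^{p-1} \ \text{ in } \Omega, \qquad w_n = 0 \ \text{ on } \partial\Omega.$$
It follows from Proposition 2.2 that the sequence $(w_n)$ is decreasing and uniformly bounded, since
$$\frac{w_{n+1}}{w_n} = \frac{\phi_{n+1}}{\|\phi_1\|_\infty^{n+1}}\,\frac{\|\phi_1\|_\infty^n}{\phi_n} = \frac{1}{\|\phi_1\|_\infty}\,\frac{\phi_{n+1}}{\phi_n} \le 1$$
and
$$\|w_n\|_\infty = \frac{\|\phi_n\|_\infty}{\|\phi_1\|_\infty^n} \le \frac{\|\phi_{n-1}\|_\infty \|\phi_1\|_\infty}{\|\phi_1\|_\infty^n} \le \frac{\|\phi_{n-2}\|_\infty \|\phi_1\|_\infty^2}{\|\phi_1\|_\infty^n} \le \cdots \le 1.$$
The uniform boundedness of $(w_n)$ together with the compactness of the operator $(-\Delta_p)^{-1} : C^0(\Omega) \to C^0(\Omega)$ imply that $(w_n) = \left((-\Delta_p)^{-1}(\gamma_0 w_{n-1}^{p-1})\right)$ has a subsequence which converges uniformly to some function $w \in C^0(\Omega)$. The monotonicity of $(w_n)$ guarantees that the whole sequence converges uniformly and monotonically to $w$. The continuity of $(-\Delta_p)^{-1}$ implies that $w = (-\Delta_p)^{-1}(\gamma_0 w^{p-1})$. Therefore,
$$-\Delta_p w = \gamma_0 w^{p-1} \ \text{ in } \Omega, \qquad w = 0 \ \text{ on } \partial\Omega.$$
Since $\gamma_0 < \lambda_p$ and $\lambda_p$ is the first eigenvalue for $-\Delta_p$, we conclude that $w = 0$. □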
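2.2. Sequence $(\Gamma_n)$
Set
$$\Gamma_n := \sup_\Omega \left(\frac{\phi_n}{\phi_{n+1}}\right)^{p-1} = \left\|\frac{\phi_n}{\phi_{n+1}}\right\|_{L^\infty(\Omega)}^{p-1}. \tag{18}$$
Observe that
$$\Gamma_0 = \left\|\frac{\phi_0}{\phi_1}\right\|_\infty^{p-1} = \infty,$$
since $\phi_0 = 1$ on $\Omega$ and $\phi_1 = 0$ on the boundary $\partial\Omega$. However, if one can guarantee that $\Gamma_{n_0}$ is finite for some $n_0$, then sequence $(\Gamma_n)$ is well defined from $n_0$ on:
Proposition 2.6. Assume $\Gamma_{n_0} < \infty$ for some $n_0 \ge 1$. Then
$$\left\|\frac{\phi_n}{\phi_{n+1}}\right\|_\infty^{p-1} \le \Gamma_{n_0} \ \text{ for all } n \ge n_0$$
and therefore $(\Gamma_n)$ is well defined and bounded from above for $n \ge n_0$. Moreover, the sequence $(\Gamma_n)$ is decreasing for $n \ge n_0$.
Proof. Proceeding by induction, assume we have shown $\Gamma_{n_0+k} \le \cdots \le \Gamma_{n_0}$. Then, for $j = n_0 + k$ we observe that
$$-\Delta_p \phi_{j+1} = \phi_j^{p-1} = \left(\frac{\phi_j}{\phi_{j+1}}\right)^{p-1}\phi_{j+1}^{p-1} \le \Gamma_j \phi_{j+1}^{p-1} = -\Delta_p\left(\Gamma_j^{\frac{1}{p-1}}\phi_{j+2}\right) \ \text{ in } \Omega,$$
and
$$\phi_{j+1} = 0 = \Gamma_j^{\frac{1}{p-1}}\phi_{j+2} \ \text{ on } \partial\Omega.$$
Thus, it follows from the comparison principle that
$$\phi_{j+1} \le \Gamma_j^{\frac{1}{p-1}}\phi_{j+2} \ \text{ in } \Omega.$$
Hence,
$$\Gamma_{j+1} = \left\|\frac{\phi_{j+1}}{\phi_{j+2}}\right\|_\infty^{p-1} \le \Gamma_j. \quad □$$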
For special domains, we are able to prove that there exists $n_0$ such that $\Gamma_{n_0}$ is finite. Indeed, the following result holds:
Proposition 2.7. Let $\Omega = B_R$ be the ball centered at the origin with radius $R$. Then $\Gamma_1$ is finite.
Proof. Using the same notation as in Proposition 2.2, we have $\phi_n(x) = v_n(r)$ with
$$v_n(r) = \int_r^R \left(\int_0^\theta \left(\frac{s}{\theta}\right)^{N-1} v_{n-1}(s)^{p-1}\, ds\right)^{\frac{1}{p-1}} d\theta. \tag{19}$$
Thus, if $x \in \partial B_R$, L'Hôpital's rule implies that
$$\frac{\phi_1(x)}{\phi_2(x)} = \lim_{r\to R^-}\frac{v_1(r)}{v_2(r)} = \lim_{r\to R^-}\frac{v_1'(r)}{v_2'(r)} = \left(\frac{\int_0^R s^{N-1}\, ds}{\int_0^R s^{N-1} v_1(s)^{p-1}\, ds}\right)^{\frac{1}{p-1}} < \infty. \quad □$$
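2.3. Sequence $(\nu_n)$
Set
$$\nu_n = \left(\frac{\|\phi_n\|_{L^p(\Omega)}}{\|\phi_{n+1}\|_{L^p(\Omega)}}\right)^{p-1}.$$
Clearly, sequence $(\nu_n)$ is well defined. We show that both it and sequence $(\Gamma_n)$ are bounded below by the first eigenvalue.
Proposition 2.8. There holds $\lambda_p \le \nu_n \le \Gamma_n$ for all $n \ge 1$.
Proof. By (4) and Hölder's inequality we have
$$\|\nabla\phi_{n+1}\|_p^p = \int_\Omega |\nabla\phi_{n+1}|^p \, dx = \int_\Omega \phi_n^{p-1}\phi_{n+1} \, dx \le \|\phi_n\|_p^{p-1}\|\phi_{n+1}\|_p.$$
Hence, from this and the variational characterization of the first eigenvalue
$$\lambda_p = \inf_{v \in W_0^{1,p}(\Omega)\setminus\{0\}} \frac{\|\nabla v\|_p^p}{\|v\|_p^p},$$
it follows that
$$\lambda_p \le \frac{\|\nabla\phi_{n+1}\|_p^p}{\|\phi_{n+1}\|_p^p} \le \frac{\|\phi_n\|_p^{p-1}\|\phi_{n+1}\|_p}{\|\phi_{n+1}\|_p^p} = \nu_n = \frac{1}{\|\phi_{n+1}\|_p^{p-1}}\left(\int_\Omega \left(\frac{\phi_n}{\phi_{n+1}}\right)^p \phi_{n+1}^p \, dx\right)^{\frac{p-1}{p}} \le \left\|\frac{\phi_n}{\phi_{n+1}}\right\|_\infty^{p-1}\frac{1}{\|\phi_{n+1}\|_p^{p-1}}\left(\int_\Omega \phi_{n+1}^p \, dx\right)^{\frac{p-1}{p}} = \Gamma_n. \quad □$$
Corollary 2.9. If $\lim \Gamma_n = \lambda_p$ then $\lim \nu_n = \lambda_p$.
As an interesting consequence of Propositions 2.4 and 2.8 we have
Corollary 2.10. If $\Omega$ is connected and $\Gamma_{n_0} < \infty$ for some $n_0 \ge 1$, then for each $n \ge n_0$ there exists at least one $x_n \in \Omega$ such that
$$\lambda_p = \left(\frac{\phi_n(x_n)}{\phi_{n+1}(x_n)}\right)^{p-1}.$$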
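3. The first eigenvalue and the first eigenfunction
Recall that if $\gamma = \lim \gamma_n$, then it follows from Proposition 2.4 that $\gamma \le \lambda_p$. If we set $\Gamma := \lim \Gamma_n$ and $\nu := \lim \nu_n$, we have from Propositions 2.6 and 2.8 that $\nu < \infty$ and $\lambda_p \le \nu \le \Gamma$. We conjecture the following:
Conjecture 3.1. There holds
$$\lambda_p = \gamma = \Gamma = \nu. \tag{20}$$
In order to find a sequence of functions approximating the first eigenfunction, we define for each $n \in \mathbb{N}$ the function
$$u_n := \frac{\phi_n}{a_n},$$
where $a_n$ is such that
$$\frac{a_n}{a_{n+1}} = \gamma_n^{\frac{1}{p-1}} = \inf_\Omega \frac{\phi_n}{\phi_{n+1}}.$$
For instance, if we set $a_1 := \|\phi_1\|_\infty$, then
$$a_2 = \frac{a_1}{\gamma_1^{1/(p-1)}} = \frac{a_1}{\inf_\Omega \frac{\phi_1}{\phi_2}} = \|\phi_1\|_\infty \left\|\frac{\phi_2}{\phi_1}\right\|_\infty$$
and, in general,
$$a_n = \|\phi_1\|_\infty \frac{1}{\inf_\Omega \frac{\phi_1}{\phi_2}}\,\frac{1}{\inf_\Omega \frac{\phi_2}{\phi_3}} \cdots \frac{1}{\inf_\Omega \frac{\phi_{n-1}}{\phi_n}} = \|\phi_1\|_\infty \left\|\frac{\phi_2}{\phi_1}\right\|_\infty \left\|\frac{\phi_3}{\phi_2}\right\|_\infty \cdots \left\|\frac{\phi_n}{\phi_{n-1}}\right\|_\infty. \tag{21}$$
Since
$$\left\|\frac{\phi_k}{\phi_{k-1}}\right\|_\infty \ge \frac{\|\phi_k\|_\infty}{\|\phi_{k-1}\|_\infty},$$
we have
$$a_n \ge \|\phi_1\|_\infty \frac{\|\phi_2\|_\infty}{\|\phi_1\|_\infty}\,\frac{\|\phi_3\|_\infty}{\|\phi_2\|_\infty} \cdots \frac{\|\phi_n\|_\infty}{\|\phi_{n-1}\|_\infty} = \|\phi_n\|_\infty.$$
Therefore,
$$u_n \le \frac{\phi_n}{\|\phi_n\|_\infty} \le 1.$$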
Proposition 3.2. Let $(u_n) \subset W_0^{1,p}(\Omega) \cap C^{1,\alpha}(\Omega)$ be the sequence of functions defined above. Then $(u_n)$ is decreasing and satisfies
$$-\Delta_p u_{n+1} = \gamma_n u_n^{p-1} \ \text{ in } \Omega, \qquad u_n = 0 \ \text{ on } \partial\Omega.$$
Furthermore, $(u_n)$ converges uniformly to a function $u \in C^{1,\alpha}(\Omega)$ which satisfies
$$-\Delta_p u = \gamma u^{p-1} \ \text{ in } \Omega, \qquad u = 0 \ \text{ on } \partial\Omega.$$
Proof. We have
$$-\Delta_p u_{n+1} = \left(\frac{\phi_n}{a_{n+1}}\right)^{p-1} = \left(\frac{a_n}{a_{n+1}}\right)^{p-1} u_n^{p-1} = \gamma_n u_n^{p-1}.$$
Moreover,
$$u_{n+1} = \frac{\phi_{n+1}}{a_{n+1}} = \frac{\phi_{n+1}}{a_n}\inf_\Omega \frac{\phi_n}{\phi_{n+1}} \le \frac{\phi_{n+1}}{a_n}\,\frac{\phi_n}{\phi_{n+1}} = \frac{\phi_n}{a_n} = u_n,$$
which proves that $(u_n)$ is decreasing and therefore we can define a function $u$ in $\Omega$ by $u(x) := \lim u_n(x)$ for each $x \in \Omega$. Since $(u_n) \subset C^{1,\alpha}(\Omega)$, $0 \le u_n \le u_1$, and the operator $(-\Delta_p)^{-1} : C^0(\Omega) \to C^0(\Omega)$ is compact, the whole sequence $(u_n)$ converges to $u$ uniformly and we can pass to the limit in
$$-\Delta_p u_{n+1} = \gamma_n u_n^{p-1}$$
to obtain $-\Delta_p u = \gamma u^{p-1}$. □
In view of Conjecture 3.1, this result suggests that $u$ is the eigenfunction corresponding to the first eigenvalue. However, since we are not able to guarantee that $u$ is not the null function, this does not follow automatically. On the other hand, since Proposition 3.2 is independent of the conjecture, its proof would show that $\gamma = \lambda_p$. We end this section by remarking that all results proved above are valid if we consider a positive weight $\omega(x)$ multiplying the right-hand side of the eigenvalue equation, that is, if we consider the equation $-\Delta_p u = \lambda\,\omega(x)|u|^{p-2}u$ in $\Omega$. The above arguments are easily adapted to cover this case and the most remarkable change appears in the sequence $(\nu_n)$, which becomes
$$\nu_n = \left(\frac{\|\omega(x)^{1/p}\phi_n\|_p}{\|\omega(x)^{1/p}\phi_{n+1}\|_p}\right)^{p-1}.$$
4. Spherical domains
In this section we show the validity of Conjecture 3.1 for balls. Let $B = B_1(0) \subset \mathbb{R}^N$, $N \ge 2$, denote the unit ball centered at the origin. The following lemma, in the form given here, was first stated in [2], even though it has already often been used as a technical tool in differential geometry. A proof is provided for completeness.
Lemma 4.1. Let $f, g : [a, b] \to \mathbb{R}$ be continuous on $[a, b]$ and differentiable in $(a, b)$. Suppose $g'(x) \ne 0$ for all $x \in (a, b)$. If $f'/g'$ is (strictly) increasing [decreasing], then both
$$\frac{f(x)-f(a)}{g(x)-g(a)} \quad \text{and} \quad \frac{f(x)-f(b)}{g(x)-g(b)}$$
are (strictly) increasing [decreasing].
Proof. Assume $f'/g'$ is increasing. Then the Cauchy mean value theorem implies that for each $x \in (a, b)$ there exists $y \in (a, x)$ such that
$$\frac{f(x)-f(a)}{g(x)-g(a)} = \frac{f'(y)}{g'(y)} \le \frac{f'(x)}{g'(x)}.$$
On the other hand, since $g' \ne 0$ we always have
$$\frac{g'(x)}{g(x)-g(a)} > 0.$$
Thus,
$$\frac{d}{dx}\left(\frac{f(x)-f(a)}{g(x)-g(a)}\right) = \frac{f'(x)}{g(x)-g(a)} - \frac{g'(x)}{g(x)-g(a)}\,\frac{f(x)-f(a)}{g(x)-g(a)} = \frac{g'(x)}{g(x)-g(a)}\left(\frac{f'(x)}{g'(x)} - \frac{f(x)-f(a)}{g(x)-g(a)}\right) \ge 0,$$
and so $\frac{f(x)-f(a)}{g(x)-g(a)}$ is increasing. If $f'/g'$ is decreasing, then the same arguments prove that $\frac{f(x)-f(a)}{g(x)-g(a)}$ is decreasing. Moreover, the above inequalities are strict if the monotonicity of $f'/g'$ is strict. The proof for $\frac{f(x)-f(b)}{g(x)-g(b)}$ is similar. □
We use this lemma in order to show that for each $n \ge 0$ the quotient $\phi_n/\phi_{n+1}$ is increasing as a function of $r = |x|$:
Theorem 4.2. Let $p > 1$ and for $r \in [0, 1]$ set $\phi_0(r) \equiv 1$,
$$\phi_n(r) = \int_r^1 \left(\int_0^\theta \left(\frac{s}{\theta}\right)^{N-1}\phi_{n-1}(s)^{p-1}\, ds\right)^{\frac{1}{p-1}} d\theta, \quad \text{if } n \ge 1.$$
Then, for each $n \ge 1$ the function $\phi_n$ is strictly decreasing and for each $n \ge 0$ the quotient $\phi_n/\phi_{n+1}$ is strictly increasing on $[0, 1]$.
Proof. As
$$\phi_n'(r) = -\left(\int_0^r \left(\frac{s}{r}\right)^{N-1}\phi_{n-1}(s)^{p-1}\, ds\right)^{\frac{1}{p-1}} < 0$$
for $r > 0$, the functions $\phi_n$ are strictly decreasing for $n \ge 1$. In particular, the quotient $\phi_n/\phi_{n+1}$ is strictly increasing when $n = 0$. In order to show that the quotients are strictly increasing for $n \ge 1$, we use an induction argument. Assume that the quotient $\phi_{n-1}/\phi_n$ is strictly increasing for some $n \ge 1$. Noticing that $\phi_n(1) = \phi_{n+1}(1) = 0$, we can write
$$\frac{\phi_n}{\phi_{n+1}}(r) = \frac{\phi_n(r) - \phi_n(1)}{\phi_{n+1}(r) - \phi_{n+1}(1)}.$$
We will apply the previous lemma in order to show that the quotient on the right-hand side of this equation is strictly increasing. For this, it is enough to verify that
$$\frac{\phi_n'(r)}{\phi_{n+1}'(r)} = \left(\frac{\int_0^r s^{N-1}\phi_{n-1}^{p-1}(s)\, ds}{\int_0^r s^{N-1}\phi_n^{p-1}(s)\, ds}\right)^{\frac{1}{p-1}}$$
is increasing. Since the map $\xi \mapsto \xi^{\frac{1}{p-1}}$ is increasing, this is equivalent to showing that
$$\frac{\int_0^r s^{N-1}\phi_{n-1}^{p-1}(s)\, ds}{\int_0^r s^{N-1}\phi_n^{p-1}(s)\, ds}$$
is increasing. But this is itself a consequence of the lemma, for both $\int_0^r s^{N-1}\phi_{n-1}^{p-1}(s)\, ds$ and $\int_0^r s^{N-1}\phi_n^{p-1}(s)\, ds$ equal zero at $r = 0$ and
$$\frac{\left(\int_0^r s^{N-1}\phi_{n-1}^{p-1}(s)\, ds\right)'}{\left(\int_0^r s^{N-1}\phi_n^{p-1}(s)\, ds\right)'} = \left(\frac{\phi_{n-1}(r)}{\phi_n(r)}\right)^{p-1},$$
which is strictly increasing by the induction hypothesis. □
Corollary 4.3.
$$\gamma_n = \inf_B \left(\frac{\phi_n}{\phi_{n+1}}\right)^{p-1} = \left(\frac{\|\phi_n\|_\infty}{\|\phi_{n+1}\|_\infty}\right)^{p-1}.$$
Proof. Since $\phi_n$ and $\phi_{n+1}$ are decreasing functions, we have $\|\phi_n\|_\infty = \phi_n(0)$ and $\|\phi_{n+1}\|_\infty = \phi_{n+1}(0)$. Therefore, since $\phi_n/\phi_{n+1}$ is increasing,
$$\inf_B \left(\frac{\phi_n}{\phi_{n+1}}\right)^{p-1} = \left(\frac{\phi_n(0)}{\phi_{n+1}(0)}\right)^{p-1} = \left(\frac{\|\phi_n\|_\infty}{\|\phi_{n+1}\|_\infty}\right)^{p-1}. \quad □$$
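The radial formula of Theorem 4.2 and the identity of Corollary 4.3 lend themselves to a direct numerical experiment. The following is a minimal sketch under stated assumptions: it uses a uniform grid on $[0, 1]$ and the plain trapezoidal rule (the paper's own computations in Section 7.1 mix Simpson and trapezoidal rules, which is not reproduced here), and the function name `radial_iteration` and its parameters are hypothetical. It returns the last computed $\gamma_n$, evaluated through Corollary 4.3 as $(\phi_n(0)/\phi_{n+1}(0))^{p-1}$, and the last $\nu_n$.

```python
def radial_iteration(p, N, iterations=10, M=1000):
    """Iterate the radial formula on the unit ball; return the last (gamma_n, nu_n)."""
    h = 1.0 / M
    r = [i * h for i in range(M + 1)]
    w = [ri ** (N - 1) for ri in r]             # radial weight r^(N-1)

    def trapezoid(values):
        # composite trapezoidal rule on the uniform grid
        return h * (sum(values) - 0.5 * (values[0] + values[-1]))

    def lp_norm(phi):
        # L^p norm up to the constant surface factor, which cancels in nu_n
        return trapezoid([w[i] * phi[i] ** p for i in range(M + 1)]) ** (1.0 / p)

    phi = [1.0] * (M + 1)                       # phi_0 = 1
    gamma = nu = None
    for _ in range(iterations):
        # inner integral I(theta) = int_0^theta s^(N-1) phi(s)^(p-1) ds
        f = [w[i] * phi[i] ** (p - 1) for i in range(M + 1)]
        inner = [0.0] * (M + 1)
        for i in range(1, M + 1):
            inner[i] = inner[i - 1] + 0.5 * h * (f[i - 1] + f[i])
        # g(theta) = (I(theta) / theta^(N-1))^(1/(p-1)), with g(0) = 0
        g = [0.0] + [(inner[i] / w[i]) ** (1.0 / (p - 1)) for i in range(1, M + 1)]
        # phi_next(r) = int_r^1 g(theta) d theta, accumulated from r = 1 inward
        nxt = [0.0] * (M + 1)
        for i in range(M - 1, -1, -1):
            nxt[i] = nxt[i + 1] + 0.5 * h * (g[i] + g[i + 1])
        gamma = (phi[0] / nxt[0]) ** (p - 1)    # Corollary 4.3: sup norms sit at r = 0
        nu = (lp_norm(phi) / lp_norm(nxt)) ** (p - 1)
        phi = nxt
    return gamma, nu

gamma, nu = radial_iteration(p=2.0, N=2)
print(gamma, nu)
```

For $p = 2$ and $N = 2$ both numbers should land close to the known value 5.7832 quoted in Section 7.1. The sequence $(\Gamma_n)$ is deliberately left out of the sketch, since it involves the quotient $\phi_n/\phi_{n+1}$ at the boundary, where both functions vanish.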
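Theorem 4.4. Let $(u_n)$ be the sequence defined by
$$u_n := \frac{\phi_n}{\|\phi_n\|_\infty}$$
for each $n \in \mathbb{N}$. Then $\gamma = \lambda_p(B)$ and $(u_n)$ converges uniformly (and monotonically) to a positive function $u \in C^{1,\alpha}(B)$ such that $\|u\|_\infty = 1$ and
$$-\Delta_p u = \lambda_p u^{p-1} \ \text{ in } B, \qquad u = 0 \ \text{ on } \partial B. \tag{22}$$
Proof. Notice that sequence $(u_n)$ is the same as that defined in Proposition 3.2, since $a_n$ defined in (21) is now, in view of Corollary 4.3,
$$a_n = \|\phi_1\|_\infty \frac{1}{\inf_\Omega \frac{\phi_1}{\phi_2}}\,\frac{1}{\inf_\Omega \frac{\phi_2}{\phi_3}} \cdots \frac{1}{\inf_\Omega \frac{\phi_{n-1}}{\phi_n}} = \|\phi_1\|_\infty \frac{1}{\frac{\|\phi_1\|_\infty}{\|\phi_2\|_\infty}}\,\frac{1}{\frac{\|\phi_2\|_\infty}{\|\phi_3\|_\infty}} \cdots \frac{1}{\frac{\|\phi_{n-1}\|_\infty}{\|\phi_n\|_\infty}} = \|\phi_n\|_\infty.$$
Thus, it satisfies the nonlinear problem
$$-\Delta_p u_{n+1} = \gamma_n u_n^{p-1} \ \text{ in } B, \qquad u_n = 0 \ \text{ on } \partial B, \tag{23}$$
and is decreasing. Therefore, arguing as in the proof of Proposition 3.2, we can pass to the limit in (23) in order to obtain (22). However, unlike the sequence of Proposition 3.2, we have in addition $\|u_n\|_\infty = 1$ for every $n \in \mathbb{N}$, which allows us to conclude that $\|u\|_\infty = \lim \|u_n\|_\infty = 1$, hence $u$ is not the null function and thus $\gamma = \lambda_p$ (see remark after Proposition 3.2). □
Next we will show that the sequence $(\phi_n/\phi_{n+1})$ converges uniformly to the constant function $\lambda_p^{1/(p-1)}$ on each compact set contained in $B$.
Lemma 4.5. For each $0 < \varepsilon < 1$ define
$$K_\varepsilon := \left(\int_{1-\varepsilon}^1 \left(\frac{\varepsilon}{\theta}\right)^{\frac{N-1}{p-1}} d\theta\right)^{-1}.$$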
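Then
$$0 \le \left(\frac{\phi_n}{\phi_{n+1}}\right)' \le K_\varepsilon \left\|\frac{\phi_1}{\phi_2}\right\|_\infty \ \text{ on the interval } [\varepsilon, 1-\varepsilon]. \tag{24}$$
Proof. Since (see Propositions 2.6 and 2.7)
$$\left\|\frac{\phi_n}{\phi_{n+1}}\right\|_\infty \le \left\|\frac{\phi_1}{\phi_2}\right\|_\infty, \quad \text{if } n \ge 2, \tag{25}$$
and
$$0 \le \left(\frac{\phi_n}{\phi_{n+1}}\right)' = \frac{\phi_n'\phi_{n+1} - \phi_n\phi_{n+1}'}{\phi_{n+1}^2} = \frac{\phi_n}{\phi_{n+1}}\,\frac{|\phi_{n+1}'|}{\phi_{n+1}} - \frac{|\phi_n'|}{\phi_{n+1}} \le \frac{\phi_n}{\phi_{n+1}}\,\frac{|\phi_{n+1}'|}{\phi_{n+1}},$$
it suffices to show that
$$\frac{|\phi_{n+1}'|}{\phi_{n+1}} \le K_\varepsilon \ \text{ in } [\varepsilon, 1-\varepsilon].$$
And indeed, for $\varepsilon \le r \le 1-\varepsilon < 1$ we have
$$\phi_{n+1}(r) = \int_r^1 \left(\int_0^\theta \left(\frac{s}{\theta}\right)^{N-1}\phi_n(s)^{p-1}\, ds\right)^{\frac{1}{p-1}} d\theta \ge \int_r^1 \theta^{-\frac{N-1}{p-1}}\, d\theta \left(\int_0^r s^{N-1}\phi_n(s)^{p-1}\, ds\right)^{\frac{1}{p-1}}$$
$$= \int_r^1 \theta^{-\frac{N-1}{p-1}}\, d\theta \; r^{\frac{N-1}{p-1}}\,|\phi_{n+1}'(r)| \ge \int_{1-\varepsilon}^1 \left(\frac{\varepsilon}{\theta}\right)^{\frac{N-1}{p-1}} d\theta \; |\phi_{n+1}'(r)|. \quad □$$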
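Theorem 4.6. For each fixed $0 < \varepsilon < 1$ we have
$$\frac{\phi_n(|x|)}{\phi_{n+1}(|x|)} \to \lambda_p^{\frac{1}{p-1}}$$
uniformly on the annulus $\Omega_\varepsilon^{1-\varepsilon} := \{\varepsilon < |x| < 1-\varepsilon\} \subset B_1$.
Proof. In view of (24) and (25), it follows from the Arzelà–Ascoli theorem that, up to a subsequence, $\left(\frac{\phi_n(r)}{\phi_{n+1}(r)}\right)$ converges uniformly to a function $w \in C([\varepsilon, 1-\varepsilon])$. Taking
$$u_n(|x|) = \frac{\phi_n(|x|)}{\|\phi_n\|_\infty}$$
as in the proof of Theorem 4.4, we can write
$$-\Delta_p u_{n+1} = \frac{\phi_n^{p-1}}{\|\phi_{n+1}\|_\infty^{p-1}} = \left(\frac{\phi_n}{\phi_{n+1}}\right)^{p-1} u_{n+1}^{p-1} \ \text{ in } \Omega_\varepsilon^{1-\varepsilon}.$$
Thus, passing to the limit in this equation for a convenient subsequence, we obtain $-\Delta_p u = w^{p-1}u^{p-1}$, where $u$ is the eigenfunction given in Theorem 4.4. Therefore, $w^{p-1} \equiv \lambda_p$, for $-\Delta_p u = \lambda_p u^{p-1}$. The proof is complete since the limit is independent of the particular subsequence that converges to $w$. □
Corollary 4.7. $\Gamma = \lambda_p$.
Proof. We have
$$\Gamma_n = \sup_{0 \le r \le 1}\left(\frac{\phi_n}{\phi_{n+1}}\right)^{p-1} = \lim_{r \to 1^-}\left(\frac{\phi_n(r)}{\phi_{n+1}(r)}\right)^{p-1} = \left(\frac{\phi_n'(1)}{\phi_{n+1}'(1)}\right)^{p-1}$$
and
$$\left(\frac{\phi_n'(1)}{\phi_{n+1}'(1)}\right)^{p-1} = \frac{\int_0^1 s^{N-1}\phi_{n-1}^{p-1}(s)\, ds}{\int_0^1 s^{N-1}\phi_n^{p-1}(s)\, ds} = \frac{\int_0^1 s^{N-1}\left(\frac{\phi_{n-1}(s)}{\phi_n(s)}\right)^{p-1}\left(\frac{\phi_n(s)}{\|\phi_n\|_\infty}\right)^{p-1} ds}{\int_0^1 s^{N-1}\left(\frac{\phi_n(s)}{\|\phi_n\|_\infty}\right)^{p-1} ds}.$$
Since $\frac{\phi_{n-1}}{\phi_n}$ and $\frac{\phi_n}{\|\phi_n\|_\infty}$ are bounded, $\lambda_p = \lim\left(\frac{\phi_{n-1}}{\phi_n}\right)^{p-1}$ and $u(|x|)^{p-1} = \lim\left(\frac{\phi_n(|x|)}{\|\phi_n\|_\infty}\right)^{p-1}$, with $u$ the eigenfunction obtained above, we can apply Lebesgue's dominated convergence theorem to obtain
$$\Gamma = \lim\left(\frac{\phi_n'(1)}{\phi_{n+1}'(1)}\right)^{p-1} = \frac{\int_0^1 s^{N-1}\lambda_p u^{p-1}(s)\, ds}{\int_0^1 s^{N-1}u^{p-1}(s)\, ds} = \lambda_p. \quad □$$
Corollary 4.8.
$$\nu := \lim\left(\frac{\|\phi_n\|_p}{\|\phi_{n+1}\|_p}\right)^{p-1} = \lambda_p.$$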
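Proof. In view of the last corollary, this follows from Corollary 2.9. □
This result in fact holds for any $L^q$-norm ($q > 1$):
Corollary 4.9. For any $q > 1$ there holds
$$\lim\left(\frac{\|\phi_n\|_q}{\|\phi_{n+1}\|_q}\right)^{p-1} = \lambda_p.$$
Proof. As before set $u(|x|) = \lim \frac{\phi_n(|x|)}{\|\phi_n\|_\infty}$. Then it follows from Lebesgue's dominated convergence theorem that
$$\lim\left(\frac{\|\phi_n\|_q}{\|\phi_{n+1}\|_q}\right)^{p-1} = \lim\left(\frac{\int_0^1 s^{N-1}\phi_n^q(s)\, ds}{\int_0^1 s^{N-1}\phi_{n+1}^q(s)\, ds}\right)^{\frac{p-1}{q}} = \lim\left(\frac{\|\phi_n\|_\infty}{\|\phi_{n+1}\|_\infty}\right)^{p-1}\left(\frac{\int_0^1 s^{N-1}\lim\left(\frac{\phi_n(s)}{\|\phi_n\|_\infty}\right)^q ds}{\int_0^1 s^{N-1}\lim\left(\frac{\phi_{n+1}(s)}{\|\phi_{n+1}\|_\infty}\right)^q ds}\right)^{\frac{p-1}{q}}$$
$$= \lambda_p\left(\frac{\int_0^1 s^{N-1}u^q(s)\, ds}{\int_0^1 s^{N-1}u^q(s)\, ds}\right)^{\frac{p-1}{q}} = \lambda_p. \quad □$$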
As at the end of the last section, we also remark that all results in this section remain valid if we consider a radially symmetric weight $\omega(|x|)$. Moreover, the results can be naturally extended to the Dirichlet problem in $\mathbb{R}^N$ with an appropriate weight.
5. The case p = 2
In this section, we give a complete proof of the convergence of the three sequences to the first eigenvalue of the Laplacian. Let $0 < \lambda_1 < \lambda_2 \le \lambda_3 \le \cdots$ be the increasing sequence of Dirichlet eigenvalues for the Laplacian $-\Delta$ in $\Omega$ and $(e_n)_{n\in\mathbb{N}} \subset W_0^{1,2}(\Omega) \cap C^2(\Omega)$ be a corresponding sequence of eigenfunctions which is also an orthogonal system for $L^2(\Omega)$ and normalized by the sup-norm, that is, $\|e_n\|_\infty = 1$ for all $n \in \mathbb{N}$. Denote the inner product in $L^2(\Omega)$ by
$$\langle u, v\rangle = \int_\Omega uv \, dx.$$
If $\xi \in W_0^{1,2}(\Omega) \cap C^2(\Omega)$ is such that $\xi > 0$ in $\Omega$, then
$$\xi = \sum_{k=1}^\infty \alpha_k e_k$$
with
$$\alpha_k = \langle \xi, e_k\rangle, \quad k = 1, 2, \ldots,$$
and we may assume
$$\alpha_1 = \langle \xi, e_1\rangle = \int_\Omega \xi e_1 \, dx > 0,$$
since we can take $e_1 > 0$ in $\Omega$. Moreover,
$$\sum_{k=2}^\infty \alpha_k^2\|e_k\|^2 = \|\xi\|_2^2 - \alpha_1^2\|e_1\|^2 < \|\xi\|_2^2.$$
Now, if $\phi \in W_0^{1,2}(\Omega) \cap C^2(\Omega)$ is such that
$$-\Delta\phi = \xi \ \text{ in } \Omega, \qquad \phi = 0 \ \text{ on } \partial\Omega,$$
it follows that
$$\phi = \sum_{k=1}^\infty \langle\phi, e_k\rangle e_k$$
with
$$\sum_{k=1}^\infty \alpha_k e_k = \xi = -\Delta\phi = \sum_{k=1}^\infty \langle\phi, e_k\rangle(-\Delta e_k) = \sum_{k=1}^\infty \lambda_k\langle\phi, e_k\rangle e_k,$$
whence
$$\langle\phi, e_k\rangle = \frac{\alpha_k}{\lambda_k}.$$
Thus,
$$\phi = \sum_{k=1}^\infty \frac{\alpha_k}{\lambda_k} e_k. \tag{26}$$
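Returning to sequence $(\phi_n)$, if
$$1 = \sum_{k=1}^\infty \alpha_k e_k$$
is the expansion of the function $\xi \equiv 1$, we obtain recursively
$$\phi_n = \sum_{k=1}^\infty \frac{\alpha_k}{\lambda_k^n} e_k,$$
that is,
$$\phi_n = \frac{1}{\lambda_1^n}(\alpha_1 e_1 + \psi_n),$$
where
$$\psi_n := \sum_{k=2}^\infty \left(\frac{\lambda_1}{\lambda_k}\right)^n \alpha_k e_k.$$
Theorem 5.1. $\lambda_1 = \lim \dfrac{\|\phi_n\|_2}{\|\phi_{n+1}\|_2}$.
Proof. We assert that if $(\psi_n)_{n\in\mathbb{N}}$ is the sequence defined above, then $\psi_n \to 0$ in $L^2(\Omega)$. Indeed, this follows immediately from the estimate
$$\|\psi_n\|_2^2 = \sum_{k=2}^\infty \left(\frac{\lambda_1}{\lambda_k}\right)^{2n}\alpha_k^2\|e_k\|_2^2 \le \left(\frac{\lambda_1}{\lambda_2}\right)^{2n}\sum_{k=2}^\infty \alpha_k^2\|e_k\|_2^2 \le \|1\|_2^2\left(\frac{\lambda_1}{\lambda_2}\right)^{2n}$$
and the fact that $\lambda_1 < \lambda_2$. Therefore,
$$\lim\frac{\|\phi_n\|_2}{\|\phi_{n+1}\|_2} = \lim \lambda_1\frac{\|\alpha_1 e_1 + \psi_n\|_2}{\|\alpha_1 e_1 + \psi_{n+1}\|_2} = \lambda_1 \lim\left(\frac{\alpha_1^2\|e_1\|_2^2 + \|\psi_n\|_2^2}{\alpha_1^2\|e_1\|_2^2 + \|\psi_{n+1}\|_2^2}\right)^{\frac{1}{2}} = \lambda_1. \quad □$$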
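Theorem 5.1 can be checked by hand in a one-dimensional example. The sketch below is an illustration under stated assumptions, not part of the paper: it takes $\Omega = (0, 1)$, where $e_k(x) = \sin(k\pi x)$ and $\lambda_k = k^2\pi^2$, and uses the classical sine expansion of $\xi \equiv 1$, whose coefficients are $\alpha_k = 4/(k\pi)$ for odd $k$ and zero for even $k$. The function name `nu_from_series` and the truncation level are hypothetical choices.

```python
import math

def nu_from_series(n, terms=2000):
    """Ratio ||phi_n||_2 / ||phi_{n+1}||_2 on (0, 1) from the sine expansion of 1.

    Uses phi_n = sum_k alpha_k / lambda_k^n * e_k; the common factor
    ||e_k||_2^2 = 1/2 cancels in the ratio and is therefore omitted.
    """
    num = 0.0
    den = 0.0
    for k in range(1, 2 * terms, 2):        # only odd k contribute
        alpha = 4.0 / (k * math.pi)         # coefficient of sin(k*pi*x) in 1
        lam = (k * math.pi) ** 2            # Dirichlet eigenvalue of -d^2/dx^2
        num += alpha ** 2 / lam ** (2 * n)
        den += alpha ** 2 / lam ** (2 * n + 2)
    return math.sqrt(num / den)

ratios = [nu_from_series(n) for n in range(6)]
print(ratios, math.pi ** 2)
```

Since $\lambda_1/\lambda_3 = 1/9$ in this example, the contribution of $\psi_n$ dies out very quickly and the ratios approach $\lambda_1 = \pi^2$ from above within a few iterations.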
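Theorem 5.2. $\psi_n \to 0$ uniformly in $\Omega$.
Proof. Since the convergence of the eigenfunction expansion of $\xi \equiv 1$ is absolute, let
$$M := \sum_{k=1}^\infty |\alpha_k|.$$
Then
$$\|\psi_n\|_\infty = \left\|\sum_{k=2}^\infty \left(\frac{\lambda_1}{\lambda_k}\right)^n \alpha_k e_k\right\|_\infty \le M\left(\frac{\lambda_1}{\lambda_2}\right)^n$$
and the result follows letting $n \to \infty$. □
Corollary 5.3. $\dfrac{\phi_n}{\phi_{n+1}} \to \lambda_1$ uniformly in any compact subset $K \subset\subset \Omega$.
Proof. Let $K \subset\subset \Omega$ be a compact set and set
$$m_K := \min_K e_1 > 0.$$
Since $\psi_n \to 0$ uniformly in $\Omega$, we have for all sufficiently large $n$ that
$$\alpha_1 m_K \le \alpha_1 e_1 \le |\alpha_1 e_1 + \psi_{n+1}| + |\psi_{n+1}| < |\alpha_1 e_1 + \psi_{n+1}| + \frac{\alpha_1 m_K}{2},$$
whence
$$|\alpha_1 e_1 + \psi_{n+1}| \ge \frac{\alpha_1 m_K}{2}.$$
Thus, on $K$ we obtain
$$\left|\frac{\phi_n}{\phi_{n+1}} - \lambda_1\right| = \lambda_1\left|\frac{\alpha_1 e_1 + \psi_n}{\alpha_1 e_1 + \psi_{n+1}} - 1\right| = \lambda_1\left|\frac{\psi_n - \psi_{n+1}}{\alpha_1 e_1 + \psi_{n+1}}\right| \le \frac{2\lambda_1}{\alpha_1 m_K}|\psi_n - \psi_{n+1}| \to 0$$
uniformly. □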
6. Higher eigenvalues
In the case $p = 2$, higher eigenvalues and their respective eigenfunctions can also in principle be obtained by this technique. Suppose now that the first nonzero coefficient of $\xi \in W_0^{1,2}(\Omega) \cap C^2(\Omega)$ is $\alpha_{k_0}$ for some $k_0 > 1$, that is,
$$\xi = \sum_{k=k_0}^\infty \alpha_k e_k \tag{27}$$
with
$$\alpha_k = \langle \xi, e_k\rangle, \quad k = k_0, k_0+1, \ldots.$$
Then
$$\phi_n = \sum_{k=k_0}^\infty \frac{\alpha_k}{\lambda_k^n} e_k,$$
which we write in the form
$$\phi_n = \frac{1}{\lambda_{k_0}^n}(\alpha_{k_0} e_{k_0} + \psi_n), \tag{28}$$
where now
$$\psi_n := \sum_{k=k_0+1}^\infty \left(\frac{\lambda_{k_0}}{\lambda_k}\right)^n \alpha_k e_k.$$
Following the same steps as in the previous section, we can conclude that $\psi_n \to 0$ in $L^2(\Omega)$ and that
$$\lambda_{k_0} = \lim\frac{\|\phi_n\|_2}{\|\phi_{n+1}\|_2}.$$
Moreover, choosing $\xi$ sufficiently regular so that the series of the coefficients
$$M := \sum_{k=1}^\infty |\alpha_k|$$
converges, we also have
$$\|\psi_n\|_\infty = \left\|\sum_{k=k_0+1}^\infty \left(\frac{\lambda_{k_0}}{\lambda_k}\right)^n \alpha_k e_k\right\|_\infty \le M\left(\frac{\lambda_{k_0}}{\lambda_{k_0+1}}\right)^n,$$
that is, $\psi_n \to 0$ uniformly in $\Omega$, which implies as above that $\frac{\phi_n}{\phi_{n+1}}$ converges uniformly to the constant function $\lambda_{k_0}$ in compact subsets $K \subset\subset \Omega \cap \operatorname{supp}(e_{k_0})$.
7. Numerical results
In this section we present some of the numerical results which we were able to compute for some domains. We compare them with results obtained elsewhere. Computations were performed on a Windows XP/Pentium 4 – 2.8 GHz platform, using the GCC compiler.

Table 1
First eigenvalue for the p-Laplacian on the unit ball.
p      N = 2      N = 3      N = 4
1.1    2.5694     3.8728     5.1871
1.2    2.9656     4.5151     6.1020
1.3    3.3263     5.1283     7.0064
1.4    3.6741     5.7431     7.9390
1.5    4.0180     6.3717     8.9154
1.6    4.3624     7.0201     9.9443
1.7    4.7098     7.6920     11.0314
1.8    5.0619     8.3898     12.1810
1.9    5.4195     9.1153     13.3969
2.0    5.7835     9.8698     14.6822
2.1    6.1543     10.6545    16.0400
2.2    6.5321     11.4701    17.4730
2.3    6.9174     12.3177    18.9841
2.4    7.3103     13.1979    20.5759
2.5    7.7108     14.1115    22.2510
2.6    8.1192     15.0590    24.0121
2.7    8.5355     16.0412    25.8617
2.8    8.9598     17.0586    27.8027
2.9    9.3921     18.1117    29.8374
3.0    9.8324     19.2013    31.9687
3.1    10.2809    20.3278    34.1991
3.2    10.7375    21.4917    36.5314
3.3    11.2022    22.6937    38.9681
3.4    11.6751    23.9341    41.5120
3.5    12.1561    25.2136    44.1659
3.6    12.6453    26.5327    46.9325
3.7    13.1427    27.8919    49.8144
3.8    13.6482    29.2916    52.8146
3.9    14.1619    30.7325    55.9359
4.0    14.6838    32.2150    59.1810

Fig. 1. Graphs of p (1 < p ≤ 4) versus values of γ10, ν10 and Γ10 for the N-dimensional unit ball and N = 2 (left), N = 3 (center), N = 4 (right).
7.1. The unit ball In order to compute the value of the first eigenvalue for the p-Laplacian in the unit ball, we mixed the composite Simpson and trapezoidal methods for computation of the associated integrals in the expression of νn . In Table 1, the results for the first eigenvalue of the p-Laplacian for values of p ranging from 1.1 to 4.0 for balls of dimensions N = 2, 3, 4, are displayed, truncated at the fourth decimal place, after 10 iterations. The results are also visually displayed in Fig. 1. For comparison, the known value of the first eigenvalue for the Laplacian on the unit bidimensional ball is 5.7832, which means that our error should be about 0.04%. This result compares well with the one obtained in [8], where a 1.3% precision was attained. 7.2. The unit square In order to solve the p-Laplacian in the unit square [0, 1] × [0, 1] we used the algorithm proposed in [3], coupled with the homotopy perturbation method (HPM) of [6] for the exact line searches in the nonlinear conjugate gradient method. In Table 2 we see the values for the first eigenvalue of the p-Laplacian for values of p ranging from p = 2 to p = 3,
R.J. Biezuner et al. / Journal of Functional Analysis 257 (2009) 243–270
269
Table 2
First eigenvalue for the p-Laplacian on the unit square.

p     $\gamma_5$   $\nu_5$      $\Gamma_5$
2.0   19.7145      19.7348      19.9270
2.1   22.3239      22.3460      22.4447
2.2   25.2168      25.2412      25.3343
2.3   28.2413      28.4495      28.6139
2.4   31.9750      32.0024      32.5685
2.5   35.5746      35.9344      37.6961
2.6   38.5547      40.2827      40.8167
2.7   41.4917      45.0890      52.8657
2.8   5.5593       50.3972      642.6432
2.9   7.8823       56.2567      670.7254
3.0   14.6719      62.7208      205.0535
Fig. 2. Graph of $p$ ($2 \le p \le 3$) versus $\nu_5$ for the square $[0, 1] \times [0, 1]$.
with 0.1 increment, truncated at the fourth decimal place, after 5 iterations, for all three sequences. We see that the sequence $(\nu_n)$ has a much faster rate of convergence and less numerical error, especially as the value of $p$ increases. For comparison, the known value of the first eigenvalue for the Laplacian on the unit square is $2\pi^2 = 19.7392$, which means that the error in $\nu_5$ is about 0.02%. Fig. 2 displays the results for $\nu_5$. This result with only 5 iterations compares very favourably with the one obtained in [8], where a 3% precision was attained. It must be remarked that better and faster results should be obtainable with more precise and faster methods for solving the p-Laplacian equation in each iteration.
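The relative errors quoted for the unit square can be checked with a few lines of arithmetic. The following is a minimal sketch, not part of the original computation: it compares the $p = 2$ entries of Table 2 with the exact value $2\pi^2$. The helper name `relative_error` and the dictionary keys are illustrative assumptions.

```python
import math

def relative_error(approx: float, exact: float) -> float:
    """Relative error |approx - exact| / |exact|."""
    return abs(approx - exact) / abs(exact)

# Exact first Dirichlet eigenvalue of the Laplacian on the unit square.
exact = 2 * math.pi ** 2

# p = 2 row of Table 2 (values after 5 iterations, truncated to 4 decimals).
table2_p2 = {"gamma_5": 19.7145, "nu_5": 19.7348, "Gamma_5": 19.9270}

# Relative error of each of the three sequences at p = 2.
errors = {name: relative_error(value, exact) for name, value in table2_p2.items()}

if __name__ == "__main__":
    for name, err in errors.items():
        print(f"{name}: relative error = {100 * err:.3f}%")
```

At $p = 2$ the middle sequence has the smallest relative error of the three, which agrees with the remark that $(\nu_n)$ carries less numerical error.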
Acknowledgment

The second author thanks the support of FAPEMIG.

References

[1] W. Allegretto, Y.X. Huang, A Picone's identity for the p-Laplacian and applications, Nonlinear Anal. 32 (1998) 819–830.
[2] G.D. Anderson, M.K. Vamanamurthy, M. Vuorinen, Inequalities for quasiconformal mappings in space, Pacific J. Math. 160 (1993) 1–18.
[3] B. Andreianov, F. Boyer, F. Hubert, Finite volume schemes for the p-Laplacian on Cartesian meshes, ESAIM Math. Model. Numer. Anal. 38 (6) (2004) 931–959.
[4] L. Damascelli, Comparison theorems for some quasilinear degenerate elliptic operators and applications to symmetry and monotonicity results, Ann. Sc. Norm. Super. Pisa Cl. Sci. 26 (4) (1998) 689–707.
[5] J.W. Demmel, Applied Numerical Linear Algebra, SIAM, 1997.
[6] X. Feng, Y. He, High order iterative methods without derivatives for solving nonlinear equations, Appl. Math. Comput. 186 (2007) 1617–1623.
[7] B. Kawohl, V. Fridman, Isoperimetric estimates for the first eigenvalue of the p-Laplace operator and the Cheeger constant, Comment. Math. Univ. Carolin. 44 (2003) 659–667.
[8] L. Lefton, D. Wei, Numerical approximation of the first eigenpair of the p-Laplacian using finite elements and the penalty method, Numer. Funct. Anal. Optim. 18 (3–4) (1997) 389–399.
[9] D.S. Watkins, Fundamentals of Matrix Computations, second ed., John Wiley & Sons, 2002.
Journal of Functional Analysis 257 (2009) 271–331 www.elsevier.com/locate/jfa
Isomorphic copies in the lattice $E$ and its symmetrization $E^{(*)}$ with applications to Orlicz–Lorentz spaces

Anna Kamińska a, Yves Raynaud b,∗

a Department of Mathematical Sciences, The University of Memphis, Memphis, TN 38152, United States
b Institut de Mathématiques de Jussieu (UMR 7586 CNRS), Case 186, UPMC University Paris 06, F-75005 Paris, France

Received 30 October 2008; accepted 25 February 2009
Available online 9 April 2009
Communicated by N. Kalton
Abstract

The paper is devoted to the isomorphic structure of symmetrizations of quasi-Banach ideal function or sequence lattices. The symmetrization $E^{(*)}$ of a quasi-Banach ideal lattice $E$ of measurable functions on $I = (0, a)$, $0 < a \le \infty$, or $I = \mathbb{N}$, consists of all functions with decreasing rearrangement belonging to $E$. For an order continuous $E$ we show that every subsymmetric basic sequence in $E^{(*)}$ which converges to zero in measure is equivalent to another one in the cone of positive decreasing elements in $E$, and conversely. Among several consequences we show that, provided $E$ is order continuous with Fatou property, $E^{(*)}$ contains an order isomorphic copy of $\ell_p$ if and only if either $E$ contains a normalized $\ell_p$-basic sequence which converges to zero in measure, or $E^{(*)}$ contains the function $t^{-1/p}$. We apply these results to the family of two-weighted Orlicz–Lorentz spaces $\Lambda_{\varphi,w,v}(I)$ defined on $I = \mathbb{N}$ or $I = (0, a)$, $0 < a \le \infty$. This family contains usual Orlicz–Lorentz spaces $\Lambda_{\varphi,w}(I)$ when $v \equiv 1$ and Orlicz–Marcinkiewicz spaces $M_{\varphi,w}(I)$ when $v = 1/w$. We show that for a large class of weights $w$, $v$, it is equivalent for the space $\Lambda_{\varphi,w,v}(0, 1)$, and for the non-weighted Orlicz space $L_\varphi(0, 1)$ to contain a given sequential Orlicz space $h_\psi$ isomorphically as a sublattice in their respective order continuous parts. We provide a complete characterization of order isomorphic copies of $\ell_p$ in these spaces over $(0, 1)$ or $\mathbb{N}$ exclusively in terms of the indices of $\varphi$. If $I = (0, \infty)$ we show that the set of exponents $p$ for which $\ell_p$ lattice embeds in the order continuous part of $\Lambda_{\varphi,w,v}(I)$ is the union of three intervals determined respectively by the indices of $\varphi$ and by the condition that the function $t^{-1/p}$ belongs to the space.
* Corresponding author.
E-mail addresses: [email protected] (A. Kamińska), [email protected] (Y. Raynaud).
0022-1236/$ – see front matter © 2009 Elsevier Inc. All rights reserved. doi:10.1016/j.jfa.2009.02.016
A. Kami´nska, Y. Raynaud / Journal of Functional Analysis 257 (2009) 271–331
© 2009 Elsevier Inc. All rights reserved. Keywords: Quasi-Banach lattice of measurable functions; Symmetrization; Rearrangement invariant spaces; Orlicz–Lorentz spaces; Isomorphic lattice structure
We present a comprehensive study of (order) isomorphic copies of $\ell_p$ for $0 < p \le \infty$ as well as $c_0$ in symmetrized quasi-Banach ideal (function) lattices. Given a function lattice $E$ over $I = \mathbb{N}$ or $I = (0, a)$, $0 < a \le \infty$, and the quasi-norm $\|\cdot\|_E$ in $E$, define its symmetrization $E^{(*)}$ as the set of all elements $x$ in $E$ such that their decreasing rearrangements $x^*$ belong to $E$, and let $\|x\|_{E^{(*)}} = \|x^*\|_E$. The continuity of the dilation operator in $E$ guarantees that the latter functional is also a quasi-norm. There are a number of important examples of symmetric spaces that are defined by this procedure. The most important are all sorts of Lorentz spaces, including classical $L_{p,q}$, $0 < p, q < \infty$, $\Lambda_{p,w}$ with a positive weight $w$ and $0 < p < \infty$, Orlicz–Lorentz spaces $\Lambda_{\varphi,w}$ with $\varphi$ being an Orlicz function, and the Marcinkiewicz spaces $M_{p,w}$ or more general Orlicz–Marcinkiewicz spaces $M_{\varphi,w}$, being in general dual spaces to Lorentz spaces. While studying the problem of isomorphic copies in different spaces, including in particular Orlicz–Lorentz spaces [23,24], we realized that some general methods could be applied to all spaces of the type $E^{(*)}$. We present here a general approach to investigate asymptotic relationships between certain types of basic sequences in $E$ and $E^{(*)}$. For the cases we have in mind, the study of basic disjoint sequences is far easier in $E$ than in $E^{(*)}$. This allows us to apply these schemes for different spaces $E$ without repeating similar reasonings.

The paper is organized as follows. The first section contains preliminaries such as basic notions, notations as well as basic facts. For instance we show that the continuity of the dilation operator on a complete quasi-normed space $E$ is necessary and sufficient for the symmetrization $E^{(*)}$ to be a quasi-normed complete space. In Section 2 we discuss a number of relationships among basic sequences in the spaces $E$ and $E^{(*)}$.
Let us denote by $E_0^{(*)}$ the order continuous part of $E^{(*)}$ and by $E_b$ the closure in $E$ of the linear subspace of bounded elements with supports in finite intervals. We suppose that $E_0^{(*)} \ne \{0\}$. We prove first a key result on equivalence between a finite or infinite disjoint sequence in $E^{(*)}$ consisting of bounded elements supported on sets of finite measure, and some sequence of suitable translations of their rearrangements in $E$. The equivalence constant is controlled in a precise way by a certain function of the $E^{(*)}$-norm and $L_\infty$-norm of the elements, and of the amplitudes of the shifts. This technical result leads to the main result of the section, which states that every semi-normalized sequence in $E^{(*)}$ which converges to zero in measure ("$L_0$-null" in short) has a basic subsequence equivalent to an $L_0$-null semi-normalized sequence in $E_b$, consisting of non-negative decreasing elements. Conversely such a sequence in $E_b$ has a basic subsequence equivalent to a semi-normalized $L_0$-null sequence in $E_0^{(*)}$. An immediate consequence of this result is that every subsymmetric semi-normalized $L_0$-null basic sequence in $E^{(*)}$ is equivalent to a (subsymmetric) semi-normalized $L_0$-null basic sequence in $E_b$, consisting of non-negative decreasing elements, and conversely. Note that by standard disjointification arguments, every semi-normalized $L_0$-null sequence in $E_0^{(*)}$ has a basic subsequence close to and equivalent to a disjoint $L_0$-null sequence, and we show that the same happens in $E_b$ for sequences of decreasing non-negative elements. Thus
every subsymmetric, $L_0$-null, semi-normalized and disjoint sequence in $E_0^{(*)}$ is equivalent to a sequence in $E_b$ with the same properties, but now the converse may be false. These results are the main tools for further consideration of isomorphic structure of the spaces. In fact, this method often shifts the complicated constructions from the space $E^{(*)}$ to the space $E$, which makes them considerably easier since it does not require dealing with decreasing rearrangements.

In Section 3 we state general necessary and sufficient conditions for the existence of subspaces or sublattices isomorphic to $\ell_p$, $0 < p < \infty$, or $c_0$, in order continuous rearrangement invariant quasi-Banach spaces. The general form of these conditions is that either the r.i. space $X$ contains a normalized $L_0$-null basic sequence equivalent to the unit basis of $\ell_p$ (or $c_0$), or $X$ contains certain peculiar decreasing functions. This kind of criterion is well adapted to the case where $X$ is the order continuous part $E_0^{(*)}$ of the symmetrization of an ideal function space $E$. In fact by the results of Section 2, these conditions transfer to analogous conditions on the cone of decreasing non-negative elements in $E$. The outline of the first criterion on $\ell_p$-subspaces is that an order continuous rearrangement invariant space $X$ contains $\ell_p$ isomorphically as subspace if and only if either it contains $\ell_p$ isomorphically as sublattice, or $p \le 2$ and the "Fatou closure" of its restriction to $[0, 1]$ contains $p$-stable symmetric random variables, which can be replaced by the function $t^{-1/p}$ if $p < 2$. This is an extension to the case of quasi-Banach spaces of a known result presented in [36]. We only give a sketch of its proof, attaching a result on $\ell_p$ subspaces in $L_r$, $0 < r \le 1$, in the spirit of Aldous [1] and Dacunha-Castelle and Krivine [8], in the appendix in the last section. We then analyze the existence of order isomorphic copies of $\ell_p$ or $c_0$.
In the case of sequence spaces (different from $c_0$) or of spaces over finite intervals, the existence of a sublattice isomorphic to $\ell_p$ or $c_0$ implies the existence of an $L_0$-null normalized basic sequence, which spans the sublattice and is equivalent to the $\ell_p$ or $c_0$ unit basis. In the spaces over $(0, \infty)$, there are two fundamental types of $\ell_p$ or $c_0$ isomorphic sublattices. The first ones are spanned by a normalized basic sequence converging to zero in measure, the second ones are spanned by a disjoint basic sequence of elements, the rearrangements of which converge pointwise to the function $t^{-1/p}$. In fact, if an $\ell_p$ or $c_0$ isomorphic sublattice exists, there must exist a sublattice of one of these two types. The existence of the second type of $\ell_p$-sublattice is equivalent to the fact that the order continuous part $L^0_{p,\infty}$ of the Marcinkiewicz space $L_{p,\infty}$ is included in $X$. If $X$ has Fatou property it is equivalent to the inclusion $L_{p,\infty} \subset X$, and also to the fact that $X$ contains the function $t^{-1/p}$. Our reasoning here is based on a recent paper by Hernandez, Sanchez and Semenov [13]. It needs a supplementary technical hypothesis on the r.i. space, namely that this quasi-Banach lattice $X$ is L-convex, i.e. it is $r$-convex for some $0 < r < \infty$. We do not know if this additional hypothesis can be removed. Finally, for a function space $E$ we give criteria for the containment of $\ell_p$ or $c_0$ in $E^{(*)}$ as a subspace or sublattice in terms of conditions imposed on the space $E$. They vary depending on the underlying set $I$.

The rest of the paper consists of investigations of the isomorphic structure of a new, very general class of Orlicz–Lorentz spaces, so-called two-weighted Orlicz–Lorentz spaces $\Lambda_{\varphi,w,v}(I)$. The techniques and results developed in Sections 2 and 3 find very natural applications for studying basic sequences in these spaces.
Given two positive weights $w$, $v$, and an Orlicz function $\varphi$ (not necessarily convex), we introduce in Section 4 the space $\Lambda_{\varphi,w,v}(I)$ as the set of all measurable functions $x : I \to \mathbb{R}$ such that for some $\lambda > 0$,
$$\int_I \varphi\big(\lambda x^*(t) v(t)\big)\, w(t)\, dt < \infty$$
(in the sequential case, i.e. when $I = \mathbb{N}$, this space will also be denoted by $d(w, v, \varphi)$, consistently with the traditional notation in the classical case, see [27]). Clearly, these spaces are obtained by symmetrization of certain two-weighted Orlicz spaces that belong to the more general family of Musielak–Orlicz spaces. The new family of spaces $\Lambda_{\varphi,w,v}(I)$ is very vast and contains not only all sorts of classical Lorentz and Orlicz–Lorentz function and sequence spaces but also in most cases their dual spaces, known as Marcinkiewicz or Orlicz–Marcinkiewicz spaces [15,22,37]. In fact such an Orlicz–Marcinkiewicz space $M_{\varphi,w}(I)$ is just a space $\Lambda_{\varphi,w,v}(I)$ with $v = 1/w$. This latter fact was our prime motivation for working with the family of two-weighted spaces. Many results for Marcinkiewicz spaces are obtained for free as corollaries of our main theorems. Note also that the spaces considered by Torchinsky [39], which appear naturally as real interpolation spaces [35], belong also to this class with $v$ increasing and $w(t) = 1/t$.

We provide in Section 4 a set of sufficient conditions on both weights $w$, $v$ and the Orlicz function $\varphi$ for our space to be a quasi-Banach space. We also state several lemmas and auxiliary facts that connect different properties of the weights and the function $\varphi$. Section 5 contains criteria for $\Lambda_{\varphi,w,v}(I)$ to have (order) isomorphic copies of $c_0$ or $\ell_\infty$, and some preparatory work needed further. After recalling the sets of Orlicz functions classically attached to a given Orlicz function $\varphi$, and used for studying subspaces and sublattices of different types of Orlicz spaces, we finish this section with a general theorem stating that $\Lambda_{\varphi,w,v}(I)$, independently of the sort of $I$, needs to contain a sublattice isomorphic to $\ell_p$ for some $0 < p < \infty$.
More precisely, in the order continuous part $\Lambda^0_{\varphi,w,v}(I)$ of the space, the closed linear span of any semi-normalized sequence converging in measure to zero contains a subsequence equivalent to the unit basis of some $\ell_p$, which is also equivalent to a disjoint sequence in the Orlicz space $L_\varphi(0, \infty)$. Recall that the characterization of sublattices of Orlicz spaces $L_\varphi(I)$, which are isomorphic to spaces $\ell_p$, depends strongly on the interval $I$. This was demonstrated in early papers by Lindenstrauss and Tzafriri [27,29], by Nielsen [33], and Hernandez with Rodriguez-Salinas [12], where it was also pointed out that the strongest differences occur between spaces defined over finite and infinite interval $I$. At this point several natural questions arise:

(a) Does every lattice copy of $\ell_p$ in $\Lambda^0_{\varphi,w,v}(I)$ have an order copy in $L_\varphi(0, \infty)$?
(b) Does every lattice copy of $\ell_p$ in $\Lambda^0_{\varphi,w,v}(I)$ have an order copy in $L_\varphi(I)$ for the same $I$?
(c) Conversely, is every lattice copy of $\ell_p$ in $L_\varphi(I)$ isomorphic to a sublattice of $\Lambda^0_{\varphi,w,v}(I)$?

Recall that the answers to these questions are positive in the case of $I = \mathbb{N}$ and a constant weight $v$ [24]. It turns out that when $v$ is constant the answers are still positive for a finite interval $I = (0, a)$, but when either $I = (0, \infty)$ or $v$ is not constant, the situation is far more diversified.

Section 6 is devoted to spaces over finite intervals $(0, a)$. In this case we are able to obtain a positive answer to question (b), and hence (a), under the assumption that the weight $w$ is integrable over every finite interval $(0, t)$, and to question (c) under the additional condition that the weight $v$ is increasing. Thus under all these assumptions together we provide a complete characterization of all sublattices of $\Lambda_{\varphi,w,v}(0, a)$ isomorphic to $\ell_p$, $0 < p \le \infty$ (Theorem 6.13). These conditions are expressed in terms of Matuszewska–Orlicz indices similarly as in [33].
Actually we obtain a more general result, namely that an Orlicz sequence space $h_\psi$ embeds isomorphically as sublattice into $\Lambda^0_{\varphi,w,v}(0, a)$ if and only if it does into $L^0_\varphi(0, a)$. Moreover we show that every infinite dimensional sublattice of $\Lambda_{\varphi,w,v}(0, a)$ contains an order copy of some space $\ell_p$.
In Section 7 we treat the case when the measure of $I$ is infinite, that is when either $I = \mathbb{N}$ or $I = (0, \infty)$. In either case there is a similar situation when considering normalized disjoint sequences approaching zero in the $\infty$-norm. Under certain mild conditions on the weights $v$, $w$ such sequences have a subsequence spanning an Orlicz space $h_\psi$, which is also isomorphic to a sublattice of $h_\varphi$. In the case of sequence spaces, under the same conditions on the weights, we obtain a positive answer to questions (a) and (b), and assuming in addition that $v$ is increasing, we also get a positive answer to question (c). Under these assumptions, in Theorem 7.8, we determine that $\ell_p$-spaces embed isomorphically in $d_0(w, v, \varphi)$, the order continuous part of $d(w, v, \varphi)$, if and only if they embed in $h_\varphi$. Note however that the equivalence of order embeddings of $h_\psi$ in $d_0(w, v, \varphi)$ and in $h_\varphi$ is obtained now only for a restricted class of embeddings, those which associate the $h_\psi$ basis with an $\ell_\infty$-null semi-normalized disjoint sequence. As an application we show that a reflexive Orlicz–Lorentz sequence space with decreasing weight contains a subspace isomorphic to $\ell_p$ for some $p \ge 1$ if and only if the space has a quotient space isomorphic to $\ell_p$, analogously to reflexive Orlicz spaces [27].

As we can expect in view of the general results in Section 3, the case when $I = (0, \infty)$ is more complicated. This is caused by the fact that there may exist subspaces isomorphic to $\ell_p$ that are spanned by normalized sequences of equimeasurable functions. Our main result, Theorem 7.18, states that $\ell_p$ is lattice embeddable into $\Lambda^0_{\varphi,w,v}(0, \infty)$ if and only if either it is isomorphic to a sublattice of $h_\varphi$ or $L_\varphi(0, 1)$, or the function $t^{-1/p}$ belongs to $\Lambda_{\varphi,w,v}(0, \infty)$. This determines three intervals in the positive real line, the first two depending only on the function $\varphi$, while the third one depends also on the weights. The latter condition means that $L_{p,\infty} \subset \Lambda_{\varphi,w,v}(0, \infty)$.
We also show, inspired by Carothers and Dilworth's paper [6], that under the assumption that the fundamental function of the space dominates some power function, the main theorem reduces to the first part of the alternative only. Also, under this assumption we can show more, namely that every lattice contained in the space $\Lambda^0_{\varphi,w,v}(0, \infty)$ has a sublattice isomorphic to $\ell_p$, which is also isomorphic to a sublattice either of $h_\varphi$ or of $L^0_\varphi(0, 1)$. We provide several interesting examples of Orlicz–Lorentz spaces $\Lambda_{\varphi,w}(0, \infty)$ for which the answers to questions (a), (b) or (c) are negative. All results concerning two-weighted Orlicz–Lorentz spaces apply in particular to one-weighted Orlicz–Lorentz spaces $\Lambda^0_{\varphi,w}(I)$, to classical Lorentz spaces $\Lambda_{p,w}(I)$, as well as to Orlicz–Marcinkiewicz spaces $M_{\varphi,w}(I)$.

1. Preliminaries

In the course of the article we shall deal with quasi-Banach (normed) lattices. We follow the standard terminology and notation used in Banach spaces. In particular, the notions related to Schauder bases, basic sequences or related to the theory of lattices are analogous to the corresponding notions in Banach spaces or Banach lattices which can be found in [2,3,5,25,27,28,41]. For a general theory of metric vector spaces, in particular quasi-normed or $p$-normed spaces, we refer to [17]. Recall first that given a vector space $X$ and $0 < p \le 1$, a quasi-norm $\|\cdot\|_X$ on $X$ is called a $p$-norm if it satisfies the $p$-triangle inequality, that is $\|x + y\|_X^p \le \|x\|_X^p + \|y\|_X^p$, $x, y \in X$, and a $p$-norm with constant $D$ if it fulfills the $p$-triangle inequality with constant $D$, i.e. for every finite sequence $x_i \in X$, $i = 1, \dots, n$, we have $\big\|\sum_{i=1}^n x_i\big\|_X^p \le D \sum_{i=1}^n \|x_i\|_X^p$. A complete quasi-normed space equipped with a $p$-norm is called a $p$-Banach space. Given quasi-normed spaces $(X, \|\cdot\|_X)$ and $(Y, \|\cdot\|_Y)$, we say that two basic sequences $(x_n) \subset X$ and $(y_n) \subset Y$ are equivalent if for some $\alpha, \beta > 0$ and every integer $n \ge 1$ and scalars $\lambda_1, \dots, \lambda_n$, it holds true that
$$\alpha \Big\|\sum_{i=1}^n \lambda_i x_i\Big\|_X \le \Big\|\sum_{i=1}^n \lambda_i y_i\Big\|_Y \le \beta \Big\|\sum_{i=1}^n \lambda_i x_i\Big\|_X.$$
They are $C$-equivalent, or equivalent
with constant $C$, if $\beta/\alpha \le C$. Let us also say that two basic sequences $(x_n)$ and $(y_n)$ are almost $C$-equivalent if for every $\varepsilon > 0$ there exists $m$ such that the sequences $(x_n)_{n \ge m}$ and $(y_n)_{n \ge m}$ are $(C + \varepsilon)$-equivalent. We shall use the notation $[x_n]$ for a closed linear span of $(x_n)$ in a quasi-Banach space $X$. The first result, commonly known in Banach spaces as the Principle of Small Perturbations, can be proved as in Banach spaces [3, Theorem 1.3.9], applying the $r$-triangle inequality with constant $D$ instead of the triangle inequality.

Theorem 1.1. Let $(X, \|\cdot\|)$ be a quasi-Banach space where $\|\cdot\|$ is an $r$-norm with constant $D$. Let $(x_n)$ be a basic sequence in $X$, and $(y_n)$ a sequence in $X$ such that
$$2 D^2 B^r \sum_{n=1}^\infty \frac{\|x_n - y_n\|^r}{\|x_n\|^r} < 1,$$
where $B$ is the basis constant of $(x_n)$. Then $(y_n)$ is a basic sequence in $X$ equivalent to $(x_n)$.

Remark 1.2. Like in the normed case [3, Theorem 1.3.9] this theorem can easily be given a quantitative version. In fact, setting $\theta := 2 D^2 B^r \sum_{n=1}^\infty \|x_n - y_n\|^r / \|x_n\|^r < 1$, the basic sequences $(y_n)$ and $(x_n)$ are equivalent with constant $\big(\frac{D(D+\theta)}{1-\theta}\big)^{1/r}$.

Remark 1.3. We also have that if $(x_n)$ is basic and $\sum_n \|x_n - y_n\|^r / \|x_n\|^r < \infty$ then up to suppressing a finite number of terms, $(y_n)$ is basic and almost $D^{2/r}$-equivalent to $(x_n)$. Indeed, we have $\theta_m := 2 D^2 B^r \sum_{n=m}^\infty \|x_n - y_n\|^r / \|x_n\|^r \to 0$ as $m \to \infty$. Moreover, if $\theta_m < 1$ then $(x_n)_{n \ge m}$ and $(y_n)_{n \ge m}$ are equivalent with constant $\big(\frac{D(D+\theta_m)}{1-\theta_m}\big)^{1/r}$ by the previous remark, and since $\big(\frac{D(D+\theta_m)}{1-\theta_m}\big)^{1/r} \to D^{2/r}$, the sequences $(x_n)$ and $(y_n)$ are almost $D^{2/r}$-equivalent.
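To make the quantitative statement of Remark 1.2 concrete, the following minimal sketch evaluates $\theta$ and the resulting equivalence constant for given data. The function name `perturbation_constant`, its argument layout, and the sample values in the usage note are illustrative assumptions, not taken from the paper.

```python
def perturbation_constant(D, B, r, ratios):
    """Return (theta, C) as in Remark 1.2.

    D, r   : the quasi-norm is an r-norm with constant D (0 < r <= 1, D >= 1)
    B      : basis constant of the basic sequence (x_n)
    ratios : the numbers ||x_n - y_n|| / ||x_n|| (finitely many, the rest zero)

    theta = 2 * D**2 * B**r * sum(ratio**r); if theta < 1, the sequences are
    equivalent with constant C = (D * (D + theta) / (1 - theta)) ** (1 / r).
    """
    theta = 2 * D**2 * B**r * sum(t**r for t in ratios)
    if theta >= 1:
        # The smallness hypothesis of Theorem 1.1 is not satisfied.
        raise ValueError("theta >= 1: hypothesis of Theorem 1.1 fails")
    constant = (D * (D + theta) / (1 - theta)) ** (1 / r)
    return theta, constant
```

For instance, in the normed case $D = r = 1$ with $B = 1$ and relative perturbations 0.1 and 0.05, the sketch gives $\theta = 0.3$ and the constant $1.3/0.7$. As $\theta \to 0$ the constant tends to $D^{2/r}$, which is the limit used in Remark 1.3.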
A well-known theorem by Aoki and Rolewicz [17] states that every quasi-norm is equivalent to a $p$-norm, for some $0 < p \le 1$. To avoid modifying some naturally defined quasi-norms, we shall often use a slightly different statement, that every quasi-norm is itself a $p$-norm with constant $1 \le D \le 4$, for some $p \in (0, 1]$ depending on the "modulus of concavity" of the quasi-norm (see [17, Lemma 1.1]).

Let $I = \mathbb{N}$ or $I = (0, a)$ with $0 < a \le \infty$, equipped with the counting measure if $I = \mathbb{N}$, and the Lebesgue measure if $I = (0, a)$. If $A \subset I$ is a measurable set, we denote by $|A|$ its measure. By $L_0(I)$ we denote the collection of all real-valued measurable functions on $I$. In the case when $I = \mathbb{N}$ the elements are sequences $x = (x(n))$, and in the other cases they are real-valued Lebesgue measurable functions $x$. Let $E$ denote a subspace of $L_0(I)$ which is complete with respect to a quasi-norm $\|\cdot\|_E$, and which is also a lattice and an ideal with respect to the pointwise order of elements in $L_0(I)$. Thus if $0 \le |x| \le |y|$ a.e. and $y \in E$, then $x \in E$ and $\|x\|_E \le \|y\|_E$. It is customary to assume that $E$ is order dense, i.e. there exists an element $x$ in $E$ such that $\operatorname{supp} x = I$. However, this will not be sufficient for our purpose here, and we shall suppose in fact that the indicator function of any interval $(0, m)$, $m < \infty$, belongs to $E$. It can be shown that this condition is equivalent to the existence of a decreasing element $x$ in $E$ such that $\operatorname{supp} x = I$. We shall call such a space a quasi-Banach ideal lattice or more specifically quasi-Banach function lattice if $I = (0, a)$, or quasi-Banach sequence lattice if $I = \mathbb{N}$. The Aoki–Rolewicz theorem provides a $p$-norm (with
constant $D$) on $E$ that is equivalent to the given quasi-norm and that preserves also the order [24, Remark 1.2]. Recall that a quasi-Banach lattice $(E, \|\cdot\|_E)$ has the Fatou property whenever for any $0 < x_n \uparrow x$ a.e. with $x_n \in E$ and $\sup \|x_n\|_E < \infty$ we have that $x \in E$ and $\sup \|x_n\|_E = \|x\|_E$. Recall also that $x \in E$ is called order continuous whenever for any $x_n \downarrow 0$ a.e. and $0 \le x_n \le |x|$ we have $\|x_n\|_E \downarrow 0$. Let $E_0$ be the set of all order continuous elements in $E$. Let $E_b$ be the closure of all simple measurable functions supported inside finite intervals, or equivalently, of the functions in $E$ which are bounded and supported inside finite intervals (adapting the proof of [5, Proposition 3.10]). Then $E_0$ is an order ideal and a closed subspace in $E$ such that $E_0 \subset E_b$. The subspace $E_0$ is also itself a quasi-Banach ideal lattice. In the sequence spaces case the unit vectors $e_n$ in $E$, where $e_n = (e_n(k))$ with $e_n(n) = 1$ and $e_n(k) = 0$ for $k \ne n$, form a Schauder basis of $E_0 = E_b$.

The Köthe dual $E'$ of a quasi-Banach ideal lattice $E$ over $I$ is the collection of all $x \in L_0(I)$ such that
$$\|x\|_{E'} = \sup\Big\{ \int_I |x(t) y(t)|\, dt : \|y\|_E \le 1 \Big\} < \infty.$$
Then $\|\cdot\|_{E'}$ is a norm, and $(E', \|\cdot\|_{E'})$ is a Banach ideal lattice. It need not be order dense or can even be trivial, reduced to $\{0\}$, in particular in the case when $E$ is not a normed space, e.g. if $E = L_p(I)$ for $I = (0, a)$, $0 < a \le \infty$, $0 < p < 1$. If $E'$ is order dense then $\|\cdot\|_{E''}$ is a norm and $E'' := (E')'$ is a Banach ideal lattice containing $E$. Observe that $E'$ has the Fatou property and that $(E_b)' = E'$. If $E$ is a normed space and has the Fatou property then $E'' = E$, and conversely [28, p. 30]. For more information on (quasi-) Banach function lattices we refer to [2,5,20,22,25,41].

A quasi-normed ideal lattice $E$ is called rearrangement invariant (r.i.) or symmetric [5,25] whenever $x \in E$ and $y \in L_0(I)$, and $x^* = y^*$ implies $y \in E$ and $\|y\|_E = \|x\|_E$. Recall that $x^*$ is the decreasing rearrangement of $x$ defined following [5,28] as
$$x^*(t) = \inf\{s > 0 : d_{|x|}(s) \le t\}, \quad t \ge 0,$$
where $d_x(s) = |\{t \in I : x(t) > s\}|$, $s \ge 0$, is the distribution function of $x$ (the definition of $x^*$ in [25] is slightly different). We shall also use the well-known fact that whenever $E$ is a r.i. space over $(0, a)$, then $E_0$ is non-trivial if and only if $\lim_{t \to 0} \|\chi_{(0,t)}\|_E = 0$, which in turn is equivalent to $E_0 = E_b$. If $E$ is normed and symmetric then $L_1 \cap L_\infty \subset E \subset L_1 + L_\infty$ [25, Theorem II.4.1]. Then $E'$ is also a (symmetric) Banach ideal lattice as well as $E''$. We shall always consider the situation where $E_0 \ne \{0\}$. Then $E_0' = E'$, we have the norm one embeddings $E_0 \subset E \subset E''$ and the embedding $E_0 \subset E''$ is isometric (but perhaps not the inclusion $E \subset E''$).

Given the quasi-normed ideal lattice $E$ define its symmetrization [22] as the set
$$E^{(*)} = \{x \in L_0(I) : x^* \in E\},$$
and for $x \in E^{(*)}$ let $\|x\|_{E^{(*)}} = \|x^*\|_E$. Let us present first a simple lemma which gives the conditions for $\|\cdot\|_{E^{(*)}}$ to be a quasi-norm in $E^{(*)}$, and the space $(E^{(*)}, \|\cdot\|_{E^{(*)}})$ to be complete.
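The definitions of the distribution function and of the decreasing rearrangement can be illustrated on a finite sequence with the counting measure. The sketch below is only an illustration under that assumption; the function names `distribution` and `rearrangement` are ours, and the sequence is treated as a finitely supported element of $L_0(\mathbb{N})$.

```python
def distribution(x, s):
    """d_{|x|}(s): the number of indices k with |x(k)| > s (counting measure)."""
    return sum(1 for v in x if abs(v) > s)

def rearrangement(x, t):
    """x*(t) = inf{s > 0 : d_{|x|}(s) <= t} for a finite sequence x and t >= 0."""
    # d_{|x|} is a decreasing, right-continuous step function that jumps only
    # at the values |x(k)|, so the infimum is either 0 or one of those values.
    candidates = sorted({0.0} | {abs(v) for v in x})
    for s in candidates:
        if distribution(x, s) <= t:
            return s
    # Not reached for t >= 0: at s = max|x(k)| the distribution equals 0.
    return candidates[-1]
```

Evaluated at $t = 0, 1, 2, \dots$, the function returns the values $|x(k)|$ sorted in decreasing order; for example the sequence $(3, -1, 2, 0, -5)$ gives $5, 3, 2, 1, 0$. Two sequences with the same sorted absolute values therefore have the same rearrangement, which is the equimeasurability that the symmetrization $E^{(*)}$ is built on.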
Given any sequence $x$ we define the dilation operator $\sigma_2$ as $\sigma_2 x(n) = x(\lceil n/2 \rceil)$, $n = 2, 3, \dots$, and $\sigma_2 x(1) = x(1)$, where $\lceil s \rceil$ denotes the smallest integer bigger than or equal to $s$. For any function $x$ on $(0, a)$ we set $\sigma_2 x(t) = x(t/2)$ for all $t \in (0, a)$. We adopt here the convention of using the interval notation also for subsets of natural numbers. Thus for instance the set $(a, b) \subset \mathbb{N}$ denotes $\{n \in \mathbb{N} : a < n < b\}$. Analogously we define the other intervals in $\mathbb{N}$.

Lemma 1.4. If $(E, \|\cdot\|_E)$ is a quasi-normed lattice, then $\|\cdot\|_{E^{(*)}}$ is a quasi-norm if and only if the dilation operator $\sigma_2$ is bounded on the cone of non-negative decreasing elements in $E$. If moreover $(E, \|\cdot\|_E)$ is a quasi-Banach space, then the space $(E^{(*)}, \|\cdot\|_{E^{(*)}})$ is quasi-Banach too.

Proof. Let us first assume that $\|\sigma_2 x^*\|_E \le C \|x^*\|_E$ for some $C > 0$ and all $x^* \in E$. Then in view of $(x + y)^* \le \sigma_2 x^* + \sigma_2 y^*$ we get
$$\|x + y\|_{E^{(*)}} \le \|\sigma_2 x^* + \sigma_2 y^*\|_E \le K\big(\|\sigma_2 x^*\|_E + \|\sigma_2 y^*\|_E\big) \le KC\big(\|x\|_{E^{(*)}} + \|y\|_{E^{(*)}}\big),$$
where $K > 0$ is the quasi-norm constant of $E$. For the reverse implication assume that $\|\cdot\|_{E^{(*)}}$ is a quasi-norm. Let us take any $x \in E$ which is non-negative and decreasing. If $I = \mathbb{N}$ then we can always choose $y$, $z$ in such a way that $y^* = z^* = x$ and $\sigma_2 x = y + z = (\sigma_2 x)^*$. Hence for some $C > 0$,
$$\|\sigma_2 x\|_E = \|\sigma_2 x\|_{E^{(*)}} \le C\big(\|y\|_{E^{(*)}} + \|z\|_{E^{(*)}}\big) = 2C \|x^*\|_E = 2C \|x\|_E,$$
which shows that $\sigma_2$ is bounded. The same arguments work in the case where $I$ is an interval of $\mathbb{R}_+$, provided the set of values of $x$ is finite or countable. For a general positive decreasing $x$, find a positive decreasing $x_1$ with range finite or countable such that $x_1 \le x \le 2 x_1$. Then $\|\sigma_2 x\|_E \le 2 \|\sigma_2 x_1\|_E \le 2K \|x_1\|_E \le 2K \|x\|_E$.

A quasi-normed space $(X, \|\cdot\|)$ is complete whenever for a fixed summable positive sequence $(a_n)$, for any sequence $(x_n) \subset X$ such that $\|x_n\| \le a_n$ for all $n \in \mathbb{N}$, the series $\sum_{n=1}^\infty x_n$ converges in $X$ (a version of the Riesz theorem on completeness of normed spaces [38]). Let $\|\cdot\|_{E^{(*)}}$ be a $q$-norm with constant $D$ and $E$ be complete. Take $a_n = 2^{-n/q} \|\sigma_2\|^{-n}$ and $(x_n)$ with $\|x_n\|_{E^{(*)}} \le a_n$. Then for any $m > n$,
$$\Big\|\sum_{i=n}^m \sigma_{2^i} x_i^*\Big\|_E^q \le D \sum_{i=n}^m \|\sigma_{2^i} x_i^*\|_E^q \le D \sum_{i=n}^m \|\sigma_2\|^{iq} a_i^q = D \sum_{i=n}^m 2^{-i}.$$
Thus the series $\sum_{n=1}^\infty \sigma_{2^n}(x_n^*)$ is convergent in $E$. Now by a well-known inequality [5,25],
$$\Big(\sum_{n=1}^\infty |x_n|\Big)^*(t) \le \sum_{n=1}^\infty x_n^*(t/2^n) = \sum_{n=1}^\infty \sigma_{2^n} x_n^*(t),$$
and so $\sum_{n=1}^\infty |x_n|(t) < \infty$ a.e. Thus $\sum_{n=1}^\infty x_n(t)$ converges a.e. Clearly the function $\sum_{n=1}^\infty x_n$ belongs to $E^{(*)}$ and in fact the series $\sum_{n=1}^\infty x_n$ is convergent in $E^{(*)}$. □

From now on we assume that $\|\cdot\|_{E^{(*)}}$ is a quasi-norm. It is evident that $(E^{(*)}, \|\cdot\|_{E^{(*)}})$ is a r.i. space. We shall denote shortly by $E_0^{(*)}$ the order continuous part $(E^{(*)})_0$ of $E^{(*)}$. In the sequence case the unit vector basis $(e_i)$ is a symmetric basis for $E_0^{(*)}$ and $\|x\|_{E^{(*)}} = \big\|\sum_{i=1}^\infty x^*(i) e_i\big\|_E$.
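The inequality $(x + y)^* \le \sigma_2 x^* + \sigma_2 y^*$ used in the proof of Lemma 1.4 can be checked numerically in the sequence case. The following is a minimal sketch for finite sequences of equal length; the function names are illustrative assumptions, and indices are 1-based as in the definition of $\sigma_2$.

```python
def decreasing_rearrangement(x):
    """The values |x(k)| sorted in decreasing order: x*(1), x*(2), ..."""
    return sorted((abs(v) for v in x), reverse=True)

def dilate(x):
    """sigma_2 x(n) = x(ceil(n/2)) for n = 1, ..., len(x) (1-based indices)."""
    # (n + 1) // 2 equals ceil(n / 2) for positive integers n.
    return [x[(n + 1) // 2 - 1] for n in range(1, len(x) + 1)]

def check_rearrangement_inequality(x, y):
    """Check (x + y)* <= sigma_2 x* + sigma_2 y* entrywise; needs len(x) == len(y)."""
    lhs = decreasing_rearrangement([a + b for a, b in zip(x, y)])
    sx = dilate(decreasing_rearrangement(x))
    sy = dilate(decreasing_rearrangement(y))
    return all(l <= a + b for l, a, b in zip(lhs, sx, sy))
```

The dilation cannot be dropped: for $x = (1, 0)$ and $y = (0, 1)$ we have $(x + y)^* = (1, 1)$ while $x^* + y^* = (2, 0)$, so the undilated inequality fails in the second coordinate. This is why the boundedness of $\sigma_2$ is exactly what controls the quasi-triangle inequality of $\|\cdot\|_{E^{(*)}}$.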
For $I = (0, a)$ we have $E_0^{(*)} \ne \{0\}$ if and only if $\|\chi_{(0,t]}\|_E \to 0$ when $t \to 0$, and it is equivalent that $E_0^{(*)}$ coincides with the closure of the subspace of simple functions supported on finite measure sets (or even on finite intervals) [5, Theorem 3.13]. In this case, i.e. if $\lim_{t \to 0} \|\chi_{(0,t]}\|_E = 0$, then every element $x \in E_b$ is the limit of a sequence $(x_n)$ of elements of $E$, which are bounded functions supported in intervals $(a_n, b_n)$, with $0 < a_n < b_n < \infty$. Note also that if $E$ has the Fatou property then so has $E^{(*)}$, since $0 \le x_n \uparrow x$ implies $x_n^* \uparrow x^*$ [5, Chapter 2, Proposition 1.7]. Similarly, if $E$ is order continuous and does not contain the function $\chi_{(0,\infty)}$ then $E^{(*)}$ is also order continuous. In fact, in this case for any $x \in E^{(*)}$ we have $\lim_{t \to \infty} x^*(t) = 0$, which implies that if $x \ge x_n \downarrow 0$, then $x_n^* \downarrow 0$ [25, p. 67, 12°]. In general, $E^{(*)}$ is a rearrangement invariant (quasi-) Banach space, a symmetric space in the sense of [25], but not necessarily minimal or maximal in the sense that either it coincides with its order continuous part or it has the Fatou property, respectively. For Banach r.i. spaces it is thoroughly discussed in [28] (see remarks on Köthe spaces on pages 28–30, and Definition 2.a.1 with the comments afterwards).

Recall that a quasi-normed ideal lattice $E$ on $I$ is said to be $p$-convex if for some constant $C$ the inequality
$$\Big\|\Big(\sum_{i=1}^n |f_i|^p\Big)^{1/p}\Big\| \le C \Big(\sum_{i=1}^n \|f_i\|^p\Big)^{1/p}$$
holds true for all $f_1, \dots, f_n \in E$, $n \in \mathbb{N}$. The function $f = (\sum_{i=1}^n |f_i|^p)^{1/p}$ is defined pointwise in $L_0(I)$, and since $(\sum_{i=1}^n |f_i|^p)^{1/p} \le n^{1/p} \max_{1 \le i \le n} |f_i|$, $f \in E$. If $0 < p \le 1$ and $E$ is $p$-convex, then it is $p$-normable, in view of the inequality $\sum_{i=1}^n |f_i| \le (\sum_{i=1}^n |f_i|^p)^{1/p}$, but the converse implication does not hold when $p < 1$. Given $s > 0$, the $s$-convexification of a quasi-Banach ideal function space $E$ is another quasi-Banach ideal space, defined as $E^{(s)} = \{x : |x|^s \in E\}$, with quasi-norm $\|x\|_s = \| |x|^s \|^{1/s}$. Clearly, if $E$ is $p$-convex then $E^{(s)}$ is $ps$-convex, and conversely. These two notions can be given a meaning in the more general framework of Banach lattices [28] or quasi-Banach lattices [34]. In certain developments of the present paper, it will be important to deal with L-convex quasi-Banach lattices [16], that is quasi-Banach lattices which are $p$-convex for some $p > 0$. Equivalently we can say that $E$ is L-convex if some convexification of $E$ is normable. This class of quasi-Banach lattices is convenient for "reducing to the convex case by convexification". In fact for deciding if a certain type of quasi-Banach lattice $X$, e.g. an $\ell_p$-space, embeds isomorphically as quasi-normed sublattice in a given quasi-normed function space $E$ it is sufficient to know if a suitable convexification $X^{(s)}$ of $X$ embeds in $E^{(s)}$. If $E$ is L-convex, we may take $s$ sufficiently large for $E^{(s)}$ to be normable. In this regard the following easy proposition will be of interest.

Proposition 1.5. If $E$ is an L-convex quasi-Banach ideal function space and $E^{(*)}$ is quasi-normed, then $E^{(*)}$ is also L-convex.

Proof. By Lemma 1.4, the dilation operator $\sigma_2$ is bounded on the cone of non-negative decreasing functions in $E$. Set $\|\sigma_2\| = 2^r$, for some $r \ge 0$ and let $q > r$ be sufficiently large for $E^{(q)}$ to
be 1-convex. We shall prove that the Calderón operator
$$Sf(t) = \frac{1}{t}\int_0^t f(s)\,ds, \qquad t \in I,$$
is defined and bounded on the cone of non-negative decreasing functions in $E^{(q)}$. Denote by $(\sigma_2)^k$ the composition of $\sigma_2$ $k$-times. Since for a decreasing function $f$ we have
$$Sf(t) = \sum_{k=0}^\infty \frac{1}{t}\int_{2^{-k-1}t}^{2^{-k}t} f(s)\,ds \le \sum_{k=0}^\infty 2^{-k-1} f\big(2^{-k-1}t\big) = \sum_{k=0}^\infty 2^{-k-1}(\sigma_2)^{k+1} f(t),$$
it is sufficient to prove that the discretized Calderón operator $S_2 := \sum_{k=1}^\infty 2^{-k}(\sigma_2)^k$ is bounded on non-negative decreasing functions in $E^{(q)}$. In fact
$$\|\sigma_2 f\|_{E^{(q)}} = \|(\sigma_2 f)^q\|_E^{1/q} = \|\sigma_2 f^q\|_E^{1/q} \le \big(2^r \|f^q\|_E\big)^{1/q} = 2^{r/q}\|f\|_{E^{(q)}}.$$
Now in view of 1-convexity of $E^{(q)}$ we have for some $C > 0$ that
$$\|S_2 f\|_{E^{(q)}} \le C \sum_{k=1}^\infty 2^{-k}\, 2^{kr/q}\, \|f\|_{E^{(q)}} = D\|f\|_{E^{(q)}},$$
where $D = C\sum_{k=1}^\infty 2^{-k(1 - r/q)} < \infty$.
Finally $N(f) := \|f^{**}\|_{E^{(q)}}$, where as usual $f^{**} = S(f^*)$, defines a 1-convex quasi-norm on $(E^{(q)})^{(*)} = (E^{(*)})^{(q)}$. This follows from the fact that the map $f \mapsto f^{**}$ is subadditive. Thus $(E^{(*)})^{(q)}$ is 1-convex, so $E^{(*)}$ is $1/q$-convex, and thus it is L-convex. □
2. Basic sequences in quasi-Banach lattices and their symmetrizations
In this section we assume that $E$ is a $p$-Banach lattice for some $0 < p \le 1$, such that the dilation operator $\sigma_2$ is bounded on the cone of non-negative decreasing elements in $E$. By Lemma 1.4, $\|\cdot\|_{E^{(*)}}$ is a quasi-norm, and so by [17, Lemma 1.1] there exist $0 < q \le 1$ and a constant $1 \le D \le 4$, such that $\|\cdot\|_{E^{(*)}}$ is a $q$-norm with constant $D$. Hence
$$\Big\|\sum_{i=1}^n y_i\Big\|_{E^{(*)}}^q \le D \sum_{i=1}^n \|y_i\|_{E^{(*)}}^q \qquad \text{for all } y_i \in E^{(*)},\ n \in \mathbb{N}. \tag{2.1}$$
Observe that setting $r = \min(p, q)$, the $p$-norm $\|\cdot\|_E$ is also an $r$-norm, while $\|\cdot\|_{E^{(*)}}$ is an $r$-norm with constant $D$. This follows from the well-known inequality $(\sum_n |a_n|^\alpha)^{1/\alpha} \le (\sum_n |a_n|^\beta)^{1/\beta}$ for any $0 < \beta \le \alpha < \infty$. For the rest of this section, the letters $p$, $q$, $r$ and $D$ have the meaning given above.
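As a quick numerical sanity check of the inequality $(\sum_n |a_n|^\alpha)^{1/\alpha} \le (\sum_n |a_n|^\beta)^{1/\beta}$ for $0 < \beta \le \alpha$, the following Python sketch evaluates both sides on a sample sequence. The helper names `lp_quasi_norm` and `is_monotone_in_exponent`, the tolerance, and the sample values are illustrative assumptions and do not come from the paper.

```python
def lp_quasi_norm(a, alpha):
    """Return (sum_n |a_n|^alpha)^(1/alpha) for a finite sequence a."""
    return sum(abs(t) ** alpha for t in a) ** (1.0 / alpha)


def is_monotone_in_exponent(a, beta, alpha, tol=1e-12):
    """Check (sum |a_n|^alpha)^(1/alpha) <= (sum |a_n|^beta)^(1/beta) for 0 < beta <= alpha."""
    assert 0 < beta <= alpha
    # A small relative tolerance absorbs floating-point rounding.
    return lp_quasi_norm(a, alpha) <= lp_quasi_norm(a, beta) * (1 + tol)


# Hypothetical sample data; any finite real sequence would do.
sample = [0.3, -1.2, 0.7, 2.0]
exponent_pairs = [(0.25, 0.5), (0.5, 1.0), (0.5, 0.5), (1.0, 3.0)]  # (beta, alpha)
checks = [is_monotone_in_exponent(sample, b, a) for (b, a) in exponent_pairs]
```

Applied to $a_n = \|y_n\|$, this is the step that turns a $p$-norm estimate $\|\sum y_n\| \le (\sum \|y_n\|^p)^{1/p}$ into the corresponding $r$-norm estimate for $r \le p$.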
Notation 1. For any $m \in \mathbb{N}$ and $x = (x(n))$, let $S_m x$ be the right shift of $x$ by $m$, that is $(S_m x)(n) = 0$ for $n = 1, \dots, m$, and $(S_m x)(n) = x(n - m)$ for $n > m$. Similarly we define the right shift $S_b x$, $0 \le b < a$, when $x$ is defined on the interval $(0, a)$, that is $S_b x(t) = 0$ if $0 \le t \le b$ and $S_b x(t) = x(t - b)$ if $b < t < a$.
Lemma 2.1. Let $(x_i)_{i=1}^n$ be a sequence of disjoint bounded elements in $E^{(*)}$ supported on sets of finite measure. For every $i = 1, \dots, n$, let $m_i$ be a finite real number with $m_i \ge |\operatorname{supp} x_i|$ and set $M_i = \sum_{j<i} m_j$ for $i > 1$ and $M_1 = 0$. Assume $\sum_{i=1}^n m_i \le a$. Then
$$\Big\|\sum_{i=1}^n x_i\Big\|_{E^{(*)}}^p \le \Big\|\sum_{i=1}^n S_{M_i} x_i^*\Big\|_E^p + \sum_{i=1}^n \|x_i\|_\infty^p \|\chi_{(0,M_i]}\|_E^p,$$
$$\Big\|\sum_{i=1}^n S_{M_i} x_i^*\Big\|_E^q \le D\Big\|\sum_{i=1}^n x_i\Big\|_{E^{(*)}}^q + D\sum_{i=1}^n \|x_i\|_\infty^q \|\chi_{(0,M_i]}\|_E^q,$$
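with the convention that $\chi_{(0,M_1]} = \chi_\emptyset = 0$.
Proof. We may assume without loss of generality that $x_i \ge 0$ for all $i = 1, \dots, n$. Let $\varepsilon_i = \sum_{j>i} \|x_j\|_\infty$ with $\varepsilon_n = 0$, and $z_i = x_i + \varepsilon_i \chi_{A_i}$ where $A_i \supset \operatorname{supp} x_i$, $|A_i| = m_i$, $i = 1, \dots, n$, and the sets $A_i$ are disjoint. Note that, setting $\varepsilon_0 = \sum_{j=1}^n \|x_j\|_\infty$,
$$\|z_i\|_\infty \le \|x_i\|_\infty + \varepsilon_i = \sum_{j \ge i} \|x_j\|_\infty = \varepsilon_{i-1},$$
and so for $i = 1, \dots, n$,
$$\varepsilon_i \ge \max_{j>i} \|z_j\|_\infty = \Big\|\sum_{j>i} z_j\Big\|_\infty.$$
It follows that
$$\Big(\sum_{i=1}^n z_i\Big)^* = \sum_{i=1}^n S_{M_i} z_i^*.$$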
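Clearly we have $z_i^* = x_i^* + \varepsilon_i \chi_{(0,m_i]}$. Since $0 \le x_i \le z_i$ we deduce that
$$\Big(\sum_{i=1}^n x_i\Big)^* \le \Big(\sum_{i=1}^n z_i\Big)^* = \sum_{i=1}^n S_{M_i} x_i^* + \sum_{i=1}^n \varepsilon_i S_{M_i} \chi_{(0,m_i]}.$$
On the other hand
$$\sum_{i=1}^n \varepsilon_i S_{M_i}\chi_{(0,m_i]} = \sum_{i=1}^n \sum_{j>i} \|x_j\|_\infty S_{M_i}\chi_{(0,m_i]} = \sum_{j=1}^n \|x_j\|_\infty \sum_{i<j} S_{M_i}\chi_{(0,m_i]} = \sum_{j=1}^n \|x_j\|_\infty \chi_{(0,M_j]}.$$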
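Hence
$$\Big\|\sum_{i=1}^n x_i\Big\|_{E^{(*)}}^p \le \Big\|\sum_{i=1}^n S_{M_i} x_i^* + \sum_{i=1}^n \|x_i\|_\infty \chi_{(0,M_i]}\Big\|_E^p \le \Big\|\sum_{i=1}^n S_{M_i} x_i^*\Big\|_E^p + \sum_{i=1}^n \|x_i\|_\infty^p \|\chi_{(0,M_i]}\|_E^p.$$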
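For the second inequality, using the fact that $x_i^* \le z_i^*$ we have
$$\Big\|\sum_{i=1}^n S_{M_i} x_i^*\Big\|_E \le \Big\|\sum_{i=1}^n S_{M_i} z_i^*\Big\|_E = \Big\|\sum_{i=1}^n z_i\Big\|_{E^{(*)}} = \Big\|\sum_{i=1}^n x_i + \sum_{i=1}^n \varepsilon_i \chi_{A_i}\Big\|_{E^{(*)}}.$$
But
$$\sum_{i=1}^n \varepsilon_i \chi_{A_i} = \sum_{i=1}^n \sum_{j>i} \|x_j\|_\infty \chi_{A_i} = \sum_{j=1}^n \|x_j\|_\infty \sum_{i<j} \chi_{A_i}.$$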
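Thus
$$\Big\|\sum_{i=1}^n S_{M_i} x_i^*\Big\|_E^q \le \Big\|\sum_{i=1}^n x_i + \sum_{i=1}^n \|x_i\|_\infty \sum_{j<i} \chi_{A_j}\Big\|_{E^{(*)}}^q \le D\Big\|\sum_{i=1}^n x_i\Big\|_{E^{(*)}}^q + D\sum_{i=1}^n \|x_i\|_\infty^q \Big\|\sum_{j<i}\chi_{A_j}\Big\|_{E^{(*)}}^q$$
$$= D\Big\|\sum_{i=1}^n x_i\Big\|_{E^{(*)}}^q + D\sum_{i=1}^n \|x_i\|_\infty^q \|\chi_{(0,M_i]}\|_E^q,$$
which ends the proof of the lemma. □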
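The key identity $(\sum_i z_i)^* = \sum_i S_{M_i} z_i^*$ in the proof of Lemma 2.1 can be illustrated in a discrete toy setting, with finite sequences standing in for functions and counting measure for Lebesgue measure. The Python sketch below is only an illustration under these assumptions; the function names and the sample data are hypothetical and not taken from the paper.

```python
def rearrange(x):
    """Decreasing rearrangement of a finite sequence (values sorted downwards)."""
    return sorted((abs(t) for t in x), reverse=True)


def shift_right(x, m, length):
    """Right shift S_m x, zero-padded or truncated to the given length."""
    return ([0.0] * m + list(x) + [0.0] * length)[:length]


def lemma_identity(xs, supports, length):
    """Build z_i = x_i + eps_i * chi_{A_i} and return ((sum z_i)^*, sum S_{M_i} z_i^*).

    xs       : nonnegative sequences x_i with pairwise disjoint supports
    supports : pairwise disjoint index sets A_i containing supp x_i (so m_i = |A_i|)
    """
    sup_norms = [max(abs(t) for t in x) for x in xs]
    # eps_i = sum_{j > i} ||x_j||_inf, so eps for the last element is 0.
    eps = [sum(sup_norms[i + 1:]) for i in range(len(xs))]
    zs = [[x[t] + (e if t in A else 0.0) for t in range(length)]
          for x, A, e in zip(xs, supports, eps)]
    total = [sum(z[t] for z in zs) for t in range(length)]
    lhs = rearrange(total)
    rhs = [0.0] * length
    offset = 0  # M_i = sum_{j < i} m_j
    for z, A in zip(zs, supports):
        shifted = shift_right(rearrange(z), offset, length)
        rhs = [u + v for u, v in zip(rhs, shifted)]
        offset += len(A)
    return lhs, rhs


# Hypothetical example on the index set {0, ..., 5}.
x1 = [0.0, 0.0, 0.0, 0.0, 1.0, 0.5]   # supported in A_1 = {4, 5}
x2 = [3.0, 0.0, 2.0, 0.0, 0.0, 0.0]   # supported in A_2 = {0, 1, 2}
lhs, rhs = lemma_identity([x1, x2], [{4, 5}, {0, 1, 2}], 6)
```

Adding $\varepsilon_i \chi_{A_i}$ lifts each $z_i$ above all later $z_j$ on $A_i$, which is why sorting the sum reproduces the concatenation of the individual rearrangements.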
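Proposition 2.2. Let $(x_i)$ be a finite or infinite sequence of nonzero, bounded and disjoint elements in $E^{(*)}$ supported on sets of finite measure. Suppose also that for $r = \min(p, q)$,
$$K := \sum_{i \ge 1} \frac{\|x_i\|_\infty^r \,\|\chi_{(0,M_i]}\|_E^r}{\|x_i\|_{E^{(*)}}^r} < 1,$$
with $\infty > m_j \ge |\operatorname{supp} x_j|$, and either $M_i = \sum_{j<i} m_j$ for $i > 1$ and $M_1 = 0$, or $M_i = \sum_{j>i} m_j$ and $M_1 < \infty$. We also assume that $\sum_{j \ge 1} m_j \le a$. Then in both cases the basic sequences $(x_i)$ in $E^{(*)}$ and $(S_{M_i} x_i^*)$ in $E$ are equivalent with constant $\big(D\,\frac{1+K}{1-K}\big)^{1/r}$.
Proof. Suppose first that $M_i = \sum_{j<i} m_j$. Then, by Lemma 2.1 applied with the exponent $r$, for all scalars $\lambda_1, \dots, \lambda_n$ we have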
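$$\Big\|\sum_{i=1}^n \lambda_i S_{M_i} x_i^*\Big\|_E^r \ge \Big\|\sum_{i=1}^n \lambda_i x_i\Big\|_{E^{(*)}}^r - \sum_{i=1}^n \|\lambda_i x_i\|_\infty^r \|\chi_{(0,M_i]}\|_E^r \ge (1 - K)\Big\|\sum_{i=1}^n \lambda_i x_i\Big\|_{E^{(*)}}^r,$$
and also
$$\Big\|\sum_{i=1}^n \lambda_i S_{M_i} x_i^*\Big\|_E^r \le D\Big\|\sum_{i=1}^n \lambda_i x_i\Big\|_{E^{(*)}}^r + D\sum_{i=1}^n \|\lambda_i x_i\|_\infty^r \|\chi_{(0,M_i]}\|_E^r \le D(1 + K)\Big\|\sum_{i=1}^n \lambda_i x_i\Big\|_{E^{(*)}}^r.$$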
Hence the basic sequences $(x_i)$ in $E^{(*)}$ and $(S_{M_i} x_i^*)$ in $E$ are equivalent.
Let now $M_i = \sum_{j>i} m_j < \infty$. We have to prove that for every $n$, $(x_i)_{i=1}^n \subset E^{(*)}$ and $(S_{M_i} x_i^*)_{i=1}^n \subset E$ are equivalent with constants independent of $n$. Given $n$, consider $y = \chi_{\bigcup_{i>n} \operatorname{supp} x_i}$, which belongs to $E^{(*)}$ by the assumption $|\bigcup_{i>1} \operatorname{supp} x_i| < \infty$. It is sufficient to prove that $(S_{M_1} x_1^*, \dots, S_{M_n} x_n^*, y^*)$ in $E$ and $(x_1, \dots, x_n, y)$ in $E^{(*)}$ are equivalent. Set $\xi_1 = y$ and $\xi_{i+1} = x_{n-i+1}$ for $i = 1, \dots, n$. Apply then the first part of the proposition to $(\xi_1, \dots, \xi_{n+1})$ with $M'_i = \sum_{j<i} m'_j$, $m'_1 = \sum_{j>n} m_j$ and $m'_{j+1} = m_{n-j+1}$ for $j = 1, \dots, n$. We have $M'_1 = 0$ and for $i = 1, \dots, n$ we have
$$M'_{i+1} = \sum_{j \le i} m'_j = m'_1 + \sum_{j=1}^{i-1} m'_{j+1} = \sum_{j>n} m_j + \sum_{j=1}^{i-1} m_{n-j+1} = \sum_{j>n-i+1} m_j = M_{n+1-i}.$$
Hence $(S_{M'_1}\xi_1^*, \dots, S_{M'_{n+1}}\xi_{n+1}^*) = (y^*, S_{M_n} x_n^*, \dots, S_{M_1} x_1^*)$, and so $(y, x_n, \dots, x_1)$ in $E^{(*)}$ is equivalent (with constants independent of $n$) to $(y^*, S_{M_n} x_n^*, \dots, S_{M_1} x_1^*)$ in $E$. □
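The index bookkeeping in the second half of this proof, namely $M'_{i+1} = M_{n+1-i}$ for $i = 1, \dots, n$, is easy to get wrong, so a small Python sketch may help. It assumes a finite list `m` standing in for the summable sequence $(m_j)$; the function names and sample values are hypothetical.

```python
def partial_sums_before(m):
    """Return the offsets M_i = sum_{j<i} m_j for i = 1, ..., len(m)."""
    offsets, acc = [], 0
    for value in m:
        offsets.append(acc)
        acc += value
    return offsets


def reversed_offsets(m, n):
    """Compare the backward offsets M_i with the forward offsets M'_i of the reindexed sequence.

    m is a finite list (m_1, m_2, ...), stored 0-based, so m[k] is m_{k+1}.
    Returns (M, M_prime) with M[i-1] = sum_{j>i} m_j for i = 1..n and
    M_prime[i-1] = M'_i for i = 1..n+1.
    """
    M = [sum(m[i:]) for i in range(1, n + 1)]
    # m'_1 = sum_{j>n} m_j and m'_{j+1} = m_{n-j+1} for j = 1..n.
    m_prime = [sum(m[n:])] + [m[n - j] for j in range(1, n + 1)]
    return M, partial_sums_before(m_prime)


# Hypothetical data: six masses, of which the first n = 4 are kept explicitly.
m = [3, 1, 4, 1, 5, 9]
n = 4
M, M_prime = reversed_offsets(m, n)
```

The list `M_prime` starts at $M'_1 = 0$, and its remaining entries reproduce the backward offsets $M_n, M_{n-1}, \dots, M_1$ in that order.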
Corollary 2.3. Let $(x_i)_{i=1}^n$ be a finite sequence of nonzero elements in $E^{(*)}$ which are bounded, supported on disjoint finite intervals $(a_i, b_i]$ and decreasing on their respective supports. Suppose also that $b_i \le a_{i+1}$, $i = 1, \dots, n - 1$, and that
$$K := \sum_{i=1}^n \frac{\|x_i\|_\infty^r \,\|\chi_{(0,a_i]}\|_E^r}{\|x_i\|_{E^{(*)}}^r} < 1.$$
Then the sequence $(x_i)_{i=1}^n$ in $E^{(*)}$ is $\big(D\,\frac{1+K}{1-K}\big)^{1/r}$-equivalent to itself in $E$.
Proof. Let $a_0 = 0$ and $x_0 = \chi_{(a_0,a_1]}$. Then for every $i = 0, \dots, n$, we have $x_i = S_{a_i} x_i^*$, and $m_i := a_{i+1} - a_i \ge b_i - a_i = |\operatorname{supp} x_i|$. Note that $a_i = \sum_{j<i} m_j$; the conclusion then follows from Proposition 2.2. □
Definition 2.4. We shall say that a sequence $(x_n)$ of elements of $E$ is a block sequence if each $x_n$ is a bounded element supported in some finite successive intervals $I_n \subset I$, i.e. for some monotone positive sequence $(a_n)$ we have either
(i) $I_n = (a_n, a_{n+1}]$ for every $n \in \mathbb{N}$ and $\bigcup_{n \ge 1} I_n = (a_1, \infty)$ (we shall then speak of a forward block sequence), or
(ii) $I_n = (a_{n+1}, a_n]$ for every $n \in \mathbb{N}$ and $\bigcup_{n \ge 1} I_n = (0, a_1]$ (backward block sequence).
A sequence $(x_n)$ with $x_n = x'_n + x''_n$, where $(x'_n)$ is a backward block sequence with supports in $(0, a_0]$ and where $(x''_n)$ is a forward block sequence with supports in $(a_0, \infty)$, will be called a composite block sequence. In the case of $I = \mathbb{N}$, the concept of forward block sequence coincides with the classical notion of block basic sequence.
The following disjointification result is quite standard. It will be applied in particular to the space $F = E_0^{(*)}$.
Proposition 2.5. Let $F$ be an order-continuous quasi-Banach function space, equipped with an $r$-norm with constant $D$. Every semi-normalized sequence $(x_i) \subset F$ which converges to zero in measure has a subsequence $(x_{i_k})$ which is almost $D^{2/r}$-equivalent to a sequence $(x'_k)$ of pairwise disjoint functions in $F$ that are bounded and supported on sets of finite measure and such that $|x'_k| \le |x_{i_k}|$ for all $k \in \mathbb{N}$. In fact we have $\sum_k \|x_{i_k} - x'_k\|_F^r < \infty$.
Proof. We may assume w.l.o.g. that $\|x_i\|_F^r \ge 2D$ for all $i$. By order continuity of $F$, for each $i$ we can find $z_i \in F$, bounded with support of bounded measure, such that $|z_i| \le |x_i|$ and $\sum_i \|x_i - z_i\|_F^r < \infty$. Since $x_i \to 0$ in measure, so does $z_i$, and there exists $\varepsilon_i \downarrow 0$ such that $|\{|z_i| > \varepsilon_i\}| < \varepsilon_i$. Letting $B_i = \{|z_i| > \varepsilon_i\}$, we may assume, up to extracting a subsequence, that $\sum_i |B_i| < \infty$. Thus $C = \bigcup_i B_i$ has finite measure and $C_\ell = \bigcup_{i \ge \ell} B_i$ decreases to $\emptyset$. Then by order continuity of $F$, for every $z \in F$ we have $\|\chi_{C_\ell} z\|_F \to 0$, $\||z| \wedge \varepsilon_i\|_F \to 0$ and
$$\||z| \wedge |z_i|\|_F^r \le D\||z| \wedge \varepsilon_i\|_F^r + D\|\chi_{C_i} z\|_F^r \to 0.$$
We can then construct by induction a subsequence $(z_{i_k})$ such that for any $\ell > k$ we have
$$\||z_{i_k}| \wedge |z_{i_\ell}|\|_F^r \le 2^{-(k+\ell)}.$$
It follows that
$$\Big\||z_{i_k}| \wedge \sum_{\ell \ne k} |z_{i_\ell}|\Big\|_F^r \le \Big\|\sum_{\ell \ne k} |z_{i_k}| \wedge |z_{i_\ell}|\Big\|_F^r \le D\sum_{\ell \ne k} \||z_{i_k}| \wedge |z_{i_\ell}|\|_F^r < D\,2^{-k}.$$
Setting $A_k = \{|z_{i_k}| > \sum_{\ell \ne k} |z_{i_\ell}|\}$, the sets $A_k$ are pairwise disjoint and
$$\|\chi_{A_k^c} z_{i_k}\|_F^r \le \Big\||z_{i_k}| \wedge \sum_{\ell \ne k} |z_{i_\ell}|\Big\|_F^r \le D\,2^{-k}.$$
Thus the elements $z'_k := \chi_{A_k} z_{i_k}$ are pairwise disjoint, $|z'_k| \le |z_{i_k}|$ and $\sum_k \|z_{i_k} - z'_k\|_F^r \le \sum_k D\,2^{-k} < \infty$. Finally, $\sum_k \|x_{i_k} - z'_k\|_F^r < \infty$ and the conclusion follows from the principle of small perturbations (Theorem 1.1 and Remark 1.3). □
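The disjointification step of this proof, cutting each function down to the set where it dominates the sum of all the others, can be tried out on a finite grid. The Python sketch below is a toy illustration only: functions are modelled as equal-length lists, and the names `dominant_sets` and `disjointify` as well as the sample data are hypothetical.

```python
def dominant_sets(zs):
    """Return A_k = {t : |z_k(t)| > sum_{l != k} |z_l(t)|} for functions given as lists."""
    length = len(zs[0])
    sets = []
    for k, z in enumerate(zs):
        A = {t for t in range(length)
             if abs(z[t]) > sum(abs(w[t]) for l, w in enumerate(zs) if l != k)}
        sets.append(A)
    return sets


def disjointify(zs):
    """Return z'_k = chi_{A_k} z_k, which are pairwise disjoint with |z'_k| <= |z_k|."""
    sets = dominant_sets(zs)
    return [[z[t] if t in A else 0.0 for t in range(len(z))]
            for z, A in zip(zs, sets)]


# Hypothetical sample: three functions on a four-point grid with small overlaps.
zs = [[5.0, 1.0, 0.2, 0.0],
      [1.0, 4.0, 0.3, 0.1],
      [0.5, 0.5, 3.0, 0.1]]
sets = dominant_sets(zs)
disjoint = disjointify(zs)
```

Two of the sets $A_k$ cannot meet: at a common point each of the two functions would have to be strictly larger in modulus than the other.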
Proposition 2.6. Assume that $E_0^{(*)} \neq \{0\}$. Then every semi-normalized sequence of non-negative decreasing elements $(y_i) \subset E_b$ with $y_i \to 0$ in measure (resp. $\|y_i\|_\infty \to 0$ or $|\operatorname{supp}(y_i)| \to 0$) has a subsequence $(y_{i_k})$ which is almost 1-equivalent to a composite (resp. forward or backward) block sequence $(y'_k)$ and such that for all $k$, $y'_k = \chi_{I_k} y_{i_k}$ where $I_k$ is the union $I'_k \cup I''_k$ of two disjoint intervals, $\max I'_k \to 0$ and $\min I''_k \to \infty$ (resp. $I'_k = \emptyset$ and $\min I''_k \to \infty$, or $\max I'_k \to 0$ and $I''_k = \emptyset$). In fact $\sum_k \|y_{i_k} - y'_k\|_E^r < \infty$.
Proof. We may assume w.l.o.g. that $\|y_i\|_E^r \ge 2$ for all $i$. Recall that $\|\cdot\|_E$ is an $r$-norm with constant $D = 1$. For every $y \in E_b$ and $\varepsilon > 0$ we can find $\eta > 0$, $\beta > 0$ such that $\|y - \chi_A y\|_E^r < \varepsilon$, where $A = [\eta, \beta]$. Thus for every $i$ we choose $z_i = \chi_{A_i} y_i$, with $A_i = [\eta_i, a_i]$, such that $\sum_i \|y_i - z_i\|_E^r < \infty$. Note that $|z_i| \le M_i \chi_{A_i}$ where $M_i = y_i(\eta_i) < \infty$. We may assume that $\eta_i \le 1 \le a_i$ and that $\eta_i \to 0$. If $I = \mathbb{N}$ we may assume that $\eta_i = 0$ for all $i$. If $\limsup \|y_i\|_\infty < \infty$ we may suppose, if necessary suppressing a finite number of $y_i$'s, that $\sup \|y_i\|_\infty < \infty$ and then take $\eta_i = 0$ for all $i$. If $\limsup \|y_i\|_\infty = \infty$ we may assume that $\|y_i\|_\infty \to \infty$. If $\limsup |\operatorname{supp} z_i| < \infty$ then $\limsup a_i =: \alpha < \infty$ and we may assume that for all $i$, $a_i = \alpha$. If $\limsup |\operatorname{supp} z_i| = \infty$ we may assume that the sequence $(a_i)$ is strictly increasing and $a_i \to \infty$.
Since $z_i$ goes to zero in measure, there exists $\varepsilon_i \to 0$ such that $|\{|z_i| > \varepsilon_i\}| < \varepsilon_i$. If $\limsup \|z_i\|_\infty = 0$ we may suppose that $|\{|z_i| > \varepsilon_i\}| = 0$, taking e.g. $\varepsilon_i = 2\|z_i\|_\infty$. The set $B_i := \{|z_i| > \varepsilon_i\}$ is an interval $(\eta_i, b_i)$, with $b_i - \eta_i \le \varepsilon_i$. Hence $b_i \to 0$. In view of the assumption that the space $E_0^{(*)}$ is not trivial we have that $\|\chi_{(0,t]}\|_E \to 0$ as $t \to 0$. Thus by induction we find a strictly increasing sequence $(i_k)$ such that, for $\ell = k + 1$,
$$\varepsilon_{i_{k+1}}^r \|\chi_{A_{i_k}}\|_E^r \le 2^{-(k+2)} \quad\text{and}\quad M_{i_k}^r \|\chi_{B_{i_{k+1}}}\|_E^r \le 2^{-(k+2)}.$$
We may also assume that the sequence $(b_{i_k})$ is either strictly decreasing or eventually equal to zero. Define the sets, for $k \in \mathbb{N}$, $C_k = (0, b_{i_{k+1}}) \cup (b_{i_k}, a_{i_{k-1}}]$. Let $u_k = \chi_{C_k} z_{i_k}$ and
$$y'_k = z_{i_k} - u_k = \chi_{(b_{i_{k+1}}, b_{i_k}]} z_{i_k} + \chi_{(a_{i_{k-1}}, a_{i_k}]} z_{i_k} = \chi_{(b_{i_{k+1}} \vee \eta_{i_k},\, b_{i_k}]} y_{i_k} + \chi_{(a_{i_{k-1}}, a_{i_k}]} y_{i_k} = \chi_{I'_k} y_{i_k} + \chi_{I''_k} y_{i_k},$$
where $I'_k = (b_{i_{k+1}} \vee \eta_{i_k}, b_{i_k}]$ and $I''_k = (a_{i_{k-1}}, a_{i_k}]$. Then $(y'_k)$ is clearly a composite block sequence. It is a forward block sequence if $b_{i_k} = 0$, in particular if $\|y_i\|_\infty \to 0$, and a backward block sequence if $a_{i_k} = \alpha$, in particular when $|\operatorname{supp}(y_i)| \to 0$. Moreover,
$$\|y_{i_k} - y'_k\|_E^r \le \|y_{i_k} - z_{i_k}\|_E^r + \|\chi_{(0,b_{i_{k+1}}]} z_{i_k}\|_E^r + \|\chi_{(b_{i_k}, a_{i_{k-1}}]} z_{i_k}\|_E^r$$
$$\le \|y_{i_k} - z_{i_k}\|_E^r + M_{i_k}^r \|\chi_{(0,b_{i_{k+1}}]}\|_E^r + \varepsilon_{i_k}^r \|\chi_{(0,a_{i_{k-1}}]}\|_E^r \le \|y_{i_k} - z_{i_k}\|_E^r + 2^{-(k+2)} + 2^{-(k+2)}.$$
Thus
$$\sum_k \|y_{i_k} - y'_k\|_E^r \le \sum_k \|y_{i_k} - z_{i_k}\|_E^r + \sum_k 2^{-(k+1)} < \infty.$$
For sufficiently large $k$, we have that $\|y_{i_k} - y'_k\|_E < 1$, and then $\|y'_k\|^r \ge \|y_{i_k}\|^r - \|y_{i_k} - y'_k\|^r \ge 1$. Applying Theorem 1.1 and the remarks afterwards with constant $D = 1$ we conclude the proof. □
Theorem 2.7.
(a) Let $(x_i) \subset E_0^{(*)}$ be a semi-normalized sequence converging to zero in measure (resp. $\|x_i\|_\infty \to 0$ or $|\operatorname{supp} x_i| \to 0$). Then there is a subsequence of $(x_i)$ which is basic in $E_0^{(*)}$ and $D^{3/r}$-equivalent to a sequence of decreasing elements in $E_b$ with the same convergence properties.
(b) Conversely, assume that $E_0^{(*)} \neq \{0\}$. Then every semi-normalized sequence $(y_i)$ of non-negative decreasing elements in $E_b$ which converges to zero in measure (resp. $\|y_i\|_\infty \to 0$ or $|\operatorname{supp} y_i| \to 0$) has a basic subsequence almost $D^{3/r}$-equivalent to a sequence of elements in $E_0^{(*)}$ with the same convergence properties.
Proof. (a) Let $(x_i) \subset E_0^{(*)}$ be a semi-normalized sequence converging to zero in measure. By Proposition 2.5 a subsequence $(x_{i_k})$ of $(x_i)$ is almost $D^{2/r}$-equivalent to a sequence $(x'_k)$ in $E_0^{(*)}$ of pairwise disjoint functions which are bounded and supported by sets of finite measure and such that $|x'_k| \le |x_{i_k}|$ for all $k$. In particular $(x'_k)$ has the same convergence properties as $(x_i)$. Thus $x'_k \to 0$ in measure and we can find a sequence $\varepsilon_k \to 0$ such that $|\{|x'_k| > \varepsilon_k\}| < \varepsilon_k$. Let $B_k = \{|x'_k| > \varepsilon_k\}$, $u_k = \chi_{B_k} x'_k$, $v_k = \chi_{B_k^c} x'_k$. Then
$$x'_k = u_k + v_k, \qquad |\operatorname{supp} u_k| \to 0 \quad\text{and}\quad \|v_k\|_\infty \to 0. \tag{2.2}$$
Moreover, all the uk ’s are disjoint from all the vk ’s, and the sequences (uk ) and (vk ) both consist of pairwise disjoint elements. Note that if xi ∞ → 0, then xk ∞ → 0, and we may choose uk = 0, and xk = vk . Similarly, if | supp xi | → 0 we may choose vk = 0 and xk = uk . Let us consider the most complicated situation where lim inf uk E (∗) > 0 and lim inf vk E (∗) > 0. Then both sequences (uk ) and (vk ) are semi-normalized, and we may assume that uk E (∗) 1 and vk E (∗) 1 for all k. Now, we construct by induction two subsequences u = uk and v = vk satisfying the hypotheses of Proposition 2.2. In a preparatory step we find a strictly increasing sequence (j ()) of natural numbers such that ∞ =1 | supp uj ( )| = μ1 < ∞. For the sake of simplifying notations, let us assume that j () = . Set for every k 1, | supp uj | and νk = μ1 + | supp vj |. μk = j k
j k
Since μk → 0, so χ(0,μk ] E (∗) → 0, and vk ∞ → 0, we can find a strictly increasing sequence (k ) of integers such that χ(0,μ1 ] rE (∗) vk1 r∞ < 2−1 and for every 2, χ(0,μk+1 ] rE (∗) uk r∞ < 2−−2
and χ(0,νk ] rE (∗) vk+1 r∞ < 2−−2 .
Observe that ∞
χ(0,μk+1 ] rE (∗) uk r∞ + χ(0,νk ] rE (∗) vk+1 r∞ + χ(0,μ1 ] rE (∗) vk1 r∞ < 1. (2.3)
=1
Set now for 1, u = uk ,
v = vk ,
M =
supp u , j
N =
∞ supp u + supp v . j
j
j =1
j >
j <
Set also for 1, g = SM u ∗
and h = SN v ∗ .
Since g , h are bounded and supported on bounded sets, they belong to Eb , and so does f = g + h . Note that h ∞ = v ||∞ → 0 and | supp g | = | supp u | → 0. Thus (f ) converges to zero in measure. Let us prove that (f ) in E is almost D 1/r -equivalent to (xk ) in E (∗) . Clearly it is sufficient to prove that for every n > m the finite sequence (un , un−1 , . . . , um , vm , vm+1 , . . . vn ) in E (∗) is D 1/r + ε(m)-equivalent to the sequence (gn , gn−1 , . . . , gm , hm , hm+1 , . . . , hn ) in E, where ε(m) does not depend on n and ε(m) → 0 when m → ∞. Since M μk+1 ,
N1 μ0
and N νk−1
for > 1,
we have by (2.3), and in view of u E (∗) 1, v E (∗) 1 that K(m) :=
∞ χ(0,M ] rE (∗) u r∞ =m
u rE (∗)
+
∞ χ(0,N ] rE (∗) v r∞ =m
v rE (∗)
< 1.
(2.4)
Then (2.4) implies by Proposition 2.2 that the sequences (u n , u n−1 , . . . , u 1 , v1 , v2 , . . . , vn ) in 1/r -equivalent, and clearly this E (∗) and (gn , gn−1 , . . . , gm , hm , h2 , . . . , hn ) in E are (D 1+K(m) 1−K(m) ) 1/r equivalence constant goes to D when m → ∞. Hence for every ε > 0 there exists 0 1 such that (f )0 and (xk )0 are (D 1/r + ε)-equivalent, and so they are almost D 1/r -equivalent. The last step now is to replace the disjoint sequence (f ) by an equivalent sequence of nonnegative decreasing elements in Eb . We simply put g = g ∞ χ(0,M ] + g ,
h = h ∞ χ(0,N ] + h
and f = g + h .
Then the functions g , h and f are decreasing and f → 0 in measure. Moreover, since g ∞ = u ∞ , h ∞ = v ∞ it follows from (2.4) that ∞ ∞ ∞ f − f r g − g r + h − h r < 1, E E E =1
=1
=1
and thus by the small perturbation principle (f ) and (f ) are almost 1-equivalent basic sequences in E. Hence (f ) is a basic sequence of decreasing elements in Eb almost D 1/r equivalent to (xk l ), which in turn is almost D 2/r -equivalent to a subsequence of (xi ). Consequently, (f ) is almost D 3/r -equivalent to a subsequence of (xi ). This ends the proof of (a) in the case where we assume that in (2.2) both sequences (uk E (∗) ) from below. If for example (uk E (∗) ) is not bounded away from 0, and (vk E (∗) ) are bounded we may assume that k uk rE (∗) < ∞. By Proposition 2.5 we have k xik − xk rE (∗) < ∞ for a subsequence (xik ) of (xi ). Then k xik −vk rE (∗) D k xik −xk rE (∗) +D k uk rE (∗) < ∞, and it follows that the sequence (xik ) is almost D 2/r -equivalent to (vk ). Then reasoning similarly as in the first part above we find that a subsequence of (vk ) is almost D 1/r -equivalent to a sequence of non-negative decreasing functions (h ) in Eb , with h ∞ → 0. Similarly, if (vk E (∗) ) is not bounded from below, then a subsequence (xik ) is almost D 2/r -equivalent to (uk ), which has itself a subsequence almost D 1/r -equivalent to a sequence of non-negative decreasing functions (g ) in Eb , with | supp g | → 0. (b) Let (yi ) be a semi-normalized sequence of non-negative decreasing elements in Eb which converges to zero in measure. By Proposition 2.6, there is a subsequence (yik ) which is almost 1-equivalent to a composite block sequence (yk ) and such that for all k, yk = χIk yik where I k is the union Ik ∪ Ik of (at most) two disjoint intervals, max Ik → 0 and min Ik → ∞, and r k yik − yk < ∞. We may assume that Ik = (ak , bk ], Ik = (ak , bk ], and that · · · ak < bk < · · · < a2 < b2 < a1 < b1 < a1 < b1 < a2 < b2 < · · · < ak < bk < · · · . Let uk = χIk yk and vk = χIk yk . If lim infk uk E = 0 then extracting a subsequence, we may r ) is bounded below by some positive number c. 
assume that k uk E < ∞ and that (vk E Consequently, k yik − vk rE /vk rE c−r k yik − yk rE + c−r k uk rE < ∞, and thus by the small perturbation principle, the sequence (yk ) is almost 1-equivalent to (vk ). In this case we may assume that uk = 0 for all k. Similarly, if lim infk vk E = 0 we can reduce the proof to the case where vk = 0 for all k. Let us now investigate the more complicated case where both sequences (uk E ), (vk E ) are bounded away from zero. We assume w.l.o.g. that uk rE 2D and vk rE 2D for every k. Since for all k, uk ∞ χ(0,ak ) + vk ∞ χ(bk ,ak ) yik − yk , we have
uk r∞ χ(0,ak ) rE
k
k
yi − y r < ∞, k k E k
vk r∞ χ(bk ,ak ) rE
yi − y r < ∞. k k E k
On the other hand, we have vk ∞ → 0 since χ(bk ,ak ) E is increasing with k, and thus bounded away from 0. Thus we may assume, extracting a subsequence if necessary, that
vk r∞ χ(0,bk ) r < ∞.
k
We thus have that
uk r∞ χ(0,ak ) rE +
k
vk r∞ χ(0,ak ) rE < ∞.
k
Then for sufficiently large k0 we have K(k0 ) :=
uk r∞ χ(0,ak ) rE +
kk0
vk r∞ χ(0,ak ) rE < 1.
(2.5)
kk0
Let u˜ k = uk ∞ χ(0,ak ) + uk and v˜k = vk ∞ χ(0,ak ) + vk . Since u˜ k , v˜k are decreasing, 2D uk rE u˜ k rE = u˜ k rE (∗) Duk r∞ χ(0,ak ) rE (∗) + Duk rE (∗) . Hence and by (2.5), for k k0 , uk rE (∗) 1. Similarly, vk rE (∗) 1 for k k0 . Thus (2.5) implies by Corollary 2.3 that for every n k0 , the sequence (un , un−1 , . . . , uk0 , vk0 , . . . vn−1 , vn ) (∗) 0 ) 1/r in E is (D 1+K(k 1−K(k0 ) ) -equivalent to itself in E . Indeed the uk , resp. vk , are decreasing on their supports, which are disjoint intervals placed in the same order. 1+K(k0 ) 1/r Thus (yk )kk0 in E (∗) is (D 1−K(k ) -equivalent to itself in E. Since K(k0 ) → 0 when 0) ∗ k0 → ∞, we have that (yk ) in E is almost D 1/r -equivalent to itself in E. Set now
zk = uk ∞ χ(0,ak ) + yk + vk ∞ χ(bk ,ak ) . Then zk is a non-negative decreasing element in E (∗) and 0 zk − yk wk := uk ∞ χ(0,ak ) + vk ∞ χ(0,ak ) . Thus wk rE (∗) = wk rE uk ∞ r χ(0,ak ) rE + vk r∞ χ(0,ak ) rE , and by (2.5), for sufficiently large k0 , zk − y
k E (∗)
kk0
kk0
wk rE (∗) < 1.
By Remark 1.3 it follows that for some $k_1$ the sequence $(z_k)_{k \ge k_1}$ in $E^{(*)}$ is basic and almost $D^{2/r}$-equivalent to $(y'_k)_{k \ge k_1}$ in $E^{(*)}$. Thus we have obtained that $(y'_k)$ in $E^{(*)}$ is almost $D^{1/r}$-equivalent to itself in $E$, and is also almost $D^{2/r}$-equivalent to $(z_k)$ in $E^{(*)}$. Putting all together, $(y_{i_k})$ in $E$ is almost $D^{3/r}$-equivalent to $(z_k)$ in $E^{(*)}$. □
As a consequence of Theorem 2.7, in view of the fact that any subsymmetric basic sequence is equivalent to any of its subsequences, we get the following corollary.
Corollary 2.8. Assume that $E_0^{(*)} \neq \{0\}$. Let $(x_n)$ be a subsymmetric basic sequence in a quasi-Banach space $X$. Then $E_0^{(*)}$ contains a basic sequence converging to zero in measure and equivalent to $(x_n)$ if and only if $E_b$ contains a basic sequence of non-negative and decreasing elements, converging to zero in measure and equivalent to $(x_n)$.
3. $\ell_p$ subspaces in rearrangement invariant quasi-Banach lattices
In the present section $F$ denotes a quasi-Banach r.i. ideal lattice over $I = \mathbb{N}$ or $I = (0, a)$, $0 < a \le \infty$. We shall characterize $\ell_p$ or $c_0$-subspaces of $F_0$ according to whether or not they are spanned by $\ell_p$-basic sequences (i.e. basic sequences that are equivalent to the $\ell_p$ or $c_0$ standard basis) converging to zero in measure. The case $I = \mathbb{N}$ is particularly simple.
Proposition 3.1. Let $I = \mathbb{N}$ and let $\ell_p$, $0 < p < \infty$ (resp. $c_0$ when $p = \infty$), embed isomorphically as a subspace in $F_0$. Then either $F_0$ contains a normalized basic sequence equivalent to the standard unit vector basis $(e_n)$ in $\ell_p$ (resp. $c_0$) converging to zero in $\ell_\infty$ norm, or $p = \infty$ and $F_0 = c_0$ as sets with equivalent (quasi-)norms.
Proof. Assume that $\|e_n\|_F = 1$ for every $n$. The sequence $(e_n)$ in $F$ forms a 1-symmetric basis of $F_0$. It is well known that if $\ell_p$ (or $c_0$ when $p = \infty$) embeds as a subspace in a quasi-Banach space with a Schauder basis, this basis has a block basis $(x_n)$ equivalent to the standard $\ell_p$-basis (resp. $c_0$-basis) (see e.g. [24, Lemma 2.1]). Let $X = [x_n]$ be the closed linear span of $(x_n)$. Observe that $F \hookrightarrow \ell_\infty$ and $F_0 \hookrightarrow c_0$ with norm one injections. Then we consider two cases. Let first the norm of $F$ and that of $\ell_\infty$ be equivalent on $X$. Then $X$ is isomorphic to a subspace of $c_0$, which implies that $p = \infty$. We have $\inf_n \|x_n\|_\infty = \delta > 0$, and so for every $n$ we can find $k(n)$ such that $|x_n(k(n))| \ge \delta/2$. Then for every finite sequence of scalars $(\lambda_n)$ we get for some $C > 0$,
$$\frac{\delta}{2} \sup_n |\lambda_n| \le \frac{\delta}{2} \Big\|\sum_n \lambda_n e_{k(n)}\Big\|_F \le \Big\|\sum_n \lambda_n x_n(k(n))\, e_{k(n)}\Big\|_F \le \Big\|\sum_n \lambda_n x_n\Big\|_F \le C \sup_n |\lambda_n|,$$
where the last inequality stems from the equivalence of the sequence $(x_n)$ with the standard $c_0$ basis. Since $(e_n)$ is 1-symmetric in $F_0$, it follows that $(e_n)$ in $F$ is $2C/\delta$-equivalent to $(e_n)$ in $c_0$, and thus $F_0 = c_0$. Assume now that the norm of $F$ and that of $\ell_\infty$ are not equivalent on $X$. Then there is a sequence $(y_k)$ in $X$ with $\|y_k\|_F = 1$ and $\|y_k\|_\infty \to 0$. Hence $(y_k)$ converges to zero coordinatewise with respect to the basis $(x_n)$ of $X$. Now by the standard "gliding hump" argument (see e.g.
[27, Proposition 1.a.12] or [3, Proposition 1.3.10]), there is a subsequence $(y_{k_n})$ of $(y_k)$ which is equivalent to a block basis of $(x_n)$ in $F$. Since $(x_n)$ is equivalent to $(e_n)$ in $\ell_p$, $(y_{k_n})$ is also equivalent to $(e_n)$ in $\ell_p$. □
Now we start to treat the case $I = (0, a)$. First we investigate when linear embeddings of $\ell_p$ or $c_0$ can be made lattice isomorphic embeddings. Before we prove the next theorem, recall some necessary notions. For any $0 < p < 2$, let $\gamma_p : [0, 1] \to \mathbb{R}$ be a symmetric $p$-stable variable ([28, p. 181]). It is well known that its distribution $d_{|\gamma_p|}(t)$ is (asymptotically) equivalent to a function $Ct^{-p}$ for $t \to \infty$, for some $C > 0$, which means here that $\lim_{t\to\infty} d_{|\gamma_p|}(t)/(Ct^{-p}) = 1$ (see [9, Theorem XVII.5.1]). For the definition of $\gamma_p$ we may assume that $C = 1$. For $p = 2$ we consider $\gamma_2 : [0, 1] \to \mathbb{R}$, a standard Gaussian variable. Then $d_{|\gamma_2|}(t)$ is equivalent to $\sqrt{2/\pi}\, t^{-1} \exp(-t^2/2)$ when $t \to \infty$ in the same sense (see [9, Lemma VII.1.2]). Consequently, $\gamma_p^*(x)$ is equivalent to $x^{-1/p}$ if $0 < p < 2$ and $\gamma_2^*(x)$ is equivalent to $\sqrt{2|\ln x|}$ when $x \to 0$. We say that the sequence of random variables $(f_n)$ converges to $f$ in distribution whenever $d_{f_n}(t) \to d_f(t)$ for every continuity point $t > 0$ of $d_f$. As usual, given $c = a \wedge 1$, by $F[0, c]$ we denote the space of all restrictions to the interval $[0, c]$ of functions from $F$. We also recall that given $0 < p < \infty$, the Marcinkiewicz space $L_{p,\infty}(I)$ over $I$ consists of all functions $x \in L_0(I)$ such that
$$\|x\|_{p,\infty} = \sup_{t \in I} t^{1/p} x^*(t) < \infty.$$
The functional $\|\cdot\|_{p,\infty}$ is a quasi-norm and $(L_{p,\infty}(I), \|\cdot\|_{p,\infty})$ is an r.i. quasi-Banach ideal lattice with the Fatou property. Let further $L_{p,\infty} := L_{p,\infty}(0, \infty)$.
Theorem 3.2. Let $I = (0, a)$, $0 < a \le \infty$. Then $\ell_p$, $0 < p < \infty$ (resp. $c_0$ when $p = \infty$), embeds isomorphically as a subspace in $F_0$ if and only if either $F_0$ contains a normalized disjoint sequence equivalent to the standard $\ell_p$ (resp. $c_0$) basis, or $p \le 2$ and the sequence of functions $h_{p,n}$, $n \ge 1$, defined by $h_{p,n} = h_p \wedge n$ and
$$h_p(t) = \chi_{(0,a\wedge 1]}(t)\, t^{-1/p} \ \text{ if } p < 2 \quad\text{and}\quad h_2(t) = \chi_{(0,a\wedge 1]}(t)\, |\ln t|^{1/2} \ \text{ if } p = 2,$$
is bounded in $F_0$. If $F$ has the Fatou property this is equivalent to the condition that $h_p \in F[0, c]$, where $c = a \wedge 1$.
Proof. We may assume that either $a = \infty$ or $a = 1$. In the case where $F$ is normable, this is a rephrasing of the statement [36, Proposition 1 and Remark 2]. According to this statement $\ell_p$ embeds linearly isomorphically in $F_0$ if and only if either $\ell_p$ embeds isomorphically as a sublattice or $1 < p < 2$ and the Köthe bidual of $F_0$ restricted to $[0, 1]$ contains a $p$-stable random variable (a Gaussian one if $p = 2$). By the fact that the decreasing rearrangement of a $p$-stable, resp. Gaussian, variable is equivalent to $h_p$ near 0 and that a function $h$ belongs to $(F_0)''[0, 1]$ if and only if the sequence $(|h| \wedge n)$ is bounded in $F_0[0, 1]$, we obtain indeed the result in this case, at least when $1 < p \le 2$. The main argument in the proof of this result in [36] is a deep classical result by Dacunha-Castelle and Krivine [8] on subspaces of $L_1$ (which can also be deduced from [1]), following which if $E$ is a subspace of $L_1[0, 1]$ isomorphic to $\ell_p$, $p > 1$, then $E$ contains a normalized sequence converging in distribution to a function $f$ which is equimeasurable with some product $w\gamma_p$, where $0 \le w \in L_1[0, 1]$, and $\gamma_p$ is a symmetric $p$-stable random variable on $[0, 1]$ which is independent from $w$ (if $p = 2$, $\gamma_2$ is a Gaussian variable).
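The Gaussian asymptotics recalled before Theorem 3.2, namely $d_{|\gamma_2|}(t) \sim \sqrt{2/\pi}\, t^{-1} e^{-t^2/2}$ as $t \to \infty$ and $\gamma_2^*(x) \sim \sqrt{2|\ln x|}$ as $x \to 0$, can be checked numerically. The Python sketch below is only a sanity check under the assumption that $\gamma_2$ is a standard Gaussian variable, so that $d_{|\gamma_2|}(t) = \operatorname{erfc}(t/\sqrt{2})$; the function names, the bisection bracket and the sample points are illustrative choices.

```python
import math


def gaussian_tail(t):
    """Distribution function d(t) = P(|g| > t) of a standard Gaussian variable g."""
    return math.erfc(t / math.sqrt(2.0))


def tail_asymptote(t):
    """Asymptotic form sqrt(2/pi) * exp(-t^2/2) / t of d(t) as t -> infinity."""
    return math.sqrt(2.0 / math.pi) * math.exp(-t * t / 2.0) / t


def gaussian_rearrangement(x, hi=40.0, iters=200):
    """Decreasing rearrangement g^*(x) = inf{t : d(t) <= x}, located by bisection."""
    lo = 0.0
    for _ in range(iters):
        mid = 0.5 * (lo + hi)
        if gaussian_tail(mid) > x:
            lo = mid   # tail still too large: move right
        else:
            hi = mid   # tail small enough: move left
    return hi


# Ratio of the exact tail to its asymptote at a few sample points.
tail_ratios = [gaussian_tail(t) / tail_asymptote(t) for t in (2.0, 5.0, 10.0)]

# Ratio of g^*(x) to sqrt(2 |ln x|) for a small x; the convergence is slow.
x = 1e-20
rearr_ratio = gaussian_rearrangement(x) / math.sqrt(2.0 * abs(math.log(x)))
```

The tail ratios increase towards 1 as $t$ grows, whereas the rearrangement ratio approaches 1 only logarithmically in $x$, which is consistent with $h_2(t) = |\ln t|^{1/2}$ describing $\gamma_2^*$ only up to equivalence near 0.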
Let us show now how to adapt the proof of [36, Proposition 1] to the quasi-normed case. We need an analogue of Dacunha–Castelle and Krivine’s result for of Lr -spaces with 0 < r 1 which we state and provide a sketch of its proof in Appendix A. Another questionable point of that proof will be that considering the Köthe bidual of F0 isn’t any longer relevant, since in our situation it can be trivial. Instead of it we shall consider 0 defined by F 0 = the “Fatou closure” of F0 [0, 1], namely the quasi ideal space of functions F {x ∈ L0 [0, 1]: xF˜0 < ∞}, where
xF˜0 = sup yF : y ∈ F0 , |y| |x| . This is a symmetric quasi-Banach function space on [0, 1], and if F0 is a Banach lattice it coincides with the Köthe bidual of F0 [0, 1]. By order continuity of F0 , we have that the natural 0 is isometric, and that inclusion F0 [0, 1] ⊆ F
xF˜0 = sup |x| ∧ nF : n ∈ N . By Aoki–Rolewicz theorem, first we may assume that F0 is equipped with an α-norm for some 0 < α 1. Let 0 < β < α. Then it is known that F [0, b] ⊆ Lα,∞ [0, b] ⊆ Lβ [0, b] for every finite b, with bounded inclusion maps. Indeed, since the fundamental function φE (t) = χ[0,t] F of the α-normed r.i. space F verifies φE (2t) 21/α φE (t), it is easy to see that φE (t)/t 1/α is equivalent to a decreasing function. In particular there is a constant cb such that φE (t) cb t 1/α for any t ∈ [0, b]. If f ∈ F [0, b], we have for all t ∈ [0, b], f F f ∗ (t)χ[0,t] E = f ∗ (t)φE (t) cb t 1/α f ∗ (t), and the inclusion F [0, b] ⊆ Lα,∞ [0, b] follows. Let (xn ) be a normalized basic sequence equivalent to the p unit basis and G be the closed subspace of F0 spanned by (xn ). Then we shall consider two cases: Case (a). For some integer n, the map πn : F0 → Lβ [0, n], f → χ[0,n] f induces an isomorphism on G (of course n = 1 in the case of [0, 1]) . Case (b). For no n 1, the map πn is an isomorphism on G. In the second case (b), we can find a normalized sequence (fn ) in G with πn (fn )Lβ → 0. Like in the proof of Prop. 1 in [36] a further subsequence (which we still denote by (fn )) is equivalent to a disjoint sequence in F0 . In particular it is unconditional. By extracting a subsequence we may assume for that all k, the sequence (ck (fn ))n of k-th coordinates of the fn with respect to the basis (xn ), converges. Then setting gn = f2n − f2n+1 we obtain a new sequence in G, still equivalent to a disjoint sequence in F , which converges coordinatewise to zero with respect to the basis (xn ). By unconditionality of (fn ), the sequence (gn ) is seminormalized and some subsequence of it is equivalent to the p unit basis. In case (a) we may assume (rescaling) that the quasi-norms of F0 and that of Lβ [0, 1] are equivalent on G. It follows in particular that α p 2. 
The second inequality holds indeed because Lβ has cotype 2, and thus G must also have cotype 2. As for the first inequality, it holds because F , and hence G, is equipped with an α-norm and thus p must be α-normable. Now consider the interval [0, 1] with Lebesgue measure as a probability space, and measurable functions on [0, 1] as random variables. By Theorem A.1 in Appendix A, there is a normalized sequence (fn ) in G which converges in distribution to a function f which is equimeasurable with some wγp , where 0 w ∈ Lβ [0, 1], and γp is a symmetric p-stable random variable
on [0, 1] which is independent from w (if p = 2, γ2 is a Gaussian variable). One can show also (see the Appendix) that fn∗ converges to f ∗ a. e. in [0, 1], and thus, by order continuity of F0 it follows that for every n 1, f ∗ ∧ nF = limm fm∗ ∧ nF supm fm∗ F = 1. Choose δ ∈ (0, 1] such that the set A := {t ∈ [0, 1]: w(t) > δ} has positive measure c. Hence sup(χA γp )∗ ∧ nF δ −1 sup(wγp )∗ ∧ nF = δ −1 sup f ∗ ∧ nF < ∞. n
n
n
Since the random variable χA is a function of w, it has to be independent from γp on [0, 1]. Then for every t > 0 we have
χA |γp | > t = A ∩ {γp |> t} = |A| · {γp |> t}, and it follows that (χA γp )∗ (t) = γp∗ (t/c) = Dc γp∗ (t), where Dc is the dilation operator by c. Thus we have (χA γp )∗ ∧ n = Dc [γp∗ ∧ n] and thus γp∗ ∧ n = Dc−1 (χA γp )∗ ∧ n . Now, since the dilation operator is bounded on F0 (it can be proved analogously as for Banach spaces, Theorem 4.4 in [25]), we have also that supn γp∗ ∧ nF < ∞. Since γp∗ is equivalent to hp near zero, we obtain supn hp ∧ nF < ∞. Conversely, assume that supn hp ∧ nF < ∞, or equivalently that supn γp∗ ∧ nF < ∞. If F0 has the Fatou property then γp∗ ∈ F0 , hence γp ∈ F0 , and any sequence of independent random variables equidistributed with γp is included in F0 and spans a subspace isometric to p 0 , the Fatou closure of F0 [0, 1]. Like in [28, p. 182]. In the general case, we have that γp∗ |[0,1] ∈ F 0 , one can the α = 1 case investigated in [36, Proposition 1], but replacing there (F0 [0, 1]) by F construct in F0 [0, 1] a sequence of independent variables converging in distribution to a p-stable variable and spanning almost isometrically p in F0 [0, 1]. 2 Note that if a < ∞ then in the first case of the alternative described in Theorem 3.2 we obtain a basic sequence converging to zero in measure and equivalent to the p -basis. If a = ∞ this is not evident, and we therefore investigate when a lattice embedding of p can be made based on a basic sequence converging to zero in measure. The case of Orlicz spaces [33] shows that it is possible that some p -sublattices of F cannot be spanned by basic sequences converging to zero in measure. We shall see that it is possible that some sublattices of F can be spanned by normalized disjoint basic sequences consisting of equimeasurable elements. This phenomenon is connected to the behavior of the function fp (t) = t −1/p inside F . Along this line is the following well known result which indicates the importance of how the function fp is related to the space. Fact 1. (See [28, Theorem 2.f.2].) 
Let $F$ be an r.i. function space on $(0, \infty)$. If for some $1 \le p < \infty$ the function $f_p(t) = t^{-1/p}$ belongs to $F$, then $L_p(0, \infty)$ is order isometric to a sublattice of $F$.
Theorem 3.3. Assume that 0 < p ∞, and that the quasi-Banach ideal lattice F over (0, ∞) is L-convex. Then its order continuous part F0 contains a closed sublattice order isomorphic to p (resp. c0 when p = ∞) if and only if either F0 contains a closed sublattice order isomorphic to p spanned by a semi-normalized sequence of functions converging to zero in measure, or the functions fp,n (t) = χ(0,n] t −1/p ∧ n (resp. = χ(0,n] if p = ∞),
n ∈ N,
form a bounded set in $F_0$. If $F$ has the Fatou property, then in the second case $F$ contains the function $f_p(t) = t^{-1/p}$ (resp. $\chi_{(0,\infty)}$ in the case $p = \infty$).

Before we prove Theorem 3.3 let us recall some results from [13]. The following lemma is a simple reformulation of [13, Proposition 3.3], to which we added the last sentence.

Lemma 3.4. (See [13, Proposition 3.3].) Let $F$ be an r.i. Banach space of measurable functions over $(0, \infty)$ such that $L_1 \subset F$. Assume that the norms of $F$ and that of $L_1 + L_\infty$ are equivalent on the linear span of an infinite sequence $(x_n)$ of disjoint, semi-normalized nonzero elements of $L_1 + L_\infty$. Then there exists an infinite sequence $(y_n)$ of equimeasurable disjoint functions in the Köthe bidual $F''$ such that the norms of $F''$ and that of $L_1 + L_\infty$ are equivalent on the linear span of the sequence $(y_n)$. Moreover, if the basic sequence $(x_n)$ is symmetric (or simply subsymmetric) in $F$, then the basic sequences $(x_n)$ in $F$ and $(y_n)$ in $F''$ are equivalent.

Proof. We have only to justify the last sentence, which in fact results from the proof of [13, Proposition 3.3] itself. Recall the main features of this proof. One can extract a subsequence $(x_{i_k})$ such that $y(t) = \lim_{k\to\infty} x_{i_k}^*(t)$ exists for a.a. $t > 0$. Then $y \in F''$. Let $(y_k)$ be a sequence of disjoint functions in $F''$ which are equimeasurable with $y$. Such a sequence exists since $F''$ is an r.i. space over the infinite interval. Then, by the inclusion $F'' \to L_1 + L_\infty$ and by the Fatou property of the norm $\|\cdot\|_{F''}$, there exists $C > 0$ such that for every finite sequence $(a_k)$ and for all $n$,
\[
C^{-1}\Big\|\sum_{k=1}^n a_k y_k\Big\|_{L_1+L_\infty}
\le \Big\|\sum_{k=1}^n a_k y_k\Big\|_{F''}
\le \liminf_{m\to\infty}\Big\|\sum_{k=1}^n a_k x_{i_{k+m}}\Big\|_{F''}
\le \liminf_{m\to\infty}\Big\|\sum_{k=1}^n a_k x_{i_{k+m}}\Big\|_{F}
\]
\[
\le C \lim_{m\to\infty}\Big\|\sum_{k=1}^n a_k x_{i_{k+m}}\Big\|_{L_1+L_\infty}
= C\Big\|\sum_{k=1}^n a_k y_k\Big\|_{L_1+L_\infty}.
\]
(The last equality is not trivial and its proof depends on the hypothesis $L_1 \subset F$: see the proof of [13, Proposition 3.3].) On the other hand, since $(x_n)$ is subsymmetric, the basic sequences $(x_k)$ and $(x_{i_{k+m}})$ are equivalent independently of $m$. Thus $(x_n)$ in $F$ and $(y_n)$ in $F''$ are equivalent. □

Let $\phi$ be a continuous concave increasing function on $[0, +\infty)$ with $\phi(0) = 0$ and $\phi(t) > 0$ for $t > 0$. For every $s > 0$ define a function $\phi_s$ on $[0, \infty)$ by $\phi_s(t) = \phi(st)/\phi(s)$, and set
\[
C_\phi = \overline{\mathrm{conv}}\{\phi_s \mid 0 < s \le 1\},
\]
where the closure is taken for the topology of uniform convergence over every interval $[0, m]$. It was proved in [13, Proposition 3.5] that if $\liminf_{t\to 0} \phi(2t)/\phi(t) > 1$ then the set $C_\phi$ contains some power function $t^\alpha$ with $0 < \alpha \le 1$.
If $y \in L_1 + L_\infty$, $y \ne 0$, define the function $\phi^y$ by
\[
\phi^y(t) = \int_0^t y^*(s)\,ds.
\]
Let now $F$ be an r.i. Banach space over $(0, \infty)$ and $y \in F$. By the proof of [13, Theorem 3.6], if $\liminf_{t\to 0} \phi^y(2t)/\phi^y(t) > 1$ and $\alpha \in (0, 1]$ is such that the function $t^\alpha$ belongs to $C_{\phi^y}$, then the function $t^{\alpha-1}$ belongs to $F$. If on the contrary $\liminf_{t\to 0} \phi^y(2t)/\phi^y(t) = 1$, then $L_1 \subset F$.
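As a quick sanity check of the statement above, one can work out the case of a pure power function. The computation below is only an illustration: the choice $y(t) = t^{-1/p}$ with $1 < p < \infty$ is an assumption made for this example and is not taken from [13].

```latex
% Illustrative example (assumed choice): y(t) = t^{-1/p} on (0,\infty), 1 < p < \infty.
% y is positive and decreasing, so y^* = y, and y \in L_1 + L_\infty.
\[
  \phi^y(t) = \int_0^t s^{-1/p}\,ds = \frac{p}{p-1}\,t^{1-1/p},
  \qquad
  \frac{\phi^y(2t)}{\phi^y(t)} = 2^{1-1/p} > 1 .
\]
% Every rescaled function is the same power function:
\[
  (\phi^y)_s(t) = \frac{\phi^y(st)}{\phi^y(s)} = t^{1-1/p}\quad (s>0),
  \qquad\text{hence}\qquad
  C_{\phi^y} = \{t^{\alpha}\},\quad \alpha = 1-\tfrac1p .
\]
% The associated function t^{\alpha-1} = t^{-1/p} is y itself.
```

In this example the exponent of the power function in $C_{\phi^y}$ and the decay rate of $y$ are tied together by $\alpha = 1 - 1/p$.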
Proof of Theorem 3.3. By a standard convexification argument we may assume that F is an r-convex Banach lattice, for some r > 1. In fact F can be equipped with an equivalent lattice p-norm, 0 < p 1, by the Aoki–Rolewicz theorem. Since F is assumed to be L-convex, it is q-convex for any 0 < q < p. Take then s such that ps > 1. Then the s-convexification F (s) of 1/s F , equipped with the p-norm xF (s) = |x|s F becomes r-convex for any 1 < r < ps and has thus an equivalent r-convex norm. Moreover, F0s := (F0 )(s) = (F (s) )0 , and p order embeds in (s) (s) F0 if and only if ps does in F0 . We also have that fps,n ∈ F0 if and only if fp,ns ∈ F0 , and (s) the boundedness of (fps,n ) in F0 is equivalent to the boundedness of (fp,n ) in F0 . Sufficiency. If a rearrangement invariant Banach function space F over (0, ∞) contains the function fp then F contains Lp (0, ∞) isometrically as sublattice (see [28, Theorem 2.f.2]), and thus p order embeds in F . On the other hand if the sequence (fp,n ) is bounded in F then L0p,∞ ⊂ F0 , where L0p,∞ is the order continuous part of the Marcinkiewicz space Lp,∞ . We also know that every rearrangement invariant Banach space on (0, ∞) is contained in the space L1 + L∞ [25, Chapter II, Theorem 4.1]. It was proved in [13] that L0p,∞ contains a closed sublattice L isomorphic to p on which the norms of L0p,∞ and that of L1 + L∞ are equivalent (see the first part of the proof of [13, Theorem 3.6]). Since L0p,∞ → F0 → L1 + L∞ , this sublattice L is also closed and isomorphic to p in F0 . Necessity. Let (xn ) be a normalized disjoint sequence in F0 which is equivalent to the unit basis of p or c0 . Let X = [xn ], and more generally let Xm = span{xn | n m} for m ∈ N. We have two cases: Case 1. The norms of F and that of L1 + L∞ are not equivalent on X. Then they are not equivalent on any subspace Xm . Indeed, if they were, then Xm would be closed in L1 + L∞ . The algebraic direct decomposition X = span{x1 , . . . 
, xm−1 } + Xm would be a topological direct sum for the norm of L1 + L∞ , (since one factor is finite dimensional and the other one is closed) and since the two norms would be equivalent on each factor separately they would be also equivalent on X. Thus one can find a sequence (yn ) in X which is normalized for the norm of F , converges to zero in the norm of L1 + L∞ , and moreover such that yn ∈ Xn for every n 1. A standard gliding hump argument relative to the Schauder basis (xn ) of the space X, shows that some subsequence (yn ) of (yn ) is equivalent to a block basis of (xn ), that is to the unit p - or c0 -basis. Since yn L1 +L∞ → 0, it converges to zero in measure, and so the sequence (yn ) also converges to zero in measure. Thus in this case the first part of the alternative in the theorem holds. Case 2. The norms of F and that of L1 + L∞ are equivalent on X.
Then by Lemma 3.4 applied to the disjoint basis $(x_n)$ of $X$, equivalent to the unit basis in $\ell_p$ (or $c_0$), there exists a sequence $(y_n)$ of disjoint equimeasurable functions in $(F_0)''$ which is equivalent to the unit basis of $\ell_p$ (or $c_0$) in both $(F_0)''$ and $L_1 + L_\infty$. Let $\phi = \phi^{y_1}$ be the concave function associated with $y_1$ as explained above. We cannot have $\liminf_{t\to 0} \phi(2t)/\phi(t) = 1$, since it would imply that $L_1 \subset (F_0)''$, which is impossible. In fact, $L_1 \subset (F_0)''$ implies that $L_1[0, 1] \subset (F_0)''[0, 1]$, and since $(F_0)''[0, 1] \subset L_1[0, 1]$, the spaces $(F_0)''[0, 1]$ and $L_1[0, 1]$ coincide as sets with equivalent norms, which contradicts the assumption of $r$-convexity of $F$ for some $r > 1$. Hence $\liminf_{t\to 0} \phi(2t)/\phi(t) > 1$, and thus there exists $\alpha \in (0, 1]$ such that the function $t^\alpha$ belongs to $C_\phi$ while $t^{\alpha-1}$ belongs to $(F_0)''$. Let us identify this exponent $\alpha$. For every $n \ge 1$ we have
\[
\Big\|\sum_{i=1}^n e_i\Big\|_{\ell_p} = n^{1/p}
\]
and
\[
\Big\|\sum_{i=1}^n y_i\Big\|_{L_1+L_\infty}
= \int_0^1 \Big(\sum_{i=1}^n y_i\Big)^*(s)\,ds
= \int_0^1 y_1^*\Big(\frac{s}{n}\Big)\,ds
= n\,\phi\Big(\frac1n\Big).
\]
By equivalence of (en ) in p and (yn ) in L1 + L∞ , the functions φ and t 1−1/p are equivalent on [0, 1], that is C −1 φ(t) t 1−1/p Cφ(t) for all t ∈ [0, 1] and some C > 0. It follows that φ , the functions ψ and t 1−1/p have equivalent restrictions to [0, 1]. In for every function ψ ∈ C particular, for ψ(t) = t α we obtain that α = 1 − 1/p. Finally we see that the function fp (t) = t −1/p (resp. χ(0,∞) in the case of c0 ) belongs to (F0 ) . Since F0 embeds isometrically in its Köthe bidual (see p. 30 in [28]), we obtain that the sequence (fp,n ) (resp. (χ0,n) ) is bounded in F0 . Note also that (F0 ) = F , so (F0 ) = F . If F has the Fatou property we obtain thus that the function fp belongs to F . 2 From Corollary 2.8, Propositions 1.5, 3.1 and Theorems 3.2, 3.3 we deduce: Corollary 3.5. Let E be a quasi-Banach ideal space over I = N or I = (0, a), 0 < a ∞. (∗) Assume that E (∗) is also quasi-normed and E0 = {0}. Let 0 < p ∞. (∗)
(1) If I = N, then p embeds in E0 isomorphically if and only if either Eb contains a seminormalized p -basic sequence of non-negative decreasing elements, converging to zero in ∞ -norm, or p = ∞ and E0 = Eb = c0 with equivalent norms. (∗) (2) If I = (0, a), 0 < a ∞, then p embeds in E0 isomorphically if and only if either p (∗) embeds in E0 isomorphically as sublattice, or p 2 and the sequence (hp,n ) = χ(0,a∧1] t −1/p ∧ n if p < 2, resp. (h2,n ) = χ(0,a∧1] | ln t|1/2 ∧ n if p = 2, is bounded in E. (∗) (3) If I = (0, a), 0 < a < ∞, then p embeds in E0 isomorphically as sublattice if and only if Eb contains a semi-normalized p -basic sequence of non-negative decreasing elements, supported on intervals of length converging to zero. (∗) (4) If I = (0, ∞) and E is L-convex, then p embeds in E0 isomorphically as sublattice if and p only if either Eb contains a semi-normalized -basic sequence of non-negative decreasing elements, converging to zero in measure, or the sequence resp., (fp,n ) = (χ(0,n] ) if p = ∞ , (fp,n ) = χ(0,n] t −1/p ∧ n if 0 < p < ∞, is bounded in E.
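4. Two-weighted Orlicz–Lorentz spaces $\Lambda_{\varphi,w,v}(I)$

Let $\varphi$ be a non-degenerate Orlicz function, that is $\varphi : \mathbb{R}_+ \to \mathbb{R}_+$, $\varphi(0) = 0$, $\varphi$ is continuous, strictly increasing, and $\lim_{t\to\infty} \varphi(t) = \infty$. Given $\varphi$, a pair of positive functions $v, w \in L_0(I)$, with $I = (0, a)$ or $I = \mathbb{N}$, is called an admissible pair of weight functions whenever for all $s > 0$ and $x \in (0, a)$,
\[
\int_0^x \varphi\big(sv(t)\big)w(t)\,dt < \infty
\quad\text{and}\quad
\int_0^\infty \varphi\big(sv(t)\big)w(t)\,dt = \infty \ \text{ if } a = \infty \text{ or } I = \mathbb{N}.
\tag{4.1}
\]
Let $V(t) := \int_0^t v$ and $W(t) := \int_0^t w$ for $t \in (0, a)$. Given $\varphi$ and $v, w$, define the Musielak–Orlicz function
\[
\phi(s, t) = \varphi\big(sv(t)\big)w(t), \qquad s \ge 0,\ t \in (0, a),
\]
and the Musielak–Orlicz space $L_\phi(I)$ as the collection of all $x \in L_0(I)$ such that for some $\lambda > 0$, $I_\phi(\lambda x) < \infty$, where
\[
I_\phi(x) = \int_0^a \phi\big(|x(t)|, t\big)\,dt = \int_0^a \varphi\big(|x(t)|v(t)\big)w(t)\,dt.
\]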
It is clear that $L_\phi(I)$ coincides with the set of all $x \in L_0(I)$ such that the Minkowski functional $\|x\|_\phi = \inf\{\epsilon > 0 : I_\phi(x/\epsilon) \le 1\}$ is finite [32,40]. Its closed subspace of order continuous elements will be denoted by $L_\phi^0(I)$. It is also standard to show that $(L_\phi(I))_b = L_\phi^0(I)$. If both weights $v$ and $w$ are constant, then $L_\phi(I)$ coincides with the Orlicz space $L_\varphi(I)$.

The (two-weighted) Orlicz–Lorentz space $\Lambda_{\varphi,w,v}(I)$ associated to the Orlicz function $\varphi$ and to the admissible pair of weights $v, w$ is the set of all $x \in L_0(I)$ such that for some $\lambda > 0$ we have $I(\lambda x) < \infty$, where
\[
I(x) := I_{\varphi,w,v}(x) = \int_0^a \varphi(x^* v)\,w, \qquad x \in L_0.
\]
Thus $\Lambda_{\varphi,w,v}(I) = (L_\phi(I))^{(*)} = \{x \in L_0 : x^* \in L_\phi(I)\}$. Let $\|x\|_\Lambda := \|x\|_{\Lambda_{\varphi,w,v}(I)} = \|x^*\|_\phi$. Notice that if $v \equiv 1$, then $\Lambda_{\varphi,w,v}(I) = \Lambda_{\varphi,w}(I)$ is an Orlicz–Lorentz space [7,14,15,18]. We denote by $\Lambda^0_{\varphi,w,v}(I)$ the subspace of all order continuous elements in $\Lambda_{\varphi,w,v}(I)$. If $\varphi(u) = u^p$, $0 < p < \infty$, and $v \equiv 1$, then $\Lambda_{\varphi,w,v}(I) = \Lambda_{p,w}(I)$ [19]. If $(w, 1/w)$ is an admissible pair of weights, then $\Lambda_{\varphi,w,1/w}(I)$ is the Orlicz–Marcinkiewicz space denoted by $M_{\varphi,w}(I)$, investigated in [14,15] for $w$ decreasing and then studied in [22] for an arbitrary weight $w$ (note that if $w$ is decreasing, the pair $(w, 1/w)$ is always admissible). Under some additional conditions on $w$ and $\varphi$, $M_{\varphi^*,w}(I)$ is the dual space to $\Lambda_{\varphi,w}(I)$, where $\varphi^*(t) = \sup_{s>0}\{st - \varphi(s)\}$ is the Young conjugate function to $\varphi$ [15,22].
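To see how the definition of $\|\cdot\|_\Lambda$ works in the simplest situations, it may help to compute two special cases by hand. The sketch below assumes $v \equiv 1$ throughout; the symbols $b$ and $\epsilon$ are introduced only for this illustration.

```latex
% (a) Power case (assumption): \varphi(u) = u^p, 0 < p < \infty, v \equiv 1.
\[
  I(x/\epsilon) = \epsilon^{-p}\int_0^a (x^*)^p\,w \le 1
  \iff \epsilon \ge \Big(\int_0^a (x^*)^p\,w\Big)^{1/p},
  \qquad\text{so}\qquad
  \|x\|_\Lambda = \Big(\int_0^a (x^*)^p\,w\Big)^{1/p}.
\]
% (b) Characteristic function (assumption): x = \chi_{(0,b)}, b \in I, v \equiv 1, general \varphi.
% Here x^* = \chi_{(0,b)} and W(b) < \infty by admissibility.
\[
  I(x/\epsilon) = \varphi(1/\epsilon)\,W(b) \le 1
  \iff \epsilon \ge \frac{1}{\varphi^{-1}\big(1/W(b)\big)},
  \qquad\text{so}\qquad
  \|\chi_{(0,b)}\|_\Lambda = \frac{1}{\varphi^{-1}\big(1/W(b)\big)}.
\]
```

Case (a) recovers the usual Lorentz quasi-norm of $\Lambda_{p,w}(I)$, and case (b) shows that the quasi-norm of an indicator function depends on $b$ only through $W(b)$.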
In the case when $I = \mathbb{N}$ we shall frequently use the sequence notation. Thus given an Orlicz function $\varphi$, the pair of positive sequences $v = (v(n))$, $w = (w(n))$ is called an admissible pair of weight sequences whenever for all $s > 0$,
\[
\sum_{n=1}^\infty \varphi\big(sv(n)\big)w(n) = \infty.
\tag{4.2}
\]
Let $W(n) = \sum_{i=1}^n w(i)$ and $V(n) = \sum_{i=1}^n v(i)$, $n \in \mathbb{N}$. Let $\phi = (\phi_n)$, where
\[
\phi_n(t) = \varphi\big(tv(n)\big)w(n), \qquad t \ge 0,\ n \in \mathbb{N}.
\]
Then $\ell_\phi = \ell_{(\phi_n)}$ is the corresponding Musielak–Orlicz sequence space [32,40] with the modular
\[
I_\phi(x) = \sum_{n=1}^\infty \varphi\big(|x(n)|v(n)\big)w(n).
\]
For constant weights $v$ and $w$, $\ell_\phi$ becomes the Orlicz sequence space $\ell_\varphi$. By $h_\varphi$ we denote its closed subspace of order continuous elements. The (two-weighted) Orlicz–Lorentz sequence space will be denoted also by $d(w, v, \varphi)$, that is $d(w, v, \varphi) = \Lambda_{\varphi,w,v}(\mathbb{N})$, and $\|x\|_d := \|x\|_\Lambda = \|x\|_{d(w,v,\varphi)} = \|x^*\|_\phi$. The modular corresponding to the space $d(w, v, \varphi)$ will be denoted by
\[
I(x) = \sum_{n=1}^\infty \varphi\big(x^*(n)v(n)\big)w(n), \qquad x = (x(n)).
\]
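The space $d_0(w, v, \varphi)$, the order continuous part of $d(w, v, \varphi)$, is the closure of the linear span of the unit vectors $(e_i)$ in $d(w, v, \varphi)$. Notice that if $v \equiv 1$, then $d(w, v, \varphi) = d(w, \varphi)$ is the Orlicz–Lorentz sequence space studied for instance in [7]. If $v \equiv 1$ and $\varphi(u) = u^p$, $0 < p < \infty$, then $d(w, v, \varphi) = d(w, p)$. If the pair $(w, 1/w)$ is admissible, then $d(w, 1/w, \varphi) = m(w, \varphi) = M_{\varphi,w}(\mathbb{N})$ is the Orlicz–Marcinkiewicz sequence space [15,22].

Given an Orlicz function $\varphi$, we define its lower and upper Matuszewska–Orlicz indices in three different categories, at zero, at infinity and on $\mathbb{R}_+$, as follows:
\begin{align*}
\alpha_\varphi^0 &= \sup\Big\{r : \sup_{0<a,\,t\le 1} \frac{\varphi(at)}{\varphi(t)a^r} < \infty\Big\}, &
\beta_\varphi^0 &= \inf\Big\{r : \inf_{0<a,\,t\le 1} \frac{\varphi(at)}{\varphi(t)a^r} > 0\Big\},\\
\alpha_\varphi^\infty &= \sup\Big\{r : \sup_{0<a\le 1,\ at>1} \frac{\varphi(at)}{\varphi(t)a^r} < \infty\Big\}, &
\beta_\varphi^\infty &= \inf\Big\{r : \inf_{0<a\le 1,\ at>1} \frac{\varphi(at)}{\varphi(t)a^r} > 0\Big\},\\
\alpha_\varphi &= \sup\Big\{r : \sup_{0<a\le 1,\ t>0} \frac{\varphi(at)}{\varphi(t)a^r} < \infty\Big\}, &
\beta_\varphi &= \inf\Big\{r : \inf_{0<a\le 1,\ t>0} \frac{\varphi(at)}{\varphi(t)a^r} > 0\Big\}.
\end{align*}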
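A short computation with a pure power function shows how the six indices behave in the simplest case. The choice $\varphi(t) = t^p$ below is an assumption made purely for illustration.

```latex
% Illustrative example (assumed choice): \varphi(t) = t^p, 0 < p < \infty.
\[
  \frac{\varphi(at)}{\varphi(t)\,a^r} = a^{\,p-r}
  \qquad\text{for all } a, t > 0 .
\]
% The ratio does not depend on t, so the three categories agree.
% For 0 < a \le 1:  \sup_a a^{p-r} < \infty  iff  r \le p,
%                   \inf_a a^{p-r} > 0       iff  r \ge p.
\[
  \alpha_\varphi^0 = \alpha_\varphi^\infty = \alpha_\varphi = p
  = \beta_\varphi = \beta_\varphi^\infty = \beta_\varphi^0 .
\]
% Doubling ratio for the same function:
\[
  \frac{\varphi(2t)}{\varphi(t)} = 2^p \qquad (t > 0).
\]
```

For a power function, lower and upper indices coincide with the exponent and the doubling ratio is bounded, so all upper indices are finite.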
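It is clear that $\alpha_\varphi = \min(\alpha_\varphi^0, \alpha_\varphi^\infty)$ and $\beta_\varphi = \max(\beta_\varphi^0, \beta_\varphi^\infty)$. We say that $\varphi$ satisfies condition $\Delta_2^0$, $\Delta_2^\infty$ and $\Delta_2$ whenever
\[
\sup_{0<t\le 1} \frac{\varphi(2t)}{\varphi(t)} < \infty, \qquad
\sup_{t>1} \frac{\varphi(2t)}{\varphi(t)} < \infty \qquad\text{and}\qquad
\sup_{t>0} \frac{\varphi(2t)}{\varphi(t)} < \infty,
\]
respectively. Let $\varphi_1, \varphi_2$ be Orlicz functions. We call them equivalent at zero, equivalent at infinity, or just equivalent, whenever for some $C, D > 0$, $C^{-1}\varphi_1(D^{-1}t) \le \varphi_2(t) \le C\varphi_1(Dt)$ for all $0 < t < 1$, for all $t > 1$, or for all $t > 0$, respectively. It is well known and easy to prove that the appropriate indices of Orlicz functions are preserved by the corresponding equivalence relations (e.g. see [21]), that is, if $\varphi_1$ and $\varphi_2$ are equivalent (resp., equivalent at 0; equivalent at $\infty$) then $\alpha_{\varphi_1} = \alpha_{\varphi_2}$ (resp., $\alpha_{\varphi_1}^0 = \alpha_{\varphi_2}^0$; $\alpha_{\varphi_1}^\infty = \alpha_{\varphi_2}^\infty$). Similar equalities hold for the upper indices.

Remark 4.1. It is well known that any upper index of $\varphi$ is finite if and only if $\varphi$ satisfies the corresponding condition $\Delta_2$. Given $r > 0$, $\varphi$ is said to be $r$-convex if $\varphi(t^{1/r})$ is convex. We also recall that if $\alpha_\varphi > 0$ then for any $r \in (0, \alpha_\varphi)$, $\varphi(t^{1/r})$ is equivalent to a convex Orlicz function [21,31].

The following proposition is well known, but for the sake of completeness we provide its proof.

Proposition 4.2. If $\alpha_\varphi > 0$ then $\|\cdot\|_\phi$ is a quasi-norm on $L_\phi(I)$. If $\varphi$ is $r$-convex, then $\|\cdot\|_\phi$ is $r$-convex, and in particular when $0 < r \le 1$ it is an $r$-norm.

Proof. By the assumption on $\varphi$, $\varphi(at) \le Ca^p\varphi(t)$ for some $p > 0$, all $t \ge 0$ and all $0 < a < 1$. Take $0 < a < 1$ such that $Ca^p \le 1/2$. Since $\varphi$ is increasing, $\varphi(bs + (1 - b)t) \le \varphi(t) + \varphi(s)$ for all $0 < b < 1$ and $s, t \ge 0$. Take any elements $x, y$ from $L_\phi(I)$ and $\alpha, \beta > 0$ such that $I_\phi(x/\alpha) \le 1$, $I_\phi(y/\beta) \le 1$. Then
\[
I_\phi\Big(a\,\frac{x+y}{\alpha+\beta}\Big)
\le I_\phi\Big(a\,\frac{x}{\alpha}\Big) + I_\phi\Big(a\,\frac{y}{\beta}\Big)
\le Ca^p\big(I_\phi(x/\alpha) + I_\phi(y/\beta)\big) \le 1.
\]
Hence $\|x + y\|_\phi \le a^{-1}(\|x\|_\phi + \|y\|_\phi)$. If $\varphi$ is convex, then $I_\phi$ is a convex functional on $L_\phi(I)$ and the triangle inequality for $\|\cdot\|_\phi$ is immediate. If $\varphi$ is $r$-convex, then $\psi(t) := \varphi(t^{1/r})$ is convex. Letting $\|\cdot\|_\Psi$ be the quasi-norm associated to $\psi$, we see that it is a norm, and we also have the equality $\|x\|_\phi = \||x|^r\|_\Psi^{1/r}$. It follows that
\[
\big\|(|x|^r + |y|^r)^{1/r}\big\|_\phi
= \big\||x|^r + |y|^r\big\|_\Psi^{1/r}
\le \big(\||x|^r\|_\Psi + \||y|^r\|_\Psi\big)^{1/r}
= \big(\|x\|_\phi^r + \|y\|_\phi^r\big)^{1/r},
\]
that is, the quasi-norm $\|\cdot\|_\phi$ is $r$-convex. Finally, if $0 < r \le 1$, then for every $u, v \ge 0$ we have $u + v \le (u^r + v^r)^{1/r}$, and thus
\[
\|x + y\|_\phi \le \big\|(|x|^r + |y|^r)^{1/r}\big\|_\phi \le \big(\|x\|_\phi^r + \|y\|_\phi^r\big)^{1/r}.
\]
Hence the functional $\|\cdot\|_\phi$ is an $r$-norm, which completes the proof. □

By Proposition 1.5 we deduce the following result.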
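The constants appearing in the proof of Proposition 4.2 can be made explicit in the power case. The following is only a sketch under the assumption $\varphi(t) = t^p$ with $0 < p < 1$; it compares the constant produced by the first part of the proof with the one obtained from $r$-convexity.

```latex
% Illustrative example (assumed choice): \varphi(t) = t^p, 0 < p < 1.
% Then \varphi(at) = a^p \varphi(t), so C = 1 in the proof, and
% C a^p \le 1/2 holds with a = 2^{-1/p}. The first part of the proof gives
\[
  \|x+y\|_\phi \le a^{-1}\big(\|x\|_\phi + \|y\|_\phi\big)
  = 2^{1/p}\big(\|x\|_\phi + \|y\|_\phi\big).
\]
% On the other hand \varphi(t^{1/p}) = t is convex, so \varphi is p-convex and
% \|\cdot\|_\phi is a p-norm; by the power mean inequality,
\[
  \|x+y\|_\phi \le \big(\|x\|_\phi^p + \|y\|_\phi^p\big)^{1/p}
  \le 2^{1/p-1}\big(\|x\|_\phi + \|y\|_\phi\big).
\]
```

The second estimate is better by a factor of 2, which illustrates that the quasi-triangle constant $a^{-1}$ in the proof is convenient rather than optimal.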
Corollary 4.3. If $\alpha_\varphi > 0$ and $\|\cdot\|_\Lambda$ (resp. $\|\cdot\|_d$) is a quasi-norm, then the quasi-Banach lattice $\Lambda_{\varphi,w,v}(I)$ (resp., $d(w, v, \varphi)$) is L-convex.

The proof of the next result is standard and can be compared with [24, Lemma 1.7].

Proposition 4.4. If two Orlicz functions $\varphi_1$ and $\varphi_2$ are equivalent and $\alpha_{\varphi_1} > 0$, then the quasi-norms $\|\cdot\|_{\Lambda_{\varphi_1,w,v}(I)}$ and $\|\cdot\|_{\Lambda_{\varphi_2,w,v}(I)}$ are also equivalent. If $I = \mathbb{N}$ and $\inf_n w(n) > 0$ (resp. $I = (0, a)$, $a < \infty$ and $W(a) < \infty$), then the equivalence of $\varphi_1$ and $\varphi_2$ at 0 (resp. at $\infty$) is sufficient.

Proof. Let $\mu$ be a measure on $\mathbb{R}$ defined by $d\mu = w\,dt$. Then the given conditions imply that the Orlicz spaces $L_{\varphi_1}(I, \mu)$ and $L_{\varphi_2}(I, \mu)$ over the measure space $(I, \mu)$ coincide and the quasi-norms are equivalent [30,32]. The conclusion follows since $\|f\|_{\Lambda_{\varphi_1,w,v}(I)} = \|vf^*\|_{L_{\varphi_1}(I,\mu)}$ and $\|f\|_{\Lambda_{\varphi_2,w,v}(I)} = \|vf^*\|_{L_{\varphi_2}(I,\mu)}$. □

Proposition 4.5. Let $\varphi$ be an Orlicz function such that $\alpha_\varphi > 0$. Then $\|\cdot\|_\Lambda$ (resp. $\|\cdot\|_d$) is a quasi-norm on $\Lambda_{\varphi,w,v}(I)$ (resp., on $d(w, v, \varphi)$) whenever for some $C > 0$,
\[
\int_0^{2t} \varphi(sv)\,w \le \int_0^t \varphi(Csv)\,w
\qquad
\Big(\text{resp., } \sum_{i=1}^{2n} \varphi\big(sv(i)\big)w(i) \le \sum_{i=1}^{n} \varphi\big(Csv(i)\big)w(i)\Big)
\tag{4.3}
\]
holds for all $s > 0$, $2t \in I = (0, a)$ (resp., $s > 0$, $n \in \mathbb{N}$).

Proof. Assume the inequality (4.3). By Proposition 4.2, $\|\cdot\|_\phi$ is a quasi-norm. Applying the Aoki–Rolewicz theorem, we can assume further that $\|\cdot\|_\phi$ is a $p$-norm for some $0 < p \le 1$. Thus by Lemma 1.4, it is enough to prove that the dilation operator $\sigma_2$ is bounded on the cone of non-negative decreasing elements of $L_\phi(0, a)$ or $\ell_\phi$. In view of Remark 4.1 and Proposition 4.4 it is enough to give the proof under the assumption that $\varphi(t^{1/r})$ is convex for some $0 < r \le 1$. Let $\varphi'$ denote the right derivative of $\varphi$. Then for every $u > 0$, $\varphi(uv(2t)) = \int_0^u \varphi'(sv(2t))v(2t)\,ds$. When conducting the proof in the function case we assume for simplicity that $a = \infty$. Thus by the Fubini theorem,
\begin{align*}
I(\sigma_2 x) &= 2\int_0^\infty \varphi\big(x^*(t)v(2t)\big)w(2t)\,dt
= 2\int_0^\infty \Big(\int_0^{x^*(t)} \varphi'\big(sv(2t)\big)v(2t)\,ds\Big)w(2t)\,dt\\
&= 2\int_0^\infty \int_0^{d_x(s)} \varphi'\big(sv(2t)\big)v(2t)w(2t)\,dt\,ds
= \int_0^\infty \int_0^{2d_x(s)} \varphi'\big(sv(t)\big)v(t)w(t)\,dt\,ds.
\tag{4.4}
\end{align*}
Analogously we have
\[
I(x) = \int_0^\infty \int_0^{d_x(s)} \varphi'\big(sv(t)\big)v(t)w(t)\,dt\,ds.
\tag{4.5}
\]
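For any convex Orlicz function $\psi$, $t\psi'(t) \le \psi(2t) \le 2t\psi'(2t)$ for all $t > 0$. Thus by the assumption that $\varphi(t^{1/r})$ is convex, we have for every $t > 0$ that $r\varphi(t) \le t\varphi'(t) \le r\varphi(2^{1/r}t)$. Then by (4.3), for every $t, s > 0$ and some $C > 0$,
\begin{align*}
\int_0^{2t} \varphi'\big(sv(z)\big)v(z)w(z)\,dz
&\le \frac{r}{s}\int_0^{2t} \varphi\big(2^{1/r}sv(z)\big)w(z)\,dz
\le \frac{r}{s}\int_0^{t} \varphi\big(C2^{1/r}sv(z)\big)w(z)\,dz\\
&\le C2^{1/r}\int_0^{t} \varphi'\big(C2^{1/r}sv(z)\big)w(z)v(z)\,dz.
\end{align*}
Therefore in view of (4.4) and (4.5), $I(\sigma_2 x) \le D\,I(Dx)$ with $D = C2^{1/r}$. Thus by convexity of $\varphi(t^{1/r})$ there exists $M > 0$ such that $I(\sigma_2 x) \le I(Mx)$, and so $\|\sigma_2 x\|_\Lambda \le M\|x\|_\Lambda$.

Assuming (4.3) for sequences, define on $\mathbb{R}_+$ the weights $v$ and $w$ by $v(t) = v(n)$ and $w(t) = w(n)$ for $t \in [n-1, n)$. The space $d(w, v, \varphi)$ is isometrically embedded into $\Lambda_{\varphi,w,v}(0, \infty)$ by the usual identification $x = (x(n)) \mapsto x(t)$ with $x(t) = x(n)$ for $t \in [n-1, n)$, $n \in \mathbb{N}$. It is enough to show that the triple $\varphi, w, v$ satisfies condition (4.3) for functions. In fact, for any $t > 0$ there exists $n \in \mathbb{N}$ such that $n - 1 \le t < n$. Thus for $s > 0$,
\[
\int_0^{2t} \varphi(sv)\,w
\le \sum_{i=1}^{2n} \varphi\big(sv(i)\big)w(i)
\le \sum_{i=1}^{n} \varphi\big(Csv(i)\big)w(i)
\le \int_0^{t} \varphi(Csv)\,w + \varphi\big(Csv(n)\big)w(n).
\]
But
\[
\varphi\big(Csv(n)\big)w(n)
\le \sum_{i=1}^{n} \varphi\big(Csv(i)\big)w(i)
\le \sum_{i=1}^{n/2} \varphi\big(C^2sv(i)\big)w(i)
\le \int_0^{n-1} \varphi(C^2sv)\,w
\le \int_0^{t} \varphi(C^2sv)\,w.
\]
Thus for some $C' > 0$ we have $\int_0^{2t} \varphi(sv)\,w \le \int_0^{t} \varphi(C'sv)\,w$, and so (4.3) is satisfied. □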
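Condition (4.3) is easy to test by hand when the weights are trivial. The sketch below assumes $I = (0, \infty)$, $v \equiv 1$, and, in the first display, also $w \equiv 1$ and $\varphi(t) = t^p$; these choices are made only for illustration.

```latex
% Illustrative example (assumed choices): I = (0,\infty), v \equiv 1, w \equiv 1, \varphi(t) = t^p.
\[
  \int_0^{2t} \varphi(s)\,dz = 2t\,s^p,
  \qquad
  \int_0^{t} \varphi(Cs)\,dz = C^p\,t\,s^p ,
\]
% so (4.3) holds for all s, t > 0 exactly when C^p \ge 2, i.e. C \ge 2^{1/p}.
%
% More generally, for v \equiv 1 and an arbitrary weight w, (4.3) reads
\[
  \varphi(s)\,W(2t) \le \varphi(Cs)\,W(t)
  \qquad (s > 0,\ 2t \in I),
\]
% which couples the doubling behaviour of W with the growth of \varphi.
```

In the power case with $W(2t) \le KW(t)$, the last inequality is satisfied with $C = K^{1/p}$.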
Remark 4.6. For further use let us note that if $v$ is increasing then condition (4.3) implies condition $\Delta_2$ for $W$. Let $t \in I = (0, a)$ be such that $2t \in I$. If $W(t) = \infty$, then $W(2t) = W(t) = \infty$. If $W(t) < \infty$ then
\[
\varphi\big(sv(t)\big)\big(W(2t) - W(t)\big)
\le \int_t^{2t} \varphi\big(sv(z)\big)w(z)\,dz
\le \int_0^{t} \varphi\big(Csv(z)\big)w(z)\,dz
\le \varphi\big(Csv(t)\big)W(t).
\]
Taking $s = 1/v(t)$ we obtain that
\[
\varphi(1)\big(W(2t) - W(t)\big) \le \varphi(C)W(t),
\]
and so $W(2t) \le (1 + \varphi(C)/\varphi(1))W(t)$ for every $t$. For $I = \mathbb{N}$ the proof is similar.

Remark 4.7. If $v$ is constant then the conditions (4.1) and (4.2) mean that $\int_0^t w < \infty$ for every $t \in I$ and $\int_0^\infty w = \infty$. Therefore a single function or sequence $w$ is called a weight whenever it satisfies these conditions.

By Proposition 4.5 we obtain the following corollaries.

Corollary 4.8. Let $\varphi$ be an Orlicz function such that $0 < \alpha_\varphi \le \beta_\varphi < \infty$ and let $w$ be a single weight. Then the following assertions are equivalent.
(i) $\|\cdot\|_\Lambda$ is a quasi-norm on the Orlicz–Lorentz space $\Lambda_{\varphi,w}(I)$.
(ii) $W$ satisfies condition $\Delta_2$.
(iii) Conditions (4.3) are satisfied.

Proof. (i) ⇒ (ii) Setting $x_1(t) = \chi_{(0,b)}(t)$, $x_2(t) = \chi_{(b,2b)}(t)$, where $b, 2b \in I$, we have $\|x_1\|_\Lambda = \|x_2\|_\Lambda = 1/\varphi^{-1}(1/W(b))$ and $\|x_1 + x_2\|_\Lambda = 1/\varphi^{-1}(1/W(2b))$. Assuming that $\|x_1 + x_2\|_\Lambda \le D(\|x_1\|_\Lambda + \|x_2\|_\Lambda)$ we have $\varphi^{-1}(1/W(b)) \le 2D\varphi^{-1}(1/W(2b))$. It follows by the $\Delta_2$-condition of $\varphi$ that $W(2b) \le KW(b)$ for some $K > 0$. The implication (ii) ⇒ (iii) is trivial and (iii) ⇒ (i) results from Proposition 4.5. □

Definition 4.9. We say that a positive, locally integrable function $w$ on $I$ is regular if there exists $C > 0$ such that $C^{-1}tw(t) \le W(t) \le Ctw(t)$ for all $t \in I$.

Proposition 4.10. If a positive, locally integrable function $w$ on $I$ is regular then
\[
1 < \inf_{t\in I} \frac{W(2t)}{W(t)} \le \sup_{t\in I} \frac{W(2t)}{W(t)} < \infty.
\tag{4.6}
\]
Consequently, $0 < \alpha_W \le \beta_W < \infty$ and so $W$ satisfies condition $\Delta_2$.

Proof. We shall give a proof only for $I = (0, \infty)$. By regularity of $w$, for any $s > 0$,
\[
C^{-1}\int_s^{2s} \frac{1}{t}\,dt \le \int_s^{2s} \frac{w(t)}{W(t)}\,dt \le C\int_s^{2s} \frac{1}{t}\,dt.
\]
Then for any $s > 0$,
\[
\ln 2^{C^{-1}} = C^{-1}\ln\frac{2s}{s} \le \ln\frac{W(2s)}{W(s)} \le C\ln\frac{2s}{s} = \ln 2^{C},
\]
which is (4.6). By the results in [21,31] we then obtain the inequality on indices. □
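A standard family of regular weights makes the bounds in (4.6) concrete. The choice $w(t) = t^{-\theta}$ below is an assumption made for illustration; $\theta$ is a symbol introduced only for this example.

```latex
% Illustrative example (assumed choice): I = (0,\infty), w(t) = t^{-\theta}, 0 < \theta < 1.
\[
  W(t) = \int_0^t s^{-\theta}\,ds = \frac{t^{1-\theta}}{1-\theta}
       = \frac{1}{1-\theta}\; t\,w(t),
\]
% so w is regular with constant C = 1/(1-\theta) \ge 1, and
\[
  \frac{W(2t)}{W(t)} = 2^{1-\theta} \in (1, 2)
  \qquad\text{for every } t > 0 .
\]
% The proof's bounds read 2^{1/C} = 2^{1-\theta} \le W(2t)/W(t) \le 2^{C} = 2^{1/(1-\theta)},
% and the lower bound is attained in this example.
```

Here $W$ is itself a power function with exponent $1 - \theta$, so $\alpha_W = \beta_W = 1 - \theta$.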
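Corollary 4.11. If $\alpha_\varphi > 0$, $w$ is decreasing and regular, and $v = 1/w$, then the condition (4.3) is satisfied. Consequently, the Minkowski functional $\|x^*\|_\phi$ corresponding to $\phi(s, t) = \varphi(s/w(t))w(t)$, $s \ge 0$, $t \in I$, is a quasi-norm on $M_{\varphi,w}(I)$.

Proof. Let just $I = (0, \infty)$. Then for any $s, t > 0$, since $W(t) \le W(2t) \le KW(t)$ by Proposition 4.10, and $C^{-1}tw(t) \le W(t) \le Ctw(t)$, we have
\begin{align*}
\int_0^{2t} \varphi\Big(\frac{s}{w(z)}\Big)w(z)\,dz
&= 2\int_0^{t} \varphi\Big(\frac{s}{w(2z)}\Big)w(2z)\,dz
\le 2C\int_0^{t} \varphi\Big(\frac{2Czs}{W(2z)}\Big)\frac{W(2z)}{2z}\,dz\\
&\le CK\int_0^{t} \varphi\Big(\frac{2Czs}{W(z)}\Big)\frac{W(z)}{z}\,dz
\le C^2K\int_0^{t} \varphi\Big(\frac{2C^2s}{w(z)}\Big)w(z)\,dz\\
&\le \int_0^{t} \varphi\Big(C'\frac{s}{w(z)}\Big)w(z)\,dz,
\end{align*}
for some $C' > 0$, by the assumption that $\alpha_\varphi > 0$. □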
Lemma 4.12. Assume that $I = (0, \infty)$ or $I = \mathbb{N}$. Let $v$ be regular, $W(t) < \infty$ for all $t > 0$, and $\alpha_W > 0$. Then there exists a constant $M > 0$ such that for every $x$ with $\|x\|_\Lambda \le 1$ or $\|x\|_d \le 1$, we have
\[
\varphi\big(M^{-1}x^*(t)v(t)\big) \le \frac{M}{W(t/2)}, \qquad t > 0,
\]
or
\[
\varphi\big(M^{-1}x^*(n)v(n)\big) \le \frac{M}{W(n/2)}, \qquad n \in \mathbb{N},
\]
respectively. In particular, since necessarily $W(\infty) = \infty$, it holds that $\lim_{t\to\infty} x^*(t)v(t) = 0$ or $\lim_{n\to\infty} x^*(n)v(n) = 0$.

Proof. We show it only for sequence spaces. Let $C$ be a constant in the regularity condition for $v$, and by Proposition 4.10, let $K \ge 1$ be such that $V(n/2) \ge K^{-1}V(n)$. Since $\alpha_W > 0$, there exist $\epsilon > 0$ and $n_0$ such that $W(n) \ge (1 + \epsilon)W(n/2)$ for all $n \ge n_0$. Then for sufficiently large $n \in \mathbb{N}$ and some $M > 0$ we have
\begin{align*}
1 &\ge \sum_{i=n/2}^{n} \varphi\big(x^*(n)v(i)\big)w(i)
\ge \sum_{i=n/2}^{n} \varphi\Big(C^{-1}x^*(n)\frac{V(i)}{i}\Big)w(i)
\ge \sum_{i=n/2}^{n} \varphi\Big(C^{-1}x^*(n)\frac{V(n/2)}{n}\Big)w(i)\\
&\ge \varphi\Big((CK)^{-1}x^*(n)\frac{V(n)}{n}\Big)\big(W(n) - W(n/2)\big)
\ge M^{-1}\varphi\big(M^{-1}x^*(n)v(n)\big)W(n/2),
\end{align*}
which yields the required inequality. Finally, since $\alpha_W > 0$, for some $C > 0$ and $p > 0$ we have $W(u) \ge Cu^pW(1)$ for every $u \ge 1$, which implies that $W(\infty) = \infty$. □
Remark 4.13. If $v$ is decreasing, in particular if $v$ is a constant function, then we need neither the assumption that $\alpha_W > 0$ nor that $v$ is regular in Lemma 4.12. Indeed, in that case we simply obtain that $\|x\|_\Lambda \le 1$ implies $\varphi(x^*(t)v(t)) \le 1/W(t)$ for every $t \in I$.

Lemma 4.14. If $w$ is decreasing and regular then $v = w^{-1}$ is also regular.

Proof. We show it only in the case of a function weight on $(0, \infty)$. Let $w$ be decreasing and regular and let $v = w^{-1}$. By $t/W(t) \le 1/w(t) \le Ct/W(t)$, we have for any $t > 0$,
\[
V(t) \ge \int_{t/2}^{t} \frac{s}{W(s)}\,ds \ge \int_{t/2}^{t} \frac{t}{2W(t)}\,ds = \frac{t^2}{4W(t)} \ge (4C)^{-1}\frac{t}{w(t)}.
\]
On the other hand, since the function $t/W(t)$ is increasing, for any $t > 0$,
\[
V(t) \le C\int_0^{t} \frac{s}{W(s)}\,ds \le C\frac{t^2}{W(t)} \le C\frac{t}{w(t)}.
\]
Thus $v$ is regular. □
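The two-sided estimate in the proof of Lemma 4.14 can be checked directly on a power weight. As before, the choice $w(t) = t^{-\theta}$ is an assumption made only for this illustration.

```latex
% Illustrative example (assumed choice): I = (0,\infty), w(t) = t^{-\theta}, 0 < \theta < 1,
% so that w is decreasing and regular with C = 1/(1-\theta), and v(t) = 1/w(t) = t^{\theta}.
\[
  V(t) = \int_0^t s^{\theta}\,ds = \frac{t^{1+\theta}}{1+\theta}
       = \frac{1}{1+\theta}\,\frac{t}{w(t)} .
\]
% The bounds (4C)^{-1} t/w(t) \le V(t) \le C t/w(t) from the proof become
\[
  \frac{1-\theta}{4} \;\le\; \frac{1}{1+\theta} \;\le\; \frac{1}{1-\theta},
\]
% which holds for every 0 < \theta < 1.
```

In this example $v$ is regular with the explicit constant $1 + \theta$, which is smaller than the constant $4C$ guaranteed by the general argument.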
Proposition 4.15. Let $\alpha_\varphi > 0$ and let the conditions (4.1), (4.2), (4.3) be satisfied. Then the following properties are fulfilled.
(1) The space $(\Lambda_{\varphi,w,v}(I), \|\cdot\|_\Lambda)$ satisfies the Fatou property.
(2) An element $f \in \Lambda_{\varphi,w,v}(I)$ is order continuous whenever $I(kf) < \infty$ for every $k > 0$.
(3) If $\varphi$ satisfies condition $\Delta_2$ then $\|\cdot\|_\Lambda$ is order continuous in $\Lambda_{\varphi,w,v}(I)$. If $I = (0, a)$, $a < \infty$, condition $\Delta_2^\infty$ is sufficient provided that $W(a) < \infty$. If $I = \mathbb{N}$, condition $\Delta_2^0$ is sufficient provided that one of the following conditions is fulfilled:
(a) $W(\infty) = \infty$ and $v$ is decreasing,
(b) $\alpha_W > 0$ and $v$ is regular,
(c) $\inf_n w(n) > 0$. (4.7)

Proof. (1) The Fatou property is clear (see the Preliminaries).
(2) We have to prove that if $|f| \ge f_n \downarrow 0$ then $\|f_n\|_\Lambda \to 0$. Due to condition (4.1), we have $f^*(t) \to 0$ when $t \to \infty$, and it yields that $f_n^* \downarrow 0$. By the monotone convergence theorem, we then have $I(kf_n) \to 0$ for every $k$, thus $\|f_n\|_\Lambda \to 0$.
(3) It is sufficient to prove that for any $f \in \Lambda_{\varphi,w,v}(I)$ such that $I(f) < \infty$ we have $I(2f) < \infty$, since by (2) this implies that every element of $\Lambda_{\varphi,w,v}(I)$ is order continuous. This is clear (for any $I$) if $\varphi$ verifies $\Delta_2$. If $I = (0, a)$, $a < \infty$, and $\varphi$ satisfies condition $\Delta_2^\infty$, then there is $K > 0$ such that for $f \in \Lambda_{\varphi,w,v}(I)$ with $I(f) < \infty$ and $A = \{t : f^*(t)v(t) \ge 1\}$, we get
\[
I(2f) = I_\phi(2f^*) = I_\phi(2\chi_A f^*) + I_\phi(2\chi_{A^c} f^*) \le KI_\phi(\chi_A f^*) + \varphi(2)W(a) \le \varphi(2)W(a) + KI(f).
\]
Hence $I(2f)$ is finite provided that $W(a) < \infty$.
If $I = \mathbb{N}$ and one of the conditions (4.7) is satisfied, then by Lemma 4.12 and Remark 4.13, for every $x \in d(w, v, \varphi)$ with $I(x) < \infty$, the set $A = \{n \in \mathbb{N} : x^*(n)v(n) > 1\}$ is finite. Thus if $\varphi$ satisfies condition $\Delta_2^0$ with constant $K > 0$,
\[
I(2x) = I_\phi(2x^*) = I_\phi(2\chi_A x^*) + I_\phi(2\chi_{A^c} x^*)
\le \sum_{n\in A} \varphi\big(2x^*(n)v(n)\big)w(n) + KI_\phi(x^*)
= \sum_{n\in A} \varphi\big(2x^*(n)v(n)\big)w(n) + KI(x) < \infty,
\]
which completes the proof. □
5. $c_0$, $\ell_\infty$ and $h_\psi$ isomorphic subspaces in $\Lambda_{\varphi,w,v}(I)$

Starting from this section we assume in the rest of the article that the triple $\varphi, v, w$, consisting of the Orlicz function $\varphi$ and the pair of weights $v, w$, satisfies conditions (4.1), (4.2), (4.3) and that $\varphi$ is $r$-convex for some $0 < r \le 1$. Then in view of Proposition 4.5, for $I = (0, a)$ or $I = \mathbb{N}$, $(\Lambda_{\varphi,w,v}(I), \|\cdot\|_\Lambda)$ is a quasi-Banach space containing all simple functions supported on finite intervals. By Remark 4.1 and Proposition 4.2, assuming $r$-convexity of $\varphi$ does not lose generality. The first result describes the conditions for the existence of (order) isomorphic copies of $c_0$ or $\ell_\infty$ in $\Lambda_{\varphi,w,v}(I)$.

Theorem 5.1. Let $a < \infty$ and consider the following properties.
(α) The Orlicz function $\varphi$ satisfies condition $\Delta_2$.
(i) The Orlicz function $\varphi$ satisfies condition $\Delta_2^0$ (resp., $\Delta_2^\infty$; $\Delta_2$).
(ii) $\Lambda_{\varphi,w,v}(I) = \Lambda^0_{\varphi,w,v}(I)$ for $I = \mathbb{N}$ (resp., $I = (0, a)$, $I = (0, \infty)$).
(iii) $\Lambda_{\varphi,w,v}(I)$, $I = \mathbb{N}$ (resp. $(0, a)$, $(0, \infty)$), does not contain a subspace isomorphic to $\ell_\infty$.
(iii′) $\Lambda_{\varphi,w,v}(I)$, $I = \mathbb{N}$ (resp. $(0, a)$, $(0, \infty)$), does not contain a sublattice order isomorphic to $\ell_\infty$.
(iv) $\Lambda^0_{\varphi,w,v}(I)$, $I = \mathbb{N}$ (resp. $(0, a)$, $(0, \infty)$), does not contain a sublattice order isomorphic to $c_0$.
(a) The unit vectors $(e_n)$ form a boundedly complete basis in $d_0(w, v, \varphi)$.
(b) $d_0(w, v, \varphi)$ does not contain a subspace isomorphic to $c_0$.

Then (ii)–(iv) are equivalent; in case $I = \mathbb{N}$ they are also equivalent to (a) and to (b). Moreover (α) implies the conditions (ii)–(iv). Also (i) implies (ii) under any of the conditions (4.7) in the case $I = \mathbb{N}$, resp. if $W(a) < \infty$ in the case $I = (0, a)$, $a < \infty$. Finally, if the weight $v$ is increasing, $W$ is finite valued on $I$ and $W(\infty) = \infty$, then (iv) implies (i).

Proof. The proof is similar to that of Theorem 2.2 in [24], so we provide only its sketch here.
(iii′) ⇒ (i), (iv) ⇒ (i). We provide a proof only in the case of $I = \mathbb{N}$. Assume $v$ is increasing, $W$ attains only finite values, $W(\infty) = \infty$ and $\varphi$ does not satisfy condition $\Delta_2^0$. By Remark 4.6,
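let $K$ be a constant in condition $\Delta_2$ of $W$. We find by induction a sequence $u_i \downarrow 0$ and an increasing sequence $(n_i) \subset \mathbb{N}$ such that $n_0 = 0$ and for all $i \in \mathbb{N}$,
\[
\varphi\Big(\Big(1 + \frac1i\Big)u_i\Big) > 2^i\varphi(u_i)
\quad\text{and}\quad
\frac{1}{K2^{i+1}} \le \varphi(u_i)\big(W(n_i) - W(n_{i-1})\big) \le \frac{1}{2^i}.
\tag{5.1}
\]
Indeed, since $\varphi$ does not satisfy condition $\Delta_2^0$, we find $u_1 > 0$ such that
\[
\varphi(2u_1) > 2\varphi(u_1) \quad\text{and}\quad \varphi(u_1)W(1) \le 1/2^2.
\]
In view of $W(\infty) = \infty$, choose $n_1 \ge 1$ such that $\varphi(u_1)W(n_1) \le 1/2$ and $\varphi(u_1)W(2n_1) > 1/2$. Thus $1/2 < \varphi(u_1)W(2n_1) \le K\varphi(u_1)W(n_1)$, and so $\varphi(u_1)W(n_1) > 1/(2K)$. Now choose $0 < u_2 < u_1$ with
\[
\varphi\big((1 + 1/2)u_2\big) > 2^2\varphi(u_2) \quad\text{and}\quad \varphi(u_2)W(n_1) \le 1/2^3.
\]
Then there exists $n_2 \ge n_1$ such that
\[
\varphi(u_2)\big(W(n_2) - W(n_1)\big) \le 1/2^2 \quad\text{and}\quad \varphi(u_2)\big(W(2n_2) - W(n_1)\big) > 1/2^2.
\]
Hence $1/2^2 < \varphi(u_2)(W(2n_2) - W(n_1)) \le K\varphi(u_2)W(n_2) \le K\varphi(u_2)W(n_1) + K\varphi(u_2)(W(n_2) - W(n_1)) \le 1/2^3 + K\varphi(u_2)(W(n_2) - W(n_1))$, and so we have
\[
\varphi(u_2)\big(W(n_2) - W(n_1)\big) \le 1/2^2 \quad\text{and}\quad \varphi(u_2)\big(W(n_2) - W(n_1)\big) > 1/(K2^3),
\]
which is what is required in step two. The further induction process is similar. Letting
\[
x(j) = \sum_{i=1}^\infty \frac{u_i}{v(j)}\,\chi_{(n_{i-1},n_i]}(j),
\]
we have that $x^* = x$ by the assumption that $v$ is increasing, and in view of (5.1), $I(x) \le 1$, while for any $0 < \lambda < 1$ there is $i_0$ so that $1/\lambda > 1 + 1/i$ for all $i \ge i_0$, and again by (5.1),
\[
I(x/\lambda) \ge \sum_{i=i_0}^\infty \sum_{j=n_{i-1}+1}^{n_i} \varphi\big((1 + 1/i)u_i\big)w(j)
\ge \sum_{i=i_0}^\infty \sum_{j=n_{i-1}+1}^{n_i} 2^i\varphi(u_i)w(j) = \infty.
\]
Hence $\|x\|_d = 1$. In view of the construction of $x$ we have that $\|\sum_{j=n}^\infty x(j)e_j\|_d = 1$ for all $n \in \mathbb{N}$. Thus by the Fatou property of the norm $\|\cdot\|_d$, we can find a disjoint sequence $(x_i)$ of “long blocks” $x_i = \chi_{(a_{i-1},a_i]}x$ such that $x = \sum_{i=1}^\infty x_i$ and for every $i \in \mathbb{N}$,
\[
1 - 1/i \le \|x_i\|_d \le 1 \quad\text{and}\quad \Big\|\sum_{i=1}^\infty x_i\Big\|_d = \|x\|_d = 1.
\]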
We conclude by observing that the norm closure of the linear span of (xi ) is isomorphic to c0 in d0 (w, v, ϕ), and its order closure is isomorphic to ∞ in d(w, v, ϕ). For I = (0, a), a ∞, the proof is similar. The implications (α) ⇒ (ii) and (i) ⇒ (ii) under suitable conditions were proved in Proposition 4.15. Since Λ0ϕ,w,v (I ) is separable the implication (ii) ⇒ (iii) is clear, and (iii) ⇒ (iii ) is trivial. The implication (iii ) ⇒ (ii) is a classical fact for order complete Banach lattices [28, Proposition 1.a.7], and extends to the L-convex quasi-Banach case by convexification [24]. Note that the Fatou property implies order completeness. Thus (ii)–(iii ) are equivalent. If T : c0 → Λ0ϕ,w,v (I ) is a quasi-normed latticeembedding, and (un ) is the image of the unit basis (en ) of c0 , then by the Fatou property u = n un exists in Λϕ,w,v (I ), and T extends to a lattice embedding of ∞ in Λϕ,w,v (I ) (see [2, Theorem 14.4]) which proves (iii ) ⇒ (iv). If c0 does not embeds as sublattice in a Banach lattice E then this space has the Fatou property (see [2, Theorem 14.12]). This extends to the L-convex quasi-Banach lattice case by convexification. Hence (iv) implies (ii), because every positive element in Λϕ,w,v (I ) is the supremum of a sequence of positive elements in Λ0ϕ,w,v (I ). For I = N, the equivalence (a) ⇔ (ii) results from the Fatou property, and the equivalence of (b) and (iv) results from [24, Lemma 2.1 and Theorem 2.2]. 2 Corollary 5.2. Let w be decreasing and regular. Let Mϕ,w (I ) be the corresponding Orlicz– Marcinkiewicz space over I . Then the following conditions are equivalent. (1) (2) (3) (4)
0 (I ) does not contain an order isomorphic copy of c . Mϕ,w 0 Mϕ,w (I ) does not contain an isomorphic copy of ∞ . ϕ satisfies 2 (resp., 02 , ∞ 2 ) for I = (0, ∞) (resp., I = N, I = (0, a), a < ∞). 0 =M . Mϕ,w ϕ,w
Proof. The assumptions on $w$ ensure that the Minkowski functional in the Orlicz–Marcinkiewicz space is a quasi-norm (see Corollary 4.11). Moreover the regularity condition on $w$ implies that $W$ is finite valued and that $\alpha_W>0$ by Proposition 4.10. Hence $W(\infty)=\infty$, and thus all equivalences follow from Theorem 5.1. □
Let us recall now classical tools in the geometry of Orlicz spaces. Given an Orlicz function $\varphi$ and $b\in(0,\infty)$, let $\varphi_b$ be the function $\varphi$ scaled at $b$, defined by
$$\varphi_b(t)=\frac{\varphi(bt)}{\varphi(b)},\qquad t\ge 0.$$
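As an illustrative aside (a sketch, not part of the original argument), a few identities follow directly from the definition of the scaled function and may help in reading the later computations. Here $c>0$ is an auxiliary scaling parameter introduced only for this sketch, and $u_p(t)=t^p$ is assumed to be the power function used later in the text. The display uses the `align*` environment of the amsmath package.

```latex
% Direct consequences of \varphi_b(t) = \varphi(bt)/\varphi(b), for arbitrary b, c > 0.
% Assumption: \varphi(0) = 0, as for any Orlicz function; u_p(t) = t^p.
\begin{align*}
\varphi_b(0) &= 0, \qquad \varphi_b(1) = 1, \\
\varphi(bt) &= \varphi(b)\,\varphi_b(t), \\
(\varphi_b)_c(t) &= \frac{\varphi_b(ct)}{\varphi_b(c)}
                  = \frac{\varphi(bct)}{\varphi(bc)} = \varphi_{bc}(t), \\
(u_p)_b(t) &= \frac{(bt)^p}{b^p} = t^p = u_p(t).
\end{align*}
```

In particular a power function is invariant under scaling, and scaling twice amounts to scaling once at the product of the two parameters.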
Let $\Gamma$ be the set of all normalized $r$-convex functions on $[0,1]$, that is, of $r$-convex functions $\psi:[0,1]\to[0,1]$ with $\psi(0)=0$ and $\psi(1)=1$. These functions are continuous on $[0,1)$ and possibly left discontinuous at the point 1. Given an Orlicz function $\psi$ we shall say that $\psi\in\Gamma$ whenever its restriction $\psi|_{[0,1]}\in\Gamma$. In particular $\Gamma$ contains all the scaled functions $\varphi_b$, for an $r$-convex Orlicz function $\varphi$ and $b\in(0,\infty)$. For every $s\in(0,1)$, the restrictions of the functions of $\Gamma$ to the interval $[0,s]$ form a compact convex set in $C(0,s)$, the space of all continuous functions on $[0,s]$ (see [24,27]). It follows that on $\Gamma$ the topology of pointwise convergence on
A. Kamińska, Y. Raynaud / Journal of Functional Analysis 257 (2009) 271–331
$[0,1]$ coincides with the topology of uniform convergence on every interval $[0,s]$, $0\le s<1$. This topology can be defined by the family of pseudometrics
$$d_s(\psi_1,\psi_2)=\|\psi_1-\psi_2\|_{C[0,s]},\qquad 0<s<1,$$
but in fact it is metrizable and can be defined by the metric
$$d(\psi_1,\psi_2)=\max_{t\in[0,1]}(1-t)\,|\psi_1(t)-\psi_2(t)|.$$
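To see what the weight $(1-t)$ in this metric accomplishes, one can look at a simple example (a sketch added for illustration). Take $u_n(t)=t^n$ for integers $n\ge 1$, and let $\psi_\infty$ denote the function equal to $0$ on $[0,1)$ and to $1$ at $t=1$. The symbols $u_n$ and the choice of this sequence are assumptions of the sketch. The sequence converges pointwise on $[0,1]$ to $\psi_\infty$ but not uniformly on $[0,1)$, while it does converge in the metric $d$:

```latex
% u_n(t) = t^n, \psi_\infty = 0 on [0,1), \psi_\infty(1) = 1 (illustrative choice).
% The maximum of (1-t)t^n on [0,1] is attained at t = n/(n+1).
\begin{align*}
\sup_{0\le t<1}\bigl|u_n(t)-\psi_\infty(t)\bigr| &= \sup_{0\le t<1} t^n = 1
  \quad\text{for every } n, \\
d(u_n,\psi_\infty) &= \max_{t\in[0,1]} (1-t)\,t^n
  = \frac{1}{n+1}\Bigl(\frac{n}{n+1}\Bigr)^{n} \le \frac{1}{n+1} \longrightarrow 0 .
\end{align*}
```

The factor $(1-t)$ thus damps the possible jump at the endpoint $t=1$, which is exactly where functions of $\Gamma$ may fail to be continuous.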
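The set $\Gamma$ will be further equipped with the above metric. Every $\psi\in\Gamma$ can be extended to a possibly degenerate Orlicz function with values in $[0,+\infty]$ by setting $\psi(t)=+\infty$ for $t>1$. Then we can define $\ell_\psi$ and $h_\psi$ in the standard way. If $\psi$ is the restriction of an Orlicz function $\psi_1$, the definition is consistent since for any Orlicz function $\psi$ with $\psi(1)=1$ the spaces $h_\psi$ and $\ell_\psi$ depend only on the restriction of $\psi$ to the segment $[0,1]$. For example the function $\psi_\infty(t)=0$, $t<1$, $\psi_\infty(1)=1$ belongs to $\Gamma$ and $h_{\psi_\infty}=c_0$, $\ell_{\psi_\infty}=\ell_\infty$.
Let us recall now three categories of subsets of $\Gamma$, at zero, at infinity and on $\mathbb{R}_+$, associated to a given $r$-convex Orlicz function $\varphi$. For $0<A<\infty$, let
$$E_{\varphi,A}=\overline{\{\varphi_b: 0<b<A\}},\qquad C_{\varphi,A}=\overline{\operatorname{conv}}\,E_{\varphi,A},\qquad E_\varphi=\bigcap_{A>0}E_{\varphi,A},\qquad C_\varphi=\bigcap_{A>0}C_{\varphi,A};$$
$$E^\infty_{\varphi,A}=\overline{\{\varphi_b: b>A\}},\qquad C^\infty_{\varphi,A}=\overline{\operatorname{conv}}\,E^\infty_{\varphi,A},\qquad E^\infty_\varphi=\bigcap_{A>0}E^\infty_{\varphi,A},\qquad C^\infty_\varphi=\bigcap_{A>0}C^\infty_{\varphi,A};$$
$$E_\varphi(0,\infty)=\overline{\{\varphi_b: b>0\}},\qquad C_\varphi(0,\infty)=\overline{\operatorname{conv}}\,E_\varphi(0,\infty).$$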
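The following worked example is a sketch added for illustration and is not taken from the original text. It assumes the hypothetical two-power function $\varphi(t)=t^p+t^q$ with $0<p<q<\infty$ and writes $u_p(t)=t^p$. The scaled functions can be computed explicitly, and their limits at $b\to 0$ and $b\to\infty$ show which functions appear in the sets at zero and at infinity:

```latex
% Hypothetical example: \varphi(t) = t^p + t^q with 0 < p < q < \infty.
\begin{align*}
\varphi_b(t) &= \frac{b^p t^p + b^q t^q}{b^p + b^q}
             = \frac{t^p + b^{\,q-p}\, t^q}{1 + b^{\,q-p}}, \\
\lim_{b\to 0^+}\varphi_b(t) &= t^p, \qquad
\lim_{b\to\infty}\varphi_b(t) = t^q \qquad (0\le t\le 1).
\end{align*}
```

Since pointwise convergence on $[0,1]$ gives convergence in $\Gamma$, under these assumptions $u_p$ belongs to the set at zero and $u_q$ to the set at infinity. This matches the fact that such a $\varphi$ behaves like $t^p$ near zero and like $t^q$ for large arguments.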
Here $\operatorname{conv}X$, where $X\subset\Gamma$, denotes the set of all convex combinations of functions in $X$, and $\overline{X}$ is the closure of $X$ in the topology of $\Gamma$. All of these sets are non-empty compact subsets of $\Gamma$ [24,27,29,33]. One can show that if $\varphi$ does not verify $\Delta_2$ at 0 or $\infty$ then $E_\varphi$, resp. $E^\infty_\varphi$, contains the degenerate function $\psi_\infty$, which justifies the introduction of the set $\Gamma$. It has been proved (Lemma 7 in [23], Lemma 3.9 in [24]) that $C_\varphi=\overline{\operatorname{conv}}\,E_\varphi$ on the interval $[0,1]$. We can prove analogously that $C^\infty_\varphi=\overline{\operatorname{conv}}\,E^\infty_\varphi$.
Recall now how the above sets are related to the Matuszewska–Orlicz indices and to isomorphic subspaces of Orlicz spaces. The proposition below summarizes all known results. In the case of a convex Orlicz function $\varphi$, they are due to Lindenstrauss and Tzafriri [27,29], and Nielsen [33]. In our case, when $\varphi$ does not need to be convex but $\alpha_\varphi>0$, we extend these properties using the convexification procedure (compare Proposition 3.6 in [24]).
Proposition 5.3. Let $\varphi$ be an Orlicz function with $\alpha_\varphi>0$ and $0<p\le\infty$. Then
(a) $p\in[\alpha^0_\varphi,\beta^0_\varphi]$ if and only if $u_p$ is equivalent at zero to a function in $C_\varphi$. This is also equivalent for $h_\varphi$ to contain a closed subspace isomorphic to $\ell_p$ ($c_0$ if $p=\infty$).
(b) $p\in[\alpha^\infty_\varphi,\beta^\infty_\varphi]$ if and only if $u_p$ is equivalent at zero to a function in $C^\infty_\varphi$, which is equivalent to $\ell_p$ ($c_0$ if $p=\infty$) being order isomorphic to a sublattice of $L^0_\varphi(0,a)$, $a<\infty$.
(c) The function $u_p$ is equivalent at zero to a function in $C_\varphi(0,\infty)$ if and only if $p\in[\alpha^\infty_\varphi,\beta^0_\varphi]$ when $\beta^\infty_\varphi<\alpha^0_\varphi$, and $p\in[\alpha^0_\varphi,\beta^0_\varphi]\cup[\alpha^\infty_\varphi,\beta^\infty_\varphi]$ when $\beta^\infty_\varphi\ge\alpha^0_\varphi$. This is also equivalent to embedding order isomorphically $\ell_p$ ($c_0$ if $p=\infty$) into $L^0_\varphi(0,\infty)$.
Given the sequence $(y_n)$ in $L_\phi(I)$ define the Orlicz functions $\varphi^{(y_n)}$ associated to $(y_n)$ as
$$\varphi^{(y_n)}(t)=\int_I\varphi\bigl(t\,|y_n(s)|\,v(s)\bigr)\,w(s)\,ds,\qquad t\ge 0.\tag{5.2}$$
Notice that if $(y_n)$ is a block sequence then it is a basic sequence equivalent to the unit vectors in the sequence Musielak–Orlicz space $\ell_{(\varphi^{(y_n)})}$. In fact, for any real-valued sequence $\lambda=(\lambda_n)$,
$$I_\phi\Bigl(\sum_{n=1}^\infty\lambda_ny_n\Bigr)=\sum_{n=1}^\infty\varphi^{(y_n)}(|\lambda_n|)=I_{(\varphi^{(y_n)})}\Bigl(\sum_{n=1}^\infty\lambda_ne_n\Bigr),$$
which yields that $\bigl\|\sum_{n=1}^\infty\lambda_ny_n\bigr\|_\phi=\bigl\|\sum_{n=1}^\infty\lambda_ne_n\bigr\|_{\ell_{(\varphi^{(y_n)})}}$. We also observe that if the block is normalized, that is, $1=\|y_n\|_\phi=I_\phi(y_n)$, then $\varphi^{(y_n)}\in C_{\varphi,A_n}$, where $A_n=\operatorname{ess\,sup}\{|y_n(s)|v(s): s\in I\}$.
We finish this section with a general result which states that there exists $0<p<\infty$ such that $\ell_p$ is order isomorphic to a sublattice of $\Lambda^0_{\varphi,w,v}(I)$ as well as to a sublattice of $L_\varphi(0,\infty)$. We start with a simple but very useful proposition.
Proposition 5.4. Let $(y_n)$ be a normalized sequence in $L^0_\phi(I)$ which converges to zero in measure, and such that $\varphi^{(y_n)}\to\psi$ for some $\psi\in\Gamma$ (pointwise on $[0,1]$). Then some subsequence of $(y_n)$ is equivalent to the unit vector basis of $h_\psi$. The equivalence constant can be bounded from above independently of the sequence $(y_n)$.
Proof. Since $\varphi$ is $r$-convex for some $0<r\le 1$, the quasi-norm on $L_\phi(I)$ is an $r$-norm (see Proposition 4.2). Using the order continuity of the $y_n$'s and their convergence to zero in measure we may find by Proposition 2.5 a subsequence $(y_{n_k})$ and a sequence $(y_k)$ of disjoint functions of the form $y_k=\chi_{A_k}y_{n_k}$ such that $\sum_k\|y_{n_k}-y_k\|^r_\phi<\infty$. Since $\|y_{n_k}-y_k\|_\phi\to 0$ and $\phi$ is $r$-convex
we have $I_\phi(t(y_{n_k}-y_k))\to 0$ for every $t>0$, that is $\varphi^{(y_{n_k}-y_k)}(t)\to 0$ for every $t>0$. Noticing that $\varphi^{(y_{n_k})}=\varphi^{(y_k)}+\varphi^{(y_{n_k}-y_k)}$ since $y_k$ and $y_{n_k}-y_k$ are disjoint, we see that $\varphi^{(y_k)}(t)\to\psi(t)$ for all $t\in[0,1]$. Letting $z_k=y_k/\|y_k\|_\phi$ and $\psi_k=\varphi^{(z_k)}$, the disjoint sequence $(z_k)$ in $L_\phi(I)$ is equivalent to the unit vector basic sequence of the sequential Musielak–Orlicz space $\ell_{(\psi_k)}$. On the other hand since
$$\|y_k-z_k\|_\phi=\Bigl|1-\frac{1}{\|y_k\|_\phi}\Bigr|\,\|y_k\|_\phi=\bigl|1-\|y_k\|_\phi\bigr|\to 0,$$
we may assume, passing to a subsequence, that $\sum_k\|y_k-z_k\|^r_\phi<\infty$, and thus $(z_k)$ is almost 1-equivalent to $(y_k)$. We have $1\ge\|y_k\|_\phi\to 1$. So for every $0\le t\le s<1$, $\|y_k\|_\phi\ge s$ for sufficiently large $k$. Then
$$\varphi^{(y_k)}(t)\le\psi_k(t)=\varphi^{(y_k)}\bigl(t/\|y_k\|_\phi\bigr)\le\varphi^{(y_k)}(t/s),$$
and passing to the limit when $k\to\infty$,
$$\psi(t)\le\liminf_k\psi_k(t)\le\limsup_k\psi_k(t)\le\psi(t/s).$$
Since $\psi$ belongs to $\Gamma$, it is continuous on $[0,1)$. Thus letting $s\uparrow 1$ in the preceding inequalities we obtain that $\psi_k(t)\to\psi(t)$ for every $t\in[0,1)$. This remains trivially true for $t=1$ since $\psi_k(1)=\psi(1)=1$, so $\psi_k\to\psi$ pointwise on $[0,1]$ and thus in every $C[0,s]$, $0<s<1$, or equivalently for the distance $d$ on $\Gamma$. By well-known facts (Lemma 4 in [23] and Lemma 3.4 in [24]), there is a subsequence $(\psi_{j_k})$ such that the unit vectors in $\ell_{(\psi_{j_k})}$ are equivalent (in fact almost 1-equivalent) to the unit basis in $h_\psi$. Therefore a subsequence of $(z_k)$, and thus the corresponding subsequences of $(y_k)$ and of $(y_{n_k})$, are almost 1-equivalent to the unit basis in $h_\psi$. □
Theorem 5.5. Every semi-normalized sequence $(y_n)$ in $\Lambda^0_{\varphi,w,v}(I)$ which converges to zero in measure has a subsequence equivalent to a disjoint sequence in $\Lambda^0_{\varphi,w,v}(I)$ and to the unit vector basis of the Orlicz space $h_\psi$, for some $\psi\in C_\varphi(0,\infty)$ (this space $h_\psi$ is thus also order isomorphic to a sublattice of $L^0_\varphi(0,\infty)$). Consequently, there is a sequence of blocks of $(y_n)$ which is equivalent to the unit basis of an $\ell_p$-space, which is itself isomorphic to a sublattice of $L^0_\varphi(0,\infty)$.
Proof. We can assume that $(y_n)$ is a normalized sequence in $\Lambda^0_{\varphi,w,v}(I)$. By Proposition 2.5 and Theorem 2.7, since $\Lambda_{\varphi,w,v}(I)$ is a symmetrization of $L_\phi(I)$, we can extract a subsequence $(y_{n_j})$ of $(y_n)$ which is equivalent to a disjoint sequence in $\Lambda_{\varphi,w,v}(I)$ and also to some sequence of functions $(g_j)$ in $(L_\phi(I))_b=L^0_\phi(I)$, converging to zero in measure. We may suppose that $(g_j)$ is normalized. Let $\psi_j(t)=\varphi^{(g_j)}(t)=\int_I\varphi(tg_j(s)v(s))\,w(s)\,ds$ be the Orlicz functions associated to the sequence $(g_j)$. It is clear that $\psi_j\in C_\varphi(0,\infty)$ since
$$\psi_j(t)=\int_I\varphi_{g_j(s)v(s)}(t)\,\varphi\bigl(g_j(s)v(s)\bigr)\,w(s)\,ds,\qquad t\ge 0,\tag{5.3}$$
and $\int_I\varphi(g_jv)\,w=1$. By compactness of $C_\varphi(0,\infty)$ there exist a subsequence $(j_k)$ and $\psi\in C_\varphi(0,\infty)$ such that $\psi_{j_k}=\varphi^{(g_{j_k})}\to\psi$ in the metrics of $C(0,s)$ for every $0<s<1$. By Proposition 5.4 some subsequence of $(g_{j_k})$ is equivalent to the unit basis in $h_\psi$. If $\varphi$ is convex then by [33, Theorem 1.1], $h_\psi$ is order isomorphic to a sublattice of $L_\varphi(0,\infty)$. By the convexification method, this remains true when $\varphi$ is only supposed to be $r$-convex for some $r>0$. Finally the Orlicz space $h_\psi$ contains an $\ell_p$-sublattice for every $p\in[\alpha^0_\psi,\beta^0_\psi]$ (see [23,27]), which yields the last conclusion of the theorem. □
Remark 5.6. The equivalence constants of the basic sequences given by Theorem 5.5 are estimated from above independently of the sequence $(y_n)$ (as in Proposition 5.4 and Theorem 2.7).
In the subsequent sections we shall try to improve the conclusions of Theorem 5.5, and to find converse statements, for different kinds of “interval” $I$, at the cost of adding some new hypotheses on the weights or the Orlicz function.
6. Isomorphic copies of $h_\psi$ and $\ell_p$ in $\Lambda_{\varphi,w,v}(I)$ over finite interval $(0,a)$
This part is devoted to spaces $\Lambda_{\varphi,w,v}(0,a)$ with $a<\infty$. We assume that $W(t_0)<\infty$ for some $t_0\in(0,a)$, unless stated otherwise.
Theorem 6.1. Every semi-normalized disjoint sequence in $\Lambda^0_{\varphi,w,v}(0,a)$ with $a<\infty$ has a subsequence which is equivalent to the unit basis of an Orlicz sequence space $h_\psi$, with $\psi$ in $C^\infty_\varphi$.
Proof. Let $(y_n)$ be a normalized disjoint sequence in $\Lambda^0_{\varphi,w,v}(0,a)$. Then $y_n\to 0$ in measure. Since $\Lambda_{\varphi,w,v}(0,a)$ is a symmetrization of $L_\phi(0,a)$, by Theorem 2.7, we can extract a subsequence $(y_{n_j})$ of $(y_n)$ that is equivalent to some sequence of non-negative, decreasing functions $(g_j)$ in $L^0_\phi(0,a)$ such that $a_j:=|\operatorname{supp}g_j|\to 0$. As in the proof of Theorem 5.5, we consider the Orlicz functions $\psi_j(t)=\int_0^a\varphi(tg_j(s)v(s))\,w(s)\,ds$ associated with the sequence $(g_j)$. We can modify slightly $g_j$ in such a way that
$$\inf\{g_j(s)v(s): s\in\operatorname{supp}g_j\}\to\infty.$$
In fact, setting $A^N_j=\{0<g_jv\le N\}$, and by $\operatorname{supp}g_j\subset[0,a_j]$, we have for every $t>0$:
$$I_\phi\bigl(t\chi_{A^N_j}g_j\bigr)\le\varphi(tN)\int_0^{a_j}w(s)\,ds=\varphi(tN)W(a_j)\to 0\quad\text{when }j\to\infty,$$
since $W$ is finite and continuous on $[0,t_0]$ and $a_j\to 0$. Thus $\|\chi_{A^N_j}g_j\|_\phi\to 0$ as $j\to\infty$. Then we choose a sequence $N_k\uparrow\infty$ of integers and a subsequence $(j_k)$ such that $\|\chi_{A^{N_k}_{j_k}}g_{j_k}\|^r_\phi<2^{-k}$. Hence if we remove from $(g_{j_k})$ the part $(\chi_{A^{N_k}_{j_k}}g_{j_k})$, we obtain a new sequence $(\tilde g_k)$ which still converges to zero in measure and is equivalent to a subsequence of $(y_n)$. The new functions $\tilde g_k$ need not be decreasing any more, but this does not harm the reasoning that follows. Observe that $1-2^{-k}\le\|\tilde g_k\|_\phi\le 1$. By normalizing $(\tilde g_k)$ in $L_\phi(0,a)$, we obtain an equivalent sequence $(\hat g_k)$ verifying moreover $|\hat g_k|\ge|\tilde g_k|$, and thus $\inf\{\hat g_k(s)v(s): s\in\operatorname{supp}\hat g_k\}\ge N_k\to\infty$. Then using (5.3) we see that
$$\psi_j\in\overline{\operatorname{conv}}\Bigl\{\frac{\varphi(tb)}{\varphi(b)}: b\ge N_j\Bigr\}\subset C^\infty_{\varphi,N_j}.$$
By compactness of each $C^\infty_{\varphi,N_j}$ and the fact that $N_j\to\infty$, there exist a subsequence $(j_k)$ and $\psi\in C^\infty_\varphi$ such that $\psi_{j_k}\to\psi$ in the metrics of $C(0,s)$ for every $0<s<1$. It follows that $\psi_{j_k}(t)\to\psi(t)$ for all $t\in[0,1]$. Then by Proposition 5.4, some subsequence of $(g_j)$ is equivalent to the unit basis in $h_\psi$. □
Corollary 6.2. If the Orlicz space $h_\psi$ is order isomorphic to a sublattice of $\Lambda^0_{\varphi,w,v}(0,a)$, $a<\infty$, then $\psi$ is equivalent at zero to a function in $C^\infty_\varphi$; consequently $h_\psi$ is also order isomorphic to a sublattice of $L^0_\varphi(0,a)$.
Proof. By the assumption there exists a disjoint sequence $(y_n)$ in $\Lambda^0_{\varphi,w,v}(0,a)$ which is equivalent to the unit vector basis in $h_\psi$. By Theorem 6.1 there are a subsequence $(y_{n_k})$ and $\psi_0\in C^\infty_\varphi$ such that $(y_{n_k})$ is equivalent to the unit vector basis in $h_{\psi_0}$. Hence, the unit vectors in both spaces $h_\psi$ and $h_{\psi_0}$ are equivalent. This yields that $\psi$ is equivalent at zero to $\psi_0$ [24,27] and completes the proof. □
Corollary 6.3. If for some $0<p\le\infty$, $\ell_p$ ($c_0$ in case when $p=\infty$) is order isomorphic to a subspace of $\Lambda^0_{\varphi,w,v}(0,a)$, $a<\infty$, then $p\in[\alpha^\infty_\varphi,\beta^\infty_\varphi]$.
Proof. Let $\ell_p$, $0<p<\infty$, be an order isomorphic subspace of $\Lambda^0_{\varphi,w,v}(0,a)$. Then by Corollary 6.2, $u_p$ is equivalent at zero to a function in $C^\infty_\varphi$. But it follows by Proposition 5.3 that
$p\in[\alpha^\infty_\varphi,\beta^\infty_\varphi]$. If $c_0$ embeds isomorphically in $\Lambda^0_{\varphi,w,v}(0,a)$, then by Theorem 5.1, $\Delta_2^\infty$ is not satisfied and so $\beta^\infty_\varphi=\infty$. □
The next corollary generalizes a similar result in [10] proved for $\Lambda_{p,w}(0,1)$ in case of decreasing weights $w$ and $1\le p<\infty$.
Corollary 6.4. Every closed infinite dimensional sublattice of $\Lambda^0_{\varphi,w,v}(0,a)$ contains an order isomorphic copy of $\ell_p$ for some $p\in[\alpha^\infty_\varphi,\beta^\infty_\varphi]$. In particular, for every sequence of disjointly supported normalized elements $f_n\in\Lambda_{p,w}(0,a)$, $a<\infty$, $0<p<\infty$, the closed linear span $[f_n]$ contains a sublattice isomorphic to $\ell_p$.
Proof. If $X$ is an infinite dimensional closed sublattice of $\Lambda^0_{\varphi,w,v}(0,a)$, it contains a sequence of non-negative normalized and pairwise disjoint elements $(h_n)$. By Theorem 6.1, a subsequence $(h_{n_j})$ is equivalent to the unit basis of $h_\psi$ for some $\psi\in C^\infty_\varphi$. Then by Theorem 3.11 in [24], one can find a block basis of $(h_{n_j})$ which is equivalent to the unit basis of $\ell_p$ (or $c_0$) for some $0<p<\infty$. However this block basis also spans a sublattice in $\Lambda^0_{\varphi,w,v}(0,a)$ and thus it is order isomorphic to $\ell_p$ (or $c_0$). Finally $p\in[\alpha^\infty_\varphi,\beta^\infty_\varphi]$ by Corollary 6.3. □
Remark 6.5. If we did not assume in the above statements of this section the local integrability of $w$ near zero, that is $W(t)<\infty$ for $t$ close to zero, then we would obtain weaker results. Indeed, in Theorem 6.1 and Corollary 6.2, we would obtain only that $\psi\in C_\varphi(0,\infty)$ and that $\psi$ is equivalent at zero to a function in $C_\varphi(0,\infty)$, respectively. In Corollaries 6.3 and 6.4 we would only get that $p\in[\alpha_\varphi,\beta_\varphi]$ if $\beta^\infty_\varphi<\alpha^0_\varphi$ and $p\in[\alpha^\infty_\varphi,\beta^\infty_\varphi]\cup[\alpha^0_\varphi,\beta^0_\varphi]$ if $\beta^\infty_\varphi\ge\alpha^0_\varphi$ by Proposition 5.3.
The next two examples show that without the assumption that $W(t)<\infty$ near zero Corollaries 6.3 and 6.4 may not hold.
Example 6.6. Let $a=1$, $\varphi(t)=t+t^2$, $v(t)=t$ and $w(t)=t^{-3/2}$. Then
$$I_{\varphi,v,w}(f)=\int_0^1f^*(t)\,t^{-1/2}\,dt+\int_0^1f^*(t)^2\,t^{1/2}\,dt=\|f\|_{L_{2,1}(0,1)}+\|f\|^2_{L_{4/3,2}(0,1)},$$
where $L_{2,1}(0,1)$ and $L_{4/3,2}(0,1)$ are the classical Lorentz spaces $L_{p,q}(I):=\Lambda_{q,w}(I)$ with $w(t)=t^{q/p-1}$, $0<p,q<\infty$. Since $L_{2,1}(0,1)\subset L_{4/3,2}(0,1)$ with bounded inclusion map (see e.g. [28]), we see that $\Lambda_{\varphi,v,w}(0,1)=L_{2,1}(0,1)$ with equivalent norms. This space contains order isomorphic copies only of $\ell_1$ [10], while $\alpha^\infty_\varphi=\beta^\infty_\varphi=2$ since $\varphi$ is equivalent to the square function for large arguments.
Example 6.7. Torchinsky spaces [35]. Let $\Lambda_{\varphi,v;T}(0,1):=\Lambda_{\varphi,w,v}(0,1)$ with $v$ increasing and $w(t)=1/t$, $t\in(0,1)$. We assume that $0<\alpha_v\le\beta_v<\infty$ and, as usual, $\alpha_\varphi>0$. In particular $v$ satisfies the condition $\Delta_2$. Then it is easy to see that the non-triviality condition (4.1) as well as the quasi-normability criterion (4.3) are satisfied. Let us show that if $h_\psi$ order embeds into $\Lambda^0_{\varphi,v;T}$ then $\psi$ is equivalent at zero to a function in $C_{\varphi,A}$, for some $A>0$. This will imply that $h_\psi$ embeds as a lattice into $h_\varphi$ (Theorem 4.a.8 in [27], Theorem 8 in [23], Theorem 10 in [24]). In particular if $\ell_p$ order embeds into $\Lambda^0_{\varphi,v;T}(0,1)$ then $\ell_p$ is isomorphic to a sublattice of $h_\varphi$ and so $p\in[\alpha^0_\varphi,\beta^0_\varphi]$.
In fact if $f$ is a norm one function in $\Lambda_{\varphi,v;T}(0,1)$ we have
$$1=\int_0^1\varphi\bigl(v(s)f^*(s)\bigr)\frac{ds}{s}\ge\int_{t/2}^t\varphi\bigl(v(t/2)f^*(t)\bigr)\frac{ds}{s}=\varphi\bigl(v(t/2)f^*(t)\bigr)\ln 2\ge\varphi\bigl(K^{-1}v(t)f^*(t)\bigr)\ln 2,$$
where $K$ is a $\Delta_2$ constant for $v$. Hence $\|vf^*\|_\infty\le B:=K\varphi^{-1}(1/\ln 2)$. Let now $(f_n)$ be a normalized disjoint sequence in $\Lambda^0_{\varphi,v;T}(0,1)$ which is equivalent to the unit vectors in $h_\psi$. Then by Theorem 2.7 we can find in $L^0_\phi(0,1)$ a sequence $(g_k)$ of non-negative decreasing functions which is equivalent to the unit basis of $h_\psi$ and to a subsequence of $(f_n)$. We can normalize it in $L_\phi(0,1)$, and then $\|g_k\|_\phi=\|g_k\|_\Lambda=1$. Thus for every $k$, $\|g_k\|_\infty\le B$. Then the associated Orlicz functions $\varphi^{(g_k)}$ belong to $C_{\varphi,B}$. By compactness of the class $C_{\varphi,B}$, we can assume that $\varphi^{(g_k)}\to\omega$ in the topology of $\Gamma$, for some $\omega\in C_{\varphi,B}$. Then by Proposition 5.4, a subsequence of $(g_k)$ is equivalent to the unit basis of $h_\omega$. Thus the unit bases in $h_\omega$ and $h_\psi$ are equivalent, and so $h_\psi=h_\omega$ with equivalent norms, which implies that the functions $\psi$ and $\omega$ are equivalent at zero [24,27].
Remark 6.8. It can be shown that conversely $h_\varphi$ embeds into $\Lambda^0_{\varphi,v;T}(0,1)$ [35, Lemma 19].
Now we intend to give converse statements to the preceding ones. For this we introduce a new hypothesis on the weight $v$.
Theorem 6.9. Assume that $v$ is increasing. Then for any $\psi\in C^\infty_\varphi$ there exists a block sequence in $\Lambda^0_{\varphi,w,v}(0,a)$, $a<\infty$, equivalent to the unit vector basis in $h_\psi$. Consequently, $h_\psi$ is isomorphic to a sublattice of $\Lambda^0_{\varphi,w,v}(0,a)$.
For proving this theorem we need the following lemma.
Lemma 6.10. If $\psi\in C^\infty_\varphi$ then for every $\varepsilon>0$ and $s\in(0,a)$ there exists a non-negative decreasing function $g\in L^0_\phi(0,a)$ such that
(1) The support of $g$ is contained in the interval $[0,s]$;
(2) $\|g\|_\phi=1$;
(3) $d(\varphi^{(g)},\psi)<\varepsilon$.
Proof. Since $\operatorname{conv}E^\infty_\varphi$ is dense in $C^\infty_\varphi$ we may suppose w.l.o.g. that $\psi\in\operatorname{conv}E^\infty_\varphi$. Let $\theta_i\in E^\infty_\varphi$, $0<\beta_i\le 1$, $i=1,\dots,l$, be such that $\psi=\sum_{i=1}^l\beta_i\theta_i$. Then for every $i=1,\dots,l$, we choose a sequence $(b_i^{(k)})_{k=1}^\infty$ such that for all $i=1,\dots,l$,
$$\lim_{k\to\infty}d\bigl(\varphi_{b_i^{(k)}},\theta_i\bigr)=0,$$
and $\inf_{1\le i\le l}b_i^{(k)}\to\infty$ as $k\to\infty$. Let $m_0$ be such that $d(\varphi_{b_i^{(k)}},\theta_i)<\varepsilon$ for all $i=1,\dots,l$ and $k\ge m_0$. We can choose $0=r_0<r_1<\dots<r_l\le s$ and $k_1,\dots,k_l\ge m_0$ such that $b_1^{(k_1)}\ge\dots\ge b_l^{(k_l)}$ and for all $i$,
$$\beta_i=\varphi\bigl(b_i^{(k_i)}\bigr)\bigl(W(r_i)-W(r_{i-1})\bigr).$$
In fact, by $\varphi(b_i^{(k)})\to\infty$, we first find $k_1,\dots,k_l\ge m_0$ such that $b_1^{(k_1)}\ge\dots\ge b_l^{(k_l)}$ and
$$\frac{\beta_1}{\varphi(b_1^{(k_1)})}+\dots+\frac{\beta_l}{\varphi(b_l^{(k_l)})}\le W(s),$$
and then by continuity of $W$ we choose $0<r_1<\dots<r_l\le s$ such that
$$\int_{r_{i-1}}^{r_i}w=\frac{\beta_i}{\varphi(b_i^{(k_i)})}.$$
We have then
$$\sum_{i=1}^l\beta_i=\sum_{i=1}^l\varphi\bigl(b_i^{(k_i)}\bigr)\bigl(W(r_i)-W(r_{i-1})\bigr)=1.$$
Setting
$$g=\sum_{i=1}^lb_i^{(k_i)}\,v^{-1}\chi_{(r_{i-1},r_i)},$$
$g$ is decreasing, $\|g\|_\phi=\|g\|_\Lambda=1$ and $g$ belongs to $L^0_\phi(0,a)$ since for all $t>0$,
$$I_\phi(tg)=\sum_{i=1}^l\varphi\bigl(tb_i^{(k_i)}\bigr)\bigl(W(r_i)-W(r_{i-1})\bigr)<\infty.$$
Moreover the Orlicz function associated with $g$ is $\varphi^{(g)}=\sum_{i=1}^l\beta_i\varphi_{b_i^{(k_i)}}$. Thus $d(\varphi^{(g)},\psi)\le\max_i d(\varphi_{b_i^{(k_i)}},\theta_i)\le\varepsilon$, which proves the lemma. □
Proof of Theorem 6.9. Using the preceding lemma we find a normalized sequence $(g_n)$ of non-negative decreasing elements in $L^0_\phi(0,a)$ such that $|\operatorname{supp}g_n|\to 0$ and $\varphi^{(g_n)}\to\psi$. Then by Proposition 5.4, a subsequence $(g_{n_j})$ is basic and equivalent to the unit basic sequence of $h_\psi$, and by Theorem 2.7 a further subsequence is equivalent to a semi-normalized sequence $(h_k)$ in $\Lambda^0_{\varphi,w,v}(0,a)$, with $|\operatorname{supp}h_k|\to 0$. By Proposition 2.5 this last sequence is equivalent to a disjoint sequence in $\Lambda^0_{\varphi,w,v}(0,a)$. □
From Corollary 6.2 and Theorem 6.9 we deduce:
Corollary 6.11. Assume that $v$ is increasing. For any $\psi\in\Gamma$, the following assertions are equivalent:
(1) $\Lambda^0_{\varphi,w,v}(0,a)$ contains an order isomorphic copy of $h_\psi$;
(2) $L^0_\varphi(0,a)$ contains an order isomorphic copy of $h_\psi$;
(3) $\psi$ is equivalent at zero to a function in the set $C^\infty_\varphi$.
Now we specialize to order embeddings of $\ell_p$ spaces.
Corollary 6.12. Assume that $v$ is increasing. If $p\in[\alpha^\infty_\varphi,\beta^\infty_\varphi]$, then $\ell_p$ (replaced by $c_0$ in case when $p=\infty$) is order isomorphic to a subspace of $\Lambda^0_{\varphi,w,v}(0,a)$, $a<\infty$.
Proof. By Proposition 5.3, if $0<p<\infty$, then $u_p\in C^\infty_\varphi$. Thus by Theorem 6.9, $\Lambda^0_{\varphi,w,v}(0,a)$ contains an order isomorphic copy of $\ell_p$. If $p=\infty$ then $\beta^\infty_\varphi=\infty$ and so $\varphi$ does not satisfy condition $\Delta_2^\infty$, and thus by Theorem 5.1, $\Lambda^0_{\varphi,w,v}(0,a)$ contains an order copy of $c_0$. □
By Corollaries 6.3 and 6.12 and [29] we obtain our main result in this section, a characterization of order $\ell_p$-copies in $\Lambda^0_{\varphi,w,v}(0,a)$.
Theorem 6.13. Assume that $v$ is increasing, $W(t)<\infty$ for all $t\in(0,a)$ and $a<\infty$. Then the following conditions are equivalent.
(1) $\ell_p$ (replaced by $c_0$ in case when $p=\infty$) is order isomorphic to a sublattice of $\Lambda^0_{\varphi,w,v}(0,a)$.
(2) $\ell_p$ (replaced by $c_0$ in case when $p=\infty$) is order isomorphic to a sublattice of $L^0_\varphi(0,a)$.
(3) $p\in[\alpha^\infty_\varphi,\beta^\infty_\varphi]$.
The analogous result follows instantly for Marcinkiewicz–Orlicz spaces $M^0_{\varphi,w}(0,a)$ over finite interval $(0,a)$.
Corollary 6.14. Let $w$ be a decreasing and regular weight function over $(0,a)$, $a<\infty$. Then the space $\ell_p$, $0<p\le\infty$ (replaced by $c_0$ when $p=\infty$), is order isomorphic to a sublattice of $M^0_{\varphi,w}(0,a)$ if and only if $p\in[\alpha^\infty_\varphi,\beta^\infty_\varphi]$.
7. Isomorphic copies of $h_\psi$ and $\ell_p$ in $\Lambda_{\varphi,w,v}(0,\infty)$ and $d(w,v,\varphi)$
Here we shall consider the spaces $d(w,v,\varphi)$ and $\Lambda_{\varphi,w,v}(0,\infty)$, that is the cases $I=\mathbb{N}$ or $I=(0,\infty)$. We also assume in the entire section that $W(t)<\infty$ for every $t\in I$ and that $W(\infty)=\infty$. The main results in this section are Theorems 7.8 and 7.18, where we give a complete characterization of isomorphic copies and order isomorphic copies of $\ell_p$ in the spaces $d_0(w,v,\varphi)$ and $\Lambda^0_{\varphi,w,v}(0,\infty)$, respectively.
Theorem 7.1. Every semi-normalized sequence $(x_i)$ in $d_0(w,v,\varphi)$ or $\Lambda^0_{\varphi,w,v}(0,\infty)$ such that $\|x_i\|_\infty\to 0$ has a subsequence which is equivalent to a disjoint sequence in $d(w,v,\varphi)$ or $\Lambda^0_{\varphi,w,v}(0,\infty)$ and to the unit basis of an Orlicz space $h_\psi$ for some $\psi\in C_\varphi(0,\infty)$. If in addition either $v$ is decreasing, or $v$ is regular and $\alpha_W>0$, then one can find $\psi\in C_\varphi$ (and then $h_\psi$ order embeds in the Orlicz space $h_\varphi$).
Proof. We treat first the case of $d_0(w,v,\varphi)$, which is slightly simpler than that of $\Lambda^0_{\varphi,w,v}(0,\infty)$. The first part of the theorem is simply a specialization of Theorem 5.5. Let us now prove the second part. Let $(x_i)\subset d_0(w,v,\varphi)$ be such that $\|x_i\|_\infty\to 0$. Then by Theorem 2.7, there exist a basic subsequence $(x_{i_k})$ and a sequence of decreasing non-negative elements $(y_i)$ in $h_\phi$ such that $(x_{i_k})$ in $d_0(w,v,\varphi)$ and $(y_i)$ in $\ell_\phi$ are equivalent, and moreover $\|y_i\|_\infty\to 0$. We assume w.l.o.g. that $(y_i)$ is normalized in $\ell_\phi$. Then following the proof of Theorem 5.5, for a suitable sequence $(i_k)$ we have $\varphi^{(y_{i_k})}\to\psi\in C_\varphi(0,\infty)$ and $(y_{i_k})$ in $\ell_\phi$ is equivalent to the unit basis of $h_\psi$. Since $\varphi^{(y_i)}\in C_{\varphi,A_i}$, where $A_i=\|y_iv\|_\infty$, it is sufficient to prove that $A_i\to 0$ for obtaining that $\psi\in C_\varphi$. Applying Lemma 4.12 or Remark 4.13 to each $y_i$, we obtain that for some constant $M$ and all $i,n$,
$$y_i(n)v(n)\le M\varphi^{-1}\Bigl(\frac{M}{W(n/2)}\Bigr).$$
For any fixed $k\ge 1$, we then have
$$\limsup_{i\to\infty}\,\sup_{n\ge 2k}y_i(n)v(n)\le M\varphi^{-1}\Bigl(\frac{M}{W(k)}\Bigr).$$
By $\|y_i\|_\infty\to 0$, we also have that $\max_{n<2k}|y_i(n)v(n)|\to 0$. Thus
$$\limsup_{i\to\infty}A_i=\limsup_{i\to\infty}\|y_iv\|_\infty\le M\varphi^{-1}\Bigl(\frac{M}{W(k)}\Bigr).$$
Since $k$ is arbitrary and $W(\infty)=\infty$, we have that $A_i\to 0$, and so $\psi\in C_\varphi$.
In the case of $\Lambda^0_{\varphi,w,v}(0,\infty)$ we still have that for every $k>0$,
$$\sup_i\|\chi_{(2k,+\infty)}vy_i\|_\infty\le M\varphi^{-1}\Bigl(\frac{M}{W(k)}\Bigr).$$
On the other hand $\|\chi_{(0,2N]}y_i\|_\phi\le\|\chi_{(0,2N]}\|_\phi\|y_i\|_\infty\to 0$. We can extract a subsequence $(y_{i_k})$ such that $\sum_k\|\chi_{(0,2k]}y_{i_k}\|^r_\phi<\infty$. Let $z_k=\chi_{(2k,+\infty)}y_{i_k}$. Then by the principle of small perturbations, the sequences $(y_{i_k})$ and $(z_k)$ are almost 1-equivalent. On the other hand $\|vz_k\|_\infty\to 0$ and $\|z_i\|_\phi\to 1$. By the reasoning as above, a subsequence of $(z_k)$ is equivalent to the unit basis of $h_\psi$ for some $\psi\in C_\varphi$, and so is the corresponding subsequence of $(y_{i_k})$. □
Let us now give some consequences of Theorem 7.1 in the sequential case, i.e. $d(w,v,\varphi)$. The next result generalizes a known fact [27, Proposition 4.e.3], proved for Lorentz spaces $d(w,p)$ for $1\le p<\infty$ and $w$ decreasing.
Proposition 7.2. Assume that either $v$ is decreasing, or $v$ is regular and $\alpha_W>0$. Then every closed infinite dimensional subspace $X$ of $d_0(w,v,\varphi)$ contains a closed subspace $Y$ which is isomorphic either to $c_0$ or to some Orlicz space $h_\psi$ with $\psi\in C_\varphi$. Consequently, if $\ell_p$, $0<p<\infty$, embeds isomorphically in $d_0(w,v,\varphi)$ then $u_p$ is equivalent at zero to some function in $C_\varphi$.
In particular, every closed infinite dimensional subspace of $d(w,p)$, where $0<p<\infty$ and $w$ is a positive weight such that $W(\infty)=\infty$, contains an isomorphic copy of $\ell_p$.
Proof. By an easy adaptation of 1.a.11 in [27] to $p$-Banach spaces, there exists a normalized block-basic sequence $(u_n)$ of $(e_n)$ such that $[u_n]$ is isomorphic to a subspace of $X$. If $(u_n)$ is not boundedly complete then by the proof of Corollary 3.3 in [24] there exists a normalized block basis $(v_n)$ of $(u_n)$ such that $(v_n)$ is equivalent to $(e_n)$ in $c_0$. On the other hand if $(u_n)$ is boundedly complete then following the proof of Corollary 3 in [23] we can find a normalized block sequence $(v_n)$ of $(u_n)$ with $\|v_n\|_\infty\to 0$. Then by Theorem 7.1, there is a subsequence $(v_{n_j})$ equivalent to $(e_n)$ in $h_\psi$ for some $\psi\in C_\varphi$. Now, if $\ell_p$, $0<p<\infty$, is isomorphic to a subspace of $d_0(w,v,\varphi)$ then $\ell_p$ contains a subspace $Y$ which is isomorphic to $h_\psi$ for some $\psi\in C_\varphi$. But by Proposition 3.7 in [24] (see also Theorem 4.a.8 in [27]), $\psi$ must be equivalent at zero to some function in $C_{u_p,1}$. However the latter class contains only the function $u_p$, and so $u_p$ must be equivalent at zero to $\psi$, which completes the proof. □
Corollary 7.3. Assume that $v$ is decreasing, or $v$ is regular and $\alpha_W>0$. If for $0<p\le\infty$, $\ell_p$ (replaced by $c_0$ in case when $p=\infty$) is isomorphic to a subspace of $d_0(w,v,\varphi)$, then $p\in[\alpha^0_\varphi,\beta^0_\varphi]$.
Proof. Let $\ell_p$, $0<p<\infty$, be linearly isomorphic to a subspace of $d_0(w,v,\varphi)$. Then by Propositions 7.2 and 5.3, $u_p$ is equivalent at zero to a function in $C_\varphi$ and thus $p\in[\alpha^0_\varphi,\beta^0_\varphi]$. If $c_0$ embeds isomorphically in $d_0(w,v,\varphi)$, then by Theorem 5.1, $\varphi$ does not satisfy condition $\Delta_2^0$ and so $\beta^0_\varphi=\infty$. □
Theorem 7.4. Assume that $v$ is increasing. Then for every $\psi\in C_\varphi$ there exists a block sequence in $d_0(w,v,\varphi)$ (resp. $\Lambda^0_{\varphi,w,v}(0,\infty)$), which converges to zero in $\ell_\infty$-norm (resp. $L_\infty$-norm) and is equivalent to the unit basis in $h_\psi$. Therefore $d_0(w,v,\varphi)$ (resp. $\Lambda^0_{\varphi,w,v}(0,\infty)$) contains an order isomorphic copy of $h_\psi$.
We present the proof in the sequence case only, pointing out where things differ in the function case, which is slightly simpler. Similarly as in Theorem 6.9 we assume that $\varphi$ is $r$-convex, $\|\cdot\|_\phi$ is an $r$-norm and $\|\cdot\|_d$ is an $r$-norm with constant $D>0$. Let us denote by $K$ the $\Delta_2$-constant of the function $W$. Recall that under conditions (4.3) the function $W$ satisfies condition $\Delta_2$ (see Remark 4.6).
Lemma 7.5. If $\psi\in\operatorname{conv}E_\varphi$ then for every $\varepsilon>0$ there is a non-negative decreasing element $u$ in $h_\phi$ (resp. $L^0_\phi(0,\infty)$) and a function $\theta\in\Gamma$ such that:
(1) $\|u\|_\phi=1$, and $\|u\|_\infty<\varepsilon$,
(2) $d(\varphi^{(u)},\theta)<\varepsilon$,
(3) $C^{-1}\psi(t)\le\theta(t)\le\psi(C^{1/r}t)$ for all $t\in(0,1)$ (which we shall write shortly $\theta\sim_C\psi$), where $C=2$ in the function case, $C=2K-1$ in the sequence case.
Proof. We have $\psi=\sum_{i=1}^l\beta_i\varphi_i$ where $\varphi_i\in E_\varphi$, $\beta_i>0$ and $\sum_{i=1}^l\beta_i=1$. For every $i=1,\dots,l$ there exists a sequence $(b_i^{(k)})$ of positive numbers such that $b_i^{(k)}\downarrow 0$ and $\varphi_{b_i^{(k)}}\to\varphi_i$ as $k\to\infty$. Let $m$ be such that for $k\ge m$ we have
$$d\bigl(\varphi_{b_i^{(k)}},\varphi_i\bigr)<\varepsilon\quad\text{and}\quad\frac{b_i^{(k)}}{v(1)}<\varepsilon C^{-1/r}.\tag{7.1}$$
For constructing $u$, we shall find a finite sequence of integers (positive reals in the function case) $0=r_0<r_1<\dots<r_l$ and a sequence $k_1,k_2,\dots,k_l\ge m$ of integers (both depending on $m$) such that for some $C>0$,
$$\frac{\beta_i}{C}\le\varphi\bigl(b_i^{(k_i)}\bigr)\bigl(W(r_i)-W(r_{i-1})\bigr)\le\beta_i,\qquad i=1,\dots,l.\tag{7.2}$$
Choose $s_1\ge 1$ (resp. such that $W(s_1)\ge 2W(1)$ in the function case). Since $b_1^{(k)}\to 0$ there exists $k_1\ge m$ such that
$$\varphi\bigl(b_1^{(k_1)}\bigr)\bigl(W(s_1)-W(r_0)\bigr)\le\beta_1.$$
Since $W(\infty)=\infty$, the set $\{r:\varphi(b_1^{(k_1)})(W(r)-W(r_0))\le\beta_1\}$ is bounded. Set
$$r_1=\sup\bigl\{r:\varphi\bigl(b_1^{(k_1)}\bigr)\bigl(W(r)-W(r_0)\bigr)\le\beta_1\bigr\}.$$
Note that $r_1\ge s_1$, $W(r_1)\ge 2W(r_0)$ (resp. $W(s_1)\ge 2W(1)$ in the function case) and
$$\varphi\bigl(b_1^{(k_1)}\bigr)\bigl(W(r_1)-W(r_0)\bigr)\le\beta_1\le\varphi\bigl(b_1^{(k_1)}\bigr)\bigl(W(r_1+1)-W(r_0)\bigr)$$
(in the function case one obtains $\varphi(b_1^{(k_1)})(W(r_1)-W(r_0))=\beta_1$, since $W$ is continuous). Now let $s_2$ be such that $W(s_2)>2W(r_1)$ and $k_2\ge m$ such that $b_2^{(k_2)}\le b_1^{(k_1)}$ and
$$\varphi\bigl(b_2^{(k_2)}\bigr)\bigl(W(s_2)-W(r_1)\bigr)\le\beta_2.$$
Set
$$r_2=\sup\bigl\{r:\varphi\bigl(b_2^{(k_2)}\bigr)\bigl(W(r)-W(r_1)\bigr)\le\beta_2\bigr\}.$$
By iteration of this procedure we find the numbers $0=r_0<r_1<\dots<r_l$ and $b_1^{(k_1)}\ge\dots\ge b_l^{(k_l)}$, $k_i\ge m$, such that $r_i>r_{i-1}$, $W(r_i)\ge 2W(r_{i-1})$,
$$\varphi\bigl(b_i^{(k_i)}\bigr)\bigl(W(r_i)-W(r_{i-1})\bigr)\le\beta_i\le\varphi\bigl(b_i^{(k_i)}\bigr)\bigl(W(r_i+1)-W(r_{i-1})\bigr),\qquad i=1,\dots,l.$$
(In the function case $\varphi(b_i^{(k_i)})(W(r_i)-W(r_{i-1}))=\beta_i$ and (7.2) follows with $C=1$.) Since $W$ verifies the condition $\Delta_2$,
$$W(r_i+1)\le W(2r_i)\le KW(r_i)\le 2K\bigl(W(r_i)-W(r_{i-1})\bigr),$$
and so
$$W(r_i+1)-W(r_{i-1})\le KW(r_i)-W(r_{i-1})=(K-1)W(r_i)+W(r_i)-W(r_{i-1})\le C\bigl(W(r_i)-W(r_{i-1})\bigr)$$
with $C=2K-1$, and thus we obtain the relations (7.2) in the sequence case. Define
$$x=\sum_{i=1}^lb_i^{(k_i)}\frac{1}{\max(v,v(1))}\chi_{S_i},\tag{7.3}$$
where $S_i=(r_{i-1},r_i]$. In view of the assumption that $v$ is increasing and $b_1^{(k_1)}\ge\dots\ge b_l^{(k_l)}$, the element $x$ is decreasing, and it results then from (7.1) that $\|x\|_\infty<\varepsilon C^{-1/r}$. Note also that the element $x$ is order continuous, since it is bounded with support included in a finite interval. Let also
$$\theta(t)=\sum_{i=1}^l\beta_i\varphi_{b_i^{(k_i)}}(t)=\sum_{i=1}^l\beta_i\frac{\varphi(b_i^{(k_i)}t)}{\varphi(b_i^{(k_i)})}.$$
Then it results from (7.1) that $\theta$ satisfies condition (2) of the lemma. In the sequence case we have for every $t>0$
$$I_\phi(tx)=\sum_{i=1}^l\sum_{j\in S_i}\varphi\bigl(tb_i^{(k_i)}\bigr)w(j)=\sum_{i=1}^l\varphi_{b_i^{(k_i)}}(t)\,\varphi\bigl(b_i^{(k_i)}\bigr)\bigl(W(r_i)-W(r_{i-1})\bigr).$$
Hence and by (7.2),
$$\frac{1}{C}\theta(t)\le I_\phi(tx)\le\theta(t).$$
In the function case we have:
$$\varphi\bigl(tb_1^{(k_1)}\bigr)\bigl(W(r_1)-W(1)\bigr)+\sum_{i=2}^l\varphi\bigl(tb_i^{(k_i)}\bigr)\bigl(W(r_i)-W(r_{i-1})\bigr)\le I_\phi(tx)\le\sum_{i=1}^l\varphi\bigl(tb_i^{(k_i)}\bigr)\bigl(W(r_i)-W(r_{i-1})\bigr).$$
Hence by (7.2) and the fact that $W(r_1)-W(1)\ge\frac12W(r_1)$ we obtain $\frac12\theta(t)\le I_\phi(tx)\le\theta(t)$. Finally in both cases we have:
$$\frac{1}{C}\theta(t)\le I_\phi(tx)\le\theta(t)\tag{7.4}$$
with $C=2K-1$ in the sequence case, $C=2$ in the function case. In particular $\frac1C\le I_\phi(x)\le 1$ since $\theta(1)=1$, and so $C^{-1/r}\le\|x\|_\phi\le 1$ by $r$-convexity of $\varphi$. Define $u=x/\|x\|_\phi$. Then $\|u\|_\phi=1$, $u$ is order continuous, and verifies condition (1) of the lemma. Then by (7.4),
$$C^{-1}\theta\bigl(t/\|x\|_\phi\bigr)\le\varphi^{(u)}(t)=I_\phi\bigl(tx/\|x\|_\phi\bigr)\le\theta\bigl(t/\|x\|_\phi\bigr).$$
Thus
$$C^{-1}\theta(t)\le\varphi^{(u)}(t)\le\theta\bigl(C^{1/r}t\bigr),\qquad t>0,$$
and condition (3) of the lemma is fulfilled. □
Proof of Theorem 7.4. Let $\psi\in C_\varphi$. Choose a sequence $(\psi_n)$ in $\operatorname{conv}E_\varphi$ converging to $\psi$, and for each $n$, use Lemma 7.5 for finding a decreasing element $u_n$ in $h_\phi$ (resp. $L^0_\phi(0,\infty)$), and a function $\theta_n$ in $\Gamma$, such that
(1) $\|u_n\|_\phi=1$, and $\|u_n\|_\infty<\frac1n$,
(2) $d(\varphi^{(u_n)},\theta_n)<\frac1n$,
(3) $\theta_n\sim_C\psi_n$.
By compactness of $\Gamma$ we may, up to extracting a subsequence, assume that the sequence $(\theta_n)$ converges to a function $\theta\in\Gamma$. Passing to the limit in condition (3) we have thus $\theta\sim_C\psi$. It results from (1) that $(u_n)$ is a normalized sequence in $h_\phi$ (resp. $L^0_\phi(0,\infty)$) converging to zero in $\ell_\infty$-norm (resp. $L_\infty$-norm) and that the associated Orlicz functions $\varphi^{(u_n)}$ converge to $\theta$. Then by Proposition 5.4, a subsequence $(u_{n_j})$ is basic and equivalent to the unit basic sequence of $h_\theta$, and by Theorem 2.7 a further subsequence is equivalent to a semi-normalized sequence $(g_k)$ in $d_0(w,v,\varphi)$ (resp. $\Lambda^0_{\varphi,w,v}(0,\infty)$), with $\|g_k\|_\infty\to 0$ (resp. $g_k\to 0$ in measure). By Proposition 2.5 this last sequence is equivalent to a disjoint sequence in $d_0(w,v,\varphi)$ (resp. $\Lambda^0_{\varphi,w,v}(0,\infty)$). Moreover from $\theta\sim_C\psi$ it follows that $h_\theta=h_\psi$, and their unit bases are equivalent. □
From Theorems 7.1 and 7.4 we obtain the following.
Corollary 7.6. Assume that $v$ is constant, or that $v$ is regular, increasing and $\alpha_W>0$. Given $\psi\in\Gamma$, the following assertions are equivalent.
(1) $\Lambda^0_{\varphi,w,v}(0,\infty)$ (resp. $d_0(w,v,\varphi)$) contains a normalized sequence converging to zero in $L_\infty$-norm (resp. $\ell_\infty$-norm) which is equivalent to the unit basis of $h_\psi$.
(2) $h_\varphi$ contains a normalized sequence converging to zero in $\ell_\infty$-norm which is equivalent to the unit basis of $h_\psi$.
(3) $\psi$ belongs to the set $C_\varphi$.
We examine now further consequences of Theorem 7.4 for sequence spaces.
Corollary 7.7. Assume that $v$ is increasing. Then for every $p\in[\alpha^0_\varphi,\beta^0_\varphi]$, the space $\ell_p$ (replaced by $c_0$ in case when $p=\infty$) is order isomorphic to a sublattice of $d_0(w,v,\varphi)$.
Proof. By Proposition 5.3, if $p\in[\alpha^0_\varphi,\beta^0_\varphi]$ then $u_p$ is equivalent at zero to a function in $C_\varphi$. Thus by Theorem 7.4, $d_0(w,v,\varphi)$ contains an order isomorphic copy of $\ell_p$. If $p=\infty$ then $\beta^0_\varphi=\infty$ and so $\varphi$ does not satisfy condition $\Delta_2^0$, and so by Theorem 5.1, $d_0(w,v,\varphi)$ contains an order copy of $c_0$. □
The next result, the first main theorem in this section, gives a complete characterization of $\ell_p$ copies in the space $d_0(w,v,\varphi)$, which follows from Corollaries 7.3, 7.7 and [24, Theorem 4.11].
A. Kamińska, Y. Raynaud / Journal of Functional Analysis 257 (2009) 271–331
Theorem 7.8. Assume that $v$ is constant, or that $v$ is regular, increasing and $\alpha_W > 0$. Then the following conditions are equivalent (where $c_0$ is meant in place of $\ell_p$ if $p = \infty$).

(1) $\ell_p$ is order isomorphic to a sublattice of $d_0(w,v,\varphi)$.
(2) $\ell_p$ is isomorphic to a subspace of $d_0(w,v,\varphi)$.
(3) $\ell_p$ is isomorphic to a subspace of the Orlicz space $h_\varphi$.
(4) $p \in [\alpha_\varphi^0, \beta_\varphi^0]$.
The next corollary for Orlicz–Marcinkiewicz sequence spaces follows from Theorem 7.8, Proposition 4.10 and Lemma 4.14.

Corollary 7.9. Let $w$ be decreasing and regular. Then the space $\ell_p$, $0 < p \le \infty$ (replaced by $c_0$ when $p = \infty$), is isomorphic (or order isomorphic) to a subspace of $m_0(w,\varphi)$ if and only if $p \in [\alpha_\varphi^0, \beta_\varphi^0]$.

As a corollary of Theorem 7.8 we also obtain a duality result between subspaces and quotient spaces for reflexive Orlicz–Lorentz spaces $d(w,\varphi)$, analogous to the one in Orlicz sequence spaces [27, Theorem 4.b.3]. Before we state it we need a characterization of reflexivity of $d(w,\varphi)$.

Proposition 7.10. Let $w$ be decreasing and $\varphi$ be a convex Orlicz function. Then $d(w,\varphi)$ is reflexive if and only if $\ell_\varphi$ is reflexive, that is, if and only if $\varphi$ and $\varphi^*$ satisfy condition $\Delta_2^0$.

Proof. A Banach lattice is reflexive if and only if it contains neither $c_0$ nor $\ell_1$ order isomorphically [41]. By [23, Theorem 9], $d_0(w,\varphi)$ satisfies these conditions if and only if $h_\varphi$ does. By [24, Theorem 2.2], $d_0(w,\varphi)$ contains $c_0$ if and only if $d(w,\varphi)$ does, and if this is not the case then $d(w,\varphi) = d_0(w,\varphi)$; the same is true for $\ell_\varphi$ and $h_\varphi$. Hence the reflexivity of $d(w,\varphi)$ is equivalent to that of $\ell_\varphi$. The last assertion is well known [27]. □

Corollary 7.11. Let $w$ be a decreasing regular weight sequence, and $\varphi$ a convex Orlicz function such that $d(w,\varphi)$ is reflexive. Then $d(w,\varphi)$ contains a subspace isomorphic to $\ell_p$ for some $p \ge 1$ if and only if $d(w,\varphi)$ has a quotient space isomorphic to $\ell_p$.

Proof. By Proposition 7.10 and [15], $d(w,\varphi)^* = m(w,\varphi^*)$. Since $\varphi$ and $\varphi^*$ satisfy condition $\Delta_2^0$, by Theorem 5.1 and Corollary 5.2, $d(w,\varphi) = d_0(w,\varphi)$ and $m(w,\varphi^*) = m_0(w,\varphi^*)$. By Theorem 7.8, $\ell_p$ is isomorphic to a subspace of $d(w,\varphi)$ if and only if $p \in [\alpha_\varphi^0, \beta_\varphi^0]$. The last relation is equivalent to $p' \in [\alpha_{\varphi^*}^0, \beta_{\varphi^*}^0]$, where $1/p + 1/p' = 1$ [27], and by Corollary 7.9 this holds if and only if $m(w,\varphi^*)$ contains a subspace isomorphic to $\ell_{p'}$. Now by the well-known duality between subspaces and quotient spaces and by reflexivity of $d(w,\varphi)$, the conclusion follows. □

We return now to the spaces on $(0,\infty)$.

Notation 2. Let us denote by $C_\varphi^{0,\infty}$ the set of all possible limits in $\Gamma$ of the functions $\varphi^{(u_n)}$, where $(u_n)$ is any normalized sequence in $L_\varphi(0,\infty)$ converging to zero in measure. Then $\psi$ is equivalent at zero to a function in $C_\varphi^{0,\infty}$ if and only if there exists a normalized basic sequence in $L_\varphi(0,\infty)$ converging to zero in measure which is equivalent to the unit basis of $h_\psi$. It is not difficult to see that $C_\varphi^{0,\infty} = \operatorname{conv}(C_\varphi, C_\varphi^\infty)$.
Theorem 7.12.
(a) Assume that $v$ is decreasing, or that $v$ is regular and $\alpha_W > 0$. Then every semi-normalized sequence in $\Lambda^0_{\varphi,w,v}(0,\infty)$ which converges to zero in measure has a subsequence equivalent to the unit basis of an Orlicz space $h_\psi$ with $\psi \in C_\varphi^{0,\infty}$.
(b) Assume now that $v$ is increasing. Then for every $\psi \in C_\varphi^{0,\infty}$ there exists a disjoint normalized sequence in $\Lambda^0_{\varphi,w,v}(0,\infty)$ which converges to zero in measure and is equivalent to the unit basis of $h_\psi$.

Proof. (a) Let $(g_n)$ be a semi-normalized sequence in $\Lambda^0_{\varphi,w,v}(0,\infty)$ which converges to zero in measure. By the proof of Theorem 2.7 (see (2.2)), we may assume that the sequence $(g_n)$ is disjoint and splits as $g_n = g_n' + g_n''$, where $g_n'$, $g_n''$ are disjoint, $\|g_n'\|_\infty \to 0$ and $|\operatorname{supp} g_n''| \to 0$. We may assume that both sequences are semi-normalized (the case where one of them goes to zero in quasi-norm is trivial), say $\|g_n'\| \to \alpha$ and $\|g_n''\| \to \beta$ for some $\alpha, \beta > 0$. Then by Theorems 6.1 and 7.1, performing two successive extractions, we can find subsequences $(\alpha^{-1} g_{n_k}')$ and $(\beta^{-1} g_{n_k}'')$ which are equivalent to the unit bases of $h_{\psi_1}$ and $h_{\psi_2}$ with $\psi_1 \in C_\varphi$ and $\psi_2 \in C_\varphi^\infty$, respectively. By Theorem 2.7 some subsequences of $(\alpha^{-1} g_{n_k}')$ and $(\beta^{-1} g_{n_k}'')$ are equivalent to normalized disjoint sequences $(f_k')$ and $(f_k'')$ in $L_\varphi(0,\infty)$ respectively, with $\|f_k'\|_\infty \to 0$ and $|\operatorname{supp} f_k''| \to 0$. Thus $(f_k')$ and $(f_k'')$ are respectively equivalent to the unit bases of $h_{\psi_1}$ and $h_{\psi_2}$. Then a subsequence of $(g_{n_k})$ in $\Lambda^0_{\varphi,w,v}(0,\infty)$ is equivalent to $(f_k) := (\alpha f_k' + \beta f_k'')$ in $L_\varphi(0,\infty)$, which is a semi-normalized sequence converging to zero in measure. Then a subsequence of $(f_k)$ is equivalent to the unit basis of $h_\psi$, with $\psi = \alpha\psi_1 + \beta\psi_2 \in C_\varphi^{0,\infty}$, and so is the corresponding subsequence of $(g_{n_k})$.

(b) The proof is similar to that of (a), using now Theorems 6.9 and 7.4.
□

In the case of $I = (0,a)$, $a < \infty$, or $I = \mathbb{N}$, the sets of exponents $p$ for which $\ell_p$ is order isomorphic to a sublattice of a given Orlicz space and of a (two-weighted) Orlicz–Lorentz space induced by the same Orlicz function $\varphi$ coincide, independently of $w$ and $v$, for a very broad class of weights. This is no longer true in spaces over $(0,\infty)$, as we shall see in the examples below.

Example 7.13. Let $w(t) = (t^{1/p} \wedge t^{1/q})\,t^{-1}$ where $1 \le p < q < \infty$. Then $\Lambda_{1,w}(0,\infty) = \Lambda_{\varphi,w}(0,\infty)$ for $\varphi(u) = u$ contains an order isometric copy of $L_r(0,\infty)$, and so of $\ell_r$, for every $p < r < q$, while $L_1(0,\infty)$ contains only sublattices isomorphic to $\ell_1$. In fact we have $w(t) = t^{1/p-1}$ for $0 < t \le 1$ and $w(t) = t^{1/q-1}$ for $t \ge 1$, and so

$$\int_0^\infty t^{-1/r}\, w(t)\, dt = \int_0^1 t^{1/p-1/r}\, \frac{dt}{t} + \int_1^\infty t^{1/q-1/r}\, \frac{dt}{t} < \infty.$$
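The two integrals in Example 7.13 can be evaluated in closed form, which makes the role of the restriction $p < r < q$ explicit. The following worked computation is only an illustrative sketch; it uses nothing beyond the formula for $w$ stated in the example.

```latex
\begin{align*}
\int_0^1 t^{1/p-1/r}\,\frac{dt}{t}
  &= \left[\frac{t^{1/p-1/r}}{1/p-1/r}\right]_0^1
   = \frac{pr}{r-p}
  && \text{(finite since } 1/p-1/r>0 \text{, i.e. } r>p\text{)},\\
\int_1^\infty t^{1/q-1/r}\,\frac{dt}{t}
  &= \left[\frac{t^{1/q-1/r}}{1/q-1/r}\right]_1^\infty
   = \frac{qr}{q-r}
  && \text{(finite since } 1/q-1/r<0 \text{, i.e. } r<q\text{)}.
\end{align*}
```

The first integral diverges when $r \le p$ and the second when $r \ge q$, so the sum is finite exactly for $p < r < q$.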
Thus the function $t^{-1/r}$ belongs to $\Lambda_{1,w}(0,\infty)$, and so $\ell_r$ is isomorphic to a sublattice of $\Lambda_{1,w}(0,\infty)$.

Before presenting another example, let us state a lemma.

Lemma 7.14. If $\psi$ is an Orlicz function such that $\beta_\psi^\infty < \alpha_\psi^0$, then the function $t^{-1/r}$ belongs to the space $L_\psi(0,\infty)$ for every $r \in (\beta_\psi^\infty, \alpha_\psi^0)$.
Proof. Notice that for every $r \in (\beta_\psi^\infty, \alpha_\psi^0)$ there exists a constant $C = C(r)$ such that $\psi(t) \le C t^r$ for every $t \in (0,\infty)$. Indeed, by the definition of the Matuszewska–Orlicz indices, $\psi(t)/t^r$ is pseudo-decreasing near infinity, resp. pseudo-increasing near $0$, so there exist constants $C'$, resp. $C''$, such that $\psi(t) \le C' t^r \psi(1)$ for all $t \ge 1$, resp. $\psi(t) \le C'' t^r \psi(1)$ for all $t \le 1$. Choose $r_0$, $r_1$ with $\beta_\psi^\infty < r_0 < r < r_1 < \alpha_\psi^0$. Then

$$\int_0^\infty \psi\bigl(t^{-1/r}\bigr)\, dt \le C(r_0)\int_0^1 t^{-r_0/r}\, dt + C(r_1)\int_1^\infty t^{-r_1/r}\, dt < \infty.$$

□
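The final estimate in the proof of Lemma 7.14 can be spelled out as follows. This is a sketch of the omitted step, written under the assumption (taken from the proof) that $\psi(s) \le C(\rho)\, s^{\rho}$ for all $s > 0$ whenever $\rho \in (\beta_\psi^\infty, \alpha_\psi^0)$; the symbol $\rho$ is introduced here only for illustration.

```latex
\begin{align*}
0<t\le 1:&\quad \psi\bigl(t^{-1/r}\bigr) \le C(r_0)\,\bigl(t^{-1/r}\bigr)^{r_0} = C(r_0)\,t^{-r_0/r},
 &\int_0^1 t^{-r_0/r}\,dt &= \frac{1}{1-r_0/r} = \frac{r}{r-r_0},\\
t\ge 1:&\quad \psi\bigl(t^{-1/r}\bigr) \le C(r_1)\,\bigl(t^{-1/r}\bigr)^{r_1} = C(r_1)\,t^{-r_1/r},
 &\int_1^\infty t^{-r_1/r}\,dt &= \frac{1}{r_1/r-1} = \frac{r}{r_1-r}.
\end{align*}
```

Both values are finite precisely because $r_0 < r < r_1$: the exponent $r_0/r < 1$ gives integrability at $0$, and $r_1/r > 1$ gives integrability at infinity.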
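Example 7.15. Let $\varphi$ be an Orlicz function such that $0 < \alpha_\varphi^\infty \le \beta_\varphi^\infty < \alpha_\varphi^0 \le \beta_\varphi^0 < \infty$, and let $w(t) = t^{\gamma-1}$, $\gamma > 0$. Hence for $0 < r < \infty$,

$$\int_0^\infty \varphi\bigl(t^{-1/r}\bigr)\, t^{\gamma-1}\, dt = \frac{1}{\gamma}\int_0^\infty \varphi\bigl(t^{-1/(r\gamma)}\bigr)\, dt.$$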
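The identity in Example 7.15 is a change of variables. A short derivation is sketched below; the substitution variable $s$ and the function $\psi(u) = \varphi(u^{1/\gamma})$ are written out explicitly for illustration.

```latex
\begin{align*}
s = t^{\gamma},\qquad ds = \gamma\,t^{\gamma-1}\,dt,\qquad t^{-1/r} = s^{-1/(r\gamma)}
\quad\Longrightarrow\quad
\int_0^\infty \varphi\bigl(t^{-1/r}\bigr)\,t^{\gamma-1}\,dt
  &= \frac{1}{\gamma}\int_0^\infty \varphi\bigl(s^{-1/(r\gamma)}\bigr)\,ds \\
  &= \frac{1}{\gamma}\int_0^\infty \psi\bigl(s^{-1/r}\bigr)\,ds,
  \qquad \psi(u) := \varphi\bigl(u^{1/\gamma}\bigr).
\end{align*}
```

Since $\psi(u)/u^{\rho} = \varphi(v)/v^{\gamma\rho}$ with $v = u^{1/\gamma}$, every Matuszewska–Orlicz index of $\psi$ is the corresponding index of $\varphi$ divided by $\gamma$.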
Thus $t^{-1/r} \in \Lambda_{\varphi,w}(0,\infty)$ if and only if $t^{-1/r} \in L_\psi(0,\infty)$, where $\psi(t) = \varphi(t^{1/\gamma})$. Note that $\beta_\psi^\infty = \frac{1}{\gamma}\beta_\varphi^\infty < \frac{1}{\gamma}\alpha_\varphi^0 = \alpha_\psi^0$. Hence by Lemma 7.14 the function $t^{-1/r}$ belongs to $L_\psi(0,\infty)$, and thus to $\Lambda_{\varphi,w}(0,\infty)$, for every $r \in (\frac{1}{\gamma}\beta_\varphi^\infty, \frac{1}{\gamma}\alpha_\varphi^0)$. In view of Fact 1 in Section 3, for these values of $r$ the space $\Lambda_{\varphi,w}(0,\infty)$ contains an order isometric copy of $L_r(0,\infty)$, and a fortiori of $\ell_r$. On the other hand, by Proposition 5.3, the set of values $r$ for which $L_\varphi(0,\infty)$ contains an order isomorphic copy of $\ell_r$ coincides in the present case with the interval $[\alpha_\varphi^\infty, \beta_\varphi^0]$. For appropriate values of $\gamma$, the intervals $(\frac{1}{\gamma}\beta_\varphi^\infty, \frac{1}{\gamma}\alpha_\varphi^0)$ and $[\alpha_\varphi^\infty, \beta_\varphi^0]$ are disjoint.

Now we shall give necessary or sufficient conditions for $\ell_p$ to embed as a sublattice in $\Lambda_{\varphi,w,v}(0,\infty)$. Let us first state an analogue of Corollary 7.3 in the function case, i.e. for $\Lambda_{\varphi,w,v}(0,\infty)$, for lattice isomorphic copies. Recall the notation $f_p(t) = t^{-1/p}$.

Proposition 7.16. Assume that $v$ is decreasing, or $v$ is regular and $\alpha_W > 0$. If $(g_n)$ is a normalized basic sequence in $\Lambda^0_{\varphi,w,v}(0,\infty)$ which converges to zero in measure, then some block sequence of $(g_n)$ is equivalent to the unit vector basis of $\ell_p$ with $p \in [\alpha_\varphi^0, \beta_\varphi^0] \cup [\alpha_\varphi^\infty, \beta_\varphi^\infty]$.

Proof. By the proof of Theorem 2.7 (see (2.2)), we may assume that the sequence $(g_n)$ splits as $g_n = g_n' + g_n''$, where $\|g_n'\|_\infty \to 0$, $|\operatorname{supp} g_n''| \to 0$, and these sequences are disjoint in the sense that $g_n' \wedge g_m'' = 0$ for $n \ne m$.

If one of the sequences $(g_n')$, resp. $(g_n'')$, approaches zero in norm, then by the small perturbation principle, Theorem 1.1, a subsequence of $(g_n)$ is equivalent to the corresponding subsequence of $(g_n'')$, resp. $(g_n')$. In this case we know by Theorems 7.1 or 6.1 that a subsequence of $(g_n)$ is equivalent to the unit basis of $h_\psi$ with $\psi \in C_\varphi$ or $\psi \in C_\varphi^\infty$. On the other hand a block basis of $(e_n)$ in $h_\psi$, and so a block basis of $(g_n)$, is equivalent to the unit basis of $\ell_p$ with $p \in [\alpha_\psi^0, \beta_\psi^0]$. We have $[\alpha_\psi^0, \beta_\psi^0] \subset [\alpha_\varphi^0, \beta_\varphi^0]$ or $[\alpha_\psi^0, \beta_\psi^0] \subset [\alpha_\varphi^\infty, \beta_\varphi^\infty]$ according to whether $\psi \in C_\varphi$ or $\psi \in C_\varphi^\infty$. Thus $p \in [\alpha_\varphi^0, \beta_\varphi^0]$ or $p \in [\alpha_\varphi^\infty, \beta_\varphi^\infty]$.
Now, if both sequences $(g_n')$ and $(g_n'')$ are semi-normalized, then by Theorems 6.1 and 7.1 again we can perform two successive extractions and obtain subsequences $(g_{n_k}')$ and $(g_{n_k}'')$ spanning sublattices isomorphic to $h_{\psi_1}$ and $h_{\psi_2}$, with $\psi_1 \in C_\varphi$ and $\psi_2 \in C_\varphi^\infty$, respectively. Then $(g_{n_k})$, where $g_{n_k} = g_{n_k}' + g_{n_k}''$, is equivalent to the unit basis of $h_{\psi_1} \cap h_{\psi_2}$. Relabelling, we denote this extracted sequence by $(g_n)$. Now we can reason by dichotomy:

– Either the unit bases of $h_{\psi_1}$ and $h_{\psi_2}$ are equivalent, and then the three basic sequences $(g_n)$, $(g_n')$ and $(g_n'')$ are also equivalent. Thus for any block basis of $(g_n)$ which is equivalent to the unit basis of $\ell_p$, the corresponding block sequences of $(g_n')$ and $(g_n'')$ are also equivalent to the unit basis of $\ell_p$. We have now $p \in [\alpha_\varphi^0, \beta_\varphi^0] \cap [\alpha_\varphi^\infty, \beta_\varphi^\infty]$.
– Or there is a sequence of finite blocks $(u_n)$ of the unit basis of $h_{\psi_1} \cap h_{\psi_2}$ which are normalized in one space, say $h_{\psi_1}$, and go to zero in the norm of the other space $h_{\psi_2}$. We may assume that these blocks are disjoint. Again by a perturbation argument, we may, up to extraction, assume that the basic sequences $(u_n)$ in $h_{\psi_1}$ and $(u_n)$ in $h_{\psi_1} \cap h_{\psi_2}$ are equivalent. Let $(v_n)$ be a block subsequence of $(u_n)$ equivalent in $h_{\psi_1}$ to the unit basis of some $\ell_p$ with $p \in [\alpha_{\psi_1}^0, \beta_{\psi_1}^0]$. Then $(v_n)$ is also equivalent in $h_{\psi_1} \cap h_{\psi_2}$ to the $\ell_p$-basis, and so is the corresponding block sequence of $(g_n)$. Moreover, since $\psi_1 \in C_\varphi$, we have $[\alpha_{\psi_1}^0, \beta_{\psi_1}^0] \subset [\alpha_\varphi^0, \beta_\varphi^0]$ and thus $p \in [\alpha_\varphi^0, \beta_\varphi^0]$. If $(u_n)$ is normalized in $h_{\psi_2}$, then analogously we show that a block sequence of $(g_n)$ is equivalent to the unit basis of $\ell_p$ in $\Lambda_{\varphi,w,v}(0,\infty)$ and to a disjoint block sequence of the basis of $h_{\psi_2}$; since $\psi_2 \in C_\varphi^\infty$, we have that $p \in [\alpha_\varphi^\infty, \beta_\varphi^\infty]$. □

Corollary 7.17. Assume that $v$ is decreasing, or $v$ is regular and $\alpha_W > 0$. If for $0 < p \le \infty$, $\ell_p$ (replaced by $c_0$ when $p = \infty$) is order isomorphic to a sublattice of $\Lambda^0_{\varphi,w,v}(0,\infty)$, then either $p \in [\alpha_\varphi^0, \beta_\varphi^0] \cup [\alpha_\varphi^\infty, \beta_\varphi^\infty]$ or for some $c > 0$ it holds that $\int_0^\infty \varphi(c\,t^{-1/p} v(t))\, w(t)\, dt < \infty$.

Proof. By Theorem 3.3, and the fact that $\Lambda_{\varphi,w,v}(0,\infty)$ has the Fatou property, if $\ell_p$ embeds order isomorphically in $\Lambda^0_{\varphi,w,v}(0,\infty)$ then either $\Lambda_{\varphi,w,v}(0,\infty)$ contains the function $f_p$, or $\Lambda^0_{\varphi,w,v}(0,\infty)$ contains a normalized sequence $(f_n)$ converging to zero in measure and equivalent to the unit vector basis of $\ell_p$ (or $c_0$). In the first case we have $\int_0^\infty \varphi(c\,t^{-1/p} v(t))\, w(t)\, dt < \infty$ for any $0 < c < \|f_p\|_\Lambda^{-1}$, while in the second case we conclude from Proposition 7.16 that the closed linear span $[f_n]$ contains $\ell_q$ with $q \in [\alpha_\varphi^0, \beta_\varphi^0] \cup [\alpha_\varphi^\infty, \beta_\varphi^\infty]$. Then necessarily $q = p$. □

In the next theorem we state the second main result of this section, necessary and sufficient conditions for $\ell_p$ to have an order isomorphic copy in $\Lambda^0_{\varphi,w,v}(0,\infty)$.

Theorem 7.18. Assume that either $v$ is constant, or that $v$ is increasing, regular and $\alpha_W > 0$. Then for $0 < p \le \infty$, the following assertions are equivalent.

(1) $\ell_p$ (replaced by $c_0$ when $p = \infty$) is order isomorphic to a sublattice of $\Lambda^0_{\varphi,w,v}(0,\infty)$.
(2) Either $p \in [\alpha_\varphi^0, \beta_\varphi^0] \cup [\alpha_\varphi^\infty, \beta_\varphi^\infty]$ or, for some $c > 0$, $\int_0^\infty \varphi(c\,t^{-1/p} v(t))\, w(t)\, dt < \infty$.
(3) $\ell_p$ is order isomorphic to a sublattice of $h_\varphi$ or of $L^0_\varphi(0,1)$, or $L_{p,\infty} \subset \Lambda_{\varphi,w,v}(0,\infty)$.

Proof. In view of Proposition 5.3, (3) is simply a reformulation of (2). The implication (1) ⇒ (2) follows immediately from Corollary 7.17. Let us show the converse implication (2) ⇒ (1). If $p \in [\alpha_\varphi^0, \beta_\varphi^0] \cup [\alpha_\varphi^\infty, \beta_\varphi^\infty]$ with $p < \infty$, then the function $u^p$ belongs to $C_\varphi \cup C_\varphi^\infty$ by Proposition 5.3 (see also Theorem 1.5 in [33]). If $u^p \in C_\varphi$ then $\ell_p$ is order isomorphically embedded into
$\Lambda^0_{\varphi,w,v}(0,\infty)$ by Theorem 7.4. If $u^p \in C_\varphi^\infty$ then we have the same conclusion by Theorem 6.9 and the obvious fact that $\Lambda^0_{\varphi,w,v}(0,1)$ is a closed sublattice of $\Lambda^0_{\varphi,w,v}(0,\infty)$.

Let $p = \infty$. Then either $\beta_\varphi^0 = \infty$ or $\beta_\varphi^\infty = \infty$. In the first case $\varphi$ does not satisfy $\Delta_2^0$, which implies by Theorem 5.1 that $d_0(w,v,\varphi)$ contains an order copy of $c_0$, and the same holds for $\Lambda^0_{\varphi,w,v}(0,\infty)$. In the second case $\varphi$ does not satisfy $\Delta_2^\infty$, and thus by the same theorem $\Lambda^0_{\varphi,w,v}(0,1)$ contains an order copy of $c_0$, and the same holds for $\Lambda^0_{\varphi,w,v}(0,\infty)$.

Finally, if $\int_0^\infty \varphi(c\,t^{-1/p} v(t))\, w(t)\, dt < \infty$ then $f_p \in \Lambda_{\varphi,w,v}(0,\infty)$ and, by Theorem 3.3 or Fact 1, the space $\Lambda_{\varphi,w,v}(0,\infty)$ contains an order isomorphic copy of $\ell_p$. □

Remark 7.19. The set $J_{\varphi,w,v}$ of $p \in (0,\infty)$ such that the function $t^{-1/p}$ belongs to $\Lambda_{\varphi,w,v}(0,\infty)$ is an interval. In fact, if $p_1, p_2 \in J_{\varphi,w,v}$ and $p = (1-\theta)p_1 + \theta p_2$, $0 < \theta < 1$, we have $t^{-1/p} \le \max\{t^{-1/p_1}, t^{-1/p_2}\}$ for every $t > 0$. Thus the set of exponents $p$ such that $\ell_p$ embeds in $\Lambda_{\varphi,w,v}(0,\infty)$ as a sublattice is the union of three intervals

$$P = [\alpha_\varphi^0, \beta_\varphi^0] \cup [\alpha_\varphi^\infty, \beta_\varphi^\infty] \cup J_{\varphi,w,v}.$$

The interval $J_{\varphi,w,v}$ is not necessarily closed.

Remark 7.20. In the case of Orlicz spaces $L_\varphi(0,\infty)$, condition (2) in Theorem 7.18 and condition (c) of Proposition 5.3 are equivalent (see also Lemma 7.14).

While lattice isomorphic copies of $\ell_p$ in $L_\varphi(0,\infty)$ are described entirely in terms of the indices of $\varphi$, this is not the case in Orlicz–Lorentz spaces defined on $(0,\infty)$. Examples 7.13 and 7.15 above show that the analogue of Corollary 6.4 is not true in general on $(0,\infty)$. However, if the fundamental function of the space $\Lambda_{\varphi,w,v}(0,\infty)$ dominates a power function, a result in the spirit of Corollary 6.4 can be stated. To this end let us first state an auxiliary fact which was shown as Proposition 2.3 in [6] under somewhat more restrictive conditions. By inspecting its proof, we see that it remains true under our assumptions below.

Proposition 7.21. Let $F$ be an order continuous r.i. quasi-Banach space on $(0,\infty)$ with the Fatou property. Assume that $\|\chi_{(0,t)}\|_F \ge C t^\gamma$ for all $t \ge 0$ and some $C, \gamma > 0$. Then for any normalized disjoint sequence $(x_n)$ in $F$ there exists a normalized block sequence $(y_n)$ of $(x_n)$ converging to zero in measure.

Observe that the assumption made on the fundamental function above implies in fact the first case of Theorem 3.3.

Theorem 7.22. Assume that there exist $\gamma > 0$ and $c > 0$ such that $\|\chi_{[0,t]}\|_\Lambda \ge c\,t^\gamma$ for all $t > 0$. Assume also that $\varphi$ satisfies condition $\Delta_2$. Then every infinite dimensional closed sublattice $L$ of $\Lambda_{\varphi,w,v}(0,\infty)$ contains an order isomorphic copy of a space $\ell_p$ which is also order isomorphic to a sublattice of $L_\varphi(0,\infty)$. If in addition $v$ is either decreasing, or $v$ is regular and $\alpha_W > 0$, we can find in the sublattice $L$ an order copy of $\ell_p$ which is also order isomorphic to a sublattice either of $L_\varphi(0,1)$ or of $\ell_\varphi$, that is, $p \in [\alpha_\varphi^0, \beta_\varphi^0] \cup [\alpha_\varphi^\infty, \beta_\varphi^\infty]$.

Proof. Every infinite dimensional sublattice contains a sequence of normalized pairwise disjoint elements $(f_n)$. Due to the assumption on the fundamental function of $\Lambda_{\varphi,w,v}(0,\infty)$ and to the $\Delta_2$-condition for $\varphi$, we may appeal to Proposition 7.21 to conclude that a block sequence $(g_n)$ of $(f_n)$ converges to zero in measure.
In view of Theorem 5.5 some subsequence of $(g_n)$ spans an Orlicz space $h_\psi$, with $\psi \in C_\varphi(0,\infty)$, and a sequence of disjoint blocks $h_k$ of the $g_n$ is equivalent to the unit basis of $\ell_p$ in $h_\psi$, which in turn is order isomorphic to a closed sublattice of $L_\varphi(0,\infty)$. Note that, like the $f_n$, the $g_n$ and thus the $h_n$ are pairwise disjoint, hence $L$ contains $\ell_p$ isomorphically as a sublattice. To prove the last sentence of the theorem, we apply Proposition 7.16 to the sequence $(g_n)$. □

Remark 7.23. Let $(F, \|\cdot\|)$ be an r.i. quasi-Banach space over $(0,\infty)$ and let $F(t) = \|\chi_{(0,t)}\|$, $t > 0$, be its fundamental function. If $\beta_F^0 < \alpha_F^\infty$ then for any $\gamma \in (\beta_F^0, \alpha_F^\infty)$ it holds that $\|\chi_{(0,t)}\| \ge C t^\gamma$ for all $t > 0$ and some $C > 0$. This fact follows directly from the definition of the indices and gives a simple condition for the fundamental function to be estimated below by a power function.

Combining Theorems 7.22 and 7.18, we obtain the following corollary.

Corollary 7.24. Assume that there are $\gamma > 0$ and $c > 0$ such that $\|\chi_{[0,t]}\|_\Lambda \ge c\,t^\gamma$ for all $t \ge 0$. Assume also that $\varphi$ satisfies condition $\Delta_2$, and that either $v$ is increasing, regular and $\alpha_W > 0$, or $v$ is constant. Under these assumptions, the space $\ell_p$ is order isomorphic to a sublattice of $\Lambda^0_{\varphi,w,v}(0,\infty)$ if and only if $p \in [\alpha_\varphi^0, \beta_\varphi^0] \cup [\alpha_\varphi^\infty, \beta_\varphi^\infty]$. In particular, if in addition $w$ is decreasing and regular, then the similar statement holds for the Orlicz–Marcinkiewicz space $M^0_{\varphi,w}(0,\infty)$.

Now we are ready to provide another example which shows that the Orlicz space and the corresponding Orlicz–Lorentz space may have quite different pools of order isomorphic copies of $\ell_p$.

Example 7.25. Let $0 < p < q < \infty$, $\varphi(t) = t^p \wedge t^q$ and $W(t) = t^p \vee t^q$. Consider the space $\Lambda_{\varphi,w}(0,\infty)$. Then for $t > 0$,

$$\|\chi_{(0,t)}\|_\Lambda = \frac{1}{\varphi^{-1}(1/W(t))} = t.$$

It is also clear that $\alpha_\varphi^0 = \beta_\varphi^0 = q$ and $\alpha_\varphi^\infty = \beta_\varphi^\infty = p$. Thus by Corollary 7.24, the set of $r > 0$ such that $\ell_r$ is isomorphic to a sublattice of $\Lambda_{\varphi,w}(0,\infty)$ coincides with $[\alpha_\varphi^0, \beta_\varphi^0] \cup [\alpha_\varphi^\infty, \beta_\varphi^\infty] = \{q\} \cup \{p\}$. However, by Proposition 5.3, the analogous set for the space $L_\varphi(0,\infty)$ is equal to the interval $[\alpha_\varphi^\infty, \beta_\varphi^0] = [p,q]$. Consequently, the Orlicz space $L_\varphi(0,\infty)$ has considerably more sublattices isomorphic to $\ell_r$ than the Orlicz–Lorentz space $\Lambda_{\varphi,w}(0,\infty)$.

Remark 7.26. We wish to point out that the appropriately adjusted Theorems 7.18, 7.22 and their corollaries hold true in particular in Orlicz–Lorentz and Orlicz–Marcinkiewicz spaces over $(0,\infty)$.

We finish with a result in classical Lorentz spaces $\Lambda_{p,w}(0,\infty)$ with $0 < p < \infty$ and a positive weight $w$. It generalizes the Carothers and Dilworth result, Corollary 2.4 in [6], for the spaces $L_{p,q}(0,\infty)$, and it follows from Theorems 7.18 and 7.22, in view of $\|\chi_{(0,t)}\|_{\Lambda_{p,w}} = W(t)^{1/p}$, and Corollary 7.24.

Corollary 7.27. Let $0 < p < \infty$ and let $w$ be a positive measurable function on $(0,\infty)$ such that $W(t) < \infty$ for all $t > 0$ and $W$ satisfies condition $\Delta_2$. Let $0 < r < \infty$.
(i) Let $W(\infty) = \infty$. The space $\Lambda_{p,w}(0,\infty)$ contains an order isomorphic copy of $\ell_r$ if and only if either $p = r$ or $\int_0^\infty t^{-r/p}\, w(t)\, dt < \infty$.
(ii) Let $W(\infty) = \infty$. If $W(t) \ge C t^\gamma$ for all $t > 0$ and some $C, \gamma > 0$, then any closed infinite dimensional sublattice of the Lorentz space $\Lambda_{p,w}(0,\infty)$ contains an order isomorphic copy of $\ell_p$. Moreover, the space does not contain $\ell_r$ order isomorphically for any $r \ne p$.
(iii) If $W(\infty) < \infty$, then $\Lambda_{p,w}(0,\infty)$ has a sublattice isomorphic to $c_0$.

Appendix A

Let $(f_n)$ be a sequence of measurable functions (random variables) on a probability space $(\Omega, \mathcal{A}, \mathbb{P})$. We say that $(f_n)$ converges in distribution to a function $f$ if and only if for every bounded continuous function $F : \mathbb{R} \to \mathbb{R}$ we have

$$\int_\Omega F \circ f_n(\omega)\, d\mathbb{P}(\omega) \to \int_\Omega F \circ f(\omega)\, d\mathbb{P}(\omega).$$
It is equivalent to say that $d_{f_n}(t) \to d_f(t)$ at every point of continuity of $d_f$ [9]. Then $f_n^* \to f^*$ at every continuity point of $f^*$. Indeed, $|f_n|$ also converges in distribution to $|f|$, thus $d_{|f_n|} \to d_{|f|}$ almost everywhere, and by Lebesgue's theorem we have for every bounded continuous function $F$ and $N > 0$,

$$\int_0^N F\bigl(d_{|f_n|}(t)\bigr)\, dt \to \int_0^N F\bigl(d_{|f|}(t)\bigr)\, dt.$$
Approximating indicator functions of intervals by continuous functions, we deduce for every $a > 0$,

$$\bigl|\{d_{|f|} \ge a\} \cap [0,N]\bigr| \ge \limsup_n \bigl|\{d_{|f_n|} \ge a\} \cap [0,N]\bigr| \ge \liminf_n \bigl|\{d_{|f_n|} > a\} \cap [0,N]\bigr| \ge \bigl|\{d_{|f|} > a\} \cap [0,N]\bigr|.$$

Since $|\{d_{|f|} > a\}| = f^*(a) < \infty$ and $|\{d_{|f|} \ge a\}| = f^*(a^-)$ (the left limit of $f^*$ at $a$), we obtain by choosing $N > f^*(a)$:

$$f^*(a^-) \ge \limsup_n f_n^*(a) \ge \liminf_n f_n^*(a) \ge f^*(a),$$

thus $f_n^*(a) \to f^*(a)$ for all $a > 0$ where $f^*$ is continuous.

Recall that a random variable $f$ is symmetric and $p$-stable if for some $c > 0$ we have for every $t \in \mathbb{R}$,

$$\int_\Omega e^{itf(\omega)}\, d\mathbb{P}(\omega) = e^{-c|t|^p}$$

[28, pp. 181–182]. Such random variables exist if and only if $0 < p \le 2$ (for $p = 2$ these are symmetric Gaussian variables).
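Two standard instances of the definition, and the stability property that gives these variables their name, are recorded below as an illustrative sketch. The parameter $\sigma$ and the independent copies $f_1, f_2$ are notation introduced here for the example and do not appear in the original text.

```latex
\begin{align*}
p=2:&\quad f \sim N(0,\sigma^2), &
  \mathbb{E}\,e^{itf} &= e^{-\sigma^2 t^2/2}, & c &= \sigma^2/2;\\
p=1:&\quad f \text{ Cauchy with density } \frac{c}{\pi(c^2+x^2)}, &
  \mathbb{E}\,e^{itf} &= e^{-c|t|}. &&\\[4pt]
\text{Stability:}&\quad f_1, f_2 \text{ independent copies of } f, &
  \mathbb{E}\,e^{it(af_1+bf_2)} &= e^{-c|a|^p|t|^p}\,e^{-c|b|^p|t|^p}
   = e^{-c\,(|a|^p+|b|^p)\,|t|^p}. &&
\end{align*}
```

Hence $af_1 + bf_2$ has the same distribution as $(|a|^p + |b|^p)^{1/p} f$, which is the probabilistic counterpart of the $\ell_p$-type identity used later in this appendix.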
Theorem A.1. Let $0 < r \le 1$, let $(\Omega, \mathcal{A}, \mathbb{P})$ be a nonatomic separable probability space, and let $X$ be a subspace of $L_r[\Omega, \mathcal{A}, \mathbb{P}]$ which is isomorphic to the space $\ell_p$ for some $p \in (0,\infty)$. Then $X$ contains a normalized sequence of functions which converges in distribution to a norm one function which is equimeasurable with a product $w\gamma_p$, where $0 \le w \in L_r$, $\gamma_p$ is a symmetric $p$-stable random variable and $w$ is independent of $\gamma_p$.

This theorem is part of the folklore and could certainly be recovered from Aldous' work [1] by verifying that changing $L_1$ to $L_r$ in the proofs there is harmless. Rather than suggesting this tedious task to the reader, let us indicate another path, using the connection between two points of view, that of random measures of [1] and that of stable Banach spaces of [26]. Such a connection is nicely sketched in [11, Section 6]. Indeed, the adaptation of the theory of stability to $r$-Banach spaces has been done explicitly in [4].

A type on an $r$-Banach space $Y$ is a function $\tau : Y \to \mathbb{R}_+$ which can be defined by

$$\tau(y) = \lim_n \|y + y_n\|, \qquad y \in Y,$$

where $(y_n)$ is a bounded sequence in $Y$ (such a sequence is called a defining sequence). The type defined by the sequence constantly equal to $0$ is called the "trivial type". The space $Y$ is called stable if whenever $(y_n)$ defines a type $\tau$ and $(z_n)$ defines a type $\sigma$, the limits $\lim_n \sigma(y_n)$ and $\lim_m \tau(z_m)$ exist and are equal. In this case one can define an operation on the types, called "convolution", by

$$\tau * \sigma(y) = \lim_n \sigma(y + y_n) = \lim_n \tau(y + z_n).$$

One can also define an action of $\lambda \in \mathbb{R}$ on the types, named "dilation", by

$$\lambda \cdot \tau(y) = \lim_n \|y + \lambda y_n\|.$$

A type $\tau$ is called an $\ell_p$-type if for all $a, b \in \mathbb{R}$,

$$(a \cdot \tau) * (b \cdot \tau) = \bigl(|a|^p + |b|^p\bigr)^{1/p} \cdot \tau.$$

If $\tau$ is a non-trivial $\ell_p$-type then the linear span of every defining sequence of $\tau$ contains a basic sequence almost 1-equivalent to the unit vector basis of $\ell_p$ [4, Proposition 3]. Conversely, every infinite dimensional closed subspace $X$ of a stable $r$-Banach space $E$ contains a sequence defining a non-trivial $\ell_q$-type, for some $r \le q < \infty$ [4, Propositions 1 and 2]. Among the stable quasi-Banach spaces are the $L_p$-spaces, $0 < p < \infty$ (see [22] in the case $p \ge 1$ and [4] when $p < 1$).

Let $\mathcal{P}$ be the convex set of (Radon) probability measures on $\mathbb{R}$ equipped with the narrow topology. A random probability on a probability space $(\Omega, \mathcal{A}, \mathbb{P})$ is a map $\Omega \to \mathcal{P}$, $\omega \mapsto \mu_\omega$, which is measurable from $\mathcal{A}$ to the Borel $\sigma$-algebra of $\mathcal{P}$. The usual operations in $\mathcal{P}$, the dilation and the convolution, are extended pointwise to random probabilities, that is, $(\lambda \cdot \mu)_\omega = \lambda \cdot \mu_\omega$ and $(\mu * \nu)_\omega = (\mu_\omega) * (\nu_\omega)$. (The dilation $\lambda \cdot m$ of a probability measure $m$ is defined by $\int f(t)\, d(\lambda \cdot m)(t) = \int f(\lambda t)\, dm(t)$.) Let $\mathcal{M}_r$ be the set of random probabilities $\mu$ such that

$$\int_\Omega \int_{\mathbb{R}} |t|^r\, d\mu_\omega(t)\, d\mathbb{P}(\omega) < \infty.$$
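If $\mu \in \mathcal{M}_r$ and $\beta \in \mathbb{R}_+$, then the map $\tau : L_r \to \mathbb{R}_+$,

$$\tau(f) = \left( \int_\Omega \int_{\mathbb{R}} |t + f(\omega)|^r\, d\mu_\omega(t)\, d\mathbb{P}(\omega) + \beta^r \right)^{1/r},$$

always defines a type on $L_r(\Omega, \mathcal{A}, \mathbb{P})$, and every type is represented in this form. The argument in [11, Proposition 6.4] shows that this representation is unique whenever $0 < r \le 1$ (in fact, whenever $r$ is not an even integer). Denote by $T(\beta, \mu)$ the type represented by the pair $(\beta, \mu)$. A particular case is $\beta = 0$ and $\mu_\omega = \delta_{h(\omega)}$, the evaluation map at the point $h(\omega)$, where $h$ is a function in $L_r(\Omega, \mathcal{A}, \mathbb{P})$. Then $T(0, \delta_h)(f) = \|f + h\|_r$. It appears that for any $\beta, \gamma > 0$ and $\mu, \nu \in \mathcal{M}_r$ we have

$$T(|\lambda|\beta, \lambda \cdot \mu) = \lambda \cdot T(\beta, \mu)$$

and

$$T(\beta, \mu) * T(\gamma, \nu) = T\bigl((\beta^r + \gamma^r)^{1/r}, \mu * \nu\bigr).$$

In particular $T(\beta, \mu)$ is an $\ell_p$-type if and only if for all $a, b \in \mathbb{R}$,

$$\bigl((|a|\beta)^r + (|b|\beta)^r\bigr)^{1/r} = \bigl(|a|^p + |b|^p\bigr)^{1/p}\beta \quad\text{and}\quad (a\mu) * (b\mu) = \bigl(|a|^p + |b|^p\bigr)^{1/p}\mu.$$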
Assuming that $p > r$ we have necessarily $\beta = 0$ and $(a\mu_\omega) * (b\mu_\omega) = (|a|^p + |b|^p)^{1/p}\mu_\omega$ for a.a. $\omega$, that is, $\mu_\omega$ is the probability distribution of a symmetric $p$-stable random variable. The condition $\mu \in \mathcal{M}_r$ then requires $r < p \le 2$. Therefore we have for all $s \in \mathbb{R}$,

$$\int_{\mathbb{R}} e^{ist}\, d\mu_\omega(t) = e^{-w(\omega)^p |s|^p},$$

where $w$ is a non-negative measurable function on $\Omega$. The condition $\mu \in \mathcal{M}_r$ yields that $w \in L_r(\Omega, \mathcal{A}, \mathbb{P})$. Equivalently, there is a fixed symmetric $p$-stable variable $\gamma$, defined say on $([0,1], |\cdot|)$, where $|\cdot|$ is the Lebesgue measure, such that

$$\int_{\mathbb{R}} F(t)\, d\mu_\omega(t) = \int_{[0,1]} F\bigl(w(\omega)\gamma(s)\bigr)\, ds$$
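for every bounded continuous function $F$.

Let $X \subset L_r(\Omega, \mathcal{A}, \mathbb{P})$ be a closed subspace isomorphic to $\ell_p$ and let $(x_n)$ be a normalized sequence in $X$ which defines an $\ell_p$-type. It follows from [11, Proposition 6.4] that

$$\int_A F\bigl(x_n(\omega)\bigr)\, d\mathbb{P}(\omega) = \int_A \int_{\mathbb{R}} F(t)\, d\delta_{x_n(\omega)}(t)\, d\mathbb{P}(\omega) \xrightarrow[n \to \infty]{} \int_A \int_{\mathbb{R}} F(t)\, d\mu_\omega(t)\, d\mathbb{P}(\omega) = \int_A \int_{[0,1]} F\bigl(w(\omega)\gamma(s)\bigr)\, ds\, d\mathbb{P}(\omega)$$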
for every $A \in \mathcal{A}$ and every bounded continuous function $F$ on $\mathbb{R}$. In particular, taking $A = \Omega$ we see that the sequence $(x_n)$ (of random variables defined on $(\Omega, \mathcal{A}, \mathbb{P})$) converges in distribution to the random variable $Z$ on $(\Omega \times [0,1], \mathbb{P} \otimes |\cdot|)$ given by $Z(\omega, s) = w(\omega)\gamma(s)$. Now since all separable nonatomic probability spaces are equivalent, there is a pair $(w', \gamma')$ of random variables defined on $(\Omega, \mathcal{A}, \mathbb{P})$ with the same joint probability distribution as the pair $(w, \gamma)$. Thus $w'$, $\gamma'$ are independent and respectively equimeasurable with $w$, $\gamma$.

References

[1] D.J. Aldous, Subspaces of $L_1$, via random measures, Trans. Amer. Math. Soc. 267 (2) (1981) 445–463.
[2] C.A. Aliprantis, O. Burkinshaw, Positive Operators, Academic Press, 1985.
[3] F. Albiac, N.J. Kalton, Topics in Banach Space Theory, Springer, 2006.
[4] J. Bastero, $\ell_q$-subspaces of stable $p$-Banach spaces, $0 < p \le 1$, Arch. Math. (Basel) 40 (6) (1983) 538–544.
[5] C. Bennett, R. Sharpley, Interpolation of Operators, Academic Press, 1988.
[6] N.L. Carothers, J. Dilworth, Subspaces of $L_{p,q}$, Proc. Amer. Math. Soc. 104 (1988) 537–545.
[7] J. Cerdá, H. Hudzik, A. Kamińska, M. Mastyło, Geometric properties of symmetric spaces with applications to Orlicz–Lorentz spaces, Positivity 2 (1998) 311–337.
[8] D. Dacunha-Castelle, J.-L. Krivine, Sous-espaces de $L_1$, Israel J. Math. 26 (3–4) (1977) 320–351.
[9] W. Feller, An Introduction to Probability Theory and Its Applications (vols. I and II), Wiley, New York, 1957, 1971.
[10] T. Figiel, W.B. Johnson, L. Tzafriri, On Banach lattices and spaces having local unconditional structure with applications to Lorentz function spaces, J. Approx. Theory 13 (1975) 297–312.
[11] R. Haydon, E. Odell, Th. Schlumprecht, Small subspaces of $L_p$, preprint, arXiv:0711.3919v1.
[12] F.L. Hernández, B. Rodríguez-Salinas, Remarks on the Orlicz function space $L_\varphi(0,\infty)$, Math. Nachr. 156 (1992) 225–232.
[13] F.L. Hernández, V.M. Sanchez, E.M. Semenov, Strictly singular inclusions into $L_1 + L_\infty$, Math. Z. 258 (1) (2008) 87–106.
[14] H. Hudzik, A. Kamińska, M. Mastyło, Geometric properties of some Calderón–Lozanovskii spaces and Orlicz–Lorentz spaces, Houston J. Math. 22 (3) (1996) 639–663.
[15] H. Hudzik, A. Kamińska, M. Mastyło, On the dual of Orlicz–Lorentz spaces, Proc. Amer. Math. Soc. 130 (6) (2002) 1645–1654.
[16] N.J. Kalton, Convexity conditions on non locally-convex lattices, Glasgow Math. J. 25 (1984) 141–152.
[17] N.J. Kalton, N.T. Peck, J.W. Roberts, An F-Sampler, London Math. Soc. Lecture Note Ser., vol. 89, Cambridge Univ. Press, 1984.
[18] A. Kamińska, Some remarks on Orlicz–Lorentz spaces, Math. Nachr. 147 (1990) 29–38.
[19] A. Kamińska, L. Maligranda, Order convexity and concavity of Lorentz spaces $\Lambda_{p,w}$, $0 < p < \infty$, Studia Math. 160 (3) (2004) 267–286.
[20] A. Kamińska, L. Maligranda, L.E. Persson, Indices, convexity and concavity of Calderón–Lozanovskii spaces, Math. Scand. 92 (2003) 141–160.
[21] A. Kamińska, L. Maligranda, L.E. Persson, Indices and regularizations of measurable functions, in: Function Spaces, The Fifth Conference: Proceedings of the Conference at Poznań, Poland, in: Lect. Notes Pure Appl. Math., vol. 213, Marcel Dekker, 2000, pp. 231–246.
[22] A. Kamińska, M. Mastyło, Abstract duality Sawyer's formula and its applications, Monatshefte Math. 151 (2007) 223–245.
[23] A. Kamińska, Y. Raynaud, Isomorphic $\ell_p$-subspaces in Orlicz–Lorentz sequence spaces, Proc. Amer. Math. Soc. 134 (8) (2006) 2317–2327.
[24] A. Kamińska, Y. Raynaud, Copies of $\ell_p$ and $c_0$ in general quasi-normed Orlicz–Lorentz sequence spaces, in: Function Spaces, in: Contemp. Math., vol. 435, 2007, pp. 207–227.
[25] S.G. Krein, Ju.I. Petunin, E.M. Semenov, Interpolation of Linear Operators, Amer. Math. Soc., Providence, RI, 1982.
[26] J.L. Krivine, B. Maurey, Espaces de Banach stables, Israel J. Math. 39 (4) (1981) 273–295.
[27] J. Lindenstrauss, L. Tzafriri, Classical Banach Spaces I, Springer-Verlag, 1977.
[28] J. Lindenstrauss, L. Tzafriri, Classical Banach Spaces II, Springer-Verlag, 1979.
[29] J. Lindenstrauss, L. Tzafriri, On Orlicz sequence spaces III, Israel J. Math. 14 (1972) 368–389.
[30] W.A.J. Luxemburg, Banach function spaces, Thesis, Technische Hogeschool te Delft, 1955.
[31] W. Matuszewska, W. Orlicz, On certain properties of $\varphi$-functions, Bull. Acad. Polon. Sci. Ser. Sci. Math. Astronom. Phys. 8 (1960) 439–443.
[32] J. Musielak, Orlicz Spaces and Modular Spaces, Lecture Notes in Math., vol. 1034, Springer-Verlag, 1983.
[33] N.J. Nielsen, On the Orlicz function spaces $L_M(0,\infty)$, Israel J. Math. 20 (1975) 237–259.
[34] N. Popa, Uniqueness of the symmetric structure in $L_p(\mu)$ for $0 < p < 1$, Rev. Roumaine Math. Pures Appl. 27 (10) (1982) 1061–1089.
[35] Y. Raynaud, On Lorentz–Sharpley spaces, in: Interpolation Spaces and Related Topics, Haifa, 1990, in: Israel Math. Conf. Proc., vol. 5, Bar-Ilan Univ., Ramat Gan, 1992, pp. 207–228.
[36] Y. Raynaud, Complemented Hilbertian subspaces in rearrangement invariant function spaces, Illinois J. Math. 39 (1995) 212–250.
[37] S. Reisner, On the duals of Lorentz function and sequence spaces, Indiana Univ. Math. J. 31 (1982) 65–72.
[38] H.L. Royden, Real Analysis, third ed., Macmillan Publishing Company, 1988.
[39] A. Torchinsky, Interpolation of operations and Orlicz classes, Studia Math. 59 (1976) 177–207.
[40] J.Y.T. Woo, On modular sequence spaces, Studia Math. 48 (1973) 271–289.
[41] A.C. Zaanen, Riesz Spaces II, North-Holland Math. Library, vol. 30, 1983.
Journal of Functional Analysis 257 (2009) 332–339 www.elsevier.com/locate/jfa
Isomorphism of Hilbert modules over stably finite C∗-algebras
Nathanial P. Brown a,∗,1, Alin Ciuperca b,2
a Department of Mathematics, Penn State University, State College, PA 16802, USA
b Fields Institute, 222 College Street, Toronto, Ontario, Canada, M5T 3J1
Received 4 November 2008; accepted 4 December 2008 Available online 19 December 2008 Communicated by Dan Voiculescu
Abstract
It is shown that if A is a stably finite C∗-algebra and E is a countably generated Hilbert A-module, then E gives rise to a compact element of the Cuntz semigroup if and only if E is algebraically finitely generated and projective. It follows that if E and F are equivalent in the sense of Coward, Elliott and Ivanescu (CEI) and E is algebraically finitely generated and projective, then E and F are isomorphic. In contrast to this, we exhibit two CEI-equivalent Hilbert modules over a stably finite C∗-algebra that are not isomorphic.
© 2008 Elsevier Inc. All rights reserved.
Keywords: C∗-algebras; Hilbert modules; Cuntz semigroup; compact
* Corresponding author.
E-mail addresses: [email protected] (N.P. Brown), [email protected] (A. Ciuperca).
1 N.B. was partially supported by DMS-0554870.
2 A.C. was partially supported by Fields Institute.
0022-1236/$ – see front matter © 2008 Elsevier Inc. All rights reserved. doi:10.1016/j.jfa.2008.12.004

1. Introduction

In [3] a new equivalence relation on Hilbert modules, which we will call CEI equivalence, was introduced. In general CEI equivalence is weaker than isomorphism, but it was shown that if A has stable rank one, then it is the same as isomorphism [3, Theorem 3]. Quite naturally, the authors wondered whether their result could be extended to the stably finite case. Unfortunately, it cannot. In Section 4, we give examples of Hilbert modules over a stably finite C∗-algebra which are CEI-equivalent, but not isomorphic. On the other hand, we show in Section 3 that CEI equivalence amounts to isomorphism when restricted to “compact” elements of the Cuntz semigroup, in the stably finite case.

2. Definitions and preliminaries

Throughout this note all C∗-algebras are assumed to be separable and all Hilbert modules are assumed to be right modules and countably generated. We will follow standard terminology and notation in the theory of Hilbert modules (see, for example, [5]). In particular, K denotes the compact operators on ℓ²(N), while K(E) will denote the “compact” operators on a Hilbert module E. For the reader’s convenience, we recall a few definitions that are scattered throughout [3].

Definition 2.1. If E ⊂ F are Hilbert A-modules, we say E is compactly contained in F if there exists a self-adjoint T ∈ K(F) such that T|E = idE. In this situation we write E ⊂⊂ F.

Note that E ⊂⊂ E if and only if K(E) is unital; it can be shown that this is also equivalent to E being algebraically finitely generated and projective (in the purely algebraic category of right A-modules); see the proof of [3, Corollary 5] (this part of the proof did not require the assumption of stable rank one).

Definition 2.2. We say a Hilbert A-module E is CEI subequivalent to another Hilbert A-module F if every compactly contained submodule of E is isomorphic to a compactly contained submodule of F. We say E and F are CEI equivalent if they are CEI subequivalent to each other, i.e., a third Hilbert A-module X is isomorphic to a compactly contained submodule of E if and only if X is isomorphic to a compactly contained submodule of F.

Definition 2.3. We let Cu(A) denote the set of Hilbert A-modules, modulo CEI equivalence. The class of a module E in Cu(A) will be denoted [E].

It turns out that Cu(A) is an abelian semigroup with [E] + [F] := [E ⊕ F]. (Note: it is not even obvious that this is well defined!) Moreover Cu(A) is partially ordered ([E] ≤ [F] ⟺ E is CEI subequivalent to F) and every increasing sequence has a supremum (i.e., least upper bound). See [3, Theorem 1] for proofs of these facts.

Definition 2.4. An element x ∈ Cu(A) is compact (in the order-theoretic sense) if for every increasing sequence {x_n} ⊂ Cu(A) with x ≤ sup_n x_n there exists n_0 ∈ N such that x ≤ x_{n_0}.

For a unital C∗-algebra A, stable finiteness means that for every n ∈ N, Mn(A) contains no infinite projections. In the nonunital case there are competing definitions, but it seems most popular to say A is stably finite if the unitization Ã is stably finite, so this is the definition we will use.

3. Main results

The proof of our first lemma is essentially contained in the proof of [3, Corollary 5].
Lemma 3.1. Assume E ⊂⊂ F is a compact inclusion of Hilbert A-modules. If E ≅ F then either E = F or A ⊗ K contains a scaling element (in the sense of [1]). If A is stably finite, then A ⊗ K cannot contain a scaling element; hence, in this case, E ≅ F if and only if E = F.

Proof. Assume E is properly contained in F; we will show A ⊗ K contains a scaling element. Let v : F → E be an isomorphism and T ∈ K(F) be a positive operator such that T|E = idE. As observed in [3], the map vT is adjointable (i.e., it defines an element of L(F)) and, in fact, is compact. (This assertion is readily checked whenever T is a “finite-rank” operator.) Moreover, a calculation shows that (vT)∗|E = T v⁻¹. It is also worth noting that T(vT) = vT, since T|E = idE and vT(F) ⊂ E.

The scaling element we are after is x = vT. Indeed, one checks that x∗x = T²; hence,

(x∗x)(xx∗) = T²(vT)(vT)∗ = (vT)(vT)∗ = xx∗.

Finally, we must see why xx∗ ≠ x∗x. But if xx∗ = x∗x, then T² = (vT)(vT)∗ and thus T²(F) ⊂ vT(F) ⊂ E. It follows that T² is a self-adjoint projection onto E (since T²|E = idE, too), and hence x = vT is a partial isometry whose support and range coincide with E. But this is impossible because T = T² (since T ≥ 0), so vT(F) ⊊ E (since T(F) = E ⊊ F). We have shown that if E ⊊ F, then K(F) contains a scaling element. But Kasparov’s stabilization theorem provides us with an inclusion K(F) ⊂ A ⊗ K, so the proof of the first part is complete.

In the case that A is stably finite, it is well known to the experts that A ⊗ K cannot contain a scaling element. Indeed, if it did, then [1, Corollary 4.4] implies that Mn(A) contains a scaling element, for some n ∈ N. But it was shown in [1] that the unitization (Mn(A))~ would then have an infinite projection. However, there is a natural embedding (Mn(A))~ ⊂ Mn(Ã), which contradicts the assumption of stable finiteness. □

Note that the canonical Hilbert module ℓ²(A) is isomorphic to lots of (non-compactly contained) proper submodules.

Proposition 3.2. Let E be a Hilbert A-module such that [E] is compact in Cu(A). Then either E ⊂⊂ E or A ⊗ K contains a scaling element.

Proof. Let h ∈ K(E) be a norm-one strictly positive element. If 0 is an isolated point in the spectrum σ(h), then functional calculus provides a projection p ∈ K(E) such that p = idE; so E ⊂⊂ E, in this case.

If 0 ∈ σ(h) is not isolated, then, again using functional calculus, we can find E_1 ⊂⊂ E_2 ⊂⊂ E_3 ⊂⊂ ··· ⊂⊂ E such that ⋃_i E_i is dense in E and E_i ⊊ E_{i+1} for all i ∈ N. (For example, if f_i ∈ C0(0, 1] is zero on the interval (0, 1/2^i] and one on the interval [1/2^{i−1}, 1], then, since f_{i+1}(h) f_i(h) = f_i(h), we can let E_i be the closure of f_i(h)E.) Since [E] is compact, there exists i such that [E_i] = [E]. Since E_{i+1} ⊂⊂ E, E_{i+1} is isomorphic to a compactly contained submodule of E_i and this isomorphism restricted to E_i maps onto a proper submodule of E_i (since E_i ⊊ E_{i+1}). Thus E_i is isomorphic to a proper compactly contained submodule of itself. Hence, by Lemma 3.1, A ⊗ K contains a scaling element. □

Corollary 3.3. Let A be stably finite and E be a Hilbert A-module. Then [E] ∈ Cu(A) is compact if and only if E ⊂⊂ E. In particular, if [E] is compact and [E] ≤ [F], then E is isomorphic to a compactly contained submodule of F.
Proof. The “only if” direction is immediate from the previous proposition. So assume E ⊂⊂ E and let [Fn] ∈ Cu(A) be an increasing sequence such that [E] ≤ [F] := sup[Fn]. By definition, E is then isomorphic to a compactly contained submodule E′ ⊂⊂ F. In the proof of [3, Theorem 1] it is shown that if E′ ⊂⊂ F and [F] = sup[Fn], then there is some n ∈ N such that [E′] ≤ [Fn]. Since [E] = [E′], the proof is complete. □

Corollary 3.4. Let A be stably finite and E, F be Hilbert A-modules. If [E] = [F] ∈ Cu(A) is compact, then E ≅ F. In particular, if [E] = [F] and E is algebraically finitely generated and projective, then [E] ∈ Cu(A) is compact; hence, E ≅ F.

Proof. Assume [E] = [F] is compact. Then E ⊂⊂ E and F ⊂⊂ F, by the previous corollary. Hence there exist isomorphisms v : F → F′ ⊂⊂ E and u : E → E′ ⊂⊂ F. It follows that F ≅ u(v(F)) ⊂⊂ F, which, by Lemma 3.1, implies that u(v(F)) = F. Hence u is surjective, as desired.

As mentioned after Definition 2.1, if E is algebraically finitely generated and projective, then E ⊂⊂ E, which implies [E] is compact (as we have seen). □

In the appendix of [3] it is shown that Cu(A) is isomorphic to the classical Cuntz semigroup W(A ⊗ K). When A is stable, the isomorphism W(A) → Cu(A) is very easy to describe: the Cuntz class ⟨a⟩ of a ∈ A+ is sent to H_a, the closure of aA (with its canonical Hilbert A-module structure).

Theorem 3.5. Let A be a stable, finite C∗-algebra, a ∈ A+ and H_a the closure of aA. The following are equivalent:

(1) H_a is algebraically finitely generated and projective;
(2) [H_a] ∈ Cu(A) is compact;
(3) σ(a) ⊂ {0} ∪ [ε, ∞) for some ε > 0;
(4) ⟨a⟩ = ⟨p⟩ ∈ W(A) for some projection p ∈ A.

Proof. The implication (1) ⇒ (2) was explained above.

(2) ⇒ (3): Let a_ε = (a − ε)+. Then H_{a_ε} ⊂⊂ H_a and ⋃_ε H_{a_ε} is dense in H_a. Since [H_a] ∈ Cu(A) is compact, there exists ε > 0 such that [H_a] = [H_{a_ε}]. Corollary 3.4 implies that H_a ≅ H_{a_ε}; thus H_a = H_{a_ε}, by Lemma 3.1. It follows that σ(a) ⊂ {0} ∪ [ε, ∞), because otherwise functional calculus would provide a nonzero element b ∈ C∗(a) such that 0 ≤ b ≤ a (so b ∈ H_a) and a_ε b = 0 (so b ∉ H_{a_ε}), which would contradict the equality H_a = H_{a_ε}.

(3) ⇒ (4) is a routine functional calculus exercise.

(4) ⇒ (1): Assume ⟨a⟩ = ⟨p⟩ ∈ W(A). Since pA is singly generated and algebraically projective, Corollary 3.4 implies H_a is isomorphic to pA. □

The equivalence of (3) and (4) above generalizes Proposition 2.8 in [6].

Corollary 3.6. If A is stably finite, then A ⊗ K has no nonzero projections if and only if Cu(A) contains no compact element.
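The identities used in the proof of Lemma 3.1 for x = vT can be spelled out as follows. This is only a sketch: it assumes, as the proof does implicitly, that the isomorphism v preserves the A-valued inner product ⟨·,·⟩, and it takes "scaling element" in the sense of [1] to mean an element x with (x∗x)(xx∗) = xx∗ and x∗x ≠ xx∗.

```latex
% Sketch. Assumptions: v : F -> E preserves the A-valued inner product,
% and T = T^* in K(F) satisfies T|_E = id_E.
\begin{align*}
\langle x\xi, x\eta\rangle
  &= \langle vT\xi, vT\eta\rangle
   = \langle T\xi, T\eta\rangle
   = \langle T^2\xi, \eta\rangle
  && \Longrightarrow\quad x^*x = T^2, \\
T x &= T(vT) = vT = x
  && \text{(since } vT(F)\subset E \text{ and } T|_E=\mathrm{id}_E\text{)}, \\
(x^*x)(xx^*) &= T^2\,x x^* = T\,(Tx)\,x^* = (Tx)\,x^* = x x^*.
\end{align*}
```

Only the inequality x∗x ≠ xx∗ uses the hypothesis that E is a proper submodule of F.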
4. A counterexample

Now let us show that if A is stably finite and E, F are Hilbert A-modules such that [E] = [F], then it need not be true that E and F are isomorphic.

Let A = C0(0, 1] ⊗ O3 ⊗ K, where O3 is the Cuntz algebra with three generators. Voiculescu’s homotopy invariance theorem (cf. [7]) implies that A is quasidiagonal, hence stably finite. Let p, q ∈ O3 ⊗ K be two nonzero projections which are not Murray–von Neumann equivalent. If x ∈ C0(0, 1] denotes the function t ↦ t, then we define fp = x ⊗ p and fq = x ⊗ q in A. Since A is purely infinite in the sense of [4] and the ideals generated by fp and fq coincide, it follows that [fpA] = [fqA] ∈ Cu(A).

We claim that the modules fpA and fqA are not isomorphic. Indeed, if they were isomorphic, then we could find v ∈ A such that v∗v = fp and vv∗A = fqA. (See [2, Lemma 3.4.2]; if T : fpA → fqA is an isomorphism, then v = T(fp^{1/2}) has the asserted properties.) Letting π : A → O3 ⊗ K be the quotient map corresponding to evaluation at 1 ∈ (0, 1], it follows that π(v)∗π(v) = p and π(v)π(v)∗(O3 ⊗ K) = q(O3 ⊗ K). Since π(v)π(v)∗ is a projection whose associated hereditary subalgebra agrees with the hereditary subalgebra generated by q, it follows that π(v)π(v)∗ = q (since both projections are units for the same algebra). This contradicts the assumption that p and q are not Murray–von Neumann equivalent, so fpA and fqA cannot be isomorphic.

5. Questions and related results

If the following question has an affirmative answer, then the proof of [3, Corollary 5] would show that A has real rank zero if and only if the compacts are “dense” in Cu(A).

Question 5.1. Can Corollary 3.4 be extended to the “closure” of the compact elements? That is, if A is stably finite and E and F are Hilbert A-modules such that [E] = [F] = sup[Cn] for an increasing sequence of compact elements [Cn], does it follow that E ≅ F?

The next question was raised in [3], but we repeat it because the modules in Section 4 are not counterexamples: they mutually embed into each other. (To prove this, use the fact that p is Murray–von Neumann equivalent to a subprojection of q, and vice versa.)

Question 5.2. Are there two Hilbert modules E and F such that [E] = [F], but F is not isomorphic to a submodule of E?

Question 5.3. If x ∈ Cu(A) is compact, is there a projection p ∈ A ⊗ K such that x = ⟨p⟩?

Of course, in the stably finite case the results of Section 3 tell us that much more is true, but for general C∗-algebras we do not know the answer to this question. However, we can give an affirmative answer in some interesting cases, as demonstrated below. First, a definition.

Definition 5.4. An element x ∈ Cu(A) will be called infinite if x + y = x for some nonzero y ∈ Cu(A). Otherwise, x will be called finite.

Note that [ℓ²(A)] ∈ Cu(A) is always infinite.

Lemma 5.5. If A is simple, then [ℓ²(A)] ∈ Cu(A) is the unique infinite element.
Proof. Assume [E] + [F] = [E] for some nonzero Hilbert A-module F. Adding [F] to both sides, we see that [E] + 2[F] = [E]; repeating this, we have that [E] + k[F] = [E] for all k ∈ N. By uniqueness of suprema, it follows that [E] + [ℓ²(F)] = [E] (cf. [3, Theorem 1]). Since A is simple, F is necessarily full and hence ℓ²(F) ≅ ℓ²(A) [5, Proposition 7.4]. Thus

[E] = [E] + [ℓ²(F)] = [E ⊕ ℓ²(A)] = [ℓ²(A)],

by Kasparov’s stabilization theorem. □
In the proof of the following lemma, we use the operator inequality xbx∗ + y∗by ≥ xby + y∗bx∗, valid for any b in A+ and x, y ∈ A (it follows from the fact that (x − y∗)b(x − y∗)∗ ≥ 0).

Lemma 5.6. Let A be a stable algebraically simple C∗-algebra.

(1) For any nonzero x ∈ Cu(A) there exists n ∈ N such that nx = [A].
(2) There exists a projection q ∈ A such that [A] = [qA]. In particular, [A] is a compact element of the Cuntz semigroup Cu(A).

Proof. It will be convenient to work in the original positive-element picture of the Cuntz semigroup. Our notation is by now standard (cf. [6]).

Proof of (1): Let x = [bA] for some 0 ≠ b ∈ A+ and let a ∈ A be a strictly positive element. (Stability implies that every right Hilbert A-module is isomorphic to a closed right ideal of A.) Since A is algebraically simple, one can find x_1, ..., x_k, y_1, ..., y_k ∈ A such that a = Σ_{i=1}^{k} x_i b y_i. Thus,

a ∼ 2a = a + a∗ = Σ_{i=1}^{k} (x_i b y_i + y_i∗ b x_i∗)
  ≤ Σ_{i=1}^{k} (x_i b x_i∗ + y_i∗ b y_i)
  ≲ x_1 b x_1∗ ⊕ y_1∗ b y_1 ⊕ ··· ⊕ x_k b x_k∗ ⊕ y_k∗ b y_k
  ≲ b ⊕ b ⊕ ··· ⊕ b,

where the last sum has n = 2k summands. Since A is stable, one can embed the Cuntz algebra On in the multiplier algebra M(A). This gives us isometries s_1, ..., s_n ∈ M(A) with orthogonal ranges. Set b_i = s_i b s_i∗ and note that b_i ∼ b and b_i ⊥ b_j for i ≠ j. Moreover, a ≲ b_1 + ··· + b_n ≲ a (since a is strictly positive, it Cuntz-dominates any element of A). Therefore, ⟨a⟩ = n⟨b⟩ = nx, or equivalently, [A] = nx.

Proof of (2): Since A is stable and algebraically simple, [1, Theorem 3.1] implies A has a nonzero projection p. As above, we can find orthogonal projections p_1, ..., p_n ∈ A such that p_i ∼ p and ⟨p_1 + ··· + p_n⟩ = n⟨p⟩ = [A]. Defining q = p_1 + ··· + p_n, we are done. □

We will also need a consequence of the work in Section 3.
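For reference, the operator inequality quoted before Lemma 5.6 comes from a single expansion. The following short derivation is a sketch that uses only the fact that b ≥ 0 implies zbz∗ ≥ 0 for every z ∈ A; the auxiliary symbol z is introduced here purely for illustration.

```latex
% Sketch: b >= 0 in A, x, y in A arbitrary; z is an auxiliary name.
\begin{align*}
0 &\le z\,b\,z^*, \qquad z := x - y^*, \quad z^* = x^* - y, \\
0 &\le (x - y^*)\,b\,(x^* - y)
   = x b x^* - x b y - y^* b x^* + y^* b y, \\
\text{hence}\quad x b y + y^* b x^* &\le x b x^* + y^* b y .
\end{align*}
```

Applying this with x = x_i and y = y_i and summing over i gives the step from a + a∗ to the sum of the terms x_i b x_i∗ + y_i∗ b y_i in the proof of part (1).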
Proposition 5.7. If A is stable, ⟨a⟩ ∈ W(A) = Cu(A) is compact and 0 ∈ σ(a) is not an isolated point, then A contains a scaling element and ⟨a⟩ is infinite.

Proof. Assume A contains no scaling element. Since ⟨a⟩ is compact, Proposition 3.2 implies that H_a ⊂⊂ H_a. As in the proof of (2) ⇒ (3) in Theorem 3.5, there exists ε > 0 such that [H_a] = [H_{a_ε}] and hence H_a is isomorphic to a compactly contained submodule E of H_{a_ε}. Lemma 3.1 implies E = H_a, so H_{a_ε} = H_a too. As we have seen, this implies σ(a) ⊂ {0} ∪ [ε, ∞), contradicting our hypothesis; hence, A contains a scaling element.

To prove the second assertion, choose ε > 0 such that [H_a] = [H_{a_ε}]. Since 0 ∈ σ(a) is not isolated, we can find a nonzero positive function f ∈ C0(0, ‖a‖] such that f(t) = 0 for all t ≥ ε. Thus f(a) + (a − ε)+ ≲ a and f(a)(a − ε)+ = 0. It follows that

[H_{f(a)}] + [H_a] = [H_{f(a)}] + [H_{a_ε}] ≤ [H_a]

and thus [H_a] is infinite. □

Theorem 5.8. Let x ∈ Cu(A) be compact.

(1) If A is simple, then there exists a projection p ∈ A ⊗ K such that x = ⟨p⟩.
(2) If x is finite, then there exists a projection p ∈ A ⊗ K such that x = ⟨p⟩.

Proof. In both cases we may assume A is stable.

Proof of (1): Fix a nonzero positive element a ∈ A such that x = [H_a]. If 0 ∈ σ(a) is an isolated point, then functional calculus provides us with a Cuntz equivalent projection, and we’re done. Otherwise Proposition 5.7 tells us that x is infinite and A contains a scaling element. By simplicity and Lemma 5.5, we have that x = [ℓ²(A)] = [A] (by stability). Moreover, the existence of a scaling element ensures that A is algebraically simple (see [1, Theorem 1.2]). Hence part (2) of Lemma 5.6 provides the desired projection.

Proof of (2): Choose a ∈ A+ such that x = ⟨a⟩. Since x is finite, Proposition 5.7 implies 0 ∈ σ(a) is an isolated point, so we are done. □

Remark 5.9. It is possible to improve part (2) of the theorem above. Namely, it is shown in [2] that if x ∈ Cu(A) is compact and there is no compact element y ∈ Cu(A) such that x = x + y, then there exists a projection p ∈ A ⊗ K such that x = ⟨p⟩.

Note added in proof: Immediately after posting the preprint version of this paper, Leonel Robert constructed a counterexample to Question 5.1.

Acknowledgments

We thank George Elliott, Francesc Perera, Leonel Robert, Luis Santiago, Andrew Toms and Wilhelm Winter for valuable conversations on topics related to this work.

References

[1] B. Blackadar, J. Cuntz, The structure of stable algebraically simple C∗-algebras, Amer. J. Math. 104 (1982) 813–822.
[2] A. Ciuperca, Some properties of the Cuntz semigroup and an isomorphism theorem for a certain class of non-simple C∗-algebras, PhD thesis, University of Toronto, 2008.
[3] K.T. Coward, G.A. Elliott, C. Ivanescu, The Cuntz semigroup as an invariant for C∗-algebras, J. Reine Angew. Math., in press.
[4] E. Kirchberg, M. Rørdam, Non-simple purely infinite C∗-algebras, Amer. J. Math. 122 (2000) 637–666.
[5] E.C. Lance, Hilbert C∗-Modules. A Toolkit for Operator Algebraists, London Math. Soc. Lecture Note Ser., vol. 210, Cambridge University Press, Cambridge, 1995.
[6] F. Perera, A.S. Toms, Recasting the Elliott conjecture, Math. Ann. 338 (2007) 669–702.
[7] D.V. Voiculescu, A note on quasi-diagonal C∗-algebras and homotopy, Duke Math. J. 62 (1991) 267–271.
Journal of Functional Analysis 257 (2009) 340–356 www.elsevier.com/locate/jfa
Qualitative uncertainty principles for groups with finite dimensional irreducible representations Eberhard Kaniuth Institut für Mathematik, Universität Paderborn, D-33095 Paderborn, Germany Received 17 November 2008; accepted 20 December 2008 Available online 18 January 2009 Communicated by P. Delorme
Abstract Let $G$ be a locally compact group of type I and $\widehat{G}$ its dual space. Roughly speaking, qualitative uncertainty principles state that the concentration of a nonzero integrable function on $G$ and of its operator-valued Fourier transform on $\widehat{G}$ is limited. Such principles have been established for locally compact abelian groups and for compact groups. In this paper we prove generalizations to the considerably larger class of groups with finite dimensional irreducible representations. © 2008 Elsevier Inc. All rights reserved. Keywords: Locally compact group; Dual space; Moore group; Uncertainty principle; Fourier transform; Plancherel measure
E-mail address: [email protected].
0022-1236/$ – see front matter © 2008 Elsevier Inc. All rights reserved. doi:10.1016/j.jfa.2008.12.020

0. Introduction

Let $G$ be a unimodular locally compact group equipped with a left Haar measure $m_G$. Let $\widehat{G}$ denote the dual space of $G$, that is, the set of equivalence classes of irreducible unitary representations of $G$, endowed with Fell’s topology. Suppose that there exists a (necessarily unique) measure $\mu_G$ on $\widehat{G}$ such that the Plancherel formula

\[ \int_G |f(x)|^2 \, dm_G(x) = \int_{\widehat{G}} \operatorname{tr}\bigl(\pi(f)\pi(f)^*\bigr) \, d\mu_G(\pi) \]

holds for all $f \in L^1(G) \cap L^2(G)$. Here $\pi(f) = \int_G f(x)\pi(x)\, dm_G(x)$ for $f \in L^1(G)$ and any representation $\pi$ of $G$, and $\operatorname{tr} T$ denotes the trace of an operator $T$. Following [15], let us call such a group a Plancherel group. For $f \in L^1(G)$, let

\[ A_f = \{x \in G : f(x) \neq 0\} \quad\text{and}\quad B_f = \{\pi \in \widehat{G} : \pi(f) \neq 0\}. \]

With this notation, $G$ is said to satisfy the qualitative uncertainty principle (QUP) if for every $f \in L^1(G) \cap L^2(G)$, the condition that $m_G(A_f) < m_G(G)$ and $\mu_G(B_f) < \mu_G(\widehat{G})$ implies that $f = 0$ almost everywhere. Moreover, $G$ satisfies the weak qualitative uncertainty principle (weak QUP) if $m_G(A_f)\,\mu_G(B_f) \geq 1$ for all nonzero functions $f \in L^1(G) \cap L^2(G)$. Note that neither of these properties depends on the particular choice of Haar measure on $G$ and Plancherel measure on $\widehat{G}$, provided that these two measures are related by the Plancherel formula.

In 1974 Benedicks showed that $\mathbb{R}^n$ satisfies the QUP (see [3]), and the same result was obtained by Amrein and Berthier [1] using Hilbert space techniques. Hogan [14] proved that the QUP holds for a noncompact and nondiscrete abelian locally compact group $G$ with connected component of the identity $G_0$ if and only if $G_0$ is noncompact. Somewhat more generally, the main result of [15] implies that the same statement holds true if $G$ possesses an abelian subgroup of finite index (equivalently, there is a finite upper bound for the dimensions of irreducible representations). Furthermore, an infinite compact group satisfies the QUP if and only if it is connected [15]. Matolcsi and Szücs [20] proved that the weak QUP holds for every locally compact abelian group (see also [25]). On the other hand, a compact group $G$ satisfies the weak QUP precisely when the quotient group $G/G_0$ is abelian [18]. For variants of these qualitative uncertainty principles for other types of groups compare [2,4,7,18], and for the description of minimizing functions in some special cases see [6,16,18,25]. An excellent survey is given in [10].
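To make the weak QUP concrete, the following minimal sketch evaluates the product $m_G(A_f)\,\mu_G(B_f)$ on the finite cyclic group $\mathbb{Z}/n$. It is an illustration only, not part of the paper's argument. The normalization is an assumption chosen so that the Plancherel formula holds: counting measure as Haar measure, the transform $\hat f(k) = \sum_x f(x)e^{-2\pi i kx/n}$, and mass $1/n$ for each character. The function names and the numerical tolerance are hypothetical helpers introduced here.

```python
import cmath

def dft(f):
    """Fourier transform on Z/n: fhat(k) = sum_x f(x) * exp(-2*pi*i*k*x/n)."""
    n = len(f)
    return [sum(f[x] * cmath.exp(-2j * cmath.pi * k * x / n) for x in range(n))
            for k in range(n)]

def support_size(values, tol=1e-9):
    """Number of entries whose modulus exceeds tol (tol absorbs rounding error)."""
    return sum(1 for v in values if abs(v) > tol)

def weak_qup_product(f):
    """m_G(A_f) * mu_G(B_f) for f on Z/n under the normalization assumed above."""
    n = len(f)
    m_A = support_size(f)            # counting measure of A_f
    mu_B = support_size(dft(f)) / n  # each character carries Plancherel mass 1/n
    return m_A * mu_B

n = 12
# Indicator function of the subgroup {0, 3, 6, 9} of Z/12.
subgroup_indicator = [1.0 if x % 3 == 0 else 0.0 for x in range(n)]
# A function supported on two points that do not form a coset of a subgroup.
generic = [1.0, 2.0] + [0.0] * (n - 2)

print(weak_qup_product(subgroup_indicator))
print(weak_qup_product(generic))
```

Under these assumptions the subgroup indicator attains the bound (its support has 4 points, its transform lives on 3 of the 12 characters, so the product equals 1), while the second function has a transform that vanishes nowhere and gives the product 2.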
A natural and considerably larger class of locally compact groups, which comprises both finite extensions of abelian groups and compact groups, is the class of groups with finite dimensional irreducible representations. Such groups are usually called Moore groups because their structure has been completely clarified by Moore [21] (see also [23]). In this paper we investigate the QUP and the weak QUP for Moore groups. A Moore group $G$ which is neither compact nor discrete turns out to satisfy the QUP if and only if $G_0$ is noncompact (Theorem 2.4). If the weak QUP holds for a Moore group $G$, then either $G_0$ is noncompact or $G_0$ is compact and $G/G_0$ is abelian (Theorem 3.1). The converse is true at least when $G$ is a Lie group (Theorem 3.3).

We strongly emphasize that we do not assume $G$ to be second countable. Consequently, we cannot utilize the general Plancherel formulae as developed in [17]. On the other hand, for a Moore group $G$ many more specific properties of the Plancherel measure on $\widehat{G}$ are available and these are discussed in Section 1. These properties combined with the structure theory of Moore groups represent our main tools.

1. Preliminaries and the fine structure of Plancherel measure

We have to introduce some notation. Let $H$ be a closed subgroup of a locally compact group $G$. For representations $\pi$ of $G$ and $\tau$ of $H$, $\pi|_H$ and $\operatorname{ind}_H^G \tau$ will denote the restriction of $\pi$ to $H$ and the representation of $G$ induced by $\tau$, respectively. If $\pi$ is finite dimensional, $d_\pi$ stands for the dimension of $\pi$ and the support of $\pi$, $\operatorname{supp} \pi$, is the collection of all irreducible subrepresentations of $\pi$. Note that if $H$ has finite index $[G : H]$ in $G$ and $\pi = \operatorname{ind}_H^G \tau$, then $d_\pi = [G : H]\, d_\tau$.
Suppose that $H$ is normal in $G$. Then $G$ acts on representations of $H$ by $x \cdot \tau(h) = \tau(x^{-1}hx)$, $x \in G$, $h \in H$, and the orbit of $\tau$ under this action is denoted $G(\tau)$. Moreover, $G_\tau$ will denote the stabilizer of $\tau$, that is, the subgroup $\{x \in G : x \cdot \tau = \tau\}$ of $G$. When $\pi$ is irreducible, $\operatorname{supp}(\pi|_H)$ equals $G(\tau)$ for some $\tau \in \widehat{H}$. In many cases, Mackey’s theory [19] allows one to construct $\widehat{G}$ by inducing irreducible representations from stability groups $G_\tau$, $\tau \in \widehat{H}$. In particular, Mackey’s theory applies to groups with finite dimensional irreducible representations. Thus, if $\tau \in \widehat{H}$ is such that $G_\tau = H$ then $\operatorname{ind}_H^G \tau \in \widehat{G}$. Whenever convenient, we shall identify a representation $\pi$ of $G/H$ with its pullback to $G$. In this manner, the dual space $\widehat{G/H}$ is always considered as a closed subset of $\widehat{G}$. As references to representation theory and topologies in spaces of representations as well as continuity of the restriction and inducing processes we mention [5] and [9]. Finally, with slight abuse of notation, $1_E$ will denote the indicator function of a subset $E$ of a set, but also the trivial 1-dimensional representation of $E$ when $E$ is a group.

Let $G$ be a Moore group. Theorem 2 of [21] asserts that a Lie group (a locally compact group whose connected component of the identity is open and an analytic group) is a Moore group if and only if it has a closed normal subgroup $H$ of finite index such that $H/Z(H)$, the quotient group of $H$ modulo its centre $Z(H)$, is compact. Of course, quotients of Moore groups are Moore groups. The class of Moore groups is closed under the formation of projective limits, and an arbitrary Moore group is a projective limit of Lie groups [21, Theorem 3]. Every Moore group $G$ possesses a neighbourhood basis of the identity consisting of conjugation invariant sets and $G_F$, the subgroup consisting of all elements of $G$ with relatively compact conjugacy classes, is an open normal subgroup of finite index in $G$.

It follows from a structure theorem established in [11, Theorem 3.16] that $G_F$ is a direct product of a vector group $\mathbb{R}^n$ and of a locally compact group $H$ whose closed commutator subgroup $[H, H]$ is compact and is contained in some compact open normal subgroup.

In [12] a Borel measure $\nu_G$ on $\widehat{G}$ was constructed satisfying

\[ \int_G |f(x)|^2 \, dm_G(x) = \int_{\widehat{G}} d_\rho^{-1} \operatorname{tr}\bigl(\rho(f)\rho(f)^*\bigr) \, d\nu_G(\rho) \]

for all $f \in L^1(G) \cap L^2(G)$. Thus $d\mu_G(\rho) = d_\rho^{-1}\, d\nu_G(\rho)$ is a Plancherel measure.

Proposition 1.1. Let $G$ be a Moore group and let $\mu_G$ and $\mu_{G_F}$ be Plancherel measures on $\widehat{G}$ and $\widehat{G_F}$, respectively.

(i) Let $T \subseteq \widehat{G_F}$ and $S \subseteq \widehat{G}$ be defined by

\[ T = \{\tau \in \widehat{G_F} : G_\tau = G_F\} \quad\text{and}\quad S = \{\operatorname{ind}_{G_F}^G \tau : \tau \in T\}. \]

Then $T$ and $S$ are open and

\[ \mu_{G_F}(\widehat{G_F} \setminus T) = 0 \quad\text{and}\quad \mu_G(\widehat{G} \setminus S) = 0. \]

(ii) Let $\widehat{G_F}/G$ be the space of all $G$-orbits in $\widehat{G_F}$, equipped with the quotient topology. Define maps $r : \widehat{G} \to \widehat{G_F}/G$ and $s : \widehat{G_F} \to \widehat{G_F}/G$ by $r(\pi) = \operatorname{supp}(\pi|_{G_F})$ for $\pi \in \widehat{G}$ and $s(\tau) = G(\tau)$ for $\tau \in \widehat{G_F}$. Then both $r$ and $s$ are continuous and open and

\[ \mu_G(M) = \frac{1}{[G : G_F]}\, \mu_{G_F}\bigl(s^{-1}(r(M))\bigr) \]

for every Borel set $M$ in $\widehat{G}$.

Proof. All the statements follow essentially from what has been shown in [12] after identifying $\widehat{G}$ and $\widehat{G_F}$ with the set of normalized traces of irreducible representations of $G$ and $G_F$, respectively. By Lemma 1 of [12], $T$ is open in $\widehat{G_F}$ and $\widehat{G_F} \setminus T$ is a local $\nu_{G_F}$-zero set. However, since $\nu_{G_F}$ is inner regular [12, Satz 3], $\widehat{G_F} \setminus T$ is actually a $\nu_{G_F}$-zero set and this implies that

\[ \mu_{G_F}(\widehat{G_F} \setminus T) = \int_{\widehat{G_F} \setminus T} d_\tau^{-1} \, d\nu_{G_F}(\tau) = 0. \]

The set $S$ is open in $\widehat{G}$ [12, Korollar 2]. With maps $r : \widehat{G} \to \widehat{G_F}/G$ and $s : \widehat{G_F} \to \widehat{G_F}/G$ as in (ii), the measure $\nu_G$ was defined in [12] by $\nu_G(M) = \nu_{G_F}\bigl(s^{-1}(r(M)) \cap T\bigr)$ for any Borel subset $M$ of $\widehat{G}$ (compare the proof of [12, Satz 3]). Consequently,

\[ \nu_G(\widehat{G} \setminus S) = \nu_{G_F}\bigl(s^{-1}(r(\widehat{G} \setminus S)) \cap T\bigr) = \nu_{G_F}(\emptyset) = 0 \]

and this in turn implies that

\[ \mu_G(\widehat{G} \setminus S) = \int_{\widehat{G} \setminus S} d_\pi^{-1} \, d\nu_G(\pi) = 0. \]

Moreover, since $s^{-1}(r(S)) = T$ and $d_\pi = [G : G_F]\, d_\tau$ for $\pi = \operatorname{ind}_{G_F}^G \tau$, $\tau \in T$, it follows that

\begin{align*}
\mu_G(M) = \mu_G(M \cap S) &= \int_{M \cap S} d_\pi^{-1} \, d\nu_G(\pi) \\
&= \frac{1}{[G : G_F]} \int_{s^{-1}(r(M)) \cap T} d_\tau^{-1} \, d\nu_{G_F}(\tau) \\
&= \frac{1}{[G : G_F]}\, \mu_{G_F}\bigl(s^{-1}(r(M))\bigr)
\end{align*}

for every Borel subset $M$ of $\widehat{G}$. Finally, the maps $r$ and $s$ are continuous and open [12]. □
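Let $G$ be a Moore group such that $G_F = G$ and let $K$ be a compact normal subgroup of $G$ such that $G/K$ is abelian. Then for each $\sigma \in \widehat{G}$, the coset $\sigma \otimes \widehat{G/K} = \{\sigma \otimes \lambda : \lambda \in \widehat{G/K}\}$ is an open and closed subset of $\widehat{G}$ and the mapping $t_\sigma : \lambda \mapsto \sigma \otimes \lambda$ from $\widehat{G/K}$ onto $\sigma \otimes \widehat{G/K}$ is continuous and open. If $\pi$ and $\sigma$ are irreducible representations of $G$ such that $\operatorname{supp}(\pi|_K) = \operatorname{supp}(\sigma|_K)$, then $\pi = \sigma \otimes \lambda$ for some $\lambda \in \widehat{G/K}$.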
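The measure $\nu_G$ was shown to be translation invariant in the following sense. For any function $g$ on $\widehat{G}$ and $\pi \in \widehat{G}$, define $\pi g$ on $\widehat{G}$ by

\[ \pi g(\rho) = d_\pi^{-1} d_\rho^{-1} \sum_{\tau \in \widehat{G}} d_\tau\, m(\tau, \pi \otimes \rho)\, g(\tau), \qquad \rho \in \widehat{G}, \]

where $m(\tau, \pi \otimes \rho)$ denotes the multiplicity of the representation $\tau$ in the tensor product $\pi \otimes \rho$. Then

\[ \int_{\widehat{G}} g(\tau) \, d\nu_G(\tau) = \int_{\widehat{G}} \pi g(\tau) \, d\nu_G(\tau) \]

for all $\pi \in \widehat{G}$ and $g \in L^1(\widehat{G}, \nu_G)$.

Lemma 1.2. Retain the preceding situation and notation. Let $\sigma \in \widehat{G}$ and let $k_\sigma$ denote the number of elements $\lambda$ of $\widehat{G/K}$ such that $\sigma \otimes \lambda = \sigma$. Then, for every Borel subset $S$ of $\sigma \otimes \widehat{G/K}$,

\[ \mu_G(S) = \frac{d_\sigma}{k_\sigma}\, \mu_G\bigl(t_\sigma^{-1}(S)\bigr). \]

Proof. Let $S$ be as in the lemma and let $Q = t_\sigma^{-1}(S)$. Then

\[ \sigma(1_Q)(\rho) = d_\sigma^{-1} d_\rho^{-1} \sum_{\lambda \in Q} m(\lambda, \sigma \otimes \rho) = d_\sigma^{-1} d_\rho^{-1} \sum_{\lambda \in Q} m(\sigma \otimes \lambda, \rho). \]

In particular, $\sigma(1_Q)(\rho) = 0$ whenever $\rho \notin S$. On the other hand, if $\rho \in S$, then $\rho = \sigma \otimes \lambda_0$ for some $\lambda_0 \in Q$ and hence $\sigma \otimes \lambda = \rho$ if and only if $\sigma \otimes \lambda \overline{\lambda_0} = \sigma$. Thus $\sigma(1_Q)(\rho) = d_\sigma^{-2} k_\sigma$ for $\rho \in S$. Since $\nu_G$ is translation invariant, it follows that

\begin{align*}
\mu_G(Q) &= \int_{\widehat{G}} d_\rho^{-1} 1_Q(\rho) \, d\nu_G(\rho) = \int_{\widehat{G}} \sigma\bigl(d_\rho^{-1} 1_Q\bigr)(\rho) \, d\nu_G(\rho) \\
&= d_\sigma^{-2} k_\sigma \int_S d\nu_G(\rho) = d_\sigma^{-1} k_\sigma \int_S d_\rho^{-1} \, d\nu_G(\rho) \\
&= d_\sigma^{-1} k_\sigma\, \mu_G(S),
\end{align*}

as was to be shown. □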
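As a sanity check of Lemma 1.2, consider the finite group $G = S_3$ with $K = A_3$, so that $G_F = G$, $K$ is compact and normal, and $G/K \cong \mathbb{Z}/2$ is abelian. The computation below is an illustrative sketch and not part of the original argument. It assumes counting measure as Haar measure on $G$, for which the Plancherel measure of a finite group assigns mass $d_\pi/|G|$ to each $\pi \in \widehat{G}$; the symbol $\varepsilon$ for the sign character is introduced here for illustration.

```latex
% Irreducible representations of S_3: the trivial character 1, the sign
% character \varepsilon, and the 2-dimensional representation \sigma.
% Here \widehat{G/K} = \{1, \varepsilon\}.
\begin{align*}
\sigma\otimes 1 = \sigma\otimes\varepsilon = \sigma
  &\;\Longrightarrow\; k_\sigma = 2,\qquad
  S := \sigma\otimes\widehat{G/K} = \{\sigma\},\qquad
  t_\sigma^{-1}(S) = \{1,\varepsilon\},\\
\mu_G(S) &= \frac{d_\sigma}{|G|} = \frac{2}{6} = \frac13,\\
\frac{d_\sigma}{k_\sigma}\,\mu_G\bigl(t_\sigma^{-1}(S)\bigr)
  &= \frac{2}{2}\Bigl(\frac16+\frac16\Bigr) = \frac13 .
\end{align*}
```

Both sides agree, and the example shows that the factor $k_\sigma$ cannot be dropped in general: here $k_\sigma = 2$ because tensoring with the sign character fixes $\sigma$.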
In the next lemma, we present a special situation in which the numbers $k_\sigma$ turn out to be equal to one.

Lemma 1.3. Let $K$ be a compact normal subgroup of $G$ such that $G/K$ is abelian. Let $Z$ be a closed subgroup of $G$ which is contained in the centre of $G$ and suppose that $G/Z$ is connected. Then

\[ \mu_G(S) = d_\sigma\, \mu_G\bigl(t_\sigma^{-1}(S)\bigr) \]

for every $\sigma \in \widehat{G}$ and every Borel subset $S$ of $\sigma \otimes \widehat{G/K}$.

Proof. We have to show that if $\lambda \in \widehat{G/K}$ is such that $\sigma \otimes \lambda = \sigma$, then $\lambda = 1_G$. The set $\Gamma_\sigma = \{\gamma \in \widehat{G/K} : \sigma \otimes \gamma = \sigma\}$ is a closed subgroup of $\widehat{G/K}$ and hence $\Gamma_\sigma = \widehat{G/H}$ for some closed subgroup $H$ of $G$ containing $K$. Using basic properties of induced representations and tensor products, the fact that $\sigma = \sigma \otimes \gamma$ for all $\gamma \in \widehat{G/H}$ yields that $\sigma \otimes \operatorname{ind}_H^G 1_H = \operatorname{ind}_H^G(\sigma|_H)$ is a multiple of $\sigma$. Since $\sigma$ is finite dimensional, we conclude that $H$ has finite index in $G$. Moreover, it follows that both $\sigma|_Z$ and $(\operatorname{ind}_H^G(\sigma|_H))|_Z$ are multiples of the same character $\chi$ of $Z$ and hence $\operatorname{ind}_{H \cap Z}^{Z}(\sigma|_{H \cap Z})$ is also a multiple of $\chi$. This in turn implies that $H \supseteq Z$. Finally, as $G/Z$ is connected and $H$ is open in $G$, we conclude that $H = G$. Thus $\Gamma_\sigma = \{1_G\}$, as required. □

2. The QUP for Moore groups

As mentioned in the introduction, Hogan [14] has shown that a locally compact abelian group $G$, which is neither compact nor discrete, satisfies the QUP if and only if the connected component of the identity of $G$ is noncompact. More generally, the main result of [15] implies that the same statement holds true for groups with bounded representation dimension, that is, locally compact groups possessing an abelian subgroup of finite index. The purpose of this section is to extend this to the considerably larger class of Moore groups. In preparation, we present the following three lemmas.

Lemma 2.1. Let $G$ be a Plancherel group and $K$ a compact normal subgroup of $G$. If $G$ satisfies the QUP (the weak QUP), then the same is true of $G/K$.

Proof. Let Haar measures on $K$, $G$ and $G/K$ be chosen so that $m_K(K) = 1$ and Weil’s formula holds. Then, with $q : G \to G/K$ denoting the quotient homomorphism,

\[ m_{G/K}(A_f) = m_G(A_{f \circ q}) \quad\text{and}\quad \mu_{G/K}(B_f) = \mu_G(B_f \circ q) = \mu_G(B_{f \circ q}) \]

for any $f \in L^1(G/K)$. In fact, the last equality follows from the orthogonality relations for irreducible representations of compact groups, which imply that

\[ \pi(f \circ q) = \int_{G/K} f(q(x))\, \pi(x) \left( \int_K \pi(k) \, dm_K(k) \right) dm_{G/K}(q(x)) = 0 \]

for all $\pi \in \widehat{G} \setminus \widehat{G/K}$. Because $m_{G/K}(G/K) = m_G(G)$ and $\mu_{G/K}(\widehat{G/K}) = \mu_G(\widehat{G/K}) \leq \mu_G(\widehat{G})$, the statements of the lemma follow. □

Lemma 2.2. Let $G$ be a Moore group and suppose that $G_F$ satisfies the QUP. Then the QUP holds for $G$.

Proof. We can assume that $G$ is noncompact. Let $H = G_F$ and let $T$ and $S$ be as in Proposition 1.1, that is,

\[ T = \{\tau \in \widehat{H} : G_\tau = H\} \quad\text{and}\quad S = \{\operatorname{ind}_H^G \tau : \tau \in T\}. \]

Let $f \in L^1(G) \cap L^2(G)$ be such that $m_G(A_f) < m_G(G)$ and $\mu_G(B_f) < \mu_G(\widehat{G})$. Fix $a \in G$ and consider $g = (L_a f)|_H \in L^1(H) \cap L^2(H)$ and $h = \widetilde{f|_{a^{-1}H}}$, the trivial extension of $f|_{a^{-1}H}$ to all of $G$. Then, since $m_G(A_f) < \infty$ and $m_G(H) = \infty$, we have $m_G(A_h) < m_G(a^{-1}H) = m_H(H)$ and therefore

\[ m_H(A_g) = m_G(A_{\tilde{g}}) = m_G\bigl(A_{L_{a^{-1}}(\tilde{g})}\bigr) = m_G(A_h) < m_H(H). \]

If $\tau \in B_g \cap T$, then $\operatorname{ind}_H^G \tau(f) \neq 0$ [7, Lemma 1.1] and hence, using Proposition 1.1(i),

\begin{align*}
\mu_H(B_g) = \mu_H(B_g \cap T) &\leq \mu_H\Bigl(s^{-1}\bigl(r(\{\operatorname{ind}_H^G \tau : \tau \in B_g \cap T\})\bigr)\Bigr) \\
&\leq \mu_H\bigl(s^{-1}(r(B_f \cap S))\bigr) \\
&= [G : H]\, \mu_G(B_f \cap S) = [G : H]\, \mu_G(B_f).
\end{align*}

Since $\mu_G(B_f) < \mu_G(\widehat{G})$ and $\mu_H(\widehat{H}) = [G : H]\, \mu_G(\widehat{G})$, we get that $\mu_H(B_g) < \mu_H(\widehat{H})$. Because the QUP holds for $H$, it follows that $g = 0$ and hence $f = 0$ as $a \in G$ was arbitrary. □

Lemma 2.3. Let $G$ be a direct product $G = A \times H$, where $A$ is an abelian locally compact group with noncompact connected component of the identity and $H$ is a Moore group. Then $G$ satisfies the QUP.

Proof. Note first that Plancherel measure on $\widehat{G} = \widehat{A} \times \widehat{H}$ is the product of Haar measure on $\widehat{A}$ and Plancherel measure on $\widehat{H}$. Moreover, $m_G(G) = \infty$ as well as $\mu_G(\widehat{G}) = \infty$ since $A$ is noncompact and nondiscrete.

Let $f \in L^1(G)$ be such that $m_G(A_f) < \infty$ and $\mu_G(B_f) < \infty$. There exists a conull set $\Sigma$ in $\widehat{H}$ such that $\mu_A(\{\alpha \in \widehat{A} : \alpha \times \sigma \in B_f\}) < \infty$ for all $\sigma \in \Sigma$. Fix $\sigma \in \Sigma$ and let $\xi, \eta \in H(\sigma)$, the representation space of $\sigma$. Then, since $f \in L^1(G)$, the integral $\int_H f(a, x) \langle \sigma(x)\xi, \eta \rangle \, dm_H(x)$ exists for almost all $a \in A$ and defines a function, denoted $f_{\sigma,\xi,\eta}$, in $L^1(A)$. Then $m_A(A_{f_{\sigma,\xi,\eta}}) < \infty$ since $f_{\sigma,\xi,\eta}(a) \neq 0$ implies that $f(a, x) \neq 0$ for all $x$ in a set of positive measure in $H$. For $\alpha \in \widehat{A}$, we have

\begin{align*}
\widehat{f_{\sigma,\xi,\eta}}(\alpha) &= \int_A \int_H f(a, x) \langle \sigma(x)\xi, \eta \rangle\, \alpha(a) \, dm_H(x) \, dm_A(a) \\
&= \bigl\langle (\alpha \times \sigma)(f)(1 \otimes \xi),\, 1 \otimes \eta \bigr\rangle.
\end{align*}

Therefore, if $\alpha \in B_{f_{\sigma,\xi,\eta}}$ then $\alpha \times \sigma \in B_f$. Consequently,

\[ \mu_A(B_{f_{\sigma,\xi,\eta}}) \leq \mu_A\bigl(\{\alpha \in \widehat{A} : \alpha \times \sigma \in B_f\}\bigr) < \infty. \]

Since $A$ is abelian with noncompact connected component of the identity, the QUP holds for $A$ and hence $f_{\sigma,\xi,\eta} = 0$. Since $\xi, \eta \in H(\sigma)$ are arbitrary and $\Sigma$ is dense in $\widehat{H}$, it follows that $f(a, x) = 0$ almost everywhere on $A \times H$, and we are done. □

The reader who is familiar with representation theory will easily recognize that, if the appropriate minor modifications are made, the proof of Lemma 2.3 goes through if both $A$ and $H$ are only assumed to be Plancherel groups with one of them satisfying the QUP and if one replaces $\widehat{A}$ and $\widehat{H}$ by the reduced duals of $A$ and $H$, respectively.

We are now ready to extend Hogan’s result mentioned at the outset of this section.
Theorem 2.4. Let $G$ be a Moore group which is neither compact nor discrete. Then $G$ satisfies the QUP if and only if the connected component of the identity of $G$ is noncompact.

Proof. Suppose first that the QUP holds for $G$ and, towards a contradiction, assume that $G_0$ is compact. Since $H = G/G_0$ is totally disconnected and every compact subgroup of $H_F$ is contained in a compact normal subgroup of $H$ [11], it follows that $G$ possesses a compact open normal subgroup $K$. Then, since $G$ is noncompact, $m_G(A_{1_K}) = m_G(K) < \infty = m_G(G)$. Moreover, $K$ is nontrivial since $G$ is nondiscrete, and hence $\widehat{G} \setminus \widehat{G/K}$ is a nonempty open subset of $\widehat{G}$. Then, since $\operatorname{supp} \mu_G = \widehat{G}$, $\mu_G(\widehat{G} \setminus \widehat{G/K}) > 0$. On the other hand, $\mu_G(B_{1_K}) = \mu_G(\widehat{G/K}) < \infty$ since $G/K$ is discrete. These two facts together imply that $\mu_G(B_{1_K}) < \mu_G(\widehat{G})$. So the QUP is violated and therefore $G_0$ must be noncompact.

Conversely, suppose that $G_0$ is noncompact. By Lemma 2.2 it suffices to show that $G_F$ satisfies the QUP. Now, $G_F$ is a direct product $G_F = V \times H$, where $V$ is a vector group and $H$ contains a compact open normal subgroup. Thus, since $(G_F)_0$ is noncompact, $V$ has to be nontrivial. Then Lemma 2.3 applies with $A = V$ and yields that the QUP holds for $G_F$. □

The remaining cases of a compact group and a discrete Moore group are treated in Corollaries 2.6 and 2.7 and Proposition 2.8 below. To start with, let $G$ be an infinite compact group. Since, for such $G$, $\mu_G(E) < \mu_G(\widehat{G})$ if and only if $E$ is finite and this is equivalent to $\mu(E) < \infty$ for the measure $\mu$ considered in [15], Corollary 2.6 is precisely Theorem 2.6 of [15], which was proved there by reducing to the Lie group case and applying [22, Lemma 0.3]. For the reader's convenience, we present a somewhat shorter and more focused proof. The next lemma will also be used later in the paper.

Lemma 2.5. Let $K$ be a compact group and $f \in L^2(K) \subseteq L^1(K)$ and suppose that $B_f$ is finite. Then there exists an open subgroup $C$ of $K$ such that, for every $a \in K$, $f|_{aC}$ either vanishes almost everywhere or is nonzero almost everywhere.

Proof. Since $B_f$ is finite, by the Peter–Weyl theorem $f$ is equal almost everywhere to a finite linear combination, $g$ say, of coordinate functions associated to irreducible representations $\sigma_1, \ldots, \sigma_n$ of $K$. It suffices to prove the statement of the lemma for $g$. There exists a closed normal subgroup $L$ of $K$ such that $K/L$ is a Lie group and $\sigma_1, \ldots, \sigma_n \in \widehat{K/L}$. So $g$ is constant on cosets of $L$ and therefore, after passing to $K/L$, we can henceforth assume that $K$ is a Lie group. Let $C = K_0$, the connected component of the identity. Now, any coordinate function of a finite dimensional representation on the connected Lie group $C$, and hence on cosets of $C$, is an analytic function. Consequently, $g|_{aC}$ is either zero or nonzero almost everywhere. □

Corollary 2.6. An infinite compact group $G$ satisfies the QUP if and only if it is connected.

Proof. Suppose first that $G$ is connected and let $f \in L^2(G)$ be nonzero and such that $B_f$ is finite. It then follows from Lemma 2.5 that $f$ is nonzero almost everywhere on $G$. Thus $m_G(A_f) = m_G(G)$. Conversely, if $G$ is not connected, then there exists a proper open normal subgroup $H$ of $G$ and the function $f = 1_H$ satisfies $m_G(A_f) = m_G(H) < m_G(G)$ and $\mu_G(B_f) = \mu_G(\widehat{G/H})$ is finite. □

Corollary 2.7. Let $G$ be an infinite discrete Moore group. Then the QUP holds for $G$ if and only if $G_F$ is a torsion-free abelian group.
Proof. Suppose first that the QUP holds for $G$. Then $G$ cannot have a nontrivial finite normal subgroup (compare the proof of Theorem 2.4). Since every finite subgroup of $G_F$ is contained in a finite normal subgroup of $G$, it follows that $G_F$ is torsion-free. However, a torsion-free group with finite conjugacy classes is abelian [24, Theorem 4.32]. Conversely, let $G_F$ be torsion-free (and hence abelian). Then the compact dual group $\widehat{G_F}$ is connected [13, Theorem 24.25] and therefore satisfies the QUP. As pointed out in [15] after Corollary 2.5, for an abelian group $A$, it follows from the symmetry between $A$ and $\widehat{A}$ in the statement of the QUP and Pontrjagin's duality theorem that the QUP holds for $A$ if and only if it holds for $\widehat{A}$. So the QUP holds for $G_F$ and hence for $G$ by Lemma 2.2. □

For completeness, we finally discuss the case of a finite group.

Proposition 2.8. A finite group $G$ satisfies the QUP if and only if $G$ is of order 1 or 2.

Proof. We only have to show that if the QUP holds for $G$, then $G$ is of order $\le 2$. Assume first that $G$ is not abelian. Then there exists an irreducible representation $\pi$ of $G$ with $d_\pi \ge 2$. Define a function $f$ on $G$ by $f(x) = \operatorname{tr}(\pi(x))$. By a theorem of Burnside [8, (6.9)], $f(x) = 0$ for some $x \in G$, whence $m_G(A_f) < m_G(G)$. The orthogonality relations imply that $B_f$ equals the singleton $\{\pi\}$. Thus $\mu_G(B_f) < \mu_G(\widehat{G})$ and this contradiction shows that $G$ is abelian. Suppose now that $G$ has a character $\chi$ which has order $\ge 3$ in $\widehat{G}$ and define a function $f$ on $G$ by $f(x) = \chi(x)(1 - \chi(x))$. Then $A_f \subseteq G \setminus \{e\}$ and
\[ \widehat{f}(1_G) = \sum_{x \in G} \chi(x) - \sum_{x \in G} \chi(x)^2 = 0, \]
so that the QUP does not hold. Therefore every nontrivial character of $G$ has order two. Finally, suppose that $G$ has order $\ge 3$. Then $G$ possesses a nontrivial subgroup $H$ of index 2, and the characteristic function $1_H$ of $H$ satisfies
\[ m_G(A_{1_H}) < m_G(G) \quad\text{and}\quad \mu_G(B_{1_H}) = \mu_G(\widehat{G/H}) < \mu_G(\widehat{G}). \]
This contradiction finishes the proof. □
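As a small numerical illustration of the character argument in the proof of Proposition 2.8 (it is not part of the original proof), the following Python sketch checks the function f(x) = χ(x)(1 − χ(x)) on the cyclic group Z/3. It assumes that the dual of Z/n is identified with {0, ..., n−1} via k ↦ exp(2πikx/n) and that the transform is the unconjugated sum over x of f(x)χ_k(x); the helper names are hypothetical.

```python
import cmath

def character(n, k):
    """Values of the character x -> exp(2*pi*i*k*x/n) of Z/n (assumed identification of the dual)."""
    return [cmath.exp(2j * cmath.pi * k * x / n) for x in range(n)]

def fourier(f):
    """Unconjugated transform: hat f(k) = sum_x f(x) * chi_k(x)."""
    n = len(f)
    return [sum(f[x] * character(n, k)[x] for x in range(n)) for k in range(n)]

def support(values, tol=1e-9):
    """Indices where the value is numerically nonzero."""
    return {i for i, v in enumerate(values) if abs(v) > tol}

n = 3
chi = character(n, 1)               # a character of order 3
f = [c * (1 - c) for c in chi]      # f(x) = chi(x) * (1 - chi(x))

A_f = support(f)                    # misses the identity element 0
B_f = support(fourier(f))           # misses the trivial character 0
print(A_f, B_f)
```

Both supports are proper subsets although f is nonzero, which is the failure of the QUP used in the proof.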
Remark 2.9. In some papers, for instance [2,7,22], the following variant of the QUP is considered (sometimes also replacing $\mu_G$ by a measure akin to it): if $f \in L^1(G) \cap L^2(G)$ is such that $m_G(A_f) < \infty$ and $\mu_G(B_f) < \infty$, then $f = 0$. Of course, if $G$ is noncompact and $\mu_G$ is an infinite measure, then this property coincides with the QUP. However, there are nondiscrete Plancherel groups which nevertheless have a finite Plancherel measure. Examples are provided by the $ax + b$-group and by the semidirect product $\mathbb{R} \rtimes \mathbb{Z}$, where $\mathbb{Z}$ acts on $\mathbb{R}$ by $n \cdot t = e^n t$, $t \in \mathbb{R}$, $n \in \mathbb{Z}$. Actually, we do not even know whether for a Moore group $G$ finiteness of the Plancherel measure forces $G$ to be discrete.

3. The weak QUP for Moore groups

In this section we investigate the weak qualitative uncertainty principle for Moore groups. Our first purpose is to establish the following theorem.
Theorem 3.1. Let $G$ be a Moore group. If $G$ satisfies the weak QUP, then either $G_0$ is noncompact, or $G_0$ is compact and $G/G_0$ is abelian.

The essential step in proving Theorem 3.1 is the following lemma.

Lemma 3.2. Let $G$ be a discrete Moore group. If the weak QUP holds for $G$, then $G$ is abelian.

Proof. We first show that $G = G_F$. Let $\delta_e$ denote the Dirac function at the identity $e$ and let $S = \{\operatorname{ind}_N^G \tau : \tau \in \widehat{N},\ G_\tau = N\}$ be as in Lemma 1.1. Then
\[ \begin{aligned} 1 = \|\delta_e\|_2^2 &= \int_{\widehat{G}} \operatorname{tr}\big(\pi(\delta_e)\pi(\delta_e)^*\big)\, d\mu_G(\pi) = \int_{\widehat{G}} d_\pi\, d\mu_G(\pi) \\ &= \int_{S} d_\pi\, d\mu_G(\pi) \ge [G:G_F]\,\mu_G(S) = [G:G_F]\,\mu_G(\widehat{G}). \end{aligned} \]
Thus $\mu_G(\widehat{G}) \le [G:G_F]^{-1}$ and therefore
\[ m_G(A_{\delta_e})\,\mu_G(B_{\delta_e}) \le \mu_G(\widehat{G}) \le [G:G_F]^{-1}. \]
So validity of the weak QUP entails $G = G_F$.

Now, since $G$ is a group with finite conjugacy classes and has an abelian subgroup $A$ of finite index, the centre $Z(G)$ has finite index in $G$. In fact, if $F$ is a finite subset of $G$ such that $G = FA$, then the centralizer $C_G(F)$ of $F$ has finite index in $G$ and $A \cap C_G(F) \subseteq Z(G)$. Let $Z = Z(G)$ and for each $\chi \in \widehat{Z}$, let
\[ \widehat{G}_\chi = \{\pi \in \widehat{G} : \pi|_Z \text{ is a multiple of } \chi\}. \]
For simplicity, we let $d\chi = d\mu_Z(\chi)$. Define a Borel measure $\mu$ on $\widehat{G}$ by
\[ \mu(M) = \int_{\widehat{Z}} \sum_{\rho \in \widehat{G}_\chi \cap M} d_\rho^2 \, d\chi. \]
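We claim that $\mu$ is translation invariant. To see this, fix $\pi \in \widehat{G}$ and let $\eta \in \widehat{Z}$ such that $\pi \in \widehat{G}_\eta$. Observe that for any $\sigma \in \widehat{G}$ and $\chi \in \widehat{Z}$, $\operatorname{supp}(\bar\pi \otimes \sigma) \cap \widehat{G}_\chi \neq \emptyset$ only if $\sigma \in \widehat{G}_{\eta\chi}$ and in this case $\operatorname{supp}(\bar\pi \otimes \sigma) \subseteq \widehat{G}_\chi$. Then, using that $\bar\pi \otimes \sigma = \sum_{\tau \in \widehat{G}} m(\tau, \bar\pi \otimes \sigma)\,\tau$ and the translation invariance of Haar measure on $\widehat{Z}$, we obtain
\[ \begin{aligned} \int_{\widehat{G}} \pi(1_M)(\tau)\, d\mu(\tau) &= d_\pi^{-1} \int_{\widehat{G}} \sum_{\sigma \in \widehat{G}} d_\sigma 1_M(\sigma)\, d_\tau^{-1}\, m(\sigma, \pi \otimes \tau)\, d\mu(\tau) \\ &= d_\pi^{-1} \sum_{\sigma \in M} d_\sigma \int_{\widehat{Z}} \sum_{\tau \in \widehat{G}_\chi} m(\sigma, \pi \otimes \tau)\, d_\tau\, d\chi \\ &= d_\pi^{-1} \sum_{\sigma \in M} d_\sigma \int_{\widehat{Z}} \sum_{\tau \in \widehat{G}_\chi} m(\tau, \bar\pi \otimes \sigma)\, d_\tau\, d\chi \\ &= d_\pi^{-1} \int_{\widehat{Z}} \sum_{\sigma \in M \cap \widehat{G}_{\chi\eta}} d_\sigma \sum_{\tau \in \widehat{G}} m(\tau, \bar\pi \otimes \sigma)\, d_\tau\, d\chi \\ &= \int_{\widehat{Z}} \sum_{\sigma \in M \cap \widehat{G}_{\eta\chi}} d_\sigma^2\, d\chi = \int_{\widehat{Z}} \sum_{\sigma \in M \cap \widehat{G}_{\chi}} d_\sigma^2\, d\chi \\ &= \int_{\widehat{G}} 1_M(\tau)\, d\mu(\tau). \end{aligned} \]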
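Note next that for each $\chi \in \widehat{Z}$ and $\pi \in \widehat{G}_\chi$, $m(\pi, \operatorname{ind}_Z^G \chi) = m(\chi, \pi|_Z) = d_\pi$ by the Frobenius reciprocity theorem. Thus
\[ [G:Z] = \dim \operatorname{ind}_Z^G \chi = \sum_{\pi \in \widehat{G}_\chi} m(\pi, \operatorname{ind}_Z^G \chi)\, d_\pi = \sum_{\pi \in \widehat{G}_\chi} d_\pi^2. \]
This implies that
\[ \mu(\widehat{G}) = \int_{\widehat{Z}} \sum_{\pi \in \widehat{G}_\chi} d_\pi^2 \, d\chi = [G:Z]\,\mu_Z(\widehat{Z}) = [G:Z]. \]
Since $\nu_G(\widehat{G}) = 1$ and $\mu$ is translation invariant, it follows that $\nu_G = [G:Z]^{-1}\mu$. Consequently,
\[ \mu_G(M) = \frac{1}{[G:Z]} \int_{\widehat{Z}} \sum_{\pi \in \widehat{G}_\chi \cap M} d_\pi \, d\chi \]
for every Borel subset $M$ of $\widehat{G}$. Since the weak QUP holds for $G$, we must have $\mu_G(\widehat{G}) = \mu_G(B_{\delta_e}) \ge 1$. It follows that for almost all $\chi \in \widehat{Z}$,
\[ \sum_{\pi \in \widehat{G}_\chi} d_\pi = \sum_{\pi \in \widehat{G}_\chi} d_\pi^2 \]
and therefore $d_\pi = 1$ for almost all $\pi \in \widehat{G}$. This implies that $G = G_F$. Note that since the cosets $\rho \otimes \widehat{G/[G,G]}$, $\rho \in \widehat{G}$, are open in $\widehat{G}$, the function $\pi \mapsto d_\pi$ is locally constant on $\widehat{G}$. We conclude that $d_\pi = 1$ for all $\pi \in \widehat{G}$ and so $G$ is abelian. □

In the countable case we could have alternatively deduced Lemma 3.2 from the Plancherel formula for compact group extensions due to Kleppner and Lipsman [17, Theorem 4.4], the proof of which, however, utilizes much more sophisticated tools.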
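The first step of the proof of Lemma 3.2 can be made concrete for finite groups. The sketch below is an illustration only and assumes the usual normalization for a finite group $G$: counting measure on $G$ and Plancherel mass $d_\pi/|G|$ at each irreducible $\pi$, so that $\int d_\pi\, d\mu_G = 1$ as in the text. Under this assumption the product $m_G(A_{\delta_e})\,\mu_G(B_{\delta_e})$ equals the total Plancherel mass, which the hypothetical helper `plancherel_total_mass` computes from the list of dimensions $d_\pi$.

```python
from fractions import Fraction

def plancherel_total_mass(dims):
    """Total Plancherel mass sum_pi d_pi / |G|, where |G| = sum_pi d_pi^2.

    Assumes counting measure on the finite group G, so that the point mass
    at an irreducible pi of dimension d_pi is d_pi / |G|.
    """
    order = sum(d * d for d in dims)
    return Fraction(sum(dims), order)

# S_3 has irreducible representations of dimensions 1, 1, 2 (nonabelian case).
s3_mass = plancherel_total_mass([1, 1, 2])
# Z/6 has six characters, all of dimension 1 (abelian case).
z6_mass = plancherel_total_mass([1] * 6)

print(s3_mass, z6_mass)
```

For the nonabelian group the mass is strictly less than 1, so the Dirac function at the identity violates the weak QUP, while for the abelian group the bound 1 is attained.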
Now we can already show Theorem 3.1. Assuming that $G_0$ is compact, we have to verify that $G/G_0$ is abelian. The totally disconnected Moore group $G/G_0$ is a projective limit of discrete groups [21, Theorem 3]. Therefore, since $G_0$ is compact, there exists a system of compact open normal subgroups $K_\alpha$ in $G$ such that $\bigcap_\alpha K_\alpha = G_0$ (compare the proof of Theorem 2.4). For each $\alpha$, by Lemma 2.1 the weak QUP holds for $G/K_\alpha$ and hence $G/K_\alpha$ is abelian by Lemma 3.2. It follows that $[G,G] \subseteq \bigcap_\alpha K_\alpha = G_0$, and we are done.

The following theorem shows that, at least under some mild additional hypothesis, the converse of Theorem 3.1 is true.

Theorem 3.3. Let $G$ be a Moore group. Then the weak QUP holds for $G$ provided that one of the following two conditions is satisfied.
(i) $G_0$, the connected component of the identity, is noncompact.
(ii) $G_0$ is compact, $G/G_0$ is abelian and $G$ is a Lie group.

It is clear from Theorem 2.4 that the weak QUP holds for the group $G$ whenever $G_0$ is noncompact. The proof that condition (ii) entails the weak QUP is more complicated and requires two lemmas. The first one relates $\mu_G(B_f)$ and $\mu_H(B_{f|_H})$ for an open subgroup $H$ of $G$ of finite index.

Lemma 3.4. Let $G$ be a Moore group containing a compact normal subgroup $K$ such that $G/K$ is abelian. Let $H$ be a closed subgroup of $G$ such that $K \subseteq H$ and $G/H$ is finite. Then, for any $f \in L^1(G)$,
\[ \mu_H(B_{f|_H}) \le [G:H]^3\, \mu_G(B_f). \]

Proof. In what follows we retain the notations introduced in Section 2. Let
\[ \widetilde{B}_f = \bigcup \{\chi \otimes B_f : \chi \in \widehat{G/H}\}. \]
Then, since $\mu_G(M) = \mu_G(\chi \otimes M)$ for every Borel subset $M$ of $\widehat{G}$ and $\chi \in \widehat{G/K}$,
\[ \mu_G(\widetilde{B}_f) \le \sum_{\chi \in \widehat{G/H}} \mu_G(\chi \otimes B_f) = [G:H]\,\mu_G(B_f). \]
It therefore suffices to show that $\mu_H(B_{f|_H}) \le [G:H]^2\, \mu_G(\widetilde{B}_f)$. To that end, consider any $\rho \in \widehat{H}$ and $\sigma \in \widehat{G}$ such that $\rho \le \sigma|_H$, and let
\[ \chi \in r^{-1}\big(t_\rho^{-1}\big(B_{f|_H} \cap (\rho \otimes \widehat{H/K})\big)\big). \]
Then $(\chi|_H \otimes \rho)(f|_H) \neq 0$ and this implies that $(\varphi \otimes \sigma)(f) \neq 0$ for some $\varphi \in \widehat{G/K}$ extending $\chi|_H$, that is, $\varphi \in \chi \cdot \widehat{G/H}$. Indeed, this follows from
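\[ \begin{aligned} (\chi|_H \otimes \sigma|_H)(f|_H) &= (\chi \otimes \sigma)(\widetilde{f|_H}) \\ &= [G:H]^{-1}\, (\chi \otimes \sigma)\Big( \sum_{\eta \in \widehat{G/H}} \eta \cdot f \Big) \\ &= [G:H]^{-1} \sum_{\eta \in \widehat{G/H}} (\eta\chi \otimes \sigma)(f) \end{aligned} \]
and $\chi|_H \otimes \rho \le \chi|_H \otimes \sigma|_H$. The fact that $(\varphi \otimes \sigma)(f) \neq 0$ for some $\varphi \in \chi \cdot \widehat{G/H}$ means precisely that $\chi \otimes \sigma \in \widetilde{B}_f$. Thus we have seen so far that
\[ r^{-1}\big(t_\rho^{-1}\big(B_{f|_H} \cap (\rho \otimes \widehat{H/K})\big)\big) \subseteq t_\sigma^{-1}\big(\widetilde{B}_f \cap (\sigma \otimes \widehat{G/K})\big). \]
Since $\mu_{H/K}$ is the image of $\mu_{G/K}$ under the restriction map $r : \widehat{G/K} \to \widehat{H/K}$, an application of Lemma 1.1 to both $G$ and $H$ yields
\[ \begin{aligned} \mu_H\big(B_{f|_H} \cap (\rho \otimes \widehat{H/K})\big) &= \frac{d_\rho}{k_\rho}\, \mu_{H/K}\big(t_\rho^{-1}\big(B_{f|_H} \cap (\rho \otimes \widehat{H/K})\big)\big) \\ &= \frac{d_\rho}{k_\rho}\, \mu_{G/K}\big(r^{-1}\big(t_\rho^{-1}\big(B_{f|_H} \cap (\rho \otimes \widehat{H/K})\big)\big)\big) \\ &\le \frac{d_\rho}{k_\rho}\, \mu_{G/K}\big(t_\sigma^{-1}\big(\widetilde{B}_f \cap (\sigma \otimes \widehat{G/K})\big)\big) \\ &= \frac{d_\rho\, k_\sigma}{k_\rho\, d_\sigma}\, \mu_G\big(\widetilde{B}_f \cap (\sigma \otimes \widehat{G/K})\big). \end{aligned} \]
Note next that since the support of $\sigma|_H$ equals the $G$-orbit $G(\rho)$ of $\rho$, we have $d_\sigma = d_{\sigma|_H} \ge |G(\rho)| \cdot d_\rho$. We now estimate $k_\sigma$. Let $G_\sigma$ and $H_\rho$ be the subgroups of $G$ and $H$ defined by
\[ \widehat{G/G_\sigma} = \{\chi \in \widehat{G/K} : \chi \otimes \sigma = \sigma\} \quad\text{and}\quad \widehat{H/H_\rho} = \{\lambda \in \widehat{H/K} : \lambda \otimes \rho = \rho\}, \]
respectively. If $\chi \in \widehat{G/G_\sigma}$ then $\chi|_H \otimes \sigma|_H = \sigma|_H$ and hence $\chi|_H \otimes \rho \in G(\rho)$. Moreover, if $\chi_1, \chi_2 \in \widehat{G/G_\sigma}$ are such that $\chi_1|_H \otimes \rho = \chi_2|_H \otimes \rho$, then $(\chi_2^{-1}\chi_1)|_H \otimes \rho = \rho$ and so $(\chi_2^{-1}\chi_1)|_H \in \widehat{H/H_\rho}$. These two facts imply
\[ k_\sigma \le [G:H] \cdot |G(\rho)| \cdot k_\rho. \]
Combining this with the above inequality between $d_\sigma$ and $d_\rho$ gives
\[ \mu_H\big(B_{f|_H} \cap (\rho \otimes \widehat{H/K})\big) \le [G:H]\, \mu_G\big(\widetilde{B}_f \cap (\sigma \otimes \widehat{G/K})\big). \]
Finally, let $S \subseteq \widehat{G}$ be such that $\widehat{K}$ is the disjoint union of the supports of the representations $\sigma|_K$, $\sigma \in S$, and for each $\sigma \in S$, choose $\rho_\sigma \in \widehat{H}$ such that $\rho_\sigma \le \sigma|_H$. Then $\widehat{H}$ is the disjoint union
\[ \widehat{H} = \bigcup_{\sigma \in S}\ \bigcup_{\rho \in G(\rho_\sigma)} (\rho \otimes \widehat{H/K}). \]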
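Observing that $|G(\rho_\sigma)| \le [G:H]$, we get
\[ \begin{aligned} \mu_H(B_{f|_H}) &= \sum_{\sigma \in S}\ \sum_{\rho \in G(\rho_\sigma)} \mu_H\big(B_{f|_H} \cap (\rho \otimes \widehat{H/K})\big) \\ &\le [G:H] \cdot \sum_{\sigma \in S} |G(\rho_\sigma)|\, \mu_G\big(\widetilde{B}_f \cap (\sigma \otimes \widehat{G/K})\big) \\ &\le [G:H]^2\, \mu_G(\widetilde{B}_f). \end{aligned} \]
Since $\mu_G(\widetilde{B}_f) \le [G:H]\,\mu_G(B_f)$, the proof is complete. □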
Lemma 3.5. Let $G = AK$, where $K$ is a connected compact open subgroup of $G$ and $A$ is a closed subgroup which is contained in the centre of $G$. Let $f \in L^1(G)$ be such that $\mu_G(B_f) < \infty$. Then, for each $a \in G$, $f|_{aK}$ is either zero or nonzero almost everywhere on $aK$.

Proof. Since $B_{L_a f} = B_f$ for each $a \in G$, we can assume that $a = e$. Let $P$ be a subset of $\widehat{G}$ with the property that each $G$-orbit $G(\tau)$, $\tau \in \widehat{K}$, coincides with exactly one of the sets $\operatorname{supp}(\pi|_K)$, $\pi \in P$. Since $A \subseteq Z(G)$ and $G/A$ is connected, by Lemma 1.3
\[ \begin{aligned} \mu_G(E) &= \sum_{\pi \in P} d_\pi\, \mu_{G/K}\big(t_\pi^{-1}\big(E \cap (\pi \otimes \widehat{G/K})\big)\big) \\ &= \sum_{\pi \in P} d_\pi \int_{\widehat{G/K}} 1_E(\pi \otimes \lambda)\, d\lambda \\ &= \int_{\widehat{G/K}} \sum_{\pi \in P} d_\pi\, 1_E(\pi \otimes \lambda)\, d\lambda \end{aligned} \]
for every Borel subset $E$ of $\widehat{G}$. Thus, if $\mu_G(E) < \infty$ then for almost all $\lambda \in \widehat{G/K}$, the set $\{\pi \in P : \lambda \otimes \pi \in E\}$ is finite.

Let Haar measures on $G$ and $K$ be normalized such that $m_K = m_G|_K$, $m_{G/K}$ is counting measure and Weil's formula holds. Let $R \subseteq A$ be a representative system for the cosets of $K$ in $G$. Since, for $a \in G$,
\[ \int_{\widehat{G/K}} \lambda(a)\, d\mu_{G/K}(\lambda) = \begin{cases} 1 & \text{for } a \in K, \\ 0 & \text{for } a \notin K, \end{cases} \]
it follows that
\[ \begin{aligned} \pi|_K(f|_K) &= \int_K f(x)\pi(x)\, dm_K(x) \\ &= \sum_{a \in R} \int_{\widehat{G/K}} \lambda(a)\, d\mu_{G/K}(\lambda) \int_K f(ax)\pi(ax)\, dm_K(x) \\ &= \int_{\widehat{G/K}} \sum_{a \in R} \int_K f(ax)\lambda(a)\pi(ax)\, dm_K(x)\, d\mu_{G/K}(\lambda) \\ &= \int_{\widehat{G/K}} (\pi \otimes \lambda)(f)\, d\mu_{G/K}(\lambda) \end{aligned} \]
for each $\pi \in \widehat{G}$. Now, let $\sigma \in B_{f|_K}$ and let $\pi \in P$ such that $\pi|_K \ge \sigma$. Then $\pi|_K(f|_K) \neq 0$ and hence, by the preceding formula, $(\pi \otimes \lambda)(f) \neq 0$ for all $\lambda$ in a nonempty open subset of $\widehat{G/K}$. By the first paragraph of the proof there are only finitely many such $\pi$ (for fixed $\lambda$) and hence only finitely many $\sigma \in \widehat{K}$ with $\sigma(f|_K) \neq 0$. Since $K$ is connected, this means that either $f|_K = 0$ or $f|_K$ is nonzero almost everywhere on $K$. □

Using the preceding two lemmas, we are now in a position to prove that condition (ii) in Theorem 3.3 implies the weak QUP. So suppose that $G$ is a Lie group, $G_0$ is compact and $G/G_0$ is abelian. Since $G$ is a Lie group, by Theorem 2 of [21] $G$ contains a normal subgroup $L$ of finite index such that $L/Z(L)$ is compact. Then $G_0 \subseteq L$ and the normal subgroup $H = G_0 Z(L)$ has finite index in $G$ since $G/Z(L)$ is compact and $G_0$ is open in $G$. Moreover, $H = G_0 Z(H)$.

Let $f \in L^1(G) \cap L^2(G)$, $f \neq 0$, and let $K = G_0$. To show that $m_G(A_f)\,\mu_G(B_f) \ge 1$, we can assume that $m_G(A_f) < \infty$ as well as $\mu_G(B_f) < \infty$. We claim that for any $x \in G$, $f|_{xK}$ is either zero or nonzero almost everywhere on $xK$. To see this, let $R$ be a representative system for the cosets of $H$ in $G$, and for $r \in R$, let $f_r = (L_r f)|_H$. Then, by Lemma 3.4,
\[ \mu_H(B_{f_r}) \le [G:H]^3\, \mu_G(B_{L_r f}) = [G:H]^3\, \mu_G(B_f) < \infty. \]
Thus, since $H = Z(H)K$, Lemma 3.5 applies and yields that for any $x \in H$, $f_r|_{xK}$ is either zero or nonzero almost everywhere on $xK$. This of course implies that for every $r \in R$ and any $x \in G$, $f \cdot 1_{rH} = L_{r^{-1}}(\widetilde{f_r})$ is either zero or nonzero almost everywhere on $xK$. Since the functions $f \cdot 1_{rH}$, $r \in R$, are supported on different cosets of $H$ and $K \subseteq H$, it follows that the same is true of $f = \sum_{r \in R} f \cdot 1_{rH}$.

We have seen so far that for each $x \in G$, either $f|_{xK} = 0$ or $f|_{xK} \neq 0$ almost everywhere. Now define a function $g$ on $G/K$ by
\[ g(xK) = \int_K f(xk)\, dm_K(k), \qquad x \in G. \]
As before, let $m_G$ and $m_K$ be normalized such that $m_G(K) = m_K(K) = 1$. Then $m_{G/K}(A_g) = m_G(A_f)$ since $g(xK) \neq 0$ implies that $f(xk) \neq 0$ for all $k$ in a set of positive measure and hence for almost all $k \in K$. Moreover, $\mu_{G/K}$ is the measure induced from $\mu_G$ on $\widehat{G/K} \subseteq \widehat{G}$. Since $\lambda(g) = \lambda(f)$ for every $\lambda \in \widehat{G/K}$, it follows that $\mu_{G/K}(B_g) \le \mu_G(B_f)$. Finally, since $G/K$ is abelian, the weak QUP holds for $G/K$ [20] (for a simple proof see [16, Lemma 1.1]) and consequently
\[ m_G(A_f)\,\mu_G(B_f) \ge m_{G/K}(A_g)\,\mu_{G/K}(B_g) \ge 1. \]
This finishes the proof of Theorem 3.3.
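The abelian weak QUP invoked in the last step can be checked numerically on a finite cyclic group. The sketch below is only an illustration under stated assumptions: counting measure on Z/n, total mass 1 on the dual (so each character has mass 1/n), and the unconjugated transform as in the text; `dft`, `support_size` and `uncertainty_product` are hypothetical helper names.

```python
import cmath

def dft(f):
    """Unconjugated transform on Z/n: hat f(k) = sum_x f(x) * exp(2*pi*i*k*x/n)."""
    n = len(f)
    return [sum(f[x] * cmath.exp(2j * cmath.pi * k * x / n) for x in range(n))
            for k in range(n)]

def support_size(values, tol=1e-9):
    """Number of numerically nonzero entries."""
    return sum(1 for v in values if abs(v) > tol)

def uncertainty_product(f):
    """m(A_f) * mu(B_f), with counting measure on Z/n and mass 1/n per character."""
    n = len(f)
    return support_size(f) * support_size(dft(f)) / n

n = 12
subgroup_indicator = [1.0 if x % 4 == 0 else 0.0 for x in range(n)]  # subgroup {0, 4, 8}
delta = [1.0] + [0.0] * (n - 1)                                      # Dirac function at 0
two_point = [1.0, 1.0] + [0.0] * (n - 2)                             # a generic small support

p_sub = uncertainty_product(subgroup_indicator)
p_delta = uncertainty_product(delta)
p_two = uncertainty_product(two_point)
print(p_sub, p_delta, p_two)
```

The subgroup indicator and the Dirac function attain the lower bound 1, while the two-point function gives a strictly larger product.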
Remark 3.6. (1) The reader will have observed that our proof that condition (ii) in Theorem 3.3 implies the weak QUP used in a substantial manner the existence of a normal subgroup $H$ of finite index in $G$ such that $H/Z(H)$ is compact rather than the somewhat more restrictive hypothesis that $G$ is a Lie group.
(2) We have not been able to show that if a Plancherel group $G$ is a projective limit of Lie groups, $G_\alpha$ say, and each $G_\alpha$ satisfies the weak QUP, then so does $G$. If this were true then, because every Moore group is a projective limit of Lie groups, the hypothesis that $G$ be a Lie group in Theorem 3.3(ii) could be dropped.

Whereas, by Theorem 2.4, the QUP holds for $G$ if and only if it holds for $G_F$, it follows from Theorems 3.1 and 3.3 that when the weak QUP holds for $G_F$, it may nevertheless fail for $G$. In fact, every nonabelian discrete group $G$ for which $G_F$ is abelian serves as such an example. However, if $G_F$ satisfies the weak QUP, then we at least have
\[ m_G(A_g)\,\mu_G(B_g) \ge \frac{1}{[G:G_F]} \]
for every nonzero $g \in L^1(G) \cap L^2(G)$. This is an immediate consequence of part (i) of our final lemma, while assertion (ii) might prove useful in subsequent investigations.

Lemma 3.7. Let $G$ be a Moore group and let $f \in L^1(G_F) \cap L^2(G_F)$.
(i) If $g \in L^1(G) \cap L^2(G)$ is such that $g|_{G_F} = f$, then $m_{G_F}(A_f)\,\mu_{G_F}(B_f) \le [G:G_F]\, m_G(A_g)\,\mu_G(B_g)$.
(ii) Let $g = \widetilde{f}$, the trivial extension of $f$ to all of $G$. Then $m_{G_F}(A_f)\,\mu_{G_F}(B_f) \ge m_G(A_g)\,\mu_G(B_g)$.

Proof. Of course, we can assume that $m_{G_F}$ equals the restriction of $m_G$ to $G_F$. We have seen in the proof of Lemma 2.2 that if $g \in L^1(G) \cap L^2(G)$ is any extension of $f$, then $\mu_{G_F}(B_f) \le [G:G_F]\,\mu_G(B_g)$. Since $m_{G_F}(A_f) \le m_G(A_g)$, (i) follows.

For (ii), let $A \subseteq G$ be a coset representative system for $G_F$ in $G$ and recall that $\mu_G(M) = [G:G_F]^{-1}\mu_{G_F}(s^{-1}(r(M)))$ for any Borel subset $M$ of $\widehat{G}$ (Proposition 1.1). Now, if $\pi \in B_g$ and $\operatorname{supp}(\pi|_{G_F}) = G(\tau)$, $\tau \in \widehat{G_F}$, then $\pi|_{G_F}(f) = \pi(g) \neq 0$ and hence $a \cdot \tau(f) \neq 0$ for at least one $a \in A$. Since $\mu_{G_F}(x \cdot M) = \mu_{G_F}(M)$ for every Borel subset $M$ of $\widehat{G_F}$ and $x \in G$, it follows that
\[ \begin{aligned} \mu_G(B_g) = \mu_G(B_g \cap S) &= [G:G_F]^{-1}\, \mu_{G_F}\big(s^{-1}(r(B_g \cap S))\big) \\ &\le [G:G_F]^{-1}\, \mu_{G_F}\Big( \bigcup_{a \in A} a \cdot (B_f \cap T) \Big) \\ &\le [G:G_F]^{-1} \sum_{a \in A} \mu_{G_F}\big(a \cdot (B_f \cap T)\big) \\ &= \mu_{G_F}(B_f \cap T) = \mu_{G_F}(B_f). \end{aligned} \]
Since $m_G(A_g) = m_{G_F}(A_f)$, (ii) is now obvious. □
References

[1] W.O. Amrein, A.M. Berthier, On support properties of L^p-functions and their Fourier transforms, J. Funct. Anal. 24 (1977) 258–267.
[2] D. Arnal, J. Ludwig, Q.U.P. and Paley–Wiener properties of unimodular, especially nilpotent, Lie groups, Proc. Amer. Math. Soc. 125 (1997) 1071–1080.
[3] M. Benedicks, On Fourier transforms of functions supported on sets of finite Lebesgue measure, J. Math. Anal. Appl. 106 (1985) 180–183.
[4] M. Cowling, J.F. Price, A. Sitaram, A qualitative uncertainty principle for semisimple Lie groups, J. Aust. Math. Soc. 45 (1988) 127–132.
[5] J. Dixmier, C*-Algebras, North-Holland, Amsterdam, 1977.
[6] D.L. Donoho, P.B. Stark, Uncertainty principles and signal recovery, SIAM J. Appl. Math. 49 (1989) 906–931.
[7] S. Echterhoff, E. Kaniuth, A. Kumar, A qualitative uncertainty principle for certain locally compact groups, Forum Math. 3 (1991) 355–369.
[8] W. Feit, Characters of Finite Groups, Benjamin, Amsterdam, 1967.
[9] J.M.G. Fell, Weak containment and induced representations. II, Trans. Amer. Math. Soc. 110 (1964) 424–447.
[10] G.B. Folland, A. Sitaram, The uncertainty principle: A mathematical survey, J. Fourier Anal. Appl. 3 (1997) 207–238.
[11] S. Grosser, M. Moskowitz, Compactness conditions in topological groups, J. Reine Angew. Math. 246 (1971) 1–40.
[12] W. Hauenschild, E. Kaniuth, Harmonische Analyse auf Gruppen mit endlich-dimensionalen irreduziblen Darstellungen, Math. Z. 144 (1975) 239–256.
[13] E. Hewitt, K.A. Ross, Abstract Harmonic Analysis. I, II, Springer-Verlag, Berlin, 1963/70.
[14] J.A. Hogan, A qualitative uncertainty principle for locally compact abelian groups, in: Miniconferences on Harmonic Analysis and Operator Algebras, Canberra, 1987, in: Proc. Centre Math. Anal. Austral. Nat. Univ., vol. 16, 1988, pp. 133–142.
[15] J.A. Hogan, A qualitative uncertainty principle for unimodular groups of type I, Trans. Amer. Math. Soc. 340 (1993) 587–594.
[16] E. Kaniuth, Minimizing functions for an uncertainty principle on locally compact groups of bounded representation dimension, Proc. Amer. Math. Soc. 135 (2007) 217–227.
[17] A. Kleppner, R.L. Lipsman, The Plancherel formula for group extensions, Ann. Sci. Ecole Norm. Sup. 5 (1972) 459–516.
[18] G. Kutyniok, A weak qualitative uncertainty principle for compact groups, Illinois J. Math. 47 (2003) 709–724.
[19] G.W. Mackey, Unitary representations of group extensions, Acta Math. 99 (1958) 265–311.
[20] T. Matolcsi, J. Szücs, Intersection des mesures spectrales conjugees, C. R. Acad. Sci. Paris 277 (1973) 841–843.
[21] C.C. Moore, Groups with finite dimensional irreducible representations, Trans. Amer. Math. Soc. 166 (1972) 401–410.
[22] J.F. Price, A. Sitaram, Functions and their Fourier transforms with supports of finite measure for certain locally compact groups, J. Funct. Anal. 79 (1988) 166–181.
[23] L.C. Robertson, A note on the structure of Moore groups, Bull. Amer. Math. Soc. 75 (1969) 594–599.
[24] D.J.S. Robinson, Finiteness Conditions and Soluble Groups. I, Springer-Verlag, Berlin, 1972.
[25] K.T. Smith, The uncertainty principle on groups, SIAM J. Appl. Math. 50 (1990) 876–882.
Journal of Functional Analysis 257 (2009) 357–387 www.elsevier.com/locate/jfa
Bundles of C ∗ -categories, II: C ∗-dynamical systems and Dixmier–Douady invariants Ezio Vasselli Dipartimento di Matematica, University of Rome “La Sapienza”, P. le Aldo Moro, 2-00185 Roma, Italy Received 30 June 2008; accepted 13 April 2009 Available online 21 April 2009 Communicated by D. Voiculescu
Abstract

We introduce a cohomological invariant arising from a class in nonabelian cohomology. This invariant generalises the Dixmier–Douady class and encodes the obstruction to a C*-algebra bundle being the fixed-point algebra of a gauge action. As an application, the duality breaking for group bundles vs. tensor C*-categories with nonsimple unit is discussed in the setting of Nistor–Troitsky gauge-equivariant K-theory: there is a map assigning a nonabelian gerbe to a tensor category, and "triviality" of the gerbe is equivalent to the existence of a dual group bundle. At the C*-algebraic level, this corresponds to studying C*-algebra bundles with fibre a fixed-point algebra of the Cuntz algebra and in this case our invariant describes the obstruction to finding an embedding into the Cuntz–Pimsner algebra of a vector bundle.

© 2009 Elsevier Inc. All rights reserved.

Keywords: Tensor C*-category; Duality; Cuntz algebra; Group bundle; Gerbe
1. Introduction

In a series of works in the late eighties, S. Doplicher and J.E. Roberts developed an abstract duality for compact groups, motivated by questions raised in the context of algebraic quantum field theory. In such a scenario, the dual object of a compact group is characterised as a tensor C*-category, namely a tensor category carrying an additional C*-algebraic structure (norm, conjugation).

E-mail address: [email protected].
URL: http://axp.mat.uniroma2.it/~vasselli.
0022-1236/$ – see front matter © 2009 Elsevier Inc. All rights reserved. doi:10.1016/j.jfa.2009.04.004
At the C*-algebraic level, one of the main discoveries in that setting has been a machinery performing a duality theory for compact groups in the context of the Cuntz algebra [4]. If $d \in \mathbb{N}$ and $(\mathcal{O}_d, \sigma_d)$ is the Cuntz C*-dynamical system (here $\sigma_d \in \operatorname{end} \mathcal{O}_d$ denotes the canonical endomorphism, see [6, §1]), then every compact subgroup $G \subseteq U(d)$ defines an automorphic action
\[ G \to \operatorname{aut} \mathcal{O}_d, \qquad g \mapsto \hat g \in \operatorname{aut} \mathcal{O}_d : \quad \hat g(\psi_i) := \sum_{j=1}^{d} g_{ij}\psi_j, \tag{1} \]
where $g_{ij} \in \mathbb{C}$, $i, j = 1, \ldots, d$, are the matrix elements of $g$ and $\{\psi_i\}$ denotes the multiplet of mutually orthogonal partial isometries generating $\mathcal{O}_d$. Let $\mathcal{O}_G$ denote the fixed-point algebra of $\mathcal{O}_d$ w.r.t. the action (1). Since $\sigma_d$ commutes with the $G$-action, the restriction $\sigma_G := \sigma_d|_{\mathcal{O}_G} \in \operatorname{end} \mathcal{O}_G$ is well defined. The C*-dynamical system $(\mathcal{O}_G, \sigma_G)$ allows one to reconstruct the following objects: (1) the group $G$, as the stabiliser of $\mathcal{O}_G$ in $\operatorname{aut} \mathcal{O}_d$; (2) the category $\widehat{G}$ of tensor powers of the defining representation $G \to U(d)$, as the category $\hat\sigma_G$ with objects $\sigma_G^r$, $r \in \mathbb{N}$, and arrows the intertwiner spaces of $\sigma_G$:
\[ (\sigma_G^r, \sigma_G^s) := \{ t \in \mathcal{O}_G : \sigma_G^s(a)\,t = t\,\sigma_G^r(a),\ a \in \mathcal{O}_G \}, \qquad r, s \in \mathbb{N}. \tag{2} \]
In this way, the map
\[ G \mapsto (\mathcal{O}_G, \sigma_G) \tag{3} \]
may be considered as a "Galois correspondence" for compact subgroups of $U(d)$.

A more subtle question is when a C*-dynamical system $(\mathcal{A}, \rho)$, $\rho \in \operatorname{end} \mathcal{A}$, is isomorphic to $(\mathcal{O}_G, \sigma_G)$ for some $G \subseteq U(d)$. The solution to this problem (for $G$ contained in the special unitary group $SU(d)$) has been given in [8, §4]: to get the above characterisation, natural necessary conditions are the triviality of the centre of $\mathcal{A}$ and the fact that $\mathcal{A}$ is generated as a Banach space by the intertwiner spaces $(\rho^r, \rho^s)$, $r, s \in \mathbb{N}$; a more crucial condition is the existence of an intertwiner $\varepsilon \in (\rho^2, \rho^2)$, $\varepsilon = \varepsilon^{-1} = \varepsilon^*$ (the symmetry), providing a representation $P_\infty \to \mathcal{A}$ of the infinite permutation group and implementing suitable flips between elements of $(\rho^r, \rho^s)$, $r, s \in \mathbb{N}$. This structure is an abstract counterpart of the flip operator $\theta(\psi \otimes \psi') := \psi' \otimes \psi$, $\psi, \psi' \in H$, where $H$ is the Hilbert space of dimension $d$. In this way, a group $G \subseteq SU(d)$ is associated with $(\mathcal{A}, \rho, \varepsilon)$ and the intertwiner spaces of $\rho$ are interpreted as $G$-invariant operators between tensor powers of $H$. In this sense $G$ is the gauge group associated with $(\mathcal{A}, \rho, \varepsilon)$, according to the motivation of Doplicher and Roberts [9]. The correspondence $(\mathcal{A}, \rho, \varepsilon) \mapsto G$ is functorial: groups $G, G' \subseteq SU(d)$ are conjugate in $U(d)$ if and only if there is an isomorphism $\alpha : (\mathcal{A}, \rho, \varepsilon) \to (\mathcal{A}', \rho', \varepsilon')$ of pointed C*-dynamical systems, in the sense that the conditions $\alpha \circ \rho = \rho' \circ \alpha$, $\alpha(\varepsilon) = \varepsilon'$, are fulfilled. As we shall see in the sequel, the previous conditions are equivalent to requiring an isomorphism of symmetric tensor C*-categories naturally associated with our C*-dynamical systems.

Our research program focused on the study of tensor C*-categories with nonsimple unit. This means that the space of arrows of the identity object $\iota$ is isomorphic to an Abelian C*-algebra $C(X)$ for some compact Hausdorff space $X$. Thus the model category, rather than the one of Hilbert spaces, is the one of Hermitian vector bundles over $X$, that we denote by $\mathrm{vect}(X)$. In a previous work [24], we proved that every tensor C*-category with symmetry and conjugates can be regarded in terms of a bundle of C*-categories over $X$, with fibres duals of compact groups
(see also [27]). By applying a standard technique, we associate pointed C ∗ -dynamical systems of the type (A, ρ, ε) with objects of these categories; as a consequence of the above-mentioned results, each (A, ρ, ε) is a continuous bundle of C ∗ -algebras with base X and fibres pointed C ∗ -dynamical systems (OGx , σGx , θx ), x ∈ X. Starting from this result, it became natural to search for a classification of locally trivial pointed C ∗ -dynamical systems (A, ρ, ε) with fibre (OG , σG , θ ), G ⊆ SU(d). In the first paper of the present series, we gave such a classification in terms of the cohomology set H 1 (X, QG), QG := NG\G, where N G is the normaliser of G in U(d) [24]. In this way, QG-cocycles q ∈ H 1 (X, QG) are put in one-to-one correspondence with pointed C ∗ -dynamical systems (Oq , ρq , εq ). From a different — but equivalent — point of view, H 1 (X, QG) describes the of isomorphism classes of “locally trivial” symmetric tensor C ∗ -categories with set sym(X, G) and such that (ι, ι) C(X). fibre G In the present paper we study the Galois correspondence (3) and the associated abstract version in the case where X is nontrivial. Instead of Od , our reference algebra is the Cuntz–Pimsner algebra OE associated with the module of sections of a vector bundle E → X, which yields a pointed C ∗ -dynamical system (OE , σE , θE ). If G → X is a bundle of unitary automorphisms of E, then we can construct a pointed C ∗ -dynamical system (OG , σG , θE ), OG ⊆ OE , from which it is possible to recover G with the same method used for compact subgroups of U(d). vs. G-bundles acting on vector bundles This leads to a duality for elements of sym(X, G) in the sense of Nistor and Troitsky [20]. Anyway, what we get is not a generalisation of the Doplicher–Roberts construction, as new phenomena arise. 
Firstly, in general it is false that a category with fibre $\widehat G$ is the dual of a $G$-bundle; the reason is a cohomological obstruction to the embedding into $\mathrm{vect}(X)$: in $C^*$-algebraic terms, there are pointed $C^*$-dynamical systems $(\mathcal O_q, \rho_q, \varepsilon_q)$ which do not admit an embedding into some $(\mathcal O_{\mathcal E}, \sigma_{\mathcal E}, \theta_{\mathcal E})$. Secondly, an element of $\mathrm{sym}(X, \widehat G)$ may be realised as the dual of nonisomorphic $G$-bundles: at the $C^*$-algebraic level, we may get isomorphisms $(\mathcal O_q, \rho_q, \varepsilon_q) \simeq (\mathcal O_{\mathcal G}, \sigma_{\mathcal G}, \theta_{\mathcal E})$, $(\mathcal O_q, \rho_q, \varepsilon_q) \simeq (\mathcal O_{\mathcal G'}, \sigma_{\mathcal G'}, \theta_{\mathcal E'})$, with $\mathcal G$ not isomorphic to $\mathcal G'$ and $\mathcal E$ not isomorphic to $\mathcal E'$. In the present work we give an explanation of these facts in terms of properties of $H^1(X, QG)$, providing a complete geometrical characterisation of $\mathrm{sym}(X, \widehat G)$ for what concerns the duality theory.

The above-mentioned cohomological machinery has its roots in the general framework of principal bundles and can be applied to generic $C^*$-algebra bundles. Let $G$ be a group of automorphisms of a $C^*$-algebra $\mathcal F_\bullet$ and $\mathcal A_\bullet$ denote the fixed point algebra w.r.t. the $G$-action. It is natural to ask whether an $\mathcal A_\bullet$-bundle $\mathcal A$ admits an embedding into some $\mathcal F_\bullet$-bundle. In general, the answer is negative and the obstruction is measured by a class
\[ \delta(\mathcal A) \in H^2(X, G'), \tag{4} \]
where $G'$ is an Abelian quotient of $G$. When the above-mentioned embedding exists, $\mathcal A$ is the fixed-point algebra w.r.t. a gauge-action of a group bundle $\mathcal G \to X$ with fibre $G$ on an $\mathcal F_\bullet$-bundle, in the sense of [25]. The above-mentioned obstruction for bundles with fibre $(\mathcal O_G, \sigma_G, \theta)$ and the classical Dixmier–Douady invariant for bundles with fibre the compact operators [5, Chapter 10] are particular cases of this construction.

The present work is organised as follows. In Section 3 we recall some results relating pointed $C^*$-dynamical systems with tensor $C^*$-categories. Moreover, under the hypothesis that the inclusion $G \subseteq U(d)$ is covariant (i.e., the embedding of $\widehat G$ into the category of tensor powers of $H$ is unique up to unitary natural transformations), we give a geometrical characterisation of the space of embeddings of $\mathcal O_G$ into $\mathcal O_d$
E. Vasselli / Journal of Functional Analysis 257 (2009) 357–387
(Lemma 6) and a cohomological classification for $\mathrm{sym}(X, \widehat G)$ (Theorem 7). Note that every inclusion $G \subseteq SU(d)$ is covariant (in essence, this is proved in [7, Lemma 6.7]).

In Section 4 we define some cohomological invariants for principal bundles. Given an exact sequence of topological groups $G \to NG \xrightarrow{\ p\ } QG$ and a space $X$, we consider the induced map of cohomology sets $p_* : H^1(X, NG) \to H^1(X, QG)$ and construct a class $\delta(q) \in H^2(X, G')$ vanishing when $q$ is in the image of $p_*$. Moreover, a nonabelian $G$-gerbe $\breve{\mathcal G}$ is associated with $q$, collapsing to a group bundle $\mathcal G$ when $q$ is in the image of $p_*$. Finally, for each $G \subseteq SU(d)$ we define a Chern class $c(q) \in H^2(X, \mathbb Z)$ (Lemma 16).

In Section 5 we give some properties of gauge $C^*$-dynamical systems and apply to them the construction of the previous section. In this way we construct the class (4), that we apply to pointed $C^*$-dynamical systems (Lemma 18, Theorem 21). The relation with the classical Dixmier–Douady invariant is discussed in Proposition 22.

In Section 6 we prove a concrete duality for group bundles with fibre $G \subseteq U(d)$. Let $\mathcal E \to X$ be a rank $d$ vector bundle, $\widehat{\mathcal E}$ denote the category with objects the tensor powers $\mathcal E^r$, $r \in \mathbb N$, and arrows the spaces $(\mathcal E^r, \mathcal E^s)$ of bundle morphisms; then $\widehat{\mathcal E}$ is a symmetric tensor $C^*$-category with $(\iota, \iota) \simeq C(X)$. We consider a group bundle $\mathcal G \to X$ with a gauge action $\mathcal G \times_X \mathcal E \to \mathcal E$ and define a symmetric tensor $C^*$-subcategory $\widehat{\mathcal G}$ of $\widehat{\mathcal E}$, with arrows $\mathcal G$-equivariant morphisms $(\mathcal E^r, \mathcal E^s)_{\mathcal G}$, $r, s \in \mathbb N$. We establish a one-to-one correspondence between tensor $C^*$-subcategories of $\widehat{\mathcal E}$ and gauge actions (Proposition 25). Tensor $C^*$-subcategories of $\widehat{\mathcal E}$ with fibre $\widehat G$ are in one-to-one correspondence with reductions to $NG$ of the structure group of $\mathcal E$ (Theorem 27): this yields a link between the categorical structure of $\widehat{\mathcal E}$ and the geometry of $\mathcal E$.

In Section 7 we discuss the breaking of abstract duality for categories $\mathcal T$ with fibre $\widehat G$. Isomorphism classes $[\mathcal T] \in \mathrm{sym}(X, \widehat G)$ such that there is an embedding $\eta : \mathcal T \to \mathrm{vect}(X)$ are in one-to-one correspondence with elements of the set $p_*(H^1(X, NG)) \subseteq H^1(X, QG)$ (Theorem 30). For each $\eta$ there is a vector bundle $\mathcal E_\eta \to X$ and a $G$-bundle $\mathcal G_\eta \to X$ acting on $\mathcal E_\eta$ such that $\mathcal T$ is isomorphic to $\widehat{\mathcal G}_\eta$. Applying the results of Section 4, we assign a class $\delta(\mathcal T) \in H^2(X, G')$: if there is an embedding $\mathcal T \to \mathrm{vect}(X)$ then $\delta(\mathcal T)$ vanishes, and when such an embedding does not exist the role of the dual $G$-bundle is played by a $G$-gerbe (Theorem 35). Finally, we discuss the cases $G = SU(d)$ (Examples 32, 36), $G = \mathbb T$ (Example 38) and $G = R_d$ (Example 37, $R_d$ denotes the group of roots of unity).

2. Preliminaries

2.1. Keywords and notation

Let $X$ be a locally compact Hausdorff space. If $\{X_i\}$ is a cover of $X$, then we define $X_{ij} := X_i \cap X_j$, $X_{ijk} := X_i \cap X_j \cap X_k$. Moreover, we denote the $C^*$-algebra of continuous functions on $X$ vanishing at infinity by $C_0(X)$; if $X$ is compact, then we denote the $C^*$-algebra of continuous functions on $X$ by $C(X)$. If $U \subset X$ is open, then we denote the ideal in $C_0(X)$ (or $C(X)$) of functions vanishing in $X - U$ by $C_0(U)$. If $W \subset X$ is closed, then we define $C_W(X) := C_0(X - W)$; in particular, for every $x \in X$ we set $C_x(X) := C_0(X - \{x\})$. Since in the present paper we shall deal with Čech cohomology, we assume that every space has good covers (i.e. each $X_{ij}$, $X_{ijk}$, $\ldots$, is empty or contractible).

Let $\mathcal A$ be a $C^*$-algebra. We denote the set of automorphisms (resp. endomorphisms) of $\mathcal A$, endowed with pointwise convergence topology, by $\mathrm{aut}\,\mathcal A$ (resp. $\mathrm{end}\,\mathcal A$). A pair $(\mathcal A, \rho)$, with $\rho \in \mathrm{end}\,\mathcal A$, is called a $C^*$-dynamical system. If $(\mathcal A, \rho)$, $(\mathcal A', \rho')$ are $C^*$-dynamical systems, then a $C^*$-algebra morphism $\alpha : \mathcal A \to \mathcal A'$ such that $\alpha \circ \rho = \rho' \circ \alpha$ is denoted by $\alpha : (\mathcal A, \rho) \to (\mathcal A', \rho')$.
In particular, if $a \in \mathcal A$, $a' \in \mathcal A'$ and $\alpha(a) = a'$, then we write $\alpha : (\mathcal A, \rho, a) \to (\mathcal A', \rho', a')$ and refer to $\alpha$ as a morphism of pointed $C^*$-dynamical systems. We denote the group of automorphisms of the pointed $C^*$-dynamical system $(\mathcal A, \rho, a)$ by $\mathrm{aut}(\mathcal A, \rho, a)$.

Let $X$ be a locally compact Hausdorff space. A $C_0(X)$-algebra is a $C^*$-algebra $\mathcal A$ endowed with a nondegenerate morphism from $C_0(X)$ into the centre of the multiplier algebra $M(\mathcal A)$. It is customary to assume that such a morphism is injective, thus $C_0(X)$ will be regarded as a subalgebra of $M(\mathcal A)$. For every $x \in X$, we define the fibre epimorphism as the quotient $\pi_x : \mathcal A \to \mathcal A_x := \mathcal A/(C_x(X)\mathcal A)$ and call $\mathcal A_x$ the fibre of $\mathcal A$ over $x$. The group of $C_0(X)$-automorphisms of $\mathcal A$ is denoted by $\mathrm{aut}_X \mathcal A$. The restriction of $\mathcal A$ on an open $U \subset X$ is given by the closed ideal obtained multiplying elements of $\mathcal A$ by elements of $C_0(U)$, and is denoted by $\mathcal A_U := C_0(U)\mathcal A$. We denote the (spatial) $C_0(X)$-tensor product by $\otimes_X$ (see [16, §1.6], where the notation “$C(X)$” is used to mean $C_0(X)$). Examples of $C_0(X)$-algebras are continuous bundles of $C^*$-algebras in the sense of [5,17]; we refer to the last reference for the notion of locally trivial continuous bundle. Let $\mathcal A_\bullet$ be a $C^*$-algebra; to be concise, we will call $\mathcal A_\bullet$-bundle a locally trivial continuous bundle of $C^*$-algebras with fibre $\mathcal A_\bullet$; to avoid confusion with bundles in the topological setting, we emphasise the fact that an $\mathcal A_\bullet$-bundle is indeed a $C^*$-algebra.

For standard notions about vector bundles, we refer to the classics [1,15,21]. In the present work, we will assume that every vector bundle is endowed with a Hermitian structure. We shall also consider Banach bundles (see [10], [5, Chapter 10]). For basic properties of fibre bundles and principal bundles, we refer to [12, Chapter 4, 6], [11, I.3]. If $p : Y \to X$ is a continuous map (i.e., a bundle), then we say that $p$ has local sections if for every $x \in X$ there is a neighbourhood $U \ni x$ and a continuous map $s : U \to Y$ such that $p \circ s = \mathrm{id}_U$.
If $p' : Y' \to X$ is a continuous map, then the fibred product is defined as the space $Y \times_X Y' := \{(y, y') \in Y \times Y' : p(y) = p'(y')\}$. An expository introduction to nonabelian cohomology and gerbes is [2], where a good list of references is provided.

For basic properties of $C^*$-categories and tensor $C^*$-categories, we refer to [7]. In particular, we make use of the terms $C^*$-functor, $C^*$-epifunctor, $C^*$-monofunctor, $C^*$-isofunctor, $C^*$-autofunctor to denote functors preserving the $C^*$-structure. For every $r \in \mathbb N$ we denote the permutation group of order $r$ by $\mathbb P_r$ and the infinite permutation group by $\mathbb P_\infty$, which is endowed with natural inclusions $\mathbb P_s \subset \mathbb P_\infty$, $s \in \mathbb N$. For every $r, s \in \mathbb N$, we denote the permutation exchanging the first $r$ objects with the remaining $s$ objects by $(r, s) \in \mathbb P_{r+s}$.

2.2. Bundles of $C^*$-categories

A $C^*$-category $\mathcal C$ is a category having Banach spaces as sets of arrows and endowed with an involution $* : (\rho, \sigma) \to (\sigma, \rho)$, $\rho, \sigma \in \mathrm{obj}\,\mathcal C$, such that the $C^*$-identity $\|t^* \circ t\| = \|t\|^2$, $t \in (\rho, \sigma)$, is fulfilled. In this way, each $(\rho, \rho)$, $\rho \in \mathrm{obj}\,\mathcal C$, is a $C^*$-algebra, whilst $(\rho, \sigma)$ is a Hilbert $(\sigma, \sigma)$-$(\rho, \rho)$-bimodule (see [14,24]). In the present work we will consider $C^*$-categories not necessarily endowed with identity arrows $1_\rho \in (\rho, \rho)$ (see [19, §2.1]). In this setting, $(\rho, \rho)$ is not necessarily unital and we denote the multiplier algebra by $M(\rho, \rho)$.

Let $\mathcal C$ be a $C^*$-category and $X$ a locally compact Hausdorff space. $\mathcal C$ is said to be a $C_0(X)$-category whenever there is a family $\{i_\rho, \rho \in \mathrm{obj}\,\mathcal C\}$ of nondegenerate morphisms $i_\rho : C_0(X) \to M(\rho, \rho)$, called the $C_0(X)$-structure, such that
\[ t \circ i_\rho(f) = i_\sigma(f) \circ t, \qquad \rho, \sigma \in \mathrm{obj}\,\mathcal C,\ t \in (\rho, \sigma),\ f \in C_0(X). \]
The previous equality implies that each $(\rho, \rho)$, $\rho \in \mathrm{obj}\,\mathcal C$, is a $C_0(X)$-algebra. We assume that each $i_\rho$ is injective and write $f t := i_\sigma(f) \circ t$, $f \in C_0(X)$, $t \in (\rho, \sigma)$. Functors preserving the $C_0(X)$-structure are called $C_0(X)$-functors. If $U \subseteq X$ is open, then we define the restriction on $U$ as the $C^*$-category $\mathcal C_U$ having the same objects as $\mathcal C$ and spaces of arrows $(\rho, \sigma)_U := C_0(U)(\rho, \sigma)$; note that $\mathcal C_U$ may lack identity arrows also when $\mathcal C$ has identity arrows. If $W$ is closed, then we denote the $C^*$-category having the same objects as $\mathcal C$ and spaces of arrows the quotients $(\rho, \sigma)_W := (\rho, \sigma)/(C_0(X - W)(\rho, \sigma))$ by $\mathcal C_W$; the corresponding $C^*$-epifunctor $\pi_W : \mathcal C \to \mathcal C_W$ is called the restriction functor. In particular, we define the fibre of $\mathcal C$ over $x$ as $\mathcal C_x := \mathcal C_{\{x\}}$ and call $\pi_x : \mathcal C \to \mathcal C_x$ the fibre functor. For every $\rho, \sigma \in \mathrm{obj}\,\mathcal C$, $t \in (\rho, \sigma)$, we define the norm function $n_t(x) := \|\pi_x(t)\|$, $x \in X$. It can be proved that $n_t$ is upper semicontinuous for each arrow $t$; when each $n_t$ is continuous, we say that $\mathcal C$ is a continuous bundle over $X$. In this case, each $(\rho, \sigma)$ is a continuous field of Banach spaces over $X$ and each $(\rho, \rho)$ is a continuous bundle of $C^*$-algebras.

Let $\mathcal C_\bullet$ be a $C^*$-category. The constant bundle $X\mathcal C_\bullet$ is the $C_0(X)$-category having the same objects as $\mathcal C_\bullet$ and arrows the spaces $(\rho, \sigma)^X$ of continuous maps vanishing at infinity from $X$ to $(\rho, \sigma)$, $\rho, \sigma \in \mathrm{obj}\,\mathcal C_\bullet$. A $C_0(X)$-category $\mathcal C$ is said to be locally trivial whenever for each $x \in X$ there is an open neighbourhood $U \ni x$ with a $C_0(U)$-isofunctor $\alpha_U : \mathcal C_U \to U\mathcal C_\bullet$, such that the induced map $\alpha_U : \mathrm{obj}\,\mathcal C \to \mathrm{obj}\,\mathcal C_\bullet$ does not depend on the choice of $U$. The functors $\alpha_U$ are called local charts. When $X$ is compact, the same constructions apply with the obvious modifications.

3. Tensor $C^*$-categories and $C^*$-dynamical systems

The present section has two purposes.
Firstly, in order to make the present paper sufficiently self-contained, we collect some results from [6,24] in a slightly different form and recall the notions of special category and embedding functor. Secondly, we describe the space of certain embedding functors in terms of a principal bundle (Lemma 6) and provide a classification result for bundles with fibre $\widehat G$, $G \subset U(d)$ (Theorem 7); these results shall be applied in Section 6.

A tensor $C^*$-category is a $C^*$-category $\mathcal T$ with identity arrows endowed with a $C^*$-bifunctor $\otimes : \mathcal T \times \mathcal T \to \mathcal T$, called the tensor product. For brevity, we denote the tensor product of objects $\rho, \sigma \in \mathrm{obj}\,\mathcal T$ by $\rho\sigma$, whilst the tensor product of arrows $t \in (\rho, \sigma)$, $t' \in (\rho', \sigma')$, is denoted by $t \otimes t' \in (\rho\rho', \sigma\sigma')$. We assume the existence of an identity object $\iota \in \mathrm{obj}\,\mathcal T$ such that $\iota\rho = \rho\iota = \rho$, $\rho \in \mathrm{obj}\,\mathcal T$: it can be easily verified that $(\iota, \iota)$ is an Abelian $C^*$-algebra and every space of arrows $(\rho, \sigma)$ is a Banach $(\iota, \iota)$-bimodule w.r.t. the operation of tensoring with arrows in $(\iota, \iota)$. Let $X^\iota$ denote the spectrum of $(\iota, \iota)$; then $\mathcal T$ is a $C(X^\iota)$-category in a natural way. In particular, it can be proved that $\mathcal T$ is a continuous bundle if certain additional assumptions are satisfied [24, 27].

A tensor $C^*$-category whose objects are $r$-fold tensor powers of an object $\rho$, $r \in \mathbb N$, is denoted by $(\hat\rho, \otimes, \iota)$; for $r = 0$, we use the convention $\rho^0 := \iota$. In the sequel of the present work, we shall need to keep in evidence an arrow $a \in (\rho^r, \rho^s)$ for some $r, s \in \mathbb N$, so that we introduce the notation $(\hat\rho, \otimes, \iota, a)$. Moreover, we denote tensor $C^*$-functors $\alpha : \hat\rho \to \hat\rho'$ (the term tensor means that $\otimes' \circ (\alpha \times \alpha) = \alpha \circ \otimes$, $\alpha(\iota) = \iota'$) such that $\alpha(a) = a'$ by $\alpha : (\hat\rho, \otimes, \iota, a) \to (\hat\rho', \otimes', \iota', a')$.

If $(\mathcal A, \rho, a)$ is a pointed $C^*$-dynamical system, then the category $\hat\rho$ with objects the powers $\rho^r$, $r \in \mathbb N$, and arrows the intertwiner spaces $(\rho^r, \rho^s)$, $r, s \in \mathbb N$, endowed with the tensor product
\[ \rho^r \rho^s := \rho^{r+s}, \qquad t \otimes t' := t\,\rho^r(t'), \qquad t \in (\rho^r, \rho^s),\ t' \in (\rho^{r'}, \rho^{s'}), \]
is an example of such singly generated tensor $C^*$-categories with a distinguished arrow. We denote the $C^*$-algebra generated by the intertwiner spaces $(\rho^r, \rho^s)$, $r, s \in \mathbb N$, by $\mathcal O_\rho$.

Actually, every tensor $C^*$-category $(\hat\rho, \otimes, \iota)$ comes associated with a $C^*$-dynamical system, in the following way (see [7, §4] for details). As a first step, we consider the maps $j_{r,s}(t) := t \otimes 1_\rho \in (\rho^{r+1}, \rho^{s+1})$, $t \in (\rho^r, \rho^s)$ and define the Banach spaces $\mathcal O_\rho^k := \varinjlim_r ((\rho^r, \rho^{r+k}), j_{r,r+k})$, $k \in \mathbb Z$. As a second step, we note that composition of arrows and involution induce a well-defined $*$-algebra structure on the direct sum ${}^0\mathcal O_\rho := \bigoplus_k \mathcal O_\rho^k$. It can be proved that there is a unique $C^*$-norm on ${}^0\mathcal O_\rho$ such that the circle action $\hat z(t) := z^k t$, $z \in \mathbb T$, $t \in \mathcal O_\rho^k$, extends to an automorphic action. In this way, the so-obtained $C^*$-completion $\mathcal O_\rho$ comes equipped with a continuous action $\mathbb T \to \mathrm{aut}\,\mathcal O_\rho$ with spectral subspaces $\mathcal O_\rho^k$, $k \in \mathbb N$, and also with a canonical endomorphism
\[ \rho_* \in \mathrm{end}\,\mathcal O_\rho, \qquad \rho_*(t) := 1_\rho \otimes t, \quad t \in (\rho^r, \rho^s), \]
such that $\rho_* \circ \hat z = \hat z \circ \rho_*$, $z \in \mathbb T$. The pair $(\mathcal O_\rho, \rho_*)$ is called the DR-dynamical system associated with $\rho$. Since the maps $j_{r,s}$ are injective in all the cases of interest in the present work, in the sequel we will identify $t \in (\rho^r, \rho^s)$ with the corresponding element of $\mathcal O_\rho$. By construction we have $(\rho^r, \rho^s) \subseteq (\rho_*^r, \rho_*^s)$, $r, s \in \mathbb N$. We say that $\rho$ is amenable if $(\rho^r, \rho^s) = (\rho_*^r, \rho_*^s)$, $r, s \in \mathbb N$, and in that case $\hat\rho$ is said to be amenably generated. We summarise the above considerations in the following theorem, which also includes a reformulation of [24, Proposition 19]:

Theorem 1. The map $(\hat\rho, \otimes, \iota, a) \mapsto (\mathcal O_\rho, \rho_*, a)$ defines a one-to-one correspondence between the class of amenably generated tensor $C^*$-categories with a distinguished arrow and the class of pointed $C^*$-dynamical systems $(\mathcal A, \sigma, a)$ such that $\mathcal A$ is generated by the intertwiner spaces of $\sigma$. Tensor $C^*$-functors $\alpha : (\hat\rho, \otimes, \iota, a) \to (\hat\rho', \otimes', \iota', a')$ are in one-to-one correspondence with morphisms $\alpha : (\mathcal O_\rho, \rho_*, a) \to (\mathcal O_{\rho'}, \rho'_*, a')$ of pointed $C^*$-dynamical systems. The category $\hat\rho$ is a continuous bundle over the spectrum $X^\iota$ of $(\iota, \iota)$ if and only if $\mathcal O_\rho$ is a continuous bundle over $X^\iota$. If $\hat\rho$ is locally trivial as a bundle of $C^*$-categories, then $\mathcal O_\rho$ is locally trivial as a $C^*$-algebra bundle.

A tensor $C^*$-category $(\mathcal T, \otimes, \iota)$ is said to be symmetric if there is a family of unitary operators $\varepsilon(\rho, \sigma) \in (\rho\sigma, \sigma\rho)$, $\rho, \sigma \in \mathrm{obj}\,\mathcal T$, implementing the flips
\[ (t \otimes t') \circ \varepsilon(\rho', \rho) = \varepsilon(\sigma', \sigma) \circ (t' \otimes t). \]
In particular, if $(\hat\rho, \otimes, \iota)$ is symmetric, then we define the symmetry operator $\varepsilon := \varepsilon(\rho, \rho) \in (\rho^2, \rho^2)$. It is well known that $\varepsilon$ induces a unitary representation of $\mathbb P_\infty$, by considering products of the type $\varepsilon \circ (1_\rho \otimes \varepsilon) \circ (1_{\rho^r} \otimes \varepsilon) \circ \cdots$, $r \in \mathbb N$ (for example, see [6, p. 100]). We denote the unitaries arising from such a representation by $\varepsilon(p) \in (\rho^r, \rho^r)$, $r \in \mathbb N$, $p \in \mathbb P_r \subseteq \mathbb P_\infty$; in particular, we denote the unitary associated with $(r, s) \in \mathbb P_{r+s}$ by $\varepsilon_\rho(r, s) \in (\rho^{r+s}, \rho^{r+s})$.
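As an illustration that is not part of the original argument, the representation $p \mapsto \varepsilon(p)$ can be made concrete in the model case $\rho = H = \mathbb C^d$, where the symmetry is the flip. The following minimal Python sketch lets a permutation act on basis tensors by moving tensor factors, and checks the homomorphism property together with the behaviour of the block permutation $(r, s)$. The names `theta`, `compose`, `block_flip`, `is_homomorphism` and the convention that the factor in slot $i$ is moved to slot $p(i)$ are assumptions made for this sketch only.

```python
from itertools import permutations, product


def theta(p, v):
    """Act with the permutation p on the basis tensor v = (k_0, ..., k_{r-1}).

    Convention (an assumption of this sketch): the factor sitting in slot i
    is moved to slot p[i].
    """
    out = [None] * len(p)
    for i, k in enumerate(v):
        out[p[i]] = k
    return tuple(out)


def compose(p, q):
    """Composition p o q of permutations given as tuples (first q, then p)."""
    return tuple(p[q[i]] for i in range(len(q)))


def block_flip(r, s):
    """The permutation (r, s): exchange the first r slots with the last s."""
    return tuple(i + s if i < r else i - r for i in range(r + s))


def is_homomorphism(d, r):
    """Check theta(p) theta(q) = theta(p o q) on every basis tensor of (C^d)^{(x) r}."""
    vectors = list(product(range(d), repeat=r))
    perms = list(permutations(range(r)))
    return all(
        theta(p, theta(q, v)) == theta(compose(p, q), v)
        for p in perms for q in perms for v in vectors
    )
```

For instance, `theta(block_flip(1, 1), (0, 1))` returns the flipped basis tensor, and composing `block_flip(2, 1)` with `block_flip(1, 2)` gives the identity permutation, in agreement with the fact that the unitaries associated with $(r, s)$ and $(s, r)$ are mutually inverse.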
If there is $\alpha : (\hat\rho, \otimes, \iota, \varepsilon) \to (\hat\rho', \otimes', \iota', \varepsilon')$,
then the above considerations imply that $\alpha(\varepsilon_\rho(r, s)) = \varepsilon_{\rho'}(r, s)$, $r, s \in \mathbb N$. We denote the pointed $C^*$-dynamical system associated with $(\hat\rho, \otimes, \iota, \varepsilon)$ by $(\mathcal O_\rho, \rho_*, \varepsilon)$. According to the considerations of the previous section, we find that $\hat\rho$ is a $C(X^\iota)$-category. Now let $(\hat\rho_\bullet, \otimes_\bullet, \iota_\bullet, \varepsilon_\bullet)$ be a symmetric tensor $C^*$-category such that $(\iota_\bullet, \iota_\bullet) \simeq \mathbb C$; we denote the set of isomorphism classes of locally trivial symmetric tensor $C^*$-categories $(\hat\rho, \otimes, \iota, \varepsilon)$ having fibre $\hat\rho_\bullet$ and such that $(\iota, \iota) \simeq C(X)$ by
\[ \mathrm{sym}(X, \hat\rho_\bullet). \]
With the term isomorphism, here we mean a tensor $C^*$-isofunctor of the type $\alpha : (\hat\rho, \otimes, \iota, \varepsilon) \to (\hat\rho', \otimes', \iota', \varepsilon')$. A locally trivial symmetric tensor $C^*$-category $(\hat\rho, \otimes, \iota, \varepsilon)$ with fibre $\hat\rho_\bullet$ is called $\hat\rho_\bullet$-bundle; the class of $\hat\rho$ in $\mathrm{sym}(X, \hat\rho_\bullet)$ is denoted by $[\hat\rho, \otimes, \iota, \varepsilon]$ or, more concisely, by $[\hat\rho]$.

Remark 2. The condition $\alpha(\varepsilon) = \varepsilon'$ required in the previous notion of isomorphism comes from group duality. Let $G_1$, $G_2$ be compact groups and $RG_1$, $RG_2$ the associated symmetric tensor $C^*$-categories of finite dimensional, continuous, unitary representations; if $\alpha : RG_1 \to RG_2$ is an isomorphism of tensor categories, then a sufficient condition to get an isomorphism $\alpha^* : G_2 \to G_1$ is that $\alpha$ preserves the symmetry (see [13]).

The category hilb of Hilbert spaces, endowed with the usual tensor product, is clearly a symmetric tensor $C^*$-category. Of particular interest for the present work is the following class of subcategories of hilb. Let $H$ be the standard Hilbert space of dimension $d \in \mathbb N$; we denote the $r$-fold tensor power of $H$ by $H^r$ (for $r = 0$, we define $\iota := H^0 := \mathbb C$) and the space of linear operators from $H^r$ to $H^s$ by $(H^r, H^s)$, $r, s \in \mathbb N$; moreover, we consider the flip $\theta \in (H^2, H^2)$. If $G \subseteq U(d)$ is a compact group, then for every $g \in G$ we find that the $r$-fold tensor power $g_r$ is a unitary on $H^r$, so that we consider the spaces of $G$-invariant operators
\[ (H^r, H^s)_G := \{ t \in (H^r, H^s) : t = \hat g(t) := g_s \circ t \circ g_r^*,\ g \in G \}. \tag{5} \]
In particular, we have that $\theta \in (H^2, H^2)_G$. By defining the category $\widehat G$ with objects $H^r$, $r \in \mathbb N$, and arrows $(H^r, H^s)_G$, we obtain a symmetric tensor $C^*$-category $(\widehat G, \otimes, \iota, \theta)$. The pointed $C^*$-dynamical system associated with $(\widehat G, \otimes, \iota, \theta)$ in the sense of Theorem 1 is $(\mathcal O_G, \sigma_G, \theta)$, where $\mathcal O_G$, $\sigma_G$ are defined in Section 1. As mentioned in Section 1, the category $\widehat G$ is amenably generated, so that we have equalities
\[ (H^r, H^s)_G = (\sigma_G^r, \sigma_G^s), \qquad r, s \in \mathbb N. \tag{6} \]
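To make the invariance condition in (5) concrete, the following small Python sketch (an illustration only, not taken from the paper) tests whether a matrix $t \in (H^r, H^s)$ satisfies $t = g_s \circ t \circ g_r^*$ for a single unitary $g$ on $H = \mathbb C^2$. The helper names `matmul`, `kron`, `adjoint`, `tensor_power`, `flip`, `is_invariant` and the particular unitary `g` are assumptions of the sketch; `g` has exact entries, so the equality test is not affected by rounding.

```python
def matmul(a, b):
    """Product of two matrices given as lists of rows."""
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))] for i in range(len(a))]


def kron(a, b):
    """Kronecker product; the basis vector e_i (x) e_k gets index i * len(b) + k."""
    return [[a[i][j] * b[k][l]
             for j in range(len(a[0])) for l in range(len(b[0]))]
            for i in range(len(a)) for k in range(len(b))]


def adjoint(a):
    """Conjugate transpose."""
    return [[complex(a[j][i]).conjugate() for j in range(len(a))]
            for i in range(len(a[0]))]


def tensor_power(g, n):
    """g (x) ... (x) g with n >= 1 factors."""
    out = g
    for _ in range(n - 1):
        out = kron(out, g)
    return out


def flip(d):
    """Matrix of theta on C^d (x) C^d, theta(e_i (x) e_j) = e_j (x) e_i."""
    m = [[0] * (d * d) for _ in range(d * d)]
    for i in range(d):
        for j in range(d):
            m[j * d + i][i * d + j] = 1
    return m


def is_invariant(t, g, r, s):
    """Check t = g_s o t o g_r^* for t in (H^r, H^s), with r, s >= 1."""
    return matmul(matmul(tensor_power(g, s), t), adjoint(tensor_power(g, r))) == t


# A unitary with determinant -i (so outside SU(2)), chosen for its exact entries.
g = [[0, 1j], [1, 0]]
```

With these definitions `is_invariant(flip(2), g, 2, 2)` holds, as it must for every unitary since the flip commutes with $g \otimes g$, whereas the rank-one projection `[[1, 0], [0, 0]]` on $H$ is not invariant under this `g`.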
If $G$ reduces to the trivial group, then we obtain the category $(\widehat H, \otimes, \iota, \theta)$ of tensor powers of $H$ and Theorem 1 yields the Cuntz $C^*$-dynamical system $(\mathcal O_d, \sigma_d, \theta)$. If $G = U(d)$, then $(H^r, H^s)_{U(d)}$ is nontrivial only for $r = s$; in such a case, $(H^r, H^r)_{U(d)}$ is generated as a vector space by the unitaries $\theta(p)$, $p \in \mathbb P_r$. Let $NG$ denote the normaliser of $G$ in $U(d)$ and $QG := NG/G$ the quotient group; then, the map (1) induces an injective continuous action
\[ QG \to \mathrm{aut}(\mathcal O_G, \sigma_G, \theta), \qquad y \mapsto \hat y. \tag{7} \]
Another symmetric tensor $C^*$-category that shall play an important role in the present paper is the category $\mathrm{vect}(X)$ with objects vector bundles over a compact Hausdorff space $X$ and arrows vector bundle morphisms.

Definition 3. Let $(\mathcal T, \otimes, \iota, \varepsilon)$ be a symmetric tensor $C^*$-category. An embedding functor is a $C^*$-monofunctor $E : \mathcal T \to \mathrm{vect}(X^\iota)$ preserving tensor product and symmetry.

We now describe in geometrical terms a set of embedding functors of $\widehat G$, $G \subseteq U(d)$. To this end, let us denote the set of monomorphisms $\eta : (\mathcal O_G, \sigma_G, \theta) \to (\mathcal O_d, \sigma_d, \theta)$ by $\mathrm{emb}\,\mathcal O_G$, and endow it with the pointwise norm topology; by Theorem 1, we can identify $\mathrm{emb}\,\mathcal O_G$ with the set of embeddings $\beta : (\widehat G, \otimes, \iota, \theta) \to (\widehat H, \otimes, \iota, \theta)$. In particular, we denote the group of autofunctors of the type $\beta : (\widehat G, \otimes, \iota, \theta) \to (\widehat G, \otimes, \iota, \theta)$ by $\mathrm{aut}\,\widehat G$.

Definition 4. The faithful representation $G \subseteq U(d)$ is said to be covariant whenever for each $\eta \in \mathrm{emb}\,\mathcal O_G$ there is $u \in U(d)$ such that $\eta = \hat u|_{\mathcal O_G}$.

By Theorem 1 the property of $G \subseteq U(d)$ being covariant is equivalent to requiring that the inclusion functor $(\widehat G, \otimes, \iota, \theta) \subseteq (\widehat H, \otimes, \iota, \theta)$ is unique up to tensor unitary natural transformation. By [7, Lemma 6.7, Theorem 4.17] (see also the following Theorem 10) every inclusion $G \subseteq SU(d)$ is covariant, thus we conclude that every compact Lie group has a faithful covariant representation (in fact, it is well known that every compact Lie group $G$ has a faithful representation $u : G \to U(d)$, so it suffices to consider $u \oplus \det u$). Anyway there are interesting examples of covariant representations whose image is not contained in the special unitary group.

Example 5. Let $G \subset U(d)$ denote the image of $\mathbb T$ under the action on $H \simeq \mathbb C^d$ defined by scalar multiplication. Then $\widehat G$ has spaces of arrows $(H^r, H^s)_G = \delta_{rs}(H^r, H^s)$, $r, s \in \mathbb N$, where $\delta_{rs}$ denotes the Kronecker symbol. We have $NG = U(d)$, $QG = PU(d)$.
If $\eta \in \mathrm{emb}\,\mathcal O_G$ then $\eta$ restricts to a $C^*$-isomorphism $\eta : (H, H)_G = (H, H) \to (H, H)$, which is the inner automorphism induced by a unitary $u \in U(d)$. Since $(H^r, H^r) \simeq \bigotimes^r (H, H)$ and $\eta(\bigotimes_i^r t_i) = \bigotimes_i^r \eta(t_i) = \bigotimes_i^r \hat u(t_i)$, $t_i \in (H, H)$, $i = 1, \ldots, r$, we conclude that $\eta = \hat u|_{\mathcal O_G}$ and $G \subseteq U(d)$ is covariant.

Lemma 6. Let $G \subseteq U(d)$ be covariant. Then $\mathrm{emb}\,\mathcal O_G$ is homeomorphic to the coset space $U(d) \backslash G$. For each locally compact Hausdorff space $Y$ and continuous map
\[ \beta : Y \to \mathrm{emb}\,\mathcal O_G, \qquad y \mapsto \beta_y, \tag{8} \]
there is a finite open cover $\{Y_l\}$ of $Y$ and continuous maps $u_l : Y_l \to U(d)$ such that
\[ \hat u_{l,y}(t) = \beta_y(t), \qquad y \in Y_l,\ t \in \mathcal O_G, \tag{9} \]
where $\hat u_{l,y} \in \mathrm{aut}\,\mathcal O_d$ is defined by (1).
Proof. We consider the fibration $q : U(d) \to U(d) \backslash G$ and define
\[ \chi : U(d) \to \mathrm{emb}\,\mathcal O_G, \qquad u \mapsto \hat u|_{\mathcal O_G}. \]
The map $\chi$ is clearly continuous and, since $G \subseteq U(d)$ is covariant, it is also surjective. Now, (1) yields an isomorphism from $G$ to the stabiliser of $\mathcal O_G$ in $\mathrm{aut}\,\mathcal O_d$ [6, Corollary 3.3], thus we find that $\chi(u_1) = \chi(u_2)$ if and only if $u_1^* u_2 \in G$, i.e. $q(u_1) = q(u_2)$. This proves that $\mathrm{emb}\,\mathcal O_G$ is homeomorphic to $U(d) \backslash G$. Since $U(d)$ is a compact Lie group, the map $q$ defines a principal $G$-bundle over $U(d) \backslash G$, thus there is a finite open cover $\{\Omega_l\}$ of $U(d) \backslash G$ and local sections $s_l : \Omega_l \to U(d)$, $q \circ s_l = \mathrm{id}_{\Omega_l}$. Now, let us identify $\mathrm{emb}\,\mathcal O_G$ with $U(d) \backslash G$ and consider the map (8); defining $Y_l := \beta^{-1}(\Omega_l)$ we obtain a finite open cover of $Y$ and set $u_{l,y} := s_l \circ \beta_y$, $y \in Y_l$. By definition of $\chi$, Eq. (9) is fulfilled and the lemma is proved. $\Box$

Let now $p : NG \to QG$ be the natural projection. The following result is a version of [24, Theorem 36] for groups not necessarily contained in $SU(d)$:

Theorem 7. If $G \subseteq U(d)$ is covariant then there is an isomorphism $QG \simeq \mathrm{aut}\,\widehat G$, and for each compact Hausdorff space $X$ there is a bijective map $Q : \mathrm{sym}(X, \widehat G) \to H^1(X, QG)$.

Proof. Using Theorem 1 we identify $\mathrm{aut}\,\widehat G$ with $\mathrm{aut}(\mathcal O_G, \sigma_G, \theta)$. By [24, Lemma 32], to prove the theorem it suffices to verify that (7) is an isomorphism. Now, the same argument of the previous lemma shows that if $\eta \in \mathrm{aut}\,\widehat G$ then there is $u \in U(d)$ such that $\hat u \in \mathrm{aut}\,\mathcal O_d$ restricts to $\eta$ on $\mathcal O_G$; since $\mathcal O_G$ is $\hat u$-stable, for each $g \in G$ we find that $\hat u \circ \hat g \circ \hat u^{-1}$ is the identity on $\mathcal O_G$, thus by [6, Corollary 3.3] there is $g' \in G$ such that $g' = u g u^*$. We conclude that $u \in NG$ and since $\hat u \circ \hat g|_{\mathcal O_G} = \hat u|_{\mathcal O_G} = \eta$ for all $g \in G$ we find $\eta = \hat y$, where $y = p(u)$ and $\hat y$ is the image of $y \in QG$ under (7). $\Box$

Remark 8. Let $\mathrm{aut}(\mathcal O_d; \mathcal O_G)$ denote the group of automorphisms of $(\mathcal O_d, \sigma_d, \theta)$ that restrict to elements of $\mathrm{aut}\,\mathcal O_G$. The argument of the previous theorem shows that (1) induces the isomorphism $NG \to \mathrm{aut}(\mathcal O_d; \mathcal O_G)$, in such a way that for each $u \in NG$ we have
\[ \hat u|_{\mathcal O_G} = \hat y, \qquad y := p(u) \in QG. \]
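This yields a slight generalisation of [24, Theorem 34].

Given a $\widehat G$-bundle $(\hat\rho, \otimes, \iota, \varepsilon)$, with $G \subseteq U(d)$ covariant, we denote the associated class in $H^1(X, QG)$ by $Q[\hat\rho]$.

Now, let us consider a symmetric tensor $C^*$-category $(\hat\rho, \otimes, \iota, \varepsilon)$. For every $n \in \mathbb N$, we define the antisymmetric projection
\[ P_{\rho,\varepsilon,n} := \frac{1}{n!} \sum_{p \in \mathbb P_n} \mathrm{sign}(p)\, \varepsilon_\rho(p). \tag{10} \]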
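As a sanity check of (10) in the model case $\rho = H = \mathbb C^d$ with $\varepsilon$ the flip, the following Python sketch (an illustration under these assumptions, not part of the original text) builds the matrix of the antisymmetric projection with exact rational entries. The helper names and the convention used for the action of a permutation on basis tensors are choices made for the sketch. In this model the projection is idempotent and its trace equals the dimension $\binom{d}{n}$ of the $n$-th exterior power of $\mathbb C^d$.

```python
from fractions import Fraction
from itertools import permutations, product
from math import factorial


def sign(p):
    """Sign of a permutation tuple, computed from its number of inversions."""
    inversions = sum(1 for i in range(len(p))
                     for j in range(i + 1, len(p)) if p[i] > p[j])
    return -1 if inversions % 2 else 1


def antisymmetric_projection(d, n):
    """Matrix of (1/n!) sum_p sign(p) theta(p) on (C^d)^{(x) n}, exact entries."""
    basis = list(product(range(d), repeat=n))
    index = {v: k for k, v in enumerate(basis)}
    P = [[Fraction(0)] * len(basis) for _ in basis]
    for p in permutations(range(n)):
        for v in basis:
            w = [None] * n
            for i, k in enumerate(v):
                w[p[i]] = k  # theta(p) moves the factor in slot i to slot p[i]
            P[index[tuple(w)]][index[v]] += Fraction(sign(p), factorial(n))
    return P


def matmul(a, b):
    """Product of two matrices given as lists of rows."""
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))] for i in range(len(a))]


def trace(a):
    return sum(a[i][i] for i in range(len(a)))
```

For $d = 3$, $n = 2$ the matrix `antisymmetric_projection(3, 2)` squares to itself and has trace 3, while for $n > d$ the projection vanishes, which matches the fact that a special object of dimension $d$ has $P_{\rho,\varepsilon,n} = 0$ for $n > d$ in this model.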
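The object $\rho$ is said to be special if there is $d \in \mathbb N$ and a partial isometry $S \in (\iota, \rho^d)$ with support $P_{\rho,\varepsilon,d}$, such that
\[ (S^* \otimes 1_\rho) \circ (1_\rho \otimes S) = (-1)^{d-1} d^{-1} 1_\rho \quad \Leftrightarrow \quad S^* \rho_*(S) = (-1)^{d-1} d^{-1} 1. \tag{11} \]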
In such a case, $d$ is called the dimension of $\rho$. When $\rho$ is an endomorphism and is special in the above sense, we say that $\rho$ satisfies the special conjugate property (see [8, §4]). Special objects play a pivotal role in the Doplicher–Roberts theory. From the viewpoint of group duality they are an abstract characterisation of the notion of representation with determinant 1 (see [7, §3]). From the $C^*$-algebraic point of view, they are an essential tool for the crossed product defined in [8, §4].

Let $G \subseteq SU(d)$. Then the object $H$ of $\widehat G$ is special and has dimension $d$. In fact, we consider the isometry $S$ generating the totally antisymmetric tensor power $\wedge^d H$, and note that $u_d \circ S = \det u \cdot S = S$, $u \in SU(d)$, so that $S \in (\iota, H^d)_{SU(d)} \subseteq (\iota, H^d)_G$ and (11) follows from [6, Lemma 2.2]. In particular when $G = SU(d)$ the spaces $(H^r, H^s)_{SU(d)}$, $r, s \in \mathbb N$, are generated by the operators $\theta(p)$, $p \in \mathbb P_\infty$, and $S \in (\iota, H^d)_{SU(d)}$, by closing w.r.t. composition and tensor product.

Definition 9. A special category is a locally trivial, symmetric tensor $C^*$-category $(\hat\rho, \otimes, \iota, \varepsilon)$ with fibre $(\hat\rho_\bullet, \otimes_\bullet, \iota_\bullet, \varepsilon_\bullet)$, such that $\rho_\bullet$ is a special object. The dimension of the object $\rho$ generating the special category $\hat\rho$ is by definition the dimension of the special object $\rho_\bullet$ and is denoted by $d$.

The main motivation of the present work is the search for embedding functors for special categories. The first step in this direction is given by the following classification result, proved in [24, Theorem 36]:

Theorem 10. Let $(\hat\rho, \otimes, \iota, \varepsilon)$ be a special category with fibre $(\hat\rho_\bullet, \otimes_\bullet, \iota_\bullet, \varepsilon_\bullet)$. Then:
(1) $\rho$ is amenable;
(2) Let $d \in \mathbb N$ denote the dimension of $\rho$; then there is a compact Lie group $G \subseteq SU(d)$ such that $(\hat\rho_\bullet, \otimes_\bullet, \iota_\bullet, \varepsilon_\bullet) \simeq (\widehat G, \otimes, \iota, \theta)$;
(3) There is a bijection $\mathrm{sym}(X^\iota, \hat\rho_\bullet) \simeq H^1(X^\iota, QG)$;
(4) $\mathcal O_\rho$ is an $\mathcal O_G$-bundle.

In general, the object generating a special category is not special. The obstruction to $\rho$ being special is encoded by the Chern class introduced in [24, §3.0.3],
\[ c(\rho) \in H^2(X^\iota, \mathbb Z), \tag{12} \]
constructed by observing that the $(\iota, \iota)$-module $R_\rho := \{\psi \in (\iota, \rho^d) : P_{\rho,\varepsilon,d}\psi = \psi\}$ is the set of sections of a line bundle $L_\rho \to X^\iota$. The invariant $c(\rho)$ is defined as the first Chern class of $L_\rho$.

4. Cohomology classes and principal bundles

In the present section we give an exact sequence and a cohomological invariant for a class of principal bundles. This elementary construction has important consequences in the setting of abstract duality for tensor $C^*$-categories and can be regarded as a generalisation of the Dixmier–Douady invariant.
Let $G$ be a topological group with unit $1$ and $X$ a locally compact, paracompact Hausdorff space endowed with a (good) open cover $\{X_i\}$. A $G$-cocycle is given by a family $g := \{g_{ij}\}$ of continuous maps $g_{ij} : X_{ij} \to G$ satisfying on $X_{ijk}$ the cocycle relations $g_{ij} g_{jk} = g_{ik}$ (which imply $g_{ij} g_{ji} = g_{ii} = 1$). In the sequel we will denote the evaluation of $g_{ij}$ on $x \in X_{ij}$ by $g_{ij,x}$. We say that $g$ is cohomologous to $g' := \{g'_{ij}\}$ whenever there are maps $v_i : X_i \to G$ such that $g_{ij} v_j = v_i g'_{ij}$ on $X_{ij}$. This defines an equivalence relation over the set of $G$-cocycles, and passing to the inverse limit over open good covers provides the Čech cohomology set $H^1(X, G)$ (see [15, I.3.5]), which is a pointed set with distinguished element the class of the trivial cocycle $1$, $1_{ij,x} \equiv 1$. To be concise, sometimes in the sequel cocycles will be denoted simply by $g$ or $\{g_{ij}\}$, and their classes in $H^1(X, G)$ by $[g]$ or $[g_{ij}]$. It is well known that $H^1(X, G)$ classifies the principal $G$-bundles over $X$. When $G$ is Abelian, $H^1(X, G)$ coincides with the first cohomology group with coefficients in the sheaf $S_X(G)$ of germs of continuous maps from $X$ into $G$ [11, I.3.1].

We now pass to give a definition of nonabelian Čech 2-cohomology. The basic object providing the coefficients of the theory is now given by a crossed module (also called 2-group, see [2, §3]), which is defined by a morphism $i : G \to N$ of topological groups and an action $\alpha : N \to \mathrm{aut}\,G$, such that $i$ is equivariant for $\alpha$ and the adjoint actions $G \to \mathrm{aut}\,G$, $G \ni g \mapsto \hat g$, $N \to \mathrm{aut}\,N$, $N \ni u \mapsto \hat u$:
\[ \hat u \circ i = i \circ \alpha(u), \qquad \hat g = \alpha \circ i(g). \]
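The two equivariance relations can be tested mechanically on finite groups. The following Python sketch (an illustration with hypothetical helper names, not part of the original text) takes $N$ to be the permutation group of three objects and $G$ its subgroup of even permutations, with $i$ the inclusion, and checks both relations for the adjoint action; it also shows that they fail if $\alpha$ is replaced by the trivial action, so the conditions are not automatic.

```python
from itertools import permutations


def mul(p, q):
    """Composition p o q of permutations given as tuples."""
    return tuple(p[q[i]] for i in range(len(q)))


def inv(p):
    """Inverse permutation."""
    out = [0] * len(p)
    for i, x in enumerate(p):
        out[x] = i
    return tuple(out)


def conj(u, g):
    """Adjoint action: u g u^{-1}."""
    return mul(mul(u, g), inv(u))


def is_even(p):
    inversions = sum(1 for i in range(len(p))
                     for j in range(i + 1, len(p)) if p[i] > p[j])
    return inversions % 2 == 0


def is_crossed_module(G, N, i, alpha):
    """Check u^ o i = i o alpha(u) and g^ = alpha(i(g)) element by element."""
    equivariant = all(conj(u, i(g)) == i(alpha(u)(g)) for u in N for g in G)
    peiffer = all(conj(g, h) == alpha(i(g))(h) for g in G for h in G)
    return equivariant and peiffer


N = list(permutations(range(3)))      # all permutations of three objects
G = [p for p in N if is_even(p)]      # even permutations, a normal subgroup


def inclusion(g):
    return g


def adjoint_action(u):
    """alpha(u) := conjugation by u, restricted to G."""
    return lambda g: conj(u, g)


def trivial_action(u):
    """alpha(u) := identity of G (used only as a counterexample)."""
    return lambda g: g
```

Here `is_crossed_module(G, N, inclusion, adjoint_action)` holds, while `is_crossed_module(G, N, inclusion, trivial_action)` does not, because conjugation by a transposition inverts the 3-cycles.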
The crossed module $(G, N, i, \alpha)$ is denoted for short by $G \to N$. To be concise we write $g := i(g) \in N$, $g \in G$, and $\alpha(u) := \hat u$, $u \in N$. The equivariance relations ensure that no confusion will arise from this notation.

Example 11. Let $N$ be a topological group and $G$ a normal subgroup of $N$: then considering the inclusion $i : G \to N$ and the adjoint action $\alpha : N \to \mathrm{aut}\,G$, $u \mapsto \hat u$, yields a crossed module $G \to N$.

A cocycle pair $b := (u, g)$ with coefficients in the crossed module $G \to N$ is given by families of maps
\[ u_{ij} : X_{ij} \to N, \qquad g_{ijk} : X_{ijk} \to G, \]
satisfying the cocycle relations
\[ u_{ij} u_{jk} = g_{ijk} u_{ik}, \qquad g_{ijk} g_{ikl} = \hat u_{ij}(g_{jkl})\, g_{ijl}, \]
where $\hat u_{ij} : X_{ij} \to \mathrm{aut}\,G$ is defined by means of $\alpha$. Cocycle pairs $b := (u, g)$, $b' := (u', g')$ are said to be cohomologous whenever there is a pair $(v, h)$ of families of maps
\[ v_i : X_i \to N, \qquad h_{ij} : X_{ij} \to G, \]
such that
\[ v_i u'_{ij} = h_{ij} u_{ij} v_j, \qquad h_{ik} g_{ijk} = \hat v_i(g'_{ijk})\, h_{ij}\, \hat u_{ij}(h_{jk}). \]
It can be proved that cohomology of cocycle pairs defines an equivalence relation [2, §4]. The set of cohomology classes of cocycle pairs is by definition the cohomology set relative to the cover $\{X_i\}$ with coefficients in the crossed module $G \to N$; passing to the limit w.r.t. covers yields the Čech cohomology set $\breve H^2(X, G \to N)$ with distinguished element the class of the trivial cocycle pair $1 := (1, 1)$, $1_{ij,x} := 1 \in N$, $1_{ijk,x} := 1 \in G$. The symbol $\breve H$ is used to emphasise that we deal with nonabelian cohomology sets. Note that our notation is not universally used in literature: sometimes the symbol $\breve H^1(X, G \to N)$ is used instead of $\breve H^2(X, G \to N)$ (see for example [2]). The cohomology class of the cocycle pair $b = (u, g)$ is denoted by $[b] \equiv [u, g]$.

Remark 12. (1) Each $N$-cocycle $u := \{u_{ij}\}$ defines the cocycle pair $du := (u, 1)$; (2) If $G$ is Abelian and $\alpha$ is the trivial action, then each cocycle pair $(u, g)$ defines the cocycle $g = \{g_{ijk}\}$ in the second (Abelian) cohomology of $G$.

An important class of examples is the following: let $G$ be a topological group and $i : G \to \mathrm{aut}\,G$, $i(g) := \hat g$, denote the adjoint action; then taking $\alpha : \mathrm{aut}\,G \to \mathrm{aut}\,G$ as the identity map yields a crossed module $G \to \mathrm{aut}\,G$. Thus we can define the cohomology set $\breve H^2(X, G \to \mathrm{aut}\,G)$ with elements classes of cocycle pairs $(\lambda, g)$ of the type
\[ \lambda_{ij} : X_{ij} \to \mathrm{aut}\,G, \quad g_{ijk} : X_{ijk} \to G : \qquad \lambda_{ij} \lambda_{jk} = \hat g_{ijk} \lambda_{ik}, \qquad g_{ijk} g_{ikl} = \lambda_{ij}(g_{jkl})\, g_{ijl}, \]
where each $\hat g_{ijk} : X_{ijk} \to \mathrm{aut}\,G$ is defined by adjoint action.

Remark 13. According to the considerations in [2, §2], $\breve H^2(X, G \to \mathrm{aut}\,G)$ classifies the $G$-gerbes on $X$ up to isomorphism. In the present paper we use the term $G$-gerbe to mean a principal 2-bundle over $X$ with fibre the crossed module $G \to \mathrm{aut}\,G$. In this way, cocycle pairs with coefficients in $G \to \mathrm{aut}\,G$ are interpreted as transition maps for $G$-gerbes, and $G$-bundles define $G$-gerbes such that the associated cocycle pairs are of the type $d\lambda = (\lambda, 1)$, $\lambda \in H^1(X, \mathrm{aut}\,G)$ (see Remark 12).

We define the maps
\[ \gamma_* : H^1(X, N) \to H^1(X, \mathrm{aut}\,G), \qquad [u_{ij}] \mapsto [\hat u_{ij}], \tag{13} \]
\[ \breve\gamma_* : \breve H^2(X, G \to N) \to \breve H^2(X, G \to \mathrm{aut}\,G), \qquad [\{u_{ij}\}, g] \mapsto [\{\hat u_{ij}\}, g]. \tag{14} \]
Let now $N$ denote a topological group and $G \subseteq N$ a normal subgroup. Defining $QG := N \backslash G$ yields the exact sequence
\[ 1 \to G \xrightarrow{\ i\ } N \xrightarrow{\ p\ } QG \to 1. \tag{15} \]
Let $N_G$ denote the smallest normal subgroup of $N$ containing the set $[G, N] := \{g u g^{-1} u^{-1},\ g \in G,\ u \in N\}$. We consider the quotient map $\pi_N : N \to N' := N \backslash N_G$ and define $G' := \pi_N(G)$. Note that by construction $G'$ is Abelian; when $G$ is contained in the centre of $N$ we have that $[G, N]$ is trivial and $N_G = \{1\}$, $N' = N$, $G' = G$.

Lemma 14. Let $G$ be a normal subgroup of the topological group $N$ and suppose that the fibration $p : N \to QG := N \backslash G$ has local sections. Then for every locally compact, paracompact Hausdorff space $X$ there is an isomorphism of pointed sets
\[ \nu : H^1(X, QG) \to \breve H^2(X, G \to N). \tag{16} \]
Moreover, there is a commutative diagram
\[ \begin{array}{ccc} G & \xrightarrow{\ i\ } & N \\ {\scriptstyle \pi_G}\big\downarrow & & \big\downarrow{\scriptstyle \pi_N} \\ G' & \xrightarrow{\ i'\ } & N' \end{array} \tag{17} \]
which yields the map
\[ \pi_{N,*} : \breve H^2(X, G \to N) \to H^2(X, G'). \tag{18} \]
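Before the proof, it may help to see the groups $N_G$, $N'$ and $G'$ entering diagram (17) in a finite example. The following Python sketch (an illustration only; the choice of the dihedral group of order 8 for $N$, of its rotation subgroup for $G$, and all helper names are assumptions of the sketch) computes the normal subgroup generated by the commutators $g u g^{-1} u^{-1}$ and the resulting quotients, represented as sets of cosets.

```python
def mul(p, q):
    """Composition p o q of permutations given as tuples."""
    return tuple(p[q[i]] for i in range(len(q)))


def inv(p):
    out = [0] * len(p)
    for i, x in enumerate(p):
        out[x] = i
    return tuple(out)


def conj(u, g):
    """u g u^{-1}."""
    return mul(mul(u, g), inv(u))


def generate(gens):
    """Subgroup generated by gens (finite permutation groups only)."""
    group = {tuple(range(len(gens[0])))} | set(gens)
    while True:
        new = {mul(a, b) for a in group for b in group} - group
        if not new:
            return group
        group |= new


def normal_closure(subset, N):
    """Smallest normal subgroup of the finite group N containing subset."""
    H = {tuple(range(len(next(iter(N)))))} | set(subset)
    while True:
        new = ({mul(a, b) for a in H for b in H}
               | {conj(u, a) for u in N for a in H}) - H
        if not new:
            return H
        H |= new


def abelian_quotient(G, N):
    """Return (N_G, N', G'), with N' and G' given as sets of cosets of N_G."""
    commutators = {mul(mul(g, u), mul(inv(g), inv(u))) for g in G for u in N}
    NG = normal_closure(commutators, N)

    def coset(u):
        return frozenset(mul(u, h) for h in NG)

    return NG, {coset(u) for u in N}, {coset(g) for g in G}


r = (1, 2, 3, 0)             # rotation of a square by a quarter turn
s = (0, 3, 2, 1)             # a reflection of the square
D4 = generate([r, s])        # N: dihedral group of order 8
C4 = generate([r])           # G: the rotation subgroup, normal in N
NG, N_prime, G_prime = abelian_quotient(C4, D4)
```

In this example $N_G$ consists of the identity and the half turn, $N'$ has four elements and $G'$ has two, so $G'$ is a proper Abelian quotient of $G$; this reflects the fact that $G$ is not central in $N$.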
Proof. The fact that there is an isomorphism as in (16) is proved in [2, Lemma 2]; anyway, for the reader’s convenience we give a sketch of the proof. Let $q := \{y_{ij}\}$ be a $QG$-cocycle; since $p$ has local sections, up to performing a refinement of $\{X_i\}$ there are maps $u_{ij} : X_{ij} \to N$ such that $y_{ij} = p \circ u_{ij}$ (it suffices to define $u_{ij} := s \circ y_{ij}$, where $s : U \to N$, $U \subseteq QG$, $y_{ij}(X_{ij}) \subseteq U$, is a local section). Since $p \circ (u_{ij} u_{jk} u_{ik}^{-1}) = y_{ij} y_{jk} y_{ki} = 1$, we conclude that there is $g_{ijk} : X_{ijk} \to G$ such that $u_{ij} u_{jk} = g_{ijk} u_{ik}$. It is trivial to check that $(\{u_{ij}\}, \{g_{ijk}\})$ is a cocycle pair, and we define
\[ \nu[q] := [\{u_{ij}\}, \{g_{ijk}\}], \qquad [q] := [y_{ij}] \in H^1(X, QG). \tag{19} \]
On the other hand, if $b := (\{u_{ij}\},\{g_{ijk}\})$ is a cocycle pair then defining $p_*[b] := [p\circ u_{ij}]$ yields an inverse of $\nu$. We now prove (18). Defining $\pi_G := \pi_N|_G$ yields the commutative diagram (17); if $b := (u,g)$, $g := \{g_{ijk}\}$, is a cocycle pair, then we define $\pi_{N,*}[b] := [\pi_N\circ g_{ijk}]$ and this yields the desired map (note in fact that $\pi_N\circ\hat u_{ij}(g_{jkl}) = \pi_N\circ g_{jkl}$, so that $\{\pi_N\circ g_{ijk}\}$ is a $G'$-valued 2-cocycle). □

Now, by functoriality of $H^1(X,\cdot)$ there is a sequence of maps of pointed sets
\[ H^1(X,G) \xrightarrow{\;i_*\;} H^1(X,N) \xrightarrow{\;p_*\;} H^1(X,QG). \tag{20} \]
In fact, p∗ ◦ i∗ [g] = [1] for each g ∈ H 1 (X, G). In the following result we give an obstruction to p∗ being surjective.
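The check that $(\{u_{ij}\},\{g_{ijk}\})$ is a cocycle pair, left to the reader in the proof of Lemma 14, can be sketched by expanding a triple product in two ways. The sketch below assumes the usual convention (not restated in this section) that $\hat u(g) := u g u^{-1}$ and that the 2-cocycle relation for a cocycle pair reads $g_{ijk}\,g_{ikl} = \hat u_{ij}(g_{jkl})\,g_{ijl}$ on $X_{ijkl}$. Using only $u_{ij}u_{jk} = g_{ijk}u_{ik}$ and associativity in $N$:

\begin{align*}
(u_{ij}u_{jk})\,u_{kl} &= g_{ijk}\,u_{ik}u_{kl} = g_{ijk}\,g_{ikl}\,u_{il},\\
u_{ij}\,(u_{jk}u_{kl}) &= u_{ij}\,g_{jkl}\,u_{jl} = \hat u_{ij}(g_{jkl})\,u_{ij}u_{jl} = \hat u_{ij}(g_{jkl})\,g_{ijl}\,u_{il}.
\end{align*}

Cancelling $u_{il}$ gives the assumed relation. Moreover, since $u g u^{-1} g^{-1} = (g u g^{-1} u^{-1})^{-1} \in N_G$, one has $\pi_N\circ\hat u_{ij}(g_{jkl}) = \pi_N\circ g_{jkl}$, so applying $\pi_N$ turns the relation into the ordinary (untwisted) 2-cocycle identity in the Abelian group $G'$, as used for (18).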
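Lemma 15. Let $G$ be a normal subgroup of the topological group $N$ such that the fibration $p : N \to QG := N\backslash G$ has local sections. Then we have the following sequence of maps of pointed sets:
\[ \begin{array}{ccccccc}
H^1(X,G) & \xrightarrow{\;i_*\;} & H^1(X,N) & \xrightarrow{\;p_*\;} & H^1(X,QG) & \xrightarrow{\;\delta\;} & H^2(X,G') \\
 & & \downarrow{\scriptstyle\gamma_*} & & \downarrow{\scriptstyle\breve\gamma_*\circ\nu} & & \\
 & & H^1(X,\mathrm{aut}\,G) & \xrightarrow{\;d_*\;} & \breve H^2(X, G\to\mathrm{aut}\,G) & &
\end{array} \tag{21} \]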
Here $d_*$ is induced by the map $d\lambda := (\lambda,1)$, and the square is commutative. When $G$ is contained in the centre of $N$, the upper horizontal row is exact and $G' = G$.

Proof. The proof of the lemma is based on the maps introduced in Lemma 14. Define $\delta := \pi_{N,*}\circ\nu$. If $u := \{u_{ij}\}$ is an $N$-cocycle and $q := \{y_{ij} := p\circ u_{ij}\}$ then by definition of $\nu$ we find $\nu[q] = [u,1] = d_*[u]$ (see Remark 12); moreover, $\delta[q] = \pi_{N,*}[u,1] = [1]$ and this proves that $p_*(H^1(X,N)) \subseteq \ker\delta$. We now prove that the square is commutative. To this end, note that for each $N$-cocycle $u := \{u_{ij}\}$ we find $d_*\circ\gamma_*[u] = [d\{\hat u_{ij}\}] = [\{\hat u_{ij}\},1]$; on the other hand, if $q := \{y_{ij}\} := \{p\circ u_{ij}\}$ then $\breve\gamma_*\circ\nu\circ p_*[u] = \breve\gamma_*\circ\nu[q] = \breve\gamma_*[u,1] = [\{\hat u_{ij}\},1]$, and we conclude that the square is commutative. Finally, we prove that the upper horizontal row is exact when $G$ is contained in the centre of $N$; to this end, it suffices to verify that $\ker\delta \subseteq p_*(H^1(X,N))$. Now, we have $G' = G$ and the map $\pi_{N,*}$ takes the form $\pi_{N,*}[u,g] := [g]$. Since $\nu$ is bijective we have that $\delta[q] = [1]$ if and only if $\pi_{N,*}[u,g] = [1]$, where $[u,g] = \nu[q]$. This means that $g = \{g_{ijk}\}$ is a trivial $G$-2-cocycle, so that there are maps $h_{ij} : X_{ij}\to G$ such that $h_{ij}h_{jk} = g_{ijk}h_{ik}$; the pair $(1,\{h_{ij}\})$ defines a 2-cocycle equivalence between $(u,g)$ and $(u',1)$, where $u' := \{u_{ij}h_{ji}\}$ is, by construction, an $N$-cocycle. By definition of $\nu$ we have $p_*[u'] = [q]$, and this proves $p_*(H^1(X,N)) = \ker\delta$. Thus the upper horizontal row is exact as desired. □

Note that by classical results when $N$ is a compact Lie group and $G$ is closed, $G$, $QG$, $G'$ are compact Lie groups and the fibration $N\to QG$ has local sections. An interesting class of examples is the following. Let $U$ be the unitary group of an infinite dimensional Hilbert space; then, the centre of $U$ is the torus $\mathbb T$ and $PU := U/\mathbb T$ is the projective unitary group. In this case, $\delta$ takes the form
\[ \delta : H^1(X,PU) \to H^2(X,\mathbb T) \simeq H^3(X,\mathbb Z), \quad \delta[q] := [g], \quad q := \{y_{ij}\}, \quad g := \{g_{ijk}\} \tag{22} \]
(where {gij k } is defined by (19)) and it is well known that it is an isomorphism (see [5, §10.7.12] and following sections). In the following lemma, we define a Chern class for a QG-cocycle when G ⊆ SU(d) and N G is the normaliser of G in U(d). Lemma 16. Let G ⊆ SU(d). Then there is a map c : H 1 (X, QG) → H 2 (X, Z); if q is a trivial QG-cocycle then c[q] = 0.
Proof. It suffices to note that the determinant defines a group morphism $\det : NG \to \mathbb T$. Since $G\subseteq SU(d)$, we find that $\det$ factorises through a morphism $\det_Q : QG\to\mathbb T$. The functoriality of $H^1(X,\cdot)$, and the well-known isomorphism $H^1(X,\mathbb T)\simeq H^2(X,\mathbb Z)$, complete the proof. □

5. Bundles of $C^*$-algebras and cohomology classes

In the present section we give an application of the cohomology class $\delta$ defined in Lemma 15 to bundles of $C^*$-algebras. To this end, in the following lines we present some constructions involving principal bundles and $C^*$-dynamical systems. Let $F_\bullet$ be a $C^*$-algebra and $X$ a locally compact, paracompact Hausdorff space. Then the cohomology set $H^1(X,\mathrm{aut}\,F_\bullet)$ can be interpreted as the set of isomorphism classes of $F_\bullet$-bundles, in the following way: for each $\mathrm{aut}\,F_\bullet$-cocycle $u := \{u_{ij}\}$, denote the fibre bundle with fibre $F_\bullet$ and transition maps $\{u_{ij}\}$ by $\pi : \hat{\mathcal F}\to X$ (see [12, 5.3.2]); by construction, $\hat{\mathcal F}$ is endowed with local charts $\pi_i : \hat{\mathcal F}|_{X_i} := \pi^{-1}(X_i)\to X_i\times F_\bullet$, where $\{X_i\}_i$ is an open cover of $X$, in such a way that
\[ \pi_i\circ\pi_j^{-1}(x,v_\bullet) = (x, u_{ij,x}(v_\bullet)), \quad x\in X_{ij},\ v_\bullet\in F_\bullet. \tag{23} \]
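The set of sections $t : X\to\hat{\mathcal F}$, $\pi\circ t = \mathrm{id}_X$, such that the norm function $\{X\ni x\mapsto\|t(x)\|\}$ vanishes at infinity has a natural structure of $F_\bullet$-bundle, that we denote by $\mathcal F_u$. Conversely, given an $F_\bullet$-bundle $\mathcal F$, using the method exposed in [25, §3.1] (see also the related references), we can construct a fibre bundle $\pi : \hat{\mathcal F}\to X$ with fibre $F_\bullet$, in such a way that $\mathcal F$ is isomorphic to the $C_0(X)$-algebra of sections of $\hat{\mathcal F}$. The correspondence $\mathcal F\mapsto\hat{\mathcal F}$ is functorial: $C_0(X)$-morphisms $\tau : \mathcal F_1\to\mathcal F_2$ correspond to bundle morphisms $\hat\tau : \hat{\mathcal F}_1\to\hat{\mathcal F}_2$ such that $\tau(t) = \hat\tau\circ t$, $t\in\mathcal F_1$.

Let $\mathcal F_1,\mathcal F_2$ be $F_\bullet$-bundles and $K$ a subgroup of $\mathrm{aut}\,F_\bullet$; a $C_0(X)$-isomorphism $\beta : \mathcal F_1\to\mathcal F_2$ is said to be $K$-equivariant if there is an open cover $\{X_i\}_{i\in I}$ trivialising $\hat{\mathcal F}_1,\hat{\mathcal F}_2$ by means of local charts $\pi_{i,k} : \hat{\mathcal F}_k|_{X_i}\to X_i\times F_\bullet$, $k=1,2$, $i\in I$, with automorphisms $\beta_{i,x}\in K$, $i\in I$, $x\in X_i$, satisfying
\[ \hat\beta\circ\pi_{i,1}^{-1}(x,v_\bullet) = \pi_{i,2}^{-1}(x,\beta_{i,x}(v_\bullet)), \quad v_\bullet\in F_\bullet \]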
(roughly speaking, at the local level $\beta$ is described by automorphisms in $K$). In such a case, we say that $\mathcal F_1$ is $K$-$C_0(X)$-isomorphic to $\mathcal F_2$. Moreover, we say that an $F_\bullet$-bundle $\mathcal F$ has structure group $K$ if $\mathcal F = \mathcal F_u$ for some $K$-cocycle $u$. It is easy to verify that $K$-cocycles $u,v$ are equivalent in $H^1(X,K)$ if and only if the associated $F_\bullet$-bundles are $K$-$C_0(X)$-isomorphic.

Remark 17. Let $(A_\bullet,\rho_\bullet,a_\bullet)$ be a pointed $C^*$-dynamical system and $K := \mathrm{aut}(A_\bullet,\rho_\bullet,a_\bullet)\subseteq\mathrm{aut}\,A_\bullet$. An $A_\bullet$-bundle $\mathcal A$ has structure group $K$ if and only if there is $\rho\in\mathrm{end}_X\mathcal A$ and $a\in\mathcal A$ with local charts $\pi_i : \hat{\mathcal A}|_{X_i}\to X_i\times A_\bullet$, such that
\[ \pi_i\circ\hat\rho(v) = \rho_\bullet(v_\bullet), \quad \pi_i(a) = (x,a_\bullet), \quad v\in\hat{\mathcal A},\ (x,v_\bullet) := \pi_i(v). \]
In this case, we say that $(\mathcal A,\rho,a)$ is a locally trivial pointed $C^*$-dynamical system. Now, $\beta : \mathcal A\to\mathcal A'$ is a $K$-$C_0(X)$-isomorphism if and only if $\beta$ is an isomorphism of pointed $C^*$-dynamical systems. Thus $H^1(X,K)$ describes the set of isomorphism classes of locally trivial pointed $C^*$-dynamical systems $(\mathcal A,\rho,a)$ with fibre $(A_\bullet,\rho_\bullet,a_\bullet)$. In the sequel, we shall make use of the following fact: if $N$ is a subgroup of $K$ and $n$ is an $N$-cocycle, then we may regard $n$ as a $K$-cocycle; thus, if $\mathcal A$ is an $A_\bullet$-bundle with structure group $N$, then $\mathcal A$ defines a locally trivial pointed $C^*$-dynamical system $(\mathcal A,\rho,a)$. The next lemma is an application of the previous ideas.

Lemma 18. Let $d\in\mathbb N$, $G\subseteq U(d)$ be covariant and $QG := NG\backslash G$. Then for each compact Hausdorff space $X$ there are one-to-one correspondences between:
(1) $QG$-cocycles;
(2) locally trivial pointed $C^*$-dynamical systems with fibre $(O_G,\sigma_G,\theta)$;
(3) $\hat G$-bundles.
Proof. Consider the pointed $C^*$-dynamical system $(O_G,\sigma_G,\theta)$ with the action (7), then apply Remark 17, Theorems 7 and 1. □

The following construction may be regarded as an analogue of the notion of group action in the setting of $C^*$-bundles and appeared in [25, §3.2]. Let $G$ be a subgroup of $\mathrm{aut}\,F_\bullet$. A gauge $C^*$-dynamical system with fibre $(F_\bullet,G)$ is given by a triple $(\mathcal F,\mathcal G,\alpha)$, where $\mathcal F$ is an $F_\bullet$-bundle, $\eta : \mathcal G\to X$ is a bundle with fibre $G$ and
\[ \alpha : \mathcal G\times_X\hat{\mathcal F}\to\hat{\mathcal F} \]
is a continuous map such that for each $x\in X$ there is a neighbourhood $U$ of $x$ with local charts
\[ \eta_U : \mathcal G|_U\to U\times G, \quad \pi_U : \hat{\mathcal F}|_U\to U\times F_\bullet, \tag{24} \]
satisfying
\[ \pi_U\circ\alpha(y,v) = (x, y_\bullet(v_\bullet)), \tag{25} \]
where
\[ x := \pi(v) = \eta(y), \quad (x,y_\bullet) := \eta_U(y), \quad (x,v_\bullet) := \pi_U(v) \]
(so that $y_\bullet\in G\subseteq\mathrm{aut}\,F_\bullet$ and $v_\bullet\in F_\bullet$). We say that $(\mathcal F,\mathcal G,\alpha)$ has structure group $K$ if $\mathcal F$ has structure group $K$. Usual continuous actions are related with gauge $C^*$-dynamical systems in the following way: if $S$ is a set of sections of $\mathcal G$ which is also a group w.r.t. the operations defined pointwise, then there is an action $S\to\mathrm{aut}_X\mathcal F$; in particular, every continuous action $G\to\mathrm{aut}_X\mathcal F$ can be regarded as a gauge action on $\mathcal F$ of the bundle $\mathcal G := X\times G$ (see [25, §3.2] for details).
The fixed-point algebra of $(\mathcal F,\mathcal G,\alpha)$ is given by the $C_0(X)$-algebra
\[ \mathcal F^\alpha := \{ t\in\mathcal F : \alpha(y,t(x)) = t(x),\ x\in X,\ y\in\eta^{-1}(x) \}. \]
Let $A_\bullet\subseteq F_\bullet$ denote the fixed-point algebra w.r.t. the $G$-action. Then (25) implies that $\mathcal F^\alpha$ is an $A_\bullet$-bundle.

We now present the main construction of the present section. Again, we consider a $C^*$-algebra $F_\bullet$ and a subgroup $K$ of $\mathrm{aut}\,F_\bullet$; moreover, we pick a subgroup $G$ of $K$ and denote the fixed-point algebra w.r.t. the $G$-action by $A_\bullet$. We consider the normaliser of $G$ in $K$ and the associated quotient group, as follows:
\[ NG := \{ u\in K : u\circ g\circ u^{-1}\in G \}, \quad p : NG\to QG := NG\backslash G. \tag{26} \]
By construction, for every $u\in NG$, $g\in G$, $a\in A_\bullet$ there is $g'\in G$ such that $g\circ u(a) = u\circ g'(a) = u(a)$. The above equalities imply that the $NG$-action on $F_\bullet$ factorises through a $QG$-action
\[ QG\to\mathrm{aut}\,A_\bullet, \quad p(u)\mapsto u|_{A_\bullet}, \quad u\in NG; \tag{27} \]
thus, applying the above procedure, for every $QG$-cocycle $q$ we can construct an $A_\bullet$-bundle $\mathcal A_q$.

Lemma 19. Let $q := (\{X_i\},\{y_{ij}\})\in H^1(X,QG)$ and $\mathcal A_q$ denote the associated $A_\bullet$-bundle. Then the following are equivalent:
(1) There is a gauge $C^*$-dynamical system $(\mathcal F,\mathcal G,\alpha)$ with fibre $(F_\bullet,G)$ and structure group $NG$, such that $\mathcal A_q$ is $QG$-$C_0(X)$-isomorphic to $\mathcal F^\alpha$;
(2) There is an $NG$-cocycle $n$ such that $[q] = p_*[n]$, where $p_* : H^1(X,NG)\to H^1(X,QG)$ is the map induced by (26(2)).

Proof. (1) ⇒ (2). Let us denote the $NG$-cocycle associated with $\mathcal F$ by $n := (\{X_i\},\{u_{ij}\})$. We assume that $\{X_i\}$ trivialises $\mathcal G$ and $\hat{\mathcal F}$ (otherwise, we perform a refinement of $\{X_i\}$), so that we have local charts $\eta_i : \mathcal G|_{X_i}\to X_i\times G$, $\pi_i : \hat{\mathcal F}|_{X_i}\to X_i\times F_\bullet$ fulfilling (25), with $\{\pi_i\}$ related with $\{u_{ij}\}$ by means of (23). Let us consider the fibre bundle $\hat{\mathcal F}^\alpha\to X$ associated with $\mathcal F^\alpha$; then we have an inclusion $\hat{\mathcal F}^\alpha\subseteq\hat{\mathcal F}$ and (25) implies
\[ \pi_i(\hat{\mathcal F}^\alpha|_{X_i}) = X_i\times A_\bullet \ \Rightarrow\ X_{ij}\times A_\bullet = \pi_i\circ\pi_j^{-1}(X_{ij}\times A_\bullet). \]
We conclude by (23) that $v_{ij,x} := u_{ij,x}|_{A_\bullet}\in\mathrm{aut}\,A_\bullet$ for every $x\in X_{ij}$ and pair $i,j$. Moreover, by (27) the maps $v_{ij} = p\circ u_{ij}$, as $i,j$ vary, yield a set of transition maps for $\hat{\mathcal F}^\alpha$. Finally, since $\mathcal A_q$ is $QG$-$C_0(X)$-isomorphic to $\mathcal F^\alpha$, we conclude that $[q] = p_*[n]$.

(2) ⇒ (1). Let $n := \{u_{ij}\}$. We define $\mathcal F$ as the $F_\bullet$-bundle with cocycle $n$ and $\mathcal G\to X$ as the fibre bundle with fibre $G$ and transition maps $\gamma_{ij,x}(g) := u_{ij,x}\circ g\circ u_{ij,x}^{-1}$, $x\in X_{ij}$. Such transition maps define a cocycle with class $\gamma_*[n]\in H^1(X,\mathrm{aut}\,G)$. Now, we note that
\[ u_{ij,x}\circ g(v_\bullet) = \gamma_{ij,x}(g)\circ u_{ij,x}(v_\bullet), \quad g\in G,\ v_\bullet\in F_\bullet,\ x\in X_{ij}. \]
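This implies that if we consider the maps
\[ \alpha_i : (X_i\times G)\times_{X_i}(X_i\times F_\bullet)\to X_i\times F_\bullet, \quad \alpha_i((x,g),(x,v_\bullet)) := (x,g(v_\bullet)), \]
then there is a unique gauge action $\alpha : \mathcal G\times_X\hat{\mathcal F}\to\hat{\mathcal F}$ with local charts $\{\eta_i\}$ of $\mathcal G$ associated with $\{\gamma_{ij}\}$ and $\{\pi_i\}$ of $\hat{\mathcal F}$ associated with $\{u_{ij}\}$, fulfilling $\alpha_i = \pi_i\circ\alpha\circ(\eta_i^{-1}\times\pi_i^{-1})$ for every index $i$. Since $\mathcal A_q$ has $QG$-cocycle $\{p\circ u_{ij}\}$, reasoning as in the first part of the proof we conclude that $\mathcal A_q$ is $QG$-$C_0(X)$-isomorphic to $\mathcal F^\alpha$. □

Corollary 20. With the notation of the previous lemma, if $\gamma_*[n] = [1]$ then there is a continuous action $\alpha_\bullet : G\to\mathrm{aut}_X\mathcal F$ with fixed-point algebra $QG$-$C_0(X)$-isomorphic to $\mathcal A_q$.

Proof. Since $\gamma_*[n] = [1]$ there is an isomorphism $\mathcal G\simeq X\times G$. Thus, the gauge action $\alpha : \mathcal G\times_X\hat{\mathcal F}\to\hat{\mathcal F}$ induces the continuous action $\alpha_\bullet : G\to\mathrm{aut}_X\mathcal F$ (see [25, Corollary 3.4]). □

Theorem 21. Let $G\subseteq K\subseteq\mathrm{aut}\,F_\bullet$, $A_\bullet$ denote the fixed-point algebra of $F_\bullet$ w.r.t. the $G$-action and $QG$ defined as in (26(2)). For each $A_\bullet$-bundle $\mathcal A$ with structure group $QG$ there is a class
\[ \delta(\mathcal A)\in H^2(X,G') \tag{28} \]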
fulfilling the following property: if A is QG-C0 (X)-isomorphic to the fixed-point algebra of a gauge C ∗ -dynamical system (F , G, α) with fibre (F• , G) and structure group N G, then δ(A) = [1]. The converse is also true when G lies in the centre of N G. Proof. Applying Lemma 15 we define δ(A) := δ[q], where q is the QG-cocycle associated with A (of course, there is an abuse of the notation δ in the previous definition, but this should not create confusion). The theorem now follows applying Lemma 19. 2 The class δ may be also interpreted as an obstruction to constructing covariant representations of a gauge C ∗ -dynamical system over a continuous field of Hilbert spaces. Since this point goes beyond the purpose of the present work, we postpone a complete discussion to a forthcoming paper. In the following lines we discuss the relation between the class δ and the Dixmier–Douady invariant. Let H denote the standard separable Hilbert space, U the unitary group of H endowed with the norm topology, T the torus acting on H by scalar multiplication, PU := U/T the projective unitary group, Kr the C ∗ -algebra of compact operators acting on the tensor power H r , r ∈ N, and (H r , H r ) ⊃ Kr the C ∗ -algebra of bounded operators. Moreover, let O∞ denote the Cuntz algebra; it is well known that there is a continuous action U → aut O∞ ,
(29)
defined as in (1), which restricts to the circle action $\mathbb T\to\mathrm{aut}\,O_\infty$. The construction (26) with $K = U$, $G = \mathbb T$ yields $QG = PU$ and the action
\[ PU\to\mathrm{aut}\,O_\infty^0, \quad \gamma\mapsto\hat\gamma. \]
Now, $O_\infty^0$ can be constructed using a universal construction on $\mathcal K$, as follows (see [3]). Consider the inductive structure
\[ \cdots\xrightarrow{\;j_{r-1}\;}(H^r,H^r)\xrightarrow{\;j_r\;}(H^{r+1},H^{r+1})\xrightarrow{\;j_{r+1}\;}\cdots, \quad j_r(t) := t\otimes 1, \tag{30} \]
where $1\in(H,H)$ is the identity, and denote the associated $C^*$-algebra by $B_\infty$. Then, $O_\infty^0$ is the $C^*$-subalgebra of $B_\infty$ generated by the images of the $\mathcal K^r\subset(H^r,H^r)$, $r\in\mathbb N$. The $PU$-action on $O_\infty^0$ preserves the inductive structure: if $i_r : \mathcal K^r\to O_\infty^0$, $r\in\mathbb N$, are the natural inclusions, then
\[ \hat\gamma\circ i_r(t)\in i_r(\mathcal K^r), \quad \gamma\in PU,\ t\in\mathcal K^r, \tag{31} \]
and in particular $PU$ acts on $i_1(\mathcal K)$ as the usual adjoint action:
\[ \hat\gamma\circ i_1(t) = i_1\circ\gamma(t), \quad t\in\mathcal K,\ \gamma\in PU. \tag{32} \]
Let us denote the category of $O_\infty^0$-bundles over $X$ with arrows $PU$-$C_0(X)$-isomorphisms by $\mathrm{bun}_{PU}(X,O_\infty^0)$. By the above results, each $O_\infty^0$-bundle $\mathcal A_\infty$ with structure group $PU$ is determined by a $PU$-cocycle $q$, and the class
\[ \delta(\mathcal A_\infty) = \delta[q]\in H^2(X,\mathbb T)\simeq H^3(X,\mathbb Z) \tag{33} \]
measures the obstruction to finding a gauge dynamical system with fibre (29) and fixed-point algebra $\mathcal A_\infty$. Now, we denote the category of $\mathcal K$-bundles over $X$ with arrows $C_0(X)$-isomorphisms by $\mathrm{bun}(X,\mathcal K)$; each $\mathcal K$-bundle $\mathcal A$ is determined by a $PU$-cocycle $q$, and its Dixmier–Douady invariant [5, Chapter 10] is computed by (22):
\[ \delta_{DD}(\mathcal A) = \delta[q]. \tag{34} \]

Proposition 22. For each locally compact, paracompact Hausdorff space $X$, there is an equivalence of categories $\mathrm{bun}(X,\mathcal K)\to\mathrm{bun}_{PU}(X,O_\infty^0)$, $\mathcal A\mapsto\mathcal A_\infty$, and
\[ \delta_{DD}(\mathcal A) = \delta(\mathcal A_\infty), \quad \mathcal A\in\mathrm{bun}(X,\mathcal K). \tag{35} \]
Proof. Let $\mathcal A$ be a $\mathcal K$-bundle with associated $PU$-cocycle $q$. The multiplier algebra $M\mathcal A$ of $\mathcal A$ can be constructed as the $C_0(X)$-algebra of bounded sections of the bundle $\hat{\mathcal B}\to X$ with fibre $(H,H)$ and transition maps defined by $q$. For each $r\in\mathbb N$, we consider the $C_0(X)$-tensor products $M\mathcal A^r := M\mathcal A\otimes_X\cdots\otimes_X M\mathcal A$, $\mathcal A^r := \mathcal A\otimes_X\cdots\otimes_X\mathcal A$ and the obvious inclusions $\mathcal A^r\subset M\mathcal A^r$. We have the inductive limit structure
\[ \cdots\xrightarrow{\;j_{r-1}\;}M\mathcal A^r\xrightarrow{\;j_r\;}M\mathcal A^{r+1}\xrightarrow{\;j_{r+1}\;}\cdots, \quad j_r(t) := t\otimes 1, \]
where $1\in M\mathcal A$ is the identity. The system $(M\mathcal A^r,j_r)$ yields the inductive limit algebra $M\mathcal A_\infty$ and we define $\mathcal A_\infty$ as the $C^*$-subalgebra of $M\mathcal A_\infty$ generated by the images of the $C^*$-algebras $\mathcal A^r$, $r\in\mathbb N$. If $\beta : \mathcal A\to\mathcal A'$ is a $C_0(X)$-isomorphism, then it naturally extends to $C_0(X)$-isomorphisms $\beta_r : \mathcal A^r\to\mathcal A'^r$, $r\in\mathbb N$, and finally to a $C_0(X)$-isomorphism $\beta_\infty : \mathcal A_\infty\to\mathcal A'_\infty$. Conversely, let $\mathcal A_\infty$ be an $O_\infty^0$-bundle with structure group $PU$ and associated $PU$-cocycle $q$. Since the $PU$-action on $O_\infty^0$ preserves the inductive structure (30), and since the $PU$-action on $O_\infty^0$ restricts to the natural $PU$-action on $\mathcal K\subset O_\infty^0$ (see (31) and (32)), for each $r\in\mathbb N$ there is a $\mathcal K^r$-bundle $\mathcal A^r\subset\mathcal A_\infty$ with associated $PU$-cocycle $q$, with $\mathcal A^1$ generating $\mathcal A_\infty$ as above; thus our functor is surjective on the sets of objects. If $\beta' : \mathcal A_\infty\to\mathcal A'_\infty$ is an isomorphism in $\mathrm{bun}_{PU}(X,O_\infty^0)$ then by $PU$-equivariance we find $\beta'(\mathcal A^r) = \mathcal A'^r$ for each $r\in\mathbb N$. Defining $\beta := \beta'|_{\mathcal A^1}$ we easily find $\beta' = \beta_\infty$; thus our functor is surjective on the sets of arrows. Finally, (35) follows by (33) and (34). □

6. Gauge-equivariant bundles, and a concrete duality

Let $X$ be a compact Hausdorff space. In the present section we give a duality theory in the setting of the category $\mathrm{vect}(X)$ of vector bundles over $X$, relating suitable subcategories of $\mathrm{vect}(X)$ with gauge equivariant vector bundles in the sense of [20]. Let $d\in\mathbb N$ and $\pi : E\to X$ a vector bundle of rank $d$. We denote the Hilbert $C(X)$-bimodule of sections of $E$ by $SE$, endowed with coinciding left and right $C(X)$-actions. For each $r\in\mathbb N$, we denote the $r$-fold tensor power of $E$ in the sense of [15, §I.4], [1, 1.2] by $E^r$ (for $r = 0$, we define $E^0 := \iota := X\times\mathbb C$) and by $(E^r,E^s)$ the set of vector bundle morphisms from $E^r$ into $E^s$. The Serre–Swan equivalence implies that every $(E^r,E^s)$ is the $C(X)$-bimodule of sections of a vector bundle $\pi_{rs} : E^{rs}\to X$, having fibre $(H^r,H^s)\equiv M_{d^r,d^s}$ [15, Theorem 5.9]. In explicit terms, $E^{rs}\simeq E^s\otimes E_*^r$, where $E_*^r$ is the $r$-fold tensor power of the conjugate bundle and every $t\in(E^r,E^s)$ can be regarded as a continuous map
\[ t : X\to E^{rs}, \quad \pi_{rs}\circ t = \mathrm{id}_X. \]
We denote the tensor category with objects $E^r$, $r\in\mathbb N$, and arrows $(E^r,E^s)$ by $\hat E$. It is clear that $(\iota,\iota) = C(X)$. Moreover, the flip operator
\[ \theta_E\in(E^2,E^2) : \quad \theta_E(x)\circ(v\otimes v') := v'\otimes v, \quad v,v'\in E_x,\ x\in X, \tag{36} \]
defines a symmetry on $\hat E$. Thus, $(\hat E,\otimes,\iota,\theta_E)$ is a symmetric tensor $C^*$-category; we denote the associated pointed $C^*$-dynamical system by $(O_E,\sigma_E,\theta_E)$.

Proposition 23. Let $d\in\mathbb N$ and $H$ denote the standard rank $d$ Hilbert space.
(1) For each compact Hausdorff space $X$ there is an isomorphism $Q : \mathrm{sym}(X,\hat H)\to H^1(X,U(d))$.
(2) If $E\to X$ is a rank $d$ vector bundle, then the category $(\hat E,\otimes,\iota,\theta_E)$ is an $\hat H$-bundle and all the elements of $\mathrm{sym}(X,\hat H)$ are of this type.
(3) If $u$ is a $U(d)$-cocycle associated with $E$ as a set of transition maps, then $Q[\hat E] = [u]$.
(4) $O_E$ is the Cuntz–Pimsner algebra associated with $SE$ and is an $O_d$-bundle with structure group $U(d)$.

Proof. (1) We apply Theorem 7 to the case $G = \{1\}$, so that $NG = QG = U(d)$.
(2) Let $E\to X$ be a vector bundle; we consider a local chart $\pi_U : E|_U\to U\times H$ and note that, by functoriality, for each $r,s\in\mathbb N$ there are local charts $\pi_U^{rs} : E^{rs}|_U\to U\times(H^r,H^s)$. This yields the desired local chart $\hat\pi_U : \hat E\to U\hat H$. Let now $(\hat\rho,\otimes,\iota,\varepsilon)$ be an $\hat H$-bundle; to prove that $\hat\rho\simeq\hat E$ for some vector bundle $E$ we note that the Hilbert $C(X)$-bimodule $(\iota,\rho)$ defines a locally trivial continuous field of Hilbert spaces with fibre $H$; we denote the vector bundle associated with $(\iota,\rho)$ by $E$, and applying the Serre–Swan equivalence we obtain an isomorphism $\beta : SE\simeq(\iota,E)\to(\iota,\rho)$, which extends to the desired isomorphisms $\beta^{rs} : (E^r,E^s)\to(\rho^r,\rho^s)$, $r,s\in\mathbb N$.
(3) We pick a $U(d)$-cocycle $u'$ with class $Q[\hat E]$. By definition of $Q$ we have that $u'$ yields transition maps for the vector bundles $E^{rs}$, $r,s\in\mathbb N$, by means of the action $\hat u(t) := u_s\circ t\circ u_r^*$, $u\in U(d)$, $t\in(H^r,H^s)$ (compare with (5)). In particular, for $r=0$, $s=1$, we conclude that $u'$ defines, up to cocycle equivalence, a set of transition maps for $E$ and thus $[u'] = Q[\hat E] = [u]$.
(4) It suffices to recall [22, Propositions 4.1, 4.2]. □

Remark 24. To be concise, we denote the totally antisymmetric projections defined as in (10) by $P_n := P_{E,\theta_E,n}\in(E^n,E^n)$, $n\in\mathbb N$. By definition of the totally antisymmetric line bundle
\[ \wedge^d E := P_d E^d \]
we have that $E$ is a twisted special object, with 'categorical Chern class' (12) coinciding with the first Chern class $c_1(E)$. If $c_1(E) = 0$ then $E$ is a special object and the conjugate bundle $E_*$ appears as the object associated with the projection $P_{d-1}\in(E^{d-1},E^{d-1})$ (see [7, Lemma 3.6]). Clearly, the existence of the conjugate bundle does not depend on the vanishing of $c_1(E)$, but in general it is false that $E_*\simeq P_{d-1}E^{d-1}$.

Let $\hat\rho$ be a tensor $C(X)$-subcategory of $(\hat E,\otimes,\iota)$; we denote the spaces of arrows of $\hat\rho$ by $(E^r,E^s)_\rho$, $r,s\in\mathbb N$. For every $r,s\in\mathbb N$, we define the set $E_\rho^{rs} := \{ t_x\in E^{rs} : x\in X,\ t\in(E^r,E^s)_\rho \}$ and denote the restriction of $\pi_{rs}$ on $E_\rho^{rs}$ by $\pi_{rs}^\rho$. In this way, we obtain Banach bundles
\[ \pi_{rs}^\rho : E_\rho^{rs}\to X. \tag{37} \]
Let $t\in(E^r,E^s)$. If $t\in(E^r,E^s)_\rho$, then by definition $t_x\in E_\rho^{rs}$ for every $x\in X$. Conversely, suppose that $t_x\in E_\rho^{rs}$, $x\in X$; then for every $x\in X$ there is $t'\in(E^r,E^s)_\rho$ such that $t'_x = t_x$. By continuity, for every $\varepsilon>0$ there is a neighbourhood $U_\varepsilon\ni x$ with $\sup_{y\in U_\varepsilon}\|t_y - t'_y\|<\varepsilon$. Thus, [5, 10.1.2(iv)] implies that $t\in(E^r,E^s)_\rho$. We conclude that
\[ t\in(E^r,E^s)_\rho \ \Leftrightarrow\ t_x\in E_\rho^{rs},\ \forall x\in X. \tag{38} \]
Let $p : \mathcal G\to X$ be a group bundle with fibres compact groups $G_x := p^{-1}(x)$, $x\in X$. According to [20], a gauge action on $E$ is given by a continuous map
\[ \alpha : \mathcal G\times_X E\to E, \]
such that each restriction $\alpha_x : G_x\times E_x\to E_x$, $x\in X$, is a unitary representation on the Hilbert space $E_x$; to economise on notation, we define
\[ G_{\alpha,x} := \alpha_x(G_x), \quad u_{\alpha,x} := \alpha_x(u), \quad u\in G_x. \]
In this way, $E$ is a $\mathcal G$-equivariant vector bundle in the sense of [20, §1], with trivial action on $X$. Moreover, every $\pi_{rs} : E^{rs}\to X$ is a $\mathcal G$-vector bundle, with action
\[ \alpha^{rs} : \mathcal G\times_X E^{rs}\to E^{rs}, \quad (u,v)\mapsto\alpha^{rs}(u,v) := \hat u_{\alpha,x}(v), \quad x := p(u) = \pi_{rs}(v)\in X, \]
where $\hat u_{\alpha,x}(v)$ is defined as in (1). We denote the category with objects $E^r$, $r\in\mathbb N$, and arrows
\[ (E^r,E^s)_\alpha := \{ t\in(E^r,E^s) : \alpha^{rs}(u,t(x)) = t(x),\ u\in\mathcal G,\ x := p(u) \} \tag{39} \]
by $\hat\alpha$. Clearly, $(\hat\alpha,\otimes,\iota)$ is a tensor $C^*$-category with $(\iota,\iota) = C(X)$ and fibres $\hat G_{\alpha,x}$, $x\in X$, defined as in (5). Since $\theta_E(x) = \theta$, $x\in X$, we conclude that $\theta_E\in(E^2,E^2)_\alpha$, thus there is an inclusion functor $\mathcal E : (\hat\alpha,\otimes,\iota,\theta_E)\to(\hat E,\otimes,\iota,\theta_E)$.

Let us consider the bundle $UE\to X$ of unitary automorphisms of $E$ (see [15, I.4.8]). It is well known that $UE$ has fibre the unitary group $U(d)$; if $\{u_{ij}\}$ is the $U(d)$-cocycle associated with $E$, then $UE$ has associated $\mathrm{aut}\,U(d)$-cocycle
\[ \gamma_{ij,x}(u) := u_{ij,x}\cdot u\cdot u_{ij,x}^*, \quad x\in X_{ij},\ u\in U(d). \]
Note that $UE$ is compact as a topological space. In the same way the bundle $SUE\to X$ of special unitary automorphisms of $E$ is defined: it has fibre $SU(d)$ and the same transition maps as $UE$. Of course, there is an inclusion $SUE\subset UE$. Now let $\mathcal G\to X$ be a closed subbundle of $UE$, not necessarily locally trivial. Then there is an obvious gauge action $\alpha : \mathcal G\times_X E\to E$. In order to emphasise the picture of $\mathcal G$ as a subbundle of $UE$, we use the notations
\[ \hat{\mathcal G} := \hat\alpha, \quad (E^r,E^s)_{\mathcal G} := (E^r,E^s)_\alpha, \]
and call $\hat{\mathcal G}$ the dual of $\mathcal G$. Clearly, each $(E^r,E^s)_{\mathcal G}$ is the module of sections of a Banach bundle
\[ \pi_{rs}^{\mathcal G} : E_{\mathcal G}^{rs}\to X, \quad r,s\in\mathbb N. \]
We define $(O_{\mathcal G},\sigma_{\mathcal G},\theta_E)$ as the pointed $C^*$-dynamical system associated with $(\hat{\mathcal G},\otimes,\iota,\theta_E)$. Clearly, there is a canonical monomorphism $\mathcal E_* : (O_{\mathcal G},\sigma_{\mathcal G},\theta_E)\to(O_E,\sigma_E,\theta_E)$.

Actions on the vector bundle $E\to X$ by (generally noncompact) groups $G$ of unitary automorphisms have been considered in [23, §4]. This approach has the disadvantage of associating the
same dual to very different groups (see [23, Example 4.2]). According to [23, Definition 4.7], we can associate a group bundle $\mathcal G\subseteq UE$ to $G$, in such a way that the map $\{\mathcal G\mapsto\hat{\mathcal G}\}$ is one-to-one [23, Proposition 4.8]. For this reason in the present paper we pass to the notion of gauge action. The following result is a different version of [23, Proposition 4.8]; since the proof is essentially the same, it is omitted.

Proposition 25. Let $E\to X$ be a vector bundle. The map $\{\mathcal G\mapsto\hat{\mathcal G}\}$ defines a one-to-one correspondence between the set of closed subbundles of $SUE$ and the set of symmetric tensor $C^*$-subcategories $\hat\rho$ of $\hat E$ such that $(\iota,\wedge^d E)\subseteq(\iota,E^d)_\rho$.

Let $G\subseteq U(d)$. A $\hat G$-bundle in $\hat E$ is a $\hat G$-bundle $\hat\rho$ endowed with an inclusion $(\hat\rho,\otimes,\iota,\theta_E)\subseteq(\hat E,\otimes,\iota,\theta_E)$. Let us denote the inclusion map by $i : NG\to U(d)$, and the quotient projection by $p : NG\to QG$; by functoriality of $H^1(X,\cdot)$, there are maps
\[ i_* : H^1(X,NG)\to H^1(X,U(d)), \quad p_* : H^1(X,NG)\to H^1(X,QG). \tag{40} \]
Moreover, by (13) each $NG$-cocycle $n = \{u_{ij}\}$ defines an $\mathrm{aut}\,G$-cocycle $\{\hat u_{ij}\}$ with class $\gamma_*[n]$.

Theorem 26. Let $G\subseteq U(d)$ be a compact group. For each compact Hausdorff space $X$ and $NG$-cocycle $n = \{u_{ij}\}$, there are a vector bundle $E\to X$ with $U(d)$-cocycle $\{i\circ u_{ij}\}$ and a fibre $G$-bundle $\mathcal G\subseteq UE$ with transition maps $\{\hat u_{ij}\}$. The category $(\hat{\mathcal G},\otimes,\iota,\theta_E)$ is a $\hat G$-bundle with associated cohomology class $p_*[n]\in H^1(X,QG)$. Moreover, there is a gauge action
\[ \alpha : \mathcal G\times_X\hat O_E\to\hat O_E \]
with fibre $(O_d,G)$ and fixed-point algebra $O_{\mathcal G}$.

Proof. Clearly, there are $E$ and $\mathcal G$ defined as above. Since by construction $\mathcal G\subseteq UE$, the action $\alpha$ is defined, together with the tensor $C^*$-category $(\hat{\mathcal G},\otimes,\iota,\theta_E)$ and the pointed $C^*$-dynamical system $(O_{\mathcal G},\sigma_{\mathcal G},\theta_E)$. By (7) we can regard $QG$ as a subgroup of $\mathrm{aut}\,\hat G$, so the $QG$-cocycle $q := \{p\circ u_{ij}\}$ defines a symmetric tensor $C^*$-category $(\hat\rho_q,\otimes,\iota,\varepsilon_q)$ and a pointed $C^*$-dynamical system $(O_q,\rho_q,\varepsilon_q)$. To prove that $\hat{\mathcal G}$ has associated cocycle $q$, it suffices to give a $QG$-$C(X)$-isomorphism $O_q\simeq O_{\mathcal G}$. To this end, we note that Lemma 19 implies that $O_q$ is $QG$-$C(X)$-isomorphic to the fixed-point algebra $O_E^\alpha$; thus, in order to get the desired isomorphism, it suffices to prove that $O_{\mathcal G} = O_E^\alpha$. Now, it is clear that $O_{\mathcal G}\subseteq O_E^\alpha$. To prove the opposite inclusion, we consider the Haar functional $\varphi : C(\mathcal G)\to C(X)$ and the induced invariant mean $m : O_E\to O_E^\alpha$ in the sense of [25, §4]. By definition of $\alpha$ we have $m((E^r,E^s)) = (E^r,E^s)_{\mathcal G}\subset O_{\mathcal G}$, $r,s\in\mathbb N$, so that if $t\in O_E^\alpha$ is a norm limit of the type $t = \lim_n t_n$, $t_n\in\mathrm{span}\bigcup_{rs}(E^r,E^s)$, then $t = m(t) = \lim_n m(t_n)$, with each $m(t_n)$ in the span of the spaces $(E^r,E^s)_{\mathcal G}$. Thus, $O_E^\alpha = O_{\mathcal G}$ and this completes the proof. □

In the following theorem we characterise the $\hat G$-bundles in $\hat E$ that arise as above.
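The passage from an $NG$-cocycle to an $\mathrm{aut}\,G$-cocycle used in Theorem 26 can be verified in one line. The following is a sketch, assuming that $\hat u$ denotes the adjoint action $\hat u(g) := u g u^{-1}$ of $u\in NG$ on $G$ (the convention suggested by (13)), and that $n = \{u_{ij}\}$ satisfies $u_{ij}u_{jk} = u_{ik}$ on $X_{ijk}$:

\begin{align*}
\hat u_{ij}\circ\hat u_{jk}(g) &= u_{ij}\,\bigl(u_{jk}\,g\,u_{jk}^{-1}\bigr)\,u_{ij}^{-1} \\
&= (u_{ij}u_{jk})\,g\,(u_{ij}u_{jk})^{-1} \\
&= u_{ik}\,g\,u_{ik}^{-1} = \hat u_{ik}(g), \qquad g\in G.
\end{align*}

Under these assumptions $\{\hat u_{ij}\}$ is indeed a cocycle with values in $\mathrm{aut}\,G$, so it can serve as a set of transition maps for a bundle with fibre $G$; each $\hat u_{ij,x}$ maps $G$ into itself precisely because $u_{ij,x}$ lies in the normaliser $NG$.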
Theorem 27. Let $G\subseteq U(d)$ be covariant, $E\to X$ a vector bundle with $U(d)$-cocycle $u$ and $\hat\rho$ a $\hat G$-bundle in $\hat E$ with $QG$-cocycle $q$ (in the sense of Lemma 18). Then the structure group of $E$ can be reduced to $NG$, i.e. there is an $NG$-cocycle $n$ such that $[u] = i_*[n]$. Moreover $[q] = p_*[n]$ and $\hat\rho = \hat{\mathcal G}$, where $\mathcal G\subseteq UE$ is a fibre $G$-bundle with class $\gamma_*[n]\in H^1(X,\mathrm{aut}\,G)$.

Proof. We associate to $\hat\rho$ the locally trivial pointed $C^*$-dynamical system $(O_\rho,\rho_*,\theta_E)$ with fibre $(O_G,\sigma_G,\theta)$, equipped with the inclusion $(O_\rho,\rho_*,\theta_E)\subseteq(O_E,\sigma_E,\theta_E)$. There is a finite open cover $\{X_i\}$ and local charts $\eta_i : O_\rho|_{X_i}\to C_0(X_i)\otimes O_G$, defining the $QG$-cocycle $q := \{y_{ij}\}$ such that
\[ \hat y_{ij,x} = \eta_{i,x}\circ\eta_{j,x}^{-1}, \quad x\in X_{ij}. \]
Now, up to performing a refinement, we may assume that $\{X_i\}$ trivialises $E$, so that there are local charts $\pi_i : E|_{X_i}\to X_i\times H$ with associated $U(d)$-cocycle $u := \{u_{ij} := \pi_i\circ\pi_j^{-1}\}$. Moreover, each $\pi_i$ induces a local chart $\hat\pi_i : O_E\to C_0(X_i)\otimes O_d$. Let us define $O_{\rho,i} := \hat\pi_i(O_\rho)$. We introduce the $C_0(X_i)$-isomorphisms
\[ \beta_i := \hat\pi_i\circ\eta_i^{-1}, \quad \beta_i : C_0(X_i)\otimes O_G\longrightarrow O_{\rho,i}\subseteq C_0(X_i)\otimes O_d, \]
so that for each pair $i,j$ we find
\[ \hat y_{ij,x} = \beta_{i,x}^{-1}\circ\hat u_{ij,x}\circ\beta_{j,x}, \quad x\in X_{ij}. \tag{41} \]
Now, each $\beta_i$ may be regarded as a continuous map $\beta_i : X_i\to\mathrm{emb}\,O_G$, thus by Lemma 6 there is an open cover $\{Y_{il}\}_l$ of $X_i$ and continuous maps $w_{il} : Y_{il}\to U(d)$ such that
\[ \beta_{i,x}(t) = \hat w_{il,x}(t), \quad t\in O_G,\ x\in Y_{il}. \tag{42} \]
We extract from $\{Y_{il}\}_{il}$ a finite open cover $\{Y_h\}$ of $X$; to economise on notation, we introduce the index $h$ instead of $i,l$, so that we have maps $w_h$ satisfying (42) for each $h$ and $x\in Y_h$. Since $E|_{Y_h}$ is trivial, we have that $u$ is equivalent to a cocycle defined by transition maps $u_{hk} : Y_{hk}\to U(d)$; with a slight abuse of notation, we denote this cocycle again by $u$. Of course, the same procedure applies to $q = \{y_{hk}\}$. By (41), we find
\[ \hat y_{hk,x} = \hat w_{h,x}\circ\hat u_{hk,x}\circ\hat w_{k,x}^{-1}, \quad x\in Y_{hk}. \tag{43} \]
Now, $u$ is equivalent to the $U(d)$-cocycle $n := \{z_{hk} := w_h u_{hk} w_k^*\}$ and (43) becomes
\[ \hat y_{hk,x}(t) = \hat z_{hk,x}(t), \quad x\in Y_{hk},\ t\in O_G. \]
In other terms, each $\hat z_{hk,x}\in\mathrm{aut}(O_d,\sigma_d,\theta)$ restricts to the automorphism $\hat y_{hk,x}\in\mathrm{aut}\,O_G$, so by Remark 8 we conclude that $n$ takes values in $NG$ and yields a reduction to $NG$ of the structure group of $E$. Moreover, using again Remark 8 we have $[q] = p_*[n]$. The fact that $\mathcal G$ has class $\gamma_*[n]$ follows by applying the previous theorem to the $NG$-cocycle $n$. □
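The claim in the proof of Theorem 27 that $n := \{z_{hk} := w_h u_{hk} w_k^*\}$ is again a $U(d)$-cocycle, equivalent to $u$, can be checked directly. The sketch below assumes only that each $w_k$ takes values in $U(d)$, so that $w_k^* w_k = 1$, and that $u_{hk}u_{kl} = u_{hl}$ on $Y_{hkl}$:

\begin{align*}
z_{hk}\,z_{kl} &= w_h\,u_{hk}\,w_k^*\,w_k\,u_{kl}\,w_l^* \\
&= w_h\,u_{hk}\,u_{kl}\,w_l^* \\
&= w_h\,u_{hl}\,w_l^* = z_{hl}.
\end{align*}

If, in addition, $u\mapsto\hat u$ is a group homomorphism from $U(d)$ into $\mathrm{aut}\,O_d$ (an assumption here, in line with the action (1) referred to in the text), then $\widehat{w_k^*} = \hat w_k^{-1}$ and $\hat z_{hk} = \hat w_h\circ\hat u_{hk}\circ\hat w_k^{-1}$, which is the right-hand side of (43).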
Corollary 28. Let $\hat\rho$ be a special category with an inclusion $(\hat\rho,\otimes,\iota,\theta_E)\to(\hat E,\otimes,\iota,\theta_E)$. Then:
(1) There is a compact group $G\subseteq SU(d)$ such that $E$ has an associated $NG$-cocycle $n$;
(2) $\hat\rho$ has class $Q[\hat\rho] = p_*[n]\in H^1(X,QG)$;
(3) There is a group bundle $\mathcal G\subseteq SUE$ such that $\hat\rho = \hat{\mathcal G}$.

Proof. (1) By Theorem 10 there is a compact group $G\subseteq SU(d)$ such that $\hat\rho$ has fibre $\hat G$ and an associated $QG$-cocycle $q$; thus, by the previous theorem we conclude that $E$ has an associated $NG$-cocycle $n$. (2) The previous theorem implies $[q] = p_*[n]$. (3) We apply again the previous theorem. □

7. Cohomological invariants and duality breaking

In the present section we approach the following question: given a covariant inclusion $G\subseteq U(d)$ and a $\hat G$-bundle $(\hat\rho,\otimes,\iota,\varepsilon)$, is there any $\mathcal G$-equivariant vector bundle $E\to X^\iota$ with an isomorphism $(\hat{\mathcal G},\otimes,\iota,\theta_E)\simeq(\hat\rho,\otimes,\iota,\varepsilon)$? This is what we call the problem of abstract duality since, differently from the previous section, our category $\hat\rho$ is not presented as a subcategory of $\mathrm{vect}(X^\iota)$. We will give a complete answer to the previous question in terms of the cohomology set $H^1(X^\iota,QG)$, reducing the problem of abstract duality to (relatively) simple computations involving cocycles and principal bundles. As a preliminary step we analyse the setting of $C^*$-bundles. Let $X$ be a compact Hausdorff space. By Lemma 18 we have that $H^1(X,QG)$ describes the set of isomorphism classes of locally trivial, pointed $C^*$-dynamical systems with fibre $(O_G,\sigma_G,\theta)$. For every $QG$-cocycle $q$, we denote the associated pointed $C^*$-dynamical system by $(O_q,\rho_q,\varepsilon_q)$.

Theorem 29. With the above notation, for each $QG$-cocycle $q$ and $(O_q,\rho_q,\varepsilon_q)$, the following are equivalent:
(1) There is a rank $d$ vector bundle $E\to X$ with a $C_0(X)$-monomorphism $\eta : (O_q,\rho_q,\varepsilon_q)\to(O_E,\sigma_E,\theta_E)$;
(2) There is a gauge $C^*$-dynamical system $(\mathcal O,\mathcal G,\alpha)$ with fibre $(O_d,G)$ and structure group $U(d)$, such that $O_q$ is $QG$-$C_0(X)$-isomorphic to the fixed-point algebra $\mathcal O^\alpha$;
(3) There is an $NG$-cocycle $n$ such that $p_*[n] = [q]$.

Proof. (3) ⇒ (2). We consider the Cuntz algebra $O_d$ endowed with the $NG$-action (1), which factorises through the action $QG\to\mathrm{aut}\,O_G$. Then we apply Lemma 19 with $F_\bullet = O_d$ and $A_\bullet = O_G$.
(2) ⇒ (1). Let $n := \{u_{ij}\}$ denote the $U(d)$-cocycle associated with $\mathcal O$ and $E\to X$ be the rank $d$ vector bundle with transition maps $\{u_{ij}\}$. According to Proposition 23 there is a $U(d)$-$C_0(X)$-isomorphism $\mathcal O\simeq O_E$, so that, to be concise, we identify $\mathcal O$ with $O_E$. Now, by construction of $O_E$ there is an inclusion $SE\subset O_E$, to which corresponds an inclusion $E\subset\hat O_E$; since $(O_E,\mathcal G,\alpha)$ has fibre $(O_d,G)$, with $G$ acting on $O_d$ as in (1), we conclude that $E$ is $\mathcal G$-stable and the map $\alpha : \mathcal G\times_X\hat O_E\to\hat O_E$ restricts to an action
\[ \mathcal G\times_X E\to E, \tag{44} \]
i.e., E is G-equivariant. By Theorem 26, we conclude that Oα = OG . Moreover, by Theorem 7 we have QG = aut(OG , σG , θ ), thus, from Remark 17 we conclude that the given QG-C0 (X)isomorphism β : Oq → OG yields monomorphisms β
(Oq , ρq , εq ) −→ (OG , σG , θE ) → (OE , σE , θE ). (1) ⇒ (3). Apply again Lemma 19 with F• = Od and A• = OG .
(45)
2
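Now, by Theorem 7 there are maps
\[
\begin{aligned}
Q : \mathrm{sym}(X, \widehat{G}) &\to H^1(X, QG), & [\hat\rho, \otimes, \iota, \varepsilon] &\mapsto Q[\hat\rho], \\
H^1(X, QG) &\to \mathrm{sym}(X, \widehat{G}), & [q] &\mapsto [\hat\rho_q, \otimes, \iota, \varepsilon_q],
\end{aligned}
\tag{46}
\]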
which are inverse to each other. The following result is the translation of Theorem 29 in categorical terms; the proof is an immediate application of Theorems 1, 7 and 27, thus it will be omitted.

Theorem 30. Let $d \in \mathbb{N}$ and $G \subseteq U(d)$ be covariant. For each $G$-bundle $(\hat\rho, \otimes, \iota, \varepsilon)$, the following are equivalent:
(1) There is an embedding functor $E : \hat\rho \to \mathrm{vect}(X^\iota)$;
(2) There is a vector bundle $E \to X^\iota$ and a compact $G$-bundle $\mathcal{G} \subseteq UE$ with an isomorphism $(\hat\rho, \otimes, \iota, \varepsilon) \simeq (\widehat{\mathcal{G}}, \otimes, \iota, \theta_E)$;
(3) There is an $NG$-cocycle $n$ such that $p_*[n] = Q[\hat\rho]$.

We call a gauge group associated with $\hat\rho$ the bundle $\mathcal{G} \to X$ appearing in Theorem 30, whose isomorphism class is labelled by $\gamma_*[n] \in H^1(X, \mathrm{aut}\,G)$. It follows from the previous theorem that the set of embedding functors $E : \hat\rho \to \mathrm{vect}(X)$ is in one-to-one correspondence with the set of $NG$-cocycles $n$ such that $p_*[n] = Q[\hat\rho]$, that we denote by $Z^1(X, NG; \hat\rho)$. As we shall see in the sequel, $Z^1(X, NG; \hat\rho)$ may contain more than one cohomology class, or be empty. Let $n, n' \in Z^1(X, NG; \hat\rho)$ and $\mathcal{G}, \mathcal{G}'$ denote the associated gauge groups. In general, $\mathcal{G}$ may fail to be isomorphic to $\mathcal{G}'$; an example of this phenomenon with $G = SU(2)$ is provided in Example 36.

Corollary 31. Let $G \subseteq U(d)$ be covariant. If there is a continuous monomorphism $s : QG \to NG$, $p \circ s = \mathrm{id}_{QG}$, then for each $G$-bundle $(\hat\rho, \otimes, \iota, \varepsilon)$ there is at least one embedding functor $\hat\rho \to \mathrm{vect}(X^\iota)$.

Proof. By functoriality there is $s_* : H^1(X^\iota, QG) \to H^1(X^\iota, NG)$ such that $p_* \circ s_*$ is the identity on $H^1(X^\iota, QG)$. Thus $Q[\hat\rho] = p_*[n]$, $n := s_* \circ Q[\hat\rho]$, and this means that the desired embedding functor exists. □

Example 32. Let $G = SU(d)$, so that $NG = U(d)$ and $QG = \mathbb{T}$. By (46), we have the map
\[ Q : \mathrm{sym}(X, \widehat{SU(d)}) \to H^1(X, \mathbb{T}). \]
Elementary computations show that the quotient map $p : U(d) \to \mathbb{T}$ is the determinant; we define the continuous section $s : \mathbb{T} \to U(d)$, $s(z) = z \oplus 1_{d-1}$, where $1_{d-1}$ is the identity of $M_{d-1}$. Since
$s$ is multiplicative, we conclude by Corollary 31 that for each $SU(d)$-bundle $\hat\sigma$ there is at least one embedding functor $E : \hat\sigma \to \mathrm{vect}(X)$.

For future reference, we consider the well-known isomorphism $B : H^1(X, \mathbb{T}) \to H^2(X, \mathbb{Z})$. Moreover, we refer the reader to (12).

Corollary 33. Let $(\hat\rho, \otimes, \iota, \varepsilon)$ be a special category such that $\rho$ has dimension $d \in \mathbb{N}$ and Chern class $c \in H^2(X^\iota, \mathbb{Z})$. Then there is an $SU(d)$-bundle $(\hat\sigma, \otimes, \iota, \varepsilon)$ with an inclusion functor
\[ (\hat\sigma, \otimes, \iota, \varepsilon) \to (\hat\rho, \otimes, \iota, \varepsilon). \tag{47} \]
If $E : \hat\rho \to \mathrm{vect}(X^\iota)$ is an embedding functor and $E := E(\rho)$, then there is a factorisation
\[
\begin{array}{ccc}
\hat\sigma & \longrightarrow & \hat\rho \\
{\scriptstyle E}\downarrow & & \downarrow{\scriptstyle E} \\
\widehat{SUE} & \subseteq & \widehat{E}
\end{array}
\tag{48}
\]
and $E$ has first Chern class $c_1(E) = c$.

Proof. We define $\hat\sigma$ as the tensor $C^*$-subcategory of $\hat\rho$ generated by the symmetry operators $\varepsilon_\rho(r, s)$, $r, s \in \mathbb{N}$, and the elements of $R_\rho$ (see (12) and following remarks). The obvious inclusion $\hat\sigma \subseteq \hat\rho$ yields the functor (47). If $E$ is an embedding functor then $E(P_{\rho,\varepsilon,d}) = P_d$ (see Remark 24); this implies $E(R_\rho) = (\iota, \wedge^d E)$ and we conclude that $c(\rho) = c_1(E)$. Finally, since the spaces of arrows of $\widehat{SUE}$ are generated by the flips $\theta_E(r, s) = E(\varepsilon_\rho(r, s))$, $r, s \in \mathbb{N}$, and elements of $(\iota, \wedge^d E)$, we obtain the desired factorisation (48). □

Corollary 34. Let $(\hat\sigma, \otimes, \varepsilon, \iota)$ be an $SU(d)$-bundle with $B \circ Q[\hat\sigma] \in H^2(X^\iota, \mathbb{Z})$. Then the set of embedding functors $E : \hat\sigma \to \mathrm{vect}(X^\iota)$ coincides with the set of vector bundles over $X^\iota$ of rank $d$ and first Chern class $B \circ Q[\hat\sigma]$.

For the notion of conjugate in the setting of tensor $C^*$-categories, we refer the reader to [18, §2].

Theorem 35. Let $d \in \mathbb{N}$ and $G \subseteq U(d)$ be covariant. Then for every $G$-bundle $(\hat\rho, \otimes, \iota, \varepsilon)$ the following invariants are assigned:
\[ \delta(\hat\rho) := \delta \circ Q[\hat\rho] \in H^2(X^\iota, G'), \qquad \breve\gamma_*(\hat\rho) := \breve\gamma_* \circ Q[\hat\rho] \in \breve{H}^2(X^\iota, G \to \mathrm{aut}\,G). \]
The class $\breve\gamma_*(\hat\rho)$ defines a $G$-gerbe $\breve{\mathcal{G}}$ over $X^\iota$, unique up to isomorphism, which collapses to a $G$-bundle $\mathcal{G}$ if and only if there is an embedding functor $E : \hat\rho \to \mathrm{vect}(X^\iota)$, and in such a case $\delta(\hat\rho) = [1]$. When $G \subseteq SU(d)$, the Chern class
\[ c(\rho) \in H^2(X^\iota, \mathbb{Z}), \]
defined in (12), fulfils the following properties: if $c(\rho) = 0$ then $\rho$ is a special object and the closure for subobjects of $\hat\rho$ has conjugates; if $E : \hat\rho \to \mathrm{vect}(X^\iota)$ is an embedding functor then $c(\rho)$ is the first Chern class of $E(\rho)$.

Proof. We pick a cocycle pair $b$ in the cohomology class $\breve\gamma_*(\hat\rho)$ and define $\breve{\mathcal{G}}$ as the $G$-gerbe with transition maps defined by $b$ according to Remark 13 and Lemma 15. Embeddings $E : \hat\rho \to \mathrm{vect}(X^\iota)$ are in one-to-one correspondence with $NG$-cocycles $n$ such that $p_*[n] = Q[\hat\rho]$ and the associated $G$-bundles $\mathcal{G}$ define cohomology classes $\gamma_*[n] \in H^1(X^\iota, \mathrm{aut}\,G)$. Commutativity of the square in (21) implies that
\[ d_* \circ \gamma_*[n] = \breve\gamma_* \circ Q[\hat\rho] \in \breve{H}^2(X^\iota, G \to \mathrm{aut}\,G), \]
and this proves that $\breve{\mathcal{G}}$ is isomorphic to the gerbe defined by $\mathcal{G}$ according to Remark 13. The relation between the existence of $E$ and the vanishing of $\delta(\hat\rho)$ is proved applying Theorem 30 and Lemma 15. Let now $G \subseteq SU(d)$. If $c(\rho) = 0$ then $R_\rho$ is a free Hilbert $(\iota, \iota)$-module and there is an isometry $S \in R_\rho$, so that $\rho$ is special, and [7, Lemma 3.6] implies that the conjugate $\bar\rho$ is a subobject in $\hat\rho$. Using [18, Theorem 2.4] we conclude that the tensor powers $\rho^r$, $r \in \mathbb{N}$, and their subobjects, have conjugates. □

The previous theorem suggests that in general the dual object of a symmetric tensor $C^*$-category is a nonabelian gerbe rather than a group bundle. Clearly, we should say in precise terms in which sense a tensor $C^*$-category is the representation category of a gerbe. This could be done considering the notion of action of gerbes on bundles of 2-Hilbert spaces. An alternative point of view is to consider Hilbert $C^*$-bimodules rather than bundles: this situation is analogous to what happens in twisted K-theory, where we can use equivalently (Abelian) gerbes or bimodules with coefficients in a continuous trace $C^*$-algebra to define the same K-group. These aspects will be clarified in a forthcoming paper [26].

Example 36. Let $n \in \mathbb{N}$ and $S^n$ denote the $n$-sphere. We discuss the map $p_*$ in the case $G = SU(d)$, $d > 1$:
\[ p_* : H^1(S^n, U(d)) \to \mathrm{sym}(S^n, \widehat{SU(d)}) \simeq H^1(S^n, \mathbb{T}) \simeq H^2(S^n, \mathbb{Z}). \]
A well-known argument implies $H^1(S^n, U(d)) \simeq \pi_{n-1}(U(d))$ [12, Chapter 7.8]; thus, by classical results [12, Chapter 7.12] we have
\[ H^1(S^2, U(d)) \simeq H^1(S^4, U(d)) \simeq \mathbb{Z}, \qquad H^1(S^1, U(d)) \simeq H^1(S^3, U(d)) \simeq 0; \]
moreover,
\[ H^2(S^2, \mathbb{Z}) \simeq \mathbb{Z}, \qquad H^2(S^n, \mathbb{Z}) \simeq 0, \quad n \neq 2. \]
Thus the cases $S^1$, $S^3$ are trivial. In the other cases, we have the following:
• $n > 2$. The map $p_*$ is trivial and the unique element of $\mathrm{sym}(S^n, \widehat{SU(d)})$ is the class of the trivial bundle. Now, it is a general fact that if $E \to X$ is a vector bundle, then the continuous
bundle $(E, E)$ is trivial if and only if $E$ is the tensor product of a trivial bundle by a line bundle. In the case $X = S^n$, $n \neq 2$, every line bundle is trivial, thus we conclude that $(E, E)$ is trivial if and only if $E$ is trivial. Since $(E, E)$ is generated as a $C(X)$-module by the special unitary group of $E$, we conclude that $E \to S^n$ is trivial if and only if $SUE \to S^n$ is trivial. Thus, $\widehat{SUE}$ is trivial for every $E \to S^n$, in spite of the fact that $SUE$ is trivial if and only if $E = S^n \times \mathbb{C}^d$. In particular, this holds for $S^{2m}$, $m = 2, \ldots$, where nontrivial vector bundles exist.
• $n = 2$. We recall that the Chern character $\mathrm{Ch} : K^0(S^2) \to H^0(S^2, \mathbb{Z}) \oplus H^2(S^2, \mathbb{Z})$ is a ring isomorphism. The term $H^0(S^2, \mathbb{Z}) \simeq \mathbb{Z}$ corresponds to the rank, whilst $H^2(S^2, \mathbb{Z}) \simeq \mathbb{Z}$ is the first Chern class. By the well-known stability properties of vector bundles (see [12, Chapter 8, Theorem 1.5] or [15, II.6.10]), we find that rank $d$ vector bundles $E, E' \to S^2$ are isomorphic if and only if $[E] = [E'] \in K^0(S^2)$, i.e. $\mathrm{Ch}[E] = \mathrm{Ch}[E']$. This implies that $p_*$ is one-to-one for $n = 2$.

Example 37. We define $R_d \subset SU(d)$ as the group of diagonal matrices of the type $g := \mathrm{diag}(z, \ldots, z)$, where $z \in \mathbb{T}$ is a root of unity of order $d$. Then $NR_d = U(d)$ acts trivially on $R_d$ and $R_d' = R_d$. We have the exact sequence of pointed sets
\[ H^1(S^2, R_d) \xrightarrow{\ i_*\ } H^1(S^2, U(d)) \xrightarrow{\ p_*\ } H^1(S^2, QR_d) \xrightarrow{\ \delta\ } H^2(S^2, R_d). \]
Now, every principal $R_d$-bundle over $S^2$ is trivial, and the universal coefficient theorem yields $H^2(S^2, R_d) \simeq \mathrm{Hom}(\mathbb{Z}, R_d) \simeq \mathbb{Z}_d$. Thus we have
\[ 0 \to \mathbb{Z} \xrightarrow{\ p_*\ } H^1(S^2, QR_d) \xrightarrow{\ \delta\ } \mathbb{Z}_d, \]
and $p_*$ is injective. We now prove that there is a left inverse $s : \mathbb{Z}_d \to H^1(S^2, QR_d)$ for $\delta$ with trivial intersection with $p_*(\mathbb{Z})$. This suffices to prove that $H^1(S^2, QR_d) \simeq \mathbb{Z} \oplus \mathbb{Z}_d$. To this end, we embed $R_d$ in $\mathbb{T}$ and regard each $R_d$-2-cocycle $g := \{g_{ijk}\}$ as a $\mathbb{T}$-2-cocycle. In this way, the argument of the proof of [5, Theorem 10.8.4(2)] implies that there is a 1-$U(d)$-cochain $u := \{u_{ij}\}$ such that $u_{ij} u_{jk} u_{ik}^{-1} = g_{ijk}$. Thus, we define the map
\[ s : H^2(S^2, \mathbb{T}) \to \breve{H}^2(S^2, R_d \to U(d)) \simeq H^1(S^2, QR_d), \qquad s[g] := [u, g], \]
which clearly yields the desired left inverse (recall the definition of $\delta$). We conclude that
\[ \mathrm{sym}(S^2, \widehat{R}_d) \simeq \mathbb{Z} \oplus \mathbb{Z}_d. \]
The first direct summand corresponds to the term $H^1(S^2, U(d))$ whose isomorphism with $\mathbb{Z}$ is realised by means of the determinant (see [12, §7.8]); this implies that the projection of $\mathrm{sym}(S^2, \widehat{R}_d)$ on $\mathbb{Z}$ is the Chern class. On the other side, by construction the projection on $\mathbb{Z}_d$ corresponds to the class $\delta$.
Example 38. Let $G \subset U(d)$ be as in Example 5 and $\hat\rho$ be a $G$-bundle; then it is easy to check that the set of embeddings $\hat\rho \to \mathrm{vect}(X)$ is in one-to-one correspondence with vector bundles $E \to X$ such that $(E, E) \simeq (\rho, \rho)$. In particular when $X$ is the 3-sphere $S^3$ then every vector bundle $E \to S^3$ is trivial and $\hat\rho$ admits an embedding if and only if $(\rho, \rho)$ is trivial. On the other side, the (classical) Dixmier–Douady invariant is a complete invariant for bundles with fibre $M_d$ and base space $S^3$, thus we conclude that
\[ \delta : \mathrm{sym}(S^3, \widehat{G}) \to H^2(S^3, G') = H^3(S^3, \mathbb{Z}) = \mathbb{Z} \]
is an isomorphism.

Acknowledgments

The author would like to thank Mauro Spera for drawing his attention to gerbes, and an anonymous reviewer for suggesting several improvements on the first version of the present paper.

References

[1] M.F. Atiyah, K-Theory, Benjamin, New York, 1967.
[2] J.C. Baez, D. Stevenson, The classifying space of a topological 2-group, math/0801.3843v1, in: Proceedings of the Abel Symposium, 2009, in press.
[3] T. Ceccherini, C. Pinzari, Canonical actions on $\mathcal{O}_\infty$, J. Funct. Anal. 103 (1992) 26–39.
[4] J. Cuntz, Simple C*-algebras generated by isometries, Comm. Math. Phys. 57 (1977) 173–185.
[5] J. Dixmier, C*-Algebras, North-Holland Publishing Company, Amsterdam, 1977.
[6] S. Doplicher, J.E. Roberts, Duals of compact Lie groups realized in the Cuntz algebras and their actions on C*-algebras, J. Funct. Anal. 74 (1987) 96–120.
[7] S. Doplicher, J.E. Roberts, A new duality theory for compact groups, Invent. Math. 98 (1989) 157–218.
[8] S. Doplicher, J.E. Roberts, Endomorphisms of C*-algebras, cross products and duality for compact groups, Ann. of Math. 130 (1989) 75–119.
[9] S. Doplicher, J.E. Roberts, Why there is a field algebra with a compact gauge group describing the superselection structure in particle physics, Comm. Math. Phys. 131 (1990) 51–107.
[10] M.J. Dupré, Classifying Hilbert bundles I, J. Funct. Anal. 15 (1974) 244–278.
[11] F. Hirzebruch, Topological Methods in Algebraic Geometry, Springer-Verlag, 1966.
[12] D. Husemoller, Fiber Bundles, McGraw–Hill Ser. in Math., 1966.
[13] M. Izumi, H. Kosaki, On a subfactor analogue of the second cohomology, Rev. Math. Phys. 14 (2002) 733–757.
[14] T. Kajiwara, C. Pinzari, Y. Watatani, Jones index theory for Hilbert C*-bimodules and its equivalence with conjugation theory, J. Funct. Anal. 215 (1) (2004) 1–49.
[15] M. Karoubi, K-Theory, Springer-Verlag, Berlin, 1978.
[16] G.G. Kasparov, Equivariant KK-theory and the Novikov conjecture, Invent. Math. 91 (1988) 147–201.
[17] E. Kirchberg, S. Wassermann, Operations on continuous bundles of C*-algebras, Math. Ann. 303 (1995) 677–697.
[18] R. Longo, J.E. Roberts, A theory of dimension, K-Theory 11 (1997) 103–159.
[19] P.D. Mitchener, KK-theory of C*-categories and the analytic assembly map, K-Theory 26 (4) (2002) 307–344.
[20] V. Nistor, E. Troitsky, An index for gauge-invariant operators and the Dixmier–Douady invariant, Trans. Amer. Math. Soc. 356 (2004) 185–218.
[21] G. Segal, Equivariant K-theory, Inst. Hautes Études Sci. Publ. Math. 34 (1968) 129–151.
[22] E. Vasselli, Continuous fields of C*-algebras arising from extensions of tensor C*-categories, J. Funct. Anal. 199 (2003) 122–152.
[23] E. Vasselli, Crossed products by endomorphisms, vector bundles and group duality, Internat. J. Math. 16 (2) (2003).
[24] E. Vasselli, Bundles of C*-categories, J. Funct. Anal. 247 (2007) 351–377.
[25] E. Vasselli, Some remarks on group bundles and C*-dynamical systems, Comm. Math. Phys. 274 (1) (2007) 253–276.
[26] E. Vasselli, Nonabelian gerbes, Hilbert bimodules and K-theory, in preparation.
[27] P. Zito, 2-C*-categories with non-simple units, Adv. Math. 210 (2007) 122–164.
Journal of Functional Analysis 257 (2009) 388–404 www.elsevier.com/locate/jfa
A third order dispersive flow for closed curves into almost Hermitian manifolds Hiroyuki Chihara a,∗,1 , Eiji Onodera b,2 a Mathematical Institute, Tohoku University, Sendai 980-8578, Japan b Faculty of Mathematics, Kyushu University, Fukuoka 812-8581, Japan
Received 29 July 2008; accepted 9 April 2009
Communicated by C. Kenig
Abstract We discuss a short-time existence theorem of solutions to the initial value problem for a third order dispersive flow for closed curves into a compact almost Hermitian manifold. Our equations geometrically generalize a physical model describing the motion of vortex filament. The classical energy method cannot work for this problem since the almost complex structure of the target manifold is not supposed to be parallel with respect to the Levi-Civita connection. In other words, a loss of one derivative arises from the covariant derivative of the almost complex structure. To overcome this difficulty, we introduce a bounded pseudodifferential operator acting on sections of the pullback bundle, and eliminate the loss of one derivative from the partial differential equation of the dispersive flow. © 2009 Elsevier Inc. All rights reserved. Keywords: Dispersive flow; Vortex filament; Geometric analysis; Pseudodifferential calculus; Energy method
* Corresponding author.
E-mail addresses: [email protected] (H. Chihara), [email protected] (E. Onodera).
1 Supported by JSPS Grant-in-Aid for Scientific Research #20540151.
2 Supported by Global COE Program “Education and Research Hub for Mathematics-for-industry”, and MEXT Grant-in-Aid for Scientific Research #21740101.
0022-1236/$ – see front matter © 2009 Elsevier Inc. All rights reserved. doi:10.1016/j.jfa.2009.04.006

1. Introduction

Let $(N, J, h)$ be a $2n$-dimensional compact almost Hermitian manifold with an almost complex structure $J$ and a Hermitian metric $h$. Consider the initial value problem for a third order dispersive flow of the form
\[ u_t = a\nabla_x^2 u_x + J_u \nabla_x u_x + b\,h(u_x, u_x)u_x \quad \text{in } \mathbb{R} \times \mathbb{T}, \tag{1} \]
\[ u(0, x) = u_0(x) \quad \text{in } \mathbb{T}, \tag{2} \]
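where $u$ is an unknown mapping of $\mathbb{R} \times \mathbb{T}$ to $N$, $(t, x) \in \mathbb{R} \times \mathbb{T}$, $\mathbb{T} = \mathbb{R}/\mathbb{Z}$, $u_t = du(\partial/\partial t)$, $u_x = du(\partial/\partial x)$, $du$ is the differential of the mapping $u$, $u_0$ is a given closed curve on $N$, $\nabla$ is the induced connection, and $a, b \in \mathbb{R}$ are constants. For fixed $t \in \mathbb{R}$, $u(t)$ is a closed curve on $N$, and $u$ describes the motion of a closed curve subject to (1). We present the local expression of the covariant derivative $\nabla_x$. Let $y^1, \ldots, y^{2n}$ be local coordinates of $N$, and let $h = \sum_{a,b=1}^{2n} h_{ab}\,dy^a \otimes dy^b$. We denote by $\Gamma^a_{bc}$, $a, b, c = 1, \ldots, 2n$, the Christoffel symbol of $(N, J, h)$. For a smooth closed curve $u : \mathbb{T} \to N$, $\Gamma(u^{-1}TN)$ is the set of all smooth sections of the pullback bundle $u^{-1}TN$. If we express $V \in \Gamma(u^{-1}TN)$ as
\[ V(x) = \sum_{a=1}^{2n} V^a(x) \left(\frac{\partial}{\partial y^a}\right)_u, \]
then $\nabla_x V$ is given by
\[ \nabla_x V(x) = \sum_{a=1}^{2n} \left\{ \frac{\partial V^a}{\partial x}(x) + \sum_{b,c=1}^{2n} \Gamma^a_{bc}\bigl(u(x)\bigr) V^b(x) \frac{\partial u^c}{\partial x}(x) \right\} \left(\frac{\partial}{\partial y^a}\right)_u. \]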
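As a quick instance of the local formula above, one may take $V = u_x$, whose components are $V^a = \partial u^a/\partial x$. The following is only a sketch of that substitution; the shorthand $u^a_x$, $u^a_{xx}$ is introduced here for illustration and is not notation used in the paper.

```latex
% Illustration only: V = u_x, so V^a = u^a_x := \partial u^a / \partial x.
% The shorthand u^a_x, u^a_{xx} is an assumption made for this example.
\[
  \nabla_x u_x
  = \sum_{a=1}^{2n}
    \Bigl\{ u^a_{xx}
      + \sum_{b,c=1}^{2n} \Gamma^a_{bc}(u)\, u^b_x\, u^c_x \Bigr\}
    \Bigl( \frac{\partial}{\partial y^a} \Bigr)_u .
\]
```

Setting this expression to zero gives the geodesic equation in local coordinates, which is one way to read the term $J_u \nabla_x u_x$ in (1): it vanishes exactly along geodesics.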
Eq. (1) geometrically generalizes two-sphere-valued partial differential equations modeling the motion of vortex filament. In his celebrated paper [5], Da Rios first formulated the motion of vortex filament as
\[ u_t = u \times u_{xx}, \tag{3} \]
where $u = (u_1, u_2, u_3)$ is an $S^2$-valued function of $(t, x)$, $S^2$ is the unit sphere in $\mathbb{R}^3$ centered at the origin, and $\times$ is the exterior product in $\mathbb{R}^3$. The physical meanings of $u$ and $x$ are the tangent vector and the signed arc length of vortex filament respectively. Eighty-five years later, a modified model equation of vortex filament
\[ u_t = u \times u_{xx} + a\left[ u_{xxx} + \frac{3}{2}\bigl\{ u_x \times (u \times u_x) \bigr\}_x \right] \tag{4} \]
was proposed by Fukumoto and Miyazaki in [7]. When $a = b = 0$, (1) generalizes (3) and solutions to (1) are called one-dimensional Schrödinger maps. When $b = a/2$, (1) generalizes (4). In the last ten years, physical models such as (3) and (4) have been generalized and studied from the point of view of geometric analysis. The relationship between the geometric settings and the structure of such partial differential equations and their solutions has recently been investigated. The reduction of equations to simpler ones leads to a rough understanding of their structure. This idea originated from Hasimoto’s transform discovered in [10]. In their pioneering work [1], Chang, Shatah and Uhlenbeck first rigorously studied the PDE structure of (1) when $a = b = 0$, $x \in \mathbb{R}$, and $(N, J, h)$ is a compact Riemann surface. They constructed a good moving frame along the map and reduced (1) to a simple complex-valued semilinear Schrödinger equation under the
assumption that $u(t, x)$ has a fixed base point as $x \to +\infty$. Similarly, Onodera reduced (1) with $a \neq 0$ and a one-dimensional fourth order dispersive flow to complex-valued equations in [23]. Generally speaking, these reductions require some restrictions on the range of the mappings, and one cannot make use of them to solve the initial value problem for the original equations without restrictions on the range of the initial data. How to solve the initial value problem for such geometric dispersive equations is a fundamental question. In his pioneering work [14], Koiso first reformulated (3) geometrically, and proposed Eq. (1) with $a = b = 0$ and the Kähler condition $\nabla^N J = 0$, where $\nabla^N$ is the Levi-Civita connection of $(N, J, h)$. Moreover, Koiso established the standard short-time existence theorem, and proved that if $(N, J, h)$ is locally symmetric, that is, $\nabla^N R = 0$, then the solution exists globally in time, where $R$ is the Riemannian curvature tensor of $N$. See also [25] for one-dimensional Schrödinger maps. Recently, Onodera studied local and global existence theorems for (1)–(2) in the case $a \neq 0$ in [22,24]. To be more precise, [22] studied the case $\nabla^N J = 0$, and proved a short-time existence theorem. Moreover, he proved that if $(N, J, h)$ is a compact Riemann surface with constant sectional curvature $K$ and the condition $b = Ka/2$ is satisfied, then the time-local solution can be extended globally in time. Nishiyama and Tani proved the global existence of solutions to the initial value problem for (4) in [21,26]. Since $K = 1$ for $N = S^2$, the global existence theorem in [22] is a generalization of the results of [21,26]. [24] studied a short-time existence theorem for (1)–(2) in the case that $(N, J, h)$ is a compact almost Hermitian manifold and $x \in \mathbb{R}$.
Being inspired by Tarama’s beautiful results on the characterization of $L^2$-well-posedness of the initial value problem for one-dimensional linear third order dispersive equations in [28] (see also [17]), Onodera introduced a gauge transform on the pullback bundle to make full use of the so-called local smoothing effect of $e^{t\partial^3/\partial x^3}$, and proved a short-time existence theorem. Both the reduction of equations and the study of existence theorems are deeply connected with the relationship between the geometric settings of equations and the theory of linear dispersive partial differential equations. For the latter subject, see, e.g., [3,6], [16, Lecture VII], [17,27,28] and references therein.

Being concerned with the compactness of the source space, we need to mention the local smoothing effect of dispersive partial differential equations. It is well known that solutions to the initial value problem for some kinds of dispersive equations gain extra smoothness in comparison with the initial data. In his celebrated work [6], Doi characterized the existence of the microlocal smoothing effect of Schrödinger evolution equations on complete Riemannian manifolds according to the global behavior of the geodesic flow on the unit cotangent sphere bundle over the source manifolds. Roughly speaking, the local smoothing effect occurs if and only if all the geodesics go to “infinity”. For more general dispersive equations, the existence or nonexistence of the local smoothing effect is determined by the global behavior of the Hamilton flow generated by the principal symbol of the equations. In particular, if the source space is compact, then no smoothing effect occurs since all the integral curves of the Hamilton vector field are trapped. For this reason, it is essential to study the initial value problem (1)–(2) when the source space is $\mathbb{T}$ and not $\mathbb{R}$.

Here we mention the relationship between the Kähler condition $\nabla^N J = 0$ and the structure of Eq. (1).
All the preceding works on (1) except for [24] assume that $(N, J, h)$ is a Kähler manifold. If $\nabla^N J = 0$, then (1) behaves like symmetric hyperbolic systems, and the short-time existence theorem can be proved by the classical energy method. See [22] for the detail. If $\nabla^N J \neq 0$, then (1) has first order terms in some sense, and the classical energy method breaks down. The purpose of the present paper is to show a short-time existence theorem for (1)–(2) without using the Kähler condition and the local smoothing effect.

To state our results, we here introduce function spaces of mappings. For a nonnegative integer $k$, $H^{k+1}(\mathbb{T}; TN)$ is the set of all continuous mappings $u : \mathbb{T} \to N$ satisfying
\[ \|u\|_{H^{k+1}}^2 = \sum_{l=0}^{k} \int_{\mathbb{T}} h\bigl(\nabla_x^l u_x, \nabla_x^l u_x\bigr)\,dx < \infty, \]
see, e.g., [11] for the Sobolev space of mappings. The Nash embedding theorem shows that there exists an isometric embedding $w \in C^\infty(N; \mathbb{R}^d)$ with some integer $d > 2n$. See [8,9,19] for the Nash embedding theorem. Let $I$ be an interval in $\mathbb{R}$. We denote by $C(I; H^{k+1}(\mathbb{T}; TN))$ the set of all $H^{k+1}(\mathbb{T}; TN)$-valued continuous functions on $I$. In other words, we define it by the pullback of the function space as $C(I; H^{k+1}(\mathbb{T}; TN)) = C(I; w^* H^{k+1}(\mathbb{T}; \mathbb{R}^d))$, where $H^{k+1}(\mathbb{T}; \mathbb{R}^d)$ is the usual Sobolev space of $\mathbb{R}^d$-valued functions on $\mathbb{T}$. Here we state our main results.

Theorem 1. Let $k$ be a positive integer satisfying $k \geq 4$. Then, for any $u_0 \in H^{k+1}(\mathbb{T}; TN)$, there exists $T = T(\|u_0\|_{H^5}) > 0$ such that (1)–(2) possesses a unique solution $u \in C([-T, T]; H^{k+1}(\mathbb{T}; TN))$.

We will prove Theorem 1 by the uniform energy estimates of solutions to a fourth order parabolic regularized equation. To avoid the difficulty arising from $\nabla_x J_u$, we modify the method introduced in [4] for the initial value problem for Schrödinger maps of a closed Riemannian manifold to a compact almost Hermitian manifold. Being inspired by his own previous paper [2], Chihara introduced a transformation of unknown mappings defined by a bounded pseudodifferential operator acting on sections of $\Gamma(u^{-1}TN)$, and eliminated first order terms coming from $\nabla_x J_u$ in [4].

The plan of the present paper is as follows. Section 2 studies the well-posedness of an auxiliary initial value problem for some one-dimensional linear dispersive partial differential equations related with (1)–(2). We believe that Section 2 will be very helpful to understand our idea of the proof of Theorem 1, though the arguments and results there carry no novelty from the point of view of the theory of linear partial differential equations. Section 3 proves Theorem 1.

2. An auxiliary linear problem

In this section we study the initial value problem for a one-dimensional third order linear dispersive partial differential equation related with (1) of the form
\[ LU = F(t, x) \quad \text{in } \mathbb{R} \times \mathbb{T}, \tag{5} \]
\[ U(0, x) = U_0(x) \quad \text{in } \mathbb{T}, \tag{6} \]
where
\[ LU = U_t + U_{xxx} + \sqrt{-1}\,\bigl\{a(x) U_x\bigr\}_x + b_x(x) U_x + c(x) U, \]
$U$ is a complex-valued unknown function of $(t, x) \in \mathbb{R} \times \mathbb{T}$, $a, b, c \in C^\infty(\mathbb{T})$, $\mathrm{Im}\,a = 0$, and $U_0(x)$ and $F(t, x)$ are given functions. The operator $L$ is very special in the sense that the coefficient of the first order term is a derivative of a smooth function. The well-posedness of the initial value problem for third and fourth order dispersive equations on $\mathbb{R}$ or $\mathbb{T}$ was studied in
[17,27,28]. In most cases the well-posedness was characterized by conditions on the coefficients of the differential operators. Let $L^2(\mathbb{T})$ be the standard Lebesgue space of square-integrable functions on $\mathbb{T}$, and let $L^1_{\mathrm{loc}}(\mathbb{R}; L^2(\mathbb{T}))$ be the set of all $L^2(\mathbb{T})$-valued locally integrable functions on $\mathbb{R}$. Mizuhara characterized the well-posedness of the initial value problem for general third order dispersive equations on $\mathbb{R} \times \mathbb{T}$. In view of his results in [17, Theorem 6.1], one can immediately check that the special initial value problem (5)–(6) is well posed.

Proposition 2. (5)–(6) is $L^2$-well-posed, that is, for any $U_0 \in L^2(\mathbb{T})$ and for any $F \in L^1_{\mathrm{loc}}(\mathbb{R}; L^2(\mathbb{T}))$, (5)–(6) possesses a unique solution $U \in C(\mathbb{R}; L^2(\mathbb{T}))$.

All the descriptions in the present section are meaningless from a viewpoint of the general theory of linear partial differential equations. However, the purpose of this section is to illustrate our idea of the proof of Theorem 1 by showing the special proof of Proposition 2. In what follows we make use of an elementary theory of pseudodifferential operators on $\mathbb{R}$. See [15] for instance. In view of the idea in [13, Section 2], one can deal with pseudodifferential operators on $\mathbb{T}$ in the same way as those on $\mathbb{R}$ without using the general theory of pseudodifferential operators on manifolds. $C^\infty(\mathbb{T})$ is regarded as the set of all 1-periodic smooth functions on $\mathbb{R}$. Its topological dual is the set of all 1-periodic tempered distributions on $\mathbb{R}$. Let $p(\xi)$ be a real-valued smooth odd function on $\mathbb{R}$ satisfying $p(\xi) = 1/\xi$ for $\xi \in \mathbb{R} \setminus (-2, 2)$ and $p(\xi) = 0$ for $\xi \in [-1, 1]$. A pseudodifferential operator $p(D_x)$ is defined by an oscillatory integral of the form
\[ p(D_x)U(x) = \frac{1}{2\pi} \iint_{\mathbb{R} \times \mathbb{R}} e^{\sqrt{-1}(x-y)\xi} p(\xi) U(y)\,dy\,d\xi \quad \text{for } U \in \mathcal{B}^\infty(\mathbb{R}), \]
where $D_x = -\sqrt{-1}\,\partial/\partial x$, and $\mathcal{B}^\infty(\mathbb{R})$ is the set of all bounded $C^\infty$-functions on $\mathbb{R}$ whose derivatives of any order are also bounded in $\mathbb{R}$. It is well known that $p(D_x)$ is well defined on $\mathcal{B}^\infty(\mathbb{R})$ and extended to the set of all tempered distributions on $\mathbb{R}$. $-\sqrt{-1}\,p(D_x)$ is essentially a realization of the integral over $(-\infty, x]$ by pseudodifferential operators. An important property of $p(D_x)$ is the following.

Lemma 3. If $U(x)$ is real-valued and 1-periodic, then so is $\sqrt{-1}\,p(D_x)U(x)$.
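Proof. Let $U \in C^\infty(\mathbb{T})$. We can easily check that $p(D_x)U(x)$ is 1-periodic by using a translation $x \mapsto x + 1$. Suppose that $U(x)$ is real-valued in addition. Then,
\[
\begin{aligned}
\mathrm{Im}\,\sqrt{-1}\,p(D_x)U(x) &= \mathrm{Re}\,p(D_x)U(x) \\
&= \mathrm{Re}\,\frac{1}{2\pi} \iint_{\mathbb{R} \times \mathbb{R}} e^{\sqrt{-1}(x-y)\xi} p(\xi) U(y)\,dy\,d\xi \\
&= \frac{1}{2\pi} \iint_{\mathbb{R} \times \mathbb{R}} \cos\bigl((x-y)\xi\bigr) p(\xi) U(y)\,dy\,d\xi = 0
\end{aligned}
\]
since the integrand in the last integral above is an odd function in $\xi$. □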
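Our special proof of Proposition 2 uses a bounded pseudodifferential operator defined by
\[ \lambda(x, D_x) = 1 - \tilde\lambda(x, D_x), \qquad \tilde\lambda(x, \xi) = \frac{\sqrt{-1}}{3}\,b(x) p(\xi). \]
Roughly speaking, $\lambda(x, D_x)$ is a linear automorphism on $L^2(\mathbb{T})$. Indeed, it is easy to see that there exists a constant $M > 1$ depending on $b(x)$ and $p(\xi)$ such that
\[ M^{-1}\|U\| \leq N(U) \leq M\|U\| \quad \text{for any } U \in L^2(\mathbb{T}), \tag{7} \]
where $N(U)^2 = \|\lambda(x, D_x)U\|^2 + \|\langle D_x\rangle^{-1}U\|^2$, $\langle D_x\rangle = (1 - \partial^2/\partial x^2)^{1/2}$, and $\|\cdot\|$ is the norm of $L^2(\mathbb{T})$. We prove Proposition 2 by using a transform $U \mapsto \lambda(x, D_x)U$ as follows.

Sketch of proof of Proposition 2. It suffices to show forward and backward energy inequalities. See [12, Section 23.1] for instance. We obtain only an energy inequality in the positive direction in $t$. The backward one can be obtained similarly. A direct computation shows that
\[
\begin{aligned}
\lambda(x, D_x)L &= \bigl(\partial_t + \partial_x^3 + \sqrt{-1}\,\partial_x a(x)\partial_x\bigr)\lambda(x, D_x) - \bigl[\tilde\lambda(x, D_x), \partial_x^3\bigr] + b_x(x)\partial_x + r_1(x, D_x),
\end{aligned}
\tag{8}
\]
\[ r_1(x, D_x) = -\bigl[\tilde\lambda(x, D_x), \sqrt{-1}\,\partial_x a(x)\partial_x\bigr] - \tilde\lambda(x, D_x) b_x(x)\partial_x + \lambda(x, D_x)c(x), \]
\[ \langle D_x\rangle^{-1}L = \bigl(\partial_t + \partial_x^3 + \sqrt{-1}\,\partial_x a(x)\partial_x\bigr)\langle D_x\rangle^{-1} + r_2(x, D_x), \tag{9} \]
\[ r_2(x, D_x) = \bigl[\langle D_x\rangle^{-1}, \sqrt{-1}\,\partial_x a(x)\partial_x\bigr] + \langle D_x\rangle^{-1}\bigl\{b_x(x)\partial_x + c(x)\bigr\}, \]
where $\partial_t = \partial/\partial t$ and $\partial_x = \partial/\partial x$. $r_1(x, D_x)$ and $r_2(x, D_x)$ are $L^2$-bounded pseudodifferential operators. We remark that
\[ -\bigl[\tilde\lambda(x, D_x), \partial_x^3\bigr] = -b_x(x)\partial_x + r_3(x, D_x), \]
\[ r_3(x, D_x) = b_x(x)\partial_x\bigl\{1 - p(D_x)D_x\bigr\} - b_{xx}(x)p(D_x)D_x + \frac{\sqrt{-1}}{3}\,b_{xxx}(x)p(D_x), \]
and $r_3(x, D_x)$ is also an $L^2$-bounded pseudodifferential operator. Set $r_4 = r_1 + r_3$ for short. Then, (8) becomes
\[ \lambda(x, D_x)L = \bigl(\partial_t + \partial_x^3 + \sqrt{-1}\,\partial_x a(x)\partial_x\bigr)\lambda(x, D_x) + r_4(x, D_x). \tag{10} \]
Fix arbitrary $T > 0$. Suppose that $U \in C([0, T]; H^3(\mathbb{T})) \cap C^1([0, T]; L^2(\mathbb{T}))$. By using (9) and (10), one can easily show that there exists a positive constant $C_0$ depending on $a$, $b$, $c$ and $p$ such that
\[ \frac{d}{dt}N\bigl(U(t)\bigr)^2 \leq C_0\bigl\{N\bigl(U(t)\bigr) + N\bigl(LU(t)\bigr)\bigr\}N\bigl(U(t)\bigr), \]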
which implies a desired energy inequality
\[ \|U(t)\| \leq C_1\left\{\|U(0)\| + \int_0^t \|LU(s)\|\,ds\right\} \quad \text{for } t \in [0, T], \]
where $C_1$ is a positive constant depending only on $a$, $b$, $c$ and $p$. □
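The remark on the commutator with $\partial_x^3$ in the sketch above can be checked term by term with the Leibniz rule. The derivation below is only a sketch under two standing facts: $p(D_x)$ is a Fourier multiplier and so commutes with $\partial_x$, and $\partial_x = \sqrt{-1}\,D_x$, so that $\sqrt{-1}\,\partial_x p(D_x) = -p(D_x)D_x$.

```latex
% Sketch: expand -[\tilde\lambda, \partial_x^3] with
% \tilde\lambda(x,D_x) = (\sqrt{-1}/3) b(x) p(D_x).
\begin{align*}
-\bigl[\tilde\lambda(x,D_x),\partial_x^3\bigr]
 &= \frac{\sqrt{-1}}{3}\bigl(\partial_x^3\,b(x) - b(x)\,\partial_x^3\bigr)p(D_x)\\
 &= \frac{\sqrt{-1}}{3}\bigl(3b_x\partial_x^2 + 3b_{xx}\partial_x + b_{xxx}\bigr)p(D_x)\\
 &= b_x\partial_x\bigl(\sqrt{-1}\,\partial_x p(D_x)\bigr)
    + b_{xx}\bigl(\sqrt{-1}\,\partial_x p(D_x)\bigr)
    + \frac{\sqrt{-1}}{3}\,b_{xxx}\,p(D_x)\\
 &= -b_x\partial_x\,p(D_x)D_x - b_{xx}\,p(D_x)D_x
    + \frac{\sqrt{-1}}{3}\,b_{xxx}\,p(D_x)\\
 &= -b_x\partial_x
    + \underbrace{b_x\partial_x\bigl\{1 - p(D_x)D_x\bigr\}
      - b_{xx}\,p(D_x)D_x
      + \frac{\sqrt{-1}}{3}\,b_{xxx}\,p(D_x)}_{=\,r_3(x,D_x)} .
\end{align*}
```

The symbol $1 - p(\xi)\xi$ vanishes for $|\xi| \geq 2$, which is why the first term of $r_3$ is bounded on $L^2$ even though it contains $\partial_x$. The leading term $-b_x\partial_x$ is exactly what cancels the first order term $b_x\partial_x$ of $L$ in (8), and the factor $1/3$ in $\tilde\lambda$ is chosen to match the coefficient $3$ produced by the Leibniz rule.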
3. Proof of Theorem 1

We shall prove Theorem 1 by the uniform energy estimates of solutions to the initial value problem for semilinear parabolic equations of the form
\[ u^\varepsilon_t = -\varepsilon\nabla_x^3 u^\varepsilon_x + a\nabla_x^2 u^\varepsilon_x + J_{u^\varepsilon}\nabla_x u^\varepsilon_x + b\,h\bigl(u^\varepsilon_x, u^\varepsilon_x\bigr)u^\varepsilon_x \quad \text{in } (0, \infty) \times \mathbb{T}, \tag{11} \]
\[ u^\varepsilon(0, x) = u_0(x) \quad \text{in } \mathbb{T}, \tag{12} \]
where $\varepsilon \in (0, 1]$ is a parameter. The existence of solutions to (11)–(12) was proved as follows.

Lemma 4. (See [22, Proposition 3.1].) Let $k$ be a positive integer satisfying $k \geq 2$. Then, for any $u_0 \in H^{k+1}(\mathbb{T}; TN)$, there exists $T_\varepsilon = T(\varepsilon, \|u_0\|_{H^3}) > 0$ such that (11)–(12) possesses a unique solution $u^\varepsilon \in C([0, T_\varepsilon]; H^{k+1}(\mathbb{T}; TN))$.

The proof of Lemma 4 given in [22] does not depend on the Kähler condition at all. Lemma 4 is proved by the standard arguments: the contraction mapping theorem and some kind of maximum principle. Firstly, we push forward (11)–(12) into $\mathbb{R}^d$ by the Nash embedding $w$, and construct a solution taking values in a small tubular neighborhood of $w(N)$. Secondly, we check that the value of the solution remains in $w(N)$. See [22, Section 3] for the detail.

We split the proof of Theorem 1 into three steps. Firstly, we construct a solution by the uniform energy estimates and the standard compactness argument. Secondly, we check the uniqueness of solutions. Finally, we recover the continuity in time of solutions.

Construction of solutions. Let $u^\varepsilon$ be a unique solution to (11)–(12) with a parameter $\varepsilon \in (0, 1]$. It suffices to show that there exists $T > 0$ which is independent of $\varepsilon \in (0, 1]$, such that $\{u^\varepsilon\}_{\varepsilon \in (0,1]}$ is bounded in $L^\infty(0, T; H^{k+1}(\mathbb{T}; TN))$, which is the set of all $H^{k+1}$-valued essentially bounded functions on $(0, T)$. Indeed, if this is true, then the standard compactness argument shows that there exist $u$ and a subsequence $\{u^\varepsilon\}_{\varepsilon \in (0,1]}$ such that
\[ u^\varepsilon \to u \ \text{in } C\bigl([0, T]; H^k(\mathbb{T}; TN)\bigr), \qquad u^\varepsilon \to u \ \text{in } L^\infty\bigl(0, T; H^{k+1}(\mathbb{T}; TN)\bigr) \ \text{weakly star}, \]
as $\varepsilon \downarrow 0$, and $u$ solves (1)–(2) and is $H^{k+1}$-valued weakly continuous in time. Set $u = u^\varepsilon$ for short. No confusion will occur. We actually evaluate
\[ N_{k+1}(u)^2 = \|u\|_{H^k}^2 + \bigl\|\Lambda\nabla_x^k u_x\bigr\|^2, \]
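where $\Lambda = \Lambda_\varepsilon(t,x,u)$ is a properly supported bounded pseudodifferential operator acting on $\Gamma((u^\varepsilon)^{-1}TN)$ defined later, and $\|\cdot\|$ is the norm of $L^2(\mathbb{T};TN)$ defined by
$$\|V\|^2 = \int_{\mathbb{T}} h(V,V)\,dx \quad \text{for } V : \mathbb{T} \to TN.$$
Set $T_\varepsilon^* = \sup\{T > 0 \mid N_{k+1}(u(t)) \leq 2N_{k+1}(u_0) \text{ for } t \in [0,T]\}$. We need to compute
$$\nabla_x^{l+1}\{u_t + \varepsilon\nabla_x^3 u_x - a\nabla_x^2 u_x - J_u\nabla_x u_x - b\,h(u_x,u_x)u_x\} = 0, \quad l = 0,\dots,k.$$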
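The main tools of the computation are
$$\nabla_X du(Y) = \nabla_Y du(X) + du([X,Y]) = \nabla_Y du(X), \tag{13}$$
$$\nabla_X\nabla_Y V = \nabla_Y\nabla_X V + \nabla_{[X,Y]}V + R(du(X),du(Y))V = \nabla_Y\nabla_X V + R(du(X),du(Y))V \tag{14}$$
for $X, Y \in \{\partial_t, \partial_x\}$ and $V \in \Gamma(u^{-1}TN)$. We make use of basic techniques of geometric analysis of nonlinear problems; see [20] for instance. In view of (13) and (14), we have
$$\nabla_x u_t = \nabla_t u_x, \tag{15}$$
$$\nabla_x^2 u_t = \nabla_t\nabla_x u_x + R(u_x,u_t)u_x,$$
$$\begin{aligned}
\nabla_x^{l+1}u_t &= \nabla_t\nabla_x^l u_x + \sum_{m=0}^{l-1}\nabla_x^{l-1-m}\{R(u_x,u_t)\nabla_x^m u_x\} \\
&= \nabla_t\nabla_x^l u_x + \sum_{m=0}^{l-1}\nabla_x^{l-1-m}\{R(u_x, -\varepsilon\nabla_x^3u_x + a\nabla_x^2u_x + J_u\nabla_xu_x + b\,h(u_x,u_x)u_x)\nabla_x^m u_x\} \\
&= \nabla_t\nabla_x^l u_x + aR(u_x,\nabla_x^{l+1}u_x)u_x - \varepsilon P_{1,l+1} - Q_{1,l+1},
\end{aligned} \tag{16}$$
$$P_{1,l+1} = \sum_{m=0}^{l-1}\nabla_x^{l-1-m}\{R(u_x,\nabla_x^3u_x)\nabla_x^m u_x\},$$
$$Q_{1,l+1} = -a\sum_{m=0}^{l-1}\nabla_x^{l-1-m}\{R(u_x,\nabla_x^2u_x)\nabla_x^m u_x\} + aR(u_x,\nabla_x^{l+1}u_x)u_x - \sum_{m=0}^{l-1}\nabla_x^{l-1-m}\{R(u_x, J_u\nabla_xu_x + b\,h(u_x,u_x)u_x)\nabla_x^m u_x\}.$$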
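The Sobolev embeddings show that
$$\|P_{1,l+1}\| \leq C_k\|u\|_{H^{l+3}}, \qquad \|Q_{1,l+1}\| \leq C_k\|u\|_{H^{l+1}} \tag{17}$$
for $t \in [0,T_\varepsilon^*]$, where $C_k > 1$ is a constant depending only on $a$, $b$ and $\|u_0\|_{H^{k+1}}$, and not on $\varepsilon \in (0,1]$. Such constants are denoted by the same symbol $C_k$ below. Using (13) and (14) again, we have
$$\nabla_x^{l+1}(J_u\nabla_xu_x) = \nabla_xJ_u\nabla_x\nabla_x^lu_x + l(\nabla_xJ_u)\nabla_x\nabla_x^lu_x + Q_{2,l+1}, \tag{18}$$
$$\nabla_x^{l+1}\{h(u_x,u_x)u_x\} = h(u_x,u_x)\nabla_x\nabla_x^lu_x + 2\{h(\nabla_x^lu_x,u_x)\}_x u_x + Q_{3,l+1}, \tag{19}$$
$$Q_{2,l+1} = \sum_{m=0}^{l-1}\frac{(l+1)!}{m!\,(l+1-m)!}(\nabla_x^{l+1-m}J_u)\nabla_x^{m+1}u_x,$$
$$Q_{3,l+1} = \sum_{\substack{\alpha+\beta+\gamma=l+1 \\ \alpha,\beta,\gamma\leq l}}\frac{(l+1)!}{\alpha!\,\beta!\,\gamma!}\,h(\nabla_x^\alpha u_x,\nabla_x^\beta u_x)\nabla_x^\gamma u_x - 2h(\nabla_x^lu_x,\nabla_xu_x)u_x.$$
$Q_{2,l+1}$ and $Q_{3,l+1}$ satisfy the same estimates as $Q_{1,l+1}$. Combining (15), (16), (17), (18) and (19), we obtain
$$\{\nabla_t + \varepsilon\nabla_x^4 - a\nabla_x^3 - \nabla_xJ_u\nabla_x - l(\nabla_xJ_u)\nabla_x - b\,h(u_x,u_x)\nabla_x\}\nabla_x^lu_x = -aR(u_x,\nabla_x^{l+1}u_x)u_x + 2b\{h(\nabla_x^lu_x,u_x)\}_xu_x + \varepsilon P_{l+1} + Q_{l+1}, \tag{20}$$
$$\|P_{l+1}\| \leq C_k\|u\|_{H^{l+3}}, \qquad \|Q_{l+1}\| \leq C_k\|u\|_{H^{l+1}} \quad \text{for } t \in [0,T_\varepsilon^*]. \tag{21}$$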
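By using (20), we have
$$\frac{d}{dt}\|u\|_{H^k}^2 = 2\sum_{l=0}^{k-1}\int_{\mathbb{T}}h(\nabla_t\nabla_x^lu_x,\nabla_x^lu_x)\,dx$$
$$= -2\varepsilon\sum_{l=0}^{k-1}\int_{\mathbb{T}}h(\nabla_x^4\nabla_x^lu_x,\nabla_x^lu_x)\,dx \tag{22}$$
$$+\,2a\sum_{l=0}^{k-1}\int_{\mathbb{T}}h(\nabla_x^3\nabla_x^lu_x,\nabla_x^lu_x)\,dx \tag{23}$$
$$+\,2\sum_{l=0}^{k-1}\int_{\mathbb{T}}h(\nabla_xJ_u\nabla_x\nabla_x^lu_x,\nabla_x^lu_x)\,dx \tag{24}$$
$$+\,2\sum_{l=0}^{k-1}l\int_{\mathbb{T}}h((\nabla_xJ_u)\nabla_x\nabla_x^lu_x,\nabla_x^lu_x)\,dx \tag{25}$$
$$+\,2b\sum_{l=0}^{k-1}\int_{\mathbb{T}}h(u_x,u_x)\,h(\nabla_x\nabla_x^lu_x,\nabla_x^lu_x)\,dx \tag{26}$$
$$-\,2a\sum_{l=0}^{k-1}\int_{\mathbb{T}}h(R(u_x,\nabla_x^{l+1}u_x)u_x,\nabla_x^lu_x)\,dx \tag{27}$$
$$+\,4b\sum_{l=0}^{k-1}\int_{\mathbb{T}}\{h(\nabla_x^lu_x,u_x)\}_x\,h(u_x,\nabla_x^lu_x)\,dx \tag{28}$$
$$+\,2\sum_{l=0}^{k-1}\int_{\mathbb{T}}h(\varepsilon P_{l+1}+Q_{l+1},\nabla_x^lu_x)\,dx. \tag{29}$$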
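Using integration by parts and the properties of $h$ and $J$, we deduce that (22), (23), (24), (26), (28) respectively become
$$(22) = -2\varepsilon\sum_{l=0}^{k-1}\int_{\mathbb{T}}h(\nabla_x^{l+2}u_x,\nabla_x^{l+2}u_x)\,dx, \tag{30}$$
$$(23) = -2a\sum_{l=0}^{k-1}\int_{\mathbb{T}}h(\nabla_x\nabla_x^{l+1}u_x,\nabla_x^{l+1}u_x)\,dx = -a\sum_{l=0}^{k-1}\int_{\mathbb{T}}\{h(\nabla_x^{l+1}u_x,\nabla_x^{l+1}u_x)\}_x\,dx = 0, \tag{31}$$
$$(24) = -2\sum_{l=0}^{k-1}\int_{\mathbb{T}}h(J_u\nabla_x^{l+1}u_x,\nabla_x^{l+1}u_x)\,dx = 0, \tag{32}$$
$$(26) = b\sum_{l=0}^{k-1}\int_{\mathbb{T}}h(u_x,u_x)\{h(\nabla_x^lu_x,\nabla_x^lu_x)\}_x\,dx = -b\sum_{l=0}^{k-1}\int_{\mathbb{T}}\{h(u_x,u_x)\}_x\,h(\nabla_x^lu_x,\nabla_x^lu_x)\,dx, \tag{33}$$
$$(28) = 2b\sum_{l=0}^{k-1}\int_{\mathbb{T}}\{[h(\nabla_x^lu_x,u_x)]^2\}_x\,dx = 0. \tag{34}$$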
Recall the property of the Riemannian curvature tensor $R$: $h(R(X,Y)Z,W) = h(R(Z,W)X,Y)$ for any vector fields $X, Y, Z, W$ on $N$. Using this and integration by parts, we deduce
$$\begin{aligned}
(27) &= -2a\sum_{l=0}^{k-1}\int_{\mathbb{T}}h(R(u_x,\nabla_x^lu_x)u_x,\nabla_x^{l+1}u_x)\,dx \\
&= 2a\sum_{l=0}^{k-1}\int_{\mathbb{T}}h(R(u_x,\nabla_x^{l+1}u_x)u_x,\nabla_x^lu_x)\,dx
+ 2a\sum_{l=0}^{k-1}\int_{\mathbb{T}}h(\nabla^NR(u_x,u_x,\nabla_x^lu_x)u_x,\nabla_x^lu_x)\,dx \\
&\quad + 2a\sum_{l=0}^{k-1}\int_{\mathbb{T}}h(R(\nabla_xu_x,\nabla_x^lu_x)u_x,\nabla_x^lu_x)\,dx
+ 2a\sum_{l=0}^{k-1}\int_{\mathbb{T}}h(R(u_x,\nabla_x^lu_x)\nabla_xu_x,\nabla_x^lu_x)\,dx,
\end{aligned}$$
which implies
$$(27) = a\sum_{l=0}^{k-1}\int_{\mathbb{T}}h(\nabla^NR(u_x,u_x,\nabla_x^lu_x)u_x,\nabla_x^lu_x)\,dx + a\sum_{l=0}^{k-1}\int_{\mathbb{T}}h(R(\nabla_xu_x,\nabla_x^lu_x)u_x,\nabla_x^lu_x)\,dx + a\sum_{l=0}^{k-1}\int_{\mathbb{T}}h(R(u_x,\nabla_x^lu_x)\nabla_xu_x,\nabla_x^lu_x)\,dx. \tag{35}$$
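Applying the Schwarz inequality to (33), (35) and (29), we have
$$|(26)|,\ |(27)| \leq C_k\|u\|_{H^k}^2, \tag{36}$$
$$|(29)| \leq C_k\varepsilon\|u\|_{H^{k+2}}\|u\|_{H^k} + C_k\|u\|_{H^k}^2 \leq 2\varepsilon\sum_{l=0}^{k-1}\int_{\mathbb{T}}h(\nabla_x^{l+2}u_x,\nabla_x^{l+2}u_x)\,dx + C_k\|u\|_{H^k}^2. \tag{37}$$
Similarly, (25) is estimated as
$$|(25)| \leq C_k\|u\|_{H^{k+1}}\|u\|_{H^k}. \tag{38}$$
Combining (30), (31), (32), (34), (36), (37) and (38), we obtain
$$\frac{d}{dt}\|u\|_{H^k}^2 \leq C_k\|u\|_{H^{k+1}}\|u\|_{H^k}. \tag{39}$$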
Next we estimate $\|\Lambda\nabla_x^ku_x\|$. Here we define the pseudodifferential operator $\Lambda$. Let $\{N_\alpha\}$ be a set of local coordinate neighborhoods of $N$, and let $y_\alpha^1,\dots,y_\alpha^{2n}$ be the local coordinates on $N_\alpha$. Pick a partition of unity $\{\Phi_\alpha\}$ subordinate to $\{N_\alpha\}$, and pick $\{\Psi_\alpha\} \subset C_0^\infty(N)$ so that
$$\Psi_\alpha = 1 \ \text{in } \operatorname{supp}[\Phi_\alpha], \qquad \operatorname{supp}[\Psi_\alpha] \subset N_\alpha,$$
where $C_0^\infty(N)$ is the set of all compactly supported $C^\infty$-functions on $N$. We define a properly supported pseudodifferential operator $\Lambda$ acting on $\Gamma(u^{-1}TN)$ by
$$\Lambda = 1 - \tilde\Lambda, \qquad \tilde\Lambda = \frac{\sqrt{-1}\,k}{3a}\sum_\alpha J_u\Phi_\alpha(u)\,p(D_x)\,\Psi_\alpha(u).$$
If
$$V(x) = \sum_{a=1}^{2n}V^a(x)\left(\frac{\partial}{\partial y_\alpha^a}\right)_u \in \Gamma(u^{-1}TN)$$
is supported in $u^{-1}(N_\alpha)$, then
$$\Phi_\alpha(u)\,p(D_x)V(x) = \sum_{a=1}^{2n}\Phi_\alpha(u)\,p(D_x)V^a(x)\left(\frac{\partial}{\partial y_\alpha^a}\right)_u$$
is well defined and supported in $u^{-1}(N_\alpha)$. Then each term in $\tilde\Lambda$ can be treated as a pseudodifferential operator acting on $\mathbb{R}^d$-valued functions, and we can make use of pseudodifferential operators with nonsmooth symbols. In other words, we can deal with $\tilde\Lambda$ as if it were a pseudodifferential operator with a smooth symbol. See [2, Section 2] and [18] for the details. The symbolic calculus below is valid since the Sobolev embedding shows that $u(t) \in C^{4+\delta}(\mathbb{T})$ for $\delta \in (0,1/2)$. It is easy to see that there exists $C_k > 1$ such that
$$C_k^{-1}N_{k+1}(u) \leq \|u\|_{H^{k+1}} \leq C_kN_{k+1}(u) \quad \text{for } t \in [0,T_\varepsilon^*].$$
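We compute
$$\begin{aligned}
0 &= \Lambda\nabla_x^{k+1}\{u_t + \varepsilon\nabla_x^3u_x - a\nabla_x^2u_x - J_u\nabla_xu_x - b\,h(u_x,u_x)u_x\} \\
&= \Lambda\{\nabla_t + \varepsilon\nabla_x^4 - a\nabla_x^3 - \nabla_xJ_u\nabla_x - k(\nabla_xJ_u)\nabla_x - b\,h(u_x,u_x)\nabla_x\}\nabla_x^ku_x \\
&\quad - \Lambda\{-aR(u_x,\nabla_x^{k+1}u_x)u_x + 2b\{h(\nabla_x^ku_x,u_x)\}_xu_x + \varepsilon P_{k+1} + Q_{k+1}\}.
\end{aligned}$$
A direct computation shows that
$$\Lambda\nabla_t = \nabla_t\Lambda - \frac{\partial\Lambda}{\partial t} = \nabla_t\Lambda + \frac{\partial\tilde\Lambda}{\partial t}, \qquad \left\|\frac{\partial\tilde\Lambda}{\partial t}\nabla_x^ku_x\right\| \leq C_k\|u\|_{H^k}. \tag{40}$$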
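Let $I_{2n}$ be the $2n \times 2n$ identity matrix. If we use a local expression $\nabla_x^4 = \partial_x^4 + A_3\partial_x^3 + A_2\partial_x^2 + A_1\partial_x + A_0$ with $2n \times 2n$ matrices $A_j$, $j = 0,1,2,3$, we deduce that
$$\varepsilon\Lambda\nabla_x^4 = \varepsilon\nabla_x^4\Lambda + \varepsilon[\Lambda,\nabla_x^4] = \varepsilon\nabla_x^4\Lambda - \varepsilon[\tilde\Lambda,\nabla_x^4],$$
$$[\tilde\Lambda,\nabla_x^4] = \left[\frac{\sqrt{-1}\,k}{3a}J_u\,p(D_x),\ I_{2n}\partial_x^4\right] + \cdots, \qquad \|[\tilde\Lambda,\nabla_x^4]\nabla_x^ku_x\| \leq C_k\|u\|_{H^{k+3}}, \tag{41}$$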
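since the matrices of principal symbols $J_u\,p(D_x)$ and $\nabla_x^4$ commute with each other. The next computation is the most crucial part of the proof of Theorem 1. In the same way as for $\varepsilon\Lambda\nabla_x^4$, we have
$$-a\Lambda\nabla_x^3 = -a\nabla_x^3\Lambda + a[\tilde\Lambda,\nabla_x^3].$$
We examine the commutator above in detail. A direct computation shows that
$$\begin{aligned}
a[\tilde\Lambda,\nabla_x^3] &= \frac{\sqrt{-1}\,k}{3}\sum_\alpha J_u\Phi_\alpha(u)\,p(D_x)\Psi_\alpha(u)\nabla_x^3 - \frac{\sqrt{-1}\,k}{3}\sum_\alpha\nabla_x^3J_u\Phi_\alpha(u)\,p(D_x)\Psi_\alpha(u) \\
&= \frac{\sqrt{-1}\,k}{3}\sum_\alpha J_u\Phi_\alpha(u)\,p(D_x)\nabla_x^3\Psi_\alpha(u) - \frac{\sqrt{-1}\,k}{3}\sum_\alpha\nabla_x^3J_u\Phi_\alpha(u)\,p(D_x)\Psi_\alpha(u) \\
&\quad + \frac{\sqrt{-1}\,k}{3}\sum_\alpha J_u\Phi_\alpha(u)\,p(D_x)[\Psi_\alpha(u),\nabla_x^3].
\end{aligned}$$
The last term above is a smoothing operator since $\operatorname{supp}[\{\Psi_\alpha(u)\}_x] \cap \operatorname{supp}[\Phi_\alpha(u)] = \emptyset$. If we compute the commutator modulo $L^2$-bounded operators, we deduce
$$\begin{aligned}
a[\tilde\Lambda,\nabla_x^3] &\equiv \frac{\sqrt{-1}\,k}{3}\sum_\alpha\{J_u\Phi_\alpha(u)\,p(D_x)\nabla_x^3 - \nabla_x^3J_u\Phi_\alpha(u)\,p(D_x)\}\Psi_\alpha(u) \\
&= \frac{\sqrt{-1}\,k}{3}\sum_\alpha J_u\Phi_\alpha(u)[p(D_x),\nabla_x^3]\Psi_\alpha(u) - \sqrt{-1}\,k\sum_\alpha\{\nabla_x(J_u\Phi_\alpha(u))\}\nabla_x^2\,p(D_x)\Psi_\alpha(u) \\
&\quad - \sqrt{-1}\,k\sum_\alpha\{\nabla_x^2(J_u\Phi_\alpha(u))\}\nabla_x\,p(D_x)\Psi_\alpha(u) \\
&\equiv -\sqrt{-1}\,k\sum_\alpha\{\nabla_x(J_u\Phi_\alpha(u))\}\nabla_x^2\,p(D_x)\Psi_\alpha(u) \\
&= k\sum_\alpha\{\nabla_x(J_u\Phi_\alpha(u))\}\frac{p(D_x)\nabla_x^2}{\sqrt{-1}}\Psi_\alpha(u) \\
&\equiv k\sum_\alpha\{\nabla_x(J_u\Phi_\alpha(u))\}\nabla_x\Psi_\alpha(u) \\
&= k\sum_\alpha\{\nabla_x(J_u\Phi_\alpha(u))\}\nabla_x \\
&= k(\nabla_xJ_u)\sum_\alpha\Phi_\alpha(u)\nabla_x + kJ_u\Big\{\sum_\alpha\Phi_\alpha(u)\Big\}_x\nabla_x \\
&= k(\nabla_xJ_u)\nabla_x.
\end{aligned}$$
Thus,
$$-a\Lambda\nabla_x^3 \equiv -a\nabla_x^3\Lambda + k(\nabla_xJ_u)\nabla_x \tag{42}$$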
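modulo $L^2$-bounded operators. In the same way as above, we deduce
$$\Lambda\nabla_xJ_u\nabla_x \equiv \nabla_xJ_u\nabla_x\Lambda, \tag{43}$$
$$-k\Lambda(\nabla_xJ_u)\nabla_x \equiv -k(\nabla_xJ_u)\nabla_x, \tag{44}$$
$$-b\Lambda h(u_x,u_x)\nabla_x \equiv -b\,h(u_x,u_x)\nabla_x\Lambda \tag{45}$$
modulo $L^2$-bounded operators. By using $1 = \Lambda + \tilde\Lambda$, we deduce
$$\begin{aligned}
-a\Lambda\{R(u_x,\nabla_x^{k+1}u_x)u_x\} &= -aR(u_x,\nabla_x^{k+1}u_x)u_x + a\tilde\Lambda\{R(u_x,\nabla_x^{k+1}u_x)u_x\} \\
&= -aR(u_x,\nabla_x\Lambda\nabla_x^ku_x)u_x - aR(u_x,\nabla_x\tilde\Lambda\nabla_x^ku_x)u_x \\
&\quad + a\tilde\Lambda\nabla_x\{R(u_x,\nabla_x^ku_x)u_x\} - a\tilde\Lambda\{\nabla^NR(u_x,u_x,\nabla_x^ku_x)u_x\} \\
&\quad - a\tilde\Lambda\{R(\nabla_xu_x,\nabla_x^ku_x)u_x\} - a\tilde\Lambda\{R(u_x,\nabla_x^ku_x)\nabla_xu_x\} \\
&= -aR(u_x,\nabla_x\Lambda\nabla_x^ku_x)u_x + Q_{1,k+1},
\end{aligned} \tag{46}$$
$$\begin{aligned}
2b\Lambda[\{h(\nabla_x^ku_x,u_x)\}_xu_x] &= 2b\{h(\nabla_x^ku_x,u_x)\}_xu_x - 2b\tilde\Lambda[\{h(\nabla_x^ku_x,u_x)\}_xu_x] \\
&= 2b\{h(\Lambda\nabla_x^ku_x,u_x)\}_xu_x + 2b\{h(\tilde\Lambda\nabla_x^ku_x,u_x)\}_xu_x \\
&\quad - 2b\tilde\Lambda\nabla_x\{h(\nabla_x^ku_x,u_x)u_x\} + 2b\tilde\Lambda\{h(\nabla_x^ku_x,u_x)\nabla_xu_x\} \\
&= 2b\{h(\Lambda\nabla_x^ku_x,u_x)\}_xu_x + Q_{2,k+1},
\end{aligned} \tag{47}$$
$$\|Q_{1,k+1}\|,\ \|Q_{2,k+1}\| \leq C_k\|u\|_{H^{k+1}}.$$
Combining (40), (41), (42), (43), (44), (45), (46) and (47), we obtain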
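$$\{\nabla_t + \varepsilon\nabla_x^4 - a\nabla_x^3 - \nabla_xJ_u\nabla_x - b\,h(u_x,u_x)\nabla_x\}\Lambda\nabla_x^ku_x = -aR(u_x,\nabla_x\Lambda\nabla_x^ku_x)u_x + 2b\{h(\Lambda\nabla_x^ku_x,u_x)\}_xu_x + \varepsilon P_{k+1} + Q_{k+1}, \tag{48}$$
$$\|P_{k+1}\| \leq C_k\|u\|_{H^{k+3}}, \qquad \|Q_{k+1}\| \leq C_k\|u\|_{H^{k+1}} \quad \text{for } t \in [0,T_\varepsilon^*]. \tag{49}$$
Here we remark that $-k(\nabla_xJ_u)\nabla_x$ is canceled out in the left-hand side of (48) by $a[\tilde\Lambda,\nabla_x^3]$. By computations similar to (30), (31), (32), (34), (36), (37), and not to (38), we can deduce from (48) and (49) that
$$\frac{d}{dt}\|\Lambda\nabla_x^ku_x\|^2 \leq C_kN_{k+1}(u)^2. \tag{50}$$
Combining (39) and (50), we obtain
$$\frac{d}{dt}N_{k+1}(u) \leq C_kN_{k+1}(u) \quad \text{for } t \in [0,T_\varepsilon^*]. \tag{51}$$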
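As a purely illustrative aside, the way a differential inequality of the form (51) produces an existence time independent of $\varepsilon$ can be sketched numerically. The snippet below is not part of the paper's argument: the function names and the sample values of the constant and of the initial size are hypothetical, and it only treats the equality case $N' = C_kN$, which is the worst case permitted by (51).

```python
import math


def doubling_time(c_k: float) -> float:
    """Time at which the Gronwall bound N0 * exp(c_k * t) reaches 2 * N0."""
    return math.log(2.0) / c_k


def integrate_worst_case(n0: float, c_k: float, t_end: float, steps: int = 100_000) -> float:
    """Forward Euler for N' = c_k * N, the equality case of the inequality."""
    n = n0
    dt = t_end / steps
    for _ in range(steps):
        n += dt * c_k * n
    return n


if __name__ == "__main__":
    c_k, n0 = 5.0, 3.0  # hypothetical sample values
    t_star = doubling_time(c_k)
    print(t_star, integrate_worst_case(n0, c_k, t_star))
```

The doubling time depends only on the constant in the inequality, not on the initial size, which is why the bound obtained this way is uniform in $\varepsilon$.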
If we take $t = T_\varepsilon^*$, then we have $2N_{k+1}(u_0) \leq N_{k+1}(u_0)e^{C_kT_\varepsilon^*}$, which implies $T_\varepsilon^* \geq T = \log 2/C_k > 0$. Thus $\{u^\varepsilon\}_{\varepsilon\in(0,1]}$ is bounded in $L^\infty(0,T;H^{k+1}(\mathbb{T};TN))$. This completes the proof.

Uniqueness of solutions. The uniqueness of solutions was proved in [22, Section 5]. The proof given there does not depend on the Kähler condition at all. We prove the uniqueness by $H^1$ energy estimates of the difference of two solutions with the same initial data in $\mathbb{R}^d$. The symmetry of the second fundamental form of the mapping $w \circ u$ plays a crucial role. See [22, Section 5] for the details.

Recovery of continuity in time. Let $u \in L^\infty(0,T;H^{k+1}(\mathbb{T};TN))$ be the unique solution to (1)–(2). Following [4, Section 3], we prove that $\nabla_x^ku_x$ is strongly continuous in time. We already know that $u \in C([0,T];H^k(\mathbb{T};TN))$ and that $\nabla_x^ku_x$ is a weakly continuous $L^2(\mathbb{T};TN)$-valued function on $[0,T]$. We identify $N$ and $w(N)$ below. Let $\{u^\varepsilon\}_{\varepsilon\in(0,1]}$ be a sequence of solutions to (11)–(12) which approximates $u$. We can easily check that, for any $\phi \in C^\infty([0,T]\times\mathbb{T};\mathbb{R}^d)$,
$$\Lambda_\varepsilon^*\phi \to \Lambda^*\phi \ \text{in } L^2((0,T)\times\mathbb{T};\mathbb{R}^d), \qquad \Lambda_\varepsilon\nabla_x^ku_x^\varepsilon \to \tilde u \ \text{in } L^2((0,T)\times\mathbb{T};\mathbb{R}^d) \ \text{weakly star},$$
as $\varepsilon \downarrow 0$ with some $\tilde u$. Then $\tilde u = \Lambda\nabla_x^ku_x$ in the sense of distributions. We denote by $\mathscr{L}(\mathscr{H})$ the set of all bounded linear operators of a Hilbert space $\mathscr{H}$ to itself. The time-continuity of $\nabla_x^ku_x$ is equivalent to that of $\Lambda\nabla_x^ku_x$ since $\Lambda \in C([0,T];\mathscr{L}(L^2(\mathbb{T};\mathbb{R}^d)))$. It suffices to show that
$$\lim_{t\downarrow0}\Lambda(t)\nabla_x^ku_x(t) = \Lambda(0)\nabla_x^ku_{0x} \quad \text{in } L^2(\mathbb{T};\mathbb{R}^d), \tag{52}$$
since the other cases can be proved in the same way. (51) and the lower semicontinuity of the $L^2$ norm imply
$$\sum_{l=0}^{k-1}\|\nabla_x^lu_x(t)\|^2 + \|\Lambda(t)\nabla_x^ku_x(t)\|^2 \leq \sum_{l=0}^{k-1}\|\nabla_x^lu_{0x}\|^2 + \|\Lambda(0)\nabla_x^ku_{0x}\|^2 + C_kN_{k+1}(u_0)^2\,t$$
provided that $\varepsilon \downarrow 0$. Letting $t \downarrow 0$, we have
$$\limsup_{t\downarrow0}\|\Lambda(t)\nabla_x^ku_x(t)\|^2 \leq \|\Lambda(0)\nabla_x^ku_{0x}\|^2,$$
which implies (52). This completes the proof.

References
[1] N.-H. Chang, J. Shatah, K. Uhlenbeck, Schrödinger maps, Comm. Pure Appl. Math. 53 (2000) 590–602.
[2] H. Chihara, Gain of regularity for semilinear Schrödinger equations, Math. Ann. 315 (1999) 529–567.
[3] H. Chihara, The initial value problem for Schrödinger equations on the torus, Int. Math. Res. Not. 2002 (15) (2002) 789–820.
[4] H. Chihara, Schrödinger flow into almost Hermitian manifolds, submitted for publication, arXiv:0807.3395.
[5] L.S. Da Rios, On the motion of an unbounded fluid with a vortex filament of any shape, Rend. Circ. Mat. Palermo 22 (1906) 117–135 (in Italian).
[6] S.-I. Doi, Smoothing effects of Schrödinger evolution groups on Riemannian manifolds, Duke Math. J. 82 (1996) 679–706.
[7] Y. Fukumoto, T. Miyazaki, Three-dimensional distortions of a vortex filament with axial velocity, J. Fluid Mech. 222 (1991) 369–416.
[8] M.L. Gromov, V.A. Rohlin, Embeddings and immersions in Riemannian geometry, Russian Math. Surveys 25 (1970) 1–57.
[9] M. Günther, On the perturbation problem associated to isometric embeddings of Riemannian manifolds, Ann. Global Anal. Geom. 7 (1989) 69–77.
[10] H. Hasimoto, A soliton on a vortex filament, J. Fluid Mech. 51 (1972) 477–485.
[11] E. Hebey, Nonlinear Analysis on Manifolds: Sobolev Spaces and Inequalities, Courant Lect. Notes Math., vol. 5, Amer. Math. Soc., 2000.
[12] L. Hörmander, The Analysis of Linear Partial Differential Operators III, Springer-Verlag, 1985.
[13] C.E. Kenig, G. Ponce, L. Vega, Smoothing effects and local existence theory for the generalized nonlinear Schrödinger equations, Invent. Math. 134 (1998) 489–545.
[14] N. Koiso, The vortex filament equation and a semilinear Schrödinger equation in a Hermitian symmetric space, Osaka J. Math. 34 (1997) 199–214.
[15] H. Kumano-go, Pseudo-Differential Operators, MIT Press, 1981.
[16] S. Mizohata, On the Cauchy Problem, Notes and Reports in Mathematics in Science and Engineering, vol. 3, Academic Press, 1985.
[17] R. Mizuhara, The initial value problem for third and fourth order dispersive equations in one space dimension, Funkcial. Ekvac. 49 (2006) 1–38.
[18] M. Nagase, The Lp-boundedness of pseudo-differential operators with non-regular symbols, Comm. Partial Differential Equations 2 (1977) 1045–1061.
[19] J. Nash, The imbedding problem for Riemannian manifolds, Ann. of Math. 63 (1956) 20–63.
[20] S. Nishikawa, Variational Problems in Geometry, Transl. Math. Monogr., vol. 205, Amer. Math. Soc., 2002.
[21] T. Nishiyama, A. Tani, Initial and initial–boundary value problems for a vortex filament with or without axial flow, SIAM J. Math. Anal. 27 (1996) 1015–1023.
[22] E. Onodera, A third-order dispersive flow for closed curves into Kähler manifolds, J. Geom. Anal. 18 (2008) 889–918.
[23] E. Onodera, Generalized Hasimoto transform of one-dimensional dispersive flows into compact Riemann surfaces, SIGMA Symmetry Integrability Geom. Methods Appl. 4 (2008), article No. 044, 10 pp.
[24] E. Onodera, The initial value problem for a third-order dispersive flow into compact almost Hermitian manifolds, submitted for publication, arXiv:0805.3219.
[25] P.Y.H. Pang, H.-Y. Wang, Y.-D. Wang, Schrödinger flow on Hermitian locally symmetric spaces, Comm. Anal. Geom. 10 (2002) 653–681.
[26] A. Tani, T. Nishiyama, Solvability of equations for motion of a vortex filament with or without axial flow, Publ. Res. Inst. Math. Sci. 33 (1997) 509–526.
[27] S. Tarama, On the wellposed Cauchy problem for some dispersive equations, J. Math. Soc. Japan 47 (1995) 143–158.
[28] S. Tarama, Remarks on L2-wellposed Cauchy problem for some dispersive equations, J. Math. Kyoto Univ. 37 (1997) 757–765.
Journal of Functional Analysis 257 (2009) 405–427 www.elsevier.com/locate/jfa
Blow-up analysis for the prescribed mean curvature equation on R2 Paolo Caldiroli 1 Dipartimento di Matematica, Università di Torino, via Carlo Alberto, 10-10123 Torino, Italy Received 24 October 2008; accepted 3 December 2008 Available online 20 December 2008 Communicated by J.-M. Coron
Abstract We consider the $H$-system $\Delta u = 2H(u)u_x \wedge u_y$ on $\mathbb{R}^2$, where $H \in C^0(\mathbb{R}^3,\mathbb{R})$ satisfies $H(q) = H_\infty + o(1/|q|)$ as $|q| \to +\infty$ and $\sup_{q\in\mathbb{R}^3}|(H(q)-H_\infty)q| < 1$, for some $H_\infty \in \mathbb{R}\setminus\{0\}$. We show that a sequence of approximate solutions of the $H$-system on $\mathbb{R}^2$ admits a limit configuration made by $\tilde H$-bubbles, namely nonconstant solutions of $\tilde H$-systems on $\mathbb{R}^2$, and $\tilde H$ can be the mapping $H$ or the constant $H_\infty$. © 2008 Elsevier Inc. All rights reserved. Keywords: H-systems; Prescribed mean curvature equation; Blowup
0. Introduction This work is concerned with a problem related to $H$-surfaces. Here, by $H$-surface we mean a mapping $u \in H^1_{\mathrm{loc}}(\Omega,\mathbb{R}^3)$, where $\Omega$ is a domain in $\mathbb{R}^2$, which solves in a weak sense the system $\Delta u = 2H(u)u_x \wedge u_y$ (for short, $H$-system) in $\Omega$. The mapping $H : \mathbb{R}^3 \to \mathbb{R}$ is a prescribed function, which we assume at least continuous and bounded. When $\Omega = \mathbb{R}^2$, a nonconstant $H$-surface $U \in H^1_{\mathrm{loc}}(\mathbb{R}^2,\mathbb{R}^3)$ with $\int_{\mathbb{R}^2}|\nabla U|^2 < +\infty$ will be called an $H$-bubble. In fact, an $H$-bubble $U \in C^2(\mathbb{R}^2,\mathbb{R}^3)$ turns out to be a conformal parametrization of a surface $S$ in $\mathbb{R}^3$, of the type of the sphere $\mathbb{S}^2$, such that the mean curvature of $S$ at any regular point $p \in S$ equals $H(p)$. E-mail address: [email protected]. 1 Supported by Cofin. M.I.U.R. Progetto di Ricerca “Metodi Variazionali ed Equazioni Differenziali Nonlineari”.
0022-1236/$ – see front matter © 2008 Elsevier Inc. All rights reserved. doi:10.1016/j.jfa.2008.12.003
P. Caldiroli / Journal of Functional Analysis 257 (2009) 405–427
In 1985, H. Brezis and J.M. Coron in [5] and, independently, M. Struwe in [18], among other things, studied the behaviour of a sequence of approximate solutions of the $H$-system in the unit disc $D$ with null boundary conditions, namely a sequence $(u^n) \subset H_0^1(D,\mathbb{R}^3)$ such that
$$\begin{cases} \Delta u^n - 2H(u^n)u_x^n \wedge u_y^n \to 0 \ \text{in } H^{-1}(D,\mathbb{R}^3), \\ \sup_n \int_D |\nabla u^n|^2 < +\infty. \end{cases} \tag{0.1}$$
They prove that when $H$ is a nonzero constant then either $u^n \to 0$ strongly in $H_0^1(D,\mathbb{R}^3)$ or the sequence $(u^n)$ admits a limit configuration made by finitely many $H$-bubbles which, in this case, are parametrizations of spheres of radius $|H|^{-1}$ (see also [11] for a related result). Every bubble appears according to a typical blow-up phenomenon, namely one finds a sequence of positive numbers $r_n \to 0^+$ and a sequence of points $(z_n) \subset D$ with $\operatorname{dist}(z_n,\partial D)/r_n \to +\infty$ such that the rescaled sequence $(\tilde u^n)$ defined by
$$\tilde u^n(z) = u^n\left(\frac{z - z_n}{r_n}\right) \tag{0.2}$$
converges weakly in $H^1_{\mathrm{loc}}(\mathbb{R}^2,\mathbb{R}^3)$ to some $H$-bubble. The occurrence of blow-up is due to the invariance of problem (0.1) with respect to the noncompact conformal group of the disc. Now let us turn to the case in which the prescribed “mean curvature” $H$ is a nonconstant function. For some time the general belief was that the same result would also hold for nonconstant $H$, and that the difficulty in obtaining it was purely technical. However, contrary to what is claimed in [2], the case of variable $H$ exhibits new features which are hidden when $H$ is constant. Not only technical difficulties but even substantial differences appear as soon as one considers nonconstant $H$. In order to better understand the problem, we point out that the functional space of $H$-bubbles is the Sobolev space
$$\hat H^1(\mathbb{R}^2,\mathbb{R}^3) := \left\{u \in H^1_{\mathrm{loc}}(\mathbb{R}^2,\mathbb{R}^3) \ \Big|\ \int_{\mathbb{R}^2}\big(|\nabla u|^2 + \mu^2|u|^2\big) < +\infty\right\}$$
where $\mu = \mu(z) = 2/(1+|z|^2)$ for all $z = (x,y) \in \mathbb{R}^2$. One can see that $\hat H^1(\mathbb{R}^2,\mathbb{R}^3)$ turns out to be a Hilbert space endowed with the norm
$$\|u\|_{\hat H^1}^2 = \int_{\mathbb{R}^2}|\nabla u|^2 + \left|\frac{1}{4\pi}\int_{\mathbb{R}^2}\mu^2u\right|^2$$
and is isomorphic to $H^1(\mathbb{S}^2,\mathbb{R}^3)$ via stereographic projection (see Section 1). So, considering a rescaled sequence $(\tilde u^n)$ defined as in (0.2), it has to be viewed in $\hat H^1(\mathbb{R}^2,\mathbb{R}^3)$. Note that also the disc $D$ is rescaled to a sequence of discs centered at $-z_n/r_n$ and with radii $1/r_n$ which, in the limit, fills the whole plane. The sequence $(\tilde u^n)$ admits uniformly bounded Dirichlet integrals, but it is
not necessarily bounded in $\hat H^1(\mathbb{R}^2,\mathbb{R}^3)$. In order to recover boundedness one has to consider a new sequence $(v^n)$ in $\hat H^1(\mathbb{R}^2,\mathbb{R}^3)$ defined by
$$v^n = \tilde u^n - p^n, \quad \text{where } p^n = \frac{1}{4\pi}\int_{\mathbb{R}^2}\mu^2\tilde u^n.$$
In view of the Poincaré inequality, $(v^n)$ is bounded in $\hat H^1(\mathbb{R}^2,\mathbb{R}^3)$ and then, up to a subsequence, converges weakly to some $U \in \hat H^1(\mathbb{R}^2,\mathbb{R}^3)$. Since $(v^n)$ satisfies
$$\Delta v^n - 2H(v^n + p^n)v_x^n \wedge v_y^n \to 0 \ \text{in } H^{-1}_{\mathrm{loc}}(\mathbb{R}^2,\mathbb{R}^3),$$
the weak limit $U$ is not necessarily an $H$-bubble but it solves an $\tilde H$-system on $\mathbb{R}^2$ where $\tilde H$ depends on the behaviour of the sequence $(p^n) \subset \mathbb{R}^3$. Clearly, when $H$ is constant, the sequences $(\tilde u^n)$ and $(v^n)$ are approximate solutions of the same $H$-system. But if $H$ is nonconstant, the lack of invariance of $H$ under translations in $\mathbb{R}^3$ makes the difference. In essence, one can expect that if $p^n \to p \in \mathbb{R}^3$ then $\tilde H = H(\cdot + p)$, whereas if $|p^n| \to +\infty$ and $H(q) \to H_\infty \in \mathbb{R}\setminus\{0\}$ as $|q| \to +\infty$, then $\tilde H = H_\infty$. This observation is developed in [10] where, in agreement with the previous claim, a single blow-up phenomenon is described for mappings $H : \mathbb{R}^3 \to \mathbb{R}$ of the form
$$H(q) = K(q) + H_\infty \tag{0.3}$$
with $H_\infty \in \mathbb{R}\setminus\{0\}$ and $K \in C^0(\mathbb{R}^3,\mathbb{R})$ satisfying
(h1) $\sup_{q\in\mathbb{R}^3}|K(q)q| =: M_K < 1$;
(h2) $\lim_{|q|\to+\infty}K(q)q = 0$.
In this work, for the same class of mappings $H$, we provide a complete description of a sequence of approximate solutions of an $H$-system on $\mathbb{R}^2$, that is, a sequence $(u^n) \subset \hat H^1(\mathbb{R}^2,\mathbb{R}^3)$ satisfying
$$\Delta u^n - 2H(u^n)u_x^n \wedge u_y^n \to 0 \ \text{in } \hat H^{-1} \ (= \text{dual of } \hat H^1(\mathbb{R}^2,\mathbb{R}^3)), \tag{0.4}$$
$$\sup_{n\in\mathbb{N}}\int_{\mathbb{R}^2}|\nabla u^n|^2 < +\infty. \tag{0.5}$$
Our main result can be stated as follows.

Theorem 0.1. Assume that $H : \mathbb{R}^3 \to \mathbb{R}$ is of the form (0.3) with $H_\infty \in \mathbb{R}\setminus\{0\}$ and $K \in C^0(\mathbb{R}^3,\mathbb{R})$ satisfying (h1)–(h2). If $(u^n) \subset \hat H^1(\mathbb{R}^2,\mathbb{R}^3)$ is a sequence of approximate solutions for the $H$-system on $\mathbb{R}^2$, then either $\nabla u^n \to 0$ strongly in $L^2(\mathbb{R}^2,\mathbb{R}^6)$, or there exists an integer $\bar k > 0$ and, for every $i = 1,\dots,\bar k$, a sequence $(g_{n,i})$ of conformal transformations of $\mathbb{R}^2\cup\{\infty\}$ into itself and a mapping $U^i \in \hat H^1(\mathbb{R}^2,\mathbb{R}^3)$ such that, setting
$$u^{n,i} = u^n \circ g_{n,i} \quad \text{and} \quad p^{n,i} = \frac{1}{4\pi}\int_{\mathbb{R}^2}\mu^2u^{n,i},$$
for a subsequence of $(u^n)$, still denoted by $(u^n)$, one has:
(i) $u^{n,i} - p^{n,i} \to U^i$ weakly in $\hat H^1(\mathbb{R}^2,\mathbb{R}^3)$ and strongly in $H^1_{\mathrm{loc}}(\mathbb{R}^2,\mathbb{R}^3)$;
(ii) $U^i$ is an $H_{p^i}$-bubble, where $H_{p^i} = H(\cdot + p^i)$ if $p^{n,i} \to p^i \in \mathbb{R}^3$, or $H_{p^i} \equiv H_\infty$ if $|p^{n,i}| \to +\infty$;
(iii) $\displaystyle\lim_{n\to+\infty}\int_{\mathbb{R}^2}|\nabla u^n|^2 = \sum_{i=1}^{\bar k}\int_{\mathbb{R}^2}|\nabla U^i|^2$.
Thus, according to Theorem 0.1, a noncompact sequence of approximate solutions admits a limit configuration made by finitely many bubbles $U^1,\dots,U^{\bar k}$. However, differently from the case of constant $H$, now the bubbles $U^1,\dots,U^{\bar k}$ are not necessarily solutions of the starting problem, which is not invariant under translation in $\mathbb{R}^3$. In fact, when $U^i$ is an $H_{p^i}$-bubble with $p^i \in \mathbb{R}^3$, then $U^i + p^i$ is an $H$-bubble; otherwise $U^i$ is an $H_\infty$-bubble. In principle, we cannot say that only $H$-bubbles or only $H_\infty$-bubbles appear; both of them may exist in the same limit configuration (see [9] for examples of this type, for sequences of approximate solutions in $H_0^1(D,\mathbb{R}^3)$). We point out that $H$-bubbles may exist (see [6] and [7] for existence results). We also remark that the same bubble may be blown many times from a given sequence of approximate solutions, but with different concentration speeds (see Remark 3.10). Considering sequences of approximate solutions in $\hat H^1(\mathbb{R}^2,\mathbb{R}^3)$ rather than in $H_0^1(D,\mathbb{R}^3)$, as done in [5] and [18], does not constitute a substantial difference. The two cases are closely related, in view of the dilation invariance of $H$-systems. From a technical viewpoint, working in $\hat H^1(\mathbb{R}^2,\mathbb{R}^3)$ leads to some advantages since the domain has no boundary and one needs only interior estimates. On the other hand, considering the problem on the whole plane, the rescaling argument involves not only flat dilations and translations, as in (0.2). For our problem it is more convenient to work with a larger class of conformal transformations of the compactified plane, including, for instance, the inversion, and this requires some more care. Let us make some remarks about the proof of Theorem 0.1. In the spirit of the concentration-compactness method [15], the “bubble detector” is the concentration function, but the blow-up analysis is performed in a not completely standard way.
Usually, in studying a sequence $(u^n)$ of approximate solutions of some problem on which a noncompact invariance group $G$ acts, one sets up an iterative argument that, after $k$ steps, leads to the following situation: one has $k$ nontrivial solutions $U^1,\dots,U^k$ of limit problems, and $k$ rescaling sequences $(g_{n,1}),\dots,(g_{n,k}) \subset G$ such that the sequence $u^{n,k} = u^n - \sum_{i=1}^kU^i \circ g_{n,i}^{-1}$ is again a sequence of approximate solutions of the starting problem. Then one finds a new sequence of rescaling transformations $(g_{n,k+1}) \subset G$ such that the $(k+1)$th “bubble” $U^{k+1}$ is obtained essentially as a nontrivial weak limit of $(u^{n,k} \circ g_{n,k+1})$. In order for this argument to work, a key piece of information is that (∗) each $U^i$ is a solution of some limit problem.
Here we do not follow this procedure, in particular we do not show that (un,k ) is again a sequence of approximate solutions. In fact we do not use the information (∗), at least not in a full way. Instead, we proceed by an iterative argument which works in this way: after k steps, we have again k rescaling sequences (g˜ n,1 ), . . . , (g˜ n,k ) ⊂ G (for our problem G is the conformal group of the compactified plane) such that for every i = 1, . . . , k the sequence (un ◦ g˜ n,1 ◦ · · · ◦ g˜ n,i ), up to translations (in the image) weakly converges to a bubble U i . Then, in order to detect the (k + 1)th bubble U k+1 , we find another rescaling (g˜ n,k+1 ) ⊂ G for the sequence
$$\theta^{n,k} = u^{n,k} - U^k - U^{k-1}\circ\tilde g_{n,k} - \cdots - U^1\circ\tilde g_{n,2}\circ\cdots\circ\tilde g_{n,k}, \quad \text{where } u^{n,k} = u^n\circ\tilde g_{n,1}\circ\cdots\circ\tilde g_{n,k},$$
and we obtain $U^{k+1}$ passing to the limit on the sequence $(u^{n,k}\circ\tilde g_{n,k+1})$, and not on $(\theta^{n,k}\circ\tilde g_{n,k+1})$. What we need is that each sequence $(U^i\circ\tilde g_{n,i+1}\circ\cdots\circ\tilde g_{n,k+1})$, for $i = 1,\dots,k$, converges to a constant (in $H^1_{\mathrm{loc}}(\mathbb{R}^2,\mathbb{R}^3)$), and this depends just on the behaviour of the sequence of transformations $(\tilde g_{n,i+1}\circ\cdots\circ\tilde g_{n,k+1})$. Hence we sharply separate the study of the equation from the study of the action of the invariance group. So this strategy might hopefully be applied, with no change, to any noncompact problem which is invariant under the conformal group of the compactified plane. The paper is organized as follows: in Section 1 we discuss the conformal invariance of $H$-systems on $\mathbb{R}^2$ and we focus on some conformal transformations which play a key role in performing the blow-up analysis. In Section 2 we state some useful results concerning $H$-bubbles and $H$-systems; in particular we recall a local “ε-compactness” theorem for sequences of approximate solutions, already proved in [10], which constitutes one of the main tools in the argument. Section 3 contains the proof of Theorem 0.1. Finally, in Section 4 we suggest and anticipate possible applications of Theorem 0.1 or some variants which might be of some interest.

1. Conformal invariance Let $\mathbb{S}^2$ be the 2-dimensional sphere, i.e., $\mathbb{S}^2 = \{p \in \mathbb{R}^3 \mid |p| = 1\}$, let $\overline\sigma = (0,0,1)$ be the North Pole and $\underline\sigma = (0,0,-1)$ the South Pole in $\mathbb{S}^2$. We will denote by $\Pi$ the stereographic projection from the North Pole, which maps $\mathbb{S}^2$ to the compactified plane $\mathbb{R}^2\cup\{\infty\}$. The inverse of $\Pi$ is the mapping $(x,y) \mapsto (x\mu, y\mu, 1-\mu)$ with $\mu$ defined as in the Introduction. Moreover $\Pi(\underline\sigma) = 0$, $\Pi(\overline\sigma) = \infty$ and $\Pi(\mathbb{S}^2_-) = D_1(0)$, where $\mathbb{S}^2_-$ is the open lower hemisphere and $D_1(0)$ is the unit open disc in $\mathbb{R}^2$.
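In general, we will denote by $D_r(z)$ the open disc in $\mathbb{R}^2$ centered at $z \in \mathbb{R}^2$ and with radius $r > 0$, and by $N_r(\sigma)$ the spherical neighborhood in $\mathbb{S}^2$ centered at some $\sigma \in \mathbb{S}^2$ and with radius $r$, i.e., $N_r(\sigma) = \{\sigma' \in \mathbb{S}^2 \mid \operatorname{dist}_{\mathbb{S}^2}(\sigma',\sigma) < r\}$. Hence $\mathbb{S}^2_- = N_{\pi/2}(\underline\sigma)$. For every $u : \mathbb{S}^2 \to \mathbb{R}^3$ let $\hat u = u\circ\Pi^{-1} : \mathbb{R}^2 \to \mathbb{R}^3$. One has that $u \in H^1(\mathbb{S}^2,\mathbb{R}^3)$ if and only if $\hat u \in \hat H^1(\mathbb{R}^2,\mathbb{R}^3)$. Moreover for every Borel set $\Sigma \subset \mathbb{S}^2$ one has that
$$\int_\Sigma|du|^2 = \int_{\Pi(\Sigma)}|\nabla\hat u|^2,$$
where “$d$” denotes differentiation on the sphere while “$\nabla$” is the gradient for functions defined on $\mathbb{R}^2$. In addition
$$\frac{1}{4\pi}\int_{\mathbb{S}^2}u = \frac{1}{4\pi}\int_{\mathbb{R}^2}\hat u\,\mu^2.$$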
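The role of the weight $\mu^2$ as the area element of the sphere in the stereographic chart can be checked numerically. The following is a minimal sketch, not taken from the paper: the function names are hypothetical, and the quadrature is a plain midpoint rule in the radial variable. It verifies that the inverse projection lands on the unit sphere and that the integral of $\mu^2$ over a large disc approaches the area $4\pi$ of $\mathbb{S}^2$.

```python
import math


def mu(r: float) -> float:
    """Conformal factor mu = 2 / (1 + |z|^2), written as a function of r = |z|."""
    return 2.0 / (1.0 + r * r)


def inverse_stereographic(x: float, y: float) -> tuple:
    """Inverse of the projection from the North Pole: (x, y) -> (x*mu, y*mu, 1 - mu)."""
    m = mu(math.hypot(x, y))
    return (x * m, y * m, 1.0 - m)


def weighted_area(radius: float, steps: int = 200_000) -> float:
    """Midpoint rule for the integral of mu^2 over the disc of the given radius."""
    h = radius / steps
    total = 0.0
    for j in range(steps):
        r = (j + 0.5) * h
        total += 2.0 * math.pi * r * mu(r) ** 2 * h
    return total


if __name__ == "__main__":
    p = inverse_stereographic(0.3, -1.7)
    print(sum(c * c for c in p))          # squared norm of the image point
    print(weighted_area(50.0), 4.0 * math.pi)
```

Points of the open unit disc are sent to the lower hemisphere (negative third coordinate), and the origin is sent to the South Pole.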
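Given a conformal transformation $g : \mathbb{S}^2 \to \mathbb{S}^2$, let $\hat g : \mathbb{R}^2\cup\{\infty\} \to \mathbb{R}^2\cup\{\infty\}$ be the corresponding conformal transformation of the compactified plane into itself defined by $\hat g = \Pi\circ g\circ\Pi^{-1}$. Note that $\widehat{g^{-1}} = \hat g^{-1}$. Moreover, for $u \in H^1(\mathbb{S}^2,\mathbb{R}^3)$ one has that $\widehat{u\circ g} = \hat u\circ\hat g$ and for any Borel set $\Sigma \subset \mathbb{S}^2$ the following identities hold
$$\int_{\Pi(\Sigma)}|\nabla(\hat u\circ\hat g)|^2 = \int_\Sigma|d(u\circ g)|^2 = \int_{g(\Sigma)}|du|^2 = \int_{\Pi(g(\Sigma))}|\nabla\hat u|^2 = \int_{\hat g(\Pi(\Sigma))}|\nabla\hat u|^2. \tag{1.1}$$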
Warning! In the following, if not strictly necessary, we will omit the superscript $\hat{\ }$ and we identify a mapping $u$ defined on $\mathbb{S}^2$ with the corresponding one $\hat u$ defined on $\mathbb{R}^2\cup\{\infty\}$. The context will make clear the right interpretation. For example, writing $du$ we mean $u$ as a mapping on $\mathbb{S}^2$, whereas if $\nabla u$ appears then $u$ has to be considered as a mapping on $\mathbb{R}^2$. We also point out that convergence in $H^1_{\mathrm{loc}}(\mathbb{R}^2,\mathbb{R}^3)$ for a sequence of mappings on $\mathbb{R}^2$ is equivalent to convergence in $H^1_{\mathrm{loc}}(\mathbb{S}^2\setminus\{\overline\sigma\},\mathbb{R}^3)$ for the corresponding sequence of mappings on $\mathbb{S}^2$. The importance of conformal transformations for our problem is due to the fact that if $(g_n)$ is any sequence of conformal transformations and $(u^n)$ is a sequence of approximate solutions of an $H$-system on $\mathbb{R}^2$, then so is $(u^n\circ g_n)$. In particular $U$ is an $H$-bubble if and only if $U\circ g$ is so, for every conformal transformation $g$. Indeed the conformal invariance of the $H$-bubble problem reflects the fact that we deal with a problem which is geometrical in nature, since the true unknown in the $H$-bubble problem is the image of $U$, rather than the mapping $U$ itself. The conformal group of $\mathbb{S}^2$ contains some subgroups which play a special role in the argument we will develop in the proof of Theorem 0.1. Firstly we introduce and describe the subgroup of conformal transformations of $\mathbb{S}^2$ corresponding to dilations of $\mathbb{R}^2$. For every $r \in (0,\pi)$ let $\lambda(r) > 0$ be such that $\Pi(N_r(\underline\sigma)) = D_{\lambda(r)}(0)$. One can check that $\lambda(r) = \frac{\sin r}{1+\cos r}$. The mapping $r \mapsto \lambda(r)$ is a diffeomorphism of $(0,\pi)$ onto $(0,+\infty)$. In particular $\lambda(0^+) = 0^+$, $\lambda(\pi/2) = 1$, and $\lambda(\pi^-) = +\infty$. In addition $\operatorname{dist}_{\mathbb{S}^2}(\sigma,\underline\sigma) = r$ if and only if $|\Pi(\sigma)| = \lambda(r)$. It is convenient to introduce the following operation: for $r_1, r_2 \in (0,\pi)$ set
$$r_1\times r_2 := \arccos\frac{\cos r_1 + \cos r_2}{1 + (\cos r_1)(\cos r_2)}.$$
P. Caldiroli / Journal of Functional Analysis 257 (2009) 405–427
411
The operation × endows the open interval (0, π) with a structure of abelian group (the identity is π2 , the inverse of r is π − r). The following properties hold: if rn → 0+ if 0 < r1 < r2 < π
then r × rn → 0+
for every r ∈ (0, π),
then r × r1 < r × r2
for every r ∈ (0, π).
(1.2) (1.3)
Then, for every r ∈ (0, π) let δr : S2 → S2 be defined by δr (σ ) = Π −1 (λ(r)Π(σ )) for all σ ∈ S2 . Let us collect some properties of the mappings δr : (δ1) (δ2) (δ3) (δ4) (δ5) (δ6)
δr is a conformal transformation in S2 and δˆr (z) = λ(r)z ∀z ∈ R2 ; δr (σ ) = σ , δr (σ ) = σ ; δr1 ◦ δr2 = δr1 ×r2 ; δπ−r × δr = δ π2 = id, namely δr−1 = δπ−r ; distS2 (δr (σ ), σ ) = r × distS2 (σ, σ ); distS2 (δr (σ ), σ ) = (π − r) × distS2 (σ, σ ).
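Since for every $\sigma \in \mathbb{S}^2$ one can find a rotation $\rho : \mathbb{S}^2 \to \mathbb{S}^2$ such that $\rho(\underline\sigma) = \sigma$ (see Remark 1.1 below), every spherical neighborhood $N_r(\sigma)$ in $\mathbb{S}^2$ can be represented as the image of the lower hemisphere $\mathbb{S}^2_-$ through a conformal transformation of the form $\rho\circ\delta_r$. Now let us discuss some subgroups of rotations of $\mathbb{S}^2$. Rotations of an angle $\alpha$ around the $p_3$-axis and of an angle $\beta$ around the $p_1$-axis are, respectively, the isometries $\sigma \mapsto R_\alpha^3\sigma$ and $\sigma \mapsto R_\beta^1\sigma$ where
$$R_\alpha^3 = \begin{pmatrix}\cos\alpha & -\sin\alpha & 0 \\ \sin\alpha & \cos\alpha & 0 \\ 0 & 0 & 1\end{pmatrix}, \qquad R_\beta^1 = \begin{pmatrix}1 & 0 & 0 \\ 0 & \cos\beta & -\sin\beta \\ 0 & \sin\beta & \cos\beta\end{pmatrix}.$$
Note that in complex notation
$$\hat R_\alpha^3(z) = e^{i\alpha}z, \qquad \hat R_\beta^1(z) = \frac{z\cos\frac{\beta}{2} + i\sin\frac{\beta}{2}}{iz\sin\frac{\beta}{2} + \cos\frac{\beta}{2}}. \tag{1.4}$$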
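Formula (1.4) can be checked numerically by conjugating the rotation matrices with the stereographic projection from the North Pole. The sketch below uses hypothetical helper names and a single arbitrarily chosen test point; it is only an illustration of the identity $\hat R = \Pi\circ R\circ\Pi^{-1}$, not a proof.

```python
import cmath
import math


def inverse_stereographic(z: complex) -> tuple:
    """Inverse projection z = x + iy -> (x*mu, y*mu, 1 - mu), mu = 2 / (1 + |z|^2)."""
    m = 2.0 / (1.0 + abs(z) ** 2)
    return (z.real * m, z.imag * m, 1.0 - m)


def stereographic(p: tuple) -> complex:
    """Projection from the North Pole; p must not be the North Pole itself."""
    x, y, t = p
    return complex(x, y) / (1.0 - t)


def rot_p3(alpha: float, p: tuple) -> tuple:
    """Rotation of angle alpha around the p3-axis."""
    x, y, t = p
    return (x * math.cos(alpha) - y * math.sin(alpha),
            x * math.sin(alpha) + y * math.cos(alpha), t)


def rot_p1(beta: float, p: tuple) -> tuple:
    """Rotation of angle beta around the p1-axis."""
    x, y, t = p
    return (x, y * math.cos(beta) - t * math.sin(beta),
            y * math.sin(beta) + t * math.cos(beta))


def mobius_p1(beta: float, z: complex) -> complex:
    """Right-hand side of (1.4) for the rotation around the p1-axis."""
    c, s = math.cos(beta / 2.0), math.sin(beta / 2.0)
    return (z * c + 1j * s) / (1j * z * s + c)


if __name__ == "__main__":
    z, alpha, beta = 0.3 + 0.4j, 0.8, 1.1
    print(stereographic(rot_p3(alpha, inverse_stereographic(z))), cmath.exp(1j * alpha) * z)
    print(stereographic(rot_p1(beta, inverse_stereographic(z))), mobius_p1(beta, z))
```

Evaluating `mobius_p1` at $\beta = \pi$ reproduces the inversion $z \mapsto 1/z$ up to rounding.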
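In particular $\hat R_\pi^1$ is the inversion $z \mapsto \frac{1}{z}$.

Remark 1.1. Every point $\sigma \in \mathbb{S}^2$ can be expressed in the form
$$\sigma = \begin{bmatrix}-(\sin\alpha)(\sin\beta) \\ (\cos\alpha)(\sin\beta) \\ -\cos\beta\end{bmatrix}$$
with $\alpha \in [0,2\pi)$ and $\beta \in [0,\pi]$. Hence the mapping $\rho_\sigma := R_\alpha^3R_\beta^1$ is a rotation of $\mathbb{S}^2$ such that $\rho_\sigma(\underline\sigma) = \sigma$. In the following, for $\sigma \in \mathbb{S}^2$, we denote by $\rho_\sigma$ the rotation of $\mathbb{S}^2$ defined as above. Note that if $g = \rho_\sigma\circ\delta_r$ then $g(\mathbb{S}^2_-) = N_r(\sigma)$ and, by (δ2), $g(\underline\sigma) = \sigma$ and $g(\overline\sigma) = -\sigma$. In the proof of Theorem 0.1 we will handle sequences $(g_n)$ of conformal transformations of $\mathbb{S}^2$ of the form $g_n = \rho_{\sigma_n}\circ\delta_{r_n}$. Here we discuss some preliminary results about such sequences.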
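Lemma 1.2. Let $u \in \hat H^1(\mathbb{R}^2,\mathbb{R}^3)$, $(\sigma_n) \subset \mathbb{S}^2$, $(r_n) \subset (0,\pi)$ and set $g_n = \rho_{\sigma_n}\circ\delta_{r_n}$. Then for every $z \in \mathbb{R}^2$ there exists a sequence $(\tilde\sigma_n) \subset \mathbb{S}^2$ such that
$$\int_{D_1(z)}|\nabla(u\circ g_n)|^2 \leq \int_{N_{r_n}(\tilde\sigma_n)}|du|^2.$$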
Proof. Fix z ∈ R2 and let σ = Π −1 (z). Define σ˜ n = (ρσn ◦ ρσ )(σ ) and ρ˜n = ρσn ◦ ρσ . Hence ρ˜n is a rotation of S2 with ρ˜n (σ ) = σ˜ n . We have that
|du| = 2
Nrn (σ˜ n )
d(u ◦ gn ) 2 =
gn−1 (Nrn (σ˜ n ))
∇(u ◦ gn ) 2
1 −1 λ(rn ) (Π◦ρσ ◦Π )(Dλ(rn ) (0))
because gn−1 (Nrn (σ˜ n )) = (δr−1 ◦ ρσ−1 ◦ ρ˜n ◦ δrn )(S2− ) = (δr−1 ◦ ρσ ◦ δrn )(S2− ) and n n n −1 1 (Π ◦ δr−1 ◦ ρσ ◦ δr )(S2− ) = (δ r ◦ ρˆσ ◦ δˆr )(D1 (0)) = λ(r) ρˆσ (Dλ(r) (0)) (here we use (δ1) and −1 = δˆ−1 ). Moreover, since σ = σ , for all λ > 0 one has that ρˆ (D (0)) ⊃ D (Π(σ )). Thus δ r
r 1 λ(rn ) ρˆσ (Dλ(rn ) (0)) ⊃ D1 (z)
and the conclusion follows.
2
σ
λ
λ
Lemma 1.3. Let (rn ) ⊂ (0, π), (σn ) ⊂ S2 , and define gn := ρσn ◦ δrn . If rn → 0+ and σn → σ 2 2 then gn → σ in L∞ loc (S \ {σ }, S ). In particular, setting dn = distS2 (σn , σ ): 2 2 (i) if drnn → +∞ then, up to a subsequence, δdn ◦ gn → σ ∗ in L∞ loc (S \ {σ }, S ) for some ∗ 2 σ ∈ S \ {σ }; 2 2 (ii) if drnn → ∈ [0, +∞) then, up to a subsequence, δrn ◦ gn → g in L∞ loc (S \ {σ }, S ), where
g = Π −1 ◦ gˆ ◦ Π , with g(z) ˆ =
eiα z−i
for some α ∈ R.
Note that, according to Lemma 1.3, part (ii), any conformal transformation of S2 corresponding to a translation of the plane can be obtained as a limit of conformal transformations made by multiple composition of dilations (i.e. δr -type mappings) and rotations. Proof. Since σn = gn (σ ) and any ρn is an isometry of S2 , thanks to (δ2) and (δ5), for every σ ∈ S2 one has that distS2 gn (σ ), σn = distS2 δrn (σ ), δrn (σ ) = distS2 δrn (σ ), σ = rn × distS2 (σ, σ ). If σ ∈ S2 \ Nr (σ ) then distS2 (σ, σ ) π − r and consequently, by (1.3), sup distS2 gn (σ ), σ distS2 (σn , σ ) + rn × (π − r).
σ ∈S2 \Nr σ
Since rn → 0+ and σn → σ , as r is any value in (0, π), by (1.2), one obtains that gn → σ 2 2 2 in L∞ loc (S \ {σ }, S ). Notice that it is not possible that gn (σ ) → σ uniformly on S since
gn (σ ) → σ . In order to show the second part, it is convenient to study the corresponding sequence (gˆ n ) of conformal transformations of R2 ∪ {∞}. In fact we identify R2 ∪ {∞} with the compactified complex field C ∪ {∞} =: C∗ . Representing σn in the form ⎡
⎤ −(sin αn )(sin βn ) σn = ⎣ (cos αn )(cos βn ) ⎦ − cos βn with αn ∈ [0, 2π) and βn ∈ [0, π], by (1.4) one has gˆ n (z) = eiαn
λn n z + i ˜n iλn ˜n z + n
where λn = λ(rn ),
n = cos
βn , 2
˜n = sin
βn . 2
Observe that n ∼ d2n → 0+ , ˜n → 1− , λn ∼ rn → 0+ and, up to a subsequence αn → α ∈ [0, 2π]. If drnn → +∞, then λnn → +∞ and n gˆ n (z) = eiαn
λn n z + i ˜n i λnn ˜n z + 1
→ ieiα
in L∞ loc (C, C)
2 2 namely δdn ◦ gn → σ ∗ = Π −1 (2ieiα ) in L∞ loc (S \ {σ }, S ). If and
λn gˆ n (z) = eiαn
dn rn
λn n z + i ˜n 2eiα =: g(z) ˆ n → i ˜n z + λn 2z − i
2 2 namely δrn ◦ gn → g = Π −1 ◦ gˆ ◦ Π in L∞ loc (S \ {σ }, S ).
→ ∈ [0, +∞), then
n λn
→
2
∗ in L∞ loc (C, C )
2
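Lemma 1.4. If $(g_n)$ is a sequence of conformal transformations of $S^2$ such that $g_n \to \sigma^*$ in $L^\infty_{loc}(S^2 \setminus \{\tilde\sigma\}, S^2)$, for some $\sigma^*$ and $\tilde\sigma$ in $S^2$, then for every $u \in H^1(S^2, \mathbb{R}^3)$ one has that $u \circ g_n \to \mathrm{const}$ weakly in $H^1(S^2, \mathbb{R}^3)$ and strongly in $H^1_{loc}(S^2 \setminus \{\tilde\sigma\}, \mathbb{R}^3)$.

Proof. Fix $r > 0$ and a Borel set $\Sigma \subset\subset S^2 \setminus \{\tilde\sigma\}$. Since $g_n \to \sigma^*$ in $L^\infty_{loc}(S^2 \setminus \{\tilde\sigma\}, S^2)$, there exists $\bar n \in \mathbb{N}$ such that $\mathrm{dist}_{S^2}(g_n(\sigma), \sigma^*) < r$ for all $\sigma \in \Sigma$ and for all $n \ge \bar n$, namely $g_n(\Sigma) \subset N_r(\sigma^*)$ for all $n \ge \bar n$. Hence, by (1.1),
\[
\int_\Sigma |d(u \circ g_n)|^2 = \int_{g_n(\Sigma)} |du|^2 \le \int_{N_r(\sigma^*)} |du|^2 .
\]
By the absolute continuity of the integral we infer that $\int_\Sigma |d(u \circ g_n)|^2 \to 0$. Since the sequence $(u \circ g_n)$ is bounded in $H^1(S^2, \mathbb{R}^3)$ we obtain that $u \circ g_n \to \mathrm{const}$ strongly in $H^1_{loc}(S^2 \setminus \{\tilde\sigma\}, \mathbb{R}^3)$ and then also weakly in $H^1(S^2, \mathbb{R}^3)$. □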
2. Some auxiliary results about H-systems

In the proof of Theorem 0.1 we will use only a few, but deep, results about H-bubbles and about the behaviour of a sequence of approximate solutions of the H-system. In particular we will need: (1) a positive lower bound for the Dirichlet integral of H-bubbles, (2) a local compactness result for a sequence of approximate solutions of the H-system. The assumptions made on $H$ play a role in both these questions, and only for these. Regarding H-bubbles, the following result holds.

Lemma 2.1. Let $H : \mathbb{R}^3 \to \mathbb{R}$ be a function of the form (0.3) with $H_\infty \in \mathbb{R} \setminus \{0\}$ and $K \in C^0(\mathbb{R}^3, \mathbb{R})$ satisfying $(h_1)$. Let $q \in \mathbb{R}^3$ and set $H_q(p) = H(p + q)$ for all $p \in \mathbb{R}^3$. If $U$ is an $H_q$-bubble for some $q \in \mathbb{R}^3$, i.e., $U + q$ is an $H$-bubble, then
\[
\int_{\mathbb{R}^2} |\nabla U|^2 \ge 8\pi \left( \frac{1 - M_K}{H_\infty} \right)^2 .
\]
The same holds if $U$ is an $H_\infty$-bubble.

Remark 2.2. Under the assumptions made on $H$ an $H$-bubble is in fact bounded (see, e.g., [10] or [16]). As soon as the prescribed mapping $H$ is slightly more regular than continuous, e.g. locally Lipschitz continuous, regularity theory for H-systems (see, for instance, [12] or [1]) applies and guarantees that $U$ is in fact of class $C^{2,\alpha}$ as a map on $S^2$. Furthermore, by known arguments, $U$ solves the conformality conditions $U_x \cdot U_y = 0$ and $|U_x|^2 = |U_y|^2$ on $\mathbb{R}^2$. Hence $U$ describes a parametric surface $S$ of the type of the sphere such that the mean curvature of $S$ at any regular point $p \in S$ equals $H(p)$. Moreover the Dirichlet integral of $U$ is equal to (twice) the area of the $H$-bubble, as a parametric surface.

Proof. Taking $U + q$ instead of $U$ it is not restrictive to assume that $U$ is an $H$-bubble. Multiplying $\Delta U = 2H(U) U_x \wedge U_y$ by $U$ and integrating (notice that $U \in L^\infty$, see Remark 2.2) we obtain that
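\[
\int_{\mathbb{R}^2} |\nabla U|^2 = -2H_\infty \int_{\mathbb{R}^2} U \cdot U_x \wedge U_y - 2 \int_{\mathbb{R}^2} K(U)\, U \cdot U_x \wedge U_y .
\]
Using $(h_1)$ and the isoperimetric inequality (see [3] or [20])
\[
\left| \int_{\mathbb{R}^2} U \cdot U_x \wedge U_y \right| \le \frac{1}{4\sqrt{2\pi}} \|\nabla U\|_2^3 ,
\]
we infer that
\[
(1 - M_K) \int_{\mathbb{R}^2} |\nabla U|^2 \le \frac{|H_\infty|}{\sqrt{8\pi}} \left( \int_{\mathbb{R}^2} |\nabla U|^2 \right)^{3/2}
\]
and the conclusion easily follows. □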
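To spell out the last step of the proof of Lemma 2.1, one can chain the identity with the isoperimetric inequality. This is a sketch under two assumptions that are not stated in this section: that $(h_1)$ enters through the bound $2\left|\int K(U)\,U \cdot U_x \wedge U_y\right| \le M_K \int |\nabla U|^2$ with $M_K < 1$, and that $\|\nabla U\|_2 > 0$ because a bubble is nonconstant.

```latex
% Assumption: (h_1) is used as 2|\int K(U) U . U_x ^ U_y| <= M_K \int |\nabla U|^2, with M_K < 1.
\begin{align*}
(1 - M_K)\,\|\nabla U\|_2^2
  &\le 2|H_\infty| \left| \int_{\mathbb{R}^2} U \cdot U_x \wedge U_y \right|
   \le \frac{2|H_\infty|}{4\sqrt{2\pi}}\,\|\nabla U\|_2^3
   = \frac{|H_\infty|}{\sqrt{8\pi}}\,\|\nabla U\|_2^3 , \\
% Divide by \|\nabla U\|_2^2 > 0 and then square both sides.
\|\nabla U\|_2
  &\ge \frac{\sqrt{8\pi}\,(1 - M_K)}{|H_\infty|}
  \quad\Longrightarrow\quad
  \int_{\mathbb{R}^2} |\nabla U|^2 \ge 8\pi \left( \frac{1 - M_K}{H_\infty} \right)^2 .
\end{align*}
```

The constant matches because $2/(4\sqrt{2\pi}) = 1/(2\sqrt{2\pi}) = 1/\sqrt{8\pi}$.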
Concerning the behaviour of a sequence of approximate solutions, the following local compactness result holds.
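Theorem 2.3. Let $H : \mathbb{R}^3 \to \mathbb{R}$ be a function of the form (0.3) with $H_\infty \in \mathbb{R} \setminus \{0\}$ and $K \in C^0(\mathbb{R}^3, \mathbb{R})$ satisfying $(h_1)$–$(h_2)$. Assume that $(u^n) \subset \hat H^1(\mathbb{R}^2, \mathbb{R}^3)$ satisfies (0.4) and $u^n - p^n \to U$ weakly in $\hat H^1$, where $p^n = \fint_{S^2} u^n$. Then, for every disc $D$ in $\mathbb{R}^2$ such that
\[
\limsup \int_D |\nabla u^n|^2 < \frac{\pi}{8} \left( \frac{1 - M_K}{H_\infty} \right)^2
\]
one has that $U \in L^\infty_{loc}(D, \mathbb{R}^3)$ and $u^n - p^n \to U$ strongly in $H^1_{loc}(D, \mathbb{R}^3)$.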
A proof of Theorem 2.3 can be found in [10]. In particular the case $(p^n)$ bounded corresponds to Lemma 2.2 in the above mentioned paper. The case $|p^n| \to +\infty$ follows from Lemma 3.3, steps 1 and 2 in the proof of Lemma 3.4, and Remark 2.3 in [10].

3. Proof of Theorem 0.1

From now on, $H : \mathbb{R}^3 \to \mathbb{R}$ is a given function of the form (0.3) with $H_\infty \in \mathbb{R} \setminus \{0\}$ and $K \in C^0(\mathbb{R}^3, \mathbb{R})$ satisfying $(h_1)$–$(h_2)$. We take a sequence $(u^n)$ of approximate solutions for the H-system on $\mathbb{R}^2$, namely a sequence in $\hat H^1(\mathbb{R}^2, \mathbb{R}^3)$ satisfying (0.4) and (0.5). As we will see, the sequence $(u^n)$ might converge weakly to some bubble $U^1$ without necessity of rescaling. In this case the first true blow-up occurs at the next step, for the second bubble $U^2$, and the iterative argument will start just from this step on. Therefore the proof will be split in the following parts:

3.1 Looking for the first bubble.
3.2 Looking for the second bubble.
3.3 The iterative argument.
3.4 Conclusion.

To begin, in order to use Theorem 2.3, we fix $\varepsilon$ such that
\[
0 < \varepsilon < \frac{\pi}{8} \left( \frac{1 - M_K}{H_\infty} \right)^2 .
\tag{3.1}
\]
3.1. Looking for the first bubble

Here we prove that:

Lemma 3.1.
(1) If $\liminf \|\nabla u^n\|_2 < \sqrt{8\pi}(1 - M_K)/|H_\infty|$, then $\liminf \|\nabla u^n\|_2 = 0$.
(2) If $\liminf \|\nabla u^n\|_2 \ge \sqrt{8\pi}(1 - M_K)/|H_\infty|$ then there exist sequences $(\sigma_n^1) \subset S^2$, $(r_n^1) \subset (0, \pi)$ defined by
\[
\sup_{\sigma \in S^2} \int_{N_{r_n^1}(\sigma)} |du^n|^2 = \int_{N_{r_n^1}(\sigma_n^1)} |du^n|^2 = \varepsilon ,
\]
and a mapping $U^1 \in \hat H^1(\mathbb{R}^2, \mathbb{R}^3)$ such that, setting
\[
g_n^1 = \rho_{\sigma_n^1} \circ \delta_{r_n^1}, \qquad u^{n,1} = u^n \circ g_n^1, \qquad p^{n,1} = \fint_{S^2} u^{n,1},
\tag{3.2}
\]
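one has that, for a subsequence,
(i) $u^{n,1} - p^{n,1} \to U^1$ weakly in $\hat H^1(\mathbb{R}^2, \mathbb{R}^3)$ and strongly in $H^1_{loc}(\mathbb{R}^2, \mathbb{R}^3)$;
(ii) $U^1$ is an $H_{p^1}$-bubble if $p^{n,1} \to p^1 \in \mathbb{R}^3$, whereas $U^1$ is an $H_\infty$-bubble if $|p^{n,1}| \to +\infty$.

Proof. (1) Assume that $\liminf \|\nabla u^n\|_2 < \sqrt{8\pi}(1 - M_K)/|H_\infty|$. Arguing by contradiction, if $\liminf \|\nabla u^n\|_2 > 0$, then there exists $\bar\varepsilon \in (0, \varepsilon]$ such that $\|\nabla u^n\|_2^2 > \bar\varepsilon$ for all $n \in \mathbb{N}$ large enough. Introducing the concentration functions
\[
(0, \pi) \ni r \mapsto Q_n(r) = \sup_{\sigma \in S^2} \int_{N_r(\sigma)} |du^n|^2 ,
\]
by a standard procedure, for every $n \in \mathbb{N}$ (large enough) we can find $\sigma_n^1 \in S^2$ and $r_n^1 \in (0, \pi)$ such that
\[
\sup_{\sigma \in S^2} \int_{N_{r_n^1}(\sigma)} |du^n|^2 = \int_{N_{r_n^1}(\sigma_n^1)} |du^n|^2 = \bar\varepsilon .
\tag{3.3}
\]
Define $g_n^1$, $u^{n,1}$ and $p^{n,1}$ as in (3.2). Since the sequence $(u^{n,1} - p^{n,1})$ is bounded in $\hat H^1(\mathbb{R}^2, \mathbb{R}^3)$, there exists $U^1 \in \hat H^1(\mathbb{R}^2, \mathbb{R}^3)$ such that, for a subsequence, $u^{n,1} - p^{n,1} \to U^1$ weakly in $\hat H^1(\mathbb{R}^2, \mathbb{R}^3)$. By conformal invariance, the sequence $(u^{n,1})$ satisfies (0.4). Moreover, by Lemma 1.2 for every $z \in \mathbb{R}^2$ there exists a sequence $(\tilde\sigma_n) \subset S^2$ such that $\int_{D_1(z)} |\nabla u^{n,1}|^2 \le \int_{N_{r_n^1}(\tilde\sigma_n)} |du^n|^2$ and then, by (3.3),
\[
\int_{D_1(z)} |\nabla u^{n,1}|^2 \le \bar\varepsilon \qquad \forall n \text{ large enough.}
\]
By Theorem 2.3 and by a diagonal argument we obtain that $u^{n,1} - p^{n,1} \to U^1$ strongly in $H^1_{loc}(\mathbb{R}^2, \mathbb{R}^3)$. Then $U^1$ solves $\Delta u = 2H_{p^1}(u) u_x \wedge u_y$ on $\mathbb{R}^2$ where $H_{p^1}(p) = H(p + p^1)$ if $p^{n,1} \to p^1 \in \mathbb{R}^3$, or $H_{p^1} \equiv H_\infty$ if $|p^{n,1}| \to +\infty$. Since
\[
\int_{D_1(0)} |\nabla u^{n,1}|^2 = \int_{S^2_-} |du^{n,1}|^2 = \int_{N_{r_n^1}(\sigma_n^1)} |du^n|^2 = \bar\varepsilon > 0
\]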
and $u^{n,1} - p^{n,1} \to U^1$ strongly in $H^1_{loc}(\mathbb{R}^2, \mathbb{R}^3)$, we infer that $U^1$ is nonconstant, namely it is an $H_{p^1}$-bubble. From Lemma 2.1 and from lower semicontinuity it follows that
\[
8\pi \left( \frac{1 - M_K}{H_\infty} \right)^2 \le \int_{\mathbb{R}^2} |\nabla U^1|^2 \le \liminf \int_{\mathbb{R}^2} |\nabla u^{n,1}|^2 = \liminf \int_{\mathbb{R}^2} |\nabla u^n|^2 ,
\]
contrary to the assumption that $\liminf \|\nabla u^n\|_2 < \sqrt{8\pi}(1 - M_K)/|H_\infty|$.

(2) Since $\liminf \|\nabla u^n\|_2^2 > \varepsilon$ (recall that $\varepsilon$ has been fixed at the beginning, and satisfies (3.1)), we are in the same position as in Part (1), with $\varepsilon$ instead of $\bar\varepsilon$, and we conclude repeating the same argument. □

3.2. Looking for the second bubble

Here we assume that the sequence $(u^n)$ fixed at the beginning satisfies $\liminf \|\nabla u^n\|_2 \ge \sqrt{8\pi}(1 - M_K)/|H_\infty|$. Hence we are in the case (2) of Lemma 3.1, namely $(u^n)$ blows a first bubble $U^1$. Set $\theta^{n,1} = u^{n,1} - U^1$ with $u^{n,1}$ defined as in (3.2). Observe that
\[
\int_{\mathbb{R}^2} |\nabla \theta^{n,1}|^2 = \int_{\mathbb{R}^2} |\nabla u^n|^2 - \int_{\mathbb{R}^2} |\nabla U^1|^2 + o(1).
\]
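The splitting of the Dirichlet integral can be checked by expanding the square. This is a sketch which assumes only what Lemma 3.1 provides, namely that $\nabla u^{n,1} \to \nabla U^1$ weakly in $L^2(\mathbb{R}^2, \mathbb{R}^6)$ (the constants $p^{n,1}$ do not affect gradients), together with the conformal invariance of the Dirichlet integral.

```latex
\begin{align*}
\int_{\mathbb{R}^2} |\nabla \theta^{n,1}|^2
  &= \int_{\mathbb{R}^2} |\nabla u^{n,1}|^2
     - 2 \int_{\mathbb{R}^2} \nabla u^{n,1} \cdot \nabla U^1
     + \int_{\mathbb{R}^2} |\nabla U^1|^2 \\
% weak convergence: \int \nabla u^{n,1} . \nabla U^1 \to \int |\nabla U^1|^2
% conformal invariance: \int |\nabla u^{n,1}|^2 = \int |\nabla (u^n \circ g_n^1)|^2 = \int |\nabla u^n|^2
  &= \int_{\mathbb{R}^2} |\nabla u^{n}|^2
     - 2 \int_{\mathbb{R}^2} |\nabla U^1|^2
     + \int_{\mathbb{R}^2} |\nabla U^1|^2 + o(1) \\
  &= \int_{\mathbb{R}^2} |\nabla u^{n}|^2 - \int_{\mathbb{R}^2} |\nabla U^1|^2 + o(1).
\end{align*}
```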
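We show that:

Lemma 3.2.
(1) If $\liminf \|\nabla \theta^{n,1}\|_2 < \sqrt{8\pi}(1 - M_K)/|H_\infty|$ then $\liminf \|\nabla \theta^{n,1}\|_2 = 0$.
(2) If $\liminf \|\nabla \theta^{n,1}\|_2 \ge \sqrt{8\pi}(1 - M_K)/|H_\infty|$ then there exist sequences $(\sigma_n^2) \subset S^2$, $(r_n^2) \subset (0, \pi)$ defined by
\[
\sup_{\sigma \in S^2} \int_{N_{r_n^2}(\sigma)} |d\theta^{n,1}|^2 = \int_{N_{r_n^2}(\sigma_n^2)} |d\theta^{n,1}|^2 = \varepsilon ,
\tag{3.4}
\]
and a mapping $U^2 \in \hat H^1(\mathbb{R}^2, \mathbb{R}^3)$ such that, setting
\[
g_n^2 = \rho_{\sigma_n^2} \circ \delta_{r_n^2}, \qquad u^{n,2} = u^{n,1} \circ g_n^2, \qquad p^{n,2} = \fint_{S^2} u^{n,2},
\tag{3.5}
\]
one has that, for a subsequence,
(i) $r_n^2 \to 0$, $\sigma_n^2 \to \overline{\sigma}$;
(ii) $u^{n,2} - p^{n,2} \to U^2$ weakly in $\hat H^1(\mathbb{R}^2, \mathbb{R}^3)$ and strongly in $H^1_{loc}(\mathbb{R}^2, \mathbb{R}^3)$;
(iii) $U^2$ is an $H_{p^2}$-bubble if $p^{n,2} \to p^2 \in \mathbb{R}^3$, whereas $U^2$ is an $H_\infty$-bubble if $|p^{n,2}| \to +\infty$.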
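Proof. (1) Arguing by contradiction, suppose that $\liminf \|\nabla \theta^{n,1}\|_2 > 0$. Then there exists $\bar\varepsilon \in (0, \varepsilon]$ such that $\|\nabla \theta^{n,1}\|_2^2 > \bar\varepsilon$ for all $n \in \mathbb{N}$ large enough. As at the beginning of the proof of Lemma 3.1, for every $n \in \mathbb{N}$ (large enough) we can find $\sigma_n^2 \in S^2$ and $r_n^2 \in (0, \pi)$ satisfying (3.4) with $\bar\varepsilon$ instead of $\varepsilon$.

Lemma 3.3. One has that $r_n^2 \to 0$, $\sigma_n^2 \to \overline{\sigma}$ and $g_n^2 := \rho_{\sigma_n^2} \circ \delta_{r_n^2} \to \overline{\sigma}$ in $L^\infty_{loc}(S^2 \setminus \{\overline{\sigma}\}, S^2)$. Moreover $U^1 \circ g_n^2 \to \mathrm{const}$ weakly in $H^1(S^2, \mathbb{R}^3)$ and strongly in $H^1_{loc}(S^2 \setminus \{\overline{\sigma}\}, \mathbb{R}^3)$.

Proof. According to Lemma 3.1, $u^{n,1} - p^{n,1} \to U^1$ strongly in $H^1_{loc}(\mathbb{R}^2, \mathbb{R}^3)$, namely for all $r > 0$ one has that $\int_{S^2 \setminus N_r(\overline{\sigma})} |d\theta^{n,1}|^2 \to 0$. Since $\int_{S^2} |d\theta^{n,1}|^2 > \bar\varepsilon$ for every $n$ large, it follows that $\liminf \int_{N_r(\overline{\sigma})} |d\theta^{n,1}|^2 > \bar\varepsilon$. Hence for every $r > 0$ there exists $n_r \in \mathbb{N}$ such that
\[
\sup_{\sigma \in S^2} \int_{N_r(\sigma)} |d\theta^{n,1}|^2 > \bar\varepsilon \qquad \forall n \ge n_r .
\]
Therefore $r_n^2 < r$ for all $n \ge n_r$. That is $r_n^2 \to 0$. Moreover we claim that $\sigma_n^2 \to \overline{\sigma}$. Indeed otherwise, for a subsequence, $\sigma_n^2 \to \sigma \ne \overline{\sigma}$. Let $0 < r < \mathrm{dist}_{S^2}(\sigma, \overline{\sigma})$. Then, on one hand, $d\theta^{n,1} \to 0$ strongly in $L^2(N_r(\sigma), \mathbb{R}^6)$ because $u^{n,1} - p^{n,1} \to U^1$ strongly in $H^1_{loc}(S^2 \setminus \{\overline{\sigma}\}, \mathbb{R}^3)$. On the other hand, since $r_n^2 \to 0$ and $\sigma_n^2 \to \sigma$, for $n \in \mathbb{N}$ large enough $N_{r_n^2}(\sigma_n^2) \subset N_r(\sigma)$ and $\int_{N_r(\sigma)} |d\theta^{n,1}|^2 \ge \int_{N_{r_n^2}(\sigma_n^2)} |d\theta^{n,1}|^2 = \bar\varepsilon > 0$, a contradiction. Hence it must be that $\sigma_n^2 \to \overline{\sigma}$. The rest of the statement follows from Lemmas 1.3 and 1.4. □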
Define $u^{n,2}$ and $p^{n,2}$ as in (3.5). Since $(u^{n,2})$ satisfies (0.5), the sequence $(u^{n,2} - p^{n,2})$ is bounded in $\hat H^1(\mathbb{R}^2, \mathbb{R}^3)$ and, after extracting a subsequence, if necessary, we may assume that $u^{n,2} - p^{n,2} \to U^2$ weakly in $\hat H^1(\mathbb{R}^2, \mathbb{R}^3)$ for some $U^2 \in \hat H^1(\mathbb{R}^2, \mathbb{R}^3)$. As a next step we show that:

Lemma 3.4. The mapping $U^2$ is an $H_{p^2}$-bubble, where $H_{p^2}(p) = H(p + p^2)$ for all $p \in \mathbb{R}^3$ if $p^{n,2} \to p^2 \in \mathbb{R}^3$, or $H_{p^2}(p) \equiv H_\infty$ if $|p^{n,2}| \to +\infty$.

Proof. Our purpose is to apply Theorem 2.3 to the sequence $(u^{n,2})$. Notice that, by conformal invariance, $(u^{n,2})$ satisfies (0.4). Arguing as in the proof of Lemma 3.1, we find that for every $z \in \mathbb{R}^2$
\[
\int_{D_1(z)} |\nabla(u^{n,2} - U^1 \circ g_n^2)|^2 = \int_{D_1(z)} |\nabla(\theta^{n,1} \circ g_n^2)|^2 \le \bar\varepsilon \qquad \forall n \in \mathbb{N} \text{ large enough.}
\]
By Lemma 3.3 we infer that
\[
\limsup \int_{D_1(z)} |\nabla u^{n,2}|^2 \le \bar\varepsilon \qquad \forall z \in \mathbb{R}^2 ,
\]
and then, thanks to Theorem 2.3 and by a diagonal argument we obtain that $u^{n,2} - p^{n,2} \to U^2$ strongly in $H^1_{loc}(\mathbb{R}^2, \mathbb{R}^3)$. Then $U^2$ solves $\Delta u = 2H_{p^2}(u) u_x \wedge u_y$ on $\mathbb{R}^2$ where $H_{p^2}(p) = H(p + p^2)$ if $p^{n,2} \to p^2 \in \mathbb{R}^3$, or $H_{p^2} \equiv H_\infty$ if $|p^{n,2}| \to +\infty$. Using again Lemma 3.3 and (1.1), we have that
\[
\int_{D_1(0)} |\nabla u^{n,2}|^2 = \int_{S^2_-} |d(\theta^{n,1} \circ g_n^2 + U^1 \circ g_n^2)|^2 = \int_{N_{r_n^2}(\sigma_n^2)} |d\theta^{n,1}|^2 + o(1) = \bar\varepsilon + o(1),
\]
and since $u^{n,2} - p^{n,2} \to U^2$ strongly in $H^1_{loc}(\mathbb{R}^2, \mathbb{R}^3)$, we infer that $U^2$ is nonconstant, namely it is an $H_{p^2}$-bubble. □
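Let us complete the proof of Lemma 3.2. By Lemma 3.3, $\nabla(\theta^{n,1} \circ g_n^2) = \nabla u^{n,2} - \nabla(U^1 \circ g_n^2) \to \nabla U^2$ weakly in $L^2(\mathbb{R}^2, \mathbb{R}^6)$. Then, by Lemma 2.1 and by lower semicontinuity,
\[
8\pi \left( \frac{1 - M_K}{H_\infty} \right)^2 \le \int_{\mathbb{R}^2} |\nabla U^2|^2 \le \liminf \int_{\mathbb{R}^2} |\nabla(\theta^{n,1} \circ g_n^2)|^2 = \liminf \int_{\mathbb{R}^2} |\nabla \theta^{n,1}|^2
\]
contrary to the hypothesis $\liminf \|\nabla \theta^{n,1}\|_2 < \sqrt{8\pi}(1 - M_K)/|H_\infty|$. Hence we obtain a contradiction and the proof of (1) is complete.

(2) One argues as in the corresponding part of the proof of Lemma 3.1. □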
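3.3. The iterative argument

Fix an integer $k \ge 2$ and assume that the sequence $(u^n)$ taken at the beginning blows $k$ bubbles. More precisely, suppose that there exist $k$ sequences $(\sigma_n^1), \dots, (\sigma_n^k) \subset S^2$, $k$ sequences $(r_n^1), \dots, (r_n^k) \subset (0, \pi)$, and $k$ functions $U^1, \dots, U^k$ such that, setting $u^{n,0} = u^n$ and
\[
g_n^i = \rho_{\sigma_n^i} \circ \delta_{r_n^i}, \qquad u^{n,i} = u^{n,i-1} \circ g_n^i, \qquad p^{n,i} = \fint_{S^2} u^{n,i} \qquad (i = 1, \dots, k),
\]
one has that
\[
r_n^i \to 0 \ \text{ and } \ \sigma_n^i \to \overline{\sigma} \quad \text{for every } i = 2, \dots, k,
\tag{3.6}
\]
\[
u^{n,i} - p^{n,i} \to U^i \ \text{ weakly in } \hat H^1(\mathbb{R}^2, \mathbb{R}^3) \text{ and strongly in } H^1_{loc}(\mathbb{R}^2, \mathbb{R}^3) \quad \text{for every } i = 1, \dots, k,
\tag{3.7}
\]
and $U^i$ is an $H_{p^i}$-bubble if $p^{n,i} \to p^i \in \mathbb{R}^3$, or an $H_\infty$-bubble if $|p^{n,i}| \to +\infty$ ($i = 1, \dots, k$). Moreover, if $k > 2$ we assume that
\[
g_n^i \circ g_n^{i+1} \to \underline{\sigma} \ \text{ in } L^\infty_{loc}(S^2 \setminus \{\overline{\sigma}\}, S^2) \quad \text{for every } i = 2, \dots, k - 1.
\tag{3.8}
\]
For every $i, j = 1, \dots, k$ with $i \le j$ we set
\[
U^{n,i,j} = \begin{cases} U^j & \text{if } i = j, \\ U^i \circ g_n^{i+1} \circ \cdots \circ g_n^j & \text{if } i < j. \end{cases}
\]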
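Thanks to (3.6) and (3.8) we have that:

Lemma 3.5. For every $i, j = 1, \dots, k$ with $i < j$ one has that $U^{n,i,j} \to \mathrm{const}$ weakly in $\hat H^1(\mathbb{R}^2, \mathbb{R}^3)$ and strongly in $H^1_{loc}(\mathbb{R}^2, \mathbb{R}^3)$.

Proof. By (3.6) and by Lemma 1.3, for every $i = 1, \dots, k - 1$, one has that $g_n^{i+1} \to \overline{\sigma}$ in $L^\infty_{loc}(S^2 \setminus \{\overline{\sigma}\}, S^2)$ and then, by Lemma 1.4, $U^{n,i,i+1} \to \mathrm{const}$ weakly in $\hat H^1(\mathbb{R}^2, \mathbb{R}^3)$ and strongly in $H^1_{loc}(\mathbb{R}^2, \mathbb{R}^3)$. Now consider the case $i, j = 1, \dots, k$ with $j - i \ge 2$. For every $i = 2, \dots, k - 1$ set $h_n^i = g_n^i \circ g_n^{i+1}$. If $j - i$ is even then
\[
g_n^{i+1} \circ \cdots \circ g_n^j = h_n^{i+1} \circ \cdots \circ h_n^{j-1}
\]
(where in the right-hand side the upper label runs from $i + 1$ to $j - 1$ with step two). Since, by (3.8), every $h_n^i \to \underline{\sigma}$ in $L^\infty_{loc}(S^2 \setminus \{\overline{\sigma}\}, S^2)$, also $g_n^{i+1} \circ \cdots \circ g_n^j \to \underline{\sigma}$ in $L^\infty_{loc}(S^2 \setminus \{\overline{\sigma}\}, S^2)$. If $j - i$ is odd, and so, in particular, $j - i \ge 3$, then
\[
g_n^{i+1} \circ \cdots \circ g_n^j = g_n^{i+1} \circ h_n^{i+2} \circ \cdots \circ h_n^{j-1}
\]
(where in the right-hand side, except for the first term $g_n^{i+1}$, the upper label runs from $i + 2$ to $j - 1$ with step two). In this case, by (3.8), $h_n^{i+2} \circ \cdots \circ h_n^{j-1} \to \underline{\sigma}$ in $L^\infty_{loc}(S^2 \setminus \{\overline{\sigma}\}, S^2)$. Moreover $g_n^{i+1} \to \overline{\sigma}$ and consequently $g_n^{i+1} \circ \cdots \circ g_n^j \to \overline{\sigma}$ in $L^\infty_{loc}(S^2 \setminus \{\overline{\sigma}\}, S^2)$. In both cases one can apply Lemma 1.4 and conclude. □

Set
\[
\theta^{n,k} = u^{n,k} - \sum_{i=1}^k U^{n,i,k}
\]
and observe that, by (3.7) and by Lemma 3.5,
\[
\int_{\mathbb{R}^2} |\nabla \theta^{n,k}|^2 = \int_{\mathbb{R}^2} |\nabla u^n|^2 - \sum_{i=1}^k \int_{\mathbb{R}^2} |\nabla U^i|^2 + o(1).
\tag{3.9}
\]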
We show that:

Lemma 3.6.
(1) If $\liminf \|\nabla \theta^{n,k}\|_2 < \sqrt{8\pi}(1 - M_K)/|H_\infty|$ then $\liminf \|\nabla \theta^{n,k}\|_2 = 0$.
(2) If $\liminf \|\nabla \theta^{n,k}\|_2 \ge \sqrt{8\pi}(1 - M_K)/|H_\infty|$ then there exist sequences $(\sigma_n^{k+1}) \subset S^2$, $(r_n^{k+1}) \subset (0, \pi)$ defined by
\[
\sup_{\sigma \in S^2} \int_{N_{r_n^{k+1}}(\sigma)} |d\theta^{n,k}|^2 = \int_{N_{r_n^{k+1}}(\sigma_n^{k+1})} |d\theta^{n,k}|^2 = \varepsilon ,
\tag{3.10}
\]
and a mapping $U^{k+1} \in \hat H^1(\mathbb{R}^2, \mathbb{R}^3)$ such that, setting
\[
g_n^{k+1} = \rho_{\sigma_n^{k+1}} \circ \delta_{r_n^{k+1}}, \qquad u^{n,k+1} = u^{n,k} \circ g_n^{k+1}, \qquad p^{n,k+1} = \fint_{S^2} u^{n,k+1},
\tag{3.11}
\]
one has that, for a subsequence,
(i) $r_n^{k+1} \to 0$, $\sigma_n^{k+1} \to \overline{\sigma}$;
(ii) $u^{n,k+1} - p^{n,k+1} \to U^{k+1}$ weakly in $\hat H^1(\mathbb{R}^2, \mathbb{R}^3)$ and strongly in $H^1_{loc}(\mathbb{R}^2, \mathbb{R}^3)$;
(iii) $U^{k+1}$ is an $H_{p^{k+1}}$-bubble if $p^{n,k+1} \to p^{k+1} \in \mathbb{R}^3$, whereas $U^{k+1}$ is an $H_\infty$-bubble if $|p^{n,k+1}| \to +\infty$.

Proof. We argue inductively with respect to $k \ge 2$. The result is true for $k = 2$, it corresponds to Lemma 3.2. We fix $k > 2$ and, assuming that the result holds true for every index $i = 2, \dots, k - 1$, we show it for the index $k$.

(1) Arguing by contradiction, suppose that $\liminf \|\nabla \theta^{n,k}\|_2 > 0$. Then there exists $\bar\varepsilon \in (0, \varepsilon]$ such that $\|\nabla \theta^{n,k}\|_2^2 > \bar\varepsilon$ for all $n \in \mathbb{N}$ large enough. As in the proof of Lemma 3.1, for every $n \in \mathbb{N}$ (large enough) we can find $\sigma_n^{k+1} \in S^2$ and $r_n^{k+1} \in (0, \pi)$ satisfying (3.10) with $\bar\varepsilon$ instead of $\varepsilon$. We set $g_n^{k+1} = \rho_{\sigma_n^{k+1}} \circ \delta_{r_n^{k+1}}$ and
\[
U^{n,i,k+1} = U^{n,i,k} \circ g_n^{k+1} \qquad (i = 1, \dots, k).
\]
Lemma 3.7. For every $i = 1, \dots, k$ one has that $U^{n,i,k+1} \to \mathrm{const}$ weakly in $H^1(S^2, \mathbb{R}^3)$ and strongly in $H^1_{loc}(S^2 \setminus \{\overline{\sigma}\}, \mathbb{R}^3)$.

Postponing the proof of Lemma 3.7, let us go on with the proof of Lemma 3.6. Define $u^{n,k+1}$ and $p^{n,k+1}$ as in (3.11). Since $(u^{n,k+1})$ satisfies (0.5), the sequence $(u^{n,k+1} - p^{n,k+1})$ is bounded in $\hat H^1(\mathbb{R}^2, \mathbb{R}^3)$ and, up to a subsequence, $u^{n,k+1} - p^{n,k+1} \to U^{k+1}$ weakly in $\hat H^1(\mathbb{R}^2, \mathbb{R}^3)$ for some $U^{k+1} \in \hat H^1(\mathbb{R}^2, \mathbb{R}^3)$. Now we show that:

Lemma 3.8. The mapping $U^{k+1}$ is an $H_{p^{k+1}}$-bubble, where $H_{p^{k+1}}(p) = H(p + p^{k+1})$ for all $p \in \mathbb{R}^3$ if $p^{n,k+1} \to p^{k+1} \in \mathbb{R}^3$, or $H_{p^{k+1}}(p) \equiv H_\infty$ if $|p^{n,k+1}| \to +\infty$.

Proof. We are going to apply Theorem 2.3 to the sequence $(u^{n,k+1})$. By conformal invariance, $(u^{n,k+1})$ satisfies (0.4). Arguing as in the proof of Lemma 3.1, we find that for every $z \in \mathbb{R}^2$
\[
\int_{D_1(z)} \Big| \nabla \Big( u^{n,k+1} - \sum_{i=1}^k U^{n,i,k+1} \Big) \Big|^2 = \int_{D_1(z)} |\nabla(\theta^{n,k} \circ g_n^{k+1})|^2 \le \bar\varepsilon \qquad \forall n \in \mathbb{N} \text{ large enough.}
\]
By Lemma 3.7 we infer that
\[
\limsup \int_{D_1(z)} |\nabla u^{n,k+1}|^2 \le \bar\varepsilon \qquad \forall z \in \mathbb{R}^2
\]
and then, thanks to Theorem 2.3 and by a diagonal argument we obtain that $u^{n,k+1} - p^{n,k+1} \to U^{k+1}$ strongly in $H^1_{loc}(\mathbb{R}^2, \mathbb{R}^3)$. Then $U^{k+1}$ solves $\Delta u = 2H_{p^{k+1}}(u) u_x \wedge u_y$ on $\mathbb{R}^2$ where $H_{p^{k+1}}(p) = H(p + p^{k+1})$ if $p^{n,k+1} \to p^{k+1} \in \mathbb{R}^3$, or $H_{p^{k+1}} \equiv H_\infty$ if $|p^{n,k+1}| \to +\infty$. Using again Lemma 3.7 we have that
\[
\int_{D_1(0)} |\nabla u^{n,k+1}|^2 = \int_{S^2_-} |d(\theta^{n,k} \circ g_n^{k+1})|^2 + o(1) = \int_{N_{r_n^{k+1}}(\sigma_n^{k+1})} |d\theta^{n,k}|^2 + o(1) = \bar\varepsilon + o(1),
\]
and since $u^{n,k+1} - p^{n,k+1} \to U^{k+1}$ strongly in $H^1_{loc}(\mathbb{R}^2, \mathbb{R}^3)$, we infer that $U^{k+1}$ is nonconstant, namely it is an $H_{p^{k+1}}$-bubble. □
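Let us conclude the proof of Lemma 3.6. By Lemma 3.7, we have that
\[
\nabla(\theta^{n,k} \circ g_n^{k+1}) = \nabla u^{n,k+1} - \sum_{i=1}^k \nabla U^{n,i,k+1} \to \nabla U^{k+1} \quad \text{weakly in } L^2(\mathbb{R}^2, \mathbb{R}^6).
\]
Then, by Lemma 2.1 and by lower semicontinuity,
\[
8\pi \left( \frac{1 - M_K}{H_\infty} \right)^2 \le \int_{\mathbb{R}^2} |\nabla U^{k+1}|^2 \le \liminf \int_{\mathbb{R}^2} |\nabla(\theta^{n,k} \circ g_n^{k+1})|^2 = \liminf \int_{\mathbb{R}^2} |\nabla \theta^{n,k}|^2
\]
contrary to the hypothesis $\liminf \|\nabla \theta^{n,k}\|_2 < \sqrt{8\pi}(1 - M_K)/|H_\infty|$. Hence we obtain a contradiction and the proof of (1) is complete.

(2) One argues as in the corresponding part of the proof of Lemma 3.1. □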
It remains to check Lemma 3.7. Indeed it is a consequence of the following two statements: Lemma 3.9. (i) rnk+1 → 0 and σnk+1 → σ ; 2 2 (ii) gnk ◦ gnk+1 → σ in L∞ loc (S \ {σ }, S ). Proof. (i) We know that
dθ
n,k
k−1 n,k k =d u −U − dU n,i,k → 0 strongly in L2loc S2 \ {σ }, R6 . i=1
Arguing as in the proof of Lemma 3.3, we infer that rnk+1 → 0 and σnk+1 → σ . (ii) Denoting dnk+1 = distS2 (σnk+1 , σ ), we claim that rnk = o dnk+1 + rnk+1 .
(3.12)
Let us prove (3.12). We have that ε= N
k+1 k+1 (σn )
rn
= N
n,k 2
dθ = N
n,k−1
2
d θ ◦ g k − dU k
n
k+1 k+1 (σn )
rn
n,k−1 k 2
d θ ◦ g + o(1) =
n,k−1 2
dθ
+ o(1).
n
rnk+1
(σnk+1 )
gnk (N
rnk+1
(σnk+1 ))
Since dθ n,k−1 → 0 strongly in L2loc (S2 \ {σ }, R6 ) it must be distS2 (gnk (Nr k+1 (σnk+1 )), σ ) → 0. n Using triangular inequality and the fact that σnk → σ we infer that distS2 gnk Nr k+1 σnk+1 , σnk → 0. n
Since σnk = gnk (σ ) and ρσnk is an isometry, distS2 gnk Nr k+1 σnk+1 , σnk = distS2 δrnk Nr k+1 σnk+1 , σ . n
n
Thus we obtain that distS2 (δrnk (Nr k+1 (σnk+1 )), σ ) → 0. Let σ˜ nk+1 be the point of Nr k+1 (σnk+1 ) n n which is closest to σ . Then, by (δ5), distS2 (δrnk (Nr k+1 (σnk+1 )), σ ) = distS2 (δrnk (σ˜ nk+1 ), σ ) = n rnk × distS2 (σ˜ nk+1 , σ ) and distS2 (σ˜ nk+1 , σ ) = distS2 (σnk+1 , σ ) − rnk+1 = π − dnk+1 − rnk+1 . Hence we get rnk × (π − dnk+1 − rnk+1 ) → 0, namely cos rnk + cos(π − dnk+1 − rnk+1 ) 1 + (cos rnk )(cos(π − dnk+1 − rnk+1 ))
→1
and this occurs if and only if (3.12) holds true. Now we study the behaviour of gnk ◦ gnk+1 . We distinguish two cases, according that dnk+1 = O(rnk+1 )
or
dnk+1 rnk+1
→ +∞.
Consider the case dnk+1 = O(rnk+1 ). In view of (δ4), we can write gnk ◦gnk+1 = ρσnk ◦δr k ×(π−r k+1 ) ◦
(δr k+1 ◦ gnk+1 ) and then, since −σnk = gnk (σ ) and using (δ6), for every σ ∈ S2
n
n
n
distS2 gnk ◦ gnk+1 (σ ), −σnk = distS2 δr k ×(π−r k+1 ) ◦ δr k+1 ◦ gnk+1 (σ ), σ n n n k = π − rn × π − rnk+1 × distS2 δr k+1 ◦ gnk+1 (σ ), σ . n
(3.13)
2 2 By Lemma 1.3, δr k+1 ◦ gnk+1 → g in L∞ loc (S \ {σ }, S ), where g is a conformal transformation n of S2 such that g(σ ) = σ . Then
distS2 δr k+1 ◦ gnk+1 (σ ), σ distS2 δr k+1 ◦ gnk+1 (σ ), g(σ ) + distS2 g(σ ), σ n n = distS2 δr k+1 ◦ gnk+1 (σ ), g(σ ) + π − distS2 g(σ ), g(σ ) . n
Since g is a diffeomorphism of S2 into itself, there exists a constant C > 0 such that distS2 g(σ ), g(σ ) C distS2 (σ, σ )
for any σ ∈ S2 .
Hence distS2 δr k+1 ◦ gnk+1 (σ ), σ distS2 δr k+1 ◦ gnk+1 (σ ), g(σ ) + π − C distS2 (σ, σ ). n
n
Therefore for every r ∈ (0, π) there exists r ∈ (0, π) and n ∈ N such that if distS2 (σ, σ ) > r and n n distS2 δr k+1 ◦ gnk+1 (σ ), σ r .
(3.14)
n
From (3.12) and since dnk+1 = O(rnk+1 ) it follows that rnk = o(rnk+1 ) and thus rnk × π − rnk+1 → π.
(3.15)
Therefore, by (3.14)–(3.15) and (1.2)–(1.3), 2 π − rnk × π − rnk+1 × distS2 δr k+1 ◦ gnk+1 (σ ), σ → 0 in L∞ loc S \ {σ } n
2 2 and finally, since −σnk → σ , from (3.13) we obtain that gnk ◦ gnk+1 → σ in L∞ loc (S \ {σ }, S ).
dnk+1 → +∞. Writing gnk ◦ gnk+1 = ρσnk rnk+1 ρσnk ◦ δr k ×(π−d k+1 ) ◦ (δd k+1 ◦ gnk+1 ) we have that for every σ ∈ S2 n n n
Now consider the case
k+1 = ◦ δrnk ◦ δ −1 k+1 ◦ δd k+1 ◦ gn dn
n
distS2 gnk ◦ gnk+1 (σ ), −σnk = distS2 δr k ×(π−d k+1 ) ◦ δd k+1 ◦ gnk+1 (σ ), σ n n n = π − rnk × π − dnk+1 × distS2 δd k+1 ◦ gnk+1 (σ ), σ . n
(3.16)
2 2 ∗ 2 By Lemma 1.3, δd k+1 ◦ gnk+1 → σ ∗ in L∞ loc (S \ {σ }, S ), for some σ ∈ S \ {σ }. Then n
distS2 δd k+1 ◦ gnk+1 (σ ), σ → distS2 (σ ∗ , σ ) n
Moreover, from (3.12) and since
dnk+1 rnk+1
2 in L∞ loc S \ {σ } .
(3.17)
→ +∞ it follows that rnk = o(dnk+1 ) and thus
rnk × π − dnk+1 → π.
(3.18)
Therefore, by (3.16)–(3.18) and since distS2 (σ ∗ , σ ) < π , using again (1.2)–(1.3), we infer that 2 k distS2 ((gnk ◦ gnk+1 )(σ ), −σnk ) → 0 in L∞ loc (S \ {σ }) and finally, since −σn → σ , from (3.16) we
2 2 k−1 ◦ g k ◦ g k+1 obtain again that gnk ◦ gnk+1 → σ in L∞ n n loc (S \ {σ }, S ). Lastly, the behaviour of U can be deduced by using Lemma 1.4. 2
Proof of Lemma 3.7. Repeat the proof of Lemma 3.5 with $k + 1$ instead of $k$ and use Lemma 3.9 and the induction hypotheses. □

3.4. Conclusion

If the sequence $(u^n)$ taken at the beginning blows $k$ bubbles, then by (3.9) and by Lemma 2.1
\[
\int_{\mathbb{R}^2} |\nabla u^n|^2 \ge 8\pi k \left( \frac{1 - M_K}{H_\infty} \right)^2 .
\]
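The estimate can be turned into an explicit cap on the number of bubbles. The following is a sketch: the symbols $E$ and $c_H$ are shorthand introduced here for illustration and are not notation from the paper, and the bound from (3.9) is read in the limit, since (3.9) holds up to an $o(1)$ term.

```latex
% Shorthand (assumed): E := \sup_n \int_{\mathbb{R}^2} |\nabla u^n|^2 < +\infty,
%                      c_H := 8\pi ((1 - M_K)/H_\infty)^2 > 0.
\begin{align*}
k\, c_H
  \le \sum_{i=1}^k \int_{\mathbb{R}^2} |\nabla U^i|^2
  \le \liminf_{n} \int_{\mathbb{R}^2} |\nabla u^n|^2
  \le E
\quad\Longrightarrow\quad
k \le \frac{E}{c_H} = \frac{E\, H_\infty^2}{8\pi (1 - M_K)^2} .
\end{align*}
% First inequality: Lemma 2.1 applied to each bubble U^i.
% Second inequality: (3.9) together with \int |\nabla \theta^{n,k}|^2 \ge 0.
```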
As $\sup_n \int_{\mathbb{R}^2} |\nabla u^n|^2 < +\infty$, only finitely many bubbles can be blown, namely the iterative argument stops in a finite number $\bar k$ of steps. This concludes the proof of Theorem 0.1, and its statement holds with $g_{n,i} = g_n^1 \circ \cdots \circ g_n^i$.

Remark 3.10. For every $i = 1, \dots, \bar k$ the $i$th bubble $U^i$ is blown from $(u^n)$ following the sequence $(U^i \circ g_{n,i}^{-1})$ which in fact, for $i > 1$, concentrates at some constant. For $i \ne j$ the sequences $(U^i \circ g_{n,i}^{-1})$ and $(U^j \circ g_{n,j}^{-1})$ concentrate with different speeds. Indeed, if $i < j$ then
\[
\int_{\mathbb{R}^2} \nabla(U^i \circ g_{n,i}^{-1}) \cdot \nabla(U^j \circ g_{n,j}^{-1}) = \int_{\mathbb{R}^2} \nabla U^j \cdot \nabla U^{n,i,j} \to 0
\]
by Lemma 3.5. Notice also that the case $U^i = U^j$ may happen. This means that two different blow-up phenomena occur with no correlation with each other even if the bubbles which are blown could be the same.

4. Some perspectives

The blow-up analysis performed in this paper can be useful to study the so called H-bubble problem:
\[
\begin{cases}
\Delta u = 2H(u) u_x \wedge u_y & \text{on } \mathbb{R}^2, \\
u \in \hat H^1(\mathbb{R}^2, \mathbb{R}^3), & u \text{ nonconstant.}
\end{cases}
\]
It is known that this problem is variational in nature, that is, its weak solutions can be found as critical points of a suitable energy functional EH (see, e.g., [6]). In fact EH turns out to be regular enough and with a saddle point geometry, hence one can use critical point theory. In this context Theorem 0.1 provides a characterization of Palais–Smale sequences having uniformly bounded Dirichlet integral and then constitutes a key tool in order to get existence of H -bubbles, possibly with minimal energy (as in [6]), or with larger energy. This will be developed in future research.
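The blow-up method used in the proof of Theorem 0.1 might be applied, up to suitable adaptation, also to study problems on the unit disc $\mathbb{D}$ of the following type: describe the behaviour of a sequence $(u^n) \subset H^1(\mathbb{D}, \mathbb{R}^3)$ satisfying:
\[
\begin{cases}
\Delta u^n - 2H(u^n) u^n_x \wedge u^n_y \to 0 & \text{in } H^{-1}(\mathbb{D}, \mathbb{R}^3), \\
u^n = \gamma^n & \text{on } \partial\mathbb{D}, \\
\sup_n \int_{\mathbb{D}} |\nabla u^n|^2 < +\infty
\end{cases}
\tag{4.1}
\]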
where H : R3 → R is a prescribed continuous and bounded function satisfying some conditions as in Theorem 0.1, and (γ n ) ⊂ H 1/2 (∂D, R3 ) is a sequence of boundary data having some behaviour. In particular two situations seem of some interest, namely when: (1) γ n shrinks to a constant in some topology, e.g., in H 1/2 (∂D, R3 ) and/or in L∞ (∂D, R3 ); (2) γ n = γ for all n ∈ N. Concerning the case (1) the problem has been studied in an exhaustive way in [5] when H is a nonzero constant. If H is a nonconstant function of the type considered in Theorem 0.1 some partial results have been proved in [8] when (un ) is a sequence of H -surfaces; clearly, in this case, a high degree of smoothness can be exploited and, in addition, a priori estimates hold. Also the case (2) has been solved for H constant (see [18]) but not yet in a more general case of H variable. The knowledge of the behaviour of sequences of approximate solutions for the Dirichlet problem
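\[
\begin{cases}
\Delta u = 2H(u) u_x \wedge u_y & \text{in } \mathbb{D}, \\
u = \gamma & \text{on } \partial\mathbb{D}
\end{cases}
\]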
could be helpful in the direction of the Rellich conjecture (for the case $H$ constant see [4,17–19] and the references therein; see also [2,13,14] for a nonconstant perturbative case).

References

[1] F. Bethuel, J.M. Ghidaglia, Improved regularity of solutions to elliptic equations involving Jacobians and applications, J. Math. Pures Appl. 72 (1993) 441–474.
[2] F. Bethuel, O. Rey, Multiple solutions to the Plateau problem for nonconstant mean curvature, Duke Math. J. 73 (1994) 593–646.
[3] V. Bononcini, Un teorema di continuità per integrali su superficie chiuse, Riv. Mat. Univ. Parma 4 (1953) 299–311.
[4] H. Brezis, J.M. Coron, Multiple solutions of H-systems and Rellich's conjecture, Comm. Pure Appl. Math. 37 (1984) 149–187.
[5] H. Brezis, J.M. Coron, Convergence of solutions of H-systems or how to blow bubbles, Arch. Ration. Mech. Anal. 89 (1985) 21–56.
[6] P. Caldiroli, R. Musina, Existence of minimal H-bubbles, Commun. Contemp. Math. 4 (2002) 177–209.
[7] P. Caldiroli, R. Musina, H-bubbles in a perturbative setting: The finite-dimensional reduction method, Duke Math. J. 122 (2004) 457–484.
[8] P. Caldiroli, R. Musina, The Dirichlet problem for H-Systems with small boundary data: Blowup phenomena and nonexistence results, Arch. Ration. Mech. Anal. 181 (2006) 1–42.
[9] P. Caldiroli, R. Musina, On Palais–Smale sequences for H-systems: Some examples, Adv. Differential Equations 11 (2006) 931–960.
[10] P. Caldiroli, R. Musina, Weak limit and blowup of approximate solutions to H-systems, J. Funct. Anal. 249 (2007) 171–198.
[11] S. Chanillo, A. Malchiodi, Asymptotic Morse theory for the equation $\Delta v = 2 v_x \wedge v_y$, Comm. Anal. Geom. 13 (2005) 187–251.
[12] E. Heinz, Über die Regularität schwacher Lösungen nichtlinearer elliptischer Systeme, Nachr. Akad. Wiss. Göttingen Math.-Phys. Kl. II 1 (1986) 1–15.
[13] N. Jakobowsky, A perturbation result concerning a second solution to the Dirichlet problem for the equation of prescribed mean curvature, J. Reine Angew. Math. 457 (1994) 1–21.
[14] N. Jakobowsky, Multiple surfaces of non-constant mean curvature, Math. Z. 217 (1994) 497–512.
[15] P.-L. Lions, The concentration-compactness principle in the calculus of variations, the limit case. Parts I and II, Rev. Mat. Iberoamericana 1 (1) (1985) 145–201, Rev. Mat. Iberoamericana 1 (2) (1985) 45–121.
[16] R. Musina, On the regularity of weak solutions to H-systems, Atti Accad. Naz. Lincei Cl. Sci. Fis. Mat. Natur. Rend. Lincei (9) Mat. Appl. 18 (2007) 209–219.
[17] K. Steffen, On the nonuniqueness of surfaces with prescribed constant mean curvature spanning a given contour, Arch. Ration. Mech. Anal. 94 (1986) 101–122.
[18] M. Struwe, Large H-surface via the mountain-pass lemma, Math. Ann. 270 (1985) 441–459.
[19] M. Struwe, Nonuniqueness in the Plateau problem for surfaces of constant mean curvature, Arch. Ration. Mech. Anal. 93 (1986) 135–157.
[20] H. Wente, An existence theorem for surfaces of constant mean curvature, J. Math. Anal. Appl. 26 (1969) 318–344.
Journal of Functional Analysis 257 (2009) 428–463 www.elsevier.com/locate/jfa
Multi-variable subordination distributions for free additive convolution Alexandru Nica 1 Department of Pure Mathematics, University of Waterloo, Waterloo, Ontario N2L 3G1, Canada Received 5 November 2008; accepted 29 December 2008 Available online 26 January 2009 Communicated by D. Voiculescu
Abstract

Let $k$ be a positive integer and let $\mathcal{D}_c(k)$ denote the space of joint distributions for $k$-tuples of selfadjoint elements in a $C^*$-probability space. The paper studies the concept of "subordination distribution of $\mu \boxplus \nu$ with respect to $\nu$" for $\mu, \nu \in \mathcal{D}_c(k)$, where $\boxplus$ is the operation of free additive convolution on $\mathcal{D}_c(k)$. The main tools used in this study are combinatorial properties of R-transforms for joint distributions and a related operator model, with operators acting on the full Fock space. Multi-variable subordination turns out to have nice relations to a process of evolution towards $\boxplus$-infinite divisibility on $\mathcal{D}_c(k)$ that was recently found by Belinschi and Nica (arXiv: 0711.3787). Most notably, one gets better insight into a connection which this process was known to have with free Brownian motion. © 2009 Elsevier Inc. All rights reserved.

Keywords: Free additive convolution; R-transform; Subordination distribution; Non-crossing partition; Operator model on full Fock space
E-mail address: [email protected].
1 Research supported by a Discovery Grant from NSERC, Canada.
0022-1236/$ – see front matter © 2009 Elsevier Inc. All rights reserved. doi:10.1016/j.jfa.2008.12.022
A. Nica / Journal of Functional Analysis 257 (2009) 428–463

1. Introduction and statements of results

The free additive convolution $\boxplus$ is a binary operation on the space of probability distributions on $\mathbb{R}$, reflecting the addition operation for free random variables in a noncommutative probability space. A significant fact in its theory (see [9,15,16]) is that the Cauchy transform of the distribution $\mu \boxplus \nu$ is subordinated to the Cauchy transforms of $\mu$ and of $\nu$, as analytic functions on the upper half-plane $\mathbb{C}^+$. Thus (choosing for instance to discuss subordination with respect to $\nu$) one has an analytic subordination function $\omega : \mathbb{C}^+ \to \mathbb{C}^+$ such that
\[ G_{\mu \boxplus \nu}(z) = G_\nu\bigl(\omega(z)\bigr), \quad \forall z \in \mathbb{C}^+, \]
where $G_{\mu \boxplus \nu}$ and $G_\nu$ are the Cauchy transforms of $\mu \boxplus \nu$ and of $\nu$, respectively. Moreover, the subordination function $\omega$ can be identified as the reciprocal Cauchy transform of a uniquely determined probability distribution $\sigma$ on $\mathbb{R}$. Following [11], this $\sigma$ will be denoted as "$\mu \boxright \nu$." The name used in [11] for $\sigma = \mu \boxright \nu$ is "s-free additive convolution of $\mu$ and $\nu$," in relation to a suitably tailored concept of "s-freeness" that is also introduced in [11]. Since s-freeness is only marginally addressed in the present paper, $\mu \boxright \nu$ will just be called here the subordination distribution of $\mu \boxplus \nu$ with respect to $\nu$.

The goal of the present paper is to introduce and study the analogue for $\mu \boxright \nu$ in a multi-variable framework where $\mu, \nu$ become joint distributions of $k$-tuples of selfadjoint elements in a $C^*$-probability space. The particular case $k = 1$ corresponds of course to the framework of probability distributions on $\mathbb{R}$ as discussed above, with $\mu, \nu$ compactly supported.

The main tool used in the paper is the R-transform for joint distributions. In particular, the $k$-variable version of $\mu \boxright \nu$ is introduced in Definition 1.1 below via an extension of the formula which describes $R_{\mu \boxright \nu}$ in terms of $R_\mu$ and $R_\nu$ in the case $k = 1$. (The 1-variable motivation behind Definition 1.1 is presented in Section 2.1.) It is convenient to write the definition for the $k$-variable version of $\mu \boxright \nu$ by allowing $\mu$ and $\nu$ to be any linear functionals on $\mathbb{C}\langle X_1, \ldots, X_k \rangle$ (the algebra of polynomials in non-commuting indeterminates $X_1, \ldots, X_k$) such that $\mu(1) = \nu(1) = 1$. The set of all such "purely algebraic" distributions will be denoted by $\mathcal{D}_{\mathrm{alg}}(k)$. The main interest of the paper is in the smaller set of "noncommutative $C^*$-distributions with compact support"
\[ \mathcal{D}_c(k) := \bigl\{ \mu \in \mathcal{D}_{\mathrm{alg}}(k) \bigm| \mu \text{ can appear as joint distribution for a } k\text{-tuple of selfadjoint elements in a } C^*\text{-probability space} \bigr\}. \]
But in order to define $\boxright$ on $\mathcal{D}_c(k)$ it comes in handy to first define it as a binary operation on $\mathcal{D}_{\mathrm{alg}}(k)$, and then prove that $\mu \boxright \nu \in \mathcal{D}_c(k)$ whenever $\mu, \nu \in \mathcal{D}_c(k)$.

In the next definition and throughout the paper, $k$ is a positive integer denoting "the number of variables" that one is working with.

Definition 1.1. Let $\mu, \nu$ be distributions in $\mathcal{D}_{\mathrm{alg}}(k)$. The subordination distribution of $\mu \boxplus \nu$ with respect to $\nu$ is the distribution $\mu \boxright \nu \in \mathcal{D}_{\mathrm{alg}}(k)$ uniquely determined by the requirement that its R-transform is
\[ R_{\mu \boxright \nu} = R_\mu\bigl( z_1(1 + M_\nu), \ldots, z_k(1 + M_\nu) \bigr) \cdot (1 + M_\nu)^{-1}. \tag{1.1} \]
In (1.1), $M_\nu$ is the moment series of $\nu$ and $(1 + M_\nu)^{-1}$ is the inverse of $1 + M_\nu$ with respect to multiplication, in the algebra $\mathbb{C}\langle\langle z_1, \ldots, z_k \rangle\rangle$ of power series in the non-commuting indeterminates $z_1, \ldots, z_k$. (A more detailed review of the notations used here can be found in Section 2.3.)
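Remark 1.2. 1° From Eq. (1.1) it is clear that the R-transform of $\mu \boxright \nu$ depends linearly on the one of $\mu$. Since the R-transform linearizes $\boxplus$, this amounts to a form of "$\boxplus$-linearity" in the way $\mu \boxright \nu$ depends on $\mu$. More precisely one has
\[ (\mu_1 \boxplus \mu_2) \boxright \nu = (\mu_1 \boxright \nu) \boxplus (\mu_2 \boxright \nu), \quad \forall \mu_1, \mu_2, \nu \in \mathcal{D}_{\mathrm{alg}}(k), \tag{1.2} \]
or, when looking at $\boxplus$-convolution powers,
\[ \mu^{\boxplus t} \boxright \nu = (\mu \boxright \nu)^{\boxplus t}, \quad \forall \mu, \nu \in \mathcal{D}_{\mathrm{alg}}(k), \ \forall t > 0. \tag{1.3} \]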
2° The series $R_\mu(z_1(1 + M_\nu), \ldots, z_k(1 + M_\nu))$ appearing in (1.1) bears a resemblance to a well-known "functional equation for the R-transform" (see [13, Lecture 16]), which says that
\[ R_\mu\bigl( z_1(1 + M_\mu), \ldots, z_k(1 + M_\mu) \bigr) = M_\mu, \quad \forall \mu \in \mathcal{D}_{\mathrm{alg}}(k). \]
One can actually invoke this functional equation in the particular case of Definition 1.1 when $\nu = \mu$, to obtain that
\[ R_{\mu \boxright \mu} = M_\mu \cdot (1 + M_\mu)^{-1}, \quad \mu \in \mathcal{D}_{\mathrm{alg}}(k). \tag{1.4} \]
The series $M_\mu \cdot (1 + M_\mu)^{-1}$ is called the η-series of $\mu$, and plays an important role in the study of connections between free and Boolean probability. In particular, the relation between R-transforms and η-series yields a special bijection $\mathbb{B} : \mathcal{D}_{\mathrm{alg}}(k) \to \mathcal{D}_{\mathrm{alg}}(k)$, defined as follows: for every $\mu \in \mathcal{D}_{\mathrm{alg}}(k)$, $\mathbb{B}(\mu)$ is the unique distribution in $\mathcal{D}_{\mathrm{alg}}(k)$ which has
\[ R_{\mathbb{B}(\mu)} = \eta_\mu. \tag{1.5} \]
$\mathbb{B}$ is called the Boolean Bercovici–Pata bijection (first put into evidence in the 1-variable case in [7], then extended to the multi-variable framework in [5]). This bijection has the important property that it carries $\mathcal{D}_c(k)$ into itself and that $\mathbb{B}(\mathcal{D}_c(k))$ is precisely the set of distributions in $\mathcal{D}_c(k)$ which are infinitely divisible with respect to $\boxplus$ (cf. Theorem 1 in [5]). By comparing (1.4) to (1.5), one draws the conclusion that
\[ \mu \boxright \mu = \mathbb{B}(\mu), \quad \forall \mu \in \mathcal{D}_{\mathrm{alg}}(k). \tag{1.6} \]
Eq. (1.6) can be generalized to a nice formula describing $\mu_1 \boxright \mu_2$ in the case when both $\mu_1$ and $\mu_2$ are $\boxplus$-convolution powers of the same $\mu$; see Proposition 5.3 below.

3° One can rewrite Eq. (1.1) as
\[ R_{\mu \boxright \nu} \cdot (1 + M_\nu) = R_\mu\bigl( z_1(1 + M_\nu), \ldots, z_k(1 + M_\nu) \bigr), \tag{1.7} \]
and then one can equate coefficients in the series on the two sides of (1.7), in order to obtain an explicit combinatorial formula for the coefficients of $R_{\mu \boxright \nu}$. This in turn can be used to obtain an explicit formula for the moments of $\mu \boxright \nu$, which is stated next.

In Theorem 1.3, $NC(n)$ is the set of non-crossing partitions of $\{1, \ldots, n\}$ (cf. review of $NC(n)$ terminology in Section 2.2). The notation "$(i_1, \ldots, i_n) \mid V$" stands for "$(i_{v(1)}, \ldots, i_{v(m)})$," where $V = \{v(1), \ldots, v(m)\}$ is a non-empty subset of $\{1, \ldots, n\}$ (listed with $v(1) < \cdots < v(m)$) and $i_1, \ldots, i_n$ are some indices in $\{1, \ldots, k\}$.
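Theorem 1.3. Let $\mu, \nu$ be distributions in $\mathcal{D}_{\mathrm{alg}}(k)$. For every $n \ge 1$ and $1 \le i_1, \ldots, i_n \le k$ let us denote the coefficients of $z_{i_1} \cdots z_{i_n}$ in the series $R_\mu$ and $R_\nu$ by $\alpha_{(i_1,\ldots,i_n)}$ and $\beta_{(i_1,\ldots,i_n)}$, respectively. Then for every $n \ge 1$ and $1 \le i_1, \ldots, i_n \le k$ one has
\[ (\mu \boxright \nu)(X_{i_1} \cdots X_{i_n}) = \sum_{\pi \in NC(n)} \Bigl( \prod_{V \text{ outer block of } \pi} \alpha_{(i_1,\ldots,i_n)|V} \Bigr) \cdot \Bigl( \prod_{W \text{ inner block of } \pi} \bigl( \alpha_{(i_1,\ldots,i_n)|W} + \beta_{(i_1,\ldots,i_n)|W} \bigr) \Bigr). \tag{1.8} \]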
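The moment formula (1.8) can be checked by hand for small $n$, and the following minimal sketch automates that check in the 1-variable case $k = 1$, where $\alpha_{(i_1,\ldots,i_n)|V}$ reduces to a number $\alpha_{|V|}$ depending only on the size of the block. The function names (`set_partitions`, `noncrossing_partitions`, `is_inner`, `subordination_moment`) and the dictionary encoding of the free cumulants are illustrative assumptions, not notation from the paper; the enumeration is brute force and only meant for small $n$.

```python
from itertools import combinations


def set_partitions(elements):
    """Yield every partition of a sorted list as a list of sorted blocks."""
    if not elements:
        yield []
        return
    first, rest = elements[0], elements[1:]
    for partition in set_partitions(rest):
        # Either add `first` to one of the existing blocks ...
        for idx, block in enumerate(partition):
            yield partition[:idx] + [[first] + block] + partition[idx + 1:]
        # ... or let it form a new singleton block.
        yield [[first]] + partition


def is_noncrossing(partition):
    """True if no i < j < i' < j' has i, i' in one block and j, j' in another."""
    for block_a, block_b in combinations(partition, 2):
        for v, w in ((block_a, block_b), (block_b, block_a)):
            for i, i2 in combinations(v, 2):
                for j, j2 in combinations(w, 2):
                    if i < j < i2 < j2:
                        return False
    return True


def noncrossing_partitions(n):
    """List NC(n) by filtering all set partitions (fine for small n)."""
    return [p for p in set_partitions(list(range(1, n + 1))) if is_noncrossing(p)]


def is_inner(block, partition):
    """A block is inner if some block W has min(W) < min(V) and max(W) > max(V)."""
    return any(min(w) < min(block) and max(w) > max(block) for w in partition)


def subordination_moment(n, alpha, beta):
    """Right-hand side of (1.8) for k = 1.

    alpha[m] and beta[m] are assumed to hold the m-th free cumulants
    of mu and nu, for every m from 1 to n.
    """
    total = 0
    for partition in noncrossing_partitions(n):
        term = 1
        for block in partition:
            size = len(block)
            if is_inner(block, partition):
                term *= alpha[size] + beta[size]  # inner blocks carry alpha + beta
            else:
                term *= alpha[size]               # outer blocks carry alpha only
        total += term
    return total
```

For $n = 3$ the five partitions in $NC(3)$ contribute $\alpha_3$, $\alpha_1\alpha_2$ (twice), $\alpha_2(\alpha_1 + \beta_1)$ and $\alpha_1^3$, so the sketch should return $\alpha_3 + 3\alpha_1\alpha_2 + \alpha_2\beta_1 + \alpha_1^3$; the only inner block that occurs is the singleton $\{2\}$ in the partition $\{\{1,3\},\{2\}\}$.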
Based on the moment formula from Theorem 1.3 one can find an "operator model on the full Fock space" for $\boxright$. This is a recipe which starts from the data stored in the R-transforms $R_\mu$ and $R_\nu$, and uses creation and annihilation operators on the full Fock space over $\mathbb{C}^{2k}$ in order to produce a $k$-tuple of operators with distribution $\mu \boxright \nu$. The precise description of how this works appears in Theorem 4.4. Once the full Fock space model is in place it is easy to see that one can in fact upgrade it to a more general operator model for $\boxright$, not making specific reference to the full Fock space, and described as follows.

Theorem 1.4. Let $\mathcal{H}$ be a Hilbert space, let $\Omega$ be a unit-vector in $\mathcal{H}$, and let $\varphi$ be the vector-state defined by $\Omega$ on $B(\mathcal{H})$. Suppose that $A_1, \ldots, A_k, B_1, \ldots, B_k \in B(\mathcal{H})$ are such that $\{A_1, \ldots, A_k\}$ is free from $\{B_1, \ldots, B_k\}$ in $(B(\mathcal{H}), \varphi)$, and let $\mu, \nu$ denote the joint distributions of the $k$-tuples $A_1, \ldots, A_k$ and respectively $B_1, \ldots, B_k$. Let moreover $P \in B(\mathcal{H})$ denote the orthogonal projection onto the 1-dimensional subspace $\mathbb{C}\Omega$ of $\mathcal{H}$, and consider the operators
\[ C_i := A_i + (1 - P) B_i (1 - P) \in B(\mathcal{H}), \quad 1 \le i \le k. \tag{1.9} \]
Then the joint distribution of $C_1, \ldots, C_k$ with respect to $\varphi$ is equal to $\mu \boxright \nu$.

Now, any given pair of distributions $\mu, \nu \in \mathcal{D}_c(k)$ can be made to appear in the setting of Theorem 1.4, in such a way that the operators $A_1, \ldots, A_k, B_1, \ldots, B_k$ involved in the theorem are all selfadjoint. (This is done via a standard free product construction – cf. Remark 4.11 below.) Since in this case the operators $C_1, \ldots, C_k$ from Eq. (1.9) are selfadjoint as well, one thus obtains the following corollary, giving the desired fact that $\boxright$ can be defined as a binary operation on $\mathcal{D}_c(k)$.

Corollary 1.5. If $\mu, \nu$ are in $\mathcal{D}_c(k)$ then $\mu \boxright \nu$ belongs to $\mathcal{D}_c(k)$ as well.

Remark 1.6. In the 1-variable framework, the study of $\boxright$ was started in [11]. That paper gives an operator model for $\mu \boxright \nu$ obtained via an "s-free product" construction for Hilbert spaces, where $\mu \boxright \nu$ appears as the distribution of the sum of two "s-free operators" with distributions $\mu$ and $\nu$, respectively. By using Theorem 1.4, it is easy to find a $k$-variable analogue for this fact: one can make $\mu \boxright \nu$ appear as the distribution of the sum of two s-free $k$-tuples on an s-free product Hilbert space. The way this is done is outlined in Remark 4.12 below.

The next part of the introduction (from Remark 1.7 to Proposition 1.10) explains how $\boxright$ relates to the work in [6] concerning evolution towards $\boxplus$-infinite divisibility and its connection to the free Brownian motion.
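Remark 1.7. Here is a brief summary of relevant results from [6]. One considers a family of bijective transformations $\{\mathbb{B}_t \mid t \ge 0\}$ of $\mathcal{D}_{\mathrm{alg}}(k)$ defined by
\[ \mathbb{B}_t(\mu) = \bigl( \mu^{\boxplus (1+t)} \bigr)^{\uplus 1/(1+t)}, \quad \forall t \ge 0, \ \forall \mu \in \mathcal{D}_{\mathrm{alg}}(k), \]
where the $\boxplus$-powers and $\uplus$-powers are taken in connection to free and respectively Boolean convolution. The transformations $\mathbb{B}_t$ form a semigroup ($\mathbb{B}_{s+t} = \mathbb{B}_s \circ \mathbb{B}_t$, $\forall s, t \ge 0$), each of them carries $\mathcal{D}_c(k)$ into itself, and at $t = 1$ one has $\mathbb{B}_1 = \mathbb{B}$, the Boolean Bercovici–Pata bijection that was also encountered in Remark 1.2.2. Thus for a fixed $\mu \in \mathcal{D}_c(k)$ the process $\{\mathbb{B}_t(\mu) \mid t \ge 0\}$ can be viewed as some kind of "evolution of $\mu$ towards $\boxplus$-infinite divisibility" (since $\mathbb{B}_t(\mu)$ is $\boxplus$-infinitely divisible for all $t \ge 1$).

On the other hand let us recall that the free Brownian motion started at a distribution $\nu \in \mathcal{D}_c(k)$ is the process $\{\nu \boxplus \gamma^{\boxplus t} \mid t \ge 0\}$, where $\gamma \in \mathcal{D}_c(k)$ is the joint distribution of a standard semicircular system (a free family of $k$ centered semicircular elements of variance 1). The paper [6] puts into evidence a certain transformation $\Phi : \mathcal{D}_{\mathrm{alg}}(k) \to \mathcal{D}_{\mathrm{alg}}(k)$ which carries $\mathcal{D}_c(k)$ into itself and has the property that
\[ \Phi\bigl( \nu \boxplus \gamma^{\boxplus t} \bigr) = \mathbb{B}_t\bigl( \Phi(\nu) \bigr), \quad \forall \nu \in \mathcal{D}_{\mathrm{alg}}(k), \ \forall t \ge 0. \tag{1.10} \]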
In other words, (1.10) says that a relation of the form "$\Phi(\nu) = \mu$" is not affected when $\nu$ evolves under the free Brownian motion while $\mu$ evolves under the action of the semigroup $\{\mathbb{B}_t \mid t \ge 0\}$.

The transformation $\Phi$ from [6] turns out to be related to subordination distributions, as follows.

Theorem 1.8. For every distribution $\nu \in \mathcal{D}_{\mathrm{alg}}(k)$ one has that
\[ \gamma \boxright \nu = \mathbb{B}\bigl( \Phi(\nu) \bigr), \tag{1.11} \]
where $\gamma$ is as above (the joint distribution of a standard semicircular system) and $\mathbb{B}$ is the Boolean Bercovici–Pata bijection.

Remark 1.9. 1° Eq. (1.11) thus offers an alternative description for $\Phi$:
\[ \Phi(\nu) = \mathbb{B}^{-1}(\gamma \boxright \nu), \quad \nu \in \mathcal{D}_{\mathrm{alg}}(k). \tag{1.12} \]
It is worth noting that the two main properties of $\Phi$ obtained in [6] (formula (1.10) and the fact that $\Phi$ maps $\mathcal{D}_c(k)$ into itself) are very easy to derive by starting from this description and by invoking the suitable properties of subordination distributions; see Proposition 5.7.

2° It is also worth noting that one has a simple explicit formula for how $\mu \boxright \nu$ itself evolves under the action of the $\mathbb{B}_t$. This formula arises when one compares the explicit descriptions for the free and the Boolean cumulants of $\mu \boxright \nu$ (see Remark 3.8.1 and Proposition 5.1 below), and is described as follows.

Proposition 1.10. Let $\mu, \nu$ be two distributions in $\mathcal{D}_{\mathrm{alg}}(k)$. Then for every $t \ge 0$ one has:
\[ \mathbb{B}_t(\mu \boxright \nu) = \mu \boxright \bigl( \mu^{\boxplus t} \boxplus \nu \bigr). \tag{1.13} \]
The final part of the introduction discusses two other interesting algebraic properties of $\boxright$, obtained by extrapolating functional equations which are known to be satisfied by subordination functions in the 1-variable framework.

One of these two properties extends a remarkable formula for the sum of the subordination functions of $\mu \boxplus \nu$ with respect to $\mu$ and to $\nu$ (see e.g. Theorem 4.1 in [4]). This formula can be equivalently written in terms of the η-series of $\mu \boxright \nu$ and $\nu \boxright \mu$, and in this form it goes through to the multi-variable framework, as follows.

Proposition 1.11. One has that
\[ \eta_{\mu \boxright \nu} + \eta_{\nu \boxright \mu} = \eta_{\mu \boxplus \nu}, \quad \forall \mu, \nu \in \mathcal{D}_{\mathrm{alg}}(k). \tag{1.14} \]
Another property of $\boxright$ comes from the functional equation satisfied by the subordination function of a convolution power $\nu^{\boxplus p}$ with respect to $\nu$, where $\nu$ is a probability measure on $\mathbb{R}$ and $p \in [1, \infty)$ (see Theorem 2.5 in [3]). This too can be translated into a formula involving η-series, which goes through to the multi-variable framework. More precisely, the subordination distribution of $\nu^{\boxplus p}$ with respect to $\nu$ can be considered for any $\nu \in \mathcal{D}_{\mathrm{alg}}(k)$ and $p \in [1, \infty)$ (see Definition 6.3 below), and the following statement holds.

Proposition 1.12. For every $\nu \in \mathcal{D}_{\mathrm{alg}}(k)$ and $p \ge 1$, the subordination distribution of $\nu^{\boxplus p}$ with respect to $\nu$ is equal to $(\mathbb{B}(\nu))^{\boxplus (p-1)}$.

In particular, for distributions in $\mathcal{D}_c(k)$ one gets the following corollary.

Corollary 1.13. Let $\nu$ be a distribution in $\mathcal{D}_c(k)$. Then for every $p \ge 1$ the subordination distribution of $\nu^{\boxplus p}$ with respect to $\nu$ is still in $\mathcal{D}_c(k)$, and is $\boxplus$-infinitely divisible.

One can also put into evidence other natural situations when subordination distributions in $\mathcal{D}_c(k)$ are sure to be $\boxplus$-infinitely divisible. In particular, an immediate consequence of Remark 1.2.1 (combined with Corollary 1.5) is that $\mu \boxright \nu$ is $\boxplus$-infinitely divisible whenever $\mu, \nu \in \mathcal{D}_c(k)$ and $\mu$ is itself $\boxplus$-infinitely divisible; see Corollary 4.13 below.

Remark 1.14. After circulating the first version of this paper, I was made aware of the connection between the results obtained here and the paper [2] of Anshelevich, where a two-variable extension of the transformation $\Phi$ from Remark 1.7 is being studied. More precisely, [2] introduces a map
\[ \mathcal{D}_{\mathrm{alg}}(k) \times \mathcal{D}_{\mathrm{alg}}(k) \ni (\rho, \psi) \mapsto \Phi[\rho, \psi] \in \mathcal{D}_{\mathrm{alg}}(k) \]
with the property that $\Phi[\rho, \psi] \in \mathcal{D}_c(k)$ for every $\rho, \psi \in \mathcal{D}_c(k)$ such that $\rho$ is $\boxplus$-infinitely divisible, and with the property that
\[ \Phi[\gamma, \psi] = \Phi(\psi), \quad \forall \psi \in \mathcal{D}_{\mathrm{alg}}(k) \tag{1.15} \]
(where $\gamma \in \mathcal{D}_c(k)$ is the same as in Remark 1.7). The formula which gives the translation between the results of the present paper and those of [2] is
\[ \Phi[\rho, \psi] = \mathbb{B}^{-1}(\rho \boxright \psi), \quad \forall \rho, \psi \in \mathcal{D}_{\mathrm{alg}}(k). \tag{1.16} \]
Eq. (1.16) can be used to explain why the argument $\rho$ in $\Phi[\rho, \psi]$ is naturally chosen to be $\boxplus$-infinitely divisible: as observed right before the present remark, one has in this situation that $\rho \boxright \psi$ is $\boxplus$-infinitely divisible, hence that $\mathbb{B}^{-1}(\rho \boxright \psi) \in \mathcal{D}_c(k)$ for every $\psi \in \mathcal{D}_c(k)$. (Another explanation for why $\rho$ is naturally taken to be $\boxplus$-infinitely divisible is presented in Remark 10 of [2].) By using the formula (1.16), the description of $\Phi$ given in the above Remark 1.9.1 is reduced to (1.15), and it is also easily seen that Proposition 1.10 of the present paper is equivalent to Theorem 11(b) from [2]. The scope of [2] is different from (albeit overlapping with) the one of the present paper, and the methods of proof are different, invoking e.g. results about conditionally positive definite functionals, or about a multi-variable version of monotonic convolution – see specifics in Section 4.2 of [2].

Remark 1.15 (Organization of the paper). Besides the introduction, the paper has five other sections. Section 2 contains a review of some background and notations. Section 3 derives explicit combinatorial formulas for the free and Boolean cumulants of $\mu \boxright \nu$, and uses them in order to obtain the moment formula announced in Theorem 1.3. Section 4 is devoted to operator models, and to the proof of Theorem 1.4. Section 5 discusses in more detail the relations to the transformations $\mathbb{B}_t$ that were stated in Theorem 1.8 and Proposition 1.10. Finally, Section 6 discusses in more detail the statements made in Propositions 1.11, 1.12, and in Corollary 1.13.

2. Background and notations

2.1. Motivation from 1-variable framework

Remark 2.1. Recall that for a probability distribution $\mu$ on $\mathbb{R}$, the Cauchy transform of $\mu$ is the analytic function $G_\mu$ defined by
\[ G_\mu(z) = \int_{\mathbb{R}} \frac{1}{z - t} \, d\mu(t), \quad z \in \mathbb{C} \setminus \mathbb{R}. \tag{2.1} \]
The reciprocal Cauchy transform $F_\mu$ is defined by
\[ F_\mu(z) = 1 / G_\mu(z), \quad z \in \mathbb{C} \setminus \mathbb{R}. \tag{2.2} \]
It is easily checked that $G_\mu$ maps the upper half-plane $\mathbb{C}^+ = \{z \in \mathbb{C} \mid \operatorname{Im}(z) > 0\}$ to the lower half-plane $\mathbb{C}^- = \{z \in \mathbb{C} \mid \operatorname{Im}(z) < 0\}$; as a consequence of this, $F_\mu$ can be viewed as an analytic self-map of $\mathbb{C}^+$. The measure $\mu$ is uniquely determined by $G_\mu$ (hence by $F_\mu$ as well); more precisely, $\mu$ can be retrieved from $G_\mu$ by a procedure called "Stieltjes inversion formula" (see e.g. [1]).

Let $\mathcal{F}$ denote the set of all analytic self-maps of $\mathbb{C}^+$ that can arise as $F_\mu$ for some probability measure $\mu$ on $\mathbb{R}$. One has a very nice intrinsic description of $\mathcal{F}$, namely
\[ \mathcal{F} = \Bigl\{ F : \mathbb{C}^+ \to \mathbb{C}^+ \;\Big|\; F \text{ is analytic and } \lim_{t \to \infty} \frac{F(it)}{it} = 1 \Bigr\}. \tag{2.3} \]
(For a nice review of this and other properties of $\mathcal{F}$ one can consult Section 2 of [12] or Section 5 of [8].)
As mentioned in Section 1, the starting point of this paper is that for any two probability measures $\mu, \nu$ on $\mathbb{R}$, there exists a subordination function $\omega \in \mathcal{F}$ such that
\[ G_{\mu \boxplus \nu}(z) = G_\nu\bigl( \omega(z) \bigr), \quad z \in \mathbb{C}^+. \tag{2.4} \]
With $\mu, \nu, \omega$ as in (2.4), it is natural to consider the unique probability measure $\sigma$ on $\mathbb{R}$ such that $F_\sigma = \omega$. This $\sigma$ was studied in [11], where it is called the s-free convolution of $\mu$ and $\nu$, and is denoted by $\mu \boxright \nu$. The name "s-free convolution" appears in [11] in connection to a suitably tailored concept of "s-freeness" that is also introduced in [11]. Since s-freeness is only marginally addressed in the present paper, we will refer to $\mu \boxright \nu$ by just calling it the subordination distribution of $\mu \boxplus \nu$ with respect to $\nu$.

We will only look at $\mu \boxright \nu$ in the special case when $\mu$ and $\nu$ are compactly supported; in this case $\mu \boxright \nu$ is compactly supported as well (as one sees by examining the operator model obtained for $\mu \boxright \nu$ in [11]).

Remark 2.2. If $\mu$ is a compactly supported probability measure on $\mathbb{R}$, then in particular $\mu$ has moments of all orders:
\[ m_n := \int_{-\infty}^{\infty} t^n \, d\mu(t), \quad n \in \mathbb{N}, \]
and one can form the moment series of $\mu$,
\[ M_\mu(z) = \sum_{n=1}^{\infty} m_n z^n. \tag{2.5} \]
In (2.5), $M_\mu$ can be viewed as an analytic function on a neighbourhood of 0, but in the present paper it is preferable to treat it as a formal power series in $z$. It is immediate that $M_\mu$ is connected to the Cauchy transform $G_\mu$ by the formula
\[ 1 + M_\mu(1/z) = z \cdot G_\mu(z), \tag{2.6} \]
where in (2.6) it is convenient to also treat $G_\mu$ as a power series (obtained by writing $1/(z - t) = \sum_{n=1}^{\infty} t^{n-1}/z^n$ and then integrating term by term on the right-hand side of (2.1), for $z \in \mathbb{C}^+$ with $|z|$ large enough).

In the study of free additive convolution a fundamental object is Voiculescu's R-transform, which has the linearizing property that $R_{\mu \boxplus \nu} = R_\mu + R_\nu$. For a compactly supported probability measure $\mu$ on $\mathbb{R}$, the R-transform $R_\mu$ can be viewed as a power series, defined in terms of $M_\mu$ as the unique solution of the equation
\[ R_\mu\bigl( z (1 + M_\mu(z)) \bigr) = M_\mu(z) \tag{2.7} \]
(equation in $\mathbb{C}[[z]]$, where $M_\mu$ is given as data and $R_\mu$ is the unknown). For the next proposition it is more convenient to write the definition of $R_\mu$ by emphasizing its relation to the Cauchy transform $G_\mu$. On these lines one first defines the so-called K-transform of $\mu$, which is simply the inverse under composition
\[ K_\mu := G_\mu^{\langle -1 \rangle}. \tag{2.8} \]
$K_\mu$ is a Laurent series of the form $K_\mu(z) = \frac{1}{z} + \alpha_1 + \alpha_2 z + \alpha_3 z^2 + \cdots$, and one has (see footnote 2)
\[ R_\mu(z) = z \Bigl( K_\mu(z) - \frac{1}{z} \Bigr). \tag{2.9} \]
In Proposition 2.3, Eq. (2.9) will be used in the equivalent form giving $K_\mu$ in terms of $R_\mu$,
\[ K_\mu(z) = \frac{1 + R_\mu(z)}{z}. \tag{2.10} \]
Proposition 2.3. Let $\mu, \nu$ be compactly supported probability measures on $\mathbb{R}$, and let the probability measure $\mu \boxright \nu$ be defined as in Remark 2.1. Then
\[ R_{\mu \boxright \nu}(z) = \frac{R_\mu\bigl( z (1 + M_\nu(z)) \bigr)}{1 + M_\nu(z)}. \tag{2.11} \]
Proof. Let us denote for brevity $\mu \boxright \nu =: \sigma$. From how $\mu \boxright \nu$ is defined we have that
\[ G_{\mu \boxplus \nu} = G_\nu \circ F_\sigma. \tag{2.12} \]
By taking inverses under composition on both sides of (2.12) one finds that $K_{\mu \boxplus \nu} = F_\sigma^{\langle -1 \rangle} \circ K_\nu$, hence that $F_\sigma \circ K_{\mu \boxplus \nu} = K_\nu$; this in turn implies that $G_\sigma \circ K_{\mu \boxplus \nu} = 1/K_\nu$, and that $K_{\mu \boxplus \nu} = K_\sigma \circ (1/K_\nu)$. So one gets the formula:
\[ K_{\mu \boxplus \nu}(w) = K_\sigma\bigl( 1/K_\nu(w) \bigr) \tag{2.13} \]
(equality of Laurent series in an indeterminate $w$).

In (2.13) let us next replace the K-transforms of $\mu \boxplus \nu$ and of $\sigma$ in terms of the corresponding R-transforms, by using Eq. (2.10). On the left-hand side we obtain
\[ K_{\mu \boxplus \nu}(w) = \frac{1 + R_{\mu \boxplus \nu}(w)}{w} = \frac{1 + R_\mu(w) + R_\nu(w)}{w} = \frac{R_\mu(w)}{w} + K_\nu(w), \]
while on the right-hand side we obtain
\[ K_\sigma\bigl( 1/K_\nu(w) \bigr) = \frac{1 + R_\sigma\bigl( 1/K_\nu(w) \bigr)}{1/K_\nu(w)} = K_\nu(w) + K_\nu(w) \cdot R_\sigma\bigl( 1/K_\nu(w) \bigr). \]
After making these replacements and after subtracting $K_\nu(w)$ out of both sides of (2.13) one arrives at
\[ \frac{R_\mu(w)}{w} = K_\nu(w) \cdot R_\sigma\bigl( 1/K_\nu(w) \bigr). \tag{2.14} \]
Finally, in (2.14) let us make the substitution $z = 1/K_\nu(w)$, with inverse $w = G_\nu(1/z) = z(1 + M_\nu(z))$; this substitution converts (2.14) into
\[ \frac{R_\mu\bigl( z (1 + M_\nu(z)) \bigr)}{z (1 + M_\nu(z))} = \frac{1}{z} \cdot R_\sigma(z), \]
and (2.11) follows. □

2 The original definition of the R-transform, made in [14], simply has $\mathcal{R}_\mu(z) = K_\mu(z) - 1/z$. The present paper uses the shifted version $R_\mu(z) = z \mathcal{R}_\mu(z)$, which is more convenient for extension to a multi-variable framework.
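As an informal illustration of (2.11), one can expand the right-hand side to low order. Writing $R_\mu(z) = \sum_{n \ge 1} \alpha_n z^n$ and $M_\nu(z) = \sum_{n \ge 1} m_n(\nu) z^n$ (the symbols $\alpha_n$ and $m_n(\nu)$ are labels chosen here for the illustration), one factor $1 + M_\nu(z)$ cancels in each term:

```latex
\begin{align*}
R_{\mu \boxright \nu}(z)
  &= \sum_{n \ge 1} \alpha_n z^n \bigl( 1 + M_\nu(z) \bigr)^{n-1} \\
  &= \alpha_1 z + \alpha_2 z^2
     + \bigl( \alpha_3 + \alpha_2\, m_1(\nu) \bigr) z^3
     + \bigl( \alpha_4 + 2 \alpha_3\, m_1(\nu) + \alpha_2\, m_2(\nu) \bigr) z^4
     + O(z^5).
\end{align*}
```

In particular, the first two coefficients of $R_{\mu \boxright \nu}$ coincide with those of $R_\mu$, and $\nu$ enters only from order three on.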
2.2. Non-crossing partitions

Notation 2.4 ($NC(n)$ terminology). Let $n$ be a positive integer.

1° Let $\pi = \{V_1, \ldots, V_p\}$ be a partition of $\{1, \ldots, n\}$ – i.e. $V_1, \ldots, V_p$ are pairwise disjoint non-empty sets (called the blocks of $\pi$), and $V_1 \cup \cdots \cup V_p = \{1, \ldots, n\}$. We say that $\pi$ is non-crossing if for every $1 \le i < j < i' < j' \le n$ such that $i$ is in the same block with $i'$ and $j$ is in the same block with $j'$, it necessarily follows that all of $i, i', j, j'$ are in the same block of $\pi$. The set of all non-crossing partitions of $\{1, \ldots, n\}$ will be denoted by $NC(n)$.

2° Let $\pi$ be a partition in $NC(n)$. Since $\pi$ is, after all, a set of subsets of $\{1, \ldots, n\}$, it will be convenient to write "$V \in \pi$" as a shorthand for "$V$ is a block of $\pi$." In the same vein, various calculations throughout the paper will use functions "$c : \pi \to \{1, 2\}$." Such a function is thus a recipe for assigning a number $c(V) \in \{1, 2\}$ to every block $V$ of $\pi$, and will be referred to as a colouring of $\pi$.

3° For $\pi \in NC(n)$, the number of blocks of $\pi$ will be denoted by $|\pi|$.

4° Let $\pi$ be a partition in $NC(n)$, and let $V$ be a block of $\pi$. If there exists a block $W$ of $\pi$ such that $\min(W) < \min(V)$ and $\max(W) > \max(V)$, then one says that $V$ is an inner block of $\pi$. In the opposite case one says that $V$ is an outer block of $\pi$.

5° Every partition $\pi \in NC(n)$ has a special colouring $o_\pi : \pi \to \{1, 2\}$ which will be called the inner/outer colouring of $\pi$, and is defined by
\[ o_\pi(V) = \begin{cases} 1 & \text{if } V \text{ is outer}, \\ 2 & \text{if } V \text{ is inner}, \end{cases} \quad V \in \pi. \tag{2.15} \]
Remark 2.5. $NC(n)$ is partially ordered by reverse refinement: for $\pi, \rho \in NC(n)$ one writes "$\pi \le \rho$" to mean that every block of $\rho$ is a union of blocks of $\pi$. The minimal and maximal elements of $(NC(n), \le)$ are denoted by $0_n$ (the partition of $\{1, \ldots, n\}$ into $n$ singleton blocks) and respectively $1_n$ (the partition of $\{1, \ldots, n\}$ into only one block).

Let $\rho = \{W_1, \ldots, W_q\}$ be a fixed partition in $NC(n)$. It is easy to see that one has a natural poset isomorphism
\[ \bigl\{ \pi \in NC(n) \bigm| \pi \le \rho \bigr\} \ni \pi \mapsto (\pi_1, \ldots, \pi_q) \in NC(|W_1|) \times \cdots \times NC(|W_q|) \tag{2.16} \]
where for every $1 \le j \le q$ the partition $\pi_j \in NC(|W_j|)$ is obtained by restricting $\pi$ to $W_j$ and by re-denoting the elements of $W_j$, in increasing order, so that they become $1, 2, \ldots, |W_j|$. This is a particular case of a more general factorization property satisfied by the intervals of the poset $(NC(n), \le)$ – see [13, Lecture 9].
Remark 2.6. This paper also makes use of another partial order relation on $NC(n)$, which was introduced in [5] and is denoted by "$\ll$." For $\pi, \rho \in NC(n)$ one writes "$\pi \ll \rho$" to mean that $\pi \le \rho$ and that, in addition, the following condition is fulfilled:
\[ \text{For every block } W \text{ of } \rho \text{ there exists a block } V \text{ of } \pi \text{ such that } \min(W), \max(W) \in V. \tag{2.17} \]
It is immediately verified that "$\ll$" is indeed a partial order relation on $NC(n)$. It is much coarser than the reversed refinement order. For instance, the inequality $\pi \ll 1_n$ does not hold for all $\pi \in NC(n)$; it rather amounts to the condition that the numbers 1 and $n$ belong to the same block of $\pi$ (or equivalently, that $\pi$ has a unique outer block). At the other end of $NC(n)$, the inequality $\pi \ll 0_n$ can only take place when $\pi = 0_n$.

The remaining part of Section 2.2 reviews a couple of other properties of $\ll$ that will be used later on in the paper.

Definition 2.7. Let $\pi, \rho$ be partitions in $NC(n)$ such that $\pi \ll \rho$. A block $V$ of $\pi$ is said to be $\rho$-special when there exists a block $W$ of $\rho$ such that $\min(V) = \min(W)$ and $\max(V) = \max(W)$.

Proposition 2.8. Let $\pi \in NC(n)$ be such that $\pi \ll 1_n$, and consider the set of partitions
\[ \bigl\{ \rho \in NC(n) \bigm| \pi \ll \rho \ll 1_n \bigr\}. \tag{2.18} \]
Then $\rho \mapsto \{V \in \pi \mid V \text{ is } \rho\text{-special}\}$ is a one-to-one map from the set (2.18) to the set of subsets of $\pi$ (see footnote 3). The image of this map is equal to $\{\mathfrak{V} \subseteq \pi \mid \mathfrak{V} \ni V_0\}$, where $V_0$ denotes the unique outer block of $\pi$.

3 According to the conventions made in Notation 2.4.2, "subset of $\pi$" stands here for "set of blocks of $\pi$."

For the proof of Proposition 2.8, the reader is referred to Proposition 2.13 and Remark 2.14 of [5].

Remark 2.9 (Interval partitions). A partition $\pi$ of $\{1, \ldots, n\}$ is said to be an interval partition if every block $V$ of $\pi$ is of the form $V = [i, j] \cap \mathbb{Z}$ for some $1 \le i \le j \le n$. The set of all interval partitions of $\{1, \ldots, n\}$ will be denoted by $\mathrm{Int}(n)$. It is clear that $\mathrm{Int}(n) \subseteq NC(n)$, and it is easily verified that every interval partition is a maximal element of the poset $(NC(n), \ll)$. It is moreover easy to see (left as exercise to the reader) that for every $\pi \in NC(n)$ there exists a unique $\rho \in \mathrm{Int}(n)$ such that $\pi \ll \rho$; the blocks of this special interval partition $\rho$ are in some sense the "convex hulls" of the outer blocks of $\pi$.

2.3. Power series in k non-commuting indeterminates

Notation 2.10. We will denote by $\mathbb{C}\langle\langle z_1, \ldots, z_k \rangle\rangle$ the set of power series with complex coefficients in the non-commuting indeterminates $z_1, \ldots, z_k$, and we will use the notation $\mathbb{C}_0\langle\langle z_1, \ldots, z_k \rangle\rangle$ for the set of series in $\mathbb{C}\langle\langle z_1, \ldots, z_k \rangle\rangle$ which have vanishing constant term. The general form of a series $f \in \mathbb{C}_0\langle\langle z_1, \ldots, z_k \rangle\rangle$ is thus
\[ f(z_1, \ldots, z_k) = \sum_{n=1}^{\infty} \sum_{i_1, \ldots, i_n = 1}^{k} \alpha_{(i_1,\ldots,i_n)} z_{i_1} \cdots z_{i_n} \tag{2.19} \]
where the coefficients $\alpha_{(i_1,\ldots,i_n)}$ are from $\mathbb{C}$.
Definition 2.11 (Coefficients for series in $\mathbb{C}_0\langle\langle z_1, \ldots, z_k \rangle\rangle$).

1° For $n \ge 1$ and $1 \le i_1, \ldots, i_n \le k$ we will denote by
\[ \mathrm{Cf}_{(i_1,\ldots,i_n)} : \mathbb{C}_0\langle\langle z_1, \ldots, z_k \rangle\rangle \to \mathbb{C} \tag{2.20} \]
the linear functional which extracts the coefficient of $z_{i_1} \cdots z_{i_n}$ in a series $f \in \mathbb{C}_0\langle\langle z_1, \ldots, z_k \rangle\rangle$. Thus for $f$ written as in Eq. (2.19) we have $\mathrm{Cf}_{(i_1,\ldots,i_n)}(f) = \alpha_{(i_1,\ldots,i_n)}$.

2° Suppose we are given a positive integer $n$, some indices $i_1, \ldots, i_n \in \{1, \ldots, k\}$, and a partition $\pi \in NC(n)$. We define a (generally non-linear) functional
\[ \mathrm{Cf}_{(i_1,\ldots,i_n);\pi} : \mathbb{C}_0\langle\langle z_1, \ldots, z_k \rangle\rangle \to \mathbb{C}, \tag{2.21} \]
as follows. For every block $V = \{v(1), \ldots, v(m)\}$ of $\pi$, with $1 \le v(1) < \cdots < v(m) \le n$, let us use the notation
\[ (i_1, \ldots, i_n) \mid V := (i_{v(1)}, \ldots, i_{v(m)}) \in \{1, \ldots, k\}^m. \]
Then we define
\[ \mathrm{Cf}_{(i_1,\ldots,i_n);\pi}(f) := \prod_{V \in \pi} \mathrm{Cf}_{(i_1,\ldots,i_n)|V}(f), \quad \forall f \in \mathbb{C}_0\langle\langle z_1, \ldots, z_k \rangle\rangle. \tag{2.22} \]
(For example if we had $n = 5$ and $\pi = \{\{1, 4, 5\}, \{2, 3\}\}$, and if $i_1, \ldots, i_5$ were some fixed indices in $\{1, \ldots, k\}$, then the above formula would become
\[ \mathrm{Cf}_{(i_1,i_2,i_3,i_4,i_5);\pi}(f) = \mathrm{Cf}_{(i_1,i_4,i_5)}(f) \cdot \mathrm{Cf}_{(i_2,i_3)}(f), \quad f \in \mathbb{C}_0\langle\langle z_1, \ldots, z_k \rangle\rangle.) \]
The quantities $\mathrm{Cf}_{(i_1,\ldots,i_n);\pi}(f)$ will be referred to as generalized coefficients of the series $f$.

3° Suppose that the positive integer $n$, the indices $i_1, \ldots, i_n \in \{1, \ldots, k\}$ and the partition $\pi \in NC(n)$ are as above, and that in addition we are also given a colouring $c : \pi \to \{1, 2\}$. Then for any two series $f_1, f_2 \in \mathbb{C}_0\langle\langle z_1, \ldots, z_k \rangle\rangle$ we define their mixed generalized coefficient corresponding to $(i_1, \ldots, i_n)$, $\pi$ and $c$ via the formula
\[ \mathrm{Cf}_{(i_1,\ldots,i_n);\pi;c}(f_1, f_2) := \prod_{V \in \pi} \mathrm{Cf}_{(i_1,\ldots,i_n)|V}(f_{c(V)}). \tag{2.23} \]
(For example if we had $n = 5$, $\pi = \{\{1, 4, 5\}, \{2, 3\}\}$ and $c : \pi \to \{1, 2\}$ defined by $c(\{1, 4, 5\}) = 1$, $c(\{2, 3\}) = 2$, then (2.23) would become
\[ \mathrm{Cf}_{(i_1,i_2,i_3,i_4,i_5);\pi;c}(f_1, f_2) = \mathrm{Cf}_{(i_1,i_4,i_5)}(f_1) \cdot \mathrm{Cf}_{(i_2,i_3)}(f_2), \]
for $f_1, f_2 \in \mathbb{C}_0\langle\langle z_1, \ldots, z_k \rangle\rangle$ and $1 \le i_1, \ldots, i_5 \le k$.)

Remark 2.12. It is clear that for every $n \ge 1$, $1 \le i_1, \ldots, i_n \le k$, $\pi \in NC(n)$ and $f \in \mathbb{C}_0\langle\langle z_1, \ldots, z_k \rangle\rangle$ one has
\[ \mathrm{Cf}_{(i_1,\ldots,i_n);\pi;c}(f, f) = \mathrm{Cf}_{(i_1,\ldots,i_n);\pi}(f), \]
no matter what colouring $c$ of $\pi$ is used. Let us also record here the obvious expansion formula
\[ \mathrm{Cf}_{(i_1,\ldots,i_n);\pi}(f_1 + f_2) = \sum_{c : \pi \to \{1,2\}} \mathrm{Cf}_{(i_1,\ldots,i_n);\pi;c}(f_1, f_2), \tag{2.24} \]
holding for every $n \ge 1$, $1 \le i_1, \ldots, i_n \le k$, $\pi \in NC(n)$, and $f_1, f_2 \in \mathbb{C}_0\langle\langle z_1, \ldots, z_k \rangle\rangle$.

Definition 2.13 (Review of the series $M_\mu$, $R_\mu$, $\eta_\mu$). Let $\mu$ be a distribution in $\mathcal{D}_{\mathrm{alg}}(k)$.

1° We will denote by $M_\mu$ the series in $\mathbb{C}_0\langle\langle z_1, \ldots, z_k \rangle\rangle$ defined by
\[ M_\mu(z_1, \ldots, z_k) := \sum_{n=1}^{\infty} \sum_{i_1, \ldots, i_n = 1}^{k} \mu(X_{i_1} \cdots X_{i_n}) \, z_{i_1} \cdots z_{i_n}. \tag{2.25} \]
$M_\mu$ is called the moment series of $\mu$, and its coefficients (the numbers $\mu(X_{i_1} \cdots X_{i_n})$, with $n \ge 1$ and $1 \le i_1, \ldots, i_n \le k$) are called the moments of $\mu$.

2° The η-series of $\mu$ is
\[ \eta_\mu := M_\mu (1 + M_\mu)^{-1} \in \mathbb{C}_0\langle\langle z_1, \ldots, z_k \rangle\rangle, \tag{2.26} \]
where $(1 + M_\mu)^{-1}$ is the inverse of $1 + M_\mu$ under multiplication in $\mathbb{C}\langle\langle z_1, \ldots, z_k \rangle\rangle$. The coefficients of $\eta_\mu$ are called the Boolean cumulants of $\mu$.

3° There exists a unique series $R_\mu \in \mathbb{C}_0\langle\langle z_1, \ldots, z_k \rangle\rangle$ which satisfies the functional equation
\[ R_\mu\bigl( z_1(1 + M_\mu), \ldots, z_k(1 + M_\mu) \bigr) = M_\mu. \tag{2.27} \]
Indeed, it is easily seen that Eq. (2.27) amounts to a recursion which determines uniquely the coefficients of $R_\mu$ in terms of those of $M_\mu$. The series $R_\mu$ is called the R-transform of $\mu$, and its coefficients are called the free cumulants of $\mu$. (See the discussion in [13, Lecture 16], and specifically Theorem 16.15 and Corollary 16.16 of that lecture.)

Remark 2.14. It is very useful that one has explicit summation formulas which express the moments of a distribution $\mu \in \mathcal{D}_{\mathrm{alg}}(k)$ either in terms of its free cumulants or in terms of its Boolean cumulants. These are sometimes referred to as moment–cumulant formulas. They say that for every $n \ge 1$ and $1 \le i_1, \ldots, i_n \le k$ one has
\[ \mu(X_{i_1} \cdots X_{i_n}) = \sum_{\pi \in NC(n)} \mathrm{Cf}_{(i_1,\ldots,i_n);\pi}(R_\mu) \tag{2.28} \]
and respectively
\[ \mu(X_{i_1} \cdots X_{i_n}) = \sum_{\pi \in \mathrm{Int}(n)} \mathrm{Cf}_{(i_1,\ldots,i_n);\pi}(\eta_\mu) \tag{2.29} \]
(where (2.28), (2.29) use the notations for generalized coefficients from Definition 2.11.2, and $\mathrm{Int}(n)$ is the set of interval partitions from Remark 2.9). Moreover, a similar summation formula can be used in order to express the Boolean cumulants of $\mu$ in terms of its free cumulants; it says that for every $n \ge 1$ and $1 \le i_1, \ldots, i_n \le k$ one has
\[ \mathrm{Cf}_{(i_1,\ldots,i_n)}(\eta_\mu) = \sum_{\substack{\pi \in NC(n), \\ \pi \ll 1_n}} \mathrm{Cf}_{(i_1,\ldots,i_n);\pi}(R_\mu). \tag{2.30} \]
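To make (2.30) concrete, here are its two smallest non-trivial instances, written out as a hedged illustration. By Remark 2.6 the condition $\pi \ll 1_n$ says that 1 and $n$ lie in the same block of $\pi$; for $n = 2$ this leaves only $\pi = 1_2$, and for $n = 3$ it leaves $1_3$ and $\{\{1,3\},\{2\}\}$.

```latex
\begin{align*}
\mathrm{Cf}_{(i_1,i_2)}(\eta_\mu) &= \mathrm{Cf}_{(i_1,i_2)}(R_\mu), \\
\mathrm{Cf}_{(i_1,i_2,i_3)}(\eta_\mu) &= \mathrm{Cf}_{(i_1,i_2,i_3)}(R_\mu)
   + \mathrm{Cf}_{(i_1,i_3)}(R_\mu) \cdot \mathrm{Cf}_{(i_2)}(R_\mu).
\end{align*}
```

In one variable, with moments $m_n$ and free cumulants $\kappa_n$ (labels assumed here for the illustration), the second line reads $\kappa_3 + \kappa_2 \kappa_1$, which agrees with the third Boolean cumulant $m_3 - 2 m_1 m_2 + m_1^3$ after substituting $m_1 = \kappa_1$, $m_2 = \kappa_2 + \kappa_1^2$ and $m_3 = \kappa_3 + 3\kappa_1\kappa_2 + \kappa_1^3$.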
(For a more detailed discussion of the relation between $R_\mu$ and $\eta_\mu$ see Section 3 of [5], where Eq. (2.30) appears in Proposition 3.9.)

3. The approach to $\boxright$ via R-transforms

The goal of this section is to derive explicit combinatorial formulas for the free and Boolean cumulants of $\mu \boxright \nu$, and then use them in order to obtain the moment formula announced in Theorem 1.3.

Remark 3.1. Let $\mu, \nu$ be distributions in $\mathcal{D}_{\mathrm{alg}}(k)$. Consider the subordination distribution $\mu \boxright \nu$, and recall that its R-transform satisfies the equation
\[ R_{\mu \boxright \nu} \cdot (1 + M_\nu) = R_\mu\bigl( z_1(1 + M_\nu), \ldots, z_k(1 + M_\nu) \bigr). \tag{3.1} \]
If we denote for convenience
\[ \mathrm{Cf}_{(i_1,\ldots,i_n)}(R_\mu) =: \alpha_{(i_1,\ldots,i_n)}, \quad \forall n \ge 1, \ 1 \le i_1, \ldots, i_n \le k, \]
then the series on the right-hand side of (3.1) is written more precisely as
\[ \sum_{m=1}^{\infty} \sum_{j_1, \ldots, j_m = 1}^{k} \alpha_{(j_1,\ldots,j_m)} \, z_{j_1}(1 + M_\nu) \cdots z_{j_m}(1 + M_\nu). \tag{3.2} \]
Let us fix an $n \ge 1$ and some indices $1 \le i_1, \ldots, i_n \le k$, and let us look at the coefficient of $z_{i_1} \cdots z_{i_n}$ in the infinite sum from (3.2). Clearly, a term $\alpha_{(j_1,\ldots,j_m)} z_{j_1}(1 + M_\nu) \cdots z_{j_m}(1 + M_\nu)$ contributes to this coefficient if and only if $m \le n$ and there exist $1 = s(1) < s(2) < \cdots < s(m) \le n$ such that

$$j_1 = i_{s(1)}, \quad j_2 = i_{s(2)}, \quad \ldots, \quad j_m = i_{s(m)}. \tag{3.3}$$

In the case when (3.3) holds let us denote $\{s(1), s(2), \ldots, s(m)\} =: S$, and let us refer to the intervals of integers

$$\bigl(s(1), s(2)\bigr) \cap \mathbb{Z}, \quad \ldots, \quad \bigl(s(m-1), s(m)\bigr) \cap \mathbb{Z}, \quad \bigl(s(m), n\bigr] \cap \mathbb{Z}$$

by calling them the gaps of $S$; with this notation the contribution of $\alpha_{(j_1,\ldots,j_m)} z_{j_1}(1 + M_\nu) \cdots z_{j_m}(1 + M_\nu)$ to the coefficient of $z_{i_1} \cdots z_{i_n}$ in (3.2) is written as

$$\alpha_{(i_1,\ldots,i_n)|S} \cdot \prod_{G = \{p,\ldots,q\} \text{ gap of } S} \nu(X_{i_p} \cdots X_{i_q})$$
(we make the convention that if $G$ is an empty gap of $S$ then the corresponding product $\nu(X_{i_p} \cdots X_{i_q})$ is taken to be equal to 1). Since the set $S$ appearing above can be any subset of $\{1, \ldots, n\}$ which contains 1, we come to the conclusion that

$$\mathrm{Cf}_{(i_1,\ldots,i_n)}\Bigl( R_\mu\bigl( z_1(1 + M_\nu), \ldots, z_k(1 + M_\nu) \bigr) \Bigr) = \sum_{\substack{S \subseteq \{1,\ldots,n\} \\ \text{such that } S \ni 1}} \Bigl( \alpha_{(i_1,\ldots,i_n)|S} \cdot \prod_{G = \{p,\ldots,q\} \text{ gap of } S} \nu(X_{i_p} \cdots X_{i_q}) \Bigr). \tag{3.4}$$
By equating coefficients in the series on the two sides of (3.1) and by employing (3.4) one obtains explicit formulas for the coefficients of $R_{\mu \,\boxed{\vdash}\, \nu}$, as shown in the next lemma and proposition.

Lemma 3.2. Consider the same notations as in Remark 3.1. For every $n \ge 1$ and $1 \le i_1, \ldots, i_n \le k$ one has that

$$\mathrm{Cf}_{(i_1,\ldots,i_n)}(R_{\mu \,\boxed{\vdash}\, \nu}) = \sum_{\substack{S \subseteq \{1,\ldots,n\} \\ \text{such that } S \ni 1, n}} \Bigl( \alpha_{(i_1,\ldots,i_n)|S} \cdot \prod_{G = \{p,\ldots,q\} \text{ gap of } S} \nu(X_{i_p} \cdots X_{i_q}) \Bigr). \tag{3.5}$$
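Formula (3.5) can be checked numerically against the defining equation (3.1) in the one-variable case $k = 1$. The sketch below is only an illustration under that assumption: `alpha[j]` stands for the $j$-th free cumulant of $\mu$, `m[j]` for the $j$-th moment of $\nu$ (with `m[0] = 1`), power series are truncated at degree `N`, and all names and numerical inputs are hypothetical.

```python
from itertools import combinations
from math import prod

def r_coeffs_via_subsets(alpha, m, N):
    """Coefficients r[1..N] from (3.5) with k = 1: sum over S containing 1 and n."""
    r = [0] * (N + 1)
    r[1] = alpha[1]                      # n = 1: S = {1}, no gaps
    for n in range(2, N + 1):
        for size in range(0, n - 1):     # number of elements of S strictly between 1 and n
            for mid in combinations(range(2, n), size):
                S = (1,) + mid + (n,)
                gaps = [S[j + 1] - S[j] - 1 for j in range(len(S) - 1)]
                r[n] += alpha[len(S)] * prod(m[g] for g in gaps)
    return r

def mul(p, q, N):
    """Product of two polynomials (coefficient lists), truncated at degree N."""
    out = [0] * (N + 1)
    for i, a in enumerate(p):
        for j, b in enumerate(q):
            if i + j <= N:
                out[i + j] += a * b
    return out

def r_coeffs_via_series(alpha, m, N):
    """Solve R(z) * (1 + M(z)) = R_mu(z * (1 + M(z))) for R, degree by degree."""
    one_plus_M = [1] + [m[j] for j in range(1, N + 1)]
    w = [0] + one_plus_M[:N]             # w(z) = z * (1 + M(z))
    rhs = [0] * (N + 1)
    w_pow = [1] + [0] * N
    for j in range(1, N + 1):
        w_pow = mul(w_pow, w, N)
        rhs = [x + alpha[j] * y for x, y in zip(rhs, w_pow)]
    r = [0] * (N + 1)
    for n in range(1, N + 1):            # constant term of 1 + M is 1
        r[n] = rhs[n] - sum(r[j] * one_plus_M[n - j] for j in range(1, n))
    return r

# Arbitrary integer data, chosen only to exercise both computations.
alpha = [0, 2, 3, 5, 7, 11]
m = [1, 1, 4, 2, 9, 3]
by_subsets = r_coeffs_via_subsets(alpha, m, 5)
by_series = r_coeffs_via_series(alpha, m, 5)
```

For instance, the coefficient of degree 3 comes out as `alpha[2] * m[1] + alpha[3]`, corresponding to the two subsets $S = \{1, 3\}$ and $S = \{1, 2, 3\}$.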
Proof. We will prove the required formula (3.5) by induction on $n$. For $n = 1$, (3.5) states that $\mathrm{Cf}_{(i_1)}(R_{\mu \,\boxed{\vdash}\, \nu}) = \alpha_{(i_1)}$, $\forall 1 \le i_1 \le k$; this is indeed true, as one sees by equating the coefficients of $z_{i_1}$ on the two sides of (3.1).

Induction step. We fix an integer $n \ge 2$, we assume that (3.5) holds for $1, 2, \ldots, n - 1$ and we prove that it also holds for $n$. So let $i_1, \ldots, i_n$ be some indices in $\{1, \ldots, k\}$. The coefficient of $z_{i_1} \cdots z_{i_n}$ in $R_{\mu \,\boxed{\vdash}\, \nu} \cdot (1 + M_\nu)$ is equal to:

$$\mathrm{Cf}_{(i_1,\ldots,i_n)}(R_{\mu \,\boxed{\vdash}\, \nu}) + \sum_{m=1}^{n-1} \mathrm{Cf}_{(i_1,\ldots,i_m)}(R_{\mu \,\boxed{\vdash}\, \nu}) \cdot \nu(X_{i_{m+1}} \cdots X_{i_n}). \tag{3.6}$$
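For every $1 \le m \le n - 1$ the induction hypothesis gives us that

$$\mathrm{Cf}_{(i_1,\ldots,i_m)}(R_{\mu \,\boxed{\vdash}\, \nu}) \cdot \nu(X_{i_{m+1}} \cdots X_{i_n}) = \sum_{\substack{S \subseteq \{1,\ldots,m\} \\ \text{such that } S \ni 1, m}} \Bigl( \alpha_{(i_1,\ldots,i_m)|S} \cdot \prod_{G = \{p,\ldots,q\} \text{ gap of } S} \nu(X_{i_p} \cdots X_{i_q}) \Bigr) \cdot \nu(X_{i_{m+1}} \cdots X_{i_n}).$$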
In the latter expression the separate factor $\nu(X_{i_{m+1}} \cdots X_{i_n})$ can be incorporated into the product over the gaps of $S$, via the simple trick of treating $S$ as a subset of $\{1, \ldots, n\}$ rather than a subset of $\{1, \ldots, m\}$. (Indeed, in this way $S$ gets the additional gap $\{m + 1, \ldots, n\}$, with corresponding factor $\nu(X_{i_{m+1}} \cdots X_{i_n})$.) When this is done and when the resulting formula for $\mathrm{Cf}_{(i_1,\ldots,i_m)}(R_{\mu \,\boxed{\vdash}\, \nu}) \cdot \nu(X_{i_{m+1}} \cdots X_{i_n})$ is substituted in (3.6), we find that

$$\mathrm{Cf}_{(i_1,\ldots,i_n)}\bigl( R_{\mu \,\boxed{\vdash}\, \nu} \cdot (1 + M_\nu) \bigr) = \mathrm{Cf}_{(i_1,\ldots,i_n)}(R_{\mu \,\boxed{\vdash}\, \nu}) + \sum_{\substack{S \subseteq \{1,\ldots,n\} \text{ such that} \\ 1 \in S \text{ and } n \notin S}} \Bigl( \alpha_{(i_1,\ldots,i_n)|S} \cdot \prod_{G = \{p,\ldots,q\} \text{ gap of } S} \nu(X_{i_p} \cdots X_{i_q}) \Bigr). \tag{3.7}$$
Finally, we equate the right-hand sides of Eqs. (3.7) and (3.4), and the required formula for $\mathrm{Cf}_{(i_1,\ldots,i_n)}(R_{\mu \,\boxed{\vdash}\, \nu})$ follows. □

Proposition 3.3. Let $\mu, \nu$ be distributions in $\mathcal{D}_{\mathrm{alg}}(k)$. For every $n \ge 1$ and $1 \le i_1, \ldots, i_n \le k$ one has

$$\mathrm{Cf}_{(i_1,\ldots,i_n)}(R_{\mu \,\boxed{\vdash}\, \nu}) = \sum_{\substack{\pi \in NC(n), \\ \pi \ll 1_n}} \mathrm{Cf}_{(i_1,\ldots,i_n);\pi;o_\pi}(R_\mu, R_\nu), \tag{3.8}$$

where the inner/outer colouring $o_\pi$ is as in Notation 2.4.5, and the generalized coefficient $\mathrm{Cf}_{(i_1,\ldots,i_n);\pi;o_\pi}(R_\mu, R_\nu)$ is as in Definition 2.11.3.

Proof. We will use the various notations introduced in Remark 3.1 and Lemma 3.2 above. Let us pick a subset $S \subseteq \{1, \ldots, n\}$ such that $S \ni 1, n$, and let us prove that

$$\alpha_{(i_1,\ldots,i_n)|S} \cdot \prod_{G = \{p,\ldots,q\} \text{ gap of } S} \nu(X_{i_p} \cdots X_{i_q}) = \sum_{\substack{\pi \in NC(n) \\ \text{such that } S \in \pi}} \mathrm{Cf}_{(i_1,\ldots,i_n);\pi;o_\pi}(R_\mu, R_\nu). \tag{3.9}$$
In order to verify (3.9), let us write explicitly $S = \{s(1), s(2), \ldots, s(m)\}$ with $1 = s(1) < s(2) < \cdots < s(m) = n$; then the gaps of $S$ are listed as $G_1, \ldots, G_{m-1}$, with $G_j = \{p_j, \ldots, q_j\} = \bigl(s(j), s(j+1)\bigr) \cap \mathbb{Z}$ for $1 \le j \le m - 1$, and the left-hand side of (3.9) becomes

$$\alpha_{(i_1,\ldots,i_n)|S} \cdot \prod_{j=1}^{m-1} \nu(X_{i_{p_j}} \cdots X_{i_{q_j}}) \tag{3.10}$$

(with the same convention as used above, that "$\nu(X_{i_{p_j}} \cdots X_{i_{q_j}})$" is to be read as 1 in the case when $G_j = \emptyset$). Now in (3.10) let us use the free moment–cumulant formula (2.28) to express the moments $\nu(X_{i_{p_j}} \cdots X_{i_{q_j}})$ in terms of the coefficients of $R_\nu$; we get

$$\alpha_{(i_1,\ldots,i_n)|S} \cdot \prod_{j=1}^{m-1} \Bigl( \sum_{\pi_j \in NC(|G_j|)} \mathrm{Cf}_{(i_{p_j},\ldots,i_{q_j});\pi_j}(R_\nu) \Bigr) = \sum_{\substack{\pi_1 \in NC(|G_1|), \ldots, \\ \pi_{m-1} \in NC(|G_{m-1}|)}} \Bigl( \mathrm{Cf}_{(i_1,\ldots,i_n)|S}(R_\mu) \cdot \prod_{j=1}^{m-1} \mathrm{Cf}_{((i_1,\ldots,i_n)|G_j);\pi_j}(R_\nu) \Bigr). \tag{3.11}$$
But a family of non-crossing partitions $\pi_1 \in NC(|G_1|), \ldots, \pi_{m-1} \in NC(|G_{m-1}|)$ is naturally assembled, together with $S$, into one non-crossing partition $\pi \in NC(n)$; and all partitions $\pi \in NC(n)$ such that $S \in \pi$ are obtained in this way, without repetitions. Moreover, when $\pi_1, \ldots, \pi_{m-1}$ and $S$ are assembled together into $\pi$, it is clear that the big product from (3.11) becomes just $\mathrm{Cf}_{(i_1,\ldots,i_n);\pi;o_\pi}(R_\mu, R_\nu)$. Hence the substitution $(\pi_1, \ldots, \pi_{m-1}) \leftrightarrow \pi$ leads to the right-hand side of (3.9), and this completes the proof that (3.9) holds.

Finally, we sum over $S$ on both sides of (3.9), with $S$ running in the collection of all subsets of $\{1, \ldots, n\}$ which contain 1 and $n$. The sum on the left-hand side gives $\mathrm{Cf}_{(i_1,\ldots,i_n)}(R_{\mu \,\boxed{\vdash}\, \nu})$ by Lemma 3.2, while the sum on the right-hand side takes us precisely to the right-hand side of (3.8), as we wanted. □

It will come in handy to also have an extended version of the formula found in Proposition 3.3, which covers the generalized coefficients "$\mathrm{Cf}_{(i_1,\ldots,i_n);\rho}$" of the R-transform of $\mu \,\boxed{\vdash}\, \nu$. This is presented in Lemma 3.6, and uses the following extension for the concept of inner/outer colouring of a non-crossing partition.

Notation 3.4. Let $n$ be a positive integer and let $\pi, \rho$ be partitions in $NC(n)$ such that $\pi \ll \rho$. We denote by $o_{\pi,\rho}$ the colouring of $\pi$ defined by

$$o_{\pi,\rho}(V) = \begin{cases} 1, & \text{if } V \text{ is } \rho\text{-special}, \\ 2, & \text{if } V \text{ is not } \rho\text{-special}, \end{cases} \qquad V \in \pi, \tag{3.12}$$

where the concept of "being $\rho$-special" for a block of $\pi$ is as in Definition 2.7.

Remark 3.5. Let $\pi$ be a partition in $NC(n)$ and let $\rho$ be the unique interval-partition with the property that $\rho \gg \pi$. Then the colouring $o_{\pi,\rho}$ defined above is just the usual inner/outer colouring $o_\pi$; indeed, in this case a block $V$ of $\pi$ is $\rho$-special if and only if it is outer.

Lemma 3.6. Let $\mu, \nu$ be distributions in $\mathcal{D}_{\mathrm{alg}}(k)$. For every $n \ge 1$, $\rho \in NC(n)$ and $1 \le i_1, \ldots, i_n \le k$ one has

$$\mathrm{Cf}_{(i_1,\ldots,i_n);\rho}(R_{\mu \,\boxed{\vdash}\, \nu}) = \sum_{\substack{\pi \in NC(n), \\ \pi \ll \rho}} \mathrm{Cf}_{(i_1,\ldots,i_n);\pi;o_{\pi,\rho}}(R_\mu, R_\nu). \tag{3.13}$$
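Proof. Let us write explicitly $\rho = \{W_1, \ldots, W_q\}$. Then

$$\begin{aligned} \mathrm{Cf}_{(i_1,\ldots,i_n);\rho}(R_{\mu \,\boxed{\vdash}\, \nu}) &= \prod_{j=1}^{q} \mathrm{Cf}_{(i_1,\ldots,i_n)|W_j}(R_{\mu \,\boxed{\vdash}\, \nu}) \\ &= \prod_{j=1}^{q} \Bigl( \sum_{\substack{\pi_j \in NC(|W_j|), \\ \pi_j \ll 1_{|W_j|}}} \mathrm{Cf}_{((i_1,\ldots,i_n)|W_j);\pi_j;o_{\pi_j}}(R_\mu, R_\nu) \Bigr) \\ &= \sum_{\substack{\pi_1 \in NC(|W_1|),\ \pi_1 \ll 1_{|W_1|}, \ldots, \\ \pi_q \in NC(|W_q|),\ \pi_q \ll 1_{|W_q|}}} \Bigl( \prod_{j=1}^{q} \mathrm{Cf}_{((i_1,\ldots,i_n)|W_j);\pi_j;o_{\pi_j}}(R_\mu, R_\nu) \Bigr). \end{aligned} \tag{3.14}$$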
Now let us consider the bijection (2.16) from Remark 2.5. It is immediate that if $\pi \leftrightarrow (\pi_1, \ldots, \pi_q)$ via this bijection, then

$$\prod_{j=1}^{q} \mathrm{Cf}_{((i_1,\ldots,i_n)|W_j);\pi_j;o_{\pi_j}}(R_\mu, R_\nu) = \mathrm{Cf}_{(i_1,\ldots,i_n);\pi;o_{\pi,\rho}}(R_\mu, R_\nu).$$

Thus when in (3.14) we perform the change of variable given by the bijection from (2.16), we arrive precisely at the right-hand side of (3.13), as required. □

On our way towards the formula for moments stated in Theorem 1.3 we next put into evidence an explicit formula for the Boolean cumulants of $\mu \,\boxed{\vdash}\, \nu$.

Proposition 3.7. Let $\mu, \nu$ be distributions in $\mathcal{D}_{\mathrm{alg}}(k)$. For every $n \ge 1$ and $1 \le i_1, \ldots, i_n \le k$ one has

$$\mathrm{Cf}_{(i_1,\ldots,i_n)}(\eta_{\mu \,\boxed{\vdash}\, \nu}) = \sum_{\substack{\pi \in NC(n),\ \pi \ll 1_n \\ \text{with outer block } V_o}} \ \sum_{\substack{c : \pi \to \{1,2\} \\ \text{such that } c(V_o) = 1}} \mathrm{Cf}_{(i_1,\ldots,i_n);\pi;c}(R_\mu, R_\nu). \tag{3.15}$$
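Moreover, for every $\pi \in NC(n)$, $\pi \ll 1_n$ with outer block $V_o$, one has:

$$\sum_{\substack{c : \pi \to \{1,2\} \\ \text{such that } c(V_o) = 1}} \mathrm{Cf}_{(i_1,\ldots,i_n);\pi;c}(R_\mu, R_\nu) = \mathrm{Cf}_{(i_1,\ldots,i_n);\pi;o_\pi}(R_\mu, R_\mu + R_\nu). \tag{3.16}$$

Hence Eq. (3.15) can also be written in the form

$$\mathrm{Cf}_{(i_1,\ldots,i_n)}(\eta_{\mu \,\boxed{\vdash}\, \nu}) = \sum_{\substack{\pi \in NC(n), \\ \pi \ll 1_n}} \mathrm{Cf}_{(i_1,\ldots,i_n);\pi;o_\pi}(R_\mu, R_\mu + R_\nu). \tag{3.17}$$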
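To make (3.17) concrete, here are its first three instances written out. This is only a worked illustration: $\alpha$ denotes the coefficients of $R_\mu$ as in Remark 3.1, and $\beta$ is used here, by assumption, for the coefficients of $R_\nu$. The partitions $\pi \ll 1_n$ are those in which $1$ and $n$ lie in the same block, so for $n = 3$ they are $1_3$ and $\{\{1,3\},\{2\}\}$.

```latex
\begin{align*}
\mathrm{Cf}_{(i_1)}\bigl(\eta_{\mu \,\boxed{\vdash}\, \nu}\bigr)
  &= \alpha_{(i_1)}, \\
\mathrm{Cf}_{(i_1,i_2)}\bigl(\eta_{\mu \,\boxed{\vdash}\, \nu}\bigr)
  &= \alpha_{(i_1,i_2)}, \\
\mathrm{Cf}_{(i_1,i_2,i_3)}\bigl(\eta_{\mu \,\boxed{\vdash}\, \nu}\bigr)
  &= \alpha_{(i_1,i_2,i_3)}
     + \alpha_{(i_1,i_3)} \bigl( \alpha_{(i_2)} + \beta_{(i_2)} \bigr).
\end{align*}
```

In the last line the outer block $\{1,3\}$ contributes a coefficient of $R_\mu$, while the inner block $\{2\}$ contributes a coefficient of $R_\mu + R_\nu$, in agreement with the colouring $o_\pi$.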
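Proof. It is immediate that the left-hand side of (3.16) is merely the expansion as a sum for the product which defines $\mathrm{Cf}_{(i_1,\ldots,i_n);\pi;o_\pi}(R_\mu, R_\mu + R_\nu)$. Hence the only non-trivial point in this proof is to verify that (3.15) holds. By using how $\mathrm{Cf}_{(i_1,\ldots,i_n)}(\eta_{\mu \,\boxed{\vdash}\, \nu})$ is written in terms of the coefficients of $R_{\mu \,\boxed{\vdash}\, \nu}$ (cf. Eq. (2.30) in Remark 2.14), then by invoking Lemma 3.6 and by performing an obvious change in the order of summation we get that

$$\begin{aligned} \mathrm{Cf}_{(i_1,\ldots,i_n)}(\eta_{\mu \,\boxed{\vdash}\, \nu}) &= \sum_{\substack{\rho \in NC(n), \\ \rho \ll 1_n}} \mathrm{Cf}_{(i_1,\ldots,i_n);\rho}(R_{\mu \,\boxed{\vdash}\, \nu}) \\ &= \sum_{\substack{\rho \in NC(n), \\ \rho \ll 1_n}} \Bigl( \sum_{\substack{\pi \in NC(n), \\ \pi \ll \rho}} \mathrm{Cf}_{(i_1,\ldots,i_n);\pi;o_{\pi,\rho}}(R_\mu, R_\nu) \Bigr) \\ &= \sum_{\substack{\pi \in NC(n), \\ \pi \ll 1_n}} \Bigl( \sum_{\substack{\rho \in NC(n) \text{ such} \\ \text{that } \pi \ll \rho \ll 1_n}} \mathrm{Cf}_{(i_1,\ldots,i_n);\pi;o_{\pi,\rho}}(R_\mu, R_\nu) \Bigr). \end{aligned}$$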
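In order to conclude the proof we are left to show that for every partition $\pi \in NC(n)$ with $\pi \ll 1_n$ and with outer block denoted $V_o$ one has

$$\sum_{\substack{\rho \in NC(n) \text{ such} \\ \text{that } \pi \ll \rho \ll 1_n}} \mathrm{Cf}_{(i_1,\ldots,i_n);\pi;o_{\pi,\rho}}(R_\mu, R_\nu) = \sum_{\substack{c : \pi \to \{1,2\} \\ \text{such that } c(V_o) = 1}} \mathrm{Cf}_{(i_1,\ldots,i_n);\pi;c}(R_\mu, R_\nu). \tag{3.18}$$

And indeed, recall from Proposition 2.8 that we have a bijection

$$\{\rho \in NC(n) \mid \pi \ll \rho \ll 1_n\} \to \{\mathcal{V} \subseteq \pi \mid \mathcal{V} \ni V_o\}, \qquad \rho \mapsto \{V \in \pi \mid V \text{ is } \rho\text{-special}\}.$$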
When comparing this bijection against the formula which defined $o_{\pi,\rho}$ in Notation 3.4, it is immediate that the map $\rho \mapsto o_{\pi,\rho}$ is itself a bijection from $\{\rho \in NC(n) \mid \pi \ll \rho \ll 1_n\}$ onto the set of colourings $\{c : \pi \to \{1, 2\} \mid c(V_o) = 1\}$, and (3.18) immediately follows. □

Remark 3.8. 1° When considered together, Eqs. (3.17) and (3.8) give that

$$\eta_{\mu \,\boxed{\vdash}\, \nu} = R_{\mu \,\boxed{\vdash}\, (\mu \boxplus \nu)}; \tag{3.19}$$

the latter formula is in turn telling us that

$$\mathbb{B}(\mu \,\boxed{\vdash}\, \nu) = \mu \,\boxed{\vdash}\, (\mu \boxplus \nu), \tag{3.20}$$

where $\mathbb{B}$ is the Boolean Bercovici–Pata bijection on $\mathcal{D}_{\mathrm{alg}}(k)$. Eq. (3.20) is a special case of Proposition 1.10; but actually the general case of Proposition 1.10 easily follows from here, as explained in the proof of Proposition 5.1 below.

2° In the same way as the statement of Proposition 3.3 was extended to the one of Lemma 3.6, the formula found in Proposition 3.7 can be extended to

$$\mathrm{Cf}_{(i_1,\ldots,i_n);\rho}(\eta_{\mu \,\boxed{\vdash}\, \nu}) = \sum_{\substack{\pi \in NC(n), \\ \pi \ll \rho}} \mathrm{Cf}_{(i_1,\ldots,i_n);\pi;o_{\pi,\rho}}(R_\mu, R_\mu + R_\nu), \tag{3.21}$$
holding for every $n \ge 1$, $\rho \in NC(n)$, and $1 \le i_1, \ldots, i_n \le k$. Eq. (3.21) can be obtained from (3.17) by an argument similar to the one used in the proof of Lemma 3.6; but in fact we do not need to repeat that argument, we can simply infer (3.21) by using Lemma 3.6 itself, in conjunction with Eq. (3.19) from the first part of the present remark.

It is now easy to obtain the moment formula stated in Theorem 1.3.

Proposition 3.9. Let $\mu, \nu$ be distributions in $\mathcal{D}_{\mathrm{alg}}(k)$. For every $n \ge 1$ and $1 \le i_1, \ldots, i_n \le k$ one has

$$(\mu \,\boxed{\vdash}\, \nu)(X_{i_1} \cdots X_{i_n}) = \sum_{\pi \in NC(n)} \mathrm{Cf}_{(i_1,\ldots,i_n);\pi;o_\pi}(R_\mu, R_\mu + R_\nu). \tag{3.22}$$
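The moment formula (3.22) is easy to evaluate by machine in the one-variable case $k = 1$: each outer block $V$ of $\pi$ contributes the free cumulant of $\mu$ of order $|V|$, and each inner block contributes the sum of the corresponding free cumulants of $\mu$ and $\nu$. The following sketch does this by brute force; the function names are hypothetical and the semicircle inputs are an assumed test case.

```python
def set_partitions(elems):
    """Yield every partition of the sorted list `elems` as a list of sorted blocks."""
    if not elems:
        yield []
        return
    first, rest = elems[0], elems[1:]
    for part in set_partitions(rest):
        for i in range(len(part)):
            yield part[:i] + [[first] + part[i]] + part[i + 1:]
        yield [[first]] + part

def is_noncrossing(part):
    """True if no a < b < c < d has a, c in one block and b, d in another."""
    for V in part:
        for W in part:
            if V is W:
                continue
            for a in V:
                for c in V:
                    if a < c and any(a < b < c for b in W) and any(d > c for d in W):
                        return False
    return True

def is_outer(V, part):
    """A block is outer if no other block W satisfies min W < min V and max V < max W."""
    return not any(W[0] < V[0] and V[-1] < W[-1] for W in part)

def subordination_moment(n, alpha, beta):
    """n-th moment from (3.22) with k = 1; alpha, beta are free cumulants of mu, nu."""
    total = 0
    for part in set_partitions(list(range(1, n + 1))):
        if not is_noncrossing(part):
            continue
        term = 1
        for V in part:
            j = len(V)
            term *= alpha[j] if is_outer(V, part) else alpha[j] + beta[j]
        total += term
    return total

# Assumed test case: mu = nu = semicircle law (only nonzero free cumulant: order 2).
sc = [0, 0, 1, 0, 0, 0, 0]
zero = [0] * 7
both_semicircle = [subordination_moment(n, sc, sc) for n in range(1, 7)]
nu_trivial = [subordination_moment(n, sc, zero) for n in range(1, 7)]
```

With both inputs semicircular, only non-crossing pairings contribute and each pairing is weighted by $2^{\#\text{inner blocks}}$; when all free cumulants of $\nu$ vanish, (3.22) collapses to the free moment–cumulant formula (2.28) for $\mu$.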
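Proof. By using the Boolean moment–cumulant formula (Eq. (2.29) in Remark 2.14), then by invoking Remark 3.8.2 and by performing an obvious change in the order of summation we get that

$$\begin{aligned} (\mu \,\boxed{\vdash}\, \nu)(X_{i_1} \cdots X_{i_n}) &= \sum_{\rho \in \mathrm{Int}(n)} \mathrm{Cf}_{(i_1,\ldots,i_n);\rho}(\eta_{\mu \,\boxed{\vdash}\, \nu}) \\ &= \sum_{\rho \in \mathrm{Int}(n)} \Bigl( \sum_{\substack{\pi \in NC(n), \\ \pi \ll \rho}} \mathrm{Cf}_{(i_1,\ldots,i_n);\pi;o_{\pi,\rho}}(R_\mu, R_\mu + R_\nu) \Bigr) \\ &= \sum_{\pi \in NC(n)} \Bigl( \sum_{\substack{\rho \in \mathrm{Int}(n), \\ \rho \gg \pi}} \mathrm{Cf}_{(i_1,\ldots,i_n);\pi;o_{\pi,\rho}}(R_\mu, R_\mu + R_\nu) \Bigr). \end{aligned} \tag{3.23}$$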
But for every $\pi \in NC(n)$ there exists a unique partition $\rho \in \mathrm{Int}(n)$ such that $\rho \gg \pi$, and for this $\rho$ we have $o_{\pi,\rho} = o_\pi$ (as observed in Remark 3.5). Thus the sum over $\rho$ in (3.23) consists of just one term, $\mathrm{Cf}_{(i_1,\ldots,i_n);\pi;o_\pi}(R_\mu, R_\mu + R_\nu)$, and (3.22) follows. □

Remark 3.10. 1° A summation of the same type as in Eq. (3.22), which uses coefficients from two series and distinguishes between the inner and outer blocks of $\pi \in NC(n)$, has previously appeared in the theory of c-free convolution; see e.g. the third displayed equation in [10, p. 366]. This connection is not pursued in the present paper, but c-free convolution is heavily used in [2] (which relates to the present paper in the way explained in Remark 1.14).

2° In the proof of Theorem 4.4 we will also need the equivalent form of Eq. (3.22) where, for every $\pi \in NC(n)$, the product defining $\mathrm{Cf}_{(i_1,\ldots,i_n);\pi;o_\pi}(R_\mu, R_\mu + R_\nu)$ is expanded into a sum. It is immediate (left as exercise to the reader) to check that the formula for the moments of $\mu \,\boxed{\vdash}\, \nu$ will then look as follows:

$$(\mu \,\boxed{\vdash}\, \nu)(X_{i_1} \cdots X_{i_n}) = \sum_{(\pi, c)} \mathrm{Cf}_{(i_1,\ldots,i_n);\pi;c}(R_\mu, R_\nu), \tag{3.24}$$

where the index set for the sum on the right-hand side of (3.24) is

$$\Bigl\{ (\pi, c) \ \Big|\ \pi \in NC(n),\ c : \pi \to \{1, 2\}, \text{ such that } c(V) = 1 \text{ for every outer block } V \text{ of } \pi \Bigr\}.$$

Remark 3.11. Let $\mu$ and $(\mu_N)_{N \ge 1}$ be in $\mathcal{D}_{\mathrm{alg}}(k)$. If

$$\lim_{N \to \infty} \mu_N(X_{i_1} \cdots X_{i_n}) = \mu(X_{i_1} \cdots X_{i_n}), \quad \forall n \ge 1, \ \forall 1 \le i_1, \ldots, i_n \le k, \tag{3.25}$$

then one says that the sequence $(\mu_N)_{N \ge 1}$ converges in moments to $\mu$ (denoted simply as $\mu_N \to \mu$). Due to the moment–cumulant formulas from Remark 2.14, this is equivalent to convergence in coefficients for the R-transforms $R_{\mu_N}$ to $R_\mu$, or for the $\eta$-series $\eta_{\mu_N}$ to $\eta_\mu$.

Now, from the fact that one has polynomial expressions giving the moments of $\mu \,\boxed{\vdash}\, \nu$ in terms of the free cumulants of $\mu$ and of $\nu$ it is immediate that the operation $\boxed{\vdash}$ is well-behaved under taking limits in moments in $\mathcal{D}_{\mathrm{alg}}(k)$. That is, if $\mu$, $\nu$, $(\mu_N)_{N=1}^{\infty}$ and $(\nu_N)_{N=1}^{\infty}$ are distributions in $\mathcal{D}_{\mathrm{alg}}(k)$ such that $\mu_N \to \mu$ and $\nu_N \to \nu$, then it follows that $\mu_N \,\boxed{\vdash}\, \nu_N \to \mu \,\boxed{\vdash}\, \nu$. The same conclusion could of course have been derived directly from Proposition 3.3, or from Proposition 3.7.
4. The approach to $\boxed{\vdash}$ via operator models

This section puts into evidence a full Fock space model for $\mu \,\boxed{\vdash}\, \nu$, then uses this model in order to obtain Theorem 1.4 stated in the introduction of the paper. The full Fock space model is given in Theorem 4.4, and is just a variation of the "standard" full Fock space model for the R-transform (as presented for instance in [13, Lecture 21]). In order to avoid tedious notations involving formal operators on the full Fock space, we will only consider this model in the special case when the R-transforms $R_\mu$ and $R_\nu$ are polynomials. A generalization of Theorem 4.4 could be obtained from this special case by doing approximations in distribution (a very similar procedure to how Theorem 21.4 is extended to Theorem 21.7 in [13, Lecture 21]). However, for the situation at hand it is actually more convenient to incorporate the necessary approximations in distribution directly into the proof of Theorem 4.10 below, where the full Fock space model is upgraded to the more general framework of Theorem 1.4.

Notation 4.1. Let $\mathcal{F}$ be the full Fock space over $\mathbb{C}^{2k}$,

$$\mathcal{F} := \mathbb{C} \oplus \mathbb{C}^{2k} \oplus (\mathbb{C}^{2k})^{\otimes 2} \oplus \cdots \oplus (\mathbb{C}^{2k})^{\otimes n} \oplus \cdots.$$

The vector $1 \oplus 0 \oplus 0 \oplus \cdots \oplus 0 \oplus \cdots$ is called the vacuum-vector of $\mathcal{F}$ and is denoted by $\Omega$. We will let $P_\Omega \in B(\mathcal{F})$ denote the orthogonal projection onto the 1-dimensional space $\mathbb{C}\Omega \subseteq \mathcal{F}$. The vector-state $T \mapsto \langle T\Omega, \Omega \rangle$ defined by $\Omega$ on $B(\mathcal{F})$ will be referred to as vacuum-state.

We fix an orthonormal basis for $\mathbb{C}^{2k}$, which we denote as $e_1', \ldots, e_k', e_1'', \ldots, e_k''$. This leads to a natural choice of orthonormal basis for $\mathcal{F}$,

$$\{\Omega\} \cup \bigl\{ \xi_1 \otimes \cdots \otimes \xi_n \ \big|\ n \ge 1, \ \xi_1, \ldots, \xi_n \in \{e_1', \ldots, e_k', e_1'', \ldots, e_k''\} \bigr\}. \tag{4.1}$$

For every $1 \le i \le k$ the left creation operators with $e_i'$ and $e_i''$ will be denoted by $L_i'$ and $L_i''$, respectively. So $L_i' \in B(\mathcal{F})$ is the isometry which acts on the orthonormal basis (4.1) by

$$L_i'(\Omega) = e_i', \qquad L_i'(\xi_1 \otimes \cdots \otimes \xi_n) = e_i' \otimes \xi_1 \otimes \cdots \otimes \xi_n,$$

and similar formulas hold for $L_i''$. Moreover, we will denote by $\mathcal{M}'$ and $\mathcal{M}''$ the sets of operators in $B(\mathcal{F})$ defined by

$$\mathcal{M}' := \{ L_{i_1}' \cdots L_{i_n}' \mid n \ge 1, \ 1 \le i_1, \ldots, i_n \le k \}, \qquad \mathcal{M}'' := \{ L_{i_1}'' \cdots L_{i_n}'' \mid n \ge 1, \ 1 \le i_1, \ldots, i_n \le k \}. \tag{4.2}$$

The full Fock space model from Theorem 4.4 will use some special monomials "$S_1^* M_1 \cdots S_n^* M_n$" formed with the isometries $L_1', \ldots, L_k', L_1'', \ldots, L_k''$ and their adjoints, which are described in the next lemma.
Lemma 4.2. Let a positive integer $n$ and some fixed indices $i_1, \ldots, i_n \in \{1, \ldots, k\}$ be given.

1° Let $\pi$ be a partition in $NC(n)$ and let $c : \pi \to \{1, 2\}$ be a colouring. For every $m \in \{1, \ldots, n\}$ let $V = \{v(1), v(2), \ldots, v(p)\}$ (with $v(1) < v(2) < \cdots < v(p)$) denote the block of $\pi$ which contains $m$, and define

$$S_m := \begin{cases} L_{i_m}' & \text{if } c(V) = 1, \\ L_{i_m}'' & \text{if } c(V) = 2, \end{cases} \tag{4.3}$$

$$M_m := \begin{cases} L_{i_{v(p)}}' \cdots L_{i_{v(2)}}' L_{i_{v(1)}}' & \text{if } m = \max(V) \ (= v(p)) \text{ and } c(V) = 1, \\ L_{i_{v(p)}}'' \cdots L_{i_{v(2)}}'' L_{i_{v(1)}}'' & \text{if } m = \max(V) \text{ and } c(V) = 2, \\ 1_{B(\mathcal{F})} & \text{if } m \ne \max(V). \end{cases} \tag{4.4}$$

Then $S_1^* M_1 \cdots S_n^* M_n \Omega = \Omega$.

2° Suppose that $S_1, \ldots, S_n, M_1, \ldots, M_n \in B(\mathcal{F})$ are such that (i) $S_m \in \{L_{i_m}', L_{i_m}''\}$, $1 \le m \le n$; (ii) $M_m \in \{1_{B(\mathcal{F})}\} \cup \mathcal{M}' \cup \mathcal{M}''$, $1 \le m \le n$ (with $\mathcal{M}'$, $\mathcal{M}''$ as in (4.2)); and (iii) $S_1^* M_1 \cdots S_n^* M_n \Omega = \Omega$. Then there exist a partition $\pi \in NC(n)$ and a colouring $c : \pi \to \{1, 2\}$ such that $S_1, \ldots, S_n, M_1, \ldots, M_n$ are obtained from $\pi$ and $c$ via the recipe described in part 1° of the lemma.
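The recipe in part 1° of Lemma 4.2 can be tested mechanically, because a monomial in creation operators and their adjoints sends each basis vector from (4.1) either to another basis vector or to 0. In the sketch below a basis vector is a tuple of letters `(i, c)`, where `c = 1` stands for $e_i'$ and `c = 2` for $e_i''$, and the empty tuple stands for $\Omega$; this encoding and all names are assumptions made for the illustration.

```python
from itertools import product

def set_partitions(elems):
    """Yield every partition of the sorted list `elems` as a list of sorted blocks."""
    if not elems:
        yield []
        return
    first, rest = elems[0], elems[1:]
    for part in set_partitions(rest):
        for i in range(len(part)):
            yield part[:i] + [[first] + part[i]] + part[i + 1:]
        yield [[first]] + part

def is_noncrossing(part):
    """True if no a < b < c < d has a, c in one block and b, d in another."""
    for V in part:
        for W in part:
            if V is W:
                continue
            for a in V:
                for c in V:
                    if a < c and any(a < b < c for b in W) and any(d > c for d in W):
                        return False
    return True

def apply_ops(ops, vec=()):
    """Apply a monomial (written left to right) to a basis vector; None means 0."""
    for kind, i, c in reversed(ops):
        if kind == "cre":                      # creation operator: prepend the letter
            vec = ((i, c),) + vec
        elif not vec or vec[0] != (i, c):      # adjoint kills Omega and other letters
            return None
        else:
            vec = vec[1:]
    return vec

def recipe(part, colour, idx):
    """Monomial S_1^* M_1 ... S_n^* M_n built from (4.3)-(4.4); idx[m-1] plays i_m."""
    n = sum(len(V) for V in part)
    ops = []
    for m in range(1, n + 1):
        j = next(j for j, V in enumerate(part) if m in V)
        V, c = part[j], colour[j]
        ops.append(("adj", idx[m - 1], c))                          # S_m^*
        if m == V[-1]:                                              # M_m at max(V)
            ops.extend(("cre", idx[v - 1], c) for v in reversed(V))
    return ops

def check_all(n, idx):
    """True if every non-crossing partition and colouring yields a monomial fixing Omega."""
    parts = [p for p in set_partitions(list(range(1, n + 1))) if is_noncrossing(p)]
    return all(apply_ops(recipe(p, col, idx)) == ()
               for p in parts for col in product((1, 2), repeat=len(p)))

all_ok = all(check_all(n, [1, 2, 1, 3, 2][:n]) for n in range(1, 6))
crossing = apply_ops(recipe([[1, 3], [2, 4]], (1, 1), [1, 2, 3, 4]))
```

The last line applies the same recipe to the crossing partition $\{\{1,3\},\{2,4\}\}$ with distinct indices; the resulting monomial sends $\Omega$ to 0, which illustrates why only non-crossing partitions appear in part 2°.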
Remark 4.3. 1° Here is a concrete example of how the recipe from Lemma 4.2 works. Say for instance that $n = 5$. Let $i_1, \ldots, i_5$ be some indices in $\{1, \ldots, k\}$, and consider the monomial

$$(L_{i_1}')^* (L_{i_2}'')^* (L_{i_3}'')^* L_{i_3}'' L_{i_2}'' (L_{i_4}')^* (L_{i_5}')^* L_{i_5}' L_{i_4}' L_{i_1}'. \tag{4.5}$$

Note that the product in (4.5) reduces upon simplifications to $1_{B(\mathcal{F})}$, so in particular it fixes $\Omega$. Lemma 4.2 views this product as being $S_1^* M_1 \cdots S_5^* M_5$, where

$$S_1 = L_{i_1}', \quad S_2 = L_{i_2}'', \quad S_3 = L_{i_3}'', \quad S_4 = L_{i_4}', \quad S_5 = L_{i_5}',$$

$$M_1 = M_2 = M_4 = 1_{B(\mathcal{F})}, \quad M_3 = L_{i_3}'' L_{i_2}'', \quad \text{and} \quad M_5 = L_{i_5}' L_{i_4}' L_{i_1}'.$$

Moreover, these $S_1, \ldots, S_5, M_1, \ldots, M_5$ correspond in Lemma 4.2 to the partition $\pi = \{V_1, V_2\} \in NC(5)$ with $V_1 = \{1, 4, 5\}$, $V_2 = \{2, 3\}$, and to the colouring $c : \pi \to \{1, 2\}$ defined by $c(V_1) = 1$, $c(V_2) = 2$.

2° The proof of Lemma 4.2 is very similar to the corresponding argument concerning the standard full Fock space model for the R-transform, as presented e.g. in [13, Lecture 21]. Because of this, I will only explain (in the remaining part of this remark) how one makes the connection to the arguments from [13], and will leave the details as exercise to the reader. Besides $\mathcal{M}'$ and $\mathcal{M}''$ from (4.2), let us also use the notation

$$\mathcal{M} := \{1_{B(\mathcal{F})}\} \cup \bigl\{ S_1 \cdots S_\ell \ \big|\ \ell \ge 1, \ S_1, \ldots, S_\ell \in \{L_1', \ldots, L_k', L_1'', \ldots, L_k''\} \bigr\}. \tag{4.6}$$
Suppose that the following data is given: a positive integer $n$, some indices $i_1, \ldots, i_n \in \{1, \ldots, k\}$, and a function $b : \{1, \ldots, n\} \to \{1, 2\}$. Let the isometries $S_1 \in \{L_{i_1}', L_{i_1}''\}, \ldots, S_n \in \{L_{i_n}', L_{i_n}''\}$ be picked via the rule that

$$S_m = \begin{cases} L_{i_m}' & \text{if } b(m) = 1, \\ L_{i_m}'' & \text{if } b(m) = 2, \end{cases} \qquad 1 \le m \le n, \tag{4.7}$$
and consider the following problem: describe all possible ways of choosing $(M_1, \ldots, M_n) \in \mathcal{M}^n$ such that $S_1^* M_1 \cdots S_n^* M_n \Omega = \Omega$ (see footnote 4). The solution to this problem is that the $n$-tuples $(M_1, \ldots, M_n)$ with the required property are canonically parametrized by $NC(n)$. For the description of how to construct the $n$-tuple $(M_1, \ldots, M_n) \in \mathcal{M}^n$ canonically associated to a partition $\pi \in NC(n)$, and for the explanation why this construction works, see the discussion on pp. 342, 343 and Exercises 21.20–21.22 on pp. 356, 357 of [13]. The statement of Lemma 4.2 is merely an adjustment of this procedure (for how to construct $(M_1, \ldots, M_n)$ by starting from $\pi$), where one has to take into account the following additional detail: $M_1, \ldots, M_n$ are now only allowed to run in the smaller set $\{1_{B(\mathcal{F})}\} \cup \mathcal{M}' \cup \mathcal{M}''$ (instead of all of $\mathcal{M}$). This imposes a compatibility condition between $\pi$ and the function $b : \{1, \ldots, n\} \to \{1, 2\}$ that was used in (4.7): specifically, $b$ must be constant along the blocks of $\pi$ (and hence must correspond to a colouring $c$ of $\pi$).

Theorem 4.4. Let $\mu, \nu$ be distributions in $\mathcal{D}_{\mathrm{alg}}(k)$ such that the R-transforms $R_\mu$ and $R_\nu$ are polynomials:

$$\begin{cases} R_\mu(z_1, \ldots, z_k) = \displaystyle\sum_{n=1}^{N} \ \sum_{i_1,\ldots,i_n=1}^{k} \alpha_{(i_1,\ldots,i_n)} z_{i_1} \cdots z_{i_n}, \\[2ex] R_\nu(z_1, \ldots, z_k) = \displaystyle\sum_{n=1}^{N} \ \sum_{i_1,\ldots,i_n=1}^{k} \beta_{(i_1,\ldots,i_n)} z_{i_1} \cdots z_{i_n} \end{cases} \tag{4.8}$$
(where $N$ is a common upper bound for the degrees of $R_\mu$ and $R_\nu$). In the framework of Notation 4.1, consider the operator $T \in B(\mathcal{F})$ defined by

$$T = 1_{B(\mathcal{F})} + \sum_{n=1}^{N} \ \sum_{i_1,\ldots,i_n=1}^{k} \alpha_{(i_1,\ldots,i_n)} L_{i_n}' \cdots L_{i_1}' + \sum_{n=1}^{N} \ \sum_{i_1,\ldots,i_n=1}^{k} \beta_{(i_1,\ldots,i_n)} L_{i_n}'' \cdots L_{i_1}'', \tag{4.9}$$

and make the notations

$$A_i := (L_i')^* T, \qquad B_i := (L_i'')^* T, \qquad 1 \le i \le k, \tag{4.10}$$

followed by

$$C_i := A_i + (1 - P_\Omega) B_i (1 - P_\Omega), \qquad 1 \le i \le k. \tag{4.11}$$

Then the joint distribution of $C_1, \ldots, C_k$ with respect to the vacuum-state on $B(\mathcal{F})$ is equal to $\mu \,\boxed{\vdash}\, \nu$.

Remark 4.5. By comparing the framework of Theorem 4.4 with the "standard" full Fock space model for the R-transform (as presented for instance in Theorem 21.4 of [13]), one sees that the operators $A_1, \ldots, A_k, B_1, \ldots, B_k$ defined by Eq. (4.10) give the standard full Fock space model for the free product $\mu * \nu \in \mathcal{D}_{\mathrm{alg}}(2k)$. In particular one has that $\{A_1, \ldots, A_k\}$ is free from $\{B_1, \ldots, B_k\}$ with respect to the vacuum-state on $B(\mathcal{F})$, and the joint distributions of the $k$-tuples $A_1, \ldots, A_k$ and $B_1, \ldots, B_k$ are equal to $\mu$ and to $\nu$, respectively. Another way of phrasing this same remark is that the full Fock space model for $\mu \,\boxed{\vdash}\, \nu$ is obtained by merely performing an extra step (specifically, by considering the operators $C_1, \ldots, C_k$ defined by Eq. (4.11)) in the standard full Fock space model for $\mu * \nu$.

Footnote 4 (to Remark 4.3.2°). It is easy to see that this condition is in fact equivalent to the requirement that the product $S_1^* M_1 \cdots S_n^* M_n$ simplifies to $1_{B(\mathcal{F})}$ after repeated use of the relations $(L_i')^* L_i' = (L_i'')^* L_i'' = 1_{B(\mathcal{F})}$, $1 \le i \le k$.

Proof of Theorem 4.4. For the whole proof we fix a positive integer $n$ and some indices $1 \le i_1, \ldots, i_n \le k$, for which we will show that

$$\langle C_{i_1} \cdots C_{i_n} \Omega, \Omega \rangle = (\mu \,\boxed{\vdash}\, \nu)(X_{i_1} \cdots X_{i_n}). \tag{4.12}$$
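Before going through the argument, the identity (4.12) can be observed in a small case. The sketch below implements the operators from (4.9)–(4.11) for $k = 1$, representing a vector of the full Fock space as a dictionary from basis words to coefficients (a word is a tuple of colours, `1` for $e_1'$ and `2` for $e_1''$, and the empty tuple is $\Omega$). Since the R-transforms are polynomials, every vector that arises has finite support and no truncation is needed. The encoding, the function names and the semicircle inputs are assumptions made for the illustration.

```python
def create(c, vec):
    """Left creation operator for colour c: prepend the letter c to every word."""
    return {(c,) + w: x for w, x in vec.items()}

def annihilate(c, vec):
    """Adjoint of the creation operator: strip a leading letter c, kill everything else."""
    return {w[1:]: x for w, x in vec.items() if w and w[0] == c}

def add(u, v):
    out = dict(u)
    for w, x in v.items():
        out[w] = out.get(w, 0) + x
    return out

def scale(a, vec):
    return {w: a * x for w, x in vec.items()}

def apply_T(alpha, beta, vec):
    """T = 1 + sum_n alpha[n] (L')^n + sum_n beta[n] (L'')^n, i.e. (4.9) with k = 1."""
    out = dict(vec)
    for c, coeffs in ((1, alpha), (2, beta)):
        power = vec
        for n in range(1, len(coeffs)):
            power = create(c, power)
            out = add(out, scale(coeffs[n], power))
    return out

def kill_vacuum(vec):
    """The projection 1 - P_Omega."""
    return {w: x for w, x in vec.items() if w != ()}

def apply_C(alpha, beta, vec):
    """C = A + (1 - P) B (1 - P) with A = (L')^* T and B = (L'')^* T, as in (4.10)-(4.11)."""
    a_part = annihilate(1, apply_T(alpha, beta, vec))
    b_part = kill_vacuum(annihilate(2, apply_T(alpha, beta, kill_vacuum(vec))))
    return add(a_part, b_part)

def model_moment(n, alpha, beta):
    """Vacuum expectation <C^n Omega, Omega>."""
    vec = {(): 1}
    for _ in range(n):
        vec = apply_C(alpha, beta, vec)
    return vec.get((), 0)

# Assumed test case: R_mu(z) = R_nu(z) = z^2 (both distributions semicircular).
sc = [0, 0, 1]
model_moments = [model_moment(n, sc, sc) for n in range(1, 7)]
```

For these inputs the vacuum expectations of $C^2$ and $C^4$ are 1 and 3, which are the values produced by the right-hand side of (3.22): the pairing $\{\{1,2\},\{3,4\}\}$ contributes 1 and the pairing $\{\{1,4\},\{2,3\}\}$ contributes 2.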
From (4.9)–(4.11) it follows that every $C_i$ ($1 \le i \le k$) can be written as a sum of products of the form

$$Q \cdot S^* \cdot (\gamma M) \cdot Q, \tag{4.13}$$

where $Q \in \{1_{B(\mathcal{F})}, 1_{B(\mathcal{F})} - P_\Omega\}$, $S \in \{L_i', L_i''\}$, and $\gamma M$ is a term from the sum defining $T$ (where $\gamma \in \mathbb{C}$ and $M \in \{1_{B(\mathcal{F})}\} \cup \mathcal{M}' \cup \mathcal{M}''$). Of course, there are some restrictions on what combinations of $Q$, $S$ and $\gamma M$ can go together in (4.13): if $Q = 1_{B(\mathcal{F})}$ then $S = L_i'$ and $\gamma M$ is either $1_{B(\mathcal{F})}$ or of the form $\alpha_{(j_1,\ldots,j_m)} L_{j_m}' \cdots L_{j_1}'$, while $Q = 1_{B(\mathcal{F})} - P_\Omega$ goes with $S = L_i''$ and with $\gamma M$ being either $1_{B(\mathcal{F})}$ or of the form $\beta_{(j_1,\ldots,j_m)} L_{j_m}'' \cdots L_{j_1}''$. A precise count thus gives that every $C_i$ splits into a sum of $2 \cdot (1 + k + \cdots + k^N)$ terms of the form (4.13). When one writes each of $C_{i_1}, \ldots, C_{i_n}$ as a sum in this way and expands the product, the inner product on the left-hand side of (4.12) is thus broken into a sum of $(2 \cdot (1 + k + \cdots + k^N))^n$ terms of the form

$$\bigl\langle Q_1 S_1^* (\gamma_1 M_1) Q_1 \cdots Q_n S_n^* (\gamma_n M_n) Q_n \Omega, \Omega \bigr\rangle. \tag{4.14}$$

Now let us fix one of the possible choices of operators $Q_i$, $S_i$, $\gamma_i M_i$ ($1 \le i \le n$) in (4.14), and let us look at the $4n$ vectors

$$\xi_1 = Q_n \Omega, \quad \xi_2 = M_n Q_n \Omega, \quad \ldots, \quad \xi_{4n} = Q_1 S_1^* M_1 Q_1 \cdots Q_n S_n^* M_n Q_n \Omega \tag{4.15}$$

obtained by successively applying the operators $Q_n, M_n, S_n^*, Q_n, \ldots, Q_1, M_1, S_1^*, Q_1$ to $\Omega$. It is clear that each of these $4n$ vectors either is 0 or belongs to the orthonormal basis (4.1) for $\mathcal{F}$; and consequently, the inner product (4.14) is equal to

$$\begin{cases} \gamma_1 \cdots \gamma_n & \text{if } Q_1 S_1^* M_1 Q_1 \cdots Q_n S_n^* M_n Q_n \Omega = \Omega, \\ 0 & \text{otherwise.} \end{cases} \tag{4.16}$$

Let us moreover observe that if $Q_1 S_1^* M_1 Q_1 \cdots Q_n S_n^* M_n Q_n \Omega = \Omega$, then we also have $S_1^* M_1 \cdots S_n^* M_n \Omega = \Omega$. This is because when one successively applies $Q_n, M_n, \ldots, S_1^*, Q_1$ to $\Omega$, the projections $Q_1, \ldots, Q_n$ used on the way either leave invariant the vector presented to them, or send it to 0 (but cannot actually do the latter, as $Q_1 S_1^* \cdots M_n Q_n \Omega = \Omega \ne 0$).

By invoking Lemma 4.2 we thus see that if an inner product as in (4.14) is to be different from 0, then there have to exist a partition $\pi \in NC(n)$ and a colouring $c : \pi \to \{1, 2\}$ such that $S_1, M_1, \ldots, S_n, M_n$ are defined in terms of $\pi$ and $c$ in the way described in Lemma 4.2. It is immediate that in this case the numbers $\gamma_1, \ldots, \gamma_n$ from (4.16) are identified as $\alpha_{(j_1,\ldots,j_m)}$'s and
$\beta_{(j_1,\ldots,j_m)}$'s (coefficients of the R-transforms of $\mu$ and of $\nu$) in such a way that their product becomes

$$\gamma_1 \cdots \gamma_n = \mathrm{Cf}_{(i_1,\ldots,i_n);\pi;c}(R_\mu, R_\nu). \tag{4.17}$$

Conversely, let $\pi$ be a partition in $NC(n)$, let $c$ be a colouring of $\pi$, and consider the operators $S_1, M_1, \ldots, S_n, M_n$ defined in terms of $\pi$ and $c$ in the way described in Lemma 4.2. Observe that there exists a unique way of choosing projections $Q_1, \ldots, Q_n \in \{1_{B(\mathcal{F})}, 1_{B(\mathcal{F})} - P_\Omega\}$ so that the $S_j$, $M_j$, $Q_j$ for $1 \le j \le n$ give together an inner product as in (4.14). To be precise, for every $1 \le j \le n$ the projection $Q_j$ is chosen as follows: consider the block $V$ of $\pi$ which contains the number $j$, and put

$$Q_j = \begin{cases} 1_{B(\mathcal{F})} & \text{if } c(V) = 1, \\ 1_{B(\mathcal{F})} - P_\Omega & \text{if } c(V) = 2. \end{cases} \tag{4.18}$$

Note that whereas Lemma 4.2 ensures that $S_1^* M_1 \cdots S_n^* M_n \Omega = \Omega$, it may still happen that (with the $Q_j$ defined by (4.18)) the vector $Q_1 S_1^* M_1 Q_1 \cdots Q_n S_n^* M_n Q_n \Omega$ is equal to 0. It is easy (though perhaps notationally tedious) to check that

$$Q_1 S_1^* M_1 Q_1 \cdots Q_n S_n^* M_n Q_n \Omega = \begin{cases} \Omega & \text{if } c(V) = 1 \text{ for every outer block } V \text{ of } \pi, \\ 0 & \text{otherwise.} \end{cases} \tag{4.19}$$

The verification of (4.19) is left as exercise to the reader. Informally speaking, what makes (4.19) hold is that in a sequence of $4n$ vectors obtained as in (4.15) one reaches $\Omega$ precisely at the positions where the outer blocks of $\pi$ begin and end; hence these are the positions where $Q_j$ has a chance to make a difference, and cause the vector $Q_1 S_1^* M_1 Q_1 \cdots Q_n S_n^* M_n Q_n \Omega$ to vanish.

Summarizing the above discussion, one sees that

$$\langle C_{i_1} \cdots C_{i_n} \Omega, \Omega \rangle = \sum_{(\pi, c)} \mathrm{Cf}_{(i_1,\ldots,i_n);\pi;c}(R_\mu, R_\nu), \tag{4.20}$$

where the index set for the sum on the right-hand side of (4.20) is

$$\Bigl\{ (\pi, c) \ \Big|\ \pi \in NC(n),\ c : \pi \to \{1, 2\}, \text{ such that } c(V) = 1 \text{ for every outer block } V \text{ of } \pi \Bigr\}.$$
But the sum on the right-hand side of (4.20) is precisely the expression observed for $(\mu \,\boxed{\vdash}\, \nu)(X_{i_1} \cdots X_{i_n})$ in Remark 3.10.2, and this concludes the proof. □

Let us now go towards the proof of Theorem 1.4. It will be convenient to adopt a slightly different point of view on the vacuum projection, which does not make explicit use of vectors in a Hilbert space, and is described as follows.

Definition 4.6. Let $(\mathcal{A}, \varphi)$ be a noncommutative probability space. A vacuum-projection for $\varphi$ is an element $P \in \mathcal{A}$ such that $P = P^2 \ne 0$ and such that

$$P A P = \varphi(A) P, \qquad \forall A \in \mathcal{A}. \tag{4.21}$$
Remark 4.7. 1° The main example of vacuum-projection is of course provided by the situation when $\mathcal{A} = B(\mathcal{H})$, the functional $\varphi$ is the vector-state associated to a unit vector $\xi_o \in \mathcal{H}$, and $P$ is the orthogonal projection onto the 1-dimensional subspace $\mathbb{C}\xi_o$ of $\mathcal{H}$.

2° Let $(\mathcal{A}, \varphi)$ and $P$ be as in Definition 4.6. Observe that $\varphi(P) = 1$ (as seen by making $A = P$ in Eq. (4.21)). Let us also observe that

$$\varphi(P B) = \varphi(B) = \varphi(B P), \qquad \forall B \in \mathcal{A}. \tag{4.22}$$

In order to verify the first of these two equalities we set $A = (1_{\mathcal{A}} - P) B$ and find that $\varphi(A) P = P A P = P (1_{\mathcal{A}} - P) B P = 0$, which implies that $\varphi(A) = 0$ and hence that $\varphi(B) = \varphi(P B)$. The verification of the second equality in (4.22) is analogous.

Lemma 4.8. Let $(\mathcal{A}, \varphi)$ be a noncommutative probability space and let $P \in \mathcal{A}$ be a vacuum-projection for $\varphi$. Then

$$\varphi(T_1 P T_2 P \cdots P T_n) = \prod_{i=1}^{n} \varphi(T_i), \qquad \forall n \ge 2 \text{ and } T_1, \ldots, T_n \in \mathcal{A}. \tag{4.23}$$
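In the setting of Remark 4.7.1° with a finite-dimensional Hilbert space, (4.21) and (4.23) can be verified directly on matrices. The sketch below assumes $\mathcal{A}$ is the algebra of $3 \times 3$ integer matrices, $\varphi(A) = A_{00}$ (the vector-state of the first basis vector) and $P$ the projection onto that vector; the matrices and helper names are arbitrary choices made for the illustration.

```python
def matmul(A, B):
    """Product of two square matrices given as lists of rows."""
    n = len(A)
    return [[sum(A[i][k] * B[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def phi(A):
    """Vector-state of the first basis vector: the (0, 0) entry."""
    return A[0][0]

d = 3
# Rank-one projection onto the first basis vector.
P = [[1 if i == j == 0 else 0 for j in range(d)] for i in range(d)]

Ts = [
    [[1, 2, 3], [4, 5, 6], [7, 8, 9]],
    [[2, 0, 1], [1, 3, 1], [0, 5, 4]],
    [[-1, 4, 2], [3, 3, 0], [6, 1, 5]],
]

# Left-hand side of (4.23): phi(T1 P T2 P T3).
word = Ts[0]
for T in Ts[1:]:
    word = matmul(matmul(word, P), T)
lhs = phi(word)

# Right-hand side of (4.23): phi(T1) phi(T2) phi(T3).
rhs = 1
for T in Ts:
    rhs *= phi(T)

# Defining property (4.21): P A P = phi(A) P for each test matrix.
vacuum_property = all(
    matmul(matmul(P, A), P) == [[phi(A) * x for x in row] for row in P] for A in Ts
)
```

Here both sides of (4.23) equal $1 \cdot 2 \cdot (-1) = -2$.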
Proof. By induction on $n$. For $n = 2$ we write

$$\begin{aligned} \varphi(T_1 P T_2) &= \varphi(T_1 P T_2 P) && \text{(by (4.22))} \\ &= \varphi\bigl(T_1 \cdot \varphi(T_2) P\bigr) && \text{(by (4.21))} \\ &= \varphi(T_2) \varphi(T_1 P) \\ &= \varphi(T_2) \varphi(T_1) && \text{(by (4.22))}. \end{aligned}$$

The induction step "$n \Rightarrow n + 1$" is immediately obtained by writing $T_1 P T_2 P \cdots P T_n P T_{n+1}$ as $T_1 P T_2'$ with $T_2' := T_2 P \cdots P T_n P T_{n+1}$ and by repeating the above calculation, followed by the induction hypothesis. □

Lemma 4.9. Let $(\mathcal{A}, \varphi)$ be a noncommutative probability space and let $T_1, \ldots, T_\ell, P$ be in $\mathcal{A}$, where $P$ is a vacuum-projection for $\varphi$. Suppose moreover that for every $N \ge 1$ we are given a noncommutative probability space $(\mathcal{A}_N, \varphi_N)$ and elements $T_1^{(N)}, \ldots, T_\ell^{(N)}, P^{(N)} \in \mathcal{A}_N$, such that $P^{(N)}$ is a vacuum-projection for $\varphi_N$. If the $\ell$-tuples $T_1^{(N)}, \ldots, T_\ell^{(N)}$ converge in moments for $N \to \infty$ to $T_1, \ldots, T_\ell$, then the $(\ell + 1)$-tuples $T_1^{(N)}, \ldots, T_\ell^{(N)}, P^{(N)}$ converge in moments for $N \to \infty$ to the $(\ell + 1)$-tuple $T_1, \ldots, T_\ell, P$.

Proof. It clearly suffices to verify that, for any $n \ge 2$ and any choice of noncommutative polynomials $f_1, \ldots, f_n \in \mathbb{C}\langle X_1, \ldots, X_\ell \rangle$, the sequence

$$\varphi_N\Bigl( f_1\bigl(T_1^{(N)}, \ldots, T_\ell^{(N)}\bigr) P^{(N)} f_2\bigl(T_1^{(N)}, \ldots, T_\ell^{(N)}\bigr) P^{(N)} \cdots P^{(N)} f_n\bigl(T_1^{(N)}, \ldots, T_\ell^{(N)}\bigr) \Bigr), \qquad N \ge 1,$$
converges for $N \to \infty$ to $\varphi\bigl( f_1(T_1, \ldots, T_\ell) P f_2(T_1, \ldots, T_\ell) P \cdots P f_n(T_1, \ldots, T_\ell) \bigr)$. But in view of Lemma 4.8 the latter convergence amounts to

$$\lim_{N \to \infty} \prod_{i=1}^{n} \varphi_N\Bigl( f_i\bigl(T_1^{(N)}, \ldots, T_\ell^{(N)}\bigr) \Bigr) = \prod_{i=1}^{n} \varphi\bigl( f_i(T_1, \ldots, T_\ell) \bigr),$$

which is an immediate consequence of the given hypothesis. □
Theorem 4.10. Let two distributions $\mu, \nu \in \mathcal{D}_{\mathrm{alg}}(k)$ be given. Suppose that $(\mathcal{A}, \varphi)$ is a noncommutative probability space and that $A_1, \ldots, A_k, B_1, \ldots, B_k \in \mathcal{A}$ are such that $\{A_1, \ldots, A_k\}$ is free from $\{B_1, \ldots, B_k\}$, such that the joint distribution of $A_1, \ldots, A_k$ is equal to $\mu$, and such that the joint distribution of $B_1, \ldots, B_k$ is equal to $\nu$. Suppose in addition that $P \in \mathcal{A}$ is a vacuum-projection for $\varphi$, and consider the elements

$$C_i := A_i + (1_{\mathcal{A}} - P) B_i (1_{\mathcal{A}} - P), \qquad 1 \le i \le k. \tag{4.24}$$

Then the joint distribution of $C_1, \ldots, C_k$ with respect to $\varphi$ is equal to $\mu \,\boxed{\vdash}\, \nu$.

Proof. For $n \ge 1$ and $1 \le i_1, \ldots, i_n \le k$ we will denote the coefficients of $z_{i_1} \cdots z_{i_n}$ in the series $R_\mu$ and $R_\nu$ by $\alpha_{(i_1,\ldots,i_n)}$ and $\beta_{(i_1,\ldots,i_n)}$, respectively. Let $N$ be a positive integer. Consider the distributions $\mu_N, \nu_N \in \mathcal{D}_{\mathrm{alg}}(k)$ which are uniquely determined by the requirement that their R-transforms are

$$\begin{cases} R_{\mu_N}(z_1, \ldots, z_k) = \displaystyle\sum_{n=1}^{N} \ \sum_{i_1,\ldots,i_n=1}^{k} \alpha_{(i_1,\ldots,i_n)} z_{i_1} \cdots z_{i_n}, \\[2ex] R_{\nu_N}(z_1, \ldots, z_k) = \displaystyle\sum_{n=1}^{N} \ \sum_{i_1,\ldots,i_n=1}^{k} \beta_{(i_1,\ldots,i_n)} z_{i_1} \cdots z_{i_n}. \end{cases} \tag{4.25}$$
Let us consider the standard full Fock space model, exactly as described in Theorem 21.4 of [13], for the free product $\mu_N * \nu_N \in \mathcal{D}_{\mathrm{alg}}(2k)$. This gives us a noncommutative probability space $(\mathcal{A}_N, \varphi_N)$ and elements $A_1^{(N)}, \ldots, A_k^{(N)}, B_1^{(N)}, \ldots, B_k^{(N)} \in \mathcal{A}_N$ such that $\{A_1^{(N)}, \ldots, A_k^{(N)}\}$ is free from $\{B_1^{(N)}, \ldots, B_k^{(N)}\}$, such that the joint distribution of $A_1^{(N)}, \ldots, A_k^{(N)}$ is equal to $\mu_N$, and such that the joint distribution of $B_1^{(N)}, \ldots, B_k^{(N)}$ is equal to $\nu_N$. Since the full Fock space model is constructed by using a true vacuum-state on a Hilbert space, we also get at the same time a vacuum-projection $P^{(N)} \in \mathcal{A}_N$.

We now make $N \to \infty$. From how $\mu_N$ and $\nu_N$ were constructed it is immediate that we have limits in moments $\mu_N \to \mu$ and $\nu_N \to \nu$. This implies that we also have the limit in moments $\mu_N * \nu_N \to \mu * \nu$, or in terms of operators that the $(2k)$-tuples $A_1^{(N)}, \ldots, A_k^{(N)}, B_1^{(N)}, \ldots, B_k^{(N)}$ converge in moments for $N \to \infty$ to the $(2k)$-tuple $A_1, \ldots, A_k, B_1, \ldots, B_k$. By invoking Lemma 4.9 we upgrade this to the fact that the $(2k + 1)$-tuples $A_1^{(N)}, \ldots, A_k^{(N)}, B_1^{(N)}, \ldots, B_k^{(N)}, P^{(N)}$ converge in moments for $N \to \infty$ to the $(2k + 1)$-tuple $A_1, \ldots, A_k, B_1, \ldots, B_k, P$. The latter convergence implies in turn that the $k$-tuple $C_1, \ldots, C_k$ defined in (4.24) is the limit in moments for the $k$-tuples $C_1^{(N)}, \ldots, C_k^{(N)}$, where for $1 \le i \le k$ and $N \ge 1$ we put

$$C_i^{(N)} := A_i^{(N)} + \bigl(1_{\mathcal{A}_N} - P^{(N)}\bigr) B_i^{(N)} \bigl(1_{\mathcal{A}_N} - P^{(N)}\bigr) \in \mathcal{A}_N. \tag{4.26}$$
But for every N 1, the operators C1 , . . . , Ck provide (as observed at the end of Remark 4.5) the full Fock space model for the subordination distribution μN νN . Hence the conclusion of the preceding paragraph can be read as follows: the joint distribution of C1 , . . . , Ck is the N → ∞ limit of the distributions μN νN . Since it was noticed in Remark 3.11 that (μN νN )∞ N =1 converges in moments to μ ν, the conclusion of the theorem follows. 2 Remark 4.11. Suppose now that μ, ν ∈ Dc (k), i.e. they can appear as joint distributions for k-tuples of selfadjoint elements in some C ∗ -probability spaces. By considering the GNS representations of these C ∗ -probability spaces, one finds Hilbert spaces H, K, unit vectors ξo ∈ H, ζo ∈ K, and k-tuples of selfadjoint operators A1 , . . . , Ak ∈ B(H), B1 , . . . , Bk ∈ B(K) such that μ is the joint distribution of A1 , . . . , Ak with respect to the vector-state defined by ξo on B(H), while ν is the joint distribution of B1 , . . . , Bk with respect to the vector-state defined by ζo on B(K). Let us denote Ho := H Cξo ,
Ko := K Cζo ,
and let us consider the “free product” Hilbert space M := CΩ ⊕ Ho ⊕ Ko ⊕ Ho ⊗ Ko ⊕ Ko ⊗ Ho ⊕ Ho ⊗ K o ⊗ Ho ⊕ K o ⊗ Ho ⊗ K o ⊕ · · ·
(4.27)
(direct sum of all possible alternating tensor products of copies of $\mathcal{H}_o$ and $\mathcal{K}_o$). Then $A_1, \ldots, A_k, B_1, \ldots, B_k$ extend naturally to selfadjoint operators $\tilde{A}_1, \ldots, \tilde{A}_k, \tilde{B}_1, \ldots, \tilde{B}_k \in B(\mathcal{M})$ such that $\{\tilde{A}_1, \ldots, \tilde{A}_k\}$ is free from $\{\tilde{B}_1, \ldots, \tilde{B}_k\}$ with respect to the vacuum-state defined by $\Omega$ on $B(\mathcal{M})$ and such that (with respect to the same state) the joint distributions of $\tilde{A}_1, \ldots, \tilde{A}_k$ and of $\tilde{B}_1, \ldots, \tilde{B}_k$ are equal to $\mu$ and $\nu$, respectively (see e.g. [17, Section 1.5]).

Theorem 4.10 clearly applies in the situation described in the preceding paragraph, and tells us that if $P_\Omega \in B(\mathcal{M})$ is the orthogonal projection onto $\mathbb{C}\Omega$ and if we put
\[ \tilde{C}_i = \tilde{A}_i + (1 - P_\Omega)\tilde{B}_i(1 - P_\Omega), \qquad 1 \le i \le k, \tag{4.28} \]
then the joint distribution of $\tilde{C}_1, \ldots, \tilde{C}_k$ is equal to $\mu \boxright \nu$. Since the $\tilde{C}_i$ are selfadjoint, this provides us with a proof that (as stated in Corollary 1.5) the subordination distribution $\mu \boxright \nu$ does indeed belong to $\mathcal{D}_c(k)$.

Remark 4.12. In the framework and notations of the preceding remark, consider the subspace $\mathcal{L}$ of $\mathcal{M}$ defined by:
\[ \mathcal{L} := \mathbb{C}\Omega \oplus \mathcal{H}_o \oplus (\mathcal{K}_o \otimes \mathcal{H}_o) \oplus (\mathcal{H}_o \otimes \mathcal{K}_o \otimes \mathcal{H}_o) \oplus \cdots \tag{4.29} \]
(direct sum of all alternating tensor products of copies of $\mathcal{H}_o$ and $\mathcal{K}_o$ which end in $\mathcal{H}_o$). In the terminology of [11], this is the s-free product space of the Hilbert spaces $\mathcal{H}$ and $\mathcal{K}$, considered with respect to the special unit vectors $\xi_o \in \mathcal{H}$ and $\zeta_o \in \mathcal{K}$.

Observe that $\mathcal{L}$ is invariant for the operators $\tilde{C}_1, \ldots, \tilde{C}_k$ from (4.28); this happens because $\mathcal{L}$ is in fact invariant both for $\tilde{A}_i$ and for $(1 - P_\Omega)\tilde{B}_i(1 - P_\Omega)$, $1 \le i \le k$. It follows that the restrictions of $\tilde{C}_1, \ldots, \tilde{C}_k$ to $\mathcal{L}$ also provide us with an operator model for $\mu \boxright \nu$, with respect to the vector-state defined by $\Omega$ on $B(\mathcal{L})$. By analyzing this operator model a bit further, one can moreover relate $\mu \boxright \nu$ to the concept of “s-freeness” from [11], in the way outlined in the next paragraph.

For every $1 \le i \le k$ let $\hat{A}_i$ and $\hat{B}_i$ denote the restrictions to $\mathcal{L}$ of the operators $\tilde{A}_i$ and respectively $(1 - P_\Omega)\tilde{B}_i(1 - P_\Omega)$. Let us consider the subalgebras $\mathcal{A}, \mathcal{B}$ of $B(\mathcal{L})$ which are generated by $\{1_{B(\mathcal{L})}, \hat{A}_1, \ldots, \hat{A}_k\}$ and respectively by $\{1_{B(\mathcal{L})} - P_\Omega, \hat{B}_1, \ldots, \hat{B}_k\}$. (Note that $\mathcal{B}$ is not a unital subalgebra of $B(\mathcal{L})$, but it has its own unit $1_{\mathcal{B}} = 1_{B(\mathcal{L})} - P_\Omega$, where $P_\Omega$ is viewed here as a 1-dimensional projection in $B(\mathcal{L})$.) Finally, let us select (and fix) an arbitrary unit vector $\theta_o \in \mathcal{H}_o \subseteq \mathcal{L}$, and let $\varphi$ and $\psi$ be the vector-states defined on $B(\mathcal{L})$ by $\Omega$ and by $\theta_o$, respectively. It is not hard to verify that the algebras $\mathcal{A}$ and $\mathcal{B}$ are s-free in $(B(\mathcal{L}), \varphi, \psi)$, in the sense of Definition 7.1 from [11]. It is moreover immediate that the joint distribution of $\hat{A}_1, \ldots, \hat{A}_k$ in $(\mathcal{A}, \varphi|_{\mathcal{A}})$ is equal to $\mu$, while the joint distribution of $\hat{B}_1, \ldots, \hat{B}_k$ in $(\mathcal{B}, \psi|_{\mathcal{B}})$ is equal to $\nu$. Thus $\mu \boxright \nu$ has been realized as the joint distribution of $\hat{A}_1 + \hat{B}_1, \ldots, \hat{A}_k + \hat{B}_k$, where the $k$-tuples $\hat{A}_1, \ldots, \hat{A}_k$ and $\hat{B}_1, \ldots, \hat{B}_k$ are s-free and have distributions $\mu$ and $\nu$, respectively.

The verification of the s-freeness of $\mathcal{A}$ and $\mathcal{B}$ in the preceding paragraph is left as an exercise. A reader who is interested in s-freeness may also find it an amusing (not hard) exercise to start from this latter description of $\mu \boxright \nu$ and see, conversely, how the statement of Theorem 1.4 can be obtained from there.

We conclude this section by observing that (as a supplement to the fact that $\mu \boxright \nu \in \mathcal{D}_c(k)$ whenever $\mu, \nu \in \mathcal{D}_c(k)$), there exist natural situations when $\mu \boxright \nu$ is sure to be $\boxplus$-infinitely divisible.

Corollary 4.13. Let $\mu, \nu$ be two distributions in $\mathcal{D}_c(k)$.
1° If $\mu$ is $\boxplus$-infinitely divisible, then so is $\mu \boxright \nu$.
2° Suppose that “$\mu$ is a $\boxplus$-summand of $\nu$ in $\mathcal{D}_c(k)$,” in the sense that there exists $\nu' \in \mathcal{D}_c(k)$ such that $\nu = \mu \boxplus \nu'$. Then $\mu \boxright \nu$ is $\boxplus$-infinitely divisible.

Proof. 1° The hypothesis that $\mu$ is $\boxplus$-infinitely divisible is equivalent to the fact that, for every $t > 0$, the convolution power $\mu^{\boxplus t}$ (which can always be defined in $\mathcal{D}_{\mathrm{alg}}(k)$) still belongs to $\mathcal{D}_c(k)$. But then, by invoking Remark 1.2.1 and Corollary 1.5 one finds that
\[ (\mu \boxright \nu)^{\boxplus t} = \mu^{\boxplus t} \boxright \nu \in \mathcal{D}_c(k), \qquad \forall t > 0, \]
which means that $\mu \boxright \nu$ is $\boxplus$-infinitely divisible as well.
2° One has $\mu \boxright \nu = \mu \boxright (\mu \boxplus \nu') = \mathbb{B}(\mu \boxright \nu')$ (where at the second equality sign we used Remark 3.8.1). Since $\mu \boxright \nu' \in \mathcal{D}_c(k)$ (by Corollary 1.5), and since $\mathbb{B}$ carries $\mathcal{D}_c(k)$ onto the set of $\boxplus$-infinitely divisible distributions in $\mathcal{D}_c(k)$, the conclusion follows. □
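5. Relations with the transformations $\mathbb{B}_t$

Proposition 5.1. Let $\mu, \nu$ be distributions in $\mathcal{D}_{\mathrm{alg}}(k)$. For every $t > 0$ one has that
\[ \mathbb{B}_t(\mu \boxright \nu) = \mu \boxright (\mu^{\boxplus t} \boxplus \nu). \tag{5.1} \]

Proof. We first prove by induction that
\[ \mathbb{B}_m(\mu \boxright \nu) = \mu \boxright (\mu^{\boxplus m} \boxplus \nu), \qquad \forall m \in \mathbb{N}. \tag{5.2} \]
The base case $m = 1$ of the induction is provided by formula (3.20) in Remark 3.8.1. The induction step “$m \Rightarrow m + 1$” also follows immediately by using the same formula:
\begin{align*}
\mathbb{B}_{m+1}(\mu \boxright \nu)
  &= \mathbb{B}\bigl(\mathbb{B}_m(\mu \boxright \nu)\bigr) && \text{(since } \mathbb{B}_{m+1} = \mathbb{B} \circ \mathbb{B}_m\text{)} \\
  &= \mathbb{B}\bigl(\mu \boxright (\mu^{\boxplus m} \boxplus \nu)\bigr) && \text{(by the induction hypothesis)} \\
  &= \mu \boxright (\mu \boxplus \mu^{\boxplus m} \boxplus \nu) && \text{(by Eq. (3.20))} \\
  &= \mu \boxright (\mu^{\boxplus (m+1)} \boxplus \nu).
\end{align*}
Now we move to proving that (5.1) holds for arbitrary $t > 0$. It suffices to fix $n \in \mathbb{N}$ and $1 \le i_1, \ldots, i_n \le k$ and to verify that
\[ \mathrm{Cf}_{(i_1,\ldots,i_n)}\bigl(R_{\mathbb{B}_t(\mu \boxright \nu)}\bigr) = \mathrm{Cf}_{(i_1,\ldots,i_n)}\bigl(R_{\mu \boxright (\mu^{\boxplus t} \boxplus \nu)}\bigr), \qquad \forall t > 0. \tag{5.3} \]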
For both sides of (5.3) one has explicit writings as sums indexed by non-crossing partitions. Indeed, Remark 4.4 from [6] tells us that the left-hand side of (5.3) is equal to
\[ \sum_{\substack{\rho \in NC(n), \\ \rho \ll 1_n}} t^{|\rho| - 1}\, \mathrm{Cf}_{(i_1,\ldots,i_n);\rho}(R_{\mu \boxright \nu}), \tag{5.4} \]
while the right-hand side of (5.3) can be written (by Proposition 3.3 and by taking into account the additivity of the R-transform) in the form
\[ \sum_{\substack{\pi \in NC(n), \\ \pi \ll 1_n}} \mathrm{Cf}_{(i_1,\ldots,i_n);\pi;o_\pi}(R_\mu, tR_\mu + R_\nu). \tag{5.5} \]
Rather than pursuing a detailed combinatorial analysis of the sums in (5.4) and (5.5) we can simply exploit the obvious fact that (for our fixed $n$ and $i_1, \ldots, i_n$) both these sums are polynomial functions of $t$. Two polynomial functions that agree (as shown by (5.2)) for all $m \in \mathbb{N}$ must in fact agree for all $t > 0$, and (5.3) follows. □

Remark 5.2. As an application of Proposition 5.1, we will next see how the formula “$\mu \boxright \mu = \mathbb{B}(\mu)$” from Remark 1.2.2 extends to a formula for $(\mu^{\boxplus s}) \boxright (\mu^{\boxplus t})$, where $s, t \ge 0$. In order to cover the cases when $s = 0$ or $t = 0$, we will denote by $\delta \in \mathcal{D}_{\mathrm{alg}}(k)$ the “noncommutative Dirac distribution at 0” which has all moments equal to 0. Then, clearly, $R_\delta = \eta_\delta = 0 \in \mathbb{C}_0\langle\langle z_1, \ldots, z_k \rangle\rangle$; as a consequence one has $\delta^{\boxplus t} = \delta^{\uplus t} = \delta$, hence $\mathbb{B}_t(\delta) = \delta$ for every $t > 0$.
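Moreover, it is clear that $\delta$ is the neutral element for both the operations $\boxplus$ and $\uplus$ on $\mathcal{D}_{\mathrm{alg}}(k)$, which justifies the convention that
\[ \mu^{\boxplus 0} = \mu^{\uplus 0} = \delta, \qquad \forall \mu \in \mathcal{D}_{\mathrm{alg}}(k). \tag{5.6} \]
Concerning subordination distributions it is easy to check, directly from Definition 1.1, that
\[ \mu \boxright \delta = \mu \quad \text{and} \quad \delta \boxright \mu = \delta, \qquad \forall \mu \in \mathcal{D}_{\mathrm{alg}}(k). \tag{5.7} \]

Proposition 5.3. Let $\mu$ be a distribution in $\mathcal{D}_{\mathrm{alg}}(k)$. Then for every $s, t \ge 0$ one has
\[ (\mu^{\boxplus s}) \boxright (\mu^{\boxplus t}) = \bigl(\mathbb{B}_t(\mu)\bigr)^{\boxplus s}. \tag{5.8} \]

Proof. First observe that
\begin{align*}
\mu \boxright \mu^{\boxplus t}
  &= \mu \boxright (\mu^{\boxplus t} \boxplus \delta) && \text{($\delta$ neutral element for $\boxplus$)} \\
  &= \mathbb{B}_t(\mu \boxright \delta) && \text{(by Proposition 5.1)} \\
  &= \mathbb{B}_t(\mu) && \text{(by (5.7))}.
\end{align*}
Then recall from Remark 1.2.1 that $(\mu^{\boxplus s}) \boxright (\mu^{\boxplus t}) = (\mu \boxright (\mu^{\boxplus t}))^{\boxplus s}$, and (5.8) follows. □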
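As a quick consistency check (a sketch added for illustration, not part of the original argument), formula (5.8) can be tested in the two boundary cases that conventions (5.6) and (5.7) were introduced to cover. Here $\boxplus$ denotes free additive convolution and $\boxright$ the subordination distribution. The identity $\mathbb{B}_0(\mu) = \mu$ used in the second case is an assumption: it is the usual convention for the family $(\mathbb{B}_t)_{t \ge 0}$, but it is not restated in this section.

```latex
% \boxright is assumed to come from a font package such as txfonts.
% Case s = 0 (any t >= 0): both sides of (5.8) reduce to \delta.
\[
  (\mu^{\boxplus 0}) \boxright (\mu^{\boxplus t})
    = \delta \boxright (\mu^{\boxplus t}) = \delta   % by (5.6), then (5.7)
  \qquad\text{and}\qquad
  \bigl(\mathbb{B}_t(\mu)\bigr)^{\boxplus 0} = \delta . % by (5.6)
\]
% Case t = 0 (any s >= 0), assuming \mathbb{B}_0(\mu) = \mu.
\[
  (\mu^{\boxplus s}) \boxright (\mu^{\boxplus 0})
    = (\mu^{\boxplus s}) \boxright \delta            % by (5.6)
    = \mu^{\boxplus s}                               % by (5.7)
    = \bigl(\mathbb{B}_0(\mu)\bigr)^{\boxplus s} .
\]
```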
Remark 5.4. The remaining part of this section discusses the relation to free Brownian motion stated in Theorem 1.8. Same as in Remark 1.7, we denote by $\gamma \in \mathcal{D}_c(k)$ the joint distribution of a free family of $k$ centered semicircular elements of variance 1. A fundamental property of $\gamma$ is that its R-transform is
\[ R_\gamma(z_1, \ldots, z_k) = z_1^2 + \cdots + z_k^2 \tag{5.9} \]
(see e.g. [13, Example 11.21.2 on p. 187]). More generally, for every $t > 0$ let $\gamma_t$ denote the distribution of a free family of $k$ centered semicircular elements of variance $t$. It is immediate that
\[ R_{\gamma_t}(z_1, \ldots, z_k) = R_\gamma(\sqrt{t}\,z_1, \ldots, \sqrt{t}\,z_k) = t\,(z_1^2 + \cdots + z_k^2); \]
hence $R_{\gamma_t} = tR_\gamma$, which shows that $\gamma_t = \gamma^{\boxplus t}$ for every $t > 0$.

Proposition 5.5. Let $\nu$ be a distribution in $\mathcal{D}_{\mathrm{alg}}(k)$. One has that
\[ R_{\gamma \boxright \nu}(z_1, \ldots, z_k) = \sum_{i=1}^{k} z_i \bigl(1 + M_\nu(z_1, \ldots, z_k)\bigr) z_i. \tag{5.10} \]
Proof. For $n \ge 3$ and $1 \le i_1, \ldots, i_n \le k$ one has
\begin{align*}
\mathrm{Cf}_{(i_1,\ldots,i_n)}(R_{\gamma \boxright \nu})
  &= \sum_{\substack{\pi \in NC(n), \\ \pi \ll 1_n}} \mathrm{Cf}_{(i_1,\ldots,i_n);\pi;o_\pi}(R_\gamma, R_\nu) && \text{(by Proposition 3.3)} \\
  &= \sum_{\substack{\pi \in NC(n) \text{ such} \\ \text{that } \{1,n\} \in \pi}} \delta_{i_1,i_n} \cdot \prod_{\substack{W \in \pi \\ W \neq \{1,n\}}} \mathrm{Cf}_{(i_1,\ldots,i_n)|W}(R_\nu) && \text{(because of the special form of } R_\gamma\text{)}.
\end{align*}
But the set of partitions $\pi \in NC(n)$ which have $\{1, n\}$ as a block is in natural bijection with $NC(n - 2)$; when we follow through with this bijection, the above sequence of equalities is continued with
\begin{align*}
  &= \delta_{i_1,i_n} \cdot \sum_{\rho \in NC(n-2)} \prod_{W \in \rho} \mathrm{Cf}_{(i_2,\ldots,i_{n-1})|W}(R_\nu) \\
  &= \delta_{i_1,i_n} \cdot \nu(X_{i_2} \cdots X_{i_{n-1}}) && \text{(by the moment–cumulant formula (2.28))} \\
  &= \mathrm{Cf}_{(i_1,\ldots,i_n)}\Bigl(\sum_{i=1}^{k} z_i \bigl(1 + M_\nu(z_1, \ldots, z_k)\bigr) z_i\Bigr).
\end{align*}
The above calculation shows that the series on the two sides of Eq. (5.10) have identical coefficients of length $\ge 3$. It is immediately verified that the coefficients of length 1 and 2 also coincide (each of the two series has vanishing linear part and quadratic part equal to $\sum_{i=1}^{k} z_i^2$), and this completes the proof. □

Corollary 5.6. The transformation $\Phi : \mathcal{D}_{\mathrm{alg}}(k) \to \mathcal{D}_{\mathrm{alg}}(k)$ from [6] satisfies
\[ \gamma \boxright \nu = \mathbb{B}\bigl(\Phi(\nu)\bigr), \qquad \forall \nu \in \mathcal{D}_{\mathrm{alg}}(k). \tag{5.11} \]

Proof. In [6] the distribution $\Phi(\nu)$ is defined via the prescription that its η-series is
\[ \eta_{\Phi(\nu)}(z_1, \ldots, z_k) = \sum_{i=1}^{k} z_i \bigl(1 + M_\nu(z_1, \ldots, z_k)\bigr) z_i. \tag{5.12} \]
Comparing this to Proposition 5.5 we see that $\eta_{\Phi(\nu)}$ coincides with the R-transform of $\gamma \boxright \nu$, and Eq. (5.11) follows. □

It is worth noting that the two main facts proved about $\Phi$ in [6] can be easily obtained from the perspective of subordination distributions, as explained in the next proposition. (The two statements of this proposition originally appeared as Theorem 6.2 and respectively as Corollary 7.10 in [6].)
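Proposition 5.7. 1° For every $\nu \in \mathcal{D}_{\mathrm{alg}}(k)$ and $t > 0$ one has that
\[ \Phi(\nu \boxplus \gamma_t) = \mathbb{B}_t\bigl(\Phi(\nu)\bigr). \tag{5.13} \]
2° The transformation $\Phi$ maps the subset $\mathcal{D}_c(k)$ of $\mathcal{D}_{\mathrm{alg}}(k)$ into itself.

Proof. 1° Since the Boolean Bercovici–Pata bijection $\mathbb{B}$ is one-to-one on $\mathcal{D}_{\mathrm{alg}}(k)$, it will suffice to prove that $\mathbb{B}(\Phi(\nu \boxplus \gamma_t)) = \mathbb{B}(\mathbb{B}_t(\Phi(\nu)))$. And indeed, starting from the right-hand side of the above equation we can go as follows:
\begin{align*}
\mathbb{B}\bigl(\mathbb{B}_t(\Phi(\nu))\bigr)
  &= \mathbb{B}_t\bigl(\mathbb{B}(\Phi(\nu))\bigr) && \text{(because } \mathbb{B} \circ \mathbb{B}_t = \mathbb{B}_{t+1} = \mathbb{B}_t \circ \mathbb{B}\text{)} \\
  &= \mathbb{B}_t(\gamma \boxright \nu) && \text{(by Corollary 5.6)} \\
  &= \gamma \boxright (\gamma^{\boxplus t} \boxplus \nu) && \text{(by Proposition 5.1)} \\
  &= \gamma \boxright (\nu \boxplus \gamma_t) && \text{(because } \gamma^{\boxplus t} = \gamma_t\text{)} \\
  &= \mathbb{B}\bigl(\Phi(\nu \boxplus \gamma_t)\bigr) && \text{(by Corollary 5.6)}.
\end{align*}
2° Since $\mathbb{B}$ is one-to-one, it will suffice to show that for $\nu \in \mathcal{D}_c(k)$ one has $\mathbb{B}(\Phi(\nu)) \in \mathbb{B}(\mathcal{D}_c(k))$. The latter set is precisely the set of distributions in $\mathcal{D}_c(k)$ which are $\boxplus$-infinitely divisible (cf. Theorem 1 in [5]). In view of (5.11), what we have thus to prove is the implication “$\nu \in \mathcal{D}_c(k) \Rightarrow \gamma \boxright \nu$ is $\boxplus$-infinitely divisible.” But $\gamma$ is itself $\boxplus$-infinitely divisible (since $\gamma^{\boxplus t} = \gamma_t \in \mathcal{D}_c(k)$, $\forall t > 0$), so the required implication follows from Corollary 4.13.1. □

6. Properties originating from functional equations

Remark 6.1. In this remark we briefly return to the 1-variable framework and notations from Section 2.1, and review the two functional equations that are to be extended to the multi-variable framework. Recall in particular that for a probability measure $\mu$ on $\mathbb{R}$, $F_\mu : \mathbb{C}^+ \to \mathbb{C}^+$ denotes the reciprocal Cauchy transform of $\mu$. In the case when $\mu$ is compactly supported, $F_\mu(z)$ can be viewed as a Laurent series in $z$, related to the η-series of $\mu$ by the formula
\[ F_\mu(z) = z\Bigl(1 - \eta_\mu\Bigl(\frac{1}{z}\Bigr)\Bigr). \tag{6.1} \]
In order to verify (6.1), one writes $F_\mu = 1/G_\mu$, $\eta_\mu = M_\mu/(1 + M_\mu)$, and uses the relation between $M_\mu$ and $G_\mu$ that was recorded in Eq. (2.6) in Section 2.1.

1° Let $\mu, \nu$ be two probability measures on $\mathbb{R}$, and let $\omega_1, \omega_2$ be the subordination functions of $\mu \boxplus \nu$ with respect to $\mu$ and to $\nu$, respectively. A remarkable equation satisfied by these functions (see e.g. Theorem 4.1 in [4]) is that
\[ \omega_1(z) + \omega_2(z) = z + F_{\mu \boxplus \nu}(z), \qquad z \in \mathbb{C}^+. \tag{6.2} \]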
But $\omega_1 = F_{\nu \boxright \mu}$ and $\omega_2 = F_{\mu \boxright \nu}$, hence (6.2) amounts to
\[ F_{\mu \boxright \nu}(z) + F_{\nu \boxright \mu}(z) = z + F_{\mu \boxplus \nu}(z), \qquad z \in \mathbb{C}^+. \tag{6.3} \]
Let us moreover replace the reciprocal Cauchy transforms in (6.3) by η-series, by using Eq. (6.1). Then (6.3) becomes
\[ \eta_{\mu \boxright \nu} + \eta_{\nu \boxright \mu} = \eta_{\mu \boxplus \nu}, \]
and in this form it goes through to the multi-variable framework of $\mathcal{D}_{\mathrm{alg}}(k)$, as shown in Proposition 6.2 below.

2° Let $\nu$ be a probability measure on $\mathbb{R}$. Then for every $p \ge 1$ one can consider the probability measure $\nu^{\boxplus p}$, and in Theorem 2.5 of [3] it was shown that one has
\[ G_{\nu^{\boxplus p}}(z) = G_\nu\Bigl(\frac{1}{p}\,z + \Bigl(1 - \frac{1}{p}\Bigr) F_{\nu^{\boxplus p}}(z)\Bigr), \qquad z \in \mathbb{C}^+. \tag{6.4} \]
In other words, Eq. (6.4) says that the Cauchy transform of $\nu^{\boxplus p}$ is subordinated to the one of $\nu$, with subordination function $\omega$ defined by
\[ \omega(z) = \frac{1}{p}\,z + \Bigl(1 - \frac{1}{p}\Bigr) F_{\nu^{\boxplus p}}(z), \qquad z \in \mathbb{C}^+. \tag{6.5} \]
It is immediate that $\omega$ from (6.5) belongs to the set $\mathcal{F}$ of reciprocal Cauchy transforms from Eq. (2.3) of Section 2.1, hence there exists a unique probability measure $\sigma$ on $\mathbb{R}$ such that $F_\sigma = \omega$. It is natural to call this $\sigma$ the “subordination distribution of $\nu^{\boxplus p}$ with respect to $\nu$.” (If $p \ge 2$ then $\sigma$ is just $\nu^{\boxplus (p-1)} \boxright \nu$, but for $1 \le p < 2$ this point of view does not always work, as the probability measure $\nu^{\boxplus (p-1)}$ might not be defined.) So then Eq. (6.5) becomes
\[ F_\sigma(z) = \frac{1}{p}\,z + \Bigl(1 - \frac{1}{p}\Bigr) F_{\nu^{\boxplus p}}(z), \qquad z \in \mathbb{C}^+, \]
and upon writing the reciprocal Cauchy transforms in terms of η-series this takes us to
\[ \eta_\sigma = \frac{p-1}{p} \cdot \eta_{\nu^{\boxplus p}}. \tag{6.6} \]
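The passage from the equation for $F_\sigma$ to (6.6) is a one-line substitution; the following sketch spells it out. It assumes that relation (6.1), stated above for compactly supported measures, may be applied to both $\sigma$ and $\nu^{\boxplus p}$, with all series read as formal Laurent series in $z$.

```latex
% Insert F(z) = z (1 - \eta(1/z)) from (6.1) on both sides of
%   F_\sigma(z) = z/p + (1 - 1/p) F_{\nu^{\boxplus p}}(z).
\begin{align*}
  z\bigl(1 - \eta_\sigma(1/z)\bigr)
    &= \frac{z}{p} + \Bigl(1 - \frac{1}{p}\Bigr)\, z\,
       \bigl(1 - \eta_{\nu^{\boxplus p}}(1/z)\bigr) \\
    &= z - \frac{p-1}{p}\, z\, \eta_{\nu^{\boxplus p}}(1/z).
\end{align*}
% Subtract z, divide by -z, and rename 1/z as the variable:
\[
  \eta_\sigma = \frac{p-1}{p}\, \eta_{\nu^{\boxplus p}} .
\]
```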
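This latter formula is the one that will be extended to the framework of $\mathcal{D}_c(k)$; see Corollary 6.4 and Remark 6.5 below.

Proposition 6.2. For every $\mu, \nu \in \mathcal{D}_{\mathrm{alg}}(k)$ one has that
\[ \eta_{\mu \boxplus \nu} = \eta_{\mu \boxright \nu} + \eta_{\nu \boxright \mu}. \tag{6.7} \]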
Proof. We fix $n \in \mathbb{N}$ and $1 \le i_1, \ldots, i_n \le k$ and compare the coefficients of $z_{i_1} \cdots z_{i_n}$ for the series on the two sides of Eq. (6.7). By using the relation between $R$ and $\eta$ and the linearizing property of $R$, and then by invoking Eq. (2.24) in Remark 2.12 we find that
\begin{align*}
\mathrm{Cf}_{(i_1,\ldots,i_n)}(\eta_{\mu \boxplus \nu})
  &= \sum_{\substack{\pi \in NC(n), \\ \pi \ll 1_n}} \mathrm{Cf}_{(i_1,\ldots,i_n);\pi}(R_\mu + R_\nu) \\
  &= \sum_{\substack{\pi \in NC(n), \\ \pi \ll 1_n}} \; \sum_{c : \pi \to \{1,2\}} \mathrm{Cf}_{(i_1,\ldots,i_n);\pi;c}(R_\mu, R_\nu).
\end{align*}
In the latter double sum, the colourings $c$ of $\pi$ can be subdivided according to whether $c(V_0) = 1$ or $c(V_0) = 2$, where $V_0$ is the unique outer block of $\pi$. This leads to an equality of the form $\mathrm{Cf}_{(i_1,\ldots,i_n)}(\eta_{\mu \boxplus \nu}) = \Sigma_1 + \Sigma_2$, where $\Sigma_1$ is exactly as on the right-hand side of Eq. (3.15) from Proposition 3.7, and $\Sigma_2$ is the counterpart of $\Sigma_1$ with the roles of $\mu$ and $\nu$ being reversed. We are only left to invoke Proposition 3.7 to conclude that
\[ \Sigma_1 + \Sigma_2 = \mathrm{Cf}_{(i_1,\ldots,i_n)}(\eta_{\mu \boxright \nu}) + \mathrm{Cf}_{(i_1,\ldots,i_n)}(\eta_{\nu \boxright \mu}) = \mathrm{Cf}_{(i_1,\ldots,i_n)}(\eta_{\mu \boxright \nu} + \eta_{\nu \boxright \mu}), \]
and (6.7) follows. □
When discussing the multi-variable analogue for Eq. (6.6) it is convenient to note that there is no problem to generally talk about the “subordination distribution of $\lambda$ with respect to $\nu$” for any $\lambda, \nu \in \mathcal{D}_{\mathrm{alg}}(k)$.

Definition 6.3. Let two distributions $\lambda, \nu \in \mathcal{D}_{\mathrm{alg}}(k)$ be given. Consider the distribution $\mu \in \mathcal{D}_{\mathrm{alg}}(k)$ which is uniquely determined by the requirement that
\[ R_\mu = R_\lambda - R_\nu \tag{6.8} \]
(or equivalently, via the requirement that $\mu \boxplus \nu = \lambda$). Then the subordination distribution of $\lambda$ with respect to $\nu$ is, by definition, equal to $\mu \boxright \nu$.

Corollary 6.4. 1° For every $\nu \in \mathcal{D}_{\mathrm{alg}}(k)$ and every $p \ge 1$, the subordination distribution of $\nu^{\boxplus p}$ with respect to $\nu$ is equal to $(\mathbb{B}(\nu))^{\boxplus (p-1)}$.
2° Let $\nu$ be a distribution in $\mathcal{D}_c(k)$. Then, for every $p \ge 1$, the subordination distribution of $\nu^{\boxplus p}$ with respect to $\nu$ belongs to $\mathcal{D}_c(k)$ as well, and is moreover $\boxplus$-infinitely divisible.

Proof. 1° According to Definition 6.3, the distribution in question is $\nu^{\boxplus (p-1)} \boxright \nu$. Thus we only need to invoke the particular case of Proposition 5.3 where $s = p - 1$ and $t = 1$.
2° This follows from part 1° of the corollary and the fact that $\mathbb{B}(\nu)$ is $\boxplus$-infinitely divisible (which implies that any convolution power $(\mathbb{B}(\nu))^{\boxplus t}$, $t \ge 0$, lives in $\mathcal{D}_c(k)$ and is itself $\boxplus$-infinitely divisible). □

Remark 6.5. It is an easy exercise (left to the reader) to verify the identity
\[ \bigl(\mathbb{B}(\nu)\bigr)^{\boxplus (p-1)} = \bigl(\nu^{\boxplus p}\bigr)^{\uplus (p-1)/p}, \qquad \forall \nu \in \mathcal{D}_{\mathrm{alg}}(k), \ \forall p \in [1, \infty). \tag{6.9} \]
So if we denote the subordination distribution of $\nu^{\boxplus p}$ with respect to $\nu$ by $\sigma$, then by invoking Corollary 6.4 and by taking the η-series of the distribution on the right-hand side of (6.9) we obtain that $\eta_\sigma = ((p - 1)/p) \cdot \eta_{\nu^{\boxplus p}}$. Thus Corollary 6.4 gives indeed a multi-variable generalization of Eq. (6.6) from Remark 6.1.2.

References

[1] N.I. Akhiezer, The Classical Moment Problem and Some Related Problems in Analysis, Hafner, New York, 1965.
[2] M. Anshelevich, Free evolution on algebras with two states, preprint, 2008, available at http://arXiv.org, under arXiv: 0803.4280.
[3] S.T. Belinschi, H. Bercovici, Atoms and regularity for measures in a partially defined free convolution semigroup, Math. Z. 248 (2004) 665–674.
[4] S.T. Belinschi, H. Bercovici, A new approach to subordination results in free probability, J. Anal. Math. 101 (2007) 357–366.
[5] S.T. Belinschi, A. Nica, η-series and a Boolean Bercovici–Pata bijection for bounded k-tuples, Adv. Math. 217 (2008) 1–41.
[6] S.T. Belinschi, A. Nica, Free Brownian motion and evolution towards ⊞-infinite divisibility for k-tuples, Internat. J. Math., in press, available at http://arXiv.org under arXiv: 0711.3787.
[7] H. Bercovici, V. Pata, Stable laws and domains of attraction in free probability theory, with an appendix by P. Biane, The density of free stable distributions, Ann. of Math. 149 (1999) 1023–1060.
[8] H. Bercovici, D. Voiculescu, Free convolution of measures with unbounded support, Indiana Univ. Math. J. 42 (1993) 733–773.
[9] P. Biane, Processes with free increments, Math. Z. 227 (1998) 143–174.
[10] M. Bozejko, M. Leinert, R. Speicher, Convolution and limit theorems for conditionally free random variables, Pacific J. Math. 175 (1996) 357–388.
[11] R. Lenczewski, Decompositions of the free additive convolution, J. Funct. Anal. 246 (2007) 330–365.
[12] H. Maassen, Addition of freely independent random variables, J. Funct. Anal. 106 (1992) 409–438.
[13] A. Nica, R. Speicher, Lectures on the Combinatorics of Free Probability, London Math. Soc. Lecture Note Ser., vol. 335, Cambridge Univ. Press, 2006.
[14] D. Voiculescu, Addition of certain non-commuting random variables, J. Funct. Anal. 66 (1986) 323–346.
[15] D. Voiculescu, The analogues of entropy and of Fisher’s information measure in free probability theory I, Comm. Math. Phys. 155 (1993) 71–92.
[16] D. Voiculescu, The coalgebra of the free difference quotient and free probability, Int. Math. Res. Not. 2 (2000) 79–106.
[17] D.V. Voiculescu, K.J. Dykema, A. Nica, Free Random Variables, CRM Monogr. Ser., vol. 1, Amer. Math. Soc., 1992.
Journal of Functional Analysis 257 (2009) 464–484 www.elsevier.com/locate/jfa
Robustness of general dichotomies ✩ Luis Barreira ∗ , Claudia Valls Departamento de Matemática, Instituto Superior Técnico, 1049-001 Lisboa, Portugal Received 5 November 2008; accepted 24 November 2008 Available online 4 December 2008 Communicated by Paul Malliavin
Abstract

For a linear equation $v' = A(t)v$ we consider general dichotomies that may exhibit stable and unstable behaviors with respect to arbitrary asymptotic rates $e^{c\rho(t)}$ for some function $\rho(t)$. This includes as a special case the usual exponential behavior when $\rho(t) = t$. We also consider the general case of nonuniform exponential dichotomies. We establish the robustness of the exponential dichotomies in Banach spaces, in the sense that the existence of an exponential dichotomy for a given linear equation persists under sufficiently small linear perturbations. We also establish the continuous dependence on the perturbation of the constants in the notion of dichotomy. © 2008 Elsevier Inc. All rights reserved.

Keywords: Nonuniform exponential behavior; Robustness
1. Introduction

We consider a linear equation
\[ v' = A(t)v \tag{1} \]
that may exhibit stable and unstable behaviors with asymptotic rates of the form $e^{c\rho(t)}$ for an arbitrary function $\rho(t)$. The main motivation to consider this general asymptotic behavior are those linear equations for which all Lyapunov exponents are infinite (either +∞ or −∞), and
Partially supported by FCT through CAMGSD, Lisbon.
* Corresponding author.
E-mail addresses: [email protected] (L. Barreira), [email protected] (C. Valls). URL: http://www.math.ist.utl.pt/~barreira/ (L. Barreira). 0022-1236/$ – see front matter © 2008 Elsevier Inc. All rights reserved. doi:10.1016/j.jfa.2008.11.018
L. Barreira, C. Valls / Journal of Functional Analysis 257 (2009) 464–484
thus to which one is not able, at least without further modifications, to apply the existing stability theory. This gives rise to the notion of ρ-nonuniform exponential dichotomy (see Section 3 for the definition), which turns out to be rather common. In particular, we show in [2] that for ρ in a large class of rate functions, any linear equation as in (1) in a finite-dimensional space, with two blocks having asymptotic rates $e^{c\rho(t)}$ respectively with $c$ negative and positive, has a ρ-nonuniform exponential dichotomy. A related important problem is whether the notion of exponential dichotomy is robust under sufficiently small linear perturbations, that is, whether the equation
\[ v' = \bigl(A(t) + B(t)\bigr)v \tag{2} \]
admits a ρ-nonuniform exponential dichotomy if the same happens for Eq. (1). Our main objective is precisely to establish the robustness of ρ-nonuniform exponential dichotomies in Banach spaces. We also establish the continuous dependence on the perturbation of the constants in the notion of dichotomy.

We note that the study of robustness in the case of uniform exponential behavior has a long history. In particular, it was discussed by Massera and Schäffer [7] (building on earlier work of Perron [11]; see also [8]), Coppel [5], and Dalec’kiĭ and Kreĭn [6] in the case of Banach spaces. For more recent works we refer to [4,9,14] and the references therein. We refer to [3] for the study of robustness in the more general setting of nonuniform exponential behavior in the particular case when $\rho(t) = t$.

We emphasize that in addition to considering arbitrary growth rates given by a function ρ, we also consider the possibility of a nonuniform exponential behavior. As such, our work is also a contribution to the nonuniform hyperbolicity theory. A uniform behavior substantially restricts the dynamics and it is important to look for more general types of exponential behavior. We refer to [1] for a detailed exposition of the nonuniform hyperbolicity theory. The theory goes back to the landmark works of Oseledets [10] and particularly Pesin [12,13]. Since then it became an important part of the general theory of dynamical systems and a principal tool in the study of stochastic behavior.

2. Robustness of nonuniform exponential contractions

We establish in this section the robustness of nonuniform exponential contractions. We consider the linear equation (1), where $A : \mathbb{R}^+ \to B(X)$ is a continuous function into the space $B(X)$ of bounded linear operators in a Banach space $X$. Notice that each solution of Eq. (1) is defined in $\mathbb{R}^+$.
We denote by $T(t, s)$ the associated evolution operator, i.e., the linear operator such that $T(t, s)v(s) = v(t)$ for every $t, s > 0$, where $v(t)$ is any solution of (1). Clearly, $T(t, t) = \mathrm{Id}$, and
\[ T(t, \tau)T(\tau, s) = T(t, s), \qquad t, \tau, s > 0. \]
Consider an increasing differentiable function $\rho : \mathbb{R}_0^+ \to \mathbb{R}_0^+$ with $\rho(0) = 0$. We say that Eq. (1) admits a ρ-nonuniform exponential contraction in $\mathbb{R}^+$ if for some constants $\lambda, D > 0$ and $a \ge 0$ we have
\[ \|T(t, s)\| \le D e^{-\lambda(\rho(t) - \rho(s)) + a\rho(s)}, \qquad t \ge s \ge 0. \]
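To make the definition concrete, the following sketch evaluates the bound for two admissible rate functions. The first choice, $\rho(t) = t$, is the case mentioned in the introduction; the second choice, $\rho(t) = \log(1 + t)$, is an illustrative assumption that is not taken from the text, picked only because it is increasing, differentiable, and vanishes at $0$.

```latex
% rho(t) = t: the usual nonuniform exponential contraction.
\[
  \|T(t,s)\| \le D\, e^{-\lambda (t - s) + a s}, \qquad t \ge s \ge 0 .
\]
% rho(t) = log(1 + t) (illustrative choice): the same definition
% yields polynomial rather than exponential rates, since
% e^{-\lambda(\rho(t) - \rho(s))} = ((1+t)/(1+s))^{-\lambda}
% and e^{a \rho(s)} = (1+s)^{a}.
\[
  \|T(t,s)\| \le D \Bigl(\frac{1+t}{1+s}\Bigr)^{-\lambda} (1+s)^{a},
  \qquad t \ge s \ge 0 .
\]
```

In both cases the factor $e^{a\rho(s)}$ is what makes the bound nonuniform: the constant in front of the decay depends on the initial time $s$ unless $a = 0$.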
Now we consider the perturbed equation (2), where $B : \mathbb{R}^+ \to B(X)$ is a continuous function. The following statement gives a class of perturbations $B(t)v$ under which a given ρ-nonuniform exponential contraction is robust.

Theorem 1. Let $A, B : \mathbb{R}^+ \to B(X)$ be continuous functions such that Eq. (1) admits a ρ-nonuniform exponential contraction in $\mathbb{R}^+$, and $\|B(t)\| \le \delta e^{-a\rho(t)} \rho'(t)$, $t \ge 0$, with $\delta < \lambda/D$. Then Eq. (2) admits a ρ-nonuniform exponential contraction in $\mathbb{R}^+$, with
\[ \|\hat{T}(t, s)\| \le D e^{-(\lambda - \delta D)(\rho(t) - \rho(s)) + a\rho(s)}, \qquad t \ge s \ge 0, \tag{3} \]
where $\hat{T}(t, s)$ is the evolution operator associated to Eq. (2).

Proof. We consider the set $J = \{(t, s) \in \mathbb{R}^+ \times \mathbb{R}^+ : t \ge s\}$, and the Banach space
\[ \mathcal{C} = \{U : J \to B(X) : U \text{ is continuous and } \|U\| < \infty\} \tag{4} \]
with the norm
\[ \|U\| = \sup\bigl\{\|U(t, s)\| e^{-a\rho(s)} : (t, s) \in J\bigr\}. \tag{5} \]
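We define an operator $L$ in the space $\mathcal{C}$ by
\[ (LU)(t, s) = T(t, s) + \int_s^t T(t, \tau)B(\tau)U(\tau, s)\,d\tau \]
for each $U \in \mathcal{C}$. Since
\begin{align*}
\|(LU)(t, s)\| &\le \|T(t, s)\| + \int_s^t \|T(t, \tau)\| \cdot \|B(\tau)\| \cdot \|U(\tau, s)\|\,d\tau \\
  &\le D e^{-\lambda(\rho(t) - \rho(s)) + a\rho(s)} + D\delta e^{a\rho(s)} \|U\| \int_s^t e^{-\lambda(\rho(t) - \rho(\tau))} \rho'(\tau)\,d\tau \\
  &\le D e^{-\lambda(\rho(t) - \rho(s)) + a\rho(s)} + \frac{D\delta}{\lambda} e^{a\rho(s)} \|U\|, \tag{6}
\end{align*}
we have
\[ \|LU\| \le D + \frac{D\delta}{\lambda} \|U\| < \infty, \]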
and we obtain a well-defined operator $L : \mathcal{C} \to \mathcal{C}$. Proceeding in a similar manner to that in (6), we show that for every $U_1, U_2 \in \mathcal{C}$,
\[ \|LU_1 - LU_2\| \le \frac{D\delta}{\lambda} \|U_1 - U_2\|. \]
Since $\delta < \lambda/D$, the operator $L$ is a contraction. Hence, there exists a unique function $U \in \mathcal{C}$ such that $LU = U$, which thus satisfies the identity
\[ U(t, s) = T(t, s) + \int_s^t T(t, \tau)B(\tau)U(\tau, s)\,d\tau \]
for every $t \ge s \ge 0$. By the variation-of-constants formula, we know that $U(t, s) = \hat{T}(t, s)$. We use the following statement to estimate the norm of the operator $\hat{T}(t, s)$.

Lemma 1. Given $s \ge 0$, let $x : [s, +\infty) \to [0, +\infty)$ be a bounded continuous function satisfying
\[ x(t) \le D e^{-\lambda(\rho(t) - \rho(s)) + a\rho(s)} + \delta D \int_s^t e^{-\lambda(\rho(t) - \rho(\tau))} \rho'(\tau) x(\tau)\,d\tau \tag{7} \]
for every $t \ge s \ge 0$. If $\delta < \lambda/D$, then
\[ x(t) \le D e^{-(\lambda - \delta D)(\rho(t) - \rho(s)) + a\rho(s)}, \qquad t \ge s \ge 0. \tag{8} \]
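Proof. Consider the continuous function $\Phi(t)$ satisfying
\[ \Phi(t) = D e^{-\lambda(\rho(t) - \rho(s)) + a\rho(s)} + \delta D \int_s^t e^{-\lambda(\rho(t) - \rho(\tau))} \rho'(\tau) \Phi(\tau)\,d\tau \tag{9} \]
for every $t \ge s \ge 0$. We can verify that $\Phi' = (\delta D - \lambda)\rho'(t)\Phi$. Furthermore, by (9) we have $\Phi(s) = D e^{a\rho(s)}$, and hence, $\Phi(t) = D e^{a\rho(s)} e^{(\delta D - \lambda)(\rho(t) - \rho(s))}$. If $z(t) = x(t) - \Phi(t)$, then
\[ z(t) \le \delta D \int_s^t e^{-\lambda(\rho(t) - \rho(\tau))} \rho'(\tau) z(\tau)\,d\tau, \qquad t \ge s. \tag{10} \]
Set $z = \sup_{t \ge s} z(t)$. Since the functions $x$ and $\Phi$ are bounded, the number $z$ is finite. It follows from (10) that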
\begin{align*}
z &\le \delta D \sup_{t \ge s} \int_s^t e^{-\lambda(\rho(t) - \rho(\tau))} \rho'(\tau) z(\tau)\,d\tau \\
  &\le \delta D z \sup_{t \ge s} \int_s^t e^{-\lambda(\rho(t) - \rho(\tau))} \rho'(\tau)\,d\tau.
\end{align*}
Hence, $z \le (\delta D/\lambda)z$, and since $\delta D/\lambda < 1$, we have $z \le 0$. This shows that $x(t) \le \Phi(t)$ for every $t \ge s \ge 0$. □

Now we set $x(t) = \|\hat{T}(t, s)\|$. Since $U \in \mathcal{C}$ the function $x$ is bounded. It follows from the first inequality in (6) that (7) holds. Hence, by (8) we obtain inequality (3). This completes the proof of the theorem. □

3. Robustness of nonuniform exponential dichotomies

We establish in this section the robustness of nonuniform exponential dichotomies. We say that Eq. (1) admits a ρ-nonuniform exponential dichotomy in $\mathbb{R}^+$ if there exist projections $P(t) : X \to X$ for each $t > 0$ satisfying
\[ T(t, s)P(s) = P(t)T(t, s), \qquad t \ge s, \tag{11} \]
and there exist constants $\lambda, D > 0$ and $a \ge 0$ such that
\[ \|T(t, s)P(s)\| \le D e^{-\lambda(\rho(t) - \rho(s)) + a\rho(s)}, \qquad t \ge s, \tag{12} \]
and
\[ \|T(t, s)Q(s)\| \le D e^{-\lambda(\rho(s) - \rho(t)) + a\rho(s)}, \qquad s \ge t, \tag{13} \]
where $Q(t) = \mathrm{Id} - P(t)$ for each $t > 0$. The following is our main result. Set
\[ \tilde{a} = \lambda\sqrt{1 - 2\delta D/\lambda} \quad \text{and} \quad \tilde{D} = \frac{D}{1 - \delta D/(\tilde{a} + \lambda)}. \]
Theorem 2. Let $A, B : \mathbb{R}_0^+ \to B(X)$ be continuous functions such that Eq. (1) admits a ρ-nonuniform exponential dichotomy in $\mathbb{R}^+$ with $a < \lambda$, and $\|B(t)\| \le \delta e^{-2a\rho(t)} \rho'(t)$, $t \ge 0$. If $\delta$ is sufficiently small, then Eq. (2) admits a ρ-nonuniform exponential dichotomy in $\mathbb{R}^+$, with the constants $\lambda$, $D$, and $a$ replaced respectively by $\tilde{a}$, $4D\tilde{D}$, and $2a$.

Proof. We consider the set $J = \{(t, s) \in I \times I : t \ge s\}$, where $I = \mathbb{R}^+$, and the Banach space $\mathcal{C}$ in (4).
Lemma 2. There is a unique $U \in \mathcal{C}$ such that for each $(t, s) \in J$ we have
\[ U(t, s) = T(t, s)P(s) + \int_s^t T(t, \tau)P(\tau)B(\tau)U(\tau, s)\,d\tau - \int_t^\infty T(t, \tau)Q(\tau)B(\tau)U(\tau, s)\,d\tau. \tag{14} \]
Moreover, $t \mapsto U(t, s)\xi$, $t \ge s$, is a solution of Eq. (2) for each $\xi \in X$.

Proof. We proceed in a similar manner to that in the proof of Theorem 1. For the first property, we show that the operator $L$ defined by
\[ (LU)(t, s) = T(t, s)P(s) + \int_s^t T(t, \tau)P(\tau)B(\tau)U(\tau, s)\,d\tau - \int_t^\infty T(t, \tau)Q(\tau)B(\tau)U(\tau, s)\,d\tau \tag{15} \]
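has a unique fixed point in $\mathcal{C}$. We have
\begin{align*}
\|(LU)(t, s)\| &\le \|T(t, s)P(s)\| + \int_s^t \|T(t, \tau)P(\tau)\| \cdot \|B(\tau)\| \cdot \|U(\tau, s)\|\,d\tau \\
  &\quad + \int_t^\infty \|T(t, \tau)Q(\tau)\| \cdot \|B(\tau)\| \cdot \|U(\tau, s)\|\,d\tau \\
  &\le D e^{-\lambda(\rho(t) - \rho(s)) + a\rho(s)} + D\delta e^{a\rho(s)} \|U\| \int_s^t e^{-\lambda(\rho(t) - \rho(\tau))} \rho'(\tau)\,d\tau \\
  &\quad + D\delta e^{a\rho(s)} \|U\| \int_t^\infty e^{-\lambda(\rho(\tau) - \rho(t))} \rho'(\tau)\,d\tau. \tag{16}
\end{align*}
Since $\lambda > 0$, taking $\delta$ sufficiently small, this implies that
\[ \|LU\| \le D + \frac{2\delta D}{\lambda} \|U\| < \infty, \]
and we have a well-defined operator $L : \mathcal{C} \to \mathcal{C}$. Using (15) and proceeding in a similar manner to that in (16) we obtain
\[ \|LU_1 - LU_2\| \le \frac{2\delta D}{\lambda} \|U_1 - U_2\| \]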
for every $U_1, U_2 \in \mathcal{C}$. Therefore, for any sufficiently small $\delta$, the operator $L$ is a contraction, and there is a unique $U \in \mathcal{C}$ such that $LU = U$. Finally, we note that the right-hand side of (14) is differentiable in $t$, and that
\begin{align*}
\frac{\partial U}{\partial t}(t, s) &= A(t)T(t, s)P(s) + A(t) \int_s^t T(t, \tau)P(\tau)B(\tau)U(\tau, s)\,d\tau \\
  &\quad - A(t) \int_t^\infty T(t, \tau)Q(\tau)B(\tau)U(\tau, s)\,d\tau \\
  &\quad + P(t)B(t)U(t, s) + Q(t)B(t)U(t, s) \\
  &= A(t)U(t, s) + B(t)U(t, s).
\end{align*}
This completes the proof of the lemma. □
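The constant $2\delta D/\lambda$ in the proof of Lemma 2 comes from two elementary integrals, which the following sketch evaluates by the substitution $u = \rho(\tau)$. It uses only that $\rho$ is increasing and differentiable; the limit in the second line exists in $[0, 1]$ because the integrand's exponential factor is monotone in $\tau$.

```latex
\begin{align*}
  \int_s^t e^{-\lambda(\rho(t) - \rho(\tau))} \rho'(\tau)\, d\tau
    &= \frac{1}{\lambda}\Bigl(1 - e^{-\lambda(\rho(t) - \rho(s))}\Bigr)
     \le \frac{1}{\lambda}, \\
  \int_t^\infty e^{-\lambda(\rho(\tau) - \rho(t))} \rho'(\tau)\, d\tau
    &= \frac{1}{\lambda}\Bigl(1 - \lim_{\tau \to \infty}
       e^{-\lambda(\rho(\tau) - \rho(t))}\Bigr)
     \le \frac{1}{\lambda}.
\end{align*}
% Adding the two bounds and multiplying by D \delta gives the
% factor 2 \delta D / \lambda that multiplies \|U\| and
% \|U_1 - U_2\| in the estimates for L.
```

The first of these integrals is also the one behind the constant $D\delta/\lambda$ in estimate (6) of the proof of Theorem 1.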
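Lemma 3. For every $t \ge \tau \ge s$ in $I$ we have $U(t, \tau)U(\tau, s) = U(t, s)$.

Proof. Setting
\[ Z(u) = U(u, \tau)U(\tau, s) - U(u, s), \quad X(t, u) = T(t, u)P(u)B(u), \quad \text{and} \quad Y(t, u) = T(t, u)Q(u)B(u), \tag{17} \]
we can show in a straightforward manner that
\[ Z(t) = \int_\tau^t X(t, u)Z(u)\,du - \int_t^\infty Y(t, u)Z(u)\,du. \tag{18} \]
For each fixed $\tau \ge s$ in $I$, we consider the operator $N$ defined by
\[ (NW)(t) = \int_\tau^t X(t, u)W(u)\,du - \int_t^\infty Y(t, u)W(u)\,du, \tag{19} \]
in the Banach space
\[ E = \{W : [\tau, +\infty) \to B(X) : W \text{ is continuous and } \|W\| < \infty\} \]
with the norm $\|W\| = \sup\{\|W(u)\| : u \in [\tau, +\infty)\}$. By (19) we have
\begin{align*}
\|(NW)(t)\| &\le D \int_\tau^t e^{-\lambda(\rho(t) - \rho(u)) + a\rho(u)} \|B(u)\| \cdot \|W(u)\|\,du \\
  &\quad + D \int_t^\infty e^{-\lambda(\rho(u) - \rho(t)) + a\rho(u)} \|B(u)\| \cdot \|W(u)\|\,du \le \frac{2\delta D}{\lambda} \|W\|,
\end{align*}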
and thus $N(E) \subset E$. Proceeding in a similar manner we show that
\[ \|NW_1 - NW_2\| \le \frac{2\delta D}{\lambda} \|W_1 - W_2\| \]
for every $W_1, W_2 \in E$. If $\delta$ is sufficiently small, then $N$ is a contraction, and there is a unique function $W \in E$ satisfying (18). Since $0 \in E$ satisfies (18) we must have $W = 0$. By Lemma 2, the function $Z$ in (17) is in $E$, and since it satisfies (18), for every $t \ge \tau \ge s$ in $I$ we have $Z(t) = 0$. This completes the proof of the lemma. □

Now we define linear operators
\[ \hat{P}(t) = \hat{T}(t, 0)U(0, 0)\hat{T}(0, t) \quad \text{and} \quad \hat{Q}(t) = \mathrm{Id} - \hat{P}(t) \tag{20} \]
for each $t \in I$, where $\hat{T}(t, s)$ denotes the evolution operator associated to Eq. (2). We show that Eq. (2) admits a ρ-nonuniform exponential dichotomy with projections $\hat{P}(t)$.

Lemma 4. The operator $\hat{P}(t)$ is a projection for each $t \in I$, and
\[ \hat{T}(t, s)\hat{P}(s) = \hat{P}(t)\hat{T}(t, s), \qquad t \ge s. \tag{21} \]

Proof. The proof is straightforward using $U(0, 0)^2 = U(0, 0)$, $\hat{T}(t, t) = \mathrm{Id}$, and (20). □
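For readers who want the "straightforward" computation behind Lemma 4 written out, here is a sketch. It assumes the cocycle identity $\hat{T}(t, \tau)\hat{T}(\tau, s) = \hat{T}(t, s)$ for the evolution operator of Eq. (2), in the same form as the identity stated for $T$ in Section 2, and it takes $U(0, 0)^2 = U(0, 0)$ from Lemma 3 with $t = \tau = s = 0$.

```latex
% \hat P(t) is idempotent:
\begin{align*}
  \hat P(t)^2
    &= \hat T(t,0)\, U(0,0)\, \hat T(0,t)\, \hat T(t,0)\, U(0,0)\, \hat T(0,t) \\
    &= \hat T(t,0)\, U(0,0)^2\, \hat T(0,t)
       && \text{since } \hat T(0,t)\hat T(t,0) = \hat T(0,0) = \mathrm{Id} \\
    &= \hat T(t,0)\, U(0,0)\, \hat T(0,t) = \hat P(t).
\end{align*}
% Commutation relation (21): both sides reduce to the same operator.
\begin{align*}
  \hat P(t)\, \hat T(t,s)
    &= \hat T(t,0)\, U(0,0)\, \hat T(0,t)\, \hat T(t,s)
     = \hat T(t,0)\, U(0,0)\, \hat T(0,s), \\
  \hat T(t,s)\, \hat P(s)
    &= \hat T(t,s)\, \hat T(s,0)\, U(0,0)\, \hat T(0,s)
     = \hat T(t,0)\, U(0,0)\, \hat T(0,s).
\end{align*}
```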
Lemma 5. Given $s \in I$, if $y : [s, +\infty) \to X$ is a bounded solution of Eq. (2) with $y(s) = \xi$, then
\[ y(t) = T(t, s)P(s)\xi + \int_s^t T(t, \tau)P(\tau)B(\tau)y(\tau)\,d\tau - \int_t^\infty T(t, \tau)Q(\tau)B(\tau)y(\tau)\,d\tau. \]

Proof. By the variation-of-constants formula, for each $t \ge s$ in $I$ we have
\[ P(t)y(t) = T(t, s)P(s)\xi + \int_s^t T(t, \tau)P(\tau)B(\tau)y(\tau)\,d\tau \tag{22} \]
and
\[ Q(t)y(t) = T(t, s)Q(s)\xi + \int_s^t T(t, \tau)Q(\tau)B(\tau)y(\tau)\,d\tau. \]
Equivalently, the last formula can be written in the form
\[ Q(s)\xi = T(s, t)Q(t)y(t) - \int_s^t T(s, \tau)Q(\tau)B(\tau)y(\tau)\,d\tau. \tag{23} \]
Since $y(t)$ is bounded, we have
\[ \|T(s, t)Q(t)y(t)\| \le CD e^{-\lambda(\rho(t) - \rho(s)) + a\rho(t)}, \]
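where $C = \sup\{\|y(t)\| : t \ge s \text{ in } I\} < \infty$. Furthermore,
\[ \int_s^\infty \|T(s, \tau)Q(\tau)\| \cdot \|B(\tau)\| \cdot \|y(\tau)\|\,d\tau \le D\delta C \int_s^\infty e^{-\lambda(\rho(\tau) - \rho(s))} \rho'(\tau)\,d\tau = \frac{D\delta C}{\lambda}. \]
The statement follows from (22), taking limits in (23) when $t \to +\infty$. □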
Lemma 6. The function $[s, +\infty) \ni t \mapsto \hat{P}(t)\hat{T}(t, s)$ is bounded, and for any $t \ge s$ in $I$ we have
\[ \hat{P}(t)\hat{T}(t, s) = T(t, s)P(s)\hat{P}(s) + \int_s^t T(t, \tau)P(\tau)B(\tau)\hat{P}(\tau)\hat{T}(\tau, s)\,d\tau - \int_t^\infty T(t, \tau)Q(\tau)B(\tau)\hat{P}(\tau)\hat{T}(\tau, s)\,d\tau. \tag{24} \]

Proof. By Lemma 2, the function $t \mapsto U(t, 0)\xi$, $t \ge 0$, is a solution of Eq. (2) with initial condition at time zero equal to $U(0, 0)\xi$. Therefore, $U(t, 0) = \hat{T}(t, 0)U(0, 0)$. By Lemma 4 (see (21)),
\[ \hat{P}(t)\hat{T}(t, s) = \hat{T}(t, s)\hat{P}(s) = \hat{T}(t, 0)U(0, 0)\hat{T}(0, s) = U(t, 0)\hat{T}(0, s). \tag{25} \]
Again by Lemma 2, for each $\xi \in X$ the function
\[ y(t) = \hat{P}(t)\hat{T}(t, s)\xi = U(t, 0)\hat{T}(0, s)\xi \]
is a solution of (2). Furthermore, by the definition of the space $\mathcal{C}$ in (4)–(5) this solution is bounded for $t \ge s$, and by (25),
\[ y(s) = U(s, 0)\hat{T}(0, s)\xi = \hat{P}(s)\hat{T}(s, s)\xi = \hat{P}(s)\xi. \]
The desired identity follows now readily from Lemma 5. □
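Lemma 7. Given $s \in I$, if $x : [s,\infty) \to [0,+\infty)$ is a bounded continuous function satisfying
\[ x(t) \le De^{-\lambda(\rho(t)-\rho(s))+a\rho(s)}\gamma + \delta D\int_s^t e^{-\lambda(\rho(t)-\rho(\tau))}\rho'(\tau)x(\tau)\,d\tau + \delta D\int_t^\infty e^{-\lambda(\rho(\tau)-\rho(t))}\rho'(\tau)x(\tau)\,d\tau \tag{26} \]
for every $t \ge s$, then
\[ x(t) \le \tilde D\gamma e^{-\tilde a(\rho(t)-\rho(s))+a\rho(s)}, \quad t \ge s. \]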
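Proof. We show that $x(t) \le \Phi(t)$, where $\Phi(t)$ is any bounded continuous function satisfying the integral equation
\[ \Phi(t) = De^{-\lambda(\rho(t)-\rho(s))+a\rho(s)}\gamma + \delta D\int_s^t e^{-\lambda(\rho(t)-\rho(\tau))}\rho'(\tau)\Phi(\tau)\,d\tau + \delta D\int_t^\infty e^{-\lambda(\rho(\tau)-\rho(t))}\rho'(\tau)\Phi(\tau)\,d\tau \tag{27} \]
for $t \ge s$. We can easily verify that $\Phi(t)$ satisfies the differential equation
\[ z'' - \frac{\rho''(t)}{\rho'(t)}z' - \lambda^2(1-\theta)\rho'(t)^2 z = 0 \tag{28} \]
with $\theta = 2\delta D/\lambda$. Setting $\tilde a = \lambda\sqrt{1-\theta}$, the function $z(t) = e^{-\tilde a\rho(t)}$ is a bounded solution of Eq. (28). Therefore, $\Phi(t) = \Phi(s)e^{-\tilde a(\rho(t)-\rho(s))}$. Furthermore, by (27), substituting $\Phi(t)$ and setting $t = s$, we obtain
\[ \Phi(s) = De^{a\rho(s)}\gamma + \delta D\Phi(s)\int_s^\infty e^{-(\lambda+\tilde a)(\rho(\tau)-\rho(s))}\rho'(\tau)\,d\tau = De^{a\rho(s)}\gamma + \Phi(s)\frac{\delta D}{\lambda+\tilde a}. \]
Since $\lambda + \tilde a > 0$, this yields
\[ \Phi(s) = \frac{D}{1-\delta D/(\tilde a+\lambda)}e^{a\rho(s)}\gamma, \]
and $\Phi(t) = \tilde D\gamma e^{-\tilde a(\rho(t)-\rho(s))+a\rho(s)}$. Now we set $z(t) = x(t) - \Phi(t)$ for $t \ge s$. It follows from (26) and (27) that
\[ z(t) \le \delta D\int_s^t e^{-\lambda(\rho(t)-\rho(\tau))}\rho'(\tau)z(\tau)\,d\tau + \delta D\int_t^\infty e^{-\lambda(\rho(\tau)-\rho(t))}\rho'(\tau)z(\tau)\,d\tau. \]
Set also $\bar z = \sup_{t \ge s} z(t)$. Since the functions $x$ and $\Phi$ are bounded, $\bar z$ is finite, and taking the supremum in the above inequality we obtain
\[ \bar z \le \delta D\bar z\,\sup_{t \ge s}\int_s^t e^{-\lambda(\rho(t)-\rho(\tau))}\rho'(\tau)\,d\tau + \delta D\bar z\,\sup_{t \ge s}\int_t^\infty e^{-\lambda(\rho(\tau)-\rho(t))}\rho'(\tau)\,d\tau. \]
Hence, $\bar z \le \theta\bar z$. Since $\theta < 1$ for any sufficiently small $\delta$, we have $\bar z \le 0$, and thus $x(t) \le \Phi(t)$ for $t \ge s$. □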
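The claim in the proof of Lemma 7 that $z(t) = e^{-\tilde a\rho(t)}$ solves Eq. (28) can be verified directly. The following is a minimal sketch, assuming (beyond what is stated here) that ρ is twice differentiable with $\rho'(t) > 0$.

```latex
% Put z(t) = e^{-\tilde a \rho(t)} with \tilde a = \lambda\sqrt{1-\theta}.
\[
z' = -\tilde a\,\rho' z, \qquad
z'' = -\tilde a\,\rho'' z + \tilde a^{2} (\rho')^{2} z,
\]
\[
z'' - \frac{\rho''}{\rho'}\,z'
  = -\tilde a\,\rho'' z + \tilde a^{2}(\rho')^{2} z + \tilde a\,\rho'' z
  = \tilde a^{2}(\rho')^{2} z
  = \lambda^{2}(1-\theta)(\rho')^{2} z .
\]
```

The same computation with $\tilde a$ replaced by $-\tilde a$ shows that $e^{\tilde a\rho(t)}$ is a second solution; it is unbounded on $[s,+\infty)$ whenever ρ is unbounded, which is presumably why boundedness singles out the decaying one.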
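Lemma 8. Given $s \in I$, if $y : (0,s] \to [0,+\infty)$ is a continuous function satisfying
\[ y(t) \le De^{-\lambda(\rho(s)-\rho(t))+a\rho(s)}\gamma + \delta D\int_0^t e^{-\lambda(\rho(t)-\rho(\tau))}\rho'(\tau)y(\tau)\,d\tau + \delta D\int_t^s e^{-\lambda(\rho(\tau)-\rho(t))}\rho'(\tau)y(\tau)\,d\tau \tag{29} \]
for every $t \in (0,s]$, then
\[ y(t) \le \tilde D\gamma e^{-\tilde a(\rho(s)-\rho(t))+a\rho(s)}, \quad t \in (0,s]. \]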
Proof. Proceeding in a similar manner to that in the proof of Lemma 7 we show that $y(t) \le \Psi(t)$, where $\Psi(t)$ is any bounded continuous function satisfying
\[ \Psi(t) = De^{-\lambda(\rho(s)-\rho(t))+a\rho(s)}\gamma + \delta D\int_0^t e^{-\lambda(\rho(t)-\rho(\tau))}\rho'(\tau)\Psi(\tau)\,d\tau + \delta D\int_t^s e^{-\lambda(\rho(\tau)-\rho(t))}\rho'(\tau)\Psi(\tau)\,d\tau \]
for $t \le s$. Note first that $\Psi(t)$ also satisfies the differential equation (28). Substituting $\Psi(t) = \Psi(s)e^{-\tilde a(\rho(s)-\rho(t))}$ in the above identity and setting $t = s$ we obtain
\[ \Psi(s) = De^{a\rho(s)}\gamma + \delta D\Psi(s)\int_0^s e^{-(\lambda+\tilde a)(\rho(s)-\rho(\tau))}\rho'(\tau)\,d\tau \le De^{a\rho(s)}\gamma + \Psi(s)\frac{\delta D}{\lambda+\tilde a}. \]
Hence,
\[ \Psi(s) \le \frac{D}{1-\delta D/(\lambda+\tilde a)}e^{a\rho(s)}\gamma, \]
and $\Psi(t) \le \Psi(s)e^{-\tilde a(\rho(s)-\rho(t))}$. Proceeding in a similar manner to that in Lemma 7 we find that
\[ y(t) \le \Psi(t) \le \tilde D\gamma e^{-\tilde a(\rho(s)-\rho(t))+a\rho(s)}. \]
This completes the proof of the lemma. □
Now we establish norm bounds for $\hat T(t,s)$.
Lemma 9. For every $t \ge s$ we have
\[ \|\hat T(t,s)|\operatorname{Im}\hat P(s)\| \le \tilde De^{-\tilde a(\rho(t)-\rho(s))+a\rho(s)}. \]

Proof. Let $\xi \in X$. Setting $x(t) = \|\hat P(t)\hat T(t,s)\xi\|$ for $t \ge s$, and $\gamma = \|\hat P(s)\xi\|$, it follows from Lemma 6 and (12)–(13) that the function $x$ is bounded and satisfies inequality (26). Therefore, by Lemma 7,
\[ \|\hat P(t)\hat T(t,s)\xi\| \le \tilde De^{-\tilde a(\rho(t)-\rho(s))+a\rho(s)}\|\hat P(s)\xi\|, \quad t \ge s. \]
By Lemma 4 we have $\hat P(t)\hat T(t,s) = \hat T(t,s)\hat P(s) = \hat T(t,s)\hat P(s)\hat P(s)$, and hence, setting $\eta = \hat P(s)\xi$,
\[ \|\hat T(t,s)\hat P(s)\eta\| \le \tilde De^{-\tilde a(\rho(t)-\rho(s))+a\rho(s)}\|\eta\|, \quad t \ge s. \]
This establishes the desired inequality. □
Lemma 10. For every $t \in [0,s]$ we have
\[ \|\hat T(t,s)|\operatorname{Im}\hat Q(s)\| \le \tilde De^{-\tilde a(\rho(s)-\rho(t))+a\rho(s)}. \]

Proof. Proceeding in a similar manner to that for (24), we can show that for $t \le s$,
\[ \hat Q(t)\hat T(t,s) = T(t,s)Q(s)\hat Q(s) + \int_0^t T(t,\tau)P(\tau)B(\tau)\hat Q(\tau)\hat T(\tau,s)\,d\tau - \int_t^s T(t,\tau)Q(\tau)B(\tau)\hat Q(\tau)\hat T(\tau,s)\,d\tau. \tag{30} \]
Now let $\xi \in X$, and set $y(t) = \|\hat T(t,s)\hat Q(s)\xi\|$ for $t \le s$, and $\gamma = \|\hat Q(s)\xi\|$. The function $y$ satisfies inequality (29). Using Lemma 8 and proceeding in a similar manner to that in the proof of Lemma 9 we obtain the desired inequality. □

We showed that there exist projections $\hat P(t)$ (see (20)) leaving invariant the evolution operator $\hat T(t,s)$ (see Lemma 4), and satisfying the norm bounds in Lemmas 9 and 10. It remains to obtain exponential bounds for the norms of the projections $\hat P(t)$ and $\hat Q(t)$.

Lemma 11. Provided that $\delta$ is sufficiently small, for any $t \in I$ we have
\[ \|\hat P(t)\| \le 4De^{a\rho(t)} \quad\text{and}\quad \|\hat Q(t)\| \le 4De^{a\rho(t)}. \tag{31} \]
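Proof. By Lemma 6 with $t = s$, since $P(t)$ and $Q(t)$ are complementary projections,
\[ Q(t)\hat P(t) = -\int_t^\infty T(t,\tau)Q(\tau)B(\tau)\hat P(\tau)\hat T(\tau,t)\,d\tau. \tag{32} \]
By Lemma 9 and Lemma 4 (see (21)), for $\tau \ge t$ we have
\[ \|\hat P(\tau)\hat T(\tau,t)\| \le \tilde De^{-\tilde a(\rho(\tau)-\rho(t))+a\rho(t)}\|\hat P(t)\|. \tag{33} \]
By (32), using (13) we obtain
\[
\begin{aligned}
\|Q(t)\hat P(t)\| &\le \delta D\tilde D\|\hat P(t)\|\int_t^\infty e^{-\lambda(\rho(\tau)-\rho(t))+a\rho(\tau)}e^{-2a\rho(\tau)}e^{-\tilde a(\rho(\tau)-\rho(t))+a\rho(t)}\rho'(\tau)\,d\tau \\
&\le \delta D\tilde D\|\hat P(t)\|\int_t^\infty e^{-(\lambda+\tilde a-a)(\rho(\tau)-\rho(t))}\rho'(\tau)\,d\tau = \frac{D\tilde D\delta}{\lambda+\tilde a-a}\|\hat P(t)\|,
\end{aligned} \tag{34}
\]
since $\lambda > a$. Similarly, it follows from (30) with $t = s$ that
\[ P(t)\hat Q(t) = \int_0^t T(t,\tau)P(\tau)B(\tau)\hat Q(\tau)\hat T(\tau,t)\,d\tau. \tag{35} \]
By Lemma 10, for $\tau \le t$ we have
\[ \|\hat Q(\tau)\hat T(\tau,t)\| \le \tilde De^{-\tilde a(\rho(t)-\rho(\tau))+a\rho(t)}\|\hat Q(t)\|. \tag{36} \]
By (35), using (12) we obtain
\[
\begin{aligned}
\|P(t)\hat Q(t)\| &\le \delta D\tilde D\|\hat Q(t)\|\int_0^t e^{-\lambda(\rho(t)-\rho(\tau))+a\rho(\tau)}e^{-2a\rho(\tau)}e^{-\tilde a(\rho(t)-\rho(\tau))+a\rho(t)}\rho'(\tau)\,d\tau \\
&\le \delta D\tilde D\|\hat Q(t)\|\int_0^t e^{-(\lambda+\tilde a-a)(\rho(t)-\rho(\tau))}\rho'(\tau)\,d\tau \le \frac{D\tilde D\delta}{\lambda+\tilde a-a}\|\hat Q(t)\|.
\end{aligned} \tag{37}
\]
Now observe that
\[ \hat P(t) - P(t) = Q(t)\hat P(t) - P(t)\hat Q(t). \]
It follows from (34) and (37) that
\[ \|\hat P(t) - P(t)\| \le \frac{\delta D\tilde D}{\lambda+\tilde a-a}\big(\|\hat P(t)\| + \|\hat Q(t)\|\big). \tag{38} \]
On the other hand, by (12)–(13) with $t = s$, we have
\[ \|P(t)\| \le De^{a\rho(t)} \quad\text{and}\quad \|Q(t)\| \le De^{a\rho(t)}. \]
It follows from (38) that
\[ \|\hat P(t)\| \le \|\hat P(t) - P(t)\| + \|P(t)\| \le \frac{\delta D\tilde D}{\lambda+\tilde a-a}\big(\|\hat P(t)\| + \|\hat Q(t)\|\big) + De^{a\rho(t)}, \]
and since $\|\hat Q(t) - Q(t)\| = \|\hat P(t) - P(t)\|$ we also have
\[ \|\hat Q(t)\| \le \|\hat P(t) - P(t)\| + \|Q(t)\| \le \frac{\delta D\tilde D}{\lambda+\tilde a-a}\big(\|\hat P(t)\| + \|\hat Q(t)\|\big) + De^{a\rho(t)}. \]
Therefore,
\[ \|\hat P(t)\| + \|\hat Q(t)\| \le \frac{2\delta D\tilde D}{\lambda+\tilde a-a}\big(\|\hat P(t)\| + \|\hat Q(t)\|\big) + 2De^{a\rho(t)}, \]
and
\[ \Big(1 - \frac{2\delta D\tilde D}{\lambda+\tilde a-a}\Big)\big(\|\hat P(t)\| + \|\hat Q(t)\|\big) \le 2De^{a\rho(t)}. \]
Taking $\delta$ sufficiently small so that $2\delta D\tilde D/(\lambda+\tilde a-a) \le 1/2$ we obtain
\[ \|\hat P(t)\| + \|\hat Q(t)\| \le 4De^{a\rho(t)}. \]
This yields the desired inequalities. □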
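Two small steps in the proof of Lemma 11 are worth writing out. The sketch below uses only $Q = \mathrm{Id} - P$ and $\hat Q = \mathrm{Id} - \hat P$; the abbreviations $X$ and $\kappa$ are introduced here for illustration and do not appear in the paper.

```latex
% (i) The operator identity behind (38), with the argument t suppressed:
\[
Q\hat P - P\hat Q
  = (\mathrm{Id}-P)\hat P - P(\mathrm{Id}-\hat P)
  = \hat P - P\hat P - P + P\hat P
  = \hat P - P .
\]
% (ii) The absorption step, with X = \|\hat P(t)\| + \|\hat Q(t)\| (finite, since the
%      projections are bounded) and \kappa = 2\delta D\tilde D/(\lambda+\tilde a-a):
\[
X \le \kappa X + 2De^{a\rho(t)}, \quad \kappa \le \tfrac12
\;\Longrightarrow\;
X \le \frac{2De^{a\rho(t)}}{1-\kappa} \le 4De^{a\rho(t)} .
\]
```

Since each of $\|\hat P(t)\|$ and $\|\hat Q(t)\|$ is at most $X$, both bounds in (31) follow at once.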
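Combining (33) with (31) we find that for $\tau \ge t$,
\[ \|\hat P(\tau)\hat T(\tau,t)\| \le \tilde De^{-\tilde a(\rho(\tau)-\rho(t))+a\rho(t)}\|\hat P(t)\| \le 4D\tilde De^{-\tilde a(\rho(\tau)-\rho(t))+2a\rho(t)}. \]
Similarly, combining (36) with (31) we find that for $\tau \le t$,
\[ \|\hat Q(\tau)\hat T(\tau,t)\| \le \tilde De^{-\tilde a(\rho(t)-\rho(\tau))+a\rho(t)}\|\hat Q(t)\| \le 4D\tilde De^{-\tilde a(\rho(t)-\rho(\tau))+2a\rho(t)}. \]
This completes the proof of the theorem. □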
4. Robustness of dichotomies in $\mathbb{R}$

We consider in this section the case of nonuniform exponential dichotomies in $\mathbb{R}$ and we establish a corresponding robustness result. Consider an increasing differentiable function $\rho : \mathbb{R} \to \mathbb{R}$ with $\rho(0) = 0$ such that $\rho(-t) = -\rho(t)$ for every $t \in \mathbb{R}$. For a continuous function $A : \mathbb{R} \to B(X)$, we say that Eq. (1) admits a ρ-nonuniform exponential dichotomy in $\mathbb{R}$ if there exist projections $P(t) : X \to X$ for each $t \in \mathbb{R}$ satisfying (11), and there exist constants $\lambda, D > 0$ and $a \ge 0$ such that
\[ \|T(t,s)P(s)\| \le De^{-\lambda(\rho(t)-\rho(s))+a|\rho(s)|}, \quad t \ge s, \]
and
\[ \|T(t,s)Q(s)\| \le De^{-\lambda(\rho(s)-\rho(t))+a|\rho(s)|}, \quad s \ge t, \]
where $Q(t) = \mathrm{Id} - P(t)$ for each $t \in \mathbb{R}$.

Theorem 3. Let $A, B : \mathbb{R} \to B(X)$ be continuous functions such that Eq. (1) admits a ρ-nonuniform exponential dichotomy in $\mathbb{R}$ with $a < \lambda$, and
\[ \|B(t)\| \le \delta e^{-2a|\rho(t)|}\rho'(t), \quad t \in \mathbb{R}. \]
If $\delta$ is sufficiently small, then Eq. (2) admits a ρ-nonuniform exponential dichotomy in $\mathbb{R}$, with the constants $\lambda$, $D$, and $a$ replaced respectively by $\tilde a$, $4D\tilde D$, and $2a$.

Proof. Repeating the proof of Theorem 2 with $I = \mathbb{R}$ yields the following statement.

Lemma 12. There exist projections $P_+(t)$ for $t \in \mathbb{R}$ such that $P_+(t)\hat T(t,s) = \hat T(t,s)P_+(s)$ for every $t, s \in \mathbb{R}$,
\[ \|\hat T(t,s)|\operatorname{Im}P_+(s)\| \le \tilde De^{-\tilde a(\rho(t)-\rho(s))+a|\rho(s)|}, \quad t \ge s, \tag{39} \]
and
\[ \|\hat T(t,s)|\operatorname{Im}Q_+(s)\| \le \tilde De^{-\tilde a(\rho(s)-\rho(t))+a|\rho(s)|}, \quad 0 \le t \le s, \]
where $Q_+(s) = \mathrm{Id} - P_+(s)$ for each $s \in \mathbb{R}$.

A simple modification of the proof of Theorem 2, corresponding to reversing the time in the notion of dichotomy, yields the following statement.

Lemma 13. There exist projections $P_-(t)$ for $t \in \mathbb{R}$ such that $P_-(t)\hat T(t,s) = \hat T(t,s)P_-(s)$ for every $t, s \in \mathbb{R}$,
\[ \|\hat T(t,s)|\operatorname{Im}Q_-(s)\| \le \tilde De^{-\tilde a(\rho(s)-\rho(t))+a|\rho(s)|}, \quad t \le s, \tag{40} \]
and
\[ \|\hat T(t,s)|\operatorname{Im}P_-(s)\| \le \tilde De^{-\tilde a(\rho(t)-\rho(s))+a|\rho(s)|}, \quad 0 \ge t \ge s, \]
where $Q_-(s) = \mathrm{Id} - P_-(s)$ for each $s \in \mathbb{R}$.

The projections in Lemma 13 are constructed by setting
\[ Q_-(t) = \hat T(t,0)V(0,0)\hat T(0,t), \]
where $V(t,s)$ is the unique solution of the equation
\[ V(t,s) = T(t,s)Q(s) - \int_t^s T(t,\tau)Q(\tau)B(\tau)V(\tau,s)\,d\tau + \int_{-\infty}^t T(t,\tau)P(\tau)B(\tau)V(\tau,s)\,d\tau \tag{41} \]
in the space of continuous functions $V : \{(t,s) \in \mathbb{R}\times\mathbb{R} : t \le s\} \to B(X)$ with the norm
\[ \|V\| = \sup\{\|V(t,s)\|e^{-a|\rho(s)|} : t \le s\} < \infty. \]
We also have the identities
\[ P(0)P_+(0) = P(0), \qquad P_+(0)P(0) = P_+(0), \tag{42} \]
and
\[ Q(0)Q_-(0) = Q(0), \qquad Q_-(0)Q(0) = Q_-(0). \tag{43} \]
Indeed, setting $t = s = 0$ in (14) we obtain
\[ P_+(0) = P(0) - \int_0^\infty T(0,\tau)Q(\tau)B(\tau)U(\tau,0)\,d\tau, \]
which yields the first identity in (42). The second identity in (42) follows from reversing the roles of $T$ and $\hat T$. The identities in (43) can be obtained in a similar manner.

Lemma 14. If $\delta$ is sufficiently small, then the operator $S = P_+(0) + Q_-(0)$ is invertible.
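Proof. By (43) we also have $P(0)P_-(0) = P_-(0)$. Together with (42) this implies that
\[
\begin{aligned}
P_+(0) + Q_-(0) - \mathrm{Id} &= P_+(0) - P(0) + P(0) - P_-(0) \\
&= P_+(0) - P(0)P_+(0) + P(0) - P(0)P_-(0) \\
&= Q(0)P_+(0) + P(0)Q_-(0).
\end{aligned}
\]
By (41) we have
\[ P(0)Q_-(0) = P(0)V(0,0) = \int_{-\infty}^0 T(0,\tau)P(\tau)B(\tau)V(\tau,0)\,d\tau, \]
and by Lemma 2,
\[ Q(0)P_+(0) = Q(0)U(0,0) = -\int_0^\infty T(0,\tau)Q(\tau)B(\tau)U(\tau,0)\,d\tau. \]
On the other hand,
\[
\begin{aligned}
\|U(t,0)\| &\le \|T(t,0)P(0)\| + \int_0^t \|T(t,\tau)P(\tau)\| \cdot \|B(\tau)\| \cdot \|U(\tau,0)\|\,d\tau + \int_t^\infty \|T(t,\tau)Q(\tau)\| \cdot \|B(\tau)\| \cdot \|U(\tau,0)\|\,d\tau \\
&\le De^{-\lambda\rho(t)} + D\delta\int_0^t e^{-\lambda(\rho(t)-\rho(\tau))}\rho'(\tau)\|U(\tau,0)\|\,d\tau + D\delta\int_t^\infty e^{-\lambda(\rho(\tau)-\rho(t))}\rho'(\tau)\|U(\tau,0)\|\,d\tau.
\end{aligned}
\]
Setting $x(t) = \|U(t,0)\|$ and $\gamma = 1$, it follows from Lemma 7 that
\[ \|U(t,0)\| \le \tilde De^{-\tilde a\rho(t)}, \quad t \ge 0. \tag{44} \]
Analogously,
\[
\begin{aligned}
\|V(t,0)\| &\le \|T(t,0)Q(0)\| + \int_t^0 \|T(t,\tau)Q(\tau)\| \cdot \|B(\tau)\| \cdot \|V(\tau,0)\|\,d\tau + \int_{-\infty}^t \|T(t,\tau)P(\tau)\| \cdot \|B(\tau)\| \cdot \|V(\tau,0)\|\,d\tau \\
&\le De^{\lambda\rho(t)} + D\delta\int_t^0 e^{-\lambda(\rho(\tau)-\rho(t))}\rho'(\tau)\|V(\tau,0)\|\,d\tau + D\delta\int_{-\infty}^t e^{-\lambda(\rho(t)-\rho(\tau))}\rho'(\tau)\|V(\tau,0)\|\,d\tau.
\end{aligned}
\]
It follows from a slight modification of the proof of Lemma 8 (for functions in the interval $(-\infty,s]$) that
\[ \|V(t,0)\| \le \tilde De^{\tilde a\rho(t)}, \quad t \le 0. \tag{45} \]
By (44) and (45) we obtain
\[
\begin{aligned}
\|P_+(0) + Q_-(0) - \mathrm{Id}\| &\le \int_0^\infty \|T(0,\tau)Q(\tau)\| \cdot \|B(\tau)\| \cdot \|U(\tau,0)\|\,d\tau + \int_{-\infty}^0 \|T(0,\tau)P(\tau)\| \cdot \|B(\tau)\| \cdot \|V(\tau,0)\|\,d\tau \\
&\le \delta D\tilde D\int_0^\infty e^{-(\lambda+\tilde a)\rho(\tau)}\rho'(\tau)\,d\tau + \delta D\tilde D\int_{-\infty}^0 e^{(\lambda+\tilde a)\rho(\tau)}\rho'(\tau)\,d\tau \\
&\le \frac{2\delta D\tilde D}{|\lambda+\tilde a|}.
\end{aligned}
\]
Hence, for any sufficiently small $\delta$, the operator $S$ is invertible. □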
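The last step of the proof of Lemma 14 rests on the standard Neumann series argument. As a hedged sketch (the symbol $r$ is introduced here and is not part of the paper's notation), once $\delta$ is small enough that the bound above is below 1:

```latex
\[
\|S - \mathrm{Id}\| \le \frac{2\delta D\tilde D}{\lambda+\tilde a} =: r < 1
\;\Longrightarrow\;
S^{-1} = \sum_{k=0}^{\infty} (\mathrm{Id}-S)^{k},
\qquad
\|S^{-1}\| \le \sum_{k=0}^{\infty} r^{k} = \frac{1}{1-r}.
\]
```

Note that $\tilde a$ and $\tilde D$ themselves depend on $\delta$, so the condition $r < 1$ should be read as a requirement on $\delta$ after those dependencies are taken into account.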
For each $t \in \mathbb{R}$ we set
\[ \tilde P(t) = \hat T(t,0)SP(0)S^{-1}\hat T(0,t). \]
We have
\[ \tilde P(t)^2 = \hat T(t,0)\tilde P(0)^2\hat T(0,t) = \hat T(t,0)SP(0)^2S^{-1}\hat T(0,t) = \tilde P(t), \]
and $\tilde P(t)$ is a projection for each $t \in \mathbb{R}$. Furthermore,
\[ \hat T(t,s)\tilde P(s) = \hat T(t,0)SP(0)S^{-1}\hat T(0,s) = \tilde P(t)\hat T(t,s). \tag{46} \]
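It remains to obtain norm bounds for $\hat T(t,s)$. By (42) and (43) we have
\[ SP(0) = P_+(0)P(0) + Q_-(0)P(0) = P_+(0), \]
and
\[ SQ(0) = P_+(0)Q(0) + Q_-(0)Q(0) = Q_-(0). \]
Therefore, setting
\[ S(t) = \hat T(t,0)S\hat T(0,t) = P_+(t) + Q_-(t), \]
we obtain
\[ \tilde P(t)S(t) = \hat T(t,0)SP(0)S^{-1}S\hat T(0,t) = \hat T(t,0)SP(0)\hat T(0,t) = P_+(t), \]
and thus also $\tilde Q(t)S(t) = Q_-(t)$, where $\tilde Q(t) = \mathrm{Id} - \tilde P(t)$. Therefore,
\[ \operatorname{Im}\tilde P(t) \supset \operatorname{Im}P_+(t) \quad\text{and}\quad \operatorname{Im}\tilde Q(t) \supset \operatorname{Im}Q_-(t). \]
Since $S(t)$ is invertible, we have in fact
\[ \operatorname{Im}\tilde P(t) = \operatorname{Im}P_+(t) \quad\text{and}\quad \operatorname{Im}\tilde Q(t) = \operatorname{Im}Q_-(t). \]
By (39), for $t \ge s$ we obtain
\[
\begin{aligned}
\|\hat T(t,s)\tilde P(s)\| &\le \|\hat T(t,s)|\operatorname{Im}\tilde P(s)\| \cdot \|\tilde P(s)\| = \|\hat T(t,s)|\operatorname{Im}P_+(s)\| \cdot \|\tilde P(s)\| \\
&\le \tilde De^{-\tilde a(\rho(t)-\rho(s))+a|\rho(s)|}\|\tilde P(s)\|.
\end{aligned} \tag{47}
\]
Similarly, it follows from (40) that for $t \le s$,
\[ \|\hat T(t,s)\tilde Q(s)\| \le \|\hat T(t,s)|\operatorname{Im}Q_-(s)\| \cdot \|\tilde Q(s)\| \le \tilde De^{-\tilde a(\rho(s)-\rho(t))+a|\rho(s)|}\|\tilde Q(s)\|. \tag{48} \]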
Lemma 15. Provided that $\delta$ is sufficiently small, for each $t \in \mathbb{R}$ we have
\[ \|\tilde P(t)\| \le 4De^{a|\rho(t)|} \quad\text{and}\quad \|\tilde Q(t)\| \le 4De^{a|\rho(t)|}. \]

Proof. By (47), given $\xi \in X$ the function $y(t) = \hat T(t,s)\tilde P(s)\xi$, $t \ge s$, is bounded. Since $y(s) = \tilde P(s)\xi$, it follows from Lemma 5 and (46) that for $t \ge s$,
\[ \tilde P(t)\hat T(t,s) = T(t,s)P(s)\tilde P(s) + \int_s^t T(t,\tau)P(\tau)B(\tau)\tilde P(\tau)\hat T(\tau,s)\,d\tau - \int_t^\infty T(t,\tau)Q(\tau)B(\tau)\tilde P(\tau)\hat T(\tau,s)\,d\tau. \]
Setting $t = s$, since $Q(t) = \mathrm{Id} - P(t)$ we obtain
\[ Q(t)\tilde P(t) = -\int_t^\infty T(t,\tau)Q(\tau)B(\tau)\tilde P(\tau)\hat T(\tau,t)\,d\tau. \]
Proceeding as in (34) and using (46) and (47), we obtain
\[ \|Q(t)\tilde P(t)\| \le \frac{D\tilde D\delta}{\lambda+\tilde a-a}\|\tilde P(t)\|. \]
Similarly, by (48), given $\xi \in X$ the function $y(t) = \hat T(t,s)\tilde Q(s)\xi$, $t \le s$, is bounded. Proceeding in a similar manner to that in the proof of Lemma 5 we find that for $t \le s$,
\[ \tilde Q(t)\hat T(t,s) = T(t,s)Q(s)\tilde Q(s) - \int_t^s T(t,\tau)Q(\tau)B(\tau)\tilde Q(\tau)\hat T(\tau,s)\,d\tau + \int_{-\infty}^t T(t,\tau)P(\tau)B(\tau)\tilde Q(\tau)\hat T(\tau,s)\,d\tau. \]
Setting $t = s$ we obtain
\[ P(t)\tilde Q(t) = \int_{-\infty}^t T(t,\tau)P(\tau)B(\tau)\tilde Q(\tau)\hat T(\tau,t)\,d\tau, \]
and it follows from (48) and (46), applied for $\tau \le t$, that
\[ \|P(t)\tilde Q(t)\| \le \frac{D\tilde D\delta}{\lambda+\tilde a-a}\|\tilde Q(t)\|. \]
Moreover,
\[ \tilde P(t) - P(t) = Q(t)\tilde P(t) - P(t)\tilde Q(t). \]
The desired statement can now be obtained by repeating arguments in the proof of Lemma 11, replacing $\hat P(t)$ by $\tilde P(t)$ and $\hat Q(t)$ by $\tilde Q(t)$. □

The theorem now follows readily from (47), (48), and Lemma 15. □
References

[1] L. Barreira, Ya. Pesin, Nonuniform Hyperbolicity, Encyclopedia Math. Appl., vol. 115, Cambridge Univ. Press, 2007.
[2] L. Barreira, C. Valls, Growth rates and nonuniform hyperbolicity, Discrete Contin. Dyn. Syst. 22 (2008) 509–528.
[3] L. Barreira, C. Valls, Robustness of nonuniform exponential dichotomies in Banach spaces, J. Differential Equations 244 (2008) 2407–2447.
[4] S.-N. Chow, H. Leiva, Existence and roughness of the exponential dichotomy for skew-product semiflow in Banach spaces, J. Differential Equations 120 (1995) 429–477.
[5] W. Coppel, Dichotomies and reducibility, J. Differential Equations 3 (1967) 500–521.
[6] Ju. Dalec'kiĭ, M. Kreĭn, Stability of Solutions of Differential Equations in Banach Space, Transl. Math. Monogr., vol. 43, Amer. Math. Soc., 1974.
[7] J. Massera, J. Schäffer, Linear differential equations and functional analysis. I, Ann. of Math. (2) 67 (1958) 517–573.
[8] J. Massera, J. Schäffer, Linear Differential Equations and Function Spaces, Pure Appl. Math., vol. 21, Academic Press, 1966.
[9] R. Naulin, M. Pinto, Admissible perturbations of exponential dichotomy roughness, Nonlinear Anal. 31 (1998) 559–571.
[10] V. Oseledets, A multiplicative ergodic theorem. Lyapunov characteristic numbers for dynamical systems, Trans. Moscow Math. Soc. 19 (1968) 197–221.
[11] O. Perron, Die Stabilitätsfrage bei Differentialgleichungen, Math. Z. 32 (1930) 703–728.
[12] Ya. Pesin, Families of invariant manifolds corresponding to nonzero characteristic exponents, Math. USSR Izv. 10 (1976) 1261–1305.
[13] Ya. Pesin, Characteristic Lyapunov exponents, and smooth ergodic theory, Russian Math. Surveys 32 (1977) 55–114.
[14] V. Pliss, G. Sell, Robustness of exponential dichotomies in infinite-dimensional dynamical systems, J. Dynam. Differential Equations 11 (1999) 471–513.
Journal of Functional Analysis 257 (2009) 485–505 www.elsevier.com/locate/jfa
Multi-bump solutions and multi-tower solutions for equations on $\mathbb{R}^N$
Lishan Lin, Zhaoli Liu ∗,1
School of Mathematical Sciences, Capital Normal University, Beijing 100037, People's Republic of China
Received 7 November 2008; accepted 1 February 2009
Available online 24 February 2009
Communicated by H. Brezis

Abstract
Let $\varepsilon > 0$ be a small parameter. In this paper, we study existence of multiple multi-bump positive solutions for the semilinear Schrödinger equation
\[ -\Delta u + u = (1 - \varepsilon a(x))|u|^{p-2}u, \quad u \in H^1(\mathbb{R}^N), \]
where $N \ge 1$, $2 < p < 2N/(N-2)$ if $N \ge 3$, $p > 2$ if $N = 1$ or $N = 2$, $a \in C(\mathbb{R}^N)$, $a(x) > 0$ for $x \in \mathbb{R}^N$, and $\lim_{|x|\to\infty}a(x) = 0$. We also study existence of multiple multi-tower positive solutions for the prescribed scalar curvature equation
\[ -\Delta u = (1 - \varepsilon K(|x|))u^{\frac{N+2}{N-2}}, \quad u \in D^{1,2}(\mathbb{R}^N), \]
where $N \ge 3$, $K \in C([0,\infty))$, $K(r) > 0$ for $r > 0$, $\lim_{r\to 0}K(r) = 0$, and $\lim_{r\to\infty}K(r) = 0$.
© 2009 Elsevier Inc. All rights reserved.
Keywords: Semilinear Schrödinger equation; Multi-bump solution; Prescribed scalar curvature equation; Multi-tower solution
* Corresponding author.
E-mail address: [email protected] (Z.L. Liu). 1 Supported by NSFC (10825106).
0022-1236/$ – see front matter © 2009 Elsevier Inc. All rights reserved. doi:10.1016/j.jfa.2009.02.001
L.S. Lin, Z.L. Liu / Journal of Functional Analysis 257 (2009) 485–505
1. Introduction and the main results

This paper is concerned with existence of multi-bump positive solutions of the equation
\[ -\Delta u + u = (1 - \varepsilon a(x))|u|^{p-2}u, \quad u \in H^1(\mathbb{R}^N), \tag{1.1} \]
where $N \ge 1$, $2 < p < 2^*$, $2^* = \frac{2N}{N-2}$ is the critical Sobolev exponent if $N \ge 3$ and $2^* = \infty$ if $N = 1$ or $N = 2$, and $\varepsilon > 0$ is a parameter. Under suitable assumptions on $a : \mathbb{R}^N \to \mathbb{R}$, we shall prove existence of multi-bump positive solutions for $\varepsilon$ small. Similar results have been obtained in [31] for the equation
\[ -\Delta u + (1 + \varepsilon a(x))u = |u|^{p-2}u, \quad u \in H^1(\mathbb{R}^N). \tag{1.2} \]
To state our main result for (1.1), recall that the limiting equation
\[ -\Delta u + u = |u|^{p-2}u, \quad u \in H^1(\mathbb{R}^N) \tag{1.3} \]
has a unique positive radial solution $w$, which decays exponentially at $\infty$. For $n \in \mathbb{N}$ and for sufficiently separated $y_1, y_2, \dots, y_n$ in $\mathbb{R}^N$, the profile of the function $\sum_{i=1}^n w(\cdot - y_i)$ resembles $n$ bumps and accordingly a solution of (1.1) which is close to $\sum_{i=1}^n w(\cdot - y_i)$ in $H^1(\mathbb{R}^N)$ is called an $n$-bump solution. We now formulate the assumptions on $a$.

(A) $a \in C(\mathbb{R}^N)$, $a(x) > 0$ for $x \in \mathbb{R}^N$, $\lim_{|x|\to\infty}a(x) = 0$, and there exist $c > 0$ and $\sigma > 0$ such that $a(x) \ge ce^{-\sigma|x|}$.
(B) $a \in C(\mathbb{R}^N)$, $a(x) > 0$ for $x \in \mathbb{R}^N$, $\lim_{|x|\to\infty}a(x) = 0$, and for any $\sigma > 0$ there exists $c > 0$ such that $a(x) \ge ce^{-\sigma|x|}$.

Theorem 1.1. Let $a$ satisfy (A). If $n \in \mathbb{N}$ satisfies
\[ n < 1 + \frac{p-2}{2\sigma(p-1)}, \]
then there exists $\varepsilon(n) > 0$ such that for $0 < \varepsilon < \varepsilon(n)$, (1.1) has an $n$-bump positive solution.

We remark that for existence of one-bump solutions for small $\varepsilon$, it is sufficient to assume $a \in C(\mathbb{R}^N)$ and $\lim_{|x|\to\infty}a(x) = 0$; this will be clear by checking the argument below, and is consistent with the previous results in [4] (see also [6, Theorem 4.6]). As a consequence of Theorem 1.1, we have the following result.

Corollary 1.2. Let $a$ satisfy (B). Then for any $n \in \mathbb{N}$, there exists $\varepsilon(n) > 0$ such that for $0 < \varepsilon < \varepsilon(n)$, (1.1) has an $n$-bump positive solution. Therefore, as $\varepsilon \to 0$, (1.1) has more and more multi-bump positive solutions.

As a problem closely related to (1.1), we also consider the prescribed scalar curvature equation
\[ -\Delta u = (1 - \varepsilon K(|x|))u^{\frac{N+2}{N-2}}, \quad u \in D^{1,2}(\mathbb{R}^N), \tag{1.4} \]
where $N \ge 3$, $\varepsilon > 0$ is a parameter, and $K$ satisfies the following assumptions.

(C) $K \in C([0,\infty))$, $K(r) > 0$ for $r > 0$, $\lim_{r\to 0}K(r) = 0$, $\lim_{r\to\infty}K(r) = 0$, and there exist $c > 0$ and $\mu > 0$ such that $K(r) \ge cr^{\mu}$ for $r > 0$ small and $K(r) \ge cr^{-\mu}$ for $r$ large.
(D) $K \in C([0,\infty))$, $K(r) > 0$ for $r > 0$, $\lim_{r\to 0}K(r) = 0$, $\lim_{r\to\infty}K(r) = 0$, and for any $\mu > 0$ there exists $c > 0$ such that $K(r) \ge cr^{\mu}$ for $r > 0$ small and $K(r) \ge cr^{-\mu}$ for $r$ large.

Theorem 1.3. Let $K$ satisfy (C). If $n \in \mathbb{N}$ satisfies
\[ n < 1 + \frac{N-2}{\mu(N+2)}, \]
then there exists $\varepsilon(n) > 0$ such that for $0 < \varepsilon < \varepsilon(n)$, (1.4) has an $n$-tower positive solution.

Here, by an $n$-tower positive solution of (1.4) we mean a radial solution which is sufficiently close to $\sum_{i=1}^n U_{\lambda_i}$ in the $D^{1,2}(\mathbb{R}^N)$ norm
\[ \|u\| = \Big(\int_{\mathbb{R}^N}|\nabla u|^2\Big)^{\frac12}, \]
where $\lambda_i > 0$ ($i = 1, 2, \dots, n$) are such that $\frac{\lambda_i}{\lambda_j} + \frac{\lambda_j}{\lambda_i}$ are large enough for all $i \ne j$ and
\[ U_\lambda(x) = \frac{\lambda^{\frac{N-2}{2}}[N(N-2)]^{\frac{N-2}{4}}}{(1+\lambda^2|x|^2)^{\frac{N-2}{2}}}. \]
We also remark that for existence of one-tower solutions for small $\varepsilon$, it is sufficient to assume $K \in C([0,\infty))$, $\lim_{r\to 0}K(r) = 0$, and $\lim_{r\to\infty}K(r) = 0$. As a consequence of Theorem 1.3, we have the following result.

Corollary 1.4. Let $K$ satisfy (D). Then for any $n \in \mathbb{N}$, there exists $\varepsilon(n) > 0$ such that for $0 < \varepsilon < \varepsilon(n)$, (1.4) has an $n$-tower positive solution. Therefore, as $\varepsilon \to 0$, (1.4) has more and more multi-tower positive solutions.

Note that the assumptions (B) and (D) can be satisfied by many types of functions. For example, for any $\alpha > 0$, $c_1 > 0$, $c_2 > 0$, functions of the form
\[ a(x) = \frac{c_1}{c_2 + |x|^{\alpha}} \]
satisfy the assumption (B), and functions of the form
\[ K(r) = \begin{cases} \dfrac{c_1}{c_2 + |\ln r|^{\alpha}}, & r > 0, \\ 0, & r = 0 \end{cases} \]
satisfy the assumption (D).
There is a vast literature concerning time-independent semilinear Schrödinger equations. In particular, the singularly perturbed equation
\[ -\varepsilon^2\Delta u + V(x)u = |u|^{p-2}u, \quad u \in H^1(\mathbb{R}^N) \tag{1.5} \]
has been extensively studied. Solutions of (1.5) as $\varepsilon \to 0$ are called semi-classical states, which usually exhibit a concentration phenomenon. In the case where $\inf_{x\in\mathbb{R}^N}V(x) > 0$, see, for instance, [2,8,18–21,23,30,36–39] for results on existence of solutions concentrating near critical points of $V$ and [6,7,22] for results on existence of solutions concentrating on manifolds. In the case where $\inf_{x\in\mathbb{R}^N}V(x) = 0$, different concentration phenomena were observed in [11,12]. The solutions we obtain in Theorem 1.1 do not concentrate near any fixed point, and they have multiple bumps separated far apart with each bump resembling the shape of $w$. A similar phenomenon was observed for a Maxwell–Schrödinger system in [17]. See [31] for a similar result concerning (1.2). Multi-bump solutions were also constructed with a minimax argument in [16] for the Schrödinger equation
\[ -\Delta u + V(x)u = f(x,u), \quad x \in \mathbb{R}^N \]
having periodicity in $x$. The equation we consider here is of a different type from those in [16,17] and the approach we present is also different. For equations of the type of (1.1) and (1.2), existence of one-bump solutions has been studied in [4,6] and such solutions have been obtained under, for example, the assumptions that $a \in L^\infty(\mathbb{R}^N)$ and $\lim_{|x|\to\infty}a(x) = 0$. Here, to obtain multi-bump solutions, we need stronger assumptions on $a$. In particular, we have put a constraint on the decay rate of $a$, that is, $a(x) \ge ce^{-\sigma|x|}$.

Equations of the type of (1.4) arise in the scalar curvature problem in differential geometry. If $(M, g_0)$ is a Riemannian manifold of dimension $N \ge 3$, with scalar curvature $S_0$, then finding a conformal metric $g_1 = u^{\frac{4}{N-2}}g_0$ having scalar curvature $S_1$ is equivalent to finding a solution $u$ of the equation
\[ -4\frac{N-1}{N-2}\Delta_{g_0}u + S_0u = S_1u^{\frac{N+2}{N-2}}. \tag{1.6} \]
Up to a positive constant, if $(M, g_0)$ is the standard sphere then the stereographic projection $\pi : S^N \to \mathbb{R}^N$ converts (1.6) into
\[ -\Delta u = Su^{\frac{N+2}{N-2}}, \quad x \in \mathbb{R}^N, \tag{1.7} \]
where $S(x) = S_1(\pi^{-1}(x))$, and if $(M, g_0)$ is the standard $\mathbb{R}^N$ then (1.6) is just (1.7). If $S(x)$ is a perturbation of 1 and has the form $S(x) = 1 - \varepsilon K(|x|)$ and if we require $u$ to be in $D^{1,2}(\mathbb{R}^N)$ then (1.7) becomes (1.4). Equations of this kind have been studied by many authors; see, for example, [3,5,13,28,33] and the references therein. However, in all the above mentioned papers $S$ was assumed to be a $C^1$ function, and in most of the known results critical points of $S$ were assumed to satisfy additional conditions. Also, there have been only a few results on existence of multiple solutions for (1.7) in the literature. In Theorem 1.3, we do not need smoothness of $K$ and, in particular, we do not impose any assumption on critical points of $K$, and, furthermore, we obtain multiple solutions. More general results will be given in Section 4. It is known that $C^\infty$ scalar curvature functions are dense in $C^{1,\alpha}$ ($0 < \alpha < 1$) norms among positive functions, and, in some cases, there exist multiple conformal metrics with the same scalar curvature function; see, for example, [26,27,29,34]. Theorem 1.3 is a different kind of result from those in [26,27,29,34] and it reveals a phenomenon which has not been observed before.

We shall use a variational reduction method to prove the main results. Our argument is partially inspired by [1,17,31].

This paper is organized as follows. In Section 2, preliminary results are revisited. We prove Theorem 1.1 in Section 3. In Section 4, we prove Theorem 1.3 and obtain more general results.

2. Preliminaries

To begin with, we introduce some notation. In the Hilbert space $H^1(\mathbb{R}^N)$, we shall use the usual inner product
\[ \langle u, v\rangle = \int_{\mathbb{R}^N}(\nabla u\cdot\nabla v + uv) \]
and the induced norm $\|\cdot\|$. Let $|\cdot|_p$ be the usual norm of $L^p(\mathbb{R}^N)$. We shall use $C$ and $C_i$ to represent positive constants which may vary even within the same line. Let $n \in \mathbb{N}$. We shall use $\sum_{i<j}$ and $\sum_{i\ne j}$ to represent summation over all subscripts $i$ and $j$ satisfying $1 \le i < j \le n$ and $1 \le i \ne j \le n$, respectively.

Lemma 2.1. For $q \ge 2$, there exists $C > 0$ such that for any $a_i \ge 0$, $i = 1, \dots, n$,
\[ \Big(\sum_{i=1}^n a_i\Big)^q \le \sum_{i=1}^n a_i^q + C\sum_{i\ne j}a_i^{q-1}a_j. \]

Proof. A simple calculation gives the result. □
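The following two lemmas are taken from [31].

Lemma 2.2. For $q \ge 2$, there exists $C > 0$ such that for any $a_i \ge 0$, $i = 1, \dots, n$,
\[ \Big|\Big(\sum_{i=1}^n a_i\Big)^{q-1} - \sum_{i=1}^n a_i^{q-1}\Big|^{\frac{q}{q-1}} \le C\sum_{i\ne j}a_i^{q-1}a_j. \]

Lemma 2.3. For $q \ge 2$, $n \in \mathbb{N}$, and $a_i \ge 0$, $i = 1, \dots, n$,
\[ \Big(\sum_{i=1}^n a_i\Big)^q \ge \sum_{i=1}^n a_i^q + (q-1)\sum_{i\ne j}a_i^{q-1}a_j. \]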
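To make the role of the sum over $i \ne j$ concrete, here is a small worked instance of Lemmas 2.1 and 2.3 for $n = 2$. It is only an illustration under the assumption that the sum runs over ordered pairs (so each unordered pair is counted twice); the letters $a, b$ stand for $a_1, a_2 \ge 0$.

```latex
% q = 2: Lemma 2.3 holds with equality.
\[
(a+b)^2 = a^2 + b^2 + (2-1)\,(ab + ba).
\]
% q = 3: the two lemmas bracket the binomial expansion.
\[
a^3 + b^3 + 2\,(a^2 b + b^2 a)
  \;\le\; (a+b)^3 = a^3 + b^3 + 3\,(a^2 b + b^2 a)
  \;\le\; a^3 + b^3 + C\,(a^2 b + b^2 a)
  \quad\text{for any } C \ge 3 .
\]
```

The left inequality is Lemma 2.3 with $q - 1 = 2$, and the right one is Lemma 2.1; the gap between them is the nonnegative quantity $a^2b + b^2a$.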
Recall that (see, for instance, [10,25,35]), for $2 < p < 2^*$, the equation
\[ -\Delta u + u = |u|^{p-2}u, \quad u \in H^1(\mathbb{R}^N) \tag{2.1} \]
has a unique positive radial solution $w \in C^\infty(\mathbb{R}^N)$ which satisfies, for some $c > 0$,
\[ w(r)r^{\frac{N-1}{2}}e^r \to c > 0, \qquad w'(r)r^{\frac{N-1}{2}}e^r \to -c, \qquad \text{as } r = |x| \to \infty, \]
and each positive solution of (2.1) has the form $w_y := w(\cdot - y)$ for some $y \in \mathbb{R}^N$. We shall use $w_y$ as building blocks to construct multi-bump solutions of (1.1). For $y_i, y_j \in \mathbb{R}^N$, the identity
\[ \int_{\mathbb{R}^N}w_{y_i}^{p-1}w_{y_j} = \langle w_{y_i}, w_{y_j}\rangle = \int_{\mathbb{R}^N}w_{y_j}^{p-1}w_{y_i} \]
will be frequently used in the sequel. The following lemma is a consequence of [9, Lemma II.2] (see also [15, Lemma 2.4]).

Lemma 2.4. There exists a positive constant $c > 0$ such that as $|y_i - y_j| \to \infty$,
\[ \int_{\mathbb{R}^N}w_{y_i}^{p-1}w_{y_j} \sim c|y_i - y_j|^{-\frac{N-1}{2}}e^{-|y_i - y_j|}. \]
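For $\lambda > 0$, define
\[ \Omega_\lambda = \big\{(y_1, \dots, y_n) \in (\mathbb{R}^N)^n \;\big|\; |y_i - y_j| > \lambda \text{ for } i \ne j\big\} \]
if $n \ge 2$ and $\Omega_\lambda = \mathbb{R}^N$ if $n = 1$. For $y = (y_1, \dots, y_n) \in \Omega_\lambda$, denote
\[ u_y(x) = \sum_{i=1}^n w_{y_i}, \]
\[ T_y = \operatorname{span}\Big\{\frac{\partial w_{y_i}}{\partial x_\alpha} \;\Big|\; \alpha = 1, 2, \dots, N,\ i = 1, 2, \dots, n\Big\}, \]
and
\[ W_y = \big\{v \in H^1(\mathbb{R}^N) \;\big|\; \langle v, v_1\rangle = 0,\ \forall v_1 \in T_y\big\}. \]
Set $P_\varepsilon(x) = 1 - \varepsilon a(x)$ and $K = (p-1)(-\Delta+1)^{-1}$. For $y \in \Omega_\lambda$ and $\varphi \in H^1(\mathbb{R}^N)$, define
\[ A_y\varphi = \varphi - \sum_{j=1}^n K\big(w_{y_j}^{p-2}\varphi\big) + L_y\varphi, \]
where
\[ L_y\varphi = \sum_{i\ne j}\sum_{\alpha=1}^N\Big\langle K\big(w_{y_j}^{p-2}\varphi\big), \frac{\partial w_{y_i}}{\partial x_\alpha}\Big\rangle\Big\|\frac{\partial w_{y_i}}{\partial x_\alpha}\Big\|^{-2}\frac{\partial w_{y_i}}{\partial x_\alpha}. \]
Note that $A_y(W_y) \subset W_y$ for any $y \in \Omega_\lambda$.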
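Lemma 2.5. (See [31].) There exist $\lambda_0 > 0$ and $\eta_0 > 0$ such that for $\lambda > \lambda_0$ and $y \in \Omega_\lambda$, $A_y|_{W_y} : W_y \to W_y$ is invertible and $\|(A_y|_{W_y})^{-1}\| \le \eta_0$.

Lemma 2.6. Let $v \in H^1(\mathbb{R}^N)$. If $\varepsilon \to 0$, $\|v\| \to 0$, and $\lambda \to \infty$, then
\[ \sup_{y\in\Omega_\lambda,\ \varphi\in H^1(\mathbb{R}^N),\ \|\varphi\|=1}\Big\|A_y\varphi - \Big(\varphi - K\big(P_\varepsilon|u_y+v|^{p-2}\varphi\big)\Big)\Big\| \to 0. \]

Proof. Note that
\[ A_y\varphi - \Big(\varphi - K\big(P_\varepsilon|u_y+v|^{p-2}\varphi\big)\Big) = K\big(P_\varepsilon|u_y+v|^{p-2}\varphi\big) - \sum_{j=1}^n K\big(w_{y_j}^{p-2}\varphi\big) + L_y\varphi. \]
Since, as $|y_i - y_j| \to \infty$,
\[ \Big\langle K\big(w_{y_j}^{p-2}\varphi\big), \frac{\partial w_{y_i}}{\partial x_\alpha}\Big\rangle = (p-1)\int_{\mathbb{R}^N}w_{y_j}^{p-2}\varphi\frac{\partial w_{y_i}}{\partial x_\alpha} = o(1)\|\varphi\|, \]
we see that, as $\lambda \to \infty$,
\[ \sup_{y\in\Omega_\lambda}\|L_y\varphi\| = o(1)\|\varphi\|. \]
For any $\varphi, \psi \in H^1(\mathbb{R}^N)$,
\[
\begin{aligned}
\Big\langle K\big(P_\varepsilon|u_y+v|^{p-2}\varphi\big) - \sum_{j=1}^n K\big(w_{y_j}^{p-2}\varphi\big), \psi\Big\rangle
&= -(p-1)\varepsilon\int_{\mathbb{R}^N}a|u_y+v|^{p-2}\varphi\psi + (p-1)\int_{\mathbb{R}^N}\big(|u_y+v|^{p-2} - |u_y|^{p-2}\big)\varphi\psi \\
&\quad + (p-1)\int_{\mathbb{R}^N}\Big(|u_y|^{p-2} - \sum_{j=1}^n w_{y_j}^{p-2}\Big)\varphi\psi.
\end{aligned}
\]
A direct computation shows that (see [31])
\[ \sup_{y\in\Omega_\lambda}\Big||u_y+v|^{p-2} - |u_y|^{p-2}\Big|_{\frac{p}{p-2}} \to 0 \quad\text{as } \|v\| \to 0, \]
and
\[ \sup_{y\in\Omega_\lambda}\Big||u_y|^{p-2} - \sum_{i=1}^n w_{y_i}^{p-2}\Big|_{\frac{p}{p-2}} \to 0 \quad\text{as } \lambda \to \infty. \]
Therefore, as $\varepsilon \to 0$, $\|v\| \to 0$, and $\lambda \to \infty$,
\[ \sup_{y\in\Omega_\lambda}\Big\|K\big(P_\varepsilon|u_y+v|^{p-2}\varphi\big) - \sum_{j=1}^n K\big(w_{y_j}^{p-2}\varphi\big)\Big\| = o(1)\|\varphi\|. \]
The result follows. □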
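Solutions of (1.1) correspond to critical points of the functional
\[ J_\varepsilon(u) = \frac12\int_{\mathbb{R}^N}\big(|\nabla u|^2 + |u|^2\big) - \frac1p\int_{\mathbb{R}^N}P_\varepsilon|u|^p, \quad u \in H^1(\mathbb{R}^N). \]
Using Lemmas 2.5 and 2.6, we can proceed as in [31] to prove the following two lemmas which describe a Lyapunov–Schmidt reduction procedure converting the problem of finding critical points of $J_\varepsilon$ to a finite dimensional problem.

Lemma 2.7. There exist $\varepsilon_0 > 0$ and $\Lambda_0 > 0$ such that for $0 < \varepsilon < \varepsilon_0$ and $\lambda > \Lambda_0$, there exists a $C^1$ map
\[ v_{\lambda,\varepsilon} : \Omega_\lambda \to H^1(\mathbb{R}^N), \]
depending on $\lambda$ and $\varepsilon$, such that
(a) for any $y \in \Omega_\lambda$, $v_{\lambda,\varepsilon,y} \in W_y$;
(b) for any $y \in \Omega_\lambda$, $Q_y\nabla J_\varepsilon(u_y + v_{\lambda,\varepsilon,y}) = 0$, where $Q_y : H^1(\mathbb{R}^N) \to W_y$ is the orthogonal projection onto $W_y$;
(c) $\lim_{\lambda\to\infty,\ \varepsilon\to 0}\|v_{\lambda,\varepsilon,y}\| = 0$ uniformly in $y \in \Omega_\lambda$; $\lim_{|y|\to\infty}\|v_{\lambda,\varepsilon,y}\| = 0$ uniformly in $\varepsilon \in (0, \varepsilon_0)$ if $n = 1$.

Decreasing $\varepsilon_0$ and increasing $\Lambda_0$ if necessary, we have the following result.

Lemma 2.8. For $0 < \varepsilon < \varepsilon_0$ and $\lambda > \Lambda_0$, if $y^0 = (y_1^0, \dots, y_n^0) \in \Omega_\lambda$ is a critical point of $J_\varepsilon(u_y + v_{\lambda,\varepsilon,y})$, then $u_{y^0} + v_{\lambda,\varepsilon,y^0}$ is a critical point of $J_\varepsilon$.

3. Proof of Theorem 1.1

We shall prove Theorem 1.1 in this section. For this we need first to estimate $J_\varepsilon(u_y + v_{\lambda,\varepsilon,y})$. Denote
\[ c_0 := \frac12\|w\|^2 - \frac1p|w|_p^p. \]
Then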
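\[
\begin{aligned}
J_\varepsilon(u_y + v_{\lambda,\varepsilon,y})
&= \frac12\|u_y + v_{\lambda,\varepsilon,y}\|^2 - \frac1p\int_{\mathbb{R}^N}(1-\varepsilon a)|u_y + v_{\lambda,\varepsilon,y}|^p \\
&= \frac n2\|w\|^2 + \frac12\|v_{\lambda,\varepsilon,y}\|^2 + \sum_{i=1}^n\langle w_{y_i}, v_{\lambda,\varepsilon,y}\rangle + \sum_{i<j}\langle w_{y_i}, w_{y_j}\rangle \\
&\quad - \frac1p|u_y + v_{\lambda,\varepsilon,y}|_p^p + \frac\varepsilon p\int_{\mathbb{R}^N}a|u_y + v_{\lambda,\varepsilon,y}|^p \\
&= nc_0 + \frac np|w|_p^p + \frac12\|v_{\lambda,\varepsilon,y}\|^2 + \sum_{i=1}^n\langle w_{y_i}, v_{\lambda,\varepsilon,y}\rangle + \sum_{i<j}\langle w_{y_i}, w_{y_j}\rangle \\
&\quad - \frac1p|u_y + v_{\lambda,\varepsilon,y}|_p^p + \frac\varepsilon p\int_{\mathbb{R}^N}a|u_y + v_{\lambda,\varepsilon,y}|^p.
\end{aligned}
\]
Since $w$ is a solution of (1.3), we have
\[
\begin{aligned}
J_\varepsilon(u_y + v_{\lambda,\varepsilon,y})
&= nc_0 + \frac np|w|_p^p + \frac12\|v_{\lambda,\varepsilon,y}\|^2 + \sum_{i=1}^n\int_{\mathbb{R}^N}w_{y_i}^{p-1}v_{\lambda,\varepsilon,y} + \sum_{i<j}\int_{\mathbb{R}^N}w_{y_i}^{p-1}w_{y_j} \\
&\quad - \frac1p|u_y + v_{\lambda,\varepsilon,y}|_p^p + \frac\varepsilon p\int_{\mathbb{R}^N}a|u_y + v_{\lambda,\varepsilon,y}|^p.
\end{aligned}
\]
Expanding the last two terms gives
\[
\begin{aligned}
J_\varepsilon(u_y + v_{\lambda,\varepsilon,y})
&= nc_0 - \frac1p|u_y|_p^p + \frac np|w|_p^p + \sum_{i<j}\int_{\mathbb{R}^N}w_{y_i}^{p-1}w_{y_j} + \frac\varepsilon p\int_{\mathbb{R}^N}au_y^p \\
&\quad + \sum_{i=1}^n\int_{\mathbb{R}^N}w_{y_i}^{p-1}v_{\lambda,\varepsilon,y} - \int_{\mathbb{R}^N}u_y^{p-1}v_{\lambda,\varepsilon,y} + \varepsilon\int_{\mathbb{R}^N}au_y^{p-1}v_{\lambda,\varepsilon,y} + O\big(\|v_{\lambda,\varepsilon,y}\|^2\big).
\end{aligned}
\]
The Hölder inequality, Sobolev inequality, and Lemma 2.2 imply
\[ \Big|\int_{\mathbb{R}^N}au_y^{p-1}v_{\lambda,\varepsilon,y}\Big| \le C\Big(\int_{\mathbb{R}^N}au_y^p\Big)^{\frac{p-1}{p}}\|v_{\lambda,\varepsilon,y}\| \]
and
\[
\begin{aligned}
\Big|\int_{\mathbb{R}^N}u_y^{p-1}v_{\lambda,\varepsilon,y} - \sum_{i=1}^n\int_{\mathbb{R}^N}w_{y_i}^{p-1}v_{\lambda,\varepsilon,y}\Big|
&\le \Big|u_y^{p-1} - \sum_{i=1}^n w_{y_i}^{p-1}\Big|_{\frac{p}{p-1}}|v_{\lambda,\varepsilon,y}|_p \\
&\le C\Big(\sum_{i<j}\int_{\mathbb{R}^N}w_{y_i}^{p-1}w_{y_j}\Big)^{\frac{p-1}{p}}\|v_{\lambda,\varepsilon,y}\|.
\end{aligned}
\]
Therefore,
\[
\begin{aligned}
J_\varepsilon(u_y + v_{\lambda,\varepsilon,y})
&= nc_0 - \frac1p|u_y|_p^p + \frac np|w|_p^p + \sum_{i<j}\int_{\mathbb{R}^N}w_{y_i}^{p-1}w_{y_j} + \frac\varepsilon p\int_{\mathbb{R}^N}au_y^p \\
&\quad + O\Big(\Big(\sum_{i<j}\int_{\mathbb{R}^N}w_{y_i}^{p-1}w_{y_j}\Big)^{\frac{2(p-1)}{p}}\Big) + O\Big(\varepsilon^2\int_{\mathbb{R}^N}au_y^p\Big) + O\big(\|v_{\lambda,\varepsilon,y}\|^2\big).
\end{aligned} \tag{3.1}
\]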
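The passage from the two Hölder-type estimates to the error terms in (3.1) is not spelled out. One way to see it, offered as a hedged sketch rather than as the authors' argument, is Young's inequality $AV \le \frac12(A^2 + V^2)$; the abbreviations $A$, $B$, $V$, and the constant $C'$ are introduced here for illustration only.

```latex
% A = \Big(\sum_{i<j}\int w_{y_i}^{p-1}w_{y_j}\Big)^{(p-1)/p},
% B = \varepsilon\Big(\int a\,u_y^{p}\Big)^{(p-1)/p},
% V = \|v_{\lambda,\varepsilon,y}\|.
\[
C A V \le \tfrac{C}{2}\big(A^{2} + V^{2}\big), \qquad
C B V \le \tfrac{C}{2}\big(B^{2} + V^{2}\big),
\]
\[
A^{2} = \Big(\sum_{i<j}\int_{\mathbb{R}^N} w_{y_i}^{p-1}w_{y_j}\Big)^{\frac{2(p-1)}{p}}, \qquad
B^{2} = \varepsilon^{2}\Big(\int_{\mathbb{R}^N} a\,u_y^{p}\Big)^{\frac{2(p-1)}{p}}
      \le C'\varepsilon^{2}\int_{\mathbb{R}^N} a\,u_y^{p}.
\]
```

The last inequality uses $\frac{2(p-1)}{p} > 1$ (equivalent to $p > 2$) together with the uniform bound $\int a u_y^p \le \sup a \cdot (n|w|_p)^p$, so the cross terms are absorbed into the three $O$-terms of (3.1).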
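Lemma 3.1.
\[ \|v_{\lambda,\varepsilon,y}\| = O\Big(\Big(\sum_{i<j}\int_{\mathbb{R}^N}w_{y_i}^{p-1}w_{y_j}\Big)^{\frac{p-1}{p}}\Big) + O\Big(\varepsilon\Big(\int_{\mathbb{R}^N}au_y^p\Big)^{\frac{p-1}{p}}\Big). \]

Proof. For $v \in W_y$, from Lemma 2.7, we see that
\[ 0 = \langle u_y + v_{\lambda,\varepsilon,y}, v\rangle - \int_{\mathbb{R}^N}P_\varepsilon(u_y + v_{\lambda,\varepsilon,y})^{p-1}v. \]
Since
\[ \langle u_y, v\rangle = \sum_{i=1}^n\int_{\mathbb{R}^N}w_{y_i}^{p-1}v \]
and there exists $\theta \in (0,1)$ such that
\[ \int_{\mathbb{R}^N}P_\varepsilon(u_y + v_{\lambda,\varepsilon,y})^{p-1}v = (p-1)\int_{\mathbb{R}^N}P_\varepsilon(u_y + \theta v_{\lambda,\varepsilon,y})^{p-2}v_{\lambda,\varepsilon,y}v + \int_{\mathbb{R}^N}P_\varepsilon u_y^{p-1}v, \]
using the operators $K$ and $Q_y$ defined in Section 2, we have
\[ \Big\langle v_{\lambda,\varepsilon,y} - Q_yK\big(P_\varepsilon(u_y + \theta v_{\lambda,\varepsilon,y})^{p-2}v_{\lambda,\varepsilon,y}\big), v\Big\rangle = \int_{\mathbb{R}^N}P_\varepsilon u_y^{p-1}v - \sum_{i=1}^n\int_{\mathbb{R}^N}w_{y_i}^{p-1}v. \tag{3.2} \]
The right side can be estimated as
\[
\begin{aligned}
\Big|\int_{\mathbb{R}^N}P_\varepsilon u_y^{p-1}v - \sum_{i=1}^n\int_{\mathbb{R}^N}w_{y_i}^{p-1}v\Big|
&\le \int_{\mathbb{R}^N}\Big|u_y^{p-1} - \sum_{i=1}^n w_{y_i}^{p-1}\Big||v| + \varepsilon\int_{\mathbb{R}^N}au_y^{p-1}|v| \\
&\le C\Big(\sum_{i<j}\int_{\mathbb{R}^N}w_{y_i}^{p-1}w_{y_j}\Big)^{\frac{p-1}{p}}\|v\| + C\varepsilon\Big(\int_{\mathbb{R}^N}au_y^p\Big)^{\frac{p-1}{p}}\|v\|.
\end{aligned}
\]
Choosing $v = v_{\lambda,\varepsilon,y} - Q_yK\big(P_\varepsilon(u_y + \theta v_{\lambda,\varepsilon,y})^{p-2}v_{\lambda,\varepsilon,y}\big)$ in (3.2) and using Lemmas 2.5 and 2.6, we obtain, for some constant $\eta > 0$,
\[ \eta\|v_{\lambda,\varepsilon,y}\|\,\|v\| \le C\Big(\sum_{i<j}\int_{\mathbb{R}^N}w_{y_i}^{p-1}w_{y_j}\Big)^{\frac{p-1}{p}}\|v\| + C\varepsilon\Big(\int_{\mathbb{R}^N}au_y^p\Big)^{\frac{p-1}{p}}\|v\|, \]
which implies the result. □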
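Combining (3.1) and Lemma 3.1 yields the following estimate.

Lemma 3.2.
\[
\begin{aligned}
J_\varepsilon(u_y + v_{\lambda,\varepsilon,y})
&= nc_0 - \frac1p|u_y|_p^p + \frac np|w|_p^p + \sum_{i<j}\int_{\mathbb{R}^N}w_{y_i}^{p-1}w_{y_j} + \frac\varepsilon p\int_{\mathbb{R}^N}au_y^p \\
&\quad + O\Big(\Big(\sum_{i<j}\int_{\mathbb{R}^N}w_{y_i}^{p-1}w_{y_j}\Big)^{\frac{2(p-1)}{p}}\Big) + O\Big(\varepsilon^2\int_{\mathbb{R}^N}au_y^p\Big).
\end{aligned}
\]

We are now ready to prove Theorem 1.1. Let $n \in \mathbb{N}$ and we first consider the case $n \ge 2$. Define
\[ d = \sup_{y\in(\mathbb{R}^N)^n}\frac1p\int_{\mathbb{R}^N}au_y^p. \tag{3.3} \]
Then for any $\varepsilon$ satisfying
\[ 0 < \varepsilon < \varepsilon_1 := \min\Big\{\varepsilon_0, \frac{p-2}{3pd}|w|_p^p\Big\} \]
there exist $\mu^* = \mu^*(\varepsilon) > \mu = \mu(\varepsilon) > \Lambda_0$ such that, for $z \in \mathbb{R}^N$ with $|z| \in [\mu(\varepsilon), \mu^*(\varepsilon)]$,
\[ \frac{3pd}{p-2}\varepsilon \le \int_{\mathbb{R}^N}w^{p-1}w_z \le \frac{4pd}{p-2}\varepsilon. \tag{3.4} \]
We shall prove that, for $\varepsilon > 0$ sufficiently small, $J_\varepsilon(u_y + v_{\mu,\varepsilon,y})$ achieves its maximum at some point in $\Omega_{\mu(\varepsilon)}$, which produces an $n$-bump positive solution of (1.1). Define
\[ M_\varepsilon := \sup\big\{J_\varepsilon(u_y + v_{\mu,\varepsilon,y}) \;\big|\; y \in \Omega_{\mu(\varepsilon)}\big\}. \]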
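Lemma 3.3. Assume $n \ge 2$. Then there exists $\varepsilon_2 \in (0, \varepsilon_1)$ such that for $0 < \varepsilon < \varepsilon_2$,
\[ M_\varepsilon > \sup\big\{J_\varepsilon(u_y + v_{\mu,\varepsilon,y}) \;\big|\; y \in \Omega_{\mu(\varepsilon)} \text{ and } |y_i - y_j| \in [\mu(\varepsilon), \mu^*(\varepsilon)] \text{ for some } i \ne j\big\}. \]

Proof. For $\varepsilon > 0$ small enough, if $y = (y_1, \dots, y_n) \in \Omega_{\mu(\varepsilon)}$ and $|y_i - y_j| \in [\mu(\varepsilon), \mu^*(\varepsilon)]$ for some $i \ne j$, then by Lemmas 2.3 and 3.2, (3.3), and (3.4), we obtain
\[
\begin{aligned}
J_\varepsilon(u_y + v_{\mu,\varepsilon,y})
&\le nc_0 - \frac{p-2}{p}\sum_{i<j}\int_{\mathbb{R}^N}w_{y_i}^{p-1}w_{y_j} + \frac\varepsilon p\int_{\mathbb{R}^N}au_y^p + C\Big(\sum_{i<j}\int_{\mathbb{R}^N}w_{y_i}^{p-1}w_{y_j}\Big)^{\frac{2(p-1)}{p}} + C\varepsilon^2 \\
&\le nc_0 - 3d\varepsilon + d\varepsilon + C\varepsilon^{\frac{2(p-1)}{p}} \\
&\le nc_0 - d\varepsilon.
\end{aligned} \tag{3.5}
\]
On the other hand, for $\varepsilon$ small enough such that the fifth and the seventh terms on the right side of the equality from Lemma 3.2 satisfy
\[ \frac\varepsilon p\int_{\mathbb{R}^N}au_y^p + O\Big(\varepsilon^2\int_{\mathbb{R}^N}au_y^p\Big) > 0, \]
we have
\[ \liminf_{y\in\Omega_{\mu(\varepsilon)},\ |y_i - y_j|\to+\infty \text{ for all } i\ne j}J_\varepsilon(u_y + v_{\mu,\varepsilon,y}) \ge nc_0. \tag{3.6} \]
From (3.5) and (3.6), we obtain the result. □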
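For any $0 < \varepsilon < \varepsilon_2$, let $y^k(\varepsilon) = (y_1^k(\varepsilon), \dots, y_n^k(\varepsilon)) \in \Omega_{\mu(\varepsilon)}$, $k = 1, 2, \dots$, be a maximizing sequence for $J_\varepsilon(u_y + v_{\mu,\varepsilon,y})$. Then Lemma 3.3 implies that
\[
\inf_k \min_{i \neq j} |y_i^k(\varepsilon) - y_j^k(\varepsilon)| \ge \mu^*.
\]
Therefore, for any $0 < \varepsilon < \varepsilon_2$ and $1 \le i \le n$, passing to a subsequence if necessary, we may assume either $\lim_{k\to\infty} y_i^k(\varepsilon) = y_i^0(\varepsilon) \in \mathbb{R}^N$ with $|y_i^0(\varepsilon) - y_j^0(\varepsilon)| \ge \mu^*$ for $i \neq j$ or $\lim_{k\to\infty} |y_i^k(\varepsilon)| = \infty$. Define, for $0 < \varepsilon < \varepsilon_2$,
\[
\Pi(\varepsilon) = \big\{ 1 \le i \le n \;\big|\; |y_i^k(\varepsilon)| \to \infty \text{ as } k \to \infty \big\}.
\]
We shall prove that $\Pi(\varepsilon) = \emptyset$ for $\varepsilon > 0$ sufficiently small and thus $J_\varepsilon(u_y + v_{\mu,\varepsilon,y})$ achieves its maximum at $(y_1^0(\varepsilon), \dots, y_n^0(\varepsilon))$ in $\Omega_{\mu(\varepsilon)}$.

Lemma 3.4. Assume $n \ge 2$. Then there exists $\varepsilon(n) \in (0, \varepsilon_2)$ such that for $\varepsilon \in (0, \varepsilon(n))$, $\Pi(\varepsilon) = \emptyset$.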
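Proof. We argue by contradiction and assume that $\Pi(\varepsilon) \neq \emptyset$ along a sequence $\varepsilon_m \to 0$. Without loss of generality, we may assume $\Pi(\varepsilon_m) = \{1, \dots, j_n\}$ for all $m \in \mathbb{N}$ and for some $1 \le j_n < n$. The case in which $j_n = n$ can be handled similarly. For convenience of notation, we shall denote $\varepsilon = \varepsilon_m$, $y_i^k = y_i^k(\varepsilon_m)$, $y^k = (y_1^k, \dots, y_n^k)$ and $y_*^k = (y_{j_n+1}^k, \dots, y_n^k)$ for $k = 1, 2, \dots$. Then, as $k \to \infty$,
\[
|y_1^k| \to \infty, \quad \dots, \quad |y_{j_n}^k| \to \infty, \quad \text{and} \quad y_*^k \to y_*^0 := (y_{j_n+1}^0, \dots, y_n^0).
\]
In view of Lemma 3.2 and (3.4), we see that
\[
J_\varepsilon(u_{y^k} + v_{\mu,\varepsilon,y^k}) = n c_0 - \frac{1}{p} |u_{y^k}|_p^p + \frac{n}{p} |w|_p^p + \sum_{i<j} \int_{\mathbb{R}^N} w_{y_i^k}^{p-1} w_{y_j^k} + \frac{\varepsilon}{p} \int_{\mathbb{R}^N} a u_{y^k}^p + O\big( \varepsilon^{\frac{2(p-1)}{p}} \big),
\]
and
\[
J_\varepsilon(u_{y_*^k} + v_{\mu,\varepsilon,y_*^k}) = (n - j_n) c_0 - \frac{1}{p} |u_{y_*^k}|_p^p + \frac{n - j_n}{p} |w|_p^p + \sum_{j_n+1 \le i<j \le n} \int_{\mathbb{R}^N} w_{y_i^k}^{p-1} w_{y_j^k} + \frac{\varepsilon}{p} \int_{\mathbb{R}^N} a u_{y_*^k}^p + O\big( \varepsilon^{\frac{2(p-1)}{p}} \big).
\]
Therefore,
\[
J_\varepsilon(u_{y^k} + v_{\mu,\varepsilon,y^k}) - J_\varepsilon(u_{y_*^k} + v_{\mu,\varepsilon,y_*^k}) = j_n c_0 - \frac{1}{p} |u_{y^k}|_p^p + \frac{1}{p} |u_{y_*^k}|_p^p + \frac{j_n}{p} |w|_p^p + \sum_{i<j} \int_{\mathbb{R}^N} w_{y_i^k}^{p-1} w_{y_j^k} - \sum_{j_n+1 \le i<j \le n} \int_{\mathbb{R}^N} w_{y_i^k}^{p-1} w_{y_j^k} + \frac{\varepsilon}{p} \int_{\mathbb{R}^N} a \big( u_{y^k}^p - u_{y_*^k}^p \big) + O\big( \varepsilon^{\frac{2(p-1)}{p}} \big). \tag{3.7}
\]
Since Lemma 2.3 implies
\[
-\frac{1}{p} |u_{y^k}|_p^p + \frac{1}{p} |u_{y_*^k}|_p^p + \frac{j_n}{p} |w|_p^p
\le -\frac{2(p-1)}{p} \sum_{1 \le i<j \le j_n} \int_{\mathbb{R}^N} w_{y_i^k}^{p-1} w_{y_j^k} - \frac{p-1}{p} \sum_{i=1}^{j_n} \int_{\mathbb{R}^N} \big( w_{y_i^k}^{p-1} u_{y_*^k} + w_{y_i^k} u_{y_*^k}^{p-1} \big)
\le -\frac{2(p-1)}{p} \sum_{1 \le i<j \le j_n} \int_{\mathbb{R}^N} w_{y_i^k}^{p-1} w_{y_j^k} - \frac{2(p-1)}{p} \sum_{i=1}^{j_n} \int_{\mathbb{R}^N} w_{y_i^k}^{p-1} u_{y_*^k},
\]
we arrive at
\[
J_\varepsilon(u_{y^k} + v_{\mu,\varepsilon,y^k}) - J_\varepsilon(u_{y_*^k} + v_{\mu,\varepsilon,y_*^k}) \le j_n c_0 + \frac{\varepsilon}{p} \int_{\mathbb{R}^N} a \big( u_{y^k}^p - u_{y_*^k}^p \big) + C \varepsilon^{\frac{2(p-1)}{p}}.
\]
Letting $k \to \infty$, in view of $|y_i^k| \to \infty$ for $i = 1, \dots, j_n$, we see that
\[
M_\varepsilon \le J_\varepsilon(u_{y_*^0} + v_{\mu,\varepsilon,y_*^0}) + j_n c_0 + C \varepsilon^{\frac{2(p-1)}{p}}. \tag{3.8}
\]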
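On the other hand, since, according to the assumption,
\[
n < 1 + \frac{p-2}{2\sigma(p-1)},
\]
we can choose $\delta$ such that
\[
0 < \delta < \frac{p - 2 - 2\sigma(n-1)(p-1)}{2(1 + \sigma(n-1))(p-1)}. \tag{3.9}
\]
By Lemma 2.4 and (3.4), there exist $C_i > 0$, $i = 1, 2$, such that $\mu = \mu(\varepsilon)$ satisfies
\[
C_1 \varepsilon \le \mu^{-\frac{N-1}{2}} e^{-\mu} \le C_2 \varepsilon,
\]
which implies for $\varepsilon$ small enough
\[
(1 - \delta) \ln \frac{1}{\varepsilon} < \mu < (1 + \delta) \ln \frac{1}{\varepsilon}. \tag{3.10}
\]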
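The passage from the two-sided bound on $\mu^{-(N-1)/2} e^{-\mu}$ to (3.10) can be illustrated numerically. The following is only a sketch: it reads the bound as $\mu^{-(N-1)/2} e^{-\mu} = C\varepsilon$ for a single constant $C$, and the names `solve_mu` and `ratio`, together with the sample values $N = 3$, $C = 1$, are assumptions made for the illustration and do not come from the paper.

```python
import math


def solve_mu(log_inv_eps: float, N: int = 3, C: float = 1.0) -> float:
    """Solve mu**(-(N-1)/2) * exp(-mu) = C * eps, where log_inv_eps = ln(1/eps).

    Taking logarithms gives mu + ((N-1)/2) * ln(mu) = ln(1/eps) - ln(C).
    The left side is increasing in mu, so plain bisection is enough.
    """
    k = (N - 1) / 2.0
    target = log_inv_eps - math.log(C)
    if target <= 0:
        raise ValueError("eps is not small enough for this sketch")

    def g(mu: float) -> float:
        return mu + k * math.log(mu) - target

    lo, hi = 1e-12, target + 1.0  # g(lo) < 0 < g(hi)
    for _ in range(200):
        mid = 0.5 * (lo + hi)
        if g(mid) < 0:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


def ratio(log_inv_eps: float, N: int = 3, C: float = 1.0) -> float:
    """Return mu / ln(1/eps); (3.10) says this lies in (1 - delta, 1 + delta)."""
    return solve_mu(log_inv_eps, N, C) / log_inv_eps


if __name__ == "__main__":
    for L in (10.0, 100.0, 1000.0, 10000.0):
        print(L, ratio(L))
```

In this sketch the ratio $\mu / \ln(1/\varepsilon)$ approaches 1 only logarithmically slowly, which is consistent with (3.10) being stated for $\varepsilon$ small enough depending on $\delta$.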
Define
\[
y^s = \big( (4s - 2n - 2)(1 - p^{-1})\mu, 0, \dots, 0 \big) \in \mathbb{R}^N, \quad s = 1, 2, \dots, n.
\]
The open balls $B(y^s, 2(1 - p^{-1})\mu)$ ($s = 1, 2, \dots, n$) are mutually disjoint. Thus there are $j_n$ integers from $\{1, 2, \dots, n\}$, denoted by $s_1 < s_2 < \dots < s_{j_n}$, such that
\[
|y^{s_i} - y_j^0| \ge 2(1 - p^{-1})\mu, \quad i = 1, \dots, j_n, \ j = j_n + 1, \dots, n.
\]
Denote $y^{s_i}$ by $y_i$, $i = 1, 2, \dots, j_n$. Then, clearly,
\[
|y_i| \le 2(n-1)(1 - p^{-1})\mu, \quad i = 1, \dots, j_n, \tag{3.11}
\]
\[
|y_i - y_j| \ge 2(1 - p^{-1})\mu, \quad 1 \le i < j \le j_n, \tag{3.12}
\]
and
\[
|y_i - y_j^0| \ge 2(1 - p^{-1})\mu, \quad i = 1, \dots, j_n, \ j = j_n + 1, \dots, n. \tag{3.13}
\]
Therefore, $(y_1, \dots, y_{j_n}, y_{j_n+1}^0, \dots, y_n^0) \in \Omega_\mu$. Denote $y = (y_1, \dots, y_{j_n}, y_{j_n+1}^0, \dots, y_n^0)$. Then using Lemma 2.1 we see that
\[
-\frac{1}{p} |u_y|_p^p + \frac{1}{p} |u_{y_*^0}|_p^p + \frac{j_n}{p} |w|_p^p \ge -C \sum_{1 \le i<j \le j_n} \int_{\mathbb{R}^N} w_{y_i}^{p-1} w_{y_j} - C \sum_{i=1}^{j_n} \sum_{j=j_n+1}^{n} \int_{\mathbb{R}^N} w_{y_i}^{p-1} u_{y_j^0}.
\]
Lemma 2.4 together with (3.12) and (3.13) implies that each integral on the right side of the last inequality is less than or equal to $C e^{-2(1-p^{-1})\mu}$, which can be estimated as, using (3.10),
\[
C e^{-2(1-p^{-1})\mu} \le C e^{-2(1-p^{-1})(1-\delta) \ln \frac{1}{\varepsilon}} \le C \varepsilon^{\frac{2(p-1)}{p}(1-\delta)}.
\]
Then, from (3.7) with $y^k$ and $y_*^k$ being replaced by $y$ and $y_*^0$, respectively, we see that
\[
J_\varepsilon(u_y + v_{\mu,\varepsilon,y}) - J_\varepsilon(u_{y_*^0} + v_{\mu,\varepsilon,y_*^0}) \ge j_n c_0 + \frac{\varepsilon}{p} \int_{\mathbb{R}^N} a \big( u_y^p - u_{y_*^0}^p \big) - C \varepsilon^{\frac{2(p-1)}{p}(1-\delta)}.
\]
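Now, the assumption (A) together with (3.10) and (3.11) yields
\[
\frac{\varepsilon}{p} \int_{\mathbb{R}^N} a \big( u_y^p - u_{y_*^0}^p \big) \ge \frac{\varepsilon}{p} \int_{\mathbb{R}^N} a w_{y_1}^p \ge \frac{\varepsilon}{p} \int_{|x - y_1| \le 1} a w_{y_1}^p \ge C \varepsilon e^{-\sigma(|y_1| + 1)} \ge C \varepsilon e^{-2\sigma(n-1)(1-p^{-1})(1+\delta) \ln \frac{1}{\varepsilon}} = C \varepsilon^{1 + 2\sigma(n-1)(1-p^{-1})(1+\delta)}.
\]
Since (3.9) implies
\[
1 + 2\sigma(n-1)(1 - p^{-1})(1 + \delta) < \frac{2(p-1)}{p}(1 - \delta),
\]
we then arrive at, for $\varepsilon$ small enough,
\[
M_\varepsilon \ge J_\varepsilon(u_{y_*^0} + v_{\mu,\varepsilon,y_*^0}) + j_n c_0 + C \varepsilon^{1 + 2\sigma(n-1)(1-p^{-1})(1+\delta)}. \tag{3.14}
\]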
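That the choice of $\delta$ in (3.9) is exactly what makes the exponent in (3.14) smaller than the exponent coming from (3.8) can be checked with exact rational arithmetic. The sketch below is illustrative only: the function names and the sample values $p = 4$, $\sigma = 1/10$, $n = 3$ are assumptions chosen so that $n < 1 + (p-2)/(2\sigma(p-1))$ holds.

```python
from fractions import Fraction


def delta_upper_bound(p, sigma, n):
    """Right-hand side of (3.9)."""
    p, sigma = Fraction(p), Fraction(sigma)
    return (p - 2 - 2 * sigma * (n - 1) * (p - 1)) / (2 * (1 + sigma * (n - 1)) * (p - 1))


def exponents(p, sigma, n, delta):
    """Return the two exponents compared just before (3.14).

    lower = 1 + 2*sigma*(n-1)*(1 - 1/p)*(1 + delta)
    upper = (2*(p-1)/p)*(1 - delta)
    """
    p, sigma, delta = Fraction(p), Fraction(sigma), Fraction(delta)
    lower = 1 + 2 * sigma * (n - 1) * (1 - 1 / p) * (1 + delta)
    upper = 2 * (p - 1) / p * (1 - delta)
    return lower, upper


if __name__ == "__main__":
    p, sigma, n = 4, Fraction(1, 10), 3
    bound = delta_upper_bound(p, sigma, n)
    # At delta equal to the bound the two exponents coincide;
    # any smaller delta gives the strict inequality lower < upper.
    print(bound, exponents(p, sigma, n, bound), exponents(p, sigma, n, bound / 2))
```

Multiplying the inequality $1 + 2\sigma(n-1)(1-p^{-1})(1+\delta) < \frac{2(p-1)}{p}(1-\delta)$ by $p$ and collecting the terms in $\delta$ gives (3.9) back, so the two conditions are equivalent.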
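But (3.14) contradicts (3.8). Thus there exists $\varepsilon(n) > 0$ such that if $0 < \varepsilon < \varepsilon(n)$ then $\Pi(\varepsilon) = \emptyset$ and $J_\varepsilon(u_y + v_{\mu,\varepsilon,y})$ achieves its maximum at some point $(y_1^0, \dots, y_n^0) \in \Omega_{\mu(\varepsilon)}$. □

Now we are ready to prove Theorem 1.1.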
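Proof of Theorem 1.1. For $n \ge 2$, according to Lemma 3.4, if $0 < \varepsilon < \varepsilon(n)$ then $J_\varepsilon(u_y + v_{\mu,\varepsilon,y})$ achieves its maximum at some point $y^0 = (y_1^0, \dots, y_n^0) \in \Omega_{\mu(\varepsilon)}$. Then $u_{y^0} + v_{\lambda,\varepsilon,y^0}$ is an $n$-bump solution of (1.1). For $n = 1$, as a consequence of Lemma 2.7(c), if $\varepsilon \in (0, \varepsilon_0]$ then
\[
\lim_{|y| \to \infty} J_\varepsilon(u_y + v_{\lambda,\varepsilon,y}) = J_0(w) = c_0,
\]
and since $J_\varepsilon(u_y + v_{\mu,\varepsilon,y})$ is defined on all of $\mathbb{R}^N$, $J_\varepsilon(u_y + v_{\mu,\varepsilon,y})$ has a critical point $y^0 \in \mathbb{R}^N$ and $u_{y^0} + v_{\lambda,\varepsilon,y^0}$ is a 1-bump solution of (1.1). By an argument similar to those in [32], one sees that $u_{y^0} + v_{\lambda,\varepsilon,y^0}$ is a positive solution of (1.1). □

4. Proof of Theorem 1.3 and additional results

In this section, we consider Eq. (1.4)
\[
-\Delta u = \big( 1 - \varepsilon K(|x|) \big) u^{\frac{N+2}{N-2}}, \quad u \in D^{1,2}(\mathbb{R}^N).
\]
In order to obtain results on radial solutions for Eq. (1.4), we make the following transformation (see [14,24])
\[
u(x) = |x|^{-\frac{N-2}{2}} w(-\ln |x|), \quad x \in \mathbb{R}^N. \tag{4.1}
\]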
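Let $y = -\ln |x|$. Then $u$ is a radial solution of (1.4) if and only if $w$ is a solution of the equation
\[
-w''(y) + \frac{(N-2)^2}{4} w(y) = \big( 1 - \varepsilon K(e^{-y}) \big) w(y)^{\frac{N+2}{N-2}}, \quad y \in \mathbb{R}.
\]
Through the dilations
\[
v(y) = \Big( \frac{2}{N-2} \Big)^{\frac{N-2}{2}} w\Big( \frac{2}{N-2} y \Big), \quad a(y) = K\big( e^{-\frac{2}{N-2} y} \big), \tag{4.2}
\]
the last equation becomes
\[
-v''(y) + v(y) = \big( 1 - \varepsilon a(y) \big) v(y)^{\frac{N+2}{N-2}}, \quad y \in \mathbb{R}. \tag{4.3}
\]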
Proof of Theorem 1.3. Let $\sigma = \frac{2}{N-2}\mu$ and $p = \frac{2N}{N-2}$. Since $K \in C([0, \infty), (0, \infty))$ we have $a \in C(\mathbb{R})$ and $a(y) > 0$. The assumptions $\lim_{r \to 0} K(r) = 0$ and $\lim_{r \to \infty} K(r) = 0$ imply $\lim_{|y| \to \infty} a(y) = 0$. Also, the assumptions that $K(r) \ge c r^{\mu}$ for $r > 0$ small and $K(r) \ge c r^{-\mu}$ for $r$ large imply
\[
a(y) \ge c e^{-\frac{2}{N-2}\mu y} = c e^{-\frac{2}{N-2}\mu |y|} \quad \text{for } y > 0 \text{ and } y \text{ large}
\]
and
\[
a(y) \ge c e^{\frac{2}{N-2}\mu y} = c e^{-\frac{2}{N-2}\mu |y|} \quad \text{for } y < 0 \text{ and } -y \text{ large}.
\]
Therefore, $K$ satisfying (C) implies $a$ satisfying (A). In addition,
\[
\frac{p-2}{2\sigma(p-1)} = \frac{N-2}{\mu(N+2)}.
\]
Applying Theorem 1.1 to (4.3), there exists $\varepsilon(n) > 0$ such that for $0 < \varepsilon < \varepsilon(n)$, (4.3) has a positive $n$-bump solution $v$, which is close in the $H^1(\mathbb{R})$ norm to $\sum_{i=1}^{n} V(\cdot - y_i)$, where $V$ is the unique solution to the equation
\[
\begin{cases}
-v''(y) + v(y) = v^{\frac{N+2}{N-2}}(y), \quad v(y) > 0, \quad y \in \mathbb{R}, \\
v(0) = \max_{y \in \mathbb{R}} v(y), \quad \lim_{|y| \to \infty} v(y) = 0,
\end{cases}
\]
and $y_1, \dots, y_n \in \mathbb{R}$ with $|y_i - y_j|$ being sufficiently large for $i \neq j$. The positive $n$-bump solution $v$ of (4.3) corresponds to a positive solution $u$ of (1.4) through the transformations (4.1) and (4.2), that is,
\[
u(x) = \Big( \frac{N-2}{2} \Big)^{\frac{N-2}{2}} |x|^{-\frac{N-2}{2}} v\Big( -\frac{N-2}{2} \ln |x| \Big).
\]
Denote
\[
U(x) := U_1(x) = \frac{[N(N-2)]^{\frac{N-2}{4}}}{(1 + |x|^2)^{\frac{N-2}{2}}}.
\]
Then the relation between $U$ and $V$ is given by the formula
\[
U(x) = \Big( \frac{N-2}{2} \Big)^{\frac{N-2}{2}} |x|^{-\frac{N-2}{2}} V\Big( -\frac{N-2}{2} \ln |x| \Big).
\]
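The relation between $U$ and $V$ can be checked numerically. The sketch below assumes the standard closed form $V(y) = (N/(N-2))^{(N-2)/4} \operatorname{sech}^{(N-2)/2}(2y/(N-2))$ for the positive even decaying solution of $-v'' + v = v^{(N+2)/(N-2)}$; this explicit formula is not written in the text above and is supplied here only for the illustration, as are the function names.

```python
import math


def V(y: float, N: int) -> float:
    """Assumed closed form of the positive even solution of -v'' + v = v**((N+2)/(N-2))."""
    return (N / (N - 2)) ** ((N - 2) / 4) * (1.0 / math.cosh(2.0 * y / (N - 2))) ** ((N - 2) / 2)


def U_closed_form(r: float, N: int) -> float:
    """U(x) = [N(N-2)]**((N-2)/4) / (1 + |x|**2)**((N-2)/2) with r = |x|."""
    return (N * (N - 2)) ** ((N - 2) / 4) / (1.0 + r * r) ** ((N - 2) / 2)


def U_from_V(r: float, N: int) -> float:
    """U recovered from V through the transformations (4.1) and (4.2)."""
    s = (N - 2) / 2
    return s ** s * r ** (-s) * V(-s * math.log(r), N)


def ode_residual(y: float, N: int, h: float = 1e-4) -> float:
    """Central-difference residual of -v'' + v - v**p at the point y for v = V."""
    p = (N + 2) / (N - 2)
    second = (V(y + h, N) - 2.0 * V(y, N) + V(y - h, N)) / (h * h)
    return -second + V(y, N) - V(y, N) ** p


if __name__ == "__main__":
    for N in (3, 4, 5, 7):
        for r in (0.01, 0.5, 1.0, 3.0, 50.0):
            print(N, r, U_closed_form(r, N), U_from_V(r, N))
```

The agreement follows from $\operatorname{sech}(\ln r) = 2r/(1 + r^2)$, which turns the right-hand side of the relation into $[N(N-2)]^{(N-2)/4}(1 + r^2)^{-(N-2)/2}$.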
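Set
\[
\lambda_i = \exp\Big( \frac{2}{N-2} y_i \Big), \quad i = 1, 2, \dots, n.
\]
Then, for $i \neq j$, $|y_i - y_j|$ being sufficiently large implies $\frac{\lambda_i}{\lambda_j} + \frac{\lambda_j}{\lambda_i}$ being sufficiently large. Moreover,
\[
\Big\| u - \sum_{i=1}^{n} U_{\lambda_i} \Big\|_{D^{1,2}(\mathbb{R}^N)} = c_N \Big\| v - \sum_{i=1}^{n} V(\cdot - y_i) \Big\|_{H^1(\mathbb{R})},
\]
where $c_N = ((N-2)/2)^{\frac{N-1}{2}} \omega_N^{\frac{1}{2}}$ and $\omega_N$ is the surface area of the unit sphere in $\mathbb{R}^N$. Therefore, $u$ is a positive $n$-tower solution of (1.4). □

The above approach applies also to the equation
\[
-\Delta u = \big( 1 - \varepsilon K(|x|) \big) |x|^{\frac{(N-2)q - (N+2)}{2}} u^q, \quad x \in \mathbb{R}^N \setminus \{0\}, \tag{4.4}
\]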
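where $N \ge 3$ and $q > 1$. We would like to emphasize that we do not impose any upper bound on $q$. The transformation (4.1) converts (4.4) to the equation
\[
-w''(y) + \frac{(N-2)^2}{4} w(y) = \big( 1 - \varepsilon K(e^{-y}) \big) w^q(y), \quad y \in \mathbb{R}.
\]
Through the dilations
\[
v(y) = \Big( \frac{2}{N-2} \Big)^{\frac{2}{q-1}} w\Big( \frac{2}{N-2} y \Big), \quad a(y) = K\big( e^{-\frac{2}{N-2} y} \big), \tag{4.5}
\]
the last equation becomes
\[
-v''(y) + v(y) = \big( 1 - \varepsilon a(y) \big) v^q(y), \quad y \in \mathbb{R}. \tag{4.6}
\]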
In the same way as above, applying Theorem 1.1 to (4.6), one can prove the following theorem.

Theorem 4.1. Let $K$ satisfy (C). If $n \in \mathbb{N}$ satisfies
\[
n < 1 + \frac{(N-2)(q-1)}{4\mu q},
\]
then there exists $\varepsilon(n) > 0$ such that for $0 < \varepsilon < \varepsilon(n)$, (4.4) has a positive $n$-tower solution.

Here and in what follows, an $n$-tower solution of (4.4) means a radial solution which is close to $\sum_{i=1}^{n} U^*_{\lambda_i}$ in the $D^{1,2}(\mathbb{R}^N)$ norm, where $\lambda_i > 0$ ($i = 1, 2, \dots, n$) with $\frac{\lambda_i}{\lambda_j} + \frac{\lambda_j}{\lambda_i}$ being large enough for all $i \neq j$,
\[
U^*_{\lambda_i}(x) = \lambda_i^{\frac{N-2}{2}} U^*(\lambda_i x), \quad U^*(x) = \Big( \frac{N-2}{2} \Big)^{\frac{2}{q-1}} |x|^{-\frac{N-2}{2}} V^*\Big( -\frac{N-2}{2} \ln |x| \Big),
\]
and $V^*$ is the unique solution to the equation
\[
\begin{cases}
-v''(y) + v(y) = v^q(y), \quad v(y) > 0, \quad y \in \mathbb{R}, \\
v(0) = \max_{y \in \mathbb{R}} v(y), \quad \lim_{|y| \to \infty} v(y) = 0.
\end{cases}
\]
The relation between $U^*$ and $V^*$ comes from the transformations (4.1) and (4.5). As a consequence of Theorem 4.1, we have the following result.

Corollary 4.2. Let $K$ satisfy (D). Then for any $n \in \mathbb{N}$, there exists $\varepsilon(n) > 0$ such that for $0 < \varepsilon < \varepsilon(n)$, (4.4) has a positive $n$-tower solution. Therefore, as $\varepsilon \to 0$, (4.4) has more and more multi-tower solutions.
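The bound on $n$ in Theorem 4.1 and the identity displayed in the proof of Theorem 1.3 both arise by substituting into the hypothesis $n < 1 + \frac{p-2}{2\sigma(p-1)}$ of Theorem 1.1. The sketch below verifies the substitutions with exact rational arithmetic. It assumes that the nonlinearity $v^q$ in (4.6) corresponds to $p = q + 1$ (inferred from the choice $p = 2N/(N-2)$ for the exponent $(N+2)/(N-2)$) and that $\sigma = 2\mu/(N-2)$ in both cases; the function names are hypothetical.

```python
from fractions import Fraction


def bump_bound(p, sigma):
    """Upper bound 1 + (p - 2) / (2 * sigma * (p - 1)) on n from Theorem 1.1."""
    p, sigma = Fraction(p), Fraction(sigma)
    return 1 + (p - 2) / (2 * sigma * (p - 1))


def tower_bound_critical(N, mu):
    """Bound for (1.4): p = 2N/(N-2) and sigma = 2*mu/(N-2)."""
    N, mu = Fraction(N), Fraction(mu)
    return bump_bound(2 * N / (N - 2), 2 * mu / (N - 2))


def tower_bound_general(N, q, mu):
    """Bound for (4.4): p = q + 1 (assumed) and sigma = 2*mu/(N-2)."""
    N, q, mu = Fraction(N), Fraction(q), Fraction(mu)
    return bump_bound(q + 1, 2 * mu / (N - 2))


if __name__ == "__main__":
    N, q, mu = 5, Fraction(3), Fraction(1, 7)
    print(tower_bound_critical(N, mu))    # equals 1 + (N-2)/(mu*(N+2))
    print(tower_bound_general(N, q, mu))  # equals 1 + (N-2)*(q-1)/(4*mu*q)
```

For the critical exponent $q = (N+2)/(N-2)$ the two bounds coincide, which is one way to see that Theorem 4.1 extends the critical case.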
Obviously, Theorem 4.1 and Corollary 4.2 generalize Theorem 1.1 and Corollary 1.2. Another related equation is
\[
-\Delta u + \varepsilon V(|x|) u = |x|^{\frac{(N-2)q - (N+2)}{2}} u^q, \quad x \in \mathbb{R}^N \setminus \{0\}, \tag{4.7}
\]
where $N \ge 3$, $q > 1$ and $V$ is assumed to satisfy

(E) $V \in C((0, \infty), (0, \infty))$, $\lim_{r \to 0} r^2 V(r) = 0$, $\lim_{r \to \infty} r^2 V(r) = 0$, and for any $\mu > 0$ there exists $c > 0$ such that $r^2 V(r) \ge c r^{\mu}$ for $r > 0$ small and $r^2 V(r) \ge c r^{-\mu}$ for $r$ large.

Theorem 4.3. Let $V$ satisfy (E). Then for any $n \in \mathbb{N}$, there exists $\varepsilon(n) > 0$ such that for $0 < \varepsilon < \varepsilon(n)$, (4.7) has a positive $n$-tower solution. Therefore, as $\varepsilon \to 0$, (4.7) has more and more positive multi-tower solutions.

Proof. The transformation (4.1) converts (4.7) into the equation
\[
-w''(y) + \Big( \frac{(N-2)^2}{4} + \varepsilon e^{-2y} V(e^{-y}) \Big) w(y) = w^q(y), \quad y \in \mathbb{R}.
\]
Set
\[
v(y) = \Big( \frac{2}{N-2} \Big)^{\frac{2}{q-1}} w\Big( \frac{2}{N-2} y \Big), \quad a(y) = \frac{4}{(N-2)^2} e^{-\frac{4}{N-2} y} V\big( e^{-\frac{2}{N-2} y} \big).
\]
Then the last equation becomes
\[
-v'' + \big( 1 + \varepsilon a(y) \big) v = v^q, \quad y \in \mathbb{R}. \tag{4.8}
\]
The fact that $V$ satisfies (E) implies that $a$ satisfies (B) with $N = 1$. According to [31, Theorem 1.1], for any positive integer $n$ there exists $\varepsilon(n) > 0$ such that for $0 < \varepsilon < \varepsilon(n)$, (4.8) has an $n$-bump positive solution. As a consequence, as $\varepsilon \to 0$, (4.8) has more and more multi-bump positive solutions. Translating the results for (4.8) into those for (4.7), we are done. □

Remark 4.4. The transformation (4.1) may also be applied to the equation
\[
-\Delta u + V(|x|) u = K(|x|) u^q, \quad x \in \mathbb{R}^N,
\]
and converts it to the equation
\[
-w'' + \Big( \frac{(N-2)^2}{4} + e^{-2y} V(e^{-y}) \Big) w(y) = K(e^{-y}) e^{-\frac{(N+2) - (N-2)q}{2} y} w^q(y), \quad y \in \mathbb{R}.
\]
Studying the latter may shed some light on the former.

Acknowledgment

The authors would like to thank the referee for carefully reading the manuscript, pointing out missing references, and giving valuable suggestions to improve the results as well as the exposition of the paper.
References

[1] A. Ambrosetti, M. Badiale, Homoclinics: Poincaré–Melnikov type results via a variational approach, Ann. Inst. H. Poincaré Anal. Non Linéaire 15 (1998) 233–252.
[2] A. Ambrosetti, M. Badiale, S. Cingolani, Semiclassical states of nonlinear Schrödinger equations, Arch. Ration. Mech. Anal. 140 (1997) 285–300.
[3] A. Ambrosetti, J. Garcia Azorero, I. Peral, Perturbation of $\Delta u + u^{\frac{N+2}{N-2}} = 0$, the scalar curvature problem in $\mathbb{R}^N$ and related topics, J. Funct. Anal. 165 (1999) 117–149.
[4] A. Ambrosetti, J. Garcia Azorero, I. Peral, Remarks on a class of semilinear elliptic equations on $\mathbb{R}^n$ via perturbation methods, Adv. Nonlinear Stud. 1 (2001) 1–13.
[5] A. Ambrosetti, A. Malchiodi, On the symmetric scalar curvature problem on $S^n$, J. Differential Equations 170 (2001) 228–245.
[6] A. Ambrosetti, A. Malchiodi, Perturbation Methods and Semilinear Elliptic Problem on $\mathbb{R}^N$, Progr. Math., vol. 240, Birkhäuser, Boston, 2005.
[7] A. Ambrosetti, A. Malchiodi, W.-M. Ni, Singularly perturbed elliptic equations with symmetry: existence of solutions concentrating on spheres, Part I, Comm. Math. Phys. 235 (2003) 427–466.
[8] A. Ambrosetti, A. Malchiodi, S. Secchi, Multiplicity results for some nonlinear Schrödinger equations with potentials, Arch. Ration. Mech. Anal. 159 (2001) 253–271.
[9] A. Bahri, P.L. Lions, On the existence of a positive solution of semilinear elliptic equations on unbounded domains, Ann. Inst. H. Poincaré Anal. Non Linéaire 14 (1997) 365–413.
[10] H. Berestycki, P.L. Lions, Nonlinear scalar field equations: I, II, Arch. Ration. Mech. Anal. 82 (1983) 313–375.
[11] J. Byeon, Z.-Q. Wang, Standing waves with a critical frequency for nonlinear Schrödinger equations, Arch. Ration. Mech. Anal. 165 (2002) 295–316.
[12] J. Byeon, Z.-Q. Wang, Standing waves with a critical frequency for nonlinear Schrödinger equations, II, Calc. Var. Partial Differential Equations 18 (2003) 207–219.
[13] D.M. Cao, E.S. Noussair, S.S. Yan, On the scalar curvature equation $-\Delta u = (1 + \varepsilon K) u^{\frac{N+2}{N-2}}$ in $\mathbb{R}^N$, Calc. Var. Partial Differential Equations 15 (2002) 403–419.
[14] F. Catrina, Z.-Q. Wang, On the Caffarelli–Nirenberg inequalities: sharp constants, existence (and nonexistence) and symmetry of extremal functions, Comm. Pure Appl. Math. 54 (2001) 229–258.
[15] G. Cerami, D. Passaseo, Existence and multiplicity results for semilinear elliptic Dirichlet problems in exterior domains, Nonlinear Anal. 24 (1995) 1533–1547.
[16] V. Coti Zelati, P.H. Rabinowitz, Homoclinic type solutions for a semilinear elliptic PDE on $\mathbb{R}^N$, Comm. Pure Appl. Math. 45 (1992) 1217–1269.
[17] T. D’Aprile, J.C. Wei, Standing waves in the Maxwell–Schrödinger equation and an optimal configuration problem, Calc. Var. Partial Differential Equations 25 (2005) 105–137.
[18] M. del Pino, P. Felmer, Local mountain passes for semilinear elliptic problems in unbounded domains, Calc. Var. Partial Differential Equations 4 (1996) 121–137.
[19] M. del Pino, P. Felmer, Semi-classical states for nonlinear Schrödinger equations, J. Funct. Anal. 149 (1997) 245–265.
[20] M. del Pino, P. Felmer, Multi-peak bound states of nonlinear Schrödinger equations, Ann. Inst. H. Poincaré Anal. Non Linéaire 15 (1998) 127–149.
[21] M. del Pino, P. Felmer, Semi-classical states of nonlinear Schrödinger equations: a variational reduction method, Math. Ann. 324 (2002) 1–32.
[22] M. del Pino, M. Kowalczyk, J.C. Wei, Concentration on curves for nonlinear Schrödinger equations, Comm. Pure Appl. Math. 60 (2007) 113–146.
[23] A. Floer, A. Weinstein, Nonspreading wave packets for the cubic Schrödinger equations with a bounded potential, J. Funct. Anal. 69 (1986) 397–408.
[24] N. Korevaar, R. Mazzeo, F. Pacard, R. Schoen, Refined asymptotics for constant scalar curvature metric with isolated singularities, Invent. Math. 135 (1999) 233–272.
[25] M.K. Kwong, Uniqueness of positive solutions of $\Delta u - u + u^p = 0$ in $\mathbb{R}^N$, Arch. Ration. Mech. Anal. 105 (1989) 243–266.
[26] Y.Y. Li, Prescribing scalar curvature on $S^3$, $S^4$ and related problems, J. Funct. Anal. 118 (1993) 43–118.
[27] Y.Y. Li, On Nirenberg’s problem and related topics, Topol. Methods Nonlinear Anal. 3 (1994) 221–233.
[28] Y.Y. Li, Prescribing scalar curvature on $S^n$ and related problems, Part I, J. Differential Equations 120 (1995) 319–410.
[29] Y.Y. Li, Prescribing scalar curvature on $S^n$ and related problems. II. Existence and compactness, Comm. Pure Appl. Math. 49 (1996) 541–597.
[30] Y.Y. Li, On a singularly perturbed elliptic equation, Adv. Differential Equations 2 (1997) 955–980.
[31] L.S. Lin, Z.L. Liu, S.W. Chen, Multi-bump solutions for a semilinear Schrödinger equation, Indiana Univ. Math. J., in press.
[32] Z.L. Liu, Z.-Q. Wang, Multi-bump type nodal solutions having a prescribed number of nodal domains: I, II, Ann. Inst. H. Poincaré Anal. Non Linéaire 22 (2005) 597–631.
[33] A. Malchiodi, The scalar curvature problem on $S^n$: an approach via Morse theory, Calc. Var. Partial Differential Equations 14 (2002) 429–445.
[34] C.B. Ndiaye, Multiple solutions for the scalar curvature problem on the sphere, Comm. Partial Differential Equations 31 (2006) 1667–1678.
[35] W.-M. Ni, I. Takagi, Locating the peaks of least-energy solutions to a semilinear Neumann problem, Duke Math. J. 70 (1993) 247–281.
[36] Y.G. Oh, Existence of semi-classical bound states of nonlinear Schrödinger equations with potential on the class $(V)_a$, Comm. Partial Differential Equations 13 (1988) 1499–1519.
[37] Y.G. Oh, On positive multi-lump bound states of nonlinear Schrödinger equations under multiple well potential, Comm. Math. Phys. 131 (1990) 223–253.
[38] P.H. Rabinowitz, On a class of nonlinear Schrödinger equations, Z. Angew. Math. Phys. 43 (1992) 270–291.
[39] Z.-Q. Wang, Existence and symmetry of multi-bump solutions for nonlinear Schrödinger equations, J. Differential Equations 159 (1999) 102–137.
Journal of Functional Analysis 257 (2009) 506–536 www.elsevier.com/locate/jfa
Decomposing the essential spectrum
E.B. Davies
Department of Mathematics, King’s College London, Strand, London, WC2R 2LS, United Kingdom
Received 12 November 2008; accepted 21 January 2009
Available online 13 February 2009
Communicated by L. Gross
Abstract
We use $C^*$-algebra theory to provide a new method of decomposing the essential spectra of self-adjoint and non-self-adjoint Schrödinger operators in one or more space dimensions.
© 2009 Elsevier Inc. All rights reserved.
Keywords: Essential spectrum; $C^*$-algebra; Non-self-adjoint; Schrödinger operator
E-mail address: [email protected].
0022-1236/$ – see front matter © 2009 Elsevier Inc. All rights reserved. doi:10.1016/j.jfa.2009.01.031

1. Introduction

In a recent study by Hinchcliffe [14] of the spectrum of a periodic, discrete, non-self-adjoint Schrödinger operator acting on $\mathbb{Z}^2$ with a dislocation along $\{0\} \times \mathbb{Z}$, we were struck by the fact that the essential spectrum of the operator, defined by means of the Calkin algebra, divides into two parts, one of which occupies a region in the complex plane, the other being one or more simple curves; the curves are associated with surface states confined to a neighbourhood of the dislocation. The same phenomenon occurs in the self-adjoint case, but here the distinction is between parts of the (real) spectrum that have infinite spectral multiplicity and other parts with finite multiplicity, at least in two dimensions.

In this paper we describe a method of decomposing the essential spectrum of a self-adjoint or non-self-adjoint Schrödinger operator into parts by using the two-sided ideals of a certain standard $C^*$-algebra. Our conclusion is that one can define different types of essential spectrum, provided one is given this extra structure; we warn the reader that the spectral classification that we obtain is not a unitary invariant of the operators concerned. However, the $C^*$-algebra used
is the same for all the applications considered so the results obtained have a high degree of model-independence.

At a broad conceptual level, the $C^*$-algebras that we consider are closely related to algebras and modules that were introduced by Georgescu, Măntoiu, Roe and others [1–3,15,16,19,20]; many further references and useful comments may be found in [12,13]. However, our treatment works in a very general metric space setting and does not depend on the presence of any group action, whether abelian or nonabelian. As a result it is applicable to more or less arbitrary waveguides, discrete and continuous graphs and Riemannian manifolds, as well as to $\mathbb{R}^d$ and $\mathbb{Z}^d$. The idea of studying the essential spectrum of an operator by constructing a $C^*$-algebra with a large class of closed two-sided ideals is also not new. It first appeared in [2,3], but all previous treatments apply in much more restricted contexts than that here.

Some of our spectral results can be proved by methods that are geometric in the sense that they involve Hilbert space methods rather than $C^*$-algebras. An advantage of the approach described here is that instead of dealing with new applications by invoking analogy and experience, the use of $C^*$-algebras enables one to formulate simple general theorems that cover applications directly. The method accommodates many of the technical hypotheses that have been used in the field within a single formalism.

In Sections 2 and 4 we investigate the relevant $C^*$-algebra theory without reference to its application. Section 3 is devoted to showing how to apply the results to discrete Schrödinger operators. Theorems 10 and 11 describe the spectrum when a periodic potential has a dislocation on one or both of the two axes in $\mathbb{Z}^2$; the second possibility has not previously been considered.

After a substantial amount of preparatory work, we turn in Section 7 to the study of Schrödinger and more general differential operators acting in $L^2(\mathbb{R}^d)$, and show that the abstract methods developed earlier can be applied to their resolvent operators under suitable hypotheses. The spectral mapping theorem then allows one to pull the results back to the original operators. Example 42 explains the application of the methods to multi-body Schrödinger operators.

Finally, in Section 8 we show that our methods are not only relevant in a Euclidean context. We prove that the $C^*$-algebraic assumptions are satisfied when considering the Laplace–Beltrami operator on three-dimensional hyperbolic space by writing down the explicit formulae available in this case; the same applies to a wide variety of other Riemannian manifolds but general heat kernel bounds are needed for the proofs.

2. Some $C^*$-algebra theory

Throughout this section $\mathcal{A}$ will denote a (usually non-commutative) $C^*$-algebra with identity, and $\mathcal{J}$ will denote a (closed, two-sided) ideal in $\mathcal{A}$. It is well-known that such an ideal is necessarily closed under adjoints and that $\mathcal{A}/\mathcal{J}$ is again a $C^*$-algebra with respect to the quotient norm. See [10, Chapter 1] or [17, Chapter 1] for various standard facts about $C^*$-algebras that we will use without further comment. If $x \in \mathcal{A}$ then we denote the spectrum of $x$ by $\sigma(x)$; it is known that if $\mathcal{A}$ is replaced by a larger $C^*$-algebra, $\sigma(x)$ does not change. If $\mathcal{J}$ is an ideal in $\mathcal{A}$ we denote the natural map of $\mathcal{A}$ onto the quotient algebra $\mathcal{A}/\mathcal{J}$ by $\pi_{\mathcal{J}}$. If several ideals $\mathcal{J}_r$ are labelled by a parameter $r$, we write $\pi_r$ instead of $\pi_{\mathcal{J}_r}$ for brevity, and also put $\sigma_r(x) = \sigma(\pi_{\mathcal{J}_r}(x))$.

Lemma 1. If the ideals $\mathcal{J}_1$, $\mathcal{J}_2$ in $\mathcal{A}$ satisfy $\mathcal{J}_2 \subseteq \mathcal{J}_1 \subseteq \mathcal{A}$ then
\[
\sigma_1(x) \subseteq \sigma_2(x) \subseteq \sigma(x).
\]
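Proof. We first put $\mathcal{J}_3 = \{0\}$, so that $\mathcal{A}/\mathcal{J}_3 = \mathcal{A}$ and $\sigma_3(x) = \sigma(x)$. Suppose that $1 \le r < s \le 3$ and that $\lambda \notin \sigma_s(x)$. Then there exists $y \in \mathcal{A}$ such that
\[
(\pi_s(x) - \lambda 1)\pi_s(y) = \pi_s(y)(\pi_s(x) - \lambda 1) = 1
\]
in $\mathcal{A}/\mathcal{J}_s$. Hence there exist $u, v \in \mathcal{J}_s$ such that
\[
(x - \lambda 1) y = 1 + u, \qquad y (x - \lambda 1) = 1 + v.
\]
Applying $\pi_r$ to both equations and using the fact that $\mathcal{J}_s \subseteq \mathcal{J}_r$ we obtain
\[
(\pi_r(x) - \lambda 1)\pi_r(y) = \pi_r(y)(\pi_r(x) - \lambda 1) = 1.
\]
Hence $\lambda \notin \sigma_r(x)$ and $\sigma_r(x) \subseteq \sigma_s(x)$. □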
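Note. If $\mathcal{A} = \mathcal{L}(\mathcal{H})$ and $\mathcal{J}$ is the ideal $\mathcal{K}(\mathcal{H})$ of all compact operators on the Hilbert space $\mathcal{H}$, then $\sigma(\pi_{\mathcal{J}}(x))$ is (one of several inequivalent definitions of) the essential spectrum of $x$ by [5, Theorem 4.3.7]. Needless to say we are interested in more general examples.

There are several ways of constructing $\mathcal{A}$ and the relevant ideals $\mathcal{J}_r$. Given $\mathcal{J}$, the largest choice of $\mathcal{A}$ is described in (2) and more concretely in Lemma 3. If one wishes to make another choice, call it $\tilde{\mathcal{A}}$, one has to confirm that $\mathcal{J} \subseteq \tilde{\mathcal{A}} \subseteq \mathcal{A}$.

Theorem 2. Let $\mathcal{B}$ be a $C^*$-algebra with identity and let $\{p_n\}_{n=1}^{\infty}$ be an increasing sequence of orthogonal projections in $\mathcal{B}$ with $p_n \neq 1$ for every $n$. Then the norm closure $\mathcal{J}$ of
\[
\mathcal{J}_0 = \{ x \in \mathcal{B} : \exists n \ge 1.\ p_n x = x p_n = x \}
\]
is a $C^*$-subalgebra that does not contain the identity of $\mathcal{B}$. We have
\[
\mathcal{J} = \big\{ x \in \mathcal{B} : \lim_{n \to \infty} \| x - p_n x p_n \| = 0 \big\}. \tag{1}
\]
Moreover $\mathcal{J}$ is an ideal in the $C^*$-algebra with identity $\mathcal{A}$ defined by
\[
\mathcal{A} = \{ a \in \mathcal{B} : a\mathcal{J} \subseteq \mathcal{J} \text{ and } \mathcal{J}a \subseteq \mathcal{J} \}. \tag{2}
\]
If $\mathcal{B} = \mathcal{L}(\mathcal{H})$ and $p_n$ converge strongly to $I$ as $n \to \infty$ then $\mathcal{K}(\mathcal{H}) \subseteq \mathcal{J} \subseteq \mathcal{A}$, so
\[
\sigma(\pi_{\mathcal{J}}(x)) \subseteq \sigma_{\mathrm{ess}}(x) \subseteq \sigma(x)
\]
for all $x \in \mathcal{A}$.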
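Proof. First note that if $p_n x = x p_n = x$ then $p_m x = x p_m = x$ for all $m \ge n$. It follows by elementary algebra that $\mathcal{J}_0$ is a $*$-subalgebra of $\mathcal{B}$, and this implies the same for $\mathcal{J}$. If $x \in \mathcal{J}_0$ then there exists $n$ for which $1 - p_n = (1 - p_n)(1 - x)$. Therefore
\[
1 = \|1 - p_n\| = \|(1 - p_n)(1 - x)\| \le \|1 - p_n\| \|1 - x\| = \|1 - x\|
\]
because $p_n \neq 1$ for every $n$. Hence $\|1 - x\| \ge 1$ for all $x \in \mathcal{J}$ and we can deduce that $1 \notin \mathcal{J}$.

If $x \in \mathcal{B}$ and $\lim_{n \to \infty} \|x - p_n x p_n\| = 0$ then the fact that $p_n x p_n \in \mathcal{J}_0$ implies that $x \in \mathcal{J}$. Conversely if $x \in \mathcal{J}$ and $\varepsilon > 0$ then there exists $y \in \mathcal{J}_0$ such that $\|x - y\| < \varepsilon$. There now exists $N \ge 1$ such that $y = p_n y p_n$ for all $n \ge N$. For all such $n$ we have
\[
\|x - p_n x p_n\| \le \|x - y\| + \|y - p_n x p_n\| = \|x - y\| + \|p_n y p_n - p_n x p_n\| \le 2\|x - y\| < 2\varepsilon.
\]
Hence $\lim_{n \to \infty} \|x - p_n x p_n\| = 0$.

The proofs that $\mathcal{A}$ is a $C^*$-algebra with identity and that $\mathcal{J}$ is an ideal in $\mathcal{A}$ are both elementary algebra. If $\mathcal{B} = \mathcal{L}(\mathcal{H})$ then in order to prove that $\mathcal{K}(\mathcal{H}) \subseteq \mathcal{J}$ it is sufficient by (1) and a density argument to observe that if $x$ is a finite rank operator then $\lim_{n \to \infty} \|x - p_n x p_n\| = 0$. The final inclusion of the theorem follows from Lemma 1. □

The following provides an alternative description of $\mathcal{A}$.

Lemma 3. Let $\mathcal{B}$, $\{p_n\}_{n=1}^{\infty}$, $\mathcal{J}$ and $\mathcal{A}$ be defined as in Theorem 2. Let
\[
\mathcal{D}_0 = \{ a \in \mathcal{B} : \forall n \ge 1.\ \exists m \ge n.\ p_m a p_n = a p_n \} \tag{3}
\]
and
\[
\mathcal{D} = \{ a \in \mathcal{B} : \forall n \ge 1.\ a p_n \in \mathcal{J} \}. \tag{4}
\]
Then $\mathcal{D}_0 \subseteq \mathcal{D}$ and $\mathcal{A} = \mathcal{D} \cap \mathcal{D}^*$.

Proof. The inclusions $\mathcal{D}_0 \subseteq \mathcal{D}$ and $\mathcal{A} \subseteq \mathcal{D} \cap \mathcal{D}^*$ are elementary. If $a \in \mathcal{D}$ and $x \in \mathcal{J}_0$ then for some $n \ge 1$ we have
\[
ax = a(p_n x) = (a p_n) x \in \mathcal{J}\mathcal{J}_0 \subseteq \mathcal{J}.
\]
A density argument now implies that $a\mathcal{J} \subseteq \mathcal{J}$. By taking adjoints we conclude that $\mathcal{D} \cap \mathcal{D}^* \subseteq \mathcal{A}$. □

Note. In spite of the notation we do not claim that $\mathcal{D}$ is the norm closure of $\mathcal{D}_0$.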
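Lemma 4. Let $\{p_n\}_{n=1}^{\infty}$ be an increasing sequence of projections on $\mathcal{H}$ that converge strongly to 1 and let $\mathcal{J}$ and $\mathcal{A}$ be constructed as described in Theorem 2. If $\{\phi_r\}_{r=1}^{\infty}$ is a sequence of unit vectors in $\mathcal{H}$ and $\lim_{r \to \infty} \|p_n \phi_r\| = 0$ for every $n \ge 1$ then $\lim_{r \to \infty} \|a \phi_r\| = 0$ for every $a \in \mathcal{J}$.

Proof. This is elementary if $a \in \mathcal{J}_0$ and follows for all $a \in \mathcal{J}$ by approximation. □

We say that a sequence $\{\phi_r\}_{r=1}^{\infty}$ of unit vectors in $\mathcal{H}$ is localized (with respect to $\mathcal{J}$) if there exists $n \ge 1$ and $c > 0$ such that $\|p_n \phi_r\| \ge c$ for all $r \ge 1$.

Theorem 5. If $x \in \mathcal{A}$ and $\lambda \in \sigma(x) \setminus \sigma(\pi_{\mathcal{J}}(x))$ then there exists a sequence $\{\phi_r\}_{r=1}^{\infty}$ that is localized with respect to $\mathcal{J}$ and satisfies either
\[
\lim_{r \to \infty} \|x \phi_r - \lambda \phi_r\| = 0 \tag{5}
\]
or
\[
\lim_{r \to \infty} \|x^* \phi_r - \bar{\lambda} \phi_r\| = 0. \tag{6}
\]
Proof. If $\lambda \in \sigma(x)$ then there exists a sequence $\{\phi_r\}_{r=1}^{\infty}$ of unit vectors such that either (5) or (6) holds; see [5, Lemma 1.2.13]. Both cases are similar and we only consider the first. If $\lambda \in \sigma(x) \setminus \sigma(\pi_{\mathcal{J}}(x))$ and (5) holds and $\lim_{r \to \infty} \|p_n \phi_r\| = 0$ for all $n \ge 1$ then $\pi_{\mathcal{J}}(\lambda 1 - x)$ is invertible in $\mathcal{A}/\mathcal{J}$, so there exist $y \in \mathcal{A}$ and $a \in \mathcal{J}$ such that $y(\lambda 1 - x) = 1 + a$. Lemma 4 now yields
\[
1 = \lim_{r \to \infty} \|(1 + a)\phi_r\| \le \lim_{r \to \infty} \|y\| \|(\lambda 1 - x)\phi_r\| = 0.
\]
The contradiction establishes that if $\lambda \in \sigma(x) \setminus \sigma(\pi_{\mathcal{J}}(x))$ then $\|p_n \phi_r\|$ does not converge to 0 as $r \to \infty$ for some $n \ge 1$. It follows that there exists a subsequence $\{\psi_r\}_{r=1}^{\infty}$ and $c > 0$ such that $\|p_n \psi_r\| \ge c$ for all $r \ge 1$. □

Note. Theorem 5 has no converse. If $a \in \mathcal{A}$ is a self-adjoint operator then any eigenvalue $\lambda$ of $a$ that is embedded in the continuous spectrum satisfies the conclusion of the theorem for the choice $\mathcal{J} = \mathcal{K}(\mathcal{H})$. One simply defines $\phi_n$ to be the normalized eigenvector of $a$ corresponding to the eigenvalue $\lambda$ for all $n$.

Sometimes one has several ideals in $\mathcal{A}$ but neither is contained in the other.

Theorem 6. Let $\mathcal{J}_1$ and $\mathcal{J}_2$ be two ideals in the $C^*$-algebra $\mathcal{A}$ with identity, and put $\mathcal{J}_3 = \mathcal{J}_1 \cap \mathcal{J}_2$. Then
\[
\sigma_3(x) = \sigma_1(x) \cup \sigma_2(x)
\]
for all $x \in \mathcal{A}$.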
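Proof. It is elementary that $\mathcal{J}_3$ is an ideal. Let $\mathcal{B} = \mathcal{A}/\mathcal{J}_1 \oplus \mathcal{A}/\mathcal{J}_2$ and define the $C^*$-homomorphism $\pi : \mathcal{A} \to \mathcal{B}$ by $\pi = \pi_1 \oplus \pi_2$. Then the image $\mathcal{C} = \pi(\mathcal{A})$ is a $C^*$-subalgebra of $\mathcal{B}$ and the kernel of $\pi$ is $\mathcal{J}_3$. If $x \in \mathcal{A}$ then the spectrum of $\pi(x)$ is the same whether regarded as an element of $\mathcal{B}$ or $\mathcal{C}$. In the former case the spectrum is $\sigma_1(x) \cup \sigma_2(x)$ and in the latter case it is $\sigma_3(x)$. □

We next describe one of the $C^*$-algebras that we shall be using in the next section. Let $\mathcal{H}_1$ and $\mathcal{H}_2$ be infinite-dimensional Hilbert spaces and let $\mathcal{H} = \mathcal{H}_1 \otimes \mathcal{H}_2$ be their Hilbert space tensor product. Let $I_i$ denote the identity operator on $\mathcal{H}_i$ for $i = 1, 2$.

Theorem 7. Let $\{P_n\}_{n=1}^{\infty}$ be an increasing sequence of finite rank projections in $\mathcal{H}_1$ which converges strongly to $I_1$ as $n \to \infty$ and put $p_n = P_n \otimes I_2$. Then $\mathcal{J}$ defined as in Theorem 2 is the closed linear span of all operators $A_1 \otimes A_2$ where $A_1 \in \mathcal{K}(\mathcal{H}_1)$ and $A_2 \in \mathcal{L}(\mathcal{H}_2)$. Also $\mathcal{A}$, defined as in Theorem 2, contains the closed linear span of all operators $A_1 \otimes A_2$ where $A_i \in \mathcal{L}(\mathcal{H}_i)$ for $i = 1, 2$.

Proof. Let $\tilde{\mathcal{J}}$ denote the closed linear span of all operators $a = A_1 \otimes A_2$ where $A_1 \in \mathcal{K}(\mathcal{H}_1)$ and $A_2 \in \mathcal{L}(\mathcal{H}_2)$. The formula
\[
\lim_{n \to \infty} \|A_1 - P_n A_1 P_n\| = 0
\]
implies
\[
\lim_{n \to \infty} \|a - p_n a p_n\| = 0.
\]
We deduce that $a \in \mathcal{J}$ and hence that $\tilde{\mathcal{J}} \subseteq \mathcal{J}$. Conversely if $x \in \mathcal{J}_0$ then there exists $n \ge 1$ such that $x = p_n x p_n$. If $P_n$ has rank $k$ then $p_n x p_n$ can be written as the sum of $k^2$ terms of the form $A_1 \otimes A_2$ where each $A_1$ has rank 1. Hence $p_n x p_n \in \tilde{\mathcal{J}}$. The inclusion $\mathcal{J}_0 \subseteq \tilde{\mathcal{J}}$ implies $\mathcal{J} \subseteq \tilde{\mathcal{J}}$. The final statement of the theorem follows directly from the inclusions
\[
(A_1 \otimes A_2)\tilde{\mathcal{J}} \subseteq \tilde{\mathcal{J}}, \qquad \tilde{\mathcal{J}}(A_1 \otimes A_2) \subseteq \tilde{\mathcal{J}}. \qquad \Box
\]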
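The step in the proof of Theorem 7 that writes $p_n x p_n$ as a sum of $k^2$ terms $A_1 \otimes A_2$ with each $A_1$ of rank 1 can be made concrete in finite dimensions. The sketch below is a toy model only: it replaces the range of $P_n$ by $\mathbb{C}^k$ and the infinite-dimensional space $\mathcal{H}_2$ by $\mathbb{C}^m$, represents operators as lists of rows, and uses hypothetical helper names.

```python
def kron(a, b):
    """Kronecker product of two matrices given as lists of rows."""
    return [[x * y for x in row_a for y in row_b] for row_a in a for row_b in b]


def unit(k, i, j):
    """Rank-one k-by-k matrix unit E_ij (entry 1 at row i, column j)."""
    return [[1 if (r, c) == (i, j) else 0 for c in range(k)] for r in range(k)]


def block(x, m, i, j):
    """The m-by-m block of x in block-row i and block-column j."""
    return [row[j * m:(j + 1) * m] for row in x[i * m:(i + 1) * m]]


def add(a, b):
    """Entrywise sum of two matrices of the same shape."""
    return [[s + t for s, t in zip(ra, rb)] for ra, rb in zip(a, b)]


def rank_one_expansion(x, k, m):
    """Rebuild x on C^k (x) C^m as the sum over i, j of E_ij (x) X_ij.

    Returns the rebuilt matrix and the number of terms used, which is k*k.
    """
    total = [[0] * (k * m) for _ in range(k * m)]
    terms = 0
    for i in range(k):
        for j in range(k):
            total = add(total, kron(unit(k, i, j), block(x, m, i, j)))
            terms += 1
    return total, terms


if __name__ == "__main__":
    k, m = 2, 3
    x = [[(7 * r + 3 * c) % 5 for c in range(k * m)] for r in range(k * m)]
    rebuilt, terms = rank_one_expansion(x, k, m)
    print(rebuilt == x, terms)
```

Each $E_{ij}$ has rank one, so the expansion has exactly the form used in the proof; in the infinite-dimensional setting the blocks $X_{ij}$ are bounded operators on $\mathcal{H}_2$ rather than matrices.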
3. Application to discrete Schrödinger operators

In this section we construct a $C^*$-subalgebra $\mathcal{A}$ of $\mathcal{L}(\mathcal{H})$ where $\mathcal{H} = l^2(\mathbb{Z}^d)$ by an ad hoc procedure. A more systematic approach that uses a standard $C^*$-algebra is described in Section 4. We put $\mathcal{H}_1 = l^2(\mathbb{Z})$ and $\mathcal{H}_2 = l^2(\mathbb{Z}^{d-1})$, so that
\[
\mathcal{H} \simeq \mathcal{H}_1 \otimes \mathcal{H}_2 \simeq l^2(\mathbb{Z}, \mathcal{H}_2) \tag{7}
\]
by means of canonical unitary isomorphisms. We define the projections $p_n$ by
\[
(p_n \phi)(x) = \begin{cases} \phi(x) & \text{if } -n \le x_1 \le n, \\ 0 & \text{otherwise}, \end{cases}
\]
for all $\phi \in \mathcal{H}$ and $x \in \mathbb{Z}^d$. We also define the $C^*$-algebra $\mathcal{A}$ and the ideal $\mathcal{J}$ as in Theorems 2 and 7. The ideal $\mathcal{J}$ contains all bounded operators on $\mathcal{H}$ that are ‘concentrated’ in some neighbourhood of the dislocation set $S = \{0\} \times \mathbb{Z}^{d-1}$. In Section 4 we explain how to generalize the ideas in this section by allowing the dislocation set to have a completely general shape.

Lemma 8. The $C^*$-algebra $\mathcal{A}$ contains all ‘Schrödinger operators’ of the form
\[
(A\phi)(x) = \sum_{r=1}^{m} a_r(x) \phi(x + b_r) \tag{8}
\]
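where $\phi \in l^2(\mathbb{Z}^d)$, $x \in \mathbb{Z}^d$, $m \in \mathbb{Z}_+$, $b_r \in \mathbb{Z}^d$ and $a_r \in l^\infty(\mathbb{Z}^d)$ for all $r \in \{1, 2, \dots, m\}$.

Proof. An elementary calculation implies that $p_{n+k} A p_n = A p_n$ for all $n \ge 1$ where $k = \max\{|b_r| : 1 \le r \le m\}$, so $A \in \mathcal{D}_0$. The same applies to $A^*$, so we may apply Lemma 3. □

We say that the Schrödinger operator $A$ on $\mathcal{H}$ is periodic in the $\mathbb{Z}$ direction with period $k$ if $T_k A = A T_k$ where $(T_k \phi)(m) = \phi(m + k)$ for all $\phi \in l^2(\mathbb{Z}, \mathcal{H}_2)$. This holds if and only if the coefficients $a_r$ are all periodic in the $\mathbb{Z}$ direction with period $k$.

Theorem 9. If the Schrödinger operator $A$ is periodic in the $\mathbb{Z}$ direction with period $k$ then
\[
\sigma(\pi_{\mathcal{J}}(A)) = \sigma_{\mathrm{ess}}(A) = \sigma(A). \tag{9}
\]
If in addition $H = A + B + C$ where $B \in \mathcal{J}$ and $C \in \mathcal{K}(\mathcal{H})$, then
\[
\sigma_{\mathrm{ess}}(A) \subseteq \sigma_{\mathrm{ess}}(A + B) = \sigma_{\mathrm{ess}}(H) \subseteq \sigma(H). \tag{10}
\]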
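The elementary calculation $p_{n+k} A p_n = A p_n$ in the proof of Lemma 8 can be tried out on finitely supported sequences. The following sketch works in dimension $d = 1$ only, stores a sequence as a dictionary from sites to values, and uses hypothetical names (`cutoff`, `apply_operator`) and arbitrarily chosen sample coefficients.

```python
def cutoff(phi, n):
    """p_n in dimension d = 1: keep the values of phi at sites -n <= x <= n."""
    return {x: v for x, v in phi.items() if -n <= x <= n}


def apply_operator(phi, hops):
    """(A phi)(x) = sum_r a_r(x) * phi(x + b_r) for a finitely supported phi.

    hops is a list of pairs (a_r, b_r), with a_r a bounded function on Z
    and b_r an integer shift.
    """
    out = {}
    for a, b in hops:
        for site, value in phi.items():
            x = site - b  # phi(x + b) is non-zero only when x + b == site
            out[x] = out.get(x, 0) + a(x) * value
    return out


if __name__ == "__main__":
    # Sample coefficients and shifts; k is the largest |b_r|.
    hops = [(lambda x: 1 + (x % 3), 2), (lambda x: 2, -1), (lambda x: 5, 0)]
    k = max(abs(b) for _, b in hops)
    phi = {x: x * x + 1 for x in range(-10, 11)}
    n = 4
    image = apply_operator(cutoff(phi, n), hops)
    # A p_n phi is supported in [-(n+k), n+k], so p_{n+k} leaves it unchanged.
    print(cutoff(image, n + k) == image)
```

The support of $A p_n \phi$ spreads by at most $k$ sites beyond $[-n, n]$, which is the whole content of the identity; a smaller cutoff than $n + k$ would in general remove part of the image.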
Proof. The identities in (9) follow directly from Lemma 1 provided we can prove that σ(A) ⊆ σ(π_J(A)). If λ ∈ σ(A) then there exists a sequence {φ_r}_{r=1}^∞ of unit vectors such that either lim_{r→∞} ‖Aφ_r − λφ_r‖ = 0 or lim_{r→∞} ‖A^*φ_r − λ̄φ_r‖ = 0; see [5, Lemma 1.2.13]. Both cases are similar, so we only consider the first. By translating the φ_r sufficiently and using the translation invariance of A, we see that there exists a sequence {ψ_r}_{r=1}^∞ of unit vectors such that lim_{r→∞} ‖Aψ_r − λψ_r‖ = 0 and lim_{r→∞} ‖p_n ψ_r‖ = 0 for every n. The argument of Theorem 5 establishes that λ ∈ σ(π_J(A)) and hence that σ(A) ⊆ σ(π_J(A)). The statements in (10) now follow from Lemma 1 as soon as one observes that σ(π_J(H)) = σ(π_J(A)) and σ(π_{K(H)}(H)) = σ(π_{K(H)}(A + B)). □
The following theorem identifies the asymptotic part of the spectrum of certain Schrödinger operators H as x_1 → −∞. The operators concerned have much in common with those of [7], but we allow them to be non-self-adjoint and require the underlying space to be discrete.
Theorem 10. Let S = {x ∈ Z^d : x_1 ≥ 0} and put
\[ (p_n\phi)(x) = \begin{cases} \phi(x) & \text{if } x_1 \ge -n, \\ 0 & \text{otherwise,} \end{cases} \]
E.B. Davies / Journal of Functional Analysis 257 (2009) 506–536
for all φ ∈ l^2(Z^d) and n ≥ 0. Let A be of the form (8) and suppose that it is periodic in the x_1 direction. Also let H = A + B where B is any bounded operator confined to S in the sense that p_0 B = B p_0 = B. If J is defined as in Theorem 2 then
\[ \sigma(A) = \sigma_{ess}(A) = \sigma(\pi_J(H)) \subseteq \sigma_{ess}(H) \subseteq \sigma(H). \]
We omit the proof, which is similar to that of Theorem 9 and uses the fact that B ∈ J.
We finally come to an application that involves two different closed ideals. Let H = A + V_1 + V_2 where A acts on H = l^2(Z^2), is of the form (8) and is periodic in both horizontal and vertical directions. We assume that the bounded potential V_1 has support in Z × [−a_2, a_2] while the bounded potential V_2 has support in [−a_1, a_1] × Z for some finite a_1, a_2. Let J_1 be the ideal associated with the sequence of projections
\[ (p_n\phi)(i,j) = \begin{cases} \phi(i,j) & \text{if } -n \le i \le n, \\ 0 & \text{otherwise,} \end{cases} \]
and let J_2 be the ideal associated with the sequence of projections
\[ (q_n\phi)(i,j) = \begin{cases} \phi(i,j) & \text{if } -n \le j \le n, \\ 0 & \text{otherwise.} \end{cases} \]
The appropriate C*-algebra A is defined by
\[ \mathcal{A} = \{x \in L(H) : xJ_1 \subseteq J_1,\ J_1 x \subseteq J_1,\ xJ_2 \subseteq J_2,\ J_2 x \subseteq J_2\}. \]
Theorem 11. Under the above assumptions H ∈ A and σ_ess(H) = σ_1(A + V_1) ∪ σ_2(A + V_2). If V_1 is periodic in the x_1 direction and V_2 is periodic in the x_2 direction then σ_ess(H) = σ(A + V_1) ∪ σ(A + V_2).
Proof. Since V_2 ∈ J_1, we have σ_1(H) = σ_1(A + V_1). Since V_1 ∈ J_2, we have σ_2(H) = σ_2(A + V_2). In order to apply Theorem 6 we need to prove that σ_3(H) = σ_ess(H). This follows if J_1 ∩ J_2 = K(H). The only non-trivial part is to prove that if x ∈ J_1 ∩ J_2 then x ∈ K(H). Given such an x put x_{m,n} = p_m q_n x q_n p_m for all m, n ≥ 1. Noting that p_m and q_n commute and that their product is of finite rank we see that x_{m,n} ∈ K(H) for all m, n. Since x ∈ J_2 we have
\[ \lim_{n\to\infty} x_{m,n} = p_m x p_m \]
and since x ∈ J_1 we have
\[ \lim_{m\to\infty} p_m x p_m = x. \]
Therefore x ∈ K(H). The final statement of the theorem involves an application of Theorem 9. □
4. The standard C*-algebra

If A = L(H) for some infinite-dimensional, separable Hilbert space H then A contains only one non-trivial ideal, namely K(H). In this section we construct a 'slightly smaller' C*-algebra which has a rich ideal structure. We formulate the theory at a very general metric space level, so that it is applicable not only to Z^d and R^d, but to unbounded discrete and continuous graphs and waveguides, in which X is an unbounded region in R^d. In Section 8 we show that it may also be applied to Schrödinger operators on Riemannian manifolds, writing out the details in the case of three-dimensional hyperbolic space.
If H = L^2(R^d) or H = l^2(Z^d) then the C*-algebra A constructed below coincides with the algebra C^u(Q) of [11] by virtue of [11, Propositions 4.11, 4.12]. However, this fact depends on the use of Fourier transforms on R^d or Z^d, which have no analogue in our more general context.
Let (X, d, μ) denote a space X provided with a metric d and a measure μ; we require X to be a complete separable metric space with infinite diameter in which every closed ball is compact; all balls in this paper are taken to have positive and finite radius. We also require that the measure of every open ball B(a, r) = {x ∈ X : d(x, a) < r} is positive and finite. Let U denote the class of all non-empty, open subsets of X. If S, T ⊆ X we put
\[ d(S,T) = \inf\{d(s,t) : s \in S \text{ and } t \in T\}. \]
The function x → d(x, S) is continuous on X; indeed |d(x, S) − d(y, S)| ≤ d(x, y) for all x, y ∈ X and S ⊆ X. If (X, d) is a length space in the sense of Gromov then
\[ \overline{B(a,r)} = \{x \in X : d(x,a) \le r\} \]
and
\[ d(B(a,r), B(b,s)) = \max\{d(a,b) - r - s,\ 0\} \]
for all a, b ∈ X and r, s > 0. However, if X = Z^d with the Euclidean metric, neither of these identities need hold.
Now put H = L^2(X, μ). For any S ∈ U we define the projection P_S on H by
\[ (P_S\phi)(x) = \begin{cases} \phi(x) & \text{if } x \in S, \\ 0 & \text{otherwise.} \end{cases} \]
We abbreviate P_{B(a,r)} to P_{a,r}.
Lemma 12. If A ∈ L(H) then there exists a largest open set U such that A P_U = 0. There also exists a largest open set V such that P_V A = 0.
Proof. If 𝒱 is the class of all open sets V such that A P_V = 0 then the only candidate for U is U = ⋃_{V∈𝒱} V, and by Lindelöf's theorem we may also write U = ⋃_{n=1}^∞ V_n where V_n is a sequence of sets in 𝒱. Put W_1 = V_1 and W_{n+1} = W_n ∪ V_{n+1}. If W_n ∈ 𝒱 then
\[ A P_{W_{n+1}} = A P_{W_n} + A P_{V_{n+1}}(1 - P_{W_n}) = 0, \]
so W_{n+1} ∈ 𝒱. It follows by induction that A P_{W_n} = 0 for all n ≥ 1. Now P_{W_n} is an increasing sequence of projections that converges weakly to P_U, so A P_U = 0. The second statement of the lemma has a similar proof. □
Lemma 13. If A, B ∈ L(H) and AB ≠ 0 then for every ε > 0 there exists a ∈ X such that A P_{a,ε} ≠ 0 and P_{a,ε} B ≠ 0.
Proof. Let {a_n}_{n=1}^∞ be a countable dense set in X and define the sets E_n inductively by E_1 = B(a_1, ε) and E_{n+1} = B(a_{n+1}, ε) \ (E_1 ∪ ··· ∪ E_n). It follows directly that the sets E_n are disjoint and that their union is X. Therefore
\[ \lim_{n\to\infty} \sum_{r=1}^{n} P_{E_r} = I, \]
the limit being in the weak operator topology. Therefore
\[ \lim_{n\to\infty} \sum_{r=1}^{n} A P_{E_r} B = AB \ne 0 \]
in the same sense and there must exist n such that A P_{E_n} B ≠ 0. We conclude first that A P_{E_n} ≠ 0 and P_{E_n} B ≠ 0 and then that A P_{a_n,ε} ≠ 0 and P_{a_n,ε} B ≠ 0. □
We say that A ∈ L(H) lies in A_m (or that A has range m) if P_{a,r} A P_{b,s} ≠ 0 implies d(a, b) ≤ r + s + m. If A has an integral kernel K this amounts to requiring that K(x, y) ≠ 0 implies d(x, y) ≤ m, but we do not require that A has such a kernel.
Lemma 14. If A ∈ A_m and B ∈ A_n then A^* ∈ A_m, A + B ∈ A_{max(m,n)} and AB ∈ A_{m+n}.
Proof. The invariance of A_m under adjoints follows immediately from its definition. If P_{a,r}(A + B)P_{b,s} ≠ 0 then P_{a,r} A P_{b,s} ≠ 0 or P_{a,r} B P_{b,s} ≠ 0. Therefore d(a, b) ≤ r + s + m or d(a, b) ≤ r + s + n. In both cases we deduce that d(a, b) ≤ r + s + max(m, n). If P_{a,r} AB P_{b,s} ≠ 0 then Lemma 13 implies that for every ε > 0 there exists c ∈ X such that P_{a,r} A P_{c,ε} ≠ 0 and P_{c,ε} B P_{b,s} ≠ 0. Therefore d(a, c) ≤ r + ε + m and d(c, b) ≤ ε + s + n. These imply that d(a, b) ≤ r + s + m + n + 2ε. Letting ε → 0 we finally deduce that AB ∈ A_{m+n}. □
We will frequently refer to the standard C*-algebra A below; this is defined in the next theorem. The algebra Ã below is called the set of all finite range operators in [11, Section 4].
Theorem 15. If A is the norm closure of Ã = ⋃_{n=0}^∞ A_n then A is a C*-subalgebra of L(H). If V ∈ L^∞(X, μ) and V also denotes the operator of multiplication by the function V, then V ∈ A. Moreover K(H) ⊆ A.
Proof. The first statement follows directly from Lemma 14. If P_{a,r} V P_{b,s} ≠ 0 then P_{a,r} P_{b,s} V ≠ 0 and hence P_{a,r} P_{b,s} ≠ 0. Therefore the open set U = B(a, r) ∩ B(b, s) is not empty, and there exists c ∈ X with d(a, c) < r and d(b, c) < s. Therefore d(a, b) < r + s ≤ r + s + 0 and V ∈ A_0. If A is compact and A = A P_U = P_U A for some open set U with diameter n then P_{a,r} A P_{b,s} ≠ 0 implies P_{a,r} P_U A P_U P_{b,s} ≠ 0 and hence P_{a,r} P_U ≠ 0 and P_U P_{b,s} ≠ 0. Hence there exist u, v ∈ U such that d(a, u) < r and d(v, b) < s. We deduce that d(a, b) < r + s + n so A ∈ A_n. Since the set of all such A is norm dense in K(H), we conclude that K(H) ⊆ A. □
If S ∈ U and r > 0 we put
\[ S(r) = \{x \in X : d(x,S) < r\} = \bigcup\{B(x,r) : x \in S\}. \]
The following alternative definition of A is slightly more transparent in spite of the fact that it quantifies over a much larger class of sets.
Theorem 16. Given m ≥ 1, let Y_m denote the set of all A ∈ L(H) such that for every S ∈ U one has A P_S = P_{S(m)} A P_S. Then A is the norm closure of ⋃_{m=0}^∞ Y_m.
Proof. If we put T(m) = X \ S(m) then A ∈ Y_m if and only if for every S ∈ U one has P_{T(m)} A P_S = 0. Let A ∈ A_m, 0 < r < 1/3 and s = 1/3. If S ∈ U and b ∈ T(m + 1) then B(a, r) ⊆ S implies d(a, b) ≥ m + 1 > r + s + m and then P_{b,s} A P_{a,r} = 0. Since S may be written as the union of a countable number of balls B(a, r) with 0 < r < 1/3, Lemma 12 implies that P_{b,s} A P_S = 0. Since T(m + 1) may be covered by a countable number of balls B(b, s), all with s = 1/3, we deduce that P_{T(m+1)} A P_S = 0. Therefore A ∈ Y_{m+1}. Conversely let A ∈ Y_m, r, s > 0 and d(a, b) > r + s + m. If we put S = B(a, r) then B(b, s) ⊆ T(m), so P_{T(m)} A P_S = 0 implies P_{b,s} A P_{a,r} = 0. Therefore A ∈ A_m. The two inclusions together imply
\[ \bigcup_{m=1}^{\infty} A_m = \bigcup_{m=1}^{\infty} Y_m \]
and hence the statement of the theorem. □
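The range calculus of Lemma 14 and the localisation property used in Theorem 16 can be tried out on a toy model. The following sketch assumes X is a finite window of Z with the metric |x − y|, so that operators are matrices and "range m" means that the kernel vanishes when |x − y| > m. The window size, the helper names and the choice of test matrices are illustrative assumptions, not part of the paper.

```python
from itertools import product

N = 12  # finite window {0, ..., N-1} of Z with metric |x - y| (assumption of this sketch)

def banded(m):
    """Matrix of an operator whose kernel K(x, y) vanishes unless |x - y| <= m."""
    return [[1.0 if abs(x - y) <= m else 0.0 for y in range(N)] for x in range(N)]

def add(A, B):
    return [[A[x][y] + B[x][y] for y in range(N)] for x in range(N)]

def matmul(A, B):
    return [[sum(A[x][k] * B[k][y] for k in range(N)) for y in range(N)] for x in range(N)]

def kernel_range(A):
    """Smallest m such that K(x, y) != 0 implies |x - y| <= m."""
    return max((abs(x - y) for x, y in product(range(N), repeat=2) if A[x][y] != 0.0), default=0)

def neighbourhood(S, r):
    """S(r) = {x : d(x, S) < r}, computed inside the window."""
    return {x for x in range(N) if min(abs(x - s) for s in S) < r}

def right_project(A, S):
    """A P_S: keep only the columns indexed by S."""
    return [[A[x][y] if y in S else 0.0 for y in range(N)] for x in range(N)]

def left_project(S, A):
    """P_S A: keep only the rows indexed by S."""
    return [[A[x][y] if x in S else 0.0 for y in range(N)] for x in range(N)]

A, B = banded(2), banded(3)
sum_range = kernel_range(add(A, B))         # Lemma 14: at most max(2, 3)
product_range = kernel_range(matmul(A, B))  # Lemma 14: at most 2 + 3

S = {4, 5}
APS = right_project(A, S)
# Theorem 16 (proof): an operator of range m satisfies A P_S = P_{S(m+1)} A P_S.
localised = left_project(neighbourhood(S, 2 + 1), APS) == APS
```

Because the test matrices have non-negative entries there is no cancellation, so the bounds of Lemma 14 are attained in this example; for general operators they are only upper bounds.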
We wish to associate an ideal JS with every non-empty open subset S of X. This may be done in two ways and we will prove that they yield the same result. The idea is to identify operators that ‘decrease in size’ as one moves away from S. It will become clear that JS depends only on the asymptotic form of S at infinity, and that two sets S1 and S2 that move away from each other as one goes to infinity give rise to different ideals, however slowly this separation occurs.
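If S ∈ U and r > 0, we put
\[ J_{S,n} = \{A \in \mathcal{A} : A = P_{S(n)} A P_{S(n)}\} = \{A \in \mathcal{A} : A = A P_{S(n)} = P_{S(n)} A\} = \{A \in \mathcal{A} : 0 = A P_{T(n)} = P_{T(n)} A\} \]
where T(n) = X \ S(n) = {x ∈ X : d(x, S) ≥ n}. We also define
\[ K_{S,n} = \{A \in \mathcal{A} : A P_{a,r} \ne 0 \Rightarrow d(a,S) \le n + r\} \cap \{A \in \mathcal{A} : P_{a,r} A \ne 0 \Rightarrow d(a,S) \le n + r\} \]
\[ \phantom{K_{S,n}} = \{A \in \mathcal{A} : d(a,S) > n + r \Rightarrow A P_{a,r} = P_{a,r} A = 0\}. \]
Lemma 17. If n ≥ 1 then
\[ \bigcup\{B(x,r) : d(x,S) > r + n\} \subseteq T(n) \subseteq \bigcup\{B(x,r) : d(x,S) > r + n - 1\}. \tag{11} \]
Hence
\[ K_{S,n-1} \subseteq J_{S,n} \subseteq K_{S,n}. \tag{12} \]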
Proof. If y ∈ B(x, r) and d(x, S) > r + n then d(y, S) > n. Hence B(x, r) ⊆ T(n). This proves the first inclusion of (11). If x ∈ T(n) then d(x, S) ≥ n. Putting r = 1/2 we deduce that x ∈ B(x, r) and d(x, S) > r + n − 1. This proves the second inclusion of (11). If A ∈ K_{S,n−1} then A P_{x,r} = P_{x,r} A = 0 for all x, r such that d(x, S) > r + n − 1, so Lemma 12 and the second inclusion of (11) together imply that A P_{T(n)} = P_{T(n)} A = 0. Therefore A ∈ J_{S,n}. On the other hand if A ∈ J_{S,n} then A P_{T(n)} = P_{T(n)} A = 0. The first inclusion of (11) now implies that A P_{x,r} = P_{x,r} A = 0 whenever d(x, S) > r + n. Therefore A ∈ K_{S,n}. This completes the proof of (12). □
Let F denote the family of all non-empty open sets S such that S(n) ≠ X for every n ≥ 1. We say that S, T ∈ F are asymptotically equivalent if for all n ≥ 1 there exists m ≥ 1 such that S(n) ⊆ T(m) and T(n) ⊆ S(m). In particular all non-empty, open, bounded sets are asymptotically equivalent to each other.
Theorem 18. If S ∈ U then
\[ \overline{\bigcup_{n=1}^{\infty} J_{S,n}} = \overline{\bigcup_{n=1}^{\infty} K_{S,n}}. \]
If S ∈ F then this set, denoted by JS , is a proper, closed, two-sided ideal in A and it contains K(H). If S, T are asymptotically equivalent then JS = JT .
Proof. Lemma 17 implies that
\[ \bigcup_{n=1}^{\infty} J_{S,n} = \bigcup_{n=1}^{\infty} K_{S,n}. \]
We denote this linear subspace of A by J_S°. Let A ∈ A_m and B ∈ K_{S,n}. If AB P_{a,r} ≠ 0 then B P_{a,r} ≠ 0 so d(a, S) ≤ n + r. If P_{a,r} AB ≠ 0 then Lemma 13 implies that for every ε > 0 there exists b ∈ X such that P_{a,r} A P_{b,ε} ≠ 0 and P_{b,ε} B ≠ 0. Therefore d(a, b) ≤ m + r + ε and d(b, S) ≤ n + ε. We conclude that d(a, S) ≤ m + n + r + 2ε. Since ε > 0 is arbitrary we deduce that d(a, S) ≤ m + n + r. Therefore AB ∈ K_{S,m+n}. A similar argument can be applied to BA. These calculations imply that Ã J_S° ⊆ J_S° and J_S° Ã ⊆ J_S°. The statement of the theorem now follows by a density argument.
In order to prove that J_S is proper we need to establish that ‖I − A‖ ≥ 1 for all A ∈ J_S, or equivalently that this holds for all A ∈ J_{S,n} and all n ≥ 1. If A = A P_{S(n)} = P_{S(n)} A then this follows from
\[ \|I - P_{S(n)}\| = \|(I - P_{S(n)})(I - A)\| \le \|I - P_{S(n)}\|\,\|I - A\| \le \|I - A\| \]
provided ‖I − P_{S(n)}‖ = 1. Since S ∈ F there exists a ∈ X \ S(n + 2). This implies that B(a, 1) ∩ S(n) = ∅. Since B(a, 1) has positive measure there exists a non-zero φ ∈ H whose support is contained in B(a, 1) and for which (I − P_{S(n)})φ = φ.
If A is a finite rank operator then lim_{n→∞} ‖A − P_{S(n)} A P_{S(n)}‖ = 0, so A ∈ J_S. The same applies to all A ∈ K(H) by a density argument.
If S, T are asymptotically equivalent then routine algebra shows that J_S° = J_T°. This implies immediately that J_S = J_T. □
If A ∈ A and S ∈ F we put σ_S(A) = σ(π_{J_S}(A)).
Theorem 19. Let S, T ∈ F. If S ⊆ T then J_S ⊆ J_T and σ_S(A) ⊇ σ_T(A) for every A ∈ A. If S, T ∈ F are asymptotically independent in the sense that
\[ \forall n \ge 1.\ \exists m \ge 1.\ S(n) \cap T(n) \subseteq (S \cap T)(m), \tag{13} \]
then
\[ J_{S \cap T} = J_S \cap J_T \tag{14} \]
and
\[ \sigma_{S \cap T}(A) = \sigma_S(A) \cup \sigma_T(A) \tag{15} \]
for every A ∈ A.
Proof. If A ∈ J_S° then there exists n ≥ 1 such that A = A P_{S(n)} = P_{S(n)} A. If S ⊆ T, this implies A = A P_{T(n)} = P_{T(n)} A and hence J_S° ⊆ J_T°. Therefore J_S ⊆ J_T and σ_S(A) ⊇ σ_T(A) for every A ∈ A by Lemma 1.
If S, T ∈ F we deduce that J_{S∩T} ⊆ J_S ∩ J_T. Now suppose that S, T are asymptotically independent and that A ∈ J_S ∩ J_T. Eq. (1) implies
\[ \lim_{n\to\infty} \|A - P_{S(n)} A P_{S(n)}\| = 0, \qquad \lim_{n\to\infty} \|A - P_{T(n)} A P_{T(n)}\| = 0. \]
If we put
\[ A_n = P_{S(n)} P_{T(n)} A P_{T(n)} P_{S(n)} = P_{S(n) \cap T(n)} A P_{T(n) \cap S(n)}, \]
then asymptotic independence implies
\[ A_n = P_{(S \cap T)(m)} A_n = A_n P_{(S \cap T)(m)}, \]
so A_n ∈ J_{S∩T}°. Finally
\[ \lim_{n\to\infty} \|A - A_n\| \le \lim_{n\to\infty} \|A - P_{S(n)} A P_{S(n)}\| + \lim_{n\to\infty} \|P_{S(n)}(A - P_{T(n)} A P_{T(n)})P_{S(n)}\| \]
\[ \le \lim_{n\to\infty} \|A - P_{S(n)} A P_{S(n)}\| + \lim_{n\to\infty} \|A - P_{T(n)} A P_{T(n)}\| = 0. \]
Therefore A ∈ J_{S∩T}. Eq. (15) finally follows from Theorem 6. □
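Condition (13) can be checked by brute force in a small discrete example. The sketch below assumes X = Z^2 with the Euclidean metric, truncated to a finite window, and takes S and T to be the two coordinate axes, so that S ∩ T is the origin. The window size and all names are illustrative assumptions. In this example S(n) ∩ T(n) is a square, which fits inside a ball (S ∩ T)(m) once m exceeds its half-diagonal.

```python
import math

R = 12  # half-width of the finite window of Z^2 used in this sketch
GRID = [(i, j) for i in range(-R, R + 1) for j in range(-R, R + 1)]
S = [(i, 0) for i in range(-R, R + 1)]  # horizontal axis
T = [(0, j) for j in range(-R, R + 1)]  # vertical axis
S_CAP_T = [(0, 0)]                      # the two axes meet only at the origin

def neighbourhood(A, n):
    """A(n) = {x : d(x, A) < n} for the Euclidean metric, inside the window."""
    return {p for p in GRID if min(math.dist(p, a) for a in A) < n}

def contained(n, m):
    """Check the inclusion S(n) ∩ T(n) ⊆ (S ∩ T)(m) required in (13)."""
    return (neighbourhood(S, n) & neighbourhood(T, n)) <= neighbourhood(S_CAP_T, m)
```

For instance `contained(5, 6)` holds while `contained(5, 5)` fails, because the corner (4, 4) of the square lies at distance about 5.66 from the origin; taking m = 2n works for every n in this example.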
The C*-algebra A contains L^∞(X) and is therefore not separable. It is unlikely that one can obtain a useful classification of its irreducible representations, but a partial classification of its ideals can be obtained as follows.
Let X̄ be some compactification of X and let ∂X = X̄ \ X denote the 'points at infinity'. The restriction of any f ∈ C(X̄) to X lies in L^∞(X, μ). Since every non-empty open subset of X has positive measure we see that
\[ \|f\|_{C(\bar X)} = \|f\|_{L^\infty} = \|f\|_{L(H)} = \|f\|_{\mathcal{A}}. \tag{16} \]
It follows that B = C(X̄) is a commutative C*-subalgebra of A. Note that there is an order-preserving one-one correspondence between the ideals I in B and the open subsets V of X̄. It is given by
\[ V_I = \{x \in \bar X : f(x) \ne 0 \text{ for some } f \in I\} \]
and
\[ I_V = \{f \in C(\bar X) : f|_{\bar X \setminus V} = 0\}. \]
We will write Ē to denote the (compact) closure of a set E ⊆ X̄ in X̄, even if E ⊆ X. If U is an open subset of X then we define its set of asymptotic directions Û ⊆ ∂X to be the set of all a ∈ ∂X that possess a neighbourhood V ⊆ X̄ for which V ∩ X ⊆ U. It is immediate that Û is an open subset of ∂X and that U ∪ Û is an open subset of X̄ with complement \overline{X \ U}.
If S ∈ U then S(n) is an increasing sequence of open sets in X, so \widehat{S(n)} is an increasing sequence of open subsets of ∂X. We put
\[ \tilde S = \bigcup_{n \ge 1} \widehat{S(n)} \]
and observe that S̃ is also an open subset of ∂X.
Example 20. Let X = R^d with the usual Euclidean metric and let Σ be the 'sphere at infinity' parametrized by unit vectors e, called directions.
(a) If
\[ S = \{x \in X : \forall i \in \{1, 2, \ldots, d\}.\ x_i > 0\}, \]
then
\[ \widehat{S(n)} = \tilde S = \{e \in \Sigma : \forall i \in \{1, 2, \ldots, d\}.\ e_i > 0\} \]
for all n ≥ 1. Therefore B/(J_S ∩ B) ≅ C(K) where
\[ K = \{e \in \Sigma : \exists i \in \{1, 2, \ldots, d\}.\ e_i \le 0\}. \]
(b) If we are only interested in asymptotics in a particular direction e ∈ Σ then we may define
\[ S = R^d \setminus \bigcup_{r > 0} B(re, r^{1/2}). \]
One sees that S ∈ F and
\[ \widehat{S(n)} = \tilde S = \Sigma \setminus \{e\} \]
for all n ≥ 1. The quotient map π from B to B/(J_S ∩ B) ≅ C is given by π(f) = f(e).
Lemma 21. If S ∈ U then J_{S,0} ∩ L^∞(X, μ) is dense in J_S ∩ L^∞(X, μ).
Proof. Let f ∈ J_S ∩ L^∞(X, μ). If p_n is the multiplication operator associated with the characteristic function of S(n) then p_n f ∈ J_{S,0} ∩ L^∞ for all n ≥ 1 and lim_{n→∞} ‖f − p_n f‖ = 0 by (1). □
Theorem 22. The map J → V_{J∩B} defines an order-preserving map from ideals in A to open subsets of X̄. If S ∈ U then
\[ V_{J_S \cap B} = \tilde S \cup X. \]
If S ∈ F then S̃ ∪ X ≠ X̄.
Proof. The first statement of the theorem depends on the observation that if J is an ideal in A then J ∩ B is an ideal in B. Given S ∈ U, we put V = V_{J_S∩B}. It follows directly from the definitions that
\[ C_c(S(n) \cup \widehat{S(n)}) \subseteq J_{S,0} \cap B \subseteq J_S \cap B \]
where C_c denotes the space of continuous functions with compact support in the stated set. Therefore S(n) ∪ \widehat{S(n)} ⊆ V for all n ≥ 1. Since S is non-empty, letting n → ∞ we obtain X ∪ S̃ ⊆ V.
If a ∉ X ∪ S̃ then there exists f ∈ C(X̄) such that f(a) = 1. Given g ∈ J_{S,0} ∩ L^∞(X) there exists n ≥ 1 such that g = g p_n = p_n g, where p_n is the characteristic function of S(n). Since a lies in the closure of X \ S(n + 2) in X̄, given ε > 0, there exists b ∈ X \ S(n + 2) such that |f(b) − 1| < ε. Putting ε = 1/2 there exists δ ∈ (0, 1) such that x ∈ B(b, δ) implies |f(x)| > 1/2 and x ∉ S(n). The set B(b, δ) has positive measure so ‖f − g‖_∞ > 1/2. Lemma 21 now implies that ‖f − h‖_∞ ≥ 1/2 for all h ∈ J_S ∩ L^∞(X) so f ∉ J_S ∩ B. Since this holds for all f ∈ C(X̄) such that f(a) = 1 we conclude that a ∉ V and V ⊆ X ∪ S̃. The final statement of the theorem follows from the fact that S ∈ F implies 1 ∉ J_S. □
Corollary 23. If S ∈ F, then B/(J_S ∩ B) ≅ C(∂X \ S̃).

5. Pseudo-resolvents

If one has a family of resolvent operators R(z, A) all lying in a C*-algebra A and π : A → B is an algebra homomorphism with a non-trivial kernel J, then π(R(z)) satisfy the resolvent equations in B. In this section we show how to define the spectrum of this new family, which is not the resolvent family of any obvious operator. This will be a crucial ingredient of our general theory.
Let A° denote the set of invertible elements of an associative algebra A with identity. If a ∈ A the spectrum of a is defined by
\[ \sigma(a) = \{\alpha \in C : \alpha 1 - a \notin A^\circ\}. \]
If we put U = C \ σ(a) and define r : U → A by r_z = (z1 − a)^{-1} then r satisfies the resolvent equations
\[ r_\alpha - r_\gamma = (\gamma - \alpha) r_\alpha r_\gamma \tag{17} \]
for all α, γ ∈ U. Moreover 1 + (γ − α)r_α = (γ1 − a)r_α so σ(a) = {z : 1 + (z − α)r_α ∉ A°}. Our goal in this section is to define the spectrum of a pseudo-resolvent, defined as a function r : U → A that satisfies (17) even though it is not generated by any a ∈ A.
If A is a closed, unbounded operator on a Banach space B and R(z, A) denotes its family of resolvent operators, defined for all z ∉ σ(A), then
\[ \sigma(R(z, A)) = \{0\} \cup \{(z - s)^{-1} : s \in \sigma(A)\} \tag{18} \]
by [5, Lemma 8.1.9]. This motivates our analysis, which is, however, purely algebraic, making no reference to Banach spaces or to unbounded operators. The advantage of this is that the results are immediately applicable to quotient algebras A/J, for which no geometric interpretation exists.
Theorem 24. If U ⊆ C and r : U → A is a pseudo-resolvent and α ∈ U then 1 + (z − α)r_α ∈ A° for all z ∈ U. If
\[ \tilde U = \{z : 1 + (z - \alpha) r_\alpha \in A^\circ\} \]
then U ⊆ Ũ and the formula
\[ \tilde r_z = r_\alpha (1 + (z - \alpha) r_\alpha)^{-1} \tag{19} \]
defines an extension of the pseudo-resolvent from U to Ũ. Moreover r̃ : Ũ → A is a maximal pseudo-resolvent. The set σ(r) = C \ Ũ is called the spectrum of the pseudo-resolvent r and satisfies
\[ \sigma(r) = \{z : 1 + (z - \alpha) r_\alpha \notin A^\circ\} \tag{20} \]
for every choice of α ∈ U.
Proof. By interchanging the labels α, γ in (17) we see that r_α and r_γ commute. Moreover
\[ (1 + (\gamma - \alpha) r_\alpha)(1 + (\alpha - \gamma) r_\gamma) = 1 + (\gamma - \alpha)\{r_\alpha - r_\gamma - (\gamma - \alpha) r_\alpha r_\gamma\} = 1, \]
so both terms on the left-hand side are invertible. This proves that U ⊆ Ũ. If α, z ∈ U then (17) implies that
\[ r_\alpha = r_z (1 + (z - \alpha) r_\alpha), \]
so r̃_z = r_z for all z ∈ U and r̃ is an extension of r to Ũ.
If β, γ ∈ Ũ, then starting from (19) we obtain
\[ (\gamma - \beta)\tilde r_\beta \tilde r_\gamma = (\gamma r_\alpha - \beta r_\alpha)\, r_\alpha (1 + (\beta - \alpha) r_\alpha)^{-1}(1 + (\gamma - \alpha) r_\alpha)^{-1} \]
\[ = \{(1 + (\gamma - \alpha) r_\alpha) - (1 + (\beta - \alpha) r_\alpha)\}\, r_\alpha (1 + (\beta - \alpha) r_\alpha)^{-1}(1 + (\gamma - \alpha) r_\alpha)^{-1} \]
\[ = r_\alpha \{(1 + (\beta - \alpha) r_\alpha)^{-1} - (1 + (\gamma - \alpha) r_\alpha)^{-1}\} = \tilde r_\beta - \tilde r_\gamma. \]
Therefore r̃ is a pseudo-resolvent on Ũ.
Now let r̂ be a further extension of r̃ to a pseudo-resolvent on Û ⊇ Ũ. If z ∈ Û then by the first half of this proof 1 + (z − α)r_α ∈ A°, so z ∈ Ũ. Therefore Û = Ũ and r̃ : Ũ → A is a maximal pseudo-resolvent.
We have proved that Ũ = {z : 1 + (z − α)r_α ∈ A°} for all α ∈ U, and this proves (20). □
Corollary 25. Let J be a two-sided ideal in the associative algebra A with identity and let π : A → A/J be the quotient map. If U ⊆ C and r : U → A is a maximal pseudo-resolvent then
\[ \sigma(\pi(r)) \subseteq \sigma(r). \]
Proof. We need only observe that z → π(r_z) is a pseudo-resolvent in A/J but its domain U need not be maximal. If its maximal extension has domain V ⊇ U then
\[ \sigma(\pi(r)) = C \setminus V \subseteq C \setminus U = \sigma(r). \]
□
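The algebra in Theorem 24 and Corollary 25 can be made concrete in a very small commutative example. The sketch below assumes that the algebra is C^3 with componentwise operations, that the pseudo-resolvent is the genuine resolvent of the element a = (1, 2, 5), and that the ideal J consists of the vectors vanishing in the first two coordinates, so that the quotient map simply forgets the third coordinate. These choices and all function names are illustrative assumptions. The code checks the resolvent equation (17), the extension formula (19) and the spectral criterion (20), and shows the inclusion of Corollary 25 in this example.

```python
A_ELEMENT = [1.0, 2.0, 5.0]  # an element a of the algebra C^3; sigma(a) = {1, 2, 5}

def resolvent(z, coords=(0, 1, 2)):
    """r_z = (z1 - a)^{-1}, computed componentwise on the chosen coordinates."""
    return [1.0 / (z - A_ELEMENT[i]) for i in coords]

def extended(alpha, z, coords=(0, 1, 2)):
    """Formula (19): r_alpha (1 + (z - alpha) r_alpha)^{-1}."""
    return [x / (1.0 + (z - alpha) * x) for x in resolvent(alpha, coords)]

def in_spectrum(alpha, z, coords=(0, 1, 2), tol=1e-12):
    """Criterion (20): 1 + (z - alpha) r_alpha fails to be invertible."""
    return any(abs(1.0 + (z - alpha) * x) < tol for x in resolvent(alpha, coords))

def close(u, v, tol=1e-12):
    return all(abs(a - b) < tol for a, b in zip(u, v))

alpha, gamma = 1j, 3.0 + 2.0j
r_a, r_g = resolvent(alpha), resolvent(gamma)
# Resolvent equation (17): r_alpha - r_gamma = (gamma - alpha) r_alpha r_gamma.
identity_17 = close([x - y for x, y in zip(r_a, r_g)],
                    [(gamma - alpha) * x * y for x, y in zip(r_a, r_g)])
# Formula (19), started from alpha, reproduces r_gamma.
formula_19 = close(extended(alpha, gamma), r_g)
```

Passing `coords=(0, 1)` plays the role of the quotient map π: the point 5 belongs to the spectrum of r but not to the spectrum of π(r), while 1 and 2 belong to both.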
6. Perturbation theory

When extending the theory of Section 3 to differential operators, one has to be careful not to refer to strong operator convergence, because the standard C*-algebra A is only closed under norm convergence. In this section we collect some of the technical results that will be needed. These are formulated at the natural level of generality, but the reader should keep in mind that they will be applied to a resolvent operator A acting in L^2(R^d, dx).
Let X be a set with a countably generated σ-field and a σ-finite measure μ, and put L^2 = L^2(X, μ).
Lemma 26. Let A be a linear operator on L^2 that is positive in the sense that if 0 ≤ φ ∈ L^2 then 0 ≤ Aφ ∈ L^2. Then A is bounded and
\[ \|A\| = \sup\{\|A\phi\| : 0 \le \phi \in L^2 \text{ and } \|\phi\| \le 1\} < \infty. \]
Moreover |A(φ)| ≤ A(|φ|) for all φ ∈ L^2.
Proof. See [5, Lemma 13.1.1 and Theorem 13.1.2]. □
In the following discussion V will always denote a (possibly unbounded) measurable function V : X → C, which we call a potential, and also its associated multiplication operator. Given a positive operator A, let Ṽ_A denote the set of potentials V that are relatively bounded with respect to A in the sense that
\[ \|V\|_A = \sup\{\|V(A\phi)\| : \|\phi\| \le 1\} \]
is finite.
Lemma 27. We have ‖V‖_A ≤ ‖A‖ ‖V‖_∞ for all V ∈ L^∞. Therefore L^∞(X) ⊆ Ṽ_A. If |W| ≤ |V| and V ∈ Ṽ_A then W ∈ Ṽ_A and ‖W‖_A ≤ ‖V‖_A. The space Ṽ_A is a Banach space with respect to the norm ‖·‖_A.
Proof. The last statement is the only one that is not elementary. Let ξ ∈ L^2 satisfy ‖ξ‖_2 = 1 and ξ(x) > 0 almost everywhere in X and let ψ = Aξ, so that ψ ≥ 0. There exists a measurable set E such that ψ(x) > 0 almost everywhere in E and ψ(x) = 0 almost everywhere in X \ E. In many cases E = X but we do not assume this. If φ ∈ L^2 and
\[ \phi_n(x) = \begin{cases} \phi(x) & \text{if } |\phi(x)| \le n\xi(x), \\ \dfrac{n\xi(x)\phi(x)}{|\phi(x)|} & \text{otherwise,} \end{cases} \]
then |φ_n| ≤ |φ| and |φ_n| ≤ nξ. The dominated convergence theorem implies that ‖φ_n − φ‖_2 → 0 as n → ∞. Moreover
\[ |A(\phi_n)| \le A(|\phi_n|) \le A(n\xi) = n\psi \]
so A(φ_n) has support in E. Letting n → ∞ we conclude that the same holds for A(φ). We conclude that if V has support in X \ E then ‖V‖_A = 0, so we focus attention henceforth on the restriction of all the potentials involved to E.
We next observe that ‖Vψ‖_2 ≤ ‖V‖_A, so if V_n is a Cauchy sequence in Ṽ_A then V_nψ is a Cauchy sequence in L^2(E, μ). Therefore V_nψ converges in L^2 norm to a limit Vψ in L^2(E, μ). There exists a subsequence n(r) such that V_{n(r)} converges almost everywhere in E to V. Given ε > 0 there exists N_ε such that for all m, n ≥ N_ε we have
\[ \|(V_m - V_n)(A\phi)\|_2 \le \varepsilon \|\phi\|_2 \]
for all φ ∈ L^2. Replacing n by n(r), letting r → ∞ and using Fatou's lemma we obtain
\[ \|(V_m - V)(A\phi)\|_2 \le \varepsilon \|\phi\|_2 \]
for all m ≥ N_ε and all φ ∈ L^2. Hence V ∈ Ṽ_A and ‖V_m − V‖_A → 0 as m → ∞. □
Now let V_A denote the closure of L^∞ in Ṽ_A.
Lemma 28. If V ∈ Ṽ_A then V ∈ V_A if and only if lim_{n→∞} ‖V^{(n)} − V‖_A = 0 where
\[ V^{(n)}(x) = \begin{cases} V(x) & \text{if } |V(x)| \le n, \\ \dfrac{nV(x)}{|V(x)|} & \text{otherwise.} \end{cases} \]
If |W| ≤ |V| and V ∈ V_A then W ∈ V_A.
Proof. If V ∈ V_A then there exist X_n ∈ L^∞ such that ‖X_n‖_∞ ≤ n and ‖V − X_n‖_A → 0 as n → ∞. By carrying out a separate calculation at every x ∈ X we see that |V − V^{(n)}| ≤ |V − X_n|. Lemma 27 now implies that
\[ \lim_{n\to\infty} \|V - V^{(n)}\|_A \le \lim_{n\to\infty} \|V - X_n\|_A = 0. \]
The converse statement, that lim_{n→∞} ‖V^{(n)} − V‖_A = 0 implies V ∈ V_A, is elementary. The second statement of the lemma follows in a similar way from the inequality |W − W^{(n)}| ≤ |V − V^{(n)}|. □
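The "separate calculation at every x" in the proof of Lemma 28 is the pointwise fact that, among all complex numbers of modulus at most n, the truncated value V^{(n)}(x) is the closest to V(x). The following sketch tests this inequality on random samples; the sampling ranges, the seed and the function names are arbitrary illustrative choices.

```python
import cmath
import math
import random

def truncate(v, n):
    """V^(n) at one point: keep v if |v| <= n, otherwise rescale it to modulus n."""
    return v if abs(v) <= n else n * v / abs(v)

def pointwise_bound_holds(n, samples=10000, seed=0):
    """Test |v - v^(n)| <= |v - x| for random v and random x with |x| <= n."""
    rng = random.Random(seed)
    for _ in range(samples):
        v = complex(rng.uniform(-10, 10), rng.uniform(-10, 10))
        x = cmath.rect(rng.uniform(0, n), rng.uniform(0, 2 * math.pi))
        # A small tolerance absorbs floating-point rounding.
        if abs(v - truncate(v, n)) > abs(v - x) + 1e-12:
            return False
    return True
```

The inequality itself follows from the triangle inequality, since |v − v^{(n)}| = max(|v| − n, 0) ≤ |v| − |x| ≤ |v − x| whenever |x| ≤ n < |v|; the code is only a numerical sanity check.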
Lemma 29. If 0 ≤ A ≤ B as operators on L^2, in the sense that 0 ≤ Aφ ≤ Bφ for all φ such that 0 ≤ φ ∈ L^2, then V_B ⊆ V_A.
Proof. If φ ∈ L^2 and V ∈ Ṽ_B then
\[ |V(A\phi)| = |V|\,|A(\phi)| \le |V|\,A(|\phi|) \le |V|\,B(|\phi|) \]
so
\[ \|V(A\phi)\|_2 \le \|\,|V|\,B|\phi|\,\|_2 \le \|\,|V|\,\|_B\, \|\,|\phi|\,\|_2 = \|V\|_B \|\phi\|_2 \]
for all φ ∈ L^2. This implies ‖V‖_A ≤ ‖V‖_B < ∞ and hence V ∈ Ṽ_A. The proof of the lemma is completed as in Lemma 28. □
We now specialize to the case in which H = L^2(R^d, μ). Our goal is to describe certain classes of potential in V_A, particularly when A is a positive convolution operator. Such operators arise as the resolvents of constant coefficient, second order partial differential operators and in certain other contexts; the reader primarily interested in Schrödinger operators should keep Example 34 in mind. We will use the classical L^p inequalities due to Hölder, Young, Hausdorff–Young and Riesz–Thorin without further mention.
Lemma 30. If a ∈ L^1(R^d) then the operator A on L^2(R^d) defined by Aφ = a ∗ φ lies in the standard C*-algebra A.
Proof. If
\[ a_n(x) = \begin{cases} a(x) & \text{if } |x| \le n, \\ 0 & \text{otherwise} \end{cases} \]
and A_nφ = a_n ∗ φ then
\[ \lim_{n\to\infty} \|A_n - A\| \le \lim_{n\to\infty} \|a_n - a\|_1 = 0 \]
by Lemma 43. We combine this with the observation that A_n ∈ A_n, because the support of A_nφ must lie within a distance n of the support of φ. □
Let C_d denote the set of operators A on L^2(R^d, dx) given by Aφ = a ∗ φ, where 0 ≤ a ∈ L^1(R^d, dx).
Lemma 31. If A ∈ C_d and a ∈ L^p for some 1 < p ≤ 2 then L^q ⊆ V_A, where 1/p + 1/q = 1.
Proof. If V ∈ L^q then
\[ \|V(a * \phi)\|_2 \le \|V\|_q \|a\|_p \|\phi\|_2, \]
so ‖V‖_A ≤ ‖V‖_q ‖a‖_p. □
Lemma 32. If A ∈ C_d and â ∈ L^p where â denotes the Fourier transform of a and 2 ≤ p < ∞, then L^p ⊆ V_A.
Proof. This uses the bound
\[ \|V\|_A \le c_{d,p} \|V\|_p \|\hat a\|_p. \tag{21} \]
See, for example, [5, Theorem 5.7.3]. □
There are many other results of a similar type in which both of the L^p norms in (21) are replaced by other choices. See [21, Chapter 4] for details. The following type of bound is used when analyzing multi-body Schrödinger operators. The decomposition of R^d used below may be combined with a Euclidean rotation of R^d, since this amounts to a change of coordinate system.
Theorem 33. Let x = (x_1, x_2) ∈ R^{d_1} × R^{d_2} where d = d_1 + d_2 and suppose that |V(x_1, x_2)| ≤ W(x_1) for all x ∈ R^d where W ∈ L^p(R^{d_1}) and 2 ≤ p < ∞. Suppose also that A ∈ C_d, B ∈ C_{d_1} and
\[ \hat a(\xi_1, \xi_2) \le \hat b(\xi_1) \]
for all ξ ∈ R^d, where 0 ≤ b ∈ L^1(R^{d_1}) and b̂ ∈ L^p(R^{d_1}). Then V ∈ V_A.
Proof. We may write V = XW where |X| ≤ 1. We may also write A = BC where ‖C‖ ≤ 1; in fact C = F^{-1}MF where F is the Fourier transform and M is the operator of multiplication by a function m with |m| ≤ 1. Therefore
\[ \|V\|_A = \|XWBC\| \le \|W\|_B \le c\|W\|_p \]
by applying Lemma 32 in R^{d_1}. □
Example 34. If H = −Δ acting in L^2(R^d) with the usual domain then A = (I + H)^{-1} is of the form Aφ = a ∗ φ where 0 ≤ a ∈ L^1(R^d) and â(ξ) = (1 + |ξ|^2)^{-1} for all ξ ∈ R^d. Theorem 33 is applicable in this context because
\[ (1 + |\xi|^2)^{-1} \le (1 + |\xi_1|^2)^{-1} \]
whenever ξ = (ξ_1, ξ_2). One needs to assume that p ≥ 2 and p > d_1/2.
7. Applications to differential operators

In this section we show that the C*-algebra methods developed above can be used to study the spectra of certain differential operators. Instead of trying to study σ(A) directly we may redirect our attention to the spectrum of one of its resolvent operators by virtue of the results in Section 5. We say that the closed, unbounded operator A is affiliated to the C*-subalgebra A of L(H) if the conditions of the following lemma are satisfied.
Lemma 35. Let A be a C*-subalgebra of L(H) and let R(z, A) ∈ A for some z ∉ σ(A). Then R(w, A) ∈ A for all w ∉ σ(A).
Proof. If X = I + (w − z)R(z, A) then X ∈ A and
\[ \sigma(X) = \{1\} \cup \left\{1 + \frac{w - z}{z - s} : s \in \sigma(A)\right\} = \{1\} \cup \left\{\frac{w - s}{z - s} : s \in \sigma(A)\right\}. \]
Since this does not contain 0 we deduce that X is invertible in L(H), and hence also invertible in A. Since R(w, A) = R(z, A)X^{-1} as in [5, Theorem 1.2.10], we deduce that R(w, A) ∈ A. □
We say that a one-parameter group or semigroup T_t is affiliated to A if its generator is affiliated in the above sense, i.e. if the associated resolvent family lies in A. If H is a typical Schrödinger operator acting in L^2(R^d), then the unitary operators e^{−iHt} do not lie in the standard C*-algebra A, but we will see they are affiliated to it.
Let H = L^2(R^d) and let H_0 be a constant coefficient differential operator whose symbol is the polynomial p, so that H_0φ = F^{-1}pFφ where F is the Fourier transform operator and p is regarded as an unbounded multiplication operator. It is immediate that H_0 is a closed operator on Dom(H_0) = {φ ∈ H : pFφ ∈ H}.
Theorem 36. Suppose that lim_{|ξ|→∞} |p(ξ)| = +∞ and that there exists a real constant b such that Re(p(ξ)) ≤ b for all ξ ∈ R^d. Then σ(H_0) ⊆ {z : Re(z) ≤ b}. If Re(z) > b then R(z, H_0) lies in the standard C*-algebra A.
Proof. We have R(z, H_0) = F^{-1}ρF where ρ ∈ C_0(R^d) is defined by ρ(ξ) = (z − p(ξ))^{-1}. If n ≥ 1 we define
\[ \rho_n(\xi) = e^{-|\xi|^2/n}\,(z - p(\xi))^{-1}. \]
Putting R = R(z, H_0) and R_n = F^{-1}ρ_nF we see that
\[ \lim_{n\to\infty} \|R_n - R\| = \lim_{n\to\infty} \|\rho_n - \rho\|_\infty = 0 \]
so it is enough to prove that R_n ∈ A for all n ≥ 1. Since ρ_n lies in the Schwartz space S it is enough to observe that R_nφ = k_n ∗ φ for all φ ∈ L^2 where k_n ∈ S ⊆ L^1; we may then apply Lemma 30. □
Before starting applications we change conventions so as to conform to the standard practice in quantum theory, writing −H where one might expect to see H.
Example 37. The differential operator
\[ (H_0\phi)(x, y) = -\frac{\partial^2\phi}{\partial x^2} - \frac{\partial^3\phi}{\partial y^3} \]
acting in L^2(R^2) has symbol p(ξ, η) = ξ^2 + iη^3 and is highly non-elliptic. Nevertheless the conditions of Theorem 36 are satisfied. The same applies to the non-negative, self-adjoint, differential operator acting in L^2(R^2) with real symbol
\[ p(\xi, \eta) = \xi^2 + (\eta - \xi^n)^2 \]
where n ≥ 2.
The following hypothesis is valid for a variety of second order elliptic differential operators with variable coefficients; see [4].
Hypothesis 1. The operator −H_0 is the generator of a strongly continuous one-parameter semigroup e^{−H_0 t} on L^2(R^d). Moreover there exist positive constants c, α and an integral kernel K(t, x, y) such that
\[ 0 \le K(t, x, y) \le c\,t^{-d/2} e^{-\alpha|x - y|^2/t} \tag{22} \]
for all t > 0 and x, y ∈ R^d and
\[ (e^{-H_0 t}\phi)(x) = \int_{R^d} K(t, x, y)\phi(y)\,dy \tag{23} \]
for all φ ∈ L^2(R^d) and x ∈ R^d.
Lemma 38. Under Hypothesis 1
\[ \sigma(H_0) \subseteq \{z : \mathrm{Re}(z) \ge 0\} \]
and (λI + H_0)^{-1} has an integral kernel G(λ, x, y) for every λ > 0. There exists a function g_λ ∈ L^1(R^d) and a constant c_1 > 0 such that
\[ 0 \le G(\lambda, x, y) \le g_\lambda(x - y) \]
and
\[ \|(\lambda I + H_0)^{-1}\| \le \|g_\lambda\|_1 = c_1\lambda^{-1} < \infty. \]
Proof. If we put
\[ k_t(x) = c\,t^{-d/2} e^{-\alpha|x|^2/t} \]
then there exists c_1 > 0 such that ‖k_t‖_1 = c_1 for all t > 0. Therefore ‖e^{−H_0 t}‖ ≤ c_1 for all t > 0 and σ(H_0) ⊆ {z : Re(z) ≥ 0}. If λ > 0 the kernel G satisfies
\[ 0 \le G(\lambda, x, y) = \int_0^\infty K(t, x, y) e^{-\lambda t}\,dt \le \int_0^\infty k_t(x - y) e^{-\lambda t}\,dt = g_\lambda(x - y), \]
where the positivity of the functions involved implies that
\[ \|g_\lambda\|_1 = \int_0^\infty \|k_t\|_1 e^{-\lambda t}\,dt = c_1/\lambda. \]
Note finally that
\[ \hat g_\lambda(\xi) = \frac{c_1}{\lambda + c_2|\xi|^2} \]
for some c_2 > 0, all λ > 0 and all ξ ∈ R^d. □
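The closed form for ĝ_λ comes from taking the Laplace transform in t of the Fourier transform of the Gaussian k_t. The sketch below assumes the Fourier convention in which k̂_t(ξ) = c_1 exp(−c_2 t|ξ|^2) with c_2 = 1/(4α), and compares a trapezoidal approximation of the t-integral with c_1/(λ + c_2|ξ|^2). The default constants, the truncation point t_max and the step count are arbitrary illustrative choices.

```python
import math

def laplace_of_gaussian_symbol(xi, lam, c1=1.0, c2=0.25, t_max=60.0, steps=60000):
    """Trapezoidal approximation of the integral over [0, t_max] of
    c1 * exp(-c2 * t * xi**2) * exp(-lam * t) with respect to t."""
    h = t_max / steps
    rate = lam + c2 * xi * xi
    # End points carry weight 1/2 in the trapezoidal rule.
    total = 0.5 * (1.0 + math.exp(-rate * t_max))
    for k in range(1, steps):
        total += math.exp(-rate * k * h)
    return c1 * h * total

def closed_form(xi, lam, c1=1.0, c2=0.25):
    """The expression c1 / (lam + c2 * xi**2) stated at the end of the proof."""
    return c1 / (lam + c2 * xi * xi)
```

With these defaults the two functions agree to several decimal places for moderate values of ξ and λ; the truncation at t_max is harmless because the integrand decays exponentially.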
Example 39. A bound of the type (22) is not valid for fractional powers of the Laplacian, i.e. H_0 = (−Δ)^α where 0 < α < 1. However, in this case the one-parameter semigroup e^{−H_0 t} has the kernel K(t, x, y) = k_t(x − y) > 0 for all t > 0, where ‖k_t‖_1 = 1 and
\[ \hat k_t(\xi) = e^{-t|\xi|^{2\alpha}} \]
for all t > 0 and ξ ∈ R^d. The construction of k_t uses the theory of fractional powers of generators of one-parameter semigroups; see [22, Chapter 9.11]. The resolvent operator (λI + H_0)^{-1} has the kernel
\[ G(\lambda, x, y) = g_\lambda(x - y) > 0 \]
for all λ > 0, where
\[ g_\lambda(x) = \int_0^\infty k_t(x) e^{-\lambda t}\,dt > 0. \]
One deduces that $\|g_\lambda\|_1 = \lambda^{-1} < \infty$ and $\hat g_\lambda(\xi) = (\lambda + |\xi|^{2\alpha})^{-1}$ for all $\lambda > 0$ and $\xi \in \mathbb{R}^d$. The methods developed in this paper still apply.
The above results allow us to reformulate our problem.
Hypothesis 2. Let $K : \mathbb{R}^d \times \mathbb{R}^d \to \mathbb{R}$ and $k \in L^1(\mathbb{R}^d)$ satisfy $0 \le K(x, y) \le k(x - y)$ for all $x, y \in \mathbb{R}^d$. Let $R_0$ be the positive operator associated with $K(x, y)$ and let $B$ be the positive operator associated with $k(x - y)$, so that $0 \le R_0 \le B$.
Lemma 29 now implies that $\mathcal{V}_B \subseteq \mathcal{V}_{R_0}$. If $R_0 = (\lambda I + H_0)^{-1}$ in the following theorem then $R = (\lambda I + H_0 + V)^{-1}$ and the assumption $\|V R_0\| < 1$ states that $V$ has relative bound less than 1 with respect to $H_0$ in the conventional language of perturbation theory.
Lemma 40. Given Hypothesis 2, let the potential $V \in \mathcal{V}_{R_0}$ satisfy $\|V R_0\| < 1$ and put
\[
R = R_0 (I + V R_0)^{-1} = R_0 \sum_{n=0}^{\infty} (-V R_0)^n. \tag{24}
\]
Then the operators $R_0$ and $R$ both lie in the standard $C^*$-algebra $\mathcal{A}$.
Proof. Given $\varepsilon > 0$ there exists $c \in \mathbb{Z}_+$ such that
\[
\int_{|x_1| > c} |k(x)|\,dx < \varepsilon.
\]
If we put
\[
K_c(x, y) = \begin{cases} K(x, y) & \text{if } |x - y| \le c, \\ 0 & \text{otherwise,} \end{cases}
\]
and define the operator $T_c$ on $\mathcal{H}$ by
\[
(T_c \phi)(x) = \int_{\mathbb{R}^d} K_c(x, y) \phi(y)\,dy
\]
then $\|R_0 - T_c\| < \varepsilon$ and $T_c P_{S(n)} = P_{S(n+c)} T_c P_{S(n)}$ for every $S \in \mathcal{F}$ and $n \ge 1$, hence $T_c \in \mathcal{D}_0$ and $R_0 \in \mathcal{D}$. Applying the same argument to $R_0^*$ yields $R_0 \in \mathcal{A}$ by virtue of Lemma 3. Defining $V(r)$ as in Lemma 28, the identities $V(r) P_{S(n)} = P_{S(n)} V(r)$ for all $S$, $n$ and $r$ imply that $V(r) R_0 \in \mathcal{A}$. Hence $V R_0 \in \mathcal{A}$. The norm convergence of the series in (24) now implies that $R \in \mathcal{A}$. $\Box$
We conclude with two applications to quantum theory. In the first we consider the Schrödinger operator $H = H_0 + V$ acting in $L^2(\mathbb{R}^d)$, where $H_0 = -\Delta$ and $V = W + X$ is a sum of possibly complex-valued potentials satisfying the conditions specified below. Passing to the resolvent operators we actually consider $R_0 = (aI + H_0)^{-1}$, $R_1 = (aI + H_0 + W)^{-1}$ and $R = (aI + H)^{-1}$, where $a > 0$ is large enough to ensure that all the inverses exist.
Theorem 41. Suppose that $V$ and $W$ lie in the space $\mathcal{V}_{R_0}$ defined just before Lemma 28 and that $\|V R_0\| < 1$, $\|W R_0\| < 1$. Suppose that $W$ is periodic in the $x_1$ direction. Let $S = \{x \in \mathbb{R}^d : |x_1| < 1\}$, so that $S(n) = \{x \in \mathbb{R}^d : |x_1| < n + 1\}$ for all $n \ge 1$. Suppose that $X$ has support in $S(c)$ for some $c \ge 1$. Finally define the ideal $\mathcal{J}_S \subseteq \mathcal{A}$ as in Theorem 18. Then
\[
\sigma(H_0 + W) = \sigma_{\mathrm{ess}}(H_0 + W) = \sigma_S(H_0 + W) = \sigma_S(H) \subseteq \sigma_{\mathrm{ess}}(H) \subseteq \sigma(H) \tag{25}
\]
where $\sigma_S(A) = \sigma(\pi_{\mathcal{J}_S}(A))$ for every $A \in \mathcal{A}$.
Proof. The operators $R_0$, $R_1$ and $R$ all lie in $\mathcal{A}$ for large enough $a > 0$ by Lemma 40. Eq. (25) is equivalent, by definition, to
\[
\sigma(R_1) = \sigma_{\mathrm{ess}}(R_1) = \sigma_S(R_1) = \sigma_S(R) \subseteq \sigma_{\mathrm{ess}}(R) \subseteq \sigma(R). \tag{26}
\]
The proof of the first two equalities in (26) uses the periodicity of $R_1$ in the $x_1$ direction as in the proof of Theorem 9. We next observe that
\[
I + V R_0 = I + W R_0 + X R_0 = \bigl\{ I + X R_0 (I + W R_0)^{-1} \bigr\} (I + W R_0) = (I + X R_1)(I + W R_0).
\]
Since $I + V R_0$ and $I + W R_0$ are invertible, it follows that $I + X R_1$ is invertible. Therefore
\[
R = R_0 (I + V R_0)^{-1} = R_0 (I + W R_0)^{-1} (I + X R_1)^{-1} = R_1 (I + X R_1)^{-1}.
\]
Since $X \in \mathcal{V}_{R_0}$ has support in $S(c)$ and $\mathcal{J}_S$ is an ideal we can use Lemma 28 to deduce that $X R_1 \in \mathcal{J}_S$. Therefore
\[
\pi_{\mathcal{J}_S}(R) = \pi_{\mathcal{J}_S}(R_1) \bigl( I + \pi_{\mathcal{J}_S}(X R_1) \bigr)^{-1} = \pi_{\mathcal{J}_S}(R_1).
\]
The inclusions in (26) now follow by applying Lemma 1. $\Box$
Example 42. We next point out the relevance of the above results to multi-body Schrödinger operators. Let $\mathcal{H} = L^2(\mathbb{R}^3 \times \mathbb{R}^3)$ and put $x = (x_1, x_2)$ where $x_i \in \mathbb{R}^3$. Let $H_0 = -\Delta$ and define $H = H_0 + V_1(x_1) + V_2(x_2) + V_3(x_1 - x_2)$ where all three potentials lie in $L^2(\mathbb{R}^3) + C_0(\mathbb{R}^3)$. By allowing $V_1$, $V_2$ and $V_3$ to be complex-valued we include in our analysis the non-self-adjoint Schrödinger operators that arise when discussing resonances via complex scaling. For suitable choices of $V_i$ this operator might be regarded as describing two (spinless) electrons orbiting around a fixed nucleus (a simplified Helium atom). Standard estimates imply that $V_i$ all have relative bound 0 with respect to $H_0$ and
that they all lie in $\mathcal{V}_{R_0}$ with $\|V_i R_0\| < 1/3$ provided $R_0 = (aI + H_0)^{-1}$ and $a > 0$ is large enough. Lemma 40 implies that all of the relevant resolvent operators lie in the standard $C^*$-algebra $\mathcal{A}$. One can produce several asymptotic sets from $\{x : |x_1| < 1\}$, $\{x : |x_2| < 1\}$, $\{x : |x_1 - x_2| < 1\}$, and we will concentrate on two of these. If one puts
\[
S = \{x : |x_1| < 1\} \cup \{x : |x_2| < 1\} \cup \{x : |x_1 - x_2| < 1\},
\]
it is evident that $S \in \mathcal{F}$ and that $V_1 + V_2 + V_3 \in \mathcal{J}_S$. Hence $\sigma_S(H) = \sigma_S(H_0) = [0, \infty)$. This set relates to the states in which both particles move away to infinity and they also separate from each other. On the other hand if one puts
\[
T = \{x : |x_2| < 1\} \cup \{x : |x_1 - x_2| < 1\},
\]
it is evident that $T \in \mathcal{F}$ and that $V_2 + V_3 \in \mathcal{J}_T$. Hence $\sigma_T(H) = \sigma_T(H_0 + V_1) = \sigma(H_0 + V_1)$. This set relates to the states in which particle 2 moves away to infinity and also separates from particle 1, which may or may not stay close to the nucleus. If $A = -\Delta + V_1$ acting in $L^2(\mathbb{R}^3)$ then by taking Fourier transforms with respect to $x_2$ it is seen that $\sigma(H_0 + V_1) = \sigma(A) + [0, \infty)$ where $\sigma(A) = [0, \infty) \cup \{\lambda_n\}$, where $\lambda_n$ are the possibly complex-valued discrete eigenvalues of the operator $A$.
8. Hyperbolic space
Let $(X, d, \mu)$ denote a complete non-compact Riemannian manifold $X$ with bounded geometry, Riemannian metric $d$ (in the sense of the triangle inequality) and Riemannian measure $\mu$. The Laplace–Beltrami operator $H = -\Delta$ on $L^2(X, \mu)$ is essentially self-adjoint on $C_c^\infty(X)$ and the spectrum of its closure is contained in $[0, \infty)$. The one-parameter semigroup $\{e^{-Ht}\}_{t \ge 0}$ is associated with a positive $C^\infty$ heat kernel $K$ by
\[
(e^{-Ht} f)(x) = \int_X K(t, x, y) f(y)\,\mu(dy).
\]
The kernel $K$ satisfies
\[
\int_X K(t, x, y)\,\mu(dy) = 1
\]
for all $x \in X$ and $t > 0$. We wish to show that $e^{-Ht}$ and $(\lambda I + H)^{-1}$ lie in the $C^*$-algebra $\mathcal{A}$ for all $t, \lambda > 0$. Rather than proving this under the weakest possible conditions, we consider the
hyperbolic space $\mathbb{H}^3$, in which all of the expressions involved may be written down explicitly. The proof that we give may be extended to $\mathbb{H}^d$ for arbitrary $d \ge 2$ with minimal effort. The geometry of hyperbolic space is well-studied; see [18, Section 4.6] for the results listed below. In the upper half space model $\mathbb{H}^n$ is the set $\{x \in \mathbb{R}^n : x_n > 0\}$ with the local Riemannian metric
\[
ds^2 = \frac{dx_1^2 + \cdots + dx_n^2}{x_n^2}.
\]
The global metric $d$ is given by
\[
\cosh d(x, y) = 1 + \frac{|x - y|^2}{2 x_n y_n}
\]
and the volume element is given by
\[
\mu(dx) = \frac{dx_1 \cdots dx_n}{x_n^n}.
\]
The area of the sphere $S(x, r)$ of radius $r > 0$ does not depend on $x \in X$ and is given by $\rho(r) = c_n \sinh^{n-1}(r)$ where $c_3 = 4\pi$. If $f : (0, \infty) \to \mathbb{R}$ is any positive, measurable function then
\[
\int_X f(d(x, y))\,\mu(dy) = \int_0^\infty f(r) \rho(r)\,dr
\]
for all $x \in X$. If $X = \mathbb{H}^n$, the spectrum of $H = -\Delta$ acting in $L^2(X, \mu)$ is equal to $[(n - 1)^2/4, \infty)$, but the $L^p$ spectrum depends on $p$; see [8]. The heat kernel may be written in the form $K(t, x, y) = k_t(d(x, y))$, where for $n = 3$ we have
\[
k_t(r) = (4\pi t)^{-n/2}\, \frac{r\, e^{-t - r^2/4t}}{\sinh(r)}.
\]
See [9]; see also [6] for relevant upper and lower bounds when $n = 3$. One verifies directly that
\begin{align*}
\int_X K(t, x, y)\,\mu(dy) &= \int_0^\infty k_t(r) \rho(r)\,dr \\
&= \int_0^\infty (4\pi)^{-1/2} t^{-3/2}\, r \sinh(r)\, e^{-t - r^2/4t}\,dr \\
&= \int_{-\infty}^\infty (4\pi)^{-1/2} t^{-3/2}\, 2^{-1} r\, e^{r - t - r^2/4t}\,dr \\
&= \int_{-\infty}^\infty (4\pi)^{-1/2} t^{-3/2}\, 2^{-1} r\, e^{-(r - 2t)^2/4t}\,dr \\
&= 1
\end{align*}
for all $t > 0$. If $\lambda > 0$ the operator $(H + \lambda I)^{-1}$ has a Green function $G$ given explicitly by $G(\lambda, x, y) = g_\lambda(d(x, y))$, where
\[
g_\lambda(r) = \int_0^\infty e^{-\lambda t} k_t(r)\,dt = \frac{e^{-r\sqrt{\lambda + 1}}}{4\pi \sinh(r)}.
\]
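A direct calculation establishes that
\[
\int_X G(\lambda, x, y)\,\mu(dy) = \int_0^\infty g_\lambda(r) \rho(r)\,dr = 1/\lambda \tag{27}
\]
for all $\lambda > 0$. We will need the following lemma.
Lemma 43. If
\[
(Rf)(x) = \int_X r(x, y) f(y)\,\mu(dy)
\]
for all $x \in X$ and $f \in L^2(X, \mu)$, then
\[
\|R\|^2_{L^2(X, \mu)} \le \Bigl\{ \sup_{x \in X} \int_X r(x, y)\,\mu(dy) \Bigr\} \Bigl\{ \sup_{x \in X} \int_X r(y, x)\,\mu(dy) \Bigr\}.
\]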
See [5, Corollary 2.2.15] for the proof.
Theorem 44. If $\lambda > 0$ and $t > 0$ then $e^{-Ht}$ and $(H + \lambda I)^{-1}$ both lie in the standard $C^*$-algebra $\mathcal{A}$.
Proof. The proof is almost the same in both cases so we only treat the resolvent operators. We have $(\lambda I + H)^{-1} = A_n + B_n$ where
\[
(A_n f)(x) = \int_X a_n(x, y) f(y)\,\mu(dy), \qquad a_n(x, y) = \tilde a_n(d(x, y)), \qquad \tilde a_n(r) = \begin{cases} g_\lambda(r) & \text{if } r \le n, \\ 0 & \text{otherwise,} \end{cases}
\]
and
\[
(B_n f)(x) = \int_X b_n(x, y) f(y)\,\mu(dy), \qquad b_n(x, y) = \tilde b_n(d(x, y)), \qquad \tilde b_n(r) = \begin{cases} g_\lambda(r) & \text{if } r > n, \\ 0 & \text{otherwise.} \end{cases}
\]
It follows from its definition that $A_n \in \mathcal{A}_n$ and from (27) and Lemma 43 that $\lim_{n \to \infty} \|B_n\| = 0$. $\Box$
Example 45. The ideas in the second part of Section 4 can be applied in the setting of hyperbolic space. In the upper half space model the natural compactification has $\partial \mathbb{H}^n \cong (\mathbb{R}^{n-1} \times \{0\}) \cup \{\infty\}$. If we put $S = \{x \in \mathbb{H}^n : 0 < x_n < 1\}$ then $S(m) = \{x \in \mathbb{H}^n : 0 < x_n < e^m\}$. Moreover S(m) S = R × {0} for all $m \ge 1$. Therefore the quotient map $\pi : \mathcal{B} \to \mathcal{B}/(\mathcal{J}_S \cap \mathcal{B}) \cong \mathbb{C}$ is given by $\pi(f) = f(\infty)$.
Acknowledgments
I should like to thank V. Georgescu, S. Richard, A. Pushnitski and J. Weir for helpful comments on an earlier version of the paper.
References
[1] W.O. Amrein, A. Boutet de Monvel, V. Georgescu, C0-Groups, Commutator Methods and Spectral Theory of N-Body Hamiltonians, Birkhäuser-Verlag, 1996.
[2] A. Boutet de Monvel, V. Georgescu, Graded C*-algebras in the N-body problem, J. Math. Phys. 32 (1991) 3101–3110.
[3] A. Boutet de Monvel, V. Georgescu, Graded C*-algebras associated to symplectic spaces and spectral analysis of many channel Hamiltonians, in: Dynamics of Complex and Irregular Systems, Bielefeld, 1991, in: Bielefeld Encount. Math. Phys., vol. 8, World Sci. Publishing, River Edge, NJ, 1993, pp. 22–66.
[4] E.B. Davies, Heat Kernels and Spectral Theory, Cambridge Univ. Press, Cambridge, 1989.
[5] E.B. Davies, Linear Operators and Their Spectra, Cambridge Univ. Press, Cambridge, 2007.
[6] E.B. Davies, N. Mandouvalos, Heat kernel bounds on hyperbolic space and Kleinian groups, Proc. London Math. Soc. 57 (1988) 182–208.
[7] E.B. Davies, B. Simon, Scattering theory for systems with different spatial asymptotics on the left and right, Comm. Math. Phys. 63 (1978) 277–301.
[8] E.B. Davies, B. Simon, M. Taylor, Lp spectral theory of Kleinian groups, J. Funct. Anal. 78 (1988) 116–136.
[9] A. Debiard, B. Gaveau, E. Mazet, Théorèmes de comparaison en géométrie riemannienne, Publ. RIMS Kyoto Univ. 12 (1976) 391–425.
[10] J. Dixmier, Les C*-Algèbres et Leurs Représentations, Gauthier–Villars, Paris, 1969.
[11] V. Georgescu, S. Golénia, Compact perturbations and stability of the essential spectrum of singular differential operators, J. Operator Theory 59 (2008) 115–155.
[12] V. Georgescu, A. Iftimovici, C*-algebras of quantum Hamiltonians, in: J.-M. Combes, J. Cuntz, G.A. Elliot, G. Nenciu, H. Siedentop, S. Stratila (Eds.), Operator Algebras and Mathematical Physics, Proc. Conf. Operator Algebras and Mathematical Physics, Constanta, 2001, Theta, Bucharest, 2003, pp. 123–167.
[13] V. Georgescu, A. Iftimovici, Localizations at infinity and essential spectrum of quantum Hamiltonians: I. General theory, Rev. Math. Phys. 18 (2006) 417–483.
[14] J. Hinchcliffe, PhD thesis, King’s College London, 2006.
[15] M. Măntoiu, C*-algebras, dynamical systems, spectral analysis, in: Operator Algebras and Mathematical Physics, Constanţa, 2001, Theta, Bucharest, 2003, pp. 299–314.
[16] M. Măntoiu, C*-algebras, dynamical systems at infinity and the essential spectrum of generalized Schrödinger operators, J. Reine Angew. Math. 550 (2002) 211–229.
[17] G.K. Pedersen, C*-Algebras and Their Automorphism Groups, Academic Press, London, 1979.
[18] J.G. Ratcliffe, Foundations of Hyperbolic Manifolds, Springer-Verlag, New York, 1994.
[19] S. Richard, Spectral and scattering theory for Schrödinger operators with Cartesian anisotropy, Publ. RIMS Kyoto Univ. 41 (2005) 73–111.
[20] J. Roe, Band-dominated Fredholm operators on discrete groups, Integral Equation Operator Theory 51 (2005) 411–416.
[21] B. Simon, Trace Ideals and Their Applications, Cambridge Univ. Press, Cambridge, 1979.
[22] K. Yoshida, Functional Analysis, Springer-Verlag, Berlin, 1965.
Journal of Functional Analysis 257 (2009) 537–552 www.elsevier.com/locate/jfa
Non-spectrality of planar self-affine measures with three-elements digit set Jian-Lin Li College of Mathematics and Information Science, Shaanxi Normal University, Xi’an 710062, PR China Received 16 November 2008; accepted 12 December 2008 Available online 14 January 2009 Communicated by L. Gross
Abstract
The self-affine measure $\mu_{M,D}$ associated with an affine iterated function system $\{\phi_d(x) = M^{-1}(x + d)\}_{d \in D}$ is uniquely determined. The problems of determining the spectrality or non-spectrality of a measure $\mu_{M,D}$ have received much attention in recent years. One of the non-spectral problems on $\mu_{M,D}$ is to estimate the number of orthogonal exponentials in $L^2(\mu_{M,D})$ and to find them. In the present paper we show that for an expanding integer matrix $M \in M_2(\mathbb{Z})$ and the three-elements digit set $D$ given by
\[
M = \begin{pmatrix} a & b \\ d & c \end{pmatrix} \quad \text{and} \quad D = \left\{ \begin{pmatrix} 0 \\ 0 \end{pmatrix}, \begin{pmatrix} 1 \\ 0 \end{pmatrix}, \begin{pmatrix} 0 \\ 1 \end{pmatrix} \right\},
\]
if $ac - bd \notin 3\mathbb{Z}$, then there exist at most 3 mutually orthogonal exponentials in $L^2(\mu_{M,D})$, and the number 3 is the best. This confirms the three-elements digit set conjecture on the non-spectrality of self-affine measures in the plane.
© 2008 Elsevier Inc. All rights reserved.
Keywords: Iterated function system; Self-affine measure; Orthogonal exponentials; Spectral measure
E-mail address: [email protected]. 0022-1236/$ – see front matter © 2008 Elsevier Inc. All rights reserved. doi:10.1016/j.jfa.2008.12.012
1. Introduction
We follow the paper [15] to consider the three-elements digit set conjecture on the non-spectrality of self-affine measures in the plane. The question addressed in the present paper deals
J.-L. Li / Journal of Functional Analysis 257 (2009) 537–552
with a dichotomy problem for certain fractals (affine iterated function systems, IFSs) which has received a good amount of attention in recent years. The fractals under consideration arise from an iteration scheme applied to a fixed and finite number of contractive affine mappings
\[
\phi_d(x) = M^{-1}(x + d) \qquad (x \in \mathbb{R}^n,\ d \in D)
\]
in $\mathbb{R}^n$, where $M \in M_n(\mathbb{R})$ is an $n \times n$ expanding real matrix (that is, all the eigenvalues of the real matrix $M$ have moduli $> 1$), and the digit set $D \subset \mathbb{R}^n$ is a finite subset of cardinality $|D|$. In dynamics and in other applications of traditional Fourier series to computational mathematics, one is often faced with a set arising as the attractor $T(M, D)$ of the IFS $\{\phi_d(x)\}_{d \in D}$, and coming equipped with the equilibrium measure $\mu_{M,D}$. This has led to attempts at adapting traditional Fourier tools to the fractal setting. The attractor or invariant set $T := T(M, D)$ is the unique nonempty compact set satisfying
\[
MT = \bigcup_{d \in D} (T + d),
\]
and the measure $\mu := \mu_{M,D}$ is the unique probability measure satisfying
\[
\mu = \frac{1}{|D|} \sum_{d \in D} \mu \circ \phi_d^{-1}. \tag{1.1}
\]
The invariant set $T(M, D)$ includes complicated geometries, and the invariant measure $\mu_{M,D}$, which is also called a self-affine measure, includes the restriction of $n$-dimensional Lebesgue measure. They are all determined by the pair $(M, D)$. Moreover $\mu_{M,D}$ is supported on $T(M, D)$ (cf. [7]). So for $n = 1$, by way of example, there are the Cantor set and the Cantor measure on the line; and for $n = 2$ there is a rich variety of geometries, of which the best known example is the Sierpinski gasket. The problem considered below started with a discovery in an earlier paper of Jorgensen and Pedersen [9] where it was proved that certain IFS fractals have Fourier bases, and furthermore that the question of counting orthogonal Fourier frequencies (or orthogonal exponentials in $L^2(\mu_{M,D})$) for a fixed fractal involves an intrinsic arithmetic of the finite set of functions making up the IFS $\{\phi_d(x)\}_{d \in D}$ under consideration. For example if $(M, D) = (3, \{0, 2\})$ is the middle-third Cantor example on the line, there cannot be more than two orthogonal Fourier frequencies [9, Theorem 6.1], while a similar Cantor example using instead a subdivision scale 4 (i.e., $(M, D) = (4, \{0, 2\})$) turns out to have an ONB in $L^2(\mu_{M,D})$ consisting of Fourier frequencies [9, Theorem 3.4]. The present paper is motivated by these earlier results, and it solves a conjecture for the case $n = 2$, that is, the planar case. The main result here deals with a Sierpinski family. The main theorem shows that if the corresponding scaling matrix is integral, expansive, and has determinant indivisible by 3, then the corresponding $L^2(\mu_{M,D})$ can have at most 3 orthogonal Fourier frequencies (in vector form), and further that 3 is best possible. Recall that for a probability measure $\mu$ of compact support on $\mathbb{R}^n$, we call $\mu$ a spectral measure if there exists a discrete set $\Lambda \subset \mathbb{R}^n$ such that the exponential function system $E_\Lambda := \{e^{2\pi i \langle \lambda, x \rangle} : \lambda \in \Lambda\}$ forms an orthogonal basis (Fourier basis) for $L^2(\mu)$.
The set Λ is then called a spectrum for μ; we also say that (μ, Λ) is a spectral pair (cf. [10]). Spectral measure is a natural generalization of spectral set introduced by Fuglede [5] whose famous
spectrum-tiling conjecture and its related problems have received much attention in recent years (cf. [1,3,4,11,12]). Probably the most interesting question is the spectrality or non-spectrality of a self-affine measure $\mu_{M,D}$. We will focus our attention on the following question in the plane: Under what conditions on $M$ and $D$ is $\mu_{M,D}$ a spectral measure or a non-spectral measure? It is known that the non-spectral problem on self-affine measures consists of the following two classes:
(I) There are at most a finite number of orthogonal exponentials in $L^2(\mu_{M,D})$, that is, any set of $\mu_{M,D}$-orthogonal exponentials contains at most finitely many elements. The main questions here are to estimate the number of orthogonal exponentials in $L^2(\mu_{M,D})$ and to find them (cf. [2,14,15]).
(II) There are natural infinite families of orthogonal exponentials, but none of them forms an orthogonal basis in $L^2(\mu_{M,D})$. The main question is whether some of these families can be combined to form larger collections of orthogonal exponentials. The other questions concerning this class can be found in [6,8].
Except the case that there might be no more than two orthogonal exponentials, the problem on a non-spectral measure $\mu_{M,D}$ in fact falls into one of the above two classes (see [2, Section 3]). Let $|\det(M)| = m = p_1^{b_1} p_2^{b_2} \cdots p_r^{b_r}$ be the standard prime factorization, where $p_1 < p_2 < \cdots < p_r$ are prime numbers, $b_j > 0$ ($j = 1, 2, \ldots, r$). We use $W(m)$ to denote the set of non-negative integer combinations of $p_1, p_2, \ldots, p_r$ (cf. [11, Section 4.2], [13, Section 3]). The known results in this direction provide some supportive evidence that the following Conjectures 1 and 2 should be true (cf. [15, Conjectures 1 and 2]).
Conjecture 1. For an expanding integer matrix $M \in M_n(\mathbb{Z})$ and a finite digit set $D \subset \mathbb{Z}^n$, if $|D| \notin W(m)$, then $\mu_{M,D}$ is a non-spectral measure and the non-spectral problem on this $\mu_{M,D}$ falls in the class (I).
In the plane $\mathbb{R}^2$, the special case of Conjecture 1 with the three-elements digit set $D$ reduces to the following.
Conjecture 2. For an expanding integer matrix $M \in M_2(\mathbb{Z})$ and the three-elements digit set $D$ given by
\[
M = \begin{pmatrix} a & b \\ d & c \end{pmatrix} \quad \text{and} \quad D = \left\{ \begin{pmatrix} 0 \\ 0 \end{pmatrix}, \begin{pmatrix} 1 \\ 0 \end{pmatrix}, \begin{pmatrix} 0 \\ 1 \end{pmatrix} \right\}, \tag{1.2}
\]
if $ac - bd \notin 3\mathbb{Z}$, then there exist at most 3 mutually orthogonal exponentials in $L^2(\mu_{M,D})$, and the number 3 is the best.
The main purpose of this paper is to show that Conjecture 2 is true. That is, we get the following.
Theorem. The above Conjecture 2 holds.
The proof of the Theorem depends mainly on the characterization of the zero set $Z(\hat\mu_{M,D})$ of the Fourier transform $\hat\mu_{M,D}$. In the previous research, we usually need an expression for the matrices $M^j$ or $M^{*-j}$ ($j = 1, 2, \ldots$) in order to characterize the zero set $Z(\hat\mu_{M,D})$, where $M^*$ denotes the conjugate transpose of $M$, in fact $M^* = M^t$. This certainly can be realized for all upper or lower triangular matrices (cf. [15]). However, for a general $2 \times 2$ matrix $M$, it is more difficult to
get an expression for $M^j$ or $M^{*-j}$ ($j = 1, 2, \ldots$), and the method there cannot be applied. In contrast to the previous research, in the present paper we first write $M^* = 3\tilde M + M_\alpha$ for two matrices $\tilde M$ and $M_\alpha$, where the entries of the matrix $M_\alpha$ are from the set $\{0, 1, 2\}$; we then view the matrix $M^*$ as an operator acting on certain concrete sets, which leads us to conclude that the operator $M^*$ is periodic when it acts on these concrete sets. The periodicity enables us to characterize the zero set $Z(\hat\mu_{M,D})$ and to find more inclusion relations inside the zero set. Some facts concerning this zero set are given in Section 2. Based on these established facts, we prove the Theorem in Section 3. We believe that the method used here can provide a way of dealing with the non-spectral problem on $\mu_{M,D}$.
2. Relations inside the zero set $Z(\hat\mu_{M,D})$
In this section we will establish more relations inside the zero set $Z(\hat\mu_{M,D})$. The main interesting conclusion is the periodicity of the operator $M^*$ (in the sense of set inclusion relation) when $M^*$ acts on certain sets.
2.1. General observation
For a general expanding matrix $M \in M_n(\mathbb{R})$ and a finite subset $D \subset \mathbb{R}^n$, the Fourier transform of the self-affine measure $\mu_{M,D}$ is
\[
\hat\mu_{M,D}(\xi) = \int e^{2\pi i \langle \xi, t \rangle}\,d\mu_{M,D}(t) \qquad (\xi \in \mathbb{R}^n).
\]
From (1.1), we have
\[
\hat\mu_{M,D}(\xi) = \prod_{j=1}^{\infty} m_D\bigl(M^{*-j} \xi\bigr), \tag{2.1}
\]
where
\[
m_D(t) := \frac{1}{|D|} \sum_{d \in D} e^{2\pi i \langle d, t \rangle}. \tag{2.2}
\]
The infinite product (2.1) converges absolutely for all $\xi \in \mathbb{R}^n$. It also converges uniformly on compact subsets of $\mathbb{R}^n$. For any $\lambda_1, \lambda_2 \in \mathbb{R}^n$, $\lambda_1 \ne \lambda_2$, the orthogonality condition
\[
\bigl\langle e^{2\pi i \langle \lambda_1, x \rangle}, e^{2\pi i \langle \lambda_2, x \rangle} \bigr\rangle_{L^2(\mu_{M,D})} = \int e^{2\pi i \langle \lambda_1 - \lambda_2, x \rangle}\,d\mu_{M,D} = \hat\mu_{M,D}(\lambda_1 - \lambda_2) = 0 \tag{2.3}
\]
directly relates to the zero set $Z(\hat\mu_{M,D})$ of $\hat\mu_{M,D}$. From (2.1), we have
\[
Z(\hat\mu_{M,D}) = \bigl\{ \xi \in \mathbb{R}^n : \exists\, j \in \mathbb{N} \text{ such that } m_D\bigl(M^{*-j} \xi\bigr) = 0 \bigr\}. \tag{2.4}
\]
Furthermore, we have the following.
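Proposition 1. Let $\Theta_j = \{\xi \in \mathbb{R}^n : m_D(M^{*-j} \xi) = 0\}$ ($j = 1, 2, \ldots$). Then
(1) $Z(\hat\mu_{M,D}) = \bigcup_{j=1}^{\infty} \Theta_j$;
(2) $\xi_0 \in Z(\hat\mu_{M,D}) \Leftrightarrow -\xi_0 \in Z(\hat\mu_{M,D})$ or $\xi_0 \in \Theta_j \Leftrightarrow -\xi_0 \in \Theta_j$ for $j = 1, 2, \ldots$;
(3) $\Theta_{j+1} = M^*(\Theta_j)$ for $j = 1, 2, \ldots$.
Furthermore, if $D \subset \mathbb{Z}^n$, then $\Theta_j \cap M^{*j} \mathbb{Z}^n = \emptyset$ and $\Theta_j + M^{*j} \mathbb{Z}^n = \Theta_j$ for $j = 1, 2, \ldots$.
2.2. Expression of the zero set $Z(\hat\mu_{M,D})$
In the following, we will restrict our discussion to the special $M$ and $D$ given by (1.2). Let $\Theta_0 = \{\xi \in \mathbb{R}^2 : m_D(\xi) = 0\}$. Then $\Theta_0 = Z_0 \cup \tilde Z_0$, where
\[
Z_0 = \left\{ \begin{pmatrix} 1/3 \\ 2/3 \end{pmatrix} + \begin{pmatrix} k_1 \\ k_2 \end{pmatrix} : k_1, k_2 \in \mathbb{Z} \right\} \subset \mathbb{R}^2, \tag{2.5}
\]
and
\[
\tilde Z_0 = \left\{ \begin{pmatrix} 2/3 \\ 1/3 \end{pmatrix} + \begin{pmatrix} \tilde k_1 \\ \tilde k_2 \end{pmatrix} : \tilde k_1, \tilde k_2 \in \mathbb{Z} \right\} \subset \mathbb{R}^2. \tag{2.6}
\]
From Proposition 1, the zero set $Z(\hat\mu_{M,D})$ can be represented as
\[
Z(\hat\mu_{M,D}) = \bigcup_{j=1}^{\infty} M^{*j}(Z_0 \cup \tilde Z_0) = \bigcup_{j=1}^{\infty} \bigl( M^{*j}(Z_0) \cup M^{*j}(\tilde Z_0) \bigr). \tag{2.7}
\]
Let $Z_j := M^{*j}(Z_0)$ and $\tilde Z_j := M^{*j}(\tilde Z_0)$ for $j = 1, 2, \ldots$. We further have the following.
Proposition 2. The sets $Z_j$ and $\tilde Z_j$ satisfy the following properties:
(1) $(x, y)^t \in Z_j \Leftrightarrow (-x, -y)^t \in \tilde Z_j$, that is, $Z_j = -\tilde Z_j$ or $\tilde Z_j = -Z_j$ ($j = 1, 2, \ldots$);
(2) $Z_j - Z_j \subseteq \mathbb{Z}^2$ and $\tilde Z_j - \tilde Z_j \subseteq \mathbb{Z}^2$ ($j = 1, 2, \ldots$);
(3) $Z_j + Z_j \subseteq \tilde Z_j$ and $\tilde Z_j + \tilde Z_j \subseteq Z_j$ ($j = 1, 2, \ldots$).
2.3. Illustration of the method
In order to find more relations inside the zero set $Z(\hat\mu_{M,D})$, we will reduce the fractional expression in (2.7), which may possibly come from (2.5) and (2.6), to its lowest terms. The denominator of the fractional expression is only the number 3. So we consider the integers $a, b, c, d$ in the matrix $M^*$ according to their residue classes modulo 3. The condition $a, b, c, d \in \mathbb{Z}$ can be divided into the following cases: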
($A_0$) $a = 3l_1$, ($A_1$) $a = 3l_1 + 1$, ($A_2$) $a = 3l_1 + 2$ ($l_1 \in \mathbb{Z}$),
($B_0$) $b = 3l_2$, ($B_1$) $b = 3l_2 + 1$, ($B_2$) $b = 3l_2 + 2$ ($l_2 \in \mathbb{Z}$),
($C_0$) $c = 3l_3$, ($C_1$) $c = 3l_3 + 1$, ($C_2$) $c = 3l_3 + 2$ ($l_3 \in \mathbb{Z}$),
($D_0$) $d = 3l_4$, ($D_1$) $d = 3l_4 + 1$, ($D_2$) $d = 3l_4 + 2$ ($l_4 \in \mathbb{Z}$).
There are 81 cases:
\[
(A_i B_j C_k D_l) =: (A_i)(B_j)(C_k)(D_l) \qquad (i, j, k, l \in \{0, 1, 2\}). \tag{2.8}
\]
For example, the case $(A_1 B_2 C_0 D_2)$ denotes that $a, b, c, d$ are of the following form:
\[
a = 3l_1 + 1, \quad b = 3l_2 + 2, \quad c = 3l_3, \quad d = 3l_4 + 2 \qquad (l_1, l_2, l_3, l_4 \in \mathbb{Z}).
\]
Besides the condition that $M \in M_2(\mathbb{Z})$ is an expanding matrix, $a, b, c, d \in \mathbb{Z}$ also satisfy
\[
ac - bd \notin 3\mathbb{Z}. \tag{2.9}
\]
Hereafter we always assume that the above conditions on $a, b, c, d \in \mathbb{Z}$ hold. It follows from (2.9) that the following 33 cases:
\[
\begin{gathered}
(A_0 B_0 C_k D_l)\ (k, l \in \{0, 1, 2\}),\ (A_0 B_1 C_0 D_0),\ (A_0 B_1 C_1 D_0),\ (A_0 B_1 C_2 D_0),\ (A_0 B_2 C_0 D_0),\ (A_0 B_2 C_1 D_0),\ (A_0 B_2 C_2 D_0), \\
(A_1 B_0 C_0 D_l)\ (l \in \{0, 1, 2\}),\ (A_1 B_1 C_0 D_0),\ (A_1 B_1 C_1 D_1),\ (A_1 B_1 C_2 D_2),\ (A_1 B_2 C_0 D_0),\ (A_1 B_2 C_1 D_2),\ (A_1 B_2 C_2 D_1), \\
(A_2 B_0 C_0 D_l)\ (l \in \{0, 1, 2\}),\ (A_2 B_1 C_0 D_0),\ (A_2 B_1 C_1 D_2),\ (A_2 B_1 C_2 D_1),\ (A_2 B_2 C_0 D_0),\ (A_2 B_2 C_1 D_1),\ (A_2 B_2 C_2 D_2)
\end{gathered}
\tag{2.10}
\]
can be excluded. We divide the remaining 48 cases into three subsections according to ($A_0$), ($A_1$) and ($A_2$). Section 2.4 deals with the case ($A_0$), that is, $a = 3l_1$ ($l_1 \in \mathbb{Z}$). Section 2.5 is the case ($A_1$) $a = 3l_1 + 1$ ($l_1 \in \mathbb{Z}$) and Section 2.6 is the case ($A_2$) $a = 3l_1 + 2$ ($l_1 \in \mathbb{Z}$). In each case, we can write $M^*$ as
\[
M^* = \begin{pmatrix} a & d \\ b & c \end{pmatrix} = 3 \begin{pmatrix} l_1 & l_4 \\ l_2 & l_3 \end{pmatrix} + M_\alpha := 3\tilde M + M_\alpha \tag{2.11}
\]
for a certain matrix $M_\alpha \in M_2(\mathbb{Z})$ whose entries come from the set $\{0, 1, 2\}$. Each case corresponds to a unique matrix $M_\alpha$. The matrix $M_\alpha$ can be viewed as an operator $M_\alpha$ on $\mathbb{R}^2$. We find that $M_\alpha$ is periodic when $M_\alpha$ acts on the sets $Z_0$ and $\tilde Z_0$ successively. This leads to the periodicity of the operator $M^*$ when it acts on some concrete sets, such as $Z_j$ and $\tilde Z_j$ for a certain fixed $j \in \mathbb{N}$. The periodicity enables us to simplify the expression (2.7). In fact, all the remaining 48 cases only give us four types of representations of the zero set $Z(\hat\mu_{M,D})$. These four types of expressions are the foundation of proving the Theorem.
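The case count above can be reproduced mechanically. The following is a minimal sketch, not part of the original argument: it enumerates the 81 residue tuples $(a, b, c, d)$ modulo 3, discards those with $ac - bd \in 3\mathbb{Z}$, and forms the residue matrix of $M^*$ as in (2.11). The function names are hypothetical, and the numbering $\alpha = 1, \ldots, 48$ is assumed to follow the lexicographic order of $(a, b, c, d)$, which appears to agree with the labelling used in Sections 2.4–2.6.

```python
from itertools import product


def admissible_residue_cases():
    """Residue tuples (a, b, c, d) mod 3 with ac - bd not divisible by 3,
    listed in lexicographic order."""
    return [
        (a, b, c, d)
        for a, b, c, d in product(range(3), repeat=4)
        if (a * c - b * d) % 3 != 0
    ]


def m_alpha(case):
    """Residue matrix of M^* = [[a, d], [b, c]] modulo 3, as in (2.11)."""
    a, b, c, d = case
    return ((a, d), (b, c))


cases = admissible_residue_cases()
# Assumed numbering: alpha = 1, 2, ... follows the lexicographic order above.
matrices = {alpha: m_alpha(case) for alpha, case in enumerate(cases, start=1)}

excluded = 3 ** 4 - len(cases)
print(len(cases), excluded)      # remaining cases, excluded cases
print(cases[40], matrices[41])   # residues and matrix for the case (M_41)
```

Under these assumptions the sketch reports 48 remaining and 33 excluded cases, split as 12, 18 and 18 according to the residue of $a$.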
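2.4. The case ($A_0$) $a = 3l_1$ ($l_1 \in \mathbb{Z}$)
In this case, $a, b, c, d \in \mathbb{Z}$ are one of the following 12 cases:
\[
\begin{gathered}
(A_0 B_1 C_0 D_1),\ (A_0 B_1 C_0 D_2),\ (A_0 B_1 C_1 D_1),\ (A_0 B_1 C_1 D_2),\ (A_0 B_1 C_2 D_1),\ (A_0 B_1 C_2 D_2), \\
(A_0 B_2 C_0 D_1),\ (A_0 B_2 C_0 D_2),\ (A_0 B_2 C_1 D_1),\ (A_0 B_2 C_1 D_2),\ (A_0 B_2 C_2 D_1),\ (A_0 B_2 C_2 D_2).
\end{gathered}
\tag{2.12}
\]
The corresponding matrices $M_1, M_2, \ldots, M_{12}$ in (2.11) are given by
\[
\begin{gathered}
M_1 = \begin{pmatrix} 0 & 1 \\ 1 & 0 \end{pmatrix},\quad M_2 = \begin{pmatrix} 0 & 2 \\ 1 & 0 \end{pmatrix},\quad M_3 = \begin{pmatrix} 0 & 1 \\ 1 & 1 \end{pmatrix},\quad M_4 = \begin{pmatrix} 0 & 2 \\ 1 & 1 \end{pmatrix}, \\
M_5 = \begin{pmatrix} 0 & 1 \\ 1 & 2 \end{pmatrix},\quad M_6 = \begin{pmatrix} 0 & 2 \\ 1 & 2 \end{pmatrix},\quad M_7 = \begin{pmatrix} 0 & 1 \\ 2 & 0 \end{pmatrix},\quad M_8 = \begin{pmatrix} 0 & 2 \\ 2 & 0 \end{pmatrix}, \\
M_9 = \begin{pmatrix} 0 & 1 \\ 2 & 1 \end{pmatrix},\quad M_{10} = \begin{pmatrix} 0 & 2 \\ 2 & 1 \end{pmatrix},\quad M_{11} = \begin{pmatrix} 0 & 1 \\ 2 & 2 \end{pmatrix},\quad M_{12} = \begin{pmatrix} 0 & 2 \\ 2 & 2 \end{pmatrix}.
\end{gathered}
\tag{2.13}
\]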
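2.5. The case ($A_1$) $a = 3l_1 + 1$ ($l_1 \in \mathbb{Z}$)
In this case, $a, b, c, d \in \mathbb{Z}$ are one of the following 18 cases:
\[
\begin{gathered}
(A_1 B_0 C_1 D_0),\ (A_1 B_0 C_1 D_1),\ (A_1 B_0 C_1 D_2),\ (A_1 B_0 C_2 D_0),\ (A_1 B_0 C_2 D_1),\ (A_1 B_0 C_2 D_2), \\
(A_1 B_1 C_0 D_1),\ (A_1 B_1 C_0 D_2),\ (A_1 B_1 C_1 D_0),\ (A_1 B_1 C_1 D_2),\ (A_1 B_1 C_2 D_0),\ (A_1 B_1 C_2 D_1), \\
(A_1 B_2 C_0 D_1),\ (A_1 B_2 C_0 D_2),\ (A_1 B_2 C_1 D_0),\ (A_1 B_2 C_1 D_1),\ (A_1 B_2 C_2 D_0),\ (A_1 B_2 C_2 D_2).
\end{gathered}
\tag{2.14}
\]
The corresponding matrices $M_{13}, M_{14}, \ldots, M_{30}$ in (2.11) are given by
\[
\begin{gathered}
M_{13} = \begin{pmatrix} 1 & 0 \\ 0 & 1 \end{pmatrix},\quad M_{14} = \begin{pmatrix} 1 & 1 \\ 0 & 1 \end{pmatrix},\quad M_{15} = \begin{pmatrix} 1 & 2 \\ 0 & 1 \end{pmatrix},\quad M_{16} = \begin{pmatrix} 1 & 0 \\ 0 & 2 \end{pmatrix}, \\
M_{17} = \begin{pmatrix} 1 & 1 \\ 0 & 2 \end{pmatrix},\quad M_{18} = \begin{pmatrix} 1 & 2 \\ 0 & 2 \end{pmatrix},\quad M_{19} = \begin{pmatrix} 1 & 1 \\ 1 & 0 \end{pmatrix},\quad M_{20} = \begin{pmatrix} 1 & 2 \\ 1 & 0 \end{pmatrix}, \\
M_{21} = \begin{pmatrix} 1 & 0 \\ 1 & 1 \end{pmatrix},\quad M_{22} = \begin{pmatrix} 1 & 2 \\ 1 & 1 \end{pmatrix},\quad M_{23} = \begin{pmatrix} 1 & 0 \\ 1 & 2 \end{pmatrix},\quad M_{24} = \begin{pmatrix} 1 & 1 \\ 1 & 2 \end{pmatrix}, \\
M_{25} = \begin{pmatrix} 1 & 1 \\ 2 & 0 \end{pmatrix},\quad M_{26} = \begin{pmatrix} 1 & 2 \\ 2 & 0 \end{pmatrix},\quad M_{27} = \begin{pmatrix} 1 & 0 \\ 2 & 1 \end{pmatrix},\quad M_{28} = \begin{pmatrix} 1 & 1 \\ 2 & 1 \end{pmatrix}, \\
M_{29} = \begin{pmatrix} 1 & 0 \\ 2 & 2 \end{pmatrix},\quad M_{30} = \begin{pmatrix} 1 & 2 \\ 2 & 2 \end{pmatrix}.
\end{gathered}
\tag{2.15}
\]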
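2.6. The case ($A_2$) $a = 3l_1 + 2$ ($l_1 \in \mathbb{Z}$)
In this case, $a, b, c, d \in \mathbb{Z}$ are one of the following 18 cases:
\[
\begin{gathered}
(A_2 B_0 C_1 D_0),\ (A_2 B_0 C_1 D_1),\ (A_2 B_0 C_1 D_2),\ (A_2 B_0 C_2 D_0),\ (A_2 B_0 C_2 D_1),\ (A_2 B_0 C_2 D_2), \\
(A_2 B_1 C_0 D_1),\ (A_2 B_1 C_0 D_2),\ (A_2 B_1 C_1 D_0),\ (A_2 B_1 C_1 D_1),\ (A_2 B_1 C_2 D_0),\ (A_2 B_1 C_2 D_2), \\
(A_2 B_2 C_0 D_1),\ (A_2 B_2 C_0 D_2),\ (A_2 B_2 C_1 D_0),\ (A_2 B_2 C_1 D_2),\ (A_2 B_2 C_2 D_0),\ (A_2 B_2 C_2 D_1).
\end{gathered}
\tag{2.16}
\]
The corresponding matrices $M_{31}, M_{32}, \ldots, M_{48}$ in (2.11) are given by
\[
\begin{gathered}
M_{31} = \begin{pmatrix} 2 & 0 \\ 0 & 1 \end{pmatrix},\quad M_{32} = \begin{pmatrix} 2 & 1 \\ 0 & 1 \end{pmatrix},\quad M_{33} = \begin{pmatrix} 2 & 2 \\ 0 & 1 \end{pmatrix},\quad M_{34} = \begin{pmatrix} 2 & 0 \\ 0 & 2 \end{pmatrix}, \\
M_{35} = \begin{pmatrix} 2 & 1 \\ 0 & 2 \end{pmatrix},\quad M_{36} = \begin{pmatrix} 2 & 2 \\ 0 & 2 \end{pmatrix},\quad M_{37} = \begin{pmatrix} 2 & 1 \\ 1 & 0 \end{pmatrix},\quad M_{38} = \begin{pmatrix} 2 & 2 \\ 1 & 0 \end{pmatrix}, \\
M_{39} = \begin{pmatrix} 2 & 0 \\ 1 & 1 \end{pmatrix},\quad M_{40} = \begin{pmatrix} 2 & 1 \\ 1 & 1 \end{pmatrix},\quad M_{41} = \begin{pmatrix} 2 & 0 \\ 1 & 2 \end{pmatrix},\quad M_{42} = \begin{pmatrix} 2 & 2 \\ 1 & 2 \end{pmatrix}, \\
M_{43} = \begin{pmatrix} 2 & 1 \\ 2 & 0 \end{pmatrix},\quad M_{44} = \begin{pmatrix} 2 & 2 \\ 2 & 0 \end{pmatrix},\quad M_{45} = \begin{pmatrix} 2 & 0 \\ 2 & 1 \end{pmatrix},\quad M_{46} = \begin{pmatrix} 2 & 2 \\ 2 & 1 \end{pmatrix}, \\
M_{47} = \begin{pmatrix} 2 & 0 \\ 2 & 2 \end{pmatrix},\quad M_{48} = \begin{pmatrix} 2 & 1 \\ 2 & 2 \end{pmatrix}.
\end{gathered}
\tag{2.17}
\]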
The above 48 cases correspond to 48 matrices $M_\alpha$ ($\alpha = 1, 2, \ldots, 48$). For simplicity, we use the symbol $(M_\alpha)$ to denote the corresponding case. For example, $(M_{41})$ denotes the case $(A_2 B_1 C_2 D_0)$, or the case that $a, b, c, d$ are of the following form:
\[
a = 3l_1 + 2, \quad b = 3l_2 + 1, \quad c = 3l_3 + 2, \quad d = 3l_4 \qquad (l_1, l_2, l_3, l_4 \in \mathbb{Z}). \tag{2.18}
\]
Note that the expansibility of the given matrix $M$ only corresponds to certain $l_1, l_2, l_3, l_4 \in \mathbb{Z}$ in the representation of $a, b, c, d$. We cannot let $l_1, l_2, l_3, l_4$ be any number in $\mathbb{Z}$. For instance, in the case (2.18), we cannot choose $l_1 = -1$, $l_2 \in \mathbb{Z}$, $l_3 \in \mathbb{Z}$ and $l_4 = 0$.
2.7. Periodicity of the operators $M_\alpha$ ($\alpha = 1, 2, \ldots, 48$)
When the operators $M_\alpha$ ($\alpha = 1, 2, \ldots, 48$) act on the sets $Z_0$ and $\tilde Z_0$ successively, we find some interesting periodic properties of the $M_\alpha^j$ for $j = 1, 2, 3, 4$, which can be classified as the following four types. This in turn leads to the periodicity of the operator $M^*$ when it acts on the corresponding sets such as $Z_j$, $\tilde Z_j$ or $Z_j \cup \tilde Z_j$. With some computations, we find that the following conclusions hold, which can be classified as
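Type 1.
\[
M_\alpha(Z_0) \subseteq Z_0, \quad M_\alpha(\tilde Z_0) \subseteq \tilde Z_0 \qquad (\alpha = 6, 8, 13, 23, 32, 43); \tag{2.19}
\]
\[
M_\alpha(Z_0) \subseteq \tilde Z_0, \quad M_\alpha(\tilde Z_0) \subseteq Z_0 \qquad (\alpha = 1, 9, 18, 20, 34, 45). \tag{2.20}
\]
Type 2.
\[
M_\alpha^2(Z_0) \subseteq Z_0, \quad M_\alpha^2(\tilde Z_0) \subseteq \tilde Z_0 \qquad (\alpha = 16, 17, 29, 31, 33, 39); \tag{2.21}
\]
\[
M_\alpha^2(Z_0) \subseteq \tilde Z_0, \quad M_\alpha^2(\tilde Z_0) \subseteq Z_0 \qquad (\alpha = 2, 7, 24, 30, 40, 46). \tag{2.22}
\]
Type 3.
\[
M_\alpha^3(Z_0) \subseteq Z_0, \quad M_\alpha^3(\tilde Z_0) \subseteq \tilde Z_0 \qquad (\alpha = 11, 14, 15, 21, 27, 38); \tag{2.23}
\]
\[
M_\alpha^3(Z_0) \subseteq \tilde Z_0, \quad M_\alpha^3(\tilde Z_0) \subseteq Z_0 \qquad (\alpha = 4, 25, 35, 36, 41, 47). \tag{2.24}
\]
Type 4.
\[
M_\alpha^4(Z_0) \subseteq \tilde Z_0, \quad M_\alpha^4(\tilde Z_0) \subseteq Z_0 \qquad (\alpha = 3, 5, 10, 12, 19, 22, 26, 28, 37, 42, 44, 48). \tag{2.25}
\]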
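The four types can be recovered by a short computation. The sketch below is illustrative only and rests on two assumptions that are not spelled out in the text: that $M_\alpha$ is numbered in lexicographic order of the residues $(a, b, c, d)$, and that, because $Z_0 = \tfrac{1}{3}(1, 2)^t + \mathbb{Z}^2$ and $M_\alpha$ has integer entries, the inclusion $M_\alpha^p(Z_0) \subseteq Z_0$ (respectively $\subseteq \tilde Z_0$) is equivalent to $M_\alpha^p (1, 2)^t \equiv (1, 2)^t$ (respectively $\equiv (2, 1)^t$) modulo 3. All function names are hypothetical.

```python
from itertools import product

V = (1, 2)        # 3 * (1/3, 2/3): residue of 3 * Z_0 modulo 3
MINUS_V = (2, 1)  # 3 * (2/3, 1/3): residue of 3 * Z~_0 modulo 3


def residue_matrices():
    """Assumed numbering: M_1..M_48 in lexicographic order of (a, b, c, d) mod 3."""
    cases = [
        (a, b, c, d)
        for a, b, c, d in product(range(3), repeat=4)
        if (a * c - b * d) % 3 != 0
    ]
    return {alpha: ((a, d), (b, c)) for alpha, (a, b, c, d) in enumerate(cases, start=1)}


def apply_mod3(m, v):
    """Apply the 2x2 matrix m to the vector v, reducing modulo 3."""
    return (
        (m[0][0] * v[0] + m[0][1] * v[1]) % 3,
        (m[1][0] * v[0] + m[1][1] * v[1]) % 3,
    )


def classify(m):
    """Return (p, sign): the least p >= 1 with m^p V = sign * V (mod 3)."""
    w = V
    for p in range(1, 5):
        w = apply_mod3(m, w)
        if w == V:
            return p, +1
        if w == MINUS_V:
            return p, -1
    raise ValueError("no period up to 4; matrix is not invertible mod 3")


# Group the indices alpha by (type, sign); sign +1 means Z_0 is mapped into Z_0,
# sign -1 means Z_0 is mapped into Z~_0.
types = {}
for alpha, m in residue_matrices().items():
    types.setdefault(classify(m), []).append(alpha)

for key in sorted(types):
    print(key, types[key])
```

Under these assumptions the grouping printed for each pair (type, sign) can be compared directly with the index lists in (2.19)–(2.25); each of the four types contains 12 of the 48 matrices.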
From Type 1, we have
\[
M_\alpha(Z_0 \cup \tilde Z_0) = M_\alpha(Z_0) \cup M_\alpha(\tilde Z_0) \subseteq Z_0 \cup \tilde Z_0 \tag{2.26}
\]
for $\alpha = 1, 6, 8, 9, 13, 18, 20, 23, 32, 34, 43, 45$. Furthermore, it follows from (2.7) and (2.11) that the zero set $Z(\hat\mu_{M,D})$ has the following property.
Proposition 3. In each case $(M_\alpha)$ ($\alpha = 1, 6, 8, 9, 13, 18, 20, 23, 32, 34, 43, 45$), the zero set $Z(\hat\mu_{M,D})$ is given by
\[
Z(\hat\mu_{M,D}) = Z_1 \cup \tilde Z_1 \tag{2.27}
\]
with
\[
Z_1 \cap \tilde Z_1 = (Z_1 \cup \tilde Z_1) \cap \mathbb{Z}^2 = \emptyset. \tag{2.28}
\]
In the same way, from Type 2, we have
\[
M_\alpha^2(Z_0 \cup \tilde Z_0) = M_\alpha^2(Z_0) \cup M_\alpha^2(\tilde Z_0) \subseteq Z_0 \cup \tilde Z_0 \tag{2.29}
\]
for $\alpha = 2, 7, 16, 17, 24, 29, 30, 31, 33, 39, 40, 46$. Furthermore, it follows from (2.7) and (2.11) that the zero set $Z(\hat\mu_{M,D})$ has the following property.
Proposition 4. In each case $(M_\alpha)$ ($\alpha = 2, 7, 16, 17, 24, 29, 30, 31, 33, 39, 40, 46$), the zero set $Z(\hat\mu_{M,D})$ is given by
\[
Z(\hat\mu_{M,D}) = Z_1 \cup Z_2 \cup \tilde Z_1 \cup \tilde Z_2 \tag{2.30}
\]
with $Z_1, Z_2, \tilde Z_1, \tilde Z_2$ mutually disjoint and
\[
\Bigl( \bigcup_{j=1}^{2} (Z_j \cup \tilde Z_j) \Bigr) \cap \mathbb{Z}^2 = \emptyset. \tag{2.31}
\]
Similarly, from Type 3, we have
\[
M_\alpha^3(Z_0 \cup \tilde Z_0) = M_\alpha^3(Z_0) \cup M_\alpha^3(\tilde Z_0) \subseteq Z_0 \cup \tilde Z_0 \tag{2.32}
\]
for $\alpha = 4, 11, 14, 15, 21, 25, 27, 35, 36, 38, 41, 47$. Furthermore, it follows from (2.7) and (2.11) that the zero set $Z(\hat\mu_{M,D})$ has the following property.
Proposition 5. In each case $(M_\alpha)$ ($\alpha = 4, 11, 14, 15, 21, 25, 27, 35, 36, 38, 41, 47$), the zero set $Z(\hat\mu_{M,D})$ is given by
\[
Z(\hat\mu_{M,D}) = Z_1 \cup Z_2 \cup Z_3 \cup \tilde Z_1 \cup \tilde Z_2 \cup \tilde Z_3 \tag{2.33}
\]
with $Z_1, Z_2, Z_3, \tilde Z_1, \tilde Z_2, \tilde Z_3$ mutually disjoint and
\[
\Bigl( \bigcup_{j=1}^{3} (Z_j \cup \tilde Z_j) \Bigr) \cap \mathbb{Z}^2 = \emptyset. \tag{2.34}
\]
Similarly, from Type 4, we have
\[
M_\alpha^4(Z_0 \cup \tilde Z_0) = M_\alpha^4(Z_0) \cup M_\alpha^4(\tilde Z_0) \subseteq Z_0 \cup \tilde Z_0 \tag{2.35}
\]
for $\alpha = 3, 5, 10, 12, 19, 22, 26, 28, 37, 42, 44, 48$. Furthermore, it follows from (2.7) and (2.11) that the zero set $Z(\hat\mu_{M,D})$ has the following property.
Proposition 6. In each case $(M_\alpha)$ ($\alpha = 3, 5, 10, 12, 19, 22, 26, 28, 37, 42, 44, 48$), the zero set $Z(\hat\mu_{M,D})$ is given by
\[
Z(\hat\mu_{M,D}) = Z_1 \cup Z_2 \cup Z_3 \cup Z_4 \cup \tilde Z_1 \cup \tilde Z_2 \cup \tilde Z_3 \cup \tilde Z_4 \tag{2.36}
\]
with $Z_1, Z_2, Z_3, Z_4, \tilde Z_1, \tilde Z_2, \tilde Z_3, \tilde Z_4$ mutually disjoint and
\[
\Bigl( \bigcup_{j=1}^{4} (Z_j \cup \tilde Z_j) \Bigr) \cap \mathbb{Z}^2 = \emptyset. \tag{2.37}
\]
These established Propositions characterize the zero set $Z(\hat\mu_{M,D})$. The above four types correspond to four kinds of representations for $Z(\hat\mu_{M,D})$ which will help us to prove the Theorem in the next section.
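The disjointness statements (2.28), (2.31), (2.34) and (2.37) admit a simple residue check. The sketch below is a sanity check under stated assumptions rather than a proof: since $M^{*j} \equiv M_\alpha^j$ modulo 3 by (2.11), each $Z_j$ lies in $\tfrac{1}{3} w_j + \mathbb{Z}^2$ with $w_j = M_\alpha^j (1, 2)^t \bmod 3$, and $\tilde Z_j$ lies in $-\tfrac{1}{3} w_j + \mathbb{Z}^2$. It is therefore sufficient for disjointness that the residues $\pm w_j$, for $j$ up to the type of $M_\alpha$, are pairwise distinct and nonzero. The helper names are hypothetical.

```python
from itertools import product

V = (1, 2)        # residue of 3 * Z_0 modulo 3
MINUS_V = (2, 1)  # residue of 3 * Z~_0 modulo 3


def residue_matrices():
    """All residue matrices [[a, d], [b, c]] mod 3 with ac - bd not divisible by 3."""
    return [
        ((a, d), (b, c))
        for a, b, c, d in product(range(3), repeat=4)
        if (a * c - b * d) % 3 != 0
    ]


def apply_mod3(m, v):
    """Apply the 2x2 matrix m to the vector v, reducing modulo 3."""
    return (
        (m[0][0] * v[0] + m[0][1] * v[1]) % 3,
        (m[1][0] * v[0] + m[1][1] * v[1]) % 3,
    )


def residue_classes(m):
    """Residues mod 3 of 3*Z_j and 3*Z~_j for j = 1..p, where p is the type of m."""
    classes = []
    w = V
    for _ in range(4):
        w = apply_mod3(m, w)
        classes.append(w)                             # class of Z_j
        classes.append(((-w[0]) % 3, (-w[1]) % 3))    # class of Z~_j
        if w in (V, MINUS_V):
            return classes
    raise ValueError("no period up to 4; matrix is not invertible mod 3")


all_classes = [residue_classes(m) for m in residue_matrices()]

# Sufficient condition for disjointness: classes pairwise distinct and nonzero.
ok = all(len(set(c)) == len(c) and (0, 0) not in c for c in all_classes)
print(ok)
```

In the Type 4 cases the eight residues obtained this way exhaust all nonzero vectors modulo 3, which is consistent with the eight mutually disjoint sets in (2.36).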
3. Proof of Theorem If λj (j = 1, 2, 3, 4) ∈ R2 are such that the exponential functions e2πiλ1 ,x ,
e2πiλ2 ,x ,
e2πiλ3 ,x ,
e2πiλ4 ,x ,
are mutually orthogonal in L2 (μM,D ), then the differences λj − λk (1 j = k 4) are in the zero set Z(μˆ M,D ). That is, we have λj − λk ∈ Z(μˆ M,D )
(1 j = k 4).
(3.1)
We will use the above established facts on the zero set $Z(\hat{\mu}_{M,D})$ to deduce a contradiction. The proof can be divided into four cases according to Types 1–3 and Type 4. The cases of Types 1–3 can be proved by applying the same method as that used in the paper [15]. So we only prove Theorem in the case of Type 4. It should be pointed out that the method used in [15] can be further modified as shown below.

In the case of Type 4, we obtain from (2.36) and (3.1) that

$$\lambda_j - \lambda_k \in Z_1 \cup Z_2 \cup Z_3 \cup Z_4 \cup \tilde{Z}_1 \cup \tilde{Z}_2 \cup \tilde{Z}_3 \cup \tilde{Z}_4 \quad (1 \le j \ne k \le 4) \tag{3.2}$$

and (2.37) hold. We will use Propositions 2 and 6 to deduce a contradiction. Observe that the following six differences:

$$\begin{array}{lll}
\lambda_1 - \lambda_2, & \lambda_1 - \lambda_3, & \lambda_1 - \lambda_4, \\
 & \lambda_2 - \lambda_3, & \lambda_2 - \lambda_4, \\
 & & \lambda_3 - \lambda_4
\end{array} \tag{3.3}$$
belong to the union of the eight sets $Z_1, Z_2, Z_3, Z_4, \tilde{Z}_1, \tilde{Z}_2, \tilde{Z}_3, \tilde{Z}_4$. By Proposition 2 and (2.37), the elements (or differences) in each row of (3.3) (except the final row where there is only one element $\lambda_3 - \lambda_4$) and the elements (or differences) in each column of (3.3) (except the first column where there is only one element $\lambda_1 - \lambda_2$) cannot belong to the same set. In particular, the following three elements

$$\lambda_1 - \lambda_2,\quad \lambda_1 - \lambda_3,\quad \lambda_1 - \lambda_4 \tag{3.4}$$

in the first row will be in three different sets of the eight sets

$$Z_1,\quad Z_2,\quad Z_3,\quad Z_4,\quad \tilde{Z}_1,\quad \tilde{Z}_2,\quad \tilde{Z}_3,\quad \tilde{Z}_4. \tag{3.5}$$
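The number of ways to place the three differences of (3.4) into three pairwise different sets among the eight sets of (3.5) can be checked by direct enumeration. The following is only a minimal counting sketch; the string labels for the sets are illustrative names and carry no structure of the actual sets.

```python
from itertools import permutations

# Illustrative labels for the eight sets ("small boxes") listed in (3.5).
boxes = ["Z1", "Z2", "Z3", "Z4", "Z1~", "Z2~", "Z3~", "Z4~"]

# An assignment sends (lambda1-lambda2, lambda1-lambda3, lambda1-lambda4)
# to an ordered triple of pairwise different boxes.
assignments = list(permutations(boxes, 3))
count = len(assignments)
print(count)  # 8 * 7 * 6 ordered choices
```

The example assignment used later in the proof, (Z1, Z2, Z3), is one of these ordered triples.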
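There are 336 distribution methods. Note that we can regard the above eight sets in (3.5) as eight small boxes. By Proposition 2(1), if the three elements in (3.4) belong to certain three different small boxes, then the following three elements

$$\lambda_2 - \lambda_1,\quad \lambda_3 - \lambda_1,\quad \lambda_4 - \lambda_1 \tag{3.6}$$

will be in the other three different small boxes. That is, the six elements

$$\lambda_1 - \lambda_2,\quad \lambda_1 - \lambda_3,\quad \lambda_1 - \lambda_4,\quad \lambda_2 - \lambda_1,\quad \lambda_3 - \lambda_1,\quad \lambda_4 - \lambda_1$$

coming from (3.4) and (3.6) will belong to six different small boxes.

Box 1

$$\begin{array}{|c|c|c|c|}
\hline
Z_1 & Z_2 & Z_3 & Z_4 \\ \hline
\lambda_1 - \lambda_2 & \lambda_1 - \lambda_3 & \lambda_1 - \lambda_4 & \\ \hline
\tilde{Z}_1 & \tilde{Z}_2 & \tilde{Z}_3 & \tilde{Z}_4 \\ \hline
\lambda_2 - \lambda_1 & \lambda_3 - \lambda_1 & \lambda_4 - \lambda_1 & \\ \hline
\end{array}$$

On the other hand, the remaining three elements in (3.3), i.e.,

$$\lambda_2 - \lambda_3,\quad \lambda_2 - \lambda_4,\quad \lambda_3 - \lambda_4 \tag{3.7}$$

will be in three different small boxes also. Correspondingly, the three elements

$$\lambda_3 - \lambda_2,\quad \lambda_4 - \lambda_2,\quad \lambda_4 - \lambda_3 \tag{3.8}$$

will be in the other three different small boxes. That is, the six elements

$$\lambda_2 - \lambda_3,\quad \lambda_2 - \lambda_4,\quad \lambda_3 - \lambda_4,\quad \lambda_3 - \lambda_2,\quad \lambda_4 - \lambda_2,\quad \lambda_4 - \lambda_3$$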
coming from (3.7) and (3.8) will belong to six different small boxes also. There are eight small boxes in total. Hence, by the well-known pigeonhole principle, there are at least four small boxes which contain two elements each. This is impossible, since one can find a contradiction inside these four small boxes by Proposition 2. To see this, we only consider one of the 336 cases; the other cases can be proved in the same manner. For example, let

$$\lambda_1 - \lambda_2 \in Z_1,\quad \lambda_1 - \lambda_3 \in Z_2,\quad \lambda_1 - \lambda_4 \in Z_3. \tag{3.9}$$

Then, by Proposition 2(1), we have

$$\lambda_2 - \lambda_1 \in \tilde{Z}_1,\quad \lambda_3 - \lambda_1 \in \tilde{Z}_2,\quad \lambda_4 - \lambda_1 \in \tilde{Z}_3. \tag{3.10}$$
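That is, we have Box 1. Now, the remaining three elements in (3.3), i.e., the elements in (3.7), are also in certain different small boxes of Box 1. By Proposition 2, we have the following facts:

$$\lambda_2 - \lambda_3 \text{ cannot belong to the sets (or small boxes) } Z_1, Z_2, \tilde{Z}_1, \tilde{Z}_2; \tag{3.11}$$

$$\lambda_2 - \lambda_4 \text{ cannot belong to the sets (or small boxes) } Z_1, Z_3, \tilde{Z}_1, \tilde{Z}_3; \tag{3.12}$$

$$\lambda_3 - \lambda_4 \text{ cannot belong to the sets (or small boxes) } Z_2, Z_3, \tilde{Z}_2, \tilde{Z}_3. \tag{3.13}$$

Hence, from (3.11), (3.12) and (3.13), we have

$$\begin{aligned}
\lambda_2 - \lambda_3 &\in Z_3 \text{ or } \tilde{Z}_3 \text{ or } Z_4 \text{ or } \tilde{Z}_4; \\
\lambda_2 - \lambda_4 &\in Z_2 \text{ or } \tilde{Z}_2 \text{ or } Z_4 \text{ or } \tilde{Z}_4; \\
\lambda_3 - \lambda_4 &\in Z_1 \text{ or } \tilde{Z}_1 \text{ or } Z_4 \text{ or } \tilde{Z}_4,
\end{aligned} \tag{3.14}$$

which is impossible. We only consider the following three typical cases: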
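Box 2

$$\begin{array}{|c|c|c|c|}
\hline
Z_1 & Z_2 & Z_3 & Z_4 \\ \hline
\lambda_1 - \lambda_2 & \lambda_1 - \lambda_3 & \lambda_1 - \lambda_4,\ \lambda_2 - \lambda_3 & \lambda_2 - \lambda_4,\ \lambda_4 - \lambda_3 \\ \hline
\tilde{Z}_1 & \tilde{Z}_2 & \tilde{Z}_3 & \tilde{Z}_4 \\ \hline
\lambda_2 - \lambda_1 & \lambda_3 - \lambda_1 & \lambda_4 - \lambda_1,\ \lambda_3 - \lambda_2 & \lambda_3 - \lambda_4,\ \lambda_4 - \lambda_2 \\ \hline
\end{array}$$

Box 3

$$\begin{array}{|c|c|c|c|}
\hline
Z_1 & Z_2 & Z_3 & Z_4 \\ \hline
\lambda_1 - \lambda_2,\ \lambda_3 - \lambda_4 & \lambda_1 - \lambda_3 & \lambda_1 - \lambda_4,\ \lambda_3 - \lambda_2 & \lambda_4 - \lambda_2 \\ \hline
\tilde{Z}_1 & \tilde{Z}_2 & \tilde{Z}_3 & \tilde{Z}_4 \\ \hline
\lambda_2 - \lambda_1,\ \lambda_4 - \lambda_3 & \lambda_3 - \lambda_1 & \lambda_4 - \lambda_1,\ \lambda_2 - \lambda_3 & \lambda_2 - \lambda_4 \\ \hline
\end{array}$$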
(i) If

$$\lambda_2 - \lambda_3 \in Z_3,\quad \lambda_2 - \lambda_4 \in Z_4,\quad \lambda_3 - \lambda_4 \in \tilde{Z}_4,$$

then, by Proposition 2(1), the above Box 1 becomes Box 2. The small boxes $Z_3$ and $Z_4$ (or $\tilde{Z}_3$ and $\tilde{Z}_4$) in Box 2 contain two elements each. Applying Proposition 2 to the elements of the small box $Z_4$ (or $\tilde{Z}_4$), we get a contradiction, since

$$\lambda_2 - \lambda_3 = (\lambda_2 - \lambda_4) + (\lambda_4 - \lambda_3) \in Z_4 + Z_4 \subseteq \tilde{Z}_4,$$

which contradicts (2.37) and $\lambda_2 - \lambda_3 \in Z_3$.

(ii) If

$$\lambda_2 - \lambda_3 \in \tilde{Z}_3,\quad \lambda_2 - \lambda_4 \in \tilde{Z}_4,\quad \lambda_3 - \lambda_4 \in Z_1,$$

then, by Proposition 2(1), the above Box 1 becomes Box 3. The small boxes $Z_1$ and $Z_3$ (or $\tilde{Z}_1$ and $\tilde{Z}_3$) in Box 3 contain two elements each. Applying Proposition 2 to the elements of the sets $Z_1$ and $Z_3$ (or $\tilde{Z}_1$ and $\tilde{Z}_3$) respectively, we easily get a contradiction. Indeed, since

$$(\lambda_1 - \lambda_2) + (\lambda_3 - \lambda_4) = (\lambda_1 - \lambda_4) + (\lambda_3 - \lambda_2),$$

the left-hand side is in $Z_1 + Z_1 \subseteq \tilde{Z}_1$ and the right-hand side is in $Z_3 + Z_3 \subseteq \tilde{Z}_3$, which leads to a contradiction by (2.37).

(iii) If

$$\lambda_2 - \lambda_3 \in Z_3,\quad \lambda_2 - \lambda_4 \in \tilde{Z}_2,\quad \lambda_3 - \lambda_4 \in \tilde{Z}_4,$$

then, by Proposition 2(1), the above Box 1 becomes the following Box 4.
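Box 4

$$\begin{array}{|c|c|c|c|}
\hline
Z_1 & Z_2 & Z_3 & Z_4 \\ \hline
\lambda_1 - \lambda_2 & \lambda_1 - \lambda_3,\ \lambda_4 - \lambda_2 & \lambda_1 - \lambda_4,\ \lambda_2 - \lambda_3 & \lambda_4 - \lambda_3 \\ \hline
\tilde{Z}_1 & \tilde{Z}_2 & \tilde{Z}_3 & \tilde{Z}_4 \\ \hline
\lambda_2 - \lambda_1 & \lambda_3 - \lambda_1,\ \lambda_2 - \lambda_4 & \lambda_4 - \lambda_1,\ \lambda_3 - \lambda_2 & \lambda_3 - \lambda_4 \\ \hline
\end{array}$$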
The small boxes $Z_2$ and $Z_3$ (or $\tilde{Z}_2$ and $\tilde{Z}_3$) in Box 4 contain two elements each. Applying Proposition 2 to the elements of the sets $Z_2$ and $Z_3$ (or $\tilde{Z}_2$ and $\tilde{Z}_3$) respectively, we easily get a contradiction. Indeed, since

$$(\lambda_1 - \lambda_3) - (\lambda_4 - \lambda_2) = (\lambda_1 - \lambda_4) + (\lambda_2 - \lambda_3),$$

the left-hand side is in $Z_2 - Z_2 \subseteq \mathbb{Z}^2$ and the right-hand side is in $Z_3 + Z_3 \subseteq \tilde{Z}_3$, which leads to a contradiction by (2.37).

In a word, there exists a contradiction inside the four boxes which contain two elements each. Hence any set of $\mu_{M,D}$-orthogonal exponentials contains at most 3 elements. One can obtain many such orthogonal systems which contain 3 elements. For example, the exponential function systems $E_S$ with $S$ given by

$$S = \{0, s_1, s_2\} \subset \mathbb{R}^2 \quad \text{for each } s_1 \in Z_1 \text{ and } s_2 \in \tilde{Z}_1 \tag{3.15}$$

or with $S$ given by

$$S = \{0, s_1, s_2\} \subset \mathbb{R}^2 \quad \text{for each } s_1 \in Z_2 \text{ and } s_2 \in \tilde{Z}_2 \tag{3.16}$$

or with $S$ given by

$$S = \{0, s_1, s_2\} \subset \mathbb{R}^2 \quad \text{for each } s_1 \in Z_3 \text{ and } s_2 \in \tilde{Z}_3 \tag{3.17}$$

or with $S$ given by

$$S = \{0, s_1, s_2\} \subset \mathbb{R}^2 \quad \text{for each } s_1 \in Z_4 \text{ and } s_2 \in \tilde{Z}_4 \tag{3.18}$$

are three-element orthogonal systems in $L^2(\mu_{M,D})$. Note that, in Type 1, we only have (3.15); in Type 2, we have (3.15) and (3.16); in Type 3, we have (3.15)–(3.17); in Type 4, we have (3.15)–(3.18). In each Type, $Z_j$ and $\tilde{Z}_j$ have different representations according to the corresponding Propositions 3–6 and 48 cases. This shows that the number 3 is the best. The proof of Theorem is complete.

4. A concluding remark

Finally we would like to point out that for any 2 × 2 expanding matrix $M_1 \in M_2(\mathbb{R})$ and any digit set $D_1 = \{0, d_1, d_2\} \subset \mathbb{R}^2$ (not necessarily an integer matrix and an integer set), if $P = [d_1, d_2]$ is an invertible 2 × 2 matrix (whose column vectors are $d_1$ and $d_2$) such that $P^{-1} M_1 P \in M_2(\mathbb{Z})$ and $\det(M_1) \notin 3\mathbb{Z}$, then $\mu_{M_1,D_1}$-orthogonal exponentials contain at most 3 elements and the number 3 is the best. Here $d_1$ and $d_2$ are two linearly independent vectors in $\mathbb{R}^2$. If they are
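dependent, the conclusion should remain true as in Conjecture 1. For example, let $p_1, p_2 \in \mathbb{Z}$ with $|p_1| > 1$, $|p_2| > 1$ and $p_1 p_2 \notin 3\mathbb{Z}$; we consider the self-affine measure $\mu_{M,D}$ corresponding to

$$M = \begin{pmatrix} p_1 & 0 \\ 0 & p_2 \end{pmatrix} \quad \text{and} \quad D = \left\{ \begin{pmatrix} 0 \\ 0 \end{pmatrix}, \begin{pmatrix} 1 \\ 0 \end{pmatrix}, \begin{pmatrix} l \\ 0 \end{pmatrix} \right\}, \quad l \in \mathbb{Z} \setminus \{0, 1\}; \tag{4.1}$$

then $\mu_{M,D}$ is a non-spectral measure. Furthermore, if $l \notin 3\mathbb{Z} + 2$, then there are no more than two orthogonal exponential functions in $L^2(\mu_{M,D})$; if $l \in 3\mathbb{Z} + 2$, then there are at most 3 mutually orthogonal exponential functions in $L^2(\mu_{M,D})$, and the number 3 is the best.

In fact, in the case when $l \notin 3\mathbb{Z} + 2$, we have $m_D(\xi) \ne 0$ for any $\xi \in \mathbb{R}^2$, hence there are no more than two orthogonal exponential functions in $L^2(\mu_{M,D})$. Generally, if $m_D(\xi) \ne 0$ for any $\xi \in \mathbb{R}^n$, then for any expanding matrix $M \in M_n(\mathbb{R})$, $\mu_{M,D}$ is a non-spectral measure, and $\mu_{M,D}$-orthogonal exponentials contain at most one element. In the case when $l \in 3\mathbb{Z} + 2$, we have

$$Z(\hat{\mu}_{M,D}) = Z_1 \cup \tilde{Z}_1, \tag{4.2}$$

where

$$Z_1 = \left\{ \begin{pmatrix} 2p_1/3 \\ p_2 a \end{pmatrix} + \begin{pmatrix} p_1 k \\ 0 \end{pmatrix} : k \in \mathbb{Z},\ a \in \mathbb{R} \right\} \subset \mathbb{R}^2, \tag{4.3}$$

and

$$\tilde{Z}_1 = \left\{ \begin{pmatrix} p_1/3 \\ p_2 \tilde{a} \end{pmatrix} + \begin{pmatrix} p_1 \tilde{k} \\ 0 \end{pmatrix} : \tilde{k} \in \mathbb{Z},\ \tilde{a} \in \mathbb{R} \right\} \subset \mathbb{R}^2. \tag{4.4}$$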
Since $(Z_1 - Z_1) \cap Z(\hat{\mu}_{M,D}) = (\tilde{Z}_1 - \tilde{Z}_1) \cap Z(\hat{\mu}_{M,D}) = \emptyset$, we obtain that there are at most 3 mutually orthogonal exponential functions in $L^2(\mu_{M,D})$, and the number 3 is the best.

Acknowledgments

The author would like to thank the anonymous referees for their valuable suggestions. The present research is partially supported by the Key Project of Chinese Ministry of Education (No. 108117) and the National Natural Science Foundation of China (No. 10871123).

References

[1] D.E. Dutkay, D. Han, Q. Sun, On the spectra of a Cantor measure, available on http://arxiv.org/abs/0804.4497v1, 2008.
[2] D.E. Dutkay, P.E.T. Jorgensen, Analysis of orthogonality and of orbits in affine iterated function systems, Math. Z. 256 (2007) 801–823.
[3] D.E. Dutkay, P.E.T. Jorgensen, Probability and Fourier duality for affine iterated function systems, available on http://arxiv.org/abs/0808.2946v1, 2008.
[4] D.E. Dutkay, P.E.T. Jorgensen, Duality questions for operators, spectrum and measures, available on http://arxiv.org/abs/0809.3274v1, 2008.
[5] B. Fuglede, Commuting self-adjoint partial differential operators and a group theoretic problem, J. Funct. Anal. 16 (1974) 101–121.
[6] T.-Y. Hu, K.-S. Lau, Spectral property of the Bernoulli convolutions, Adv. Math. 219 (2008) 554–567.
[7] J.E. Hutchinson, Fractals and self-similarity, Indiana Univ. Math. J. 30 (1981) 713–747.
[8] P.E.T. Jorgensen, K.A. Kornelson, K.L. Shuman, Orthogonal exponentials for Bernoulli iterated function systems, available on http://arxiv.org/abs/math.OA/0703385, 2007.
[9] P.E.T. Jorgensen, S. Pedersen, Dense analytic subspaces in fractal L2-spaces, J. Anal. Math. 75 (1998) 185–228.
[10] P.E.T. Jorgensen, S. Pedersen, Spectral pairs in Cartesian coordinates, J. Fourier Anal. Appl. 5 (1999) 285–302.
[11] J.-L. Li, Spectral sets and spectral self-affine measures, PhD thesis, The Chinese University of Hong Kong, November, 2004.
[12] J.-L. Li, Spectral self-affine measures in Rn, Proc. Edinb. Math. Soc. 50 (2007) 197–217.
[13] J.-L. Li, μM,D-Orthogonality and compatible pair, J. Funct. Anal. 244 (2007) 628–638.
[14] J.-L. Li, Orthogonal exponentials on the generalized plane Sierpinski gasket, J. Approx. Theory 153 (2008) 161–169.
[15] J.-L. Li, Non-spectral problem for a class of planar self-affine measures, J. Funct. Anal. 255 (2008) 3125–3148.
Journal of Functional Analysis 257 (2009) 553–592 www.elsevier.com/locate/jfa
Three-dimensional subspace of $l_\infty^{(5)}$ with maximal projection constant
Bruce L. Chalmers a , Grzegorz Lewicki b,∗ a Department of Mathematics, University of California, Riverside, CA 92521, USA b Department of Mathematics, Jagiellonian University, Łojasiewicza 6, 30-348 Krakow, Poland
Received 5 December 2008; accepted 5 January 2009 Available online 11 February 2009 Communicated by Paul Malliavin
Abstract

Let $V$ be an $n$-dimensional real Banach space and let $\lambda(V)$ denote its absolute projection constant. For any $N \in \mathbb{N}$, $N \ge n$, define

$$\lambda_n^N = \sup\{\lambda(V) : \dim(V) = n,\ V \subset l_\infty^{(N)}\}$$

and

$$\lambda_n = \sup\{\lambda(V) : \dim(V) = n\}.$$

A well-known Grünbaum conjecture (p. 465 in [B. Grünbaum, Projection constants, Trans. Amer. Math. Soc. 95 (1960) 451–465]) says that $\lambda_2 = 4/3$. In this paper we show that

$$\lambda_3^5 = \frac{5 + 4\sqrt{2}}{7}$$

and we determine a three-dimensional space $V \subset l_\infty^{(5)}$ satisfying $\lambda_3^5 = \lambda(V)$. In particular, this shows that Proposition 3.1 from [H. König, N. Tomczak-Jaegermann, Norms of minimal projections, J. Funct. Anal. 119 (1994) 253–280] (see p. 259) is incorrect. Hence the proof of the Grünbaum conjecture given in [H. König, N. Tomczak-Jaegermann, Norms of minimal projections, J. Funct. Anal. 119 (1994) 253–280] which is based on Proposition 3.1 is incomplete.

© 2009 Elsevier Inc. All rights reserved.

Keywords: Absolute projection constant; Minimal projection; Three-dimensional Hahn–Banach theorem

* Corresponding author.
E-mail address: [email protected] (G. Lewicki). 0022-1236/$ – see front matter © 2009 Elsevier Inc. All rights reserved. doi:10.1016/j.jfa.2009.01.005
1. Introduction

Let $X$ be a real Banach space and let $V \subset X$ be a finite-dimensional subspace. A linear, continuous mapping $P : X \to V$ is called a projection if $P|_V = \mathrm{id}|_V$. Denote by $\mathcal{P}(X, V)$ the set of all projections from $X$ onto $V$. Set

$$\lambda(V, X) = \inf\{\|P\| : P \in \mathcal{P}(X, V)\}$$

and

$$\lambda(V) = \sup\{\lambda(V, X) : V \subset X\}.$$

The constant $\lambda(V, X)$ is called the relative projection constant and $\lambda(V)$ the absolute projection constant. General bounds for absolute projection constants were studied by many authors (see e.g. [2,3,9–11,13,15]). It is well known (see e.g. [16]) that if $V$ is a finite-dimensional space then

$$\lambda(V) = \lambda(I(V), l_\infty),$$

where $I(V)$ denotes any isometric copy of $V$ in $l_\infty$. Denote for any $n \in \mathbb{N}$

$$\lambda_n = \sup\{\lambda(V) : \dim(V) = n\}$$

and for any $N \in \mathbb{N}$, $N \ge n$,

$$\lambda_n^N = \sup\{\lambda(V) : V \subset l_\infty^{(N)}\}.$$

By the Kadec–Snobar Theorem (see [8]) $\lambda(V) \le \sqrt{n}$ for any $n \in \mathbb{N}$. However, determination of the constant $\lambda_n$ seems to be difficult. In [7, p. 465] it was conjectured by B. Grünbaum that $\lambda_2 = 4/3$. In [12, Th. 1.1] an attempt has been made to prove the Grünbaum conjecture (and a more general result). The proof presented in that paper is mainly based on [12, Proposition 3.1, p. 259] and [12, Lemma 5.1, p. 273]. Unfortunately, the proof of Proposition 3.1 is incorrect. In fact the formula (3.19) from [12, p. 263] is false. This can be easily checked by differentiating formula (3.12) on p. 262 with respect to the variable $Z_{s1}$. (I am using notation from [12].) Because of this error, the part of the proof of [12] on p. 265 is incorrect and, as a result, the proof of [12, Th. 1.1] is incomplete.
In this paper we show that

$$\lambda_3^5 = \frac{5 + 4\sqrt{2}}{7}$$

and we determine a three-dimensional space $V \subset l_\infty^{(5)}$ satisfying $\lambda_3^5 = \lambda(V)$ (see Theorem 3.6). In particular, this shows that not only the proof of Proposition 3.1 from [12] is incorrect but also the statement of Proposition 3.1 is incorrect.

Now we briefly describe the structure of the paper. In Section 2 we demonstrate some preliminary lemmas useful for determination of $\lambda_3^5$ as well as some general results concerning calculation of $\lambda_n^N$. In Section 3 we determine the constant $\lambda_3^5$. The main tools applied in our proof are the Lagrange Multiplier Theorem and the Implicit Function Theorem. We would like to add that a proof of the Grünbaum conjecture can be found in [4].

2. Preliminary results

In this section we mainly consider the following problem. For a fixed $u_1 \in [0, 1]$ maximize a function $f_{u_1} : \mathbb{R}^{N-1} \times (\mathbb{R}^N)^n \to \mathbb{R}$ defined by

$$f_{u_1}\big((u_2, \dots, u_N), x^1, \dots, x^n\big) = \sum_{i,j=1}^{N} u_i u_j \,|\langle x_i, x_j\rangle_n| \tag{1}$$

under constraints

$$\langle x^i, x^j\rangle_N = \delta_{ij}, \quad 1 \le i \le j \le n; \tag{2}$$

$$\sum_{j=2}^{N} u_j^2 = 1 - u_1^2. \tag{3}$$

Here for $j = 1, \dots, N$, $x_j = ((x^1)_j, \dots, (x^n)_j)$, $\langle w, z\rangle_n = \sum_{j=1}^{n} w_j z_j$ for any $w = (w_1, \dots, w_n)$, $z = (z_1, \dots, z_n) \in \mathbb{R}^n$ and $\langle p, q\rangle_N = \sum_{j=1}^{N} p_j q_j$ for any $p = (p_1, \dots, p_N)$, $q = (q_1, \dots, q_N) \in \mathbb{R}^N$. Also we will work with

$$f_{u_1,A}\big((u_2, \dots, u_N), x^1, \dots, x^n\big) = \sum_{i,j=1}^{N} u_i u_j a_{ij} \langle x_i, x_j\rangle_n, \tag{4}$$
where $A = \{a_{ij}\}$ is a fixed $N \times N$ symmetric matrix.

Lemma 2.1. Let $C = (c_{ij})_{i,j=1,\dots,n}$ be a real $n \times n$ orthonormal matrix. Then for any $x^1, \dots, x^n, u \in \mathbb{R}^N$ satisfying (2) and (3),

$$f_{u_1}\big((u_2, \dots, u_N), x^1, \dots, x^n\big) = f_{u_1}\big((u_2, \dots, u_N), Cx^1, \dots, Cx^n\big),$$

and

$$f_{u_1,A}\big((u_2, \dots, u_N), x^1, \dots, x^n\big) = f_{u_1,A}\big((u_2, \dots, u_N), Cx^1, \dots, Cx^n\big)$$

for any $N \times N$ matrix $A$. Here $C(x^i) = \sum_{j=1}^{n} c_{ij} x^j$.

Proof. It follows easily from the facts that

$$\langle Cx^i, Cx^j\rangle_N = \langle x^i, x^j\rangle_N$$

for $i, j = 1, \dots, n$ and

$$\langle (Cx)_i, (Cx)_j\rangle_n = \langle x_i, x_j\rangle_n$$

for $i, j = 1, \dots, N$, where $(Cx)_i = ((Cx^1)_i, \dots, (Cx^n)_i)$. □
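The invariance in Lemma 2.1 can be checked numerically on a small instance. The following sketch takes n = 2 and N = 3, uses a plane rotation as the orthonormal matrix C, and reads (1) with absolute values on the inner products; the vectors, the weights and the helper names `f` and `rotate` are illustrative choices, not taken from the paper.

```python
import math

def f(u, xs):
    """sum_{i,j} u_i u_j |<x_i, x_j>_n|, where x_i = ((x^1)_i, ..., (x^n)_i)."""
    N = len(u)
    cols = [[x[i] for x in xs] for i in range(N)]  # x_i collects the i-th coordinates
    total = 0.0
    for i in range(N):
        for j in range(N):
            dot = sum(a * b for a, b in zip(cols[i], cols[j]))
            total += u[i] * u[j] * abs(dot)
    return total

def rotate(xs, theta):
    """Apply the 2x2 rotation C to the pair (x^1, x^2): (Cx)^i = sum_j c_ij x^j."""
    c, s = math.cos(theta), math.sin(theta)
    x1, x2 = xs
    return [[c * a - s * b for a, b in zip(x1, x2)],
            [s * a + c * b for a, b in zip(x1, x2)]]

u = [0.6, 0.48, 0.64]  # squares sum to 1
xs = [[1 / math.sqrt(2), 1 / math.sqrt(2), 0.0],
      [1 / math.sqrt(6), -1 / math.sqrt(6), 2 / math.sqrt(6)]]  # orthonormal in R^3

before = f(u, xs)
after = f(u, rotate(xs, 0.7))
print(before, after)
```

The two printed values agree up to rounding, because the rotation acts on each $x_i \in \mathbb{R}^2$ by the same orthogonal map and therefore preserves every inner product $\langle x_i, x_j\rangle_n$.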
Now we recall without proof the following well-known

Lemma 2.2. Let $(X, \langle\cdot,\cdot\rangle)$ be a finite-dimensional Hilbert space with an orthonormal basis $x^1, \dots, x^n$. Let $T : X \to X$ be a linear isometry. If $C$ is an $n \times n$ matrix with columns $c_j = (c_{1j}, \dots, c_{nj})$ defined by

$$T x^j = \sum_{i=1}^{n} c_{ji} x^i,$$

then $C$ is an orthonormal matrix.

Lemma 2.3. Let $x^1, \dots, x^n \in \mathbb{R}^N$ and $u \in \mathbb{R}^N$ satisfy (2) and (3). Set $V = \operatorname{span}[x^1, \dots, x^n]$. Assume $v^1, \dots, v^n$ is an orthonormal basis of $V$ (with respect to $\langle\cdot,\cdot\rangle_N$). Then

$$f_{u_1}\big((u_2, \dots, u_N), x^1, \dots, x^n\big) = f_{u_1}\big((u_2, \dots, u_N), v^1, \dots, v^n\big)$$

and

$$f_{u_1,A}\big((u_2, \dots, u_N), x^1, \dots, x^n\big) = f_{u_1,A}\big((u_2, \dots, u_N), v^1, \dots, v^n\big)$$

for any $N \times N$ matrix $A$.

Proof. It is well known that for any $x, y \in \mathbb{R}^N$, $\langle x, x\rangle_N = \langle y, y\rangle_N = 1$, there exists a linear isometry (with respect to the Euclidean norm in $\mathbb{R}^N$) $T_{x,y} : \mathbb{R}^N \to \mathbb{R}^N$ such that $Tx = y$. Applying this fact and the induction argument with respect to $n$ we get that there exists a linear isometry $T : \mathbb{R}^N \to \mathbb{R}^N$ such that $Tx^i = v^i$ for $i = 1, \dots, n$. By Lemma 2.2 there exists an orthonormal matrix $C$ such that $Cx^i = \sum_{j=1}^{n} C_{ij} x^j = v^i$. By Lemma 2.1,

$$f_{u_1}\big((u_2, \dots, u_N), x^1, \dots, x^n\big) = f_{u_1}\big((u_2, \dots, u_N), v^1, \dots, v^n\big),$$

and

$$f_{u_1,A}\big((u_2, \dots, u_N), x^1, \dots, x^n\big) = f_{u_1,A}\big((u_2, \dots, u_N), v^1, \dots, v^n\big),$$

which completes the proof. □
Lemma 2.4. Let $n, N \in \mathbb{N}$, $N \ge n$. Fix $u = (u_1, \dots, u_N) \in \mathbb{R}^N$ with nonnegative coordinates. Let us consider a function $f : \mathbb{R}^{nN} \to \mathbb{R}$ given by

$$f(x^1, \dots, x^n) = \sum_{i,j=1}^{N} u_i u_j \,|\langle x_i, x_j\rangle_n|,$$

where $x^i \in \mathbb{R}^N$ for $i = 1, \dots, n$. Assume that $y^1, \dots, y^n \in \mathbb{R}^N$ are so chosen that

$$f(y^1, \dots, y^n) = \max\{f(x^1, \dots, x^n) : x^1, \dots, x^n \text{ satisfying (2)}\}.$$

Let $A \in \mathbb{R}^{N \times N}$ be a matrix defined by

$$a_{ij} = \operatorname{sgn}\langle y_i, y_j\rangle_n \tag{5}$$

for $i, j = 1, \dots, N$ ($\operatorname{sgn}(0) = 1$ by definition). Define $B \in \mathbb{R}^{N \times N}$ by

$$b_{ij} = u_i u_j a_{ij} \tag{6}$$

for $i, j = 1, \dots, N$. Let $b_1 \ge b_2 \ge \cdots \ge b_N$ denote the eigenvalues of $B$. (Since $B$ is symmetric all of them are real.) Then there exist orthonormal (with respect to $\langle\cdot,\cdot\rangle_N$) eigenvectors of $B$, $w^1, \dots, w^n \in \mathbb{R}^N$, corresponding to $b_1, \dots, b_n$, such that

$$f(w^1, \dots, w^n) = f(y^1, \dots, y^n) = \sum_{j=1}^{n} b_j.$$

Set

$$f_1(x^1, \dots, x^n) = \sum_{i,j=1}^{N} b_{ij} \langle x_i, x_j\rangle_n.$$

If $y^1, \dots, y^n \in \mathbb{R}^N$ are such that

$$f_1(y^1, \dots, y^n) = \max\{f_1, \text{ under constraint (2)}\} = \max\{f, \text{ under constraint (2)}\}$$

and $b_n > b_{n+1}$, then

$$\operatorname{span}[y^i : i = 1, \dots, n] = \operatorname{span}[w^i : i = 1, \dots, n].$$
Proof. Since $u_j$ are nonnegative,

$$f_1(x^1, \dots, x^n) \le f(x^1, \dots, x^n)$$

for any $x^1, \dots, x^n \in \mathbb{R}^N$. Moreover, $f_1(y^1, \dots, y^n) = f(y^1, \dots, y^n)$. Hence $f_1$ attains its maximum under constraints (2) at $(y^1, \dots, y^n)$. We now apply the Lagrange Multiplier Theorem to the function $f_1$. This is possible since $f_1$ is a $C^\infty$ function. Notice that by [12, p. 261] $\operatorname{rank}(G(y^1, \dots, y^n)) = n(n+1)/2$, where $G$ is the $n(n+1)/2 \times nN$ matrix associated with conditions (2). Consequently there exist Lagrange multipliers $k_{ij}$, $1 \le i \le j \le n$, such that

$$\frac{\partial\big(f_1 - \sum_{1 \le i \le j \le n} k_{ij} G_{ij}\big)}{\partial (x^i)_j}(y^1, \dots, y^n) = 0 \tag{7}$$

for $i = 1, \dots, n$, $j = 1, \dots, N$, where $G_{ij}(x^1, \dots, x^n) = \langle x^i, x^j\rangle_N$. Let us define for $i, j \in \{1, \dots, n\}$, $\gamma_{ij} = k_{ij}/2$ if $i < j$, $\gamma_{ij} = k_{ji}/2$ if $j < i$ and $\gamma_{ii} = k_{ii}$. Hence the system (7) can be rewritten (compare with [12, p. 262, formula (3.14)]) as:

$$B(y^m) = \sum_{i=1}^{n} \gamma_{mi} y^i \tag{8}$$

for $m = 1, \dots, n$. Let $\Gamma = \{\gamma_{ij},\ i, j = 1, \dots, n\}$. Observe that $\Gamma$ is a symmetric $n \times n$ matrix. Hence it has real eigenvalues $a_1, \dots, a_n$. Without loss of generality we can assume that

$$a_1 \ge a_2 \ge \cdots \ge a_n. \tag{9}$$

Let $V = [v_{ij}]$ be the $n \times n$ orthonormal matrix consisting of eigenvectors of $\Gamma$. Then

$$V^T \Gamma V = D, \tag{10}$$

where $D$ is a diagonal matrix with $d_{ii} = a_i$ for $i = 1, \dots, n$. Now we show that

$$a_i = b_i \tag{11}$$

for $i = 1, \dots, n$. First we prove that $a_m$, $m = 1, \dots, n$, are also eigenvalues of $B$. To do this, fix $m \in \{1, \dots, n\}$. Define

$$w^m = \sum_{j=1}^{n} v_{jm} y^j. \tag{12}$$

We show that $Bw^m = a_m w^m$. Note that

$$\begin{aligned}
Bw^m &= B\Big(\sum_{j=1}^{n} v_{jm} y^j\Big) = \sum_{j=1}^{n} v_{jm} B(y^j) = \sum_{j=1}^{n} v_{jm} \Big(\sum_{i=1}^{n} \gamma_{ji} y^i\Big) \\
&= \sum_{i=1}^{n} \Big(\sum_{j=1}^{n} v_{jm} \gamma_{ji}\Big) y^i = \sum_{i=1}^{n} \Big(\sum_{j=1}^{n} v_{jm} \gamma_{ij}\Big) y^i = \sum_{i=1}^{n} (\Gamma V)_{im}\, y^i \\
&= \sum_{i=1}^{n} (VD)_{im}\, y^i \quad \text{(by (10))} \\
&= \sum_{i=1}^{n} v_{im} a_m y^i = a_m \sum_{i=1}^{n} v_{im} y^i = a_m w^m.
\end{aligned}$$

Hence for $m = 1, \dots, n$, $a_m$ are eigenvalues of $B$ with the corresponding vectors $w^m$. By Lemma 2.3, $\langle w^i, w^j\rangle_N = \delta_{ij}$. Notice that by (12) and Lemma 2.3

$$f_1(y^1, \dots, y^n) = f_1(w^1, \dots, w^n).$$

Since for any $m = 1, \dots, n$ and $i = 1, \dots, N$,

$$(Bw^m)_i = a_m (w^m)_i,$$

multiplying each of the above equations by $(w^m)_i$ and summing them up we get that

$$\sum_{m=1}^{n} a_m = f_1(w^1, \dots, w^n) = f_1(y^1, \dots, y^n) = f(y^1, \dots, y^n).$$

If $a_i \ne b_i$ for some $i \in \{1, \dots, n\}$, let $v^1, \dots, v^n$ be the orthonormal eigenvectors of $B$ corresponding to $b_1, \dots, b_n$. Reasoning as above, we get

$$f(v^1, \dots, v^n) \ge \sum_{i,j=1}^{N} u_i u_j \operatorname{sgn}\langle y_i, y_j\rangle_n \langle v_i, v_j\rangle_n = \sum_{i=1}^{n} b_i > \sum_{i=1}^{n} a_i = f(y^1, \dots, y^n);$$

a contradiction. The fact that

$$\operatorname{span}[y^i : i = 1, \dots, n] = \operatorname{span}[w^i : i = 1, \dots, n]$$

follows from (12) and invertibility of the matrix $V$. □

Reasoning as in the proof of Lemma 2.4 we can show

Theorem 2.1. Let $\mathcal{A}$ denote the set of all $N \times N$ symmetric matrices $(a_{ij})$ such that $a_{ij} = \pm 1$ and $a_{ii} = 1$ for $i, j = 1, \dots, N$. Let $f_{u_1}$ be given by (1). Then
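$$\begin{aligned}
&\max\big\{f_{u_1}\big((u_2, \dots, u_N), x^1, \dots, x^n\big) : (u_2, \dots, u_N), x^1, \dots, x^n \text{ satisfying (2), (3)}\big\} \\
&\quad = \max\Big\{\sum_{i=1}^{n} b_i(v, A) : A \in \mathcal{A},\ v = (v_1, \dots, v_N) \in \mathbb{R}^N,\ \sum_{i=1}^{N} v_i^2 = 1,\ v_1 = u_1\Big\},
\end{aligned}$$

where $b_1(v, A) \ge b_2(v, A) \ge \cdots \ge b_n(v, A)$ denote the $n$ biggest eigenvalues of the $N \times N$ matrix $(v_i v_j a_{ij})_{i,j=1}^{N}$. Analogously for any $A = (a_{ij}) \in \mathcal{A}$,

$$\begin{aligned}
&\max\Big\{\sum_{i,j=1}^{N} u_i u_j a_{ij} \langle x_i, x_j\rangle_n : x^1, \dots, x^n \text{ satisfying (2)},\ u_j = \sqrt{(1 - u_1^2)/(N - 1)},\ j = 2, \dots, N\Big\} \\
&\quad = \max\Big\{\sum_{i=1}^{n} b_i(v, A) : A \in \mathcal{A},\ v = \big(u_1, c(u_1), \dots, c(u_1)\big)\Big\},
\end{aligned}$$

where $c(u_1) = \sqrt{(1 - u_1^2)/(N - 1)}$. Also

$$\begin{aligned}
&\max\Big\{\sum_{i,j=1}^{N} u_i u_j \,|\langle x_i, x_j\rangle_n| : x^1, \dots, x^n \text{ satisfying (2)},\ \sum_{j=1}^{N} u_j^2 = 1\Big\} \\
&\quad = \max\Big\{\sum_{i=1}^{n} b_i(v, A) : A \in \mathcal{A},\ v = (v_1, \dots, v_N) \in \mathbb{R}^N,\ \sum_{i=1}^{N} v_i^2 = 1\Big\}.
\end{aligned}$$

Now for $n, N \in \mathbb{N}$, $N \ge n$ define

$$\lambda_n^N = \sup\{\lambda(V, l_\infty^{(N)}) : V \subset l_\infty^{(N)},\ \dim(V) = n\}. \tag{13}$$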
Lemma 2.5. For any $n, N \in \mathbb{N}$, $2 \le n \le N$,

$$\lambda_{n-1}^{N-1} \le \lambda_n^N.$$

Proof. Let $V \subset l_\infty^{(N-1)}$ be an $(n-1)$-dimensional subspace with a basis $w^1, \dots, w^{n-1}$. Define

$$V_1 = \operatorname{span}[e_1, (0, w^j) : j = 1, \dots, n-1] \subset l_\infty^{(N)}.$$

Let $P \in \mathcal{P}(l_\infty^{(N)}, V_1)$ be such that

$$\|P\| = \lambda(V_1, l_\infty^{(N)}).$$

(Since $V_1$ is finite-dimensional such a projection exists.) Define $Q \in \mathcal{L}(l_\infty^{(N-1)}, V)$ by

$$Qx = \big((P(0, x))_2, \dots, (P(0, x))_N\big).$$

It is clear that $Q(l_\infty^{(N-1)}) \subset V$ and $Qw^j = w^j$ for $j = 1, \dots, n-1$. Hence $Q \in \mathcal{P}(l_\infty^{(N-1)}, V)$. Moreover, $\|Q\| \le \|P\|$. Taking supremum over $V$ we get that

$$\lambda_{n-1}^{N-1} \le \lambda_n^N,$$

as required. □
Theorem 2.2. Let $n, N \in \mathbb{N}$, $N \ge n$. Then

$$\lambda_n^N = \max\Big\{\sum_{i,j=1}^{N} u_i u_j \,|\langle x_i, x_j\rangle_n| : x^1, \dots, x^n \text{ satisfying (2)},\ \sum_{j=1}^{N} u_j^2 = 1\Big\}.$$

Proof. By [12, Prop. 2.2 and (3.7), p. 260],

$$\lambda_n^N \le \max\Big\{\sum_{i,j=1}^{N} u_i u_j \,|\langle x_i, x_j\rangle_n| : x^1, \dots, x^n \text{ satisfying (2)},\ \sum_{j=1}^{N} u_j^2 = 1\Big\}.$$

To prove a converse assume that there exist $n, N \in \mathbb{N}$, $N \ge n$, such that

$$\lambda_n^N < \varphi_n^N = \max\Big\{\sum_{i,j=1}^{N} u_i u_j \,|\langle x_i, x_j\rangle_n| : x^1, \dots, x^n \text{ satisfying (2)},\ \sum_{j=1}^{N} u_j^2 = 1\Big\}.$$

Without loss of generality we can assume that

$$n = \min\{m \in \mathbb{N} : \lambda_m^M < \varphi_m^M \text{ for some } M \ge m\}$$

and

$$N = \min\{M \in \mathbb{N},\ M \ge n : \lambda_n^M < \varphi_n^M\}.$$

Let us define

$$f(u, x^1, \dots, x^n) = \sum_{i,j=1}^{N} u_i u_j \,|\langle x_i, x_j\rangle_n|.$$

Let $y^1, \dots, y^n \in \mathbb{R}^N$ satisfying (2) and $u^o \in \mathbb{R}^N$ with $\sum_{j=1}^{N} (u_j^o)^2 = 1$ be such that

$$f(u^o, y^1, \dots, y^n) = \varphi_n^N.$$

Define as in Lemma 2.4

$$a_{ij} = \operatorname{sgn}\langle y_i, y_j\rangle_n \tag{14}$$

for $i, j = 1, \dots, N$. Also let $B \in \mathbb{R}^{N \times N}$ be given by

$$b_{ij} = u_i^o u_j^o a_{ij} \tag{15}$$
for $i, j = 1, \dots, N$. By Lemma 2.4 and Theorem 2.1 we can get that

$$f(u^o, y^1, \dots, y^n) = \sum_{i=1}^{n} b_i(u^o, A),$$

where $b_1(u^o, A) \ge b_2(u^o, A) \ge \cdots \ge b_n(u^o, A)$ denote the $n$ biggest eigenvalues of the above defined matrix $B$.

First suppose that $u_j^o = 0$ for some $j \in \{1, \dots, N\}$. Without loss of generality we can assume that $u_1^o = 0$. Let $B_1$ be an $(N-1) \times (N-1)$ matrix given by $B_1 = \{b_{ij}\}_{i,j=2,\dots,N}$ (the part of $B$ without the first row and the first column). Let $d_1 \ge \cdots \ge d_{N-1}$ be the eigenvalues of $B_1$ and $z^1, \dots, z^{N-1}$ the corresponding orthonormal eigenvectors. Since $u_1^o = 0$, $v^j = (0, z^j)$, $j = 1, \dots, N-1$, are the orthonormal eigenvectors of $B$ corresponding to $d_j$. Also $d_o = 0$ is an eigenvalue of $B$ with $e_1$ as an eigenvector. Consequently

$$b_j(u^o, A) \in \{0, d_k,\ k = 1, \dots, N-1\}$$

for $j = 1, \dots, n$. If $b_j(u^o, A) > 0$ for $j = 1, \dots, n$, then $b_j(u^o, A)$ are also the eigenvalues of $B_1$. By Theorem 2.1,

$$\sum_{i=1}^{n} b_i(u^o, A) = \varphi_n^N = \varphi_n^{N-1} = \lambda_n^{N-1} \le \lambda_n^N;$$

a contradiction with the definition of $N$. If $b_j(u^o, A) = 0$ for some $j \in \{1, \dots, n\}$, then again by Theorem 2.1

$$\varphi_n^N \le \sum_{i \ne j} b_i(u^o, A) \le \varphi_{n-1}^{N-1} = \lambda_{n-1}^{N-1}.$$

Consequently by Lemma 2.5,

$$\lambda_n^N \ge \lambda_{n-1}^{N-1} = \varphi_{n-1}^{N-1} \ge \varphi_n^N,$$

which again leads to a contradiction.

Now assume that $u_j^o > 0$ for $j = 1, \dots, N$. Let $w^1, \dots, w^n$ be the orthonormal eigenvectors corresponding to $b_i(u^o, A)$ for $i = 1, \dots, n$. By the proof of Lemma 2.4

$$f_1(u^o, w^1, \dots, w^n) = \varphi_n^N.$$

Define, for $j = 1, \dots, n$,

$$z^j = \big(w_1^j / u_1^o, \dots, w_N^j / u_N^o\big)$$

and let

$$V = \operatorname{span}[z^j : j = 1, \dots, n] \subset l_\infty^{(N)}.$$
We show that $\lambda(V, l_\infty^{(N)}) = \sum_{j=1}^{n} b_j(u^o, A) = \varphi_n^N$. Define, for $j = 1, \dots, n$,

$$f^j = \big(w_1^j u_1^o, \dots, w_N^j u_N^o\big)$$

and let $P \in \mathcal{L}(l_\infty^{(N)}, V)$ be given by

$$Px = \sum_{j=1}^{n} \langle f^j, x\rangle_N\, z^j.$$

Since the vectors $w^j$ are orthogonal with respect to $\langle\cdot,\cdot\rangle_N$, $P \in \mathcal{P}(l_\infty^{(N)}, V)$. Now we show that

$$\|P\| = \lambda(V, l_\infty^{(N)}) = \varphi_n^N.$$

Since the function $f_1$ attains its conditional maximum at $u^o, w^1, \dots, w^n$ (compare with the proof of Lemma 2.4), by the Lagrange Multiplier Theorem there exist $k_{ij} \in \mathbb{R}$, $1 \le i \le j \le n$, and $d \in \mathbb{R}$ such that

$$\frac{\partial\big(f_1 - \sum_{1 \le i \le j \le n} k_{ij} G_{ij} - d\big(\sum_{j=1}^{N} u_j^2 - 1\big)\big)}{\partial (u_j)}(u^o, w^1, \dots, w^n) = 0. \tag{16}$$

It is easy to see that (16) reduces to (compare with [12, (3.12), p. 262])

$$\sum_{j=1}^{N} u_j^o a_{ij} \langle w_i, w_j\rangle_n = d u_i^o$$

for $i = 1, \dots, N$. Multiplying the above equalities by $u_i^o$ and summing them up, we get that

$$d = f_1(u^o, w^1, \dots, w^n) = \varphi_n^N.$$

Also since $u_i^o > 0$ for $i = 1, \dots, N$, (16) reduces to

$$\Big(\sum_{j=1}^{N} u_j^o a_{ij} \langle w_i, w_j\rangle_n\Big) / u_i^o = d.$$

Consequently, by definition of $\langle\cdot,\cdot\rangle_n$, we get, for $i = 1, \dots, N$,

$$d = \sum_{k=1}^{n} \Big(\sum_{j=1}^{N} a_{ij} u_j^o w_j^k\Big) w_i^k / u_i^o = \sum_{k=1}^{n} \Big(\sum_{j=1}^{N} a_{ij} f_j^k\Big) z_i^k = \big(P(a_{i1}, \dots, a_{iN})\big)_i. \tag{17}$$

Since $\|(a_{i1}, \dots, a_{iN})\|_\infty = 1$, $\|P\| \ge d$. On the other hand, for any $x = (x_1, \dots, x_N) \in l_\infty^{(N)}$, $\|x\|_\infty = 1$ and $i \in \{1, \dots, N\}$,
$$\begin{aligned}
|(Px)_i| &= \Big|\sum_{j=1}^{n} \langle f^j, x\rangle_N\, z_i^j\Big| = \Big|\sum_{k=1}^{N} x_k \Big(\sum_{j=1}^{n} f_k^j z_i^j\Big)\Big| \\
&= \Big|\sum_{k=1}^{N} x_k \sum_{j=1}^{n} \big(w_k^j u_k^o\big)\big(w_i^j / u_i^o\big)\Big| \le \sum_{k=1}^{N} u_k^o \,|\langle w_k, w_i\rangle_n| / u_i^o \\
&= \sum_{k=1}^{N} u_k^o a_{ik} \langle w_k, w_i\rangle_n / u_i^o = d,
\end{aligned}$$

since $a_{ij} = \operatorname{sgn}(\langle y_j, y_i\rangle_n) = \operatorname{sgn}(\langle w_j, w_i\rangle_n)$ for $i, j = 1, \dots, N$. Hence $\|P\| = d = \varphi_n^N$. Now we show that

$$\|P\| = \lambda(V, l_\infty^{(N)}).$$

To do this set for $i = 1, \dots, N$, $a^i = (a_{i1}, \dots, a_{iN})$ and define an operator $E_p : l_\infty^{(N)} \to l_\infty^{(N)}$ by

$$E_p(x) = \sum_{i=1}^{N} (u_i^o)^2 x_i a^i.$$

We show that $E_p(V) \subset V$. Note that for any $k = 1, \dots, N$ and $j = 1, \dots, n$

$$(E_p z^j)_k = \sum_{i=1}^{N} (u_i^o)^2 \big(w_i^j / u_i^o\big) (a^i)_k = \sum_{i=1}^{N} u_i^o w_i^j a_{ki} = b_j(u^o, A)\, w_k^j / u_k^o = b_j(u^o, A)\, z_k^j,$$

since $w^j$ is an eigenvector associated to $b_j(u^o, A)$. Observe that by (17)

$$(P a^i)_i = d = \|P\|$$

for $i = 1, \dots, N$ and $\sum_{i=1}^{N} (u_i^o)^2 = 1$. By [5] (see also [14, Th. 1.3]), $P$ is a minimal projection in $\mathcal{P}(l_\infty^{(N)}, V)$. Finally

$$\lambda_n^N \ge \lambda(V, l_\infty^{(N)}) = \|P\| = d = \varphi_n^N,$$

which leads to a contradiction. The proof is complete. □
Lemma 2.6. For any $n \ge 2$,

$$\lambda_n^{n+1} = 2 - 2/(n+1).$$

Moreover, $\lambda_n^{n+1} = \lambda(\ker(f), l_\infty^{(n+1)})$ if and only if $f = c(\pm 1, \dots, \pm 1)$, where $c$ is a positive constant.

Proof. It is clear that

$$\lambda_n^{n+1} = \max\{\lambda(\ker(f), l_\infty^{(n+1)}) : f \in l_1^{(n+1)} \setminus \{0\},\ \|f\|_1 = 1\}.$$

By [1], if $f = (f_1, \dots, f_{n+1}) \in l_1^{(n+1)}$, $\|f\|_1 = 1$, is so chosen that $\lambda(\ker(f), l_\infty^{(n+1)}) > 1$, then $|f_j| < 1/2$ for any $j = 1, \dots, n+1$ and

$$\lambda(\ker(f), l_\infty^{(n+1)}) = 1 + \Big(\sum_{j=1}^{n+1} \frac{|f_j|}{1 - 2|f_j|}\Big)^{-1}.$$

Hence it is easy to see that

$$\lambda_n^{n+1} = \max\Big\{1 + \Big(\sum_{j=1}^{n+1} \frac{f_j}{1 - 2f_j}\Big)^{-1}\Big\}$$

under constraints

$$\sum_{j=1}^{n+1} f_j = 1, \quad 1/2 \ge f_j \ge 0,\ j = 1, \dots, n+1. \tag{18}$$

Now we show by induction argument that

$$\lambda_n^{n+1} = 2 - 2/(n+1).$$

If $n = 2$, by the Lagrange Multiplier Theorem the only functional $f = (f_1, f_2, f_3)$ which can maximize the function $\varphi_2(f) = 1 + \big(\sum_{j=1}^{3} \frac{f_j}{1 - 2f_j}\big)^{-1}$ under constraint (18) is $f = (1/3, 1/3, 1/3)$ and $\varphi_2(1/3, 1/3, 1/3) = 4/3$. Now assume that $\lambda_k^{k+1} = 2 - 2/(k+1)$ for any $k \le n$. Then by the Lagrange Multiplier Theorem the only functional $f = (f_1, \dots, f_{n+1})$ which can maximize the function $\varphi_n(f) = 1 + \big(\sum_{j=1}^{n+1} \frac{f_j}{1 - 2f_j}\big)^{-1}$ under constraint (18) is $f = (1/(n+1), \dots, 1/(n+1))$ and $\varphi_n(1/(n+1), \dots, 1/(n+1)) = 2 - 2/(n+1)$. Notice that $\varphi_{n+1}(1/(n+2), \dots, 1/(n+2)) = 2 - 2/(n+2)$, where $\varphi_{n+1}(f) = 1 + \big(\sum_{j=1}^{n+2} \frac{f_j}{1 - 2f_j}\big)^{-1}$. Consequently, by the induction hypothesis,

$$\lambda_{n+1}^{n+2} = \max\Big\{1 + \Big(\sum_{j=1}^{n+2} \frac{f_j}{1 - 2f_j}\Big)^{-1}\Big\}$$

under constraints

$$\sum_{j=1}^{n+2} f_j = 1, \quad 1/2 > f_j > 0,\ j = 1, \dots, n+2. \tag{19}$$

Again by the Lagrange Multiplier Theorem the only $f = (f_1, \dots, f_{n+2})$ which can maximize $\varphi_{n+1}$ under constraints (19) is $f = (1/(n+2), \dots, 1/(n+2))$. Hence $\lambda_{n+1}^{n+2} = 2 - 2/(n+2)$, as required. By the above proof, any functional $f$ satisfying $\lambda(\ker(f), l_\infty^{(n+1)}) = \lambda_n^{n+1}$ is of the form $c(\pm 1/(n+1), \dots, \pm 1/(n+1))$. The proof is complete. □
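The value of the function $\varphi_n(f) = 1 + \big(\sum_j f_j/(1 - 2f_j)\big)^{-1}$ at the uniform point can be confirmed with exact rational arithmetic. The sketch below only evaluates the formula from the proof of Lemma 2.6; it does not carry out the Lagrange multiplier argument, and the helper names `phi` and `phi_uniform` are illustrative.

```python
from fractions import Fraction

def phi(f):
    """1 + (sum_j f_j / (1 - 2 f_j))^(-1) for nonnegative f_j < 1/2 summing to 1."""
    s = sum(fj / (1 - 2 * fj) for fj in f)
    return 1 + 1 / s

def phi_uniform(n):
    """phi evaluated at f = (1/(n+1), ..., 1/(n+1)), a point with n + 1 coordinates."""
    return phi([Fraction(1, n + 1)] * (n + 1))

for n in range(2, 7):
    print(n, phi_uniform(n), 2 - Fraction(2, n + 1))
```

For comparison, a non-uniform admissible point such as (1/5, 2/5, 2/5) gives 16/13, which is smaller than the value 4/3 attained at (1/3, 1/3, 1/3).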
Lemma 2.7. Let us consider problem (1) with $u_1 = 0$ and fixed $N \ge n + 2$. Assume that $\lambda_n^{N-1} > \lambda_{n-1}^{N-1}$. Then the maximum of $f_{u_1}$ under constraints (2) and (3) is equal to $\lambda_n^{N-1}$.

Proof. By [12, Th. 1.2] and Theorem 2.2 for any $n, N \in \mathbb{N}$, $N \ge n + 1$,

$$\lambda_n^N = \max \sum_{i,j=1}^{N} u_i u_j \,|\langle x_i, x_j\rangle_n| \tag{20}$$

under constraints

$$\langle x^i, x^j\rangle_N = \delta_{ij}, \quad 1 \le i \le j \le n; \tag{21}$$

$$\sum_{j=1}^{N} u_j^2 = 1. \tag{22}$$

Moreover, if $u, y^1, \dots, y^n \in \mathbb{R}^N$ satisfying (21) and (22) are such that

$$\sum_{i,j=1}^{N} u_i u_j \,|\langle y_i, y_j\rangle_n| = \lambda_n^N,$$

then by Lemma 2.4 and Theorem 2.1,

$$\lambda_n^N = \sum_{j=1}^{n} b_j, \tag{23}$$

where $b_1 \ge b_2 \ge \cdots \ge b_n$ are the $n$ biggest eigenvalues of the $N \times N$ matrix $B = (b_{ij})_{i,j=1,\dots,N}$ defined by $b_{ij} = u_i u_j \operatorname{sgn}(\langle y_i, y_j\rangle_n)$. Now, assume

$$f_{u_1}\big((v_2, \dots, v_N), y^1, \dots, y^n\big) = \max\big\{f_{u_1}\big((u_2, \dots, u_N), x^1, \dots, x^n\big) : (u_1, \dots, u_N), x^1, \dots, x^n \text{ satisfying (2), (3)}\big\}.$$

Since $u_1 = 0$, by (20) and Theorem 2.2,

$$f_{u_1}\big((v_2, \dots, v_N), y^1, \dots, y^n\big) \ge \varphi_n^{N-1} = \lambda_n^{N-1}.$$

To prove the opposite inequality, let $B$ be an $N \times N$ matrix defined by

$$b_{ij} = v_i v_j \operatorname{sgn}\langle y_i, y_j\rangle_n.$$

Let $b_1 \ge b_2 \ge \cdots \ge b_N$ be the eigenvalues of $B$ (with multiplicities). By Lemma 2.4,

$$f_{u_1}\big((v_2, \dots, v_N), y^1, \dots, y^n\big) = \sum_{j=1}^{n} b_j.$$

Let $C = \{b_{ij}\}_{i,j=2,\dots,N}$ and let $c_1 \ge c_2 \ge \cdots \ge c_{N-1}$ be the eigenvalues of $C$. Since $u_1 = 0$,

$$\{c_1, \dots, c_{N-1}\} \cup \{0\} = \{b_1, \dots, b_N\}.$$

If $b_{j_o} = 0$ for some $j_o \in \{1, \dots, n\}$, then again by [12, Th. 1.2], (20), Theorem 2.1 and Lemma 2.6

$$\lambda_n^{N-1} \le f_{u_1}\big((v_2, \dots, v_N), y^1, \dots, y^n\big) = \sum_{j=1}^{n} b_j \le \sum_{j < j_o} b_j \le \lambda_{n-1}^{N-1};$$

a contradiction with our assumptions. Hence $b_i = c_i$ for $i = 1, \dots, n$. Now let $z^1, \dots, z^n \in \mathbb{R}^{N-1}$ be the orthonormal eigenvectors of $C$ corresponding to $b_1, \dots, b_n$. Hence for any $j = 1, \dots, n$ and $i = 1, \dots, N-1$

$$(Cz^j)_i = c_j (z^j)_i.$$

Multiplying each of the above equations by $(z^j)_i$ and summing them up we get

$$\max\{f_{u_1}\} = \sum_{j=1}^{n} c_j = \sum_{i,j=2}^{N} b_{ij} \langle z_{i-1}, z_{j-1}\rangle_n = \sum_{i,j=2}^{N} v_i v_j \operatorname{sgn}\langle y_i, y_j\rangle_n \langle z_{i-1}, z_{j-1}\rangle_n \le \lambda_n^{N-1}.$$

The proof is complete. □
Lemma 2.8. Let $u = (u_1,\dots,u_N) \in \mathbb{R}^N$ and let $z = (z_2,\dots,z_N) \in \{-1,1\}^{N-1}$. Let $A_z$ be the $N \times N$ matrix defined by $a_{1j} = z_j \in \{\pm 1\}$ for $j = 2,\dots,N$, $a_{ij} = -1$ for $i,j = 2,\dots,N$, $i \ne j$, and $a_{ii} = 1$ for $i = 1,\dots,N$. Let $B_z = \{(b_z)_{ij},\ i,j = 1,\dots,N\}$ where $(b_z)_{ij} = u_i u_j (A_z)_{ij}$. Hence
\[ B_z = \begin{pmatrix} u_1^2 & z_2u_1u_2 & z_3u_1u_3 & \dots & z_Nu_1u_N \\ z_2u_1u_2 & u_2^2 & -u_2u_3 & \dots & -u_2u_N \\ z_3u_1u_3 & -u_2u_3 & u_3^2 & \dots & -u_3u_N \\ \dots & \dots & \dots & \dots & \dots \\ z_Nu_1u_N & -u_2u_N & \dots & \dots & u_N^2 \end{pmatrix}. \tag{24} \]
Let $\sigma$ be a permutation of $\{1,\dots,N\}$ such that $\sigma(1) = 1$ and, for any $x = (x_1,\dots,x_N) \in \mathbb{R}^N$, let $x^- = (x_1,-x_2,\dots,-x_N)$. Then the matrices
\[ B_{\sigma(z)} = \bigl\{u_{\sigma(i)}u_{\sigma(j)}(A_{\sigma(z)})_{ij},\ i,j = 1,\dots,N\bigr\}, \qquad B_{z^-} = \bigl\{u_iu_j(A_{z^-})_{ij},\ i,j = 1,\dots,N\bigr\} \]
and $B_z$ have the same eigenvalues.

Proof. Let $b$ be an eigenvalue of $B_z$ with an eigenvector $x = (x_1,\dots,x_N)$. Define $x_\sigma = (x_1, x_{\sigma(2)},\dots,x_{\sigma(N)})$ and $x^- = (x_1,-x_2,\dots,-x_N)$. Notice that
\[ (B_{\sigma(z)}x_\sigma)_1 = u_1^2x_1 + \sum_{j=2}^N z_{\sigma(j)}x_{\sigma(j)}u_{\sigma(j)}u_1 = u_1^2x_1 + \sum_{j=2}^N z_jx_ju_ju_1 = bx_1. \]
Analogously, for $i = 2,\dots,N$,
\[ (B_{\sigma(z)}x_\sigma)_i = z_{\sigma(i)}u_1u_{\sigma(i)}x_1 + \sum_{j=2,\,j\ne i}^N \bigl(-u_{\sigma(j)}u_{\sigma(i)}x_{\sigma(j)}\bigr) + u_{\sigma(i)}^2x_{\sigma(i)} = bx_{\sigma(i)} = b(x_\sigma)_i. \]
Also notice that
\[ (B_{z^-}x^-)_1 = u_1^2x_1 + \sum_{j=2}^N (-z_j)u_1u_j(-x_j) = bx_1 = b(x^-)_1 \]
and for $i = 2,\dots,N$
\[ (B_{z^-}x^-)_i = (-z_i)u_1u_ix_1 + \sum_{j=2}^N a_{ij}u_iu_j(-x_j) = -bx_i = b(x^-)_i. \]
This shows that any eigenvalue of $B_z$ is an eigenvalue of $B_{z^-}$ and $B_{\sigma(z)}$ with the same multiplicity. By the same reasoning, any eigenvalue of $B_{z^-}$ and $B_{\sigma(z)}$ is also an eigenvalue of $B_z$, which completes the proof. □

Theorem 2.3. Let $n = 3$ and $N = 5$. Let $z = (z_2,z_3,z_4,z_5)$ be such that $z_i = \pm 1$ for $i = 2,\dots,5$ and $z_j = -1$ for exactly one $j \in \{2,3,4,5\}$. Assume that $A_z = (a_{ij}(z))$ is a $5 \times 5$ matrix defined by
\[ A_z = \begin{pmatrix} 1 & z_2 & z_3 & z_4 & z_5 \\ z_2 & 1 & -1 & -1 & -1 \\ z_3 & -1 & 1 & -1 & -1 \\ z_4 & -1 & -1 & 1 & -1 \\ z_5 & -1 & -1 & -1 & 1 \end{pmatrix}. \tag{25} \]
Let
\[ M_A = \max\Bigl\{ \sum_{i,j=1}^5 u_iu_ja_{ij}(z)\langle x_i, x_j\rangle_3 : x^1,x^2,x^3 \in \mathbb{R}^5 \text{ satisfying (2)},\ \sum_{i=1}^5 u_i^2 = 1 \Bigr\}. \]
Then $M_A = 3/2$.

Proof. By Lemma 2.8, we can assume that $z_2 = -1$. Fix $u \in \mathbb{R}^5$, $\sum_{i=1}^5 u_i^2 = 1$. Let $B_u$ denote the $5 \times 5$ matrix defined by $(b_u)_{ij} = u_iu_ja_{ij}(z)$ for $i,j = 1,\dots,5$. By Lemma 2.4,
\[ M_A = \max\Bigl\{ \sum_{j=1}^3 b_j(u,A) : u \in \mathbb{R}^5,\ \sum_{i=1}^5 u_i^2 = 1 \Bigr\}, \]
where $b_1(u,A) \ge b_2(u,A) \ge b_3(u,A)$ denote the three biggest eigenvalues of $B_u$. Put $v_i = u_i^2$ for $i = 1,\dots,5$. After elementary but tedious calculations (we advise checking them with a symbolic program such as Mathematica) we get that
\[ \det(B_u - t\,\mathrm{Id}) = -t^5 + t^4\sum_{i=1}^5 v_i + 16tv_3v_4v_5(v_1+v_2) - 4t^2\bigl[v_3v_4v_5 + (v_1+v_2)\bigl(v_4v_5 + v_3(v_4+v_5)\bigr)\bigr]. \]
Define $w = (w_1,\dots,w_5)$ by $w_1 = 0$, $w_2 = \sqrt{u_1^2+u_2^2}$, $w_i = u_i$ for $i = 3,4,5$. Observe that by the above formula $B_u$ and $B_w$ have the same eigenvalues. Since $w_1 = 0$, by Lemma 2.7, Theorem 2.1, Theorem 2.2 and Lemma 2.6 applied to $n = 3$ and $N = 5$ we get
\[ \sum_{j=1}^3 b_j(u,A) \le \lambda_3^4 = 3/2, \]
which completes the proof. □
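The determinant identity used in the proof of Theorem 2.3 can be spot-checked mechanically. The following is a minimal sketch in Python, not the authors' Mathematica computation. It assumes the normalisation $z_2 = -1$, $z_3 = z_4 = z_5 = 1$ and $v_i = u_i^2$; the helper names (`det`, `sign_matrix`, `char_poly_direct`, `char_poly_formula`) and the test vector `u` are illustrative choices. Exact rational arithmetic is used, and both sides are compared at seven values of $t$, which is enough to identify two polynomials of degree five in $t$ for the chosen $u$.

```python
from fractions import Fraction

def det(m):
    """Exact determinant by Laplace expansion along the first row."""
    if len(m) == 1:
        return m[0][0]
    return sum((-1) ** j * m[0][j] * det([row[:j] + row[j + 1:] for row in m[1:]])
               for j in range(len(m)))

def sign_matrix(z):
    """A_z as in (25): 1 on the diagonal, z_j in the first row/column, -1 elsewhere."""
    a = [[1 if i == j else -1 for j in range(5)] for i in range(5)]
    for j in range(1, 5):
        a[0][j] = a[j][0] = z[j - 1]
    return a

def char_poly_direct(u, z, t):
    """det(B_u - t Id) with (b_u)_ij = u_i u_j a_ij(z)."""
    a = sign_matrix(z)
    m = [[u[i] * u[j] * a[i][j] - (t if i == j else 0) for j in range(5)]
         for i in range(5)]
    return det(m)

def char_poly_formula(u, t):
    """Closed form stated in the proof, written in terms of v_i = u_i^2."""
    v1, v2, v3, v4, v5 = (x * x for x in u)
    return (-t ** 5 + t ** 4 * (v1 + v2 + v3 + v4 + v5)
            + 16 * t * v3 * v4 * v5 * (v1 + v2)
            - 4 * t ** 2 * (v3 * v4 * v5 + (v1 + v2) * (v4 * v5 + v3 * (v4 + v5))))

# Arbitrary rational test vector (the identity is polynomial, so no normalisation is needed).
u = [Fraction(1, 2), Fraction(1, 3), Fraction(1, 5), Fraction(1, 7), Fraction(2, 3)]
z = (-1, 1, 1, 1)
agree = all(char_poly_direct(u, z, Fraction(t)) == char_poly_formula(u, Fraction(t))
            for t in range(-3, 4))
print(agree)
```

Since the closed form depends on $v_1$ and $v_2$ only through $v_1 + v_2$, the same check illustrates why $B_u$ and $B_w$ share their eigenvalues.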
Lemma 2.9. Let $B$ be a $5 \times 5$ matrix defined by
\[ B = \begin{pmatrix} u_{o1}^2 & z_2u_{o1}c & z_3u_{o1}c & z_4u_{o1}c & z_5u_{o1}c \\ z_2u_{o1}c & c^2 & -c^2 & -c^2 & -c^2 \\ z_3u_{o1}c & -c^2 & c^2 & -c^2 & -c^2 \\ z_4u_{o1}c & -c^2 & -c^2 & c^2 & -c^2 \\ z_5u_{o1}c & -c^2 & -c^2 & -c^2 & c^2 \end{pmatrix}, \tag{26} \]
where $z_j \in \{\pm 1\}$ for $j = 2,3,4,5$. Then $2c^2$ is an eigenvalue of $B$ with multiplicity at least 2.
Proof. Let $C$ be defined by
\[ C = \begin{pmatrix} 0 & 0 & 0 & 0 & 0 \\ 0 & c^2 & -c^2 & -c^2 & -c^2 \\ 0 & -c^2 & c^2 & -c^2 & -c^2 \\ 0 & -c^2 & -c^2 & c^2 & -c^2 \\ 0 & -c^2 & -c^2 & -c^2 & c^2 \end{pmatrix}. \tag{27} \]
Since $2c^2$ is an eigenvalue of $C$ of multiplicity 3 with the eigenvectors $v^j$, $j = 2,3,4$, given by (30), there exist 2 orthonormal vectors $w^1, w^2$ in $\operatorname{span}[v^2,v^3,v^4]$ which are orthogonal to the first row of $B$, which completes the proof. □

Theorem 2.4. Let $n = 3$ and $N = 5$. Fix $u_{o1} \in [0,1]$. Assume $A = (a_{ij})$ is a $5 \times 5$ matrix defined by
\[ A = \begin{pmatrix} 1 & 1 & 1 & -1 & -1 \\ 1 & 1 & -1 & -1 & -1 \\ 1 & -1 & 1 & -1 & -1 \\ -1 & -1 & -1 & 1 & -1 \\ -1 & -1 & -1 & -1 & 1 \end{pmatrix}. \tag{28} \]
Let
\[ M_A(u_1) = \max\Bigl\{ \sum_{i,j=1}^5 u_iu_ja_{ij}\langle x_i,x_j\rangle_3 : x^1,x^2,x^3 \in \mathbb{R}^5 \text{ satisfying (2)},\ u_1 = u_{o1},\ u_i = \sqrt{1-u_1^2}/2,\ i = 2,3,4,5 \Bigr\}. \]
Then
\[ M_A(u_1) = \frac{1 + 6c^2 + \sqrt{(6c^2-1)^2 + 16(1-4c^2)c^2}}{2}, \]
where $c = \sqrt{1-u_{o1}^2}/2$. Moreover,
\[ M_A = \max\{M_A(u) : u \in [0,1]\} = \frac{5+4\sqrt2}{7} = M_A\Bigl(\sqrt{(5-3\sqrt2)/7}\Bigr). \]

Proof. Notice that by Theorem 2.1,
\[ M_A = \sum_{j=1}^3 b_j(B), \]
where $b_1(B) \ge b_2(B) \ge \dots \ge b_5(B)$ denote the eigenvalues of the matrix $B$ given by
\[ B = \begin{pmatrix} u_{o1}^2 & u_{o1}c & u_{o1}c & -u_{o1}c & -u_{o1}c \\ u_{o1}c & c^2 & -c^2 & -c^2 & -c^2 \\ u_{o1}c & -c^2 & c^2 & -c^2 & -c^2 \\ -u_{o1}c & -c^2 & -c^2 & c^2 & -c^2 \\ -u_{o1}c & -c^2 & -c^2 & -c^2 & c^2 \end{pmatrix}, \tag{29} \]
where $c = \sqrt{1-u_{o1}^2}/2$. Hence we should calculate the eigenvalues of $B$. To do this, let $C$ be given by (27). It is easy to see that the eigenvalues of $C$ are: $0$ (with the eigenvector $v^1 = (1,0,0,0,0)$), $2c^2$ (with the orthonormal eigenvectors
\[ v^2 = (0, 1/\sqrt2, -1/\sqrt2, 0, 0), \quad v^3 = (0, 0, 0, 1/\sqrt2, -1/\sqrt2), \quad v^4 = (0, 1/2, 1/2, -1/2, -1/2)) \tag{30} \]
and $-2c^2$ (with the eigenvector $v^5 = (0, 1/2, 1/2, 1/2, 1/2)$). Hence our theorem is proved for $u_{o1} = 0$ (in this case $c = 1/2$). If $u_{o1} > 0$, since the vectors $v^2$, $v^3$ and $v^5$ are orthogonal to the first row of $B$, by Lemma 2.9, $2c^2$ (with multiplicity 2) and $-2c^2$ (with multiplicity 1) are also eigenvalues of $B$. Now we find the other 2 eigenvalues of $B$. To do this, we show that an element $(a, 1/2, 1/2, -1/2, -1/2)$ for a properly chosen $a$ is an eigenvector of $B$. Let us consider a system of equations:
\[ u_{o1}^2a + 2u_{o1}c = \lambda a \tag{31} \]
and
\[ u_{o1}ca + c^2 = \lambda/2 \tag{32} \]
with unknown variables $a$ and $\lambda$. Hence we easily get that
\[ u_{o1}^2a + 2u_{o1}c = 2\bigl(u_{o1}ca + c^2\bigr)a. \]
The last equation has two solutions. Namely:
\[ a_1 = \frac{u_{o1}^2 - 2c^2 + \sqrt{(u_{o1}^2-2c^2)^2 + 16u_{o1}^2c^2}}{4u_{o1}c} \quad\text{and}\quad a_2 = \frac{u_{o1}^2 - 2c^2 - \sqrt{(u_{o1}^2-2c^2)^2 + 16u_{o1}^2c^2}}{4u_{o1}c}. \]
Since $a_1, \lambda_1$ and $a_2, \lambda_2$ are the solutions of (31) and (32), it is easy to check that $(a_1, 1/2, 1/2, -1/2, -1/2)$ is an eigenvector of $B$ corresponding to the eigenvalue
\[ \lambda_1 = 2u_{o1}ca_1 + 2c^2 = 2c^2 + \frac{u_{o1}^2 - 2c^2 + \sqrt{(u_{o1}^2-2c^2)^2 + 16u_{o1}^2c^2}}{2} \]
and $(a_2, 1/2, 1/2, -1/2, -1/2)$ is an eigenvector of $B$ corresponding to the eigenvalue
\[ \lambda_2 = 2u_{o1}ca_2 + 2c^2 = 2c^2 + \frac{u_{o1}^2 - 2c^2 - \sqrt{(u_{o1}^2-2c^2)^2 + 16u_{o1}^2c^2}}{2}. \]
It is clear that $\lambda_1 > 2c^2$ and $\lambda_2 < 2c^2$. Hence by Theorem 2.1, $M_A = \lambda_1 + 2c^2 + 2c^2$. Since $u_{o1}^2 = 1 - 4c^2$,
\[ \lambda_1 + 4c^2 = \frac{1 + 6c^2 + \sqrt{(6c^2-1)^2 + 16(1-4c^2)c^2}}{2}, \]
which completes this part of the proof. Now define for $c \in [0,1/2]$,
\[ h(c) = \frac{1 + 6c^2 + \sqrt{(6c^2-1)^2 + 16(1-4c^2)c^2}}{2}. \]
Notice that $h(0) = 1$ and $h(1/2) = 3/2$. After elementary calculations (substituting $c^2$ by $x$), we get that
\[ c_o = \frac{\sqrt{(2+3\sqrt2)/7}}{2} \]
is the only point in $[0,1/2]$ such that $h'(c_o) = 0$. Since
\[ h(c_o) = \frac{5+4\sqrt2}{7} > 3/2, \]
\[ M_A = h(c_o) = \frac{5+4\sqrt2}{7}. \]
Note that $u_1 = \sqrt{(5-3\sqrt2)/7}$ satisfies $u_1^2 + 4c_o^2 = 1$. The proof is complete. □
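The eigenvalue computation in the proof of Theorem 2.4 lends itself to a quick numerical sanity check. The sketch below is an illustration in floating-point Python, not part of the original argument; it assumes $c = \sqrt{1-u_{o1}^2}/2$ and the sign pattern of $A$ in (28), and the names `A`, `h`, `residual` and `check` are hypothetical helpers. It verifies that the three claimed eigenpairs of $B$ satisfy $Bx = \lambda x$, that $\lambda_1 + 4c^2 = h(c)$, and that $h(c_o) = (5+4\sqrt2)/7$.

```python
import math

# Sign matrix A from (28).
A = [[ 1,  1,  1, -1, -1],
     [ 1,  1, -1, -1, -1],
     [ 1, -1,  1, -1, -1],
     [-1, -1, -1,  1, -1],
     [-1, -1, -1, -1,  1]]

def h(c):
    """Closed form for M_A(u_1) in terms of c."""
    return (1 + 6 * c * c + math.sqrt((6 * c * c - 1) ** 2 + 16 * (1 - 4 * c * c) * c * c)) / 2

def residual(m, x, lam):
    """Largest entry of |m x - lam x|."""
    mx = [sum(m[i][j] * x[j] for j in range(5)) for i in range(5)]
    return max(abs(mx[i] - lam * x[i]) for i in range(5))

def check(u):
    """For 0 < u < 1 return (worst eigen-residual, lambda_1 + 4c^2 - h(c))."""
    c = math.sqrt(1 - u * u) / 2
    w = [u, c, c, c, c]
    B = [[w[i] * w[j] * A[i][j] for j in range(5)] for i in range(5)]
    root = math.sqrt((u * u - 2 * c * c) ** 2 + 16 * u * u * c * c)
    a1 = (u * u - 2 * c * c + root) / (4 * u * c)
    lam1 = 2 * c * c + (u * u - 2 * c * c + root) / 2
    r = 1 / math.sqrt(2)
    pairs = [([a1, 0.5, 0.5, -0.5, -0.5], lam1),
             ([0, r, -r, 0, 0], 2 * c * c),
             ([0, 0, 0, r, -r], 2 * c * c)]
    worst = max(residual(B, x, lam) for x, lam in pairs)
    return worst, lam1 + 4 * c * c - h(c)

u_o = math.sqrt((5 - 3 * math.sqrt(2)) / 7)
c_o = math.sqrt((2 + 3 * math.sqrt(2)) / 7) / 2
print(check(u_o))
print(h(c_o), (5 + 4 * math.sqrt(2)) / 7)
```

Both numbers in the first printed pair should be at rounding-error level, and the two values in the second line should agree to machine precision.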
Lemma 2.10. Let $B$ be defined by (26). Assume that $c \in (0,1/2)$ is so chosen that there exist eigenvalues $b_4(B) \ge b_5(B)$ of $B$ satisfying $b_4(B) < 2c^2$. Let $w^1, w^2, w^3$ be the orthonormal eigenvectors corresponding to the three biggest eigenvalues of $B$. Assume that
\[ \sum_{i,j=1}^5 b_{ij}\langle w_i,w_j\rangle_3 = M = \max\Bigl\{ \sum_{i,j=1}^5 u_iu_j\langle z_i,z_j\rangle_3 : z^1,z^2,z^3 \in \mathbb{R}^5 \Bigr\}, \tag{33} \]
under constraint (2) with $u_1 = \sqrt{1-4c^2}$ and $u_j = c$ for $j = 2,3,4,5$. Then the matrix $B_o$ determined by $1 = z_2 = z_3 = -z_4 = -z_5$ satisfies (33).

Proof. By Theorem 2.1 we need to calculate the sum of the three biggest eigenvalues of any matrix $B$ satisfying (26). If $z_i = z_j = 1$ for exactly two indices $i,j \in \{2,3,4,5\}$ then applying Theorem 2.4 and Lemma 2.8, we can show that $B$ has the same eigenvalues as $B_o$. Now assume that $z_i = -1$ for exactly one $i \in \{2,3,4,5\}$. Then by Theorem 2.3,
\[ b_1(B) + b_2(B) + b_3(B) \le 3/2, \]
where $b_1(B) \ge b_2(B) \ge b_3(B)$ denote the three biggest eigenvalues of $B$. Notice that by Theorem 2.4, $M \ge M_A > 3/2$. By Lemma 2.8 the same conclusion holds true if $z_i = 1$ for exactly one $i \in \{2,3,4,5\}$. Now assume that $z_i = 1$ for $i = 2,3,4,5$. Then, reasoning as in Theorem 2.4, we get that the eigenvalues of $B$ are: $2c^2$ with multiplicity 3,
\[ 1/2 - 3c^2 + \sqrt{1+12c^2-60c^4}/2 \quad\text{and}\quad 1/2 - 3c^2 - \sqrt{1+12c^2-60c^4}/2. \]
After elementary calculations we obtain that
\[ 1/2 - 3c^2 + \sqrt{1+12c^2-60c^4}/2 \le 2c^2 \]
if and only if $1/2 \ge c \ge 1/\sqrt5$. If $B_o$ satisfies (33), by Theorem 2.1, we should have $b_1(B) \le b_1(B_o)$, which by the above calculations and Theorem 2.4 is equivalent to
\[ \sqrt{1+12c^2-60c^4}/2 < 2c^2 + \sqrt{1+8c^2-32c^4}/2 \]
or
\[ 2c^2 < 1/2 - c^2 + \sqrt{1+4c^2-28c^4}/2. \]
After elementary calculations we get that both inequalities are equivalent to $0 < c < 1/2$, which shows our claim. If $z_i = -1$ for $i = 2,3,4,5$, by Lemma 2.8 the conclusion is the same. Finally, by Theorem 2.1, $B_o$ satisfies (33). □
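The case $z_2 = z_3 = z_4 = z_5 = 1$ in the proof of Lemma 2.10 can be illustrated numerically. The sketch below is an assumption-laden check rather than part of the proof: it takes $u_1 = \sqrt{1-4c^2}$, builds $B$ from (26) with all $z_j = 1$, and tests the two non-trivial eigenvalues $1/2 - 3c^2 \pm \sqrt{1+12c^2-60c^4}/2$ against an eigenvector of the assumed form $(a, 1/2, 1/2, 1/2, 1/2)$. It then scans a grid to see where the larger of the two drops below $2c^2$. The function names are hypothetical.

```python
import math

def eig_pair_all_plus(c):
    """The two eigenvalues of B (all z_j = 1) that differ from 2c^2."""
    root = math.sqrt(1 + 12 * c ** 2 - 60 * c ** 4)
    return 0.5 - 3 * c ** 2 + root / 2, 0.5 - 3 * c ** 2 - root / 2

def residual_all_plus(c, lam):
    """Residual of B x = lam x for x = (a, 1/2, 1/2, 1/2, 1/2), a chosen from row 1."""
    u = math.sqrt(1 - 4 * c ** 2)
    w = [u, c, c, c, c]
    # Sign pattern: +1 on the diagonal and in the first row/column, -1 elsewhere.
    B = [[w[i] * w[j] * (1 if (i == j or i == 0 or j == 0) else -1)
          for j in range(5)] for i in range(5)]
    a = 2 * u * c / (lam - u * u)      # solves u^2 a + 2 u c = lam a
    x = [a, 0.5, 0.5, 0.5, 0.5]
    Bx = [sum(B[i][j] * x[j] for j in range(5)) for i in range(5)]
    return max(abs(Bx[i] - lam * x[i]) for i in range(5))

def below_threshold(c):
    """True when the larger eigenvalue is at most 2c^2."""
    return eig_pair_all_plus(c)[0] <= 2 * c ** 2

for c in (0.1, 0.3, 0.45):
    hi, lo = eig_pair_all_plus(c)
    print(c, residual_all_plus(c, hi), residual_all_plus(c, lo), below_threshold(c))
```

On a grid of step $10^{-3}$ in $(0, 1/2)$ the switch happens between $c = 0.447$ and $c = 0.448$, consistent with the threshold $1/\sqrt5$.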
Lemma 2.11. Let $A = \{a_{ij},\ i,j = 1,\dots,5\}$ be a $5 \times 5$ symmetric matrix such that $a_{ij} \in \{\pm 1\}$ for $i,j = 1,\dots,5$ and $a_{ii} = 1$ for $i = 1,\dots,5$. Consider a function
\[ f_{u_1,A}\bigl((u_2,\dots,u_5), x^1,x^2,x^3\bigr) = \sum_{i,j=1}^5 u_iu_ja_{ij}\langle x_i,x_j\rangle_3 \tag{34} \]
under constraints (2) and (3). Then there exist $x^1,x^2,x^3 \in \mathbb{R}^5$ satisfying (2) and $(u_2,u_3,u_4,u_5)$ satisfying (3) maximizing the function $f_{u_1,A}$ such that $x^3_4 = x^3_5 = 0$, $x^3_2 \ge 0$, $x^2_2 = 0$, $x^2_4 \ge 0$ and $x^1_2 \ge 0$.

Proof. Let $y^1,y^2,y^3$ and $(u_2,u_3,u_4,u_5)$ be any vectors satisfying (2) and (3) maximizing $f_{u_1,A}$. Let $V = \operatorname{span}[y^1,y^2,y^3]$. Since $\dim(V) = 3$, there exist linearly independent $f,g \in \mathbb{R}^5$ such that $V = \ker(f) \cap \ker(g)$. Hence we can find $d^3 \in V \setminus \{0\}$, orthogonal to $e_4, e_5$, such that $d^3_2 \ge 0$. Set $x^3 = d^3/\|d^3\|_2$. Analogously we can find $d^2 \in V \setminus \{0\}$, orthogonal to $x^3$ and $e_2$, satisfying $d^2_4 \ge 0$. Define $x^2 = d^2/\|d^2\|_2$. Finally we can find $d^1 \in V \setminus \{0\}$, orthogonal to $x^3$ and $x^2$, with $d^1_2 \ge 0$. Set $x^1 = d^1/\|d^1\|_2$. Note that $x^i \in V$ for $i = 1,2,3$ and they are orthonormal. By Lemma 2.3, $x^1,x^2,x^3$ and $(u_2,u_3,u_4,u_5)$ maximize the function $f_{u_1,A}$, which completes the proof. □

Lemma 2.12. Let $A$ be a fixed $5 \times 5$ matrix given by
\[ A = \begin{pmatrix} 1 & z_2 & z_3 & z_4 & z_5 \\ z_2 & 1 & -1 & -1 & -1 \\ z_3 & -1 & 1 & -1 & -1 \\ z_4 & -1 & -1 & 1 & -1 \\ z_5 & -1 & -1 & -1 & 1 \end{pmatrix}, \tag{35} \]
where $z_i \in \{\pm 1\}$ for $i = 2,3,4,5$. Let
\[ g_{t,u_1,A}\bigl((u_2,\dots,u_5), x^1,x^2,x^3\bigr) = f_{u_1,A}\bigl((u_2,\dots,u_5), x^1,x^2,x^3\bigr) + t\Bigl(\sum_{i=2}^5 u_i + x^2_4 - x^2_5 + x^3_2 - x^3_3\Bigr), \]
where t > 0 is fixed and (u2 , . . . , u5 ), (x 1 , x 2 , x 3 ) satisfy (2) and (3). Let u1 = 0 and let (u2 , . . . ,√ u5 ) and (x, y, z) ∈ R15 satisfying (2), (3) maximize gt,u1 ,A . Assume√that x2 √0. Then ui = 1/ √ 2, for i √ = 2, 3, 4, 5, x = (0, 1/2, 1/2, 1/2, 1/2), y = (0, 0, 0, 1/ 2, −1/ 2), and z = (0, 1/ 2, −1/ 2, 0, 0). Proof. By Lemma 2.7, the above mentioned x, y, z and (u2 , . . . , u5 ) maximize f0,A and f0,A (u2 , . . . , u5 ), x 1 , . . . , x 3 = 3/2.
Since the maximum of 5i=2 ui + x42 − x52 + x23 − x33 under restrictions 5i=2 u2i = 1 − u21 ,
5 i 2 2 2 j =1 (xj ) = 1 for i = 2, 3 is attained only for ui = (1 − u1 )/2 for i = 2, 3, 4, 5, x = y and x 3 = z, √ gt,0,A (u2 , . . . , u5 ), x 1 , . . . , x 3 = 3/2 + t (4/ 2 + 2). 1 1 1 ) maximize the function g Now assume that t,0,A . Hence in particular, √ (v2 , . . . , v5 ) and (x , y , z √
5 v = 4/ 2, which shows that v = 1/ 2 = u for i = 2, . . . , 5. Analogously, y = y 1 and i i i=1 i z = z1 . By Lemma 2.4,
\[ \operatorname{span}[x^1,y^1,z^1] = \operatorname{span}[x,y,z]. \]
Assume that $x^1 = px + qy + rz$. Since $y = y^1$, $z = z^1$ and $x^1,y^1,z^1$ are orthonormal, we get $q = r = 0$. Hence $p = \pm 1$. Since $x^1_2 \ge 0$ and $x_2 > 0$, $x^1 = x$, which completes the proof. □

The next lemma is a simple consequence of the Implicit Function Theorem.

Lemma 2.13. Let $U \subset \mathbb{R}^l$ be an open, non-empty set and let $f : U \times \mathbb{R}^n \to \mathbb{R}$ and $G_i : \mathbb{R}^n \to \mathbb{R}$ for $i = 1,\dots,k$ be fixed $C^2$ functions. Let $g : U \times \mathbb{R}^{n+k} \to \mathbb{R}$ be defined by
\[ g(u,x,d) = f(u,x) - \sum_{i=1}^k d_iG_i(x) \]
for $u \in U$, $x \in \mathbb{R}^n$ and $d \in \mathbb{R}^k$. Assume that $\frac{\partial g}{\partial z_j}(u^o,x^o,d^o) = 0$ for $j = 1,\dots,n+k$ and
\[ \det\Bigl(\frac{\partial^2 g}{\partial z_i\partial z_j}(u^o,x^o,d^o)\Bigr) \ne 0 \]
for some $(u^o,x^o,d^o) \in U \times \mathbb{R}^{n+k}$ and $i,j = 1,\dots,n+k$. (We do not differentiate with respect to the coordinates of $u$.) Assume that $(u^m,x^m,d^m) \in U \times \mathbb{R}^{n+k}$ and $(u^m,y^m,z^m) \in U \times \mathbb{R}^{n+k}$ are such that $(u^m,x^m,d^m) \to (u^o,x^o,d^o)$ and $(u^m,y^m,z^m) \to (u^o,x^o,d^o)$ with respect to any norm in $\mathbb{R}^{l+n+k}$. If, for any $m \in \mathbb{N}$, $\frac{\partial g}{\partial z_j}(u^m,x^m,d^m) = 0$ and $\frac{\partial g}{\partial z_j}(u^m,y^m,z^m) = 0$ for $j = 1,\dots,n+k$, then
\[ (u^m,x^m,d^m) = (u^m,y^m,z^m) \]
for $m \ge m_o$.

Proof. It suffices to apply the Implicit Function Theorem to the function
\[ G(u,x,d) = \Bigl(\frac{\partial g}{\partial z_1}(u,x,d),\dots,\frac{\partial g}{\partial z_{n+k}}(u,x,d)\Bigr) \]
and $(u,x,d) = (u^o,x^o,d^o)$. □
Lemma 2.14. Let $A \in \mathbb{R}^{n\times n}$ be a symmetric matrix. Let $\lambda_{k_1}, \lambda_{k_2}$ be eigenvalues of $A$, $\lambda_{k_i}$ of multiplicity $j_i$ for $i = 1,2$. Assume that for $i = 1,2$ and $j = 1,\dots,j_i$, $v^{ij}$ is an orthonormal basis of the eigenspace of $\lambda_{k_i}$. Define a $(j_1+j_2) \times n$ matrix $V$ with rows $v^{ij}$, $i = 1,2$, $j = 1,\dots,j_i$. Let
\[ C = \begin{pmatrix} A - \lambda_{k_1}\,\mathrm{id} & V^T \\ V & 0 \end{pmatrix}. \tag{36} \]
Then $\det(C) \ne 0$.

Proof. Let $E_j$ for $j = 1,\dots,n+j_1+j_2$ denote the rows of the matrix $C$. Assume on the contrary that
\[ \sum_{j=1}^{n+j_1+j_2} a_jE_j = 0 \tag{37} \]
and $\sum_{j=1}^{n+j_1+j_2} |a_j| > 0$. By (36),
\[ \langle v^{ij}, (a_1,\dots,a_n)\rangle_n = 0 \tag{38} \]
for $i = 1,2$ and $j = 1,\dots,j_i$. If $j_1 + j_2 = n$, then by the orthonormality conditions $a_j = 0$ for $j = 1,\dots,n$. Again by the orthonormality conditions and (36), $a_j = 0$ for $j = n+1,\dots,j_1+j_2+n$, a contradiction. If $j_1 + j_2 < n$,
\[ (a_1,\dots,a_n) = \sum_{j=1}^k b_jw_j, \]
where $w_1,\dots,w_k$ are some orthonormal eigenvectors of $A$ corresponding to eigenvalues $\gamma_1,\dots,\gamma_k$ of $A$ different from $\lambda_{k_1}$ and $\lambda_{k_2}$. Since $A$ is symmetric, by (37) and (38),
\[ \sum_{j=1}^k b_j(\gamma_j - \lambda_{k_1})w_j = \sum_{j=n+1}^{j_1+n} a_jv^{1j} + \sum_{j=n+j_1+1}^{j_1+j_2+n} a_jv^{2j}. \]
Since $\gamma_j \ne \lambda_{k_i}$ for $j = 1,\dots,k$ and $i = 1,2$, we have
\[ \sum_{j=n+1}^{j_1+n} a_jv^{1j} + \sum_{j=n+j_1+1}^{j_1+j_2+n} a_jv^{2j} = 0 \]
and consequently by the orthonormality conditions $a_j = 0$ for $j = n+1,\dots,n+j_1+j_2$ and $b_j = 0$ for $j = 1,\dots,k$. In particular, this shows that $(a_1,\dots,a_n) = 0$, a contradiction. □

Lemma 2.15. Assume that $t \in \mathbb{R}$ and let $B$, $E$ be fixed $m \times m$ matrices and let $A$ be a fixed $n \times n$ matrix. Define
\[ C(t) = \begin{pmatrix} A & D \\ D_1 & B + tE \end{pmatrix}, \tag{39} \]
where $D$ is a fixed $n \times m$ matrix and $D_1$ is a fixed $m \times n$ matrix. If
\[ \det(C(t)) = \sum_{j=0}^m a_jt^j, \]
then $a_m = \det(A)\det(E)$.

Proof. For $k \in \mathbb{N}$ let $\Pi_k$ denote the set of all permutations of $k$ elements. By the definition of the determinant,
\[ \det C(t) = \sum_{\sigma\in\Pi_{n+m}} \operatorname{sgn}(\sigma)\prod_{j=1}^{n+m} c_{j,\sigma(j)}(t) = \sum_{\sigma\in W_m} \operatorname{sgn}(\sigma)\prod_{j=1}^{n+m} c_{j,\sigma(j)}(t) + \sum_{\sigma\notin W_m} \operatorname{sgn}(\sigma)\prod_{j=1}^{n+m} c_{j,\sigma(j)}(t), \]
where
\[ W_m = \bigl\{\sigma \in \Pi_{n+m} : \sigma(\{1,\dots,n\}) \subset \{1,\dots,n\}\bigr\}. \]
Notice that to calculate the coefficient $a_m$ it is sufficient to consider only the sum over $W_m$. But
\[ \sum_{\sigma\in W_m} \operatorname{sgn}(\sigma)\prod_{j=1}^{n+m} c_{j,\sigma(j)}(t) = \Bigl(\sum_{\sigma\in\Pi_n} \operatorname{sgn}(\sigma)\prod_{j=1}^n a_{j,\sigma(j)}\Bigr)\Bigl(\sum_{\sigma\in\Pi_m} \operatorname{sgn}(\sigma)\prod_{j=1}^m \bigl(b_{j,\sigma(j)} + te_{j,\sigma(j)}\bigr)\Bigr) = \det(A)\det(B+tE). \]
Hence, again by the definition of the determinant, $a_m = \det(A)\det(E)$, as required. □
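Lemma 2.15 is easy to illustrate on a small instance. The following is a minimal sketch with $n = m = 2$ and arbitrarily chosen integer blocks (the matrices and the helper names `det`, `block_matrix`, `leading_coefficient` are assumptions made for the example, not data from the paper). Since $\det C(t)$ is then a polynomial of degree at most 2 in $t$, its leading coefficient is recovered from a second finite difference and compared with $\det(A)\det(E)$.

```python
def det(m):
    """Exact determinant by Laplace expansion along the first row."""
    if len(m) == 1:
        return m[0][0]
    return sum((-1) ** j * m[0][j] * det([row[:j] + row[j + 1:] for row in m[1:]])
               for j in range(len(m)))

def block_matrix(A, D, D1, B, E, t):
    """Assemble C(t) = [[A, D], [D1, B + tE]] as a list of rows."""
    top = [ra + rd for ra, rd in zip(A, D)]
    bottom = [rd1 + [b + t * e for b, e in zip(rb, r_e)]
              for rd1, rb, r_e in zip(D1, B, E)]
    return top + bottom

def leading_coefficient(A, D, D1, B, E):
    """Coefficient of t^2 in det C(t) when m = 2, via a second difference."""
    p0, p1, p2 = (det(block_matrix(A, D, D1, B, E, t)) for t in (0, 1, 2))
    return (p2 - 2 * p1 + p0) // 2

# Arbitrary illustrative blocks (n = m = 2).
A = [[2, 1], [1, 3]]
D = [[1, -1], [2, 0]]
D1 = [[3, 1], [0, 5]]
B = [[0, 1], [4, 2]]
E = [[1, 2], [0, -1]]

print(leading_coefficient(A, D, D1, B, E), det(A) * det(E))
```

The second-difference trick is specific to $m = 2$; for general $m$ one would take an $m$-th finite difference divided by $m!$.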
3. Determination of $\lambda_3^5$

In this section we will work with the functions $f_{u_1}$ and $f_{u_1,A}$ defined by (1) and (4). The main idea of our proof is to show that the function $f_{u_1}$ attains its maximum under conditions (2) and (3) at the $u_2,u_3,u_4,u_5,x^1,x^2,x^3$ given in Theorem 2.4. This shows that $\lambda_3^5$ has been calculated in Theorem 2.4. The main difficulty is to demonstrate that if $(u_2,u_3,u_4,u_5,x^1,x^2,x^3)$ maximize $f_{u_1}$ under conditions (2) and (3), then $u_i = \sqrt{1-u_1^2}/2$ for $i = 2,3,4,5$. Here Lemma 2.13, Lemma 2.14 and Lemma 2.15 are applied. The next two theorems show what the candidates for maximizing the function $f_{u_1,A}$ look like.
Theorem 3.1. Let $A$ be defined by (28). Fix $t \in \mathbb{R}$ and $u_1 \in [0,1)$. Let us consider a function $h_{u_1,A,t} : \mathbb{R}^4 \times (\mathbb{R}^5)^3 \times \mathbb{R}^6 \times \mathbb{R} \to \mathbb{R}$ defined by
\[ \begin{aligned} h_{u_1,A,t}\bigl((v_2,v_3,v_4,v_5), z^1,z^2,z^3, d_1,d_2,d_3,d_{12},d_{13},d_{23},d_7\bigr) ={}& \sum_{i,j=1}^5 a_{ij}v_iv_j\langle z_i,z_j\rangle_3 + t\Bigl(\sum_{i=2}^5 v_i + z^2_4 - z^2_5 + z^3_2 - z^3_3\Bigr) \\ &- \sum_{j=1}^3 d_j\bigl(\langle z^j,z^j\rangle_5 - 1\bigr) - \sum_{i,j=1,\,i<j}^3 d_{ij}\langle z^i,z^j\rangle_5 - d_7\bigl(\langle (u_1,v),(u_1,v)\rangle_5 - 1\bigr), \end{aligned} \tag{40} \]
where $v = (v_2,v_3,v_4,v_5)$. Define, for $i = 2,\dots,5$, $u_i = \sqrt{1-u_1^2}/2 = c$,
\[ w = w(u_1) = \frac{4u_1c}{\sqrt{(u_1^2-2c^2)^2 + 16c^2u_1^2} + 2c^2 - u_1^2}, \]
\[ x^1_1 = \frac{w}{\sqrt{1+w^2}}, \qquad x^1_i = \frac{1}{2\sqrt{1+w^2}},\ i = 2,3, \qquad x^1_i = \frac{-1}{2\sqrt{1+w^2}},\ i = 4,5, \]
\[ x^3 = (0, 1/\sqrt2, -1/\sqrt2, 0, 0), \qquad x^2 = (0, 0, 0, 1/\sqrt2, -1/\sqrt2), \]
\[ d_1 = 1/2 - c^2 + \sqrt{1+4c^2-28c^4}/2, \qquad d_2 = d_3 = 2c^2 + (1/\sqrt2)t, \]
$d_{ij} = 0$ for $i,j = 1,2,3$, $i < j$, and
\[ d_7 = 1 + t/(2c) + 2(x^1_2)^2 + x^1_1x^1_2u_1/c. \]
Then the above defined $x^1,x^2,x^3,u_2,u_3,u_4,u_5,d_1,d_2,d_3,d_{12},d_{13},d_{23},d_7$ satisfy the system of equations
\[ \frac{\partial h_{u_1,A,t}}{\partial w_j}\bigl(x^1,x^2,x^3,u_2,u_3,u_4,u_5,d_1,d_2,d_3,d_{12},d_{13},d_{23},d_7\bigr) = 0 \]
for $j = 1,\dots,26$, where
\[ w_j \in \{v_2,v_3,v_4,v_5,\ z^i_k,\ k = 1,\dots,5,\ i = 1,2,3\} \quad\text{and}\quad w_j \in \{d_{ik},\ i,k \in \{1,2,3\},\ i < k,\ d_i,\ i = 1,2,3,7\}. \]
(We do not differentiate with respect to $u_1$.)

Proof. Notice that the equations
\[ \frac{\partial h_{u_1,A,t}}{\partial w_j}\bigl(x^1,x^2,x^3,u_2,u_3,u_4,u_5,d_1,d_2,d_3,d_{12},d_{13},d_{23},d_7\bigr) = 0 \]
for $w_j \in \{z^i_k,\ k = 1,\dots,5,\ i = 1,2,3\}$ follow from the fact that $x^i$, $i = 1,2,3$, are the orthonormal eigenvectors of the matrix $B$ defined by (29) corresponding to the eigenvalues $d_i$, $i = 1,2,3$, which has been established in the proof of Theorem 2.4. Also the equations
\[ \frac{\partial h_{u_1,A,t}}{\partial w_j}\bigl(x^1,x^2,x^3,u_2,u_3,u_4,u_5,d_1,d_2,d_3,d_{12},d_{13},d_{23},d_7\bigr) = 0, \]
where $w_j \in \{d_1,d_2,d_3,\ d_{ik},\ i,k \in \{1,2,3\},\ i < k,\ d_7\}$, follow immediately from the fact that $\langle x^i,x^j\rangle_5 = \delta_{ij}$ for $i,j = 1,2,3$, $i \le j$, and $\langle (u_1,u),(u_1,u)\rangle_5 = 1$, where $u = (u_2,u_3,u_4,u_5)$. To end the proof, we show that
\[ \frac{\partial h_{u_1,A,t}}{\partial w_j}\bigl(x^1,x^2,x^3,u_2,u_3,u_4,u_5,d_1,d_2,d_3,d_{12},d_{13},d_{23},d_7\bigr) = 0 \]
for $w_j \in \{v_2,v_3,v_4,v_5\}$. Notice that for $i = 2,3,4,5$
\[ \frac{\partial h_{u_1,A,t}}{\partial w_i}\bigl(x^1,x^2,x^3,u_2,u_3,u_4,u_5,d_1,d_2,d_3,d_{12},d_{13},d_{23},d_7\bigr) = 2\sum_{j=1}^5 u_ja_{ij}\langle x_i,x_j\rangle_3 + t - 2u_id_7. \]
Since $u_1 < 1$, $u_i = \sqrt{1-u_1^2}/2 = c > 0$ for $i = 2,3,4,5$. Hence
\[ \frac{\partial h_{u_1,A,t}}{\partial w_i}\bigl(x^1,x^2,x^3,u_2,u_3,u_4,u_5,d_1,d_2,d_3,d_{12},d_{13},d_{23},d_7\bigr) = 0 \]
if and only if
\[ \sum_{j=1}^5 u_ja_{ij}\langle x_i,x_j\rangle_3/c + t/(2c) = d_7. \]
Notice that for $i = 2,3,4,5$,
\[ \begin{aligned} \sum_{j=1}^5 a_{ij}u_j\langle x_i,x_j\rangle_3/c &= x^1_i\Bigl(\sum_{j=1}^5 a_{ij}u_jx^1_j\Bigr)/c + x^2_i\Bigl(\sum_{j=1}^5 a_{ij}u_jx^2_j\Bigr)/c + x^3_i\Bigl(\sum_{j=1}^5 a_{ij}u_jx^3_j\Bigr)/c \\ &= u_1a_{i1}x^1_1x^1_i/c + 2(x^1_i)^2 + (1/\sqrt2)\bigl((1/\sqrt2)c + (-1)(-1/\sqrt2)c\bigr)/c \\ &= 1 + u_1a_{i1}x^1_1x^1_i/c + 2(x^1_i)^2. \end{aligned} \]
Hence for $i = 2,3,4,5$,
\[ d_7 = 1 + t/(2c) + 2(x^1_i)^2 + x^1_1a_{i1}x^1_iu_1/c. \]
Since $x^1_2 = x^1_3 = -x^1_4 = -x^1_5$, $1 = a_{21} = a_{31} = -a_{41} = -a_{51}$, and $u_i = c$ for $i = 2,3,4,5$,
\[ d_7 = 1 + t/(2c) + 2(x^1_2)^2 + x^1_1x^1_2u_1/c, \]
as required. □
Reasoning as in Theorem 3.1, we can show

Theorem 3.2. Let $A$ be defined by
\[ A = \begin{pmatrix} 1 & 1 & 1 & 1 & 1 \\ 1 & 1 & -1 & -1 & -1 \\ 1 & -1 & 1 & -1 & -1 \\ 1 & -1 & -1 & 1 & -1 \\ 1 & -1 & -1 & -1 & 1 \end{pmatrix}. \tag{41} \]
Fix $t \in \mathbb{R}$ and $u_1 \in [0,1)$. Let us consider a function $h_{u_1,A,t} : \mathbb{R}^4 \times (\mathbb{R}^5)^3 \times \mathbb{R}^6 \times \mathbb{R} \to \mathbb{R}$ given by (40) with $A$ defined as above. Define, for $i = 2,\dots,5$, $u_i = \sqrt{1-u_1^2}/2 = c$,
\[ x^1 = (0, 1/2, 1/2, -1/2, -1/2), \qquad x^2 = (0, 0, 0, 1/\sqrt2, -1/\sqrt2), \qquad x^3 = (0, 1/\sqrt2, -1/\sqrt2, 0, 0), \]
\[ d_1 = 2c^2, \qquad d_2 = d_3 = 2c^2 + (1/\sqrt2)t, \]
$d_{ij} = 0$ for $i,j = 1,2,3$, $i < j$, and $d_7 = 3c + t/2c$. Then the above defined $x^1,x^2,x^3,u_2,u_3,u_4,u_5,d_1,d_2,d_3,d_{12},d_{13},d_{23},d_7$ satisfy the system of equations
\[ \frac{\partial h_{u_1,A,t}}{\partial w_j}\bigl(x^1,x^2,x^3,u_2,u_3,u_4,u_5,d_1,d_2,d_3,d_{12},d_{13},d_{23},d_7\bigr) = 0 \]
for $j = 1,\dots,26$, where
\[ w_j \in \{v_2,v_3,v_4,v_5,\ z^i_k,\ k = 1,\dots,5,\ i = 1,2,3\} \]
and wj ∈ dik , i, j ∈ {1, 2, 3}, i < j, d1 , d2 , d3 , d7 . (We do not differentiate with respect to u1 .) Lemma 3.1. Let A be defined by (28). For a fixed u1 ∈ (0, 1) and t > 0 let gu1 ,A,t : R4 × (R5 )3 → R defined by 5 5 1 2 3 2 2 3 3 vi vj aij yi , yj 3 + t v i + y4 − y5 + y2 − y3 . gu1 ,A,t (v2 , . . . , v5 ), y , y , y = i,j =1
i=2
Let Mu1 ,A,t = max gu1 ,A,t under constraints
yi , yj
5
= δij ,
1 i j 3;
and 5
vj2 = 1 − u21 .
j =2
Assume that u1 ∈ (0, 1) is so chosen that Mu1 ,A,0 = fu1 ,A (u2 , u3 , u4 , u5 ), x 1 , x 2 , x 3 , d1 , d2 , d3 , d12 , d13 , d23 , d7 , where u2 , u3 , u4 , u5 , x 1 , x 2 , x 3 , d1 , d2 , d3 , d12 , d13 , d23 , d7 are as in Theorem 3.1 ( for c = 1 − u21 /2). Set Du 1 =
v2 , v3 , v4 , v5 , y 1 , y 2 , y 3 : y43 = y53 = y22 = 0, y21 0 .
(42)
Then Xu1 = u2 , u3 , u4 , u5 , x 1 , x 2 , x 3
(43)
is the only point maximizing gu1 ,A,t satisfying (2) and (3) belonging to Du1 . Proof. Let Yu1 = v2 , v3 , v4 , v5 , y 1 , y 2 , y 3 ∈ Du1 maximize gu1 ,A,t and satisfy (2) and (3). Since t > 0, and the maximum of fu1 ,A is attained at Xu1 , we have vi = ui =
1 − u21 /2 for i = 2, 3, 4, 5, y 2 = x 2 and x 3 = y 3 . Since x 1 , x 2 , x 3 are
the eigenvectors of A, by Lemma 2.4, span[y i : i = 1, 2, 3] = span[x i : i = 1, 2, 3]. Note that x1, xi 5 = y1, xi 5 = 0
for i = 2, 3. Since span[y i : i = 1, 2, 3] = span[x i : i = 1, 2, 3], y 1 = dx 1 . Since y 1 , y 1 5 = 1, y21 0 and x21 > 0, x 1 = y 1 , as required. 2 Theorem 3.3. Let A be defined by (28). For a fixed u1 ∈ [0, 1) and t ∈ R let gu1 ,A,t and Mu1 ,A,t be as in Lemma 3.1. Assume that u1 ∈ [0, 1) is so chosen that Mu1 ,A,0 = gu1 ,A,t u2 , u3 , u4 , u5 , x 1 , x 2 , x 3 where u2 , u3 , u4 , u5 , x 1 , x 2 , x 3 are as in Theorem 3.1 ( for c = 1 − u21 /2). Let the function hu1 ,A,t be defined by (40). Assume furthermore that the 23 × 23 matrix Du,A,t defined by Du,A,t =
∂hu1 ,A,t 1 2 3 x , x , x , u2 , u3 , u4 , u5 , d1 , d2 , d3 , d12 , d13 , d23 , d7 , ∂wi , ∂wj
(44)
where wi , wj ∈ v2 , v3 , v4 , v5 , yk1 , k = 1, . . . , 5, y12 , y32 , y42 , y52 , y13 , y23 , y33 , di , i = 1, 2, 3, 7, dik , 1 i < k 3 (we do not differentiate with respect to u1 , y43 , y53 , y22 ) is such that Det(Du,A,t ) =
k
aj (u)t j
j =o
and aj (u1 ) = 0 for some j ∈ {1, . . . , k}. (Here (d1 , d2 , d3 , d12 , d13 , d23 , d7 ) are such as in Theorem 3.1 for c = 1 − u21 /2 and t ∈ R.) Then there exists an open interval U ⊂ [0, 1) (U = [0, w) if u1 = 0) such that u1 ∈ U and for any u ∈ U the function fu,A attains its global maximum under constraints (2) and (3) at Xu = u2 , u3 , u4 , u5 , x 1 , x 2 , x 3 , √ where ui = cu = 1 − u2 /2 for i = 2, 3, 4, 5 and x 1 , x 2 , x 3 are defined in Theorem 3.1 (with c = cu ). The same result holds true if A will be defined by (41). (In this case 1 2 3 x , x , x , u2 , u3 , u4 , u5 , d1 , d2 , d3 , d12 , d13 , d23 , d7 are such as in Theorem 3.2.) Proof. Fix u1 ∈ [0, 1) satisfying our assumptions and let c1 = {0, . . . , k}: aj (u1 ) = 0}. Set for (u, t) ∈ [0, 1) × R, h(t, u) =
k j =jo
aj (u)t j −jo .
1 − u21 /2. Let jo = min{j ∈
Since ajo (u1 ) = 0, and aj are continuous there exist an open interval U ⊂ [0, 1) and δ > 0 such that u1 ∈ U and h(t, u) = 0 for u ∈ U and |t| < δ. Fix to ∈ (0, δ). Set Uto = {u ∈ U : Mu,A,to is attained at Xu }. Note that u1 ∈ Uto . Now we show that Uto is an open set. Let uo ∈ Uto . Assume on the contrary that there exist {un } ∈ U \ Uto such that un → uo . Let for any u ∈ U , Zu,to = Zu = v2u , v3u , v4u , v5u , x 1u , x 2u , x 3u be a point maximizing gu,A,to under constraints (2) and (3). Since the function 5 1 2 3 2 2 3 3 vi + z4 − z5 + z2 − z3 (fu,A − gto ,u,A ) (v2 , v3 , v4 , v5 ), z , z , z = to i=2
is independent of z1 and by Lemma 2.11, without loss of generality, the function gu,A,to can be considered as a function of 16 variables 1 z , (v2 , v3 , v4 , v5 ), z12 , z32 , z42 , z52 , z13 , z23 , z33 from R5 × R4 × R4 × R3 . (We can put z22 = z43 = z53 = 0.) Consequently, we can assume that Zu ∈ Du (see (42)). By (2) and (3), passing to a subsequence, if necessary, we obtain that Zun → Z. By definition of Duo , Z ∈ Duo . Also by the continuity of the function (v, Y ) →
5 i,j =1
vi vj aij yi , yj 3 + to
5
vi + y42
− y52
+ y23
− y33
,
i=2
guo ,A,to (Z) = Muo ,A,to . By Lemma 2.12 and Lemma 3.1 Xuo is the only point in Duo which maximizes gu,A,to and Z ∈ Duo . Hence Z = Xuo . Moreover, since Xuo ∈ int(Duo ), by the Lagrange Multiplier Theorem, there exists n n n Mun = Mun (to ) = d1n , d2n , d3n , d12 , d13 , d23 , d7n ∈ R7 such that ∂hu,A,to (Zun , Mun ) = 0, ∂wi
(45)
for wi ∈ X ∪ DD. Here hu,A,t is defined by (40) and DD = {di , i = 1, 2, 3, 7, dij , 1 i < j 3}. Also by (2), (3), (7), (8) (see the proof of Lemma 2.4) and (45) Mn → Luo = Luo (to ) = (d1 , d2 , d3 , d12 , d13 , d23 , d7 ), where Luo is defined in Theorem 3.1 for c = 1 − u2o /2 and t = to . Now we apply Lemma 2.13. Let us consider a function G : U × R12 × R4 × R7 → R23 defined by G(u, x, v, Q) =
∂hu,A,to ∂hto ,u (u, x, v, Q), . . . , (u, x, v, Q) /(to )jo /23 ∂w1 ∂w23
for wi ∈ X ∪ DD. Notice that by (45) G(un , Zun , Mun ) = 0. Also G(un , Xun , Lun (to )) = 0, where (Xun , Lun (to )) are defined for un in Theorem 3.1. Moreover, (un , Zun , Mun ) → (uo , Xuo , Luo ) and (un , Xun , Lun ) → (uo , Xuo , Luo ). Notice that k det(Duo ,A,to ) ∂G j −j (uo , Xuo , Luo ) = = aj (uo )to o = h(to , uo ) = 0, Det j /23 ∂wj (to o )23 j =jo by definition of jo and to . By Lemma 2.13 applied to the function G, Zun = Xun and Mun = Lun for n no . Hence un ∈ U1 for n no ; a contradiction. This shows that Uto is an open set. It is clear that Uto is closed. Since u1 ∈ Uto and U is connected, Uto = U . Consequently for any n ∈ N, n no and u ∈ U√, the functions gu,A,1/n achieve their maximum at u2 , u3 , u4 , u5 , x 1 , x 2 , x 3 , where ui = cu = 1 − u2 /2 for i = 2, 3, 4, 5 and x 1 , x 2 , x 3 , are defined in Theorem 3.1 (with c = cu ). Since gu,A,1/n tends uniformly to gu,A,0 = fu,A , on the set defined by (2) and (3), with u ∈ U fixed, fu,A attains its maximum at u2 , u3 , u4 , u5 , x 1 , x 2 , x 3 for any u ∈ U . By Theorem 3.2, reasoning exactly in the same way as above we can deduce our conclusion for the function fu,A determined by A given by (41). The proof is complete. 2 Now we show that the assumptions of Theorem 3.3 concerning Du,A,t are satisfied. This is the most important technical result which permits us to determine the constant λ53 .
Theorem 3.4. Let $A$ be defined by (28) and let $D_{u,A,t}$ be given by (44). Then for any $u \in [0,1)$ and $t \in \mathbb{R}$,
\[ \operatorname{Det}(D_{u,A,t}) = \sum_{j=0}^7 a_j(u)t^j, \]
where the functions $a_j$ are continuous for $j = 0,\dots,7$ and $a_7(u) \ne 0$ for any $u \in [0,1)$.

Proof. Set
\[ X = (x_1, b, b, -b, -b, 0, 0, 1/\sqrt2, -1/\sqrt2, 0, 1/\sqrt2, -1/\sqrt2), \]
$B = (b_1, d, d, 0, 0, 0, b_7)$ and $v = (c,c,c,c)$. Assume that we differentiate $h_{u,A,t}$ in the following order:
\[ (w_1,\dots,w_5) = (x^1_1,\dots,x^1_5), \qquad (w_6,\dots,w_{11}) = (b_1,b_2,b_3,b_{12},b_{13},b_{23}), \]
\[ (w_{12},\dots,w_{18}) = (x^2_1,x^2_3,x^2_4,x^2_5,x^3_1,x^3_2,x^3_3), \qquad (w_{19},\dots,w_{23}) = (u_2,u_3,u_4,u_5,b_7). \]
(Recall that we do not differentiate with respect to $u_1$, $x^2_2$, $x^3_4$ and $x^3_5$.) Notice that by elementary but very tedious calculations (which we verified by a symbolic Mathematica program) we get that the $23 \times 23$ symmetric matrix $C = D_{u,A,t}(X,B,v)$ is given by
\[ C = \begin{pmatrix} D_1 & B_1 \\ (B_1)^T & D_2 \end{pmatrix}. \tag{46} \]
Here ⎛ 2(u2 − b1 ) ⎜ ⎜ ⎜ ⎜ ⎜ ⎜ D1 = ⎜ ⎜ ⎜ ⎜ ⎜ ⎝
2cu 2cu −2cu −2cu −2x1 0 0 0 0 0
2cu 2(c2 − b1 ) −2c2 −2c2 −2c2 −2b 0 0 0√ −1/ 2 0
2cu −2c2 2(c2 − b1 ) −2c2 −2c2 −2b 0 0 0√ 1/ 2 0
−2cu −2c2 −2c2 2(c2 − b1 ) −2c2 2b 0 0√ −1/ 2 0 0
−2cu −2c2 −2c2 −2c2 2(c2 − b1 ) 2b 0 0 √ 1 2 0 0
−2x1 −2b −2b 2b 2b 0 0 0 0 0 0
0 0 0 0 0 0 0 0 0 0 0
0 0 0 0 0 0 0 0 0 0 0
0 0 0√ −1/√ 2 1/ 2 0 0 0 0 0 0
0√ −1/√ 2 1/ 2 0 0 0 0 0 0 0 0
0⎞ 0 ⎟ 0⎟ 0⎟ ⎟ 0⎟ ⎟ 0⎟; 0⎟ ⎟ 0⎟ ⎟ 0⎠ 0 0
(47) D2 = (D12 , D22 ), where
⎛ 2(u2 − d) ⎜ 2cu ⎜ −2cu ⎜ ⎜ −2cu ⎜ ⎜ 0 ⎜ 0 ⎜ D12 = ⎜ 0 ⎜ ⎜ 0 ⎜ ⎜ ⎜ √0 ⎜ − 2u ⎝ √ 2u 0
2cu 2(c2 − d) −2c2 −2c2 0 0 0 0 0 √ −√ 2c 2c 0
−2cu −2c2 2(c2 − d) −2c2 0 0 0 0 √0 3√ 2c 2c 0
−2cu 0 −2c2 0 −2c2 0 2(c2 − d) 0 0 2(u2 − d) 0 2cu 0 2cu √ 0 √2u 0 − 2u √ − √2c 0 −3 2c 0 0 0
⎞ 0 0 ⎟ ⎟ 0 ⎟ ⎟ 0 ⎟ 2cu ⎟ ⎟ −2c2 ⎟ ⎟ 2(c2√− d) ⎟ − √2c ⎟ ⎟ −3 2c ⎟ ⎟ ⎟ 0 ⎠ 0 0
0 0 0 0 2cu 2(c2 − d) 2 −2c √ 3√ 2c 2c 0 0 0
(48) and √ √ 0 0 − √2u √2u 0 0 −√ 2c ⎜ √2c ⎜ 2c 0 0 3 √2c ⎜ √ ⎜ 2c −3 2c 0 0 − ⎜ √ √ ⎜ 2u − 2u 0 0 ⎜ √ √ ⎜ 2c 0 0 3 √2c ⎜ √ D22 = ⎜ −3 2c 0 0 − 2c ⎜ ⎜ 2 1 − 2b2 2b2 2b2 ⎜ 2b − 2b7 + 1 ⎜ 2 2 2 2b − 2b7 + 1 2b 2b2 ⎜ 1 − 2b ⎜ 2 2 2 2b 2b − 2b7 + 1 1 − 2b2 2b ⎜ ⎝ 2 2 2 2 2b 1 − 2b 2b − 2b7 + 1 2b −2c −2c −2c −2c ⎛ 0 0 0 0 0 0 0 2bu 2bu 2bu ⎛
⎜ 0 ⎜ 0 ⎜ 0 ⎜ ⎜ 0 ⎜ B1 = ⎜ 0 ⎜ 0 ⎜ ⎜ 0 ⎜ −x1 ⎝ 0 0
0 0 0 0 0 0 0 −b 0√ 1/ 2
0 0 0 0 0 √ − 2 0 b 0 0
0 0 0 0 √0 2 0 b 0 0
0 0 0 0 0 0 0 0 −x1 0
0 0 0 0 0 0 √ − 2 0 −b 0
0 0 0 0 0 √0 2 0 −b 0
6bc + 2ux1 −2bc −2bc −2bc 0 0 0 0 0 0
−2bc 6bc + 2ux1 −2bc −2bc 0 0 0 0 0 0
2bc 2bc −6bc − 2ux1 2bc 0 0 0 0 0 0
⎞ 0 0 ⎟ ⎟ 0 ⎟ ⎟ 0 ⎟ ⎟ 0 ⎟ ⎟ 0 ⎟ ⎟ ; (49) 0 ⎟ ⎟ −2c ⎟ ⎟ −2c ⎟ ⎟ −2c ⎟ ⎠ −2c 0 ⎞ 2bu 0
2bc 2bc 2bc −6bc − 2ux1 0 0 0 0 0 0
0⎟ 0⎟ ⎟ 0⎟ 0⎟ ⎟ 0⎟. 0⎟ ⎟ 0⎟ 0⎟ ⎠ 0 0
(50)

Notice that in the 11th row of $C$ the only non-zero element is $c_{11,13} = c_{13,11} = 1/\sqrt2$ and in the 23rd row of $C$ the only elements which could be different from 0 are $c_{23,19} = c_{23,20} = c_{23,21} = c_{23,22} = -2c$. Also the only non-zero elements in the 7th row are $c_{7,14} = -\sqrt2$ and $c_{7,15} = \sqrt2$. Analogously, the only non-zero elements in the 8th row are $c_{8,17} = -\sqrt2$ and $c_{8,18} = \sqrt2$. Consequently, applying the symmetry of $C$, subtracting the 19th row from the 20th, 21st and 22nd rows and the 19th column from the 20th, 21st and 22nd columns, adding the 15th row to the 14th row and the 15th column to the 14th column, and adding the 18th row to the 17th row and the 18th column to the 17th column, we get that
\[ \det(C) = 8c^2\det(A), \]
where A is a 15 × 15 symmetric matrix defined by A=
A1 FT
F A2
(51)
.
Here ⎛ 2(u2 − b ) 2cu 2cu −2cu −2cu −2x1 0 0√ ⎞ 1 2−b ) 2 2 2 −2c −2c −2c −2b 0 −1/ 2cu 2(c 1 ⎜ √ 2⎟ ⎜ 2cu 2(c2 − b1 ) −2c2 −2c2 −2b 0√ 1/ 2 ⎟ −2c2 ⎟ ⎜ ⎜ −2cu −2c2 2(c2 − b1 ) −2c2 2b −1/√ 2 0 ⎟ −2c2 A1 = ⎜ ⎟; 2 2 2 2 −2c −2c −2c 2(c − b1 ) 2b 1/ 2 0 ⎟ ⎜ −2cu ⎜ −2x1 −2b −2b 2b√ 2b 0 0 0 ⎟ √ ⎠ ⎝ 0 0
0√ −1/ 2
0√ 1/ 2
−1/ 2 0
1 2 0
0 0
0 0
0 0
(52) A2 = (A21 , A22 ), where ⎛
2(u2 − d) −4cu −4d ⎜ −4cu ⎜ 0 0 ⎜ ⎜ A21 = ⎜ 0 0 ⎜ 0 0 ⎜ √ √ ⎝ 2 √2c −√ 2u 2u −2 2c
⎞ 0 0 0 0 ⎟ ⎟ 4cu ⎟ 2(u2 − d) ⎟ 4cu −4d √ ⎟ √ ⎟ −2√ 2u −4√2c ⎟ ⎠ −√2u −2√2c − 2u −2 2c
(53)
and ⎛
0 0 ⎜ √ ⎜ ⎜ −2 √2u ⎜ A22 = ⎜ −4 2c ⎜ 2 ⎜ 8b − 4b7 ⎝ 2 4b − 2b7 4b2 − 2b7 ⎛
0 ⎜ 0 ⎜ ⎜ 0 ⎜ ⎜ 0 F =⎜ ⎜ 0 ⎜ ⎜ 0 ⎝ −x1 0
0 0 0 0 0 0 2b 0
0 0 0 0 0 0 0 −x1
0 0 0 0 0 0 0 −2b
√ −√ 2u 2 √2c − √2u −2 2c 4b2 − 2b7 2 − 4b7 2 − 4b2 − 2b7 0 −8bc − 2ux1 8bc + 2ux1 0 0 0 0 0
√ ⎞ 2u √ −2√ 2c ⎟ ⎟ − √2u ⎟ ⎟ −2 2c ⎟ ; ⎟ 4b2 − 2b7 ⎟ ⎠ 2 − 4b2 − 2b7 2b2 − 4b7 0 −4bc − 2ux1 4bc −4bc − 2ux1 4bc 0 0 0
⎞ 0 −4bc − 2ux1 ⎟ ⎟ 4bc ⎟ ⎟ 4bc ⎟ ⎟. −4bc − 2ux1 ⎟ ⎟ ⎟ ⎠ 0 0
Now we calculate the coefficient a7 (u). Notice that Det C(t) = Det Du,A,t (X, B, v) = 8c2 Det A(t) ,
(54)
(55)
where $C(t)$ and $A(t)$ denote the above written matrices $C$ and $A$ with $b_7$ replaced by $b_7 + t/(2c)$, $d_2 = d_3 = d$ replaced by $d + (1/\sqrt2)t$ and $x_1 = w/\sqrt{1+w^2}$, $b = 1/(2\sqrt{1+w^2})$, where $w$ is defined in Theorem 3.1. Now we apply Lemma 2.14 and Lemma 2.15. By Lemma 2.15, $a_7(u) = \det(A_1)\det(E)$, where $E$ is a $7 \times 7$ matrix given by
\[ E = \begin{pmatrix} -\sqrt2 & 0 & 0 & 0 & 0 & 0 & 0 \\ 0 & -2\sqrt2 & 0 & 0 & 0 & 0 & 0 \\ 0 & 0 & -\sqrt2 & 0 & 0 & 0 & 0 \\ 0 & 0 & 0 & -2\sqrt2 & 0 & 0 & 0 \\ 0 & 0 & 0 & 0 & -2/c & -1/c & -1/c \\ 0 & 0 & 0 & 0 & -1/c & -2/c & -1/c \\ 0 & 0 & 0 & 0 & -1/c & -1/c & -2/c \end{pmatrix}. \tag{56} \]
Since $c = \sqrt{1-u_1^2}/2 > 0$ for $u_1 \in [0,1)$, $E$ is well defined and $\det(E) \ne 0$. Also by Lemma 2.14 and Theorem 2.4, $\det(A_1) \ne 0$. Hence $a_7(u) \ne 0$ for any $u \in [0,1)$ as required. □

Now we will prove one of the main results of this section.

Theorem 3.5. Let $f_{u_1}$ be defined by (1), i.e.
\[ f_{u_1}\bigl(u_2,u_3,u_4,u_5,x^1,x^2,x^3\bigr) = \sum_{i,j=1}^5 u_iu_j\langle x_i,x_j\rangle_3. \]
Let $M_u = \max(f_u)$ under constraints (2) and (3). Then for any $u \in [0,1]$
\[ M_u = \frac{1 + 6c^2 + \sqrt{(6c^2-1)^2 + 16(1-4c^2)c^2}}{2}, \]
where $c = c(u) = \sqrt{1-u^2}/2$.

Proof. Define
\[ U = \Bigl\{u \in [0,1) : M_u = \frac{1 + 6c^2 + \sqrt{(6c^2-1)^2 + 16(1-4c^2)c^2}}{2}\Bigr\}. \]
By Lemma 2.6 and Lemma 2.7, $0 \in U$, since $M_o = 3/2$. Now we show that $U$ is an open set. Fix $u \in U$. First we consider the case $u = 0$. We apply Theorem 3.3 and Theorem 3.4. Let $(X_v, L_v)$, where
\[ X_v = \bigl(x^1,x^2,x^3,c(v),c(v),c(v),c(v)\bigr) \]
($c(0) = 1/2$) and
\[ L_v(t) = (d_1,d_2,d_3,d_{12},d_{13},d_{23},d_7) \]
are given by Theorem 3.1 for fixed $v \in [0, 1)$ and $t \in \mathbb{R}$. Assume that $u_n \to 0$ and $u_n \notin U$. Let $(X_{u_n}, L_{u_n}(t))$ be such as in Theorem 3.3. Passing to a subsequence, if necessary, and reasoning as in Theorem 3.3, we can assume that $(X_{u_n}, L_{u_n}(t)) \to (X_o, L_o(t))$. Let $X_{u_n} = (x^{1n}, x^{2n}, x^{3n}, c(u_n), c(u_n), c(u_n), c(u_n))$. Since $X_{u_n} \to X_o$, we can assume that $\operatorname{sgn}\langle x_{in}, x_{jn} \rangle_3 = -1$ for $i, j = 2, 3, 4, 5$, $i \neq j$. Without loss of generality, again passing to a subsequence if necessary, we can assume that for $n \ge n_o$, $\operatorname{sgn}\langle x_{1n}, x_{jn} \rangle_3 = z_j$ for $j = 2, 3, 4, 5$, where $z_j = \pm 1$. By Lemma 2.8 we have to consider three cases:

(a) $z_2 = -1$, $z_3 = z_4 = z_5 = 1$;
(b) $z_2 = z_3 = z_4 = z_5 = 1$;
(c) $z_2 = z_3 = -z_4 = -z_5 = 1$.

By Theorem 2.3 and Theorem 2.4, (a) can be excluded. If (b) holds true, then by Theorem 3.3, Theorem 3.2 and Theorem 3.4 applied to $u_1 = 0$ and $h_{t,A,0}$, where $A$ is given by (41), we get that $M_{u_n} = 6c_{u_n}^2 \le 3/2$, which by Theorem 2.4 leads to a contradiction. (Since $u_1 = 0$, $D_{o,A,t}$ is the same for the function $h_{o,A,t}$, determined by $A$ given by (41). This permits us to apply Theorem 3.4 in this case.) If (c) holds true, we get a contradiction with Theorem 3.3. Consequently, there exists an interval $[0, v) \subset U$.

Now assume that $u \in U$ and $u > 0$. Assume $u_n \to u$ and $u_n \notin U$ for $n \in \mathbb{N}$. Let $(X_{u_n}, L_{u_n}(t))$ be such as in Theorem 3.3. Without loss of generality we can assume that $(X_{u_n}, L_{u_n}(t)) \to (X_u, L_u(t))$, where $(X_u, L_u(t))$ is defined in Theorem 3.3. Since $X_{u_n} \to X_u$, $\operatorname{sgn}\langle x_{in}, x_{jn} \rangle_3 = a_{ij}$ for $i, j = 1, 2, 3, 4, 5$ for $n \ge n_o$, where the matrix $\{a_{ij}\}$ is given by (28). Applying Theorem 3.3, we get that $u_n \in U$ for $n \ge n_o$; a contradiction. Hence the set $U$ is open. It is easy to see that $U$ is also closed. Since $0 \in U$ and $[0, 1)$ is connected, $U = [0, 1)$. Since $M(1, 0) = 1$ the proof is complete. □

Theorem 3.6.
\[ \lambda_3^5 = \frac{5 + 4\sqrt{2}}{7}. \]
Moreover, $\lambda_3^5 = \lambda(V)$, where $V \subset l_\infty^{(5)}$ is spanned by
\[ x^1 = (a/u_1, b/c_o, b/c_o, -b/c_o, -b/c_o), \qquad x^2 = (0, 0, 0, 1/\sqrt{2}, -1/\sqrt{2})/c_o \]
and
\[ x^3 = (0, 1/\sqrt{2}, -1/\sqrt{2}, 0, 0)/c_o, \]
where
\[ u_1 = \sqrt{(5 - 3\sqrt{2})/7}, \qquad c_o = \frac{\sqrt{(2 + 3\sqrt{2})/7}}{2} \]
and
\[ a = \sqrt{(2\sqrt{2} - 1)/7}, \qquad b = \frac{\sqrt{1 - a^2}}{2}. \]
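The constants in Theorem 3.6 can be checked numerically. The sketch below is only a sanity check under the formula for $M_u$ stated in Theorem 3.5; the function name `h` and the grid search are illustrative choices, not part of the proof.

```python
import math

def h(c):
    """Right-hand side of the formula for M_u in Theorem 3.5, as a function of c."""
    s = c * c
    return (1 + 6 * s + math.sqrt((6 * s - 1) ** 2 + 16 * (1 - 4 * s) * s)) / 2

sqrt2 = math.sqrt(2)
u1 = math.sqrt((5 - 3 * sqrt2) / 7)
c_o = math.sqrt((2 + 3 * sqrt2) / 7) / 2
a = math.sqrt((2 * sqrt2 - 1) / 7)
b = math.sqrt(1 - a * a) / 2

# Crude grid search for the maximiser of h on [0, 1/2] (illustration only).
grid = [k / 200000 for k in range(100001)]
c_best = max(grid, key=h)

print("h(c_o)         =", h(c_o))
print("(5 + 4*sqrt2)/7 =", (5 + 4 * sqrt2) / 7)
print("grid maximiser  =", c_best, " c_o =", c_o)
print("|u|^2 =", u1 ** 2 + 4 * c_o ** 2, " |z^1|^2 =", a ** 2 + 4 * b ** 2)
```

The last line confirms that $u = (u_1, c_o, c_o, c_o, c_o)$ and $(a, b, b, -b, -b)$ are unit vectors, as the constraints require.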
Proof. Let $f_{3,5}: \mathbb{R}^5 \times (\mathbb{R}^5)^3 \to \mathbb{R}$ be defined by
\[ f_{3,5}\big((v_1, v_2, \dots, v_5), y^1, y^2, y^3\big) = \sum_{i,j=1}^{5} v_i v_j |\langle y_i, y_j \rangle_3|. \]
Let $M_{3,5} = \max f_{3,5}$ under constraints
\[ \langle y^i, y^j \rangle_5 = \delta_{ij}, \qquad 1 \le i \le j \le 3; \]
and
\[ \sum_{j=1}^{5} v_j^2 = 1. \]
By Theorem 2.2, $\lambda_3^5 = M_{3,5}$. By Theorem 3.5,
\[ M_{3,5} = \max\left\{ h(c) = \frac{1 + 6c^2 + \sqrt{(6c^2 - 1)^2 + 16(1 - 4c^2)c^2}}{2} : c \in [0, 1/2] \right\}. \]
By Theorem 2.4, $c_o = \sqrt{(2 + 3\sqrt{2})/7}/2$ and
\[ M_{3,5} = h(c_o) = \frac{5 + 4\sqrt{2}}{7}. \]
By the proof of Theorem 2.2, and Theorem 2.4, the function $f_{3,5}$ attains its maximum at $z^1 = (a, b, b, -b, -b)$, $z^2 = (0, 0, 0, 1/\sqrt{2}, -1/\sqrt{2})$ and $z^3 = (0, 1/\sqrt{2}, -1/\sqrt{2}, 0, 0)$, $u = (u_1, c_o, c_o, c_o, c_o)$, where
\[ u_1 = \sqrt{(5 - 3\sqrt{2})/7}, \qquad c_o = \frac{\sqrt{(2 + 3\sqrt{2})/7}}{2} \]
and
\[ a = \sqrt{(2\sqrt{2} - 1)/7}, \qquad b = \frac{\sqrt{1 - a^2}}{2}. \]
By the proof of Theorem 2.2, $x^1$, $x^2$ and $x^3$, defined in the statement of our theorem, form a basis of a space $V$ satisfying $\lambda(V) = \lambda_3^5$. □

Remark 3.1. Note that (compare with [12, p. 259]) $3/2 = \lambda_3^4 < \lambda_3^5$. Also $\lambda_2^3 = 4/3$ and by the Kadec–Snobar Theorem [8] $\lambda_2^4 \le \sqrt{2} < 3/2$. If $x^1, x^2, x^3, u$ are such as in Theorem 3.6, then after elementary calculations we get
\[ \|x_1\|_3 = \sqrt{\frac{2\sqrt{2} - 1}{5 - 3\sqrt{2}}} \]
and
\[ \|x_2\|_3 = \sqrt{\frac{22 - 2\sqrt{2}}{2 + 3\sqrt{2}}}, \]
where $x_1 = (x_1^1, x_1^2, x_1^3)$, $x_2 = (x_2^1, x_2^2, x_2^3)$ and $\|\cdot\|_3$ is the Euclidean norm in $\mathbb{R}^3$. Hence it is easy to see that $\|x_1\|_3 = \|x_2\|_3$ if and only if $77\sqrt{2} = 112$, which is false. Consequently, by the above calculations and Theorem 3.6, Proposition 3.1 from [12] is incorrect.

Remark 3.2. Notice that in [6], it has been proven that $\lambda(V) \le 4/3$ for any two-dimensional, real, unconditional Banach space. Recall that a two-dimensional, real Banach space $V$ is called unconditional if there exists a basis $v^1, v^2$ of $V$ such that for any $a_1, a_2 \in \mathbb{R}$ and $\epsilon_1, \epsilon_2 \in \{-1, 1\}$
\[ \|a_1 v^1 + a_2 v^2\| = \|\epsilon_1 a_1 v^1 + \epsilon_2 a_2 v^2\|. \]
Moreover, the Grünbaum conjecture has been recently proved (see [4]).
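The elementary calculation in Remark 3.1 can be reproduced as follows. This is a minimal sketch, assuming that $x_j$ denotes the vector of $j$th coordinates of the basis $x^1, x^2, x^3$ from Theorem 3.6; the helper names `column` and `norm` are illustrative.

```python
import math

sqrt2 = math.sqrt(2)
u1 = math.sqrt((5 - 3 * sqrt2) / 7)
c_o = math.sqrt((2 + 3 * sqrt2) / 7) / 2
a = math.sqrt((2 * sqrt2 - 1) / 7)
b = math.sqrt(1 - a * a) / 2

# Basis of V from Theorem 3.6 (vectors in R^5).
basis = [
    (a / u1, b / c_o, b / c_o, -b / c_o, -b / c_o),
    (0.0, 0.0, 0.0, 1 / (sqrt2 * c_o), -1 / (sqrt2 * c_o)),
    (0.0, 1 / (sqrt2 * c_o), -1 / (sqrt2 * c_o), 0.0, 0.0),
]

def column(j):
    """j-th coordinates of the three basis vectors, viewed as a point of R^3."""
    return tuple(vec[j] for vec in basis)

def norm(v):
    """Euclidean norm."""
    return math.sqrt(sum(t * t for t in v))

norm_x1 = norm(column(0))
norm_x2 = norm(column(1))

print(norm_x1, math.sqrt((2 * sqrt2 - 1) / (5 - 3 * sqrt2)))
print(norm_x2, math.sqrt((22 - 2 * sqrt2) / (2 + 3 * sqrt2)))
print("77*sqrt(2) =", 77 * sqrt2, "(not 112)")
```

The two norms differ, in agreement with the remark that $77\sqrt{2} \neq 112$.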
References [1] J. Blatter, E.W. Cheney, Minimal projections onto hyperplanes in sequence spaces, Ann. Mat. Pura Appl. 101 (1974) 215–227. [2] B.L. Chalmers, G. Lewicki, Symmetric subspaces of l1 with large projection constants, Studia Math. 134 (1999) 119–133. [3] B.L. Chalmers, G. Lewicki, Symmetric spaces with maximal projection constants, J. Funct. Anal. 200 (2003) 1–22. [4] B.L. Chalmers, G. Lewicki, Two illustrative examples of spaces with maximal projection constant, IMUJ Preprint 2008/02; http://www.im.uj.edu.pl/badania/preprinty. [5] B.L. Chalmers, F. Metcalf, A characterization and equations for minimal projections and extensions, J. Operator Theory 32 (1994) 31–46. [6] B.L. Chalmers, B. Shekhtman, Extension constants of unconditional two-dimensional operators, Linear Algebra Appl. 240 (1996) 173–182. [7] B. Grünbaum, Projection constants, Trans. Amer. Math. Soc. 95 (1960) 451–465. [8] I.M. Kadec, M.G. Snobar, Certain functionals on the Minkowski compactum, Math. Notes 10 (1971) 694–696 (English transl.). [9] H. König, Spaces with large projection constants, Israel J. Math. 50 (1985) 181–188. [10] H. König, D.R. Lewis, P.K. Lin, Finite-dimensional projection constants, Studia Math. 75 (1983) 341–358. [11] H. König, N. Tomczak-Jaegermann, Bounds for projection constants and 1-summing norms, Trans. Amer. Math. Soc. 320 (1990) 799–823. [12] H. König, N. Tomczak-Jaegermann, Norms of minimal projections, J. Funct. Anal. 119 (1994) 253–280. [13] H. König, C. Schuett, N. Tomczak-Jaegermann, Projection constants of symmetric spaces and variants of Khinchine’s inequality, J. Reine Angew. Math. 511 (1999) 1–42. [14] G. Lewicki, L. Skrzypek, Chalmers–Metcalf operator and uniqueness of minimal projections, J. Approx. Theory 148 (2007) 71–91. [15] D. Rutovitz, Some parameters associated with finite-dimensional Banach spaces, J. London Math. Soc. 40 (1965) 241–255. [16] P. Wojtaszczyk, Banach Spaces for Analysts, Cambridge University Press, Cambridge, 1991.
Journal of Functional Analysis 257 (2009) 593–609 www.elsevier.com/locate/jfa
Second order Poincaré inequalities and CLTs on Wiener space Ivan Nourdin a , Giovanni Peccati b,c,∗ , Gesine Reinert d a Laboratoire de Probabilités et Modèles Aléatoires, Université Pierre et Marie Curie (Paris VI),
Boîte courrier 188, 4 place Jussieu, 75252 Paris, Cedex 05, France b Equipe Modal’X, Université Paris Ouest – Nanterre la Défense, 200 Avenue de la République,
92000 Nanterre, France c LSTA, Université Paris VI, France d Department of Statistics, University of Oxford, 1 South Parks Road, Oxford OX1 3TG, UK
Received 27 November 2008; accepted 12 December 2008 Available online 17 January 2009 Communicated by Paul Malliavin
Abstract We prove infinite-dimensional second order Poincaré inequalities on Wiener space, thus closing a circle of ideas linking limit theorems for functionals of Gaussian fields, Stein’s method and Malliavin calculus. We provide two applications: (i) to a new “second order” characterization of CLTs on a fixed Wiener chaos, and (ii) to linear functionals of Gaussian-subordinated fields. © 2008 Elsevier Inc. All rights reserved. Keywords: Central limit theorems; Isonormal Gaussian processes; Linear functionals; Multiple integrals; Second order Poincaré inequalities; Stein’s method; Wiener chaos
* Corresponding author at: Equipe Modal’X, Université Paris Ouest – Nanterre la Défense, 200 Avenue de la République, 92000 Nanterre, France. E-mail addresses: [email protected] (I. Nourdin), [email protected] (G. Peccati), [email protected] (G. Reinert).
0022-1236/$ – see front matter © 2008 Elsevier Inc. All rights reserved. doi:10.1016/j.jfa.2008.12.017
I. Nourdin et al. / Journal of Functional Analysis 257 (2009) 593–609
1. Introduction

Let $N \sim \mathcal{N}(0, 1)$ be a standard Gaussian random variable. In its most basic formulation, the Gaussian Poincaré inequality states that, for every differentiable function $f: \mathbb{R} \to \mathbb{R}$,
\[ \operatorname{Var} f(N) \le E f'(N)^2, \tag{1.1} \]
with equality if and only if $f$ is affine. The estimate (1.1) is a fundamental tool of stochastic analysis: it implies that, if the random variable $f'(N)$ has a small $L^2(\Omega)$ norm, then $f(N)$ necessarily has small fluctuations. Relation (1.1) was first proved by Nash in [14], and then rediscovered by Chernoff in [9] (both proofs use Hermite polynomials). The Gaussian Poincaré inequality admits extensions in several directions, encompassing both the case of smooth functionals of multi-dimensional (and possibly infinite-dimensional) Gaussian fields, and of non-Gaussian probability distributions; see e.g. Bakry et al. [1], Bobkov [2], Cacoullos et al. [3], Chen [5–7], Houdré and Perez-Abreu [10], and the references therein. In particular, the results proved in [10] (which make use of the Malliavin calculus) allow one to recover the following infinite-dimensional version of (1.1). Let $X$ be an isonormal Gaussian process over some real separable Hilbert space $\mathfrak{H}$ (see Section 2), and let $F \in \mathbb{D}^{1,2}$ be a Malliavin-differentiable functional of $X$. Then, the Malliavin derivative of $F$, denoted by $DF$, is a random element with values in $\mathfrak{H}$, and it holds that
\[ \operatorname{Var} F \le E\|DF\|_{\mathfrak{H}}^2, \tag{1.2} \]
with equality if and only if $F$ has the form of a constant plus an element of the first Wiener chaos of $X$. In Proposition 3.1 below we shall prove a more general version of (1.2), involving central moments of arbitrary even orders and based on the techniques developed in [16]. Note that (1.2) contains as a special case the well-known fact that, if $F = f(X_1, \dots, X_d)$ is a smooth function of i.i.d. $\mathcal{N}(0, 1)$ random variables $X_1, \dots, X_d$, then
\[ \operatorname{Var} F \le E\|\nabla f(X_1, \dots, X_d)\|_{\mathbb{R}^d}^2, \tag{1.3} \]
where $\nabla f$ is the gradient of $f$. Now suppose that the random variable $F = f(X_1, \dots, X_d)$ (where the $X_1, \dots, X_d$ are again i.i.d. $\mathcal{N}(0, 1)$) is such that $f$ is twice differentiable. In the recent paper [4], Chatterjee has pointed out that if one focuses also on the $d \times d$ Hessian matrix $\operatorname{Hess} f$, and not only on $\nabla f$, then one can state an inequality assessing the total variation distance (see Section 3.2, (3.21)) between the law of $F$ and the law of a Gaussian random variable with matching mean and variance. The precise result goes as follows (see [4, Theorem 2.2]). Let $E(F) = \mu$, $\operatorname{Var} F = \sigma^2 > 0$, $Z \sim \mathcal{N}(\mu, \sigma^2)$, and denote by $d_{TV}(F, Z)$ the total variation distance between the laws of $F$ and $Z$, see (3.21). Then
\[ d_{TV}(F, Z) \le \frac{2\sqrt{5}}{\sigma^2} \big( E\|\operatorname{Hess} f(X_1, \dots, X_d)\|_{op}^4 \big)^{1/4} \times \big( E\|\nabla f(X_1, \dots, X_d)\|_{\mathbb{R}^d}^4 \big)^{1/4}, \tag{1.4} \]
where $\|\operatorname{Hess} f(X_1, \dots, X_d)\|_{op}$ is the operator norm of the (random) matrix $\operatorname{Hess} f(X_1, \dots, X_d)$. A relation such as (1.4) is called a second order Poincaré inequality: it is proved in [4] by combining (1.3) with an adequate version of Stein's method (see e.g. [8,24]).
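As a sanity check on the first order inequality (1.1), both sides can be computed exactly when $f$ is a polynomial, using the Gaussian moments $E[N^{2k}] = (2k-1)!!$. The sketch below is illustrative only; the helper names (`gaussian_moment`, `expect`, `poincare_sides`) are hypothetical and polynomials are stored as coefficient lists, lowest degree first.

```python
from fractions import Fraction
from math import prod

def gaussian_moment(k):
    """E[N^k] for N ~ N(0,1): (k-1)!! for even k, 0 for odd k."""
    return 0 if k % 2 else prod(range(1, k, 2))

def expect(poly):
    """E[p(N)] for a polynomial p given by its coefficients."""
    return sum(Fraction(c) * gaussian_moment(k) for k, c in enumerate(poly))

def multiply(p, q):
    """Product of two polynomials."""
    out = [0] * (len(p) + len(q) - 1)
    for i, ci in enumerate(p):
        for j, cj in enumerate(q):
            out[i + j] += ci * cj
    return out

def derivative(p):
    """Derivative of a polynomial."""
    return [k * c for k, c in enumerate(p)][1:] or [0]

def poincare_sides(f):
    """Return (Var f(N), E[f'(N)^2]) as exact rationals."""
    variance = expect(multiply(f, f)) - expect(f) ** 2
    df = derivative(f)
    return variance, expect(multiply(df, df))

print(poincare_sides([3, 2]))         # affine f(x) = 3 + 2x: equality
print(poincare_sides([0, 0, 1]))      # f(x) = x^2
print(poincare_sides([0, -3, 0, 1]))  # f(x) = x^3 - 3x
```

For the affine function the two sides coincide, which matches the equality case of (1.1); for the other two examples the inequality is strict.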
In [16, Remark 3.6] the first two authors of the present paper pointed out that the finite-dimensional Stein-type inequalities leading to Relation (1.4) are special instances of much more general estimates, which can be obtained by combining Stein's method and Malliavin calculus on an infinite-dimensional Gaussian space. It is therefore natural to ask whether the results of [16] can be used in order to obtain a general version of (1.4), involving a "distance to Gaussian" for smooth functionals of arbitrary infinite-dimensional Gaussian fields. We shall show that the answer is positive. Indeed, one of the principal achievements of this paper is the proof of the following statement ($d_W$ denotes the Wasserstein distance, see (3.22)):

Theorem 1.1 (Second order infinite-dimensional Poincaré inequality). Let $X$ be an isonormal Gaussian process over some real separable Hilbert space $\mathfrak{H}$, and let $F \in \mathbb{D}^{2,4}$. Assume that $E(F) = \mu$ and $\operatorname{Var}(F) = \sigma^2 > 0$. Let $Z \sim \mathcal{N}(\mu, \sigma^2)$. Then
\[ d_W(F, Z) \le \frac{\sqrt{10}}{2\sigma^2} \big( E\|D^2F\|_{op}^4 \big)^{1/4} \times \big( E\|DF\|_{\mathfrak{H}}^4 \big)^{1/4}. \tag{1.5} \]
If, in addition, the law of $F$ is absolutely continuous with respect to the Lebesgue measure, then
\[ d_{TV}(F, Z) \le \frac{\sqrt{10}}{\sigma^2} \big( E\|D^2F\|_{op}^4 \big)^{1/4} \times \big( E\|DF\|_{\mathfrak{H}}^4 \big)^{1/4}. \tag{1.6} \]
The class $\mathbb{D}^{2,4}$ of twice Malliavin-differentiable functionals is formally defined in Section 2; note that $D^2F$ is a random element with values in $\mathfrak{H}^{\odot 2}$ (the symmetric tensor product of $\mathfrak{H}$ with itself) and that we used $\|D^2F\|_{op}$ to indicate the operator norm (or, equivalently, the spectral radius) of the random Hilbert–Schmidt operator $f \mapsto \langle f, D^2F \rangle_{\mathfrak{H}}$. The proof of Theorem 1.1 is detailed in Section 4.1. As discussed in Section 4.2, a crucial point is that Theorem 1.1 leads to further (and very useful) inequalities, which we name random contraction inequalities. These estimates involve a "contracted version" of the second derivative $D^2F$, and will lead (see Section 5) to the proof of new necessary and sufficient conditions which ensure that a sequence of random variables belonging to a fixed Wiener chaos converges in law to a standard Gaussian random variable. This result generalizes and unifies the findings contained in [16,17,20,21,23], and virtually closes a very fruitful circle of recent ideas linking Malliavin calculus, Stein's method and central limit theorems (CLTs) on Wiener space (see also [15]). The role of contraction inequalities is further explored in Section 6, where we study CLTs for linear functionals of Gaussian subordinated fields.

The rest of the paper is organized as follows. In Section 2 we recall some preliminary results involving Malliavin operators. Section 3 concerns Poincaré type inequalities and bounds on distances between probabilities. Section 4 deals with the proof of Theorem 1.1, as well as with "random contraction inequalities." Sections 5 and 6 focus, respectively, on CLTs on Wiener chaos and on CLTs for Gaussian subordinated fields. Finally, Section 7 is devoted to a version of (1.5) for random variables of the type $F = (F_1, \dots, F_d)$.

2. Preliminaries

We shall now present the basic elements of Gaussian analysis and Malliavin calculus that are used in this paper. The reader is referred to the two monographs by Malliavin [12] and Nualart [19] for any unexplained definition or result.
Let $\mathfrak{H}$ be a real separable Hilbert space. For any $q \ge 1$ let $\mathfrak{H}^{\otimes q}$ be the $q$th tensor product of $\mathfrak{H}$ and denote by $\mathfrak{H}^{\odot q}$ the associated $q$th symmetric tensor product. We write $X = \{X(h), h \in \mathfrak{H}\}$ to indicate an isonormal Gaussian process over $\mathfrak{H}$, defined on some probability space $(\Omega, \mathcal{F}, P)$. This means that $X$ is a centered Gaussian family, whose covariance is given in terms of the inner product of $\mathfrak{H}$ by $E[X(h)X(g)] = \langle h, g \rangle_{\mathfrak{H}}$. We also assume that $\mathcal{F}$ is generated by $X$.

For every $q \ge 1$, let $\mathcal{H}_q$ be the $q$th Wiener chaos of $X$, that is, the closed linear subspace of $L^2(\Omega, \mathcal{F}, P)$ generated by the random variables of the type $\{H_q(X(h)), h \in \mathfrak{H}, \|h\|_{\mathfrak{H}} = 1\}$, where $H_q$ is the $q$th Hermite polynomial defined as
\[ H_q(x) = (-1)^q e^{x^2/2} \frac{d^q}{dx^q}\big( e^{-x^2/2} \big). \]
We write by convention $\mathcal{H}_0 = \mathbb{R}$. For any $q \ge 1$, the mapping $I_q(h^{\otimes q}) = q! H_q(X(h))$ can be extended to a linear isometry between the symmetric tensor product $\mathfrak{H}^{\odot q}$ equipped with the modified norm $\sqrt{q!}\,\|\cdot\|_{\mathfrak{H}^{\otimes q}}$ and the $q$th Wiener chaos $\mathcal{H}_q$. For $q = 0$ we write $I_0(c) = c$, $c \in \mathbb{R}$.

It is well known (Wiener chaos expansion) that $L^2(\Omega, \mathcal{F}, P)$ can be decomposed into the infinite orthogonal sum of the spaces $\mathcal{H}_q$. Therefore, any square integrable random variable $F \in L^2(\Omega, \mathcal{F}, P)$ admits the following chaotic expansion
\[ F = \sum_{q=0}^{\infty} I_q(f_q), \tag{2.7} \]
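where $f_0 = E[F]$, and the $f_q \in \mathfrak{H}^{\odot q}$, $q \ge 1$, are uniquely determined by $F$. For every $q \ge 0$ we denote by $J_q$ the orthogonal projection operator on the $q$th Wiener chaos. In particular, if $F \in L^2(\Omega, \mathcal{F}, P)$ is as in (2.7), then $J_q F = I_q(f_q)$ for every $q \ge 0$.

Let $\{e_k, k \ge 1\}$ be a complete orthonormal system in $\mathfrak{H}$. Given $f \in \mathfrak{H}^{\odot p}$ and $g \in \mathfrak{H}^{\odot q}$, for every $r = 0, \dots, p \wedge q$, the contraction of $f$ and $g$ of order $r$ is the element of $\mathfrak{H}^{\otimes (p+q-2r)}$ defined by
\[ f \otimes_r g = \sum_{i_1, \dots, i_r = 1}^{\infty} \langle f, e_{i_1} \otimes \cdots \otimes e_{i_r} \rangle_{\mathfrak{H}^{\otimes r}} \otimes \langle g, e_{i_1} \otimes \cdots \otimes e_{i_r} \rangle_{\mathfrak{H}^{\otimes r}}. \tag{2.8} \]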
Notice that $f \otimes_r g$ is not necessarily symmetric: we denote its symmetrization by $f \,\widetilde{\otimes}_r\, g \in \mathfrak{H}^{\odot (p+q-2r)}$. Moreover, $f \otimes_0 g = f \otimes g$ equals the tensor product of $f$ and $g$ while, for $p = q$, $f \otimes_q g = \langle f, g \rangle_{\mathfrak{H}^{\otimes q}}$. In the particular case where $\mathfrak{H} = L^2(A, \mathcal{A}, \mu)$, where $(A, \mathcal{A})$ is a measurable space and $\mu$ is a $\sigma$-finite and non-atomic measure, one has that $\mathfrak{H}^{\odot q} = L_s^2(A^q, \mathcal{A}^{\otimes q}, \mu^{\otimes q})$ is the space of symmetric and square integrable functions on $A^q$. Moreover, for every $f \in \mathfrak{H}^{\odot q}$, $I_q(f)$ coincides with the multiple Wiener–Itô integral of order $q$ of $f$ with respect to $X$ introduced by Itô in [11]. In this case, (2.8) can be written as
\[ (f \otimes_r g)(t_1, \dots, t_{p+q-2r}) = \int_{A^r} f(t_1, \dots, t_{p-r}, s_1, \dots, s_r)\, g(t_{p-r+1}, \dots, t_{p+q-2r}, s_1, \dots, s_r)\, d\mu(s_1) \dots d\mu(s_r). \]
It can then be also shown that the following multiplication formula holds: if $f \in \mathfrak{H}^{\odot p}$ and $g \in \mathfrak{H}^{\odot q}$, then
\[ I_p(f) I_q(g) = \sum_{r=0}^{p \wedge q} r! \binom{p}{r} \binom{q}{r} I_{p+q-2r}(f \,\widetilde{\otimes}_r\, g). \tag{2.9} \]
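The combinatorial coefficients in (2.9) are the same as in the classical linearization formula for Hermite polynomials, $H_p H_q = \sum_{r=0}^{p \wedge q} r! \binom{p}{r} \binom{q}{r} H_{p+q-2r}$, which holds for the polynomials $H_q$ defined above. The sketch below checks this one-dimensional identity with exact integer arithmetic; it is quoted as a known identity, not derived from (2.9), and the helper names are illustrative.

```python
from math import comb, factorial

def hermite(q):
    """Coefficients (lowest degree first) of H_q, via H_{n+1} = x H_n - n H_{n-1}."""
    prev, cur = [1], [0, 1]
    if q == 0:
        return prev
    for n in range(1, q):
        nxt = [0] + cur                  # x * H_n
        for k, c in enumerate(prev):     # minus n * H_{n-1}
            nxt[k] -= n * c
        prev, cur = cur, nxt
    return cur

def multiply(p, q):
    """Product of two polynomials."""
    out = [0] * (len(p) + len(q) - 1)
    for i, ci in enumerate(p):
        for j, cj in enumerate(q):
            out[i + j] += ci * cj
    return out

def linearization_rhs(p, q):
    """sum_r r! C(p,r) C(q,r) H_{p+q-2r} as a coefficient list of length p+q+1."""
    rhs = [0] * (p + q + 1)
    for r in range(min(p, q) + 1):
        weight = factorial(r) * comb(p, r) * comb(q, r)
        for k, c in enumerate(hermite(p + q - 2 * r)):
            rhs[k] += weight * c
    return rhs

for p in range(4):
    for q in range(4):
        assert multiply(hermite(p), hermite(q)) == linearization_rhs(p, q)
print(hermite(3), linearization_rhs(2, 1))
```

For example, with $p = q = 1$ the identity reads $x^2 = H_2(x) + H_0(x) = (x^2 - 1) + 1$.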
Let us now introduce some basic elements of the Malliavin calculus with respect to the isonormal Gaussian process $X$. Let $\mathcal{S}$ be the set of all cylindrical random variables of the form
\[ F = g\big( X(\phi_1), \dots, X(\phi_n) \big), \tag{2.10} \]
where $n \ge 1$, $g: \mathbb{R}^n \to \mathbb{R}$ is an infinitely differentiable function with compact support and $\phi_i \in \mathfrak{H}$. The Malliavin derivative of $F$ with respect to $X$ is the element of $L^2(\Omega, \mathfrak{H})$ defined as
\[ DF = \sum_{i=1}^{n} \frac{\partial g}{\partial x_i}\big( X(\phi_1), \dots, X(\phi_n) \big) \phi_i. \]
In particular, $DX(h) = h$ for every $h \in \mathfrak{H}$. By iteration, one can define the $m$th derivative $D^m F$, which is an element of $L^2(\Omega, \mathfrak{H}^{\odot m})$, for every $m \ge 2$. For $m \ge 1$ and $p \ge 1$, $\mathbb{D}^{m,p}$ denotes the closure of $\mathcal{S}$ with respect to the norm $\|\cdot\|_{m,p}$, defined by the relation
\[ \|F\|_{m,p}^p = E\big[ |F|^p \big] + \sum_{i=1}^{m} E\big( \|D^i F\|_{\mathfrak{H}^{\otimes i}}^p \big). \]
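The Malliavin derivative $D$ verifies the following chain rule. If $\varphi: \mathbb{R}^n \to \mathbb{R}$ is continuously differentiable with bounded partial derivatives and if $F = (F_1, \dots, F_n)$ is a vector of elements of $\mathbb{D}^{1,2}$, then $\varphi(F) \in \mathbb{D}^{1,2}$ and
\[ D\varphi(F) = \sum_{i=1}^{n} \frac{\partial \varphi}{\partial x_i}(F)\, DF_i. \]
Note also that a random variable $F$ as in (2.7) is in $\mathbb{D}^{1,2}$ if and only if $\sum_{q=1}^{\infty} q \|J_q F\|_{L^2(\Omega)}^2 < \infty$ and, in this case, $E(\|DF\|_{\mathfrak{H}}^2) = \sum_{q=1}^{\infty} q \|J_q F\|_{L^2(\Omega)}^2$. If $\mathfrak{H} = L^2(A, \mathcal{A}, \mu)$ (with $\mu$ non-atomic), then the derivative of a random variable $F$ as in (2.7) can be identified with the element of $L^2(A \times \Omega)$ given by
\[ D_x F = \sum_{q=1}^{\infty} q I_{q-1}\big( f_q(\cdot, x) \big), \qquad x \in A. \tag{2.11} \]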
We denote by $\delta$ the adjoint of the operator $D$, also called the divergence operator. A random element $u \in L^2(\Omega, \mathfrak{H})$ belongs to the domain of $\delta$, noted $\operatorname{Dom} \delta$, if and only if it verifies $|E\langle DF, u \rangle_{\mathfrak{H}}| \le c_u \|F\|_{L^2(\Omega)}$ for any $F \in \mathbb{D}^{1,2}$, where $c_u$ is a constant depending only on $u$. If $u \in \operatorname{Dom} \delta$, then the random variable $\delta(u)$ is defined by the duality relationship (called integration by parts formula)
\[ E\big( F \delta(u) \big) = E\langle DF, u \rangle_{\mathfrak{H}}, \tag{2.12} \]
which holds for every $F \in \mathbb{D}^{1,2}$. The divergence operator $\delta$ is also called the Skorohod integral because in the case of the Brownian motion it coincides with the anticipating stochastic integral introduced by Skorohod in [26].
The family $(T_t, t \ge 0)$ of operators is defined through the projection operators $J_q$ as
\[ T_t = \sum_{q=0}^{\infty} e^{-qt} J_q, \tag{2.13} \]
and is called the Ornstein–Uhlenbeck semigroup. Assume that the process $X'$, which stands for an independent copy of $X$, is such that $X$ and $X'$ are defined on the product probability space $(\Omega \times \Omega', \mathcal{F} \otimes \mathcal{F}', P \times P')$. Given a random variable $Z \in \mathbb{D}^{1,2}$, we can regard its Malliavin derivative $DZ = DZ(X)$ as a measurable mapping from $\mathbb{R}^{\mathfrak{H}}$ to $\mathbb{R}$, determined $P \circ X^{-1}$-almost surely. Then, for any $t \ge 0$, we have the so-called Mehler's formula (see e.g. [12, Section 8.5, Chapter I] or [19, Eq. (1.54)]):
\[ T_t(DZ) = E'\big( DZ\big( e^{-t} X + \sqrt{1 - e^{-2t}}\, X' \big) \big), \tag{2.14} \]
where $E'$ denotes the mathematical expectation with respect to the probability $P'$.
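In dimension one, definition (2.13) and Mehler's formula can be compared directly: a Hermite polynomial $H_q(N)$ lies in the $q$th chaos, so (2.13) gives $T_t H_q = e^{-qt} H_q$, while the Mehler representation reads $T_t f(x) = E\big[ f(e^{-t}x + \sqrt{1 - e^{-2t}}\, N') \big]$ with $N' \sim \mathcal{N}(0,1)$. The sketch below evaluates the expectation exactly through Gaussian moments; it is a finite-dimensional illustration only, and the helper names are hypothetical.

```python
import math

def hermite(q):
    """Coefficients (lowest degree first) of H_q, via H_{n+1} = x H_n - n H_{n-1}."""
    prev, cur = [1], [0, 1]
    if q == 0:
        return prev
    for n in range(1, q):
        nxt = [0] + cur
        for k, c in enumerate(prev):
            nxt[k] -= n * c
        prev, cur = cur, nxt
    return cur

def gaussian_moment(k):
    """E[N^k] for N ~ N(0,1): (k-1)!! for even k, 0 for odd k."""
    return 0 if k % 2 else math.prod(range(1, k, 2))

def evaluate(poly, x):
    """Value of a polynomial at x."""
    return sum(c * x ** k for k, c in enumerate(poly))

def mehler(poly, x, t):
    """E[p(e^{-t} x + sqrt(1 - e^{-2t}) N')] computed via the binomial theorem."""
    alpha = math.exp(-t)
    beta = math.sqrt(1 - math.exp(-2 * t))
    total = 0.0
    for n, c in enumerate(poly):
        total += c * sum(
            math.comb(n, k) * (alpha * x) ** (n - k) * beta ** k * gaussian_moment(k)
            for k in range(n + 1)
        )
    return total

q, x, t = 3, 0.7, 0.4
print(mehler(hermite(q), x, t), math.exp(-q * t) * evaluate(hermite(q), x))
```

The two printed numbers agree up to rounding, which is the one-dimensional content of the equivalence between (2.13) and the Mehler representation.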
The operator $L$ is defined as $L = \sum_{q=0}^{\infty} -q J_q$, and it can be proven to be the infinitesimal generator of the Ornstein–Uhlenbeck semigroup $(T_t)_{t \ge 0}$. The domain of $L$ is
\[ \operatorname{Dom} L = \Big\{ F \in L^2(\Omega): \sum_{q=1}^{\infty} q^2 \|J_q F\|_{L^2(\Omega)}^2 < \infty \Big\} = \mathbb{D}^{2,2}. \]
There is an important relation between the operators $D$, $\delta$ and $L$ (see e.g. [19, Proposition 1.4.3]). A random variable $F$ belongs to $\mathbb{D}^{2,2}$ if and only if $F \in \operatorname{Dom}(\delta D)$ (i.e. $F \in \mathbb{D}^{1,2}$ and $DF \in \operatorname{Dom} \delta$), and in this case
\[ \delta D F = -LF. \tag{2.15} \]
For any $F \in L^2(\Omega)$, we define $L^{-1} F = \sum_{q=1}^{\infty} -\frac{1}{q} J_q(F)$. The operator $L^{-1}$ is called the pseudo-inverse of $L$. For any $F \in L^2(\Omega)$, we have that $L^{-1} F \in \operatorname{Dom} L$, and
\[ L L^{-1} F = F - E(F). \tag{2.16} \]
We end the preliminaries by noting that Shigekawa [25] has developed an alternative framework which avoids the inverse of the Ornstein–Uhlenbeck operator $L$. This framework could provide an alternative derivation of the integration by parts formula (2.30) in [16] which leads to Theorem 3.3.

3. Poincaré-type inequalities and bounds on distances

3.1. Poincaré inequalities

The following statement contains, among others, a general version (3.19) of the infinite-dimensional Poincaré inequality (1.2).
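Proposition 3.1. Fix $p \ge 2$ and let $F \in \mathbb{D}^{1,p}$ be such that $E(F) = 0$.

1. The following estimate holds:
\[ E\|DL^{-1}F\|_{\mathfrak{H}}^p \le E\|DF\|_{\mathfrak{H}}^p. \tag{3.17} \]
2. If in addition $F \in \mathbb{D}^{2,p}$, then
\[ E\|D^2 L^{-1}F\|_{op}^p \le \frac{1}{2^p} E\|D^2 F\|_{op}^p, \tag{3.18} \]
where $\|D^2 F\|_{op}$ indicates the operator norm of the random Hilbert–Schmidt operator $\mathfrak{H} \to \mathfrak{H}: f \mapsto \langle f, D^2 F \rangle_{\mathfrak{H}}$ (and similarly for $\|D^2 L^{-1}F\|_{op}$).
3. If $p$ is an even integer, then
\[ E\big[ F^p \big] \le (p - 1)^{p/2} E\|DF\|_{\mathfrak{H}}^p. \tag{3.19} \]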
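Proof. By virtue of standard arguments, we may assume throughout the proof that $\mathfrak{H} = L^2(A, \mathcal{A}, \mu)$, where $(A, \mathcal{A})$ is a measurable space and $\mu$ is a $\sigma$-finite and non-atomic measure.

1. In what follows, we will write $X'$ to indicate an independent copy of $X$. Let $F \in L^2(\Omega)$ have the expansion (2.7). Then, from (2.11),
\[ -D_x L^{-1} F = \sum_{q \ge 1} I_{q-1}\big( f_q(x, \cdot) \big). \]
By combining this relation with Mehler's formula (2.14), one deduces that
\[
\begin{aligned}
-D_x L^{-1} F &= \int_0^{\infty} e^{-t}\, T_t D_x F(X)\, dt = \int_0^{\infty} e^{-t} E_{X'}\big( D_x F\big( e^{-t} X + \sqrt{1 - e^{-2t}}\, X' \big) \big)\, dt \\
&= E_Y E_{X'}\big( D_x F\big( e^{-Y} X + \sqrt{1 - e^{-2Y}}\, X' \big) \big),
\end{aligned}
\]
where $Y \sim \mathcal{E}(1)$ is an independent exponential random variable of mean 1, and $\{T_t: t \ge 0\}$ is the Ornstein–Uhlenbeck semigroup (2.13). Note that we regard every random variable $D_x F$ as an application $\mathbb{R}^{\mathfrak{H}} \to \mathbb{R}$ and that (for a generic random variable $G$) we write $E_G$ to indicate that we take the expectation with respect to $G$. It follows that
\[
\begin{aligned}
E\|DL^{-1}F\|_{\mathfrak{H}}^p &= E_X \big\| E_Y E_{X'}\, DF\big( e^{-Y} X + \sqrt{1 - e^{-2Y}}\, X' \big) \big\|_{\mathfrak{H}}^p \\
&\le E_X E_Y E_{X'} \big\| DF\big( e^{-Y} X + \sqrt{1 - e^{-2Y}}\, X' \big) \big\|_{\mathfrak{H}}^p \\
&= E_Y E_X E_{X'} \big\| DF\big( e^{-Y} X + \sqrt{1 - e^{-2Y}}\, X' \big) \big\|_{\mathfrak{H}}^p \\
&= E_Y E_X \|DF(X)\|_{\mathfrak{H}}^p = E_X \|DF(X)\|_{\mathfrak{H}}^p = E\|DF\|_{\mathfrak{H}}^p,
\end{aligned}
\]
where we used the fact that $e^{-t} X + \sqrt{1 - e^{-2t}}\, X' \overset{\text{law}}{=} X$ for any $t \ge 0$.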
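2. From the relation
\[ -D_{xy}^2 L^{-1} F = \sum_{q \ge 2} (q - 1) I_{q-2}\big( f_q(x, y, \cdot) \big) \]
one deduces analogously that
\[
\begin{aligned}
-D_{xy}^2 L^{-1} F &= \int_0^{\infty} e^{-2t}\, T_t D_{xy}^2 F\, dt = \int_0^{\infty} e^{-2t} E_{X'}\big( D_{xy}^2 F\big( e^{-t} X + \sqrt{1 - e^{-2t}}\, X' \big) \big)\, dt \\
&= \frac{1}{2} E_Y E_{X'}\big( D_{xy}^2 F\big( e^{-Y} X + \sqrt{1 - e^{-2Y}}\, X' \big) \big),
\end{aligned}
\]
where $Y \sim \mathcal{E}(2)$ is an independent exponential random variable of mean $\frac{1}{2}$. Thus
\[
\begin{aligned}
E\|D^2 L^{-1}F\|_{op}^p &= \frac{1}{2^p} E_X \big\| E_Y E_{X'}\, D^2 F\big( e^{-Y} X + \sqrt{1 - e^{-2Y}}\, X' \big) \big\|_{op}^p \\
&\le \frac{1}{2^p} E_X E_Y E_{X'} \big\| D^2 F\big( e^{-Y} X + \sqrt{1 - e^{-2Y}}\, X' \big) \big\|_{op}^p \\
&= \frac{1}{2^p} E_Y E_X E_{X'} \big\| D^2 F\big( e^{-Y} X + \sqrt{1 - e^{-2Y}}\, X' \big) \big\|_{op}^p \\
&= \frac{1}{2^p} E_Y E_X \|D^2 F(X)\|_{op}^p = \frac{1}{2^p} E_X \|D^2 F(X)\|_{op}^p = \frac{1}{2^p} E\|D^2 F\|_{op}^p.
\end{aligned}
\]
3. Writing $p = 2k$, we have
\[
\begin{aligned}
E\big[ F^{2k} \big] &= E\big[ L L^{-1} F \times F^{2k-1} \big] = -E\big[ \delta D L^{-1} F \times F^{2k-1} \big] \\
&= (2k - 1) E\big[ \langle DF, -DL^{-1}F \rangle_{\mathfrak{H}} F^{2k-2} \big] \\
&\le (2k - 1) \big( E\big| \langle DF, -DL^{-1}F \rangle_{\mathfrak{H}} \big|^k \big)^{1/k} \big( E\big[ F^{2k} \big] \big)^{1 - 1/k} \quad \text{by Hölder's inequality},
\end{aligned}
\]
from which we infer that
\[
\begin{aligned}
E\big[ F^{2k} \big] &\le (2k - 1)^k E\big| \langle DF, -DL^{-1}F \rangle_{\mathfrak{H}} \big|^k \le (2k - 1)^k E\big[ \|DF\|_{\mathfrak{H}}^k \|DL^{-1}F\|_{\mathfrak{H}}^k \big] \\
&\le (2k - 1)^k \sqrt{E\|DF\|_{\mathfrak{H}}^{2k}} \sqrt{E\|DL^{-1}F\|_{\mathfrak{H}}^{2k}} \le (2k - 1)^k E\|DF\|_{\mathfrak{H}}^{2k}.
\end{aligned}
\]
□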
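Inequality (3.19) can be tested exactly in the simplest setting $\mathfrak{H} = \mathbb{R}$, $F = f(N)$ with $f$ a polynomial such that $E f(N) = 0$, where $\|DF\|_{\mathfrak{H}} = |f'(N)|$. The sketch below is an illustration under these assumptions (the helper names are hypothetical); it computes both sides with exact Gaussian moments for an even exponent $p$.

```python
from fractions import Fraction
from math import prod

def gaussian_moment(k):
    """E[N^k] for N ~ N(0,1): (k-1)!! for even k, 0 for odd k."""
    return 0 if k % 2 else prod(range(1, k, 2))

def expect(poly):
    """E[p(N)] for a polynomial given by coefficients, lowest degree first."""
    return sum(Fraction(c) * gaussian_moment(k) for k, c in enumerate(poly))

def multiply(p, q):
    """Product of two polynomials."""
    out = [0] * (len(p) + len(q) - 1)
    for i, ci in enumerate(p):
        for j, cj in enumerate(q):
            out[i + j] += ci * cj
    return out

def power(p, n):
    """n-th power of a polynomial."""
    out = [1]
    for _ in range(n):
        out = multiply(out, p)
    return out

def derivative(p):
    """Derivative of a polynomial."""
    return [k * c for k, c in enumerate(p)][1:] or [0]

def moment_bound_sides(f, p):
    """Return (E[F^p], (p-1)^(p/2) E[f'(N)^p]) for F = f(N) and even p."""
    assert p % 2 == 0 and expect(f) == 0
    lhs = expect(power(f, p))
    rhs = (p - 1) ** (p // 2) * expect(power(derivative(f), p))
    return lhs, rhs

print(moment_bound_sides([-1, 0, 1], 4))  # F = N^2 - 1, p = 4
print(moment_bound_sides([0, 1], 6))      # F = N, p = 6
```

For $F = N^2 - 1$ and $p = 4$ the two sides are 60 and 432, so the bound holds with room to spare in this example.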
We also state the following technical result which will be needed in Section 4. The proof is standard and omitted.
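Lemma 3.2. Let $F$ and $G$ be two elements of $\mathbb{D}^{2,4}$. Then, the two random elements $\langle D^2 F, DG \rangle_{\mathfrak{H}}$ and $\langle DF, D^2 G \rangle_{\mathfrak{H}}$ belong to $L^2(\Omega, \mathfrak{H})$. Moreover, $\langle DF, DG \rangle_{\mathfrak{H}} \in \mathbb{D}^{1,2}$ and
\[ D\langle DF, DG \rangle_{\mathfrak{H}} = \langle D^2 F, DG \rangle_{\mathfrak{H}} + \langle DF, D^2 G \rangle_{\mathfrak{H}}. \tag{3.20} \]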
3.2. Bounds on the total variation and Wasserstein distances

Let $U, Z$ be two generic real-valued random variables. We recall that the total variation distance between the law of $U$ and the law of $Z$ is defined as
\[ d_{TV}(U, Z) = \sup_{A} \big| P(U \in A) - P(Z \in A) \big|, \tag{3.21} \]
where the supremum is taken over all Borel subsets $A$ of $\mathbb{R}$. For two random vectors $U$ and $Z$ with values in $\mathbb{R}^d$, $d \ge 1$, the Wasserstein distance between the law of $U$ and the law of $Z$ is
\[ d_W(U, Z) = \sup_{f: \|f\|_{Lip} \le 1} \big| E f(U) - E f(Z) \big|, \tag{3.22} \]
where $\|\cdot\|_{Lip}$ stands for the usual Lipschitz seminorm. We stress that the topologies induced by $d_{TV}$ and $d_W$, on the class of all probability measures on $\mathbb{R}$, are strictly stronger than the topology of weak convergence. The following statement has been proved in [16, Theorem 3.1] by means of Stein's method.

Theorem 3.3. Suppose that $Z \sim \mathcal{N}(0, 1)$. Let $F \in \mathbb{D}^{1,2}$ and $E(F) = 0$. Then,
\[ d_W(F, Z) \le E\big| 1 - \langle DF, -DL^{-1}F \rangle_{\mathfrak{H}} \big| \le \Big( E\big[ \big( 1 - \langle DF, -DL^{-1}F \rangle_{\mathfrak{H}} \big)^2 \big] \Big)^{1/2}. \tag{3.23} \]
If moreover $F$ has an absolutely continuous distribution, then
\[ d_{TV}(F, Z) \le 2 E\big| 1 - \langle DF, -DL^{-1}F \rangle_{\mathfrak{H}} \big| \le 2 \Big( E\big[ \big( 1 - \langle DF, -DL^{-1}F \rangle_{\mathfrak{H}} \big)^2 \big] \Big)^{1/2}. \tag{3.24} \]
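4. Proof of Theorem 1.1 and contraction inequalities

4.1. Proof of Theorem 1.1

We can assume, without loss of generality, that $\mu = 0$ and $\sigma^2 = 1$. Set $W = \langle DF, -DL^{-1}F \rangle_{\mathfrak{H}}$. First, note that $W$ has mean 1, as
\[ E(W) = E\langle DF, -DL^{-1}F \rangle_{\mathfrak{H}} = -E\big[ F \times \delta D L^{-1} F \big] = E\big[ F \times L L^{-1} F \big] = E\big[ F^2 \big] = 1. \]
By Theorem 3.3 it follows that we only need to bound $\sqrt{\operatorname{Var}(W)}$. By (1.2), we have $\operatorname{Var}(W) \le E\|DW\|_{\mathfrak{H}}^2$. So, our problem is now to evaluate $\|DW\|_{\mathfrak{H}}^2$. By using Lemma 3.2 in the special case $G = -L^{-1}F$, we deduce that
\[
\begin{aligned}
\|DW\|_{\mathfrak{H}}^2 &= \big\| \langle D^2 F, -DL^{-1}F \rangle_{\mathfrak{H}} + \langle DF, -D^2 L^{-1}F \rangle_{\mathfrak{H}} \big\|_{\mathfrak{H}}^2 \\
&\le 2 \big\| \langle D^2 F, -DL^{-1}F \rangle_{\mathfrak{H}} \big\|_{\mathfrak{H}}^2 + 2 \big\| \langle DF, -D^2 L^{-1}F \rangle_{\mathfrak{H}} \big\|_{\mathfrak{H}}^2.
\end{aligned}
\]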
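We evaluate the last two terms separately. We have
\[ \big\| \langle D^2 F, -DL^{-1}F \rangle_{\mathfrak{H}} \big\|_{\mathfrak{H}}^2 \le \|D^2 F\|_{op}^2 \|DL^{-1}F\|_{\mathfrak{H}}^2 \]
and
\[ \big\| \langle DF, -D^2 L^{-1}F \rangle_{\mathfrak{H}} \big\|_{\mathfrak{H}}^2 \le \|DF\|_{\mathfrak{H}}^2 \|D^2 L^{-1}F\|_{op}^2. \]
It follows that
\[
\begin{aligned}
E\|DW\|_{\mathfrak{H}}^2 &\le 2 E\big[ \|DL^{-1}F\|_{\mathfrak{H}}^2 \|D^2 F\|_{op}^2 + \|DF\|_{\mathfrak{H}}^2 \|D^2 L^{-1}F\|_{op}^2 \big] \\
&\le 2 \big( E\|DL^{-1}F\|_{\mathfrak{H}}^4 \big)^{1/2} \times \big( E\|D^2 F\|_{op}^4 \big)^{1/2} + 2 \big( E\|DF\|_{\mathfrak{H}}^4 \big)^{1/2} \times \big( E\|D^2 L^{-1}F\|_{op}^4 \big)^{1/2}.
\end{aligned}
\]
The desired conclusion follows by using, respectively, (3.17) and (3.18) with $p = 4$. Indeed, these give $E\|DW\|_{\mathfrak{H}}^2 \le \big( 2 + \tfrac{1}{2} \big) \big( E\|DF\|_{\mathfrak{H}}^4 \big)^{1/2} \big( E\|D^2 F\|_{op}^4 \big)^{1/2}$, and $\sqrt{5/2} = \sqrt{10}/2$ is the constant appearing in (1.5), while (3.24) doubles it to give (1.6).

4.2. Random contraction inequalities

When the quantity $E\|D^2 F\|_{op}^4$ appearing in (1.5)–(1.6) is analytically too hard to assess, one can resort to the following inequality, which we name random contraction inequality:

Proposition 4.1 (Random contraction inequality). Let $F \in \mathbb{D}^{2,4}$. Then
\[ \|D^2 F\|_{op}^4 \le \big\| D^2 F \otimes_1 D^2 F \big\|_{\mathfrak{H}^{\otimes 2}}^2, \tag{4.25} \]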
where $D^2 F \otimes_1 D^2 F$ is the random element of $\mathfrak{H}^{\odot 2}$ obtained as the contraction of the symmetric random tensor $D^2 F$, see (2.8).

Proof. We can associate with the symmetric random element $D^2 F \in \mathfrak{H}^{\odot 2}$ the random Hilbert–Schmidt operator $f \mapsto \langle f, D^2 F \rangle_{\mathfrak{H}^{\otimes 2}}$. Denote by $\{\gamma_j\}_{j \ge 1}$ the sequence of its (random) eigenvalues. One has that
\[ \|D^2 F\|_{op}^4 = \max_{j \ge 1} |\gamma_j|^4 \le \sum_{j \ge 1} |\gamma_j|^4 = \big\| D^2 F \otimes_1 D^2 F \big\|_{\mathfrak{H}^{\otimes 2}}^2, \]
and the conclusion follows. □
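A finite-dimensional analogue of (4.25) is easy to verify by hand. Taking $\mathfrak{H} = \mathbb{R}^2$ (an assumption made only for this illustration), a symmetric tensor is a symmetric matrix $A$, the contraction $A \otimes_1 A$ is the matrix with entries $\sum_i A_{ij} A_{ik}$, and the $\mathfrak{H}^{\otimes 2}$ norm is the Frobenius norm. The sketch below compares $\|A\|_{op}^4$, $\sum_j \gamma_j^4$ and $\|A \otimes_1 A\|^2$ for one matrix; the helper names are hypothetical.

```python
import math

def contract1(A, B):
    """First-order contraction: (A (x)_1 B)[j][k] = sum_i A[i][j] * B[i][k]."""
    n = len(A)
    return [[sum(A[i][j] * B[i][k] for i in range(n)) for k in range(n)]
            for j in range(n)]

def frobenius_sq(M):
    """Squared Frobenius (Hilbert-Schmidt) norm."""
    return sum(x * x for row in M for x in row)

def eigenvalues_sym2(A):
    """Eigenvalues of a symmetric 2x2 matrix [[a, b], [b, d]] in closed form."""
    (a, b), (_, d) = A
    mean = (a + d) / 2
    radius = math.hypot((a - d) / 2, b)
    return mean - radius, mean + radius

A = [[2.0, 1.0], [1.0, -1.0]]
g1, g2 = eigenvalues_sym2(A)
op_norm_4 = max(abs(g1), abs(g2)) ** 4
sum_gamma_4 = g1 ** 4 + g2 ** 4
contraction_norm_sq = frobenius_sq(contract1(A, A))

print(op_norm_4, sum_gamma_4, contraction_norm_sq)
```

The last two numbers coincide and dominate the first, mirroring the chain $\max_j |\gamma_j|^4 \le \sum_j |\gamma_j|^4 = \|A \otimes_1 A\|^2$ used in the proof.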
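The following result is an immediate corollary of Theorem 1.1 and Proposition 4.1.

Corollary 4.2. Let $F \in \mathbb{D}^{2,4}$ with $E(F) = \mu$ and $\operatorname{Var}(F) = \sigma^2$. Assume that $Z \sim \mathcal{N}(\mu, \sigma^2)$. Then
\[ d_W(F, Z) \le \frac{\sqrt{10}}{2\sigma^2} \big( E\big\| D^2 F \otimes_1 D^2 F \big\|_{\mathfrak{H}^{\otimes 2}}^2 \big)^{1/4} \times \big( E\|DF\|_{\mathfrak{H}}^4 \big)^{1/4}. \tag{4.26} \]
If, in addition, the law of $F$ is absolutely continuous with respect to the Lebesgue measure, then
\[ d_{TV}(F, Z) \le \frac{\sqrt{10}}{\sigma^2} \big( E\big\| D^2 F \otimes_1 D^2 F \big\|_{\mathfrak{H}^{\otimes 2}}^2 \big)^{1/4} \times \big( E\|DF\|_{\mathfrak{H}}^4 \big)^{1/4}. \tag{4.27} \]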
Remark 4.3. When used in the context of central limit theorems, inequality (4.27) does not give, in general, optimal rates. For instance, if $F_k = I_2(f_k)$ is a sequence of double integrals such that $E(F_k^2) \to 1$ and $F_k \overset{\text{law}}{\longrightarrow} Z \sim \mathcal{N}(0, 1)$ as $k \to \infty$, then (4.27) implies that
\[ d_{TV}(F_k, Z) \le \text{cst} \times \|f_k \otimes_1 f_k\|_{\mathfrak{H}^{\otimes 2}}^{1/2} \to 0, \]
and the rate $\|f_k \otimes_1 f_k\|_{\mathfrak{H}^{\otimes 2}}^{1/2}$ is suboptimal (by a power of 1/2), see Proposition 3.2 in [16].

5. Characterization of CLTs on a fixed Wiener chaos

The following statement collects results proved in [21] (for the equivalences between (i)–(iii)) and [20] (for the equivalence with (iv)).

Theorem 5.1. Fix $q \ge 2$, and let $F_k = I_q(f_k)$, $k \ge 1$, be a sequence of multiple Wiener–Itô integrals such that $E(F_k^2) \to 1$. As $k \to \infty$, the following four conditions are equivalent:

(i) $F_k \overset{\text{law}}{\longrightarrow} Z \sim \mathcal{N}(0, 1)$;
(ii) $E(F_k^4) \longrightarrow E(Z^4) = 3$;
(iii) $\|f_k \otimes_r f_k\|_{\mathfrak{H}^{\otimes (2q-2r)}} \longrightarrow 0$ for all $r = 1, \dots, q - 1$;
(iv) $\|DF_k\|_{\mathfrak{H}}^2 \overset{L^2(\Omega)}{\longrightarrow} q$.

See Section 9 in [22] for a discussion of the combinatorial aspects of the implication (ii) → (i) in the statement of Theorem 5.1. The next theorem, which is a consequence of the main results of this paper, provides two new necessary and sufficient conditions for CLTs on a fixed Wiener chaos.

Theorem 5.2. Fix $q \ge 2$, and let $F_k = I_q(f_k)$ be a sequence of multiple Wiener–Itô integrals such that $E(F_k^2) \to 1$. Then, the following three conditions are equivalent as $k \to \infty$:

(i) $F_k \overset{\text{law}}{\longrightarrow} Z \sim \mathcal{N}(0, 1)$;
(ii) $\|D^2 F_k \otimes_1 D^2 F_k\|_{\mathfrak{H}^{\otimes 2}} \overset{L^2(\Omega)}{\longrightarrow} 0$;
(iii) $\|D^2 F_k\|_{op} \overset{L^4(\Omega)}{\longrightarrow} 0$.

Proof. Since $E\|DF_k\|_{\mathfrak{H}}^2 = q E(F_k^2) \to q$, and since the random variables $\|DF_k\|_{\mathfrak{H}}^2$ live inside a finite sum of Wiener chaoses (where all the $L^p(\Omega)$ norms are equivalent), we deduce that the sequence $E\|DF_k\|_{\mathfrak{H}}^4$, $k \ge 1$, is bounded. In view of (1.5) and (4.25), it is therefore enough to prove the implication (i) → (ii). Without loss of generality, we can assume that $\mathfrak{H} = L^2(A, \mathcal{A}, \mu)$ where $(A, \mathcal{A})$ is a measurable space and $\mu$ is a $\sigma$-finite measure with no atoms. Now observe that
\[ D_{a,b}^2 F_k = q(q - 1) I_{q-2}\big( f_k(\cdot, a, b) \big), \qquad a, b \in A. \]
Hence, using the multiplication formula (2.9),
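\[
\begin{aligned}
D^2 F_k \otimes_1 D^2 F_k (a, b) &= q^2 (q - 1)^2 \int_A I_{q-2}\big( f_k(\cdot, a, u) \big) I_{q-2}\big( f_k(\cdot, b, u) \big)\, \mu(du) \\
&= q^2 (q - 1)^2 \sum_{r=0}^{q-2} r! \binom{q-2}{r}^2 \int_A I_{2q-4-2r}\big( f_k(\cdot, a, u) \,\widetilde{\otimes}_r\, f_k(\cdot, b, u) \big)\, \mu(du) \\
&= q^2 (q - 1)^2 \sum_{r=0}^{q-2} r! \binom{q-2}{r}^2 I_{2q-4-2r}\big( f_k(\cdot, a) \,\widetilde{\otimes}_{r+1}\, f_k(\cdot, b) \big) \\
&= q^2 (q - 1)^2 \sum_{r=1}^{q-1} (r - 1)! \binom{q-2}{r-1}^2 I_{2q-2-2r}\big( f_k(\cdot, a) \,\widetilde{\otimes}_r\, f_k(\cdot, b) \big).
\end{aligned}
\]
Using the orthogonality and isometry properties of the integrals $I_q$, we get
\[ E\big\| D^2 F_k \otimes_1 D^2 F_k \big\|_{\mathfrak{H}^{\otimes 2}}^2 \le q^4 (q - 1)^4 \sum_{r=1}^{q-1} (r - 1)!^2 \binom{q-2}{r-1}^4 (2q - 2 - 2r)!\, \|f_k \otimes_r f_k\|_{\mathfrak{H}^{\otimes (2q-2r)}}^2. \]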
The desired conclusion now follows since, according to Theorem 5.1, if (i) is verified then, necessarily, $\|f_k \otimes_r f_k\|_{\mathfrak{H}^{\otimes (2q-2r)}} \to 0$ for every $r = 1, \dots, q - 1$. □

6. CLTs for linear functionals of Gaussian subordinated fields

We now provide an explicit application of the inequality (4.26). Let $B$ denote a centered Gaussian process with stationary increments and such that $\int_{\mathbb{R}} |\rho(x)|\, dx < \infty$, where $\rho(u - v) := E[(B_{u+1} - B_u)(B_{v+1} - B_v)]$. Also, in order to avoid trivialities, assume that $\rho$ is not identically zero. The Gaussian space generated by $B$ can be identified with an isonormal Gaussian process of the type $X = \{X(h), h \in \mathfrak{H}\}$, for $\mathfrak{H}$ defined as follows: (i) denote by $\mathcal{E}$ the set of all step functions on $\mathbb{R}$, (ii) define $\mathfrak{H}$ as the Hilbert space obtained by closing $\mathcal{E}$ with respect to the inner product
\[ \langle 1_{[s,t]}, 1_{[u,v]} \rangle_{\mathfrak{H}} = \operatorname{Cov}(B_t - B_s, B_v - B_u). \]
In particular, with such a notation, one has that $B_t - B_s = X(1_{[s,t]})$.

Let $f: \mathbb{R} \to \mathbb{R}$ be a real function of class $C^2$, and $Z \sim \mathcal{N}(0, 1)$. We assume that $f$ is not constant, that $E|f(Z)| < \infty$ and that $E|f''(Z)|^4 < \infty$. As a consequence of the generalized Poincaré inequality (3.19), we see that we also automatically have $E|f'(Z)|^4 < \infty$ and $E|f(Z)|^4 < \infty$. Fix $a < b$ in $\mathbb{R}$ and, for any $T > 0$, consider
\[ F_T = \frac{1}{\sqrt{T}} \int_{aT}^{bT} \big( f(B_{u+1} - B_u) - E[f(Z)] \big)\, du. \]
Theorem 6.1. As $T \to \infty$,
\[ d_W\left( \frac{F_T}{\sqrt{\operatorname{Var} F_T}}, Z \right) = O\big( T^{-1/4} \big). \tag{6.28} \]
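Remark 6.2. We believe that the rate in (6.28) is not optimal (it should be $O(T^{-1/2})$ instead), see also Remark 4.3.

Proof of Theorem 6.1. We have
\[ DF_T = \frac{1}{\sqrt{T}} \int_{aT}^{bT} f'(B_{u+1} - B_u)\, 1_{[u,u+1]}\, du \]
and
\[ D^2 F_T = \frac{1}{\sqrt{T}} \int_{aT}^{bT} f''(B_{u+1} - B_u)\, 1_{[u,u+1]}^{\otimes 2}\, du. \]
Hence
\[ \|DF_T\|_{\mathfrak{H}}^2 = \frac{1}{T} \int_{[aT,bT]^2} f'(B_{u+1} - B_u) f'(B_{v+1} - B_v) \rho(u - v)\, du\, dv \]
so that
\[ \|DF_T\|_{\mathfrak{H}}^4 = \frac{1}{T^2} \int_{[aT,bT]^4} f'(B_{u+1} - B_u) f'(B_{v+1} - B_v) f'(B_{w+1} - B_w) f'(B_{z+1} - B_z) \rho(w - z) \rho(u - v)\, du\, dv\, dw\, dz. \]
By applying the Cauchy–Schwarz inequality twice, and by using the fact that $B_{u+1} - B_u \overset{\text{law}}{=} Z$, we get
\[ \big| E\big( f'(B_{u+1} - B_u) f'(B_{v+1} - B_v) f'(B_{w+1} - B_w) f'(B_{z+1} - B_z) \big) \big| \le E|f'(Z)|^4 \]
so that
\[ E\|DF_T\|_{\mathfrak{H}}^4 \le E|f'(Z)|^4 \left( \frac{1}{T} \int_{[aT,bT]^2} |\rho(u - v)|\, du\, dv \right)^2 \le E|f'(Z)|^4 \left( \frac{1}{T} \int_{aT}^{bT} du \int_{\mathbb{R}} |\rho(x)|\, dx \right)^2 = O(1). \tag{6.29} \]
On the other hand, we have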
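\[ D^2F_T \otimes_1 D^2F_T = \frac{1}{T} \int_{[aT,bT]^2} f''(B_{u+1}-B_u)\,f''(B_{v+1}-B_v)\,\rho(u-v)\,1_{[u,u+1]} \otimes 1_{[v,v+1]}\,du\,dv. \]
Hence
\begin{align*}
E\bigl\|D^2F_T \otimes_1 D^2F_T\bigr\|_{H^{\otimes 2}}^2
&= \frac{1}{T^2}\, E \int_{[aT,bT]^4} f''(B_{u+1}-B_u)\,f''(B_{v+1}-B_v)\,f''(B_{w+1}-B_w)\,f''(B_{z+1}-B_z) \\
&\qquad\qquad \times \rho(u-v)\,\rho(w-z)\,\rho(u-w)\,\rho(z-v)\,du\,dv\,dw\,dz \\
&\le E\bigl[f''(Z)^4\bigr]\, \frac{1}{T^2} \int_{[aT,bT]^4} |\rho(u-v)|\,|\rho(w-z)|\,|\rho(u-w)|\,|\rho(z-v)|\,du\,dv\,dw\,dz \\
&\le \frac{b-a}{T}\, E\bigl[f''(Z)^4\bigr] \int_{\mathbb{R}^3} |\rho(x)|\,|\rho(y)|\,|\rho(t)|\,|\rho(x-y-t)|\,dx\,dy\,dt = O\bigl(T^{-1}\bigr).
\end{align*}
By combining all these facts and (4.26), the desired conclusion follows. □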
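For readers who want the intermediate steps, the following sketch spells out the double Cauchy–Schwarz bound behind (6.29) and the way the two estimates of the proof combine. It assumes that (4.26), which is not reproduced in this section, bounds the Wasserstein distance by a constant multiple of the product of fourth roots shown in the last display, and that $\operatorname{Var} F_T$ stays bounded away from zero for large $T$; both points are assumptions of this sketch. The symbols $X_1,\dots,X_4$ and the constant $c$ are introduced here only for illustration.

```latex
% X_1,...,X_4 stand for f'(B_{u+1}-B_u), f'(B_{v+1}-B_v), f'(B_{w+1}-B_w), f'(B_{z+1}-B_z);
% each X_i has the same law as f'(Z).
\begin{align*}
\bigl| E[X_1 X_2 X_3 X_4] \bigr|
  &\le \bigl(E[X_1^2 X_2^2]\bigr)^{1/2} \bigl(E[X_3^2 X_4^2]\bigr)^{1/2}
     && \text{(first Cauchy--Schwarz)} \\
  &\le \prod_{i=1}^{4} \bigl(E[X_i^4]\bigr)^{1/4} = E\bigl[f'(Z)^4\bigr]
     && \text{(second Cauchy--Schwarz)}.
\end{align*}
% Combining the two rates, under the assumed shape of (4.26) with an unspecified constant c:
\[
d_W\!\Bigl(\tfrac{F_T}{\sqrt{\operatorname{Var} F_T}},\, Z\Bigr)
  \le \frac{c}{\operatorname{Var} F_T}\,
      \bigl(E\|D^2F_T \otimes_1 D^2F_T\|_{H^{\otimes 2}}^2\bigr)^{1/4}
      \bigl(E\|DF_T\|_H^4\bigr)^{1/4}
  = O\bigl(T^{-1/4}\bigr)\cdot O(1).
\]
```

The exponent $-1/4$ thus comes entirely from the fourth root of the $O(T^{-1})$ contraction estimate, which is consistent with the comment in Remark 6.2 that the rate is probably not optimal.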
Theorem 6.1 does not guarantee that $\lim_{T\to\infty} \operatorname{Var} F_T$ exists. The following proposition shows that the limit does indeed exist, at least when $f$ is symmetric.

Proposition 6.3. Suppose that $f : \mathbb{R} \to \mathbb{R}$ is a symmetric real function of class $C^2$. Then $\sigma^2 := \lim_{T\to\infty} \operatorname{Var} F_T$ exists in $(0, \infty)$. Moreover, as $T \to \infty$,
\[ F_T \xrightarrow{\ \mathrm{law}\ } Z \sim N\bigl(0, \sigma^2\bigr). \tag{6.30} \]
Proof. We expand $f$ in terms of Hermite polynomials. Since $f$ is symmetric, we can write
\[ f(x) = E[f(Z)] + \sum_{q=1}^{\infty} c_{2q} H_{2q}(x), \qquad x \in \mathbb{R}, \]
where the real numbers $c_{2q}$ are given by $(2q)!\,c_{2q} = E[f(Z)H_{2q}(Z)]$. Thus
\begin{align*}
\operatorname{Var} F_T
&= \frac{1}{T} \int_{[aT,bT]^2} \mathrm{Cov}\bigl( f(B_{u+1}-B_u),\, f(B_{v+1}-B_v) \bigr)\,du\,dv \\
&= \sum_{q=1}^{\infty} c_{2q}^2\,(2q)!\, \frac{1}{T} \int_{[aT,bT]^2} \rho^{2q}(v-u)\,du\,dv \\
&= \sum_{q=1}^{\infty} c_{2q}^2\,(2q)!\, \frac{1}{T} \int_{aT}^{bT} du \int_{aT-u}^{bT-u} dx\,\rho^{2q}(x) \\
&= \sum_{q=1}^{\infty} c_{2q}^2\,(2q)! \int_{a}^{b} du \int_{-T(u-a)}^{T(b-u)} dx\,\rho^{2q}(x) \\
&\xrightarrow[T\to\infty]{} (b-a) \sum_{q=1}^{\infty} c_{2q}^2\,(2q)! \int_{\mathbb{R}} \rho^{2q}(x)\,dx =: \sigma^2,
\end{align*}
by monotone convergence. Since $f$ is not constant, there exists $q \ge 1$ such that $c_{2q} \ne 0$ so that $\sigma^2 > 0$ (recall that we assumed $\rho \not\equiv 0$). Moreover, we also have
\[ \operatorname{Var} F_T \le E\|DF_T\|_H^2 \le \sqrt{E\|DF_T\|_H^4} = O(1), \quad \text{see (6.29)}, \]
so that $\sigma^2 < \infty$. The assertion now follows from Theorem 6.1. □
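The second equality in the computation of $\operatorname{Var} F_T$ rests on the standard orthogonality relation for Hermite polynomials of jointly Gaussian variables. The sketch below records it, assuming the normalization $E[H_q(Z)^2] = q!$ (which matches the formula $(2q)!\,c_{2q} = E[f(Z)H_{2q}(Z)]$ used in the proof). The symbols $X$, $Y$ and $r$ are shorthand introduced here for illustration.

```latex
% X := B_{u+1}-B_u and Y := B_{v+1}-B_v are standard Gaussian with E[XY] = r := \rho(v-u).
\[
E\bigl[H_p(X)\,H_q(Y)\bigr] = \delta_{p,q}\; q!\; r^{q}, \qquad p, q \ge 0 .
\]
% Inserting the expansion f = E[f(Z)] + \sum_{q \ge 1} c_{2q} H_{2q} term by term:
\[
\mathrm{Cov}\bigl(f(X), f(Y)\bigr)
  = \sum_{p,q \ge 1} c_{2p}\,c_{2q}\, E\bigl[H_{2p}(X)\,H_{2q}(Y)\bigr]
  = \sum_{q \ge 1} c_{2q}^{2}\,(2q)!\; r^{2q} .
\]
```

Because only even powers $r^{2q} \ge 0$ appear, every summand is nonnegative and the inner integrals increase with $T$, which is what makes the appeal to monotone convergence legitimate.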
When $B$ is a fractional Brownian motion with Hurst index $H < 1/2$, Theorem 6.1 applies because, in this case, it is easily checked that $\int_{\mathbb{R}} |\rho(x)|\,dx < \infty$. On the other hand, using the scaling property of $B$, observe that
\[ F_{1/h} \overset{\mathrm{law}}{=} \frac{1}{\sqrt{h}} \int_{a}^{b} \left[ f\!\left( \frac{B_{x+h}-B_x}{h^{H}} \right) - E\bigl(f(Z)\bigr) \right] dx \]
for all fixed $h > 0$. Hence, since $E|B_t - B_s|^2 = \sigma^2(|t-s|)$ with $\sigma^2(r) = r^{2H}$ a concave function, the general Theorem 1.1 in [13] also applies, and this gives another proof of (6.30). We believe however that, even in this particular case, our proof is simpler (since not based on the rather technical method of moments). Moreover, note that [13] is not concerned with bounds on distance between the laws of $F_{1/h}/\sqrt{\operatorname{Var} F_{1/h}}$ and $Z \sim N(0,1)$.

7. A multidimensional extension

Let $V$, $Y$ be two random vectors with values in $\mathbb{R}^d$, $d \ge 2$. Recall that the Wasserstein distance between the laws of $V$ and $Y$ is defined in (3.22). The following statement, whose proof is based on the results obtained in [18], provides a multidimensional version of (1.5).

Theorem 7.1. Fix $d \ge 2$, and let $C = \{C(i,j) : i,j = 1,\ldots,d\}$ be a $d \times d$ positive definite matrix. Suppose that $F = (F_1,\ldots,F_d)$ is a $\mathbb{R}^d$-valued random vector such that $E[F_i] = 0$ and $F_i \in \mathbb{D}^{2,4}$ for every $i = 1,\ldots,d$. Assume moreover that $F$ has covariance matrix $C$. Then
\[ d_W\bigl(F, N_d(0,C)\bigr) \le \frac{3\sqrt{2}}{2}\, \|C^{-1}\|_{op}\, \|C\|_{op}^{1/2} \left( \sum_{i=1}^{d} \bigl( E\|D^2F_i\|_{op}^4 \bigr)^{1/4} \right) \times \left( \sum_{j=1}^{d} \bigl( E\|DF_j\|_H^4 \bigr)^{1/4} \right), \]
where $N_d(0,C)$ indicates a $d$-dimensional centered Gaussian vector, with covariance matrix equal to $C$.

Proof. In [18, Theorem 3.5] it is shown that
\[ d_W\bigl(F, N_d(0,C)\bigr) \le \|C^{-1}\|_{op}\, \|C\|_{op}^{1/2} \sqrt{ \sum_{i,j=1}^{d} E\Bigl[ \bigl( C(i,j) - \langle DF_i, -DL^{-1}F_j \rangle_H \bigr)^2 \Bigr] }. \]
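Since, using successively (2.12), (2.15) and (2.16), we have
\[ E\langle DF_i, -DL^{-1}F_j \rangle_H = -E\bigl[F_i \times \delta DL^{-1}F_j\bigr] = E\bigl[F_i \times LL^{-1}F_j\bigr] = E[F_iF_j] = C(i,j), \]
we deduce, applying successively (1.2), (3.20), Cauchy–Schwarz inequality and Proposition 3.1,
\begin{align*}
d_W\bigl(F, N_d(0,C)\bigr)
&\le \|C^{-1}\|_{op}\, \|C\|_{op}^{1/2} \sqrt{ \sum_{i,j=1}^{d} \operatorname{Var} \langle DF_i, -DL^{-1}F_j \rangle_H } \\
&\le \|C^{-1}\|_{op}\, \|C\|_{op}^{1/2} \sqrt{ \sum_{i,j=1}^{d} E\bigl\| D\langle DF_i, -DL^{-1}F_j \rangle_H \bigr\|_H^2 } \\
&\le \sqrt{2}\, \|C^{-1}\|_{op}\, \|C\|_{op}^{1/2} \sqrt{ \sum_{i,j=1}^{d} \Bigl( E\bigl\| \langle D^2F_i, -DL^{-1}F_j \rangle_H \bigr\|_H^2 + E\bigl\| \langle DF_i, -D^2L^{-1}F_j \rangle_H \bigr\|_H^2 \Bigr) } \\
&\le \sqrt{2}\, \|C^{-1}\|_{op}\, \|C\|_{op}^{1/2} \sum_{i,j=1}^{d} \Bigl[ \bigl( E\|D^2F_i\|_{op}^4 \bigr)^{1/4} \bigl( E\|DL^{-1}F_j\|_H^4 \bigr)^{1/4} + \bigl( E\|DF_i\|_H^4 \bigr)^{1/4} \bigl( E\|D^2L^{-1}F_j\|_{op}^4 \bigr)^{1/4} \Bigr] \\
&\le \sqrt{2}\, \|C^{-1}\|_{op}\, \|C\|_{op}^{1/2} \sum_{i,j=1}^{d} \Bigl[ \bigl( E\|D^2F_i\|_{op}^4 \bigr)^{1/4} \bigl( E\|DF_j\|_H^4 \bigr)^{1/4} + \frac{1}{2} \bigl( E\|DF_i\|_H^4 \bigr)^{1/4} \bigl( E\|D^2F_j\|_{op}^4 \bigr)^{1/4} \Bigr] \\
&= \frac{3\sqrt{2}}{2}\, \|C^{-1}\|_{op}\, \|C\|_{op}^{1/2} \left( \sum_{i=1}^{d} \bigl( E\|D^2F_i\|_{op}^4 \bigr)^{1/4} \right) \times \left( \sum_{j=1}^{d} \bigl( E\|DF_j\|_H^4 \bigr)^{1/4} \right). \qquad \Box
\end{align*}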
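The constant $3\sqrt{2}/2$ in the final bound of this proof comes from a simple factorisation of the double sum. As a sketch, abbreviate $a_i := (E\|D^2F_i\|_{op}^4)^{1/4}$ and $b_j := (E\|DF_j\|_H^4)^{1/4}$; this shorthand is introduced here for illustration and does not appear in the original.

```latex
\[
\sum_{i,j=1}^{d} \Bigl( a_i b_j + \tfrac12\, b_i a_j \Bigr)
  = \Bigl(\sum_{i=1}^{d} a_i\Bigr)\Bigl(\sum_{j=1}^{d} b_j\Bigr)
    + \tfrac12 \Bigl(\sum_{i=1}^{d} b_i\Bigr)\Bigl(\sum_{j=1}^{d} a_j\Bigr)
  = \tfrac32 \Bigl(\sum_{i=1}^{d} a_i\Bigr)\Bigl(\sum_{j=1}^{d} b_j\Bigr),
\]
% so that the prefactor \sqrt{2} becomes
\[
\sqrt{2}\cdot\tfrac32 = \tfrac{3\sqrt{2}}{2}.
\]
```

The factor $\tfrac12$ in the second summand is the one attached to the second-derivative term after the bounds on $DL^{-1}F_j$ and $D^2L^{-1}F_j$ are applied, as it appears in the next-to-last line of the proof.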
Acknowledgments

We would like to thank Professor Paul Malliavin for very stimulating discussions and an anonymous referee for pointing us towards the reference [25].

References

[1] D. Bakry, F. Barthe, P. Cattiaux, A. Guillin, A simple proof of the Poincaré inequality for a large class of probability measures including the log-concave case, Electron. Comm. Probab. 13 (2008) 60–66 (in electronic).
[2] S.G. Bobkov, Isoperimetric and analytic inequalities for log-concave probability measures, Ann. Probab. 27 (4) (1999) 1903–1921.
[3] T. Cacoullos, V. Papathanasiou, S.A. Utev, Variational inequalities with examples and an application to the central limit theorem, Ann. Probab. 22 (3) (1994) 1607–1618.
[4] S. Chatterjee, Fluctuation of eigenvalues and second order Poincaré inequalities, Probab. Theory Related Fields 143 (2009) 1–40.
[5] L.H.Y. Chen, Poincaré-type inequalities via stochastic integrals, Z. Wahrsch. Verw. Gebiete 69 (1985) 251–277.
[6] L.H.Y. Chen, Characterization of probability distributions by Poincaré-type inequalities, Ann. Inst. H. Poincaré Probab. Statist. 23 (1) (1987) 91–110.
[7] L.H.Y. Chen, The central limit theorem and Poincaré-type inequalities, Ann. Probab. 16 (1) (1988) 300–304.
[8] L. Chen, Q.-M. Shao, Stein’s method for normal approximation, in: An Introduction to Stein’s Method, in: Lect. Notes Ser. Inst. Math. Sci. Natl. Univ. Singap., vol. 4, Singapore Univ. Press, Singapore, 2005, pp. 1–59.
[9] H. Chernoff, A note on an inequality involving the normal distribution, Ann. Probab. 9 (3) (1981) 533–535.
[10] C. Houdré, V. Pérez-Abreu, Covariance identities and inequalities for functionals on Wiener and Poisson spaces, Ann. Probab. 23 (1995) 400–419.
[11] K. Itô, Multiple Wiener integral, J. Math. Soc. Japan 3 (1951) 157–169.
[12] P. Malliavin, Stochastic Analysis, Springer-Verlag, Berlin–Heidelberg–New York, 1997.
[13] M. Marcus, J. Rosen, CLT for Lp moduli of continuity of Gaussian processes, Stochastic Process. Appl. 118 (2008) 1107–1135.
[14] J. Nash, Continuity of solutions of parabolic and elliptic equations, Amer. J. Math. 80 (1956) 931–954.
[15] I. Nourdin, G. Peccati, Non-central convergence of multiple integrals, Ann. Probab. (2008), in press.
[16] I. Nourdin, G. Peccati, Stein’s method on Wiener chaos, Probab. Theory Related Fields (2008), in press.
[17] I. Nourdin, G. Peccati, Stein’s method and exact Berry–Esséen bounds for functionals of Gaussian fields, preprint, 2008.
[18] I. Nourdin, G. Peccati, A. Réveillac, Multivariate normal approximation using Stein’s method and Malliavin calculus, Ann. Inst. H. Poincaré Probab. Statist. (2008), in press.
[19] D. Nualart, The Malliavin Calculus and Related Topics, second ed., Springer-Verlag, Berlin, 2006.
[20] D. Nualart, S. Ortiz-Latorre, Central limit theorem for multiple stochastic integrals and Malliavin calculus, Stochastic Process. Appl. 118 (4) (2008) 614–628.
[21] D. Nualart, G. Peccati, Central limit theorems for sequence of multiple stochastic integrals, Ann. Probab. 33 (1) (2005) 177–193.
[22] G. Peccati, M.S. Taqqu, Moments, cumulants and diagram formulae for non-linear functionals of random measures, preprint available at http://fr.arxiv.org/abs/0811.1726, 2008.
[23] G. Peccati, C.A. Tudor, Gaussian limits for vector-valued multiple stochastic integrals, in: Séminaire de Probabilités XXXVIII, in: Lecture Notes in Math., vol. 1857, Springer-Verlag, Berlin, 2005, pp. 247–262.
[24] G. Reinert, Three general approaches to Stein’s method, in: An Introduction to Stein’s Method, in: Lect. Notes Ser. Inst. Math. Sci. Natl. Univ. Singap., vol. 4, Singapore Univ. Press, Singapore, 2005, pp. 183–221.
[25] I. Shigekawa, De Rham–Hodge–Kodaira’s decomposition on an abstract Wiener space, J. Math. Kyoto Univ. 26 (2) (1986) 191–202.
[26] A.V. Skorohod, On a generalization of a stochastic integral, Theory Probab. Appl. 20 (1975) 219–233.
Journal of Functional Analysis 257 (2009) 610–640 www.elsevier.com/locate/jfa
On topological centre problems and SIN quantum groups ✩ Zhiguo Hu a , Matthias Neufang b,∗ , Zhong-Jin Ruan c a Department of Mathematics and Statistics, University of Windsor, Windsor, Ontario, Canada N9B 3P4 b School of Mathematics and Statistics, Carleton University, Ottawa, Ontario, Canada K1S 5B6 c Department of Mathematics, University of Illinois, Urbana, IL 61801, USA
Received 3 December 2008; accepted 3 February 2009 Available online 25 February 2009 Communicated by N. Kalton
Abstract

Let $A$ be a Banach algebra with a faithful multiplication and $\langle A^*A\rangle^*$ be the quotient Banach algebra of $A^{**}$ with the left Arens product. We introduce a natural Banach algebra, which is a closed subspace of $\langle A^*A\rangle^*$ but equipped with a distinct multiplication. With the help of this Banach algebra, new characterizations of the topological centre $Z_t(\langle A^*A\rangle^*)$ of $\langle A^*A\rangle^*$ are obtained, and a characterization of $Z_t(\langle A^*A\rangle^*)$ by Lau and Ülger for $A$ having a bounded approximate identity is extended to all Banach algebras. The study of this Banach algebra motivates us to introduce the notion of SIN locally compact quantum groups and the concept of quotient strong Arens irregularity. We give characterizations of co-amenable SIN quantum groups, which are even new for locally compact groups. Our study shows that the SIN property is intrinsically related to topological centre problems. We also give characterizations of quotient strong Arens irregularity for all quantum group algebras. Within the class of Banach algebras introduced recently by the authors, we characterize the unital ones, generalizing the corresponding result of Lau and Ülger. We study the interrelationships between strong Arens irregularity and quotient strong Arens irregularity, revealing the complex nature of topological centre problems. Some open questions by Lau and Ülger on $Z_t(\langle A^*A\rangle^*)$ are also answered. © 2009 Elsevier Inc. All rights reserved.

Keywords: Banach algebras; Topological centres; Locally compact groups and quantum groups
✩ The first and the second authors were partially supported by NSERC. The third author was partially supported by the National Science Foundation DMS-0500535. * Corresponding author. E-mail addresses: [email protected] (Z. Hu), [email protected] (M. Neufang), [email protected] (Z.-J. Ruan).
0022-1236/$ – see front matter © 2009 Elsevier Inc. All rights reserved. doi:10.1016/j.jfa.2009.02.004
Z. Hu et al. / Journal of Functional Analysis 257 (2009) 610–640
1. Introduction

Let $A$ be a Banach algebra. As is well known, on the bidual $A^{**}$ of $A$, there are two Banach algebra multiplications, called the left and the right Arens products, respectively, each extending the multiplication on $A$ (cf. Arens [1]). By definition, the left Arens product $\Box$ is induced by the left $A$-module structure on $A$. That is, for $m, n \in A^{**}$, $f \in A^*$, and $a, b \in A$, we have
\[ \langle f \cdot a, b \rangle = \langle f, ab \rangle, \qquad \langle n \,\Box\, f, a \rangle = \langle n, f \cdot a \rangle, \qquad \text{and} \qquad \langle m \,\Box\, n, f \rangle = \langle m, n \,\Box\, f \rangle. \]
The right Arens product $\Diamond$ is defined by considering $A$ as a right $A$-module. It is known that
\[ m \,\Box\, n = w^*\text{-}\lim_{\alpha}\lim_{\beta} a_\alpha b_\beta \qquad \text{and} \qquad m \,\Diamond\, n = w^*\text{-}\lim_{\beta}\lim_{\alpha} a_\alpha b_\beta \]
whenever $(a_\alpha)$ and $(b_\beta)$ are nets in $A$ converging to $m$ and $n$, respectively, in $\sigma(A^{**}, A^*)$. The algebra $A$ is said to be Arens regular if $\Box$ and $\Diamond$ coincide on $A^{**}$. Every operator algebra, in particular, every $C^*$-algebra, is Arens regular. However, Banach algebras studied in abstract harmonic analysis are typically far from being Arens regular. For example, for a locally compact group $G$, the group algebra $L_1(G)$ is Arens regular if and only if $G$ is finite (cf. Young [47]).

In the study of Arens irregularity, the left and the right topological centres $Z_t(A^{**}, \Box)$ and $Z_t(A^{**}, \Diamond)$ of $A^{**}$ are considered (see Section 2 for the definition). In general, $Z_t(A^{**}, \Box)$ and $Z_t(A^{**}, \Diamond)$ are norm closed subalgebras of $A^{**}$ when $A^{**}$ is equipped with either Arens product. It is obvious that $A$ is Arens regular if and only if $Z_t(A^{**}, \Box) = Z_t(A^{**}, \Diamond) = A^{**}$. The left Arens product on $A^{**}$ induces naturally a Banach algebra multiplication on $\langle A^*A\rangle^*$, also denoted by $\Box$, such that $(\langle A^*A\rangle^*, \Box) \cong (A^{**}, \Box)/\langle A^*A\rangle^{\perp}$. The topological centre $Z_t(\langle A^*A\rangle^*)$ of $(\langle A^*A\rangle^*, \Box)$ is defined in a similar fashion as $Z_t(A^{**}, \Box)$. As observed by several authors, there exists a close connection between the topological centres $Z_t(A^{**}, \Box)$ and $Z_t(\langle A^*A\rangle^*)$.

Under the canonical embedding $A \hookrightarrow A^{**}$, we have $A \subseteq Z_t(A^{**}, \Box) \subseteq A^{**}$ and $A \subseteq Z_t(A^{**}, \Diamond) \subseteq A^{**}$. In [6], Dales and Lau introduced the following concepts of Arens irregularity: $A$ is said to be left strongly Arens irregular if $Z_t(A^{**}, \Box) = A$, right strongly Arens irregular if $Z_t(A^{**}, \Diamond) = A$, and strongly Arens irregular if $A$ is both left and right strongly Arens irregular. When $A$ has a bounded right approximate identity, one has $RM(A) \subseteq Z_t(\langle A^*A\rangle^*) \subseteq \langle A^*A\rangle^*$, where $RM(A)$ is the opposite right multiplier algebra of $A$. In general, $RM(A)$ and $\langle A^*A\rangle^*$ can be compared via their canonical images in $B_A(A^*)$, the Banach algebra of bounded right $A$-module homomorphisms on $A^*$ (see Fact 2 in Section 4).
In the spirit of the terminology of Dales and Lau, we say that $A$ is left quotient strongly Arens irregular if $Z_t(\langle A^*A\rangle^*) \subseteq RM(A)$. Then, when $A$ has a bounded right approximate identity, $A$ is left quotient strongly Arens irregular if and only if $Z_t(\langle A^*A\rangle^*) = RM(A)$. The right quotient strong Arens irregularity is defined similarly by comparing the topological centre $Z_t(\langle AA^*\rangle^*)$ of $\langle AA^*\rangle^*$ with the left multiplier algebra of $A$, where $\langle AA^*\rangle^*$ is the Banach algebra with the multiplication induced by $\Diamond$. There have been a number of extensive studies of strong Arens irregularity and quotient strong Arens irregularity over the last three decades. The reader is referred to Dales [5], Dales and Lau [6], Dales, Lau and Strauss [7], Lau [25], Lau and Losert [26,27], Lau and Ülger [28], and Palmer [35] for more information on Arens products, topological centres, and related topics. The present work is mainly motivated by the intriguing interrelationships between strong Arens irregularity and quotient strong Arens irregularity of general Banach algebras.

The paper is organized as follows. In Section 2, we start with notation conventions, definitions, and
preliminary results on Arens products and topological centres. We then introduce the Banach algebra A∗ A∗R , which is a norm closed subspace of the quotient algebra (A∗ A∗ , ) on the one hand, but on the other hand its multiplication is induced by the right Arens product ♦ on A∗∗ . The topological centre Zt (A∗ A∗R ) of A∗ A∗ is thereby defined, since A∗ A∗R is a left topological semigroup under the relative weak∗ -topology. It will be seen that the Banach algebras A∗ A∗R and Zt (A∗ A∗R ) help reveal the intrinsic structure of A∗ A, A∗ A∗ , and A∗∗ . In Section 3, we present a new characterization of the topological centre Zt (A∗ A∗ ) (Theorem 2). In the case where A is the group algebra L1 (G) of a locally compact group G, we use LUC(G)∗R to replace the Banach algebra (ZU (G), ∗), which was defined via the group action of G on LUC(G) and was used by Lau [25, Lemma 2] to characterize Zt (LUC(G)∗ ) (see Section 3 for details, and see Dales and Lau [6, Proposition 11.6] for the extension of [25, Lemma 2] to convolution Beurling algebras). Therefore, Theorem 2 is a Banach algebraic version of [25, Lemma 2]. Our investigation of the SIN property and other problems shows that LUC(G)∗R has its advantages over ZU (G) in certain aspects. As a consequence of Theorem 2, we extend to all Banach algebras a characterization of Zt (A∗ A∗ ) by Lau and Ülger [28, Lemma 3.1c)], where they assumed that A has a bounded approximate identity (Corollary 3). The exploration of a Banach algebraic version of another characterization of Zt (LUC(G)∗ ) given by Neufang [34] leads to the concept of a strong identity of A∗ A∗ , that is proven to be important for studying locally compact quantum groups. Section 4 has initially been inspirited by the natural question of when A∗ A∗ has a strong identity. Generalizing the concept of SIN locally compact groups, we introduce the notion of SIN locally compact quantum groups. 
We characterize co-amenable SIN quantum groups G in terms of the Banach algebras RM(L1 (G)), Zt (LUC(G)∗R ), and LUC(G)∗R , and the existence of a strong identity of LUC(G)∗ , respectively (Theorem 18), where we note that RM(L1 (G)) = RM cb (L1 (G)) (the completely bounded opposite right multiplier algebra of L1 (G)), since G is co-amenable. In particular, we prove that a quantum group G is co-amenable and SIN if and only if the Banach algebra LUC(G)∗R is unital. It is interesting to compare this result with [3, Theorem 3.1], where Bédos and Tuset showed that a quantum group G is co-amenable if and only if the algebra C0 (G)∗ is unital, which is also shown to be equivalent to LUC(G)∗ being unital (Theorem 15). The characterizations of SIN quantum groups obtained in this section are original even for locally compact groups. In the course of this investigation, we also obtain some other new characterizations of SIN-groups (Theorem 19). Results in this section illustrate that the SIN property is intrinsically related to topological centre problems. Section 5 is devoted to the study of interrelationships between strong Arens irregularity and quotient strong Arens irregularity. First, using the connection between the Banach algebras RM(A) and Zt (A∗ A∗ ), we extend a characterization of unital Banach algebras by Lau and Ülger [28] to the larger class of Banach algebras introduced recently by the authors [16] (Theorem 22). We prove a characterization of left quotient strong Arens irregularity in terms of Zt (A∗∗ , ) (Theorem 23). Then, for the class of Banach algebras studied in [16], we give sufficient conditions to ensure the equivalence between (left, respectively, right) strong Arens irregularity and (left, respectively, right) quotient strong Arens irregularity. For involutive Banach algebras A from this class, we obtain some criteria for determining when the quotient strong Arens irregularity of A implies the strong Arens irregularity of A (Theorem 29). 
We prove that if the left quotient strong Arens irregularity of A is strengthened by replacing RM(A) with the multiplier algebra M(A), then the left strong Arens irregularity of A can be determined by testing elements of Zt (A∗∗ , ) against one particular element of A∗∗ (Theorem 30). In this situation, we have one test point in A∗∗ to characterize A inside Zt (A∗∗ , ). In this context, note that
in [7, Definition 12.3], Dales, Lau, and Strauss introduced the concept of a dtc set in $A^{**}$ (standing for “determining the topological centre”) to characterize $A$ inside $A^{**}$ through a topological centre condition. We close Section 5 with characterizations of quotient strong Arens irregularity for all quantum group algebras as well as a characterization of quantum groups $G$ satisfying $Z_t(LUC(G)^*_R) = RM(L_1(G))$ (Theorems 32 and 33). The study in this section shows the complex nature of topological centre problems in certain aspects, noticing that Banach algebras like $RM(A)$, $LM(A)$, $Z_t(\langle A^*A\rangle^*)$, $Z_t(\langle AA^*\rangle^*)$, $Z_t(A^{**}, \Box)$, and $Z_t(A^{**}, \Diamond)$ all play a rôle even when only a one-sided topological centre is considered. The paper concludes in Section 6 with some examples of Arens irregular Banach algebras, complementing some assertions in Section 5. Some open questions in Lau and Ülger [28] are answered there in the negative.

The authors are grateful to the referee for valuable suggestions.

2. Preliminaries

Let $B$ be an algebra equipped with a topology $\sigma$ such that $B$ is a topological linear space. Assume that $B$ is also a right topological semigroup under the multiplication. That is, for any fixed $y \in B$, the map $x \mapsto xy$ is continuous on $B$ (cf. Berglund, Junghenn and Milnes [4]). The topological centre $Z_t(B)$ of $B$ is defined to be the set of $y \in B$ such that the map $x \mapsto yx$ is continuous on $B$. If $B$ is a left topological semigroup under the multiplication, the topological centre $Z_t(B)$ of $B$ is defined analogously. In the rest of the paper, if $B$ is a subspace of a given dual Banach space, the topology $\sigma$ on $B$ is taken to be the relative weak$^*$-topology.

Throughout this paper, $A$ denotes a Banach algebra with a faithful multiplication; that is, for any $a \in A$, we have $a = 0$ if $aA = \{0\}$ or $Aa = \{0\}$. We use $\langle A^*A\rangle$ and $\langle AA^*\rangle$ to denote the closed linear spans of the module products $A^*A$ and $AA^*$, respectively.
By Cohen’s factorization theorem, $\langle A^*A\rangle = A^*A$ if $A$ has a bounded right approximate identity (BRAI), and $\langle AA^*\rangle = AA^*$ if $A$ has a bounded left approximate identity (BLAI).

For any fixed $m \in A^{**}$, the maps $n \mapsto n \,\Box\, m$ and $n \mapsto m \,\Diamond\, n$ are weak$^*$–weak$^*$ continuous on $A^{**}$. Then with the weak$^*$-topology, $(A^{**}, \Box)$ is a right topological semigroup and $(A^{**}, \Diamond)$ is a left topological semigroup. Their topological centres
\[ Z_t(A^{**}, \Box) = \{ m \in A^{**} : \text{the map } n \mapsto m \,\Box\, n \text{ is weak}^*\text{–weak}^* \text{ continuous on } A^{**} \} \]
and
\[ Z_t(A^{**}, \Diamond) = \{ m \in A^{**} : \text{the map } n \mapsto n \,\Diamond\, m \text{ is weak}^*\text{–weak}^* \text{ continuous on } A^{**} \} \]
are called the left and the right topological centres of $A^{**}$, respectively. It is seen that
\[ Z_t(A^{**}, \Box) = \{ m \in A^{**} : m \,\Box\, n = m \,\Diamond\, n \text{ for all } n \in A^{**} \}, \]
and
\[ Z_t(A^{**}, \Diamond) = \{ m \in A^{**} : n \,\Diamond\, m = n \,\Box\, m \text{ for all } n \in A^{**} \}. \]
Therefore, $A$ is Arens regular if and only if $Z_t(A^{**}, \Box) = Z_t(A^{**}, \Diamond) = A^{**}$. Clearly, $A \subseteq Z_t(A^{**}, \Box) \cap Z_t(A^{**}, \Diamond)$. The algebra $A$ is left strongly Arens irregular (LSAI) if $Z_t(A^{**}, \Box) = A$, right strongly Arens irregular (RSAI) if $Z_t(A^{**}, \Diamond) = A$, and strongly Arens
irregular if A is both LSAI and RSAI (cf. Dales and Lau [6]). In contrast to the situation for Arens regularity, there are LSAI Banach algebras which are not RSAI (cf. Section 6). The Banach space A∗ A is a closed A-submodule of A∗ . It is also left introverted in A∗ ; that is, m x ∈ A∗ A for all x ∈ A∗ A and m ∈ A∗ A∗ , where m x = m x for any extension m ∈ A∗∗ of m. This is equivalent to that A∗ A is a left (A∗∗ , )-submodule of A∗ (cf. Dales and Lau [6, Proposition 5.2]). Then A∗ A∗ is a Banach algebra under the multiplication defined by m n, x = m, n x x ∈ A∗ A, m, n ∈ A∗ A∗ . The multiplication on A∗ A∗ is induced by the left Arens product on A∗∗ . That is, if m, n ∈ A∗ A∗ and m , n ∈ A∗∗ are extensions of m, n to A∗ , respectively, then m n is an extension ∗ of m n to A . In fact, for x ∈ A∗ , n ∈ A∗ A∗ , and w ∈ A∗∗ , n x ∈ A∗ and w n ∈ A∗∗ can be defined analogously. Then m n = m n for all m, n ∈ A∗ A∗ . The canonical quotient map ∗∗ ∗ ∗ π : A −→ A A gives the isometric algebra isomorphism ∗ ∗ A A , ∼ = (A∗∗ , )/A∗ A⊥ , where A∗ A⊥ = {m ∈ A∗∗ : m|A∗ A = 0} is a closed ideal in (A∗∗ , ). For any fixed m ∈ A∗ A∗ , the map n −→ n m is weak∗ –weak∗ continuous on A∗ A∗ . Hence, (A∗ A∗ , ) with the weak∗ -topology is a right topological semigroup, and its topological centre is given by Zt A∗ A∗ = m ∈ A∗ A∗ : n −→ m n is weak∗ –weak∗ continuous on A∗ A∗ . We say that A is left quotient Arens regular if Zt (A∗ A∗ ) = A∗ A∗ . Since A∗ A is a left A-module, for x ∈ A∗ A and m ∈ A∗ A∗ , x ♦ m ∈ A∗ is independent ∗ of the choice of an extension m ∈ A of m. We denote x ♦ m by x ♦ m. However, when A∗ A ∗ ∗ is not right introverted in A , x ♦ m may not be in A A. Let A∗ A∗R = m ∈ A∗ A∗ : A∗ A ♦ m ⊆ A∗ A . Then A∗ A∗R is a norm closed subspace of A∗ A∗ . For m ∈ A∗ A∗R and n ∈ A∗ A∗ , we define m ♦ n ∈ A∗ A∗ by x, m ♦ n = x ♦ m, n x ∈ A∗ A , and we see that m ♦ n is an extension of m ♦ n to A∗ . 
It is also easy to see that for all m, n ∈ A∗ A∗R , and p ∈ A∗ A∗ , we have m ♦ n ∈ A∗ A∗R as well as
m ♦ n m
n and m ♦ (n ♦ p) = (m ♦ n) ♦ p. In particular, (A∗ A∗R , ♦) is a Banach algebra. It is evident that A∗ A⊥ is a right ideal in (A∗∗ , ♦). It is seen that A∗ A⊥ is a two-sided ideal in (A∗∗ , ♦) if and only if A∗ A is two-sided introverted in A∗ (cf. Dales and Lau [6, Proposition 5.4] for the “only if” part). Let ∗ ∗ −1 A AR = m ∈ A∗∗ : A∗ A ♦ m ⊆ A∗ A . A∗∗ R =π
∗∗ ∗∗ ∗ ⊥ Then A∗∗ R is a closed subalgebra of (A , ♦), A A is a closed two-sided ideal in (AR , ♦), induces the isometric algebra isomorphism and π|A∗∗ R
∗ ∗ ∗∗ A AR , ♦ ∼ = AR , ♦ /A∗ A⊥ . ∗∗ ∗ Moreover, A∗∗ R is the largest closed subalgebra B of (A , ♦) such that A A is a right B∗ ∗ ⊥ submodule of A and A A is a closed ideal in B. With the relative weak∗ -topology, (A∗ A∗R , ♦) is a left topological semigroup. By definition, Zt (A∗ A∗R ) is the set of m ∈ A∗ A∗R such that the map n −→ n ♦ m is continuous on A∗ A∗R with respect to the relative σ (A∗ A∗ , A∗ A)-topology. Since the multiplication in A is faithful, the map A −→ A∗ A∗R , a −→ a˙ = a|A∗ A is injective. Note that A∗ A∗ is also an A-module. It is easy to see that for all a ∈ A and n ∈ A∗ A∗ , a · n = a˙ n = a˙ ♦ n. In the sequel, a · n will denote both a˙ n ∈ A∗ A∗ and a n ∈ A∗∗ . It is obvious that A ⊆ Zt (A∗ A∗ ) ∩ Zt (A∗ A∗R ). Since A ⊆ A∗ A∗ is weak∗ -dense, we have
Zt A∗ A∗R = m ∈ A∗ A∗R : n ♦ m = n m for all n ∈ A∗ A∗R , and Zt (A∗ A∗R ) is a norm closed subalgebra of (A∗ A∗R , ♦) and of (A∗ A∗ , ). ∗ ∗∗ ∗ ∗ ∗ ∗ Note that A∗∗ R and A AR both contain A. Therefore, AR (respectively, A AR ) is weak ∗∗ ∗ ∗∗ ∗ ∗ ∗∗ ∗ closed in A (respectively, in A A ) if and only if AR = A (respectively, A AR = ∗ ∗ ∗ ∗ A∗ A∗ ). On the other hand, it is clear that A∗∗ = A∗∗ R if and only if A A = A AR if and ∗ ∗ only if A A is introverted in A . In this situation, there are two Arens products on A∗ A∗ so that A∗ A∗ has two topological centres: the usual topological centre is now given by Zt A∗ A∗ = m ∈ A∗ A∗ : m n = m ♦ n for all n ∈ A∗ A∗ , and the other topological centre with respect to ♦ is just Zt (A∗ A∗R ). Analogously, AA∗ is an A-module and is right introverted in A∗ . As in the A∗ A case, one can define the Banach algebras (AA∗ ∗ , ♦), ( L AA∗ ∗ , ), and ( L A∗∗ , ), and consider the topological centres Zt (AA∗ ∗ ) and Zt ( L AA∗ ∗ ). The right quotient Arens regularity can be defined similarly. We point out that for a norm closed left (respectively, right) introverted A-submodule X of A∗ , ∗ , ♦) (respectively, (X ∗ , ♦) and ( X ∗ , )), and their topothe Banach algebras (X ∗ , ) and (XR L logical centres can also be defined. In the present paper, however, we will focus on the case where X is either A∗ A or AA∗ . See Dales and Lau [6] for more information on topological centres Zt (X ∗ , ) and Zt (X ∗ , ♦). The reader is also referred to Grosser [12] for a systematic study of left (respectively, right) Banach modules of the form V ∗ A (respectively, AV ∗ ), where V is a left (respectively, right) Banach A-module. It is well known that if A is the group algebra L1 (G) of a locally compact group G, then A∗ A = LUC(G) (respectively, AA∗ = RUC(G)), the C ∗ -algebra of bounded left (respectively, right) uniformly continuous functions on G. 
The space LUC(G) ∩ RUC(G) is denoted by U C(G), which is the C ∗ -algebra of bounded uniformly continuous functions on G. Let LM(A) and RM(A) be the left and the opposite right multiplier algebras of A, respectively. That is, LM(A) = T ∈ B(A): T (ab) = T (a)b for all a, b ∈ A ,
and RM(A) = T ∈ B(A)op : T (ab) = aT (b) for all a, b ∈ A , where B(A) is the Banach algebra of bounded linear operators on A. As norm closed subalgebras of B(A) and B(A)op , respectively, LM(A) and RM(A) are Banach algebras. Let M(A) denote the multiplier algebra of A, consisting of (μl , μr ) ∈ LM(A) × RM(A) satisfying aμl (b) = μr (a)b (a, b ∈ A). If A has a BRAI (respectively, BLAI), then A can naturally be identified with a norm closed left (respectively, right) ideal in LM(A) (respectively, in RM(A)). Any Banach algebra with a bounded approximate identity (BAI) has a faithful multiplication. Some forms of the following lemma are known (cf. Dales [5, Theorem 2.9.49(iii)] and Lau and Ülger [28, Theorem 4.4]). For convenience, we include a complete proof here. We state only the A∗ A-version of these results. Lemma 1. Let A be a Banach algebra with a BRAI and E be a weak∗ -cluster point in A∗∗ of a BRAI of A. (i) The map RM(A) −→ (A∗∗ , ), μ −→ μ∗∗ (E) is an injective algebra homomorphism with ∗∗ range contained in A∗∗ R ∩ (E A ). ∗ ∗ (ii) (A A , ) has an identity, and the map μ −→ μ = μ∗∗ (E)|A∗ A is a unital injective algebra homomorphism from RM(A) into Zt (A∗ A∗ ) satisfying μ , f · a = μ(a), f (f ∈ A∗ , a ∈ A). Proof. It is easy to see that E is a right identity of (A∗∗ , ) satisfying n μ∗∗ (E) = μ∗∗ (n)
and n ♦ μ∗∗ (E) = μ∗∗ (n ♦ E)
μ ∈ RM(A), n ∈ A∗∗ .
(†)
(i) Let μ, ν ∈ RM(A), a ∈ A, and f ∈ A∗ . By (†), we have μ∗∗ (ν ∗∗ (E)) = ν ∗∗ (E) μ∗∗ (E), (f · a) ♦ μ∗∗ (E) = f · μ(a), and μ∗∗ (E) = E μ∗∗ (E). Then the assertion follows. (ii) Let μ ∈ RM(A). Note that a · μ = a · μ∗∗ (E) = μ(a) ∈ A for all a ∈ A. Then, for all a ∈ A, f ∈ A∗ , and p ∈ A∗ A∗ , we have μ , f · a = μ∗∗ (E), f · a = μ(a), f
and μ p, f · a = p, f · μ(a) .
Thus ν −→ ν maps RM(A) injectively into Zt (A∗ A∗ ). Since ν −→ ν is the composition of the map in (i) and the canonical quotient map (A∗∗ , ) −→ (A∗ A∗ , ), it is an algebra homomorphism. Therefore, (A∗ A∗ , ) has an identity by [13, Theorem 4(i)]. Finally, it is easy to see that the map RM(A) −→ (A∗ A∗ , ), μ −→ μ is unital. 2 Lemma 1 modifies Dales [5, Theorem 2.9.49(iii)] and Lau and Ülger [28, Theorem 4.4], where A was assumed to have a BAI of norm 1, and an isometric embedding from M(A) into (A∗∗ , ), respectively, from RM(A) into Zt (A∗ A∗ ), was obtained. We note that in Lemma 1(i), though μ −→ μ∗∗ (E) maps RM(A) into A∗∗ R , it is not an algebra homomorphism from RM(A) to (A∗∗ , ♦) in general. R
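3. Topological centres of quotient algebras

Theorem 2. Let $A$ be a Banach algebra.
(i) $Z_t(\langle A^*A\rangle^*) = \{ m \in \langle A^*A\rangle^*_R : m \,\Box\, n = m \,\Diamond\, n \text{ for all } n \in \langle A^*A\rangle^* \}$.
(ii) $Z_t(\langle AA^*\rangle^*) = \{ m \in {}_L\langle AA^*\rangle^* : n \,\Diamond\, m = n \,\Box\, m \text{ for all } n \in \langle AA^*\rangle^* \}$.

Proof. We prove (i); the proof of (ii) follows from similar arguments. Obviously, if $m \in \langle A^*A\rangle^*_R$ and $m \,\Box\, n = m \,\Diamond\, n$ for all $n \in \langle A^*A\rangle^*$, then the map $n \mapsto m \,\Box\, n$ is weak$^*$–weak$^*$ continuous on $\langle A^*A\rangle^*$; that is, $m \in Z_t(\langle A^*A\rangle^*)$. Conversely, suppose that $m \in Z_t(\langle A^*A\rangle^*)$. Let $a \in A$, $n \in A^{**}$, and $p = n|_{\langle A^*A\rangle}$. Then $\langle (a \cdot m) \,\Box\, n, f \rangle = \langle m \,\Box\, p, f \cdot a \rangle$ for all $f \in A^*$. It follows that $a \cdot m \in Z_t(A^{**}, \Box)$. Hence,
\[ \text{if } \mu \in Z_t(\langle A^*A\rangle^*), \text{ then } A \cdot \mu \subseteq Z_t(A^{**}, \Box). \tag{‡} \]
It is known that
\[ A^* \,\Diamond\, Z_t(A^{**}, \Box) \subseteq \langle A^*A\rangle \quad \text{and} \quad Z_t(A^{**}, \Diamond) \,\Box\, A^* \subseteq \langle AA^*\rangle. \]
()
(Cf. Dales and Lau [6, Proposition 2.20] and Lau and Ülger [28, Lemma 3.1a)].) By the assertions (‡) and (), we have
\[ (A^*A) \,\Diamond\, m \subseteq A^* \,\Diamond\, (A \cdot m) \subseteq A^* \,\Diamond\, Z_t(A^{**}, \Box) \subseteq \langle A^*A\rangle. \]
Therefore, $m \in \langle A^*A\rangle^*_R$. Since $A$ is weak$^*$-dense in $\langle A^*A\rangle^*$, we have $m \,\Box\, n = m \,\Diamond\, n$ for all $n \in \langle A^*A\rangle^*$. □

It follows from Theorem 2 that $Z_t(\langle A^*A\rangle^*)$ is a subalgebra of $(\langle A^*A\rangle^*, \Box)$ and of $(\langle A^*A\rangle^*_R, \Diamond)$, and $Z_t(\langle AA^*\rangle^*)$ is a subalgebra of $(\langle AA^*\rangle^*, \Diamond)$ and of $({}_L\langle AA^*\rangle^*, \Box)$. Also, it is seen from () that
\[ Z_t(A^{**}, \Box) = \{ m \in A^{**}_R : m \,\Box\, n = m \,\Diamond\, n \text{ for all } n \in A^{**} \}, \]
and
\[ Z_t(A^{**}, \Diamond) = \{ m \in {}_L A^{**} : n \,\Diamond\, m = n \,\Box\, m \text{ for all } n \in A^{**} \}. \]
Therefore, Theorem 2 shows that these descriptions of $Z_t(A^{**}, \Box)$ and $Z_t(A^{**}, \Diamond)$ have their analogues for $Z_t(\langle A^*A\rangle^*)$ and $Z_t(\langle AA^*\rangle^*)$, respectively.

The corollary below generalizes Lau and Ülger [28, Lemma 3.1c)], where they assumed that $A$ has a BAI.

Corollary 3. Let $A$ be a Banach algebra.
(i) For $\mu \in \langle A^*A\rangle^*$, $\mu \in Z_t(\langle A^*A\rangle^*)$ if and only if $A \cdot \mu \subseteq Z_t(A^{**}, \Box)$.
(ii) For $\mu \in \langle AA^*\rangle^*$, $\mu \in Z_t(\langle AA^*\rangle^*)$ if and only if $\mu \cdot A \subseteq Z_t(A^{**}, \Diamond)$.
Consequently, the canonical quotient algebra homomorphisms $(A^{**}_R, \Diamond) \to (\langle A^*A\rangle^*_R, \Diamond)$ and $(A^{**}, \Box) \to (\langle A^*A\rangle^*, \Box)$ both map $Z_t(A^{**}, \Box)$ into $Z_t(\langle A^*A\rangle^*)$.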
Proof. We prove (i); the proof of (ii) is similar. Let μ ∈ A∗ A∗ . By assertion (‡) above, we suppose that A · μ ⊆ Zt (A∗∗ , ) and show that μ ∈ Zt (A∗ A∗ ). First, for all f ∈ A∗ and a ∈ A, from assertion () above, we have (f · a) ♦ μ = f ♦ (a · μ) ∈ ∗ n ∈ A∗∗ be an extension of n. Then A A. Thus μ ∈ A∗ A∗R . Next, let n ∈ A∗ A∗ and μ n, f · a = (a · μ) n, f = (a · μ) ♦ n, f = f · a, μ ♦ n (f ∈ A∗ , a ∈ A). Therefore, μ n = μ ♦ n for all n ∈ A∗ A∗ , and hence μ ∈ Zt (A∗ A∗ ) by Theorem 2(i). The final assertion follows from Theorem 2(i) and the surjectivity of the canonical quotient map A∗∗ −→ A∗ A∗ . 2 By Corollary 3, we have immediately the following result, which will be needed in Section 5. The assertion (ii) below extends [28, Corollary 3.2], where Lau and Ülger showed that A · Zt (A∗∗ , ) = A · Zt (A∗ A∗ ) if A has a BAI (see also Dales and Lau [6, Theorem 5.12]). We note that the condition A2 = A in (i) below is satisfied by all quantum group algebras L1 (G) (cf. Fact 1 in Section 4). Corollary 4. Let A be a Banach algebra. (i) If A2 = A, then A · Zt (A∗∗ , ) ⊆ A if and only if A · Zt (A∗ A∗ ) ⊆ A. (ii) If A factors (that is, A2 = A), in particular, if A has a BLAI or a BRAI, then A · Zt (A∗∗ , ) = A · Zt (A∗ A∗ ). Proof. (i) Note that A · Zt (A∗∗ , ) ⊆ A · Zt (A∗ A∗ ) since a · m = a · p for all a ∈ A and m ∈ A∗∗ with p = m|A∗ A , and p ∈ Zt (A∗ A∗ ) if m ∈ Zt (A∗∗ , ). Then the assertion follows from Corollary 3. (ii) If A2 = A, then, combining Corollary 3(i) with the inclusion above, we have A · Zt A∗ A∗ = A2 · Zt A∗ A∗ ⊆ A · Zt (A∗∗ , ) ⊆ A · Zt A∗ A∗ . Therefore, A · Zt (A∗∗ , ) = A · Zt (A∗ A∗ ).
□
Let G be a locally compact group. For m ∈ LUC(G)∗ and f ∈ LUC(G), the bounded complex-valued function mr(f) on G is given by mr(f)(s) = ⟨m, fs⟩ (s ∈ G), where fs is the right translate of f by s. If mr(f) ∈ LUC(G) for all f ∈ LUC(G), then, for each n ∈ LUC(G)∗, the product m ∗ n ∈ LUC(G)∗ is defined by ⟨f, m ∗ n⟩ = ⟨mr(f), n⟩ (f ∈ LUC(G)). (Cf. Berglund, Junghenn and Milnes [4, Definition 2.2.8] and Lau [25].) We note here that for m ∈ L∞(G)∗ and f ∈ L∞(G), the function s −→ ⟨m, fs⟩ may not even be measurable on G (cf. Rudin [39], Talagrand [43], and Wells [46]). Following the notation used in [4], we let ZU(G) = {m ∈ LUC(G)∗ : mr(f) ∈ LUC(G) for all f ∈ LUC(G)}. Then (ZU(G), ∗) is a Banach algebra (cf. [4, Lemma 2.2.9]). We use LUC_ℓ∞(G) to denote LUC(G) when it is considered as a subspace of ℓ∞(G). Obviously, LUC_ℓ∞(G) is a closed ℓ1(G)-submodule of ℓ∞(G). Also, LUC_ℓ∞(G) is left introverted
in ℓ∞(G) (cf. Dales and Lau [6, Theorem 7.19]). It is easy to see that for f ∈ LUC_ℓ∞(G) and s ∈ G, we have δs · f = fs, where δs denotes the point mass at s. Let ♦_ℓ1 denote the right Arens product on ℓ1(G)∗∗. Then, for all s ∈ G, m ∈ LUC_ℓ∞(G)∗, and f ∈ LUC_ℓ∞(G), we have (f ♦_ℓ1 m)(s) = ⟨δs, f ♦_ℓ1 m⟩ = ⟨δs · f, m⟩ = ⟨m, fs⟩ = mr(f)(s). It follows that
mr(f) = f ♦_ℓ1 m and q ∗ m = q ♦_ℓ1 m (f ∈ LUC(G), m ∈ LUC(G)∗, q ∈ ZU(G)).
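The identity δs · f = fs and the resulting formula for mr(f) can be checked by hand on a finite group, where ℓ1(G) is simply the space of all functions on G under convolution and every functional is given by a vector. The sketch below does this for the symmetric group S3. It assumes the usual dual module action ⟨a · f, b⟩ = ⟨f, b ∗ a⟩ and the convention fs(t) = f(ts) for the right translate; all function names are illustrative and not taken from the paper.

```python
from itertools import permutations

# The six elements of S_3, stored as tuples p with p[i] = image of i.
G = list(permutations(range(3)))
INDEX = {g: i for i, g in enumerate(G)}

def mul(p, q):
    """Group product pq, acting as (pq)(i) = p(q(i))."""
    return tuple(p[q[i]] for i in range(3))

def delta(s):
    """Point mass at s, as a vector indexed like G."""
    return [1 if g == s else 0 for g in G]

def convolve(a, b):
    """Convolution on l^1(G): (a * b)(u) sums a(s) b(t) over all st = u."""
    out = [0] * len(G)
    for s in G:
        for t in G:
            out[INDEX[mul(s, t)]] += a[INDEX[s]] * b[INDEX[t]]
    return out

def pair(f, a):
    """Dual pairing <f, a> between a function f and an l^1 element a."""
    return sum(x * y for x, y in zip(f, a))

def act_left(a, f):
    """Module action a . f, with <a . f, b> = <f, b * a> (assumed convention)."""
    return [pair(f, convolve(delta(t), a)) for t in G]

def right_translate(f, s):
    """Right translate f_s(t) = f(ts)."""
    return [f[INDEX[mul(t, s)]] for t in G]

def m_r(m, f):
    """The function s -> <m, f_s>."""
    return [pair(right_translate(f, s), m) for s in G]

def f_diamond_m(f, m):
    """The function s -> <delta_s . f, m>, i.e. the right Arens action of m on f."""
    return [pair(act_left(delta(s), f), m) for s in G]

f = [1, 2, 3, 5, 8, 13]    # an arbitrary function on S_3
m = [2, -1, 0, 4, 1, -3]   # an arbitrary functional, identified with a vector
print(all(act_left(delta(s), f) == right_translate(f, s) for s in G))
print(m_r(m, f) == f_diamond_m(f, m))
```

Both checks print True. In this finite setting every function is uniformly continuous, so the example only illustrates the algebra behind the identities, not the continuity questions that make ZU(G) a proper subset of LUC(G)∗ in general.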
Hence, we have ZU(G) = {m ∈ LUC(G)∗ : f ♦_ℓ1 m ∈ LUC(G) for all f ∈ LUC(G)}, and (ZU(G), ∗) = (LUC_ℓ∞(G)∗R, ♦_ℓ1). Therefore, LUC(G) is two-sided introverted in ℓ∞(G) if and only if ZU(G) = LUC(G)∗, and this is exactly the case when G is an SIN-group (i.e., the identity eG of G has a neighbourhood basis consisting of compact sets invariant under inner automorphisms), which is in turn equivalent to the left and the right uniformities on G coinciding (cf. [4, Theorem 4.4.5] and [15, (4.14g)]). For m ∈ LUC(G)∗ and f ∈ LUC(G), let ml(f)(s) = ⟨m, ₛf⟩ (s ∈ G), where ₛf is the left translate of f by s. According to Lau [23, Lemma 3], m □ f = ml(f), or we can write it as m □ f = m □_ℓ1 f, where □_ℓ1 is the left Arens product on ℓ1(G)∗∗. It follows that
m □ n = m □_ℓ1 n (m, n ∈ LUC(G)∗).
Consequently, we have (LUC(G)∗, □) = (LUC_ℓ∞(G)∗, □_ℓ1) and Zt(LUC(G)∗) = Zt(LUC_ℓ∞(G)∗, □_ℓ1).
See Dales and Lau [6, Theorem 5.15] for the general Banach algebra case. In [25, Lemma 2], Lau proved that Zt(LUC(G)∗) = {m ∈ ZU(G) : m □ n = m ∗ n for all n ∈ LUC(G)∗}. This result was extended by Dales and Lau [6, Proposition 11.6] to all convolution Beurling algebras. Therefore, [25, Lemma 2] indeed characterizes Zt(LUC(G)∗) via the ℓ1(G)-module structure on LUC(G) (or the group action of G on LUC(G)). More precisely, we have Zt(LUC(G)∗) = Zt(LUC_ℓ∞(G)∗, □_ℓ1), and we even have the following description of Zt(LUC(G)∗) in the format of Theorem 2: Zt(LUC(G)∗) = {m ∈ LUC_ℓ∞(G)∗R : m □_ℓ1 n = m ♦_ℓ1 n for all n ∈ LUC_ℓ∞(G)∗}. In Theorem 2(i) with A = L1(G), we consider LUC(G)∗R instead of ZU(G) so that the group action on LUC(G) is replaced by the Banach L1(G)-module action. Hence, Theorem 2 is a Banach algebraic version of [25, Lemma 2]. We point out that, in general, (LUC(G)∗R, ♦) ≠ (ZU(G), ∗); that is, (LUC(G)∗R, ♦) ≠ (LUC_ℓ∞(G)∗R, ♦_ℓ1) (see Theorem 19 in Section 4). Therefore, even for A = L1(G), Theorem 2 does not follow from [25, Lemma 2].
Let X be either ⟨A∗A⟩ or ⟨AA∗⟩. For x ∈ X and m ∈ X∗, we write mL(x) = m □ x and mR(x) = x ♦ m.
When X = ⟨A∗A⟩ and m ∈ X∗, we have mL ∈ B(X) and mR ∈ B(X, A∗), and mR ∈ B(X) if and only if m ∈ X∗R. Similar assertions hold for X = ⟨AA∗⟩. For brevity, for results in the rest of this section, we state only their ⟨A∗A⟩-versions.
Proposition 5. Let A be a Banach algebra. (i) If m ∈ Zt(⟨A∗A⟩∗), then m ∈ ⟨A∗A⟩∗R and mR nL = nL mR for all n ∈ ⟨A∗A⟩∗. (ii) If m ∈ Zt(⟨A∗A⟩∗R), then mL nR = nR mL for all n ∈ ⟨A∗A⟩∗R.
Proof. We prove (i); assertion (ii) can be proved similarly. Let m ∈ Zt(⟨A∗A⟩∗). By Theorem 2(i), m ∈ ⟨A∗A⟩∗R and m □ n = m ♦ n for all n ∈ ⟨A∗A⟩∗. Let n ∈ ⟨A∗A⟩∗. Then, for all x ∈ ⟨A∗A⟩ and a ∈ A, we have
⟨mR nL(x), a⟩ = ⟨a, (n □ x) ♦ m⟩ = ⟨a · (n □ x), m⟩ = ⟨(a · n) □ x, m⟩ = ⟨m □ (a · n), x⟩,
and
⟨nL mR(x), a⟩ = ⟨n □ (x ♦ m), a⟩ = ⟨x ♦ m, a · n⟩ = ⟨x, m ♦ (a · n)⟩ = ⟨m □ (a · n), x⟩.
Therefore, mR nL = nL mR for all n ∈ ⟨A∗A⟩∗.
□
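In finite dimensions, where A∗∗ = A and both Arens products reduce to the product of A, the operators mL and mR become the two module actions x −→ m · x and x −→ x · m on A∗, and the commutation relation of Proposition 5 reduces to the bimodule identity (n · x) · m = n · (x · m). The following sketch checks this for the group algebra of S3. The conventions ⟨a · f, b⟩ = ⟨f, b ∗ a⟩ and ⟨f · a, b⟩ = ⟨f, a ∗ b⟩ are assumed, and the helper names are illustrative.

```python
from itertools import permutations

G = list(permutations(range(3)))          # elements of S_3 as tuples
INDEX = {g: i for i, g in enumerate(G)}

def mul(p, q):
    """Group product pq, acting as (pq)(i) = p(q(i))."""
    return tuple(p[q[i]] for i in range(3))

def delta(s):
    """Point mass at s."""
    return [1 if g == s else 0 for g in G]

def convolve(a, b):
    """Convolution product of the group algebra."""
    out = [0] * len(G)
    for s in G:
        for t in G:
            out[INDEX[mul(s, t)]] += a[INDEX[s]] * b[INDEX[t]]
    return out

def pair(f, a):
    """Dual pairing <f, a>."""
    return sum(x * y for x, y in zip(f, a))

def act_left(a, f):
    """a . f, with <a . f, b> = <f, b * a> (assumed convention)."""
    return [pair(f, convolve(delta(t), a)) for t in G]

def act_right(f, a):
    """f . a, with <f . a, b> = <f, a * b> (assumed convention)."""
    return [pair(f, convolve(a, delta(t))) for t in G]

def m_L(m):
    """Finite-dimensional stand-in for x -> m (left Arens) x, namely x -> m . x."""
    return lambda x: act_left(m, x)

def m_R(m):
    """Finite-dimensional stand-in for x -> x (right Arens) m, namely x -> x . m."""
    return lambda x: act_right(x, m)

m = [1, 0, -2, 3, 0, 1]
n = [0, 4, 1, -1, 2, 0]
x = [1, 2, 3, 5, 8, 13]

lhs = m_R(m)(m_L(n)(x))   # m_R n_L (x) = (n . x) . m
rhs = m_L(n)(m_R(m)(x))   # n_L m_R (x) = n . (x . m)
print(lhs == rhs)
```

The check prints True for any choice of m, n and x, since both sides pair with a to give ⟨x, m ∗ a ∗ n⟩. The content of Proposition 5 lies in the infinite-dimensional case, where mR need not even map X into itself.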
Proposition 5 is motivated by [34, Proposition 1.2.13], where Neufang proved that if G is a locally compact group and m ∈ LUC(G)∗, then
m ∈ Zt(LUC(G)∗) if and only if m ∈ ZU(G) and mr nl = nl mr for all n ∈ LUC(G)∗.
However, we note that for m ∈ LUC(G)∗, in general, mR ≠ mr, though mL = ml. In fact, the converse of Proposition 5 is not true in general. For example, if G is non-SIN and m ∈ LUC(G)∗ is non-zero but vanishing on UC(G), then mR = 0 ≠ mr. In this situation, m ∈ LUC(G)∗R and mR nL = nL mR = 0 for all n ∈ LUC(G)∗, but m ∉ Zt(LUC(G)∗), since Zt(LUC(G)∗) is equal to the measure algebra M(G) of G (cf. Lau [25]). We observe that the identity eG of G defines the identity δeG of (LUC(G)∗, □) and also gives the identity of (ZU(G), ∗). That is, δeG is both an identity of (LUC_ℓ∞(G)∗, □_ℓ1) and an identity of (LUC_ℓ∞(G)∗R, ♦_ℓ1). This fact plays a crucial rôle in the proof of the sufficiency part of [34, Proposition 1.2.13]. We are thus led to introduce the concept below for general Banach algebras.
Definition 6. Let A be a Banach algebra. An element e0 of ⟨A∗A⟩∗R is a strong identity of ⟨A∗A⟩∗ if e0 is a left identity of (⟨A∗A⟩∗, □) and a right identity of (⟨A∗A⟩∗R, ♦). A strong identity of ⟨AA∗⟩∗ is defined similarly.
It is readily seen that if e0 is a strong identity of A∗ A∗ , then e0 is both an identity of (A∗ A∗ , ) and an identity of (A∗ A∗R , ♦). Lemma 7. Let A be a Banach algebra such that A∗ A∗ has a strong identity. Let m ∈ A∗ A∗ . (i) m ∈ Zt (A∗ A∗ ) if and only if m ∈ A∗ A∗R and mR nL = nL mR for all n ∈ A∗ A∗ . (ii) m ∈ Zt (A∗ A∗R ) if and only if mL nR = nR mL for all n ∈ A∗ A∗R . Proof. We prove (i); the proof of (ii) is similar. By Proposition 5, we need only prove the sufficiency part. Suppose that e0 is a strong identity of A∗ A, m ∈ A∗ A∗R , and mR nL = nL mR for all n ∈ ∗ A A∗ . Let x ∈ A∗ A and n ∈ A∗ A∗ . Then (n x) ♦ m = n (x ♦ m). Note that m n, x = m, n x = m ♦ e0 , n x = e0 , (n x) ♦ m = e0 , n (x ♦ m) , and m ♦ n, x = x ♦ m, n = e0 n, x ♦ m = e0 , n (x ♦ m) . Therefore, we have m n = m ♦ n for all n ∈ A∗ A∗ ; so m ∈ Zt (A∗ A∗ ) by Theorem 2(i).
□
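As a sanity check on Definition 6, the unital case can be worked out directly. The derivation below is a sketch and not part of the original text. It assumes that A has an identity e, and it uses the pairing conventions that appear in the proofs above: ⟨m □ n, x⟩ = ⟨m, n □ x⟩, ⟨n □ x, a⟩ = ⟨n, x · a⟩, ⟨m ♦ n, x⟩ = ⟨x ♦ m, n⟩ and ⟨x ♦ m, a⟩ = ⟨a · x, m⟩.

```latex
% Assumption: A has an identity e. Requires amsmath and amssymb.
% Since f = f \cdot e for every f \in A^*, we have \langle A^*A \rangle = A^*,
% hence \langle A^*A \rangle^* = A^{**} and \langle A^*A \rangle^*_R = A^{**}.
\begin{align*}
\langle e \mathbin{\square} n, f \rangle
  &= \langle e, n \mathbin{\square} f \rangle
   = \langle n \mathbin{\square} f, e \rangle
   = \langle n, f \cdot e \rangle
   = \langle n, f \rangle, \\
\langle m \mathbin{\diamond} e, f \rangle
  &= \langle f \mathbin{\diamond} m, e \rangle
   = \langle e \cdot f, m \rangle
   = \langle f, m \rangle
\end{align*}
% for all f \in A^* and m, n \in A^{**}.
```

Thus e is a left identity of (A∗∗, □) and a right identity of (A∗∗, ♦), so in the unital case e itself is a strong identity of ⟨A∗A⟩∗ = A∗∗.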
From Theorem 2(i) and the proof of Proposition 5, it is seen that for m ∈ ⟨A∗A⟩∗R,
mR nL = nL mR for all n ∈ ⟨A∗A⟩∗
⇐⇒ m □ (a · n) = m ♦ (a · n) for all n ∈ ⟨A∗A⟩∗ and a ∈ A
⇐⇒ (m · a) □ n = (m · a) ♦ n for all n ∈ ⟨A∗A⟩∗ and a ∈ A
⇐⇒ m · A ⊆ Zt(⟨A∗A⟩∗).
Therefore, combining these equivalences with Theorem 2 and Lemma 7, we have the following Banach algebraic extension of [34, Proposition 1.2.13].
Corollary 8. Let A be a Banach algebra such that ⟨A∗A⟩∗ has a strong identity. Let m ∈ ⟨A∗A⟩∗. Then the following statements are equivalent.
(i) m ∈ Zt(⟨A∗A⟩∗).
(ii) m ∈ ⟨A∗A⟩∗R and m □ n = m ♦ n for all n ∈ ⟨A∗A⟩∗.
(iii) m ∈ ⟨A∗A⟩∗R and mR nL = nL mR for all n ∈ ⟨A∗A⟩∗.
(iv) m ∈ ⟨A∗A⟩∗R and m □ n = m ♦ n for all n ∈ A · ⟨A∗A⟩∗.
(v) ⟨A∗A⟩ ♦ m ⊆ ⟨A∗A⟩ and m · A ⊆ Zt(⟨A∗A⟩∗).
4. Quotient algebras with a strong identity and SIN quantum groups Let A be a Banach algebra. From the discussions in Section 3, it is natural to consider the question of when A∗ A∗ (respectively, AA∗ ∗ ) has a strong identity. As mentioned earlier, for a locally compact group G, the identity δeG of (LUC(G)∗ , ) is always an identity of (ZU (G), ∗). However, we will see from Theorem 19 below that δeG may not be an identity of (LUC(G)∗R , ♦).
Lemma 9. Let A be a Banach algebra. (i) If (A∗ A∗R , ♦) has a right identity, then A∗ A = AA∗ A. (ii) Assume that A has a BRAI. If A∗ A∗ has a strong identity, then the map μ −→ μ in Lemma 1(ii) maps RM(A) into Zt (A∗ A∗ ) ∩ Zt (A∗ A∗R ). Proof. (i) Suppose that m ∈ A∗ A∗ and m|AA∗ A = 0. Then m · a = 0 for all a ∈ A. Thus m ∈ A∗ A∗R , and m ♦ n = 0 for all n ∈ A∗ A∗ . Therefore, A∗ A = AA∗ A if (A∗ A∗R , ♦) has a right identity. (ii) Let e0 be a strong identity of A∗ A∗ . Then e0 = E|A∗ A , where E is a weak∗ -cluster point in A∗∗ of a BRAI of A. Let μ ∈ RM(A). By Lemma 1(ii), we only have to show that μ = μ∗∗ (E)|A∗ A ∈ Zt (A∗ A∗R ). Let m ∈ A∗ A∗R and m ∈ A∗∗ be an extension of m. By (†) in the proof of Lemma 1, ∗∗ ∗∗ m ♦ μ (E) = μ ( m ♦ E) and m μ∗∗ (E) = μ∗∗ ( m). For all a ∈ A and f ∈ A∗ , we have ♦ E = μ∗ (f ) · a, m ♦ e0 = m, μ∗ (f ) · a , f · a, m ♦ μ∗∗ (E) = μ∗ (f · a), m
and
m), f · a = m , μ∗ (f · a) = m , μ∗ (f ) · a = m, μ∗ (f ) · a . f · a, m μ∗∗ (E) = μ∗∗ (
m μ∗∗ (E))|A∗ A ; that is, It follows that ( m ♦ (μ∗∗ (E))|A∗ A = ( m ♦ μ∗∗ (E) A∗ A = m μ∗∗ (E) A∗ A . Therefore, μ∗∗ (E)|A∗ A ∈ Zt (A∗ A∗R ).
□
Theorem 10. Let A be a Banach algebra. Then the following statements are equivalent. (i) A∗ A∗ has a strong identity. (ii) (A∗ A∗ , ) has an identity contained in Zt (A∗ A∗R ). (iii) (A∗ A∗ , ) has a left identity and A∗ A = AA∗ A. Furthermore, if A satisfies A2 = A, then each of the following statements is equivalent to (i)–(iii). (iv) (A∗ A∗R , ♦) is right unital. (v) A has a BRAI and A∗ A = AA∗ A. Proof. (i) ⇐⇒ (ii). This follows from Definition 6 and the equality Zt A∗ A∗R = m ∈ A∗ A∗R : n ♦ m = n m for all n ∈ A∗ A∗R . (Cf. Section 2.) (i) ⇒ (iii). This holds by Lemma 9(i).
(iii) ⇒ (i). Suppose that A∗ A = AA∗ A, and e0 is a left identity of (A∗ A∗ , ). Then e0 ∈ A∗ A∗R . Let m ∈ A∗ A∗R . Then, for a ∈ A and f ∈ A∗ A, a · f, m ♦ e0 = a · (f ♦ m), e0 = e0 a, f ♦ m = a, f ♦ m = a · f, m. Since A∗ A = AA∗ A, we have m ♦ e0 = m. Therefore, e0 is a right identity of (A∗ A∗R , ♦) and hence a strong identity of A∗ A∗ . Assume that A2 = A. In this case, by [13, Theorem 4], (A∗ A∗ , ) is (right) unital if and only if A has a BRAI. Clearly, (i) ⇒ (iv), and [(i) and (iii)] ⇒ (v) ⇒ (iii). So, we only have to show that (iv) ⇒ (i). (iv) ⇒ (i). Let e be a right identity of (A∗ A∗R , ♦). Since A is weak∗ -dense in A∗ A∗ and A ⊆ A∗ A∗R , we see that e is also a right identity of (A∗ A∗ , ), and hence is an identity of (A∗ A∗ , ) by the paragraph above. Therefore, e is a strong identity of A∗ A∗ . We note here that the word “right” is missing in the conclusion of [13, Theorem 4(ii)], where it should read “A has a BRAI” rather than “A has a BAI”. 2 We consider below some conditions which are closely related to the existence of a strong identity of A∗ A∗ . (0) A is an ideal in A∗∗ . (1) A has a central approximate identity, i.e., A has an approximate identity from the algebraic centre of A. (2) A∗ A = AA∗ . (3) A∗ A = AA∗ A. (4) A∗ A ⊆ AA∗ . (5) A∗ A is introverted in A∗ . (6) AA∗ is introverted in A∗ . Proposition 11. Let A be a Banach algebra. The following assertions hold. (i) AA∗ A = A∗ A ∩ AA∗ in the following two cases: (a) A has a BRAI or a BLAI; (b) A is commutative satisfying A2 = A. (ii) [(0) or (1)] ⇒ [(2) and (3)], [(2) or (3)] ⇒ (4), and (2) ⇒ [(5) and (6)]. (iii) If A2 = A, then [(0) or (1)] ⇒ (2) ⇒ (3) ⇐⇒ (4). (iv) If A is an involutive Banach algebra, then (2) ⇐⇒ (4), and (5) ⇐⇒ (6). (v) If A is an involutive Banach algebra satisfying A2 = A, then
(0) or (1) ⇒ (2) ⇐⇒ (3) ⇐⇒ (4) ⇒ (5) ⇐⇒ (6). (vi) If A is the group algebra L1(G) of a locally compact group G, then (1)–(6) are all equivalent, and each of them is equivalent to G being an SIN-group.
Proof. (i) For case (a), suppose that (eα) is a BRAI of A. Let f ∈ ⟨A∗A⟩ ∩ ⟨AA∗⟩. Then f = ‖·‖-lim f · eα and thus f ∈ ⟨AA∗A⟩. Therefore, ⟨AA∗A⟩ = ⟨A∗A⟩ ∩ ⟨AA∗⟩. When (eα) is a BLAI of A, one need only replace f · eα above with eα · f. The assertion holds for case (b), since ⟨AA∗A⟩ = ⟨A∗A²⟩ = ⟨A∗A⟩ = ⟨AA∗⟩ if A is commutative satisfying ⟨A²⟩ = A.
(ii) This is clearly true. (iii) Assume that A2 = A. Suppose that A∗ A ⊆ AA∗ . Then A∗ A2 ⊆ A∗ AA ⊆ AA∗ A ⊆ AA∗ A ⊆ A∗ A. So, A∗ A = A∗ A2 = AA∗ A. Therefore, (4) ⇒ (3), and (2) ⇒ (3) ⇐⇒ (4) by (ii). (iv) Suppose that A is an involutive Banach algebra. Note that, in general, the involution on A cannot be extended to an involution on A∗∗ with either Arens product (cf. [9]). For each m ∈ A∗∗ , an element m∗ ∈ A∗∗ can be defined by m∗ (f ) = m(f ∗ ) (f ∈ A∗ ), where f ∗ ∈ A∗ is given by f ∗ (a) = f (a ∗ ) (a ∈ A). It is easy to see that (m f )∗ = f ∗ ♦ m∗ and (f ♦ m)∗ = m∗ f ∗ (m ∈ A∗∗ , f ∈ A∗ ). Note that f −→ f ∗ maps A∗ A onto AA∗ , and AA∗ onto A∗ A. Therefore, we have (2) ⇐⇒ (4). Assume that A∗ A is introverted in A∗ . Let p ∈ AA∗ ∗ and f ∈ AA∗ . Let m ∈ A∗∗ be any extension of p. Then p f = m f = (f ∗ ♦ m∗ )∗ ∈ AA∗ , since f ∗ ∈ A∗ A and f ∗ ♦ m∗ ∈ A∗ A. Thus AA∗ is introverted in A∗ . Therefore, we have (5) ⇒ (6). Similarly, we have (6) ⇒ (5). (v) This follows from (ii)–(iv). (vi) Let G be a locally compact group and A = L1 (G). It is known that condition (1) is equivalent to G being an SIN-group (cf. [33, Proposition]). By (v), we only have to show that G is an SIN-group if condition (6) is satisfied. Assume that RUC(G) is left introverted in L∞ (G). Applying [6, Theorem 5.15] with X = RUC(G) and B = 1 (G), we see that RUC(G) is also left introverted in ∞ (G). It follows from [4, Theorem 4.4.5] and [31, Theorem 2] that G is an SIN-group. 2 Remark 12. Let A = A(F2 ), where F2 is the free group with two generators. Then A is commutative satisfying A2 = A (cf. Fact 1 below). Due to Proposition 11, we have A∗ A = AA∗ A and A∗ A∗R = A∗ A∗ . But (A∗ A∗R , ♦) does not have a right identity by Theorem 10. Therefore, the converse of Lemma 9(i) is not true. It is possible that a Banach algebra has a (not necessarily bounded) central approximate identity, but it has no bounded approximate identity. 
For example, as shown by De Cannière and Haagerup [8], the Fourier algebra A(F2 ) is weakly amenable. That is, A(F2 ) has a (central) approximate identity bounded with respect to the cb-multiplier norm on Mcb A(F2 ) (the completely bounded multiplier algebra of A(F2 )). However, A(F2 ) has no bounded approximate identity, since F2 is a non-amenable group. We note that there are some groups which have even weaker approximation properties than weak amenability. For instance, it was shown by Haagerup and Kraus [14] that the semi-direct product G = Z2 ρ SL(2, Z) is not weakly amenable (where ρ is the standard action of SL(2, Z) on Z2 ), but it has the AP, i.e., there exists a net in A(G) converging to 1G in the weak∗ topology on Mcb A(G). As shown by Losert [30, Proposition 2], span{A(G)2 } = A(G) if and only if G is amenable. However, all A(G) satisfy A2 = A. This condition is indeed satisfied by all quantum group algebras as stated below in Fact 1. Before seeing this, we recall briefly the notion of locally compact quantum groups.
Let G = (M, Γ, ϕ, ψ) be a von Neumann algebraic locally compact quantum group in the sense of Kustermans and Vaes [21,22]. By definition, (M, Γ ) is a Hopf–von Neumann algebra, ϕ is a normal semifinite faithful left invariant weight on (M, Γ ), and ψ is a normal semifinite faithful right invariant weight on (M, Γ ). Since the co-multiplication Γ is a normal isometric unital ∗ -homomorphism from M into M ⊗M, ¯ M∗ −→ M∗ it is well known that its pre-adjoint Γ∗ : M∗ ⊗ induces an associative completely contractive multiplication on M∗ (cf. Ruan [37,38]). Here, ¯ denotes the von Neumann algebra tensor product, and ⊗ denotes the operator space projective ⊗ tensor product. In the two classical cases where M∗ is L1 (G) or A(G), is the usual convolution on L1 (G) and the pointwise multiplication on A(G), respectively. The reader is referred to Kustermans and Vaes [21,22] and van Daele [45] for more information on locally compact quantum groups. Following the locally compact group case, the von Neumann algebra M is written as L∞ (G), and the Banach algebra M∗ equipped with the multiplication is denoted by L1 (G). It is known that the quantum group algebra L1 (G) is an involutive Banach algebra with a faithful multiplication (cf. [16]). The locally compact quantum group G is called co-amenable if L1 (G) has a BAI. It turns out that G is co-amenable if and only if L1 (G) has a BRAI if and only if L1 (G) has a BLAI. We showed in [16, Theorem 2] that G is co-amenable if and only if L1 (G) has a BAI consisting of normal states on L∞ (G). L1 (G) −→ L1 (G) is a complete quotient map, we Since the multiplication map Γ∗ : L1 (G)⊗ have the following Fact 1. All quantum group algebras L1 (G) satisfy L1 (G)2 = L1 (G). Therefore, by Proposition 11(v), conditions (2)–(4) are equivalent for all quantum group algebras L1 (G). The Banach L1 (G)-modules RUC(G) and LUC(G) are defined to be L1 (G) L∞ (G) and L∞ (G) L1 (G), respectively, and U C(G) denotes LUC(G) ∩ RUC(G) (cf. 
Hu, Neufang and Ruan [16] and Runde [41]). It turns out that RUC(G) and LUC(G) are closed operator systems in L∞ (G) (cf. [41, Theorem 2.3]). Obviously, they are just the usual spaces LUC(G) and RUC(G) if L1 (G) is the group algebra L1 (G) of a locally compact group G. By Proposition 11(i), U C(G) = L1 (G) L∞ (G) L1 (G) if either G is co-amenable, or G is co-commutative which is precisely the case when L1 (G) = A(G) for some locally compact group G. Recall that a locally compact group G is SIN if and only if LUC(G) = RUC(G) (cf. Milnes [31]). Definition 13. Let A be a Banach algebra and X be a Banach A-module. We say that the Banach A-module action on X is SIN if A · X = X · A. A locally compact quantum group G is called an SIN quantum group if the canonical Banach L1 (G)-module action on L∞ (G) is SIN. Note that the AA∗ -version of Theorem 10 holds. Therefore, we have the following immediate corollary of Theorem 10 and Proposition 11(v). Corollary 14. Let A be an involutive Banach algebra satisfying A2 = A. Let X be either A∗ A or AA∗ . Then X ∗ has a strong identity if and only if A has a BAI and the canonical Banach A-module action on A∗ is SIN.
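For orientation, the SIN condition of Definition 13 holds automatically for X = A∗ when A is commutative. The short computation below is a sketch under the assumed standard dual actions ⟨a · f, b⟩ = ⟨f, ba⟩ and ⟨f · a, b⟩ = ⟨f, ab⟩; it is offered as an illustration and is not taken from the paper.

```latex
% Assumption: A is commutative, f \in A^*, and a, b \in A. Requires amsmath.
\begin{align*}
\langle a \cdot f, b \rangle
  = \langle f, ba \rangle
  = \langle f, ab \rangle
  = \langle f \cdot a, b \rangle
  \qquad (b \in A),
\end{align*}
% so a \cdot f = f \cdot a for all a and f, and therefore A \cdot A^* = A^* \cdot A.
```

In particular, the canonical module action is SIN whenever the algebra in question is a Fourier algebra A(G), which is commutative under pointwise multiplication.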
For a Banach algebra A, let BA(A∗) be the Banach algebra of bounded right A-module homomorphisms on A∗, and BAσ(A∗) be the normal part of BA(A∗) (i.e., consisting of all elements of BA(A∗) which are weak∗–weak∗ continuous). Note that if E ∈ A∗∗ is a weak∗-cluster point of a BRAI of A, then T = T∗(E)L for all T ∈ BA(A∗). It can be seen that we have the following
Fact 2. Let A be a Banach algebra. (i) The map RM(A) −→ BAσ(A∗), μ −→ μ∗ is an isometric algebra isomorphism. (ii) The map (⟨A∗A⟩∗, □) −→ BA(A∗), m −→ mL is an injective contractive algebra homomorphism, and is a surjective isometry if A has a BRAI bounded by 1.
In the sequel, RM(A) and ⟨A∗A⟩∗ will be compared via their canonical images in BA(A∗). Due to Lemma 1(ii), we have RM(A) ⊆ Zt(⟨A∗A⟩∗) if A has a BRAI. The opposite inclusion is clearly equivalent to A · Zt(⟨A∗A⟩∗) ⊆ A. Therefore, we have
Fact 3. Zt(⟨A∗A⟩∗) ⊆ RM(A) if and only if A · Zt(⟨A∗A⟩∗) ⊆ A.
As a consequence of Facts 1, 2, and Grosser and Losert [13, Theorem 4], we obtain the following characterizations of co-amenable locally compact quantum groups G in terms of LUC(G)∗.
Theorem 15. Let G be a locally compact quantum group. Then the following statements are equivalent.
(i) G is co-amenable.
(ii) LUC(G)∗ ≅ BL1(G)(L∞(G)) via the isometric algebra isomorphism m −→ mL.
(iii) RM(L1(G)) ⊆ Zt(LUC(G)∗).
(iv) idL1(G) ∈ Zt(LUC(G)∗).
(v) (LUC(G)∗, □) is unital.
(vi) (LUC(G)∗, □) is right unital.
Recall that a locally compact quantum group G is called amenable if there exists a left invariant mean on L∞ (G); that is, there exists m ∈ L∞ (G)∗ such that m = m, 1 = 1 and a m = 1, am (a ∈ L1 (G)). Right invariant means and (two-sided) invariant means on L∞ (G) are defined similarly. It is known that the existence of a right invariant mean and the existence of an invariant mean are both equivalent to G being amenable. It is well known that all cocommutative quantum groups are amenable. In [24], Lau introduced and studied a large class of Banach algebras, called F -algebras, including all preduals of Hopf–von Neumann algebras. An F -algebra is a Banach algebra A which is the predual of a W ∗ -algebra M such that the identity 1 of M is a multiplicative linear functional on A. For an F -algebra A, let A0 be the augmentation ideal in A; that is, A0 = {a ∈ A: a, 1 = 0}. Note that a quantum group algebra L1 (G) is an involutive Banach algebra, and L1 (G)0 is closed under the involution on L1 (G). Thus L1 (G) (respectively, L1 (G)0 ) has a BRAI if and only if L1 (G) (respectively, L1 (G)0 ) has a BAI. Applying Lau [24, Theorem 4.10] to L1 (G) (see also [24, Theorem 4.1]), we obtain the following proposition. We note that the equivalence below also follows by combining [24, Theorem 4.10] with its right-hand side version, which can be proved by interchanging the words “left” and “right” there.
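To make the augmentation ideal concrete, consider the finite cyclic group Z6, so that the F-algebra is ℓ1(Z6) under convolution and the functional 1 sums the coefficients. The sketch below is an illustration under these assumptions, with hypothetical helper names. It checks that e0 = δ0 − (1/6)·1 lies in A0, acts as an identity on an element of A0, and has ℓ1-norm 2 − 2/6.

```python
from fractions import Fraction

N = 6  # order of the cyclic group Z_6 (an illustrative choice)

def convolve(a, b):
    """Convolution on l^1(Z_N): (a * b)(u) sums a(s) b(t) over s + t = u mod N."""
    out = [Fraction(0)] * N
    for s in range(N):
        for t in range(N):
            out[(s + t) % N] += a[s] * b[t]
    return out

def augmentation(a):
    """The character a -> <a, 1>, i.e. the sum of the coefficients."""
    return sum(a)

def l1_norm(a):
    """The l^1-norm of a."""
    return sum(abs(x) for x in a)

# Candidate identity of the augmentation ideal:
# the point mass at 0 minus the normalized counting measure.
e0 = [Fraction(int(s == 0)) - Fraction(1, N) for s in range(N)]

# An element of the augmentation ideal (its coefficients sum to zero).
a = [Fraction(3), Fraction(-1), Fraction(1, 2),
     Fraction(-5, 2), Fraction(1), Fraction(-1)]

print(augmentation(e0) == 0 and augmentation(a) == 0)
print(convolve(e0, a) == a and convolve(a, e0) == a)
print(l1_norm(e0))  # 2 - 2/N
```

The first two checks print True and the norm is 5/3. For a finite group of order n the same computation gives 2 − 2/n, which approaches 2 as n grows.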
Proposition 16. Let G be a locally compact quantum group. Then G is amenable and co-amenable if and only if L1(G)0 has a BAI. In particular, if G is a co-commutative quantum group, then each of (i)–(vi) in Theorem 15 is also equivalent to
(vii) L1(G)0 has a BAI.
We point out that unlike in the case of L1(G), where L1(G) has a BAI (that is, G is co-amenable) if and only if L1(G) has a BAI of norm 1 (even, as shown in [16], consisting of states on L∞(G)), for any infinite-dimensional co-amenable co-commutative quantum group G (that is, L1(G) = A(G) of an infinite amenable group G), L1(G)0 has a BAI bounded by 2, and 2 is the best possible norm bound for any BAI of L1(G)0 (cf. Kaniuth and Lau [19, Theorem 3.4]).
Remark 17. Recall that in the case where L1(G) is L1(G) or A(G), the quantum group G is amenable and co-amenable if and only if G is an amenable group. Therefore, for all locally compact groups G, we have: G is amenable if and only if L1(G)0 has a BAI if and only if A(G)0 has a BAI. It would be interesting to know whether for a general locally compact quantum group G, we have
L1(G)0 has a BAI if and only if L1(Ĝ)0 has a BAI.
It is known by Bédos and Tuset [3, Theorem 3.2] that if Ĝ is co-amenable, then G is amenable. Therefore, to obtain the above equivalence, it suffices to know whether the amenability of G (respectively, the amenability and the co-amenability of G) would imply the co-amenability of Ĝ. This is still an open question, which is only known to be true if G is a locally compact group G (i.e., L1(G) = L1(G), cf. Leptin [29]), or G is a discrete quantum group (cf. Tomatsu [44], see also Ruan [38] for the discrete Kac algebra case).
By our definition, a locally compact quantum group G is SIN if and only if LUC(G) = RUC(G). Therefore, by Proposition 11(v), G is SIN if and only if LUC(G) = UC(G) if and only if RUC(G) = UC(G).
It was proved recently by Runde [40] that G is compact if and only if L1(G) is an ideal in L1(G)∗∗ (that is, L1(G) satisfies condition (0)). Thus, as in the locally compact group case, G is an SIN quantum group whenever G is compact, or G is discrete (that is, L1(G) has an identity), or G is co-commutative. Also, G is SIN if L1(G) has a central approximate identity (cf. Proposition 11). As in Theorem 15, we identify RM(L1(G)) and LUC(G)∗ with their respective canonical images in BL1(G)(L∞(G)). Then we obtain the following interesting counterpart of Theorem 15.
Theorem 18. Let G be a locally compact quantum group. Then the following statements are equivalent.
(i) G is co-amenable and SIN.
(ii) RM(L1(G)) ⊆ Zt(LUC(G)∗R).
(iii) idL1(G) ∈ Zt(LUC(G)∗R).
(iv) (LUC(G)∗R, ♦) is unital.
(v) (LUC(G)∗R, ♦) is right unital.
(vi) LUC(G)∗ has a strong identity.
Proof. (vi) ⇒ (iv) ⇒ (iii) and (ii) ⇒ (iii) are trivial. (iii) ⇒ (vi) ⇐⇒ (v) follows from Theorem 10. (i) ⇐⇒ (vi) holds by Corollary 14. And (vi) ⇒ (ii) follows from Lemma 9(ii) and Theorem 15. □
For a general co-amenable locally compact quantum group G, we do not know whether L1(G) has a central BAI if G is SIN, though this is true when G is commutative or co-commutative. This is not clear even for co-amenable compact quantum groups. Also, it is not clear whether G is SIN if G is co-amenable and LUC(G) is introverted in L∞(G) (cf. Proposition 11(vi)).
Let G be a locally compact group. It is seen that if f ∈ UC(G), then mr(f) = mR(f) ∈ RUC(G) for all m ∈ LUC(G)∗ (cf. Lau [23, Lemma 3]). Also, if m ∈ Zt(LUC(G)∗) = M(G), then mr(f) = mR(f) and m □ n = m ♦ n = m ∗ n for all f ∈ LUC(G) and n ∈ LUC(G)∗. Suppose that LUC(G) = RUC(G). Then LUC(G)∗R = LUC(G)∗ = RUC(G)∗. Just as (LUC(G)∗, □) = (LUC_ℓ∞(G)∗, □_ℓ1) was shown in Section 3, we also have (RUC(G)∗, ♦) = (RUC_ℓ∞(G)∗, ♦_ℓ1). Recall that (ZU(G), ∗) = (LUC_ℓ∞(G)∗R, ♦_ℓ1) (cf. Section 3). Therefore, we have
(LUC(G)∗R, ♦) = (RUC(G)∗, ♦) = (RUC_ℓ∞(G)∗, ♦_ℓ1) = (LUC_ℓ∞(G)∗R, ♦_ℓ1) = (ZU(G), ∗);
i.e., (LUC(G)∗R, ♦) = (ZU(G), ∗), or equivalently, (LUC(G)∗R, ♦) = (LUC_ℓ∞(G)∗R, ♦_ℓ1). Conversely, assume that UC(G) ⊊ LUC(G). Then there exists a non-zero m ∈ LUC(G)∗R such that mR = 0 (see the paragraph after the proof of Proposition 5). Obviously, δeG ∈ ZU(G) ∩ LUC(G)∗R. Also, δeG is an identity of (ZU(G), ∗), and δeG is a left but not a right identity of (LUC(G)∗R, ♦) (since m ♦ δeG = 0). Therefore, in this situation, (LUC(G)∗R, ♦) cannot be a subalgebra of (ZU(G), ∗). Recall that RM(L1(G)) = M(G), and δeG is an identity of (LUC(G)∗, □). From these discussions together with Proposition 11(vi), Corollary 14, and Theorem 18, we obtain below several new characterizations of SIN locally compact groups.
Theorem 19. Let G be a locally compact group.
Then the following statements are equivalent.
(i) G is an SIN-group.
(ii) (LUC(G)∗R, ♦) = (LUC_ℓ∞(G)∗R, ♦_ℓ1).
(iii) (LUC(G)∗R, ♦) is a subalgebra of (LUC_ℓ∞(G)∗R, ♦_ℓ1).
(iv) M(G) ⊆ Zt(LUC(G)∗R).
(v) δeG ∈ Zt(LUC(G)∗R).
(vi) (LUC(G)∗R, ♦) is unital.
(vii) (LUC(G)∗R, ♦) is right unital.
(viii) LUC(G)∗ has a strong identity.
(ix) LUC(G) is two-sided introverted in L∞(G).
Remark 20. (a) The referee kindly informed us that the equivalence between (i) and (ii) in Theorem 19 may also be obtained by [32, Theorem 7] and [36, Lemma 5].
(b) Results obtained in this section illustrate that the SIN property is intrinsically related to topological centre problems. (c) The Banach algebras (LUC(G)∗R , ♦) and (ZU (G), ∗) are both used to describe the topological centre Zt (LUC(G)∗ ) and to characterize the SIN property of a locally compact group G. The approach by using LUC(G)∗R is of Banach algebraic flavor, that can be applied to general locally compact quantum groups. We note that (LUC(G)∗R , ♦) cannot be replaced by (ZU (G), ∗) in Theorem 19(v)–(vii). We have other evidence showing advantages of LUC(G)∗R over ZU (G) for studying problems as characterizing the equality LUC(G) = WAP(G) for general quantum groups G, where WAP(G) is the space of weakly almost periodic functionals on L1 (G). (Recall that a bounded linear functional f on a Banach algebra A is called weakly almost periodic if the map A −→ A∗ , a −→ f · a is weakly compact.) (d) Suppose that G is amenable. Let RIM(LUC(G)) and TRIM(LUC(G)) be the sets of right translation invariant and topologically right invariant means on LUC(G), respectively. Then TRIM LUC(G) ⊆ RIM LUC(G) ⊆ ZU (G),
and TRIM LUC(G) ⊆ LUC(G)∗R .
It follows that if (ZU (G), ∗) is a subalgebra of (LUC(G)∗R , ♦), then TRIM(LUC(G)) = RIM(LUC(G)). We do not know whether the converse holds, that would be true if the above equality were equivalent to G being SIN. Also, it is not clear for us when we would have LUC(G)∗R = ZU (G) as subspaces of LUC(G)∗ . Let G be a locally compact quantum group and C0 (G) be the reduced C ∗ -algebra of G. Then C0 (G) ⊆ WAP(G) (cf. Runde [41, Theorem 4.3]), WAP(G) and C0 (G) are introverted in L∞ (G), and the two Arens products on WAP(G)∗ and C0 (G)∗ , respectively, coincide (cf. Dales and Lau [6, Propositions 3.11 and 5.7]). Therefore, M(G) = C0 (G)∗ is a dual Banach algebra (i.e., the multiplication on M(G) is separately weak∗ –weak∗ continuous). We point out that the Arens product on M(G) is equal to the product on C0 (G)∗ as defined in Kustermans and Vaes [21], that is induced by the co-multiplication on C0 (G). It is known that G is co-amenable if and only if C0 (G)∗ is unital (cf. Bédos and Tuset [3, Theorem 3.1]). By Theorem 15, we see that the co-amenability of G is also equivalent to WAP(G)∗ being unital. It is interesting to compare these characterizations of co-amenable locally compact quantum groups (see also Theorem 15) with Theorem 18, where we show in particular that a quantum group G is co-amenable and SIN if and only if (LUC(G)∗R , ♦) is unital. Among these kinds of characterizations in terms of the existence of a unit, we point out here that Theorem 22 of the next section shows that when L1 (G) is in the class of Banach algebras introduced by the authors in [16], in particular, when L1 (G) is separable with G co-amenable, then G is discrete if and only if L1 (G)∗∗ is unital under either Arens product. 5. Interrelationships between strong Arens irregularity and quotient strong Arens irregularity Let A be a Banach algebra. 
Recall that we say that A is left quotient strongly Arens irregular if Zt (A∗ A∗ ) ⊆ RM(A), which is equivalent to Zt (A∗ A∗ ) = RM(A) if A has a BRAI, where RM(A) and A∗ A∗ are identified with their respective canonical images in BA (A∗ ) (cf. Section 1). In this section, we consider how the inclusion Zt (A∗ A∗ ) ⊆ RM(A) is related to the left strong Arens irregularity of A. For brevity, most results in this section are stated in their one-
sided versions. We remind the reader that here RM(A) is the opposite right multiplier algebra of A. Recall that a Banach space X is weakly sequentially complete (WSC) if every weakly Cauchy sequence in X is weakly convergent. It is well known that the predual of a von Neumann algebra is WSC (cf. Takesaki [42, Corollary III.5.2]). First, using the connection between RM(A) and Zt (A∗ A∗ ) as shown in Lemma 1(ii), we give the following generalization of Lau and Ülger [28, Theorem 2.6], where they proved that if A is WSC with a sequential BAI, then A∗ A = A∗ if and only if AA∗ = A∗ if and only if A is unital. Lau and Ülger asked there whether one can drop the word “sequential” above. See Baker, Lau and Pym [2, Corollary 2.3] for some related results. To present results in this section, we need recall the definition of the class of Banach algebras introduced by the authors in [16]. Definition 21. (See [16].) Let A be a Banach algebra with a BAI. Then A is said to be of type (RM) if for every μ ∈ RM(A), there is a closed subalgebra B of A with a BAI such that (I) μ|B ∈ RM(B); (II) f |B ∈ BB ∗ for all f ∈ AA∗ ; (III) there is a family {Bi } of closed right ideals in B satisfying (i) each Bi is WSC with a sequential BAI, (ii) for all i, there exists a left Bi -module projection from B onto Bi , and (iii) μ ∈ A if μ|Bi ∈ Bi for all i. Similarly, Banach algebras of type (LM) are defined. A is said to be of type (M) if A is both of type (LM) and of type (RM). Obviously, a Banach algebra A is of type (M) if A is WSC with a sequential BAI. This is the case when A is L1 (G) of a co-amenable quantum group G over a separable Hilbert space. It is shown in [16] that all convolution Beurling algebras L1 (G, ω) with ω 1, in particular, all group algebras L1 (G), are of type (M). And so are Fourier algebras A(G) of amenable locally compact groups G. Theorem 22. Let A be a Banach algebra. (i) Suppose that A is of type (LM). Then A∗ A = A∗ if and only if A is unital. 
(ii) Suppose that A is of type (RM). Then AA∗ = A∗ if and only if A is unital.
Consequently, if A is of type (M), then A is unital if and only if A∗∗ is unital under either Arens product. In particular, if G is a locally compact quantum group with L1(G) of type (M), then LUC(G) = L∞(G) if and only if G is discrete if and only if L1(G)∗∗ is unital.

Proof. We prove assertion (i); similar arguments establish (ii). Clearly, A∗A = A∗ if A is unital. Conversely, assume that A∗A = A∗. Then we have (⟨A∗A⟩∗, □) = (A∗∗, □) and Zt(⟨A∗A⟩∗) = Zt(A∗∗, □). Let (μ, ν) ∈ M(A) and E be a weak∗ cluster point in A∗∗ of a BAI of A. By Lemma 1(ii), ν∗∗(E) ∈ Zt(⟨A∗A⟩∗) = Zt(A∗∗, □). It can be seen that ν∗∗(E) · a = μ(a) for all a ∈ A. Thus ν∗∗(E) · A ⊆ A. By [16, Theorem 32(i)],
ν∗∗(E) = a0 for some a0 ∈ A. It follows that μ(a) = a0 a and ν(a) = a a0 (a ∈ A). We conclude that A is unital by taking (μ, ν) = (idA, idA). By (i), (ii), and [28, Proposition 2.2], we conclude that A is unital if and only if A∗∗ is unital under either Arens product. ∎

It is interesting to compare Theorem 23 below with [16, Theorem 32], where we proved that if A is of type (LM), then A is left strongly Arens irregular if and only if Zt(A∗∗, □) · A ⊆ A; and if A is of type (RM), then A is right strongly Arens irregular if and only if A · Zt(A∗∗, ♦) ⊆ A.

Theorem 23. Let A be a Banach algebra satisfying A² = A.
(i) A is left quotient strongly Arens irregular if and only if A · Zt(A∗∗, □) ⊆ A.
(ii) A is right quotient strongly Arens irregular if and only if Zt(A∗∗, ♦) · A ⊆ A.
Consequently, Zt(A∗∗, □) ∩ Zt(A∗∗, ♦) = A in the following two cases:
(a) A is of type (RM) and Zt(⟨A∗A⟩∗) = RM(A);
(b) A is of type (LM) and Zt(⟨AA∗⟩∗) = LM(A).

Proof. Assertion (i) follows from Corollary 4(i) and Fact 3. Similarly, assertion (ii) holds. For the final assertion, let m ∈ Zt(A∗∗, □) ∩ Zt(A∗∗, ♦). In case (a), we have A · m ⊆ A by (i), and since m ∈ Zt(A∗∗, ♦), we have m ∈ A by [16, Theorem 32(ii)]. Therefore, Zt(A∗∗, □) ∩ Zt(A∗∗, ♦) = A. The proof for case (b) is similar. ∎

Remark 24. Suppose that A has a BRAI, and ⟨A∗A⟩∗ has a strong identity. It is seen that
\[
\bigl[ Z_t(\langle A^*A\rangle^*) \cap Z_t(\langle A^*A\rangle^*_R) = RM(A) \bigr] \;\Longrightarrow\; \bigl[ A \cdot \bigl( Z_t(A^{**},\Box) \cap Z_t(A^{**},\Diamond) \bigr) \subseteq A \bigr].
\]
(Cf. Lemma 9(ii).) Hence, the equality Zt(⟨A∗A⟩∗) ∩ Zt(⟨A∗A⟩∗_R) = RM(A) implies that Zt(A∗∗, □) ∩ Zt(A∗∗, ♦) = A if A is of type (RM) (comparing with case (a) in Theorem 23). In this situation, [Zt(⟨A∗A⟩∗) = RM(A)] ⇒ [Zt(⟨A∗A⟩∗) ⊆ Zt(⟨A∗A⟩∗_R)].

Due to Theorem 23, we have the following theorem on topological centres.

Theorem 25. Let A be a Banach algebra. Consider the following statements.
(i) Zt(⟨A∗A⟩∗) = RM(A).
(ii) Zt(A∗∗, □) = A.
(iii) Zt(A∗∗, □) ⊆ Zt(A∗∗, ♦).
If A is of type (RM), then any two of (i)–(iii) imply the third.
Proof. By Theorem 23(i), we just need to show that [(i) and (iii)] ⇒ (ii). Assume that Zt(⟨A∗A⟩∗) = RM(A) and Zt(A∗∗, □) ⊆ Zt(A∗∗, ♦). Then, by case (a) in Theorem 23, we have Zt(A∗∗, □) = Zt(A∗∗, □) ∩ Zt(A∗∗, ♦) = A; that is, (ii) holds. ∎

Combining Theorem 23 with Theorem 25 gives the corollary below.

Corollary 26. Let A be a Banach algebra of type (M). If Zt(A∗∗, □) = Zt(A∗∗, ♦) (e.g., A is commutative), then the following statements are equivalent.
(i) Zt(⟨A∗A⟩∗) = RM(A).
(ii) Zt(⟨AA∗⟩∗) = LM(A).
(iii) Zt(A∗∗, □) = A.
(iv) Zt(A∗∗, □) · A ⊆ A.
(v) A · Zt(A∗∗, □) ⊆ A.
Remark 27. When A is the Fourier algebra of an amenable locally compact group, Lau and Losert proved (i) ⇒ (iii) in [27, Theorem 6.4].

Remark 28. In [28, Remark 5.2.3°], Lau and Ülger observed the asymmetry between the equality “A · Zt(A∗∗, □) = A · Zt(⟨A∗A⟩∗)” as stated in Corollary 4(ii), and the inclusion “Zt(A∗∗, □) · A ⊆ Zt(A∗∗, □)” considered in [28, Theorem 5.1]: the topological centre Zt(A∗∗, □) is treated as a left A-module in the former but as a right A-module in the latter. One may also compare the condition “A · Zt(A∗∗, □) ⊆ A” in Theorem 23(i) with the condition “Zt(A∗∗, □) · A ⊆ A” considered in [16, Theorem 32(i)]. These asymmetries may be explained as follows. In the equality A · Zt(A∗∗, □) = A · Zt(⟨A∗A⟩∗), the topological centre Zt(A∗∗, □) is linked to the opposite right multiplier algebra RM(A) through the embedding RM(A) → Zt(⟨A∗A⟩∗). On the other hand, in Zt(A∗∗, □) · A ⊆ Zt(A∗∗, □), one relates Zt(A∗∗, □) to the left multiplier algebra LM(A) via the map from LM(A) into (A∗∗, ♦) as given in Lemma 1(i): the product Zt(A∗∗, □) · A taken here should be recognized as the product Zt(A∗∗, □) ♦ A, though they are equal.

Assume that A is of type (M). If Zt(A∗∗, □) = Zt(A∗∗, ♦), then, by Corollary 26, the strong Arens irregularity of A is equivalent to the quotient strong Arens irregularity of A. However, there do exist involutive Banach algebras A of type (M) such that Zt(A∗∗, □) ≠ Zt(A∗∗, ♦) (cf. Propositions 34 and 36 in Section 6). In this situation, Theorem 25 shows that the assertion
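\[
\bigl[ Z_t(\langle A^*A\rangle^*) = RM(A) \bigr] \;\Longrightarrow\; \bigl[ Z_t(A^{**},\Box) = A \bigr] \tag{1}
\]
is equivalent to
\[
\bigl[ Z_t(\langle A^*A\rangle^*) = RM(A) \bigr] \;\Longrightarrow\; \bigl[ Z_t(A^{**},\Box) \subseteq Z_t(A^{**},\Diamond) \bigr], \tag{2}
\]
which is also equivalent to
\[
\bigl[ A \cdot Z_t(A^{**},\Box) \subseteq A \bigr] \;\Longrightarrow\; \bigl[ Z_t(A^{**},\Box) \cdot A \subseteq A \bigr] \tag{3}
\]
by Theorem 23(i) and [16, Theorem 32(i)].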
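The equivalence of (1) and (2) is a purely logical consequence of the "any two imply the third" statement of Theorem 25. The following is a minimal sketch of that step; the letters P, Q, R are shorthand introduced here for illustration only, and A is assumed to be of type (RM) so that Theorem 25 applies.

```latex
% Shorthand (introduced for this sketch only):
%   P : Z_t(\langle A^*A\rangle^*) = RM(A)
%   Q : Z_t(A^{**},\Box) = A
%   R : Z_t(A^{**},\Box) \subseteq Z_t(A^{**},\Diamond)
% Theorem 25 (A of type (RM)): any two of P, Q, R imply the third.
\begin{align*}
(1)\Rightarrow(2):\quad & \text{assume } P \Rightarrow Q \text{ and suppose } P.
   \text{ Then } Q \text{ holds, so } P \wedge Q, \text{ and Theorem 25 gives } R.\\
(2)\Rightarrow(1):\quad & \text{assume } P \Rightarrow R \text{ and suppose } P.
   \text{ Then } R \text{ holds, so } P \wedge R, \text{ and Theorem 25 gives } Q.
\end{align*}
% Hence (P \Rightarrow Q) \iff (P \Rightarrow R), which is (1) \iff (2).
```

Only two of the three implications packaged in Theorem 25 are used here, namely (P and Q) ⇒ R and (P and R) ⇒ Q.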
Therefore, even in order to obtain the left strong Arens irregularity of A through the left quotient strong Arens irregularity of A, one may have to consider both topological centres of A∗∗ and their relationship. As noted in Remark 28, Zt(A∗∗, □) is intrinsically related to both LM(A) and RM(A). In other words, LM(A) and RM(A) are each related to both Zt(A∗∗, □) and Zt(A∗∗, ♦). Moreover, the equivalence between (1) and (3) shows that implication (1) holds precisely when A is not a “wrong” sided ideal in Zt(A∗∗, □); that is, A cannot be only a right but not a left ideal in Zt(A∗∗, □). All of these facts (see also Remark 28) illustrate the complex nature of topological centre problems.

Next, we consider the case where A is an involutive Banach algebra. In this situation, there exists a closer connection between Zt(A∗∗, □) and Zt(A∗∗, ♦) (respectively, Zt(⟨A∗A⟩∗) and Zt(⟨AA∗⟩∗)). Let τ : A∗∗ → A∗∗, m ↦ m∗ be the unique weak∗–weak∗ continuous extension of the involution on A (cf. the proof of Proposition 11(iv)). Then τ is usually not an involution on A∗∗ with either Arens product (cf. Farhadi and Ghahramani [9]) but a linear involution; that is, τ(τ(m)) = m and τ(αm + βn) = ᾱτ(m) + β̄τ(n) (m, n ∈ A∗∗, α, β ∈ C). It can be seen that (m □ n)∗ = n∗ ♦ m∗ and (m ♦ n)∗ = n∗ □ m∗ for all m, n ∈ A∗∗ (cf. Dales and Lau [6, Chapter 2]). So, τ(Zt(A∗∗, □)) = Zt(A∗∗, ♦) and τ(Zt(A∗∗, ♦)) = Zt(A∗∗, □). Thus, Zt(A∗∗, □) = A if and only if Zt(A∗∗, ♦) = A, and A · Zt(A∗∗, □) ⊆ A if and only if Zt(A∗∗, ♦) · A ⊆ A. Therefore, when A satisfies A² = A, by Theorem 23, we have Zt(⟨A∗A⟩∗) ⊆ RM(A) if and only if Zt(⟨AA∗⟩∗) ⊆ LM(A). It is routine to check that A is of type (LM) if and only if A is of type (RM), and hence if and only if A is of type (M) (cf. [16]).

The theorem below is immediate by Theorem 25 and the relation between the assertions (1), (2), and (3).

Theorem 29. Let A be an involutive Banach algebra of type (M). Consider the following statements.
(i) Zt(⟨A∗A⟩∗) = RM(A).
(ii) Zt(A∗∗, □) = A.
(iii) Zt(A∗∗, □) = Zt(A∗∗, ♦).
Then any two of (i)–(iii) imply the third. Consequently, the following statements are equivalent.
(a) [Zt(⟨A∗A⟩∗) = RM(A)] ⇒ [Zt(A∗∗, □) = A].
(b) [Zt(⟨A∗A⟩∗) = RM(A)] ⇒ [Zt(A∗∗, □) = Zt(A∗∗, ♦)].
(c) [A · Zt(A∗∗, □) ⊆ A] ⇒ [A · Zt(A∗∗, ♦) ⊆ A].
(d) [Zt(A∗∗, ♦) · A ⊆ A] ⇒ [Zt(A∗∗, □) · A ⊆ A].
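The interchange of the two topological centres under τ, used in the discussion preceding Theorem 29, can be checked directly from the two product identities for τ. The sketch below assumes the usual description of the topological centres by coincidence of the two Arens products; that description is standard but is not restated in this part of the text, so it is labelled as an assumption.

```latex
% Assumed descriptions (standard, not restated in this section):
%   Z_t(A^{**},\Box)     = \{ m : m \Box n = m \Diamond n \ \text{for all } n \in A^{**} \},
%   Z_t(A^{**},\Diamond) = \{ m : n \Box m = n \Diamond m \ \text{for all } n \in A^{**} \}.
% Identities from the text: (m \Box n)^* = n^* \Diamond m^*, \quad (m \Diamond n)^* = n^* \Box m^*.
\begin{align*}
m \in Z_t(A^{**},\Box)
  &\;\Longrightarrow\; m \Box n = m \Diamond n \quad (n \in A^{**}) \\
  &\;\Longrightarrow\; n^* \Diamond m^* = (m \Box n)^* = (m \Diamond n)^* = n^* \Box m^* \quad (n \in A^{**}) \\
  &\;\Longrightarrow\; k \Diamond m^* = k \Box m^* \quad (k \in A^{**}),
      \quad \text{since } \tau \text{ is a bijection of } A^{**} \\
  &\;\Longrightarrow\; m^* \in Z_t(A^{**},\Diamond).
\end{align*}
% So \tau(Z_t(A^{**},\Box)) \subseteq Z_t(A^{**},\Diamond); the symmetric argument gives
% \tau(Z_t(A^{**},\Diamond)) \subseteq Z_t(A^{**},\Box), and \tau \circ \tau = \mathrm{id}
% upgrades both inclusions to equalities.
```

Since τ restricts to the involution on A, the equality τ(Zt(A∗∗, □)) = Zt(A∗∗, ♦) also explains why one centre equals A exactly when the other does.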
Finally, we relate the multiplier algebra M(A) to topological centres. Let A be a Banach algebra with a BAI. Since the maps (μl, μr) ↦ μl and (μl, μr) ↦ μr are unital injective algebra homomorphisms from M(A) to LM(A) and RM(A), respectively, we have two unital injective algebra homomorphisms
\[
M(A) \longrightarrow Z_t(\langle AA^*\rangle^*) \quad\text{and}\quad M(A) \longrightarrow Z_t(\langle A^*A\rangle^*).
\]
(Cf. Lemma 1(ii).) The theorem below shows that if A is of type (LM) and Zt(⟨A∗A⟩∗) = M(A) (i.e., the first embedding above is onto), then the left strong Arens irregularity of A can be determined by testing elements of Zt(A∗∗, □) against one particular element of A∗∗, rather than verifying conditions like Zt(A∗∗, □) · A ⊆ A, or Zt(A∗∗, □) □ A∗ ⊆ AA∗ (comparing with [16, Theorem 18]).

We recall that an element E of A∗∗ is called a mixed identity of A∗∗ if m □ E = E ♦ m = m for all m ∈ A∗∗. It is known that E is a mixed identity of A∗∗ if and only if E is a weak∗-cluster point of a BAI of A (cf. Dales [5, Proposition 2.9.16(iii)]).

Theorem 30. Let A be a Banach algebra with a mixed identity E of A∗∗.
(i) Assume that A is of type (LM) and Zt(⟨A∗A⟩∗) = M(A). Let m ∈ Zt(A∗∗, □). Then m ∈ A if and only if E □ m = m.
(ii) Assume that A is of type (RM) and Zt(⟨AA∗⟩∗) = M(A). Let m ∈ Zt(A∗∗, ♦). Then m ∈ A if and only if m ♦ E = m.

Proof. We consider only assertion (i). A similar argument shows (ii). Obviously, E □ m = E ♦ m = m if m ∈ A. Conversely, suppose that E □ m = m. By the assumption, M(A) → RM(A) and RM(A) → Zt(⟨A∗A⟩∗) are both surjective. By Theorem 23(i), we have A · Zt(A∗∗, □) ⊆ A. Hence, a ↦ a · m defines a μr ∈ RM(A). Then, there exists μl ∈ LM(A) such that (μl, μr) ∈ M(A). For all a, b ∈ A, we have b · μl(a) = μr(b) · a = (b · m) · a = b · (m · a). Thus n · μl(a) = n □ (m · a) for all n ∈ A∗∗ and a ∈ A. It follows that, for all a ∈ A,
\[
m \cdot a = (E \,\Box\, m) \cdot a = E \,\Box\, (m \cdot a) = E \cdot \mu_l(a) = \mu_l(a) \in A;
\]
that is, m · A ⊆ A. Therefore, m ∈ A by [16, Theorem 32(i)]. ∎
We have the following immediate corollary of Theorem 30, which shows that for a Banach algebra A of type (LM), if A has a central BAI, then Zt(⟨A∗A⟩∗) = M(A) does imply that A is left strongly Arens irregular without the need of any testing point.

Corollary 31. Let A be a Banach algebra with a central BAI. Assume that A is of type (LM). If Zt(⟨A∗A⟩∗) = M(A), then Zt(A∗∗, □) = A. In particular, when A satisfies RM(A) = M(A), then Zt(⟨A∗A⟩∗) = M(A) if and only if Zt(A∗∗, □) = A.

Proof. Assume that A is of type (LM) and Zt(⟨A∗A⟩∗) = M(A). Let E be a weak∗-cluster point in A∗∗ of a central BAI. Then E □ m = m ♦ E for all m ∈ A∗∗. Therefore, if m ∈ Zt(A∗∗, □), then E □ m = m ♦ E = m □ E = m. Hence, we have Zt(A∗∗, □) = A by Theorem 30(i). The second assertion follows from Theorem 23(i). ∎

With group algebras L1(G) and the Banach algebra given in Proposition 36 of the next section, we see that the two conditions “Zt(A∗∗, □) = Zt(A∗∗, ♦)” and “A has a central BAI”, which are required in Corollaries 26 and 31, respectively, are independent.
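The identity E □ m = m ♦ E in the proof of Corollary 31 is stated without detail. One way to see it is sketched below, under the standard separate weak∗-continuity properties of the two Arens products; these properties are assumed here and are not quoted from the text, and (e_α) denotes the central BAI, passed to a subnet converging weak∗ to E.

```latex
% Assumed standard facts (labelled F1--F3 for this sketch only):
%  (F1) for fixed n \in A^{**}, the map m \mapsto m \Box n is weak*-continuous;
%  (F2) for fixed m \in A^{**}, the map n \mapsto m \Diamond n is weak*-continuous;
%  (F3) for a \in A, a \Box m = a \Diamond m = a \cdot m and m \Box a = m \Diamond a = m \cdot a,
%       and both m \mapsto a \cdot m and m \mapsto m \cdot a are weak*-continuous.
\begin{align*}
e_\alpha \cdot m &= m \cdot e_\alpha \quad (m \in A^{**})
   && \text{centrality of } e_\alpha \text{ in } A,\ \text{weak*-density of } A \text{ in } A^{**},\ \text{(F3)};\\
E \,\Box\, m &= \mathrm{w}^*\text{-}\lim_\alpha \, e_\alpha \cdot m
   && \text{by (F1) and (F3)};\\
m \,\Diamond\, E &= \mathrm{w}^*\text{-}\lim_\alpha \, m \cdot e_\alpha
   && \text{by (F2) and (F3)};\\
\text{hence}\quad E \,\Box\, m &= m \,\Diamond\, E .
\end{align*}
```

Combined with m □ E = m for a mixed identity E, and m □ E = m ♦ E for m ∈ Zt(A∗∗, □), this gives the chain E □ m = m ♦ E = m □ E = m used in the proof.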
Recall that the condition RM(A) = M(A) (cf. Corollary 31) is satisfied by convolution Beurling algebras L1(G, ω) and Fourier algebras A(G). This condition is also satisfied by the quantum group algebra L1(G) of any co-amenable quantum group G. In fact, if G is a co-amenable locally compact quantum group, then we have the canonical isometric algebra isomorphisms
\[
M(G) \cong M(L_1(G)) \cong RM(L_1(G)) \cong LM(L_1(G)).
\]
(Cf. Hu, Neufang and Ruan [17].) As shown by Kraus and Ruan [20, Proposition 3.1] for Kac algebras, if G is co-amenable, every left (respectively, right) multiplier on L1(G) is completely bounded. In this situation, the subscript “cb” can be added to the above algebras of multipliers and all the identifications there become completely isometric isomorphisms. We will study multipliers on co-amenable locally compact quantum groups in the subsequent work [17]. See Junge, Neufang and Ruan [18] for representations of cb-multipliers over general locally compact quantum groups.

It is known that L1(G) is a two-sided ideal in M(G). We showed in [16, Proposition 1] that the multiplication on L1(G) is faithful. By Kustermans and Vaes [21, Corollary 6.11] and modifying the arguments used in the proof of [16, Proposition 1], one can show that the multiplication on M(G) is also faithful. Thus M(G) can be canonically identified with a subalgebra of RM(L1(G)) via ν ↦ νr, where νr(a) = a ⋆ ν (a ∈ L1(G)). Therefore, RM(L1(G)) ⊆ Zt(LUC(G)∗) implies that M(G) ⊆ Zt(LUC(G)∗). In general, the converse implication is not true, since when L1(G) = A(G), the latter always holds (cf. Lau and Losert [27, Proposition 4.5]), but the former holds precisely when G is amenable (cf. Theorem 15). However, we show below that the two inclusions Zt(LUC(G)∗) ⊆ RM(L1(G)) and Zt(LUC(G)∗) ⊆ M(G) are equivalent.

Theorem 32. Let G be a locally compact quantum group. Then the following statements are equivalent.
(i) L1(G) is quotient strongly Arens irregular.
(ii) Zt(LUC(G)∗) ⊆ RM(L1(G)).
(iii) Zt(LUC(G)∗) ⊆ M(G).
(iv) L1(G) ⋆ Zt(L1(G)∗∗, □) ⊆ L1(G).
(v) L1(G) ⋆ Zt(LUC(G)∗) ⊆ L1(G).

Proof. Note that L1(G) is an involutive Banach algebra satisfying L1(G)² = L1(G) (cf. [16] and Fact 1). By Corollary 4(i) and Theorem 23(i) together with the discussions before Theorem 29, we see that (i), (ii), (iv), and (v) are equivalent. Clearly, (iii) ⇒ (ii). So, we only have to show that (ii) ⇒ (iii).

Assume that Zt(LUC(G)∗) ⊆ RM(L1(G)). Let m ∈ Zt(LUC(G)∗). Then there exists μ ∈ RM(L1(G)) such that mL = μ∗; that is,
\[
\langle m, f \star a \rangle = \langle f, \mu(a) \rangle \qquad (f \in L_\infty(G),\ a \in L_1(G)).
\]
It is known that C0(G) ⊆ LUC(G) (cf. [41, Theorem 2.3]). Let ν = m|C0(G). Then ν ∈ C0(G)∗ = M(G). Let a ∈ L1(G). Then, for all f ∈ C0(G), we have
\[
\langle f, a \star \nu \rangle = \langle \nu, f \star a \rangle = \langle m, f \star a \rangle = \langle m_L(f), a \rangle = \langle \mu^*(f), a \rangle = \langle f, \mu(a) \rangle .
\]
Thus ⟨f, a ⋆ ν⟩ = ⟨f, μ(a)⟩ for all f ∈ L∞(G), since C0(G) is weak∗-dense in L∞(G). It follows that μ(a) = a ⋆ ν. Therefore, μ = νr, and hence mL = μ∗ = (νr)∗; that is, m ∈ M(G) (cf. Fact 2). ∎

Finally, with Theorems 18 and 32, we are able to characterize quantum groups G satisfying Zt(LUC(G)∗_R) = RM(L1(G)).

Theorem 33. Let G be a locally compact quantum group. Then the following statements are equivalent.
(i) Zt(LUC(G)∗_R) = RM(L1(G)).
(ii) G is co-amenable and SIN, and L1(G) is quotient strongly Arens irregular.

Proof. (i) ⇒ (ii). Due to Theorem 18, G is co-amenable and SIN. In this situation, we have LUC(G) = RUC(G) = L1(G) ⋆ L∞(G) ⋆ L1(G), and RM(L1(G)) ≅ LM(L1(G)) (cf. [17]). Thus it can be seen that the identity map LUC(G)∗ → RUC(G)∗ maps Zt(LUC(G)∗_R) onto Zt(RUC(G)∗) and RM(L1(G)) onto LM(L1(G)). It follows that Zt(RUC(G)∗) = LM(L1(G)), which is equivalent to Zt(LUC(G)∗) = RM(L1(G)) (see the paragraphs before Theorem 29). Therefore, by Theorem 32, L1(G) is quotient strongly Arens irregular. Similar arguments will establish (ii) ⇒ (i). ∎

6. Some examples of Arens irregular Banach algebras

We start this section with an example related to an open question in [28]. For a Banach algebra A with a BAI, Lau and Ülger asked whether “A · Zt(A∗∗, □) ⊆ A” implies that “Zt(A∗∗, ♦) · A ⊆ A” (see [28, question 6e)]). By Theorem 23, this is equivalent to asking whether “Zt(⟨A∗A⟩∗) = RM(A)” implies that “Zt(⟨AA∗⟩∗) = LM(A)”. This question was answered in the negative by Ghahramani, McClure, and Meng in [11, Theorem 3]. However, the proof of [11, Theorem 3] used an identification of Zt(K(c0)∗∗, ♦) from [28] which is incorrect, as pointed out by Dales and Lau in [6, Example 6.2].

In [6, Example 4.5], Dales and Lau constructed an LSAI Banach algebra which is not RSAI. It is seen that this Banach algebra has a BLAI. An earlier example of an LSAI Banach algebra with a BRAI which is not RSAI was given by Neufang (see, e.g., [6, p. 41]). Note that these two Banach algebras are both WSC. By taking the unitization and applying [11, Lemma 1], one can obtain a unital WSC Banach algebra A such that Zt(A∗∗, □) = A but Zt(A∗∗, ♦) ≠ A.

We give below a non-unital WSC Banach algebra A with a sequential BAI which is LSAI but far from being RSAI or right quotient strongly Arens irregular. In this situation, the two Banach algebras A∗∗ and ⟨A∗A⟩∗ do not coincide, and the topological centre problems for ⟨AA∗⟩∗ and ⟨A∗A⟩∗ are distinct. Recall that for all locally compact groups G, we have Zt(L1(G)∗∗, □) = Zt(L1(G)∗∗, ♦) = L1(G) (cf. Lau and Losert [26]).

Proposition 34. There exists a non-unital WSC Banach algebra A with a sequential BAI such that
(i) Zt(A∗∗, □) = A but Zt(A∗∗, ♦) ≠ A;
(ii) A · Zt(A∗∗, □) = Zt(A∗∗, □) · A = A, but Zt(A∗∗, ♦) · A ⊋ A and A · Zt(A∗∗, ♦) ⊋ A;
(iii) Zt(⟨A∗A⟩∗) = RM(A) but Zt(⟨AA∗⟩∗) ≠ LM(A).

Proof. As mentioned above, there exists a unital WSC Banach algebra B satisfying Zt(B∗∗, □) = B and Zt(B∗∗, ♦) ≠ B. Let C = L1(R) and A = B ⊕1 C. Then A is a WSC Banach algebra under the usual multiplication, and is non-unital with a sequential BAI. It is clear that Zt(A∗∗, □) = Zt(B∗∗, □) ⊕ Zt(C∗∗, □) = B ⊕ C = A, and Zt(A∗∗, ♦) = Zt(B∗∗, ♦) ⊕ Zt(C∗∗, ♦) = Zt(B∗∗, ♦) ⊕ C ⊋ B ⊕ C = A. Therefore, (i) holds.

Note that B is unital and Zt(B∗∗, ♦) ≠ B. We have Zt(B∗∗, ♦) · B ⊈ B, and hence Zt(A∗∗, ♦) · A = Zt(B∗∗, ♦) · B ⊕ C² = Zt(B∗∗, ♦) · B ⊕ C ⊋ B ⊕ C = A. Since A is of type (M) and Zt(A∗∗, ♦) ≠ A, we have A · Zt(A∗∗, ♦) ⊈ A by [16, Theorem 32(ii)]. Therefore, (ii) holds. Finally, (iii) follows from (ii) and Theorem 23. ∎

For a Banach algebra as in Proposition 34, taking the opposite algebra, one obtains a non-unital WSC Banach algebra A with a (sequential) BAI satisfying Zt(⟨A∗A⟩∗) ≠ RM(A). This answers [28, question 6f)] in the negative, where it was asked whether Zt(⟨A∗A⟩∗) = RM(A) if A is such a Banach algebra (see also [16, Proposition 34]).

The proposition below shows that the answer to [28, question 6k)] is also negative, where Lau and Ülger asked whether Zt(⟨A∗A⟩∗) is a dual Banach space when A is a Banach algebra with a BAI.

Proposition 35. There exists a Banach algebra A with a BAI such that neither A∗A nor Zt(⟨A∗A⟩∗) is a dual Banach space.

Proof. Let B be the unitization of the group algebra L1(T). By [11, Lemma 1], we have Zt(B∗∗, □) = Zt(L1(T)∗∗, □) ⊕ C = L1(T) ⊕ C. Let A = L1(T) ⊕1 B. Then
\[
A^* = L_\infty(\mathbb{T}) \oplus B^*, \qquad A^*A = C(\mathbb{T}) \oplus B^*, \qquad\text{and}\qquad \langle A^*A\rangle^* = C(\mathbb{T})^* \oplus B^{**}.
\]
Therefore, Zt(⟨A∗A⟩∗) = Z(C(T)∗) ⊕ Zt(B∗∗, □) = M(T) ⊕ (L1(T) ⊕ C). In this situation, neither A∗A nor Zt(⟨A∗A⟩∗) is a dual Banach space, since neither C(T) nor L1(T) is a dual Banach space. ∎

For an involutive Banach algebra A, as before, we let τ : A∗∗ → A∗∗ denote the unique weak∗–weak∗ continuous extension of the involution on A. Note that Zt(A∗∗, □) = Zt(A∗∗, ♦) if and only if τ(Zt(A∗∗, □)) = Zt(A∗∗, □). Therefore, Zt(A∗∗, □) = Zt(A∗∗, ♦) if and only if (Zt(A∗∗, □), τ) is an involutive Banach algebra.
In [6], Dales and Lau constructed some interesting involutive Banach algebras C with Zt(C∗∗, □) ≠ Zt(C∗∗, ♦), where either a convolution Beurling algebra ℓ1(F2, ω) or the C∗-algebra K(c0) was used in their constructions. It is seen that these Banach algebras are either unital or non-WSC. In the proposition below, using group algebras, we define a non-unital WSC separable involutive Banach algebra A with a BAI such that Zt(A∗∗, □) ≠ Zt(A∗∗, ♦).

Proposition 36. There exists a non-unital WSC separable involutive Banach algebra A with a central BAI such that Zt(A∗∗, □) ≠ Zt(A∗∗, ♦), Zt(⟨A∗A⟩∗) ≠ RM(A), and A∗ is a von Neumann algebra.

Proof. We will combine and modify some constructions provided in [6, Examples 4.4 and 4.5]. Take B = ℓ1(Z) ⊕1 ℓ1(Z) with the multiplication given by
\[
(f_1, g_1)(f_2, g_2) = (f_1 * f_2,\; f_1 * g_2) \qquad (f_1, f_2, g_1, g_2 \in \ell_1(\mathbb{Z})).
\]
Then B is a WSC Banach algebra satisfying B = Zt(B∗∗, □) ≠ Zt(B∗∗, ♦) (cf. [6, Example 4.5]). Obviously, the multiplication on B is not faithful. For (f, g) ∈ B, let $\overline{(f, g)} = (\bar f, \bar g)$, where $\bar f(x) = \overline{f(x)}$ (the complex conjugate of f(x)). This defines a linear involution on B satisfying $\overline{b_1 b_2} = \bar b_1 \bar b_2$ (b1, b2 ∈ B). Replacing B by its unitization, we can obtain a unital WSC Banach algebra B with a linear involution as above such that B = Zt(B∗∗, □) ≠ Zt(B∗∗, ♦).

Following the same arguments as used in [6, Example 4.4], we let C = B ⊕1 B^op with the usual multiplication, and define $(b_1, b_2)^* = (\bar b_2, \bar b_1)$ (b1, b2 ∈ B). Then C is a unital WSC involutive Banach algebra such that B ⊕ Zt(B∗∗, ♦) = Zt(C∗∗, □) ≠ Zt(C∗∗, ♦) = Zt(B∗∗, ♦) ⊕ B, since Zt(B∗∗, ♦) ≠ B.

Finally, as in the proof of Proposition 34, we let A = C ⊕1 L1(R). Clearly, A∗ is a von Neumann algebra. Under the canonical multiplication and involution, A is a non-unital WSC separable involutive Banach algebra with a central BAI such that Zt(A∗∗, □) ≠ Zt(A∗∗, ♦). Since B is unital and Zt(B∗∗, ♦) ≠ B, it can be seen that A · Zt(A∗∗, □) ⊈ A. Therefore, Zt(⟨A∗A⟩∗) ≠ RM(A) by Theorem 23(i). ∎

Remark 37. Comparing with Theorem 29 and Proposition 36, we note that if A is taken to be A(SU(3) × Z), then we have Zt(A∗∗, □) = Zt(A∗∗, ♦) ≠ A, and Zt(⟨A∗A⟩∗) ≠ RM(A) (cf. [16, Proposition 34]).

Recall that we say that a Banach algebra A is left quotient Arens regular if Zt(⟨A∗A⟩∗) = ⟨A∗A⟩∗ (see Section 2). It is seen that if A is left quotient Arens regular, and B is a closed subalgebra of A such that the restriction map A∗ → B∗ maps A∗A onto B∗B, then B is also left quotient Arens regular. Clearly, if A is Arens regular, then A is quotient Arens regular. The examples below illustrate that the converse does not hold even for the two classical quantum group algebras.
Example 38. Let G be a locally compact group.
(i) It is known that L1(G) has a BAI, and L1(G) is not Arens regular unless G is finite (cf. Young [47]). Note that Zt(LUC(G)∗) = Zt(RUC(G)∗) = M(G) = C0(G)∗ (cf. Lau [25]). Therefore, L1(G) is quotient Arens regular if and only if G is compact. In particular, L1(G) is quotient Arens regular but not Arens regular precisely when G is infinite and compact.
(ii) It is also known that Bρ(G) ⊆ Zt(UC(Ĝ)∗) ⊆ UC(Ĝ)∗, and Bρ(G) = UC(Ĝ)∗ if and only if G is discrete, where Bρ(G) is the reduced Fourier–Stieltjes algebra of G and UC(Ĝ) = ⟨A(G)VN(G)⟩ is the C∗-algebra of uniformly continuous functionals on A(G) (cf. Lau and Losert [27]). Then A(F2) is quotient Arens regular since F2 is discrete. However, A(F2) is not Arens regular (cf. Forrest [10]). Let L be a non-amenable second countable connected group, and let H = R × L. By Lau and Losert [27, Corollary 5.9], we have Zt(UC(Ĥ)∗) = Bρ(H). However, Bρ(H) ⊊ UC(Ĥ)∗ since H is non-discrete. Therefore, A(H) is not quotient Arens regular, and hence is not Arens regular. Note that neither of the Fourier algebras A(F2) and A(H) has a BAI.

References
[1] R. Arens, The adjoint of a bilinear operation, Proc. Amer. Math. Soc. 2 (1951) 839–848.
[2] J. Baker, A.T.-M. Lau, J. Pym, Module homomorphisms and topological centres associated with weakly sequentially complete Banach algebras, J. Funct. Anal. 158 (1998) 186–208.
[3] E. Bédos, L. Tuset, Amenability and co-amenability for locally compact quantum groups, Internat. J. Math. 14 (2003) 865–884.
[4] J.F. Berglund, H.D. Junghenn, P. Milnes, Analysis on Semigroups. Function Spaces, Compactifications, Representations, Canad. Math. Soc. Ser. Monogr. Adv. Texts, Wiley–Interscience Publ. John Wiley & Sons, Inc., New York, 1989.
[5] H.G. Dales, Banach Algebras and Automatic Continuity, London Math. Soc. Monogr. New Ser., vol. 24, Oxford University Press, New York, 2000.
[6] H.G. Dales, A.T.-M.
Lau, The second duals of Beurling algebras, Mem. Amer. Math. Soc. 177 (2005), no. 836. [7] H.G. Dales, A.T.-M. Lau, D. Strauss, Banach algebras on semigroups and their compactifications, Mem. Amer. Math. Soc., in press. [8] J. De Cannière, U. Haagerup, Multipliers of the Fourier algebras of some simple Lie groups and their discrete subgroups, Amer. J. Math. 107 (1985) 455–500. [9] H. Farhadi, F. Ghahramani, Involutions on the second duals of group algebras and a multiplier problem, Proc. Edinb. Math. Soc. 50 (2007) 153–161. [10] B. Forrest, Arens regularity and discrete groups, Pacific J. Math. 151 (1991) 217–227. [11] F. Ghahramani, J.P. McClure, M. Meng, On asymmetry of topological centres of the second duals of Banach algebras, Proc. Amer. Math. Soc. 126 (1998) 1765–1768. [12] M. Grosser, Bidualräume und Vervollständigungen von Banachmoduln, Lecture Notes in Math., vol. 717, Springer, Berlin, 1979. [13] M. Grosser, V. Losert, The norm-strict bidual of a Banach algebra and the dual of Cu (G), Manuscripta Math. 45 (1984) 127–146. [14] U. Haagerup, J. Kraus, Approximation properties for group C ∗ -algebras and group von Neumann algebras, Trans. Amer. Math. Soc. 344 (1994) 667–699. [15] E. Hewitt, K.A. Ross, Abstract Harmonic Analysis I, Springer-Verlag, New York, 1979. [16] Z. Hu, M. Neufang, Z.-J. Ruan, Multipliers on a new class of Banach algebras, locally compact quantum groups, and topological centres, preprint, 2007. [17] Z. Hu, M. Neufang, Z.-J. Ruan, Completely bounded multipliers on co-amenable locally compact quantum groups, in preparation. [18] M. Junge, M. Neufang, Z.-J. Ruan, A representation for locally compact quantum groups, Internat. J. Math., in press. [19] E. Kaniuth, A.T.-M. Lau, A separation property of positive definite functions on locally compact groups and applications to Fourier algebras, J. Funct. Anal. 175 (2000) 89–110. [20] J. Kraus, Z.-J. Ruan, Multipliers of Kac algebras, Internat. J. Math. 8 (1996) 213–248.
[21] J. Kustermans, S. Vaes, Locally compact quantum groups, Ann. Sci. École Norm. Sup. 33 (2000) 837–934. [22] J. Kustermans, S. Vaes, Locally compact quantum groups in the von Neumann algebraic setting, Math. Scand. 92 (2003) 68–92. [23] A.T.-M. Lau, Operators which commute with convolutions on subspaces of L∞ (G), Colloq. Math. 39 (1978) 351– 359. [24] A.T.-M. Lau, Analysis on a class of Banach algebras with applications to harmonic analysis on locally compact groups and semigroups, Fund. Math. 118 (1983) 161–175. [25] A.T.-M. Lau, Continuity of Arens multiplication on the dual space of bounded uniformly continuous functions on locally compact groups and topological semigroups, Math. Proc. Cambridge Philos. Soc. 99 (1986) 273–283. [26] A.T.-M. Lau, V. Losert, On the second conjugate algebra of L1 (G) of a locally compact group, J. London Math. Soc. (2) 37 (1988) 464–470. [27] A.T.-M. Lau, V. Losert, The C ∗ -algebra generated by operators with compact support on a locally compact group, J. Funct. Anal. 112 (1993) 1–30. [28] A.T.-M. Lau, A. Ülger, Topological centers of certain dual algebras, Trans. Amer. Math. Soc. 348 (1996) 1191– 1212. [29] H. Leptin, Sur l’algèbre de Fourier d’un groupe localement compact, C. R. Acad. Sci. Paris Sér. A 266 (1968) 1180–1182. [30] V. Losert, Some properties of groups without the property P1 , Comment. Math. Helv. 54 (1979) 133–139. [31] P. Milnes, Uniformity and uniformly continuous functions for locally compact groups, Proc. Amer. Math. Soc. 109 (1990) 567–570. [32] T. Mitchell, Topological semigroups and fixed points, Illinois J. Math. 14 (1970) 630–641. [33] R.D. Mosak, Central functions in group algebras, Proc. Amer. Math. Soc. 29 (1971) 613–616. [34] M. Neufang, Abstrakte Harmonische Analyse und Modulhomomorphismen über von Neumann-Algebren, PhD thesis at University of Saarland, Saarbrücken, Germany, 2000. [35] T.W. Palmer, Banach Algebras and General Theory of ∗-Algebras, vol. 
1, Cambridge University Press, Cambridge, 1994. [36] C. Ramamohana Rao, Invariant means on spaces of continuous or measurable functions, Trans. Amer. Math. Soc. 114 (1965) 187–196. [37] Z.-J. Ruan, The operator amenability of A(G), Amer. J. Math. 117 (1995) 1449–1474. [38] Z.-J. Ruan, Amenability of Hopf von Neumann algebras and Kac algebras, J. Funct. Anal. 139 (1996) 466–499. [39] W. Rudin, Homomorphisms and translations in L∞ (G), Adv. Math. 16 (1975) 72–90. [40] V. Runde, Characterizations of compact and discrete quantum groups through second duals, J. Operator Theory 60 (2008) 415–428. [41] V. Runde, Uniform continuity over locally compact quantum groups, preprint, 2008. [42] M. Takesaki, Theory of Operator Algebras I, Operator Algebras and Non-commutative Geometry V, Encyclopaedia Math. Sci., vol. 124, Springer-Verlag, Berlin, 2002. [43] M. Talagrand, Closed convex hull of set of measurable functions, Riemann-measurable functions and measurability of translations, Ann. Inst. Fourier (Grenoble) 32 (1982) 39–69. [44] R. Tomatsu, Amenable discrete quantum groups, J. Math. Soc. Japan 58 (2006) 949–964. [45] A. van Daele, Locally compact quantum groups. A von Neumann algebra approach, preprint, 2006. [46] B.B. Wells Jr., Homomorphisms and translates of bounded functions, Duke Math. J. 41 (1974) 35–39. [47] N.J. Young, The irregularity of multiplication in group algebras, Q. J. Math. Oxford Ser. 24 (1973) 59–62.
Journal of Functional Analysis 257 (2009) 641–658 www.elsevier.com/locate/jfa
Asymmetric affine Lp Sobolev inequalities Christoph Haberl, Franz E. Schuster ∗ Institute of Discrete Mathematics and Geometry, Vienna University of Technology, Wiedner Hauptstraße 8–10/104, 1040 Vienna, Austria Received 12 September 2008; accepted 21 April 2009
Communicated by K. Ball
Abstract A new sharp affine Lp Sobolev inequality for functions on Rn is established. This inequality strengthens and implies the previously known affine Lp Sobolev inequality which in turn is stronger than the classical Lp Sobolev inequality. © 2009 Elsevier Inc. All rights reserved. Keywords: Sobolev inequalities; Affine isoperimetric inequalities
* Corresponding author.
E-mail address: [email protected] (F.E. Schuster).
0022-1236/$ – see front matter © 2009 Elsevier Inc. All rights reserved. doi:10.1016/j.jfa.2009.04.009

C. Haberl, F.E. Schuster / Journal of Functional Analysis 257 (2009) 641–658

1. Introduction

The sharp Lp Sobolev inequality of Aubin [1] and Talenti [36] is one of the fundamental inequalities of analysis. It plays a central role in a number of different areas such as the theory of partial differential equations, geometric measure theory, and the calculus of variations. In recent years, many variations and generalizations have been obtained, see, e.g., [2,3,6,8,11,32,33,37] and the references therein.

Recently, Zhang [38] (for p = 1) and Lutwak, Yang, and Zhang [27] (for 1 < p < n) formulated and proved a sharp affine Lp Sobolev inequality. This remarkable inequality is invariant under all affine transformations of Rn and turned out to be significantly stronger than the classical Lp Sobolev inequality although it does not rely on any Euclidean geometric structure. As was shown in [38], the affine Zhang–Sobolev inequality is equivalent to the extended Petty projection inequality established in [38].

In the Euclidean setting, all the Lp Sobolev inequalities have the classical isoperimetric inequality at their core (for p = 1 both inequalities are equivalent as discovered by Maz’ya [31] and, independently, by Federer and Fleming [10]). In the affine setting, the situation is more difficult. Here, new geometry is needed to pass from the case p = 1 to p > 1. To establish the affine Lp Sobolev inequality for p > 1, Lutwak, Yang and Zhang [25] had to first establish an Lp Petty projection inequality.

In this article we establish a new sharp affine Lp Sobolev inequality which strengthens and directly implies the previously known sharp affine Lp Sobolev inequality of Lutwak, Yang, and Zhang. The geometry behind this new Sobolev inequality is an Lp affine isoperimetric inequality, stronger than the Lp Petty projection inequality, which was recently established by the authors in [13]. This crucial geometric inequality was made possible by recent advances in valuation theory by Ludwig [17,19].

We denote by W1,p(Rn) the space of real-valued Lp functions on Rn (n ≥ 2) with weak Lp partial derivatives. Let | · | denote the standard Euclidean norm on Rn and let ‖f‖p denote the usual Lp norm of f in Rn. The classical sharp Lp Sobolev inequality states that if f ∈ W1,p(Rn), with real p satisfying 1 ≤ p < n, then
1/p |∇f | dx
cˆn,p f p∗ ,
p
(1.1)
Rn
where p ∗ = np/(n − p). The optimal constants cˆn,p in this inequality are due to Federer and Fleming [10] and Maz’ya [31] for p = 1 and to Aubin [1] and Talenti [36] for p > 1. The extremal functions for inequality (1.1) are the characteristic functions of balls for p = 1 and for p > 1 equality is attained when p/(p−1) 1−n/p f (x) = a + b(x − x0 ) , with a, b > 0, and x0 ∈ Rn . The sharp affine Lp Sobolev inequality of Zhang [38] and Lutwak, Yang, and Zhang [27] states that if f ∈ W 1,p (Rn ), 1 p < n, then
\[
\left( \int_{S^{n-1}} \|D_u f\|_p^{-n} \, du \right)^{-1/n} \ge \tilde{c}_{n,p} \, \|f\|_{p^*}, \tag{1.2}
\]
where $D_u f$ is the directional derivative of $f$ in the direction $u \in S^{n-1}$. The optimal constants $\tilde{c}_{n,p}$ in (1.2) were explicitly computed in [38] (for $p = 1$) and [27]. The determination of $\hat{c}_{n,p}$ and $\tilde{c}_{n,p}$ in (1.1) and (1.2) is in many situations not as important as the identification of extremal functions. The extremals associated with inequality (1.2) for $p = 1$ are the characteristic functions of ellipsoids, and for $p > 1$ equality is attained when

\[
f(x) = \left( a + |\phi(x - x_0)|^{p/(p-1)} \right)^{1 - n/p},
\]

with $a > 0$, $\phi \in GL(n)$ and $x_0 \in \mathbb{R}^n$.

We emphasize that inequality (1.2) is invariant under affine transformations of $\mathbb{R}^n$, while the classical $L_p$ Sobolev inequality (1.1) is invariant only under rigid motions. That the affine $L_p$ Sobolev inequality is stronger than (1.1) follows from an application of Hölder’s inequality (cf. [27, p. 33]):

\[
\left( \int_{\mathbb{R}^n} |\nabla f|^p \, dx \right)^{1/p}
\ge a_{n,p} \left( \int_{S^{n-1}} \|D_u f\|_p^{-n} \, du \right)^{-1/n}
\ge \hat{c}_{n,p} \, \|f\|_{p^*}.
\]
Here, equality in the left inequality holds if and only if $\|D_u f\|_p$ is independent of $u \in S^{n-1}$. The constant $a_{n,p}$ was computed in [27].

For $u \in S^{n-1}$ and $f \in W^{1,p}(\mathbb{R}^n)$, we denote by

\[
D_u^+ f(x) = \max\{ D_u f(x), 0 \}
\]

the positive part of the directional derivative of $f$ in the direction $u$. The main result of this article is the following:

Theorem 1. If $f \in W^{1,p}(\mathbb{R}^n)$, with $1 \le p < n$, then

\[
\left( \int_{S^{n-1}} \|D_u^+ f\|_p^{-n} \, du \right)^{-1/n} \ge c_{n,p} \, \|f\|_{p^*}, \tag{1.3}
\]

where $p^* = np/(n - p)$. For $p > 1$, the optimal constant $c_{n,p}$ is given by

\[
c_{n,p} = 2^{-1/p} \left( \frac{n-p}{p-1} \right)^{1-1/p}
\left( \frac{\Gamma\!\left(\frac{n}{p}\right) \Gamma\!\left(n + 1 - \frac{n}{p}\right)}{\Gamma(n+1)} \right)^{1/n}
\left( \frac{n\,\Gamma\!\left(\frac{n}{2}\right) \Gamma\!\left(\frac{p+1}{2}\right)}{\sqrt{\pi}\,\Gamma\!\left(\frac{n+p}{2}\right)} \right)^{1/p},
\]

and $c_{n,1} = \lim_{p \to 1} c_{n,p}$. If $p = 1$, equality holds in (1.3) for characteristic functions of ellipsoids, and for $p > 1$ equality is attained when

\[
f(x) = \left( a + |\phi(x - x_0)|^{p/(p-1)} \right)^{1 - n/p},
\]

with $a > 0$, $\phi \in GL(n)$ and $x_0 \in \mathbb{R}^n$.

Note that inequality (1.3) is invariant under affine transformations of $\mathbb{R}^n$. We will show in Section 6 that, for $p \ge 1$,

\[
2^{1/p} \left( \int_{S^{n-1}} \|D_u^+ f\|_p^{-n} \, du \right)^{-1/n}
\le \left( \int_{S^{n-1}} \|D_u f\|_p^{-n} \, du \right)^{-1/n}. \tag{1.4}
\]

Since $\tilde{c}_{n,p} = 2^{1/p} c_{n,p}$, the new affine $L_p$ Sobolev inequality (1.3) is stronger than inequality (1.2) of Lutwak, Yang, and Zhang. In particular, inequality (1.3) is also stronger than the classical $L_p$ Sobolev inequality (1.1). It is crucial to observe that while for inequality (1.2) only the even part of the directional derivatives of $f$ contributes, the new inequality (1.3) also accounts for asymmetric parts. This is reflected by the fact that equality in (1.4) holds precisely when $\|D_u^+ f\|_p$ is an even function on $S^{n-1}$.
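To make the comparison of constants explicit, the following short chain may help. It is only a sketch that combines (1.3) and (1.4) with the relation $\tilde{c}_{n,p} = 2^{1/p} c_{n,p}$ quoted above; no additional assumptions are used.

```latex
% How (1.2) is recovered from (1.3) and (1.4).
\begin{align*}
\Big( \int_{S^{n-1}} \|D_u f\|_p^{-n}\,du \Big)^{-1/n}
  &\ge 2^{1/p} \Big( \int_{S^{n-1}} \|D_u^+ f\|_p^{-n}\,du \Big)^{-1/n}
      && \text{by (1.4)} \\
  &\ge 2^{1/p}\, c_{n,p}\, \|f\|_{p^*}
      && \text{by (1.3)} \\
  &= \tilde{c}_{n,p}\, \|f\|_{p^*}
      && \text{since } \tilde{c}_{n,p} = 2^{1/p} c_{n,p}.
\end{align*}
```

The first step is strict unless $\|D_u^+ f\|_p$ is even in $u$, which is where the gain over (1.2) comes from.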
The classical $L_2$ Sobolev inequality has drawn particular attention due to its conformal invariance, see, e.g., [3,6,16]. As noted in [27], the affine $L_2$ Sobolev inequality of Lutwak, Yang, and Zhang is equivalent under an affine transformation to the $L_2$ Sobolev inequality. The case $p = 2$ of inequality (1.3), however, yields a stronger inequality.

While the geometric inequalities behind the affine Zhang–Sobolev inequality and inequality (1.3) for $p = 1$ are the same, a new affine isoperimetric inequality recently established by the authors [13] is needed to establish inequality (1.3) for $p > 1$. We will apply this inequality to convex bodies (associated with the given function) which occur as solutions to the $L_p$ Minkowski problem for $1 < p < n$. Since the geometric inequality assumes that the convex bodies contain the origin in their interiors, its application is intricate in the asymmetric situation. Here, the origin can lie on the boundary of the convex bodies which occur as a solution to the $L_p$ Minkowski problem. All this geometric background will be discussed in detail in Sections 3 and 4.

2. Background material

In the following we state some basic facts about convex bodies and compact domains. General references for the theory of convex bodies are the books by Gardner [12] and Schneider [35]. We will also collect background material from real analysis needed in the proof of Theorem 1.

The setting for this article is Euclidean $n$-space $\mathbb{R}^n$ with $n \ge 2$. A convex body is a compact convex set in $\mathbb{R}^n$ with non-empty interior. Let $\mathcal{K}^n$ denote the set of convex bodies in $\mathbb{R}^n$ endowed with the Hausdorff metric. We write $\mathcal{K}_o^n$ for the set of convex bodies containing the origin in their interiors. A compact convex set $K$ is uniquely determined by its support function $h(K, \cdot)$, where

\[
h(K, x) = \max\{ x \cdot y : y \in K \}, \qquad x \in \mathbb{R}^n,
\]

and where $x \cdot y$ denotes the usual inner product of $x$ and $y$ in $\mathbb{R}^n$. Note that $h(K, \cdot)$ is positively homogeneous of degree one and subadditive. Conversely, every function with these properties is the support function of a unique compact convex set.

If $K \in \mathcal{K}_o^n$, the polar body $K^*$ of $K$ is defined by

\[
K^* = \{ x \in \mathbb{R}^n : x \cdot y \le 1 \text{ for all } y \in K \}.
\]

Let $\rho(K, x) = \max\{ \lambda \ge 0 : \lambda x \in K \}$, $x \in \mathbb{R}^n \setminus \{0\}$, denote the radial function of $K$. It follows from the definitions of support functions and radial functions, and the definition of the polar body of $K$, that

\[
\rho(K^*, \cdot) = h(K, \cdot)^{-1} \quad \text{and} \quad h(K^*, \cdot) = \rho(K, \cdot)^{-1}. \tag{2.1}
\]
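As a quick sanity check of (2.1), one can work it out for a ball. The example below is an illustration added here, not part of the original argument; $B_n$ denotes the Euclidean unit ball and $r > 0$ is an arbitrary radius.

```latex
% Worked example for (2.1): K = r B_n with r > 0, and u a unit vector.
\begin{align*}
h(rB_n, u) &= \max\{ u \cdot y : |y| \le r \} = r, &
\rho(rB_n, u) &= \max\{ \lambda \ge 0 : |\lambda u| \le r \} = r, \\
(rB_n)^* &= \{ x : x \cdot y \le 1 \text{ for all } |y| \le r \}
          = \{ x : r|x| \le 1 \} = r^{-1} B_n, \\
\rho\big((rB_n)^*, u\big) &= r^{-1} = h(rB_n, u)^{-1}, &
h\big((rB_n)^*, u\big) &= r^{-1} = \rho(rB_n, u)^{-1}.
\end{align*}
```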
A compact domain is the closure of a bounded open subset of $\mathbb{R}^n$. If $M$ and $N$ are compact domains in $\mathbb{R}^n$, then the Brunn–Minkowski inequality states that

\[
V(M + N)^{1/n} \ge V(M)^{1/n} + V(N)^{1/n},
\]

where $V$ denotes the usual $n$-dimensional Lebesgue measure. For a compact domain $M$ and a convex body $K$ in $\mathbb{R}^n$, define

\[
n V_1(M, K) = \liminf_{\varepsilon \to 0^+} \frac{V(M + \varepsilon K) - V(M)}{\varepsilon}.
\]
If the boundary $\partial M$ of $M$ is a $C^1$ submanifold of $\mathbb{R}^n$, then

\[
V_1(M, K) = \frac{1}{n} \int_{\partial M} h(K, \nu(x)) \, d\mathcal{H}^{n-1}(x), \tag{2.2}
\]

where $\nu(x)$ is the exterior unit normal vector of $\partial M$ at $x$ and $\mathcal{H}^{n-1}$ denotes $(n-1)$-dimensional Hausdorff measure (cf. [38, Lemma 3.2]). We need the following immediate consequence of the Brunn–Minkowski inequality: If $M$ is a compact domain and $K$ is a convex body in $\mathbb{R}^n$, then

\[
V_1(M, K)^n \ge V(M)^{n-1} V(K). \tag{2.3}
\]
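For readers who want the "immediate consequence" spelled out, here is one way to obtain (2.3). It is a sketch that uses only the Brunn–Minkowski inequality, the homogeneity $V(\varepsilon K) = \varepsilon^n V(K)$, and the definition of $V_1$ given above.

```latex
% Derivation of (2.3) from the Brunn--Minkowski inequality.
\begin{align*}
V(M + \varepsilon K)^{1/n}
  &\ge V(M)^{1/n} + \varepsilon\, V(K)^{1/n}
      && \text{(Brunn--Minkowski, } V(\varepsilon K) = \varepsilon^n V(K)\text{)} \\
V(M + \varepsilon K)
  &\ge \big( V(M)^{1/n} + \varepsilon\, V(K)^{1/n} \big)^n
   \ge V(M) + n \varepsilon\, V(M)^{(n-1)/n} V(K)^{1/n}
      && \text{(binomial expansion, nonnegative terms)} \\
n V_1(M, K)
  &= \liminf_{\varepsilon \to 0^+} \frac{V(M + \varepsilon K) - V(M)}{\varepsilon}
   \ge n\, V(M)^{(n-1)/n} V(K)^{1/n}.
\end{align*}
% Dividing by n and raising both sides to the power n gives (2.3).
```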
We will frequently apply Federer’s co-area formula (see, e.g., [9, p. 258]). For quick reference we state a version which is sufficient for our purposes: If $f : \mathbb{R}^n \to \mathbb{R}$ is locally Lipschitz and $g : \mathbb{R}^n \to [0, \infty)$ is measurable, then, for any Borel set $A \subseteq \mathbb{R}$,

\[
\int_{f^{-1}(A) \cap \{ |\nabla f| > 0 \}} g(x) \, dx
= \int_A \int_{f^{-1}\{y\}} \frac{g(x)}{|\nabla f(x)|} \, d\mathcal{H}^{n-1}(x) \, dy. \tag{2.4}
\]
Finally, we require the following consequence (cf. [2, Proposition 2.18]) of Bliss’ inequality [4]. For an elementary proof we refer to [27, Lemma 4.1]: Let $f : (0, \infty) \to [0, \infty)$ be decreasing and locally absolutely continuous and let $1 < p < n$. If the integrals exist, then

\[
\left( \int_0^\infty |f'(x)|^p \, x^{n-1} \, dx \right)^{1/p}
\ge b_{n,p} \left( \int_0^\infty |f(x)|^{p^*} x^{n-1} \, dx \right)^{1/p^*}, \tag{2.5}
\]

where $p^* = np/(n - p)$ and

\[
b_{n,p} = n^{1/p^*} \left( \frac{n-p}{p-1} \right)^{1-1/p}
\left( \frac{\Gamma\!\left(\frac{n}{p}\right) \Gamma\!\left(n + 1 - \frac{n}{p}\right)}{\Gamma(n)} \right)^{1/n}.
\]

Equality in (2.5) holds if $f(x) = (a x^{p/(p-1)} + b)^{1 - n/p}$, with $a, b > 0$.

3. $L_p$ projection bodies and the $L_p$ Minkowski problem

In this section we collect the material which forms the geometric core in the proof of our main result. The critical ingredients are an $L_p$ affine isoperimetric inequality recently established in [13] and the solution (to the discrete data case) of an $L_p$ extension of the classical Minkowski problem obtained in [7].

The projection body $\Pi K$ of $K \in \mathcal{K}^n$ is the convex body defined by

\[
h(\Pi K, u) = \mathrm{vol}_{n-1}(K \,|\, u^\perp),
\]

where $\mathrm{vol}_{n-1}(K \,|\, u^\perp)$ is the $(n-1)$-dimensional volume of the projection of $K$ onto the hyperplane orthogonal to $u$.
Introduced by Minkowski, projection bodies have become a central notion in convex geometry, see, e.g., [12,13,17,26] and the references therein. A recent result by Ludwig [19] has demonstrated their special place in affine geometry: The projection operator was characterized as the unique valuation which is contravariant with respect to linear transformations.

The fundamental affine isoperimetric inequality for projection bodies is the Petty projection inequality: If $K \in \mathcal{K}^n$, then

\[
V(K)^{n-1} V(\Pi^* K) \le \left( \frac{\kappa_n}{\kappa_{n-1}} \right)^n,
\]

with equality if and only if $K$ is an ellipsoid. Here $\Pi^* K = (\Pi K)^*$ and $\kappa_n$ denotes the volume of the Euclidean unit ball in $\mathbb{R}^n$. This inequality turned out to be far stronger than the classical isoperimetric inequality. It is the geometric inequality behind the affine Zhang–Sobolev inequality [38].

Projection bodies are part of the classical Brunn–Minkowski theory. In a series of articles [22,23], Lutwak showed that merging the notion of volume with Firey’s $L_p$ addition of convex sets leads to a Brunn–Minkowski theory for each $p \ge 1$. Since Lutwak’s seminal work, the topic has been much studied, see, e.g., [5,7,18–21,24,26,27,29,30].

For $p \ge 1$, $K, L \in \mathcal{K}_o^n$ and $\alpha, \beta \ge 0$ (not both zero), the $L_p$ Minkowski combination $\alpha \cdot K +_p \beta \cdot L$ is the convex body defined by

\[
h(\alpha \cdot K +_p \beta \cdot L, \cdot)^p = \alpha\, h(K, \cdot)^p + \beta\, h(L, \cdot)^p.
\]

One of the basic notions of the $L_p$ Brunn–Minkowski theory is the $L_p$ mixed volume $V_p(K, L)$ of two bodies $K, L \in \mathcal{K}_o^n$. It was defined in [22] by

\[
V_p(K, L) = \frac{p}{n} \lim_{\varepsilon \to 0^+} \frac{V(K +_p \varepsilon \cdot L) - V(K)}{\varepsilon}.
\]

Clearly, the diagonal form of $V_p$ reduces to ordinary volume, i.e., for $K \in \mathcal{K}_o^n$,

\[
V_p(K, K) = V(K). \tag{3.1}
\]
It was shown in [22] that corresponding to each convex body $K \in \mathcal{K}_o^n$, there exists a positive Borel measure on $S^{n-1}$, the $L_p$ surface area measure $S_p(K, \cdot)$ of $K$, such that for every $L \in \mathcal{K}_o^n$,

\[
V_p(K, L) = \frac{1}{n} \int_{S^{n-1}} h(L, u)^p \, dS_p(K, u). \tag{3.2}
\]

The measure $S_1(K, \cdot)$ is just the classical surface area measure $S(K, \cdot)$ of $K$. Moreover, it was proved in [22] that the $L_p$ surface area measure is absolutely continuous with respect to $S(K, \cdot)$:

\[
dS_p(K, u) = h(K, u)^{1-p} \, dS(K, u), \qquad u \in S^{n-1}. \tag{3.3}
\]

Recall that for a Borel set $\omega \subseteq S^{n-1}$, $S(K, \omega)$ is the $(n-1)$-dimensional Hausdorff measure of the set of all boundary points of $K$ for which there exists a normal vector of $K$ belonging to $\omega$.
From the homogeneity properties of the surface area measure and the support function of $K$, one obtains that, for every $\lambda > 0$,

\[
S_p(\lambda K, \cdot) = \lambda^{n-p} S_p(K, \cdot). \tag{3.4}
\]
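The homogeneity (3.4) can be read off from (3.3) in one line. The sketch below uses the standard facts $S(\lambda K, \cdot) = \lambda^{n-1} S(K, \cdot)$ and $h(\lambda K, \cdot) = \lambda\, h(K, \cdot)$, which are the "homogeneity properties" alluded to in the text.

```latex
% Derivation of (3.4) from (3.3), for \lambda > 0.
\begin{align*}
dS_p(\lambda K, u)
  &= h(\lambda K, u)^{1-p} \, dS(\lambda K, u) \\
  &= \lambda^{1-p} h(K, u)^{1-p} \cdot \lambda^{n-1} \, dS(K, u) \\
  &= \lambda^{n-p} \, dS_p(K, u).
\end{align*}
```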
For a finite Borel measure $\mu$ on $S^{n-1}$, we define a continuous function $C_p^+ \mu$ on $S^{n-1}$, the asymmetric $L_p$ cosine transform of $\mu$, by

\[
(C_p^+ \mu)(u) = \int_{S^{n-1}} (u \cdot v)_+^p \, d\mu(v), \qquad u \in S^{n-1},
\]

where $(u \cdot v)_+ = \max\{ u \cdot v, 0 \}$. For $f \in C(S^{n-1})$, let $C_p^+ f$ be the asymmetric $L_p$ cosine transform of the absolutely continuous measure (with respect to spherical Lebesgue measure) with density $f$.

The asymmetric $L_p$ projection body $\Pi_p^+ K$ of $K \in \mathcal{K}_o^n$, first considered in [23], is the convex body defined by

\[
h(\Pi_p^+ K, \cdot)^p = C_p^+ S_p(K, \cdot). \tag{3.5}
\]

For $p > 1$, Ludwig [19] established the $L_p$ analogue of her classification of the projection operator: She showed that the convex bodies

\[
c_1 \cdot \Pi_p^+ K +_p c_2 \cdot \Pi_p^- K, \qquad K \in \mathcal{K}_o^n, \tag{3.6}
\]

where $\Pi_p^- K = \Pi_p^+(-K)$ and $c_1, c_2 \ge 0$ (not both zero), constitute all natural $L_p$ extensions of projection bodies. The (symmetric) $L_p$ projection body $\Pi_p K$ of $K \in \mathcal{K}_o^n$, defined in [26], is

\[
\Pi_p K = \tfrac{1}{2} \cdot \Pi_p^+ K +_p \tfrac{1}{2} \cdot \Pi_p^- K.
\]
Lutwak, Yang, and Zhang [26] (see also Campi and Gronchi [5]) established an $L_p$ extension of the Petty projection inequality for the (symmetric) $L_p$ projection operator which forms the geometry behind their sharp affine $L_p$ Sobolev inequality: If $K \in \mathcal{K}_o^n$, then

\[
V(K)^{n/p - 1} V(\Pi_p^* K) \le
\left( \frac{\kappa_n \, \Gamma\!\left(\frac{n+p}{2}\right)}{\pi^{(n-1)/2} \, \Gamma\!\left(\frac{1+p}{2}\right)} \right)^{n/p}, \tag{3.7}
\]

with equality if and only if $K$ is an ellipsoid centered at the origin.

Recently the authors [13] established the $L_p$ Petty projection inequality for each member of the family (3.6) of $L_p$ projection operators. The geometric core of the asymmetric affine $L_p$ Sobolev inequality (1.3) is the following special case of this result:

Theorem 2. If $p > 1$ and $K \in \mathcal{K}_o^n$, then

\[
V(K)^{n/p - 1} V(\Pi_p^{+,*} K) \le
\left( \frac{\kappa_n \, \Gamma\!\left(\frac{n+p}{2}\right)}{\pi^{(n-1)/2} \, \Gamma\!\left(\frac{1+p}{2}\right)} \right)^{n/p}, \tag{3.8}
\]

where equality is attained if $K$ is an ellipsoid centered at the origin.
Although this inequality was formulated in [13] for dimensions $n \ge 3$, we remark that it also holds true in dimension $n = 2$. The proof is word for word the same as the one given in [13]. Since surface area measures have their center of mass at the origin, we have $\Pi_1^+ K = \Pi K$. Thus, for $p = 1$, inequality (3.8) is the classical Petty projection inequality. It was also shown in [13] that inequality (3.8), for $p > 1$, is stronger than the $L_p$ Petty projection inequality (3.7) of Lutwak, Yang, and Zhang: If $K \in \mathcal{K}_o^n$, then

\[
V(\Pi_p^* K) \le V(\Pi_p^{+,*} K). \tag{3.9}
\]

If $p$ is not an odd integer, equality holds precisely for origin-symmetric $K$.

We turn now to the second main ingredient in the proof of Theorem 1. The $L_p$ Minkowski problem asks for necessary and sufficient conditions for a Borel measure $\mu$ on $S^{n-1}$ to be the $L_p$ surface area measure of a convex body. A solution to this problem for $p > n$ was given by Chou and Wang [7]. Moreover, Chou and Wang [7] established the solution to the discrete-data case of the $L_p$ Minkowski problem for all $p > 1$ (see also [15] for an alternate approach). The following solution to the discrete $L_p$ Minkowski problem due to Chou and Wang will be crucial:

Theorem 3. If $\alpha_1, \ldots, \alpha_k > 0$ and $u_1, \ldots, u_k \in S^{n-1}$ are not contained in a closed hemisphere, then, for any $p > 1$, $p \ne n$, there exists a unique polytope $P \in \mathcal{K}_o^n$ such that

\[
\sum_{j=1}^{k} \alpha_j \delta_{u_j} = S_p(P, \cdot).
\]
Here, $\delta_u$ denotes the probability measure with unit point mass at $u \in S^{n-1}$.

We will also apply two auxiliary results [28, Lemmas 2.2 and 2.3] concerning the volume normalized $L_p$ Minkowski problem: Let $\mu$ be a positive Borel measure on $S^{n-1}$, and let $K \in \mathcal{K}^n$ contain the origin. Suppose that

\[
V(K) \, h(K, \cdot)^{p-1} \mu = S(K, \cdot),
\]

and that for some constant $c > 0$,

\[
\int_{S^{n-1}} (u \cdot v)_+^p \, d\mu(v) \ge \frac{n}{c^p} \qquad \text{for every } u \in S^{n-1}.
\]

Then

\[
V(K) \ge \kappa_n \left( \frac{n}{\mu(S^{n-1})} \right)^{n/p} \quad \text{and} \quad K \subset c B_n, \tag{3.10}
\]

where $B_n$ denotes the Euclidean unit ball in $\mathbb{R}^n$.
4. A critical lemma

A crucial part in the proof of our main result is the construction of a family of convex bodies containing the origin in their interiors from a given function. It is essential that the origin is an interior point in order to apply the critical geometric inequality (3.8) afterwards. In [26], this was done by using the solution to the even $L_p$ Minkowski problem. In our case, we have to deal with the solutions to the general $L_p$ Minkowski problem. Here, the bodies can contain the origin in their boundaries (cf. [15]). Therefore, we will associate a two-parameter family of convex polytopes with a given function. These polytopes are obtained from the solution to the discrete-data case of the $L_p$ Minkowski problem, which ensures that they contain the origin as an interior point. This will allow us to use the relevant geometric inequality.

A function $f \in C^\infty(\mathbb{R}^n)$ is called smooth. Suppose $f$ is smooth and has compact support. Then the level set

\[
[f]_t = \{ x \in \mathbb{R}^n : |f(x)| \ge t \}
\]

is compact for every $0 < t \le \|f\|_\infty$, where $\|f\|_\infty$ denotes the maximum value of $|f|$ over $\mathbb{R}^n$.

Lemma 1. Suppose that $f : \mathbb{R}^n \to \mathbb{R}$ is smooth and has compact support. Then, for almost every $t \in (0, \|f\|_\infty)$, there exists a sequence of convex polytopes $P_k^t \in \mathcal{K}_o^n$, $k \in \mathbb{N}$, such that

\[
\lim_{k \to \infty} P_k^t = K_f^t \in \mathcal{K}^n
\]

and

\[
V(K_f^t) = \frac{1}{n} \int_{\partial [f]_t} h(K_f^t, \nabla f(x))^p \, |\nabla f(x)|^{-1} \, d\mathcal{H}^{n-1}(x). \tag{4.1}
\]

Moreover, there exists a convex body $L_f^t \in \mathcal{K}_o^n$ such that

\[
\lim_{k \to \infty} \Pi_p^+ P_k^t = L_f^t.
\]
Proof. By Sard’s theorem, for almost every $t \in (0, \|f\|_\infty)$, the boundary $\partial [f]_t$ of $[f]_t$ is a smooth $(n-1)$-dimensional submanifold with everywhere nonzero normal vector $\nabla f$. Let $t$ be chosen in this way and denote by $\nu(x) = \nabla f(x)/|\nabla f(x)|$ the unit normal of $\partial [f]_t$ at $x$. Let $\mu^t$ be the finite positive Borel measure on $S^{n-1}$ defined by

\[
\int_{S^{n-1}} g(v) \, d\mu^t(v) = \int_{\partial [f]_t} g(\nu(x)) \, |\nabla f(x)|^{p-1} \, d\mathcal{H}^{n-1}(x), \tag{4.2}
\]

for $g \in C(S^{n-1})$. Since

\[
\{ \nu(x) : x \in \partial [f]_t \} = S^{n-1}, \tag{4.3}
\]
it follows that for every $u \in S^{n-1}$,

\[
\int_{S^{n-1}} (u \cdot v)_+^p \, d\mu^t(v)
= \int_{\partial [f]_t} (u \cdot \nu(x))_+^p \, |\nabla f(x)|^{p-1} \, d\mathcal{H}^{n-1}(x) > 0.
\]

Therefore, the measure $\mu^t$ cannot be concentrated in a closed hemisphere. As in [35, pp. 392–393], construct a sequence $\mu_k^t$, $k \in \mathbb{N}$, of discrete measures on $S^{n-1}$ whose support is not contained in a closed hemisphere and such that $\mu_k^t$ converges weakly to $\mu^t$ as $k \to \infty$. By Theorem 3, for each $k \in \mathbb{N}$, there exists a polytope $P_k^t \in \mathcal{K}_o^n$ such that

\[
\mu_k^t = S_p(P_k^t, \cdot). \tag{4.4}
\]

We want to show that the sequence of polytopes $P_k^t$ is bounded. To this end, define for each $k \in \mathbb{N}$ a new polytope $Q_k^t$ by

\[
Q_k^t = V(P_k^t)^{-1/p} P_k^t.
\]

By (3.3) and the homogeneity (3.4) of $L_p$ surface area measures, the polytopes $Q_k^t$, $k \in \mathbb{N}$, form a solution to the volume normalized $L_p$ Minkowski problem

\[
V(Q_k^t) \, h(Q_k^t, \cdot)^{p-1} \mu_k^t = S(Q_k^t, \cdot). \tag{4.5}
\]
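The rescaling behind (4.5) can be checked directly. The following is a sketch that abbreviates $P = P_k^t$, $Q = Q_k^t$, $\mu = \mu_k^t$ (shorthand introduced only for this computation) and uses (3.3), (3.4), (4.4) together with $V(\lambda P) = \lambda^n V(P)$.

```latex
% Verification of (4.5) for Q = V(P)^{-1/p} P with \mu = S_p(P, \cdot).
\begin{align*}
V(Q) &= V(P)^{-n/p}\, V(P) = V(P)^{-(n-p)/p}, \\
S_p(Q, \cdot) &= V(P)^{-(n-p)/p}\, S_p(P, \cdot) = V(Q)\, \mu
   && \text{by (3.4) and (4.4)}, \\
S(Q, \cdot) &= h(Q, \cdot)^{p-1}\, S_p(Q, \cdot) = V(Q)\, h(Q, \cdot)^{p-1} \mu
   && \text{by (3.3)}.
\end{align*}
% The rescaling is inverted by P = V(P)^{1/p} Q = V(Q)^{1/(p-n)} Q,
% since V(Q)^{1/(p-n)} = V(P)^{1/p}.
```

The last line is the relation between $P_k^t$ and $Q_k^t$ that is used later to transfer boundedness from $Q_k^t$ back to $P_k^t$.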
Moreover, from definition (3.5), relation (4.4) and the weak convergence of the measures $\mu_k^t$, it follows that for every $u \in S^{n-1}$,

\[
h(\Pi_p^+ P_k^t, u)^p = \int_{S^{n-1}} (u \cdot v)_+^p \, d\mu_k^t(v)
\longrightarrow \int_{S^{n-1}} (u \cdot v)_+^p \, d\mu^t(v) > 0. \tag{4.6}
\]

Since pointwise convergence of support functions implies uniform convergence (see, e.g., [35, Theorem 1.8.12]), there exists a $c > 0$ such that for all $k \in \mathbb{N}$,

\[
\int_{S^{n-1}} (u \cdot v)_+^p \, d\mu_k^t(v) > c \qquad \text{for every } u \in S^{n-1}. \tag{4.7}
\]

From (4.5), (4.7) and (3.10), we deduce that the sequence $Q_k^t$, $k \in \mathbb{N}$, is bounded. Moreover, by (3.10) and the weak convergence of the measures $\mu_k^t$, the volumes $V(Q_k^t)$ are bounded from below by a constant independent of $k$. Therefore, the original sequence $P_k^t = V(Q_k^t)^{1/(p-n)} Q_k^t$ is also bounded. By the Blaschke selection theorem (see, e.g., [35, Theorem 1.8.6]), we can select a subsequence of the $P_k^t$ converging to a convex body $K_f^t$. After relabeling (if necessary) we may assume that $\lim_{k \to \infty} P_k^t = K_f^t$. From (3.1), (3.2), and relation (4.4), we obtain

\[
V(K_f^t) = \lim_{k \to \infty} V(P_k^t) = \lim_{k \to \infty} \frac{1}{n} \int_{S^{n-1}} h(P_k^t, v)^p \, d\mu_k^t(v).
\]
Thus, the uniform convergence of the support functions $h(P_k^t, \cdot)$, the weak convergence of the measures $\mu_k^t$, and definition (4.2) yield

\[
V(K_f^t) = \frac{1}{n} \int_{\partial [f]_t} h(K_f^t, \nabla f(x))^p \, |\nabla f(x)|^{-1} \, d\mathcal{H}^{n-1}(x).
\]

Finally, we define $h(L_f^t, \cdot)^p = C_p^+ \mu^t$. By definition (4.2), we have

\[
h(L_f^t, u)^p = \int_{\partial [f]_t} (u \cdot \nabla f(x))_+^p \, |\nabla f(x)|^{-1} \, d\mathcal{H}^{n-1}(x), \qquad u \in S^{n-1}. \tag{4.8}
\]

From (4.6), we deduce that $h(L_f^t, \cdot)$ is the support function of a convex body $L_f^t \in \mathcal{K}_o^n$ and that $\lim_{k \to \infty} \Pi_p^+ P_k^t = L_f^t$. □

5. Proof of the main result

After these preparations, we are now in a position to prove our main result. We want to point out that the approach we use to establish Theorem 1 is based on ideas and techniques of Lutwak, Yang, and Zhang [27].

We will need the decreasing rearrangement $\bar{f}$ of a function $f : \mathbb{R}^n \to \mathbb{R}$. It is defined by

\[
\bar{f}(x) = \inf\{ t > 0 : V([f]_t) < \kappa_n |x|^n \}.
\]

Note that the level set $[\bar{f}]_t$ is a dilate of the unit ball $B_n$ and its volume is equal to $V([f]_t)$. Moreover, for all $p \ge 1$,

\[
\|f\|_p = \|\bar{f}\|_p. \tag{5.1}
\]
We will first reduce the proof of Theorem 1 to the class of smooth functions with compact support.

Lemma 2. In order to prove Theorem 1, it is sufficient to verify the following assertion: If $f \in C^\infty(\mathbb{R}^n)$ has compact support and $1 \le p < n$, then

\[
\left( \int_{S^{n-1}} \|D_u^+ f\|_p^{-n} \, du \right)^{-1/n} \ge c_{n,p} \, \|f\|_{p^*}. \tag{5.2}
\]

Proof. Assume that (5.2) holds for smooth functions with compact support and let $f \in W^{1,p}$. We may assume that the set $\{ x \in \mathbb{R}^n : f(x) \ne 0 \}$ has positive measure. First, we will show that $\|D_u^+ f\|_p > 0$ for every $u \in S^{n-1}$. We may assume that $u = e_n$ is the last canonical basis vector. We denote the indicator function of a set $A \subseteq \mathbb{R}^n$ by $I_A$. Since for each $N \in \mathbb{N}$, almost all points in $\mathbb{R}^n$ are Lebesgue points of $f \cdot I_{[-N,N]^n}$ (see, e.g., [34, Theorem 7.7]), there exists an $n$-box $P = [a_1, b_1] \times \cdots \times [a_n, b_n]$ such that $\int_P f \ne 0$.
If $\int_P f > 0$, then, since $f \in L_p(\mathbb{R}^n)$, there exist real $a < b < c$ such that

\[
\int_a^b \int_{P'} f < \int_b^c \int_{P'} f,
\]

where $P'$ denotes the $(n-1)$-box $[a_1, b_1] \times \cdots \times [a_{n-1}, b_{n-1}]$. Let $0 < \varepsilon < 1$ and let $g_i : \mathbb{R} \to [0, 1]$, $i = 1, \ldots, n-1$, be smooth functions with $g_i = 1$ on $[a_i, b_i]$ and $g_i = 0$ on $(a_i - \varepsilon, b_i + \varepsilon)^c$. Furthermore, define $g_n : \mathbb{R} \to \mathbb{R}$ by $g_n(x) = \int_{-\infty}^{x} h_n(s) \, ds$, where $h_n$ is a smooth function which is equal to $\varepsilon$ on $[a + \varepsilon, b - \varepsilon]$, $-\varepsilon$ on $[b + \varepsilon, c - \varepsilon]$, and zero on $[a, c]^c$. If we set $\phi(x) = g_1(x_1) \cdots g_n(x_n)$, then $\phi$ is a non-negative, smooth, and compactly supported function such that $\int_{\mathbb{R}^n} f \, \partial_n \phi < 0$ for sufficiently small $\varepsilon$. Here, $\partial_n \phi$ denotes the $n$th partial derivative of $\phi$. If $\int_P f < 0$, then the above argument applied to $x \mapsto -f(-x)$ yields a non-positive, smooth, and compactly supported function $\phi$ with $\int_{\mathbb{R}^n} f \, \partial_n \phi > 0$.

Now suppose that $\|D_{e_n}^+ f\|_p = 0$. This implies, by the definition of weak derivatives, that $\int_{\mathbb{R}^n} f \, \partial_n \phi \ge 0$ ($\le 0$) for every smooth and compactly supported $\phi$ which is non-negative (non-positive). This is a contradiction to the above construction. Thus $\|D_u^+ f\|_p > 0$ for every $u \in S^{n-1}$.

Since $f \in W^{1,p}$, we can find a sequence $f_k$, $k \in \mathbb{N}$, of smooth functions with compact support such that

\[
\|f_k - f\|_p \to 0 \quad \text{and} \quad \|\partial_i f_k - \partial_i f\|_p \to 0
\]

for $i = 1, \ldots, n$. By Minkowski’s inequality we have

\[
c_{n,p} \, \|f_l - f_m\|_{p^*}
\le \left( \int_{S^{n-1}} \|D_u^+ (f_l - f_m)\|_p^{-n} \, du \right)^{-1/n}
\le \frac{1}{\omega_n^{1/n}} \sum_{i=1}^{n} \|\partial_i f_l - \partial_i f_m\|_p
\]
for all $l, m \in \mathbb{N}$, where $\omega_n$ denotes the surface area of the Euclidean unit ball in $\mathbb{R}^n$. Consequently, the sequence $f_k$, $k \in \mathbb{N}$, is a Cauchy sequence in $L_{p^*}(\mathbb{R}^n)$. By the completeness of $L_{p^*}(\mathbb{R}^n)$, there exists a function $g$ such that $\|f_k - g\|_{p^*} \to 0$. Since sequences of functions converging in $L_q$, $q > 0$, possess a subsequence converging almost everywhere, we can find $f_{k_j}$, $j \in \mathbb{N}$, such that $f_{k_j} \to f$ and $f_{k_j} \to g$ almost everywhere. We conclude that $f = g$ almost everywhere and hence $f_k \to f$ also in $L_{p^*}(\mathbb{R}^n)$.

By the first part of the proof, $\lim_{k \to \infty} \|D_u^+ f_k\|_p^{-n} = \|D_u^+ f\|_p^{-n}$ for every unit vector $u \in S^{n-1}$. Thus an application of Fatou’s lemma yields

\[
\int_{S^{n-1}} \|D_u^+ f\|_p^{-n} \, du
= \int_{S^{n-1}} \lim_{k \to \infty} \|D_u^+ f_k\|_p^{-n} \, du
\le \liminf_{k \to \infty} \int_{S^{n-1}} \|D_u^+ f_k\|_p^{-n} \, du
\le \lim_{k \to \infty} c_{n,p}^{-n} \|f_k\|_{p^*}^{-n}
= c_{n,p}^{-n} \|f\|_{p^*}^{-n}. \qquad \Box
\]
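Proof of Theorem 1. In the following let $p > 1$. By Lemma 2 we may assume that $f$ is a smooth function with compact support which is not identically zero. An application of the co-area formula (2.4) shows that

\[
\|D_u^+ f\|_p^p = \int_{\mathbb{R}^n} (u \cdot \nabla f(x))_+^p \, dx
= \int_0^{\|f\|_\infty} \int_{\partial [f]_t} \frac{(u \cdot \nabla f(x))_+^p}{|\nabla f(x)|} \, d\mathcal{H}^{n-1}(x) \, dt.
\]

By Lemma 1 and (4.8), there exists a convex body $L_f^t \in \mathcal{K}_o^n$ such that

\[
\left( \int_{S^{n-1}} \|D_u^+ f\|_p^{-n} \, du \right)^{-p/n}
= \left( \int_{S^{n-1}} \left( \int_0^{\|f\|_\infty} h(L_f^t, u)^p \, dt \right)^{-n/p} du \right)^{-p/n}.
\]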
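Since $h(L_f^t, \cdot)$ is positive, we can apply a consequence of Minkowski’s integral inequality (see, e.g., [14, p. 148]), to obtain

\[
\left( \int_{S^{n-1}} \|D_u^+ f\|_p^{-n} \, du \right)^{-p/n}
\ge \int_0^{\|f\|_\infty} \left( \int_{S^{n-1}} h(L_f^t, u)^{-n} \, du \right)^{-p/n} dt.
\]

Using (2.1) and the polar coordinate formula for volume, we deduce

\[
\left( \int_{S^{n-1}} \|D_u^+ f\|_p^{-n} \, du \right)^{-p/n}
\ge \int_0^{\|f\|_\infty} \left( n V(L_f^{t,*}) \right)^{-p/n} dt. \tag{5.3}
\]

By Lemma 1, there exists a sequence of convex polytopes $P_k^t \in \mathcal{K}_o^n$ such that $\lim_{k \to \infty} P_k^t = K_f^t \in \mathcal{K}^n$ and $\lim_{k \to \infty} \Pi_p^+ P_k^t = L_f^t$. Thus, from an application of Theorem 2, we obtain

\[
\left( n V(L_f^{t,*}) \right)^{-p/n}
= \lim_{k \to \infty} \left( n V(\Pi_p^{+,*} P_k^t) \right)^{-p/n}
\ge e_{n,p} \, V(K_f^t)^{(n-p)/n}, \tag{5.4}
\]

where

\[
e_{n,p} = \frac{\pi^{(n-1)/2} \, \Gamma\!\left(\frac{1+p}{2}\right)}{n^{p/n} \, \kappa_n \, \Gamma\!\left(\frac{n+p}{2}\right)}.
\]

From (5.3) and (5.4), we deduce

\[
\left( \int_{S^{n-1}} \|D_u^+ f\|_p^{-n} \, du \right)^{-p/n}
\ge e_{n,p} \int_0^{\|f\|_\infty} V(K_f^t)^{(n-p)/n} \, dt. \tag{5.5}
\]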
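An application of Hölder’s integral inequality to volume formula (4.1) yields

\[
V(K_f^t)^{(n-p)/(np)} \ge n^{1 - 1/p}
\left( \int_{\partial [f]_t} \frac{d\mathcal{H}^{n-1}(x)}{|\nabla f(x)|} \right)^{(1-p)/p}
V(K_f^t)^{-1/n} \, V_1([f]_t, K_f^t),
\]

where we have used integral representation (2.2). From inequality (2.3), we deduce further that

\[
V(K_f^t)^{(n-p)/n} \ge n^{p-1}
\left( \int_{\partial [f]_t} \frac{d\mathcal{H}^{n-1}(x)}{|\nabla f(x)|} \right)^{1-p}
V([f]_t)^{(n-1)p/n}. \tag{5.6}
\]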
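The two steps leading to (5.6) can be unpacked as follows. This is a sketch only: it writes $\nu = \nabla f / |\nabla f|$ as in the proof of Lemma 1, uses the homogeneity of the support function to rewrite the integrand of (4.1) as $h(K_f^t, \nu)^p |\nabla f|^{p-1}$, and abbreviates the integral of $|\nabla f|^{-1}$ over $\partial [f]_t$ by $I_t$ (a symbol introduced here for brevity).

```latex
% Step 1: Hoelder's inequality with exponents p and p/(p-1), using (2.2) and (4.1).
\begin{align*}
n V_1([f]_t, K_f^t)
  &= \int_{\partial [f]_t} h(K_f^t, \nu)\, |\nabla f|^{(p-1)/p}\, |\nabla f|^{-(p-1)/p} \, d\mathcal{H}^{n-1} \\
  &\le \Big( \int_{\partial [f]_t} h(K_f^t, \nu)^p\, |\nabla f|^{p-1} \, d\mathcal{H}^{n-1} \Big)^{1/p} I_t^{(p-1)/p}
   = \big( n V(K_f^t) \big)^{1/p} I_t^{(p-1)/p}.
\end{align*}
% Step 2: insert (2.3) in the form V_1([f]_t, K_f^t) >= V([f]_t)^{(n-1)/n} V(K_f^t)^{1/n}.
\begin{align*}
V(K_f^t)^{1/p} &\ge n^{1-1/p}\, I_t^{(1-p)/p}\, V([f]_t)^{(n-1)/n}\, V(K_f^t)^{1/n}, \\
V(K_f^t)^{(n-p)/n} &\ge n^{p-1}\, I_t^{1-p}\, V([f]_t)^{(n-1)p/n}
   \qquad \text{(divide by } V(K_f^t)^{1/n} \text{, then raise to the power } p\text{)}.
\end{align*}
```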
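Another application of the co-area formula (2.4) yields

\[
\int_t^{\|f\|_\infty} \int_{\partial [f]_s} \frac{d\mathcal{H}^{n-1}(x)}{|\nabla f(x)|} \, ds
= V\big( [f]_t \cap \{ |\nabla f| > 0 \} \big).
\]

Using Sard’s theorem, it is not hard to show that for almost every $t$ satisfying $0 < t < \|f\|_\infty$, there exists a neighborhood $U_t$ of $t$ such that $V\big( f^{-1}(U_t) \cap \{ |\nabla f| > 0 \} \big) = V\big( f^{-1}(U_t) \big)$. Therefore, we obtain for almost every $t$ with $0 < t < \|f\|_\infty$,

\[
\int_{\partial [f]_t} \frac{d\mathcal{H}^{n-1}(x)}{|\nabla f(x)|} = -V([f]_t)'. \tag{5.7}
\]

Combining (5.5), (5.6), and (5.7), we obtain

\[
\left( \int_{S^{n-1}} \|D_u^+ f\|_p^{-n} \, du \right)^{-p/n}
\ge \int_0^{\|f\|_\infty} \frac{e_{n,p} \, V([f]_t)^{(n-1)p/n}}{n^{1-p} \, \big( -V([f]_t)' \big)^{p-1}} \, dt. \tag{5.8}
\]

In order to estimate the integral on the right-hand side of (5.8), define $\hat{f} : (0, \infty) \to \mathbb{R}$ by $\bar{f}(x) = \hat{f}(1/|x|)$. Since the decreasing rearrangement $\bar{f}(x)$ depends only on the Euclidean norm of $x$, the function $\hat{f}$ is well defined and increasing. Noting that $\hat{f}$ is locally Lipschitz, the substitution rule thus yields

\[
\int_0^{\|f\|_\infty} \frac{V([f]_t)^{(n-1)p/n}}{\big( -V([f]_t)' \big)^{p-1}} \, dt
= n^{1-p} \kappa_n^{1 - p/n} \int_0^\infty \hat{f}'(s)^p \, s^{2p - n - 1} \, ds.
\]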
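Hence, we can rewrite (5.8) as

\[
\left( \int_{S^{n-1}} \|D_u^+ f\|_p^{-n} \, du \right)^{-p/n}
\ge e_{n,p} \, \kappa_n^{1 - p/n} \int_0^\infty \hat{f}'(s)^p \, s^{2p - n - 1} \, ds. \tag{5.9}
\]

Using polar coordinates and (5.1), we see that

\[
\|\bar{f}\|_{p^*}^{p^*} = n \kappa_n \int_0^\infty \hat{f}(s)^{p^*} s^{-n-1} \, ds = \|f\|_{p^*}^{p^*}.
\]

The substitution $t = 1/s$ and an application of inequality (2.5) therefore yield

\[
\left( \int_0^\infty \hat{f}'(s)^p \, s^{2p - n - 1} \, ds \right)^{1/p}
\ge \frac{b_{n,p}}{n^{1/p^*} \kappa_n^{1/p^*}} \, \|f\|_{p^*}. \tag{5.10}
\]

Finally, combine inequalities (5.9) and (5.10) to obtain the desired result

\[
\left( \int_{S^{n-1}} \|D_u^+ f\|_p^{-n} \, du \right)^{-1/n} \ge c_{n,p} \, \|f\|_{p^*}. \tag{5.11}
\]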
In order to see that inequality (1.3) is sharp, take for smooth $K \in \mathcal{K}_o^n$,

\[
f(x) = \left( 1 + \rho(K, x)^{p/(1-p)} \right)^{1 - n/p}. \tag{5.12}
\]

Then, a straightforward (but tedious) calculation shows that inequality (1.3) reduces to the $L_p$ affine isoperimetric inequality (3.8), where equality holds if $K$ is an ellipsoid centered at the origin.

Clearly, the case $p = 1$ of inequality (1.3) can be obtained from a limit of inequality (5.11) as $p \to 1$:

\[
\left( \frac{1}{n} \int_{S^{n-1}} \|D_u^+ f\|_1^{-n} \, du \right)^{-1/n}
\ge \frac{\kappa_{n-1}}{\kappa_n} \, \|f\|_{1^*}. \tag{5.13}
\]

Noting that $\Pi_1^+ = \Pi$, one can show (cf. [38]) that for characteristic functions of convex bodies, inequality (5.13) reduces to the Petty projection inequality, where equality is attained for ellipsoids. □

We remark that for $p > 1$ the affine $L_p$ Sobolev inequality (1.2) of Lutwak, Yang, and Zhang reduces to the $L_p$ Petty projection inequality (3.7) if we take $f$ as in (5.12). Thus, it follows from (3.9) that the new inequality (1.3) is in general stronger than (1.2). We will make this fact even more explicit in the next section.
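As a consistency check on the constant in (5.13), one can evaluate the limit of $c_{n,p}$ as $p \to 1$ by hand. The computation below is a sketch; it relies on the standard formula $\kappa_n = \pi^{n/2} / \Gamma(1 + n/2)$, which is not stated in the text.

```latex
% The factor with the singular base tends to 1:
\[
\Big( \frac{n-p}{p-1} \Big)^{1-1/p}
  = \exp\!\Big( \frac{p-1}{p} \log \frac{n-p}{p-1} \Big) \longrightarrow 1
  \qquad (p \to 1^+).
\]
% The remaining factors are continuous at p = 1, with \Gamma(1) = 1 and \Gamma(n+1) = n\,\Gamma(n):
\[
c_{n,1} = \frac{1}{2}\, n^{-1/n}\,
          \frac{n\, \Gamma(\frac{n}{2})}{\sqrt{\pi}\, \Gamma(\frac{n+1}{2})}.
\]
% Compare with the ratio of ball volumes:
\[
\frac{\kappa_{n-1}}{\kappa_n}
  = \frac{\Gamma(1 + \frac{n}{2})}{\sqrt{\pi}\, \Gamma(\frac{n+1}{2})}
  = \frac{n\, \Gamma(\frac{n}{2})}{2 \sqrt{\pi}\, \Gamma(\frac{n+1}{2})}
  = n^{1/n}\, c_{n,1}.
\]
% Multiplying (1.3) for p = 1 by n^{1/n} therefore gives (5.13).
```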
6. A stronger inequality

In this last section we show that Theorem 1 provides a stronger result than the affine $L_p$ Sobolev inequality (1.2) of Zhang and Lutwak, Yang, and Zhang. The basic concept behind this observation is a convex body associated with a given function $f$. For $p \ge 1$ and $f \in W^{1,p}(\mathbb{R}^n)$, let $B_p^+(f)$ be the convex body defined by

\[
h(B_p^+(f), u) = \left( \int_{\mathbb{R}^n} \big( D_u^+ f(x) \big)^p \, dx \right)^{1/p}
= \left( \int_{\mathbb{R}^n} (u \cdot \nabla f(x))_+^p \, dx \right)^{1/p}.
\]

From Minkowski’s integral inequality, we deduce that $h(B_p^+(f), \cdot)$ is sublinear and therefore the support function of a unique convex body $B_p^+(f)$. Moreover, by Lemma 2, this body contains the origin in its interior. By (2.1) and the polar coordinate formula for volume, the volume of its polar body is given by

\[
V(B_p^{+,*}(f)) = \frac{1}{n} \int_{S^{n-1}} \|D_u^+ f\|_p^{-n} \, du.
\]

Therefore, we can rewrite our main theorem as follows:

Theorem 1′. If $f \in W^{1,p}(\mathbb{R}^n)$, with $1 \le p < n$, then

\[
V(B_p^{+,*}(f))^{-1/n} \ge k_{n,p} \, \|f\|_{p^*}.
\]

The optimal constant $k_{n,p}$ is given by

\[
k_{n,p} = 2^{-1/p} \left( \frac{n-p}{p-1} \right)^{1-1/p}
\left( \frac{\Gamma\!\left(\frac{n}{p}\right) \Gamma\!\left(n + 1 - \frac{n}{p}\right)}{\Gamma(n)} \right)^{1/n}
\left( \frac{n\,\Gamma\!\left(\frac{n}{2}\right) \Gamma\!\left(\frac{p+1}{2}\right)}{\sqrt{\pi}\,\Gamma\!\left(\frac{n+p}{2}\right)} \right)^{1/p}.
\]

From the definition of $L_p$ Minkowski addition, it follows that

\[
h(B_p^+(f) +_p B_p^+(-f), u) = \left( \int_{\mathbb{R}^n} |D_u f(x)|^p \, dx \right)^{1/p}. \tag{6.1}
\]

Thus, the following reformulation of inequality (1.4) shows that Theorem 1 is stronger than inequality (1.2):

Theorem 4. If $p \ge 1$ and $f \in W^{1,p}(\mathbb{R}^n)$, then

\[
V\big( (B_p^+(f) +_p B_p^+(-f))^* \big) \le 2^{-n/p} \, V(B_p^{+,*}(f)),
\]

with equality if and only if $B_p^+(f)$ is origin symmetric.
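To see why Theorem 4 is the same statement as (1.4), it may help to translate both volumes back into integrals over the sphere. The sketch below uses only (6.1), (2.1) and the polar coordinate formula for volume, exactly as in the computation of $V(B_p^{+,*}(f))$ above.

```latex
% By (6.1), the support function of B_p^+(f) +_p B_p^+(-f) in direction u is \|D_u f\|_p, so
\[
V\big( (B_p^+(f) +_p B_p^+(-f))^* \big) = \frac{1}{n} \int_{S^{n-1}} \|D_u f\|_p^{-n}\,du,
\qquad
V\big( B_p^{+,*}(f) \big) = \frac{1}{n} \int_{S^{n-1}} \|D_u^+ f\|_p^{-n}\,du.
\]
% Theorem 4 therefore reads
\[
\int_{S^{n-1}} \|D_u f\|_p^{-n}\,du \le 2^{-n/p} \int_{S^{n-1}} \|D_u^+ f\|_p^{-n}\,du,
\]
% and raising both sides to the power -1/n (which reverses the inequality) gives (1.4):
\[
\Big( \int_{S^{n-1}} \|D_u f\|_p^{-n}\,du \Big)^{-1/n}
  \ge 2^{1/p} \Big( \int_{S^{n-1}} \|D_u^+ f\|_p^{-n}\,du \Big)^{-1/n}.
\]
```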
In order to prove this theorem, we need a result from the dual $L_p$ Brunn–Minkowski theory. The basis of this theory is the following addition on convex bodies. For $\alpha, \beta \ge 0$ (not both zero), Firey’s $L_p$ harmonic radial combination $\alpha \cdot K \mathbin{\tilde{+}_p} \beta \cdot L$ of $K, L \in \mathcal{K}_o^n$ is the convex body defined by

\[
\rho(\alpha \cdot K \mathbin{\tilde{+}_p} \beta \cdot L, \cdot)^{-p} = \alpha \, \rho(K, \cdot)^{-p} + \beta \, \rho(L, \cdot)^{-p}.
\]

Firey started investigations of harmonic $L_p$ combinations in the 1960s which were continued by Lutwak, leading to a dual $L_p$ Brunn–Minkowski theory. A cornerstone of this theory is the dual $L_p$ Brunn–Minkowski inequality [23]: If $K, L \in \mathcal{K}_o^n$, then

\[
V(K \mathbin{\tilde{+}_p} L)^{-p/n} \ge V(K)^{-p/n} + V(L)^{-p/n}, \tag{6.2}
\]

with equality if and only if $K$ and $L$ are dilates.

Proof of Theorem 4. From (2.1), (6.1) and the definition of $L_p$ harmonic radial addition, it follows that

\[
\big( B_p^+(f) +_p B_p^+(-f) \big)^* = B_p^{+,*}(f) \mathbin{\tilde{+}_p} B_p^{+,*}(-f).
\]

Since $V(B_p^{+,*}(f)) = V(B_p^{+,*}(-f))$, an application of (6.2) yields the desired inequality along with its equality conditions. □

Acknowledgments

The authors are grateful for the help of Monika Ludwig, Erwin Lutwak, Christian Steineder, Deane Yang, and Gaoyong Zhang in the presentation of this article. This work was supported by the Austrian Science Fund (FWF), within the project P 18308, “Valuations on convex bodies”.

References

[1] T. Aubin, Problèmes isopérimétriques et espaces de Sobolev, J. Differential Geom. 11 (1976) 573–598.
[2] T. Aubin, Nonlinear Analysis on Manifolds: Monge–Ampère Equations, Springer, Berlin, 1982.
[3] W. Beckner, Sharp Sobolev inequalities on the sphere and the Moser–Trudinger inequality, Ann. Math. 138 (1993) 213–242.
[4] G.A. Bliss, An integral inequality, J. London Math. Soc. 5 (1930) 40–46.
[5] S. Campi, P. Gronchi, The Lp-Busemann–Petty centroid inequality, Adv. Math. 167 (2002) 128–141.
[6] E.A. Carlen, M. Loss, Extremals of functionals with competing symmetries, J. Funct. Anal. 88 (1990) 437–456.
[7] K.-S. Chou, X.-J. Wang, The Lp-Minkowski problem and the Minkowski problem in centroaffine geometry, Adv. Math. 205 (2006) 33–83.
[8] D. Cordero-Erausquin, B. Nazaret, C. Villani, A mass-transportation approach to sharp Sobolev and Gagliardo–Nirenberg inequalities, Adv. Math. 182 (2004) 307–332.
[9] H. Federer, Geometric Measure Theory, Springer, Berlin, 1969.
[10] H. Federer, W. Fleming, Normal and integral currents, Ann. Math. 72 (1960) 458–520.
[11] N. Fusco, F. Maggi, A. Pratelli, The sharp quantitative Sobolev inequality for functions of bounded variation, J. Funct. Anal. 244 (2007) 315–341.
[12] R. Gardner, Geometric Tomography, second ed., Cambridge Univ. Press, Cambridge, 2006.
[13] C. Haberl, F.E. Schuster, General Lp affine isoperimetric inequalities, J. Differential Geom., in press.
[14] G. Hardy, J.E. Littlewood, G. Pólya, Inequalities, Cambridge Univ. Press, Cambridge, 1952.
[15] D. Hug, E. Lutwak, D. Yang, G. Zhang, On the Lp Minkowski problem for polytopes, Discrete Comput. Geom. 33 (2005) 699–715.
[16] E.H. Lieb, Sharp constants in the Hardy–Littlewood–Sobolev and related inequalities, Ann. Math. 118 (1983) 349–374.
[17] M. Ludwig, Projection bodies and valuations, Adv. Math. 172 (2002) 158–168.
[18] M. Ludwig, Ellipsoids and matrix-valued valuations, Duke Math. J. 119 (2003) 159–188.
[19] M. Ludwig, Minkowski valuations, Trans. Amer. Math. Soc. 357 (2005) 4191–4213.
[20] M. Ludwig, M. Reitzner, A classification of SL(n) invariant valuations, Ann. Math., in press.
[21] E. Lutwak, On some affine isoperimetric inequalities, J. Differential Geom. 23 (1986) 1–13.
[22] E. Lutwak, The Brunn–Minkowski–Firey theory. I. Mixed volumes and the Minkowski problem, J. Differential Geom. 38 (1993) 131–150.
[23] E. Lutwak, The Brunn–Minkowski–Firey theory. II: Affine and geominimal surface areas, Adv. Math. 118 (1996) 244–294.
[24] E. Lutwak, V. Oliker, On the regularity of solutions to a generalization of the Minkowski problem, J. Differential Geom. 41 (1995) 227–246.
[25] E. Lutwak, D. Yang, G. Zhang, Lp affine isoperimetric inequalities, J. Differential Geom. 56 (2000) 111–132.
[26] E. Lutwak, D. Yang, G. Zhang, A new ellipsoid associated with convex bodies, Duke Math. J. 104 (2000) 375–390.
[27] E. Lutwak, D. Yang, G. Zhang, Sharp affine Lp Sobolev inequalities, J. Differential Geom. 62 (2002) 17–38.
[28] E. Lutwak, D. Yang, G. Zhang, On the Lp Minkowski problem, Trans. Amer. Math. Soc. 356 (2004) 4359–4370.
[29] E. Lutwak, D. Yang, G. Zhang, Optimal Sobolev norms and the Lp Minkowski problem, Int. Math. Res. Not. 65 (2006) 1–21.
[30] E. Lutwak, G. Zhang, Blaschke–Santaló inequalities, J. Differential Geom. 47 (1997) 1–16.
[31] V.G. Maz’ya, Classes of domains and imbedding theorems for function spaces, Dokl. Akad. Nauk SSSR 133 (1960) 527–530.
[32] V.G. Maz’ya, Lectures on isoperimetric and isocapacitary inequalities in the theory of Sobolev spaces, in: Heat Kernels and Analysis on Manifolds, Graphs, and Metric Spaces, in: Contemp. Math., vol. 338, Amer. Math. Soc., Providence, RI, 2003, pp. 307–340.
[33] V.G. Maz’ya, Conductor and capacitary inequalities for functions on topological spaces and their applications to Sobolev-type imbeddings, J. Funct. Anal. 224 (2005) 408–430.
[34] W. Rudin, Real and Complex Analysis, McGraw–Hill Book Co., New York, 1987.
[35] R. Schneider, Convex Bodies: The Brunn–Minkowski Theory, Cambridge Univ. Press, Cambridge, 1993.
[36] G. Talenti, Best constant in Sobolev inequality, Ann. Math. Pura Appl. 110 (1976) 353–372.
[37] J. Xiao, The sharp Sobolev and isoperimetric inequalities split twice, Adv. Math. 211 (2007) 417–435.
[38] G. Zhang, The affine Sobolev inequality, J. Differential Geom. 53 (1999) 183–202.
Journal of Functional Analysis 257 (2009) 659–682 www.elsevier.com/locate/jfa
On the sum of superoptimal singular values
Alberto A. Condori
Department of Mathematics, Michigan State University, East Lansing, MI 48824, USA
Received 23 October 2008; accepted 3 April 2009
Available online 29 April 2009
Communicated by N. Kalton
Abstract

In this paper, we study the following extremal problem and its relevance to the sum of the so-called superoptimal singular values of a matrix function: Given an $m \times n$ matrix function $\Phi$, when is there a matrix function $\Psi_*$ in the set $\mathcal{A}_k^{n,m}$ such that
\[
\int_{\mathbb{T}} \operatorname{trace}\bigl(\Phi(\zeta)\Psi_*(\zeta)\bigr)\,dm(\zeta)
= \sup_{\Psi \in \mathcal{A}_k^{n,m}} \left| \int_{\mathbb{T}} \operatorname{trace}\bigl(\Phi(\zeta)\Psi(\zeta)\bigr)\,dm(\zeta) \right| ?
\]
The set $\mathcal{A}_k^{n,m}$ is defined by
\[
\mathcal{A}_k^{n,m} \stackrel{\mathrm{def}}{=} \bigl\{ \Psi \in H_0^1(\mathbb{M}_{n,m}) : \|\Psi\|_{L^1(\mathbb{M}_{n,m})} \le 1,\ \operatorname{rank} \Psi(\zeta) \le k \text{ a.e. } \zeta \in \mathbb{T} \bigr\}.
\]
To address this extremal problem, we introduce Hankel-type operators on spaces of matrix functions and prove that this problem has a solution if and only if the corresponding Hankel-type operator has a maximizing vector. The main result of this paper is a characterization of the smallest number $k$ for which
\[
\int_{\mathbb{T}} \operatorname{trace}\bigl(\Phi(\zeta)\Psi(\zeta)\bigr)\,dm(\zeta)
\]
equals the sum of all the superoptimal singular values of an admissible matrix function $\Phi$ (e.g. a continuous matrix function) for some function $\Psi \in \mathcal{A}_k^{n,m}$. Moreover, we provide a representation of any such function $\Psi$ when $\Phi$ is an admissible very badly approximable unitary-valued $n \times n$ matrix function.
© 2009 Elsevier Inc. All rights reserved.

E-mail address: [email protected].
0022-1236/$ – see front matter © 2009 Elsevier Inc. All rights reserved. doi:10.1016/j.jfa.2009.04.002
A.A. Condori / Journal of Functional Analysis 257 (2009) 659–682
Keywords: Best and superoptimal approximation; Badly and very badly approximable matrix functions; Hankel and Toeplitz operators
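1. Introduction

The problem of best analytic approximation for a given $m \times n$ matrix-valued bounded function $\Phi$ on the unit circle $\mathbb{T}$ is to find a bounded analytic function $Q$ such that
\[
\|\Phi - Q\|_{L^\infty(\mathbb{M}_{m,n})} = \inf\bigl\{ \|\Phi - F\|_{L^\infty(\mathbb{M}_{m,n})} : F \in H^\infty(\mathbb{M}_{m,n}) \bigr\}.
\]
Throughout,
\[
\|\Psi\|_{L^\infty(\mathbb{M}_{m,n})} \stackrel{\mathrm{def}}{=} \operatorname*{ess\,sup}_{\zeta \in \mathbb{T}} \|\Psi(\zeta)\|_{\mathbb{M}_{m,n}},
\]
$\mathbb{M}_{m,n}$ denotes the space of $m \times n$ matrices equipped with the operator norm $\|\cdot\|_{\mathbb{M}_{m,n}}$ (of the space of linear operators from $\mathbb{C}^n$ to $\mathbb{C}^m$), and $H^\infty(\mathbb{M}_{m,n})$ denotes the space of bounded analytic $m \times n$ matrix-valued functions on $\mathbb{T}$.

It is well known that, unlike scalar-valued functions, a polynomial matrix function $\Phi$ may have many best analytic approximants. Therefore it is natural to impose additional conditions in order to distinguish a "very best" analytic approximant among all best analytic approximants. To do so here, we use the notion of superoptimal approximation by bounded analytic matrix functions.

1.1. Superoptimal approximation and very badly approximable matrix functions

Recall that for an $m \times n$ matrix $A$, the $j$th singular value $s_j(A)$, $j \ge 0$, is defined to be the distance from $A$ to the set of matrices of rank at most $j$ under the operator norm. More precisely,
\[
s_j(A) = \inf\bigl\{ \|A - B\|_{\mathbb{M}_{m,n}} : B \in \mathbb{M}_{m,n} \text{ such that } \operatorname{rank} B \le j \bigr\}.
\]
Clearly, $s_0(A) = \|A\|_{\mathbb{M}_{m,n}}$.

Definition 1.1. Let $\Phi \in L^\infty(\mathbb{M}_{m,n})$. For $k \ge 0$, we define the sets $\Omega_k = \Omega_k(\Phi)$ by
\[
\Omega_0(\Phi) = \Bigl\{ F \in H^\infty(\mathbb{M}_{m,n}) : F \text{ minimizes } \operatorname*{ess\,sup}_{\zeta \in \mathbb{T}} \|\Phi(\zeta) - F(\zeta)\|_{\mathbb{M}_{m,n}} \Bigr\}
\]
and
\[
\Omega_j(\Phi) = \Bigl\{ F \in \Omega_{j-1} : F \text{ minimizes } \operatorname*{ess\,sup}_{\zeta \in \mathbb{T}} s_j\bigl(\Phi(\zeta) - F(\zeta)\bigr) \Bigr\} \quad \text{for } j > 0.
\]
Any function $F \in \bigcap_{k \ge 0} \Omega_k = \Omega_{\min\{m,n\}-1}$ is called a superoptimal approximation to $\Phi$ by bounded analytic matrix functions. In this case, the superoptimal singular values of $\Phi$ are defined by
\[
t_j = t_j(\Phi) = \operatorname*{ess\,sup}_{\zeta \in \mathbb{T}} s_j\bigl((\Phi - F)(\zeta)\bigr) \quad \text{for } j \ge 0.
\]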
Moreover, if the zero matrix function $\mathbb{O}$ belongs to $\Omega_{\min\{m,n\}-1}$, we say that $\Phi$ is very badly approximable. Notice that any function $F \in \Omega_0$ is a best analytic approximation to $\Phi$. Also, any very badly approximable matrix function is the difference between a bounded matrix function and its superoptimal approximant.

It turns out that Hankel operators on Hardy spaces play an important role in the study of superoptimal approximation. For a matrix function $\Phi \in L^\infty(\mathbb{M}_{m,n})$, we define the Hankel operator $H_\Phi$ by
\[
H_\Phi f = \mathbb{P}_- \Phi f \quad \text{for } f \in H^2(\mathbb{C}^n),
\]
where $\mathbb{P}_-$ denotes the orthogonal projection from $L^2(\mathbb{C}^m)$ onto $H_-^2(\mathbb{C}^m) \stackrel{\mathrm{def}}{=} L^2(\mathbb{C}^m) \ominus H^2(\mathbb{C}^m)$.

When studying superoptimal approximation, we only consider bounded matrix functions that are admissible. A matrix function $\Phi \in L^\infty(\mathbb{M}_{m,n})$ is said to be admissible if the essential norm $\|H_\Phi\|_e$ of the Hankel operator $H_\Phi$ is strictly less than the smallest non-zero superoptimal singular value of $\Phi$. As usual, the essential norm of a bounded linear operator $T$ between Hilbert spaces is defined by
\[
\|T\|_e \stackrel{\mathrm{def}}{=} \inf\bigl\{ \|T - K\| : K \text{ is compact} \bigr\}.
\]
Note that any continuous matrix function $\Phi$ is admissible, as the essential norm of $H_\Phi$ equals zero in this case. Moreover, in the case of scalar-valued functions, to say that a function $\varphi$ is admissible simply means that $\|H_\varphi\|_e < \|H_\varphi\|$.

It is known that if $\Phi$ is an admissible matrix function, then $\Phi$ has a unique superoptimal approximation $Q$ by bounded analytic matrix functions. Moreover, the functions $\zeta \mapsto s_j((\Phi - Q)(\zeta))$ equal $t_j(\Phi)$ a.e. on $\mathbb{T}$ for each $j \ge 0$. These results were first proved in [6] for the special case $\Phi \in (H^\infty + C)(\mathbb{M}_{m,n})$ (i.e. matrix functions which are a sum of a bounded analytic matrix function and a continuous matrix function), and shortly after proved for the class of admissible matrix functions in [5].

While it is possible to compute the superoptimal singular values of a given matrix function in concrete examples, it is not known how to verify whether a matrix function that is not continuous is admissible. Thus a complete characterization of the smallest non-zero superoptimal singular value of a given matrix function is an important problem for superoptimal approximation. This remains an open problem.

We refer the reader to Chapter 14 of [2], which contains proofs of all of the previously mentioned results and many other interesting results concerning superoptimal approximation.

1.2. An extremal problem

Throughout this note, $m$ denotes normalized Lebesgue measure on $\mathbb{T}$ so that $m(\mathbb{T}) = 1$.
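For $1 \le p \le \infty$, $L^p(\mathbb{M}_{m,n})$ denotes the space of $m \times n$ matrix-valued functions on $\mathbb{T}$ whose entries belong to $L^p$. We equip $L^p(\mathbb{M}_{m,n})$ with the norm $\|\cdot\|_{L^p(\mathbb{M}_{m,n})}$, where
\[
\|F\|_{L^p(\mathbb{M}_{m,n})}^p = \int_{\mathbb{T}} \|F(\zeta)\|_{\mathbb{M}_{m,n}}^p \,dm(\zeta) \quad \text{for } 1 \le p < \infty,
\]
and
\[
\|F\|_{L^\infty(\mathbb{M}_{m,n})} = \operatorname*{ess\,sup}_{\zeta \in \mathbb{T}} \|F(\zeta)\|_{\mathbb{M}_{m,n}}.
\]
$H^p(\mathbb{M}_{m,n})$ and $H_0^p(\mathbb{M}_{m,n})$ consist of matrix-valued functions in $L^p(\mathbb{M}_{m,n})$ whose entries belong to the Hardy space $H^p$ and $H_0^p$, respectively. (Recall that $H^p$ and $H_0^p$ denote the spaces of $L^p$ functions on $\mathbb{T}$ whose Fourier coefficients of negative and non-positive index vanish, respectively.)

Definition 1.2. Let $m, n > 1$ and $1 \le k \le \min\{m, n\}$. For $\Phi \in L^\infty(\mathbb{M}_{m,n})$, we define $\sigma_k(\Phi)$ by
\[
\sigma_k(\Phi) \stackrel{\mathrm{def}}{=} \sup_{\Psi \in \mathcal{A}_k^{n,m}} \left| \int_{\mathbb{T}} \operatorname{trace}\bigl(\Phi(\zeta)\Psi(\zeta)\bigr)\,dm(\zeta) \right|, \tag{1.1}
\]
where
\[
\mathcal{A}_k^{n,m} \stackrel{\mathrm{def}}{=} \bigl\{ \Psi \in H_0^1(\mathbb{M}_{n,m}) : \|\Psi\|_{L^1(\mathbb{M}_{n,m})} \le 1 \text{ and } \operatorname{rank} \Psi(\zeta) \le k \text{ a.e. } \zeta \in \mathbb{T} \bigr\}.
\]
Whenever $n = m$, we use the notation $\mathcal{A}_k^n \stackrel{\mathrm{def}}{=} \mathcal{A}_k^{n,m}$.

We are interested in the following extremal problem:

Extremal problem 1.1. For a matrix function $\Phi \in L^\infty(\mathbb{M}_{m,n})$, when is there a matrix function $\Psi \in \mathcal{A}_k^{n,m}$ such that
\[
\int_{\mathbb{T}} \operatorname{trace}\bigl(\Phi(\zeta)\Psi(\zeta)\bigr)\,dm(\zeta) = \sigma_k(\Phi)?
\]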
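The importance of this problem arose from the following observation due to Peller [3].

Theorem 1.3. Let $1 \le k \le \min\{m, n\}$. If $\Phi \in L^\infty(\mathbb{M}_{m,n})$ is admissible, then
\[
\sigma_k(\Phi) \le t_0(\Phi) + \cdots + t_{k-1}(\Phi). \tag{1.2}
\]

Proof. Let $\Psi \in \mathcal{A}_k^{n,m}$. We may assume, without loss of generality, that $\Phi$ is very badly approximable. Indeed,
\[
\int_{\mathbb{T}} \operatorname{trace}\bigl(\Phi(\zeta)\Psi(\zeta)\bigr)\,dm(\zeta) = \int_{\mathbb{T}} \operatorname{trace}\bigl((\Phi - Q)(\zeta)\Psi(\zeta)\bigr)\,dm(\zeta)
\]
holds for any $Q \in H^\infty(\mathbb{M}_{m,n})$, and so we may replace $\Phi$ with $\Phi - Q$ if necessary, where $Q$ is the superoptimal approximation to $\Phi$ in $H^\infty(\mathbb{M}_{m,n})$.

Let $\boldsymbol{S}_1^m$ denote the collection of $m \times m$ matrices equipped with the trace norm $\|A\|_{\boldsymbol{S}_1^m} = \operatorname{trace}(A^*A)^{1/2} = \sum_{j \ge 0} s_j(A)$.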
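It follows from the well-known inequality $|\operatorname{trace}(A)| \le \|A\|_{\boldsymbol{S}_1^m}$ that the inequalities
\[
\bigl|\operatorname{trace}\bigl(\Phi(\zeta)\Psi(\zeta)\bigr)\bigr| \le \|\Phi(\zeta)\Psi(\zeta)\|_{\boldsymbol{S}_1^m} \le \sum_{j=0}^{k-1} s_j\bigl(\Phi(\zeta)\bigr)\,\|\Psi(\zeta)\|_{\mathbb{M}_{n,m}}
\]
hold for a.e. $\zeta \in \mathbb{T}$. Thus,
\[
\begin{aligned}
\left| \int_{\mathbb{T}} \operatorname{trace}\bigl(\Phi(\zeta)\Psi(\zeta)\bigr)\,dm(\zeta) \right|
&\le \int_{\mathbb{T}} \sum_{j=0}^{k-1} s_j\bigl(\Phi(\zeta)\bigr)\,\|\Psi(\zeta)\|_{\mathbb{M}_{n,m}}\,dm(\zeta) \\
&\le \int_{\mathbb{T}} \sum_{j=0}^{k-1} t_j(\Phi)\,\|\Psi(\zeta)\|_{\mathbb{M}_{n,m}}\,dm(\zeta) \\
&\le \sum_{j=0}^{k-1} t_j(\Phi)\,\|\Psi\|_{L^1(\mathbb{M}_{n,m})} \\
&\le \sum_{j=0}^{k-1} t_j(\Phi),
\end{aligned} \tag{1.3}
\]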
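because the singular values of $\Phi$ satisfy $s_j(\Phi(\zeta)) = t_j(\Phi)$ for a.e. $\zeta \in \mathbb{T}$ since $\Phi$ is very badly approximable. □

Before proceeding, let us observe that equality holds in (1.2) for some simple cases. Let $r$ be a positive integer and $t_0, t_1, \ldots, t_{r-1}$ be positive numbers satisfying $t_0 \ge t_1 \ge \cdots \ge t_{r-1}$. Suppose $\Phi$ is an $n \times n$ matrix function of the form
\[
\Phi \stackrel{\mathrm{def}}{=}
\begin{pmatrix}
t_0 u_0 & \mathbb{O} & \cdots & \mathbb{O} & \mathbb{O} \\
\mathbb{O} & t_1 u_1 & \cdots & \mathbb{O} & \mathbb{O} \\
\vdots & \vdots & \ddots & \vdots & \vdots \\
\mathbb{O} & \mathbb{O} & \cdots & t_{r-1} u_{r-1} & \mathbb{O} \\
\mathbb{O} & \mathbb{O} & \cdots & \mathbb{O} & \Phi_\#
\end{pmatrix}, \tag{1.4}
\]
where $\|\Phi_\#\|_{L^\infty} \le t_{r-1}$ and $u_j$ is a unimodular function of the form $u_j = \bar{z}\,\bar{\theta}_j\,\bar{h}/h$ with $\theta_j$ an inner function for $0 \le j \le r - 1$ and $h$ an outer function in $H^2$. Without loss of generality, we may assume that $\|h\|_{L^2} = 1$. It can be seen that if
\[
\Psi \stackrel{\mathrm{def}}{=}
\begin{pmatrix}
z\theta_0 h^2 & \mathbb{O} & \cdots & \mathbb{O} & \mathbb{O} \\
\mathbb{O} & z\theta_1 h^2 & \cdots & \mathbb{O} & \mathbb{O} \\
\vdots & \vdots & \ddots & \vdots & \vdots \\
\mathbb{O} & \mathbb{O} & \cdots & z\theta_{r-1} h^2 & \mathbb{O} \\
\mathbb{O} & \mathbb{O} & \cdots & \mathbb{O} & \mathbb{O}
\end{pmatrix}, \tag{1.5}
\]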
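then $\Psi \in H_0^1(\mathbb{M}_n)$, $\operatorname{rank} \Psi(\zeta) = r$ a.e. on $\mathbb{T}$, $\|\Psi\|_{L^1(\mathbb{M}_n)} = 1$, and
\[
\int_{\mathbb{T}} \operatorname{trace}\bigl(\Phi(\zeta)\Psi(\zeta)\bigr)\,dm(\zeta) = t_0 + \cdots + t_{r-1}.
\]
Thus we obtain that $\sigma_r(\Phi) = t_0(\Phi) + \cdots + t_{r-1}(\Phi)$.

On the other hand, one cannot expect the inequality (1.2) to hold with equality in general. After all, by the Hahn–Banach theorem,
\[
\operatorname{dist}_{L^\infty(\boldsymbol{S}_1^n)}\bigl(\Phi, H^\infty(\mathbb{M}_n)\bigr) = \sigma_n(\Phi), \tag{1.6}
\]
and there are admissible very badly approximable $2 \times 2$ matrix functions $\Phi$ for which the strict inequality
\[
\operatorname{dist}_{L^\infty(\boldsymbol{S}_1^2)}\bigl(\Phi, H^\infty(\mathbb{M}_2)\bigr) < t_0(\Phi) + t_1(\Phi)
\]
holds. For instance, consider the matrix function
\[
\Phi = \begin{pmatrix} \bar{z} & \mathbb{O} \\ \mathbb{O} & \bar{z} \end{pmatrix} \frac{1}{\sqrt{2}} \begin{pmatrix} 1 & \bar{z} \\ -z & 1 \end{pmatrix}
= \frac{1}{\sqrt{2}} \begin{pmatrix} \bar{z} & \bar{z}^2 \\ -1 & \bar{z} \end{pmatrix}.
\]
Clearly, $\Phi$ has superoptimal singular values $t_0(\Phi) = t_1(\Phi) = 1$. Let
\[
F = \frac{1}{\sqrt{2}} \begin{pmatrix} \mathbb{O} & \mathbb{O} \\ -1 & \mathbb{O} \end{pmatrix}.
\]
It is not difficult to verify that
\[
s_0\bigl((\Phi - F)(\zeta)\bigr) = \frac{1}{2}\sqrt{3 + \sqrt{5}} \quad \text{and} \quad s_1\bigl((\Phi - F)(\zeta)\bigr) = \frac{1}{2}\sqrt{3 - \sqrt{5}}
\]
for all $\zeta \in \mathbb{T}$. Therefore
\[
\operatorname{dist}_{L^\infty(\boldsymbol{S}_1^2)}\bigl(\Phi, H^\infty(\mathbb{M}_2)\bigr) \le \|\Phi - F\|_{L^\infty(\boldsymbol{S}_1^2)} < 2 = t_0(\Phi) + t_1(\Phi). \tag{1.7}
\]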
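The singular values quoted in the $2 \times 2$ example leading to (1.7) can be checked numerically. The following is a minimal sketch, not part of the original argument: the helper names (`singular_values_2x2`, `phi_minus_f`, `sample_singular_values`) are hypothetical, and the code only evaluates the matrix $(\Phi - F)(\zeta)$ at finitely many points of $\mathbb{T}$ using the identities $s_0^2 + s_1^2 = \|A\|_{\mathrm{Frobenius}}^2$ and $s_0 s_1 = |\det A|$ for a $2 \times 2$ matrix $A$.

```python
import cmath
import math


def singular_values_2x2(a, b, c, d):
    """Return (s0, s1), s0 >= s1, for the complex matrix [[a, b], [c, d]]."""
    # s0^2 + s1^2 is the squared Frobenius norm; s0 * s1 is |det|.
    frob_sq = abs(a) ** 2 + abs(b) ** 2 + abs(c) ** 2 + abs(d) ** 2
    det_abs = abs(a * d - b * c)
    # s0^2 and s1^2 are the roots of x^2 - frob_sq * x + det_abs^2 = 0.
    disc = math.sqrt(max(frob_sq ** 2 - 4.0 * det_abs ** 2, 0.0))
    s0 = math.sqrt((frob_sq + disc) / 2.0)
    s1 = math.sqrt(max((frob_sq - disc) / 2.0, 0.0))
    return s0, s1


def phi_minus_f(zeta):
    """Entries (a, b, c, d) of (Phi - F)(zeta) for the 2 x 2 example."""
    zbar = zeta.conjugate()
    r = 1.0 / math.sqrt(2.0)
    # Phi = r * [[zbar, zbar^2], [-1, zbar]] and F = r * [[0, 0], [-1, 0]].
    return (r * zbar, r * zbar ** 2, 0j, r * zbar)


def sample_singular_values(num_points=12):
    """Singular values of (Phi - F)(zeta) at equally spaced points of T."""
    points = [cmath.exp(2j * math.pi * i / num_points) for i in range(num_points)]
    return [singular_values_2x2(*phi_minus_f(z)) for z in points]


if __name__ == "__main__":
    for s0, s1 in sample_singular_values(4):
        # Each row: s0, s1 and the trace norm s0 + s1 at one sample point.
        print(round(s0, 6), round(s1, 6), round(s0 + s1, 6))
```

At every sample point the two values agree with $\tfrac12\sqrt{3 \pm \sqrt5}$ up to rounding, and their sum stays strictly below $2$, which is the content of (1.7). A finite sample is of course only a sanity check, not a proof.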
1.3. What is done in this paper?

In virtue of Theorem 1.3 and the remarks following it, one may ask whether it is possible to characterize the matrix functions $\Phi$ for which (1.2) becomes an equality. So let $\Phi$ be an admissible $n \times n$ matrix function with a superoptimal approximant $Q$ in $H^\infty(\mathbb{M}_n)$ for which equality in Theorem 1.3 holds with $k = n$. In this case, it must be that
\[
\operatorname{dist}_{L^\infty(\boldsymbol{S}_1^n)}\bigl(\Phi, H^\infty(\mathbb{M}_n)\bigr) = \sum_{j=0}^{n-1} t_j(\Phi) = \sum_{j=0}^{n-1} s_j\bigl((\Phi - Q)(\zeta)\bigr) = \|\Phi - Q\|_{L^\infty(\boldsymbol{S}_1^n)}
\]
by (1.6), and thus the superoptimal approximant $Q$ must be a best approximant to $\Phi$ under the $L^\infty(\boldsymbol{S}_1^n)$-norm as well. Hence, we are led to investigate the following problems:

1. For which matrix functions $\Phi$ does Extremal problem 1.1 have a solution?
2. If a matrix function is a best approximant to $\Phi$ under the $L^\infty(\boldsymbol{S}_1^n)$-norm, when does it follow that it is the superoptimal approximant to $\Phi$ in $L^\infty(\mathbb{M}_n)$?
3. Can we find necessary and sufficient conditions on $\Phi$ to obtain equality in (1.2) of Theorem 1.3?

Before addressing these problems, we recall certain standard principles of functional analysis in Section 2 that are used throughout the paper. In particular, we give their explicit formulation for the spaces $L^p(\boldsymbol{S}_q^{m,n})$.

In Section 3, we introduce the Hankel-type operators $H_\Phi^{\{k\}}$ on spaces of matrix functions and $k$-extremal functions, and prove that the number $\sigma_k(\Phi)$ equals the operator norm of $H_\Phi^{\{k\}}$. We also show that Extremal problem 1.1 has a solution if and only if the Hankel-type operator $H_\Phi^{\{k\}}$ has a maximizing vector, and thus answer question 1 in terms of Hankel-type operators.

In Section 4, we establish the main results of this paper concerning best approximation under the $L^\infty(\boldsymbol{S}_1^{m,n})$-norm (Theorem 4.7) and the sum of superoptimal singular values (Theorem 4.13). The latter result characterizes the smallest number $k$ for which
\[
\int_{\mathbb{T}} \operatorname{trace}\bigl(\Phi(\zeta)\Psi(\zeta)\bigr)\,dm(\zeta)
\]
equals the sum of all non-zero superoptimal singular values for some function $\Psi \in \mathcal{A}_k^{n,m}$. These results serve as partial solutions to problems 2 and 3.

Lastly, in Section 5, we restrict our attention to unitary-valued very badly approximable matrix functions. For any such matrix function $U$, we provide a representation of any function $\Psi$ for which the formula
\[
\int_{\mathbb{T}} \operatorname{trace}\bigl(U(\zeta)\Psi(\zeta)\bigr)\,dm(\zeta) = n
\]
holds.

2. Best approximation and dual extremal problems

We now provide an explicit formulation of some basic results concerning best approximation in $H^q(\boldsymbol{S}_p^{m,n})$ for functions in $L^q(\boldsymbol{S}_p^{m,n})$ and the corresponding dual extremal problem. We first consider the general setting.

2.1. Best approximation

Definition 2.1. Let $X$ be a normed space, $M$ be a closed subspace of $X$, and $x_0 \in X$. We say that $m_0$ is a best approximant to $x_0$ in $M$ if $m_0 \in M$ and
\[
\|x_0 - m_0\|_X = \operatorname{dist}(x_0, M) \stackrel{\mathrm{def}}{=} \inf\bigl\{ \|x_0 - m\|_X : m \in M \bigr\}.
\]
It is known that if $X$ is a reflexive Banach space and $M$ is a closed subspace of $X$, then each $x_0 \in X \setminus M$ has a best approximant $m_0$ in $M$.

Two standard principles from functional analysis are used throughout this note. Namely, if $X$ is a normed space with a linear subspace $M$, then for any $\Lambda_0 \in X^*$ and $x_0 \in X$
\[
\sup_{m \in M,\ \|m\| \le 1} |\Lambda_0(m)| = \min\bigl\{ \|\Lambda_0 - \Lambda\| : \Lambda \in M^\perp \bigr\}
\]
and
\[
\max_{\Lambda \in M^\perp,\ \|\Lambda\| \le 1} |\Lambda(x_0)| = \operatorname{dist}(x_0, M) \quad \text{whenever } M \text{ is closed.}
\]
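We now discuss these results in the case of the spaces $L^q(\boldsymbol{S}_p^{m,n})$.

2.2. The spaces $L^q(\boldsymbol{S}_p^{m,n})$

Let $1 \le q < \infty$ and $1 \le p \le \infty$. Let $p'$ denote the conjugate exponent to $p$, i.e. $p' = p/(p - 1)$. Let $\boldsymbol{S}_p^{m,n}$ denote the space of $m \times n$ matrices equipped with the Schatten–von Neumann norm $\|\cdot\|_{\boldsymbol{S}_p^{m,n}}$, i.e. for $A \in \mathbb{M}_{m,n}$
\[
\|A\|_{\boldsymbol{S}_\infty^{m,n}} \stackrel{\mathrm{def}}{=} \|A\|_{\mathbb{M}_{m,n}} \quad \text{and} \quad \|A\|_{\boldsymbol{S}_p^{m,n}} \stackrel{\mathrm{def}}{=} \Bigl( \sum_{j \ge 0} s_j^p(A) \Bigr)^{1/p} \quad \text{for } 1 \le p < \infty.
\]
We also use the notation $\boldsymbol{S}_p^n \stackrel{\mathrm{def}}{=} \boldsymbol{S}_p^{n,n}$.

If $X$ is a normed space of functions on $\mathbb{T}$ with norm $\|\cdot\|_X$, then $X(\boldsymbol{S}_p^{m,n})$ denotes the space of $m \times n$ matrix functions whose entries belong to $X$. For $\Phi \in X(\boldsymbol{S}_p^{m,n})$, we define
\[
\|\Phi\|_{X(\boldsymbol{S}_p^{m,n})} \stackrel{\mathrm{def}}{=} \|\rho\|_X, \quad \text{where } \rho(\zeta) \stackrel{\mathrm{def}}{=} \|\Phi(\zeta)\|_{\boldsymbol{S}_p^{m,n}} \text{ for } \zeta \in \mathbb{T}.
\]
It is known that the dual space of $L^q(\boldsymbol{S}_p^{m,n})$ is isometrically isomorphic to $L^{q'}(\boldsymbol{S}_{p'}^{n,m})$ via the mapping $\Phi \mapsto \Lambda_\Phi$, where $\Phi \in L^{q'}(\boldsymbol{S}_{p'}^{n,m})$ and
\[
\Lambda_\Phi(\Psi) = \int_{\mathbb{T}} \operatorname{trace}\bigl(\Phi(\zeta)\Psi(\zeta)\bigr)\,dm(\zeta) \quad \text{for } \Psi \in L^q(\boldsymbol{S}_p^{m,n}).
\]
In particular, it follows that the annihilator of $H^q(\boldsymbol{S}_p^{m,n})$ in $L^{q'}(\boldsymbol{S}_{p'}^{n,m})$ is given by $H_0^{q'}(\boldsymbol{S}_{p'}^{n,m})$, and so
\[
\operatorname{dist}_{L^q(\boldsymbol{S}_p^{m,n})}\bigl(\Phi, H^q(\boldsymbol{S}_p^{m,n})\bigr) = \max_{\|\Psi\|_{H_0^{q'}(\boldsymbol{S}_{p'}^{n,m})} \le 1} \left| \int_{\mathbb{T}} \operatorname{trace}\bigl(\Phi(\zeta)\Psi(\zeta)\bigr)\,dm(\zeta) \right|,
\]
by our remarks in Section 2.1. Moreover, if $1 < q < \infty$, then $\Phi \in L^q(\boldsymbol{S}_p^{m,n})$ has a best approximant $Q$ in $H^q(\boldsymbol{S}_p^{m,n})$ (as $L^q(\boldsymbol{S}_p^{m,n})$ is reflexive); that is,
\[
\|\Phi - Q\|_{L^q(\boldsymbol{S}_p^{m,n})} = \operatorname{dist}_{L^q(\boldsymbol{S}_p^{m,n})}\bigl(\Phi, H^q(\boldsymbol{S}_p^{m,n})\bigr).
\]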
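The pointwise inequality behind the duality between $\boldsymbol{S}_p$ and $\boldsymbol{S}_{p'}$ is $|\operatorname{trace}(AB)| \le \|A\|_{\boldsymbol{S}_p}\|B\|_{\boldsymbol{S}_{p'}}$. The sketch below illustrates the Schatten–von Neumann norms and this inequality for $2 \times 2$ matrices only; it is an illustration under that restriction, and all function names are hypothetical rather than taken from the paper.

```python
import math


def singular_values_2x2(m):
    """Return (s0, s1), s0 >= s1, for m = ((a, b), (c, d)) with complex entries."""
    (a, b), (c, d) = m
    # s0^2 + s1^2 is the squared Frobenius norm; s0 * s1 is |det|.
    frob_sq = abs(a) ** 2 + abs(b) ** 2 + abs(c) ** 2 + abs(d) ** 2
    det_abs = abs(a * d - b * c)
    disc = math.sqrt(max(frob_sq ** 2 - 4.0 * det_abs ** 2, 0.0))
    s0 = math.sqrt((frob_sq + disc) / 2.0)
    s1 = math.sqrt(max((frob_sq - disc) / 2.0, 0.0))
    return s0, s1


def schatten_norm(m, p):
    """Schatten p-norm of a 2 x 2 matrix; p = math.inf gives the operator norm."""
    s0, s1 = singular_values_2x2(m)
    if p == math.inf:
        return s0
    return (s0 ** p + s1 ** p) ** (1.0 / p)


def conjugate_exponent(p):
    """Conjugate exponent p' = p / (p - 1), with the usual conventions at 1 and infinity."""
    if p == 1:
        return math.inf
    if p == math.inf:
        return 1.0
    return p / (p - 1.0)


def trace_of_product(a, b):
    """trace(AB) for 2 x 2 matrices given as nested tuples."""
    return (a[0][0] * b[0][0] + a[0][1] * b[1][0]
            + a[1][0] * b[0][1] + a[1][1] * b[1][1])


if __name__ == "__main__":
    A = ((1, 2), (3, 4))
    B = ((0, 1j), (1, -1))
    for p in (1, 1.5, 2, 3, math.inf):
        lhs = abs(trace_of_product(A, B))
        rhs = schatten_norm(A, p) * schatten_norm(B, conjugate_exponent(p))
        print(p, lhs <= rhs)
```

The case $p = 1$, $p' = \infty$ is the one used repeatedly in the sequel, where a trace norm is paired with an operator norm.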
The situation is similar in the case of $L^\infty(\boldsymbol{S}_p^{m,n})$. Indeed, $L^\infty(\boldsymbol{S}_p^{m,n})$ is a dual space, and so there is a $Q \in H^\infty(\boldsymbol{S}_p^{m,n})$ such that
\[
\|\Phi - Q\|_{L^\infty(\boldsymbol{S}_p^{m,n})} = \operatorname{dist}_{L^\infty(\boldsymbol{S}_p^{m,n})}\bigl(\Phi, H^\infty(\boldsymbol{S}_p^{m,n})\bigr).
\]
Again, it also follows from our remarks in Section 2.1 that
\[
\operatorname{dist}_{L^\infty(\boldsymbol{S}_p^{m,n})}\bigl(\Phi, H^\infty(\boldsymbol{S}_p^{m,n})\bigr) = \sup_{\|\Psi\|_{H_0^1(\boldsymbol{S}_{p'}^{n,m})} \le 1} \left| \int_{\mathbb{T}} \operatorname{trace}\bigl(\Phi(\zeta)\Psi(\zeta)\bigr)\,dm(\zeta) \right|.
\]
However, an extremal function may fail to exist in this case even if $\Phi$ is a scalar-valued function. An example can be deduced from Section 1 of Chapter 1 in [2].

3. $\sigma_k(\Phi)$ as the norm of a Hankel-type operator and $k$-extremal functions
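We now introduce the Hankel-type operators $H_\Phi^{\{k\}}$ which act on spaces of matrix functions. We prove that the number $\sigma_k(\Phi)$ equals the operator norm of $H_\Phi^{\{k\}}$ and characterize when $H_\Phi^{\{k\}}$ has a maximizing vector. Recall that for an operator $T : X \to Y$ between normed spaces $X$ and $Y$, a vector $x \in X$ is called a maximizing vector of $T$ if $x$ is non-zero and $\|Tx\|_Y = \|T\| \cdot \|x\|_X$.

We begin by establishing the following lemma.

Lemma 3.1. Let $1 \le k \le \min\{m, n\}$. If $\Psi \in H^1(\mathbb{M}_{n,m})$ is such that $\operatorname{rank} \Psi(\zeta) = k$ for a.e. $\zeta \in \mathbb{T}$, then there are functions $R \in H^2(\mathbb{M}_{n,k})$ and $Q \in H^2(\mathbb{M}_{k,m})$ such that $R(\zeta)$ has rank equal to $k$ for almost every $\zeta \in \mathbb{T}$,
\[
\Psi = RQ \quad \text{and} \quad \|R(\zeta)\|_{\mathbb{M}_{n,k}}^2 = \|Q(\zeta)\|_{\mathbb{M}_{k,m}}^2 = \|\Psi(\zeta)\|_{\mathbb{M}_{n,m}} \quad \text{for a.e. } \zeta \in \mathbb{T}.
\]

Proof. Consider the set
\[
\mathcal{A} = \operatorname{clos}_{L^1(\mathbb{C}^n)} \bigl\{ f \in H^1(\mathbb{C}^n) : f(\zeta) \in \operatorname{Range} \Psi(\zeta) \text{ a.e. on } \mathbb{T} \bigr\}.
\]
Since $\mathcal{A}$ is a non-trivial invariant subspace of $H^1(\mathbb{C}^n)$ under multiplication by $z$, there is an $n \times r$ inner function $\Theta$ such that $\mathcal{A} = \Theta H^1(\mathbb{C}^r)$.

We first show that $r = k$. Let $\{e_j\}_{j=1}^r$ be an orthonormal basis for $\mathbb{C}^r$. Then for almost every $\zeta \in \mathbb{T}$, we have that $\{\Theta(\zeta)e_j\}_{j=1}^r$ is a linearly independent set, since $\Theta$ is inner. Moreover, $\{\Theta(\zeta)e_j\}_{j=1}^r$ is a basis for $\operatorname{Range} \Theta(\zeta) = \operatorname{Range} \Psi(\zeta)$ for a.e. $\zeta \in \mathbb{T}$. Since $\dim \operatorname{Range} \Psi(\zeta) = k$ a.e. on $\mathbb{T}$, it follows that
\[
r = \dim \operatorname{Range} \Theta(\zeta) = \dim \operatorname{Range} \Psi(\zeta) = k.
\]
In particular, we obtain that $\mathcal{A} = \Theta H^1(\mathbb{C}^k)$. Therefore, $\Psi = \Theta F$ for some $k \times m$ matrix function $F \in H^1(\mathbb{M}_{k,m})$, because the columns of $\Psi$ belong to $\mathcal{A}$.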
Let $h$ be an outer function in $H^2$ such that $|h(\zeta)| = \|\Psi(\zeta)\|_{\mathbb{M}_{n,m}}^{1/2}$ for a.e. $\zeta \in \mathbb{T}$. (The existence of $h$ is a consequence of the fact that $\log \|\Psi(\zeta)\|_{\mathbb{M}_{n,m}} \in L^1$ as $\Psi \in H^1(\mathbb{M}_{n,m})$.) Thus, the matrix functions
\[
R = h\Theta \quad \text{and} \quad Q = h^{-1}F
\]
have the desired properties. □

Definition 3.2. Let $\Phi \in L^\infty(\mathbb{M}_{m,n})$, $1 \le k \le \min\{m, n\}$, and $\rho : L^2(\boldsymbol{S}_1^{m,k}) \to L^2(\boldsymbol{S}_1^{m,k})/H^2(\boldsymbol{S}_1^{m,k})$ denote the natural quotient map. We define the Hankel-type operator $H_\Phi^{\{k\}} : H^2(\mathbb{M}_{n,k}) \to L^2(\boldsymbol{S}_1^{m,k})/H^2(\boldsymbol{S}_1^{m,k})$ by setting
\[
H_\Phi^{\{k\}} F \stackrel{\mathrm{def}}{=} \rho(\Phi F) \quad \text{for } F \in H^2(\mathbb{M}_{n,k}).
\]
The norm in the quotient space $L^2(\boldsymbol{S}_1^{m,k})/H^2(\boldsymbol{S}_1^{m,k})$ is the natural one; that is, the norm of a coset equals the infimum of the $L^2(\boldsymbol{S}_1^{m,k})$-norms of its elements.

Theorem 3.3. Let $1 \le k \le \min\{m, n\}$. If $\Phi \in L^\infty(\mathbb{M}_{m,n})$, then
\[
\sigma_k(\Phi) = \bigl\| H_\Phi^{\{k\}} \bigr\|_{H^2(\mathbb{M}_{n,k}) \to L^2(\boldsymbol{S}_1^{m,k})/H^2(\boldsymbol{S}_1^{m,k})}.
\]
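Proof. Consider the collection
\[
\mathcal{B}_k^{n,m} = \bigl\{ RQ : \|R\|_{H^2(\mathbb{M}_{n,k})} \le 1,\ \|Q\|_{H_0^2(\mathbb{M}_{k,m})} \le 1 \bigr\}.
\]
We claim that $\mathcal{B}_k^{n,m} = \mathcal{A}_k^{n,m}$. Indeed, if $\Psi \in \mathcal{A}_k^{n,m}$ satisfies $\operatorname{rank} \Psi(\zeta) = j$ for a.e. $\zeta \in \mathbb{T}$, where $1 \le j \le k$, then by Lemma 3.1 there are functions $R \in H^2(\mathbb{M}_{n,j})$ and $Q \in H_0^2(\mathbb{M}_{j,m})$ such that $R(\zeta)$ has rank equal to $j$ for almost every $\zeta \in \mathbb{T}$,
\[
\Psi = RQ \quad \text{and} \quad \|R(\zeta)\|_{\mathbb{M}_{n,j}}^2 = \|Q(\zeta)\|_{\mathbb{M}_{j,m}}^2 = \|\Psi(\zeta)\|_{\mathbb{M}_{n,m}} \quad \text{for a.e. } \zeta \in \mathbb{T}.
\]
We may now add zeros, if necessary, to obtain $n \times k$ and $k \times m$ matrix functions
\[
R_\# = \begin{pmatrix} R & \mathbb{O} \end{pmatrix} \quad \text{and} \quad Q_\# = \begin{pmatrix} Q \\ \mathbb{O} \end{pmatrix},
\]
respectively, from which it follows that $\Psi = R_\# Q_\# \in \mathcal{B}_k^{n,m}$. Therefore $\mathcal{A}_k^{n,m} \subset \mathcal{B}_k^{n,m}$. The reverse inclusion is trivial and so these sets are equal. Hence
\[
\begin{aligned}
\sigma_k(\Phi) &= \sup_{\|R\|_{H^2(\mathbb{M}_{n,k})} \le 1}\ \sup_{\|Q\|_{H_0^2(\mathbb{M}_{k,m})} \le 1} \left| \int_{\mathbb{T}} \operatorname{trace}\bigl(\Phi(\zeta)R(\zeta)Q(\zeta)\bigr)\,dm(\zeta) \right| \\
&= \sup_{\|R\|_{H^2(\mathbb{M}_{n,k})} \le 1} \operatorname{dist}_{L^2(\boldsymbol{S}_1^{m,k})}\bigl(\Phi R, H^2(\mathbb{M}_{m,k})\bigr) \\
&= \bigl\| H_\Phi^{\{k\}} \bigr\|_{H^2(\mathbb{M}_{n,k}) \to L^2(\boldsymbol{S}_1^{m,k})/H^2(\boldsymbol{S}_1^{m,k})}.
\end{aligned}
\]
□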
Definition 3.4. Let $\Phi \in L^\infty(\mathbb{M}_{m,n})$ and $1 \le k \le \min\{m, n\}$. We say that $\Psi$ is a $k$-extremal function for $\Phi$ if $\Psi \in \mathcal{A}_k^{n,m}$ and
\[
\sigma_k(\Phi) = \int_{\mathbb{T}} \operatorname{trace}\bigl(\Phi(\zeta)\Psi(\zeta)\bigr)\,dm(\zeta).
\]
Thus a matrix function $\Phi$ has a $k$-extremal function if and only if Extremal problem 1.1 has a positive solution. We can now describe matrix functions that have a $k$-extremal function in terms of Hankel-type operators.

Theorem 3.5. Let $\Phi \in L^\infty(\mathbb{M}_{m,n})$. The matrix function $\Phi$ has a $k$-extremal function if and only if the Hankel-type operator $H_\Phi^{\{k\}}$ has a maximizing vector.

Proof. To simplify notation, let
\[
\bigl\| H_\Phi^{\{k\}} \bigr\| \stackrel{\mathrm{def}}{=} \bigl\| H_\Phi^{\{k\}} \bigr\|_{H^2(\mathbb{M}_{n,k}) \to L^2(\boldsymbol{S}_1^{m,k})/H^2(\boldsymbol{S}_1^{m,k})}.
\]
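Suppose $\Psi$ is a $k$-extremal function for $\Phi$. Let $j \in \mathbb{N}$ be such that $j \le k$ and
\[
\operatorname{rank} \Psi(\zeta) = j \quad \text{for a.e. } \zeta \in \mathbb{T}.
\]
By Lemma 3.1, there is an $R \in H^2(\mathbb{M}_{n,j})$ and a $Q \in H_0^2(\mathbb{M}_{j,m})$ such that
\[
\Psi = RQ \quad \text{and} \quad \|R(\zeta)\|_{\mathbb{M}_{n,j}}^2 = \|Q(\zeta)\|_{\mathbb{M}_{j,m}}^2 = \|\Psi(\zeta)\|_{\mathbb{M}_{n,m}} \quad \text{for a.e. } \zeta \in \mathbb{T}.
\]
As before, adding zeros if necessary, we obtain $n \times k$ and $k \times m$ matrix functions
\[
R_\# = \begin{pmatrix} R & \mathbb{O} \end{pmatrix} \quad \text{and} \quad Q_\# = \begin{pmatrix} Q \\ \mathbb{O} \end{pmatrix},
\]
respectively, so that $\Psi = R_\# Q_\#$ and
\[
\|Q_\#(\zeta)\|_{\mathbb{M}_{k,m}}^2 = \|Q(\zeta)\|_{\mathbb{M}_{j,m}}^2 = \|\Psi(\zeta)\|_{\mathbb{M}_{n,m}} \quad \text{for a.e. } \zeta \in \mathbb{T}.
\]
Let us show that $R_\#$ is a maximizing vector for $H_\Phi^{\{k\}}$. Since $Q_\#$ belongs to $H_0^2(\mathbb{M}_{k,m})$, we have that for any $F \in H^2(\boldsymbol{S}_1^{m,k})$
\[
\begin{aligned}
\sigma_k(\Phi) &= \int_{\mathbb{T}} \operatorname{trace}\bigl(\Phi(\zeta)\Psi(\zeta)\bigr)\,dm(\zeta) = \int_{\mathbb{T}} \operatorname{trace}\bigl(\Phi(\zeta)R_\#(\zeta)Q_\#(\zeta)\bigr)\,dm(\zeta) \\
&= \int_{\mathbb{T}} \operatorname{trace}\bigl((\Phi R_\# - F)(\zeta)Q_\#(\zeta)\bigr)\,dm(\zeta),
\end{aligned}
\]
and so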
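\[
\begin{aligned}
\sigma_k(\Phi) &= \left| \int_{\mathbb{T}} \operatorname{trace}\bigl((\Phi R_\# - F)(\zeta)Q_\#(\zeta)\bigr)\,dm(\zeta) \right| \\
&\le \int_{\mathbb{T}} \bigl| \operatorname{trace}\bigl((\Phi R_\# - F)(\zeta)Q_\#(\zeta)\bigr) \bigr|\,dm(\zeta) \\
&\le \int_{\mathbb{T}} \bigl\| (\Phi R_\# - F)(\zeta)Q_\#(\zeta) \bigr\|_{\boldsymbol{S}_1^m}\,dm(\zeta) \\
&\le \int_{\mathbb{T}} \bigl\| (\Phi R_\# - F)(\zeta) \bigr\|_{\boldsymbol{S}_1^{m,k}}\,\|Q_\#(\zeta)\|_{\mathbb{M}_{k,m}}\,dm(\zeta) \\
&\le \|\Phi R_\# - F\|_{L^2(\boldsymbol{S}_1^{m,k})}\,\|Q_\#\|_{L^2(\mathbb{M}_{k,m})} \\
&= \|\Phi R_\# - F\|_{L^2(\boldsymbol{S}_1^{m,k})}\,\|\Psi\|_{L^1(\mathbb{M}_{n,m})}^{1/2} \\
&\le \|\Phi R_\# - F\|_{L^2(\boldsymbol{S}_1^{m,k})}.
\end{aligned}
\]
Here the equality uses $\|Q_\#(\zeta)\|_{\mathbb{M}_{k,m}}^2 = \|\Psi(\zeta)\|_{\mathbb{M}_{n,m}}$ a.e. on $\mathbb{T}$, and the last inequality uses $\|\Psi\|_{L^1(\mathbb{M}_{n,m})} \le 1$.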
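By Theorem 3.3, we obtain that
\[
\sigma_k(\Phi) \le \bigl\| H_\Phi^{\{k\}} R_\# \bigr\|_{L^2(\boldsymbol{S}_1^{m,k})/H^2(\boldsymbol{S}_1^{m,k})} \le \bigl\| H_\Phi^{\{k\}} \bigr\| = \sigma_k(\Phi),
\]
and therefore
\[
\bigl\| H_\Phi^{\{k\}} \bigr\| = \bigl\| H_\Phi^{\{k\}} R_\# \bigr\|_{L^2(\boldsymbol{S}_1^{m,k})/H^2(\boldsymbol{S}_1^{m,k})}.
\]
Thus, $R_\#$ is a maximizing vector of $H_\Phi^{\{k\}}$.

Conversely, suppose the Hankel-type operator $H_\Phi^{\{k\}}$ has a maximizing vector $R \in H^2(\mathbb{M}_{n,k})$. Without loss of generality, we may assume that $\|R\|_{L^2(\mathbb{M}_{n,k})} = 1$. Then
\[
\operatorname{dist}_{L^2(\boldsymbol{S}_1^{m,k})}\bigl(\Phi R, H^2(\boldsymbol{S}_1^{m,k})\bigr) = \bigl\| H_\Phi^{\{k\}} \bigr\|.
\]
By the remarks in Section 2.2, there is a function $G \in H_0^2(\mathbb{M}_{k,m})$ such that $\|G\|_{L^2(\mathbb{M}_{k,m})} \le 1$ and
\[
\int_{\mathbb{T}} \operatorname{trace}\bigl((\Phi R)(\zeta)G(\zeta)\bigr)\,dm(\zeta) = \operatorname{dist}_{L^2(\boldsymbol{S}_1^{m,k})}\bigl(\Phi R, H^2(\boldsymbol{S}_1^{m,k})\bigr).
\]
On the other hand, since $R$ is a maximizing vector of $H_\Phi^{\{k\}}$, it follows from Theorem 3.3 that
\[
\int_{\mathbb{T}} \operatorname{trace}\bigl(\Phi(\zeta)(RG)(\zeta)\bigr)\,dm(\zeta) = \bigl\| H_\Phi^{\{k\}} \bigr\| = \sigma_k(\Phi).
\]
Hence $\Psi \stackrel{\mathrm{def}}{=} RG$ is a $k$-extremal function for $\Phi$. □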
Before stating the next result, let us recall that the Hankel operator $H_\Phi : H^2(\mathbb{C}^n) \to H_-^2(\mathbb{C}^m)$ is defined by $H_\Phi f = \mathbb{P}_- \Phi f$ for $f \in H^2(\mathbb{C}^n)$. The following is an immediate consequence of the previous theorem when $k = 1$.

Corollary 3.6. Let $\Phi \in L^\infty(\mathbb{M}_{m,n})$. The Hankel operator $H_\Phi$ has a maximizing vector if and only if $\Phi$ has a 1-extremal function.

Proof. By Theorem 3.5, $\Phi$ has a 1-extremal function if and only if the Hankel-type operator $H_\Phi^{\{1\}} : H^2(\mathbb{C}^n) \to L^2(\mathbb{C}^m)/H^2(\mathbb{C}^m)$ has a maximizing vector. The conclusion now follows by considering the "natural" isometric isomorphism between the spaces $H_-^2(\mathbb{C}^m) = L^2(\mathbb{C}^m) \ominus H^2(\mathbb{C}^m)$ and $L^2(\mathbb{C}^m)/H^2(\mathbb{C}^m)$. □

Remark 3.7. It is worth mentioning that if a matrix function $\Phi$ is such that the Hankel operator $H_\Phi$ has a maximizing vector (e.g. $\Phi \in (H^\infty + C)(\mathbb{M}_n)$), then any 1-extremal function $\Psi$ of $\Phi$ satisfies
\[
\int_{\mathbb{T}} \operatorname{trace}\bigl(\Phi(\zeta)\Psi(\zeta)\bigr)\,dm(\zeta) = \|H_\Phi\| = t_0(\Phi).
\]
This is a consequence of Corollary 3.6 and Theorem 3.3.

Remark 3.8. There are other characterizations of the class of bounded matrix functions $\Phi$ such that the Hankel operator $H_\Phi$ has a maximizing vector. These involve "dual" extremal functions and "thematic" factorizations. We refer the interested reader to [4] for details.

Corollary 3.9. Let $1 \le k \le \ell \le n$ and $\Phi \in L^\infty(\mathbb{M}_n)$. Suppose that $\sigma_k(\Phi) = \sigma_\ell(\Phi)$. If $H_\Phi^{\{k\}}$ has a maximizing vector, then $H_\Phi^{\{\ell\}}$ also has a maximizing vector.

Proof. This is an immediate consequence of Theorem 3.5. □

4. How about the sum of superoptimal singular values?

In this section, we prove in Theorem 4.7 that equality is obtained in (1.2) under some natural conditions. For the rest of this note, we assume that $m = n$.

Consider the non-decreasing sequence
\[
\sigma_1(\Phi), \ldots, \sigma_n(\Phi).
\]
Recall that
\[
\sigma_n(\Phi) = \operatorname{dist}_{L^\infty(\boldsymbol{S}_1^n)}\bigl(\Phi, H^\infty(\mathbb{M}_n)\bigr)
\]
and the distance on the right-hand side is in fact always attained, i.e. a best approximant $Q$ to $\Phi$ under the $L^\infty(\boldsymbol{S}_1^n)$-norm always exists, as explained in Section 2.2.

Theorem 4.1. Let $\Phi \in L^\infty(\mathbb{M}_n)$ and $1 \le k \le n$. Suppose $Q$ is a best approximant to $\Phi$ in $H^\infty(\mathbb{M}_n)$ under the $L^\infty(\boldsymbol{S}_1^n)$-norm. If the Hankel-type operator $H_\Phi^{\{k\}}$ has a maximizing vector $F$ in $H^2(\mathbb{M}_{n,k})$ and $\sigma_k(\Phi) = \sigma_n(\Phi)$, then
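1. $QF$ is a best approximant to $\Phi F$ in $H^2$ under the $L^2(\boldsymbol{S}_1^{n,k})$-norm,
2. for each $j \ge 0$,
\[
s_j\bigl((\Phi - Q)(\zeta)F(\zeta)\bigr) = s_j\bigl((\Phi - Q)(\zeta)\bigr)\,\|F(\zeta)\|_{\mathbb{M}_{n,k}} \quad \text{for a.e. } \zeta \in \mathbb{T},
\]
3. $\sum_{j=0}^{k-1} s_j((\Phi - Q)(\zeta)) = \sigma_k(\Phi)$ holds for a.e. $\zeta \in \mathbb{T}$, and
4. $s_j((\Phi - Q)(\zeta)) = 0$ holds for a.e. $\zeta \in \mathbb{T}$ whenever $j \ge k$.

Proof. By our assumptions,
\[
\begin{aligned}
\bigl\| H_\Phi^{\{k\}} \bigr\|^2\,\|F\|_{L^2(\mathbb{M}_{n,k})}^2
&= \bigl\| H_\Phi^{\{k\}} F \bigr\|_{L^2(\boldsymbol{S}_1^{n,k})/H^2(\boldsymbol{S}_1^{n,k})}^2 = \|\rho(\Phi F)\|^2 = \bigl\| \rho\bigl((\Phi - Q)F\bigr) \bigr\|^2 \\
&\le \|(\Phi - Q)F\|_{L^2(\boldsymbol{S}_1^{n,k})}^2 = \int_{\mathbb{T}} \bigl\| (\Phi - Q)(\zeta)F(\zeta) \bigr\|_{\boldsymbol{S}_1^{n,k}}^2\,dm(\zeta) \\
&\le \int_{\mathbb{T}} \bigl\| (\Phi - Q)(\zeta) \bigr\|_{\boldsymbol{S}_1^n}^2\,\|F(\zeta)\|_{\mathbb{M}_{n,k}}^2\,dm(\zeta) \\
&\le \|\Phi - Q\|_{L^\infty(\boldsymbol{S}_1^n)}^2\,\|F\|_{L^2(\mathbb{M}_{n,k})}^2 \\
&= \sigma_k(\Phi)^2\,\|F\|_{L^2(\mathbb{M}_{n,k})}^2.
\end{aligned}
\]
It follows from Theorem 3.3 that all inequalities are equalities. In particular, we obtain that $QF$ is a best approximant to $\Phi F$ under the $L^2(\boldsymbol{S}_1^{n,k})$-norm, since the first inequality is actually an equality. For almost every $\zeta \in \mathbb{T}$,
\[
\begin{aligned}
\bigl\| (\Phi - Q)(\zeta)F(\zeta) \bigr\|_{\boldsymbol{S}_1^{n,k}} &= \bigl\| (\Phi - Q)(\zeta) \bigr\|_{\boldsymbol{S}_1^n}\,\|F(\zeta)\|_{\mathbb{M}_{n,k}} \quad \text{and} \\
\bigl\| (\Phi - Q)(\zeta) \bigr\|_{\boldsymbol{S}_1^n} &= \|\Phi - Q\|_{L^\infty(\boldsymbol{S}_1^n)} = \sigma_k(\Phi),
\end{aligned} \tag{4.1}
\]
because the second and third inequalities are equalities as well. It follows from (4.1) that for each $j \ge 0$,
\[
s_j\bigl((\Phi - Q)(\zeta)F(\zeta)\bigr) = s_j\bigl((\Phi - Q)(\zeta)\bigr)\,\|F(\zeta)\|_{\mathbb{M}_{n,k}} \quad \text{for a.e. } \zeta \in \mathbb{T}.
\]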
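We claim that if $j \ge k$, then $s_j((\Phi - Q)(\zeta)) = 0$ for a.e. $\zeta \in \mathbb{T}$. By Theorem 3.5, we can choose a $k$-extremal function, say $\Psi$, for $\Phi$. Since $\Psi$ belongs to $H_0^1(\mathbb{M}_n)$,
\[
\begin{aligned}
\sigma_k(\Phi) &= \int_{\mathbb{T}} \operatorname{trace}\bigl(\Phi(\zeta)\Psi(\zeta)\bigr)\,dm(\zeta) = \int_{\mathbb{T}} \operatorname{trace}\bigl((\Phi - Q)(\zeta)\Psi(\zeta)\bigr)\,dm(\zeta) \\
&\le \int_{\mathbb{T}} \bigl\| (\Phi - Q)(\zeta)\Psi(\zeta) \bigr\|_{\boldsymbol{S}_1^n}\,dm(\zeta) \\
&\le \int_{\mathbb{T}} \bigl\| (\Phi - Q)(\zeta) \bigr\|_{\boldsymbol{S}_1^n}\,\|\Psi(\zeta)\|_{\mathbb{M}_n}\,dm(\zeta) \\
&\le \|\Phi - Q\|_{L^\infty(\boldsymbol{S}_1^n)}\,\|\Psi\|_{L^1(\mathbb{M}_n)} \le \|\Phi - Q\|_{L^\infty(\boldsymbol{S}_1^n)} = \sigma_k(\Phi),
\end{aligned}
\]
and so all inequalities are equalities. It follows that
\[
\operatorname{trace}\bigl((\Phi - Q)(\zeta)\Psi(\zeta)\bigr) = \bigl\| (\Phi - Q)(\zeta) \bigr\|_{\boldsymbol{S}_1^n}\,\|\Psi(\zeta)\|_{\mathbb{M}_n} \quad \text{for a.e. } \zeta \in \mathbb{T}. \tag{4.2}
\]
In order to complete the proof, we need the following lemma.

Lemma 4.2. Let $A \in \mathbb{M}_n$ and $B \in \mathbb{M}_n$. Suppose that $A$ and $B$ satisfy
\[
|\operatorname{trace}(AB)| = \|A\|_{\mathbb{M}_n}\,\|B\|_{\boldsymbol{S}_1^n}.
\]
If $\operatorname{rank} A \le k$, then $\operatorname{rank} B \le k$ as well.

We first finish the proof of Theorem 4.1 before proving Lemma 4.2. It follows from (4.2) and Lemma 4.2 that
\[
\operatorname{rank}\bigl((\Phi - Q)(\zeta)\bigr) \le k \quad \text{for a.e. } \zeta \in \mathbb{T}.
\]
In particular, if $j \ge k$, then
\[
s_j\bigl((\Phi - Q)(\zeta)\bigr) = 0 \quad \text{for a.e. } \zeta \in \mathbb{T},
\]
and so
\[
\sum_{j=0}^{k-1} s_j\bigl((\Phi - Q)(\zeta)\bigr) = \bigl\| (\Phi - Q)(\zeta) \bigr\|_{\boldsymbol{S}_1^n} = \sigma_k(\Phi) \quad \text{for a.e. } \zeta \in \mathbb{T}.
\]
This completes the proof. □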
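Lemma 4.2, used in the proof above, can be illustrated on $2 \times 2$ diagonal matrices. The sketch below is only an illustration with hypothetical helper names: it computes the gap $\|A\|_{\mathbb{M}_2}\|B\|_{\boldsymbol{S}_1^2} - |\operatorname{trace}(AB)|$ for a rank-one matrix $A$ and shows that the gap vanishes for a rank-one $B$ but is strictly positive for a rank-two $B$, in line with the lemma.

```python
import math


def singular_values_2x2(m):
    """Return (s0, s1), s0 >= s1, for m = ((a, b), (c, d)) with complex entries."""
    (a, b), (c, d) = m
    # s0^2 + s1^2 is the squared Frobenius norm; s0 * s1 is |det|.
    frob_sq = abs(a) ** 2 + abs(b) ** 2 + abs(c) ** 2 + abs(d) ** 2
    det_abs = abs(a * d - b * c)
    disc = math.sqrt(max(frob_sq ** 2 - 4.0 * det_abs ** 2, 0.0))
    s0 = math.sqrt((frob_sq + disc) / 2.0)
    s1 = math.sqrt(max((frob_sq - disc) / 2.0, 0.0))
    return s0, s1


def rank_2x2(m, tol=1e-12):
    """Numerical rank: the number of singular values exceeding tol."""
    return sum(1 for s in singular_values_2x2(m) if s > tol)


def trace_of_product(a, b):
    """trace(AB) for 2 x 2 matrices given as nested tuples."""
    return (a[0][0] * b[0][0] + a[0][1] * b[1][0]
            + a[1][0] * b[0][1] + a[1][1] * b[1][1])


def equality_gap(a, b):
    """||A||_op * ||B||_trace - |trace(AB)|, which is always non-negative."""
    op_norm_a = singular_values_2x2(a)[0]
    trace_norm_b = sum(singular_values_2x2(b))
    return op_norm_a * trace_norm_b - abs(trace_of_product(a, b))


if __name__ == "__main__":
    A = ((1, 0), (0, 0))        # rank one, operator norm 1
    B_rank1 = ((2, 0), (0, 0))  # rank one: equality holds in the trace inequality
    B_rank2 = ((2, 0), (0, 1))  # rank two: equality must fail by Lemma 4.2
    print(rank_2x2(B_rank1), equality_gap(A, B_rank1))
    print(rank_2x2(B_rank2), equality_gap(A, B_rank2))
```

The second call illustrates the contrapositive of the lemma with $k = 1$: since $\operatorname{rank} B = 2 > \operatorname{rank} A$, the equality $|\operatorname{trace}(AB)| = \|A\|_{\mathbb{M}_2}\|B\|_{\boldsymbol{S}_1^2}$ cannot hold.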
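Remark 4.3. Lemma 4.2 is a slight modification of Lemma 4.6 in [1]. Although the proof of Lemma 4.2 given below is almost the same as that given in [1] for Lemma 4.6, we include it for the convenience of the reader.

Proof of Lemma 4.2. Let $B$ have polar decomposition $B = UP$ and set $C = AU$, where $P = (B^*B)^{1/2}$. Let $e_1, \ldots, e_n$ be an orthonormal basis of eigenvectors for $P$ and $Pe_j = \lambda_j e_j$. It is easy to see that the following inequalities hold:
\[
\begin{aligned}
|\operatorname{trace}(AB)| &= |\operatorname{trace}(CP)| = \Bigl| \sum_{j=1}^n (Pe_j, C^*e_j) \Bigr| = \Bigl| \sum_{j=1}^n \lambda_j (e_j, C^*e_j) \Bigr| \\
&= \Bigl| \sum_{j=1}^n \lambda_j (Ce_j, e_j) \Bigr| \le \sum_{j=1}^n \lambda_j |(Ce_j, e_j)| \le \sum_{j=1}^n \lambda_j \|Ce_j\| \\
&\le \|C\|_{\mathbb{M}_n} \sum_{j=1}^n \lambda_j.
\end{aligned}
\]
On the other hand,
\[
\|A\|_{\mathbb{M}_n}\,\|B\|_{\boldsymbol{S}_1^n} = \|C\|_{\mathbb{M}_n}\,\|P\|_{\boldsymbol{S}_1^n} = \|C\|_{\mathbb{M}_n} \sum_{j=1}^n \lambda_j
\]
and so, by the assumption $|\operatorname{trace}(AB)| = \|A\|_{\mathbb{M}_n}\|B\|_{\boldsymbol{S}_1^n}$, it follows that
\[
\sum_{j=1}^n \lambda_j \|Ce_j\| = \|C\|_{\mathbb{M}_n} \sum_{j=1}^n \lambda_j.
\]
Therefore $\lambda_j \|Ce_j\| = \|C\|_{\mathbb{M}_n} \lambda_j$ for each $j$. However, if $\operatorname{rank} A \le k$, then $\operatorname{rank} C \le k$. Thus there are at most $k$ vectors $e_j$ such that $\|Ce_j\| = \|C\|_{\mathbb{M}_n}$. In particular, there are at least $n - k$ vectors $e_j$ such that $\|Ce_j\| < \|C\|_{\mathbb{M}_n}$. Thus, $\lambda_j = 0$ for those $n - k$ vectors $e_j$, $\operatorname{rank} P \le k$, and so $\operatorname{rank} B \le k$. □

Remark 4.4. Note that the distance function $d_\Phi$ defined on $\mathbb{T}$ by
\[
d_\Phi(\zeta) \stackrel{\mathrm{def}}{=} \bigl\| (\Phi - Q)(\zeta) \bigr\|_{\boldsymbol{S}_1^n}
\]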
equals $\sigma_k(\Phi)$ for almost every $\zeta \in \mathbb{T}$ and is therefore independent of the choice of the best approximant $Q$. This is an immediate consequence of Theorem 4.1. A similar phenomenon occurs in the case of matrix functions $\Phi \in L^p(\mathbb{M}_n)$ for $2 < p < \infty$. We refer the reader to [1] for details.

Corollary 4.5. Let $\Phi \in L^\infty(\mathbb{M}_n)$ be an admissible matrix function and $1 \le k \le n$. If the Hankel-type operator $H_\Phi^{\{k\}}$ has a maximizing vector and $\sigma_k(\Phi) = \sigma_n(\Phi)$, then
\[
\sum_{j=0}^{k-1} s_j\bigl((\Phi - Q)(\zeta)\bigr) \le \sum_{j=0}^{k-1} t_j(\Phi)
\]
for any best approximation $Q$ of $\Phi$ in $H^\infty(\mathbb{M}_n)$ under the $L^\infty(\boldsymbol{S}_1^n)$-norm.

Proof. This is an immediate consequence of Theorems 1.3 and 4.1. □

Definition 4.6. A matrix function $\Phi \in L^\infty(\mathbb{M}_n)$ is said to have order $\ell$ if $\ell$ is the smallest number such that $H_\Phi^{\{\ell\}}$ has a maximizing vector and
\[
\sigma_\ell(\Phi) = \operatorname{dist}_{L^\infty(\boldsymbol{S}_1^n)}\bigl(\Phi, H^\infty(\mathbb{M}_n)\bigr).
\]
If no such number $\ell$ exists, we say that $\Phi$ is inaccessible.

The interested reader should compare this definition of "order" with the one made in [1] for matrix functions in $L^p(\mathbb{M}_n)$ for $2 < p < \infty$. Also, due to Corollary 3.9, it is clear that if $\Phi \in L^\infty(\mathbb{M}_n)$ has order $\ell$, then the Hankel-type operator $H_\Phi^{\{k\}}$ has a maximizing vector and
\[
\sigma_k(\Phi) = \operatorname{dist}_{L^\infty(\boldsymbol{S}_1^n)}\bigl(\Phi, H^\infty(\mathbb{M}_n)\bigr)
\]
holds for each $k \ge \ell$.

Theorem 4.7. Let $\Phi \in L^\infty(\mathbb{M}_n)$ be an admissible matrix function of order $k$. The following statements are equivalent.

(1) $Q \in H^\infty$ is a best approximant to $\Phi$ under the $L^\infty(\boldsymbol{S}_1^n)$-norm and the functions
\[
\zeta \mapsto s_j\bigl((\Phi - Q)(\zeta)\bigr), \quad 0 \le j \le k - 1,
\]
are constant almost everywhere on $\mathbb{T}$.
(2) $Q$ is the superoptimal approximant to $\Phi$, $t_j(\Phi) = 0$ for $j \ge k$, and
\[
\sigma_k(\Phi) = t_0(\Phi) + \cdots + t_{k-1}(\Phi).
\]

Proof. We first prove that (1) implies (2). By Corollary 4.5, we have that, for almost every $\zeta \in \mathbb{T}$,
\[
\sum_{j=0}^{k-1} s_j\bigl((\Phi - Q)(\zeta)\bigr) \le \sum_{j=0}^{k-1} t_j(\Phi) \le \sum_{j=0}^{k-1} \operatorname*{ess\,sup}_{\zeta \in \mathbb{T}} s_j\bigl((\Phi - Q)(\zeta)\bigr) = \sum_{j=0}^{k-1} s_j\bigl((\Phi - Q)(\zeta)\bigr).
\]
This implies that
\[
t_j(\Phi) = \operatorname*{ess\,sup}_{\zeta \in \mathbb{T}} s_j\bigl((\Phi - Q)(\zeta)\bigr) = s_j\bigl((\Phi - Q)(\zeta)\bigr) \quad \text{for } 0 \le j \le k - 1,
\]
$Q \in \Omega_{k-1}(\Phi)$, and
\[
\sum_{j=0}^{k-1} t_j(\Phi) = \sum_{j=0}^{k-1} s_j\bigl((\Phi - Q)(\zeta)\bigr) = \sigma_k(\Phi).
\]
Moreover, Theorem 4.1 gives that $s_j((\Phi - Q)(\zeta)) = 0$ a.e. on $\mathbb{T}$ for $j \ge k$, and so $t_j(\Phi) = 0$ for $j \ge k$, as $Q \in \Omega_{k-1}(\Phi)$. Hence, $Q$ is the superoptimal approximant to $\Phi$.

Let us show that (2) implies (1). Clearly, it suffices to show that if (2) holds, then $Q$ is a best approximant to $\Phi$ under the $L^\infty(\boldsymbol{S}_1^n)$-norm. Suppose (2) holds. In this case, we must have that
\[
\sigma_k(\Phi) = \sum_{j=0}^{k-1} t_j(\Phi) = \sum_{j=0}^{k-1} s_j\bigl((\Phi - Q)(\zeta)\bigr) = \|\Phi - Q\|_{L^\infty(\boldsymbol{S}_1^n)}.
\]
Since $\Phi$ has order $k$, it follows that
\[
\sigma_n(\Phi) = \|\Phi - Q\|_{L^\infty(\boldsymbol{S}_1^n)}
\]
and so the proof is complete. □
For the rest of this section, we restrict ourselves to admissible matrix functions $\Phi$ which are also very badly approximable. Recall that, in this case, the function $\zeta \mapsto s_j(\Phi(\zeta))$ equals $t_j(\Phi)$ a.e. on $\mathbb{T}$ for $0 \le j \le n - 1$, as mentioned in Section 1.1. The next result follows at once from Theorem 4.7.

Corollary 4.8. Let $\Phi$ be an admissible very badly approximable $n \times n$ matrix function of order $k$. The zero matrix function is a best approximant to $\Phi$ under the $L^\infty(\boldsymbol{S}_1^n)$-norm if and only if $t_j(\Phi) = 0$ for $j \ge k$ and
\[
\sigma_k(\Phi) = t_0(\Phi) + \cdots + t_{k-1}(\Phi).
\]
It is natural to question at this point whether or not the collection of admissible very badly approximable matrix functions of order $k$ is non-empty. It turns out that one can easily construct examples of admissible very badly approximable matrix functions of order $k$ (see Examples 4.14 and 4.15). Theorem 4.10 below gives a simple sufficient condition for determining when a very badly approximable matrix function has order $k$. We first need the following lemma.

Lemma 4.9. Let $\Phi \in L^\infty(\mathbb{M}_n)$. Suppose there is $\Psi \in \mathcal{A}_k^n$ such that
\[
\int_{\mathbb{T}} \operatorname{trace}\bigl(\Phi(\zeta)\Psi(\zeta)\bigr)\,dm(\zeta) = \|\Phi\|_{L^\infty(\boldsymbol{S}_1^n)}.
\]
Then $\Psi$ is a $k$-extremal function for $\Phi$, $\sigma_k(\Phi) = \sigma_n(\Phi)$, and the zero matrix function is a best approximant to $\Phi$ under the $L^\infty(\boldsymbol{S}_1^n)$-norm.

Proof. By the assumptions on $\Psi$, we have
\[
\|\Phi\|_{L^\infty(\boldsymbol{S}_1^n)} = \int_{\mathbb{T}} \operatorname{trace}\bigl(\Phi(\zeta)\Psi(\zeta)\bigr)\,dm(\zeta) \le \sigma_k(\Phi).
\]
On the other hand,
\[
\sigma_k(\Phi) \le \operatorname{dist}_{L^\infty(\boldsymbol{S}_1^n)}\bigl(\Phi, H^\infty\bigr) \le \|\Phi\|_{L^\infty(\boldsymbol{S}_1^n)}
\]
always holds. Since all the previously mentioned inequalities are equalities, the conclusion follows. □

Theorem 4.10. Let $\Phi \in L^\infty(\mathbb{M}_n)$ be an admissible very badly approximable matrix function. Suppose there is $\Psi \in \mathcal{A}_k^n$ such that
\[
\int_{\mathbb{T}} \operatorname{trace}\bigl(\Phi(\zeta)\Psi(\zeta)\bigr)\,dm(\zeta) = t_0(\Phi) + \cdots + t_{n-1}(\Phi).
\]
If $t_{k-1}(\Phi) > 0$, then $\Phi$ has order $k$ and the zero matrix function is a best approximant to $\Phi$ under the $L^\infty(\boldsymbol{S}_1^n)$-norm.
A.A. Condori / Journal of Functional Analysis 257 (2009) 659–682
677
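Proof. By the remarks preceding Corollary 4.8, it is easy to see that $\|\Phi\|_{L^\infty(\boldsymbol{S}_1^n)} = t_0(\Phi) + \cdots + t_{n-1}(\Phi)$. It follows from Lemma 4.9 that $\Psi$ is a $k$-extremal function for $\Phi$, $\sigma_k(\Phi) = \sigma_n(\Phi)$, and the zero matrix function is a best approximant to $\Phi$ under the $L^\infty(\boldsymbol{S}_1^n)$-norm. Thus $\|\Phi\|_{L^\infty(\boldsymbol{S}_1^n)} = \sigma_k(\Phi)$. Moreover, by Theorem 1.3,
$$\sigma_{k-1}(\Phi) \le t_0(\Phi) + \cdots + t_{k-2}(\Phi) < t_0(\Phi) + \cdots + t_{k-1}(\Phi) \le \|\Phi\|_{L^\infty(\boldsymbol{S}_1^n)}.$$
Therefore $\sigma_{k-1}(\Phi) < \sigma_k(\Phi)$. □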
Remark 4.11. Notice that under the hypotheses of Theorem 4.10, one also obtains that $t_{k-1}(\Phi)$ is the smallest non-zero superoptimal singular value of $\Phi$. This is an immediate consequence of Corollary 4.8.

We now formulate the corresponding result for admissible very badly approximable unitary-valued matrix functions. These functions are considered in greater detail in Section 5.

Corollary 4.12. Let $U \in L^\infty(\mathbb{M}_n)$ be an admissible very badly approximable unitary-valued matrix function. If there is $\Psi \in A_n^n$ such that
$$\int_{\mathbb{T}} \operatorname{trace}\big(U(\zeta)\Psi(\zeta)\big)\,dm(\zeta) = n,$$
then $U$ has order $n$ and the zero matrix function is a best approximant to $U$ under the $L^\infty(\boldsymbol{S}_1^n)$-norm.

Proof. This is a trivial consequence of Theorem 4.10 and the fact that $t_j(U) = 1$ for $0 \le j \le n-1$. □
We are now ready to state the main result of this section.

Theorem 4.13. Let $\Phi$ be an admissible very badly approximable $n \times n$ matrix function. The following statements are equivalent:

(1) $k$ is the smallest number for which there exists $\Psi \in A_k^n$ such that
$$\int_{\mathbb{T}} \operatorname{trace}\big(\Phi(\zeta)\Psi(\zeta)\big)\,dm(\zeta) = t_0(\Phi) + \cdots + t_{n-1}(\Phi);$$
(2) $\Phi$ has order $k$, $t_j(\Phi) = 0$ for $j \ge k$ and $\sigma_k(\Phi) = t_0(\Phi) + \cdots + t_{k-1}(\Phi)$.

Proof. Let
$$\kappa(\Phi) \stackrel{\mathrm{def}}{=} \inf\Big\{ j \ge 0 : \text{there exists a } \Psi \in A_j^n \text{ such that } \int_{\mathbb{T}} \operatorname{trace}\big(\Phi(\zeta)\Psi(\zeta)\big)\,dm(\zeta) = t_0(\Phi) + \cdots + t_{n-1}(\Phi) \Big\}.$$
Clearly, $\kappa(\Phi)$ may be infinite for arbitrary $\Phi$.

Suppose $\kappa = \kappa(\Phi)$ is finite. Then Lemma 4.9 implies that $\Phi$ has a $\kappa$-extremal function, $\sigma_\kappa(\Phi) = \sigma_n(\Phi)$, and the zero matrix function is a best approximant to $\Phi$ under the $L^\infty(\boldsymbol{S}_1^n)$-norm. In particular, $\Phi$ has order $k \le \kappa(\Phi)$, $t_j(\Phi) = 0$ for $j \ge k$, and $\sigma_k(\Phi) = t_0(\Phi) + \cdots + t_{k-1}(\Phi)$, by Corollary 4.8.

On the other hand, if $\Phi$ has order $k$, $t_j(\Phi) = 0$ for $j \ge k$, and $\sigma_k(\Phi) = t_0(\Phi) + \cdots + t_{k-1}(\Phi)$, then $\Phi$ has a $k$-extremal function $\Psi \in A_k^n$ such that
$$\int_{\mathbb{T}} \operatorname{trace}\big(\Phi(\zeta)\Psi(\zeta)\big)\,dm(\zeta) = \sigma_k(\Phi) = t_0(\Phi) + \cdots + t_{k-1}(\Phi).$$
Since $t_j(\Phi) = 0$ for $j \ge k$, it follows that
$$\int_{\mathbb{T}} \operatorname{trace}\big(\Phi(\zeta)\Psi(\zeta)\big)\,dm(\zeta) = t_0(\Phi) + \cdots + t_{n-1}(\Phi).$$
Thus $\kappa(\Phi) \le k$. Hence, if either $\kappa(\Phi)$ is finite or $\Phi$ satisfies (2), then $k = \kappa(\Phi)$. □
We end this section by illustrating existence of very badly approximable matrix functions of order $k$ by giving two simple examples: a $2 \times 2$ matrix function of order 2 and a $3 \times 3$ matrix function of order 2.

Example 4.14. Let
$$\Phi = \frac{1}{\sqrt{2}} \begin{pmatrix} 1 & -1 \\ 1 & 1 \end{pmatrix} \begin{pmatrix} \bar z^2 & O \\ O & \bar z \end{pmatrix}.$$
It is easy to see that $\Phi$ is a continuous (and hence admissible) unitary-valued very badly approximable matrix function with superoptimal singular values $t_0(\Phi) = t_1(\Phi) = 1$. We claim that $\Phi$ has order 2. Indeed, the matrix function
$$\Psi = \begin{pmatrix} z^2 & O \\ O & z \end{pmatrix} \frac{1}{\sqrt{2}} \begin{pmatrix} 1 & 1 \\ -1 & 1 \end{pmatrix}$$
satisfies
$$\int_{\mathbb{T}} \operatorname{trace}\big(\Phi(\zeta)\Psi(\zeta)\big)\,dm(\zeta) = 2,$$
and so $\Phi$ has order 2 by Corollary 4.12.

Example 4.15. Let $t_0$ and $t_1$ be two positive numbers satisfying $t_0 \ge t_1$. Let
$$\Phi = \begin{pmatrix} t_0 \bar z^a & O & O \\ O & t_1 \bar z^b & O \\ O & O & O \end{pmatrix},$$
where $a$ and $b$ are positive integers. It is easy to see that $\Phi$ is a continuous (and hence admissible) very badly approximable matrix function with superoptimal singular values $t_0(\Phi) = t_0$, $t_1(\Phi) = t_1$, and $t_2(\Phi) = 0$. Again, we have that $\Phi$ has order 2. After all, the matrix function
$$\Psi = \begin{pmatrix} z^a & O & O \\ O & z^b & O \\ O & O & O \end{pmatrix}$$
satisfies
$$\int_{\mathbb{T}} \operatorname{trace}\big(\Phi(\zeta)\Psi(\zeta)\big)\,dm(\zeta) = t_0 + t_1 = t_0(\Phi) + t_1(\Phi) + t_2(\Phi),$$
and so $\Phi$ has order 2 by Theorem 4.10, since $t_1(\Phi) = t_1 > 0$.

5. Unitary-valued very badly approximable matrix functions

We lastly consider the class $\mathcal{U}_n$ of admissible very badly approximable unitary-valued matrix functions of size $n \times n$ and provide a representation of any $n$-extremal function $\Psi$ for a function $U \in \mathcal{U}_n$ such that
$$\int_{\mathbb{T}} \operatorname{trace}\big(U(\zeta)\Psi(\zeta)\big)\,dm(\zeta) = t_0(U) + \cdots + t_{n-1}(U) \tag{5.1}$$
holds. Note that for any such $U$ we have that $t_j(U) = 1$ for $0 \le j \le n-1$.

For a matrix function $\Phi \in L^\infty(\mathbb{M}_{m,n})$, we define the Toeplitz operator $T_\Phi$ by
$$T_\Phi f = P_+ \Phi f \quad \text{for } f \in H^2(\mathbb{C}^n),$$
where $P_+$ denotes the orthogonal projection from $L^2(\mathbb{C}^m)$ onto $H^2(\mathbb{C}^m)$.
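It is well known that, for any function $U \in \mathcal{U}_n$, the Toeplitz operator $T_U$ is Fredholm and $\operatorname{ind} T_U > 0$. (As usual, for a Fredholm operator $T$, its index, $\operatorname{ind} T$, is defined by $\dim \ker T - \dim \ker T^*$.) In particular, $\operatorname{ind} T_{\det U} = \operatorname{ind} T_U$. We refer the reader to Chapter 14 in [2] for more information concerning functions in $\mathcal{U}_n$.

Theorem 5.1. Suppose $U \in \mathcal{U}_n$ has an $n$-extremal function $\Psi$ such that (5.1) holds. Then $\Psi$ admits a representation of the form
$$\Psi = z h^2 \Theta,$$
where $h \in H^2$ is an outer function such that $\|h\|_{L^2} = 1$ and $\Theta$ is a finite Blaschke–Potapov product. Moreover, the scalar functions $\det(U\Theta)$ and $\operatorname{trace}(U\Theta)$ are admissible badly approximable functions that admit the factorizations
$$\det(U\Theta) = \bar z^n \frac{\bar h^n}{h^n} \quad \text{and} \quad \operatorname{trace}(U\Theta) = n \bar z \frac{\bar h}{h}.$$

We refer the reader to Section 5 of Chapter 2 in [2] for the definition and other results concerning Blaschke–Potapov products.

Proof. It follows from (5.1) that all inequalities in (1.3) are equalities and so
$$\operatorname{trace}\big(U(\zeta)\Psi(\zeta)\big) = \|U(\zeta)\Psi(\zeta)\|_{\boldsymbol{S}_1^n} = n \|\Psi(\zeta)\|_{\mathbb{M}_n} \tag{5.2}$$
holds for a.e. $\zeta \in \mathbb{T}$. Since $U$ is unitary-valued, then
$$\|U(\zeta)\Psi(\zeta)\|_{\boldsymbol{S}_1^n} = \|\Psi(\zeta)\|_{\boldsymbol{S}_1^n},$$
and so
$$\|\Psi(\zeta)\|_{\boldsymbol{S}_1^n} = n \|\Psi(\zeta)\|_{\mathbb{M}_n}$$
must hold for a.e. $\zeta \in \mathbb{T}$. Therefore we must have
$$\Psi(\zeta) = \|\Psi(\zeta)\|_{\mathbb{M}_n} V(\zeta) \quad \text{for a.e. } \zeta \in \mathbb{T},$$
for some unitary-valued matrix function $V$, because
$$s_j\big(\Psi(\zeta)\big) = \|\Psi(\zeta)\|_{\mathbb{M}_n} \quad \text{for a.e. } \zeta \in \mathbb{T},\ 0 \le j \le n-1.$$
Let $h \in H^2$ be an outer function such that
$$|h(\zeta)| = \|\Psi(\zeta)\|_{\mathbb{M}_n}^{1/2} \quad \text{on } \mathbb{T}. \tag{5.3}$$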
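Consider also the matrix function $\Xi \stackrel{\mathrm{def}}{=} h^{-2}\Psi$. It follows from (5.3) that
$$\Xi^*\Xi(\zeta) = \frac{1}{|h(\zeta)|^4} \Psi^*\Psi(\zeta) = I_n \quad \text{for a.e. } \zeta \in \mathbb{T},$$
and so $\Xi$ is an inner function. Thus $\Psi$ admits the factorization $\Psi = z h^2 \Theta$ for some $n \times n$ unitary-valued inner function $\Theta$ and an outer function $h \in H^2$ such that $\|h\|_{L^2} = 1$.

Note that the first equality in (5.2) indicates that the scalar function $\varphi \stackrel{\mathrm{def}}{=} \operatorname{trace}(U\Theta)$ satisfies
$$z h^2 \varphi = n |h|^2 \quad \text{on } \mathbb{T},$$
or equivalently
$$\varphi = n \bar z \frac{\bar h}{h}.$$
Moreover, $\|H_{U\Theta}\|_{\mathrm e} \le \|H_U\|_{\mathrm e} < 1$, hence $\|H_\varphi\|_{\mathrm e} < n = \|H_\varphi\|$, implying that $\varphi$ is an admissible badly approximable scalar function on $\mathbb{T}$. We conclude that the Toeplitz operator $T_\varphi$ is Fredholm and $\operatorname{ind} T_\varphi > 0$ (cf. Theorem 7.5.5 in [2]).

Returning to (5.2), it also follows that each eigenvalue of $U(\zeta)\Psi(\zeta)$ equals $\|\Psi(\zeta)\|_{\mathbb{M}_n} = |h(\zeta)|^2$ for a.e. $\zeta \in \mathbb{T}$. In particular,
$$|h(\zeta)|^{2n} = \det\big(U(\zeta)\Psi(\zeta)\big) = z^n h^{2n}(\zeta) \cdot \det U(\zeta) \cdot \det \Theta(\zeta)$$
holds for a.e. $\zeta \in \mathbb{T}$. By setting
$$\theta \stackrel{\mathrm{def}}{=} \det \Theta \quad \text{and} \quad u \stackrel{\mathrm{def}}{=} \det U,$$
we have that $u$ admits the factorization
$$u = \bar\theta \bar z^n \frac{\bar h^n}{h^n} = \bar\theta \omega^n,$$
where $\omega \stackrel{\mathrm{def}}{=} \bar z \bar h / h = \varphi/n$. Since the Toeplitz operator $T_\omega$ is Fredholm with positive index, $T_{u\bar\omega^n}$ is Fredholm as well. We conclude now that
$$\dim(H^2 \ominus \theta H^2) = \dim \ker T_\theta^* = \dim \ker T_{\bar\theta} = \operatorname{ind} T_{\bar\theta} < \infty,$$
because $\ker T_\theta = \{O\}$ and $u\bar\omega^n = \bar\theta$. Therefore $\Theta$ is a Blaschke–Potapov product, because $\Theta$ is a unitary-valued inner function and $\det \Theta$ is a finite Blaschke product. □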
Acknowledgments

This article is based in part on the author's PhD dissertation at Michigan State University. I would like to thank Professor Vladimir V. Peller for communicating Theorem 1.3 and for suggesting corrections on earlier versions of this paper. I also thank the reviewer for remarks that led to improvements in the paper's presentation.

References

[1] L. Baratchart, F.L. Nazarov, V.V. Peller, Analytic approximation of matrix functions in Lp, J. Approx. Theory, in press.
[2] V.V. Peller, Hankel Operators and Their Applications, Springer Monogr. Math., Springer, New York, 2003.
[3] V.V. Peller, personal communication.
[4] V.V. Peller, Analytic approximation of matrix functions and dual extremal functions, Proc. Amer. Math. Soc., in press.
[5] V.V. Peller, S.R. Treil, Approximation by analytic matrix functions. The four block problem, J. Funct. Anal. 148 (1997) 191–228.
[6] V.V. Peller, N.J. Young, Superoptimal analytic approximation of matrix functions, J. Funct. Anal. 120 (1994) 300–343.
Journal of Functional Analysis 257 (2009) 683–720 www.elsevier.com/locate/jfa
Relationships between combinatorial measurements and Orlicz norms
Ron Blei ∗, Lin Ge
Department of Mathematics, University of Connecticut, Storrs, CT 06269, United States
Received 17 November 2008; accepted 23 February 2009
Available online 10 March 2009
Communicated by K. Ball
Abstract We establish in a setting of harmonic analysis precise relationships between combinatorial measurements and Orlicz norms. These relationships further extend and sharpen prior results concerning extensions of the Littlewood 2n/(n + 1)-inequalities, the n-dimensional Khintchin inequalities, and the Kahane–Khintchin inequality. © 2009 Elsevier Inc. All rights reserved. Keywords: Rademacher system; Littlewood 2n/(n + 1)-inequalities; n-Dimensional Khintchin inequalities; Kahane–Khintchin inequality; Combinatorial measurement; α-Orlicz function; Orlicz norm; Quasi-asymptotic
* Corresponding author.
E-mail addresses: [email protected] (R. Blei), [email protected] (L. Ge).
0022-1236/$ – see front matter © 2009 Elsevier Inc. All rights reserved. doi:10.1016/j.jfa.2009.02.020

1. Introduction

In this paper we focus on connections between measurements reflecting purely combinatorial data and measurements that are based on harmonic-analytic and probabilistic properties. Given an infinite set $Y$ and $F \subset Y^n$ ($n \ge 1$), we consider a function associated with $F$, $\Psi_F : \mathbb{N} \to \mathbb{N}$, such that for $s \in \mathbb{N}$,
$$\Psi_F(s) = \max\big\{ |F \cap (A_1 \times \cdots \times A_n)| : A_j \subset Y,\ |A_j| \le s,\ j = 1, \ldots, n \big\}. \tag{1.1}$$
Define
$$\dim F = \lim_{s \to \infty} \log \Psi_F(s) / \log s; \tag{1.2}$$
equivalently, for $a > 0$ define
$$d_F(a) = \sup\big\{ \Psi_F(s)/s^a : s \in \mathbb{N} \big\}, \tag{1.3}$$
and observe that if $|F| = \infty$, then
$$\dim F = \inf\{ a : d_F(a) < \infty \} = \sup\{ a : d_F(a) = \infty \}. \tag{1.4}$$
The function $\Psi_F$ is viewed as a gauge of the combinatorial complexity in $F$: $\Psi_F(s)$ is the smallest integer $k$ such that for all $s$-sets $A_1 \subset Y, \ldots, A_n \subset Y$, the number of samplings $a_1 \in A_1, \ldots, a_n \in A_n$ with $(a_1, \ldots, a_n) \in F$ is no greater than $k$. The index $\dim F$ is viewed as the combinatorial dimension of $F$, conveying that $\Psi_F(s)$ "grows like" $s^{\dim F}$, in the sense that
$$\lim_{s \to \infty} \frac{\Psi_F(s)}{s^\beta} = \begin{cases} 0 & \text{if } \beta > \dim F, \\ \infty & \text{if } \beta < \dim F. \end{cases} \tag{1.5}$$
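For finite sets the gauge in (1.1) can be evaluated directly, which may help fix ideas. The following is a minimal brute-force sketch, not part of the original argument: the function name `psi` and the two test sets are illustrative assumptions, and the search is exponential in $s$, so it is only meant for very small examples.

```python
from itertools import combinations, product


def psi(F, s):
    """Brute-force Psi_F(s): max |F ∩ (A_1 x ... x A_n)| over sets with |A_j| <= s."""
    F = set(F)
    if not F or s <= 0:
        return 0
    n = len(next(iter(F)))
    # Only coordinates that actually occur in F can contribute to the count.
    coords = [sorted({p[j] for p in F}) for j in range(n)]
    # The count is monotone in each A_j, so sets of maximal allowed size suffice.
    choices = [list(combinations(c, min(s, len(c)))) for c in coords]
    best = 0
    for A in product(*choices):
        sides = [set(a) for a in A]
        hits = sum(all(p[j] in sides[j] for j in range(n)) for p in F)
        best = max(best, hits)
    return best


diagonal = [(i, i) for i in range(5)]        # hypothetical "one-dimensional" subset of Y^2
grid = list(product(range(4), repeat=2))     # hypothetical full 4 x 4 grid
print([psi(diagonal, s) for s in range(1, 5)])
print([psi(grid, s) for s in range(1, 5)])
```

On the diagonal the gauge grows linearly in $s$, while on the full grid it grows like $s^2$, in line with the heuristic that $\Psi_F(s)$ grows like $s^{\dim F}$.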
We distinguish between two cases: (i) if $\lim_{s \to \infty} \Psi_F(s)/s^{\dim F} < \infty$ ($d_F(\dim F) < \infty$), then $\dim F$ is exact; (ii) if $\lim_{s \to \infty} \Psi_F(s)/s^{\dim F} = \infty$ ($d_F(\dim F) = \infty$), then $\dim F$ is asymptotic. (E.g., see [2], XIII.4.) In this paper we further analyze the asymptotic case, and establish a precise resolution of it.

We take $Y$ to be $\mathbb{N}$ (without loss of generality), and identify it with the Rademacher system $\{r_j\}_{j \in \mathbb{N}} := R$, a set of projections from $\{-1, 1\}^{\mathbb{N}} := \Omega$ onto $\{-1, 1\}$:
$$r_j(\omega) = \omega(j), \quad j \in \mathbb{N},\ \omega = (\omega(j))_{j \in \mathbb{N}} \in \Omega. \tag{1.6}$$
Here we view $\Omega$ as a compact Abelian group (endowed with the product topology, coordinate-wise multiplication, and the normalized Haar measure $\mathbb{P}$), and view $R$ as an independent set of characters on $\Omega$. (E.g., see [2], II.1, VII.2.) For $F \subset R^n$ ($n \ge 1$), let $C_F(\Omega^n)$ and $L^2_F(\Omega^n)$ be, respectively, the spaces of continuous functions and $\mathbb{P}^n$-square integrable functions on $\Omega^n$, whose Fourier–Walsh transforms are supported in $F$. For $t > 0$, let $\|\cdot\|_t$ be the $\ell^t$ norm, and for $f \in C(\Omega^n)$, let $\hat f$ be the Fourier–Walsh transform of $f$. For $F \subset R^n$ and $t > 0$, let
$$\zeta_F(t) = \sup\big\{ \|\hat f\|_t : f \in B_{C_F(\Omega^n)} \big\}, \tag{1.7}$$
where $B_{C_F(\Omega^n)}$ denotes the closed unit ball in $C_F(\Omega^n)$, and define
$$\sigma_F = \inf\{ t : \zeta_F(t) < \infty \} = \sup\{ t : \zeta_F(t) = \infty \}; \tag{1.8}$$
if $\zeta_F(\sigma_F) < \infty$, then $\sigma_F$ is exact, and if $\zeta_F(\sigma_F) = \infty$, then $\sigma_F$ is asymptotic.
For $F \subset R^n$ and $t > 0$, let
$$\eta_F(t) = \sup\big\{ \|f\|_{L^p}/p^t : p > 2,\ f \in B_{L^2_F(\Omega^n)} \big\}, \tag{1.9}$$
where $B_{L^2_F(\Omega^n)}$ is the closed unit ball in $L^2_F(\Omega^n)$, and define
$$\delta_F = \inf\{ t : \eta_F(t) < \infty \} = \sup\{ t : \eta_F(t) = \infty \}; \tag{1.10}$$
again, if $\eta_F(\delta_F) < \infty$, then $\delta_F$ is exact, and if $\eta_F(\delta_F) = \infty$, then $\delta_F$ is asymptotic. The main results in [1] were:
$$d_F(t) < \infty \iff \zeta_F\big(2t/(t+1)\big) < \infty \iff \eta_F(t/2) < \infty. \tag{1.11}$$
In particular,
$$\sigma_F = \frac{2 \dim F}{\dim F + 1} \tag{1.12}$$
and
$$\delta_F = \frac{\dim F}{2}, \tag{1.13}$$
where $\sigma_F$ and $\delta_F$ are exact if and only if $\dim F$ is exact. These results in effect were extensions of the classical Littlewood $2n/(n+1)$-inequalities [6,9], and the $n$-dimensional Khintchin inequalities [5,7].

In this paper, we use Orlicz functions and their associated Orlicz norms to precisely resolve the case $d_F(\dim F) = \infty$. Our work is divided into four parts. In the first part we focus on the combinatorial gauge $\Psi_F$, $F \subset R^n$ ($n \ge 1$). Given functions $\Psi : \mathbb{N} \to \mathbb{N}$ and $\Phi : \mathbb{R} \to \mathbb{R}$, we say that $\Psi$ is quasi-asymptotic to $\Phi$, and write $\Psi \sim_q \Phi$, if
$$0 < \lim_{s \to \infty} \frac{\Psi(s)}{\Phi(s)} < \infty \tag{1.14}$$
(cf. (1.5)). We prove (Theorem 2.2) that if $F \subset R^n$ is infinite, $\dim F = \alpha$ ($\alpha \ge 1$), and $\lim_{s \to \infty} \Psi_F(s)/s^\alpha > 0$, then there exists an $\alpha$-Orlicz function (Definition 2.1) $\Phi$ such that $\Psi_F \sim_q \Phi$. Conversely, we show (Theorem 2.3) that for every $\alpha$-Orlicz function $\Phi$ ($\alpha \ge 1$) there exists $F \subset R^n$ such that $\Psi_F \sim_q \Phi$. These results extend prior constructions in [3] and [4].

In the next three parts we derive precise relations between $\Psi_F$ ($F \subset R^n$) and corresponding Orlicz norms associated with $\Psi_F$ in $C_F(\Omega^n)$ and $L^2_F(\Omega^n, \mathbb{P}^n)$ (Theorem 3.4, Corollary 4.3 and Theorem 5.4). These results naturally extend prior results stated in (1.11), (1.12) and (1.13) above, concerning relations between combinatorial dimension and Littlewood-type inequalities and Khintchin-type inequalities.

The results in this paper form a portion of the second author's PhD dissertation written at the University of Connecticut under the guidance of the first author.
2. Combinatorial structures and α-Orlicz functions

An $\mathbb{R}$-valued function $\Phi$ on $[0, \infty)$ is an Orlicz function if $\Phi$ is continuous, non-decreasing, convex, $\Phi(0) = 0$, and $\lim_{x \to \infty} \Phi(x) = \infty$. (E.g., see [8], 4.a.) For $F \subset \mathbb{N}^n$ and an Orlicz function $\Phi$, define (extending the definition in (1.3))
$$d_F(\Phi) = \sup\big\{ \Psi_F(s)/\Phi(s) : s \in \mathbb{N} \big\}. \tag{2.1}$$
If $\Phi(x) = x^a$ for some $a \ge 1$, then we write $d_F(a)$ for $d_F(\Phi)$.

Note that $\dim F = \alpha$ is exact ($\alpha \ge 1$) and $\lim_{s \to \infty} \Psi_F(s)/s^\alpha > 0$ if and only if $\Psi_F$ is quasi-asymptotic to $\Phi(x) = x^\alpha$, $x \ge 0$. If $\dim F = \alpha$ is asymptotic, then we focus on $\phi(s) = \Psi_F(s)/s^\alpha$, where (necessarily) $\lim_{s \to \infty} \phi(s) = \infty$, and $\phi(s)$ is $o(s^\varepsilon)$ for all $\varepsilon > 0$. To this end, for technical reasons that will later become apparent, we introduce the notion of an $\alpha$-Orlicz function:

Definition 2.1. For $\alpha \ge 1$, an Orlicz function $\Phi$ is said to be an $\alpha$-Orlicz function if $\phi \in C^2[0, \infty)$ and $\Phi(x) = x^\alpha \phi(x)$ for $x \ge 0$, where either $\phi \equiv 1$, or $\phi$ satisfies the following properties:

(i) $\phi$ is concave and strictly increasing to $\infty$;
(ii) $x\phi(x)$ is convex for $x \ge 0$;
(iii) $\phi(x) = o(x^\varepsilon)$ for all $\varepsilon > 0$, and for each $\varepsilon > 0$ there exists $K > 0$ such that $\phi(x)/x^\varepsilon$ is decreasing with increasing $x$ for $x \ge K$.

An example. Suppose we want to construct an $\alpha$-Orlicz function whose graph contains $(s, s^\alpha (\log s)^\beta)$ for $s$ large, for some $\alpha \ge 1$ and $\beta > 0$. Note that $(\log x)^\beta$ is not concave for $x < e^{\beta - 1}$, $x(\log x)^\beta$ is not convex for $x < e^{1 - \beta}$, and the $y$-intercept of the tangent line to the graph of $(\log x)^\beta$ at $x$ for $x < e^\beta$ is less than 0. Let $x_0 = \max\{e^{1-\beta}, e^\beta\} + 1$, and let $\ell$ be the linear function whose graph is the tangent line to the graph of $(\log x)^\beta$ at $x_0$; that is,
$$\ell(x) = (\log x_0)^\beta + \beta x_0^{-1} (\log x_0)^{\beta - 1} (x - x_0), \quad -\infty < x < \infty. \tag{2.2}$$
Let
$$\tilde\phi(x) = \begin{cases} (\log x)^\beta & \text{if } x \ge x_0, \\ \ell(x) & \text{if } 0 \le x < x_0. \end{cases} \tag{2.3}$$
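The piecewise function in (2.2) and (2.3) can be checked numerically before any smoothing. The sketch below is only an illustration under stated assumptions: the helper names `make_phi_tilde` and `second_differences`, the choice $\beta = 2$, and the sampling grid are ours, and discrete second differences are used as a stand-in for the concavity of $\tilde\phi$ and the convexity of $x\tilde\phi(x)$.

```python
import math


def make_phi_tilde(beta):
    """Return (x0, phi_tilde) following (2.2)-(2.3): tangent line below x0, (log x)^beta above."""
    x0 = max(math.exp(1 - beta), math.exp(beta)) + 1
    value = math.log(x0) ** beta
    slope = beta * math.log(x0) ** (beta - 1) / x0

    def phi_tilde(x):
        if x >= x0:
            return math.log(x) ** beta
        return value + slope * (x - x0)   # the tangent line ell(x) of (2.2)

    return x0, phi_tilde


def second_differences(f, xs):
    """Discrete second differences of f on an (approximately) equally spaced grid xs."""
    return [f(a) - 2 * f(b) + f(c) for a, b, c in zip(xs, xs[1:], xs[2:])]


x0, phi = make_phi_tilde(beta=2.0)
grid = [0.1 * i for i in range(0, 1001)]          # sample points in [0, 100]
concave = all(d <= 1e-9 for d in second_differences(phi, grid))
convex = all(d >= -1e-9 for d in second_differences(lambda x: x * phi(x), grid))
print(x0, phi(0.0) > 0, concave, convex)
```

The positive value at $0$ reflects the choice $x_0 > e^\beta$, which makes the $y$-intercept of the tangent line positive.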
Smooth $\tilde\phi$ at $x_0$ so that the smoothed function $\phi$ is in $C^2[0, \infty)$, $\phi$ is concave, and $x\phi(x)$ is convex. Then the function $\Phi(x) = x^\alpha \phi(x)$ for $x \ge 0$ is the desired $\alpha$-Orlicz function.

Theorem 2.2. Let $n \in \mathbb{N}$. If $F \subset \mathbb{N}^n$ is infinite with $\dim F = \alpha$, and $\lim_{s \to \infty} \Psi_F(s)/s^\alpha > 0$, then there exists an $\alpha$-Orlicz function $\Phi$ such that $\Psi_F \sim_q \Phi$.

Proof. Because $F \subset \mathbb{N}^n$ is infinite, we have $\alpha \ge 1$. If $d_F(\alpha) < \infty$, then $\Phi(x) = x^\alpha$ for $x \ge 0$ is an $\alpha$-Orlicz function such that $\Psi_F \sim_q \Phi$.

Suppose $d_F(\alpha) = \infty$. First we choose a sequence $\{s_j\}$, $s_j \uparrow \infty$. For any positive integers $s$ and $s'$, let $\ell_{s,s'}$ be the linear function whose graph is the line passing through $(s, \Psi_F(s)/s^\alpha)$ and $(s', \Psi_F(s')/(s')^\alpha)$. (Let $\ell_{0,1}$ be the linear function whose graph is the line passing through $(0, 0)$ and $(1, 1)$.) Let $s_1 = 0$, and $s_2 = 1$. To choose $s_j$ for $j > 2$, we proceed by (double) induction.

Suppose we have chosen $s_j$ for $j \ge 2$. To choose $s_{j+1}$, we consider the $j$ points $s_j^{(1)}, \ldots, s_j^{(j)}$ such that
$$s_j^{(1)} = \min\Big\{ s > s_j : \frac{\Psi_F(s_j)}{s_j^\alpha} < \frac{\Psi_F(s)}{s^\alpha} < \ell_{s_{j-1},s_j}(s) \Big\}, \tag{2.4}$$
and for $1 < i \le j$,
$$s_j^{(i)} = \min\Big\{ s > s_j^{(i-1)} : \frac{\Psi_F(s_j)}{s_j^\alpha} < \frac{\Psi_F(s)}{s^\alpha} < \ell_{s_{j-1},s_j}(s) \Big\}. \tag{2.5}$$
The existence of $s_j^{(1)}, \ldots, s_j^{(j)}$ for any $j$ is guaranteed because $d_F(\alpha) = \infty$, and because $\Psi_F(s)/s^\alpha = o(s^\varepsilon)$ for all $\varepsilon > 0$ (because $\dim F = \alpha$). Denote the slope of $\ell_{s,s'}$ by $m_{s,s'}$ for any $s$ and $s'$. Let
$$s_{j+1} = \max\Big\{ s \in \big\{ s_j^{(1)}, \ldots, s_j^{(j)} \big\} : m_{s_j,s} \ge m_{s_j,s_j^{(i)}} \text{ for all } i = 1, \ldots, j \Big\}. \tag{2.6}$$
Continuing this process, we obtain a sequence $s_j \uparrow \infty$ that satisfies

(1) $\Psi_F(s_j)/s_j^\alpha$ is strictly increasing to $\infty$ with increasing $j$;
(2) $m_{s_{j-1},s_j} > m_{s_j,s_{j+1}} > 0$ for all $j > 1$;
(3) for each $j$, and $s_j \le s \le s_j^{(j)}$, either
$$\frac{\Psi_F(s)}{s^\alpha} \ge \ell_{s_{j-1},s_j}(s), \tag{2.7}$$
or
$$\frac{\Psi_F(s)}{s^\alpha} \le \ell_{s_j,s_{j+1}}(s). \tag{2.8}$$

Claim 1. For each $j$, there are only finitely many $s \in \mathbb{N}$ such that
$$\frac{\Psi_F(s)}{s^\alpha} \ge \ell_{s_j,s_{j+1}}(s). \tag{2.9}$$

Proof of Claim 1. Suppose the claim is false. Then there exist $j$, and a sequence $s'_k \uparrow \infty$ such that
$$\frac{\Psi_F(s'_k)}{(s'_k)^\alpha} \ge \ell_{s_j,s_{j+1}}(s'_k). \tag{2.10}$$
For $x \ge 0$, write
$$\ell_{s_j,s_{j+1}}(x) = m_{s_j,s_{j+1}} x + b_j, \tag{2.11}$$
where $m_{s_j,s_{j+1}} > 0$, and $b_j > 0$. By (2.10) and (2.11),
$$\Psi_F(s'_k) \ge m_{s_j,s_{j+1}} (s'_k)^{\alpha+1} + b_j (s'_k)^\alpha, \tag{2.12}$$
which contradicts $\dim F = \alpha$, and the claim follows. □
Let $\ell$ be the piecewise-linear function defined by
$$\ell(x) = \ell_{s_j,s_{j+1}}(x), \quad s_j \le x \le s_{j+1},\ j \ge 1. \tag{2.13}$$

Claim 2.
$$\sup\big\{ \Psi_F(s)/\big(s^\alpha \ell(s)\big) : s \in \mathbb{N} \big\} < \infty. \tag{2.14}$$

Proof of Claim 2. Suppose the claim is false. Then there exists a sequence $\{s'_i\}$ such that $\Psi_F(s'_i)/(s'_i)^\alpha > \ell(s'_i)$, and $\lim_{i \to \infty} \Psi_F(s'_i)/((s'_i)^\alpha \ell(s'_i)) = \infty$. By Claim 1 and because $[s_j, s_j + j] \subset [s_j, s_j^{(j)}]$, there exist $j$ sufficiently large, and $s'_i \in [s_j, s_j^{(j)}]$ such that $\ell_{s_j,s_{j+1}}(s'_i) < \Psi_F(s'_i)/(s'_i)^\alpha < \ell_{s_{j-1},s_j}(s'_i)$, which contradicts (2.7) and (2.8), and the claim follows. □

Next we construct a spline function as follows. Note that for $b > 0$, $(\log x)^b$ is concave for $b < \log x + 1$, and $x(\log x)^b$ is convex for $x > e$. We start from $s_4$ (because $s_4 > e$). For $s_4 \le x \le s_5$, let
$$P_4(x) = a_4 (\log x)^{b_4} + c_4 x + d_4, \tag{2.15}$$
where $a_4 > 0$, $0 < b_4 < \log s_4 + 1$, $c_4 \ge 0$, and $d_4$ are chosen such that
$$P_4(s_4) = \ell(s_4), \quad P_4(s_5) = \ell(s_5), \quad (P_4)'_+(s_4) = \frac{m_{s_3,s_4} + m_{s_4,s_5}}{2}, \tag{2.16}$$
where $(P_4)'_+(x)$ denotes the right derivative of $P_4$ at $x$. (Similarly $(P_4)'_-(x)$ denotes the left derivative of $P_4$ at $x$.) For $s_5 \le x \le s_6$, let
$$P_5(x) = a_5 (\log x)^{b_5} + c_5 x + d_5, \tag{2.17}$$
where $a_5 > 0$, $0 < b_5 < \log s_5 + 1$, $c_5 \ge 0$, and $d_5$ are chosen such that:

(4) if $(P_4)'_-(s_5) > m_{s_5,s_6}$, then
$$P_5(s_5) = \ell(s_5), \quad P_5(s_6) = \ell(s_6), \quad \text{and} \quad (P_5)'_+(s_5) = (P_4)'_-(s_5); \tag{2.18}$$
(5) if $(P_4)'_-(s_5) \le m_{s_5,s_6}$, then
$$P_5(s_5) = \ell(s_5), \quad P_5(s_6) = \ell(s_6), \quad \text{and} \quad (P_5)'_-(s_6) = \frac{m_{s_5,s_6} + m_{s_6,s_7}}{2}. \tag{2.19}$$
We proceed as follows. For $j \ge 6$, and $s_j \le x \le s_{j+1}$, let
$$P_j(x) = a_j (\log x)^{b_j} + c_j x + d_j, \tag{2.20}$$
where $a_j > 0$, $0 < b_j < \log s_j + 1$, $c_j \ge 0$, and $d_j$ are chosen such that:

(6) if $(P_{j-1})'_-(s_j) > m_{s_j,s_{j+1}}$, then
$$P_j(s_j) = \ell(s_j), \quad P_j(s_{j+1}) = \ell(s_{j+1}), \quad \text{and} \quad (P_j)'_+(s_j) = (P_{j-1})'_-(s_j); \tag{2.21}$$
(7) if $(P_{j-1})'_-(s_j) \le m_{s_j,s_{j+1}}$, then
$$P_j(s_j) = \ell(s_j), \quad P_j(s_{j+1}) = \ell(s_{j+1}), \tag{2.22}$$
and
$$(P_j)'_-(s_{j+1}) = \frac{m_{s_j,s_{j+1}} + m_{s_{j+1},s_{j+2}}}{2}. \tag{2.23}$$

For any $j \ge 5$ such that (7) holds, $(P_{j-1})'_-(s_j) \le m_{s_j,s_{j+1}} < (P_j)'_+(s_j)$. By the mean value theorem, there exist $x_{j-1} \in (s_{j-1}, s_j)$, and $x_j \in (s_j, s_{j+1})$ such that $P'_{j-1}(x_{j-1}) = m_{s_{j-1},s_j}$, and $P'_j(x_j) = m_{s_j,s_{j+1}}$. Because $P_{j-1}$ and $P_j$ are concave, and because $m_{s_{j-1},s_j} > m_{s_j,s_{j+1}}$, there are $t_{j-1} \in (x_{j-1}, s_j)$, and $t_j \in (s_j, x_j)$ such that
$$P'_{j-1}(t_{j-1}) = P'_j(t_j). \tag{2.24}$$
For $x \ge 0$, let
$$T_j(x) = P_{j-1}(t_{j-1}) + P'_{j-1}(t_{j-1})(x - t_{j-1}), \tag{2.25}$$
that is, $T_j$ is the linear function whose graph is both the tangent line to the graph of $P_{j-1}$ at $t_{j-1}$, and the tangent line to the graph of $P_j$ at $t_j$. Let $\tilde\phi$ be the spline function such that

(8) for $0 \le x \le s_4$, $\tilde\phi(x)$ is the linear function whose graph is the tangent line to the graph of $P_4$ at $s_4$;
(9) for any $x \ge s_4$, let $p(x) = P_j(x)$, $s_{j-1} \le x \le s_j$ for $j \ge 5$, and let
$$\tilde\phi(x) = \begin{cases} T_j(x) & \text{if } t_{j-1} \le x \le t_j,\ [t_{j-1}, t_j] \subset [x_{j-1}, x_j],\ j \ge 5, \\ p(x) & \text{otherwise,} \end{cases} \tag{2.26}$$
where $t_{j-1}$ and $t_j$, $j \ge 5$, are indicated in (2.24).

Then $\tilde\phi$ is concave, and $x\tilde\phi(x)$ is convex. Let $\tilde\Phi = x^\alpha \tilde\phi(x)$. By Claim 2 and because $\tilde\phi \ge \ell$,
$$\sup\big\{ \Psi_F(s)/\tilde\Phi(s) : s \in \mathbb{N} \big\} \le \sup\big\{ \Psi_F(s)/\big(s^\alpha \ell(s)\big) : s \in \mathbb{N} \big\} < \infty. \tag{2.27}$$
Claim 3. There are infinitely many $j$ such that
$$\tilde\phi(s_j) = \ell(s_j). \tag{2.28}$$

Proof of Claim 3. For each $j \ge 5$, by the definition of $\tilde\phi$ in (2.26), and the construction of $P_j$ in (6) and (7), we have either $\tilde\phi(s_j) = P_j(s_j) = \ell(s_j)$, or $\tilde\phi(s_{j+1}) = P_j(s_{j+1}) = \ell(s_{j+1})$. Hence the claim follows. □

By (2.27) and (2.28), and because $\ell(s_j) = \Psi_F(s_j)/s_j^\alpha$ for all $j$, we obtain $\Psi_F \sim_q \tilde\Phi$. Smooth $\tilde\phi$ at the points $s_j$ and $t_j$ for all $j$ so that the smoothed function $\phi$ is in $C^2[0, \infty)$, $\phi$ is concave, and $x\phi(x)$ is convex. Let $\Phi = x^\alpha \phi(x)$. Then $\Phi$ is an $\alpha$-Orlicz function such that $\Psi_F \sim_q \Phi$. □

Next we establish the converse to Theorem 2.2.

Theorem 2.3. (Cf. [2], Theorem XIII.19.) For $n \ge 2$, and $1 \le \alpha < n$, if $\Phi$ is an $\alpha$-Orlicz function, then there exists $F \subset \mathbb{N}^n$ such that $\Psi_F \sim_q \Phi$.

Lemma 2.4. (Cf. [2], Lemma XIII.17.) Let $n \ge 2$ be an integer, and $1 \le \gamma < n$. Let $\Phi$ be an Orlicz function such that $x \le \Phi(x) \le x^\gamma$ for all $x \in [1, \infty)$ and $\Phi(x)/x^\gamma$ is decreasing with increasing $x$. Then for every $k \in \mathbb{N}$, there exists $F \subset [k]^n$ ($[k] = \{1, \ldots, k\}$) such that
$$\Psi_F(s) \le C\Phi(s), \quad s \in [k], \tag{2.29}$$
and
$$|F| = \Psi_F(k) \ge \frac{1}{2}\Phi(k), \tag{2.30}$$
where $C > 0$ depends only on $n$ and $\gamma$.

Proof. For $k \in \mathbb{N}$, let $\{X_i^{(k)} : i \in [k]^n\}$ be the Bernoulli system of statistically independent $\{0, 1\}$-valued variables on $(\Omega, \mathbb{P})$ such that
$$\mathbb{P}\big(X_i^{(k)} = 1\big) = \frac{\Phi(k)}{k^n}. \tag{2.31}$$
Consider the random set $F = \{i : X_i^{(k)} = 1\}$. We use the following elementary fact about binomial probabilities: for $p \in (0, 1)$, and integers $m > 0$ and $i \ge 2mp$,
$$\binom{m}{i+1} p^{i+1} (1-p)^{m-i-1} \le \frac{1}{2} \binom{m}{i} p^i (1-p)^{m-i}, \tag{2.32}$$
which implies
$$\sum_{i=j}^{m} \binom{m}{i} p^i (1-p)^{m-i} \le 2 \binom{m}{j} p^j (1-p)^{m-j}, \quad j \ge 2mp. \tag{2.33}$$
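The tail estimate (2.33) is easy to sanity-check with exact rational arithmetic. The following sketch is only a numerical illustration for a few hypothetical parameter values, not a proof; the helper names `term`, `tail` and `check_tail_bound` are ours.

```python
from fractions import Fraction
from math import comb


def term(m, i, p):
    """Exact binomial probability C(m, i) p^i (1-p)^(m-i)."""
    return comb(m, i) * p**i * (1 - p) ** (m - i)


def tail(m, j, p):
    """Exact upper tail sum_{i=j}^{m} C(m, i) p^i (1-p)^(m-i)."""
    return sum(term(m, i, p) for i in range(j, m + 1))


def check_tail_bound(m, p):
    """Check (2.33) for every integer j with 2mp <= j <= m."""
    return all(tail(m, j, p) <= 2 * term(m, j, p)
               for j in range(m + 1) if j >= 2 * m * p)


p = Fraction(1, 10)   # hypothetical success probability
print(all(check_tail_bound(m, p) for m in range(1, 41)))
```

The check passes because, for $i \ge 2mp$, consecutive binomial terms decrease by a factor of at least 2, so the tail is dominated by a geometric series.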
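Fix $s \in [k]$, and let $A$ be an $s$-hypercube in $[k]^n$ ($A = A_1 \times \cdots \times A_n$, where $|A_1| = \cdots = |A_n| = s$). Denote
$$C = \max\Big\{ 2e^{n+2}, \frac{n+1}{n-\gamma} \Big\}. \tag{2.34}$$
Let $j(s) = [C\Phi(s)]$ (= smallest integer $\ge C\Phi(s)$). Then
$$j \ge 2\Phi(s) = 2s^n \frac{\Phi(k)}{k^n} \cdot \frac{\Phi(s)\,k^n}{s^n\,\Phi(k)} \ge 2s^n \frac{\Phi(k)}{k^n} \quad \big(\text{because } \Phi(s)/s^\gamma \text{ is decreasing}\big). \tag{2.35}$$
By (2.33) and (2.35),
$$\begin{aligned}
\mathbb{P}\Big( \sum_{i \in A} X_i^{(k)} \ge j \Big) &= \sum_{i=j}^{s^n} \binom{s^n}{i} \Big( \frac{\Phi(k)}{k^n} \Big)^{i} \Big( 1 - \frac{\Phi(k)}{k^n} \Big)^{s^n - i} \\
&\le 2 \binom{s^n}{j} \Big( \frac{\Phi(k)}{k^n} \Big)^{j} \Big( 1 - \frac{\Phi(k)}{k^n} \Big)^{s^n - j} \\
&\le 2 \frac{s^{nj}}{j!} \Big( \frac{\Phi(k)}{k^n} \Big)^{j} \\
&\le \frac{2 s^{nj} (\Phi(k))^j}{(C\Phi(s))^j e^{-j} k^{nj}}.
\end{aligned} \tag{2.36}$$
Then,
$$\begin{aligned}
\mathbb{P}\Big( \sum_{i \in A} X_i^{(k)} \ge C\Phi(s) \text{ for some } s\text{-hypercube } A \Big)
&\le \binom{k}{s}^{n} \frac{2 s^{nj} (\Phi(k))^j}{(C\Phi(s))^j e^{-j} k^{nj}} \\
&\le \frac{k^{ns}}{s^{ns} e^{-ns}} \cdot \frac{2 s^{nj} (\Phi(k))^j}{C^j (\Phi(s))^j e^{-j} k^{nj}} \\
&\le \Big( \frac{s}{k} \Big)^{nj - ns} \Big( \frac{\Phi(k)}{\Phi(s)} \Big)^{j} \frac{2 e^{j + ns}}{(2e^{n+2})^j} \quad (\text{by (2.34)}) \\
&\le \Big( \frac{s}{k} \Big)^{nj - ns} \Big( \frac{k}{s} \Big)^{\gamma j} \frac{1}{e^{n(j-s)+j}} \quad \big(\text{because } \Phi(s)/s^\gamma \text{ is decreasing}\big) \\
&\le \Big( \frac{s}{k} \Big)^{(n-\gamma)j - ns} e^{-s} \quad \big(\text{because } n(j-s) + j \ge j \ge s\big) \\
&\le \Big( \frac{s}{k} \Big)^{s} e^{-s} \quad \big(\text{because } (n-\gamma)j \ge (n-\gamma)C\Phi(s) \ge (n+1)s\big).
\end{aligned} \tag{2.37}$$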
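Hence,
$$\mathbb{P}\Big( \sum_{i \in A} X_i^{(k)} \ge C\Phi(s) \text{ for some } s\text{-hypercube } A,\ s \in [k] \Big) \le \sum_{s=1}^{k} \Big( \frac{s}{k} \Big)^{s} e^{-s}. \tag{2.38}$$
Therefore,
$$\lim_{k \to \infty} \mathbb{P}\Big( \sum_{i \in A} X_i^{(k)} \ge C\Phi(s) \Big) = 0. \tag{2.39}$$
By Chebyshev's inequality,
$$\begin{aligned}
\mathbb{P}\Big( \Big| \sum_{i \in [k]^n} X_i^{(k)} - \Phi(k) \Big| > \frac{\Phi(k)}{2} \Big)
&\le \frac{\operatorname{Var}\big( \sum_{i \in [k]^n} X_i^{(k)} - \Phi(k) \big)}{\big( \frac{\Phi(k)}{2} \big)^2} \\
&= \frac{4 k^n \operatorname{Var}\big(X_i^{(k)}\big)}{\Phi(k)^2} = \frac{4 k^n \big( \frac{\Phi(k)}{k^n} \big) \big( 1 - \frac{\Phi(k)}{k^n} \big)}{\Phi(k)^2} \le \frac{4}{\Phi(k)}.
\end{aligned} \tag{2.40}$$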
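The arithmetic behind (2.40) can be reproduced exactly for concrete values. The sketch below evaluates the middle expression of (2.40) for hypothetical inputs ($k = 100$, $n = 2$, $\Phi(k) = 1000$ are illustrative choices, not taken from the text); the function name `chebyshev_bound` is ours.

```python
from fractions import Fraction


def chebyshev_bound(k, n, phi_k):
    """Middle expression of (2.40): Var(sum X_i) / (Phi(k)/2)^2 with p = Phi(k)/k^n."""
    p = Fraction(phi_k) / k**n          # success probability from (2.31)
    variance = k**n * p * (1 - p)       # sum of k^n independent Bernoulli variances
    return variance / (Fraction(phi_k) / 2) ** 2


bound = chebyshev_bound(k=100, n=2, phi_k=1000)
print(bound, bound <= Fraction(4, 1000))
```

In this instance the exact Chebyshev expression is slightly smaller than the simplified bound $4/\Phi(k)$, the difference coming from the dropped factor $1 - \Phi(k)/k^n$.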
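Hence
$$\lim_{k \to \infty} \mathbb{P}\Big( \sum_{i \in [k]^n} X_i^{(k)} \le \frac{\Phi(k)}{2} \Big) = 0. \tag{2.41}$$
By (2.39) and (2.41),
$$\lim_{k \to \infty} \mathbb{P}\big( F \text{ satisfies (2.29) and (2.30)} \big) = 1. \tag{2.42}$$
□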
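Let $\pi_1, \ldots, \pi_n$ be the canonical projections from $\mathbb{N}^n$ onto $\mathbb{N}$. We say $F \subset \mathbb{N}^n$ and $G \subset \mathbb{N}^n$ are $n$-disjoint if $\pi_\ell(F) \cap \pi_\ell(G) = \emptyset$ for all $\ell = 1, \ldots, n$.

Lemma 2.5. (Cf. [2], Lemma XIII.18.) Suppose $F_j$, $j \in \mathbb{N}$, is a sequence of pairwise $n$-disjoint subsets of $\mathbb{N}^n$, and let $F = \bigcup_j F_j$. For an Orlicz function $\Phi$, and for every $m \in \mathbb{N}$,
$$\sup\big\{ \Psi_F(s)/\Phi(s) : s \in [m] \big\} \le n \sup\big\{ \Psi_{F_j}(s)/\Phi(s) : s \in [m],\ j \in \mathbb{N} \big\}. \tag{2.43}$$

Proof. Let $m \in \mathbb{N}$ and let $s \in [m]$. For $A_1 \times \cdots \times A_n \subset \mathbb{N}^n$ such that $|A_1| \le s, \ldots, |A_n| \le s$, let
$$s_{i,j} = |\pi_i(F_j) \cap A_i|, \quad i \in [n],\ j \in \mathbb{N}, \tag{2.44}$$
and
$$s_j = \max\{ s_{i,j} : i \in [n] \}, \quad j \in \mathbb{N}. \tag{2.45}$$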
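Then
$$\sum_{j=1}^{\infty} s_{i,j} \le |A_i| \le s, \quad i \in [n]. \tag{2.46}$$
Let
$$L = \sup\big\{ \Psi_{F_j}(s)/\Phi(s) : s \in [m],\ j \in \mathbb{N} \big\}. \tag{2.47}$$
By (2.44), (2.45) and (2.47), for any $j \in \mathbb{N}$,
$$|F_j \cap (A_1 \times \cdots \times A_n)| = \big| F_j \cap \big( (\pi_1(F_j) \cap A_1) \times \cdots \times (\pi_n(F_j) \cap A_n) \big) \big| \le L\Phi(s_j). \tag{2.48}$$
Then
$$\frac{|F \cap (A_1 \times \cdots \times A_n)|}{\Phi(s)} = \frac{\sum_{j=1}^{\infty} |F_j \cap (A_1 \times \cdots \times A_n)|}{\Phi(s)} \le L \frac{\sum_{j=1}^{\infty} \Phi(s_j)}{\Phi(s)}. \tag{2.49}$$
Because $\Phi$ is increasing,
$$\begin{aligned}
\sum_{j=1}^{\infty} \Phi(s_j) &\le \sum_{i=1}^{n} \sum_{j=1}^{\infty} \Phi(s_{i,j}) \quad (\text{by (2.45)}) \\
&\le \sum_{i=1}^{n} \Phi\Big( \sum_{j=1}^{\infty} s_{i,j} \Big) \quad (\text{because } \Phi \text{ is convex}) \\
&\le n\Phi(s) \quad (\text{by (2.46)}).
\end{aligned} \tag{2.50}$$
By (2.49) and (2.50),
$$\frac{|F \cap (A_1 \times \cdots \times A_n)|}{\Phi(s)} \le nL. \tag{2.51}$$
□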
Proof of Theorem 2.3. Let $\alpha < \gamma < n$, and let $\Phi$ be an $\alpha$-Orlicz function. Then $x \le \Phi(x) \le x^\gamma$ for large $x$, and $\Phi(x)/x^\gamma$ is eventually decreasing. By Lemma 2.4, we produce a collection $\{F_j\}$ of pairwise $n$-disjoint subsets of $\mathbb{N}^n$ such that $\sup\{\Psi_{F_j}(s)/\Phi(s) : s \in \mathbb{N}\} < C$, and for each $j \in \mathbb{N}$, we have $|\pi_\ell(F_j)| = j$ for $\ell \in [n]$, $|F_j| \ge \Phi(j)/2$. Let $F = \bigcup_j F_j$, and apply Lemma 2.5. □
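3. A relation between combinatorial and functional analytic structures

3.1. An Orlicz norm associated with an α-Orlicz function

Suppose $\Phi(x) = x^\alpha \phi(x)$ for $x \ge 0$ is an $\alpha$-Orlicz function. Because $\phi$ is concave, increasing, and $\phi(0) \ge 0$, we have $\phi'(x) \le \phi(x)/x$ for all $x \ge 0$. Hence
$$0 \le \frac{\phi'(x)}{\phi(x)} x \le 1, \quad x \ge 0. \tag{3.1}$$
Let
$$\Theta(x) = x^{\frac{\alpha+1}{2}} \phi(x)^{\frac{1}{2}}, \quad x \ge 0, \tag{3.2}$$
and
$$\theta(x) = \frac{1}{\phi\big(\Theta^{-1}(1/x)\big)}, \quad x \ge 0. \tag{3.3}$$
Note that
$$\phi(x)\,\theta\big(1/\Theta(x)\big) = 1, \quad x \ge 0. \tag{3.4}$$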
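For $x \ge 0$, define
$$M_\Phi(x) = x^{\frac{2\alpha}{\alpha+1}} \theta(x)^{\frac{1}{\alpha+1}}. \tag{3.5}$$
Then
$$M'_\Phi(x) = \frac{1}{\alpha+1} x^{\frac{2\alpha}{\alpha+1} - 1} \theta(x)^{\frac{1}{\alpha+1}} \Big( 2\alpha + \frac{\theta'(x)}{\theta(x)} x \Big), \tag{3.6}$$
and
$$M''_\Phi(x) = \frac{1}{\alpha+1} x^{\frac{2\alpha}{\alpha+1} - 2} \theta(x)^{\frac{1}{\alpha+1}} \Big( \frac{2\alpha(\alpha-1)}{\alpha+1} + D(x) \Big), \tag{3.7}$$
where
$$D(x) = \frac{4\alpha}{\alpha+1} \frac{\theta'(x)}{\theta(x)} x - \frac{\alpha}{\alpha+1} \Big( \frac{\theta'(x)}{\theta(x)} \Big)^{2} x^2 + \frac{\theta''(x)}{\theta(x)} x^2. \tag{3.8}$$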
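To see definitions (3.2), (3.3) and (3.5) at work, one can evaluate $M_\Phi$ numerically. The sketch below is an illustration under stated assumptions: the factory name `make_M`, the bisection-based inverse of $\Theta$, and the sample choice $\phi(x) = 1 + \log(1 + x)$ are ours and are not taken from the text. In the power case $\phi \equiv 1$ one has $\theta \equiv 1$, so $M_\Phi(x) = x^{2\alpha/(\alpha+1)}$, the Littlewood-type exponent appearing in (1.11).

```python
import math


def make_M(alpha, phi):
    """Build Theta, theta and M_Phi from (3.2), (3.3), (3.5) for Phi(x) = x^alpha * phi(x)."""

    def Theta(x):
        return x ** ((alpha + 1) / 2) * math.sqrt(phi(x))

    def Theta_inverse(y):
        # Theta is continuous and increasing with Theta(0) = 0, so bisection applies.
        lo, hi = 0.0, 1.0
        while Theta(hi) < y:
            hi *= 2
        for _ in range(200):
            mid = (lo + hi) / 2
            if Theta(mid) < y:
                lo = mid
            else:
                hi = mid
        return hi

    def theta(x):
        return 1.0 / phi(Theta_inverse(1.0 / x))

    def M(x):
        if x == 0:
            return 0.0
        return x ** (2 * alpha / (alpha + 1)) * theta(x) ** (1 / (alpha + 1))

    return Theta, theta, M


# Power case: phi = 1, alpha = 2, so M(x) = x^(4/3).
_, _, M_power = make_M(2, lambda x: 1.0)
print(M_power(8.0))

# Hypothetical logarithmic case, used to check identity (3.4) numerically.
phi_log = lambda x: 1 + math.log(1 + x)
Theta_log, theta_log, M_log = make_M(1.5, phi_log)
print(phi_log(3.0) * theta_log(1 / Theta_log(3.0)))
```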
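We now establish that $M_\Phi$ is an Orlicz function. (In Section 3.2 we will use the Orlicz norm associated with $M_\Phi$.)

Lemma 3.1. $M_\Phi$ (defined in (3.5)) is an Orlicz function. Moreover, except for the case $\Phi(x) = x$ for $x \ge 0$, we have $M'_\Phi(x) > 0$ and $M''_\Phi(x) > 0$ for $x > 0$.

Proof. It is obvious that $M'_\Phi(x) > 0$ for $x > 0$. Now we consider $M''_\Phi$. Taking derivatives on both sides of (3.4), we have
$$\phi'(x)\,\theta\big(1/\Theta(x)\big) + \phi(x)\,\theta'\big(1/\Theta(x)\big)\big(1/\Theta(x)\big)' = 0. \tag{3.9}$$
Hence
$$\frac{\phi'(x)}{\phi(x)} x + \frac{\theta'(1/\Theta(x))}{\theta(1/\Theta(x))} \big(1/\Theta(x)\big)' x = 0. \tag{3.10}$$
By (3.2),
$$\big(1/\Theta(x)\big)' x = -\frac{\alpha+1}{2} \big(1 + E(x)\big) \big(1/\Theta(x)\big), \tag{3.11}$$
where
$$E(x) = \frac{1}{\alpha+1} \frac{\phi'(x)}{\phi(x)} x. \tag{3.12}$$
Note $\alpha \ge 1$. By (3.10) and (3.11), and by substituting $1/\Theta(x) = y$, we have
$$\frac{\theta'(y)}{\theta(y)} y = \frac{2}{(\alpha+1)(1 + E(x))} \frac{\phi'(x)}{\phi(x)} x \le \frac{\phi'(x)}{\phi(x)} x \le 1 \quad (\text{by (3.1)}). \tag{3.13}$$
Taking derivatives on both sides of (3.9), we have
$$\phi''(x)\,\theta\big(1/\Theta(x)\big) + 2\phi'(x)\,\theta'\big(1/\Theta(x)\big)\big(1/\Theta(x)\big)' + \phi(x)\,\theta''\big(1/\Theta(x)\big)\Big(\big(1/\Theta(x)\big)'\Big)^2 + \phi(x)\,\theta'\big(1/\Theta(x)\big)\big(1/\Theta(x)\big)'' = 0. \tag{3.14}$$
Hence
$$\frac{\phi''(x)}{\phi(x)} x^2 + 2 \frac{\phi'(x)}{\phi(x)} x \frac{\theta'(1/\Theta(x))}{\theta(1/\Theta(x))} \big(1/\Theta(x)\big)' x + \frac{\theta''(1/\Theta(x))}{\theta(1/\Theta(x))} \Big( \big(1/\Theta(x)\big)' x \Big)^2 + \frac{\theta'(1/\Theta(x))}{\theta(1/\Theta(x))} \big(1/\Theta(x)\big)'' x^2 = 0. \tag{3.15}$$
By (3.2),
$$\big(1/\Theta(x)\big)'' x^2 = \frac{(\alpha+1)(\alpha+3)}{4} \big(1 + F(x)\big) \big(1/\Theta(x)\big), \tag{3.16}$$
where
$$F(x) = \frac{4}{(\alpha+1)(\alpha+3)} \Big( \frac{\alpha+1}{2} \frac{\phi'(x)}{\phi(x)} x + \frac{3}{4} \Big( \frac{\phi'(x)}{\phi(x)} \Big)^2 x^2 - \frac{1}{2} \frac{\phi''(x)}{\phi(x)} x^2 \Big). \tag{3.17}$$
Bringing (3.11) and (3.16) into (3.15), and substituting $1/\Theta(x) = y$, we have
$$\frac{\phi''(x)}{\phi(x)} x^2 - (\alpha+1)\big(1 + E(x)\big) \frac{\phi'(x)}{\phi(x)} x \frac{\theta'(y)}{\theta(y)} y + \Big( \frac{\alpha+1}{2} \Big)^2 \big(1 + E(x)\big)^2 \frac{\theta''(y)}{\theta(y)} y^2 + \frac{(\alpha+1)(\alpha+3)}{4} \big(1 + F(x)\big) \frac{\theta'(y)}{\theta(y)} y = 0. \tag{3.18}$$
By (3.12) and (3.17),
$$1 + F(x) = \big(1 + E(x)\big)^2 - \frac{2}{(\alpha+1)(\alpha+3)} \frac{\phi''(x)}{\phi(x)} x^2 - G(x), \tag{3.19}$$
where
$$G(x) = \frac{4}{(\alpha+1)(\alpha+3)} \frac{\phi'(x)}{\phi(x)} x \Big( 1 - \frac{\alpha}{2(\alpha+1)} \frac{\phi'(x)}{\phi(x)} x \Big). \tag{3.20}$$
Then by (3.1), $G(x) \ge 0$ for all $x \ge 0$. Applying (3.13) and (3.19) to (3.18), we have
$$\begin{aligned}
&\frac{\phi''(x)}{\phi(x)} x^2 - \frac{(\alpha+1)^2 (1 + E(x))^2}{2} \Big( \frac{\theta'(y)}{\theta(y)} y \Big)^2 + \frac{(\alpha+1)^2 (1 + E(x))^2}{4} \frac{\theta''(y)}{\theta(y)} y^2 \\
&\quad + \frac{(\alpha+1)(\alpha+3)}{4} \big(1 + E(x)\big)^2 \frac{\theta'(y)}{\theta(y)} y - \frac{1}{2} \frac{\phi''(x)}{\phi(x)} x^2 \frac{\theta'(y)}{\theta(y)} y - \frac{(\alpha+1)(\alpha+3)}{4} G(x) \frac{\theta'(y)}{\theta(y)} y = 0.
\end{aligned} \tag{3.21}$$
Then
$$\begin{aligned}
&\frac{(\alpha+1)^2 (1 + E(x))^2}{4} \Big( -2 \Big( \frac{\theta'(y)}{\theta(y)} y \Big)^2 + \frac{\theta''(y)}{\theta(y)} y^2 + \frac{\alpha+3}{\alpha+1} \frac{\theta'(y)}{\theta(y)} y \Big) \\
&\quad = -\frac{\phi''(x)}{\phi(x)} x^2 \Big( 1 - \frac{1}{2} \frac{\theta'(y)}{\theta(y)} y \Big) + \frac{(\alpha+1)(\alpha+3)}{4} G(x) \frac{\theta'(y)}{\theta(y)} y \ge 0 \\
&\quad \Big( \text{because } -\frac{\phi''(x)}{\phi(x)} x^2 \ge 0,\ 0 \le \frac{\theta'(y)}{\theta(y)} y \le 1, \text{ and } G(x) \ge 0 \Big).
\end{aligned} \tag{3.22}$$
Hence for all $y \ge 0$,
$$-2 \Big( \frac{\theta'(y)}{\theta(y)} y \Big)^2 + \frac{\theta''(y)}{\theta(y)} y^2 + \frac{\alpha+3}{\alpha+1} \frac{\theta'(y)}{\theta(y)} y \ge 0. \tag{3.23}$$
Then, by (3.8), for all $x \ge 0$,
$$D(x) = -2 \Big( \frac{\theta'(x)}{\theta(x)} x \Big)^2 + \frac{\theta''(x)}{\theta(x)} x^2 + \frac{\alpha+3}{\alpha+1} \frac{\theta'(x)}{\theta(x)} x + \frac{\alpha+2}{\alpha+1} \Big( \frac{\theta'(x)}{\theta(x)} x \Big)^2 + \frac{3\alpha-3}{\alpha+1} \frac{\theta'(x)}{\theta(x)} x \ge 0. \tag{3.24}$$
By (3.7) and (3.24), we have $M''_\Phi \ge 0$. If $\alpha > 1$, then $2\alpha(\alpha-1)/(\alpha+1) > 0$, and hence $M''_\Phi(x) > 0$ for $x > 0$. If $\alpha = 1$, and $\phi$ is strictly increasing, then $\frac{\theta'(y)}{\theta(y)} y > 0$ in (3.13), and hence $D(x) > 0$ for $x > 0$. Then $M''_\Phi(x) > 0$ for $x > 0$. Because $\Phi$ is an $\alpha$-Orlicz function, either $\phi \equiv 1$, or $\phi$ is strictly increasing. Therefore, except for the case $\Phi(x) = x$, we have $M'_\Phi(x) > 0$ and $M''_\Phi(x) > 0$ for $x > 0$. □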
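The following property will be needed in Section 3.2.

Lemma 3.2. For $M_\Phi$ defined in (3.5), $M'_\Phi(x) - xM''_\Phi(x) > 0$ for all $x > 0$.

Proof. It suffices to show that $M'_\Phi(y) - yM''_\Phi(y) > 0$ for all $y > 0$, where $y = 1/\Theta(x)$. For simplicity, we denote $M_\Phi$ by $M$. By (3.6) and (3.7),
$$M'(y) - yM''(y) = \frac{1}{\alpha+1} y^{\frac{2\alpha}{\alpha+1} - 1} \theta(y)^{\frac{1}{\alpha+1}} H(y), \tag{3.25}$$
where
$$H(y) = 2\alpha + \frac{\theta'(y)}{\theta(y)} y - \frac{2\alpha(\alpha-1)}{\alpha+1} - D(y), \tag{3.26}$$
where $D(y)$ is defined in (3.8). By (3.24) and (3.22), we have
$$\begin{aligned}
D(y) &= \frac{4}{(\alpha+1)^2 (1 + E(x))^2} \Big( -\frac{\phi''(x)}{\phi(x)} x^2 \Big( 1 - \frac{1}{2} \frac{\theta'(y)}{\theta(y)} y \Big) + \frac{(\alpha+1)(\alpha+3)}{4} G(x) \frac{\theta'(y)}{\theta(y)} y \Big) \\
&\quad + \frac{\alpha+2}{\alpha+1} \Big( \frac{\theta'(y)}{\theta(y)} y \Big)^2 + \frac{3\alpha-3}{\alpha+1} \frac{\theta'(y)}{\theta(y)} y.
\end{aligned} \tag{3.27}$$
Because $x\phi(x)$ is convex for $x \ge 0$, we have
$$\big( x\phi(x) \big)'' = 2\phi'(x) + x\phi''(x) \ge 0, \quad x \ge 0. \tag{3.28}$$
Hence
$$-\frac{\phi''(x)}{\phi(x)} x^2 \le 2 \frac{\phi'(x)}{\phi(x)} x. \tag{3.29}$$
By (3.26), (3.27), (3.29) and (3.20),
$$\begin{aligned}
H(y) &\ge \frac{4\alpha}{\alpha+1} + \frac{\theta'(y)}{\theta(y)} y - \frac{4}{(\alpha+1)^2 (1 + E(x))^2} \Big( 2 \frac{\phi'(x)}{\phi(x)} x \Big( 1 - \frac{1}{2} \frac{\theta'(y)}{\theta(y)} y \Big) \\
&\qquad + \frac{\phi'(x)}{\phi(x)} x \Big( 1 - \frac{\alpha}{2(\alpha+1)} \frac{\phi'(x)}{\phi(x)} x \Big) \frac{\theta'(y)}{\theta(y)} y \Big) - \frac{\alpha+2}{\alpha+1} \Big( \frac{\theta'(y)}{\theta(y)} y \Big)^2 - \frac{3\alpha-3}{\alpha+1} \frac{\theta'(y)}{\theta(y)} y \\
&= \frac{4\alpha}{\alpha+1} + \frac{\theta'(y)}{\theta(y)} y - \frac{4}{(\alpha+1)(1 + E(x))} \frac{\theta'(y)}{\theta(y)} y + \frac{\alpha}{2(\alpha+1)} \Big( \frac{\theta'(y)}{\theta(y)} y \Big)^3 \\
&\qquad - \frac{\alpha+2}{\alpha+1} \Big( \frac{\theta'(y)}{\theta(y)} y \Big)^2 - \frac{3\alpha-3}{\alpha+1} \frac{\theta'(y)}{\theta(y)} y \quad (\text{by (3.13)}).
\end{aligned} \tag{3.30}$$
By (3.12),
$$\begin{aligned}
\frac{4}{(\alpha+1)(1 + E(x))} \frac{\theta'(y)}{\theta(y)} y &= \frac{4 \frac{\theta'(y)}{\theta(y)} y}{\alpha + 1 + \frac{\phi'(x)}{\phi(x)} x} \\
&\le \frac{4 \frac{\phi'(x)}{\phi(x)} x}{\alpha + 1 + \frac{\phi'(x)}{\phi(x)} x} \quad (\text{by (3.13)}) \\
&\le \frac{4}{\alpha+2} \quad (\text{by (3.1)}).
\end{aligned} \tag{3.31}$$
By (3.30) and (3.31),
$$\begin{aligned}
H(y) &\ge \frac{4\alpha}{\alpha+1} - \frac{4}{\alpha+2} + \Big( 1 - \frac{3\alpha-3}{\alpha+1} \Big) \frac{\theta'(y)}{\theta(y)} y - \frac{\alpha+2}{\alpha+1} \Big( \frac{\theta'(y)}{\theta(y)} y \Big)^2 \\
&= \frac{\alpha^2}{(\alpha+1)(\alpha+2)} + \frac{3\alpha-2}{\alpha+1} + \frac{-2\alpha+4}{\alpha+1} \frac{\theta'(y)}{\theta(y)} y - \frac{\alpha+2}{\alpha+1} \Big( \frac{\theta'(y)}{\theta(y)} y \Big)^2 \\
&\ge \frac{\alpha^2}{(\alpha+1)(\alpha+2)} + \frac{3\alpha-2}{\alpha+1} \frac{\theta'(y)}{\theta(y)} y + \frac{-2\alpha+4}{\alpha+1} \frac{\theta'(y)}{\theta(y)} y - \frac{\alpha+2}{\alpha+1} \Big( \frac{\theta'(y)}{\theta(y)} y \Big)^2 \quad \Big( \text{because } 0 \le \frac{\theta'(y)}{\theta(y)} y \le 1 \Big) \\
&= \frac{\alpha^2}{(\alpha+1)(\alpha+2)} + \frac{\alpha+2}{\alpha+1} \frac{\theta'(y)}{\theta(y)} y - \frac{\alpha+2}{\alpha+1} \Big( \frac{\theta'(y)}{\theta(y)} y \Big)^2 > 0,
\end{aligned} \tag{3.32}$$
as desired. □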
3.2. Precise relations between combinatorial measurements and Orlicz norms

We recall the following definitions of Orlicz norms ([8], 4.a and 4.b). For an Orlicz function $M$ and a sequence of scalars $a = (a_1, a_2, \ldots)$, define
$$\|a\|_M = \inf\Big\{ \rho > 0 : \sum_{n=1}^{\infty} M\big(|a_n|/\rho\big) \le 1 \Big\}, \tag{3.33}$$
$$M^*(u) = \max\{ ux - M(x) : x > 0 \}, \tag{3.34}$$
and
$$|||a|||_M = \sup\Big\{ \sum_{n=1}^{\infty} a_n b_n : \sum_{n=1}^{\infty} M^*\big(|b_n|\big) \le 1 \Big\}. \tag{3.35}$$
The two Orlicz norms $\|\cdot\|_M$ and $|||\cdot|||_M$ are equivalent and
$$\|a\|_M \le |||a|||_M \le 2\|a\|_M. \tag{3.36}$$
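For a finite sequence, the norm in (3.33) can be approximated by bisection, since $\rho \mapsto \sum_n M(|a_n|/\rho)$ is non-increasing. The following is a minimal sketch under that observation; the function name `luxemburg_norm` and the iteration count are our own choices, and $M$ is assumed to be an Orlicz function supplied by the caller.

```python
def luxemburg_norm(a, M, iterations=200):
    """Approximate ||a||_M = inf{rho > 0 : sum M(|a_n| / rho) <= 1} from (3.33) by bisection."""
    a = [abs(x) for x in a if x != 0]
    if not a:
        return 0.0

    def total(rho):
        return sum(M(x / rho) for x in a)

    hi = 1.0
    while total(hi) > 1:                 # grow until rho = hi is admissible
        hi *= 2
    lo = hi / 2
    while lo > 0 and total(lo) <= 1:     # shrink until rho = lo is not admissible
        hi, lo = lo, lo / 2
    for _ in range(iterations):
        mid = (lo + hi) / 2
        if total(mid) <= 1:
            hi = mid
        else:
            lo = mid
    return hi


print(luxemburg_norm([3, 4], lambda x: x * x))   # M(x) = x^2 recovers the Euclidean norm
print(luxemburg_norm([3, -4], lambda x: x))      # M(x) = x recovers the l^1 norm
```

For $M(x) = x^t$ the quantity reduces to the $\ell^t$ norm, which is why the Orlicz norm built from $M_\Phi$ extends the scale of $\ell^t$ norms used in (1.7).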
Definition 3.3. For F ⊂ R n and α-Orlicz function Φ, let ζF (Φ) = sup fˆMΦ : f ∈ BCF (Ω n ) ,
(3.37)
where MΦ is given in (3.5). This definition naturally extends the definition in (1.7). If Φ(x) = x α , x 0, for some α 1, then ζF (Φ) and ζF (2α/(α + 1)) have the same meaning. Let n ∈ N. For F ⊂ R n and α-Orlicz function Φ, let δ(α) =
1 2α
α 2(α 2 +α+1)
if dF (Φ) 1, if dF (Φ) > 1.
(3.38)
Theorem 3.4. (Cf. [2], Theorem XIII.20.) For n ∈ N, there exist Cn > 0 and Dn > 0 such that for all F ⊂ R n and α-Orlicz functions Φ, δ(α) 1 Cn dF (Φ) ζF (Φ) Dn max dF (Φ) 2α , 1 .
(3.39)
Proof. Let $\Phi$ be an $\alpha$-Orlicz function, and let $F \subset R^n$ be such that $d_F(\Phi) < \infty$. First we assume $\lim_{x\to\infty} \Phi(x)/x = \infty$. (That is, we exclude the case $\Phi(x) = x$ for $x \ge 0$.) Then, by Lemma 3.1, $M_\Phi$ (defined in (3.5)) satisfies $M_\Phi'(x) > 0$ and $M_\Phi''(x) > 0$ for $x > 0$. (The case $\Phi(x) = x$, $x \in [0, \infty)$, will be discussed later.) For simplicity, we denote $M_\Phi$ by $M$. In (3.34), for each $u > 0$, the maximum of $ux - M(x)$ occurs at the unique point $x$ satisfying $M'(x) = u$. Hence we can treat $x$ as a function of $u$, and write $M^*$ as a function satisfying the two equations
\[ M^*(u) = ux - M(x), \quad \text{where } x \text{ is such that } M'(x) = u. \tag{3.40} \]
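The two-equation description in (3.40) translates directly into a computation: solve $M'(x) = u$, then evaluate $ux - M(x)$. The sketch below is illustrative only; the function name `conjugate` and the bisection solver are assumptions, and $M'$ is assumed continuous and increasing with $M'(0) = 0$.

```python
def conjugate(M, dM, u, iters=200):
    """Evaluate M*(u) = u*x - M(x), where x solves M'(x) = u.

    Assumption (not from the paper): dM is continuous, increasing, dM(0) = 0.
    """
    x_hi = 1.0
    while dM(x_hi) < u:          # bracket the root of M'(x) = u
        x_hi *= 2.0
    x_lo = 0.0
    for _ in range(iters):       # bisection on M'(x) = u
        mid = 0.5 * (x_lo + x_hi)
        if dM(mid) < u:
            x_lo = mid
        else:
            x_hi = mid
    x = 0.5 * (x_lo + x_hi)
    return u * x - M(x)
```

For $M(x) = x^2/2$ this should reproduce the self-dual case $M^*(u) = u^2/2$.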
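We define $M_2$ on $[0, \infty)$ in a similar way by
\[ M_2(w) = \sqrt{w}\,x - M(x), \quad \text{where } x \text{ is such that } M'(x) = \sqrt{w}. \tag{3.41} \]
Then for $x$ satisfying (3.41),
\[ M_2'(w) = \frac{1}{2\sqrt{w}}\,x + \big(\sqrt{w} - M'(x)\big)\frac{dx}{dw} = \frac{x}{2M'(x)}, \tag{3.42} \]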
and
\[ M_2''(w) = \frac{M'(x) - xM''(x)}{2(M'(x))^2}\,\frac{dx}{dw}. \tag{3.43} \]
By (3.42), (3.43) and Lemma 3.2, $M_2$ is an Orlicz function such that $M_2'(w) > 0$, $M_2''(w) > 0$ for $w > 0$. By (3.34),
\[ M_2^*(y) = \max\{yw - M_2(w) : w > 0\}. \tag{3.44} \]
For each $y > 0$, the maximum of $yw - M_2(w)$ occurs at the unique point $w$ satisfying $y = M_2'(w)$. Hence we can treat $w$ as a function of $y$. But $x$ is a function of $w$ in (3.41). Therefore by (3.41) and (3.44),
\[ M_2^*(y) = -\frac{x}{2}M'(x) + M(x), \quad \text{where } x \text{ is such that } \frac{x}{2M'(x)} = y. \tag{3.45} \]
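As a sanity check of (3.45), one can compare the closed formula with a direct maximization of $yw - M_2(w)$ for a sample function. The sketch below uses the hypothetical choice $M(x) = \tfrac{2}{3}x^{3/2}$ (not the paper's $M_\Phi$), for which $M'(x) = \sqrt{x}$, so that $x = w$ in (3.41) and $x = 4y^2$ in (3.45).

```python
def M(x):
    # Sample function (an assumption for illustration): M(x) = (2/3) x^(3/2)
    return x ** 1.5 / 1.5

def dM(x):
    # M'(x) = x^(1/2)
    return x ** 0.5

def M2(w):
    x = w  # x solves M'(x) = sqrt(w) for this particular M
    return w ** 0.5 * x - M(x)

def M2_star_direct(y, iters=200):
    """max over w >= 0 of y*w - M2(w), by ternary search (concave objective)."""
    lo, hi = 0.0, 16.0 * y * y + 1.0   # the maximiser 4*y^2 lies in this interval
    for _ in range(iters):
        m1 = lo + (hi - lo) / 3.0
        m2 = hi - (hi - lo) / 3.0
        if y * m1 - M2(m1) < y * m2 - M2(m2):
            lo = m1
        else:
            hi = m2
    w = 0.5 * (lo + hi)
    return y * w - M2(w)

def M2_star_formula(y):
    x = 4.0 * y * y  # x solves x / (2 M'(x)) = y for this particular M
    return -0.5 * x * dM(x) + M(x)
```

Both routes should give $M_2^*(y) = \tfrac{4}{3}y^3$ for this sample $M$.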
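Our aim is to apply (3.45) and the duality expressed in (3.35) to prove (3.39). To this end, let $s \in \mathbb{N}$, and consider an $s$-hypercube $A_1 \times \cdots \times A_n \subset R^n$ such that $|F \cap (A_1 \times \cdots \times A_n)| = \Psi_F(s)$. By (3.33),
\begin{align*}
\|1_{F \cap (A_1 \times \cdots \times A_n)}\|_{M_2^*}
&= \inf\Big\{\rho > 0 : \sum_{w \in A_1 \times \cdots \times A_n} M_2^*\big(1_F(w)/\rho\big) \le 1\Big\} \\
&= \inf\big\{\rho > 0 : M_2^*(1/\rho)\,\Psi_F(s) \le 1\big\}. \tag{3.46}
\end{align*}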
Let $\rho_s > 0$ be such that
\[ M_2^*(1/\rho_s)\,\Psi_F(s) = 1. \tag{3.47} \]
Then $\rho_s = \|1_{F \cap (A_1 \times \cdots \times A_n)}\|_{M_2^*}$. Replacing $y$ by $1/\rho_s$ in (3.45), and then combining (3.45) with (3.47), we have the system of equations
\[ \frac{1}{\Psi_F(s)} = M(x) - \frac{x}{2}M'(x) \tag{3.48} \]
and
\[ \rho_s = \frac{2M'(x)}{x}. \tag{3.49} \]
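We want to estimate $\rho_s$ using Eqs. (3.48) and (3.49). To this end, we first estimate $x$ as a solution to Eq. (3.48). By (3.48),
\begin{align*}
M(x) &\ge M(x) - \frac{x}{2}M'(x) = \frac{1}{\Psi_F(s)} \ge \frac{1}{d_F(\Phi)s^{\alpha}\phi(s)} && \text{(by (2.1))} \\
&= \frac{1}{d_F(\Phi)s^{\alpha}\phi(s)}\Big(\phi(s)\,\theta\big(1/\Theta(s)\big)\Big)^{\frac{1}{\alpha+1}} && \text{(by (3.4))} \\
&= \frac{1}{d_F(\Phi)}\Big(s^{-\frac{\alpha+1}{2}}\phi(s)^{-\frac{1}{2}}\Big)^{\frac{2\alpha}{\alpha+1}}\Big(\theta\big(1/\Theta(s)\big)\Big)^{\frac{1}{\alpha+1}} \\
&= \frac{1}{d_F(\Phi)}\big(1/\Theta(s)\big)^{\frac{2\alpha}{\alpha+1}}\Big(\theta\big(1/\Theta(s)\big)\Big)^{\frac{1}{\alpha+1}} && \text{(by (3.2))} \\
&= \frac{1}{d_F(\Phi)}M\big(1/\Theta(s)\big) && \text{(by (3.5))}. \tag{3.50}
\end{align*}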
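If $d_F(\Phi) \ge 1$, then
\begin{align*}
M\Big(d_F(\Phi)^{\frac{\alpha+1}{2\alpha}}x\Big)
&= \Big(d_F(\Phi)^{\frac{\alpha+1}{2\alpha}}x\Big)^{\frac{2\alpha}{\alpha+1}}\,\theta\Big(d_F(\Phi)^{\frac{\alpha+1}{2\alpha}}x\Big)^{\frac{1}{\alpha+1}} && \text{(by (3.5))} \\
&\ge d_F(\Phi)\,x^{\frac{2\alpha}{\alpha+1}}\,\theta(x)^{\frac{1}{\alpha+1}} && \text{(because $\theta$ is increasing)} \\
&= d_F(\Phi)M(x) && \text{(by (3.5))} \\
&\ge M\big(1/\Theta(s)\big) && \text{(by (3.50))}. \tag{3.51}
\end{align*}
Because $M$ is increasing, the comparison of both sides of (3.51) implies
\[ x \ge d_F(\Phi)^{-\frac{\alpha+1}{2\alpha}}/\Theta(s). \tag{3.52} \]
If $d_F(\Phi) < 1$, then by (3.50),
\[ M(x) \ge M\big(1/\Theta(s)\big). \tag{3.53} \]
Hence
\[ x \ge 1/\Theta(s). \tag{3.54} \]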
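For simplicity, let $\tilde d_F(\Phi) = \max\{d_F(\Phi), 1\}$. By (3.52), (3.54),
\[ x \ge \tilde d_F(\Phi)^{-\frac{\alpha+1}{2\alpha}}/\Theta(s), \tag{3.55} \]
which is the estimate that we need. Now we estimate $\rho_s$. By (3.13), we have $0 \le \frac{\theta'(x)}{\theta(x)}x \le 1$ for all $x \ge 0$. Then by (3.6) and (3.49),
\[ \rho_s = \frac{2}{\alpha+1}\big(x^{-2}\theta(x)\big)^{\frac{1}{\alpha+1}}\Big(2\alpha + \frac{\theta'(x)}{\theta(x)}x\Big) \le 4\big(x^{-2}\theta(x)\big)^{\frac{1}{\alpha+1}}. \tag{3.56} \]
Because $0 \le \frac{\theta'(x)}{\theta(x)}x \le 1$ for all $x \ge 0$, $(x^{-2}\theta(x))^{\frac{1}{\alpha+1}}$ is decreasing with increasing $x$. Then by (3.55) and (3.56),
\begin{align*}
\rho_s &\le 4\Big(\tilde d_F(\Phi)^{-\frac{\alpha+1}{2\alpha}}/\Theta(s)\Big)^{-\frac{2}{\alpha+1}}\,\theta\Big(\tilde d_F(\Phi)^{-\frac{\alpha+1}{2\alpha}}/\Theta(s)\Big)^{\frac{1}{\alpha+1}} \\
&\le 4\,\tilde d_F(\Phi)^{\frac{1}{\alpha}}\,s\,\phi(s)^{\frac{1}{\alpha+1}}\,\theta\big(1/\Theta(s)\big)^{\frac{1}{\alpha+1}} && \text{(by (3.2) and because $\tilde d_F(\Phi) \ge 1$)} \\
&= 4\,\tilde d_F(\Phi)^{\frac{1}{\alpha}}\,s && \text{(by (3.4))}, \tag{3.57}
\end{align*}
which is the estimate we need.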
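Let $h$ be a function with support in $F$ such that
\[ \sum_{w \in A_1 \times \cdots \times A_n} M^*\big(|h(w)|\big) \le 1. \tag{3.58} \]
By (3.40) and (3.41), for all $w \in A_1 \times \cdots \times A_n$,
\[ M^*\big(|h(w)|\big) = M_2\big(|h(w)|^2\big). \tag{3.59} \]
Hence
\[ \sum_{w \in A_1 \times \cdots \times A_n} M_2\big(|h(w)|^2\big) \le 1. \tag{3.60} \]
Then,
\begin{align*}
\sum_{w \in A_1 \times \cdots \times A_n} |h(w)|^2
&= \sum_{w \in A_1 \times \cdots \times A_n} |h(w)|^2\,1_F(w) \\
&\le |||1_{F \cap (A_1 \times \cdots \times A_n)}|||_{M_2^*} && \text{(by (3.60) and the duality in (3.35))} \\
&\le 2\,\|1_{F \cap (A_1 \times \cdots \times A_n)}\|_{M_2^*} && \text{(by (3.36))} \\
&\le 8\,\tilde d_F(\Phi)^{\frac{1}{\alpha}}\,s && \text{(by (3.57))}. \tag{3.61}
\end{align*}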
Let $\pi_i$ be the canonical projection from $R^n$ onto $R$. By [2], Lemma XIII.21, there exists a cover $\{G_1, \ldots, G_n\}$ of $A_1 \times \cdots \times A_n$ such that for every $i = 1, \ldots, n$,
\[ \max_{r \in A_i} \sum_{w \in \pi_i^{-1}[r]} |h(w)|^2\,1_{G_i}(w) \le 8\,\tilde d_F(\Phi)^{\frac{1}{\alpha}}. \tag{3.62} \]
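Suppose $f$ is an $R^n$-polynomial in $C(\Omega^n)$ with spectrum in $A_1 \times \cdots \times A_n$. (We identify $(r_{j_1}, \ldots, r_{j_n}) \in R^n$ with the character $w = r_{j_1} \otimes \cdots \otimes r_{j_n}$ on $\Omega^n$.) By the Cauchy–Schwarz inequality, (3.62) and [2], Lemma XIII.22, we obtain for $i \in [n]$,
\begin{align*}
\sum_{w \in A_1 \times \cdots \times A_n} \big|\hat f(w)h(w)1_{G_i}(w)\big|
&\le \sum_{r \in A_i}\ \sum_{w \in \pi_i^{-1}[r]} \big|\hat f(w)h(w)1_{G_i}(w)\big| \\
&\le \sum_{r \in A_i} \Big(\sum_{w \in \pi_i^{-1}[r]} |h(w)|^2\,1_{G_i}(w)\Big)^{\frac12} \cdot \Big(\sum_{w \in \pi_i^{-1}[r]} |\hat f(w)|^2\Big)^{\frac12} \\
&\le \max_{r \in A_i} \Big(\sum_{w \in \pi_i^{-1}[r]} |h(w)|^2\,1_{G_i}(w)\Big)^{\frac12} \cdot \sum_{r \in A_i} \Big(\sum_{w \in \pi_i^{-1}[r]} |\hat f(w)|^2\Big)^{\frac12} \\
&\le 2\sqrt{2}\,\tilde d_F(\Phi)^{\frac{1}{2\alpha}}\,\zeta_R(1)\,2^{\frac{n-1}{2}}\,\|f\|_\infty, \tag{3.63}
\end{align*}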
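where $\zeta_R(1) = \sup\{\|\hat f\|_1 : f \in B_{C_R(\Omega)}\} \le 2$. Therefore,
\begin{align*}
\sum_{w \in A_1 \times \cdots \times A_n} \big|\hat f(w)h(w)\big|
&\le \sum_{i=1}^{n}\ \sum_{w \in A_1 \times \cdots \times A_n} \big|\hat f(w)h(w)1_{G_i}(w)\big| \\
&\le 2\sqrt{2}\,n\,\tilde d_F(\Phi)^{\frac{1}{2\alpha}}\,\zeta_R(1)\,2^{\frac{n-1}{2}}\,\|f\|_\infty. \tag{3.64}
\end{align*}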
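By (3.58), (3.64) and the duality in (3.35),
\begin{align*}
|||\hat f|||_M &= \sup\Big\{\Big|\sum_{w \in A_1 \times \cdots \times A_n} \hat f(w)h(w)\Big| : \sum_{w \in A_1 \times \cdots \times A_n} M^*\big(|h(w)|\big) \le 1\Big\} \\
&\le 2\sqrt{2}\,n\,\tilde d_F(\Phi)^{\frac{1}{2\alpha}}\,\zeta_R(1)\,2^{\frac{n-1}{2}}\,\|f\|_\infty. \tag{3.65}
\end{align*}
Then by (3.36),
\[ \|\hat f\|_M \le 2\sqrt{2}\,n\,\tilde d_F(\Phi)^{\frac{1}{2\alpha}}\,\zeta_R(1)\,2^{\frac{n-1}{2}}\,\|f\|_\infty, \tag{3.66} \]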
which implies (3.39) with $D_n = 2\sqrt{2}\,n\,\zeta_R(1)\,2^{\frac{n-1}{2}}$. Next suppose that $\Phi(x) = x$ for all $x \ge 0$. (Recall we excluded this case in the beginning of our proof.) Then $M_\Phi(x) = M(x) = x$, $x \ge 0$, and $\|\cdot\|_M = \|\cdot\|_{\ell^1(R^n)}$. Let $h$ be in the unit ball of $\ell^\infty(R^n)$ with support in $F$. Then
\[ \sum_{w \in A_1 \times \cdots \times A_n} |h(w)|^2 \le \big|F \cap (A_1 \times \cdots \times A_n)\big| \le d_F(\Phi)\,s, \tag{3.67} \]
which corresponds to (3.61). Following the steps from (3.62) to (3.66), we have
\[ \|\hat f\|_M \le n\,d_F(\Phi)^{\frac12}\,\zeta_R(1)\,2^{\frac{n-1}{2}}\,\|f\|_\infty, \tag{3.68} \]
which implies (3.39) in this case. Now we prove the left side inequality of (3.39). For $s \in \mathbb{N}$, let $A_1 \times \cdots \times A_n$ be an $s$-hypercube in $R^n$ such that $|F \cap (A_1 \times \cdots \times A_n)| = \Psi_F(s)$. Identify $(r_{j_1}, \ldots, r_{j_n}) \in R^n$ with the character $w = r_{j_1} \otimes \cdots \otimes r_{j_n}$ on $\Omega^n$. By the Kahane–Salem–Zygmund probabilistic estimates ([2], Theorem X.8), there exists a $\{-1, +1\}$-valued $n$-array $\{\epsilon_w : \epsilon_w = \pm 1,\ w \in F \cap (A_1 \times \cdots \times A_n)\}$ such that if
\[ f_s = \frac{1}{s^{\frac12}(\Psi_F(s))^{\frac12}} \sum_{w \in F \cap (A_1 \times \cdots \times A_n)} \epsilon_w\,w, \tag{3.69} \]
then
\[ \|f_s\|_\infty \le C\,\|f_s\|_2\,\big(\log 2^{ns}\big)^{\frac12} = C\,s^{-\frac12}(ns)^{\frac12}(\log 2)^{\frac12} = C\,n^{\frac12}(\log 2)^{\frac12}, \tag{3.70} \]
where $C > 0$ is a constant. By (3.33),
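\begin{align*}
\|\hat f_s\|_{M_\Phi}
&= \inf\Big\{\rho > 0 : \sum_{w \in F \cap (A_1 \times \cdots \times A_n)} M_\Phi\big(|\hat f_s(w)|/\rho\big) \le 1\Big\} \\
&= \inf\Big\{\rho > 0 : \sum_{w \in F \cap (A_1 \times \cdots \times A_n)} M_\Phi\big(s^{-\frac12}(\Psi_F(s))^{-\frac12}/\rho\big) \le 1\Big\} \\
&= \inf\Big\{\rho > 0 : M_\Phi\big(s^{-\frac12}(\Psi_F(s))^{-\frac12}\rho^{-1}\big)\,\Psi_F(s) \le 1\Big\}. \tag{3.71}
\end{align*}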
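For each $s \in \mathbb{N}$, let $\rho_s > 0$ be such that
\[ M_\Phi\big(s^{-\frac12}(\Psi_F(s))^{-\frac12}\rho_s^{-1}\big)\,\Psi_F(s) = 1. \tag{3.72} \]
Then $\rho_s = \|\hat f_s\|_{M_\Phi}$. By the definition of $d_F(\Phi)$ in (2.1) and because $\Phi$ is an $\alpha$-Orlicz function, we have
\[ \Psi_F(s) \le d_F(\Phi)\Phi(s) = d_F(\Phi)\,s^{\alpha}\phi(s). \tag{3.73} \]
By the definition of $M_\Phi$ in (3.5) and by (3.72),
\begin{align*}
1 &= \big(s^{-\frac12}(\Psi_F(s))^{-\frac12}\rho_s^{-1}\big)^{\frac{2\alpha}{\alpha+1}}\,\theta\big(s^{-\frac12}(\Psi_F(s))^{-\frac12}\rho_s^{-1}\big)^{\frac{1}{\alpha+1}}\,\Psi_F(s) \\
&\ge \rho_s^{-\frac{2\alpha}{\alpha+1}}\,(\Psi_F(s))^{\frac{1}{\alpha+1}}\,s^{-\frac{\alpha}{\alpha+1}}\,\theta\big(s^{-\frac{\alpha+1}{2}}\,d_F(\Phi)^{-\frac12}\,\phi(s)^{-\frac12}\,\rho_s^{-1}\big)^{\frac{1}{\alpha+1}} \tag{3.74}
\end{align*}
(by (3.73), and because $\theta$ is increasing). By (3.74) and (3.2),
\[ \rho_s^{2\alpha} \ge \Psi_F(s)\,s^{-\alpha}\,\theta\big(d_F(\Phi)^{-\frac12}\rho_s^{-1}/\Theta(s)\big). \tag{3.75} \]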
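Let
\[ c = d_F(\Phi)^{-1/2}\,\rho_s^{-1}. \tag{3.76} \]
By (3.75),
\[ \rho_s^{2\alpha} \ge \Psi_F(s)\,s^{-\alpha}\,\theta\big(c/\Theta(s)\big). \tag{3.77} \]
If $c > 1$, by (3.77) and (3.4),
\[ \rho_s^{2\alpha} \ge \Psi_F(s)\,s^{-\alpha}\,\theta\big(1/\Theta(s)\big) = \Psi_F(s)\,s^{-\alpha}\,(\phi(s))^{-1} = \frac{\Psi_F(s)}{\Phi(s)}. \tag{3.78} \]
Then (by taking supremum)
\[ \sup\big\{\|\hat f_s\|_{M_\Phi} : s \in \mathbb{N}\big\} = \sup\{\rho_s : s \in \mathbb{N}\} \ge \big(\sup\{\Psi_F(s)/\Phi(s) : s \in \mathbb{N}\}\big)^{\frac{1}{2\alpha}} = d_F(\Phi)^{\frac{1}{2\alpha}}. \tag{3.79} \]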
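If $c \le 1$, by (3.2) and because $\phi$ is increasing,
\[ c/\Theta(s) = c\,s^{-\frac{\alpha+1}{2}}\phi(s)^{-\frac12} \ge \big(c^{-\frac{2}{\alpha+1}}s\big)^{-\frac{\alpha+1}{2}}\,\phi\big(c^{-\frac{2}{\alpha+1}}s\big)^{-\frac12} = 1/\Theta\big(c^{-\frac{2}{\alpha+1}}s\big). \tag{3.80} \]
Then
\[ \theta\big(c/\Theta(s)\big) \ge \theta\Big(1/\Theta\big(c^{-\frac{2}{\alpha+1}}s\big)\Big) = \Big(\phi\big(c^{-\frac{2}{\alpha+1}}s\big)\Big)^{-1} \quad \text{(by (3.4))}. \tag{3.81} \]
By (3.77) and (3.81),
\[ \rho_s^{2\alpha} \ge \Psi_F(s)\,s^{-\alpha}\,\Big(\phi\big(c^{-\frac{2}{\alpha+1}}s\big)\Big)^{-1}. \tag{3.82} \]
Because $\phi$ is concave and $c < 1$,
\[ \frac{\phi\big(c^{-\frac{2}{\alpha+1}}s\big) - \phi(0)}{c^{-\frac{2}{\alpha+1}}s} \le \frac{\phi(s) - \phi(0)}{s}. \tag{3.83} \]
Because $\phi(0) \ge 0$,
\[ \frac{\phi\big(c^{-\frac{2}{\alpha+1}}s\big)}{c^{-\frac{2}{\alpha+1}}} \le \phi(s) - \phi(0) + \frac{\phi(0)}{c^{-\frac{2}{\alpha+1}}} \le \phi(s). \tag{3.84} \]
By (3.82), (3.84) and (3.76),
\[ \rho_s^{2\alpha} \ge \Psi_F(s)\,s^{-\alpha}\,(\phi(s))^{-1}\,c^{\frac{2}{\alpha+1}} = \frac{\Psi_F(s)}{\Phi(s)}\,d_F(\Phi)^{-\frac{1}{\alpha+1}}\,\rho_s^{-\frac{2}{\alpha+1}}. \tag{3.85} \]
Hence
\[ \rho_s^{\frac{2(\alpha^2+\alpha+1)}{\alpha+1}} \ge \frac{\Psi_F(s)}{\Phi(s)}\,d_F(\Phi)^{-\frac{1}{\alpha+1}}. \tag{3.86} \]
Then (by taking supremum)
\[ \sup\big\{\|\hat f_s\|_{M_\Phi} : s \in \mathbb{N}\big\} = \sup\{\rho_s : s \in \mathbb{N}\} \ge d_F(\Phi)^{\left(1 - \frac{1}{\alpha+1}\right)\left(\frac{\alpha+1}{2(\alpha^2+\alpha+1)}\right)} = d_F(\Phi)^{\frac{\alpha}{2(\alpha^2+\alpha+1)}}. \tag{3.87} \]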
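The exponent bookkeeping between (3.85), (3.86) and (3.87) can be checked with exact rational arithmetic. The following sketch is only a verification aid; the function names are hypothetical and not from the paper.

```python
from fractions import Fraction

def exponents(alpha):
    """Return (power of rho_s in (3.86), resulting exponent of d_F in (3.87))."""
    a = Fraction(alpha)
    rho_power = 2 * a + Fraction(2) / (a + 1)      # 2*alpha + 2/(alpha + 1)
    d_exponent = (1 - 1 / (a + 1)) / rho_power     # (1 - 1/(alpha + 1)) / rho_power
    return rho_power, d_exponent

def delta_large(alpha):
    """delta(alpha) for d_F(Phi) > 1, as in (3.38)."""
    a = Fraction(alpha)
    return a / (2 * (a * a + a + 1))
```

For any $\alpha \ge 1$ the second component should equal $\delta(\alpha)$ from (3.38), and it never exceeds the exponent $1/(2\alpha)$ appearing on the right side of (3.39).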
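By (3.70), (3.79) and (3.87), we obtain the left side inequality of (3.39) with $C_n = \big(C\,n^{\frac12}(\log 2)^{\frac12}\big)^{-1}$. □

Corollary 3.5. For $n \in \mathbb{N}$, $F \subset R^n$, and $\alpha$-Orlicz function $\Phi$,
\[ \lim_{s\to\infty} \frac{\Psi_F(s)}{\Phi(s)} < \infty \iff \zeta_F(\Phi) < \infty. \tag{3.88} \]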
Remarks. (i) (A question.) We were unable to answer the following: on the left side inequality in (3.39), can $\delta(\alpha)$ be replaced by $1/(2\alpha)$?

(ii) (An example.) Let $\log^{(i)}$ denote the $i$-fold iteration of $\log$. Suppose $\Phi(x) = x^{\alpha}\phi(x)$ is an $\alpha$-Orlicz function such that for some $N > 0$,
\[ \phi(x) = \prod_{i=1}^{k} \big(\log^{(i)} x\big)^{\beta_i}, \quad x \ge N, \tag{3.89} \]
for $k \ge 1$ and $\beta_i \ge 0$ for $i = 1, \ldots, k$. We want to show that the Orlicz function $M_\Phi$ defined in (3.5) can be approximated in a neighborhood of 0 by
\[ M_{\alpha,\beta_1,\ldots,\beta_k}(x) = \Big(\frac{\alpha+1}{2}\Big)^{\frac{\beta_1}{\alpha+1}} x^{\frac{2\alpha}{\alpha+1}} \prod_{i=1}^{k} \Big(\log^{(i)}\frac{1}{x}\Big)^{-\frac{\beta_i}{\alpha+1}}, \tag{3.90} \]
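in the sense that $\lim_{x\to0} M_{\alpha,\beta_1,\ldots,\beta_k}(x)/M_\Phi(x) = 1$. By (3.5) and (3.90),
\begin{align*}
\lim_{x\to0} \frac{M_{\alpha,\beta_1,\ldots,\beta_k}(x)}{M_\Phi(x)}
&= \lim_{x\to0} \frac{\big(\frac{\alpha+1}{2}\big)^{\frac{\beta_1}{\alpha+1}}\,x^{\frac{2\alpha}{\alpha+1}}\,\prod_{i=1}^{k}\big(\log^{(i)}\frac1x\big)^{-\frac{\beta_i}{\alpha+1}}}{x^{\frac{2\alpha}{\alpha+1}}\,(\theta(x))^{\frac{1}{\alpha+1}}} \\
&= \Big(\frac{\alpha+1}{2}\Big)^{\frac{\beta_1}{\alpha+1}} \lim_{y\to\infty} \frac{\prod_{i=1}^{k}\big(\log^{(i)}\Theta(y)\big)^{-\frac{\beta_i}{\alpha+1}}}{\big(\theta(1/\Theta(y))\big)^{\frac{1}{\alpha+1}}} && \text{(by substituting $x = 1/\Theta(y)$)} \\
&= \Big(\frac{\alpha+1}{2}\Big)^{\frac{\beta_1}{\alpha+1}} \lim_{y\to\infty} \phi(y)^{\frac{1}{\alpha+1}} \prod_{i=1}^{k}\Big(\log^{(i)}\big(y^{\frac{\alpha+1}{2}}\phi(y)^{\frac12}\big)\Big)^{-\frac{\beta_i}{\alpha+1}} && \text{(by (3.2) and (3.4))} \\
&= \Big(\frac{\alpha+1}{2}\Big)^{\frac{\beta_1}{\alpha+1}} \lim_{y\to\infty} \prod_{i=1}^{k}\big(\log^{(i)} y\big)^{\frac{\beta_i}{\alpha+1}} \cdot \Big(\frac{\alpha+1}{2}\log y + \frac12\log\prod_{j=1}^{k}\big(\log^{(j)} y\big)^{\beta_j}\Big)^{-\frac{\beta_1}{\alpha+1}} \\
&\qquad \cdot \prod_{i=2}^{k}\Big(\log^{(i-1)}\Big(\frac{\alpha+1}{2}\log y + \frac12\log\prod_{j=1}^{k}\big(\log^{(j)} y\big)^{\beta_j}\Big)\Big)^{-\frac{\beta_i}{\alpha+1}} && \text{(by (3.89))} \\
&= \lim_{y\to\infty} \frac{\prod_{i=2}^{k}\big(\log^{(i)} y\big)^{\frac{\beta_i}{\alpha+1}}}{\prod_{i=2}^{k}\Big\{\log^{(i-1)}\Big(\frac{\alpha+1}{2}\log y + \frac12\log\big(\prod_{j=1}^{k}(\log^{(j)} y)^{\beta_j}\big)\Big)\Big\}^{\frac{\beta_i}{\alpha+1}}} \\
&= 1 && \text{(by L'Hôpital's rule)}, \tag{3.91}
\end{align*}
as desired.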
4. Relations between combinatorial structures and $L^p$ norms

Definition 4.1. (Cf. [2], VII.9, remark (ii).) For $n \in \mathbb{N}$, $F \subset R^n$, and $\alpha$-Orlicz function $\Phi$, let
\[ \eta_F(\Phi) = \sup\big\{\|f\|_{L^p}/\Phi(p) : p > 2,\ f \in B_{L^2_F(\Omega^n)}\big\}. \tag{4.1} \]
This definition extends the definition in (1.9). Our aim is to establish a link between $\eta_F(\Phi)$ and $d_F(\Phi)$, where $F \subset R^n$. To this end, we first analyze analogous measurements in the context of $(\mathbb{T}^{\mathbb{N}})^n$, where $\mathbb{T} = \{e^{2\pi i t} : t \in [0, 1]\}$. We let $S = \{\beta_j : j \in \mathbb{N}\}$ be the set of the canonical projections from $\mathbb{T}^{\mathbb{N}}$ onto $\mathbb{T}$:
\[ \beta_j(t) = t(j), \quad t = \big(t(j) : j \in \mathbb{N}\big) \in \mathbb{T}^{\mathbb{N}}. \tag{4.2} \]
We refer to $S$ as the Steinhaus system, and view it as an independent set of characters on the compact Abelian group $\mathbb{T}^{\mathbb{N}}$ with the normalized Haar measure $P$. For $F \subset S^n$ and $\alpha$-Orlicz function $\Phi$, the definition of $\eta_F(\Phi)$ is the same as in (4.1). (Replace $\Omega^n$ by $(\mathbb{T}^{\mathbb{N}})^n$, and replace $\mathbb{P}$ by $P$.)

Lemma 4.2. (Cf. [2], Theorem XIII.27.) For $n \in \mathbb{N}$, $F \subset S^n$, and $\alpha$-Orlicz function $\Phi$,
\[ 16^{-n}\,d_F(\Phi)^{\frac12} \le \eta_F\big(\Phi^{\frac12}\big) \le d_F(\Phi)^{\frac12}. \tag{4.3} \]
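Proof. By [2], Lemma XII.6 and Lemma XIII.26, for all $f \in L^2_F((\mathbb{T}^{\mathbb{N}})^n)$,
\[ \|f\|_{L^{2s}} \le \big(\Psi_F(s)\big)^{\frac12}\,\|f\|_{L^2}, \quad s \in \mathbb{N}. \tag{4.4} \]
Because $\Psi_F(s) \le d_F(\Phi)\Phi(s)$ for all $s \in \mathbb{N}$,
\[ \|f\|_{L^{2s}} \le d_F(\Phi)^{\frac12}\,\Phi(s)^{\frac12}\,\|f\|_{L^2}, \quad s \in \mathbb{N}. \tag{4.5} \]
Let $\lambda = \frac{2s^2 + 2s - ps}{p}$. By Hölder's inequality, for $2s < p \le 2s + 2$,
\[ \|f\|_{L^p} \le \|f\|_{L^{2s}}^{\lambda}\,\|f\|_{L^{2s+2}}^{1-\lambda}. \tag{4.6} \]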
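The choice of $\lambda$ in (4.6) is the one for which $1/p$ is the convex combination $\lambda/(2s) + (1-\lambda)/(2s+2)$, which is what the interpolation form of Hölder's inequality requires. A small exact-arithmetic check, with hypothetical helper names, might look like this:

```python
from fractions import Fraction

def holder_lambda(s, p):
    """lambda = (2 s^2 + 2 s - p s) / p, intended for 2s < p <= 2s + 2."""
    s, p = Fraction(s), Fraction(p)
    return (2 * s * s + 2 * s - p * s) / p

def interpolated_reciprocal(s, p):
    """lambda/(2s) + (1 - lambda)/(2s + 2); this should equal 1/p."""
    lam = holder_lambda(s, p)
    return lam / (2 * s) + (1 - lam) / (2 * s + 2)
```

For instance, $s = 3$ and $p = 7$ give $\lambda = 3/7$, and the interpolated reciprocal equals $1/7$; at the endpoint $p = 2s + 2$ the weight $\lambda$ vanishes.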
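Then by (4.5) and (4.6),
\begin{align*}
\|f\|_{L^p} &\le \Big(d_F(\Phi)^{\frac12}\,\Phi(s)^{\frac12}\,\|f\|_{L^2}\Big)^{\lambda}\Big(d_F(\Phi)^{\frac12}\,\Phi(s+1)^{\frac12}\,\|f\|_{L^2}\Big)^{1-\lambda} \\
&= d_F(\Phi)^{\frac12}\Big(\Phi(s)^{\lambda}\,\Phi(s+1)^{1-\lambda}\Big)^{\frac12}\,\|f\|_{L^2}. \tag{4.7}
\end{align*}
Because $p > 2s \ge s + 1$ and $\Phi$ is increasing,
\[ \Phi(s)^{\lambda}\,\Phi(s+1)^{1-\lambda} \le \Phi(p). \tag{4.8} \]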
Therefore,
\[ \|f\|_{L^p} \le d_F(\Phi)^{\frac12}\,\Phi(p)^{\frac12}\,\|f\|_{L^2}. \tag{4.9} \]
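To verify the left side inequality of (4.3), let $s \in \mathbb{N}$ and let $A_1 \times \cdots \times A_n$ be an $s$-hypercube in $S^n$ such that $|F \cap (A_1 \times \cdots \times A_n)| = \Psi_F(s)$. Consider the Riesz product
\[ H_s = \prod_{\beta \in A_1}\Big(1 + \frac{\beta + \bar\beta}{2}\Big) \otimes \cdots \otimes \prod_{\beta \in A_n}\Big(1 + \frac{\beta + \bar\beta}{2}\Big). \tag{4.10} \]
Then $\|H_s\|_{L^1} = 1$ and $\|H_s\|_{L^2} = 2^{ns/2}$. Hence for $1 \le p \le 2$,
\[ \|H_s\|_{L^p} \le \|H_s\|_{L^1}^{1-\frac{2}{q}}\,\|H_s\|_{L^2}^{\frac{2}{q}} = 2^{\frac{ns}{q}}, \quad 1/p + 1/q = 1. \tag{4.11} \]
Let
\[ h_s = \sum_{(\beta_1, \ldots, \beta_n) \in F \cap (A_1 \times \cdots \times A_n)} \beta_1 \otimes \cdots \otimes \beta_n. \tag{4.12} \]
Let $E$ (expectation) denote integration with respect to Haar measure, either on $\Omega$ or on $\mathbb{T}^{\mathbb{N}}$. Let $E^n$ denote the $n$-fold iteration of $E$. By Hölder's inequality and (4.11) with $q = s$,
\[ E^n H_s h_s \le \|H_s\|_{L^p}\,\|h_s\|_{L^q} \le 2^{\frac{ns}{s}}\,\|h_s\|_{L^s} \le 2^{n}\,\eta_F(\Phi)\,\Phi(s)\,\|h_s\|_{L^2}. \tag{4.13} \]
Because
\[ E^n H_s h_s = 2^{-n}\,\Psi_F(s), \tag{4.14} \]
and
\[ \|h_s\|_{L^2} = \big(\Psi_F(s)\big)^{\frac12}, \tag{4.15} \]
we obtain
\[ 4^{-n}\big(\Psi_F(s)\big)^{\frac12} \le \eta_F(\Phi)\,\Phi(s), \tag{4.16} \]
which implies the left side of (4.3). □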
Corollary 4.3. (Cf. [2], Corollary XIII.28.) For $n \in \mathbb{N}$, $F \subset R^n$, and $\alpha$-Orlicz function $\Phi$,
\[ 16^{-n}\,d_F(\Phi)^{\frac12} \le \eta_F\big(\Phi^{\frac12}\big) \le 4^{n}\,d_F(\Phi)^{\frac12}. \tag{4.17} \]
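Proof. For each $j \in \mathbb{N}$, let $r_j$ be the Rademacher function in $R$ such that
\[ r_j(\omega) = \omega(j), \quad \omega \in \Omega = \{-1, 1\}^{\mathbb{N}}, \tag{4.18} \]
and let $\beta_j$ be the Steinhaus function in $S$ such that
\[ \beta_j(t) = t(j), \quad t \in \mathbb{T}^{\mathbb{N}}. \tag{4.19} \]
Let $f$ be an $F$-polynomial (i.e., $\operatorname{spect} f = \operatorname{support} \hat f \subset F$, and $\operatorname{spect} f$ is finite). Define for $t = (t_1, \ldots, t_n) \in (\mathbb{T}^{\mathbb{N}})^n$,
\[ f_t = \sum_{(r_{j_1}, \ldots, r_{j_n}) \in F} \hat f(r_{j_1} \otimes \cdots \otimes r_{j_n})\,\beta_{j_1}(t_1) \cdots \beta_{j_n}(t_n)\,r_{j_1} \otimes \cdots \otimes r_{j_n}. \tag{4.20} \]
For $t \in (\mathbb{T}^{\mathbb{N}})^n$, there exists $\theta_t \in L^1(\Omega^n)$ such that
\[ \hat\theta_t(r_{j_1} \otimes \cdots \otimes r_{j_n}) = \beta_{j_1}(t_1) \cdots \beta_{j_n}(t_n), \quad r_{j_1} \otimes \cdots \otimes r_{j_n} \in \operatorname{spect} f, \tag{4.21} \]
and
\[ \|\theta_t\|_{L^1} \le 4^{n}. \tag{4.22} \]
(See [2], VII.8 and 12.) Then
\[ \|f\|_{L^q}^{q} = \|f_t * \theta_t\|_{L^q}^{q} \le 4^{nq}\,\|f_t\|_{L^q}^{q}, \tag{4.23} \]
where $*$ denotes convolution. Integrating both sides of (4.23) with respect to the Haar measure on $(\mathbb{T}^{\mathbb{N}})^n$, applying Fubini's Theorem, and then the right side of (4.3), we obtain
\[ \|f\|_{L^q} \le 4^{n}\,d_F(\Phi)^{\frac12}\,\Phi(q)^{\frac12}\,\|f\|_{L^2}, \tag{4.24} \]
which implies the right side of (4.17). The proof of the left side of (4.17) is a transcription of the proof of the left side of (4.3). □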
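5. Extensions of the Kahane–Khintchin inequality

Suppose $(\mathcal{A}, \mathbb{P})$ is a probability space. For any Orlicz function $\psi$, consider the Orlicz norm corresponding to $\psi$,
\[ \|X\|_\psi = \inf\big\{\rho > 0 : E\,\psi\big(|X|/\rho\big) \le 1\big\}, \quad X \in L^0(\mathcal{A}, \mathbb{P}). \tag{5.1} \]
The classical Kahane–Khintchin inequality states that if $\psi(x) = \exp(x^2) - 1$ for $x \ge 0$, then there exists $K > 0$ such that
\[ \|X\|_\psi \le K\,\|X\|_{L^2(\Omega, \mathbb{P})}, \quad X \in L^2_R(\Omega, \mathbb{P}). \tag{5.2} \]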
(E.g., see [10].) We will extend the inequality in (5.2) to $F \subset R^n$. Let $\Phi = x^{\alpha}\phi(x)$ be an $\alpha$-Orlicz function (as per Definition 2.1). Define
\[ f(x) = x^{\alpha}\big(\phi(x^2)\big)^{\frac12}, \quad x \ge 0, \tag{5.3} \]
and let
\[ g = f^{-1}. \tag{5.4} \]
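Since $g$ is defined only as an inverse function, it can help to see it computed. The sketch below is illustrative: `make_f` and `inverse` are hypothetical names, the inverse is found by bisection, and it assumes $f$ is continuous and increasing with $f(0) = 0$. With $\phi \equiv 1$ and $\alpha = 1$ one gets $g(x) = x$, so $\exp(g(x)^2) - 1$ reduces to the classical $\exp(x^2) - 1$ of (5.2).

```python
import math

def make_f(alpha, phi):
    """f(x) = x^alpha * phi(x^2)^(1/2), as in (5.3)."""
    return lambda x: x ** alpha * math.sqrt(phi(x * x))

def inverse(f, y, iters=200):
    """g(y) = f^{-1}(y) by bisection.

    Assumption (not from the paper): f is continuous, increasing, f(0) = 0.
    """
    hi = 1.0
    while f(hi) < y:         # bracket the solution of f(x) = y
        hi *= 2.0
    lo = 0.0
    for _ in range(iters):
        mid = 0.5 * (lo + hi)
        if f(mid) < y:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)
```

A logarithmic factor such as $\phi(t) = 1 + \log(1+t)$ is used in the checks purely as an illustration of a slowly growing $\phi$; whether it meets every requirement of Definition 2.1 is not verified here.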
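Lemma 5.1. (Cf. [2], Lemma X.18.) Suppose $(\mathcal{A}, \mathbb{P})$ is a probability space, and $X \in B_{L^2(\mathcal{A}, \mathbb{P})}$. Then the following are equivalent:

(i) there exists $0 < A < \infty$ such that
\[ \lim_{x\to\infty} \exp\big(A(g(x))^2\big)\,\mathbb{P}\big(|X| > x\big) < \infty; \tag{5.5} \]
(ii) there exists $0 < B < \infty$ such that
\[ \sup\big\{\|X\|_{L^p}/\big(p^{\frac{\alpha}{2}}(\phi(p))^{\frac12}\big) : p > 2\big\} \le B; \tag{5.6} \]
(iii) there exists $0 < C < \infty$ such that
\[ \lim_{t\to\infty} E\exp\big(t\,g(|X|) - Ct^2\big) < \infty; \tag{5.7} \]
(iv) there exists $0 < D < \infty$ such that
\[ E\exp\big(D(g(|X|))^2\big) < \infty. \tag{5.8} \]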
Proof. (i) ⇒ (ii). Suppose limx→∞ exp(A(g(x))2 )P(|X| > x) := B1 < ∞. For p > 2 sufficiently large, ∞ E|X| = p
P |X|p > x dx
0
p
αp 2
p φ(p) 2 + B1 αp p 2
∞
1 2 dx. exp −A g x p
(5.9)
p (φ(p)) 2
1
Let y = g(x p ). Then p p p x = g −1 (y) = f (y) = y αp φ y 2 2 . Hence
(5.10)
p 1 φ (y 2 ) 2 y dx/dy = αpy αp−1 φ y 2 2 1 + α φ(y 2 ) p 2αpy αp−1 φ y 2 2 by (3.1) . When x = p
αp 2
p
(φ(p)) 2 , we have y = αp 2
E|X| p p
(5.11)
√ p. Hence by (5.9) and (5.11),
p φ(p) 2 + B1
∞
√
p 2αpy αp−1 φ y 2 2 exp −Ay 2 dy.
(5.12)
p
By the Cauchy–Schwarz inequality, E|X| p p
αp 2
1 ∞ 2 p 2(αp−1) 2 2 φ(p) + 2B1 αp π/A A/π y exp −Ay dy
∞ · √
0
p A/π φ y 2 exp −Ay 2 dy
1 2
(5.13)
.
p
The first integral on the right side of (5.13) is the 2(αp − 1) moment of a Gaussian random variable with mean 0 and variance 1/2A. Hence there exists B2 > 0 such that ∞
p A/πy 2(αp−1) exp −Ay 2 dy B2 p αp−1 .
(5.14)
0
Next we estimate the second integral on the right side of (5.13). By property (iii) in Definition 2.1, √ φ(y 2 )/y is eventually decreasing. Because p is sufficiently large, for all y p, √ φ y 2 /y φ(p)/ p. (5.15) Then ∞ 2 p 1 2 dy A/π φ y exp −Ay (φ(p))p √ p
∞ =
p A/π φ y 2 /φ(p) exp −Ay 2 dy
√ p
∞
√
√ A/π (y/ p )p exp −Ay 2 dy
by (5.15)
p
1 p
p2
∞ p A/π y p exp −Ay 2 dy B3 , 0
(5.16)
for some B3 > 0 (by estimating p-th moments of Gaussian random variables). Then ∞ √
p p p A/π φ y 2 exp −Ay 2 dy B3 φ(p) .
(5.17)
p
By (5.13), (5.14) and (5.17), there exists B > 0 such that 1 α XLp Bp 2 φ(p) 2 .
(5.18)
(ii) ⇒ (iii). We assume B 1. For t > 0, ∞ k t k E exp tg |X| = E g |X| . k!
(5.19)
k=0
For each k 1, let 2 k fk (x) = x α φ x k 2 ,
x ∈ [0, ∞),
(5.20)
and let gk = fk−1 .
(5.21)
Then f1 = f and g1 = g. We will show that gk is increasing for k 1, and is concave for k 2. To this end, it suffices to show that fk is increasing for k 1, and is convex for k 2. By (5.20), fk (x) = x α−1
k φ (y) 2 α+ y 0, φ(y) φ(y)
(5.22)
where y = x 2/k . Hence gk = fk−1 is increasing for all k 1. By (5.22), k 2 kα(α − 1) k φ (y) fk (x) = x α−2 φ(y) 2 + kα − + 1 y k 2 2 φ(y) φ (y) 2 φ (y) 2 k + . y + −1 y φ(y) 2 φ(y)
(5.23)
Because Φ is an Orlicz function, for all x 0, φ (x) 2 φ (x) Φ (x) = x α φ(x) = x α−2 φ(x) α(α − 1) + 2α x+ x 0. φ(x) φ(x) For k 2, the expression inside the brackets of (5.23) is
(5.24)
k φ (y) φ (y) 2 φ (y) 2 kα(α − 1) k + kα − + 1 y+ y + −1 y 2 2 φ(y) φ(y) 2 φ(y) α(α − 1) + 2α 0
φ (y) 2 φ (y) y+ y φ(y) φ(y)
by (5.24) .
(5.25)
Hence fk 0 for k 2. Therefore gk = fk−1 is concave for k 2, as desired. By (5.3), (5.4), (5.20) and (5.21), k k g(x) = f −1 (x) = fk−1 x k = gk x k .
(5.26)
Then, by Jensen’s inequality, for k 2, k E g |X| = E gk |X|k gk E|X|k .
(5.27)
By assumption (ii) and because XL2 1, for k 2, E|X|k B k k
αk 2
k φ(k) 2 .
(5.28)
Because B 1, we have φ(B 2/α k) φ(k). Hence 2 k 2 αk 2 k E|X|k B α k 2 φ B α k 2 = fk B α k 2 .
(5.29)
By (5.27) and (5.29), for k 2, k 2 k 2 k k k E g(|X|) (gk ◦ fk ) B α k 2 = B α k 2 = B α k 2 .
(5.30)
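Next we estimate $E(g(|X|))$. By (5.3), $0 \le f(x) \le (\phi(1))^{1/2}$ for $0 \le x \le 1$. Then $0 \le g(x) \le 1$ for $0 \le x \le (\phi(1))^{1/2}$ (because $g = f^{-1}$). Also by (5.3), for $x \ge (\phi(1))^{1/2}$, we have
\begin{align*}
g(x) &\le x^{\frac{1}{\alpha}}\big(\phi(1)\big)^{-\frac{1}{2\alpha}} = \Big(x\big(\phi(1)\big)^{-\frac12}\Big)^{\frac{1}{\alpha}} \\
&\le \Big(x\big(\phi(1)\big)^{-\frac12}\Big)^{2} && \text{(because $x(\phi(1))^{-\frac12} \ge 1$)} \\
&= x^2\big(\phi(1)\big)^{-1}. \tag{5.31}
\end{align*}
Let $K = \max\{2(\phi(1))^{-1}, 2\}$. Then
\begin{align*}
E\,g(|X|) &= E\,g(|X|)\,1_{\{|X| \le (\phi(1))^{1/2}\}} + E\,g(|X|)\,1_{\{|X| > (\phi(1))^{1/2}\}} \\
&\le 1 + \big(\phi(1)\big)^{-1}E|X|^2 \le K && \text{(because $E|X|^2 \le 1$)}. \tag{5.32}
\end{align*}
Applying (5.30) and (5.32) to (5.19), we obtain for $t$ sufficiently large,
\begin{align*}
E\exp\big(t\,g(|X|)\big) &\le 1 + Kt + \sum_{k=2}^{\infty} \frac{t^k}{k!}\,B^{\frac{k}{\alpha}}\,k^{\frac{k}{2}} \\
&\le \exp(Ct^2) && \text{(because $k^{\frac{k}{2}}/k! < 2^k/(k/2)!$)} \tag{5.33}
\end{align*}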
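for some $C > 0$.

(iii) ⇒ (i). Because $g$ is increasing ($g = g_1$), for $x > 0$ and $t > 0$,
\[ \mathbb{P}\big(|X| > x\big) \le \mathbb{P}\big(g(|X|) > g(x)\big) \le \frac{E\exp(t\,g(|X|))}{\exp(t\,g(x))} \quad \text{(by Chebyshev's inequality)}. \tag{5.34} \]
Then by assumption (iii), for $t > 0$ sufficiently large,
\[ \mathbb{P}\big(|X| > x\big) \le \frac{\exp(Ct^2)}{\exp(t\,g(x))}. \tag{5.35} \]
Put $t = g(x)/2C$ in (5.34), and obtain (5.5) with $A = 1/4C$.

(i) ⇒ (iv). Suppose $\lim_{x\to\infty} \exp(A(g(x))^2)\,\mathbb{P}(|X| > x) := M_1 < \infty$, and let $M_2 > 0$ be sufficiently large so that $\mathbb{P}(|X| > x) \le M_1\exp(-A(g(x))^2)$ for $x \ge M_2$. Choose $0 < D < A$. Then
\begin{align*}
E\exp\big(D(g(|X|))^2\big) &= \int_0^{\infty} \mathbb{P}\Big(\exp\big(D(g(|X|))^2\big) > x\Big)\,dx \\
&\le M_2 + \int_{M_2}^{\infty} \mathbb{P}\Big(|X| > g^{-1}\big((\log x)^{\frac12}/D^{\frac12}\big)\Big)\,dx && \text{(because $g$ is increasing)} \\
&\le M_2 + M_1\int_{M_2}^{\infty} \exp\Big(-A\Big(g\big(g^{-1}\big((\log x)^{\frac12}/D^{\frac12}\big)\big)\Big)^2\Big)\,dx && \text{(by assumption (i))} \\
&= M_2 + M_1\int_{M_2}^{\infty} x^{-\frac{A}{D}}\,dx \le M_2 + M_1. \tag{5.36}
\end{align*}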
(iv) ⇒ (i). Because $g$ is increasing, for $x > 0$ sufficiently large,
\[ \mathbb{P}\big(|X| > x\big) \le \mathbb{P}\Big(D(g(|X|))^2 > D(g(x))^2\Big) \le \frac{E\exp(D(g(|X|))^2)}{\exp(D(g(x))^2)} \quad \text{(by Chebyshev's inequality)}, \tag{5.37} \]
which implies (5.5). □
Lemma 5.2. Let $f$ and $g$ be the functions defined in (5.3) and (5.4). Let
\[ h(x) = \exp\big((g(x))^2\big) - 1, \quad x \ge 0. \tag{5.38} \]
Then there exists $N > 0$ such that $h''(x) \ge 0$ for all $x \ge N$.

Proof.
\[ h'(x) = 2\exp\big((g(x))^2\big)\,g(x)\,g'(x), \tag{5.39} \]
and
\[ h''(x) = 2\exp\big((g(x))^2\big)\,I(x), \tag{5.40} \]
where
\[ I(x) = 2(g(x))^2(g'(x))^2 + (g'(x))^2 + g(x)\,g''(x). \tag{5.41} \]
Because $g = f^{-1}$, we have
\[ g'(f(x))\,f'(x) = 1, \quad x \ge 0. \tag{5.42} \]
Hence
\[ g''(f(x))\,(f'(x))^2 + g'(f(x))\,f''(x) = 0. \tag{5.43} \]
By (5.42) and (5.43),
\[ g''(f(x)) = -\frac{f''(x)}{(f'(x))^3}. \tag{5.44} \]
By (5.41) and (5.44),
\begin{align*}
I(f(x)) &= 2\big(g(f(x))\big)^2\big(g'(f(x))\big)^2 + \big(g'(f(x))\big)^2 + g(f(x))\,g''(f(x)) \\
&= 2x^2\big(g'(f(x))\big)^2 + \big(g'(f(x))\big)^2 + x\,g''(f(x)) && \text{(because $g = f^{-1}$)} \\
&= 2x^2\frac{1}{(f'(x))^2} + \frac{1}{(f'(x))^2} - x\frac{f''(x)}{(f'(x))^3} && \text{(by (5.42) and (5.44))} \\
&= \frac{1}{(f'(x))^2}\Big(2x^2 + 1 - \frac{f''(x)}{f'(x)}x\Big). \tag{5.45}
\end{align*}
By (5.22) with $k = 1$,
\[ f'(x) = x^{\alpha-1}\big(\phi(x^2)\big)^{\frac12}\Big(\alpha + \frac{\phi'(x^2)}{\phi(x^2)}x^2\Big) \ge \alpha\,x^{\alpha-1}\big(\phi(x^2)\big)^{\frac12}. \tag{5.46} \]
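By (5.23) with $k = 1$, and because $0 \le \frac{\phi'(x^2)}{\phi(x^2)}x^2 \le 1$ and $\phi'' \le 0$,
\begin{align*}
f''(x) &= x^{\alpha-2}\big(\phi(x^2)\big)^{\frac12}\Big(\alpha(\alpha-1) + (2\alpha+1)\frac{\phi'(x^2)}{\phi(x^2)}x^2 + 2\frac{\phi''(x^2)}{\phi(x^2)}x^4 - \Big(\frac{\phi'(x^2)}{\phi(x^2)}x^2\Big)^2\Big) \\
&\le x^{\alpha-2}\big(\phi(x^2)\big)^{\frac12}\big(\alpha(\alpha-1) + 2\alpha + 1\big). \tag{5.47}
\end{align*}
Hence
\[ \frac{f''(x)}{f'(x)}x \le \alpha + 1 + \frac{1}{\alpha} \le \alpha + 2. \tag{5.48} \]
By (5.45) and (5.48),
\[ I(f(x)) \ge \frac{1}{(f'(x))^2}\big(2x^2 - \alpha - 1\big). \tag{5.49} \]
Replacing $x$ by $g(x)$ in (5.49), we have
\[ I(x) \ge \frac{1}{(f'(g(x)))^2}\big(2(g(x))^2 - \alpha - 1\big). \tag{5.50} \]
Then for $x \ge g^{-1}\big(((\alpha+1)/2)^{1/2}\big)$, we have $I(x) \ge 0$, which implies $h''(x) \ge 0$. □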
Let $\psi_\Phi$ be an Orlicz function such that for some $N > 0$,
\[ \psi_\Phi(x) = \exp\big((g(x))^2\big) - 1, \quad x \ge N, \tag{5.51} \]
where $g$ is defined in (5.4).

Lemma 5.3. (Cf. [2], Remark X.9.i.) Suppose $(\mathcal{A}, \mathbb{P})$ is a probability space, and $X \in B_{L^2(\mathcal{A}, \mathbb{P})}$. Then the following are equivalent:

(i) there exists $0 < D < \infty$ such that
\[ E\exp\big(D(g(|X|))^2\big) < \infty; \tag{5.52} \]
(ii)
\[ \|X\|_{\psi_\Phi} < \infty. \tag{5.53} \]
Proof. (i) ⇒ (ii). Suppose 2 M, E exp D g |X|
(5.54)
for some M 1. Let β > 0 be such that β max{4M, D} and ψΦ (N/β) 12 . Then 1 EψΦ |X|/β 1{|X|
(5.55)
Because φ is concave, we have for c 1 and x > 0, φ(cx) − φ(0) φ(x) − φ(0) . cx x
(5.56)
φ(cx) φ(x) φ(0) φ(0) φ(x) − + . cx x x cx x
(5.57)
Then, because φ(0) 0,
Let − α+1 2 x , L(x) = g βD −1
x 0.
(5.58)
Then α+1 − α+1 2 x x = βD −1 2 βD −1 α+1 = βD −1 2 g −1 L(x) α+1 α 2 1 2 by (5.3) and (5.4) = βD −1 2 L(x) φ L(x) α α 2 1 2 = βD −1 2 L(x) βD −1 φ L(x) 1 2 1 α 2 by (5.57) and because βD −1 1 βD −1 2 L(x) φ βD −1 L(x) 1 by (5.3) and (5.4) . (5.59) = g −1 βD −1 2 L(x) By (5.58) and (5.59), 1 1 − α+1 2 x . g(x) βD −1 2 L(x) = βD −1 2 g βD −1
(5.60)
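Because $\beta \ge 2M$, we have
\[ (2M)^{-\frac12}D^{\frac12}\,g(x) \ge g\Big(\big(\beta D^{-1}\big)^{-\frac{\alpha+1}{2}}x\Big). \tag{5.61} \]
Then
\begin{align*}
E\exp\Big(\Big(g\Big(\big(\beta D^{-1}\big)^{-\frac{\alpha+1}{2}}|X|\Big)\Big)^2\Big)
&\le 1 + \sum_{k=1}^{\infty} \frac{1}{k!}\,E\Big((2M)^{-1}D\big(g(|X|)\big)^2\Big)^{k} \\
&\le 1 + \frac{1}{2M}\sum_{k=1}^{\infty} \frac{1}{k!}\,E\Big(D\big(g(|X|)\big)^2\Big)^{k} && \text{(because $M \ge 1$)} \\
&\le 1 + \frac{1}{2M}\,E\exp\Big(D\big(g(|X|)\big)^2\Big) \\
&\le \frac{3}{2} && \text{(by (5.54))}. \tag{5.62}
\end{align*}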
By the definition of ψΦ in (5.51), we have − α+1 −1 − α+1 2 |X| 1 2 |X| 2 1 EψΦ βD −1 {|X|N } = E exp g βD {|X|N } − 1 Let K = max{β, (βD −1 )
α+1 2
1 2
by (5.62) .
(5.63)
}. By (5.55) and (5.63), we have EψΦ |X|/K 1.
(5.64)
XψΦ K.
(5.65)
Therefore
(ii) ⇒ (i). If XψΦ K for some K > 0, then EψΦ |X|/K 1.
(5.66)
Hence by the definition of ψΦ in (5.51), 2 − 1 1{|X|N } 1. E exp g |X|/K
(5.67)
2 . M = max 4, 2E exp g(N/K)
(5.68)
2 M + 2 M. E exp g |X|/K 2
(5.69)
Let
By (5.67) and (5.68),
We may assume $K \ge 1$. By (5.3) and (5.4), for $x \ge 0$,
\[ x = f(g(x)) = (g(x))^{\alpha}\Big(\phi\big((g(x))^2\big)\Big)^{\frac12}. \tag{5.70} \]
Then
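\begin{align*}
f\big(g(x)/K^{\frac{1}{\alpha}}\big) &= \big(g(x)/K^{\frac{1}{\alpha}}\big)^{\alpha}\Big(\phi\big((g(x))^2/K^{\frac{2}{\alpha}}\big)\Big)^{\frac12} \\
&\le (1/K)\,(g(x))^{\alpha}\Big(\phi\big((g(x))^2\big)\Big)^{\frac12} \\
&= x/K && \text{(by (5.70))} \\
&= f\big(g(x/K)\big). \tag{5.71}
\end{align*}
Hence
\[ g(x)/K^{\frac{1}{\alpha}} \le g(x/K). \tag{5.72} \]
Let $D = 1/K^{2/\alpha}$. By (5.72) and (5.69), we obtain
\[ E\exp\big(D(g(|X|))^2\big) \le E\exp\big((g(|X|/K))^2\big) \le M, \tag{5.73} \]
as desired. □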
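The following is a link between the combinatorial structure of $F \subset R^n$ and tail probability estimates involving random variables in $L^2_F(\Omega^n, \mathbb{P}^n)$.

Theorem 5.4. For $n \in \mathbb{N}$, $F \subset R^n$, and $\alpha$-Orlicz function $\Phi$,
\[ d_F(\Phi) < \infty \iff \sup\big\{\|X\|_{\psi_\Phi} : X \in B_{L^2_F(\Omega^n)}\big\} < \infty. \tag{5.74} \]
Proof. Observe that statement (iv) in Lemma 5.1 is the same as statement (i) in Lemma 5.3. Then by Lemma 5.1 and Lemma 5.3,
\[ \sup\big\{\|X\|_{L^p}/\big(p^{\frac{\alpha}{2}}(\phi(p))^{\frac12}\big) : p > 2,\ X \in B_{L^2_F(\Omega^n)}\big\} < \infty \iff \sup\big\{\|X\|_{\psi_\Phi} : X \in B_{L^2_F(\Omega^n)}\big\} < \infty. \tag{5.75} \]
Because $\Phi^{\frac12}(p) = p^{\alpha/2}(\phi(p))^{\frac12}$,
\[ \eta_F\big(\Phi^{\frac12}\big) = \sup\big\{\|X\|_{L^p}/\big(p^{\frac{\alpha}{2}}(\phi(p))^{\frac12}\big) : p > 2,\ X \in B_{L^2_F(\Omega^n)}\big\} \quad \text{(Definition 4.1)}. \tag{5.76} \]
Hence
\[ \eta_F\big(\Phi^{\frac12}\big) < \infty \iff \sup\big\{\|X\|_{\psi_\Phi} : X \in B_{L^2_F(\Omega^n)}\big\} < \infty, \tag{5.77} \]
which, by Corollary 4.3, implies (5.74). □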
References

[1] R. Blei, Combinatorial dimension and certain norms in harmonic analysis, Amer. J. Math. 106 (1984) 847–887.
[2] R. Blei, Analysis in Integer and Fractional Dimensions, Cambridge University Press, Cambridge, 2001.
[3] R. Blei, T.W. Körner, Combinatorial dimension and random sets, Israel J. Math. 47 (1984) 65–74.
[4] R. Blei, Y. Peres, J.H. Schmerl, Fractional products of sets, Random Structures Algorithms 6 (1995) 113–119.
[5] A. Bonami, Étude des coefficients de Fourier des fonctions de Lp(G), Ann. Inst. Fourier (Grenoble) 20 (1970) 335–402.
[6] G. Johnson, G. Woodward, On p-Sidon sets, Indiana Univ. Math. J. 24 (1974) 161–167.
[7] A. Khintchin, Über dyadische Brüche, Math. Z. 18 (1923) 109–116.
[8] J. Lindenstrauss, L. Tzafriri, Classical Banach Spaces I, Springer-Verlag, Berlin, Heidelberg, New York, 1977.
[9] J.E. Littlewood, On bounded bilinear forms in an infinite number of variables, Quart. J. Math. Oxford 1 (1930) 164–174.
[10] G. Peškir, Best constants in Kahane–Khintchine inequalities for complex Steinhaus function, Proc. Amer. Math. Soc. 123 (10) (October 1995) 3101–3111.
Journal of Functional Analysis 257 (2009) 721–752 www.elsevier.com/locate/jfa
Local “superlinearity” and “sublinearity” for the p-Laplacian Djairo G. de Figueiredo a , Jean-Pierre Gossez b,∗ , Pedro Ubilla c a IMECC-UNICAMP, Caixa Postal 6065, 13081-970 Campinas, SP, Brazil b Département de Mathématique, C.P. 214, Université Libre de Bruxelles, 1050 Bruxelles, Belgium c Universidad de Santiago de Chile, Casilla 307, Correo 2, Santiago, Chile
Received 23 November 2008; accepted 1 April 2009 Available online 29 April 2009 Communicated by J. Coron
Abstract

We study the existence, nonexistence and multiplicity of positive solutions for a family of problems $-\Delta_p u = f_\lambda(x, u)$, $u \in W_0^{1,p}(\Omega)$, where $\Omega$ is a bounded domain in $\mathbb{R}^N$, $N > p$, and $\lambda > 0$ is a parameter. The family we consider includes the well-known nonlinearities of Ambrosetti–Brezis–Cerami type in a more general form, namely $\lambda a(x)u^q + b(x)u^r$, where $0 \le q < p - 1 < r \le p^* - 1$. Here the coefficient $a(x)$ is assumed to be nonnegative but $b(x)$ is allowed to change sign, even in the critical case. Preliminary results of independent interest include the extension to the p-Laplacian context of the Brezis–Nirenberg result on local minimization in $W_0^{1,p}$ and $C_0^1$, a $C^{1,\alpha}$ estimate for equations of the form $-\Delta_p u = h(x, u)$ with $h$ of critical growth, a strong comparison result for the p-Laplacian, and a variational approach to the method of upper–lower solutions for the p-Laplacian. © 2009 Elsevier Inc. All rights reserved.
Keywords: p-Laplacian; Concave-convex nonlinearities; Critical exponent; $C_0^1$ versus $W_0^{1,p}$ local minimization; Strong comparison principle; $C^{1,\alpha}$ estimate; Upper–lower solutions
* Corresponding author.
E-mail addresses: [email protected] (D.G. de Figueiredo), [email protected] (J.-P. Gossez), [email protected] (P. Ubilla). 0022-1236/$ – see front matter © 2009 Elsevier Inc. All rights reserved. doi:10.1016/j.jfa.2009.04.001
D.G. de Figueiredo et al. / Journal of Functional Analysis 257 (2009) 721–752
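1. Introduction

This paper is concerned with the existence, nonexistence and multiplicity of solutions for the family of problems
\[ \begin{cases} -\Delta_p u = f_\lambda(x, u) & \text{in } \Omega, \\ u > 0 & \text{in } \Omega, \\ u = 0 & \text{on } \partial\Omega, \end{cases} \tag{1.1} \]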
where $\Delta_p u := \operatorname{div}(|\nabla u|^{p-2}\nabla u)$ is the usual p-Laplacian, $\Omega$ is a bounded domain in $\mathbb{R}^N$, and $\lambda > 0$ is a real parameter. A basic feature of the family considered here is its monotone dependence on $\lambda$, i.e. $f_\lambda(x, s) \le f_{\lambda'}(x, s)$ if $\lambda < \lambda'$. In the context of (1.1) the conditions of local "sublinearity" at 0 and of local "superlinearity" at $\infty$ mean, roughly speaking, that for $x$ in a subdomain $\Omega_1$ of $\Omega$, one has
\[ \lim_{s\to0,\ s>0} f_\lambda(x, s)/s^{p-1} = +\infty, \]
while for $x$ in another subdomain $\Omega_2$ of $\Omega$, one has
\[ \lim_{s\to+\infty} f_\lambda(x, s)/s^{p-1} = +\infty \]
(see $(H_{\Omega_1})$ and $(H_{\Omega_2})$ in Section 2 for the precise statements). There are several motivations to our study of (1.1). The main one comes from the following example:
\[ \begin{cases} -\Delta_p u = \lambda a(x)u^q + b(x)u^r & \text{in } \Omega, \\ u > 0 & \text{in } \Omega, \\ u = 0 & \text{on } \partial\Omega, \end{cases} \tag{1.2} \]
where $0 < q < p - 1 < r$. Problem (1.2) was originally considered in [2] when $p = 2$ and $a(x) \equiv 1$, $b(x) \equiv 1$. It was in particular shown there that if $r \le 2^* - 1$, then there exists $0 < \Lambda < \infty$ such that (1.2) has at least two solutions for $\lambda < \Lambda$, at least one solution for $\lambda = \Lambda$, and no solution for $\lambda > \Lambda$. This result of "global multiplicity" was extended in [15] to the case $p = 2$ and variable coefficients $a(x)$, $b(x)$. It was also extended in [19] to the case $p \ne 2$ and $a(x) \equiv 1$, $b(x) \equiv 1$, although here under some restrictions on the exponents $p$, $q$ in the critical case $r = p^* - 1$. In this paper we consider the general case: $p \ne 2$ and variable coefficients $a(x)$, $b(x)$. As in [15], $a(x)$ is restricted to be $\ge 0$ and $b(x)$ is allowed to change sign, even in the critical case $r = p^* - 1$. For the existence of a second solution, we will however need here a stronger restriction on $a(x)$, namely $a(x) > 0$. This difference with respect to the semilinear case is connected with the use of a strong comparison principle for the p-Laplacian (see the comments after hypothesis (M) in Section 2). In the critical case $r = p^* - 1$ we will meet similar restrictions as in [19] on the exponents $p$, $q$ (see Remark 2.7). As observed on p. 454 of [7], critical problems become more delicate in the presence of variable coefficients. In this respect our basic assumption on $b(x)$ in (1.2) for $r = p^* - 1$ is of the same nature as that introduced in [15] when $p = 2$: $b(x)$ should be sufficiently close to $\|b\|_{L^\infty(\Omega)}$ on a small ball (cf. condition (b) in Theorems 2.5 and 2.6).
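To see why the model nonlinearity in (1.2) is only locally "sublinear" at 0 and "superlinear" at infinity, it helps to look at the quotient $f_\lambda(x, s)/s^{p-1}$ at a fixed point $x$. The sketch below is illustrative; the function name `ratio` and the sample exponents are assumptions, and `a`, `b` stand for the values $a(x)$, $b(x)$ at that point.

```python
def ratio(s, lam, a, b, p, q, r):
    """f_lambda(x, s) / s^(p-1) for f = lam*a*s^q + b*s^r at a fixed point x.

    Here a = a(x) and b = b(x) are the coefficient values at that point.
    """
    return (lam * a * s ** q + b * s ** r) / s ** (p - 1)
```

With $q < p - 1 < r$, the quotient blows up as $s \to 0$ wherever $a(x) > 0$ and as $s \to +\infty$ wherever $b(x) > 0$; at a point where $a(x) = 0$ the blow-up at 0 is lost, which is why the conditions are imposed only on subdomains.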
Our general results relative to (1.1) can also be applied to various situations rather different from (1.2). For instance the $\Delta_p$ version of example (1.3) from [15] can be handled, as we shall see in Section 6. There are several preliminary results in our study which have an independent interest. We mention in particular the extension to the p-Laplacian context of the well-known result of Brezis and Nirenberg [8] on local minimization in $C_0^1$ and $W_0^{1,p}$ (cf. Proposition 3.9). Several works have been devoted to this problem, e.g. [19,23,21,16]. Our approach differs from that in these papers and is more in the line of that introduced recently in [9] in the subcritical case. It avoids in particular the consideration of equations involving two p-Laplacians. An important step in the proof of this minimization result is Proposition 3.7 which provides a $C^{1,\alpha}$ estimate in the critical case. Another result of independent interest concerns the strong comparison principle for the p-Laplacian. This is known to be a delicate question, which has not received yet a complete answer (see [29] for a recent survey). The version we present here (cf. Proposition 3.4) is based on ideas from [4] and [17]. There is finally a variational approach to the method of upper–lower solutions for the p-Laplacian (cf. Proposition 3.1). It is adapted from [30] and could also prove useful in other situations. Our method to obtain multiple solutions to (1.1) follows the classical way of obtaining a first solution via upper–lower solutions and a second solution via the mountain pass theorem. To handle the (PS) condition in the critical case we use some of the techniques initially developed for $p = 2$ in [7,2] and later extended for $p \ne 2$ in [18]. Our results relative to (1.1) are stated in detail in Section 2 and their proofs given in Sections 4 and 5. Section 3 is devoted to various preliminaries, including those mentioned above.
Section 6 is devoted to some applications of the results of Section 2, in particular to (1.2).

2. Statement of results

In this section we state our results relative to (1.1). We successively consider $f_\lambda(x, u)$ of arbitrary growth, of subcritical growth, and finally of critical growth. Let $\Omega$ be a smooth bounded domain in $\mathbb{R}^N$. We assume $1 < p < N$. Our results, however, can be easily adapted to the case $p \ge N$, by replacing the subcritical or critical growth by an arbitrary power growth. Our general assumptions on the family $f_\lambda(x, s)$ are the following:

(H) For each $\lambda > 0$, $f_\lambda : \Omega \times [0, \infty[ \to \mathbb{R}$ is a Carathéodory function with the property that for any $s_0 > 0$, there exists a constant $A$ (depending on $\lambda$ and $s_0$) such that $|f_\lambda(x, s)| \le A$ for a.e. $x \in \Omega$ and all $s \in [0, s_0]$. Moreover if $\lambda < \lambda'$, then $f_\lambda(x, s) \le f_{\lambda'}(x, s)$ for a.e. $x \in \Omega$ and all $s \ge 0$.

$(H_0)$ For each $\lambda > 0$ and any $s_0 > 0$, there exists $B \ge 0$ (depending on $\lambda$ and $s_0$) such that $f_\lambda(x, s) \ge -Bs^{p-1}$ for a.e. $x \in \Omega$ and all $s \in [0, s_0]$.
Assumption (H0) concerns the behavior of $f_\lambda(x,s)$ near $s = 0$ and is used to apply the maximum principle; it implies $f_\lambda(x,0) \ge 0$. From now on we will always understand that $f_\lambda(x,s)$ has been extended for $s < 0$ by putting $f_\lambda(x,s) = f_\lambda(x,0)$.

At this stage, if $u \in W_0^{1,p}(\Omega) \cap L^\infty(\Omega)$ satisfies $-\Delta_p u = f_\lambda(x,u)$ in the weak sense, then (H) and the regularity theory for the p-Laplacian (cf. e.g. [26]) imply $u \in C^{1,\alpha}(\overline{\Omega})$ for some $\alpha = \alpha(N,p) \in\ ]0,1[$. Moreover $u \ge 0$ (take $-u^-$ as testing function and use $f_\lambda(x,0) \ge 0$). In addition, by the strong maximum principle of [31], if $u \not\equiv 0$, then $u > 0$ in $\Omega$ and $\partial u/\partial\nu < 0$ on $\partial\Omega$, where $\nu$ denotes the exterior normal. We thus have a solution of (1.1). Observe also that the associated functional

$$ I_\lambda(u) := \frac{1}{p}\int_\Omega |\nabla u|^p - \int_\Omega F_\lambda(x,u), \tag{2.1} $$

where $F_\lambda(x,s) := \int_0^s f_\lambda(x,t)\,dt$, is well defined for $u \in W_0^{1,p}(\Omega) \cap L^\infty(\Omega)$.

The following two assumptions will also be used throughout the paper:

(He) There exist $\lambda > 0$ and a nondecreasing function $g$ with $\inf\{g(s)/s^{p-1} : s > 0\} < 1/\|e\|_\infty^{p-1}$ such that $f_\lambda(x,s) \le g(s)$ for a.e. $x \in \Omega$ and all $s \ge 0$; here $e \in W_0^{1,p}(\Omega) \cap C^{1,\alpha}(\overline{\Omega})$ is the solution of $-\Delta_p e = 1$ and $\|\cdot\|_\infty$ denotes the $L^\infty(\Omega)$ norm.

(HΩ1) For any $\lambda > 0$ there exist a smooth subdomain $\Omega_1$, $s_1 > 0$ and $\theta_1 > \lambda_1(\Omega_1)$ such that $f_\lambda(x,s) \ge \theta_1 s^{p-1}$ for a.e. $x \in \Omega_1$ and all $s \in [0,s_1]$; here $\lambda_1(\Omega)$ denotes the principal eigenvalue of $-\Delta_p$ on $W_0^{1,p}(\Omega)$.

Assumption (He) is used to guarantee the existence of an upper solution for that specific value of $\lambda$. More comments about (He) can be found for instance in [14]. Assumption (HΩ1) is a local (i.e. on $\Omega_1$) “sublinearity” condition at 0, which is satisfied for instance if

$$ \lim_{s \to 0,\ s > 0} f_\lambda(x,s)/s^{p-1} = +\infty $$

uniformly for $x \in \Omega_1$. (HΩ1) is used to construct a lower solution.

Theorem 2.1 (Existence of one solution without growth condition). Under the assumptions (H), (H0), (He) and (HΩ1), there exists $0 < \Lambda \le +\infty$ such that problem (1.1) has at least one solution $u \in W_0^{1,p}(\Omega) \cap L^\infty(\Omega)$ (with $I_\lambda(u) < 0$) for $0 < \lambda < \Lambda$ and no solution for $\lambda > \Lambda$.

As observed in [15, p. 272], in the present generality, $\Lambda$ can be $+\infty$.
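As an illustration of the hypotheses of Theorem 2.1, the following sketch checks them for a model sublinear nonlinearity. The choice $f_\lambda(x,s) = \lambda s^{r}$ with $0 < r < p-1$, and the exponent name $r$, are assumptions made here for illustration; they are not taken from the text above.

```latex
% Hypothetical model case (not from the paper): f_\lambda(x,s) = \lambda s^{r}, 0 < r < p-1.
\begin{align*}
&\text{(H), (H$_0$):}
  && 0 \le \lambda s^{r} \le \lambda' s^{r} \quad (\lambda < \lambda',\ s \ge 0),
  \qquad \lambda s^{r} \le \lambda s_0^{r} \ \text{ on } [0,s_0], \\
&\text{(H$_e$) with } g(s) = \lambda s^{r}:
  && \inf_{s>0} \frac{g(s)}{s^{p-1}} = \inf_{s>0} \lambda\, s^{\,r-(p-1)} = 0
     < \frac{1}{\|e\|_\infty^{p-1}}, \\
&\text{(H$_{\Omega_1}$):}
  && \lim_{s \to 0^+} \frac{f_\lambda(x,s)}{s^{p-1}}
     = \lim_{s \to 0^+} \lambda\, s^{\,r-(p-1)} = +\infty .
\end{align*}
```

In this model case (He) holds for every $\lambda > 0$, which is consistent with the observation that $\Lambda$ can be $+\infty$.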
Theorem 2.2 (Nonexistence for λ large). In addition to the hypotheses of Theorem 2.1 assume:

(HΩ̃) There exist $\lambda > 0$, a smooth subdomain $\tilde\Omega$, $\tilde m \in L^\infty(\tilde\Omega)$ with $\tilde m \ge 0$, $\tilde m \not\equiv 0$, and $\mu > \lambda_1(\tilde\Omega, \tilde m)$ such that $f_\lambda(x,s) \ge \mu\,\tilde m(x)\,s^{p-1}$ for a.e. $x \in \tilde\Omega$ and all $s \ge 0$; here $\lambda_1(\tilde\Omega, \tilde m)$ denotes the principal eigenvalue of $-\Delta_p$ on $W_0^{1,p}(\tilde\Omega)$ for the weight $\tilde m$.

Then problem (1.1) for the value of $\lambda$ provided by (HΩ̃) has no solution (and consequently $\Lambda < +\infty$).

Assumption (HΩ̃) can be seen as a localized version of the trivial sufficient condition of nonexistence for the semilinear problem $-\Delta u = l(u)$ in $\Omega$, $u > 0$ in $\Omega$ and $u = 0$ on $\partial\Omega$, namely $\inf\{l(s)/s : s > 0\} > \lambda_1(\Omega)$, where $\lambda_1(\Omega)$ denotes here the first eigenvalue of $-\Delta$ on $H_0^1(\Omega)$. Assumption (HΩ̃) is satisfied in particular if $f_\lambda(x,s) \ge h(\lambda)\,\tilde m(x)\,s^{p-1}$ for some function $h$ with $h(\lambda) \to +\infty$ as $\lambda \to \infty$.

Due to the absence of growth condition, we have up to now included in the definition of a solution $u$ the requirement that $u$ belongs to $L^\infty(\Omega)$. We will now assume the following growth condition on $f_\lambda(x,u)$, where $p^* := Np/(N-p)$:

(G) For any $\lambda > 0$, there exist $d_1$, $d_2$ and $\sigma \le p^* - 1$ such that $|f_\lambda(x,s)| \le d_1 + d_2 s^\sigma$ for a.e. $x \in \Omega$ and all $s \ge 0$.
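To make the exponent in (G) concrete, here is a small worked instance of $p^* = Np/(N-p)$. The values $N = 3$, $p = 2$ are chosen purely for illustration.

```latex
% Worked instance of p^* = Np/(N-p); N = 3, p = 2 are illustrative values.
\begin{align*}
p^* &= \frac{Np}{N-p} = \frac{3 \cdot 2}{3 - 2} = 6,
\qquad \sigma \le p^* - 1 = 5, \\
\text{and for } p = 2,\ N \ge 3: \quad
p^* - 1 &= \frac{2N}{N-2} - 1 = \frac{N+2}{N-2} .
\end{align*}
```

In this instance $\sigma < 5$ is the subcritical range and $\sigma = 5$ is the critical case.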
Condition (G) implies that any $u \in W_0^{1,p}(\Omega)$ which solves $-\Delta_p u = f_\lambda(x,u)$ in the weak sense belongs to $L^\infty(\Omega)$ (cf. e.g. [3] in the subcritical case, [22] in the critical case), and consequently, as before, belongs to $C^{1,\alpha}(\overline{\Omega})$ for some $\alpha = \alpha(p,N)$. Moreover, when $\sigma < p^* - 1$, the norm of $u$ in $C^{1,\alpha}(\overline{\Omega})$ can be estimated in terms of the constants from (G) and the norm of $u$ in $W_0^{1,p}(\Omega)$ (by using successively [3] and [26]). Such an estimate does not hold anymore when $\sigma = p^* - 1$, as can be seen by considering for $p = 2$ the family of instantons

$$ u_\varepsilon(x) = \left(\frac{\varepsilon}{\varepsilon^2 + |x|^2}\right)^{\frac{N-2}{2}} - \left(\frac{\varepsilon}{\varepsilon^2 + 1}\right)^{\frac{N-2}{2}} $$

on $\Omega = B(0,1)$. Note that Proposition 3.7 from Section 3 will provide a substitute to this estimate in the critical case. Condition (G) also implies that the functional $I_\lambda(u)$ is now well defined for all $u$ in $W_0^{1,p}(\Omega)$.

The following Ambrosetti–Rabinowitz type condition, introduced in [14] to handle indefinite nonlinearities, will play a role in our subsequent results:
(AR)d For any $\lambda > 0$, there exist $\theta > p$, $\rho < p$, $s_0 \ge 0$ and $d \ge 0$ such that $\theta F_\lambda(x,s) \le s f_\lambda(x,s) + d s^\rho$ for a.e. $x \in \Omega$ and all $s \ge s_0$.

Theorem 2.3 (Existence of one solution for λ = Λ in the subcritical case). In addition to the hypotheses of Theorem 2.2 assume the continuity of $f_\lambda(x,s)$ with respect to $\lambda$ (for a.e. $x$ and uniformly for $s$ bounded). Assume also that (G) with $\sigma < p^* - 1$ and (AR)d hold uniformly on each interval $[r,R] \subset \{\lambda > 0\}$ (i.e. the various constants appearing in (G) and (AR)d can be chosen independently of $\lambda$ for $\lambda \in [r,R]$). Then problem (1.1) has at least one solution $u \in W_0^{1,p}(\Omega) \cap L^\infty(\Omega)$ (with $I_\lambda(u) \le 0$) for $\lambda = \Lambda$.

Next we state our results on the existence of at least two solutions for $0 < \lambda < \Lambda$. We will first deal with the subcritical case $\sigma < p^* - 1$. Some of the hypotheses of Theorem 2.1 have to be strengthened. Condition (H0) is replaced by:

(H0)' For any $\lambda > 0$ and any $s_0 > 0$, there exists $B \ge 0$ such that for a.e. $x \in \Omega$, the function $s \mapsto f_\lambda(x,s) + B s^{p-1}$ is nondecreasing on $[0,s_0]$; moreover $f_\lambda(x,0) \ge 0$ for all $\lambda \ge 0$ and a.e. $x \in \Omega$.

The monotonicity of the family $f_\lambda$ is also strengthened in the following way, where we write $h(x) \prec l(x)$ to mean that for any compact $K \subset \Omega$ there exists $\varepsilon > 0$ such that $h(x) + \varepsilon \le l(x)$ for a.e. $x \in K$:

(M) For any $\lambda < \lambda'$ and any $u \in C_0^1(\overline{\Omega})$ with $u > 0$ in $\Omega$, one has $f_\lambda(x,u(x)) \prec f_{\lambda'}(x,u(x))$.

Note that (M) is significantly stronger than the corresponding requirement in the semilinear case: $f_\lambda(x,u(x)) \le f_{\lambda'}(x,u(x))$, $\not\equiv$ (cf. [15, p. 273]). This difference is related to the use of a strong comparison principle for the p-Laplacian (cf. Proposition 3.4 below). Our last additional assumption is:

(HΩ2) For any $\lambda > 0$, there exist a subdomain $\Omega_2$, $s_2 > 0$ and $\theta_2 > 0$ such that $F_\lambda(x,s) \ge \theta_2 s^p$ for a.e. $x \in \Omega_2$ and all $s \ge s_2$.

Assumption (HΩ2) is a local (i.e. on $\Omega_2$) “superlinearity” condition at $\infty$, which is satisfied for instance if

$$ \lim_{s \to +\infty} f_\lambda(x,s)/s^{p-1} = +\infty $$

uniformly for $x \in \Omega_2$. It is used in conjunction with (AR)d to derive the geometry of the mountain pass.

Theorem 2.4 (Second solution in the subcritical case). In addition to the hypotheses of Theorem 2.1, assume (G) with $\sigma < p^* - 1$ as well as (AR)d, (H0)', (M) and (HΩ2). Then problem (1.1) has at least two solutions $u$, $v$ for $0 < \lambda < \Lambda$, with $u \not\equiv v$ in $\Omega$ and $I_\lambda(u) < 0$.

We finally consider multiplicity in the case where $f_\lambda$ has critical growth. Here we write $f_\lambda$ as

$$ f_\lambda(x,s) = h_\lambda(x,s) + b(x)\,s^{p^*-1}, \tag{2.2} $$
where $h_\lambda : \Omega \times [0,\infty[ \to \mathbb{R}$ is a Carathéodory function and $b \in L^\infty(\Omega)$. We will distinguish two cases depending on the growth of the subcritical part $h_\lambda(x,s)$: either (i) $h_\lambda$ satisfies (G) with $\sigma < p - 1$, $b(x)$ may change sign, or (ii) $h_\lambda$ satisfies (G) with $\sigma < p^* - 1$, $b(x) \ge 0$ in $\Omega$. In each case $b(x)$ will be assumed to satisfy the following condition:

(b) For some $x_0 \in \Omega$, some ball $B_1 \subset \Omega$ around $x_0$, some constant $M$ and some $\gamma_b$ with $\gamma_b > N(N-p)/(p^2 + N - p)$, one has $0 \le \|b\|_\infty - b(x) \le M|x - x_0|^{\gamma_b}$ for a.e. $x \in B_1$.

(Recall that $\|\cdot\|_\infty$ denotes the $L^\infty(\Omega)$ norm.) Moreover, when $p \ge 3$, the following condition on $h_\lambda$ will also be assumed:

(Hh) For any $\lambda > 0$ and any $s_0 > 0$, there exist $c_0 > 0$, $\delta > 0$, and $q$ with $p^* - \frac{2}{p-1} < q + 1 < p$ such that

$$ H_\lambda(x,s+u) - H_\lambda(x,s) \ge h_\lambda(x,s)\,u + c_0 u^{q+1} $$

for all $u \ge 0$, $s \in [0,s_0]$ and a.e. $x \in B(x_0,\delta)$, where $H_\lambda(x,s) := \int_0^s h_\lambda(x,t)\,dt$, and $x_0$ is the point involved in assumption (b) above.

Assumption (b) implies some control on the negative part of $b$: $\|b^-\|_\infty \le \|b^+\|_\infty$, with in addition some limitation on the way $b(x)$ approaches $\|b\|_\infty$. It trivially holds if $b(x) = \|b\|_\infty$ on a small ball. Assumption (Hh) provides some control on the way $H_\lambda(x,s)$ is increasing. It is used to handle the case $p \ge 3$ in Lemma 5.3. A simple calculation based on Lemma A4, part (4), from [18] shows that (Hh) holds for instance if $h_\lambda(x,s) = \lambda a(x)(s^q + g(s))$ with $g$ nondecreasing and $a(x) \ge \varepsilon > 0$ near $x_0$. Note that the condition $p^* - \frac{2}{p-1} < q + 1 < p$ in (Hh) imposes a rather strong restriction on the dimension: $N > p\bigl(1 + \frac{p(p-1)}{2}\bigr)$. See in this respect Remarks 2.7 and 6.4.

We first deal with the critical case (i).
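The restriction on the dimension stated above can be recovered by a short computation. The sketch below only uses $1 < p < N$ and $p^* = Np/(N-p)$, and asks when the interval allowed for $q + 1$ in (Hh) is nonempty; the numerical value for $p = 3$ is added for illustration.

```latex
% When is the interval ( p^* - 2/(p-1), p ) for q+1 nonempty?
\begin{align*}
p^* - \frac{2}{p-1} < p
&\iff \frac{Np}{N-p} - p < \frac{2}{p-1}
 \iff \frac{p^2}{N-p} < \frac{2}{p-1} \\
&\iff N - p > \frac{p^2(p-1)}{2}
 \iff N > p\Bigl(1 + \frac{p(p-1)}{2}\Bigr).
\end{align*}
% Illustration: for p = 3 this reads N > 3(1 + 3) = 12.
```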
Theorem 2.5 (Second solution in the critical case with σ < p − 1 for hλ). In addition to the assumptions of Theorem 2.1, assume $f_\lambda(x,s)$ satisfies (H0)' and (M). Suppose also that $f_\lambda(x,s)$ can be written as in (2.2) with $h_\lambda(x,s)$ nondecreasing with respect to $s$ and satisfying (G) with $\sigma < p - 1$. Suppose also that $b(x)$ in (2.2) is $\not\equiv 0$ and satisfies (b). Then the conclusion of Theorem 2.4 holds provided that in addition either (i) $2N/(N+1) < p < 3$, or (ii) $p \ge 3$ and (Hh) is satisfied.

We now deal with the critical case (ii).

Theorem 2.6 (Second solution in the critical case with σ < p∗ − 1 for hλ). In addition to the assumptions of Theorem 2.1, assume $f_\lambda(x,s)$ satisfies (H0)' and (M). Suppose also that $f_\lambda(x,s)$ can be written as in (2.2) with $h_\lambda(x,s)$ nondecreasing with respect to $s$ and satisfying (G) with $\sigma < p^* - 1$, and $h_\lambda(x,s)$ satisfying (AR)d. Suppose that $b$ in (2.2) is $\not\equiv 0$, $\ge 0$ in $\Omega$ and satisfies (b). Then the conclusion of Theorem 2.4 holds provided that in addition either (i) $2N/(N+1) < p < 3$, or (ii) $p \ge 3$ and (Hh) is satisfied.

In Theorem 2.6, $h_\lambda(x,s)$ is allowed any subcritical growth, at the expense of assuming (AR)d for $h_\lambda(x,s)$ and $b(x) \ge 0$.

Remark 2.7. The condition $p > \frac{2N}{N+1}$ in Theorems 2.5 and 2.6 is slightly more restrictive than the condition $p > \frac{2N}{N+2}$ considered in [18,19]. The condition $p^* - \frac{2}{p-1} < q + 1 < p$ from (Hh) already appears in [18,19].

3. Some preliminaries

3.1. Upper–lower solutions

We start by recalling the version of the method of upper–lower solutions which we will use repeatedly. Let $g(x,s)$ be a Carathéodory function on $\Omega \times \mathbb{R}$ with the property that for any $s_0 > 0$, there exists a constant $A$ such that $|g(x,s)| \le A$ for a.e. $x \in \Omega$ and all $s \in [-s_0,s_0]$. A function $u \in W^{1,p}(\Omega) \cap L^\infty(\Omega)$ is called a (weak) lower solution of the problem

$$ \begin{cases} -\Delta_p u = g(x,u) & \text{in } \Omega, \\ u = 0 & \text{on } \partial\Omega, \end{cases} \tag{3.1} $$

if $u \le 0$ on $\partial\Omega$ and

$$ \int_\Omega |\nabla u|^{p-2}\nabla u \nabla\varphi \le \int_\Omega g(x,u)\varphi $$

for all $\varphi \in C_c^\infty(\Omega)$, $\varphi \ge 0$. An upper solution is defined by reversing the inequality signs.

Proposition 3.1. Assume that $\underline{u}$ and $\overline{u}$ are respectively lower and upper solutions for (3.1), with $\underline{u} \le \overline{u}$ a.e. in $\Omega$. Consider the associated functional

$$ \Phi(u) := \frac{1}{p}\int_\Omega |\nabla u|^p - \int_\Omega G(x,u) $$

where $G(x,s) := \int_0^s g(x,t)\,dt$, and the interval

$$ M := \bigl\{u \in W_0^{1,p}(\Omega) : \underline{u} \le u \le \overline{u} \text{ a.e. in } \Omega\bigr\}. $$
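Then the infimum of $\Phi$ on $M$ is achieved at some $u$, and such a $u$ is a solution of (3.1).

Proof. It is adapted from [30] which deals with the semilinear case. By coercivity and weak lower semicontinuity, one easily sees that the infimum of $\Phi$ on $M$ is achieved at some $u$. Let $\varphi \in C_c^\infty(\Omega)$, $\varepsilon > 0$, and define

$$ v_\varepsilon := \min\bigl\{\overline{u}, \max\{\underline{u}, u + \varepsilon\varphi\}\bigr\} = u + \varepsilon\varphi - \varphi^\varepsilon + \varphi_\varepsilon $$

where $\varphi^\varepsilon := \max\{0, u + \varepsilon\varphi - \overline{u}\}$ and $\varphi_\varepsilon := -\min\{0, u + \varepsilon\varphi - \underline{u}\}$. Since $u$ minimizes $\Phi$ on $M$, one has $\langle \Phi'(u), v_\varepsilon - u\rangle \ge 0$, which gives

$$ \langle \Phi'(u), \varphi\rangle \ge \bigl(\langle \Phi'(u), \varphi^\varepsilon\rangle - \langle \Phi'(u), \varphi_\varepsilon\rangle\bigr)/\varepsilon. \tag{3.2} $$

Since $\overline{u}$ is an upper solution and $-\Delta_p$ is monotone, one also has

$$ \langle \Phi'(u), \varphi^\varepsilon\rangle \ge \langle \Phi'(u) - \Phi'(\overline{u}), \varphi^\varepsilon\rangle \ge \varepsilon\int_{\Omega_\varepsilon} \bigl(|\nabla u|^{p-2}\nabla u - |\nabla\overline{u}|^{p-2}\nabla\overline{u}\bigr)\nabla\varphi - \varepsilon\int_{\Omega_\varepsilon} \bigl|g(x,u) - g(x,\overline{u})\bigr|\,|\varphi| $$

where $\Omega_\varepsilon := \{x \in \Omega : u(x) + \varepsilon\varphi(x) \ge \overline{u}(x) > u(x)\}$. As $|\Omega_\varepsilon| \to 0$ as $\varepsilon \to 0$, this latter relation implies $\langle \Phi'(u), \varphi^\varepsilon\rangle \ge o(\varepsilon)$ as $\varepsilon \to 0$. Similarly $\langle \Phi'(u), \varphi_\varepsilon\rangle \le o(\varepsilon)$, and so by (3.2), $\langle \Phi'(u), \varphi\rangle \ge 0$. Replacing $\varphi$ by $-\varphi$, one concludes that $u$ solves (3.1). □

Other results asserting the existence of a solution between a lower solution and an upper solution for a p-Laplacian equation can be found e.g. in [12,19]. These results however do not have the level of generality we need (our local “sublinearity” assumption will lead us to deal with weak lower solutions) or do not give enough information on the solution (the minimization property will be crucial for our multiplicity results).

3.2. Integration by parts formula

We now turn to an integration by parts formula which will play a role in the use of our local “sublinearity” condition (HΩ1).

Proposition 3.2. Let $u \in C^1(\overline{\Omega})$ with $\Delta_p u \in L^1(\Omega)$ in the distribution sense. Let $\varphi \in C^\infty(\overline{\Omega})$. Then

$$ \int_\Omega |\nabla u|^{p-2}\nabla u \nabla\varphi = \int_{\partial\Omega} |\nabla u|^{p-2}\nabla u\,\nu\,\varphi - \int_\Omega (\Delta_p u)\varphi \tag{3.3} $$

where $\nu$ denotes the exterior normal vector.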
Formula (3.3) is standard when $u \in W^{2,p}(\Omega)$. Its proof in the present situation is based on the following

Lemma 3.3. (Cf. [11].) Let $a \in C(\overline{\Omega})^N$ be a vector field such that $\operatorname{div} a \in L^1(\Omega)$ in the distribution sense. Then $\int_\Omega \operatorname{div} a = \int_{\partial\Omega} a\,\nu$.

Proof of Proposition 3.2. Take $a := |\nabla u|^{p-2}\nabla u\,\varphi$. By the well-known formula for the derivation of the product of a distribution by a $C^\infty$ function, one has $\operatorname{div} a = (\Delta_p u)\varphi + |\nabla u|^{p-2}\nabla u\nabla\varphi$. Lemma 3.3 thus applies and yields (3.3). □
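As a sanity check on (3.3), it may help to write out the special case $p = 2$, where $\Delta_2 = \Delta$ and the weight $|\nabla u|^{p-2}$ disappears. The formula then reduces to the classical Green identity:

```latex
% Special case p = 2 of (3.3): \Delta_2 = \Delta and |\nabla u|^{p-2} = 1.
\begin{equation*}
\int_\Omega \nabla u \nabla \varphi
= \int_{\partial\Omega} \frac{\partial u}{\partial \nu}\,\varphi
  - \int_\Omega (\Delta u)\,\varphi .
\end{equation*}
```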
3.3. Strong comparison principle

The following strong comparison principle, which is obtained by combining arguments from [4] and [17], will be of importance in our study of multiplicity. We recall that the notation $f \prec g$ was introduced before the statement of assumption (M) in Section 2; moreover, for $C_0^1(\overline{\Omega})$ functions $u$ and $v$, we will write $u \ll v$ to mean $u(x) < v(x)$ in $\Omega$ and $\frac{\partial u}{\partial\nu}(x) > \frac{\partial v}{\partial\nu}(x)$ on $\partial\Omega$.

Proposition 3.4. Let $f, g \in L^\infty(\Omega)$, and let $u$, $v$ be solutions of

$$ -\Delta_p u = \mu|u|^{p-2}u + f \ \text{ in } \Omega, \qquad u = 0 \ \text{ on } \partial\Omega, \tag{3.4} $$

$$ -\Delta_p v = \mu|v|^{p-2}v + g \ \text{ in } \Omega, \qquad v = 0 \ \text{ on } \partial\Omega \tag{3.5} $$

where $\mu \le 0$. If $0 \le f \prec g$, then $u \ll v$.

Several works have been devoted in the last years to the strong comparison principle for the p-Laplacian. In particular the strict inequality $u < v$ in Proposition 3.4 was derived recently in [4]. Note that this conclusion $u < v$ does not hold if the hypothesis $0 \le f \prec g$ in Proposition 3.4 is weakened into $0 \le f \le g$, $f \not\equiv g$ (cf. [11]). The stronger conclusion $u \ll v$ however holds under this weakened hypothesis on $f$, $g$ if $0 \le \mu < \lambda_1(\Omega)$ and $\partial\Omega$ is connected (cf. [10]). This conclusion $u \ll v$ also holds if $\mu = 0$ and $0 \le f \le g$ with $f \not\equiv g$ on any open subset of $\Omega$ (cf. [22]). Note that the assumption $0 \le f \prec g$ was also considered recently in a slightly different setting in [21].

Proof of Proposition 3.4. The assumptions imply $g \ge 0$, $\not\equiv 0$, and so, by standard arguments, $v \gg 0$ (take $-v^-$ as testing function in (3.5) to get $v \ge 0$ and apply the strong maximum principle of [31]). Once this is observed, the proof of Proposition 2.6 in [4] can be followed without any change to reach $u(x) < v(x)$ in $\Omega$. Now to derive the strict inequality of the normal derivatives on $\partial\Omega$, it suffices to apply near any point of $\partial\Omega$ a local strong comparison result from [17], which we recall below. □

Lemma 3.5. (Cf. [17].) Let $f, g \in L^\infty(\Omega)$ and let $u$, $v$ be solutions of (3.4), (3.5) where $\mu < \lambda_1(\Omega)$. Assume $0 \le f \le g$ and call $\Omega_\delta := \{x \in \Omega : \operatorname{dist}(x,\partial\Omega) < \delta\}$. Then for every $\delta > 0$
sufficiently small and for every component $\Sigma$ of $\Omega_\delta$, one has either (i) $u \equiv v$ in $\Sigma$, or (ii) $u < v$ in $\Sigma$ and $\partial u/\partial\nu > \partial v/\partial\nu$ on $\partial\Omega \cap \overline{\Sigma}$.

3.4. Brezis–Lieb lemma

The following version of the Brezis–Lieb lemma (cf. [6]) for vector-valued functions will also be needed.

Lemma 3.6. Let $f_k$ be a bounded sequence in $L^p(\Omega, \mathbb{R}^n)$, where $1 \le p < \infty$. Assume $f_k \to f$ a.e. in $\Omega$. Then $\|f_k\|_{L^p}^p - \|f - f_k\|_{L^p}^p \to \|f\|_{L^p}^p$, where $\|g\|_{L^p}$ denotes $(\int_\Omega |g(x)|^p)^{1/p}$ with $|g(x)|$ the Euclidean norm of $g(x) \in \mathbb{R}^n$.

Proof. It is easily adapted from the proof given for instance in [24] in the scalar case. The only difference is the verification of $\bigl||a+b|^p - |a|^p - |b|^p\bigr| \le \varepsilon|a|^p + C_\varepsilon|b|^p$ for $a, b \in \mathbb{R}^n$. This latter relation follows from the observation that for $a, b \in \mathbb{R}^n$, $(|a+b|^p - |a|^p - |b|^p)/|a|^p \to 0$ as $|a| \to +\infty$, uniformly for $|b| \le 1$. □

3.5. $C_0^{1,\alpha}$ estimates of weak solutions in the critical case

For the proof in the next subsection on the local minimization in $C_0^1$ and $W_0^{1,p}$, we will need the following estimate.

Proposition 3.7. Let a sequence $u_k \in W_0^{1,p}(\Omega)$ satisfy

$$ -\Delta_p u_k = h_k(x,u_k) \tag{3.6} $$

where the Carathéodory functions $h_k$ verify the uniform growth condition

$$ |h_k(x,s)| \le C_1 + C_2|s|^{p^*-1}. \tag{3.7} $$

Assume that $u_k$ remains bounded in $W_0^{1,p}(\Omega)$. Moreover assume $\int_E |u_k|^{p^*} \to 0$ as $|E| \to 0$, uniformly in $k$. Then $u_k$ remains bounded in $C_0^{1,\alpha}(\overline{\Omega})$ for some $0 < \alpha < 1$.

As observed in the comments after hypothesis (G) in Section 2, the fact that each $u_k$ above belongs to $C_0^{1,\alpha}(\overline{\Omega})$ follows from [22]. We are proving here that $u_k$ remains bounded in $C_0^{1,\alpha}(\overline{\Omega})$ provided, in addition to being bounded in $W_0^{1,p}(\Omega)$, $u_k$ is uniformly equi-integrable in $L^{p^*}(\Omega)$. The necessity of such an additional requirement is clear from the comments after hypothesis (G). Our proof below combines arguments and results from [18,25,26].

Proof of Proposition 3.7. We break the proof in three steps.
(i) There exists $q > p^*$ such that the sequence $u_k$ remains bounded in $L^q(\Omega)$.
(ii) The sequence $u_k$ remains bounded in $L^\infty(\Omega)$.
(iii) There exists $0 < \alpha < 1$ such that the sequence $u_k$ remains bounded in $C_0^{1,\alpha}(\overline{\Omega})$.
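The role of the equi-integrability assumption can be illustrated on the instantons from Section 2 (case $p = 2$, so that $p^* = 2^* = 2N/(N-2)$). The computation below is a sketch for the untruncated profile $U_\varepsilon(x) = (\varepsilon/(\varepsilon^2+|x|^2))^{(N-2)/2}$; the notation $U_\varepsilon$ and the radius $r$ are introduced here for illustration only.

```latex
% Scale invariance of the L^{2^*} mass of the instanton profile (p = 2).
\begin{align*}
\int_{B(0,r)} U_\varepsilon^{2^*}\,dx
&= \int_{B(0,r)} \Bigl(\frac{\varepsilon}{\varepsilon^2 + |x|^2}\Bigr)^{N} dx
 = \int_{B(0,r/\varepsilon)} \frac{dy}{(1+|y|^2)^{N}}
 \qquad (x = \varepsilon y), \\
&\longrightarrow \int_{\mathbb{R}^N} \frac{dy}{(1+|y|^2)^{N}} > 0
 \quad \text{whenever } r/\varepsilon \to \infty .
\end{align*}
```

Choosing $r = r_\varepsilon \to 0$ with $r_\varepsilon/\varepsilon \to \infty$ shows that a fixed amount of $L^{2^*}$ mass sits on balls of vanishing measure, so this family is not uniformly equi-integrable in $L^{2^*}$. The constant subtracted in the definition of $u_\varepsilon$ tends to 0 as $\varepsilon \to 0$ and should not affect this conclusion.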
Proof of step (i). This is essentially a consequence of the calculations on pp. 951–952 from [18]. Indeed a careful reading of [18], using our assumption of uniform equi-integrability, shows the existence of $R > 0$ such that for any nonnegative $\eta \in C_c^\infty(\mathbb{R}^N)$ with support of diameter $\le R$, one has

$$ \Bigl(\int_\Omega \eta^{p^*}(u_k^+)^{\beta p^*}\Bigr)^{p/p^*} \le C(\eta)\int_\Omega (u_k^+)^{\beta p} + C|\Omega| \tag{3.8} $$

with constants independent of $k$. Here $\beta$ is fixed with $1 < \beta < p^*/p$. This clearly implies that $u_k^+$ remains bounded in $L^{\beta p^*}(\Omega)$. And a similar argument applied to $u_k^-$ yields the conclusion of step (i).

Proof of step (ii). Theorem 7.1 from [25] can be applied to (3.6) to derive that $u_k$ remains bounded in $L^\infty(\Omega)$. In fact a particular case of this result from [25] suffices here, which is recalled below as Lemma 3.8. Here are some details on the application of this Lemma 3.8 to (3.6): (3.10) clearly holds with $\varphi_1 \equiv 0$; taking $q > p^*$ as given by step (i), one can verify (3.11) with $\alpha_2 = p^* - 1$ and $\varphi_2$ a suitable constant by picking $r_2$ sufficiently large.

Proof of step (iii). Once the $L^\infty$ estimate of step (ii) is obtained, the global regularity result of [26] can be applied to (3.6) and gives that for some $0 < \alpha < 1$, $u_k$ remains bounded in $C_0^{1,\alpha}(\overline{\Omega})$. This concludes the proof of Proposition 3.7. □

Lemma 3.8. (Cf. [25].) Let $u \in W_0^{1,p}(\Omega) \cap L^q(\Omega)$ with $q \ge p^*$ satisfy

$$ \int_\Omega a(x,u,\nabla u)\nabla v = \int_\Omega b(x,u)v \tag{3.9} $$

for $v$ of the form $(u-c)^+$ or $(u+c)^-$, $c$ any positive constant. Here the functions $a(x,s,\eta)$ and $b(x,s)$ are assumed to verify for $x \in \Omega$, $s \in \mathbb{R}$ and $\eta \in \mathbb{R}^N$,

$$ \langle a(x,s,\eta), \eta\rangle \ge \nu|\eta|^p - \bigl(1 + |s|^{\alpha_1}\bigr)\varphi_1(x), \tag{3.10} $$

$$ (\operatorname{sign} s)\,b(x,s) \le \bigl(1 + |s|^{\alpha_2}\bigr)\varphi_2(x), \tag{3.11} $$

with $\nu$ a positive constant, $0 \le \varphi_i \in L^{r_i}(\Omega)$, $r_i > N/p$, $0 \le \alpha_1 < p\frac{N+q}{N} - \frac{q}{r_1}$ and $0 \le \alpha_2 < p\frac{N+q}{N} - 1 - \frac{q}{r_2}$. Then $u \in L^\infty(\Omega)$ and $\|u\|_{L^\infty}$ can be estimated in terms of $\|u\|_{L^q}$, $\nu$, $\alpha_i$, $\|\varphi_i\|_{L^{r_i}}$ and $\Omega$.
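The choice $\alpha_2 = p^* - 1$ made in step (ii) of the proof of Proposition 3.7 can be checked against the condition on $\alpha_2$ in Lemma 3.8. The computation below is a sketch that only uses $p^* = Np/(N-p)$ and $q > p^*$.

```latex
% Gap between the admissible bound for \alpha_2 and the value \alpha_2 = p^* - 1.
\begin{align*}
p\,\frac{N+q}{N} - 1 - \frac{q}{r_2} - (p^* - 1)
&= p + \frac{pq}{N} - p^* - \frac{q}{r_2} \\
&= \frac{p}{N}\,(q - p^*) - \frac{q}{r_2},
\qquad \text{since } p + \frac{p\,p^*}{N} = p + \frac{p^2}{N-p} = p^* .
\end{align*}
```

This quantity is positive as soon as $r_2 > qN/\bigl(p(q - p^*)\bigr)$, which is the sense in which $r_2$ has to be picked sufficiently large; note that it requires the strict inequality $q > p^*$ provided by step (i).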
3.6. Local minimization in $C_0^1$ and $W_0^{1,p}$

In our study of multiplicity we will use the following extension to $\Delta_p$ of a well-known result of Brezis and Nirenberg [8].

Proposition 3.9. Let $\Phi(u)$ be a functional of the form $\Phi(u) := \frac{1}{p}\int_\Omega |\nabla u|^p - \int_\Omega G(x,u)$, where $G(x,s) := \int_0^s g(x,t)\,dt$ and $g$ satisfies the growth condition

$$ |g(x,s)| \le d_1 + d_2|s|^\sigma \tag{3.12} $$

for some $\sigma \le p^* - 1$ and some constants $d_1$, $d_2$. Let $u_0 \in W_0^{1,p}(\Omega)$ be a local minimizer of $\Phi$ for the $C_0^1(\overline{\Omega})$ topology, i.e. $\Phi(u_0) \le \Phi(u_0 + w)$ for some $\varepsilon_0 > 0$ and all $w \in C_0^1(\overline{\Omega})$ with $\|w\|_{C_0^1} \le \varepsilon_0$. Then $u_0 \in C_0^{1,\alpha}(\overline{\Omega})$ for some $0 < \alpha < 1$ and $u_0$ is a local minimizer of $\Phi$ for the $W_0^{1,p}(\Omega)$ topology, i.e. $\Phi(u_0) \le \Phi(u_0 + w)$ for some $\varepsilon_1 > 0$ and all $w \in W_0^{1,p}(\Omega)$ with $\|w\|_{W_0^{1,p}} \le \varepsilon_1$.
Proposition 3.9 was proved in [8] when $p = 2$ and later extended to the case $p \neq 2$ with $\sigma < p^* - 1$ in [19,23] (see also [21,16,9] for recent related works). Its validity in the critical case $\sigma = p^* - 1$ is also suggested in [19], although not explicitly. For that matter, we feel it necessary to give here a complete proof. Our proof below borrows some ideas from [8,19] but follows a different approach, which turns out to be simpler and to yield a slightly stronger result (cf. Remark 3.10). In fact, as in [9], we avoid the consideration of equations involving two p-Laplacians, equations to which the global $C^{1,\alpha}$ estimates from [26] do not seem to apply.

Proof of Proposition 3.9. Since (3.12) with $\sigma < p^* - 1$ implies a similar condition with $\sigma = p^* - 1$, it suffices to consider the latter case. Moreover since $u_0 \in W_0^{1,p}(\Omega)$ satisfies

$$ -\Delta_p u_0 = g(x,u_0) \ \text{ in } \Omega, \tag{3.13} $$

it follows from Corollary 1.1 in [22] that $u_0$ belongs to some $C_0^{1,\alpha}(\overline{\Omega})$. Assume by contradiction that $u_0$ is not a local minimizer of $\Phi$ for the $W_0^{1,p}(\Omega)$ topology. This means that for any $\varepsilon > 0$ there exists $v_\varepsilon \in W_0^{1,p}(\Omega)$ with $\|v_\varepsilon - u_0\|_{W_0^{1,p}} \le \varepsilon$ and $\Phi(v_\varepsilon) < \Phi(u_0)$. For later use of the Lagrange multiplier rule, it will be convenient to use only, as in [9], a consequence of that, namely $\|v_\varepsilon - u_0\|_{L^{p^*}} \le \varepsilon$ and $\Phi(v_\varepsilon) < \Phi(u_0)$. We now consider as in [8] the truncated functional

$$ \Phi_j(u) := \frac{1}{p}\int_\Omega |\nabla u|^p - \int_\Omega G_j(x,u) $$

for $j = 1, 2, \ldots$, where $g_j(x,s) := g(x,T_j(s))$, $G_j(x,s) := \int_0^s g_j(x,t)\,dt$ and $T_j(s) = -j$ if $s \le -j$, $s$ if $-j \le s \le j$ and $+j$ if $s \ge j$. Note that (3.12) still holds for each $g_j$, with constants and exponent independent of $j$. This easily implies, by dominated convergence, that for each $v \in W_0^{1,p}(\Omega)$, $\Phi_j(v) \to \Phi(v)$ as $j \to \infty$. It follows that for each $\varepsilon > 0$, there is some $j_\varepsilon$ such that $\Phi_{j_\varepsilon}(v_\varepsilon) < \Phi(u_0)$. On the other hand, since $g_{j_\varepsilon}$ has subcritical growth and since for some constants $D_1$, $D_2$ independent of $\varepsilon$,

$$ \Phi_{j_\varepsilon}(v) \ge \frac{1}{p}\int_\Omega |\nabla v|^p - \int_\Omega \bigl(D_1 + D_2|v|^{p^*}\bigr) \tag{3.14} $$

for $v \in W_0^{1,p}(\Omega)$, one deduces that $\Phi_{j_\varepsilon}$ achieves its infimum on $\{v \in W_0^{1,p}(\Omega) : \|v - u_0\|_{L^{p^*}} \le \varepsilon\}$ at some $u_\varepsilon$; this easily follows by taking a minimizing sequence and using (3.14). One thus has

$$ \Phi_{j_\varepsilon}(u_\varepsilon) \le \Phi_{j_\varepsilon}(v_\varepsilon) < \Phi(u_0). \tag{3.15} $$
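The two properties of the truncated nonlinearity $g_j$ used above (the bound (3.12) uniformly in $j$, and subcritical growth for each fixed $j$) can be read off from the definition of $T_j$. The following lines are a sketch of that verification, assuming the bound in (3.12) is on $|g(x,s)|$.

```latex
% Truncation T_j and the resulting bounds on g_j(x,s) = g(x, T_j(s)).
\begin{align*}
T_j(s) &= \max\{-j, \min\{s, j\}\}, \qquad |T_j(s)| = \min\{|s|, j\}, \\
|g_j(x,s)| &= |g(x, T_j(s))| \le d_1 + d_2\,\min\{|s|, j\}^{\sigma}, \\
\text{hence}\quad |g_j(x,s)| &\le d_1 + d_2 |s|^{\sigma}
\quad\text{and}\quad |g_j(x,s)| \le d_1 + d_2\, j^{\sigma}.
\end{align*}
```

The first bound is (3.12) with the same constants for every $j$; the second shows that each $g_j$ is bounded, so in particular of subcritical growth.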
By construction $u_\varepsilon \to u_0$ in $L^{p^*}(\Omega)$ as $\varepsilon \to 0$, and it follows from (3.14), (3.15) that $u_\varepsilon$ remains bounded in $W_0^{1,p}(\Omega)$.

Claim. There exists $0 < \alpha < 1$ such that $u_\varepsilon$ remains bounded in $C_0^{1,\alpha}(\overline{\Omega})$ as $\varepsilon \to 0$.

Accepting this claim, one deduces from the Ascoli–Arzelà theorem that $u_\varepsilon \to u_0$ in $C_0^1(\overline{\Omega})$. It follows that for $\varepsilon > 0$ sufficiently small,

$$ \Phi(u_\varepsilon) = \Phi_{j_\varepsilon}(u_\varepsilon) < \Phi(u_0), $$

which contradicts the fact that $u_0$ is a local minimizer of $\Phi$ for the $C_0^1(\overline{\Omega})$ topology, and the proof of Proposition 3.9 will be complete.

It remains to prove the claim. For this purpose we write the Euler equation satisfied by $u_\varepsilon$:

$$ -\Delta_p u_\varepsilon = g_{j_\varepsilon}(x,u_\varepsilon) + \mu_\varepsilon|u_\varepsilon - u_0|^{p^*-2}(u_\varepsilon - u_0) \tag{3.16} $$

where $\mu_\varepsilon$ is a Lagrange multiplier associated to the constraint $\|u_\varepsilon - u_0\|_{L^{p^*}} \le \varepsilon$. Taking $u_0 - u_\varepsilon$ as testing function in (3.16) and using the minimizing property of $u_\varepsilon$, one gets $\mu_\varepsilon \le 0$. We now distinguish two cases according to the behavior of $\mu_\varepsilon$ as $\varepsilon \to 0$: either (i) $\mu_\varepsilon$ remains bounded, or (ii) for a subsequence $\mu_\varepsilon \to -\infty$. In case (ii) below, for simplicity of notation, we will keep writing $\varepsilon \to 0$ instead of considering a subsequence.

Case (i). In this case the conclusion of the claim is a direct consequence of Proposition 3.7.

Case (ii). In this case there exists $\varepsilon_0 > 0$ and a constant $M$ such that for $0 < \varepsilon < \varepsilon_0$,

$$ g_{j_\varepsilon}(x,s) + \mu_\varepsilon|s - u_0(x)|^{p^*-2}\bigl(s - u_0(x)\bigr) \ \begin{cases} < 0 & \text{for } s > M, \\ > 0 & \text{for } s < -M. \end{cases} \tag{3.17} $$
Indeed, by (3.12) and the fact that $u_0$ is bounded, $|g_{j_\varepsilon}(x,s)| \le \tilde d_1 + \tilde d_2|s - u_0(x)|^{p^*-1}$ for some constants $\tilde d_1$, $\tilde d_2$; one then picks $\varepsilon_0$ such that $\mu_\varepsilon \le -\tilde d_2 - 1$ for $0 < \varepsilon < \varepsilon_0$, and observes that the left-hand side of (3.17) for $s > \|u_0\|_\infty$ (resp. $s < -\|u_0\|_\infty$) and $0 < \varepsilon < \varepsilon_0$ is $\le \tilde d_1 - |s - u_0|^{p^*-1}$ (resp. $\ge -\tilde d_1 + |s - u_0|^{p^*-1}$); consequently (3.17) follows. Taking now $(u_\varepsilon - M)^+$ and $(u_\varepsilon + M)^-$ as testing functions in (3.16), one concludes that $|u_\varepsilon(x)| \le M$ for $x \in \Omega$ and $0 < \varepsilon < \varepsilon_0$. So $u_\varepsilon$ remains bounded in $L^\infty(\Omega)$ as $\varepsilon \to 0$. We now take $|u_\varepsilon - u_0|^{\beta-1}(u_\varepsilon - u_0)$ with $\beta \ge 1$ as testing function in (3.13), (3.16), and use the monotonicity of $-\Delta_p$ to get

$$ 0 \le \int_\Omega \bigl(|\nabla u_\varepsilon|^{p-2}\nabla u_\varepsilon - |\nabla u_0|^{p-2}\nabla u_0\bigr)\nabla\bigl(|u_\varepsilon - u_0|^{\beta-1}(u_\varepsilon - u_0)\bigr) = \int_\Omega \bigl(g_{j_\varepsilon}(x,u_\varepsilon) - g(x,u_0)\bigr)|u_\varepsilon - u_0|^{\beta-1}(u_\varepsilon - u_0) + \mu_\varepsilon\int_\Omega |u_\varepsilon - u_0|^{p^*+\beta-1}. $$

Since $u_\varepsilon$ remains bounded in $L^\infty(\Omega)$, using Hölder inequality in the integral involving $g$, we obtain

$$ -\mu_\varepsilon\|u_\varepsilon - u_0\|_{L^{p^*+\beta-1}}^{p^*-1} \le c $$

where the constant $c$ does not depend on $\beta$ and $\varepsilon$. Letting $\beta \to \infty$ yields

$$ -\mu_\varepsilon\|u_\varepsilon - u_0\|_{L^\infty}^{p^*-1} \le c. $$

So the right-hand side of (3.16) remains bounded in $L^\infty(\Omega)$, and it follows from [26] that for some $0 < \alpha = \alpha(N,p) < 1$, $u_\varepsilon$ remains bounded in $C_0^{1,\alpha}(\overline{\Omega})$. This concludes the proof of the claim in case (ii). □

Remark 3.10. The above proof of Proposition 3.9 in the critical case $\sigma = p^* - 1$ shows that $u_0$ is a local minimizer of $\Phi$ on $W_0^{1,p}(\Omega)$ for the $L^{p^*}(\Omega)$ topology, i.e. $\Phi(u_0) \le \Phi(u_0 + w)$ for some $\varepsilon_1 > 0$ and all $w \in W_0^{1,p}(\Omega)$ with $\|w\|_{L^{p^*}} \le \varepsilon_1$. And in the subcritical case $\sigma < p^* - 1$, one would conclude that $u_0$ is a local minimizer of $\Phi$ on $W_0^{1,p}(\Omega)$ for the $L^{\sigma+1}(\Omega)$ topology.

Remark 3.11. The above proof of Proposition 3.9 greatly simplifies in the subcritical case $\sigma < p^* - 1$ (cf. [9]): the fact that $u_0$ belongs to some $C_0^{1,\alpha}(\overline{\Omega})$ now follows by using successively [3] and [26], no truncation of $\Phi$ is needed, and Proposition 3.7 can be replaced by another direct application of [3] and [26].

4. Proofs of Theorems 2.1, 2.2 and 2.3

This section is devoted to the proof of the first theorems stated in Section 2. The general strategy in this section as well as in the following one is rather similar to that in the semilinear case [15] and we will mainly concentrate on the differences with respect to [15]. It will be convenient from now on to denote (1.1) by $(1.1)_\lambda$.

Proof of Theorem 2.1. One starts by proving the existence of an upper solution of $(1.1)_\lambda$ for the value $\lambda$ provided by (He). With $g$ and $e$ as in (He), there exists $M > 0$ such that
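$$ 1/\|e\|_\infty^{p-1} \ge g\bigl(M\|e\|_\infty\bigr)\big/\bigl(M\|e\|_\infty\bigr)^{p-1} $$

and so one has, for any $\varphi \in C_c^\infty(\Omega)$ with $\varphi \ge 0$,

$$ \int_\Omega |\nabla(Me)|^{p-2}\nabla(Me)\nabla\varphi = \int_\Omega M^{p-1}\varphi \ge \int_\Omega g\bigl(M\|e\|_\infty\bigr)\varphi \ge \int_\Omega g(Me)\varphi \ge \int_\Omega f_\lambda(x,Me)\varphi. $$

This shows that $Me$ is an upper solution.

We now construct a lower solution of $(1.1)_\lambda$ by using the local “sublinearity” assumption (HΩ1) at $\lambda$. Denote by $\varphi_1$ a positive principal eigenfunction of $-\Delta_p$ on $W_0^{1,p}(\Omega_1)$; one has $\varphi_1 \in C^{1,\alpha}(\overline{\Omega}_1)$ and $\partial\varphi_1/\partial\nu < 0$ on $\partial\Omega_1$, where $\nu$ denotes here the exterior normal on $\partial\Omega_1$. Extending $\varphi_1$ by 0 on $\Omega \setminus \Omega_1$, the extended function, still denoted by $\varphi_1$, belongs to $W_0^{1,p}(\Omega) \cap L^\infty(\Omega)$. Call $u_\varepsilon := \varepsilon\varphi_1$ for $\varepsilon > 0$, and let $\varphi \in C_c^\infty(\Omega)$, $\varphi \ge 0$. One has, for $\varepsilon$ sufficiently small so that $\|u_\varepsilon\|_\infty \le s_1$ (where $s_1$ comes from (HΩ1)),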
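$$ \int_\Omega |\nabla u_\varepsilon|^{p-2}\nabla u_\varepsilon\nabla\varphi = \int_{\partial\Omega_1} |\nabla u_\varepsilon|^{p-2}\nabla u_\varepsilon\,\nu\,\varphi - \int_{\Omega_1} (\Delta_p u_\varepsilon)\varphi \le \lambda_1(\Omega_1)\int_{\Omega_1} |u_\varepsilon|^{p-2}u_\varepsilon\varphi \le \int_{\Omega_1} f_\lambda(x,u_\varepsilon)\varphi \le \int_\Omega f_\lambda(x,u_\varepsilon)\varphi $$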
where we have used successively Proposition 3.2, (HΩ1) and $f_\lambda(x,0) \ge 0$. This shows that $u_\varepsilon$ is a (weak) lower solution. Moreover, taking $\varepsilon > 0$ smaller if necessary, one has $u_\varepsilon \le Me$ in $\Omega$. Proposition 3.1 can thus be applied and yields a solution $u \in W_0^{1,p}(\Omega) \cap L^\infty(\Omega)$ of $(1.1)_\lambda$ for the value of $\lambda$ provided by (He). So at this stage, we have proved that

$$ \Lambda := \sup\bigl\{\lambda > 0 : (1.1)_\lambda \text{ has a solution}\bigr\} > 0. $$

It remains to show that for each $0 < \lambda < \Lambda$, $(1.1)_\lambda$ has a solution $u$ with $I_\lambda(u) < 0$. Let $0 < \lambda < \Lambda$ and take $\lambda'$ with $\lambda < \lambda' < \Lambda$ such that $(1.1)_{\lambda'}$ has a solution $u$. One has, by the monotonicity of the family $f_\lambda$ (cf. (H)),

$$ \int_\Omega |\nabla u|^{p-2}\nabla u\nabla\varphi = \int_\Omega f_{\lambda'}(x,u)\varphi \ge \int_\Omega f_\lambda(x,u)\varphi $$

for all $\varphi \in C_c^\infty(\Omega)$ with $\varphi \ge 0$. This shows that $u$ is an upper solution for $(1.1)_\lambda$. The rest of the argument is then easily adapted from p. 275 in [15]: one first constructs a (weak) lower solution as above in the form $\varepsilon\varphi_1$ by using (HΩ1) at $\lambda$, and Proposition 3.1 applies again to yield a solution $u_0$ to $(1.1)_\lambda$; moreover the minimization property provided by Proposition 3.1 leads to $I_\lambda(u_0) \le I_\lambda(\varepsilon\varphi_1)$, and by (HΩ1), $I_\lambda(\varepsilon\varphi_1) < 0$ for $\varepsilon$ sufficiently small. □

Proof of Theorem 2.2. Let $\lambda$ be given by (HΩ̃) and suppose by contradiction that $(1.1)_\lambda$ admits a solution $u \in W_0^{1,p}(\Omega) \cap L^\infty(\Omega)$. Consider the eigenvalue problem with weight
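$$ \begin{cases} -\Delta_p v = \mu\,\tilde m(x)|v|^{p-2}v & \text{in } \tilde\Omega, \\ v = 0 & \text{on } \partial\tilde\Omega. \end{cases} \tag{4.1} $$

Since by (HΩ̃),

$$ \int_{\tilde\Omega} |\nabla u|^{p-2}\nabla u\nabla\varphi = \int_{\tilde\Omega} f_\lambda(x,u)\varphi \ge \mu\int_{\tilde\Omega} \tilde m(x)u^{p-1}\varphi $$

for $\varphi \in C_c^\infty(\tilde\Omega)$, $\varphi \ge 0$, we see that $u$ (restricted to $\tilde\Omega$) is an upper solution for (4.1). On the other hand let $\varphi_1$ be a positive eigenfunction associated to $\lambda_1(\tilde\Omega,\tilde m)$ and call $u_\varepsilon := \varepsilon\varphi_1$. Since

$$ \int_{\tilde\Omega} |\nabla u_\varepsilon|^{p-2}\nabla u_\varepsilon\nabla\varphi = \lambda_1(\tilde\Omega,\tilde m)\int_{\tilde\Omega} \tilde m(x)u_\varepsilon^{p-1}\varphi \le \mu\int_{\tilde\Omega} \tilde m(x)u_\varepsilon^{p-1}\varphi $$

for $\varphi \in C_c^\infty(\tilde\Omega)$, $\varphi \ge 0$, we see that $u_\varepsilon$ is a lower solution for (4.1). Clearly, for $\varepsilon > 0$ sufficiently small, $u_\varepsilon \le u$ on $\tilde\Omega$ (since for $x_0 \in \partial\tilde\Omega$, either $u(x_0) > 0$ or $u(x_0) = 0$ and $\partial u/\partial\nu(x_0) < 0$, where $\nu$ is here the exterior normal on $\partial\tilde\Omega$). Proposition 3.1 then guarantees the existence of a solution $v$ of (4.1) with $u_\varepsilon \le v \le u$ in $\tilde\Omega$. In particular $v \ge 0$, $v \not\equiv 0$, which shows that $\mu$ is a principal eigenvalue of $-\Delta_p$ on $\tilde\Omega$ for the weight $\tilde m$. This is a contradiction since $\mu > \lambda_1(\tilde\Omega,\tilde m)$. □

Remark 4.1. Here is another proof of Theorem 2.2 based on Picone’s identity (cf. [1]). Suppose again that $(1.1)_\lambda$ admits a solution $u$ and let $\varphi \in C_c^\infty(\tilde\Omega)$ with $\varphi \ge 0$ and $\int_{\tilde\Omega} \tilde m\varphi^p > 0$. By Picone’s identity,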
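$$ \int_\Omega |\nabla\varphi|^p - \int_\Omega \nabla\bigl(\varphi^p/u^{p-1}\bigr)|\nabla u|^{p-2}\nabla u \ge 0 $$

and consequently, from the equation satisfied by $u$ and (HΩ̃),

$$ \int_\Omega |\nabla\varphi|^p \ge \int_\Omega f_\lambda(x,u)\varphi^p/u^{p-1} \ge \mu\int_{\tilde\Omega} \tilde m(x)\varphi^p. $$

Taking the infimum with respect to $\varphi$ yields $\lambda_1(\tilde\Omega,\tilde m) \ge \mu$, a contradiction.

Proof of Theorem 2.3. Let $\mu_k \to \Lambda$ with $0 < \mu_k < \Lambda$, $\mu_k$ increasing, and let $u_k$ be a solution of $(1.1)_{\mu_k}$ with $I_{\mu_k}(u_k) < 0$.

We first show that $u_k$ remains bounded in $W_0^{1,p}(\Omega)$. This is carried out as on [15, p. 276]: using $I_{\mu_k}(u_k) < 0$ and (AR)d, and denoting by $\|v\|$ the $W_0^{1,p}(\Omega)$ norm $(\int_\Omega |\nabla v|^p)^{1/p}$, one obtains

$$ \frac{\theta}{p}\|u_k\|^p \le \int_\Omega f_{\mu_k}(x,u_k)u_k + d\int_\Omega u_k^\rho + c_1 $$

for some constant $c_1$. One then deduces, using $(1.1)_{\mu_k}$,

$$ \Bigl(\frac{\theta}{p} - 1\Bigr)\|u_k\|^p \le c_2\|u_k\|^\rho + c_1 $$

for some constant $c_2$, which gives the required bound since $\rho < p$.

Since $\sigma < p^* - 1$, we have, for a subsequence, $u_k \to u$ in $C^1(\overline{\Omega})$. Clearly $u$ solves $-\Delta_p u = f_\Lambda(x,u)$ in $\Omega$, $u \ge 0$ in $\Omega$, $u = 0$ on $\partial\Omega$, and one has $I_\Lambda(u) \le 0$. It remains to see that $u \not\equiv 0$. Assume by contradiction $u \equiv 0$. We will use (HΩ1) for $\lambda = \mu_1$. Let as before $\Omega_1$ be the corresponding subdomain and $\varphi_1$ a positive eigenfunction associated to the principal eigenvalue $\lambda_1(\Omega_1)$ of $-\Delta_p$ on $W_0^{1,p}(\Omega_1)$. We have, for $\varphi \in C_c^\infty(\Omega_1)$, $\varphi \ge 0$, using the monotonicity of the family $f_\lambda$,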
$$ \int_{\Omega_1} |\nabla u_k|^{p-2}\nabla u_k\nabla\varphi \ge \int_{\Omega_1} f_{\mu_1}(x,u_k)\varphi \ge \theta_1\int_{\Omega_1} u_k^{p-1}\varphi $$

for $k$ sufficiently large (so that $0 \le u_k(x) \le s_1$ on $\Omega_1$, which is possible since $u_k \to 0$ uniformly). The above relation shows that $u_k$ (restricted to $\Omega_1$) is an upper solution for the problem

$$ \begin{cases} -\Delta_p v = \theta_1|v|^{p-2}v & \text{in } \Omega_1, \\ v = 0 & \text{on } \partial\Omega_1. \end{cases} \tag{4.2} $$

Since $\lambda_1(\Omega_1) < \theta_1$, one also has that for $\varepsilon > 0$ sufficiently small, $\varepsilon\varphi_1$ is a lower solution of (4.2) which satisfies $\varepsilon\varphi_1 \le u_k$ on $\Omega_1$. Proposition 3.1 then implies the existence of a solution $v$ of (4.2) with $\varepsilon\varphi_1 \le v \le u_k$. This is a contradiction since $\theta_1 > \lambda_1(\Omega_1)$. □

Remark 4.2. Here is another proof of Theorem 2.3 based on Picone’s identity. One starts by constructing $u$ as above. Assume again by contradiction $u \equiv 0$. With $\Omega_1$, $\lambda_1(\Omega_1)$, $\varphi_1$ and $k$ sufficiently large as above, one has
$$ \lambda_1(\Omega_1)\int_{\Omega_1} \varphi_1^p = \int_{\Omega_1} |\nabla\varphi_1|^p \ge \int_{\Omega_1} |\nabla u_k|^{p-2}\nabla u_k\nabla\bigl(\varphi_1^p/u_k^{p-1}\bigr) = \int_{\Omega_1} f_{\mu_k}(x,u_k)\varphi_1^p/u_k^{p-1} \ge \int_{\Omega_1} f_{\mu_1}(x,u_k)\varphi_1^p/u_k^{p-1} \ge \theta_1\int_{\Omega_1} \varphi_1^p, $$

which is a contradiction since $\theta_1 > \lambda_1(\Omega_1)$. Here one has used successively the definition of $\lambda_1(\Omega_1)$, Picone’s identity, the equation satisfied by $u_k$, the monotonicity of the family $f_\lambda$, and (HΩ1).

5. Proofs of Theorems 2.4, 2.5 and 2.6

We now turn to the proof of the multiplicity theorems from Section 2. We start with the following result which concerns the first solution.

Theorem 5.1. In addition to the assumptions of Theorem 2.1, assume that $f_\lambda(x,s)$ satisfies (H0)', (M) and the growth condition (G) with $\sigma \le p^* - 1$. Let $0 < \lambda < \Lambda$. Then there exists a solution $u_0 \in C^{1,\alpha}(\overline{\Omega})$ of problem $(1.1)_\lambda$ which is a local minimum of $I_\lambda$ in the $W_0^{1,p}$ topology.

Proof. Pick $\lambda_1$, $\lambda_2$ with $0 < \lambda_1 < \lambda < \lambda_2 < \Lambda$. We first observe that $(1.1)_{\lambda_1}$ and $(1.1)_{\lambda_2}$ have solutions $u_1$ and $u_2$, respectively, which satisfy $u_1 \le u_2$. This can be seen as follows. One starts with a solution $u_2$ of $(1.1)_{\lambda_2}$ and considers $(1.1)_{\lambda_1}$. By the monotonicity of the family $f_\lambda$, $u_2$ is an upper solution of $(1.1)_{\lambda_1}$; moreover, using (HΩ1) at $\lambda_1$ as in the proof of Theorem 2.1, one constructs a lower solution of $(1.1)_{\lambda_1}$ which is smaller than $u_2$. Proposition 3.1 thus applies and yields a solution $u_1$ of $(1.1)_{\lambda_1}$ with $u_1 \le u_2$ in $\Omega$ and $I_{\lambda_1}(u_1) < 0$. We now use $u_1$ and $u_2$ as lower and upper solutions for $(1.1)_\lambda$ and apply as before Proposition 3.1 to obtain a solution $u_0$ of $(1.1)_\lambda$ which minimizes $I_\lambda$ on $\{u \in W_0^{1,p}(\Omega) : u_1 \le u \le u_2\}$ and satisfies $I_\lambda(u_0) < 0$ (the latter inequality follows from the minimization property, using $I_\lambda(u_1) \le I_{\lambda_1}(u_1) < 0$). We claim that
u1 < u0 < u2 ∂u1 /∂ν > ∂u0 /∂ν > ∂u2 /∂ν
in Ω, on ∂Ω.
(5.1)
Let us prove the relations involving u1 and u0 (same argument for u0 and u2). We first use (M) and (H0) at λ to derive that for a suitable B ≥ 0,
\[
-\Delta_p u_1 = f_{\lambda_1}(x,u_1) \prec f_\lambda(x,u_1)
= f_\lambda(x,u_1) + B u_1^{p-1} - B u_1^{p-1}
\le f_\lambda(x,u_0) + B u_0^{p-1} - B u_1^{p-1}
= -\Delta_p u_0 + B u_0^{p-1} - B u_1^{p-1}.
\]
So
\[
-\Delta_p u_1 + B u_1^{p-1} \prec -\Delta_p u_0 + B u_0^{p-1}.
\]
Moreover, taking B larger if necessary and applying (H0) at λ1, one also has
\[
-\Delta_p u_1 + B u_1^{p-1} = f_{\lambda_1}(x,u_1) + B u_1^{p-1} \ge f_{\lambda_1}(x,0) \ge 0.
\]
We are thus in a position to apply the strong comparison principle of Proposition 3.4, which yields the assertion (5.1) relative to u1 and u0.

It follows from (5.1) that $\{u \in W_0^{1,p}(\Omega): u_1 \le u \le u_2\}$ contains a $C_0^1(\bar\Omega)$ neighborhood of u0 and consequently, u0 is a local minimizer of Iλ on $C_0^1(\bar\Omega)$. We now apply Proposition 3.9 to get that u0 is a local minimizer of Iλ on $W_0^{1,p}(\Omega)$. □

The proofs of Theorems 2.4, 2.5 and 2.6 have a common part, which we consider first. We keep assuming here the hypotheses of Theorem 5.1. In each theorem the second solution of (1.1)λ will be constructed in the form u0 + w where u0 is the first solution provided by Theorem 5.1 and w is a nonzero solution of
\[
\begin{cases}
-\Delta_p(u_0 + w) = f_\lambda(x, u_0 + w^+) & \text{in } \Omega,\\
w = 0 & \text{on } \partial\Omega.
\end{cases}
\tag{5.2}
\]
The construction of a nonzero solution of (5.2) will be carried out using the mountain pass theorem.

Observe that if $w \in W_0^{1,p}(\Omega)$ solves (5.2), then w ≥ 0. Indeed, by (G) and the regularity theory, $w \in L^\infty(\Omega)$; moreover, using (H0), one has, for a suitable B ≥ 0,
\[
-\Delta_p u_0 = f_\lambda(x,u_0) + B u_0^{p-1} - B u_0^{p-1}
\le f_\lambda(x,u_0+w^+) + B(u_0+w^+)^{p-1} - B u_0^{p-1}
= -\Delta_p(u_0+w) + B(u_0+w^+)^{p-1} - B u_0^{p-1}
\]
and consequently, since $(B(u_0+w^+)^{p-1} - B u_0^{p-1})\,w^- \equiv 0$, we have
\[
\int_\Omega \bigl(|\nabla u_0|^{p-2}\nabla u_0 - |\nabla(u_0+w)|^{p-2}\nabla(u_0+w)\bigr)\nabla w^- \le 0.
\]
Splitting the preceding integral as an integral on {w > 0} and an integral on {w ≤ 0}, one obtains, by strict monotonicity, $w^- \equiv 0$, i.e. w ≥ 0.
It follows that if w is a nonzero solution of (5.2), then u0 + w will be a second solution of (1.1)λ satisfying all the requirements in Theorem 2.4. To derive the existence of a nonzero solution of (5.2), we write the associated functional
\[
J_\lambda(w) := \frac1p\int_\Omega |\nabla(u_0+w)|^p - \int_\Omega G_\lambda(x,w)
\tag{5.3}
\]
where $G_\lambda(x,w) := F_\lambda(x,u_0+w^+) - F_\lambda(x,u_0) - f_\lambda(x,u_0)\,w^-$. We are thus led to look for a nonzero critical point of Jλ on $W_0^{1,p}(\Omega)$.

We now prove that 0 is a local minimizer of Jλ on $W_0^{1,p}(\Omega)$. Indeed, using the fact that u0 is a local minimizer of Iλ on $W_0^{1,p}(\Omega)$, one obtains
\[
J_\lambda(w) \ge \frac1p\int_\Omega|\nabla(u_0+w)|^p - \frac1p\int_\Omega|\nabla(u_0+w^+)|^p + \frac1p\int_\Omega|\nabla u_0|^p + \int_\Omega f_\lambda(x,u_0)\,w^-
\tag{5.4}
\]
for $\|w^+\|$ sufficiently small. Recall that when p ≥ 2,
\[
|\xi_2|^p \ge |\xi_1|^p + p|\xi_1|^{p-2}\langle\xi_1,\xi_2-\xi_1\rangle + c(p)|\xi_2-\xi_1|^p/(2^p-1)
\tag{5.5}
\]
for some positive constant c(p) and all $\xi_1,\xi_2 \in \mathbb{R}^N$, and that a similar relation holds when p < 2, with the last term of (5.5) replaced by $c(p)|\xi_1-\xi_2|^2/(|\xi_2|+|\xi_1|)^{2-p}$ (cf. [27,28]). Using (5.5) and the fact that u0 solves (1.1)λ, one derives from (5.4) that when p ≥ 2,
\[
J_\lambda(w) \ge \frac1p\int_\Omega|\nabla u_0|^p + c(p)\int_\Omega|\nabla w^-|^p/(2^p-1) \ge J_\lambda(0),
\]
i.e. the conclusion that 0 is a local minimizer of Jλ. A similar inequality can be obtained when p < 2. We will now distinguish between the subcritical situation of Theorem 2.4 and the critical situation of Theorems 2.5 and 2.6.

Proof of Theorem 2.4. Assumption (G) with σ < p* − 1 and (AR)d imply that Jλ satisfies the (PS) condition on $W_0^{1,p}(\Omega)$. Indeed, if wk is a (PS) sequence, then, for θ as in (AR)d and for some εk → 0 and some constant c, we have
\[
\theta J_\lambda(w_k) - J'_\lambda(w_k)(u_0+w_k) \le c + \varepsilon_k\|u_0+w_k\|.
\]
Next, after some computations, using (AR)d, one deduces, for another constant c′,
\[
\Bigl(\frac{\theta}{p}-1\Bigr)\|u_0+w_k\|^p + (\theta-1)\int_\Omega f_\lambda(x,u_0)\,w_k^-
\le c' + d\int_\Omega |u_0+w_k^+|^\rho + \varepsilon_k\|u_0+w_k\|.
\]
Since ρ < p, this implies that the sequence (wk) remains bounded in $W_0^{1,p}(\Omega)$. Passing to a subsequence, let w0 be the weak limit of (wk). So it follows that $J'_\lambda(w_k)(w_k-w_0) \to 0$. Using the fact that σ < p* − 1 and the (S)+ property of $-\Delta_p$, one derives, in a rather standard way, that for a further subsequence, wk converges in $W_0^{1,p}(\Omega)$.

From the above discussion, we know that 0 is a local minimizer of Jλ on $W_0^{1,p}(\Omega)$. One now faces the following alternative, as in [15, p. 278]: either Jλ admits near 0 another local minimizer (and we are finished), or for any r > 0 sufficiently small,
\[
J_\lambda(0) < \inf\bigl\{J_\lambda(w): w \in W_0^{1,p}(\Omega) \text{ and } \|w\| = r\bigr\}
\tag{5.6}
\]
(by using Theorem 5.10 from [13]). We aim in the latter case at applying the mountain pass theorem. This will be possible if for some $\varphi \in W_0^{1,p}(\Omega)$, Jλ(tϕ) → −∞ as t → +∞. Assumption (HΩ2) will be used to construct such a ϕ. One first adapts the calculation from p. 462 in [14] to derive from (HΩ2) and (AR)d that for some s3 and some c > 0,
\[
F_\lambda(x,s) \ge c\,s^\theta
\tag{5.7}
\]
for a.e. x ∈ Ω2 and all s ≥ s3, where θ comes from (AR)d. Take $\varphi \in C_c^\infty(\Omega_2)$ with ϕ ≥ 0, ϕ ≢ 0. Since
\[
J_\lambda(t\varphi) = \frac1p\int_\Omega|\nabla(u_0+t\varphi)|^p - \int_\Omega F_\lambda(x,u_0+t\varphi) + \int_\Omega F_\lambda(x,u_0),
\]
one easily derives from (5.7) that Jλ(tϕ) → −∞ as t → ∞. This concludes the proof of Theorem 2.4. □

Now we study the (PS) condition for the functional Jλ under the hypotheses of either Theorem 2.5 or 2.6.

Lemma 5.2. Assume that 0 is the only critical point of Jλ. Then Jλ satisfies the (PS)c condition for all levels c with
\[
c < c_0 := \frac{\|u_0\|^p}{p} + \frac{S^{N/p}}{N\,\|b\|_\infty^{(N-p)/p}}
\tag{5.8}
\]
where $S := \inf\bigl\{\int_\Omega|\nabla u|^p \big/ \bigl(\int_\Omega|u|^{p^*}\bigr)^{p/p^*}: u \in W_0^{1,p}(\Omega),\ u \not\equiv 0\bigr\}$ is the best Sobolev constant.
Proof. Let wk be a (PS)c sequence with c < c0, i.e.
\[
\frac1p\|u_0+w_k\|^p - \int_\Omega G_\lambda(x,w_k) \to c,
\tag{5.9}
\]
\[
\Bigl|\int_\Omega|\nabla(u_0+w_k)|^{p-2}\nabla(u_0+w_k)\nabla\varphi - \int_\Omega g_\lambda(x,w_k)\varphi\Bigr| \le \varepsilon_k\|\varphi\|,
\qquad \forall\varphi \in W_0^{1,p}(\Omega)
\tag{5.10}
\]
where εk → 0. Recall that in the present situation $g_\lambda(x,s) := h_\lambda(x,u_0+s^+) + b(x)(u_0+s^+)^{p^*-1}$ and
\[
G_\lambda(x,s) := H_\lambda(x,u_0+s^+) + b(x)(u_0+s^+)^{p^*}/p^* - H_\lambda(x,u_0) - b(x)u_0^{p^*}/p^* - h_\lambda(x,u_0)\,s^- - b(x)u_0^{p^*-1}s^-,
\]
with $H_\lambda(x,s) := \int_0^s h_\lambda(x,t)\,dt$.

We first observe that wk remains bounded in $W_0^{1,p}(\Omega)$. This follows by multiplying (5.10) with ϕ = u0 + wk by 1/p* and subtracting from (5.9): the terms of power p* cancel, and the remaining dominating term is $\|w_k\|^p$, which easily yields the desired bound. In this argument we have used condition (G) with σ < p − 1 in the case of Theorem 2.5, and condition (AR)d as well as b(x) ≥ 0 in the case of Theorem 2.6 (the argument in this latter case is a little more involved but can be easily adapted from that on p. 283 in [15]). So, for a subsequence, wk → w0 weakly in $W_0^{1,p}(\Omega)$, strongly in $L^r(\Omega)$ for any r < p*, and a.e. in Ω. Writing (5.10) as
\[
-\Delta_p(u_0+w_k) = g_\lambda(x,w_k) + f_k
\tag{5.11}
\]
where $f_k \in W^{-1,p'}(\Omega)$ with $\|f_k\|_{-1,p'} \le \varepsilon_k$, one can apply Theorem 2.1 from [5] to go to the limit in the weak form of (5.11), and so w0 solves (5.2). By the assumption of Lemma 5.2, one then concludes w0 = 0. We now claim that
\[
\|u_0+w_k\|^p/N + \|u_0\|^p/p^* \to c.
\tag{5.12}
\]
Indeed, multiplying again (5.10) with ϕ = u0 + wk by 1/p* and subtracting from (5.9), one obtains
\[
\frac1N\|u_0+w_k\|^p + \frac1{p^*}\int_\Omega\bigl(h_\lambda(x,u_0)u_0 + b(x)u_0^{p^*}\bigr) \to c,
\]
and (5.12) follows by using the equation for u0.

Relation (5.12) and the weak lower semi-continuity of the norm imply $c \ge \|u_0\|^p/p$. We distinguish two cases: either (i) $c = \|u_0\|^p/p$ or (ii) $c > \|u_0\|^p/p$. In case (i), one deduces from (5.12) that $\|u_0+w_k\| \to \|u_0\|$, and consequently wk → 0 in $W_0^{1,p}(\Omega)$, which shows that (PS)c holds. We will now prove that case (ii) leads to c ≥ c0, which contradicts assumption (5.8). For that purpose we start from (5.10) with ϕ = u0 + wk and use Eq. (1.1) for u0 to obtain
\[
\lim\|u_0+w_k\|^p = \|u_0\|^p + \lim\int_\Omega b(x)\bigl((u_0+w_k^+)^{p^*} - u_0^{p^*}\bigr).
\]
Applying (5.12) to the left-hand side and the mean-value theorem to $(u_0+w_k^+)^{p^*} - (w_k^+)^{p^*}$ in the right-hand side, one obtains
\[
N\bigl(c - \|u_0\|^p/p\bigr) = \lim\int_\Omega b(x)(w_k^+)^{p^*},
\]
which easily leads to
\[
\bigl(c - \|u_0\|^p/p\bigr)^{p/p^*} \le \bigl(\|b\|_\infty/N\bigr)^{p/p^*} S^{-1}\lim\|w_k\|^p.
\tag{5.13}
\]
On the other hand, by Lemma 3.6, $\|u_0+w_k\|^p - \|w_k\|^p \to \|u_0\|^p$. Replacing in (5.13) and using again (5.12) together with the fact that we are in case (ii), we come to the inequality c ≥ c0, which is a contradiction. This completes the proof of Lemma 5.2. □

Proof of Theorems 2.5 and 2.6. Proceeding exactly as before, we know that 0 is a local minimizer of Jλ on $W_0^{1,p}(\Omega)$, and we look for a nonzero critical point of Jλ. Assume by contradiction that 0 is the only critical point of Jλ. Then, for some ball B(0, r) in $W_0^{1,p}(\Omega)$, we have
\[
J_\lambda(0) < J_\lambda(w)
\tag{5.14}
\]
for all w ∈ B(0, r) with w ≠ 0. Using Lemma 5.2 above and Theorem 5.10 from [13] (which only requires the (PS)c condition to hold at the level of the strict local minimum, here the level Jλ(0) = 0 < c0), one deduces from (5.14) that (5.6) holds for all r > 0 sufficiently small, i.e. one has mountain ranges surrounding 0. We aim again at applying the mountain pass theorem. For this purpose it remains to show the existence of $\bar u \in W_0^{1,p}(\Omega)$ such that $J_\lambda(\bar u) < 0$ and the inf max value of Jλ over the family of all continuous paths from 0 to $\bar u$ is < c0. Once this is done, the mountain pass theorem yields the existence of a nonzero critical point of Jλ, and we reach a contradiction with the fact that 0 was supposed to be the only critical point of Jλ.

The construction of the desired $\bar u$ is made as follows. One considers as in [2,18] functions of the form uε(x) := ρ(x)Uε(x), where
\[
U_\varepsilon(x) := \frac{\varepsilon^{(N-p)/(p(p-1))}}{\bigl(\varepsilon^{p/(p-1)} + |x-x_0|^{p/(p-1)}\bigr)^{(N-p)/p}}
\]
and ρ(x) is a cutoff function near x0 (from assumption (b)). More precisely ρ is smooth, nonnegative, with ρ ≡ 1 near x0 and support in a ball B2 around x0, where B2 is chosen such that B2 ⊂ B1 ∩ B(x0, δ) (from assumptions (b) and (Hh)) and b(x) ≥ some η > 0 a.e. on B2. Using Lemma 5.3 below, one easily sees that the function $\bar u = t u_\varepsilon$ satisfies the desired properties if one first selects ε > 0 sufficiently small and then t sufficiently large. □

Lemma 5.3. (i) For any ε > 0, Jλ(tuε) → −∞ as t → +∞. (ii) One has
\[
\sup_{t\ge0} J_\lambda(tu_\varepsilon) < c_0
\]
for ε > 0 sufficiently small.

Proof. Part (i) easily follows from the “superlinearity” of the problem near x0. The proof of part (ii) is separated into three cases. We use arguments similar to those in [18], in particular their Lemmas A4 and A5.
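As a side check on the functions Uε used in this proof, here is a minimal Python sketch; it is not part of the paper's argument. It approximates the radial integral $\int_0^\infty |U_\varepsilon'(r)|^p r^{N-1}\,dr$, which equals $\int_{\mathbb{R}^N}|\nabla U_\varepsilon|^p$ up to the surface-area factor of the unit sphere, and illustrates that this quantity does not depend on ε. The function name `grad_energy`, the log-scale trapezoid rule and the sample values N = 3, p = 2 are assumptions made for the illustration only.

```python
import math

def grad_energy(eps, N, p, s_min=-30.0, s_max=30.0, steps=40_000):
    """Approximate int_0^inf |U_eps'(r)|^p r^(N-1) dr (hypothetical helper).

    U_eps(r) = eps^a / (eps^k + r^k)^m with
    a = (N-p)/(p(p-1)), k = p/(p-1), m = (N-p)/p.
    The trapezoid rule is applied in the variable s = log r.
    """
    a = (N - p) / (p * (p - 1))
    k = p / (p - 1)
    m = (N - p) / p
    h = (s_max - s_min) / steps
    total = 0.0
    for i in range(steps + 1):
        r = math.exp(s_min + i * h)
        # |U_eps'(r)| computed from the explicit formula for U_eps
        dU = eps ** a * m * k * r ** (k - 1) / (eps ** k + r ** k) ** (m + 1)
        # r^(N-1) dr becomes r^N ds after the substitution r = exp(s)
        val = dU ** p * r ** N
        total += val if 0 < i < steps else 0.5 * val
    return total * h

if __name__ == "__main__":
    for eps in (1.0, 0.5, 0.1):
        print(eps, grad_energy(eps, 3, 2))
```

For N = 3, p = 2 and ε = 1 the radial integral reduces to $\int_0^\infty r^4/(1+r^2)^3\,dr = 3\pi/16$, which gives an independent value to compare against.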
Case 1: 2 ≤ p < 3. Let
\[
\tilde J_\lambda(tu_\varepsilon) := J_\lambda(tu_\varepsilon) - \frac1p\|u_0\|^p
\]
where u0 is the first solution from Theorem 5.1. We thus have
\[
\tilde J_\lambda(tu_\varepsilon) = \frac1p\|u_0+tu_\varepsilon\|^p - \frac1p\|u_0\|^p
- \frac1{p^*}\int_\Omega b(x)(u_0+tu_\varepsilon)^{p^*} + \frac1{p^*}\int_\Omega b(x)u_0^{p^*}
- \int_\Omega H_\lambda(x,u_0+tu_\varepsilon) + \int_\Omega H_\lambda(x,u_0).
\]
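Since hλ(x, ·) is nondecreasing, using Lemma A4, parts (1) and (4), with r = p*, as well as the gradient estimate on p. 946 in [18], we obtain
\[
\begin{aligned}
\tilde J_\lambda(tu_\varepsilon) \le{}& \frac{t^p}{p}\int_\Omega|\nabla u_\varepsilon|^p + t\int_\Omega|\nabla u_0|^{p-2}\nabla u_0\nabla u_\varepsilon + Ct^{\gamma_0}\varepsilon^\beta \\
& - \frac{t^{p^*}}{p^*}\int_\Omega b(x)u_\varepsilon^{p^*} - t\int_\Omega b(x)u_0^{p^*-1}u_\varepsilon - t^{p^*-1}\int_\Omega b(x)u_\varepsilon^{p^*-1}u_0 \\
& - Ct^\gamma\int_\Omega u_0^{p^*-\gamma}u_\varepsilon^\gamma - t\int_\Omega h_\lambda(x,u_0)u_\varepsilon
\end{aligned}
\]
for some β > (N − p)/p and all γ, γ0 with 1 < γ < p* − 1, p − 1 < γ0 < N(p − 1)/(N − 1), where C denotes, here and below, various positive constants possibly depending on γ and γ0 but independent of t and ε. Moreover, as on p. 947 in [18],
\[
\int_\Omega|\nabla u_\varepsilon|^p\,dx = \int_{\mathbb{R}^N}|\nabla U_1|^p\,dx + O\bigl(\varepsilon^{(N-p)/(p-1)}\bigr),
\qquad
\int_\Omega|u_\varepsilon|^{p^*}\,dx = \int_{\mathbb{R}^N}|U_1|^{p^*}\,dx + O\bigl(\varepsilon^{N/(p-1)}\bigr)
\]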
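when ε → 0. It follows, using Lemma A5, part (1), with α = p* − 1, and the fact that u0 solves (5.2), that
\[
\begin{aligned}
\tilde J_\lambda(tu_\varepsilon) \le{}& \frac{t^p}{p}\Bigl(\int_{\mathbb{R}^N}|\nabla U_1|^p + O\bigl(\varepsilon^{(N-p)/(p-1)}\bigr)\Bigr)
- \frac{t^{p^*}}{p^*}\|b\|_\infty\Bigl(\int_{\mathbb{R}^N}|U_1|^{p^*} + O\bigl(\varepsilon^{N/(p-1)}\bigr)\Bigr) \\
& + Ct^{\gamma_0}\varepsilon^\beta - Ct^{p^*-1}\varepsilon^{(N-p)/p}
- \frac{t^{p^*}}{p^*}\int_\Omega\bigl(b(x)-\|b\|_\infty\bigr)u_\varepsilon^{p^*}.
\end{aligned}
\]
Note that
\[
\int_\Omega\bigl(b(x)-\|b\|_\infty\bigr)u_\varepsilon^{p^*}
= \int_{B_{\varepsilon^\delta}}\bigl(b(x)-\|b\|_\infty\bigr)u_\varepsilon^{p^*}
+ \int_{\Omega\setminus B_{\varepsilon^\delta}}\bigl(b(x)-\|b\|_\infty\bigr)u_\varepsilon^{p^*}
\]
where $B_{\varepsilon^\delta}$ is the ball centered at x0 with radius $\varepsilon^\delta$. Using assumption (b), we have that $\int_{B_{\varepsilon^\delta}}(b(x)-\|b\|_\infty)u_\varepsilon^{p^*}$ is of order $\varepsilon^{\delta\gamma_b}$; we also have, for M large enough,
\[
\Bigl|\int_{\Omega\setminus B_{\varepsilon^\delta}}\bigl(b(x)-\|b\|_\infty\bigr)u_\varepsilon^{p^*}\Bigr|
\le \int_{\varepsilon^\delta}^{M}\frac{\varepsilon^{N/(p-1)}\,r^{N-1}}{\bigl(\varepsilon^{p/(p-1)}+r^{p/(p-1)}\bigr)^{N}}\,dr
= O\bigl(\varepsilon^{N(1-\delta)/(p-1)}\bigr).
\]
Now from the assumption on $\gamma_b$, one can choose δ > 0 such that
\[
\frac{N-p}{p} < \frac{N(1-\delta)}{p-1} \quad\text{and}\quad \frac{N-p}{p} < \delta\gamma_b,
\]
and so we have
\[
\int_\Omega\bigl(\|b\|_\infty - b(x)\bigr)u_\varepsilon^{p^*} \le o\bigl(\varepsilon^{(N-p)/p}\bigr).
\]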
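Therefore
\[
\tilde J_\lambda(tu_\varepsilon) \le \frac{t^p}{p}\Bigl(\int_{\mathbb{R}^N}|\nabla U_1|^p + O\bigl(\varepsilon^{(N-p)/(p-1)}\bigr)\Bigr)
- \frac{t^{p^*}}{p^*}\Bigl(\|b\|_\infty\int_{\mathbb{R}^N}|U_1|^{p^*} + o\bigl(\varepsilon^{(N-p)/p}\bigr)\Bigr)
+ Ct^{\gamma_0}\varepsilon^\beta - Ct^{p^*-1}\varepsilon^{(N-p)/p}.
\]
Denoting
\[
A_\varepsilon = \int_{\mathbb{R}^N}|\nabla U_1|^p + O\bigl(\varepsilon^{(N-p)/(p-1)}\bigr),
\qquad
B_\varepsilon = \|b\|_\infty\int_{\mathbb{R}^N}|U_1|^{p^*} + o\bigl(\varepsilon^{(N-p)/p}\bigr),
\tag{5.15}
\]
we thus have $\tilde J_\lambda(tu_\varepsilon) \le f(t)$, where f is defined by
\[
f(t) := \frac{t^p}{p}A_\varepsilon - \frac{t^{p^*}}{p^*}B_\varepsilon + Ct^{\gamma_0}\varepsilon^\beta - Ct^{p^*-1}\varepsilon^{(N-p)/p}.
\]
It is clear that this function f is bounded from above and reaches its maximum at some tε > 0. Note that f′(tε) = 0 if and only if
\[
C\gamma_0 t_\varepsilon^{\gamma_0-p}\varepsilon^\beta + A_\varepsilon - t_\varepsilon^{p^*-p}B_\varepsilon = C(p^*-1)\,t_\varepsilon^{p^*-(p+1)}\varepsilon^{(N-p)/p}.
\tag{5.16}
\]
Since p* > p,
\[
-1 \le \gamma_0 - p < \frac{-N+p}{N-1} \le 0, \qquad
\lim_{\varepsilon\to0}A_\varepsilon = \int_{\mathbb{R}^N}|\nabla U_1|^p
\quad\text{and}\quad
\lim_{\varepsilon\to0}B_\varepsilon = \|b\|_\infty\int_{\mathbb{R}^N}|U_1|^{p^*},
\]
we deduce from (5.16) that tε remains bounded as ε → 0. If $\liminf_{\varepsilon\to0}t_\varepsilon = 0$, then part (ii) of Lemma 5.3 is proved. Otherwise, (5.16) implies $\lim_{\varepsilon\to0}t_\varepsilon > 0$. And so
\[
f(t_\varepsilon) = \frac{t_\varepsilon^p}{p}\int_{\mathbb{R}^N}|\nabla U_1|^p - \frac{t_\varepsilon^{p^*}}{p^*}\|b\|_\infty\int_{\mathbb{R}^N}|U_1|^{p^*}
+ o\bigl(\varepsilon^{(N-p)/p}\bigr) + C\varepsilon^\beta - C\varepsilon^{(N-p)/p}.
\tag{5.17}
\]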
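But
\[
\sup_{t\ge0}\Bigl(\frac{t^p}{p}\int_{\mathbb{R}^N}|\nabla U_1|^p - \frac{t^{p^*}}{p^*}\|b\|_\infty\int_{\mathbb{R}^N}|U_1|^{p^*}\Bigr)
= \frac1N\,\frac{1}{\|b\|_\infty^{(N-p)/p}}\,S^{N/p}.
\]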
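The value of this supremum comes from one-variable calculus: writing $A = \int|\nabla U_1|^p$ and $B = \|b\|_\infty\int|U_1|^{p^*}$, the function $g(t) = t^pA/p - t^{p^*}B/p^*$ is maximal where $t^{p^*-p} = A/B$, and since $1/p - 1/p^* = 1/N$ the maximum equals $A^{N/p}/(N B^{(N-p)/p})$; this is the expression in S above when U1 attains the Sobolev constant. The Python sketch below compares that closed form with a brute-force grid search. The function names and the numerical values of A and B are placeholders chosen for illustration, not quantities taken from the paper.

```python
import math

def sup_closed_form(A, B, p, N):
    """Closed-form maximum over t >= 0 of t^p/p * A - t^{p*}/p* * B."""
    return (1.0 / N) * A ** (N / p) / B ** ((N - p) / p)

def sup_numeric(A, B, p, N, steps=200_000):
    """Grid-search maximum of the same function (illustrative check only)."""
    p_star = N * p / (N - p)
    # the maximiser is (A/B)^(1/(p*-p)); search on twice that range
    t_max = 2.0 * (A / B) ** (1.0 / (p_star - p))
    best = 0.0
    for i in range(steps + 1):
        t = t_max * i / steps
        best = max(best, t ** p / p * A - t ** p_star / p_star * B)
    return best

if __name__ == "__main__":
    A, B, p, N = 2.0, 1.05, 2.0, 3   # arbitrary sample values
    print(sup_closed_form(A, B, p, N), sup_numeric(A, B, p, N))
```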
Thus, taking ε > 0 small enough, we have
\[
\tilde J_\lambda(tu_\varepsilon) < \frac1N\,\frac{1}{\|b\|_\infty^{(N-p)/p}}\,S^{N/p}
\]
for t ≥ 0. Part (ii) of Lemma 5.3 is thus proved in Case 1.

Case 2: 2N/(N + 1) < p < 2. Using Lemma A4, parts (3) and (4), with r = p* (note that p > 2N/(N + 1) implies p > 2N/(N + 2), i.e. p* > 2), we obtain
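\[
\begin{aligned}
\tilde J_\lambda(tu_\varepsilon) \le{}& \frac{t^p}{p}\int_\Omega|\nabla u_\varepsilon|^p + t\int_\Omega|\nabla u_0|^{p-2}\nabla u_0\nabla u_\varepsilon + t^{\gamma_0}\int_\Omega|\nabla u_\varepsilon|^{\gamma_0} \\
& - \frac{t^{p^*}}{p^*}\int_\Omega b(x)u_\varepsilon^{p^*} - t\int_\Omega b(x)u_0^{p^*-1}u_\varepsilon - t^{p^*-1}\int_\Omega b(x)u_\varepsilon^{p^*-1}u_0 \\
& - Ct^\gamma\int_\Omega u_0^{p^*-\gamma}u_\varepsilon^\gamma - t\int_\Omega h_\lambda(x,u_0)u_\varepsilon
\end{aligned}
\tag{5.18}
\]
for all 1 < γ < p* − 1 and 1 < γ0 < p. Since p > 2N/(N + 1), one can select γ0 such that
\[
\frac{N(p-1)}{N-1} < \gamma_0 \quad\text{and}\quad \beta := N - \frac{N}{p}\gamma_0 > \frac{N-p}{p}.
\]
Then, by Lemma A5, part (2), we have
\[
\int_\Omega|\nabla u_\varepsilon|^{\gamma_0} \le C\varepsilon^\beta.
\]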
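We can now proceed as in Case 1 above to conclude the proof of part (ii) of Lemma 5.3 in Case 2.

Case 3: p ≥ 3. Since p ≥ 3, Lemma A4, part (2), gives
\[
\frac1p|\nabla u_0 + t\nabla u_\varepsilon|^p - \frac1p|\nabla u_0|^p
\le \frac{t^p}{p}|\nabla u_\varepsilon|^p + t|\nabla u_0|^{p-2}\langle\nabla u_0,\nabla u_\varepsilon\rangle + Ct^2|\nabla u_\varepsilon|^2 + Ct^{p-1}|\nabla u_\varepsilon|^{p-1}.
\]
Thus, using (Hh) and proceeding as in Case 1 above, using Lemma A4, part (4), we obtain
\[
\begin{aligned}
\tilde J_\lambda(tu_\varepsilon) \le{}& \frac{t^p}{p}\int_\Omega|\nabla u_\varepsilon|^p + t\int_\Omega|\nabla u_0|^{p-2}\nabla u_0\nabla u_\varepsilon + Ct^2\int_\Omega|\nabla u_\varepsilon|^2 + Ct^{p-1}\int_\Omega|\nabla u_\varepsilon|^{p-1} \\
& - \frac{t^{p^*}}{p^*}\int_\Omega b(x)u_\varepsilon^{p^*} - t\int_\Omega b(x)u_0^{p^*-1}u_\varepsilon - t^{p^*-1}\int_\Omega b(x)u_\varepsilon^{p^*-1}u_0 \\
& - Ct^\gamma\int_\Omega u_0^{p^*-\gamma}u_\varepsilon^\gamma - t\int_\Omega h_\lambda(x,u_0)u_\varepsilon - c_0t^{q+1}\int_\Omega u_\varepsilon^{q+1}
\end{aligned}
\]
for all 1 < γ < p* − 1. And so
\[
\begin{aligned}
\tilde J_\lambda(tu_\varepsilon) \le{}& \frac{t^p}{p}\int_\Omega|\nabla u_\varepsilon|^p + Ct^2\int_\Omega|\nabla u_\varepsilon|^2 + Ct^{p-1}\int_\Omega|\nabla u_\varepsilon|^{p-1}
- \frac{t^{p^*}}{p^*}\int_\Omega b(x)u_\varepsilon^{p^*} - c_0t^{q+1}\int_\Omega u_\varepsilon^{q+1} \\
={}& \frac{t^p}{p}\int_\Omega|\nabla u_\varepsilon|^p - \frac{t^{p^*}}{p^*}\|b\|_\infty\int_\Omega u_\varepsilon^{p^*} + Ct^2\int_\Omega|\nabla u_\varepsilon|^2 + Ct^{p-1}\int_\Omega|\nabla u_\varepsilon|^{p-1} \\
& + \frac{t^{p^*}}{p^*}\int_\Omega\bigl(\|b\|_\infty - b(x)\bigr)u_\varepsilon^{p^*} - c_0t^{q+1}\int_\Omega u_\varepsilon^{q+1}.
\end{aligned}
\]
Since p ≥ 3, we have 2 < N(p − 1)/(N − 1). Thus Lemma A5, part (2), implies
\[
\int_\Omega|\nabla u_\varepsilon|^2 \le C\varepsilon^{2(N-p)/(p(p-1))}
\]
and since (p − 1) < N(p − 1)/(N − 1), we have
\[
\int_\Omega|\nabla u_\varepsilon|^{p-1} \le C\varepsilon^{(N-p)/p}.
\]
In addition Lemma A5, part (1), implies
\[
\int_\Omega u_\varepsilon^{q+1} \ge C\varepsilon^{N-\frac{N-p}{p}(q+1)}
\]
for ε small enough (note that by (Hh), q + 1 > p* − 2/(p − 1) > N(p − 1)/(N − p)). Hence
\[
\tilde J_\lambda(tu_\varepsilon) \le \frac{t^p}{p}A_\varepsilon - \frac{t^{p^*}}{p^*}B_\varepsilon + C\bigl(t^2+t^{p-1}\bigr)\varepsilon^{2(N-p)/(p(p-1))} - Ct^{q+1}\varepsilon^{N-\frac{N-p}{p}(q+1)} =: f(t).
\]
Using again the hypothesis q + 1 > p* − 2/(p − 1), we have
\[
N - \frac{N-p}{p}(q+1) < \frac{2(N-p)}{p(p-1)}.
\]
And by the same argument as in Case 1 above, we can finish the proof of Lemma 5.3. □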
6. Applications

In this section we will first show how the previous theorems apply to problem (1.2). The verification of the corresponding hypotheses is either easy or can be easily adapted from the arguments in [15]; in Theorem 6.3, the verification of (Hh) uses Lemma A4, part (4), from [18]. The functional Iλ(u) here reads
\[
I_\lambda(u) := \frac1p\int_\Omega|\nabla u|^p - \frac{\lambda}{q+1}\int_\Omega a(x)(u^+)^{q+1} - \frac1{r+1}\int_\Omega b(x)(u^+)^{r+1}.
\]
As an application of Theorems 2.1, 2.2 and 2.3, we have
Theorem 6.1. Let 0 ≤ q < p − 1 < r and assume a, b ∈ L∞(Ω) with (i) a(x) ≥ 0 a.e. in Ω, (ii) a(x) ≥ ε > 0 on some ball B1. Then there exists 0 < Λ ≤ ∞ such that problem (1.2) has at least one solution u (with Iλ(u) < 0) for 0 < λ < Λ and no solution for λ > Λ. If in addition (iii) b(x) ≥ 0 on some ball B2, with a(x)b(x) ≢ 0 on B2, then Λ < ∞. Moreover, if r < p* − 1, then problem (1.2) has at least one solution u (with Iλ(u) ≤ 0) for λ = Λ.

As an application of Theorem 2.4, we have

Theorem 6.2. Let 0 ≤ q < p − 1 < r < p* − 1. Assume that a, b ∈ L∞(Ω) satisfy (iv) a(x) ≥ εK > 0 on any compact K ⊂ Ω, (v) b(x) ≥ ε > 0 on some ball B2. Then problem (1.2) has at least two solutions u, v for 0 < λ < Λ, with u ≢ v and Iλ(u) < 0.

Note that (iv), (v) imply assumptions (i), (ii) and (iii) of Theorem 6.1. As an application of Theorem 2.5, we finally have

Theorem 6.3. Let 0 ≤ q < p − 1 and r = p* − 1. Assume that a, b ∈ L∞(Ω) satisfy respectively condition (iv) of Theorem 6.2 and condition (b) of Theorem 2.5. Assume further that either 2N/(N + 1) < p < 3, or p ≥ 3 and p* − 2/(p − 1) < q + 1. Then the conclusion of Theorem 6.2 holds.

Note that condition (b) implies assumption (v) of Theorem 6.2.

Remark 6.4. Here are some questions which remain unsolved in the context of problem (1.2). (i) In Theorem 6.1, the question of existence of at least one solution for λ = Λ when r = p* − 1. (ii) In Theorems 6.2 and 6.3, the question whether the two solutions u and v satisfy u < v in Ω and ∂u/∂ν > ∂v/∂ν on ∂Ω. (iii) In Theorem 6.3, the question of the necessity of the restrictions on the exponents p, q. Note that these questions are also unsolved in the constant coefficients case.

The second application concerns the problem
\[
\begin{cases}
-\Delta_p u = \lambda c(x)(u+1)^r & \text{in } \Omega,\\
u > 0 & \text{in } \Omega,\\
u = 0 & \text{on } \partial\Omega
\end{cases}
\tag{6.1}
\]
where p − 1 < r. This problem was studied in [7,20] when c(x) ≡ 1 and p = 2 and more recently in [15] when p = 2 and c(x) is variable. The functional Iλ(u) here reads
\[
I_\lambda(u) := \frac1p\int_\Omega|\nabla u|^p - \frac{\lambda}{r+1}\int_\Omega c(x)(u^+ + 1)^{r+1}.
\]
As a consequence of Theorems 2.1, 2.2 and 2.3, we have

Theorem 6.5. Assume that c ∈ L∞(Ω) satisfies (i) c(x) ≥ 0 a.e. in Ω, (ii) c(x) ≥ ε > 0 on some ball B. Then there exists 0 < Λ < +∞ such that problem (6.1) has at least one solution u (with Iλ(u) < 0) for 0 < λ < Λ and no solution for λ > Λ. Moreover, if r < p* − 1, then problem (6.1) has at least one solution u (with Iλ(u) ≤ 0) for λ = Λ.

As an application of Theorem 2.4, we have

Theorem 6.6. Assume r < p* − 1, and that c ∈ L∞(Ω) satisfies (iii) c(x) ≥ εK > 0 on any compact K ⊂ Ω. Then problem (6.1) has at least two solutions u, v for 0 < λ < Λ, with u ≢ v and Iλ(u) < 0.

Finally, as an application of Theorem 2.6, we have

Theorem 6.7. Assume r = p* − 1 and that c(x) satisfies (iii) above as well as condition (b) of Theorem 2.6. Assume further that 2N/(N + 1) < p < 3. Then problem (6.1) has at least two solutions u, v for 0 < λ < Λ, with u ≢ v and Iλ(u) < 0.

The critical case r = p* − 1 in Theorem 6.7 requires more care because the right-hand side of (6.1) is not written in the form (2.2). However, u solves (6.1) if and only if $v = \lambda^g u$ solves
\[
\begin{cases}
-\Delta_p v = c(x)(v+\mu)^{p^*-1} & \text{in } \Omega,\\
v > 0 & \text{in } \Omega,\\
v = 0 & \text{on } \partial\Omega
\end{cases}
\tag{6.2}
\]
for $\mu = \lambda^g$ with $g = 1/(p^*-p)$. This implies in particular that (6.2) has at least one solution for $\mu < \Lambda^g$ and no solutions for $\mu > \Lambda^g$. In order to apply Theorem 2.6 to (6.2), we write the nonlinearity in the following way
\[
c(x)(v+\mu)^r = h_\mu(x,v) + c(x)v^r
\]
where $h_\mu(x,s) = c(x)\bigl((s+\mu)^{p^*-1} - s^{p^*-1}\bigr)$. It is now not difficult to verify the hypotheses of Theorem 2.6. Note that the critical case with p ≥ 3 in (6.2) remains unsolved; the difficulty lies in the verification of (Hh).
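The exponent g = 1/(p* − p) in the change of variable $v = \lambda^g u$ can be checked by bookkeeping powers of λ: since $-\Delta_p$ is homogeneous of degree p − 1, substituting $u = \lambda^{-g}v$ into (6.1) with r = p* − 1 leaves the factor $\lambda^{g(p-1)+1-g(p^*-1)}$ in front of $c(x)(v+\lambda^g)^{p^*-1}$, and this power must vanish. The short Python sketch below verifies that identity in exact rational arithmetic; the function names are illustrative and the sample pairs (N, p) are arbitrary.

```python
from fractions import Fraction

def critical_exponent(N, p):
    """Sobolev critical exponent p* = Np/(N - p), as an exact fraction."""
    p = Fraction(p)
    return N * p / (N - p)

def scaling_power(N, p):
    """The exponent g = 1/(p* - p) used in v = lambda^g * u."""
    return 1 / (critical_exponent(N, p) - Fraction(p))

def residual_lambda_power(N, p):
    """Power of lambda left in front of c(x)(v + mu)^r after substitution."""
    p = Fraction(p)
    g = scaling_power(N, p)
    r = critical_exponent(N, p) - 1
    return g * (p - 1) + 1 - g * r

if __name__ == "__main__":
    for N, p in [(3, 2), (5, 3), (4, Fraction(3, 2))]:
        print(N, p, scaling_power(N, p), residual_lambda_power(N, p))
```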
Acknowledgments

Most of this work was done with the support of CNPq, FNRS, PRONEX, FAPESP and FONDECYT 1080430 at Unicamp, ULB and USACH. We wish to thank H. Brezis for several comments relative to Proposition 3.7 and the referee for some remarks on condition (Hh) in Theorems 2.5 and 2.6.

References

[1] W. Allegretto, Yin Xi Huang, A Picone's identity for the p-Laplacian and applications, Nonlinear Anal. 32 (7) (1998) 819–830.
[2] A. Ambrosetti, H. Brezis, G. Cerami, Combined effects of concave and convex nonlinearities in some elliptic problems, J. Funct. Anal. 122 (2) (1994) 519–543.
[3] A. Anane, Etudes des valeurs propres et de la résonance pour l'opérateur p-Laplacien, Thèse de doctorat, Université Libre de Bruxelles, 1988.
[4] D. Arcoya, D. Ruiz, The Ambrosetti–Prodi problem for the p-Laplacian operator, Comm. Partial Differential Equations 31 (4–6) (2006) 849–865.
[5] L. Boccardo, F. Murat, Almost everywhere convergence of the gradients of solutions to elliptic and parabolic equations, Nonlinear Anal. 19 (6) (1992) 581–597.
[6] H. Brezis, E. Lieb, A relation between pointwise convergence of functions and convergence of functionals, Proc. Amer. Math. Soc. 88 (3) (1983) 486–490.
[7] H. Brezis, L. Nirenberg, Positive solutions of nonlinear elliptic equations involving critical Sobolev exponents, Comm. Pure Appl. Math. 36 (4) (1983) 437–477.
[8] H. Brezis, L. Nirenberg, H^1 versus C^1 local minimizers, C. R. Acad. Sci. Paris Sér. I Math. 317 (5) (1993) 465–472.
[9] F. Brock, L. Iturriaga, P. Ubilla, A multiplicity result for the p-Laplacian involving a parameter, Ann. Henri Poincaré 9 (7) (2008) 1371–1386.
[10] M. Cuesta, P. Takáč, A strong comparison principle for the Dirichlet p-Laplacian, in: Reaction Diffusion Systems, Trieste, 1995, in: Lect. Notes Pure Appl. Math., vol. 194, Dekker, New York, 1998, pp. 79–87.
[11] M. Cuesta, P. Takáč, A strong comparison principle for positive solutions of degenerate elliptic equations, Differential Integral Equations 13 (4–6) (2000) 721–746.
[12] C. De Coster, M. Henrard, Existence and localization of solution for second order elliptic BVP in presence of lower and upper solutions without any order, J. Differential Equations 145 (2) (1998) 420–452.
[13] D.G. de Figueiredo, Lectures on the Ekeland Variational Principle with Applications and Detours, Tata Inst. Fund. Res. Lect. Math. Phys., vol. 81, Springer-Verlag, 1989.
[14] D.G. de Figueiredo, J.-P. Gossez, P. Ubilla, Local superlinearity and sublinearity for indefinite semilinear elliptic problems, J. Funct. Anal. 199 (2) (2003) 452–467.
[15] D.G. de Figueiredo, J.-P. Gossez, P. Ubilla, Multiplicity results for a family of semilinear elliptic problems under local superlinearity and sublinearity, J. Eur. Math. Soc. 8 (2) (2006) 269–286.
[16] X. Fan, On the sub–supersolution method for p(x)-Laplacian equations, J. Math. Anal. Appl. 330 (1) (2007) 665–682.
[17] J. Fleckinger-Pellé, P. Takáč, Uniqueness of positive solutions for nonlinear cooperative systems with the p-Laplacian, Indiana Univ. Math. J. 43 (4) (1994) 1227–1253.
[18] J. García, I. Peral, Some results about the existence of a second positive solution in a quasilinear critical problem, Indiana Univ. Math. J. 43 (3) (1994) 941–957.
[19] J. García, I. Peral, J. Manfredi, Sobolev versus Hölder local minimizers and global multiplicity for some quasilinear elliptic equations, Commun. Contemp. Math. 2 (3) (2000) 385–404.
[20] F. Gazzola, A. Malchiodi, Some remarks on the equation −Δu = (u + 1)^p for varying domains, Comm. Partial Differential Equations 27 (2002) 809–845.
[21] J. Giacomoni, I. Schindler, P. Takáč, Sobolev versus Hölder local minimizers and existence of multiple solutions for a singular quasilinear equation, Ann. Sc. Norm. Super. Pisa Cl. Sci. (5) 6 (1) (2007) 117–158.
[22] M. Guedda, L. Véron, Quasilinear elliptic equations involving critical Sobolev exponents, Nonlinear Anal. 13 (8) (1989) 879–902.
[23] Zongming Guo, Zhitao Zhang, W^{1,p} versus C^1 local minimizers and multiplicity results for quasilinear elliptic equations, J. Math. Anal. Appl. 286 (1) (2003) 32–50.
[24] O. Kavian, Introduction à la théorie des points critiques et applications aux problèmes elliptiques, Math. Appl., vol. 13, Springer-Verlag, 1993.
[25] O.A. Ladyženskaja, N.N. Ural'ceva, Équations aux dérivées partielles de type elliptique, Monographies Universitaires de Mathématiques, No. 31, Dunod, 1968.
[26] G.M. Lieberman, Boundary regularity for solutions of degenerate elliptic equations, Nonlinear Anal. 12 (11) (1988) 1203–1219.
[27] P. Lindqvist, On the equation div(|∇u|^{p−2}∇u) + λ|u|^{p−2}u = 0, Proc. Amer. Math. Soc. 109 (1) (1990) 157–164.
[28] I. Peral, Some results on quasilinear elliptic equations: Growth versus shape, in: Nonlinear Functional Analysis and Applications to Differential Equations, Trieste, 1997, World Sci. Publ., 1998, pp. 153–202.
[29] P. Pucci, J. Serrin, The Maximum Principle, Progr. Nonlinear Differential Equations Appl., vol. 73, Birkhäuser, 2007.
[30] M. Struwe, Variational Methods. Applications to Nonlinear Partial Differential Equations and Hamiltonian Systems, third ed., Ergeb. Math. Grenzgeb., vol. 34, Springer-Verlag, 2000.
[31] J.L. Vázquez, A strong maximum principle for some quasilinear elliptic equations, Appl. Math. Optim. 12 (3) (1984) 191–202.
Journal of Functional Analysis 257 (2009) 753–806 www.elsevier.com/locate/jfa
Fast rotating Bose–Einstein condensates in an asymmetric trap Amandine Aftalion a,∗ , Xavier Blanc b , Nicolas Lerner c a CMAP, Ecole Polytechnique, CNRS, 91128 Palaiseau cedex, France b Université Pierre et Marie Curie-Paris 6, UMR 7598, Laboratoire Jacques-Louis Lions, 175 rue du Chevaleret,
Paris F-75013, France c Projet analyse fonctionnelle, Institut de Mathématiques de Jussieu, Université Pierre-et-Marie-Curie (Paris 6),
175 rue du Chevaleret, 75013 Paris, France Received 5 December 2008; accepted 5 January 2009 Available online 23 January 2009 Communicated by Paul Malliavin
Abstract We investigate the effect of the anisotropy of a harmonic trap on the behaviour of a fast rotating Bose– Einstein condensate. This is done in the framework of the 2D Gross–Pitaevskii equation and requires a symplectic reduction of the quadratic form defining the energy. This reduction allows us to simplify the energy on a Bargmann space and study the asymptotics of large rotational velocity. We characterize two regimes of velocity and anisotropy; in the first one where the behaviour is similar to the isotropic case, we construct an upper bound: a hexagonal Abrikosov lattice of vortices, with an inverted parabola profile. The second regime deals with very large velocities, a case in which we prove that the ground state does not display vortices in the bulk, with a 1D limiting problem. In that case, we show that the coarse grained atomic density behaves like an inverted parabola with large radius in the deconfined direction but keeps a fixed profile given by a Gaussian in the other direction. The features of this second regime appear as new phenomena. © 2009 Elsevier Inc. All rights reserved.
* Corresponding author.
E-mail addresses: [email protected] (A. Aftalion), [email protected] (X. Blanc), [email protected] (N. Lerner). URLs: http://www.cmap.polytechnique.fr/~aftalion/ (A. Aftalion), http://www.ann.jussieu.fr/~blanc/ (X. Blanc), http://www.math.jussieu.fr/~lerner/ (N. Lerner). 0022-1236/$ – see front matter © 2009 Elsevier Inc. All rights reserved. doi:10.1016/j.jfa.2009.01.002
754
A. Aftalion et al. / Journal of Functional Analysis 257 (2009) 753–806
Keywords: Bose–Einstein condensates; Bargmann spaces; Metaplectic transformation; Theta functions; Abrikosov lattice
Contents
1. Introduction
1.1. The physics problem and its mathematical formulation
1.2. The isotropic lowest Landau level
1.3. Sketch of some preliminary reductions in the anisotropic case
1.4. Main results
2. Quadratic Hamiltonians
2.1. On positive definite quadratic forms on symplectic spaces
2.2. Generating functions
2.3. Effective diagonalization
3. Quantization
3.1. The Irving E. Segal formula
3.2. The metaplectic group and the generating functions
3.3. Explicit expression for M
4. The Fock–Bargmann space and the anisotropic LLL
4.1. Nonnegative quantization and entire functions
4.2. The anisotropic LLL
4.3. The energy in the anisotropic LLL
4.4. The (final) reduction to a simpler lowest Landau level
5. Weak anisotropy
5.1. Approximation results
5.2. Energy bounds
6. Strong anisotropy
6.1. Upper bound for the energy
6.2. Lower bound for the energy
6.3. Proof of Theorem 1.2
Acknowledgments
Appendix A
A.1. Glossary
A.2. Notations for the calculations of Section 2.3
A.3. Some calculations
References
1. Introduction

Bose–Einstein condensates (BEC) are a new phase of matter where various aspects of macroscopic quantum physics can be studied. Many experimental and theoretical works have emerged in the past ten years. We refer to the monographs by C.J. Pethick and H. Smith [14], L. Pitaevskii and S. Stringari [15] for more details on the physics and to A. Aftalion [2] for the mathematical aspects. Our work is motivated by experiments in the group of J. Dalibard [11] on rotating condensates: when a condensate is rotated at a sufficiently large velocity, a superfluid behaviour is detected with the observation of quantized vortices. These vortices arrange themselves on a lattice, similar to Abrikosov lattices in superconductors [1]. This fast rotation regime is of interest for its analogy with Quantum Hall physics [5,8,18].

In a previous work, A. Aftalion, X. Blanc and F. Nier [3] have addressed the mathematical aspects of fast rotating condensates in harmonic isotropic traps and gave a mathematical description of the observed vortex lattice. This was done through the minimization of the Gross–Pitaevskii energy and the introduction of Bargmann spaces to describe the lowest Landau level sets of states. Nevertheless, the experimental device leading to the realization of a rotating condensate requires an anisotropy of the trap holding the atoms, which was not taken into account in [3]. Several physics papers have addressed the behaviour of anisotropic condensates under rotation and its similarity or differences with isotropic traps. We refer the reader to the paper by A. Fetter [7], and to the related works [13,16,17]. The aim of the present article is to analyze the effect of anisotropy on the energy minimization and the vortex pattern, and in particular to derive a mathematical study of some of Fetter's computations and conjectures. Two different situations emerge according to the values of the parameters: in one case, the behaviour is similar to the isotropic case with a triangular vortex lattice; in the other case, for very large velocities, we have found a new regime where there are no vortices, and a full mathematical analysis can be performed, reducing the minimization to a 1D problem. The existence of this new regime was apparently not predicted in the physics literature. This feature relies on the analysis of the bottom of the spectrum of a specific operator whose positive lower bound prevents the condensate from shrinking in one direction, contradicting some heuristic explanations present in [7].
Our analysis is based on the symplectic reduction of the quadratic form defining the Hamiltonian (inspired by the computations of Fetter [7]), the characterization of a lowest Landau level adapted to the anisotropy and finally the study of the reduced energy in this space.

1.1. The physics problem and its mathematical formulation

Our problem comes from the study of the 3D Gross–Pitaevskii energy functional for a fast rotating Bose–Einstein condensate with $N$ particles of mass $m$ given by
$$E_{GP}(\phi) = \langle H\phi, \phi\rangle_{L^2(\mathbb R^3)} + \frac{g_{3d}N}{2}\,\|\phi\|^4_{L^4(\mathbb R^3)}, \tag{1.1}$$
where the operator $H$ is
$$H = \frac{1}{2m}\bigl(h^2D_x^2 + h^2D_y^2 + h^2D_z^2\bigr) + \frac{m}{2}\bigl(\omega_x^2x^2 + \omega_y^2y^2 + \omega_z^2z^2\bigr) - \Omega\,(xhD_y - yhD_x), \tag{1.2}$$
where $h$ is the Planck constant, $D_x = (2i\pi)^{-1}\partial_x$, $\omega_j$ is the frequency along the $j$-axis, $\Omega$ is the rotational velocity, and the coupling constant $g_{3d}$ is a positive parameter. In the particular case where $\omega_x = \omega_y$, the fast rotation regime corresponds to the case where $\Omega$ tends to $\omega_x$ and the condensate expands in the transverse direction. It has been proved [4] that the minimizer can be described at leading order by a 2D function $\psi(x, y)$, multiplied by the ground state of the harmonic oscillator in the $z$-direction (the operator $h^2/(2m)\,D_z^2 + m\omega_z^2z^2/2$), which is equal to $(2m\omega_zh^{-1})^{1/4}e^{-\pi m\omega_zh^{-1}z^2}$. This property is still true in the anisotropic case if $\omega_y \ll \omega_z$. The reduced 2D energy to study is thus
$$E(\psi) = \langle H_0\psi, \psi\rangle_{L^2(\mathbb R^2)} + \frac{g_{2d}N}{2}\,\|\psi\|^4_{L^4(\mathbb R^2)}, \tag{1.3}$$
where the operator $H_0$ is
$$H_0 = \frac{1}{2m}\bigl(h^2D_x^2 + h^2D_y^2\bigr) + \frac{m}{2}\bigl(\omega_x^2x^2 + \omega_y^2y^2\bigr) - \Omega\,(xhD_y - yhD_x), \tag{1.4}$$
and the coupling constant $g_{2d}$ takes into account the integral of the ground state in the $z$-direction:
$$g_{2d}N = \frac{gh^2}{m}, \quad\text{where $g$ is dimensionless (and $> 0$).} \tag{1.5}$$
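Since $h$ has the dimension energy × time, it is consistent to assume that the wave function $\psi$ has the dimension 1/length, with the normalization $\|\psi\|_{L^2(\mathbb R^2)} = 1$. We define the mean square oscillator frequency $\omega_\perp$ by
$$\omega_\perp^2 = \frac12\bigl(\omega_x^2 + \omega_y^2\bigr)$$
and the function $u$ by
$$\psi(x, y) = h^{-1/2}m^{1/2}\omega_\perp^{1/2}\; u\bigl(h^{-1/2}m^{1/2}\omega_\perp^{1/2}x,\; h^{-1/2}m^{1/2}\omega_\perp^{1/2}y\bigr), \tag{1.6}$$
so that
$$\|u\|_{L^2(\mathbb R^2)} = \|\psi\|_{L^2(\mathbb R^2)} = 1, \qquad g_{2d}N\,\|\psi\|^4_{L^4(\mathbb R^2)} = gh\omega_\perp\,\|u\|^4_{L^4(\mathbb R^2)}.$$
We also note that the dimension of $h^{-1/2}m^{1/2}\omega_\perp^{1/2}$ is 1/length, so that
$$x_1 = h^{-1/2}m^{1/2}\omega_\perp^{1/2}x, \qquad x_2 = h^{-1/2}m^{1/2}\omega_\perp^{1/2}y, \qquad u(x_1, x_2)$$
are dimensionless.

Assuming $\omega_x^2 \le \omega_y^2$, we use the dimensionless parameter $\nu$ to write
$$\omega_x^2 = (1 - \nu^2)\,\omega_\perp^2, \qquad \omega_y^2 = (1 + \nu^2)\,\omega_\perp^2,$$
and we get immediately
$$\frac{1}{h\omega_\perp}E(\psi) = \frac12\|D_1u\|^2_{L^2(\mathbb R^2)} + \frac12\|D_2u\|^2_{L^2(\mathbb R^2)} + \frac12(1 - \nu^2)\|x_1u\|^2_{L^2(\mathbb R^2)} + \frac12(1 + \nu^2)\|x_2u\|^2_{L^2(\mathbb R^2)} - \frac{\Omega}{\omega_\perp}\bigl\langle (x_1D_2 - x_2D_1)u, u\bigr\rangle_{L^2(\mathbb R^2)} + \frac g2\|u\|^4_{L^4(\mathbb R^2)}.$$
Finally, we have
$$\frac{1}{h\omega_\perp}E(\psi) := E_{GP}(u) = \langle Hu, u\rangle + \frac g2\|u\|^4_{L^4(\mathbb R^2)}, \tag{1.7}$$
$$2H = D_1^2 + D_2^2 + (1 - \nu^2)x_1^2 + (1 + \nu^2)x_2^2 - 2\omega\,(x_1D_2 - x_2D_1), \qquad \omega = \frac{\Omega}{\omega_\perp}, \tag{1.8}$$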
where $\omega, \nu, u, g$ are all dimensionless and $\|u\|_{L^2(\mathbb R^2)} = 1$. The minimization of this functional is the mathematical problem that we address in this paper. The Euler–Lagrange equation for the minimization of $E_{GP}(u)$, under the constraint $\|u\|_{L^2(\mathbb R^2)} = 1$, is
$$Hu + g|u|^2u = \lambda u, \tag{1.9}$$
where $\lambda$ is the Lagrange multiplier. We shall always assume that $\Omega^2 \le \omega_x^2$, i.e. $\omega^2 + \nu^2 \le 1$, and define the dimensionless parameter $\varepsilon$ by
$$\omega^2 + \nu^2 + \varepsilon^2 = 1. \tag{1.10}$$
The fast rotation regime occurs when the ratio $\Omega^2/\omega_x^2$ tends to $1^-$, i.e. $\varepsilon$ tends to 0. Summarizing and reformulating our reduction, we have
$$E_{GP}(u) = \frac12\bigl\langle q^w_{\omega,\nu,\varepsilon}u, u\bigr\rangle_{L^2(\mathbb R^2)} + \frac g2\int_{\mathbb R^2}|u|^4\,dx, \tag{1.11}$$
where $q_{\omega,\nu,\varepsilon}$ is the quadratic form
$$q_{\omega,\nu,\varepsilon}(x_1, x_2, \xi_1, \xi_2) = \xi_1^2 + \xi_2^2 + (1 - \nu^2)x_1^2 + (1 + \nu^2)x_2^2 - 2\omega\,(x_1\xi_2 - x_2\xi_1), \tag{1.12}$$
which depends on the real parameters $\omega, \nu, \varepsilon$ such that$^1$ (1.10) holds. Here $q^w_{\omega,\nu,\varepsilon}$ is the operator with Weyl symbol $q_{\omega,\nu,\varepsilon}$, that is:
$$q^w_{\omega,\nu,\varepsilon} = D_1^2 + D_2^2 + (1 - \nu^2)x_1^2 + (1 + \nu^2)x_2^2 - 2\omega\,(x_1D_2 - x_2D_1), \tag{1.13}$$
where $D_j = \partial_j/(2i\pi)$. We would like to minimize the energy $E_{GP}(u)$ under the constraint $\|u\|_{L^2} = 1$ and understand what is happening when $\varepsilon \to 0$.

1.2. The isotropic lowest Landau level

When the harmonic trap is isotropic, i.e. when $\nu = 0$, it turns out that, since $\omega^2 + \varepsilon^2 = 1$,
$$q = q_{\omega,0,\varepsilon} = (\xi_1 + \omega x_2)^2 + (\xi_2 - \omega x_1)^2 + \varepsilon^2\bigl(x_1^2 + x_2^2\bigr) \tag{1.14}$$
1 Of course there is no loss of generality in assuming that $\varepsilon, \nu$ are nonnegative parameters; we may also assume that $\omega \ge 0$, since the change of function $u(x_1, x_2) \mapsto u(-x_1, x_2)$ preserves the $L^4$-norm, is unitary in $L^2$, corresponds to the symplectic transformation $(x_1, x_2, \xi_1, \xi_2) \mapsto (-x_1, x_2, -\xi_1, \xi_2)$ and leads to the same problem where $\omega$ is replaced by $-\omega$.
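so that
$$E_{GP}(u) = \frac12\bigl\|(D_1 + \omega x_2)u + i(D_2 - \omega x_1)u\bigr\|^2 + \frac{\omega}{2\pi}\|u\|^2 + \frac{\varepsilon^2}{2}\bigl\||x|u\bigr\|^2 + \frac g2\int|u|^4\,dx.$$
We note that, with $z = x_1 + ix_2$,
$$D_1 + \omega x_2 + i(D_2 - \omega x_1) = \frac{1}{i\pi}\bar\partial - i\omega z = \frac{1}{i\pi}\bigl(\bar\partial + \pi\omega z\bigr),$$
hence the first term of the energy is minimized (and equal to 0) if $u \in LLL_{\omega^{-1}}$, where
$$LLL_{\omega^{-1}} = \bigl\{u \in L^2(\mathbb R^2),\ u(x) = f(z)e^{-\pi\omega|z|^2}\bigr\} = \ker(\bar\partial + \pi\omega z) \cap L^2(\mathbb R^2), \tag{1.15}$$
with $f$ holomorphic. We expect the condensate to have a large expansion, hence the term $\int|u|^4$ to be small. Thus, it is natural to minimize the energy $E_{GP}$ in $LLL_{\omega^{-1}}$. It has been proved in [4] that the restriction to $LLL_{\omega^{-1}}$ is a good approximation of the original problem, i.e. the minimization of $E_{GP}$ in $L^2(\mathbb R^2)$. We get for $u \in LLL_{\omega^{-1}}$, $\|u\|_{L^2} = 1$,
$$E_{GP}(u) = \frac12\bigl\|\underbrace{(D_1 + \omega x_2)u + i(D_2 - \omega x_1)u}_{(i\pi)^{-1}(\bar\partial + \pi\omega z)u = 0}\bigr\|^2 + \frac{\omega}{2\pi} + \frac{\varepsilon^2}{2}\bigl\||x|u\bigr\|^2 + \frac g2\int|u|^4\,dx,$$
and with $u(x) = \upsilon\bigl((\omega\varepsilon)^{1/2}x\bigr)(\omega\varepsilon)^{1/2}$ (unitary change in $L^2(\mathbb R^2)$),
$$E_{GP}(u) = \frac{\omega}{2\pi} + \frac{\varepsilon}{2\omega}\Bigl(\int|y|^2|\upsilon(y)|^2\,dy + \omega^2g\int|\upsilon(y)|^4\,dy\Bigr).$$
The minimization problem of $E_{GP}(u)$ in the space $LLL_{\omega^{-1}}$ is thus reduced to the study of
$$E_{LLL}(\upsilon) = \bigl\||x|\upsilon\bigr\|^2_{L^2} + \omega^2g\,\|\upsilon\|^4_{L^4}, \qquad \upsilon \in LLL_\varepsilon, \tag{1.16}$$
i.e. with $z = x_1 + ix_2$, $\upsilon(x_1, x_2) = f(z)e^{-\pi\varepsilon^{-1}|z|^2}$, $f$ entire (and $\upsilon \in L^2(\mathbb R^2)$). This program has been carried out in the paper [3] by A. Aftalion, X. Blanc, F. Nier. In the isotropic case, a key point is the fact that the symplectic diagonalisation of the quadratic Hamiltonian is rather simple: in fact, revisiting the formula (1.14), we obtain easily
$$q = \underbrace{\frac{1-\omega}{2}(\xi_1 - x_2)^2}_{\eta_1^2} + \underbrace{\frac{1-\omega}{2}(\xi_2 + x_1)^2}_{\mu_1^2y_1^2} + \underbrace{\frac{1+\omega}{2}(\xi_1 + x_2)^2}_{\eta_2^2} + \underbrace{\frac{1+\omega}{2}(\xi_2 - x_1)^2}_{\mu_2^2y_2^2}, \tag{1.17}$$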
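with
$$\begin{aligned}
\eta_1 &= 2^{-1/2}(1-\omega)^{1/2}(\xi_1 - x_2), & \mu_1 &= 1 - \omega, & y_1 &= 2^{-1/2}(1-\omega)^{-1/2}(\xi_2 + x_1),\\
\eta_2 &= 2^{-1/2}(1+\omega)^{1/2}(\xi_1 + x_2), & \mu_2 &= 1 + \omega, & y_2 &= 2^{-1/2}(1+\omega)^{-1/2}(x_1 - \xi_2),
\end{aligned} \tag{1.18}$$
so that the linear forms $(y_1, y_2, \eta_1, \eta_2)$ are symplectic coordinates in $\mathbb R^4$, i.e.
$$\{\eta_1, y_1\} = \{\eta_2, y_2\} = 1, \qquad \{\eta_1, \eta_2\} = \{\eta_1, y_2\} = \{\eta_2, y_1\} = \{y_1, y_2\} = 0.$$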
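As a quick sanity check of (1.17) and (1.18), the following minimal Python sketch evaluates both sides of the isotropic decomposition at a random phase-space point and computes the Poisson brackets of the four linear forms. It is only an illustration: the helper names (`linear_forms`, `bracket`, `q_direct`, `q_diagonal`) and the sample value `omega = 0.6` are our own choices, and the bracket is normalised by the assumption $\{\xi_j, x_j\} = 1$.

```python
import random

def linear_forms(omega):
    """Coefficient rows over (x1, x2, xi1, xi2) for y1, y2, eta1, eta2 as in (1.18)."""
    e1 = ((1 - omega) / 2) ** 0.5          # 2^{-1/2} (1 - omega)^{1/2}
    e2 = ((1 + omega) / 2) ** 0.5          # 2^{-1/2} (1 + omega)^{1/2}
    s1 = 1 / (2 * (1 - omega)) ** 0.5      # 2^{-1/2} (1 - omega)^{-1/2}
    s2 = 1 / (2 * (1 + omega)) ** 0.5      # 2^{-1/2} (1 + omega)^{-1/2}
    return {
        "y1": [s1, 0, 0, s1],      # s1 * (xi2 + x1)
        "y2": [s2, 0, 0, -s2],     # s2 * (x1 - xi2)
        "eta1": [0, -e1, e1, 0],   # e1 * (xi1 - x2)
        "eta2": [0, e2, e2, 0],    # e2 * (xi1 + x2)
    }

def bracket(f, g):
    """Poisson bracket of two linear forms, assuming the convention {xi_j, x_j} = 1."""
    return sum(f[2 + j] * g[j] - f[j] * g[2 + j] for j in range(2))

def q_direct(p, omega):
    """Quadratic form (1.12) with nu = 0."""
    x1, x2, xi1, xi2 = p
    return xi1**2 + xi2**2 + x1**2 + x2**2 - 2 * omega * (x1 * xi2 - x2 * xi1)

def q_diagonal(p, omega):
    """Right-hand side of (1.17): eta1^2 + mu1^2 y1^2 + eta2^2 + mu2^2 y2^2."""
    val = {name: sum(c * v for c, v in zip(row, p))
           for name, row in linear_forms(omega).items()}
    mu1, mu2 = 1 - omega, 1 + omega
    return (val["eta1"]**2 + mu1**2 * val["y1"]**2
            + val["eta2"]**2 + mu2**2 * val["y2"]**2)

omega = 0.6
forms = linear_forms(omega)
point = [random.uniform(-1, 1) for _ in range(4)]
gap = abs(q_direct(point, omega) - q_diagonal(point, omega))
print("difference between (1.12) and (1.17):", gap)
print("{eta1, y1} =", bracket(forms["eta1"], forms["y1"]))
print("{eta2, y2} =", bracket(forms["eta2"], forms["y2"]))
```

Any other $\omega \in (0, 1)$ can be substituted; the two evaluations of $q$ should agree up to rounding.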
In [3], an upper bound for the energy is constructed with a test function which is also an “almost” solution to the Euler–Lagrange equation corresponding to the minimization of (1.16) in $LLL_\varepsilon$. This almost solution displays a triangular vortex lattice in a central region of the condensate and is constructed using a Jacobi Theta function, which is modulated by an inverted parabola profile and projected onto $LLL_\varepsilon$.

1.3. Sketch of some preliminary reductions in the anisotropic case

The analysis of the reduced energy in the anisotropic case yields two different situations: one is similar to the isotropic case and the other one is quite different, without vortices. To tackle the non-isotropic case where $\nu > 0$ in (1.13), one would like to determine a space playing the role of the LLL and taking into account the anisotropy.

Step 1. Symplectic reduction of the quadratic form $q_{\omega,\nu,\varepsilon}$. Given the quadratic form $q_{\omega,\nu,\varepsilon}$ of (1.12), identified with a 4 × 4 symmetric matrix, we define its fundamental matrix by the identity $F = -\sigma^{-1}q_{\omega,\nu,\varepsilon} = \sigma q_{\omega,\nu,\varepsilon}$, where
$$\sigma = \begin{pmatrix} 0 & I_2 \\ -I_2 & 0 \end{pmatrix}$$
is the symplectic matrix given in 2 × 2 blocks. The properties of the eigenvalues and eigenvectors of $F$ allow us to find a symplectic reduction for $q_{\omega,\nu,\varepsilon}$.

Step 2. Determination of the anisotropic LLL. The anisotropic equivalent of the LLL can be determined explicitly, thanks to the results of the first step. We find that it is the subspace of functions $u$ of $L^2(\mathbb R^2)$ such that
$$u(x_1, x_2) = f(x_1 + i\beta_2x_2)\,\exp\Bigl(-\frac{\gamma\pi}{4\beta_2}\Bigl[x_1^2\Bigl(1 - \frac{\nu^2}{2\alpha}\Bigr) + (\beta_2x_2)^2\Bigl(1 + \frac{\nu^2}{2\alpha}\Bigr)\Bigr]\Bigr)\exp\Bigl(-i\frac{\pi\nu^2\gamma}{4\alpha}x_1x_2\Bigr),$$
where $f$ is entire. The positive parameters $\alpha, \gamma, \beta_2$ are defined in the text and are explicitly known in terms of $\omega, \nu$. We also determine an operator $M$, which can be used to give an explicit expression for the isomorphism between $L^2(\mathbb R)$ and the anisotropic LLL as well as to express the Gross–Pitaevskii energy in the new symplectic coordinates.
Step 3. Rescaling. Introducing a new set of parameters (ω, ν, are positive satisfying (1.10), g > 0 given by (1.5)), κ12
2 2 1+ = 2ν + κ=
κ1 , β2
g0 =
2ν 2 , α − ν 2 + ω2
g1 γ 2 , 4β2
α=
ν 4 + 4ω2 ,
g1 = g
2α 2ωμ2 , , β2 = ω α + 2ω2 + ν 2
γ=
α + 2ω2 + ν 2 , 2α
μ2 = 1 + ω2 + α,
(1.19) (1.20)
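we show that, after some rescaling, the minimization of the full energy $E_{GP}(u)$ of (1.11) can be reduced to the minimization of
$$E(u) = \int_{\mathbb R^2}\Bigl(\frac12\bigl(\varepsilon^2x_1^2 + \kappa^2x_2^2\bigr)|u|^2 + \frac{g_0}{2}|u|^4\Bigr)\,dx \tag{1.21}$$
on the space
$$\Lambda_0 = \bigl\{u \in L^2(\mathbb R^2),\ u(x_1, x_2) = f(z)e^{-\pi|z|^2/2},\ f \text{ holomorphic},\ z = x_1 + ix_2\bigr\}. \tag{1.22}$$
The point is that, after some scaling, we are able to come back to an isotropic space. The orthogonal projection $\Pi_0$ of $L^2(\mathbb R^2)$ onto $\Lambda_0$ is explicit and simple:
$$(\Pi_0u)(x) = \int_{\mathbb R^2}e^{-\frac\pi2|x-y|^2 + i\pi(x_2y_1 - y_2x_1)}\,u(y)\,dy. \tag{1.23}$$
We are thus reduced to the following problem: with $E(u)$ given by (1.21), study
$$I(\varepsilon, \kappa) = \inf\bigl\{E(u),\ u \in \Lambda_0,\ \|u\|_{L^2(\mathbb R^2)} = 1\bigr\}. \tag{1.24}$$
The minimization of $E$ without the holomorphy constraint yields
$$|u|^2 = \frac{2}{\pi R_1R_2}\Bigl(1 - \frac{x_1^2}{R_1^2} - \frac{x_2^2}{R_2^2}\Bigr), \quad\text{where } R_1 = \Bigl(\frac{4g_0\kappa}{\pi\varepsilon^3}\Bigr)^{1/4},\quad R_2 = \Bigl(\frac{4g_0\varepsilon}{\pi\kappa^3}\Bigr)^{1/4}. \tag{1.25}$$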
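To make the inverted parabola (1.25) concrete, here is a small Python sketch that computes the two radii and checks by a crude midpoint rule that the profile has unit mass on its support (the positive part of the parabola). The function names `tf_radii` and `tf_mass`, the grid size and the sample values of $\varepsilon$, $\kappa$, $g_0$ are illustrative assumptions, not taken from the text.

```python
import math

def tf_radii(eps, kappa, g0):
    """Radii R1, R2 of the inverted-parabola profile (1.25)."""
    r1 = (4 * g0 * kappa / (math.pi * eps**3)) ** 0.25
    r2 = (4 * g0 * eps / (math.pi * kappa**3)) ** 0.25
    return r1, r2

def tf_mass(eps, kappa, g0, n=400):
    """Midpoint-rule integral of the positive part of (1.25) over [-R1, R1] x [-R2, R2]."""
    r1, r2 = tf_radii(eps, kappa, g0)
    h1, h2 = 2 * r1 / n, 2 * r2 / n
    total = 0.0
    for i in range(n):
        x1 = -r1 + (i + 0.5) * h1
        for j in range(n):
            x2 = -r2 + (j + 0.5) * h2
            s = 1 - (x1 / r1) ** 2 - (x2 / r2) ** 2
            if s > 0:  # the profile is only meaningful where it is nonnegative
                total += 2 / (math.pi * r1 * r2) * s * h1 * h2
    return total

eps, kappa, g0 = 0.05, 0.2, 1.0   # sample values, chosen only for illustration
R1, R2 = tf_radii(eps, kappa, g0)
mass = tf_mass(eps, kappa, g0)
print("R1 =", R1, "R2 =", R2)
print("mass of the profile:", mass)
print("eps*R1 - kappa*R2 =", eps * R1 - kappa * R2)
```

The last line reflects the fact that, from the formulas in (1.25), $\varepsilon R_1 = \kappa R_2$: the potential $\frac12(\varepsilon^2x_1^2 + \kappa^2x_2^2)$ takes the same value at the two ends of the ellipse.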
As $\varepsilon$ tends to 0, $R_1$ always tends to infinity (in fact $R_1 \gtrsim \varepsilon^{-1/2}$), but the behaviour of $R_2$ depends on the respective values of $\varepsilon$ and $\kappa$, that is of $\varepsilon$ and $\nu$.

Step 4. Sorting out the various regimes. Recalling that the positive parameter $\nu$ stands for the anisotropy, we find two regimes:
• $\nu \ll \varepsilon^{1/3}$ (weak anisotropy): $R_2 \to \infty$ (in fact, $R_2^{4/3} \approx \min(\varepsilon^{-2/3}, \varepsilon^{1/3}\nu^{-1})$). Numerical simulations (Fig. 1) show a triangular vortex lattice. The behaviour is similar to the isotropic case except that the inverted parabola profile (1.25) takes into account the anisotropy. We will construct an approximate minimizer.
• $\nu \gg \varepsilon^{1/3}$ (strong anisotropy): $R_2 \to 0$ (in fact $R_2^{4/3} \approx \varepsilon^{1/3}\nu^{-1}$). Numerical simulations (Fig. 2) show that there are no vortices in the bulk; the behaviour is an inverted parabola in the $x_1$ direction and a fixed Gaussian in the $x_2$ direction. Thus, the size of the condensate does not shrink in the $x_2$ direction and (1.25) is not a good approximation of the minimizer. The shrinking of the condensate in the $x_2$ direction is not allowed in $\Lambda_0$ (see (1.22)) because the operator $x_2^2$ is bounded from below in that space by a positive constant and the first eigenfunction is a Gaussian in the $x_2$ direction. We find an asymptotic 1D problem (upper and lower bounds match) which yields a separation of variables.

Fig. 1. Plot of the zeroes of the minimizer (left) and the density (right) for $\varepsilon^2 = 0.002$, $\nu = 0.03$. Triangular vortex lattice in an anisotropic trap.

Fig. 2. Plot of the zeroes of the minimizer (left) and the density (right) for $\varepsilon^2 = 0.002$, $\nu = 0.73$. No vortex in the visible region.
1.4. Main results

1.4.1. Weakly anisotropic case

In a first step,$^2$ we assume that, with $\kappa$ given by (1.20),
$$\varepsilon \le \kappa \ll \varepsilon^{1/3}. \tag{1.26}$$
The isotropic case is recovered by assuming $\kappa = \varepsilon$. This case is similar to the isotropic case and we derive similar results to the paper [3], namely an upper bound given by the Theta function, but we lack a good lower bound. We recall that the Jacobi Theta function $\Theta(z, \tau)$ associated to a lattice $\mathbb Z \oplus \mathbb Z\tau$ is a holomorphic function which vanishes exactly once in any lattice cell and is defined by
$$\Theta(z, \tau) = \frac1i\sum_{n=-\infty}^{+\infty}(-1)^ne^{i\pi\tau(n+1/2)^2}e^{(2n+1)\pi iz}, \qquad z \in \mathbb C. \tag{1.27}$$
This function allows us to construct a periodic function on the same lattice: $u_\tau$ is defined by
$$u_\tau(x_1, x_2) = e^{\frac\pi2(z^2 - |z|^2)}\,\Theta(\sqrt{\tau_I}\,z, \tau), \qquad z = x_1 + ix_2,\quad \tau = \tau_R + i\tau_I, \tag{1.28}$$
$|u_\tau|$ is periodic over the lattice $\mathbb Z \oplus \tau\mathbb Z$, and $u_\tau$ satisfies
$$\Pi_0\bigl(|u_\tau|^2u_\tau\bigr) = \lambda_\tau u_\tau, \tag{1.29}$$
with
$$\lambda_\tau = \frac{\fint|u_\tau|^4}{\fint|u_\tau|^2} = \frac{\gamma(\tau)}{\sqrt{2\tau_I}}, \tag{1.30}$$
where $\fint$ is the mean integral on a cell and
$$\gamma(\tau) := \frac{\fint|u_\tau|^4}{\bigl(\fint|u_\tau|^2\bigr)^2}. \tag{1.31}$$
The minimization of $\gamma(\tau)$ on all possible $\tau$ corresponds to the Abrikosov problem. It turns out that the properties of the Theta function allow us to derive that
$$\gamma(\tau) = \sum_{(j,k)\in\mathbb Z^2}e^{-\frac{\pi}{\tau_I}|j\tau - k|^2}$$
and prove (see [3]) that $\tau \mapsto \gamma(\tau)$ is minimized for $\tau = j = e^{2i\pi/3}$, which corresponds to the hexagonal lattice. The minimum is
$$b = \gamma(j) \approx 1.1596. \tag{1.32}$$
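The lattice sum for $\gamma(\tau)$ converges very fast, so the value of $b$ can be reproduced with a few lines of Python. The sketch below truncates the sum to $|j|, |k| \le 12$ (an arbitrary cutoff of ours, far beyond what is needed) and compares the hexagonal lattice with the square lattice $\tau = i$, which is included only as a point of comparison.

```python
import cmath
import math

def gamma(tau, cutoff=12):
    """Truncated lattice sum  sum_{(j,k)} exp(-pi/tau_I * |j*tau - k|^2)."""
    tau_i = tau.imag
    return sum(
        math.exp(-math.pi / tau_i * abs(j * tau - k) ** 2)
        for j in range(-cutoff, cutoff + 1)
        for k in range(-cutoff, cutoff + 1)
    )

hexagonal = gamma(cmath.exp(2j * math.pi / 3))   # tau = j = e^{2 i pi / 3}
square = gamma(1j)                               # tau = i, for comparison only
print("gamma(hexagonal) =", hexagonal)
print("gamma(square)    =", square)
```

The hexagonal value agrees with $b \approx 1.1596$ in (1.32), and the square lattice gives a strictly larger value, consistent with the minimality of $\tau = j$ stated above.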
2 We shall see that $\kappa \approx \nu + \varepsilon$ in the sense that the ratio $\kappa/(\nu + \varepsilon)$ is bounded above and below by some fixed positive constants, so that the weakly anisotropic case is indeed $\nu \ll \varepsilon^{1/3}$.
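The function $u_\tau$ allows us to construct the vortex lattice and we multiply it by the proper inverted parabola to get a good upper bound:

Theorem 1.1. We have for $I(\varepsilon, \kappa)$ defined in (1.24), $b$ given in (1.32), $\kappa$ in (1.20),
$$\frac23\sqrt{\frac{2g_0\varepsilon\kappa}{\pi}} < I(\varepsilon, \kappa) \le \frac23\sqrt{\frac{2g_0b\varepsilon\kappa}{\pi}} + O\Bigl(\sqrt{\varepsilon\kappa}\,\Bigl(\frac{\kappa^3}{\varepsilon}\Bigr)^{1/8}\Bigr), \tag{1.33}$$
when $(\varepsilon, \kappa\varepsilon^{-1/3}) \to (0, 0)$. Moreover, the following function provides the upper bound:
$$v = \Pi_0(u_\tau\rho), \tag{1.34}$$
where $u_\tau$ is defined by (1.28) with $\tau = e^{\frac{2i\pi}{3}}$ and
$$\rho(x)^2 = \frac{2}{\pi\sqrt b\,R_1R_2}\Bigl(1 - \frac{x_1^2}{\sqrt b\,R_1^2} - \frac{x_2^2}{\sqrt b\,R_2^2}\Bigr)_+, \qquad R_1 = \Bigl(\frac{4g_0\kappa}{\pi\varepsilon^3}\Bigr)^{1/4},\quad R_2 = \Bigl(\frac{4g_0\varepsilon}{\pi\kappa^3}\Bigr)^{1/4}.$$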
We expect $v$ to be a good approximation of the minimizer and the energy asymptotics to match the right-hand side of (1.33). Thus, the lower bound is not optimal (it does not include $b$). In addition, the test function (1.34) (with a general $\tau \ne j$ a priori) gives the upper bound of (1.33) with $\gamma(\tau)$ instead of $b$. The proof is a refinement of that in [3].

1.4.2. Strong anisotropy

In the case where the rotation is fast enough in the sense that
$$\kappa \gg \varepsilon^{1/3} \tag{1.35}$$
we have found a regime unknown to physicists where vortices disappear and the problem can in fact be reduced to a 1D energy.

Theorem 1.2. For $I(\varepsilon, \kappa)$ defined in (1.24), $b$ given in (1.32), $\kappa$ in (1.20), we have
$$\lim_{(\varepsilon,\ \varepsilon^{1/3}\kappa^{-1}) \to (0,0)}\frac{I(\varepsilon, \kappa) - \frac{\kappa^2}{8\pi}}{\varepsilon^{2/3}} = J, \tag{1.36}$$
where
$$J = \inf\Bigl\{\int_{\mathbb R}\Bigl(\frac12t^2p(t)^2 + \frac{g_0}{2}p(t)^4\Bigr)dt,\ p \text{ real-valued} \in L^2(\mathbb R) \cap L^4(\mathbb R),\ \|p\|_{L^2(\mathbb R)} = 1\Bigr\}. \tag{1.37}$$
In addition, if $u$ is a minimizer of $I(\varepsilon, \kappa)$, then
$$\frac{1}{\varepsilon^{1/3}}\,u\Bigl(\frac{x_1}{\varepsilon^{2/3}}, x_2\Bigr) \longrightarrow 2^{1/4}e^{-\pi x_2^2}p(x_1), \tag{1.38}$$
in $L^2(\mathbb R^2) \cap L^4(\mathbb R^2)$, where $p$ is the minimizer of $J$.
Note that the minimizer $p$ of (1.37) is explicit:
$$p(t)^2 = \frac{3}{4R}\Bigl(1 - \frac{t^2}{R^2}\Bigr)_+, \qquad R = \Bigl(\frac{3g_0}{2}\Bigr)^{1/3}.$$
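The explicit profile can be checked numerically. The Python sketch below, with a sample coupling `g0 = 2.0` chosen arbitrarily, verifies that $p^2$ has unit mass, that $\frac12t^2 + g_0p(t)^2$ is constant on the support (the Euler–Lagrange condition for the functional in (1.37)), and evaluates the functional on this profile. The closed form $3R^2/10$ used for comparison is our own elementary computation, not a formula quoted from the text.

```python
def radius(g0):
    """R = (3 g0 / 2)^{1/3}."""
    return (1.5 * g0) ** (1.0 / 3.0)

def p_squared(t, g0):
    """p(t)^2 = 3/(4R) * (1 - t^2/R^2)_+ ."""
    r = radius(g0)
    return max(0.0, 3.0 / (4.0 * r) * (1.0 - (t / r) ** 2))

def integrate(f, a, b, n=20000):
    """Composite midpoint rule on [a, b]."""
    h = (b - a) / n
    return sum(f(a + (i + 0.5) * h) for i in range(n)) * h

g0 = 2.0                      # sample coupling, chosen only for illustration
R = radius(g0)
mass = integrate(lambda t: p_squared(t, g0), -R, R)
energy = integrate(
    lambda t: 0.5 * t * t * p_squared(t, g0) + 0.5 * g0 * p_squared(t, g0) ** 2,
    -R, R,
)
# Euler-Lagrange check: t^2/2 + g0 * p(t)^2 should equal R^2/2 inside the support.
multiplier_spread = max(
    abs(0.5 * t * t + g0 * p_squared(t, g0) - 0.5 * R * R)
    for t in [-0.9 * R, -0.3 * R, 0.0, 0.5 * R, 0.99 * R]
)
print("mass =", mass)
print("energy =", energy, " vs 3 R^2 / 10 =", 0.3 * R * R)
print("spread of the multiplier on the support:", multiplier_spread)
```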
A few words about the proof of Theorem 1.2. The first point is that the operator $\Pi_0x_2^2\Pi_0$ (see (1.22), (1.23)) is bounded from below by a positive constant:
$$\forall u \in \Lambda_0, \qquad \int_{\mathbb R^2}x_2^2|u|^2 \ge \frac{1}{4\pi}\int_{\mathbb R^2}|u|^2.$$
This is proven in Lemma 4.4 below. Actually, the spectrum of this operator is purely continuous, and any Weyl sequence associated with the value $1/(4\pi)$ converges (up to renormalization) to the function
$$u_0(x_1, x_2) = \exp\bigl(-\pi x_2^2 + i\pi x_1x_2\bigr),$$
which satisfies the equation
$$\Pi_0\bigl(x_2^2u_0\bigr) = \frac{1}{4\pi}u_0. \tag{1.39}$$
This gives the lower bound
$$I(\varepsilon, \kappa) \ge \frac{\kappa^2}{8\pi},$$
and indicates that in order to be close to this lower bound, a test function should be close to (1.39). Thus, the second point is to construct a test function having the same behaviour as (1.39) in $x_2$, and a large extension in $x_1$. This is done by using the function
$$u_1(x_1, x_2) = \frac{1}{2^{1/4}}\,e^{-\frac\pi2x_2^2}\int_{\mathbb R}e^{-\frac\pi2\bigl((x_1 - y_1)^2 - 2iy_1x_2\bigr)}\rho(y_1)\,dy_1,$$
which is equal to $\Pi_0(\rho(x_1)\delta_0(x_2))$, where $\delta_0$ is the Dirac delta function and $\rho$ any real-valued function of one variable. This test function is then proved to be close to $2^{1/4}e^{-\pi x_2^2}\rho(x_1)$, which allows us to compute its energy, and gives the upper bound, provided that $\rho(t) = \varepsilon^{1/3}p(\varepsilon^{2/3}t)$, where $p$ is the minimizer of (1.37). Finally, in order to prove the lower bound, we first extract bounds on the minimizer from the energy, which allow us to pass to the limit in the equation (after rescaling as in (1.38)), and hence to prove that the limit is the right-hand side of (1.38). This uses the fact that the energy appearing in (1.37) is strictly convex, hence that any critical point is the unique minimizer.

The paper is organized as follows: in Section 2, we review some standard facts on positive definite quadratic forms in a symplectic space. This allows us, in Section 3, to construct a symplectic mapping $\chi$, which yields a simplification of the quadratic form $q$. In Section 4, quantizing that symplectic mapping in a metaplectic transformation, we find the expression of the LLL and manage to reach the reduced form of the energy (Proposition 4.5). Section 5 is devoted to the proof of Theorem 1.1 and Section 6 to Theorem 1.2.
Open questions. We have no information on the intermediate regime where, for instance, $\varepsilon^{1/3}/\kappa$ converges to some constant $R_0^{4/3}$ (in that case, $R_1 \approx \varepsilon^{-2/3}$, $R_2 \approx R_0$). We expect that the extension in the $x_2$ direction depends on $R_0$ and wonder whether the condensate has a finite number of vortex lines. We have not determined the limiting problem.

2. Quadratic Hamiltonians

We first review some standard facts on positive definite quadratic forms in a symplectic space.

2.1. On positive definite quadratic forms on symplectic spaces

We consider the phase space $\mathbb R^n_x \times \mathbb R^n_\xi$, equipped with its canonical symplectic structure: the symplectic form $\sigma$ is a bilinear alternate form on $\mathbb R^{2n}$ given by
$$\sigma\bigl((x, \xi); (y, \eta)\bigr) = \xi\cdot y - \eta\cdot x = \langle\sigma X, Y\rangle, \tag{2.1}$$
$$\text{with } X = \begin{pmatrix}x\\ \xi\end{pmatrix},\quad Y = \begin{pmatrix}y\\ \eta\end{pmatrix},\quad \sigma = \begin{pmatrix}0 & I_n\\ -I_n & 0\end{pmatrix}, \tag{2.2}$$
where the form $\sigma$ is identified with the 2n × 2n matrix above given in n × n blocks. The symplectic group $Sp(n)$ (a subgroup of $Sl(2n, \mathbb R)$) is defined by the equation on the 2n × 2n matrix $\chi$,
$$\chi^*\sigma\chi = \sigma, \quad\text{i.e. } \forall X, Y \in \mathbb R^{2n},\quad \langle\sigma\chi X, \chi Y\rangle = \langle\sigma X, Y\rangle. \tag{2.3}$$
The following lemma is classical (see e.g. the chapter XXI in [9], or [12]).

Lemma 2.1. Let $B \in GL(n, \mathbb R)$ and let $A, C$ be n × n real symmetric matrices. Then the matrix $\Xi$, given by n × n blocks
$$\Xi_{A,B,C} = \begin{pmatrix}B^{-1} & -B^{-1}C\\ AB^{-1} & B^* - AB^{-1}C\end{pmatrix} = \begin{pmatrix}I & 0\\ A & I\end{pmatrix}\begin{pmatrix}B^{-1} & 0\\ 0 & B^*\end{pmatrix}\begin{pmatrix}I & -C\\ 0 & I\end{pmatrix} \tag{2.4}$$
belongs to $Sp(n)$. Any element of $Sp(n)$ can be written as a product $\Xi_{A_1,B_1,C_1}\Xi_{A_2,B_2,C_2}$.

N.B. The first statement is easy to verify directly and we shall not use the last statement, which is nevertheless an interesting piece of information. For a symplectic mapping $\Xi$, to be of the form above is equivalent to the assumption that the mapping $x \mapsto \mathrm{pr}_1\Xi(x \oplus 0)$ is invertible from $\mathbb R^n$ to $\mathbb R^n$.

Given a quadratic form $Q$ on $\mathbb R^{2n}$, identified with a symmetric 2n × 2n matrix, we define its fundamental matrix $F$ by the identity
$$F = -\sigma^{-1}Q = \sigma Q, \quad\text{so that for } X, Y \in \mathbb R^{2n},\quad \langle\sigma Y, FX\rangle = \langle QY, X\rangle.$$
The following proposition is classical (see e.g. Theorem 21.5.3 in [9]).
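Proposition 2.2. Let $Q$ be a positive definite quadratic form on the symplectic $\mathbb R^n_x \times \mathbb R^n_\xi$. One can find $\chi \in Sp(n)$ such that with $\mathbb R^{2n} \ni X = \chi Y$, $Y = (y_1, \dots, y_n, \eta_1, \dots, \eta_n)$,
$$\langle QX, X\rangle = \langle Q\chi Y, \chi Y\rangle = \sum_{1\le j\le n}\bigl(\eta_j^2 + \mu_j^2y_j^2\bigr), \qquad \mu_j > 0.$$
The $\{\pm i\mu_j\}_{1\le j\le n}$ are the 2n eigenvalues of the fundamental matrix, related to the 2n eigenvectors $\{e_j \pm i\varepsilon_j\}_{1\le j\le n}$. The $\{e_j, \varepsilon_j\}_{1\le j\le n}$ make a symplectic basis of $\mathbb R^{2n}$:
$$\sigma(\varepsilon_j, e_k) = \delta_{j,k}, \qquad \sigma(\varepsilon_j, \varepsilon_k) = \sigma(e_j, e_k) = 0,$$
and the symplectic planes $\Pi_j = \mathbb Re_j \oplus \mathbb R\varepsilon_j$ are orthogonal for $Q$.

N.B. A one-line proof of these classical facts: on $\mathbb C^{2n}$ equipped with the dot-product given by $Q$, diagonalize the sesquilinear Hermitian form $i\sigma$.

2.2. Generating functions

We define on $\mathbb R^n \times \mathbb R^n$ the generating function $S$ of the symplectic mapping of the form $\Xi_{A,B,C}$ given in Lemma 2.1 by the identity
$$S(x, \eta) = \frac12\bigl(\langle Ax, x\rangle + 2\langle Bx, \eta\rangle + \langle C\eta, \eta\rangle\bigr). \tag{2.5}$$
We have
$$\Xi_{A,B,C}\Bigl(\frac{\partial S}{\partial\eta}, \eta\Bigr) = \Bigl(x, \frac{\partial S}{\partial x}\Bigr), \tag{2.6}$$
where both sides belong to $\mathbb R^n \times \mathbb R^n$. In fact, we see directly
$$\begin{pmatrix}I & 0\\ A & I\end{pmatrix}\begin{pmatrix}B^{-1} & 0\\ 0 & B^*\end{pmatrix}\begin{pmatrix}I & -C\\ 0 & I\end{pmatrix}\begin{pmatrix}Bx + C\eta\\ \eta\end{pmatrix} = \begin{pmatrix}I & 0\\ A & I\end{pmatrix}\begin{pmatrix}x\\ B^*\eta\end{pmatrix} = \begin{pmatrix}x\\ Ax + B^*\eta\end{pmatrix}.$$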
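The two facts just stated, that $\Xi_{A,B,C}$ is symplectic and that it maps $(\partial S/\partial\eta, \eta)$ to $(x, \partial S/\partial x)$, can be checked numerically for $n = 2$. The Python sketch below uses plain lists; the helper names and the sample matrices $A$, $B$, $C$ are illustrative choices of ours (any symmetric $A$, $C$ and invertible $B$ would do).

```python
def matmul(a, b):
    return [[sum(a[i][k] * b[k][j] for k in range(len(b))) for j in range(len(b[0]))]
            for i in range(len(a))]

def transpose(a):
    return [list(row) for row in zip(*a)]

def inv2(b):
    """Inverse of a 2x2 matrix."""
    det = b[0][0] * b[1][1] - b[0][1] * b[1][0]
    return [[b[1][1] / det, -b[0][1] / det], [-b[1][0] / det, b[0][0] / det]]

def xi_matrix(A, B, C):
    """Block matrix (2.4): [[B^-1, -B^-1 C], [A B^-1, B^T - A B^-1 C]]."""
    Bi = inv2(B)
    BiC = matmul(Bi, C)
    ABi = matmul(A, Bi)
    ABiC = matmul(ABi, C)
    Bt = transpose(B)
    top = [Bi[i] + [-v for v in BiC[i]] for i in range(2)]
    bottom = [ABi[i] + [Bt[i][j] - ABiC[i][j] for j in range(2)] for i in range(2)]
    return top + bottom

SIGMA = [[0, 0, 1, 0], [0, 0, 0, 1], [-1, 0, 0, 0], [0, -1, 0, 0]]

def symplectic_defect(chi):
    """max |chi^T sigma chi - sigma| over all entries."""
    m = matmul(matmul(transpose(chi), SIGMA), chi)
    return max(abs(m[i][j] - SIGMA[i][j]) for i in range(4) for j in range(4))

def generating_defect(A, B, C, x, eta):
    """max |Xi (dS/d eta, eta) - (x, dS/dx)| for S as in (2.5)."""
    ds_deta = [sum(B[i][j] * x[j] + C[i][j] * eta[j] for j in range(2)) for i in range(2)]
    ds_dx = [sum(A[i][j] * x[j] + B[j][i] * eta[j] for j in range(2)) for i in range(2)]
    vec = ds_deta + eta
    lhs = [sum(row[k] * vec[k] for k in range(4)) for row in xi_matrix(A, B, C)]
    return max(abs(l - r) for l, r in zip(lhs, x + ds_dx))

A = [[1.0, 2.0], [2.0, -1.0]]      # symmetric
B = [[2.0, 1.0], [0.0, 1.0]]       # invertible
C = [[0.0, 3.0], [3.0, 1.0]]       # symmetric
sympl = symplectic_defect(xi_matrix(A, B, C))
gen = generating_defect(A, B, C, [0.3, -0.7], [1.1, 0.4])
print("symplectic defect:", sympl)
print("generating-function defect:", gen)
```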
Given a positive definite quadratic form $Q$ on $\mathbb R^{2n}$, identified with a symmetric 2n × 2n matrix, we know from Proposition 2.2 that there exists $\chi \in Sp(n)$ such that
$$\chi^*Q\chi = \begin{pmatrix}\mu^2 & 0\\ 0 & I_n\end{pmatrix}, \qquad \mu^2 = \mathrm{diag}\bigl(\mu_1^2, \dots, \mu_n^2\bigr).$$
Looking for $\chi = \Xi_{A,B,C}$ given by a generating function $S$ as above, we end up (using the notation $q(X) = \langle QX, X\rangle$ with $X \in \mathbb R^{2n}$) with the equation
$$q(x, \partial_xS) = \|\mu\partial_\eta S\|^2 + \|\eta\|^2, \qquad \mu\partial_\eta S = (\mu_j\partial_{\eta_j}S)_{1\le j\le n} \in \mathbb R^n,$$
where $(x, \partial_xS) \in \mathbb R^n \times \mathbb R^n$ and $\|\cdot\|$ stands for the standard Euclidean norm on $\mathbb R^n$. This means
$$q(x, Ax + B^*\eta) = \|\mu(Bx + C\eta)\|^2 + \|\eta\|^2. \tag{2.7}$$
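We now return to the study of our quadratic form (1.12).

2.3. Effective diagonalization

Lemma 2.3. Let $q$ be the quadratic form on $\mathbb R^4$ given by (1.12), where $\omega, \nu, \varepsilon$ are nonnegative parameters such that $\omega^2 + \nu^2 + \varepsilon^2 = 1$. The eigenvalues of the fundamental matrix are $\pm i\mu_1, \pm i\mu_2$ with
$$0 \le \mu_1^2 = 1 + \omega^2 - \alpha \le \mu_2^2 = 1 + \omega^2 + \alpha, \qquad \alpha = \sqrt{\nu^4 + 4\omega^2}, \tag{2.8}$$
$$\mu_1^2 = \frac{2\nu^2\varepsilon^2 + \varepsilon^4}{\mu_2^2}. \tag{2.9}$$
In the isotropic case $\nu = 0$, we recover $\mu_1 = 1 - \omega$, $\mu_2 = 1 + \omega$. When $\varepsilon > 0$, we have $0 < \mu_1^2 \le \mu_2^2$ and $q$ is positive-definite. When $\varepsilon = 0$, we have $\mu_1 = 0 < \mu_2$, and $q$ is positive semi-definite with rank 2 if $\nu = 0$ and with rank 3 if $\nu > 0$.

Proof. The matrix $Q$ of $q$ is thus
$$Q = \begin{pmatrix}1 - \nu^2 & 0 & 0 & -\omega\\ 0 & 1 + \nu^2 & \omega & 0\\ 0 & \omega & 1 & 0\\ -\omega & 0 & 0 & 1\end{pmatrix}, \quad\text{and}\quad F = \sigma Q = \begin{pmatrix}0 & \omega & 1 & 0\\ -\omega & 0 & 0 & 1\\ \nu^2 - 1 & 0 & 0 & \omega\\ 0 & -\nu^2 - 1 & -\omega & 0\end{pmatrix}. \tag{2.10}$$
The characteristic polynomial $p$ of $F$ is easily seen to be even and we calculate
$$p(\lambda) = \det(F - \lambda I_4) = \lambda^4 + 2(1 + \omega^2)\lambda^2 + (1 - \omega^2)^2 - \nu^4 = \bigl(\lambda^2 + 1 + \omega^2\bigr)^2 - \bigl(\nu^4 + 4\omega^2\bigr).$$
The four eigenvalues of $F$ are thus $\pm i\sqrt{1 + \omega^2 \pm \sqrt{\nu^4 + 4\omega^2}}$, proving the first statement of the lemma. Since $(1 + \omega^2)^2 - \alpha^2 = (1 - \omega^2)^2 - \nu^4 = \varepsilon^2(2\nu^2 + \varepsilon^2)$, we get $\mu_1^2 = \varepsilon^2(2\nu^2 + \varepsilon^2)/\mu_2^2$. The statements on the cases $\nu = 0$, $\varepsilon > 0$ are now obvious. When $\varepsilon = 0 = \nu$, we have $\omega = 1$, and rank $q = 2$, as is obvious from (1.17). When $\varepsilon = 0$, $\nu > 0$, we consider the following minor determinant in $F$, cofactor of $f_{31}$,
$$\begin{vmatrix}\omega & 1 & 0\\ 0 & 0 & 1\\ -\nu^2 - 1 & -\omega & 0\end{vmatrix} = (-1)\bigl(-\omega^2 + \nu^2 + 1\bigr) = -2\nu^2 \ne 0,$$
so that rank $Q$ = rank $F$ = 3 in that case. □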
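The computation in the proof of Lemma 2.3 can be replayed numerically. The following Python sketch builds the fundamental matrix of (2.10), compares $\det(F - \lambda I_4)$ with the closed form of the characteristic polynomial, and checks that $\pm i\mu_1$, $\pm i\mu_2$ from (2.8) are roots and that (2.9) holds. The determinant routine, the function names and the sample values $\omega = 0.6$, $\nu = 0.5$ are our own illustrative choices.

```python
import math

def det(m):
    """Determinant by Laplace expansion along the first row (adequate for 4x4)."""
    if len(m) == 1:
        return m[0][0]
    return sum(
        (-1) ** j * m[0][j] * det([row[:j] + row[j + 1:] for row in m[1:]])
        for j in range(len(m))
    )

def fundamental_matrix(omega, nu):
    """F = sigma Q from (2.10), in the coordinates (x1, x2, xi1, xi2)."""
    return [
        [0, omega, 1, 0],
        [-omega, 0, 0, 1],
        [nu**2 - 1, 0, 0, omega],
        [0, -nu**2 - 1, -omega, 0],
    ]

def char_poly(omega, nu, lam):
    """det(F - lam * I_4); lam may be complex."""
    F = fundamental_matrix(omega, nu)
    return det([[F[i][j] - (lam if i == j else 0) for j in range(4)] for i in range(4)])

def closed_form(omega, nu, lam):
    return lam**4 + 2 * (1 + omega**2) * lam**2 + (1 - omega**2) ** 2 - nu**4

omega, nu = 0.6, 0.5                      # sample values with omega^2 + nu^2 < 1
eps_sq = 1 - omega**2 - nu**2
alpha = math.sqrt(nu**4 + 4 * omega**2)
mu1_sq, mu2_sq = 1 + omega**2 - alpha, 1 + omega**2 + alpha

poly_gap = max(abs(char_poly(omega, nu, t) - closed_form(omega, nu, t))
               for t in (-1.3, 0.0, 0.4, 2.0))
root_gap = max(abs(char_poly(omega, nu, 1j * math.sqrt(m))) for m in (mu1_sq, mu2_sq))
product_gap = abs(mu1_sq * mu2_sq - eps_sq * (2 * nu**2 + eps_sq))
print(poly_gap, root_gap, product_gap)
```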
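N.B. We may note here that the condition $\omega^2 + \nu^2 \le 1$ is an iff condition on the real parameters $\nu, \omega$ for the quadratic form (1.12) to be positive semi-definite. This is obvious from the expression (1.17) in the isotropic case $\nu = 0$, and more generally, the (non-symplectic) decomposition in independent linear forms
$$q = (\xi_1 + \omega x_2)^2 + (\xi_2 - \omega x_1)^2 + x_1^2\bigl(1 - \nu^2 - \omega^2\bigr) + x_2^2\bigl(1 + \nu^2 - \omega^2\bigr)$$
shows that $q$ has exactly one negative eigenvalue when $\omega^2 + \nu^2 > 1 \ge \omega^2 - \nu^2$, and exactly two negative eigenvalues when $\omega^2 - \nu^2 > 1$. As a result, when $\omega^2 + \nu^2 > 1$, the operator $q^w$ is unbounded from below.

Using now Eqs. (2.7), (1.12) and assuming that we may find a linear symplectic transformation given by a generating function (2.5), we have to find $A, B, C$ as in Lemma 2.1 with $n = 2$, so that for all $(x, \eta) \in \mathbb R^2 \times \mathbb R^2$,
$$\|Ax + B^*\eta\|^2 + \|x\|^2 + \nu^2\bigl(x_2^2 - x_1^2\bigr) - 2\omega\,x \wedge (Ax + B^*\eta) = \|\mu(Bx + C\eta)\|^2 + \|\eta\|^2,$$
with $x \wedge \xi = x_1\xi_2 - x_2\xi_1$, $\mu = \mathrm{diag}(\mu_1, \mu_2)$. At this point, we see that the previous identity forces some relationships between the matrices $A, B, C$. However, the algebra is somewhat complicated, and assuming that $B$ is diagonal and that $A, C$ are symmetric with zeroes on the diagonal leads to some simplifications and to the following results. We introduce first some parameters:
$$\beta_1 = \frac{\alpha - 2\omega^2 - \nu^2}{2\omega\mu_1} = \frac{2\omega\mu_1}{\alpha - 2\omega^2 + \nu^2} \quad\bigl(\text{since } (\alpha - 2\omega^2)^2 - \nu^4 = 4\omega^2 + 4\omega^4 - 4\omega^2\alpha = 4\omega^2\mu_1^2\bigr), \tag{2.11}$$
$$\beta_2 = \frac{\alpha + 2\omega^2 - \nu^2}{2\omega\mu_2} = \frac{2\omega\mu_2}{\alpha + 2\omega^2 + \nu^2} \quad\bigl(\text{since } (\alpha + 2\omega^2)^2 - \nu^4 = 4\omega^2 + 4\omega^4 + 4\omega^2\alpha = 4\omega^2\mu_2^2\bigr), \tag{2.12}$$
$$\gamma = \frac{2\alpha}{\omega}, \tag{2.13}$$
$$\lambda_1^2 = \frac{\mu_1}{\mu_1 + \beta_1\beta_2\mu_2} = \frac{1}{1 + \frac{\beta_1\beta_2\mu_2}{\mu_1}} = \frac{1}{1 + \frac{\alpha + 2\omega^2 - \nu^2}{\alpha - 2\omega^2 + \nu^2}} = \frac{\alpha - 2\omega^2 + \nu^2}{2\alpha}, \tag{2.14}$$
$$\lambda_2^2 = \frac{\mu_2}{\mu_2 + \beta_1\beta_2\mu_1} = \frac{1}{1 + \frac{\beta_1\beta_2\mu_1}{\mu_2}} = \frac{1}{1 + \frac{\alpha - 2\omega^2 - \nu^2}{\alpha + 2\omega^2 + \nu^2}} = \frac{\alpha + 2\omega^2 + \nu^2}{2\alpha}, \tag{2.15}$$
and we have
$$\lambda_1^2 + \lambda_2^2 = 1 + \frac{\nu^2}{\alpha}, \qquad \lambda_1^2\lambda_2^2 = \frac{(\alpha + \nu^2)^2 - 4\omega^4}{4\alpha^2}. \tag{2.16}$$
We define also
$$d = \frac{\gamma\lambda_1\lambda_2}{2}, \qquad c = \frac{\lambda_1^2 + \lambda_2^2}{2\lambda_1\lambda_2}, \quad\text{which gives}\quad cd = \frac{2\alpha(1 + \nu^2/\alpha)}{4\omega} = \frac{\alpha + \nu^2}{2\omega}. \tag{2.17}$$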
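Lemma 2.4. We define the 2 × 2 matrices
$$B = \begin{pmatrix}\lambda_1^{-1} & 0\\ 0 & \lambda_2^{-1}\end{pmatrix}, \qquad C = \begin{pmatrix}0 & d^{-1}\\ d^{-1} & 0\end{pmatrix}, \qquad A = \begin{pmatrix}0 & \frac{d}{\lambda_1\lambda_2} - cd\\ \frac{d}{\lambda_1\lambda_2} - cd & 0\end{pmatrix}.$$
The 4 × 4 matrix given with 2 × 2 blocks by
$$\chi = \Xi_{A,B,C} = \begin{pmatrix}I_2 & 0\\ A & I_2\end{pmatrix}\begin{pmatrix}B^{-1} & 0\\ 0 & B^*\end{pmatrix}\begin{pmatrix}I_2 & -C\\ 0 & I_2\end{pmatrix}$$
belongs to $Sp(2)$ and
$$\chi = \begin{pmatrix}\lambda_1 & 0 & 0 & -\frac{\lambda_1}{d}\\ 0 & \lambda_2 & -\frac{\lambda_2}{d} & 0\\ 0 & \frac{d}{\lambda_1} - \lambda_2cd & c\lambda_2 & 0\\ \frac{d}{\lambda_2} - \lambda_1cd & 0 & 0 & c\lambda_1\end{pmatrix}, \tag{2.18}$$
$$\chi^{-1} = \begin{pmatrix}c\lambda_2 & 0 & 0 & \frac{\lambda_2}{d}\\ 0 & c\lambda_1 & \frac{\lambda_1}{d} & 0\\ 0 & -\frac{d}{\lambda_2} + \lambda_1cd & \lambda_1 & 0\\ -\frac{d}{\lambda_1} + \lambda_2cd & 0 & 0 & \lambda_2\end{pmatrix}. \tag{2.19}$$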
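Proof. Lemma 2.1 gives that $\chi \in Sp(2)$ and we have also
$$\chi^{-1} = \begin{pmatrix}I_2 & C\\ 0 & I_2\end{pmatrix}\begin{pmatrix}B & 0\\ 0 & B^{*-1}\end{pmatrix}\begin{pmatrix}I_2 & 0\\ -A & I_2\end{pmatrix}.$$
The remaining part of the proof depends on the formula giving $\Xi_{A,B,C}$ in Lemma 2.1 and a direct computation whose verification is left to the reader. □

Lemma 2.5. Let $\chi$ be the symplectic matrix given by (2.18) and $Q$ be the matrix given in (2.10). Then, with $\mu_j$ given by (2.8), we have
$$\chi^*Q\chi = \mathrm{diag}\bigl(\mu_1^2, \mu_2^2, 1, 1\bigr). \tag{2.20}$$
The (tedious) proof of that lemma is given in Appendix A.3.1. Using the expression of $\chi^{-1}$ in (2.19), defining
$$\begin{pmatrix}y_1\\ y_2\\ \eta_1\\ \eta_2\end{pmatrix} = \begin{pmatrix}c\lambda_2 & 0 & 0 & \frac{\lambda_2}{d}\\ 0 & c\lambda_1 & \frac{\lambda_1}{d} & 0\\ 0 & -\frac{d}{\lambda_2} + \lambda_1cd & \lambda_1 & 0\\ -\frac{d}{\lambda_1} + \lambda_2cd & 0 & 0 & \lambda_2\end{pmatrix}\begin{pmatrix}x_1\\ x_2\\ \xi_1\\ \xi_2\end{pmatrix}, \tag{2.21}$$
we get from Lemma 2.5 the following result.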
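Lemma 2.6. For $(x_1, x_2, \xi_1, \xi_2) \in \mathbb R^4$, $(y_1, y_2, \eta_1, \eta_2) \in \mathbb R^4$ given by (2.21), we have the following identity,
$$\begin{aligned}
\mu_1^2y_1^2 + \mu_2^2y_2^2 + \eta_1^2 + \eta_2^2 &= \mu_1^2\bigl(c\lambda_2x_1 + \lambda_2d^{-1}\xi_2\bigr)^2 + \mu_2^2\bigl(c\lambda_1x_2 + \lambda_1d^{-1}\xi_1\bigr)^2\\
&\quad + \bigl((-d\lambda_2^{-1} + \lambda_1cd)x_2 + \lambda_1\xi_1\bigr)^2 + \bigl((-d\lambda_1^{-1} + \lambda_2cd)x_1 + \lambda_2\xi_2\bigr)^2\\
&= \xi_1^2 + \xi_2^2 + (1 - \nu^2)x_1^2 + (1 + \nu^2)x_2^2 - 2\omega(x_1\xi_2 - x_2\xi_1),
\end{aligned}$$
where the parameters $c, \lambda_2, d, \lambda_1$ are defined above (note that all these parameters are well-defined when $(\omega, \nu)$ are both positive with $\omega^2 + \nu^2 < 1$).

We have achieved an explicit diagonalization of the quadratic form (1.12) and, most importantly, that diagonalization is performed via a symplectic mapping. That feature will be of particular importance in our next section. Expressing the parameters in terms of $\alpha, \omega, \nu, \varepsilon$ (cf. Appendix A.2), we obtain
$$\begin{aligned}
q &= \Bigl[2^{-1/2}\alpha^{-1/2}(\alpha - 2\omega^2 + \nu^2)^{1/2}\,\xi_1 - 2^{-3/2}\omega^{-1}\alpha^{-1/2}(\alpha - 2\omega^2 + \nu^2)^{1/2}(\alpha - \nu^2)\,x_2\Bigr]^2\\
&\quad + \Bigl[\frac{(2\nu^2\varepsilon^2 + \varepsilon^4)^{1/2}}{\mu_2}\Bigl(\frac{\alpha + 2\omega^2 - \nu^2}{2\nu^2 + \varepsilon^2}\Bigr)^{1/2}2^{-1/2}\alpha^{-1/2}\,\xi_2 + \frac{(2\nu^2\varepsilon^2 + \varepsilon^4)^{1/2}}{\mu_2}\Bigl(\frac{\alpha + 2\omega^2 - \nu^2}{2\nu^2 + \varepsilon^2}\Bigr)^{1/2}(\alpha + \nu^2)\alpha^{-1/2}2^{-3/2}\omega^{-1}\,x_1\Bigr]^2\\
&\quad + \Bigl[(1 + \omega^2 + \alpha)^{1/2}2^{1/2}\alpha^{-1/2}\omega(\alpha + 2\omega^2 + \nu^2)^{-1/2}\,\xi_1 + (1 + \omega^2 + \alpha)^{1/2}(1 + \alpha^{-1}\nu^2)2^{-1/2}\alpha^{1/2}(\alpha + 2\omega^2 + \nu^2)^{-1/2}\,x_2\Bigr]^2\\
&\quad + \Bigl[2^{-1/2}\alpha^{-1/2}(\alpha + 2\omega^2 + \nu^2)^{1/2}\,\xi_2 - 2^{-3/2}\omega^{-1}\alpha^{-1/2}(\alpha - \nu^2)(\alpha + 2\omega^2 + \nu^2)^{1/2}\,x_1\Bigr]^2,
\end{aligned}$$
so that
$$\begin{aligned}
q &= \underbrace{\frac{\alpha - 2\omega^2 + \nu^2}{2\alpha}\Bigl(\xi_1 - \frac{\alpha - \nu^2}{2\omega}x_2\Bigr)^2}_{\eta_1^2} + \underbrace{\frac{\alpha + 2\omega^2 - \nu^2}{2\alpha\mu_2^2}\,\varepsilon^2\Bigl(\xi_2 + \frac{\alpha + \nu^2}{2\omega}x_1\Bigr)^2}_{\mu_1^2y_1^2}\\
&\quad + \underbrace{2\omega^2\frac{1 + \omega^2 + \alpha}{\alpha(\alpha + 2\omega^2 + \nu^2)}\Bigl(\xi_1 + \frac{\alpha + \nu^2}{2\omega}x_2\Bigr)^2}_{\mu_2^2y_2^2} + \underbrace{\frac{\alpha + 2\omega^2 + \nu^2}{2\alpha}\Bigl(\xi_2 - \frac{\alpha - \nu^2}{2\omega}x_1\Bigr)^2}_{\eta_2^2}.
\end{aligned} \tag{2.22}$$
Eq. (2.22) encapsulates most of our previous work on the diagonalization of $q$. In Appendix A.3.2, we provide another way of checking the symplectic relationships between the linear forms $y_j, \eta_l$.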
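Alongside the algebraic verifications, the diagonalization can also be tested numerically. The Python sketch below assembles $\chi$ from the parameters $\lambda_1, \lambda_2, c, d$ (as defined through (2.13) to (2.17)) and the matrix $Q$ of (2.10), then measures how far $\chi^*Q\chi$ is from $\mathrm{diag}(\mu_1^2, \mu_2^2, 1, 1)$ and $\chi^*\sigma\chi$ from $\sigma$. The helper names and the sample point $\omega = 0.6$, $\nu = 0.5$ are our own assumptions; the check is a floating-point illustration, not a proof.

```python
import math

def matmul(a, b):
    return [[sum(a[i][k] * b[k][j] for k in range(len(b))) for j in range(len(b[0]))]
            for i in range(len(a))]

def transpose(a):
    return [list(row) for row in zip(*a)]

def max_gap(m, n):
    return max(abs(m[i][j] - n[i][j]) for i in range(len(m)) for j in range(len(m[0])))

SIGMA = [[0, 0, 1, 0], [0, 0, 0, 1], [-1, 0, 0, 0], [0, -1, 0, 0]]

def build(omega, nu):
    """Return chi (2.18), Q (2.10) and (mu1^2, mu2^2) from (2.8)."""
    alpha = math.sqrt(nu**4 + 4 * omega**2)
    mu1_sq, mu2_sq = 1 + omega**2 - alpha, 1 + omega**2 + alpha
    lam1 = math.sqrt((alpha - 2 * omega**2 + nu**2) / (2 * alpha))   # (2.14)
    lam2 = math.sqrt((alpha + 2 * omega**2 + nu**2) / (2 * alpha))   # (2.15)
    gamma = 2 * alpha / omega                                        # (2.13)
    d = gamma * lam1 * lam2 / 2                                      # (2.17)
    c = (lam1**2 + lam2**2) / (2 * lam1 * lam2)                      # (2.17)
    chi = [
        [lam1, 0, 0, -lam1 / d],
        [0, lam2, -lam2 / d, 0],
        [0, d / lam1 - lam2 * c * d, c * lam2, 0],
        [d / lam2 - lam1 * c * d, 0, 0, c * lam1],
    ]
    Q = [
        [1 - nu**2, 0, 0, -omega],
        [0, 1 + nu**2, omega, 0],
        [0, omega, 1, 0],
        [-omega, 0, 0, 1],
    ]
    return chi, Q, (mu1_sq, mu2_sq)

chi, Q, (mu1_sq, mu2_sq) = build(0.6, 0.5)     # sample point with omega^2 + nu^2 < 1
target = [[mu1_sq, 0, 0, 0], [0, mu2_sq, 0, 0], [0, 0, 1, 0], [0, 0, 0, 1]]
diag_gap = max_gap(matmul(matmul(transpose(chi), Q), chi), target)
sympl_gap = max_gap(matmul(matmul(transpose(chi), SIGMA), chi), SIGMA)
print("|chi^T Q chi - diag(mu1^2, mu2^2, 1, 1)| =", diag_gap)
print("|chi^T sigma chi - sigma| =", sympl_gap)
```

Both gaps should be at the level of rounding errors for any admissible pair $(\omega, \nu)$ with $\omega > 0$.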
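We have seen in Lemma 2.3 that when $\varepsilon = 0$, $\nu > 0$, the rank of $q$ is 3, whereas its symplectic rank is 2. Indeed, when $\varepsilon = 0$ and $\nu > 0$, we have
$$q = \underbrace{\frac{\alpha - 2\omega^2 + \nu^2}{2\alpha}\Bigl(\xi_1 - \frac{\alpha - \nu^2}{2\omega}x_2\Bigr)^2}_{\eta_1^2} + \underbrace{2\omega^2\frac{1 + \omega^2 + \alpha}{\alpha(\alpha + 2\omega^2 + \nu^2)}\Bigl(\xi_1 + \frac{\alpha + \nu^2}{2\omega}x_2\Bigr)^2}_{\mu_2^2y_2^2} + \underbrace{\frac{\alpha + 2\omega^2 + \nu^2}{2\alpha}\Bigl(\xi_2 - \frac{\alpha - \nu^2}{2\omega}x_1\Bigr)^2}_{\eta_2^2}. \tag{2.23}$$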
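3. Quantization

3.1. The Irving E. Segal formula

Let $a$ be defined on $\mathbb R^n_x \times \mathbb R^n_\xi$ (say a tempered distribution on $\mathbb R^{2n}$). Its Weyl quantization is the operator, acting for instance on $u \in \mathscr S(\mathbb R^n)$,
$$(a^wu)(x) = \iint e^{2i\pi(x - x')\xi}\,a\Bigl(\frac{x + x'}{2}, \xi\Bigr)u(x')\,dx'\,d\xi. \tag{3.1}$$
In fact, the weak formula $\langle a^wu, v\rangle = \int_{\mathbb R^{2n}}a(x, \xi)\mathcal H(u, v)(x, \xi)\,dx\,d\xi$ makes sense for $a \in \mathscr S'(\mathbb R^{2n})$, $u, v \in \mathscr S(\mathbb R^n)$ since the Wigner function $\mathcal H(u, v)$ defined by
$$\mathcal H(u, v)(x, \xi) = \int e^{-2i\pi x'\xi}\,u\Bigl(x + \frac{x'}{2}\Bigr)\bar v\Bigl(x - \frac{x'}{2}\Bigr)dx'$$
belongs to $\mathscr S(\mathbb R^{2n})$ for $u, v \in \mathscr S(\mathbb R^n)$. Note also our definition of the Fourier transform $\hat u(\xi) = \int e^{-2i\pi x\cdot\xi}u(x)\,dx$ (so that $u(x) = \int e^{2i\pi x\cdot\xi}\hat u(\xi)\,d\xi$) and
$$\xi_j^wu = \frac{1}{2i\pi}\frac{\partial u}{\partial x_j} = D_ju, \qquad x_j^wu = x_ju, \qquad (x_j\xi_j)^w = \frac12(x_jD_j + D_jx_j).$$
Let $\chi$ be a linear symplectic transformation $\chi(y, \eta) = (x, \xi)$. The Segal formula (see e.g. Theorem 18.5.9 in [9]) asserts that there exists a unitary transformation $M$ of $L^2(\mathbb R^n)$, uniquely determined apart from a constant factor of modulus one, which is also an automorphism of $\mathscr S(\mathbb R^n)$ and $\mathscr S'(\mathbb R^n)$ such that, for all $a \in \mathscr S'(\mathbb R^{2n})$,
$$(a \circ \chi)^w = M^*a^wM, \tag{3.2}$$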
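providing the following commutative diagrams
$$\begin{array}{ccc}
\mathscr S(\mathbb R^n_x) & \xrightarrow{\ a^w\ } & \mathscr S'(\mathbb R^n_x)\\
{\scriptstyle M}\,\big\uparrow & & \big\downarrow\,{\scriptstyle M^*}\\
\mathscr S(\mathbb R^n_y) & \xrightarrow{\ (a\circ\chi)^w\ } & \mathscr S'(\mathbb R^n_y)
\end{array}
\qquad\text{and, if } a^w \in \mathcal L\bigl(L^2(\mathbb R^n)\bigr),\qquad
\begin{array}{ccc}
L^2(\mathbb R^n_x) & \xrightarrow{\ a^w\ } & L^2(\mathbb R^n_x)\\
{\scriptstyle M}\,\big\uparrow & & \big\downarrow\,{\scriptstyle M^*}\\
L^2(\mathbb R^n_y) & \xrightarrow{\ (a\circ\chi)^w\ } & L^2(\mathbb R^n_y)
\end{array}$$

3.2. The metaplectic group and the generating functions

For a given $\chi$, how can we determine $M$? We shall not need here the rich algebraic structure of the two-fold covering $Mp(n)$ (the metaplectic group in which live the transformations $M$) of the symplectic group $Sp(n)$. The following lemma is classical (and also easy to prove directly using the factorization of Lemma 2.1) and provides a simple expression for $M$ when the transformation $\chi$ has a generating function.

Lemma 3.1. Let $\chi = \Xi_{A,B,C}$ be the symplectic mapping given by (2.4). Then the Segal formula (3.2) holds with
$$(Mv)(x) = \int e^{2i\pi S(x,\eta)}\hat v(\eta)\,d\eta\;|\det B|^{1/2}, \tag{3.3}$$
where $S$ is given by (2.5).

3.3. Explicit expression for M

Lemma 3.2. Let $\chi$ be the symplectic transformation of $\mathbb R^4$ given by (2.18). Then the Segal formula (3.2) holds with $M$ given by
$$(Mv)(x_1, x_2) = (\lambda_1\lambda_2)^{-1/2}e^{2i\pi d((\lambda_1\lambda_2)^{-1} - c)x_1x_2}\int e^{2i\pi d^{-1}\eta_1\eta_2}\,\hat v(\eta_1, \eta_2)\,e^{2i\pi(\lambda_1^{-1}x_1\eta_1 + \lambda_2^{-1}x_2\eta_2)}\,d\eta_1\,d\eta_2, \tag{3.4}$$
$$(Mv)(x_1, x_2) = (\lambda_1\lambda_2)^{-1/2}e^{2i\pi d((\lambda_1\lambda_2)^{-1} - c)x_1x_2}\bigl(e^{2i\pi d^{-1}D_1D_2}v\bigr)\bigl(\lambda_1^{-1}x_1, \lambda_2^{-1}x_2\bigr). \tag{3.5}$$

Proof. We apply Lemmas 3.1 and 2.4, along with the fact that the mapping $Mp(n) \ni M \mapsto \chi \in Sp(n)$ is a homomorphism or, more elementarily, that (3.2) implies for $\chi_j \in Sp(n)$,
$$(a \circ \chi_2 \circ \chi_1)^w = M_1^*(a \circ \chi_2)^wM_1 = M_1^*M_2^*a^wM_2M_1.$$
The factorization of Lemma 2.4 implies that
$$(Mv)(x) = e^{i\pi\langle Ax, x\rangle}\int_{\mathbb R^2}e^{2i\pi\langle Bx, \eta\rangle}e^{i\pi\langle C\eta, \eta\rangle}\hat v(\eta)\,d\eta\;|\det B|^{1/2},$$
which gives readily the formulas above. □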
Summing-up, we have proven the following result.

Theorem 3.3. Let $q$ be the quadratic form on $\mathbb{R}^4$ given by (1.12). We define the symplectic mapping $\chi$ by (2.18) and the metaplectic mapping $M$ by (3.5). We have (the $\mu_j^2$ are given by (2.8)),
\[ (q\circ\chi)(y,\eta) = \mu_1^2y_1^2 + \mu_2^2y_2^2 + \eta_1^2 + \eta_2^2, \tag{3.6} \]
\[ (q\circ\chi)^w = M^*q^wM. \tag{3.7} \]
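We can also explicitly quantize the formulas of Lemma 2.6, to obtain$^3$
\[
q^w = \underbrace{\bigl((\lambda_1cd - d\lambda_2^{-1})x_2 + \lambda_1D_{x_1}\bigr)^2}_{(\eta_1^2)^w}
+ \underbrace{\mu_1^2\bigl(\lambda_2d^{-1}D_{x_2} + c\lambda_2x_1\bigr)^2}_{\mu_1^2(y_1^2)^w}
+ \underbrace{\bigl((\lambda_2cd - d\lambda_1^{-1})x_1 + \lambda_2D_{x_2}\bigr)^2}_{(\eta_2^2)^w}
+ \underbrace{\mu_2^2\bigl(\lambda_1d^{-1}D_{x_1} + c\lambda_1x_2\bigr)^2}_{\mu_2^2(y_2^2)^w}. \tag{3.8}
\]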
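4. The Fock–Bargmann space and the anisotropic LLL

4.1. Nonnegative quantization and entire functions

Definition 4.1. For $X, Y \in \mathbb{R}^{2n}$ we set
\[ \Pi(X,Y) = e^{-\frac{\pi}{2}|X-Y|^2}e^{-i\pi[X,Y]}, \tag{4.1} \]
where $[X,Y] = \langle\sigma X, Y\rangle$ is the symplectic form (2.1). For $v \in L^2(\mathbb{R}^n)$, we define
\[ (Wv)(y,\eta) = \langle v, \varphi_{y,\eta}\rangle_{L^2(\mathbb{R}^n)},\quad\text{with } \varphi_{y,\eta}(x) = 2^{n/4}e^{-\pi(x-y)^2}e^{2i\pi(x-\frac{y}{2})\eta}. \tag{4.2} \]
We define also
\[ \Lambda_0 = \bigl\{u \in L^2(\mathbb{R}^{2n}_{y,\eta}) \text{ such that } u = f(z)e^{-\frac{\pi}{2}|z|^2},\ z = \eta + iy,\ f \text{ entire}\bigr\}. \tag{4.3} \]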
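Proposition 4.2. The operator $\Pi_0$ with kernel $\Pi(X,Y)$ is the orthogonal projection in $L^2(\mathbb{R}^{2n})$ on $\Lambda_0$, which is a proper closed subspace of $L^2(\mathbb{R}^{2n})$, canonically isomorphic to $L^2(\mathbb{R}^n)$. We have
\[ \Lambda_0 = \operatorname{ran}W = L^2(\mathbb{R}^{2n}) \cap \ker\Bigl(\bar\partial + \frac{\pi}{2}z\Bigr), \tag{4.4} \]
\[ W^*W = \mathrm{Id}_{L^2(\mathbb{R}^n)}\quad\Bigl(\text{reconstruction formula } u(x) = \int_{\mathbb{R}^{2n}}(Wu)(Y)\varphi_Y(x)\,dY\Bigr), \tag{4.5} \]
\[ WW^* = \Pi_0,\qquad W \text{ is an isomorphism from } L^2(\mathbb{R}^n) \text{ onto } \Lambda_0. \tag{4.6} \]

$^3$ Note that for a linear form $L$ on $\mathbb{R}^{2n}$, $L^wL^w = (L^2)^w$.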
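Proof. These statements are classical (see e.g. [10]); however, since we shall need some extension of that proposition, it is useful to examine the proof. We note that $e^{-i\pi y\eta}(Wv)(y,\eta)$ is the partial Fourier transform w.r.t. $x$ of
\[ \mathbb{R}^n\times\mathbb{R}^n \ni (x,y) \mapsto v(x)2^{n/4}e^{-\pi(x-y)^2}, \]
whose $L^2(\mathbb{R}^{2n})$-norm is $\|v\|_{L^2(\mathbb{R}^n)}$, so that $W$ is isometric from $L^2(\mathbb{R}^n)$ into $L^2(\mathbb{R}^{2n})$, thus with a closed range. As a result, we have $W^*W = \mathrm{Id}_{L^2(\mathbb{R}^n)}$, $WW^*$ is selfadjoint and such that $WW^*WW^* = WW^*$: $WW^*$ is indeed the orthogonal projection on $\operatorname{ran}W$ ($\operatorname{ran}WW^* \subset \operatorname{ran}W$ and $Wu = WW^*Wu$). The straightforward computation of the kernel of $WW^*$ is left to the reader. Let us prove that $\Lambda_0 = \operatorname{ran}W$ is indeed defined by (4.3). For $v \in L^2(\mathbb{R}^n)$, we have
\[
\begin{aligned}
(Wv)(y,\eta) &= \int_{\mathbb{R}^n} v(x)2^{n/4}e^{-\pi(x-y)^2}e^{-2i\pi(x-\frac{y}{2})\eta}\,dx\\
&= \int_{\mathbb{R}^n} v(x)2^{n/4}e^{-\pi(x-y+i\eta)^2}\,dx\; e^{-\frac{\pi}{2}(y^2+\eta^2)}e^{-\frac{\pi}{2}(\eta+iy)^2},
\end{aligned}
\tag{4.7}
\]
and we see that $Wv \in L^2(\mathbb{R}^{2n}) \cap \ker(\bar\partial + \frac{\pi}{2}z)$. Conversely, if $\Phi \in L^2(\mathbb{R}^{2n}) \cap \ker(\bar\partial + \frac{\pi}{2}z)$, we have $\Phi(x,\xi) = e^{-\frac{\pi}{2}(x^2+\xi^2)}f(\xi+ix)$ with $\Phi \in L^2(\mathbb{R}^{2n})$ and $f$ entire. This gives, with $\zeta = \eta + iy$ and $z = \xi + ix$,
\[
\begin{aligned}
(WW^*\Phi)(x,\xi) &= \iint e^{-\frac{\pi}{2}((\xi-\eta)^2+(x-y)^2+2i\xi y-2i\eta x)}\,\Phi(y,\eta)\,dy\,d\eta\\
&= e^{-\frac{\pi}{2}(\xi^2+x^2)}\iint e^{-\frac{\pi}{2}(\eta^2-2\xi\eta+y^2-2xy+2i\xi y-2i\eta x)}\,\Phi(y,\eta)\,dy\,d\eta\\
&= e^{-\frac{\pi}{2}(\xi^2+x^2)}\iint e^{-\frac{\pi}{2}(\eta^2+y^2+2iy(\xi+ix)-2\eta(\xi+ix))}\,\Phi(y,\eta)\,dy\,d\eta\\
&= e^{-\frac{\pi}{2}|z|^2}\iint e^{-\pi(y^2+\eta^2)}e^{\pi(\eta-iy)(\xi+ix)}f(\eta+iy)\,dy\,d\eta\\
&= e^{-\frac{\pi}{2}|z|^2}\iint e^{-\pi|\zeta|^2}e^{\pi\bar\zeta z}f(\zeta)\,dy\,d\eta\\
&= e^{-\frac{\pi}{2}|z|^2}\iint f(\zeta)\prod_{1\le j\le n}\frac{\partial}{\partial\bar\zeta_j}\Bigl(e^{-\pi|\zeta|^2}e^{\pi\bar\zeta z}\Bigr)\prod_{1\le j\le n}\frac{1}{\pi(z_j-\zeta_j)}\,dy\,d\eta\\
&= e^{-\frac{\pi}{2}|z|^2}\Bigl\langle f(\zeta)\prod_{1\le j\le n}\frac{\partial}{\partial\bar\zeta_j}\Bigl(\frac{1}{\pi(\zeta_j-z_j)}\Bigr),\ e^{-\pi|\zeta|^2}e^{\pi\bar\zeta z}\Bigr\rangle_{\mathscr{S}'(\mathbb{R}^{2n}),\mathscr{S}(\mathbb{R}^{2n})}\\
&= e^{-\frac{\pi}{2}|z|^2}f(z),
\end{aligned}
\]
since $f$ is entire. This implies $WW^*\Phi = \Phi$ and $\Phi \in \operatorname{ran}W$. The proof of the proposition is complete. □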
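The completion of the square behind (4.7) can be checked numerically. The following is a minimal sketch for $n = 1$, assuming nothing beyond the formulas above; the function names and the sampling range are illustrative choices, not part of the paper.

```python
import math
import random

def exponent_window(x, y, eta):
    """Exponent of the conjugated wave packet in the first line of (4.7)."""
    return -math.pi * (x - y) ** 2 - 2j * math.pi * (x - y / 2) * eta

def exponent_factored(x, y, eta):
    """Sum of the three exponents in the second line of (4.7)."""
    return (-math.pi * (x - y + 1j * eta) ** 2
            - (math.pi / 2) * (y * y + eta * eta)
            - (math.pi / 2) * (eta + 1j * y) ** 2)

random.seed(0)
worst_mismatch = 0.0
for _ in range(200):
    # Sample points (x, y, eta) in a box; the identity is polynomial, so any box works.
    x, y, eta = (random.uniform(-2.0, 2.0) for _ in range(3))
    gap = abs(exponent_window(x, y, eta) - exponent_factored(x, y, eta))
    worst_mismatch = max(worst_mismatch, gap)

print(worst_mismatch)
```

The mismatch stays at rounding level, which is consistent with the factorization of $Wv$ into an entire function of $\eta + iy$ times $e^{-\frac{\pi}{2}(y^2+\eta^2)}$.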
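Proposition 4.3. Defining
\[ \mathcal{K} = \ker\Bigl(\bar\partial + \frac{\pi}{2}z\Bigr) \cap \mathscr{S}'(\mathbb{R}^{2n}), \tag{4.8} \]
the operator $W$ given by (4.2) can be extended as a continuous mapping from $\mathscr{S}'(\mathbb{R}^n)$ onto $\mathcal{K}$ (the $L^2(\mathbb{R}^n)$ dot-product is replaced by a bracket of (anti)duality). The operator $\widetilde\Pi$ defined by its kernel $\Pi$ given by (4.1) defines a continuous mapping from $\mathscr{S}(\mathbb{R}^{2n})$ into itself and can be extended as a continuous mapping from $\mathscr{S}'(\mathbb{R}^{2n})$ onto $\mathcal{K}$. It verifies
\[ \widetilde\Pi^2 = \widetilde\Pi,\qquad \widetilde\Pi|_{\mathcal{K}} = \mathrm{Id}_{\mathcal{K}}. \tag{4.9} \]

Proof. As above we use that $e^{-i\pi y\eta}(Wv)(y,\eta)$ is the partial Fourier transform w.r.t. $x$ of the tempered distribution on $\mathbb{R}^{2n}_{x,y}$
\[ v(x)2^{n/4}e^{-\pi(x-y)^2}. \]
Since $e^{\pm i\pi y\eta}$ are in the space $\mathcal{O}_M(\mathbb{R}^{2n})$ of multipliers of $\mathscr{S}(\mathbb{R}^{2n})$, that transformation is continuous and injective from $\mathscr{S}'(\mathbb{R}^n)$ into $\mathscr{S}'(\mathbb{R}^{2n})$. Replacing in (4.7) the integrals by brackets of duality, we see that $W(\mathscr{S}'(\mathbb{R}^n)) \subset \mathcal{K}$. Conversely, if $\Phi \in \mathcal{K}$, the same calculations as above give (4.9) and (4.8). □

For a Hamiltonian $a$ defined on $\mathbb{R}^{2n}$, for instance a bounded function on $\mathbb{R}^{2n}$, we define $a^{\mathrm{Wick}} = W^*aW$:
\[
\begin{array}{ccc}
L^2(\mathbb{R}^{2n}) & \xrightarrow{\;a\ (\text{multiplication by } a)\;} & L^2(\mathbb{R}^{2n})\\
{\scriptstyle W}\uparrow & & \downarrow{\scriptstyle W^*}\\
L^2(\mathbb{R}^n) & \xrightarrow{\;a^{\mathrm{Wick}}\;} & L^2(\mathbb{R}^n)
\end{array}
\]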
We note that $a(x,\xi) \ge 0 \Rightarrow a^{\mathrm{Wick}} = W^*aW \ge 0$, as an operator. There are many useful applications of the Wick quantization due to that non-negativity property, but for our purpose here, it will be more important to relate explicitly that quantization to the usual Weyl quantization (as given by (3.1)) for quadratic forms.

Lemma 4.4. Let $q(X) = \langle QX, X\rangle$ be a quadratic form on $\mathbb{R}^{2n}$ ($Q$ is a $2n\times 2n$ symmetric matrix). Then we have
\[ q^{\mathrm{Wick}} = q^w + \frac{1}{4\pi}\operatorname{trace}Q. \tag{4.10} \]
Let $L(y,\eta) = \tau\cdot y - t\cdot\eta$ be a real linear form on $\mathbb{R}^{2n}$; then, for all $\Phi \in \Lambda_0$, we have
\[ \iint L(y,\eta)^2|\Phi(y,\eta)|^2\,dy\,d\eta \ge \frac{|\tau|^2+|t|^2}{4\pi}\|\Phi\|^2_{L^2(\mathbb{R}^{2n})}. \tag{4.11} \]
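Proof. A straightforward computation shows that
\[ q^{\mathrm{Wick}} = (q * \Gamma)^w,\quad\text{where } \Gamma(X) = 2^ne^{-2\pi|X|^2},\ X \in \mathbb{R}^{2n}. \tag{4.12} \]
By Taylor's formula, we have $(q*\Gamma)(X) = q(X) + \int_{\mathbb{R}^{2n}}2^ne^{-2\pi|Y|^2}\langle QY,Y\rangle\,dY$; we can use the formula $\int_{\mathbb{R}}2^{1/2}t^2e^{-2\pi t^2}\,dt = \frac{1}{4\pi}$ to get the first result. For $\Phi \in \Lambda_0$, we have $\Phi = Wu$ with $u \in L^2(\mathbb{R}^n)$ and thus
\[
\begin{aligned}
\|L\Phi\|^2_{L^2(\mathbb{R}^{2n})} &= \langle L^2Wu, Wu\rangle_{L^2(\mathbb{R}^{2n})} = \langle W^*L^2Wu, u\rangle_{L^2(\mathbb{R}^n)}\\
&= \bigl\langle (L^2)^{\mathrm{Wick}}u, u\bigr\rangle_{L^2(\mathbb{R}^n)} = \bigl\langle (L^2)^wu, u\bigr\rangle_{L^2(\mathbb{R}^n)} + \frac{\operatorname{trace}(L^2)}{4\pi}\|u\|^2_{L^2(\mathbb{R}^n)},
\end{aligned}
\]
and since $L^wL^w = (L^2)^w$ for a linear form, we get, since $L$ is real-valued,
\[ \|L\Phi\|^2_{L^2(\mathbb{R}^{2n})} = \|L^wu\|^2_{L^2(\mathbb{R}^n)} + \frac{|\tau|^2+|t|^2}{4\pi}\|\Phi\|^2_{L^2(\mathbb{R}^{2n})}, \]
which implies (4.11). □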
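The one-dimensional Gaussian moment used for (4.10) can be confirmed by quadrature. This is a minimal sketch using only the standard library; the helper `simpson` and the truncation interval are illustrative choices.

```python
import math

def simpson(f, a, b, n=4000):
    """Composite Simpson rule on [a, b] with an even number n of subintervals."""
    if n % 2:
        n += 1
    h = (b - a) / n
    total = f(a) + f(b)
    for k in range(1, n):
        total += (4 if k % 2 else 2) * f(a + k * h)
    return total * h / 3

def gamma_1d(t):
    """One-dimensional factor of Gamma(X) = 2^n exp(-2*pi*|X|^2)."""
    return math.sqrt(2) * math.exp(-2 * math.pi * t * t)

# Total mass of the factor and its second moment; the tails beyond |t| = 6 are negligible.
mass = simpson(gamma_1d, -6.0, 6.0)
second_moment = simpson(lambda t: t * t * gamma_1d(t), -6.0, 6.0)

print(mass, second_moment, 1 / (4 * math.pi))
```

Since each one-dimensional factor has unit mass and the odd moments vanish, only the diagonal entries of $Q$ survive in $\int \Gamma(Y)\langle QY,Y\rangle\,dY$, each weighted by $\frac{1}{4\pi}$, which gives the trace term.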
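N.B. The inequality (4.11) looks like an uncertainty principle related to the localization in $\mathbb{R}^{2n}$ for the functions of $\Lambda_0$. Moreover the equality (4.10) provides a simple way to saturate approximately the inequality (4.11); for instance if $L(y,\eta) = y_1$, we consider the sequence $\Phi_\varepsilon = Wu_\varepsilon$ with $u_\varepsilon(x) = \varphi(x_1/\varepsilon)\varepsilon^{-1/2}\psi(x')$, $\|\varphi\|_{L^2(\mathbb{R})} = \|\psi\|_{L^2(\mathbb{R}^{n-1})} = 1$, and we get, provided $x\varphi(x) \in L^2(\mathbb{R})$,
\[ \iint y_1^2|\Phi_\varepsilon(y,\eta)|^2\,dy\,d\eta = \int_{\mathbb{R}} x_1^2|\varphi(x_1/\varepsilon)|^2\varepsilon^{-1}\,dx_1 + \frac{1}{4\pi} = O(\varepsilon^2) + \frac{1}{4\pi}. \]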
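4.2. The anisotropic LLL

Going back to the Gross–Pitaevskii energy (1.11), with $q$ given by (1.13), we see, using Theorem 3.3 and (3.8) that, with $u = Mv$,
\[
\begin{aligned}
2E_{GP}(u) &= \langle q^wu, u\rangle_{L^2(\mathbb{R}^2)} + g\int|u|^4\,dx\\
&= \langle M^*q^wMv, v\rangle_{L^2(\mathbb{R}^2)} + g\int|(Mv)(x)|^4\,dx\\
&= \bigl\langle (D_{y_1}^2 + \mu_1^2y_1^2 + D_{y_2}^2 + \mu_2^2y_2^2)v, v\bigr\rangle_{L^2(\mathbb{R}^2)} + g\int|(Mv)(x)|^4\,dx\\
&= \bigl\langle \bigl((\lambda_1cd - d\lambda_2^{-1})x_2 + \lambda_1D_{x_1}\bigr)^2u, u\bigr\rangle + \mu_1^2\bigl\langle \bigl(\lambda_2d^{-1}D_{x_2} + c\lambda_2x_1\bigr)^2u, u\bigr\rangle\\
&\quad + \bigl\langle \bigl((\lambda_2cd - d\lambda_1^{-1})x_1 + \lambda_2D_{x_2}\bigr)^2u, u\bigr\rangle + \mu_2^2\bigl\langle \bigl(\lambda_1d^{-1}D_{x_1} + c\lambda_1x_2\bigr)^2u, u\bigr\rangle + g\int|u|^4\,dx.
\end{aligned}
\]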
The question at hand is the determination of $\inf_{\|u\|_{L^2}=1}E_{GP}(u)$, which is equal to $\inf_{\|v\|_{L^2}=1}E_{GP}(Mv)$. Since $\mu_1 = 0$ at $\varepsilon = 0$ (see (2.9)) and $\mu_2 \in [1,4]$ (see (A.1)), it is natural to modify our minimization problem, and in the $(y,\eta)$ coordinates, to restrict our attention to the Lowest Landau Level, i.e. the groundspace of $D_{y_2}^2 + \mu_2^2y_2^2$, that is the subspace of $L^2(\mathbb{R}^2)$
\[ LLL_y = \bigl\{v_1(y_1)\otimes 2^{1/4}\mu_2^{1/4}e^{-\pi\mu_2y_2^2}\bigr\}_{v_1\in L^2(\mathbb{R})} = \ker(D_{y_2} - i\mu_2y_2) \cap L^2(\mathbb{R}^2). \tag{4.13} \]
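As a quick sanity check of (4.13), the Gaussian factor can be tested numerically: it should have unit $L^2(\mathbb{R})$ norm and be annihilated by $D_{y_2} - i\mu_2y_2$ with $D = \frac{1}{2i\pi}\frac{d}{dy}$. The sketch below assumes an arbitrary sample value for $\mu_2$ (hypothetical, chosen only for illustration) and uses a centered finite difference.

```python
import math

MU2 = 1.7  # hypothetical sample value of mu_2, for illustration only

def ground_state(y):
    """Second tensor factor in (4.13): 2^(1/4) * mu2^(1/4) * exp(-pi*mu2*y^2)."""
    return (2 * MU2) ** 0.25 * math.exp(-math.pi * MU2 * y * y)

def annihilator_residual(y, h=1e-5):
    """Real quantity g'/(2*pi) + mu2*y*g; (D - i*mu2*y) g equals -i times it."""
    derivative = (ground_state(y + h) - ground_state(y - h)) / (2 * h)
    return derivative / (2 * math.pi) + MU2 * y * ground_state(y)

def trapezoid(f, a, b, n=20000):
    """Composite trapezoid rule, very accurate for rapidly decaying integrands."""
    h = (b - a) / n
    return h * (0.5 * (f(a) + f(b)) + sum(f(a + k * h) for k in range(1, n)))

norm_squared = trapezoid(lambda y: ground_state(y) ** 2, -6.0, 6.0)
max_residual = max(abs(annihilator_residual(-2.0 + 0.01 * k)) for k in range(401))

print(norm_squared, max_residual)
```

The residual is limited only by the finite-difference step, so the check is a consistency test of the normalization constant and of the sign convention for $D$, not a proof.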
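If we want to stay in the physical coordinates $(x,\xi)$ we reach the following definition, obtained by using Segal's formula (3.2) with $M$, $\chi$ given in Lemma 3.1, so that $LLL_x = M(LLL_y)$.

Proposition 4.5. Let $q$ be the quadratic form on $\mathbb{R}^4$ given by (1.13). We define the LLL as
\[ LLL = (\ker L) \cap L^2(\mathbb{R}^2), \tag{4.14} \]
with
\[ L = (\lambda_2cd - d\lambda_1^{-1})x_1 + \lambda_2D_{x_2} - i\mu_2\lambda_1d^{-1}D_{x_1} - i\mu_2c\lambda_1x_2 = (\eta_2 - i\mu_2y_2)^w. \tag{4.15} \]
The LLL is the subspace of $L^2(\mathbb{R}^2)$ of functions of type
\[ F(x_1+i\beta_2x_2)\exp\Bigl(-\frac{\gamma\pi}{4\beta_2}\Bigl[x_1^2\Bigl(1-\frac{\nu^2}{2\alpha}\Bigr) + (\beta_2x_2)^2\Bigl(1+\frac{\nu^2}{2\alpha}\Bigr)\Bigr]\Bigr)\exp\Bigl(-i\frac{\pi\nu^2\gamma}{4\alpha}x_1x_2\Bigr), \tag{4.16} \]
where $F$ is entire on $\mathbb{C}$, and the parameters $\gamma, \beta_2, \nu, \alpha$ are given in Appendix A.2. The real part of the phase of the Gaussian function multiplying $F(x_1+i\beta_2x_2)$ is a negative definite quadratic form when $(\omega,\nu) \neq (0,0)$.

Proof. We have
\[
\begin{aligned}
iL &= \underbrace{\mu_2\bigl(\lambda_1d^{-1}D_{x_1} + c\lambda_1x_2\bigr)}_{\mu_2y_2} + i\underbrace{\bigl(\lambda_2D_{x_2} - (d\lambda_1^{-1} - \lambda_2cd)x_1\bigr)}_{\eta_2}\\
&= \frac{1}{2i\pi}\Bigl(\mu_2\lambda_1d^{-1}\partial_1 + i\lambda_2\partial_2 + 2i\pi\mu_2c\lambda_1x_2 + 2\pi(d\lambda_1^{-1} - \lambda_2cd)x_1\Bigr)\\
&= \frac{1}{i\pi}\Bigl(\frac12\mu_2\lambda_1d^{-1}\partial_1 + \frac{i}{2}\lambda_2\partial_2 + i\pi\mu_2c\lambda_1x_2 + \pi(d\lambda_1^{-1} - \lambda_2cd)x_1\Bigr).
\end{aligned}
\]
We set
\[ t_1 = \mu_2^{-1}\lambda_1^{-1}d\,x_1,\qquad t_2 = \lambda_2^{-1}x_2, \tag{4.17} \]
and we get for $z = t_1 + it_2$,
\[ \frac{\partial}{\partial\bar z} + i\pi\mu_2c\lambda_1\lambda_2t_2 + \pi(d\lambda_1^{-1} - \lambda_2cd)\mu_2\lambda_1d^{-1}t_1 = \frac{\partial}{\partial\bar z} + i\pi\mu_2c\lambda_1\lambda_2\frac{z-\bar z}{2i} + \pi(d\lambda_1^{-1} - \lambda_2cd)\mu_2\lambda_1d^{-1}\frac{z+\bar z}{2} \]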
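\[
\begin{aligned}
&= \frac{\partial}{\partial\bar z} + z\pi\frac{\mu_2}{2} + \bar z\pi\frac{\mu_2}{2}(1 - 2\lambda_1\lambda_2c) = \frac{\partial}{\partial\bar z} + z\pi\frac{\mu_2}{2} - \bar z\pi\frac{\mu_2}{2}\nu^2\alpha^{-1}\\
&= e^{-\pi\frac{\mu_2}{2}z\bar z}\,e^{\pi\frac{\nu^2\mu_2}{4\alpha}\bar z^2}\,\frac{\partial}{\partial\bar z}\,e^{\pi\frac{\mu_2}{2}z\bar z}\,e^{-\pi\frac{\nu^2\mu_2}{4\alpha}\bar z^2}.
\end{aligned}
\]
As a consequence, the LLL is the subspace of $L^2(\mathbb{C})$ of functions
\[ f(z)\,e^{-\pi\frac{\mu_2}{2}z\bar z}\,e^{\pi\frac{\nu^2\mu_2}{4\alpha}\bar z^2},\quad\text{with } f \text{ holomorphic.} \]
We note that the real part of the exponent is
\[ -\frac{\pi\mu_2}{2}\Bigl(t_1^2 + t_2^2 - \frac{\nu^2}{2\alpha}(t_1^2 - t_2^2)\Bigr) = -\frac{\pi\mu_2}{2}\Bigl(t_1^2\,\frac{2\alpha-\nu^2}{2\alpha} + t_2^2\,\frac{2\alpha+\nu^2}{2\alpha}\Bigr) \]
and that
\[ 2\alpha - \nu^2 > 0 \iff (\omega,\nu) \neq (0,0). \]
Leaving the $t$-coordinates for the original $x$-coordinates, we get, with $f$ entire,
\[ f\bigl(\mu_2^{-1}\lambda_1^{-1}dx_1 + i\lambda_2^{-1}x_2\bigr)\exp\Bigl(-\frac{\pi\mu_2}{2}\Bigl[t_1^2\,\frac{2\alpha-\nu^2}{2\alpha} + t_2^2\,\frac{2\alpha+\nu^2}{2\alpha}\Bigr]\Bigr)\exp\Bigl(-i\frac{\pi\mu_2\nu^2}{2\alpha}t_1t_2\Bigr), \]
i.e.
\[ f\bigl(\mu_2^{-1}\lambda_1^{-1}dx_1 + i\lambda_2^{-1}x_2\bigr)\exp\Bigl(-\frac{\pi\mu_2}{2}\Bigl[x_1^2d^2\,\frac{2\alpha-\nu^2}{2\alpha\lambda_1^2\mu_2^2} + x_2^2\,\frac{2\alpha+\nu^2}{2\alpha\lambda_2^2}\Bigr]\Bigr)\exp\Bigl(-i\frac{\pi\mu_2\nu^2d}{2\alpha\lambda_1\lambda_2\mu_2}x_1x_2\Bigr), \]
and since
\[
\begin{aligned}
\mu_2\lambda_1d^{-1}\lambda_2^{-1} &= \mu_2\lambda_1\,2\gamma^{-1}\lambda_1^{-1}\lambda_2^{-1}\lambda_2^{-1} = \mu_2\,2\gamma^{-1}\lambda_2^{-2} = \mu_2\,2\gamma^{-1}\,\gamma\beta_2(2\mu_2)^{-1} = \beta_2,\\
2^{-1}\mu_2d^2\lambda_1^{-2}\mu_2^{-2} &= 2^{-1}\mu_2^{-1}\gamma^2\,4^{-1}\lambda_2^2 = 2^{-1}\mu_2^{-1}\gamma^2\,4^{-1}\,2\mu_2\gamma^{-1}\beta_2^{-1} = \frac{\gamma}{4\beta_2},\\
2^{-1}\mu_2\lambda_2^{-2} &= 2^{-1}\mu_2\,\frac{\gamma\beta_2}{2\mu_2} = \frac{\gamma\beta_2}{4},\\
\frac{\pi\mu_2\nu^2d}{2\alpha\lambda_1\lambda_2\mu_2} &= \frac{\pi\nu^2d}{2\alpha\lambda_1\lambda_2} = \frac{\pi\nu^2\gamma}{2\alpha\,2},
\end{aligned}
\]
we obtain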
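\[ f\Bigl(\mu_2^{-1}\lambda_1^{-1}d\,\bigl[x_1 + i\underbrace{\mu_2\lambda_1d^{-1}\lambda_2^{-1}}_{=\beta_2}x_2\bigr]\Bigr)\exp\Bigl(-\frac{\pi\mu_2}{2}\Bigl[x_1^2d^2\,\frac{2\alpha-\nu^2}{2\alpha\lambda_1^2\mu_2^2} + x_2^2\,\frac{2\alpha+\nu^2}{2\alpha\lambda_2^2}\Bigr]\Bigr)\exp\Bigl(-i\frac{\pi\mu_2\nu^2d}{2\alpha\lambda_1\lambda_2\mu_2}x_1x_2\Bigr), \]
that is, with $F$ entire on $\mathbb{C}$,
\[ F(x_1+i\beta_2x_2)\exp\Bigl(-\frac{\gamma\pi}{4\beta_2}\Bigl[x_1^2\Bigl(1-\frac{\nu^2}{2\alpha}\Bigr) + (\beta_2x_2)^2\Bigl(1+\frac{\nu^2}{2\alpha}\Bigr)\Bigr]\Bigr)\exp\Bigl(-i\frac{\pi\nu^2\gamma}{4\alpha}x_1x_2\Bigr). \tag{4.18} \]
The proof of the proposition is complete. □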
Remark 4.6. We note that in the isotropic case $\nu = 0$, we have $\beta_2 = 1$, $\gamma = 4$, recovering (1.15) ($f(x_1+ix_2)e^{-\pi(x_1^2+x_2^2)}$) for $\omega = 1$. On the other hand, the reader may have noticed that it seems difficult to guess the above definition without going through the explicit computations on the diagonalization of $q$ of the previous sections.

4.3. The energy in the anisotropic LLL

Lemma 4.7. The LLL is defined by Proposition 4.5 and the Gross–Pitaevskii energy by (1.11). For $u \in LLL$, we have
\[
\begin{aligned}
E_{GP}(u) &= \frac12\int_{\mathbb{R}^2}\Bigl(\frac{2\alpha}{\alpha+2\omega^2+\nu^2}\,\varepsilon^2x_1^2 + \frac{2\alpha(2\nu^2+\varepsilon^2)}{\alpha-\nu^2+2\omega^2}\,x_2^2\Bigr)|u(x_1,x_2)|^2\,dx_1\,dx_2\\
&\quad + \frac{g}{2}\int_{\mathbb{R}^2}|u(x_1,x_2)|^4\,dx_1\,dx_2 + \frac{\mu_2}{4\pi} - \frac{\mu_1}{8\pi}\Bigl(\beta_1\beta_2 + \frac{1}{\beta_1\beta_2}\Bigr).
\end{aligned}
\tag{4.19}
\]
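Proof. In the LLL, one can simplify the energy. We define
\[
\begin{aligned}
A_2 &= M(\eta_2 - i\mu_2y_2)^wM^* = \mu_2\bigl(\lambda_1d^{-1}D_{x_1} + c\lambda_1x_2\bigr) + i\bigl(\lambda_2D_{x_2} - (d\lambda_1^{-1} - \lambda_2cd)x_1\bigr),\\
A_1 &= M(\eta_1 - i\mu_1y_1)^wM^* = \mu_1\bigl(\lambda_2d^{-1}D_{x_2} + c\lambda_2x_1\bigr) + i\bigl((\lambda_1cd - d\lambda_2^{-1})x_2 + \lambda_1D_{x_1}\bigr),
\end{aligned}
\]
which satisfy the canonical commutation relations $[A_j, A_j^*] = \mu_j/\pi$, while all other commutators vanish. We have proven that
\[ q^w = A_1^*A_1 + A_2^*A_2 + \frac{\mu_1+\mu_2}{2\pi} = (\operatorname{Re}A_1)^2 + (\operatorname{Im}A_1)^2 + (\operatorname{Re}A_2)^2 + (\operatorname{Im}A_2)^2 \]
and the LLL is defined by the equation $A_2u = 0$. On the other hand, we have
\[ d\mu_1^{-1}\operatorname{Re}A_1 - \operatorname{Im}A_2 = d\lambda_1^{-1}x_1,\qquad d\mu_2^{-1}\operatorname{Re}A_2 - \operatorname{Im}A_1 = d\lambda_2^{-1}x_2, \]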
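and thus for $u \in LLL$, since $A_2u = 0$, using the commutation relations of the $A_j$'s, one gets (as quadratic forms on the LLL)
\[ d^2\lambda_1^{-2}x_1^2 = d^2\mu_1^{-2}(\operatorname{Re}A_1)^2 + \Bigl(\frac{A_2 - A_2^*}{2i}\Bigr)^2 - 2d\mu_1^{-1}(\operatorname{Re}A_1)\frac{A_2 - A_2^*}{2i} = d^2\mu_1^{-2}(\operatorname{Re}A_1)^2 + \frac{\mu_2}{4\pi}, \]
and similarly,
\[ d^2\lambda_2^{-2}x_2^2 = d^2\mu_2^{-2}\Bigl(\frac{A_2 + A_2^*}{2}\Bigr)^2 + (\operatorname{Im}A_1)^2 = (\operatorname{Im}A_1)^2 + \frac{d^2}{4\pi\mu_2}. \]
As a result, we get on the LLL,
\[ \mu_1^2\lambda_1^{-2}x_1^2 + d^2\lambda_2^{-2}x_2^2 = (\operatorname{Re}A_1)^2 + (\operatorname{Im}A_1)^2 + \frac{\mu_2\mu_1^2}{4\pi d^2} + \frac{d^2}{4\pi\mu_2}, \]
and
\[ q^w = \mu_1^2\lambda_1^{-2}x_1^2 + d^2\lambda_2^{-2}x_2^2 - \frac{d^2}{4\pi\mu_2} - \frac{\mu_2\mu_1^2}{4\pi d^2} + \frac{\mu_2}{2\pi}, \]
so that
\[ 2E_{GP}(u) = \frac{\gamma}{2}\int_{\mathbb{R}^2}\Bigl(\mu_1\beta_1x_1^2 + \frac{\mu_1}{\beta_1}x_2^2\Bigr)|u(x_1,x_2)|^2\,dx_1\,dx_2 + g\int_{\mathbb{R}^2}|u(x_1,x_2)|^4\,dx_1\,dx_2 + \frac{\mu_2}{2\pi} - \frac{\mu_1}{4\pi}\Bigl(\beta_1\beta_2 + \frac{1}{\beta_1\beta_2}\Bigr), \]
for any $u \in LLL$, that is, satisfying (4.16). We note that
\[ \frac{\gamma}{2}\mu_1\beta_1 = \varepsilon^2\frac{2\alpha}{\alpha+2\omega^2+\nu^2}\quad(\text{coefficient of } x_1^2),\qquad \frac{\gamma\mu_1}{2\beta_1} = \frac{2\alpha(2\nu^2+\varepsilon^2)}{\alpha-\nu^2+2\omega^2}\quad(\text{coefficient of } x_2^2). \]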
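Definition 4.8. For $u \in LLL$ (see Proposition 4.5), we define
\[ E_{LLL}(u) = \frac12\int_{\mathbb{R}^2}\bigl(\varepsilon^2x_1^2 + \kappa_1^2x_2^2\bigr)|u(x_1,x_2)|^2\,dx_1\,dx_2 + \frac{g_1}{2}\int_{\mathbb{R}^2}|u(x_1,x_2)|^4\,dx_1\,dx_2, \tag{4.20} \]
with
\[ \kappa_1^2 = \frac{(\alpha+2\omega^2+\nu^2)(2\nu^2+\varepsilon^2)}{\alpha-\nu^2+2\omega^2},\qquad g_1 = g\,\frac{\alpha+2\omega^2+\nu^2}{2\alpha},\qquad \alpha = \sqrt{\nu^4+4\omega^2}. \tag{4.21} \]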
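We note that, from (4.19),
\[ E_{GP}(u) = \frac{2\alpha}{\alpha+2\omega^2+\nu^2}E_{LLL}(u) + \frac{\mu_2}{4\pi} - \frac{\mu_1}{8\pi}\Bigl(\beta_1\beta_2 + \frac{1}{\beta_1\beta_2}\Bigr). \tag{4.22} \]

Remark 4.9. Since $\alpha^2 = \nu^4 + 4\omega^2$, we see that
\[ \kappa_1^2 = \frac{(\alpha+2\omega^2+\nu^2)(2\nu^2+\varepsilon^2)}{\alpha-\nu^2+2\omega^2} = (2\nu^2+\varepsilon^2)\Bigl(1 + \frac{2\nu^2}{\alpha-\nu^2+2\omega^2}\Bigr) \ge 2\nu^2+\varepsilon^2, \tag{4.23} \]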
and $\kappa_1^2 = \varepsilon^2 \iff \nu = 0$.

Remark 4.10. We stay away from the case where $\omega = 0$ and shall always assume $\omega > 0$. In the case $\omega = 0$, the quadratic part of the energy is diagonal and the LLL is
\[ v_1(x_1)\otimes 2^{1/4}(2-\varepsilon^2)^{1/8}e^{-\pi(2-\varepsilon^2)^{1/2}x_2^2}, \]
and we get a 1D problem on the function $v_1$.

4.4. The (final) reduction to a simpler lowest Landau level

Given the fact that in (4.16), we can write $F(x_1+i\beta_2x_2)$ as a holomorphic function times $e^{-\delta z^2}$, with $\delta = \gamma\pi\nu^2/(8\beta_2\alpha)$, and that the energy $E_{LLL}$ depends only on the modulus of $u$ and not on its phase, it is equivalent to minimize $E_{LLL}$ on the LLL or on the space
\[ f(x_1+i\beta_2x_2)\exp\Bigl(-\frac{\gamma\pi}{4\beta_2}\bigl[x_1^2 + (\beta_2x_2)^2\bigr]\Bigr),\quad\text{with } f \text{ entire.} \]
A rescaling in $x_1$ and $x_2$ yields the space of the introduction with
\[ u(x_1,x_2) = \sqrt{\frac{\gamma}{2}}\,v(y_1,y_2),\qquad y_1 = x_1\sqrt{\frac{\gamma}{2\beta_2}},\qquad y_2 = x_2\sqrt{\frac{\gamma\beta_2}{2}}, \tag{4.24} \]
and, with $\Lambda_0$ given by (4.3), the mapping $LLL \ni u \mapsto v \in \Lambda_0$ is bijective and isometric. With $\kappa_1, g_1$ given in Definition 4.8, $\beta_2$ in (2.12), $\gamma$ in (2.13), we introduce
\[ \kappa = \frac{\kappa_1}{\beta_2},\qquad g_0 = \frac{g_1\gamma^2}{4\beta_2}, \tag{4.25} \]
and
\[ E(v) = \frac12\int_{\mathbb{R}^2}\bigl(\varepsilon^2y_1^2 + \kappa^2y_2^2\bigr)|v(y_1,y_2)|^2\,dy_1\,dy_2 + \frac{g_0}{2}\|v\|^4_{L^4(\mathbb{R}^2)}. \tag{4.26} \]
Using the transformation (4.24), we have
\[ E_{LLL}(u) = \frac{2\beta_2}{\gamma}E(v), \tag{4.27} \]
so that, via Definition 4.8, we are indeed reduced to the minimization of (1.21) in the space $\Lambda_0$ (given in (1.22)) under the constraint $\|u\|_{L^2(\mathbb{R}^2)} = 1$. We note also that the quantities
\[ \underbrace{\frac{2\alpha}{\alpha+2\omega^2+\nu^2},\quad \frac{2\beta_2}{\gamma}}_{\text{factors of } E_{LLL}(u) \text{ in (4.22) and } E(v) \text{ in (4.27)}} \tag{4.28} \]
and
\[ \underbrace{\beta_2,\quad \frac{\gamma^2}{\beta_2},\quad \frac{\alpha+2\omega^2+\nu^2}{2\alpha}}_{\text{factors of } \kappa \text{ in (4.25), of } g_1 \text{ in (4.25), of } g \text{ in (4.21)}} \tag{4.29} \]
are bounded and away from zero as long as $\omega$ stays away from zero, a condition that we shall always assume, say $0 < \omega_0 \le \omega \le 1$.

5. Weak anisotropy

This section is devoted to the proof of Theorem 1.1. We assume $\varepsilon \le \kappa \ll \varepsilon^{1/3}$. The isotropic case is recovered by assuming $\kappa = \varepsilon$. We first give some approximation results in Section 5.1, and prove the theorem in Section 5.2. We recall that the space $\Lambda_0$, the operator $\Pi_0$, the energy $E$ and the minimization problem $I(\varepsilon,\kappa)$ are defined by (1.22), (1.23), (1.21) and (1.24), respectively. An important test function will be (1.28), namely
\[ u_\tau(x_1,x_2) = e^{\frac{\pi}{2}(z^2-|z|^2)}\,\Theta(\sqrt{\tau_I}\,z,\tau),\qquad z = x_1+ix_2, \tag{5.1} \]
for $\tau = \tau_R + i\tau_I = e^{\frac{2i\pi}{3}}$.
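5.1. Approximation results

Lemma 5.1. Let $u(x) = f(x_1+ix_2)e^{-\frac{\pi}{2}|x|^2} \in L^\infty(\mathbb{R}^2)$, with $f$ holomorphic. Assume $0 \le \beta \le 1$ and let $p \in C^{0,\beta}(\mathbb{R}^2)$ be such that $\operatorname{supp}(p) \subset B_S$, the Euclidean ball of radius $S > 0$ and of center 0. Define
\[ \rho(x) = \frac{1}{\sqrt{R_1R_2}}\,p\Bigl(\frac{x_1}{R_1},\frac{x_2}{R_2}\Bigr). \tag{5.2} \]
Then, for any $r \ge 1$, there exists a constant $C_{S,r} > 0$ depending only on $S$ and $r$ such that, setting $R = \min(R_1,R_2)$, we have
\[ \|\Pi_0(\rho u) - \rho u\|_{L^r(\mathbb{R}^2)} \le C_{S,r}\,\|u\|_{L^\infty(\mathbb{R}^2)}\,\|p\|_{C^{0,\beta}(\mathbb{R}^2)}\,\frac{(R_1R_2)^{\frac1r-\frac12}}{R^\beta}. \tag{5.3} \]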
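Proof. We first prove the lemma in the case $\beta = 0$. For this purpose, we write
\[ |\Pi_0(\rho u)| \le \int_{\mathbb{R}^2}e^{-\frac{\pi}{2}|x-y|^2}|u(y)\rho(y)|\,dy. \]
Young's inequality implies, for any $r \ge 1$ and any $p, q \ge 1$ such that $1/p + 1/q = 1 + 1/r$,
\[ \|\Pi_0(\rho u)\|_{L^r} \le \bigl\|e^{-\frac{\pi}{2}|x|^2}\bigr\|_{L^p}\|u\rho\|_{L^q} \le \|u\|_{L^\infty}\bigl\|e^{-\frac{\pi}{2}|x|^2}\bigr\|_{L^p}\|\rho\|_{L^q}. \]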
Fixing $q = r$, hence $p = 1$, we find
\[ \|\Pi_0(\rho u)\|_{L^r} \le 2\|u\|_{L^\infty}\|\rho\|_{L^r} = 2\|u\|_{L^\infty}(R_1R_2)^{\frac1r-\frac12}\|p\|_{L^r}. \tag{5.4} \]
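The factor 2 in (5.4) is the $L^1(\mathbb{R}^2)$ norm of the kernel $e^{-\frac{\pi}{2}|x|^2}$. A minimal numerical sketch follows, reducing the planar integrals to radial ones; the helper names are illustrative. It also evaluates two related Gaussian constants of the same type, the mass of $e^{-\frac{\pi}{4}|x|^2}$ and the first moment of $e^{-\frac{\pi}{2}|y|^2}$, which enter the $\beta = 1$ part of the proof.

```python
import math

def simpson(f, a, b, n=4000):
    """Composite Simpson rule on [a, b] with an even number n of subintervals."""
    if n % 2:
        n += 1
    h = (b - a) / n
    total = f(a) + f(b)
    for k in range(1, n):
        total += (4 if k % 2 else 2) * f(a + k * h)
    return total * h / 3

def planar_gaussian_moment(a, k, r_max=12.0):
    """Integral over R^2 of |x|^k * exp(-a*|x|^2), computed in polar coordinates."""
    return 2 * math.pi * simpson(lambda r: r ** (k + 1) * math.exp(-a * r * r), 0.0, r_max)

kernel_mass = planar_gaussian_moment(math.pi / 2, 0)    # constant 2 in (5.4)
wide_mass = planar_gaussian_moment(math.pi / 4, 0)      # mass of exp(-pi|x|^2/4)
first_moment = planar_gaussian_moment(math.pi / 2, 1)   # L^1 norm of |y| exp(-pi|y|^2/2)

print(kernel_mass, wide_mass, first_moment)
```

The three values agree with $2$, $4$ and $\sqrt2$ respectively, up to quadrature error.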
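This proves (5.3) for $\beta = 0$. Next, we assume $\beta = 1$. We use a Taylor expansion of $\rho(y) = \rho(x + y - x)$ around $x$:
\[ \rho(y) = \rho(x) + \frac{1}{\sqrt{R_1R_2}}\int_0^1\nabla p\Bigl(\frac{x_1}{R_1} + t\frac{y_1-x_1}{R_1},\ \frac{x_2}{R_2} + t\frac{y_2-x_2}{R_2}\Bigr)\cdot\Bigl(\frac{y_1-x_1}{R_1},\ \frac{y_2-x_2}{R_2}\Bigr)\,dt. \]
We then notice that, although $u \notin \Lambda_0$ a priori, it belongs to $\mathcal{K}$ (see Proposition 4.3) and we have $\Pi_0(u) = u$ since $u \in L^\infty$ and $u(x) = f(x_1+ix_2)\exp(-\pi|x|^2/2)$ with $f$ holomorphic. Hence, we have
\[
\begin{aligned}
\Pi_0(\rho u) - \rho u &= \int_{B^{R_1,R_2}_{S+1}}e^{-\frac{\pi}{2}|x-y|^2+i\pi(x_2y_1-y_2x_1)}\,u(y_1,y_2)\\
&\qquad\times\frac{1}{\sqrt{R_1R_2}}\int_0^1\nabla p\Bigl(\frac{x_1}{R_1} + t\frac{y_1-x_1}{R_1},\ \frac{x_2}{R_2} + t\frac{y_2-x_2}{R_2}\Bigr)\cdot\Bigl(\frac{y_1-x_1}{R_1},\ \frac{y_2-x_2}{R_2}\Bigr)\,dt\,dy\\
&\quad - \rho(x)\int_{(B^{R_1,R_2}_{S+1})^c}u(y)\,e^{-\frac{\pi}{2}|x-y|^2+i\pi(x_2y_1-y_2x_1)}\,dy,
\end{aligned}
\]
where the set $B^{R_1,R_2}_{S+1}$ is
\[ B^{R_1,R_2}_{S+1} = \bigl\{(y_1,y_2) = (R_1t_1, R_2t_2),\ t \in B_{S+1}\bigr\}. \tag{5.5} \]
We thus have, with $R = \min(R_1,R_2)$,
\[ |\Pi_0(\rho u) - \rho u| \le \|\nabla p\|_{L^\infty}\int_{B^{R_1,R_2}_{S+1}}e^{-\frac{\pi}{2}|x-y|^2}|u(y)|\,\frac{1}{\sqrt{R_1R_2}}\,\frac{|y-x|}{R}\,dy + |\rho(x)|\,\Bigl|\int_{(B^{R_1,R_2}_{S+1})^c}u(y)\,e^{-\frac{\pi}{2}|x-y|^2}\,dy\Bigr|. \tag{5.6} \]
We bound the first term of the right-hand side of (5.6) using Young's inequality, while for the second term, we have, $\forall x \in \operatorname{supp}(\rho) \subset B^{R_1,R_2}_S$,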
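\[ \Bigl|\int_{(B^{R_1,R_2}_{S+1})^c}u(y)\,e^{-\frac{\pi}{2}|x-y|^2}\,dy\Bigr| \le \|u\|_{L^\infty}e^{-\frac{\pi}{4}R^2}\int_{\mathbb{R}^2}e^{-\frac{\pi}{4}|x-y|^2}\,dy = 4\|u\|_{L^\infty}e^{-\frac{\pi}{4}R^2} \le \|u\|_{L^\infty}\frac{C}{R}, \]
where $C$ is a universal constant. Hence, we have
\[
\begin{aligned}
\|\Pi_0(\rho u) - \rho u\|_{L^r} &\le \frac{1}{R}\|\nabla p\|_{L^\infty}\bigl\||y|e^{-\frac{\pi}{2}|y|^2}\bigr\|_{L^1}\|u\|_{L^\infty}\frac{1}{\sqrt{R_1R_2}}\bigl|B^{R_1,R_2}_{S+1}\bigr|^{1/r} + \frac{C}{R}\|u\|_{L^\infty}\|\rho\|_{L^r}\\
&= \frac{1}{R}\|\nabla p\|_{L^\infty}\sqrt2\,\|u\|_{L^\infty}(R_1R_2)^{\frac1r-\frac12}|B_{S+1}|^{1/r} + \frac{C}{R}\|u\|_{L^\infty}\|p\|_{L^\infty}(R_1R_2)^{\frac1r-\frac12}|B_S|^{1/r}.
\end{aligned}
\]
This gives (5.3) for $\beta = 1$. We then conclude by a real interpolation argument between $C^0$ and $C^{0,1}$. □

A comment is in order here: we have chosen to state Lemma 5.1 with a general function $p$. However, since our aim is to apply the above result with the special case $p(x) = (1-|x|^2)_+^{1/2}$, it is also possible to use explicitly this value of $p$ in order to give a simpler proof of the above result. The method would then be to prove the estimate for $r = +\infty$ first, then for $r = 1$, and then use an interpolation argument between $L^1$ and $L^\infty$. For instance, the proof of the $r = +\infty$ case would go as follows:
\[
\begin{aligned}
|\Pi_0(\rho u)(x) - \rho(x)u(x)| &= \Bigl|\int_{\mathbb{R}^2}e^{-\frac{\pi}{2}|x-y|^2+i\pi(x_2y_1-y_2x_1)}\bigl(\rho(y)u(y) - \rho(x)u(y)\bigr)\,dy\Bigr|\\
&\le \|u\|_{L^\infty}\int_{\mathbb{R}^2}e^{-\frac{\pi}{2}|x-y|^2}|\rho(y) - \rho(x)|\,dy\\
&\le \|u\|_{L^\infty}\int_{\mathbb{R}^2}e^{-\frac{\pi}{2}|x-y|^2}\sqrt{\frac{|x-y|}{R}}\,dy\\
&= \frac{\|u\|_{L^\infty}}{\sqrt R}\int_{\mathbb{R}^2}e^{-\frac{\pi}{2}|y|^2}\sqrt{|y|}\,dy.
\end{aligned}
\]
The proof of the case $r = 1$ is slightly more involved, but is based on the same idea.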
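We now prove

Lemma 5.2. With the same hypotheses as in Lemma 5.1, we have, for any $s \ge 1$,
\[ \Bigl(\int_{\mathbb{R}^2}x_1^{2s}|\Pi_0(\rho u) - \rho u|^2\Bigr)^{1/2} \le C_{S,s}\,\|u\|_{L^\infty(\mathbb{R}^2)}\,\|p\|_{C^{0,\beta}(\mathbb{R}^2)}\,\frac{1+R_1^sS^s}{R^\beta}, \tag{5.7} \]
and
\[ \Bigl(\int_{\mathbb{R}^2}x_2^{2s}|\Pi_0(\rho u) - \rho u|^2\Bigr)^{1/2} \le C_{S,s}\,\|u\|_{L^\infty(\mathbb{R}^2)}\,\|p\|_{C^{0,\beta}(\mathbb{R}^2)}\,\frac{1+R_2^sS^s}{R^\beta}, \tag{5.8} \]
where $C_{S,s}$ depends only on $S$ and $s$.

Proof. Here again, we first deal with the case $\beta = 0$. For this purpose, we write:
\[
\begin{aligned}
\bigl||x_1|^s\Pi_0(\rho u)\bigr| &\le 2^{s-1}\int_{\mathbb{R}^2}|x_1-y_1|^se^{-\frac{\pi}{2}|x-y|^2}|u(y)\rho(y)|\,dy\\
&\quad + 2^{s-1}\int_{\mathbb{R}^2}|y_1|^se^{-\frac{\pi}{2}|x-y|^2}|u(y)\rho(y)|\,dy,
\end{aligned}
\tag{5.9}
\]
where we have used the inequality $(a+b)^s \le 2^{s-1}(a^s+b^s)$, valid for any $a, b \ge 0$, $s \ge 1$. The first line of (5.9) is dealt with exactly as in the proof of Lemma 5.1, leading to (5.4) with $r = 2$, which reads here
\[ \Bigl\|\int_{\mathbb{R}^2}|x_1-y_1|^se^{-\frac{\pi}{2}|x-y|^2}|u(y)\rho(y)|\,dy\Bigr\|_{L^2} \le \|u\|_{L^\infty}\bigl\||x|^se^{-\frac{\pi}{2}|x|^2}\bigr\|_{L^1}\|\rho\|_{L^2} \le C_s\|u\|_{L^\infty}\|p\|_{L^2}, \tag{5.10} \]
where $C_s$ depends only on $s$. The second line of (5.9) is treated in the same way, but $\rho(y)$ is replaced by $|y_1|^s\rho(y)$, that is, $p(y)$ is replaced by $R_1^s|y_1|^sp(y)$. Hence, we have
\[ \Bigl\|\int_{\mathbb{R}^2}|y_1|^se^{-\frac{\pi}{2}|x-y|^2}|u(y)\rho(y)|\,dy\Bigr\|_{L^2} \le 2R_1^s\|u\|_{L^\infty}\bigl\||y_1|^sp\bigr\|_{L^2}. \tag{5.11} \]
Collecting (5.9), (5.10) and (5.11), we find
\[ \bigl\||x_1|^s\Pi_0(\rho u)\bigr\|_{L^2} \le C_s\bigl(1+R_1^sS^s\bigr)\|u\|_{L^\infty}\|p\|_{C^0}|B_S|^{1/2}. \]
This proves (5.7) for $\beta = 0$. Next, we consider the case $\beta = 1$. Here again, we use a Taylor expansion to obtain (5.6). This implies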
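\[
\begin{aligned}
\bigl||x_1|^s(\Pi_0(\rho u) - \rho u)\bigr| &\le 2^{s-1}\frac{\|\nabla p\|_{L^\infty}}{R}\int_{B^{R_1,R_2}_{S+1}}e^{-\frac{\pi}{2}|x-y|^2}|u(y)|\frac{1}{\sqrt{R_1R_2}}|y-x||y_1-x_1|^s\,dy\\
&\quad + 2^{s-1}\frac{\|\nabla p\|_{L^\infty}}{R}\int_{B^{R_1,R_2}_{S+1}}e^{-\frac{\pi}{2}|x-y|^2}|u(y)|\frac{1}{\sqrt{R_1R_2}}|y-x||y_1|^s\,dy\\
&\quad + |x_1|^s|\rho(x)|\,\Bigl|\int_{(B^{R_1,R_2}_{S+1})^c}u(y)\,e^{-\frac{\pi}{2}|x-y|^2}\,dy\Bigr|,
\end{aligned}
\]
where $B^{R_1,R_2}_{S+1}$ is defined by (5.5). We use Young's inequality again, finding
\[
\begin{aligned}
\bigl\||x_1|^s(\Pi_0(\rho u) - \rho u)\bigr\|_{L^2} &\le 2^{s-1}\frac{\|\nabla p\|_{L^\infty}}{R}\bigl\||y|^{s+1}e^{-\frac{\pi}{2}|y|^2}\bigr\|_{L^1}\Bigl(\frac{|B^{R_1,R_2}_{S+1}|}{R_1R_2}\Bigr)^{1/2}\|u\|_{L^\infty}\\
&\quad + 2^{s-1}\frac{\|\nabla p\|_{L^\infty}}{R}\bigl\||y|e^{-\frac{\pi}{2}|y|^2}\bigr\|_{L^1}\Bigl(\int_{B^{R_1,R_2}_{S+1}}\frac{|y_1|^{2s}}{R_1R_2}\,dy\Bigr)^{1/2}\|u\|_{L^\infty}\\
&\quad + \frac{C}{R}\|u\|_{L^\infty}\bigl\||x_1|^s\rho\bigr\|_{L^2},
\end{aligned}
\]
where $C$ is a universal constant. Hence,
\[ \bigl\||x_1|^s(\Pi_0(\rho u) - \rho u)\bigr\|_{L^2} \le C_{S,s}\|p\|_{C^1}\bigl(1+R_1^sS^s\bigr)\|u\|_{L^\infty}\frac{1}{R}. \]
This gives (5.7) in the case $\beta = 1$. Here again, we conclude with a real interpolation argument. The proof of (5.8) follows the same lines. □

5.2. Energy bounds

Proposition 5.3. Let $\tau \in \mathbb{C}\setminus\mathbb{R}$, let $p \in C^{0,1/2}(\mathbb{R}^2)$ be such that $\operatorname{supp}(p) \subset K$ for some compact set $K$, and $\int|p|^2 = 1$. Consider $u_\tau$ as defined by (1.28), and define
\[ v = \|\Pi_0(\rho u_\tau)\|^{-1}_{L^2(\mathbb{R}^2)}\,\Pi_0(\rho u_\tau), \tag{5.12} \]
where $\rho$ is given by
\[ \rho(x) = \frac{1}{\sqrt{R_1R_2}}\,p\Bigl(\frac{x_1}{R_1},\frac{x_2}{R_2}\Bigr),\qquad R_1 = \Bigl(\frac{4g_0\kappa}{\pi\varepsilon^3}\Bigr)^{1/4},\qquad R_2 = \Bigl(\frac{4g_0\varepsilon}{\pi\kappa^3}\Bigr)^{1/4}. \tag{5.13} \]
Then we have, with $E(u)$ defined by (1.21),
\[ E(u) = \sqrt{\frac{2g_0\varepsilon\kappa}{\pi}}\int_{\mathbb{R}^2}\Bigl(\frac12|x|^2|p(x)|^2 + \frac{\pi\gamma(\tau)}{4}|p(x)|^4\Bigr)\,dx + O\Bigl(\sqrt{\varepsilon\kappa}\,\Bigl(\frac{\kappa^3}{\varepsilon}\Bigr)^{1/8}\Bigr), \tag{5.14} \]
for $(\varepsilon, \kappa\varepsilon^{-1/3}) \to (0,0)$, where $\gamma(\tau)$ is given by (1.31).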
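N.B. The $L^\infty$ function $\rho u_\tau$ does not belong to $\Lambda_0$ since it is compactly supported and not identically 0; as a result, $\|\Pi_0(\rho u_\tau)\|_{L^2} \neq 0$ and $v$ makes sense.

Proof. First note that $R = \min(R_1,R_2) = R_2$, and that Lemma 5.1 with $r = 2$ implies
\[ \|\Pi_0(\rho u_\tau) - \rho u_\tau\|_{L^2} \le CR^{-1/2} = C\Bigl(\frac{\kappa^3}{\varepsilon}\Bigr)^{1/8}. \tag{5.15} \]
We then apply Lemma 5.2 for $s = 1$, $\beta = 1/2$, finding
\[
\begin{aligned}
\Bigl|\int_{\mathbb{R}^2}x_1^2|\Pi_0(\rho u_\tau)|^2 - \int_{\mathbb{R}^2}x_1^2|\rho|^2|u_\tau|^2\Bigr| &\le C\bigl(\|x_1\Pi_0(\rho u_\tau)\|_{L^2} + \|x_1\rho u_\tau\|_{L^2}\bigr)\frac{1+R_1}{R^{1/2}}\\
&\le C\Bigl(2\|x_1\rho u_\tau\|_{L^2} + C\frac{1+R_1}{R^{1/2}}\Bigr)\frac{1+R_1}{R^{1/2}}.
\end{aligned}
\]
We also compute
\[ \int_{\mathbb{R}^2}x_1^2|\rho(x)|^2|u_\tau(x)|^2\,dx \le R_1^2\|u_\tau\|^2_{L^\infty}\int_{\mathbb{R}^2}x_1^2|p(x)|^2\,dx \le CR_1^2. \]
Hence, we get
\[ \frac{\varepsilon^2}{2}\Bigl|\int_{\mathbb{R}^2}x_1^2|\Pi_0(\rho u_\tau)|^2 - \int_{\mathbb{R}^2}x_1^2|\rho|^2|u_\tau|^2\Bigr| \le C\varepsilon^2\frac{(1+R_1)^2}{R^{1/2}} \le C\sqrt{\varepsilon\kappa}\,\Bigl(\frac{\kappa^3}{\varepsilon}\Bigr)^{1/8}. \tag{5.16} \]
A similar argument allows to show that
\[ \frac{\kappa^2}{2}\Bigl|\int_{\mathbb{R}^2}x_2^2|\Pi_0(\rho u_\tau)|^2 - \int_{\mathbb{R}^2}x_2^2|\rho|^2|u_\tau|^2\Bigr| \le C\kappa^2\frac{(1+R_2)^2}{R^{1/2}} \le C\sqrt{\varepsilon\kappa}\,\Bigl(\frac{\kappa^3}{\varepsilon}\Bigr)^{1/8}. \tag{5.17} \]
Turning to the last term of the energy, we apply Lemma 5.1 again, with $r = 4$, $\beta = 1/2$, finding
\[
\begin{aligned}
\Bigl|\int_{\mathbb{R}^2}|\Pi_0(\rho u_\tau)|^4 - \int_{\mathbb{R}^2}|\rho u_\tau|^4\Bigr| &\le 2\bigl(\|\Pi_0(\rho u_\tau)\|^3_{L^4} + \|\rho u_\tau\|^3_{L^4}\bigr)\|\Pi_0(\rho u_\tau) - \rho u_\tau\|_{L^4}\\
&\le C\|\rho u_\tau\|^3_{L^4}(R_1R_2)^{-1/4}R^{-1/2}.
\end{aligned}
\]
In addition, we have
\[ \int_{\mathbb{R}^2}|\rho u_\tau|^4 \le \|u_\tau\|^4_{L^\infty}\int_{\mathbb{R}^2}|\rho|^4 = \|u_\tau\|^4_{L^\infty}(R_1R_2)^{-1}\int_{\mathbb{R}^2}p^4. \]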
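Hence, we obtain
\[ \Bigl|\int_{\mathbb{R}^2}|\Pi_0(\rho u_\tau)|^4 - \int_{\mathbb{R}^2}|\rho u_\tau|^4\Bigr| \le C(R_1R_2)^{-1}R^{-1/2} \le C\sqrt{\varepsilon\kappa}\,\Bigl(\frac{\kappa^3}{\varepsilon}\Bigr)^{1/8}. \tag{5.18} \]
Combining (5.16), (5.17) and (5.18), we have
\[ E\bigl(\Pi_0(\rho u_\tau)\bigr) = E(\rho u_\tau)\Bigl(1 + O\Bigl(\Bigl(\frac{\kappa^3}{\varepsilon}\Bigr)^{1/8}\Bigr)\Bigr). \]
Hence, with the help of (5.15), we get
\[ E(v) = E\Bigl(\frac{\rho u_\tau}{\|\rho u_\tau\|_{L^2}}\Bigr)\Bigl(1 + O\Bigl(\Bigl(\frac{\kappa^3}{\varepsilon}\Bigr)^{1/8}\Bigr)\Bigr). \]
Finally, we estimate the terms of $E(\rho u_\tau/\|\rho u_\tau\|_{L^2})$: using real interpolation between $C^0$ and $C^{0,1}$, we obtain
\[ \|\rho u_\tau\|^2_{L^2} = \int_{\mathbb{R}^2}p(x)^2|u_\tau(R_1x_1,R_2x_2)|^2\,dx = \fint|u_\tau|^2 + O\Bigl(\frac{1}{R^{1/2}}\Bigr) = \fint|u_\tau|^2 + O\Bigl(\Bigl(\frac{\kappa^3}{\varepsilon}\Bigr)^{1/8}\Bigr). \tag{5.19} \]
Moreover, we have
\[ \frac{\varepsilon^2}{2}\int_{\mathbb{R}^2}x_1^2|\rho|^2|u_\tau|^2 = \frac{\varepsilon^2}{2}R_1^2\Bigl(\fint|u_\tau|^2 + O\Bigl(\Bigl(\frac{\kappa^3}{\varepsilon}\Bigr)^{1/8}\Bigr)\Bigr)\int_{\mathbb{R}^2}x_1^2|p(x)|^2\,dx, \tag{5.20} \]
\[ \frac{\kappa^2}{2}\int_{\mathbb{R}^2}x_2^2|\rho|^2|u_\tau|^2 = \frac{\kappa^2}{2}R_2^2\Bigl(\fint|u_\tau|^2 + O\Bigl(\Bigl(\frac{\kappa^3}{\varepsilon}\Bigr)^{1/8}\Bigr)\Bigr)\int_{\mathbb{R}^2}x_2^2|p(x)|^2\,dx, \tag{5.21} \]
\[ \frac{g_0}{2}\int_{\mathbb{R}^2}|\rho|^4|u_\tau|^4 = \frac{g_0}{2R_1R_2}\Bigl(\fint|u_\tau|^4 + O\Bigl(\Bigl(\frac{\kappa^3}{\varepsilon}\Bigr)^{1/8}\Bigr)\Bigr)\int_{\mathbb{R}^2}|p|^4. \tag{5.22} \]
Thus, collecting (5.19), (5.20), (5.21) and (5.22),
\[ E(u) = \Bigl[\frac{\varepsilon^2}{2}R_1^2\int_{\mathbb{R}^2}x_1^2|p(x)|^2\,dx + \frac{\kappa^2}{2}R_2^2\int_{\mathbb{R}^2}x_2^2|p(x)|^2\,dx + \frac{g_0}{2R_1R_2}\,\frac{\fint|u_\tau|^4}{(\fint|u_\tau|^2)^2}\int_{\mathbb{R}^2}|p|^4\Bigr]\Bigl(1 + O\Bigl(\Bigl(\frac{\kappa^3}{\varepsilon}\Bigr)^{1/8}\Bigr)\Bigr) \]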
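\[
\begin{aligned}
&= \sqrt{\frac{2g_0\varepsilon\kappa}{\pi}}\int_{\mathbb{R}^2}\Bigl(\frac12\bigl(x_1^2+x_2^2\bigr)|p(x)|^2 + \frac{\pi\gamma(\tau)}{4}|p|^4\Bigr)\times\Bigl(1 + O\Bigl(\Bigl(\frac{\kappa^3}{\varepsilon}\Bigr)^{1/8}\Bigr)\Bigr)\\
&= \sqrt{\frac{2g_0\varepsilon\kappa}{\pi}}\int_{\mathbb{R}^2}\Bigl(\frac12\bigl(x_1^2+x_2^2\bigr)|p(x)|^2 + \frac{\pi\gamma(\tau)}{4}|p|^4\Bigr) + O\Bigl(\sqrt{\varepsilon\kappa}\,\Bigl(\frac{\kappa^3}{\varepsilon}\Bigr)^{1/8}\Bigr).
\end{aligned}
\]
□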
Proof of Theorem 1.1. We first prove the lower bound in (1.33): this is done by noticing that $J(\varepsilon,\kappa) \le I(\varepsilon,\kappa)$, where
\[ J(\varepsilon,\kappa) = \inf\Bigl\{E(u),\ u \in L^2\bigl(\mathbb{R}^2,(1+|x|^2)\,dx\bigr) \cap L^4(\mathbb{R}^2),\ \int_{\mathbb{R}^2}|u|^2 = 1\Bigr\}. \]
In addition, the minimizer of $J(\varepsilon,\kappa)$ may be explicitly computed (up to the multiplication by a complex function of modulus one):
\[ u(x) = \sqrt{\frac{2}{\pi R_1R_2}}\Bigl(1 - \frac{x_1^2}{R_1^2} - \frac{x_2^2}{R_2^2}\Bigr)_+^{1/2}, \tag{5.23} \]
with $R_1, R_2$ defined by (5.13). Inserting (5.23) in the energy, one finds the lower bound of (1.33). In addition, the inverted parabola (5.23) is compactly supported, so it cannot be in $\Lambda_0$. Hence, the inequality is strict. In order to prove the upper bound, we apply Proposition 5.3, with
\[ p(x) = \sqrt{\frac{2}{\pi\sqrt{\gamma(\tau)}}}\Bigl(1 - \frac{|x|^2}{\sqrt{\gamma(\tau)}}\Bigr)_+^{1/2}, \]
and $\tau = j$. This corresponds to minimizing the leading order term of (5.14) with respect to $\tau$ and $p$, with the constraint $\int|p|^2 = 1$. □

6. Strong anisotropy

We give in this section the proof of Theorem 1.2. We deal here with the strongly asymmetric case, that is, (1.35), which we recall here:
\[ \kappa \gg \varepsilon^{1/3}. \tag{6.1} \]
We first prove an upper bound for the energy in Section 6.1, then a lower bound in Section 6.2, and conclude the proof in Section 6.3.
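6.1. Upper bound for the energy

Lemma 6.1. Assume that $\rho \in L^2(\mathbb{R})$. Then the function
\[ u(x_1,x_2) = \frac{1}{2^{1/4}}\,e^{-\frac{\pi}{2}x_2^2}\int_{\mathbb{R}}e^{-\frac{\pi}{2}((x_1-y_1)^2-2iy_1x_2)}\rho(y_1)\,dy_1 \tag{6.2} \]
satisfies $u \in \Lambda_0$.

Proof. We first write
\[ u(x_1,x_2)\,e^{\frac{\pi}{2}(x_1^2+x_2^2)} = \frac{1}{2^{1/4}}\int_{\mathbb{R}}e^{-\frac{\pi}{2}(y_1^2-2(x_1+ix_2)y_1)}\rho(y_1)\,dy_1, \]
which is a holomorphic function of $x_1+ix_2$. In addition, we have
\[ |u(x_1,x_2)| \le \frac{1}{2^{1/4}}\,e^{-\frac{\pi}{2}x_2^2}\bigl(|\rho| * e^{-\frac{\pi}{2}y_1^2}\bigr)(x_1). \]
Hence, using Young's inequality, we get
\[ \|u\|_{L^2(\mathbb{R}^2)} \le \frac{1}{2^{1/4}}\bigl\|e^{-\frac{\pi}{2}x_2^2}\bigr\|_{L^2(\mathbb{R})}\|\rho\|_{L^2(\mathbb{R})}\bigl\|e^{-\frac{\pi}{2}y_1^2}\bigr\|_{L^1(\mathbb{R})} = 2^{1/4}\|\rho\|_{L^2(\mathbb{R})}, \]
hence $u \in L^2(\mathbb{R}^2)$. □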
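The constant $2^{1/4}$ at the end of the proof of Lemma 6.1 comes from two one-dimensional Gaussian norms. A minimal sketch checking the arithmetic by quadrature (helper names are illustrative):

```python
import math

def simpson(f, a, b, n=4000):
    """Composite Simpson rule on [a, b] with an even number n of subintervals."""
    if n % 2:
        n += 1
    h = (b - a) / n
    total = f(a) + f(b)
    for k in range(1, n):
        total += (4 if k % 2 else 2) * f(a + k * h)
    return total * h / 3

def window(t):
    """The one-dimensional Gaussian exp(-pi*t^2/2) appearing in (6.2)."""
    return math.exp(-math.pi * t * t / 2)

l2_norm = math.sqrt(simpson(lambda t: window(t) ** 2, -8.0, 8.0))  # norm in L^2(R)
l1_norm = simpson(window, -8.0, 8.0)                               # norm in L^1(R)
young_constant = l2_norm * l1_norm / 2 ** 0.25

print(l2_norm, l1_norm, young_constant, 2 ** 0.25)
```

The $L^2$ norm equals 1 and the $L^1$ norm equals $\sqrt2$, so the product with the prefactor $2^{-1/4}$ is $2^{1/4}$, as stated.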
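Lemma 6.2. Let $p \in C^2(\mathbb{R})$ have compact support with $\operatorname{supp}(p) \subset (-T,T)$, and consider the function
\[ \rho(t) = \frac{1}{\sqrt R}\,p\Bigl(\frac{t}{R}\Bigr). \tag{6.3} \]
Then, for any $r \ge 1$, there exists a constant $C_r$ depending only on $r$ such that the function $u$ defined by (6.2) satisfies, for $R \ge 1$,
\[ \bigl\|u(x_1,x_2) - 2^{1/4}\rho(x_1)e^{-\pi x_2^2+i\pi x_1x_2} - i2^{1/4}x_2\rho'(x_1)e^{-\pi x_2^2+i\pi x_1x_2}\bigr\|_{L^r(\mathbb{R}^2)} \le C_rT^{1/r}\frac{\|p''\|_{L^\infty(\mathbb{R})}}{R^{5/2-1/r}}. \tag{6.4} \]

Proof. We use a Taylor expansion of $p(\frac{y_1}{R})$ around $\frac{x_1}{R}$, that is,
\[ p\Bigl(\frac{y_1}{R}\Bigr) = p\Bigl(\frac{x_1}{R}\Bigr) + \frac{1}{R}\,p'\Bigl(\frac{x_1}{R}\Bigr)(y_1-x_1) + \frac{1}{R^2}(x_1-y_1)^2\int_0^1(1-t)\,p''\Bigl(\frac{x_1}{R} + \frac{t(y_1-x_1)}{R}\Bigr)\,dt. \tag{6.5} \]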
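In addition we have
\[ \frac{1}{2^{1/4}}\,e^{-\frac{\pi}{2}x_2^2}\int_{\mathbb{R}}e^{-\frac{\pi}{2}((x_1-y_1)^2-2iy_1x_2)}\frac{1}{\sqrt R}\,p\Bigl(\frac{x_1}{R}\Bigr)\,dy_1 = \frac{2^{1/4}}{\sqrt R}\,p\Bigl(\frac{x_1}{R}\Bigr)e^{-\pi x_2^2+i\pi x_1x_2}, \]
and
\[ \frac{1}{2^{1/4}}\,e^{-\frac{\pi}{2}x_2^2}\int_{\mathbb{R}}e^{-\frac{\pi}{2}((x_1-y_1)^2-2iy_1x_2)}\frac{1}{R^{3/2}}\,p'\Bigl(\frac{x_1}{R}\Bigr)(y_1-x_1)\,dy_1 = \frac{i2^{1/4}x_2}{R^{3/2}}\,p'\Bigl(\frac{x_1}{R}\Bigr)e^{-\pi x_2^2+i\pi x_1x_2}. \]
Setting
\[ v(x_1,x_2) = u(x_1,x_2) - 2^{1/4}\rho(x_1)e^{-\pi x_2^2+i\pi x_1x_2} - i2^{1/4}x_2\rho'(x_1)e^{-\pi x_2^2+i\pi x_1x_2}, \tag{6.6} \]
we infer
\[
\begin{aligned}
|v(x_1,x_2)| &\le \frac{1}{2^{1/4}R^{5/2}}\,e^{-\frac{\pi}{2}x_2^2}\int_{\mathbb{R}}\int_0^1y_1^2e^{-\frac{\pi}{2}y_1^2}(1-t)\Bigl|p''\Bigl(\frac{x_1}{R} + t\frac{y_1}{R}\Bigr)\Bigr|\,dt\,dy_1\\
&\le \frac{\|p''\|_{L^\infty}}{2^{1/4}R^{5/2}}\,e^{-\frac{\pi}{2}x_2^2}\int_{\mathbb{R}}\int_0^1y_1^2e^{-\frac{\pi}{2}y_1^2}(1-t)\,\mathbf{1}_{(-TR,TR)}(x_1+ty_1)\,dt\,dy_1.
\end{aligned}
\]
Hence, using Jensen's inequality, we see that there is a constant $C_r$ depending only on $r$ such that
\[ |v(x_1,x_2)|^r \le C_r\frac{\|p''\|^r_{L^\infty}}{R^{5r/2}}\,e^{-r\frac{\pi}{2}x_2^2}\int_{\mathbb{R}}\int_0^1y_1^2e^{-\frac{\pi}{2}y_1^2}(1-t)\,\mathbf{1}_{(-TR,TR)}(x_1+ty_1)\,dt\,dy_1, \]
whence
\[
\begin{aligned}
\|v\|^r_{L^r} &\le C_r\frac{\|p''\|^r_{L^\infty}}{R^{5r/2}}\int_{\mathbb{R}}\int_{\mathbb{R}}\int_0^1e^{-r\frac{\pi}{2}x_2^2}y_1^2e^{-\frac{\pi}{2}y_1^2}(1-t)\Bigl(\int_{\mathbb{R}}\mathbf{1}_{(-TR,TR)}(x_1+ty_1)\,dx_1\Bigr)\,dt\,dx_2\,dy_1\\
&= C_r\frac{\|p''\|^r_{L^\infty}}{R^{5r/2}}(2TR)\int_{\mathbb{R}}\int_{\mathbb{R}}\int_0^1e^{-r\frac{\pi}{2}x_2^2}y_1^2e^{-\frac{\pi}{2}y_1^2}(1-t)\,dt\,dx_2\,dy_1\\
&= C_r\frac{\|p''\|^r_{L^\infty}}{R^{5r/2}}\,TR,
\end{aligned}
\]
which implies (6.4).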
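Lemma 6.3. Under the same assumptions as Lemma 6.2, let $u$ be defined by (6.2). Then, there exists a constant $C_T > 0$ depending only on $T$ such that $u$ satisfies
\[ \int_{\mathbb{R}^2}x_1^2\bigl|u(x_1,x_2) - 2^{1/4}\rho(x_1)e^{-\pi x_2^2+i\pi x_1x_2} - i2^{1/4}x_2\rho'(x_1)e^{-\pi x_2^2+i\pi x_1x_2}\bigr|^2\,dx \le C_T\frac{\|p''\|^2_{L^\infty(\mathbb{R})}}{R^2}, \tag{6.7} \]
and
\[ \int_{\mathbb{R}^2}x_2^2\bigl|u(x_1,x_2) - 2^{1/4}\rho(x_1)e^{-\pi x_2^2+i\pi x_1x_2} - i2^{1/4}x_2\rho'(x_1)e^{-\pi x_2^2+i\pi x_1x_2}\bigr|^2\,dx \le C_T\frac{\|p''\|^2_{L^\infty(\mathbb{R})}}{R^4}. \tag{6.8} \]

Proof. Here again, we use the Taylor expansion (6.5). Hence, $v$ being defined by (6.6), we have
\[
\begin{aligned}
|x_1||v(x_1,x_2)| &\le \frac{\|p''\|_{L^\infty}}{2^{1/4}R^{5/2}}\,|x_1|\,e^{-\frac{\pi}{2}x_2^2}\int_{\mathbb{R}}\int_0^1y_1^2e^{-\frac{\pi}{2}y_1^2}(1-t)\,\mathbf{1}_{(-TR,TR)}(x_1+ty_1)\,dt\,dy_1\\
&\le \frac{\|p''\|_{L^\infty}}{2^{1/4}R^{5/2}}\,e^{-\frac{\pi}{2}x_2^2}\int_{\mathbb{R}}\int_0^1y_1^2e^{-\frac{\pi}{2}y_1^2}(1-t)\,|x_1+ty_1|\,\mathbf{1}_{(-TR,TR)}(x_1+ty_1)\,dt\,dy_1\\
&\quad + \frac{\|p''\|_{L^\infty}}{2^{1/4}R^{5/2}}\,e^{-\frac{\pi}{2}x_2^2}\int_{\mathbb{R}}\int_0^1|y_1|^3e^{-\frac{\pi}{2}y_1^2}\,t(1-t)\,\mathbf{1}_{(-TR,TR)}(x_1+ty_1)\,dt\,dy_1.
\end{aligned}
\]
Hence, using Jensen's inequality and arguing as in the proof of Lemma 6.2, we have
\[ \|x_1v\|_{L^2(\mathbb{R}^2)} \le C\frac{\|p''\|_{L^\infty}}{R^{5/2}}\bigl((RT)^{3/2} + \sqrt{RT}\bigr), \]
where $C$ is a universal constant. This implies (6.7). A similar computation gives
\[ \|x_2v\|_{L^2(\mathbb{R}^2)} \le C\frac{\|p''\|_{L^\infty}}{R^{5/2}}\sqrt{RT}, \]
which proves (6.8). □

6.2. Lower bound for the energy

We first recall an important result by Carlen [6] about wave functions in $\Lambda_0$ (defined by (1.22)):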
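Lemma 6.4. (See E.A. Carlen [6].) For any $u \in \Lambda_0$, $\nabla|u| \in L^2$, and we have
\[ \int_{\mathbb{R}^2}\bigl|\nabla|u|\bigr|^2 = \pi\int_{\mathbb{R}^2}|u|^2. \tag{6.9} \]

Remark 6.5. The result of Carlen is actually much more general than the one we cite here, but the special case (6.9) is the only thing we need.

Lemma 6.4 implies the following decomposition of the energy in $\Lambda_0$:

Lemma 6.6. Let $u \in \Lambda_0$ be such that $\|u\|_{L^2} = 1$. Then, we have
\[
\begin{aligned}
E(u) = &-\frac{\kappa^2}{8\pi} + \frac{\kappa^2}{2}\Bigl[\frac{1}{4\pi^2}\int_{\mathbb{R}^2}\bigl(\partial_2|u|\bigr)^2 + \int_{\mathbb{R}^2}x_2^2|u|^2\Bigr]\\
&+ \frac{\kappa^2}{8\pi^2}\int_{\mathbb{R}^2}\bigl(\partial_1|u|\bigr)^2 + \frac{\varepsilon^2}{2}\int_{\mathbb{R}^2}x_1^2|u|^2 + \frac{g_0}{2}\int_{\mathbb{R}^2}|u|^4.
\end{aligned}
\tag{6.10}
\]

Proof. We write
\[ E(u) = -\frac{\kappa^2}{8\pi} + \frac{\kappa^2}{8\pi} + \frac{\kappa^2}{2}\int_{\mathbb{R}^2}x_2^2|u|^2 + \frac{\varepsilon^2}{2}\int_{\mathbb{R}^2}x_1^2|u|^2 + \frac{g_0}{2}\int_{\mathbb{R}^2}|u|^4. \tag{6.11} \]
Hence, applying (6.9), we find (6.10). □

Note that the first line is easily seen to be bounded from below by the first eigenvalue of the corresponding harmonic oscillator, namely $\kappa^2/(4\pi)$. Hence, (6.10) readily implies
\[ E(u) \ge \frac{\kappa^2}{8\pi}. \tag{6.12} \]
This explains why we chose the constant $\frac{\kappa^2}{8\pi}$ in the decomposition (6.11): it is the constant which gives the highest lower bound in (6.12).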
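The arithmetic behind (6.12) can be illustrated numerically. The sketch below evaluates the quadratic form of $-\frac{1}{4\pi^2}\frac{d^2}{dx^2} + x^2$ at the Gaussian $2^{1/4}e^{-\pi x^2}$, which gives the value $\frac{1}{2\pi}$ quoted for the first eigenvalue, and then combines it with the constant $-\frac{\kappa^2}{8\pi}$. It only evaluates the form at this one function and does not prove minimality; the value of $\kappa$ is a hypothetical sample chosen for illustration.

```python
import math

def simpson(f, a, b, n=4000):
    """Composite Simpson rule on [a, b] with an even number n of subintervals."""
    if n % 2:
        n += 1
    h = (b - a) / n
    total = f(a) + f(b)
    for k in range(1, n):
        total += (4 if k % 2 else 2) * f(a + k * h)
    return total * h / 3

def phi(x):
    """Normalized Gaussian 2^(1/4) * exp(-pi*x^2)."""
    return 2 ** 0.25 * math.exp(-math.pi * x * x)

def dphi(x):
    """Exact derivative of phi."""
    return -2 * math.pi * x * phi(x)

norm_squared = simpson(lambda x: phi(x) ** 2, -6.0, 6.0)
kinetic = simpson(lambda x: dphi(x) ** 2, -6.0, 6.0) / (4 * math.pi ** 2)
potential = simpson(lambda x: x * x * phi(x) ** 2, -6.0, 6.0)
oscillator_value = (kinetic + potential) / norm_squared

KAPPA = 0.3  # hypothetical sample value of kappa
lower_bound = -KAPPA ** 2 / (8 * math.pi) + 0.5 * KAPPA ** 2 * oscillator_value

print(oscillator_value, 1 / (2 * math.pi))
print(lower_bound, KAPPA ** 2 / (8 * math.pi))
```

With the oscillator term contributing $\frac{\kappa^2}{2}\cdot\frac{1}{2\pi} = \frac{\kappa^2}{4\pi}$, the remaining nonnegative terms of (6.10) can be dropped and the bound $\frac{\kappa^2}{8\pi}$ follows.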
6.3. Proof of Theorem 1.2

Step 1. Upper bound for the energy. We pick a real-valued function $p$ such that
\[ p \in C^2(\mathbb{R}),\qquad \operatorname{supp}(p) \subset (-T,T),\qquad \int_{\mathbb{R}}p^2 = 1, \]
and define $u$ by (6.2), where $\rho$ is defined by (6.3), with
\[ R = \varepsilon^{-2/3}. \tag{6.13} \]
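Hence, setting $v = \frac{1}{\|u\|_{L^2}}u$, we know by Lemma 6.1 that $v$ is a test function for $I(\varepsilon,\kappa)$. Hence,
\[ I(\varepsilon,\kappa) \le E(v). \tag{6.14} \]
Next, we set
\[ v_1 = 2^{1/4}\rho(x_1)e^{-\pi x_2^2+i\pi x_1x_2} + i2^{1/4}x_2\rho'(x_1)e^{-\pi x_2^2+i\pi x_1x_2}, \]
and point out that, applying Lemma 6.2 with $r = 2$,
\[ \|u\|^2_{L^2} = \|v_1\|^2_{L^2} + O(\varepsilon^{4/3}) = 1 + 2^{1/2}\int_{\mathbb{R}}\rho'(x_1)^2\,dx_1\int_{\mathbb{R}}x_2^2e^{-2\pi x_2^2}\,dx_2 + O(\varepsilon^{4/3}) = 1 + C\varepsilon^{4/3}\int_{\mathbb{R}}p'^2 + O(\varepsilon^{4/3}), \]
where we have used that the two terms defining $v_1$ are orthogonal to each other. Hence,
\[ \|u\|_{L^2} = 1 + O(\varepsilon^{4/3}), \]
where the term $O(\varepsilon^{4/3})$ depends only on $\|p'\|_{L^2}$, $\|p''\|_{L^\infty}$ and $T$. According to (6.14) and the definition of $v$, we thus have
\[ I(\varepsilon,\kappa) \le E(u)\bigl[1 + O(\varepsilon^{4/3})\bigr], \tag{6.15} \]
where the term $O(\varepsilon^{4/3})$ is independent of $\kappa$. We now compute the energy of $u$: applying Lemma 6.3, we have
\[ \Bigl|\int_{\mathbb{R}^2}x_1^2|u|^2 - \int_{\mathbb{R}^2}x_1^2|v_1|^2\Bigr| \le C\varepsilon^{2/3}\bigl(\|x_1u\|_{L^2} + \|x_1v_1\|_{L^2}\bigr) \le C\varepsilon^{2/3}\bigl(2\|x_1v_1\|_{L^2} + C\varepsilon^{2/3}\bigr). \]
Moreover, we have, since $\rho$ is real-valued,
\[ \int_{\mathbb{R}^2}x_1^2|v_1|^2\,dx = \int_{\mathbb{R}}x_1^2\rho(x_1)^2\,dx_1 + \frac{1}{4\pi}\int_{\mathbb{R}}x_1^2\rho'(x_1)^2\,dx_1 = \varepsilon^{-4/3}\int_{\mathbb{R}}t^2p(t)^2\,dt + O(1). \]
Hence, we have
\[ \int_{\mathbb{R}^2}x_1^2|u|^2 = \varepsilon^{-4/3}\int_{\mathbb{R}}t^2p(t)^2\,dt + O(1). \tag{6.16} \]
The same kind of argument allows us to prove that
\[ \int_{\mathbb{R}^2}x_2^2|u|^2 = \int_{\mathbb{R}^2}x_2^2|v_1|^2 + O(\varepsilon^{4/3}) = \frac{1}{4\pi} + O(\varepsilon^{4/3}). \tag{6.17} \]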
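Next, we apply Lemma 6.2 with $r = 4$:
\[ \Bigl|\int_{\mathbb{R}^2}|u|^4 - \int_{\mathbb{R}^2}|v_1|^4\Bigr| \le 2\|u - v_1\|_{L^4}\bigl(\|u\|^3_{L^4} + \|v_1\|^3_{L^4}\bigr) \le C\varepsilon^{3/2}\bigl(\|u\|^3_{L^4} + \|v_1\|^3_{L^4}\bigr). \]
Moreover, we have $\|u\|_{L^4} \le \|v_1\|_{L^4} + C\varepsilon^{2/3}$, hence
\[ \Bigl|\int_{\mathbb{R}^2}|u|^4 - \int_{\mathbb{R}^2}|v_1|^4\Bigr| \le C\varepsilon^{3/2}\|v_1\|^3_{L^4}. \]
We also have
\[
\begin{aligned}
\int_{\mathbb{R}^2}|v_1|^4 &= \int_{\mathbb{R}^2}\Bigl(2\rho(x_1)^4e^{-4\pi x_2^2} + 4\rho(x_1)^2\rho'(x_1)^2x_2^2e^{-4\pi x_2^2} + 2x_2^4\rho'(x_1)^4e^{-4\pi x_2^2}\Bigr)\\
&= \varepsilon^{2/3}\int_{\mathbb{R}}p^4 + \varepsilon^2\frac{1}{4\pi}\int_{\mathbb{R}}p(t)^2p'(t)^2\,dt + \varepsilon^{10/3}\frac{3}{64\pi^2}\int_{\mathbb{R}}p'^4.
\end{aligned}
\]
Hence, we obtain
\[ \int_{\mathbb{R}^2}|u|^4 = \varepsilon^{2/3}\int_{\mathbb{R}}p(t)^4\,dt + O(\varepsilon^2). \tag{6.18} \]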
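The three numerical coefficients in the expansion of $\int|v_1|^4$ are moments of $e^{-4\pi x_2^2}$. A minimal sketch checking them by quadrature (helper names are illustrative):

```python
import math

def simpson(f, a, b, n=4000):
    """Composite Simpson rule on [a, b] with an even number n of subintervals."""
    if n % 2:
        n += 1
    h = (b - a) / n
    total = f(a) + f(b)
    for k in range(1, n):
        total += (4 if k % 2 else 2) * f(a + k * h)
    return total * h / 3

def moment(power):
    """Integral over R of x^power * exp(-4*pi*x^2)."""
    return simpson(lambda x: x ** power * math.exp(-4 * math.pi * x * x), -5.0, 5.0)

coefficient_p4 = 2 * moment(0)       # multiplies the integral of rho^4
coefficient_mixed = 4 * moment(2)    # multiplies the integral of rho^2 * rho'^2
coefficient_dp4 = 2 * moment(4)      # multiplies the integral of rho'^4

print(coefficient_p4, coefficient_mixed, coefficient_dp4)
```

The values are $1$, $\frac{1}{4\pi}$ and $\frac{3}{64\pi^2}$. The powers of $\varepsilon$ then follow from the scaling $\rho(t) = R^{-1/2}p(t/R)$ with $R = \varepsilon^{-2/3}$: $\int\rho^4 = R^{-1}\int p^4$, $\int\rho^2\rho'^2 = R^{-3}\int p^2p'^2$ and $\int\rho'^4 = R^{-5}\int p'^4$.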
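Collecting (6.16), (6.17) and (6.18), we thus have
\[ E(u) = \frac{\kappa^2}{8\pi} + O\bigl(\kappa^2\varepsilon^{4/3}\bigr) + \varepsilon^{2/3}\Bigl[\frac12\int_{\mathbb{R}}t^2p(t)^2\,dt + \frac{g_0}{2}\int_{\mathbb{R}}p(t)^4\,dt\Bigr] + O(\varepsilon^2). \]
Recalling (6.15), this implies
\[ \frac{I(\varepsilon,\kappa) - \frac{\kappa^2}{8\pi}}{\varepsilon^{2/3}} \le \frac12\int_{\mathbb{R}}t^2p(t)^2\,dt + \frac{g_0}{2}\int_{\mathbb{R}}p(t)^4\,dt + O\bigl(\kappa^2\varepsilon^{2/3}\bigr) + O\bigl(\varepsilon^{4/3}\bigr). \]
As a conclusion, we have
\[ \limsup_{\varepsilon\to0,\ \frac{\varepsilon^{1/3}}{\kappa}\to0}\frac{I(\varepsilon,\kappa) - \frac{\kappa^2}{8\pi}}{\varepsilon^{2/3}} \le \frac12\int_{\mathbb{R}}t^2p(t)^2\,dt + \frac{g_0}{2}\int_{\mathbb{R}}p(t)^4\,dt, \]
for any real-valued $p \in C^2(\mathbb{R})$ having compact support, and such that $\|p\|_{L^2} = 1$. A density argument allows to prove that
\[ \limsup_{\varepsilon\to0,\ \frac{\varepsilon^{1/3}}{\kappa}\to0}\frac{I(\varepsilon,\kappa) - \frac{\kappa^2}{8\pi}}{\varepsilon^{2/3}} \le J, \]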
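where $J$ is defined by (1.37). Thus, we get
\[ \frac{I(\varepsilon,\kappa) - \frac{\kappa^2}{8\pi}}{\varepsilon^{2/3}} = J + c\Bigl(\varepsilon,\frac{\varepsilon^{1/3}}{\kappa}\Bigr), \]
with $\lim_{(t,s)\to(0,0),\ t,s>0}c(t,s) = 0$.

Step 2. Convergence of minimizers. Let $u$ be a minimizer of $I(\varepsilon,\kappa)$. Then, according to the first step, we have
\[ E(u) \le \frac{\kappa^2}{8\pi} + J\varepsilon^{2/3} + \varepsilon^{2/3}c\Bigl(\varepsilon,\frac{\varepsilon^{1/3}}{\kappa}\Bigr), \]
with $\lim_{(t,s)\to(0,0),\ t,s>0}c(t,s) = 0$. Hence, applying Lemma 6.6, we obtain
\[
\begin{aligned}
&\frac{\kappa^2}{2}\Bigl[\frac{1}{4\pi^2}\int_{\mathbb{R}^2}\bigl(\partial_2|u|\bigr)^2 + \int_{\mathbb{R}^2}x_2^2|u|^2\Bigr] + \frac{\kappa^2}{8\pi^2}\int_{\mathbb{R}^2}\bigl(\partial_1|u|\bigr)^2 + \frac{\varepsilon^2}{2}\int_{\mathbb{R}^2}x_1^2|u|^2 + \frac{g_0}{2}\int_{\mathbb{R}^2}|u|^4\\
&\qquad \le \frac{\kappa^2}{4\pi} + J\varepsilon^{2/3} + \varepsilon^{2/3}c\Bigl(\varepsilon,\frac{\varepsilon^{1/3}}{\kappa}\Bigr).
\end{aligned}
\tag{6.19}
\]
We set
\[ v(x_1,x_2) = \frac{1}{\varepsilon^{1/3}}\,|u|\Bigl(\frac{x_1}{\varepsilon^{2/3}},x_2\Bigr), \tag{6.20} \]
so that $\|v\|_{L^2} = \|u\|_{L^2} = 1$, $v \ge 0$, and (6.19) becomes
\[
\begin{aligned}
&\frac{\kappa^2}{2}\Bigl[\frac{1}{4\pi^2}\int_{\mathbb{R}^2}|\partial_2v|^2 + \int_{\mathbb{R}^2}x_2^2v^2\Bigr] + \frac{\kappa^2\varepsilon^{4/3}}{8\pi^2}\int_{\mathbb{R}^2}|\partial_1v|^2 + \frac{\varepsilon^{2/3}}{2}\Bigl[\int_{\mathbb{R}^2}x_1^2v^2 + g_0\int_{\mathbb{R}^2}v^4\Bigr]\\
&\qquad \le \frac{\kappa^2}{4\pi} + J\varepsilon^{2/3} + \varepsilon^{2/3}c\Bigl(\varepsilon,\frac{\varepsilon^{1/3}}{\kappa}\Bigr).
\end{aligned}
\tag{6.21}
\]
This implies that
\[ \int_{\mathbb{R}^2}|\partial_2v|^2 + \int_{\mathbb{R}^2}x_2^2v^2 \le C, \tag{6.22} \]
where $C$ does not depend on $(\varepsilon,\kappa)$. Moreover, since the first eigenvalue of the operator $-\frac{1}{4\pi^2}\frac{d^2}{dx_2^2} + x_2^2$ is equal to $1/(2\pi)$, (6.21) implies that
\[ \int_{\mathbb{R}^2}x_1^2v^2 + g_0\int_{\mathbb{R}^2}v^4 \le C, \tag{6.23} \]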
where $C$ does not depend on $(\varepsilon,\kappa)$. Hence, up to extracting a subsequence, $v$ converges weakly in $L^4$ and weakly in $L^2$ to some limit $v_0 \ge 0$. Using (6.22) and (6.23), we see that
\[
\int_{\mathbb{R}^2}|x|^2v^2 \le C,
\]
hence $v$ converges strongly in $L^2$. Since in addition $\partial_2v$ converges weakly in $L^2$, we have, as $(\varepsilon,\varepsilon^{1/3}\kappa^{-1})\to(0,0)$:
\[
\begin{cases}
v \longrightarrow v_0 & \text{strongly in } L^2(\mathbb{R}^2),\\
x_1v \longrightarrow x_1v_0 & \text{weakly in } L^2(\mathbb{R}^2),\\
v \longrightarrow v_0 & \text{weakly in } L^4(\mathbb{R}^2),\\
\partial_2v \longrightarrow \partial_2v_0 & \text{weakly in } L^2(\mathbb{R}^2).
\end{cases} \tag{6.24}
\]
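Hence, we may pass to the liminf in the two first terms of (6.21), getting
\[
\frac{1}{4\pi^2}\int_{\mathbb{R}^2}|\partial_2v_0|^2 + \int_{\mathbb{R}^2}x_2^2v_0^2
\le \liminf_{(\varepsilon,\varepsilon^{1/3}\kappa^{-1})\to(0,0)}\Bigl(\frac{1}{4\pi^2}\int_{\mathbb{R}^2}|\partial_2v|^2 + \int_{\mathbb{R}^2}x_2^2v^2\Bigr) \le \frac{1}{2\pi}. \tag{6.25}
\]
We use that the first eigenvalue of the operator $-\frac{1}{4\pi^2}\frac{d^2}{dx_2^2} + x_2^2$ on $L^2(\mathbb{R})$ is equal to $1/(2\pi)$, is simple, with an eigenvector equal to $2^{1/4}\exp(-\pi x_2^2)$. Thus,
\[
v_0(x_1,x_2) = \xi(x_1)\,2^{1/4}e^{-\pi x_2^2}, \tag{6.26}
\]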
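with $\xi \ge 0$. Next, (6.21) and (6.24) also imply
\[
\frac12\int_{\mathbb{R}^2}x_1^2v_0^2 + \frac{g_0}{2}\int_{\mathbb{R}^2}v_0^4
\le \liminf_{\varepsilon\to0,\ \varepsilon^{1/3}/\kappa\to0}\Bigl(\frac12\int_{\mathbb{R}^2}x_1^2v^2 + \frac{g_0}{2}\int_{\mathbb{R}^2}v^4\Bigr) \le J. \tag{6.27}
\]
Using (6.26), we infer
\[
\frac12\int_{\mathbb{R}}x_1^2\xi^2 + \frac{g_0}{2}\int_{\mathbb{R}}\xi^4 \le J.
\]
Hence, recalling that, in view of (6.24) and (6.26), we have $\int\xi^2 = 1$, the definition of $J$ implies that $\xi$ is the unique nonnegative minimizer of (1.37). This proves (1.38), with strong convergence in $L^2$ and weak convergence in $L^4$. Moreover, using (6.27) again and the fact that $\xi$ is a minimizer of (1.37), we have
\[
\lim_{(\varepsilon,\varepsilon^{1/3}\kappa^{-1})\to(0,0)}\Bigl[\int_{\mathbb{R}^2}x_1^2\bigl(v_0^2 - v^2\bigr) + g_0\int_{\mathbb{R}^2}\bigl(v_0^4 - v^4\bigr)\Bigr] = 0.
\]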
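Next, using the explicit formula giving $v_0$, a simple computation gives
\[
\int_{\mathbb{R}^2}x_1^2\bigl(v^2 - v_0^2\bigr) + g_0\int_{\mathbb{R}^2}\bigl(v^4 - v_0^4\bigr) \ge g_0\int_{\mathbb{R}^2}\bigl(v^2 - v_0^2\bigr)^2,
\]
hence $v^2$ converges to $v_0^2$ strongly in $L^2(\mathbb{R}^2)$. Thus,
\[
\int_{\mathbb{R}^2}v^4 \longrightarrow \int_{\mathbb{R}^2}v_0^4.
\]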
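The space $L^4(\mathbb{R}^2)$ being uniformly convex, this implies strong convergence in $L^4$, hence (1.38).
Step 3. Lower bound for the energy. Using Lemma 6.6, we have
\[
E(u) \ge \frac{\kappa^2}{4\pi} + \frac{\varepsilon^{2/3}}{2}\Bigl(\int_{\mathbb{R}^2}x_1^2v^2 + g_0\int_{\mathbb{R}^2}v^4\Bigr).
\]
In addition, we already proved (1.38), which implies
\[
\frac12\int_{\mathbb{R}^2}x_1^2v^2 + \frac{g_0}{2}\int_{\mathbb{R}^2}v^4 \longrightarrow \frac12\int_{\mathbb{R}^2}x_1^2v_0^2 + \frac{g_0}{2}\int_{\mathbb{R}^2}v_0^4 = J,
\]
which implies the lower bound for the energy. □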
Acknowledgments
We would like to thank A.L. Fetter and J. Dalibard for very useful comments on the physics of the problem. We also acknowledge support from the French ministry grant ANR-BLAN-0238, VoLQuan and express our gratitude to our colleagues participating in this ANR project, in particular T. Jolicœur and S. Ouvry.

Appendix A
A.1. Glossary
A.1.1. The harmonic oscillator
The operator
\[
\sum_{1\le j\le n}\pi\bigl(\xi_j^2 + \lambda_j^2x_j^2\bigr)^w = \pi\sum_{1\le j\le n}\bigl(D_{x_j}^2 + \lambda_j^2x_j^2\bigr),\qquad \lambda_j > 0,\quad D_{x_j} = \frac{1}{2i\pi}\partial_{x_j}, \tag{A.1}
\]
has a discrete spectrum
\[
\Bigl\{\frac12\sum_{1\le j\le n}\lambda_j + \sum_{1\le j\le n}\alpha_j\lambda_j\Bigr\}_{(\alpha_1,\dots,\alpha_n)\in\mathbb{N}^n}, \tag{A.2}
\]
and its ground state is one-dimensional, generated by the Gaussian function
\[
\varphi_\lambda(x) = 2^{n/4}\prod_{1\le j\le n}\lambda_j^{1/4}e^{-\pi\lambda_jx_j^2}. \tag{A.3}
\]
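The one-dimensional case of (A.1)–(A.3) can be checked numerically. The sketch below is only an illustration under stated assumptions: the function names, the sample value of `lam`, the evaluation point and the finite-difference step are all choices made here, not part of the paper. It applies $\pi(D_x^2 + \lambda^2x^2)$ to the Gaussian with a central difference and compares the result with the expected lowest eigenvalue $\lambda/2$, then approximates the $L^2$ norm of the Gaussian.

```python
import math

def phi(x, lam):
    """One-dimensional Gaussian 2^{1/4} lam^{1/4} exp(-pi lam x^2), i.e. (A.3) with n = 1."""
    return (2.0 * lam) ** 0.25 * math.exp(-math.pi * lam * x * x)

def apply_oscillator(f, x, lam, h=1e-3):
    """Apply pi (D_x^2 + lam^2 x^2) with D_x = (2 i pi)^{-1} d/dx, using a central difference."""
    second = (f(x + h) - 2.0 * f(x) + f(x - h)) / (h * h)
    # D_x^2 = -(4 pi^2)^{-1} d^2/dx^2
    return math.pi * (-second / (4.0 * math.pi ** 2) + lam ** 2 * x ** 2 * f(x))

def l2_norm_squared(lam, half_width=6.0, steps=12000):
    """Midpoint-rule approximation of the squared L^2 norm of phi on [-half_width, half_width]."""
    dx = 2.0 * half_width / steps
    return sum(phi(-half_width + (i + 0.5) * dx, lam) ** 2 for i in range(steps)) * dx

lam = 1.7  # hypothetical sample value of lambda_j
ratio = apply_oscillator(lambda t: phi(t, lam), 0.3, lam) / phi(0.3, lam)
print(ratio, lam / 2)          # the two numbers should agree up to discretization error
print(l2_norm_squared(lam))    # should be close to 1
```

The ratio is evaluated at a single point; since the Gaussian is an exact eigenfunction, any point where it does not underflow would serve equally well.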
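A.1.2. Degenerate harmonic oscillator
Let $r \in \{1,\dots,n\}$. Using the identity
\[
\langle H_ru,u\rangle = \sum_{1\le j\le r}\bigl\langle\bigl(D_{x_j}^2 + \lambda_j^2x_j^2\bigr)u,u\bigr\rangle = \sum_{1\le j\le r}\Bigl(\bigl\|(D_{x_j} - i\lambda_jx_j)u\bigr\|_{L^2}^2 + \frac{\lambda_j}{2\pi}\|u\|_{L^2}^2\Bigr), \tag{A.4}
\]
we can define the ground state $E_r$ of the operator $H_r$ as
\[
E_r = L^2(\mathbb{R}^n)\cap\bigcap_{1\le j\le r}\ker\bigl(D_{x_j} - i\lambda_jx_j\bigr) = \bigl\{\varphi_{(\lambda_1,\dots,\lambda_r)}(x_1,\dots,x_r)\otimes v(x_{r+1},\dots,x_n)\bigr\}_{v\in L^2(\mathbb{R}^{n-r})}. \tag{A.5}
\]
The bottom of the spectrum of $\pi H_r$ is $\frac12\sum_{1\le j\le r}\lambda_j$.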
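A.2. Notations for the calculations of Section 2.3
\[
\nu^2 + \omega^2 + \varepsilon^2 = 1,\qquad \nu^2 + \omega^2 \le 1, \tag{A.6}
\]
\[
\alpha = \sqrt{\nu^4 + 4\omega^2} = \sqrt{4\omega^2 + (1 - \omega^2 - \varepsilon^2)^2}\qquad (\text{if } \nu = 0,\ \alpha = 2\omega), \tag{A.7}
\]
\[
\mu_1^2 = 1 + \omega^2 - \alpha = \frac{(1+\omega^2)^2 - \alpha^2}{1 + \omega^2 + \alpha} = \frac{(1-\omega^2)^2 - \nu^4}{\mu_2^2} = \frac{2\nu^2\varepsilon^2 + \varepsilon^4}{\mu_2^2}, \tag{A.8}
\]
\[
\mu_2^2 = 1 + \omega^2 + \alpha\qquad (\text{if } \nu = 0,\ \mu_2 = 1 + \omega). \tag{A.9}
\]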
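The chain of equalities in (A.8) is easy to confirm numerically. The following sketch is an illustration only: the function name `parameters` and the sample pairs $(\nu,\omega)$ are assumptions made here. It checks that $\mu_1^2\mu_2^2 = 2\nu^2\varepsilon^2 + \varepsilon^4$ and that $\mu_2^2$ stays in $[1,4]$ on a few admissible points.

```python
import math

def parameters(nu, omega):
    """Return (eps^2, alpha, mu1^2, mu2^2) following (A.6)-(A.9)."""
    eps2 = 1.0 - nu ** 2 - omega ** 2              # epsilon^2 from nu^2 + omega^2 + eps^2 = 1
    alpha = math.sqrt(nu ** 4 + 4.0 * omega ** 2)  # (A.7)
    mu1_sq = 1.0 + omega ** 2 - alpha              # (A.8)
    mu2_sq = 1.0 + omega ** 2 + alpha              # (A.9)
    return eps2, alpha, mu1_sq, mu2_sq

# Hypothetical sample points with nu^2 + omega^2 <= 1.
samples = [(0.0, 0.5), (0.3, 0.5), (0.6, 0.2), (0.1, 0.9)]
for nu, omega in samples:
    eps2, alpha, mu1_sq, mu2_sq = parameters(nu, omega)
    gap = abs(mu1_sq * mu2_sq - (2.0 * nu ** 2 * eps2 + eps2 ** 2))
    print(nu, omega, mu1_sq, mu2_sq, gap)
```

For $\nu = 0$ the output also illustrates the special values $\alpha = 2\omega$ and $\mu_2 = 1 + \omega$.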
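Remark A.1. If $\nu = 0$, $\mu_1 = O(\varepsilon^2)$ and if $\nu \neq 0$, $\mu_1 = O(\varepsilon)$. Moreover, for $\nu^2 + \omega^2 \le 1$, $\mu_2^2 \in [1,4]$ and for $\nu^2 + \omega^2 = 1$, $\mu_2^2 \in [2,4]$: we have indeed
\[
1 \le 1 + \omega^2 + \bigl(\nu^4 + 4\omega^2\bigr)^{1/2} \le 4 \tag{A.10}
\]
since $\nu^4 + 10\omega^2 \le (1-\omega^2)^2 + 10\omega^2 = 8\omega^2 + 1 + \omega^4 \le 9 + \omega^4$, implying $(3-\omega^2)^2 \ge \nu^4 + 4\omega^2$ and (A.10). If $\nu^2 + \omega^2 = 1$, we have $(1-\omega^2)^2 = \nu^4 \le \nu^4 + 4\omega^2 \Longrightarrow 2 \le 1 + \omega^2 + (\nu^4 + 4\omega^2)^{1/2}$.
We define the following set of parameters,
\[
\beta_1 = \frac{\alpha - 2\omega^2 - \nu^2}{2\omega\mu_1} = \frac{2\omega\mu_1}{\alpha - 2\omega^2 + \nu^2}\quad\text{since } (\alpha - 2\omega^2)^2 - \nu^4 = 4\omega^2 + 4\omega^4 - 4\omega^2\alpha = 4\omega^2\mu_1^2, \tag{A.11}
\]
\[
\beta_2 = \frac{\alpha + 2\omega^2 - \nu^2}{2\omega\mu_2} = \frac{2\omega\mu_2}{\alpha + 2\omega^2 + \nu^2}\quad\text{since } (\alpha + 2\omega^2)^2 - \nu^4 = 4\omega^2 + 4\omega^4 + 4\omega^2\alpha = 4\omega^2\mu_2^2, \tag{A.12}
\]
\[
\gamma = \frac{2\alpha}{\omega}, \tag{A.13}
\]
\[
\lambda_1^2 = \frac{\mu_1}{\mu_1 + \beta_1\beta_2\mu_2} = \frac{1}{1 + \frac{\beta_1\beta_2\mu_2}{\mu_1}} = \frac{1}{1 + \frac{\alpha + 2\omega^2 - \nu^2}{\alpha - 2\omega^2 + \nu^2}} = \frac{\alpha - 2\omega^2 + \nu^2}{2\alpha}, \tag{A.14}
\]
\[
\lambda_2^2 = \frac{\mu_2}{\mu_2 + \beta_1\beta_2\mu_1} = \frac{1}{1 + \frac{\beta_1\beta_2\mu_1}{\mu_2}} = \frac{1}{1 + \frac{\alpha - 2\omega^2 - \nu^2}{\alpha + 2\omega^2 + \nu^2}} = \frac{\alpha + 2\omega^2 + \nu^2}{2\alpha}, \tag{A.15}
\]
and we have
\[
\lambda_1^2 + \lambda_2^2 = 1 + \frac{\nu^2}{\alpha},\qquad \lambda_1^2\lambda_2^2 = \frac{(\alpha + \nu^2)^2 - 4\omega^4}{4\alpha^2}, \tag{A.16}
\]
\[
d = \frac{\gamma\lambda_1\lambda_2}{2},\qquad c = \frac{\lambda_1^2 + \lambda_2^2}{2\lambda_1\lambda_2},\qquad\text{so that}\quad cd = \frac{2\alpha(1 + \nu^2/\alpha)}{4\omega} = \frac{\alpha + \nu^2}{2\omega}. \tag{A.17}
\]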
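We have also
\[
\frac{2\mu_1}{\gamma\beta_1} = \frac{\alpha - 2\omega^2 + \nu^2}{\omega\gamma} = \frac{\alpha - 2\omega^2 + \nu^2}{2\alpha} = \lambda_1^2,\qquad
\frac{2\mu_2}{\gamma\beta_2} = \frac{\alpha + 2\omega^2 + \nu^2}{\omega\gamma} = \frac{\alpha + 2\omega^2 + \nu^2}{2\alpha} = \lambda_2^2,
\]
and
\[
\begin{aligned}
c\lambda_2 = \frac{\lambda_1^2 + \lambda_2^2}{2\lambda_1}
&= \bigl(1 + \nu^2\alpha^{-1}\bigr)2^{-1}\frac{2^{1/2}\alpha^{1/2}}{\sqrt{\alpha - 2\omega^2 + \nu^2}}\\
&= \bigl(1 + \nu^2\alpha^{-1}\bigr)2^{-1}\frac{2^{1/2}\alpha^{1/2}\sqrt{\alpha + 2\omega^2 - \nu^2}}{\sqrt{\alpha - 2\omega^2 + \nu^2}\,\sqrt{\alpha + 2\omega^2 - \nu^2}}\\
&= \bigl(1 + \nu^2\alpha^{-1}\bigr)2^{-1}2^{1/2}\alpha^{1/2}\frac{\sqrt{\alpha + 2\omega^2 - \nu^2}}{\sqrt{\alpha^2 - (2\omega^2 - \nu^2)^2}}\\
&= \bigl(1 + \nu^2\alpha^{-1}\bigr)2^{-1/2}\alpha^{1/2}\bigl(\alpha + 2\omega^2 - \nu^2\bigr)^{1/2}(2\omega)^{-1}\bigl(2\nu^2 + \varepsilon^2\bigr)^{-1/2}.
\end{aligned}
\]
Moreover, we have
\[
c\lambda_2 = 2^{-3/2}\bigl(\alpha^{1/2} + \nu^2\alpha^{-1/2}\bigr)\omega^{-1}\sqrt{\frac{\alpha + 2\omega^2 - \nu^2}{2\nu^2 + \varepsilon^2}},\qquad\text{if } \nu = 0,\ c\lambda_2 = 2^{-1/2}(1-\omega)^{-1/2}, \tag{A.18}
\]
\[
\lambda_2d^{-1} = \frac{c\lambda_2}{cd} = 2^{-3/2}\bigl(\alpha^{1/2} + \nu^2\alpha^{-1/2}\bigr)\omega^{-1}\sqrt{\frac{\alpha + 2\omega^2 - \nu^2}{2\nu^2 + \varepsilon^2}}\;\frac{2\omega}{\alpha + \nu^2},
\]
\[
\lambda_2d^{-1} = 2^{-1/2}\bigl(\alpha^{1/2} + \nu^2\alpha^{-1/2}\bigr)\bigl(\alpha + \nu^2\bigr)^{-1}\sqrt{\frac{\alpha + 2\omega^2 - \nu^2}{2\nu^2 + \varepsilon^2}},
\]
\[
\lambda_2d^{-1} = (2\alpha)^{-1/2}\sqrt{\frac{\alpha + 2\omega^2 - \nu^2}{2\nu^2 + \varepsilon^2}},\qquad\text{if } \nu = 0,\ \lambda_2d^{-1} = 2^{-1/2}(1-\omega)^{-1/2}, \tag{A.19}
\]
\[
c\lambda_1 = \frac{\lambda_1^2 + \lambda_2^2}{2\lambda_2} = \bigl(1 + \alpha^{-1}\nu^2\bigr)2^{-1/2}\alpha^{1/2}\bigl(\alpha + 2\omega^2 + \nu^2\bigr)^{-1/2},\qquad\text{if } \nu = 0,\ c\lambda_1 = 2^{-1/2}(1+\omega)^{-1/2}, \tag{A.20}
\]
\[
\lambda_1d^{-1} = \lambda_1c(cd)^{-1} = \bigl(1 + \alpha^{-1}\nu^2\bigr)2^{-1/2}\alpha^{1/2}\bigl(\alpha + 2\omega^2 + \nu^2\bigr)^{-1/2}\,2\omega\bigl(\alpha + \nu^2\bigr)^{-1},
\]
\[
\lambda_1d^{-1} = 2^{1/2}\alpha^{-1/2}\omega\bigl(\alpha + 2\omega^2 + \nu^2\bigr)^{-1/2},\qquad\text{if } \nu = 0,\ \lambda_1d^{-1} = 2^{-1/2}(1+\omega)^{-1/2}, \tag{A.21}
\]
\[
\lambda_1cd = \bigl(\alpha + \nu^2\bigr)2^{-1}\omega^{-1}\bigl(\alpha - 2\omega^2 + \nu^2\bigr)^{1/2}2^{-1/2}\alpha^{-1/2},
\]
\[
\lambda_1cd = 2^{-3/2}\bigl(\alpha + \nu^2\bigr)\omega^{-1}\alpha^{-1/2}\bigl(\alpha - 2\omega^2 + \nu^2\bigr)^{1/2},\qquad\text{if } \nu = 0,\ \lambda_1cd = 2^{-1/2}(1-\omega)^{1/2},
\]
\[
\frac{d}{\lambda_2} = \frac{\gamma\lambda_1}{2} = \alpha\omega^{-1}\bigl(\alpha - 2\omega^2 + \nu^2\bigr)^{1/2}2^{-1/2}\alpha^{-1/2} = 2^{-1/2}\alpha^{1/2}\omega^{-1}\bigl(\alpha - 2\omega^2 + \nu^2\bigr)^{1/2},
\]
\[
\lambda_1cd - \frac{d}{\lambda_2} = \bigl(\alpha - 2\omega^2 + \nu^2\bigr)^{1/2}\Bigl(2^{-3/2}(\alpha + \nu^2)\omega^{-1}\alpha^{-1/2} - 2^{-1/2}\alpha^{1/2}\omega^{-1}\Bigr),
\]
\[
\lambda_1cd - \frac{d}{\lambda_2} = 2^{-3/2}\omega^{-1}\alpha^{-1/2}\bigl(\alpha - 2\omega^2 + \nu^2\bigr)^{1/2}\bigl(\alpha + \nu^2 - 2\alpha\bigr),
\]
\[
\lambda_1cd - \frac{d}{\lambda_2} = -2^{-3/2}\omega^{-1}\alpha^{-1/2}\bigl(\alpha - 2\omega^2 + \nu^2\bigr)^{1/2}\bigl(\alpha - \nu^2\bigr),\qquad\text{if } \nu = 0,\ \lambda_1cd - \frac{d}{\lambda_2} = -2^{-1/2}(1-\omega)^{1/2}, \tag{A.22}
\]
\[
\lambda_1 = 2^{-1/2}\alpha^{-1/2}\bigl(\alpha - 2\omega^2 + \nu^2\bigr)^{1/2},\qquad\text{if } \nu = 0,\ \lambda_1 = 2^{-1/2}(1-\omega)^{1/2}, \tag{A.23}
\]
\[
\begin{aligned}
\lambda_2cd - \frac{d}{\lambda_1} &= \lambda_1^{-1}\lambda_2\Bigl(\lambda_1cd - \frac{d}{\lambda_2}\Bigr)\\
&= -2^{-3/2}\omega^{-1}\alpha^{-1/2}\bigl(\alpha - 2\omega^2 + \nu^2\bigr)^{1/2}\bigl(\alpha - \nu^2\bigr)\times\bigl(\alpha + 2\omega^2 + \nu^2\bigr)^{1/2}\bigl(\alpha - 2\omega^2 + \nu^2\bigr)^{-1/2}\\
&= -2^{-3/2}\omega^{-1}\alpha^{-1/2}\bigl(\alpha - \nu^2\bigr)\bigl(\alpha + 2\omega^2 + \nu^2\bigr)^{1/2},
\end{aligned}
\]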
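\[
\lambda_2cd - \frac{d}{\lambda_1} = -2^{-3/2}\omega^{-1}\alpha^{-1/2}\bigl(\alpha - \nu^2\bigr)\bigl(\alpha + 2\omega^2 + \nu^2\bigr)^{1/2},\qquad\text{if } \nu = 0,\ \lambda_2cd - \frac{d}{\lambda_1} = -2^{-1/2}(1+\omega)^{1/2}, \tag{A.24}
\]
\[
\lambda_2 = 2^{-1/2}\alpha^{-1/2}\bigl(\alpha + 2\omega^2 + \nu^2\bigr)^{1/2},\qquad\text{if } \nu = 0,\ \lambda_2 = 2^{-1/2}(1+\omega)^{1/2}, \tag{A.25}
\]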
γ μ1 4αω(2ν 2 + ε 2 ) = . 2β1 α − ν 2 + 2ω2
2α γ μ1 β1 = ε2 , 2 α + 2ω2 + ν 2
(A.26)
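A.3. Some calculations
A.3.1. Proof of Lemma 2.5
We have to calculate
\[
\begin{aligned}
\widetilde{Q} = \chi^*Q\chi
&= \chi^*
\begin{pmatrix}
1-\nu^2 & 0 & 0 & -\omega\\
0 & 1+\nu^2 & \omega & 0\\
0 & \omega & 1 & 0\\
-\omega & 0 & 0 & 1
\end{pmatrix}
\begin{pmatrix}
\lambda_1 & 0 & 0 & -\frac{\lambda_1}{d}\\
0 & \lambda_2 & -\frac{\lambda_2}{d} & 0\\
0 & \frac{d}{\lambda_1} - \lambda_2cd & c\lambda_2 & 0\\
\frac{d}{\lambda_2} - \lambda_1cd & 0 & 0 & c\lambda_1
\end{pmatrix}\\[4pt]
&= \chi^*
\begin{pmatrix}
(1-\nu^2)\lambda_1 - \frac{\omega d}{\lambda_2} + \lambda_1cd\omega & 0 & 0 & -\frac{\lambda_1(1-\nu^2)}{d} - c\lambda_1\omega\\
0 & (1+\nu^2)\lambda_2 + \frac{\omega d}{\lambda_1} - \lambda_2cd\omega & -\frac{(1+\nu^2)\lambda_2}{d} + \omega c\lambda_2 & 0\\
0 & \omega\lambda_2 + \frac{d}{\lambda_1} - \lambda_2cd & -\frac{\omega\lambda_2}{d} + c\lambda_2 & 0\\
-\omega\lambda_1 + \frac{d}{\lambda_2} - \lambda_1cd & 0 & 0 & \frac{\omega\lambda_1}{d} + c\lambda_1
\end{pmatrix}\\[4pt]
&=
\begin{pmatrix}
\lambda_1 & 0 & 0 & \frac{d}{\lambda_2} - \lambda_1cd\\
0 & \lambda_2 & \frac{d}{\lambda_1} - \lambda_2cd & 0\\
0 & -\frac{\lambda_2}{d} & c\lambda_2 & 0\\
-\frac{\lambda_1}{d} & 0 & 0 & c\lambda_1
\end{pmatrix}\\
&\quad\times
\begin{pmatrix}
(1-\nu^2)\lambda_1 - \frac{\omega d}{\lambda_2} + \lambda_1cd\omega & 0 & 0 & -\frac{\lambda_1(1-\nu^2)}{d} - c\lambda_1\omega\\
0 & (1+\nu^2)\lambda_2 + \frac{\omega d}{\lambda_1} - \lambda_2cd\omega & -\frac{(1+\nu^2)\lambda_2}{d} + \omega c\lambda_2 & 0\\
0 & \omega\lambda_2 + \frac{d}{\lambda_1} - \lambda_2cd & -\frac{\omega\lambda_2}{d} + c\lambda_2 & 0\\
-\omega\lambda_1 + \frac{d}{\lambda_2} - \lambda_1cd & 0 & 0 & \frac{\omega\lambda_1}{d} + c\lambda_1
\end{pmatrix}.
\end{aligned}
\]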
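We get easily $\tilde q_{12} = \tilde q_{13} = 0 = \tilde q_{24} = \tilde q_{34}$. To prove that the symmetric matrix $\widetilde{Q}$ is diagonal, it is thus sufficient to prove that $\tilde q_{14} = 0 = \tilde q_{23}$. We have
\[
\begin{aligned}
\tilde q_{14} &= -\frac{\lambda_1^2}{d}\bigl(1-\nu^2\bigr) - \omega c\lambda_1^2 + \omega\frac{\lambda_1}{\lambda_2} + \frac{cd\lambda_1}{\lambda_2} - \lambda_1^2c\omega - c^2\lambda_1^2d\\
&= \frac{\lambda_1^2}{d}\Bigl(-1 + \nu^2 - 2\omega cd + \frac{\omega d}{\lambda_1\lambda_2} + \frac{cd^2}{\lambda_1\lambda_2} - c^2d^2\Bigr)\\
&= \frac{\lambda_1^2}{d}\Bigl(-1 + \nu^2 - \alpha - \nu^2 + \alpha + \frac{(\alpha+\nu^2)}{2\omega}\frac{\alpha}{\omega} - \frac{(\alpha+\nu^2)^2}{4\omega^2}\Bigr)\\
&= \frac{\lambda_1^2}{d\omega^2}\Bigl(-\omega^2 + \frac{\alpha^2 + \nu^2\alpha}{2} - \frac{(\alpha+\nu^2)^2}{4}\Bigr)\\
&= \frac{\lambda_1^2}{d\omega^2}\Bigl(-\omega^2 + \frac{\nu^4 + 4\omega^2 + \nu^2\alpha}{2} - \frac{\alpha^2 + \nu^4 + 2\alpha\nu^2}{4}\Bigr)\\
&= \frac{\lambda_1^2}{d\omega^2}\Bigl(-\omega^2 + \frac{\nu^4 + 4\omega^2 + \nu^2\alpha}{2} - \frac{\nu^4 + 4\omega^2 + \nu^4 + 2\alpha\nu^2}{4}\Bigr)\\
&= \frac{\lambda_1^2}{d\omega^2}\Bigl(-\omega^2 + \frac{2\nu^4 + 8\omega^2 + 2\nu^2\alpha}{4} - \frac{2\nu^4 + 4\omega^2 + 2\alpha\nu^2}{4}\Bigr) = 0.
\end{aligned}
\]
Moreover we have
\[
\begin{aligned}
\tilde q_{23} &= -\frac{\lambda_2^2}{d}\bigl(1+\nu^2\bigr) + \omega c\lambda_2^2 - \frac{\omega\lambda_2}{\lambda_1} + \frac{cd\lambda_2}{\lambda_1} + \lambda_2^2\omega c - \lambda_2^2c^2d\\
&= \frac{\lambda_2^2}{d}\Bigl(-1 - \nu^2 + 2\omega cd - \frac{\omega d}{\lambda_2\lambda_1} + \frac{cd^2}{\lambda_2\lambda_1} - c^2d^2\Bigr)\\
&= \frac{\lambda_2^2}{d}\Bigl(-1 - \nu^2 + \alpha + \nu^2 - \alpha + \frac{(\alpha+\nu^2)}{2\omega}\frac{\alpha}{\omega} - \frac{(\alpha+\nu^2)^2}{4\omega^2}\Bigr)\\
&= \frac{\lambda_2^2}{d\omega^2}\Bigl(-\omega^2 + \frac{\alpha^2 + \nu^2\alpha}{2} - \frac{(\alpha+\nu^2)^2}{4}\Bigr) = 0,\quad\text{from the previous computation.}
\end{aligned}
\]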
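We know now that $\widetilde{Q}$ is indeed diagonal. We calculate
\[
\begin{aligned}
\tilde q_{44} &= \frac{\lambda_1^2(1-\nu^2)}{d^2} + \frac{2c\lambda_1^2\omega}{d} + c^2\lambda_1^2 = \frac{\lambda_1^2}{d^2}\bigl(1 - \nu^2 + 2\omega cd + c^2d^2\bigr)\\
&= \frac{\lambda_1^2}{d^2}\Bigl(1 - \nu^2 + \alpha + \nu^2 + \frac{(\alpha+\nu^2)^2}{4\omega^2}\Bigr),\\
\tilde q_{44} &= \frac{\lambda_1^2}{\omega^2d^2}\Bigl(\omega^2 + \alpha\omega^2 + \frac{\nu^4 + 4\omega^2 + \nu^4 + 2\alpha\nu^2}{4}\Bigr) = \frac{\lambda_1^2}{\omega^2d^2}\Bigl(2\omega^2 + \alpha\omega^2 + \frac{\nu^4 + \alpha\nu^2}{2}\Bigr).
\end{aligned}
\]
Since $\dfrac{\lambda_1^2}{\omega^2d^2} = \dfrac{4}{\gamma^2\lambda_2^2\omega^2} = \dfrac{1}{\alpha^2\lambda_2^2} = \dfrac{2\alpha}{\alpha^2(\alpha + 2\omega^2 + \nu^2)}$, we have
\[
\tilde q_{44} = \frac{1}{\alpha(\alpha + 2\omega^2 + \nu^2)}\bigl[4\omega^2 + 2\alpha\omega^2 + \nu^4 + \alpha\nu^2\bigr] = \frac{\alpha^2 + 2\alpha\omega^2 + \alpha\nu^2}{\alpha^2 + 2\alpha\omega^2 + \alpha\nu^2} = 1.
\]
Analogously, we have
\[
\begin{aligned}
\tilde q_{33} &= \frac{\lambda_2^2(1+\nu^2)}{d^2} - \frac{2c\lambda_2^2\omega}{d} + c^2\lambda_2^2 = \frac{\lambda_2^2}{d^2}\bigl(1 + \nu^2 - 2\omega cd + c^2d^2\bigr)\\
&= \frac{\lambda_2^2}{d^2}\Bigl(1 + \nu^2 - \alpha - \nu^2 + \frac{(\alpha+\nu^2)^2}{4\omega^2}\Bigr),\\
\tilde q_{33} &= \frac{\lambda_2^2}{\omega^2d^2}\Bigl(\omega^2 - \alpha\omega^2 + \frac{\nu^4 + 4\omega^2 + \nu^4 + 2\alpha\nu^2}{4}\Bigr) = \frac{\lambda_2^2}{\omega^2d^2}\Bigl(2\omega^2 - \alpha\omega^2 + \frac{\nu^4 + \alpha\nu^2}{2}\Bigr).
\end{aligned}
\]
Since $\dfrac{\lambda_2^2}{\omega^2d^2} = \dfrac{4}{\gamma^2\lambda_1^2\omega^2} = \dfrac{1}{\alpha^2\lambda_1^2} = \dfrac{2\alpha}{\alpha^2(\alpha - 2\omega^2 + \nu^2)}$, we have
\[
\tilde q_{33} = \frac{1}{\alpha(\alpha - 2\omega^2 + \nu^2)}\bigl[4\omega^2 - 2\alpha\omega^2 + \nu^4 + \alpha\nu^2\bigr] = \frac{\alpha^2 - 2\alpha\omega^2 + \alpha\nu^2}{\alpha^2 - 2\alpha\omega^2 + \alpha\nu^2} = 1.
\]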
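We calculate
\[
\begin{aligned}
\tilde q_{11} &= \lambda_1^2\bigl(1-\nu^2\bigr) - 2\frac{\omega d\lambda_1}{\lambda_2} + 2\lambda_1^2cd\omega + \frac{d^2}{\lambda_2^2} - 2\frac{cd^2\lambda_1}{\lambda_2} + \lambda_1^2c^2d^2,\\
\tilde q_{11} &= \lambda_1^2\Bigl(1 - \nu^2 - \frac{2\omega d}{\lambda_1\lambda_2} + 2cd\omega + \frac{d^2}{\lambda_1^2\lambda_2^2} - 2\frac{cd^2}{\lambda_1\lambda_2} + c^2d^2\Bigr),\\
\tilde q_{11} &= \lambda_1^2\Bigl(1 - \nu^2 - 2\alpha + \alpha + \nu^2 + \frac{\alpha^2}{\omega^2} - 2\frac{\alpha+\nu^2}{2\omega}\frac{\alpha}{\omega} + \frac{(\alpha+\nu^2)^2}{4\omega^2}\Bigr),\\
\tilde q_{11} &= \frac{\lambda_1^2}{\omega^2}\Bigl[(1-\alpha)\omega^2 + \alpha^2 - \alpha^2 - \alpha\nu^2 + \frac{(\alpha+\nu^2)^2}{4}\Bigr],\\
\tilde q_{11} &= \frac{\alpha - 2\omega^2 + \nu^2}{2\alpha\omega^2}\Bigl(\omega^2 - \alpha\omega^2 - \alpha\nu^2 + \frac{\nu^4 + 4\omega^2 + \nu^4 + 2\alpha\nu^2}{4}\Bigr),\\
\tilde q_{11} &= \frac{\alpha - 2\omega^2 + \nu^2}{2\alpha\omega^2}\Bigl(2\omega^2 - \alpha\omega^2 - \frac12\alpha\nu^2 + \frac{\nu^4}{2}\Bigr).
\end{aligned}
\]
More calculations:
\[
\begin{aligned}
\bigl(\alpha - 2\omega^2 + \nu^2\bigr)\Bigl(2\omega^2 + \frac{\nu^4}{2} - \alpha\Bigl(\omega^2 + \frac{\nu^2}{2}\Bigr)\Bigr)
&= \bigl(\nu^2 - 2\omega^2\bigr)\Bigl(2\omega^2 + \frac{\nu^4}{2}\Bigr) - \bigl(\nu^4 + 4\omega^2\bigr)\Bigl(\omega^2 + \frac{\nu^2}{2}\Bigr)\\
&\quad + \alpha\Bigl(2\omega^2 + \frac{\nu^4}{2} + \bigl(2\omega^2 - \nu^2\bigr)\Bigl(\omega^2 + \frac{\nu^2}{2}\Bigr)\Bigr)\\
&= -8\omega^4 - 2\omega^2\nu^4 + \alpha\bigl(2\omega^4 + 2\omega^2\bigr),
\end{aligned}
\]
which is equal to
\[
2\alpha\omega^2\bigl(1 + \omega^2 - \alpha\bigr) = \alpha\bigl(2\omega^4 + 2\omega^2\bigr) - 2\alpha^2\omega^2 = \alpha\bigl(2\omega^4 + 2\omega^2\bigr) - 2\omega^2\bigl(\nu^4 + 4\omega^2\bigr),
\]
proving thus that $\tilde q_{11} = 1 + \omega^2 - \alpha$. The previous calculations and (2.8) give $\tilde q_{22} = 1 + \omega^2 + \alpha$, completing the proof of the lemma.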
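The diagonalization proved above can also be sanity-checked numerically. The sketch below is an illustration under assumptions: the entries of $\chi$ are read off from the terms that appear in the expansions of $\tilde q_{14}$, $\tilde q_{23}$ and $\tilde q_{11}$ in this proof, and the function names and the sample point $(\nu,\omega) = (0.3, 0.5)$ are choices made here. It forms $\chi^{T}Q\chi$ and prints it; the expected shape is $\mathrm{diag}(\mu_1^2, \mu_2^2, 1, 1)$.

```python
import math

def lemma_2_5_matrices(nu, omega):
    """Build alpha, Q and chi from the parameters of Section A.2 (chi is a reconstruction)."""
    alpha = math.sqrt(nu ** 4 + 4.0 * omega ** 2)
    gamma = 2.0 * alpha / omega
    lam1 = math.sqrt((alpha - 2.0 * omega ** 2 + nu ** 2) / (2.0 * alpha))
    lam2 = math.sqrt((alpha + 2.0 * omega ** 2 + nu ** 2) / (2.0 * alpha))
    d = gamma * lam1 * lam2 / 2.0
    c = (lam1 ** 2 + lam2 ** 2) / (2.0 * lam1 * lam2)
    Q = [[1.0 - nu ** 2, 0.0, 0.0, -omega],
         [0.0, 1.0 + nu ** 2, omega, 0.0],
         [0.0, omega, 1.0, 0.0],
         [-omega, 0.0, 0.0, 1.0]]
    chi = [[lam1, 0.0, 0.0, -lam1 / d],
           [0.0, lam2, -lam2 / d, 0.0],
           [0.0, d / lam1 - lam2 * c * d, c * lam2, 0.0],
           [d / lam2 - lam1 * c * d, 0.0, 0.0, c * lam1]]
    return alpha, Q, chi

def congruence(chi, Q):
    """Return the matrix chi^T Q chi as a list of rows."""
    n = len(Q)
    return [[sum(chi[k][i] * Q[k][l] * chi[l][j] for k in range(n) for l in range(n))
             for j in range(n)] for i in range(n)]

nu, omega = 0.3, 0.5  # hypothetical sample point with nu^2 + omega^2 <= 1
alpha, Q, chi = lemma_2_5_matrices(nu, omega)
Q_tilde = congruence(chi, Q)
for row in Q_tilde:
    print(["%.12f" % entry for entry in row])
print(1.0 + omega ** 2 - alpha, 1.0 + omega ** 2 + alpha)  # mu_1^2 and mu_2^2 for comparison
```

Since all entries are real, $\chi^*$ is simply the transpose here.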
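A.3.2. On the symplectic relationships in Lemma 2.6
The reader is invited to check the following formulas,$^4$ with the notations of Lemma 2.6:
\[
\begin{aligned}
\Bigl\{\xi_1 - \frac{\alpha-\nu^2}{2\omega}x_2,\ \xi_2 + \frac{\alpha+\nu^2}{2\omega}x_1\Bigr\} &= \alpha\omega^{-1},\\
\Bigl\{\xi_2 - \frac{\alpha-\nu^2}{2\omega}x_1,\ \xi_1 + \frac{\alpha+\nu^2}{2\omega}x_2\Bigr\} &= \alpha\omega^{-1},\\
\Bigl\{\xi_1 - \frac{\alpha-\nu^2}{2\omega}x_2,\ \xi_1 + \frac{\alpha+\nu^2}{2\omega}x_2\Bigr\} &= 0,\\
\Bigl\{\xi_1 - \frac{\alpha-\nu^2}{2\omega}x_2,\ \xi_2 - \frac{\alpha-\nu^2}{2\omega}x_1\Bigr\} &= 0,\\
\Bigl\{\xi_2 + \frac{\alpha+\nu^2}{2\omega}x_1,\ \xi_1 + \frac{\alpha+\nu^2}{2\omega}x_2\Bigr\} &= 0,\\
\Bigl\{\xi_2 + \frac{\alpha+\nu^2}{2\omega}x_1,\ \xi_2 - \frac{\alpha-\nu^2}{2\omega}x_1\Bigr\} &= 0,
\end{aligned}
\]
as well as
\[
\begin{aligned}
\Bigl(\frac{\alpha - 2\omega^2 + \nu^2}{2\alpha}\Bigr)^{1/2}\Bigl(\frac{\alpha + 2\omega^2 - \nu^2}{2\alpha\mu_2^2}\Bigr)^{1/2}\varepsilon\,\alpha\omega^{-1}
&= 2^{-1}\varepsilon\mu_2^{-1}\omega^{-1}\bigl(\alpha^2 - (2\omega^2 - \nu^2)^2\bigr)^{1/2}\\
&= 2^{-1}\varepsilon\mu_2^{-1}\omega^{-1}\bigl(4\omega^2 - 4\omega^4 + 4\omega^2\nu^2\bigr)^{1/2} = \varepsilon\mu_2^{-1}\bigl(1 - \omega^2 + \nu^2\bigr)^{1/2}\\
&= \varepsilon\mu_2^{-1}\bigl(2\nu^2 + \varepsilon^2\bigr)^{1/2} = \mu_1
\end{aligned}
\]
and
\[
\Bigl(\frac{\alpha + 2\omega^2 + \nu^2}{2\alpha}\Bigr)^{1/2}\Bigl(\frac{2(1 + \omega^2 + \alpha)\,\omega^2}{\alpha(\alpha + 2\omega^2 + \nu^2)}\Bigr)^{1/2}\alpha\omega^{-1} = \bigl(1 + \omega^2 + \alpha\bigr)^{1/2} = \mu_2.
\]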
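The six bracket relations above involve only linear forms, so they can be checked mechanically. In the sketch below a linear form is stored as a tuple of coefficients of $(x_1, x_2, \xi_1, \xi_2)$, and the Poisson bracket is normalized so that $\{\xi_j, x_k\} = \delta_{jk}$; this sign convention, the helper names and the sample values of $\nu$ and $\omega$ are assumptions made here for illustration.

```python
import math

def bracket(f, g):
    """Poisson bracket of linear forms given as coefficients of (x1, x2, xi1, xi2)."""
    # {f, g} = sum_j (df/dxi_j * dg/dx_j - df/dx_j * dg/dxi_j)
    return sum(f[2 + j] * g[j] - f[j] * g[2 + j] for j in range(2))

nu, omega = 0.3, 0.5  # hypothetical sample values
alpha = math.sqrt(nu ** 4 + 4.0 * omega ** 2)
a = (alpha - nu ** 2) / (2.0 * omega)
b = (alpha + nu ** 2) / (2.0 * omega)

A1 = (0.0, -a, 1.0, 0.0)   # xi1 - a x2
A2 = (b, 0.0, 0.0, 1.0)    # xi2 + b x1
A3 = (-a, 0.0, 0.0, 1.0)   # xi2 - a x1
A4 = (0.0, b, 1.0, 0.0)    # xi1 + b x2

print(bracket(A1, A2), bracket(A3, A4), alpha / omega)  # first two should equal alpha/omega
print(bracket(A1, A4), bracket(A1, A3), bracket(A2, A4), bracket(A2, A3))  # should all vanish
```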
$^4$ This is indeed double-checking since those formulas are proven in Section 2.

References
[1] A.A. Abrikosov, On the magnetic properties of superconductors of the second group, Sov. Phys. JETP 5 (1957) 1174–1182.
[2] A. Aftalion, Vortices in Bose–Einstein Condensates, Progr. Nonlinear Differential Equations Appl., vol. 67, Birkhäuser, 2006.
[3] A. Aftalion, X. Blanc, F. Nier, Lowest Landau level functional and Bargmann spaces for Bose–Einstein condensates, J. Funct. Anal. 241 (2) (2006) 661–702, MR MR2271933 (2008c:82052).
[4] A. Aftalion, X. Blanc, Reduced energy functionals for a three-dimensional fast rotating Bose Einstein condensates, Ann. Inst. H. Poincaré Anal. Non Linéaire 25 (2) (2008) 339–355 (in English), MR MR2400105.
[5] V. Bretin, S. Stock, Y. Seurin, J. Dalibard, Fast rotation of a Bose–Einstein condensate, Phys. Rev. Lett. 92 (2004) 050403.
[6] Eric A. Carlen, Some integral identities and inequalities for entire functions and their application to the coherent state transform, J. Funct. Anal. 97 (1) (1991) 231–249, MR MR1105661 (92i:46025).
[7] A.L. Fetter, Lowest-Landau-level description of a Bose–Einstein condensate in a rapidly rotating anisotropic trap, Phys. Rev. A 75 (2007) 013620.
[8] T.L. Ho, Bose–Einstein condensates with large number of vortices, Phys. Rev. Lett. 87 (2001) 060403.
[9] Lars Hörmander, The analysis of linear partial differential operators. III, Classics Math., Springer, Berlin, 2007, Pseudo-differential operators, reprint of the 1994 edition, MR MR2304165 (2007k:35006).
[10] Nicolas Lerner, The Wick calculus of pseudo-differential operators and some of its applications, Cubo Mat. Educ. 5 (1) (2003) 213–236, MR MR1957713 (2004a:47058).
[11] K. Madison, F. Chevy, V. Bretin, J. Dalibard, Vortex formation in a stirred Bose–Einstein condensate, Phys. Rev. Lett. 84 (2000) 806.
[12] Dusa McDuff, Dietmar Salamon, Introduction to Symplectic Topology, second ed., Oxford Math. Monogr., Clarendon Press, Oxford Univ. Press, New York, 1998, MR MR1698616 (2000g:53098).
[13] M.Ö. Oktel, Vortex lattice of a Bose–Einstein condensate in a rotating anisotropic trap, Phys. Rev. A 69 (2) (2004) 023618.
[14] C.J. Pethick, H. Smith, Bose Einstein Condensation in Dilute Gases, Cambridge Univ. Press, 2002.
[15] L. Pitaevskii, S. Stringari, Bose Einstein Condensation, Internat. Ser. Monogr. Phys., vol. 116, Oxford Science Publications, 2003.
[16] P. Sanchez-Lotero, J.J. Palacios, Vortices in a rotating Bose–Einstein condensate under extreme elongation, Phys. Rev. A (Atomic, Molecular, and Optical Physics) 72 (4) (2005) 043613.
[17] S. Sinha, G.V. Shlyapnikov, Two-dimensional Bose–Einstein condensate under extreme rotation, Phys. Rev. Lett. 94 (15) (2005) 150401.
[18] G. Watanabe, G. Baym, C.J. Pethick, Landau levels and the Thomas–Fermi structure of rapidly rotating Bose–Einstein condensates, Phys. Rev. Lett. 93 (2004) 190401.
Journal of Functional Analysis 257 (2009) 807–831 www.elsevier.com/locate/jfa
On the differentiability of very weak solutions with right-hand side data integrable with respect to the distance to the boundary
J.I. Díaz a,∗, J.M. Rakotoson b
a Departamento de Matemática Aplicada, Universidad Complutense de Madrid, Plaza de las Ciencias No. 3, 28040 Madrid, Spain
b Laboratoire de Mathématiques et Applications, Université de Poitiers, Boulevard Marie et Pierre Curie, Téléport 2, BP 30179, 86962 Futuroscope Chasseneuil Cedex, France
Received 7 December 2008; accepted 8 March 2009
Available online 1 April 2009
Communicated by H. Brezis

Abstract
We study the differentiability of very weak solutions $v \in L^1(\Omega)$ of $(v, L^*\varphi)_0 = (f,\varphi)_0$ for all $\varphi \in C^2(\overline{\Omega})$ vanishing at the boundary whenever $f$ is in $L^1(\Omega,\delta)$, with $\delta = \mathrm{dist}(x,\partial\Omega)$, and $L^*$ is a linear second order elliptic operator with variable coefficients. We show that our results are optimal. We use symmetrization techniques to derive the regularity in Lorentz spaces or to consider the radial solution associated to the increasing radial rearrangement function $\tilde f$ of $f$.
© 2009 Elsevier Inc. All rights reserved.
Keywords: Very weak solutions; Distance to the boundary; Regularity; Linear PDE; Monotone rearrangement; Lorentz spaces
1. Introduction The origin of this paper starts with an originally unpublished manuscript by H. Brezis (personal communication of him to the first author [4]), later mostly in the paper by Brezis et al. [5] (see also the mention made in [17]). In his note, when f is given in L1 (Ω, dist(x, ∂Ω)) * Corresponding author.
E-mail address: [email protected] (J.I. Díaz). 0022-1236/$ – see front matter © 2009 Elsevier Inc. All rights reserved. doi:10.1016/j.jfa.2009.03.002
808
J.I. Díaz, J.M. Rakotoson / Journal of Functional Analysis 257 (2009) 807–831
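($\Omega$ bounded smooth open set of $\mathbb{R}^N$), H. Brezis shows the existence and uniqueness of a very weak solution $v \in L^1(\Omega)$ of
\[
GD(\Omega):\qquad -\int_\Omega v\,\Delta\varphi\,dx = \int_\Omega f\varphi\,dx,\quad\forall\varphi\in V_1(\Omega),
\]
with $V_1(\Omega) = \{\varphi\in C^2(\overline{\Omega}),\ \varphi = 0 \text{ on } \partial\Omega\}$, and also that $|v|_{L^1(\Omega)} \le c\,|f|_{L^1(\Omega,\mathrm{dist}(x,\partial\Omega))}$. Therefore, the question of the integrability of the generalized derivative $\partial_iv = \frac{\partial v}{\partial x_i}$ arises in a natural way and was raised already in the note by H. Brezis. To give some answer to the above question, we shall note $\delta(x) = \mathrm{dist}(x,\partial\Omega)$ and introduce the following spaces
\[
L^1(\Omega,\delta^\alpha) = \Bigl\{f:\Omega\to\mathbb{R} \text{ Lebesgue measurable: } \int_\Omega|f(x)|\delta(x)^\alpha\,dx \text{ is finite}\Bigr\},\quad 0\le\alpha\le1,
\]
\[
L^1(\Omega,\delta^1) = L^1(\Omega,\delta),\qquad L^1(\Omega,\delta^0) = L^1(\Omega),
\]
\[
L^1\bigl(\Omega,\delta|\mathrm{Ln}\,\delta|\bigr) = \Bigl\{f:\Omega\to\mathbb{R} \text{ Lebesgue measurable such that } \int_\Omega|f|(x)\,\delta(x)\,|\mathrm{Ln}\,\delta(x)|\,dx < +\infty\Bigr\}.
\]
One has, for $\alpha\in[0,1[$,
\[
L^1(\Omega,\delta^\alpha) \subset L^1\bigl(\Omega,\delta|\mathrm{Ln}\,\delta|\bigr) \subset L^1(\Omega,\delta).
\]
One of our results contains in particular the following statements:
(i) The very weak solution $v \in W_0^{1,q}(\Omega)$ for some $q > 1$ if and only if $f \in L^1(\Omega,\delta^\alpha)$ for some $\alpha\in[0,1[$, for nonnegative $f$.
(ii) If $f \in L^1(\Omega,\delta^\alpha)$, $0\le\alpha<1$ then $|\nabla v|$ belongs to the Lorentz space $L^{\frac{N}{N-1+\alpha},\infty}(\Omega)$.
The above result contains the result given in [12] since $L^p(\Omega,\delta) \subset L^1(\Omega,\delta^{1/p})$ for $p > 1$. We also improve the result of Cabré and Martel [6], by showing that if $f$ is only in $L^1(\Omega,\delta)$ then the function $v$ is in $L^{\frac{N}{N-1},\infty}(\Omega)$. Moreover, we can show that $|\nabla v| \in L^q(\Omega,\delta)$ for some $q > 1$, in particular $v \in W^{1,q}_{\mathrm{loc}}(\Omega)$. As a matter of fact, all our results in the first four sections are valid when we replace the Laplacian operator by a linear elliptic second order operator $L$ with variable coefficients. In Section 5, we consider the case of $L^* = -\Delta$ and $\Omega$ being the unit ball. Our aim is to study if we may have the $W^{1,1}$-regularity whenever
\[
f \in L^1(\Omega,\delta)\setminus L^1(\Omega,\delta^{1-}),\qquad L^1(\Omega,\delta^{1-}) = \bigcup_{0\le\alpha<1}L^1(\Omega,\delta^\alpha).
\]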
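We show:
(1) If $f \in L^1(\Omega,\delta|\mathrm{Ln}\,\delta|)$ then the very weak solution $v \in L^1(\Omega)$ is in $W_0^{1,1}(\Omega)$.
(2) If we consider $\tilde f$ the increasing radial rearrangement function (see its definition in Section 2) of $f \in L^1(\Omega,\delta)$, $f \ge 0$ then $\tilde f \in L^1(\Omega,\delta)$. Moreover, the unique very weak solution $\omega \in L^1(\Omega)$ of the generalized Dirichlet problem $GD(\Omega)$,
\[
-\int_\Omega\omega(x)\Delta\varphi(x)\,dx = \int_\Omega\tilde f\varphi(x)\,dx,\quad\forall\varphi\in V_1(\Omega),
\]
is radial but decreasing, belonging to $W_0^{1,1}(\Omega)$. Moreover, there exists a constant $c(\Omega) > 0$ such that
\[
|\nabla\omega|_{L^1(\Omega)} \le c(\Omega)\cdot|f|_{L^1(\Omega,\delta)}.
\]
Under our assumptions on $\Omega$,
\[
c(\Omega) = \frac{1}{N\alpha_N^{1+\frac1N}}.
\]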
We shall restate the necessary and sufficient conditions to ensure that $\omega \in W^{1,q}(\Omega)$ for $q > 1$ and we shall show that
\[
\omega\in W_0^{1,q}(\Omega)\quad\text{if and only if}\quad\int_0^{|\Omega|}\Bigl(\int_\sigma^{|\Omega|}f_*(t)\,dt\Bigr)^q\,d\sigma \text{ is finite.}
\]
We also remark that the usual comparison technique based on the decreasing rearrangement $f_*(t)$ of $f \ge 0$, is inefficient in the case where $f \in L^1(\Omega,\delta)\setminus L^1(\Omega)$. Indeed, the function
\[
U(x) = c_N\int_{\alpha_N|x|^N}^{|\Omega|}t^{\frac2N-2}\int_0^tf_*(\sigma)\,d\sigma\,dt
\]
is in $L^1(\Omega)$ if and only if $f \in L^1(\Omega)$. In any case the pointwise comparison $\underset{\sim}{v} \le U$ and the comparison in mass (see, e.g. the results and references presented in Section 1.3 of [7]) are still true (but they do not give any information on the integrability of $v$). We end the paper by giving two applications of our differentiability results to two special data $f$ which are in $L^1(\Omega,\delta)$ but not in $L^1(\Omega)$ nor in $L^1(\Omega,\delta^\alpha)$, respectively. The application to the existence, uniqueness and qualitative properties of the very weak solution of some associate semilinear problem will be the object of a separate paper by the authors (Díaz and Rakotoson [8]).

2. Notation – preliminary results
We shall always consider $\Omega\subset\mathbb{R}^N$, $N \ge 2$, a bounded open set of class $C^{2,1}$. For any measurable set $E\subset\mathbb{R}^N$ we shall denote by $|E|$ its Lebesgue measure.
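We shall consider a linear operator $L$:
\[
Lu = -\sum_{i,j=1}^N\partial_i\bigl(a_{ij}(x)\partial_ju\bigr) + \sum_{i=1}^Nb_i(x)\partial_iu + c_0(x)u
\]
under the same assumptions as in [9], say $a_{ij}\in C^{0,1}(\Omega)$, $b_i\in C^{0,1}(\Omega)$, $c_0\in L^\infty(\Omega)$, $c_0\ge0$,
\[
\forall\xi = (\xi_1,\dots,\xi_N)\in\mathbb{R}^N\quad\sum_{i,j}a_{ij}(x)\xi_i\xi_j \ge \alpha|\xi|^2\ \text{ for some } \alpha > 0,\qquad c_0(x) - \frac12\sum_{i=1}^N\partial_ib_i(x) \ge 0\ \text{ a.e. in } \Omega.
\]
We shall use the adjoint operator associated to $L$, that is
\[
L^*\varphi = -\sum_{i,j}\partial_j\bigl(a_{ij}(x)\partial_i\varphi\bigr) - \sum_{i=1}^N\partial_i\bigl(b_i\varphi\bigr) + c_0(x)\varphi.
\]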
Remark 1. The case of unbounded term $c_0(x)$, blowing up on the boundary, will be considered in a subsequent paper by the authors (Díaz and Rakotoson [8]) where in fact the general framework will concern the case of semilinear equations.
We recall that:
– the decreasing rearrangement of a measurable function $u$ is given by
\[
u_*:\Omega_* = \,]0,|\Omega|[\ \to\mathbb{R},\qquad u_*(s) = \inf\bigl\{t\in\mathbb{R}: |u > t| \le s\bigr\},\qquad u_*(|\Omega|) = \operatorname*{ess\,inf}_\Omega u;\quad u_*(0) = \operatorname*{ess\,sup}_\Omega u,
\]
– the decreasing radial rearrangement of the function $u$ is defined, on the ball $\tilde\Omega$ having the same measure as $\Omega$, by
\[
\underset{\sim}{u}:\tilde\Omega\to\mathbb{R},\qquad \underset{\sim}{u}(x) = u_*\bigl(\alpha_N|x|^N\bigr);
\]
– the increasing rearrangement of a measurable function $u$ is given by
\[
u^*:\Omega_*\to\mathbb{R},\qquad u^*(s) = u_*\bigl(|\Omega| - s\bigr),\quad s\in\,]0,|\Omega|[;
\]
– the increasing radial rearrangement of the function $u$ is defined by
\[
\tilde u:\tilde\Omega\to\mathbb{R},\qquad \tilde u(x) = u^*\bigl(\alpha_N|x|^N\bigr).
\]
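A discrete analogue may help to fix the two one-dimensional rearrangements. In the sketch below a function is represented by its values on cells of equal measure, so that the decreasing rearrangement is just the list sorted in decreasing order and the increasing rearrangement is the reversed list, mirroring $u^*(s) = u_*(|\Omega| - s)$. The function names and sample data are assumptions made here; the last lines illustrate the two-sided Hardy–Littlewood inequality in this discrete setting, where the lower bound pairs an increasing with a decreasing rearrangement.

```python
def decreasing_rearrangement(values):
    """Discrete analogue of u_*: the same values, sorted from largest to smallest."""
    return sorted(values, reverse=True)

def increasing_rearrangement(values):
    """Discrete analogue of u^*(s) = u_*(|Omega| - s): the decreasing rearrangement reversed."""
    return decreasing_rearrangement(values)[::-1]

def pairing(f, g):
    """Discrete analogue of the integral of f * g over cells of unit measure."""
    return sum(a * b for a, b in zip(f, g))

# Hypothetical nonnegative sample data on five cells of equal measure.
f = [3, 1, 4, 1, 5]
g = [2, 7, 1, 8, 2]

lower = pairing(increasing_rearrangement(f), decreasing_rearrangement(g))
middle = pairing(f, g)
upper = pairing(decreasing_rearrangement(f), decreasing_rearrangement(g))
print(lower, middle, upper)
```

The radial rearrangements of the text are obtained from these one-dimensional ones by composing with $x \mapsto \alpha_N|x|^N$, which the discrete model does not attempt to represent.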
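We shall use the following Lorentz spaces (see [14,1] for example), for $1 < p < +\infty$, $1 \le q \le +\infty$:
\[
L^{p,q}(\Omega) = \Bigl\{v:\Omega\to\mathbb{R} \text{ measurable},\ |v|^q_{L^{p,q}} = \int_0^{|\Omega|}\bigl[t^{\frac1p}|v|_{**}(t)\bigr]^q\frac{dt}{t} < +\infty\Bigr\}\quad\text{for } q < +\infty,
\]
\[
L^{p,\infty}(\Omega) = \Bigl\{v:\Omega\to\mathbb{R} \text{ measurable},\ |v|_{L^{p,\infty}} = \sup_{t\le|\Omega|}t^{\frac1p}|v|_{**}(t) < +\infty\Bigr\},
\]
$\chi_E$ is the characteristic function of a set $E\subset\Omega$ and $|v|_{**}(t) = \frac1t\int_0^t|v|_*(s)\,ds$ for $t\in\Omega_* = \,]0,|\Omega|[$.
We denote by $\partial_i = \frac{\partial}{\partial x_i}$, $\partial_{ij} = \frac{\partial^2}{\partial x_i\partial x_j}$. We define the following sets
\[
W^1\bigl(\Omega,|\cdot|_{p,q}\bigr) = \bigl\{v\in W^{1,1}(\Omega): |\nabla v|\in L^{p,q}(\Omega)\bigr\}
\]
and
\[
W^2\bigl(\Omega,|\cdot|_{p,q}\bigr) = \bigl\{v\in W^{2,1}(\Omega): \partial_{ij}v\in L^{p,q}(\Omega) \text{ for } (i,j)\in\{1,\dots,N\}^2\bigr\}.
\]
We shall denote by $c$ various constants depending only on the data. The notation $\approx$ stands for equivalence of nonnegative quantities, that is
\[
\Lambda_1\approx\Lambda_2 \iff \exists c_1 > 0,\ c_2 > 0 \text{ such that } c_1\Lambda_2 \le \Lambda_1 \le c_2\Lambda_2.
\]
We first extend the Agmon–Douglis–Nirenberg theorem to Lorentz spaces.
Lemma 1. Consider $L^*$ the above linear operator. There exists a constant $c(\Omega,L^*) > 0$ such that $\forall g\in L^{N,1}(\Omega)$ there exists a function $\varphi\in W^2(\Omega,|\cdot|_{N,1})\cap H_0^1(\Omega)$ satisfying $L^*\varphi = g$, and
\[
|\varphi|_{H^1} + \max_{i,j}|\partial_{ij}\varphi|_{L^{N,1}} \le c(\Omega,L^*)|g|_{L^{N,1}}.
\]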
Proof. For $g\in L^2(\Omega)$, we know (see [9]) that there exists a unique function $\varphi\in H^2(\Omega)\cap H_0^1(\Omega)$ such that $L^*\varphi = g$. This defines a continuous linear operator $A$ from $L^2(\Omega)$ into $H^2(\Omega)$ by setting $Ag = \varphi$. Let $(i,j)\in\{1,\dots,N\}^2$, $x\in\Omega$; we define
\[
T_{ij}g(x) = \partial_{ij}Ag(x).
\]
$T_{ij}$ is a linear map acting continuously from $L^p(\Omega)$ into $L^p(\Omega)$ for all $p\in[2,+\infty[$ according to the Agmon–Douglis–Nirenberg theorem [9]; we derive from Marcinkiewicz's interpolation theorem (see [1]) that it maps continuously $L^{N,1}(\Omega)$ into $L^{N,1}(\Omega)$:
\[
|T_{ij}g|_{L^{N,1}} \le c(\Omega,L^*)|g|_{L^{N,1}}.
\]
This shows the lemma with the fact that $|\nabla Ag|_{L^2} \le c|g|_{L^2} \le c|g|_{L^{N,1}(\Omega)}$. □

3. General result for $f\in L^1(\Omega,\delta)$
The following existence theorem follows the idea of H. Brezis [4] and the regularity improves the one obtained in [6] for the case $L = -\Delta$ and in [17] for the case of a general operator $L$.
Theorem 1. Let $f\in L^1(\Omega,\delta)$ and $N' = \frac{N}{N-1}$. Then there exists a unique function $v\in L^{N',\infty}(\Omega)$ satisfying
\[
DG_L(\Omega):\qquad \int_\Omega vL^*\varphi\,dx = \int_\Omega f\varphi\,dx,\quad\forall\varphi\in W^2\bigl(\Omega,|\cdot|_{N,1}\bigr)\cap H_0^1(\Omega).
\]
Moreover, there exists a constant $c(\Omega,L) > 0$ such that
\[
|v|_{L^{N',\infty}} \le c(\Omega,L)|f|_{L^1(\Omega,\delta)}. \tag{1}
\]
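Proof. For $k \ge 1$, we define the usual truncation
\[
T_k(\sigma) = \begin{cases}\sigma & \text{if } |\sigma| \le k,\\ k\,\mathrm{sign}(\sigma) & \text{otherwise},\end{cases}\qquad\sigma\in\mathbb{R}.
\]
We set $f_k = T_k(f)\in L^1(\Omega,\delta)\cap L^\infty(\Omega)$ and $f_k\to f$ in $L^1(\Omega,\delta)$. By standard result there exists a unique function
\[
v_k\in W^{2,p}(\Omega)\cap H_0^1(\Omega),\quad\forall p\in[1,+\infty[:\qquad Lv_k = f_k.
\]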
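The truncation step can be illustrated on a one-dimensional toy model. The sketch below makes several assumptions for illustration only: $\Omega$ is replaced by the interval $(0,1)$, so that $\delta(x) = \min(x, 1-x)$, and the datum is $f(x) = x^{-3/2}$, which belongs to $L^1((0,1),\delta)$ but not to $L^1((0,1))$. It evaluates a midpoint-rule approximation of the weighted error between $f$ and $T_k(f)$ for increasing $k$; the values should decrease.

```python
def truncate(sigma, k):
    """T_k(sigma): sigma itself when |sigma| <= k, otherwise k * sign(sigma)."""
    if abs(sigma) <= k:
        return sigma
    return k if sigma > 0 else -k

def weighted_l1_error(k, steps=20000):
    """Midpoint-rule value of the integral of |f - T_k(f)| * delta over (0, 1)."""
    dx = 1.0 / steps
    total = 0.0
    for i in range(steps):
        x = (i + 0.5) * dx
        f = x ** -1.5             # hypothetical datum, singular at the boundary point 0
        delta = min(x, 1.0 - x)   # distance to the boundary of (0, 1)
        total += abs(f - truncate(f, k)) * delta * dx
    return total

errors = [weighted_l1_error(k) for k in (10, 100, 1000)]
print(errors)
```

The quadrature is crude near the singularity, so the printed values only indicate the trend and are not meant as accurate integrals.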
Next we want to show that $v_k$ is a Cauchy sequence in $L^{N',\infty}(\Omega)$. For $n \ge 1$, $k \ge 1$, we set $v^{nk} = v_n - v_k$, $f^{nk} = f_n - f_k$. Then $Lv^{nk} = f^{nk}$ which implies that $\forall\varphi\in H^2(\Omega)\cap H_0^1(\Omega)$
\[
\int_\Omega v^{nk}L^*\varphi\,dx = \int_\Omega f^{nk}\varphi\,dx. \tag{2}
\]
For any $E$ measurable in $\Omega$, there exists a function $\varphi_E\in W^{2,p}(\Omega)\cap H_0^1(\Omega)$ such that
\[
L^*\varphi_E = \chi_E\,\mathrm{sign}\,v^{nk}. \tag{3}
\]
From Sobolev embedding associated to Lorentz spaces (see [14]), we have
\[
|\nabla\varphi_E|_\infty \le c(\Omega)\Bigl[\max_{i,j}|\partial_{ij}\varphi_E|_{L^{N,1}} + |\varphi_E|_{H^1}\Bigr] \tag{4}
\]
and using Lemma 1, we derive that
\[
|\nabla\varphi_E|_\infty \le c|\chi_E|_{L^{N,1}} \le c|E|^{\frac1N}. \tag{5}
\]
Since $\forall x\in\Omega$, we have
\[
|\varphi_E(x)| \le c\,\delta(x)\,|\nabla\varphi_E|_\infty, \tag{6}
\]
and from relations (5) and (6), we get
\[
\frac{|\varphi_E(x)|}{\delta(x)} \le c|E|^{\frac1N},\quad\forall x\in\Omega. \tag{7}
\]
From relations (2) and (3), we derive
\[
\int_E|v_n - v_k|\,dx = \int_\Omega f^{nk}\varphi_E\,dx. \tag{8}
\]
From relations (7) and (8), we have
\[
\int_E|v_n - v_k|\,dx \le c|E|^{\frac1N}\int_\Omega|f_n - f_k|(x)\delta(x)\,dx \tag{9}
\]
for all $E$ measurable sets in $\Omega$. Using the Hardy–Littlewood inequality (see [14,1]), we have
\[
\sup_{t\le|\Omega|}t^{1-\frac1N}|v_n - v_k|_{**}(t) \le c|f_n - f_k|_{L^1(\Omega,\delta)}. \tag{10}
\]
This shows the result. Knowing that $L^{N',\infty}$ is the dual and associate space of $L^{N,1}$ we pass to the limit in the relation
\[
\int_\Omega v_kL^*\psi\,dx = \int_\Omega f_k\psi\,dx,\quad\forall\psi\in W^2\bigl(\Omega,|\cdot|_{N,1}\bigr)\cap H_0^1(\Omega) \tag{11}
\]
as $k\to+\infty$ to derive the result. □
Next, we want to show that the solution is in $W^{1,q}(\Omega,\delta)$ for $\frac{2N}{2N-1} > q$ provided that $c_0(x) = 0 = b_i(x)$ and $a_{ij} = a_{ji}$.
Lemma 2. Assume $L$ is the self-adjoint uniformly elliptic operator $L = -\sum_{i,j}\partial_i\bigl(a_{i,j}(\cdot)\partial_j\bigr)$. Then there exists a function $\varphi_1\in W^{2,p}(\Omega)\cap H_0^1(\Omega)$ and $\lambda_1 > 0$, $\forall p\in\,]1,+\infty[$, satisfying
\[
L\varphi_1 = \lambda_1\varphi_1 \text{ in } \Omega,\qquad \varphi_1\neq0.
\]
Moreover, there are two constants $c_1 > 0$, $c_2 > 0$ such that
\[
c_1\delta(x) \le \varphi_1(x) \le c_2\delta(x)\quad\forall x\in\Omega.
\]
Proof. The proof of the existence is classical (see [15,5]). The estimate is a consequence of Hopf lemma and can be proved as in the case $L = -\Delta$ (see [12,2]). □
Theorem 2. Under the same assumptions as for Lemma 2, the unique generalized function $v$ given in Theorem 1 belongs to $W^{1,q}(\Omega,\delta)$ for $1 \le q < \frac{2N}{2N-1}$.
Proof. Let us show that the sequence $v_k\in W^{2,p}(\Omega)\cap H_0^1(\Omega)$ $\forall p\in[1,+\infty[$ solution of $Lv_k = f_k$ is a Cauchy sequence in $W^{1,q}(\Omega,\delta)$. For $\eta > 0$ (that we shall choose later), we consider
\[
\phi_\eta(\sigma) = \int_0^\sigma\frac{dt}{(1+t^2)^{\frac{1+\eta}{2}}},\quad\sigma\in\mathbb{R},
\]
the function $\psi_k = \phi_\eta(v_k)\varphi_1$, with $\varphi_1$ the first eigenfunction associated to $L$. Then,
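\[
\sum_{i,j}\int_\Omega a_{ij}(x)\partial_iv_k\partial_j\psi_k\,dx = \int_\Omega f_k\psi_k\,dx. \tag{12}
\]
Using the coercivity condition on $a_{ij}$, we have
\[
\alpha\int_\Omega\frac{|\nabla v_k|^2}{(1+v_k^2)^{\frac{1+\eta}{2}}}\varphi_1\,dx + \sum_{i,j}\int_\Omega a_{ij}(x)\partial_j\varphi_1\,\partial_i\Bigl(\int_0^{v_k}\phi_\eta(t)\,dt\Bigr)dx \le \int_\Omega f_k\psi_k\,dx. \tag{13}
\]
We have
\[
\sum_{i,j}\int_\Omega a_{ij}(x)\partial_j\varphi_1\,\partial_i\Bigl(\int_0^{v_k}\phi_\eta(t)\,dt\Bigr)dx = \int_\Omega\Bigl(\int_0^{v_k}\phi_\eta(t)\,dt\Bigr)L\varphi_1\,dx = \lambda_1\int_\Omega\varphi_1\Bigl(\int_0^{v_k}\phi_\eta(t)\,dt\Bigr)dx. \tag{14}
\]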
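From relations (13) and (14), we derive using Lemma 2
\[
\int_\Omega\frac{|\nabla v_k|^2}{(1+v_k^2)^{\frac{1+\eta}{2}}}\delta(x)\,dx \le c|\phi_\eta|_\infty\Bigl[\int_\Omega\varphi_1|v_k|\,dx + \int_\Omega|f_k|\varphi_1\,dx\Bigr] \le c(\eta)\int_\Omega|f_k(x)|\delta(x)\,dx. \tag{15}
\]
We conclude as in [13] (see also [3] for another proof), using Hölder inequality: with $q\in\bigl[1,\frac{2N}{2N-1}\bigr[$, we have
\[
\int_\Omega|\nabla v_k|^q(x)\delta(x)\,dx \le c|f_k|^{\frac q2}_{L^1(\Omega,\delta)}\Bigl[1 + \int_\Omega|v_k|^m(x)\delta(x)\,dx\Bigr]^{1-\frac q2} \tag{16}
\]
with $m = \frac{(1+\eta)q}{2-q} < \frac{N}{N-1}$ (choice of $\eta$). Since
\[
|v_k|_{L^m(\Omega,\delta)} \le c|v_k|_{L^{N',\infty}} \le c|f_k|_{L^1(\Omega,\delta)},
\]
we then have
\[
\int_\Omega|\nabla v_k|^q(x)\delta(x)\,dx \le c|f_k|^{\frac q2}_{L^1(\Omega,\delta)}\bigl[1 + |f|_{L^1(\Omega,\delta)}\bigr]^{1-\frac q2}. \tag{17}
\]
By linearity of the equation
\[
\bigl|\nabla(v_k - v_n)\bigr|^q_{L^q(\Omega,\delta)} \le c|f_k - f_n|^{\frac q2}_{L^1(\Omega,\delta)} \to 0\quad\text{as } n,k\to+\infty.
\]
We conclude that $v\in W^{1,q}(\Omega,\delta)$. □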
4. Necessary and sufficient conditions for the $L^q(\Omega)$-integrability of gradient
In this section, we are investigating the integrability of the gradient on the whole set $\Omega$ for the general operator $L$. We start with the following theorem:
Theorem 3. Let $v$ be the unique solution of $(DG_L(\Omega))$ given in Theorem 1. If $f\in L^1(\Omega,\delta^\alpha)$ for some $\alpha\in[0,1[$, then
\[
|\nabla v|\in L^{\frac{N}{N-1+\alpha},\infty}(\Omega).
\]
Moreover, there exists $c(\Omega,L) > 0$ such that
\[
|\nabla v|_{L^{\frac{N}{N-1+\alpha},\infty}} \le c(\Omega,L)|f|_{L^1(\Omega,\delta^\alpha)}.
\]
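The proof relies on the following result which is a consequence of Simader's result [16].
Lemma 3. For $u\in L^1_{\mathrm{loc}}(\Omega)$, we consider a measurable vector field $H(u)\in L^\infty(\Omega)^N$. For any function $g\in L^p(\Omega)$, $2 \le p < +\infty$ there exists a unique function $\varphi\in W_0^{1,p}(\Omega)$ such that
\[
B(\varphi,\psi) := \sum_{i,j}\int_\Omega a_{ij}(x)\partial_i\varphi\,\partial_j\psi\,dx + \sum_i\int_\Omega b_i(x)\varphi\,\partial_i\psi\,dx + \int_\Omega c_0\varphi\psi\,dx = \int_\Omega g(x)H(u)\cdot\nabla\psi\,dx,\quad\forall\psi\in W_0^{1,p'}(\Omega),\ \frac1p + \frac1{p'} = 1.
\]
Moreover, there exists a constant $c = c(\Omega,L,p) > 0$ (independent of $\varphi$) such that
\[
|\varphi|_{W_0^{1,p}(\Omega)} \le c\bigl|H(u)g\bigr|_{L^p(\Omega)}. \tag{18}
\]
Proof. If $p = 2$ it is a consequence of Lax–Milgram theorem. We notice that
\[
B(\varphi,\varphi) \ge \alpha|\nabla\varphi|^2_{L^2} + \int_\Omega\Bigl(c_0 - \frac12\sum_{i=1}^N\frac{\partial b_i}{\partial x_i}\Bigr)\varphi^2\,dx \ge \alpha|\nabla\varphi|^2_{L^2}
\]
for $\varphi\in H_0^1(\Omega)$ and then
\[
|\nabla\varphi|^2_{L^2} + |\varphi|^2_{L^2} \le c\bigl|H(u)g\bigr|^2_{L^2(\Omega)}. \tag{19}
\]
If $p > 2$, we apply Simader's result to derive the regularity of the above unique function $\varphi\in W_0^{1,p}(\Omega)$. Moreover, there exists a constant $\gamma(\Omega,p,L) > 0$:
\[
|\varphi|_{W^{1,p}} \le \gamma\bigl[\bigl|H(u)g\bigr|_{L^p} + |\varphi|_{L^p}\bigr]. \tag{20}
\]
Since
\[
|\varphi|_{L^2} \le c\bigl|H(u)g\bigr|_{L^2} \le c\bigl|H(u)g\bigr|_{L^p}, \tag{21}
\]
and
\[
|\varphi|_p \le c|\varphi|^\theta_{L^2}|\varphi|^{1-\theta}_{W_0^{1,p}}\quad\text{for some } \theta\in\,]0,1[, \tag{22}
\]
one derives via Young's inequality, relations (20) to (22)
\[
|\varphi|_{W^{1,p}} \le c\bigl|H(u)g\bigr|_{L^p}. \qquad\Box
\]
We shall use a corollary of Lemma 3.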
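Corollary 3.1 of Lemma 3. Under the same assumptions as for Lemma 3, for all $p \ge 2$, all $r\in[1,+\infty]$, if $g\in L^{p,r}(\Omega)$ then the unique solution $\varphi$ of
\[
B(\varphi,\psi) = \int_\Omega gH(u)\nabla\psi\,dx,\quad\forall\psi\in W_0^{1,p'}(\Omega), \tag{23}
\]
belongs to $W^1(\Omega,|\cdot|_{p,r})$ and
\[
|\nabla\varphi|_{L^{p,r}(\Omega)} \le c\bigl|H(u)g\bigr|_{L^{p,r}(\Omega)}. \tag{24}
\]
Moreover, for $p \ge N$, for all $x\in\Omega$
\[
|\varphi(x)| \le c_p\bigl|H(u)g\bigr|_{L^{p,1}}\cdot\delta(x)^{1-\frac Np}. \tag{25}
\]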
Proof. To deduce the relation (24), we apply Marcinkiewicz's interpolation theorem (see [1]) with $Tg = |\nabla Ag|$, where the map $A$ is defined as $A(g) = \varphi$ with $\varphi$ the unique solution of (23). $T$ maps $L^p(\Omega)$ into $L^p(\Omega)$ continuously and then from $L^{p,r}(\Omega)$ into itself. Therefore, we have relation (24) thanks to Lemma 3. While for relation (25), we use the Sobolev embedding $W^1(\Omega,|\cdot|_{p,1})\hookrightarrow C^{0,\beta}(\overline{\Omega})$ with $\beta = 1 - \frac Np$ if $p > N$ and $W^1(\Omega,|\cdot|_{N,1})\hookrightarrow C(\overline{\Omega})\cap L^\infty(\Omega)$ if $p = N$ (see [14]). We combine these results with relation (24) to derive the result. □
Proof of Theorem 3. We shall consider $v_k\in W^{2,p}(\Omega)\cap H_0^1(\Omega)$ $\forall p\in[1,+\infty[$ satisfying $Lv_k = f_k = T_k(f)$. We want to show that $(v_k)_{k\ge1}$ is a Cauchy sequence in $W^1(\Omega,|\cdot|_{q_\alpha,\infty})$ with $q_\alpha = \frac{N}{N-1+\alpha}$. We introduce
\[
H(v_k) = \begin{cases}\dfrac{\nabla v_k}{|\nabla v_k|} & \text{if } \nabla v_k\neq0,\\[4pt] 0 & \text{otherwise}.\end{cases}
\]
Then, for any $E$ measurable $\subset\Omega$, we have from Lemma 3 and its corollary a function $\varphi_E\in W^1(\Omega,|\cdot|_{p,1})$ $\forall p\in[2,+\infty[$ such that
\[
B(\varphi_E,\psi) = \int_EH(v_k)\cdot\nabla\psi\,dx\quad\forall\psi\in W^{1,p'}(\Omega)\quad\Bigl(\frac1p + \frac1{p'} = 1\Bigr).
\]
Choosing $\psi = v_k$, we have
\[
\int_E|\nabla v_k|\,dx = B(\varphi_E,v_k) = \int_\Omega\varphi_ELv_k\,dx = \int_\Omega f_k\varphi_E\,dx. \tag{26}
\]
From relation (25), we know that
\[
|\varphi_E(x)| \le c|\chi_E|_{L^{p,1}}\cdot\delta(x)^{1-\frac Np}. \tag{27}
\]
So let us fix $\alpha\in[0,1[$ and choose $p$ so that
\[
\alpha = 1 - \frac Np\quad\text{that is}\quad\frac1p = \frac{1-\alpha}{N}. \tag{28}
\]
Therefore, from relations (26) and (27) one has
\[
\int_E|\nabla v_k| \le c|\chi_E|_{L^{p,1}}\cdot|f_k|_{L^1(\Omega,\delta^\alpha)}. \tag{29}
\]
Since $|\chi_E|_{L^{p,1}} \le c_p|E|^{\frac1p}$, one has from relation (29)
\[
\sup_{t\le|\Omega|}t^{1-\frac1p}|\nabla v_k|_{**}(t) \le c|f_k|_{L^1(\Omega,\delta^\alpha)}\quad\text{with } 1 - \frac1p = \frac{N-1+\alpha}{N} = \frac{1}{q_\alpha}. \tag{30}
\]
By linearity, relation (30) implies that $(v_k)_{k\ge1}$ is a Cauchy sequence in $W^1(\Omega,|\cdot|_{q_\alpha,\infty})$. □
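Now, we are able to prove
Theorem 4. Let $v$ be the unique solution of the generalized Dirichlet problem $(DG_L(\Omega))$, $f \ge 0$. Then $v\in W_0^{1,q}(\Omega)$ for some $q > 1$ if and only if there exists $\alpha\in[0,1[$ such that $f\in L^1(\Omega,\delta^\alpha)$.
Proof. From Theorem 3, we know that
\[
\text{if } f\in L^1(\Omega,\delta^\alpha) \text{ then } v\in W^1\bigl(\Omega,|\cdot|_{q_\alpha,\infty}\bigr)\subset W_0^{1,q}(\Omega)\ \text{ for all } 1 \le q < q_\alpha.
\]
For the converse, we use that $f \ge 0$. If $v\in W_0^{1,q}(\Omega)$, $q > 1$ then we have for all $\varphi\in C_c^\infty(\Omega)$
\[
-\int_\Omega v\,\partial_i\bigl(a_{ij}(x)\cdot\partial_j\varphi\bigr)dx = \int_\Omega\partial_iv\,\partial_j\varphi\,a_{ij}\,dx. \tag{31}
\]
We deduce from the equation satisfied by $v$ that $\forall\varphi\in C_c^\infty(\Omega)$, $\varphi \ge 0$
\[
\int_\Omega f\varphi\,dx = \sum_{i,j}\int_\Omega a_{ij}(x)\partial_iv\,\partial_j\varphi\,dx + \sum_i\int_\Omega b_iv\,\partial_i\varphi\,dx + \int_\Omega c_0v\varphi\,dx = B(v,\varphi). \tag{32}
\]
Using a density argument and Fatou's lemma, the relation (32) implies $\forall\varphi\in W_0^{1,q'}(\Omega)$, $\varphi \ge 0$, $\frac1q + \frac1{q'} = 1$
\[
\int_\Omega f\varphi\,dx \le B(v,\varphi). \tag{33}
\]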
We choose $1 > \alpha = 1 - \beta > \frac{1}{q}$; we have $\int_\Omega \delta(x)^{-q'\beta} \, dx < +\infty$, therefore the function
\[
\delta^\alpha \in W_0^{1,q'}(\Omega) \quad \text{and} \quad \int_\Omega |\nabla \delta^\alpha|^{q'} \, dx \le \int_\Omega \delta^{-\beta q'}(x) \, dx < +\infty.
\]
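Choosing as a test function $\varphi = \delta^\alpha$ in relation (33), we obtain, by Hölder's inequality,
\[
0 \le \int_\Omega f \delta^\alpha \, dx \le c \left[ \Bigl(\int_\Omega |v|^q \, dx\Bigr)^{\frac{1}{q}} + \Bigl(\int_\Omega |\nabla v|^q \, dx\Bigr)^{\frac{1}{q}} + \int_\Omega |v| \, dx \right], \tag{34}
\]
which shows the result. □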
Next, we want to analyze a specific case, namely when we "symmetrize" the equation. Unfortunately, the usual trick, which consists in comparing $v$ (when $f \ge 0$) with a radially decreasing function $U$ associated with $\tilde f$, the radially decreasing rearrangement of $f$, gives no information on the integrability of $v$ or of its gradient (see Lemma 6). The following remark partly explains this fact.

Remark 2. In general, when we consider the ball $\tilde\Omega$ having the same measure $|\Omega|$ as $\Omega$, the distance to the boundary $\tilde\delta(x) = \delta_{\tilde\Omega}(x)$ is given by
\[
\delta_{\tilde\Omega}(x) = \alpha_N^{-\frac{1}{N}} |\Omega|^{\frac{1}{N}} - |x|, \quad x \in \tilde\Omega.
\]
Setting $R = \alpha_N^{-\frac{1}{N}} |\Omega|^{\frac{1}{N}}$, if $\tilde f \in L^1(\tilde\Omega, \delta_{\tilde\Omega})$, $f \ge 0$, then $f \in L^1(\Omega)$. Indeed
\[
\int_\Omega f \, dy = \int_{\tilde\Omega} \tilde f(x) \, dx = \int_{\{|x| \le \frac{R}{2}\}} \tilde f(x) \, dx + \int_{\{\frac{R}{2} < |x| \le R\}} \tilde f(x) \, dx \le \frac{2}{R} \int_{\tilde\Omega} \tilde f(x) \, \delta_{\tilde\Omega}(x) \, dx + f_*\Bigl(\alpha_N \Bigl(\frac{R}{2}\Bigr)^N\Bigr) |\Omega| < +\infty.
\]
The situation is different with respect to the increasing symmetric rearrangement $\underset{\sim}{f}$ of $f$, defined through the scalar increasing rearrangement of $f$. We shall use

Lemma 4. Assume that $\max_\Omega \delta(x) = 1$ and $|\Omega| = \alpha_N$. Then

(i) $\delta_*(s) = 1 - \dfrac{s^{\frac{1}{N}}}{\alpha_N^{\frac{1}{N}}}$, $s \in [0, |\Omega|]$, and up to a translation $\Omega$ is equal to the unit ball.

(ii) $\forall f \in L^1(\Omega, \delta)$, $f \ge 0$, one has
\[
f_* \in L^1(\Omega_*, \sigma), \qquad \int_{\Omega_*} f_*(\sigma) \sigma \, d\sigma \le N \alpha_N \int_\Omega f(x) \delta(x) \, dx.
\]
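Proof. Following the proof given in [14], we have
\[
\delta_*(0) - \frac{s^{\frac{1}{N}}}{\alpha_N^{\frac{1}{N}}} \le \delta_*(s) \le \frac{|\Omega|^{\frac{1}{N}} - s^{\frac{1}{N}}}{\alpha_N^{\frac{1}{N}}}. \tag{35}
\]
Therefore, under the assumptions of the lemma, we have
\[
\delta_*(0) = 1 = \frac{|\Omega|^{\frac{1}{N}}}{\alpha_N^{\frac{1}{N}}},
\]
which shows (i). For (ii), we apply the Hardy–Littlewood inequality to derive
\[
\int_{\tilde\Omega} \underset{\sim}{f}(x) \, \tilde\delta(x) \, dx \le \int_\Omega f(x) \delta(x) \, dx. \tag{36}
\]
But, setting $t = \alpha_N r^N$, one has
\[
\begin{aligned}
\int_{\tilde\Omega} \underset{\sim}{f}(x) \, \tilde\delta(x) \, dx
&= \int_{\tilde\Omega} f_*\bigl(|\Omega| - \alpha_N |x|^N\bigr) \cdot \delta_*\bigl(\alpha_N |x|^N\bigr) \, dx \\
&= N \alpha_N \int_0^1 f_*\bigl(\alpha_N (1 - r^N)\bigr) \, \delta_*\bigl(\alpha_N r^N\bigr) \, r^{N-1} \, dr \\
&= \int_0^{\alpha_N} f_*(\alpha_N - t) \, \delta_*(t) \, dt \\
&= \frac{1}{\alpha_N^{\frac{1}{N}}} \int_0^{\alpha_N} f_*(\alpha_N - t) \cdot \bigl(\alpha_N^{\frac{1}{N}} - t^{\frac{1}{N}}\bigr) \, dt \\
&= \frac{1}{\alpha_N^{\frac{1}{N}}} \int_0^{\alpha_N} f_*(\alpha_N - t) \cdot \frac{\alpha_N - t}{P_N(t)} \, dt,
\end{aligned} \tag{37}
\]
where, for $t \in [0, \alpha_N]$,
\[
P_N(t) = \frac{\alpha_N - t}{\alpha_N^{\frac{1}{N}} - t^{\frac{1}{N}}}
\]
is a polynomial function of $t^{\frac{1}{N}}$, increasing in $t$, and
\[
\alpha_N^{1 - \frac{1}{N}} = P_N(0) \le P_N(t) \le P_N(\alpha_N) = N \alpha_N^{1 - \frac{1}{N}}.
\]
Thus from (37), we have
\[
\frac{1}{\alpha_N^{\frac{1}{N}}} \, \frac{1}{P_N(\alpha_N)} \int_0^{\alpha_N} f_*(\sigma) \sigma \, d\sigma \le \int_{\tilde\Omega} \underset{\sim}{f}(x) \, \tilde\delta(x) \, dx. \tag{38}
\]
From (36) and (38) we get statement (ii). □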
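The two-sided bound on $P_N$ used in the proof of Lemma 4 can be checked numerically. The sketch below is only an illustration, not part of the original argument: the helper names `unit_ball_volume`, `p_n` and `check_bounds` are hypothetical, and $\alpha_N$ is assumed to be the Lebesgue measure of the unit ball of $\mathbb{R}^N$, i.e. $\pi^{N/2}/\Gamma(N/2+1)$. It rewrites $P_N(t)$ as the geometric sum $\sum_{k=0}^{N-1} a^{N-1-k} b^k$ with $a = \alpha_N^{1/N}$, $b = t^{1/N}$, which makes the monotonicity in $t$ visible.

```python
import math


def unit_ball_volume(n: int) -> float:
    """alpha_N, assumed here to be the measure of the unit ball of R^N."""
    return math.pi ** (n / 2) / math.gamma(n / 2 + 1)


def p_n(t: float, n: int, alpha: float) -> float:
    """P_N(t) = (alpha - t) / (alpha^(1/N) - t^(1/N)).

    Evaluated as the geometric sum  sum_k a^(N-1-k) * b^k  with
    a = alpha^(1/N) and b = t^(1/N); this avoids the 0/0 form at t = alpha.
    """
    a = alpha ** (1.0 / n)
    b = t ** (1.0 / n)
    return sum(a ** (n - 1 - k) * b ** k for k in range(n))


def check_bounds(n: int, samples: int = 200) -> bool:
    """Check on a grid that P_N is nondecreasing and stays between
    P_N(0) = alpha^(1-1/N) and P_N(alpha) = N * alpha^(1-1/N)."""
    alpha = unit_ball_volume(n)
    lower = alpha ** (1.0 - 1.0 / n)
    upper = n * lower
    values = [p_n(alpha * i / samples, n, alpha) for i in range(samples + 1)]
    increasing = all(x <= y + 1e-12 for x, y in zip(values, values[1:]))
    bounded = all(lower - 1e-9 <= v <= upper + 1e-9 for v in values)
    return increasing and bounded
```

Calling `check_bounds(n)` for small dimensions such as `n = 2` or `n = 3` gives a quick sanity check of the inequality; it is a grid test, not a proof.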
5. Some results on a ball: f is replaced by its increasing rearrangement $\underset{\sim}{f}$

The aim of this section is to show that there are functions $v$ whose data are in $L^1(\Omega, \delta)$ for which we have only the regularity $W^{1,1}(\Omega)$. Setting
\[
L^1\bigl(\Omega, \delta^{1^-}\bigr) = \bigcup_{0 \le \alpha < 1} L^1(\Omega, \delta^\alpha),
\]
a natural question concerns the global differentiability of $v$ on the entire $\Omega$ when $f \in L^1(\Omega, \delta) \setminus L^1(\Omega, \delta^{1^-})$. A partial answer can be given if $\Omega$ is a ball and when $L = -\Delta$. In this case, we have an estimate of the gradient of the Green function given in [10]. We have

Proposition 1. Assume that $L = -\Delta$. If $f \in L^1(\Omega, \delta|\operatorname{Ln}\delta|)$ then the function $v$ solution of (DGL(Ω)) is in $W_0^{1,1}(\Omega)$. Moreover, there exists a constant $c > 0$ such that
\[
|\nabla v|_{L^1} \le c\, |f|_{L^1(\Omega, \delta|\operatorname{Ln}\delta|)}.
\]
Proof. The function $v_k \in W^{2,p}(\Omega) \cap H_0^1(\Omega)$, $p \in [1, +\infty[$, solution of $-\Delta v_k = f_k$ satisfies
\[
v_k(x) = \int_\Omega G(x, y) f_k(y) \, dy,
\]
where $G$ is the Green function associated to the Dirichlet problem. According to [10], there exists a constant $c > 0$ such that
\[
|\nabla G|(x, y) \le c\, |x - y|^{1-N} \min\Bigl(1, \frac{\delta(y)}{|x - y|}\Bigr). \tag{39}
\]
By Fubini's theorem, one has
\[
\int_\Omega |\nabla v_k|(x) \, dx \le \int_\Omega f_k(y) \int_\Omega |\nabla G|(x, y) \, dx \, dy \tag{40}
\]
and by the estimate (39), we have
\[
\int_\Omega |\nabla G|(x, y) \, dx \le c \int_{\{x:\, |x-y| \le \delta(y)\}} |x - y|^{1-N} \, dx + c\, \delta(y) \int_{\{x:\, |x-y| > \delta(y)\}} |x - y|^{-N} \, dx. \tag{41}
\]
Thus
\[
\int_\Omega |\nabla G|(x, y) \, dx \le c\, \delta(y) + c\, \delta(y) |\operatorname{Ln}\delta(y)|. \tag{42}
\]
From relations (40) and (42), we deduce
\[
\int_\Omega |\nabla v_k|(x) \, dx \le c \int_\Omega f(y) \delta(y) |\operatorname{Ln}\delta(y)| \, dy. \tag{43}
\]
This shows that $(v_k)_{k \ge 1}$ is a Cauchy sequence in $W_0^{1,1}(\Omega)$. Thus $v \in W_0^{1,1}(\Omega)$. □

We recall the following result, which can be obtained by some direct integrations (see for instance [11,7]).

Lemma 5. Let $f \in L^1(\Omega, \delta)$, $f \ge 0$, and let for $n \in \mathbb{N}$, $T_n(f) = \min(f, n) = f_n$. Then the sequence $(U_n)_{n \ge 0}$ defined on $\Omega$ by
\[
U_n(x) = \frac{1}{N^2 \alpha_N^{\frac{2}{N}}} \int_{\alpha_N |x|^N}^{\alpha_N} \sigma^{-2(1 - \frac{1}{N})} \int_0^\sigma f_{n*}(t) \, dt \, d\sigma,
\]
is the unique solution of
\[
-\Delta U_n(x) = \tilde f_n(x) = f_{n*}\bigl(\alpha_N |x|^N\bigr), \quad x \in \Omega, \qquad U_n = 0 \text{ on } \partial\Omega, \qquad U_n \in W_0^{1,q}(\Omega), \ \forall q < +\infty.
\]
Another lemma that shall explain the difference between the results when $f \in L^1(\Omega)$ and $f \in L^1(\Omega, \delta)$ is the following necessary and sufficient condition.
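Lemma 6. Under the same assumptions as in Lemma 5, we have $f \in L^1(\Omega)$ if and only if $\lim_{n \to +\infty} U_n = U$ is in $L^1(\Omega)$. And in this case (i.e. $f \in L^1(\Omega)$), the function $U$ is the unique solution of
\[
\begin{cases} -\Delta U = \tilde f & \text{in } \Omega, \\ U \in W_0^{1,q}(\Omega), & 1 \le q < \dfrac{N}{N-1}. \end{cases}
\]
Proof. We first note that $f_{n*} = T_n(f_*) = \min(f_*, n)$; thus by Beppo–Levi monotone convergence $\lim_{n \to +\infty} U_n(x) = U(x)$ exists in $[0, +\infty]$ for a.e. $x$, since
\[
U_n(x) \le U_{n+1}(x) \quad \forall x \in \Omega.
\]
The first part is well known, since if $f \in L^1(\Omega)$ then $\tilde f \in L^1(\Omega)$ and therefore the unique solution of the Dirichlet problem is $U$. Conversely, assume that $0 \le \int_\Omega U(x) \, dx < +\infty$; one has for all $n \ge 1$, using integration by parts,
\[
\int_\Omega U(x) \, dx \ge \int_\Omega U_n(x) \, dx = \int_0^{\alpha_N} s^{-1 + \frac{1}{N}} \int_0^s f_{n*}(t) \, dt \, ds = N \int_0^{\alpha_N} \bigl(\alpha_N^{\frac{1}{N}} - s^{\frac{1}{N}}\bigr) f_{n*}(s) \, ds. \tag{44}
\]
From relation (44), one has for $0 < \varepsilon^{\frac{1}{N}} \le \frac{\alpha_N^{1/N}}{2}$
\[
\int_0^\varepsilon f_{n*}(s) \, ds \le \frac{2}{N \alpha_N^{\frac{1}{N}}} |U|_{L^1(\Omega)} \tag{45}
\]
and
\[
\int_\varepsilon^{\alpha_N} f_{n*}(s) \, ds \le f_{n*}(\varepsilon) \cdot (\alpha_N - \varepsilon) \le f_*(\varepsilon)(\alpha_N - \varepsilon). \tag{46}
\]
We note that since $f \in L^1_+(\Omega, \delta)$, then for all $s \in \,]0, |\Omega|]$, $0 \le f_*(s) < +\infty$; in particular $f_*(\varepsilon) < +\infty$. Thus, for all $n \ge 0$,
\[
\int_\Omega f_n(x) \, dx \le f_*(\varepsilon)(\alpha_N - \varepsilon) + \frac{2}{N \alpha_N^{\frac{1}{N}}} |U|_{L^1(\Omega)}.
\]
This implies, by Fatou's lemma, that
\[
\int_\Omega f(x) \, dx \le f_*(\varepsilon)(\alpha_N - \varepsilon) + \frac{2}{N \alpha_N^{\frac{1}{N}}} |U|_{L^1(\Omega)}. \tag{47}
\]
□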
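As a sanity check of the explicit formula for $U_n$ in Lemma 5, one can take constant data. The sketch below is an illustration under stated assumptions, not part of the original text: it assumes $\Omega$ is the unit ball, $\alpha_N$ is its measure, and $f \equiv 1$ (so that $f_{n*} \equiv 1$ for $n \ge 1$ and the inner integral equals $\sigma$). In that case the solution of $-\Delta U = 1$ with zero boundary values is the classical $U(x) = (1 - |x|^2)/(2N)$, and the formula should reproduce it. The function names are hypothetical and the outer integral is approximated with a plain midpoint rule.

```python
import math


def unit_ball_volume(n: int) -> float:
    """alpha_N, assumed here to be the measure of the unit ball of R^N."""
    return math.pi ** (n / 2) / math.gamma(n / 2 + 1)


def u_constant_data(r: float, n: int, steps: int = 20000) -> float:
    """Evaluate the Lemma 5 formula at |x| = r for f_{n*} = 1.

    With constant data the inner integral over (0, sigma) is just sigma,
    so only the outer integral over (alpha_N * r^N, alpha_N) remains;
    it is approximated by the midpoint rule.
    """
    alpha = unit_ball_volume(n)
    lower = alpha * r ** n
    h = (alpha - lower) / steps
    total = 0.0
    for i in range(steps):
        sigma = lower + (i + 0.5) * h
        total += sigma ** (-2.0 * (1.0 - 1.0 / n)) * sigma * h
    return total / (n ** 2 * alpha ** (2.0 / n))


def u_exact(r: float, n: int) -> float:
    """Classical solution of -Laplace(U) = 1 in the unit ball, U = 0 on the boundary."""
    return (1.0 - r * r) / (2.0 * n)
```

Comparing `u_constant_data(r, n)` with `u_exact(r, n)` for a few radii `0 < r < 1` confirms the normalising constant $1/(N^2 \alpha_N^{2/N})$ in this special case.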
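Next, we want to prove the following theorem.

Theorem 5. Let $h \in L^1(\Omega, \delta)$, $h \ge 0$. Then, the unique solution $\omega \in L^1(\Omega)$ of
\[
-\Delta\omega(x) = \underset{\sim}{h}(x) = h_*\bigl(\alpha_N - \alpha_N |x|^N\bigr) \ \text{in } \Omega, \qquad \omega(x) = 0, \ x \in \partial\Omega,
\]
in the very weak sense given above belongs to $W_0^{1,1}(\Omega)$. Moreover we have
\[
|\omega|_{L^1(\Omega)} \le \frac{1}{N \alpha_N^{1 + \frac{1}{N}}} \bigl|h_*(\sigma)\sigma\bigr|_{L^1(\Omega_*)} \le \frac{1}{\alpha_N^{\frac{1}{N}}} |h|_{L^1(\Omega, \delta)},
\]
\[
|\nabla\omega|_{L^1(\Omega)} \le \frac{1}{N \alpha_N} \bigl|h_*(\sigma)\sigma\bigr|_{L^1(\Omega_*)} \le |h|_{L^1(\Omega, \delta)}.
\]
For this, we shall prove the following more general theorem, which shows merely that for a radial solution $\omega$, one has the $W^{1,1}$-regularity.

Theorem 6. Let $f_0$ be a given nonnegative measurable function on the interval $\Omega_*$ with $\sigma f_0(\sigma) \in L^1(\Omega_*)$. Then, $f \in L^1(\Omega, \delta_\Omega)$, with $f(x) = f_0(\alpha_N - \alpha_N |x|^N)$, and the unique generalized function $\omega \in L^1(\Omega)$ solution of $-\Delta\omega = f_0(\alpha_N - \alpha_N |x|^N)$ belongs to $W_0^{1,1}(\Omega)$. Moreover we have
\[
|\omega|_{L^1(\Omega)} \le \frac{1}{N \alpha_N^{1 + \frac{1}{N}}} \bigl|f_0(\sigma)\sigma\bigr|_{L^1(\Omega_*)} \le \frac{1}{\alpha_N^{\frac{1}{N}}} |f|_{L^1(\Omega, \delta)},
\]
\[
|\nabla\omega|_{L^1(\Omega)} \le \frac{1}{N \alpha_N} \bigl|f_0(\sigma)\sigma\bigr|_{L^1(\Omega_*)} \le |f|_{L^1(\Omega, \delta)}.
\]
Proof. Consider $f \ge 0$, with $f(x) = f_0(\alpha_N - \alpha_N |x|^N)$. We first remark, arguing as in Lemma 4, that $\int_\Omega (1 - |x|) f(x) \, dx$ is equivalent to $\int_0^{\alpha_N} \sigma f_0(\sigma) \, d\sigma$, and we have precisely
\[
\alpha_N \int_\Omega f(x) \delta(x) \, dx \le \int_0^{\alpha_N} \sigma f_0(\sigma) \, d\sigma \le N \alpha_N \int_\Omega f(x) \delta(x) \, dx.
\]
Thus, under the condition on $f_0$, one deduces that $f \in L^1(\Omega, \delta)$.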
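The function
\[
\omega(x) = \frac{1}{N^2 \alpha_N^{\frac{2}{N}}} \int_{\alpha_N |x|^N}^{\alpha_N} \sigma^{-2(1 - \frac{1}{N})} \int_0^\sigma f_0(\alpha_N - t) \, dt \, d\sigma
\]
is in $L^1(\Omega)$. Indeed, considering $f_{0n} = T_n(f_0)$ and the function
\[
\omega_n(x) = \int_{\alpha_N |x|^N}^{\alpha_N} \sigma^{-2(1 - \frac{1}{N})} \int_0^\sigma f_{0n}(\alpha_N - t) \, dt \, d\sigma, \quad x \in \Omega,
\]
one has by change of variables
\[
\begin{aligned}
\int_\Omega \omega_n(x) \, dx
&= \int_0^{\alpha_N} \int_s^{\alpha_N} \sigma^{-2(1 - \frac{1}{N})} \int_0^\sigma f_{0n}(\alpha_N - t) \, dt \, d\sigma \, ds \\
&= N \int_0^{\alpha_N} \bigl(\alpha_N^{\frac{1}{N}} - s^{\frac{1}{N}}\bigr) f_{0n}(\alpha_N - s) \, ds \\
&= N \int_0^{\alpha_N} \frac{(\alpha_N - s) f_{0n}(\alpha_N - s)}{P_N(s)} \, ds
\end{aligned} \tag{48}
\]
with $P_N(s) = \dfrac{\alpha_N - s}{\alpha_N^{1/N} - s^{1/N}}$, $s \in [0, \alpha_N[$. Thus from (48), we deduce
\[
\int_\Omega \omega_n(x) \, dx \le \frac{N}{P_N(0)} \int_0^{\alpha_N} f_0(\sigma) \sigma \, d\sigma. \tag{49}
\]
Letting $n \to +\infty$ in relation (49), we deduce from Fatou's lemma
\[
\int_\Omega \omega(x) \, dx \le \frac{1}{N \alpha_N^{1 + \frac{1}{N}}} \int_0^{\alpha_N} f_0(\sigma) \sigma \, d\sigma.
\]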
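The same analysis shows that for $(j, n) \in \mathbb{N}^2$, one has
\[
\int_\Omega \bigl|\nabla(\omega_n - \omega_j)(x)\bigr| \, dx \le \frac{1}{N \alpha_N^{\frac{1}{N}}} \int_0^{\alpha_N} s^{-1 + \frac{1}{N}} \Bigl| \int_0^s (f_{0n} - f_{0j})(\alpha_N - t) \, dt \Bigr| \, ds. \tag{50}
\]
From the latter, we derive
\[
\int_\Omega \bigl|\nabla(\omega_n - \omega_j)(x)\bigr| \, dx \le \frac{1}{N \alpha_N^{\frac{1}{N}} P_N(0)} \int_0^{\alpha_N} (\alpha_N - s) |f_{0n} - f_{0j}|(\alpha_N - s) \, ds \le \frac{1}{N \alpha_N} \int_0^{\alpha_N} \sigma \bigl|(f_{0n} - f_{0j})(\sigma)\bigr| \, d\sigma.
\]
By Lebesgue's dominated convergence theorem, we deduce that
\[
\lim_{\substack{n \to +\infty \\ j \to +\infty}} \int_0^{\alpha_N} \sigma \bigl|(f_{0n} - f_{0j})(\sigma)\bigr| \, d\sigma = 0.
\]
Thus $(\omega_n)$ is a Cauchy sequence in $W_0^{1,1}(\Omega)$. Therefore $\omega$ is in $W_0^{1,1}(\Omega)$. Moreover, we have the identity
\[
\int_\Omega \bigl|\nabla\omega(x)\bigr| \, dx = \frac{1}{N \alpha_N^{\frac{1}{N}}} \int_0^{\alpha_N} \frac{(\alpha_N - t) f_0(\alpha_N - t)}{P_N(t)} \, dt. \tag{51}
\]
From the latter, we derive
\[
|\nabla\omega|_{L^1(\Omega)} \le c_{3N} |f|_{L^1(\Omega, \delta)} \quad \text{with } c_{3N} = \frac{1}{N \alpha_N}. \tag{52}
\]
Since one has
\[
-\Delta\omega_n = f_{0n}\bigl(\alpha_N - \alpha_N |x|^N\bigr), \quad \omega_n \in H_0^1(\Omega),
\]
this implies that $\omega$ is a solution of DG(Ω). □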
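As a complement to Theorem 4, we can make precise the necessary and sufficient condition for a radial solution as in the above theorem. This will allow us to construct easily some examples for the applications.

Lemma 7. Let $q \in [1, N[$. Then the function $\omega$ given in Theorem 6 is in $W_0^{1,q}(\Omega)$ if and only if
\[
\int_0^{\alpha_N} \sigma f_0(\sigma) \Bigl( \int_\sigma^{\alpha_N} f_0(t) \, dt \Bigr)^{q-1} d\sigma = \frac{1}{q} \int_0^{\alpha_N} \Bigl( \int_\sigma^{\alpha_N} f_0(t) \, dt \Bigr)^{q} d\sigma
\]
is finite.

Proof. Assume first that $f_0 \in L^\infty(\Omega_*)$. One has for any $q \in [1, N[$
\[
\begin{aligned}
\int_\Omega \bigl|\nabla\omega(x)\bigr|^q \, dx
&= \gamma_N \int_0^{\alpha_N} s^{-\frac{q}{N}} \Bigl( \int_0^s f_0(\alpha_N - t) \, dt \Bigr)^{q} ds \\
&= \gamma_N \int_0^{\alpha_N} \bigl( \alpha_N^{1 - \frac{q}{N}} - s^{1 - \frac{q}{N}} \bigr) f_0(\alpha_N - s) \Bigl( \int_0^s f_0(\alpha_N - t) \, dt \Bigr)^{q-1} ds \\
&= \gamma_N \int_0^{\alpha_N} \frac{\alpha_N^{1 - \frac{q}{N}} - (\alpha_N - \sigma)^{1 - \frac{q}{N}}}{\sigma} \, \sigma f_0(\sigma) \Bigl( \int_\sigma^{\alpha_N} f_0(t) \, dt \Bigr)^{q-1} d\sigma,
\end{aligned}
\]
where $\gamma_N$ denotes a constant depending only on $N$ and $q$, which may change from line to line. There are two constants $c_1 > c_0 > 0$ such that, $\forall \sigma \in [0, \alpha_N]$,
\[
c_0 \le \frac{\alpha_N^{1 - \frac{q}{N}} - (\alpha_N - \sigma)^{1 - \frac{q}{N}}}{\sigma} \le c_1.
\]
Indeed
\[
\alpha_N^{1 - \frac{q}{N}} - (\alpha_N - \sigma)^{1 - \frac{q}{N}} = \alpha_N^{1 - \frac{q}{N}} \Bigl[ 1 - \Bigl( 1 - \frac{\sigma}{\alpha_N} \Bigr)^{1 - \frac{q}{N}} \Bigr].
\]
The last function is equivalent to
\[
\alpha_N^{-\frac{q}{N}} \Bigl( 1 - \frac{q}{N} \Bigr) \sigma \quad \text{as } \sigma \to 0.
\]
Therefore the quotient satisfies
\[
\frac{\alpha_N^{1 - \frac{q}{N}} - (\alpha_N - \sigma)^{1 - \frac{q}{N}}}{\sigma} \underset{\sigma \to 0}{\approx} \alpha_N^{-\frac{q}{N}} \Bigl( 1 - \frac{q}{N} \Bigr).
\]
Thus, one has the equivalence
\[
\int_\Omega \bigl|\nabla\omega(x)\bigr|^q \, dx \approx \int_0^{\alpha_N} \sigma f_0(\sigma) \Bigl( \int_\sigma^{\alpha_N} f_0(t) \, dt \Bigr)^{q-1} d\sigma.
\]
Since $t f_0(t) \in L^1(0, \alpha_N)$, then by integration by parts, the last integral is equal to $\frac{1}{q} I(f)$, where
\[
I(f) = \int_0^{\alpha_N} \Bigl( \int_\sigma^{\alpha_N} f_0(t) \, dt \Bigr)^{q} d\sigma.
\]
We have shown that there are two constants $k_{1N}$ and $k_{2N}$ depending only on $N$ and $q$ such that
\[
k_{1N} I(f) \le \int_\Omega |\nabla\omega|^q \, dx \le k_{2N} I(f). \tag{53}
\]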
If $f \in L^1(\Omega, \delta)$, $f \ge 0$, then $f_n = T_n(f) \in L^1(\Omega, \delta)$ and, with the expression of $\omega_n$ and $\omega$, we have by Beppo–Levi monotone convergence
\[
\lim_{n \to +\infty} \int_\Omega |\nabla\omega_n|^q \, dx = \int_\Omega |\nabla\omega|^q \, dx \tag{54}
\]
and
\[
\lim_{n \to +\infty} \int_0^{\alpha_N} \Bigl( \int_\sigma^{\alpha_N} f_{0n}(t) \, dt \Bigr)^{q} d\sigma = \int_0^{\alpha_N} \Bigl( \int_\sigma^{\alpha_N} f_0(t) \, dt \Bigr)^{q} d\sigma.
\]
These numbers might be infinite. Thus (53) is true for $f \in L^1(\Omega, \delta)$. This gives the equivalence. □

We shall end this section with a few examples of applications of the above results.

Corollary 7.1. Let $\Omega$ be the unit ball of $\mathbb{R}^N$ and $q \in [1, \frac{N}{N-1}[$. For $\gamma \in [1, 2[$, we consider
\[
f(x) = \frac{1}{(1 - |x|^N)^\gamma}.
\]
Then
\[
f \in L^1(\Omega, \delta) \quad \text{and} \quad f \notin L^1(\Omega).
\]
Moreover
• if $\gamma \in [1 + \frac{1}{q}, 2[$ then the function $\omega$ given in Theorem 5 is not in $W_0^{1,q}(\Omega)$;
• if $\gamma \in [1, 1 + \frac{1}{q}[$ then the function $\omega \in W_0^{1,q}(\Omega)$ with $q \in [1, \min(\frac{1}{\gamma - 1}, \frac{N}{N-1})[$.

Proof. One has
\[
f_*(\sigma) = \frac{\beta_N}{\sigma^\gamma}, \quad \sigma \in [0, \alpha_N].
\]
The necessary and sufficient condition can be written as:
\[
I(\gamma) = \int_0^{\alpha_N} \bigl( \sigma^{1-\gamma} - \alpha_N^{1-\gamma} \bigr)^q \, d\sigma
\]
is finite if and only if $\omega \in W_0^{1,q}(\Omega)$. And
\[
I(\gamma) \text{ is finite if and only if } \int_0^{\alpha_N} \sigma^{(1-\gamma)q} \, d\sigma \text{ is finite}, \tag{55}
\]
and
\[
\int_0^{\alpha_N} \sigma^{(1-\gamma)q} \, d\sigma \text{ is finite if and only if } \gamma < 1 + \frac{1}{q}.
\]
□
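The threshold $\gamma < 1 + \frac{1}{q}$ at the end of the proof of Corollary 7.1 comes from an elementary power integral, and can be illustrated with its closed form. The sketch below is only an illustration: the function names are hypothetical, and the upper limit is normalised to `alpha = 1.0` by default, which does not affect finiteness. It evaluates $\int_\varepsilon^{\alpha} \sigma^{(1-\gamma)q} \, d\sigma$ exactly, so one can watch the value stabilise or blow up as $\varepsilon \to 0$.

```python
import math


def truncated_integral(gamma: float, q: float, eps: float, alpha: float = 1.0) -> float:
    """Closed form of the integral of sigma^((1 - gamma) * q) over (eps, alpha)."""
    p = (1.0 - gamma) * q
    if abs(p + 1.0) < 1e-12:
        # Borderline exponent -1: the primitive is a logarithm.
        return math.log(alpha / eps)
    return (alpha ** (p + 1.0) - eps ** (p + 1.0)) / (p + 1.0)


def is_integrable_near_zero(gamma: float, q: float) -> bool:
    """sigma^((1 - gamma) * q) is integrable at 0 iff (1 - gamma) * q > -1,
    that is, iff gamma < 1 + 1/q."""
    return gamma < 1.0 + 1.0 / q
```

For instance, with `q = 1.2` the threshold is about `1.83`: for `gamma = 1.5` the truncated integral barely changes between `eps = 1e-6` and `eps = 1e-12`, while for `gamma = 1.9` it keeps growing.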
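Corollary 7.2. Let
\[
g(\sigma) = \frac{4\alpha_N}{\sigma \bigl| \operatorname{Ln} \frac{4\alpha_N}{\sigma} \bigr|^\gamma}, \quad \text{with } \gamma > 1;
\]
we set $f(\sigma) = -g'(\sigma)$, $\sigma \in \,]0, \alpha_N[$. Then

(1) $g \in L^1(]0, \alpha_N[)$ and $g \notin L^q(]0, \alpha_N[)$ for all $q > 1$.
(2) Setting $h(x) = f(\alpha_N - \alpha_N |x|^N)$, $x \in \Omega$, one has $h \in L^1(\Omega, \delta|\operatorname{Ln}\delta|)$, but $h$ is not in $L^1(\Omega, \delta^\alpha)$ for any $\alpha \in [0, 1[$.
(3) The generalized function $\omega \in L^1(\Omega)$ solution of $-\Delta\omega = h$ belongs to $W_0^{1,1}(\Omega)$ but not to $W_0^{1,q}(\Omega)$ for $q > 1$.

Proof. (1) $\int_0^{\alpha_N} g(\sigma)^q \, d\sigma \approx \int_{\operatorname{Ln} 4}^{+\infty} \exp((q-1)\sigma) \sigma^{-\gamma} \, d\sigma$, from which we have the result.

(2) We set $X(\sigma) = \bigl| \operatorname{Ln} \frac{4\alpha_N}{\sigma} \bigr| = \operatorname{Ln} \frac{4\alpha_N}{\sigma}$ for $\sigma \in \,]0, \alpha_N]$, and $Y(\sigma) = 1 - \frac{\gamma}{X(\sigma)}$. By a straightforward computation, we have
\[
g'(\sigma) = -\frac{g(\sigma)}{\sigma} Y(\sigma).
\]
Thus $f(\sigma) = \frac{g(\sigma)}{\sigma} Y(\sigma)$, and for $\alpha \in [0, 1]$ we have:
\[
J_\alpha = \int_\Omega |h|(x) \bigl( 1 - |x| \bigr)^\alpha \, dx = \int_0^{\alpha_N} \sigma^\alpha |f|(\sigma) \, \frac{d\sigma}{P_N^\alpha\bigl( \alpha_N^{-\frac{1}{N}} (\alpha_N - \sigma)^{\frac{1}{N}} \bigr)},
\]
with $P_N^\alpha(t) = \Bigl( \dfrac{1 - t^N}{1 - t} \, \alpha_N \Bigr)^\alpha$, $t \in [0, 1]$. Since $\inf_{t \in [0,1]} P_N^\alpha(t) > 0$, we deduce that
\[
J_\alpha \approx \int_0^{\alpha_N} \sigma^\alpha |f(\sigma)| \, d\sigma \approx \int_0^{\alpha_N} |Y(\sigma)| g(\sigma) \sigma^{\alpha - 1} \, d\sigma.
\]
Let us introduce $\sigma_N = 4\alpha_N \exp(-\gamma)$; then
\[
0 < Y(\sigma) \le 1 \ \text{if } \sigma \in [0, \sigma_N[ \quad \text{and} \quad Y(\sigma) < 0 \ \text{for } \sigma > \sigma_N, \quad Y(\sigma_N) = 0.
\]
If $\alpha = 1$, since $\max_{\sigma \in [0, \alpha_N]} |Y(\sigma)|$ is finite, then
\[
J_1 \le c \int_0^{\alpha_N} g(\sigma) \, d\sigma < +\infty.
\]
If $0 \le \alpha < 1$, then we have
\[
J_\alpha \ge c \int_0^{\varepsilon\alpha_N} Y(\sigma) g(\sigma) \sigma^{\alpha - 1} \, d\sigma,
\]
with $0 < \varepsilon < 1$, $0 < \varepsilon\alpha_N < \frac{1}{2} \inf(\alpha_N, \sigma_N)$. From the latter, we have
\[
J_\alpha \ge c \int_{\operatorname{Ln} \frac{4}{\varepsilon}}^{+\infty} \exp\bigl( (1 - \alpha)\sigma \bigr) \sigma^{-\gamma} \, d\sigma = +\infty.
\]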
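To show that $h \in L^1(\Omega, \delta|\operatorname{Ln}\delta|)$, we start with the case $\gamma \le \operatorname{Ln} 4$; then $f(\sigma) \ge 0$ for all $\sigma \in \,]0, \alpha_N[$, and $f(\sigma) = f_*(\sigma)$. Since
\[
\int_\Omega |h|(x) \delta(x) |\operatorname{Ln}\delta(x)| \, dx \le c \int_0^{\alpha_N} f(\sigma) \sigma \, d\sigma = c \int_0^{\alpha_N} f_*(\sigma) \sigma \, d\sigma \le c \int_\Omega f(x) \delta(x) \, dx < +\infty,
\]
if $\gamma > 2$, we have (arguing as before):
\[
\int_\Omega h(x) \delta(x) |\operatorname{Ln}\delta(x)| \, dx \le c \int_\Omega h(x) \delta(x) \, dx + c \int_0^{\alpha_N} g(\sigma) \, d\sigma + c \int_{\operatorname{Ln} 4}^{+\infty} \sigma^{(1-\gamma)} \, d\sigma < +\infty,
\]
thus $h \in L^1(\Omega, \delta|\operatorname{Ln}\delta|)$.

(3) According to Theorem 6 or Proposition 1, we then have that the very weak solution of $-\Delta\omega = h$, $\omega \in L^1(\Omega)$, is in $W_0^{1,1}(\Omega)$. Since $h \notin L^1(\Omega, \delta^\alpha)$, $\omega$ does not belong to $W_0^{1,q}(\Omega)$ for any $q > 1$. □

Acknowledgments

The research of the first author was partially supported by projects MTM2008-06208 of the Ministerio de Ciencia e Innovación, Spain and CCG07-UCM/ESP-2787 of the DGUIC of the CAM and the UCM.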
References

[1] C. Bennett, R. Sharpley, Interpolation of Operators, Academic Press, London, 1983.
[2] I. Birindelli, F. Demengel, Comparison principle and Liouville type results for singular fully nonlinear operators, Ann. Fac. Sci. Toulouse Math. (6) 13 (2) (2004) 261–287.
[3] L. Boccardo, T. Gallouët, Non-linear elliptic and parabolic equations involving measure as data, J. Funct. Anal. 87 (1989) 149–169.
[4] H. Brezis, Personal communication to J.I. Díaz: Une équation semi-linéaire avec conditions aux limites dans L1, unpublished.
[5] H. Brezis, T. Cazenave, Y. Martel, A. Ramiandrisoa, Blow up for u_t − Δu = g(u) revisited, Adv. Differential Equations 1 (1996) 73–90.
[6] X. Cabré, Y. Martel, Weak eigenfunctions for the linearization of extremal elliptic problems, J. Funct. Anal. 156 (1998) 30–56.
[7] J.I. Díaz, Nonlinear Partial Differential Equations and Free Boundaries, Res. Notes Math., vol. 106, Pitman, London, 1985.
[8] J.I. Díaz, J.M. Rakotoson, On very weak solutions of semilinear elliptic equations with right hand side data integrable with respect to the distance to the boundary, in preparation.
[9] D. Gilbarg, N.S. Trudinger, Elliptic Partial Differential Equations of Second Order, Springer-Verlag, 1983.
[10] H.-Ch. Grunau, G. Sweers, Positivity for equations involving polyharmonic operators with Dirichlet boundary conditions, Math. Nachr. 307 (1997) 89–102.
[11] L. Korkut, M. Pasic, D. Zubrinić, A singular ODE related to quasilinear elliptic equations, Electron. J. Differential Equations 12 (2000) 1–37.
[12] P. Quittner, P. Souplet, Superlinear Parabolic Problems, Birkhäuser, Basel, 2007.
[13] J.M. Rakotoson, Quasilinear elliptic problems with measure as data, Differential Integral Equations 4 (3) (1991) 449–457.
[14] J.M. Rakotoson, Réarrangement Relatif: un instrument d'estimation dans les problèmes aux limites, Springer-Verlag, Berlin, 2008.
[15] J.E. Rakotoson, J.M. Rakotoson, Analyse fonctionnelle appliquée aux équations aux dérivées partielles, P.U.F., Paris, 1999.
[16] C. Simader, On Dirichlet's Boundary Value Problem and Lp Theory Based on Generalization of Gårding's Inequality, Lecture Notes in Math., vol. 268, Springer-Verlag, New York, 1972.
[17] L. Veron, Singularities of Solutions of Second Order Quasilinear Equations, Longman, Edinburgh Gate, Harlow, 1995, pp. 176–180.
Journal of Functional Analysis 257 (2009) 832–902 www.elsevier.com/locate/jfa
Almost periodically forced circle flows ✩ Wen Huang a , Yingfei Yi b,∗ a Department of Mathematics, University of Science and Technology of China, Hefei Anhui 230026, PR China b School of Mathematics, Georgia Institute of Technology, Atlanta, GA 30332, USA
Received 8 December 2008; accepted 8 December 2008 Available online 14 January 2009 Communicated by J. Bourgain
Abstract We study general dynamical and topological behaviors of minimal sets in skew-product circle flows in both continuous and discrete settings, with particular attention paid to almost periodically forced circle flows. When a circle flow is either discrete in time and unforced (i.e., a circle map) or continuous in time but periodically forced, behaviors of minimal sets are completely characterized by classical theory. The general case involving almost periodic forcing is much more complicated due to the presence of multiple forcing frequencies, the topological complexity of the forcing space, and the possible loss of the mean motion property. On one hand, we will show that to some extent behaviors of minimal sets in an almost periodically forced circle flow resemble those of Denjoy sets of circle maps in the sense that they can be almost automorphic, Cantorian, and everywhere non-locally connected. But on the other hand, we will show that almost periodic forcing can lead to significant topological and dynamical complexities on minimal sets which exceed the contents of Denjoy theory. For instance, an almost periodically forced circle flow can be positively transitive and its minimal sets can be Li–Yorke chaotic and non-almost automorphic. As an application of our results, we will give a complete classification of minimal sets for the projective bundle flow of an almost periodic, sl(2, R)-valued, continuous or discrete cocycle. Continuous almost periodically forced circle flows are among the simplest non-monotone, multifrequency dynamical systems. They can be generated from almost periodically forced nonlinear oscillators through integral manifolds reduction in the damped cases and through Mather theory in the damping-free cases. They also naturally arise in 2D almost periodic Floquet theory as well as in climate models. Discrete almost periodically forced circle flows arise in the discretization of nonlinear oscillators and discrete
✩ The first author is partially supported by NSFC grant 10531010, 973 project 2006CB805903, and FANEDD grant 200520. The second author is partially supported by NSFC grant 10428101 and NSF grants DMS0204119, DMS0708331. * Corresponding author. E-mail addresses: [email protected] (W. Huang), [email protected] (Y. Yi).
0022-1236/$ – see front matter Published by Elsevier Inc. doi:10.1016/j.jfa.2008.12.005
W. Huang, Y. Yi / Journal of Functional Analysis 257 (2009) 832–902
counterparts of linear Schrödinger equations with almost periodic potentials. They have been widely used as models for studying strange, non-chaotic attractors and intermittency phenomena during the transition from order to chaos. Hence the study of these flows is of fundamental importance to the understanding of multi-frequency-driven dynamical irregularities and complexities in non-monotone dynamical systems. Published by Elsevier Inc. Keywords: Almost automorphic dynamics; Almost periodically forced circle flows; Forced nonlinear oscillators; minimal sets; Topological dynamics
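1. Introduction

Throughout the paper, we let $\mathbb{T} = \mathbb{R}$ or $\mathbb{Z}$. Consider a skew-product circle flow (SPCF for short) $(S^1 \times Y, \mathbb{T}) = (S^1 \times Y, \{\Lambda_t\}_{t \in \mathbb{T}})$ with a compact base (or forcing) flow $(Y, \mathbb{T}) = (Y, \{\sigma_t\}_{t \in \mathbb{T}})$, i.e., for each $t \in \mathbb{T}$ the following diagram
\[
\begin{array}{ccc}
S^1 \times Y & \xrightarrow{\ \Lambda_t\ } & S^1 \times Y \\
\downarrow{\scriptstyle \pi} & & \downarrow{\scriptstyle \pi} \\
Y & \xrightarrow{\ \sigma_t\ } & Y
\end{array}
\]
commutes, where $\pi : S^1 \times Y \to Y$ denotes the natural projection. Let $y_0 \cdot t = \sigma_t(y_0)$, $\psi(s_0, y_0, t) = \Pi\Lambda_t(s_0, y_0)$, where $\Pi : S^1 \times Y \to S^1$ denotes the natural projection. The SPCF can be expressed more explicitly as
\[
\Lambda_t : S^1 \times Y \to S^1 \times Y : \quad \Lambda_t(s_0, y_0) = \bigl( \psi(s_0, y_0, t), \, y_0 \cdot t \bigr), \quad t \in \mathbb{T}. \tag{1.1}
\]
In particular, when $\mathbb{T} = \mathbb{Z}$, the discrete flow $\Lambda_t$ is generated by the iteration of the skew-product circle map
\[
\Lambda : S^1 \times Y \to S^1 \times Y : \quad \Lambda(s_0, y_0) = \bigl( f(s_0, y_0), \, T(y_0) \bigr), \tag{1.2}
\]
where $f(s_0, y_0) = \psi(s_0, y_0, 1)$ and $T(y_0) = y_0 \cdot 1$. Using the angular coordinate $\phi_0 \in \mathbb{R}^1$ (mod 1), the SPCF $\Lambda_t$ can also be expressed as
\[
\hat\Lambda_t(\phi_0, y_0) = \bigl( \phi(\phi_0, y_0, t), \, y_0 \cdot t \bigr), \quad t \in \mathbb{T}, \tag{1.3}
\]
where $e^{2\pi i \phi(\phi_0, y_0, t)} = \psi(s_0, y_0, t)$ and $e^{2\pi i \phi_0} = s_0$.

The present paper is mainly devoted to the study of dynamical and topological properties of minimal sets in an almost periodically forced circle flow (APCF for short), i.e., an SPCF $(S^1 \times Y, \mathbb{T})$ with an almost periodic base (or forcing) flow $(Y, \mathbb{T})$. We refer to an APCF as a continuous APCF if $\mathbb{T} = \mathbb{R}$ and as a discrete APCF if $\mathbb{T} = \mathbb{Z}$. In the case that the SPCF is discrete, we assume that the generating skew-product circle map $\Lambda$ is homotopic to the identity and fiber-wise monotone, i.e., there is a homeomorphism $\tilde\Lambda : \mathbb{R}^1 \times Y \to \mathbb{R}^1 \times Y$:
\[
\tilde\Lambda(x_0, y_0) = \bigl( \tilde f(x_0, y_0), \, T(y_0) \bigr),
\]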
where for each y0 ∈ Y , f˜(·, y0 ) is monotone and f˜(x0 + 1, y0 ) ≡ f˜(x0 , y0 ) + 1, such that when both f˜ and x0 are identified modulo 1 the identified map Λ˜ is homeomorphic to Λ. In the case that the SPCF (1.1) is either discrete and unforced (i.e., a circle map) or continuous but periodically forced, it follows from the classical Poincaré–Birkhoff–Denjoy theory that a minimal set of the SPCF is periodic if the rotation number is rational, and is either almost periodic or of Denjoy type if the rotation number is irrational. It is also known that a Denjoy type of minimal set has a Cantor structure and is almost automorphic. But even if the SPCF (1.1) becomes an APCF, its minimal dynamics can be far more complex, though it always admits zero topological entropy. Taking the continuous case for example, while the majority existence of quasi-periodic dynamics in a parameter family of quasi-periodically forced ordinary differential equations on the circle is asserted by the Arnold–Moser theorem [2,43] when the forcing frequencies are Diophantine and the forcing functions are sufficiently small, smooth perturbations of a constant plus a deformation parameter, almost periodic dynamics are however not generally expected in a continuous APCF. Even in the continuous APCF generated from the projective bundle flow of a continuous, almost periodic, sl(2, R)-valued cocycle, it was shown by Johnson [29] that if the cocycle is not uniformly hyperbolic and admits two minimal sets, then both minimal sets are almost automorphic which are not necessarily almost periodic, and moreover, if only one minimal set exists, then dynamics of the minimal set can be much more complicated than being almost automorphic. For discrete APCFs, it was recently shown in [27,28] that even with one forcing frequency the flows can be topologically transitive. 
Besides the presence of multiple forcing frequencies and the topological complexity of the forcing space, much of the dynamical complexity in an APCF (1.1) is governed by the loss of the so-called mean motion property. Let
\[
\rho = \lim_{t \to \infty} \frac{\tilde\phi(\tilde\phi_0, y_0, t)}{t}
\]
be the rotation number associated with the APCF (1.1) or (1.3), where $\tilde\phi(\tilde\phi_0, y_0, t)$ denotes the lift of $\phi(\phi_0, y_0, t)$ in $\mathbb{R}^1$ satisfying $\tilde\phi(\tilde\phi_0 + 1, y_0, t) \equiv \tilde\phi(\tilde\phi_0, y_0, t) + 1$. The limit exists and is independent of orbits. Differing from the unforced discrete case and periodically forced continuous case, there are general cases in which
\[
\sup_{t \in \mathbb{T}} \bigl| \tilde\phi(\tilde\phi_0, y_0, t) - \rho t \bigr| = +\infty \tag{1.4}
\]
for some $(\tilde\phi_0, y_0) \in \mathbb{R}^1 \times Y$. We say that the APCF (1.1) admits mean motion (or bounded mean motion) if
\[
\sup_{t \in \mathbb{T}} \bigl| \tilde\phi(\tilde\phi_0, y_0, t) - \rho t \bigr| < +\infty \tag{1.5}
\]
for all $(\tilde\phi_0, y_0) \in \mathbb{R}^1 \times Y$. In the opposite case, we say that the APCF (1.1) admits no mean motion (or unbounded mean motion). It is well known that if the APCF (1.1) has an almost periodic orbit, then it always admits mean motion. In fact, as suggested by works [27,28,54,62], dynamics of an APCF (1.1) with mean motion should resemble more closely the unforced discrete case or the periodically forced continuous case, while dynamics of an APCF (1.1) without mean motion should be considerably different from those with mean motion. Indeed, in the case that an APCF (1.1) is continuous in time and admits mean motion, a translation $x = \tilde\phi - \rho t$ will
readily transform the APCF into an almost periodically forced totally monotone system and an application of results in [56] shows that each of its minimal sets is almost automorphic. In the present paper, we will employ techniques from topological dynamics and ergodic theory to study minimal sets in an APCF (1.1), with particular attention paid to their general (dynamical and topological) complexities, structures, and topological classifications. We will also investigate other fundamental dynamical issues for an APCF, such as the existence of almost automorphic minimal sets and sufficient conditions for the validity of the mean motion property. We will obtain a set of general results for an APCF including the following:
(a) Each minimal set is either point-distal or residually Li–Yorke chaotic;
(b) Any minimal set is either an almost finite to one extension of the base, or the entire phase space, or a Cantorian;
(c) If the flow admits more than one minimal set, then each minimal set is an almost finite cover of the base;
(d) If the flow admits mean motion, then each minimal set is almost automorphic;
(e) If the flow admits no mean motion, then each minimal set is either the entire phase space or is everywhere non-locally connected.
In the case that the forcing is quasi-periodic, the following more concrete results will be obtained:
(f) If the rotation number is rationally independent of the forcing frequencies, then the flow admits a unique minimal set and the minimal set is either the entire phase space or is everywhere non-locally connected;
(g) If the flow admits no mean motion, then it is positively transitive and admits a unique minimal set, and consequently, if the flow has more than one minimal set, then it must admit mean motion and each minimal set is almost automorphic.
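The rotation number $\rho$ and the deviation $\tilde\phi - \rho t$ entering the mean motion property can be explored numerically in the discrete case. The sketch below is purely illustrative and is not taken from the paper: the lift `lift_step`, its parameters `a`, `b`, `c`, and the single forcing frequency `omega` define a hypothetical quasi-periodically forced circle map $x \mapsto x + a + \frac{b}{2\pi}\sin(2\pi x) + c\cos(2\pi\theta)$, $\theta \mapsto \theta + \omega \pmod 1$, which satisfies $F(x+1, \theta) = F(x, \theta) + 1$ and is monotone in $x$ when $|b| < 1$. A finite orbit only gives an estimate of $\rho$, and a bounded finite-time deviation cannot by itself establish mean motion.

```python
import math


def lift_step(x: float, theta: float, a: float, b: float, c: float) -> float:
    """One step of a hypothetical lift F(x, theta) with F(x + 1, theta) = F(x, theta) + 1."""
    return (x + a
            + (b / (2.0 * math.pi)) * math.sin(2.0 * math.pi * x)
            + c * math.cos(2.0 * math.pi * theta))


def rotation_number(a, b, c, omega, steps=20000, x0=0.0, theta0=0.0):
    """Finite-time estimate (x_n - x_0) / n of the rotation number."""
    x, theta = x0, theta0
    for _ in range(steps):
        x = lift_step(x, theta, a, b, c)
        theta = (theta + omega) % 1.0
    return (x - x0) / steps


def max_deviation(a, b, c, omega, steps=20000, x0=0.0, theta0=0.0):
    """Largest |x_n - x_0 - rho * n| along the orbit, using the estimated rho.

    This is only a finite-time diagnostic of the quantity in (1.4)/(1.5).
    """
    rho = rotation_number(a, b, c, omega, steps, x0, theta0)
    x, theta = x0, theta0
    worst = 0.0
    for n in range(1, steps + 1):
        x = lift_step(x, theta, a, b, c)
        theta = (theta + omega) % 1.0
        worst = max(worst, abs(x - x0 - rho * n))
    return worst
```

With `b = 0` the map is a forced rigid rotation, so the estimate should be close to `a`; for example `rotation_number(0.25, 0.0, 0.3, (math.sqrt(5) - 1) / 2)` stays near `0.25`.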
We remark that, except in those involving rotation number and mean motion, our results above actually hold for a general discrete APCF without assuming its generating skew-product map to be homotopic to the identity. Based on our general results and works [7,29], we will also give a complete classification of minimal sets for the projective bundle flow $(P^1 \times Y, \mathbb{T})$ of an almost periodic, sl(2, R)-valued, continuous (i.e., $\mathbb{T} = \mathbb{R}$) or discrete (i.e., $\mathbb{T} = \mathbb{Z}$) cocycle. APCFs are among the simplest but fundamental models of non-monotone, multi-frequency systems in which interactions of (both internal and external) frequencies are expected to generate rather complicated dynamics. First of all, APCFs arise naturally in the study of almost periodically forced nonlinear oscillators and their discretizations. Consider an almost periodically forced, damped, nonlinear oscillator
\[
x'' + F(x, x', y \cdot t) = 0, \quad x \in \mathbb{R}^1, \ y \in Y. \tag{1.6}
\]
Such an oscillator admits both internal and external frequencies, and due to damping, its oscillations all lie in a compact global attractor. According to the classical oscillation theory, not only are almost periodic oscillations rare in such a system, but also the global attractor can become complicated, especially when the damping is weak. In particular, even in quasi-periodically forced nonlinear oscillators as simple as Van der Pol and Josephson junction, numerical studies have discovered the existence of so-called strange, non-chaotic attractors (SNAs) which are geometrically strange but admit no positive Lyapunov exponent (see e.g., [22,51]). There have been many theoretical and numerical studies on SNAs with respect to both quasi-periodically forced nonlinear oscillators and their discretizations. It is well believed that a SNA typically lies in the intermittency during the transition from order to chaos. An almost periodically forced, damped, nonlinear oscillator (1.6) can often be reduced through an integral manifolds reduction to an almost periodically forced scalar ordinary differential equation of the form
\[
\phi' = f(\phi, y \cdot t), \quad \phi \in \mathbb{R}^1, \ y \in Y, \tag{1.7}
\]
where $f(\phi + 1, y) \equiv f(\phi, y)$ (see [54,62] for the case of almost periodically forced Van der Pol and Josephson junction oscillators). When both the solution $\phi(\phi_0, y_0, t)$ and the initial value $\phi_0$ corresponding to $y = y_0$ are identified modulo 1, Eq. (1.7) clearly generates an APCF containing the SNA. In fact, much of the study on SNAs has been made with respect to such reduced quasi-periodically forced circle flows and their discrete counterparts (we refer the readers to [14,21,27,28,35] for recent progress on the subject). But besides the case of (1.7) with the mean motion property, in which minimal sets are known to be almost automorphic [54,62], dynamical and topological structures of SNAs in the general situation are yet to be understood. In the case that an almost periodically forced nonlinear oscillator is damping-free, it becomes an almost periodically forced, one-degree-of-freedom Hamiltonian system of the form
\[
x'' + V_x(x, y \cdot t) = 0, \quad x \in \mathbb{R}^1, \ y \in Y, \tag{1.8}
\]
where $V(x + 1, y) \equiv V(x, y)$. Due to the conservative nature, oscillations of such a system spread over the entire phase space. It is known that if (1.8) is quasi-periodically forced with Diophantine frequencies, then associated with high "energy" the system becomes nearly integrable and an application of KAM theory shows the existence of a positive Lebesgue measure set of quasi-periodic invariant tori with Diophantine frequencies. But it is also known that these quasi-periodic tori tend to disappear if either the system becomes less integrable or the frequencies are close to resonance. Instead, the so-called Mather sets (or Cantori) supporting minimizing measures can be shown to exist in the phase space $\mathbb{R}^1 \times S^1 \times Y$ based on the Mather theory [39,41]. An application of the Mather theory further shows that dynamics on each projected Mather set in $S^1 \times Y$ is topologically conjugated to that of the corresponding Mather set (see e.g., [26,40]). More precisely, consider the Lagrangian $L = \frac{p^2}{2} - V(x, y \cdot t)$ associated with (1.8). Then for each $\eta \in \mathbb{R}^1$, minimizing measures $\mu_\eta$ exist, i.e., each $\mu_\eta$ is an invariant measure for the skew-product flow $(\mathbb{R}^1 \times S^1 \times Y, \mathbb{R})$ generated from (1.8) and satisfies
\[
\int_{\mathbb{R}^1 \times S^1 \times Y} (L - \eta) \, d\mu_\eta = \inf_\mu \int_{\mathbb{R}^1 \times S^1 \times Y} (L - \eta) \, d\mu,
\]
where the infimum is taken over all Borel probability measures on $\mathbb{R}^1 \times S^1 \times Y$. The set $M_\eta = \bigcup_{\mu_\eta} \operatorname{supp} \mu_\eta$ is called the Mather set, which is a compact invariant set of $(\mathbb{R}^1 \times S^1 \times Y, \mathbb{R})$. Let $\pi : \mathbb{R}^1 \times S^1 \times Y \to S^1 \times Y$ be the natural projection and $\tilde M_\eta = \pi M_\eta$ be the projected Mather set. Then $\pi : M_\eta \to \tilde M_\eta$ is a homeomorphism. It follows that the projected flow on $\tilde M_\eta$, as a subflow of an APCF, is topologically conjugated to that defined on $M_\eta$. When (1.8) is periodically forced, the projected Mather sets are just the well-known Aubry–Mather sets which have the
basic structure of Denjoy sets supporting 2-frequency almost automorphic dynamics. However, not much is known on dynamical and topological properties of Mather sets in situations involving more than two frequencies. The second, APCFs arise naturally in the spectral theory of 2D linear systems with almost periodic coefficients. For instance, as the projective bundle flows of sl(2, R)-valued, almost periodic, continuous or discrete cocycles, they play an essential role in 2D almost periodic Floquet theory and the spectral problem of linear Schrödinger equations/operators and their discrete counterparts (such as Harper’s equations and almost Mathieu operators) with almost periodic potentials [6,29,34]. In particular, for the 2D almost periodic Floquet problem, it was a remarkable observation due to Johnson [29] that, with the general unavailability of an almost periodic strong Perron transformation which transforms an almost periodic linear differential system into a canonical form, one can often introduce an almost automorphic transformation instead, provided that an almost automorphic minimal set exists in the reduced continuous APCF. The third, as recently shown by Pliss and Sell [47], continuous APCFs arise in oceanic dynamics and climate models through invariant manifolds reductions and high frequency averaging. Hence they can be used as basic models to explain complicated oceanic dynamics in particular the nature of turbulence. In addition, in the case that the forcing flow is quasi-periodic, (1.1) becomes a toral flow or map whose rotation set is a singleton. Dynamics of toral flows and maps have been extensively studied for cases with convex rotation sets, but the case with “thin” rotation set is more or less open. Linking to these important problems and applications, our primary goal for the present study is to make a preliminary understanding of frequency-driven dynamical irregularity and complexity in non-monotone, multi-frequency systems. 
It is our hope that the present study on APCFs will lead to some deep insights on dynamics and structures of SNAs and Mather sets in almost periodically forced, nonlinear oscillators in the damped and damping-free cases respectively, on the spectral problem of almost periodic Schrödinger-like operators, on dynamics of toral flows and maps with “thin” rotation sets, on the nature of turbulence in oceanic flows, and on intermittency phenomena during the transition from order to chaos.

We remark that smooth dynamical systems theory plays a lesser role in the general problems which we are studying. First of all, due to the general almost periodic time dependence, the forcing space of an APCF need not be smooth. Secondly, even if the forcing space is smooth, APCFs arising in applications need not be smooth at all. For instance, for a quasi-periodically forced, damped nonlinear oscillator, it is well known that the weaker the damping is, the less smooth an integral manifold becomes [61]. For a quasi-periodically forced, damping-free nonlinear oscillator, the torus into which a projected Mather set is embedded is only Lipschitz in general, even in the periodically forced case [13,39].

The rest of the paper is organized as follows. In Section 2, we will give precise statements of our main results with respect to APCFs along with some discussions. Section 3 is a preliminary section in which we will review basic concepts and results from topological dynamics and ergodic theory to be used in later sections. Our main results will be proved in Sections 4–8 based on some general results which we will prove for compact flow extensions, as well as for general SPCFs. More precisely, in Section 4, we will give an ordering condition under which a compact flow extension preserves topological entropy. In Section 5, we will show that if a minimal flow is a proximal extension of another minimal flow which is not almost 1–1, then it must be Li–Yorke chaotic.
In Section 6, we will classify the general topological structures of minimal sets in a SPCF. In Section 7, we will study the nature of minimal sets of a SPCF which are almost finite to one extensions of the base space. In Section 8, we will study dynamical and topological
behaviors of minimal sets in a SPCF in connection with the validity of the mean motion property. In particular, we will show that if a SPCF is positively transitive, then it has a unique minimal set. We then consider the case with locally connected base and show that if the SPCF admits no mean motion, then it must be positively transitive. In Section 9, we generalize results in [4,7,29] to give a complete classification of minimal dynamics of the projective bundle flow generated from a (continuous or discrete) almost periodic, sl(2, R)-valued cocycle of all four basic types: elliptic, parabolic, partially hyperbolic, and hyperbolic.

2. Main results

In this section, we will state our main results with respect to APCFs, which are the main motivation for the present study, though most of these results actually hold for more general SPCFs (see Sections 4–8 for more details). Dynamical and topological structures of minimal sets of an APCF $(S^1 \times Y, T)$ seem to depend on various factors: the number of minimal sets, local connectivity of minimal sets, validity of the mean motion property, and the topological nature of the forcing space. Hence our main results lie in several categories which particularly include cases for general Y as well as for Y being locally connected (e.g., Y is a torus). A special example of the latter case is when (1.1) is quasi-periodically forced. As will be seen from the proofs of these results, both the validity of the mean motion property and the local connectivity of Y seem to be essential for a minimal set of an APCF to be better behaved in general. Let $\pi : S^1 \times Y \to Y$ denote the natural projection.

2.1. General dynamical complexities

General dynamical complexities of an APCF are characterized in the following two theorems.

Theorem 1. An APCF always has zero topological entropy.

Theorem 1 may be proved by using an entropy inequality due to Bowen [9] for compact flow extensions.
In Section 4 we will give a self-contained proof of this theorem by providing a general result on the preservation of topological entropy between flow extensions under an ordering condition. Such a general result on preservation of topological entropy will also be useful to other skew-product flows having certain fiber-wise order preserving properties. Theorem 2. Let M be a minimal set of an APCF. Then precisely one of the following holds: (a) M is point-distal; (b) M is residually Li–Yorke chaotic. Theorem 2 will be proved in Section 5 following a general result which says that if a minimal flow is a proximal extension of another minimal flow that is not almost 1–1, then it must be Li–Yorke chaotic.
The notion of Li–Yorke chaos is introduced based on the well-known work of Li and Yorke [37]. A compact metric flow (X, T) is called Li–Yorke chaotic if X contains an uncountable scrambled set S, that is, a set in which any pair of distinct points {x, y} ⊂ S is a Li–Yorke pair, i.e.,
\[ \limsup_{t\to+\infty} d(x\cdot t, y\cdot t) > 0 \quad\text{and}\quad \liminf_{t\to+\infty} d(x\cdot t, y\cdot t) = 0, \]
where d denotes the metric on X. It is known that if (X, T) admits positive topological entropy, then it is necessarily Li–Yorke chaotic, but not vice versa [8]. Residual Li–Yorke chaos is a stronger notion than Li–Yorke chaos. A compact metric flow extension π : (X, T) → (Y, T) is said to be residually Li–Yorke chaotic if there exists a residual (i.e., dense $G_\delta$) subset $Y_c$ of Y such that each fiber $\pi^{-1}(y)$, $y \in Y_c$, admits an uncountable scrambled set.

Remark 1. (1) A point-distal minimal set (including an almost automorphic minimal set), though it cannot be residually Li–Yorke chaotic, can well be Li–Yorke chaotic.
(2) Minimal sets in a circle map or a periodically forced continuous circle flow can never be Li–Yorke chaotic.
(3) Using Theorem 2, one can show that many APCFs admit Li–Yorke chaos. Consider a 2D almost periodic linear system
\[ \dot{x} = A(y\cdot t)x, \qquad x \in \mathbb{R}^2,\ y \in Y, \tag{2.1} \]
where tr A = 0 and (Y, R) is an almost periodic minimal flow. The system naturally generates a continuous APCF $\Lambda_t^A$ on the projective bundle $P^1 \times Y$. Let $S_0$ be the set of continuous matrix-valued functions A whose respective linear system (2.1) can be reduced to a system with skew-symmetric coefficient matrix B(y) of zero mean via a strong Perron transformation. It was shown by Johnson [31] (see also [42]) that there is a residual subset S of $\bar{S}_0$ such that for each A ∈ S, the entire phase space of $\Lambda_t^A$ is minimal, strictly ergodic, and a proximal extension of Y. Now, for each A ∈ S, it follows from Theorem 2 that all minimal sets of $\Lambda_t^A$ are residually Li–Yorke chaotic. We note that with the non-existence of almost automorphic dynamics, the APCFs $\Lambda_t^A$ with A ∈ S admit no mean motion. We refer the readers to a recent work of Bjerklov and Johnson [7] for more concrete discussions on Li–Yorke chaos in continuous, almost periodically forced projective bundle flows and to Section 9 of this paper for some general discussions in this regard.
(4) We expect that point-distality of minimal sets stated in Theorem 2 can be replaced by almost automorphy in a generic sense. However, there are minimal sets of APCFs which are point-distal but not almost automorphic. An easy example is as follows. Let (Y, R) be a non-periodic, almost periodic minimal flow and let a be a continuous function on Y with zero mean such that $\int_0^t a(y_0\cdot s)\,ds$ is unbounded for some $y_0 \in Y$ (such functions largely exist, see [30]). Then the almost periodically forced circle flow defined by $\Lambda_t^*(\phi, y) = \bigl(\phi + \int_0^t a(y\cdot s)\,ds \bmod 1,\ y\cdot t\bigr)$ is not almost periodic. Hence $\Lambda_t^*$ is distal (in particular, point-distal) but not almost automorphic, simply because a distal almost automorphic minimal flow must be almost periodic.
(5) In [29], concerning the 2D almost periodic Floquet problem, Johnson studied the APCF (projective bundle flow) $\Lambda_t^A$ generated from (2.1).
Minimal sets of the flow were shown to be almost periodic or almost automorphic for most cases except that the flow has only one minimal set M in which no fiber over the base admits a distal pair. Using Theorem 2, we conclude that
M is either an almost 1–1 extension of the base (hence almost automorphic, see Theorem 3.2) or residually Li–Yorke chaotic (see also [7]). Similar classifications can be made for the projective bundle flow of a general sl(2, R)-valued, almost periodic, continuous or discrete cocycle (see Section 9 for details). 2.2. Topological classification of minimal sets In the case that an APCF is either discrete and unforced or continuous but periodically forced (i.e., Y = S 1 ), it follows from the classical Poincaré–Birkhoff–Denjoy classification that if the rotation number is rational (the resonant case), then each minimal set is a finite to one extension of S 1 , and if the rotation number is irrational (the non-resonant case), then a minimal set is either the entire phase space or of Denjoy type (in the continuous case, each of its Poincaré section is a Denjoy Cantor set). Our next result shows that one can have a similar topological classification of minimal sets for a general APCF. Theorem 3. Let M be a minimal set of an APCF. Then precisely one of the following holds: (a) M is an almost N –1 extension of Y for some positive integer N ; (b) M = S 1 × Y ; (c) M is a Cantorian. Theorem 3 will be proved in Section 6. A minimal set M of a SPCF (S 1 × Y, T) is said to be a Cantorian if there exists a residual subset Y0 of Y such that for each y ∈ Y0 , the fiber My = π −1 (y) ∩ M is a Cantor set. Remark 2. (1) Cantorians can arise in APCFs with or without mean motions. The Denjoy type of minimal set in a continuous, periodically forced circle flow is an example of Cantorian in (topologically non-transitive) APCFs with mean motion. An example of Cantorian in (topologically transitive) APCFs without mean motion is constructed in a recent work due to Béguin, Crovisier, Jäger, and Le Roux [4]. (2) It is clear that a minimal set in the case (a) of Theorem 3 cannot be residually Li– Yorke chaotic, hence by Theorem 2 it must be point-distal. 
Still, such a minimal set can well be Li–Yorke chaotic and topological complicated by being everywhere non-locally connected (see Remark 4(1) below). (3) We believe that a Cantorian minimal set of an APCF is more topologically complicated in the sense that it is not only a Cantorian but also everywhere non-locally connected. (4) According to the Poincaré–Birkhoff–Denjoy theory, dynamics of a minimal set M in a continuous, periodically forced circle flow can be completely classified according to its topological nature: in the resonant case M is periodic, while in the non-resonant case M is either 2-frequency almost periodic if it is the entire phase space or 2-frequency almost automorphic if it is of Denjoy type. To the contrary, dynamics of minimal sets in a general APCF is too complicated to be classified according to their topological natures. For instance, minimal sets in both cases (b) and (c) of Theorem 3 can be either point-distal or residually Li–Yorke chaotic which need not be almost automorphic. In fact, even in the case (a) of Theorem 3 we believe that minimal sets need not be almost automorphic in general. An almost N –1 extension of an almost periodic minimal flow need not be almost automorphic. A such example can be constructed in symbolic flows as follows. Let τ be the substitution on
{0, 1} with τ(0) = 01 and τ(1) = 10. For any finite word $w = w_0 w_1 \cdots w_{\ell-1}$ in {0, 1}, define $\tau(w) = \tau(w_0)\tau(w_1)\cdots\tau(w_{\ell-1})$ and $\tau^i(w) = \tau(\tau^{i-1}(w))$, $i \ge 2$. Let $X \subseteq \{0,1\}^{\mathbb{Z}}$ be the set of all bi-infinite binary sequences x such that any finite word in x is a sub-word of $\tau^i(0)$ for some i ∈ N, and let T be the left shift map on X. Then the discrete dynamical system (X, T) is a so-called Morse system which is known to be minimal and an almost 2–1 extension of its maximal almost periodic factor [38]. As an almost automorphic minimal flow is necessarily an almost 1–1 extension of its maximal almost periodic factor (see Theorem 3.2), (X, T) is not almost automorphic. A suspension of (X, T) also leads to an example of continuous flows.
(5) Unlike the periodically forced continuous case, the topological classification for general APCFs given in Theorem 3 need not depend on the resonance type. For instance, case (a) of Theorem 3 can happen when the rotation number is rationally independent of forcing frequencies, as shown by an example of Johnson [33] in which an APCF admits a unique minimal set that is an almost 1–1 extension of the base, but the rotation number is not rationally dependent on the forcing frequencies.

2.3. Almost finite to one extensions

We would like to examine cases of APCFs in which an almost N–1 extension of the base becomes almost automorphic. We first give a structural characterization of a minimal set when more than one minimal set is present in an APCF.

Theorem 4. Suppose that an APCF $(S^1 \times Y, T)$ has more than one minimal set. Then the following holds.
(1) There exists a positive integer N such that each minimal set of $(S^1 \times Y, T)$ is an almost N–1 extension of Y.
(2) If one minimal set of $(S^1 \times Y, T)$ is almost automorphic, then so are the others.
(3) If Y is locally connected, then all minimal sets of $(S^1 \times Y, T)$ are almost automorphic.
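Returning briefly to the Morse system of Remark 2(4): the substitution τ is easy to experiment with on finite words. The following is a minimal Python sketch, not taken from the paper; the helper names `tau`, `tau_iter` and `subwords` are hypothetical. It only inspects finite prefixes $\tau^i(0)$, so it illustrates the combinatorics of the admissible words and does not verify minimality or the almost 2–1 property.

```python
# Morse substitution tau(0) = 01, tau(1) = 10, extended to words letter by letter.
# All function names here are illustrative, not from the paper.
TAU = {"0": "01", "1": "10"}


def tau(word: str) -> str:
    """Apply the substitution once to a finite binary word."""
    return "".join(TAU[c] for c in word)


def tau_iter(word: str, i: int) -> str:
    """Return tau^i(word); tau^0 is the identity."""
    for _ in range(i):
        word = tau(word)
    return word


def subwords(word: str, length: int) -> set:
    """All sub-words of the given length occurring in `word`."""
    return {word[k:k + length] for k in range(len(word) - length + 1)}


w5 = tau_iter("0", 5)
w10 = tau_iter("0", 10)

print(tau_iter("0", 3))          # 01101001
print(w10.startswith(w5))        # each tau^i(0) is a prefix of tau^(i+1)(0)
print(sorted(subwords(w10, 3)))  # the blocks 000 and 111 never occur
```

Since $\tau^i(0)$ is a prefix of $\tau^{i+1}(0)$, the finite words admissible in X can be read off from longer and longer prefixes in this way.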
In the case that an APCF $(S^1 \times Y, T)$ admits more than one minimal set, we feel that the local connectivity of Y is essential for the minimal sets to be almost automorphic. In the case that it only admits one minimal set which is an almost N–1 extension of Y for some N > 1, we believe that even local connectivity of Y would not be sufficient for the minimal set to be almost automorphic. However, we do have the following result.

Theorem 5. Let M be a minimal set of an APCF $(S^1 \times Y, T)$ which is an almost N–1 extension of Y for some $N \ge 1$. If M is not everywhere non-locally connected, then it is almost automorphic, and moreover, for any y ∈ Y, each fiber $\pi^{-1}(y) \cap M$ consists of exactly N connected components which are either singletons or closed intervals, if it is not homeomorphic to $S^1$.

Theorems 4 and 5 will be proved in Section 7.

Remark 3. (1) A Denjoy type of minimal set in a continuous, periodically forced circle flow is an almost 1–1 extension of a 2-torus with two points on each non-residual fiber. Hence by Theorem 5 it is everywhere non-locally connected.
(2) In [32], Johnson constructed an example of a continuous, quasi-periodically forced circle flow which has a unique minimal set M with the following properties: (i) M is an almost 1–1
extension of the base torus (hence it is almost automorphic); (ii) M is locally connected at every point on singleton fibers; (iii) M is not locally connected at all points; (iv) there is a full (Haar) measure set in the base torus over which all fibers are non-degenerate intervals. This gives an example of Theorem 5. In fact, by Theorem 5, all non-singleton fibers in M are non-degenerate intervals.

2.4. Mean motion and dynamics

In the next two results, we describe the behavior and structure of a minimal set of an APCF in both the cases with and without mean motion. Theorem 6 below is more or less known in the continuous case [54,62] but unknown in the discrete case.

Theorem 6. Suppose that an APCF $(S^1 \times Y, T)$ admits mean motion. Then the following holds.
(1) Each minimal set of $(S^1 \times Y, T)$ is almost automorphic whose frequency module is generated by the rotation number and the forcing frequencies.
(2) If a minimal set of $(S^1 \times Y, T)$ is an almost N–1 extension of Y for some positive integer N, then N is the smallest positive integer whose product with the rotation number is contained in the frequency module of the forcing.

Theorem 7. Suppose that an APCF $(S^1 \times Y, T)$ admits no mean motion. Then the following holds.
(1) Each minimal set of $(S^1 \times Y, T)$ is either the entire phase space $S^1 \times Y$ or is everywhere non-locally connected.
(2) If Y is locally connected, then $(S^1 \times Y, T)$ is positively transitive and has only one minimal set.

Theorems 6 and 7 will be proved in Section 8 based on some general results on the connections between the lack of mean motion, positive transitivity, and the uniqueness of the minimal set. Theorem 7(2) is partially known for a quasi-periodically forced circle map with one forcing frequency [27,28]. But the arguments in [27,28], depending crucially on the one-dimensional forcing space, do not extend completely to the general situation.

Corollary. Consider an APCF $(S^1 \times Y, T)$ with Y being locally connected.
Then the following holds. (1) If (S 1 × Y, T) has more than one minimal set, then it admits mean motion. (2) If the entire phase space S 1 × Y is not minimal, then each minimal set of (S 1 × Y, T) is either everywhere non-locally connected or almost automorphic. (3) If the rotation number is rationally independent of the forcing frequencies, then (S 1 × Y, T) has a unique minimal set. In the above Corollary, (1) follows immediately from Theorem 7(2), (2) follows immediately from Theorem 6(1) and Theorem 7(1), and (3) follows immediately from Theorem 7(2), Theorem 4(1) and Theorem 6(2).
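For the reader's convenience, the deduction of part (3) of the Corollary can be spelled out as a chain of implications. The block below is only a sketch of one way to read "follows immediately"; the symbols $\rho$ (rotation number) and $\mathcal{M}(Y)$ (frequency module of the forcing) are notation assumed here, not fixed in the text above. The argument proceeds by contradiction.

```latex
% Sketch of Corollary (3), with Y locally connected.
% \rho and \mathcal{M}(Y) are notation assumed for this illustration.
\begin{align*}
&(S^1 \times Y, T) \text{ has more than one minimal set} \\
&\quad\Longrightarrow\ (S^1 \times Y, T) \text{ admits mean motion}
   && \text{(contrapositive of Theorem 7(2))} \\
&\quad\Longrightarrow\ \text{each minimal set is an almost } N\text{--}1 \text{ extension of } Y
   && \text{(Theorem 4(1))} \\
&\quad\Longrightarrow\ N\rho \in \mathcal{M}(Y) \text{ for some integer } N \ge 1
   && \text{(Theorem 6(2))} \\
&\quad\Longrightarrow\ \rho \text{ is rationally dependent on the forcing frequencies.}
\end{align*}
```

Hence, if the rotation number is rationally independent of the forcing frequencies, the first line cannot hold, and the minimal set is unique.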
Remark 4. (1) Consider the example of Johnson [33] in which an APCF admits a unique minimal set that is an almost 1–1 extension of the base, but the rotation number is not rationally dependent on the forcing frequencies. By Theorem 6, this example admits no mean motion, and, by Theorem 7, the unique almost automorphic minimal set is everywhere non-locally connected. Also consider the quasi-periodically forced circle flow constructed by Johnson [32] in which the unique minimal set is not everywhere non-locally connected. Theorem 7 implies that this flow does admit mean motion.
(2) If an APCF has a minimal set which is residually Li–Yorke chaotic, then the minimal set cannot be almost automorphic and hence by Theorem 6 the APCF admits no mean motion.
(3) An almost automorphic minimal set often occurs as intermediate dynamics in a parameter family of quasi-periodically forced circle flows. Consider a smooth family of quasi-periodically forced equations
\[ \phi' = \lambda + \varepsilon f(\phi, y\cdot t), \qquad \phi \in \mathbb{R}^1, \tag{2.2} \]
where $f : \mathbb{R}^1 \times T^k \to \mathbb{R}^1$ is sufficiently smooth, $f(\phi + 1, y) \equiv f(\phi, y)$, $y\cdot t = y + \omega t$, $\omega \in \mathbb{R}^k$ is Diophantine, and λ, ε are bounded parameters. We let Σ be the set of (λ, ε) whose corresponding equation (2.2) is smoothly reducible to a pure rotation according to the Arnold–Moser theorem [2,43], i.e., there is a smooth, near identity transformation $\phi = \psi + h_{\lambda,\varepsilon}(\psi, y)$ with $\|h_{\lambda,\varepsilon}\|_\infty < 1$, such that the transformed equation becomes $\psi' = \lambda$ (hence the corresponding quasi-periodically forced circle flow is quasi-periodic and Diophantine). Now consider a boundary point $(\lambda_0, \varepsilon_0)$ of Σ, i.e., there is a sequence $(\lambda_n, \varepsilon_n) \in \Sigma$ with $(\lambda_n, \varepsilon_n) \to (\lambda_0, \varepsilon_0)$. Let $y_0 \in T^k$, $\psi_0 \in [0, 1)$ be given; then $h_n(t) = h_{\lambda_n,\varepsilon_n}(\psi_0, y_0 + \omega t)$ converges uniformly on compact sets to some $h_\infty(t)$ according to the Ascoli theorem. Then $\phi_n(t) = \psi_0 + \lambda_n t + h_n(t)$ converges uniformly on compact sets to $\phi_\infty(t) = \psi_0 + \lambda_0 t + h_\infty(t)$, which is a solution of (2.2) corresponding to $(\lambda_0, \varepsilon_0)$. Since $\|h_\infty\|_\infty \le 1$, it follows that the quasi-periodically forced circle flow (2.2) corresponding to $(\lambda_0, \varepsilon_0)$ admits mean motion (see Theorem 8.2 in Section 8) and hence by Theorem 6 all its minimal sets are almost automorphic. The rotation number associated with $(\lambda_0, \varepsilon_0)$ may well depend on ω in a joint Liouville way so that the frequencies of the almost automorphic minimal sets need not be Diophantine. Dynamics of the flow associated with (λ, ε) lying in the complement of $\bar{\Sigma}$ are expected to be more complicated due to the possible loss of the mean motion property. Similar intermittency phenomena can be observed in the spectral problem of an almost periodic Schrödinger operator (see Section 9 for details). For a general parameter family of APCFs, we believe that almost automorphic intermittency (or bifurcation) may occur at the critical value when either almost periodicity is lost, or the mean motion property becomes invalid, or when Li–Yorke chaos tends to appear (order to chaos).

2.5. An extended Denjoy theorem

From Theorems 4, 7 and the Corollary above, we see that local connectivity of Y plays an important role in the dynamics and topological structures of minimal sets in the corresponding APCF $(S^1 \times Y, T)$. The case when Y is locally connected of course includes that of a quasi-periodically forced circle flow. In fact, when Y is a torus, one can have a more complete characterization of the topological structure of a minimal set. The following result can be regarded as a quasi-periodic extension of the classical Denjoy theorem with respect to the topological structure of minimal sets.
Theorem 8. Consider an APCF (S 1 × Y, T) with Y being a torus (e.g., (Y, T) is quasi-periodic) and suppose that the rotation number is rationally independent of the forcing frequencies. Then (S 1 × Y, T) has a unique minimal set M and M is either the entire phase space S 1 × Y or is everywhere non-locally connected. If, in addition, the APCF admits mean motion, then M is almost automorphic, and moreover, M is either the entire phase space S 1 × Y or an everywhere non-locally connected Cantorian. Theorems 8 will also be proved in Section 8. Remark 5. (1) In light of Theorem 2, under the condition of Theorem 8, an everywhere nonlocally connected minimal set can be either a finite to one extension of the base or a Cantorian. (2) Our results give some information on possible topological and dynamical complexity of a SNA in a quasi-periodically forced, damped nonlinear oscillator and Mather sets in a quasiperiodically forced, damping-free nonlinear oscillator. Consider a quasi-periodically forced, damped, nonlinear oscillator (1.6) in which a SNA exists. If the damping is not too weak, then the attractor lies in a quasi-periodically forced circle flow through an integral manifolds reduction. The complexity of a SNA is often reflected by that on its minimal sets, because, using arguments in [55], such an attractor is made up by minimal sets and their “connecting orbits.” Due to the geometric strangeness, the SNA is however not the entire phase space if it is globally attracting. It follows from Theorem 3 that each minimal set in the SNA is either an almost finite cover of the forcing space (a torus) or a Cantorian, which, by part (2) of the Corollary, is almost automorphic and/or everywhere non-locally connected. 
In particular, if the rotation number is rationally independent of the forcing frequencies, then it follows from Theorem 8 that the SNA contains a unique minimal set which is everywhere nonlocally connected (which can be a Cantorian carrying almost automorphic dynamics). Of course, minimal dynamics in a SNA can well be Li–Yorke chaotic or even residually Li–Yorke chaotic according to Theorem 2 (as shown in [21], a SNAs can exhibit certain mild chaotic behavior). All these actually suggest that topologically a minimal set in a SNA should typically be everywhere non-local connected and be either an almost finite cover of the forcing space or a Cantorian; and dynamically a minimal set in a SNA should essentially be either almost automorphic or residually Li–Yorke chaotic (see Section 9 for more discussions in this regard on almost periodically forced projective bundle flows). We remark that the kind of complexity of SNAs described above for a quasi-periodically forced, damped, nonlinear oscillator is particularly significant when the damping is weak, in which case the reduced flow on the integral manifold becomes a less/non-smooth, quasiperiodically forced circle flow. To the contrary, when the damping is strong, one can well have cases in which topological complexity of minimal sets plays a less role to the geometric complexity of a SNA in comparing with its measure-theoretic complexity. For a quasi-periodically forced, damping-free nonlinear oscillator (1.8), we note that the flow on each projected Mather set is a (not necessarily smooth) skew-product flow lying in S 1 × Y for which all our results above are applicable. Hence if dynamics on a projected Mather set is not quasi-periodic, then similar topological and dynamical structures are expected for its minimal sets. 3. Preliminary For simplicity, we assume that all T-flows, for T = R or Z, to be considered in the rest of the paper are defined on complete separable metric spaces. We will use the same symbol | · |
to denote the cardinality of a set, the absolute value of a number, and the norm of a matrix or a function. Also, for a compact metric space X, we let the set $2^X$ of compact subsets of X be endowed with the Hausdorff metric. We say that a flow (X, T) is compact if the phase space X is a compact metric space. Recall that a nonempty compact invariant subset M of a flow (X, T) is minimal if it contains no nonempty, proper, closed invariant subset. A compact flow (X, T) is said to be minimal if X itself is a minimal set, to be strictly ergodic if it is both minimal and uniquely ergodic (i.e., it admits a unique invariant probability measure), and to be positively transitive if for each pair of nonempty open subsets U, V of X there exists $t \ge 0$ such that $U\cdot t \cap V \ne \emptyset$. If (X, T) is positively transitive, then the set $\mathrm{Tran}_+(X)$ of positive transitive points of X is a residual subset of X, and moreover,
\[ \mathrm{Tran}_+(X) = \bigcap_{n=1}^{\infty} \bigcup_{t \le 0} U_n \cdot t, \]
where $\{U_n\}_{n=1}^{\infty}$ is a countable basis of X.

3.1. Proximality, distality, and almost automorphy

Let (X, T) be a flow and d be the metric on X. Two points x, y ∈ X are said to be positively proximal if $\liminf_{t\to+\infty} d(x\cdot t, y\cdot t) = 0$; proximal if $\liminf_{t\to\infty} d(x\cdot t, y\cdot t) = 0$. For any x ∈ X, we define $\mathrm{PR}_+(x) = \{x' \in X : x, x' \text{ are positively proximal}\}$ and $\mathrm{PR}(x) = \{x' \in X : x, x' \text{ are proximal}\}$.

Now assume that (X, T) is a compact flow and consider the flow maps $\Pi_t : X \to X$, $\Pi_t(x) = x\cdot t$, t ∈ T. Then $\{\Pi_t : t \in T\} \subset X^X$, the compact Hausdorff space of self-maps of X endowed with the topology of pointwise convergence. The space $X^X$ is also a semigroup under the composition of maps on which the right multiplication $p \mapsto pp_0$ is continuous for all $p_0 \in X^X$ and the left multiplication $p \mapsto p_0 p$ is continuous only if $p_0$ is a continuous map. The Ellis semigroup E(X, T) of X is simply defined as $E(X, T) = \overline{\{\Pi_t : t \in T\}}$, where the closure is taken under the topology of pointwise convergence. Hence E(X, T) is compact and a sub-semigroup of $X^X$ with identity e (the identity map), on which the right multiplication is continuous. We note that the flow (X, T) also induces a natural compact flow (E(X, T), T) on E(X, T): $\gamma\cdot t \equiv \Pi_t\gamma$, γ ∈ E(X, T), t ∈ T. Let ω(e) be the ω-limit set of the identity e in (E(X, T), T). It is clear that two points x, y ∈ X are proximal (resp. positively proximal) iff there exists p ∈ E(X, T) (resp. p ∈ ω(e)) such that p(x) = p(y).

A point x ∈ X is called a positive distal point (resp. distal point) if $\mathrm{PR}_+(x) = \{x\}$ (resp. $\mathrm{PR}(x) = \{x\}$). A minimal flow (X, T) is called point-distal if it contains a distal point. It is well known that if (X, T) is point-distal, then the set $X_d$ of distal points of X is a residual subset [59]. It is clear that a distal point in a flow must be a positive distal point. In the following, we show that the converse is also true.

Proposition 3.1. Let (X, T) be a minimal flow. Then a positive distal point in X is also a distal point.
In particular, if (X, T) is not point-distal, then for any x ∈ X, $\mathrm{PR}_+(x) \setminus \{x\} \ne \emptyset$.
Proof. Let ω(e) be the ω-limit set of e in the flow (E(X, T), T). Using continuity of the right multiplication and the fact that e is the identity of E(X, T), we have that E(X, T)ω(e) = ω(e) and ω(e) is a sub-semigroup of E(X, T). Let x be a positive distal point. Since (X, T) is minimal, it is easy to see that for any y ∈ X, ω(e)y = {p(y): p ∈ ω(e)} = X. Let y ∈ X \ {x} and consider $\omega_y = \{p \in \omega(e) : p(y) = y\}$. Since ω(e)y = X, $\omega_y$ is nonempty. Since $\omega_y$ is a closed sub-semigroup of ω(e) on which the right multiplication is continuous, it follows from a general result due to Namakura [45] (see also [11, Lemma 1]) that $\omega_y$ contains an idempotent point u, i.e., $u^2 = u$. Clearly u(y) = y. Since u(x) = u(u(x)) and x is a positive distal point, x = u(x). Hence for any p ∈ E(X, T),
\[ p(x) = pu(x) \quad\text{and}\quad p(y) = pu(y). \tag{3.1} \]
Since pu ∈ E(X, T)ω(e) = ω(e) and x, y are not positively proximal, $pu(x) \ne pu(y)$. It follows from (3.1) that $p(x) \ne p(y)$. Since p is arbitrary, x, y are not proximal. This shows that PR(x) = {x}, i.e., x is a distal point. □

A function f ∈ C(T, X), where X is a complete separable metric space, is said to be almost automorphic if whenever $\{t_n\}$ is a sequence such that $f(t_n + t) \to g(t) \in C(T, X)$ uniformly on compact sets, then also $g(t - t_n) \to f(t)$ uniformly on compact sets, as n → ∞. An almost automorphic function valued in a separable Banach space admits well-defined Fourier series which are, however, not necessarily unique and only converge point-wise in general in terms of Bochner–Fejer summation [57,58]. But one can uniquely define the frequency module of an almost automorphic function in the usual way as the smallest additive sub-group of R containing a Fourier spectrum [57]. In this sense, both almost periodic and almost automorphic functions can be viewed as natural generalizations of the periodic ones in the strongest and the weakest sense respectively.

A point x in a flow (X, T) is said to be almost automorphic if the orbit {x · t} is an almost automorphic function in t. A flow (X, T) is called almost automorphic minimal if X is the closure of an almost automorphic orbit. An almost automorphic minimal flow is compact, minimal, point-distal, and contains residually many almost automorphic points which are precisely the distal points [59]. Unlike an almost periodic minimal flow, an almost automorphic one can be non-uniquely ergodic, topologically complicated, can admit positive topological entropy, and its general measure-theoretic characterization can be completely random (see [5,15,17,57,62] and references therein). Hence, though an almost automorphic minimal flow resembles an almost periodic one harmonically, it can have certain dynamical, topological, and measure-theoretic complexities which significantly differ from an almost periodic one.
Let (X, T) be a flow. A Δ-set S of T is the set of all increasing differences in a sequence $\{s_n\}_{n=1}^{\infty}$, i.e., $S = \{s_n - s_m : n > m\}$, and a Δ∗-set is a subset of T which has nonempty intersection with each Δ-set. A point x ∈ X is called Δ∗-recurrent if for every neighborhood V of x, the recurrent time set N(x, V) = {t ∈ T: x · t ∈ V} is a Δ∗-set. Almost automorphic points can be characterized by Δ∗-recurrency as follows.

Theorem 3.1. A point x in a compact flow (X, T) is almost automorphic iff it is Δ∗-recurrent.
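As a small numerical illustration of Δ∗-recurrence (not part of the original argument), consider the simplest almost periodic example with T = Z: an irrational rotation of the circle. For a neighborhood V of radius ε around a point, the return-time set is $\{t : \|t\alpha\| < \varepsilon\}$, where $\|\cdot\|$ is the distance to the nearest integer. By the pigeonhole principle, any more than 1/ε distinct integers contain two whose difference is a return time, so every Δ-set meets the return-time set. The Python sketch below checks this on a random sequence; the constants `ALPHA`, `EPS` and all function names are assumptions chosen for illustration.

```python
import math
import random

ALPHA = math.sqrt(2) - 1  # irrational rotation angle (illustrative choice)
EPS = 0.05                # radius of the neighborhood V of the base point


def circle_norm(x: float) -> float:
    """Distance from x to the nearest integer."""
    return abs(x - round(x))


def delta_set(seq):
    """All increasing differences s_n - s_m with n > m."""
    return {seq[n] - seq[m] for n in range(len(seq)) for m in range(n)}


def is_return_time(t: int) -> bool:
    """True if the rotation x -> x + ALPHA (mod 1) returns to V at time t."""
    return circle_norm(t * ALPHA) < EPS


# 21 distinct integers: 21 points on the circle leave a gap of at most
# 1/21 < EPS, so at least one difference must be a return time.
random.seed(0)
seq = sorted(random.sample(range(1, 10_000), 21))
hits = sorted(t for t in delta_set(seq) if is_return_time(t))
print(len(hits) > 0)  # True
```

Only finitely many terms of a sequence are examined here, which suffices for this particular ε because any infinite sequence contains 21 distinct terms; the general statement is Theorem 3.1.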
Proof. The theorem is a classical result of Furstenberg [16] when T = Z. For T = R, the proof is completely similar. □

3.2. Flow extensions

A flow extension (or a flow homomorphism or a factor map) π : (X, T) → (Y, T) is a continuous onto map π : X → Y which preserves the flows. If such a flow extension exists, then (X, T) (or X) is called an extension of (Y, T) (or Y) and (Y, T) (or Y) is called a factor of (X, T) (or X). Let π : (X, T) → (Y, T) be an extension between compact flows. Then π is called proximal (resp. distal) if for each y ∈ Y, any two points on $\pi^{-1}(y)$ are proximal (resp. distal); almost N–1 (resp. almost finite to one) if there exists a residual subset $X_0 \subset X$ such that for any $x \in X_0$, $\pi^{-1}\pi(x)$ consists of N points (resp. is a finite set); N–1 (resp. finite to one) if $\pi^{-1}\pi(x)$ consists of N points (resp. is a finite set) for all x ∈ X; open if π : X → Y is an open map; and semi-open if π : X → Y is a semi-open map, i.e., for any nonempty open subset U of X the image π(U) has nonempty interior in Y. A 1–1 flow extension is also called a flow isomorphism. Let $R_\pi = \{(x_1, x_2) \in X \times X : \pi(x_1) = \pi(x_2)\}$. The flow (X, T) induces a natural flow $(R_\pi, T)$ on $R_\pi$. The extension π is called positively weakly mixing if the flow $(R_\pi, T)$ is positively transitive. Using the ω-limit sets of $(R_\pi, T)$, it is easy to see that if π is a proximal extension, then it must be a positive proximal extension, i.e., for any x ∈ X, any two points in $\pi^{-1}\pi(x)$ are positively proximal. The general structure of an almost automorphic minimal flow is characterized by the following structure theorem due to Veech [58].

Theorem 3.2. A compact flow is almost automorphic minimal iff it is an almost 1–1 extension of an almost periodic minimal flow.
By the above structure theorem, almost automorphic points in an almost automorphic minimal flow are precisely those lying in singleton fibers of the corresponding almost 1–1 extension of an almost periodic minimal flow. Hence an almost automorphic minimal set becomes almost periodic iff every point in the set is an almost automorphic point.

Proposition 3.2. Let $\pi : (X, \mathbb{T}) \to (Y, \mathbb{T})$ be an extension between minimal flows. Then the following hold.
(1) If $\pi$ is almost finite to one, then there is a positive integer $N$ such that $\pi$ is an almost $N$–1 extension.
(2) If $\pi$ is open and finite to one, then $\pi$ is a distal and $N$–1 extension for some positive integer $N$. If, in addition, $(Y, \mathbb{T})$ is point-distal, then so is $(X, \mathbb{T})$.

Proof. (1) We let $Y_* = \{y \in Y : |\pi^{-1}(y)| < +\infty\}$. Since $\pi^{-1} : Y \to 2^X : y \mapsto \pi^{-1}(y)$ is upper semi-continuous, the set $Y_0$ of all continuity points of $\pi^{-1}$ is a residual subset of $Y$. For any given $y_* \in Y_*$ and $y_0 \in Y_0$, we let $\{t_n\} \subset \mathbb{T}$ be a sequence such that $y_* \cdot t_n \to y_0$. It follows from the continuity of $\pi^{-1}$ at $y_0$ that $|\pi^{-1}(y_0)| \le |\pi^{-1}(y_* \cdot t_n)| = |\pi^{-1}(y_*)| < +\infty$. Hence $Y_0 \subset Y_*$. Now, for any $y_1, y_2 \in Y_0$, the above argument yields that $|\pi^{-1}(y_1)| \le |\pi^{-1}(y_2)|$ and $|\pi^{-1}(y_2)| \le |\pi^{-1}(y_1)|$, i.e., the map $Y_0 \to \mathbb{N} : y \mapsto |\pi^{-1}(y)|$ is a constant, say $N$.
(2) Since $\pi$ is open, $\pi^{-1}$ is continuous, i.e., $Y_0 = Y$. It follows from the proof of (1) above that the map $Y \to \mathbb{N} : y \mapsto |\pi^{-1}(y)|$ is a constant $N$. The continuity of $\pi^{-1}$ also implies that there cannot be any proximal pair on each fiber, for otherwise the number of points on some fiber would be smaller than $N$. A distal extension of a point-distal flow is easily seen to be point-distal. □

Proposition 3.3. If $\pi : (X, \mathbb{T}) \to (Y, \mathbb{T})$ is an open and proximal extension between minimal flows, then it is a positively weakly mixing extension.

Proof. We follow the arguments of the proof of Theorem 6.3 in [19]. We first show the following

Claim. If $x_1, x_2, \ldots, x_n \in X$ are such that $\pi(x_1) = \pi(x_2) = \cdots = \pi(x_n)$, then for any $x \in X$ there is a positive increasing sequence $\{t_m\} \subset \mathbb{T}$ such that

$$\lim_{m \to \infty} x_i \cdot t_m = x \quad \text{for all } 1 \le i \le n. \tag{3.2}$$

Since $(X, \mathbb{T})$ is minimal, the Claim clearly holds for $n = 1$. By induction, we assume that the Claim is true for some $n = k$. Then for any $x \in X$ there exists a positive increasing sequence $\{t_m\}$ such that $\lim_{m \to \infty} x_i \cdot t_m = x$ for all $1 \le i \le k$. Without loss of generality, we let $x_{k+1} \cdot t_m$ be convergent, say to some $x' \in X$. Then $\pi(x) = \pi(x')$, and hence $\liminf_{t \to +\infty} d(x \cdot t, x' \cdot t) = 0$. Using minimality of $(X, \mathbb{T})$ we let $\{s_j\}$ be a positive increasing sequence such that $\lim_{j \to \infty} x \cdot s_j = x$ and $\lim_{j \to \infty} x' \cdot s_j = x$, i.e.,

$$\lim_{j \to \infty} \lim_{m \to \infty} x_i \cdot (t_m + s_j) = x, \quad i = 1, 2, \ldots, k + 1.$$

It follows that we can take a positive increasing sequence $\{r_j = s_j + t_{m(j)}\}$ for sufficiently large $\{m(j)\}$ such that

$$\lim_{j \to \infty} x_i \cdot r_j = x, \quad i = 1, 2, \ldots, k + 1.$$

This proves the Claim.

Let $W, W'$ be two nonempty open subsets of $R_\pi = \{(x_1, x_2) \in X \times X : \pi(x_1) = \pi(x_2)\}$. Since $\pi$ is an open map, there exist nonempty open sets $U, V; U', V'$ of $X$ such that $\pi(U) = \pi(V)$, $\pi(U') = \pi(V')$, $W \supseteq (U \times V) \cap R_\pi \ne \emptyset$, and $W' \supseteq (U' \times V') \cap R_\pi \ne \emptyset$. For a fixed $z_0 \in \pi(U)$, we have by minimality of $(X, \mathbb{T})$ that there are $t_1, t_2, \ldots, t_n \in \mathbb{T}$ such that $\bigcup_{i=1}^{n} V \cdot t_i = X$. By relabeling the $t_i$'s if necessary, we assume without loss of generality that there is an integer $1 \le m \le n$ such that $V \cdot t_i \cap \pi^{-1}(z_0) \ne \emptyset$ for all $i = 1, 2, \ldots, m$ and $\bigcup_{i=1}^{m} V \cdot t_i \supset \pi^{-1}(z_0)$. For each $i = 1, 2, \ldots, m$, we let $v_i \in V$ be such that $v_i \cdot t_i = y_i \in \pi^{-1}(z_0)$. Since $\pi(U) = \pi(V)$, there is a point $u_i \in U$ with $\pi(u_i) = \pi(v_i)$. Denote $x_i = u_i \cdot t_i$, $i = 1, 2, \ldots, m$. Then it is clear that $\pi(x_i) = \pi(u_i \cdot t_i) = \pi(v_i \cdot t_i) = \pi(y_i) = z_0$, i.e., $x_i \in \pi^{-1}(z_0)$, $i = 1, 2, \ldots, m$. By the Claim, there exists $t > 0$ such that $x_i \cdot t \in U'$ and $t + t_i > 0$ for all $i = 1, 2, \ldots, m$. Since $z_0 \cdot t = \pi(x_i \cdot t) \in \pi(U') = \pi(V')$, $i = 1, 2, \ldots, m$, we can take a point $b \in V'$ such that $\pi(b) = z_0 \cdot t$. Then $b \cdot (-t) \in \pi^{-1}(z_0) \subseteq \bigcup_{i=1}^{m} V \cdot t_i$. Hence $b \cdot (-t) \in V \cdot t_i$, i.e., $b \in V \cdot (t + t_i) \cap V'$ for some $1 \le i \le m$. Now let $a = x_i \cdot t$. Then $\pi(a) = z_0 \cdot t$ and $a \in U \cdot (t + t_i) \cap U'$. It follows that $(a, b) \in ((U \times V) \cdot (t + t_i)) \cap (U' \times V') \cap R_\pi \subset (W \cdot (t + t_i)) \cap W'$. □
The following is a classical result of Auslander [59].

Proposition 3.4. If $\pi : (X, \mathbb{T}) \to (Y, \mathbb{T})$ is an extension between minimal flows, then $\pi$ is semi-open.

It is well known that every extension of minimal flows can be lifted to an open extension by almost 1–1 modifications. To be precise, let $\pi : (X, \mathbb{T}) \to (Y, \mathbb{T})$ be an extension between minimal flows, and let $Y_0$ be the set of continuity points of $\pi^{-1} : Y \to 2^X : y \mapsto \pi^{-1}(y)$. Recall that $Y_0$ is an invariant residual subset of $Y$. Let $Y^* = \mathrm{cl}(\{\pi^{-1}(y) : y \in Y_0\})$ and $(2^X, \mathbb{T})$ be the flow on $2^X$ induced from $(X, \mathbb{T})$. Then $Y^*$ is an invariant closed subset of $2^X$. It is easy to see that for any $y^* \in Y^*$, $\pi(y^*)$ is a singleton. Define $\tau : Y^* \to Y$ by $\tau(y^*) = \pi(x)$, $x \in y^*$. Then $\tau : (Y^*, \mathbb{T}) \to (Y, \mathbb{T})$ is a flow extension. Also let $X^* = \{(x, y^*) \in X \times Y^* : x \in y^*\}$. Then $X^*$ is a closed invariant subset of $(X \times Y^*, \mathbb{T})$. Denote by $\tau' : X^* \to X$ and $\pi' : X^* \to Y^*$ the natural projections.

Proposition 3.5. The following hold.
(1) $(X^*, \mathbb{T})$ is a minimal flow and the following diagram commutes:

$$\begin{array}{ccc}
(X, \mathbb{T}) & \xleftarrow{\;\tau'\;} & (X^*, \mathbb{T}) \\
\pi \downarrow & & \downarrow \pi' \\
(Y, \mathbb{T}) & \xleftarrow{\;\tau\;} & (Y^*, \mathbb{T})
\end{array}$$

(2) $\tau, \tau'$ are almost 1–1 extensions.
(3) $\pi'$ is an open extension.

Proof. See Theorem 3.1 in [59] or Lemma 14.41 in [3]. □
3.3. Entropies

Let $(X, \mathbb{T})$ be a compact flow and consider the time-1 map $T : X \to X : x \mapsto x \cdot 1$. We denote the discrete flow induced by $T$ simply by $(X, T)$. Let $\mathcal{B}_X$ denote the collection of all Borel subsets of $X$. A cover of $X$ is a finite family of Borel subsets of $X$ whose union is $X$. A partition of $X$ is a cover of $X$ whose elements are pairwise disjoint. We denote the set of partitions of $X$ by $\mathcal{P}_X$ and the set of open covers of $X$ by $\mathcal{C}_X$. An open cover $\mathcal{U}$ is said to be finer than $\mathcal{V}$ (denoted by $\mathcal{U} \succeq \mathcal{V}$) if each element of $\mathcal{U}$ is contained in some element of $\mathcal{V}$. Let $\mathcal{U} \vee \mathcal{V} = \{U \cap V : U \in \mathcal{U}, V \in \mathcal{V}\}$. Given non-negative integers $M, N$ and $\mathcal{U} \in \mathcal{C}_X$ or $\mathcal{P}_X$, we let $\mathcal{U}_M^N = \bigvee_{n=M}^{N} T^{-n}\mathcal{U}$. Also, given $\mathcal{U} \in \mathcal{C}_X$, we let $N(\mathcal{U})$ be the minimal cardinality among all cardinalities of sub-open-covers of $\mathcal{U}$ and let $H(\mathcal{U}) = \log N(\mathcal{U})$. Clearly, if there is another open cover $\mathcal{V} \succeq \mathcal{U}$, then $H(\mathcal{V}) \ge H(\mathcal{U})$. In fact, for any two covers $\mathcal{U}, \mathcal{V} \in \mathcal{C}_X$ we have $H(\mathcal{U} \vee \mathcal{V}) \le H(\mathcal{U}) + H(\mathcal{V})$. Consequently, for any open cover $\mathcal{U} \in \mathcal{C}_X$, $a_n = H(\mathcal{U}_0^{n-1})$ is a bounded sub-additive sequence, i.e., $a_{n+m} \le a_n + a_m$, $n, m \in \mathbb{N}$. Hence $\lim_{n \to \infty} \frac{a_n}{n}$ exists and equals $\inf_{n \ge 1} \frac{a_n}{n}$. This limit, denoted by $h_{\mathrm{top}}(T, \mathcal{U})$, is called the entropy of $\mathcal{U}$. The topological entropy $h_{\mathrm{top}}(X, T)$ of $(X, T)$ is simply defined as

$$h_{\mathrm{top}}(X, T) = \sup_{\mathcal{U} \in \mathcal{C}_X} h_{\mathrm{top}}(T, \mathcal{U}),$$
and the topological entropy $h_{\mathrm{top}}(X, \mathbb{T})$ of $(X, \mathbb{T})$ is simply defined as $h_{\mathrm{top}}(X, T)$.

Let $\mathcal{M}(X)$, $\mathcal{M}(X, T)$, and $\mathcal{M}^e(X, T)$ be the set of Borel probability measures on $X$, the set of invariant Borel probability measures on $X$, and the set of invariant ergodic measures on $X$, respectively. For given $\alpha \in \mathcal{P}_X$ and $\mu \in \mathcal{M}(X)$, define

$$H_\mu(\alpha) = \sum_{A \in \alpha} -\mu(A) \log \mu(A).$$

Now let $\mu \in \mathcal{M}(X, T)$. Then for a given $\alpha \in \mathcal{P}_X$, $H_\mu(\alpha_0^{n-1})$ is a non-negative sub-additive sequence. The measure-theoretic entropy of $\mu$ relative to $\alpha$ is defined by

$$h_\mu(T, \alpha) = \lim_{n \to \infty} \frac{1}{n} H_\mu(\alpha_0^{n-1}) = \inf_{n \ge 1} \frac{1}{n} H_\mu(\alpha_0^{n-1}),$$

and the measure-theoretic entropy of $\mu$ is defined by

$$h_\mu(X, T) = \sup_{\alpha \in \mathcal{P}_X} h_\mu(T, \alpha).$$
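As a numerical illustration of these definitions, the sketch below evaluates $H_\mu(\alpha_0^{n-1})/n$ for a Bernoulli measure and the time-zero partition, where the atoms of $\alpha_0^{n-1}$ are the length-$n$ words with product mass. The Bernoulli setting, the weight vector `p`, and the function names are assumptions chosen for the example; they do not come from the paper.

```python
import math
from itertools import product

def partition_entropy(masses):
    """H_mu(alpha) = -sum mu(A) log mu(A), with the convention 0 log 0 = 0."""
    return -sum(m * math.log(m) for m in masses if m > 0)

def joined_entropy(p, n):
    """H_mu(alpha_0^{n-1}) for the Bernoulli measure with weights p and the
    time-zero partition alpha: each atom is a length-n word with product mass."""
    return partition_entropy(math.prod(word) for word in product(p, repeat=n))

p = (0.5, 0.25, 0.25)
for n in (1, 2, 3, 4):
    # For a product measure every ratio agrees with H_mu(alpha) up to rounding.
    print(n, joined_entropy(p, n) / n, partition_entropy(p))
```

In this independent case the sub-additive sequence is exactly additive, so the limit and the infimum in the definition of $h_\mu(T, \alpha)$ are already attained at $n = 1$.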
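The classical variational principle of entropy says that

$$h_{\mathrm{top}}(X, T) = \sup_{\mu \in \mathcal{M}(X, T)} h_\mu(X, T)$$

and the supremum can be attained by an invariant ergodic measure. We refer the readers to [10,46,60] for more information on the classical theory of measure-theoretic and topological entropies.

Given partitions $\alpha, \beta \in \mathcal{P}_X$, $\mu \in \mathcal{M}(X)$ and a $\sigma$-algebra $\mathcal{A} \subseteq \mathcal{B}_X$, define

$$H_\mu(\alpha \mid \beta) = H_\mu(\alpha \vee \beta) - H_\mu(\beta),$$
$$H_\mu(\alpha \mid \mathcal{A}) = \sum_{A \in \alpha} \int_X -E(1_A \mid \mathcal{A}) \log E(1_A \mid \mathcal{A}) \, d\mu,$$
$$H_\mu(\alpha \mid \beta \vee \mathcal{A}) = H_\mu(\alpha \vee \beta \mid \mathcal{A}) - H_\mu(\beta \mid \mathcal{A}),$$

where $E(1_A \mid \mathcal{A})$ is the expectation of $1_A$ with respect to $\mathcal{A}$. Then $H_\mu(\alpha \mid \beta)$ (resp. $H_\mu(\alpha \mid \mathcal{A})$) increases with respect to $\alpha$ and decreases with respect to $\beta$ (resp. $\mathcal{A}$).

Let $\mu \in \mathcal{M}(X, T)$ and $\mathcal{A}$ be an invariant measurable $\sigma$-algebra of $X$. It is not hard to see that for a given $\alpha \in \mathcal{P}_X$, $H_\mu(\alpha_0^{n-1} \mid \mathcal{A})$ is a bounded sub-additive sequence. The measure-theoretic conditional entropy of $\alpha$ with respect to $\mathcal{A}$ is defined by

$$h_\mu(T, \alpha \mid \mathcal{A}) = \lim_{n \to \infty} \frac{1}{n} H_\mu(\alpha_0^{n-1} \mid \mathcal{A}) = \inf_{n \ge 1} \frac{1}{n} H_\mu(\alpha_0^{n-1} \mid \mathcal{A}),$$

and the measure-theoretic conditional entropy of $(X, T, \mu)$ with respect to $\mathcal{A}$ is defined by

$$h_\mu(X, T \mid \mathcal{A}) = \sup_{\alpha \in \mathcal{P}_X} h_\mu(T, \alpha \mid \mathcal{A}).$$

It is easy to see that $h_\mu(T, \alpha \mid \mathcal{A}) = H_\mu(\alpha \mid \bigvee_{i=1}^{\infty} (T^{-i}\alpha) \vee \mathcal{A})$.

Let $\pi : (X, T) \to (Y, S)$ be an extension between compact discrete flows. For each $\alpha \in \mathcal{P}_X$, the measure-theoretic conditional entropy of $\alpha$ with respect to $(Y, S)$ is defined by

$$h_\mu(T, \alpha \mid Y) = h_\mu(T, \alpha \mid \pi^{-1}(\mathcal{B}_Y)) = \lim_{n \to \infty} \frac{1}{n} H_\mu(\alpha_0^{n-1} \mid \pi^{-1}(\mathcal{B}_Y)),$$

and the measure-theoretic conditional entropy of $(X, T, \mu)$ with respect to $(Y, S)$ is defined by

$$h_\mu(X, T \mid Y) = \sup_{\alpha \in \mathcal{P}_X} h_\mu(T, \alpha \mid Y).$$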
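The identity $H_\mu(\alpha \mid \beta) = H_\mu(\alpha \vee \beta) - H_\mu(\beta)$ can be checked on a finite probability space. This is a minimal sketch under that assumption: partitions are encoded as labelling functions, and the names `entropy` and `conditional_entropy` are hypothetical helpers rather than notation from the paper.

```python
import math
from collections import defaultdict

def entropy(mu, label):
    """H_mu of the partition whose atoms are the level sets of `label`."""
    mass = defaultdict(float)
    for x, m in mu.items():
        mass[label(x)] += m
    return -sum(m * math.log(m) for m in mass.values() if m > 0)

def conditional_entropy(mu, a, b):
    """H_mu(alpha | beta) = H_mu(alpha v beta) - H_mu(beta)."""
    return entropy(mu, lambda x: (a(x), b(x))) - entropy(mu, b)

# Uniform measure on {0, ..., 7} with three partitions given by labels.
mu = {x: 1 / 8 for x in range(8)}
parity = lambda x: x % 2      # alpha
mod4 = lambda x: x % 4        # refines alpha
high = lambda x: x // 4       # independent of alpha

print(conditional_entropy(mu, parity, mod4))  # beta refines alpha: no uncertainty left
print(conditional_entropy(mu, parity, high))  # independent: equals H_mu(alpha)
print(entropy(mu, parity))
```

The two cases show the extremes of the monotonicity statement: conditioning on a finer partition removes all uncertainty, while conditioning on an independent one removes none.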
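When $(Y, S)$ is the trivial flow, the above coincides with the measure-theoretic entropy of $(X, T)$ with respect to $\mu$.

Let $\mu \in \mathcal{M}(X, T)$ and $\mathcal{A}$ be an invariant sub-$\sigma$-algebra of $\mathcal{B}_X$. The relative Pinsker $\sigma$-algebra $P_\mu(\mathcal{A})$ is defined as the smallest $\sigma$-algebra containing $\{\xi \in \mathcal{P}_X : h_\mu(T, \xi \mid \mathcal{A}) = 0\}$. When $\mathcal{A} = \{\emptyset, X\}$, $P_\mu(\mathcal{A})$ coincides with $P_\mu$, the classical Pinsker $\sigma$-algebra of the system. It is easy to see that $P_\mu(\mathcal{A})$ is invariant, $P_\mu(\mathcal{A}) \supseteq P_\mu \vee \mathcal{A}$, and $P_\mu(\mathcal{A}, k) = P_\mu(\mathcal{A})$ for any $k \in \mathbb{Z} \setminus \{0\}$, where $P_\mu(\mathcal{A}, k)$ denotes the smallest $\sigma$-algebra containing $\{\xi \in \mathcal{P}_X : h_\mu(T^k, \xi \mid \mathcal{A}) = 0\}$.

Given $\alpha \in \mathcal{P}_X$, we let $\alpha^- = \bigvee_{n=1}^{\infty} T^{-n}\alpha$ and $\alpha^T = \bigvee_{n=-\infty}^{+\infty} T^{-n}\alpha$. Then a relative version of the classical Pinsker formula (see [18,20,46]) says that if $\alpha, \beta \in \mathcal{P}_X$, then

$$h_\mu(T, \alpha \vee \beta \mid \mathcal{A}) = h_\mu(T, \beta \mid \mathcal{A}) + H_\mu(\alpha \mid \beta^T \vee \alpha^- \vee \mathcal{A}). \tag{3.3}$$

In particular, when $\mathcal{A}$ is trivial, $h_\mu(T, \alpha \vee \beta) = h_\mu(T, \beta) + H_\mu(\alpha \mid \beta^T \vee \alpha^-)$.

Proposition 3.6. Let $\mu$ and $\mathcal{A}$ be given as above. Then for each $\xi \in \mathcal{P}_X$,

$$h_\mu(T, \xi \mid P_\mu(\mathcal{A})) = H_\mu(\xi \mid \xi^- \vee P_\mu(\mathcal{A})) = H_\mu(\xi \mid \xi^- \vee \mathcal{A}) = h_\mu(T, \xi \mid \mathcal{A}).$$

Proof. For any $\alpha \in \mathcal{P}_X$ and any invariant sub-$\sigma$-algebra $\mathcal{C}$ of $\mathcal{B}_X$, it is easy to see that $h_\mu(T, \alpha \mid \mathcal{C}) = H_\mu(\alpha \mid \alpha^- \vee \mathcal{C})$. Hence for each $\xi \in \mathcal{P}_X$, we have $h_\mu(T, \xi \mid P_\mu(\mathcal{A})) = H_\mu(\xi \mid \xi^- \vee P_\mu(\mathcal{A}))$ and $H_\mu(\xi \mid \xi^- \vee \mathcal{A}) = h_\mu(T, \xi \mid \mathcal{A})$.

Now we fix $\xi \in \mathcal{P}_X$ and let $\eta \subset P_\mu(\mathcal{A})$ be any finite measurable partition. It follows from (3.3) that

$$\begin{aligned}
H_\mu(\xi \mid \xi^- \vee \mathcal{A}) + H_\mu(\eta \mid \eta^- \vee \xi^T \vee \mathcal{A})
&= h_\mu(T, \xi \mid \mathcal{A}) + H_\mu(\eta \mid \xi^T \vee \eta^- \vee \mathcal{A}) \\
&= h_\mu(T, \xi \vee \eta \mid \mathcal{A}) = H_\mu(\xi \vee \eta \mid \xi^- \vee \eta^- \vee \mathcal{A}) \\
&= H_\mu(\eta \mid \xi^- \vee \eta^- \vee \mathcal{A} \vee \xi) + H_\mu(\xi \mid \xi^- \vee \eta^- \vee \mathcal{A}).
\end{aligned} \tag{3.4}$$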
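Since $\eta \subset P_\mu(\mathcal{A})$, we have $H_\mu(\eta \mid \eta^- \vee \mathcal{A}) = h_\mu(T, \eta \mid \mathcal{A}) = 0$,

$$H_\mu(\eta \mid \eta^- \vee \xi^T \vee \mathcal{A}) = 0, \quad \text{and} \quad H_\mu(\eta \mid \eta^- \vee \xi^- \vee \mathcal{A} \vee \xi) = 0. \tag{3.5}$$

Combining (3.4) and (3.5), we have

$$H_\mu(\xi \mid \xi^- \vee \eta^- \vee \mathcal{A}) = H_\mu(\xi \mid \xi^- \vee \mathcal{A}). \tag{3.6}$$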
Let $\eta_n \subset P_\mu(\mathcal{A})$ be an increasing sequence of finite measurable partitions of $X$ such that $\bigvee_{n=1}^{\infty} \eta_n = P_\mu(\mathcal{A})$ (mod $\mu$). It follows from (3.6) that $H_\mu(\xi \mid \xi^- \vee P_\mu(\mathcal{A})) = H_\mu(\xi \mid \xi^- \vee \mathcal{A})$. □

3.4. Measure-theoretic extensions

Let $(X, \mathcal{B}, \mu)$ be a standard Borel space, $\mu$ be a regular probability measure on $X$, and $T : X \to X$ be a measurable transformation. The quadruple $(X, \mathcal{B}, \mu, T)$ is said to be a metric dynamical system (MDS for short) if $T$ is measure preserving, i.e., $\mu(B) = \mu(T^{-1}B)$ for all $B \in \mathcal{B}$. If, in addition, $T$ is bijective and $T^{-1}$ is also measure-preserving, then $(X, \mathcal{B}, \mu, T)$ is said to be invertible. In the following, a MDS is always assumed to be invertible. A MDS $(X, \mathcal{B}, \mu, T)$ is said to be ergodic if whenever $A \in \mathcal{B}$ is such that $\mu(A \,\Delta\, T^{-1}A) = 0$ then either $\mu(A) = 0$ or $\mu(A) = 1$.

Let $(X, \mathcal{B}, \mu, T)$ be a MDS and $(Y, \mathcal{C}, \nu, S)$ be a measure-theoretic factor of $(X, \mathcal{B}, \mu, T)$, i.e., there exists a measure-preserving map $\pi : X \to Y$, called a measure-theoretic factor map or extension, such that $\pi \circ T = S \circ \pi$ $\mu$-a.e. It is well known that $\mu$ admits a $\nu$-a.s. unique disintegration $\mu = \int_Y \mu_y \, d\nu(y)$ over $(Y, \mathcal{C}, \nu, S)$ [16, Proposition 5.9], where $\mu_y$, $y \in Y$, are Borel probability measures on $X$ satisfying

$$\mu_{Sy} = T\mu_y, \quad \nu\text{-a.e. } y \in Y. \tag{3.7}$$
For each $i = 1, 2, \ldots, n$, let $\pi_i : (X_i, \mathcal{B}_i, \mu_i, T_i) \to (Y, \mathcal{C}, \nu, S)$ be a factor map between MDSs and $\mu_i = \int_Y \mu_{i,y} \, d\nu(y)$ be the disintegration of $\mu_i$ over $(Y, \mathcal{C}, \nu, S)$. Define

$$\mu_1 \times_Y \mu_2 \times_Y \cdots \times_Y \mu_n = \int_Y \mu_{1,y} \times \mu_{2,y} \times \cdots \times \mu_{n,y} \, d\nu(y). \tag{3.8}$$

Then by (3.7), $T_1 \times T_2 \times \cdots \times T_n$ preserves the measure $\mu_1 \times_Y \mu_2 \times_Y \cdots \times_Y \mu_n$. The MDS $(X_1 \times X_2 \times \cdots \times X_n, \mathcal{B}_1 \times \mathcal{B}_2 \times \cdots \times \mathcal{B}_n, \mu_1 \times_Y \mu_2 \times_Y \cdots \times_Y \mu_n, T_1 \times T_2 \times \cdots \times T_n)$ is called the product of $(X_i, \mathcal{B}_i, \mu_i, T_i)$, $i = 1, 2, \ldots, n$, relative to $(Y, \mathcal{C}, \nu, S)$.

Let $\pi : (X, \mathcal{B}, \mu, T) \to (Y, \mathcal{C}, \nu, S)$ be a factor map between ergodic MDSs and let $\mu = \int_Y \mu_y \, d\nu(y)$ be the disintegration of $\mu$ over $(Y, \mathcal{C}, \nu, S)$. $\pi$ is said to be relatively weakly mixing if the MDS $(X \times X, \mathcal{B} \times \mathcal{B}, \mu \times_Y \mu, T \times T)$ is ergodic, and is said to be compact if there exists a dense set $F$ of functions in $L^2(X, \mathcal{B}, \mu)$ with the following property: for any $f \in F$ and $\delta > 0$, there exists a finite set of functions $g_1, g_2, \ldots, g_k \in L^2(X, \mathcal{B}, \mu)$ such that $\min_{1 \le j \le k} \|T^n f - g_j\|_{L^2(\mu_y)} < \delta$, $\nu$-a.e. $y \in Y$, for all $n \in \mathbb{Z}$.

Now consider a compact discrete flow $(X, T)$. For $\mu \in \mathcal{M}(X, T)$, we let $P_\mu(\mathcal{A})$ be the relative Pinsker $\sigma$-algebra of an invariant $\sigma$-algebra $\mathcal{A}$ of $\mathcal{B}_X$ and denote the completion of the Borel $\sigma$-algebra $\mathcal{B}_X$ of $X$ under $\mu$ by $\mathcal{B}_\mu$. Then $(X, \mathcal{B}_\mu, \mu, T)$ is a Lebesgue system. Let $(Z, \mathcal{Z}, \eta, R)$ be the Pinsker factor of $(X, \mathcal{B}_\mu, \mu, T)$ and $\pi : (X, \mathcal{B}_\mu, \mu, T) \to (Z, \mathcal{Z}, \eta, R)$ be the measure-theoretic Pinsker factor map with respect to $\mathcal{A}$, i.e., $\pi : X \to Z$ is measure-preserving, $\pi \circ T = R \circ \pi$ $\mu$-a.e., and $\pi^{-1}\mathcal{Z} = P_\mu(\mathcal{A})$ (mod $\mu$). Let $\mu = \int_Z \mu_z \, d\eta(z)$ be the disintegration of $\mu$ over $(Z, \mathcal{Z}, \eta, R)$. Then for each integer $n \ge 2$, $\lambda_n^{\mathcal{A}}(\mu) = \int_Z \mu_z^{(n)} \, d\eta(z)$ is a $T^{(n)}$-invariant measure on $X^{(n)}$, where

$$\mu_z^{(n)} = \underbrace{\mu_z \times \mu_z \times \cdots \times \mu_z}_{n}, \quad X^{(n)} = \underbrace{X \times X \times \cdots \times X}_{n}, \quad T^{(n)} = \underbrace{T \times T \times \cdots \times T}_{n}.$$
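Moreover, it follows from basic properties of disintegration [16,50] that for any $A_1, A_2, \ldots, A_n \in \mathcal{B}_\mu$ and $\alpha \in \mathcal{P}_X$,

$$\lambda_n^{\mathcal{A}}(\mu)\Big(\prod_{i=1}^{n} A_i\Big) = \int_X \prod_{i=1}^{n} E(1_{A_i} \mid P_\mu(\mathcal{A}))(x) \, d\mu(x) \tag{3.9}$$

and

$$\begin{aligned}
H_\mu(\alpha \mid P_\mu(\mathcal{A}))
&= \int_X \sum_{A \in \alpha} -E(1_A \mid P_\mu(\mathcal{A}))(x) \log E(1_A \mid P_\mu(\mathcal{A}))(x) \, d\mu(x) \\
&= \int_X \sum_{A \in \alpha} -\mu_{\pi(x)}(A) \log \mu_{\pi(x)}(A) \, d\mu(x) \\
&= \int_Z \int_X \sum_{A \in \alpha} -\mu_{\pi(x)}(A) \log \mu_{\pi(x)}(A) \, d\mu_z(x) \, d\eta(z) \\
&= \int_Z H_{\mu_z}(\alpha) \, d\eta(z).
\end{aligned} \tag{3.10}$$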
The following result should be well-known. As we are not aware of a suitable reference for it, a proof of the result is given for the sake of completeness.

Theorem 3.3. Let $(X, T)$ be a compact discrete flow, $\mu \in \mathcal{M}^e(X, T)$, and $\mathcal{A}$ be a $T$-invariant $\sigma$-algebra of $\mathcal{B}_X$. Let $\pi : (X, \mathcal{B}_\mu, \mu, T) \to (Z, \mathcal{Z}, \eta, R)$ be the measure-theoretic Pinsker factor map with respect to $\mathcal{A}$ and $\mu = \int_Z \mu_z \, d\eta(z)$ be the disintegration of $\mu$ over $(Z, \mathcal{Z}, \eta, R)$. If $h_\mu(X, T \mid \mathcal{A}) > 0$, then
(1) $\mu_z$ is non-atomic for $\eta$-a.e. $z \in Z$;
(2) $\lambda_n^{\mathcal{A}}(\mu)$ is a $T^{(n)}$-invariant ergodic measure on $X^{(n)}$.

Proof. (1) If (1) is not true, then the ergodicity of $\mu$ implies that there exists a positive integer $k$ such that $\mu_z$ is purely atomic with exactly $k$ points in its support for $\eta$-a.e. $z \in Z$. Hence for each $\beta \in \mathcal{P}_X$ and $\eta$-a.e. $z \in Z$,

$$H_{\mu_z}(\beta) \le \log k. \tag{3.11}$$

Now for any $\alpha \in \mathcal{P}_X$, we have by Proposition 3.6, (3.10), and (3.11) that

$$\begin{aligned}
h_\mu(T, \alpha \mid \mathcal{A}) = h_\mu(T, \alpha \mid P_\mu(\mathcal{A}))
&= \lim_{n \to \infty} \frac{1}{n} H_\mu\Big(\bigvee_{i=0}^{n-1} T^{-i}\alpha \,\Big|\, P_\mu(\mathcal{A})\Big) \\
&= \lim_{n \to \infty} \frac{1}{n} \int_Z H_{\mu_z}\Big(\bigvee_{i=0}^{n-1} T^{-i}\alpha\Big) \, d\eta(z)
\le \lim_{n \to \infty} \frac{1}{n} \int_Z \log k \, d\eta(z) = 0.
\end{aligned}$$
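Since $\alpha$ is arbitrary, $h_\mu(X, T \mid \mathcal{A}) = 0$, a contradiction.

(2) We first claim that $\pi : (X, \mathcal{B}_\mu, \mu, T) \to (Z, \mathcal{Z}, \eta, R)$ is a relatively weakly mixing extension. If not, then by a classical result of Furstenberg and Zimmer [16,63,64] there exist $(Y, \mathcal{C}, \nu, S)$ and factor maps $\pi_1 : X \to Y$, $\pi_2 : Y \to Z$ such that $\pi = \pi_2 \pi_1$ and $\pi_2$ is a nontrivial compact extension. We note that $\pi_1^{-1}\pi_2^{-1}\mathcal{Z} = P_\mu(\mathcal{A})$. Now, for any $A \in \mathcal{C}$, we have by Proposition 3.6 that

$$\begin{aligned}
h_\mu(T, \{\pi_1^{-1}A, \pi_1^{-1}(Y \setminus A)\} \mid \mathcal{A})
&= h_\mu(T, \{\pi_1^{-1}A, \pi_1^{-1}(Y \setminus A)\} \mid P_\mu(\mathcal{A})) \\
&= h_\mu(T, \{\pi_1^{-1}A, \pi_1^{-1}(Y \setminus A)\} \mid \pi_1^{-1}\pi_2^{-1}\mathcal{Z}) \\
&= h_\nu(S, \{A, Y \setminus A\} \mid \pi_2^{-1}\mathcal{Z}).
\end{aligned}$$

Since $\pi_2$ is a compact extension, the conditional sequential entropy characterization of compact extensions [24] implies that $h_\mu(T, \{\pi_1^{-1}A, \pi_1^{-1}(Y \setminus A)\} \mid \mathcal{A}) = 0$. This shows that $\pi_1^{-1}A \in P_\mu(\mathcal{A})$. Since $A$ is arbitrary, $\pi_1^{-1}\mathcal{C} \subseteq P_\mu(\mathcal{A})$ (mod $\mu$), and moreover, $P_\mu(\mathcal{A}) = \pi_1^{-1}\pi_2^{-1}\mathcal{Z} \subseteq \pi_1^{-1}\mathcal{C} \subseteq P_\mu(\mathcal{A})$ (mod $\mu$). Hence $\pi_1^{-1}\pi_2^{-1}\mathcal{Z} = \pi_1^{-1}\mathcal{C} = P_\mu(\mathcal{A})$ (mod $\mu$), i.e., $\pi_2^{-1}\mathcal{Z} = \mathcal{C}$ (mod $\nu$). This shows that $\pi_2$ is an isomorphism, a contradiction to the fact that $\pi_2$ is a nontrivial compact extension. Hence $\pi : (X, \mathcal{B}_\mu, \mu, T) \to (Z, \mathcal{Z}, \eta, R)$ is a relatively weakly mixing extension.

Since $\lambda_n^{\mathcal{A}}(\mu) = \underbrace{\mu \times_Z \mu \times_Z \cdots \times_Z \mu}_{n}$ and $\mu$ is ergodic, we have by Proposition 6.3 in [16] that $(X^{(n)}, \mathcal{B}^{(n)}, \mu^{(n)}, T^{(n)})$ is ergodic, where $\mathcal{B}^{(n)} = \underbrace{\mathcal{B} \times \mathcal{B} \times \cdots \times \mathcal{B}}_{n}$. □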
4. Entropy and ordering

The main purpose of this section is to prove Theorem 1. While Theorem 1 should more or less follow from an entropy inequality in [9], we will prove a general entropy preservation result (Theorem 4.1 below) under an ordering condition which not only implies Theorem 1 but also has the advantage of treating other zero entropy problems. For instance, applying our entropy preservation result, one can similarly show that all almost automorphic minimal sets obtained in [56,57] for an almost periodically forced monotone system admit zero topological entropy.

4.1. Entropy preservation

Given a compact discrete flow $(X, T)$, a finite subset $A$ of $X$ is called a full scrambled set if for each map $f : A \to A$ there exists an infinite sequence $\{n_i\} \subset \mathbb{N}$ such that $\lim_{i \to \infty} T^{n_i}x = f(x)$ for any $x \in A$.
Lemma 4.1. Let $\pi : (X, T) \to (Y, S)$ be an extension between compact discrete flows and $\mu$ be an ergodic measure of $(X, T)$. If $h_\mu(X, T \mid Y) > 0$, then for each integer $n \ge 2$ there exist $y \in Y$ and $x_1, x_2, \ldots, x_n \in \pi^{-1}(y)$ such that $x_1, x_2, \ldots, x_n$ are pairwise different and $\{x_1, x_2, \ldots, x_n\}$ is a full scrambled set of $(X, T)$.

Proof. We follow the arguments in the proof of Theorem 2.1 in [8]. Let $(Z, \mathcal{Z}, \eta, R)$ be the Pinsker factor of $(X, \mathcal{B}_\mu, \mu, T)$ with respect to $\pi^{-1}\mathcal{B}_Y$ and $\mu = \int_Z \mu_z \, d\eta(z)$ be the disintegration of $\mu$ over $(Z, \mathcal{Z}, \eta, R)$. For each integer $n \ge 2$, let $\lambda_n(\mu) = \int_Z \mu_z^{(n)} \, d\eta(z)$ and $W^n = \mathrm{supp}(\lambda_n(\mu))$. Since $\mu$ is ergodic and $h_\mu(X, T \mid Y) > 0$, we have by Theorem 3.3 that $\mu_z$ is non-atomic for $\eta$-a.e. $z \in Z$ and $\lambda_n(\mu)$ is an ergodic measure. Hence $(W^n, T^{(n)})$ is transitive, i.e., it contains a transitive point, that is, a point whose orbit is dense. Let $W^n_{\mathrm{trans}}$ denote the set of all transitive points of $(W^n, T^{(n)})$ and $G_n$ be the set of generic points in $W^n$ with respect to $\lambda_n(\mu)$, i.e., $G_n = \{w \in W^n : \frac{1}{N} \sum_{i=0}^{N-1} \delta_{T^i \times T^i \times \cdots \times T^i (w)} \to \lambda_n(\mu) \text{ as } N \to \infty\}$. Since $\lambda_n(\mu)$ is ergodic, we have $G_n \subset W^n_{\mathrm{trans}}$. Then by the Birkhoff ergodic theorem,

$$1 = \lambda_n(\mu)(G_n) = \int_Z \mu_z^{(n)}(G_n) \, d\eta(z).$$

It follows that there exists a subset $Z_n$ of $Z$ with $\eta(Z_n) = 1$ such that $\mu_z^{(n)}(G_n) = 1$ and $\mu_z$ is non-atomic for all $z \in Z_n$. For each $z \in Z_n$, let $S_z = \mathrm{supp}(\mu_z)$. Then $S_z$ is a closed subset of $X$ without isolated points and

$$G_n \cap S_z^{(n)} \subset W^n_{\mathrm{trans}} \cap S_z^{(n)} =: L_z^n, \quad \text{where } S_z^{(n)} = \underbrace{S_z \times S_z \times \cdots \times S_z}_{n}.$$

Since $\mu_z^{(n)}(G_n \cap S_z^{(n)}) = 1$, $G_n \cap S_z^{(n)}$ is a dense subset of $S_z^{(n)}$. This shows that for each $z \in Z_n$,

$$S_z^{(n)} = \mathrm{cl}(L_z^n) \subseteq W^n. \tag{4.1}$$

Now, fix $z \in Z_n$ and take $(x_1, x_2, \ldots, x_n) \in L_z^n \subset W^n_{\mathrm{trans}}$. By (4.1) and the fact that $S_z$ is not a singleton, $x_1, x_2, \ldots, x_n$ are pairwise different. Let $A = \{x_1, x_2, \ldots, x_n\}$ and $f : A \to A$ be any map. We have by (4.1) that $(f(x_1), f(x_2), \ldots, f(x_n)) \in S_z^{(n)} \subset W^n$. Since $(x_1, x_2, \ldots, x_n) \in W^n_{\mathrm{trans}}$, there exists an infinite sequence $\{n_i\} \subset \mathbb{N}$ such that

$$\lim_{i \to \infty} (T^{n_i}x_1, T^{n_i}x_2, \ldots, T^{n_i}x_n) = (f(x_1), f(x_2), \ldots, f(x_n)),$$

i.e., $\lim_{i \to \infty} T^{n_i}x = f(x)$ for all $x \in A$. This shows that $A$ is a full scrambled set of $(X, T)$.

It remains to show that there exists $y \in Y$ such that $\{x_1, x_2, \ldots, x_n\} \subseteq \pi^{-1}(y)$. If this is not true, then there exist two disjoint nonempty open subsets $U_1$ and $U_2$ of $Y$ such that $\{\pi(x_i)\}_{i=1}^{n} \subset U_1 \cup U_2$ and $\{\pi(x_i)\}_{i=1}^{n} \cap U_j \ne \emptyset$, $j = 1, 2$. For each $i = 1, 2, \ldots, n$, take $s(i) \in \{1, 2\}$ such that $x_i \in \pi^{-1}U_{s(i)}$. Since $(x_1, x_2, \ldots, x_n) \in \mathrm{supp}(\lambda_n(\mu))$ and $\prod_{i=1}^{n} \pi^{-1}U_{s(i)}$ is an open neighborhood of $(x_1, x_2, \ldots, x_n)$, we have

$$\lambda_n(\mu)\Big(\prod_{i=1}^{n} \pi^{-1}U_{s(i)}\Big) > 0.$$
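But by (3.9) and the facts that $\pi^{-1}U_{s(i)} \in P_\mu(\pi^{-1}\mathcal{B}_Y)$, $\pi^{-1}U_1 \cap \pi^{-1}U_2 = \emptyset$, and $\{s(1), \ldots, s(n)\} = \{1, 2\}$, we also have

$$\lambda_n(\mu)\Big(\prod_{i=1}^{n} \pi^{-1}U_{s(i)}\Big) = \int_X \prod_{i=1}^{n} E(1_{\pi^{-1}U_{s(i)}} \mid P_\mu(\pi^{-1}\mathcal{B}_Y))(x) \, d\mu(x) = \int_X \prod_{i=1}^{n} 1_{\pi^{-1}U_{s(i)}}(x) \, d\mu(x) = 0,$$

which is a contradiction. □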
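Lemma 4.2. Let $\pi : (X, T) \to (Y, S)$ be an extension between compact discrete flows. If $h_{\mathrm{top}}(X, T) > h_{\mathrm{top}}(Y, S)$, then for each integer $n \ge 2$ there exist $y \in Y$ and $x_1, x_2, \ldots, x_n \in \pi^{-1}(y)$ such that $x_1, x_2, \ldots, x_n$ are pairwise different and $\{x_1, x_2, \ldots, x_n\}$ is a full scrambled set of $(X, T)$.

Proof. Since $h_{\mathrm{top}}(X, T) > h_{\mathrm{top}}(Y, S)$, $h_{\mathrm{top}}(Y, S) < +\infty$. By the variational principle of entropy there exists an ergodic measure $\mu$ of $(X, T)$ with $h_\mu(X, T) > h_{\mathrm{top}}(Y, S)$. Hence $\nu = \pi\mu \in \mathcal{M}^e(Y, S)$ and $h_\mu(X, T) > h_\nu(Y, S)$.

Let $\{\alpha_i\}_{i=1}^{\infty} \subset \mathcal{P}_X$, $\{\beta_j\}_{j=1}^{\infty} \subset \mathcal{P}_Y$ be such that $\alpha_1 \preceq \alpha_2 \preceq \cdots$, $\bigvee_{i=1}^{\infty} \alpha_i = \mathcal{B}_X$ (mod $\mu$), $\beta_1 \preceq \beta_2 \preceq \cdots$, and $\bigvee_{j=1}^{\infty} \beta_j = \mathcal{B}_Y$ (mod $\nu$). We have by (3.3) that

$$h_\mu(T, \alpha_i \vee \pi^{-1}\beta_j) = h_\mu(T, \pi^{-1}\beta_j) + H_\mu(\alpha_i \mid (\pi^{-1}\beta_j)^T \vee (\alpha_i)^-). \tag{4.2}$$

Since $h_\mu(T, \pi^{-1}\beta_j) = h_\nu(S, \beta_j)$, (4.2) yields that

$$H_\mu(\alpha_i \mid (\pi^{-1}\beta_j)^T \vee (\alpha_i)^-) \ge h_\mu(T, \alpha_i) - h_\nu(S, \beta_j) \ge h_\mu(T, \alpha_i) - h_\nu(Y, S). \tag{4.3}$$

Note that $(\pi^{-1}\beta_j)^T \vee (\alpha_i)^- \nearrow \pi^{-1}\mathcal{B}_Y \vee (\alpha_i)^-$ as $j \to \infty$. Taking $j \to \infty$ in (4.3), we have by the Martingale theorem that $H_\mu(\alpha_i \mid \pi^{-1}\mathcal{B}_Y \vee (\alpha_i)^-) \ge h_\mu(T, \alpha_i) - h_\nu(Y, S)$. Hence

$$h_\mu(X, T \mid Y) \ge h_\mu(T, \alpha_i \mid Y) = H_\mu(\alpha_i \mid \pi^{-1}\mathcal{B}_Y \vee (\alpha_i)^-) \ge h_\mu(T, \alpha_i) - h_\nu(Y, S). \tag{4.4}$$

Taking $i \to \infty$ in (4.4), we have by the Kolmogorov–Sinai theorem that $h_\mu(X, T \mid Y) \ge h_\mu(X, T) - h_\nu(Y, S) > 0$. The lemma now follows from Lemma 4.1. □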
When $h_\nu(Y, S) < +\infty$, it can be shown that $h_\mu(X, T \mid Y) = h_\mu(X, T) - h_\nu(Y, S)$. When $h_\nu(Y, S) = +\infty$, we note that $h_\mu(X, T) = h_\nu(Y, S) = +\infty$. But in this case, it can also happen that $h_\mu(X, T \mid Y) > 0$. Therefore the condition $h_\mu(X, T \mid Y) > 0$ in Lemma 4.1 is more general than the condition $h_\mu(X, T) > h_\nu(Y, S)$ in Lemma 4.2.

Given an integer $n \ge 2$, we denote by $\mathrm{Per}_n(X)$ the set of all coordinate permutations on $X^{(n)}$. An $n$-partial order relation $R$ on $X$ is a subset of $X^{(n)}$ such that
(a) $\tau_0(R) \cap R = \emptyset$ for some $\tau_0 \in \mathrm{Per}_n(X)$;
(b) $R$ is essentially closed, i.e., for any $\{w_k\}_{k=1}^{\infty} \subset R$ and $w = (x_1, x_2, \ldots, x_n) \in X^{(n)}$, if $\lim_{k \to \infty} w_k = w$ and $x_i \ne x_j$ for $1 \le i \ne j \le n$, then $w \in R$.

We say that a compact flow $(X, \mathbb{T}) = (X, \{\Pi_t\}_{t \in \mathbb{T}})$ preserves an $n$-partial order $R$ if

$$\underbrace{\Pi_t \times \Pi_t \times \cdots \times \Pi_t}_{n}(R) \subseteq R$$

for all $t > 0$. Also, we refer to the relation $PR^n_e(X, \mathbb{T}) = \{(x_1, x_2, \ldots, x_n) : x_i \ne x_j,\ 1 \le i < j \le n, \text{ and } \liminf_{t \to +\infty} \mathrm{diam}(\{x_1 \cdot t, x_2 \cdot t, \ldots, x_n \cdot t\}) = 0\}$ as the proper $n$-proximal relation of $(X, \mathbb{T})$.

Theorem 4.1. Let $\pi : (X, \mathbb{T}) \to (Y, \mathbb{T})$ be an extension between compact flows. If for some integer $n \ge 2$, the flow $(X, \mathbb{T})$ preserves an $n$-partial order relation $R$ on $X$ and $PR^n_e(X, \mathbb{T}) \cap R^n_\pi \subseteq \bigcup_{\tau \in \mathrm{Per}_n(X)} \tau(R)$, where $R^n_\pi = \{(x_1, x_2, \ldots, x_n) \in X^{(n)} : \pi(x_1) = \cdots = \pi(x_n)\}$, then $h_{\mathrm{top}}(X, \mathbb{T}) = h_{\mathrm{top}}(Y, \mathbb{T})$.

Proof. Let $T$ and $S$ be the time-1 maps of $(X, \mathbb{T})$ and $(Y, \mathbb{T})$ respectively. If $h_{\mathrm{top}}(X, T) \ne h_{\mathrm{top}}(Y, S)$, then $h_{\mathrm{top}}(X, T) > h_{\mathrm{top}}(Y, S)$. It follows from Lemma 4.2 that there exist $y \in Y$ and $x_1, x_2, \ldots, x_n \in \pi^{-1}(y)$ such that $x_i \ne x_j$, $1 \le i \ne j \le n$, and $\{x_1, x_2, \ldots, x_n\}$ is a full scrambled set of $(X, T)$. Clearly, $(x_1, x_2, \ldots, x_n) \in PR^n_e(X, \mathbb{T}) \cap R^n_\pi \subseteq \bigcup_{\tau \in \mathrm{Per}_n(X)} \tau(R)$. Without loss of generality, we assume that $(x_1, x_2, \ldots, x_n) \in R$. Since $R$ is an $n$-partial order relation, there exists $\tau_0 \in \mathrm{Per}_n(X)$ such that $\tau_0(R) \cap R = \emptyset$. Denote $\tau_0(x_1, x_2, \ldots, x_n) = (x_1', x_2', \ldots, x_n')$. Then $x_i' \ne x_j'$ for all $1 \le i \ne j \le n$. Since $\{x_1, x_2, \ldots, x_n\}$ is a full scrambled set of $(X, T)$, there exists a sequence $\{n_i\} \subset \mathbb{N}$ such that

$$\lim_{i \to \infty} (T^{n_i}x_1, T^{n_i}x_2, \ldots, T^{n_i}x_n) = (x_1', x_2', \ldots, x_n').$$

Note that $(T^{n_i}x_1, T^{n_i}x_2, \ldots, T^{n_i}x_n) \in R$ and $x_i' \ne x_j'$ for all $1 \le i \ne j \le n$. We have by the essential closedness of $R$ that $(x_1', x_2', \ldots, x_n') \in R$. Now $(x_1', x_2', \ldots, x_n') \in \tau_0(R)$, a contradiction to the fact that $\tau_0(R) \cap R = \emptyset$. Hence $h_{\mathrm{top}}(X, T) = h_{\mathrm{top}}(Y, S)$. □

Corollary 4.1. Let $(X, \mathbb{T})$ be a compact flow which preserves an $n$-partial order relation $R$ on $X$ for some integer $n \ge 2$. If $PR^n_e(X, \mathbb{T}) \subseteq \bigcup_{\tau \in \mathrm{Per}_n(X)} \tau(R)$, then $h_{\mathrm{top}}(X, \mathbb{T}) = 0$.

Proof. It follows from Theorem 4.1 by taking $(Y, \mathbb{T})$ as the trivial flow. □
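Condition (a) can be made concrete with the counter-clockwise cyclic order on the circle, which is the kind of relation used for circle flows in Section 4.2. The sketch below works with angles only (the base coordinate $y$ is ignored) and on a finite grid of exact fractions; `in_R` and `swap_first_two` are hypothetical names, and the check is an illustration rather than a proof.

```python
from fractions import Fraction
from itertools import product

def in_R(a, b, c):
    """True if the angles a, b, c (fractions of a full turn in [0, 1)) are
    pairwise distinct and met in this order going counter-clockwise from a,
    i.e. they admit lifts phi1 < phi2 < phi3 < 1 + phi1."""
    return 0 < (b - a) % 1 < (c - a) % 1

def swap_first_two(w):
    """A coordinate permutation tau_0: (x1, x2, x3) -> (x2, x1, x3)."""
    return (w[1], w[0], w[2])

grid = [Fraction(k, 12) for k in range(12)]
triples = list(product(grid, repeat=3))
distinct = [w for w in triples if len(set(w)) == 3]

# Condition (a): no triple lies in both R and tau_0(R).
print(any(in_R(*w) and in_R(*swap_first_two(w)) for w in triples))
# Every triple of pairwise distinct angles lies in R or in tau_0(R).
print(all(in_R(*w) or in_R(*swap_first_two(w)) for w in distinct))
# Rotations (orientation preserving) map R into R.
t = Fraction(5, 12)
print(all(in_R(*[(x + t) % 1 for x in w]) for w in triples if in_R(*w)))
```

Swapping two coordinates reverses the cyclic order, which is why a single transposition already separates $R$ from $\tau_0(R)$.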
4.2. Zero entropy of APCFs

The following theorem immediately implies Theorem 1.

Theorem 4.2. For a SPCF $(S^1 \times Y, \mathbb{T})$, $h_{\mathrm{top}}(S^1 \times Y, \mathbb{T}) = h_{\mathrm{top}}(Y, \mathbb{T})$.

Proof. Let $\pi : S^1 \times Y \to Y$ be the natural projection. Clearly, $\pi : (S^1 \times Y, \mathbb{T}) \to (Y, \mathbb{T})$ is a flow extension. Consider

$$R = \{((e^{2\pi\varphi_1 i}, y), (e^{2\pi\varphi_2 i}, y), (e^{2\pi\varphi_3 i}, y)) : y \in Y \text{ and } \varphi_1 < \varphi_2 < \varphi_3 < 1 + \varphi_1\}.$$

It is clear that $R$ is a 3-partial order relation on $X = S^1 \times Y$ which is preserved by $(X, \mathbb{T})$ and $PR^3_e(X, \mathbb{T}) \cap R^3_\pi \subseteq \bigcup_{\tau \in \mathrm{Per}_3(X)} \tau(R)$. It follows from Theorem 4.1 that $h_{\mathrm{top}}(S^1 \times Y, \mathbb{T}) = h_{\mathrm{top}}(Y, \mathbb{T})$. □

5. Li–Yorke chaos and proximality

5.1. General conditions on the existence of Li–Yorke chaos

The following lemmas will be needed in the proof of our general result on Li–Yorke chaos of a proximal extension.

Lemma 5.1. Let $X$ and $Y$ be compact metric spaces, $\pi : X \to Y$ be a semi-open, surjective, continuous map, and $K$ be a residual subset of $X$. Then

$$A = A_K = \{y \in Y : K \cap \pi^{-1}(y) \text{ is a residual subset of } \pi^{-1}(y)\}$$

is a residual subset of $Y$.

Proof. See Proposition 3.1 in [59]. □
Let $X$ be a compact metric space. A subset $K \subseteq X$ is called a Mycielski set if it is a countable union of Cantor sets. The following result is a special case of the Mycielski theorem [44].

Lemma 5.2. Let $X$ be a compact metric space with no isolated point. If $R$ is a residual subset of $X \times X$, then there exists a Mycielski set $K$ of $X$ which is dense in $X$ such that for any two distinct points $x, y$ in $K$, $(x, y) \in R$.

Proof. See Theorem 1 in [44] or Lemma 2.6 in [8]. □
The following result is more or less known for maps [1] but the proof does not automatically carry over to the case of $\mathbb{R}$-flows.

Theorem 5.1. Let $\pi : (X, \mathbb{T}) \to (Y, \mathbb{T})$ be a proximal extension of minimal flows which is not almost 1–1. Then there exists a residual subset $Y_c$ of $Y$ such that each fiber $\pi^{-1}(y)$, $y \in Y_c$, admits an uncountable scrambled set. In particular, $(X, \mathbb{T})$ is Li–Yorke chaotic.

Proof. Let $Y_0$, $Y^*$, $X^*$ and $\pi'$, $\tau$, $\tau'$ be as in Proposition 3.5. Recall that $\pi'$ is an open extension. Let $y^* \in Y^*$ and $x, x' \in y^*$. Since $\pi(x) = \pi(x') = \tau(y^*)$ and $\pi$ is a proximal extension, $x, x'$ are proximal, and so are $(x, y^*), (x', y^*)$. This shows that $\pi'$ is a proximal extension as well. It follows from Proposition 3.3 that $(R_{\pi'}, \mathbb{T})$ is positively transitive. Thus $\mathrm{Tran}^+(R_{\pi'})$ is a residual subset of $R_{\pi'}$. Since $\rho : R_{\pi'} \to Y^* : ((x, y^*), (x', y^*)) \mapsto y^*$ is an open map, it follows from Lemma 5.1 that there is a residual subset $Y_c^* \subseteq Y^*$ such that for every point $y^* \in Y_c^*$, $\mathrm{Tran}^+(R_{\pi'}) \cap ((\pi')^{-1}(y^*) \times (\pi')^{-1}(y^*))$ is a residual subset of $(\pi')^{-1}(y^*) \times (\pi')^{-1}(y^*)$. Let $Y_c = \tau(Y_c^*)$. Since by Proposition 3.4 $\tau$ is semi-open, Lemma 5.1 implies that $Y_c$ is a residual set.

Fix $y_0 \in Y_c$ and let $y_0^* \in Y_c^*$ be such that $\tau(y_0^*) = y_0$, i.e., $y_0^* \subset \pi^{-1}(y_0)$. We first claim that the closed subset $(\pi')^{-1}(y_0^*) = \{(x, y_0^*) : x \in y_0^*\}$ has no isolated point. Suppose for contradiction that there exists $x \in y_0^*$ such that $(x, y_0^*)$ is an isolated point of $(\pi')^{-1}(y_0^*)$. Obviously, $\{((x, y_0^*), (x, y_0^*))\}$ is a relatively open subset of $(\pi')^{-1}(y_0^*) \times (\pi')^{-1}(y_0^*)$, hence $((x, y_0^*), (x, y_0^*)) \in \mathrm{Tran}^+(R_{\pi'})$ as $\mathrm{Tran}^+(R_{\pi'}) \cap ((\pi')^{-1}(y_0^*) \times (\pi')^{-1}(y_0^*))$ is a residual subset of $(\pi')^{-1}(y_0^*) \times (\pi')^{-1}(y_0^*)$. It follows that $R_{\pi'} = \{(x^*, x^*) : x^* \in X^*\}$, i.e., $\pi'$ is an isomorphism, and hence $(\pi')^{-1}(y^*)$ is a singleton for each $y^* \in Y^*$. In particular, $\pi^{-1}(y)$ is a singleton for each $y \in Y_0$, i.e., $\pi$ is an almost 1–1 extension, a contradiction.

Next, by applying Lemma 5.2 with $R = \mathrm{Tran}^+(R_{\pi'}) \cap ((\pi')^{-1}(y_0^*) \times (\pi')^{-1}(y_0^*))$, we obtain a Mycielski set $K^*_{y_0} \subset (\pi')^{-1}(y_0^*)$ such that for any two distinct points $x^*, x_1^*$ in $K^*_{y_0}$, $(x^*, x_1^*) \in \mathrm{Tran}^+(R_{\pi'})$. Now let $K_{y_0} = \{x \in X : (x, y_0^*) \in K^*_{y_0}\}$. Since $K_{y_0}$ and $K^*_{y_0}$ are homeomorphic, $K_{y_0} \subseteq \pi^{-1}(y_0)$ is also a Mycielski set, hence it is uncountable.

It remains to show that $K_{y_0}$ is a scrambled subset of $(X, \mathbb{T})$. Let $x, x_1$ be any two distinct points in $K_{y_0}$. We note that $((x, y_0^*), (x_1, y_0^*)) \in \mathrm{Tran}^+(R_{\pi'})$. Since both $((x, y_0^*), (x, y_0^*))$ and $((x, y_0^*), (x_1, y_0^*))$ are in $R_{\pi'}$, there are positive increasing sequences $t_i \to +\infty$, $s_j \to +\infty$ such that

$$\lim_{i \to \infty} ((x, y_0^*), (x_1, y_0^*)) \cdot t_i = ((x, y_0^*), (x_1, y_0^*)), \qquad \lim_{j \to \infty} ((x, y_0^*), (x_1, y_0^*)) \cdot s_j = ((x, y_0^*), (x, y_0^*)).$$

This implies that $\lim_{i \to \infty} (x, x_1) \cdot t_i = (x, x_1)$ and $\lim_{j \to \infty} (x, x_1) \cdot s_j = (x, x)$, i.e., $\{x, x_1\}$ is a Li–Yorke pair of $(X, \mathbb{T})$. Hence $K_{y_0}$ is an uncountable scrambled set of $X$. This completes the proof. □

An extension $\pi : (X, \mathbb{T}) \to (Y, \mathbb{T})$ between compact flows is said to be positively asymptotic if for each $y \in Y$, any two points $x, x' \in \pi^{-1}(y)$ are positively asymptotic, i.e., $\lim_{t \to +\infty} d(x \cdot t, x' \cdot t) = 0$, where $d$ is a compatible metric on $X$.

Corollary 5.1. Let $\pi : (X, \mathbb{T}) \to (Y, \mathbb{T})$ be a positively asymptotic extension between minimal flows. Then $\pi$ is an almost 1–1 extension.

Proof. Obviously, $\pi$ is a proximal extension. If $\pi$ is not almost 1–1, then by Theorem 5.1 there exists an uncountable scrambled set $K_y \subset \pi^{-1}(y)$ for some $y \in Y$. In particular, there exist $x_1, x_2 \in K_y \subset \pi^{-1}(y)$ which are not positively asymptotic, a contradiction. □

5.2. A strict dynamical dichotomy of minimal sets

We now consider a SPCF $(S^1 \times Y, \mathbb{T}) = (S^1 \times Y, \{\Lambda_t\}_{t \in \mathbb{T}})$ in the form (1.1), i.e.,

$$\Lambda_t(s_0, y_0) = (\psi(s_0, y_0, t), y_0 \cdot t), \quad t \in \mathbb{T}.$$

We denote by $d_Y$ a compatible metric on $Y$ and by $\pi : S^1 \times Y \to Y$ the natural projection. For $s_1 \ne s_2 \in S^1$, we denote by $[s_1, s_2]$ the closed arc from $s_1$ to $s_2$ oriented counter-clockwise, and let $(s_1, s_2) = [s_1, s_2] \setminus \{s_1, s_2\}$. We also denote $[s, s] = \{s\}$ for any $s \in S^1$. For a fixed $y_0 \in Y$, consider the family of maps $f_t : S^1 \to S^1 : s \mapsto \psi(s, y_0, t)$, $t \in \mathbb{T}$. Then each $f_t$ is an orientation preserving homeomorphism of $S^1$.
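The arc convention can be made concrete by writing points of $S^1$ as fractions of a full turn in $[0, 1)$. This parametrization and the helper names `in_closed_arc` and `arc_length` are assumptions made for this sketch; the paper itself works with points of $S^1$ directly.

```python
from fractions import Fraction

def in_closed_arc(s, s1, s2):
    """True if s lies on the closed arc [s1, s2] traversed counter-clockwise
    from s1 to s2; with s1 == s2 this reduces to [s, s] = {s}."""
    return (s - s1) % 1 <= (s2 - s1) % 1

def arc_length(s1, s2):
    """Counter-clockwise length of [s1, s2] as a fraction of a full turn."""
    return (s2 - s1) % 1

s1, s2 = Fraction(3, 4), Fraction(1, 4)
grid = [Fraction(k, 8) for k in range(8)]
A1 = [s for s in grid if in_closed_arc(s, s1, s2)]   # the arc passing through 0
A2 = [s for s in grid if in_closed_arc(s, s2, s1)]   # the complementary arc
print(A1)
print(A2)
print(arc_length(s1, s2) + arc_length(s2, s1))
```

For distinct endpoints the two arcs $[s_1, s_2]$ and $[s_2, s_1]$ cover the circle and share only their endpoints, which is the decomposition $A_1 \cup A_2 = S^1$ used when comparing the two complementary arcs.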
Lemma 5.3. Let $s_1 \ne s_2 \in S^1$ and $\{t_j\} \subset \mathbb{T}$ be such that $\lim_{j \to \infty} f_{t_j}(s_i) = s_i'$ for some $s_i' \in S^1$, $i = 1, 2$. Then the following hold.
(1) If $s_1' \ne s_2'$, then $\lim_{j \to \infty} f_{t_j}([s_1, s_2]) = [s_1', s_2']$ under the Hausdorff metric on $2^{S^1}$.
(2) If $s_1' = s_2' = s'$, then either $\lim_{j \to \infty} f_{t_j}([s_1, s_2]) = \{s'\}$ or $\lim_{j \to \infty} f_{t_j}([s_2, s_1]) = \{s'\}$ by taking subsequences if necessary.
(3) If $s_1' = s_2' = s'$ and $\limsup_{j \to \infty} f_{t_j}([s_1, s_2]) \ne S^1$, then $\lim_{j \to \infty} f_{t_j}([s_1, s_2]) = \{s'\}$.
(4) If $\lim_{t \to +\infty} |f_t(s_1) - f_t(s_2)| = 0$, then there exists $t_0 > 0$ such that, as $t \ge t_0$, $|f_t(s_1^*) - f_t(s_2^*)| \le |f_t(s_1) - f_t(s_2)|$ for either all $s_1^*, s_2^* \in [s_1, s_2]$ or all $s_1^*, s_2^* \in [s_2, s_1]$. In particular, $\lim_{t \to +\infty} |f_t(s_1^*) - f_t(s_2^*)| = 0$ for either all $s_1^*, s_2^* \in [s_1, s_2]$ or all $s_1^*, s_2^* \in [s_2, s_1]$.

Proof. (1)–(3) are obvious.

(4) Denote $A_1 = [s_1, s_2]$ and $A_2 = [s_2, s_1]$. We let $0 < \epsilon_0 < \frac{\mathrm{diam}(S^1)}{3}$ be such that if $|s - s'| \le \epsilon_0$ then $|\psi(s, y, r) - \psi(s', y, r)| \le \frac{\mathrm{diam}(S^1)}{3}$ for all $y \in Y$ and $r \in [0, 1]$. We also let $t_0 > 0$ be such that $|f_t(s_1) - f_t(s_2)| \le \epsilon_0$ for all $t \ge t_0$. Then for any $t \ge t_0$, there exists $i(t) = 1$ or $2$ such that $\mathrm{diam}(f_t(A_{i(t)})) \le |f_t(s_1) - f_t(s_2)| \le \epsilon_0$. Since $\mathrm{diam}(A_1 \cup A_2) = \mathrm{diam}(S^1)$, we have

$$\mathrm{diam}(f_t(S^1 \setminus A_{i(t)})) \ge \mathrm{diam}(S^1) - \epsilon_0 > \frac{2}{3}\mathrm{diam}(S^1)$$

for all $t \ge t_0$. In the following, we show that $i(t)$ equals a constant, say $i_0 = 1$ or $2$, on $[t_0, +\infty) \cap \mathbb{T}$. If this is not true, then there exist $t_1 \in [t_0, +\infty) \cap \mathbb{T}$ and $r \in (0, 1] \cap \mathbb{T}$ such that $i(t_1) \ne i(t_1 + r)$. On one hand, since $i(t_1) \ne i(t_1 + r)$, we have

$$\mathrm{diam}(f_{t_1 + r}(A_{i(t_1)})) \ge \mathrm{diam}(f_{t_1 + r}(S^1 \setminus A_{i(t_1 + r)})) \ge \mathrm{diam}(S^1) - \epsilon_0 > \frac{2}{3}\mathrm{diam}(S^1).$$

But on the other hand, since $\mathrm{diam}(f_{t_1}(A_{i(t_1)})) \le \epsilon_0$, we have

$$\mathrm{diam}(f_{t_1 + r}(A_{i(t_1)})) = \mathrm{diam}\{\psi(s, y_0 \cdot t_1, r) : s \in f_{t_1}(A_{i(t_1)})\} \le \frac{\mathrm{diam}(S^1)}{3}.$$

This is a contradiction. Now for any $s_1^*, s_2^* \in A_{i_0}$ and $t \ge t_0$, we have $|f_t(s_1^*) - f_t(s_2^*)| \le \mathrm{diam}(f_t(A_{i(t)})) \le |f_t(s_1) - f_t(s_2)|$. □

We now assume that the base flow $(Y, \mathbb{T})$ is minimal in the SPCF $(S^1 \times Y, \mathbb{T})$. Let $X$ be a minimal set of $(S^1 \times Y, \mathbb{T})$, $Y_0$ be the set of continuity points of $\pi^{-1} : Y \to 2^X : y \mapsto \pi^{-1}(y)$, and $(X^*, \mathbb{T})$, $(Y^*, \mathbb{T})$, $\tau$, $\tau'$, $\pi'$ be defined as in Proposition 3.5 with respect to the extension $\pi : (X, \mathbb{T}) \to (Y, \mathbb{T})$. Recall that $Y_0$ is an invariant residual subset of $Y$, $Y^* = \mathrm{cl}\{\pi^{-1}(y) : y \in Y_0\}$, $X^* = \{(x, y^*) \in X \times Y^* : x \in y^*\}$, $(X^*, \mathbb{T})$ is a minimal flow, $\tau : (Y^*, \mathbb{T}) \to (Y, \mathbb{T})$ and $\tau' : (X^*, \mathbb{T}) \to (X, \mathbb{T})$ are almost 1–1 extensions (hence $(Y^*, \mathbb{T})$ is point-distal if $(Y, \mathbb{T})$ is), and $\pi' : (X^*, \mathbb{T}) \to (Y^*, \mathbb{T})$ is an open extension. Let $Z = S^1 \times Y^*$ and define the skew-product flow $(Z, \mathbb{T}) = (Z, \{\Pi_t^Z\}_{t \in \mathbb{T}})$ by

$$\Pi_t^Z(s, y^*) = (\psi(s, \tau(y^*), t), y^* \cdot t).$$
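Consider $\rho : X^* \to Z : ((s, \tau(y^*)), y^*) \mapsto (s, y^*)$ and let $Z^* = \rho(X^*)$, endowed with the metric
\[
d_{Z^*}\bigl((s_1, y_1^*), (s_2, y_2^*)\bigr) = |s_1 - s_2| + d_{Y^*}(y_1^*, y_2^*), \qquad (s_1, y_1^*), (s_2, y_2^*) \in Z^*,
\]
where $d_{Y^*}$ is a compatible metric on $Y^*$. Since for any $((s, \tau(y^*)), y^*) \in X^*$ and $t \in T$,
\begin{align*}
\rho\bigl(((s, \tau(y^*)), y^*) \cdot t\bigr) &= \rho\bigl(\Lambda_t(s, \tau(y^*)),\ y^* \cdot t\bigr) = \rho\bigl((\psi(s, \tau(y^*), t),\ \tau(y^*) \cdot t),\ y^* \cdot t\bigr) \\
&= \bigl(\psi(s, \tau(y^*), t),\ y^* \cdot t\bigr) = \Pi_t^Z(s, y^*) = \Pi_t^Z \rho\bigl((s, \tau(y^*)), y^*\bigr),
\end{align*}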
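we see that $\rho : (X^*, T) \to (Z^*, T)$ is a flow isomorphism. Hence $(Z^*, T)$ is a minimal flow. Let $\pi^* : Z \to Y^*$ be the natural projection and denote $\pi_1 = \pi^*|_{Z^*}$.

Lemma 5.4. The following diagram
\[
\begin{array}{ccccc}
(X, T) & \xleftarrow{\ \tau'\ } & (X^*, T) & \xrightarrow{\ \rho\ } & (Z^*, T) \\
\downarrow{\scriptstyle \pi} & & \downarrow{\scriptstyle \pi'} & & \downarrow{\scriptstyle \pi_1} \\
(Y, T) & \xleftarrow{\ \tau\ } & (Y^*, T) & = & (Y^*, T)
\end{array}
\]
commutes, where $\tau, \tau'$ are almost 1–1, $\pi', \pi_1$ are open, and $\rho$ is 1–1.

Proof. Since $\pi'$ is open, so is $\pi_1 = \pi' \circ \rho^{-1}$. With Proposition 3.5, we only need to check the commutativity of the right half of the diagram. Let $((s, \tau(y^*)), y^*) \in X^*$. Then
\[
\pi_1 \rho\bigl((s, \tau(y^*)), y^*\bigr) = \pi_1(s, y^*) = y^* = \pi'\bigl((s, \tau(y^*)), y^*\bigr). \qquad \Box
\]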
Proposition 5.1. Suppose that the base flow $(Y, T)$ of the SPCF $(S^1 \times Y, T)$ is point-distal. If there exists a second category subset $Y_u^*$ of $Y^*$ such that for each $y^* \in Y_u^*$ there exists no uncountable scrambled set in $\pi_1^{-1}(y^*)$, then $(Z^*, T)$ is point-distal.

Proof. Let $A$ denote the collection of all proper, closed sub-arcs of $S^1$ with end points being roots of unity and consider the set
\[
D = \{(I_1, I_2) : I_1, I_2 \in A \text{ and } I_2 \subset \operatorname{int}(I_1)\}.
\]
It is clear that $D$ is countable. Since $(Y^*, T)$ is point-distal, the set $Y_d^*$ of distal points of $Y^*$ is a residual subset. Clearly, $Y_w^* = Y_u^* \cap Y_d^*$ is a second category subset of $Y^*$. For each $y^* \in Y^*$, we let $S(y^*) = \{s \in S^1 : (s, y^*) \in Z^*\}$. Then $S(y^*)$ is a closed subset of $S^1$. Since $\pi_1$ is open, the map $\theta : Y^* \to 2^{S^1} : y^* \mapsto S(y^*)$ is continuous. Define
\[
Y_i^* = \{y^* \in Y_w^* : S(y^*) \text{ contains an isolated point}\} \quad \text{and} \quad Y_p^* = Y_w^* \setminus Y_i^*.
\]
Since $Y_w^*$ is a second category subset of $Y^*$, either $Y_i^*$ or $Y_p^*$ is a second category subset of $Y^*$.
There are two cases to consider.

Case 1. $Y_i^*$ is a second category subset of $Y^*$.

We note that for each $y^* \in Y_i^*$, there exists $(I_1^{y^*}, I_2^{y^*}) \in D$ such that $S(y^*) \cap I_1^{y^*} = S(y^*) \cap I_2^{y^*}$ is a singleton. Thus the map $\Phi : Y_i^* \to D : y^* \mapsto (I_1^{y^*}, I_2^{y^*})$ is well defined. Since $Y_i^* = \bigcup_{(I_1, I_2) \in D} \Phi^{-1}(I_1, I_2)$, $D$ is countable, and $Y_i^*$ is a second category subset of $Y^*$, there exist $(I_1^0, I_2^0) \in D$ and a nonempty open subset $U$ of $Y^*$ such that $\overline{\Phi^{-1}(I_1^0, I_2^0)} \supseteq U$. Using the continuity of the map $\theta : Y^* \to 2^{S^1} : y^* \mapsto S(y^*)$, we have that for each $y^* \in \overline{\Phi^{-1}(I_1^0, I_2^0)}$, $S(y^*) \cap \operatorname{int}(I_1^0) = S(y^*) \cap I_2^0$ is a singleton. In particular, for each $y^* \in U$, $S(y^*) \cap \operatorname{int}(I_1^0) = S(y^*) \cap I_2^0$ is a singleton. Let $W = \operatorname{int}(I_1^0)$. Then $W \cap S(y^*)$ is a singleton for each $y^* \in U$.

Fix points $y^* \in Y^*$ and $s \in S(y^*)$. Then $(s, y^*) \in Z^*$. Since $(W \times U) \cap Z^*$ is a nonempty open subset of $Z^*$ and $(Z^*, T)$ is a minimal flow, there exists $t_0 \in T$ such that $\Pi_{t_0}^{Z^*}(s, y^*) \in W \times U$. Hence there exists an open neighborhood $V$ of $s$ in $S^1$ such that $\Pi_{t_0}^{Z^*}((V \times \{y^*\}) \cap Z^*) \subset (W \times U) \cap Z^*$. Since $\Pi_{t_0}^{Z^*}((V \times \{y^*\}) \cap Z^*) \subset S(y^* \cdot t_0) \times \{y^* \cdot t_0\}$ and $y^* \cdot t_0 \in U$, the set $\Pi_{t_0}^{Z^*}((V \times \{y^*\}) \cap Z^*) \subseteq (S(y^* \cdot t_0) \cap W) \times \{y^* \cdot t_0\}$ is a singleton. It follows that $(V \times \{y^*\}) \cap Z^*$ is a singleton, i.e., $s$ is an isolated point of $S(y^*)$. Thus, for each $y^* \in Y^*$, $S(y^*)$ is a discrete closed subset of $S^1$, hence a finite subset of $S^1$. This shows that $\pi_1 : (Z^*, T) \to (Y^*, T)$ is a finite to one and open extension of the point-distal flow $(Y^*, T)$. By Proposition 3.2(2), $(Z^*, T)$ is point-distal.
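The two topological facts used in Case 1 can be spelled out as follows. This is only a sketch of the standard Baire-type argument; the enumeration $(E_n)$ is a name introduced here for illustration and does not appear in the text.

```latex
% Assumption: (E_n)_{n \in \mathbb{N}} enumerates the sets \Phi^{-1}(I_1, I_2), (I_1, I_2) \in D.
\[
Y_i^* = \bigcup_{n \in \mathbb{N}} E_n \ \text{is of second category}
\ \Longrightarrow\ \exists\, n_0 :\ E_{n_0} \ \text{is not nowhere dense}
\ \Longleftrightarrow\ U := \operatorname{int}\bigl(\overline{E_{n_0}}\bigr) \neq \emptyset .
\]
% Otherwise every E_n would be nowhere dense and Y_i^* would be of first category.
\[
S(y^*) \ \text{closed and discrete in the compact space } S^1
\ \Longrightarrow\ |S(y^*)| < \infty .
\]
% An infinite subset of a compact space has an accumulation point, which would lie in the
% closed set S(y^*) and could not be isolated.
```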
Case 2. $Y_p^*$ is a second category subset of $Y^*$.

Let $y_0^* \in Y_p^*$ be fixed such that $S(y_0^*)$ contains no isolated point. Then $S(y_0^*)$ is a perfect subset of $S^1$. It is not hard to see that there exists $s_0 \in S(y_0^*)$ such that for any $\epsilon > 0$ sufficiently small, $[s_0, s_0 e^{2\pi i \epsilon}] \cap S(y_0^*)$ and $[s_0 e^{-2\pi i \epsilon}, s_0] \cap S(y_0^*)$ are two uncountable sets. Clearly, $z_0^* = (s_0, y_0^*) \in Z^*$. Let $y_0 = \tau(y_0^*)$ and $f_t : S^1 \to S^1 : s \mapsto \psi(s, y_0, t)$. Suppose for contradiction that $(Z^*, T)$ is not point-distal.

Claim 1. There exists $\epsilon_0 > 0$ such that if $(s_1, y^*), (s_2, y^*) \in Z^*$ are distal, then $\sup_{t \ge 0} |\psi(s_1, \tau(y^*), t) - \psi(s_2, \tau(y^*), t)| > \epsilon_0$.

Since $(Z^*, T)$ is not point-distal, we have by Proposition 3.1 that there exists $z_1^* = (s', y_1^*) \in Z^* \setminus \{z_0^*\}$ such that $z_1^*, z_0^*$ are positively proximal. In particular, $y_1^*, y_0^*$ are positively proximal. Since $y_0^* \in Y_d^*$, $y_1^* = y_0^*$. Hence $s' \neq s_0$ and there exist a sequence $t_i \to +\infty$ and a point $s'' \in S^1$ such that $\lim_{i\to\infty} f_{t_i}(s') = \lim_{i\to\infty} f_{t_i}(s_0) = s''$. It follows from Lemma 5.3(2) that there exists a subsequence $\{i_k\} \subset \{i\}$ such that either $\lim_{k\to\infty} f_{t_{i_k}}([s', s_0]) = \{s''\}$ or $\lim_{k\to\infty} f_{t_{i_k}}([s_0, s']) = \{s''\}$ under the Hausdorff metric. We let $B = [s', s_0]$ or $[s_0, s']$ be such that for any $s_1, s_2 \in B \cap S(y_0^*)$,
\[
\liminf_{t\to+\infty} d_{Z^*}\bigl((s_1, y_0^*) \cdot t, (s_2, y_0^*) \cdot t\bigr) = \lim_{k\to\infty} \bigl|f_{t_{i_k}}(s_1) - f_{t_{i_k}}(s_2)\bigr| = 0.
\]
By the choice of $s_0$, $B \cap S(y_0^*)$ is an uncountable set. For each $(s, y^*) \in Z^*$, we consider the set
\[
AR_+(s, y^*) = \Bigl\{s_1 \in S(y^*) : \lim_{t\to+\infty} d_{Z^*}\bigl((s_1, y^*) \cdot t, (s, y^*) \cdot t\bigr) = 0\Bigr\}.
\]
Obviously, if $s \in S(y_0^*)$, then $AR_+(s, y_0^*) = \{s_1 \in S(y_0^*) : \lim_{t\to+\infty} |f_t(s_1) - f_t(s)| = 0\}$. Hence for any $s_1, s_2 \in S(y_0^*)$, either $AR_+(s_1, y_0^*) = AR_+(s_2, y_0^*)$ or $AR_+(s_1, y_0^*) \cap AR_+(s_2, y_0^*) = \emptyset$. It follows that there exists a set $I \subset B \cap S(y_0^*)$ such that
(1) for any $s \in I$, $B \cap AR_+(s, y_0^*) \neq \emptyset$;
(2) for any $s_1 \neq s_2 \in I$, $AR_+(s_1, y_0^*) \cap AR_+(s_2, y_0^*) = \emptyset$;
(3) $B \cap S(y_0^*) = \bigcup_{s \in I} (B \cap AR_+(s, y_0^*))$.
By (2) above, if $s_1 \neq s_2 \in I$, then $\limsup_{t\to+\infty} d_{Z^*}((s_1, y_0^*) \cdot t, (s_2, y_0^*) \cdot t) > 0$, hence $(s_1, y_0^*)$ and $(s_2, y_0^*)$ form a Li–Yorke pair. This shows that $I \times \{y_0^*\} \subseteq \pi_1^{-1}(y_0^*)$ is a scrambled set of $(Z^*, T)$. Since $\pi_1^{-1}(y_0^*)$ contains no uncountable scrambled set, $I$ must be countable. Note that $B \cap S(y_0^*)$ is uncountable. There must be a point $s^0 \in I$ such that $B \cap AR_+(s^0, y_0^*)$ is uncountable; in particular, $AR_+(s^0, y_0^*)$ is uncountable.

Let $\epsilon > 0$ be given. Since $AR_+(s^0, y_0^*)$ is uncountable, there exist $s_1^1, s_2^1 \in AR_+(s^0, y_0^*)$ such that $[s_1^1, s_2^1] \cap AR_+(s^0, y_0^*)$ and $[s_2^1, s_1^1] \cap AR_+(s^0, y_0^*)$ are both uncountable. Using Lemma 5.3(4) and the fact that $\lim_{t\to+\infty} d_{Z^*}((s_1^1, y_0^*) \cdot t, (s_2^1, y_0^*) \cdot t) = 0$, we have that there exists $t_0 \ge 0$ such that, for $t \ge t_0$,
\[
d_{Z^*}\bigl((s_1, y_0^*) \cdot t, (s_2, y_0^*) \cdot t\bigr) \le d_{Z^*}\bigl((s_1^1, y_0^*) \cdot t, (s_2^1, y_0^*) \cdot t\bigr)
\]
for all $s_1, s_2 \in [s_1^1, s_2^1]$ or all $s_1, s_2 \in [s_2^1, s_1^1]$. Let $[s_1^0, s_2^0] \subseteq [s_1^1, s_2^1]$ or $[s_2^1, s_1^1]$ be such that $[s_1^0, s_2^0] \cap S(y_0^*)$ is an uncountable set, $[s_1^0, s_2^0] \cap S(y_0^*) \subseteq AR_+(s^0, y_0^*)$, and $\sup_{t \ge 0} d_{Z^*}((s_1, y_0^*) \cdot t, (s_2, y_0^*) \cdot t) \le \epsilon$ for all $s_1, s_2 \in [s_1^0, s_2^0] \cap S(y_0^*)$. Also let $s_3^0 \in (s_1^0, s_2^0) \cap S(y_0^*)$ and $\epsilon_0 > 0$ be such that $\{s \in S^1 : |s - s_3^0| \le 2\epsilon_0\} \subset (s_1^0, s_2^0)$.

Let $(s_1, y^*), (s_2, y^*) \in Z^*$ be distal. Since $(Z^*, T)$ is minimal, there exists a positive sequence $t_n \to +\infty$ such that $\Pi_{t_n}^{Z^*}(s_1, y^*) \to (s_3^0, y_0^*)$ and $\Pi_{t_n}^{Z^*}(s_2, y^*) \to (s_4^0, y_0^*)$ for some $s_4^0 \in S(y_0^*)$. Since $(s_3^0, y_0^*), (s_4^0, y_0^*) \in Z^*$ are also distal, $s_4^0 \notin (s_1^0, s_2^0)$. Hence $|s_3^0 - s_4^0| \ge 2\epsilon_0$. It follows that $\sup_{t \ge 0} |\psi(s_1, \tau(y^*), t) - \psi(s_2, \tau(y^*), t)| \ge |s_3^0 - s_4^0| > \epsilon_0$. This proves Claim 1.

Claim 2. For each $\epsilon > 0$, there exist a nonempty open set $U \subset Y^*$ and a point $(I_1, I_2) \in D$ such that if $y^* \in U$, then $S(y^*) \cap I_2 \neq \emptyset$ and $\sup_{t \ge 0} d_{Z^*}((s_1, y^*) \cdot t, (s_2, y^*) \cdot t) \le \epsilon$ for all $s_1, s_2 \in \operatorname{int}(I_1) \cap S(y^*)$.

Let $y^* \in Y_p^*$. Then $S(y^*)$ contains no isolated point and no uncountable scrambled set. Similar to the proof of Claim 1, there are $s_1^{y^*} \neq s_2^{y^*} \in S(y^*)$ such that $[s_1^{y^*}, s_2^{y^*}] \cap S(y^*)$ is an uncountable set, $[s_1^{y^*}, s_2^{y^*}] \cap S(y^*) \subseteq AR_+(s, y^*)$ for some $s \in S(y^*)$, and $\sup_{t \ge 0} d_{Z^*}((s_1, y^*) \cdot t, (s_2, y^*) \cdot t) \le \epsilon$
for all $s_1, s_2 \in [s_1^{y^*}, s_2^{y^*}] \cap S(y^*)$. Hence there exists $(I_1^{y^*}, I_2^{y^*}) \in D$ such that $I_2^{y^*} \cap S(y^*) \neq \emptyset$ and $\sup_{t \ge 0} d_{Z^*}((s_1, y^*) \cdot t, (s_2, y^*) \cdot t) \le \epsilon$ for any $s_1, s_2 \in I_1^{y^*} \cap S(y^*)$. We define the map $\Phi : Y_p^* \to D$ as $\Phi(y^*) = (I_1^{y^*}, I_2^{y^*})$, $y^* \in Y_p^*$. Since $Y_p^* = \bigcup_{(I_1, I_2) \in D} \Phi^{-1}(I_1, I_2)$, $D$ is countable, and $Y_p^*$ is a second category subset of $Y^*$, we have that there exist $(I_1, I_2) \in D$ and a nonempty open subset $U$ of $Y^*$ such that $\overline{\Phi^{-1}(I_1, I_2)} \supseteq U$. We note that for any $y^* \in \Phi^{-1}(I_1, I_2)$, $I_2 \cap S(y^*) \neq \emptyset$ and $\sup_{t \ge 0} d_{Z^*}((s_1, y^*) \cdot t, (s_2, y^*) \cdot t) \le \epsilon$ for any $s_1, s_2 \in I_1 \cap S(y^*)$. It follows from the continuity of the map $\theta : Y^* \to 2^{S^1} : y^* \mapsto S(y^*)$ that for each $y^* \in \overline{\Phi^{-1}(I_1, I_2)}$, $S(y^*) \cap I_2 \neq \emptyset$, and $\sup_{t \ge 0} d_{Z^*}((s_1, y^*) \cdot t, (s_2, y^*) \cdot t) \le \epsilon$
for any $s_1, s_2 \in \operatorname{int}(I_1) \cap S(y^*)$. In particular, for each $y^* \in U$, we have $S(y^*) \cap I_2 \neq \emptyset$ and $\sup_{t \ge 0} d_{Z^*}((s_1, y^*) \cdot t, (s_2, y^*) \cdot t) \le \epsilon$ for any $s_1, s_2 \in \operatorname{int}(I_1) \cap S(y^*)$. This proves Claim 2.

Claim 3. If $(s_1, y^*), (s_2, y^*) \in Z^*$ are proximal, then $s_1 \in AR_+(s_2, y^*)$. Moreover, for each $(s, y^*) \in Z^*$, $AR_+(s, y^*)$ is an open subset of $S(y^*)$.

Let $(s_1, y^*), (s_2, y^*) \in Z^*$ be proximal. For any $\epsilon > 0$, we let $U \subset Y^*$ and $(I_1, I_2) \in D$ be as in Claim 2. Then for each $y_1^* \in U$, we have $S(y_1^*) \cap I_2 \neq \emptyset$ and $\sup_{t \ge 0} d_{Z^*}((s_1, y_1^*) \cdot t, (s_2, y_1^*) \cdot t) \le \epsilon$ for all $s_1, s_2 \in \operatorname{int}(I_1) \cap S(y_1^*)$. Since $(Z^*, T)$ is minimal and $(\operatorname{int}(I_1) \times U) \cap Z^*$ is an open subset of $Z^*$, there exists $t_1 \in T$ such that $(s_1, y^*) \cdot t_1, (s_2, y^*) \cdot t_1 \in (\operatorname{int}(I_1) \times U) \cap Z^*$. Hence
\[
\sup_{t \ge 0} d_{Z^*}\bigl((s_1, y^*) \cdot (t_1 + t), (s_2, y^*) \cdot (t_1 + t)\bigr) \le \epsilon .
\]
Since $\epsilon > 0$ is arbitrary, $s_1 \in AR_+(s_2, y^*)$.

For a fixed $\epsilon \in (0, \epsilon_0)$, we let $y_1^* \in Y^*$ and $s_1, s_2 \in \operatorname{int}(I_1) \cap S(y_1^*)$. We have by Claims 1 and 2 that $(s_1, y_1^*), (s_2, y_1^*)$ are proximal. Repeating the above arguments for $(s_1, y_1^*), (s_2, y_1^*)$ in place of $(s_1, y^*), (s_2, y^*)$ respectively, we conclude that $s_1 \in AR_+(s_2, y_1^*)$. Hence $\operatorname{int}(I_1) \cap S(y_1^*) \subset AR_+(s, y_1^*)$ for each $s \in \operatorname{int}(I_1) \cap S(y_1^*)$. Let $(s, y^*) \in Z^*$ and $t \in T$ be such that $(s, y^*) \cdot t \in (\operatorname{int}(I_1) \times U) \cap Z^*$, i.e., $\psi(s, \tau(y^*), t) \in \operatorname{int}(I_1)$ and $y^* \cdot t \in U$. Clearly, there exists an open neighborhood $V$ of $s$ in $S^1$ such that
\[
\bigl\{\psi(s', \tau(y^*), t) : s' \in V \cap S(y^*)\bigr\} \subset \operatorname{int}(I_1) \cap S(y^* \cdot t) \subset AR_+\bigl((s, y^*) \cdot t\bigr).
\]
This implies that $V \cap S(y^*) \subset AR_+(s, y^*)$, i.e., $AR_+(s, y^*)$ is an open subset of $S(y^*)$.

Claim 4. $\pi_1 : (Z^*, T) \to (Y^*, T)$ is a finite to one extension.

Since for any $s_1, s_2 \in S(y_0^*)$, either $AR_+(s_1, y_0^*) = AR_+(s_2, y_0^*)$ or $AR_+(s_1, y_0^*) \cap AR_+(s_2, y_0^*) = \emptyset$, there exists $J \subset S(y_0^*)$ such that $\{AR_+(s, y_0^*)\}_{s \in J}$ is a partition of $S(y_0^*)$. By Claim 3, for each $s \in J$, $AR_+(s, y_0^*)$ is an open subset of $S(y_0^*)$. So $\{AR_+(s, y_0^*)\}_{s \in J}$ is an open cover and also a partition of $S(y_0^*)$. Hence $J$ must be a finite set, say, $J = \{s_1, s_2, \ldots, s_n\}$. For each $i = 1, 2, \ldots, n$, since $AR_+(s_i, y_0^*) = S(y_0^*) \setminus \bigcup_{j \neq i} AR_+(s_j, y_0^*)$, we see that $AR_+(s_i, y_0^*)$ is also a closed subset of $S(y_0^*)$.

If $n = 1$, then for any $s_1, s_2 \in S(y_0^*)$, $(s_1, y_0^*), (s_2, y_0^*)$ are proximal. This implies that $\pi_1$ is a proximal extension. Since for each $y^* \in Y_u^*$ there exists no uncountable scrambled set in $\pi_1^{-1}(y^*)$, we have by Theorem 5.1 that $\pi_1$ is an almost 1–1 extension. Hence there exists $y_1^*$ such that $|S(y_1^*)| = |\pi_1^{-1}(y_1^*)| = 1$. Moreover, since $\theta : y^* \mapsto S(y^*)$ is continuous, $|\pi_1^{-1}(y^*)| = |S(y^*)| = 1$ for any $y^* \in Y^*$, i.e., $\pi_1$ is a flow isomorphism.

If $n \ge 2$, then there exists $s \in S(y_0^*) \setminus AR_+(s_1, y_0^*)$. Since $AR_+(s_1, y_0^*)$ is closed, we can find points $a_1, b_1 \in AR_+(s_1, y_0^*)$ such that $AR_+(s_1, y_0^*) \subseteq [a_1, b_1]$ (if $a_1 = b_1$, then $[a_1, b_1] = \{a_1\}$) and $s \in S^1 \setminus [a_1, b_1]$. By Lemma 5.3(4) and the fact that $s \notin AR_+(s_1, y_0^*)$, we have that $[a_1, b_1] \cap S(y_0^*) \subseteq AR_+(s_1, y_0^*)$ and $\operatorname{diam}(f_t([a_1, b_1])) \le |f_t(a_1) - f_t(b_1)|$ for $t$ sufficiently large, where $f_t(s) = \psi(s, \tau(y_0^*), t)$, $t \in T$. Hence $AR_+(s_1, y_0^*) = [a_1, b_1] \cap S(y_0^*)$. Similarly, for each $i = 2, 3, \ldots, n$, there exist points $a_i, b_i \in AR_+(s_i, y_0^*)$ such that $AR_+(s_i, y_0^*) = [a_i, b_i] \cap S(y_0^*)$ and $\operatorname{diam}(f_t([a_i, b_i])) \le |f_t(a_i) - f_t(b_i)|$ for $t$ sufficiently large.
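Let $t_k \to +\infty$ be such that $y_0^* \cdot t_k \to y_0^*$ and $\lim_{k\to\infty} f_{t_k}([a_i, b_i]) = \{c_i\}$ for some $c_i \in S(y_0^*)$. Using the continuity of $\theta : y^* \mapsto S(y^*)$ and the fact that
\[
S(y_0^* \cdot t) = f_t\bigl(S(y_0^*)\bigr) = \bigcup_{i=1}^{n} f_t\bigl(AR_+(s_i, y_0^*)\bigr) = \bigcup_{i=1}^{n} f_t\bigl([a_i, b_i] \cap S(y_0^*)\bigr), \qquad t \in T,
\]
we have
\[
S(y_0^*) = \lim_{k\to\infty} S(y_0^* \cdot t_k) = \lim_{k\to\infty} \bigcup_{i=1}^{n} f_{t_k}\bigl([a_i, b_i] \cap S(y_0^*)\bigr) = \{c_1, c_2, \ldots, c_n\}.
\]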
By the continuity of $\theta$ again, we have $|\pi_1^{-1}(y^*)| = |S(y^*)| = |S(y_0^*)|$ for any $y^* \in Y^*$. This shows that $\pi_1$ is a finite to one extension. The proof of Claim 4 is now complete.

Now, since $(Y^*, T)$ is point-distal and $\pi_1$ is finite to one and open, we have by Proposition 3.2(2) that $(Z^*, T)$ is point-distal, a contradiction. □

Theorem 5.2. Let $X$ be a minimal set of a SPCF $(S^1 \times Y, T)$ with point-distal base flow $(Y, T)$. Then $X$ is either point-distal or residually Li–Yorke chaotic.

Proof. Let $X^*, Y^*, Z^*, \tau, \tau', \rho, \pi, \pi_1, \pi'$ be as in Lemma 5.4 for the present minimal set $X$. We first consider the case that there exists a second category subset $Y_u^*$ of $Y^*$ such that for each $y^* \in Y_u^*$ there exists no uncountable scrambled set in $\pi_1^{-1}(y^*)$. By Proposition 5.1, $(Z^*, T)$ is point-distal. Since $\rho : (X^*, T) \to (Z^*, T)$ is a flow isomorphism, $(X^*, T)$ is also point-distal. Note that $\tau' : (X^*, T) \to (X, T)$ is almost 1–1. We conclude that $(X, T)$ is point-distal in this case.

Next, we consider the case that there exists a residual subset $Y_c^*$ of $Y^*$ such that for each $y^* \in Y_c^*$ there exists an uncountable scrambled set in $\pi_1^{-1}(y^*)$. Let $y_0^* \in Y_c^*$ and $F^*$ be an uncountable scrambled set in $\pi_1^{-1}(y_0^*)$. Then there exists an uncountable subset $S$ of $S^1$ such that $F^* = \{(s, y_0^*) : s \in S\}$. Let $y_0 = \tau(y_0^*)$ and $F = \{(s, y_0) : s \in S\}$. Clearly, $F$ is an uncountable set. Since $\pi_2 = \tau' \circ \rho^{-1} : (Z^*, T) \to (X, T)$ is a flow extension and $\pi_2(F^*) = F$, we have $F \subset \pi^{-1}(y_0)$. Let $s_1 \neq s_2 \in S$. Then $(s_1, y_0^*), (s_2, y_0^*)$ form a Li–Yorke pair. Hence
\[
\liminf_{t\to+\infty} |\psi(s_1, y_0, t) - \psi(s_2, y_0, t)| = 0 \quad \text{and} \quad \limsup_{t\to+\infty} |\psi(s_1, y_0, t) - \psi(s_2, y_0, t)| > 0,
\]
i.e., $\{(s_1, y_0), (s_2, y_0)\}$ is also a Li–Yorke pair. Therefore, $F \subseteq \pi^{-1}(y_0)$ is an uncountable scrambled set of $(X, T)$. Let $Y_c = \tau(Y_c^*)$. Since $Y_c^*$ is a residual subset of $Y^*$ and $\tau$ is an almost 1–1 extension, $Y_c$ is a residual subset of $Y$ and there exists an uncountable scrambled set in $\pi^{-1}(y)$ for each $y \in Y_c$. Hence $(X, T)$ is residually Li–Yorke chaotic. □

A point-distal or even an almost automorphic minimal set can also be Li–Yorke chaotic. But our next theorem says that it cannot be residually Li–Yorke chaotic.
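For reference, the pair relations used in the proof of Theorem 5.2 can be summarized as below. The Li–Yorke conditions are exactly the two limits displayed in that proof; the formulations of "positively proximal" and "positively asymptotic" are the standard ones and are assumed here to agree with the definitions given earlier in the paper.

```latex
% x_1, x_2 are points of a flow (X, T) with metric d (generic names for this summary).
\begin{align*}
\text{positively proximal:}\quad & \liminf_{t\to+\infty} d(x_1\cdot t,\, x_2\cdot t) = 0, \\
\text{positively asymptotic:}\quad & \lim_{t\to+\infty} d(x_1\cdot t,\, x_2\cdot t) = 0, \\
\text{Li--Yorke pair:}\quad & \liminf_{t\to+\infty} d(x_1\cdot t,\, x_2\cdot t) = 0
  \ \text{ and }\ \limsup_{t\to+\infty} d(x_1\cdot t,\, x_2\cdot t) > 0 .
\end{align*}
```

In particular, under these formulations a Li–Yorke pair is positively proximal but not positively asymptotic.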
Lemma 5.5. Let $X$ and $Y$ be compact metric spaces, and $\pi : X \to Y$ be a semi-open, surjective, continuous map. Then
\[
X_0 = \{x \in X : \text{for any open neighborhood } U \text{ of } x,\ \pi(U) \text{ is a neighborhood of } \pi(x)\}
\]
is a residual subset of $X$.

Proof. It follows from arguments of Lemma 3.1 in [59]. □
Theorem 5.3. Consider a SPCF $(S^1 \times Y, T)$ with point-distal base flow $(Y, T)$. If a minimal set is point-distal, then it is not residually Li–Yorke chaotic.

Proof. We use the explicit expression (1.1) for a SPCF $(S^1 \times Y, T) = (S^1 \times Y, \{\Lambda_t\}_{t\in T})$. Suppose for contradiction that the SPCF has a point-distal minimal set $M$ which is also residually Li–Yorke chaotic. Then the set $M_d$ of distal points of $M$ is a residual subset. It follows from Lemma 5.5 that
\[
Y_d = \{y \in Y : M_d \cap \pi^{-1}(y) \text{ is a residual subset of } \pi^{-1}(y) \cap M\}
\]
is a residual subset of $Y$. Since $M$ is residually Li–Yorke chaotic, there exists a residual subset $Y_c$ of $Y$ such that each fiber $\pi^{-1}(y)$, $y \in Y_c$, admits an uncountable scrambled set $W_y$.

Fix $y \in Y_d \cap Y_c$ and let $E_y = \{s \in S^1 : (s, y) \in W_y\}$. Then $E_y$ is an uncountable subset of $S^1$ and it is not hard to see that there exists $s_0 \in E_y$ such that for any $\epsilon > 0$ sufficiently small, $[s_0, s_0 e^{2\pi i \epsilon}] \cap E_y$ and $[s_0 e^{-2\pi i \epsilon}, s_0] \cap E_y$ are two uncountable sets.

Consider the family of maps $f_t : S^1 \to S^1 : s \mapsto \psi(s, y, t)$, $t \in T$. Take $s_1 \in E_y \setminus \{s_0\}$. Then $(s_0, y), (s_1, y)$ are positively proximal. Hence there exist a sequence $t_i \to +\infty$ and a point $s' \in S^1$ such that $\lim_{i\to\infty} f_{t_i}(s_1) = \lim_{i\to\infty} f_{t_i}(s_0) = s'$. It follows from Lemma 5.3(2) that there exists a subsequence $\{i_k\} \subset \{i\}$ such that either $\lim_{k\to\infty} f_{t_{i_k}}([s_1, s_0]) = \{s'\}$ or $\lim_{k\to\infty} f_{t_{i_k}}([s_0, s_1]) = \{s'\}$ under the Hausdorff metric. We let $B = [s_1, s_0]$ or $[s_0, s_1]$ be such that for any $s_2 \in B$,
\[
\liminf_{t\to+\infty} d\bigl((s_0, y) \cdot t, (s_2, y) \cdot t\bigr) = \lim_{k\to\infty} \bigl|f_{t_{i_k}}(s_0) - f_{t_{i_k}}(s_2)\bigr| = 0.
\]
According to the choice of $s_0$, $B \cap E_y$ is an uncountable set. Now since $\pi^{-1}(y) \cap M_d$ is a residual subset of $\pi^{-1}(y) \cap M$, there exists $s_2^* \in \operatorname{int}(B) \cap E_y$ such that $(s_2^*, y) \in M_d$. This is impossible since $\liminf_{t\to+\infty} d((s_0, y) \cdot t, (s_2^*, y) \cdot t) = 0$. □

Now Theorem 2 immediately follows from Theorems 5.2, 5.3 above.

6. A general topological classification of minimal sets

In this section, we consider a SPCF $(S^1 \times Y, T) = (S^1 \times Y, \{\Lambda_t\}_{t\in T})$ with minimal base flow $(Y, T)$. We adopt the explicit form (1.1), i.e.,
\[
\Lambda_t(s_0, y_0) = \bigl(\psi(s_0, y_0, t),\ y_0 \cdot t\bigr), \qquad t \in T,
\]
and denote $d_Y$ as a compatible metric on $Y$ and $\pi : S^1 \times Y \to Y$ as the natural projection.
Let $M$ be a minimal set of the SPCF $(S^1 \times Y, T)$. For each $y \in Y$, we denote $M_y = \{s \in S^1 : (s, y) \in M\}$. Since $M_y$ is a closed subset of $S^1$, each connected component of $M_y$ is either the whole circle or a closed interval in the circle (which can be degenerate). Consider the function
\[
\zeta_M : Y \to \mathbb{R}^1 : \quad \zeta_M(y) = \sup\{|B| : B \text{ is a connected component of } M_y\},
\]
where $|B|$ denotes the length of $B$. For each $y \in Y$, it is clear that there exists a component $B$ in $M_y$ such that $\zeta_M(y) = |B|$, i.e.,
\[
\zeta_M(y) = \max\{|B| : B \text{ is a connected component of } M_y\}.
\]

Lemma 6.1. The function $\zeta_M$ is non-negative and upper semi-continuous, i.e., $\zeta_M(y) \ge 0$ and $\limsup_{y_n \to y} \zeta_M(y_n) \le \zeta_M(y)$ for each $y \in Y$.

Proof. The lemma is clear because $M$ is compact. □
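One way to unpack the compactness argument behind Lemma 6.1 is sketched below. The auxiliary sets $B_n$, $B$ are names introduced only for this note, and the sketch relies on the elementary fact that a Hausdorff limit of closed arcs of $S^1$ is again a closed arc whose length is the limit of the lengths.

```latex
% Sketch; B_n and B are auxiliary names not used in the text.
Let $y_n \to y$ and pick components $B_n \subseteq M_{y_n}$ with $|B_n| = \zeta_M(y_n)$.
Passing to a subsequence, we may assume $|B_n| \to \limsup_{n} \zeta_M(y_n)$ and
$B_n \to B$ in the Hausdorff metric on $2^{S^1}$, where $B$ is a closed arc
(possibly a single point or all of $S^1$) with $|B| = \lim_n |B_n|$.
Since $B_n \times \{y_n\} \subseteq M$ and $M$ is closed, $B \times \{y\} \subseteq M$,
so the connected set $B$ lies in a single connected component of $M_y$. Hence
\[
\zeta_M(y) \;\ge\; |B| \;=\; \lim_{n\to\infty} |B_n| \;=\; \limsup_{n\to\infty} \zeta_M(y_n).
\]
```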
Let $Y_0(\zeta_M) = \{y \in Y : \zeta_M(y) = 0\}$.

Lemma 6.2. Either $\inf_{y \in Y} \zeta_M(y) > 0$ or $Y_0(\zeta_M)$ is a residual subset of $Y$.

Proof. By Lemma 6.1, $Y_0(\zeta_M)$ is a $G_\delta$-set. Hence it is sufficient to show that if $\inf_{y \in Y} \zeta_M(y) = 0$, then $Y_0(\zeta_M)$ is a dense subset of $Y$. Assume that $\inf_{y \in Y} \zeta_M(y) = 0$ and let $Y_c(\zeta_M)$ be the set of points of continuity of $\zeta_M$. Since $\zeta_M$ is upper semi-continuous, $Y_c(\zeta_M)$ is a residual set. If $\zeta_M(y_0) > 0$ for some $y_0 \in Y_c(\zeta_M)$, then there exist an open neighborhood $U$ of $y_0$ and $c > 0$ such that $\zeta_M(y) \ge c$ for all $y \in U$. By the minimality of $Y$, we let $t_1 < t_2 < \cdots < t_n$ be such that $\bigcup_{i=1}^{n} U \cdot t_i = Y$. Since $\inf_{y \in U} \zeta_M(y) > 0$, we have that $c_i := \inf_{y \in U} \zeta_M(y \cdot t_i) > 0$ for all $i = 1, 2, \ldots, n$. Hence $\inf_{y \in Y} \zeta_M(y) \ge \min\{c_i : i = 1, 2, \ldots, n\} > 0$, a contradiction to the fact that $\inf_{y \in Y} \zeta_M(y) = 0$. This shows that for any $y_0 \in Y_c(\zeta_M)$, $\zeta_M(y_0) = 0$, i.e., $Y_c(\zeta_M) \subseteq Y_0(\zeta_M)$. Hence $Y_0(\zeta_M)$ is a residual subset of $Y$. □

Recall that a minimal set $M$ of a SPCF $(S^1 \times Y, T)$ is a Cantorian if there exists a residual subset $Y_0$ of $Y$ such that for each $y \in Y_0$, $M_y$ is a Cantor set. The following theorem immediately yields Theorem 3.

Theorem 6.1. Let $M$ be a minimal set of a SPCF $(S^1 \times Y, T)$ with minimal base flow $(Y, T)$. Then precisely one of the following holds:
(a) $M$ is an almost $N$–1 extension of $Y$ for some positive integer $N$;
(b) $M = S^1 \times Y$;
(c) $M$ is a Cantorian.

Proof. By Lemma 6.2, there are two cases: (a) $\inf_{y \in Y} \zeta_M(y) > 0$; (b) $Y_0(\zeta_M)$ is a residual subset of $Y$.
We first consider the case (a). Let $A = \{[e^{2\pi i r}, e^{2\pi i l}] : r < l < 1 + r,\ r, l \in \mathbb{Q}\}$. Since $A$ is countable, we can rewrite it as $A = \{[a_i, b_i]\}_{i \in \mathbb{N}}$. Since $\inf_{y \in Y} \zeta_M(y) > 0$, we have that for each $y \in Y$, there exists $i(y) \in \mathbb{N}$ such that $[a_{i(y)}, b_{i(y)}] \subseteq M_y$. In particular, $[a_{i(y)}, b_{i(y)}] \times \{y\} \subseteq M$. Denote $Y_i = \{y \in Y : i(y) = i\}$ for $i \in \mathbb{N}$. Then $\bigcup_{i \in \mathbb{N}} Y_i = Y$. Hence there exists $i^* \in \mathbb{N}$ such that $W := \operatorname{int}(\overline{Y_{i^*}}) \neq \emptyset$. Since $[a_{i^*}, b_{i^*}] \times Y_{i^*} \subseteq M$ and $M$ is closed, we have that $[a_{i^*}, b_{i^*}] \times \overline{Y_{i^*}} \subseteq M$, which implies that $(a_{i^*}, b_{i^*}) \times W \subseteq M$.

Let $y \in Y$ and denote $S(y) = \{s \in S^1 : (s, y) \in M\}$. For any $s \in S(y)$, since $M$ is minimal, there exists $t \in T$ such that $(s, y) \cdot t \in (a_{i^*}, b_{i^*}) \times W$, which implies that $s \in \operatorname{int}_{S^1}(S(y))$. It follows that $S(y)$ is an open subset of $S^1$. Since $S(y)$ is also closed, $S(y) = S^1$. Since $y$ is arbitrary, $M = \bigcup_{y \in Y} S(y) \times \{y\} = S^1 \times Y$.

We now consider the case (b). Let
\[
Y^c(M) = \{y \in Y : M_y \text{ is a Cantor set}\}, \qquad Y^i(M) = \{y \in Y : M_y \text{ has an isolated point}\}.
\]
Since it is clear that $M_y$ is a Cantor set for any $y \in Y_0(\zeta_M) \setminus Y^i(M)$, we have that $Y^c(M) \supseteq Y_0(\zeta_M) \setminus Y^i(M)$. If $Y^c(M)$ is a residual subset of $Y$, then by definition $M$ is a Cantorian. If $Y^c(M)$ is not a residual subset of $Y$, then $Y^i(M)$ is of second category, for otherwise $Y^c(M) \supseteq Y_0(\zeta_M) \setminus Y^i(M)$ would be a residual subset of $Y$, since $Y_0(\zeta_M)$ is a residual subset of $Y$. Consider the countable set
\[
D = \{(I_1, I_2) : I_1, I_2 \in A \text{ and } I_2 \subset \operatorname{int}(I_1)\}.
\]
We note that for each $y \in Y^i(M)$, there exists $(I_1^y, I_2^y) \in D$ such that $S(y) \cap I_1^y = S(y) \cap I_2^y$ is a singleton. Thus the map $\Phi : Y^i(M) \to D : y \mapsto (I_1^y, I_2^y)$ is well defined.

Since $Y^i(M) = \bigcup_{(I_1, I_2) \in D} \Phi^{-1}(I_1, I_2)$, $D$ is countable, and $Y^i(M)$ is of second category, there exist $(I_1^0, I_2^0) \in D$ and a nonempty open subset $U$ of $Y$ such that $\overline{\Phi^{-1}(I_1^0, I_2^0)} \supseteq U$. Since $\theta : y \mapsto S(y)$ is upper semi-continuous, the set $Y_0$ of all continuity points of $\theta$ is an invariant residual subset of $Y$. Let $W = \operatorname{int}(I_1^0)$. For each $y \in U \cap Y_0 \subset \overline{\Phi^{-1}(I_1^0, I_2^0)} \cap Y_0$, since $S(y) \cap \operatorname{int}(I_1^0) = S(y) \cap I_2^0$ is a singleton, $W \cap S(y)$ is also a singleton.

Fix $y \in Y_0$ and $s \in S(y)$. Then $(s, y) \in M$. Since $(W \times U) \cap M$ is a nonempty open subset of $M$ and $M$ is a minimal set, there exists $t_0 \in T$ such that $(s, y) \cdot t_0 \in W \times U$. Hence there exists an open neighborhood $V$ of $s$ in $S^1$ such that $(V \times \{y\}) \cdot t_0 \cap M \subset (W \times U) \cap M$. Since $(V \times \{y\}) \cdot t_0 \cap M \subset S(y \cdot t_0) \times \{y \cdot t_0\}$ and $y \cdot t_0 \in U \cap Y_0$, we have that $(V \times \{y\}) \cdot t_0 \cap M \subseteq (S(y \cdot t_0) \cap W) \times \{y \cdot t_0\}$ is a singleton. It follows that $(V \times \{y\}) \cap M$ is a singleton, i.e., $s$ is an isolated point of $S(y)$. Thus, for each $y \in Y_0$, $S(y)$ is a discrete closed subset of $S^1$, hence it is a finite subset of $S^1$. This shows that $\pi : M \to Y$ is an almost finite to one extension. It follows from Proposition 3.2(1) that $\pi : M \to Y$ is an almost $N$–1 extension for some positive integer $N$.

Finally, the above is a strict trichotomy because in both cases (a) and (b), $M$ cannot be a Cantorian. □
7. Finite to one extensions and almost automorphic dynamics

7.1. Local connectivity and almost automorphy

Let $X$ be a complete metric space. Recall that $x \in X$ is a locally connected point if for any open neighborhood $U$ of $x$ there is a connected closed neighborhood $V$ of $x$ such that $V \subseteq U$. We denote by $X_{lc}$ the set of locally connected points in $X$.

Lemma 7.1. Suppose that $X_{lc} \neq \emptyset$ and $(X, T)$ is minimal. Then $X_{lc}$ is an invariant residual subset of $X$.

Proof. The invariance of $X_{lc}$ is clear. For each $k \in \mathbb{N}$, we consider the open set
\[
X_{lc}^k = \bigl\{x \in X : \text{there exists a connected closed neighborhood } V \text{ of } x \text{ such that } V \subseteq B(x, \tfrac{1}{k})\bigr\},
\]
where $B(x, \frac{1}{k}) = \{z \in X : d(x, z) < \frac{1}{k}\}$. Then $X_{lc} = \bigcap_{k=1}^{\infty} X_{lc}^k$, i.e., $X_{lc}$ is a $G_\delta$ subset of $X$. It follows from the minimality of $X$ that $X_{lc}$ is also dense. □

The following result is known as the Ramsey theorem [49].

Lemma 7.2. If the set $C = \{(i, j) \in \mathbb{N} \times \mathbb{N} : 1 \le i < j < \infty\}$ is divided into finitely many sets $C_1, C_2, \ldots, C_\ell$, then there is a sequence $\{i_n\}$ of natural numbers for which all pairs $(i_m, i_n)$, $m < n$, are in $C_j$ for some $j \in \{1, 2, \ldots, \ell\}$.

Our main result Theorem 5 is a direct consequence of the following theorem.

Theorem 7.1. Consider an almost $n$–1 extension $\pi : (X, T) \to (Y, T)$ between minimal flows in which $(Y, T)$ is almost periodic minimal. If $X_{lc} \neq \emptyset$, then the following holds.
(1) $(X, T)$ is almost automorphic;
(2) For each $y \in Y$, the fiber $\pi^{-1}(y)$ has precisely $n$ connected components.

Proof. (1) Let
\[
X_0 = \{x \in X : \text{for any open neighborhood } U \text{ of } x,\ \pi(U) \text{ is a neighborhood of } \pi(x)\}.
\]
By Lemma 5.5, $X_0$ is a residual subset of $X$. Since by Lemma 7.1 $X_{lc}$ is residual, $X_0 \cap X_{lc}$ is also a residual subset of $X$. Hence by Lemma 5.1,
\[
Y_1 = \{y \in Y : \pi^{-1}(y) \cap X_0 \cap X_{lc} \text{ is a residual subset of } \pi^{-1}(y)\}
\]
is a residual subset of $Y$. Let $Y_0$ be the set of continuity points of the map $\Phi : Y \to 2^X : y \mapsto \pi^{-1}(y)$. Since $Y_0$ is a residual subset of $Y$, so is $Y_0 \cap Y_1$. Let $y_0 \in Y_0 \cap Y_1$. Then $|\pi^{-1}(y_0)| = n$, say, $\pi^{-1}(y_0) = \{x_1, x_2, \ldots, x_n\}$.

Since $\pi^{-1}(y_0) \cap X_0$ is a residual subset of $\pi^{-1}(y_0)$, we have that $x_i \in X_0$ for all $i = 1, 2, \ldots, n$. By Theorem 3.1, we want to show that $x_1$ is a $\Delta^*$-recurrent point, i.e., for any open neighborhood $U$ of $x_1$, the
recurrent time set $N(x_1, U) = \{t \in T : x_1 \cdot t \in U\}$ is a $\Delta^*$-set. More precisely, let $\{s_i\}_{i=1}^{\infty}$ be a sequence in $T$. We need to show that $N(x_1, U) \cap \{s_k - s_{k'} : k > k'\} \neq \emptyset$.

Let $U_i$ be open neighborhoods of $x_i$, for $i = 1, 2, \ldots, n$ respectively, such that $U_1 \subseteq U$ and $\operatorname{cl}(U_i) \cap \operatorname{cl}(U_j) = \emptyset$ for any $1 \le i \neq j \le n$. Since $y_0 \in Y_0$, there exists an open neighborhood $W$ of $y_0$ such that for each $y \in W$, $\pi^{-1}(y) \cap U_i \neq \emptyset$, $i = 1, 2, \ldots, n$, and $\pi^{-1}(y) \subset \bigcup_{i=1}^{n} U_i$.

Since $(Y, T)$ is almost periodic, there exists an invariant compatible metric $d_Y$ on $Y$, i.e., $d_Y(y_1 \cdot t, y_2 \cdot t) = d_Y(y_1, y_2)$ for any $y_1, y_2 \in Y$ and $t \in T$. Let $\delta > 0$ be such that the open ball $B_\delta(y_0)$ centered at $y_0$ with radius $\delta$ is contained in $W$. Let $m = n!$. Then there exist $r \in \{0, 1, \ldots, m - 1\}$, a subsequence $\{i_k\} \subset \mathbb{N}$, and a homeomorphism $g : Y \to Y$ such that $\frac{s_{i_k} - r}{m} \in T$ and
\[
\sup_{y \in Y} d_Y\Bigl(y \cdot \frac{s_{i_k} - r}{m},\ g(y)\Bigr) \le \frac{\delta}{2^k} \tag{7.1}
\]
for all $k = 1, 2, \ldots$. Then for any $u, v \in \mathbb{N}$ with $u \neq v$, it follows from (7.1) that
\begin{align*}
\sup_{y \in Y} d_Y\Bigl(y \cdot \frac{s_{i_u} - s_{i_v}}{m},\ y\Bigr) &= \sup_{y \in Y} d_Y\Bigl(y \cdot \frac{s_{i_u} - r}{m},\ y \cdot \frac{s_{i_v} - r}{m}\Bigr) \\
&\le \sup_{y \in Y} d_Y\Bigl(y \cdot \frac{s_{i_u} - r}{m},\ g(y)\Bigr) + \sup_{y \in Y} d_Y\Bigl(y \cdot \frac{s_{i_v} - r}{m},\ g(y)\Bigr) \le \frac{\delta}{2^u} + \frac{\delta}{2^v}.
\end{align*}

Denote $r_k = \frac{s_{i_k} - r}{m}$, $k = 1, 2, \ldots$, $R = \{r_i - r_j : i > j\}$, and $\mathrm{Per}(n)$ as the set of all permutations of $\{1, 2, \ldots, n\}$. Let $t \in R$. Since $y_0 \cdot t \in W$, $\pi^{-1}(y_0 \cdot t) \cap U_i \neq \emptyset$ for all $i = 1, 2, \ldots, n$. Hence there exists a unique $P_t \in \mathrm{Per}(n)$ such that $x_j \cdot t \in U_{P_t(j)}$ for all $j = 1, 2, \ldots, n$. For each $P \in \mathrm{Per}(n)$, we let $R_P = \{t \in R : P_t = P\}$ and $C_P = \{(i, j) \in \mathbb{N} \times \mathbb{N} : r_j - r_i \in R_P\}$. Since $R = \bigcup_{P \in \mathrm{Per}(n)} R_P = \{r_i - r_j : i > j\}$, we have $C = \bigcup_{P \in \mathrm{Per}(n)} C_P$. Applying Lemma 7.2, one finds a subsequence $\{l_j\} \subset \mathbb{N}$ and $Q \in \mathrm{Per}(n)$ such that $R_Q \supseteq \{r_{l_i} - r_{l_j} : i > j\}$. Let $u_i = r_{l_i}$, $i \in \mathbb{N}$. It is clear that
(a) $\{u_i - u_j : i > j\} \subseteq R_Q$;
(b) $\{m(u_i - u_j) : i > j\} \subseteq \{s_k - s_{k'} : k > k'\}$;
(c) $\sup_{y \in Y} d(y \cdot (u_i - u_j), y) < \frac{\delta}{2^i} + \frac{\delta}{2^j}$ for any $i > j$.

Since $Q \in \mathrm{Per}(n)$, $Q^m(j) = j$ for each $j = 1, 2, \ldots, n$. In particular, $Q^m(1) = 1$. Let $W_m = U_1$. Since $\lim_{i > j \to \infty} \sup_{y \in Y} d(y \cdot (u_i - u_j), y) = 0$, there exist a positive integer $N_m$ and an open neighborhood $V_m \subseteq W$ of $y_0$ such that $V_m \cdot (u_i - u_j) \subseteq W$ for all $i > j \ge N_m$. Since $X$ is locally connected, there exists a connected closed neighborhood $W_{m-1}$ of $x_{Q^{m-1}(1)}$ such that $W_{m-1} \subseteq U_{Q^{m-1}(1)} \cap \pi^{-1}(V_m)$. Now for any $i > j \ge N_m$, since $\pi^{-1}(V_m \cdot (u_i - u_j)) \subseteq \pi^{-1}(W) \subseteq \bigcup_{k=1}^{n} U_k$, we have $W_{m-1} \cdot (u_i - u_j) \subseteq \bigcup_{k=1}^{n} U_k$. Note that $W_{m-1} \cdot (u_i - u_j)$ is both connected and closed, $x_{Q^{m-1}(1)} \cdot (u_i - u_j) \in W_m = U_1$, and $\operatorname{cl}(U_k) \cap \operatorname{cl}(U_{k'}) = \emptyset$ for $1 \le k < k' \le n$. It follows that $W_{m-1} \cdot (u_i - u_j) \subseteq W_m$, i.e., we find a connected closed neighborhood $W_{m-1}$ of $x_{Q^{m-1}(1)}$ and a positive integer $N_m$ such that $W_{m-1} \cdot (u_i - u_j) \subseteq W_m$ for all $i > j \ge N_m$.
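The identity $Q^m(j) = j$ used above rests on the choice $m = n!$. A short justification, given here as a sketch via Lagrange's theorem (the symbol $k$ for the order of $Q$ is introduced only for this note), reads:

```latex
% k denotes the order of Q in the permutation group Per(n); this symbol is not used in the text.
\[
Q \in \mathrm{Per}(n),\quad k := \operatorname{ord}(Q)
\ \Longrightarrow\ k \,\big|\, |\mathrm{Per}(n)| = n! = m
\ \Longrightarrow\ Q^{m} = \bigl(Q^{k}\bigr)^{m/k} = \mathrm{id}.
\]
% Example with n = 3, m = 6: for the 3-cycle Q = (1\,2\,3), Q^3 = id, hence Q^6 = id;
% for the transposition Q = (1\,2), Q^2 = id, hence Q^6 = id.
```

Consequently the chain of neighborhoods $W_{m-1}, W_{m-2}, \ldots$ built around the points $x_{Q^{v}(1)}$ ends at $v = 0$ with a neighborhood of $x_{Q^0(1)} = x_1$, while it starts at $W_m = U_1$, a neighborhood of $x_{Q^m(1)} = x_1$.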
By repeating the above process and using induction, we find that, for each $v = m - 1, m - 2, \ldots, 1, 0$, there is a connected closed neighborhood $W_v$ of $x_{Q^v(1)}$ and a positive integer $N_{v+1}$ such that $W_v \cdot (u_i - u_j) \subseteq W_{v+1}$ for all $i > j \ge N_{v+1}$. Let $N = \max\{N_{v+1} : v = m - 1, m - 2, \ldots, 1, 0\}$. Then for any $i > j \ge N$, we have
\begin{align*}
W_0 \cdot m(u_i - u_j) &= W_0 \cdot (u_i - u_j) \cdot (m - 1)(u_i - u_j) \subseteq W_1 \cdot (m - 1)(u_i - u_j) \subseteq \cdots \\
&\subseteq W_v \cdot (m - v)(u_i - u_j) \subseteq \cdots \subseteq W_m = U_1 \subseteq U.
\end{align*}
Since $x_1 \in W_0$, $x_1 \cdot (m(u_i - u_j)) \in U$. This together with (b) above implies that $m(u_i - u_j) \in N(x_1, U) \cap \{s_k - s_{k'} : k > k'\} \neq \emptyset$. Since $\{s_i\}_{i=1}^{\infty}$ is arbitrary, $N(x_1, U)$ is a $\Delta^*$-set for any neighborhood $U$ of $x_1$. Hence $x_1$ is a $\Delta^*$-recurrent point, i.e., $(X, T)$ is almost automorphic.

(2) Let $d_Y$ be an invariant compatible metric on $Y$, $d$ be the metric on $X$, and $Y_0, Y_1$ be the residual sets defined in (1).

First, we show that for each $y \in Y$, the fiber $\pi^{-1}(y)$ has at least $n$ connected components. Suppose this is not true. Then there exists $y_1 \in Y$ such that $\pi^{-1}(y_1)$ has $m$ connected components $\{B_r\}_{r=1}^{m}$ for some $m \le n - 1$. For a fixed $y_0 \in Y_0 \cap Y_1$, we let $\pi^{-1}(y_0) = \{x_1, x_2, \ldots, x_n\}$. Also let $U_i$ be open neighborhoods of $x_i$, for $i = 1, 2, \ldots, n$ respectively, such that $\operatorname{cl}(U_i) \cap \operatorname{cl}(U_j) = \emptyset$ for any $1 \le i \neq j \le n$. Since $y_0 \in Y_0$, there exists an open neighborhood $W$ of $y_0$ such that for each $y \in W$, $\pi^{-1}(y) \cap U_i \neq \emptyset$, $i = 1, 2, \ldots, n$, and $\pi^{-1}(y) \subset \bigcup_{i=1}^{n} U_i$. Let $t \in T$ be such that $y_1 \cdot t \in W$. Then $\pi^{-1}(y_1 \cdot t) = \bigcup_{r=1}^{m} B_r \cdot t \subseteq \bigcup_{i=1}^{n} U_i$ and $\pi^{-1}(y_1 \cdot t) \cap U_i \neq \emptyset$, $i = 1, 2, \ldots, n$. For each $r = 1, 2, \ldots, m$, since $B_r \cdot t$ is a closed connected subset of $\pi^{-1}(y_1 \cdot t)$ and $\operatorname{cl}(U_i) \cap \operatorname{cl}(U_j) = \emptyset$, there exists $i(r) \in \{1, 2, \ldots, n\}$ such that $B_r \cdot t \subseteq U_{i(r)}$. Hence $\{1, 2, \ldots, n\} \setminus \{i(1), i(2), \ldots, i(m)\} \neq \emptyset$. Let $i_0 \in \{1, 2, \ldots, n\} \setminus \{i(1), i(2), \ldots, i(m)\}$. Then $U_{i_0} \cap \pi^{-1}(y_1 \cdot t) \subseteq \bigcup_{r=1}^{m} U_{i_0} \cap U_{i(r)} = \emptyset$, a contradiction.

Next, suppose for contradiction that there exists $y \in Y$ such that $\pi^{-1}(y)$ has at least $(n + 1)$ connected components $\{A_j\}_{j=1}^{n+1}$. For a fixed $y_0 \in Y_0 \cap Y_1$, we let $\pi^{-1}(y_0) = \{x_1, x_2, \ldots, x_n\}$ and $\epsilon = \min\{d(x_i, x_j) : 1 \le i < j \le n\}$. Also let $N$ be a natural number such that $\epsilon > \frac{2}{N}$. For a given integer $m \ge N$, we consider the sets $U_i^m = \pi^{-1}(B(y_0, \frac{1}{m})) \cap B(x_i, \frac{1}{m})$, $i = 1, 2, \ldots, n$, where $B(y_0, r) = \{z \in Y : d_Y(z, y_0) < r\}$ and $B(x_i, r) = \{x \in X : d(x_i, x) < r\}$, $i = 1, 2, \ldots, n$. For each $i = 1, 2, \ldots, n$, since $x_i \in X_{lc}$, there exists a connected closed neighborhood $V_i^m$ of $x_i$ such that $V_i^m \subseteq U_i^m$. Let $V^m = \bigcup_{i=1}^{n} V_i^m$. Then $V^m$ is a closed neighborhood of $\pi^{-1}(y_0)$. Since $\pi^{-1}$ is continuous at $y_0$, there exists a neighborhood $W_m$ of $y_0$ such that $\pi^{-1}(W_m) \subset V^m$. Let $t_m \in T$ be such that $y \cdot t_m \in W_m$. Then $A_j \cdot t_m \subseteq V^m$ for all $j = 1, 2, \ldots, n + 1$. Since $A_j \cdot t_m$ is connected and $V_i^m \cap V_j^m = \emptyset$ for all $1 \le i < j \le n$, there exists a unique integer $n(j, m) \in \{1, 2, \ldots, n\}$ such that $A_j \cdot t_m \subseteq V^m_{n(j,m)}$. Hence there have to be integers $j_1(m), j_2(m)$ with $1 \le j_1(m) < j_2(m) \le n + 1$ such that $n(j_1(m), m) = n(j_2(m), m)$, which we denote as $n(m)$.

Let $E_m = V^m_{n(m)} \cdot (-t_m)$. It is clear that $E_m$ is a connected closed set, $A_{j_1(m)} \cup A_{j_2(m)} \subseteq E_m$, and $\pi(E_m) = \pi(V^m_{n(m)}) \cdot (-t_m) \subseteq B(y_0, \frac{1}{m}) \cdot (-t_m) = B(y_0 \cdot (-t_m), \frac{1}{m})$. Since $y \in \pi(E_m)$, we have $\pi(E_m) \subseteq B(y, \frac{2}{m})$. Now we take a sequence $N \le m_1 < m_2 < \cdots$ such that
(i) $j_1(m_1) = j_1(m_2) = \cdots$, denoted by $j_1$;
(ii) $j_2(m_1) = j_2(m_2) = \cdots$, denoted by $j_2$;
(iii) $\lim_{\ell\to\infty} E_{m_\ell} = E$ for some $E \in 2^X$ under the Hausdorff metric on $2^X$.
It is clear that $1 \le j_1 < j_2 \le n+1$, $A_{j_1} \cup A_{j_2} \subseteq E$, and $E$ is a connected closed set of $X$. Since $\pi(E_m) \subseteq B(y, \frac{2}{m})$, $E \subseteq \pi^{-1}(y)$. Note that $E$ is connected, $A_{j_1} \cup A_{j_2} \subseteq E$, and $A_{j_1}, A_{j_2}$ are two connected components of $\pi^{-1}(y)$. We must have $E = A_{j_1} = A_{j_2}$, which is a contradiction to the fact that $A_{j_1} \cap A_{j_2} = \emptyset$. □

7.2. SPCF with at least two minimal sets

We consider a SPCF $(S^1 \times Y, \mathbb{T}) = (S^1 \times Y, \{\Lambda_t\}_{t \in \mathbb{T}})$ in the form (1.1), i.e.,
\[ \Lambda_t(s_0, y_0) = (\psi(s_0, y_0, t),\ y_0 \cdot t), \qquad t \in \mathbb{T}. \]
We denote $d_Y$ as a compatible metric on $Y$ and $\pi : S^1 \times Y \to Y$ as the natural projection. The following result may be regarded as a topological counterpart to the Furstenberg measure-theoretic characterization [15] for SPCFs.

Theorem 7.2. Consider a SPCF $(S^1 \times Y, \mathbb{T})$ with minimal base flow $(Y, \mathbb{T})$ and assume that it has at least two minimal sets. Then the following hold.
(a) There is a positive integer $n$ such that each minimal set is an almost $n$–1 extension of $Y$.
(b) If the SPCF is an APCF and one of its minimal sets is almost automorphic, then so are the others.

Proof. (a) Let $M, M_0$ be two minimal sets of $(S^1 \times Y, \mathbb{T})$. For each $y \in Y$, we consider the sets $S(y) = \{s \in S^1 : (s, y) \in M\}$ and $S_0(y) = \{s \in S^1 : (s, y) \in M_0\}$. Clearly, for each $y \in Y$, $S(y)$ and $S_0(y)$ are closed subsets of $S^1$, $S(y) \cap S_0(y) = \emptyset$, and the maps $\rho, \rho_0 : Y \to 2^{S^1 \times Y}$ defined by $\rho(y) = S(y)$, $\rho_0(y) = S_0(y)$, $y \in Y$, are upper semi-continuous. We denote by $Y^c$ and $Y_0^c$ the sets of continuity points of $\rho$ and $\rho_0$, respectively. Then both $Y^c$ and $Y_0^c$ are residual subsets of $Y$.

Fix a point $y_0 \in Y^c \cap Y_0^c$. Since $S^1 \setminus S_0(y_0)$ is an open subset of $S^1$, it is a countable union of proper, open sub-arcs of $S^1$, i.e., $S^1 \setminus S_0(y_0) = \bigcup_{i=1}^{I} A_i$, where $1 \le I \le +\infty$ and each $A_i$ is a proper, open sub-arc of $S^1$. Since $\bigcup_{i=1}^{I} A_i \supseteq S(y_0)$, there exists a positive integer $N(y_0)$ such that $\bigcup_{i=1}^{N(y_0)} A_i \supset S(y_0)$. Without loss of generality, we assume that $A_i \cap S(y_0) \ne \emptyset$ for all $i = 1, 2, \ldots, N(y_0)$.

Claim. For each $i = 1, 2, \ldots, N(y_0)$, $A_i \cap S(y_0)$ is a singleton. In particular, $|S(y_0)| = N(y_0) < \infty$.

Suppose for contradiction that the Claim is not true. Then there exists some $1 \le i \le N(y_0)$ such that $A_i \cap S(y_0)$ is not a singleton. We denote $A_i = (c, d)$ and let $B_i = [a, b]$ be a closed sub-arc of $A_i$ such that $A_i \cap S(y_0) = B_i \cap S(y_0)$. It is clear that $a, b \in S(y_0)$ and $c, d \in S_0(y_0)$. Using minimality of $M$, we let $t_j \to +\infty$ be a sequence such that $\lim_{j \to \infty} (b, y_0) \cdot t_j = (a, y_0)$.

By taking subsequences if necessary, we assume that $\lim_{j \to \infty} (a, y_0) \cdot t_j = (a', y_0)$, $\lim_{j \to \infty} (c, y_0) \cdot t_j = (c', y_0)$, and $\lim_{j \to \infty} (d, y_0) \cdot t_j = (d', y_0)$, for some $a' \in S(y_0)$ and $c', d' \in S_0(y_0)$. Let $f_t$ be as in Lemma 5.3. Then $f_{t_j}(b) \to a$, $f_{t_j}(a) \to a'$, $f_{t_j}(c) \to c'$, and $f_{t_j}(d) \to d'$, as $j \to \infty$.

We first show that $c' = c$, $d' = d$, and $a' = a$. Suppose $c' \ne c$. Since $c' \ne a$, we have by Lemma 5.3(1) that $\lim_{j \to \infty} f_{t_j}([c, b]) = [c', a] \supsetneq [c, a]$. Hence there is a sufficiently small open neighborhood $V$ of $c$ in $S^1$ such that $V \subset (c', a)$. Since $y_0 \in Y_0^c$ and $c \in S_0(y_0)$, there
exists an open neighborhood $U$ of $y_0$ in $Y$ such that $S_0(y) \cap V \ne \emptyset$ for all $y \in U$. Note that $f_{t_j}((c, b)) = (f_{t_j}(c), f_{t_j}(b))$, $V \subset (c', a)$, and $y_0 \cdot t_j \to y_0$. There exists a positive integer $J$ such that $f_{t_j}((c, b)) \supset V$ and $y_0 \cdot t_j \in U$ for $j \ge J$. Moreover, for fixed $j \ge J$, there exists $s \in (c, b)$ such that $f_{t_j}(s) \in S_0(y_0 \cdot t_j) \cap V$. This implies that $(f_{t_j}(s), y_0 \cdot t_j) \in M_0$, i.e., $(s, y_0) \cdot t_j \in M_0$. Hence $(s, y_0) \in M_0$, i.e., $s \in S_0(y_0)$, which contradicts the fact that $(c, b) \cap S_0(y_0) = \emptyset$. Therefore, $c' = c$. Similarly, $d' = d$. Since $a \in [c, b]$, $a' = \lim_{j \to \infty} f_{t_j}(a) \in [c, a]$. Hence $a' \in [c, a] \cap S(y_0) = \{a\}$, i.e., $a' = a$.

Next, we show that
\[ S(y_0 \cdot t_j) \nrightarrow S(y_0). \tag{7.2} \]
Since $f_{t_j}([a, b]) \subseteq f_{t_j}([c, b])$, $\limsup_{j \to \infty} f_{t_j}([a, b]) \subseteq [c, a] \ne S^1$. Also note that $f_{t_j}(a) \to a$ and $f_{t_j}(b) \to a$. We have by Lemma 5.3(3) that $\lim_{j \to \infty} f_{t_j}([a, b]) = \{a\}$. If $c = d$, then $S(y_0) \subseteq [a, b]$. Hence $S(y_0 \cdot t_j) = f_{t_j}(S(y_0)) = f_{t_j}(S(y_0) \cap [a, b]) \to \{a\} \ne S(y_0)$. Now suppose that $c \ne d$. Since $\lim_{j \to \infty} f_{t_j}(c) = c$ and $\lim_{j \to \infty} f_{t_j}(d) = d$, we have by Lemma 5.3(1) that $\lim_{j \to \infty} f_{t_j}([d, c]) = [d, c]$. Note that $S(y_0 \cdot t_j) = f_{t_j}(S(y_0)) = f_{t_j}(S(y_0) \cap [a, b]) \cup f_{t_j}(S(y_0) \cap [d, c])$. It follows that $\limsup_{j \to \infty} S(y_0 \cdot t_j) \subseteq \{a\} \cup [d, c]$. Since $b \notin \{a\} \cup [d, c]$, (7.2) holds.

Now, since $y_0 \in Y^c$ and $y_0 \cdot t_j \to y_0$, we have that $S(y_0 \cdot t_j) \to S(y_0)$, which contradicts (7.2). This proves the Claim.

It follows from the Claim that $S(y)$ is a finite set for any $y \in Y^c \cap Y_0^c$. Hence $M$ is an almost finite to one extension of $Y$. It follows from Proposition 3.2(1) that $M$ is an almost $n$–1 extension of $Y$ for some positive integer $n = n(M)$. Similarly, $M_0$ is an almost $n_0$–1 extension for some positive integer $n_0 = n(M_0)$. In fact, from the proof of Proposition 3.2(1), we also see that $|S(y)| = n$ for any $y \in Y^c$ and $|S_0(y)| = n_0$ for any $y \in Y_0^c$. For a fixed $y \in Y^c \cap Y_0^c$, since $S^1 \setminus S_0(y)$ has precisely $n_0$ connected components, $N(y) \le n_0$. Using the Claim, we also have $n = |S(y)| = N(y)$. Hence $n \le n_0$. Similarly, $n_0 \le n$. This shows that $n = n_0$.

(b) Let $M_0, M$ be two minimal sets of $(S^1 \times Y, \mathbb{T})$ among which $M_0$ is almost automorphic. We consider the set $A_0$ of almost automorphic points of $M_0$. Since $A_0$ is a residual subset of $M_0$, it follows from Lemma 5.5 that
\[ Y_0^* = \{ y \in Y : A_0 \cap \pi^{-1}(y) \text{ is a residual subset of } \pi^{-1}(y) \cap M_0 \} \]
is a residual subset of $Y$. Let $Y_0^c, Y^c, S_0(y), S(y)$ be as in (a) for the minimal sets $M_0, M$ and take any point $y_0 \in Y^c \cap Y_0^c \cap Y_0^*$. By (a), $n = |S_0(y_0)| = |S(y_0)|$. Since $A_0 \cap \pi^{-1}(y_0) = \pi^{-1}(y_0) \cap M_0$, we see that for each $s \in S_0(y_0)$, $(s, y_0) \in A_0$.
If $n = 1$, then $M$ is an almost 1–1 extension of $Y$, hence it is almost automorphic by Theorem 3.2. We now assume $n \ge 2$. Since $S^1 \setminus S_0(y_0)$ has precisely $n$ connected components and each connected component contains precisely one point in $S(y_0)$, there exist $a_1 \ne a_2 \in S_0(y_0)$ and $b \in S(y_0)$ such that $\{b\} = [a_1, a_2] \cap S(y_0)$. We want to show that $(b, y_0)$ is an almost automorphic point of $M$.

Let $\{t_i\}$ be any sequence in $\mathbb{T}$. Since $(a_1, y_0)$ and $(a_2, y_0)$ are almost automorphic points of $M_0$, there exist a subsequence $\{t_i'\} \subseteq \{t_i\}$ and $a_1', a_2' \in S_0(y)$ for some $y \in Y$ such that $\lim_{i \to \infty} (a_j, y_0) \cdot t_i' = (a_j', y)$ and $\lim_{i \to \infty} (a_j', y) \cdot (-t_i') = (a_j, y_0)$, for $j = 1, 2$. Taking a subsequence of $\{t_i'\}$ if necessary, we may assume that there exist $b' \in S(y)$ and $b'' \in S(y_0)$ such that $\lim_{i \to \infty} (b, y_0) \cdot t_i' = (b', y)$ and $\lim_{i \to \infty} (b', y) \cdot (-t_i') = (b'', y_0)$.
Since $\lim_{i \to \infty} \lim_{m \to \infty} (a_j, y_0) \cdot (t_m' - t_i') = (a_j, y_0)$, $j = 1, 2$, and $\lim_{i \to \infty} \lim_{m \to \infty} (b, y_0) \cdot (t_m' - t_i') = (b'', y_0)$, there exist sequences $\{m_k\}$ and $\{i_k\}$ such that if $r_k = t'_{m_k} - t'_{i_k}$ for all $k$, then $\lim_{k \to \infty} (a_j, y_0) \cdot r_k = (a_j, y_0)$, $j = 1, 2$, and $\lim_{k \to \infty} (b, y_0) \cdot r_k = (b'', y_0)$. Again, let $f_t$ be as in Lemma 5.3. Since $\lim_{k \to \infty} f_{r_k}(a_j) = a_j$, we have by Lemma 5.3(1) that $\lim_{k \to \infty} f_{r_k}([a_1, a_2]) = [a_1, a_2]$. Now, $b'' = \lim_{k \to \infty} f_{r_k}(b) \in \lim_{k \to \infty} f_{r_k}([a_1, a_2]) = [a_1, a_2]$. Hence $b'' \in [a_1, a_2] \cap S(y_0) = \{b\}$, i.e., $b'' = b$. This shows that $(b, y_0)$ is an almost automorphic point of $M$, implying that $M$ is almost automorphic. □

Theorem 7.3. Consider an APCF $(S^1 \times Y, \mathbb{T})$ in which $Y$ is locally connected. If there are at least two minimal sets, then each minimal set is almost automorphic.

Proof. Let $M, M_0$ be two minimal sets of $(S^1 \times Y, \mathbb{T})$ and let $S(y), S_0(y), \rho, \rho_0, Y^c, Y_0^c$ be defined as in the proof of Theorem 7.2 for the present $M, M_0$. Fix $y_0 \in Y^c \cap Y_0^c$. It follows from the proof of Theorem 7.2 that $n := |S_0(y_0)| = |S(y_0)| < +\infty$. Again, if $n = 1$, then $M$ is an almost 1–1 extension of $Y$, hence by Theorem 3.2 it is almost automorphic. We now assume that $n \ge 2$. Since $S^1 \setminus S_0(y_0)$ has precisely $n$ connected components and each of them has precisely one point in $S(y_0)$, there exist points $0 \le t_1 < r_1 < t_2 < r_2 < \cdots < t_n < r_n < 1 + t_1$ such that $S(y_0) = \{a_1, a_2, \ldots, a_n\}$ and $S_0(y_0) = \{b_1, b_2, \ldots, b_n\}$, where $a_j = e^{2\pi i t_j}$, $b_j = e^{2\pi i r_j}$, $j = 1, 2, \ldots, n$.

We want to show that $(a_1, y_0)$ is an almost automorphic point of $M$. By Theorem 3.1, it is sufficient to show that for any open neighborhood $U$ of $(a_1, y_0)$ in $S^1 \times Y$, the recurrent time set $N((a_1, y_0), U)$ is a $\Delta^*$-set.

Let $U$ be an open neighborhood of $(a_1, y_0)$ in $S^1 \times Y$. It is clear that there exist open neighborhoods $W_1$ of $y_0$ in $Y$ and $E$ of $a_1$ in $S^1$ such that $E \times W_1 \subseteq U$. Let $\delta > 0$ be sufficiently small and
\[ a_j^+ := e^{2\pi i (t_j + \delta)}, \quad a_j^- := e^{2\pi i (t_j - \delta)}, \quad b_j^+ := e^{2\pi i (r_j + \delta)}, \quad b_j^- := e^{2\pi i (r_j - \delta)}, \]
\[ A_j = [a_j^-, a_j^+], \quad B_j = [b_j^-, b_j^+], \qquad j = 1, 2, \ldots, n, \]
be such that $A_1 \subset E$ and $A_1, B_1, A_2, B_2, \ldots, A_n, B_n$ are pairwise disjoint, i.e., $t_1 - \delta < t_1 + \delta < r_1 - \delta < r_1 + \delta < t_2 - \delta < \cdots < t_n + \delta < r_n - \delta < r_n + \delta < 1 + t_1 - \delta$.

Since $y_0 \in Y^c \cap Y_0^c$, there exists an open neighborhood $W$ of $y_0$ with $W \subseteq W_1$ such that for all $y \in W$, $S(y) \subseteq \bigcup_{j=1}^n \mathrm{int}(A_j)$, $S_0(y) \subseteq \bigcup_{j=1}^n \mathrm{int}(B_j)$, $S(y) \cap \mathrm{int}(A_j) \ne \emptyset$, and $S_0(y) \cap \mathrm{int}(B_j) \ne \emptyset$ for all $j = 1, 2, \ldots, n$. Thus for each $y \in W$ and $j = 1, 2, \ldots, n$ there exist $t_{j,y}^{\pm}, r_{j,y}^{\pm}$ with $t_j - \delta \le t_{j,y}^- \le t_{j,y}^+ \le t_j + \delta$ and $r_j - \delta \le r_{j,y}^- \le r_{j,y}^+ \le r_j + \delta$ such that if
\[ a_{j,y}^{\pm} := e^{2\pi i t_{j,y}^{\pm}} \in S(y), \qquad b_{j,y}^{\pm} := e^{2\pi i r_{j,y}^{\pm}} \in S_0(y), \]
then $S(y) \cap A_j = S(y) \cap A_{j,y} \subseteq \mathrm{int}(A_j)$ and $S_0(y) \cap B_j = S_0(y) \cap B_{j,y} \subseteq \mathrm{int}(B_j)$, where $A_{j,y} = [a_{j,y}^-, a_{j,y}^+]$ and $B_{j,y} = [b_{j,y}^-, b_{j,y}^+]$. It is clear that $A_{1,y}, B_{1,y}, A_{2,y}, B_{2,y}, \ldots, A_{n,y}, B_{n,y}$ are pairwise disjoint for each $y \in W$, and, for each $j = 1, 2, \ldots, n$, the map $E_j^- : W \to [t_j - \delta, t_j + \delta] : y \mapsto t_{j,y}^-$ is lower semi-continuous and the map $E_j^+ : W \to [t_j - \delta, t_j + \delta] : y \mapsto t_{j,y}^+$ is upper semi-continuous. It follows that for each $j = 1, 2, \ldots, n$, both maps $E_j : W \to 2^{[t_j - \delta, t_j + \delta]} : y \mapsto [t_{j,y}^-, t_{j,y}^+]$ and $\phi_j : W \to 2^{S^1} : y \mapsto A_{j,y} = [a_{j,y}^-, a_{j,y}^+] = \{e^{2\pi i t} : t \in [t_{j,y}^-, t_{j,y}^+]\}$ are upper semi-continuous.
Claim 1. Given $j \in \{1, 2, \ldots, n\}$, $y \in W$, and $t \in \mathbb{T}$ such that $y \cdot t \in W$, there exists a unique $L_j^t(y) \in \{1, 2, \ldots, n\}$ such that $(A_{j,y} \times \{y\}) \cdot t = A_{L_j^t(y),\, y \cdot t} \times \{y \cdot t\}$. Moreover, for fixed $y, t$ as above, the map $L_{\{\cdot\}}^t(y) : \{1, 2, \ldots, n\} \to \{1, 2, \ldots, n\}$ is a permutation of $\{1, 2, \ldots, n\}$.

Let $j, y, t$ be given as above. We consider the orientation preserving homeomorphism $h : S^1 \to S^1 : s \mapsto \psi(s, y, t)$. It is clear that $(A_{j,y} \times \{y\}) \cdot t = h(A_{j,y}) \times \{y \cdot t\}$, $h(A_{j,y}) = [h(a_{j,y}^-), h(a_{j,y}^+)]$, and $h(a_{j,y}^-), h(a_{j,y}^+) \in S(y \cdot t)$. If there exists $b \in h(A_{j,y}) \cap S_0(y \cdot t)$, i.e., $(b, y \cdot t) \in (A_{j,y} \times \{y\}) \cdot t$ and $(b, y \cdot t) \in M_0$, then there exists $b' \in A_{j,y}$ such that $(b', y) \cdot t = (b, y \cdot t) \in M_0$. It follows that $(b', y) \in M_0$, i.e., $b' \in A_{j,y} \cap S_0(y) \subseteq A_j \cap (\bigcup_{i=1}^n B_i) = \emptyset$, which is impossible. Hence
\[ h(A_{j,y}) \cap S_0(y \cdot t) = \emptyset. \tag{7.3} \]

Since $y \cdot t \in W$ and $h(a_{j,y}^-), h(a_{j,y}^+) \in S(y \cdot t)$, there exist $i_1, i_2 \in \{1, 2, \ldots, n\}$ such that $h(a_{j,y}^-) \in A_{i_1, y \cdot t} \subseteq A_{i_1}$ and $h(a_{j,y}^+) \in A_{i_2, y \cdot t} \subseteq A_{i_2}$. Since $[h(a_{j,y}^-), h(a_{j,y}^+)] \cap S_0(y \cdot t) = \emptyset$, we must have $i_1 = i_2$. For otherwise, $i_1 \ne i_2$, and the arc $[h(a_{j,y}^-), h(a_{j,y}^+)]$ intersects both $A_{i_1}$ and $A_{i_2}$. It follows that $B_{i_1} \subseteq [h(a_{j,y}^-), h(a_{j,y}^+)]$, and hence $b_{i_1, y \cdot t}^- \in B_{i_1} \cap S_0(y \cdot t) \subseteq [h(a_{j,y}^-), h(a_{j,y}^+)] \cap S_0(y \cdot t)$, which is impossible by (7.3).

Now let $L_j^t(y) = i_1$. Then $h(a_{j,y}^-), h(a_{j,y}^+) \in A_{L_j^t(y),\, y \cdot t}$. Since $A_{L_j^t(y),\, y \cdot t}$ is a sub-arc of $S^1$, either (i) $[h(a_{j,y}^-), h(a_{j,y}^+)] \subseteq A_{L_j^t(y),\, y \cdot t}$ or (ii) $[h(a_{j,y}^-), h(a_{j,y}^+)] \supseteq S^1 \setminus A_{L_j^t(y),\, y \cdot t}$. But the case (ii) is impossible because $S_0(y \cdot t) \subseteq S^1 \setminus A_{L_j^t(y),\, y \cdot t}$ and $[h(a_{j,y}^-), h(a_{j,y}^+)] \cap S_0(y \cdot t) = \emptyset$. It now follows from (i) that
\[ (A_{j,y} \times \{y\}) \cdot t \subseteq A_{L_j^t(y),\, y \cdot t} \times \{y \cdot t\}. \tag{7.4} \]
Such $L_j^t(y)$ is unique because $A_{1, y \cdot t}, A_{2, y \cdot t}, \ldots, A_{n, y \cdot t}$ are pairwise disjoint.

Let $y' = y \cdot t$ and $t' = -t$. Then $y', y' \cdot t' \in W$. From the above, for each $i = 1, 2, \ldots, n$, there exists a unique $L_i^{t'}(y') \in \{1, 2, \ldots, n\}$ such that
\[ (A_{i,y'} \times \{y'\}) \cdot t' \subseteq A_{L_i^{t'}(y'),\, y' \cdot t'} \times \{y' \cdot t'\}. \tag{7.5} \]
For each $j = 1, 2, \ldots, n$, we let $i(j) = L_j^t(y)$. By (7.4) and (7.5), we have
\[ A_{j,y} \times \{y\} \subseteq (A_{i(j),y'} \times \{y'\}) \cdot t' \subseteq A_{L_{i(j)}^{t'}(y'),\, y} \times \{y\}. \]
Since $A_{1, y \cdot t}, A_{2, y \cdot t}, \ldots, A_{n, y \cdot t}$ are pairwise disjoint, $L_{i(j)}^{t'}(y') = j$ and $A_{j,y} \times \{y\} = (A_{i(j),y'} \times \{y'\}) \cdot t'$. This implies that
\[ (A_{j,y} \times \{y\}) \cdot t = (A_{i(j),y'} \times \{y'\}) \cdot t' \cdot t = A_{i(j),y'} \times \{y'\} = A_{L_j^t(y),\, y \cdot t} \times \{y \cdot t\}, \]
i.e., $(A_{j,y} \times \{y\}) \cdot t = A_{L_j^t(y),\, y \cdot t} \times \{y \cdot t\}$.
Note that $L_j^t(y) \in \{1, 2, \ldots, n\}$, $(A_{j,y} \times \{y\}) \cdot t = A_{L_j^t(y),\, y \cdot t} \times \{y \cdot t\}$ for each $j = 1, 2, \ldots, n$, and $A_{1,y}, A_{2,y}, \ldots, A_{n,y}$ are pairwise disjoint. We have that $L_{\{\cdot\}}^t(y) : \{1, 2, \ldots, n\} \to \{1, 2, \ldots, n\}$ is one to one, i.e., a permutation of $\{1, 2, \ldots, n\}$. This proves Claim 1.

Claim 2. Let $V$ be a nonempty, connected, closed subset of $W$ and $t \in \mathbb{T}$ be such that $V \cdot t \subset W$. Then there exists a permutation $P$ of $\{1, 2, \ldots, n\}$ such that for each $j = 1, 2, \ldots, n$ and $y \in V$, $(A_{j,y} \times \{y\}) \cdot t = A_{P(j),\, y \cdot t} \times \{y \cdot t\}$.

Let $j \in \{1, 2, \ldots, n\}$ and $y \in V$ be given. By Claim 1, there exists a unique $L_j^t(y) \in \{1, 2, \ldots, n\}$ such that $(A_{j,y} \times \{y\}) \cdot t = A_{L_j^t(y),\, y \cdot t} \times \{y \cdot t\}$.

We first show that the map $L_j^t(\cdot) : V \to \{1, 2, \ldots, n\}$ is continuous. Let $\{y_k\}_{k=1}^{\infty} \subset V$ converge to some $y \in V$ and denote $i = L_j^t(y)$. If $L_j^t(y_k) \nrightarrow L_j^t(y)$, then by Claim 1 there exist a subsequence $\{k_1 < k_2 < \cdots\}$ of $\{k\}$ and $r \in \{1, 2, \ldots, n\} \setminus \{i\}$ such that $L_j^t(y_{k_\ell}) = r$ for each $\ell \in \mathbb{N}$. Take a sequence of points $z_\ell \in A_{j, y_{k_\ell}}$, $\ell \in \mathbb{N}$. We assume without loss of generality that $\lim_{\ell \to \infty} z_\ell = z$ for some $z \in S^1$. By the upper semi-continuity of the map $\phi_j : W \to 2^{S^1} : y \mapsto A_{j,y}$, we have that $z \in A_{j,y}$. Since $(z_\ell, y_{k_\ell}) \cdot t \in A_{r,\, y_{k_\ell} \cdot t} \times \{y_{k_\ell} \cdot t\}$, it again follows from the upper semi-continuity of $\phi_j$ that $(z, y) \cdot t = \lim_{\ell \to \infty} (z_\ell, y_{k_\ell}) \cdot t \in A_{r,\, y \cdot t} \times \{y \cdot t\}$. But since $(z, y) \cdot t \in (A_{j,y} \times \{y\}) \cdot t = A_{i,\, y \cdot t} \times \{y \cdot t\}$, $(z, y) \cdot t \in (A_{i,\, y \cdot t} \times \{y \cdot t\}) \cap (A_{r,\, y \cdot t} \times \{y \cdot t\})$. Hence $A_{i,\, y \cdot t} \cap A_{r,\, y \cdot t} \ne \emptyset$, a contradiction to the fact that $i \ne r$. This shows the continuity of $L_j^t(\cdot) : V \to \{1, 2, \ldots, n\}$.

Now, for each $j = 1, 2, \ldots, n$, $L_j^t(\cdot) : V \to \{1, 2, \ldots, n\}$ must be a constant map since its domain is connected and its range is discrete. Let $y \in V$ and $P(j) = L_j^t(y)$, $j = 1, 2, \ldots, n$. Then $(A_{j,y} \times \{y\}) \cdot t = A_{P(j),\, y \cdot t} \times \{y \cdot t\}$. It follows from Claim 1 that $P : \{1, 2, \ldots, n\} \to \{1, 2, \ldots, n\}$ is a permutation of $\{1, 2, \ldots, n\}$.

Claim 3. For any sequence $\{s_i\}$ in $\mathbb{T}$, $N((a_1, y_0), U) \cap \{s_k - s_{k'} : k > k'\} \ne \emptyset$, i.e., $N((a_1, y_0), U)$ is a $\Delta^*$-set.

Let $m = n!$ and $d$ be an invariant compatible metric on $Y$, i.e., $d(y_1 \cdot t, y_2 \cdot t) = d(y_1, y_2)$, $y_1, y_2 \in Y$, $t \in \mathbb{T}$. Since $(Y, \mathbb{T})$ is almost periodic, there exist an increasing subsequence $\{i_k\} \subset \mathbb{N}$, an integer $r \in \{0, 1, \ldots, m-1\}$, and a homeomorphism $g : Y \to Y$ such that $\frac{s_{i_k} - r}{m} \in \mathbb{T}$ and
\[ \sup_{y \in Y} d\Big( y \cdot \frac{s_{i_k} - r}{m},\ g(y) \Big) \le \frac{1}{2^k} \tag{7.6} \]
for all $k = 1, 2, \ldots$. Since $Y$ is locally connected, there exists a connected closed neighborhood $V$ of $y_0$ such that $B_\delta(V) := \{y \in Y : d(y, V) < \delta\} \subseteq W$ and $B(y_0, \delta) \subseteq V$ for some $\delta > 0$. Let $K$ be a natural number such that $\frac{1}{2^K} < \frac{\delta}{2m}$. Then for any $u, v \in \mathbb{N}$ with $u > v \ge K$, we have by (7.6) that
\[ \sup_{y \in Y} d\Big( y \cdot \frac{s_{i_u} - s_{i_v}}{m},\ y \Big) = \sup_{y \in Y} d\Big( y \cdot \frac{s_{i_u} - r}{m},\ y \cdot \frac{s_{i_v} - r}{m} \Big) \le \sup_{y \in Y} d\Big( y \cdot \frac{s_{i_u} - r}{m},\ g(y) \Big) + \sup_{y \in Y} d\Big( y \cdot \frac{s_{i_v} - r}{m},\ g(y) \Big) \le \frac{1}{2^u} + \frac{1}{2^v} < \frac{1}{2^{K-1}} < \frac{\delta}{m}. \]
It follows that $d(V \cdot (\frac{s_{i_u} - s_{i_v}}{m}), V) < \frac{\delta}{m}$, and, for each $\ell = 1, 2, \ldots, n$,
\[ d\Big( y_0 \cdot \Big(\ell\, \frac{s_{i_u} - s_{i_v}}{m}\Big),\ y_0 \Big) \le \sum_{j=1}^{\ell} d\Big( y_0 \cdot \Big(j\, \frac{s_{i_u} - s_{i_v}}{m}\Big),\ y_0 \cdot \Big((j-1)\, \frac{s_{i_u} - s_{i_v}}{m}\Big) \Big) = \sum_{j=1}^{\ell} d\Big( y_0 \cdot \frac{s_{i_u} - s_{i_v}}{m},\ y_0 \Big) < \ell\, \frac{\delta}{m} \le \delta, \]
i.e., $y_0 \cdot (\ell\, \frac{s_{i_u} - s_{i_v}}{m}) \in V$. Let $t = \frac{s_{i_u} - s_{i_v}}{m}$. Since $V \cdot t \subseteq W$, we have by Claim 2 that there exists a permutation $P$ of $\{1, 2, \ldots, n\}$ such that for each $j = 1, 2, \ldots, n$ and $y \in V$, $(A_{j,y} \times \{y\}) \cdot t = A_{P(j),\, y \cdot t} \times \{y \cdot t\}$. Clearly, $P^m(j) = j$ for each $j = 1, 2, \ldots, n$. Thus, for any $w = 0, 1, 2, \ldots, m-1$, since $y_0 \cdot (wt) \in V$, we have
\[ (A_{P^w(1),\, y_0 \cdot (wt)} \times \{y_0 \cdot (wt)\}) \cdot t = A_{P^{w+1}(1),\, y_0 \cdot ((w+1)t)} \times \{y_0 \cdot ((w+1)t)\}, \]
where $P^0(1) = 1$. By induction, we further have
\[ (A_{1, y_0} \times \{y_0\}) \cdot (mt) = A_{1,\, y_0 \cdot (mt)} \times \{y_0 \cdot (mt)\} \subseteq A_1 \times V \subseteq U. \]
Since $A_{1, y_0} \times \{y_0\} = \{(a_1, y_0)\}$ and $mt = s_{i_u} - s_{i_v}$, we have $(a_1, y_0) \cdot (s_{i_u} - s_{i_v}) \in U$. Hence $s_{i_u} - s_{i_v} \in N((a_1, y_0), U) \cap \{s_k - s_{k'} : k > k'\}$, i.e., $N((a_1, y_0), U) \cap \{s_k - s_{k'} : k > k'\} \ne \emptyset$. Since $\{s_i\}$ is arbitrary, $N((a_1, y_0), U)$ is a $\Delta^*$-set. This completes the proof. □

Now, parts (1) and (2) of Theorem 4 are the respective parts in Theorem 7.2 above, and part (3) of Theorem 4 is just Theorem 7.3 above.

8. Mean motion, transitivity, and connectivity

In this section, we consider a SPCF $(S^1 \times Y, \mathbb{T}) = (S^1 \times Y, \{\hat\Lambda_t\}_{t \in \mathbb{T}})$ in the angular form (1.3), i.e.,
\[ \hat\Lambda_t(\phi_0, y_0) = (\phi(\phi_0, y_0, t),\ y_0 \cdot t), \qquad t \in \mathbb{T}, \tag{8.1} \]
where $\phi, \phi_0 \in \mathbb{R}^1 \pmod 1$, $y_0 \in Y$. Let $\tilde\phi(\tilde\phi_0, y_0, t)$ be the lift of $\phi(\phi_0, y_0, t)$ in $\mathbb{R}^1$ satisfying $\tilde\phi(\tilde\phi_0 + 1, y_0, t) \equiv \tilde\phi(\tilde\phi_0, y_0, t) + 1$. Then it is clear that $\hat\Lambda_t$ is generated from the flow $\tilde\Lambda_t : \mathbb{R}^1 \times Y \to \mathbb{R}^1 \times Y$:
\[ \tilde\Lambda_t(\tilde\phi_0, y_0) = (\tilde\phi(\tilde\phi_0, y_0, t),\ y_0 \cdot t), \qquad t \in \mathbb{T}, \]
when $\tilde\phi_0, \tilde\phi$ are identified modulo 1.
Throughout the section, for simplicity, we will often use the same symbol $\phi_0$ to denote a point $\phi_0 \in S^1$ and its lift $\tilde\phi_0 \in \mathbb{R}^1$.

8.1. Rotation number and mean motion

It is more or less known that a SPCF (8.1) with uniquely ergodic base flow $(Y, \mathbb{T})$ admits a well-defined rotation number (see [23] for the discrete case and [34] for certain almost periodic continuous case). Below, for the sake of completeness, we give a unified proof of this result for both discrete and continuous cases. The following result is known as the Oxtoby ergodic theorem (see [15]).

Lemma 8.1. Let $(X, \mathbb{T})$ be uniquely ergodic and $f \in C(X, \mathbb{R}^1)$. Then for any $x \in X$,
\[ \lim_{T \to +\infty} \frac{1}{\lambda_{\mathbb{T}}([0, T) \cap \mathbb{T})} \int_{[0,T) \cap \mathbb{T}} f(x \cdot t)\, d\lambda_{\mathbb{T}}(t) = \int_X f(z)\, d\mu(z), \]
\[ \lim_{T \to +\infty} \frac{1}{\lambda_{\mathbb{T}}((-T, 0] \cap \mathbb{T})} \int_{(-T,0] \cap \mathbb{T}} f(x \cdot t)\, d\lambda_{\mathbb{T}}(t) = \int_X f(z)\, d\mu(z), \]
where $\lambda_{\mathbb{T}}$ is the Haar measure on $\mathbb{T}$ with $\lambda_{\mathbb{T}}([0, 1) \cap \mathbb{T}) = 1$ and $\mu$ is the unique $\mathbb{T}$-invariant probability measure on $(X, \mathbb{T})$.

Theorem 8.1. Consider the SPCF (8.1) with uniquely ergodic base flow $(Y, \mathbb{T})$. Then for any $\phi_0 \in \mathbb{R}^1$, $y_0 \in Y$, the limit
\[ \rho = \lim_{t \to \infty} \frac{\tilde\phi(\phi_0, y_0, t)}{t} \]
exists and is independent of the choice of $(\phi_0, y_0) \in \mathbb{R}^1 \times Y$.

Proof. For simplicity, we only consider the limit as $t \to +\infty$.

First, we observe from the periodicity of $\tilde\phi$ in the first argument that for any $t \in \mathbb{T}$, $y \in Y$, if $\phi_1^*, \phi_2^* \in \mathbb{R}^1$ are such that $|\phi_1^* - \phi_2^*| < l$ for some positive integer $l$, then also
\[ |\tilde\phi(\phi_1^*, y, t) - \tilde\phi(\phi_2^*, y, t)| < l. \]
Next, for any $\phi_0 \in \mathbb{R}^1$, $y \in Y$, $t, s \in \mathbb{T}$, we let $0 \le \phi_1, \phi_2 < 1$ be such that
\[ \phi_1 \equiv \phi_0, \qquad \phi_2 \equiv \tilde\phi(\phi_0, y, t) \pmod 1. \]
Then
\[ \begin{aligned} \tilde\phi(\phi_0, y \cdot t, s) - \phi_0 &= \tilde\phi(\phi_1, y \cdot t, s) - \phi_1, \\ \tilde\phi(\phi_0, y, t+s) - \tilde\phi(\phi_0, y, t) &= \tilde\phi(\phi_2, y \cdot t, s) - \phi_2. \end{aligned} \tag{8.2} \]
It follows from (8.2) that
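\[ \begin{aligned} \big| \tilde\phi(\phi_0, y, t+s) - \tilde\phi(\phi_0, y, t) - \tilde\phi(\phi_0, y \cdot t, s) + \phi_0 \big| &= \big| \tilde\phi(\phi_2, y \cdot t, s) - \tilde\phi(\phi_1, y \cdot t, s) - \phi_2 + \phi_1 \big| \\ &\le \big| \tilde\phi(\phi_2, y \cdot t, s) - \tilde\phi(\phi_1, y \cdot t, s) \big| + |\phi_2 - \phi_1| \le 4, \end{aligned} \]
i.e.,
\[ -4 \le \tilde\phi(\phi_0, y, t+s) - \tilde\phi(\phi_0, y, t) - \tilde\phi(\phi_0, y \cdot t, s) + \phi_0 \le 4. \tag{8.3} \]
Integrating (8.3) with respect to $t$ from 0 to a positive number $T \in \mathbb{T}$ yields that
\[ -4 \le \frac{1}{T} \Big( \int_{[0,T) \cap \mathbb{T}} \big[ \tilde\phi(\phi_0, y, t+s) - \tilde\phi(\phi_0, y, t) \big]\, d\lambda_{\mathbb{T}}(t) - \int_{[0,T) \cap \mathbb{T}} \big[ \tilde\phi(\phi_0, y \cdot t, s) - \phi_0 \big]\, d\lambda_{\mathbb{T}}(t) \Big) \le 4. \tag{8.4} \]
For any positive number $s \in \mathbb{T}$, we let $M_s = \sup\{ |\tilde\phi(\phi_0, z, \tau) - \phi_0| : \phi_0 \in \mathbb{R}^1,\ z \in Y,\ |\tau| \le s \}$. It is clear that $0 \le M_s < +\infty$ and
\[ \begin{aligned} &\Big| \int_{[0,T) \cap \mathbb{T}} \big[ \tilde\phi(\phi_0, y, t+s) - \tilde\phi(\phi_0, y, t) \big]\, d\lambda_{\mathbb{T}}(t) - s\, \tilde\phi(\phi_0, y, T) \Big| \\ &\quad = \Big| \int_{[0,s) \cap \mathbb{T}} \big[ \tilde\phi(\phi_0, y, T+t) - \tilde\phi(\phi_0, y, t) \big]\, d\lambda_{\mathbb{T}}(t) - s\, \tilde\phi(\phi_0, y, T) \Big| \\ &\quad = \Big| \int_{[0,s) \cap \mathbb{T}} \big[ \tilde\phi(\tilde\phi(\phi_0, y, T), y \cdot T, t) - \tilde\phi(\phi_0, y, T) \big]\, d\lambda_{\mathbb{T}}(t) - \int_{[0,s) \cap \mathbb{T}} \big[ \tilde\phi(\phi_0, y, t) - \phi_0 \big]\, d\lambda_{\mathbb{T}}(t) - s \phi_0 \Big| \\ &\quad \le s\big( |\phi_0| + 2 M_s \big) =: s \Phi_s. \end{aligned} \]
Combining this with (8.4) yields that
\[ -\frac{4T + s\Phi_s}{sT} \le \frac{1}{T}\, \tilde\phi(\phi_0, y, T) - \frac{1}{sT} \int_{[0,T) \cap \mathbb{T}} \big[ \tilde\phi(\phi_0, y \cdot t, s) - \phi_0 \big]\, d\lambda_{\mathbb{T}}(t) \le \frac{4T + s\Phi_s}{sT}. \tag{8.5} \]
Now consider the functions $\rho^*(y) = \limsup_{t \to +\infty} \frac{\tilde\phi(\phi_0, y, t)}{t}$ and $\rho_*(y) = \liminf_{t \to +\infty} \frac{\tilde\phi(\phi_0, y, t)}{t}$. We note by (8.2) that both $\rho^*(y)$ and $\rho_*(y)$ are independent of $\phi_0$. Since $(Y, \mathbb{T})$ is uniquely ergodic, we have by Lemma 8.1 that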
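\[ \lim_{T \to +\infty} \frac{1}{T} \int_{[0,T) \cap \mathbb{T}} \big[ \tilde\phi(\phi_0, y \cdot t, s) - \phi_0 \big]\, d\lambda_{\mathbb{T}}(t) = \int_Y \big[ \tilde\phi(\phi_0, z, s) - \phi_0 \big]\, d\mu(z), \tag{8.6} \]
where $\mu$ is the unique $\mathbb{T}$-invariant Borel probability measure on $(Y, \mathbb{T})$. By letting $T \to +\infty$ in (8.5) and applying (8.6), we have that
\[ -\frac{4}{s} + \frac{1}{s} \int_Y \big[ \tilde\phi(\phi_0, z, s) - \phi_0 \big]\, d\mu(z) \le \rho_*(y) \le \rho^*(y) \le \frac{4}{s} + \frac{1}{s} \int_Y \big[ \tilde\phi(\phi_0, z, s) - \phi_0 \big]\, d\mu(z). \]
Now, taking the limit $s \to +\infty$ in the above, we see that $\rho_*(y) = \rho^*(y)$ and equals
\[ \lim_{s \to +\infty} \frac{1}{s} \int_Y \big[ \tilde\phi(\phi_0, z, s) - \phi_0 \big]\, d\mu(z), \]
which is a constant, denoted by $\rho$. □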
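For the discrete case $\mathbb{T} = \mathbb{Z}$, the limit in Theorem 8.1 can be explored numerically. The following Python sketch is an illustration only and is not taken from the paper: the fibre map `lift_step`, its parameters `a`, `eps`, `b`, and the golden-mean base rotation `OMEGA` are hypothetical choices. They give a lift that is increasing in $\phi$ and commutes with $\phi \mapsto \phi + 1$, over an irrational rotation of the circle (a uniquely ergodic base). The finite-time quotient $(\tilde\phi(\phi_0, y_0, n) - \phi_0)/n$ serves as an estimate of $\rho$.

```python
import math

# Hypothetical base flow: rotation of the circle R/Z by the golden mean (irrational).
OMEGA = (math.sqrt(5) - 1) / 2


def lift_step(phi, y, a=0.3, eps=0.5, b=0.2):
    """One step of an assumed lifted fibre map.

    It satisfies lift_step(phi + 1, y) = lift_step(phi, y) + 1 and, for
    0 <= eps < 1, is strictly increasing in phi.
    """
    return (phi + a
            + eps / (2 * math.pi) * math.sin(2 * math.pi * phi)
            + b * math.cos(2 * math.pi * y))


def rotation_number_estimate(phi0, y0, n, omega=OMEGA, **params):
    """Return (lift(phi0, y0, n) - phi0) / n, a finite-time estimate of rho."""
    phi, y = phi0, y0
    for _ in range(n):
        phi = lift_step(phi, y, **params)
        y = (y + omega) % 1.0  # advance the base point
    return (phi - phi0) / n


if __name__ == "__main__":
    for n in (100, 1000, 10000):
        print(n, rotation_number_estimate(0.0, 0.1, n))
```

For a fixed base point, two initial angles less than 1 apart give estimates that differ by less than $1/n$, because the lift is increasing and commutes with integer translations.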
The limit in the theorem is referred to as the rotation number associated with the SPCF (8.1). Recall that the SPCF (8.1) is said to admit mean motion if
\[ \sup_{t \in \mathbb{T}} \big| \tilde\phi(\phi_0, y_0, t) - \phi_0 - \rho t \big| < \infty \]
for all $(\phi_0, y_0) \in \mathbb{R}^1 \times Y$.

Theorem 8.2. Consider the SPCF (8.1) with strictly ergodic base flow $(Y, \mathbb{T})$. Then the following are equivalent:
(a) (8.1) admits mean motion;
(b) $\sup_{t \ge 0} |\tilde\phi(\phi_0, y_0, t) - \phi_0 - \rho t| < +\infty$ for some $(\phi_0, y_0) \in \mathbb{R}^1 \times Y$;
(c) $\sup_{t \le 0} |\tilde\phi(\phi_0, y_0, t) - \phi_0 - \rho t| < +\infty$ for some $(\phi_0, y_0) \in \mathbb{R}^1 \times Y$;
(d) $\sup_{t \ge 0} (\tilde\phi(\phi_0, y_0, t) - \phi_0 - \rho t) < +\infty$ for all $(\phi_0, y_0) \in \mathbb{R}^1 \times Y$;
(e) $\sup_{t \le 0} (\tilde\phi(\phi_0, y_0, t) - \phi_0 - \rho t) < +\infty$ for all $(\phi_0, y_0) \in \mathbb{R}^1 \times Y$;
(f) $\inf_{t \ge 0} (\tilde\phi(\phi_0, y_0, t) - \phi_0 - \rho t) > -\infty$ for all $(\phi_0, y_0) \in \mathbb{R}^1 \times Y$;
(g) $\inf_{t \le 0} (\tilde\phi(\phi_0, y_0, t) - \phi_0 - \rho t) > -\infty$ for all $(\phi_0, y_0) \in \mathbb{R}^1 \times Y$.
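As a simple illustration of mean motion in the discrete case, consider a hypothetical model that is not from the paper: the lift $\phi \mapsto \phi + a + b\cos(2\pi y)$ over the rotation $y \mapsto y + \omega \pmod 1$. Here $\tilde\phi(\phi_0, y_0, n) - \phi_0 - a n = b \sum_{k=0}^{n-1} \cos(2\pi(y_0 + k\omega))$, a geometric-type sum bounded in absolute value by $b / |\sin(\pi\omega)|$, so the rotation number is $a$ and the forward deviation stays bounded. The Python sketch below records that deviation; all names and parameter values are assumptions made for illustration.

```python
import math

# Hypothetical irrational base rotation (golden mean).
OMEGA = (math.sqrt(5) - 1) / 2


def max_deviation(phi0, y0, n_steps, a=0.3, b=0.2, omega=OMEGA):
    """Largest |lift(phi0, y0, t) - phi0 - a*t| over t = 1, ..., n_steps.

    The assumed model has no dependence of the increment on phi, so its
    rotation number equals a and the deviation is a bounded cosine sum.
    """
    phi, y, worst = phi0, y0, 0.0
    for t in range(1, n_steps + 1):
        phi += a + b * math.cos(2 * math.pi * y)
        y = (y + omega) % 1.0
        worst = max(worst, abs(phi - phi0 - a * t))
    return worst


if __name__ == "__main__":
    bound = 0.2 / abs(math.sin(math.pi * OMEGA))
    print(max_deviation(0.0, 0.0, 5000), "<=", bound)
```

In this model the bound does not depend on the number of steps, which is the behaviour described by condition (b) for forward times.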
Proof. It is clear that (a) implies (b)–(g). Suppose (b) holds and let
\[ M = \sup_{t \ge 0} \big| \tilde\phi(\phi_0, y_0, t) - \phi_0 - \rho t \big|. \]
Then by the flow property,
\[ \big| \tilde\phi(\tilde\phi(\phi_0, y_0, s), y_0 \cdot s, t) - \tilde\phi(\phi_0, y_0, s) - \rho t + \tilde\phi(\phi_0, y_0, s) - \phi_0 - \rho s \big| \le M \tag{8.7} \]
for all $s + t \ge 0$. Consider the $\omega$-limit set $\omega(\phi_0, y_0)$ of $(\phi_0, y_0)$ with respect to the flow (8.1). For simplicity, we view $\omega(\phi_0, y_0)$ as a subset of $[0, 1] \times Y$. Let $(\phi_*, y_*) \in \omega(\phi_0, y_0)$ and $s_n \to +\infty$
be a sequence such that $\tilde\Lambda_{s_n}(\phi_0, y_0) \to (\phi_*, y_*)$. By taking a subsequence if necessary, we let $r(\phi_*, y_*) = \lim_{n \to \infty} |\tilde\phi(\phi_0, y_0, s_n) - \phi_0 - \rho s_n|$. It follows from (8.7) that
\[ \big| \tilde\phi(\phi_*, y_*, t) - \phi_* - \rho t \big| \le M + r(\phi_*, y_*) \]
for all $t \in \mathbb{T}$. Hence by (8.2),
\[ \sup_{t \in \mathbb{T}} \big| \tilde\phi(\phi^0, y^0, t) - \phi^0 - \rho t \big| < \infty \]
for all $(\phi^0, y^0) \in \mathbb{R}^1 \times Y$, i.e., (a) holds. This shows that (b) implies (a). Similarly, (c) implies (a).

Now let (d) hold. Suppose for contradiction that (a) fails. It follows from the flow property and the equivalence between (a) and (b) that
\[ \sup_{n \in \mathbb{N}} \big| \tilde\phi(\phi^*, y^*, n) - \phi^* - \rho n \big| = +\infty \tag{8.8} \]
for all $(\phi^*, y^*) \in \mathbb{R}^1 \times Y$. Let $E$ be a minimal set of the time-1 map $\hat\Lambda_1$ and consider the function $u : \mathbb{R}^1 \times Y \to \mathbb{R}^1 : u(\phi, y) = \tilde\phi(\phi, y, 1) - \phi - \rho$. Since $u(\phi + 1, y) \equiv u(\phi, y)$, $u$ can be viewed as a continuous function on $S^1 \times Y$. Using the flow property and induction, it is easy to see that
\[ \sum_{i=0}^{n-1} u\big( \tilde\Lambda_i(\phi, y) \big) = \tilde\phi(\phi, y, n) - \phi - \rho n \tag{8.9} \]
for all $n \in \mathbb{N}$ and $(\phi, y) \in \mathbb{R}^1 \times Y$. Hence by (8.8), $|\sum_{i=0}^{n-1} u(\tilde\Lambda_i(\phi^*, y^*))|$ is unbounded on $\mathbb{N}$ for any $(\phi^*, y^*) \in E$. Since $\lim_{n \to \infty} \frac{1}{n} \sum_{i=0}^{n-1} u(\tilde\Lambda_i(\phi^*, y^*)) = 0$ for any $(\phi^*, y^*) \in E$, there is a residual subset $E_*$ of $E$ such that for any $(\phi_*, y_*) \in E_*$ the function $\sum_{i=0}^{n-1} u(\tilde\Lambda_i(\phi_*, y_*))$ oscillates from $-\infty$ to $+\infty$ as $n \to +\infty$ (see e.g., [30,35]). In particular,
\[ \sup_{n \in \mathbb{N}} \big( \tilde\phi(\phi_*, y_*, n) - \phi_* - \rho n \big) = \sup_{n \in \mathbb{N}} \sum_{i=0}^{n-1} u\big( \tilde\Lambda_i(\phi_*, y_*) \big) = +\infty \tag{8.10} \]
for all $(\phi_*, y_*) \in E_*$. This contradicts the condition in (d). Hence (d) implies (a). Similarly, either (e) or (f) or (g) implies (a). □

8.2. APCF with mean motion

The aim of this subsection is to prove Theorem 6. We will need the following

Lemma 8.2. Let $M$ be a minimal set of an almost periodically forced skew-product flow $(\mathbb{R}^1 \times Y, \mathbb{T}) = (\mathbb{R}^1 \times Y, \{\Pi_t\}_{t \in \mathbb{T}})$:
\[ \Pi_t(x_0, y_0) = (x(x_0, y_0, t),\ y_0 \cdot t), \qquad t \in \mathbb{T}, \]
where $(Y, \mathbb{T})$ is an almost periodic minimal flow. Then $M$ is an almost 1–1 extension of $Y$, hence is almost automorphic.
Proof. In the case $\mathbb{T} = \mathbb{R}$, the lemma is a special case of the main result in [56] concerning totally monotone skew-product semiflows. The proof for the discrete case follows from that of the continuous case (see also [62]) almost word by word. □

The following theorem is just our main result Theorem 6.

Theorem 8.3. Suppose that (8.1) is an APCF which admits mean motion. Then the following hold.
(1) Each minimal set of (8.1) is almost automorphic whose frequency module is generated by the rotation number and the forcing frequencies.
(2) If a minimal set of (8.1) is an almost $N$–1 extension of $Y$ for some positive integer $N$, then $N$ is the smallest positive integer whose multiplication to the rotation number is contained in the frequency module of the forcing.

Proof. (1) Let $E$ be a minimal set of (8.1). It is sufficient to only consider the case $\mathbb{T} = \mathbb{Z}$, because, if $\mathbb{T} = \mathbb{R}$, then a point of $E$ is almost automorphic iff it is almost automorphic for the time-1 map $\tilde\Lambda_1$ [5]. Let $Y^* = S^1 \times Y$ be given the flow $(\phi_0, y_0) \cdot t = (\phi_0 + \rho t \pmod 1,\ y_0 \cdot t)$, $t \in \mathbb{Z}$. Then $(Y^*, \mathbb{Z})$ is almost periodic (need not be minimal). Consider the skew-product flow $\Lambda_t^* : \mathbb{R}^1 \times Y^* \to \mathbb{R}^1 \times Y^*$:
\[ \Lambda_t^*(x_0, \phi_0, y_0) = \big( \tilde\phi(\phi_0, y_0, t) - \phi_0 - \rho t + x_0,\ \phi_0 + \rho t \pmod 1,\ y_0 \cdot t \big), \qquad t \in \mathbb{Z}. \]
It follows from Lemma 8.2 that each minimal set of $\Lambda_t^*$ is an almost 1–1 extension of a minimal set in $(Y^*, \mathbb{Z})$. Using this fact and (8.9), the rest of the proof follows from that of Theorem 3.2(1) in [62] almost word by word.

(2) Let $\rho$ be the rotation number of (8.1) and $M$ be a minimal set of (8.1) which is an almost $N$–1 extension of $Y$. We denote by $\mathcal{M}(M)$, $\mathcal{M}(Y)$ the frequency modules of $M$, $Y$, respectively. Then by the general module containment property for almost automorphic minimal sets (e.g., [57]), $N$ is the smallest positive integer such that $N \mathcal{M}(M) \subset \mathcal{M}(Y)$. But by (1), $\mathcal{M}(M)$ is generated by $\rho$ and $\mathcal{M}(Y)$. The theorem follows. □

8.3. APCF without mean motion

We first prove a general result on positive transitivity and the uniqueness of minimal set in a SPCF $(S^1 \times Y, \mathbb{T})$.

Theorem 8.4. If a SPCF $(S^1 \times Y, \mathbb{T})$ with minimal base flow $(Y, \mathbb{T})$ is positively transitive, then it has a unique minimal set.

Proof. Suppose for contradiction that $(S^1 \times Y, \mathbb{T})$ has two distinct minimal sets $M, M_0$. We let $S(y), S_0(y), \rho, \rho_0, Y^c$ and $Y_0^c$ be defined as in the proof of Theorem 7.2 for the present $M, M_0$. Since $(S^1 \times Y, \mathbb{T})$ is positively transitive, the set $\mathrm{Tran}^+(S^1 \times Y)$ of positively transitive points in $S^1 \times Y$ is a residual subset of $S^1 \times Y$. Let
\[ Y_T = \{ y \in Y : \mathrm{Tran}^+(S^1 \times Y) \cap (S^1 \times \{y\}) \text{ is a residual subset of } S^1 \times \{y\} \}. \]
Then by Lemma 5.1, $Y_T$ is a residual subset of $Y$. For a given $y_0 \in Y^c \cap Y_0^c \cap Y_T$, we have by the proof of Theorem 7.2 that $n := |S_0(y_0)| = |S(y_0)| < +\infty$ and each of the $n$ connected components of $S^1 \setminus S_0(y_0)$ contains precisely one point in $S(y_0)$. Thus there exist points $0 \le t_1 < r_1 < t_2 < r_2 < \cdots < t_n < r_n < 1 + t_1$ such that $S(y_0) = \{a_1, a_2, \ldots, a_n\}$ and $S_0(y_0) = \{b_1, b_2, \ldots, b_n\}$, where $a_j = e^{2\pi i t_j}$ and $b_j = e^{2\pi i r_j}$ for $j = 1, 2, \ldots, n$.

Since $\mathrm{Tran}^+(S^1 \times Y) \cap (S^1 \times \{y_0\})$ is dense in $S^1 \times \{y_0\}$, there exists $w_1 \in (t_1, r_1)$ such that $(c_1, y_0) \in \mathrm{Tran}^+(S^1 \times Y)$, where $c_1 = e^{2\pi i w_1}$. Take a number $w_2$ such that $w_2 \in (r_1, t_2)$ when $n \ge 2$ and $w_2 \in (r_1, 1 + t_1)$ when $n = 1$, and let $c_2 = e^{2\pi i w_2}$. Then $c_2 \in (b_1, a_2)$ when $n \ge 2$ and $c_2 \in (b_1, a_1)$ when $n = 1$. In any case, $c_2 \notin S(y_0) \cup S_0(y_0)$. Since $(c_1, y_0) \in \mathrm{Tran}^+(S^1 \times Y)$, there exists a monotonically increasing, positive sequence $s_i \to +\infty$ such that $\lim_{i \to \infty} (c_1, y_0) \cdot s_i = (c_2, y_0)$. Without loss of generality, we assume that $\lim_{i \to \infty} (a_1, y_0) \cdot s_i = (a_j, y_0)$ for some $1 \le j \le n$.

Consider the family of functions $f_t : S^1 \to S^1 : u \mapsto \psi(u, y_0, t)$, $t \in \mathbb{T}$. Then each $f_t$ is an orientation preserving homeomorphism. Since $\lim_{i \to \infty} f_{s_i}(c_1) = c_2$, $\lim_{i \to \infty} f_{s_i}(a_1) = a_j$, and $a_j \ne c_2$, we have by Lemma 5.3(1) that $\lim_{i \to \infty} f_{s_i}([a_1, c_1]) = [a_j, c_2]$. Using the fact that $b_1 \in (a_1, c_2) \subseteq (a_j, c_2)$, we can find a sufficiently small open neighborhood $V$ of $b_1$ in $S^1$ such that $V \subset (a_j, c_2)$. Since $y_0 \in Y_0^c$ and $b_1 \in S_0(y_0)$, there exists an open neighborhood $U$ of $y_0$ in $Y$ such that $S_0(y) \cap V \ne \emptyset$ for each $y \in U$. Note that $f_{s_i}((a_1, c_1)) = (f_{s_i}(a_1), f_{s_i}(c_1))$, $V \subset (a_j, c_2)$, and $y_0 \cdot s_i \to y_0$. It follows that there exists a positive integer $N$ such that $f_{s_i}((a_1, c_1)) \supset V$ and $y_0 \cdot s_i \in U$ for $i \ge N$. Moreover, for a fixed $i \ge N$, there exists $b \in (a_1, c_1)$ such that $f_{s_i}(b) \in S_0(y_0 \cdot s_i) \cap V$. This implies that $(f_{s_i}(b), y_0 \cdot s_i) \in M_0$, i.e., $(b, y_0) \cdot s_i \in M_0$. Hence $(b, y_0) \in M_0$, i.e., $b \in S_0(y_0)$. This is a contradiction to the fact that $(a_1, c_1) \cap S_0(y_0) = \emptyset$. □

To prove Theorem 7, we need the following lemmas.

Lemma 8.3. Consider the SPCF (8.1) with strictly ergodic base flow $(Y, \mathbb{T})$. Then for any $\phi_1, \phi_2 \in \mathbb{R}^1$, $y \in Y$, and $t \in \mathbb{T}$,
\[ \tilde\phi(\phi_1, y, t) - \phi_1 \le \tilde\phi(\phi_2, y, t) - \phi_2 + 2. \]

Proof. Let $\phi_1, \phi_2 \in \mathbb{R}^1$, $y \in Y$, and $t \in \mathbb{T}$ be given. Since the function $\tilde\phi(\cdot, y, t) : \mathbb{R}^1 \to \mathbb{R}^1$ is strictly increasing,
\[ \tilde\phi(\phi_1, y, t) \le \tilde\phi(\phi_2 + [\phi_1 - \phi_2] + 1, y, t) = \tilde\phi(\phi_2, y, t) + [\phi_1 - \phi_2] + 1, \]
where for each $r \in \mathbb{R}^1$, $[r]$ denotes the largest integer which is less than or equal to $r$. Thus
\[ \tilde\phi(\phi_1, y, t) - \phi_1 \le \tilde\phi(\phi_2, y, t) + [\phi_1 - \phi_2] - \phi_1 + 1 \le \tilde\phi(\phi_2, y, t) + (\phi_1 - \phi_2 + 1) - \phi_1 + 1, \]
i.e., $\tilde\phi(\phi_1, y, t) - \phi_1 \le \tilde\phi(\phi_2, y, t) - \phi_2 + 2$. □

Lemma 8.4. Consider the SPCF (8.1) with strictly ergodic base flow $(Y, \mathbb{T})$. If there exists $(\phi_*, y_*) \in \mathbb{R}^1 \times Y$ such that
\[ \limsup_{t \to +\infty} \big( \tilde\phi(\phi_*, y_*, t) - \phi_* - \rho t \big) = +\infty \quad \Big( \text{resp. } \liminf_{t \to +\infty} \big( \tilde\phi(\phi_*, y_*, t) - \phi_* - \rho t \big) = -\infty \Big), \tag{8.11} \]
then there exists a residual subset $Y_*$ of $Y$ such that $\limsup_{t \to +\infty} (\tilde\phi(\phi_0, y_0, t) - \phi_0 - \rho t) = +\infty$ (resp. $\liminf_{t \to +\infty} (\tilde\phi(\phi_0, y_0, t) - \phi_0 - \rho t) = -\infty$) for all $(\phi_0, y_0) \in \mathbb{R}^1 \times Y_*$.
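Proof. We only consider the case that
\[ \limsup_{t \to +\infty} \big( \tilde\phi(\phi_*, y_*, t) - \phi_* - \rho t \big) = +\infty. \]
Denote $\phi_s = \tilde\phi(\phi_*, y_*, s)$, $s \in \mathbb{R}^1$. Let $M \in \mathbb{N}$ and $s \in \mathbb{T}$ be given. It follows from (8.11) that there exists $t(s) > M$ such that
\[ \tilde\phi(\phi_*, y_*, s + t(s)) - \phi_* - \rho (s + t(s)) > M + 2 + \phi_s - \phi_* - \rho s, \]
i.e., $\tilde\phi(\phi_s, y_* \cdot s, t(s)) - \phi_s - \rho t(s) > M + 2$. By continuity, we let $U_s^M$ be an open neighborhood of $y_* \cdot s$ such that $\tilde\phi(\phi_s, y, t(s)) - \phi_s - \rho t(s) > M + 2$ for all $y \in U_s^M$. Then by Lemma 8.3,
\[ \tilde\phi(\phi, y, t(s)) - \phi - \rho t(s) > M \]
for all $(\phi, y) \in \mathbb{R}^1 \times U_s^M$. Let $U_M = \bigcup_{s \in \mathbb{T}} U_s^M$. Since $\{y_* \cdot s\}_{s \in \mathbb{T}}$ is dense in $Y$, $U_M$ is a dense open subset of $Y$. Moreover, for each $y \in U_M$ there exists $t > M$ such that $\tilde\phi(\phi, y, t) - \phi - \rho t > M$ for all $\phi \in \mathbb{R}^1$. Let $Y_* = \bigcap_{M \in \mathbb{N}} U_M$. Then $Y_*$ is a residual subset of $Y$, and
\[ \limsup_{t \to +\infty} \big( \tilde\phi(\phi_0, y_0, t) - \phi_0 - \rho t \big) = +\infty \]
for any $(\phi_0, y_0) \in \mathbb{R}^1 \times Y_*$. □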
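The proof above relies on the uniform displacement bound of Lemma 8.3 to pass from one initial angle to all of them. That bound can be checked numerically in the discrete case on a concrete model. The Python sketch below is only an illustration under stated assumptions: the lifted fibre map, its parameters `a`, `eps`, `b`, and the base rotation `OMEGA` are hypothetical and do not come from the paper. The function `displacement_gap` computes $(\tilde\phi(\phi_1, y, t) - \phi_1) - (\tilde\phi(\phi_2, y, t) - \phi_2)$, which Lemma 8.3 bounds above by 2.

```python
import math

# Hypothetical irrational base rotation (golden mean).
OMEGA = (math.sqrt(5) - 1) / 2


def lift(phi, y, t, a=0.3, eps=0.5, b=0.2):
    """Iterate an assumed lifted fibre map t times starting from (phi, y).

    Each step is increasing in phi (since eps < 1) and commutes with
    phi -> phi + 1, as required of a lift of an orientation preserving
    circle homeomorphism.
    """
    for _ in range(t):
        phi = (phi + a
               + eps / (2 * math.pi) * math.sin(2 * math.pi * phi)
               + b * math.cos(2 * math.pi * y))
        y = (y + OMEGA) % 1.0
    return phi


def displacement_gap(phi1, phi2, y, t):
    """Difference of the displacements of two initial angles over time t."""
    return (lift(phi1, y, t) - phi1) - (lift(phi2, y, t) - phi2)


if __name__ == "__main__":
    samples = [-2.3, -0.5, 0.0, 0.4, 1.7]
    print(max(displacement_gap(p, q, 0.1, 50) for p in samples for q in samples))
```

The bound holds for every time $t$ and every base point $y$, which is why a single open set $U_s^M$ works for all $\phi$ at once in the proof.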
Lemma 8.5. Consider the SPCF (8.1) with strictly ergodic base flow $(Y, \mathbb{T})$. Then there exist $y_1, y_2 \in Y$ such that
\[ \sup_{t \ge 1} \big( \tilde\phi(\phi, y_1, t) - \phi - \rho t \big) \le 4, \tag{8.12} \]
\[ \inf_{t \ge 1} \big( \tilde\phi(\phi, y_2, t) - \phi - \rho t \big) \ge -4 \tag{8.13} \]
for all $\phi \in \mathbb{R}^1$.

Proof. Suppose for contradiction that (8.12) is not true. Then for a given $y \in Y$ there exist $\phi_y \in \mathbb{R}^1$ and $t_y \ge 1$ such that $\tilde\phi(\phi_y, y, t_y) - \phi_y - \rho t_y > 4$. By continuity, there exists an open neighborhood $U_y$ of $y$ such that $\tilde\phi(\phi_y, y', t_y) - \phi_y - \rho t_y > 4$ for all $y' \in U_y$. Since by Lemma 8.3
\[ \tilde\phi(\phi, z, t_y) - \phi \ge \tilde\phi(\phi_y, z, t_y) - \phi_y - 2 \]
for all $\phi \in \mathbb{R}^1$ and $z \in Y$, we have
\[ \tilde\phi(\phi, y', t_y) - \phi - \rho t_y > 2 \]
for all $(\phi, y') \in \mathbb{R}^1 \times U_y$. Since $\{U_y\}_{y \in Y}$ is an open cover of $Y$ and $Y$ is compact, there exists a finite set $\{y_1, y_2, \ldots, y_k\} \subset Y$ such that $\bigcup_{i=1}^k U_{y_i} = Y$. We denote $U_i = U_{y_i}$, $t_i = t_{y_i}$ for short and let $T = \max\{t_1, t_2, \ldots, t_k\}$.

Given $(\phi_0, y_0) \in \mathbb{R}^1 \times Y$, we inductively define sequences $\{T_i\} \subset \mathbb{T}$, $\{k_i\} \subset \{1, 2, \ldots, k\}$, and $\{m_i\}$ by letting $T_0 = 0$, $y_0 \cdot T_i \in U_{k_i}$, $m_i = t_{k_i}$, and $T_{i+1} = T_i + m_i$. Then $\tilde\phi(\phi, y_0 \cdot T_i, m_i) - \phi - \rho m_i > 2$ for all $\phi \in \mathbb{R}^1$. It follows that
\[ \begin{aligned} \tilde\phi(\phi_0, y_0, T_{i+1}) - \phi_0 - \rho T_{i+1} &= \big[ \tilde\phi(\tilde\phi(\phi_0, y_0, T_i), y_0 \cdot T_i, m_i) - \tilde\phi(\phi_0, y_0, T_i) - \rho m_i \big] + \big[ \tilde\phi(\phi_0, y_0, T_i) - \phi_0 - \rho T_i \big] \\ &\ge 2 + \tilde\phi(\phi_0, y_0, T_i) - \phi_0 - \rho T_i. \end{aligned} \]
By induction,
\[ \tilde\phi(\phi_0, y_0, T_i) - \phi_0 - \rho T_i \ge 2i \]
for all $i \in \mathbb{N}$. Since $i \le T_i \le T i$ for all $i \in \mathbb{N}$, we have
\[ \limsup_{i \to \infty} \frac{1}{T_i}\, \tilde\phi(\phi_0, y_0, T_i) \ge \limsup_{i \to \infty} \frac{1}{T_i} (\phi_0 + \rho T_i + 2i) = \rho + \limsup_{i \to \infty} \frac{2i}{T_i} \ge \rho + \frac{2}{T} > \rho, \]
which contradicts the definition of the rotation number. This proves (8.12). The proof of (8.13) is similar. □

Theorem 8.5. Suppose that the SPCF (8.1) is an APCF which admits no mean motion. Then each of its minimal sets is either the entire phase space $S^1 \times Y$ or is everywhere non-locally connected.

Proof. We let $d$ be an invariant compatible metric on $Y$, i.e., $d(y_1 \cdot t, y_2 \cdot t) = d(y_1, y_2)$ for all $y_1, y_2 \in Y$ and $t \in \mathbb{T}$. Since (8.1) admits no mean motion, the condition (d) of Theorem 8.2 fails, i.e., there exists $(\phi_*, y_*) \in \mathbb{R}^1 \times Y$ such that $\limsup_{t \to +\infty} (\tilde\phi(\phi_*, y_*, t) - \phi_* - \rho t) = +\infty$. It follows from Lemma 8.4 that there exists a residual subset $Y_*$ of $Y$ such that
\[ \limsup_{t \to +\infty} \big( \tilde\phi(\phi_0, y_0, t) - \phi_0 - \rho t \big) = +\infty \tag{8.14} \]
for any $(\phi_0, y_0) \in \mathbb{R}^1 \times Y_*$. By Lemma 8.5, there exists $y_1 \in Y$ such that
\[ \sup_{t \ge 1} \big( \tilde\phi(\phi, y_1, t) - \phi - \rho t \big) \le 4 \tag{8.15} \]
for all $\phi \in \mathbb{R}^1$.

Suppose that the entire phase space $S^1 \times Y$ is not minimal and let $X$ be a minimal set of (8.1). Then there exist nonempty open subsets $U_2 \subset S^1$ and $V_2 \subset Y$ such that $X \cap (U_2 \times V_2) = \emptyset$. Since $(Y, \mathbb{T})$ is minimal, there exist $t_1, t_2, \ldots, t_\ell \in \mathbb{T}$ such that $\mathcal{V} = \{V_2 \cdot t_i\}_{i=1}^{\ell}$ is an open cover of $Y$. Let $\delta > 0$ be the Lebesgue number of $\mathcal{V}$ with respect to the metric $d$.
If $X$ is not everywhere non-locally connected, then by Lemma 7.1 the set $X_{lc}$ of locally connected points in $X$ is an invariant residual subset of $X$. Since by Proposition 3.4 the projection $\pi : X \to Y$ is semi-open, it follows from Lemma 5.5 that
\[ X_0 = \{ x \in X : \text{for any open neighborhood } U \text{ of } x,\ \pi(U) \text{ is a neighborhood of } \pi(x) \} \]
is also a residual subset of $X$. Take $x_* \in X_{lc} \cap X_0$ and denote $y^* = \pi(x_*)$. Let $\phi_1 \in [0, 1]$ be such that $x_* = (a, y^*)$, where $a = e^{2\pi i \phi_1}$, and denote $A = [\phi_1 - \frac{1}{4}, \phi_1 + \frac{1}{4}]$. Also let $V_1$ be an open neighborhood of $y^*$ such that $\mathrm{diam}(V_1) < \delta$ and let $U_1 = \{ e^{2\pi i \phi} : \phi \in A \}$. Since $x_*$ is a locally connected point of $X$, there exists a connected closed neighborhood $W$ of $x_*$ in $X$ such that $W \subseteq U_1 \times V_1$. Since $x_* \in X_0$, $\pi(W)$ is also a closed neighborhood of $y^*$ in $Y$. Using minimality of $(Y, \mathbb{T})$, we let $t^* \ge 1$ be such that $y_2 =: y_1 \cdot t^* \in \pi(W)$ and denote $C(t^*) = \max_{\phi \in \mathbb{R}^1} |\tilde{\phi}(\phi, y_1, t^*) - \phi - \rho t^*|$. Then $C(t^*) < \infty$ and it follows from (8.15) that, for any $\phi \in \mathbb{R}^1$,
\[
\begin{aligned}
4 &\ge \sup_{t \ge 0} \bigl[ \tilde{\phi}\bigl(\tilde{\phi}(\phi, y_1, t^*), y_2, t\bigr) - \phi - \rho (t + t^*) \bigr] \\
&= \sup_{t \ge 0} \bigl[ \tilde{\phi}\bigl(\tilde{\phi}(\phi, y_1, t^*), y_2, t\bigr) - \tilde{\phi}(\phi, y_1, t^*) - \rho t + \tilde{\phi}(\phi, y_1, t^*) - \phi - \rho t^* \bigr] \\
&\ge \sup_{t \ge 0} \bigl[ \tilde{\phi}\bigl(\tilde{\phi}(\phi, y_1, t^*), y_2, t\bigr) - \tilde{\phi}(\phi, y_1, t^*) - \rho t \bigr] - C(t^*),
\end{aligned}
\]
i.e.,
\[ \sup_{t \ge 0} \bigl( \tilde{\phi}(\phi, y_2, t) - \phi - \rho t \bigr) \le 4 + C(t^*). \tag{8.16} \]
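Since $y_2 \in \pi(W)$, there exists $\phi_2 \in A$ such that $(a_2, y_2) \in W$, where $a_2 = e^{2\pi i \phi_2}$. For any $(b, y) \in W \subseteq U_1 \times V_1$, there exists a unique $\phi(b) \in A$ such that $e^{2\pi i \phi(b)} = b$. Consider the map $h : W \to A \times V_1 : (b, y) \mapsto (\phi(b), y)$. Clearly, $h$ is continuous. Let $F = h(W)$. Then $(\phi_2, y_2) \in F$ and it follows from the closedness and connectedness of $W$ that $F$ is a closed connected subset of $A \times V_1$. Since $\pi(W)$ is a closed neighborhood of $y^*$, $\pi(W) \cap Y_* \neq \emptyset$. Take $y_0 \in \pi(W) \cap Y_*$. Then there exists $\phi_0 \in A$ such that $(\phi_0, y_0) \in F$. Let $L = \max_{1 \le i \le \ell} \max\{ \tilde{\phi}(\phi, y, t_i) - \phi - \rho t_i : \phi \in \mathbb{R}^1, y \in Y \}$. Then $L < \infty$ and by (8.14), there exists $t' > \max\{t_1, t_2, \ldots, t_\ell\}$ such that
\[ \tilde{\phi}(\phi_0, y_0, t') - \phi_0 - \rho t' \ge 6 + L + C(t^*). \]
Since $\mathrm{diam}(V_1 \cdot t') = \mathrm{diam}(V_1) < \delta$, there exists $i \in \{1, 2, \ldots, \ell\}$ such that $V_1 \cdot t' \subseteq V_2 \cdot t_i$, i.e., $V_1 \cdot (t' - t_i) \subseteq V_2$. Denote $t_* = t' - t_i$. Then $t_* > 0$, $V_1 \cdot t_* \subseteq V_2$, and
\[
\begin{aligned}
\tilde{\phi}(\phi_0, y_0, t_*) - \phi_0 - \rho t_*
&= \bigl[ \tilde{\phi}(\phi_0, y_0, t') - \phi_0 - \rho t' \bigr] - \bigl[ \tilde{\phi}\bigl(\tilde{\phi}(\phi_0, y_0, t_*), y_0 \cdot t_*, t_i\bigr) - \tilde{\phi}(\phi_0, y_0, t_*) - \rho t_i \bigr] \\
&\ge 6 + L + C(t^*) - \bigl[ \tilde{\phi}\bigl(\tilde{\phi}(\phi_0, y_0, t_*), y_0 \cdot t_*, t_i\bigr) - \tilde{\phi}(\phi_0, y_0, t_*) - \rho t_i \bigr] \\
&\ge 6 + C(t^*),
\end{aligned}
\]
i.e.,
\[ \tilde{\phi}(\phi_0, y_0, t_*) \ge (\phi_0 + \rho t_*) + 6 + C(t^*). \tag{8.17} \]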
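Consider the function $\Phi : F \to \mathbb{R}^1 : (\phi, y) \mapsto \tilde{\phi}(\phi, y, t_*)$. Obviously, $\Phi$ is continuous. Moreover, by noting that $\phi_2, \phi_0 \in A$, we have by (8.16) that
\[ \Phi(\phi_2, y_2) \le (\phi_2 + \rho t_*) + 4 + C(t^*) \le (\phi_0 + \rho t_*) + 5 + C(t^*), \]
and by (8.17) that
\[ \Phi(\phi_0, y_0) \ge (\phi_0 + \rho t_*) + 6 + C(t^*). \]
Since $F$ is connected and $(\phi_0, y_0), (\phi_2, y_2) \in F$, we have that $\Phi(F) \supseteq [(\phi_0 + \rho t_*) + 5 + C(t^*), (\phi_0 + \rho t_*) + 6 + C(t^*)]$, an interval of length one. Hence there exists $(\phi_3, y_3) \in F$ such that $e^{2\pi i \Phi(\phi_3, y_3)} \in U_2$. It follows from the definition of $F$ that $(e^{2\pi i \phi_3}, y_3) \in W \subset X$. Now, on one hand, $(e^{2\pi i \phi_3}, y_3) \cdot t_* \in X$, and on the other hand, $(e^{2\pi i \phi_3}, y_3) \cdot t_* = (e^{2\pi i \Phi(\phi_3, y_3)}, y_3 \cdot t_*) \in U_2 \times V_2$ as $V_1 \cdot t_* \subseteq V_2$. This implies that $X \cap (U_2 \times V_2) \neq \emptyset$, a contradiction. □

Theorem 8.6. Suppose that the SPCF (8.1) is an APCF with locally connected base space $Y$. If it admits no mean motion, then it is positively transitive and has only one minimal set.

Proof. Let $d$, $Y_*$ and $y_1$ be defined as in the proof of Theorem 8.5. Let $U_1, U_2 \subset S^1$ and $V_1, V_2 \subset Y$ be any nonempty open subsets. Since $(Y, \mathbb{T})$ is minimal, there exist $t_1, t_2, \ldots, t_k \in \mathbb{T}$ such that $\mathcal{V} = \{V_2 \cdot t_i\}_{i=1}^{k}$ is an open cover of $Y$. Let $\delta > 0$ be the Lebesgue number of $\mathcal{V}$ with respect to the metric $d$. Since $\{y_1 \cdot r\}_{r \ge 1}$ is dense in $Y$, there exists $r_1 \ge 1$ such that $y_1 \cdot r_1 \in V_1$, i.e., $V_1$ is an open neighborhood of $y_* = y_1 \cdot r_1$. Since $Y$ is locally connected, there exists a connected closed neighborhood $V$ of $y_*$ such that $V \subseteq V_1$ and $\mathrm{diam}(V) < \delta$. By (8.15), we have
\[ \sup_{t \ge 0} \bigl[ \tilde{\phi}\bigl(\tilde{\phi}(\phi, y_1, r_1), y_*, t\bigr) - \tilde{\phi}(\phi, y_1, r_1) - \rho t \bigr] \le 4 + \bigl| \tilde{\phi}(\phi, y_1, r_1) - \phi - \rho r_1 \bigr| \]
for all $\phi \in \mathbb{R}^1$. Let $C = \sup_{\phi \in \mathbb{R}^1} |\tilde{\phi}(\phi, y_1, r_1) - \phi - \rho r_1| = \max_{0 \le \phi \le 1} |\tilde{\phi}(\phi, y_1, r_1) - \phi - \rho r_1|$. Then $C < \infty$ and
\[ \sup_{t \ge 0} \bigl( \tilde{\phi}(\phi, y_*, t) - \phi - \rho t \bigr) \le 4 + C \tag{8.18} \]
for all $\phi \in \mathbb{R}^1$. Take $z_1 \in Y_* \cap V$ and $\phi_1 \in [0, 1]$ such that $e^{2\pi i \phi_1} \in U_1$. Then
\[ \limsup_{t \to +\infty} \bigl( \tilde{\phi}(\phi_1, z_1, t) - \phi_1 - \rho t \bigr) = +\infty. \]
Let $L = \max_{1 \le i \le k} \max\{ \tilde{\phi}(\phi, y, t_i) - \phi - \rho t_i : \phi \in \mathbb{R}^1, y \in Y \}$. Then $L < \infty$. Take $t_0 > \max\{t_1, t_2, \ldots, t_k\}$ such that
\[ \tilde{\phi}(\phi_1, z_1, t_0) - \phi_1 - \rho t_0 \ge 5 + C + L. \]
Since $\mathrm{diam}(V \cdot t_0) = \mathrm{diam}(V) < \delta$, there exists $i \in \{1, 2, \ldots, k\}$ such that $V \cdot t_0 \subseteq V_2 \cdot t_i$, i.e., $V \cdot (t_0 - t_i) \subseteq V_2$. Let $t_* = t_0 - t_i$. Then $t_* > 0$, $V \cdot t_* \subset V_2$, and
\[
\begin{aligned}
\tilde{\phi}(\phi_1, z_1, t_*) - \phi_1 - \rho t_*
&= \bigl[ \tilde{\phi}(\phi_1, z_1, t_0) - \phi_1 - \rho t_0 \bigr] - \bigl[ \tilde{\phi}\bigl(\tilde{\phi}(\phi_1, z_1, t_*), z_1 \cdot t_*, t_i\bigr) - \tilde{\phi}(\phi_1, z_1, t_*) - \rho t_i \bigr] \\
&\ge 5 + C + L - \bigl[ \tilde{\phi}\bigl(\tilde{\phi}(\phi_1, z_1, t_*), z_1 \cdot t_*, t_i\bigr) - \tilde{\phi}(\phi_1, z_1, t_*) - \rho t_i \bigr] \\
&\ge 5 + C.
\end{aligned}
\]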
Consider the function $\Psi : V \to \mathbb{R}^1 : y \mapsto \tilde{\phi}(\phi_1, y, t_*)$. Clearly, $\Psi$ is continuous, $\Psi(y_*) \le 4 + C + \phi_1 + \rho t_*$, and $\Psi(z_1) \ge 5 + C + \phi_1 + \rho t_*$. Since $V$ is connected, $\Psi(V) \supseteq [4 + C + \phi_1 + \rho t_*, 5 + C + \phi_1 + \rho t_*]$. Hence there exists $y_2 \in V$ such that $e^{2\pi i \Psi(y_2)} \in U_2$. This shows that $(e^{2\pi i \phi_1}, y_2) \in U_1 \times V_1$ and $(e^{2\pi i \phi_1}, y_2) \cdot t_* = (e^{2\pi i \Psi(y_2)}, y_2 \cdot t_*) \in U_2 \times V_2$ as $V \cdot t_* \subseteq V_2$. Therefore, $(e^{2\pi i \Psi(y_2)}, y_2 \cdot t_*) \in (U_1 \times V_1) \cdot t_* \cap (U_2 \times V_2) \neq \emptyset$. Since $U_1, U_2, V_1, V_2$ are arbitrary, the flow (8.1) is positively transitive. It follows from Theorem 8.4 that the flow (8.1) has a unique minimal set. □

Now, parts (1), (2) of Theorem 7 are just the respective Theorems 8.5, 8.6 above. We note that using Theorem 6(1) and Theorem 7(2) we also obtain an alternative proof for Theorem 4(3) (Theorem 7.3).

8.4. Quasi-periodically forced circle flows

Our aim in this subsection is to prove Theorem 8, in which the base space $Y$ is further assumed to be a torus. We will use a classical result of E. Cartan that every closed subgroup of a Lie group is also a Lie group; in particular, a closed subgroup of a Lie group cannot be a Cantor set. The following theorem is just our main result Theorem 8.

Theorem 8.7. Consider an APCF $(S^1 \times Y, \mathbb{T}) = (S^1 \times Y, \{\Lambda_t\}_{t \in \mathbb{T}})$ with $Y$ being a torus (e.g., $(Y, \mathbb{T})$ is quasi-periodic) and suppose that the rotation number is rationally independent of the forcing frequencies. Then $(S^1 \times Y, \mathbb{T})$ has a unique minimal set $M$, and $M$ is either the entire phase space $S^1 \times Y$ or is everywhere non-locally connected. If, in addition, the APCF admits mean motion, then $M$ is almost automorphic, and moreover, either $M = S^1 \times Y$ or $M$ is an everywhere non-locally connected Cantorian.

Proof. Since $Y$ is locally connected, it follows from Theorem 4(1), Theorem 6(2) in the case with mean motion and from Theorem 7(2) in the case without mean motion that $(S^1 \times Y, \mathbb{T})$ has a unique minimal set $M$. Suppose that $M \neq S^1 \times Y$. We want to show that $M$ is everywhere non-locally connected. In the case that the APCF $(S^1 \times Y, \mathbb{T})$ admits no mean motion, we have by Theorem 7(1) that $M$ is everywhere non-locally connected.

We now consider the case that the APCF $(S^1 \times Y, \mathbb{T}) = (S^1 \times Y, \{\Lambda_t\}_{t \in \mathbb{T}})$ admits mean motion. By Theorem 3 and Theorem 6(1), $M$ is both a Cantorian and an almost automorphic minimal set. Suppose for contradiction that $M$ has a locally connected point. Then the set $M_{lc}$ of locally connected points in $M$ is an invariant residual subset of $M$. Let $Y^*$ be a maximal almost periodic factor of $M$ and $p : (M, \mathbb{T}) \to (Y^*, \mathbb{T})$ be the almost 1–1 extension according to Theorem 3.2. Since the proximal relation
\[ P(M) = \Bigl\{ (e_1, e_2) \in M \times M : \inf_{t \in \mathbb{T}} d\bigl(\Lambda_t(e_1), \Lambda_t(e_2)\bigr) = 0 \Bigr\}, \]
where $d$ denotes the standard metric on $S^1 \times Y$, is a closed (in particular, an equivalence) invariant relation, $Y^*$ can be identified with $M / P(M)$, with the flow being induced by $\Lambda_t$. Let $\pi : M \to Y$ be the natural projection. Then it is clear that
\[ P(M) \subset R_\pi =: \{ (e_1, e_2) \in M \times M : \pi(e_1) = \pi(e_2) \}. \]
Thus there exists an extension $\eta : (Y^*, \mathbb{T}) \to (Y, \mathbb{T})$ such that $\pi = \eta \circ p$. Since $p$ is almost 1–1 and $M_{lc} \neq \emptyset$, we have by Theorem 7.1(2) that each fiber $p^{-1}(y^*)$, $y^* \in Y^*$, is connected. For each $y^* \in Y^*$, we note that
\[ p^{-1}(y^*) \subseteq \pi^{-1}(\eta(y^*)) = \{ (s, \eta(y^*)) : (s, \eta(y^*)) \in M \}. \]
It follows that for each $y^* \in Y^*$ there is a subinterval $I_{y^*}$ (which can be degenerate) of $S^1$ such that $p^{-1}(y^*) = I_{y^*} \times \{\eta(y^*)\}$. Since $M$ is a Cantorian, there exists a residual subset $Y_0$ of $Y$ such that for each $y \in Y_0$, $\pi^{-1}(y)$ is a Cantor set. For any $y \in Y_0$ and $y^* \in \eta^{-1}(y)$, since $p^{-1}(y^*) = I_{y^*} \times \{y\} \subseteq \pi^{-1}(y)$, $\pi^{-1}(y)$ is a Cantor set and $I_{y^*}$ is a subinterval of $S^1$, it follows that $I_{y^*}$ is a singleton. Thus for each $y \in Y_0$ the map $p : \pi^{-1}(y) \to \eta^{-1}(y)$ is a homeomorphism, i.e., $\eta^{-1}(y)$ is also a Cantor set.

Fix $y_0 \in Y_0$ and $y_0^* \in \eta^{-1}(y_0)$. For any $y_1^*, y_2^* \in Y^*$, there exist sequences $\{t_i^j\}_{i=1}^{\infty}$, $j = 1, 2$, such that $\lim_{i \to \infty} y_0^* \cdot t_i^j = y_j^*$, $j = 1, 2$. We define
\[ y_1^* \circ y_2^* = \lim_{i \to \infty} y_0^* \cdot (t_i^1 + t_i^2). \tag{8.19} \]
Since $(Y^*, \mathbb{T})$ is almost periodic, (8.19) is well-defined and is independent of the choice of sequences $\{t_i^j\}_{i=1}^{\infty}$, $j = 1, 2$. With the operation $y_1^* \circ y_2^*$, $Y^*$ becomes a compact Abelian topological group with unity $y_0^*$ (see Theorem 3.2.1 in [58]). Using $y_0$, we can define an operation "$\circ$" on $Y$ similarly so that it becomes a compact topological group with unity $y_0$.

For any $y_1^*, y_2^* \in Y^*$, we take sequences $\{t_i^j\}_{i=1}^{\infty}$, $j = 1, 2$, such that $\lim_{i \to \infty} y_0^* \cdot t_i^j = y_j^*$, $j = 1, 2$. Then $\lim_{i \to \infty} y_0 \cdot t_i^j = \lim_{i \to \infty} \eta(y_0^* \cdot t_i^j) = \eta(y_j^*)$, $j = 1, 2$, and
\[ \eta(y_1^* \circ y_2^*) = \lim_{i \to \infty} \eta\bigl(y_0^* \cdot (t_i^1 + t_i^2)\bigr) = \lim_{i \to \infty} \eta(y_0^*) \cdot (t_i^1 + t_i^2) = \lim_{i \to \infty} y_0 \cdot (t_i^1 + t_i^2) = \eta(y_1^*) \circ \eta(y_2^*). \]
Hence $\eta$ is a group homomorphism from $(Y^*, \circ)$ to $(Y, \circ)$ and $\eta^{-1}(y_0) = \ker(\eta)$ is a closed subgroup of $(Y^*, \circ)$.

Since $p$ is semi-open, the set
\[ M_0 = \{ m \in M : \text{for any open neighborhood } U \text{ of } m,\ p(U) \text{ is a neighborhood of } p(m) \} \]
is a residual subset of $M$ by Lemma 5.5. Hence $M_{lc} \cap M_0$ is also a residual subset of $M$. Since $\emptyset \neq p(M_{lc} \cap M_0) \subseteq Y^*_{lc}$, $Y^*$ admits a locally connected point $y_*$. For each $y^* \in Y^*$, we consider the map $H_{y^*, y_*} : Y^* \to Y^* : y \mapsto y \circ ((y^*)^{-1} \circ y_*)$, where $(y^*)^{-1}$ is the inverse of $y^*$. Since $H_{y^*, y_*}$ is a homeomorphism and $H_{y^*, y_*}(y^*) = y_*$, $y^*$ is also a locally connected point of $Y^*$. This shows that $Y^*$ is locally connected.

For any $y \in Y$ and $y^* \in \eta^{-1}(y)$, since $H_{y_0^*, y^*} : \eta^{-1}(y_0) \to \eta^{-1}(y)$ is a homeomorphism, $\eta^{-1}(y)$ is a Cantor set and hence $\dim(\eta^{-1}(y)) = 0$, where $\dim(\cdot)$ denotes the covering dimension. It follows from Theorem VI.7 in [25] that
\[ \dim(Y^*) \le \dim(Y) + \sup_{y \in Y} \dim \eta^{-1}(y) = \dim(Y) < \infty. \]
In summary, we have shown that $(Y^*, \circ)$ is a locally connected, finite dimensional, compact Abelian topological group. It follows from a classical result of Pontrjagin (see Theorem 56 in [48]) that $Y^*$ is a Lie group. Now, since $\eta^{-1}(y_0)$ is a closed subgroup of the Lie group $(Y^*, \circ)$, $\eta^{-1}(y_0)$ is a Lie group, which is in particular not a Cantor set. This contradicts the fact that $\eta^{-1}(y)$ is a Cantor set for all $y \in Y$. □

9. Projective bundle flows of sl(2, R)-valued cocycles

Let $\mathbb{T} = \mathbb{R}$ or $\mathbb{Z}$ and $(Y, \mathbb{T})$ be an almost periodic minimal flow. We consider an almost periodic, sl(2, R)-valued cocycle $\{\Phi(y, t)\}_{y \in Y, t \in \mathbb{T}}$, i.e., the map $(y, t) \mapsto \Phi(y, t) \in$ sl(2, R) is continuous, $\Phi(y, 0) \equiv I$ (the identity matrix), and $\{\Phi(y, t)\}_{y \in Y, t \in \mathbb{T}}$ satisfies the following cocycle property:
\[ \Phi(y, t + s) = \Phi(y \cdot s, t) \Phi(y, s), \qquad y \in Y,\ t, s \in \mathbb{T}. \]
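The cocycle property can be checked numerically on a small discrete example. The sketch below is only an illustration under stated assumptions: the base flow is taken to be the rotation $y \mapsto y + \alpha$ on the circle $\mathbb{R}/\mathbb{Z}$, and the one-step matrix `A` is a hypothetical determinant-one choice that does not come from the paper. The names `ALPHA`, `shift`, `A`, `matmul`, `Phi` and `max_abs_diff` are introduced here for the example.

```python
import math

ALPHA = (math.sqrt(5) - 1) / 2  # assumed irrational rotation angle on Y = R/Z


def shift(y, s):
    """Base flow y -> y . s on the circle Y = R/Z (illustrative choice)."""
    return (y + s * ALPHA) % 1.0


def A(y):
    """Hypothetical one-step matrix with det A(y) = 1 (not taken from the paper)."""
    return [[2.0 * math.cos(2 * math.pi * y), -1.0], [1.0, 0.0]]


def matmul(P, Q):
    """Product of two 2x2 matrices stored as nested lists."""
    return [[sum(P[i][k] * Q[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]


def Phi(y, n):
    """Phi(y, n) = A(y.(n-1)) ... A(y.1) A(y) for n >= 0, with Phi(y, 0) = I."""
    M = [[1.0, 0.0], [0.0, 1.0]]
    for k in range(n):
        M = matmul(A(shift(y, k)), M)
    return M


def max_abs_diff(P, Q):
    """Largest entrywise difference between two 2x2 matrices."""
    return max(abs(P[i][j] - Q[i][j]) for i in range(2) for j in range(2))


y, t, s = 0.3, 5, 7
lhs = Phi(y, t + s)                           # Phi(y, t + s)
rhs = matmul(Phi(shift(y, s), t), Phi(y, s))  # Phi(y . s, t) Phi(y, s)
cocycle_error = max_abs_diff(lhs, rhs)
det_lhs = lhs[0][0] * lhs[1][1] - lhs[0][1] * lhs[1][0]
print(cocycle_error, det_lhs)
```

Both sides agree up to floating-point rounding, and the determinant stays equal to one because every factor has determinant one.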
We refer to the cocycle as a continuous cocycle if $\mathbb{T} = \mathbb{R}$ and as a discrete cocycle if $\mathbb{T} = \mathbb{Z}$. In the discrete case, we always assume that the cocycle is homotopic to the identity.

An important example of continuous, almost periodic, sl(2, R)-valued cocycles is the one generated from an almost periodic, 2-dimensional, linear system of ordinary differential equations:
\[ x' = A(y \cdot t) x, \qquad x \in \mathbb{R}^2,\ y \in Y,\ t \in \mathbb{R}, \tag{9.1} \]
where $\operatorname{tr} A(y \cdot t) \equiv 0$. In this case, the principal matrix solution of the linear system clearly forms a continuous cocycle.

The average exponential growth of the norm of $\{\Phi(y, t)\}$ is measured by the (maximal) Lyapunov exponent
\[ \lambda = \lim_{t \to +\infty} \frac{1}{t} \int_Y \log \|\Phi(y, t)\| \, d\mu(y) \ge 0, \]
where $\mu$ denotes the Haar measure on $Y$. We note that the limit exists by subadditivity, is independent of the matrix norm, and is non-negative because $\Phi(y, t) \in$ sl(2, R). By Kingman's sub-additive ergodic theorem [36],
\[ \lim_{t \to +\infty} \frac{\log \|\Phi(y, t)\|}{t} = \lambda, \qquad \mu\text{-a.e. } y \in Y. \]
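In fact, there exist $Y_* \subset Y$ with $\mu(Y_*) = 1$ and invariant, measurable line bundles $\{u^{\pm}(y)\}_{y \in Y_*} \subset \mathbb{R}^2 \setminus \{(0, 0)\}$ such that $\Phi(y, t) u^{\pm}(y) = u^{\pm}(y \cdot t)$, $y \in Y_*$, $t \in \mathbb{T}$, and
\[ \lim_{t \to \infty} \frac{\log \|\Phi(y, t) u^{\pm}(y)\|}{t} = \pm \lambda, \qquad y \in Y_*. \tag{9.2} \]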
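As a rough numerical illustration of the Lyapunov exponent, the following sketch estimates $\frac{1}{n} \log \|\Phi(y, n)\|$ for two simple discrete cocycles over an assumed circle rotation: a constant diagonal cocycle, for which $\lambda = \log 2$, and a rotation-valued cocycle, for which the norm stays bounded and $\lambda = 0$. The function names, the Frobenius norm, and the two example generators are choices made for this example only; they are not taken from the paper.

```python
import math


def matmul(P, Q):
    """Product of two 2x2 matrices stored as nested lists."""
    return [[sum(P[i][k] * Q[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]


def frob(M):
    """Frobenius norm; the limit does not depend on the choice of norm."""
    return math.sqrt(sum(x * x for row in M for x in row))


def lyapunov_estimate(A, y, n, alpha):
    """Estimate (1/n) log ||Phi(y, n)||, renormalising at every step."""
    M = [[1.0, 0.0], [0.0, 1.0]]
    log_norm = 0.0
    for k in range(n):
        M = matmul(A((y + k * alpha) % 1.0), M)
        s = frob(M)
        log_norm += math.log(s)           # accumulate log of the growth factor
        M = [[x / s for x in row] for row in M]
    return log_norm / n


def hyperbolic(y):
    """Assumed constant generator diag(2, 1/2), determinant one."""
    return [[2.0, 0.0], [0.0, 0.5]]


def rotation(y):
    """Assumed rotation-valued generator, so every product is a rotation."""
    a = 2 * math.pi * y
    return [[math.cos(a), -math.sin(a)], [math.sin(a), math.cos(a)]]


alpha = (math.sqrt(5) - 1) / 2  # assumed irrational rotation angle of the base
lam_hyp = lyapunov_estimate(hyperbolic, 0.1, 200, alpha)
lam_rot = lyapunov_estimate(rotation, 0.1, 200, alpha)
print(lam_hyp, math.log(2.0), lam_rot)
```

The renormalisation only avoids overflow; the sum of the logged factors equals $\log \|\Phi(y, n)\|$ because the norm is homogeneous.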
We say that the cocycle $\{\Phi(y, t)\}_{y \in Y, t \in \mathbb{T}}$ is elliptic if $\sup_{t \in \mathbb{T}} \|\Phi(y, t)\| < +\infty$ for all $y \in Y$; hyperbolic if it admits an exponential dichotomy (or exponential splitting); parabolic if $\lambda = 0$ but the cocycle is not elliptic; and partially hyperbolic if $\lambda > 0$ but the cocycle is not hyperbolic. It is well known that if the cocycle is hyperbolic, then the line bundles $\{u^{\pm}(y)\}$ can be
extended continuously to the entire space $Y$ and the limits above exist everywhere on $Y$. In terms of the Sacker–Sell spectrum theory [53] for the almost periodic linear system (9.1), hyperbolicity corresponds to the case with two-point spectrum, partial hyperbolicity corresponds to the case with non-degenerate interval spectrum, and ellipticity and parabolicity correspond to cases with zero spectrum.

Following works [7,29] for continuous projective bundle flows generated from the linear differential system (9.1) and work [4] for discrete projective bundle flows with one forcing frequency, we will give a complete classification of minimal sets of the projective bundle flow generated from a general almost periodic, sl(2, R)-valued cocycle in both continuous and discrete cases. Such a classification will be particularly useful in characterizing dynamical and topological complexities of an SNA in such a projective bundle flow (see recent work [35] and references therein).

9.1. A general classification of minimal sets

Consider an almost periodic, sl(2, R)-valued cocycle $\{\Phi(y, t)\}_{y \in Y, t \in \mathbb{T}}$ and its generated linear skew-product flow
\[ \pi_t : \mathbb{R}^2 \times Y \to \mathbb{R}^2 \times Y : \pi_t(v, y) = (\Phi(y, t) v, y \cdot t). \]
It is clear that the line bundle
\[ V(l, y) = \{ (v, y) : v \text{ is a vector in the line } l \text{ through the origin} \} \]
is invariant under $\pi_t$ (or under the cocycle) in the sense that $\pi_t(V(l, y)) = V(\phi_t(l, y))$ for any line $l$ through the origin and any $y \in Y$. Thus the cocycle generates a projective bundle flow $(P^1 \times Y, \mathbb{T})$. For simplicity, we parameterize $P^1$ by the angle $\theta \in [0, 1]$ with $0, 1$ being identified, i.e., we parameterize a line $l$ through the origin by its angle $\operatorname{Arg}(l) = \pi \theta$. Then the projective bundle flow can be defined as
\[ \Lambda_t : P^1 \times Y \to P^1 \times Y : \Lambda_t(\theta, y) = \Bigl( \frac{1}{\pi} \operatorname{Arg}\bigl(\Phi(y, t) v\bigr), y \cdot t \Bigr) =: \bigl( \tilde{\theta}(\theta, y, t), y \cdot t \bigr), \qquad \theta \in \mathbb{R}^1 \ (\mathrm{mod}\ 1),\ y \in Y,\ t \in \mathbb{T}, \]
where $v$ is a vector in $\mathbb{R}^2$ with angle $\operatorname{Arg}(v) = \pi \theta$ and $\tilde{\theta}(\theta + 1, y, t) = \tilde{\theta}(\theta, y, t) + 1$. Denote
\[ r(\theta, y, t) = \left\| \Phi(y, t) \begin{pmatrix} \cos \pi \theta \\ \sin \pi \theta \end{pmatrix} \right\|. \]

Lemma 9.1. Let $(\theta_1, y) \neq (\theta_2, y) \in P^1 \times Y$. Then $(\theta_1, y), (\theta_2, y)$ are proximal iff
\[ \sup_{t \in \mathbb{T}} r(\theta_1, y, t) r(\theta_2, y, t) = +\infty. \]
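Proof. Without loss of generality, we let $0 < \theta_1 - \theta_2 < 1$. By taking determinants on both sides of the identity
\[
\begin{pmatrix}
r(\theta_1, y, t) \cos \pi \tilde{\theta}(\theta_1, y, t) & r(\theta_2, y, t) \cos \pi \tilde{\theta}(\theta_2, y, t) \\
r(\theta_1, y, t) \sin \pi \tilde{\theta}(\theta_1, y, t) & r(\theta_2, y, t) \sin \pi \tilde{\theta}(\theta_2, y, t)
\end{pmatrix}
= \Phi(y, t)
\begin{pmatrix}
\cos \pi \theta_1 & \cos \pi \theta_2 \\
\sin \pi \theta_1 & \sin \pi \theta_2
\end{pmatrix}
\]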
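and using the fact that $\det \Phi(y, t) \equiv 1$, we have
\[ r(\theta_1, y, t) r(\theta_2, y, t) \sin \pi \bigl( \tilde{\theta}(\theta_1, y, t) - \tilde{\theta}(\theta_2, y, t) \bigr) = \sin \pi (\theta_1 - \theta_2) \neq 0, \qquad y \in Y,\ t \in \mathbb{T}. \tag{9.3} \]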
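Identity (9.3) is easy to test on a single determinant-one matrix. The sketch below is an illustration only: the matrix `M` stands in for one value $\Phi(y, t)$ and is an arbitrary choice, and the helper `apply` (a name introduced here) returns $r$ together with a representative of $\tilde{\theta}$ obtained from `atan2`, which is determined modulo 2 and therefore suffices for evaluating the sine.

```python
import math


def apply(M, theta):
    """Return (r, theta_tilde) for the image of the unit vector with angle pi*theta."""
    c, s = math.cos(math.pi * theta), math.sin(math.pi * theta)
    x = M[0][0] * c + M[0][1] * s
    y = M[1][0] * c + M[1][1] * s
    # atan2 gives the angle of the image vector; dividing by pi matches Arg = pi*theta
    return math.hypot(x, y), math.atan2(y, x) / math.pi


M = [[2.0, 3.0], [1.0, 2.0]]  # assumed sample matrix, det = 2*2 - 3*1 = 1
theta1, theta2 = 0.7, 0.2     # so that 0 < theta1 - theta2 < 1

r1, tt1 = apply(M, theta1)
r2, tt2 = apply(M, theta2)

lhs = r1 * r2 * math.sin(math.pi * (tt1 - tt2))
rhs = math.sin(math.pi * (theta1 - theta2))
print(lhs, rhs)
```

Since $|\sin| \le 1$, the identity also gives the lower bound $r(\theta_1, y, t) r(\theta_2, y, t) \ge |\sin \pi(\theta_1 - \theta_2)|$, which is the form used repeatedly in the arguments that follow.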
Since $(\theta_1, y), (\theta_2, y)$ are proximal iff either $\inf_{t \in \mathbb{T}} (\tilde{\theta}(\theta_1, y, t) - \tilde{\theta}(\theta_2, y, t)) = 0$ or $\sup_{t \in \mathbb{T}} (\tilde{\theta}(\theta_1, y, t) - \tilde{\theta}(\theta_2, y, t)) = 1$, the lemma immediately follows from (9.3). □

Lemma 9.2. Consider the cocycle $\{\Phi(y, t)\}_{y \in Y, t \in \mathbb{T}}$ and its generated projective bundle flow $(P^1 \times Y, \mathbb{T})$. Then the following are equivalent.
(a) The cocycle is elliptic;
(b) There exists $y_0 \in Y$ such that $\sup_{t \ge 0} \|\Phi(y_0, t)\| < +\infty$;
(c) There exist $(\theta_1, y_0) \neq (\theta_2, y_0) \in P^1 \times Y$ such that $\sup_{t \ge 0} r(\theta_i, y_0, t) < +\infty$, $i = 1, 2$;
(d) $P^1 \times Y$ is a distal extension of $Y$;
(e) There exist $(\theta_i, y_0) \in P^1 \times Y$, $i = 1, 2, 3$, which are pairwise distal.
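Proof. It is clear that (a) ⇒ (b), (b) ⇒ (c), and (d) ⇒ (e).

(b) ⇒ (a): Let $K =: \sup_{t \ge 0} \|\Phi(y_0, t)\| < +\infty$. Since $\det \Phi(y, s) \equiv 1$, $K_1 =: \sup_{s \ge 0} \|\Phi^{-1}(y_0, s)\| < +\infty$. Using the cocycle property
\[ \Phi(y_0 \cdot s, t) = \Phi(y_0, t + s) \Phi^{-1}(y_0, s), \qquad s, t \in \mathbb{T}, \]
we have that
\[ \|\Phi(y_0 \cdot s, t)\| \le K K_1, \qquad s \ge 0,\ s + t \ge 0. \]
For any $t \in \mathbb{T}$, $y \in Y$, we let $\{s_n\}$ be a positive sequence in $\mathbb{T}$ such that $y_0 \cdot s_n \to y$. It follows from the above inequality that $\|\Phi(y, t)\| \le K K_1$, i.e., (a) holds.

(c) ⇒ (b): We note that there is a constant $c > 0$ such that
\[ \left\| \Phi(y_0, t) \begin{pmatrix} \cos \pi \theta_1 & \cos \pi \theta_2 \\ \sin \pi \theta_1 & \sin \pi \theta_2 \end{pmatrix} \right\| \le c \bigl( r(\theta_1, y_0, t) + r(\theta_2, y_0, t) \bigr), \]
from which (b) follows, because the matrix of the two direction vectors is invertible for $\theta_1 \neq \theta_2$.

(a) ⇒ (d): Let $(\theta_1, y), (\theta_2, y) \in P^1 \times Y$ be two distinct points. It follows from (a) that
\[ \sup_{t \in \mathbb{T}} r(\theta_1, y, t) r(\theta_2, y, t) < +\infty. \]
Hence by Lemma 9.1, $(\theta_1, y), (\theta_2, y)$ are distal.

(e) ⇒ (c): By Lemma 9.1, $\sup_{t \in \mathbb{T}} \{ r(\theta_i, y_0, t) r(\theta_j, y_0, t) \} < +\infty$ for all $1 \le i < j \le 3$. Consider the linear combination
\[ \Phi(y_0, t) \begin{pmatrix} \cos \pi \theta_1 \\ \sin \pi \theta_1 \end{pmatrix} = c_1 \Phi(y_0, t) \begin{pmatrix} \cos \pi \theta_2 \\ \sin \pi \theta_2 \end{pmatrix} + c_2 \Phi(y_0, t) \begin{pmatrix} \cos \pi \theta_3 \\ \sin \pi \theta_3 \end{pmatrix}, \]
where $c_1, c_2$ are constants. Then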
\[
\begin{aligned}
\sup_{t \in \mathbb{T}} r^2(\theta_1, y_0, t)
&\le \sup_{t \in \mathbb{T}} \bigl( |c_1| r(\theta_1, y_0, t) r(\theta_2, y_0, t) + |c_2| r(\theta_1, y_0, t) r(\theta_3, y_0, t) \bigr) \\
&\le |c_1| \sup_{t \in \mathbb{T}} r(\theta_1, y_0, t) r(\theta_2, y_0, t) + |c_2| \sup_{t \in \mathbb{T}} r(\theta_1, y_0, t) r(\theta_3, y_0, t) < \infty.
\end{aligned}
\]
It follows that $\{r(\theta_1, y_0, t)\}$ is bounded. Similarly, $\{r(\theta_i, y_0, t)\}$, $i = 2, 3$, are bounded, i.e., (c) holds. □

Proposition 9.1. If the cocycle $\{\Phi(y, t)\}_{y \in Y, t \in \mathbb{T}}$ is not elliptic, then its generated projective bundle flow has at most two minimal sets.

Proof. Suppose that the projective bundle flow has three minimal sets $M_i$, $i = 1, 2, 3$. Let $y_0 \in Y$ and take $(\theta_i, y_0) \in M_i$, $i = 1, 2, 3$. Then $\{(\theta_i, y_0) : i = 1, 2, 3\}$ are pairwise distal. It follows from Lemma 9.2 that the cocycle is elliptic, a contradiction. □

Lemma 9.3. Consider the cocycle $\{\Phi(y, t)\}_{y \in Y, t \in \mathbb{T}}$ and its generated projective bundle flow $(P^1 \times Y, \mathbb{T})$. Then the cocycle is hyperbolic iff $\sup_{t \in \mathbb{T}} r(\theta, y, t) = +\infty$ for all $(\theta, y) \in P^1 \times Y$.

Proof. It is a special case of the main result in [52]. □
Theorem 9.1. Consider the cocycle $\{\Phi(y, t)\}_{y \in Y, t \in \mathbb{T}}$ and its generated projective bundle flow $(P^1 \times Y, \mathbb{T})$. Then the following holds:
(1) If the cocycle is elliptic, then either $P^1 \times Y$ is minimal and distal or there is an integer $N \ge 1$ such that $P^1 \times Y$ laminates into infinitely many minimal N–1 extensions of $Y$ (hence they are almost periodic).
(2) If the cocycle is hyperbolic, then $(P^1 \times Y, \mathbb{T})$ has precisely two minimal sets and each of them is a 1–1 extension of $Y$ (hence they are almost periodic).

Proof. (1) Since, by Lemma 9.2, $P^1 \times Y$ is a distal extension of $Y$, either (i) it is minimal and distal; or (ii) it laminates into infinitely many minimal sets [12]. In the case (ii), we have by Theorem 4 and distality that there is a positive integer $N$ such that each minimal set is an N–1 extension of $Y$.

(2) Let $\{u^{\pm}(y)\}_{y \in Y} \subset \mathbb{R}^2$ be the continuous, invariant line bundles associated with hyperbolicity and let $\theta^{\pm}(y) = \frac{1}{\pi} \operatorname{Arg} u^{\pm}(y)$, $y \in Y$. Then
\[ M^{\pm} = \{ (\theta^{\pm}(y), y) : y \in Y \} \]
are two minimal sets of $(P^1 \times Y, \mathbb{T})$ which are clearly 1–1 extensions of $Y$. By Proposition 9.1, the projective bundle flow $(P^1 \times Y, \mathbb{T})$ cannot have more than two minimal sets in this case. □

We now examine the minimal dynamics of the projective bundle flow when the cocycle is either parabolic or partially hyperbolic.

Lemma 9.4. The cocycle $\{\Phi(y, t)\}_{y \in Y, t \in \mathbb{T}}$ is either parabolic or partially hyperbolic iff there are $(\theta_1^0, y_0), (\theta_2^0, y_0) \in P^1 \times Y$ such that $\sup_{t \ge 0} r(\theta_1^0, y_0, t) = +\infty$ and $\sup_{t \in \mathbb{T}} r(\theta_2^0, y_0, t) < +\infty$.
Proof. The lemma is a direct consequence of Lemmas 9.2, 9.3. □
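A concrete instance of the dichotomy in Lemma 9.4 is the constant cocycle generated by a unipotent matrix. This is an assumed toy example, not one discussed in the paper: with $\Phi(n) = \bigl(\begin{smallmatrix} 1 & n \\ 0 & 1 \end{smallmatrix}\bigr)$ for $n \in \mathbb{Z}$, the direction $\theta = 0$ stays bounded, the direction $\theta = 1/2$ is unbounded, and the growth is only linear, so the estimated exponent tends to zero (the parabolic case). The function name `r_parabolic` is introduced here for illustration.

```python
import math


def r_parabolic(theta, n):
    """r(theta, n) = |Phi(n) (cos pi*theta, sin pi*theta)^T| for Phi(n) = [[1, n], [0, 1]]."""
    c, s = math.cos(math.pi * theta), math.sin(math.pi * theta)
    return math.hypot(c + n * s, s)


# theta = 0 is the invariant direction (1, 0): its norm never changes.
bounded = [r_parabolic(0.0, n) for n in range(-50, 51)]

# theta = 1/2 is the direction (0, 1): its image (n, 1) grows linearly in |n|.
unbounded = [r_parabolic(0.5, n) for n in (1, 10, 100, 1000)]

# Linear growth gives a vanishing exponential rate, consistent with lambda = 0.
growth_rate = math.log(r_parabolic(0.5, 10**6)) / 10**6
print(max(bounded), unbounded, growth_rate)
```

In this example the product $r(0, n)\, r(1/2, n)$ is unbounded, so by Lemma 9.1 the two points on the fiber are proximal.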
We call an ordered pair $\{(\theta_+, y), (\theta_-, y)\}$ in $P^1 \times Y$ a Morse pair if
\[ \limsup_{t \to \infty} \frac{r(\theta_+, y, t)}{r(\theta_-, y, t)} = +\infty. \]
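For a quick feel for this definition, consider the constant diagonal cocycle $\Phi(n) = D^n$ with $D = \mathrm{diag}(2, 1/2)$, an assumed toy example rather than one from the paper. Taking $\theta_+ = 0$ and $\theta_- = 1/2$, the ratio $r(\theta_+, n) / r(\theta_-, n)$ equals $4^n$, so the pair is a Morse pair. The names `D`, `r`, `e_plus` and `e_minus` below are introduced only for this sketch, and the two directions are stored as exact unit vectors to avoid rounding in $\cos(\pi/2)$.

```python
import math

D = ((2.0, 0.0), (0.0, 0.5))  # assumed constant matrix with det D = 1


def r(v, n):
    """Norm of D^n v for n >= 0, computed by repeated multiplication."""
    x, y = v
    for _ in range(n):
        x, y = D[0][0] * x + D[0][1] * y, D[1][0] * x + D[1][1] * y
    return math.hypot(x, y)


e_plus = (1.0, 0.0)   # direction with angle pi * theta_plus, theta_plus = 0
e_minus = (0.0, 1.0)  # direction with angle pi * theta_minus, theta_minus = 1/2

# Ratio r(theta_plus, n) / r(theta_minus, n) for n = 0, ..., 10.
ratios = [r(e_plus, n) / r(e_minus, n) for n in range(0, 11)]
print(ratios[-1])
```

The ratio grows without bound, which is exactly the limsup condition in the definition; swapping the order of the pair would give a ratio tending to zero instead.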
This notion is a relaxed version of relative dichotomy or Morse decomposition in linear skew-product flows. Let $\pi : P^1 \times Y \to Y$ be the natural projection.

Lemma 9.5. If the cocycle $\{\Phi(y, t)\}_{y \in Y, t \in \mathbb{T}}$ is not elliptic and some fiber of its generated projective bundle flow $(P^1 \times Y, \mathbb{T})$ over $Y$ admits a distal pair, then each fiber of $(P^1 \times Y, \mathbb{T})$ over $Y$ admits a Morse pair.

Proof. Suppose for contradiction that there is a fiber $\pi^{-1}(y_0)$ which admits no Morse pair. Since some fiber of $P^1 \times Y$ over $Y$ admits a distal pair, all fibers of $P^1 \times Y$ over $Y$ admit distal pairs. In particular, $\pi^{-1}(y_0)$ admits a distal pair, say $\{(\theta_0^1, y_0), (\theta_0^2, y_0)\}$. It follows from Lemma 9.1 that
\[ \sup_{t \in \mathbb{T}} r(\theta_0^1, y_0, t) r(\theta_0^2, y_0, t) < +\infty. \tag{9.4} \]
Since $(\theta_0^1, y_0), (\theta_0^2, y_0)$ do not form a Morse pair in either order, there are positive constants $K_1, K_2$ such that
\[ K_1 r(\theta_0^1, y_0, t) \le r(\theta_0^2, y_0, t) \le K_2 r(\theta_0^1, y_0, t), \qquad t \in \mathbb{T}. \]
It follows from (9.4) that both $r(\theta_0^1, y_0, t)$ and $r(\theta_0^2, y_0, t)$ are bounded. Hence by Lemma 9.2, the cocycle is elliptic, a contradiction. □

Lemma 9.6. Let $\{(\theta_+^0, y^0), (\theta_-^0, y^0)\}$ be a Morse pair. Then the following holds:
(1) For any $\theta \neq \theta_-^0$, $\{(\theta, y^0), (\theta_-^0, y^0)\}$ is a Morse pair.
(2) If $\theta_1, \theta_2 \neq \theta_-^0$, then $(\theta_1, y^0), (\theta_2, y^0)$ are proximal.
(3) Suppose $\sup_{t \in \mathbb{T}} \{r(\theta_-^0, y^0, t)\} < \infty$. Then for any $\theta_1, \theta_2 \neq \theta_-^0$, $(\theta_1, y^0), (\theta_-^0, y^0)$ are proximal iff $(\theta_2, y^0), (\theta_-^0, y^0)$ are proximal. In particular, for any $\theta \neq \theta_-^0$, $(\theta, y^0), (\theta_-^0, y^0)$ are proximal iff $(\theta_+^0, y^0), (\theta_-^0, y^0)$ are proximal.

Proof. (1) Let $\theta \neq \theta_-^0 \in P^1$ and consider the linear combination
\[ \Phi(y^0, t) \begin{pmatrix} \cos \pi \theta \\ \sin \pi \theta \end{pmatrix} = c_+ \Phi(y^0, t) \begin{pmatrix} \cos \pi \theta_+^0 \\ \sin \pi \theta_+^0 \end{pmatrix} + c_- \Phi(y^0, t) \begin{pmatrix} \cos \pi \theta_-^0 \\ \sin \pi \theta_-^0 \end{pmatrix}, \tag{9.5} \]
where $c_+, c_-$ are constants. Since $\theta \neq \theta_-^0$ and $c_+ \neq 0$, we have by (9.5) that
\[ \limsup_{t \to \infty} \frac{r(\theta, y^0, t)}{r(\theta_-^0, y^0, t)} \ge \limsup_{t \to \infty} \Bigl( |c_+| \frac{r(\theta_+^0, y^0, t)}{r(\theta_-^0, y^0, t)} - |c_-| \Bigr) = +\infty, \]
i.e., $\{(\theta, y^0), (\theta_-^0, y^0)\}$ is a Morse pair.

(2) Let $\theta_1, \theta_2 \neq \theta_-^0$ and consider the linear combination
\[ \Phi(y^0, t) \begin{pmatrix} \cos \pi \theta_2 \\ \sin \pi \theta_2 \end{pmatrix} = c_1 \Phi(y^0, t) \begin{pmatrix} \cos \pi \theta_1 \\ \sin \pi \theta_1 \end{pmatrix} + c_0 \Phi(y^0, t) \begin{pmatrix} \cos \pi \theta_-^0 \\ \sin \pi \theta_-^0 \end{pmatrix}, \tag{9.6} \]
where $c_1, c_0$ are constants. Since $\theta_2 \neq \theta_-^0$ and $c_1 \neq 0$, we have by (9.3) that $r(\theta_1, y^0, t) r(\theta_-^0, y^0, t)$ admits a positive lower bound, say $c(\theta_1)$. Then by (9.6),
\[
\begin{aligned}
r(\theta_1, y^0, t) r(\theta_2, y^0, t)
&\ge r(\theta_1, y^0, t) r(\theta_-^0, y^0, t) \max\Bigl\{ 0, |c_1| \frac{r(\theta_1, y^0, t)}{r(\theta_-^0, y^0, t)} - |c_0| \Bigr\} \\
&\ge c(\theta_1) \max\Bigl\{ 0, |c_1| \frac{r(\theta_1, y^0, t)}{r(\theta_-^0, y^0, t)} - |c_0| \Bigr\}.
\end{aligned}
\]
It follows that $\sup_{t \in \mathbb{T}} \{ r(\theta_1, y^0, t) r(\theta_2, y^0, t) \} = +\infty$ since $\{(\theta_1, y^0), (\theta_-^0, y^0)\}$ is a Morse pair. Hence by Lemma 9.1, $(\theta_1, y^0), (\theta_2, y^0)$ are proximal.

(3) By symmetry, it is sufficient to show that if $(\theta_1, y^0), (\theta_-^0, y^0)$ are proximal, then so are $(\theta_2, y^0), (\theta_-^0, y^0)$. Assume that $(\theta_1, y^0), (\theta_-^0, y^0)$ are proximal, i.e., $\sup_{t \in \mathbb{T}} \{ r(\theta_1, y^0, t) r(\theta_-^0, y^0, t) \} = +\infty$. Then by (9.6),
\[ r(\theta_2, y^0, t) r(\theta_-^0, y^0, t) \ge |c_1| r(\theta_1, y^0, t) r(\theta_-^0, y^0, t) - |c_0| r^2(\theta_-^0, y^0, t). \]
It follows that $\sup_{t \in \mathbb{T}} \{ r(\theta_2, y^0, t) r(\theta_-^0, y^0, t) \} = +\infty$ since $\sup_{t \in \mathbb{T}} \{ r(\theta_1, y^0, t) r(\theta_-^0, y^0, t) \} = +\infty$ and $\sup_{t \in \mathbb{T}} \{ r(\theta_-^0, y^0, t) \} < \infty$. By Lemma 9.1, $(\theta_2, y^0), (\theta_-^0, y^0)$ are proximal. □

Theorem 9.2. Let the cocycle $\{\Phi(y, t)\}_{y \in Y, t \in \mathbb{T}}$ be either parabolic or partially hyperbolic. Then the following holds for its generated projective bundle flow $(P^1 \times Y, \mathbb{T})$:
(1) $(P^1 \times Y, \mathbb{T})$ admits at most two minimal sets.
(2) If $(P^1 \times Y, \mathbb{T})$ admits two minimal sets, then each minimal set is an almost 1–1 extension of $Y$ (hence they are almost automorphic).
(3) If $(P^1 \times Y, \mathbb{T})$ admits only one minimal set $M$, then precisely one of the following holds:
(i) $M$ is almost automorphic and is either an almost 1–1 or almost 2–1 extension of $Y$;
(ii) $M$ is an everywhere non-locally connected Cantorian and is residually Li–Yorke chaotic;
(iii) $M$ is the entire space $P^1 \times Y$ and is residually Li–Yorke chaotic.
Moreover, in cases (ii) and (iii), $M$ is a proximal extension of $Y$ which is not almost 1–1 (hence $M$ is not almost automorphic).

Proof. (1) is clear by Proposition 9.1.
(2) It follows from Theorem 4 that each minimal set is an almost N–1 extension of $Y$ for some positive integer $N$. We note that an N–1 extension of $Y$ must be a distal extension. But by Lemma 9.2 there are no three points on the same fiber which can be pairwise distal. Hence $N = 1$.

(3) By Lemma 9.4, we let $(\theta_+(y_0), y_0), (\theta_-(y_0), y_0) \in P^1 \times Y$ be such that $\sup_{t \ge 0} r(\theta_+(y_0), y_0, t) = +\infty$ and $\sup_{t \in \mathbb{T}} r(\theta_-(y_0), y_0, t) < +\infty$. It is clear that $\{(\theta_+(y_0), y_0), (\theta_-(y_0), y_0)\}$ is a Morse pair.

Case 1. $(\theta_+(y_0), y_0), (\theta_-(y_0), y_0)$ are proximal. In this case, since $\sup_{t \in \mathbb{T}} \{ r(\theta_-(y_0), y_0, t) \} < \infty$, we have by Lemma 9.6(3) that any two points on the fiber $\pi^{-1}(y_0)$ are proximal. This particularly implies that $M$ is a proximal extension of $Y$. Hence if $M$ is point-distal, then it must be an almost 1–1 extension of $Y$. Now suppose that $M$ is not point-distal. Then by Theorem 2, it must be residually Li–Yorke chaotic, and by Theorem 6, the corresponding projective bundle flow admits no mean motion (because $M$ is not almost automorphic). Moreover, since $M$ is not an almost N–1 extension of $Y$ for any positive integer $N$, it follows from Theorem 3 that $M$ is either a Cantorian or the entire phase space, and, in the case that $M$ is a Cantorian, we have by Theorem 7 that it is everywhere non-locally connected.

Case 2. $(\theta_+(y_0), y_0), (\theta_-(y_0), y_0)$ are distal. We note that $\sup_{t \in \mathbb{T}} \{ r(\theta_-(y_0), y_0, t) \} < \infty$. It follows from Lemma 9.6 that $(\theta, y_0), (\theta_-(y_0), y_0)$ are distal if $\theta \neq \theta_-(y_0)$ and $(\theta_1, y_0), (\theta_2, y_0)$ are proximal if $\theta_1, \theta_2 \neq \theta_-(y_0)$. According to a general result due to Auslander [3], there exists $(\theta_*, y_0) \in M$ such that $(\theta_-(y_0), y_0), (\theta_*, y_0)$ are proximal. Since by Lemma 9.6, $(\theta, y_0), (\theta_-(y_0), y_0)$ are distal for any $\theta \neq \theta_-(y_0)$, we have that $\theta_* = \theta_-(y_0)$, i.e., $(\theta_-(y_0), y_0) \in M$. Applying the result of Auslander to $(\theta_+(y_0), y_0)$, we also find a point $(\theta'(y_0), y_0) \in M$ such that $(\theta'(y_0), y_0), (\theta_+(y_0), y_0)$ are proximal. Clearly, $\theta'(y_0) \neq \theta_-(y_0)$ and $(\theta'(y_0), y_0), (\theta_-(y_0), y_0)$ are distal. Hence $\pi^{-1}(y_0) \cap M$ admits a distal pair. It follows that all fibers $\pi^{-1}(y) \cap M$, $y \in Y$, admit distal pairs.

It now follows from Lemma 9.5 that each fiber $\pi^{-1}(y)$ admits a Morse pair $\{(\theta_+(y), y), (\theta_-(y), y)\}$. Hence by Lemma 9.6(2), $(\theta_1, y), (\theta_2, y)$ are proximal for any $\theta_1, \theta_2 \neq \theta_-(y)$. Since $\pi^{-1}(y) \cap M$ admits a distal pair, $(\theta_-(y), y) \in M$ and there exists $(\theta'(y), y) \in M$ such that $(\theta'(y), y), (\theta_-(y), y)$ are distal. Let $\delta = \inf_{t \in \mathbb{T}} |\tilde{\theta}(\theta'(y_0), y_0, t) - \tilde{\theta}(\theta_-(y_0), y_0, t)|$. It is clear that $\delta > 0$.

Claim 1. For any $(\theta, y) \in M$ with $\theta \neq \theta_-(y)$, $(\theta, y), (\theta_-(y), y)$ are distal and $|\theta - \theta_-(y)| \ge \delta$.

Since $(\theta'(y_0), y_0) \in M$, there exists a sequence $\{t_n\} \subset \mathbb{T}$ such that $\lim_{n \to \infty} \Lambda_{t_n}(\theta'(y_0), y_0) = (\theta, y)$ and $\lim_{n \to \infty} \Lambda_{t_n}(\theta_-(y_0), y_0) = (\theta_*(y), y)$ for some $(\theta_*(y), y) \in M$. Clearly, $(\theta, y), (\theta_*(y), y)$ are distal and $|\theta - \theta_*(y)| \ge \delta$. Since $(\theta_1, y), (\theta_2, y)$ are proximal for any $\theta_1, \theta_2 \neq \theta_-(y)$, we have $\theta_*(y) = \theta_-(y)$. This proves the claim.

Claim 2. There is a residual set $Y_0 \subset Y$ such that $|\pi^{-1}(y) \cap M| = 2$ for all $y \in Y_0$.

We use the argument in the proof of Theorem 7.4 in [29]. By Claim 1, there exist $0 \le \theta_1 \le \theta_2 < 1 + \theta_1$ such that $(\theta_1, y_0), (\theta_2, y_0) \in M$ and $([\theta_1, \theta_2] \times \{y_0\}) \cap M = \pi^{-1}(y_0) \cap M \setminus \{(\theta_-(y_0), y_0)\}$. Let $Y_0$ be the set of all continuity points of the upper semicontinuous map $y \mapsto \pi^{-1}(y) \cap M$. Then $Y_0$ is a residual subset of $Y$. For any $y \in Y_0$, since $(\theta_1, y_0), (\theta_2, y_0)$ are proximal, there exists a sequence $\{t_n\} \subset \mathbb{T}$ such that $\lim_{n \to \infty} \Lambda_{t_n}(\theta_1, y_0) = \lim_{n \to \infty} \Lambda_{t_n}(\theta_2, y_0) = (\theta'(y), y)$ and $\lim_{n \to \infty} \Lambda_{t_n}(\theta_-(y_0), y_0) = (\theta^*(y), y)$ for some $(\theta^*(y), y) \in M$. Since $(\theta_1, y_0), (\theta_-(y_0), y_0)$ are distal, so are $(\theta'(y), y), (\theta^*(y), y)$. Hence $\theta^*(y) = \theta_-(y)$. It follows from Lemma 5.3(2) that either $\lim_{n \to \infty} \Lambda_{t_n}([\theta_1, \theta_2] \times \{y_0\}) = \{(\theta'(y), y)\}$ or $\lim_{n \to \infty} \Lambda_{t_n}([\theta_2, \theta_1] \times \{y_0\}) = \{(\theta'(y), y)\}$ by taking subsequences if necessary.
Since $\theta_-(y_0) \in [\theta_2, \theta_1]$ and $\lim_{n \to \infty} \Lambda_{t_n}(\theta_-(y_0), y_0) = (\theta_-(y), y) \neq (\theta'(y), y)$, we have $\lim_{n \to \infty} \Lambda_{t_n}([\theta_1, \theta_2] \times \{y_0\}) = \{(\theta'(y), y)\}$. Thus
\[
\begin{aligned}
\pi^{-1}(y) \cap M = \lim_{n \to \infty} \pi^{-1}(y_0 \cdot t_n) \cap M
&\subseteq \lim_{n \to \infty} \bigl( \Lambda_{t_n}([\theta_1, \theta_2] \times \{y_0\}) \cup \Lambda_{t_n}(\theta_-(y_0), y_0) \bigr) \\
&= \{ (\theta'(y), y), (\theta_-(y), y) \}.
\end{aligned}
\]
It follows that $|\pi^{-1}(y) \cap M| = 2$ for all $y \in Y_0$.

Now, we have by Claim 2 that $M$ is an almost 2–1 extension of $Y$. To show that $M$ is almost automorphic, we note that it easily follows from Claim 1 and Lemma 9.6(2) that the proximal relation $P(M)$ is closed. Hence $M / P(M)$ is a compact Hausdorff space and there is a natural flow $(M / P(M), \mathbb{T})$ induced from the flow $(P^1 \times Y, \mathbb{T})$. By Lemma 9.1, $M / P(M)$ is a 2–1 extension of $Y$, hence it is almost periodic minimal. Let $p : M \to M / P(M)$ be the natural projection. Then it follows from Claim 2 that $p : (M, \mathbb{T}) \to (M / P(M), \mathbb{T})$ is an almost 1–1 extension. Hence $M$ is almost automorphic. □

In the partially hyperbolic case, more precise information can be obtained as follows.

Theorem 9.3. Let the cocycle $\{\Phi(y, t)\}_{y \in Y, t \in \mathbb{T}}$ be partially hyperbolic. Then its generated projective bundle flow $(P^1 \times Y, \mathbb{T})$ admits a unique minimal set $M$. Moreover, the following holds:
(a) $M$ is characterized precisely by one of the cases (i)–(iii) in Theorem 9.2 but it is neither almost periodic nor an almost 2–1 extension of $Y$;
(b) $M$ is non-uniquely ergodic and admits precisely two ergodic sheets $\{(u^{\pm}(y), y)\}_{y \in Y_*}$ as in (9.2);
(c) There is a residual set $Y_0 \subset Y$ such that for each $(\theta, y) \in M \cap \pi^{-1}(Y_0)$, $r(\theta, y, t)$ oscillates between $0$ and $+\infty$ as $t \to \pm\infty$.

Proof. The fact that $M$ cannot be an almost 2–1 extension of $Y$ was proved in [29] for the linear system (9.1). (b) and (c) were given in Theorem 4.10 of [35], also for the linear system (9.1). The proof for the general situation follows from similar arguments. Since $M$ is not uniquely ergodic, it cannot be almost periodic. □

Examples of continuous, almost periodic, sl(2, R)-valued cocycles whose projective bundle flows have the property (i) stated in Theorem 9.2 are well known (see [7,29]). There are also many continuous, almost periodic, sl(2, R)-valued cocycles whose projective bundle flows have the property (iii) stated in Theorem 9.2 (see [31,42]).
An interesting question is whether case (ii) in Theorem 9.2 can really occur in a projective bundle flow. The following result shows that the answer to this question is negative when the forcing space in a projective bundle flow is locally connected. Theorem 9.4. Let Y be locally connected and the cocycle {Φ(y, t)}y∈Y,t∈T be either parabolic or partially hyperbolic. Then its generated projective bundle flow (P 1 × Y, T) admits no Cantorian minimal set.
Proof. If ($P^1$ × Y, T) admits mean motion, then we have by Theorem 6 that all its minimal sets are almost automorphic, hence they are not residually Li–Yorke chaotic. This shows that case (ii) in Theorem 9.2 does not occur. In particular, ($P^1$ × Y, T) admits no Cantorian minimal set. If ($P^1$ × Y, T) admits no mean motion, then by Theorem 7(2) (Theorem 8.6) it is topologically transitive. It then follows from almost exactly the proof of Proposition 4.6 in [4] that if the entire phase space $P^1$ × Y is not minimal, then a minimal set M of ($P^1$ × Y, T) is either an almost 1–1 or an almost 2–1 extension of Y. In particular, M is not Cantorian. □

Remark. (1) Using the same argument as above, one sees that if a projective bundle flow admits mean motion, then case (ii) in Theorem 9.2 cannot occur regardless of whether the base Y is locally connected or not (this has already been shown in [7] for the continuous case). It then remains an open question whether case (ii) in Theorem 9.2 can occur in a projective bundle flow without mean motion when the base is not locally connected. We think that the answer to this question should be affirmative.

(2) Suppose that the projective bundle flow of a partially hyperbolic, quasi-periodically forced cocycle {Φ(y, t)}y∈Y, t∈T admits a globally attracting SNA, say A. Then A cannot be the entire phase space, and by Theorem 9.3, A is made up of a unique minimal set M along with its “homoclinic orbits” (in the sense of proximality). Now, by Theorems 9.2 and 9.4, M is a non-almost-periodic, almost automorphic extension of Y. If we further assume that the rotation number of the projective bundle flow is rationally independent of the forcing frequencies, then by Theorem 8, M is everywhere non-locally connected.
All this suggests an important role played by almost automorphic dynamics in such an SNA: topologically, the minimal set in the SNA is everywhere non-locally connected and an almost 1-cover of the forcing space, and dynamically, the minimal set in the SNA is almost automorphic.

9.2. Cases with mean motion properties

With the classification given above, it is important to know when or how often almost automorphic dynamics can occur in the projective bundle flow of an almost periodic, sl(2, R)-valued cocycle in the non-parabolic case. Some affirmative answers to this problem were given in [35] with respect to extreme points of spectral gaps of the following almost periodic Schrödinger and Schrödinger-like operators:
\[
\begin{aligned}
L_q &= -\frac{d^2}{dt^2} + q(y \cdot t) : L^2(\mathbb{R}) \to L^2(\mathbb{R}); \\
L_Q &= J\Bigl(\frac{d}{dt} - Q(y \cdot t)\Bigr) : L^2(\mathbb{R}, \mathbb{R}^2) \to L^2(\mathbb{R}, \mathbb{R}^2); \\
L_v &= -A + v(y \cdot n) : L^2(\mathbb{Z}) \to L^2(\mathbb{Z}),
\end{aligned}
\]
where (Y, T) is almost periodic minimal for T = R or Z, q, v are continuous functions on Y, Q is a 2 × 2 matrix-valued continuous function on Y, J is the standard 2 × 2 symplectic matrix, and A is the operator defined by Az(n) = z(n + 1) + z(n − 1). Of course, when almost automorphic dynamics exist in the projective bundle flow of an almost periodic, sl(2, R)-valued cocycle in the non-hyperbolic case, it is also interesting to know whether the corresponding projective bundle flow admits mean motion.
Consider the spectral problems
\[ L_q x(t) = \lambda x(t), \tag{9.7} \]
\[ L_Q X(t) = \lambda X(t), \tag{9.8} \]
\[ L_v z(n) = \lambda z(n). \tag{9.9} \]
Each linear equation (9.7)–(9.9) generates an almost periodic, sl(2, T)-valued cocycle for T = R or Z in the natural way, which gives rise to a projective bundle flow $\Pi_\lambda = (P^1 \times Y, T)$, where $P^1$ is parametrized by $\phi = -\frac{2}{\pi} \operatorname{Arg} \frac{x'}{x}$, $\phi = \frac{2}{\pi} \operatorname{Arg} X$, $\phi = -\frac{2}{\pi} \operatorname{Arg} \frac{z(n+1)}{z(n)}$ for (9.7)–(9.9) respectively. For each L = Lq, LQ, Lv, according to the Gap labeling theorem [34], the rotation number ρ(λ) of Πλ is monotonically increasing and increases precisely on the spectrum ΣL of L, which is contained in a half line [λ∗, +∞). For each λ in the resolvent of L, it is well known that the corresponding cocycle is hyperbolic, hence Πλ admits exactly two minimal sets which are all 1–1 extensions of Y (hence they are almost periodic).

Proposition 9.2. Consider L = Lq, LQ, Lv and let λ0 be a finite extreme point of a spectral gap (i.e., a maximal open interval in the resolvent of L). Then Πλ0 admits mean motion. Consequently, each minimal set of Πλ0 is almost automorphic and in fact an almost 1–1 extension of Y.

Proof. We only give the proof for the case of Lq. The other two cases can be treated similarly. Observe from (9.7) that $\phi = -\frac{2}{\pi} \operatorname{Arg} \frac{x'}{x}$ satisfies the equation
\[
\phi' = \frac{\lambda - q(y \cdot t) - 1}{\pi} + \frac{\lambda - q(y \cdot t) + 1}{\pi} \cos \pi\phi. \tag{9.10}
\]
We denote by $\tilde\phi_\lambda(\phi, y, t)$ the solution of (9.10) corresponding to λ, y and with initial value φ. Let λ0 be the left end point of a spectral gap I in the resolvent of Lq. Then the rotation number of (9.10) is constant over $\bar I$; we denote it by ρ. By the elementary theory of ordinary differential equations, we see from (9.10) that
\[
\frac{\partial \tilde\phi_\lambda(\phi, y, t)}{\partial \lambda} \ge 0 \tag{9.11}
\]
for all λ, φ ∈ $\mathbb{R}^1$, y ∈ Y, and t ≥ 0. For any (φ0, y0) ∈ $\mathbb{R}^1$ × Y and a given λ∗ ∈ I, we denote $\phi_0(t) = \tilde\phi_{\lambda_0}(\phi_0, y_0, t)$ and $\phi_*(t) = \tilde\phi_{\lambda_*}(\phi_0, y_0, t)$. Using (9.11) and the comparison principle of scalar ordinary differential equations, it is easy to see that $\phi_0(t) \le \phi_*(t)$ for all t ≥ 0. It follows that $\phi_0(t) - \phi_0 - \rho t \le \phi_*(t) - \phi_0 - \rho t$ for all t ≥ 0. Since Πλ∗ admits almost periodic motion, it admits mean motion. Hence
\[
\sup_{t \ge 0} \bigl( \phi_0(t) - \phi_0 - \rho t \bigr) \le \sup_{t \ge 0} \bigl( \phi_*(t) - \phi_0 - \rho t \bigr) < \infty.
\]
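The ordering obtained from (9.11) and the comparison principle can be checked numerically. The following is a minimal sketch, not part of the original argument: it integrates (9.10) with an explicit Euler scheme for two values of λ and the same initial value, and checks that the solution for the larger λ never falls below the other. The potential `q`, the two values of λ, the step size and the time horizon are hypothetical choices made only for illustration.

```python
import math


def q(t):
    # Hypothetical quasi-periodic potential, chosen only for illustration.
    return math.cos(t) + math.cos(math.sqrt(2.0) * t)


def rhs(phi, t, lam):
    # Right-hand side of (9.10) as read here; it is nondecreasing in lam
    # because its lam-derivative is (1 + cos(pi * phi)) / pi >= 0.
    c = lam - q(t)
    return (c - 1.0) / math.pi + (c + 1.0) / math.pi * math.cos(math.pi * phi)


def solve(lam, phi0=0.0, T=20.0, h=1e-3):
    # Explicit Euler integration; returns the list of values on the time grid.
    phi, values = phi0, [phi0]
    for k in range(int(T / h)):
        phi += h * rhs(phi, k * h, lam)
        values.append(phi)
    return values


low = solve(0.3)    # plays the role of phi_0(t)
high = solve(0.8)   # plays the role of phi_*(t), larger lambda
ordered = all(b >= a - 1e-12 for a, b in zip(low, high))
print(ordered)
```

The step size is small enough that one Euler step is increasing in φ, so the discrete scheme inherits the comparison property of the differential equation; the tolerance only absorbs rounding.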
Since (φ0, y0) is arbitrary, we have by Theorem 8.2(d) that Πλ0 admits mean motion. It follows from Theorem 6 that each minimal set of Πλ0 is almost automorphic. Since almost periodic minimal sets of Πλ∗ are all 1–1 extensions of Y, we have by Theorem 6(2) that ρ is contained in the frequency module of the forcing. Applying Theorem 6(2) again, we conclude that any minimal set of Πλ0 cannot be an almost 2–1 extension of Y. The case when λ0 is the right end point (including λ∗) of a spectral gap in the resolvent of Lq is similar. □

The dynamics of Πλ as λ enters the spectrum through λ0 are expected to be more complicated due to the possible loss of the mean motion property. This can be viewed as another intermittency phenomenon characterized by almost automorphic intermediate dynamics.

Acknowledgments

We would like to thank Professor Russell A. Johnson for valuable comments and suggestions, and also for bringing the work [4] to our attention. We are also grateful to an anonymous referee for corrections and helpful suggestions which led to an improvement of the paper.

References

[1] E. Akin, E. Glasner, W. Huang, S. Shao, X. Ye, Sufficient conditions under which a transitive system is chaotic, preprint, 2008.
[2] V.I. Arnold, Small divisors, I. On mappings of a circle onto itself, Izv. Akad. Nauk SSSR Ser. Math. 25 (1961) 21–86.
[3] J. Auslander, Minimal Flows and Their Extensions, North-Holland Math. Stud., vol. 153, North-Holland, Amsterdam, 1988.
[4] F. Béguin, S. Crovisier, T.H. Jäger, F. Le Roux, Denjoy construction for fibered homeomorphism of the torus, Trans. Amer. Math. Soc., in press.
[5] A. Berger, S. Siegmund, Y. Yi, On almost automorphic dynamics in symbolic lattices, Ergodic Theory Dynam. Systems 24 (3) (2004) 677–696.
[6] K. Bjerklov, Positive Lyapunov exponent and minimality for a class of one-dimensional quasi-periodic Schrödinger equations, Ergodic Theory Dynam. Systems 25 (4) (2005) 1015–1045.
[7] K. Bjerklov, R.A. Johnson, Minimal subsets of projective flows, Discrete Contin. Dyn. Sys. Ser. B 9 (3–4) (2008) 495–516.
[8] F. Blanchard, E. Glasner, S. Kolyada, A. Maass, On Li–Yorke pairs, J. Reine Angew. Math. 547 (2002) 51–68.
[9] R. Bowen, Entropy for group endomorphisms and homogeneous spaces, Trans. Amer. Math. Soc. 153 (1971) 401–414.
[10] M. Denker, C. Grillenberger, C. Sigmund, Ergodic Theory on Compact Spaces, Lecture Notes in Math., vol. 527, Springer-Verlag, New York, 1976.
[11] R. Ellis, Distal transformation groups, Pacific J. Math. 8 (1958) 401–405.
[12] R. Ellis, Lectures on Topological Dynamics, W.A. Benjamin, Inc., New York, 1969.
[13] A. Fathi, Weak KAM Theorem in Lagrangian Dynamics, Cambridge Stud. Adv. Math., vol. 88, Cambridge University Press, Cambridge, 2007.
[14] U. Feudel, S. Kuznetsov, A. Pikovsky, Strange Nonchaotic Attractors, World Scientific, 2006.
[15] H. Furstenberg, Strict ergodicity and transformations of the torus, Amer. J. Math. 83 (1961) 573–601.
[16] H. Furstenberg, Recurrence in Ergodic Theory and Combinatorial Number Theory, Princeton Univ. Press, 1981.
[17] H. Furstenberg, B. Weiss, On almost 1–1 extensions, Israel J. Math. 65 (1989) 311–322.
[18] E. Glasner, Ergodic Theory via Joinings, Math. Surveys Monogr., vol. 101, Amer. Math. Soc., Providence, RI, 2003.
[19] E. Glasner, Topological weak mixing and quasi-Bohr systems, Israel J. Math. 148 (2005) 277–304.
[20] E. Glasner, J.P. Thouvenot, B. Weiss, Entropy theory without a past, Ergodic Theory Dynam. Systems 20 (2000) 1355–1370.
[21] P. Glendinning, T.H. Jäger, G. Keller, How chaotic are strange non-chaotic attractors? Nonlinearity 19 (2006) 2005–2022.
[22] C. Grebogi, E. Ott, S. Pelikan, J.A. Yorke, Strange attractors that are not chaotic, Phys. D 13 (1984) 261–268.
[23] M.R. Herman, Une méthode pour minorer les exposants de Lyapunov et quelques, Comment. Math. Helv. 58 (1983) 453–502.
[24] P. Hulse, Sequence entropy relative to an invariant σ-algebra, J. London Math. Soc. 33 (1986) 59–72.
[25] W. Hurewicz, H. Wallman, Dimension Theory, Princeton Math. Ser., vol. 4, Princeton Univ. Press, Princeton, NJ, 1941.
[26] R. Iturriaga, Minimizing measures for time-dependent Lagrangians, Proc. London Math. Soc. 73 (1996) 216–240.
[27] T.H. Jäger, G. Keller, The Denjoy type of argument for quasiperiodically forced circle diffeomorphism, Ergodic Theory Dynam. Systems 26 (2) (2006) 447–465.
[28] T.H. Jäger, J. Stark, Towards a classification for quasi-periodically forced circle homeomorphism, J. London Math. Soc. 77 (2006) 727–744.
[29] R.A. Johnson, On a Floquet theory for almost-periodic, two-dimensional linear systems, J. Differential Equations 37 (1980) 184–205.
[30] R.A. Johnson, Minimal functions with unbounded integral, Israel J. Math. 31 (1978) 133–141.
[31] R.A. Johnson, Two-dimensional, almost periodic linear systems with proximal and recurrent behavior, Proc. Amer. Math. Soc. 82 (1981) 417–422.
[32] R.A. Johnson, On almost-periodic linear differential systems of Millionščikov and Vinograd, J. Math. Anal. Appl. 85 (1982) 452–460.
[33] R.A. Johnson, An example concerning the geometric significance of the rotation numbers-integrated density of states, in: Proc. Bremen Conf., in: Lecture Notes in Math., vol. 1186, Springer-Verlag, 1984.
[34] R.A. Johnson, J. Moser, The rotation number for almost periodic potentials, Comm. Math. Phys. 84 (1982) 403–438.
[35] A. Jorba, C. Nunez, R. Obaya, J.C. Tatjer, Old and new results on SNAs on the real line, Internat. J. Bifur. Chaos Appl. Sci. Engrg. 17 (11) (2007) 3895–3928.
[36] J.F.C. Kingman, Subadditive ergodic theory, Ann. Probab. 1 (1973) 883–909.
[37] T.Y. Li, J.A. Yorke, Period three implies chaos, Amer. Math. Monthly 82 (1975) 985–992.
[38] J.C. Martin, Substitution minimal flows, Amer. J. Math. 93 (1971) 503–526.
[39] J. Mather, Minimal measures, Comment. Math. Helv. 64 (1989) 375–394.
[40] J. Mather, Minimal action measures for positive-definite Lagrangian systems, in: IXth International Congress on Mathematical Physics, Swansea, 1988, Hilger, Bristol, 1989, pp. 466–468.
[41] J. Mather, Action minimizing invariant measures for positive definite Lagrangian systems, Math. Z. 207 (1991) 169–207.
[42] M. Nerurkar, On the construction of smooth ergodic skew-products, Ergodic Theory Dynam. Systems 8 (1988) 311–326.
[43] J. Moser, On the theory of quasi-periodic motions, SIAM Rev. 8 (1966) 145–172.
[44] J. Mycielski, Independent sets in topological algebra, Fund. Math. 55 (1964) 139–147.
[45] K. Numakura, On bicompact semigroup, Math. J. Okayama Univ. 1 (1952) 99–108.
[46] W. Parry, Topics in Ergodic Theory, Cambridge Tracts in Math., vol. 75, Cambridge Univ. Press, Cambridge–New York, 1981.
[47] A. Pliss, G.R. Sell, Planetary motions and climate of the Earth, preprint, 2006.
[48] L. Pontrjagin, Topological Groups, Princeton Math. Ser., vol. 2, Princeton Univ. Press, Princeton, 1939.
[49] F.P. Ramsey, On a problem of formal logic, Proc. London Math. Soc. (2) 30 (1930) 264–286.
[50] V.A. Rohlin, On the fundamental ideas of measure theory, Mat. Sb. (N.S.) 25 (67) (1949) 107–150, English transl. in Amer. Math. Soc. Transl. Ser. 1 10 (1962) 1–54.
[51] F.J. Romeiras, E. Ott, Strange non-chaotic attractors of the damped pendulum with quasiperiodic forcing, Phys. Rev. A 35 (1987) 4404–4413.
[52] R.J. Sacker, G.R. Sell, Existence of Dichotomies and invariant splitting for linear systems I, J. Differential Equations 15 (1974) 429–458.
[53] R.J. Sacker, G.R. Sell, A spectral theory for linear differential systems, J. Differential Equations 27 (1978) 320–358.
[54] W. Shen, Global attractor in quasi-periodically forced Josephson junctions, Far East J. Dyn. Syst. 3 (2001) 51–80.
[55] W. Shen, Y. Yi, Dynamics of almost periodic scalar parabolic equations, J. Differential Equations 122 (1995) 114–136.
[56] W. Shen, Y. Yi, On minimal sets of scalar parabolic equations with skew-product structures, Trans. Amer. Math. Soc. 347 (1995) 4413–4431.
[57] W. Shen, Y. Yi, Almost automorphy and skew-product semi-flow, Mem. Amer. Math. Soc. 136 (647) (1998).
[58] W.A. Veech, Almost automorphic functions on groups, Amer. J. Math. 87 (1965) 719–751.
[59] W.A. Veech, Point-distal flows, Amer. J. Math. 92 (1970) 205–242.
[60] P. Walters, An Introduction to Ergodic Theory, Grad. Texts in Math., vol. 79, Springer-Verlag, New York–Berlin, 1982.
[61] Y. Yi, A generalized integral manifold theorem, J. Differential Equations 102 (1993) 153–187.
[62] Y. Yi, On almost automorphic oscillations, Fields Inst. Commun. 42 (2004) 75–99.
[63] R.J. Zimmer, Extensions of ergodic group actions, Illinois J. Math. 20 (1976) 373–409.
[64] R.J. Zimmer, Ergodic actions with generalized discrete spectrum, Illinois J. Math. 20 (1976) 555–588.
Journal of Functional Analysis 257 (2009) 903–930 www.elsevier.com/locate/jfa
On a parabolic logarithmic Sobolev inequality
H. Ibrahim a,b,c, R. Monneau a,∗
a Université Paris-Est, CERMICS, Ecole des Ponts, 6 et 8 avenue Blaise Pascal, Cité Descartes Champs-sur-Marne, 77455 Marne-la-Vallée Cedex 2, France
b LaMA-Liban, Lebanese University, P.O. Box 826 Tripoli, Lebanon
c CEREMADE, Université Paris-Dauphine, Place De Lattre de Tassigny, 75775 Paris Cedex 16, France
Received 19 December 2008; accepted 12 January 2009 Available online 29 January 2009 Communicated by H. Brezis
Abstract In order to extend the blow-up criterion of solutions to the Euler equations, Kozono and Taniuchi [H. Kozono, Y. Taniuchi, Limiting case of the Sobolev inequality in BMO, with application to the Euler equations, Comm. Math. Phys. 214 (2000) 191–200] have proved a logarithmic Sobolev inequality by means of isotropic (elliptic) BMO norm. In this paper, we show a parabolic version of the Kozono–Taniuchi inequality by means of anisotropic (parabolic) BMO norm. More precisely we give an upper bound for the L∞ norm of a function in terms of its parabolic BMO norm, up to a logarithmic correction involving its norm in some Sobolev space. As an application, we also explain how to apply this inequality in order to establish a long-time existence result for a class of nonlinear parabolic problems. © 2009 Elsevier Inc. All rights reserved. Keywords: Logarithmic Sobolev inequalities; Parabolic BMO spaces; Anisotropic Lizorkin–Triebel spaces; Harmonic analysis
* Corresponding author.
E-mail addresses: [email protected] (H. Ibrahim), [email protected] (R. Monneau). 0022-1236/$ – see front matter © 2009 Elsevier Inc. All rights reserved. doi:10.1016/j.jfa.2009.01.008

1. Introduction and main results

In [12], Kozono and Taniuchi showed an $L^\infty$ estimate of a given function by means of its BMO norm (space of functions of bounded mean oscillation) and the logarithm of its norm in some Sobolev space. In fact, they proved that for $f \in W^s_p(\mathbb{R}^n)$, $1 < p < \infty$, the following estimate holds (with $\log^+ x = \max(\log x, 0)$):
\[
\|f\|_{L^\infty(\mathbb{R}^n)} \le C \bigl( 1 + \|f\|_{BMO(\mathbb{R}^n)} \bigl( 1 + \log^+ \|f\|_{W^s_p(\mathbb{R}^n)} \bigr) \bigr), \quad sp > n, \tag{1.1}
\]
for some constant $C = C(n, p, s) > 0$. The main advantage of the above estimate is that it was successfully applied (see [12, Theorem 2]) to extend the blow-up criterion of solutions to the Euler equations which was originally given by Beale, Kato and Majda in [1]. Inequality (1.1), as well as some variants of it, are shown (see [11,12,14]) using harmonic analysis on isotropic functional spaces of the Lizorkin–Triebel and Besov type. However, as is well known, it is important, say for parabolic partial differential equations, to consider spaces that are anisotropic. Motivated by the study of the long-time existence of a certain class of singular parabolic coupled systems (see [8,9]), we show in this paper an analogue of the Kozono–Taniuchi inequality (1.1) but of the parabolic (anisotropic) type. Due to the parabolic anisotropy, we consider functional spaces on $\mathbb{R}^{n+1} = \mathbb{R}^n \times \mathbb{R}$ with the generic variable $z = (x, t)$, where each coordinate $x_i$, $i = 1, \dots, n$, is given the weight 1, while the time coordinate t is given the weight 2.

We now state the main results of this paper. The first result concerns a Kozono–Taniuchi parabolic type inequality on the entire space $\mathbb{R}^{n+1}$. Introducing parabolic bounded mean oscillation $BMO_p$ spaces, and parabolic Sobolev spaces $W_2^{2m,m}$ (for the definition of these spaces, see Definitions 2.1 and 2.2), we present our first theorem.

Theorem 1.1 (Parabolic logarithmic Sobolev inequality on $\mathbb{R}^{n+1}$). Let $u \in W_2^{2m,m}(\mathbb{R}^{n+1})$, $m > \frac{n+2}{4}$. Then there exists a constant $C = C(m, n) > 0$ such that:
\[
\|u\|_{L^\infty(\mathbb{R}^{n+1})} \le C \bigl( 1 + \|u\|_{BMO_p(\mathbb{R}^{n+1})} \bigl( 1 + \log^+ \|u\|_{W_2^{2m,m}(\mathbb{R}^{n+1})} \bigr) \bigr). \tag{1.2}
\]

The proof of Theorem 1.1 will be given in Section 2, and is based on an approach developed by Ogawa [14]. Let us mention that our proof in this paper is self-contained. The second result of this paper concerns a Kozono–Taniuchi parabolic type inequality on the bounded domain
\[
\Omega_T = (0, 1)^n \times (0, T) \subset \mathbb{R}^{n+1}, \quad T > 0.
\]
More precisely, our next theorem reads:

Theorem 1.2 (Parabolic logarithmic Sobolev inequality on a bounded domain). Let $u \in W_2^{2m,m}(\Omega_T)$ with $m > \frac{n+2}{4}$. Then there exists a constant $C = C(m, n, T) > 0$ such that:
\[
\|u\|_{L^\infty(\Omega_T)} \le C \bigl( 1 + \|u\|_{\overline{BMO}_p(\Omega_T)} \bigl( 1 + \log^+ \|u\|_{W_2^{2m,m}(\Omega_T)} \bigr) \bigr), \tag{1.3}
\]
where $\|\cdot\|_{\overline{BMO}_p(\Omega_T)} = \|\cdot\|_{BMO_p(\Omega_T)} + \|\cdot\|_{L^1(\Omega_T)}$.

The proof of Theorem 1.2 will be given in Section 3.

1.1. Brief review of the literature

The brief review presented here only concerns logarithmic Sobolev inequalities of the elliptic type. To our knowledge, logarithmic Sobolev inequalities of the parabolic type have not been treated elsewhere in the literature.
The original type of the logarithmic Sobolev inequalities was found in Brezis and Gallouet [3] and Brezis and Wainger [4], where the authors investigated the relation between $L^\infty$, $W^k_r$ and $W^s_p$ and proved that there holds the embedding:
\[
\|f\|_{L^\infty(\mathbb{R}^n)} \le C \bigl( 1 + \log^{\frac{r-1}{r}} \bigl( 1 + \|f\|_{W^s_p(\mathbb{R}^n)} \bigr) \bigr), \quad sp > n, \tag{1.4}
\]
provided $\|f\|_{W^k_r(\mathbb{R}^n)} \le 1$ for $kr = n$. The estimate (1.4) was applied to prove global existence of solutions to the nonlinear Schrödinger equation (see [3,7]). A similar embedding for $f \in (W^s_p(\mathbb{R}^n))^n$ with $\operatorname{div} f = 0$ was investigated by Beale, Kato and Majda in [1]. The authors showed that:
\[
\|\nabla f\|_{L^\infty} \le C \bigl( 1 + \|\operatorname{rot} f\|_{L^\infty} \bigl( 1 + \log^+ \|f\|_{W^{s+1}_p} \bigr) + \|\operatorname{rot} f\|_{L^2} \bigr), \quad sp > n, \tag{1.5}
\]
where they made use of this estimate in order to give a blow-up criterion of solutions to the Euler equations (see [1]). In [12], Kozono and Taniuchi showed their inequality (1.1) in order to extend the blow-up criterion of solutions to the Euler equations given in [1] (see [12, Theorem 2]). A generalized version of (1.1) in Besov spaces was given in Kozono et al. [11]. Finally, a sharp version of the logarithmic Sobolev inequality of the Beale–Kato–Majda and the Kozono–Taniuchi type in the Lizorkin–Triebel spaces was established by Ogawa in [14].

1.2. Organization of the paper

This paper is organized as follows. In Section 2, we recall basic tools used in our analysis, and give the proof of Theorem 1.1. In Section 3, we present the proof of Theorem 1.2, and as an application, we explain how to use the parabolic Kozono–Taniuchi inequality in order to prove the long-time existence of certain parabolic equations.

2. A parabolic Kozono–Taniuchi inequality on $\mathbb{R}^{n+1}$

This section is devoted to the proof of Theorem 1.1.

2.1. Preliminaries and basic tools

2.1.1. Parabolic $BMO_p$ and Sobolev spaces

We start by recalling some definitions and introducing some notations. A generic point in $\mathbb{R}^{n+1}$ will be denoted by $z = (x, t) \in \mathbb{R}^n \times \mathbb{R}$, $x = (x_1, \dots, x_n)$. Let $S(\mathbb{R}^{n+1})$ be the usual Schwartz space, and $S'(\mathbb{R}^{n+1})$ the corresponding dual space. Let $u \in S'(\mathbb{R}^{n+1})$. For $\xi = (\xi_1, \dots, \xi_n) \in \mathbb{R}^n$ and $\tau \in \mathbb{R}$ we denote by $F u(\xi, \tau) \equiv \hat u(\xi, \tau)$ and $F^{-1} u(\xi, \tau) \equiv \check u(\xi, \tau)$ the Fourier and the inverse Fourier transform of u respectively. We also denote $D_t^r = \frac{\partial^r}{\partial t^r}$, $r \in \mathbb{N}$, and $D_x^s$, $s \in \mathbb{N}$, any derivative with respect to x of order s. The parabolic distance from $z = (x, t)$ to the origin is defined by:
\[
[z] = \max \bigl\{ |x_1|, \dots, |x_n|, |t|^{1/2} \bigr\}. \tag{2.1}
\]
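As a small illustration of (2.1), the sketch below evaluates the parabolic distance and checks its homogeneity under the anisotropic scaling $(x, t) \mapsto (\eta x, \eta^2 t)$, $\eta > 0$, which follows directly from the definition. The function names `parabolic_norm` and `dilate` and the sample point are illustrative choices, not notation from the paper.

```python
def parabolic_norm(z):
    """Parabolic distance [z] = max(|x_1|, ..., |x_n|, |t|**0.5) for z = (x_1, ..., x_n, t)."""
    *x, t = z
    return max([abs(xi) for xi in x] + [abs(t) ** 0.5])


def dilate(eta, z):
    """Anisotropic scaling (x, t) -> (eta * x, eta**2 * t): weight 1 in space, weight 2 in time."""
    *x, t = z
    return tuple(eta * xi for xi in x) + (eta ** 2 * t,)


z = (0.5, -1.5, 4.0)                      # a sample point with n = 2
print(parabolic_norm(z))                  # max(0.5, 1.5, 2.0)
print(parabolic_norm(dilate(3.0, z)))     # scales by the factor 3
```

The time variable enters through a square root, which is why a factor $\eta$ in space must be paired with $\eta^2$ in time for the distance to scale linearly.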
Let $O \subseteq \mathbb{R}^{n+1}$ be an open set. The parabolic bounded mean oscillation space $BMO_p$ and the parabolic Sobolev space $W_2^{2m,m}$ are now recalled.
Definition 2.1 (Parabolic bounded mean oscillation spaces). A function $u \in L^1_{loc}(O)$ is said to be of parabolic bounded mean oscillation, $u \in BMO_p(O)$, if we have:
\[
\|u\|_{BMO_p(O)} = \sup_{Q \subset O} \frac{1}{|Q|} \int_Q |u - u_Q| < +\infty. \tag{2.2}
\]
Here Q denotes an arbitrary parabolic cube
\[
Q = Q_r = Q_r(z_0) = \bigl\{ z \in \mathbb{R}^{n+1};\ [z - z_0] < r \bigr\}, \tag{2.3}
\]
and
\[
u_Q = \frac{1}{|Q|} \int_Q u. \tag{2.4}
\]
The functions in $BMO_p$ are defined up to an additive constant. We also define the space $\overline{BMO}_p$ as:
\[
\overline{BMO}_p(O) = BMO_p(O) \cap L^1(O) \quad \text{with} \quad \|\cdot\|_{\overline{BMO}_p} = \|\cdot\|_{BMO_p} + \|\cdot\|_{L^1}.
\]

Definition 2.2 (Parabolic Sobolev spaces). Let m be a non-negative integer. We define the parabolic Sobolev space $W_2^{2m,m}(O)$ as follows:
\[
W_2^{2m,m}(O) = \bigl\{ u \in L^2(O);\ D_t^r D_x^s u \in L^2(O),\ \forall r, s \in \mathbb{N} \text{ such that } 2r + s \le 2m \bigr\}.
\]
The norm of $u \in W_2^{2m,m}(O)$ is defined by:
\[
\|u\|_{W_2^{2m,m}(O)} = \sum_{j=0}^{2m} \sum_{2r+s=j} \|D_t^r D_x^s u\|_{L^2(O)}.
\]

The next lemma concerns a Sobolev embedding of $W_2^{2m,m}$.

Lemma 2.3 (Sobolev embedding). (See [13, Lemma 3.3].) Let m be a non-negative integer satisfying $m > \frac{n+2}{4}$. Then there exists a positive constant C depending on m and n such that for any $u \in W_2^{2m,m}(O)$, the function u is continuous and bounded on O, and satisfies
\[
\|u\|_{L^\infty(O)} \le C \|u\|_{W_2^{2m,m}(O)}. \tag{2.5}
\]
2.1.2. Parabolic Lizorkin–Triebel and Besov spaces

Here we give the definition of Lizorkin–Triebel spaces. These spaces are constructed out of the parabolic Littlewood–Paley decomposition that we recall here. Let $\psi_0(z) \in C_0^\infty(\mathbb{R}^{n+1})$ be a function such that
\[
\psi_0(z) = 1 \text{ if } [z] \le 1 \quad \text{and} \quad \psi_0(z) = 0 \text{ if } [z] \ge 2. \tag{2.6}
\]
For such a function $\psi_0$, we may define a smooth, anisotropic dyadic partition of unity $(\psi_j)_{j \in \mathbb{N}}$ by letting
\[
\psi_j(z) = \psi_0(2^{-ja} z) - \psi_0(2^{-(j-1)a} z) \quad \text{if } j \ge 1.
\]
Here $a = (1, \dots, 1, 2) \in \mathbb{R}^{n+1}$, and for $\eta \in \mathbb{R}$, $b = (b_1, \dots, b_n, b_{n+1}) \in \mathbb{R}^{n+1}$, the dilation $\eta^b z$ is defined by $\eta^b z = (\eta^{b_1} z_1, \dots, \eta^{b_n} z_n, \eta^{b_{n+1}} z_{n+1})$. It is clear that
\[
\sum_{j=0}^{\infty} \psi_j(z) = 1 \quad \text{for } z \in \mathbb{R}^{n+1},
\]
and
\[
\operatorname{supp} \psi_j \subset \bigl\{ z;\ 2^{j-1} \le [z] \le 2^{j+1} \bigr\}, \quad j \ge 1.
\]
Define $\phi_j$, $j \ge 0$, as the inverse Fourier transform of $\psi_j$, i.e. $\hat\phi_j = \psi_j$. It is worth noticing that
\[
\phi_j(z) = 2^{(n+2)(j-1)} \phi_1(2^{(j-1)a} z) \quad \text{for } j \ge 1, \tag{2.7}
\]
and that for any $u \in S'(\mathbb{R}^{n+1})$,
\[
u = (2\pi)^{-\frac{n+1}{2}} \sum_{j=0}^{\infty} \phi_j * u \quad \text{with convergence in } S'(\mathbb{R}^{n+1}).
\]
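The telescoping structure behind this partition of unity can be illustrated with a short sketch. It builds $\psi_j$ from a cutoff $\psi_0$ exactly as above and checks, at one sample point, that the partial sums equal 1 once $2^J \ge [z]$ and that $\psi_j$ vanishes outside its dyadic shell. One assumption is made for simplicity: the cutoff used here is piecewise linear in $[z]$, so it satisfies (2.6) but is not smooth; the function names and the sample point are illustrative.

```python
def parabolic_norm(z):
    # [z] = max(|x_1|, ..., |x_n|, |t|**0.5) for z = (x_1, ..., x_n, t)
    *x, t = z
    return max([abs(xi) for xi in x] + [abs(t) ** 0.5])


def dilate(eta, z):
    # eta^a z with a = (1, ..., 1, 2)
    *x, t = z
    return tuple(eta * xi for xi in x) + (eta ** 2 * t,)


def psi0(z):
    # Cutoff equal to 1 for [z] <= 1 and 0 for [z] >= 2 (piecewise linear, NOT smooth).
    return min(1.0, max(0.0, 2.0 - parabolic_norm(z)))


def psi(j, z):
    # psi_j(z) = psi_0(2^{-ja} z) - psi_0(2^{-(j-1)a} z) for j >= 1
    if j == 0:
        return psi0(z)
    return psi0(dilate(2.0 ** (-j), z)) - psi0(dilate(2.0 ** (-(j - 1)), z))


z = (3.0, -0.7, 10.0)                      # here 2 < [z] < 4
partial = sum(psi(j, z) for j in range(6))
print(partial)
print([j for j in range(6) if psi(j, z) != 0.0])
```

The partial sum up to J collapses to $\psi_0(2^{-Ja} z)$, which is why it equals 1 as soon as $[z] \le 2^J$.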
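We now give the definition of the anisotropic Besov and Lizorkin–Triebel spaces.

Definition 2.4 (Anisotropic Besov spaces). The anisotropic Besov space $B^s_{p,q}(\mathbb{R}^{n+1}) = B^s_{p,q}$, $s \in \mathbb{R}$, $1 \le p \le \infty$ and $1 \le q \le \infty$, is the space of functions $u \in S'(\mathbb{R}^{n+1})$ with finite quasi-norms
\[
\|u\|_{B^s_{p,q}} = \Bigl( \sum_{j=0}^{\infty} 2^{sqj} \|\phi_j * u\|^q_{L^p(\mathbb{R}^{n+1})} \Bigr)^{1/q} \tag{2.8}
\]
and the natural modification for $q = \infty$, i.e.
\[
\|u\|_{B^s_{p,\infty}} = \sup_{j \ge 0} 2^{sj} \|\phi_j * u\|_{L^p(\mathbb{R}^{n+1})}. \tag{2.9}
\]

Definition 2.5 (Anisotropic Lizorkin–Triebel spaces). The anisotropic Lizorkin–Triebel space $F^s_{p,q}(\mathbb{R}^{n+1}) = F^s_{p,q}$, $s \in \mathbb{R}$, $1 \le p < \infty$ and $1 \le q \le \infty$ (or $1 \le q < \infty$ and $p = \infty$), is the space of functions $u \in S'(\mathbb{R}^{n+1})$ with finite quasi-norms
\[
\|u\|_{F^s_{p,q}} = \Bigl\| \Bigl( \sum_{j=0}^{\infty} 2^{sqj} |\phi_j * u|^q \Bigr)^{1/q} \Bigr\|_{L^p(\mathbb{R}^{n+1})} \tag{2.10}
\]
and the natural modification for $q = \infty$, i.e.
\[
\|u\|_{F^s_{p,\infty}} = \Bigl\| \sup_{j \ge 0} 2^{sj} |\phi_j * u| \Bigr\|_{L^p(\mathbb{R}^{n+1})}. \tag{2.11}
\]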
A very useful space throughout our analysis will be the truncated anisotropic (parabolic) Lizorkin–Triebel space $\widetilde F^s_{p,q}$ that we define here.

Definition 2.6 (Truncated anisotropic Lizorkin–Triebel space). The truncated anisotropic Lizorkin–Triebel space $\widetilde F^s_{p,q}(\mathbb{R}^{n+1}) = \widetilde F^s_{p,q}$, $s \in \mathbb{R}$, $1 \le p < \infty$ and $1 \le q \le \infty$ ($1 \le q < \infty$ if $p = \infty$), is the space of functions $u \in S'(\mathbb{R}^{n+1})$ with finite quasi-norms
\[
\|u\|_{\widetilde F^s_{p,q}} = \Bigl\| \Bigl( \sum_{j=1}^{\infty} 2^{sqj} |\phi_j * u|^q \Bigr)^{1/q} \Bigr\|_{L^p(\mathbb{R}^{n+1})} \tag{2.12}
\]
and the natural modification for $q = \infty$, i.e.
\[
\|u\|_{\widetilde F^s_{p,\infty}} = \Bigl\| \sup_{j \ge 1} 2^{sj} |\phi_j * u| \Bigr\|_{L^p(\mathbb{R}^{n+1})}. \tag{2.13}
\]

The basic difference between $F^s_{p,q}$ and $\widetilde F^s_{p,q}$ is that in $\widetilde F^s_{p,q}$ we omit the term $\phi_0 * u$ and only take into consideration the terms $\phi_j * u$, $j \ge 1$. Sobolev embeddings of parabolic Lizorkin–Triebel and Besov spaces are shown by the next two lemmas.

Lemma 2.7 (Embeddings of Besov spaces). (See [10, Theorem 7].) Let $s, t \in \mathbb{R}$, $s > t$, and $1 \le p, r \le \infty$ satisfy $s - \frac{n+2}{p} = t - \frac{n+2}{r}$. Then for any $1 \le q \le \infty$ we have the following continuous embedding
\[
B^s_{p,q}(\mathbb{R}^{n+1}) \hookrightarrow B^t_{r,q}(\mathbb{R}^{n+1}). \tag{2.14}
\]

Lemma 2.8 (Sobolev embeddings). (See [15, Proposition 2].) Take an integer $m \ge 1$. Then we have
\[
B^{2m}_{2,1} \hookrightarrow W_2^{2m,m} \hookrightarrow B^{2m}_{2,\infty}. \tag{2.15}
\]
2.2. Basic logarithmic Sobolev inequality

In this subsection we show a basic logarithmic Sobolev inequality. In particular, we show the following lemma.

Lemma 2.9 (Basic logarithmic Sobolev inequality). Let $u \in W_2^{2m,m}(\mathbb{R}^{n+1})$ for some $m \in \mathbb{N}$, $m > \frac{n+2}{4}$. Then there exists some constant $C = C(m, n) > 0$ such that
\[
\|u\|_{\widetilde F^0_{\infty,1}} \le C \Bigl( 1 + \|u\|_{\widetilde F^0_{\infty,2}} \bigl( 1 + \bigl( \log^+ \|u\|_{W_2^{2m,m}} \bigr)^{1/2} \bigr) \Bigr). \tag{2.16}
\]
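Proof. First, let us mention that the ideas of the proof of this lemma are inspired by the proof of Ogawa [14, Corollary 2.4]. The proof is divided into three steps, and the constants in the proof may vary from line to line.

Step 1 (Estimate of $\|u\|_{\widetilde F^0_{\infty,1}}$). Let $\gamma > 0$ and $N \in \mathbb{N}$ be two arbitrary variables. We compute:
\[
\begin{aligned}
\|u\|_{\widetilde F^0_{\infty,1}}
&\le \Bigl\| \sum_{1 \le j < N} |\phi_j * u| \Bigr\|_{L^\infty} + \Bigl\| \sum_{j \ge N} 2^{-\gamma j}\, 2^{\gamma j} |\phi_j * u| \Bigr\|_{L^\infty} \\
&\le N^{1/2} \Bigl\| \Bigl( \sum_{1 \le j < N} |\phi_j * u|^2 \Bigr)^{1/2} \Bigr\|_{L^\infty} + C_\gamma 2^{-\gamma N} \Bigl\| \Bigl( \sum_{j \ge N} 2^{2\gamma j} |\phi_j * u|^2 \Bigr)^{1/2} \Bigr\|_{L^\infty} \\
&\le C_\gamma \bigl( N^{1/2} \|u\|_{\widetilde F^0_{\infty,2}} + 2^{-\gamma N} \|u\|_{F^\gamma_{\infty,2}} \bigr),
\end{aligned}
\]
where $C_\gamma > 0$ is a positive constant.

Step 2 (Optimization in N). We optimize the previous inequality in N by setting:
\[
N = 1 \quad \text{if} \quad \|u\|_{F^\gamma_{\infty,2}} \le 2^\gamma \|u\|_{\widetilde F^0_{\infty,2}}.
\]
In this case we can easily check that:
\[
\|u\|_{\widetilde F^0_{\infty,1}} \le C_\gamma \|u\|_{\widetilde F^0_{\infty,2}} \Bigl( 1 + \Bigl( \log^+ \frac{\|u\|_{F^\gamma_{\infty,2}}}{\|u\|_{\widetilde F^0_{\infty,2}}} \Bigr)^{1/2} \Bigr). \tag{2.17}
\]
In the case where $\|u\|_{F^\gamma_{\infty,2}} > 2^\gamma \|u\|_{\widetilde F^0_{\infty,2}}$, we choose $1 \le \beta < 2^\gamma$ such that
\[
N = \log^+_{2^\gamma} \Bigl( \beta \frac{\|u\|_{F^\gamma_{\infty,2}}}{\|u\|_{\widetilde F^0_{\infty,2}}} \Bigr) \in \mathbb{N}.
\]
We then compute:
\[
\begin{aligned}
N^{1/2} \|u\|_{\widetilde F^0_{\infty,2}} + 2^{-\gamma N} \|u\|_{F^\gamma_{\infty,2}}
&\le \|u\|_{\widetilde F^0_{\infty,2}} \Bigl( \frac{1}{\beta} + \Bigl( \log^+_{2^\gamma} \Bigl( \beta \frac{\|u\|_{F^\gamma_{\infty,2}}}{\|u\|_{\widetilde F^0_{\infty,2}}} \Bigr) \Bigr)^{1/2} \Bigr) \\
&\le \|u\|_{\widetilde F^0_{\infty,2}} \Bigl( \frac{1}{\beta} + \Bigl( \frac{2}{\log 2^\gamma} \log^+ \frac{\|u\|_{F^\gamma_{\infty,2}}}{\|u\|_{\widetilde F^0_{\infty,2}}} \Bigr)^{1/2} \Bigr) \\
&\le C_\gamma \|u\|_{\widetilde F^0_{\infty,2}} \Bigl( 1 + \Bigl( \log^+ \frac{\|u\|_{F^\gamma_{\infty,2}}}{\|u\|_{\widetilde F^0_{\infty,2}}} \Bigr)^{1/2} \Bigr),
\end{aligned}
\]
hence we also have (2.17) with a different constant $C_\gamma$.

Step 3 (Estimate of $\|u\|_{F^\gamma_{\infty,2}}$ and conclusion). Noting the inequality
\[
x \Bigl( \log \Bigl( e + \frac{y}{x} \Bigr) \Bigr)^{1/2} \le
\begin{cases}
C \bigl( 1 + x (\log(e + y))^{1/2} \bigr) & \text{for } 0 < x \le 1, \\
C x (\log(e + y))^{1/2} & \text{for } x > 1,
\end{cases}
\]
we deduce from (2.17) that:
\[
\|u\|_{\widetilde F^0_{\infty,1}} \le C \Bigl( 1 + \|u\|_{\widetilde F^0_{\infty,2}} \bigl( 1 + \bigl( \log^+ \|u\|_{F^\gamma_{\infty,2}} \bigr)^{1/2} \bigr) \Bigr), \tag{2.18}
\]
where the constant C depends also on $\gamma$. We now estimate the term $\|u\|_{F^\gamma_{\infty,2}}$. Choose $\gamma$ such that
\[
0 < \gamma < 2m - \frac{n+2}{2}.
\]
Call $\alpha = 2m - \frac{n+2}{2}$; we compute:
\[
\begin{aligned}
\|u\|_{F^\gamma_{\infty,2}}
&= \Bigl\| \Bigl( \sum_{j \ge 0} 2^{2j\gamma} |\phi_j * u|^2 \Bigr)^{1/2} \Bigr\|_{L^\infty} \\
&\le \Bigl( \sum_{j \ge 0} 2^{2j(\gamma - \alpha)} \Bigr)^{1/2} \Bigl\| \sup_{j \ge 0} 2^{\alpha j} |\phi_j * u| \Bigr\|_{L^\infty} \\
&\le C \|u\|_{B^\alpha_{\infty,\infty}}.
\end{aligned} \tag{2.19}
\]
It is easy to check (see (2.14), Lemma 2.7, and (2.15), Lemma 2.8) that we have the continuous embeddings
\[
W_2^{2m,m} \hookrightarrow B^{2m}_{2,\infty} \hookrightarrow B^\alpha_{\infty,\infty}.
\]
Therefore (from inequality (2.19)) we get:
\[
\|u\|_{F^\gamma_{\infty,2}} \le C \|u\|_{W_2^{2m,m}},
\]
hence the result directly follows from (2.18). □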
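The optimization in N used in Step 2 can be illustrated numerically. In the sketch below, `A` and `B` are placeholders for the two norms on the right-hand side of the Step 1 estimate (the 0-index and the γ-index norm, respectively); their values and the value of `gamma` are hypothetical. Choosing N as the smallest integer with $2^{\gamma N} \ge B/A$ corresponds to the choice of $\beta \in [1, 2^\gamma)$ in the proof. This is a sanity check of the arithmetic under these assumptions, not a proof.

```python
import math


def optimal_N(A, B, gamma):
    # N = 1 when B <= 2**gamma * A; otherwise the smallest integer N with
    # 2**(gamma * N) >= B / A.
    R = B / A
    if R <= 2.0 ** gamma:
        return 1
    return math.ceil(math.log(R) / (gamma * math.log(2.0)))


def bound(N, A, B, gamma):
    # The quantity N^{1/2} * A + 2^{-gamma * N} * B to be optimized.
    return math.sqrt(N) * A + 2.0 ** (-gamma * N) * B


A, B, gamma = 1.0, 1.0e6, 0.5     # hypothetical values with B much larger than A
N = optimal_N(A, B, gamma)
lhs = bound(N, A, B, gamma)
# Logarithmic bound: A * (1 + (2 * log(B / A) / log(2**gamma))**0.5)
rhs = A * (1.0 + math.sqrt(2.0 * math.log(B / A) / (gamma * math.log(2.0))))
print(N, lhs <= rhs, lhs < bound(1, A, B, gamma))
```

With the naive choice N = 1 the bound grows linearly in B, whereas the optimized choice grows only like the square root of log(B/A); this is the source of the logarithmic correction in (2.16).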
2.3. Proof of Theorem 1.1

In this subsection we present the proof of several lemmas leading to the proof of Theorem 1.1. We start with the following lemma concerning mean estimates of functions on parabolic cubes. Call $Q_{2^j} \subset \mathbb{R}^{n+1}$, $j \ge 0$, any arbitrary parabolic cube of radius $2^j$ (see (2.3) for the definition of parabolic cubes). For the sake of simplicity, we denote
\[
Q^j = Q_{2^j} \quad \text{for all } j \in \mathbb{Z}. \tag{2.20}
\]
Our next lemma reads:

Lemma 2.10 (Mean estimates on parabolic cubes). Let $u \in BMO_p(\mathbb{R}^{n+1})$. Take $Q^j \subset Q^{j+1}$, $j \ge 0$ ($Q^j$ and $Q^{j+1}$ do not necessarily have the same center). Then we have (with the notation (2.4)):
\[
|u_{Q^{j+1}} - u_{Q^j}| \le \bigl( 1 + 2^{n+2} \bigr) \|u\|_{BMO_p}. \tag{2.21}
\]
More generally, we have for any $Q^j \subseteq Q^k$, $j, k \in \mathbb{Z}$:
\[
|u_{Q^k} - u_{Q^j}| \le (k - j) \bigl( 1 + 2^{n+2} \bigr) \|u\|_{BMO_p}. \tag{2.22}
\]
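Proof. We easily remark that:
\[
|Q^{j+1}| = 2^{n+2} |Q^j|.
\]
We compute:
\[
\begin{aligned}
|u_{Q^{j+1}} - u_{Q^j}|
&= \frac{1}{|Q^j|} \int_{Q^j} |u_{Q^{j+1}} - u_{Q^j}| \\
&\le \frac{1}{|Q^j|} \int_{Q^j} |u - u_{Q^j}| + \frac{1}{|Q^j|} \int_{Q^j} |u - u_{Q^{j+1}}| \\
&\le \|u\|_{BMO_p} + \frac{2^{n+2}}{|Q^{j+1}|} \int_{Q^{j+1}} |u - u_{Q^{j+1}}| \\
&\le \|u\|_{BMO_p} + 2^{n+2} \|u\|_{BMO_p} = \bigl( 1 + 2^{n+2} \bigr) \|u\|_{BMO_p},
\end{aligned}
\]
which immediately gives (2.21), and consequently (2.22). □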
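The volume relation used at the start of this proof follows from (2.3): a parabolic cube of radius r has side 2r in each of the n space directions and length $2r^2$ in time, so $|Q_r| = 2^{n+1} r^{n+2}$. The sketch below checks the resulting doubling factor $2^{n+2}$ for a few dyadic radii; the function name `cube_volume` and the choice n = 3 are illustrative.

```python
def cube_volume(r, n):
    # |Q_r| for Q_r = { z : [z - z0] < r }: n space sides of length 2r, one time side of length 2r^2.
    return (2.0 * r) ** n * (2.0 * r ** 2)


n = 3
# Ratio |Q^{j+1}| / |Q^j| for dyadic radii 2^j, including negative j.
ratios = [cube_volume(2.0 ** (j + 1), n) / cube_volume(2.0 ** j, n) for j in range(-3, 4)]
print(ratios)
```

The exponent n + 2 is the homogeneous dimension of $\mathbb{R}^{n+1}$ for the parabolic scaling, which is why it appears in place of n + 1 throughout the estimates.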
The following two lemmas are of notable importance for the proof of the logarithmic Sobolev inequality (1.2). In the first lemma we bound the terms $\phi_j * u$ for $j \ge 1$, while, in the second lemma, we give a bound on $\phi_0 * u$.

Lemma 2.11 (Estimate of $\|\phi_j * u\|_{L^\infty(\mathbb{R}^{n+1})}$ for $j \ge 1$). Let $u \in BMO_p(\mathbb{R}^{n+1})$. Then there exists a constant $C = C(n) > 0$ such that:
\[
\|u * \phi_j\|_{L^\infty(\mathbb{R}^{n+1})} \le C \|u\|_{BMO_p(\mathbb{R}^{n+1})} \quad \text{for any } j \ge 1, \tag{2.23}
\]
where $(\phi_j)_{j \ge 1}$ is the sequence of functions given in (2.7).

Proof. We will show that
\[
|(\phi_j * u)(z)| \le C \|u\|_{BMO_p} \quad \text{for } z = 0. \tag{2.24}
\]
The general case with $z \in \mathbb{R}^{n+1}$ can be deduced from (2.24) by translation. Throughout the proof, we will sometimes omit (when there is no confusion) the dependence of the norm on the space $\mathbb{R}^{n+1}$. The proof is divided into three steps.

Step 1 (Decomposition of $(\phi_j * u)(0)$ on parabolic cubes). Since $\hat\phi_j$ is supported in $\{ z \in \mathbb{R}^{n+1};\ 2^{j-1} \le [z] \le 2^{j+1} \}$, we have $\hat\phi_j(0) = 0 = \int_{\mathbb{R}^{n+1}} \phi_j$. Using this equality, we can write:
\[
(\phi_j * u)(0) = \int_{\mathbb{R}^{n+1}} \phi_j(-z) \bigl( u(z) - u_{Q^{1-j}} \bigr) \, dz,
\]
where $Q^{1-j}$ is the parabolic cube defined by (2.20) and centered at 0. This implies that
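\[
|(\phi_j * u)(0)| \le \underbrace{\int_{Q^{1-j}} |\phi_j(-z)| \, |u(z) - u_{Q^{1-j}}| \, dz}_{A_1} + \underbrace{\int_{\mathbb{R}^{n+1} \setminus Q^{1-j}} |\phi_j(-z)| \, |u(z) - u_{Q^{1-j}}| \, dz}_{A_2}. \tag{2.25}
\]

Step 1.1 (Estimate of $A_1$). From (2.7), the term $A_1$ can be estimated as follows:
\[
\begin{aligned}
A_1 &\le 2^{(n+2)(j-1)} \|\phi_1\|_{L^\infty} \int_{Q^{1-j}} |u(z) - u_{Q^{1-j}}| \, dz \\
&\le 2^{(n+2)(j-1)} |Q^{1-j}| \, \|\phi_1\|_{L^\infty} \|u\|_{BMO_p} \\
&\le |Q^1| \, \|\phi_1\|_{L^\infty} \|u\|_{BMO_p},
\end{aligned}
\]
hence
\[
A_1 \le C_0 \|u\|_{BMO_p} \quad \text{with } C_0 = |Q^1| \, \|\phi_1\|_{L^\infty(\mathbb{R}^{n+1})}. \tag{2.26}
\]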
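Step 2 (Estimate of $A_2$). We rewrite $A_2$ as the following series:
\[
A_2 = 2^{(n+2)(j-1)} \sum_{k=-\infty}^{j} \int_{Q^{2-k} \setminus Q^{1-k}} \bigl| \phi_1\bigl( -2^{(j-1)a} z \bigr) \bigr| \, |u(z) - u_{Q^{1-j}}| \, dz. \tag{2.27}
\]
Since $\phi_1$ is the inverse Fourier transform of a compactly supported function, we have:
\[
\forall m \in \mathbb{N}^*,\ \exists C_1 > 0, \quad |\phi_1(z)| \le \frac{C_1}{[z]^m} \quad \text{for all } [z] \ge 1. \tag{2.28}
\]
The asymptotic behavior of $\phi_1$ shown by (2.28) leads to the following decomposition of the term $A_2$:
\[
A_2 \le \underbrace{C_1 2^{(n+2)(j-1)} \sum_{k=-\infty}^{j} \int_{Q^{2-k} \setminus Q^{1-k}} \frac{1}{[2^{(j-1)a} z]^m} \, |u(z) - u_{Q^{2-k}}| \, dz}_{A_3} + \underbrace{C_1 2^{(n+2)(j-1)} \sum_{k=-\infty}^{j} \int_{Q^{2-k} \setminus Q^{1-k}} \frac{1}{[2^{(j-1)a} z]^m} \, |u_{Q^{2-k}} - u_{Q^{1-j}}| \, dz}_{A_4}.
\]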
Step 2.1 (Estimate of $A_3$). Since the integral appearing in $A_3$ is done over $Q^{2-k} \setminus Q^{1-k}$, we obtain
\[
[2^{(j-1)a} z]^m \ge 2^{m(j-k)}.
\]
Using this inequality together with the fact that
\[
\int_{Q^{2-k}\setminus Q^{1-k}} |u(z) - u_{Q^{2-k}}|\,dz \le 2^{(n+2)(2-k)}\,|Q^1|\,\|u\|_{BMO_p},
\]
we can estimate the term $A_3$ as follows:
\[
A_3 \le C_1 2^{n+2} \sum_{k=-\infty}^{j} 2^{-(m-(n+2))(j-k)}\,|Q^1|\,\|u\|_{BMO_p}, \tag{2.29}
\]
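where the above series converges for $m > n + 2$.

Step 2.2 (Estimate of $A_4$). Using Lemma 2.10, and the fact that $[2^{(j-1)a} z]^m \ge 2^{m(j-k)}$ on $Q^{2-k} \setminus Q^{1-k}$, the term $A_4$ can be estimated as follows:
\[
\begin{aligned}
A_4 &\le C_1 2^{(n+2)(j-1)} \sum_{k=-\infty}^{j} 2^{-m(j-k)} (1 + j - k)\,|Q^{2-k}|\,\|u\|_{BMO_p} \\
&\le C_1 2^{n+2} \sum_{k=-\infty}^{j} 2^{-(m-(n+2))(j-k)} (1 + j - k)\,|Q^1|\,\|u\|_{BMO_p},
\end{aligned} \tag{2.30}
\]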
where the above series also converges for $m > n + 2$.

Step 3 (Conclusion). From (2.26), (2.29) and (2.30), inequality (2.23) directly follows with a constant $C > 0$ independent of $j$. □

Lemma 2.12 (Estimate of $\|\phi_0 * u\|_{L^\infty(\mathbb{R}^{n+1})}$). Let $u \in W_2^{2m,m}(\mathbb{R}^{n+1})$ with $m > \frac{n+2}{4}$. Then there exists a constant $C = C(m, n) > 0$ such that we have:
\[
\|\phi_0 * u\|_{L^\infty} \le C\Bigl(1 + \|u\|_{BMO_p}\bigl(1 + \log^+ \|u\|_{W_2^{2m,m}}\bigr)\Bigr). \tag{2.31}
\]
Proof. The constants that will appear may differ from line to line, but only depend on $n$ and $m$. The proof of this lemma combines somehow the proof of Lemmas 2.9 and 2.11. We write down $u_{Q^1}$ as a finite sum of a telescopic sequence for $N \ge 1$:
\[
u_{Q^1} = (u_{Q^1} - u_{Q^2}) + \cdots + (u_{Q^{N-1}} - u_{Q^N}) + u_{Q^N}.
\]
From Lemma 2.10, we deduce that:
\[
|u_{Q^1}| \le C(N - 1)\|u\|_{BMO_p} + |u_{Q^N}|.
\]
Remark that applying Cauchy–Schwarz inequality, we get
\[
|u_{Q^N}| \le \frac{1}{|Q^N|} \int_{Q^N} |u| \le \frac{1}{|Q^N|^{1/2}} \Bigl(\int_{Q^N} u^2\Bigr)^{1/2},
\]
then we obtain
\[
|u_{Q^1}| \le C\bigl(N\|u\|_{BMO_p} + 2^{-\gamma N} \|u\|_{W_2^{2m,m}}\bigr) \quad \text{with } \gamma = \frac{n+2}{2}. \tag{2.32}
\]
Following similar arguments as in the proof of Lemma 2.9, we may optimize (2.32) in $N$, we finally get:
\[
|u_{Q^1}| \le C\Bigl(1 + \|u\|_{BMO_p}\bigl(1 + \log^+ \|u\|_{W_2^{2m,m}}\bigr)\Bigr). \tag{2.33}
\]
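The optimization in $N$ that leads from (2.32) to (2.33) can be made explicit. The following is only one possible choice of $N$, given as a sketch: the source refers to the proof of Lemma 2.9, which is not reproduced here, and the shorthand $B$, $W$ and the constant $C'$ are introduced for this illustration only.

```latex
% Shorthand (not in the source): B = \|u\|_{BMO_p}, \quad W = \|u\|_{W_2^{2m,m}}.
\begin{align*}
&\text{Choose } N = 1 + \Bigl\lceil \tfrac{1}{\gamma}\log_2^+ W \Bigr\rceil \ge 1,
  \qquad \log_2^+ W = \max(0, \log_2 W), \\
&2^{-\gamma N}\, W \le 2^{-\log_2^+ W}\, W \le 1, \qquad
  N \le 2 + \frac{\log^+ W}{\gamma \log 2}, \\
&|u_{Q^1}| \le C\,(N B + 1)
  \le C\Bigl(1 + B\bigl(2 + \tfrac{1}{\gamma \log 2}\log^+ W\bigr)\Bigr)
  \le C'\bigl(1 + B\,(1 + \log^+ W)\bigr).
\end{align*}
```

Here one may take $C' = C \max\bigl(2, (\gamma \log 2)^{-1}\bigr)$, which depends only on $n$ and $m$.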
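We now estimate $|(\phi_0 * u)(z)|$ for $z = 0$. Again, the same estimate could be obtained for any $z \in \mathbb{R}^{n+1}$ by translation. We write
\[
\begin{aligned}
(\phi_0 * u)(0) &= \int_{\mathbb{R}^{n+1}} \phi_0(-z)\,u(z) \\
&= \int_{\mathbb{R}^{n+1}} \phi_0(-z)\,\bigl(u(z) - u_{Q^1}\bigr) + \int_{\mathbb{R}^{n+1}} \phi_0(-z)\,u_{Q^1} \\
&= \overbrace{\int_{Q^1} \phi_0(-z)\,\bigl(u(z) - u_{Q^1}\bigr)}^{B_1} + \overbrace{\int_{\mathbb{R}^{n+1}\setminus Q^1} \phi_0(-z)\,\bigl(u(z) - u_{Q^1}\bigr)}^{B_2} + \overbrace{\int_{\mathbb{R}^{n+1}} \phi_0(-z)\,u_{Q^1}}^{B_3},
\end{aligned}
\]
where
\[
|B_1| \le C\|u\|_{BMO_p}, \tag{2.34}
\]
and, from (2.33),
\[
|B_3| \le C\Bigl(1 + \|u\|_{BMO_p}\bigl(1 + \log^+ \|u\|_{W_2^{2m,m}}\bigr)\Bigr). \tag{2.35}
\]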
In order to estimate $B_2$, we argue as in Step 2 of Lemma 2.11. In fact we have:
\[
\begin{aligned}
|B_2| &\le \sum_{k \ge 1} \int_{Q^{k+1}\setminus Q^k} |\phi_0(-z)|\,|u(z) - u_{Q^{k+1}}| + \sum_{k \ge 1} \int_{Q^{k+1}\setminus Q^k} |\phi_0(-z)|\,|u_{Q^{k+1}} - u_{Q^1}| \\
&\le C \sum_{k \ge 1} |Q^{k+1}|\,(1 + k)\,\|u\|_{BMO_p} \sup_{Q^{k+1}\setminus Q^k} |\phi_0(-z)| \\
&\le C\,2^{n+2} \sum_{k \ge 1} 2^{-(m-(n+2))k}\,(1 + k)\,|Q^1|\,\|u\|_{BMO_p},
\end{aligned} \tag{2.36}
\]
where for the last line we have used the fact that $|\phi_0(z)| \le \frac{C}{[z]^m}$ for $[z] \ge 1$. Of course the above series converges if we choose $m > n + 2$. From (2.34), (2.35) and (2.36), the result follows. □

Corollary 2.13 (A control of $\|u\|_{F^0_{\infty,2}}$). Let $u \in BMO_p(\mathbb{R}^{n+1}) \cap F^0_{\infty,1}(\mathbb{R}^{n+1})$, then $u \in F^0_{\infty,2}(\mathbb{R}^{n+1})$ and we have:
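\[
\|u\|_{F^0_{\infty,2}} \le C\,\|u\|_{BMO_p}^{1/2}\,\|u\|_{F^0_{\infty,1}}^{1/2}, \tag{2.37}
\]
where $C = C(n) > 0$ is a positive constant.

Proof. Using (2.23), we compute:
\[
\begin{aligned}
\|u\|_{F^0_{\infty,2}} &= \Bigl\|\Bigl(\sum_{j \ge 1} |\phi_j * u|^2\Bigr)^{1/2}\Bigr\|_{L^\infty} \\
&\le \Bigl(\sup_{j \ge 1} \|\phi_j * u\|_{L^\infty}\Bigr)^{1/2} \Bigl\|\sum_{j \ge 1} |\phi_j * u|\Bigr\|_{L^\infty}^{1/2} \\
&\le C\,\|u\|_{BMO_p}^{1/2}\,\|u\|_{F^0_{\infty,1}}^{1/2},
\end{aligned}
\]
which terminates the proof. □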
Remark 2.14. From [2], it seems that $BMO_p$ spaces can be characterized in terms of parabolic Lizorkin–Triebel spaces. In the case of elliptic spaces, this is a well-known result (see [6,16]) which allows one to simplify the proof of the Kozono–Taniuchi inequality.

We can now give the proof of our first main result (Theorem 1.1).

Proof of Theorem 1.1. Using (2.16) and (2.37), we obtain:
\[
\|u\|_{F^0_{\infty,1}} \le C\Bigl(1 + \|u\|_{BMO_p}^{1/2}\,\|u\|_{F^0_{\infty,1}}^{1/2}\,\bigl(1 + \log^+ \|u\|_{W_2^{2m,m}}\bigr)^{1/2}\Bigr). \tag{2.38}
\]
Notice that the constant $C$ can always be chosen such that $C \ge 1$. If $\|u\|_{F^0_{\infty,1}} \le 1$, we evidently have:
\[
\|u\|_{F^0_{\infty,1}} \le C \le C\Bigl(1 + \|u\|_{BMO_p}\bigl(1 + \log^+ \|u\|_{W_2^{2m,m}}\bigr)\Bigr). \tag{2.39}
\]
If $\|u\|_{F^0_{\infty,1}} > 1$, then, dividing (2.38) by $\|u\|_{F^0_{\infty,1}}^{1/2}$, we can easily deduce inequality (2.39). Using the fact that
\[
\|u\|_{L^\infty} \le C\,\Bigl\|\sum_{j \ge 0} \phi_j * u\Bigr\|_{L^\infty} \le C\bigl(\|\phi_0 * u\|_{L^\infty} + \|u\|_{F^0_{\infty,1}}\bigr),
\]
and using inequalities (2.31) and (2.39), we directly obtain the result. □
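The step "dividing (2.38)" in the proof above can be written out as follows. This is a sketch with shorthand that is not in the source; it only uses $C \ge 1$, $X > 1$ and the elementary inequality $(a + b)^2 \le 2(a^2 + b^2)$.

```latex
% Shorthand (not in the source):
%   X = \|u\|_{F^0_{\infty,1}}, \quad B = \|u\|_{BMO_p}, \quad L = 1 + \log^+\|u\|_{W_2^{2m,m}}.
\begin{align*}
X &\le C\bigl(1 + B^{1/2} X^{1/2} L^{1/2}\bigr), \qquad X > 1, \\
X^{1/2} &\le C\bigl(X^{-1/2} + (BL)^{1/2}\bigr) \le C\bigl(1 + (BL)^{1/2}\bigr), \\
X &\le C^2\bigl(1 + (BL)^{1/2}\bigr)^2 \le 2C^2\,(1 + BL).
\end{align*}
```

In other words, (2.39) holds in this case with $C$ replaced by $2C^2$.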
3. A parabolic Kozono–Taniuchi inequality on a bounded domain

The goal of this section is to present, on the one hand, the proof of Theorem 1.2. On the other hand, at the end of this section, we give an application where we show how to use inequality (1.3) in order to maintain the long-time existence of solutions to some parabolic equations. Let us indicate that throughout this section, the positive constant $C = C(T) > 0$ may vary from line to line.

3.1. Proof of Theorem 1.2

In order to simplify the arguments of the proof, we first show Theorem 1.2 in the special case when $n = m = 1$. Then we give the principal ideas how to prove the result in the general case. Call
\[
I = (0, 1) \quad \text{and} \quad \Omega_T = I \times (0, T),
\]
we first show the following proposition:

Proposition 3.1 (Theorem 1.2, case: $n = m = 1$). Let $u \in W_2^{2,1}(\Omega_T)$. Then there exists a constant $C = C(T) > 0$ such that:
\[
\|u\|_{L^\infty(\Omega_T)} \le C\Bigl(1 + \|u\|_{BMO_p(\Omega_T)}\bigl(1 + \log^+ \|u\|_{W_2^{2,1}(\Omega_T)}\bigr)\Bigr). \tag{3.1}
\]
As a similar inequality of (3.1) is already shown on $\mathbb{R}^2$ (see inequality (1.2)), the idea of the proof of (3.1) lies in using (1.2) for a special extension of the function $u \in W_2^{2,1}(\Omega_T)$ to the entire space $\mathbb{R}^2$. For this reason, we demand that the extended function stays in $W_2^{2,1}(\mathbb{R}^2)$ which is done via the following arguments. Remark first that the function $u$ can be extended by continuity to the boundary $\partial\Omega_T$ of $\Omega_T$. Take $\tilde u$ as the function defined over $\widetilde\Omega_T = (-1, 2) \times (-T, 2T)$ as follows:
\[
\tilde u(x, t) =
\begin{cases}
-3u(-x, t) + 4u(-\tfrac{x}{2}, t) & \text{for } -1 < x < 0,\ 0 \le t \le T, \\
-3u(2 - x, t) + 4u(\tfrac{3-x}{2}, t) & \text{for } 1 < x < 2,\ 0 \le t \le T,
\end{cases} \tag{3.2}
\]
and
\[
\tilde u(x, t) =
\begin{cases}
u(x, -t) & \text{for } -T < t \le 0, \\
u(x, 2T - t) & \text{for } T \le t < 2T.
\end{cases} \tag{3.3}
\]
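As a quick consistency check on the coefficients $-3$ and $4$ in (3.2), the values and first $x$-derivatives of the extension match those of $u$ across $\{x = 0\}$ and $\{x = 1\}$. The computation below is only a sketch: it freezes $t$ and assumes $u$ is smooth enough up to the boundary for the one-sided limits to make sense.

```latex
\begin{align*}
\lim_{x \to 0^-} \tilde u(x,t) &= -3u(0,t) + 4u(0,t) = u(0,t), \\
\lim_{x \to 0^-} \partial_x \tilde u(x,t)
  &= \bigl[\,3\,u_x(-x,t) - 2\,u_x(-\tfrac{x}{2},t)\,\bigr]_{x=0} = u_x(0,t), \\[4pt]
\lim_{x \to 1^+} \tilde u(x,t) &= -3u(1,t) + 4u(1,t) = u(1,t), \\
\lim_{x \to 1^+} \partial_x \tilde u(x,t)
  &= \bigl[\,3\,u_x(2-x,t) - 2\,u_x(\tfrac{3-x}{2},t)\,\bigr]_{x=1} = u_x(1,t).
\end{align*}
```

Both matching conditions reduce to $-3 + 4 = 1$ and $3 - 2 = 1$, which is what a first-order reflection of this type requires.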
A direct consequence of this extension is the following lemma.

Lemma 3.2 ($L^1$ estimate of $\tilde u$). Let $\tilde u$ be the function defined by (3.2) and (3.3). Then there exists a constant $C = C(T) > 0$ such that:
\[
\|\tilde u\|_{L^1(\widetilde\Omega_T)} \le C\|u\|_{L^1(\Omega_T)}. \tag{3.4}
\]

Proof. The proof of this lemma is direct by the extension. □
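Another important consequence of the extension (3.2) and (3.3) is the fact that $\tilde u \in W_2^{2,1}(\widetilde\Omega_T)$, and that we have (see for instance [5])
\[
\|\tilde u\|_{W_2^{2,1}(\widetilde\Omega_T)} \le C\|u\|_{W_2^{2,1}(\Omega_T)}, \quad C = C(T) > 0. \tag{3.5}
\]
Let $Z_1 \subset Z_2$ be the two subsets of $\widetilde\Omega_T$ defined by:
\[
Z_1 = \bigl\{(x, t);\ -1/4 < x < 5/4 \text{ and } -T/4 < t < 5T/4\bigr\},
\]
and
\[
Z_2 = \bigl\{(x, t);\ -3/4 < x < 7/4 \text{ and } -3T/4 < t < 7T/4\bigr\}.
\]
Taking the cut-off function $\Psi \in C_0^\infty(\mathbb{R}^2)$, $0 \le \Psi \le 1$ satisfying:
\[
\Psi(x, t) =
\begin{cases}
1 & \text{for } (x, t) \in Z_1, \\
0 & \text{for } (x, t) \in \mathbb{R}^2 \setminus Z_2,
\end{cases} \tag{3.6}
\]
we can easily deduce from (3.5) that $\Psi\tilde u \in W_2^{2,1}(\mathbb{R}^2)$, and
\[
\|\Psi\tilde u\|_{W_2^{2,1}(\mathbb{R}^2)} \le C\|u\|_{W_2^{2,1}(\Omega_T)}. \tag{3.7}
\]
Since $\Psi\tilde u \in W_2^{2,1}(\mathbb{R}^2)$, we can apply inequality (1.2) to the function $\Psi\tilde u$, and, having (3.7) in hands, the proof of Proposition 3.1 directly follows if we can show that
\[
\|\Psi\tilde u\|_{BMO_p(\mathbb{R}^2)} \le C\|u\|_{BMO_p(\Omega_T)}, \tag{3.8}
\]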
and this will be done in the forthcoming arguments.

3.1.1. Proof of Proposition 3.1

In all that follows, it will be useful to deal with an equivalent norm of the $BMO_p$ space. This norm is given by the following lemma.

Lemma 3.3 (Equivalent $BMO_p$ norms). Let $u \in BMO_p(O)$, where $O \subseteq \mathbb{R}^{n+1}$ is an open set. The parabolic $BMO_p$ norm of $u$ given by (2.2) is equivalent to the following norm for which we keep the same notation:
\[
\|u\|_{BMO_p(O)} = \sup_{Q \subset O}\ \inf_{c \in \mathbb{R}} \frac{1}{|Q|} \int_Q |u - c|, \quad Q \text{ given by (2.3)}. \tag{3.9}
\]

Proof. The proof of this lemma is direct. It suffices to see that for any $c \in \mathbb{R}$, we have:
\[
|u - u_Q| \le |u - c| + |c - u_Q| \le |u - c| + \frac{1}{|Q|} \int_Q |u - c|,
\]
which immediately gives:
\[
\int_Q |u - u_Q| \le 2 \int_Q |u - c|,
\]
hence
\[
\frac{1}{2|Q|} \int_Q |u - u_Q| \le \inf_{c \in \mathbb{R}} \frac{1}{|Q|} \int_Q |u - c| \le \frac{1}{|Q|} \int_Q |u - u_Q|, \tag{3.10}
\]
and the equivalence of the two norms follows. □
From now on, and for the sake of simplicity, we will denote:
\[
\fint_Q u = \frac{1}{|Q|} \int_Q u.
\]
The following lemma gives an estimate of $\inf_{c \in \mathbb{R}} \fint_Q |u - c|$ on small parabolic cubes.

Lemma 3.4. Let $f \in L^1_{loc}(\mathbb{R}^2)$. Take $Q_r \subseteq Q_{2r}$ two parabolic cubes of $\mathbb{R}^2$. We do not require that the cubes have the same center. Then we have:
\[
\inf_{c \in \mathbb{R}} \fint_{Q_r} |f - c| \le 8 \inf_{c \in \mathbb{R}} \fint_{Q_{2r}} |f - c|. \tag{3.11}
\]

Proof. For $c \in \mathbb{R}$, we compute:
\[
\fint_{Q_r} |f - c| \le \frac{|Q_{2r}|}{|Q_r|} \fint_{Q_{2r}} |f - c| \le 8 \fint_{Q_{2r}} |f - c|.
\]
Taking the infimum of both sides we arrive at the result. □
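The factor 8 in Lemma 3.4 can be traced back to the volume ratio of the two cubes. The sketch below assumes (since definition (2.3) is not reproduced in this part of the paper) that a parabolic cube of radius $r$ in $\mathbb{R}^2$ has the form $Q_r = (x_0 - r, x_0 + r) \times (t_0 - r^2, t_0 + r^2)$, which is consistent with the definition (3.12) of $r_0$ below.

```latex
\[
|Q_r| = (2r)(2r^2) = 4r^3, \qquad
\frac{|Q_{2r}|}{|Q_r|} = \frac{4(2r)^3}{4r^3} = 2^3 = 8,
\]
\[
\fint_{Q_r} |f - c| = \frac{1}{|Q_r|} \int_{Q_r} |f - c|
\le \frac{1}{|Q_r|} \int_{Q_{2r}} |f - c|
= \frac{|Q_{2r}|}{|Q_r|} \fint_{Q_{2r}} |f - c|.
\]
```

Under the same assumption, the corresponding ratio in $\mathbb{R}^{n+1}$ would be $2^{n+2}$: a factor $2^n$ from the space variables and $2^2$ from the time variable.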
The next lemma gives an estimate of $\inf_{c \in \mathbb{R}} \fint_{Q_r} |\tilde u - c|$ on small parabolic cubes in $\widehat\Omega_T = (-1, 2) \times (0, T)$. Define the term $r_0 > 0$ as the greatest positive real number such that there exists $Q_{r_0} \subseteq \Omega_T$, i.e.,
\[
r_0 = \sup\bigl\{r > 0;\ r \le 1/2 \text{ and } r^2 \le T/2\bigr\}. \tag{3.12}
\]
We show the following:
Fig. 1. Analysis on cubes intersecting {x = 0}.
Lemma 3.5 (Estimates on small parabolic cubes in $\widehat\Omega_T$). Let $\tilde u$ be the function defined by (3.2) and (3.3). Take any parabolic cube $Q_r$ satisfying:
\[
Q_r \subseteq \widehat\Omega_T \quad \text{with } r \le r_1 \text{ and } 2r_1 = r_0, \tag{3.13}
\]
where $r_0$ is given by (3.12). Then there exists a universal constant $C > 0$ such that:
\[
\inf_{c \in \mathbb{R}} \fint_{Q_r} |\tilde u - c| \le C\|u\|_{BMO_p(\Omega_T)}. \tag{3.14}
\]

Proof. Call $\Omega_T^d$ and $\Omega_T^g$ the right and the left neighbor sets of $\Omega_T$ defined respectively by:
\[
\Omega_T^d = (1, 2) \times (0, T) \quad \text{and} \quad \Omega_T^g = (-1, 0) \times (0, T).
\]
First let us mention that if the cube $Q_r$ lies in $\Omega_T$ then inequality (3.14) is evident (see the equivalent definition (3.9) of the parabolic $BMO_p$ norm). Two remaining cases are to be considered: either $Q_r$ intersects the set $\{x = 0\} \cup \{x = 1\}$, or $Q_r$ lies in $\Omega_T^d \cup \Omega_T^g$. Our assumption (3.13) on the radius of the parabolic cube makes it impossible that the cube $Q_r$ meets $\Omega_T^d$ and $\Omega_T^g$ at the same time. Therefore, and in order to make the proof simpler, we only consider the following cases: either $Q_r$ intersects the set $\{x = 0\}$, or $Q_r$ lies in $\Omega_T^g$. The proof is then divided into three main steps:

Step 1 ($Q_r$ intersects the line $\{x = 0\}$).

Step 1.1 (First estimate). Again the assumption (3.13) imposed on the radius $r$ makes it possible to embed $Q_r$ in a larger parabolic cube $Q_{2r} \subseteq \widehat\Omega_T$ of radius $2r$, which is symmetric with respect to the line $\{x = 0\}$ (see Fig. 1). Then the center of the cube $Q_{2r}$ should be also on the same line, but we do not require that the two cubes $Q_r$ and $Q_{2r}$ have centers with the same ordinate $t$. Now, using Lemma 3.4, we deduce that:
\[
\inf_{c \in \mathbb{R}} \fint_{Q_r} |\tilde u - c| \le 8 \inf_{c \in \mathbb{R}} \fint_{Q_{2r}} |\tilde u - c|, \tag{3.15}
\]
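and hence in order to conclude, we need to estimate the right-hand side of the above inequality with respect to $\|u\|_{BMO_p(\Omega_T)}$. Call $Q_{2r}^d$ and $Q_{2r}^g$ the right and the left sides of $Q_{2r}$ defined respectively by:
\[
Q_{2r}^d = Q_{2r} \cap \Omega_T \quad \text{and} \quad Q_{2r}^g = Q_{2r} \cap \Omega_T^g.
\]
Also call $Q_{2r}^{trans} \subseteq \Omega_T$, the translation of the cube $Q_{2r}$ by the vector $(2r, 0)$, i.e. $Q_{2r}^{trans} = (2r, 0) + Q_{2r}$. For $c \in \mathbb{R}$, we compute:
\[
\begin{aligned}
\int_{Q_{2r}} |\tilde u - c| &= \int_{Q_{2r}^g} |\tilde u - c| + \int_{Q_{2r}^d} |u - c| \\
&\le \int_{Q_{2r}^g} |\tilde u - c| + \int_{Q_{2r}^{trans}} |u - c|,
\end{aligned} \tag{3.16}
\]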
where we have used the fact that $\tilde u = u$ on $\Omega_T$, and that $Q_{2r}^d \subseteq Q_{2r}^{trans}$.

Step 1.2 (Estimate of $\int_{Q_{2r}^g} |\tilde u - c|$). We compute (using the definition (3.2) of the function $\tilde u$ on $\Omega_T^g$):
\[
\begin{aligned}
\int_{Q_{2r}^g} |\tilde u(x, t) - c|\,dx\,dt &= \int_{Q_{2r}^g} \bigl|-3u(-x, t) + 4u(-x/2, t) - c\bigr|\,dx\,dt \\
&\le 3 \int_{Q_{2r}^g} |u(-x, t) - c|\,dx\,dt + 4 \int_{Q_{2r}^g} |u(-x/2, t) - c|\,dx\,dt \\
&\le 3 \int_{Q_{2r}^d} |u(x, t) - c|\,dx\,dt + 8 \int_{\bar Q_{2r}^d} |u(x, t) - c|\,dx\,dt,
\end{aligned}
\]
where
\[
\bar Q_{2r}^d = \bigl\{(x/2, t);\ (x, t) \in Q_{2r}^d\bigr\} \subseteq Q_{2r}^d \subseteq Q_{2r}^{trans}. \tag{3.17}
\]
From (3.17) we easily deduce that:
\[
\int_{Q_{2r}^g} |\tilde u - c| \le 11 \int_{Q_{2r}^{trans}} |u - c|,
\]
and hence (using (3.16)), we finally get:
\[
\int_{Q_{2r}} |\tilde u - c| \le 12 \int_{Q_{2r}^{trans}} |u - c|. \tag{3.18}
\]
Since $|Q_{2r}| = |Q_{2r}^{trans}|$, inequality (3.18) gives
\[
\fint_{Q_{2r}} |\tilde u - c| \le 12 \fint_{Q_{2r}^{trans}} |u - c|.
\]
Since $Q_{2r}^{trans}$ is a parabolic cube in $\Omega_T$, taking the infimum over $c \in \mathbb{R}$ of the above inequality, we obtain:
\[
\inf_{c \in \mathbb{R}} \fint_{Q_{2r}} |\tilde u - c| \le 12\|u\|_{BMO_p(\Omega_T)}. \tag{3.19}
\]
From (3.15) and (3.19), we deduce (3.14).
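The constants 3, 8, 11 and 12 appearing in Step 1 come from the triangle inequality and two changes of variables. The following sketch spells this out with $t$ frozen; the variable $y$ and the side remarks are introduced here for illustration and are not in the source.

```latex
\begin{align*}
\bigl|-3u(-x,t) + 4u(-\tfrac{x}{2},t) - c\bigr|
  &= \bigl|-3\,(u(-x,t) - c) + 4\,(u(-\tfrac{x}{2},t) - c)\bigr|
  && (\text{since } -3 + 4 = 1) \\
  &\le 3\,|u(-x,t) - c| + 4\,|u(-\tfrac{x}{2},t) - c|, \\[4pt]
\int_{Q^g_{2r}} |u(-x,t) - c|\,dx\,dt
  &= \int_{Q^d_{2r}} |u(y,t) - c|\,dy\,dt
  && (y = -x,\ \text{Jacobian } 1), \\
4 \int_{Q^g_{2r}} |u(-\tfrac{x}{2},t) - c|\,dx\,dt
  &= 8 \int_{\bar Q^d_{2r}} |u(y,t) - c|\,dy\,dt
  && (y = -\tfrac{x}{2},\ \text{Jacobian } 2).
\end{align*}
```

Since both $Q^d_{2r}$ and $\bar Q^d_{2r}$ are contained in $Q^{trans}_{2r}$, the two terms add up to at most $3 + 8 = 11$ times $\int_{Q^{trans}_{2r}} |u - c|$, and the additional $1$ in $12 = 11 + 1$ is the contribution of $Q^d_{2r}$ in (3.16).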
Step 2 ($Q_r \subseteq \Omega_T^g$). Let $0 < a_0 < b_0 < 1$ and $0 < a_1 < b_1 < T$ be such that $Q_r = (-b_0, -a_0) \times (a_1, b_1)$. For any $c \in \mathbb{R}$, we compute:
\[
\begin{aligned}
\int_{Q_r} |\tilde u(x, t) - c|\,dx\,dt &= \int_{Q_r} \bigl|-3u(-x, t) + 4u(-x/2, t) - c\bigr|\,dx\,dt \\
&\le 3 \int_{Q_r^s} |u(x, t) - c|\,dx\,dt + 8 \int_{Q_r^{\bar s}} |u(x, t) - c|\,dx\,dt
\end{aligned} \tag{3.20}
\]
with (see Fig. 2),
\[
Q_r^s = (a_0, b_0) \times (a_1, b_1) \subseteq \Omega_T \quad \text{and} \quad Q_r^{\bar s} = \Bigl(\frac{a_0}{2}, \frac{b_0}{2}\Bigr) \times (a_1, b_1) \subseteq \Omega_T.
\]
We remark that $Q_r^s$ is a parabolic cube in $\Omega_T$, while $Q_r^{\bar s}$ is not (its aspect ratio is different). In fact $Q_r^{\bar s}$ could be embedded in a parabolic cube $Q_r^{\bar{\bar s}}$, $Q_r^{\bar s} \subseteq Q_r^{\bar{\bar s}} \subseteq \Omega_T$, where $Q_r^{\bar{\bar s}}$ is simply a space translation of $Q_r^s$. In particular we have:
\[
|Q_r| = |Q_r^s| = |Q_r^{\bar{\bar s}}|. \tag{3.21}
\]

Fig. 2. Analysis on cubes $Q_r \subseteq \Omega_T^g$.

The above arguments, together with (3.20) give:
\[
\int_{Q_r} |\tilde u - c| \le 3 \int_{Q_r^s} |u - c| + 8 \int_{Q_r^{\bar{\bar s}}} |u - c|. \tag{3.22}
\]
Taking the infimum in $c \in \mathbb{R}$ for both sides of inequality (3.22), leads to
\[
\inf_{c \in \mathbb{R}} \fint_{Q_r} |\tilde u - c| \le 11\|u\|_{BMO_p(\Omega_T)}, \tag{3.23}
\]
which implies (3.14).

Step 3 (Conclusion). As it was already mentioned at the beginning of the proof, the case where the parabolic cube $Q_r$ meets the line $\{x = 1\}$ or lies completely in $\Omega_T^d$, could be treated using identical arguments. Therefore, for all small parabolic cubes $Q_r$ satisfying (3.13), inequality (3.14) is always valid, and this terminates the proof of Lemma 3.5. □

A generalization of Lemma 3.5 is now given.

Lemma 3.6 (Estimates on small parabolic cubes in $\widetilde\Omega_T$). Let $\tilde u$ be the function defined by (3.2) and (3.3). Take any parabolic cube $Q_r \subseteq \widetilde\Omega_T$ satisfying:
\[
r \le r_2 \quad \text{with } r_2\sqrt{2} = r_1, \tag{3.24}
\]
where $r_1$ is given by (3.13). Then there exists a universal constant $C > 0$ such that:
\[
\inf_{c \in \mathbb{R}} \fint_{Q_r} |\tilde u - c| \le C\|u\|_{BMO_p(\Omega_T)}. \tag{3.25}
\]

Fig. 3. $Q_r \cap \{t = T\} \ne \emptyset$.
Sketch of the proof. The arguments leading to the proof of this lemma are already contained in the proof of Lemma 3.5. First notice that if $Q_r \subseteq \widehat\Omega_T = (-1, 2) \times (0, T)$, we enter directly (since $r \le r_2 \le r_1$) the framework of Lemma 3.5, and hence (3.25) is direct. Because $r \le r_1$, remark that there exists a cube $Q'_r$ obtained by a time translation of $Q_r$ such that $Q'_r \subseteq \widehat\Omega_T$. Therefore it is impossible that $Q_r$ meets at the same time $(-1, 2) \times (T, 2T)$ and $(-1, 2) \times (-T, 0)$. For this reason, we either consider parabolic cubes intersecting $\{t = T\}$ (see Fig. 3), or parabolic cubes in $(-1, 2) \times (T, 2T)$ (see Fig. 4).

Case $Q_r \cap \{t = T\} \ne \emptyset$. In this case, we first embed $Q_r$ in a larger parabolic cube $Q_{r\sqrt{2}}$ which is symmetric with respect to the line $\{t = T\}$, so the center of this cube lies in $\{t = T\}$. We now repeat the same arguments as in Step 1 of Lemma 3.5, using in particular the symmetry (3.3) of the function $\tilde u$ with respect to $\{t = T\}$, and the fact that we can consider the cube
\[
Q_{r\sqrt{2}}^{trans} = (0, -2r^2) + Q_{r\sqrt{2}}
\]
such that
\[
Q_{r\sqrt{2}}^{trans} \subseteq Q_{r_1} \subseteq \widehat\Omega_T,
\]
for some cube $Q_{r_1}$. Indeed, estimates on all such cubes $Q_{r\sqrt{2}}^{trans}$ are already controlled by (3.14).

Fig. 4. $Q_r \cap \{t = T\} = \emptyset$.

Case $Q_r \cap \{t = T\} = \emptyset$. In this case we repeat the same arguments as in Step 2 of Lemma 3.5. Indeed, in the present case, it is even simpler since the function $\tilde u$ is symmetric with respect to $\{t = T\}$. □

We now show how to prove estimate (3.8).

Proof of estimate (3.8). The parabolic $BMO_p$ norm (3.9) of $\Psi\tilde u$ could be estimated taking the supremum of $\fint_{Q_r} |\Psi\tilde u - (\Psi\tilde u)_{Q_r}|$, $Q_r \subseteq \mathbb{R}^2$, over small parabolic cubes ($Q_r$ with $r \le r_2/2$), and big parabolic cubes ($Q_r$ with $r > r_2/2$). The proof is then divided into two steps.

Step 1 (Analysis on big parabolic cubes $Q_r$, $r > r_2/2$). We compute, using the fact that $\Psi = 0$ on $\mathbb{R}^2 \setminus Z_2$, and $\Psi \le 1$ on $\mathbb{R}^2$ (see (3.6)):
\[
\begin{aligned}
\fint_{Q_r} \bigl|\Psi\tilde u - (\Psi\tilde u)_{Q_r}\bigr| &\le 2 \fint_{Q_r} |\Psi\tilde u| \le \frac{2}{|Q_r|} \int_{Q_r \cap Z_2} |\tilde u| \\
&\le \frac{2^2}{r_2^3} \int_{Q_r \cap Z_2} |\tilde u| \le \frac{2^2}{r_2^3} \int_{\widetilde\Omega_T} |\tilde u| \le C\|u\|_{L^1(\Omega_T)}.
\end{aligned} \tag{3.26}
\]
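Step 2 (Analysis on small parabolic cubes $Q_r$, $r \le r_2/2$). From the definition (3.24) of $r_2$, and the construction (3.6) of the function $\Psi$, we deduce that if $Q_r$ intersects $Z_2$ then necessarily $Q_r \subseteq \widetilde\Omega_T$. If not, i.e. $Q_r \cap Z_2 = \emptyset$ then $\Psi = 0$ on $Q_r$, and therefore:
\[
\fint_{Q_r} \bigl|\Psi\tilde u - (\Psi\tilde u)_{Q_r}\bigr| = 0. \tag{3.27}
\]
Then we have only to consider $Q_r \subseteq \widetilde\Omega_T$.

Step 2.1 (First estimate). Using (3.10), we get
\[
\fint_{Q_r} \bigl|\Psi\tilde u - (\Psi\tilde u)_{Q_r}\bigr| \le 2 \inf_{c \in \mathbb{R}} \fint_{Q_r} |\Psi\tilde u - c| \le 2 \fint_{Q_r} |\Psi\tilde u - c_0 \Psi_{Q_r}|, \tag{3.28}
\]
for any fixed constant $c_0 \in \mathbb{R}$. Remark that we can write:
\[
\Psi\tilde u - c_0 \Psi_{Q_r} = (\Psi - \Psi_{Q_r})\tilde u + (\tilde u - c_0)\Psi_{Q_r}. \tag{3.29}
\]
Hence, we deduce that
\[
\begin{aligned}
\fint_{Q_r} \bigl|\Psi\tilde u - (\Psi\tilde u)_{Q_r}\bigr| &\le Cr \fint_{Q_r} |\tilde u| + 2 \inf_{c_0 \in \mathbb{R}} \fint_{Q_r} |\tilde u - c_0| \\
&\le Cr \fint_{Q_r} |\tilde u| + 2C\|u\|_{BMO_p(\Omega_T)},
\end{aligned} \tag{3.30}
\]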
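where for the first line we have used the fact that $\Psi \le 1$ and that $\Psi$ is Lipschitz, and for the second line we have used (3.25).

Step 2.2 (Estimate of $\fint_{Q_r} |\tilde u|$). We have
\[
\begin{aligned}
\fint_{Q_r} |\tilde u| &\le |\tilde u_{Q_r}| + \fint_{Q_r} |\tilde u - \tilde u_{Q_r}| \\
&\le |\tilde u_{Q_r}| + 2 \inf_{c \in \mathbb{R}} \fint_{Q_r} |\tilde u - c| \\
&\le |\tilde u_{Q_r}| + 2C\|u\|_{BMO_p(\Omega_T)},
\end{aligned} \tag{3.31}
\]
where for the second line, we have used (3.10), while for the third line, we have used (3.25). Remark that from the proof of Lemma 2.10 with $n = 1$, we have for $Q_{2^j r} \subseteq Q_{2^{j+1} r} \subseteq \widetilde\Omega_T$:
\[
\begin{aligned}
|\tilde u_{Q_{2^j r}} - \tilde u_{Q_{2^{j+1} r}}| &\le \fint_{Q_{2^j r}} |\tilde u - \tilde u_{Q_{2^j r}}| + 2^3 \fint_{Q_{2^{j+1} r}} |\tilde u - \tilde u_{Q_{2^{j+1} r}}| \\
&\le 2\bigl(1 + 2^3\bigr) \sup_{Q_\rho \subseteq \widetilde\Omega_T,\ \rho \le 2^{j+1} r}\ \inf_{c \in \mathbb{R}} \fint_{Q_\rho} |\tilde u - c| \\
&\le 2C\bigl(1 + 2^3\bigr)\|u\|_{BMO_p(\Omega_T)},
\end{aligned}
\]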
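where we have used (3.10) for the second line, and, for the third line, we have used (3.25) assuming $2^{j+1} r \le r_2$. Defining
\[
j_0 = \min\bigl\{j \in \mathbb{N};\ r_2/2 \le 2^j r < r_2\bigr\},
\]
and using a telescopic sequence, we can deduce that
\[
|\tilde u_{Q_r} - \tilde u_{Q_{2^{j_0} r}}| \le j_0\,2C\bigl(1 + 2^3\bigr)\|u\|_{BMO_p(\Omega_T)} \le C\bigl(1 + |\log r|\bigr)\|u\|_{BMO_p(\Omega_T)}. \tag{3.32}
\]
Moreover, we have
\[
|\tilde u_{Q_{2^{j_0} r}}| \le \frac{1}{|Q_{r_2/2}|} \int_{\widetilde\Omega_T} |\tilde u| \le C\|u\|_{L^1(\Omega_T)}, \tag{3.33}
\]
where we have used (3.4) for the second inequality. From (3.31), (3.32) and (3.33), we get:
\[
\fint_{Q_r} |\tilde u| \le C\Bigl(\|u\|_{L^1(\Omega_T)} + \bigl(1 + |\log r|\bigr)\|u\|_{BMO_p(\Omega_T)}\Bigr) \tag{3.34}
\]
for some constant $C > 0$.

Step 2.3 (Conclusion for $r \le r_2/2$). Finally, putting together (3.30) and (3.34), we deduce that
\[
\begin{aligned}
\fint_{Q_r} \bigl|\Psi\tilde u - (\Psi\tilde u)_{Q_r}\bigr| &\le C\Bigl(\bigl(r|\log r| + 1\bigr)\|u\|_{BMO_p(\Omega_T)} + \|u\|_{L^1(\Omega_T)}\Bigr) \\
&\le C\bigl(\|u\|_{BMO_p(\Omega_T)} + \|u\|_{L^1(\Omega_T)}\bigr),
\end{aligned} \tag{3.35}
\]
where in the second line, we have used that $r \in (0, 1)$, and that $r|\log r|$ is bounded.

Step 3 (General conclusion). Putting together (3.26), (3.27) and (3.35), we get (3.8). □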
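For completeness, the boundedness of $r|\log r|$ on $(0, 1)$ used in Step 2.3 can be checked directly. This is a standard calculus fact that the source does not spell out; the auxiliary function $g$ is introduced here only for the computation.

```latex
\[
g(r) = r|\log r| = -r\log r \quad (0 < r < 1), \qquad
g'(r) = -\log r - 1 = 0 \iff r = e^{-1},
\]
\[
\sup_{0 < r < 1} r|\log r| = g(e^{-1}) = e^{-1} < 1,
\qquad\text{so}\qquad r|\log r| + 1 \le 1 + e^{-1}.
\]
```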
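We are now ready to show the proof of Proposition 3.1.

Proof of Proposition 3.1. Applying estimate (1.2), with $m = n = 1$, to the function $\Psi\tilde u \in W_2^{2,1}(\mathbb{R}^2) \subseteq L^\infty(\mathbb{R}^2)$, we get:
\[
\|u\|_{L^\infty(\Omega_T)} = \|\Psi\tilde u\|_{L^\infty(\Omega_T)} \le \|\Psi\tilde u\|_{L^\infty(\mathbb{R}^2)} \le C\Bigl(1 + \|\Psi\tilde u\|_{BMO_p(\mathbb{R}^2)}\bigl(1 + \log^+ \|\Psi\tilde u\|_{W_2^{2,1}(\mathbb{R}^2)}\bigr)\Bigr).
\]
Here, we have also used the fact that $\Psi = 1$ over $\Omega_T$ (see (3.6)). Using (3.7), (3.8) and the above inequality, we directly get (3.1). □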
3.1.2. Ideas of the proof of Theorem 1.2

One of the main motivations for starting with the detailed proof of Proposition 3.1 (a simplified version of Theorem 1.2) is that it was used to show [8, Theorem 1.1]. The other motivation is that the arguments of the proof of Theorem 1.2 are all contained in the proof of Proposition 3.1. It suffices to make the following generalizations that we list below.

Extension of $\tilde u$. In order to extend the function $u \in W_2^{2m,m}(\Omega_T)$ to the function $\tilde u \in W_2^{2m,m}(\widetilde\Omega_T)$ with $\widetilde\Omega_T = (-1, 2)^n \times (-T, 2T)$, we first make the extension separately and successively with respect to the spatial variables $x_i$, with $i = 1, \ldots, n$. Then we make the extension with respect to the time variable that is treated somehow differently. Fix $(x_2, \ldots, x_n, t) \in (0, 1)^{n-1} \times (0, T)$, the spatial extension of $u$ in $x_1$ is as follows:
\[
\tilde u(x_1, \ldots) =
\begin{cases}
\displaystyle\sum_{j=0}^{2m-1} c_j\,u(-\lambda_j x_1, \ldots) & \text{for } -1 < x_1 < 0, \\[8pt]
\displaystyle\sum_{j=0}^{2m-1} c_j\,u(1 + \lambda_j(1 - x_1), \ldots) & \text{for } 1 < x_1 < 2,
\end{cases} \tag{3.36}
\]
with $\lambda_j = \frac{1}{2^j}$, and where we require that:
\[
\sum_{j=0}^{2m-1} c_j(-\lambda_j)^k = 1 \quad \text{for } k = 0, \ldots, 2m - 1.
\]
The above equalities can be regarded as a linear system whose associated matrix is of the Vandermonde type and hence invertible. This ensures the existence of the constants $c_j$, $j = 0, \ldots, 2m - 1$, and therefore the above extension (3.36) makes sense. After doing the extension with respect to $x_1$, the extension with respect to $x_2$ is done in the same way by varying $x_2$ and fixing all other variables. This is repeated successively until the $x_n$ variable. For the time variable, we also use the same extension (3.36). Indeed, in this case, we may only sum up to $m - 1$ in (3.36).

The cut-off function $\Psi$. For the definition of the cut-off function $\Psi$, we first define the two sets:
\[
Z_1 = \bigl\{(x_1, \ldots, x_n, t);\ \forall i = 1, \ldots, n,\ -1/4 < x_i < 5/4 \text{ and } -T/4 < t < 5T/4\bigr\}
\]
and
\[
Z_2 = \bigl\{(x_1, \ldots, x_n, t);\ \forall i = 1, \ldots, n,\ -3/4 < x_i < 7/4 \text{ and } -3T/4 < t < 7T/4\bigr\}.
\]
The function $\Psi$ is then defined as $\Psi \in C_0^\infty(\mathbb{R}^{n+1})$ with $0 \le \Psi \le 1$ and
\[
\Psi(x, t) =
\begin{cases}
1 & \text{for } (x, t) \in Z_1, \\
0 & \text{for } (x, t) \in \mathbb{R}^{n+1} \setminus Z_2.
\end{cases} \tag{3.37}
\]
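Returning to the linear system for the coefficients $c_j$ in (3.36): as a sanity check (a worked special case that is not taken from the source), take $m = 1$, so that $j \in \{0, 1\}$, $\lambda_0 = 1$, $\lambda_1 = 1/2$ and $k \in \{0, 1\}$.

```latex
\[
\begin{cases}
c_0 + c_1 = 1 & (k = 0), \\[2pt]
-c_0 - \tfrac{1}{2} c_1 = 1 & (k = 1),
\end{cases}
\qquad \Longrightarrow \qquad
\tfrac{1}{2} c_1 = 2, \quad c_1 = 4, \quad c_0 = -3.
\]
```

These are exactly the coefficients of (3.2): with $c_0 = -3$ and $c_1 = 4$, formula (3.36) reduces to $-3u(-x_1, \ldots) + 4u(-x_1/2, \ldots)$ for $-1 < x_1 < 0$ and to $-3u(2 - x_1, \ldots) + 4u((3 - x_1)/2, \ldots)$ for $1 < x_1 < 2$.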
Generalization of Lemma 3.6. An analogue estimate of (3.25) could be obtained for $(n + 1)$-dimensional parabolic cubes $Q_r \subseteq \widetilde\Omega_T = (-1, 2)^n \times (-T, 2T)$. It suffices to replace $r_2$ satisfying (3.24), by the radius
\[
r_{n+1} = \frac{r_n}{\sqrt{2}},
\]
where $r_n$ is defined recursively as follows: $r_{j+1} = r_j/2$ for $0 \le j \le n - 1$.

Using the above generalizations, the proof of Theorem 1.2 follows, line by line, the proof of Proposition 3.1.

3.2. Application of the parabolic Kozono–Taniuchi inequality

In this subsection, we show how to apply the parabolic Kozono–Taniuchi inequality in order to give some a priori estimates for the solution of certain parabolic equations. These a priori estimates provide a good control on the solution in order to avoid singularities at a finite time, and hence serve for the long-time existence. The application that will be given here deals with a model that can be considered as a toy model. Indeed, this is a simplification of the one treated in [8], where a rigorous proof of the long-time existence of solutions of a singular parabolic coupled system was presented (see [8, Theorem 1.1]). Consider, for $0 < a < 1$, the following parabolic equation:
\[
\begin{cases}
u_t(x, t) - u_{xx}(x, t) = \sin\bigl(u_x(x, t)\,u_x(x + a, t)\bigr) + \sin\bigl(\log u_x(x, t)\bigr) & \text{on } \mathbb{R} \times (0, \infty), \\
u(x + 1, t) = u(x, t) + 1 & \text{on } \mathbb{R} \times (0, \infty), \\
u_x(x, 0) \ge \delta_0 > 0 & \text{on } \mathbb{R},
\end{cases} \tag{3.38}
\]
the following proposition can be established:

Proposition 3.7 (Gradient estimate). Let $v = u_x$ and $m(t) = \min_{x \in \mathbb{R}} v(x, t)$. If $u \in C^\infty(\mathbb{R} \times [0, \infty))$ is a smooth solution of (3.38), then, for some constant $C = C(t) > 0$ we have:
\[
m_t \ge -Cm\bigl(|\log m| + 1\bigr), \quad \forall t \ge 0. \tag{3.39}
\]
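To illustrate why a bound of the form (3.39) prevents $m$ from reaching zero in finite time, here is a formal sketch under simplifying assumptions that are not in the source: $C$ is replaced by a constant on a time interval $[t_0, T]$, $m$ is treated as differentiable, and $0 < m \le 1$ on that interval, so that $|\log m| = -\log m$. Writing $\ell = \log m$:

```latex
\begin{align*}
\ell_t = \frac{m_t}{m} &\ge -C\,(1 - \ell), \\
\frac{d}{dt}\Bigl[(1 - \ell)\,e^{-Ct}\Bigr]
  = \bigl(-\ell_t - C(1 - \ell)\bigr)\,e^{-Ct} &\le 0, \\
1 - \ell(t) &\le \bigl(1 - \ell(t_0)\bigr)\,e^{C(t - t_0)}, \\
m(t) &\ge \exp\Bigl(1 - \bigl(1 - \log m(t_0)\bigr)\,e^{C(t - t_0)}\Bigr) > 0.
\end{align*}
```

The lower bound may decay doubly exponentially in time, but it stays strictly positive for every finite $t$.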
Remark 3.8. Inequality (3.39) directly implies that for every $t \ge 0$ we have $m(t) > 0$. This is important to avoid the logarithmic singularity in (3.38) when $v = u_x = 0$.

Remark 3.9. The proof of the above proposition goes along the same lines as the proof of [8, Theorem 1.1]. For this reason we only present a heuristic proof explaining only the basic ideas. The interested reader could see the full details in [8].

Ideas of the proof of Proposition 3.7. Heuristically, the proof is divided into the following four steps. In what follows all the constants can depend on the time $t$, but are bounded for any finite $t$.

Step 1 (First estimate from below on the gradient). Writing down the equation satisfied by $v$:
\[
\begin{cases}
\begin{aligned}
v_t(x, t) - v_{xx}(x, t) ={}& \cos\bigl(v(x, t)\,v(x + a, t)\bigr)\bigl(v_x(x, t)\,v(x + a, t) + v(x, t)\,v_x(x + a, t)\bigr) \\
&+ \cos\bigl(\log v(x, t)\bigr)\,\frac{v_x(x, t)}{v(x, t)}
\end{aligned} & \text{on } \mathbb{R} \times (0, \infty), \\
v(x + 1, t) = v(x, t) & \text{on } \mathbb{R} \times (0, \infty), \\
v(x, 0) \ge \delta_0 > 0 & \text{on } \mathbb{R},
\end{cases} \tag{3.40}
\]
we can show that for every $t \ge 0$:
\[
m_t \ge -mG \quad \text{with } G(t) = \max_{x \in \mathbb{R}} |v_x(x, t)|. \tag{3.41}
\]
Step 2 (Estimate of $\|v_x\|_{BMO_p}$). Using the fact that $u(x + 1, t) = u(x, t) + 1$, and that the right-hand term of the first equation of (3.38) is bounded, we apply the BMO theory for parabolic equations to (3.38) and hence we obtain, for some positive constant $c_1 > 0$:
\[
\|v_x\|_{BMO_p((0,1) \times (0,t))} \le c_1 \quad \text{for any } t > 0.
\]
Moreover, the $L^p$ theory for parabolic equations applied to (3.38) gives, for some positive constant $c_2 > 0$:
\[
\|v_x\|_{L^1((0,1) \times (0,t))} \le c_2 \quad \text{for any } t > 0.
\]
Finally, the above two inequalities give:
\[
\|v_x\|_{BMO_p((0,1) \times (0,t))} \le c_1 + c_2 \quad \text{for any } t > 0. \tag{3.42}
\]
Step 3 (Estimate of $\|v_x\|_{W_2^{2,1}}$). Let $w = v_x$, we write down the equation satisfied by $w$:
\[
\begin{cases}
\begin{aligned}
w_t(x, t) - w_{xx}(x, t) ={}& -\sin\bigl(v(x, t)\,v(x + a, t)\bigr)\bigl(v(x + a, t)\,v_x(x, t) + v(x, t)\,v_x(x + a, t)\bigr)^2 \\
&+ \cos\bigl(v(x, t)\,v(x + a, t)\bigr)\bigl(v(x + a, t)\,v_{xx}(x, t) + 2v_x(x, t)\,v_x(x + a, t) + v(x, t)\,v_{xx}(x + a, t)\bigr) \\
&- \sin\bigl(\log v(x, t)\bigr)\,\frac{v_x^2(x, t)}{v^2(x, t)} \\
&+ \cos\bigl(\log v(x, t)\bigr)\Bigl(\frac{v_{xx}(x, t)}{v(x, t)} - \frac{v_x^2(x, t)}{v^2(x, t)}\Bigr)
\end{aligned} & \text{on } \mathbb{R} \times (0, \infty), \\
w(x + 1, t) = w(x, t) & \text{on } \mathbb{R} \times (0, \infty), \\
w(x, 0) = v_x(x, 0) & \text{on } \mathbb{R}.
\end{cases} \tag{3.43}
\]
Using the $L^p$ theory for parabolic equations (with various values of $p$) to (3.38), (3.40) and (3.43), we deduce, for some other positive constant $c > 0$, that:
\[
\|v_x\|_{W_2^{2,1}((0,1) \times (0,t))} \le \frac{c}{m^2(t)} \quad \text{for any } t > 0. \tag{3.44}
\]
Step 4 (Conclusion). Applying the parabolic Kozono–Taniuchi inequality (3.1) to the function $v_x$, using in particular (3.42) and (3.44), we deduce that:
\[
G \le C\bigl(1 + |\log m|\bigr),
\]
which, together with (3.41), directly gives the result. □
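One way to see how (3.1), (3.42) and (3.44) combine in Step 4 is sketched below. It assumes that $G(t) \le \|v_x\|_{L^\infty((0,1) \times (0,t))}$ (using the periodicity of $v_x$ in $x$), and the constant $C'$ is introduced here for illustration; neither is stated in this form in the source.

```latex
\begin{align*}
\log^+ \|v_x\|_{W_2^{2,1}} \le \log^+ \frac{c}{m^2} &\le \log^+ c + 2\,|\log m|, \\
G \le \|v_x\|_{L^\infty}
  &\le C\Bigl(1 + (c_1 + c_2)\bigl(1 + \log^+ c + 2|\log m|\bigr)\Bigr)
  \le C'\bigl(1 + |\log m|\bigr), \\
m_t \ge -mG &\ge -C'\,m\,\bigl(|\log m| + 1\bigr).
\end{align*}
```

The last line is (3.39) with the constant $C'$, which depends on $c$, $c_1$, $c_2$ and the constant of (3.1).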
Acknowledgments

This work was supported by the contract ANR MICA (2006–2009). The authors would like to thank M. Jazar for his encouragement during the preparation of this work. The first author would like to thank B. Kojok for some discussions.

References

[1] J.T. Beale, T. Kato, A. Majda, Remarks on the breakdown of smooth solutions for the 3-D Euler equations, Comm. Math. Phys. 94 (1984) 61–66.
[2] M. Bownik, Anisotropic Triebel–Lizorkin spaces with doubling measures, J. Geom. Anal. 17 (2007) 387–424.
[3] H. Brézis, T. Gallouët, Nonlinear Schrödinger evolution equations, Nonlinear Anal. 4 (1980) 677–681.
[4] H. Brézis, S. Wainger, A note on limiting cases of Sobolev embeddings and convolution inequalities, Comm. Partial Differential Equations 5 (1980) 773–789.
[5] L.C. Evans, Partial Differential Equations, Grad. Stud. Math., vol. 19, American Mathematical Society, Providence, RI, 1998.
[6] M. Frazier, B. Jawerth, A discrete transform and decompositions of distribution spaces, J. Funct. Anal. 93 (1990) 34–170.
[7] N. Hayashi, W. von Wahl, On the global strong solutions of coupled Klein–Gordon–Schrödinger equations, J. Math. Soc. Japan 39 (1987) 489–497.
[8] H. Ibrahim, M. Jazar, R. Monneau, Dynamics of dislocation densities in a bounded channel. Part I: Smooth solutions to a singular coupled parabolic system, preprint hal-00281487.
[9] H. Ibrahim, M. Jazar, R. Monneau, Global existence of solutions to a singular parabolic/Hamilton–Jacobi coupled system with Dirichlet conditions, C. R. Math. Acad. Sci. Paris, Ser. I 346 (2008) 945–950.
[10] J. Johnsen, W. Sickel, A direct proof of Sobolev embeddings for quasi-homogeneous Lizorkin–Triebel spaces with mixed norms, J. Funct. Spaces Appl. 5 (2007) 183–198.
[11] H. Kozono, T. Ogawa, Y. Taniuchi, Navier–Stokes equations in the Besov space near $L^\infty$ and BMO, Kyushu J. Math. 57 (2003) 303–324.
[12] H. Kozono, Y. Taniuchi, Limiting case of the Sobolev inequality in BMO, with application to the Euler equations, Comm. Math. Phys. 214 (2000) 191–200.
[13] O.A. Ladyženskaja, V.A. Solonnikov, N.N. Ural'ceva, Linear and Quasilinear Equations of Parabolic Type, Translated from the Russian by S. Smith, Transl. Math. Monogr., vol. 23, American Mathematical Society, Providence, RI, 1967.
[14] T. Ogawa, Sharp Sobolev inequality of logarithmic type and the limiting regularity condition to the harmonic heat flow, SIAM J. Math. Anal. 34 (2003) 1318–1330 (electronic).
[15] B. Stöckert, Remarks on the interpolation of anisotropic spaces of Besov–Hardy–Sobolev type, Czechoslovak Math. J. 32 (107) (1982) 233–244.
[16] H. Triebel, Theory of Function Spaces. II, Monogr. Math., vol. 84, Birkhäuser-Verlag, Basel, 1992.
Journal of Functional Analysis 257 (2009) 931–947 www.elsevier.com/locate/jfa
Strong peak points and strongly norm attaining points with applications to denseness and polynomial numerical indices ✩

Jaegil Kim a,∗, Han Ju Lee b,∗
a Department of Mathematics, Kent State University, Kent, OH 44240, USA
b Dongguk University, Department of Mathematics Education, 26, Pil-dong 3-ga, Chung-gu, Seoul, 100-715, Republic of Korea

Received 11 July 2008; accepted 20 November 2008
Available online 17 December 2008
Communicated by K. Ball

Abstract

Using the variational method, it is shown that the set of all strong peak functions in a closed algebra $A$ of $C_b(K)$ is dense if and only if the set of all strong peak points is a norming subset of $A$. As a corollary we can deduce the denseness of strong peak functions on certain other spaces. In case that a set of uniformly strongly exposed points of a Banach space $X$ is a norming subset of $\mathcal{P}({}^n X)$, then the set of all strongly norm attaining elements in $\mathcal{P}({}^n X)$ is dense. In particular, the set of all points at which the norm of $\mathcal{P}({}^n X)$ is Fréchet differentiable is a dense $G_\delta$ subset. In the last part, using Reisner's graph-theoretic approach, we construct some strongly norm attaining polynomials on a CL-space with an absolute norm. Then we show that for a finite dimensional complex Banach space $X$ with an absolute norm, its polynomial numerical indices are one if and only if $X$ is isometric to $\ell_\infty^n$. Moreover, we give a characterization of the set of all complex extreme points of the unit ball of a CL-space with an absolute norm.
© 2008 Elsevier Inc. All rights reserved.

Keywords: Peak points; Peak functions; Polynomial numerical index
✩
This work was supported by the Korea Research Foundation Grant funded by the Korean Government (KRF-2007412-J02301). * Corresponding authors. E-mail addresses: [email protected] (J. Kim), [email protected] (H.J. Lee). 0022-1236/$ – see front matter © 2008 Elsevier Inc. All rights reserved. doi:10.1016/j.jfa.2008.11.024
J. Kim, H.J. Lee / Journal of Functional Analysis 257 (2009) 931–947
1. Introduction and preliminaries Let K be a complete metric space and X a (real or complex) Banach space. We denote by Cb (K : X) the Banach space of all bounded continuous functions from K to X with the supremum norm. A nonzero function f ∈ Cb (K : X) is said to be a strong peak function at t ∈ K if every sequence {tn } in K with limn f (tn ) = f converges to t. Given a subspace A of Cb (K : X), a point t ∈ K called a strong peak point for A if there is a strong peak function f in A with f = f (t). We denote by ρA the set of all strong peak points for A. Let BX (resp. SX ) be the unit ball (resp. sphere) of a Banach space X. A nonzero function f ∈ Cb (BX : Y ) is said to strongly attain its norm at x if for every sequence {xn } in BX with limn f (xn ) = f , there exist a scalar λ with |λ| = 1 and a subsequence of {xn } which converges to λx. Given a subspace A of Cb (BX : Y ), x ∈ BX is called a strongly norm-attaining point of A if there exists a nonzero function f in A which strongly attains its norm at x. Denote by ρA ˜ the set of all strongly norm-attaining points of A. For complex Banach spaces X and Y , we may use the following two subspaces of Cb (BX : Y ): Ab (BX : Y ) = f ∈ Cb (BX : Y ): f is holomorphic on the interior of BX , Au (BX : Y ) = f ∈ Ab (BX : Y ): f is uniformly continuous on BX . We shall denote by A(BX : Y ) either Ab (BX : Y ) or Au (BX : Y ). In case that Y is the complex filed C, we write A(BX ), Au (BX ) and Ab (BX ) instead of A(BX : C), Au (BX : C) and Ab (BX : C) respectively. If X and Y are Banach spaces, an k-homogeneous polynomial P from X to Y is a mapping such that there is a k-linear continuous mapping L from X × · · · × X to Y such that P (x) = L(x, . . . , x) for every x ∈ X. P(k X : Y ) denote the Banach space of all k-homogeneous polynomials from X to Y , endowed with the polynomial norm P = supx∈BX P (x). 
We also say that P : X → Y is a polynomial, and write P ∈ P(X : Y ) if P is a finite sum of homogeneous polynomials from X into Y . In particular, replace P(k X : Y ) by P(k X) and P(X : Y ) by P(X) when Y is a scalar field. We refer to [8] for background on polynomials. The (polynomial) numerical index of a Banach space is a constant relating to the concepts of the numerical radius of functions on X. Actually, for each f ∈ Cb (BX : X), the numerical radius v(f ) is defined by v(f ) = sup x ∗ f (x): x ∗ (x) = 1, x ∈ SX , x ∗ ∈ SX∗ , where X ∗ is the dual space of X. For every integer k 1, the k-polynomial numerical index of a Banach space X is the constant defined by n(k) (X) = inf v(P ): P = 1, P ∈ P k X : X . If k = 1, in particular, then it is called the numerical index of X and we write n(X). For more recent results about numerical index, see a survey paper [15] and references therein. Let us briefly see the contents of the paper. In Section 2, using the variational method in [7], we show that the set ρA is a norming subset of A if and only if the set of all strong peak functions
J. Kim, H.J. Lee / Journal of Functional Analysis 257 (2009) 931–947
in $A$ is a dense $G_\delta$ subset of $A$, when $A$ is a closed subspace of $C_b(K : X)$ which contains all elements of the form $t \mapsto (x^* f(t))^m x$ for all $x \in X$, $x^* \in X^*$, $f \in A$ and integers $m \ge 1$. Using this theorem and the variational method, we investigate the denseness of the set of strong peak holomorphic functions and the denseness of the set of numerical strong peak functions on certain Banach spaces. In Section 3, we also apply the variational method to investigate the denseness of the set of strongly norm attaining polynomials when the set of all uniformly strongly exposed points of a Banach space $X$ is a norming subset of $\mathcal{P}(^n X)$. As a direct corollary, the set of all points at which the norm of $\mathcal{P}(^n X)$ is Fréchet differentiable is a dense $G_\delta$ subset if the set of all uniformly strongly exposed points of a Banach space $X$ is a norming subset of $\mathcal{P}(^n X)$.

In the last part, we use graph theory to find some strongly norm-attaining points and complex extreme points. Reisner gave a one-to-one correspondence between certain $n$-dimensional Banach spaces and certain graphs with $n$ vertices. In detail, he gives a characterization of all finite dimensional real CL-spaces with an absolute norm in graph-theoretic terms. It gives a geometric picture of the extreme points of the unit ball of CL-spaces and plays an important role in finding the strongly norm-attaining points of $\mathcal{P}(^k X)$. Moreover, we can find all complex extreme points of a complex CL-space with an absolute norm. These strongly norm-attaining points and complex extreme points help to answer a problem about the numerical index of a Banach space. We give a partial answer to Problem 43 in [15]: Characterize the complex Banach spaces $X$ satisfying $n^{(k)}(X) = 1$ for all $k \ge 2$. We show that for a finite dimensional complex Banach space $X$ with an absolute norm, its polynomial numerical indices are one if and only if $X$ is isometric to $\ell_\infty^n$. For later use, recall the definitions of real and complex extreme points of a unit ball.
Let $X$ be a real or complex Banach space. Recall that $x \in B_X$ is said to be an extreme point of $B_X$ if whenever $y + z = 2x$ for some $y, z \in B_X$, we have $x = y = z$. Denote by $\mathrm{ext}(B_X)$ the set of all extreme points of $B_X$. When $X$ is a complex Banach space, an element $x \in B_X$ is said to be a complex extreme point of $B_X$ if $\sup_{0 \le \theta \le 2\pi} \|x + e^{i\theta} y\| \le 1$ for some $y \in X$ implies $y = 0$. The set of all complex extreme points of $B_X$ is denoted by $\mathrm{ext}_{\mathbb{C}}(B_X)$.

2. Denseness of the set of strong peak functions

Let $X$ be a Banach space and $A$ a closed subspace of $C_b(K : X)$. A subset $F$ of $K$ is said to be a norming subset for $A$ if for each $f \in A$, we have
\[ \|f\| = \sup\{ \|f(t)\| : t \in F \}. \]
Following the definition of Globevnik in [11], the smallest closed norming subset of $A$ is called the Shilov boundary for $A$, and it is shown in [6] that if the set of strong peak functions is dense in $A$ then the Shilov boundary of $A$ exists and it is the closure of $\rho A$. The variational method used in [7] gives a partial converse of the above mentioned result.

Theorem 2.1. Let $A$ be a closed subspace of $C_b(K : X)$ which contains all elements of the form $t \mapsto (x^* f(t))^m x$ for all $x \in X$, $x^* \in X^*$, $f \in A$ and integers $m \ge 1$. Then the set $\rho A$ is a norming subset of $A$ if and only if the set of all strong peak functions in $A$ is a dense $G_\delta$ subset of $A$.
Proof. Let $d$ be a complete metric on $K$. Fix $f \in A$ and $\varepsilon > 0$. For each $n \ge 1$, set
\[ U_n = \{ g \in A : \exists z \in \rho A \text{ with } \|(f - g)(z)\| > \sup\{ \|(f - g)(x)\| : d(x, z) > 1/n \} \}. \]
Then $U_n$ is open and dense in $A$. Indeed, fix $h \in A$. Since $\rho A$ is a norming subset of $A$, there is a point $w \in \rho A$ such that
\[ \|(f - h)(w)\| > \|f - h\| - \varepsilon/2. \tag{2.1} \]
Choose a strong peak function $q \in A$ at $w$ with $\|q(w)\| = 1$ and $w^* \in S_{X^*}$ such that $w^* q(w) = 1$. Then it is easy to see that $w^* \circ q$ is also a strong peak function in $C_b(K)$. So there is an integer $m \ge 1$ such that $|w^* q(x)|^m < 1/3$ for all $x \in K$ with $d(x, w) > 1/n$. Now define the function $p$ by
\[ p(t) = -\big(w^* q(t)\big)^m \cdot \frac{(f - h)(w)}{\|(f - h)(w)\|}, \quad \forall t \in K. \]
Set $g(x) = h(x) + \varepsilon \cdot p(x)$. Then $\|f(w) - h(w) - \varepsilon p(w)\| = \|f(w) - h(w)\| + \varepsilon$ and $\|g - h\| \le \varepsilon$. Eq. (2.1) shows that
\[ \|(f - g)(w)\| = \|f(w) - h(w) - \varepsilon p(w)\| = \|f(w) - h(w)\| + \varepsilon > \|f - h\| + \varepsilon/2 \]
\[ \ge \sup\{ \|(f - h)(x) - \varepsilon p(x)\| : d(x, w) > 1/n \} = \sup\{ \|(f - g)(x)\| : d(x, w) > 1/n \}. \]
Therefore $g \in U_n$. By the Baire category theorem there is a $g \in \bigcap_n U_n$ with $\|g\| < \varepsilon$, and we shall show that $f - g$ is a strong peak function. Indeed, $g \in \bigcap_n U_n$ implies that for each $n$ there is $z_n \in \rho A$ such that
\[ \|(f - g)(z_n)\| > \sup\{ \|(f - g)(x)\| : d(x, z_n) > 1/n \}. \]
Thus $d(z_p, z_n) \le 1/n$ for every $p > n$, and hence $\{z_n\}$ converges to a point $z$, say. Suppose that there is another sequence $\{x_k\}$ in $K$ such that $\{\|(f - g)(x_k)\|\}_k$ converges to $\|f - g\|$. Then for each $n \ge 1$, there is $M_n \ge 1$ such that for every $m \ge M_n$,
\[ \|(f - g)(x_m)\| > \sup\{ \|(f - g)(x)\| : d(x, z_n) > 1/n \}. \]
Then $d(x_m, z_n) \le 1/n$ for every $m \ge M_n$. Hence $\{x_m\}_m$ converges to $z$. This shows that $f - g$ is a strong peak function at $z$. By Proposition 2.19 in [14], the set of all strong peak functions in $A$ is a $G_\delta$ subset of $A$. This proves the necessity. Concerning the converse, it is shown in [6] that if the set of all strong peak functions is dense in $A$, then the set of all strong peak points is a norming subset of $A$. This completes the proof. □

Let $B_X$ be the unit ball of the Banach space $X$. Recall that the point $x \in B_X$ is said to be a smooth point if there is a unique $x^* \in B_{X^*}$ such that $\mathrm{Re}\, x^*(x) = 1$. We denote by $\mathrm{sm}(B_X)$ the set
of all smooth points of $B_X$. We say that a Banach space is smooth if $\mathrm{sm}(B_X)$ is the unit sphere $S_X$ of $X$. The following corollary shows that if $\rho A$ is a norming subset, then the set of smooth points of $B_A$ is dense in $S_A$.

Corollary 2.2. Suppose that $K$ is a complete metric space and $A$ is a closed subalgebra of $C_b(K)$. If $\rho A$ is a norming subset of $A$, then the set of all smooth points of $B_A$ contains a dense $G_\delta$ subset of $S_A$.

Proof. It is shown in [6] that if $f \in A$ is a strong peak function and $\|f\| = 1$, then $f$ is a smooth point of $B_A$. Then Theorem 2.1 completes the proof. □

Recall that a Banach space $X$ is said to be locally uniformly convex if whenever there is a sequence $\{x_n\}$ in $B_X$ with $\lim_n \|x_n + x\| = 2$ for some $x \in S_X$, we have $\lim_n \|x_n - x\| = 0$. It is shown in [6] that if $X$ is locally uniformly convex, then the set of norm attaining elements is dense in $A(B_X)$. That is, the set consisting of $f \in A(B_X)$ with $\|f\| = |f(x)|$ for some $x \in B_X$ is dense in $A(B_X)$. The following corollary gives a stronger result. Notice that if $X$ is locally uniformly convex, then every point of $S_X$ is a strong peak point for $A(B_X)$ [11]. Theorem 2.1 and Corollary 2.2 imply the following.

Corollary 2.3. Suppose that $X$ is a locally uniformly convex, complex Banach space. Then the set of all strong peak functions in $A(B_X)$ is a dense $G_\delta$ subset of $A(B_X)$. In particular, the set of all smooth points of $B_{A(B_X)}$ contains a dense $G_\delta$ subset of $S_{A(B_X)}$.

It is shown in [5] that the set of all strong peak points for $A(B_X)$ is dense in $S_X$ if $X$ is an order continuous locally uniformly c-convex sequence space. (For the definition see [5].) Then by Theorem 2.1, we get the denseness of the set of all strong peak functions.

Corollary 2.4. Let $X$ be an order continuous locally uniformly c-convex Banach sequence space. Then the set of all strong peak functions in $A_u(B_X : X)$ is a dense $G_\delta$ subset of $A_u(B_X : X)$.

Let $\Pi(X) = \{(x, x^*) \in B_X \times B_{X^*} : x^*(x) = 1\}$ be the topological subspace of $B_X \times B_{X^*}$, where $B_X$ (resp. $B_{X^*}$) is equipped with the norm (resp. weak-$*$ compact) topology. The numerical radius of holomorphic functions was studied in depth in [12], and the denseness of numerical radius attaining holomorphic functions is studied on the classical Banach spaces in [2]. Recently the numerical strong peak function was introduced in [14], and the denseness of holomorphic numerical strong peak functions in $A(B_X : X)$ is studied in various Banach spaces. The function $f \in A(B_X : X)$ is said to be a numerical strong peak function if there is $(x, x^*)$ such that $\lim_n |x_n^* f(x_n)| = v(f)$ for some $\{(x_n, x_n^*)\}_n$ in $\Pi(X)$ implies that $(x_n, x_n^*)$ converges to $(x, x^*)$ in $\Pi(X)$. The function $f \in A(B_X : X)$ is said to be numerical radius attaining if there is $(x, x^*)$ in $\Pi(X)$ such that $v(f) = |x^* f(x)|$. An element in the intersection of the set of all strong peak functions and the set of all numerical strong peak functions is called a norm and numerical strong peak function of $A(B_X : X)$. Using the variational method again we obtain the following.

Proposition 2.5. Suppose that the set $\Pi(X)$ is complete metrizable and the set $\Gamma = \{(x, x^*) \in \Pi(X) : x \in \rho A(B_X) \cap \mathrm{sm}(B_X)\}$ is a numerical boundary. That is, $v(f) = \sup_{(x, x^*) \in \Gamma} |x^* f(x)|$
for each $f \in A(B_X : X)$. Then the set of all numerical strong peak functions in $A(B_X : X)$ is a dense $G_\delta$ subset of $A(B_X : X)$.

Proof. Let $A = A(B_X : X)$ and let $d$ be a complete metric on $\Pi(X)$. In [14], it is shown that if $\Pi(X)$ is complete metrizable, then the set of all numerical strong peak functions in $A$ is a $G_\delta$ subset of $A$. We need to prove the denseness. Let $f \in A$ and $\varepsilon > 0$. Fix $n \ge 1$ and set
\[ U_n = \{ g \in A : \exists (z, z^*) \in \Gamma \text{ with } |z^*(f - g)(z)| > \sup\{ |x^*(f - g)(x)| : d((x, x^*), (z, z^*)) > 1/n \} \}. \]
Then $U_n$ is open and dense in $A$. Indeed, fix $h \in A$. Since $\Gamma$ is a numerical boundary for $A$, there is a point $(w, w^*) \in \Gamma$ such that
\[ |w^*(f - h)(w)| > v(f - h) - \varepsilon/2. \]
Notice that $d((x, x^*), (w, w^*)) > 1/n$ implies that there is $\delta_n > 0$ such that $\|x - w\| > \delta_n$. Choose a peak function $p \in A(B_X)$ such that $\|p\| = 1 = |p(w)|$ and $|p(x)| < 1/3$ for $\|x - w\| > \delta_n$ and
\[ |w^*(f - h)(w) - \varepsilon p(w)| = |w^* f(w) - w^* h(w)| + |\varepsilon p(w)| = |w^* f(w) - w^* h(w)| + \varepsilon. \]
Put $g(x) = h(x) + \varepsilon \cdot p(x) w$. Then
\[ |w^*(f - g)(w)| = |w^* f(w) - w^* h(w) - \varepsilon p(w)| = |w^* f(w) - w^* h(w)| + \varepsilon > v(f - h) + \varepsilon/2 \]
\[ \ge \sup\{ |x^*(f - h)(x) - \varepsilon p(x) x^*(w)| : d((x, x^*), (w, w^*)) > 1/n \} = \sup\{ |x^*(f - g)(x)| : d((x, x^*), (w, w^*)) > 1/n \}. \]
That is, $g \in U_n$. By the Baire category theorem there is a $g \in \bigcap_n U_n$ with $\|g\| < \varepsilon$, and we shall show that $f - g$ is a numerical strong peak function. Indeed, $g \in \bigcap_n U_n$ implies that for each $n$ there is $(z_n, z_n^*) \in \Gamma$ such that
\[ |z_n^*(f - g)(z_n)| > \sup\{ |x^*(f - g)(x)| : d((x, x^*), (z_n, z_n^*)) > 1/n \}. \]
Thus $d((z_p, z_p^*), (z_n, z_n^*)) \le 1/n$ for every $p > n$, and hence $\{(z_n, z_n^*)\}$ converges to a point $(z, z^*)$, say. Suppose that there is another sequence $\{(x_k, x_k^*)\}$ in $\Pi(X)$ such that $\{|x_k^*(f - g)(x_k)|\}$ converges to $v(f - g)$. Then for each $n \ge 1$, there is $M_n \ge 1$ such that for every $m \ge M_n$,
\[ |x_m^*(f - g)(x_m)| > \sup\{ |x^*(f - g)(x)| : d((x, x^*), (z_n, z_n^*)) > 1/n \}. \]
Then $d((x_m, x_m^*), (z_n, z_n^*)) \le 1/n$ for every $m \ge M_n$. Hence $\{(x_m, x_m^*)\}$ converges to $(z, z^*)$. This shows that $f - g$ is a numerical strong peak function at $(z, z^*)$. □

Corollary 2.6. Suppose that $X$ is separable, smooth and locally uniformly convex. Then the set of norm and numerical strong peak functions is a dense $G_\delta$ subset of $A_u(B_X : X)$.
Proof. It is shown in [14] that if $X$ is separable, then $\Pi(X)$ is complete metrizable. In view of [22, Theorem 2.5], $\Gamma$ is a numerical boundary for $A_u(B_X : X)$. Hence Proposition 2.5 shows that the set of all numerical strong peak functions is dense in $A_u(B_X : X)$. Finally, Corollary 2.3 implies that the set of all norm and numerical strong peak functions is a dense $G_\delta$ subset of $A_u(B_X : X)$. □

Corollary 2.7. Let $X$ be an order continuous locally uniformly c-convex, smooth Banach sequence space. Then the set of all norm and numerical strong peak functions in $A_u(B_X : X)$ is a dense $G_\delta$ subset of $A_u(B_X : X)$.

Proof. Notice that $X$ is separable since $X$ is order continuous. Hence the set of all smooth points of $B_X$ is dense in $S_X$ by the Mazur theorem, and $\Pi(X)$ is complete metrizable [14]. In view of [22, Theorem 2.5], $\Gamma$ is a numerical boundary for $A_u(B_X : X)$. Hence Proposition 2.5 shows that the set of all numerical strong peak functions is a dense $G_\delta$ subset of $A_u(B_X : X)$. Theorem 2.1 also shows that the set of all strong peak functions is a dense $G_\delta$ subset of $A_u(B_X : X)$. This completes the proof. □

3. Denseness of strongly norm attaining polynomials

Recall that the norm $\|\cdot\|$ of a Banach space is said to be Fréchet differentiable at $x \in X$ if
\[ \lim_{\delta \to 0} \sup_{\|y\| = 1} \frac{\|x + \delta y\| + \|x - \delta y\| - 2\|x\|}{\delta} = 0. \]
It is well known that the set of Fréchet differentiable points in a Banach space is a $G_\delta$ subset [3, Proposition 4.16]. Ferrera [9] shows that in a real Banach space $X$, the norm of $\mathcal{P}(^n X)$ is Fréchet differentiable at $Q$ if and only if $Q$ strongly attains its norm.

A set $\{x_\alpha\}$ of points on the unit sphere $S_X$ of $X$ is called uniformly strongly exposed (u.s.e.) if there are a function $\delta(\varepsilon)$ with $\delta(\varepsilon) > 0$ for every $\varepsilon > 0$, and a set $\{f_\alpha\}$ of elements of norm 1 in $X^*$ such that for every $\alpha$, $f_\alpha(x_\alpha) = 1$, and for any $x$, $\|x\| \le 1$ and $\mathrm{Re}\, f_\alpha(x) \ge 1 - \delta(\varepsilon)$ imply $\|x - x_\alpha\| \le \varepsilon$. Lindenstrauss [21, Proposition 1] showed that if $B_X$ is the closed convex hull of a set of u.s.e. points, then $X$ has property A, that is, for every Banach space $Y$, the set of norm-attaining elements is dense in $L(X, Y)$, the Banach space of all bounded operators from $X$ into $Y$. The following theorem gives a stronger result.

Theorem 3.1. Let $\mathbb{F}$ be the real or complex scalar field and $X$, $Y$ Banach spaces over $\mathbb{F}$. For $k \ge 1$, suppose that the set of u.s.e. points $\{x_\alpha\}$ in $S_X$ is a norming subset of $\mathcal{P}(^k X)$. Then the set of strongly norm attaining elements in $\mathcal{P}(^k X : Y)$ is dense. In particular, the set of all points at which the norm of $\mathcal{P}(^n X)$ is Fréchet differentiable is a dense $G_\delta$ subset.

Proof. Let $\{x_\alpha\}$ be a set of u.s.e. points and $\{x_\alpha^*\}$ the corresponding functionals which uniformly strongly expose $\{x_\alpha\}$. Let $A = \mathcal{P}(^k X : Y)$, $f \in A$ and $\varepsilon > 0$. Fix $n \ge 1$ and set
\[ U_n = \Big\{ g \in A : \exists z \in \rho A \text{ with } \|(f - g)(z)\| > \sup\big\{ \|(f - g)(x)\| : \inf_{|\lambda| = 1} \|x - \lambda z\| > 1/n \big\} \Big\}. \]
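Then $U_n$ is open and dense in $A$. Indeed, fix $h \in A$. Since $\{x_\alpha\}$ is a norming subset of $A$, there is a point $w \in \{x_\alpha\}$ such that
\[ \|(f - h)(w)\| > \|f - h\| - \varepsilon\,\delta(1/n). \]
Set
\[ g(x) = h(x) - \varepsilon \cdot p(x)^k \, \frac{f(w) - h(w)}{\|f(w) - h(w)\|}, \]
where $p$ is a strongly exposing functional at $w$ such that $|p(x)| < 1 - \delta(1/n)$ for $\inf_{|\lambda| = 1} \|x - \lambda w\| > 1/n$, and $p(w) = 1$. Then $\|g - h\| \le \varepsilon$ and
\[ \|(f - g)(w)\| = \Big\| f(w) - h(w) + \varepsilon\, p(w)^k \frac{f(w) - h(w)}{\|f(w) - h(w)\|} \Big\| = \|f(w) - h(w)\| + \varepsilon > \|f - h\| + \varepsilon\big(1 - \delta(1/n)\big) \]
\[ \ge \sup\Big\{ \Big\| (f - h)(x) + \varepsilon\, p(x)^k \frac{f(w) - h(w)}{\|f(w) - h(w)\|} \Big\| : \inf_{|\lambda| = 1} \|x - \lambda w\| > 1/n \Big\} = \sup\Big\{ \|(f - g)(x)\| : \inf_{|\lambda| = 1} \|x - \lambda w\| > 1/n \Big\}. \]
That is, $g \in U_n$. By the Baire category theorem there is a $g \in \bigcap_n U_n$ with $\|g\| < \varepsilon$, and we shall show that $f - g$ strongly attains its norm. Indeed, $g \in \bigcap_n U_n$ implies that for each $n$ there is $z_n \in X$ such that
\[ \|(f - g)(z_n)\| > \sup\Big\{ \|(f - g)(x)\| : \inf_{|\lambda| = 1} \|x - \lambda z_n\| > 1/n \Big\}. \]
Thus $\inf_{|\lambda| = 1} \|z_p - \lambda z_n\| \le 1/n$ for every $p > n$, and $\inf_{|\lambda| = 1} |z_n^*(z_p) - \lambda| = 1 - |z_n^*(z_p)| \le 1/n$ for every $p > n$. So $\lim_n \inf_{p > n} |z_n^*(z_p)| = 1$. Hence, by [1, Lemma 6], there is a subsequence of $\{z_n\}$ which converges to $z$, say. Suppose that there is another sequence $\{x_k\}$ in $B_X$ such that $\{\|(f - g)(x_k)\|\}$ converges to $\|f - g\|$. Then for each $n \ge 1$, there is $M_n$ such that $M_n \ge n$ and for every $m \ge M_n$,
\[ \|(f - g)(x_m)\| > \sup\Big\{ \|(f - g)(x)\| : \inf_{|\lambda| = 1} \|x - \lambda z_n\| > 1/n \Big\}. \]
Then $\inf_{|\lambda| = 1} \|x_m - \lambda z_n\| \le 1/n$ for every $m \ge M_n$. So $\inf_{|\lambda| = 1} \|x_m - \lambda z\| \le \inf_{|\lambda| = 1} \|x_m - \lambda z_n\| + \|z - z_n\| \le 2/n$ for every $m \ge M_n$. Hence we get a convergent subsequence of $\{x_n\}$ whose limit is $\lambda z$ for some $\lambda \in S_{\mathbb{C}}$. This shows that $f - g$ strongly attains its norm at $z$.

It is shown in [6] that the norm is Fréchet differentiable at $P$ if and only if whenever there are sequences $\{t_n\}$, $\{s_n\}$ in $B_X$ and scalars $\alpha$, $\beta$ in $S_{\mathbb{F}}$ such that $\lim_n \alpha P(t_n) = \lim_n \beta P(s_n) = \|P\|$, we get
\[ \lim_n \sup_{\|Q\| = 1} |\alpha Q(t_n) - \beta Q(s_n)| = 0. \tag{3.1} \]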
We have only to show that every nonzero element $P$ in $A$ which strongly attains its norm satisfies the condition (3.1). Suppose that $P$ strongly attains its norm at $z$ and $P \ne 0$. For each $Q \in A$, there is a $k$-linear form $L$ such that $Q(x) = L(x, \ldots, x)$ for each $x \in X$. The polarization identity [8] shows that $\|Q\| \le \|L\| \le (k^k/k!)\|Q\|$. Then for each $x, y \in B_X$, $|Q(x) - Q(y)| \le k \|L\| \|x - y\|$ and
\[ |Q(x) - Q(y)| \le \frac{k^{k+1}}{k!} \|Q\| \|x - y\|. \]
Suppose that there are sequences $\{t_n\}$, $\{s_n\}$ in $B_X$ and scalars $\alpha$, $\beta$ in $S_{\mathbb{F}}$ such that $\lim_n \alpha P(t_n) = \lim_n \beta P(s_n) = \|P\|$. Then for any subsequences $\{s_n'\}$ of $\{s_n\}$ and $\{t_n'\}$ of $\{t_n\}$, there are convergent further subsequences $\{t_n''\}$ of $\{t_n'\}$ and $\{s_n''\}$ of $\{s_n'\}$ and scalars $\alpha''$ and $\beta''$ in $S_{\mathbb{F}}$ such that $\lim_n t_n'' = \alpha'' z$ and $\lim_n s_n'' = \beta'' z$. Then $\alpha P(\alpha'' z) = \beta P(\beta'' z) = 1$. So $\alpha (\alpha'')^k = \beta (\beta'')^k$, and hence $\alpha Q(\alpha'' z) = \beta Q(\beta'' z)$ for every $Q$. Then we get
\[ \lim_n \sup_{\|Q\| = 1} |\alpha Q(t_n'') - \beta Q(s_n'')| \le \lim_n \sup_{\|Q\| = 1} \big( |\alpha Q(t_n'') - \alpha Q(\alpha'' z)| + |\beta Q(\beta'' z) - \beta Q(s_n'')| \big) \]
\[ \le \lim_n \frac{k^{k+1}}{k!} \big( \|t_n'' - \alpha'' z\| + \|\beta'' z - s_n''\| \big) = 0. \]
This implies that $\lim_n \sup_{\|Q\| = 1} |\alpha Q(t_n) - \beta Q(s_n)| = 0$. Therefore the norm is Fréchet differentiable at $P$. This completes the proof. □

Remark 3.2. Suppose that $B_X$ is the closed convex hull of a set of u.s.e. points. Then the set of u.s.e. points is a norming subset of $X^* = \mathcal{P}(^1 X)$. Hence the set of elements in $X^*$ at which the norm of $X^*$ is Fréchet differentiable is a dense $G_\delta$ subset.

4. Polynomial numerical index and graph theory

A norm $\|\cdot\|$ on $\mathbb{R}^n$ or $\mathbb{C}^n$ is said to be an absolute norm if $\|(a_1, \ldots, a_n)\| = \|(|a_1|, \ldots, |a_n|)\|$ for all scalars $a_1, \ldots, a_n$, and $\|(1, 0, \ldots, 0)\| = \cdots = \|(0, \ldots, 0, 1)\| = 1$. We may use the fact that an absolute norm is less than or equal to the $\ell_1$-norm and is nondecreasing in each variable. A real or complex Banach space $X$ is said to be a CL-space if its unit ball is the absolutely convex hull of every maximal convex subset of the unit sphere. In particular, if $X$ is finite dimensional, then this is equivalent to the condition that $|x^*(x)| = 1$ for every $x^* \in \mathrm{ext}\, B_{X^*}$ and every $x \in \mathrm{ext}\, B_X$ [20].

Let $X$ be an $n$-dimensional Banach space with an absolute norm $\|\cdot\|$. In this section, $X$ as a vector space is regarded as $\mathbb{R}^n$ or $\mathbb{C}^n$, and we denote by $\{e_j\}_{j=1}^n$ and $\{e_j^*\}_{j=1}^n$ the canonical basis and the coordinate functionals, respectively. We also denote by $\mathrm{ext}\, B_X$ the set of all extreme points of $B_X$. Now define the following mapping between $n$-dimensional Banach spaces with an absolute norm and certain graphs with $n$ vertices:
\[ X \mapsto G = G(X), \]
where $G$ is a graph with the vertex set $V = \{1, 2, \ldots, n\}$ and the edge set $E = \{(i, j) : e_i + e_j \notin B_X\}$. For example, if $X = \ell_1^n$, then $G(X)$ is a complete graph of order $n$, that is, a
graph in which every pair of the $n$ vertices is connected by an edge. Conversely, $G(X)$ is a graph in which no pair of the $n$ vertices is connected by an edge if $X = \ell_\infty^n$. Using these graphs and their theory, Reisner gave the exact characterization of all finite dimensional CL-spaces with an absolute norm. Prior to Reisner's theorem, we give some basic definitions from graph theory. Given a graph $G = (V, E)$, we say that $\sigma \subset V$ is a clique of $G$ if the edge set $E$ of $G$ contains all pairs consisting of any two vertices in $\sigma$. Conversely, $\tau \subset V$ is called a stable set of $G$ if $E$ contains no pairs consisting of two vertices in $\tau$. A graph $G$ is said to be perfect if
\[ \omega(H) = \chi(H) \quad \text{for every induced subgraph } H \text{ of } G, \]
where $\omega(G)$ denotes the clique number of $G$ (the largest cardinality of a clique of $G$) and $\chi(G)$ is the chromatic number of $G$ (the smallest number of colors needed to color the vertices of $G$ so that no two adjacent vertices share the same color).

Theorem 4.1. (See [23, Reisner].) Let $X$ be a finite dimensional Banach space with an absolute norm. Then $X$ is a CL-space if and only if $G = G(X)$ is a perfect graph and there exists a unique common element between every maximal clique and each maximal stable set of $G$. In particular, for every $n$-dimensional CL-space $X$, the following characterizations of the sets of extreme points of $B_X$ and $B_{X^*}$ hold respectively:

(1) $(x_1, x_2, \ldots, x_n) \in \mathrm{ext}\, B_X$ if and only if $|x_j| \in \{0, 1\}$ for all $j = 1, 2, \ldots, n$ and the support $\{j \in V : x_j \ne 0\}$ is a maximal stable set of $G$.
(2) $(x_1, x_2, \ldots, x_n) \in \mathrm{ext}\, B_{X^*}$ if and only if $|x_j| \in \{0, 1\}$ for all $j = 1, 2, \ldots, n$ and the support $\{j \in V : x_j \ne 0\}$ is a maximal clique of $G$.

In this theorem, the maximality of cliques and stable sets is with respect to the partial order of inclusion. Let $X$ be a finite dimensional CL-space. If $\tau$ is a maximal clique of $G = G(X)$, then the subspace $\mathrm{span}\{e_j : j \in \tau\}$ of $X$ is isometrically isomorphic to $\ell_1^{|\tau|}$. Indeed, since the absolute norm is less than or equal to the $\ell_1$-norm, we have $\|\sum_{j \in \tau} a_j e_j\| \le \sum_{j \in \tau} |a_j|$ for all scalars $a_j$. For the reverse inequality, let $x_\tau^* = \sum_{j \in \tau} \mathrm{sign}(a_j) \cdot e_j^*$. Then $x_\tau^*$ is in $\mathrm{ext}\, B_{X^*}$ by Theorem 4.1 and hence $x_\tau^*(\sum_{j \in \tau} a_j e_j) = \sum_{j \in \tau} |a_j|$, which completes the proof.

Remark 4.2. Originally, Reisner proved the above theorem only for the real case. However, it can be extended to the general case (real or complex). For this, we need the following comments and proposition. There is a natural one-to-one correspondence between the absolute norms on $\mathbb{R}^n$ and those on $\mathbb{C}^n$. Specifically, given a real Banach space $X = (\mathbb{R}^n, \|\cdot\|)$ with an absolute norm, we can form the complexification $\tilde{X} = (\mathbb{C}^n, \|\cdot\|_{\mathbb{C}})$ of $X$ defined by $\|(z_1, \ldots, z_n)\|_{\mathbb{C}} := \|(|z_1|, \ldots, |z_n|)\|$ for each $(z_1, \ldots, z_n) \in \tilde{X}$. Then $\tilde{X}$ is clearly a complex Banach space with an absolute norm. Moreover we get the following basic proposition.

Proposition 4.3. Let $X$ be a real Banach space with an absolute norm and $\tilde{X}$ the complexification of $X$. Then, for an element $x$ of $X$, $x \in \mathrm{ext}\, B_X$ if and only if $x \in \mathrm{ext}\, B_{\tilde{X}}$. In particular, $X$ is a CL-space if and only if $\tilde{X}$ is a CL-space.
Proof. The sufficiency is clear. Suppose that $x$ is an extreme point of $B_X$ and $2x = y + z$ for some $y, z$ in $B_{\tilde{X}}$. We claim that $y$ and $z$ are in $X$. To the contrary, suppose that the $j$th coordinate of $y$ is not a real number for some $j$. That is, $e_j^*(y) = a + bi$, $a, b \in \mathbb{R}$ and $b \ne 0$. Since $2x = y + z = \mathrm{Re}(y) + \mathrm{Re}(z) + i(\mathrm{Im}(y) + \mathrm{Im}(z))$, we have $\mathrm{Re}(y) = \mathrm{Re}(z) = x$, where $\mathrm{Re}(x) = \sum_{k=1}^n \mathrm{Re}(e_k^*(x)) e_k$ and $\mathrm{Im}(x) = \sum_{k=1}^n \mathrm{Im}(e_k^*(x)) e_k$. Take a positive real number $\delta$ less than $\sqrt{a^2 + b^2} - |a|$. Note that $\|x \pm \delta e_j\| \le \|y\| \le 1$ by the basic property of an absolute norm. So $2x = (x + \delta e_j) + (x - \delta e_j)$ contradicts the fact that $x$ is an extreme point of $B_X$. □

Using the graph-theoretic technique on CL-spaces, we now find strongly norm attaining points in $\tilde{\rho}\mathcal{P}(^m X)$. For this, let us consider the following lemma.

Lemma 4.4. Let $Y = \ell_1^N$ and let $m$ be a positive integer. For any $j_1, j_2, \ldots, j_m \in \{1, 2, \ldots, N\}$, define an $m$-homogeneous polynomial $Q_{j_1, j_2, \ldots, j_m}$ of $Y$ by
\[ Q_{j_1, j_2, \ldots, j_m}(x) = \prod_{k=1}^m e_{j_k}^*(x) + \Big( \sum_{j \in \{j_1, \ldots, j_m\}} e_j^*(x) \Big)^m. \]
Then $Q_{j_1, j_2, \ldots, j_m}$ attains its norm only at $\frac{c}{m} \sum_{k=1}^m e_{j_k}$, $|c| = 1$. Hence $Q_{j_1, j_2, \ldots, j_m}$ strongly attains its norm at $\frac{1}{m} \sum_{k=1}^m e_{j_k}$.

Proof. For positive integers $m_1, \ldots, m_n$, consider the product
\[ (x_1, \ldots, x_n) \mapsto \prod_{k=1}^n x_k^{m_k} \]
on the compact subset $\mathbb{R}^n_+ \cap S_{\ell_1^n}$. Then it is easy to see by induction that the product has its unique maximum at $\big( \frac{m_1}{m_1 + \cdots + m_n}, \ldots, \frac{m_n}{m_1 + \cdots + m_n} \big)$. Hence the norm of the polynomial $\prod_{k=1}^m e_{j_k}^*(x)$ is attained only at $x = (x_1, \ldots, x_N)$, where $|x_j| = \frac{1}{m} \sum_{k=1}^m e_j^*(e_{j_k})$ for each $1 \le j \le N$. Notice also that the norm of the polynomial $(\sum_{j=1}^N e_j^*(x))^m$ is attained only at $x = (x_1, \ldots, x_N)$, where $\mathrm{sign}(x_1) = \cdots = \mathrm{sign}(x_N)$ and $x \in S_{\ell_1^N}$. Hence $Q_{j_1, j_2, \ldots, j_m}$ attains its norm only at $\frac{c}{m} \sum_{k=1}^m e_{j_k}$ for some $c \in S_{\mathbb{C}}$. This completes the proof. □

Theorem 4.5. Let $X$ be a finite dimensional CL-space with an absolute norm (real or complex) and let $m$ be a positive integer. Then $\frac{1}{m} \sum_{j=1}^m y_j \in \tilde{\rho}\mathcal{P}(^m X)$ whenever $y_1, y_2, \ldots, y_m$ are extreme points of $B_X$ whose coordinates are nonnegative real numbers.

Proof. Denote by $M(G)$ the family of all maximal cliques of $G = G(X)$ and let $y_1, y_2, \ldots, y_m$ be extreme points of $B_X$ whose coordinates are nonnegative real numbers. For each $J \in M(G)$, define the $m$-homogeneous polynomial $Q_J$ and the linear functional $L_J$ as follows:
\[ Q_J = Q_{j_1, j_2, \ldots, j_m}, \qquad L_J = \sum_{j \in \{j_1, \ldots, j_m\}} e_j^*, \]
where each $j_k$ ($k = 1, 2, \ldots, m$) is the unique common element between the maximal clique $J$ and the support of the extreme point $y_k$. Now define an $m$-homogeneous polynomial $Q$ of $X$ by
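\[ Q = \sum_{J \in M(G)} Q_J + \Big( \sum_{J \in M(G)} L_J \Big)^m. \]
For every maximal clique $J$ of $G$, denote by $P_J$ the projection of $X$ onto $\mathrm{span}\{e_j : j \in J\}$. Then, it is clear that
\[ P_J\Big( \frac{1}{m} \sum_{j=1}^m y_j \Big) = \frac{1}{m} \sum_{j=1}^m P_J(y_j) = \frac{1}{m} \sum_{k=1}^m e_{j_k}. \]
Notice that $Q_J \circ P_J = Q_J$ and $L_J \circ P_J = L_J$ for each $J \in M(G)$. It follows by Lemma 4.4 that
\[ Q\Big( \frac{1}{m} \sum_{j=1}^m y_j \Big) = \sum_{J \in M(G)} Q_J\Big( \frac{1}{m} \sum_{j=1}^m y_j \Big) + \Big( \sum_{J \in M(G)} L_J\Big( \frac{1}{m} \sum_{j=1}^m y_j \Big) \Big)^m \]
\[ = \sum_{J \in M(G)} Q_J\Big( P_J\Big( \frac{1}{m} \sum_{j=1}^m y_j \Big) \Big) + \Big( \sum_{J \in M(G)} L_J\Big( P_J\Big( \frac{1}{m} \sum_{j=1}^m y_j \Big) \Big) \Big)^m \]
\[ = \sum_{J \in M(G)} Q_J\Big( \frac{1}{m} \sum_{k=1}^m e_{j_k} \Big) + \Big( \sum_{J \in M(G)} L_J\Big( \frac{1}{m} \sum_{k=1}^m e_{j_k} \Big) \Big)^m \]
\[ = \sum_{J \in M(G)} \|Q_J\| + \Big( \sum_{J \in M(G)} \|L_J\| \Big)^m = \|Q\|, \]
and that $Q$ attains its norm at $\frac{1}{m} \sum_{j=1}^m y_j$. We claim that the above polynomial $Q$ strongly attains its norm at $\frac{1}{m} \sum_{j=1}^m y_j$. Suppose that $|Q(y)| = \|Q\|$; we need to show that $y = \frac{c}{m} \sum_{j=1}^m y_j$ for some $c \in S_{\mathbb{C}}$. We have the following inequalities:
\[ |Q(y)| = \Big| \sum_{J \in M(G)} Q_J(P_J y) + \Big( \sum_{J \in M(G)} L_J(P_J y) \Big)^m \Big| \le \sum_{J \in M(G)} |Q_J(P_J y)| + \Big| \sum_{J \in M(G)} L_J(P_J y) \Big|^m \]
\[ \le \sum_{J \in M(G)} \|Q_J\| + \Big( \sum_{J \in M(G)} \|L_J\| \Big)^m = \|Q\|. \]
Hence $|Q_J(P_J y)| = \|Q_J\|$ and $|L_J(P_J y)| = \|L_J\|$ for each $J \in M(G)$, and the signs $\mathrm{sign}\, L_J(P_J y)$ are the same for all $J \in M(G)$. By Lemma 4.4, $P_J y = \frac{c_J}{m} \sum_{k=1}^m e_{j_k}$ for each $J \in M(G)$. Since $\mathrm{sign}(L_J(P_J y)) = c_J$ and these are the same for all $J \in M(G)$, take $c_J = c$ for all $J \in M(G)$.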
Since each maximal clique $J$ induces every extreme point $x_J^*$ of $B_{X^*}$, the above equation implies that
\[ x_J^*(y) = x_J^*(P_J y) = x_J^*\Big( P_J\Big( \frac{c}{m} \sum_{j=1}^m y_j \Big) \Big) = x_J^*\Big( \frac{c}{m} \sum_{j=1}^m y_j \Big). \]
Consequently $y = \frac{c}{m} \sum_{j=1}^m y_j$. This completes the proof. □
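The elementary fact used in the proof of Lemma 4.4, that the monomial $x_1^{m_1} \cdots x_n^{m_n}$ has its unique maximum on $\mathbb{R}^n_+ \cap S_{\ell_1^n}$ at $(m_1/M, \ldots, m_n/M)$ with $M = m_1 + \cdots + m_n$, can be sanity-checked on a rational grid. The following is only an illustrative sketch, not part of the argument: the helper names `simplex_grid`, `monomial` and `grid_maximizers` are hypothetical, and the grid step is chosen (as an assumption) so that the predicted maximizer lies on the grid.

```python
from fractions import Fraction
from itertools import product


def simplex_grid(n, steps):
    """Yield all points of {x >= 0, x_1 + ... + x_n = 1} with coordinates in (1/steps)Z."""
    for combo in product(range(steps + 1), repeat=n - 1):
        rest = steps - sum(combo)
        if rest >= 0:
            yield tuple(Fraction(c, steps) for c in combo) + (Fraction(rest, steps),)


def monomial(x, exponents):
    """Evaluate x_1^{m_1} * ... * x_n^{m_n} exactly with rational arithmetic."""
    value = Fraction(1)
    for xi, mi in zip(x, exponents):
        value *= xi ** mi
    return value


def grid_maximizers(exponents, steps):
    """Return every grid point at which the monomial attains its maximum over the grid."""
    points = list(simplex_grid(len(exponents), steps))
    best = max(monomial(p, exponents) for p in points)
    return [p for p in points if monomial(p, exponents) == best]


# Exponents (1, 2, 3): the predicted maximizer is (1/6, 2/6, 3/6),
# which lies on the grid with step 1/12.
print(grid_maximizers((1, 2, 3), 12))
```

Exact rational arithmetic is used so that ties would be detected reliably; a grid search of this kind only tests the claim on finitely many points and does not replace the induction in the proof.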
It was shown in [19] that a necessary and sufficient condition for a Banach space with the Radon–Nikodym property to have polynomial numerical index one can be stated as follows: $n^{(k)}(X) = 1$ if and only if
\[ |x^*(x)| = 1 \quad \text{for all } x \in \tilde{\rho}\mathcal{P}(^k X) \text{ and } x^* \in \mathrm{ext}\, B_{X^*}. \tag{4.1} \]
The following theorem is a partial answer to Problem 43 in [15]: Characterize the complex Banach spaces $X$ satisfying $n^{(k)}(X) = 1$ for all $k \ge 2$.

Theorem 4.6. Let $X$ be an $n$-dimensional complex Banach space with an absolute norm and let $k$ be an integer greater than or equal to 2. Then $n^{(k)}(X) = 1$ if and only if $X$ is isometric to $\ell_\infty^n$.

Proof. Suppose that $n^{(k)}(X) = 1$. Since $n^{(k)}(X) = 1$ implies $n(X) = 1$ (i.e. $X$ is a CL-space), we can apply Theorem 4.1. To show that $X$ is isometric to $\ell_\infty^n$, it suffices to prove that $B_X$ has only one extreme point whose coordinates are nonnegative real numbers. Suppose that there exist two distinct extreme points $x$, $y$ of $B_X$ whose coordinates are all nonnegative. Because the supports of $x$ and $y$ are maximal stable sets, we can choose $i, j \in \{1, \ldots, n\}$ such that $(i, j)$ is in the edge set $E = E(G)$ and $e_i^*(x) = 1 = e_j^*(y)$, $e_j^*(x) = 0 = e_i^*(y)$. Take a maximal clique $\tau$ of $G = G(X)$ containing $(i, j)$. Then, by Theorem 4.1, $i$ is the unique common element between the maximal clique $\tau$ and the support of $x$. Similarly, $j$ is the unique common element between $\tau$ and the support of $y$. Now consider an $x_\tau^* \in \mathrm{ext}\, B_{X^*}$ defined by
\[ x_\tau^*(e_k) = \begin{cases} -1, & \text{if } k = i, \\ 1, & \text{if } k \in \tau \setminus \{i\}, \\ 0, & \text{if } k \notin \tau. \end{cases} \]
Then, using Theorem 4.1, we get
\[ x_\tau^*\Big( \frac{x + y}{2} \Big) = \frac{x_\tau^*(x) + x_\tau^*(y)}{2} = \frac{-1 + 1}{2} = 0. \]
However, since $\frac{x + y}{2} \in \tilde{\rho}\mathcal{P}(^2 X)$ by Theorem 4.5, this contradicts (4.1).

For the converse, note that $\tilde{\rho}\mathcal{P}(^k X) \subset \mathrm{ext}_{\mathbb{C}}\, B_X$ by [19, Proposition 2.1]. It is also easy to see that $\mathrm{ext}_{\mathbb{C}}\, B_X = \mathrm{ext}\, B_X$ when $X = \ell_\infty^n$. It then follows from (4.1) that $n^{(k)}(\ell_\infty^n) = 1$. □
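The construction in the proof of Theorem 4.6 can be traced on the smallest example, $X = \ell_1^2$, where $G(X)$ is a single edge. The sketch below is illustrative only: the function names `graph_of_norm` and `maximal_sets` are hypothetical, vertices are 0-based rather than 1-based, and maximal cliques and stable sets are found by brute force, which is an assumption suitable only for very small $n$. It builds $G(X)$ from the norm, reads off the nonnegative extreme points from the maximal stable sets as in Theorem 4.1(1), forms the functional $x_\tau^*$ of the proof, and evaluates it at the midpoint $(x + y)/2$.

```python
from itertools import combinations


def l1(v):
    return sum(abs(t) for t in v)


def linf(v):
    return max(abs(t) for t in v)


def graph_of_norm(norm, n, tol=1e-12):
    """Edge {i, j} (0-based) iff e_i + e_j lies outside the unit ball of the norm."""
    edges = set()
    for i, j in combinations(range(n), 2):
        v = [0.0] * n
        v[i] = v[j] = 1.0
        if norm(v) > 1.0 + tol:
            edges.add(frozenset((i, j)))
    return edges


def maximal_sets(n, edges, want_clique):
    """Brute-force maximal cliques (want_clique=True) or maximal stable sets."""
    def admissible(s):
        return all((frozenset(p) in edges) == want_clique
                   for p in combinations(s, 2))
    good = [frozenset(s) for k in range(1, n + 1)
            for s in combinations(range(n), k) if admissible(s)]
    return [s for s in good if not any(s < t for t in good)]


n = 2
edges = graph_of_norm(l1, n)
cliques = maximal_sets(n, edges, want_clique=True)
stable = maximal_sets(n, edges, want_clique=False)

# Nonnegative extreme points of the unit ball are the indicator vectors of the
# maximal stable sets (Theorem 4.1(1)); for l_1^2 these are e_1 and e_2.
x, y = ([1.0 if k in s else 0.0 for k in range(n)] for s in stable[:2])

tau = cliques[0]
(i,) = tau & stable[0]  # unique common element of tau and the support of x
x_tau = [-1.0 if k == i else (1.0 if k in tau else 0.0) for k in range(n)]
midpoint = [(a + b) / 2 for a, b in zip(x, y)]
value = sum(c * t for c, t in zip(x_tau, midpoint))
print(edges, cliques, stable, value)
```

Here `value` is $x_\tau^*((x + y)/2)$, which vanishes, so condition (4.1) fails for $\ell_1^2$ with $k = 2$. For comparison, `graph_of_norm(linf, n)` returns an empty edge set, so $\ell_\infty^n$ has a single nonnegative extreme point and the argument does not apply.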
5. Characterization of complex extreme points

Theorem 5.1. Let $X = (\mathbb{C}^n, \|\cdot\|)$ be an $n$-dimensional complex CL-space with an absolute norm. Then an element $(a_1, a_2, \ldots, a_n)$ in $X$ is a complex extreme point of $B_X$ if and only if $(|a_1|, |a_2|, \ldots, |a_n|)$ is a convex combination of extreme points of $B_X$ whose coordinates all are positive real numbers. In short, $\mathbb{R}^n_+ \cap \mathrm{ext}_{\mathbb{C}}\, B_X = \mathrm{co}(\mathbb{R}^n_+ \cap \mathrm{ext}\, B_X)$.

Proof. Let $(a_1, a_2, \ldots, a_n)$ be a complex extreme point of $B_X$. If $\tilde{X} = (\mathbb{R}^n, \|\cdot\|)$, then all extreme points of $B_{\tilde{X}}$ are extreme points of $B_X$ by Proposition 4.3. So the element $x := (|a_1|, |a_2|, \ldots, |a_n|)$ has an expression $x = \sum_{j=1}^m \lambda_j y_j$ where $\sum_{j=1}^m \lambda_j = 1$, $\lambda_j > 0$ and each $y_j$ is an extreme point of $B_X$ whose coordinates are real numbers. Now we claim that all coordinates of each $y_j$ are positive. For a contradiction, suppose that the first coordinate of some $y_j$ is negative. More specifically, assume that the first coordinate of $y_j$ is $-1$ if $1 \le j \le r$ and $1$ otherwise. Then $|a_1| < 1$. Note that for $y_j' = y_j + 2e_1$, $j = 1, 2, \ldots, r$,
\[ (1, |a_2|, \ldots, |a_n|) = \sum_{j=1}^r \lambda_j y_j' + \sum_{j=r+1}^m \lambda_j y_j \in \mathrm{co}(\mathrm{ext}\, B_X) = B_X. \]
It follows that for all $\varepsilon \in \mathbb{C}$ with $|\varepsilon| < 1 - |a_1|$,
\[ \|x + \varepsilon e_1\| = \|(|a_1| + \varepsilon, |a_2|, \ldots, |a_n|)\| \le \|(1, |a_2|, \ldots, |a_n|)\| \le 1. \]
So $(|a_1|, |a_2|, \ldots, |a_n|)$ is not a complex extreme point, which is a contradiction.

For the converse, let $x = \sum_{j=1}^m \lambda_j y_j$ where $\sum_{j=1}^m \lambda_j = 1$ and $y_j \in \mathrm{ext}\, B_X$ for all $j = 1, 2, \ldots, m$. Suppose that $x$ is not a complex extreme point of $B_X$. Then there exists a nonzero $y$ in $X$ such that $x + \varepsilon y \in B_X$ whenever $|\varepsilon| \le 1$. Take a maximal clique $\tau$ in $V$ containing some vertices which are nonzero coordinates of $y$. Consider the projection $P_\tau$ of $X$ onto the linear span $Y$ of $\{e_j : j \in \tau\}$. Then it follows from the assumption that $P_\tau x$ is also not a complex extreme point of $B_Y$. Moreover, we can check that $P_\tau x$ is on the unit sphere of $X$. Indeed, if an extreme point $x_\tau^*$ of $B_{X^*}$ is defined by $x_\tau^*(e_j) = 1$ for $j \in \tau$ and $x_\tau^*(e_j) = 0$ for $j \notin \tau$, then we have $x_\tau^*(y_j) = 1$ for all $j = 1, 2, \ldots, m$ by Theorem 4.1. Consequently,
\[ x_\tau^*(P_\tau x) = x_\tau^*(x) = \sum_{j=1}^m \lambda_j x_\tau^*(y_j) = \sum_{j=1}^m \lambda_j = 1. \]
Now, from the fact that $Y = \mathrm{span}\{e_j : j \in \tau\}$ is isometrically isomorphic to $\ell_1^{|\tau|}$ since $\tau$ is a clique, we get that every element of norm one in $Y$ is a complex extreme point of $B_Y$ [10]. This is a contradiction. □

The above characterization of complex extreme points has an application in the theory of the numerical index of Banach spaces. Specifically, we apply it to the analytic numerical index of $X$ defined by
\[ n_a(X) = \inf\{ v(P) : \|P\| = 1,\ P \in \mathcal{P}(X : X) \}. \]
It is easily observed that $0 \le n_a(X) \le n^{(k)}(X) \le n(X) \le 1$ for every $k \ge 2$. Note that a necessary and sufficient condition for a finite dimensional complex Banach space to have analytic numerical index 1 can be stated using complex extreme points as follows (see [19]): $n_a(X) = 1$ if and only if
\[ |x^*(x)| = 1 \quad \text{for all } x \in \mathrm{ext}_{\mathbb{C}}\, B_X \text{ and } x^* \in \mathrm{ext}\, B_{X^*}. \tag{5.1} \]
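The definition of a complex extreme point, and the contrast between $\ell_\infty^2$ and $\ell_1^2$ predicted by Theorem 5.1, can be probed numerically. The sketch below is a hedged illustration only: `sup_over_rotations` is a hypothetical helper that approximates $\sup_\theta \|x + e^{i\theta} y\|$ on a finite grid of angles, and the points and directions are chosen by hand. In $\ell_\infty^2$ the point $(1, 1/2)$ is not a convex combination of nonnegative extreme points (the only one is $(1, 1)$), and the direction $y = (0, 1/2)$ keeps $x + e^{i\theta} y$ in the ball. In $\ell_1^2$ the point $(1/2, 1/2)$ is such a convex combination, and the sample direction $y = (0.1, -0.1)$ leaves the ball.

```python
import cmath
import math


def l1(v):
    return sum(abs(t) for t in v)


def linf(v):
    return max(abs(t) for t in v)


def sup_over_rotations(norm, x, y, samples=720):
    """Approximate sup over theta of norm(x + e^{i theta} y) on a uniform grid of angles."""
    return max(
        norm([a + cmath.exp(2j * math.pi * k / samples) * b for a, b in zip(x, y)])
        for k in range(samples)
    )


# l_infty^2: a nonzero direction that never leaves the unit ball,
# so (1, 1/2) is not a complex extreme point.
stays_inside = sup_over_rotations(linf, (1.0, 0.5), (0.0, 0.5))

# l_1^2: this particular direction leaves the unit ball for some rotation.
leaves_ball = sup_over_rotations(l1, (0.5, 0.5), (0.1, -0.1))

print(stays_inside, leaves_ball)
```

The first computation exhibits a witness that $(1, 1/2)$ is not complex extreme. The second tests a single direction only, so it is consistent with $(1/2, 1/2)$ being a complex extreme point of $B_{\ell_1^2}$ but does not prove it.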
After applying the characterization of complex extreme points to the above theorem, we have the following corollary by the same argument as the proof of Theorem 4.6.

Corollary 5.2. Let $X$ be an $n$-dimensional complex Banach space with an absolute norm. Then $n_a(X) = 1$ if and only if $X$ is isometric to $\ell_\infty^n$.

As an immediate consequence of the above theorem, we get a corollary. For further details, we need some definitions about the Daugavet property. A function $\Phi \in \ell_\infty(B_X, X)$ is said to satisfy the (resp. alternative) Daugavet equation if the norm equality
\[ \|\mathrm{Id} + \Phi\| = 1 + \|\Phi\| \quad \Big( \text{resp. } \sup_{\omega \in S_{\mathbb{C}}} \|\mathrm{Id} + \omega \Phi\| = 1 + \|\Phi\| \Big) \]
holds. If this happens for all weakly compact polynomials in $\mathcal{P}(X : X)$, we say that $X$ has the (resp. alternative) p-Daugavet property. Similarly, $X$ is said to have the $k$-order Daugavet property if the Daugavet equation is satisfied for all rank-one $k$-homogeneous polynomials in $\mathcal{P}(^k X : X)$.

Corollary 5.3. Let $X$ be a finite dimensional complex Banach space with an absolute norm. Then the following are equivalent:

(a) $X$ has the alternative p-Daugavet property.
(b) $X$ has the $k$-order Daugavet property for some $k \ge 2$.
(b′) $X$ has the $k$-order Daugavet property for every $k \ge 2$.
(c) $n^{(k)}(X) = 1$ for some $k \ge 2$.
(c′) $n^{(k)}(X) = 1$ for every $k \ge 2$.
(d) $n_a(X) = 1$.
(e) $X = \ell_\infty^n$.

Proof. Theorem 4.6 and Corollary 5.2 imply (c) ⇔ (e) and (d) ⇔ (e) respectively. (b) ⇒ (c) or (b′) ⇒ (c′) is induced by [4, Proposition 1.3]. It is also easy to check that
\[ \begin{array}{ccccccc} & & (b) & \Rightarrow & (c) & \Leftrightarrow & (e) \\ & & \Uparrow & & \Uparrow & & \\ (a) & \Rightarrow & (b') & \Rightarrow & (c') & \Leftarrow & (d) \end{array} \]
Corollary 2.10 in [4] shows (e) ⇒ (a). The proof is done. □
946
J. Kim, H.J. Lee / Journal of Functional Analysis 257 (2009) 931–947
Remark 5.4. Let $X$ be a finite dimensional (real or complex) Banach space with an absolute norm. Then it is impossible that $X$ has the p-Daugavet property. Indeed, if $X$ is a real (resp. complex) Banach space, then $n^{(k)}(X) = 1$ for all $k \ge 2$ and $X = \mathbb{R}$ (resp. $X = \ell_\infty^n$ for some $n$) (see [19] and Corollary 4.6). Then it is easy to see that neither $\mathbb{R}$ nor the complex $\ell_\infty^n$ has the p-Daugavet property.

It is worth noting that the complex extreme points play an important role in the study of norming subsets of $A(B_X)$. In particular, $\rho A(B_X) = \operatorname{ext}_{\mathbb{C}} B_X$ when $X$ is a finite dimensional complex Banach space (see [5]). Using the result in [13], we obtain the following corollary. For further references about complex convexity and monotonicity, see [16–18].

Corollary 5.5. Let $X = (\mathbb{R}^n, \|\cdot\|)$ be an $n$-dimensional real CL-space with an absolute norm. Then an element $(|a_1|, |a_2|, \dots, |a_n|)$ in $S_X$ is a point of upper monotonicity of $X$ if and only if $(|a_1|, |a_2|, \dots, |a_n|)$ is a convex combination of extreme points of $B_X$ whose coordinates are all positive real numbers.

Acknowledgment

The latter part of this paper is based on the first named author's master's thesis, which was supervised by Prof. Yun Sung Choi. Furthermore, the authors wish to thank him for his comments about this subject.

References

[1] María D. Acosta, Denseness of numerical radius attaining operators: Renorming and embedding results, Indiana Univ. Math. J. 40 (3) (1991) 903–914.
[2] María D. Acosta, Sung Guen Kim, Denseness of holomorphic functions attaining their numerical radii, Israel J. Math. 161 (2007) 373–386.
[3] Yoav Benyamini, Joram Lindenstrauss, Geometric Nonlinear Functional Analysis, vol. 1, Amer. Math. Soc. Colloq. Publ., vol. 48, American Mathematical Society, Providence, RI, 2000.
[4] Yun Sung Choi, Domingo García, Manuel Maestre, Miguel Martín, The Daugavet equation for polynomials, Studia Math. 178 (1) (2007) 63–82.
[5] Yun Sung Choi, Kwang Hee Han, Han Ju Lee, Boundaries for algebras of holomorphic functions on Banach spaces, Illinois J. Math. 51 (3) (2007) 883–896.
[6] Yun Sung Choi, Han Ju Lee, Hyun Gwi Song, Bishop's theorem and differentiability of a subspace of $C_b(K)$, preprint, http://arxiv.org/abs/0708.4069, 2007.
[7] Robert Deville, Gilles Godefroy, Václav Zizler, A smooth variational principle with applications to Hamilton–Jacobi equations in infinite dimensions, J. Funct. Anal. 111 (1) (1993) 197–212.
[8] Seán Dineen, Complex Analysis on Infinite-Dimensional Spaces, Springer Monogr. Math., Springer-Verlag London Ltd., London, 1999.
[9] Juan Ferrera, Norm-attaining polynomials and differentiability, Studia Math. 151 (1) (2002) 1–21.
[10] Josip Globevnik, On complex strict and uniform convexity, Proc. Amer. Math. Soc. 47 (1975) 175–178.
[11] Josip Globevnik, Boundaries for polydisc algebras in infinite dimensions, Math. Proc. Cambridge Philos. Soc. 85 (2) (1979) 291–303.
[12] Lawrence A. Harris, The numerical range of holomorphic functions in Banach spaces, Amer. J. Math. 93 (1971) 1005–1019.
[13] Henryk Hudzik, Agata Narloch, Relationships between monotonicity and complex rotundity properties with some consequences, Math. Scand. 96 (2) (2005) 289–306.
[14] Sung Guen Kim, Han Ju Lee, Norm and numerical peak holomorphic functions on Banach spaces, preprint, http://arxiv.org/abs/0706.0574, 2007.
[15] Vladimir Kadets, Miguel Martín, Rafael Payá, Recent progress and open questions on the numerical index of Banach spaces, RACSAM Rev. R. Acad. Cienc. Exactas Fís. Nat. Ser. A Mat. 100 (1–2) (2006) 155–182.
[16] Han Ju Lee, Monotonicity and complex convexity in Banach lattices, J. Math. Anal. Appl. 307 (1) (2005) 86–101.
[17] Han Ju Lee, Complex convexity and monotonicity in quasi-Banach lattices, Israel J. Math. 159 (2007) 57–91.
[18] Han Ju Lee, Randomized series and geometry of Banach spaces, preprint, http://arxiv.org/abs/0706.3740, 2007.
[19] Han Ju Lee, Banach spaces with polynomial numerical index 1, Bull. Lond. Math. Soc. 40 (2008) 193–198.
[20] Åsvald Lima, Intersection properties of balls in spaces of compact operators, Ann. Inst. Fourier (Grenoble) 28 (1978) 35–65.
[21] Joram Lindenstrauss, On operators which attain their norm, Israel J. Math. 1 (1963) 139–148.
[22] Ángel Rodríguez Palacios, Numerical ranges of uniformly continuous functions on the unit sphere of a Banach space, J. Math. Anal. Appl. 297 (2) (2004) 472–476, special issue dedicated to John Horváth.
[23] Shlomo Reisner, Certain Banach spaces associated with graphs and CL-spaces with 1-unconditional bases, J. London Math. Soc. (2) 43 (1) (1991) 137–148.
Journal of Functional Analysis 257 (2009) 948–991 www.elsevier.com/locate/jfa
Composition formulas in the Weyl calculus Toshiyuki Kobayashi a , Bent Ørsted b , Michael Pevzner c , André Unterberger c,∗ a Graduate School of Mathematical Sciences, The University of Tokyo, 3-8-1 Komaba, Meguro, Tokyo, 153-8914 Japan b Matematisk Institut, Byg. 430, Ny Munkegade, 8000 Aarhus C, Denmark c Laboratoire de Mathématiques, CNRS FRE 3111, Université de Reims, Moulin de la Housse, B.P. 1039,
F-51687 Reims Cedex 2, France Received 25 December 2008; accepted 30 December 2008 Available online 26 January 2009 Communicated by Paul Malliavin
Abstract

In pseudodifferential analysis, the usual composition formula, which has asymptotic value, extends that valid for differential operators. The one developed here is based instead on the decomposition of symbols (functions in $\mathbb{R}^n \times \mathbb{R}^n$) as integral superpositions of homogeneous ones, of degrees lying on the complex line with real part $-n$. It extends the one known in the one-dimensional case in connection with automorphic pseudodifferential analysis. © 2009 Elsevier Inc. All rights reserved.

Keywords: Weyl calculus; Composition formulas; Principal series representations; Automorphic symbols and triple products
Contents
1. Introduction
2. Decomposing the action of the symplectic group on $L^2(\mathbb{R}^n \times \mathbb{R}^n)$
3. The integral kernel $J^{\varepsilon_1,\varepsilon_2;\varepsilon}_{\nu_1,\nu_2;\nu}(Y, Z; X)$
4. Hyperplane waves and rays
5. Some one-dimensional preparation
6. Another composition of Weyl symbols
7. Irreducibility of the decomposition of $L^2(\mathbb{R}^{2n})$
References

* Corresponding author.
E-mail addresses: [email protected] (T. Kobayashi), [email protected] (B. Ørsted), [email protected] (M. Pevzner), [email protected] (A. Unterberger).
0022-1236/$ – see front matter © 2009 Elsevier Inc. All rights reserved. doi:10.1016/j.jfa.2008.12.023
1. Introduction

No symbolic calculus of operators is more popular or better known than the Weyl calculus. It is the one that associates to a function $S = S(x, \xi)$ of $n + n$ variables, lying in $\mathcal{S}(\mathbb{R}^n \times \mathbb{R}^n)$, the operator $\mathrm{Op}(S)$, called the operator with symbol $S$, defined by the equation
$$(\mathrm{Op}(S)u)(x) = \int_{\mathbb{R}^n \times \mathbb{R}^n} S\Big(\frac{x+y}{2}, \eta\Big)\, e^{2i\pi\langle x-y,\eta\rangle}\, u(y)\, dy\, d\eta: \tag{1.1}$$
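such a linear operator extends as a continuous operator from $\mathcal{S}'(\mathbb{R}^n)$ to $\mathcal{S}(\mathbb{R}^n)$ while, in the case when $S \in \mathcal{S}'(\mathbb{R}^n \times \mathbb{R}^n)$, one can still define $\mathrm{Op}(S)$ as a linear operator from $\mathcal{S}(\mathbb{R}^n)$ to $\mathcal{S}'(\mathbb{R}^n)$; also, $\mathrm{Op}$ sets up an isometry from $L^2(\mathbb{R}^n \times \mathbb{R}^n)$ onto the space of Hilbert–Schmidt operators on $L^2(\mathbb{R}^n)$. The sharp composition $S_1 \# S_2$ of two symbols, say lying in $\mathcal{S}(\mathbb{R}^n \times \mathbb{R}^n)$, is that which makes the formula
$$\mathrm{Op}(S_1)\,\mathrm{Op}(S_2) = \mathrm{Op}(S_1 \# S_2), \tag{1.2}$$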
in which the left-hand side denotes the usual composition of operators, valid. The image of the Heisenberg representation is the group of unitary transformations $\exp(2i\pi(\langle\eta, Q\rangle - \langle y, P\rangle - t))$ of $L^2(\mathbb{R}^n)$, as made meaningful by Stone's theorem, where the $j$th component of the vector $Q = (Q_1, \dots, Q_n)$ is the multiplication by the $j$th coordinate $x_j$, $P = (P_1, \dots, P_n)$ with $P_j = \frac{1}{2i\pi}\frac{\partial}{\partial x_j}$, and $y, \eta \in \mathbb{R}^n$, $t \in \mathbb{R}$. Introducing on $\mathbb{R}^n \times \mathbb{R}^n$ the symplectic form $[\,,]$ such that
$$[(x, \xi), (y, \eta)] = -\langle x, \eta\rangle + \langle y, \xi\rangle, \tag{1.3}$$
let us use on $\mathbb{R}^n \times \mathbb{R}^n$ the symplectic Fourier transformation $\mathcal{F}$ defined by the equation
$$(\mathcal{F}S)(X) = \int_{\mathbb{R}^n \times \mathbb{R}^n} S(Y)\, e^{-2i\pi[X,Y]}\, dY, \tag{1.4}$$
which commutes with all symplectic linear transformations of the variable in $\mathbb{R}^n \times \mathbb{R}^n$. Another, fully equivalent, way to define the Weyl calculus is by means of the equation
$$\mathrm{Op}(S) = \int_{\mathbb{R}^n \times \mathbb{R}^n} (\mathcal{F}S)(y, \eta)\, \exp\big(2i\pi(\langle\eta, Q\rangle - \langle y, P\rangle)\big)\, dy\, d\eta. \tag{1.5}$$
The first covariance rule of the Weyl calculus is the observation that
$$\exp\big(2i\pi(\langle\eta, Q\rangle - \langle y, P\rangle)\big)\, \mathrm{Op}(S)\, \exp\big(-2i\pi(\langle\eta, Q\rangle - \langle y, P\rangle)\big) = \mathrm{Op}\big((x, \xi) \mapsto S(x - y, \xi - \eta)\big). \tag{1.6}$$
One way to emphasize this action on symbols of the group of translations of $\mathbb{R}^{2n}$ is to decompose in a systematic way the space of symbols $L^2(\mathbb{R}^{2n})$ with respect to this action. Now, the operators which commute with it are just the partial differential operators with constant coefficients: the generalized joint eigenfunctions of these are exactly the exponentials $X = (x, \xi) \mapsto e^{2i\pi[A,X]}$ with $A \in \mathbb{R}^{2n}$, and the sought-after decomposition of a symbol is provided by the symplectic Fourier transformation. On the other hand, if $A = (y, \eta)$, the operator with symbol $e^{2i\pi[A,X]}$ is none other than the operator $\exp(2i\pi(\langle\eta, Q\rangle - \langle y, P\rangle))$, so that Heisenberg's commutation relation, expressed in Weyl's exponential version, takes the form
$$e^{2i\pi[A_1,X]} \,\#\, e^{2i\pi[A_2,X]} = e^{i\pi[A_1,A_2]}\, e^{2i\pi[A_1+A_2,X]}. \tag{1.7}$$
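Before coming to the point of the present work, let us briefly recall a few immediate consequences of this relation. First, one has (say, when $S_1$ and $S_2$ lie in $\mathcal{S}(\mathbb{R}^{2n})$), using (1.5), the integral composition formula
$$(S_1 \# S_2)(X) = 2^{2n} \int_{\mathbb{R}^{2n} \times \mathbb{R}^{2n}} S_1(Y)\, S_2(Z)\, e^{-4i\pi[Y-X,\,Z-X]}\, dY\, dZ \tag{1.8}$$
or (a fully equivalent one)
$$(S_1 \# S_2)(X) = \exp(i\pi L)\big(S_1(Y)\, S_2(Z)\big) \quad (Y = Z = X) \tag{1.9}$$
with (setting $Y = (y, \eta)$, $Z = (z, \zeta)$)
$$i\pi L = \frac{1}{4i\pi} \sum_{j=1}^{n} \Big( -\frac{\partial^2}{\partial y_j\, \partial \zeta_j} + \frac{\partial^2}{\partial z_j\, \partial \eta_j} \Big). \tag{1.10}$$
Expanding the exponential into a series, one obtains the so-called Moyal formula
$$(S_1 \# S_2)(x, \xi) = \sum_{\alpha,\beta} \frac{(-1)^{|\alpha|}}{\alpha!\,\beta!} \Big(\frac{1}{4i\pi}\Big)^{|\alpha|+|\beta|} \Big(\frac{\partial}{\partial x}\Big)^{\alpha} \Big(\frac{\partial}{\partial \xi}\Big)^{\beta} S_1(x, \xi)\; \Big(\frac{\partial}{\partial x}\Big)^{\beta} \Big(\frac{\partial}{\partial \xi}\Big)^{\alpha} S_2(x, \xi). \tag{1.11}$$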
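As a small illustration of (1.11), the following sketch evaluates the Moyal sum for polynomial symbols in the one-dimensional case $n = 1$, where the sum has finitely many non-zero terms. The dictionary representation of polynomials and the helper names (`deriv`, `mul`, `add`, `moyal`) are assumptions made for this example only; they do not come from the paper.

```python
from math import factorial, pi

H = 1 / (4j * pi)  # the constant 1/(4 i pi) appearing in (1.10) and (1.11)

def deriv(p, kx, kxi):
    """Apply (d/dx)^kx (d/dxi)^kxi to p = {(a, b): c}, meaning sum of c * x^a * xi^b."""
    out = {}
    for (a, b), c in p.items():
        if a >= kx and b >= kxi:
            f = (factorial(a) // factorial(a - kx)) * (factorial(b) // factorial(b - kxi))
            key = (a - kx, b - kxi)
            out[key] = out.get(key, 0) + c * f
    return out

def mul(p, q):
    """Pointwise product of two polynomials."""
    out = {}
    for (a, b), c in p.items():
        for (a2, b2), c2 in q.items():
            key = (a + a2, b + b2)
            out[key] = out.get(key, 0) + c * c2
    return out

def add(p, q, scale=1):
    """Return p + scale * q."""
    out = dict(p)
    for key, c in q.items():
        out[key] = out.get(key, 0) + scale * c
    return out

def degree(p):
    return max((a + b for (a, b) in p), default=0)

def moyal(s1, s2):
    """Sharp product of polynomial symbols via the (finite) sum (1.11), n = 1."""
    out = {}
    top = max(degree(s1), degree(s2))  # derivatives of higher order vanish
    for alpha in range(top + 1):
        for beta in range(top + 1):
            coeff = (-1) ** alpha / (factorial(alpha) * factorial(beta)) * H ** (alpha + beta)
            term = mul(deriv(s1, alpha, beta), deriv(s2, beta, alpha))
            out = add(out, term, coeff)
    return out

X = {(1, 0): 1}   # the symbol x
XI = {(0, 1): 1}  # the symbol xi
commutator = add(moyal(X, XI), moyal(XI, X), -1)
print(commutator)  # only the constant term is non-zero
```

With these conventions the constant term of `commutator` is $-2/(4i\pi) = i/(2\pi)$, which agrees with the commutator $[Q, P]$ when $P = \frac{1}{2i\pi}\frac{d}{dx}$.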
This formula is an exact one in the case when the two operators under consideration are differential operators, which means exactly that their symbols (of course, not in $\mathcal{S}(\mathbb{R}^{2n})$) are polynomial with respect to the variables $\xi$, with coefficients depending on $x$ in a smooth, but otherwise fairly arbitrary way; it is also exact when one of the two symbols is a polynomial in $(x, \xi)$. As it turns out, this version of the composition formula is the only universally known one. Indeed, it has considerable importance in applications of pseudodifferential analysis to partial differential equations: classes of symbols for which the above formula, without being an exact one, still has some asymptotic value, provide a good proportion of the auxiliary operators needed for the solution of PDE problems. In the conclusion, however, we shall illustrate on one example why this may sometimes fail and call instead for the composition formula which is the object of the present paper.

Our derivation of (1.8) was obtained as the result of pairing the concept of sharp composition of symbols with the decomposition of symbols according to the action by translations of the group $\mathbb{R}^{2n}$: the success of this point of view was essentially dependent on the fact that this action is an ingredient of the covariance formula (1.6). This takes us to the aim of the present paper: to take advantage of the other covariance property of the Weyl calculus (to be recalled now) and follow the same policy.

Recall that the metaplectic representation Met in $L^2(\mathbb{R}^n)$ is a certain unitary representation [15] of the twofold cover of the symplectic group $Sp(n, \mathbb{R})$, which consists of all linear transformations $g$ of $\mathbb{R}^n \times \mathbb{R}^n$ such that $[gX, gY] = [X, Y]$ for every pair $(X, Y)$ of points of $\mathbb{R}^n \times \mathbb{R}^n$: it acts irreducibly on each of the two subspaces of $L^2(\mathbb{R}^n)$ consisting of functions with a given parity. Unitary transformations in the image of the metaplectic representation also act as automorphisms of the space $\mathcal{S}(\mathbb{R}^n)$ or of the space $\mathcal{S}'(\mathbb{R}^n)$: moreover, if such a unitary transformation $U$ lies above $g \in Sp(n, \mathbb{R})$, and if $S \in \mathcal{S}'(\mathbb{R}^{2n})$, one has the covariance formula
$$U\, \mathrm{Op}(S)\, U^{-1} = \mathrm{Op}\big(S \circ g^{-1}\big). \tag{1.12}$$
In full analogy with the procedure adopted above in connection with the Heisenberg representation, we now start from a decomposition of the phase space representation (g, S) → S ◦ g −1 of Sp(n, R) in L2 (R2n ) into irreducibles: this is just the same as decomposing functions in L2 (R2n ) as integral superpositions of functions homogeneous of a given degree, and with a given parity. Our main result is the formula which takes the place of (1.7): it decomposes the sharp product of two symbols h1 and h2 , homogeneous of degrees −n − iλ1 and −n − iλ2 and with parities characterized by indices δ1 and δ2 , as an integral superposition of functions homogeneous of degrees −n − iλ, with the parity δ ≡ δ1 + δ2 . It involves the integral kernel
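$$\big|[Y, X]\big|_{\varepsilon_2}^{\frac{-n-i\lambda+i\lambda_1-i\lambda_2}{2}}\; \big|[X, Z]\big|_{\varepsilon_1}^{\frac{-n-i\lambda-i\lambda_1+i\lambda_2}{2}}\; \big|[Z, Y]\big|_{\varepsilon}^{\frac{-n+i\lambda+i\lambda_1+i\lambda_2}{2}}, \tag{1.13}$$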
a product of three signed powers, obtained from the decomposition into homogeneous components with respect to the three variables of the integral kernel which occurs in the composition formula (1.8). Some preparation is needed in order to give this kernel a genuine meaning as a distribution, not only as a partially defined function.

The principle of the proof of the new composition formula is simple, and relies on the decomposition of symbols into hyperplane waves, and the dual notion of rays. Its main difficulty lies in the singular nature of such distributions, which are nevertheless the only ones, sufficiently general, for which explicit computations are possible. In the one-dimensional case, the integral kernel above reduces to a function
$$J(x, y, z) = |x - y|_{\varepsilon_2}^{\frac{-1-i\lambda+i\lambda_1-i\lambda_2}{2}}\; |z - x|_{\varepsilon_1}^{\frac{-1-i\lambda-i\lambda_1+i\lambda_2}{2}}\; |y - z|_{\varepsilon}^{\frac{-1+i\lambda+i\lambda_1+i\lambda_2}{2}} \tag{1.14}$$
of three real variables, and the composition formula was treated along these lines in [12, Section 17]. It is true that the proof, in the higher-dimensional case, is actually, for the main part, a reduction to the one-dimensional case: but signed powers of linear forms with exponents lying on the line $-n + i\mathbb{R}$, the consideration of which is necessary for spectral-theoretic reasons, are more singular distributions when $n \ge 2$, which has made some technical improvements necessary.

It may be interesting to recall briefly what can be done in the one-dimensional case in relation to automorphic distribution theory. In the automorphic situation, the integral kernel (1.14) enables one to build new non-holomorphic modular forms from given pairs of such. In [11], one of this paper's authors introduced the notion of automorphic distribution: this is a distribution in $\mathbb{R}^2$ invariant under linear changes of coordinates associated to elements of some arithmetic subgroup of $SL(2, \mathbb{R})$, for instance $SL(2, \mathbb{Z})$. This concept is equivalent, in a non-trivial way, to the Lax–Phillips notion of pairs of non-holomorphic modular forms, as introduced in their scattering theory [7] for the automorphic wave equation. Automorphic distributions can be taken as symbols in the Weyl calculus and, at the price of important difficulties, the one-dimensional case of the analysis of sharp-products in the present paper can be developed in the automorphic environment. Things are more interesting, in some sense, since besides a continuous part, in which Eisenstein distributions serve as generalized eigenfunctions, the automorphic Euler operator has a discrete spectrum, and the corresponding eigendistributions are cusp-distributions. Finding the appropriate composition formulas calls for the explicit computation of integrals of $J(x, y, z)$ against three non-holomorphic modular forms, in the realization of these as distributions on the line invariant under representations taken from the principal series of the arithmetic subgroup of $SL(2, \mathbb{R})$ under consideration: this has been completed to a large extent, for the case of the full modular group, in [12] (cf. in particular Section 16), and it provides a pseudodifferential-theoretic approach to such notions as L-functions, convolution L-functions, etc.
As a preparation for automorphic pseudodifferential analysis, and in view of other applications as well, either to arithmetic or to quantization theory, a study of the integral kernel (1.14) had been made in [11]. It has also been considered recently in [8], in the automorphic case (for its own sake, not in connection with pseudodifferential analysis), and we take it from the references there that, outside the automorphic environment, it had already appeared in [9]: note that the objects called automorphic distributions in [8] are not the same as those in [11,12] (they are close to what was called modular distributions in [11]).

Obviously, it would be of great interest to push the present composition formula for $n$-dimensional pseudodifferential analysis up to an automorphic environment, despite the great difficulties experienced with automorphic pseudodifferential analysis in the one-dimensional case. In any case, linking pseudodifferential analysis to harmonic analysis, then to modular form theory (also the subject of [13], though the connection between these domains is different there) is certain to bring rewards in the future. In a non-automorphic environment, the basic idea put forward in the present paper, namely that of building composition formulas from the pairing of covariance with the decomposition of representations into irreducibles, may also [12, Section 19] be of use whenever some symbolic calculus of operators is examined, thus finding its place within quantization theory in general.

2. Decomposing the action of the symplectic group on $L^2(\mathbb{R}^n \times \mathbb{R}^n)$

Consider the linear space $\mathbb{R}^n \times \mathbb{R}^n$ with its canonical symplectic form (1.3) and measure $dx\, d\xi$: we also set, when convenient, $X = (x, \xi)$. The symplectic group $G = Sp(n, \mathbb{R})$ is the group of linear transformations $g$ of $\mathbb{R}^n \times \mathbb{R}^n$ which preserve the symplectic form, i.e., satisfy the identity $[gX, gY] = [X, Y]$ for any pair $X, Y$ of points of $\mathbb{R}^{2n}$.
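To make the defining identity $[gX, gY] = [X, Y]$ concrete, here is a minimal sketch for $n = 2$. It uses the shear $(x, \xi) \mapsto (x + B\xi, \xi)$, which preserves the form (1.3) when the matrix $B$ is symmetric and in general fails to do so otherwise. The function names and the choice of a shear as test transformation are assumptions made for illustration, not constructions taken from the paper.

```python
def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

def bracket(X, Y):
    """Symplectic form (1.3): [(x, xi), (y, eta)] = -<x, eta> + <y, xi>."""
    x, xi = X
    y, eta = Y
    return -dot(x, eta) + dot(y, xi)

def shear(B, X):
    """Linear map (x, xi) -> (x + B xi, xi); symplectic when B is symmetric."""
    x, xi = X
    moved = [x[j] + dot(B[j], xi) for j in range(len(x))]
    return (moved, list(xi))

X = ([1, 2], [3, 4])
Y = ([5, 6], [7, 8])
B_sym = [[1, 2], [2, 3]]   # symmetric: the shear lies in Sp(2, R)
B_bad = [[0, 1], [0, 0]]   # not symmetric: the form is not preserved

print(bracket(X, Y))                              # value of [X, Y]
print(bracket(shear(B_sym, X), shear(B_sym, Y)))  # same value
print(bracket(shear(B_bad, X), shear(B_bad, Y)))  # a different value
```

Integer inputs are used so that the comparison is exact.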
The phase space representation of $G$ in $L^2(\mathbb{R}^{2n})$ is defined by the action $(g, h) \mapsto g.h$ such that $(g.h)(X) = h(g^{-1}X)$. It is unitary, and since all linear transformations on $\mathbb{R}^n \times \mathbb{R}^n$ preserve the parity of functions and commute with the Euler operator
$$2i\pi E = \sum_{j=1}^{n} \Big( x_j \frac{\partial}{\partial x_j} + \xi_j \frac{\partial}{\partial \xi_j} \Big) + n \tag{2.1}$$
(the additional constant turns $E$ into a formally self-adjoint operator on $L^2(\mathbb{R}^n \times \mathbb{R}^n)$), the (extension of the) phase space representation under study preserves the linear space of functions on $\mathbb{R}^{2n} \setminus \{0\}$ homogeneous of a given degree, and with a given parity. Given $h \in L^2(\mathbb{R}^{2n})$, we first decompose it into its even and odd parts. Then, setting for every real number $s \neq 0$ and $\alpha \in \mathbb{C}$
$$|s|_0^{\alpha} = |s|^{\alpha}, \qquad |s|_1^{\alpha} = \langle s \rangle^{\alpha} = |s|^{\alpha} \operatorname{sign} s, \tag{2.2}$$
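we may write
$$h = \sum_{\delta=0,1} \int_{-\infty}^{\infty} h_{i\lambda,\delta}\, d\lambda, \tag{2.3}$$
provided we set
$$h_{i\lambda,\delta}(X) = \frac{1}{4\pi} \int_{-\infty}^{\infty} |t|_{\delta}^{n-1+i\lambda}\, h(tX)\, dt. \tag{2.4}$$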
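The signed powers (2.2) and the component (2.4) can be explored numerically. The sketch below, for $n = 1$, approximates (2.4) by a midpoint rule and compares the value at a rescaled point $sX$ with $|s|_{\delta}^{-n-i\lambda}$ times the value at $X$, which is what the change of variable $t \mapsto t/s$ in (2.4) predicts. The test function, the truncation interval and the step count are assumptions chosen for this example.

```python
import math

def signed_power(s, alpha, delta):
    """|s|_delta^alpha as in (2.2): |s|^alpha, times sign(s) when delta = 1."""
    if s == 0:
        raise ValueError("signed powers are defined for s != 0 only")
    value = abs(s) ** complex(alpha)
    return value if (delta == 0 or s > 0) else -value

def component(h, X, lam, delta, n=1, T=8.0, steps=40000):
    """Midpoint-rule approximation of (2.4) on [-T, T]; steps is even, so t = 0 is never sampled."""
    dt = 2 * T / steps
    total = 0j
    for k in range(steps):
        t = -T + (k + 0.5) * dt
        total += signed_power(t, n - 1 + 1j * lam, delta) * h(tuple(t * c for c in X))
    return total * dt / (4 * math.pi)

def h(X):
    """An odd, rapidly decreasing test function on R x R (an arbitrary choice)."""
    x, xi = X
    return (x + 2 * xi) * math.exp(-math.pi * (x * x + xi * xi))

X0, s, lam = (0.6, 0.3), -1.5, 1.0
lhs = component(h, (s * X0[0], s * X0[1]), lam, 1)
rhs = signed_power(s, -1 - 1j * lam, 1) * component(h, X0, lam, 1)
print(abs(lhs - rhs))                # small: homogeneity of degree -1 - i*lam, odd parity
print(abs(component(h, X0, lam, 0)))  # small: an odd function has no even component
```

The quadrature is crude (the integrand is only mildly regular at $t = 0$), so the agreement is approximate rather than exact.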
Then, $h_{i\lambda,\delta}$ is homogeneous of degree $-n - i\lambda$ and has the parity associated to $\delta$: we shall refer to the pair $(-n - i\lambda, \delta)$ as the type of $h_{i\lambda,\delta}$. More generally, we may consider on $\mathbb{R}^{2n} \setminus \{0\}$ functions of type $(-n - \nu, \delta)$ for an arbitrary complex parameter $\nu$.

So as to cut down, as is needed, the dimension by 1, one may realize functions of a given type as sections of some appropriate line bundle over the projective space $P^{2n-1}(\mathbb{R})$. We first need to introduce the so-called tautological bundle $E_{\mathbb{C}}$ over $P^{2n-1}(\mathbb{R})$, the fibre of which above a point $p(\theta)$ ($p$ being the canonical map: $\mathbb{R}^{2n} \setminus \{0\} \to P^{2n-1}(\mathbb{R})$) is the complex line $\mathbb{C}\theta$ in $\mathbb{C}^{2n}$. Incidentally, note that the total space of the real line analogue $E_{\mathbb{R}}$ of this bundle is just the blown-up space of $\mathbb{R}^{2n}$, which is used consistently for desingularization purposes, as will be the case in the next section.

A canonical set of charts of $P^{2n-1}(\mathbb{R})$ is obtained in the following way: given a vector $S \in \mathbb{R}^{2n} \setminus \{0\}$, set $\Omega_S = \{\theta \in \mathbb{R}^{2n} : [\theta, S] \neq 0\}$ and, in $\omega_S = p(\Omega_S)$, take the chart $p(\theta) \mapsto \frac{\theta}{[\theta, S]}$, which identifies $\omega_S$ with the affine hyperplane $M_S = \{X \in \mathbb{R}^{2n} : [X, S] = 1\}$. Above $M_S$, a section of $E_{\mathbb{C}}$ can be identified with a complex-valued function $f_S$, associating to such a function the section $X \mapsto f_S(X)X$. Note that, if $X \in M_S$ satisfies $[X, T] \neq 0$ for some new vector $T \in \mathbb{R}^{2n} \setminus \{0\}$, the points $X \in M_S$ and $\frac{X}{[X, T]} \in M_T$ are truly the images, under the charts associated with $S$ and $T$, of the same point in $P^{2n-1}(\mathbb{R})$. Identifying $f_S(X)X$ with $f_T(Y)Y$, where we have set $Y = \frac{X}{[X, T]}$, leads to the compatibility condition
$$f_T\Big(\frac{X}{[X, T]}\Big) = [X, T]\, f_S(X), \tag{2.5}$$
which defines the transition functions of the line bundle $E_{\mathbb{C}}$.
More generally, given $(\mu, \delta)$ with $\mu \in \mathbb{C}$ and $\delta = 0$ or $1$, define the signed power $|(E_{\mathbb{C}})|_{\delta}^{\mu}$ of $E_{\mathbb{C}}$ by taking the corresponding signed powers of the transition functions: then, a section of the line bundle $|(E_{\mathbb{C}})|_{\delta}^{\mu}$ is associated to a set $(f_S)$ of functions, $f_S$ defined in $M_S$, satisfying the requirement that
$$f_T\Big(\frac{X}{[X, T]}\Big) = \big|[X, T]\big|_{\delta}^{\mu}\, f_S(X) \tag{2.6}$$
whenever $X \in M_S$ and $[X, T] \neq 0$. Then, a function $h$ of type $(-n - \nu, \delta)$ can be identified with the section of $|(E_{\mathbb{C}})|_{\delta}^{n+\nu}$ characterized by the fact that, for every $S \in \mathbb{R}^{2n} \setminus \{0\}$, $f_S$ is the restriction of $h$ to $M_S$. Conversely, any function $f$ in $M_S$ uniquely lifts as a function $f^{\natural}$ in the part of $\mathbb{R}^{2n} \setminus \{0\}$ consisting of vectors $\theta$ such that $[\theta, S] \neq 0$, to wit the one defined by the equation
$$f^{\natural}(\theta) = \big|[\theta, S]\big|_{\delta}^{-n-\nu}\, f\Big(\frac{\theta}{[\theta, S]}\Big). \tag{2.7}$$
The representation $\pi_{\nu,\delta}$ from the full, non-unitary principal series of $Sp(n, \mathbb{R})$ is by definition the restriction of the phase space representation of $Sp(n, \mathbb{R})$ (again, this is defined by the assignment $(g, h) \mapsto h \circ g^{-1}$) to the space of functions in $\mathbb{R}^{2n} \setminus \{0\}$ of type $(-n - \nu, \delta)$. It will be convenient (but there is a price to pay) not to have to change the hyperplane $M_S$ consistently, and we denote as $M_0$ the one which should really be denoted as $M_{e_1}$ (where $e_1$ is the first vector from the canonical basis of $\mathbb{R}^n \times \mathbb{R}^n$), i.e., the one consisting of vectors $X = (x; \xi) \in \mathbb{R}^n \times \mathbb{R}^n$ such that $\xi_1 = 1$. Starting from (2.7) and using the fact that $f^{\natural}$ is of type $(-n - \nu, \delta)$, together with the relation $[g^{-1}X, e_1] = [X, ge_1]$, one obtains the relation
$$\big(\pi_{\nu,\delta}(g)f\big)(X) = \big|[X, ge_1]\big|_{\delta}^{-n-\nu}\, f\Big(\frac{g^{-1}X}{[X, ge_1]}\Big). \tag{2.8}$$
As an example, when $n = 1$ and $g = \begin{pmatrix} a & b \\ c & d \end{pmatrix}$, starting from $X = \begin{pmatrix} x \\ 1 \end{pmatrix}$, so that $g^{-1}X = \begin{pmatrix} dx - b \\ -cx + a \end{pmatrix}$, one obtains, after one has abbreviated $f\begin{pmatrix} x \\ 1 \end{pmatrix}$ as $f(x)$, the relation
$$\big(\pi_{\nu,\delta}(g)f\big)(x) = |-cx + a|_{\delta}^{-1-\nu}\, f\Big(\frac{dx - b}{-cx + a}\Big). \tag{2.9}$$
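Formula (2.9) lends itself to a quick numerical sanity check of the representation property $\pi_{\nu,\delta}(g_1)\pi_{\nu,\delta}(g_2) = \pi_{\nu,\delta}(g_1 g_2)$, which follows from the definition $(g, h) \mapsto h \circ g^{-1}$. The sketch below is only an illustration: the matrices, the parameter $\nu$, the test function and all helper names are assumptions made for the example.

```python
import math

def signed_power(s, alpha, delta):
    """|s|_delta^alpha as in (2.2)."""
    if s == 0:
        raise ValueError("signed powers are defined for s != 0 only")
    value = abs(s) ** complex(alpha)
    return value if (delta == 0 or s > 0) else -value

def rep(g, f, nu, delta):
    """Return the function pi_{nu,delta}(g) f given by (2.9), for g = ((a, b), (c, d))."""
    (a, b), (c, d) = g
    def transformed(x):
        denom = -c * x + a
        return signed_power(denom, -1 - nu, delta) * f((d * x - b) / denom)
    return transformed

def matmul(g1, g2):
    (a1, b1), (c1, d1) = g1
    (a2, b2), (c2, d2) = g2
    return ((a1 * a2 + b1 * c2, a1 * b2 + b1 * d2),
            (c1 * a2 + d1 * c2, c1 * b2 + d1 * d2))

g1 = ((2, 1), (1, 1))   # determinant 1
g2 = ((3, 2), (1, 1))   # determinant 1
nu = 0.3 + 0.7j
f = lambda x: math.exp(-x * x)

for delta in (0, 1):
    for x in (0.4, 3.0):  # at x = 3.0 one of the denominators is negative, so the parity matters
        lhs = rep(g1, rep(g2, f, nu, delta), nu, delta)(x)
        rhs = rep(matmul(g1, g2), f, nu, delta)(x)
        print(delta, x, abs(lhs - rhs))
```

The check relies on the multiplicativity $|st|_{\delta}^{\alpha} = |s|_{\delta}^{\alpha}\, |t|_{\delta}^{\alpha}$ of signed powers.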
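Still specializing, for the time being, in the hyperplane $M_0$, we set
$$X = (x; \xi) = (x_1, x_*; \xi_1, \xi_*), \tag{2.10}$$
and denote as $h^{\flat}_{i\lambda,\delta}$ the restriction of $h_{i\lambda,\delta}$ to $M_0$ (it is the same as the function which would have been denoted as $(h_{i\lambda,\delta})_{e_1}$ in the less specialized setting above). One has the reciprocal equations
$$h^{\flat}_{i\lambda,\delta}(x; \xi_*) = h_{i\lambda,\delta}(x; 1, \xi_*), \qquad h_{i\lambda,\delta}(x; \xi) = |\xi_1|_{\delta}^{-n-i\lambda}\, h^{\flat}_{i\lambda,\delta}\Big(\frac{x}{\xi_1}; \frac{\xi_*}{\xi_1}\Big). \tag{2.11}$$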
Remark 2.1. Under the preceding pair of equations, the functions $h_{i\lambda,\delta}$ and $h^{\flat}_{i\lambda,\delta}$ are virtually indistinguishable, once the type $(-n - i\lambda, \delta)$ has been fixed. Using the second notion will be useful in connection with all concepts using integrals, such as integral operators, norms, . . . . However, the first point of view is more intrinsic, and is especially useful (since some singularities could lie "at infinity" relative to the chosen hyperplane $M_0$) when, as will be the case in Section 4, we need to extend the representation $\pi_{\nu,\delta}$ or the intertwining operator to be introduced below to a distribution setting.

Proposition 2.1. The space $L^2(\mathbb{R}^{2n})$ can be decomposed as the Hilbert direct integral
$$L^2(\mathbb{R}^{2n}) \sim \sum_{\delta=0,1} \int^{\oplus} H_{i\lambda,\delta}\, d\lambda, \tag{2.12}$$
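if one denotes as $H_{i\lambda,\delta}$ the inverse image under the map $h_{i\lambda,\delta} \mapsto h^{\flat}_{i\lambda,\delta}$ of the space $L^2(M_0; dx\, d\xi_*)$: the decomposition is provided by (2.3), and it commutes with the phase space representation of $G$ in $L^2(\mathbb{R}^{2n})$.

Proof. What remains to be done is proving the equation
$$\|h\|^2_{L^2(\mathbb{R}^{2n})} = 4\pi \sum_{\delta=0,1} \int_{-\infty}^{\infty} \big\|h^{\flat}_{i\lambda,\delta}\big\|^2_{L^2(M_0)}\, d\lambda, \tag{2.13}$$
using on $M_0$ the measure $dx\, d\xi_*$. Indeed, with $h^{(\delta)} = h_{\mathrm{even}}$ or $h_{\mathrm{odd}}$ according to the parity of $\delta$, set
$$\phi_X(s) = e^{2\pi n s}\, h^{(\delta)}\big(e^{2\pi s} X\big), \qquad s \in \mathbb{R},\ X \in \mathbb{R}^{2n} \setminus \{0\}, \tag{2.14}$$
so that
$$\hat{\phi}_X(\lambda) = h_{i\lambda,\delta}(X). \tag{2.15}$$
The one-dimensional Fourier inversion formula then yields (2.3) (of course, using the Mellin transform rather than coupling a Fourier transform with the change of variable $t = e^{2\pi s}$ would be more natural: the choice really depends on your familiarity with the inversion formula in both cases). Next, using (2.11) and the Plancherel formula for the Fourier transformation,
$$\begin{aligned} \big\|h^{(\delta)}\big\|^2_{L^2(\mathbb{R}^{2n})} &= 4\pi \int_{-\infty}^{\infty} e^{2\pi s}\, ds \int_{\mathbb{R}^{2n-1}} \big|h^{(\delta)}\big(x; e^{2\pi s}, \xi_*\big)\big|^2\, dx\, d\xi_* \\ &= 4\pi \int_{-\infty}^{\infty} ds \int_{\mathbb{R}^{2n-1}} \big|\phi_{(x;1,\xi_*)}(s)\big|^2\, dx\, d\xi_* \\ &= 4\pi \int_{\mathbb{R}^{2n-1}} dx\, d\xi_* \int_{-\infty}^{\infty} \big|\hat{\phi}_{(x;1,\xi_*)}(\lambda)\big|^2\, d\lambda \\ &= 4\pi \int_{\mathbb{R}^{2n-1}} dx\, d\xi_* \int_{-\infty}^{\infty} \big|h^{\flat}_{i\lambda,\delta}(x; \xi_*)\big|^2\, d\lambda, \end{aligned} \tag{2.16}$$
which proves (2.13). □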
The decomposition above gives rise to the series $(\pi_{i\lambda,\delta})_{\lambda \in \mathbb{R},\ \delta = 0,1}$ of representations of $G$ in $L^2(M_0)$, a special case of the representations $\pi_{\nu,\delta}$ already considered; it suffices to set
$$\pi_{i\lambda,\delta}(g)\, h^{\flat}_{i\lambda,\delta} = f^{\flat}_{i\lambda,\delta} \tag{2.17}$$
if $h \in L^2(\mathbb{R}^{2n})$, $g \in G$, $f = h \circ g^{-1}$. Each representation $\pi_{i\lambda,\delta}$ is unitary as a consequence of Proposition 2.1: to show that $\|\pi_{i\lambda,\delta}(g)\, h^{\flat}_{i\lambda,\delta}\| = \|h^{\flat}_{i\lambda,\delta}\|$ for every $\lambda$ such that $h^{\flat}_{i\lambda,\delta} \in L^2(M_0)$, not only almost every $\lambda$, it suffices to start from a dense space of functions $h$ such that $h^{\flat}_{i\lambda,\delta}$ depends in a continuous way on $\lambda$, which is ensured for instance when $h$ lies in $\mathcal{S}(\mathbb{R}^{2n})$. Recall (cf. Remark 2.1) that we also set $\pi_{i\lambda,\delta}(g)\, h_{i\lambda,\delta} = f_{i\lambda,\delta}$. In Section 7, it will be proved that most representations $\pi_{i\lambda,\delta}$ are irreducible.

Remark 2.2. When integrating on $M_S$, we shall have to worry a lot about singularities: but we shall never have to worry about the contribution to integrals of the part of this hyperplane away from some compact subset because, in reality, we shall be dealing with integrals on the compact space $P^{2n-1}(\mathbb{R})$ and (say, with the help of partitions of unity), we could always, replacing the integral under consideration by a finite sum of integrals taken on distinct hyperplanes, replace for each term the integral by the integral taken on some compact subset of the corresponding hyperplane.

The (symplectic) Fourier transform of a function homogeneous of degree $-n - i\lambda$ with a given parity is homogeneous of degree $-n + i\lambda$, and has the same parity, so that, given $h \in L^2(\mathbb{R}^{2n})$, one has
$$\mathcal{F} h_{i\lambda,\delta} = (\mathcal{F}h)_{-i\lambda,\delta}: \tag{2.18}$$
consequently, the representations $\pi_{i\lambda,\delta}$ and $\pi_{-i\lambda,\delta}$ are unitarily equivalent.

Definition 2.2. The (unitary) intertwining operator $\theta_{i\lambda,\delta}$ is the one characterized by the validity of the equation
$$\theta_{i\lambda,\delta}\, h^{\flat}_{i\lambda,\delta} = (\mathcal{F}h)^{\flat}_{-i\lambda,\delta} \tag{2.19}$$
for every $h \in L^2(\mathbb{R}^{2n})$. We also set (cf. Remark 2.1)
$$\theta_{i\lambda,\delta}\, h_{i\lambda,\delta} = (\mathcal{F}h)_{-i\lambda,\delta}. \tag{2.20}$$
The proof that $\theta_{i\lambda,\delta}$ preserves the $L^2$-norm for every $\lambda$, not only almost every $\lambda$, is the same as the one which, in connection with the definition of $\pi_{i\lambda,\delta}$, followed (2.17). It is easy to make the unitary intertwining operator $\theta_{i\lambda,\delta}$ associated to (2.18) explicit in terms of the coordinates on $M_0$. Indeed, starting from (2.11), one can write
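$$\begin{aligned} (\mathcal{F}h)^{\flat}_{-i\lambda,\delta}(x; \xi_*) &= (\mathcal{F}h_{i\lambda,\delta})(x; 1, \xi_*) \\ &= \int |\eta_1|_{\delta}^{-n-i\lambda}\, h^{\flat}_{i\lambda,\delta}\Big(\frac{y}{\eta_1}; \frac{\eta_*}{\eta_1}\Big) \exp\big(2i\pi(x_1\eta_1 + \langle x_*, \eta_*\rangle - y_1 - \langle y_*, \xi_*\rangle)\big)\, dy\, d\eta_1\, d\eta_* \\ &= \int |\eta_1|_{\delta}^{n-1-i\lambda}\, h^{\flat}_{i\lambda,\delta}(y; \eta_*) \exp\big(2i\pi\eta_1(x_1 + \langle x_*, \eta_*\rangle - y_1 - \langle y_*, \xi_*\rangle)\big)\, dy\, d\eta_1\, d\eta_*. \end{aligned} \tag{2.21}$$
Making a one-dimensional Fourier transformation explicit, this gives another approach to the intertwining operator $\theta_{i\lambda,\delta}$ from $\pi_{i\lambda,\delta}$ to $\pi_{-i\lambda,\delta}$: the operator $\theta_{i\lambda,\delta}$ is defined formally as the operator with integral kernel
$$k_{i\lambda,\delta}(x, \xi_*; y, \eta_*) = i^{\delta}\, \pi^{\frac{1}{2}-n+i\lambda}\, \frac{\Gamma\big(\frac{n-i\lambda+\delta}{2}\big)}{\Gamma\big(\frac{1-n+i\lambda+\delta}{2}\big)}\, \big|x_1 - y_1 + \langle x_*, \eta_*\rangle - \langle y_*, \xi_*\rangle\big|_{\delta}^{-n+i\lambda}. \tag{2.22}$$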
Note that, while Definition 2.2 is a rigorous definition of the intertwining operator, (2.22) can only be used after some preparation, which will be done in Section 3.

While $X = (x; \xi)$ (or $Y = (y; \eta)$, . . .) will always denote a generic point in $\mathbb{R}^{2n}$, we shall draw attention to points $(x; 1, \xi_*) = (x_1, x_*; 1, \xi_*)$ of $M_0$ by denoting them as $X_*$: similarly, $Y_* = (y; 1, \eta_*)$. Given $X_* \in M_0$, we set $X_{**} = (x_*; \xi_*)$, so that one can also identify $X_*$ with $(x_1, X_{**})$. We abbreviate the measure $dx\, d\xi_*$ on $M_0$ as $dm(X_*)$. On $\mathbb{R}^{2n-2}$, one can also consider the symplectic form obtained from an appropriate restriction of the one available on $\mathbb{R}^{2n}$, i.e., set
$$[X_{**}, Y_{**}] = -\langle x_*, \eta_*\rangle + \langle y_*, \xi_*\rangle, \tag{2.23}$$
while, on $M_0$, one must define
$$[X_*, Y_*] = \big[\big((x_1, x_*); (1, \xi_*)\big), \big((y_1, y_*); (1, \eta_*)\big)\big] = -x_1 + y_1 - \langle x_*, \eta_*\rangle + \langle y_*, \xi_*\rangle. \tag{2.24}$$
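The restricted brackets (2.23) and (2.24) are simply the form (1.3) evaluated at points whose coordinate $\xi_1$ (resp. $\eta_1$) equals 1. A minimal sketch for $n = 2$ follows; the function names and the sample points are assumptions made for illustration.

```python
def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

def bracket(X, Y):
    """Symplectic form (1.3) on R^n x R^n, with X = (x, xi) and Y = (y, eta)."""
    x, xi = X
    y, eta = Y
    return -dot(x, eta) + dot(y, xi)

def bracket_on_M0(Xs, Ys):
    """Right-hand side of (2.24) for points of M_0, i.e. with xi_1 = eta_1 = 1."""
    (x, xi), (y, eta) = Xs, Ys
    assert xi[0] == 1 and eta[0] == 1, "both points must lie on M_0"
    return -x[0] + y[0] - dot(x[1:], eta[1:]) + dot(y[1:], xi[1:])

def bracket_reduced(Xs, Ys):
    """[X_**, Y_**] from (2.23): drop the first coordinate of x and of xi."""
    (x, xi), (y, eta) = Xs, Ys
    return bracket((x[1:], xi[1:]), (y[1:], eta[1:]))

Xs = ([1, 2], [1, 4])   # (x_1, x_*; 1, xi_*)
Ys = ([5, 6], [1, 8])   # (y_1, y_*; 1, eta_*)
print(bracket(Xs, Ys), bracket_on_M0(Xs, Ys))
print(-Xs[0][0] + Ys[0][0] + bracket_reduced(Xs, Ys))  # [X_*, Y_*] = -x_1 + y_1 + [X_**, Y_**]
```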
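One may then rewrite (2.22) as
$$(\theta_{i\lambda,\delta} f)(X_*) = i^{\delta}\, \pi^{\frac{1}{2}-n+i\lambda}\, \frac{\Gamma\big(\frac{n-i\lambda+\delta}{2}\big)}{\Gamma\big(\frac{1-n+i\lambda+\delta}{2}\big)} \int_{M_0} \big|[Y_*, X_*]\big|_{\delta}^{-n+i\lambda}\, f(Y_*)\, dm(Y_*). \tag{2.25}$$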
The intertwining operator may be better understood after some transformation. Denote as $\mathcal{F}_1$ the usual Fourier transformation as applied when emphasis is set on the first variable only of a function of several variables. Given a function $f$ on $M_0$, write it as $h^{\flat}_{i\lambda,\delta}$, which, according to (2.11), is possible in a unique way for a given pair $(i\lambda, \delta)$, so that the left-hand side of (2.21) is just $(\theta_{i\lambda,\delta} f)(x; \xi_*)$ according to (2.18). Starting from (2.21), one can then write, if $n \ge 2$,
$$\begin{aligned} (\mathcal{F}_1 \theta_{i\lambda,\delta} f)(t, x_*; \xi_*) &= (\mathcal{F}_1 \theta_{i\lambda,\delta} f)(t, X_{**}) \\ &= |t|_{\delta}^{n-1-i\lambda} \int_{M_0} f(y_1, Y_{**}) \exp\big(-2i\pi t\,(y_1 + [X_{**}, Y_{**}])\big)\, dy_1\, dY_{**} \\ &= |t|_{\delta}^{n-1-i\lambda} \int_{\mathbb{R}^{2n-2}} (\mathcal{F}_1 f)(t, Y_{**}) \exp\big(-2i\pi t\,[X_{**}, Y_{**}]\big)\, dY_{**}. \end{aligned} \tag{2.26}$$
In this definition of the intertwining operator, $\theta_{i\lambda,\delta}$ appears as the "product" of a one-dimensional intertwining operator with respect to the first variable and of a Fourier transformation in $\mathbb{R}^{2n-2}$: only, some rescaling, by the variable dual to the first one, is performed with respect to the last $2n - 2$ variables. As a straightforward application of this equation, note the formula, in which $\delta_2 := \delta_1 + \delta$,
$$(\mathcal{F}_1 \theta_{i\lambda_1,\delta_1} \theta_{i\lambda,\delta} f)(t, X_{**}) = |t|_{\delta_2}^{-i(\lambda_1+\lambda)}\, (\mathcal{F}_1 f)(t, X_{**}): \tag{2.27}$$
hence, the composition of the two intertwining operators under consideration reduces to an intertwining operator with respect to the first variable, with integral kernel
$$\big((x_1, X_{**}), (y_1, Y_{**})\big) \mapsto i^{\delta_2}\, \pi^{-\frac{1}{2}+i(\lambda_1+\lambda)}\, \frac{\Gamma\big(\frac{1-i(\lambda_1+\lambda)+\delta_2}{2}\big)}{\Gamma\big(\frac{i(\lambda_1+\lambda)+\delta_2}{2}\big)}\, |x_1 - y_1|_{\delta_2}^{-1+i(\lambda_1+\lambda)}\, \delta(X_{**} - Y_{**}). \tag{2.28}$$
At this point, it may be useful to clarify the respective roles of the coordinates $\xi_1$ and $x_1$, as they occur in what precedes. Isolating the coordinate $\xi_1$ is tantamount to singling out the affine hyperplane $M_0$, the equation of which is $[X, e_1] = 1$, while $[X, e_1] = \xi_1$ generally. The expression $\frac{\partial f}{\partial x_1}$, for $f \in C^\infty(M_0)$, is then the image of $f$ under a canonical operator on $M_0$, since it may be thought of as the Poisson bracket of the function $X \mapsto \xi_1$ with an arbitrary smooth extension of $f$ to the whole of $\mathbb{R}^{2n}$. One may interpret the convolution operator the integral kernel of which is given in (2.28) as a function (a signed power, of course), in the sense of functional calculus, of the operator $\frac{1}{2i\pi}\frac{\partial}{\partial x_1}$. On the other hand, the coordinate $x_1$ is not intrinsically attached to $M_0$: with the help of a well-chosen symplectic transformation preserving the coordinate $\xi_1$, it can be transformed to the sum of $x_1$ and of an arbitrary linear combination of $x_2, \dots, x_n, \xi_1, \dots, \xi_n$. Note, if $f \in L^2(M_0)$, the relation
\[
\overline{\pi_{i\lambda,\delta}(g)f} = \pi_{-i\lambda,\delta}(g)\bar f
\tag{2.29}
\]
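from which, polarizing the identity which expresses that $\pi_{i\lambda,\delta}$ is unitary, we obtain the identity
\[
\int_{M_0} f_2(X_*)\,f_1(X_*)\,dm(X_*) = \int_{M_0} \bigl(\pi_{-i\lambda,\delta}(g)f_2\bigr)(X_*)\,\bigl(\pi_{i\lambda,\delta}(g)f_1\bigr)(X_*)\,dm(X_*)
\tag{2.30}
\]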
involving a pair $(f_1, f_2)$ of functions in $L^2(M_0)$: this can also be regarded as a particular case of (2.27), to the effect that the inverse of the isometry $\theta_{i\lambda,\delta}$ is $\theta_{-i\lambda,\delta}$. Assuming convergence, one can extend (2.30) as
\[
\int_{M_0} f_2(X_*)\,f_1(X_*)\,dm(X_*) = \int_{M_0} \bigl(\pi_{-\nu,\delta}(g)f_2\bigr)(X_*)\,\bigl(\pi_{\nu,\delta}(g)f_1\bigr)(X_*)\,dm(X_*).
\tag{2.31}
\]
We now introduce the integral kernel obtained from the decomposition into homogeneous components of the integral kernel $e^{4i\pi[Y,X]}e^{4i\pi[X,Z]}e^{4i\pi[Z,Y]}$ which occurs in the composition formula (1.8). Consider on $\mathbb{R}^{2n}\times\mathbb{R}^{2n}\times\mathbb{R}^{2n}$ the (almost everywhere defined only) function
\[
(Y, Z; X) \mapsto |[Y,X]|_{\varepsilon_2}^{\alpha_1}\,|[X,Z]|_{\varepsilon_1}^{\alpha_2}\,|[Z,Y]|_{\varepsilon}^{\alpha_3},
\tag{2.32}
\]
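where the exponents and indices of parity are given. It is of type $(\alpha_1+\alpha_3,\ \varepsilon+\varepsilon_2 \bmod 2)$, resp. $(\alpha_2+\alpha_3,\ \varepsilon+\varepsilon_1 \bmod 2)$, resp. $(\alpha_1+\alpha_2,\ \varepsilon_1+\varepsilon_2 \bmod 2)$ with respect to $Y$, resp. $Z$, resp. $X$. Given a triple $(\nu_1, \nu_2, \nu)$ of complex numbers, and a triple $(\delta_1, \delta_2, \delta)$ of numbers equal to 0 or 1, satisfying the relation $\delta \equiv \delta_1+\delta_2 \bmod 2$, the system of equations
\[
\varepsilon_2+\varepsilon \equiv \delta_1,\qquad \varepsilon_1+\varepsilon \equiv \delta_2,\qquad \varepsilon_1+\varepsilon_2 \equiv \delta
\tag{2.33}
\]
for $\varepsilon, \varepsilon_1, \varepsilon_2 \bmod 2$ has two solutions, obtained as
\[
\varepsilon \equiv j+\delta,\qquad \varepsilon_1 \equiv j+\delta_1,\qquad \varepsilon_2 \equiv j+\delta_2
\tag{2.34}
\]
with $j = 0$ or $1$. Then, the types of the function above with respect to $Y, Z, X$ will be $(-n+\nu_1, \delta_1)$, $(-n+\nu_2, \delta_2)$ and $(-n-\nu, \delta)$ if and only if
\[
\alpha_1 = \frac{-n-\nu+\nu_1-\nu_2}{2},\qquad \alpha_2 = \frac{-n-\nu-\nu_1+\nu_2}{2},\qquad \alpha_3 = \frac{-n+\nu+\nu_1+\nu_2}{2}.
\tag{2.35}
\]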
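The bookkeeping in (2.33)–(2.35) lends itself to a mechanical check. The following Python sketch is only an illustration: the function names `parity_solutions`, `exponents` and `degrees` are ours, not the paper's. It enumerates the solutions of the parity system (2.33) and computes, for sample complex parameters, the degrees of homogeneity of (2.32) obtained from the exponents (2.35).

```python
from itertools import product


def parity_solutions(delta1, delta2):
    """All (eps, eps1, eps2) in {0,1}^3 solving the system (2.33),
    where delta is taken to be delta1 + delta2 mod 2."""
    delta = (delta1 + delta2) % 2
    return [
        (eps, eps1, eps2)
        for eps, eps1, eps2 in product((0, 1), repeat=3)
        if (eps2 + eps) % 2 == delta1
        and (eps1 + eps) % 2 == delta2
        and (eps1 + eps2) % 2 == delta
    ]


def exponents(n, nu, nu1, nu2):
    """The exponents (alpha1, alpha2, alpha3) of (2.35)."""
    a1 = (-n - nu + nu1 - nu2) / 2
    a2 = (-n - nu - nu1 + nu2) / 2
    a3 = (-n + nu + nu1 + nu2) / 2
    return a1, a2, a3


def degrees(n, nu, nu1, nu2):
    """Degrees of homogeneity of (2.32) with respect to Y, Z, X."""
    a1, a2, a3 = exponents(n, nu, nu1, nu2)
    return a1 + a3, a2 + a3, a1 + a2
```

For each choice of $(\delta_1, \delta_2)$ the first function should return exactly the two triples given by (2.34), and `degrees(n, nu, nu1, nu2)` should agree with $(-n+\nu_1,\ -n+\nu_2,\ -n-\nu)$ up to rounding.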
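Hence, provided that (2.33) is satisfied, the integral kernel
\[
J^{\varepsilon_1,\varepsilon_2;\varepsilon}_{\nu_1,\nu_2;\nu}(Y, Z; X) = |[Y,X]|_{\varepsilon_2}^{\frac{-n-\nu+\nu_1-\nu_2}{2}}\,|[X,Z]|_{\varepsilon_1}^{\frac{-n-\nu-\nu_1+\nu_2}{2}}\,|[Z,Y]|_{\varepsilon}^{\frac{-n+\nu+\nu_1+\nu_2}{2}}
\tag{2.36}
\]
in $(\mathbb{R}^{2n}\setminus\{0\})\times(\mathbb{R}^{2n}\setminus\{0\})\times(\mathbb{R}^{2n}\setminus\{0\})$ satisfies the covariance relation
\[
\pi_{\nu,\delta}(g)\Bigl(X \mapsto J^{\varepsilon_1,\varepsilon_2;\varepsilon}_{\nu_1,\nu_2;\nu}(Y, Z; X)\Bigr)
= \bigl[\pi_{-\nu_1,\delta_1}(g^{-1})\otimes\pi_{-\nu_2,\delta_2}(g^{-1})\bigr]\Bigl((Y, Z) \mapsto J^{\varepsilon_1,\varepsilon_2;\varepsilon}_{\nu_1,\nu_2;\nu}(Y, Z; X)\Bigr).
\tag{2.37}
\]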
We may also restrict this integral kernel to $M_0\times M_0\times M_0$: the relation of covariance is preserved, though with a slightly different understanding (cf. (2.17)). In the next section, we shall see, after we have given the integral kernel so obtained a meaning in an appropriate distribution sense, not only as a partially defined function, that if one denotes as $\mathcal{J}^{\varepsilon_1,\varepsilon_2;\varepsilon}_{\nu_1,\nu_2;\nu}$ the associated operator, thought of as being defined by the equation
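\[
\bigl(\mathcal{J}^{\varepsilon_1,\varepsilon_2;\varepsilon}_{\nu_1,\nu_2;\nu}(f_1, f_2)\bigr)(X_*) = \int_{M_0\times M_0} J^{\varepsilon_1,\varepsilon_2;\varepsilon}_{\nu_1,\nu_2;\nu}(Y_*, Z_*; X_*)\,f_1(Y_*)\,f_2(Z_*)\,dm(Y_*)\,dm(Z_*),
\tag{2.38}
\]
one has the covariance identity
\[
\pi_{\nu,\delta}(g)\,\mathcal{J}^{\varepsilon_1,\varepsilon_2;\varepsilon}_{\nu_1,\nu_2;\nu}(f_1, f_2) = \mathcal{J}^{\varepsilon_1,\varepsilon_2;\varepsilon}_{\nu_1,\nu_2;\nu}\bigl(\pi_{\nu_1,\delta_1}(g)f_1,\ \pi_{\nu_2,\delta_2}(g)f_2\bigr),
\tag{2.39}
\]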
formally immediate from (2.37) and (2.31). In the case when $f_1 = (h_1)_{\nu_1,\delta_1}$ and $f_2 = (h_2)_{\nu_2,\delta_2}$, we can, and shall sometimes, write $\mathcal{J}^{\varepsilon_1,\varepsilon_2;\varepsilon}_{\nu_1,\nu_2;\nu}((h_1)_{\nu_1,\delta_1}, (h_2)_{\nu_2,\delta_2})$ for $\mathcal{J}^{\varepsilon_1,\varepsilon_2;\varepsilon}_{\nu_1,\nu_2;\nu}(f_1, f_2)$. Also, as explained in Remark 2.1, the result can be regarded as a function in $\mathbb{R}^{2n}\setminus\{0\}$ of type $(-n-\nu, \delta)$ rather than, again, as being defined only on $M_0$.
3. The integral kernel $J^{\varepsilon_1,\varepsilon_2;\varepsilon}_{\nu_1,\nu_2;\nu}(Y, Z; X)$
In all this section, we deal with functions of a given type in their realizations as functions on $M_0$. Rather than trying to define $\mathcal{J}^{\varepsilon_1,\varepsilon_2;\varepsilon}_{\nu_1,\nu_2;\nu}(f_1, f_2)$, as in (2.38), as a function of $X_*$, we lower our requirements, only trying to define the expression
\[
\bigl\langle \mathcal{J}^{\varepsilon_1,\varepsilon_2;\varepsilon}_{\nu_1,\nu_2;\nu}(f_1, f_2),\ f\bigr\rangle = \int_{M_0\times M_0\times M_0} J^{\varepsilon_1,\varepsilon_2;\varepsilon}_{\nu_1,\nu_2;\nu}(Y_*, Z_*; X_*)\,f_1(Y_*)\,f_2(Z_*)\,f(X_*)\,dm(Y_*)\,dm(Z_*)\,dm(X_*)
\tag{3.1}
\]
for appropriate triples $(f_1, f_2, f)$. This is of course tantamount to a reinterpretation of $J^{\varepsilon_1,\varepsilon_2;\varepsilon}_{\nu_1,\nu_2;\nu}$ as a distribution of some kind, a notion dependent on that of $C^\infty$-vectors of the representations $\pi_{\nu_1,\delta_1}$, $\pi_{\nu_2,\delta_2}$, $\pi_{-\nu,\delta}$ involved (the sign change in the last subscript is an effect of duality: cf. (2.30)).
First, we observe that, though the representation $\pi_{\nu,\delta}$ is not unitary unless $\nu$ is pure imaginary, it is still useful to regard it as a representation in some Hilbert space, to wit the one defined by the equation
\[
\|f\|_\nu^2 = \int_{M_0} |f(X_*)|^2\,|X_*|^{2\operatorname{Re}\nu}\,dm(X_*):
\tag{3.2}
\]
here, $|X_*|^2 = |x|^2 + 1 + |\xi_*|^2$ when $X_* = (x; 1, \xi_*)$. We now show that, for any given $g \in Sp(n, \mathbb{R})$, the transformation $\pi_{\nu,\delta}(g)$ is a bounded endomorphism of the Hilbert space $H_\nu$ thus defined. First,
\[
Y := \frac{g^{-1}X}{[X, ge_1]}\ \text{ lies in } M_0 \quad\text{if } X \in \mathbb{R}^{2n} \text{ and } [X, ge_1] \neq 0:
\tag{3.3}
\]
indeed, recall that $\xi_1 = [X, e_1]$ if $X = (x; \xi)$ and that $[X, ge_1] = [g^{-1}X, e_1]$. Recalling the recipe, just before (2.30), which served as a definition of $\pi_{\nu,\delta}(g)$, we first extend $f$, initially defined on $M_0$, as a function $f^\flat$ in $\mathbb{R}^{2n}\setminus\{0\}$, setting
\[
f^\flat(x; \xi_1, \xi_*) = |\xi_1|_\delta^{-n-\nu}\, f\Bigl(\frac{x}{\xi_1};\ 1,\ \frac{\xi_*}{\xi_1}\Bigr),
\tag{3.4}
\]
so that
\[
f^\flat\bigl(g^{-1}.(x; \xi_1, \xi_*)\bigr) = |[X, ge_1]|_\delta^{-n-\nu}\, f\Bigl(\frac{g^{-1}X}{[X, ge_1]}\Bigr),
\tag{3.5}
\]
and
\[
\bigl(\pi_{\nu,\delta}(g)f\bigr)(X_*) = |[X_*, ge_1]|_\delta^{-n-\nu}\, f(Y_*)
\tag{3.6}
\]
with $Y_* = \frac{g^{-1}X_*}{[X_*, ge_1]}$. The next thing to do is to compute the Jacobian $\frac{dm(Y_*)}{dm(X_*)}$ when $X_*$ lies in $M_0$: to this effect, the simplest way is to use the unitarity of $\pi_{0,\delta}$, to wit the relation
\[
\int_{M_0} |[X_*, ge_1]|^{-2n}\,|f(Y_*)|^2\,dm(X_*) = \int_{M_0} |f(X_*)|^2\,dm(X_*),
\tag{3.7}
\]
finding
\[
dm(Y_*) = |[X_*, ge_1]|^{-2n}\,dm(X_*).
\tag{3.8}
\]
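The claim (3.3), on which the preceding computation rests, can be illustrated numerically. The sketch below is ours and makes the following assumptions: the symplectic form is normalized so that $[X, e_1] = \xi_1$, as in the text; $n = 2$; and $g$ is a sample symplectic map built as a product of two shears with symmetric matrices `B` and `C`, chosen arbitrarily for the illustration.

```python
def bracket(U, V):
    """Symplectic form on R^{2n}, normalized so that [X, e1] = xi_1
    for X = (x; xi): [U, V] = <v_x, u_xi> - <u_x, v_xi>."""
    n = len(U) // 2
    ux, uxi = U[:n], U[n:]
    vx, vxi = V[:n], V[n:]
    return sum(p * q for p, q in zip(vx, uxi)) - sum(p * q for p, q in zip(ux, vxi))


def matvec(M, v):
    return [sum(m * x for m, x in zip(row, v)) for row in M]


def neg(M):
    return [[-m for m in row] for row in M]


def shear_x(X, B):
    """(x; xi) -> (x + B xi; xi); symplectic when B is symmetric."""
    n = len(X) // 2
    x, xi = X[:n], X[n:]
    return [p + q for p, q in zip(x, matvec(B, xi))] + list(xi)


def shear_xi(X, C):
    """(x; xi) -> (x; xi + C x); symplectic when C is symmetric."""
    n = len(X) // 2
    x, xi = X[:n], X[n:]
    return list(x) + [p + q for p, q in zip(xi, matvec(C, x))]


# Arbitrary symmetric matrices defining the sample transformation g (n = 2).
B = [[1.0, 2.0], [2.0, -1.0]]
C = [[0.5, -1.0], [-1.0, 3.0]]


def g(X):
    return shear_x(shear_xi(X, C), B)


def g_inv(X):
    return shear_xi(shear_x(X, neg(B)), neg(C))


def project_to_M0(X):
    """Y = g^{-1} X / [X, g e1], as in (3.3); requires [X, g e1] != 0."""
    e1 = [1.0] + [0.0] * (len(X) - 1)
    s = bracket(X, g(e1))
    return [y / s for y in g_inv(X)]
```

With $X = (0.3, -1.2;\ 1.0, 0.7)$, the $\xi_1$-coordinate of `project_to_M0(X)` (index 2 in this layout) comes out equal to 1 up to rounding, which is the statement that $Y$ lies in $M_0$.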
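Then, with the help of the same change of variables, one has more generally
\[
\begin{aligned}
\|\pi_{\nu,\delta}(g)f\|_\nu^2 &= \int_{M_0} |[X_*, ge_1]|^{-2n-2\operatorname{Re}\nu}\,|f(Y_*)|^2\,|X_*|^{2\operatorname{Re}\nu}\,dm(X_*)\\
&= \int_{M_0} |[X_*, ge_1]|^{-2\operatorname{Re}\nu}\,|f(Y_*)|^2\,|X_*|^{2\operatorname{Re}\nu}\,dm(Y_*)\\
&= \int_{M_0} \Bigl(\frac{|X_*|}{|g^{-1}X_*|}\Bigr)^{2\operatorname{Re}\nu}\,|f(Y_*)|^2\,|Y_*|^{2\operatorname{Re}\nu}\,dm(Y_*),
\end{aligned}
\tag{3.9}
\]
an expression which we want to bound in terms of $\|f\|_\nu^2$. It suffices to observe that the ratio $\bigl(\frac{|X_*|}{|g^{-1}X_*|}\bigr)^{2\operatorname{Re}\nu}$ is bounded for $X_* \in M_0$, the bound depending of course on $g$. Hence, $\pi_{\nu,\delta}$ is a representation by means of bounded operators in $H_\nu$. This makes it possible, in the usual way, to define the space of $C^\infty$ vectors of the given representation. Recalling that the Lie algebra of the symplectic group consists of block matrices $\begin{pmatrix} A & B\\ C & -A^{T}\end{pmatrix}$ with $B$ and $C$ symmetric, one sees that the space of infinitesimal operators of the phase space representation of $Sp(n, \mathbb{R})$ in $L^2(\mathbb{R}^{2n})$ is generated by the vector fields $\xi_j\frac{\partial}{\partial x_k} + \xi_k\frac{\partial}{\partial x_j}$, $x_j\frac{\partial}{\partial x_k} - \xi_k\frac{\partial}{\partial \xi_j}$, $x_j\frac{\partial}{\partial \xi_k} + x_k\frac{\partial}{\partial \xi_j}$, the values of which at each point $(x; \xi)$ with $\xi_1 = 1$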
generate the linear subspace of $\mathbb{R}^{2n}$ tangent to $M_0$. It follows that the space of $C^\infty$-vectors of the representation $\pi_{\nu,\delta}$ consists of $C^\infty$ functions in the usual sense. This condition is of course not sufficient: there are conditions “at infinity” best rephrased by simply changing the hyperplane $M_0$ to an appropriate finite collection of hyperplanes $M_S$, as will be seen for instance in the proof of Lemma 4.1.
Proposition 3.1. When $\operatorname{Re}\nu_1 = \operatorname{Re}\nu_2 = n$ and $\operatorname{Re}\nu = -n$, the function $J^{\varepsilon_1,\varepsilon_2;\varepsilon}_{\nu_1,\nu_2;\nu}(Y_*, Z_*; X_*)$ as defined in (2.36) is a bounded function. One can extend its meaning as a distribution in $M_0\times M_0\times M_0$, holomorphic with respect to $\nu_1, \nu_2, \nu$ in the open subset of $\mathbb{C}^3$ defined, recalling (2.33) and (2.34), by the conditions
\[
\frac{n+\nu-\nu_1+\nu_2}{2} \neq \varepsilon_2+1, \varepsilon_2+3, \dots;\qquad
\frac{n+\nu+\nu_1-\nu_2}{2} \neq \varepsilon_1+1, \varepsilon_1+3, \dots;\qquad
\frac{n-\nu-\nu_1-\nu_2}{2} \neq \varepsilon+1, \varepsilon+3, \dots,
\tag{3.10}
\]
together with the fact that at least one of the three following conditions should hold:
\[
3n+\nu-\nu_1-\nu_2 \neq 1, 3, \dots,\ 2j+2, 2j+6, \dots \quad\text{and}\quad n+\nu \neq \delta+1, \delta+3, \dots
\tag{3.11}
\]
or any of the conditions obtained from (3.11) by changing $(\nu, \nu_1, \nu_2; \delta, \delta_1, \delta_2)$ to $(-\nu_1, -\nu, \nu_2; \delta_1, \delta, \delta_2)$ or to $(-\nu_2, \nu_1, -\nu; \delta_2, \delta_1, \delta)$. When $n = 1$, one can delete the condition $3+\nu-\nu_1-\nu_2 \neq 1, 3, \dots$ from (3.11). Something entirely similar holds after one has replaced $M_0$ by $M_S$ for an arbitrary $S \in \mathbb{R}^{2n}\setminus\{0\}$. In view of the inclusion $C^\infty(\pi_{\nu,\delta}) \subset C^\infty(M_0)$ and of Remark 2.2, this will automatically make it a continuous trilinear form on the space of $(f_1, f_2, f) \in C^\infty(\pi_{\nu_1,\delta_1})\times C^\infty(\pi_{\nu_2,\delta_2})\times C^\infty(\pi_{-\nu,\delta})$. Setting, when $\nu_1, \nu_2, \nu$ satisfy (3.10) and (3.11), and $f_1, f_2, f$ are $C^\infty$ functions with compact support in $M_0$,
\[
\mathcal{J}^{\varepsilon_1,\varepsilon_2;\varepsilon}_{\nu_1,\nu_2;\nu}(f_1, f_2; f) = \int_{M_0\times M_0\times M_0} J^{\varepsilon_1,\varepsilon_2;\varepsilon}_{\nu_1,\nu_2;\nu}(Y_*, Z_*; X_*)\,f_1(Y_*)\,f_2(Z_*)\,f(X_*)\,dm(Y_*)\,dm(Z_*)\,dm(X_*),
\tag{3.12}
\]
one has the covariance relation
\[
\mathcal{J}^{\varepsilon_1,\varepsilon_2;\varepsilon}_{\nu_1,\nu_2;\nu}\bigl(\pi_{\nu_1,\delta_1}(g)f_1,\ \pi_{\nu_2,\delta_2}(g)f_2;\ \pi_{-\nu,\delta}(g)f\bigr) = \mathcal{J}^{\varepsilon_1,\varepsilon_2;\varepsilon}_{\nu_1,\nu_2;\nu}(f_1, f_2; f)
\tag{3.13}
\]
for every symplectic transformation $g$ such that the transformed versions of $f_1, f_2, f$ also have compact support in $M_0$.
Proof. The “integral” on the right-hand side of (3.12) is of course a usual notation for what is in effect the result of testing a certain distribution on the function $f_1\otimes f_2\otimes f$. Before coming to the proof, let us indicate that one should not worry about the condition of compact support: in the
way explained in Remark 2.2, one can dispense with it, only replacing the domain of integration $M_0\times M_0\times M_0$ by a finite collection of domains $M_S\times M_S\times M_S$.
When $\operatorname{Re}\nu_1 = \operatorname{Re}\nu_2 = n$ and $\operatorname{Re}\nu = -n$, all exponents in definition (2.36) of $J^{\varepsilon_1,\varepsilon_2;\varepsilon}_{\nu_1,\nu_2;\nu}(Y_*, Z_*; X_*)$ have real part zero, so that the first point is obvious. Defining, when possible, complex powers of possibly vanishing functions in the distribution sense can often be done by using Hironaka’s desingularization theorem [4], in particular, when necessary (this will be the case here because we wish to find the poles as they appear in conditions (3.10) and (3.11)), explicit blow-up transformations: the idea was used in general, and applied toward a shorter proof of a classical theorem in partial differential equations, in [1,3]. We shall use it here, following its use in the one-dimensional case in [8]. Recall that one can define the direct image of a distribution under any $C^\infty$ proper map. Our point is to give products of signed powers of the three functions
\[
\begin{aligned}
\Phi_1 &:= [Y_*, X_*] = x_1 - y_1 + \langle x_*, \eta_*\rangle - \langle y_*, \xi_*\rangle,\\
\Phi_2 &:= [X_*, Z_*] = z_1 - x_1 + \langle z_*, \xi_*\rangle - \langle x_*, \zeta_*\rangle,\\
\Phi_3 &:= [Z_*, Y_*] = y_1 - z_1 + \langle y_*, \zeta_*\rangle - \langle z_*, \eta_*\rangle
\end{aligned}
\tag{3.14}
\]
a meaning for generic values of the parameters. Note that it is not necessary to desingularize fully the variety of zeros of the product $\Phi_1\Phi_2\Phi_3$, only to reach a situation in which we are dealing locally with products of signed powers of functions with linearly independent differentials at common zeros. Considering only the partial derivatives with respect to $x_1, y_1, z_1$, one observes that a linear relation between the differentials of these three functions cannot hold unless it consists in the fact that the sum of the three differentials is zero: computing then the partial derivatives with respect to $\xi_*, \eta_*, \zeta_*$, finally with respect to $x_*, y_*, z_*$, one sees that the three differentials are linearly dependent if and only if $X_{**} = Y_{**} = Z_{**}$ with the notation of Section 2. In the open set where this condition is not satisfied, one can complete the set of three functions under consideration into a local coordinate system, and the proposition follows in this case from the following well-known fact from the theory of distributions in one variable [10]: the function $\nu \mapsto |x|_\delta^{-1-\nu}$, a locally summable function if $\operatorname{Re}\nu < 0$, extends as a distribution-valued holomorphic function of $\nu$ for $\nu \neq \delta, \delta+2, \dots$. This gives the distribution $J^{\varepsilon_1,\varepsilon_2;\varepsilon}_{\nu_1,\nu_2;\nu}$ a (local) meaning provided that $\frac{n+\nu+\nu_1-\nu_2}{2} \neq \varepsilon_1+1, \varepsilon_1+3, \dots$, $\frac{n+\nu-\nu_1+\nu_2}{2} \neq \varepsilon_2+1, \varepsilon_2+3, \dots$ and $\frac{n-\nu-\nu_1-\nu_2}{2} \neq \varepsilon+1, \varepsilon+3, \dots$.
When the condition $X_{**} = Y_{**} = Z_{**}$ is satisfied, saying that $[Z_*, Y_*]$ is zero is the same as saying that $y_1 = z_1$, and there are two analogous statements related to the other two functions. At points where none of the three functions under consideration vanishes, there is of course no problem. Near points where only one of them, say $[Z_*, Y_*]$, vanishes, it can be taken as one of a set of local coordinates, and the distribution under examination makes sense whenever $\frac{n-\nu-\nu_1-\nu_2}{2} \neq \varepsilon+1, \varepsilon+3, \dots$.
The only problem remains near points at which $X_{**} = Y_{**} = Z_{**}$ and $x_1 = y_1 = z_1$, i.e., $X_* = Y_* = Z_*$. We thus need to tame the three functions under consideration near a point such as $(X_*^0, X_*^0, X_*^0)$, and there is no loss of generality in assuming that $X_*^0 = e_{n+1}$, the $(n+1)$th vector from the canonical basis of $\mathbb{R}^n\times\mathbb{R}^n$, since a symplectic transformation preserving the linear form $X \mapsto \xi_1$ can take us to this case.
We first replace the triple $(Y_*, Z_*, X_*) \in M_0\times M_0\times M_0$ by the set of points $(T_1, T_2; x_1; Y_{**}, Z_{**}, X_{**})$ in $\mathbb{R}^2\times\mathbb{R}\times(\mathbb{R}^{2n-2})^3$, with
\[
T_1 = \Phi_1(Y_*, Z_*, X_*),\qquad T_2 = \Phi_2(Y_*, Z_*, X_*).
\tag{3.15}
\]
That these equations define, near $(X_*^0, X_*^0, X_*^0)$, an admissible new set of coordinates, follows from the fact that $\Phi_1$ and $\Phi_2$ have linearly independent partial differentials with respect to the pair $(y_1, z_1)$.
Next, we blow up the $(T_1, T_2)$-plane around 0, replacing it by the subspace $\widetilde{\mathbb{R}}^2$ of $P_1(\mathbb{R})\times\mathbb{R}^2$ consisting of pairs $(\tau, T)$ such that, in the case when $T \neq 0$, $\tau$ is the image of $T$ under the canonical projection map $p\colon \mathbb{R}^2\setminus\{0\} \to P_1(\mathbb{R})$. Generally setting $\tau = p(\theta)$, the domain $\omega_j$ of $P_1(\mathbb{R})$ characterized by the condition $\theta_j \neq 0$ gives rise to the domain $\Omega_j$ of $\widetilde{\mathbb{R}}^2$ consisting of pairs $(\tau, T)$ such that either $T_j \neq 0$ and $p(T) = \tau$ or $T = 0$ and $\tau \in \omega_j$. The domains $\Omega_1$ and $\Omega_2$ cover $\widetilde{\mathbb{R}}^2$ and taking in $\Omega_1$ the set of coordinates
\[
(\tau_2, T_1) = \Bigl(\frac{\theta_2}{\theta_1},\ T_1\Bigr),
\tag{3.16}
\]
and in $\Omega_2$ the set of coordinates
\[
(\tau_1, T_2) = \Bigl(\frac{\theta_1}{\theta_2},\ T_2\Bigr),
\tag{3.17}
\]
one turns $\widetilde{\mathbb{R}}^2$ into a smooth manifold. The projection map $\varphi\colon (\tau, T) \mapsto T$ is proper since the inverse image of a point $T \neq 0$ reduces to the point $(p(T), T)$, while that of 0 is $\Sigma = P_1(\mathbb{R})\times\{0\}$. In $\Omega_1$, one has $\Phi_1 = T_1$, $\Phi_2 = \tau_2T_1$, so that the pullbacks in $\widetilde{\mathbb{R}}^2\times\mathbb{R}\times(\mathbb{R}^{2n-2})^3$ of the three functions under consideration express themselves as
\[
\Phi_1 = T_1,\qquad \Phi_2 = \tau_2T_1,\qquad \Phi_3 = -(1+\tau_2)T_1 + [X_{**}, Y_{**}-Z_{**}] - [Y_{**}, Z_{**}].
\tag{3.18}
\]
The differentials of $\Phi_1$ and $\Phi_2$ are not linearly independent when $T_1 = 0$, but the differentials of $T_1$ and $\tau_2$ are, which is sufficient as a start. We must now insert a lemma, in order to take care of the extra terms in $\Phi_3$.
Lemma 3.2. Consider on $\mathbb{R}^{2n}\times\mathbb{R}^{2n}\times\mathbb{R}^{2n}$ the function
\[
F(Y, Z, X) = [X, Y-Z] - [Y, Z],
\tag{3.19}
\]
which is critical exactly at points $(-X^0, -X^0, X^0)$, where it vanishes. Consider the blow-up $\widetilde{\mathbb{R}}^{6n}$ of $\mathbb{R}^{6n}$ at such a point, and the pullback $\widetilde{F}$ in $\widetilde{\mathbb{R}}^{6n}$ of the function $F$. Locally around any point lying in the inverse image of $(-X^0, -X^0, X^0)$, one can find two smooth real-valued functions $R$ and $S$ such that $\widetilde{F}$ expresses itself as $RS^2$.
Proof. First, observe the identity
\[
F(-X^0+Y,\ -X^0+Z,\ X^0+X) = F(Y, Z, X),
\tag{3.20}
\]
so that there is no loss of generality in assuming that $X^0 = 0$. The space $\widetilde{\mathbb{R}}^{6n}$ obtained as the result of blowing up $\mathbb{R}^{6n}$ around 0 is covered by a family $(\Omega_j)_{1\le j\le 6n}$ of open sets with the following properties: for each $j$, there is a function $S_j$ taken from the set of canonical coordinates of one of the three vectors $Y, Z, X$ such that, within $\Omega_j$, the equation $S_j = 0$ defines the inverse image $P_{6n-1}(\mathbb{R})\times\{0\}$ of $0 \in \mathbb{R}^{6n}$; next, there is a set of smooth vector-valued functions $\dot Y, \dot Z, \dot X$, each of which has $2n$ components, such that the identities $Y = S_j\dot Y$, $Z = S_j\dot Z$, $X = S_j\dot X$ hold, and such that, deleting from the set of components of the vectors $\dot Y, \dot Z, \dot X$ the coordinate which, of necessity, is the constant 1, one obtains a family of functions which, when completed by the function $S_j$, constitutes an admissible set of coordinates in $\Omega_j$. Then, one may write
\[
\widetilde{F}(S_j, \dot Y, \dot Z, \dot X) = S_j^2\bigl([\dot X, \dot Y-\dot Z] - [\dot Y, \dot Z]\bigr),
\tag{3.21}
\]
and it suffices to observe that the second factor is a function without critical point. Indeed, assuming for instance that the coordinate $S_j$ has been taken from the components of $Y$ (it would be fully similar if it had been taken from any of the other two remaining vectors), the equation $(\dot Y)_j = 1$ shows that the partial derivatives of the second factor with respect to the coordinates in $\dot X$ or $\dot Z$ “conjugate with respect to the symplectic form” to $(\dot Y)_j$ are not zero. □
End of proof of Proposition 3.1. Applying Lemma 3.2 with $n-1$ substituted for $n$, we may rewrite (3.18), more precisely the pullbacks of the three functions there to a new blown-up space, as
\[
\Phi_1 = T_1,\qquad \Phi_2 = \tau_2T_1,\qquad \Phi_3 = -(1+\tau_2)T_1 + RS^2,
\tag{3.22}
\]
where the four functions $T_1, \tau_2, R, S$ have linearly independent differentials. The differential $d\Phi_3$ is a linear combination of $d\Phi_1$ and $d\Phi_2$ exactly at points where $S = 0$, but let us not forget the origin (3.16) of the coordinate $T_1$, which implies that there is no loss of generality in assuming that we are near a point where $T_1 = 0$ as well. In the open set where $1+\tau_2$ does not vanish, we may take $\Phi_3$ to the form $-T_1 + RS^2$, and we blow up the plane of the variables $T_1, S$ around 0: this amounts, with new variables, to setting in appropriate domains either $S = T_1S'$ or $T_1 = ST_1'$, finding either $-T_1 + RS^2 = T_1(-1 + RT_1S'^2)$ or $-T_1 + RS^2 = S(-T_1' + RS)$. In the first case we are dealing with a pair of functions, the first of which is $T_1$ and the second is the product of $T_1$ by a function which, at points where it vanishes, has a differential linearly independent from $dT_1$. In the second case, we still have to desingularize the pair of functions $(ST_1',\ S(-T_1' + RS))$ or, setting aside the factors $S$ in the product of signed powers to be analyzed, the triple of functions $(S,\ T_1',\ -T_1' + RS)$. Again, we blow up the $(T_1', S)$-space, which amounts to setting either $S = T_1'S''$, in which case the triple becomes $(T_1'S'',\ T_1',\ T_1'(-1 + RS''))$, or $T_1' = ST_1''$, in which case the triple becomes $(S,\ ST_1'',\ S(-T_1'' + R))$, a satisfactory situation.
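The factorizations produced by these successive blow-ups are pointwise algebraic identities, so they can be spot-checked numerically. In the following Python sketch (ours; the function names are hypothetical, and $R$ is treated as a plain number, namely its value at the point under consideration), each function returns a pair made of the pulled-back expression and the factorized form claimed in the text.

```python
def chart_S_equals_T1_Sp(T1, Sp, R):
    """Chart S = T1*S': compare -T1 + R*S^2 with T1*(-1 + R*T1*S'^2)."""
    S = T1 * Sp
    return -T1 + R * S**2, T1 * (-1 + R * T1 * Sp**2)


def chart_T1_equals_S_T1p(S, T1p, R):
    """Chart T1 = S*T1': compare -T1 + R*S^2 with S*(-T1' + R*S)."""
    T1 = S * T1p
    return -T1 + R * S**2, S * (-T1p + R * S)


def triple(S, T1p, R):
    """The triple (S, T1', -T1' + R*S) left after setting the factors S aside."""
    return (S, T1p, -T1p + R * S)


def triple_chart_a(T1p, Spp, R):
    """Second blow-up, chart S = T1'*S''."""
    pulled_back = triple(T1p * Spp, T1p, R)
    claimed = (T1p * Spp, T1p, T1p * (-1 + R * Spp))
    return pulled_back, claimed


def triple_chart_b(S, T1pp, R):
    """Second blow-up, chart T1' = S*T1''."""
    pulled_back = triple(S, S * T1pp, R)
    claimed = (S, S * T1pp, S * (-T1pp + R))
    return pulled_back, claimed
```

In every chart the two returned objects should coincide up to rounding, whatever sample values are fed in.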
Finally, we must place ourselves near a point where $T_1$ and $1+\tau_2$ vanish. We may then forget about $\Phi_2$ entirely, and we blow up the variables $T_1, 1+\tau_2, S$ near 0. In local charts, this makes up one of the three following possibilities:
\[
\begin{aligned}
1+\tau_2 &= T_1\sigma_2, & S &= T_1S', & \Phi_3 &= T_1^2\bigl(-\sigma_2 + RS'^2\bigr),\\
T_1 &= (1+\tau_2)T_1', & S &= (1+\tau_2)S', & \Phi_3 &= (1+\tau_2)^2\bigl(-T_1' + RS'^2\bigr),\\
T_1 &= ST_1', & 1+\tau_2 &= S\sigma_2, & \Phi_3 &= S^2\bigl(-\sigma_2T_1' + R\bigr).
\end{aligned}
\tag{3.23}
\]
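The three lines of (3.23) can be verified in the same spirit, starting from the expression $-(1+\tau_2)T_1 + RS^2$ of (3.22). The sketch below is ours; the function names and the use of a numerical value for $R$ are assumptions made for the illustration only.

```python
def phi3(T1, tau2, S, R):
    """The third function of (3.22): -(1 + tau2)*T1 + R*S^2."""
    return -(1 + tau2) * T1 + R * S**2


def case1(T1, sigma2, Sp, R):
    """First line of (3.23): 1 + tau2 = T1*sigma2, S = T1*S'."""
    tau2 = T1 * sigma2 - 1
    return phi3(T1, tau2, T1 * Sp, R), T1**2 * (-sigma2 + R * Sp**2)


def case2(u, T1p, Sp, R):
    """Second line of (3.23), with u = 1 + tau2: T1 = u*T1', S = u*S'."""
    return phi3(u * T1p, u - 1, u * Sp, R), u**2 * (-T1p + R * Sp**2)


def case3(S, T1p, sigma2, R):
    """Third line of (3.23): T1 = S*T1', 1 + tau2 = S*sigma2."""
    tau2 = S * sigma2 - 1
    return phi3(S * T1p, tau2, S, R), S**2 * (-sigma2 * T1p + R)
```

Each function returns the pulled-back value together with the factorized form; the two should agree up to rounding in all three charts.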
In the first (resp. third) case, a product of signed powers of $T_1$ and $\Phi_3$ becomes a product of signed powers of $T_1$ and $-\sigma_2 + RS'^2$ (resp. a product of signed powers of $S$, of $T_1'$ and $-\sigma_2T_1' + R$), a satisfactory situation since we are dealing in each case with two functions with linearly independent differentials. This is not the case on the second line, in which, after leaving the factors $1+\tau_2$ aside, we have to consider the pair of functions $T_1'$ and $-T_1' + RS'^2$: these do not have linearly independent differentials; however, this pair can be desingularized since we are back to the situation examined above, relative to the pair $(T_1,\ -T_1 + RS^2)$.
We are now in a position to define locally the distribution $J^{\varepsilon_1,\varepsilon_2;\varepsilon}_{\nu_1,\nu_2;\nu}$ as the direct image, under a proper map, of a distribution of the kind
\[
|\Phi_1|_{\varepsilon_2}^{\frac{-n-\nu+\nu_1-\nu_2}{2}}\,|\Phi_2|_{\varepsilon_1}^{\frac{-n-\nu-\nu_1+\nu_2}{2}}\,|\Phi_3|_{\varepsilon}^{\frac{-n+\nu+\nu_1+\nu_2}{2}},
\tag{3.24}
\]
where the factors $\Phi_1, \Phi_2, \Phi_3$ really denote the initial functions $\Phi_1, \Phi_2, \Phi_3$ of (3.14) after they have been pulled back in one of the appropriate ways just described: only, we here dispense with the collection of superscripts which has been used before in order to keep track of the number of blow-ups needed. In case the reader should worry about it, the fact that the subscript $\varepsilon_2$ should be associated to $\Phi_1$, not $\Phi_2$, is not a blunder: the index $\delta_1$ is actually that which must be associated to $\Phi_1$, and we recall (2.33). The important fact is that, in local charts, the functions $\Phi_1, \Phi_2, \Phi_3$ are all built as powers of the same set of functions with linearly independent differentials. Recall from (2.35) that
\[
\alpha_1 = \frac{-n-\nu+\nu_1-\nu_2}{2},\qquad \alpha_2 = \frac{-n-\nu-\nu_1+\nu_2}{2},\qquad \alpha_3 = \frac{-n+\nu+\nu_1+\nu_2}{2}.
\tag{3.25}
\]
To find the poles, as a distribution-valued function of $\nu_1, \nu_2, \nu$, of the distribution (3.24), we must go back to the desingularizing operations and keep track of the signed powers involved in each case, starting from the fact that $|f|_\delta^{-1-\mu}$ makes sense as a distribution, assuming that $f$ has no critical zero, when $\mu \neq \delta, \delta+2, \dots$. As already said, when none of the three functions $\Phi_1, \Phi_2, \Phi_3$ vanishes, there is of course no condition on the exponents involved, and when just one of them vanishes (the case discussed between (3.14) and (3.15)), we must assume
\[
-\alpha_1 \neq \varepsilon_2+1, \varepsilon_2+3, \dots;\qquad -\alpha_2 \neq \varepsilon_1+1, \varepsilon_1+3, \dots;\qquad -\alpha_3 \neq \varepsilon+1, \varepsilon+3, \dots.
\tag{3.26}
\]
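Next, we go to our discussion following (3.22). Forgetting the factors without zeros, the product of signed powers we are led to is of one of the following species, in which we introduce the new letters $V, S', T_1', \dots$ for each of the functions, with differentials independent from the other ones at points where they vanish, such as $-1 + RT_1S'^2$, which have appeared in the discussion:
\[
\begin{gathered}
|T_1|_{\varepsilon_2}^{\alpha_1}\,|\tau_2T_1|_{\varepsilon_1}^{\alpha_2}\,|T_1V|_{\varepsilon}^{\alpha_3}
\quad\text{or}\quad
|T_1'S'T_1'|_{\varepsilon_2}^{\alpha_1}\,|\tau_2T_1'S'T_1'|_{\varepsilon_1}^{\alpha_2}\,|T_1'S'T_1'V|_{\varepsilon}^{\alpha_3}
\quad\text{or}\\
|S^2T_1'|_{\varepsilon_2}^{\alpha_1}\,|\tau_2S^2T_1'|_{\varepsilon_1}^{\alpha_2}\,|S^2V|_{\varepsilon}^{\alpha_3}
\quad\text{or}\quad
|T_1|_{\varepsilon_2}^{\alpha_1}\,|\tau_2T_1|_{\varepsilon_1}^{\alpha_2}\,|T_1^2V|_{\varepsilon}^{\alpha_3}
\quad\text{or}\quad
|ST_1'|_{\varepsilon_2}^{\alpha_1}\,|ST_1'|_{\varepsilon_1}^{\alpha_2}\,|S^2V|_{\varepsilon}^{\alpha_3}
\quad\text{or}\\
|1+\tau_2|_{\varepsilon_2}^{\alpha_1}\,|1+\tau_2|_{\varepsilon_1}^{\alpha_2}\,|(1+\tau_2)^2|_{\varepsilon}^{\alpha_3}\,
|T_1'S'T_1'|_{\varepsilon_2}^{\alpha_1}\,|T_1'S'T_1'|_{\varepsilon_1}^{\alpha_2}\,|T_1'S'T_1'V|_{\varepsilon}^{\alpha_3}
\quad\text{or}\\
|1+\tau_2|_{\varepsilon_2}^{\alpha_1}\,|1+\tau_2|_{\varepsilon_1}^{\alpha_2}\,|(1+\tau_2)^2|_{\varepsilon}^{\alpha_3}\,
|S'^2T_1'|_{\varepsilon_2}^{\alpha_1}\,|S'^2T_1'|_{\varepsilon_1}^{\alpha_2}\,|S'^2V|_{\varepsilon}^{\alpha_3}.
\end{gathered}
\tag{3.27}
\]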
Besides, we must not forget that all these local forms are only available in some domains above parts of $\Omega_1$, not $\Omega_2$ (cf. (3.16)), so we must complete the preceding list with the one obtained from it by exchanging the two pairs $(\varepsilon_2, \nu_1)$ and $(\varepsilon_1, \nu_2)$. All lines are treated in the same way: let us consider the last one, which happens to make all possible demands on the exponents, and let us rewrite it as
\[
|1+\tau_2|_{\varepsilon_1+\varepsilon_2 \bmod 2}^{\alpha_1+\alpha_2}\,|1+\tau_2|^{2\alpha_3}\,|S'|^{2(\alpha_1+\alpha_2+\alpha_3)}\,|T_1'|_{\varepsilon_1+\varepsilon_2 \bmod 2}^{\alpha_1+\alpha_2}\,|V|_{\varepsilon}^{\alpha_3}.
\tag{3.28}
\]
Since $\varepsilon_1+\varepsilon_2+\varepsilon \equiv j \bmod 2$, this can be written as
\[
|1+\tau_2|_{j}^{\alpha_1+\alpha_2+\alpha_3}\,|1+\tau_2|_{\varepsilon}^{\alpha_3}\,|S'|^{2(\alpha_1+\alpha_2+\alpha_3)}\,|T_1'|_{\varepsilon_1+\varepsilon_2 \bmod 2}^{\alpha_1+\alpha_2}\,|V|_{\varepsilon}^{\alpha_3}.
\tag{3.29}
\]
Now, one has
\[
\alpha_1+\alpha_2+\alpha_3 = \frac{-3n-\nu+\nu_1+\nu_2}{2},\qquad \alpha_1+\alpha_2 = -n-\nu,\qquad \varepsilon_1+\varepsilon_2 \equiv j+\varepsilon \equiv \delta \bmod 2,
\tag{3.30}
\]
so that, besides the conditions (3.26), it suffices to assume moreover that
\[
\frac{3n+\nu-\nu_1-\nu_2}{2} \neq j+1, j+3, \dots,\qquad 3n+\nu-\nu_1-\nu_2 \neq 1, 3, \dots,
\tag{3.31}
\]
and that $n+\nu \neq \delta+1, \delta+3, \dots$. These conditions are clearly invariant under the exchange of pairs $(\varepsilon_2, \nu_1)$ and $(\varepsilon_1, \nu_2)$. They are not fully necessary: the reason for this is that, in our desingularization procedure, we have started with giving the pair $(\Phi_1, \Phi_2)$ special consideration, while we might just as well have started from giving the pair $(\Phi_2, \Phi_3)$ or $(\Phi_3, \Phi_1)$ special consideration. This takes us to the assumptions in Proposition 3.1, not forgetting that in the one-dimensional case, the desingularization process stops at (3.18). The rest of the proof is trivial. □
We shall also need the following result, in the same spirit as Proposition 3.1, though of course its proof presents no difficulty.
Proposition 3.3. Set, assuming $-\rho \neq \delta+1, \delta+3, \dots$ and $\rho \neq \delta, \delta+2, \dots$,
\[
c(\rho, \delta) = (-i)^\delta\,\pi^{-\frac12-\rho}\,\frac{\Gamma\bigl(\frac{\rho+1+\delta}{2}\bigr)}{\Gamma\bigl(\frac{-\rho+\delta}{2}\bigr)},
\tag{3.32}
\]
so that one should have, in one dimension,
\[
\mathcal{F}\bigl(|s|_\delta^{\rho}\bigr)(\sigma) = c(\rho, \delta)\,|\sigma|_\delta^{-\rho-1}
\tag{3.33}
\]
(of course, we are using here the usual Fourier transformation, with integral kernel $e^{-2i\pi s\sigma}$: there is no symplectic Fourier transformation on an odd-dimensional space). Recalling (2.22), consider the integral kernel
\[
k_{\nu,\delta}(x, \xi_*; y, \eta_*) = (-1)^\delta\,c(n-1-\nu, \delta)\,\bigl|x_1 - y_1 + \langle x_*, \eta_*\rangle - \langle y_*, \xi_*\rangle\bigr|_\delta^{-n+\nu}.
\tag{3.34}
\]
When $-n < \operatorname{Re}\nu < 1-n$, this is the integral kernel of an operator $\theta_{\nu,\delta}$ well defined, in the weak sense, from the space of $C^\infty$ vectors of the representation $\pi_{\nu,\delta}$ to the dual of that space (which contains the space of $C^\infty$ vectors of the representation $\pi_{-\nu,\delta}$). As an operator-valued function of $\nu$, $\theta_{\nu,\delta}$ extends as a holomorphic function in $\mathbb{C}\setminus P$, where the set $P$ consists of the values $\nu$ such that $-n+\nu = \delta, \delta+2, \dots$ or $n-\nu = \delta+1, \delta+3, \dots$. The operator $\theta_{\nu,\delta}$ is an intertwiner from the representation $\pi_{\nu,\delta}$ to the representation $\pi_{-\nu,\delta}$. When $\nu \in i\mathbb{R}$, it coincides with the one introduced in another way in Definition 2.2.
The latter way to define the operator $\theta_{i\lambda,\delta}$ has the advantages, especially in the version (2.19), that on one hand it continues to be meaningful after $\nu \in \mathbb{C}$ has been substituted for $i\lambda$, on the other hand that it extends to a (tempered) distribution setting: but this requires that the homogeneous functions, or distributions, under consideration, should have a well-defined meaning as distributions in $\mathbb{R}^{2n}$, not only as functions, or distributions, in $\mathbb{R}^{2n}\setminus\{0\}$.
4. Hyperplane waves and rays
We decompose here symbols as integral superpositions of homogeneous hyperplane waves, also of homogeneous rays, by which we mean homogeneous measures carried by straight lines through the origin of $\mathbb{R}^{2n}$. With the help of such decompositions, we shall transform, in this section, the triple product studied in Section 3 in a way crucial towards the proof of the main theorem.
Consider the transformation $\mathcal{G}$, a rescaled version of the symplectic Fourier transformation (also a unitary involution of $L^2(\mathbb{R}^{2n})$) defined as
\[
(\mathcal{G}h)(X) = 2^n\int_{\mathbb{R}^{2n}} h(Y)\,e^{-4i\pi[X,Y]}\,dY:
\tag{4.1}
\]
part of our interest in this transformation [11, p. 120] is that, for every $S \in \mathcal{S}'(\mathbb{R}^{2n})$, the distribution $\mathcal{G}S$ is the Weyl symbol of the operator $u \mapsto \operatorname{Op}(S)\check{u}$, where $\check{u}(x) = u(-x)$. If a symbol $h = h(x; \xi)$ depends only on $\xi_1$, say $h(x; \xi) = \varphi(\xi_1)$, it is immediate that $(\mathcal{G}h)(x; \xi) = 2\hat\varphi(-2x_1)\,\delta(x_*)\,\delta(\xi)$: in other words, $\mathcal{G}h$ is the measure carried by the line $\{te_1 : t \in \mathbb{R}\}$, with density $2\hat\varphi(-2t)\,dt$. More generally, if $S \in \mathbb{R}^{2n}\setminus\{0\}$, setting $S = ge_1$ with $g \in Sp(n, \mathbb{R})$, the $\mathcal{G}$-transform of the hyperplane wave $X \mapsto \varphi([X, S])$ is the measure carried by the line $\{tS : t \in \mathbb{R}\}$, with density $2\hat\varphi(-2t)\,dt$.
In particular, for any $\rho \in \mathbb{C}$, $-\rho \neq \delta+1, \delta+3, \dots$, we shall denote as $\mu_S(\rho, \delta)$ the measure carried by the line $\{tS : t \in \mathbb{R}\}$, with density $|t|_\delta^{\rho}\,dt$. Recalling the definition (3.32) of $c(\rho, \delta)$, we have, provided that $n+\nu \neq \delta+1, \delta+3, \dots$ and $-n-\nu \neq \delta, \delta+2, \dots$,
\[
\mathcal{G}\bigl(X \mapsto |[X, S]|_\delta^{-n-\nu}\bigr) = (-1)^\delta\,2^{\nu}\,c(-n-\nu, \delta)\,\mu_S(n-1+\nu, \delta).
\tag{4.2}
\]
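Since the constant $c(\rho, \delta)$ of (3.32) enters all the formulas of this section, a small numerical sketch may help fix ideas. The Python code below is ours and is restricted, as an assumption of the illustration, to real values of $\rho$ avoiding the excluded ones (the standard library `math.gamma` does not accept complex arguments). It evaluates $c(\rho, \delta)$, and the helper `gaussian_moment` gives the closed form of $\int_{\mathbb{R}} |s|^a e^{-\pi s^2}\,ds$ for $a > -1$, which allows one to test (3.33) for $\delta = 0$ against the Gaussian $e^{-\pi s^2}$, a function equal to its own Fourier transform.

```python
import math


def c(rho, delta):
    """The constant c(rho, delta) of (3.32), for real rho such that
    neither Gamma factor is evaluated at a pole."""
    return ((-1j) ** delta
            * math.pi ** (-0.5 - rho)
            * math.gamma((rho + 1 + delta) / 2)
            / math.gamma((-rho + delta) / 2))


def gaussian_moment(a):
    """Closed form of the integral over R of |s|^a exp(-pi s^2) ds, a > -1."""
    return math.pi ** (-(a + 1) / 2) * math.gamma((a + 1) / 2)
```

For $-1 < \rho < 0$, pairing both sides of (3.33) (with $\delta = 0$) with the Gaussian amounts to the identity `gaussian_moment(rho) == c(rho, 0) * gaussian_moment(-rho - 1)` up to rounding. One can check in the same way that the product $(-1)^\delta c(\rho, \delta)\,c(-\rho-1, \delta)$ equals 1, the identity stated as (4.7).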
Note that the measure $\mu_S(\rho, \delta)$ is a homogeneous distribution of type $(\rho+1-2n, \delta)$ (do not forget that, in $\mathbb{R}^{2n-1}$, the Dirac mass at the origin is homogeneous of degree $1-2n$).
Let us first decompose functions in $\mathcal{S}(\mathbb{R}^{2n})$ into homogeneous hyperplane waves. Start from the continuation of (2.4), to wit
\[
h_{\nu,\delta}(X) = \frac{1}{4\pi}\int_{-\infty}^{\infty} |t|_\delta^{n-1+\nu}\,h(tX)\,dt,
\tag{4.3}
\]
where the integral converges for every $X \neq 0$ provided that $\operatorname{Re}\nu > -n$. In this case, the function $h_{\nu,\delta}$ is, as we now show, a $C^\infty$ vector of the representation $\pi_{\nu,\delta}$. With $X_* = (x; 1, \xi_*)$, one has for every $N$ the inequality $|h(tX_*)| \le C(1+|t|)^{-N}(1+|x|+|\xi_*|)^{-N}$ for some constant $C$: then, with the norm defined in (3.2), one has $\|X_* \mapsto h(tX_*)\|_\nu \le C(1+|t|)^{-N}$, from which one obtains, since $\operatorname{Re}(n-1+\nu) > -1$, that the function $h_{\nu,\delta}$ lies in the Hilbert space $H_\nu$ defined in association with this norm. That it is a $C^\infty$ vector of the representation $\pi_{\nu,\delta}$ follows from the fact that this representation corresponds, under the transformation (4.3) from $h$ to $h_{\nu,\delta}$, to the phase space representation of $Sp(n, \mathbb{R})$ in $\mathcal{S}(\mathbb{R}^{2n})$. In the case when, moreover, $\operatorname{Re}\nu < 1-n$, one may write
\[
\begin{aligned}
h_{\nu,\delta}(X) &= \frac{2^n}{4\pi}\int_{-\infty}^{\infty} |t|_\delta^{n-1+\nu}\,dt\int_{\mathbb{R}^{2n}} e^{-4i\pi t[X,S]}\,(\mathcal{G}h)(S)\,dS\\
&= \frac{2^{-\nu}\,c(n-1+\nu, \delta)}{4\pi}\int_{\mathbb{R}^{2n}} |[X, S]|_\delta^{-n-\nu}\,(\mathcal{G}h)(S)\,dS,
\end{aligned}
\tag{4.4}
\]
which leads to the decomposition of $h$ into homogeneous hyperplane waves if coupled with the equation
\[
h = \sum_{\delta=0,1}\int_{\operatorname{Re}\nu=a} h_{\nu,\delta}\,\frac{d\nu}{i},
\tag{4.5}
\]
in which $-n < a < 1-n$. From (2.3), however, the line of integration we are particularly interested in is the pure imaginary line, for which this decomposition is just the spectral decomposition of $h$ relative to the (self-adjoint) operator $\mathcal{E}$ in $L^2(\mathbb{R}^{2n})$. Starting from (4.4) and moving the set of values of $\nu$, we certainly reach, for fixed $S$, poles of the distribution-valued function $\nu \mapsto |[X, S]|_\delta^{-n-\nu}$, at points $\nu = -n+\delta+1$, $\nu = -n+\delta+3, \dots$, but these poles are
simple, and disappear after multiplication by the factor $c(n-1+\nu, \delta)$, as seen from (3.32). This makes it possible to continue the decomposition of $h$ into homogeneous hyperplane waves up to the spectral line. Starting from $\mathcal{G}h$ in place of $h$ and noting that $(\mathcal{G}h)_{-\nu,\delta} = \mathcal{G}h_{\nu,\delta}$, one obtains also, if $\operatorname{Re}\nu < n$,
\[
\begin{aligned}
h_{\nu,\delta} &= \frac{2^{\nu}\,c(n-1-\nu, \delta)}{4\pi}\int_{\mathbb{R}^{2n}} h(S)\,\mathcal{G}\bigl(X \mapsto |[X, S]|_\delta^{-n+\nu}\bigr)\,dS\\
&= \frac{1}{4\pi}\int_{\mathbb{R}^{2n}} h(S)\,\mu_S(n-1-\nu, \delta)\,dS,
\end{aligned}
\tag{4.6}
\]
after one has used the equation
\[
(-1)^\delta\,c(\rho, \delta)\,c(-\rho-1, \delta) = 1:
\tag{4.7}
\]
this leads to a decomposition of $h$ into rays if coupled with the equation
\[
h = \sum_{\delta=0,1}\int_{\operatorname{Re}\nu=a} h_{-\nu,\delta}\,\frac{d\nu}{i},
\tag{4.8}
\]
in which, starting from a value of $a$ between $-n$ and $1-n$, we can actually take $a = 0$ when so desired.
The following lemma will enable us to deal with multipliers of the species which occurs consistently in the present work.
Lemma 4.1. Let $S \in \mathbb{R}^{2n}\setminus\{0\}$. If $\varepsilon, \delta = 0$ or $1$ and $\alpha, \nu \in \mathbb{C}$ satisfy the condition $-\frac12 < \operatorname{Re}\alpha < \frac12 + \operatorname{Re}\nu$, the multiplication by the function $X_* \mapsto |[S, X_*]|_\varepsilon^{\alpha}$ sends the space $C^\infty(\pi_{\nu,\delta})$ of $C^\infty$ vectors of the representation $\pi_{\nu,\delta}$ to the space $L^2(M_0)$.
Proof. It is no loss of generality to assume that $S = e_{n+1}$, i.e., $[S, X_*] = x_1$. Given $f \in C^\infty(\pi_{\nu,\delta})$ extending to $\mathbb{R}^{2n}\setminus\{0\}$ as a function $f^\flat$ of type $(-n-\nu, \delta)$, the function
\[
k(x; \xi) = |x_1|_\varepsilon^{\alpha}\,|\xi_1|_{\varepsilon+\delta \bmod 2}^{\nu-\alpha}\,f^\flat(x; \xi)
\tag{4.9}
\]
is of type $(-n, 0)$. Since the corresponding representation $\pi_{0,0}$ preserves the Hilbert space $L^2(M_0)$, it suffices, in view of Remark 2.2, to check that the restriction of the function $k$ to $M_0$ lies in the space $L^2_{\mathrm{loc}}(M_0)$, which leads to the two conditions indicated. □
We now come back to a study of the bilinear operator $(f_1, f_2) \mapsto \mathcal{J}^{\varepsilon_1,\varepsilon_2;\varepsilon}_{\nu_1,\nu_2;\nu}(f_1, f_2)$, or of the associated triple product obtained when testing this distribution against $f \in C^\infty(\pi_{-\nu,\delta})$. Recall from the end of Section 2 that such expressions can also use as arguments objects with the proper type defined in $\mathbb{R}^{2n}\setminus\{0\}$ rather than their restrictions to $M_0$, the distinction being purely notational. We shall eventually assume, but not at one stroke, that
\[
f_1 = (h_1)_{\nu_1,\delta_1},\qquad f_2 = (h_2)_{\nu_2,\delta_2},\qquad f = h_{-\nu,\delta}
\tag{4.10}
\]
for a triple of functions $h_1, h_2, h \in \mathcal{S}(\mathbb{R}^{2n})$.
Lemma 4.2. Assume that $h_2 \in \mathcal{S}(\mathbb{R}^{2n})$ and that all hypotheses of Proposition 3.1 are valid. Moreover, assume that $\operatorname{Re}\nu_2 < n$ and that
\[
\operatorname{Re}\nu_1 > -\frac12,\qquad \operatorname{Re}(\nu-\nu_1+\nu_2) = n,\qquad \operatorname{Re}\nu < \frac12.
\tag{4.11}
\]
If $f_1 \in C^\infty(\pi_{\nu_1,\delta_1})$, one has in the weak sense, i.e., when integrated against $f(X_*)\,dm(X_*)$ for some $f \in C^\infty(\pi_{-\nu,\delta})$,
\[
\begin{aligned}
\bigl(\mathcal{J}^{\varepsilon_1,\varepsilon_2;\varepsilon}_{\nu_1,\nu_2;\nu}(f_1, (h_2)_{\nu_2,\delta_2})\bigr)(X_*)
&= \frac{1}{4\pi}\,\frac{(-1)^{\varepsilon_2}}{c\bigl(\frac{n-2+\nu-\nu_1+\nu_2}{2}, \varepsilon_2\bigr)}\int_{\mathbb{R}^{2n}} h_2(S)\,dS\\
&\quad\times |[X_*, S]|_{\varepsilon_1}^{\frac{-n-\nu-\nu_1+\nu_2}{2}}\,
\Bigl(\theta_{\frac{n-\nu+\nu_1-\nu_2}{2},\varepsilon_2}\Bigl(Y_* \mapsto |[S, Y_*]|_{\varepsilon}^{\frac{-n+\nu+\nu_1+\nu_2}{2}}\,f_1(Y_*)\Bigr)\Bigr)(X_*).
\end{aligned}
\tag{4.12}
\]
Proof. First, we observe, as a consequence of Lemma 4.1, that, under the conditions (4.11), the multiplication by the function $Y_* \mapsto |[S, Y_*]|_{\varepsilon}^{\frac{-n+\nu+\nu_1+\nu_2}{2}}$ sends the space $C^\infty(\pi_{\nu_1,\delta_1})$ to the space $L^2(M_0)$ and that the multiplication by the function $X_* \mapsto |[X_*, S]|_{\varepsilon_1}^{\frac{-n-\nu-\nu_1+\nu_2}{2}}$ sends the space $L^2(M_0)$ to the space of distributions $C^{-\infty}(\pi_{\nu,\delta})$, the topological dual of $C^\infty(\pi_{-\nu,\delta})$ (i.e., the linear space of continuous linear forms on that space). On the other hand, the first condition (4.11) gives the intertwining operator $\theta_{\frac{n-\nu+\nu_1-\nu_2}{2},\varepsilon_2}$ a meaning as a unitary operator in $L^2(M_0)$, so that the right-hand side of the equation to be proved is meaningful. If one makes there the integral kernel of the operator $\theta_{\frac{n-\nu+\nu_1-\nu_2}{2},\varepsilon_2}$ explicit, as $(-1)^{\varepsilon_2}\,c\bigl(\frac{n-2+\nu-\nu_1+\nu_2}{2}, \varepsilon_2\bigr)\,|[Y_*, X_*]|_{\varepsilon_2}^{\frac{-n-\nu+\nu_1-\nu_2}{2}}$, then if one sets $S = sZ_*$, so that
\[
|s|_{\varepsilon_1}^{\frac{-n-\nu-\nu_1+\nu_2}{2}}\,|s|_{\varepsilon}^{\frac{-n+\nu+\nu_1+\nu_2}{2}}\,dS = |s|_{\delta_2}^{n-1+\nu_2}\,ds\,dm(Z_*),
\tag{4.13}
\]
and if one uses the equation
\[
(h_2)_{\nu_2,\delta_2}(X) = \frac{1}{4\pi}\int_{-\infty}^{\infty} |s|_{\delta_2}^{n-1+\nu_2}\,h_2(sX)\,ds,
\tag{4.14}
\]
one transforms the right-hand side of (4.12) into the left-hand side. However, the operator on the left-hand side has been defined with the help of the desingularization of its integral kernel as done in Section 3, while on the right-hand side, the claimed unitarity of the intertwining operator under consideration is a consequence of Definition 2.2: to identify the two ways to introduce it, one must use again the connection between (2.21) and (2.22). □
Let us rewrite (4.12), as tested against $f$, with
\[
f(X_*) = h_{-\nu,\delta}(X_*) = \frac{1}{4\pi}\int_{-\infty}^{\infty} |t|_\delta^{n-1-\nu}\,h(tX_*)\,dt.
\tag{4.15}
\]
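One has
\[
\begin{aligned}
\Bigl\langle J^{\varepsilon_1,\varepsilon_2;\varepsilon}_{\nu_1,\nu_2;\nu}\bigl(f_1, (h_2)_{\nu_2,\delta_2}\bigr),\, h_{-\nu,\delta} \Bigr\rangle
&= \frac{1}{(4\pi)^2}\,\frac{(-1)^{\varepsilon_2}}{c\bigl(\frac{n-2+\nu-\nu_1+\nu_2}{2}, \varepsilon_2\bigr)} \\
&\quad \times \int_{\mathbb{R}^{2n}} h_2(S)\, \Bigl\langle \mathcal{F}\Bigl(Y \mapsto \bigl|[S, Y]\bigr|_{\varepsilon}^{\frac{-n+\nu+\nu_1+\nu_2}{2}} f_1(Y)\Bigr),\; T \mapsto \bigl|[T, S]\bigr|_{\varepsilon_1}^{\frac{-n-\nu-\nu_1+\nu_2}{2}} h(T) \Bigr\rangle\, dS:
\end{aligned} \tag{4.16}
\]
note that the two pairs of brackets $\langle\ ,\ \rangle$ do not denote the same pairings: on the left-hand side, it corresponds to the duality between $C^{-\infty}(\pi_{\nu,\delta})$ and $C^\infty(\pi_{-\nu,\delta})$; within the integrand on the right-hand side, it corresponds to the one between $\mathcal{S}'(\mathbb{R}^{2n})$ and $\mathcal{S}(\mathbb{R}^{2n})$. To prove this, we start from the right-hand side, expressing the intertwining operator there as a Fourier transformation. The function
\[
T \mapsto \bigl|[T, S]\bigr|_{\varepsilon_1}^{\frac{-n-\nu-\nu_1+\nu_2}{2}}\; \mathcal{F}\Bigl(Y \mapsto \bigl|[S, Y]\bigr|_{\varepsilon}^{\frac{-n+\nu+\nu_1+\nu_2}{2}} f_1(Y)\Bigr)(T) \tag{4.17}
\]
is of type (recalling (2.33))
\[
\Bigl(\frac{-n-\nu-\nu_1+\nu_2}{2}, \varepsilon_1\Bigr) + (-2n, 0) + \Bigl(\frac{n-\nu-\nu_1-\nu_2}{2}, \varepsilon\Bigr) + (n+\nu_1, \delta_1) = (-n-\nu, \delta). \tag{4.18}
\]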
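The degree bookkeeping behind (4.13) and (4.18) is easy to get wrong by a sign, so a small check may help. The sketch below only adds the degrees of homogeneity as linear forms in $n, \nu, \nu_1, \nu_2$; the parity indices are left aside, since they depend on (2.34), which is not restated here. The helper names (`form`, `add`, `half`) are hypothetical and introduced for illustration only.

```python
from fractions import Fraction

# A linear form c0 + a*n + b*nu + c*nu1 + d*nu2 is stored as (c0, a, b, c, d).
def form(const=0, n=0, nu=0, nu1=0, nu2=0):
    return tuple(Fraction(v) for v in (const, n, nu, nu1, nu2))

def add(*forms):
    # Coefficient-wise sum of several linear forms.
    return tuple(sum(column) for column in zip(*forms))

def half(f):
    # Divide every coefficient by 2.
    return tuple(v / 2 for v in f)

# (4.18): the four degrees that are added up.
deg_eps1 = half(form(n=-1, nu=-1, nu1=-1, nu2=1))   # (-n - nu - nu1 + nu2)/2
deg_fourier = form(n=-2)                            # -2n
deg_eps = half(form(n=1, nu=-1, nu1=-1, nu2=-1))    # (n - nu - nu1 - nu2)/2
deg_f1 = form(n=1, nu1=1)                           # n + nu1
total_418 = add(deg_eps1, deg_fourier, deg_eps, deg_f1)

# (4.13): the two powers of |s| together with the Jacobian |s|^(2n-1) of S = s Z_*.
exp_eps1 = half(form(n=-1, nu=-1, nu1=-1, nu2=1))   # (-n - nu - nu1 + nu2)/2
exp_eps = half(form(n=-1, nu=1, nu1=1, nu2=1))      # (-n + nu + nu1 + nu2)/2
jacobian = form(const=-1, n=2)                      # 2n - 1
total_413 = add(exp_eps1, exp_eps, jacobian)

print(total_418 == form(n=-1, nu=-1))               # degree -n - nu
print(total_413 == form(const=-1, n=1, nu2=1))      # degree n - 1 + nu2
```

Both comparisons are expected to hold, which matches the right-hand sides of (4.18) and (4.13) as far as the degrees are concerned.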
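Set $T = tX_*$, so that $dT = |t|^{2n-1}\,dt\,dm(X_*)$: then, the right-hand side of (4.16) transforms into the left-hand side in view of (4.18) and (4.15). As a last step, we now use the decomposition
\[
(h_1)_{\nu_1,\delta_1}(Y) = \frac{2^{-\nu_1}\, c(n-1+\nu_1, \delta_1)}{4\pi} \int_{\mathbb{R}^{2n}} (\mathcal{G}h_1)(R)\, \bigl|[Y, R]\bigr|_{\delta_1}^{-n-\nu_1}\, dR \tag{4.19}
\]
of $f_1 = (h_1)_{\nu_1,\delta_1}$, as provided by (4.4).

Proposition 4.3. Assume that all hypotheses from Proposition 3.1 are satisfied and that, moreover,
\[
\nu + \nu_1 \neq \delta_2, \delta_2 + 2, \dots, \qquad
\frac{-n+\nu+\nu_1+\nu_2}{2} \neq \varepsilon, \varepsilon + 2, \dots, \qquad
\frac{2-n-\nu+\nu_1+\nu_2}{2} \neq \varepsilon_2 + 1, \varepsilon_2 + 3, \dots \tag{4.20}
\]
and
\[
\operatorname{Re}\nu_1 > -n, \qquad \operatorname{Re}\nu_2 < n, \qquad \operatorname{Re}\nu < n. \tag{4.21}
\]
Then,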
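\[
\begin{aligned}
\Bigl\langle J^{\varepsilon_1,\varepsilon_2;\varepsilon}_{\nu_1,\nu_2;\nu}\bigl((h_1)_{\nu_1,\delta_1}, (h_2)_{\nu_2,\delta_2}\bigr),\, h_{-\nu,\delta} \Bigr\rangle
&= \frac{(-1)^{\varepsilon_2}\, 2^{-\nu_1}}{(4\pi)^3}\, \frac{c\bigl(\frac{-n+\nu+\nu_1+\nu_2}{2}, \varepsilon\bigr)}{c\bigl(\frac{n-2+\nu-\nu_1+\nu_2}{2}, \varepsilon_2\bigr)} \\
&\quad \times \int_{\mathbb{R}^{2n} \times \mathbb{R}^{2n}} (\mathcal{G}h_1)(R)\, h_2(S)\, \bigl|[R, S]\bigr|_{\varepsilon_1}^{\frac{-n-\nu-\nu_1+\nu_2}{2}}\, dR\, dS \\
&\quad \times \int_{\mathbb{R}^2} |r|_{j}^{\frac{n-2-\nu+\nu_1+\nu_2}{2}}\, |s|_{\varepsilon}^{\frac{n-2-\nu-\nu_1-\nu_2}{2}}\, h(rR + sS)\, dr\, ds,
\end{aligned} \tag{4.22}
\]
where the last integral must be understood in the distribution sense: recall that $j$ was defined in (2.34).

Proof. First, write the equation, of immediate verification,
\[
\begin{aligned}
&\mathcal{F}\Bigl((y; \eta) \mapsto |{-\eta_1}|_{\varepsilon}^{\frac{-n+\nu+\nu_1+\nu_2}{2}}\, |{-y_1}|_{\delta_1}^{-n-\nu_1}\Bigr)(t_1, t_*; \tau_1, \tau_*) \\
&\qquad = (-1)^{\delta_1}\, c(-n-\nu_1, \delta_1)\, c\Bigl(\frac{-n+\nu+\nu_1+\nu_2}{2}, \varepsilon\Bigr)\, |t_1|_{\varepsilon}^{\frac{n-2-\nu-\nu_1-\nu_2}{2}}\, |\tau_1|_{\delta_1}^{n-1+\nu_1}\, \delta(t_*)\, \delta(\tau_*).
\end{aligned} \tag{4.23}
\]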
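Next, under the generic condition $[R, S] \neq 0$, one can find $g \in \mathrm{Sp}(n, \mathbb{R})$ such that
\[
S = g e_1, \qquad R = [R, S]\, g e_{n+1}: \tag{4.24}
\]
it follows that
\[
\begin{aligned}
&\Bigl\langle \mathcal{F}\Bigl(Y \mapsto \bigl|[S, Y]\bigr|_{\varepsilon}^{\frac{-n+\nu+\nu_1+\nu_2}{2}}\, \bigl|[Y, R]\bigr|_{\delta_1}^{-n-\nu_1}\Bigr),\; T \mapsto \bigl|[T, S]\bigr|_{\varepsilon_1}^{\frac{-n-\nu-\nu_1+\nu_2}{2}} h(T) \Bigr\rangle \\
&\qquad = (-1)^{\delta_1}\, c(-n-\nu_1, \delta_1)\, c\Bigl(\frac{-n+\nu+\nu_1+\nu_2}{2}, \varepsilon\Bigr)\, \bigl|[R, S]\bigr|_{\delta_1}^{-n-\nu_1} \\
&\qquad\quad \times \Bigl\langle |t_1|_{\varepsilon}^{\frac{n-2-\nu-\nu_1-\nu_2}{2}}\, |\tau_1|_{\delta_1}^{n-1+\nu_1}\, \delta(t_*)\, \delta(\tau_*),\; |\tau_1|_{\varepsilon_1}^{\frac{-n-\nu-\nu_1+\nu_2}{2}}\, (h \circ g)(t_1, t_*; \tau_1, \tau_*) \Bigr\rangle.
\end{aligned} \tag{4.25}
\]
Since
\[
(h \circ g)(t_1, 0; \tau_1, 0) = h\Bigl(t_1 S + \tau_1 \frac{R}{[R, S]}\Bigr), \tag{4.26}
\]
we set $\tau_1 = [R, S]\, r$ and, for clarity, $t_1 = s$, getting
\[
\begin{aligned}
&\Bigl\langle \mathcal{F}\Bigl(Y \mapsto \bigl|[S, Y]\bigr|_{\varepsilon}^{\frac{-n+\nu+\nu_1+\nu_2}{2}}\, \bigl|[Y, R]\bigr|_{\delta_1}^{-n-\nu_1}\Bigr),\; T \mapsto \bigl|[T, S]\bigr|_{\varepsilon_1}^{\frac{-n-\nu-\nu_1+\nu_2}{2}} h(T) \Bigr\rangle \\
&\qquad = (-1)^{\delta_1}\, c(-n-\nu_1, \delta_1)\, c\Bigl(\frac{-n+\nu+\nu_1+\nu_2}{2}, \varepsilon\Bigr)\, \bigl|[R, S]\bigr|_{\varepsilon_1}^{\frac{-n-\nu-\nu_1+\nu_2}{2}} \\
&\qquad\quad \times \int_{\mathbb{R}^2} |r|_{j}^{\frac{n-2-\nu+\nu_1+\nu_2}{2}}\, |s|_{\varepsilon}^{\frac{n-2-\nu-\nu_1-\nu_2}{2}}\, h(rR + sS)\, dr\, ds
\end{aligned} \tag{4.27}
\]
as a result.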
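For the reader who wants to follow the passage from (4.25) to (4.27), the exponents can be tracked as follows. This is only a bookkeeping sketch for the degrees under the substitution $\tau_1 = [R, S]\,r$, $d\tau_1 = |[R, S]|\,dr$; the way the parity indices combine relies on (2.34) and is not reproduced here.

```latex
\begin{align*}
\text{power of } r:\quad
  & (n-1+\nu_1) + \frac{-n-\nu-\nu_1+\nu_2}{2}
    = \frac{n-2-\nu+\nu_1+\nu_2}{2}, \\
\text{power of } [R,S]:\quad
  & (-n-\nu_1) + (n-1+\nu_1) + \frac{-n-\nu-\nu_1+\nu_2}{2} + 1
    = \frac{-n-\nu-\nu_1+\nu_2}{2}.
\end{align*}
```

In the second line, the first summand is the factor already present in (4.25), the next two come from the two powers of $\tau_1$, and the final $+1$ is the Jacobian of the change of variable.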
Then, using (4.16) and (4.19) together with Eq. (4.7)
\[
(-1)^{\delta_1}\, c(n-1+\nu_1, \delta_1)\, c(-n-\nu_1, \delta_1) = 1, \tag{4.28}
\]
we obtain (4.22) under the conditions which made Lemma 4.2, and (4.16) as a consequence, valid. Analytic continuation is possible, the hypotheses from Proposition 3.1 giving a meaning to the left-hand side. The conditions (4.21) make it possible to extract $(h_1)_{\nu_1,\delta_1}$, $(h_2)_{\nu_2,\delta_2}$ and $h_{-\nu,\delta}$ from $h_1, h_2, h$; the first condition (4.20) gives a meaning to $|s|_{\delta_2}^{-1-\nu-\nu_1}$ as a distribution (the factor depending on $r$ is already locally summable from the previous condition), and the other two inequalities (4.20) make up half the conditions needed in order that the ratio
\[
\frac{c\bigl(\frac{-n+\nu+\nu_1+\nu_2}{2}, \varepsilon\bigr)}{c\bigl(\frac{n-2+\nu-\nu_1+\nu_2}{2}, \varepsilon_2\bigr)}
\]
be well defined and nonzero while, as it turns out, the other two conditions necessary for that have already been taken care of by the assumptions of Proposition 3.1. $\Box$

5. Some one-dimensional preparation

Let us briefly recall the spectral decomposition of the one-dimensional Euler operator in $L^2(\mathbb{R}^2)$, with the notation of Section 2. Given a function $h_{i\lambda,\delta}$ on $\mathbb{R}^2$, homogeneous of degree $-1-i\lambda$ and with a given parity specified by the index $\delta = 0$ or $1$, we set
\[
h^{\flat}_{i\lambda,\delta}(s) = h_{i\lambda,\delta}(s, 1) \tag{5.1}
\]
so that
\[
h_{i\lambda,\delta}(x, \xi) = |\xi|_{\delta}^{-1-i\lambda}\, h^{\flat}_{i\lambda,\delta}\Bigl(\frac{x}{\xi}\Bigr). \tag{5.2}
\]
Then, every function $h \in L^2(\mathbb{R}^2)$ can be decomposed as
\[
h = \sum_{\delta=0,1} \int_{-\infty}^{\infty} h_{i\lambda,\delta}\, d\lambda \tag{5.3}
\]
with
\[
h_{i\lambda,\delta}(x, \xi) = \frac{1}{2\pi} \int_0^{\infty} t^{i\lambda}\, h_{\delta}(tx, t\xi)\, dt, \tag{5.4}
\]
where $h_\delta$ denotes the even, or odd, part of $h$, according to whether $\delta = 0$ or $1$. Note that we denote here as $h_{i\lambda,\delta}$ the function denoted as $h_{\lambda,\delta}$ in [11, p. 34]. Using the equations (in which signed powers such as $|s|_{\delta}^{\alpha}$ have been defined in (2.2))
\[
\frac{d}{dx}\, |x|_{\delta}^{-1-\nu} = -(1+\nu)\, |x|_{1-\delta}^{-\nu-2} \qquad \text{and} \qquad \frac{d}{dx} \log|x| = x^{-1}, \tag{5.5}
\]
one obtains the well-known fact, already used in Section 3, that the function $\nu \mapsto |x|_{\delta}^{-1-\nu}$, a locally summable function if $\operatorname{Re}\nu < 0$, extends as a distribution-valued holomorphic function of $\nu$ for $\nu \neq \delta, \delta + 2, \dots$.
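The first equation in (5.5) can be spot-checked away from the origin with a finite difference. The sketch below assumes that the signed power of (2.2), which is not restated in this section, is $|x|_{\delta}^{\alpha} = |x|^{\alpha}\,(\operatorname{sign} x)^{\delta}$; the function names are hypothetical, and the check says nothing about the distributional behaviour at $x = 0$.

```python
import math

def signed_power(x, alpha, delta):
    """|x|_delta^alpha = |x|^alpha * sign(x)^delta for x != 0 (assumed convention)."""
    sign = math.copysign(1.0, x)
    return abs(x) ** alpha * (sign if delta % 2 else 1.0)

def lhs_derivative(x, nu, delta, step=1e-6):
    # Central finite difference of x -> |x|_delta^(-1 - nu).
    f = lambda t: signed_power(t, -1.0 - nu, delta)
    return (f(x + step) - f(x - step)) / (2.0 * step)

def rhs_formula(x, nu, delta):
    # Right-hand side of the first equation in (5.5).
    return -(1.0 + nu) * signed_power(x, -nu - 2.0, 1 - delta)

checks = [
    math.isclose(lhs_derivative(x, nu, delta), rhs_formula(x, nu, delta), rel_tol=1e-6)
    for x in (-1.3, -0.7, 0.7, 1.3)
    for nu in (-0.4, 0.3)
    for delta in (0, 1)
]
print(all(checks))
```

The point of the check is the parity flip $\delta \to 1 - \delta$: differentiating an even signed power produces an odd one and conversely, which is what drives the shifts of indices used later in this section.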
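If $|x|_{\delta_1}^{-1-\nu_1}$ and $|\xi|_{\delta_2}^{-1-\nu_2}$ make sense as distributions as just defined, the symbol $h(x, \xi) = |x|_{\delta_1}^{-1-\nu_1} \,\#\, |\xi|_{\delta_2}^{-1-\nu_2}$ makes sense as a tempered distribution on $\mathbb{R}^2$: in other words, the composition of the two operators, the first of which is the convolution by the inverse Fourier transform of $|\xi|_{\delta_2}^{-1-\nu_2}$, and the second is the multiplication by $|x|_{\delta_1}^{-1-\nu_1}$, is well defined as an operator from $\mathcal{S}(\mathbb{R})$ to $\mathcal{S}'(\mathbb{R})$. To see this, one may use as an intermediary space the space $\mathcal{O}_M$ [10, p. 101] of $C^\infty$ functions on the line each derivative of which is bounded by some polynomial. Under the lift from $h^{\flat}_{i\lambda,\delta}$ to $h_{i\lambda,\delta}$ provided by (5.2), the distribution associated to the function $|s|^{\frac{-1-\nu_1+\nu_2-i\lambda}{2}}$ is given as
\[
(x, \xi) \mapsto |x|^{\frac{-1-\nu_1+\nu_2-i\lambda}{2}}\, |\xi|_{\delta}^{\frac{-1+\nu_1-\nu_2-i\lambda}{2}} \tag{5.6}
\]
and the distribution associated to the function $|s|_{1}^{\frac{-1-\nu_1+\nu_2-i\lambda}{2}}$ is given as
\[
(x, \xi) \mapsto |x|_{1}^{\frac{-1-\nu_1+\nu_2-i\lambda}{2}}\, |\xi|_{1-\delta}^{\frac{-1+\nu_1-\nu_2-i\lambda}{2}}. \tag{5.7}
\]
Both distributions make sense if $\frac{-1 \pm (\nu_1-\nu_2) - i\lambda}{2} \neq -1, -2, \dots$, which is the case whenever $\lambda \in \mathbb{R}$ if one assumes that $|\operatorname{Re}(\nu_1 - \nu_2)| < 1$. We may then recall Lemma 5.1 from [11] as follows:

Lemma 5.1. Let $\nu_1, \nu_2 \in \mathbb{C}$ and $\delta_1, \delta_2 = 0$ or $1$: assume that $\nu_1 \neq \delta_1$, $\nu_2 \neq \delta_2$ and that $|\operatorname{Re}(\nu_1 \pm \nu_2)| < 1$, which implies that $|\operatorname{Re}\nu_1| < 1$, $|\operatorname{Re}\nu_2| < 1$. Let $\delta = 0$ or $1$ be such that $\delta \equiv \delta_1 + \delta_2 \bmod 2$. Set $h_1(x, \xi) = |x|_{\delta_1}^{-1-\nu_1}$, $h_2(x, \xi) = |\xi|_{\delta_2}^{-1-\nu_2}$ and $h = h_1 \,\#\, h_2$, a tempered distribution in $\mathbb{R}^2$. It admits the weak decomposition in $\mathcal{S}'(\mathbb{R}^2)$ given as
\[
h = \int_{-\infty}^{\infty} h_{i\lambda,\delta}\, d\lambda \tag{5.8}
\]
with
\[
\begin{aligned}
h_{i\lambda,\delta}(x, \xi) &= 2^{\frac{\nu_1+\nu_2-i\lambda-5}{2}}\, \pi^{\frac{\nu_1+\nu_2-i\lambda}{2}}\,
\frac{\Gamma\bigl(\frac{-\nu_1+\delta_1}{2}\bigr)\, \Gamma\bigl(\frac{-\nu_2+\delta_2}{2}\bigr)}{\Gamma\bigl(\frac{\nu_1+\delta_1+1}{2}\bigr)\, \Gamma\bigl(\frac{\nu_2+\delta_2+1}{2}\bigr)} \\
&\quad \times \Biggl[\, i^{\delta_2-\delta}\,
\frac{\Gamma\bigl(\frac{1+\nu_1-\nu_2+i\lambda}{4}\bigr)\, \Gamma\bigl(\frac{1+\nu_1+\nu_2-i\lambda+2\delta_1}{4}\bigr)\, \Gamma\bigl(\frac{1-\nu_1+\nu_2+i\lambda+2\delta}{4}\bigr)}{\Gamma\bigl(\frac{1-\nu_1+\nu_2-i\lambda}{4}\bigr)\, \Gamma\bigl(\frac{1-\nu_1-\nu_2+i\lambda+2\delta_1}{4}\bigr)\, \Gamma\bigl(\frac{1+\nu_1-\nu_2-i\lambda+2\delta}{4}\bigr)}\;
|x|^{\frac{-1-\nu_1+\nu_2-i\lambda}{2}}\, |\xi|_{\delta}^{\frac{-1+\nu_1-\nu_2-i\lambda}{2}} \\
&\qquad + i^{-\delta_2-\delta+1}\,
\frac{\Gamma\bigl(\frac{3+\nu_1-\nu_2+i\lambda}{4}\bigr)\, \Gamma\bigl(\frac{3+\nu_1+\nu_2-i\lambda-2\delta_1}{4}\bigr)\, \Gamma\bigl(\frac{3-\nu_1+\nu_2+i\lambda-2\delta}{4}\bigr)}{\Gamma\bigl(\frac{3-\nu_1+\nu_2-i\lambda}{4}\bigr)\, \Gamma\bigl(\frac{3-\nu_1-\nu_2+i\lambda-2\delta_1}{4}\bigr)\, \Gamma\bigl(\frac{3+\nu_1-\nu_2-i\lambda-2\delta}{4}\bigr)}\;
|x|_{1}^{\frac{-1-\nu_1+\nu_2-i\lambda}{2}}\, |\xi|_{1-\delta}^{\frac{-1+\nu_1-\nu_2-i\lambda}{2}} \Biggr].
\end{aligned} \tag{5.9}
\]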
Note that the integrand, as a distribution-valued function of $\lambda$, has no singularity on the real line. Also, as a consequence of Stirling's formula, the coefficient is bounded, for large $|\lambda|$, by some power of $|\lambda|$: since our claim is that the integral decomposition (5.8) is valid in a weak sense in $\mathcal{S}'(\mathbb{R}^2)$, we may ensure convergence by means of the equation
\[
|x|^{\frac{-1-\nu_1+\nu_2-i\lambda}{2}}\, |\xi|_{\delta}^{\frac{-1+\nu_1-\nu_2-i\lambda}{2}}
= \bigl(1 + \lambda^2\bigr)^{-N} \bigl(1 + 4\pi^2 \mathcal{E}^2\bigr)^{N} \Bigl(|x|^{\frac{-1-\nu_1+\nu_2-i\lambda}{2}}\, |\xi|_{\delta}^{\frac{-1+\nu_1-\nu_2-i\lambda}{2}}\Bigr), \tag{5.10}
\]
in which $2i\pi\mathcal{E} = 1 + x\frac{\partial}{\partial x} + \xi\frac{\partial}{\partial \xi}$, and of a similar one involving the second term on the right-hand side of (5.9).

We now need to consider the case of two symbols $|x|_{\delta_1}^{-n-\nu_1}$ and $|\xi|_{\delta_2}^{-n-\nu_2}$, in which $n = 1, 2, \dots$ is given, the same in both functions. The reason is that, even though the proof of the main theorem depends on the decomposition of symbols into homogeneous hyperplane waves, which are essentially one-dimensional objects, the spectral decomposition of the Euler operator in $L^2(\mathbb{R}^{2n})$ demands that we consider decompositions of the same species as (5.3) in which, however, the degrees of homogeneity of the functions in the decomposition lie on the complex line with real part $-n$ rather than $-1$.

Let $Q$ and $P$ be the basic infinitesimal operators of Heisenberg's representation, where $Q$ is the operator of multiplication by the variable $x$ on the real line, and $P = \frac{1}{2i\pi}\frac{d}{dx}$. Then, in the one-dimensional Weyl calculus, one has the commutation relations
\[
\bigl[Q, \mathrm{Op}(h)\bigr] = -\frac{1}{2i\pi}\, \mathrm{Op}\Bigl(\frac{\partial h}{\partial \xi}\Bigr), \qquad
\bigl[P, \mathrm{Op}(h)\bigr] = \frac{1}{2i\pi}\, \mathrm{Op}\Bigl(\frac{\partial h}{\partial x}\Bigr). \tag{5.11}
\]
Also, $P\,\mathrm{Op}(h) = \mathrm{Op}\bigl(\xi h + \frac{1}{4i\pi}\frac{\partial h}{\partial x}\bigr)$. If $h_1$ (resp. $h_2$) is a tempered distribution depending only on $x$ (resp. $\xi$), and if one sets $A_1 = \mathrm{Op}(h_1)$, $A_2 = \mathrm{Op}(h_2)$, one has (using the facts that $A_1$ commutes with $Q$, $A_2$ commutes with $P$ and the Heisenberg relation $[P, Q] = \frac{1}{2i\pi}$)
\[
[P, A_1][Q, A_2] = P[Q, A_1A_2] - [Q, A_1A_2P] - \frac{1}{2i\pi} A_1A_2: \tag{5.12}
\]
it follows that if $h = h_1 \,\#\, h_2$, the symbol of the operator $[P, \mathrm{Op}(h_1)][Q, \mathrm{Op}(h_2)]$ is the function
\[
\Bigl(\xi + \frac{1}{4i\pi}\frac{\partial}{\partial x}\Bigr)\Bigl(-\frac{1}{2i\pi}\frac{\partial h}{\partial \xi}\Bigr)
+ \frac{1}{2i\pi}\frac{\partial}{\partial \xi}\Bigl(\xi h - \frac{1}{4i\pi}\frac{\partial h}{\partial x}\Bigr)
- \frac{1}{2i\pi}\, h
= \frac{1}{4\pi^2}\, \frac{\partial^2 h}{\partial x\, \partial \xi}. \tag{5.13}
\]
In other words, under the present assumptions,
\[
\frac{\partial h_1}{\partial x} \,\#\, \frac{\partial h_2}{\partial \xi} = \frac{\partial^2 h}{\partial x\, \partial \xi}. \tag{5.14}
\]
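The operator identity (5.12) is stated without its intermediate steps. One possible way to fill them in, using only the three facts quoted before (5.12), is sketched below; it is meant as a reading aid rather than as the authors' own computation.

```latex
\begin{align*}
[Q, A_1A_2] &= A_1[Q, A_2]
  && (A_1 \text{ commutes with } Q), \\
[Q, A_1A_2P] &= A_1[Q, A_2]P + A_1A_2[Q, P]
  = A_1[Q, A_2]P - \frac{1}{2i\pi}A_1A_2, \\
[P, [Q, A_2]] &= 0
  && (\text{Jacobi identity, } [A_2, P] = 0,\ [P, Q] \text{ scalar}), \\
[P, A_1][Q, A_2] &= PA_1[Q, A_2] - A_1P[Q, A_2]
  = PA_1[Q, A_2] - A_1[Q, A_2]P \\
&= P[Q, A_1A_2] - [Q, A_1A_2P] - \frac{1}{2i\pi}A_1A_2 .
\end{align*}
```

Taking symbols on both sides with the help of (5.11), and comparing with $[P, A_1][Q, A_2] = \frac{1}{4\pi^2}\,\mathrm{Op}\bigl(\frac{\partial h_1}{\partial x}\bigr)\mathrm{Op}\bigl(\frac{\partial h_2}{\partial \xi}\bigr)$, is what yields (5.13) and then (5.14).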
Introduce, for $k = 0, 1, \dots$ and $a \in \mathbb{C}$, the Pochhammer symbols $(a)_k = a(a+1) \cdots (a+k-1)$, and extend the definition of $|s|_{\delta}^{\alpha}$ beyond the case when $\delta = 0$ or $1$, setting $|s|_{p}^{\alpha} = |s|_{p \bmod 2}^{\alpha}$. With the same assumptions about $\nu_1, \nu_2, \delta_1, \delta_2$ as in Lemma 5.1, one has for $n = 1, 2, \dots$ (using (5.5)) the equation
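\[
\begin{aligned}
&(1+\nu_1)_{n-1}\, (1+\nu_2)_{n-1}\; |x|_{n-1-\delta_1}^{-n-\nu_1} \,\#\, |\xi|_{n-1-\delta_2}^{-n-\nu_2} \\
&\quad = \int_{-\infty}^{\infty} \Bigl(\frac{1+\nu_1-\nu_2+i\lambda}{2}\Bigr)_{n-1} \Bigl(\frac{1-\nu_1+\nu_2+i\lambda}{2}\Bigr)_{n-1}\,
2^{\frac{\nu_1+\nu_2-i\lambda-5}{2}}\, \pi^{\frac{\nu_1+\nu_2-i\lambda}{2}}\,
\frac{\Gamma\bigl(\frac{-\nu_1+\delta_1}{2}\bigr)\, \Gamma\bigl(\frac{-\nu_2+\delta_2}{2}\bigr)}{\Gamma\bigl(\frac{\nu_1+\delta_1+1}{2}\bigr)\, \Gamma\bigl(\frac{\nu_2+\delta_2+1}{2}\bigr)} \\
&\qquad \times \Biggl[\, i^{\delta_2-\delta}\,
\frac{\Gamma\bigl(\frac{1+\nu_1-\nu_2+i\lambda}{4}\bigr)\, \Gamma\bigl(\frac{1+\nu_1+\nu_2-i\lambda+2\delta_1}{4}\bigr)\, \Gamma\bigl(\frac{1-\nu_1+\nu_2+i\lambda+2\delta}{4}\bigr)}{\Gamma\bigl(\frac{1-\nu_1+\nu_2-i\lambda}{4}\bigr)\, \Gamma\bigl(\frac{1-\nu_1-\nu_2+i\lambda+2\delta_1}{4}\bigr)\, \Gamma\bigl(\frac{1+\nu_1-\nu_2-i\lambda+2\delta}{4}\bigr)}\;
|x|_{n-1}^{\frac{1-2n-\nu_1+\nu_2-i\lambda}{2}}\, |\xi|_{n-1+\delta}^{\frac{1-2n+\nu_1-\nu_2-i\lambda}{2}} \\
&\qquad\quad + i^{-\delta_2-\delta+1}\,
\frac{\Gamma\bigl(\frac{3+\nu_1-\nu_2+i\lambda}{4}\bigr)\, \Gamma\bigl(\frac{3+\nu_1+\nu_2-i\lambda-2\delta_1}{4}\bigr)\, \Gamma\bigl(\frac{3-\nu_1+\nu_2+i\lambda-2\delta}{4}\bigr)}{\Gamma\bigl(\frac{3-\nu_1+\nu_2-i\lambda}{4}\bigr)\, \Gamma\bigl(\frac{3-\nu_1-\nu_2+i\lambda-2\delta_1}{4}\bigr)\, \Gamma\bigl(\frac{3+\nu_1-\nu_2-i\lambda-2\delta}{4}\bigr)}\;
|x|_{n}^{\frac{1-2n-\nu_1+\nu_2-i\lambda}{2}}\, |\xi|_{n-\delta}^{\frac{1-2n+\nu_1-\nu_2-i\lambda}{2}} \Biggr]\, d\lambda.
\end{aligned} \tag{5.15}
\]
Note that the degree of homogeneity of each of the two terms under the integral sign is $1 - 2n - i\lambda$, not $-n - i\lambda$ as we would wish it to be: we must thus perform a deformation of contour. We substitute $z \in \mathbb{C}$ for $i\lambda$ and we must move $z$ from the pure imaginary line to the line with real part $1 - n$. There is no convergence problem at infinity in the process, in view of (5.10). We must then chase for possible poles, setting $\mu = \frac{\nu_1-\nu_2+z}{2}$ and $\mu' = \frac{\nu_1-\nu_2-z}{2}$. The only singularities can arise from the factors depending on $x$ or $\xi$, or from the first and third Gamma functions in the numerator of each of the two major coefficients. We make a group of each of the expressions
\[
\begin{aligned}
&\Bigl(\tfrac12 + \mu\Bigr)_{n-1}\, \Gamma\Bigl(\tfrac14 + \tfrac{\mu}{2}\Bigr)\, |x|_{n-1}^{\frac12 - n - \mu}, \\
&\Bigl(\tfrac12 + \mu\Bigr)_{n-1}\, \Gamma\Bigl(\tfrac34 + \tfrac{\mu}{2}\Bigr)\, |x|_{n}^{\frac12 - n - \mu}, \\
&\Bigl(\tfrac12 - \mu'\Bigr)_{n-1}\, \Gamma\Bigl(\tfrac14 + \tfrac{\delta}{2} - \tfrac{\mu'}{2}\Bigr)\, |\xi|_{n-1+\delta}^{\frac12 - n + \mu'}, \\
&\Bigl(\tfrac12 - \mu'\Bigr)_{n-1}\, \Gamma\Bigl(\tfrac34 - \tfrac{\delta}{2} - \tfrac{\mu'}{2}\Bigr)\, |\xi|_{n-\delta}^{\frac12 - n + \mu'}.
\end{aligned} \tag{5.16}
\]
We now show that each of the four functions under consideration remains a holomorphic function of $z$ in a neighbourhood of the closed strip $1 - n \le \operatorname{Re} z \le 0$. First we show that the Gamma factor and the distribution (in $x$ or $\xi$) on any of the four lines have disjoint sets of singularities as functions of $z$. This is a consequence of the fact, noted just after (5.5), that $|x|_{\delta}^{-\alpha}$ is a well-defined distribution in $x$ provided that $\alpha \neq \delta + 1, \delta + 3, \dots$. For, as a consequence, the singularities of the factor depending on $x$ or $\xi$ on the four lines are reached when $\mu \in \frac12 + 2\mathbb{N}$, resp. $\mu \in \frac32 + 2\mathbb{N}$, resp. $\mu' \in -\delta - \frac12 - 2\mathbb{N}$, resp. $\mu' \in \delta - \frac32 + 2\mathbb{N}$, while the singularities of the corresponding Gamma factors are reached when $\mu \in -\frac12 - 2\mathbb{N}$, resp. $\mu \in -\frac32 - 2\mathbb{N}$, resp. $\mu' \in -\delta + \frac12 + 2\mathbb{N}$, resp. $\mu' \in -\delta + \frac32 + 2\mathbb{N}$.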
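Since the two sets of singularities under consideration are disjoint, what remains to be proved is that each of the eight expressions
\[
\begin{aligned}
&\Bigl(\tfrac12 + \mu\Bigr)_{n-1}\, \Gamma\Bigl(\tfrac14 + \tfrac{\mu}{2}\Bigr), &\qquad& \Bigl(\tfrac12 + \mu\Bigr)_{n-1}\, |x|_{n-1}^{\frac12 - n - \mu}, \\
&\Bigl(\tfrac12 + \mu\Bigr)_{n-1}\, \Gamma\Bigl(\tfrac34 + \tfrac{\mu}{2}\Bigr), &\qquad& \Bigl(\tfrac12 + \mu\Bigr)_{n-1}\, |x|_{n}^{\frac12 - n - \mu}, \\
&\Bigl(\tfrac12 - \mu'\Bigr)_{n-1}\, \Gamma\Bigl(\tfrac14 + \tfrac{\delta}{2} - \tfrac{\mu'}{2}\Bigr), &\qquad& \Bigl(\tfrac12 - \mu'\Bigr)_{n-1}\, |\xi|_{n-1+\delta}^{\frac12 - n + \mu'}, \\
&\Bigl(\tfrac12 - \mu'\Bigr)_{n-1}\, \Gamma\Bigl(\tfrac34 - \tfrac{\delta}{2} - \tfrac{\mu'}{2}\Bigr), &\qquad& \Bigl(\tfrac12 - \mu'\Bigr)_{n-1}\, |\xi|_{n-\delta}^{\frac12 - n + \mu'}
\end{aligned} \tag{5.17}
\]
is regular for $z$ lying in the strip $1 - n \le \operatorname{Re} z \le 0$. So far as the distribution on the right of each line is concerned, we write it as $(-1)^{n-1}$ times the $\bigl(\frac{d}{dx}\bigr)^{n-1}$, or $\bigl(\frac{d}{d\xi}\bigr)^{n-1}$-derivative of the distribution $|x|^{-\frac12 - \mu}$, resp. $|x|_{1}^{-\frac12 - \mu}$, resp. $|\xi|_{\delta}^{-\frac12 + \mu'}$, resp. $|\xi|_{1-\delta}^{-\frac12 + \mu'}$. Now, the condition $\operatorname{Re} z \le 0$, together with the assumption $|\operatorname{Re}(\nu_1 - \nu_2)| < 1$, implies that $\operatorname{Re}\mu < \frac12$ and $\operatorname{Re}\mu' > -\frac12$, which gives the four distributions under consideration a meaning as locally summable functions. So far as the Gamma factors are concerned, every other term in the product
\[
\Bigl(\tfrac12 + \mu\Bigr)_{n-1} = \Bigl(\tfrac12 + \mu\Bigr)\Bigl(\tfrac32 + \mu\Bigr) \cdots \Bigl(n - \tfrac32 + \mu\Bigr)
\quad \text{or} \quad
\Bigl(\tfrac12 - \mu'\Bigr)_{n-1} = \Bigl(\tfrac12 - \mu'\Bigr)\Bigl(\tfrac32 - \mu'\Bigr) \cdots \Bigl(n - \tfrac32 - \mu'\Bigr) \tag{5.18}
\]
will help in killing the relevant poles of the corresponding Gamma factor. Indeed, with $p = 1, 2, \dots$, each of the two expressions $(\frac12 + \mu)_{2p-1}\, \Gamma(\frac14 + \frac{\mu}{2})$ and $(\frac12 + \mu)_{2p-2}\, \Gamma(\frac14 + \frac{\mu}{2})$ is the product of a polynomial in $\mu$ by the function $\Gamma(p + \frac14 + \frac{\mu}{2})$, while each of the two expressions $(\frac12 + \mu)_{2p-1}\, \Gamma(\frac34 + \frac{\mu}{2})$ and $(\frac12 + \mu)_{2p-2}\, \Gamma(\frac34 + \frac{\mu}{2})$ is the product of a polynomial in $\mu$ by the function $\Gamma(p - \frac14 + \frac{\mu}{2})$. The last two expressions to be analyzed are $(\frac12 - \mu')_{n-1}\, \Gamma(\frac14 - \frac{\mu'}{2})$ and $(\frac12 - \mu')_{n-1}\, \Gamma(\frac34 - \frac{\mu'}{2})$. We use this time the inequality $\operatorname{Re}\mu' < \frac{n}{2}$ and observe that each of the two expressions $(\frac12 - \mu')_{2p-1}\, \Gamma(\frac14 - \frac{\mu'}{2})$ and $(\frac12 - \mu')_{2p-2}\, \Gamma(\frac14 - \frac{\mu'}{2})$ is the product of a polynomial by $\Gamma(p + \frac14 - \frac{\mu'}{2})$, while each of the two expressions $(\frac12 - \mu')_{2p-1}\, \Gamma(\frac34 - \frac{\mu'}{2})$ and $(\frac12 - \mu')_{2p-2}\, \Gamma(\frac34 - \frac{\mu'}{2})$ is the product of a polynomial by $\Gamma(p - \frac14 - \frac{\mu'}{2})$.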
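The pole-cancellation mechanism can be illustrated on the first of these products. Splitting $(\frac12 + \mu)_{2p-1}$ into its factors of even and odd rank, the even-rank factors are $2(\frac14 + \frac{\mu}{2} + j)$, $j = 0, \dots, p-1$, and they absorb $\Gamma(\frac14 + \frac{\mu}{2})$ into $\Gamma(p + \frac14 + \frac{\mu}{2})$. The sketch below checks this one identity numerically for real test values of $\mu$; the function names are hypothetical and the other seven cases are not covered.

```python
import math

def pochhammer(a, k):
    """(a)_k = a (a+1) ... (a+k-1), with (a)_0 = 1."""
    result = 1.0
    for j in range(k):
        result *= a + j
    return result

def lhs(mu, p):
    # (1/2 + mu)_{2p-1} * Gamma(1/4 + mu/2)
    return pochhammer(0.5 + mu, 2 * p - 1) * math.gamma(0.25 + mu / 2.0)

def rhs(mu, p):
    # 2^p * prod_{j=0}^{p-2} (3/2 + mu + 2j) * Gamma(p + 1/4 + mu/2):
    # the polynomial collects the odd-rank factors of the Pochhammer symbol.
    poly = 2.0 ** p
    for j in range(p - 1):
        poly *= 1.5 + mu + 2 * j
    return poly * math.gamma(p + 0.25 + mu / 2.0)

agree = all(
    math.isclose(lhs(mu, p), rhs(mu, p), rel_tol=1e-9)
    for mu in (0.3, -0.2, -2.7)
    for p in (1, 2, 3)
)
print(agree)
```

Since $\Gamma(p + \frac14 + \frac{\mu}{2})$ has its first pole at $\mu = -2p - \frac12$, the poles of $\Gamma(\frac14 + \frac{\mu}{2})$ at $\mu = -\frac12, -\frac52, \dots, -2p + \frac32$ are indeed removed by the polynomial factor.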
Performing the change of contour which was the aim of the lengthy preparation just made, we finally obtain the following.

Lemma 5.2. Let $\nu_1, \nu_2 \in \mathbb{C}$ and $\delta_1, \delta_2 = 0$ or $1$: assume that $\nu_1 \neq \delta_1$, $\nu_2 \neq \delta_2$ and that $|\operatorname{Re}(\nu_1 \pm \nu_2)| < 1$. Let $n = 1, 2, \dots$, and let $\delta, \delta_1', \delta_2'$ be the numbers, all equal to $0$ or $1$, characterized by the congruences mod 2
\[
\delta \equiv \delta_1 + \delta_2, \qquad \delta_1' \equiv n - 1 - \delta_1, \qquad \delta_2' \equiv n - 1 - \delta_2. \tag{5.19}
\]
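Set $h_1(x, \xi) = |x|_{\delta_1'}^{-n-\nu_1}$, $h_2(x, \xi) = |\xi|_{\delta_2'}^{-n-\nu_2}$ and let $h = h_1 \,\#\, h_2$, a tempered distribution in $\mathbb{R}^2$. It admits the weak decomposition in $\mathcal{S}'(\mathbb{R}^2)$ given as
\[
h = \int_{-\infty}^{\infty} h^{(n)}_{i\lambda,\delta}\, d\lambda \tag{5.20}
\]
with
\[
\begin{aligned}
h^{(n)}_{i\lambda,\delta}(x, \xi) &= (1+\nu_1)_{n-1}^{-1}\, (1+\nu_2)_{n-1}^{-1}\, \Bigl(\frac{2-n+\nu_1-\nu_2+i\lambda}{2}\Bigr)_{n-1} \Bigl(\frac{2-n-\nu_1+\nu_2+i\lambda}{2}\Bigr)_{n-1} \\
&\quad \times 2^{\frac{\nu_1+\nu_2-i\lambda+n-6}{2}}\, \pi^{\frac{n-1+\nu_1+\nu_2-i\lambda}{2}}\,
\frac{\Gamma\bigl(\frac{-\nu_1+\delta_1}{2}\bigr)\, \Gamma\bigl(\frac{-\nu_2+\delta_2}{2}\bigr)}{\Gamma\bigl(\frac{\nu_1+\delta_1+1}{2}\bigr)\, \Gamma\bigl(\frac{\nu_2+\delta_2+1}{2}\bigr)} \\
&\quad \times \Biggl[\, i^{\delta_2-\delta}\,
\frac{\Gamma\bigl(\frac{2-n+\nu_1-\nu_2+i\lambda}{4}\bigr)\, \Gamma\bigl(\frac{n+\nu_1+\nu_2-i\lambda+2\delta_1}{4}\bigr)\, \Gamma\bigl(\frac{2-n-\nu_1+\nu_2+i\lambda+2\delta}{4}\bigr)}{\Gamma\bigl(\frac{n-\nu_1+\nu_2-i\lambda}{4}\bigr)\, \Gamma\bigl(\frac{2-n-\nu_1-\nu_2+i\lambda+2\delta_1}{4}\bigr)\, \Gamma\bigl(\frac{n+\nu_1-\nu_2-i\lambda+2\delta}{4}\bigr)}\;
|x|_{n-1}^{\frac{-n-\nu_1+\nu_2-i\lambda}{2}}\, |\xi|_{n-1-\delta}^{\frac{-n+\nu_1-\nu_2-i\lambda}{2}} \\
&\qquad + i^{-\delta_2-\delta+1}\,
\frac{\Gamma\bigl(\frac{4-n+\nu_1-\nu_2+i\lambda}{4}\bigr)\, \Gamma\bigl(\frac{n+2+\nu_1+\nu_2-i\lambda-2\delta_1}{4}\bigr)\, \Gamma\bigl(\frac{4-n-\nu_1+\nu_2+i\lambda-2\delta}{4}\bigr)}{\Gamma\bigl(\frac{n+2-\nu_1+\nu_2-i\lambda}{4}\bigr)\, \Gamma\bigl(\frac{4-n-\nu_1-\nu_2+i\lambda-2\delta_1}{4}\bigr)\, \Gamma\bigl(\frac{n+2+\nu_1-\nu_2-i\lambda-2\delta}{4}\bigr)}\;
|x|_{n}^{\frac{-n-\nu_1+\nu_2-i\lambda}{2}}\, |\xi|_{n-\delta}^{\frac{-n+\nu_1-\nu_2-i\lambda}{2}} \Biggr],
\end{aligned} \tag{5.21}
\]
where we recall our convention that $|s|_{p}^{\alpha} = |s|_{p'}^{\alpha}$ with $p' = 0$ or $1$ and $p' \equiv p \bmod 2$.

In the proof of Lemma 5.2, we have avoided moving $\nu_1$ and $\nu_2$, which would have complicated the pole chasing even more. It is, however, necessary to check that analytic continuation with respect to $\nu_1$ and $\nu_2$ is possible up to some point, in the sense of the following lemma.

Lemma 5.3. Set $\nu_1' = n - 1 + \nu_1$, $\nu_2' = n - 1 + \nu_2$, so that $|x|_{\delta_1'}^{-1-\nu_1'} = |x|_{\delta_1'}^{-n-\nu_1}$ and $|\xi|_{\delta_2'}^{-1-\nu_2'} = |\xi|_{\delta_2'}^{-n-\nu_2}$. To obtain the term $h^{(n)}_{i\lambda,\delta}$ from the decomposition (5.20) of $h_1 \,\#\, h_2$ (same notation as in Lemma 5.2), it suffices to perform the substitutions $\nu_1 \mapsto \nu_1'$, $\nu_2 \mapsto \nu_2'$ and $i\lambda \mapsto \nu = i\lambda + n - 1$ on the right-hand side of (5.9).

Proof. The proof, based on the duplication formula and on the formula of complements for the Gamma function, is perfectly ugly, though one can take solace in the fact that it offers a means of verification. Starting from the right-hand side of (5.9) and making the substitution $(\nu_1, \nu_2, i\lambda) \mapsto (\nu_1', \nu_2', i\lambda + n - 1)$, we want to show that we just obtain the right-hand side of (5.21). We shall limit ourselves to the case when $n$ is odd. One has
\[
(1+\nu_1)_{n-1}^{-1} = \frac{\Gamma(1-n-\nu_1)}{\Gamma(-\nu_1)} = 2^{1-n}\, \frac{\Gamma\bigl(\frac{1-n-\nu_1+\delta_1}{2}\bigr)\, \Gamma\bigl(\frac{2-n-\nu_1-\delta_1}{2}\bigr)}{\Gamma\bigl(\frac{-\nu_1+\delta_1}{2}\bigr)\, \Gamma\bigl(\frac{1-\nu_1-\delta_1}{2}\bigr)}, \tag{5.22}
\]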
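so that
\[
(1+\nu_1)_{n-1}^{-1}\, \frac{\Gamma\bigl(\frac{-\nu_1+\delta_1}{2}\bigr)}{\Gamma\bigl(\frac{1+\nu_1+\delta_1}{2}\bigr)}
= 2^{1-n}\, \frac{\Gamma\bigl(\frac{1-n-\nu_1+\delta_1}{2}\bigr)\, \Gamma\bigl(\frac{2-n-\nu_1-\delta_1}{2}\bigr)}{\Gamma\bigl(\frac{1+\nu_1+\delta_1}{2}\bigr)\, \Gamma\bigl(\frac{1-\nu_1-\delta_1}{2}\bigr)}
= 2^{1-n}\, \frac{\Gamma\bigl(\frac{1-n-\nu_1+\delta_1}{2}\bigr)}{\Gamma\bigl(\frac{n+\nu_1+\delta_1}{2}\bigr)}, \tag{5.23}
\]
$2^{1-n}$ times the corresponding coefficient arising after the shift $\nu_1 \mapsto \nu_1'$ from a factor $\Gamma\bigl(\frac{-\nu_1+\delta_1}{2}\bigr) / \Gamma\bigl(\frac{1+\nu_1+\delta_1}{2}\bigr)$ in (5.9). The same goes so far as the comparable coefficient depending on $\nu_2$ is concerned. The powers of $2$ and $\pi$, as well as the Gamma factors in the middle of the coefficients we are interested in, transform in an immediately satisfactory way. The remaining headache arises from the coefficient, obtained from (5.9) and the required shift,
\[
B := \frac{\Gamma\bigl(\frac{n+\nu_1-\nu_2+i\lambda}{4}\bigr)\, \Gamma\bigl(\frac{n-\nu_1+\nu_2+i\lambda+2\delta}{4}\bigr)}{\Gamma\bigl(\frac{2-n-\nu_1+\nu_2-i\lambda}{4}\bigr)\, \Gamma\bigl(\frac{2-n+\nu_1-\nu_2-i\lambda+2\delta}{4}\bigr)}: \tag{5.24}
\]
multiplying by $\Gamma\bigl(\frac{4-n-\nu_1+\nu_2-i\lambda}{4}\bigr)\, \Gamma\bigl(\frac{4-n+\nu_1-\nu_2-i\lambda-2\delta}{4}\bigr)$ up and down, using the formula of complements upstairs and the duplication formula downstairs, we obtain
\[
B = \frac{\pi}{2^{n+i\lambda}} \Bigl[\sin \pi\frac{n+\nu_1-\nu_2+i\lambda}{4}\; \sin \pi\frac{n-\nu_1+\nu_2+i\lambda+2\delta}{4}\Bigr]^{-1}
\Bigl[\Gamma\Bigl(\frac{2-n+\nu_1-\nu_2-i\lambda}{2}\Bigr)\, \Gamma\Bigl(\frac{2-n-\nu_1+\nu_2-i\lambda}{2}\Bigr)\Bigr]^{-1}. \tag{5.25}
\]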
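Since the proof itself stresses that these manipulations offer a means of verification, a numerical spot check may be welcome. The sketch below tests the two equalities in (5.22) for odd $n$, and the passage from (5.24) to (5.25), using only the real Gamma function of the standard library. As an assumption made purely for testing, the imaginary quantity $i\lambda$ is replaced by a real stand-in `z`, which is legitimate only because both sides are analytic in that variable; the function names are hypothetical.

```python
import math

def pochhammer(a, k):
    """(a)_k = a (a+1) ... (a+k-1), with (a)_0 = 1."""
    result = 1.0
    for j in range(k):
        result *= a + j
    return result

def check_522(n, nu, delta):
    """Both equalities of (5.22), for odd n and delta in {0, 1}."""
    first = 1.0 / pochhammer(1.0 + nu, n - 1)
    second = math.gamma(1 - n - nu) / math.gamma(-nu)
    third = (2.0 ** (1 - n)
             * math.gamma((1 - n - nu + delta) / 2) * math.gamma((2 - n - nu - delta) / 2)
             / (math.gamma((-nu + delta) / 2) * math.gamma((1 - nu - delta) / 2)))
    return (math.isclose(first, second, rel_tol=1e-9)
            and math.isclose(first, third, rel_tol=1e-9))

def b_direct(n, nu1, nu2, z, delta):
    # The quotient (5.24), with z standing for i*lambda.
    num = (math.gamma((n + nu1 - nu2 + z) / 4)
           * math.gamma((n - nu1 + nu2 + z + 2 * delta) / 4))
    den = (math.gamma((2 - n - nu1 + nu2 - z) / 4)
           * math.gamma((2 - n + nu1 - nu2 - z + 2 * delta) / 4))
    return num / den

def b_transformed(n, nu1, nu2, z, delta):
    # The right-hand side of (5.25), with z standing for i*lambda.
    sines = (math.sin(math.pi * (n + nu1 - nu2 + z) / 4)
             * math.sin(math.pi * (n - nu1 + nu2 + z + 2 * delta) / 4))
    gammas = (math.gamma((2 - n + nu1 - nu2 - z) / 2)
              * math.gamma((2 - n - nu1 + nu2 - z) / 2))
    return math.pi / (2.0 ** (n + z) * sines * gammas)

ok_522 = all(check_522(n, 0.3, d) for n in (3, 5) for d in (0, 1))
ok_525 = all(
    math.isclose(b_direct(3, 0.3, -0.2, 0.1, d), b_transformed(3, 0.3, -0.2, 0.1, d), rel_tol=1e-9)
    for d in (0, 1)
)
print(ok_522, ok_525)
```

The test values are chosen so that no Gamma argument hits a pole. The comparison of $A$ with $B$ carried out further on is not covered by this sketch.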
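This must be compared to the similar coefficient from (5.21), which must be accompanied, as a factor, by the product of the two remaining Pochhammer symbols. This is
\[
A := \frac{\Gamma\bigl(\frac{n-\nu_1+\nu_2-i\lambda}{2}\bigr)}{\Gamma\bigl(\frac{2-n-\nu_1+\nu_2-i\lambda}{2}\bigr)}\, \frac{\Gamma\bigl(\frac{n+\nu_1-\nu_2-i\lambda}{2}\bigr)}{\Gamma\bigl(\frac{2-n+\nu_1-\nu_2-i\lambda}{2}\bigr)}
\times \frac{\Gamma\bigl(\frac{2-n+\nu_1-\nu_2+i\lambda}{4}\bigr)\, \Gamma\bigl(\frac{2-n-\nu_1+\nu_2+i\lambda+2\delta}{4}\bigr)}{\Gamma\bigl(\frac{n-\nu_1+\nu_2-i\lambda}{4}\bigr)\, \Gamma\bigl(\frac{n+\nu_1-\nu_2-i\lambda+2\delta}{4}\bigr)}: \tag{5.26}
\]
if we multiply the product of fractions on the second line, up and down, by $\Gamma\bigl(\frac{2+n-\nu_1+\nu_2-i\lambda}{4}\bigr)\, \Gamma\bigl(\frac{2+n+\nu_1-\nu_2-i\lambda-2\delta}{4}\bigr)$, if we apply again the formula of complements upstairs and the duplication formula downstairs, it becomes
\[
\frac{\pi}{2^{2-n+i\lambda}} \Bigl[\sin \pi\frac{2-n+\nu_1-\nu_2+i\lambda}{4}\; \sin \pi\frac{2-n-\nu_1+\nu_2+i\lambda+2\delta}{4}\Bigr]^{-1}
\Bigl[\Gamma\Bigl(\frac{n+\nu_1-\nu_2-i\lambda}{2}\Bigr)\, \Gamma\Bigl(\frac{n-\nu_1+\nu_2-i\lambda}{2}\Bigr)\Bigr]^{-1}. \tag{5.27}
\]
It follows that $A = 2^{2n-2} B$, which completes our verification, in the case when $n$ is odd, so far as the coefficient of the first term on the right-hand side of (5.9) or (5.21) is concerned. We shall not write down everything in the case when (still with $n$ odd) the coefficient of the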
second term is concerned. The trick is, this time, to multiply the fraction $B'$ which takes the place of $B$, up and down, by $\Gamma\bigl(\frac{2-n-\nu_1+\nu_2-i\lambda}{4}\bigr)\, \Gamma\bigl(\frac{2-n+\nu_1-\nu_2-i\lambda+2\delta}{4}\bigr)$; next, the fraction on the second line of the expression $A'$ which takes the place of $A$ is to be multiplied, up and down, by $\Gamma\bigl(\frac{n-\nu_1+\nu_2-i\lambda}{4}\bigr)\, \Gamma\bigl(\frac{n+\nu_1-\nu_2-i\lambda+2\delta}{4}\bigr)$: again, we find that $A' = 2^{2n-2} B'$. The lemma is thus proved in the case when $n$ is odd. The proof is of course similar in the case when it is even: only, one should not forget that, in this case, $\delta_1' = 1 - \delta_1$ and $\delta_2' = 1 - \delta_2$. Also, the right-hand side of (5.9) will yield, after transformation, the two terms on the right-hand side of (5.21) in reverse order. $\Box$

Making all Gamma factors apparent has been necessary for the discussion of the change of complex contour. Using the shorthand provided by (3.32), i.e., making the substitution
\[
\frac{\Gamma\bigl(\frac{\rho+1+\delta}{2}\bigr)}{\Gamma\bigl(\frac{-\rho+\delta}{2}\bigr)} = i^{\delta}\, \pi^{\rho + \frac12}\, c(\rho, \delta), \tag{5.28}
\]
one obtains the following.

Proposition 5.4. Under the assumptions of Lemma 5.2, one has
\[
h^{(n)}_{i\lambda,\delta}(x, \xi) = C_0(\nu_1, \nu_2, i\lambda; \delta_1, \delta_2, \delta)\, |x|^{\frac{-n-\nu_1+\nu_2-i\lambda}{2}}\, |\xi|_{\delta}^{\frac{-n+\nu_1-\nu_2-i\lambda}{2}}
+ C_1(\nu_1, \nu_2, i\lambda; \delta_1, \delta_2, \delta)\, |x|_{1}^{\frac{-n-\nu_1+\nu_2-i\lambda}{2}}\, |\xi|_{1-\delta}^{\frac{-n+\nu_1-\nu_2-i\lambda}{2}}, \tag{5.29}
\]
with
\[
\begin{aligned}
C_0(\nu_1, \nu_2, i\lambda; \delta_1, \delta_2, \delta)
&= 2^{\frac{\nu_1+\nu_2-i\lambda+n-6}{2}}\, \pi^{-1}\, (-1)^{\delta}\, c(-n-\nu_1, \delta_1)\, c(-n-\nu_2, \delta_2)\, c\Bigl(\frac{n-2+\nu_1-\nu_2+i\lambda}{2}, 0\Bigr) \\
&\quad \times c\Bigl(\frac{n-2-\nu_1+\nu_2+i\lambda}{2}, \delta_1\Bigr)\, c\Bigl(\frac{n-2+\nu_1+\nu_2-i\lambda}{2}, \delta\Bigr)
\end{aligned} \tag{5.30}
\]
and
\[
\begin{aligned}
C_1(\nu_1, \nu_2, i\lambda; \delta_1, \delta_2, \delta)
&= 2^{\frac{\nu_1+\nu_2-i\lambda+n-6}{2}}\, \pi^{-1}\, (-1)^{\delta}\, c(-n-\nu_1, \delta_1)\, c(-n-\nu_2, \delta_2)\, c\Bigl(\frac{n-2+\nu_1-\nu_2+i\lambda}{2}, 1\Bigr) \\
&\quad \times c\Bigl(\frac{n-2-\nu_1+\nu_2+i\lambda}{2}, 1-\delta_1\Bigr)\, c\Bigl(\frac{n-2+\nu_1+\nu_2-i\lambda}{2}, 1-\delta\Bigr).
\end{aligned} \tag{5.31}
\]
In view of the proof of the main theorem in next section, and as a final topic in this very computational section, we compute the $\mathcal{G}$-transform (4.1) of the symbol $|x_1|_{\delta_1}^{-n-\nu_1} \,\#\, |\xi_1|_{\delta_2}^{-n+\nu_2}$, considered as a distribution in $\mathbb{R}^{2n}$: we still set $x = (x_1, x_*)$, $\xi = (\xi_1, \xi_*)$. The change $\nu_2 \mapsto -\nu_2$ is needed for the application in next section: at the same time, we change the variable of integration $\lambda$ to $-\lambda$ below so as to decompose the result as an integral superposition of distributions
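of type $(-n - i\lambda, \delta)$; we denote as $k^{(n)}_{-i\lambda,\delta}$ the function obtained from $h^{(n)}_{i\lambda,\delta}$ after these two sign changes.

Proposition 5.5. Assume that $\nu_1 \neq \delta_1$, $-\nu_2 \neq \delta_2$ and $|\operatorname{Re}(\nu_1 \pm \nu_2)| < 1$. One has the weak decomposition in $\mathcal{S}'(\mathbb{R}^{2n})$, given by the equation
\[
\mathcal{G}\Bigl(Y \mapsto |y_1|_{\delta_1}^{-n-\nu_1} \,\#\, |\eta_1|_{\delta_2}^{-n+\nu_2}\Bigr)(x, \xi) = \int_{-\infty}^{\infty} \bigl(\mathcal{G}k^{(n)}_{-i\lambda,\delta}\bigr)(x, \xi)\, d\lambda \tag{5.32}
\]
with
\[
\begin{aligned}
\bigl(\mathcal{G}k^{(n)}_{-i\lambda,\delta}\bigr)(x, \xi)
&= B_0(\nu_1, \nu_2, i\lambda; \delta_1, \delta_2, \delta)\, |x_1|_{\delta}^{\frac{n-2-\nu_1-\nu_2-i\lambda}{2}}\, |\xi_1|^{\frac{n-2+\nu_1+\nu_2-i\lambda}{2}}\, \delta(x_*)\, \delta(\xi_*) \\
&\quad + B_1(\nu_1, \nu_2, i\lambda; \delta_1, \delta_2, \delta)\, |x_1|_{1-\delta}^{\frac{n-2-\nu_1-\nu_2-i\lambda}{2}}\, |\xi_1|_{1}^{\frac{n-2+\nu_1+\nu_2-i\lambda}{2}}\, \delta(x_*)\, \delta(\xi_*),
\end{aligned} \tag{5.33}
\]
where
\[
B_0(\nu_1, \nu_2, i\lambda; \delta_1, \delta_2, \delta) = 2^{\frac{\nu_1-\nu_2-i\lambda+n-6}{2}}\, \pi^{-1}\, c(-n-\nu_1, \delta_1)\, c(-n+\nu_2, \delta_2)\, c\Bigl(\frac{n-2+\nu_1-\nu_2+i\lambda}{2}, \delta_1\Bigr) \tag{5.34}
\]
and
\[
B_1(\nu_1, \nu_2, i\lambda; \delta_1, \delta_2, \delta) = -2^{\frac{\nu_1-\nu_2-i\lambda+n-6}{2}}\, \pi^{-1}\, c(-n-\nu_1, \delta_1)\, c(-n+\nu_2, \delta_2)\, c\Bigl(\frac{n-2+\nu_1-\nu_2+i\lambda}{2}, 1-\delta_1\Bigr). \tag{5.35}
\]
Proof. This is a consequence of the preceding proposition, together with the equation
\[
\mathcal{G}\Bigl(Y \mapsto |y_1|_{\omega_1}^{\alpha}\, |\eta_1|_{\omega_2}^{\beta}\Bigr)(x, \xi) = 2^{-n-\alpha-\beta}\, (-1)^{\omega_2}\, c(\alpha, \omega_1)\, c(\beta, \omega_2)\, |x_1|_{\omega_2}^{-1-\beta}\, |\xi_1|_{\omega_1}^{-1-\alpha}\, \delta(x_*)\, \delta(\xi_*). \tag{5.36}
\]
A simplification occurs from the use of Eqs. (4.7)
\[
\begin{aligned}
c\Bigl(\frac{n-2+\nu_1+\nu_2-i\lambda}{2}, 0\Bigr)\, c\Bigl(\frac{-n-\nu_1-\nu_2+i\lambda}{2}, 0\Bigr) &= 1, \\
c\Bigl(\frac{n-2+\nu_1+\nu_2-i\lambda}{2}, \delta\Bigr)\, c\Bigl(\frac{-n-\nu_1-\nu_2+i\lambda}{2}, \delta\Bigr) &= (-1)^{\delta}, \\
c\Bigl(\frac{n-2+\nu_1+\nu_2-i\lambda}{2}, 1\Bigr)\, c\Bigl(\frac{-n-\nu_1-\nu_2+i\lambda}{2}, 1\Bigr) &= -1, \\
c\Bigl(\frac{n-2+\nu_1+\nu_2-i\lambda}{2}, 1-\delta\Bigr)\, c\Bigl(\frac{-n-\nu_1-\nu_2+i\lambda}{2}, 1-\delta\Bigr) &= (-1)^{1-\delta}.
\end{aligned} \tag{5.37}
\]
$\Box$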
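6. Another composition of Weyl symbols

Theorem 6.1. Given $\delta_1, \delta_2$ and $\delta = 0$ or $1$ with $\delta \equiv \delta_1 + \delta_2 \bmod 2$, and $j = 0$ or $1$, define $\varepsilon_1, \varepsilon_2, \varepsilon$ by means of (2.34), and set, for real $\lambda_1, \lambda_2, \lambda$,
\[
\begin{aligned}
a^{(j)}_{\delta_1,\delta_2;\delta}(i\lambda_1, i\lambda_2; i\lambda)
&= 2^{\frac{n-6+i(\lambda_1+\lambda_2-\lambda)}{2}}\, i^{\varepsilon-\varepsilon_1-\varepsilon_2}\, \pi^{\frac{3(1-n)-2+i(\lambda_1+\lambda_2-\lambda)}{2}} \\
&\quad \times \frac{\Gamma\bigl(\frac{n+i(\lambda_1-\lambda_2+\lambda)+2\varepsilon_1}{2}\bigr)}{\Gamma\bigl(\frac{2-n-i(\lambda_1-\lambda_2+\lambda)+2\varepsilon_1}{2}\bigr)}\,
\frac{\Gamma\bigl(\frac{n+i(-\lambda_1+\lambda_2+\lambda)+2\varepsilon_2}{2}\bigr)}{\Gamma\bigl(\frac{2-n-i(-\lambda_1+\lambda_2+\lambda)+2\varepsilon_2}{2}\bigr)}\,
\frac{\Gamma\bigl(\frac{n-i(\lambda_1+\lambda_2+\lambda)+2\varepsilon}{2}\bigr)}{\Gamma\bigl(\frac{2-n+i(\lambda_1+\lambda_2+\lambda)+2\varepsilon}{2}\bigr)}.
\end{aligned} \tag{6.1}
\]
Given two symbols $h_1$ and $h_2$ in the space $\mathcal{S}(\mathbb{R}^{2n})$, one has, in the weak sense in $\mathcal{S}'(\mathbb{R}^{2n})$,
\[
h_1 \,\#\, h_2 = \int_{-\infty}^{\infty} (h_1 \,\#\, h_2)_{i\lambda}\, d\lambda, \tag{6.2}
\]
with
\[
(h_1 \,\#\, h_2)_{i\lambda} = \sum_{\delta_1=0,1}\, \sum_{\delta_2=0,1}\, \sum_{j=0,1} \int_{-\infty}^{\infty} \int_{-\infty}^{\infty} a^{(j)}_{\delta_1,\delta_2;\delta}(i\lambda_1, i\lambda_2; i\lambda)\;
J^{\varepsilon_1,\varepsilon_2;\varepsilon}_{i\lambda_1,i\lambda_2;i\lambda}\bigl((h_1)_{i\lambda_1,\delta_1}, (h_2)_{i\lambda_2,\delta_2}\bigr)\, d\lambda_1\, d\lambda_2, \tag{6.3}
\]
where $J^{\varepsilon_1,\varepsilon_2;\varepsilon}_{i\lambda_1,i\lambda_2;i\lambda}$ is the bilinear operator from $C^\infty(\pi_{i\lambda_1,\delta_1}) \times C^\infty(\pi_{i\lambda_2,\delta_2})$ to $C^{-\infty}(\pi_{i\lambda,\delta})$ formally introduced in (2.38) and discussed in Section 3.

Proof. One has $h_1 \,\#\, h_2 = \mathcal{G}(h_1 \,\#\, \mathcal{G}h_2)$, as it follows from the interpretation of the transformation $\mathcal{G}$ of symbols recalled in the beginning of Section 4. Next, we decompose $h_1$ into hyperplane waves with the help of (4.4), and $h_2$ into rays with the help of (4.6), recalling that one can move the line of integration up to the spectral line and writing
\[
h_1 = \sum_{\delta_1=0,1} \int_{-\infty}^{\infty} (h_1)_{i\lambda_1,\delta_1}\, d\lambda_1, \qquad
\mathcal{G}h_2 = \sum_{\delta_2=0,1} \int_{-\infty}^{\infty} (\mathcal{G}h_2)_{-i\lambda_2,\delta_2}\, d\lambda_2, \tag{6.4}
\]
with
\[
\begin{aligned}
(h_1)_{i\lambda_1,\delta_1}(X) &= \frac{2^{-i\lambda_1}\, c(n-1+i\lambda_1, \delta_1)}{4\pi} \int_{\mathbb{R}^{2n}} (\mathcal{G}h_1)(R)\, \bigl|[X, R]\bigr|_{\delta_1}^{-n-i\lambda_1}\, dR, \\
(\mathcal{G}h_2)_{-i\lambda_2,\delta_2}(X) &= \frac{2^{i\lambda_2}\, c(n-1-i\lambda_2, \delta_2)}{4\pi} \int_{\mathbb{R}^{2n}} h_2(S)\, \bigl|[X, S]\bigr|_{\delta_2}^{-n+i\lambda_2}\, dS:
\end{aligned} \tag{6.5}
\]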
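recall that the product $c(n-1+\nu_1, \delta_1)\, |[X, R]|_{\delta_1}^{-n-\nu_1}$ can be continued analytically with respect to $\nu_1$, as a distribution in $X$. Then,
\[
(h_1 \,\#\, h_2)(X) = \sum_{\delta_1=0,1}\, \sum_{\delta_2=0,1} \int_{-\infty}^{\infty} \int_{-\infty}^{\infty} F^{\delta_1,\delta_2}_{i\lambda_1,i\lambda_2}(X)\, d\lambda_1\, d\lambda_2 \tag{6.6}
\]
with
\[
F^{\delta_1,\delta_2}_{\nu_1,\nu_2}(X) = \frac{2^{-\nu_1+\nu_2}\, c(n-1+\nu_1, \delta_1)\, c(n-1-\nu_2, \delta_2)}{(4\pi)^2}
\int_{\mathbb{R}^{2n} \times \mathbb{R}^{2n}} (\mathcal{G}h_1)(R)\, h_2(S)\; \mathcal{G}\Bigl[\bigl|[X, R]\bigr|_{\delta_1}^{-n-\nu_1} \,\#\, \bigl|[X, S]\bigr|_{\delta_2}^{-n+\nu_2}\Bigr]\, dR\, dS, \tag{6.7}
\]
the two signed powers, the sharp product of which appears under the integral sign, being regarded as functions of $X$. Actually, so as to obtain the last equation, we have changed the order of the bilinear operation $\#$ and of the integration with respect to $dR\, dS$. Though not completely trivial, the justification is fully similar to that, based on the consideration of the domains of powers of the harmonic oscillator, which occurred, in the one-dimensional case, in [12, p. 209]: we shall not reproduce it here. Generically, one has $[R, S] \neq 0$ and, as noticed in (4.24), there exists $g \in \mathrm{Sp}(n, \mathbb{R})$ such that
\[
g^{-1}S = e_1, \qquad g^{-1}R = [R, S]\, e_{n+1} \tag{6.8}
\]
in terms of the canonical basis of $\mathbb{R}^n \times \mathbb{R}^n$. Then, using the covariance of the Weyl calculus, and the fact that the transformation $\mathcal{G}$ commutes with symplectic changes of coordinates, we obtain
\[
\begin{aligned}
F^{\delta_1,\delta_2}_{\nu_1,\nu_2}(X) &= \frac{2^{-\nu_1+\nu_2}\, c(n-1+\nu_1, \delta_1)\, c(n-1-\nu_2, \delta_2)}{(4\pi)^2} \\
&\quad \times \int_{\mathbb{R}^{2n} \times \mathbb{R}^{2n}} (\mathcal{G}h_1)(R)\, h_2(S)\, \bigl|[S, R]\bigr|_{\delta_1}^{-n-\nu_1}\;
\mathcal{G}\Bigl(Y \mapsto |y_1|_{\delta_1}^{-n-\nu_1} \,\#\, |\eta_1|_{\delta_2}^{-n+\nu_2}\Bigr)\bigl(g^{-1}X\bigr)\, dR\, dS.
\end{aligned} \tag{6.9}
\]
The function $F^{\delta_1,\delta_2}_{\nu_1,\nu_2}$ can then be made explicit, starting from (6.9), with the help of Proposition 5.5. Rewrite the result of this proposition, tested against $h \in \mathcal{S}(\mathbb{R}^{2n})$, as
\[
\begin{aligned}
\Bigl\langle \mathcal{G}\Bigl(Y \mapsto |y_1|_{\delta_1}^{-n-\nu_1} \,\#\, |\eta_1|_{\delta_2}^{-n+\nu_2}\Bigr),\, h \Bigr\rangle
&= \int_{-\infty}^{\infty} d\lambda \int_{\mathbb{R}^2} h(s e_1 + r e_{n+1})\, \Bigl[ B_0(\nu_1, \nu_2, i\lambda; \delta_1, \delta_2, \delta)\, |r|^{\frac{n-2+\nu_1+\nu_2-i\lambda}{2}}\, |s|_{\delta}^{\frac{n-2-\nu_1-\nu_2-i\lambda}{2}} \\
&\qquad + B_1(\nu_1, \nu_2, i\lambda; \delta_1, \delta_2, \delta)\, |r|_{1}^{\frac{n-2+\nu_1+\nu_2-i\lambda}{2}}\, |s|_{1-\delta}^{\frac{n-2-\nu_1-\nu_2-i\lambda}{2}} \Bigr]\, dr\, ds.
\end{aligned} \tag{6.10}
\]
Then,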
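\[
\begin{aligned}
\Bigl\langle \mathcal{G}\Bigl(Y \mapsto |y_1|_{\delta_1}^{-n-\nu_1} \,\#\, |\eta_1|_{\delta_2}^{-n+\nu_2}\Bigr) \circ g^{-1},\, h \Bigr\rangle
&= \int_{-\infty}^{\infty} d\lambda \int_{\mathbb{R}^2} h(sS + rR)\, \Bigl[ B_0(\nu_1, \nu_2, i\lambda; \delta_1, \delta_2, \delta)\, \bigl|[R, S]\bigr|^{\frac{n+\nu_1+\nu_2-i\lambda}{2}}\, |r|^{\frac{n-2+\nu_1+\nu_2-i\lambda}{2}}\, |s|_{\delta}^{\frac{n-2-\nu_1-\nu_2-i\lambda}{2}} \\
&\qquad + B_1(\nu_1, \nu_2, i\lambda; \delta_1, \delta_2, \delta)\, \bigl|[R, S]\bigr|_{1}^{\frac{n+\nu_1+\nu_2-i\lambda}{2}}\, |r|_{1}^{\frac{n-2+\nu_1+\nu_2-i\lambda}{2}}\, |s|_{1-\delta}^{\frac{n-2-\nu_1-\nu_2-i\lambda}{2}} \Bigr]\, dr\, ds,
\end{aligned} \tag{6.11}
\]
as seen after one has used (6.8) and the change of variable $r \mapsto [R, S]\, r$, and
\[
F^{\delta_1,\delta_2}_{i\lambda_1,i\lambda_2} = \int_{-\infty}^{\infty} F^{\delta_1,\delta_2}_{i\lambda_1,i\lambda_2;i\lambda}\, d\lambda \tag{6.12}
\]
with
\[
\begin{aligned}
\bigl\langle F^{\delta_1,\delta_2}_{i\lambda_1,i\lambda_2;i\lambda},\, h \bigr\rangle
&= (-1)^{\delta_1}\, \frac{2^{i(-\lambda_1+\lambda_2)}}{(4\pi)^2}\, c(n-1+i\lambda_1, \delta_1)\, c(n-1-i\lambda_2, \delta_2)
\int_{\mathbb{R}^{2n} \times \mathbb{R}^{2n}} (\mathcal{G}h_1)(R)\, h_2(S)\, dR\, dS \int_{\mathbb{R}^2} h(rR + sS) \\
&\quad \times \Bigl[ B_0(i\lambda_1, i\lambda_2, i\lambda; \delta_1, \delta_2, \delta)\, \bigl|[R, S]\bigr|_{\delta_1}^{\frac{-n+i(-\lambda_1+\lambda_2-\lambda)}{2}}\, |r|^{\frac{n-2+i(\lambda_1+\lambda_2-\lambda)}{2}}\, |s|_{\delta}^{\frac{n-2+i(-\lambda_1-\lambda_2-\lambda)}{2}} \\
&\qquad + B_1(i\lambda_1, i\lambda_2, i\lambda; \delta_1, \delta_2, \delta)\, \bigl|[R, S]\bigr|_{1-\delta_1}^{\frac{-n+i(-\lambda_1+\lambda_2-\lambda)}{2}}\, |r|_{1}^{\frac{n-2+i(\lambda_1+\lambda_2-\lambda)}{2}}\, |s|_{1-\delta}^{\frac{n-2+i(-\lambda_1-\lambda_2-\lambda)}{2}} \Bigr]\, dr\, ds.
\end{aligned} \tag{6.13}
\]
Finally, making the coefficients $B_0$ and $B_1$ explicit with the help of Proposition 5.5 and using (4.7) again,
\[
\begin{aligned}
\frac{1}{4\pi}\, \bigl\langle F^{\delta_1,\delta_2}_{i\lambda_1,i\lambda_2;i\lambda},\, h \bigr\rangle
&= \frac{(-1)^{\delta_2}\, 2^{\frac{n-2+i(-\lambda_1+\lambda_2-\lambda)}{2}}}{(4\pi)^4}
\int_{\mathbb{R}^{2n} \times \mathbb{R}^{2n}} (\mathcal{G}h_1)(R)\, h_2(S)\, dR\, dS \int_{\mathbb{R}^2} h(rR + sS) \\
&\quad \times \Bigl[ c\Bigl(\frac{n-2+i(\lambda_1-\lambda_2+\lambda)}{2}, \delta_1\Bigr)\, \bigl|[R, S]\bigr|_{\delta_1}^{\frac{-n+i(-\lambda_1+\lambda_2-\lambda)}{2}}\, |r|^{\frac{n-2+i(\lambda_1+\lambda_2-\lambda)}{2}}\, |s|_{\delta}^{\frac{n-2+i(-\lambda_1-\lambda_2-\lambda)}{2}} \\
&\qquad - c\Bigl(\frac{n-2+i(\lambda_1-\lambda_2+\lambda)}{2}, 1-\delta_1\Bigr)\, \bigl|[R, S]\bigr|_{1-\delta_1}^{\frac{-n+i(-\lambda_1+\lambda_2-\lambda)}{2}}\, |r|_{1}^{\frac{n-2+i(\lambda_1+\lambda_2-\lambda)}{2}}\, |s|_{1-\delta}^{\frac{n-2+i(-\lambda_1-\lambda_2-\lambda)}{2}} \Bigr]\, dr\, ds.
\end{aligned} \tag{6.14}
\]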
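The distribution $F^{\delta_1,\delta_2}_{i\lambda_1,i\lambda_2;i\lambda} \in \mathcal{S}'(\mathbb{R}^{2n})$ is of type $(-n - i\lambda, \delta)$. Now, given any element $S$ of $C^{-\infty}(\pi_{i\lambda,\delta})$ extended as a distribution in $\mathbb{R}^{2n}$ of type $(-n - i\lambda, \delta)$ with the same name, and any function $h \in \mathcal{S}(\mathbb{R}^{2n})$, one has the equation
\[
\langle S, h \rangle_{\mathcal{S}'(\mathbb{R}^{2n}) \times \mathcal{S}(\mathbb{R}^{2n})} = 4\pi\, \langle S, h_{-i\lambda,\delta} \rangle_{C^{-\infty}(\pi_{i\lambda,\delta}) \times C^{\infty}(\pi_{-i\lambda,\delta})} \tag{6.15}
\]
linking the two kinds of pairings. Starting from the case when $S$ is a function, one obtains (6.15) from the equation $S(tX_*) = |t|_{\delta}^{-n-i\lambda}\, S(X_*)$ and (2.4) or, if preferred, from a polarization of (2.13). The left-hand side of (6.14) can thus also be regarded as being $\langle F^{\delta_1,\delta_2}_{i\lambda_1,i\lambda_2;i\lambda},\, h_{-i\lambda,\delta} \rangle$, the pairing now denoting that between $C^{-\infty}(\pi_{i\lambda,\delta})$ and $C^{\infty}(\pi_{-i\lambda,\delta})$. The comparison with (4.22) is now easy. With another look at (2.34), one sees that $J^{\varepsilon_1,\varepsilon_2;\varepsilon}_{\nu_1,\nu_2;\nu}$ coincides with $J^{\delta_1,\delta_2;\delta}_{\nu_1,\nu_2;\nu}$ when $j = 0$, and with $J^{1-\delta_1,1-\delta_2;1-\delta}_{\nu_1,\nu_2;\nu}$ when $j = 1$. Then, the first or second term on the right-hand side of (6.14) is a multiple of the right-hand side of (4.22) taken with $j = 0$ or $1$, as it follows from a comparison of the exponents and subscripts in (4.22) and in each of the two terms of (6.14) of the signed powers of $[R, S]$, $r$ and $s$. The coefficient by which one must multiply the expression on the right-hand side of (4.22) to obtain the corresponding term on the right-hand side of (6.14) is
\[
\frac{1}{4\pi}\, 2^{\frac{n-2+i(\lambda_1+\lambda_2-\lambda)}{2}}\, c\Bigl(\frac{n-2+i(\lambda_1-\lambda_2+\lambda)}{2}, \varepsilon_1\Bigr)\,
\frac{c\bigl(\frac{n-2+i(-\lambda_1+\lambda_2+\lambda)}{2}, \varepsilon_2\bigr)}{c\bigl(\frac{-n+i(\lambda_1+\lambda_2+\lambda)}{2}, \varepsilon\bigr)}. \tag{6.16}
\]
Expanding, we can write this as
\[
\begin{aligned}
&2^{\frac{n-6+i(\lambda_1+\lambda_2-\lambda)}{2}}\, i^{\varepsilon-\varepsilon_1-\varepsilon_2}\, \pi^{\frac{3(1-n)-2+i(\lambda_1+\lambda_2-\lambda)}{2}} \\
&\quad \times \frac{\Gamma\bigl(\frac{n+i(\lambda_1-\lambda_2+\lambda)+2\varepsilon_1}{2}\bigr)}{\Gamma\bigl(\frac{2-n-i(\lambda_1-\lambda_2+\lambda)+2\varepsilon_1}{2}\bigr)}\,
\frac{\Gamma\bigl(\frac{n+i(-\lambda_1+\lambda_2+\lambda)+2\varepsilon_2}{2}\bigr)}{\Gamma\bigl(\frac{2-n-i(-\lambda_1+\lambda_2+\lambda)+2\varepsilon_2}{2}\bigr)}\,
\frac{\Gamma\bigl(\frac{n-i(\lambda_1+\lambda_2+\lambda)+2\varepsilon}{2}\bigr)}{\Gamma\bigl(\frac{2-n+i(\lambda_1+\lambda_2+\lambda)+2\varepsilon}{2}\bigr)}.
\end{aligned} \tag{6.17}
\]
This concludes the proof of Theorem 6.1. $\Box$

As an example, let us consider the harmonic oscillator $L = \mathrm{Op}(\pi\ell)$ with $\ell(x, \xi) = |x|^2 + |\xi|^2$, and sharp products of fractional powers of $\ell$.

Proposition 6.2. Let $\nu_1, \nu_2 \in \mathbb{C}$ satisfy the conditions $-n < \operatorname{Re}\nu_1 < n$, $-n < \operatorname{Re}\nu_2 < n$. Then, the decomposition into homogeneous components $h_{i\lambda}$ of the symbol $h = \ell^{\frac{-n-\nu_1}{2}} \,\#\, \ell^{\frac{-n-\nu_2}{2}}$ is given by the equation
\[
h_{i\lambda} = \frac14\, (2\pi)^{\frac{n-2+\nu_1+\nu_2-i\lambda}{2}}\, \ell^{\frac{-n-i\lambda}{2}}
\times \frac{\Gamma\bigl(\frac{n+\nu_1+\nu_2-i\lambda}{4}\bigr)\, \Gamma\bigl(\frac{n+\nu_1-\nu_2+i\lambda}{4}\bigr)\, \Gamma\bigl(\frac{n-\nu_1+\nu_2+i\lambda}{4}\bigr)\, \Gamma\bigl(\frac{n-\nu_1-\nu_2-i\lambda}{4}\bigr)}{\Gamma\bigl(\frac{n+\nu_1}{2}\bigr)\, \Gamma\bigl(\frac{n+\nu_2}{2}\bigr)\, \Gamma\bigl(\frac{n-i\lambda}{2}\bigr)}. \tag{6.18}
\]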
Proof. It is identical to that of the one-dimensional case, as treated in [12, p. 214]. Only, one starts this time from the equation
\[
\mathrm{Op}\bigl(e^{-2\pi s \ell}\bigr) = \bigl(1 - s^2\bigr)^{-\frac{n}{2}} \Bigl(\frac{1-s}{1+s}\Bigr)^{L} \tag{6.19}
\]
(same reference as in the one-dimensional case), leading rapidly to the equation
\[
h = \frac{(2\pi)^{\frac{\nu_1+\nu_2+2n}{2}}}{\Gamma\bigl(\frac{n+\nu_1}{2}\bigr)\, \Gamma\bigl(\frac{n+\nu_2}{2}\bigr)} \int_0^{\infty} \int_0^{\infty} s_1^{\frac{n+\nu_1-2}{2}}\, s_2^{\frac{n+\nu_2-2}{2}}\, e^{-2\pi \frac{s_1+s_2}{1+s_1s_2}\, \ell}\, \frac{ds_1\, ds_2}{(1+s_1s_2)^n}, \tag{6.20}
\]
then
\[
h_{i\lambda} = \frac12\, (2\pi)^{\frac{\nu_1+\nu_2+n-2-i\lambda}{2}}\, \frac{\Gamma\bigl(\frac{n+i\lambda}{2}\bigr)}{\Gamma\bigl(\frac{n+\nu_1}{2}\bigr)\, \Gamma\bigl(\frac{n+\nu_2}{2}\bigr)}\, \ell^{\frac{-n-i\lambda}{2}}
\times \int_0^{\infty} \int_0^{\infty} s_1^{\frac{n+\nu_1-2}{2}}\, s_2^{\frac{n+\nu_2-2}{2}}\, (s_1+s_2)^{\frac{-n-i\lambda}{2}}\, (1+s_1s_2)^{\frac{-n+i\lambda}{2}}\, ds_1\, ds_2, \tag{6.21}
\]
from which it is easy to conclude. Let us observe that, if not dealing with differential operators (i.e., when $\frac{-n-\nu_1}{2}$ and $\frac{-n-\nu_2}{2}$ are not both non-negative integers), Moyal's expansion (1.11) would lead in this example to a sum of terms with increasing singularities at $0$, without significance, even asymptotic, as a distribution in $\mathbb{R}^{2n}$: however, let us hasten to say that microlocal analysis does not attach much significance to points of the phase space. $\Box$

As a comment, let us express our conviction that the new composition formula has at best limited interest so far as applications of pseudodifferential analysis to partial differential equations are concerned. This is not to mean that symplectic covariance does not play any role in PDE: only, its role is essentially subordinate to that of the covariance under translations. It would be more correct to say that, in the more technical classes of symbols used in pseudodifferential analysis, it is rather the notion of uniformity under actions of conjugates of the group of translations under local families of symplectic transformations that is important. Here, our tilt is entirely towards the symplectic action, to the point that we have completely forgotten about the action of translations. On the other hand, automorphic pseudodifferential analysis calls for the present point of view, as experienced in the one-dimensional case: automorphic symbols are much too singular to be even remotely reminiscent of symbols in any of the classes developed for PDE applications. This does not imply that, to obtain the sharp composition of two automorphic symbols, it suffices to apply the present formula.
Rather, the specific formula developed in this case, which has many special features inherent in the theory of modular forms, is based on the same principles (coupling symplectic covariance with the decomposition of automorphic symbols into their homogeneous components of a definite parity) as the ones which made the formula discussed here a natural one.
T. Kobayashi et al. / Journal of Functional Analysis 257 (2009) 948–991
7. Irreducibility of the decomposition of $L^2(\mathbb{R}^{2n})$

We prove here the irreducibility of most unitary representations appearing in the spectral decomposition of Proposition 2.1. In the last decades, general irreducibility results such as Kostant's irreducibility theorem for spherical (minimal) principal series representations [6] and Vogan–Wallach's irreducibility theorem for generic parameters [14] have been developed. Also, many specific cases have been studied in detail by R. Howe, E.-T. Tan, S.-T. Lee, S. Sahi, etc. by algebraic and combinatorial methods. However, to the best of our knowledge, neither the general theory nor the known special results contain Theorem 7.3 below, the proof of which is based on the extension of the idea of branching laws to non-compact subgroups [5] and on properties of the Weyl calculus in $\mathbb{R}^{n-1}$.

Lemma 7.1. Let $M_0^{\mathrm{vect}} = \{S = (s_1, s_*; 0, \sigma_*)\}$ denote the linear space of translations of the affine hyperplane $M_0$. Given $S \in M_0^{\mathrm{vect}}$, define the linear automorphism $T_S$ of $\mathbb{R}^{2n}$ by the equation
\[
T_S X = X + [S, X]\,e_1 + [e_1, X]\,S. \tag{7.1}
\]
For every $S \in M_0^{\mathrm{vect}}$, $T_S$ is a symplectic transformation of $\mathbb{R}^{2n}$ preserving $M_0$. The group of all such symplectic transformations is generated by the group $N$ of transformations $T_S$, $S \in M_0^{\mathrm{vect}}$, together with the group $M$ of transformations $(x_1, x_*; \xi_1, \xi_*) \mapsto (x_1, y_*; \xi_1, \eta_*)$, where the map $(x_*; \xi_*) \mapsto (y_*; \eta_*)$ is a symplectic transformation in the $2n - 2$ variables involved; the latter normalizes the first within $Sp(n, \mathbb{R})$.
Proof. That $[T_S X, T_S Y] = [X, Y]$ for every pair $X, Y$ is an immediate consequence of the relations $[e_1, e_1] = [e_1, S] = [S, S] = 0$. That the group $MN$ generates the stabilizer of $M_0$ is a consequence of the observation following (2.28). □

Eq. (2.8) reduces when $g \in MN$ to
\[
(\pi_{\nu,\delta}(g) f)(X) = f(g^{-1} X), \qquad X \in M_0. \tag{7.2}
\]
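The two claims of Lemma 7.1 that the proof reduces to bracket identities can be checked numerically. The following is a minimal sketch, not part of the original argument: it assumes the sign convention $[X, Y] = \langle x, \eta\rangle - \langle y, \xi\rangle$ for the symplectic form (the checks below do not depend on that choice) and assumes that $M_0$ is the hyperplane $\xi_1 = 1$, as suggested by (7.3). The names `bracket` and `T` are hypothetical helpers.

```python
import numpy as np

def bracket(X, Y, n):
    """Symplectic form on R^{2n} with X = (x; xi), Y = (y; eta).
    The sign convention <x, eta> - <y, xi> is an assumption."""
    x, xi = X[:n], X[n:]
    y, eta = Y[:n], Y[n:]
    return float(x @ eta - y @ xi)

def T(S, X, n):
    """T_S X = X + [S, X] e_1 + [e_1, X] S, as in (7.1)."""
    e1 = np.zeros(2 * n)
    e1[0] = 1.0
    return X + bracket(S, X, n) * e1 + bracket(e1, X, n) * S

n = 3
rng = np.random.default_rng(0)

# S = (s_1, s_*; 0, sigma_*): the xi_1-component vanishes, so [e_1, S] = 0.
S = rng.normal(size=2 * n)
S[n] = 0.0

# Two points of the hyperplane xi_1 = 1 (assumed description of M_0).
X = rng.normal(size=2 * n)
Y = rng.normal(size=2 * n)
X[n] = 1.0
Y[n] = 1.0

# Difference between the bracket after and before applying T_S.
symplectic_defect = bracket(T(S, X, n), T(S, Y, n), n) - bracket(X, Y, n)
# xi_1-component of the image, which should remain equal to 1.
xi1_after = T(S, X, n)[n]
```

Only the relations $[e_1, e_1] = [e_1, S] = [S, S] = 0$ enter, so the same check goes through for any $n \geq 1$ and any $S$ with vanishing $\xi_1$-component.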
If one sets $S_{**} = (s_*; \sigma_*)$, $X_{**} = (x_*; \xi_*)$, the transformation $T_{-S}$ expresses itself when considered on $M_0$ as
\[
T_{-S}(x_1, x_*; 1, \xi_*) = \bigl(x_1 - 2s_1 + [S_{**}, X_{**}],\; x_* - s_*;\; 1,\; \xi_* - \sigma_*\bigr): \tag{7.3}
\]
it follows in particular that, given $(i\lambda, \delta) \in i\mathbb{R} \times \{0, 1\}$, all transformations $\pi_{i\lambda,\delta}(g)$ with $g \in MN$, when regarded as unitary transformations of $L^2(M_0)$, commute with the differential operator $\frac{1}{2i\pi}\frac{\partial}{\partial x_1}$. Let us first decompose the restriction of the representation $\pi_{i\lambda,\delta}$ to $MN$: from what has just been said, it can be analyzed when coupled with the spectral decomposition of the operator $\frac{1}{2i\pi}\frac{\partial}{\partial x_1}$, in other words when fixing the first variable $t$ in the partial Fourier transform $\mathcal{F}_1 f$ of $f \in L^2(M_0)$, as already done in Section 2. From (7.2), one has if $n \geq 2$ the identity
\[
\mathcal{F}_1\bigl(\pi_{i\lambda,\delta}(T_S) f\bigr)(t, x_*; \xi_*) = e^{-2i\pi t\,(2s_1 - [S_{**}, X_{**}])}\,(\mathcal{F}_1 f)(t, x_* - s_*; \xi_* - \sigma_*), \tag{7.4}
\]
a group of transformations in which we may regard t = 0 as a parameter by specializing to s1 = 0, (t) getting a projective representation πiλ,δ of R2n−2 , actually independent of (iλ, δ), as a result; the same is true when considering transformations F1 (πiλ,δ (g))F1−1 with g ∈ M. Lemma 7.2. Assume that n 2. For fixed t = 0, the linear space of bounded operators in (t) L2 (R2n−2 ) which commute with all transformations F1 (πiλ,δ (g))F1−1 with g ∈ MN is generated by the identity and the transformation F1 Σt F1−1 characterized by the equation e−2iπt[X∗∗ ,Y∗∗ ] (F1 f )(t, Y∗∗ ) dY∗∗ . (7.5) (F1 Σt f )(t, X∗∗ ) = |t|n−1 R2n−2
Proof. First assume that t = 2. Looking at (7.4), one sees that the linear space of infinitesimal operators of the representation of N under consideration is generated by the following operators, 1 ∂ where j, k 2: (i) the operators ξj + 4iπ ∂xj , where ξj denotes the operator of multiplication 1 ∂ by ξj ; (ii) the operators xk − 4iπ ∂ξk . From (1.11), these are just the operators h → ξj # h and h → xk # h. Taking advantage of the Weyl calculus in Rn−1 , set
(2) (2) (g)Op(h) = Op F1 πiλ,δ (g) F1−1 h ,
g ∈ MN,
(7.6)
defining in this way a unitary representation (2) of MN in the space of Hilbert–Schmidt operators in L2 (Rn−1 ). From what has just been seen, the image (2) (N ) consists of the automorphisms A → exp 2iπ η, Q − y, P A (7.7) (where the first factor was defined in the introduction). On the other hand, in view of (1.12), the image under (2) of M consists of the maps A → U AU −1 with U in the image of the metaplectic representation. Since the Heisenberg representation in L2 (Rn−1 ) is irreducible, while that of the metaplectic representation decomposes into its restrictions to spaces of functions with a given parity, it follows that the commutant of the representation (2) of MN is the linear space generated by the identity together with the automorphism A → ACh, where Ch is the parity map u → u, ˇ of the space of Hilbert–Schmidt operators in L2 (Rn−1 ). Going back to symbols and using what immediately follows (4.1), one obtains the case t = 2 of Lemma 7.2, from which one obtains the general case by a simple rescaling of coordinates of S. 2 Consider now any bounded operator K in the commutant of the representation πiλ,δ . Restricting the representation to MN, it follows from Lemma 7.2 that the operator F1 KF1−1 is a linear combination, with coefficients depending on t (the variable used in the definition of the partial Fourier transform), of the operators I and F1 Σt F1−1 . Introduce the group A of symplectic transformations of R2n defined as ga : (x, ξ ) → ax, a −1 ξ , a > 0. (7.8) From (2.8), one has πiλ,δ (ga )f (x1 , x∗ ; 1, ξ∗ ) = a −n−iλ f a −2 x1 , a −2 x∗ ; 1, ξ∗ .
(7.9)
Then, the operator $K$ must also commute with the Euler operator $\sum_{j \geq 1} x_j \frac{\partial}{\partial x_j}$, and the operator $\mathcal{F}_1 K \mathcal{F}_1^{-1}$ must commute with the operator $-t\frac{\partial}{\partial t} + \sum_{j \geq 2} x_j \frac{\partial}{\partial x_j}$: after a change of variables in (7.5), it follows that the above-referred coefficients depend only on $\operatorname{sign} t$.

Theorem 7.3. Given any $n \geq 1$, and any pair $(i\lambda, \delta) \in i\mathbb{R} \times \{0, 1\}$ such that $(i\lambda, \delta) \neq (0, 1)$ and $(i\lambda, \delta) \neq (0, 0)$, the representation $\pi_{i\lambda,\delta}$ is irreducible; if $(i\lambda, \delta) = (0, 1)$, it decomposes as the direct sum of two irreducible representations, and such is the case if $(i\lambda, \delta) = (0, 0)$ and $n \geq 2$.

Proof. We may assume that $n \geq 2$, since the one-dimensional case is classical [2]. From the considerations that precede in this section, any operator commuting with the representation $\pi_{i\lambda,\delta}$ must lie in the algebra generated by the following two involutions:

(i) the transformation $\Sigma$ defined by
\[
(\mathcal{F}_1 \Sigma f)(t, X_{**}) = |t|^{n-1} \int_{\mathbb{R}^{2n-2}} e^{-2i\pi t [X_{**}, Y_{**}]}\,(\mathcal{F}_1 f)(t, Y_{**})\, dY_{**}; \tag{7.10}
\]
(ii) the transformation $\Psi = \operatorname{sign}\bigl(\frac{1}{2i\pi}\frac{\partial}{\partial x_1}\bigr)$ defined by
\[
\mathcal{F}_1(\Psi f)(t, X_{**}) = (\operatorname{sign} t)\,(\mathcal{F}_1 f)(t, X_{**}). \tag{7.11}
\]
Looking at (2.26), one may note that $\Sigma = \theta_{0,0}$ and that the composition $\Sigma\Psi = \Psi\Sigma$ coincides with the intertwining operator $\theta_{0,1}$. Now, $\theta_{0,1}$ is a non-trivial (i.e., distinct from a scalar) intertwining operator of the representation $\pi_{0,1}$ with itself, and $\theta_{0,0}$ is an intertwining operator of the representation $\pi_{0,0}$ with itself, non-trivial as soon as $n \geq 2$. What remains to be seen, fixing $n \geq 2$, is that the operator $\theta_{0,1}$ cannot commute with the representation $\pi_{i\lambda,\delta}$ unless $(i\lambda, \delta) = (0, 1)$ and that the operator $\theta_{0,0}$ cannot commute with the representation $\pi_{i\lambda,\delta}$ unless $(i\lambda, \delta) = (0, 0)$, finally that $\Psi$ can never (if $n \geq 2$) commute with a representation $\pi_{i\lambda,\delta}$. Given $(i\lambda, \delta)$, set
\[
\Theta_j = \theta_{i\lambda,\delta}\,\theta_{0,j} \tag{7.12}
\]
so that, from (2.27),
\[
(\mathcal{F}_1 \Theta_j f)(t, X_{**}) = |t|^{-i\lambda}_{j-\delta}\,(\mathcal{F}_1 f)(t, X_{**}). \tag{7.13}
\]
If $\theta_{0,j}$ happens to be an intertwining operator from the representation $\pi_{i\lambda,\delta}$ to itself, the operator $\Theta_j$ is an intertwining operator from $\pi_{i\lambda,\delta}$ to $\pi_{-i\lambda,\delta}$. This operator, in its realization on $L^2(M_0)$, has an integral kernel which, evaluated at some pair $((x_1, X_{**}), (y_1, Y_{**}))$, is the product of some distribution in $x_1 - y_1$ by $\delta(X_{**} - Y_{**})$: as $n \geq 2$, it is obvious that such an integral kernel, unless it is that of a scalar operator, cannot satisfy the covariance property that would make it an intertwining operator between two representations of the species under consideration. The same applies to the operator $\Psi$. □
References

[1] M. Atiyah, Resolution of singularities and division of distributions, Comm. Pure Appl. Math. 23 (1970) 145–150.
[2] V. Bargmann, Irreducible unitary representations of the Lorentz group, Ann. of Math. 48 (1947) 568–640.
[3] I.N. Bernstein, S.I. Gelfand, Meromorphy of the function $P^\lambda$, Funktsional. Anal. i Prilozhen. 3 (1969) 84–85.
[4] H. Hironaka, Resolution of singularities of an algebraic variety over a field of characteristic zero I, II, Ann. of Math. (2) 79 (1964) 109–203; Ann. of Math. (2) 79 (1964) 205–326.
[5] T. Kobayashi, Branching problems of unitary representations, in: Proc. of ICM 2002, Beijing, vol. 2, 2002, pp. 615–627.
[6] B. Kostant, On the existence and irreducibility of certain series of representations, Bull. Amer. Math. Soc. 75 (1969) 627–642.
[7] P.D. Lax, R.S. Phillips, Scattering Theory for Automorphic Functions, Ann. of Math. Stud., vol. 87, Princeton Univ. Press, 1976.
[8] S.D. Miller, W. Schmid, The Rankin–Selberg method for automorphic distributions, in: Representation Theory and Automorphic Forms, in: Progr. Math., vol. 255, Birkhäuser Boston, Boston, MA, 2008, pp. 111–150.
[9] A.I. Osak, Trilinear Lorentz invariant forms, Comm. Math. Phys. 29 (1973) 189–217.
[10] L. Schwartz, Théorie des distributions, vols. 1, 2, Hermann, Paris, 1959.
[11] A. Unterberger, Quantization and Non-holomorphic Modular Forms, Lecture Notes in Math., vol. 1742, Springer-Verlag, Berlin, Heidelberg, 2000.
[12] A. Unterberger, Automorphic Pseudodifferential Analysis and Higher-Level Weyl Calculi, Progr. Math., vol. 209, Birkhäuser, Basel, Boston, Berlin, 2002.
[13] A. Unterberger, Quantization and Arithmetic, Pseudodifferential Operators, vol. 1, Birkhäuser, 2008.
[14] D.A. Vogan Jr., N.R. Wallach, Intertwining operators for real reductive groups, Adv. Math. 82 (1990) 203–243.
[15] A. Weil, Sur certains groupes d'opérateurs unitaires, Acta Math. 111 (1964) 143–211.
Journal of Functional Analysis 257 (2009) 992–1017 www.elsevier.com/locate/jfa
Singular stochastic equations on Hilbert spaces: Harnack inequalities for their transition semigroups ✩ Giuseppe Da Prato b , Michael Röckner c,d , Feng-Yu Wang a,e,∗ a School of Math. Sci. and Lab. Math Com. Sys., Beijing Normal University, 100875, China b Scuola Normale Superiore di Pisa, Italy c Faculty of Mathematics, University of Bielefeld, Germany d Department of Mathematics and Statistics, Purdue University, W. Lafayette, 47906 IN, USA e Department of Mathematics, Swansea University, Singleton Park, SA2 8PP, Swansea, UK
Received 3 January 2009; accepted 7 January 2009 Available online 21 January 2009 Communicated by Paul Malliavin
Abstract We consider stochastic equations in Hilbert spaces with singular drift in the framework of [G. Da Prato, M. Röckner, Singular dissipative stochastic equations in Hilbert spaces, Probab. Theory Related Fields 124 (2) (2002) 261–303]. We prove a Harnack inequality (in the sense of [F.-Y. Wang, Logarithmic Sobolev inequalities on noncompact Riemannian manifolds, Probab. Theory Related Fields 109 (1997) 417–424]) for its transition semigroup and exploit its consequences. In particular, we prove regularizing and ultraboundedness properties of the transition semigroup as well as that the corresponding Kolmogorov operator has at most one infinitesimally invariant measure μ (satisfying some mild integrability conditions). Finally, we prove existence of such a measure μ for noncontinuous drifts. © 2009 Elsevier Inc. All rights reserved. Keywords: Stochastic differential equations; Harnack inequality; Monotone coefficients; Yosida approximation; Kolmogorov operators
✩ Supported in part by “Equazioni di Kolmogorov” from the Italian “Ministero della Ricerca Scientifica e Tecnologica”, WIMCS, Creative Research Group Fund of the National Natural Science Foundation of China (No. 10721091), the 973Project, the DFG through SFB-701 and IRTG 1132, by NSF-Grant 0603742 as well as by the BIBOS-Research Center. * Corresponding author at: School of Math. Sci. and Lab. Math Com. Sys. Beijing Normal University, 100875, China. E-mail address: [email protected] (F.-Y. Wang).
0022-1236/$ – see front matter © 2009 Elsevier Inc. All rights reserved. doi:10.1016/j.jfa.2009.01.007
G. Da Prato et al. / Journal of Functional Analysis 257 (2009) 992–1017
1. Introduction, framework and main results

In this paper we continue our study of stochastic equations in Hilbert spaces with singular drift through its associated Kolmogorov equations started in [6]. The main aim is to prove a Harnack inequality for its transition semigroup in the sense of [16] (see also [1,14,17] for further development) and exploit its consequences. See also [12] for an improvement of the main results in [14] concerning generalized Mehler semigroups.

To describe our results more precisely, let us first recall the framework from [6]. Consider the stochastic equation
\[
\begin{cases}
dX(t) = \bigl(AX(t) + F(X(t))\bigr)\,dt + \sigma\,dW(t),\\
X(0) = x \in H.
\end{cases} \tag{1.1}
\]
Here $H$ is a real separable Hilbert space with inner product $\langle\cdot,\cdot\rangle$ and norm $|\cdot|$, $W = W(t)$, $t \geq 0$, is a cylindrical Brownian motion on $H$ defined on a stochastic basis $(\Omega, \mathcal{F}, (\mathcal{F}_t)_{t \geq 0}, \mathbb{P})$ and the coefficients satisfy the following hypotheses:

(H1) $(A, D(A))$ is the generator of a $C_0$-semigroup, $T_t = e^{tA}$, $t \geq 0$, on $H$ and for some $\omega \in \mathbb{R}$
\[
\langle Ax, x\rangle \leq \omega |x|^2, \qquad \forall x \in D(A). \tag{1.2}
\]
(H2) $\sigma \in L(H)$ (the space of all bounded linear operators on $H$) such that $\sigma$ is positive definite, self-adjoint and
(i) $\int_0^\infty (1 + t^{-\alpha})\,\|T_t \sigma\|_{HS}^2\, dt < \infty$ for some $\alpha > 0$, where $\|\cdot\|_{HS}$ denotes the norm on the space of all Hilbert–Schmidt operators on $H$.
(ii) $\sigma^{-1} \in L(H)$.

(H3) $F : D(F) \subset H \to 2^H$ is an m-dissipative map, i.e.,
\[
\langle u - v, x - y\rangle \leq 0, \qquad \forall x, y \in D(F),\ u \in F(x),\ v \in F(y),
\]
("dissipativity") and
\[
\operatorname{Range}(I - F) := \bigcup_{x \in D(F)} \bigl(x - F(x)\bigr) = H.
\]
Furthermore, $F_0(x) \in F(x)$, $x \in D(F)$, is such that
\[
|F_0(x)| = \min_{y \in F(x)} |y|.
\]
Here we recall that for $F$ as in (H3) we have that $F(x)$ is closed, nonempty and convex. The corresponding Kolmogorov operator is then given as follows: Let $\mathcal{E}_A(H)$ denote the linear span of all real parts of functions of the form $\varphi = e^{i\langle h, \cdot\rangle}$, $h \in D(A^*)$, where $A^*$ denotes the adjoint operator of $A$, and define for any $x \in D(F)$,
\[
L_0 \varphi(x) = \frac{1}{2} \operatorname{Tr}\bigl(\sigma^2 D^2 \varphi(x)\bigr) + \langle x, A^* D\varphi(x)\rangle + \langle F_0(x), D\varphi(x)\rangle, \qquad \varphi \in \mathcal{E}_A(H). \tag{1.3}
\]
Additionally, we assume:
(H4) There exists a probability measure $\mu$ on $H$ (equipped with its Borel $\sigma$-algebra $\mathcal{B}(H)$) such that
(i) $\mu(D(F)) = 1$,
(ii) $\int_H (1 + |x|^2)(1 + |F_0(x)|)\,\mu(dx) < \infty$,
(iii) $\int_H L_0 \varphi\, d\mu = 0$ for all $\varphi \in \mathcal{E}_A(H)$.

Remark 1.1. (i) A measure for which the last equality in (H4) (makes sense and) holds is called infinitesimally invariant for $(L_0, \mathcal{E}_A(H))$.
(ii) Since $\omega$ in (1.2) is an arbitrary real number we can relax (H3) by allowing that for some $c \in (0, \infty)$
\[
\langle u - v, x - y\rangle \leq c|x - y|^2, \qquad \forall x, y \in D(F),\ u \in F(x),\ v \in F(y).
\]
We simply replace $F$ by $F - c$ and $A$ by $A + c$ to reduce this case to (H3).
(iii) At this point we would like to stress that under the above assumptions (H1)–(H4) (and (H5) below), because $F_0$ is merely measurable and $\sigma$ is not Hilbert–Schmidt, it is unknown whether (1.1) has a strong solution.
(iv) Similarly as in [6] (see [6, Remark 4.4] in particular) we expect that (H2)(ii) can be relaxed to the condition that $\sigma = (-A)^{-\gamma}$ for some $\gamma \in [0, 1/2]$. However, some of the approximation arguments below become more involved. So, for simplicity we assume (H2)(ii).

The following are the main results of [6] which we shall use below.

Theorem 1.2. (Cf. [5, Theorem 2.3 and Corollary 2.5].) Assume (H1), (H2)(i), (H3) and (H4). Then for any measure $\mu$ as in (H4) the operator $(L_0, \mathcal{E}_A(H))$ is dissipative on $L^1(H, \mu)$, hence closable. Its closure $(L_\mu, D(L_\mu))$ generates a $C_0$-semigroup $P_t^\mu$, $t \geq 0$, on $L^1(H, \mu)$ which is Markovian, i.e., $P_t^\mu 1 = 1$ and $P_t^\mu f \geq 0$ for all nonnegative $f \in L^1(H, \mu)$ and all $t > 0$. Furthermore, $\mu$ is $P_t^\mu$-invariant, i.e.,
\[
\int_H P_t^\mu f\, d\mu = \int_H f\, d\mu, \qquad \forall f \in L^1(H, \mu).
\]
Below $B_b(H)$, $C_b(H)$ denote the bounded Borel-measurable, continuous functions respectively from $H$ into $\mathbb{R}$ and $\|\cdot\|$ denotes the usual norm on $L(H)$.

Theorem 1.3. (Cf. [5, Proposition 5.7].) Assume (H1)–(H4) hold. Then for any measure $\mu$ as in (H4) and $H_0 := \operatorname{supp} \mu$ (:= largest closed set of $H$ whose complement is a $\mu$-zero set) there exists a semigroup $p_t^\mu(x, dy)$, $x \in H_0$, $t > 0$, of kernels such that $p_t^\mu f$ is a $\mu$-version of $P_t^\mu f$ for all $f \in B_b(H)$, $t > 0$, where as usual
\[
p_t^\mu f(x) = \int_H f(y)\, p_t^\mu(x, dy), \qquad x \in H_0.
\]
Furthermore, for all $f \in B_b(H)$, $t > 0$, $x, y \in H_0$,
\[
\bigl|p_t^\mu f(x) - p_t^\mu f(y)\bigr| \leq \frac{e^{|\omega| t}}{\sqrt{t \wedge 1}}\, \|f\|_0\, \|\sigma^{-1}\|\, |x - y| \tag{1.4}
\]
and for all $f \in \operatorname{Lip}_b(H)$ (:= all bounded Lipschitz functions on $H$)
\[
\bigl|p_t^\mu f(x) - p_t^\mu f(y)\bigr| \leq e^{|\omega| t}\, \|f\|_{\mathrm{Lip}}\, |x - y|, \qquad \forall t > 0,\ x, y \in H_0, \tag{1.5}
\]
and
\[
\lim_{t \to 0} p_t^\mu f(x) = f(x), \qquad \forall x \in H_0. \tag{1.6}
\]
(Here $\|f\|_0$, $\|f\|_{\mathrm{Lip}}$ denote the supremum, Lipschitz norm of $f$ respectively.) Finally, $\mu$ is $p_t^\mu$-invariant.

Remark 1.4. (i) Both results above have been proved in [6] on $L^2(H, \mu)$ rather than on $L^1(H, \mu)$, but the proofs for $L^1(H, \mu)$ are entirely analogous.
(ii) In [6] we assume $\omega$ in (H1) to be negative, getting a stronger estimate than (1.4) (cf. [6, (5.11)]). But the same proof as in [6] leads to (1.4) for arbitrary $\omega \in \mathbb{R}$ (cf. the proof of [6, Proposition 4.3] for $t \in [0, 1]$). Then by virtue of the semigroup property and since $p_t^\mu$ is Markov we get (1.4) for all $t > 0$.
(iii) Theorem 1.3 holds in more general situations since (H2)(ii) can be relaxed (cf. [6, Remark 4.4] and [4, Proposition 8.3.3]).
(iv) (1.4) above implies that $p_t^\mu$, $t > 0$, is strongly Feller, i.e., $p_t^\mu(B_b(H)) \subset C(H_0)$ (= all continuous functions on $H_0$). We shall prove below that under the additional condition (H5) we even have $p_t^\mu(L^p(H, \mu)) \subset C(H_0)$ for all $p > 1$ and that $\mu$ in (H4) is unique. However, so far we have not been able to prove that for this unique $\mu$ we have $\operatorname{supp} \mu = H$, though we conjecture that this is true.

For the results on Harnack inequalities, in this paper we need one more condition.

(H5) (i) $(1 + \omega - A, D(A))$ satisfies the weak sector condition (cf. e.g. [10]), i.e., there exists a constant $K > 0$ such that
\[
\langle (1 + \omega - A)x, y\rangle \leq K\, \langle (1 + \omega - A)x, x\rangle^{1/2}\, \langle (1 + \omega - A)y, y\rangle^{1/2}, \qquad \forall x, y \in D(A). \tag{1.7}
\]
(ii) There exists a sequence of $A$-invariant finite dimensional subspaces $H_n \subset D(A)$ such that $\bigcup_{n=1}^\infty H_n$ is dense in $H$.

We note that if $A$ is self-adjoint, then (H2) implies that $A$ has a discrete spectrum which in turn implies that (H5)(ii) holds.

Remark 1.5. Let $(A, D(A))$ satisfy (H1). Then the following is well known:
(i) (H5)(i) is equivalent to the fact that the semigroup generated by $(1 + \omega - A, D(A))$ on the complexification $H_{\mathbb{C}}$ of $H$ is a holomorphic contraction semigroup on $H_{\mathbb{C}}$ (cf. e.g. [10, Chapter I, Corollary 2.21]).
(ii) (H5)(i) is equivalent to $(1 + \omega - A, D(A))$ being variational. Indeed, let $(\mathcal{E}, D(\mathcal{E}))$ be the coercive closed form generated by $(1 + \omega - A, D(A))$ (cf. [10, Chapter I, Section 2]) and $(\tilde{\mathcal{E}}, D(\mathcal{E}))$ be its symmetric part. Then define
\[
V := D(\mathcal{E}) \text{ with inner product } \tilde{\mathcal{E}} \text{ and } V^* \text{ to be its dual.} \tag{1.8}
\]
Then
\[
V \subset H \subset V^* \tag{1.9}
\]
and $1 + \omega - A : D(A) \to H$ has a natural unique continuous extension from $V$ to $V^*$ satisfying all the required properties (cf. [10, Chapter I, Section 2, in particular Remark 2.5]).

Now we can formulate the main result of this paper, namely the Harnack inequality for $p_t^\mu$, $t > 0$.

Theorem 1.6. Suppose (H1)–(H5) hold and let $\mu$ be any measure as in (H4) and $p_t^\mu(x, dy)$ as in Theorem 1.3 above. Let $p \in (1, \infty)$. Then for all $f \in B_b(H)$, $f \geq 0$,
\[
\bigl(p_t^\mu f(x)\bigr)^p \leq p_t^\mu f^p(y)\, \exp\Bigl[\|\sigma^{-1}\|^2\, \frac{p\,\omega\,|x - y|^2}{(p - 1)(1 - e^{-2\omega t})}\Bigr], \qquad t > 0,\ x, y \in H_0. \tag{1.10}
\]
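To see what the exponent in (1.10) looks like in the simplest possible case, one can test the inequality numerically for a one-dimensional Ornstein–Uhlenbeck semigroup. The sketch below is an illustration under assumptions that are not taken from the paper: $H = \mathbb{R}$, $A = \omega$ (multiplication by a negative number), $F = 0$, $\sigma = 1$, so that $P_t f(x) = \mathbb{E} f(e^{\omega t} x + s_t Z)$ with $s_t^2 = (e^{2\omega t} - 1)/(2\omega)$ and $Z$ standard normal. The expectation is approximated by Gauss–Hermite quadrature; the function names are hypothetical.

```python
import numpy as np

def ou_semigroup(f, x, t, omega, nodes=80):
    """P_t f(x) for dX = omega * X dt + dW on the real line (illustrative case),
    approximated by Gauss-Hermite quadrature for the weight exp(-z^2 / 2)."""
    z, w = np.polynomial.hermite_e.hermegauss(nodes)
    mean = np.exp(omega * t) * x
    std = np.sqrt((np.exp(2.0 * omega * t) - 1.0) / (2.0 * omega))
    return float(np.sum(w * f(mean + std * z)) / np.sqrt(2.0 * np.pi))

def harnack_gap(f, x, y, t, omega, p):
    """Right-hand side minus left-hand side of (1.10) with sigma = 1."""
    lhs = ou_semigroup(f, x, t, omega) ** p
    exponent = p * omega * (x - y) ** 2 / ((p - 1.0) * (1.0 - np.exp(-2.0 * omega * t)))
    rhs = ou_semigroup(lambda u: f(u) ** p, y, t, omega) * np.exp(exponent)
    return rhs - lhs

# A bounded, nonnegative test function (an arbitrary choice).
f = lambda u: 2.0 + np.sin(u)

gaps = [
    harnack_gap(f, x, y, t, omega, p)
    for omega in (-1.0, -0.3)
    for t in (0.2, 1.0, 3.0)
    for p in (1.5, 2.0, 4.0)
    for x in (-1.0, 0.0, 2.0)
    for y in (-0.5, 0.0, 1.5)
]
```

In this Gaussian case the exponent in (1.10) coincides with the one obtained from Hölder's inequality applied to the density ratio of two shifted Gaussians, so every entry of `gaps` is expected to be nonnegative up to quadrature error.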
As consequences in the situation of Theorem 1.6 (i.e. assuming (H1)–(H5)) we obtain:

Corollary 1.7. For all $t > 0$ and $p \in (1, \infty)$
\[
p_t^\mu\bigl(L^p(H, \mu)\bigr) \subset C(H_0).
\]
Corollary 1.8. $\mu$ in (H4) is unique.

Because of this result below we write $p_t(x, dy)$ instead of $p_t^\mu(x, dy)$. Finally, we have

Corollary 1.9. (i) For every $x \in H_0$, $p_t(x, dy)$ has a density $\rho_t(x, y)$ with respect to $\mu$ and
\[
\|\rho_t(x, \cdot)\|_p^{p/(p-1)} \leq \frac{1}{\int_H \exp\Bigl[-\|\sigma^{-1}\|^2\, \frac{p\,\omega\,|x - y|^2}{1 - e^{-2\omega t}}\Bigr]\, \mu(dy)}, \qquad x \in H_0,\ p \in (1, \infty). \tag{1.11}
\]
(ii) If μ(eλ|·| ) < ∞ for some λ > 2(ω ∧ 0)2 σ −1 2 , then pt is hyperbounded, i.e. pt L2 (H,μ)→L4 (H,μ) < ∞ for some t > 0. 2
Corollary 1.10. For simplicity, let $\sigma = I$ and instead of (H1) assume that more strongly $(A, D(A))$ is self-adjoint satisfying (1.2). We furthermore assume that $|F_0| \in L^2(H, \mu)$.
(i) There exists $M \in \mathcal{B}(H_0)$, $M \subset D(F)$, $\mu(M) = 1$ such that for every $x \in M$ Eq. (1.1) has a pointwise unique continuous strong solution (in the mild sense, see (4.11) below), such that $X(t) \in M$ for all $t \geq 0$ $\mathbb{P}$-a.s.
(ii) Suppose there exists $\Phi \in C([0, \infty))$ positive and strictly increasing such that $\lim_{s \to \infty} s^{-1} \Phi(s) = \infty$ and
\[
\Psi(s) := \int_s^\infty \frac{dr}{\Phi(r)} < \infty, \qquad \forall s > 0. \tag{1.12}
\]
If there exists a constant $c > 0$ such that
\[
\langle F_0(x) - F_0(y), x - y\rangle \leq c - \Phi\bigl(|x - y|^2\bigr), \qquad \forall x, y \in D(F), \tag{1.13}
\]
then $p_t$ is ultrabounded with
\[
\|p_t\|_{L^2(H,\mu) \to L^\infty(H,\mu)} \leq \exp\Bigl[\frac{\lambda\bigl(1 + \Psi^{-1}(t/4)\bigr)}{(1 - e^{-\omega t/2})^2}\Bigr], \qquad t > 0,
\]
holding for some constant $\lambda > 0$.

Remark 1.11. We emphasize that since the nonlinear part $F_0$ of our Kolmogorov operator is in general not continuous, it was quite surprising for us that in this infinite dimensional case nevertheless the generated semigroup $P_t$ maps $L^1$-functions to continuous ones as stated in Corollary 1.7.

The proof that Corollary 1.9 follows from Theorem 1.6 is completely standard. So, we will omit the proofs and instead refer to [14,17]. Corollary 1.7 is new and follows whenever a semigroup $p_t$ satisfies the Harnack inequality (see Proposition 4.1 below). Corollary 1.8 is new. Since (1.10) implies irreducibility of $p_t^\mu$ and Corollary 1.7 implies that it is strongly Feller, a well known theorem due to Doob immediately implies that $\mu$ is the unique invariant measure for $p_t^\mu$, $t > 0$. $p_t^\mu$, however, depends on $\mu$, so Corollary 1.8 is a stronger statement. Corollary 1.10 is also new.

Theorem 1.6 as well as Corollaries 1.7, 1.8 and 1.10 will be proved in Section 4. In Section 3 we first prove Theorem 1.6 in case $F_0$ is Lipschitz, and in Section 2 we prepare the tools that allow us to reduce the general case to the Lipschitz case. In Section 5 we prove two results (see Theorems 5.2 and 5.4) on the existence of a measure satisfying (H4) under some additional conditions and present an application to an example where $F_0$ is not continuous. For a discussion of a number of other explicit examples satisfying our conditions see [6, Section 9].

2. Reduction to regular $F_0$

Let $F$ be as in (H3). As in [6] we may consider the Yosida approximation of $F$, i.e., for any $\alpha > 0$ we set
\[
F_\alpha(x) := \frac{1}{\alpha}\bigl(J_\alpha(x) - x\bigr), \qquad x \in H, \tag{2.1}
\]
where for $x \in H$
\[
J_\alpha(x) := (I - \alpha F)^{-1}(x), \qquad \alpha > 0,
\]
and $I(x) := x$. Then each $F_\alpha$ is single valued, dissipative and it is well known that
\[
\lim_{\alpha \to 0} F_\alpha(x) = F_0(x), \qquad \forall x \in D(F), \tag{2.2}
\]
\[
|F_\alpha(x)| \leq |F_0(x)|, \qquad \forall x \in D(F). \tag{2.3}
\]
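A one-dimensional toy case makes (2.1)–(2.3) concrete. The sketch below is only an illustration and not an example from the paper: it assumes $H = \mathbb{R}$ and the multivalued map $F(x) = -\operatorname{sign}(x)$ for $x \neq 0$, $F(0) = [-1, 1]$, which is m-dissipative. For this choice the resolvent $J_\alpha$ is the soft-thresholding map, and the minimal section is $F_0(x) = -\operatorname{sign}(x)$ with $F_0(0) = 0$. The function names are hypothetical.

```python
import numpy as np

def resolvent(x, alpha):
    """J_alpha(x) = (I - alpha F)^{-1}(x) for F(x) = -sign(x), F(0) = [-1, 1].
    Solving y + alpha * sign(y) = x for y gives soft thresholding."""
    return np.sign(x) * np.maximum(np.abs(x) - alpha, 0.0)

def yosida(x, alpha):
    """Yosida approximation F_alpha(x) = (J_alpha(x) - x) / alpha, as in (2.1)."""
    return (resolvent(x, alpha) - x) / alpha

def minimal_section(x):
    """F_0(x): the element of F(x) of minimal norm (np.sign(0) = 0)."""
    return -np.sign(x)

x = np.array([-2.0, -0.05, 0.0, 0.05, 0.5])

# F_alpha is single valued and equals -clip(x / alpha, -1, 1) in this example.
values = yosida(x, 0.1)

# (2.3): |F_alpha(x)| <= |F_0(x)| for every alpha > 0.
bounded = all(
    np.all(np.abs(yosida(x, a)) <= np.abs(minimal_section(x)) + 1e-12)
    for a in (1.0, 0.1, 0.01)
)

# (2.2): F_alpha(x) -> F_0(x) as alpha -> 0.
limit_error = np.max(np.abs(yosida(x, 1e-6) - minimal_section(x)))
```

Here $F_\alpha$ is Lipschitz with constant $1/\alpha$ although $F_0$ jumps at the origin, which is the feature the reduction in this section exploits.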
Moreover, $F_\alpha$ is Lipschitz continuous, so $F_0$ is $\mathcal{B}(H)$-measurable. Since $F_\alpha$ is not differentiable in general, as in [6] we introduce a further regularization by setting
\[
F_{\alpha,\beta}(x) := \int_H e^{\beta B} F_\alpha\bigl(e^{\beta B} x + y\bigr)\, N_{\frac{1}{2} B^{-1}(e^{2\beta B} - 1)}(dy), \qquad \alpha, \beta > 0, \tag{2.4}
\]
where $B : D(B) \subset H \to H$ is a self-adjoint, negative definite linear operator such that $B^{-1}$ is of trace class and as usual for a trace class operator $Q$ the measure $N_Q$ is just the standard centered Gaussian measure with covariance given by $Q$. $F_{\alpha,\beta}$ is dissipative, of class $C^\infty$, has bounded derivatives of all orders and $F_{\alpha,\beta} \to F_\alpha$ pointwise as $\beta \to 0$. Furthermore, for $\alpha > 0$
\[
c_\alpha := \sup\Bigl\{\frac{|F_{\alpha,\beta}(x)|}{1 + |x|} : x \in H,\ \beta \in (0, 1]\Bigr\} < \infty. \tag{2.5}
\]
We refer to [8, Theorem 9.19] for details. Now we consider the following regularized stochastic equation
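\[
\begin{cases}
dX_{\alpha,\beta}(t) = \bigl(AX_{\alpha,\beta}(t) + F_{\alpha,\beta}(X_{\alpha,\beta}(t))\bigr)\,dt + \sigma\,dW(t),\\
X_{\alpha,\beta}(0) = x \in H.
\end{cases} \tag{2.6}
\]
It is well known that (2.6) has a unique mild solution $X_{\alpha,\beta}(t, x)$, $t \geq 0$. Its associated transition semigroup is given by
\[
P_t^{\alpha,\beta} f(x) = \mathbb{E}\bigl[f(X_{\alpha,\beta}(t, x))\bigr], \qquad t > 0,\ x \in H,
\]
for any $f \in B_b(H)$. Here $\mathbb{E}$ denotes expectation with respect to $\mathbb{P}$.

Proposition 2.1. Assume (H1)–(H4). Then there exists a $K_\sigma$-set $K \subset H$ such that $\mu(K) = 1$ and for all $f \in B_b(H)$, $T > 0$ there exist subsequences $(\alpha_n)$, $(\beta_n) \to 0$ such that for all $x \in K$
\[
\lim_{n \to \infty} \lim_{m \to \infty} P_\bullet^{\alpha_n,\beta_m} f(x) = p_\bullet^\mu f(x) \quad \text{weakly in } L^2(0, T; dt). \tag{2.7}
\]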
Proof. This follows immediately from the proof of [6, Proposition 5.7]. (A closer look at the proof even shows that (2.7) holds for all $x \in H_0 = \operatorname{supp} \mu$.) □

As we shall see in Section 4, the proof of Theorem 1.6 follows from Proposition 2.1 if we can prove the corresponding Harnack inequality for each $P_t^{\alpha,\beta}$. Hence in the next section we confine ourselves to the case when $F_0$ is dissipative and Lipschitz.

3. The Lipschitz case

In this section we assume that (H1)–(H3) and (H5) hold and that $F_0$ in (H3) is in addition Lipschitz continuous. The aim of this section is to prove Theorem 1.6 for such special $F_0$ (see Proposition 3.1 below). We shall do this by finite dimensional (Galerkin) approximations, since for the approximating finite dimensional processes we can apply the usual coupling argument.

We first note that since $F_0$ is Lipschitz, (1.1) has a unique mild solution $X(t, x)$, $t \geq 0$, for every initial condition $x \in H$ (cf. [8]) and we denote the corresponding transition semigroup by $P_t$, $t > 0$, i.e.
\[
P_t f(x) := \mathbb{E}\bigl[f(X(t, x))\bigr], \qquad t > 0,\ x \in H,
\]
where $f \in B_b(H)$.

Now we need to consider an appropriate Galerkin approximation. To this end let $e_k \in D(A)$, $k \in \mathbb{N}$, be orthonormal such that $H_n = \text{linear span}\{e_1, \ldots, e_n\}$, $n \in \mathbb{N}$. Hence $\{e_k : k \in \mathbb{N}\}$ is an orthonormal basis of $(H, \langle\cdot,\cdot\rangle)$. Let $\pi_n : H \to H_n$ be the orthogonal projection with respect to $(H, \langle\cdot,\cdot\rangle)$. So, we can define
\[
A_n := \pi_n A|_{H_n} = A|_{H_n} \quad \text{by (H5)(ii)} \tag{3.1}
\]
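and, furthermore
\[
F_n := \pi_n F_0|_{H_n}, \qquad \sigma_n := \pi_n \sigma|_{H_n}.
\]
Obviously, $\sigma_n : H_n \to H_n$ is a self-adjoint, positive definite linear operator on $H_n$. Furthermore, $\sigma_n$ is bijective, since it is one-to-one. To see the latter, one simply picks an orthonormal basis $\{e_1^\sigma, \ldots, e_n^\sigma\}$ of $H_n$ with respect to the inner product $\langle\cdot,\cdot\rangle_\sigma$ defined by $\langle x, y\rangle_\sigma := \langle \sigma x, y\rangle$. Then if $x \in H_n$ is such that $\sigma_n x = \pi_n \sigma x = 0$, it follows that
\[
\langle x, e_i^\sigma\rangle_\sigma = \langle \sigma x, e_i^\sigma\rangle = 0, \qquad \forall 1 \leq i \leq n.
\]
But $x = \sum_{i=1}^n \langle x, e_i^\sigma\rangle_\sigma\, e_i^\sigma$, hence $x = 0$.

Now fix $n \in \mathbb{N}$ and on $H_n$ consider the stochastic equation
\[
\begin{cases}
dX_n(t) = \bigl(A_n X_n(t) + F_n(X_n(t))\bigr)\,dt + \sigma_n\,dW_n(t),\\
X_n(0) = x \in H_n,
\end{cases} \tag{3.2}
\]
where $W_n(t) = \pi_n W(t) = \sum_{k=1}^n \langle e_k, W(t)\rangle e_k$. (3.2) has a unique strong solution $X_n(t, x)$, $t \geq 0$, for every initial condition $x \in H_n$ which is pathwise continuous $\mathbb{P}$-a.s. Consider the associated transition semigroup defined as before by
\[
P_t^n f(x) = \mathbb{E}\bigl[f(X_n(t, x))\bigr], \qquad t > 0,\ x \in H_n, \tag{3.3}
\]
where $f \in B_b(H_n)$.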
Below we shall prove the following:

Proposition 3.1. Assume that (H1)–(H5) hold. Then:
(i) For all $f \in C_b(H)$ and all $t > 0$,
\[
\lim_{n \to \infty} P_t^n f(x) = P_t f(x), \qquad \forall x \in H_{n_0},\ n_0 \in \mathbb{N}.
\]
(ii) For all nonnegative $f \in B_b(H)$ and all $n \in \mathbb{N}$, $p \in (1, \infty)$
\[
\bigl(P_t^n f(x)\bigr)^p \leq P_t^n f^p(y)\, \exp\Bigl[\|\sigma^{-1}\|^2\, \frac{p\,\omega\,|x - y|^2}{(p - 1)(1 - e^{-2\omega t})}\Bigr], \qquad t > 0,\ x, y \in H_n. \tag{3.4}
\]
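Proof. (i) Define
\[
W_{A,\sigma}(t) := \int_0^t e^{(t-s)A} \sigma\, dW(s), \qquad t \geq 0.
\]
Note that by (H2)(i) we have that $W_{A,\sigma}(t)$, $t \geq 0$, is well defined and pathwise continuous. For $x \in H_{n_0}$, $n_0 \in \mathbb{N}$ fixed, let $Z(t)$, $t \geq 0$, be the unique variational solution (with triple $V \subset H \subset V^*$ as in Remark 1.5(ii), see e.g. [13]) to
\[
\begin{cases}
dZ(t) = \bigl[AZ(t) + F_0\bigl(Z(t) + W_{A,\sigma}(t)\bigr)\bigr]\,dt,\\
Z(0) = x,
\end{cases} \tag{3.5}
\]
which then automatically satisfies
\[
\mathbb{E} \sup_{t \in [0,T]} |Z(t)|^2 < +\infty. \tag{3.6}
\]
Then we have (see [8]) that $Z(t) + W_{A,\sigma}(t)$, $t \geq 0$, is a mild solution to (1.1) (with $F_0$ Lipschitz), hence by uniqueness
\[
X(t, x) = Z(t) + W_{A,\sigma}(t), \qquad t \geq 0. \tag{3.7}
\]
Clearly, since
\[
\mathbb{E} \sup_{t \in [0,T]} |W_{A,\sigma}(t)|^2 < +\infty, \tag{3.8}
\]
we have
\[
\pi_n W_{A,\sigma}(t) \to W_{A,\sigma}(t) \quad \text{as } n \to \infty \text{ in } L^2(\Omega, \mathcal{F}, \mathbb{P}),\ \forall t \geq 0.
\]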
We set $X_n(t) := X_n(t, x)$ (= solution of (3.2)). Defining
\[
W_{A_n,\sigma_n}(t) = \int_0^t e^{(t-s)A_n} \sigma_n\, dW_n(s), \qquad t \geq 0,
\]
and
\[
Z_n(t) := X_n(t) - W_{A_n,\sigma_n}(t), \qquad n \in \mathbb{N},\ t \geq 0,
\]
it is enough to show that
\[
Z_n(t) \to Z(t) \quad \text{as } n \to \infty \text{ in } L^2(\Omega, \mathcal{F}, \mathbb{P}),\ \forall t \geq 0, \tag{3.9}
\]
because then by (3.7)
\[
X_n(t) \to X(t) \quad \text{as } n \to \infty \text{ in } L^2(\Omega, \mathcal{F}, \mathbb{P}),\ \forall t \geq 0,
\]
and the assertion follows by Lebesgue's dominated convergence theorem. To show (3.9) we first note that by the same argument as above
\[
dZ_n(t) = \bigl[A_n Z_n(t) + F_n\bigl(Z_n(t) + W_{A_n,\sigma_n}(t)\bigr)\bigr]\,dt
\]
and thus (in the variational sense), since $A = A_n$ on $H_n$ by (3.1),
\[
d\bigl(Z(t) - Z_n(t)\bigr) = \bigl[A\bigl(Z(t) - Z_n(t)\bigr) + F_0(X(t)) - F_n(X_n(t))\bigr]\,dt.
\]
Applying Itô's formula we obtain that for some constant $c > 0$
\[
\begin{aligned}
\frac{1}{2}\,|Z(t) - Z_n(t)|^2
&\leq \int_0^t \Bigl[(\omega + 1/2)\,|Z(s) - Z_n(s)|^2 + |F_0(X(s)) - F_0(X_n(s))|^2 + |(1 - \pi_n) F_0(X(s))|^2\Bigr]\,ds\\
&\leq c \int_0^t |Z(s) - Z_n(s)|^2\,ds + c \int_0^t |W_{A,\sigma}(s) - W_{A_n,\sigma_n}(s)|^2\,ds + \int_0^t |(1 - \pi_n) F_0(X(s))|^2\,ds.
\end{aligned}
\]
Now (3.9) follows by the linear growth of $F_0$, (3.6)–(3.8) and Gronwall's lemma, if we can show that
\[
\int_0^T \mathbb{E}\,|W_{A,\sigma}(s) - W_{A_n,\sigma_n}(s)|^2\,ds \to 0 \quad \text{as } n \to \infty. \tag{3.10}
\]
To this end we first note that a straightforward application of Duhamel's formula yields that
\[
e^{tA}|_{H_n} = e^{tA_n}, \qquad \forall t \geq 0.
\]
Therefore
\[
W_{A,\sigma}(s) - W_{A_n,\sigma_n}(s) = \int_0^s e^{(s-r)A} (\sigma - \pi_n \sigma \pi_n)\, dW(r),
\]
and thus
\[
\mathbb{E}\,|W_{A,\sigma}(s) - W_{A_n,\sigma_n}(s)|^2 = \int_0^s \bigl\|e^{(s-r)A} (\sigma - \pi_n \sigma \pi_n)\bigr\|_{HS}^2\,dr = \sum_{i=1}^\infty \int_0^s \bigl|e^{rA} (\sigma - \pi_n \sigma \pi_n) e_i\bigr|^2\,dr.
\]
Since for any $i \in \mathbb{N}$, $r \in [0, s]$, the integrands converge to 0, Lebesgue's dominated convergence theorem implies (3.10).

(ii) Fix $T > 0$, $n \in \mathbb{N}$ and $x, y \in H_n$. Let $\xi^T \in C^1([0, \infty))$ be defined by
\[
\xi^T(t) := \frac{2\omega e^{-\omega t} |x - y|}{1 - e^{-2\omega T}}, \qquad t \geq 0.
\]
Consider for $X_n(t) = X_n(t, x)$, $t \geq 0$, see the proof of (i), the stochastic equation
\[
\begin{cases}
dY_n(t) = \Bigl[A_n Y_n(t) + F_n(Y_n(t)) + \xi^T(t)\, \dfrac{X_n(t) - Y_n(t)}{|X_n(t) - Y_n(t)|}\, 1_{\{X_n(t) \neq Y_n(t)\}}\Bigr]\,dt + \sigma_n\,dW_n(t),\\
Y_n(0) = y.
\end{cases} \tag{3.11}
\]
Since
\[
z \mapsto \frac{X_n(t) - z}{|X_n(t) - z|}\, 1_{\{X_n(t) \neq z\}}
\]
is dissipative on $H_n$ for all $t \geq 0$ (cf. [17]), (3.11) has a unique strong solution $Y_n(t) = Y_n(t, y)$, $t \geq 0$, which is pathwise continuous $\mathbb{P}$-a.s. Define the first coupling time
\[
\tau_n := \inf\{t \geq 0 : X_n(t) = Y_n(t)\}. \tag{3.12}
\]
Writing the equation for $X_n(t) - Y_n(t)$, $t \geq 0$, applying the chain rule to $\phi_\varepsilon(z) := \sqrt{z + \varepsilon^2}$, $z \in (-\varepsilon^2, \infty)$, $\varepsilon > 0$, and letting $\varepsilon \to 0$ subsequently, we obtain
\[
d\,|X_n(t) - Y_n(t)| \leq \bigl(\omega\,|X_n(t) - Y_n(t)| - \xi^T(t)\, 1_{\{X_n(t) \neq Y_n(t)\}}\bigr)\,dt, \qquad t \geq 0,
\]
which yields
\[
d\bigl(e^{-\omega t}\,|X_n(t) - Y_n(t)|\bigr) \leq -e^{-\omega t}\, \xi^T(t)\, 1_{\{X_n(t) \neq Y_n(t)\}}\,dt, \qquad t \geq 0. \tag{3.13}
\]
In particular, $t \mapsto e^{-\omega t}\,|X_n(t) - Y_n(t)|$ is decreasing, hence $X_n(T) = Y_n(T)$ for all $T \geq \tau_n$. But by (3.13) if $T < \tau_n$ then
\[
|X_n(T) - Y_n(T)|\, e^{-\omega T} \leq |x - y| - |x - y| \int_0^T \frac{2\omega e^{-2\omega t}}{1 - e^{-2\omega T}}\,dt = 0.
\]
So, in any case
\[
X_n(T) = Y_n(T), \qquad \mathbb{P}\text{-a.s.} \tag{3.14}
\]
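Let
\[
R := \exp\Bigl[-\int_0^{T \wedge \tau_n} \frac{\xi^T(t)}{|X_n(t) - Y_n(t)|}\, \bigl\langle X_n(t) - Y_n(t), \sigma^{-1}\,dW_n(t)\bigr\rangle - \frac{1}{2} \int_0^{T \wedge \tau_n} \frac{(\xi^T(t))^2\, |\sigma^{-1}(X_n(t) - Y_n(t))|^2}{|X_n(t) - Y_n(t)|^2}\,dt\Bigr].
\]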
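By (3.14) and Girsanov's theorem for $p > 1$,
\[
\bigl(P_T^n f(y)\bigr)^p = \bigl(\mathbb{E}\bigl[R f(Y_n(T))\bigr]\bigr)^p = \bigl(\mathbb{E}\bigl[R f(X_n(T))\bigr]\bigr)^p \leq P_T^n f^p(x)\, \bigl(\mathbb{E}\, R^{p/(p-1)}\bigr)^{p-1}. \tag{3.15}
\]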
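Let
\[
M_p = \exp\Bigl[-\frac{p}{p - 1} \int_0^{T \wedge \tau_n} \frac{\xi^T(t)}{|X_n(t) - Y_n(t)|}\, \bigl\langle X_n(t) - Y_n(t), \sigma^{-1}\,dW_n(t)\bigr\rangle - \frac{p^2}{2(p - 1)^2} \int_0^{T \wedge \tau_n} \frac{(\xi^T(t))^2\, |\sigma^{-1}(X_n(t) - Y_n(t))|^2}{|X_n(t) - Y_n(t)|^2}\,dt\Bigr].
\]
We have $\mathbb{E} M_p = 1$ and hence,
\[
\begin{aligned}
\mathbb{E}\, R^{p/(p-1)}
&= \mathbb{E}\Bigl[M_p \exp\Bigl(\frac{p}{2(p - 1)^2} \int_0^{T \wedge \tau_n} \frac{(\xi^T(t))^2\, |\sigma^{-1}(X_n(t) - Y_n(t))|^2}{|X_n(t) - Y_n(t)|^2}\,dt\Bigr)\Bigr]\\
&\leq \sup_\Omega \exp\Bigl(\frac{p}{2(p - 1)^2} \int_0^{T \wedge \tau_n} (\xi^T(t))^2\, \|\sigma^{-1}\|^2\,dt\Bigr)\\
&\leq \exp\Bigl[\|\sigma^{-1}\|^2\, \frac{p\,\omega\,|x - y|^2}{(p - 1)^2 (1 - e^{-2\omega T})}\Bigr].
\end{aligned}
\]
Combining this with (3.15) we get the assertion (with $T$ replacing $t$). □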
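The coupling argument above rests on two elementary integrals of the drift strength $\xi^T$: the weighted integral $\int_0^T e^{-\omega t} \xi^T(t)\,dt = |x - y|$, which forces the coupling by time $T$, and the energy $\int_0^T (\xi^T(t))^2\,dt = 2\omega |x - y|^2 / (1 - e^{-2\omega T})$, which produces the exponent in (3.4). The following sketch checks both numerically by Gauss–Legendre quadrature; the parameter values and function names are arbitrary illustrative choices, not taken from the paper.

```python
import numpy as np

def xi(t, T, omega, dist):
    """xi^T(t) = 2 * omega * exp(-omega t) * |x - y| / (1 - exp(-2 omega T))."""
    return 2.0 * omega * np.exp(-omega * t) * dist / (1.0 - np.exp(-2.0 * omega * T))

def integrate(g, T, nodes=60):
    """Gauss-Legendre quadrature of g over [0, T]."""
    u, w = np.polynomial.legendre.leggauss(nodes)
    t = 0.5 * T * (u + 1.0)  # map nodes from [-1, 1] to [0, T]
    return float(0.5 * T * np.sum(w * g(t)))

T, omega, dist = 1.5, -0.7, 2.0  # dist plays the role of |x - y|

# Total weighted drift: should equal |x - y|, so that (3.13) gives coupling by time T.
drift_total = integrate(lambda t: np.exp(-omega * t) * xi(t, T, omega, dist), T)

# Energy of the drift: should equal 2 * omega * |x - y|^2 / (1 - exp(-2 omega T)).
energy = integrate(lambda t: xi(t, T, omega, dist) ** 2, T)
energy_closed_form = 2.0 * omega * dist ** 2 / (1.0 - np.exp(-2.0 * omega * T))

# Exponent in the bound on E R^{p/(p-1)} with sigma = I, for a sample p.
p = 2.0
exponent = p / (2.0 * (p - 1.0) ** 2) * energy
exponent_closed_form = p * omega * dist ** 2 / ((p - 1.0) ** 2 * (1.0 - np.exp(-2.0 * omega * T)))
```

Raising the bound on $\mathbb{E}\,R^{p/(p-1)}$ to the power $p - 1$, as in (3.15), turns the factor $(p - 1)^2$ into $p - 1$ and gives the exponent of (3.4).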
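4. Proof and consequences of Theorem 1.6

On the basis of Propositions 3.1 and 2.1 we can now easily prove Theorem 1.6.

Proof of Theorem 1.6. Let $f \in \mathrm{Lip}_b(H)$, $f \ge 0$. By Proposition 3.1(i) it then follows that (3.4) holds with $P_t f$ replacing $P_t^n f$ provided $F$ is Lipschitz. Using that $\bigcup_{n \in \mathbb{N}} H_n$ is dense in $H$ and that $P_t f(x)$ is continuous in $x$ (cf. [8]) we obtain (3.4) for all $x, y \in H$. In particular, this is true for $P_t^{\alpha_n, \beta_n} f$ from Proposition 2.1. Now fix $t > 0$ and $k \in \mathbb{N}$, let
$$\chi_k(s) := k\, 1_{[t, t+1/k]}(s), \quad s \ge 0.$$
Using (3.4) for $P_t^{\alpha_n, \beta_m} f$, (1.6), Proposition 2.1 and Jensen's inequality, we obtain for $x, y \in K$
$$\begin{aligned} p_t^\mu f(x) &= \lim_{k \to \infty} k \int_t^{t+1/k} p_s^\mu f(x)\, ds \\ &= \lim_{k \to \infty} \lim_{n \to \infty} \lim_{m \to \infty} \int_0^{t+1} \chi_k(s)\, P_s^{\alpha_n, \beta_m} f(x)\, ds \\ &\le \lim_{k \to \infty} \lim_{n \to \infty} \lim_{m \to \infty} \int_0^{t+1} \chi_k(s)\, \big(P_s^{\alpha_n, \beta_m} f^p(y)\big)^{1/p} \exp\bigg[ \|\sigma^{-1}\|^2\, \frac{\omega |x - y|^2}{(p-1)(1 - e^{-2\omega s})} \bigg]\, ds \\ &\le \lim_{k \to \infty} \lim_{n \to \infty} \lim_{m \to \infty} \bigg( \int_0^{t+1} \chi_k(s)\, P_s^{\alpha_n, \beta_m} f^p(y) \exp\bigg[ \|\sigma^{-1}\|^2\, \frac{p\,\omega |x - y|^2}{(p-1)(1 - e^{-2\omega s})} \bigg]\, ds \bigg)^{1/p} \\ &= \big(p_t^\mu f^p(y)\big)^{1/p} \exp\bigg[ \|\sigma^{-1}\|^2\, \frac{\omega |x - y|^2}{(p-1)(1 - e^{-2\omega t})} \bigg], \end{aligned}$$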
where we note that we have to choose the sequences $(\alpha_n)$, $(\beta_n)$ such that (2.7) holds both for $f$ and $f^p$ instead of $f$. Since $K$ is dense in $H_0$, (1.10) follows for $f \in C_b(H)$, for all $x, y \in H_0$, since $p_t^\mu f$ is continuous on $H_0$ by (1.4).

Let now $f \in B_b(H)$, $f \ge 0$. Let $f_n \in C_b(H)$, $n \in \mathbb{N}$, such that $f_n \to f$ in $L^p(H, \mu)$ as $n \to \infty$, $p \in (1, \infty)$ fixed. Then, since $\mu$ is invariant for $p_t^\mu$, $t > 0$, selecting a subsequence if necessary, it follows that there exists $K_1 \in \mathcal{B}(H)$, $\mu(K_1) = 1$, such that
$$p_t^\mu f_n(x) \to p_t^\mu f(x) \quad \text{as } n \to \infty, \quad \forall x \in K_1.$$
Taking this limit in (1.10) we obtain (1.10) for all $x, y \in K_1$. Taking into account that $p_t^\mu$ is continuous and that $K_1$ is dense in $H_0 = \operatorname{supp} \mu$, (1.10) follows for all $x, y \in H_0$. □

Corollary 1.7 immediately follows from Theorem 1.6 and the following general result:
Proposition 4.1. Let $E$ be a topological space and $P$ a Markov operator on $B_b(E)$. Assume that for any $p > 1$ there exists a continuous function $\eta_p$ on $E \times E$ such that $\eta_p(x, x) = 0$ for all $x \in E$ and
$$P|f|(x) \le \big(P|f|^p(y)\big)^{1/p}\, e^{\eta_p(x, y)}, \quad \forall x, y \in E,\ f \in B_b(E). \tag{4.1}$$
Then $P$ is strong Feller, i.e. maps $B_b(E)$ into $C_b(E)$. Furthermore, for any $\sigma$-finite measure $\mu$ on $(E, \mathcal{B}(E))$ such that
$$\int_E |Pf|\, d\mu \le C \int_E |f|\, d\mu, \quad \forall f \in B_b(E), \tag{4.2}$$
for some $C > 0$, $P$ uniquely extends to $L^p(E, \mu)$ with $P L^p(E, \mu) \subset C(E)$ for any $p > 1$.

Proof. Since $P$ is linear, we only need to consider $f \ge 0$. Let $f \in B_b(E)$ be nonnegative. By (4.1) and the property of $\eta_p$ we have
$$\limsup_{x \to y} Pf(x) \le \big(Pf^p(y)\big)^{1/p}, \quad p > 1.$$
Letting $p \downarrow 1$ we obtain $\limsup_{x \to y} Pf(x) \le Pf(y)$. Similarly, using $f^{1/p}$ to replace $f$ and replacing $x$ with $y$, we obtain
$$\big(P f^{1/p}(y)\big)^p \le Pf(x)\, e^{p \eta_p(y, x)}, \quad \forall x, y \in E,\ p > 1.$$
First letting $x \to y$ then $p \to 1$, we obtain $\liminf_{x \to y} Pf(x) \ge Pf(y)$. So $Pf \in C_b(E)$.

Next, for any nonnegative $f \in L^p(E, \mu)$, let $f_n = f \wedge n$, $n \ge 1$. By (4.2) and $f_n \to f$ in $L^p(E, \mu)$ we have $P|f_n - f_m|^p \to 0$ in $L^1(E, \mu)$ as $n, m \to \infty$. In particular, there exists $y \in E$ such that
$$\lim_{n, m \to \infty} P|f_n - f_m|^p(y) = 0. \tag{4.3}$$
Moreover, by (4.1), for $B_N := \{x \in E : \eta_p(x, y) < N\}$
$$\sup_{x \in B_N} |Pf_n(x) - Pf_m(x)|^p \le \sup_{x \in B_N} \big(P|f_n - f_m|(x)\big)^p \le P|f_n - f_m|^p(y)\, e^{pN}.$$
Since by the strong Feller property $Pf_n \in C_b(E)$ for any $n \ge 1$ and noting that $C_b(B_N)$ is complete under the uniform norm, we conclude from (4.3) that $Pf$ is continuous on $B_N$ for any $N \ge 1$, and hence, $Pf \in C(E)$. □

Proof of Corollary 1.8. Let $\mu_1, \mu_2$ be probability measures on $(H, \mathcal{B}(H))$ satisfying (H4). Define $\mu := \frac{1}{2}\mu_1 + \frac{1}{2}\mu_2$. Then $\mu$ satisfies (H4) and $\mu_i = \rho_i \mu$, $i = 1, 2$, for some $\mathcal{B}(H)$-measurable $\rho_i : H \to [0, 2]$. Let $i \in \{1, 2\}$. Since $\rho_i$ is bounded, by (H4)(iii) and Theorem 1.2 it follows that
$$\int_H L_\mu u\, d\mu_i = 0, \quad \forall u \in D(L_\mu).$$
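Hence
$$\frac{d}{dt} \int_H e^{tL_\mu} u\, d\mu_i = \int_H L_\mu e^{tL_\mu} u\, d\mu_i = 0, \quad \forall u \in D(L_\mu),$$
i.e.
$$\int_H p_t^\mu u\, d\mu_i = \int_H u\, d\mu_i \quad \forall u \in \mathcal{E}_A(H).$$
Since $\mathcal{E}_A(H)$ is dense in $L^1(H, \mu_i)$, $\mu_i$ is $(p_t^\mu)$-invariant. But as mentioned before, by Theorem 1.6 it follows that $(p_t^\mu)$ is irreducible on $H_0$ (see [9]) and it is strong Feller on $H_0$ by Corollary 1.7. So, since $\mu_i(H_0) = 1$, $\mu_i = \mu$. □

Proof of Corollary 1.10. Let
$$\tilde{A} := A - \omega I, \quad D(\tilde{A}) := D(A), \quad \tilde{F}_0 := F_0 + \omega I.$$
By (H2), $\tilde{A}$ has discrete spectrum. Let $e_k \in H$, $-\lambda_k \in (-\infty, 0]$, be the corresponding orthonormal eigenvectors and eigenvalues, respectively. For $k \in \mathbb{N}$ define
$$\varphi_k(x) := \langle e_k, x \rangle, \quad x \in H.$$
We note that by a simple approximation (1.5) also holds for any Lipschitz function on $H$ and thus (cf. the proof of [6, Proposition 5.7(iii)]) also (1.6) holds for such functions, i.e. in particular, for all $k \in \mathbb{N}$
$$[0, \infty) \ni t \mapsto p_t \varphi_k(x) \ \text{ is continuous for all } x \in H_0. \tag{4.4}$$
Since any compactly supported smooth function on $\mathbb{R}^N$ is the Fourier transform of a Schwartz test function, by approximation it easily follows that setting
$$\mathcal{F}C_b^\infty(\{e_k\}) := \big\{ g(\langle e_1, \cdot \rangle, \ldots, \langle e_N, \cdot \rangle) : N \in \mathbb{N},\ g \in C_b^\infty(\mathbb{R}^N) \big\},$$
we have $\mathcal{F}C_b^\infty(\{e_k\}) \subset D(L_\mu)$ and for $\varphi \in \mathcal{F}C_b^\infty(\{e_k\})$
$$L_\mu \varphi(x) = \frac{1}{2} \operatorname{Tr}\big[D^2 \varphi(x)\big] + \langle x, A D\varphi(x) \rangle + \langle F_0(x), D\varphi(x) \rangle, \quad x \in H.$$
Then by approximation it is easy to show that $\varphi_k, \varphi_k^2 \in D(L_\mu)$ and
$$L_\mu \varphi_k = -\lambda_k \varphi_k + \langle e_k, \tilde{F}_0 \rangle, \qquad L_\mu \varphi_k^2 = -2\lambda_k \varphi_k^2 + 2\varphi_k \langle e_k, \tilde{F}_0 \rangle + 2, \quad \forall k \in \mathbb{N}. \tag{4.5}$$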
Since we assume that $|F_0|$ is in $L^2(H, \mu)$, by [3, Theorem 1.1] we are in the situation of [15, Chapter II]. So, we conclude that by [15, Chapter II, Theorem 1.9] there exists a normal (that is, $P_x[X(0) = x] = 1$) Markov process $(\Omega, \mathcal{F}, (\mathcal{F}_t)_{t \ge 0}, (X(t))_{t \ge 0}, (P_x)_{x \in H_0})$ with state space $H_0$ and $M \in \mathcal{B}(H_0)$, $\mu(M) = 1$, such that $X(t) \in M$ for all $t \ge 0$ $P_x$-a.s. for all $x \in M$ and which has continuous sample paths $P_x$-a.s. for all $x \in M$ and for which by the proof of [6, Proposition 8.2] and (4.4), (4.5) we have that for all $k \in \mathbb{N}$
$$\begin{aligned} \beta_k^x(t) &:= \varphi_k(X(t)) - \varphi_k(x) - \int_0^t L_\mu \varphi_k(X(s))\, ds, \quad t \ge 0, \\ M_k^x(t) &:= \varphi_k^2(X(t)) - \varphi_k^2(x) - \int_0^t L_\mu \varphi_k^2(X(s))\, ds, \quad t \ge 0, \end{aligned} \tag{4.6}$$
are continuous local $(\mathcal{F}_t)$-martingales with $\beta_k^x(0) = M_k^x(0) = 0$ under $P_x$ for all $x \in M$. Fix $x \in M$. Below $E_x$ denotes expectation with respect to $P_x$. Since for $T > 0$
$$\int_H E_x \bigg[ \int_0^T \big(1 + |X(s)|^2\big)\big(1 + |F_0(X(s))|\big)\, ds \bigg]\, \mu(dx) = T \int_H \big(1 + |x|^2\big)\big(1 + |F_0(x)|\big)\, \mu(dx) < \infty,$$
making $M$ smaller if necessary, by (H4)(ii) we may assume that
$$E_x \int_0^T \big(1 + |X(s)|^2\big)\big(1 + |F_0(X(s))|\big)\, ds < \infty. \tag{4.7}$$
By standard Markov process theory we have for their covariation processes under $P_x$,
$$\langle \beta_k^x, \beta_{k'}^x \rangle_t = \int_0^t \big\langle D\varphi_k(X(s)), D\varphi_{k'}(X(s)) \big\rangle\, ds = t\, \delta_{k, k'}, \quad t \ge 0. \tag{4.8}$$
Indeed, an elementary calculation shows that for all $k \in \mathbb{N}$, $t \ge 0$,
$$\beta_k^x(t)^2 - \int_0^t |D\varphi_k(X(s))|^2\, ds = M_k^x(t) - 2\varphi_k(x)\beta_k^x(t) - 2\int_0^t \big(\beta_k^x(t) - \beta_k^x(s)\big)\, L_\mu \varphi_k(X(s))\, ds, \tag{4.9}$$
where all three summands on the right-hand side are martingales. Since we have a similar formula for finite linear combinations of the $\varphi_k$ replacing a single $\varphi_k$, by polarization we get (4.8). Note that by (4.5) and (4.7) all integrals in (4.6), (4.9) are well defined.

Hence, by (4.8) $\beta_k^x$, $k \in \mathbb{N}$, are independent standard $(\mathcal{F}_t)$-Brownian motions under $P_x$. Now it follows by [11, Theorem 13] that, with $W^x = (W^x(t))_{t \ge 0}$ being the cylindrical Wiener process on $H$ given by $W^x = (\beta_k^x e_k)_{k \in \mathbb{N}}$, we have for every $t \ge 0$,
$$X(t) = e^{tA} x + \int_0^t e^{(t-s)A} F_0(X(s))\, ds + \int_0^t e^{(t-s)A}\, dW^x(s), \quad P_x\text{-a.s.}, \tag{4.10}$$
that is, the tuple $(\Omega, \mathcal{F}, (\mathcal{F}_t)_{t \ge 0}, P_x, W^x, X)$ is a solution to
$$\begin{cases} Y(t) = e^{tA} Y(0) + \displaystyle\int_0^t e^{(t-s)A} F_0(Y(s))\, ds + \int_0^t e^{(t-s)A}\, dW(s), \quad \mathbb{P}\text{-a.s.},\ \forall t \ge 0, \\ \operatorname{law} Y(0) = \delta_x \ (:= \text{Dirac measure in } x), \end{cases} \tag{4.11}$$
in the sense of [11, p. 4]. We note that the zero set in (4.10) is indeed independent of $t$, since all terms are continuous in $t$ $P_x$-a.s. because of (H2)(ii) and (4.7).

Claim. We have $X$-pathwise uniqueness for Eq. (4.11) (in the sense of [11, p. 98]).

For any given cylindrical $(\mathcal{F}_t')$-Wiener process $W$ on a stochastic basis $(\Omega', \mathcal{F}', (\mathcal{F}_t')_{t \ge 0}, P')$ let $Y = Y(t)$, $Z = Z(t)$, $t \ge 0$, be two solutions of (4.11) such that $\operatorname{law}(Z) = \operatorname{law}(Y) = \operatorname{law}(X)$ and $Y(0) = Z(0)$ $P'$-a.s. Then by (4.7)
$$E' \int_0^T |F_0(Y(s))|\, ds = E' \int_0^T |F_0(Z(s))|\, ds = E_x \int_0^T |F_0(X(s))|\, ds < \infty \tag{4.12}$$
(which, in particular, implies by (4.11) and by (H2)(i) that both $Y$ and $Z$ have $P'$-a.s. continuous sample paths). Hence applying [11, Theorem 13] again (but this time using the dual implication) we obtain for all $k \in \mathbb{N}$
$$\langle e_k, Y(t) - Z(t) \rangle = -\lambda_k \int_0^t \langle e_k, Y(s) - Z(s) \rangle\, ds + \int_0^t \big\langle e_k, \tilde{F}_0(Y(s)) - \tilde{F}_0(Z(s)) \big\rangle\, ds, \quad t \ge 0,\ P'\text{-a.s.}$$
Therefore, by the chain rule for all $k \in \mathbb{N}$,
$$\langle e_k, Y(t) - Z(t) \rangle^2 = -2\lambda_k \int_0^t \langle e_k, Y(s) - Z(s) \rangle^2\, ds + 2 \int_0^t \langle e_k, Y(s) - Z(s) \rangle\, \big\langle e_k, \tilde{F}_0(Y(s)) - \tilde{F}_0(Z(s)) \big\rangle\, ds, \quad t \ge 0,\ P'\text{-a.s.}$$
Dropping the first term on the right-hand side and summing up over $k \in \mathbb{N}$ (which is justified by (4.11) and the continuity of $Y$ and $Z$), we obtain from (H3) that
$$|Y(t) - Z(t)|^2 \le 2 \int_0^t \big\langle Y(s) - Z(s), \tilde{F}_0(Y(s)) - \tilde{F}_0(Z(s)) \big\rangle\, ds \le 2\omega \int_0^t |Y(s) - Z(s)|^2\, ds, \quad t \ge 0,\ P'\text{-a.s.}$$
Hence, by Gronwall's lemma $Y = Z$ $P'$-a.s. and the Claim is proved.

By the Claim we can apply [11, Theorem 10, (1) ⇔ (3)] and then [11, Theorem 1] to conclude that Eq. (4.11) has a strong solution (see [11, Definition 1]) and that there is one strong solution with the same law as $X$, which hence by (4.7) has continuous sample paths a.s. Now all conditions in [11, Theorem 13.2] are fulfilled and, therefore, we deduce from it that on any stochastic basis $(\Omega, \mathcal{F}, (\mathcal{F}_t)_{t \ge 0}, \mathbb{P})$ with $(\mathcal{F}_t)$-cylindrical Wiener process $W$ on $H$ and for $x, y \in M$ there exist pathwise unique continuous strong solutions $X(t, x)$, $X(t, y)$, $t \ge 0$, to (4.11) such that $\mathbb{P} \circ X(\cdot, x)^{-1} = P_x \circ X^{-1}$ and $\mathbb{P} \circ X(\cdot, y)^{-1} = P_y \circ X^{-1}$, in particular, $X(0, x) = x$ and $X(0, y) = y$ and
$$\mathbb{P} \circ X(t, x)^{-1}(dz) = p_t(x, dz), \quad \mathbb{P} \circ X(t, y)^{-1}(dz) = p_t(y, dz), \quad t \ge 0. \tag{4.13}$$
In particular, we have proved (i). To prove (ii), below for brevity we set $X := X(\cdot, x)$, $X' := X(\cdot, y)$. Then proceeding as in the proof of the Claim, by (1.13) and noting that $s^{-1}\Phi(s) \to \infty$ as $s \to \infty$, we obtain
$$d|X(t) - X'(t)|^2 \le \big\{ a - \Phi_0\big(|X(t) - X'(t)|^2\big) \big\}\, dt \tag{4.14}$$
for some constant $a > 0$, only depending on $\omega$ and $\Phi$, where $\Phi_0 = \frac{1}{2}\Phi$. Now we consider two cases.
Case 1. $|x - y|^2 \le \Phi_0^{-1}(2a)$. Define $f(t) := |X(t) - X'(t)|^2$, $t \ge 0$, and suppose there exists $t_0 \in (0, \infty)$ such that $f(t_0) > \Phi_0^{-1}(a)$. Then we can choose $\delta \in [0, t_0]$ maximal such that
$$f(t) > \Phi_0^{-1}(a), \quad \forall t \in (t_0 - \delta, t_0].$$
Hence, because by (4.14) $f$ is decreasing on every interval where it is larger than $\Phi_0^{-1}(a)$, we obtain that $f(t_0 - \delta) \ge f(t_0) > \Phi_0^{-1}(a)$. Suppose $t_0 - \delta > 0$. Then $f(t_0 - \delta) \le \Phi_0^{-1}(a)$ by the continuity of $f$ and the maximality of $\delta$. So, we must have $t_0 - \delta = 0$, hence $f(t_0) \le f(t_0 - \delta) = f(0) = |x - y|^2 \le \Phi_0^{-1}(2a)$. So,
$$|X(t) - X'(t)|^2 \le \Phi_0^{-1}(2a), \quad \forall t > 0.$$

Case 2. $|x - y|^2 > \Phi_0^{-1}(2a)$. Define $t_0 = \inf\{t \ge 0 : |X(t) - X'(t)|^2 \le \Phi_0^{-1}(2a)\}$. Then by Case 1, starting at $t = t_0$ rather than $t = 0$ we know that
$$|X(t) - X'(t)|^2 \le \Phi_0^{-1}(2a), \quad \forall t \ge t_0. \tag{4.15}$$
Furthermore, it follows from (4.14) that
$$d|X(t) - X'(t)|^2 \le -\frac{1}{2}\Phi_0\big(|X(t) - X'(t)|^2\big)\, dt, \quad \forall t \le t_0.$$
This implies
$$\Psi\big(|X(t) - X'(t)|^2\big) \ge \frac{1}{2} \int_{|X(t) - X'(t)|^2}^{|x - y|^2} \frac{dr}{\Phi_0(r)} \ge \frac{t}{4}, \quad \forall t \le t_0.$$
Therefore,
$$|X(t) - X'(t)|^2 \le \Psi^{-1}(t/4), \quad \forall t \le t_0. \tag{4.16}$$
Combining Case 1, (4.15) and (4.16) we conclude that
$$|X(t) - X'(t)|^2 \le \Psi^{-1}(t/4) + \Phi_0^{-1}(2a), \quad \forall t > 0. \tag{4.17}$$
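The bound (4.17) can be sanity-checked on a scalar model. The sketch below is illustrative only: it assumes the choice $\Phi(r) = r^m$ with $m > 1$, takes $\Psi(s) = \int_s^\infty dr/\Phi(r)$ (the form of $\Psi$ is an assumption here, since (1.12) is not restated in this section), and integrates the equality case $f' = a - \Phi_0(f)$ of (4.14). All function names are hypothetical.

```python
def phi0(r, m):
    """Phi_0 = Phi / 2 for the illustrative choice Phi(r) = r**m."""
    return 0.5 * r ** m


def phi0_inverse(v, m):
    """Inverse of Phi_0 on [0, infinity)."""
    return (2.0 * v) ** (1.0 / m)


def psi_inverse(u, m):
    """Inverse of Psi(s) = s**(1 - m) / (m - 1), assuming m > 1."""
    return ((m - 1.0) * u) ** (-1.0 / (m - 1.0))


def worst_case_path(f0, a, m, t_end, steps=20_000):
    """Explicit Euler scheme for f' = a - Phi_0(f), the equality case of (4.14)."""
    dt = t_end / steps
    f = f0
    out = [(0.0, f)]
    for k in range(1, steps + 1):
        f += dt * (a - phi0(f, m))
        out.append((k * dt, f))
    return out


def bound(t, a, m):
    """Right-hand side of (4.17) for t > 0."""
    return psi_inverse(t / 4.0, m) + phi0_inverse(2.0 * a, m)
```

For $m = 2$ and $a = 1$ the bound reads $4/t + 2$, while the model solution started at a large value decays roughly like $2/t$ before settling near $\Phi_0^{-1}(a) = \sqrt{2}$.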
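Combining (4.17) with Theorem 1.6 for all $f \in B_b(H)$ we obtain
$$\big(p_{t/2}|f|(X(t/2))\big)^2 \le p_{t/2} f^2(X'(t/2)) \exp\bigg[ \frac{\lambda\big(1 + \Psi^{-1}(t/8)\big)}{(1 - e^{-\omega t/2})^2} \bigg], \quad \forall t > 0,$$
for some constant $\lambda > 0$. By Jensen's inequality and approximation it follows that for all $f \in L^2(H, \mu)$
$$\big(p_t|f|(x)\big)^2 \le \mathbb{E}\big(p_{t/2}|f|(X(t/2))\big)^2 \le p_t f^2(y) \exp\bigg[ \frac{\lambda\big(1 + \Psi^{-1}(t/8)\big)}{(1 - e^{-\omega t/2})^2} \bigg], \quad \forall t > 0,\ \forall x, y \in M. \tag{4.18}$$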
But since $H_0 = \operatorname{supp} \mu$, $M$ is dense in $H_0$, hence by the continuity of $p_t f$ (cf. Corollary 1.7) (4.18) holds for all $x \in H_0$, $y \in M$. Since $\mu(M) = 1$ this completes the proof by integrating both sides with respect to $\mu(dy)$. □

Remark 4.2. We would like to mention that by using [2] instead of [15] we can drop the assumption that $|F_0| \in L^2(H, \mu)$. So, by (4.9) and the proof above we can derive (4.8) without assuming the usual energy condition
$$\int_0^t |F_0(X(s))|^2\, ds < \infty, \quad P_x\text{-a.s.}$$
Details will be included in a forthcoming paper. We would like to thank Tobias Kuna at this point from whom we learnt identity (4.9) by private communication.

5. Existence of measures satisfying (H4)

To prove existence of invariant measures we need to strengthen some of our assumptions. So, let us introduce the following conditions.

(H1)′ $(A, D(A))$ is self-adjoint satisfying (1.2).

(H6) There exists $\eta \in (\omega, \infty)$ such that
$$\langle F_0(x) - F_0(y), x - y \rangle \le -\eta |x - y|^2, \quad \forall x, y \in D(F).$$

Remark 5.1. (i) Clearly, (H1)′ implies (H1) and (H5). (H1)′ and (H2)(i) imply that $(A, D(A))$ and thus also $(1 + \omega - A, D(A))$ has a discrete spectrum. Let $\lambda_i \in (0, \infty)$, $i \in \mathbb{N}$, be the eigenvalues of the latter operator. Then by (H2)
$$\sum_{i=1}^\infty \lambda_i^{-1} < \infty. \tag{5.1}$$
(ii) If we assume (5.1), i.e. that $(1 + \omega - A)^{-1}$ is trace class, then everything that follows holds with (H2) replaced by (H2)(i). So, $\sigma^{-1} \in L(H)$ is not needed in this case.

Let $F_\alpha$, $\alpha > 0$, be as in Section 2. Then e.g. by [5, Theorem 3.2] equation (1.1) with $F_\alpha$ replacing $F_0$ has a unique mild solution $X_\alpha(t, x)$, $t \ge 0$. Since there exist $\tilde{\eta} \in (\omega, \infty)$ and $\alpha_0 > 0$ such that each $F_\alpha$, $\alpha \in (0, \alpha_0)$, satisfies (H6) with $\tilde{\eta}$ replacing $\eta$, by [5, Section 3.4] $X_\alpha$ has a unique invariant measure $\mu_\alpha$ on $(H, \mathcal{B}(H))$ such that for each $m \in \mathbb{N}$
$$\sup_{\alpha \in (0, \alpha_0)} \int_H |x|^m\, \mu_\alpha(dx) < \infty. \tag{5.2}$$
That these moments are indeed uniformly bounded in $\alpha$ follows from the proof of [5, Proposition 3.18] and the fact that $\tilde{\eta} \in (\omega, \infty)$. Let $N_Q$ denote the centered Gaussian measure on $(H, \mathcal{B}(H))$ with covariance operator $Q$ defined by
$$Qx := \int_0^\infty e^{tA} \sigma e^{tA} x\, dt, \quad x \in H,$$
which by (H2)(ii) is trace class. Let $W^{1,2}(H, N_Q)$ be defined as usual, that is, as the completion of $\mathcal{E}_A(H)$ with respect to the norm
$$\|\varphi\|_{W^{1,2}} := \bigg( \int_H \big(\varphi^2 + |D\varphi|^2\big)\, dN_Q \bigg)^{1/2}, \quad \varphi \in \mathcal{E}_A(H),$$
where $D$ denotes the first Fréchet derivative. By [7] we know that
$$W^{1,2}(H, N_Q) \subset L^2(H, N_Q), \quad \text{compactly.} \tag{5.3}$$
Theorem 5.2. Assume that (H1)′, (H2), (H3) and (H6) hold and let $\mu_\alpha$, $\alpha \in (0, \alpha_0)$, be as above. Suppose that there exists a lower semi-continuous function $G : H \to [0, \infty]$ such that
$$\{G < \infty\} \subset D(F), \quad |F_0| \le G \ \text{on } D(F)$$
and
$$\sup_{\alpha \in (0, \alpha_0)} \int_H G^2\, d\mu_\alpha < \infty. \tag{5.4}$$
Then $\{\mu_\alpha : \alpha \in (0, \alpha_0)\}$ is tight and any limit point $\mu$ satisfies (H4) and hence by Corollary 1.8 all of these limit points coincide. Furthermore, for all $m \in \mathbb{N}$
$$\int_H \big(|F_0(x)|^2 + |x|^m\big)\, \mu(dx) < \infty \tag{5.5}$$
and there exists $\rho : H \to [0, \infty)$, $\mathcal{B}(H)$-measurable, such that $\mu = \rho N_Q$ and $\sqrt{\rho} \in W^{1,2}(H, \mu)$.
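Proof. We recall that by [3, Theorem 1.1] for each $\alpha \in (0, \alpha_0)$
$$\mu_\alpha = \rho_\alpha N_Q; \quad \sqrt{\rho_\alpha} \in W^{1,2}(H, N_Q) \tag{5.6}$$
and, as is easily seen from its proof, that
$$\int_H |D\sqrt{\rho_\alpha}|^2\, dN_Q \le \frac{1}{4} \int_H |F_\alpha|^2\, d\mu_\alpha. \tag{5.7}$$
But by (2.3) and (5.4) the right-hand side of (5.7) is uniformly bounded in $\alpha$. Hence by (5.3) there exists a zero sequence $\{\alpha_n\}$ such that
$$\sqrt{\rho_{\alpha_n}} \to \sqrt{\rho} \quad \text{in } L^2(H, N_Q) \text{ as } n \to \infty,$$
for some $\sqrt{\rho} \in W^{1,2}(H, N_Q)$ and therefore, in particular,
$$\rho_{\alpha_n} \to \rho \quad \text{in } L^1(H, N_Q) \text{ as } n \to \infty. \tag{5.8}$$
Define $\mu := \rho N_Q$ and $\rho_n := \rho_{\alpha_n}$, $n \in \mathbb{N}$. Since $G$ is lower semi-continuous and $\mu_{\alpha_n} \to \mu$ as $n \to \infty$ weakly, (5.2) and (5.4) imply
$$\int_H \big(G^2(x) + |x|^m\big)\, \mu(dx) < \infty, \quad \forall m \in \mathbb{N}. \tag{5.9}$$
Hence by (5.4) both (H4)(i) and (H4)(ii) follow. So, it remains to prove (H4)(iii). Since $\sigma$ is independent of $\alpha$, to show (5.9) it is enough to prove that for all $\varphi \in C_b(H)$, $h \in D(A)$,
$$\lim_{n \to \infty} \int_H F_{\alpha_n}^h(x)\varphi(x)\, \mu_{\alpha_n}(dx) = \int_H F_0^h(x)\varphi(x)\, \mu(dx), \tag{5.10}$$
where $F_\alpha^h := \langle h, F_\alpha \rangle$, $\alpha \in [0, \alpha_0)$. We have
$$\bigg| \int_H F_{\alpha_n}^h \varphi\, d\mu_{\alpha_n} - \int_H F_0^h \varphi\, d\mu \bigg| \le \|\varphi\|_\infty \int_H \big|F_{\alpha_n}^h - F_0^h\big|\, \rho_n\, dN_Q + \int_H \big|F_0^h \varphi\big|\, |\rho_n - \rho|\, dN_Q.$$
But by (2.3) and (5.4) we have
$$\int_H \big|F_{\alpha_n}^h - F_0^h\big|\, \rho_n\, dN_Q \le \int_{\{|G| \le M\}} \big|F_{\alpha_n}^h - F_0^h\big|\, \rho_n\, dN_Q + \frac{2|h|}{M} \sup_{\alpha \in (0, \alpha_0)} \int_H G^2\, d\mu_\alpha. \tag{5.11}$$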
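Hence first letting $n \to \infty$ then $M \to \infty$ by (2.2), (5.4) and (5.8) Lebesgue's generalized dominated convergence theorem implies that the first term on the right-hand side of (5.11) converges to 0. Furthermore, for every $\delta \in (0, 1)$
$$\bigg| \int_H F_0^h \varphi\, d\mu_{\alpha_n} - \int_H F_0^h \varphi\, d\mu \bigg| \le \bigg| \int_H \frac{F_0^h}{1 + \delta |F_0^h|}\, \varphi\, (\rho_n - \rho)\, dN_Q \bigg| + \delta \|\varphi\|_\infty \bigg( \int_H |F_0^h|^2\, d\mu_{\alpha_n} + \int_H |F_0^h|^2\, d\mu \bigg). \tag{5.12}$$
Since by (2.3) and (5.4)
$$\sup_{\alpha \in (0, \alpha_0)} \int_H |F_0^h|^2\, d\mu_\alpha < \infty,$$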
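(H4)(iii) follows from (5.12) by letting first $n \to \infty$ and then $\delta \to 0$, since for fixed $\delta > 0$ the first term on the right-hand side converges to zero by (5.8). □

Example 5.3. Let $H = L^2(0, 1)$, $Ax = \Delta x$, $x \in D(A) := H^2(0, 1) \cap H_0^1(0, 1)$. Let $f : \mathbb{R} \to \mathbb{R}$ be decreasing such that for some $c_3 > 0$, $m \in \mathbb{N}$,
$$|f(s)| \le c_3\big(1 + |s|^m\big), \quad \forall s \in \mathbb{R}. \tag{5.13}$$
Let $s_i \in \mathbb{R}$, $i \in \mathbb{N}$, be the set of all arguments where $f$ is not continuous and define
$$\bar{f}(s) = \begin{cases} [f(s_i+), f(s_i-)], & \text{if } s = s_i \text{ for some } i \in \mathbb{N}, \\ f(s), & \text{else.} \end{cases}$$
Define
$$F : D(F) \subset H \to 2^H, \quad x \mapsto \bar{f} \circ x,$$
where $D(F) = \{x \in H : \bar{f} \circ x \subset H\}$. Then $F$ is m-dissipative. Let $F_0$ be defined as in Section 2. Since $A \le \omega$ for some $\omega < 0$, it is easy to check that all conditions (H1)′, (H2), (H3), (H6) with $\eta = 0$ hold for any $\sigma \in L(H)$ such that $\sigma^{-1} \in L(H)$. Define
$$G(x) := \begin{cases} \Big( \displaystyle\int_0^1 |x(\xi)|^{2m}\, d\xi \Big)^{1/2} & \text{if } x \in L^{2m}(0, 1), \\ +\infty & \text{if } x \notin L^{2m}(0, 1). \end{cases}$$
Then $\{G < \infty\} \subset D(F)$ and $|F_0| = |F_0|_{L^2(0,1)} \le G$ on $D(F)$. Furthermore, by [6, (9.3)]
$$\sup_{\alpha \in (0, \alpha_0)} \int_H G^2\, d\mu_\alpha < \infty. \tag{5.14}$$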
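The role of the growth condition (5.13) in controlling $|f \circ x|_{L^2(0,1)}$ by the $L^{2m}$-type functional $G$ can be illustrated on a grid. The following is a rough sketch under stated assumptions: the particular nonlinearity `f` (with $c_3 = 1$, $m = 3$ and a jump at 0) and the Riemann-sum discretisation are hypothetical choices, and the inequality checked is $|f \circ x|_{L^2} \le c_3(1 + G(x))$, which follows from (5.13) and Minkowski's inequality.

```python
import math


def f(s):
    """A decreasing nonlinearity with a jump at 0 (hypothetical): |f(s)| <= 1 + |s|**3."""
    if s > 0:
        return -s ** 3 - 1.0
    if s < 0:
        return -s ** 3 + 1.0
    return 0.0  # any value in [f(0+), f(0-)] = [-1, 1] is an admissible selection


def l2_norm(values, h):
    """Riemann-sum approximation of the L^2(0, 1) norm."""
    return math.sqrt(sum(v * v for v in values) * h)


def G(values, h, m):
    """Riemann-sum approximation of (int_0^1 |x|^(2m) d xi)^(1/2)."""
    return math.sqrt(sum(abs(v) ** (2 * m) for v in values) * h)


def sample(x, n=1000):
    """Midpoint samples of a function x on (0, 1), together with the mesh size."""
    h = 1.0 / n
    return [x((k + 0.5) * h) for k in range(n)], h
```

For instance, sampling $x(\xi) = 3\sin(2\pi\xi)$ and comparing `l2_norm([f(v) for v in xs], h)` with `1.0 + G(xs, h, 3)` exhibits the bound on the discrete level.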
Note that from [6, Hypothesis 9.5] only the first inequality, which clearly holds by (5.13) in our case, was used to prove [6, (9.3)]. Hence all assumptions of Theorem 5.2 above hold and we obtain the existence of the desired unique probability measure $\mu$ satisfying (H4) in this case. We emphasize that no continuity properties of $f$ and $F_0$ are required. In particular, all results stated in Section 1 except for Corollary 1.10(ii) then hold in this case. If moreover there exists an increasing positive convex function $\Phi$ on $[0, \infty)$ satisfying (1.12) such that
$$\big(f(s) - f(t)\big)(s - t) \le c - \Phi\big(|s - t|^2\big), \quad s, t \in \mathbb{R},$$
then by Jensen's inequality (1.13) holds. Hence, by Corollary 1.10 one obtains an explicit upper bound for $\|p_t\|_{L^2(H, \mu) \to L^\infty(H, \mu)}$. A natural and simple choice of $\Phi$ is $\Phi(s) = s^m$ for $m > 1$.

One can extend these results to the case where $(0, 1)$ above is replaced by a bounded open set in $\mathbb{R}^d$, $d = 2$ or $3$, for $\sigma = (-\Delta)^\gamma$, $\gamma \in \big(\frac{d-2}{4}, \frac{1}{2}\big)$, based on Remark 1.1(iv).

Before concluding, we want to present a condition in the general case (i.e. for any Hilbert space $H$ as above) that implies (5.4), hence by Theorem 5.2 ensures the existence of a probability measure satisfying (H4) so that all results of Section 1 apply also to this case. As will become clear from the arguments below, such a condition is satisfied if the eigenvalues of $A$ grow fast enough in comparison with $|F_0|$. To this end we first note that by (5.1) for $i \in \mathbb{N}$ we can find $q_i \in (0, \lambda_i)$, $q_i \uparrow \infty$, such that $\sum_{i=1}^\infty q_i^{-1} < \infty$ and $q_i/\lambda_i \to 0$ as $i \to \infty$. Define $\Theta : H \to [0, \infty]$ by
$$\Theta(x) := \sum_{i=1}^\infty \frac{\lambda_i}{q_i}\, \langle x, e_i \rangle^2, \quad x \in H, \tag{5.15}$$
where $\{e_i\}_{i \in \mathbb{N}}$ is an eigenbasis of $(1 + \omega - A, D(A))$ such that $e_i$ has eigenvalue $\lambda_i$. Then $\Theta$ has compact level sets and $|\cdot|^2 \le \Theta$. Below we set
$$H_n := \operatorname{lin\,span}\{e_1, \ldots, e_n\}, \quad \pi_n := \text{projection onto } H_n, \tag{5.16}$$
$$\tilde{A} := A - (1 + \omega)I, \quad D(\tilde{A}) := D(A), \quad \tilde{F}_0 := F_0 + (1 + \omega)I. \tag{5.17}$$
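A concrete choice of the weights $q_i$ can be sketched as follows. This is an illustration under assumptions that are not taken from the text: the operator is the Dirichlet Laplacian on $(0, 1)$ as in Example 5.3, so that $\lambda_i = 1 + \omega + \pi^2 i^2$, the value $\omega = -\pi^2$ is used in the usage check, and the weights $q_i = i^{3/2}/2$ are a hypothetical choice.

```python
import math


def eigenvalues(n, omega):
    """lambda_i = 1 + omega + (pi*i)**2: eigenvalues of 1 + omega - A for the Dirichlet Laplacian on (0, 1)."""
    return [1.0 + omega + (math.pi * i) ** 2 for i in range(1, n + 1)]


def weights(n):
    """Hypothetical choice q_i = i**1.5 / 2: increasing, with summable reciprocals."""
    return [0.5 * i ** 1.5 for i in range(1, n + 1)]


def theta(coeffs, lam, q):
    """Partial sum of Theta(x) = sum_i (lambda_i / q_i) * <x, e_i>**2 for given coefficients <x, e_i>."""
    return sum(l / w * c * c for c, l, w in zip(coeffs, lam, q))
```

With these choices one has $q_i < \lambda_i$, $q_i/\lambda_i \to 0$ and $\sum_i q_i^{-1} = 2\zeta(3/2) < \infty$, and $\Theta$ dominates the squared norm because every ratio $\lambda_i/q_i$ is at least 1.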
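We note that obviously $H_n \subset \{\Theta < +\infty\}$ for all $n \in \mathbb{N}$.

Theorem 5.4. Assume that (H1)′, (H2), (H3) and (H6) hold and let $\mu_\alpha$, $\alpha \in (0, \alpha_0)$, be as above. Suppose that $\{\Theta < +\infty\} \subset D(F)$ and that for some $C \in (0, \infty)$, $m \in \mathbb{N}$
$$|F_0(x)| \le C\big(1 + |x|^m + \Theta^{1/2}(x)\big), \quad \forall x \in D(F). \tag{5.18}$$
Then
$$\sup_{\alpha \in (0, \alpha_0)} \int_H \Theta\, d\mu_\alpha < \infty \tag{5.19}$$
and (5.4) holds, so Theorem 5.2 applies.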
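Proof. Consider the Kolmogorov operator $L_\alpha$ corresponding to $X_\alpha(t, x)$, $t \ge 0$, $x \in H$, which for $\varphi \in \mathcal{F}C_b^2(\{e_n\})$, i.e., $\varphi = g(\langle e_1, \cdot \rangle, \ldots, \langle e_N, \cdot \rangle)$ for some $N \in \mathbb{N}$, $g \in C_b^2(\mathbb{R}^N)$, is given by
$$L_\alpha \varphi(x) := \frac{1}{2} \operatorname{Tr}\big[\sigma^2 D^2 \varphi(x)\big] + \langle x, A D\varphi(x) \rangle + \langle F_\alpha(x), D\varphi(x) \rangle, \quad x \in H, \tag{5.20}$$
where $D^2$ denotes the second Fréchet derivative. Then, an easy application of Itô's formula shows that the $L^1(H, \mu_\alpha)$-generator of $(P_t^\alpha)$ (given as before by $P_t^\alpha f(x) = \mathbb{E}[f(X_\alpha(t, x))]$) is given on $\mathcal{F}C_b^2(\{e_n\})$ by $L_\alpha$. In particular,
$$\int_H L_\alpha \varphi\, d\mu_\alpha = 0, \quad \forall \varphi \in \mathcal{F}C_b^2(\{e_n\}).$$
By a simple approximation argument and (5.2) we get for $\alpha \in (0, \alpha_0)$ and
$$\varphi_n(x) := \sum_{i=1}^n q_i^{-1} \langle x, e_i \rangle^2, \quad x \in H,\ n \in \mathbb{N},$$
that also
$$\int_H L_\alpha \varphi_n\, d\mu_\alpha = 0. \tag{5.21}$$
But for all $x \in H$, with $\tilde{F}_\alpha$ defined as $\tilde{F}_0$ in (5.17), we have
$$\begin{aligned} L_\alpha \varphi_n(x) &= -2 \sum_{i=1}^n \frac{\lambda_i}{q_i} \langle x, e_i \rangle^2 + 2 \sum_{i=1}^n q_i^{-1} \langle \tilde{F}_\alpha(x), e_i \rangle \langle x, e_i \rangle + \sum_{i,j=1}^n q_i^{-1} \langle \sigma_n e_i, \sigma_n e_j \rangle \\ &\le -2\Theta(\pi_n x) + 2 \bigg( \sum_{i=1}^n q_i^{-1} \langle \tilde{F}_\alpha(x), e_i \rangle^2 \bigg)^{1/2} \bigg( \sum_{i=1}^n q_i^{-1} \langle x, e_i \rangle^2 \bigg)^{1/2} + \sum_{i=1}^n q_i^{-1} |\sigma_n e_i|^2 \\ &\le -2\Theta(\pi_n x) + c_1\big(1 + |x|^{m+1} + \Theta^{1/2}(x)|x|\big) + \|\sigma\|^2 \sum_{i=1}^\infty q_i^{-1}, \end{aligned} \tag{5.22}$$
for some constant $c_1$ independent of $n$ and $\alpha$. Here we used (2.3) and (5.18). Now (5.21), (5.2) and (5.22) immediately imply that for some constant $\tilde{c}_1$
$$\sup_{\alpha \in (0, \alpha_0)} \int_H \Theta(x)\, \mu_\alpha(dx) \le \sup_{\alpha \in (0, \alpha_0)} \bigg( \tilde{c}_1 \int_H \big(1 + |x|^{m+2}\big)\, \mu_\alpha(dx) + \|\sigma\|^2 \sum_{i=1}^\infty q_i^{-1} \bigg) < \infty.$$
So, (5.19) is proved, which by (5.18) implies (5.4) and the proof is complete. □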
Acknowledgments

The second named author would like to thank UCSD, in particular, his host Bruce Driver, for a very pleasant stay in La Jolla where a part of this work was done. The authors would like to thank Ouyang for his comments leading to a better constant involved in the Harnack inequality.

References

[1] M. Arnaudon, A. Thalmaier, F.-Y. Wang, Harnack inequality and heat kernel estimates on manifolds with curvature unbounded below, Bull. Sci. Math. 130 (3) (2006) 223–233.
[2] L. Beznea, N. Boboc, M. Röckner, Markov processes associated with Lp-resolvents and applications to stochastic differential equations in Hilbert spaces, J. Evol. Equ. 6 (4) (2006) 745–772.
[3] V. Bogachev, G. Da Prato, M. Röckner, Regularity of invariant measures for a class of perturbed Ornstein–Uhlenbeck operators, Nonlinear Differential Equations Appl. 3 (1996) 261–268.
[4] S. Cerrai, Second Order PDE's in Finite and Infinite Dimensions. A Probabilistic Approach, Lecture Notes in Math., vol. 1762, Springer-Verlag, 2001.
[5] G. Da Prato, Kolmogorov Equations for Stochastic PDEs, Birkhäuser, 2004.
[6] G. Da Prato, M. Röckner, Singular dissipative stochastic equations in Hilbert spaces, Probab. Theory Related Fields 124 (2) (2002) 261–303.
[7] G. Da Prato, P. Malliavin, D. Nualart, Compact families of Wiener functionals, C. R. Acad. Sci. Paris 315 (1992) 1287–1291.
[8] G. Da Prato, J. Zabczyk, Stochastic Equations in Infinite Dimensions, Cambridge Univ. Press, 1992.
[9] W. Liu, Doctor-Thesis, Bielefeld University, 2008.
[10] Z.M. Ma, M. Röckner, Introduction to the Theory of (Non-Symmetric) Dirichlet Forms, Springer-Verlag, 1992.
[11] M. Ondreját, Uniqueness for stochastic evolution equations in Banach spaces, Dissertationes Math. (Rozprawy Mat.) 426 (2004).
[12] S.-X. Ouyang, Doctor-Thesis, Bielefeld University, 2008.
[13] C. Prevot, M. Röckner, A Concise Course on Stochastic Partial Differential Equations, Lecture Notes in Math., Springer, 2007.
[14] M. Röckner, F.-Y. Wang, Harnack and functional inequalities for generalized Mehler semigroups, J. Funct. Anal. 203 (1) (2003) 237–261.
[15] W. Stannat, (Nonsymmetric) Dirichlet operators on L1: Existence, uniqueness and associated Markov processes, Ann. Sc. Norm. Super. Pisa Cl. Sci. (4) 28 (1) (1999) 99–140.
[16] F.-Y. Wang, Logarithmic Sobolev inequalities on noncompact Riemannian manifolds, Probab. Theory Related Fields 109 (1997) 417–424.
[17] F.-Y. Wang, Harnack inequality and applications for stochastic generalized porous media equations, Ann. Probab. 35 (4) (2007) 1333–1350.
Journal of Functional Analysis 257 (2009) 1018–1029 www.elsevier.com/locate/jfa
Stable invariant manifolds for parabolic dynamics ✩ Luis Barreira ∗ , Claudia Valls Departamento de Matemática, Instituto Superior Técnico, 1049-001 Lisboa, Portugal Received 11 December 2008; accepted 14 January 2009 Available online 31 January 2009 Communicated by Paul Malliavin
Abstract

We consider nonautonomous equations $v' = A(t)v$ in a Banach space that exhibit stable and unstable behaviors with respect to arbitrary growth rates $e^{c\rho(t)}$ for some function $\rho(t)$. This corresponds to the existence of a "generalized" exponential dichotomy, which is known to be robust. When $\rho(t) \ne t$ this behavior can be described as a type of parabolic dynamics. We consider the general case of nonuniform exponential dichotomies, for which the Lyapunov stability is not uniform. We show that for any sufficiently small perturbation $f$ of a "generalized" exponential dichotomy there is a stable invariant manifold for the perturbed equation $v' = A(t)v + f(t, v)$. We also consider the case of exponential contractions, which allow a simpler treatment, and we show that they persist under sufficiently small nonlinear perturbations. © 2009 Elsevier Inc. All rights reserved.

Keywords: Invariant manifolds; Parabolic dynamics; Stability theory
✩ Partially supported by FCT through CAMGSD, Lisbon.
* Corresponding author.
E-mail addresses: [email protected] (L. Barreira), [email protected] (C. Valls). URL: http://www.math.ist.utl.pt/~barreira/ (L. Barreira).
0022-1236/$ – see front matter © 2009 Elsevier Inc. All rights reserved. doi:10.1016/j.jfa.2009.01.014

1. Introduction

We consider linear equations
$$v' = A(t)v \tag{1}$$
in a Banach space, which exhibit stable and unstable behaviors with respect to growth rates of the form $e^{c\rho(t)}$ for an arbitrary function $\rho(t)$. This corresponds to the existence of what we
call a $\rho$-nonuniform exponential dichotomy. This includes linear dynamics that may have all Lyapunov exponents zero or all Lyapunov exponents infinite, of course besides the usual case when $\rho(t) = t$. In such a situation, one is not able, at least without further modifications, to apply the existing stability theory, using for example Lyapunov's so-called regularity theory (see for example [1] for details), which can only be applied successfully when all Lyapunov exponents are finite. On the other hand, we show in [3] that for $\rho$ in a large class of rate functions, any linear equation as in (1) in a finite-dimensional space, and with two blocks with asymptotic rates $e^{c\rho(t)}$ respectively with $c$ negative and positive, has a $\rho$-nonuniform exponential dichotomy. Moreover, we show in [5] that a $\rho$-nonuniform exponential dichotomy is robust under sufficiently small linear perturbations. This shows that there are plenty of exponential dichotomies with the generalized behavior given by a function $\rho$. Therefore, it is of interest to develop a corresponding theory.

Our main aim is to show that the exponential behavior exhibited by a linear nonuniform exponential dichotomy persists under sufficiently small nonlinear perturbations. Namely, we establish the existence of stable invariant manifolds for the equation
$$v' = A(t)v + f(t, v), \tag{2}$$
provided that there are constants $c, q > 0$ with $q$ sufficiently large such that
$$\|f(t, u) - f(t, v)\| \le c\|u - v\|\big(\|u\|^q + \|v\|^q\big). \tag{3}$$
We also consider the case of exponential contractions, which allow a simpler treatment, and we show that they persist under sufficiently small nonlinear perturbations. We first consider contractions, and then dichotomies with an elaboration of the proof for contractions. The strategy of proof is essentially based on work in [2], although that paper does not consider exponential contractions.

In the theory of differential equations, the notion of exponential dichotomy plays a central role in the study of invariant manifolds. In particular, the existence of an exponential dichotomy for Eq. (1) ensures the existence of stable and unstable invariant manifolds under sufficiently small nonlinear perturbations. The theory of exponential dichotomies and its applications are widely developed. Moreover, there exist large classes of linear equations with exponential dichotomies. We refer to the books [6–9,15] for details and further references. On the other hand, the notion of exponential dichotomy is too stringent for the dynamics and it is of interest to look for more general types of hyperbolic behavior. It is in this context that we consider the more general notion of nonuniform exponential dichotomy.

Our work can also be considered a contribution to the theory of nonuniformly hyperbolic dynamics. We refer to [1] for an introduction to the theory, which goes back to the works of Oseledets [11] and Pesin [12,13]. Invariant manifolds were first obtained for nonuniformly hyperbolic trajectories by Pesin in [12]. The first related results in Hilbert spaces were established by Ruelle in [14]. The case of transformations in Banach spaces under some compactness assumptions was considered by Mañé in [10].

2. Stability of exponential contractions

Let $X$ be a Banach space and let $A : \mathbb{R}_0^+ \to B(X)$ be a continuous function, where $B(X)$ is the set of bounded linear operators in $X$. We consider the linear equation (1). Given an increasing differentiable function $\rho : \mathbb{R}_0^+ \to \mathbb{R}_0^+$ such that
$$\lim_{t \to +\infty} \frac{\log t}{\rho(t)} = 0, \tag{4}$$
we say that Eq. (1) admits a $\rho$-nonuniform exponential contraction if there exist constants $\lambda < 0$, $a \ge 0$ and $D \ge 1$ such that for every $t \ge s \ge 0$ we have
$$\|T(t, s)\| \le D e^{\lambda(\rho(t) - \rho(s)) + a\rho(s)}, \tag{5}$$
where $T(t, s)$ is the linear evolution operator associated to Eq. (1). Let also $f : \mathbb{R}_0^+ \times X \to X$ be a continuous function with $f(t, 0) = 0$ for every $t \ge 0$. We consider the nonlinear equation (2), and we study the persistence of the stability of an exponential contraction. Let $B(\delta) \subset X$ be the open ball of radius $\delta$ centered at zero, and set
$$\alpha = (1 + 2/q)a. \tag{6}$$
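A scalar example may help to fix ideas about condition (5) with a rate $\rho(t) \ne t$. The sketch below is an illustration under assumptions not taken from the paper: it uses the hypothetical rate function $\rho(t) = \sqrt{1+t} - 1$, which satisfies (4), and the scalar equation $v' = \lambda\rho'(t)v$, whose evolution operator is $T(t, s) = e^{\lambda(\rho(t) - \rho(s))}$, so that (5) holds with $D = 1$ and $a = 0$.

```python
import math


def rho(t):
    """Hypothetical rate function rho(t) = sqrt(1 + t) - 1; it satisfies (4)."""
    return math.sqrt(1.0 + t) - 1.0


def rho_prime(t):
    """Derivative of rho."""
    return 0.5 / math.sqrt(1.0 + t)


def evolve(s, t, lam, steps=20_000):
    """RK4 approximation of T(t, s) for the scalar equation v' = lam * rho'(t) * v."""
    h = (t - s) / steps
    v, tau = 1.0, s

    def rhs(time, val):
        return lam * rho_prime(time) * val

    for _ in range(steps):
        k1 = rhs(tau, v)
        k2 = rhs(tau + h / 2, v + h * k1 / 2)
        k3 = rhs(tau + h / 2, v + h * k2 / 2)
        k4 = rhs(tau + h, v + h * k3)
        v += h * (k1 + 2 * k2 + 2 * k3 + k4) / 6
        tau += h
    return v
```

In this example the solutions decay, but only at the subexponential rate $e^{\lambda\sqrt{t}}$, so the usual Lyapunov exponent $\lim_{t \to \infty} t^{-1}\log|v(t)|$ is zero.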
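Theorem 1. Assume that Eq. (1) admits a $\rho$-nonuniform exponential contraction, and that there exist constants $c, q > 0$ such that (3) holds for every $t \ge 0$ and $u, v \in X$, with $q\lambda + a < 0$. Then for any $D' > D$ and any sufficiently small $\delta > 0$, provided that $s \ge 0$ is sufficiently large, each solution of Eq. (2) with $v(s) = \xi \in B(\delta e^{-\alpha\rho(s)})$ satisfies
$$\|v(t)\| \le D' e^{\lambda(\rho(t) - \rho(s)) + a\rho(s)} \|\xi\|, \quad t \ge s.$$

Proof. We consider the space
$$\mathcal{B} = \big\{ v : [s, +\infty) \to X : v \text{ is continuous and } \|v\|' \le \delta e^{-\alpha\rho(s)} \big\},$$
with the norm
$$\|v\|' = \frac{1}{2D} \sup\big\{ \|v(t)\| e^{-\lambda(\rho(t) - \rho(s)) - a\rho(s)} : t \ge s \big\}.$$
We can easily verify that $\mathcal{B}$ is a complete metric space. Set
$$(Jv)(t) = T(t, s)\xi + \int_s^t T(t, \tau) f(\tau, v(\tau))\, d\tau$$
for each $v \in \mathcal{B}$ and $t \ge s$. Clearly, $Jv$ is continuous and $(Jv)(s) = \xi$. By the assumptions in the theorem, for each $v, w \in \mathcal{B}$ and $\tau \ge s$ we have
$$\begin{aligned} \|f(\tau, v(\tau)) - f(\tau, w(\tau))\| &\le c\|v(\tau) - w(\tau)\|\big(\|v(\tau)\|^q + \|w(\tau)\|^q\big) \\ &\le c 2^{q+2} \delta^q D^{1+q} e^{\lambda(q+1)(\rho(\tau) - \rho(s)) + a(q+1)\rho(s) - \alpha q \rho(s)} \|v - w\|'. \end{aligned}$$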
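Therefore, using (6) we obtain
$$\begin{aligned} \|(Jv)(t) - (Jw)(t)\| &\le \int_s^t \|T(t, \tau)\| \cdot \|f(\tau, v(\tau)) - f(\tau, w(\tau))\|\, d\tau \\ &\le c 2^{q+2} \delta^q D^{2+q} e^{\lambda(\rho(t) - \rho(s))} \|v - w\|' \int_s^t e^{(q\lambda + a)(\rho(\tau) - \rho(s))}\, d\tau. \end{aligned} \tag{7}$$
It follows from (4) that $\rho(t) \to +\infty$ as $t \to +\infty$, and thus $t\rho'(t) \to +\infty$ as $t \to +\infty$. Therefore, for each $c > 0$ there exists $t_0 > 0$ such that $t\rho'(t) > c$ for every $t > t_0$. For $s > t_0$ we obtain
$$\begin{aligned} c \int_s^t e^{(q\lambda + a)(\rho(\tau) - \rho(s))}\, d\tau &\le \int_s^t \tau\rho'(\tau) e^{(q\lambda + a)(\rho(\tau) - \rho(s))}\, d\tau \\ &\le \bigg[ \frac{\tau e^{(q\lambda + a)(\rho(\tau) - \rho(s))}}{q\lambda + a} \bigg]_s^\infty - \frac{1}{q\lambda + a} \int_s^t e^{(q\lambda + a)(\rho(\tau) - \rho(s))}\, d\tau, \end{aligned}$$
since $q\lambda + a < 0$. By (4) we have
$$\log\big(\tau e^{(q\lambda + a)\rho(\tau)}\big) = \log \tau + (q\lambda + a)\rho(\tau) = \rho(\tau)\bigg( \frac{\log \tau}{\rho(\tau)} + q\lambda + a \bigg) \to -\infty$$
as $\tau \to +\infty$, and thus,
$$\bigg( c + \frac{1}{q\lambda + a} \bigg) \int_s^t e^{(q\lambda + a)(\rho(\tau) - \rho(s))}\, d\tau \le \frac{s}{|q\lambda + a|}.$$
Taking $c$ sufficiently large, we have
$$\int_s^t e^{(q\lambda + a)(\rho(\tau) - \rho(s))}\, d\tau \le s. \tag{8}$$
By (7), setting $K = \sup_{s \ge t_0}(s e^{-a\rho(s)}) < +\infty$, we obtain
$$\|(Jv)(t) - (Jw)(t)\| \le \frac{c 2^{q+2} \delta^q D^{2+q} K}{|q\lambda + a|}\, e^{\lambda(\rho(t) - \rho(s)) + a\rho(s)} \|v - w\|',$$
and thus,
$$\|Jv - Jw\|' \le \theta \|v - w\|', \tag{9}$$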
with θ = c2q+1 δ q D 1+q K/|qλ + a|. Taking δ sufficiently small so that θ < 1/2, the operator J becomes a contraction. Furthermore, by (9) and (5), 1 J v J 0 + θ v ξ + θ v 2 1 1 δe−αρ(s) + δe−αρ(s) = δe−αρ(s) , 2 2
(10)
and $J(B) \subset B$. Therefore, there exists a unique function $v \in B$ such that $Jv = v$. It follows from (10) that $\|v\| \le \|\xi\|/(2(1-\theta))$. This completes the proof of the theorem. □

3. Stable invariant manifolds

We establish in this section the existence of stable invariant manifolds under sufficiently small perturbations of nonuniform exponential dichotomies. We first introduce the notion of exponential dichotomy. We say that Eq. (1) admits a ρ-nonuniform exponential dichotomy if there exist projections $P(t)$ for $t \ge 0$ satisfying
$$P(t)T(t,s) = T(t,s)P(s), \quad t, s \ge 0,$$
and there exist constants
$$\lambda < 0 \le \mu, \quad a \ge 0 \quad\text{and}\quad D \ge 1$$
such that for every $t \ge s \ge 0$ we have
$$\|T(t,s)P(s)\| \le De^{\lambda(\rho(t)-\rho(s)) + a\rho(s)}, \qquad \|T(t,s)^{-1}Q(t)\| \le De^{-\mu(\rho(t)-\rho(s)) + a\rho(t)}, \tag{11}$$
where $Q(t) = \mathrm{Id} - P(t)$. For each $t \ge 0$ we consider the linear subspaces
$$E(t) = P(t)(X) \quad\text{and}\quad F(t) = Q(t)(X).$$
Now we assume that Eq. (1) admits a ρ-nonuniform exponential dichotomy. For each $t \ge 0$, let $B_t(\delta) \subset E(t)$ be the open ball of radius $\delta$ centered at zero, and set $\beta = a(1 + 3/q)$. We consider the set
$$Z_\beta = Z_\beta(\delta) = \bigl\{(s,\xi) : s \ge 0,\ \xi \in B_s\bigl(\delta e^{-\beta\rho(s)}\bigr)\bigr\}, \tag{12}$$
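and we denote by $\mathcal{X}_\beta$ the space of continuous functions $\phi : Z_\beta \to X$ such that for every $s \ge 0$ we have $\phi(s,0) = 0$, $\phi(s,\xi) \in F(s)$, and
$$\|\phi(s,\xi) - \phi(s,\bar\xi)\| \le \|\xi - \bar\xi\|$$
for every $\xi, \bar\xi \in B_s(\delta e^{-\beta\rho(s)})$. With the norm
$$\|\phi\| = \sup\bigl\{\|\phi(s,\xi)\|/\|\xi\| : s \ge 0 \text{ and } \xi \in B_s\bigl(\delta e^{-\beta\rho(s)}\bigr) \setminus \{0\}\bigr\},$$
$\mathcal{X}_\beta$ becomes a complete metric space. Given $\phi \in \mathcal{X}_\beta$ we consider its graph
$$W = \bigl\{(s, \xi, \phi(s,\xi)) : (s,\xi) \in Z_\beta\bigr\}. \tag{13}$$
For each $(s,\xi,\eta) \in \mathbb{R}_0^+ \times E(s) \times F(s)$ we also consider the semiflow
$$\Psi_\tau(s,\xi,\eta) = \bigl(s+\tau,\ x(s+\tau),\ y(s+\tau)\bigr), \quad \tau \ge 0,$$
where
$$x(t) = T(t,s)\xi + \int_s^t T(t,\tau)P(\tau)f(\tau, x(\tau), y(\tau))\,d\tau,$$
$$y(t) = T(t,s)\eta + \int_s^t T(t,\tau)Q(\tau)f(\tau, x(\tau), y(\tau))\,d\tau.$$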
The following is our stable manifold theorem.

Theorem 2. Assume that Eq. (1) admits a ρ-nonuniform exponential dichotomy, and that there exist constants $c, q > 0$ such that (3) holds for every $t \ge 0$ and $u, v \in X$, with $q\lambda + a < 0$ and $\lambda + a < \mu$. Then there exist $\delta, R, D' > 0$, and a unique function $\phi \in \mathcal{X}_\beta$ such that:
1. the set $W$ is forward invariant under the semiflow $\Psi_\tau$, that is,
$$\Psi_\tau(s, \xi, \phi(s,\xi)) \in W \quad\text{whenever } (s,\xi) \in Z_{\beta+a}(\delta/R);$$
2. for every $s \ge 0$, $\xi, \bar\xi \in B_s(\delta e^{-(\beta+a)\rho(s)})$, and $\tau \ge 0$ we have
$$\|\Psi_\tau(s,\xi,\phi(s,\xi)) - \Psi_\tau(s,\bar\xi,\phi(s,\bar\xi))\| \le D'e^{\lambda(\rho(\tau)-\rho(s)) + a\rho(s)}\,\|\xi - \bar\xi\|.$$

Proof. We consider the space $\mathcal{X}$ of continuous functions $\phi : Z \to X$, where
$$Z = \{(s,\xi) : s \ge 0,\ \xi \in E(s)\},$$
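such that $\phi|Z_\beta \in \mathcal{X}_\beta$, and
$$\phi(s,\xi) = \phi\bigl(s,\ \delta e^{-\beta\rho(s)}\xi/\|\xi\|\bigr) \quad\text{whenever } \xi \notin B_s\bigl(\delta e^{-\beta\rho(s)}\bigr).$$
We note that there is a one-to-one correspondence between functions in $\mathcal{X}_\beta$ and $\mathcal{X}$. In particular, $\mathcal{X}$ is a Banach space with the norm $\mathcal{X} \ni \phi \mapsto \|\phi|Z_\beta\|$. Moreover, for each $\phi \in \mathcal{X}$ and $s \ge 0$ we have (see Lemma 4.3 in [4])
$$\|\phi(s,x) - \phi(s,y)\| \le 2\|x - y\| \quad\text{for every } x, y \in E(s). \tag{14}$$
Now we prove some auxiliary results.

Lemma 1. There exists $R > 0$ such that for every $\delta > 0$ sufficiently small:
1. for each $\phi \in \mathcal{X}$, given $(s,\xi) \in Z_\beta$ there is a unique continuous function $x = x_\phi : [s,+\infty) \times B_s(\delta e^{-\beta\rho(s)}) \to X$ such that for every $t \ge s$,
$$x(t) = T(t,s)\xi + \int_s^t T(t,\tau)P(\tau)f\bigl(\tau, x(\tau,\xi), \phi(\tau, x(\tau,\xi))\bigr)\,d\tau; \tag{15}$$
2. $\|x_\phi(t,\xi)\| \le Re^{\lambda(\rho(t)-\rho(s)) + a\rho(s)}\,\|\xi\|$, $t \ge s$.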
Proof. This is an elaboration of the proof of Theorem 1. We consider the space $\mathcal{B}_s$ of continuous functions
$$x : [s,+\infty) \times B_s\bigl(\delta e^{-\beta\rho(s)}\bigr) \to X$$
such that $\|x\| \le \delta e^{-\beta\rho(s)}$, with the norm
$$\|x\| = \frac{1}{2D}\sup\bigl\{\|x(t,\xi)\|e^{-\lambda(\rho(t)-\rho(s)) - a\rho(s)} : t \ge s,\ (s,\xi) \in Z_\beta(\delta)\bigr\}.$$
We can easily verify that $\mathcal{B}_s$ is a complete metric space. We define an operator $J_s$ by
$$(J_s x)(t,\xi) = U(t,s)\xi + \int_s^t U(t,\tau)f\bigl(\tau, x(\tau,\xi), \phi(\tau, x(\tau,\xi))\bigr)\,d\tau$$
for each $x \in \mathcal{B}_s$. Clearly, $J_s x$ is continuous, and $(J_s x)(s,\xi) = \xi$. By the assumptions in the theorem, for each $x, y \in \mathcal{B}_s$, $\phi \in \mathcal{X}$, and $\tau \ge s$ we have
$$\bigl\|f\bigl(\tau, x(\tau,\xi), \phi(\tau, x(\tau,\xi))\bigr) - f\bigl(\tau, y(\tau,\xi), \phi(\tau, y(\tau,\xi))\bigr)\bigr\| \le 2c6^{q+1}\delta^q D^{1+q}e^{\lambda(q+1)(\rho(\tau)-\rho(s)) + a(q+1)\rho(s) - \beta q\rho(s)}\,\|x - y\|.$$
Therefore, proceeding as in the proof of Theorem 1 (see also (8)) we obtain
$$\|(J_s x) - (J_s y)\| \le \theta\|x - y\|, \tag{16}$$
with $\theta = 2c6^{q+1}\delta^q D^{2+q}K/|q\lambda+a|$, and $K = \sup_{s \ge t_0}(se^{-a\rho(s)})$. Taking $\delta$ sufficiently small so that $\theta < 1/2$, the operator $J_s$ becomes a contraction. Moreover, by (16) and (11), proceeding again as in the proof of Theorem 1 we obtain
$$\|J_s x\| \le \delta e^{-\beta\rho(s)}, \quad\text{that is,}\quad J_s(\mathcal{B}_s) \subset \mathcal{B}_s,$$
and the unique fixed point of $J_s$ satisfies $\|x\| \le \|\xi\|/(2(1-\theta))$. □
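Lemma 2. There exists $K' > 0$ such that for every $\delta > 0$ sufficiently small, $\phi, \psi \in \mathcal{X}$, $s \ge 0$ sufficiently large, $\xi, \bar\xi \in B_s(\delta e^{-\beta\rho(s)})$, and $t \ge s$ we have
$$\|x_\phi(t,\xi) - x_\phi(t,\bar\xi)\| + \|x_\phi(t,\xi) - x_\psi(t,\xi)\| \le K'e^{\lambda(\rho(t)-\rho(s)) + a\rho(s)}\bigl(\|\xi - \bar\xi\| + \|\xi\| \cdot \|\phi - \psi\|\bigr). \tag{17}$$

Proof. For every $\tau \ge s$, setting
$$\eta = 2c3^{q+1}R^q\delta^q, \tag{18}$$
and using the definition of $\beta$ in (12) and Lemma 1, we have
$$\bigl\|f\bigl(\tau, x_\phi(\tau,\xi), \phi(\tau, x_\phi(\tau,\xi))\bigr) - f\bigl(\tau, x_\phi(\tau,\bar\xi), \phi(\tau, x_\phi(\tau,\bar\xi))\bigr)\bigr\| \le \eta e^{-3a\rho(s)}e^{q\lambda(\rho(\tau)-\rho(s))}\,\|x_\phi(\tau,\xi) - x_\phi(\tau,\bar\xi)\|, \tag{19}$$
and
$$\bigl\|f\bigl(\tau, x_\phi(\tau,\xi), \phi(\tau, x_\phi(\tau,\xi))\bigr) - f\bigl(\tau, x_\psi(\tau,\xi), \psi(\tau, x_\psi(\tau,\xi))\bigr)\bigr\| \le \eta e^{-3a\rho(s)}e^{q\lambda(\rho(\tau)-\rho(s))}\bigl(\|x_\phi(\tau,\xi)\| \cdot \|\phi - \psi\| + 3\|x_\phi(\tau,\xi) - x_\psi(\tau,\xi)\|\bigr). \tag{20}$$
Setting
$$\gamma(t) = \|x_\phi(t,\xi) - x_\phi(t,\bar\xi)\| + \|x_\phi(t,\xi) - x_\psi(t,\xi)\|,$$
we obtain
$$\gamma(t) \le \|U(t,s)\| \cdot \|\xi - \bar\xi\| + 3\eta e^{-3a\rho(s)}\int_s^t \|U(t,\tau)\|e^{q\lambda(\rho(\tau)-\rho(s))}\gamma(\tau)\,d\tau + \eta e^{-3a\rho(s)}\int_s^t \|U(t,\tau)\|e^{q\lambda(\rho(\tau)-\rho(s))}\|x_\phi(\tau)\| \cdot \|\phi - \psi\|\,d\tau,$$
where $U(t,s) = T(t,s)P(s)$. Using (11) we conclude that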
$$e^{-\lambda(\rho(t)-\rho(s))}\gamma(t) \le De^{a\rho(s)}\|\xi - \bar\xi\| + 3D\eta e^{-2a\rho(s)}\int_s^t e^{(q\lambda+a)(\rho(\tau)-\rho(s))}e^{-\lambda(\rho(\tau)-\rho(s))}\gamma(\tau)\,d\tau + \eta DRe^{-a\rho(s)}\|\xi\| \cdot \|\phi - \psi\|\int_s^t e^{(q\lambda+a)(\rho(\tau)-\rho(s))}\,d\tau.$$
It was shown in the proof of Theorem 1 that there exists $t_0 > 0$ such that (8) holds for any $t > t_0$. Therefore, for $s > t_0$ we have
$$e^{-\lambda(\rho(t)-\rho(s))}\gamma(t) \le De^{a\rho(s)}\|\xi - \bar\xi\| + \eta DRK\|\xi\| \cdot \|\phi - \psi\| + 3D\eta e^{-2a\rho(s)}\int_s^t e^{(q\lambda+a)(\rho(\tau)-\rho(s))}e^{-\lambda(\rho(\tau)-\rho(s))}\gamma(\tau)\,d\tau.$$
Applying Gronwall's lemma to the function $e^{-\lambda(\rho(t)-\rho(s))}\gamma(t)$, we obtain
$$\gamma(t) \le K'e^{\lambda(\rho(t)-\rho(s)) + a\rho(s)}\bigl(\|\xi - \bar\xi\| + \|\xi\| \cdot \|\phi - \psi\|\bigr),$$
for some constant $K' > 0$. □
The next step is to show the existence of a function $\phi \in \mathcal{X}$ satisfying
$$\phi(t, x(t)) = T(t,s)\phi(s, x(s)) + \int_s^t T(t,\tau)Q(\tau)f\bigl(\tau, x(\tau), \phi(\tau, x(\tau))\bigr)\,d\tau \tag{21}$$
with $x = x_\phi$. First we transform this problem into an equivalent one.

Lemma 3. Given $\delta > 0$ sufficiently small, $s \ge 0$ sufficiently large, and $\phi \in \mathcal{X}$ the following properties hold:
1. if (21) holds for every $(s,\xi) \in Z_\beta$ and $t \ge s$, then
$$\phi(s,\xi) = -\int_s^\infty T(\tau,s)^{-1}Q(\tau)f\bigl(\tau, x_\phi(\tau), \phi(\tau, x_\phi(\tau))\bigr)\,d\tau \tag{22}$$
for every $(s,\xi) \in Z_\beta$;
2. if (22) holds for every $(s,\xi) \in Z_\beta$, then (21) holds for every $(s,\xi) \in Z_{\beta+a}(\delta/R)$ and $t \ge s$.

Proof. By Lemma 1 we have
$$\bigl\|f\bigl(\tau, x_\phi(\tau,\xi), \phi(\tau, x_\phi(\tau,\xi))\bigr)\bigr\| \le c3^{q+1}\|x_\phi(\tau,\xi)\|^{q+1} \le c3^{q+1}R^{q+1}e^{(q+1)\lambda(\rho(\tau)-\rho(s)) + qa\rho(s)}\,\|\xi\|^{q+1},$$
and hence,
$$\int_s^\infty \bigl\|T(\tau,s)^{-1}Q(\tau)f\bigl(\tau, x_\phi(\tau,\xi), \phi(\tau, x_\phi(\tau,\xi))\bigr)\bigr\|\,d\tau \le c3^{q+1}DR^{q+1}\|\xi\|^{q+1}e^{(q+1)a\rho(s)}\int_s^\infty e^{((q+1)\lambda-\mu+a)(\rho(\tau)-\rho(s))}\,d\tau.$$
Since $(q+1)\lambda - \mu + a < 0$, arguing as in the proof of Theorem 1 (see (8)) we find that the integral is well defined. Now we assume that (21) holds for every $(s,\xi) \in Z_\beta$ and $t \ge s$. This is equivalent to
$$\phi(s,\xi) = V(t,s)^{-1}\phi(t, x_\phi(t,\xi)) - \int_s^t V(\tau,s)^{-1}f\bigl(\tau, x_\phi(\tau,\xi), \phi(\tau, x_\phi(\tau,\xi))\bigr)\,d\tau, \tag{23}$$
where $V(t,s) = T(t,s)Q(s)$. Since
$$\|V(t,s)^{-1}\phi(t, x_\phi(t,\xi))\| \le 2DR\|\xi\|e^{2a\rho(s)}e^{(\lambda-\mu+a)(\rho(t)-\rho(s))},$$
and $\lambda + a < \mu$, letting $t \to +\infty$ in (23) we conclude that (22) holds for every $(s,\xi) \in Z_\beta$ and $t \ge s$. The second property follows by repeating arguments in the proof of Lemma 4.7 in [4]. □

Lemma 4. Given $\delta > 0$ sufficiently small, there is a unique function $\phi \in \mathcal{X}$ such that (22) holds for every $(s,\xi) \in Z_\beta$.

Proof. We consider the operator $\Phi$ defined for each $\phi \in \mathcal{X}$ by
$$(\Phi\phi)(s,\xi) = -\int_s^\infty V(\tau,s)^{-1}f\bigl(\tau, x_\phi(\tau,\xi), \phi(\tau, x_\phi(\tau,\xi))\bigr)\,d\tau$$
when $(s,\xi) \in Z_\beta$, and by $(\Phi\phi)(s,\xi) = (\Phi\phi)(s, \delta e^{-\beta\rho(s)}\xi/\|\xi\|)$ otherwise. When $\xi = 0$ we have $x_\phi(t,\xi) = 0$, and hence $(\Phi\phi)(t,0) = 0$. By (19) and Lemmas 1 and 2 we obtain
$$a(\tau) := \bigl\|f\bigl(\tau, x_\phi(\tau,\xi), \phi(\tau, x_\phi(\tau,\xi))\bigr) - f\bigl(\tau, x_\phi(\tau,\bar\xi), \phi(\tau, x_\phi(\tau,\bar\xi))\bigr)\bigr\| \le \eta K'e^{-2a\rho(s)}e^{(q+1)\lambda(\rho(\tau)-\rho(s))}\,\|\xi - \bar\xi\|,$$
and using (11),
$$\|(\Phi\phi)(s,\xi) - (\Phi\phi)(s,\bar\xi)\| \le \int_s^\infty \|V(\tau,s)^{-1}\|\,a(\tau)\,d\tau \le \eta K'e^{-a\rho(s)}\,\|\xi - \bar\xi\|\int_s^\infty e^{((q+1)\lambda-\mu+a)(\rho(\tau)-\rho(s))}\,d\tau.$$
Since $(q+1)\lambda - \mu + a < 0$, arguing as in the proof of Theorem 1 (see (8)) we find that there exists $t_0 > 0$ such that
$$\int_s^\infty e^{((q+1)\lambda-\mu+a)(\rho(\tau)-\rho(s))}\,d\tau \le s$$
for every $s \ge t_0$. Hence, we obtain
$$\|(\Phi\phi)(s,\xi) - (\Phi\phi)(s,\bar\xi)\| \le \eta K'K\,\|\xi - \bar\xi\|$$
whenever $(s,\xi), (s,\bar\xi) \in Z_\beta$, and proceeding as in the proof of Lemma 4.3 in [4] yields
$$\|(\Phi\phi)(s,\xi) - (\Phi\phi)(s,\bar\xi)\| \le 2\eta K'K\,\|\xi - \bar\xi\|$$
for arbitrary $\xi$ and $\bar\xi$. Therefore, taking $\delta$ sufficiently small (see (18)) the operator $\Phi : \mathcal{X} \to \mathcal{X}$ is well defined. On the other hand, by (20) and Lemmas 1 and 2 we have
$$b(\tau) := \bigl\|f\bigl(\tau, x_\phi(\tau,\xi), \phi(\tau, x_\phi(\tau,\xi))\bigr) - f\bigl(\tau, x_\psi(\tau,\xi), \psi(\tau, x_\psi(\tau,\xi))\bigr)\bigr\| \le \eta(R + 3K')e^{-2a\rho(s)}e^{(q+1)\lambda(\rho(\tau)-\rho(s))}\,\|\xi\| \cdot \|\phi - \psi\|,$$
and using (11),
$$\|(\Phi\phi)(s,\xi) - (\Phi\psi)(s,\xi)\| \le \int_s^\infty \|V(\tau,s)^{-1}\|\,b(\tau)\,d\tau \le \eta(R + 3K')e^{-a\rho(s)}\,\|\xi\| \cdot \|\phi - \psi\|\int_s^\infty e^{((q+1)\lambda-\mu+a)(\rho(\tau)-\rho(s))}\,d\tau.$$
This implies that
$$\|\Phi\phi - \Phi\psi\| \le \eta(R + 3K')K''\,\|\phi - \psi\|$$
for some constant $K'' > 0$. Taking $\delta$ sufficiently small (see (18)) the operator $\Phi : \mathcal{X} \to \mathcal{X}$ becomes a contraction. □

We proceed with the proof of the theorem. By Lemma 1, for each $\phi \in \mathcal{X}$ there is a unique function $x = x_\phi$ satisfying (15), and thus it remains to solve (21) for $\phi$ setting $x = x_\phi$. This is
an immediate consequence of Lemmas 3 and 4 for $(s,\xi) \in Z_{\beta+a}(\delta/R) \subset Z_\beta$. Furthermore, by Lemma 3, for every $(s,\xi) \in Z_{\beta+a}(\delta/R)$ we have
$$\|x_\phi(t)\| \le Re^{\lambda(\rho(t)-\rho(s)) + a\rho(s)}\,\|\xi\| \le \delta e^{-\beta\rho(s)}, \tag{24}$$
and $(t, x_\phi(t)) \in Z_\beta$ for every $t \ge s$. Thus, there exists a unique function $\phi \in \mathcal{X}$ such that the graph set $W$ in (13) obtained from $\phi|Z_\beta$ is invariant under the semiflow $\Psi_\tau$ for initial conditions $(s,\xi) \in Z_{\beta+a}(\delta/R)$. Finally, it follows from (17) and (14) that for every $t \ge s$,
$$\|\Psi_{t-s}(s,\xi,\phi(s,\xi)) - \Psi_{t-s}(s,\bar\xi,\phi(s,\bar\xi))\| \le 3\|x_\phi(t,\xi) - x_\phi(t,\bar\xi)\| \le 3K'e^{\lambda(\rho(t)-\rho(s)) + a\rho(s)}\,\|\xi - \bar\xi\|. \tag{25}$$
For each $\xi, \bar\xi \in B_s(\delta e^{-(\beta+a)\rho(s)})$, in view of (24) we can replace the function $\phi \in \mathcal{X}$ in (25) by its restriction to $Z_\beta$. This completes the proof. □

References

[1] L. Barreira, Ya. Pesin, Lyapunov Exponents and Smooth Ergodic Theory, Univ. Lecture Ser., vol. 23, Amer. Math. Soc., 2002.
[2] L. Barreira, C. Valls, Stable manifolds for nonautonomous equations without exponential dichotomy, J. Differential Equations 221 (2006) 58–90.
[3] L. Barreira, C. Valls, Growth rates and nonuniform hyperbolicity, Discrete Contin. Dyn. Syst. 22 (2008) 509–528.
[4] L. Barreira, C. Valls, Stability of Nonautonomous Differential Equations, Lecture Notes in Math., vol. 1926, Springer, 2008.
[5] L. Barreira, C. Valls, Robustness of general dichotomies, J. Funct. Anal. (2009), doi:10.1016/j.jfa.2008.11.018, in press.
[6] C. Chicone, Yu. Latushkin, Evolution Semigroups in Dynamical Systems and Differential Equations, Math. Surveys Monogr., vol. 70, Amer. Math. Soc., 1999.
[7] W. Coppel, Dichotomies in Stability Theory, Lecture Notes in Math., vol. 629, Springer, 1978.
[8] J. Hale, Asymptotic Behavior of Dissipative Systems, Math. Surveys Monogr., vol. 25, Amer. Math. Soc., 1988.
[9] D. Henry, Geometric Theory of Semilinear Parabolic Equations, Lecture Notes in Math., vol. 840, Springer, 1981.
[10] R. Mañé, Lyapunov exponents and stable manifolds for compact transformations, in: J. Palis (Ed.), Geometric Dynamics, Rio de Janeiro, 1981, in: Lecture Notes in Math., vol. 1007, Springer, 1983, pp. 522–577.
[11] V. Oseledets, A multiplicative ergodic theorem. Lyapunov characteristic numbers for dynamical systems, Trans. Moscow Math. Soc. 19 (1968) 197–221.
[12] Ya. Pesin, Families of invariant manifolds corresponding to nonzero characteristic exponents, Math. USSR-Izv. 10 (1976) 1261–1305.
[13] Ya. Pesin, Characteristic Lyapunov exponents, and smooth ergodic theory, Russian Math. Surveys 32 (1977) 55–114.
[14] D. Ruelle, Characteristic exponents and invariant manifolds in Hilbert space, Ann. of Math. (2) 115 (1982) 243–290.
[15] G. Sell, Y. You, Dynamics of Evolutionary Equations, Appl. Math. Sci., vol. 143, Springer, 2002.
Journal of Functional Analysis 257 (2009) 1030–1052 www.elsevier.com/locate/jfa
An asymptotic functional-integral solution for the Schrödinger equation with polynomial potential S. Albeverio a,b,c,d,e,f,g , S. Mazzucchi a,e,∗,1 a Institut für Angewandte Mathematik, Wegelerstr. 6, 53115 Bonn, Germany b HCM, Germany c SFB611, BIBOS, Germany d IZKS, Austria e Dipartimento di Matematica, Università di Trento, 38050 Povo, Trento, TN, Italy f Cerfim (Locarno), Switzerland g Acc. Arch. (USI) (Mendrisio), Italy
Received 23 January 2009; accepted 5 February 2009 Available online 20 February 2009 Communicated by Paul Malliavin
Abstract A functional integral representation for the weak solution of the Schrödinger equation with a polynomially growing potential is proposed in terms of an analytically continued Wiener integral. The asymptotic expansion in powers of the coupling constant λ of the matrix elements of the Schrödinger group is studied and its Borel summability is proved. © 2009 Elsevier Inc. All rights reserved. Keywords: Feynman path integrals; Schrödinger equation; Analytic continuation of Wiener integrals; Polynomial potential; Asymptotic expansions
* Corresponding author at: Dipartimento di Matematica, Università di Trento, 38050 Povo, Trento, TN, Italy.
E-mail address: [email protected] (S. Mazzucchi). 1 F. Severi fellow.
0022-1236/$ – see front matter © 2009 Elsevier Inc. All rights reserved. doi:10.1016/j.jfa.2009.02.005
S. Albeverio, S. Mazzucchi / Journal of Functional Analysis 257 (2009) 1030–1052
1031
1. Introduction

The Schrödinger equation
$$\begin{cases} i\hbar\,\dfrac{\partial}{\partial t}\psi = -\dfrac{\hbar^2}{2m}\Delta\psi + V\psi,\\[4pt] \psi(0,x) = \psi_0(x) \end{cases} \tag{1}$$
with a polynomial potential V of the form V (x) = λ|x|2M and the asymptotic behaviour of its solution in some limiting situation (for instance when λ → 0 or h¯ → 0, or t → 0) is a largely studied topic [9,25,11,8]. Particularly interesting is the study of a possible functional integral representation, in the spirit of Feynman path integrals. During the last four decades, rigorous mathematical definitions of the heuristic Feynman path integrals have been given by means of different methods, and the properties of these rigorous integrals have been studied. Let us mention here only three of them, namely the one using the analytic continuation of Wiener integrals [10,12,16,17], the one provided by infinite-dimensional oscillatory integrals [1,3,20] and the one using white noise calculus [15] (see also the references given in [1,20] to other approaches). The main problem which is common to all the existing approaches is the restriction on the class of potentials V which can be handled. For most results one has to impose that V has at most quadratic growth at infinity. There are two exceptions to this restriction, the one of potentials which are Laplace transform of measures (like exponential potentials, see [1,15] and references therein) and quartic potentials [2,5–7,19]. A difficulty in the study of Eq. (1) with a polynomial potential is the non-regular behaviour of the solution. Indeed in [28] it has been shown that for superquadratic potentials the fundamental solution of (1) is nowhere of class C 1 . In [5,7,19] an infinite-dimensional oscillatory integral representation for the weak solution of the Schrödinger equation, i.e. the matrix element of the Schrödinger group, has been presented and studied in the case where the potential has precisely a quartic growth at infinity. The present paper generalizes partially these results to polynomial potentials with higher growth at infinity. 
For a dense set of vectors $\phi, \psi \in L^2(\mathbb{R}^d)$, we define an analytically continued Wiener integral $I_t^i(\phi,\psi)$ (Eq. (28)) which realizes rigorously the Feynman path integral representing the corresponding "matrix elements" of the Schrödinger group $\langle\phi, e^{-\frac{i}{\hbar}Ht}\psi\rangle$ and prove that it solves the Schrödinger equation in a weak sense (Theorem 8). The relation between the functional integral $I_t^i(\phi,\psi)$ and the matrix elements $\langle\phi, e^{-\frac{i}{\hbar}Ht}\psi\rangle$ is investigated in detail. In particular we prove that these quantities are asymptotically equivalent both as $t \to 0$ and as $\lambda \to 0$. The asymptotic expansion in powers of $\lambda$ of $\langle\phi, e^{-\frac{i}{\hbar}Ht}\psi\rangle$ is studied and its Borel summability is proved. This result allows one to recover the matrix elements of the Schrödinger group from the asymptotic expansion in powers of $\lambda$ of the functional integral $I_t^i(\phi,\psi)$, which in this sense can be recognized as an asymptotic weak solution of the Schrödinger equation.

The paper is organized as follows. In Section 2 the analyticity properties of the spectrum of the anharmonic oscillator Hamiltonian $H$ and of the matrix elements $\langle\phi, e^{-\frac{i}{\hbar}Ht}\psi\rangle$ of the Schrödinger group are studied. In Section 3 the Borel summability of the asymptotic expansion in powers of the coupling constant $\lambda$ (Dyson expansion) of the quantities $\langle\phi, e^{-\frac{i}{\hbar}Ht}\psi\rangle$ is proved. Section 4 studies the definition and the properties of the functional integral $I_t^i(\phi,\psi)$, while Section 5 investigates its relations with the matrix elements of the Schrödinger group and their asymptotic equivalence.
2. The Schrödinger equation with polynomial potential

Let us consider the quantum anharmonic oscillator Hamiltonian with polynomial potential on $L^2(\mathbb{R}^d)$, that is the operator defined on the vectors $\phi \in C_0^\infty(\mathbb{R}^d)$ by
$$H\phi(x) = -\frac{\hbar^2}{2}\Delta\phi(x) + \lambda V_{2M}(x)\phi(x), \quad x \in \mathbb{R}^d, \tag{2}$$
where $\lambda \in \mathbb{R}$ is a real positive "coupling" constant, $V_{2M}$ is a positive homogeneous polynomial of order $2M$, and $\hbar$ is the reduced Planck constant (the mass of the particle is set equal to 1 for simplicity). In the following, in order to simplify some notations, we shall put $V_{2M}(x) := |x|^{2M}$, but all our results are also valid in the more general case as they depend only on the positivity and the homogeneity properties of the potential $V_{2M}$. $H$ is essentially self-adjoint on $C_0^\infty(\mathbb{R}^d)$ (see [23, Theorem X.28]). Its closure, denoted again by $H$, has the following domain:
$$D(H) = D(\Delta) \cap D\bigl(|x|^{2M}\bigr) = \left\{\phi \in L^2(\mathbb{R}^d) : \int |x|^{4M}|\phi(x)|^2\,dx < \infty,\ \int |p|^4|\hat\phi(p)|^2\,dp < \infty\right\},$$
where $\hat\phi$ denotes the Fourier transform of $\phi$. It is well known that $H$ is a positive operator with a pure point spectrum $\{E_n\} \subset \mathbb{R}^+$. Therefore $-H$ generates an analytic contraction semigroup, $P(z) : L^2(\mathbb{R}^d) \to L^2(\mathbb{R}^d)$, $P(z) = e^{-Hz}$, with $z$ being a complex parameter with positive real part. In the case where $z$ is purely imaginary of the form $z = \frac{i}{\hbar}t$, $t \in \mathbb{R}$, one obtains a one parameter group of unitary operators $U(t) := e^{-\frac{i}{\hbar}Ht}$, i.e. the Schrödinger group. Given a vector $\psi_0 \in D(H)$, the vector $\psi(t) := U(t)\psi_0$ belongs to $D(H)$ and it satisfies the Schrödinger equation:
$$i\hbar\,\frac{\partial}{\partial t}\psi(t) = H\psi(t). \tag{3}$$
The particular form of the potential allows one to prove a scaling property for the eigenvalues $\{E_n\}$ of the operator $H$ as well as their analyticity as a function of the coupling constant $\lambda$ on a suitable region of a Riemann surface. The present lemma is taken from [25], which presents a detailed study of this problem, also in more general cases.

Lemma 1. Let $E_n(\lambda)$ denote the $n$th eigenvalue of the Hamiltonian (2). Then for $\lambda, \alpha > 0$ one has
$$E_n(\lambda) = \alpha^{-1}E_n\bigl(\lambda\alpha^{M+1}\bigr). \tag{4}$$
In particular
$$E_n(\lambda) = \lambda^{\frac{1}{M+1}}E_n(1). \tag{5}$$
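Proof. Let us consider, for any $\alpha \in \mathbb{R}^+$, the unitary operator $V(\alpha) : L^2(\mathbb{R}^d) \to L^2(\mathbb{R}^d)$ given on vectors $\phi \in S(\mathbb{R}^d)$ by
$$V(\alpha)\phi(x) = \alpha^{1/4}\phi\bigl(\alpha^{1/2}x\bigr), \quad x \in \mathbb{R}^d. \tag{6}$$
It is simple to verify that $V(\alpha)$ leaves $D(-\Delta)$ and $D(|x|^{2M})$ invariant and
$$V(\alpha)x^{2M}V(\alpha)^{-1} = \alpha^M x^{2M}, \qquad V(\alpha)\Delta V(\alpha)^{-1} = \alpha^{-1}\Delta.$$
It follows that
$$V(\alpha)HV(\alpha)^{-1} = \alpha^{-1}\left(-\frac{\hbar^2}{2}\Delta + \alpha^{M+1}\lambda x^{2M}\right).$$
In particular, by taking $\alpha = \lambda^{-1/(M+1)}$ one has
$$V\bigl(\lambda^{-1/(M+1)}\bigr)HV\bigl(\lambda^{-1/(M+1)}\bigr)^{-1} = \lambda^{1/(M+1)}\left(-\frac{\hbar^2}{2}\Delta + x^{2M}\right). \tag{7}$$
Since for any $\alpha \in \mathbb{R}^+$ the operator $V(\alpha)HV(\alpha)^{-1}$ has the same spectrum as $H$, from Eq. (7) one easily deduces Eq. (5). □

Remark 1. By analytic continuation, relation (5) allows one to extend $E_n(\lambda)$ to all complex values of $\lambda$ belonging to a Riemann surface. In particular, the function $E_n$ is many-sheeted and has an $(M+1)$st order branch point at $\lambda = 0$.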
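The scaling relation (5) of Lemma 1 can be checked numerically in a simple case. The sketch below is only an illustration under stated assumptions: $d = 1$, $\hbar = 1$, $M = 2$ (quartic potential), a finite-difference discretization of $H$ on a truncated interval, and a Sturm-sequence bisection for the lowest eigenvalue. None of these numerical choices, nor the function names, come from the paper. For $\lambda = 8$ and $M = 2$, relation (5) predicts $E_0(8) = 8^{1/3}E_0(1) = 2E_0(1)$, up to discretization error.

```python
def count_eigs_below(diag, off, x):
    """Number of eigenvalues below x of a symmetric tridiagonal matrix
    with diagonal `diag` and constant off-diagonal entry `off` (Sturm sequence)."""
    count = 0
    q = 1.0
    for i, d in enumerate(diag):
        q = d - x - (off * off / q if i > 0 else 0.0)
        if q == 0.0:
            q = 1e-30  # avoid division by zero in the next step
        if q < 0.0:
            count += 1
    return count

def ground_energy(lam, M=2, half_width=6.0, n=400):
    """Lowest eigenvalue of -(1/2) d^2/dx^2 + lam*|x|^(2M) (hbar = 1, d = 1),
    discretized by central differences on n interior points of [-half_width, half_width]."""
    dx = 2.0 * half_width / (n + 1)
    xs = [-half_width + (i + 1) * dx for i in range(n)]
    diag = [1.0 / dx**2 + lam * abs(x) ** (2 * M) for x in xs]
    off = -0.5 / dx**2
    lo, hi = 0.0, max(diag) + 2.0 * abs(off)  # the matrix is positive; hi is a Gershgorin bound
    for _ in range(80):
        mid = 0.5 * (lo + hi)
        if count_eigs_below(diag, off, mid) >= 1:
            hi = mid
        else:
            lo = mid
    return 0.5 * (lo + hi)

if __name__ == "__main__":
    e1 = ground_energy(1.0)
    e8 = ground_energy(8.0)
    print("E_0(1) =", e1, " E_0(8)/E_0(1) =", e8 / e1)
```

The ratio is not exactly 2 because the grid itself is not rescaled with $\lambda$; the discrepancy shrinks as the grid is refined.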
Let us consider now the matrix elements of the evolution operator $U(t) = e^{-\frac{i}{\hbar}Ht}$, i.e. the inner products $\langle\phi, U(t)\psi\rangle$, with $\phi, \psi \in L^2(\mathbb{R}^d)$. Let us consider also the function $f : \mathbb{R}^+ \to \mathbb{C}$ of the coupling constant $\lambda$ (present in $H$, hence in $U(t)$) defined by
$$f(\lambda) := \langle\phi, U(t)\psi\rangle, \quad \lambda \in \mathbb{R},\ \lambda > 0. \tag{8}$$
Let us denote by $D_{\theta_1,\theta_2}$ the sector of the Riemann surface $\tilde{\mathbb{C}}$ of the logarithm defined by
$$D_{\theta_1,\theta_2} := \bigl\{z \in \tilde{\mathbb{C}},\ z = \rho e^{i\phi} : \rho > 0,\ \phi \in (\theta_1,\theta_2)\bigr\}.$$
Let us consider the dense subset of $L^2(\mathbb{R}^d)$ made of finite linear combinations of vectors of the form
$$\phi(x) = P(x)e^{-\sigma^2|x|^2}, \quad x \in \mathbb{R}^d \tag{9}$$
with $P$ being any polynomial with complex coefficients and $\sigma^2 \in \mathbb{C}$ a complex constant with positive real part (that these vectors are dense in $L^2(\mathbb{R}^d)$ follows from the known fact that the finite linear combinations of Hermite functions are dense in the same space).
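Theorem 1. Let $\phi, \psi \in S(\mathbb{R}^d) \subset L^2(\mathbb{R}^d)$, such that the functions $\bar\phi, \psi \in S(\mathbb{R}^d)$ are of the form (9), with $\sigma^2 \in \mathbb{C}$, $\sigma^2 = |\sigma^2|e^{i\delta}$, $\delta \in \mathbb{R}$ such that there exists an $\epsilon > 0$ with
$$\cos(\delta + \alpha) > \epsilon, \quad \forall\alpha \in \left[0, \frac{M-1}{M+1}\pi\right]. \tag{10}$$
(This is the case for instance if $\delta = -\pi\frac{M-1}{2(M+1)}$.) Then the function $f : \mathbb{R}^+ \to \mathbb{C}$ defined by (8) can be extended to an analytic function on the sector $D_{-(M-1)\pi,0}$ of the Riemann surface $\tilde{\mathbb{C}}$ of the logarithm.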
Proof. Let us denote by $\{E_n(\lambda)\}$, resp. $\{e_n(\lambda)\}$, the eigenvalues and resp. the eigenvectors of the Hamiltonian operator (2). Under the given assumptions on $\phi, \psi$ and by Eq. (5), for $\lambda \in \mathbb{R}^+$ the function $f$ is given by
$$f(\lambda) = \sum_n a_n(\lambda)b_n(\lambda)e^{-\frac{i}{\hbar}E_n(\lambda)t} = \sum_n a_n(\lambda)b_n(\lambda)e^{-\frac{i}{\hbar}\lambda^{\frac{1}{M+1}}E_n(1)t}$$
with
$$a_n(\lambda) = \langle\phi, e_n(\lambda)\rangle, \qquad b_n(\lambda) = \langle e_n(\lambda), \psi\rangle.$$
On the other hand each coefficient $a_n, b_n$ can be extended to an analytic function of the variable $\lambda$ on $D_{-(M-1)\pi,0}$. Indeed one has (for $\lambda > 0$) that $e_n(\lambda) = V(\lambda^{-1/(M+1)})^{-1}e_n(1)$, where $V(\lambda^{-1/(M+1)})$ is the operator defined by (6). Without loss of generality, we can consider as an instance a vector $\psi$ of the form $\psi(x) = x^k e^{-\sigma^2x^2}$. In this case one has:
$$b_n(\lambda) = \bigl\langle V\bigl(\lambda^{-1/(M+1)}\bigr)^{-1}e_n(1), \psi\bigr\rangle = \bigl\langle e_n(1), V\bigl(\lambda^{-1/(M+1)}\bigr)\psi\bigr\rangle \tag{11}$$
$$= \lambda^{-\frac{1}{4(M+1)}}\int e_n(1)(x)\,\lambda^{-\frac{k}{2(M+1)}}|x|^k e^{-\sigma^2\lambda^{-\frac{1}{M+1}}|x|^2}\,dx \tag{12}$$
and the coefficient $b_n(\lambda)$ can be interpreted in terms of the inner product between the vector $e_n(1)$ and the function
$$x \mapsto \psi_\lambda(x) := \lambda^{-\frac{1}{4(M+1)}}\lambda^{-\frac{k}{2(M+1)}}|x|^k e^{-\sigma^2\lambda^{-\frac{1}{M+1}}|x|^2}. \tag{13}$$
For $\lambda \in D_{-(M-1)\pi,0}$ and for $\sigma^2$ satisfying the assumptions of the theorem, it is simple to verify that the function (13) belongs to $L^2(\mathbb{R}^d)$ and its $L^2$-norm is uniformly bounded in $D_{-(M-1)\pi,0}$:
$$\|\psi_\lambda\| < \frac{C_k}{\epsilon^{k+1/2}},$$
where the constant $C_k$ depends on $k$, while $\epsilon$ is the parameter appearing in condition (10). An analogous reasoning holds also for the coefficients $a_n(\lambda)$. On the other hand, for $\lambda \in D_{-(M-1)\pi,0}$, one has
$$\Bigl|e^{-\frac{i}{\hbar}\lambda^{\frac{1}{M+1}}E_n(1)t}\Bigr| \le 1.$$
The function $f$ is then given by the following limit of a sequence of analytic functions on $D_{-(M-1)\pi,0}$, uniformly bounded on it:
$$f(\lambda) = \lim_{N\to\infty}\sum_{n}^{N} a_n(\lambda)b_n(\lambda)e^{-\frac{i}{\hbar}\lambda^{\frac{1}{M+1}}E_n(1)t},$$
and by Vitali's theorem, the limit defines an analytic function on $D_{-(M-1)\pi,0}$. □
In the case where the potential is not homogeneous, in particular if we add to it a second degree term, i.e. a term of the harmonic oscillator type, the proof of Theorem 1 does not work. In particular, in the case where $H = H_0 + \lambda V_{2M}$, with $H_0 = -\frac{\hbar^2}{2}\Delta + \frac{\Omega^2}{2}x^2$, $\Omega^2$ a $d \times d$ symmetric positive matrix and $V_{2M}(x) = |x|^{2M}$, an analogous result can only be obtained by further restricting the region of analyticity, as stated in the following:

Theorem 2. For $H = H_0 + \lambda V_{2M}$, with $H_0 = -\frac{\hbar^2}{2}\Delta + \frac{\Omega^2}{2}x^2$ and $V_{2M}(x) = |x|^{2M}$, the function $f : \mathbb{R}^+ \to \mathbb{C}$ given by $f(\lambda) := \langle\phi, e^{-\frac{i}{\hbar}Ht}\psi\rangle$ can be extended to an analytic function of the complex variable $\lambda$ in the sector $D_{-\pi,0}$.

Proof. By Trotter's product formula, for $\lambda \in \mathbb{R}^+$ one has for any $\phi, \psi \in L^2(\mathbb{R}^d)$:
$$f(\lambda) = \lim_{n\to\infty}\Bigl\langle\phi, \bigl(e^{-\frac{it}{n\hbar}H_0}e^{-\frac{it}{n\hbar}\lambda V_{2M}}\bigr)^n\psi\Bigr\rangle.$$
The positive multiplication operator $V_{2M}$ generates an analytic contraction semigroup, and for any $n \in \mathbb{N}$, the function $f_n : D_{-\pi,0} \to \mathbb{C}$ defined by
$$f_n(\lambda) := \Bigl\langle\phi, \bigl(e^{-\frac{it}{n\hbar}H_0}e^{-\frac{it}{n\hbar}\lambda V_{2M}}\bigr)^n\psi\Bigr\rangle$$
is analytic on $D_{-\pi,0}$ and satisfies the bound $|f_n(\lambda)| \le \|\phi\|\,\|\psi\|$. By Vitali's theorem the functions $f_n$ converge to an analytic function $f$ on $D_{-\pi,0}$. □

3. Borel summability of the Dyson expansion of the Schrödinger group in powers of the coupling constant

Let us consider now the asymptotic expansion of the function $f$ when $\lambda \to 0$. The present section is devoted to the proof of its Borel summability. We recall that an asymptotic expansion $\sum a_n z^n$ of a function $f(z)$ as $z \to 0$ in an appropriate region of the complex plane is said to be Borel summable [14,27] if the following procedure is possible:
(1) $B(t) = \sum a_n t^n/n!$ converges in some circle $|t| < r$;
(2) $B(t)$ has an analytic continuation to a neighborhood of the positive real axis;
(3) $f$ can be computed in terms of the Borel–Laplace transform of $B(t)$, i.e. $f(z) = \frac{1}{z}\int_0^\infty e^{-t/z}B(t)\,dt$.
In this case the asymptotic expansion $\sum a_n z^n$ allows one to construct the function $f$ without any ambiguity and to characterize it uniquely. One of the main tools for the proof of Borel summability is Watson's theorem (and its improved version, i.e. Nevanlinna's theorem). We give here for later use a particular form of it [27,24]:

Theorem 3. Let $f(z)$ be an analytic function in a sectorial region of the Riemann surface $\tilde{\mathbb{C}}$ of the logarithm
$$\Bigl\{z \in \tilde{\mathbb{C}} \ \Big|\ 0 < |z| < R,\ |\arg(z)| < \tfrac{1}{2}k\pi\Bigr\}$$
and satisfying there an estimate of the form
$$\Bigl|f(z) - \sum_{n=0}^{N-1}a_n z^n\Bigr| \le AC^N|z|^N(kN)!$$
uniformly in $N$ and $z$ in the sector. Then the asymptotic series $\sum a_n z^n$ is Borel summable to the function $f$.
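To make the three-step definition of Borel summability concrete, here is a minimal numerical sketch on a standard toy series, $a_n = (-1)^n n!$, which is an illustrative assumption and unrelated to the Hamiltonian (2). Its Borel transform is $B(t) = \sum (-1)^n t^n = 1/(1+t)$, which converges for $|t| < 1$ and continues analytically to the positive real axis, so step (3) gives $f(z) = \frac{1}{z}\int_0^\infty e^{-t/z}\frac{dt}{1+t} = \int_0^\infty \frac{e^{-u}}{1+zu}\,du$ for $z > 0$. The code compares this $f$ with the partial sums of the divergent series and checks a remainder estimate of the form appearing in Theorem 3 with $A = C = k = 1$; the function names and quadrature parameters are hypothetical.

```python
import math

def borel_sum(z, upper=60.0, steps=60000):
    """Simpson approximation of int_0^inf exp(-u) / (1 + z*u) du for z > 0.

    This equals (1/z) * int_0^inf exp(-t/z) * B(t) dt with B(t) = 1/(1+t),
    i.e. the Borel-Laplace transform for the toy series a_n = (-1)^n n!.
    `steps` must be even; the tail beyond `upper` is of order exp(-upper).
    """
    h = upper / steps

    def g(u):
        return math.exp(-u) / (1.0 + z * u)

    total = g(0.0) + g(upper)
    for i in range(1, steps):
        total += (4.0 if i % 2 else 2.0) * g(i * h)
    return total * h / 3.0

def partial_sum(z, n_terms):
    """Sum of the first n_terms terms of the divergent series sum (-1)^n n! z^n."""
    return sum((-1) ** n * math.factorial(n) * z**n for n in range(n_terms))

def remainder_bound_holds(z, n_terms):
    """Check |f(z) - sum_{n < N} a_n z^n| <= N! |z|^N for N = n_terms."""
    f = borel_sum(z)
    return abs(f - partial_sum(z, n_terms)) <= math.factorial(n_terms) * abs(z) ** n_terms
```

For small $z$ the partial sums first approach $f(z)$ and then drift away as $n!\,z^n$ starts to grow, while the Borel sum remains well defined.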
Let us consider the function $f$ of the complex variable $\lambda$ given by $f(\lambda) = \langle\phi, e^{-\frac{i}{\hbar}Ht}\psi\rangle$, where $H$ is given by (2), and describing the matrix elements of the Schrödinger group. Its asymptotic expansion as $\lambda \to 0$ can be obtained in terms of the Dyson expansion of the evolution operator $U_\lambda(t) = e^{-\frac{i}{\hbar}Ht}$. In the following we shall denote by $H_0$ the Hamiltonian operator $H$ in the case where $\lambda = 0$.

Lemma 2. If $\lambda \in \mathbb{R}^+$, $\phi \in S(\mathbb{R}^d)$ and $\psi \in L^2(\mathbb{R}^d)$, the function $f(\lambda) = \langle\phi, e^{-\frac{i}{\hbar}Ht}\psi\rangle$ describing the Schrödinger group has the following asymptotic expansion:
$$f(\lambda) = \sum_{n=0}^{N-1}a_n\lambda^n + R_N(\lambda) \tag{14}$$
with
$$a_n = \frac{1}{n!}\left(-\frac{i}{\hbar}\right)^n\int_0^t\!\cdots\!\int_0^t \bigl\langle V_{2M}(-s_1)\cdots V_{2M}(-s_n)U_0(-t)\phi, \psi\bigr\rangle\,ds_1\ldots ds_n$$
and
$$R_N(\lambda) = \frac{\lambda^N}{N!}\left(-\frac{i}{\hbar}\right)^N\int_0^t\!\cdots\!\int_0^t \bigl\langle V(-s_N)\cdots V(-s_1)U_0(-t)\phi,\ U_0(-s_N)U_\lambda(s_N)\psi\bigr\rangle\,ds_1\ldots ds_N, \tag{15}$$
where $V(s_i) := U_0(-s_i)V_{2M}U_0(s_i)$, $U_0(t) = e^{-\frac{it}{\hbar}H_0}$.
Proof. The asymptotic expansion can be obtained by means of Dyson's expansion. Let us set $U_\lambda(t) = e^{-\frac{it}{\hbar}H}$ and $U_0(t) = e^{-\frac{it}{\hbar}H_0}$. Let $\lambda \in \mathbb{R}^+$. Given a vector $\psi \in D(H_\lambda) = D(H_0) \cap D(V_{2M})$ one can easily prove that the vector $U_\lambda(t)\psi$ satisfies the following integral equation
$$U_\lambda(t)\psi = U_0(t)\psi - \frac{i\lambda}{\hbar}\int_0^t U_0(t-s)V_{2M}U_\lambda(s)\psi\,ds$$
so that for any $\phi \in L^2(\mathbb{R}^d)$, one has
$$f(\lambda) = \langle\phi, U_\lambda(t)\psi\rangle = \langle\phi, U_0(t)\psi\rangle - \frac{i\lambda}{\hbar}\int_0^t \langle\phi, U_0(t-s)V_{2M}U_\lambda(s)\psi\rangle\,ds.$$
By choosing $\phi \in S(\mathbb{R}^d)$, we have $\phi \in D[(U_0(s)V_{2M})^n]$ for any $s \in [0,t]$ and $n \in \mathbb{N}$, and one can easily prove that for any $N \in \mathbb{N}$ the following holds
$$\langle\phi, U_\lambda(t)\psi\rangle = \sum_{n=0}^{N-1}a_n\lambda^n + R_N(\lambda), \tag{16}$$
where
$$a_n = \frac{1}{n!}\left(-\frac{i}{\hbar}\right)^n\int_0^t\!\cdots\!\int_0^t \bigl\langle V_{2M}(-s_1)\cdots V_{2M}(-s_n)U_0(-t)\phi, \psi\bigr\rangle\,ds_1\ldots ds_n$$
and
$$R_N(\lambda) = \frac{\lambda^N}{N!}\left(-\frac{i}{\hbar}\right)^N\int_0^t\!\cdots\!\int_0^t \bigl\langle V(-s_N)\cdots V(-s_1)U_0(-t)\phi,\ U_0(-s_N)U_\lambda(s_N)\psi\bigr\rangle\,ds_1\ldots ds_N \tag{17}$$
with $V(s_i) := U_0(-s_i)V_{2M}U_0(s_i)$. Both sides of (16) are continuous functionals of the vector $\psi \in D(H)$ and can be extended to $\psi \in L^2(\mathbb{R}^d)$. □

Lemma 3. Let $\phi, \psi$ satisfy the assumptions of Theorem 1. Then the asymptotic expansion (14) holds in the whole analyticity region $D_{-(M-1)\pi,0}$.

Proof. Under the stated assumptions, both sides of (14) are analytic in $D_{-(M-1)\pi,0}$ and coincide on $\mathbb{R}^+$. By the uniqueness of analytic continuation they coincide in the whole sector $D_{-(M-1)\pi,0}$. □
The following theorem gives the Borel summability property of the asymptotic expansion (14) for $\psi \in L^2(\mathbb{R}^d)$ and $\phi$ belonging to a dense set of vectors in $L^2(\mathbb{R}^d)$.

Theorem 4. Let $\phi, \psi$ satisfy the assumptions of Theorem 1. Then the asymptotic expansion (14) of the function $f$ describing the Schrödinger group is Borel summable.

Proof. By Theorem 1 the function $f$ is analytic on a sector of amplitude $\pi(M-1)$ and admits there an asymptotic expansion of the form (14). By exploiting the particular form of the vector $\phi$, by a direct computation it is possible to verify that the remainder $R_N$ satisfies uniformly in $D_{-(M-1)\pi,0}$ the bound
$$|R_N(\lambda)| \le AC^N|\lambda|^N\,\Gamma\bigl(N(M-1)\bigr),$$
where $A, C$ are constants depending on $\phi, \psi$. By Nevanlinna's theorem [22,27], the asymptotic expansion (14) is Borel summable. □
Remark 2. In the case $V(x) = |x|^4$ the Borel summability of the asymptotic expansion (14) can be proved also in the case where $H_0$ is the harmonic oscillator Hamiltonian, by using the result of Theorem 2.

4. On the functional integral representation of the weak solution of the Schrödinger equation

Let us consider the heuristic Feynman path integral representation for the matrix elements of the Schrödinger group generated by the Hamiltonian (2):
$$\bigl\langle\phi, e^{-\frac{i}{\hbar}Ht}\psi\bigr\rangle = \int_{\mathbb{R}^d}\bar\phi(x)\int_{\gamma(t)=x} e^{\frac{i}{\hbar}S_t(\gamma)}\psi(0,\gamma(0))\,d\gamma\,dx = \int_{\mathbb{R}^d}\bar\phi(x)\int_{\gamma(t)=0} e^{\frac{i}{2\hbar}\int_0^t|\dot\gamma(s)|^2ds}\,e^{-\frac{i\lambda}{\hbar}\int_0^t|\gamma(s)+x|^{2M}ds}\,\psi(0,\gamma(0)+x)\,d\gamma\,dx, \tag{18}$$
where $\phi, \psi \in L^2(\mathbb{R}^d)$. The aim of the present section is to provide a rigorous mathematical definition of the right-hand side of Eq. (18) in terms of a well-defined functional integral, by means of an analytically continued Wiener integral. Let us consider first of all the heat equation
$$\hbar\,\frac{\partial}{\partial t}\psi = -zH\psi, \tag{19}$$
where $z$ is a positive real parameter. It is well known [23] that the self-adjoint operator $H$ defined by closure from its restriction to the Schwartz space of test functions $S(\mathbb{R}^d)$ by (2) is the generator of an analytic semigroup. In particular given two vectors $\phi, \psi \in L^2(\mathbb{R}^d)$, the inner product $\langle\phi, e^{-\frac{z}{\hbar}Ht}\psi\rangle$ is given by the Feynman–Kac formula [26]:
$$\bigl\langle\phi, e^{-\frac{z}{\hbar}Ht}\psi\bigr\rangle = \int_{\mathbb{R}^d}\bar\phi(x)\int_{C_t} e^{-\frac{z\lambda}{\hbar}\int_0^t|\sqrt{\hbar z}\,\omega(s)+x|^{2M}ds}\,\psi_0\bigl(\sqrt{\hbar z}\,\omega(t)+x\bigr)\,W(d\omega)\,dx$$
$$= z^{d/2}\int_{\mathbb{R}^d}\bar\phi(\sqrt{z}x)\int_{C_t} e^{-\frac{z^{M+1}\lambda}{\hbar}\int_0^t|\sqrt{\hbar}\,\omega(s)+x|^{2M}ds}\,\psi_0\bigl(\sqrt{\hbar z}\,\omega(t)+\sqrt{z}x\bigr)\,W(d\omega)\,dx. \tag{20}$$
By the analyticity property of the semigroup generated by $H$, one can easily deduce the following result:

Theorem 5. The left-hand side of (20), namely the matrix element $\langle\phi, e^{-\frac{z}{\hbar}Ht}\psi\rangle$, extends, for any $\phi, \psi \in L^2(\mathbb{R}^d)$, to an analytic function of the complex variable $z$, holomorphic in $D_{-\pi/2,\pi/2}$ and continuous on the boundary $\bar D_{-\pi/2,\pi/2}$. In particular for $z = i$ one obtains the matrix elements of the Schrödinger group $\langle\phi, e^{-\frac{i}{\hbar}Ht}\psi\rangle$.
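By imposing suitable analyticity conditions on the vectors $\phi, \psi$ in (20) one obtains the following:

Theorem 6. Let $\bar\phi, \psi \in L^2(\mathbb{R}^d)$ satisfy the following conditions:
(1) For any $x \in \mathbb{R}^d$ the functions
$$z \mapsto \bar\phi(\sqrt{z}x), \quad z \mapsto \psi(\sqrt{z}x), \quad z \in \bar D_{-\frac{\pi}{2(M+1)},\frac{\pi}{2(M+1)}}$$
are continuous on $\bar D_{-\frac{\pi}{2(M+1)},\frac{\pi}{2(M+1)}}$ and holomorphic on $D_{-\frac{\pi}{2(M+1)},\frac{\pi}{2(M+1)}}$;
(2) For any $z \in \bar D_{-\frac{\pi}{2(M+1)},\frac{\pi}{2(M+1)}}$, the functions
$$x \mapsto \bar\phi(\sqrt{z}x), \quad x \mapsto \psi(\sqrt{z}x), \quad x \in \mathbb{R}^d$$
belong to $L^2(\mathbb{R}^d)$.
Then the right-hand side of (20), namely the integral
$$I_t^z(\phi,\psi) := z^{d/2}\int_{\mathbb{R}^d}\bar\phi(\sqrt{z}x)\int_{C_t} e^{-\frac{z^{M+1}\lambda}{\hbar}\int_0^t|\sqrt{\hbar}\,\omega(s)+x|^{2M}ds}\,\psi_0\bigl(\sqrt{\hbar z}\,\omega(t)+\sqrt{z}x\bigr)\,W(d\omega)\,dx \tag{21}$$
extends, for any $\phi, \psi \in L^2(\mathbb{R}^d)$, to an analytic function of the complex variable $z$, holomorphic in $D_{-\frac{\pi}{2(M+1)},\frac{\pi}{2(M+1)}}$ and continuous on the boundary $\bar D_{-\frac{\pi}{2(M+1)},\frac{\pi}{2(M+1)}}$.

Proof. By the stated assumptions, for any $z \in \bar D_{-\frac{\pi}{2(M+1)},\frac{\pi}{2(M+1)}}$, the integral (21) is well defined, and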
$$\bigl|I_t^z(\phi,\psi)\bigr| \le |z|^{d/2}\int_{\mathbb{R}^d}\bigl|\bar\phi(\sqrt{z}x)\bigr|\int_{C_t}\bigl|\psi\bigl(\sqrt{\hbar z}\,\omega(t)+\sqrt{z}x\bigr)\bigr|\,W(d\omega)\,dx = |z|^{d/2}\bigl\langle\phi_z, e^{-\frac{1}{\hbar}H_0t}\psi_z\bigr\rangle \le |z|^{d/2}\|\phi_z\|\,\|\psi_z\|, \tag{22}$$
where $\phi_z, \psi_z \in L^2(\mathbb{R}^d)$ are defined resp. by $\phi_z(x) := |\bar\phi(\sqrt{z}x)|$, $\psi_z(x) := |\psi(\sqrt{z}x)|$. The analyticity of the function $z \mapsto I_t^z(\phi,\psi)$ on $D_{-\frac{\pi}{2(M+1)},\frac{\pi}{2(M+1)}}$ follows by Fubini's and by Morera's theorems. The continuity on $\bar D_{-\frac{\pi}{2(M+1)},\frac{\pi}{2(M+1)}}$ follows by the dominated convergence theorem. □
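From the previous theorem one can easily deduce the following

Corollary 1. Let $\bar\phi, \psi \in L^2(\mathbb{R}^d)$ satisfy the following conditions:
(1) For any $x \in \mathbb{R}^d$ the functions
$$z \mapsto \bar\phi(\sqrt{z}\,x), \qquad z \mapsto \psi(\sqrt{z}\,x), \qquad z \in \bar D_{0,\frac{\pi}{2}},$$
are continuous on $\bar D_{0,\frac{\pi}{2}}$ and holomorphic on $D_{0,\frac{\pi}{2}}$;
(2) For any $z \in \bar D_{0,\frac{\pi}{2}}$, the functions
$$x \mapsto \psi(\sqrt{z}\,x), \qquad x \mapsto \bar\phi(\sqrt{z}\,x), \qquad x \in \mathbb{R}^d,$$
belong to $L^2(\mathbb{R}^d)$.
Then
$$\langle\phi, e^{-\frac{i}{\hbar}H_0t}\psi\rangle = e^{i\frac{\pi}{4}d}\int_{\mathbb{R}^d}\bar\phi\big(e^{i\frac{\pi}{4}}x\big)\int_{C_t}\psi\big(\sqrt{\hbar}\,e^{i\frac{\pi}{4}}\omega(t)+e^{i\frac{\pi}{4}}x\big)\,W(d\omega)\,dx, \tag{23}$$
where $H_0$ is the free Hamiltonian given on $\psi \in C_0^2(\mathbb{R}^d)$ by
$$H_0\psi(x) = -\frac{\hbar^2}{2}\Delta\psi(x).$$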
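Proof. Let us consider the function $f_1 : \bar D_{-\pi/2,\pi/2} \to \mathbb{C}$ given by $f_1(z) = \langle\phi, e^{-\frac{z}{\hbar}H_0t}\psi\rangle$. By Theorem 5, $f_1$ is analytic on $D_{-\pi/2,\pi/2}$ and continuous on $\bar D_{-\pi/2,\pi/2}$. Let $f_2 : \bar D_{0,\frac{\pi}{2}} \to \mathbb{C}$ be defined by
$$f_2(z) = z^{d/2}\int_{\mathbb{R}^d}\bar\phi(\sqrt{z}\,x)\int_{C_t}\psi\big(\sqrt{\hbar}\sqrt{z}\,\omega(t)+\sqrt{z}\,x\big)\,W(d\omega)\,dx.$$
By Theorem 6, $f_2$ is analytic on $D_{0,\pi/2}$ and continuous on the closure $\bar D_{0,\pi/2}$ of $D_{0,\pi/2}$.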
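By the Feynman–Kac formula (20), the functions $f_1$ and $f_2$ coincide on $\mathbb{R}^+$. By the uniqueness of analytic continuation they coincide on the whole domain. In particular, by the continuity on the boundary, one has $f_1(i) = f_2(i)$, i.e.
$$\langle\phi, e^{-\frac{i}{\hbar}H_0t}\psi\rangle = e^{i\frac{\pi}{4}d}\int_{\mathbb{R}^d}\bar\phi\big(e^{i\frac{\pi}{4}}x\big)\int_{C_t}\psi\big(\sqrt{\hbar}\,e^{i\frac{\pi}{4}}\omega(t)+e^{i\frac{\pi}{4}}x\big)\,W(d\omega)\,dx.$$
□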
By restricting the time interval $[0, t]$, it is possible to generalize the previous result to the case where $H_0$ is the harmonic oscillator Hamiltonian, given on $\psi \in C_0^2(\mathbb{R}^d)$ by
$$H_0\psi(x) = -\frac{\hbar^2}{2}\Delta\psi(x) + \frac{1}{2}\,x\Omega^2x\,\psi(x), \tag{24}$$
where $\Omega$ is a $d \times d$ symmetric positive matrix and $\Omega_j$, $j = 1, \dots, d$, are its eigenvalues.

Theorem 7. Let $\bar\phi, \psi \in L^2(\mathbb{R}^d)$ satisfy condition (1) of Corollary 1. Let us assume that there exists a positive constant $C \in \mathbb{R}^+$ such that $\forall z \in \bar D_{0,\frac{\pi}{2}}$ and $\forall (x, y) \in \mathbb{R}^d \times \mathbb{R}^d$ the following inequality holds:
$$\big|\bar\phi(\sqrt{z}\sqrt{\hbar}\,x)\,\psi\big(\sqrt{z}\sqrt{\hbar}\,(x+y)\big)\big|\,e^{\frac{|x|^2}{2}} \le C.$$
Let us assume moreover that for any $j = 1, \dots, d$ the time $t$ satisfies the following inequalities:
$$\Omega_j t < \frac{\pi}{2}, \qquad 1 - \Omega_j\tan(\Omega_j t) > 0. \tag{25}$$
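Then
$$\langle\phi, e^{-\frac{i}{\hbar}H_0t}\psi\rangle = e^{i\frac{\pi}{4}d}\,\hbar^{d/2}\int_{\mathbb{R}^d}\bar\phi\big(e^{i\frac{\pi}{4}}\sqrt{\hbar}\,x\big)\int_{C_t}\psi\big(\sqrt{\hbar}\,e^{i\frac{\pi}{4}}\omega(t)+e^{i\frac{\pi}{4}}\sqrt{\hbar}\,x\big)\,e^{\frac{1}{2}\int_0^t(\omega(s)+x)\Omega^2(\omega(s)+x)\,ds}\,W(d\omega)\,dx, \tag{26}$$
where $H_0$ is the harmonic oscillator Hamiltonian (24).

Proof. By the Feynman–Kac formula and a change of variable, for any $z \in \mathbb{R}^+$ one has
$$\langle\phi, e^{-\frac{z}{\hbar}H_0t}\psi\rangle = z^{d/2}\hbar^{d/2}\int_{\mathbb{R}^d}\int_{C_t}\bar\phi(\sqrt{z}\sqrt{\hbar}\,x)\,\psi\big(\sqrt{\hbar}\sqrt{z}\,(\omega(t)+x)\big)\,e^{\frac{|x|^2}{2}}\,e^{-\frac{z^2}{2}\int_0^t(\omega(s)+x)\Omega^2(\omega(s)+x)\,ds}\,W(d\omega)\,e^{-\frac{|x|^2}{2}}\,dx. \tag{27}$$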
By the analyticity of the semigroup generated by the harmonic oscillator Hamiltonian, the left-hand side is a holomorphic function of $z \in D_{-\pi/2,\pi/2}$ and continuous for $z \in \bar D_{-\pi/2,\pi/2}$. The right-hand side satisfies, by the stated assumptions on $\phi, \psi$, the following bound for any $z \in \bar D_{0,\frac{\pi}{2}}$:
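$$\left| z^{d/2}\hbar^{d/2}\int_{\mathbb{R}^d}\int_{C_t}\bar\phi(\sqrt{z}\sqrt{\hbar}\,x)\,\psi\big(\sqrt{\hbar}\sqrt{z}\,(\omega(t)+x)\big)\,e^{\frac{|x|^2}{2}}\,e^{-\frac{z^2}{2}\int_0^t(\omega(s)+x)\Omega^2(\omega(s)+x)\,ds}\,W(d\omega)\,e^{-\frac{|x|^2}{2}}\,dx\right|$$
$$\le |z|^{d/2}|\hbar|^{d/2}\,C\int_{\mathbb{R}^d\times C_t} e^{\frac{1}{2}\int_0^t(\omega(s)+x)\Omega^2(\omega(s)+x)\,ds}\,W(d\omega)\,e^{-\frac{|x|^2}{2}}\,dx.$$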
By the assumption (25) the latter integral is convergent (see [5] and [20] for more details on this estimate). By Fubini's and Morera's theorems, the right-hand side of (27) is a holomorphic function of $z \in D_{0,\pi/2}$ and continuous for $z \in \bar D_{0,\pi/2}$, which coincides with the function $z \mapsto \langle\phi, e^{-\frac{z}{\hbar}H_0t}\psi\rangle$ on $\mathbb{R}^+$. By the uniqueness of analytic continuation one gets for $z = i$ Eq. (26). □

We are now going to see to what extent the results of Corollary 1 and Theorem 7 can be generalized to the case where the free Hamiltonian resp. the harmonic oscillator Hamiltonian are replaced by the anharmonic oscillator Hamiltonian with polynomial potential (2). Formally, by substituting $z = i$ also on the right-hand side of (20) one obtains the following expression:
$$e^{i\frac{\pi}{4}d}\int_{\mathbb{R}^d}\bar\phi\big(e^{i\frac{\pi}{4}}x\big)\int_{C_t} e^{-\frac{e^{i(M+1)\frac{\pi}{2}}\lambda}{\hbar}\int_0^t|\sqrt{\hbar}\,\omega(s)+x|^{2M}\,ds}\,\psi\big(\sqrt{\hbar}\,e^{i\frac{\pi}{4}}\omega(t)+e^{i\frac{\pi}{4}}x\big)\,W(d\omega)\,dx =: I_t^i(\phi,\psi). \tag{28}$$
Theorem 8. Let the vectors $\phi, \psi \in L^2(\mathbb{R}^d)$ satisfy the assumptions of Corollary 1 and let the degree $2M$ of the polynomial potential $V_{2M}$ be such that $\mathrm{Re}\big(e^{i(M+1)\frac{\pi}{2}}\big) \ge 0$. Then the integral $I_t^i(\phi,\psi)$ in Eq. (28) is well defined and satisfies the following inequality
$$\big|I_t^i(\phi,\psi)\big| \le \|\phi_i\|\,\|\psi_i\|,$$
where $\phi_i(x) := |\bar\phi(e^{i\frac{\pi}{4}}x)|$, $\psi_i(x) := |\psi(e^{i\frac{\pi}{4}}x)|$. Moreover, if $\phi, \psi \in D(H)$ and $H\phi$, $H\psi$ satisfy the assumptions of Corollary 1, the functional $I_t^i(\phi,\psi)$ satisfies the Schrödinger equation (3) in the following weak sense:
$$I_0^i(\phi,\psi) = \langle\phi,\psi\rangle, \tag{29}$$
$$i\hbar\frac{d}{dt}I_t^i(\phi,\psi) = I_t^i(\phi,H\psi) = I_t^i(H\phi,\psi). \tag{30}$$

Proof. The first part of the theorem follows by a direct estimate. Eq. (29) is a consequence of Corollary 1, while Eq. (30) follows from Ito's formula. □

A stronger result, namely the equality
$$I_t^i(\phi,\psi) = \langle\phi, e^{-\frac{i}{\hbar}Ht}\psi\rangle \tag{31}$$
Fig. 1. The sector D−π/2,π/2 .
cannot in general be proved in the case where $H$ is given by (2). In fact, the analyticity argument used in the proof of Corollary 1 cannot in general be applied to the proof of equality (31), as the following considerations show.

By Theorem 5 the function $f_1 : \bar D_{-\pi/2,\pi/2} \to \mathbb{C}$, defined by $f_1(z) = \langle\phi, e^{-\frac{z}{\hbar}Ht}\psi\rangle$, is analytic on the sector $D_{-\pi/2,\pi/2}$ (shown in Fig. 1) and continuous on $\bar D_{-\pi/2,\pi/2}$ (as $H$ has a positive spectrum and generates an analytic semigroup). On the other hand, as we have already seen, on the positive real line $\mathbb{R}^+$ the function $f_1$ can be expressed in terms of a functional integral by means of the Feynman–Kac formula:
$$\langle\phi, e^{-\frac{z}{\hbar}Ht}\psi\rangle = z^{d/2}\int_{\mathbb{R}^d}\bar\phi(\sqrt{z}\,x)\int_{C_t} e^{-\frac{z^{M+1}\lambda}{\hbar}\int_0^t|\sqrt{\hbar}\,\omega(s)+x|^{2M}\,ds}\,\psi\big(\sqrt{\hbar}\sqrt{z}\,\omega(t)+\sqrt{z}\,x\big)\,W(d\omega)\,dx \tag{32}$$
and the right-hand side of (32) is well defined, if $\phi, \psi$ are sufficiently regular, when $\mathrm{Re}(z^{M+1}) \ge 0$, i.e. for $z = |z|e^{i\theta}$, with
$$-\frac{\pi}{2(M+1)} + k\frac{2\pi}{M+1} \le \theta \le \frac{\pi}{2(M+1)} + k\frac{2\pi}{M+1}, \qquad k \in \mathbb{Z},$$
i.e. on $M + 1$ different sectors of the complex plane. In particular, the right-hand side of (32) defines $M + 1$ different holomorphic functions $g_k$ with $k = 0, \dots, M$, each of them defined on a different sector of the complex plane, i.e.
$$D_k := D_{-\frac{\pi}{2(M+1)} + k\frac{2\pi}{M+1},\ \frac{\pi}{2(M+1)} + k\frac{2\pi}{M+1}}, \qquad k = 0, \dots, M.$$
As the $M + 1$ open sectors are disjoint (actually the intersection of their closures contains a unique point), we cannot consider $g_0, \dots, g_M$ as the same analytic function defined on different regions of the complex plane. In particular the analyticity properties of the left- and the right-hand sides of (32) allow one to extend the Feynman–Kac formula, if the conditions of Theorem 6 are satisfied, to all the values of $z \in \bar D_{-\frac{\pi}{2(M+1)},\frac{\pi}{2(M+1)}}$. This sector does not include $z = i$ (unless one considers the trivial case $M = 0$). For instance, if $2M = 4$ (the quartic oscillator case), the integral on the right-hand side of (32) becomes
$$I(z) := z^{d/2}\int_{\mathbb{R}^d}\bar\phi(\sqrt{z}\,x)\int_{C_t} e^{-\frac{z^{3}\lambda}{\hbar}\int_0^t|\sqrt{\hbar}\,\omega(s)+x|^{4}\,ds}\,\psi\big(\sqrt{\hbar}\sqrt{z}\,\omega(t)+\sqrt{z}\,x\big)\,W(d\omega)\,dx. \tag{33}$$
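The sector structure described above can be checked numerically. The following is a minimal sketch, not part of the original argument: the function names are hypothetical, and the code only encodes the elementary condition $\mathrm{Re}(z^{M+1}) \ge 0$, i.e. $\cos((M+1)\theta) \ge 0$ for $z = |z|e^{i\theta}$.

```python
import math

def admissible_sectors(M):
    """Closed angular sectors [theta_min, theta_max] on which Re(z**(M+1)) >= 0.

    Returns M + 1 intervals, k = 0, ..., M, matching the sectors D_k of the text.
    """
    half_width = math.pi / (2 * (M + 1))
    step = 2 * math.pi / (M + 1)
    return [(-half_width + k * step, half_width + k * step) for k in range(M + 1)]

def real_part_sign_ok(theta, M, tol=1e-12):
    """Direct check of Re(e^{i(M+1)theta}) = cos((M+1)theta) >= 0, up to rounding."""
    return math.cos((M + 1) * theta) >= -tol

def principal_sector_contains_i(M):
    """True when z = i (theta = pi/2) lies in the closed sector D_0 around the positive real axis."""
    theta_min, theta_max = admissible_sectors(M)[0]
    return theta_min <= math.pi / 2 <= theta_max

if __name__ == "__main__":
    # Quartic case 2M = 4: three sectors, printed in units of pi.
    for lo, hi in admissible_sectors(2):
        print(f"[{lo / math.pi:+.4f} pi, {hi / math.pi:+.4f} pi]")
    # Only the trivial case M = 0 has z = i in the principal sector.
    print([principal_sector_contains_i(M) for M in range(4)])
```

For $M = 2$ the three intervals are those of the quartic example, and the last line illustrates why the analytic continuation from $\mathbb{R}^+$ stops short of $z = i$ as soon as $M \ge 1$.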
Fig. 2. The set of definition of the integral I (z).
Under analyticity and growth conditions on $\phi, \psi$, the integral $I(z)$ is well defined on three sectors $\bar D_{-\frac{\pi}{6},\frac{\pi}{6}} \cup \bar D_{\frac{\pi}{2},\frac{5\pi}{6}} \cup \bar D_{\frac{7\pi}{6},\frac{3\pi}{2}}$ of the complex $z$ plane shown in Fig. 2. In particular the integral $I(z)$ in (33) defines three analytic functions $g_0, g_1, g_2$ defined respectively on the disjoint domains $D_{-\frac{\pi}{6},\frac{\pi}{6}}$, $D_{\frac{\pi}{2},\frac{5\pi}{6}}$ and $D_{\frac{7\pi}{6},\frac{3\pi}{2}}$. They are analytic on their domains of definition and continuous on the boundaries. However the information we have does not allow us to prove that $g_0, g_1, g_2$ are the same analytic function, since the intersection of the closures of their domains of definition is a single point, i.e.
$$\bar D_{-\frac{\pi}{6},\frac{\pi}{6}} \cap \bar D_{\frac{\pi}{2},\frac{5\pi}{6}} \cap \bar D_{\frac{7\pi}{6},\frac{3\pi}{2}} = \{0\}.$$
We can then only say that Eq. (32), i.e. the equality $f_1(z) = I(z)$, holds for $z$ belonging to $\bar D_{-\frac{\pi}{6},\frac{\pi}{6}}$, but nothing can be said as it stands for $z = i$. By these considerations, the Feynman path integral representation for the weak solution of the Schrödinger equation studied in [5] has to be interpreted in the weak sense of Theorem 8.

The difficulties in the investigation of the relations between the functional integral (28) and the matrix elements of the Schrödinger group $\langle\phi, e^{-\frac{i}{\hbar}Ht}\psi\rangle$ can be better understood by means of the following simplified model. Let us consider the two functions of the complex variable $\lambda$,
$$f_1(\lambda) := \int e^{i\lambda x^4 + ix^2}\,dx, \qquad f_2(\lambda) := e^{i\pi/4}\int e^{-i\lambda x^4 - x^2}\,dx,$$
defined and analytic respectively on $D_1 = \{\mathrm{Im}(\lambda) > 0\}$ and $D_2 = \{\mathrm{Im}(\lambda) < 0\}$. The function $f_1$, for $\lambda \in \mathbb{R}$, $\lambda < 0$, can be seen as the one-dimensional analogue of $\langle\phi, e^{-\frac{i}{\hbar}Ht}\psi\rangle$, while the function $f_2$ can be seen as the one-dimensional analogue of the functional integral (28). It is possible to verify, by means of a rotation of the integration contour in the complex plane, that for $\lambda \in \mathbb{R}^+$ one has $f_1(\lambda) = f_2(\lambda)$. For instance if $\lambda = 1$ one has $\int e^{ix^4 + ix^2}\,dx = e^{i\pi/4}\int e^{-ix^4 - x^2}\,dx$.
The extension of the equality between $f_1$ and $f_2$ to the negative real line is, on the other hand, not possible. Indeed it is possible to see that $f_1$ and $f_2$ are different branches of an analytic multivalued function defined on a Riemann sheet. A change of variable allows us to represent the two functions in a way that highlights their analyticity properties and the nature of the singularity at $\lambda = 0$. Indeed when $\lambda \in \mathbb{R}^+$, the following equality holds
$$f_2(\lambda) = e^{i\pi/4}\int e^{-i\lambda x^4 - x^2}\,dx = \frac{e^{i\pi/4}}{\lambda^{1/4}}\int e^{-ix^4 - \frac{x^2}{\lambda^{1/2}}}\,dx.$$
The right-hand side is an analytic function in the region $\{\lambda \in \mathbb{C},\ \lambda = |\lambda|e^{i\phi} : |\lambda| > 0,\ -\pi < \phi < \pi\}$. It is continuous on the boundary of its analyticity domain but it is multivalued and it assumes different values approaching the negative real axis from above and from below:
$$f_2\big(|\lambda|e^{i\pi}\big) = \int e^{i\lambda x^4 + ix^2}\,dx, \qquad f_2\big(|\lambda|e^{-i\pi}\big) = e^{i\pi/2}\int e^{i\lambda x^4 - ix^2}\,dx.$$
By a rotation technique it is easy to verify that the latter integral is equal to
$$e^{i\pi/2}\int e^{i\lambda x^4 - ix^2}\,dx = e^{i\pi/4}\int e^{-i\lambda x^4 - x^2}\,dx.$$
In other words we can say that the two integrals $\int e^{i\lambda x^4 + ix^2}\,dx$ and $e^{i\pi/4}\int e^{-i\lambda x^4 - x^2}\,dx$ do not coincide on the negative real line: they are different branches of the same analytic but multivalued function. In a similar way, the functional integral (28) and the matrix elements of the Schrödinger group can be interpreted, as functions of the complex variable $\lambda$, as different branches of the same analytic but multivalued function.

Remark 3. Despite the problems described so far, in the literature some particular cases have been handled by means of different techniques. In [19], equality (31) has been proved in the case where $H$ is the inverse quartic oscillator Hamiltonian
$$H\psi(x) = -\frac{\hbar^2}{2}\Delta\psi(x) + \frac{1}{2}\,x\Omega^2x\,\psi(x) - \lambda|x|^4\psi(x), \qquad \psi \in C_0^2(\mathbb{R}^d),\ \lambda \in \mathbb{R}^+,$$
by means of an analytic continuation technique in the mass parameter. In [12] the pointwise solution of the heat equation (19) and its functional integral representation have been considered:
$$\big(e^{-\frac{z}{\hbar}Ht}\psi_0\big)(x) = \int_{C_t} e^{-\frac{z\lambda}{\hbar}\int_0^t|\sqrt{\hbar z}\,\omega(s)+x|^{2M}\,ds}\,\psi_0\big(\sqrt{\hbar z}\,\omega(t)+x\big)\,W(d\omega). \tag{34}$$
The right-hand side of (34) evaluated for $z = i$ gives
$$\int_{C_t} e^{-\frac{i\lambda}{\hbar}\int_0^t|\sqrt{\hbar i}\,\omega(s)+x|^{2M}\,ds}\,\psi_0\big(\sqrt{\hbar i}\,\omega(t)+x\big)\,W(d\omega). \tag{35}$$
For suitable exponents $2M$, namely for $\mathrm{Re}\big(e^{i(M+1)\frac{\pi}{2}}\big) < 0$, the integral (35) is well defined and, as proved in [12] by means of a probabilistic argument, it represents the pointwise solution of the Schrödinger equation.

Let us consider now the integral (28) and let us assume that the hypotheses of Theorem 8 are satisfied. By considering two suitable sets of vectors $S_1, S_2 \subset L^2(\mathbb{R}^d)$, with $\phi \in S_2$ and $\psi \in S_1$, it is possible to interpret the integral $I_t^i(\phi,\psi)$ as the matrix element of an evolution operator in $L^2(\mathbb{R}^d)$. Let us denote by $S_1$ the subset of $S(\mathbb{R}^d)$ made of the functions $\phi : \mathbb{R}^d \to \mathbb{C}$ of the form
$$\phi(x) = P(x)\,e^{-\frac{x^2}{2}(1-i)}, \qquad x \in \mathbb{R}^d, \tag{36}$$
and by $S_2$ the subset of $S(\mathbb{R}^d)$ made of the functions $\phi : \mathbb{R}^d \to \mathbb{C}$ of the form
$$\phi(x) = Q(x)\,e^{-\frac{x^2}{2}(1+i)}, \tag{37}$$
where $P$ and $Q$ are polynomials with complex coefficients. As the Hermite functions form a complete orthonormal system in $L^2(\mathbb{R}^d)$, it is simple to verify that both $S_1$ and $S_2$ are dense in $L^2(\mathbb{R}^d)$. Moreover the functions $\phi \in S_1$ are such that:
(1) the function $z \mapsto \phi(zx)$, $x \in \mathbb{R}^d$, $z \in \bar D_{0,\pi/4}$, is analytic on $D_{0,\pi/4}$ and continuous on $\bar D_{0,\pi/4}$,
(2) the function $x \mapsto \phi(e^{i\frac{\pi}{4}}x)$, $x \in \mathbb{R}^d$, is in $L^2$,
while the functions $\phi \in S_2$ are such that:
(1) the function $z \mapsto \phi(zx)$, $x \in \mathbb{R}^d$, $z \in \bar D_{-\pi/4,0}$, is analytic on $D_{-\pi/4,0}$ and continuous on $\bar D_{-\pi/4,0}$,
(2) the function $x \mapsto \phi(e^{-i\frac{\pi}{4}}x)$, $x \in \mathbb{R}^d$, belongs to $L^2(\mathbb{R}^d)$.
Let us denote by $T : S_1 \to S_2$ the linear operator defined by
$$T\phi(x) = e^{i\frac{\pi}{8}d}\,\phi\big(e^{i\frac{\pi}{4}}x\big), \qquad \phi \in S_1,$$
and by $T^{-1} : S_2 \to S_1$ its inverse, defined by
$$T^{-1}\phi(x) = e^{-i\frac{\pi}{8}d}\,\phi\big(e^{-i\frac{\pi}{4}}x\big), \qquad \phi \in S_2.$$
By considering two vectors $\phi, \psi \in S_1$ one can easily verify that
$$\langle\phi, T\psi\rangle = \langle T\phi, \psi\rangle. \tag{38}$$
Analogously, by considering two vectors $\phi, \psi \in S_2$, one can easily verify that
$$\langle\phi, T^{-1}\psi\rangle = \langle T^{-1}\phi, \psi\rangle. \tag{39}$$
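This implies that, for $\phi \in S_2$, $\psi \in S_1$, one has $\langle T^{-1}\phi, T\psi\rangle = \langle\phi,\psi\rangle$. Let $H_T : S_2 \to S_2$ be the operator defined by $-iH_T := THT^{-1}$. It is easy to verify that
$$H_T\phi(x) = -\frac{\hbar^2}{2}\Delta\phi(x) + \lambda e^{i\frac{M+1}{2}\pi}|x|^{2M}\phi(x), \qquad \phi \in S_2.$$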
Theorem 9. Let $\psi \in S_1$ and $\phi \in S_2$. Let us assume that $\mathrm{Re}\big(e^{i\frac{M+1}{2}\pi}\big) \ge 0$. Then the operator $H_T$ is the restriction to $S_2$ of the generator $A$ of a strongly continuous contraction semigroup $V(t) = e^{-\frac{1}{\hbar}At}$ and the integral $I_t^i(\phi,\psi)$ given by (28) is equal to the matrix element $\langle T^{-1}\phi, V(t)T\psi\rangle$.

Proof. Let $V(t)_{t \ge 0}$ be the $C_0$-contraction semigroup defined by the Feynman–Kac type formula:
$$V(t)\psi(x) := \int_{C_t} e^{-\frac{e^{i\frac{M+1}{2}\pi}\lambda}{\hbar}\int_0^t|\sqrt{\hbar}\,\omega(s)+x|^{2M}\,ds}\,\psi\big(\sqrt{\hbar}\,\omega(t)+x\big)\,W(d\omega). \tag{40}$$
The operator-theoretic results for semigroups [13] of the form (40) have been investigated in [21,18] (see also [16, Chapter 13.5]). In particular the generator $A$ of the semigroup $V(t) = e^{-\frac{t}{\hbar}A}$ is given on smooth vectors $\psi \in S(\mathbb{R}^d)$ by
$$A\psi(x) = -\frac{\hbar^2}{2}\Delta\psi(x) + Q_{2M}(x)\psi(x), \qquad Q_{2M}(x) := \lambda e^{i\frac{M+1}{2}\pi}|x|^{2M}, \tag{41}$$
with domain
$$D(A) = \Big\{\psi \in H^1(\mathbb{R}^d) : -\frac{\hbar^2}{2}\Delta\psi + Q_{2M}\psi \in L^2(\mathbb{R}^d)\Big\}. \tag{42}$$
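By a direct computation and by taking $\psi \in S_1$ and $\phi \in S_2$, one can easily verify that
$$e^{i\frac{\pi}{4}d}\int_{\mathbb{R}^d}\bar\phi\big(e^{i\frac{\pi}{4}}x\big)\int_{C_t} e^{-\frac{e^{i(M+1)\frac{\pi}{2}}\lambda}{\hbar}\int_0^t|\sqrt{\hbar}\,\omega(s)+x|^{2M}\,ds}\,\psi\big(\sqrt{\hbar}\,e^{i\frac{\pi}{4}}\omega(t)+e^{i\frac{\pi}{4}}x\big)\,W(d\omega)\,dx = \langle T^{-1}\phi, V(t)T\psi\rangle,$$
so that $I_t^i(\phi,\psi) = \langle T^{-1}\phi, V(t)T\psi\rangle$. □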
Remark 4. Under the assumptions of Theorem 9, it is possible to give an alternative proof of Theorem 8, i.e. that the integral Iti (φ, ψ) is a weak solution of the Schrödinger equation in the sense that Eqs. (29) and (30) are satisfied.
Indeed Eq. (29) follows by writing $I_0^i(\phi,\psi)$ as $\langle T^{-1}\phi, T\psi\rangle$ and by Eqs. (38) and (39). Eq. (30) follows from the equality $I_t^i(\phi,\psi) = \langle T^{-1}\phi, e^{-\frac{t}{\hbar}A}T\psi\rangle$. Indeed
$$i\hbar\frac{d}{dt}\langle T^{-1}\phi, e^{-\frac{t}{\hbar}A}T\psi\rangle = \langle T^{-1}\phi, e^{-\frac{t}{\hbar}A}(-iH_T)T\psi\rangle = \langle T^{-1}\phi, e^{-\frac{t}{\hbar}H_T}TH\psi\rangle.$$

Remark 5. To prove that $I_t^i(\phi,\psi) = \langle\phi, e^{-\frac{it}{\hbar}H}\psi\rangle$ it would be sufficient to have that $|I_t^i(\phi,\psi)| \le C\|\phi\|\,\|\psi\|$, or, in other words, that given $\psi \in S_1$, one has that the vector $e^{-\frac{t}{\hbar}H_T}T\psi$ belongs to the domain of $(T^{-1})^*$, so that
$$\langle T^{-1}\phi, e^{-\frac{t}{\hbar}H_T}T\psi\rangle = \langle\phi, (T^{-1})^*e^{-\frac{t}{\hbar}H_T}T\psi\rangle.$$
If the inequality $|I_t^i(\phi,\psi)| \le C\|\phi\|\,\|\psi\|$ holds true, then this would imply that there exists a bounded operator $B(t) : L^2 \to L^2$ such that $I_t^i(\phi,\psi) = \langle\phi, B(t)\psi\rangle$ and $B(0) = I$. Then $I_t^i(\cdot,\cdot)$, defined on $S_2 \times S_1$, can be extended to $L^2 \times L^2$. In particular $I_t^i(U(t)\phi,\psi)$ then makes sense and by differentiating with respect to the time variable $t$ we obtain
$$i\hbar\frac{d}{dt}I_t^i\big(U(t)\phi,\psi\big) = I_t^i\big(U(t)\phi,H\psi\big) - I_t^i\big(HU(t)\phi,\psi\big) = 0,$$
so that $I_t^i(U(t)\phi,\psi) = I_0^i(U(0)\phi,\psi) = \langle\phi,\psi\rangle$ for any $t$ and this implies that $B(t) = U(t)$.

5. The functional integral as asymptotic solution

The present section is devoted to the proof that the functional integral (28) coincides asymptotically both as $t \to 0$ and as $\lambda \to 0$ with the matrix element $\langle\phi, e^{-\frac{i}{\hbar}Ht}\psi\rangle$ of the Schrödinger group.

Theorem 10. Let $\phi, \psi \in L^2(\mathbb{R}^d)$ and $M \in \mathbb{N}$ satisfy the assumptions of Theorem 9. Then as $t \to 0$ the integral $I_t^i(\phi,\psi)$ and the matrix element $\langle\phi, e^{-\frac{i}{\hbar}Ht}\psi\rangle$ of the Schrödinger group admit the following asymptotic expansions
$$I_t^i(\phi,\psi) = \sum_n a_nt^n, \qquad \langle\phi, e^{-\frac{i}{\hbar}Ht}\psi\rangle = \sum_n b_nt^n,$$
and they coincide, i.e. $a_n = b_n$ $\forall n \in \mathbb{N}$.

Proof. By Theorem 9, the functional integral $I_t^i(\phi,\psi)$ can be written as $\langle T^{-1}\phi, e^{-\frac{t}{\hbar}A}T\psi\rangle$, where $A$ is the operator defined by (41) and (42). Under the stated assumptions, the vector $T\psi$ belongs to the domain of $A^n$ $\forall n \in \mathbb{N}$, and $A^nT\psi$ belongs to $S_2 \subset D(T^{-1})$ $\forall n \in \mathbb{N}$. In particular, for any $N \in \mathbb{N}$ one has
$$\langle T^{-1}\phi, e^{-\frac{t}{\hbar}A}T\psi\rangle = \sum_{n=0}^{N}\frac{1}{n!}\Big(-\frac{t}{\hbar}\Big)^n\langle T^{-1}\phi, A^nT\psi\rangle + R_N,$$
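$$R_N = \Big(-\frac{t}{\hbar}\Big)^N\frac{1}{(N-1)!}\int_0^1 u^{N-1}\langle T^{-1}\phi, A^Ne^{-\frac{t}{\hbar}A(1-u)}T\psi\rangle\,du.$$
The remainder $R_N$ can easily be estimated by means of the Schwarz inequality and one has:
$$|R_N| \le \frac{|t|^N|\hbar|^{-N}}{N!}\,\|T^{-1}\phi\|\,\|A^NT\psi\| = O\big(|t|^N\big).$$
As $T\psi \in S_2$, one has $A^nT\psi = (H_T)^nT\psi = i^nTH^n\psi$, so that by Eq. (39)
$$\langle T^{-1}\phi, e^{-\frac{t}{\hbar}A}T\psi\rangle = \sum_{n=0}^{N}\frac{1}{n!}\Big(-\frac{it}{\hbar}\Big)^n\langle\phi, H^n\psi\rangle + O\big(t^N\big).$$
Analogously
$$\langle\phi, e^{-\frac{it}{\hbar}H}\psi\rangle = \sum_{n=0}^{N}\frac{1}{n!}\Big(-\frac{it}{\hbar}\Big)^n\langle\phi, H^n\psi\rangle + R'_N,$$
$$R'_N = \Big(-\frac{it}{\hbar}\Big)^N\frac{1}{(N-1)!}\int_0^1 u^{N-1}\langle\phi, H^Ne^{-\frac{it}{\hbar}H(1-u)}\psi\rangle\,du,$$
with $R'_N = O(t^N)$, and one can easily verify that the asymptotic expansions in powers of $t$ of $I_t^i(\phi,\psi)$ and of $\langle\phi, e^{-\frac{it}{\hbar}H}\psi\rangle$ coincide. □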
Remark 6. The power series of the variable $t$ are not convergent, but only asymptotic. This fact implies that the result of Theorem 10 is not sufficient to deduce the equality between $I_t^i(\phi,\psi)$ and $\langle\phi, e^{-\frac{i}{\hbar}Ht}\psi\rangle$.

Let us consider now pairs of vectors $\phi, \psi$ satisfying the assumptions of Theorem 1 (so that the result of Theorem 4 holds) and such that the functional integral (28) is well defined. In fact, it is always possible to find a dense set of vectors in $L^2(\mathbb{R}^d)$ satisfying both conditions. For instance, if $M = 2$, it is sufficient to take $\psi \in S_1$ and $\phi \in S_2$, while if $M \ge 3$ the fulfillment of the hypotheses of Theorem 1 implies that the integral (28) is well defined. Under these conditions, it is possible to interpret the functional integral (28) as an asymptotic weak solution of the Schrödinger equation, in the sense of the following theorem.

Theorem 11. Under the assumptions above, the asymptotic expansion in powers of the coupling constant $\lambda$ as $\lambda \to 0$ of the functional integral representation (28) coincides with the corresponding asymptotic expansion (14) of the matrix elements of the Schrödinger group. Moreover the latter is Borel summable.

Proof. By expanding the functional integral (28) in powers of $\lambda$ one has
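$$I_t^i(\phi,\psi) = \sum_{n=0}^{N-1}a_n\lambda^n + R_N,$$
with
$$a_n = \frac{1}{n!}\,e^{i\frac{\pi}{4}d}\int_{\mathbb{R}^d}\bar\phi\big(e^{i\frac{\pi}{4}}x\big)\int_{C_t}\Big(-\frac{e^{i(M+1)\frac{\pi}{2}}\lambda}{\hbar}\int_0^t\big|\sqrt{\hbar}\,\omega(s)+x\big|^{2M}\,ds\Big)^n\,\psi\big(\sqrt{\hbar}\,e^{i\frac{\pi}{4}}\omega(t)+e^{i\frac{\pi}{4}}x\big)\,W(d\omega)\,dx$$
and
$$R_N = \frac{\lambda^N}{(N-1)!}\,e^{i\frac{\pi}{4}d}\int_0^1(1-u)^{N-1}\int_{\mathbb{R}^d}\bar\phi\big(e^{i\frac{\pi}{4}}x\big)\int_{C_t}\Big(-\frac{e^{i(M+1)\frac{\pi}{2}}\lambda}{\hbar}\int_0^t\big|\sqrt{\hbar}\,\omega(s)+x\big|^{2M}\,ds\Big)^N$$
$$\times\, e^{-u\frac{e^{i(M+1)\frac{\pi}{2}}\lambda}{\hbar}\int_0^t|\sqrt{\hbar}\,\omega(s)+x|^{2M}\,ds}\,\psi\big(\sqrt{\hbar}\,e^{i\frac{\pi}{4}}\omega(t)+e^{i\frac{\pi}{4}}x\big)\,W(d\omega)\,dx\,du.$$
It is easy to verify that $R_N = O(\lambda^N)$. Moreover, the coefficients $a_n$ can be written as:
$$a_n = \frac{1}{n!}\,e^{i\frac{\pi}{4}d}\Big(-\frac{e^{i(M+1)\frac{\pi}{2}}\lambda}{\hbar}\Big)^n\int_0^t\!\cdots\!\int_0^t ds_1\cdots ds_n\int_{\mathbb{R}^d}\bar\phi\big(e^{i\frac{\pi}{4}}x\big)$$
$$\times\int_{C_t}\big|\sqrt{\hbar}\,\omega(s_1)+x\big|^{2M}\cdots\big|\sqrt{\hbar}\,\omega(s_n)+x\big|^{2M}\,\psi\big(\sqrt{\hbar}\,e^{i\frac{\pi}{4}}\omega(t)+e^{i\frac{\pi}{4}}x\big)\,W(d\omega)\,dx.$$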
By the result of Corollary 1, the latter coincides with the coefficient $a_n$ of the asymptotic expansion (14). On the other hand, by Theorem 4, the matrix elements of the Schrödinger group can be obtained in terms of the Borel sum of the asymptotic expansion $\sum a_n\lambda^n$. □

Remark 7. By a direct computation it is possible to verify that in the case $M = 2$ (the quartic oscillator case) the functional integral $I_t^i(\phi,\psi)$ can be extended to an analytic function of the variable $\lambda$ in the sector $D_{0,\pi}$ of the complex plane and satisfies there an estimate of the following form
$$\Big|I_t^i(\phi,\psi) - \sum_{n=0}^{N-1}a_n\lambda^n\Big| \le AC^N|\lambda|^NN!.$$
By Watson–Nevanlinna's theorem, it is possible to recover $I_t^i(\phi,\psi)$ in terms of the coefficients $a_n$ in the asymptotic expansion. This result, combined with the analogous result for $\langle\phi, e^{-\frac{i}{\hbar}Ht}\psi\rangle$, is not sufficient to deduce the equality $I_t^i(\phi,\psi) = \langle\phi, e^{-\frac{i}{\hbar}Ht}\psi\rangle$, as the two functions are defined as the Borel sums of the same asymptotic expansion but on different regions of the complex plane (the left-hand side on $D_{0,\pi}$ and the right-hand side on $D_{\pi,0}$). Indeed, let us consider two functions $f_1(z)$ and $f_2(z)$ of the complex variable $z$, defined and holomorphic resp. in $D_{0,\pi}$
and $D_{\pi,0}$, admitting as $z \to 0$ the same asymptotic expansion and estimate uniformly in their analyticity domains:
$$f_1(z) \sim \sum_n a_nz^n, \qquad \Big|f_1(z) - \sum_{n=0}^{N-1}a_nz^n\Big| \le A_1C_1^N|z|^NN!,$$
$$f_2(z) \sim \sum_n a_nz^n, \qquad \Big|f_2(z) - \sum_{n=0}^{N-1}a_nz^n\Big| \le A_2C_2^N|z|^NN!.$$
The functions $f_1, f_2$ can be recovered from their asymptotic expansion $\sum a_nz^n$ by means of the following procedure. Let us define two functions $g_1(z)$ and $g_2(z)$ of the complex variable $z \in D_{-\pi/2,\pi/2}$ by
$$g_1(z) := f_1(iz), \qquad g_2(z) := f_2(-iz), \qquad z \in D_{-\pi/2,\pi/2}.$$
$g_1$ and $g_2$ admit the following asymptotic expansions and estimates:
$$g_1(z) \sim \sum_n i^na_nz^n, \qquad \Big|g_1(z) - \sum_{n=0}^{N-1}i^na_nz^n\Big| \le A_1C_1^N|z|^NN!,$$
$$g_2(z) \sim \sum_n (-i)^na_nz^n, \qquad \Big|g_2(z) - \sum_{n=0}^{N-1}(-i)^na_nz^n\Big| \le A_2C_2^N|z|^NN!.$$
By Theorem 3 they are both Borel summable, i.e. formally:
$$g_1(z) = \frac{1}{z}\int_0^\infty e^{-t/z}\sum_n\frac{i^na_n}{n!}\,t^n\,dt, \tag{43}$$
$$g_2(z) = \frac{1}{z}\int_0^\infty e^{-t/z}\sum_n\frac{(-i)^na_n}{n!}\,t^n\,dt. \tag{44}$$
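The Borel summation formulas (43) and (44) can be illustrated on a standard textbook example that is not taken from this paper: the Euler series $\sum_n (-1)^n n!\,z^n$. It diverges for every $z \ne 0$, its Borel transform is $\sum_n (-t)^n = 1/(1+t)$, and for $z > 0$ the substitution $t = zs$ gives the Borel sum $\int_0^\infty e^{-s}/(1+zs)\,ds$. The sketch below (function names and quadrature parameters are assumptions) compares this Borel sum with truncated partial sums.

```python
import math

def borel_sum_euler(z, s_max=60.0, n=6000):
    """Borel sum of sum_n (-1)^n n! z^n for z > 0.

    Evaluates int_0^infinity e^{-s} / (1 + z s) ds with composite Simpson's rule
    on [0, s_max]; the neglected tail is of order e^{-s_max}. n must be even.
    """
    h = s_max / n

    def f(s):
        return math.exp(-s) / (1.0 + z * s)

    total = f(0.0) + f(s_max)
    for j in range(1, n):
        total += (4 if j % 2 else 2) * f(j * h)
    return total * h / 3

def partial_sum(z, N):
    """Truncated asymptotic series: sum of (-1)^k k! z^k for k = 0, ..., N - 1."""
    return sum((-1) ** k * math.factorial(k) * z ** k for k in range(N))

if __name__ == "__main__":
    z = 0.1
    g = borel_sum_euler(z)
    for N in (2, 5, 10, 20, 40):
        # The error first shrinks (down to about N! z^N near N = 1/z), then blows up.
        print(N, abs(g - partial_sum(z, N)))
```

In this example the remainder after $N$ terms is bounded by $N!\,z^N$, which has the same form as the estimates $A C^N |z|^N N!$ used in Remark 7.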
One would have $f_1(z) = f_2(z)$ for $z \in \mathbb{R}^+$, if $g_1(i\rho) = g_2(-i\rho)$ for $\rho \in \mathbb{R}^+$; however the Borel summability of the asymptotic expansion $\sum a_nz^n$, in particular the relations (43) and (44), are by themselves not sufficient to deduce this equality.

Remark 8. In [4] the asymptotic expansion in powers of the parameter $\hbar$ of a finite-dimensional analogue of the integral $I_t^i(\phi,\psi)$ has been studied and its Borel summability has been proved.

Acknowledgments

The hospitality of the Mathematics Institutes of Trento and of Bonn Universities is gratefully acknowledged, as well as the financial support of the F. Severi fellowship of I.N.d.A.M.
References

[1] S. Albeverio, R. Høegh-Krohn, S. Mazzucchi, Mathematical theory of Feynman path integrals, in: An Introduction, 2nd and enlarged edition, in: Lecture Notes in Math., vol. 523, Springer-Verlag, Berlin/Heidelberg, 2008.
[2] S. Albeverio, S. Mazzucchi, Generalized infinite-dimensional Fresnel integrals, C. R. Math. Acad. Sci. Paris 338 (3) (2004) 255–259.
[3] S. Albeverio, S. Mazzucchi, Some new developments in the theory of path integrals, with applications to quantum theory, J. Stat. Phys. 115 (112) (2004) 191–215.
[4] S. Albeverio, S. Mazzucchi, Generalized Fresnel integrals, Bull. Sci. Math. 129 (1) (2005) 1–23.
[5] S. Albeverio, S. Mazzucchi, Feynman path integrals for polynomially growing potentials, J. Funct. Anal. 221 (1) (2005) 83–121.
[6] S. Albeverio, S. Mazzucchi, Feynman path integrals for the time dependent quartic oscillator, C. R. Math. Acad. Sci. Paris 341 (10) (2005) 647–650.
[7] S. Albeverio, S. Mazzucchi, The time dependent quartic oscillator - A Feynman path integral approach, J. Funct. Anal. 238 (2) (2006) 471–488.
[8] S. Albeverio, S. Mazzucchi, The trace formula for the heat semigroup with polynomial potential, SFB-611-Preprint No. 332, Bonn, 2007.
[9] C. Bender, T. Wu, Anharmonic oscillator, Phys. Rev. (2) 184 (1969) 1231–1260.
[10] R.H. Cameron, A family of integrals serving to connect the Wiener and Feynman integrals, J. Math. Phys. 39 (1960) 126–140.
[11] I.M. Davies, A. Truman, On the Laplace asymptotic expansion of conditional Wiener integrals and the Bender–Wu formula for $x^{2N}$-anharmonic oscillators, J. Math. Phys. 24 (2) (1983) 255–266.
[12] H. Doss, Sur une Résolution Stochastique de l'Equation de Schrödinger à Coefficients Analytiques, Comm. Math. Phys. 73 (1980) 247–264.
[13] K.-J. Engel, R. Nagel, One-parameter semigroups for linear evolution equations, in: S. Brendle, M. Campiti, T. Hahn, G. Metafune, G. Nickel, D. Pallara, C. Perazzoli, A. Rhandi, S. Romanelli, R. Schnaubelt (Eds.), Grad. Texts in Math., vol. 194, Springer-Verlag, New York, 2000.
[14] G.H. Hardy, Divergent Series, Oxford University Press, London, 1963.
[15] T. Hida, H.H. Kuo, J. Potthoff, L. Streit, White Noise, Kluwer, Dordrecht, 1995.
[16] G.W. Johnson, M.L. Lapidus, The Feynman Integral and Feynman's Operational Calculus, Oxford University Press, New York, 2000.
[17] G. Kallianpur, D. Kannan, R.L. Karandikar, Analytic and sequential Feynman integrals on abstract Wiener and Hilbert spaces, and a Cameron Martin formula, Ann. Inst. H. Poincaré Probab. Th. 21 (1985) 323–361.
[18] T. Kato, On some Schrödinger operators with a singular complex potential, Ann. Sc. Norm. Super. Pisa Cl. Sci. (4) 5 (1) (1978) 105–114.
[19] S. Mazzucchi, Feynman path integrals for the inverse quartic oscillator, J. Math. Phys. 49 (9) (2008) 093502 (15 pp.).
[20] S. Mazzucchi, Mathematical Feynman Path Integrals and Applications, World Scientific Publishing, Singapore, 2009.
[21] E. Nelson, Feynman integrals and the Schrödinger equation, J. Math. Phys. 5 (1964) 332–343.
[22] F. Nevanlinna, Zur Theorie der asymptotischen Potenzreihen, Ann. Acad. Sci. Fenn. (A) 12 (3) (1919) 1–81.
[23] M. Reed, B. Simon, Methods of modern mathematical physics, II, in: Fourier Analysis, Self-Adjointness, Academic Press/Harcourt Brace Jovanovich Publishers, New York/London, 1975.
[24] M. Reed, B. Simon, Methods of modern mathematical physics, IV, in: Analysis of Operators, Academic Press/Harcourt Brace Jovanovich Publishers, New York/London, 1978.
[25] B. Simon, Coupling constant analyticity for the anharmonic oscillator, Ann. Physics 58 (1970) 76–136.
[26] B. Simon, Functional Integration and Quantum Physics, second ed., Amer. Math. Soc. Chelsea Publishing, Providence, RI, 2005.
[27] A. Sokal, An improvement of Watson's theorem on Borel summability, J. Math. Phys. 21 (1980) 261–263.
[28] K. Yajima, Smoothness and non-smoothness of the fundamental solution of time dependent Schrödinger equations, Comm. Math. Phys. 181 (1996) 605–629.
Journal of Functional Analysis 257 (2009) 1053–1091 www.elsevier.com/locate/jfa
Local minimizers of the Ginzburg–Landau functional with prescribed degrees Mickaël Dos Santos Université de Lyon, Université Lyon 1, INSA de Lyon, F-69621, Ecole Centrale de Lyon, CNRS, UMR5208, Institut Camille Jordan, 43 blvd du 11 novembre 1918, F-69622 Villeurbanne cedex, France Received 19 February 2009; accepted 23 February 2009 Available online 17 March 2009 Communicated by H. Brezis
Abstract We consider, in a smooth bounded multiply connected domain $D \subset \mathbb{R}^2$, the Ginzburg–Landau energy $E_\varepsilon(u) = \frac{1}{2}\int_D|\nabla u|^2 + \frac{1}{4\varepsilon^2}\int_D(1-|u|^2)^2$ subject to prescribed degree conditions on each component of $\partial D$. In general, minimal energy maps do not exist [L. Berlyand, P. Mironescu, Ginzburg–Landau minimizers in perforated domains with prescribed degrees, preprint, 2004]. When $D$ has a single hole, Berlyand and Rybalko [L. Berlyand, V. Rybalko, Solution with vortices of a semi-stiff boundary value problem for the Ginzburg–Landau equation, J. Eur. Math. Soc. (JEMS), in press, 2008, http://www.math.psu.edu/berlyand/publications/publications.html] proved that for small $\varepsilon$ local minimizers do exist. We extend the result in [L. Berlyand, V. Rybalko, Solution with vortices of a semi-stiff boundary value problem for the Ginzburg–Landau equation, J. Eur. Math. Soc. (JEMS), in press, 2008, http://www.math.psu.edu/berlyand/publications/publications.html]: $E_\varepsilon(u)$ has, in domains $D$ with 2, 3, . . . holes and for small $\varepsilon$, local minimizers. Our approach is very similar to the one in [L. Berlyand, V. Rybalko, Solution with vortices of a semi-stiff boundary value problem for the Ginzburg–Landau equation, J. Eur. Math. Soc. (JEMS), in press, 2008, http://www.math.psu.edu/berlyand/publications/publications.html]; the main difference lies in the construction of test functions with energy control. © 2009 Elsevier Inc. All rights reserved. Keywords: Ginzburg–Landau functional; Prescribed degrees; Local minimizers
E-mail address: [email protected]. 0022-1236/$ – see front matter © 2009 Elsevier Inc. All rights reserved. doi:10.1016/j.jfa.2009.02.023
M. Dos Santos / Journal of Functional Analysis 257 (2009) 1053–1091
1. Introduction

This article deals with the existence problem of local minimizers of the Ginzburg–Landau functional with prescribed degrees in a 2D perforated domain $D$. The domain we consider is of the form $D = \Omega \setminus \bigcup_{i \in \mathbb{N}_N}\bar\omega_i$, where $N \in \mathbb{N}^*$, $\Omega$ and the $\omega_i$'s are simply connected, bounded and smooth open sets of $\mathbb{R}^2$. We assume that $\bar\omega_i \subset \Omega$ and $\bar\omega_i \cap \bar\omega_j = \emptyset$ for $i, j \in \mathbb{N}_N := \{1, \dots, N\}$, $i \ne j$. The Ginzburg–Landau functional is
$$E_\varepsilon(u, D) := \frac{1}{2}\int_D|\nabla u|^2\,dx + \frac{1}{4\varepsilon^2}\int_D\big(1-|u|^2\big)^2\,dx \tag{1}$$
with $u : D \to \mathbb{C} \simeq \mathbb{R}^2$, where $\varepsilon$ is a positive parameter (the inverse of $\kappa$, the Ginzburg–Landau parameter). When there is no ambiguity we will write $E_\varepsilon(u)$ instead of $E_\varepsilon(u, D)$. The functions we will consider belong to the class
$$J = \big\{u \in H^1(D, \mathbb{C}) \text{ s.t. } |u| = 1 \text{ on } \partial D\big\}.$$
Clearly, $J$ is closed under weak $H^1$-convergence. This functional is a simplified version of the Ginzburg–Landau functional which arises in superconductivity (or superfluidity) to model the state of a superconductor submitted to a magnetic field (see, e.g., [10] or [9]). The simplified version of the Ginzburg–Landau functional considered in (1) ignores the magnetic field. The issue we consider in this paper is the existence of local minimizers with prescribed degrees on $\partial D$. We next formulate rigorously the problem discussed in this paper. To this purpose, we start by defining properly the degrees of a map $u \in J$. For $\gamma \in \{\partial\Omega, \dots, \partial\omega_N\}$ and $u \in J$ we let
$$\deg_\gamma u = \frac{1}{2\pi}\int_\gamma u \times \partial_\tau u\,d\tau.$$
Here:
• each $\gamma$ is directly (counterclockwise) oriented,
• $\tau = \nu^\perp$, $\tau$ is the tangential vector of $\gamma$ and $\nu$ the outward normal to $\Omega$ if $\gamma = \partial\Omega$ or $\omega_i$ if $\gamma = \partial\omega_i$,
• $\partial_\tau = \tau \cdot \nabla$, the tangential derivative, and "$\cdot$" stands for the scalar product in $\mathbb{R}^2$,
• "$\times$" stands for the vectorial product in $\mathbb{C}$, $(z_1 + \imath z_2) \times (w_1 + \imath w_2) := z_1w_2 - z_2w_1$, $z_1, z_2, w_1, w_2 \in \mathbb{R}$,
• the integral over $\gamma$ should be understood using the duality between $H^{1/2}(\gamma)$ and $H^{-1/2}(\gamma)$ (see, e.g., [4, Definition 1]).

It is known that $\deg_\gamma u$ is an integer; see [4] (the introduction) or [6]. We denote the (total) degree of $u \in J$ in $D$ by
$$\deg(u, D) = \big(\deg_{\partial\omega_1}(u), \dots, \deg_{\partial\omega_N}(u), \deg_{\partial\Omega}(u)\big) \in \mathbb{Z}^N \times \mathbb{Z}.$$
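For a smooth map with $|u| = 1$ on a curve, $u \times \partial_\tau u$ is the tangential derivative of the phase of $u$, so the degree is the total phase change divided by $2\pi$. The following sketch computes it on the unit circle for illustration only: the function name and the sampling scheme are assumptions, and it does not address the $H^{1/2}$ setting of the definition above.

```python
import cmath
import math

def boundary_degree(u, n_points=4000):
    """Approximate degree (winding number) of an S^1-valued map u on the unit circle.

    The circle is traversed counterclockwise; phase increments between consecutive
    sample points are accumulated and the total is divided by 2*pi. The sampling
    must be fine enough that each increment stays below pi in absolute value.
    """
    total = 0.0
    prev = u(cmath.exp(0j))
    for j in range(1, n_points + 1):
        z = cmath.exp(2j * math.pi * j / n_points)
        cur = u(z)
        # phase(cur * conj(prev)) is the signed angle from prev to cur
        total += cmath.phase(cur * prev.conjugate())
        prev = cur
    return round(total / (2 * math.pi))

if __name__ == "__main__":
    print(boundary_degree(lambda z: z ** 3))         # map z -> z^3
    print(boundary_degree(lambda z: z.conjugate()))  # orientation-reversing map
```

With the counterclockwise orientation used in the text, $z \mapsto z^d$ has degree $d$ and complex conjugation has degree $-1$.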
For $(p, q) \in \mathbb{Z}^N \times \mathbb{Z}$, we are interested in the minimization of $E_\varepsilon$ in
$$J_{p,q} := \big\{u \in J \text{ s.t. } \deg(u, D) = (p, q)\big\}.$$
There is a huge literature devoted to the minimization of $E_\varepsilon$. In a simply connected domain $\Omega$, the minimization problem of $E_\varepsilon$ with the Dirichlet boundary condition $g \in C^\infty(\partial\Omega, S^1)$ is studied in detail in [7]. $E_\varepsilon$ has a minimizer for each $\varepsilon > 0$. This minimizer need not be unique. In this framework, when $\deg_{\partial\Omega}(g) \ne 0$, the authors studied the asymptotic behaviour of a sequence of minimizers (when $\varepsilon_n \downarrow 0$) and pointed out the existence (up to a subsequence) of a finite set of singularities of the limit. Other types of boundary conditions were studied, like the Dirichlet condition $g \in C^\infty(\partial\Omega, \mathbb{C} \setminus \{0\})$ (in a simply connected domain $\Omega$) in [1] and later $g \in C^\infty(\partial\Omega, \mathbb{C})$ (see [2]). If the boundary data is not $u|_{\partial D}$, but a given set of degrees, then the existence of local minimizers is not trivial. Indeed, one can show that $J_{p,q}$ is not closed under weak $H^1$-convergence (see next section), so that one cannot apply the direct method in the calculus of variations in order to derive the existence of minimizers. Actually this is not just a technical difficulty, since in general the infimum of $E_\varepsilon$ in $J_{p,q}$ is not attained; we need more assumptions, such as on the value of the $H^1$-capacity of $D$ (see [3] and [4]). Minimizers $u$ of $E_\varepsilon$ in $J_{p,q}$, if they do exist, satisfy the equation
$$\begin{cases} -\Delta u = \dfrac{u}{\varepsilon^2}\big(1-|u|^2\big) & \text{in } D,\\ |u| = 1 & \text{on } \partial D,\\ u \times \partial_\nu u = 0 & \text{on } \partial D,\\ \deg(u, D) = (p, q) \end{cases} \tag{2}$$
where $\partial_\nu$ denotes the normal derivative, i.e., $\partial_\nu = \frac{\partial}{\partial\nu} = \nu \cdot \nabla$. Existence of local minimizers of $E_\varepsilon$ is obtained following the same lines as in [5]. It turns out that, even if the infimum of $E_\varepsilon$ in $J_{p,q}$ is not attained, (2) may have solutions. This was established by Berlyand and Rybalko when $D$ has a single hole, i.e., when $N = 1$. Our main result is the following generalisation of the main result in [5]:
Theorem 1. Let $(\mathbf{p},q)\in\mathbb{Z}^N\times\mathbb{Z}$ and let $M\in\mathbb{N}^*$. There is $\varepsilon_1(\mathbf{p},q,M)>0$ s.t. for $\varepsilon<\varepsilon_1$, there are at least $M$ locally minimizing solutions.
Actually, we will prove a more precise form of Theorem 1 (see Theorem 2), whose statement relies on the notion of approximate bulk degree introduced in [5] and generalised in the next section. The main difference with respect to [5] lies in the construction of the test functions with energy control in Section 6. In a sense that will be explained in detail in Section 6, our construction is local, while the one in [5] is global. We also simplify and unify some proofs in [5]. We do not know whether the conclusion of Theorem 1 still holds when $D$ has no holes at all. That is, we do not know whether for a simply connected domain $\Omega$, a given $d\in\mathbb{Z}^*$ and small $\varepsilon$, the problem
$$\begin{cases} -\Delta u = \dfrac{u}{\varepsilon^2}\big(1-|u|^2\big) & \text{in } \Omega,\\ u\times\partial_\nu u = 0 & \text{on } \partial\Omega,\\ |u| = 1 & \text{on } \partial\Omega,\\ \deg_{\partial\Omega}(u) = d \end{cases}\tag{3}$$
has solutions. Existence of a solution of (3) is clear when $\Omega$ is a disc, say $\Omega = D(0,R)$ (it suffices to consider a solution of $-\Delta u = \frac{u}{\varepsilon^2}(1-|u|^2)$ of the form $u(z) = f(|z|)\big(\frac{z}{|z|}\big)^d$ with $u|_{\partial\Omega} = \big(\frac{z}{|z|}\big)^d$). However, we do not know the answer when $\Omega$ is not radially symmetric anymore.

2. The approximate bulk degree

This section is a straightforward adaptation of [5]. Existence of (local) minimizers for $E_\varepsilon$ in $J_{\mathbf{p},q}$ is not straightforward since $J_{\mathbf{p},q}$ is not closed under weak $H^1$-convergence. A typical example (see [4]) is a sequence $(M_n)_n$ s.t.
$$M_n : D(0,1)\to D(0,1),\qquad x\mapsto \frac{x-(1-1/n)}{(1-1/n)x-1},$$
where $D(0,1)\subset\mathbb{C}$ is the open unit disc centred at the origin. Then $M_n\rightharpoonup 1$ in $H^1$, $\deg_{S^1}(M_n) = 1$ and $\deg_{S^1}(1) = 0$. To obtain local minimizers, Berlyand and Rybalko (in [5]) devised a tool: the approximate bulk degree. We adapt this tool for a multiply connected domain. We consider, for $i\in\mathbb{N}_N := \{1,\dots,N\}$, $V_i$ the unique solution of
$$\begin{cases} -\Delta V_i = 0 & \text{in } D,\\ V_i = 1 & \text{on } \partial D\setminus\partial\omega_i,\\ V_i = 0 & \text{on } \partial\omega_i. \end{cases}$$
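The Möbius example $M_n$ from the beginning of this section can be probed numerically. The sketch below (an illustration under stated assumptions, not part of the paper's argument) checks that the winding number of $M_n$ on the unit circle stays equal to $1$, while a midpoint-rule approximation of $\int_{D(0,1)}|M_n-1|^2$ decreases as $n$ grows, which is consistent with the weak convergence $M_n\rightharpoonup 1$. The function names and grid sizes are hypothetical choices.

```python
import cmath
import math


def mobius(x: complex, n: int) -> complex:
    """M_n(x) = (x - a) / (a*x - 1) with a = 1 - 1/n."""
    a = 1.0 - 1.0 / n
    return (x - a) / (a * x - 1.0)


def winding_number(n: int, samples: int = 20000) -> float:
    """Winding number of M_n restricted to the unit circle (counterclockwise)."""
    step = 2.0 * math.pi / samples
    total = 0.0
    prev = mobius(cmath.exp(0j), n)
    for j in range(1, samples + 1):
        cur = mobius(cmath.exp(1j * j * step), n)
        total += cmath.phase(cur / prev)  # phase increment, taken in (-pi, pi]
        prev = cur
    return total / (2.0 * math.pi)


def l2_distance_to_one_squared(n: int, radial: int = 200, angular: int = 400) -> float:
    """Midpoint-rule approximation of the integral of |M_n - 1|^2 over the unit disc."""
    dr = 1.0 / radial
    dt = 2.0 * math.pi / angular
    total = 0.0
    for i in range(radial):
        r = (i + 0.5) * dr
        for j in range(angular):
            x = r * cmath.exp(1j * (j + 0.5) * dt)
            total += abs(mobius(x, n) - 1.0) ** 2 * r * dr * dt
    return total


winding_50 = winding_number(50)
dist_5 = l2_distance_to_one_squared(5)
dist_50 = l2_distance_to_one_squared(50)
```

The degree is thus not preserved in the weak limit, which is the obstruction to the direct method discussed above.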
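For $u\in J := \{u\in H^1(D,\mathbb{C}),\ |u| = 1 \text{ on } \partial D\}$, we set, noting $\partial_k u = \frac{\partial}{\partial x_k}u$,
$$\operatorname{abdeg}_i(u,D) = \frac{1}{2\pi}\int_D u\times(\partial_1V_i\,\partial_2u - \partial_2V_i\,\partial_1u)\,dx,\tag{4}$$
$$\operatorname{abdeg}(u,D) = \big(\operatorname{abdeg}_1(u,D),\dots,\operatorname{abdeg}_N(u,D)\big).\tag{5}$$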
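Following [5], we call $\operatorname{abdeg}(u,D)$ the approximate bulk degree of $u$. In general, $\operatorname{abdeg}_i : J\to\mathbb{R}$ is not an integer (unlike the degree). However, we have

Proposition 1.
(1) If $u\in H^1(D,S^1)$, then $\operatorname{abdeg}_i(u,D) = \deg_{\partial\omega_i}(u)$;
(2) Let $\Lambda,\varepsilon>0$ and $u,v\in J$ s.t. $E_\varepsilon(u),E_\varepsilon(v)\le\Lambda$, then
$$|\operatorname{abdeg}_i(u) - \operatorname{abdeg}_i(v)| \le \frac{2}{\pi}\|V_i\|_{C^1(D)}\Lambda^{1/2}\|u-v\|_{L^2(D)};\tag{6}$$
(3) Let $\Lambda>0$ and $(u_\varepsilon)_{\varepsilon>0}\subset J$ s.t. for all $\varepsilon>0$, $E_\varepsilon(u_\varepsilon)\le\Lambda$, then
$$\operatorname{dist}\big(\operatorname{abdeg}(u_\varepsilon),\mathbb{Z}^N\big)\to 0 \quad\text{when } \varepsilon\to 0.\tag{7}$$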
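The proof of Proposition 1 is postponed to Appendix B. We define, for $\mathbf{d} = (d_1,\dots,d_N)\in\mathbb{Z}^N$, $\mathbf{p} = (p_1,\dots,p_N)\in\mathbb{Z}^N$ and $q\in\mathbb{Z}$,
$$J^{\mathbf{d}}_{\mathbf{p},q} = J^{\mathbf{d}}_{\mathbf{p},q}(D) := \Big\{u\in J_{\mathbf{p},q};\ \|\operatorname{abdeg}(u)-\mathbf{d}\|_\infty := \max_{i\in\mathbb{N}_N}|d_i - \operatorname{abdeg}_i(u)| \le \frac13\Big\}.$$
The following result states that $J^{\mathbf{d}}_{\mathbf{p},q}$ is never empty for $(\mathbf{p},q,\mathbf{d})\in\mathbb{Z}^N\times\mathbb{Z}\times\mathbb{Z}^N$. For $i\in\{0,\dots,N\}$, we denote $e_i = (\delta_{i,1},\dots,\delta_{i,N},\delta_{i,0})\in\mathbb{Z}^{N+1}$, where
$$\delta_{i,k} = \begin{cases} 1 & \text{if } i = k,\\ 0 & \text{otherwise} \end{cases}$$
is the Kronecker symbol.

Proposition 2. Let $(\mathbf{p},q,\mathbf{d})\in\mathbb{Z}^N\times\mathbb{Z}\times\mathbb{Z}^N$. Then $J^{\mathbf{d}}_{\mathbf{p},q}\neq\emptyset$.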
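Proof. For $i\in\{0,\dots,N\}$, there is $M_n^i\in J_{(p_i-d_i)e_i}$ if $i\neq 0$ and $M_n^0\in J_{(q-\sum_j d_j)e_0}$ s.t. $M_n^i\rightharpoonup 1$ in $H^1$ and $|M_n^i|\le 1$ (Lemmas 6.1 and 6.2 in [4]). Let
$$E_{\mathbf{d}} := \big\{u\in H^1(D,S^1);\ \deg(u,D) = (\mathbf{d},d)\big\},\qquad \mathbf{d} = (d_1,\dots,d_N),\quad d = \sum_{j=1}^N d_j.$$
We note that $E_{\mathbf{d}}\neq\emptyset$, see, e.g., [7]. Let $u\in E_{\mathbf{d}}$ and $u_n := u\prod_{i=0}^N M_n^i$. We will prove that, for large $n$, we have, up to subsequence, $u_n\in J^{\mathbf{d}}_{\mathbf{p},q}$. Indeed, up to subsequence,
$$u_n\rightharpoonup u \ \text{ in } H^1,\qquad u_n\in J_{\mathbf{p},q}.$$
Using the fact that $\operatorname{abdeg}(u) = \mathbf{d}$ and the weak $H^1$-continuity of the approximate bulk degree, we obtain, for $n$ sufficiently large, that $u_n\in J^{\mathbf{d}}_{\mathbf{p},q}$. □

We denote $m_\varepsilon(\mathbf{p},q,\mathbf{d})$ the infimum of $E_\varepsilon$ on $J^{\mathbf{d}}_{\mathbf{p},q}$, i.e.,
$$m_\varepsilon(\mathbf{p},q,\mathbf{d}) = \inf_{u\in J^{\mathbf{d}}_{\mathbf{p},q}} E_\varepsilon(u).$$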
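We may now state a refined version of Theorem 1.

Theorem 2. Let $\mathbf{d}\in(\mathbb{N}^*)^N$. Then, for all $(p_1,\dots,p_N,q)\in\mathbb{Z}^{N+1}$ s.t. $q\le d$ and $p_i\le d_i$, there is $\varepsilon_2 = \varepsilon_2(\mathbf{p},q,\mathbf{d})>0$ s.t. for $0<\varepsilon<\varepsilon_2$, $m_\varepsilon(\mathbf{p},q,\mathbf{d})$ is attained. Moreover, we have the following estimate
$$m_\varepsilon(\mathbf{p},q,\mathbf{d}) = I_0(\mathbf{d},D) + \pi(d_1-p_1+\dots+d_N-p_N+d-q) - o_\varepsilon(1),\qquad o_\varepsilon(1)\xrightarrow[\varepsilon\to0]{}0.$$
For further use, a configuration of degrees $(\mathbf{p},q,\mathbf{d})\in\mathbb{Z}^N\times\mathbb{Z}\times(\mathbb{N}^*)^N$ s.t. $p_i\le d_i$ and $q\le\sum_i d_i$ will be called a "good configuration". Noting that, for $\mathbf{d}\neq\tilde{\mathbf{d}}\in\mathbb{Z}^N$ and $(\mathbf{p},q)\in\mathbb{Z}^N\times\mathbb{Z}$, we have $J^{\mathbf{d}}_{\mathbf{p},q}\cap J^{\tilde{\mathbf{d}}}_{\mathbf{p},q} = \emptyset$, we are led to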
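Proof of Theorem 1. Let $(\mathbf{p},q)\in\mathbb{Z}^N\times\mathbb{Z}$ and set, for $k\in\mathbb{N}^*$,
$$d = \max\Big(\max_i|p_i|,\ |q|\Big)\quad\text{and}\quad \mathbf{d}_k = (d+k,\dots,d+k).$$
We apply Theorem 2 to the class $J^{\mathbf{d}_k}_{\mathbf{p},q}$. We obtain the existence of
$$\varepsilon_1(\mathbf{p},q,M) = \min_{k\in\mathbb{N}_M}\varepsilon_2(\mathbf{p},q,\mathbf{d}_k) > 0$$
s.t. for $\varepsilon<\varepsilon_1$ and $k\in\mathbb{N}_M$, $m_\varepsilon(\mathbf{p},q,\mathbf{d}_k)$ is achieved by $u^k_\varepsilon$. Noting the continuity of the degree and of the approximate bulk degree for the strong $H^1$-convergence, there exists $V^k_\varepsilon\subset J^{\mathbf{d}_k}_{\mathbf{p},q}\subset J$ an open (for the $H^1$-norm) neighbourhood of $u^k_\varepsilon$. It follows easily that
$$E_\varepsilon\big(u^k_\varepsilon\big) = \min_{u\in V^k_\varepsilon}E_\varepsilon(u).$$
Then $u^k_\varepsilon\in J_{\mathbf{p},q}$ is a local minimizer of $E_\varepsilon$ in $J$ (for the $H^1$-norm) for $0<\varepsilon<\varepsilon_1(\mathbf{p},q,M)$. □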
3. Basic facts of the Ginzburg–Landau theory

It is well known (cf. [4, Lemma 4.4, p. 22]) that the local minimizers of $E_\varepsilon$ in $J_{\mathbf{p},q}$ satisfy
$$-\Delta u = \frac{1}{\varepsilon^2}u\big(1-|u|^2\big)\quad\text{in } D,\tag{8}$$
$$|u| = 1\quad\text{and}\quad u\times\partial_\nu u = 0\quad\text{on }\partial D.\tag{9}$$
Eq. (8) and the Dirichlet condition on the modulus in (9) are classical. The Neumann condition on the phase in (9) is less standard but it is for example stated in [4]. Eq. (8) combined with the boundary condition on $\partial D$ implies, via a maximum principle, that
$$|u|\le 1\quad\text{in } D.\tag{10}$$
One of the questions in the Ginzburg–Landau model is the location of the vortices of stable solutions (i.e., local minimizers of $E_\varepsilon$). We will define ad hoc a vortex as an isolated zero $x$ of $u$ with nonzero degree on small circles around $x$. The following result shows that, under energy bound assumptions on solutions of (8), vortices are expelled to the boundary when $\varepsilon\to0$.

Lemma 1. (See [8].) Let $\Lambda>0$ and let $u$ be a solution of (8) satisfying (10) and the energy bound $E_\varepsilon(u)\le\Lambda$. Then with $C$, $C_k$ and $\varepsilon_3$ depending only on $\Lambda$, $D$, we have, for $0<\varepsilon<\varepsilon_3$ and $x\in D$,
$$1-|u(x)|^2 \le \frac{C\varepsilon^2}{\operatorname{dist}^2(x,\partial D)}\tag{11}$$
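and
$$\big|D^ku(x)\big| \le \frac{C_k}{\operatorname{dist}^k(x,\partial D)}.\tag{12}$$
When $u$ is smooth in $D$ and $\rho = |u|>0$, the map $\frac{u}{\rho}$ admits a lifting $\theta$, i.e., we may write
$$u = \rho e^{\imath\theta},$$
where $\theta$ is a smooth (and locally defined) real function on $D$ and $\nabla\theta$ is a globally defined smooth vector field. Using (8) and (9), we have
$$\begin{cases}\operatorname{div}\big(\rho^2\nabla\theta\big) = 0 & \text{in } B,\\ \partial_\nu\theta = 0 & \text{on }\partial D,\end{cases}\tag{13}$$
$$\begin{cases}-\Delta\rho + |\nabla\theta|^2\rho + \dfrac{1}{\varepsilon^2}\rho\big(\rho^2-1\big) = 0 & \text{in } B,\\ \rho = 1 & \text{on }\partial D,\end{cases}\tag{14}$$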
here, $B = \{x\in D;\ u(x)\neq0\}$. We will need later the following.

Lemma 2. (See [5].) Let $u$ be a solution of (8) and (9). Let $G\subset D$ be an open Lipschitz set s.t. $u$ does not vanish in $G$. Write, in $G$, $u = \rho v$ with $\rho = |u|$. Let $w\in H^1(G,\mathbb{C})$ be s.t. $|\operatorname{tr}_{\partial G}w|\equiv1$. Then
$$E_\varepsilon(\rho w,G) = E_\varepsilon(u,G) + L_\varepsilon(w,G),$$
with
$$L_\varepsilon(w,G) = \frac12\int_G\rho^2|\nabla w|^2\,dx - \frac12\int_G|w|^2\rho^2|\nabla v|^2\,dx + \frac{1}{4\varepsilon^2}\int_G\rho^4\big(1-|w|^2\big)^2\,dx.$$
For further use, we note that we may write, locally in $G$, $u = \rho e^{\imath\theta}$, so that $v = e^{\imath\theta}$. It turns out that $\nabla\theta$ is smooth and globally defined in $G$. In terms of $\nabla\theta$, we may rewrite
$$L_\varepsilon(w,G) = \frac12\int_G\rho^2|\nabla w|^2\,dx - \frac12\int_G|w|^2\rho^2|\nabla\theta|^2\,dx + \frac{1}{4\varepsilon^2}\int_G\rho^4\big(1-|w|^2\big)^2\,dx.$$
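For $u$ a solution of (8) and (9), we can consider (see Lemma 7 in [5]) $h$ the unique globally defined solution of
$$\begin{cases}\nabla^\perp h = u\times\nabla u & \text{in } D,\\ h = 1 & \text{on }\partial\Omega,\\ h = k_i & \text{on }\partial\omega_i,\end{cases}\tag{15}$$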
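where the $k_i$'s are real constants uniquely defined by the first two equations in (15). Here
$$\nabla^\perp h = \begin{pmatrix}-\partial_2h\\ \partial_1h\end{pmatrix}\ \text{is the orthogonal gradient of } h\quad\text{and}\quad u\times\nabla u = \begin{pmatrix}u\times\partial_1u\\ u\times\partial_2u\end{pmatrix}.$$
It is easy to show that
$$\begin{cases}\nabla h = -\rho^2\nabla^\perp\theta & \text{in } B,\\ \operatorname{div}\Big(\dfrac{1}{\rho^2}\nabla h\Big) = 0 & \text{in } B,\\ \Delta h = 2\,\partial_1u\times\partial_2u & \text{in } B,\end{cases}\tag{16}$$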
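here, $B = \{x\in D;\ u(x)\neq0\}$. In [7], Brezis, Bethuel and Hélein consider the minimization of $E(u) = \frac12\int_D|\nabla u|^2\,dx$, the Dirichlet functional, in the class
$$E_{\mathbf{d}} := \big\{u\in H^1(D,S^1);\ \deg(u,D) = (\mathbf{d},d)\big\};$$
here, $d = \sum_k d_k$. Theorem I.1 in [7] gives the existence of a unique solution (up to multiplication by an $S^1$-constant) for the minimization of $E$ in $E_{\mathbf{d}}$. We denote $u_0$ this solution. This $u_0$ is also a solution of
$$\begin{cases}-\Delta v = v|\nabla v|^2 & \text{in } D,\\ v\times\partial_\nu v = 0 & \text{on }\partial D.\end{cases}\tag{17}$$
Moreover, we have
$$I_0(\mathbf{d},D) := \min_{u\in E_{\mathbf{d}}}E(u) = \frac12\int_D|\nabla h_0|^2\,dx$$
with $h_0$ the unique solution of
$$\begin{cases}\Delta h_0 = 0 & \text{in } D,\\ h_0 = 1 & \text{on }\partial\Omega,\\ h_0 = \mathrm{Cst} & \text{on }\partial\omega_k,\ k\in\{1,\dots,N\},\\ \displaystyle\int_{\partial\omega_k}\partial_\nu h_0\,d\sigma = 2\pi d_k & \text{for } k\in\{1,\dots,N\}.\end{cases}\tag{18}$$
One may prove that $h_0$ is the (globally defined) harmonic conjugate of a local lifting of $u_0$.

4. Energy needed to change degrees

We denote
$$\text{æ} : (\mathbb{Z}^N\times\mathbb{Z})\times(\mathbb{Z}^N\times\mathbb{Z})\to\mathbb{N},\qquad \big((\mathbf{d},d),(\mathbf{p},q)\big)\mapsto\sum_{i=1}^N|d_i-p_i| + |d-q|.$$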
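The map æ introduced above is a plain $\ell^1$ distance between two degree configurations. A minimal sketch of it follows; the function name and the example configuration are hypothetical and serve only to make the bookkeeping explicit.

```python
def degree_change_cost(d_vec, d_outer, p_vec, q):
    """Return sum_i |d_i - p_i| + |d - q| for configurations (d_vec, d_outer) and (p_vec, q)."""
    if len(d_vec) != len(p_vec):
        raise ValueError("d_vec and p_vec must have the same length N")
    return sum(abs(di - pi) for di, pi in zip(d_vec, p_vec)) + abs(d_outer - q)


# Hypothetical example with N = 2: inner degrees (2, 1) -> (1, 1), outer degree 3 -> 2.
example_cost = degree_change_cost((2, 1), 3, (1, 1), 2)
```

In the estimates of this section, each unit of this distance is paid for by an amount $\pi$ of energy.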
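The next result quantifies the energy needed to change degrees in the weak limit.

Lemma 3. (See [4, Lemma 1].) Let $(u_n)_n\subset J_{\mathbf{p},q}$ be a sequence weakly converging in $H^1$ to $u$. Then
$$\liminf_n E(u_n)\ge E(u) + \pi\,\text{æ}\big(\deg(u,D),(\mathbf{p},q)\big)\tag{19}$$
and
$$\liminf_n E_\varepsilon(u_n)\ge E_\varepsilon(u) + \pi\,\text{æ}\big(\deg(u,D),(\mathbf{p},q)\big).\tag{20}$$
The next lemma is proved in [5].

Lemma 4. Let $\mathbf{d} = (d_1,\dots,d_N)$, $\mathbf{p} = (p_1,\dots,p_N)\in\mathbb{Z}^N$, $q\in\mathbb{Z}$. There is $o_\varepsilon(1)\xrightarrow[\varepsilon\to0]{}0$ (depending on $(\mathbf{p},q,\mathbf{d})$) s.t. for $u\in J^{\mathbf{d}}_{\mathbf{p},q}$ we have
$$E_\varepsilon(u)\ge I_0(\mathbf{d},D) + \pi\,\text{æ}\big((\mathbf{d},d),(\mathbf{p},q)\big) - o_\varepsilon(1).\tag{21}$$
Here, $d := \sum_i d_i$.

We present below a simpler proof than the original one in [5].

Proof. Let $(\mathbf{p},q,\mathbf{d})\in\mathbb{Z}^N\times\mathbb{Z}\times\mathbb{Z}^N$. We argue by contradiction and we suppose that there are $\delta>0$, $\varepsilon_n\downarrow0$ and $(u_n)\subset J^{\mathbf{d}}_{\mathbf{p},q}$ s.t.
$$E_{\varepsilon_n}(u_n)\le I_0(\mathbf{d},D) + \pi\,\text{æ}\big((\mathbf{d},d),(\mathbf{p},q)\big) - \delta.\tag{22}$$
Since $(u_n)_n$ is bounded in $H^1$, there is some $u$ s.t., up to subsequence, $u_n\rightharpoonup u$ in $H^1$ and $u_n\to u$ in $L^4$. Using the strong convergence in $L^4$, (22) and Proposition 1, we have
$$u\in H^1(D,S^1)\cap J^{\mathbf{d}}_{\mathbf{d},d}.$$
To conclude, we apply Lemma 3:
$$\begin{aligned} I_0(\mathbf{d},D) + \pi\,\text{æ}\big((\mathbf{d},d),(\mathbf{p},q)\big) - \delta &\ge \liminf_n E_{\varepsilon_n}(u_n) \ge \liminf_n E(u_n)\\ &\ge E(u) + \pi\,\text{æ}\big((\mathbf{d},d),(\mathbf{p},q)\big) \ge I_0(\mathbf{d},D) + \pi\,\text{æ}\big((\mathbf{d},d),(\mathbf{p},q)\big), \end{aligned}$$
which is a contradiction. □

One may easily prove (see Lemma 14 in Appendix C) that for $\eta>0$, $i\in\{0,\dots,N\}$ and $u\in J_{\deg(u,D)}$, there are $v_\pm\in J_{\deg(u,D)\pm e_i}$ s.t.
$$E_\varepsilon(v_\pm)\le E_\varepsilon(u) + \pi + \eta.$$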
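The key ingredient is a sharper result which holds under two additional hypotheses. In order to unify the notations, we use the notation $\omega_0$ for $\Omega$. We may now state the main ingredient in the proof of Theorem 2.

Lemma 5. Let $u\in J_{\mathbf{p},q}$ be a solution of (8), (9). Assume that
$$\operatorname{abdeg}_j(u)\in\Big(d_j-\frac13,\ d_j+\frac13\Big),\quad\forall j\in\mathbb{N}_N.\tag{23}$$
Let $i\in\{0,\dots,N\}$ and assume that there is some point $x^i\in\partial\omega_i$ s.t. $u\times\partial_\tau u(x^i)>0$. Then there is $\tilde u\in J_{(\mathbf{p},q)-e_i}$ s.t.
$$E_\varepsilon(\tilde u)<E_\varepsilon(u)+\pi,\qquad \operatorname{abdeg}_j(\tilde u)\in\Big(d_j-\frac13,\ d_j+\frac13\Big),\quad\forall j\in\mathbb{N}_N.$$
The proof of Lemma 5 is postponed to Section 6. We also have an upper bound for $m_\varepsilon(\mathbf{p},q,\mathbf{d})$.

Lemma 6. Let $\varepsilon>0$ and $(\mathbf{p},q,\mathbf{d})\in\mathbb{Z}^N\times\mathbb{Z}\times\mathbb{Z}^N$. Then
$$m_\varepsilon(\mathbf{p},q,\mathbf{d})\le I_0(\mathbf{d},D) + \pi\,\text{æ}\big((\mathbf{d},d),(\mathbf{p},q)\big).\tag{24}$$
To prove Lemma 6, we need the following

Lemma 7. Let $u\in J$, $\varepsilon>0$ and $\delta = (\delta_1,\dots,\delta_N,\delta_0)\in\mathbb{Z}^{N+1}$. For all $\eta>0$, there is $u^\delta_\eta\in J_{\deg(u,D)+\delta}$ s.t.
$$E_\varepsilon\big(u^\delta_\eta\big)\le E_\varepsilon(u) + \pi\sum_{i\in\{0,\dots,N\}}|\delta_i| + \eta\tag{25}$$
and
$$\big\|u-u^\delta_\eta\big\|_{L^2(D)} = o_\eta(1),\qquad o_\eta(1)\xrightarrow[\eta\to0]{}0.\tag{26}$$
The proof of Lemma 7 is postponed to Appendix C.

Proof. We prove that for $\eta>0$ small, we have
$$m_\varepsilon(\mathbf{p},q,\mathbf{d})\le I_0(\mathbf{d},D) + \pi\,\text{æ}\big((\mathbf{d},d),(\mathbf{p},q)\big) + \eta.$$
We denote $u_0\in E_{\mathbf{d}}$ s.t. $E(u_0) = I_0(\mathbf{d},D)$. Then $\operatorname{abdeg}_i(u_0) = d_i$. Using Lemma 7 with $\delta = (\mathbf{p},q)-(\mathbf{d},d)$, there is $u_\eta$ s.t. $u_\eta\in J_{(\mathbf{p},q)}$ and
$$E_\varepsilon(u_\eta)\le E_\varepsilon(u_0) + \pi\,\text{æ}\big((\mathbf{d},d),(\mathbf{p},q)\big) + \eta = I_0(\mathbf{d},D) + \pi\,\text{æ}\big((\mathbf{d},d),(\mathbf{p},q)\big) + \eta.$$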
Furthermore, by (26), $\|u_0-u_\eta\|_{L^2(D)} = o_\eta(1)$. For $\eta$ small, by Proposition 1, we have $u_\eta\in J^{\mathbf{d}}_{\mathbf{p},q}$, which proves the lemma. □
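5. A family with bounded energy converges

In this section we discuss:
1. the asymptotic behaviour of a sequence of solutions of (8), (9), $(u_{\varepsilon_n})_n\subset J^{\mathbf{d}}_{\mathbf{p},q}$ ($\varepsilon_n\downarrow0$) with bounded energy, i.e., $E_{\varepsilon_n}(u_{\varepsilon_n})\le\Lambda$,
2. the asymptotic behaviour of a minimizing sequence of $E_\varepsilon$ in $J^{\mathbf{d}}_{\mathbf{p},q}$,
3. a fundamental lemma.

Proposition 3. Let $\varepsilon_n\downarrow0$, $(u_{\varepsilon_n})_n\subset J^{\mathbf{d}}_{\mathbf{p},q}$ with $u_{\varepsilon_n}$ a solution of (8), (9), s.t. for $\Lambda>0$, we have $E_{\varepsilon_n}(u_{\varepsilon_n})\le\Lambda$. Then, denoting $h_{\varepsilon_n}$ the unique solution of (15) with $u = u_{\varepsilon_n}$, we have
$$h_{\varepsilon_n}\rightharpoonup h_0\quad\text{in } H^1(D),\tag{27}$$
where $h_0$ is the unique solution of (18). Up to subsequence, it holds
$$u_{\varepsilon_n}\rightharpoonup u_0\quad\text{in } H^1(D),\tag{28}$$
where $u_0\in E_{\mathbf{d}}$ is the unique solution of (17) up to multiplication by an $S^1$-constant.

Proof. Using the energy bound on $u_{\varepsilon_n}$ and a Poincaré type inequality, we have, up to subsequence,
$$h_{\varepsilon_n}\rightharpoonup h\quad\text{in } H^1.$$
In order to establish (27), it suffices to prove that $h = h_0$. The set
$$H := \{h\in H^1(D,\mathbb{R});\ \partial_\tau h\equiv0 \text{ on }\partial D \text{ and } h|_{\partial\Omega}\equiv1\}$$
is closed convex in $H^1(D,\mathbb{R})$. Since $(h_{\varepsilon_n})_n\subset H$, we find that $h\in H$. Since $E_{\varepsilon_n}(u_{\varepsilon_n})$ is bounded, Lemma 1 implies that $u_{\varepsilon_n}$ is bounded in $C^2_{\mathrm{loc}}(D,\mathbb{R}^2)$. Therefore there is some $u\in C^1_{\mathrm{loc}}(D,\mathbb{C})$ s.t., up to subsequence, $u_{\varepsilon_n}\to u$ in $C^1_{\mathrm{loc}}(D,\mathbb{R}^2)$, in $L^4(D,\mathbb{R}^2)$ and weakly in $H^1(D,\mathbb{R}^2)$. Using the strong convergence in $L^4$ and the energy bound on $u_{\varepsilon_n}$, we find that $u\in H^1(D,S^1)$. It follows that
$$\partial_1u\times\partial_2u = 0\quad\text{in } D.$$
On the other hand,
$$\Delta h_{\varepsilon_n} = 2\,\partial_1u_{\varepsilon_n}\times\partial_2u_{\varepsilon_n}\to0\quad\text{in } C^0_{\mathrm{loc}}.$$
Therefore, $h$ is a harmonic function in $D$. In order to show that $h = h_0$, it suffices to check that
$$\int_{\partial\omega_i}\partial_\nu h\,d\sigma = 2\pi d_i.$$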
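To this end, we note that, since $u_{\varepsilon_n}\times(\partial_1V_i\,\partial_2u_{\varepsilon_n} - \partial_2V_i\,\partial_1u_{\varepsilon_n}) = \nabla V_i\cdot\nabla h_{\varepsilon_n}$, we have
$$2\pi\operatorname{abdeg}_i(u_{\varepsilon_n}) = \int_D\nabla V_i\cdot\nabla h_{\varepsilon_n}\,dx\xrightarrow[n\to\infty]{}\int_D\nabla V_i\cdot\nabla h\,dx = \int_{\partial D\setminus\partial\omega_i}\partial_\nu h\,d\sigma.$$
Noting that, by Proposition 1,
$$\operatorname{abdeg}_i(u_{\varepsilon_n})\xrightarrow[n\to\infty]{}\operatorname{abdeg}_i(u) = \deg_{\partial\omega_i}(u),\qquad \operatorname{abdeg}_i(u_{\varepsilon_n})\xrightarrow[n\to\infty]{}d_i,$$
and that $0 = \int_D\Delta h\,dx = \int_{\partial D}\partial_\nu h\,d\sigma$, we obtain
$$\int_{\partial D\setminus\partial\omega_i}\partial_\nu h\,d\sigma = \int_{\partial\omega_i}\partial_\nu h\,d\sigma = 2\pi d_i = 2\pi\deg_{\partial\omega_i}(u).$$
In the first integral, $\nu$ is the outward normal to $D$; in the second, $\nu$ is the outward normal to $\omega_i$. This proves (27).

We next turn to (28). Let $u_0$ be s.t., up to subsequence, $u_{\varepsilon_n}\rightharpoonup u_0$ in $H^1(D)$. Since $|u_{\varepsilon_n}|\le1$, we find that
$$u_{\varepsilon_n}\times\nabla u_{\varepsilon_n}\rightharpoonup u_0\times\nabla u_0\quad\text{in } L^2(D).$$
In view of (15) and (27), we have $u_0\times\nabla u_0 = \nabla^\perp h_0$. Therefore, $E(u_0) = E(h_0) = I_0(\mathbf{d},D)$. Proposition 1 implies that $u_0\in E_{\mathbf{d}}$. Then $u_0$ is the unique, up to multiplication by an $S^1$-constant, minimizer of $E$ in $E_{\mathbf{d}}$. □

Proposition 4. Let $(\mathbf{p},q,\mathbf{d})\in\mathbb{Z}^N\times\mathbb{Z}\times\mathbb{Z}^N$. For $\varepsilon>0$, let $(u^\varepsilon_n)_{n\ge0}\subset J^{\mathbf{d}}_{\mathbf{p},q}$ be a minimizing sequence of $E_\varepsilon$ in $J^{\mathbf{d}}_{\mathbf{p},q}$. Then there is $\varepsilon_4(\mathbf{p},q,\mathbf{d})>0$ s.t. for $0<\varepsilon<\varepsilon_4$, up to subsequence, $u_n\rightharpoonup u$ in $H^1$ with $u$ which minimizes $E_\varepsilon$ in $J^{\mathbf{d}}_{\deg(u,D)}$.

Proof. For $\varepsilon>0$, let $(u^\varepsilon_n)_n\subset J^{\mathbf{d}}_{\mathbf{p},q}$ be minimizing sequences of $E_\varepsilon$ in $J^{\mathbf{d}}_{\mathbf{p},q}$. Up to subsequence, using Proposition 1,
$$u^\varepsilon_n\rightharpoonup u^\varepsilon\quad\text{in } H^1\quad\text{with } u^\varepsilon\in J^{\mathbf{d}}_{\deg(u^\varepsilon,D)}.$$
Using Lemma 3, we see that $\{\deg(u^\varepsilon,D),\ \varepsilon>0\}\subset\mathbb{Z}^N\times\mathbb{Z}$ is a finite set. Applying Lemma 6, it follows that $E_\varepsilon(u^\varepsilon)$ is bounded. Therefore, with Proposition 1, there is $\varepsilon_4>0$ s.t. $|\operatorname{abdeg}_i(u^\varepsilon)-d_i|<\frac13$ for all $i\in\mathbb{N}_N$ and $0<\varepsilon<\varepsilon_4$. We argue by contradiction and we assume that there is $\varepsilon<\varepsilon_4$ s.t. $E_\varepsilon(u^\varepsilon) = m_\varepsilon(\deg(u^\varepsilon,D),\mathbf{d}) + 2\eta$, $\eta>0$. Let $u\in J^{\mathbf{d}}_{\deg(u^\varepsilon,D)}$ be s.t. $E_\varepsilon(u)\le m_\varepsilon(\deg(u^\varepsilon,D),\mathbf{d}) + \eta$. Using Lemma 7 with $\delta = (\mathbf{p},q)-\deg(u^\varepsilon,D)$, there is $v\in J_{\mathbf{p},q}$ s.t.
$$E_\varepsilon(v)<E_\varepsilon(u) + \pi\,\text{æ}\big((\mathbf{p},q),\deg(u^\varepsilon,D)\big) + \eta.$$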
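Furthermore, by the construction, $\|u-v\|_{L^2}$ can be taken arbitrarily small, so that we may further assume $v\in J^{\mathbf{d}}_{\mathbf{p},q}$. To summarise, we have
$$\begin{aligned} m_\varepsilon(\mathbf{p},q,\mathbf{d}) &= \liminf_n E_\varepsilon\big(u^\varepsilon_n\big) \ge E_\varepsilon\big(u^\varepsilon\big) + \pi\,\text{æ}\big((\mathbf{p},q),\deg(u^\varepsilon,D)\big)\\ &= m_\varepsilon\big(\deg(u^\varepsilon,D),\mathbf{d}\big) + 2\eta + \pi\,\text{æ}\big((\mathbf{p},q),\deg(u^\varepsilon,D)\big)\\ &\ge E_\varepsilon(u) + \pi\,\text{æ}\big((\mathbf{p},q),\deg(u^\varepsilon,D)\big) + \eta > E_\varepsilon(v)\ge m_\varepsilon(\mathbf{p},q,\mathbf{d}). \end{aligned}$$
This contradiction completes the proof. □

The main tool requires the following lemma.

Lemma 8. Let $(\mathbf{p},q,\mathbf{d})\in\mathbb{Z}^N\times\mathbb{Z}\times\mathbb{Z}^N$ and $\Lambda>0$. There is $\varepsilon_5(\mathbf{p},q,\mathbf{d},\Lambda)>0$ s.t. for $\varepsilon<\varepsilon_5$ and $u\in J^{\mathbf{d}}_{\mathbf{p},q}$ a solution of (8) and (9) with $E_\varepsilon(u)\le\Lambda$, if $d>0$ (respectively $d_i>0$), then there is $x^0\in\partial\Omega$ (respectively $x^i\in\partial\omega_i$) s.t. $u\times\partial_\tau u(x^0)>0$ (respectively $u\times\partial_\tau u(x^i)>0$).

Proof. We prove existence of $x^0\in\partial\Omega$ under appropriate assumptions. Existence of $x^i$ is similar. We argue by contradiction. Assume that there are $\varepsilon_n\downarrow0$ and $(u_n)\subset J^{\mathbf{d}}_{\mathbf{p},q}$ solutions of (8) and (9) with $E_{\varepsilon_n}(u_n)\le\Lambda$ s.t. $u_n\times\partial_\tau u_n\le0$ on $\partial\Omega$. Since $q = \frac{1}{2\pi}\int_{\partial\Omega}u_n\times\partial_\tau u_n$, we have $q\le0$. Up to subsequence, by Proposition 3, we can assume that $u_n\to u_0$ a.e. with $u_0$ the unique solution up to $S^1$ of (17). Let $x_0\in\partial\Omega$ and let $\gamma:\partial\Omega\to[0,\mathcal{H}^1(\partial\Omega)[\ = I$ be s.t. $\gamma^{-1}$ is the direct arc-length parametrization of $\partial\Omega$ with the origin at $x_0$. We denote $\theta_n: I\to\mathbb{R}$ the smooth functions s.t.
$$u_n(x) = e^{\imath\theta_n[\gamma(x)]}\quad\forall x\in\partial\Omega,\qquad 0\le\theta_n(0)<2\pi.$$
Then, for all $n$, $\theta_n$ is nonincreasing and $\theta_n\in[\theta_n(0)+2\pi q,\theta_n(0)]\subset[2\pi q,2\pi]$. Using Helly's selection theorem, up to subsequence, we can assume that $\theta_n\to\theta$ everywhere on $I$ with $\theta$ nonincreasing. Denote $\Xi$ the set of discontinuity points of $\theta$. Since $\theta$ is nonincreasing, $\Xi$ is a countable set. Using the monotonicity of $\theta$, we can consider the following decomposition
$$\theta = \theta^c + \theta^\delta,\quad\text{where } \theta^c \text{ and } \theta^\delta \text{ are nonincreasing functions.}$$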
θ c is the continuous part of θ and θ δ is the jump function. The set of discontinuity points of θ δ is Ξ . For t ∈ / Ξ, θ (s+) − θ (s−) 1(−∞,s] (u). θ δ (t) = 0<s
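We obtain easily that $u_0(x) = e^{\imath\theta[\gamma(x)]}$ a.e. $x\in\partial\Omega$. Since $u_0$, $\theta_n$ and $\gamma$ have side limits at each point and $u_0 = e^{\imath\theta\circ\gamma}$ a.e., we find that
$$u_0(x\pm) = e^{\imath\theta[\gamma(x\pm)]}\quad\text{for each } x\in\partial\Omega.$$
Using the continuity of $u_0$, we obtain $e^{\imath\theta[\gamma(x+)]} = e^{\imath\theta[\gamma(x-)]}$ $\forall x\in\partial\Omega$, which implies that
$$\theta[\gamma(x+)] - \theta[\gamma(x-)]\in2\pi\mathbb{Z},\quad\forall x\in\partial\Omega.$$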
For t ∈ / Ξ, θ δ (t) =
θ (s+) − θ (s−) 1(−∞,s] (u) ∈ 2πZ.
0<s
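Then
$$u_0(x)e^{-\imath\theta^c[\gamma(x)]} = e^{\imath\theta^\delta[\gamma(x)]} = 1\quad\text{a.e. } x\in\partial\Omega.$$
Finally, $u_0(x) = e^{\imath\theta^c[\gamma(x)]}$ a.e. $x\in\partial\Omega$, which is equivalent (using the continuity of the functions) to $u_0 = e^{\imath\theta^c}$. We have a contradiction observing that
$$0<2\pi\deg_{\partial\Omega}(u_0) = 2\pi d = \theta^c\big(\mathcal{H}^1(\partial\Omega)\big) - \theta^c(0)$$
and using the fact that $\theta^c$ is nonincreasing. □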
6. Proof of Lemma 5

We prove only the part of the lemma concerning $\partial\Omega$. The proof for the other connected components of $\partial D$ is similar. For the reader's convenience, we state the part of Lemma 5 that we will actually prove.

Lemma 5. Let $u\in J_{\mathbf{p},q}$ be a solution of (8) and (9). Assume that
$$\operatorname{abdeg}_j(u)\in\Big(d_j-\frac13,\ d_j+\frac13\Big),\quad\forall j\in\mathbb{N}_N,\tag{23}$$
and that there is some point $x^0\in\partial\Omega$ s.t. $u\times\partial_\tau u(x^0)>0$. Then there is $\tilde u\in J_{(\mathbf{p},q-1)}$ s.t.
$$E_\varepsilon(\tilde u)<E_\varepsilon(u)+\pi,\qquad \operatorname{abdeg}_j(\tilde u)\in\Big(d_j-\frac13,\ d_j+\frac13\Big),\quad\forall j\in\mathbb{N}_N.$$
Fig. 1. Decomposition of D.
6.1. Decomposition of D

By hypothesis, there is some $x^0\in\partial\Omega$ s.t. $\partial_\nu h(x^0)>0$. Without loss of generality, we may assume that $u(x^0) = 1$. (See Fig. 1.) Then there is $\Upsilon\subset D$, a compact neighbourhood of $x^0$, simply connected and with nonempty interior, s.t.:
• $\gamma := \partial\Omega\cap\partial\Upsilon$ is connected with nonempty interior;
• $x^0$ is an interior point of $\gamma$;
• $|\nabla h|>0$, $\rho>0$, $h\le1$ in $\Upsilon$;
• $\partial_\nu h>0$ on $\gamma$ ($\nu$ the outward normal of $\Omega$).
It follows that, in $\Upsilon$, $\theta$ is globally defined (we take the determination of $\theta$ which vanishes at $x^0$). Using the inverse function theorem, we may assume, by further restricting $\Upsilon$, that there are some $0<\eta,\delta<1$ s.t.
$$\Upsilon = \big\{x\in D \text{ s.t. } \operatorname{dist}(x,x^0)<\eta,\ 1-\delta\le h(x)\le1,\ -2\delta\le\theta(x)\le2\delta\big\}.$$
We may further assume that, by replacing $\delta$ by a smaller value if necessary and denoting $D_\delta := \mathring{\Upsilon}$, we have
(i) $\Theta := (\theta,h)|_{D_\delta} : D_\delta\to(-2\delta,2\delta)\times(1-\delta,1)$, $x\mapsto(\theta,h)$, is a $C^1$-diffeomorphism,
(ii) $\partial D_\delta\setminus(\{h=1\}\cup\{h=1-\delta\}) = \partial D_\delta\cap(\{\theta=-2\delta\}\cup\{\theta=2\delta\})$,
(iii) $D_\delta$ is a Lipschitz domain.
We consider $\delta_0>0$ s.t. for $\delta<\delta_0$,
$$|D_\delta|^{1/2}<\frac{\pi\,\big|\,\|\operatorname{abdeg}(u)-\mathbf{d}\|_\infty-\frac13\big|}{6\max_i\|V_i\|_{C^1(D)}\,(E_\varepsilon(u)+\pi)^{1/2}}.\tag{29}$$
Using Proposition 1 and (29), if $v\in H^1(D,\mathbb{C})$ satisfies $u = v$ in $D\setminus D_\delta$, $|v|\le2$ in $D$ and $E_\varepsilon(v)<E_\varepsilon(u)+\pi$, then we have $\operatorname{abdeg}_i(v)\in(d_i-1/3,d_i+1/3)$. We let $\delta<\delta_0$ and we denote
$$D_\delta := \Theta^{-1}\big((-\delta,\delta)\times(1-\delta,1)\big),\quad D_\delta^- := \Theta^{-1}\big((-2\delta,-\delta)\times(1-\delta,1)\big),\quad D_\delta^+ := \Theta^{-1}\big((\delta,2\delta)\times(1-\delta,1)\big),$$
so that $D_\delta$, $D_\delta^-$ and $D_\delta^+$ are Lipschitz domains.

6.2. Construction of the test function

We consider an application (with unknown expression in $D_\delta$) $\psi_t : D\to\mathbb{C}$ ($t>0$ smaller than $\delta$) s.t.
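$$\psi_t(x) = \begin{cases}1 & \text{in } D\setminus D_\delta,\\[2pt] \dfrac{e^{-\imath\theta}-(1-t\varphi(\theta))}{e^{-\imath\theta}(1-t\varphi(\theta))-1} & \text{on }\partial\Omega\cap\partial D_\delta,\end{cases}\tag{30}$$
with $0\le\varphi\le1$ a smooth, even and $2\pi$-periodic function satisfying $\varphi|_{(-\delta/2,\delta/2)}\equiv1$ and $\varphi|_{[-\pi,\pi[\setminus(-\delta,\delta)}\equiv0$. It is clear that
$$\psi_t|_{\partial D}\in C^\infty(\partial D)\quad\text{and}\quad\deg_{\partial\omega_i}(\psi_t) = 0\ \text{ for all } i\in\mathbb{N}_N.\tag{31}$$
Expanding in Fourier series, we have
$$\frac{e^{-\imath\theta}-(1-t\varphi(\theta))}{e^{-\imath\theta}(1-t\varphi(\theta))-1} = \big(1-tb_{-1}(t)\big) + t\sum_{k\neq-1}b_k(t)e^{-(k+1)\imath\theta}.\tag{32}$$
Noting that the real part of $\frac{e^{-\imath\theta}-(1-t\varphi(\theta))}{e^{-\imath\theta}(1-t\varphi(\theta))-1}$ is even and the imaginary part is odd, we obtain that $b_k(t)\in\mathbb{R}$ for all $k$, $t$. The following lemma is proven in Appendix B.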
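Lemma 9. We denote, for $e^{\imath\theta}\in S^1$,
$$\Psi_t\big(e^{\imath\theta}\big) = \frac{e^{-\imath\theta}-(1-t\varphi(\theta))}{e^{-\imath\theta}(1-t\varphi(\theta))-1}\quad\text{and}\quad F_t\big(e^{\imath\theta}\big) = \frac{e^{-\imath\theta}-(1-t)}{e^{-\imath\theta}(1-t)-1}.$$
Then:
(1) $|\Psi_t-F_t|\le C_\delta t$ on $S^1$;
(2) $F_t(z) = \dfrac{\bar z-(1-t)}{\bar z(1-t)-1} = (1-tc_{-1}) + t\sum_{k\neq-1}c_k(t)\bar z^{\,k+1}$, with
$$c_k = \begin{cases}(t-2)(1-t)^k & \text{if } k\ge0,\\ 0 & \text{if } k\le-2,\\ 1 & \text{if } k=-1;\end{cases}$$
(3) $|b_k(t)-c_k(t)|\le C(n,\delta)(1+|k|)^{-n}$, $\forall n>0$, with $C(n,\delta)$ independent of $t$ sufficiently small.

It is easy to see using Lemma 9 that, for $t$ sufficiently small, $\deg_{S^1}(\Psi_t) = \deg_{S^1}(F_t) = -1$. Using the previous equality and the fact that $\partial_\tau\theta>0$ on $\gamma$, we find that
$$\deg_{\partial\Omega}(\psi_t) = -1.\tag{33}$$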
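The power series in Lemma 9 (2) can be checked numerically. The sketch below works with the variable $\zeta = e^{-\imath\theta}$, so that $F_t(e^{\imath\theta}) = (\zeta-(1-t))/(\zeta(1-t)-1)$, and compares this closed form with a truncation of $(1-tc_{-1}) + t\sum_{k\ge0}c_k\zeta^{k+1}$. The function names, the truncation length and the sample values of $t$ and $\theta$ are assumptions made for the illustration.

```python
import cmath


def moebius_symbol(zeta: complex, t: float) -> complex:
    """(zeta - (1 - t)) / (zeta*(1 - t) - 1), i.e. F_t(e^{i theta}) with zeta = e^{-i theta}."""
    return (zeta - (1.0 - t)) / (zeta * (1.0 - t) - 1.0)


def c_coeff(k: int, t: float) -> float:
    """Coefficients c_k(t) as listed in Lemma 9 (2)."""
    if k >= 0:
        return (t - 2.0) * (1.0 - t) ** k
    if k == -1:
        return 1.0
    return 0.0


def truncated_series(zeta: complex, t: float, terms: int = 2000) -> complex:
    """(1 - t*c_{-1}) + t * sum_{k=0}^{terms-1} c_k * zeta^(k+1)."""
    total = 1.0 - t * c_coeff(-1, t)
    power = zeta
    for k in range(terms):
        total += t * c_coeff(k, t) * power
        power *= zeta
    return total


t_sample = 0.05                         # illustrative value of t
zeta_sample = cmath.exp(-0.7j)          # zeta = e^{-i theta} with theta = 0.7
series_error = abs(moebius_symbol(zeta_sample, t_sample)
                   - truncated_series(zeta_sample, t_sample))
```

Since $|c_k|$ decays like $(1-t)^k$, the number of terms needed for a given accuracy grows roughly like $1/t$ as $t\to0$.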
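It will be convenient to use $h$ and $\theta$ as a shorthand for $h(x)$ and $\theta(x)$. With these notations, we will look for $\psi_t$ of the form
$$\psi_t(x) = \tilde\psi_t(h,\theta) = \begin{cases}\big(1-tf_{-1}(h)b_{-1}(t)\big) + t\displaystyle\sum_{k\neq-1}b_k(t)f_k(h)e^{-(k+1)\imath\theta} & \text{in } D_\delta,\\[4pt] \dfrac{\theta-\delta}{\delta} + \tilde\psi_t(h,\delta)\dfrac{2\delta-\theta}{\delta} & \text{in } D_\delta^+,\\[4pt] -\dfrac{\theta+\delta}{\delta} + \tilde\psi_t(h,-\delta)\dfrac{2\delta+\theta}{\delta} & \text{in } D_\delta^-.\end{cases}\tag{34}$$
We impose $f_k(1-\delta) = 0$ and $f_k(1) = 1$ for $k\in\mathbb{Z}$. Our aim is to show that for $t>0$ small and appropriate $f_k$'s, the function $\psi_t$ defined by (34) satisfies (30) and
$$L_\varepsilon\big(\psi_te^{\imath\theta},D_\delta\big)<\pi.\tag{35}$$
Here, $L_\varepsilon$ is the functional defined in Lemma 2, so that
$$E_\varepsilon\big(\rho\psi_te^{\imath\theta},D_\delta\big) = E_\varepsilon(u,D_\delta) + L_\varepsilon\big(\psi_te^{\imath\theta},D_\delta\big).$$
Then, considering
$$\underline{\psi}_t = \begin{cases}\psi_t & \text{if } |\psi_t|\le2,\\[2pt] 2\dfrac{\psi_t}{|\psi_t|} & \text{if } |\psi_t|>2\end{cases}$$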
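and set
$$\tilde u = \begin{cases}\rho\,\underline{w}_t = \underline{\psi}_t\,u & \text{in } D_\delta,\\ u & \text{in } D\setminus D_\delta.\end{cases}$$
In view of (35), it is straightforward that $\tilde u$ satisfies the conclusion of Lemma 5.

6.3. Upper bound for $L_\varepsilon(\cdot,D_\delta)$. An auxiliary problem

If we let $\tilde w : [1-\delta,1]\times[-2\delta,2\delta]\to\mathbb{C}$ be s.t. $\tilde w(h(x),\theta(x)) := w(x)$, then we have
$$|\nabla w|^2 = \sum_i|\partial_iw|^2 = \sum_i\big|\partial_h\tilde w(h,\theta)\,\partial_ih + \partial_\theta\tilde w(h,\theta)\,\partial_i\theta\big|^2 = \Big(\rho^4\big|\partial_h\tilde w(h,\theta)\big|^2 + \big|\partial_\theta\tilde w(h,\theta)\big|^2\Big)|\nabla\theta|^2.$$
Therefore,
$$\begin{aligned}L_\varepsilon(w,D_\delta) &= \frac12\int_{D_\delta}\Big\{\Big(\rho^4\big|\partial_h\tilde w(h,\theta)\big|^2 + \big|\partial_\theta\tilde w(h,\theta)\big|^2 - \big|\tilde w(h,\theta)\big|^2\Big)\rho^2|\nabla\theta|^2 + \frac{1}{2\varepsilon^2}\rho^4\big(1-|\tilde w(h,\theta)|^2\big)^2\Big\}\,dx\\ &\le\frac12\int_{D_\delta}\Big\{\big|\partial_h\tilde w(h,\theta)\big|^2 + \lambda\big|e^{\imath\theta}-\tilde w(h,\theta)\big|^2 + \big|\partial_\theta\tilde w(h,\theta)\big|^2 - \big|\tilde w(h,\theta)\big|^2\Big\}\rho^2|\nabla\theta|^2\,dx\\ &=: M_\lambda(w,D_\delta),\end{aligned}\tag{36}$$
provided that $|w|\le2$ in $D_\delta$ and $\lambda\ge\dfrac{9}{2\varepsilon^2\inf_{D_\delta}|\nabla\theta|^2}$.

In order to simplify formulas, we will write, in what follows, the second integral in (36) as
$$\frac12\int_{D_\delta}\Big(|\partial_h\tilde w|^2 + |\partial_\theta\tilde w|^2 - |\tilde w|^2 + \lambda\big|e^{\imath\theta}-\tilde w\big|^2\Big)\rho^2|\nabla\theta|^2\,dx.$$
The same simplified notation will be implicitly used for similar integrals.

Claim. If we replace $w$ by $\underline{w} := \frac{w}{|w|}\min(|w|,2)$, then $M_\lambda$ does not increase. Furthermore, replacing $w$ by $\underline{w}$ does not affect the Dirichlet condition of (30). Therefore, by replacing $w$ by $\underline{w}$ if necessary, we may assume $|w|\le2$.

We next state a lemma which allows us to give a new form of $M_\lambda$.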
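Lemma 10. Let $f\in C^1(\mathbb{R},\mathbb{R})$. Then, for $k\in\mathbb{Z}$, we have
$$\int_{D_\delta}f(h)\cos(k\theta)\rho^2|\nabla\theta|^2\,dx = \begin{cases}2\delta\displaystyle\int_{1-\delta}^1f(s)\,ds & \text{if } k=0,\\[6pt] \dfrac{2\sin(k\delta)}{k}\displaystyle\int_{1-\delta}^1f(s)\,ds & \text{if } k\neq0,\end{cases}$$
$$\int_{D_\delta^\pm}f(h)\rho^2|\nabla\theta|^2\,dx = \delta\int_{1-\delta}^1f(s)\,ds.$$
Proof. This result is easily obtained by noting that the jacobian of the change of variable $x\mapsto(\theta(x),h(x))$ is exactly $\rho^2|\nabla\theta|^2$. □

For $w = w_t = \psi_te^{\imath\theta}$ where $\psi_t$ is of the form given by (34), we have
$$M_\lambda(w,D_\delta) = \frac12\int_{D_\delta}\Big(|\partial_h\tilde w|^2 + |\partial_\theta\tilde w|^2 - |\tilde w|^2 + \lambda\big|e^{\imath\theta}-\tilde w\big|^2\Big)\rho^2|\nabla\theta|^2\,dx.$$
We next rewrite $M_\lambda(w_t,D_\delta)$. Recall that for a sequence $\{a_k\}\subset\mathbb{R}$, we have
$$\Big|\sum_{k\in\mathbb{Z}}a_ke^{\imath k\theta}\Big|^2 = \sum_{k\in\mathbb{Z}}a_k^2 + 2\sum_{\substack{k,l\in\mathbb{Z}\\ k>l}}a_ka_l\cos\big((k-l)\theta\big).$$
Then we obtain
$$\begin{aligned}M_\lambda(w,D_\delta) = \int_{D_\delta}\Big\{&\frac{t^2}{2}\sum_{k\in\mathbb{Z}}b_k^2\big(f_k'^2 + f_k^2(k^2+\lambda-1)\big) - t\sum_{k\neq-1}b_kf_k(k+1)\cos\big((k+1)\theta\big)\\ &- t^2\sum_{k\neq-1}b_{-1}b_k\big(f_{-1}'f_k' - f_{-1}f_k(k-\lambda+1)\big)\cos\big((k+1)\theta\big)\\ &+ t^2\sum_{\substack{k,l\neq-1\\ k-l>0}}b_kb_l\big(f_k'f_l' + (kl+\lambda-1)f_kf_l\big)\cos\big((k-l)\theta\big)\Big\}\rho^2|\nabla\theta|^2.\end{aligned}\tag{37}$$
Using Lemma 10 and (37), we have
$$\begin{aligned}M_\lambda(w,D_\delta) ={}&\delta t^2\sum_{k\in\mathbb{Z}}b_k^2\phi_k(f_k) - 2t\sum_{k\neq-1}b_k\sin\big((k+1)\delta\big)\int_{1-\delta}^1f_k\\ &- 2t^2\sum_{k\neq-1}b_{-1}b_k\frac{\sin[(k+1)\delta]}{k+1}\int_{1-\delta}^1\big(f_{-1}'f_k' - (k-\lambda+1)f_{-1}f_k\big)\\ &+ 2t^2\sum_{\substack{k,l\neq-1\\ k-l>0}}b_kb_l\frac{\sin[(k-l)\delta]}{k-l}\int_{1-\delta}^1\big(f_k'f_l' + (kl+\lambda-1)f_kf_l\big)\end{aligned}\tag{38}$$
$$= R_\lambda(w) - 2t\sum_{k\neq-1}b_k\sin\big((k+1)\delta\big)\int_{1-\delta}^1f_k\tag{39}$$
with
$$\begin{aligned}R_\lambda(w) ={}&\delta t^2\sum_{k\in\mathbb{Z}}b_k^2\phi_k(f_k) - 2t^2\sum_{k\neq-1}b_{-1}b_k\frac{\sin[(k+1)\delta]}{k+1}\int_{1-\delta}^1\big(f_{-1}'f_k' - (k-\lambda+1)f_{-1}f_k\big)\\ &+ 2t^2\sum_{\substack{k,l\neq-1\\ k-l>0}}b_kb_l\frac{\sin[(k-l)\delta]}{k-l}\int_{1-\delta}^1\big(f_k'f_l' + (kl+\lambda-1)f_kf_l\big),\end{aligned}$$
$$\phi_k(f) = \int_{1-\delta}^1\big(f'^2 + \alpha_k^2f^2\big)\quad\text{and}\quad\alpha_k = \sqrt{k^2+\lambda-1}.$$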
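We next establish a similar identity for $M_\lambda(w_t,D_\delta^\pm)$. Using (34), we have
$$\begin{aligned}M_\lambda\big(w_t,D_\delta^\pm\big) &= \frac12\int_{D_\delta^\pm}\Big(\big|\partial_h\tilde w(h,\theta)\big|^2 + \big|\partial_\theta\tilde w(h,\theta)\big|^2 - |w|^2 + \lambda\big|e^{\imath\theta}-w\big|^2\Big)\rho^2|\nabla\theta|^2\\ &= \frac12\int_{D_\delta^\pm}\Big\{\big|\partial_h\tilde\psi_t(h,\pm\delta)\big|^2\Big(\frac{2\delta\mp\theta}{\delta}\Big)^2 + \delta^{-2}\big(1+\lambda(2\delta\mp\theta)^2\big)\big|\tilde\psi_t(h,\pm\delta)-1\big|^2 \mp 2\delta^{-1}\operatorname{Im}\tilde\psi_t(h,\pm\delta)\Big\}\rho^2|\nabla\theta|^2\\ &= \frac{1}{2\delta^2}\int_{D_\delta^\pm}\Big\{\big|\partial_h\tilde\psi_t(h,\pm\delta)\big|^2(2\delta\mp\theta)^2 + \big(1+\lambda(2\delta\mp\theta)^2\big)\big|\tilde\psi_t(h,\pm\delta)-1\big|^2\Big\}\rho^2|\nabla\theta|^2\\ &\quad + t\sum_{k\neq-1}b_k(t)\sin\big((k+1)\delta\big)\int_{1-\delta}^1f_k,\end{aligned}\tag{40}$$
where $\operatorname{Im}\psi$ denotes the imaginary part of $\psi$. To obtain (40), we used the identity
$$\big|\partial_\theta\big(\psi e^{\imath\theta}\big)\big|^2 = |\partial_\theta\psi|^2 + |\psi|^2 + 2\psi\times\partial_\theta\psi.$$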
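6.4. Choice of $w = \psi_te^{\imath\theta}$

We take
$$f_k(h) = \frac{e^{\alpha_k(h-1)}}{1-e^{-2\alpha_k\delta}} + \frac{e^{-\alpha_k(h-1)}}{1-e^{2\alpha_k\delta}}.\tag{41}$$
With this choice, by direct computations we have
$$\phi_k(f_k) = \alpha_k\Big(1+\frac{2}{e^{2\alpha_k\delta}-1}\Big),\tag{42}$$
$$\int_{1-\delta}^1f_k = \frac{1}{\alpha_k}\Big(1-\frac{2}{e^{\alpha_k\delta}+1}\Big)\tag{43}$$
and, for $k,l\in\mathbb{Z}$ s.t. $k\neq\pm l$,
$$\int_{1-\delta}^1f_kf_l = \frac{1-e^{-2(\alpha_k+\alpha_l)\delta}}{(\alpha_k+\alpha_l)(1-e^{-2\alpha_k\delta})(1-e^{-2\alpha_l\delta})} - \frac{1-e^{-2(\alpha_k-\alpha_l)\delta}}{(\alpha_k-\alpha_l)(1-e^{-2\alpha_k\delta})(e^{2\alpha_l\delta}-1)},\tag{44}$$
$$\frac{1}{\alpha_k\alpha_l}\int_{1-\delta}^1f_k'f_l' = \frac{1-e^{-2(\alpha_k+\alpha_l)\delta}}{(\alpha_k+\alpha_l)(1-e^{-2\alpha_k\delta})(1-e^{-2\alpha_l\delta})} + \frac{1-e^{-2(\alpha_k-\alpha_l)\delta}}{(\alpha_k-\alpha_l)(1-e^{-2\alpha_k\delta})(e^{2\alpha_l\delta}-1)}.\tag{45}$$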
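The closed forms (42) and (43) for the profile (41) can be cross-checked by numerical quadrature. The sketch below does this with a composite Simpson rule for one illustrative choice of $k$, $\lambda$ and $\delta$; these numerical values, the helper names and the quadrature rule are assumptions of the example, not data from the paper.

```python
import math


def f(h, alpha, delta):
    """f_k(h) from (41), written with alpha = alpha_k."""
    return (math.exp(alpha * (h - 1.0)) / (1.0 - math.exp(-2.0 * alpha * delta))
            + math.exp(-alpha * (h - 1.0)) / (1.0 - math.exp(2.0 * alpha * delta)))


def f_prime(h, alpha, delta):
    """Derivative of f_k with respect to h."""
    return alpha * (math.exp(alpha * (h - 1.0)) / (1.0 - math.exp(-2.0 * alpha * delta))
                    - math.exp(-alpha * (h - 1.0)) / (1.0 - math.exp(2.0 * alpha * delta)))


def simpson(g, a, b, n=2000):
    """Composite Simpson rule with an even number n of subintervals."""
    step = (b - a) / n
    total = g(a) + g(b)
    for i in range(1, n):
        total += (4.0 if i % 2 else 2.0) * g(a + i * step)
    return total * step / 3.0


k, lam, delta = 3, 5.0, 0.2            # illustrative values only
alpha = math.sqrt(k * k + lam - 1.0)   # alpha_k = sqrt(k^2 + lambda - 1)

# phi_k(f_k): integral of f'^2 + alpha^2 f^2 over [1 - delta, 1], compared with (42).
phi_numeric = simpson(
    lambda s: f_prime(s, alpha, delta) ** 2 + (alpha * f(s, alpha, delta)) ** 2,
    1.0 - delta, 1.0)
phi_closed = alpha * (1.0 + 2.0 / (math.exp(2.0 * alpha * delta) - 1.0))

# Integral of f_k over [1 - delta, 1], compared with (43).
mean_numeric = simpson(lambda s: f(s, alpha, delta), 1.0 - delta, 1.0)
mean_closed = (1.0 - 2.0 / (math.exp(alpha * delta) + 1.0)) / alpha
```

The same profile also satisfies the boundary conditions $f_k(1-\delta)=0$ and $f_k(1)=1$ imposed in Section 6.2, which the sketch allows one to confirm directly.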
Using (39)–(45), we may obtain the following estimate, whose proof is postponed to Appendix B.

Lemma 11. We have
$$M_\lambda(w_t,D_\delta)\le\delta-2\delta t+4t^2\sum_{k>l>0}c_kc_l\frac{\sin[(k-l)\delta]}{k-l}\,\frac{kl}{k+l} + o(t).\tag{46}$$
6.5. End of the proof of Lemma 5

We denote
$$S(\delta,t) := \sum_{k>l>0}c_kc_l\frac{\sin[(k-l)\delta]}{k-l}\,\frac{kl}{k+l}.\tag{47}$$
Setting $n = k-l$ and noting that $\frac{(n+l)l}{n+2l} = \frac{l}{2}+\frac{ln}{2(n+2l)}$, we have
$$\frac{2}{(t-2)^2}S(\delta,t) = \sum_{n>0}\frac{\sin(n\delta)}{n}(1-t)^n\sum_{l>0}l(1-t)^{2l} + \sum_{n,l>0}(1-t)^{n+2l}\sin(n\delta)\frac{l}{n+2l}.$$
Here, we have used the explicit formulae for the $c_k$'s, given by Lemma 9. Using Appendix A (see Appendix A.1) we find that for $0<t<\delta$, we have
$$S(\delta,t) = \frac{(1-t)^2}{2t^2}\Big[\arctan\Big(\frac{1-t-\cos\delta}{\sin\delta}\Big) + \arctan\Big(\frac{\cos\delta}{\sin\delta}\Big)\Big] + \frac{(1-t+\cos\delta)(2-t)}{8t\sin\delta} + O(1).\tag{48}$$
We note that
$$\arctan\Big(\frac{1-t-\cos\delta}{\sin\delta}\Big) = \arctan\Big(\frac{1-\cos\delta}{\sin\delta}\Big) - \frac{t\sin\delta}{2(1-\cos\delta)} + O\big(t^2\big) = \frac{\delta}{2} - \frac{t\sin\delta}{2(1-\cos\delta)} + O\big(t^2\big)\tag{49}$$
and
$$\arctan\Big(\frac{\cos\delta}{\sin\delta}\Big) = \frac{\pi}{2}-\delta.\tag{50}$$
From (48)–(50) we infer
$$S(\delta,t)\le\frac{1}{4t^2}(\pi-\delta) + \frac1t\Big[\frac{(1-t+\cos\delta)(2-t)}{8\sin\delta} - \frac{\sin\delta}{4(1-\cos\delta)}\Big] + O(1)\tag{51}$$
with
$$\frac{(1-t+\cos\delta)(2-t)}{8\sin\delta} - \frac{\sin\delta}{4(1-\cos\delta)}<\frac{1+\cos\delta}{4\sin\delta} - \frac{\sin\delta}{4(1-\cos\delta)} = 0.\tag{52}$$
From (46),
$$M_\lambda(w,D_\delta)\le\delta-2\delta t+4t^2S(\delta,t)+o(t).\tag{53}$$
Using (51) and (52),
$$4t^2S(\delta,t)\le\pi-\delta+o(t).\tag{54}$$
Finally, combining (53) and (54), we have
$$M_\lambda(w,D_\delta)\le\pi-2\delta t+o(t)<\pi\quad\text{for } t \text{ small.}$$
We conclude that for $t$ sufficiently small,
$$L_\varepsilon(\underline{w}_t,D_\delta)<\pi.\tag{55}$$
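6.6. Conclusion

$\tilde u := \underline{\psi}u$, with $\underline{\psi} = \psi_t\frac{\min(|\psi_t|,2)}{|\psi_t|}$, satisfies the desired properties, i.e.:
• $E_\varepsilon(\tilde u)<E_\varepsilon(u)+\pi$ (by (36) and (55));
• $\tilde u\in J^{\mathbf{d}}_{\mathbf{p},q-1}$ (by (29), (31) and (33)).

7. Proof of Theorem 2

The energy estimate is obtained from Lemmas 4 and 6. The proof is made by induction on
$$K = \text{æ}\big((\mathbf{d},d),(\mathbf{p},q)\big) = |d_1-p_1|+\dots+|d_N-p_N|+|d-q|\ge0.$$
We call $(\mathbf{p},q,\mathbf{d})$ a good configuration of degrees if
$$(\mathbf{p},q,\mathbf{d})\in\mathbb{Z}^N\times\mathbb{Z}\times(\mathbb{N}^*)^N,\qquad p_i\le d_i\quad\text{and}\quad q\le\sum_id_i =: d.$$
We first prove Theorem 2 when $K=0$. Let $(\mathbf{p},q,\mathbf{d})$ be a configuration s.t. $\text{æ}((\mathbf{d},d),(\mathbf{p},q)) = 0$ ($\Leftrightarrow\mathbf{p} = \mathbf{d}$ and $q = d$). For $\varepsilon>0$, let $(u^\varepsilon_n)_n$ be a minimizing sequence of $E_\varepsilon$ in $J^{\mathbf{d}}_{\mathbf{d},d}$. For $\varepsilon<\varepsilon_4$, up to subsequence, using Proposition 4, $u^\varepsilon_n\to u^\varepsilon$ weakly in $H^1$ and strongly in $L^4$, and $u^\varepsilon$ is a (global) minimizer of $E_\varepsilon$ in $J^{\mathbf{d}}_{\deg(u^\varepsilon,D)}$. Applying Lemmas 3 and 4, for $\varepsilon<\varepsilon_2(\mathbf{d})\le\varepsilon_4$ (here, $\varepsilon_2$ is s.t. the $o_\varepsilon(1)$ of Lemma 4 is lower than $\frac\pi2$),
$$I_0(\mathbf{d},D)\ge E_\varepsilon\big(u^\varepsilon\big) + \pi\,\text{æ}\big(\deg(u^\varepsilon,D),(\mathbf{d},d)\big)\ge I_0(\mathbf{d},D) - \frac\pi2 + 2\pi\,\text{æ}\big(\deg(u^\varepsilon,D),(\mathbf{d},d)\big).$$
It follows that $\text{æ}(\deg(u^\varepsilon,D),(\mathbf{d},d))\le\frac14$, which implies $u^\varepsilon\in J^{\mathbf{d}}_{\mathbf{d},d}$.

Assume now that Theorem 2 holds for all good configurations $(\mathbf{p},q,\mathbf{d})$ s.t.
$$0\le\text{æ}\big((\mathbf{p},q),(\mathbf{d},d)\big)\le K.$$
We prove it for all good configurations $(\tilde{\mathbf{p}},\tilde q,\mathbf{d})$ s.t.
$$\text{æ}\big((\tilde{\mathbf{p}},\tilde q),(\mathbf{d},d)\big) = K+1.$$
Let $(\tilde{\mathbf{p}},\tilde q,\mathbf{d})$ be a good configuration s.t. $\text{æ}((\tilde{\mathbf{p}},\tilde q),(\mathbf{d},d)) = K+1$. Then there is some $i\in\{0,\dots,N\}$ s.t. $((\tilde{\mathbf{p}},\tilde q)+e_i,\mathbf{d})$ is a good configuration and $\text{æ}((\tilde{\mathbf{p}},\tilde q)+e_i,(\mathbf{d},d)) = K$. Without loss of generality, we may assume that $i=0$. Set $\mathbf{p} = \tilde{\mathbf{p}}$ and $q = \tilde q+1$. By the induction hypothesis, Theorem 2 holds for $(\mathbf{p},q,\mathbf{d})$. Let $\varepsilon<\varepsilon_2(\mathbf{p},q,\mathbf{d})$ and $(u_\varepsilon)_{\varepsilon_2>\varepsilon>0}$ be a family of (global) minimizers of $(E_\varepsilon)_{\varepsilon_2>\varepsilon>0}$ in $J^{\mathbf{d}}_{\mathbf{p},q}$. By Lemma 8, for $\varepsilon<\varepsilon_5(\mathbf{p},q,\mathbf{d},\Lambda)$, there is $x^0_\varepsilon\in\partial\Omega$ s.t. $(u_\varepsilon\times\partial_\tau u_\varepsilon)(x^0_\varepsilon)>0$.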
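The third assertion in Proposition 1 and the energy bound give the existence of $0<\varepsilon_2(\mathbf{p},q,\mathbf{d},\Lambda)<\varepsilon_5(\mathbf{p},q,\mathbf{d},\Lambda)$ s.t. for $0<\varepsilon<\varepsilon_2$,
$$\operatorname{abdeg}_i(u_\varepsilon)\in\Big(d_i-\frac13,\ d_i+\frac13\Big).$$
Using Lemma 5, for $\varepsilon<\varepsilon_2$, we have the existence of $\tilde u_\varepsilon\in J^{\mathbf{d}}_{\mathbf{p},q-1}$ s.t.
$$m_\varepsilon(\mathbf{p},q,\mathbf{d})+\pi = E_\varepsilon(u_\varepsilon)+\pi>E_\varepsilon(\tilde u_\varepsilon)\ge m_\varepsilon(\mathbf{p},q-1,\mathbf{d}).$$
Using (24), we have
$$m_\varepsilon(\mathbf{p},q-1,\mathbf{d})<I_0(\mathbf{d},D)+(K+1)\pi.$$
Let $(u^\varepsilon_n)_n$ be a minimizing sequence of $E_\varepsilon$ in $J^{\mathbf{d}}_{\mathbf{p},q-1}$. Using Proposition 4, for $\varepsilon<\varepsilon_4$, up to subsequence, $u^\varepsilon_n\to u^\varepsilon$ weakly in $H^1$ and strongly in $L^2$, where $u^\varepsilon$ is a global minimizer of $E_\varepsilon$ in $J^{\mathbf{d}}_{\deg(u^\varepsilon,D)}$.

It suffices to show that, for $\varepsilon$ sufficiently small, $u^\varepsilon\in J^{\mathbf{d}}_{\mathbf{p},q-1}$. The argument is similar to the one used for $K=0$. By strong $L^2$ convergence, $\operatorname{abdeg}_i(u^\varepsilon)\in[d_i-1/3,d_i+1/3]$. Using Lemmas 3–5, we have
$$\begin{aligned}I_0(\mathbf{d},D)+(K+1)\pi &> m_\varepsilon(\mathbf{p},q-1,\mathbf{d}) = \liminf_nE_\varepsilon\big(u^\varepsilon_n\big) && \text{(by the definition of } u^\varepsilon_n)\\ &\ge E_\varepsilon\big(u^\varepsilon\big) + \pi\,\text{æ}\big((\mathbf{p},q-1),\deg(u^\varepsilon,D)\big) && \text{(Lemma 3)}\\ &> I_0(\mathbf{d},D) + \pi\Big[\text{æ}\big((\mathbf{p},q-1),\deg(u^\varepsilon,D)\big) + \text{æ}\big((\mathbf{d},d),\deg(u^\varepsilon,D)\big)\Big] - \frac\pi2 && \text{(Lemma 4)}\\ &\ge I_0(\mathbf{d},D) + \pi\Big[\text{æ}\big((\mathbf{p},q-1),(\mathbf{d},d)\big) - \frac12\Big] && \text{(by the triangle inequality)}\\ &\ge I_0(\mathbf{d},D) + \Big(K+\frac12\Big)\pi.\end{aligned}$$
It follows that
$$\text{æ}\big((\mathbf{p},q-1),\deg(u^\varepsilon,D)\big) + \text{æ}\big((\mathbf{d},d),\deg(u^\varepsilon,D)\big) = K+1.$$
Since $\text{æ}((\mathbf{p},q-1),(\mathbf{d},d)) = K+1$, we must have
$$p_i\le\deg_{\partial\omega_i}\big(u^\varepsilon\big)\le d_i\quad\text{and}\quad q-1\le\deg_{\partial\Omega}\big(u^\varepsilon\big)\le d.\tag{56}$$
Let
$$H := \Big\{(\mathbf{p}',q') \text{ s.t. } p_i\le p_i'\le d_i,\ q-1\le q'\le\sum_id_i\Big\}\subset\mathbb{Z}^N\times\mathbb{Z}.$$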
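Using (56), $1\le\operatorname{card}(H)<\infty$. Then, we can define
$$0<\varepsilon_2\le\min\Big\{\varepsilon_2,\ \min_{(\mathbf{p}',q')\in H}\varepsilon_5(\mathbf{p}',q',\mathbf{d})\Big\}\quad(\varepsilon_5 \text{ defined in Lemma 8}).$$
Consequently, for $\varepsilon<\varepsilon_2$, $(\mathbf{p}',q')\in H$ and $u\in J^{\mathbf{d}}_{\mathbf{p}',q'}$ a solution of (8) and (9), by Lemma 8, on each connected component of $\partial D$, there is some $x$ s.t. $u\times\partial_\tau u(x)>0$. Only two cases are possible:

Case 1. For $\varepsilon<\varepsilon_2(\mathbf{p},q-1,\mathbf{d})$, $\deg_{\partial\omega_i}(u^\varepsilon) = p_i$ for all $i\in\mathbb{N}_N$ and $\deg_{\partial\Omega}(u^\varepsilon) = q-1$. In this case we have the result since for $0<\varepsilon<\varepsilon_2(\mathbf{p},q-1,\mathbf{d})$, $u^\varepsilon\in J^{\mathbf{d}}_{\mathbf{p},q-1}$.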
Case 2. There is 0 < ε < ε2 s.t. deg(uε , D) ∈ H and deg(uε , D) = (p, q − 1). We denote αi = deg∂Ω uε − pi ,
α0 = deg∂Ω uε − (q − 1).
Then, αi ∈ N, i ∈ {0, . . . , N}. Let J = {i ∈ {0, . . . , N} s.t. αi > 0} = ∅. We enumerate the elements of J in (lm )m∈{1,...,|J |} s.t. lm < lm+1 , m < |J |. d Using æ(deg(uε , D), (p, q − 1)) times Lemma 5, we obtain the existence of v|J | ∈ Jp,q−1 s.t.
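$$ m_\varepsilon(p, q-1, d) \le E_\varepsilon(v_{|J|}) < E_\varepsilon(u_\varepsilon) + \pi\,\text{æ}\big(\deg(u_\varepsilon, D), (p, q-1)\big). \tag{57} $$

Indeed, for $0 \le m \le |J|$ we construct inductively $v_m \in \mathcal{J}$ s.t.

$$ \begin{cases} v_0 = u_\varepsilon, \\ |\mathrm{abdeg}_i(v_m) - d_i| < \tfrac{1}{3}, \\ \text{for } m < |J|,\ v_{m+1} \in \mathcal{J}_{\deg(v_m, D) - \alpha_{l_m} e_{l_m}}, \\ \text{for } m < |J|,\ E_\varepsilon(v_{m+1}) < E_\varepsilon(v_m) + \alpha_{l_m}\pi. \end{cases} $$

The map $v_{m+1}$ is obtained from $v_m$ by applying Lemma 5 $\alpha_{l_m}$ times, as follows. For $j \in \{1, \dots, \alpha_{l_m}\}$, we denote by $v_m^j$ the global minimizer of $E_\varepsilon$ in $\mathcal{J}^d_{\deg(v_m^{j-1}, D) - e_{l_m}}$. For $j = 0$, we set $v_m^0 = v_m$.

From Lemma 5 and since $v_m^j$ is a global minimizer of $E_\varepsilon$ in $\mathcal{J}^d_{\deg(v_m^{j-1}, D) - e_{l_m}}$,

$$ E_\varepsilon(v_m^j) < E_\varepsilon(v_m^{j-1}) + \pi. \tag{58} $$

From (58), for $m < |J|$, by noting that $v_{m+1} = v_m^{\alpha_{l_m}}$, we obtain

$$ E_\varepsilon(v_{m+1}) < E_\varepsilon(v_m) + \alpha_{l_m}\pi. \tag{59} $$

For $m = |J|$, we have $v_{|J|} \in \mathcal{J}^d_{p,q-1}$ and

$$ E_\varepsilon(v_{|J|}) = m_\varepsilon(p, q-1, d) < E_\varepsilon(u_\varepsilon) + \pi\,\text{æ}\big(\deg(u_\varepsilon, D), (p, q-1)\big). \tag{60} $$

On the other hand, Lemma 3 yields

$$ m_\varepsilon(p, q-1, d) = \liminf_n E_\varepsilon(u_\varepsilon^n) \ge E_\varepsilon(u_\varepsilon) + \pi\,\text{æ}\big(\deg(u_\varepsilon, D), (p, q-1)\big). \tag{61} $$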
Estimate (61) contradicts (60). This contradiction completes the proof.

Acknowledgment

The author would like to express his gratitude to Professor Petru Mironescu for suggesting the problem treated in this paper and for his useful remarks.

Appendix A. Results used in the proof of Lemma 5

A.1. Power series expansions

For $X \in \mathbb{C}$, $|X| < 1$, we have

$$ \sum_{k \ge 1} \frac{|X|^k}{k} = -\ln(1 - |X|), \tag{A.1} $$

$$ \sum_{k \ge 0} X^k = \frac{1}{1 - X}, \tag{A.2} $$

$$ \sum_{k \ge 1} k X^k = \frac{X}{(1 - X)^2}, \tag{A.3} $$

$$ \sum_{k > 0} \sin(k\delta) X^k = \frac{X \sin\delta}{1 - 2X\cos\delta + X^2}, \tag{A.4} $$

$$ \sum_{k > 0} \frac{\sin(k\delta)}{k} X^k = \arctan\frac{\cos\delta}{\sin\delta} + \arctan\frac{X - \cos\delta}{\sin\delta}, \tag{A.5} $$

$$ \sum_{n, l > 0} \sin(n\delta) \frac{l}{n + 2l} X^{n + 2l} = \frac{X + \cos\delta}{4(1 - X^2)\sin\delta} - \frac{1}{4\sin^2\delta} \arctan\frac{X - \cos\delta}{\sin\delta} + \mathrm{Cst}(\delta). \tag{A.6} $$

Proof. The first four identities are classical. We sketch the argument that leads to (A.5) and (A.6). Identity (A.5) follows from (A.4) by integration. We next prove (A.6). Let

$$ f(X) = \sum_{n, l > 0} \sin(n\delta) \frac{l}{n + 2l} X^{n + 2l}. $$

On the one hand, by (A.3), (A.4),

$$ f'(X) = \frac{1}{X} \sum_{n > 0} \sin(n\delta) X^n \sum_{l > 0} l X^{2l} = \frac{X^2 \sin\delta}{(1 - X^2)^2 (1 - 2X\cos\delta + X^2)}. $$

On the other hand,

$$ \frac{d}{dX}\left[ \frac{X + \cos\delta}{4\sin\delta\,(1 - X^2)} - \frac{1}{4\sin^2\delta} \arctan\frac{X - \cos\delta}{\sin\delta} \right] = \frac{X^2 \sin\delta}{(1 - X^2)^2 (1 - 2X\cos\delta + X^2)}. $$

□
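The closed forms (A.1)–(A.5) can be sanity-checked numerically for real $X$. The following Python sketch is not part of the original argument: the helper names `partial_sum` and `series_gaps` are illustrative, and the truncation at 400 terms is an assumption that is adequate only when $|X|$ stays well below 1 and $\sin\delta \neq 0$.

```python
import math


def partial_sum(term, start, n_terms=400):
    """Truncated series: sum of term(k) for k = start, ..., start + n_terms - 1."""
    return sum(term(k) for k in range(start, start + n_terms))


def series_gaps(x, delta):
    """Absolute gap between each truncated series and its closed form.

    x is a real number with |x| < 1 and delta satisfies sin(delta) != 0.
    """
    s, c = math.sin(delta), math.cos(delta)
    pairs = {
        "A.1": (partial_sum(lambda k: abs(x) ** k / k, 1),
                -math.log(1 - abs(x))),
        "A.2": (partial_sum(lambda k: x ** k, 0),
                1 / (1 - x)),
        "A.3": (partial_sum(lambda k: k * x ** k, 1),
                x / (1 - x) ** 2),
        "A.4": (partial_sum(lambda k: math.sin(k * delta) * x ** k, 1),
                x * s / (1 - 2 * x * c + x * x)),
        "A.5": (partial_sum(lambda k: math.sin(k * delta) / k * x ** k, 1),
                math.atan(c / s) + math.atan((x - c) / s)),
    }
    return {name: abs(lhs - rhs) for name, (lhs, rhs) in pairs.items()}
```

Calling `series_gaps(0.6, 0.8)` returns one gap per identity; each should be at the level of floating-point rounding.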
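A.2. Estimates for $f_k$ and $\alpha_k$

Recall that we defined, in Section 6, $f_k$ and $\alpha_k$ by

$$ f_k(h) = \frac{e^{\alpha_k (h - 1)}}{1 - e^{-2\alpha_k\delta}} + \frac{e^{-\alpha_k (h - 1)}}{1 - e^{2\alpha_k\delta}}, \qquad \alpha_k = \sqrt{k^2 + \lambda - 1}. $$

In this part, we prove the following inequalities.

$$ \alpha_k = |k| + O\left(\frac{1}{|k| + 1}\right), \tag{A.7} $$

$$ \left| f_k(h) - e^{-|k|(1 - h)} \right| \le \frac{C}{k^2}, \quad \text{with } C \text{ independent of } k \in \mathbb{Z}^*,\ h \in (1 - \delta, 1), \tag{A.8} $$

$$ \left| f_k'(h) - |k| e^{-|k|(1 - h)} \right| \le \frac{C}{|k|}, \quad \text{with } C \text{ independent of } k \in \mathbb{Z}^*,\ h \in (1 - \delta, 1). \tag{A.9} $$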
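Proof. The first assertion is obtained using a Taylor expansion. Let $g_h(u) = e^{u(h - 1)}$; we have

$$ \left| f_k(h) - e^{-|k|(1 - h)} \right| \le \left| g_h(\alpha_k) - g_h(|k|) \right| + \frac{C}{k^2} \le \sup_{(|k|, \alpha_k)} |g_h'(u)|\,\big|\alpha_k - |k|\big| + \frac{C}{k^2} \le \frac{1}{e|k|} \cdot \frac{C}{2|k|} + \frac{C}{k^2} \le \frac{C}{k^2}. $$

The proof of (A.9) is similar; one uses $\tilde{g}_h(u) = u e^{u(h - 1)}$ instead of $g_h$.

□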
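The estimates (A.7) and (A.8) can be probed numerically. The sketch below is only an illustration under stated assumptions: it reads the definition as $\alpha_k = \sqrt{k^2 + \lambda - 1}$, and the values $\lambda = 2$, $\delta = 0.5$, the range of $k$ and the grid in $h$ are arbitrary choices, not taken from the paper. The function names are hypothetical.

```python
import math


def alpha(k, lam):
    """alpha_k = sqrt(k^2 + lam - 1), as read from Section A.2 (lam >= 1 assumed)."""
    return math.sqrt(k * k + lam - 1.0)


def f_k(h, k, lam, delta):
    """f_k(h) from Section A.2; it equals 1 at h = 1 and 0 at h = 1 - delta."""
    a = alpha(k, lam)
    return (math.exp(a * (h - 1)) / (1 - math.exp(-2 * a * delta))
            + math.exp(-a * (h - 1)) / (1 - math.exp(2 * a * delta)))


def scaled_gaps(lam, delta, k_max=60, grid=50):
    """Largest observed values of (k+1)*|alpha_k - k| and k^2*|f_k(h) - e^{-k(1-h)}|.

    The maxima run over k = 1..k_max and over interior grid points of (1 - delta, 1).
    Bounded outputs are consistent with (A.7) and (A.8); they do not prove them.
    """
    worst_a7 = 0.0
    worst_a8 = 0.0
    for k in range(1, k_max + 1):
        worst_a7 = max(worst_a7, (k + 1) * abs(alpha(k, lam) - k))
        for i in range(1, grid):
            h = 1 - delta + delta * i / grid
            gap = abs(f_k(h, k, lam, delta) - math.exp(-k * (1 - h)))
            worst_a8 = max(worst_a8, k * k * gap)
    return worst_a7, worst_a8
```

For the sample parameters, both returned numbers stay of order one as `k_max` grows, which is what (A.7) and (A.8) predict.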
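A.3. Further estimates on $f_k$ and $\alpha_k$

We have

$$ 0 \le \int_{1 - \delta}^1 \left( f_k'^2 - \alpha_k^2 f_k^2 \right) \le \int_{1 - \delta}^1 \left( f_k'^2 - k^2 f_k^2 \right) \le \frac{C}{|k| + 1}, \quad \text{with } C \text{ independent of } k \in \mathbb{Z}, \tag{A.10} $$

$$ \left| \int_{1 - \delta}^1 f_k f_l \right| \le \frac{C}{\max(|k|, |l|)}, \quad \text{with } C \text{ independent of } k, l \in \mathbb{Z} \text{ s.t. } |k| \neq |l|, \tag{A.11} $$

$$ \left| \int_{1 - \delta}^1 f_k' f_l' \right| \le C \left( \min(|k|, |l|) + 1 \right), \quad \text{with } C \text{ independent of } k, l \in \mathbb{Z} \text{ s.t. } |k| \neq |l|. \tag{A.12} $$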
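Proof. Actually (A.11), (A.12) still hold when $|k| = |l|$, but this will not be used in the proof of Lemma 5 and requires a separate argument. Since $\alpha_k \ge |k|$,

$$ \int_{1 - \delta}^1 \left( f_k'^2 - \alpha_k^2 f_k^2 \right) \le \int_{1 - \delta}^1 \left( f_k'^2 - k^2 f_k^2 \right). $$

By direct computations,

$$ 0 \le \int_{1 - \delta}^1 \left( f_k'^2 - \alpha_k^2 f_k^2 \right) = \frac{4\delta\alpha_k^2}{(1 - e^{-2\alpha_k\delta})(e^{2\alpha_k\delta} - 1)} \le \frac{C(\delta, n)}{k^n}, \quad \forall n \in \mathbb{N}^*, $$

$$ \int_{1 - \delta}^1 \left( f_k'^2 - k^2 f_k^2 \right) = \int_{1 - \delta}^1 \left( f_k'^2 - \alpha_k^2 f_k^2 \right) + (\lambda - 1) \int_{1 - \delta}^1 f_k^2, $$

$$ \int_{1 - \delta}^1 f_k^2 = \frac{1}{2\alpha_k} \left( \frac{1}{1 - e^{-2\alpha_k\delta}} - \frac{1}{1 - e^{2\alpha_k\delta}} \right) + O\left(\frac{1}{|k| + 1}\right) = O\left(\frac{1}{|k| + 1}\right), $$

which proves (A.10).

For $|k| \neq |l|$, we have

$$ \left| \int_{1 - \delta}^1 f_k f_l \right| = \left| \frac{1 - e^{-2(\alpha_k + \alpha_l)\delta}}{(\alpha_k + \alpha_l)(1 - e^{-2\alpha_k\delta})(1 - e^{-2\alpha_l\delta})} - \frac{1 - e^{-2(\alpha_k - \alpha_l)\delta}}{(\alpha_k - \alpha_l)(1 - e^{-2\alpha_k\delta})(e^{2\alpha_l\delta} - 1)} \right| $$

$$ \le \frac{C}{\max(|k|, |l|)} + \left| \frac{1 - e^{-2(\alpha_k - \alpha_l)\delta}}{(\alpha_k - \alpha_l)(1 - e^{-2\alpha_k\delta})(e^{2\alpha_l\delta} - 1)} \right|. $$

We assume that $|k| > |l|$ and we consider the two following cases: $\alpha_l < \alpha_k \le 2\alpha_l$ and $\alpha_k > 2\alpha_l$. Noting that $\frac{1 - e^{-2x\delta}}{x}$ is bounded for $x \in \mathbb{R}_+^*$, we have

$$ \left| \frac{1 - e^{-2(\alpha_k - \alpha_l)\delta}}{(\alpha_k - \alpha_l)(1 - e^{-2\alpha_k\delta})(e^{2\alpha_l\delta} - 1)} \right| \le \frac{C}{e^{2\alpha_l\delta}} \le \frac{C}{\max(|k|, |l|)} \quad \text{if } \alpha_l < \alpha_k \le 2\alpha_l, $$

$$ \left| \frac{1 - e^{-2(\alpha_k - \alpha_l)\delta}}{(\alpha_k - \alpha_l)(1 - e^{-2\alpha_k\delta})(e^{2\alpha_l\delta} - 1)} \right| \le \frac{C}{\alpha_k - \alpha_l} \le \frac{C}{\max(|k|, |l|)} \quad \text{if } \alpha_k > 2\alpha_l. $$

This proves (A.11).

For $|k| \neq |l|$,

$$ \frac{1}{\alpha_k\alpha_l} \int_{1 - \delta}^1 f_k' f_l' = \frac{1 - e^{-2(\alpha_k + \alpha_l)\delta}}{(\alpha_k + \alpha_l)(1 - e^{-2\alpha_k\delta})(1 - e^{-2\alpha_l\delta})} + \frac{1 - e^{-2(\alpha_k - \alpha_l)\delta}}{(\alpha_k - \alpha_l)(1 - e^{-2\alpha_k\delta})(e^{2\alpha_l\delta} - 1)}. $$

It is clear that

$$ \frac{\alpha_k\alpha_l (1 - e^{-2(\alpha_k + \alpha_l)\delta})}{(\alpha_k + \alpha_l)(1 - e^{-2\alpha_k\delta})(1 - e^{-2\alpha_l\delta})} \le C \frac{\alpha_k\alpha_l}{\alpha_k + \alpha_l} \le C \left( \min(|k|, |l|) + 1 \right). \tag{A.13} $$

As in the proof of (A.11), we have

$$ \left| \frac{\alpha_k\alpha_l (1 - e^{-2(\alpha_k - \alpha_l)\delta})}{(\alpha_k - \alpha_l)(1 - e^{-2\alpha_k\delta})(e^{2\alpha_l\delta} - 1)} \right| \le \frac{C\alpha_k\alpha_l}{\max(|k|, |l|)} \le C \left( \min(|k|, |l|) + 1 \right). \tag{A.14} $$

Inequality (A.12) follows from (A.13) and (A.14).

□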
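The closed form for $\int_{1-\delta}^1 f_k f_l$ used in the proof of (A.11) can be compared with a direct quadrature. This is a hedged sketch only: it assumes $\alpha_k = \sqrt{k^2 + \lambda - 1}$ with the sample values $\lambda = 2$, $\delta = 0.5$, uses a plain midpoint rule, and all function names are hypothetical.

```python
import math


def alpha(k, lam):
    """alpha_k = sqrt(k^2 + lam - 1) (lam > 1 assumed so that alpha_0 > 0)."""
    return math.sqrt(k * k + lam - 1.0)


def f(h, a, delta):
    """f_k(h) written in terms of a = alpha_k."""
    return (math.exp(a * (h - 1)) / (1 - math.exp(-2 * a * delta))
            + math.exp(-a * (h - 1)) / (1 - math.exp(2 * a * delta)))


def overlap_numeric(k, l, lam, delta, n=4000):
    """Midpoint-rule approximation of the integral of f_k * f_l over (1 - delta, 1)."""
    ak, al = alpha(k, lam), alpha(l, lam)
    step = delta / n
    total = 0.0
    for i in range(n):
        h = 1 - delta + (i + 0.5) * step
        total += f(h, ak, delta) * f(h, al, delta)
    return step * total


def overlap_closed_form(k, l, lam, delta):
    """Closed form from the proof of (A.11); requires alpha_k != alpha_l."""
    ak, al = alpha(k, lam), alpha(l, lam)
    s, d = ak + al, ak - al
    first = (1 - math.exp(-2 * s * delta)) / (
        s * (1 - math.exp(-2 * ak * delta)) * (1 - math.exp(-2 * al * delta)))
    second = (1 - math.exp(-2 * d * delta)) / (
        d * (1 - math.exp(-2 * ak * delta)) * (math.exp(2 * al * delta) - 1))
    return first - second
```

For pairs such as $(k, l) = (3, 1)$ the two functions should agree up to the quadrature error.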
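A.4. Two fundamental estimates

In this part, we let $k > l \ge 0$ and prove the following:

$$ X_{k,l} := \frac{(\alpha_k\alpha_l + kl + \lambda - 1)(1 - e^{-2(\alpha_k + \alpha_l)\delta})}{(\alpha_k + \alpha_l)(1 - e^{-2\alpha_k\delta})(1 - e^{-2\alpha_l\delta})} = \frac{2kl}{k + l} + O\left(\frac{1}{l + 1}\right), \tag{A.15} $$

$$ Y_{k,l} := \frac{(\alpha_k\alpha_l + kl + \lambda - 1)(1 - e^{-2(\alpha_k - \alpha_l)\delta})}{(\alpha_k - \alpha_l)(1 - e^{-2\alpha_k\delta})(e^{2\alpha_l\delta} - 1)} \le C e^{-\delta l}. \tag{A.16} $$

The computations are direct:

$$ X_{k,l} - \frac{2kl}{k + l} = \frac{2kl}{(\alpha_k + \alpha_l)(1 - e^{-2\alpha_k\delta})(1 - e^{-2\alpha_l\delta})} - \frac{2kl}{k + l} + O\left(\frac{1}{l + 1}\right) $$

$$ = 2kl\, \frac{k + l - (\alpha_k + \alpha_l)(1 - e^{-2\alpha_k\delta})(1 - e^{-2\alpha_l\delta})}{(k + l)(\alpha_k + \alpha_l)(1 - e^{-2\alpha_k\delta})(1 - e^{-2\alpha_l\delta})} + O\left(\frac{1}{l + 1}\right) $$

$$ = \frac{O(k + k^2 l e^{-l\delta/2})}{(k + l)(\alpha_k + \alpha_l)(1 - e^{-2\alpha_k\delta})(1 - e^{-2\alpha_l\delta})} + O\left(\frac{1}{l + 1}\right) = O\left(\frac{1}{l + 1}\right). $$

We now turn to (A.16). If $\alpha_k \ge 2\alpha_l$ (or equivalently, if $\alpha_k - \alpha_l \ge \frac{\alpha_k}{2}$), then

$$ \frac{(\alpha_k\alpha_l + kl + \lambda - 1)(1 - e^{-2(\alpha_k - \alpha_l)\delta})}{(\alpha_k - \alpha_l)(1 - e^{-2\alpha_k\delta})(e^{2\alpha_l\delta} - 1)} \le C \frac{kl}{\alpha_k} e^{-2\alpha_l\delta} \le C e^{-\delta l}. $$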
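If $\alpha_k < 2\alpha_l$, then

$$ \frac{(\alpha_k\alpha_l + kl + \lambda - 1)(1 - e^{-2(\alpha_k - \alpha_l)\delta})}{(\alpha_k - \alpha_l)(1 - e^{-2\alpha_k\delta})(e^{2\alpha_l\delta} - 1)} \le C l^2 e^{-2\alpha_l\delta} \le C e^{-\delta l}. $$

Appendix B. Proof of Proposition 1 and of Lemma 9

B.1. Proof of Proposition 1

The proof of (1) is direct by noticing that if $u \in H^1(D, S^1)$, then $\partial_1 u$ and $\partial_2 u$ are pointwise proportional and $\deg_{\partial\Omega}(u) = \sum_i \deg_{\partial\omega_i}(u)$,

$$ \mathrm{abdeg}_i(u, D) = \frac{1}{2\pi} \int_D \sum_{k = 1, 2} (-1)^k (u \times \partial_k u)\, \partial_{3 - k} V_i = \frac{1}{2\pi} \int_{\partial D} V_i\, u \times \partial_\tau u \, d\tau = \deg_{\partial\Omega}(u) - \sum_{j \neq i} \deg_{\partial\omega_j}(u) = \deg_{\partial\omega_i}(u). $$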
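Proof of (2). Since $V_i$ is locally constant on $\partial D$, integrating by parts,

$$ \int_D v \times (\partial_1 u\, \partial_2 V_i - \partial_2 u\, \partial_1 V_i)\, dx = \int_D u \times (\partial_1 v\, \partial_2 V_i - \partial_2 v\, \partial_1 V_i)\, dx. $$

Then

$$ 2\pi \left| \mathrm{abdeg}_i(u) - \mathrm{abdeg}_i(v) \right| = \left| \int_D (u - v) \times \left[ (\partial_1 V_i\, \partial_2 u - \partial_2 V_i\, \partial_1 u) + (\partial_1 V_i\, \partial_2 v - \partial_2 V_i\, \partial_1 v) \right] dx \right| $$

$$ \le \sqrt{2}\, \|u - v\|_{L^2(D)} \|V_i\|_{C^1(D)} \left( \|\nabla u\|_{L^2(D)} + \|\nabla v\|_{L^2(D)} \right) $$

$$ \le 2 \|u - v\|_{L^2(D)} \|V_i\|_{C^1(D)} \left( E_\varepsilon(u)^{1/2} + E_\varepsilon(v)^{1/2} \right) \le 4 \|u - v\|_{L^2(D)} \|V_i\|_{C^1(D)} \Lambda^{1/2}. $$

We prove assertion (3) by showing that $\mathrm{dist}(\mathrm{abdeg}_i(u_\varepsilon), \mathbb{Z}) = o(1)$. Using the first and the second assertion, we have

$$ \mathrm{dist}(\mathrm{abdeg}_i(u_\varepsilon), \mathbb{Z}) \le \inf_{v \in E_0^\Lambda} \left| \mathrm{abdeg}_i(u_\varepsilon) - \mathrm{abdeg}_i(v) \right| \le \frac{2}{\pi} \|V_i\|_{C^1(D)} \Lambda^{1/2} \inf_{v \in E_0^\Lambda} \|u_\varepsilon - v\|_{L^2(D)}, \tag{B.1} $$

where $E_0^\Lambda := \{ u \in H^1(D, S^1) \text{ s.t. } \frac{1}{2} \int_D |\nabla u|^2\, dx \le \Lambda \} \neq \emptyset$.

Now, it suffices to show that $\inf_{v \in E_0^\Lambda} \|u_\varepsilon - v\|_{L^2(D)} \to 0$. We argue by contradiction and assume that there are an extraction $(\varepsilon_n)_n \downarrow 0$ and $\delta > 0$ s.t. for all $n$, $\inf_{v \in E_0^\Lambda} \|u_{\varepsilon_n} - v\|_{L^2(D)} > \delta$.

We see that $(u_{\varepsilon_n})_n$ is bounded in $H^1$. Then, up to a subsequence, $u_{\varepsilon_n}$ converges to some $u \in H^1(D, \mathbb{R}^2)$ weakly in $H^1$ and strongly in $L^4$. Since $\| |u_{\varepsilon_n}|^2 - 1 \|_{L^2(D)} \to 0$, we have $u \in H^1(D, S^1)$ and, by weak convergence, $\|\nabla u\|_{L^2(D)}^2 \le 2\Lambda$. To conclude, we have $u \in E_0^\Lambda$ and $\|u_{\varepsilon_n} - u\|_{L^2} \to 0$, which is a contradiction.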
B.2. Proof of Lemma 9

(1) We see easily that, with $z = e^{\imath\theta}$, we have

$$ \frac{\Psi_t(z) - F_t(z)}{t} = \frac{(1 - \varphi(\theta))(1 - z^2)}{[z(1 - t) - 1][z(1 - t\varphi(\theta)) - 1]} \equiv \frac{A(\theta, t)}{B(\theta, t)}. \tag{B.2} $$

The modulus of the RHS of (B.2) can be bounded by noting that
• there is some $m > 0$ s.t. $|B(\theta, t)| \ge m$ for each $t$ and each $\theta$ s.t. $|\theta| > \delta/2$ mod $2\pi$;
• there is some $M > 0$ s.t. $|A(\theta, t)| \le M$ for each $t$ and each $\theta$ s.t. $|\theta| > \delta/2$ mod $2\pi$;
• if $|\theta| \le \delta/2$ (modulo $2\pi$), then $(\Psi_t - F_t)t^{-1} \equiv 0$.

(2) This assertion is a standard expansion.

(3) With a classical result relating regularity of $\Psi_t - F_t$ to the asymptotic behaviour of its Fourier coefficients, we have

$$ \left| b_k(t) - c_k(t) \right| \le \frac{2^{n + 1} \pi\, \|\partial_\theta^n (\Psi_t - F_t)\|_{L^\infty(S^1)}}{t\,(1 + |k|)^n}. $$

Noting that, for $\partial_\theta^n (\Psi_t - F_t)\, t^{-1} \equiv \frac{A_n(\theta, t)}{B_n(\theta, t)}$,
• there is some mn > 0 s.t. |Bn (θ, t)| mn for each t and each θ s.t. |θ | > δ/2 mod 2π ; • there is some Mn > 0 s.t. |An (θ, t)| Mn for each t and each θ s.t. |θ | > δ/2 mod 2π ; • if |θ | δ/2 (modulo 2π ), then (Ψt − Ft )t −1 ≡ 0 we obtain the result. B.3. Proof of Lemma 11 The key argument to treat the energetic contribution of Dδ± is the following lemma. Lemma 12. 1. |ψ˜t (h, ±δ) − 1| = O(t); 2. |∂h ψ˜t (h, ±δ)| = O(t| ln t|). Proof. Using Lemma 9, (A.2) and (A.8), we have −1 ˜ −ı[(k+1)δ] t ψt (h, δ) − 1 −c−1 f−1 (h) + ck (t)fk (h)e k=−1
−ı(k+1)δ bk − ck (t) fk (h)e + −(b−1 − c−1 )f−1 (h) + k=−1
−(1−h)−ıδ k C(δ) + 1 = O(1). (1 − t)e k0
We prove that |∂h ψ˜t (h, δ)| = O(t| ln t|). Using Lemma 9, (A.3) and (A.9), t −1 ∂h ψ˜t (h, δ) −c−1 f−1 + ck fk e−ı(k+1)δ k=−1
+ −(b−1 − c−1 )f−1 + (bk − ck )fk e−ı(k+1)δ k=−1
k 2 k (1 − t)e−ıδ−(1−h) + O | ln t| = O | ln t| .
2
k0
Using (39), (40) and Lemma 12, we have (with the notation of Section 6) that Mλ (wt , Dδ ) = Rλ (wt ) + o(t), where
Rλ (wt ) = δt
2
bk2 φk (fk ) − 2t 2
k=−1
k∈Z
+ 2t 2
k,l=−1 k−l>0
bk bl
sin[(k + 1)δ] b−1 bk k+1
sin[(k − l)δ] k−l
1
1
f−1 fk − (k − λ + 1)f−1 fk
1−δ
fk fl + (kl + λ − 1)fk fl .
1−δ
The proof of Lemma 12 is completed provided we establish the following estimate: Rλ (wt ) δ − 2δt + 4t 2
ck cl
k,l0 k−l>0
sin[(k − l)δ] kl + o(t). k−l k+l
(B.3)
The remaining part of this appendix is devoted to the proof of (B.3).

We estimate the first term of $R_\lambda$: Using (42) and Lemma 9, we have (with $C$ independent of $t$)

$$ \left| \sum_{k \in \mathbb{Z}} b_k^2 \varphi_k(f_k) - \sum_{k \in \mathbb{Z}} c_k^2 \varphi_k(f_k) \right| \le C. \tag{B.4} $$

With (42) and (A.7), we obtain

$$ \varphi_k(f_k) = \alpha_k \left( 1 + \frac{2}{e^{2\alpha_k\delta} - 1} \right) = |k| + O\left(\frac{1}{|k| + 1}\right) \quad \text{when } |k| \to \infty. \tag{B.5} $$

From (A.1), (A.3) and (B.5),

$$ t^2 \sum_{k \in \mathbb{Z}} c_k^2 \varphi_k(f_k) = t^2 \varphi_{-1}(f_{-1}) + t^2 (t - 2)^2 \sum_{k \ge 0} (1 - t)^{2k} \varphi_k(f_k) = t^2 (t - 2)^2 \sum_{k > 0} k (1 - t)^{2k} + o(t) = 1 - 2t + o(t). \tag{B.6} $$
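The last equality in (B.6) rests on (A.3) with $X = (1 - t)^2$, which gives $t^2 (t - 2)^2 \sum_{k > 0} k (1 - t)^{2k} = (1 - t)^2 = 1 - 2t + t^2$. The Python sketch below checks this numerically; the function name `weighted_series` and the truncation at 5000 terms are illustrative assumptions (the truncation is adequate for $t \ge 0.01$ only).

```python
def weighted_series(t, n_terms=5000):
    """Truncated value of t^2 (t-2)^2 * sum_{k=1}^{n_terms} k (1-t)^(2k), for 0 < t < 1."""
    x = (1.0 - t) ** 2
    partial = sum(k * x ** k for k in range(1, n_terms + 1))
    return t ** 2 * (t - 2.0) ** 2 * partial
```

For small $t$ the returned value differs from $1 - 2t$ by about $t^2$, which is the $o(t)$ term in (B.6).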
We estimate the second term of Rλ : Using Lemma 9, (A.11) and (A.12), we have (with C independent of t) 1 sin[(k + 1)δ] f−1 fk − (k − λ + 1)f−1 fk C. (bk − ck ) k+1 k=−1
1−δ
Since b−1 (t) is bounded by a quantity independent of t, in the order to estimate the third term of the RHS of (38), we observe that there is C independent of t s.t. 1 (1 − t)k sin[(k + 1)δ] f−1 fk − (k − λ + 1)f−1 fk k+1 k0
C
k1
1−δ
(1 − t)k k
+ 1 = C | ln t| + 1 .
Finally, using Lemma 9, (44) and (45), we have 1 sin[(k + 1)δ] f b f − (k − λ + 1)f f k −1 k C | ln t| + 1 . −1 k k+1 k=−1
(B.7)
1−δ
We estimate the last term of Rλ : First, we consider the case k = −l > 0 (i.e., fk = fl ). Using (43), 0 fk 1 and (A.10), we have (with C independent of t) 1 2 2 2 sin 2kδ fk + −k + λ − 1 fk C = C(t). bk b−k 2k k>0
1−δ
It remains to estimate the last sum in Rλ , considered only over the indices k and l s.t. |k| = |l|. We start with
sin[(k − l)δ] (bk bl − ck cl ) k−l
k,l=−1 k−l>0, k=−l
=
k,l=−1 k−l>0, k=−l
1
1
fk fl + (kl + λ − 1)fk fl
1−δ
sin[(k − l)δ] (bk − ck )(bl − cl ) + ck (bl − cl ) + cl (bk − ck ) k−l
fk fl + (kl + λ − 1)fk fl .
×
(B.8)
1−δ
By the assertion (3) of Lemma 9, the first sum of the RHS of (B.8) is easily bounded by a quantity independent of t. By (A.11), (A.12) and Lemma 9,
k,l=−1 k−l>0, k=−l
sin[(k − l)δ] ck (bl − cl ) k−l
C
k0, l=−1 k−l>0, k=−l
1
fk fl + (kl + λ − 1)fk fl
1−δ
(1 − t)k |bl − cl ||l| + C. k−l
On the other hand (putting n = k − l), k0, l=−1 k−l>0, k=−l
(1 − t)k |bl − cl ||l| k−l
(1 − t)k |bl − cl |l + k−l
k>l0
l0, n>0
k0, l−1
(1 − t)n |bl − cl |l + n
(1 − t)k |bl − cl ||l| k + |l|
k>0, l−1
(1 − t)k |bl − cl ||l| + O(1) k
= O | ln t| . Similarly, we may prove that
k,l=−1 k−l>0, k=−l
sin[(k − l)δ] cl (bk − ck ) k−l
1 1−δ
fk fl + (kl + λ − 1)fk fl = O | ln t| .
We have thus proved that
k,l=−1 k−l>0, k=−l
sin[(k − l)δ] (bk bl − ck cl ) k−l
1
fk fl + (kl + λ − 1)fk fl = o t −1 .
1−δ
To finish the proof, it suffices to obtain k,l=−1 k−l>0, k=−l
=2
sin[(k − l)δ] ck cl k−l
ck cl
k,l0 k−l>0
1
fk fl + (kl + λ − 1)fk fl
1−δ
sin[(k − l)δ] kl + o t −1 . k−l k+l
Since cm = 0 for m < −1, it suffices to consider the case k > l 0. Under these hypotheses, we have by (44), (45), (A.15) and (A.16), k>l0
sin[(k − l)δ] ck cl k−l
=2
k>l0
ck cl
1
fk fl + (kl + λ − 1)fk fl
1−δ
sin[(k − l)δ] kl +O k−l k+l
ck cl | sin[(k − l)δ]| 1 . k−l l+1 k>l0
We conclude by noting that (1 − t)n (1 − t)2l sin[(k − l)δ] C 1 + C 1 + ln2 t . c c k l (k − l)(l + 1) n l n>0
k>l0
l>0
Appendix C. Proof of Lemma 7

Lemma 13. Let $0 < \delta, \eta < 1$. There is

$$ M_{\eta,\delta} : D(0, 1) \to \mathbb{C}, \quad x \mapsto M_{\eta,\delta}(x) \tag{C.1} $$

s.t.:
(i) $\deg_{S^1}(M_{\eta,\delta}) = 1$,
(ii) $\frac{1}{2} \int_{D(0,1)} |\nabla M_{\eta,\delta}|^2 \le \pi + \eta$,
(iii) $|M_{\eta,\delta}| \le 2$,
(iv) if $|\theta| > \delta$ mod $2\pi$, then $M_{\eta,\delta}(e^{\imath\theta}) = 1$.

Claim. Taking $\overline{M_{\eta,\delta}}$ instead of $M_{\eta,\delta}$, we obtain the same conclusions, with assertion (i) replaced by $\deg_{S^1}(\overline{M_{\eta,\delta}}) = -1$.
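Proof. As in Section 6, let $\varphi \in C^\infty(\mathbb{R}, \mathbb{R})$ be s.t.
• $0 \le \varphi \le 1$,
• $\varphi$ is even and $2\pi$-periodic,
• $\varphi|_{(-\delta/2, \delta/2)} \equiv 1$ and $\varphi|_{[-\pi, \pi[ \setminus (-\delta, \delta)} \equiv 0$.

For $0 < t < \delta$, let $M_t = M$ be the unique solution of

$$ \begin{cases} M(e^{\imath\theta}) = \dfrac{e^{\imath\theta} - (1 - t\varphi(\theta))}{e^{\imath\theta}(1 - t\varphi(\theta)) - 1}, \\ \Delta M = 0 \text{ in } D(0, 1). \end{cases} $$

It follows easily that $M$ satisfies (i), (iii) and (iv). We will prove that for $t$ small (ii) holds. Using (32), we have

$$ \frac{e^{\imath\theta} - (1 - t\varphi(\theta))}{e^{\imath\theta}(1 - t\varphi(\theta)) - 1} = 1 - t b_{-1}(t) + t \sum_{k \neq -1} b_k(t) e^{(k + 1)\imath\theta}. \tag{C.2} $$

It is not difficult to see that

$$ M(r e^{\imath\theta}) = 1 - t b_{-1}(t) + t \sum_{k \neq -1} b_k(t)\, r^{|k + 1|} e^{(k + 1)\imath\theta}. \tag{C.3} $$

From (C.3),

$$ \frac{1}{2} \int_{D(0,1)} |\nabla M|^2 = t^2 \int_0^{2\pi} d\theta \int_0^1 dr \sum_{k \neq -1} b_k^2 (k + 1)^2 r^{2|k + 1| - 1} $$

$$ = \pi t^2 \sum_{k \ge 0} b_k^2 (k + 1) + \pi t^2 \sum_{k \le -2} |k + 1| b_k^2 $$

$$ = \pi t^2 \sum_{k \ge 0} c_k^2 (k + 1) + O(t^2) \quad \text{(using Lemma 9)} $$

$$ = \pi (2 - t)^2 t^2 \sum_{k \ge 0} (1 - t)^{2k} (k + 1) + O(t^2) \quad \text{(using Lemma 9)} $$

$$ = \pi + O(t^2) \quad \text{(using (A.2) and (A.3))} $$

$$ \le \pi + \eta \quad \text{for } t \text{ small}. $$

We finish the proof taking, for $t$ small, $M_{\eta,\delta} = M_t$.

□

Lemma 14. Let $u \in \mathcal{J}$, $i \in \{0, \dots, N\}$ and $\varepsilon > 0$. For all $\eta > 0$, there is $u_\eta^\pm \in \mathcal{J}_{\deg(u, D) \pm e_i}$ s.t.

$$ E_\varepsilon(u_\eta^\pm) \le E_\varepsilon(u) + \pi + \eta \tag{C.4} $$

and

$$ \|u - u_\eta^\pm\|_{L^2(D)} = o_\eta(1), \quad o_\eta(1) \xrightarrow[\eta \to 0]{} 0. \tag{C.5} $$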
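Proof. We prove that for $i = 0$, there is $u_\eta^+ \in \mathcal{J}_{\deg(u, D) + e_i}$ satisfying (C.4) and (C.5). In the other cases the proof is similar. Using the density of $C^0(\overline{D}, \mathbb{C}) \cap \mathcal{J}$ in $\mathcal{J}$ for the $H^1$-norm, we may assume $u \in C^0(\overline{D}, \mathbb{C}) \cap \mathcal{J}$. It suffices to prove the result for $0 < \eta < \min\{10^{-3}, \varepsilon^2\}$. Let $x^0 \in \partial\Omega$ and let $V_\eta$ be an open regular set of $D$ s.t.:
• $\partial V_\eta \cap \partial D \neq \emptyset$, $|V_\eta| \le \eta^2$,
• $x^0$ is an interior point of $\partial\Omega \cap \partial V_\eta$,
• $V_\eta$ is simply connected,
• $|u|^2 \le 1 + \eta^2$ in $V_\eta$,
• $\|\nabla u\|_{L^2(V_\eta)} \le \eta^2$.

Using Carathéodory's theorem, there is $\Phi : \overline{V_\eta} \to \overline{D(0, 1)}$, a homeomorphism s.t. $\Phi|_{V_\eta} : V_\eta \to D(0, 1)$ is a conformal mapping. Without loss of generality, we may assume that $\Phi(x^0) = 1$. Let $\delta > 0$ be s.t. for $|\theta| \le \delta$ we have $\Phi^{-1}(e^{\imath\theta}) \in \partial V_\eta \cap \partial\Omega$. Let $N_\eta \in \mathcal{J}$ be defined by

$$ N_\eta(x) = \begin{cases} 1 & \text{if } x \in D \setminus V_\eta, \\ M_{\eta^2,\delta}(\Phi(x)) & \text{otherwise}. \end{cases} $$

Here, $M_{\eta^2,\delta}$ is defined by Lemma 13. Using the conformal invariance of the Dirichlet functional, we have

$$ \frac{1}{2} \int_{V_\eta} |\nabla N_\eta|^2 = \frac{1}{2} \int_{D(0,1)} |\nabla M_{\eta^2,\delta}|^2 \le \pi + \eta^2. \tag{C.6} $$

It is not difficult to see that $u_\eta^+ := u N_\eta \in \mathcal{J}_{\deg(u, D) + e_0}$. Since $|N_\eta| \le 2$ and $\|N_\eta - 1\|_{L^2(D)} = o_\eta(1)$, using the dominated convergence theorem, we may prove that $u N_\eta \to u$ in $L^2(D)$ when $\eta \to 0$. It follows that (C.5) holds. From (C.6) and using the following formula,

$$ |\nabla(uv)|^2 = |v|^2 |\nabla u|^2 + |u|^2 |\nabla v|^2 + 2 \sum_{j = 1, 2} (v\, \partial_j u) \cdot (u\, \partial_j v), $$

we obtain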
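$$ \frac{1}{2} \int_{V_\eta} |\nabla u_\eta^+|^2 = \frac{1}{2} \int_{V_\eta} \Big[ |N_\eta|^2 |\nabla u|^2 + |u|^2 |\nabla N_\eta|^2 + 2 \sum_{j = 1, 2} (N_\eta\, \partial_j u) \cdot (u\, \partial_j N_\eta) \Big] $$

$$ \le (1 + \eta^2)(\pi + \eta^2) + 2 \|\nabla u\|_{L^2(V_\eta)}^2 + 4 \sqrt{1 + \eta^2}\, \|\nabla u\|_{L^2(V_\eta)} \|\nabla N_\eta\|_{L^2(V_\eta)} \le \pi + \frac{\eta}{2}. \tag{C.7} $$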
Furthermore, we have 1 4ε 2
2 2 2 η η. 1 − u+ η 2 4ε 2
(C.8)
Vη
From (C.7) and (C.8), it follows

$$ E_\varepsilon(u_\eta^+, D) = E_\varepsilon(u, D \setminus V_\eta) + E_\varepsilon(u_\eta^+, V_\eta) \le E_\varepsilon(u, D) + \pi + \eta. $$

The previous inequality completes the proof.

□
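We may now prove Lemma 7. For the convenience of the reader, we recall the statement of the lemma.

Lemma 7. Let $u \in \mathcal{J}$, $\varepsilon > 0$ and $\delta = (\delta_1, \dots, \delta_N, \delta_0) \in \mathbb{Z}^{N + 1}$. For all $\eta > 0$, there is $u_\eta^\delta \in \mathcal{J}_{\deg(u, D) + \delta}$ s.t.

$$ E_\varepsilon(u_\eta^\delta) \le E_\varepsilon(u) + \pi \sum_{i \in \{0, \dots, N\}} |\delta_i| + \eta \tag{25} $$

and

$$ \|u - u_\eta^\delta\|_{L^2(D)} = o_\eta(1), \quad o_\eta(1) \xrightarrow[\eta \to 0]{} 0. \tag{26} $$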
Proof. As in the previous lemma, it suffices to prove the proposition for 0 < η < min{10−3 , ε 2 } and u ∈ C 0 (D, C) ∩ J . We construct uδη in 1 = i∈{0,...,N } |δi | steps. If 1 = 0 (which is equivalent at δ = 0ZN+1 ) then, taking uδη = u, (25) and (26) hold. Assume 1 = 0. Let Γ = {i ∈ NN s.t. δi = 0} = ∅, L = Card Γ and μ = η1 . We enumerate the elements of Γ in (in )n∈NL s.t. for n ∈ NL−1 we have in < in+1 . x Let σ be the sign function i.e. for x ∈ R∗ , σ (x) = |x| . For n ∈ NL and l ∈ N|δin | , we construct vnl ∈ Jdeg(v l−1 ,D)+σ (δi )ei n
s.t.
n
|δi
v00 = u,
1091
|
n−1 vn0 = vn−1 with for n = 1, δi0 = 0,
(vnl )+ μ if δin > 0, 0 l < |δin |. vnl+1 = l (vn )− μ if δin < 0,
± l Here, (vnl )± μ stands for uμ defined by Lemma 14 taking u = vn and η = μ. |δ |
It is clear that vnl is well defined and that for n ∈ NL , vn := vn in ∈ Jdeg(vn−1 ,D)+δin ein with v0 = u. Therefore, using (C.4), we have for n ∈ NL , vn ∈ Jdeg(u,D)+k∈N
δ e , n ik ik
Eε (vn ) Eε (u) + (π + μ)
|δik |.
k∈Nn
Taking n = L, we obtain that uδη = vL ∈ Jdeg(u,D)+δ ,
Eε uδη Eε (u) + π
|δi | + η.
i∈{0,...,N }
Furthermore, uδη is obtained from u multiplying by 1 factors Nl , l ∈ N1 . Each Nl is bounded by 2 and converges to 1 in L2 -norm (when η → 0). Using the Dominated convergence theorem, we may prove that uδη satisfies (26). 2 References [1] N. André, I. Shafrir, Minimisation of a Ginzburg–Landau type functional with boundary condition which is not S 1 -valued, Calc. Var. Partial Differential Equations 7 (1998) 191–217. [2] N. André, I. Shafrir, On the minimizers of a Ginzburg–Landau type energy when the boundary condition has zeros, Adv. Differential Equations 9 (2004) 891–960. [3] L. Berlyand, D. Golovaty, V. Rybalko, Nonexistence of Ginzburg–Landau minimizers with prescribed degree on the boundary of a doubly connected domain, C. R. Math. Acad. Sci. Paris 343 (2006) 63–68. [4] L. Berlyand, P. Mironescu, Ginzburg–Landau minimizers in perforated domains with prescribed degrees, preprint, 2004. [5] L. Berlyand, V. Rybalko, Solution with vortices of a semi-stiff boundary value problem for the Ginzburg– Landau equation, J. Eur. Math. Soc. (JEMS), in press, 2008, http://www.math.psu.edu/berlyand/publications/ publications.html. [6] H. Brezis, New questions related to the topological degree, in: The Unity of Mathematics, in: Progr. Math., vol. 244, Birkhäuser Boston, 2006. [7] H. Brezis, F. Bethuel, F. Hélein, Ginzburg–Landau Vortices, Birkhäuser, 1994. [8] P. Mironescu, Explicit bounds for solutions to a Ginzburg–Landau type equation, Rev. Roumaine Math. Pures Appl. 41 (1996) 263–271. [9] E. Sandier, S. Serfaty, Vortices in the Magnetic Ginzburg–Landau Model, Birkhäuser, 2007. [10] M. Tinkham, Introduction to Superconductivity, McGraw–Hill, New York, 1996.
Journal of Functional Analysis 257 (2009) 1092–1132 www.elsevier.com/locate/jfa
Higher order spectral shift Ken Dykema 1 , Anna Skripka ∗ Department of Mathematics, Texas A&M University, College Station, TX 77843-3368, USA Received 18 December 2008; accepted 25 February 2009 Available online 10 March 2009 Communicated by N. Kalton
Abstract We construct higher order spectral shift functions, extending the perturbation theory results of M.G. Krein [M.G. Krein, On a trace formula in perturbation theory, Mat. Sb. 33 (1953) 597–626 (in Russian)] and L.S. Koplienko [L.S. Koplienko, Trace formula for perturbations of nonnuclear type, Sibirsk. Mat. Zh. 25 (1984) 62–71 (in Russian); translation in: Trace formula for nontrace-class perturbations, Siberian Math. J. 25 (1984) 735–743] on representations for the remainders of the first and second order Taylor-type approximations of operator functions. The higher order spectral shift functions represent the remainders of higher order Taylor-type approximations; they can be expressed recursively via the lower order (in particular, Krein’s and Koplienko’s) ones. We also obtain higher order spectral averaging formulas generalizing the Birman–Solomyak spectral averaging formula. The results are obtained in the semi-finite von Neumann algebra setting, with the perturbation taken in the Hilbert–Schmidt class of the algebra. © 2009 Elsevier Inc. All rights reserved. Keywords: Spectral shift function; Taylor formula
* Corresponding author.
E-mail addresses: [email protected] (K. Dykema), [email protected] (A. Skripka).
1 Research supported in part by NSF grant DMS-0600814.
0022-1236/$ – see front matter © 2009 Elsevier Inc. All rights reserved. doi:10.1016/j.jfa.2009.02.019

1. Introduction

Let $\mathcal{H}$ be a separable Hilbert space and $B(\mathcal{H})$ the algebra of bounded linear operators on $\mathcal{H}$. Let $\mathcal{M}$ be a semi-finite von Neumann algebra acting on $\mathcal{H}$ and $\tau$ a semi-finite normal faithful trace on $\mathcal{M}$. We study how the value $f(H_0)$ of a function $f$ on a self-adjoint operator $H_0$ in $\mathcal{M}$ changes under a perturbation $V = V^* \in \mathcal{M}$ of the operator argument $H_0$. It is well known that for certain functions $f$, the value $f(H_0 + V)$ can be approximated by the Fréchet derivatives of the mapping $H = H^* \mapsto f(H)$ at the point $H_0$.

Theorem 1.1. (Cf. [24, Theorem 1.43, Corollary 1.45].) Let $f : \mathbb{R} \to \mathbb{C}$ be a bounded function such that the mapping $H \mapsto f(H)$ defined on self-adjoint elements of $B(\mathcal{H})$ is $p$ times continuously differentiable in the sense of Fréchet (and, hence, in the sense of Gâteaux). Let $H_0 = H_0^*$, $V = V^* \in B(\mathcal{H})$ and denote

$$ R_{p,H_0,V}(f) = f(H_0 + V) - \sum_{j = 0}^{p - 1} \frac{1}{j!} \frac{d^j}{dt^j}\bigg|_{t = 0} f(H_0 + tV). \tag{1} $$

Then

$$ R_{p,H_0,V}(f) = \frac{1}{(p - 1)!} \int_0^1 (1 - t)^{p - 1} \frac{d^p}{dt^p} f(H_0 + tV)\, dt \tag{2} $$

and

$$ \|R_{p,H_0,V}(f)\| = O(\|V\|^p). \tag{3} $$
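As a small illustration of (1) and (3) with non-commuting arguments, take $f(x) = x^3$ and $p = 2$: expanding $(H_0 + V)^3$ gives $R_{2,H_0,V}(f) = H_0 V^2 + V H_0 V + V^2 H_0 + V^3$, which is quadratic in $V$ to leading order. The Python sketch below checks this for $2 \times 2$ symmetric matrices. It is an illustrative finite-dimensional example only: the matrices `H0` and `V` and all helper names are assumptions made for the example, not objects from the paper.

```python
def matmul(a, b):
    """Product of two square matrices given as nested lists."""
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]


def add(*mats):
    """Entrywise sum of square matrices of equal size."""
    n = len(mats[0])
    return [[sum(m[i][j] for m in mats) for j in range(n)] for i in range(n)]


def scale(c, a):
    """Scalar multiple c * a."""
    return [[c * x for x in row] for row in a]


def cube(a):
    return matmul(a, matmul(a, a))


def remainder2_cubic(h0, v):
    """R_{2,H0,V}(f) for f(x) = x^3, computed directly from definition (1).

    The first Gateaux derivative of t -> (H0 + tV)^3 at t = 0 is
    H0^2 V + H0 V H0 + V H0^2.
    """
    first_derivative = add(matmul(matmul(h0, h0), v),
                           matmul(matmul(h0, v), h0),
                           matmul(v, matmul(h0, h0)))
    return add(cube(add(h0, v)), scale(-1, cube(h0)), scale(-1, first_derivative))


def frobenius(a):
    """Frobenius norm, used here as a convenient matrix norm."""
    return sum(x * x for row in a for x in row) ** 0.5


# Sample symmetric matrices that do not commute (illustrative choice).
H0 = [[1, 2], [2, -1]]
V = [[0, 1], [1, 3]]
```

Replacing `V` by `scale(s, V)` and dividing the norm of the remainder by `s ** 2` gives values that stabilise as `s` decreases, in line with the bound (3) for $p = 2$.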
Theorem 1.1 generalizes the Taylor approximation theorem for scalar functions. It was proved in [8] that for $f \in C^{2p}(\mathbb{R})$, the operator function $f$ is Fréchet differentiable $p$ times on $B(\mathcal{H})$, with the derivative written as an iterated operator integral. For $f \in W_p$ (the set of functions $f \in C^p(\mathbb{R})$ such that for each $j = 0, \dots, p$, the derivative $f^{(j)}$ equals the Fourier transform $\int_{\mathbb{R}} e^{it\lambda}\, d\mu_{f^{(j)}}(\lambda)$ of a finite Borel measure $\mu_{f^{(j)}}$) and a (possibly) unbounded $H_0$, the differentiability of $H \mapsto f(H)$ in the sense of Fréchet of order $p$ was established in [1]; in that case, the Gâteaux derivative $\frac{d^p}{dt^p} f(H_0 + tV)$ was represented as a Bochner-type multiple operator integral. For $f$ in the Besov class $B^1_{\infty 1}(\mathbb{R}) \cap B^p_{\infty 1}(\mathbb{R})$, it is known that the Gâteaux derivative of $f$ of order $p$ exists [23], but the bound (3) has not been proved.

In the scalar case ($\dim(\mathcal{H}) = 1$), we have that $\tau[R_{p,H_0,V}(f)]$ is a bounded functional on the space of functions $f^{(p)}$ and

$$ \left| \tau[R_{p,H_0,V}(f)] \right| \le \frac{\tau(|V|^p)}{p!} \|f^{(p)}\|_\infty. \tag{4} $$
In the case of a nontrivial $\mathcal{H}$ ($\dim(\mathcal{H}) > 1$), it is generally hard to separate the contribution of the perturbation $V$ to the estimate for the remainder (3) from the contribution of the scalar function $f^{(p)}$. One of the approaches to (4) is the estimate $\|R_{p,H_0,V}(f)\| \le C(H_0, V) \|f^{(2p)}\|_\infty$, for $f \in C^{2p}(\mathbb{R})$ [8], with $C(H_0, V)$ a constant depending on bounded self-adjoint operators $H_0$ and $V$. Another approach is the estimate

$$ \left| \tau[R_{p,H_0,V}(f)] \right| \le \frac{\tau(|V|^p)}{p!} \|\mu_{f^{(p)}}\| \tag{5} $$
(p) ), for τ the usual [10] (see [12] for an example when μf (p) can be replaced with f 1 ∗ ∗ trace, H0 = H0 an operator in H, V = V an operator in the Schatten p-class, and f ∈ Wp . If
$H_0 = H_0^*$ is affiliated with a semi-finite von Neumann algebra $\mathcal{M}$, $V = V^*$ is in the $\tau$-Schatten $p$-class of $\mathcal{M}$, and $f \in W_p$, then the remainder $R_{p,H_0,V}(f)$ belongs to the Schatten $p$-class of $\mathcal{M}$ as well and

$$ \left[ \tau\left( |R_{p,H_0,V}(f)|^p \right) \right]^{1/p} + \|R_{p,H_0,V}(f)\| \le \frac{\left( [\tau(|V|^p)]^{1/p} + \|V\| \right)^p}{p!} \|\mu_{f^{(p)}}\|, $$

see [1].

In the particular case of $p = 1$ or $p = 2$, the functional $\tau[R_{p,H_0,V}(f)]$ is bounded on the space of functions $f'$ or $f''$, respectively, and (4) holds. The measure representing the functional is absolutely continuous (with respect to Lebesgue's measure), with the density equal to Krein's spectral shift function $\xi_{H_0 + V, H_0}$ or Koplienko's spectral shift function $\eta_{H_0, H_0 + V}$, respectively. That is, we have

$$ \tau[R_{1,H_0,V}(f)] = \int_{\mathbb{R}} f'(t)\, \xi_{H_0 + V, H_0}(t)\, dt, \qquad \left| \tau[R_{1,H_0,V}(f)] \right| \le \tau(|V|)\, \|f'\|_\infty \tag{6} $$

and

$$ \tau[R_{2,H_0,V}(f)] = \int_{\mathbb{R}} f''(t)\, \eta_{H_0, H_0 + V}(t)\, dt, \qquad \left| \tau[R_{2,H_0,V}(f)] \right| \le \frac{\tau(|V|^2)}{2} \|f''\|_\infty. \tag{7} $$
Existence of $\xi_{H_0 + V, H_0}$, with $\tau(|V|) < \infty$, satisfying (6) for $f \in W_1$, was proved in the setting $\mathcal{M} = B(\mathcal{H})$ in [16] (cf. also [17]) and extended to the setting of an arbitrary semi-finite von Neumann algebra $\mathcal{M}$ in [2,7]. Moreover, when $\mathcal{M} = B(\mathcal{H})$, the trace formula in (6) is known to hold for $f \in B^1_{\infty 1}(\mathbb{R})$ [21]. In the setting $\mathcal{M} = B(\mathcal{H})$, existence of $\eta_{H_0, H_0 + V}$, with $V$ in the Hilbert–Schmidt class, satisfying (7) for bounded rational functions $f$ was proved in [15]. Later, it was proved in [22] that $\eta_{H_0, H_0 + V}$ satisfies the trace formula in (7) for functions $f$ in $B^1_{\infty 1}(\mathbb{R}) \cap B^p_{\infty 1}(\mathbb{R})$. When $V$ is in the trace class, Koplienko's spectral shift function can be written explicitly as

$$ \eta_{H_0, H_0 + V}(t) = -\int_{-\infty}^t \xi_{H_0 + V, H_0}(\lambda)\, d\lambda + \tau\left[ E_{H_0}((-\infty, t))\, V \right], \tag{8} $$

where $E_{H_0}$ is the spectral measure of $H_0$ [15]. In the context of a general $\mathcal{M}$, Koplienko's spectral shift function $\eta_{H_0, H_0 + V}$, with $\tau(|V|) < \infty$, and the representation (8) are discussed in [26].

For $p \ge 3$, $\mathcal{M} = B(\mathcal{H})$, and $\tau(|V|^p) < \infty$, the distribution $\tau[R_{p,H_0,V}(f)]$ is given by an $L^2$-function $\gamma_{p,H_0,V}$ satisfying

$$ \tau[R_{p,H_0,V}(f)] = \frac{\tau(V^p)}{p!} f^{(p)}(0) + \int_{\mathbb{R}} f^{(p + 1)}(t)\, \gamma_{p,H_0,V}(t)\, dt, $$

for all $f \in W_{p + 1}$ [10]. It was conjectured in [15] that there exists a Borel measure $\nu_p$ with the total variation bounded by $\frac{\tau(|V|^p)}{p!}$ such that

$$ \tau[R_{p,H_0,V}(f)] = \int_{\mathbb{R}} f^{(p)}(t)\, d\nu_p(t), \tag{9} $$
for bounded rational functions $f$. Unfortunately, the proof of (9) in [15] was based on the false claim that for $V$ in the Schatten $p$-class, $p > 2$, the set function defined on rectangles of $\mathbb{R}^{p + 1}$ by

$$ A_1 \times A_2 \times \cdots \times A_{p + 1} \mapsto \tau\left[ E(A_1) V E(A_2) V \cdots V E(A_{p + 1}) \right], \tag{10} $$

where $E(\cdot)$ is a spectral measure on $\mathbb{R}$ with values in $B(\mathcal{H})$, extends to a (countably additive) measure of bounded variation (see a counterexample in Section 4). When $V$ is in the Hilbert–Schmidt class of $\mathcal{M} = B(\mathcal{H})$, the set function in (10) does extend to a (countably additive) measure of bounded variation [5,20] and thus ideas of [15] can be applied to prove existence of a measure $\nu_p$ satisfying (9) for bounded rational functions (see Section 7). In this case, the total variation of $\nu_p$ is bounded by

$$ \|\nu_p\| \le \frac{\left( \tau(|V|^2) \right)^{p/2}}{p!}. $$
Adjusting techniques of [10] then extends (9) to the functions f ∈ Wp . For M a von Neumann algebra acting on an infinite-dimensional Hilbert space H, the set function in (10), with E(·) the spectral measure attaining its values in M and V ∈ M satisfying τ (|V |2 ) < ∞, may fail to extend to a finite measure on Rp+1 for p > 2 even if τ is finite (see a counterexample in Section 4). Therefore, the approach of [15] is not applicable in the proof of (9). When M is a general semi-finite von Neumann algebra, we prove (9) for p = 3 by relating R3,H0 ,V to R2,H0 ,V , which allows to reduce the problem to the case of p = 2 (see Sections 6 and 8). We also study the case when M is finite and H0 , V ∈ M are free with respect to the finite trace τ (which is assumed normalized so that τ (1) = 1). Freeness was introduced by Voiculescu (see, for example, [28]) and amounts to a specific prescription for the values of the mixed moments of H0 and V in terms of the individual moments of H0 and V . Free perturbations have appeared in the study of quite general operators in finite von Neumann algebras, for example in the seminal work of Haagerup and Schultz [13]. Assuming freeness, we show that for all p the set function in (10) extends to a finite measure on Rp+1 (see Section 4), from which (9) can be derived. Under the assumptions that we impose to prove existence of νp satisfying (9) (see discussion in the two preceding paragraphs), we also construct a function ηp , called the pth-order spectral shift function, such that dνp (t) = ηp (t) dt, provided H0 is bounded (see statements in Section 5 and proofs in Sections 7 and 8). The spectral shift function of order p admits the recursive representation
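$$ \eta_p(t) = -\int_{-\infty}^t \eta_{p - 1}(\lambda)\, d\lambda + \int_{\mathbb{R}^{p - 1}} \mathrm{spline}_{\lambda_1, \dots, \lambda_{p - 1}}(t)\, dm_{p - 1, H_0, V}(\lambda_1, \dots, \lambda_{p - 1}), \tag{11} $$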
where splineλ1 ,...,λp−1 is a piecewise polynomial of degree p − 2 with breakpoints λ1 , . . . , λp−1 and dmp−1,H0 ,V (λ1 , . . . , λp−1 ) is a measure on Rp−1 determined by p − 1 copies of the spectral
measure of $H_0$ intertwined with $p - 1$ copies of the perturbation $V$ (see Section 5 for the precise formula). As noted in Section 5, the function $\eta_2$ given by (11) coincides with the function $\eta_{H_0, H_0 + V}$ given by (8), provided $\tau(|V|) < \infty$. The techniques of [15] that prove existence of $\nu_p$ when $\mathcal{M} = B(\mathcal{H})$ do not give absolute continuity of $\nu_p$. We obtain $\eta_p$ by analyzing the Cauchy transform of the measure $\nu_p$ satisfying the trace formula (9) (see Section 6).

The approach of this paper, developed mainly for higher order spectral shift functions, contributes to the subject of Krein's spectral shift function as well. In 1972, using Theorem 1.1, (2) and the double operator integral representation for the derivative

$$ \frac{d}{dx} f(H_0 + xV) = \int_{\mathbb{R}^2} \Delta^{(1)}_{\lambda_1, \lambda_2}(f)\, E_{H_0 + xV}(d\lambda_1)\, V\, E_{H_0 + xV}(d\lambda_2), $$

M.Sh. Birman and M.Z. Solomyak [4] showed that

$$ \tau\left[ f(H_0 + V) - f(H_0) \right] = \int_{\mathbb{R}} f'(t) \int_0^1 \tau\left[ E_{H_0 + xV}(dt) \right] dx $$
(see [1, Theorem 6.3] for the analogous result in the context of von Neumann algebras), which along with Krein's trace formula

$$ \tau\left[ f(H_0 + V) - f(H_0) \right] = \int_{\mathbb{R}} f'(t)\, \xi_{H_0 + V, H_0}(t)\, dt $$

[2,7,16] implied the spectral averaging formula

$$ \int_0^1 \tau\left[ E_{H_0 + xV}(dt) \right] dx = \xi_{H_0 + V, H_0}(t)\, dt \tag{12} $$
(see [11,18,25,26] for generalizations and extensions). The operator $f(H_0 + V) - f(H_0)$ also admits a double operator integral representation

$$ f(H_0 + V) - f(H_0) = \int_{\mathbb{R}^2} \Delta^{(1)}_{\lambda_1, \lambda_2}(f)\, E_{H_0 + V}(d\lambda_1)\, V\, E_{H_0}(d\lambda_2). \tag{13} $$
A natural question raised by M.Sh. Birman (see, e.g., [3]) asks if it is possible to deduce existence of $\xi_{H_0 + V, H_0}$, or equivalently, absolute continuity of the measure $\int_0^1 \tau[E_{H_0 + xV}(dt)]\, dx$, directly from the double operator integral representation (13). For $\mathcal{M}$ a finite von Neumann algebra, we answer this question affirmatively and represent $\xi_{H_0 + V, H_0}$ as an integral of a basic spline straightforwardly from (13) (see Section 9). A general property of a basic spline is that it has the minimal support among all the splines with the same degree, smoothness, and domain properties (see, e.g., [9]). When $\dim(\mathcal{H}) < \infty$, higher order spectral shift functions can be written as integrals of basic splines as well (see Section 9).

By combining different representations for the remainder $\tau[R_{p,H_0,V}(f)]$ in the setting of $\mathcal{M} = B(\mathcal{H})$, we prove absolute continuity of the measure

$$ A \mapsto \int_0^1 (1 - x)^{p - 1}\, \tau\left[ \left( E_{H_0 + xV}(A)\, V \right)^p \right] dx $$
and derive higher order analogs of the spectral averaging formula (12) (see Section 10).

Basic technical tools of the paper are discussed in Sections 2–4, main results are stated in Section 5 and then proved in Sections 6–8, additional representations for spectral shift functions are obtained in Section 9, and the Birman–Solomyak spectral averaging formula is generalized in Section 10.

By saying "the standard setting" or "$\tau$ is the standard trace," we implicitly assume that $\mathcal{M} = B(\mathcal{H})$ and $\tau$ is the usual trace defined on the trace class operators of $B(\mathcal{H})$. Let $L^p(\mathcal{M}, \tau)$ denote the noncommutative $L^p$-space of $(\mathcal{M}, \tau)$ with the norm $\|V\|_p = \tau(|V|^p)^{1/p}$ and $\mathcal{L}^p(\mathcal{M}, \tau) = L^p(\mathcal{M}, \tau) \cap \mathcal{M}$ the Schatten $p$-class of $(\mathcal{M}, \tau)$. The Schatten $p$-class is equipped with the norm $\|\cdot\|_{p,\infty} = \|\cdot\|_p + \|\cdot\|$, where $\|\cdot\|$ is the operator norm. Throughout the paper, $H_0$ and $V$ denote self-adjoint operators in $\mathcal{M}$ or affiliated with $\mathcal{M}$; $V$ is mainly taken to be an element of $\mathcal{L}^p(\mathcal{M}, \tau)$. Let $\mathcal{R}$ denote the set of rational functions on $\mathbb{R}$ with nonreal poles, $\mathcal{R}_b$ the subset of $\mathcal{R}$ of bounded functions. The symbol $f_z$ is reserved for the function $\mathbb{R} \ni \lambda \mapsto \frac{1}{z - \lambda}$, where $z \in \mathbb{C} \setminus \mathbb{R}$.

2. Divided differences and splines

Definition 2.1. The divided difference of order $p$ is an operation on functions $f$ of one (real) variable, which we will usually call $\lambda$, defined recursively as follows:

$$ \Delta^{(0)}_{\lambda_1}(f) := f(\lambda_1), $$

$$ \Delta^{(p)}_{\lambda_1, \dots, \lambda_{p + 1}}(f) := \begin{cases} \dfrac{\Delta^{(p - 1)}_{\lambda_1, \dots, \lambda_{p - 1}, \lambda_p}(f) - \Delta^{(p - 1)}_{\lambda_1, \dots, \lambda_{p - 1}, \lambda_{p + 1}}(f)}{\lambda_p - \lambda_{p + 1}} & \text{if } \lambda_p \neq \lambda_{p + 1}, \\[2ex] \dfrac{\partial}{\partial t}\bigg|_{t = \lambda_p} \Delta^{(p - 1)}_{\lambda_1, \dots, \lambda_{p - 1}, t}(f) & \text{if } \lambda_p = \lambda_{p + 1}. \end{cases} $$
Below we state selected facts on the divided difference (see, e.g., [9]).

Proposition 2.2.

(1) (See [9, Section 4.7, (a)].) $\Delta^{(p)}_{\lambda_1,\dots,\lambda_{p+1}}(f)$ is symmetric in $\lambda_1,\lambda_2,\dots,\lambda_{p+1}$.

(2) (See [9, Section 4.7, (h)].) If all $\lambda_1,\lambda_2,\dots,\lambda_{p+1}$ are distinct, then
\[
\Delta^{(p)}_{\lambda_1,\dots,\lambda_{p+1}}(f) = \sum_{j=1}^{p+1}\frac{f(\lambda_j)}{\prod_{k\neq j}(\lambda_j-\lambda_k)}.
\]

(3) (See [9, Section 4.7].) For $f$ a sufficiently smooth function,
\[
\Delta^{(p)}_{\lambda_1,\dots,\lambda_{p+1}}(f) = \sum_{i\in I}\sum_{j=0}^{m(\lambda_i)-1} c_{ij}(\lambda_1,\dots,\lambda_{p+1})\,f^{(j)}(\lambda_i).
\]
Here I is the set of indices i for which λi are distinct, m(λi ) is the multiplicity of λi , and cij (λ1 , . . . , λp+1 ) ∈ C.
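(4) (See [9, Section 4.7].) $\Delta^{(p)}_{\lambda_1,\dots,\lambda_{p+1}}(a_p\lambda^p + a_{p-1}\lambda^{p-1} + \dots + a_1\lambda + a_0) = a_p$, where $a_0,a_1,\dots,a_p\in\mathbb{C}$.

(5) (See [9, Section 5.2, (2.3) and (2.6)].) The basic spline with the break points $\lambda_1,\dots,\lambda_{p+1}$, where at least two of the values are distinct, is defined by
\[
t\mapsto
\begin{cases}
\dfrac{1}{|\lambda_2-\lambda_1|}\,\chi_{(\min\{\lambda_1,\lambda_2\},\,\max\{\lambda_1,\lambda_2\})}(t) & \text{if } p=1,\\[2ex]
\Delta^{(p)}_{\lambda_1,\dots,\lambda_{p+1}}\big((\lambda-t)_+^{p-1}\big) & \text{if } p>1.
\end{cases}
\]
Here the truncated power is defined by
\[
x_+^k =
\begin{cases}
x^k & \text{if } x\ge 0,\\
0 & \text{if } x<0,
\end{cases}
\]
for $k\in\mathbb{N}$. The basic spline is non-negative, supported in $[\min\{\lambda_1,\dots,\lambda_{p+1}\},\max\{\lambda_1,\dots,\lambda_{p+1}\}]$ and integrable with the integral equal to $1/p$. (Often the basic spline is normalized so that its integral equals 1.)

(6) (See [9, Section 5.2, (2.2) and Section 4.7, (c)].) For $f\in C^p[\min\{\lambda_1,\dots,\lambda_{p+1}\},\max\{\lambda_1,\dots,\lambda_{p+1}\}]$,
\[
\Delta^{(p)}_{\lambda_1,\dots,\lambda_{p+1}}(f) =
\begin{cases}
\dfrac{1}{(p-1)!}\displaystyle\int_{-\infty}^{\infty} f^{(p)}(t)\,\Delta^{(p)}_{\lambda_1,\dots,\lambda_{p+1}}\big((\lambda-t)_+^{p-1}\big)\,dt & \text{if } \exists\, i_1,i_2 \text{ such that } \lambda_{i_1}\neq\lambda_{i_2},\\[2ex]
\dfrac{1}{p!}\,f^{(p)}(\lambda_1) & \text{if } \lambda_1=\lambda_2=\dots=\lambda_{p+1}.
\end{cases}
\]

(7) (See [9, Section 4.7, (l)].) Let $f\in C^p[a,b]$. Then, for $\{\lambda_1,\dots,\lambda_{p+1}\}\subset[a,b]$,
\[
\big|\Delta^{(p)}_{\lambda_1,\dots,\lambda_{p+1}}(f)\big| \le \frac{1}{p!}\max_{\lambda\in[a,b]}\big|f^{(p)}(\lambda)\big|.
\]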
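The facts collected in Proposition 2.2 can be sanity-checked on a small example. The following is only an illustrative sketch, not part of the paper: it implements the first branch of Definition 2.1 (pairwise distinct nodes) in exact rational arithmetic and compares it with items (2), (4) and (7). The helper names, the nodes and the test polynomials are assumptions chosen for illustration.

```python
from fractions import Fraction
from math import factorial

def divided_difference(f, nodes):
    """Recursive divided difference over pairwise distinct nodes
    (first branch of Definition 2.1); nodes is a list [l_1, ..., l_{p+1}]."""
    if len(nodes) == 1:
        return f(nodes[0])
    head, lam_p, lam_p1 = nodes[:-2], nodes[-2], nodes[-1]
    upper = divided_difference(f, head + [lam_p])
    lower = divided_difference(f, head + [lam_p1])
    return (upper - lower) / (lam_p - lam_p1)

def closed_form_sum(f, nodes):
    """Explicit sum of Proposition 2.2(2) for distinct nodes."""
    total = Fraction(0)
    for j, lam_j in enumerate(nodes):
        denom = Fraction(1)
        for k, lam_k in enumerate(nodes):
            if k != j:
                denom *= lam_j - lam_k
        total += f(lam_j) / denom
    return total

# Hypothetical data: p + 1 = 4 distinct nodes, so p = 3.
nodes = [Fraction(0), Fraction(1), Fraction(3), Fraction(7, 2)]
cubic = lambda x: 5 * x**3 - 2 * x**2 + x - 4    # leading coefficient 5
quartic = lambda x: x**4

order_three = divided_difference(cubic, nodes)            # item (4): leading coefficient
closed_form = closed_form_sum(cubic, nodes)               # item (2): explicit sum
reversed_nodes = divided_difference(cubic, nodes[::-1])   # item (1): symmetry

# Item (7) for f(x) = x^4 on [0, 7/2]: f'''(x) = 24 x, so the bound is 24 * (7/2) / 3!.
quartic_dd = divided_difference(quartic, nodes)
bound = Fraction(1, factorial(3)) * 24 * max(nodes)

print(order_three, closed_form, reversed_nodes, quartic_dd, bound)
```

Exact fractions are used so that the comparisons are equalities rather than floating-point approximations; the confluent branch of Definition 2.1 (repeated nodes) is not covered by this sketch.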
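Below we state useful properties of the divided difference to be used in the paper.

Lemma 2.3. For $z\in\mathbb{C}$, with $\operatorname{Im}(z)\neq 0$,
\[
\Delta^{(p)}_{\lambda_1,\dots,\lambda_{p+1}}\Big(\frac{1}{z-\lambda}\Big) = \prod_{j=1}^{p+1}\frac{1}{z-\lambda_j},
\]
where the divided difference is taken with respect to the real variable $\lambda$.

Proof. We notice that by Definition 2.1,
\[
\Delta^{(1)}_{\lambda_1,\lambda_2}\Big(\frac{1}{z-\lambda}\Big) =
\begin{cases}
\dfrac{1}{\lambda_1-\lambda_2}\Big(\dfrac{1}{z-\lambda_1}-\dfrac{1}{z-\lambda_2}\Big) = \dfrac{1}{(z-\lambda_1)(z-\lambda_2)} & \text{if } \lambda_1\neq\lambda_2,\\[2ex]
\dfrac{\partial}{\partial t}\Big|_{t=\lambda_1}\Big(\dfrac{1}{z-t}\Big) = \dfrac{1}{(z-\lambda_1)^2} = \dfrac{1}{(z-\lambda_1)(z-\lambda_2)} & \text{if } \lambda_1=\lambda_2.
\end{cases}
\]
By repeating the same argument, we obtain
\[
\Delta^{(2)}_{\lambda_1,\lambda_2,\lambda_3}\Big(\frac{1}{z-\lambda}\Big) = \frac{1}{(z-\lambda_1)(z-\lambda_2)(z-\lambda_3)}.
\]
The rest of the proof is accomplished by induction. □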
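The product formula of Lemma 2.3 is easy to test numerically for distinct nodes. The snippet below is a minimal sketch under that assumption; the value of z, the nodes and the helper name are hypothetical and only serve as an illustration.

```python
def divided_difference(f, nodes):
    """Recursive divided difference over pairwise distinct nodes (Definition 2.1)."""
    if len(nodes) == 1:
        return f(nodes[0])
    head, lam_p, lam_p1 = nodes[:-2], nodes[-2], nodes[-1]
    upper = divided_difference(f, head + [lam_p])
    lower = divided_difference(f, head + [lam_p1])
    return (upper - lower) / (lam_p - lam_p1)

z = complex(0.3, 1.7)                 # any z with nonzero imaginary part
nodes = [-1.0, 0.25, 0.5, 2.0]        # p + 1 = 4 distinct real nodes

# Left-hand side: divided difference of lambda -> 1 / (z - lambda).
lhs = divided_difference(lambda lam: 1.0 / (z - lam), nodes)

# Right-hand side: product of the resolvent factors 1 / (z - lambda_j).
rhs = complex(1.0)
for lam in nodes:
    rhs /= z - lam

gap = abs(lhs - rhs)
print(lhs, rhs, gap)
```

The two sides agree up to rounding error; coinciding nodes would require the derivative branch of Definition 2.1, which this sketch does not implement.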
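Lemma 2.4. Let $D$ be a domain in $\mathbb{C}$ and $f$ a function continuously differentiable sufficiently many times on $D\times\mathbb{R}$. Then for $p\in\mathbb{N}$,

(i)
\[
\int \Delta^{(p)}_{\lambda_1,\dots,\lambda_{p+1}}\big(f(z,\lambda)\big)\,dz = \Delta^{(p)}_{\lambda_1,\dots,\lambda_{p+1}}\Big(\int f(z,\lambda)\,dz\Big),
\]
with an appropriate choice of the constant of integration on the left-hand side;

(ii)
\[
\lim_{z\to z_0}\Delta^{(p)}_{\lambda_1,\dots,\lambda_{p+1}}\big(f(z,\lambda)\big) = \Delta^{(p)}_{\lambda_1,\dots,\lambda_{p+1}}\Big(\lim_{z\to z_0} f(z,\lambda)\Big), \quad z_0\in D;
\]

(iii)
\[
\frac{\partial}{\partial z}\Delta^{(p)}_{\lambda_1,\dots,\lambda_{p+1}}\big(f(z,\lambda)\big) = \Delta^{(p)}_{\lambda_1,\dots,\lambda_{p+1}}\Big(\frac{\partial}{\partial z} f(z,\lambda)\Big),
\]
where the divided difference is taken with respect to the variable $\lambda$.

Proof. Follows immediately from Proposition 2.2(3). □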
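Corollary 2.5. For $p,k\in\mathbb{N}$,
\[
\frac{(-1)^k}{k!}\frac{\partial^k}{\partial z^k}\Big(\prod_{j=1}^{p+1}\frac{1}{z-\lambda_j}\Big) = \Delta^{(p)}_{\lambda_1,\dots,\lambda_{p+1}}\Big(\frac{1}{(z-\lambda)^{k+1}}\Big).
\]

Proof. Follows immediately from Lemma 2.3 and Lemma 2.4. □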
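For k = 1 the identity of Corollary 2.5 can be checked by hand-differentiating the product: the derivative of the product in z equals minus the product times the sum of the factors 1/(z - λ_j). The sketch below compares that expression with the divided difference of 1/(z - λ)^2 for distinct nodes; the numerical values and helper name are illustrative assumptions only.

```python
def divided_difference(f, nodes):
    """Recursive divided difference over pairwise distinct nodes (Definition 2.1)."""
    if len(nodes) == 1:
        return f(nodes[0])
    head, lam_p, lam_p1 = nodes[:-2], nodes[-2], nodes[-1]
    upper = divided_difference(f, head + [lam_p])
    lower = divided_difference(f, head + [lam_p1])
    return (upper - lower) / (lam_p - lam_p1)

z = complex(-0.4, 1.2)
nodes = [-2.0, -0.5, 1.0]             # p + 1 = 3 distinct real nodes

# Left-hand side for k = 1: -(d/dz) prod_j 1/(z - l_j) = prod_j 1/(z - l_j) * sum_j 1/(z - l_j).
product = complex(1.0)
inverse_sum = complex(0.0)
for lam in nodes:
    product /= z - lam
    inverse_sum += 1.0 / (z - lam)
lhs = product * inverse_sum

# Right-hand side: divided difference of lambda -> 1 / (z - lambda)^2.
rhs = divided_difference(lambda lam: 1.0 / (z - lam) ** 2, nodes)

gap = abs(lhs - rhs)
print(lhs, rhs, gap)
```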
3. Remainders of Taylor-type approximations

In this section, we collect technical facts on derivatives of operator functions and remainders of the Taylor-type approximations. The following lemma is routine.

Lemma 3.1. Let $H_0 = H_0^*$ be an operator in $H$ and $V = V^*\in B(H)$. Let $H_x = H_0 + xV$, with $x\in\mathbb{R}$. Then,
\[
\frac{d^p}{dx^p}(zI-H_x)^{-k} = p!\sum_{\substack{1\le k_0,k_1,\dots,k_p\le k\\ k_0+k_1+\dots+k_p=k+p}} (zI-H_x)^{-k_0}V(zI-H_x)^{-k_1}V\dots V(zI-H_x)^{-k_p}.
\]
If, in addition, $H_0$ is bounded, then
\[
\frac{d^p}{dx^p}H_x^k = p!\sum_{\substack{0\le k_0,k_1,\dots,k_p\\ k_0+k_1+\dots+k_p=k-p}} H_x^{k_0}VH_x^{k_1}V\dots VH_x^{k_p}, \quad p\le k.
\]
Lemma 3.2. Let $H_0 = H_0^*$ be an operator affiliated with $M$ and $V = V^*\in\mathcal{L}_2(M,\tau)$. Then,
\[
\frac{(-1)^k}{k!}\frac{d^k}{dz^k}\tau\big[(zI-H_0)^{-1}V(zI-H_0)^{-1}V(zI-H_0)^{-1}\big] = \frac12\,\tau\Big[\frac{d^2}{dx^2}\Big|_{x=0}(zI-H_0-xV)^{-k-1}\Big]. \tag{14}
\]

Proof. Firstly, we compute the left-hand side of (14). By cyclicity of the trace,
\[
\tau\big[(zI-H_0)^{-1}V(zI-H_0)^{-1}V(zI-H_0)^{-1}\big] = \tau\big[(zI-H_0)^{-2}V(zI-H_0)^{-1}V\big].
\]
By continuity of the trace in the norm $\|\cdot\|_{1,\infty}$,
\[
\frac{d}{dz}\tau\big[(zI-H_0)^{-2}V(zI-H_0)^{-1}V\big] = \tau\Big[\frac{d}{dz}\big((zI-H_0)^{-2}V(zI-H_0)^{-1}V\big)\Big].
\]
It is easy to see that
\[
\begin{aligned}
\frac{d^k}{dz^k}\big((zI-H_0)^{-2}V(zI-H_0)^{-1}V\big)
&= \sum_{j=0}^{k}\frac{k!}{j!(k-j)!}(-1)^j(j+1)!\,(zI-H_0)^{-2-j}V\,(-1)^{k-j}(k-j)!\,(zI-H_0)^{-1-(k-j)}V\\
&= (-1)^k k!\sum_{j=0}^{k}(j+1)(zI-H_0)^{-2-j}V(zI-H_0)^{-1-(k-j)}V.
\end{aligned}
\tag{15}
\]
Now we compute the right-hand side of (14). Let $H_x = H_0 + xV$. It is routine to see that
\[
\frac{d}{dx}(zI-H_x)^{-(k+1)} = \sum_{i=1}^{k+1}(zI-H_x)^{-i}V(zI-H_x)^{-(k+2-i)},
\]
and hence,
\[
\begin{aligned}
\frac{d^2}{dx^2}\Big|_{x=0}(zI-H_x)^{-(k+1)}
&= 2\sum_{i=1}^{k+1}\frac{d}{dx}\Big|_{x=0}\big((zI-H_x)^{-i}\big)V(zI-H_0)^{-(k+2-i)}\\
&= 2\sum_{i=1}^{k+1}\sum_{j=0}^{i-1}(zI-H_0)^{-(i-j)}V(zI-H_0)^{-1-j}V(zI-H_0)^{-(k+2-i)}.
\end{aligned}
\]
Multiplying by 1/2 and evaluating the trace in the latter expression provides
\[
\begin{aligned}
\frac12\,\tau\Big[\frac{d^2}{dx^2}\Big|_{x=0}(zI-H_0-xV)^{-k-1}\Big]
&= \sum_{i=1}^{k+1}\sum_{j=0}^{i-1}\tau\big[(zI-H_0)^{-1-j}V(zI-H_0)^{-(k+2-j)}V\big]\\
&= \sum_{j=0}^{k}\sum_{i=j+1}^{k+1}\tau\big[(zI-H_0)^{-1-j}V(zI-H_0)^{-2-(k-j)}V\big]\\
&= \sum_{j=0}^{k}(k+1-j)\,\tau\big[(zI-H_0)^{-1-j}V(zI-H_0)^{-2-(k-j)}V\big].
\end{aligned}
\]
By changing the index of summation $i = k-j$ in the latter expression and by cyclicity of the trace, we obtain
\[
\sum_{i=0}^{k}(i+1)\,\tau\big[(zI-H_0)^{-2-i}V(zI-H_0)^{-1-(k-i)}V\big]. \tag{16}
\]
Comparing (15) and (16) completes the proof of the lemma. □
As a particular case of results of [23] we have the lemma below.

Lemma 3.3. Let $H_0 = H_0^*$ be an operator in $H$ and $V = V^*\in B(H)$. Denote $H_x = H_0 + xV$. For $f\in\mathcal{R}_b$,
\[
\frac{d^p}{dx^p}f(H_0+xV) = p!\int_{\mathbb{R}}\int_{\mathbb{R}}\dots\int_{\mathbb{R}}\Delta^{(p)}_{\lambda_1,\dots,\lambda_{p+1}}(f)\,E_{H_x}(d\lambda_1)VE_{H_x}(d\lambda_2)V\dots VE_{H_x}(d\lambda_{p+1}). \tag{17}
\]
If, in addition, $H_0$ is bounded, then (17) holds for $f\in\mathcal{R}$.

Remark 3.4. It was proved in [8, Theorem 2.2] that for $H_0$ a bounded operator, $\frac{d^p}{dt^p}f(H_0+tV)$ is defined when $f\in C^{2p}(\mathbb{R})$ and the derivative can be computed as an iterated operator integral (17). It was proved later in [23] that the Gâteaux derivative $\frac{d^p}{dt^p}f(H_0+tV)$ is defined for $f$ in the intersection of the Besov classes $B^p_{\infty 1}(\mathbb{R})\cap B^1_{\infty 1}(\mathbb{R})$ and can be computed as a Bochner-type multiple operator integral.
The following lemma is a straightforward consequence of Lemma 3.1.

Lemma 3.5. Let $H_0 = H_0^*$ be an operator in $H$ and $V = V^*\in B(H)$. Then for $f$ a polynomial of degree $m$,
\[
R_{p,H_0,V}(f) = \sum_{\substack{k_0,k_1,\dots,k_p\ge 0\\ k_0+k_1+\dots+k_p=m-p}} a_{k_0,k_1,\dots,k_p}\,H_0^{k_0}VH_0^{k_1}V\dots VH_0^{k_p},
\]
with $a_{k_0,k_1,\dots,k_p}$ numbers.

Lemma 3.6. Let $H_0 = H_0^*$ be an operator in $H$ and $V = V^*\in B(H)$. Then,
\[
R_{p,H_0,V}(f_z) = (zI-H_0-V)^{-1} - \sum_{j=0}^{p-1}(zI-H_0)^{-1}\big(V(zI-H_0)^{-1}\big)^j \tag{18}
\]
\[
\phantom{R_{p,H_0,V}(f_z)} = (zI-H_0-V)^{-1}\big(V(zI-H_0)^{-1}\big)^p. \tag{19}
\]

Proof. By Lemma 3.1,
\[
\frac{d^j}{dx^j}\Big|_{x=x_0}(zI-H_0-xV)^{-1} = j!\,(zI-H_0-x_0V)^{-1}\big(V(zI-H_0-x_0V)^{-1}\big)^j,
\]
which gives (18). To derive (19) from (18), we use repeatedly the resolvent identity
\[
(zI-H_0-V)^{-1} - (zI-H_0)^{-1} = (zI-H_0-V)^{-1}V(zI-H_0)^{-1}.
\]
By combining $(zI-H_0-V)^{-1}$ and the first summand of
\[
\sum_{j=0}^{p-1}(zI-H_0)^{-1}\big(V(zI-H_0)^{-1}\big)^j,
\]
we obtain that (provided $p>1$)
\[
(zI-H_0-V)^{-1} - \sum_{j=0}^{p-1}(zI-H_0)^{-1}\big(V(zI-H_0)^{-1}\big)^j
= (zI-H_0-V)^{-1}V(zI-H_0)^{-1} - \sum_{j=1}^{p-1}(zI-H_0)^{-1}\big(V(zI-H_0)^{-1}\big)^j.
\]
Repeating the reasoning above sufficiently many times completes the proof of (19). □
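The equality of (18) and (19) can be illustrated in the simplest matrix setting. The following is a rough sketch, assuming 2 × 2 real symmetric matrices stored as nested lists; the matrices, the point z and the helper names are hypothetical and not taken from the paper.

```python
def matmul(a, b):
    """Product of two 2x2 matrices stored as nested lists."""
    return [[sum(a[i][k] * b[k][j] for k in range(2)) for j in range(2)] for i in range(2)]

def matadd(a, b):
    return [[a[i][j] + b[i][j] for j in range(2)] for i in range(2)]

def matsub(a, b):
    return [[a[i][j] - b[i][j] for j in range(2)] for i in range(2)]

def resolvent(z, h):
    """(zI - h)^{-1} for a 2x2 matrix h, via the explicit adjugate formula."""
    a, b = z - h[0][0], -h[0][1]
    c, d = -h[1][0], z - h[1][1]
    det = a * d - b * c
    return [[d / det, -b / det], [-c / det, a / det]]

# Hypothetical self-adjoint data and a non-real spectral parameter.
H0 = [[1.0, 0.5], [0.5, -2.0]]
V = [[0.3, -0.2], [-0.2, 0.1]]
z = complex(0.4, 1.5)
p = 3

R0 = resolvent(z, H0)                 # (zI - H0)^{-1}
R1 = resolvent(z, matadd(H0, V))      # (zI - H0 - V)^{-1}
VR0 = matmul(V, R0)

# Accumulate sum_{j=0}^{p-1} R0 (V R0)^j and, as a by-product, (V R0)^p.
power = [[1.0, 0.0], [0.0, 1.0]]
taylor_sum = [[0.0, 0.0], [0.0, 0.0]]
for _ in range(p):
    taylor_sum = matadd(taylor_sum, matmul(R0, power))
    power = matmul(power, VR0)

form_18 = matsub(R1, taylor_sum)      # right-hand side of (18)
form_19 = matmul(R1, power)           # right-hand side of (19)

max_gap = max(abs(form_18[i][j] - form_19[i][j]) for i in range(2) for j in range(2))
print(max_gap)
```

The agreement is a direct consequence of iterating the resolvent identity, exactly as in the proof; the finite-dimensional setting is chosen only to keep the check self-contained.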
From (18) we have the following relation between the remainders of different order.

Lemma 3.7. Let $H_0 = H_0^*$ be an operator in $H$ and $V = V^*\in B(H)$. Then
\[
R_{p+1,H_0,V}(f_z) = R_{p,H_0,V}(f_z) - (zI-H_0)^{-1}\big(V(zI-H_0)^{-1}\big)^p.
\]

The following lemma is a straightforward generalization of [10, Lemma 2.6].

Lemma 3.8. Let $H_0 = H_0^*$, $V = V^*\in B(H)$, and $\Gamma = \{\lambda: |\lambda| = 1+\|H_0\|+\|V\|\}$. Then, for every function $f$ analytic in a neighborhood of $D = \{\lambda: |\lambda|\le 1+\|H_0\|+\|V\|\}$,
\[
R_{p,H_0,V}(f) = \frac{1}{2\pi i}\int_\Gamma f(\lambda)(\lambda I-H_0)^{-1}\big(V(\lambda I-H_0)^{-1}\big)^p\big(I-V(\lambda I-H_0)^{-1}\big)^{-1}\,d\lambda.
\]

Let $(S,\nu)$ be a measure space and let $L^{so^*}_\infty(S,\nu,\mathcal{L}_1(M,\tau))$ denote the $*$-algebra of $\|\cdot\|$-bounded $so^*$-measurable functions $F: S\to\mathcal{L}_1(M,\tau)$ [19].

Proposition 3.9. (See [1, Lemma 3.10].) Let $F$ be a function in $L^{so^*}_\infty(S,\nu,\mathcal{L}_1(M,\tau))$ uniformly $\mathcal{L}_1(M,\tau)$-bounded. Then $\int_S F(s)\,d\nu(s)\in\mathcal{L}_1(M,\tau)$, $\tau(F(\cdot))$ is measurable and
\[
\tau\Big(\int_S F(s)\,d\nu(s)\Big) = \int_S\tau\big(F(s)\big)\,d\nu(s).
\]
Similarly to [1, Lemma 4.5, Theorem 5.7], we have the following differentiation formula for an operator function $f(\cdot)$, with $f\in W_p$.

Lemma 3.10. Let $H_0 = H_0^*$ be an operator affiliated with $M$ and $V = V^*\in\mathcal{L}_p(M,\tau)$. Let $H_x = H_0 + xV$, with $x\in\mathbb{R}$. Then, for $f\in W_p$ given by $f(\lambda) = \int_{\mathbb{R}}e^{it\lambda}\,d\mu_f(t)$, the function $f(H_x)$ is $p$ times Fréchet differentiable in the norm $\|\cdot\|_{1,\infty}$ and the derivative equals the Bochner-type multiple operator integral
\[
\frac{d^p}{dx^p}f(H_x) = p!\int_{\Pi^{(p)}}e^{i(s_0-s_1)H_x}V\dots Ve^{i(s_{p-1}-s_p)H_x}Ve^{is_pH_x}\,d\sigma_f^{(p)}(s_0,\dots,s_p).
\]
Here
\[
\Pi^{(p)} = \big\{(s_0,s_1,\dots,s_p)\in\mathbb{R}^{p+1}: |s_p|\le\dots\le|s_1|\le|s_0|,\ \operatorname{sign}(s_0)=\dots=\operatorname{sign}(s_p)\big\}
\]
and $d\sigma_f^{(p)}(s_0,s_1,\dots,s_p) = i^p\mu_f(ds_0)\,ds_1\dots ds_p$. In particular, for $t\in\mathbb{R}$,
\[
\frac{d^p}{dx^p}e^{itH_x} = i^p p!\int_{\Pi^{(p-1)}}e^{i(t-s_1)H_x}V\dots Ve^{i(s_{p-1}-s_p)H_x}Ve^{is_pH_x}\,ds_p\dots ds_1.
\]

By applying Proposition 3.9 and Lemma 3.10, we obtain the following

Lemma 3.11. Let $H_0 = H_0^*$ be an operator affiliated with $M$ and $V = V^*\in\mathcal{L}_p(M,\tau)$. Let $H_x = H_0 + xV$, with $x\in\mathbb{R}$. Then for $f\in W_p$, we have $\frac{d^p}{dx^p}f(H_x)\in\mathcal{L}_1(M,\tau)$,
\[
\tau\Big[\frac{d^p}{dx^p}f(H_x)\Big] = p!\int_{\Pi^{(p)}}\tau\big[e^{i(s_0-s_1)H_x}V\dots Ve^{i(s_{p-1}-s_p)H_x}Ve^{is_pH_x}\big]\,d\sigma_f^{(p)}(s_0,\dots,s_p)
\]
and
\[
\Big\|\frac{d^p}{dx^p}f(H_x)\Big\|_1\le\|V\|_p^p\,\|\mu_{f^{(p)}}\|.
\]

Corollary 3.12. Under the assumptions of Lemma 3.11,
\[
\tau\Big[\frac{d^p}{dx^p}f(H_x)\Big] = \int_{\mathbb{R}}\tau\Big[\frac{d^p}{dx^p}e^{itH_x}\Big]\,d\mu_f(t).
\]
Proof. The claim is proved by reducing the double integral to an iterated one and applying Lemmas 3.10 and 3.11. □

Remark 3.13. By combining Lemma 3.11 and Theorem 1.1 (2), one obtains the estimate (5).

4. Multiple spectral measures

We will need the fact that certain finitely additive “multiple spectral measures” extend to countably additive measures.

Theorem 4.1. Let $2\le p\in\mathbb{N}$ and let $E_1,E_2,\dots,E_p$ be projection-valued Borel measures from $\mathbb{R}$ into $M$. Suppose that $V_1,\dots,V_p$ belong to $\mathcal{L}_2(M,\tau)$. Assume that either $\tau$ is the standard trace or $p=2$. Then there is a unique (complex) Borel measure $m$ on $\mathbb{R}^p$ with total variation not exceeding the product $\|V_1\|_2\|V_2\|_2\cdots\|V_p\|_2$, whose value on rectangles is given by
\[
m(A_1\times A_2\times\dots\times A_p) = \tau\big[E_1(A_1)V_1E_2(A_2)V_2\dots V_{p-1}E_p(A_p)V_p\big]
\]
for all Borel subsets $A_1,A_2,\dots,A_p$ of $\mathbb{R}$.
Proof. It is enough (see, e.g., [14, Theorem 2.12] for $p=2$) to prove that the variation of the set function $m$ on the rectangles of $\mathbb{R}^p$ is bounded by $\|V_1\|_2\|V_2\|_2\cdots\|V_p\|_2$, which can be accomplished completely analogously to the proof of [20, Theorem 1] (see also [5]). □

Remark 4.2. For $\tau$ the standard trace, the bound for the total variation in Theorem 4.1 was proved in [20, Theorem 1]. Theorem 4.1 with $\tau$ standard was also obtained in [5]. The proof in [5] is based on the facts that a Hilbert–Schmidt operator can be approximated by finite-rank operators in the norm $\|\cdot\|_2$ and that for rank-one perturbations $V_1,\dots,V_p$ and $\tau$ the standard trace, the set function $m$ decomposes into a product of scalar measures. It is classical that a direct product of countably additive measures always has a countably-additive extension to the σ-algebra generated by the direct product of the σ-algebras involved. The argument of [5] cannot be directly extended to the case of a general trace. For a general trace $\tau$, the set function $m$ is known to be of bounded variation only if $p=2$. Technically, this constraint is explained by the fact that in general $\|\cdot\|_p$ is not dominated by $\|\cdot\|_2$, as distinct from the particular case of the standard trace $\tau$. A counterexample constructed further in this section demonstrates that $p=2$ is not only a technical constraint.

Corollary 4.3. Let $2\le p\in\mathbb{N}$ and let $E_1,E_2,\dots,E_p$ be projection-valued Borel measures from $\mathbb{R}$ to $M$. Suppose that $V_1,\dots,V_p$ belong to $\mathcal{L}_2(M,\tau)$. Assume that either $\tau$ is the standard trace or $p=2$. Then there is a unique (complex) Borel measure $m_1$ on $\mathbb{R}^{p+1}$ with total variation not exceeding the product $\|V_1\|_2\|V_2\|_2\cdots\|V_p\|_2$, whose value on rectangles of $\mathbb{R}^{p+1}$ is given by
\[
m_1(A_1\times A_2\times\dots\times A_p\times A_{p+1}) = \tau\big[E_1(A_1)V_1E_2(A_2)V_2\dots V_{p-1}E_p(A_p)V_pE_1(A_{p+1})\big],
\]
for all Borel subsets $A_1,A_2,\dots,A_p,A_{p+1}$ of $\mathbb{R}$.

Proof. It is straightforward to see that
\[
m_1(A_1\times A_2\times\dots\times A_p\times A_{p+1}) = \tau\big[E_1(A_1\cap A_{p+1})V_1E_2(A_2)V_2\dots V_{p-1}E_p(A_p)V_p\big].
\]
By repeating the argument of [5,20], one can see that the total variation of the set function $m_1$ is bounded on the rectangles of $\mathbb{R}^{p+1}$ by $\|V_1\|_2\|V_2\|_2\cdots\|V_p\|_2$. Thus, $m_1$ extends to a unique complex Borel measure on $\mathbb{R}^{p+1}$ with variation bounded by $\|V_1\|_2\|V_2\|_2\cdots\|V_p\|_2$. □

Corollary 4.4. Let $2\le p\in\mathbb{N}$, $E_1,\dots,E_p$ projection-valued Borel measures from $\mathbb{R}$ into $M$, and $\tau$ a finite trace. Suppose that $V_1,\dots,V_{p-1}$ belong to $M$. Assume that either $\tau$ is the standard trace or $p=2$. Then there is a unique complex Borel measure $m_2$ on $\mathbb{R}^p$ with total variation not exceeding $\|V_1\|_2\|V_2\|_2\cdots\|V_{p-1}\|_2\,\tau(I)^{1/2}$, whose value on rectangles is given by
\[
m_2(A_1\times A_2\times\dots\times A_p) = \tau\big[E_1(A_1)V_1E_2(A_2)V_2\dots V_{p-1}E_p(A_p)\big]
\]
for all Borel subsets $A_1,A_2,\dots,A_p$ of $\mathbb{R}$.

Proof. It is an immediate consequence of Theorem 4.1 applied to $V_p = I$. □
In the sequel, we will work with the set functions
\[
m_{p,H_0,V}(A_1\times A_2\times\dots\times A_p) = \tau\big[E_{H_0}(A_1)VE_{H_0}(A_2)V\dots VE_{H_0}(A_p)V\big],
\]
\[
m^{(1)}_{p,H_0,V}(A_1\times A_2\times\dots\times A_p\times A_{p+1}) = \tau\big[E_{H_0}(A_1)VE_{H_0}(A_2)V\dots VE_{H_0}(A_p)VE_{H_0}(A_{p+1})\big],
\]
\[
m^{(2)}_{p,H_0,V}(A_1\times A_2\times\dots\times A_p\times A_{p+1}) = \tau\big[E_{H_0+V}(A_1)VE_{H_0}(A_2)V\dots VE_{H_0}(A_p)VE_{H_0}(A_{p+1})\big],
\]
and their countably-additive extensions (when they exist). Here $A_j$ are measurable subsets of $\mathbb{R}$, $H_0 = H_0^*$ is affiliated with $M$, and $V = V^*\in\mathcal{L}_2(M,\tau)$.

In the next result, freeness of $(zI-H_0)^{-1}$ and $V$ means freeness of the algebra generated by the spectral projections of $H_0$ and the unital algebra generated by $V$.

Theorem 4.5. Let $\tau$ be a finite trace normalized by $\tau(I)=1$ and let $H_0 = H_0^*$ be affiliated with $M$ and $V = V^*\in M$. Assume that $(zI-H_0)^{-1}$ and $V$ are free. Then the set functions $m_{p,H_0,V}$ and $m^{(1)}_{p,H_0,V}$ extend to countably additive measures of bounded variation.
Proof. We prove the claim for the function $m_{p,H_0,V}$; the case of $m^{(1)}_{p,H_0,V}$ is completely analogous. Using the moment-cumulant formula (see [27, Theorem 2.17]), and that $\prod_i E_{H_0}(A_i) = E_{H_0}(\bigcap_i A_i)$, we have
\[
m_{p,H_0,V}(A_1\times\dots\times A_p) = \tau\big[E_{H_0}(A_1)V\dots E_{H_0}(A_p)V\big]
= \sum_{\pi=\{B_1,\dots,B_\ell\}\in NC(p)} k_{K(\pi)}[V,\dots,V]\prod_{j=1}^{\ell}\tau\Big[E_{H_0}\Big(\bigcap_{i\in B_j}A_i\Big)\Big], \tag{20}
\]
where $NC(p)$ is the lattice of all noncrossing partitions of $\{1,\dots,p\}$ and where $k_{K(\pi)}[V,\dots,V]$ is the product of cumulants of $V$, associated to the block structure of the Kreweras complement $K(\pi)$ of $\pi$; thus, $k_{K(\pi)}[V,\dots,V]$ is equal to a polynomial (that depends on $\pi$) in $p$ variables, evaluated at $\tau(V),\tau(V^2),\dots,\tau(V^p)$. Given $\pi=\{B_1,\dots,B_\ell\}\in NC(p)$, the measure
\[
\gamma_{p,\pi}: A_1\times\dots\times A_p\mapsto\prod_{j=1}^{\ell}\tau\Big[E_{H_0}\Big(\bigcap_{i\in B_j}A_i\Big)\Big] \tag{21}
\]
is the push-forward of the $\ell$-fold product $\times_1^{\ell}(\tau\circ E_{H_0})$ of the spectral distribution measure of $H_0$ (with respect to $\tau$) under the mapping of $\mathbb{R}^{\ell}$ onto the product of diagonals in $\mathbb{R}^p$ according to the block structure $B_1,\dots,B_\ell$. Each such push-forward is a probability measure. Thus, we see that $m_{p,H_0,V}$ is a linear combination of probability measures, and has finite total variation. □

In some cases used in the paper, the measure m2 is known to be real-valued and the measure m non-negative.
Lemma 4.6. Let $\tau$ be a finite trace. Let $H_0 = H_0^*$ be affiliated with $M$ and $V = V^*\in M$. Then the measure $m^{(2)}_{1,H_0,V}$ is real-valued.

Proof. For arbitrary measurable subsets $A_1$ and $A_2$ of $\mathbb{R}$,
\[
\begin{aligned}
\tau\big[E_{H_0+V}(A_1)VE_{H_0}(A_2)\big]
&= \tau\big[E_{H_0+V}(A_1)(H_0+V)E_{H_0}(A_2)\big] - \tau\big[E_{H_0+V}(A_1)H_0E_{H_0}(A_2)\big]\\
&= \tau\big[E_{H_0}(A_2)\big(E_{H_0+V}(A_1)(H_0+V)\big)E_{H_0}(A_2)\big] - \tau\big[E_{H_0+V}(A_1)\big(H_0E_{H_0}(A_2)\big)E_{H_0+V}(A_1)\big],
\end{aligned}
\]
where the operators
\[
E_{H_0}(A_2)\big(E_{H_0+V}(A_1)(H_0+V)\big)E_{H_0}(A_2) \quad\text{and}\quad E_{H_0+V}(A_1)\big(H_0E_{H_0}(A_2)\big)E_{H_0+V}(A_1)
\]
are self-adjoint. Therefore, $m^{(2)}_{1,H_0,V}(A_1\times A_2)\in\mathbb{R}$, and the extension of $m^{(2)}_{1,H_0,V}$ to the Borel subsets of $\mathbb{R}^2$ is real-valued. □

Lemma 4.7. Let $H_0 = H_0^*$ be affiliated with $M$ and $V = V^*\in\mathcal{L}_2(M,\tau)$. Then the measure $m_{2,H_0,V}$ is non-negative.

Proof. For arbitrary measurable subsets $A_1$ and $A_2$ of $\mathbb{R}$,
\[
\tau\big[E_{H_0}(A_1)VE_{H_0}(A_2)V\big] = \tau\big[E_{H_0}(A_1)VE_{H_0}(A_2)VE_{H_0}(A_1)\big]\ge 0,
\]
since
\[
\big\langle E_{H_0}(A_1)VE_{H_0}(A_2)VE_{H_0}(A_1)f,f\big\rangle = \big\langle E_{H_0}(A_2)\big(VE_{H_0}(A_1)f\big),\big(VE_{H_0}(A_1)f\big)\big\rangle\ge 0,
\]
for any $f\in H$. □
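Lemma 4.8. Let $\tau$ be a finite trace. Let $H_0 = H_0^*$ be an operator affiliated with $M$ and $V = V^*\in M$. Then $m^{(2)}_{k,H_0,V}$ has no atoms on the diagonal $D_{k+1} = \{(\lambda_1,\lambda_2,\dots,\lambda_{k+1}): \lambda_1=\lambda_2=\dots=\lambda_{k+1}\in\mathbb{R}\}$ of $\mathbb{R}^{k+1}$.

Proof. By definition of the measure $m^{(2)}_{k,H_0,V}$,
\[
m^{(2)}_{k,H_0,V}\big(\{(\lambda,\lambda,\dots,\lambda)\}\big) = \tau\Big[E_{H_0+V}(\{\lambda\})VE_{H_0}(\{\lambda\})\big(VE_{H_0}(\{\lambda\})\big)^{k-1}\Big].
\]
We will show that $E_{H_0+V}(\{\lambda\})VE_{H_0}(\{\lambda\})$ is the zero operator. Let $g$ be an arbitrary vector in $H$ and let $h = E_{H_0}(\{\lambda\})g$. Then $H_0h = \lambda h$ and
\[
\begin{aligned}
E_{H_0+V}(\{\lambda\})VE_{H_0}(\{\lambda\})g &= E_{H_0+V}(\{\lambda\})Vh\\
&= E_{H_0+V}(\{\lambda\})(H_0+V)h - E_{H_0+V}(\{\lambda\})H_0h\\
&= E_{H_0+V}(\{\lambda\})(H_0+V)h - \lambda E_{H_0+V}(\{\lambda\})h\\
&= (H_0+V-\lambda I)E_{H_0+V}(\{\lambda\})h = 0.
\end{aligned}
\]
□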
Upon evaluating a trace, some iterated operator integrals can be written as Lebesgue integrals with respect to a “multiple spectral measure.”

Lemma 4.9. Assume the hypothesis of Theorem 4.1. Assume that the spectral measures $E_1,E_2,\dots,E_p$ correspond to self-adjoint operators $H_0,H_1,\dots,H_p$ affiliated with $M$, respectively, and that $V_1,V_2,\dots,V_p\in\mathcal{L}_2(M,\tau)$. Let $f_1,f_2,\dots,f_p$ be functions in $C_0^\infty(\mathbb{R})$ (vanishing at infinity). Then
\[
\tau\big[f_1(H_1)V_1f_2(H_2)V_2\dots f_p(H_p)V_p\big] = \int_{\mathbb{R}^p}f_1(\lambda_1)f_2(\lambda_2)\dots f_p(\lambda_p)\,dm(\lambda_1,\lambda_2,\dots,\lambda_p),
\]
with $m$ as in Theorem 4.1.

Proof. The result obviously holds for $f_1,f_2,\dots,f_p$ simple functions. Uniform approximation of $f_1,f_2,\dots,f_p\in C_0^\infty(\mathbb{R})$ by (totally bounded) simple functions completes the proof. □

Remark 4.10. (i) The result analogous to the one of Theorem 4.1 holds for integrals with respect to the measures $m_1$ and $m_2$. (ii) When the operators $H_0,H_1,\dots,H_p$ are bounded, the functions $f_1,f_2,\dots,f_p$ can be taken in $C^\infty(\mathbb{R})$.

Corollary 4.11. Let $H_0 = H_0^*$ be affiliated with $M$ and $V = V^*\in\mathcal{L}_2(M,\tau)$. Denote $H_x := H_0 + xV$, $x\in[0,1]$. Assume that either $\tau$ is standard or $p=2$. Then for $f\in\mathcal{R}_b$,
\[
\tau\Big[\frac{d^p}{dx^p}f(H_0+xV)\Big] = p!\int_{\mathbb{R}^{p+1}}\Delta^{(p)}_{\lambda_1,\dots,\lambda_{p+1}}(f)\,dm^{(1)}_{p,H_x,V}(\lambda_1,\lambda_2,\dots,\lambda_{p+1}).
\]

Proof. It is enough to prove the result for $f(\lambda) = (z-\lambda)^{-k}$, $k\in\mathbb{N}$. By Lemma 3.3,
\[
\frac{d^p}{dx^p}(zI-H_x)^{-k} = p!\int_{\mathbb{R}}\int_{\mathbb{R}}\dots\int_{\mathbb{R}}\Delta^{(p)}_{\lambda_1,\dots,\lambda_{p+1}}\big((z-\lambda)^{-k}\big)\,E_{H_x}(d\lambda_1)VE_{H_x}(d\lambda_2)V\dots VE_{H_x}(d\lambda_{p+1}). \tag{22}
\]
By Corollary 2.5, the function $\Delta^{(p)}_{\lambda_1,\dots,\lambda_{p+1}}((z-\lambda)^{-k})$ is a linear combination of products $\prod_{j=1}^{p+1}f_j(\lambda_j)$, with $f_j$ in $C_0^\infty(\mathbb{R})$ for $1\le j\le p+1$, and hence, the trace of the expression in (22) can be written as a linear combination of integrals like in Remark 4.10 (i), where $m_1 = m^{(1)}_{p,H_x,V}(\lambda_1,\lambda_2,\dots,\lambda_{p+1})$. □
Remark 4.12. One also has
\[
\tau\Big[\big((zI-H_0)^{-1}V\big)^p\Big] = \int_{\mathbb{R}^p}\Delta^{(p-1)}_{\lambda_1,\dots,\lambda_p}(f_z)\,dm_{p,H_0,V}(\lambda_1,\lambda_2,\dots,\lambda_p)
\]
and
\[
\tau\Big[(zI-H_0-V)^{-1}\big(V(zI-H_0)^{-1}\big)^p\Big] = \int_{\mathbb{R}^{p+1}}\Delta^{(p)}_{\lambda_1,\dots,\lambda_{p+1}}(f_z)\,dm^{(2)}_{p,H_0,V}(\lambda_1,\lambda_2,\dots,\lambda_{p+1}).
\]
Remark 4.13. If $H_0$ is bounded, then Corollary 4.11 also holds for $f$ a polynomial and for $f\in C^\infty(\mathbb{R})$ such that $f|_{[a,b]}$ is a polynomial, where $[a,b]\supset\sigma(H_0)\cup\sigma(H_0+V)$.

A counterexample. Let $p\ge 2$ be an integer. Let $V$ be a self-adjoint operator on a Hilbert space $H$ and assume $V$ belongs to the Schatten $p$-class, with respect to the usual trace Tr. Let $E$ be a spectral measure. A crucial estimate in [15] is of the total variation of the function that is defined on product sets by
\[
A_1\times\dots\times A_p\mapsto\operatorname{Tr}\big(E(A_1)VE(A_2)V\cdots E(A_p)V\big).
\]
Unfortunately, the estimate in [15] is false when $p\ge 3$. In this section, we provide an example, based on Hadamard matrices, having unbounded total variation. We also give a version for finite traces.

Consider the self-adjoint unitary $2\times 2$ matrix
\[
V_2 = \frac{1}{\sqrt2}\begin{pmatrix}1 & 1\\ 1 & -1\end{pmatrix}.
\]
When $n = 2^k$, consider the self-adjoint unitary $n\times n$ matrix
\[
V_n = \underbrace{V_2\otimes\dots\otimes V_2}_{k\text{ times}}.
\]
Then each such $V_n$ is a multiple of a Hadamard matrix. Let $(e_{jk})_{1\le j,k\le n}$ be the usual system of matrix units for $M_n(\mathbb{C})$. Let $E_n$ be the spectral measure on the set $\{1,\dots,n\}$ taking values in $M_n(\mathbb{C})$, defined by $E_n(\{j\}) = e_{jj}$. The following lemma can be proved directly for all $n$, or first in the case $n=2$ and then by observing how the total variation behaves under taking tensor products of matrices and spectral measures.

Lemma 4.14. For every integer $p\ge 2$, and every $n$ that is a power of 2, the set function
\[
A_1\times\dots\times A_p\mapsto\operatorname{Tr}_n\big(E_n(A_1)V_nE_n(A_2)V_n\cdots E_n(A_p)V_n\big)
\]
has total variation $n^{p/2}$.
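Lemma 4.14 can be checked by brute force for small n and p. Since the underlying set {1, ..., n}^p is finite, the total variation is the sum of the absolute values of the atoms, and the atom at (j_1, ..., j_p) is the cyclic product of matrix entries of V_n. The code below is only an illustrative sketch; the function names are hypothetical.

```python
from itertools import product
from math import sqrt, isclose

def kron(a, b):
    """Kronecker product of two square matrices stored as nested lists."""
    return [[x * y for x in row_a for y in row_b] for row_a in a for row_b in b]

V2 = [[1 / sqrt(2), 1 / sqrt(2)], [1 / sqrt(2), -1 / sqrt(2)]]

def hadamard_unitary(k):
    """V_n = V_2 tensored with itself k times, so n = 2**k."""
    v = [[1.0]]
    for _ in range(k):
        v = kron(v, V2)
    return v

def total_variation(v, p):
    """Sum over all atoms (j_1, ..., j_p) of |Tr(e_{j1 j1} V e_{j2 j2} V ... e_{jp jp} V)|,
    which equals |V[j1][j2] * V[j2][j3] * ... * V[jp][j1]|."""
    n = len(v)
    total = 0.0
    for idx in product(range(n), repeat=p):
        term = 1.0
        for i in range(p):
            term *= v[idx[i]][idx[(i + 1) % p]]
        total += abs(term)
    return total

tv_n2_p2 = total_variation(hadamard_unitary(1), 2)   # expected n^{p/2} = 2
tv_n4_p3 = total_variation(hadamard_unitary(2), 3)   # expected n^{p/2} = 8
print(tv_n2_p2, tv_n4_p3)
```

Every entry of V_n has modulus n^{-1/2}, so each of the n^p atoms contributes n^{-p/2}, which is the direct argument mentioned before the lemma.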
Consider the von Neumann algebra $M$ with normal trace $\tau$ given by
\[
(M,\tau) = \bigoplus_{k=1}^{\infty}\underbrace{M_{n(k)}(\mathbb{C})}_{\alpha(k)},
\]
where $\alpha(k)>0$ and this notation indicates that $M$ is the $\ell^\infty$-direct sum of the matrix algebras $M_{n(k)}(\mathbb{C})$, where every $n(k)$ is a power of 2 and where $\tau$ is the trace determined by
\[
\tau\big(\underbrace{0\oplus\dots\oplus 0}_{k-1\text{ times}}\oplus I_{n(k)}\oplus 0\oplus 0\oplus\cdots\big) = \alpha(k).
\]
We will be interested in the following two cases:

(I) $M$ embeds in $B(H)$, for $H$ a separable, infinite-dimensional Hilbert space, in such a way that $\tau$ is the restriction of the usual trace Tr on $B(H)$; this is equivalent to $\alpha(k)$ being an integer multiple of $n(k)$ for all $k$.

(II) $M$ is a finite von Neumann algebra and $\tau$ is normalized to take value 1 on the identity; this is equivalent to $\sum_{k=1}^{\infty}\alpha(k) = 1$.

Example 4.15. Consider $T = t_1V_{n(1)}\oplus t_2V_{n(2)}\oplus\dots\in M$, for a bounded sequence of $t_k\ge 0$. Then $|T| = t_1I_{n(1)}\oplus t_2I_{n(2)}\oplus\cdots$,
\[
\|T\|_p^p = \sum_{k=1}^{\infty}t_k^p\,\alpha(k). \tag{23}
\]
Taking the obvious diagonal spectral measure $E$ defined on the set $\{(k,j)\in\mathbb{N}^2\mid j\le n(k)\}$ by
\[
E\big(\{(k,j)\}\big) = \underbrace{0\oplus\dots\oplus 0}_{k-1\text{ times}}\oplus e_{jj}\oplus 0\oplus 0\oplus\cdots
\]
and using the result of Lemma 4.14, we find that the total variation of the set function
\[
A_1\times\dots\times A_p\mapsto\tau\big(E(A_1)TE(A_2)T\cdots E(A_p)T\big)
\]
is
\[
\sum_{k=1}^{\infty}t_k^p\,\frac{\alpha(k)}{n(k)}\,n(k)^{p/2} = \sum_{k=1}^{\infty}t_k^p\,\alpha(k)\,n(k)^{(p/2)-1}. \tag{24}
\]
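The sums (23) and (24) can be compared for one concrete parameter choice. The values below are an assumption made here for illustration and are not taken from the paper: n(k) = 4^k, α(k) = 2^{-k} (so that the weights sum to 1, as in case (II)), t_k = 1 and p = 3. With this choice every term of (24) equals 1, so the partial sums of (24) grow without bound while those of (23) stay below 1.

```python
def partial_sums(p, terms):
    """Partial sums of (23) and (24) for the hypothetical choice
    n(k) = 4**k, alpha(k) = 2**(-k), t_k = 1 (case (II): the alpha(k) sum to 1)."""
    norm_p = 0.0       # partial sum of (23): sum_k t_k^p alpha(k)
    variation = 0.0    # partial sum of (24): sum_k t_k^p alpha(k) n(k)^{p/2 - 1}
    for k in range(1, terms + 1):
        n_k = 4 ** k
        alpha_k = 2.0 ** (-k)
        t_k = 1.0
        norm_p += t_k ** p * alpha_k
        variation += t_k ** p * alpha_k * n_k ** (p / 2 - 1)
    return norm_p, variation

norm_20, variation_20 = partial_sums(3, 20)
norm_40, variation_40 = partial_sums(3, 40)
print(norm_20, variation_20)
print(norm_40, variation_40)
```

Doubling the number of terms doubles the partial sum of (24) while the partial sum of (23) remains bounded by 1. Case (I) would need α(k) to be an integer multiple of n(k), which this particular choice does not satisfy.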
Now assuming $n(k)$ is unbounded as $k\to\infty$ and fixing an integer $p\ge 3$, it is easy to choose values of $\alpha(k)$ and $t_k$ such that the $p$-norm in (23) is finite while the total variation (24) is infinite, in both cases (I) and (II) above.

Remark 4.16. The above examples also work to show that the set function
\[
A_1\times\dots\times A_{p+1}\mapsto\tau\big(E(A_1)T\dots TE(A_{p+1})\big)
\]
has infinite total variation.

5. Main results

In this section we state the main results which will be proved in the next three sections.

Theorem 5.1. Let $2<p\in\mathbb{N}$. Let $H_0$ be a self-adjoint operator affiliated with $M = B(H)$ and $V$ a self-adjoint operator in $\mathcal{L}_2(M,\tau)$. Then, the following assertions hold.

(i) There is a unique finite real-valued measure $\nu_p$ on $\mathbb{R}$ such that the trace formula
\[
\tau\big[R_{p,H_0,V}(f)\big] = \int_{\mathbb{R}}f^{(p)}(t)\,d\nu_p(t) \tag{25}
\]
holds for $f\in W_p$. The total variation of $\nu_p$ is bounded by
\[
\operatorname{var}(\nu_p)\le c_p := \frac{1}{p!}\|V\|_2^p.
\]

(ii) If, in addition, $H_0$ is bounded, then $\nu_p$ is absolutely continuous and supported in $[a,b]\supset\sigma(H_0)\cup\sigma(H_0+V)$. The density $\eta_p$ of $\nu_p$ can be computed recursively by
\[
\eta_p(t) = \frac{\tau(V^{p-1})}{(p-1)!} - \nu_{p-1}\big((-\infty,t]\big) - \frac{1}{(p-1)!}\int_{\mathbb{R}^{p-1}}\Delta^{(p-2)}_{\lambda_1,\dots,\lambda_{p-1}}\big((\lambda-t)_+^{p-2}\big)\,dm_{p-1,H_0,V}(\lambda_1,\dots,\lambda_{p-1}), \tag{26}
\]
for a.e. $t\in\mathbb{R}$. In this case, (25) also holds for $f\in\mathcal{R}$.

Theorem 5.2. Let $p\in\{2,3\}$. Let $H_0 = H_0^*$ be an operator affiliated with a von Neumann algebra $M$ with normal faithful semi-finite trace $\tau$ and $V = V^*$ an operator in $\mathcal{L}_2(M,\tau)$. Then, the following assertions hold.

(i) There is a unique real-valued measure $\nu_p$ on $\mathbb{R}$ such that the trace formula (25) holds for $f\in C_c^\infty(\mathbb{R})$. If $H_0$ is bounded, then $\nu_p$ is finite and (25) also holds for $f\in W_p\cup\mathcal{R}$.

(ii) The measure $\nu_2$ is absolutely continuous. If, in addition, $H_0$ is bounded, then $\nu_p$ is absolutely continuous for $p=3$.
(iii) Assume, in addition, that $V\in\mathcal{L}_1(M,\tau)$ if $p=2$ or that $H_0$ is bounded if $p=3$. Then the density $\eta_p$ of $\nu_p$ can be computed recursively by (26).

Remark 5.3. When $V\in\mathcal{L}_1(M,\tau)$, Koplienko’s spectral shift function $\eta_2 = \eta_{H_0,H_0+V}$ can be represented by (26), which reduces to the known formula (see [15,26])
\[
\eta_2(t) = \tau(V) - \int_{-\infty}^{t}\xi_{H_0+V,H_0}(\lambda)\,d\lambda - \int_{t}^{\infty}d\tau\big[E_{H_0}(\lambda)V\big]
= -\int_{-\infty}^{t}\xi_{H_0+V,H_0}(\lambda)\,d\lambda + \tau\big[E_{H_0}\big((-\infty,t)\big)V\big].
\]
For $V$ in the standard Hilbert–Schmidt class, no explicit formula for $\eta_{H_0,H_0+V}$ is known; existence of Koplienko’s spectral shift function is proved implicitly by approximation of $V$ by finite-rank operators.

Remark 5.4. Representation (8) of Koplienko’s spectral shift function via Krein’s spectral shift function was obtained by integrating by parts in the trace formula in (6) [15,26]. When $V\in\mathcal{L}_1(M,\tau)$ and $f\in\mathcal{R}_b$ (or $f\in\mathcal{R}$ if $H_0$ is bounded), one can see from Lemma 3.1 that $\tau\big(\frac{d}{dt}\big|_{t=0}f(H_0+tV)\big) = \tau[Vf'(H_0)]$, and thus,
\[
\tau\big[R_{2,H_0,V}(f)\big] = \tau\big[f(H_0+V) - f(H_0) - Vf'(H_0)\big].
\]
When $M$ is finite and $H_0$ is bounded (so that $\eta_2$ is integrable and supported in a segment containing the spectra of $H_0$ and $H_0+V$), integrating by parts in Koplienko’s trace formula in (7) gives
\[
\tau\Big[f(H_0+V) - f(H_0) - Vf'(H_0) - \frac12V^2f''(H_0)\Big]
= \int_{\mathbb{R}}f'''(t)\Big(-\int_{-\infty}^{t}\eta_2(\lambda)\,d\lambda + \frac12\tau\big[V^2E_{H_0}\big((-\infty,t)\big)\big]\Big)\,dt. \tag{27}
\]
The bound for the remainder in the approximation formula (27) is $O(\|V\|_2^2)$ since $\|\eta_2\|_1 = \frac{\|V\|_2^2}{2}$ and $\eta_2\ge 0$ (see [15,26] for properties of $\eta_2$).
Corollary 5.5. Let $H_0$ be a self-adjoint operator in $M$ and $V$ a self-adjoint operator in $\mathcal{L}_p(M,\tau)$, where $2<p\in\mathbb{N}$ if $M = B(H)$ or $p=3$ if $M$ is a general semi-finite von Neumann algebra. Then, there exists a sequence $\{\eta_{p,n}\}_n$ of $L^\infty$-functions such that
\[
\tau\big[R_{p,H_0,V}(f)\big] = \lim_{n\to\infty}\int_{\mathbb{R}}f^{(p)}(t)\,\eta_{p,n}(t)\,dt,
\]
for $f\in W_p$.

Proof. Given $V\in\mathcal{L}_p(M,\tau)$, there exists a sequence of operators $V_n\in M$, which are linear combinations of $\tau$-finite projections in $M$ (or just finite rank operators when $M = B(H)$) such that $\lim_{n\to\infty}\|V-V_n\|_p = 0$. Then by Lemma 3.11 and Proposition 3.9 for $f\in W_p$,
\[
\lim_{n\to\infty}\tau\big[R_{p,H_0,V_n}(f)\big] = \tau\big[R_{p,H_0,V}(f)\big]. \tag{28}
\]
By Theorem 5.1 in the case of $M = B(H)$ or Theorem 5.2 in the case of a general $M$, respectively, applied to the $\tau$-Hilbert–Schmidt perturbations $V_n$, there exists a sequence $\{\eta_{p,n}\}_n$ of $L^\infty$-functions such that
\[
\tau\big[R_{p,H_0,V_n}(f)\big] = \int_{\mathbb{R}}f^{(p)}(t)\,\eta_{p,n}(t)\,dt. \tag{29}
\]
Combining (28) and (29) completes the proof. □
Theorem 5.6. Let $\tau$ be finite and let $H_0 = H_0^*$ be affiliated with the algebra $M$ and $V = V^*\in M$. Assume that $(zI-H_0)^{-1}$ and $V$ are free in the noncommutative space $(M,\tau)$. Then for $p\ge 3$ the following assertions hold.

(i) There is a unique finite real-valued measure $\nu_p$ on $\mathbb{R}$ such that the trace formula (25) holds for $f\in W_p$.

(ii) If, in addition, $H_0$ is bounded, then $\nu_p$ is absolutely continuous and supported in $[a,b]\supset\sigma(H_0)\cup\sigma(H_0+V)$. The density $\eta_p$ of $\nu_p$ can be computed recursively by (26). In this case, (25) also holds for $f\in\mathcal{R}$.

6. Recursive formulas for the Cauchy transform

Let $H_0$ and $V$ be self-adjoint operators in $M$. Assume, in addition, that $V\in\mathcal{L}_2(M,\tau)$. In this section we investigate a measure $\nu_p = \nu_{p,H_0,V}$ as defined in (25) for $f = f_z$ and $f(t) = t^p$. We derive properties of the Cauchy transform of the measure $\nu_p$ which will be used in Section 7 to show that the measure $\nu_{p+1} = \nu_{p+1,H_0,V}$ satisfying (25) for $f\in\mathcal{R}$ is absolutely continuous and its density can be determined explicitly via the density of $\nu_p$ and an integral of a spline function against a certain multiple spectral measure. In addition, for a general trace $\tau$, the results of this section will be used in Section 8 to prove existence of an absolutely continuous measure $\nu_3$ satisfying (25) for $p=3$ and find an explicit formula for the density of $\nu_3$.

Let $G_\nu$ denote the Cauchy transform of a finite measure $\nu$:
\[
G_\nu(z) = \int_{\mathbb{R}}\frac{1}{z-t}\,d\nu(t), \quad \operatorname{Im}(z)\neq 0. \tag{30}
\]
The goal of this section is to prove the theorem below.

Theorem 6.1. Let $H_0 = H_0^*\in M$ and $V = V^*\in\mathcal{L}_2(M,\tau)$. Suppose that $\nu_p$ is a real-valued absolutely continuous measure satisfying (25) for $f = f_z$ and $f(t) = t^p$. Let $G:\mathbb{C}_+\to\mathbb{C}$ be the analytic function satisfying
\[
G^{(p+1)}(z) = -G^{(p)}_{\nu_p}(z) - (-1)^{(p+1)}\tau\Big[(zI-H_0)^{-1}\big(V(zI-H_0)^{-1}\big)^p\Big], \tag{31}
\]
\[
\lim_{|\operatorname{Im} z|\to\infty}G(z) = 0. \tag{32}
\]
Then $G(z)$ is the Cauchy transform of the measure $\nu_{p+1}$ satisfying (25) for $f = f_z$, which is absolutely continuous with the density given by
\[
\eta_{p+1}(t) = \frac{1}{p!}\Big(\tau(V^p) - p!\,\nu_p\big((-\infty,t]\big) - \int_{\mathbb{R}^p}\Delta^{(p-1)}_{\lambda_1,\dots,\lambda_p}\big((\lambda-t)_+^{p-1}\big)\,dm_{p,H_0,V}(\lambda_1,\dots,\lambda_p)\Big),
\]
for a.e. $t\in\mathbb{R}$.

Lemma 6.2. Let $\nu_p$ be a measure satisfying (25) for $f(t) = t^p$. Then
\[
\int_{\mathbb{R}}d\nu_p(t) = \frac{1}{p!}\tau(V^p).
\]

Proof. Applying the trace formula (25) to the polynomial $f(t) = t^p$ and applying Lemma 3.1 give
\[
p!\int_{\mathbb{R}}d\nu_p(t) = \tau\Big[(H_0+V)^p - \sum_{j=0}^{p-1}\sum_{\substack{k_0,k_1,\dots,k_j\ge 0\\ k_0+k_1+\dots+k_j=p-j}}H_0^{k_0}VH_0^{k_1}V\dots VH_0^{k_j}\Big] = \tau(V^p).
\]
□
Lemma 6.3. Let $H_0 = H_0^*\in M$ and $V = V^*\in\mathcal{L}_p(M,\tau)$. Let $\nu_p$ and $\nu_{p+1}$ be compactly supported measures. Then $\nu_p$ and $\nu_{p+1}$ satisfy (25) for $f = f_z$ if and only if
\[
G^{(p+1)}_{\nu_{p+1}}(z) = -G^{(p)}_{\nu_p}(z) - (-1)^{(p+1)}\tau\Big[(zI-H_0)^{-1}\big(V(zI-H_0)^{-1}\big)^p\Big].
\]

Proof. The result follows immediately from Lemma 3.7 upon employing the straightforward equality
\[
(-1)^pG^{(p)}_{\nu_p}(z) = \tau\big[R_{p,H_0,V}(f_z)\big].
\]
□
Lemma 6.3 will be used to construct an absolutely continuous measure $\nu_{p+1}$ satisfying (25) for $f = f_z$ based on the existence of an absolutely continuous measure $\nu_p$ satisfying (25) for $f = f_z$.

Lemma 6.4. Let $H_0 = H_0^*\in M$ and $V = V^*\in\mathcal{L}_p(M,\tau)$. Let $\nu_p$ be a measure satisfying (25) for $f = f_z$ and $f(t) = t^p$. Assume, in addition, that $\nu_p$ is absolutely continuous with the density $\eta_p$ compactly supported in $[a,b]$. Assume that $G:\mathbb{C}_+\to\mathbb{C}$ is an analytic function satisfying (31). Then $G$ is determined by
\[
G(z) = -\frac{1}{p!}\log(z-b)\,\tau(V^p) - \int_{\mathbb{R}}\frac{1}{z-\lambda}\,\chi_{[a,b]}(\lambda)\int_a^{\lambda}\eta_p(t)\,dt\,d\lambda
+ (-1)^{(p+1)}\frac{1}{p}\int\cdots\int\tau\Big[\big((zI-H_0)^{-1}V\big)^p\Big]\,dz^p,
\]
up to a polynomial of degree $p$.

Proof. We note that
\[
-\tau\Big[(zI-H_0)^{-1}\big(V(zI-H_0)^{-1}\big)^p\Big] = \frac{1}{p}\frac{d}{dz}\tau\Big[\big((zI-H_0)^{-1}V\big)^p\Big].
\]
Then by Lemma 6.3,
\[
G(z) = -\int G_{\nu_p}(z)\,dz + (-1)^{(p+1)}\frac{1}{p}\int\cdots\int\tau\Big[\big((zI-H_0)^{-1}V\big)^p\Big]\,dz^p. \tag{33}
\]
By the assumption of the lemma, $d\nu_p(\lambda) = \eta_p(\lambda)\,d\lambda$, and hence,
\[
G_{\nu_p}(z) = \int_a^b\frac{1}{z-\lambda}\,\eta_p(\lambda)\,d\lambda.
\]
Integrating the latter expression by parts gives
\[
G_{\nu_p}(z) = \frac{1}{z-\lambda}\int_a^{\lambda}\eta_p(t)\,dt\,\Big|_{\lambda=a}^{\lambda=b} - \int_a^b\frac{1}{(z-\lambda)^2}\int_a^{\lambda}\eta_p(t)\,dt\,d\lambda. \tag{34}
\]
By Lemma 6.2, the first summand in (34) equals
\[
\frac{1}{z-b}\int_a^b\eta_p(t)\,dt = \frac{1}{z-b}\,\frac{1}{p!}\,\tau(V^p). \tag{35}
\]
The second summand in (34) equals
\[
\frac{d}{dz}\Big(\int_a^b\frac{1}{z-\lambda}\int_a^{\lambda}\eta_p(t)\,dt\,d\lambda\Big). \tag{36}
\]
Combining (33)–(36) completes the proof. □
1116
K. Dykema, A. Skripka / Journal of Functional Analysis 257 (2009) 1092–1132
Theorem 6.5. Let $H_0 = H_0^* \in \mathcal{M}$ and $V = V^* \in L^2(\mathcal{M},\tau)$. Let $[a,b]$ be a segment containing $\sigma(H_0)\cup\sigma(H_0+V)$. Assume that either $\tau$ is standard or $p = 2$. Let $\nu_p$ be a measure compactly supported in $[a,b]$ and satisfying (25) for $f = f_z$ and $f(t) = t^p$. Assume, in addition, that $\nu_p$ is absolutely continuous with the density $\eta_p$. Then the function
$$G(z) = \int_{\mathbb{R}} \frac{1}{z-t}\,\frac{1}{p!}\Big[\tau(V^p) - p!\,\nu_p\big((-\infty,t]\big) - \int_{\mathbb{R}^p}\Delta^{(p-1)}_{\lambda_1,\dots,\lambda_p}\big((\lambda-t)_+^{p-1}\big)\,dm_{p,H_0,V}(\lambda_1,\dots,\lambda_p)\Big]\,dt$$
satisfies (31) and (32).

Proof. Since $d\nu_p(t) = \eta_p(t)\,dt$, we have
$$\chi_{[a,b]}(\lambda)\int_a^{\lambda}\eta_p(t)\,dt = \nu_p\big((-\infty,\lambda]\big)\,\chi_{[a,b]}(\lambda). \tag{37}$$
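By Remark 4.12, we obtain the representation
$$\tau\Big[\big((zI-H_0)^{-1}V\big)^{p}\Big] = \int_{\mathbb{R}^p}\Delta^{(p-1)}_{\lambda_1,\dots,\lambda_p}\Big(\frac{1}{z-\lambda}\Big)\,dm_{p,H_0,V}(\lambda_1,\dots,\lambda_p). \tag{38}$$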
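Since $\sigma(H_0)\cup\sigma(H_0+V)\subset[a,b]$, the measure $m_{p,H_0,V}$ is supported in $[a,b]$. By Lemma 2.4(i), we can interchange the order of integration in
$$\int\cdots\int \tau\Big[\big((zI-H_0)^{-1}V\big)^{p}\Big]\,dz^p = \int\cdots\int\int_{[a,b]^p}\Delta^{(p-1)}_{\lambda_1,\dots,\lambda_p}\Big(\frac{1}{z-\lambda}\Big)\,dm_{p,H_0,V}(\lambda_1,\dots,\lambda_p)\,dz^p$$
and obtain
$$\int\cdots\int \tau\Big[\big((zI-H_0)^{-1}V\big)^{p}\Big]\,dz^p = \int_{[a,b]^p}\Delta^{(p-1)}_{\lambda_1,\dots,\lambda_p}\Big(\int\cdots\int\frac{1}{z-\lambda}\,dz^p\Big)\,dm_{p,H_0,V}(\lambda_1,\dots,\lambda_p), \tag{39}$$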
with a suitable choice of constants of integration on the left-hand side of (39). For a reason to become clear later, we choose the antiderivatives in (39) with real constants of integration. Since
$$\int\cdots\int\frac{1}{z-\lambda}\,dz^p = (z-\lambda)^{p-1}\log(z-\lambda) + \alpha_{p-1}z^{p-1} + \mathrm{pol}_{p-2}(z),$$
with $\mathrm{pol}_{p-2}(z)$ a polynomial of degree $p-2$ and a constant $\alpha_{p-1}\in\mathbb{R}$ to be fixed later, we obtain by Proposition 2.2(4) that the expression in (39) equals
$$\frac{1}{(p-1)!}\int_{[a,b]^p}\Big[\Delta^{(p-1)}_{\lambda_1,\dots,\lambda_p}\big((z-\lambda)^{p-1}\log(z-\lambda)\big) + \alpha_{p-1}\Big]\,dm_{p,H_0,V}(\lambda_1,\dots,\lambda_p). \tag{40}$$
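By Lemma 6.4 and (37)–(40),
$$G(z) = -\int_a^b\frac{1}{z-t}\,\nu_p\big((-\infty,t]\big)\,dt + \frac{(-1)^{p+1}}{p!}\int_{[a,b]^p}\Big[\Delta^{(p-1)}_{\lambda_1,\dots,\lambda_p}\big((z-\lambda)^{p-1}\log(z-\lambda)\big) + (-1)^p\log(z-b) + \alpha_{p-1}\Big]\,dm_{p,H_0,V}(\lambda_1,\dots,\lambda_p). \tag{41}$$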
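Now we will represent the second integral in (41) as the Cauchy transform of an absolutely continuous measure. If not all $\lambda_1,\lambda_2,\dots,\lambda_p$ coincide, then by Proposition 2.2(6) and (5),
$$\begin{aligned}
\Delta^{(p-1)}_{\lambda_1,\dots,\lambda_p}\big((z-\lambda)^{p-1}\log(z-\lambda)\big)
&= \frac{1}{(p-2)!}\int_{\mathbb{R}}\frac{\partial^{p-1}}{\partial t^{p-1}}\Big((z-t)^{p-1}\log(z-t)\Big)\,\Delta^{(p-1)}_{\lambda_1,\dots,\lambda_p}\big((\lambda-t)_+^{p-2}\big)\,dt\\
&= \frac{1}{(p-2)!}\int_{\mathbb{R}}\Big((-1)^{p-1}(p-1)!\,\log(z-t) + \gamma_{p-1}\Big)\,\Delta^{(p-1)}_{\lambda_1,\dots,\lambda_p}\big((\lambda-t)_+^{p-2}\big)\,dt\\
&= (-1)^{p-1}(p-1)\int_{\mathbb{R}}\log(z-t)\,\Delta^{(p-1)}_{\lambda_1,\dots,\lambda_p}\big((\lambda-t)_+^{p-2}\big)\,dt + \frac{1}{(p-1)!}\,\gamma_{p-1},
\end{aligned} \tag{42}$$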
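with $\gamma_{p-1}\in\mathbb{R}$. By (42) and Proposition 2.2(5), we obtain
$$\begin{aligned}
J_{\lambda_1,\dots,\lambda_p}(z) &= \Delta^{(p-1)}_{\lambda_1,\dots,\lambda_p}\big((z-\lambda)^{p-1}\log(z-\lambda)\big) + (-1)^p\log(z-b) + \alpha_{p-1}\\
&= (-1)^{p-1}(p-1)\int_{\mathbb{R}}\big(\log(z-t)-\log(z-b)\big)\,\Delta^{(p-1)}_{\lambda_1,\dots,\lambda_p}\big((\lambda-t)_+^{p-2}\big)\,dt + \frac{1}{(p-1)!}\,\gamma_{p-1} + \alpha_{p-1}.
\end{aligned} \tag{43}$$
Since in (41) we need only to consider $\lambda_1,\dots,\lambda_p\in(a,b)$ and $\Delta^{(p-1)}_{\lambda_1,\dots,\lambda_p}\big((\lambda-t)_+^{p-2}\big)$ is supported in $[\min\{\lambda_1,\dots,\lambda_p\},\max\{\lambda_1,\dots,\lambda_p\}]$, we obtain that in (43) it is enough to take $t\in[a,b]$. By standard computations, for $t<b$, the function
$$z\mapsto\log(z-t)-\log(z-b)\quad\text{maps } \mathbb{C}_+ \text{ to } \mathbb{C}_- \tag{44}$$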
and
$$\lim_{y\to\infty} iy\,\big(\log(iy-t)-\log(iy-b)\big) = b-t. \tag{45}$$
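The two "standard computations" (44) and (45) can be checked numerically. The following is only a sanity-check sketch, assuming the principal branch of the logarithm (which is the natural one for $z\in\mathbb{C}_+$ and real $t<b$); the sample values of `t`, `b` and `z` are hypothetical.

```python
import cmath


def log_difference(z, t, b):
    """Return log(z - t) - log(z - b) with the principal branch of log."""
    return cmath.log(z - t) - cmath.log(z - b)


def scaled_value(y, t, b):
    """Return iy * (log(iy - t) - log(iy - b)), the quantity whose limit is taken in (45)."""
    z = 1j * y
    return z * log_difference(z, t, b)


if __name__ == "__main__":
    t, b = -0.5, 2.0  # hypothetical sample values with t < b
    for z in (0.3 + 0.1j, -4.0 + 2.0j, 1.9 + 1e-3j):
        # (44): the imaginary part is negative for z in the upper half-plane
        print(z, log_difference(z, t, b).imag)
    # (45): for large y the value approaches b - t
    print(scaled_value(1e6, t, b), b - t)
```

The sign in (44) reflects that, for $z$ in the upper half-plane and $t<b$, the argument of $z-t$ is smaller than the argument of $z-b$.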
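Let $\alpha_{p-1} = -\frac{1}{(p-1)!}\,\gamma_{p-1}$. Then (44) and (45) along with Proposition 2.2(5) imply that $J_{\lambda_1,\dots,\lambda_p}$ in (43) maps $\mathbb{C}_+$ to $\mathbb{C}_\pm$ (depending on the sign of $(-1)^{p-1}$) and $\lim_{y\to\infty} iy\,J_{\lambda_1,\dots,\lambda_p}(iy)\in\mathbb{R}$. By the classical theory of analytic functions, $J_{\lambda_1,\dots,\lambda_p}$ is the Cauchy transform of a finite real-valued measure. If $\lambda_1=\lambda_2=\dots=\lambda_p$, then
$$\begin{aligned}
J_{\lambda_1,\dots,\lambda_1}(z) &= \Delta^{(p-1)}_{\lambda_1,\dots,\lambda_p}\big((z-\lambda)^{p-1}\log(z-\lambda)\big) + (-1)^p\log(z-b) + \alpha_{p-1}\\
&= (-1)^{p-1}\big(\log(z-\lambda_1)-\log(z-b)\big) + \alpha_{p-1}.
\end{aligned} \tag{46}$$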
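By (44) and (45), the function $J_{\lambda_1,\dots,\lambda_1}$ is also the Cauchy transform of a finite real-valued measure. Below we show that the measure generating $J_{\lambda_1,\dots,\lambda_p}$ is absolutely continuous. If all $\lambda_1,\lambda_2,\dots,\lambda_p$ are distinct, then by Proposition 2.2(2),
$$\Delta^{(p-1)}_{\lambda_1,\dots,\lambda_p}\big((z-\lambda)^{p-1}\log(z-\lambda)\big) = \sum_{k=1}^{p}\frac{(z-\lambda_k)^{p-1}\log(z-\lambda_k)}{\prod_{j\neq k}(\lambda_k-\lambda_j)}.$$
Since $\Delta^{(p-1)}_{\lambda_1,\dots,\lambda_p}\big((z-\lambda)^{p-1}\log(z-\lambda)\big)$ is symmetric in $\lambda_1,\lambda_2,\dots,\lambda_p$, we may assume without loss of generality that $\lambda_1<\lambda_2<\dots<\lambda_p$. Then
$$\begin{aligned}
\phi(t) &:= -\frac{1}{\pi}\lim_{\varepsilon\to0^+}\mathrm{Im}\Big((-1)^p\log(t+i\varepsilon-b)+\alpha_{p-1}\Big) - \frac{1}{\pi}\lim_{\varepsilon\to0^+}\mathrm{Im}\Big(\Delta^{(p-1)}_{\lambda_1,\dots,\lambda_p}\big((t+i\varepsilon-\lambda)^{p-1}\log(t+i\varepsilon-\lambda)\big)\Big)\\
&= \begin{cases}
(-1)^{p+1} + (-1)^p\displaystyle\sum_{k=1}^{p}\frac{(\lambda_k-t)^{p-1}}{\prod_{j\neq k}(\lambda_k-\lambda_j)} & \text{if } t<\lambda_1,\\[2ex]
(-1)^{p+1} + (-1)^p\displaystyle\sum_{k=m}^{p}\frac{(\lambda_k-t)^{p-1}}{\prod_{j\neq k}(\lambda_k-\lambda_j)} & \text{if } \lambda_{m-1}\le t<\lambda_m, \text{ for } 2\le m\le p,\\[2ex]
(-1)^{p+1} & \text{if } \lambda_p\le t<b,\\
0 & \text{if } t\ge b.
\end{cases}
\end{aligned} \tag{47}$$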
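By Proposition 2.2(2) and (4),
$$(-1)^{p+1} + (-1)^p\sum_{k=1}^{p}\frac{(\lambda_k-t)^{p-1}}{\prod_{j\neq k}(\lambda_k-\lambda_j)} = (-1)^{p+1} + (-1)^p\,\Delta^{(p-1)}_{\lambda_1,\dots,\lambda_p}\big((\lambda-t)^{p-1}\big) = 0, \tag{48}$$
and hence, $\phi$ is supported in $[a,b]$. Combining (47) and (48) gives
$$\phi(t) = (-1)^{p+1}\chi_{(-\infty,b]}(t) + (-1)^p\,\Delta^{(p-1)}_{\lambda_1,\dots,\lambda_p}\big((\lambda-t)_+^{p-1}\big). \tag{49}$$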
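The cancellation in (48) rests on the fact that the divided difference of order $p-1$ of a monic polynomial of degree $p-1$ equals 1. The sketch below illustrates this, together with formula (49), in exact rational arithmetic for distinct nodes. The function names and the sample nodes are hypothetical, and the sum formula is the one quoted above from Proposition 2.2(2).

```python
from fractions import Fraction
from math import prod


def divided_difference(f, nodes):
    """Divided difference of f over pairwise distinct nodes, via the sum
    formula  sum_k f(x_k) / prod_{j != k} (x_k - x_j)."""
    total = Fraction(0)
    for k, xk in enumerate(nodes):
        denom = prod((xk - xj for j, xj in enumerate(nodes) if j != k),
                     start=Fraction(1))
        total += Fraction(f(xk)) / denom
    return total


def truncated_term(t, nodes):
    """Divided difference of order p-1 of the truncated power (lambda - t)_+^{p-1}."""
    p = len(nodes)
    return divided_difference(lambda x: max(x - t, 0) ** (p - 1), nodes)


def phi(t, nodes, b):
    """Right-hand side of (49) for distinct nodes lying in (a, b)."""
    p = len(nodes)
    indicator = 1 if t <= b else 0
    return (-1) ** (p + 1) * indicator + (-1) ** p * truncated_term(t, nodes)


if __name__ == "__main__":
    nodes = [Fraction(0), Fraction(1), Fraction(3), Fraction(7)]  # p = 4
    t = Fraction(-2)
    # Monic cubic in lambda: its third-order divided difference is exactly 1.
    print(divided_difference(lambda x: (x - t) ** 3, nodes))
    # phi vanishes to the left of all nodes, as (48) asserts.
    print(phi(t, nodes, b=Fraction(10)))
```

Because the arithmetic is exact, the vanishing of $\phi$ to the left of the nodes is observed without rounding error.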
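Similarly, with the use of Definition 2.1 and Lemma 2.4(ii), one can see that (49) holds when some of the values $\lambda_1,\lambda_2,\dots,\lambda_p$ repeat. Combining (41) and (49) gives
$$G(z) = -\int_a^b\frac{1}{z-t}\,\nu_p\big((-\infty,t]\big)\,dt + \frac{1}{p!}\int_{[a,b]^p}\int_{\mathbb{R}}\frac{1}{z-t}\Big[\chi_{(-\infty,b]}(t) - \Delta^{(p-1)}_{\lambda_1,\dots,\lambda_p}\big((\lambda-t)_+^{p-1}\big)\Big]\,dt\,dm_{p,H_0,V}(\lambda_1,\dots,\lambda_p). \tag{50}$$
Changing the order of integration in the second integral in (50) and applying Lemma 6.2 along with the fact that $\nu_p$ is supported in $[a,b]$ imply the representation
$$\begin{aligned}
G(z) &= \int_{\mathbb{R}}\frac{1}{z-t}\Big[\frac{\tau(V^p)}{p!}\,\chi_{(-\infty,b]}(t) - \nu_p\big((-\infty,t]\big) - \frac{1}{p!}\int_{\mathbb{R}^p}\Delta^{(p-1)}_{\lambda_1,\dots,\lambda_p}\big((\lambda-t)_+^{p-1}\big)\,dm_{p,H_0,V}(\lambda_1,\dots,\lambda_p)\Big]\,dt\\
&= \int_{\mathbb{R}}\frac{1}{z-t}\Big[\frac{\tau(V^p)}{p!} - \nu_p\big((-\infty,t]\big) - \frac{1}{p!}\int_{\mathbb{R}^p}\Delta^{(p-1)}_{\lambda_1,\dots,\lambda_p}\big((\lambda-t)_+^{p-1}\big)\,dm_{p,H_0,V}(\lambda_1,\dots,\lambda_p)\Big]\,dt. \qquad\Box
\end{aligned}$$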
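Proof of Theorem 6.1. In view of Theorem 6.5, it is enough to prove that the function
$$t\mapsto\frac{1}{p!}\Big[\tau(V^p) - p!\,\nu_p\big((-\infty,t]\big) - \int_{\mathbb{R}^p}\Delta^{(p-1)}_{\lambda_1,\dots,\lambda_p}\big((\lambda-t)_+^{p-1}\big)\,dm_{p,H_0,V}(\lambda_1,\dots,\lambda_p)\Big] \tag{51}$$
is real-valued. The integral
$$\int_{\mathbb{R}^p}\Delta^{(p-1)}_{\lambda_1,\dots,\lambda_p}\big((\lambda-t)_+^{p-1}\big)\,dm_{p,H_0,V}(\lambda_1,\dots,\lambda_p) \tag{52}$$
can be written as
$$\int_{\mathbb{R}^p}\Delta^{(p-1)}_{\lambda_1,\dots,\lambda_p}\big((\lambda-t)_+^{p-1}\big)\,d\big[\mathrm{Re}\,m_{p,H_0,V}(\lambda_1,\dots,\lambda_p)\big] + i\int_{\mathbb{R}^p}\Delta^{(p-1)}_{\lambda_1,\dots,\lambda_p}\big((\lambda-t)_+^{p-1}\big)\,d\big[\mathrm{Im}\,m_{p,H_0,V}(\lambda_1,\dots,\lambda_p)\big]. \tag{53}$$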
It is easy to see that
$$m_{p,H_0,V}(d\lambda_1,d\lambda_2,\dots,d\lambda_{p-1},d\lambda_p) = \overline{m_{p,H_0,V}(d\lambda_p,d\lambda_{p-1},\dots,d\lambda_2,d\lambda_1)},$$
and hence,
$$\mathrm{Im}\big(m_{p,H_0,V}(d\lambda_1,d\lambda_2,\dots,d\lambda_{p-1},d\lambda_p)\big) = -\,\mathrm{Im}\big(m_{p,H_0,V}(d\lambda_p,d\lambda_{p-1},\dots,d\lambda_2,d\lambda_1)\big). \tag{54}$$
Along with symmetry of the divided difference $\Delta^{(p-1)}_{\lambda_1,\dots,\lambda_p}\big((\lambda-t)_+^{p-1}\big)$ in $\lambda_1,\dots,\lambda_p$, the equality (54) implies that the second integral in (53) equals 0, and thus (52) is real-valued. We have that $\nu_1$ and $\eta_1$ are real-valued. By induction, we obtain that $\nu_p$ and $\eta_p$ are real-valued for every $p\in\mathbb{N}$. Therefore, (51) is real-valued. $\Box$

7. Spectral shift functions for $\mathcal{M} = \mathcal{B}(\mathcal{H})$

Proof of Theorem 5.1(i). Let $H_x = H_0 + xV$. The proof of the theorem will proceed in several steps.

Step 1. Assume first that $H_0$ is bounded and $f\in\mathcal{R}$. Let $[a,b]$ be a segment containing $\sigma(H_0)\cup\sigma(H_0+V)$. By Corollary 4.4, the finitely additive measure defined on rectangles by
$$m^{(1)}_{p,H_x,V}(A_1\times A_2\times\dots\times A_p\times A_{p+1}) = \tau\big[E_{H_x}(A_1)VE_{H_x}(A_2)V\cdots E_{H_x}(A_p)VE_{H_x}(A_{p+1})\big],$$
with $A_1,\dots,A_{p+1}$ Borel subsets of $\mathbb{R}$, extends to a countably additive measure with total variation not exceeding $\|V\|_2^p$. It follows from Corollary 4.11 and Remark 4.13 that
$$\frac{d^p}{dx^p}\,\tau\big[f(H_0+xV)\big] = p!\int_{\mathbb{R}^{p+1}}\Delta^{(p)}_{\lambda_1,\dots,\lambda_{p+1}}(f)\,dm^{(1)}_{p,H_x,V}(\lambda_1,\lambda_2,\dots,\lambda_{p+1}). \tag{55}$$
By Proposition 2.2(7),
$$\big|\Delta^{(p)}_{\lambda_1,\dots,\lambda_{p+1}}(f)\big| \le \frac{1}{p!}\max_{\lambda\in[a,b]}\big|f^{(p)}(\lambda)\big|,$$
which along with (55) ensures that
$$\Big|\frac{d^p}{dx^p}\,\tau\big[f(H_0+xV)\big]\Big| \le \|V\|_2^p\max_{\lambda\in[a,b]}\big|f^{(p)}(\lambda)\big|.$$
Applying the latter estimate to the integrand in (2) guarantees that $R_{p,H_0,V}(f)$ is a bounded functional on the space of $f^{(p)}$ with the norm not exceeding $\frac{1}{p!}\|V\|_2^p$. Therefore, there exists a measure $\nu_{p,H_0,V}$ supported in $[a,b]$ and of variation not exceeding $\frac{1}{p!}\|V\|_2^p$ such that
$$\tau\big[R_{p,H_0,V}(f)\big] = \int_a^b f^{(p)}(t)\,d\nu_{p,H_0,V}(t), \tag{56}$$
for all $f\in\mathcal{R}$.
Step 2. We prove the claim of the theorem for $H_0$ bounded and $f\in\mathcal{W}_p$. Repeating the reasoning of [10, Theorem 2.8], one extends (56) from $\mathcal{R}$ to the set of functions $\mathbb{R}\ni\lambda\mapsto e^{it\lambda}$, $t\in\mathbb{R}$, as follows. By Runge’s Theorem, there exists a sequence of rational functions $r_n$ with poles off $D = \{\lambda : |\lambda|\le 1+\|H_0\|+\|V\|\}$ such that
$$r_n^{(k)}(\lambda)\to(it)^k e^{it\lambda},\qquad \lambda\in D,\ k=0,1,2,\dots,$$
where the convergence is understood in the uniform sense. Making use of Lemma 3.8 and passing to the limit on both sides of (56) written for $f\in\mathcal{R}$ proves (56) for $f(\lambda) = e^{it\lambda}$, with the same measure $\nu_{p,H_0,V}$ as at the previous step. Finally, applying Corollary 3.12 extends (56) to the class of $f\in\mathcal{W}_p$, with the same measure $\nu_{p,H_0,V}$.

Step 3. Now we extend (25) to the case of an unbounded operator $H_0$ and $f\in\mathcal{W}_p$. This is done similarly to [10, Lemma 2.7], with replacement of iterated operator integrals by multiple operator integrals. Let $H_{0,n} = E_{H_0}((-n,n))H_0$ and $H_{x,n} = H_{0,n}+xV$. It follows from (2) of Theorem 1.1 that
$$R_{p,H_0,V}(f) - R_{p,H_{0,n},V}(f) = \frac{1}{(p-1)!}\int_0^1\Big(\frac{d^p}{dx^p}f(H_x) - \frac{d^p}{dx^p}f(H_{x,n})\Big)(1-x)^{p-1}\,dx.$$
There exists a finite Borel measure $\mu_f$ such that $f(\lambda) = \int_{\mathbb{R}} e^{it\lambda}\,d\mu_f(t)$. On the strength of Lemma 3.11,
$$\tau\Big[\frac{d^p}{dx^p}f(H_x) - \frac{d^p}{dx^p}f(H_{x,n})\Big] = p!\int_{\Pi^{(p)}}\tau\Big[e^{i(s_0-s_1)H_x}V\cdots Ve^{is_pH_x} - e^{i(s_0-s_1)H_{x,n}}V\cdots Ve^{is_pH_{x,n}}\Big]\,d\sigma_f^{(p)}(s_0,\dots,s_p). \tag{57}$$
Proposition 3.9 implies that the integrand in (57) converges to 0, and hence, the whole expression in (57) converges to 0 as $n\to\infty$. Then applying Proposition 3.9 yields
$$\lim_{n\to\infty}\tau\big[R_{p,H_0,V}(f) - R_{p,H_{0,n},V}(f)\big] = \lim_{n\to\infty}\frac{1}{(p-1)!}\int_0^1\tau\Big[\frac{d^p}{dx^p}f(H_x) - \frac{d^p}{dx^p}f(H_{x,n})\Big](1-x)^{p-1}\,dx = 0. \tag{58}$$
By the result of the previous step applied to the bounded operators $H_{0,n}$, there is a sequence of measures $\nu_{p,H_{0,n},V}$ of variation bounded by $c_p$, representing the functionals $R_{p,H_{0,n},V}(f)$ for $f\in\mathcal{W}_p$. Denote by $F_n$ the distribution function of $\nu_{p,H_{0,n},V}$. By Helly’s selection theorem, there is a subsequence $\{F_{n_k}\}_k$ and a function $F$ of variation not exceeding $c_p$ such that $F_{n_k}$ converges to $F$ pointwise and in $L^1_{\mathrm{loc}}(\mathbb{R})$. The trace formula (25) for bounded operators and the convergence in (58) ensure that the measure with the distribution $F$ satisfies (25) for $f\in\mathcal{W}_p$. $\Box$

Proof of Theorem 5.1(ii). It is an immediate consequence of Theorem 5.1(i) and Theorem 6.1. $\Box$
8. Spectral shift functions for an arbitrary semi-finite $\mathcal{M}$

Proof of Theorem 5.2 for $p = 2$. Due to Theorem 4.1 and Corollary 4.11, the proof of existence of Koplienko’s spectral shift function $\eta_2$ for a Hilbert–Schmidt perturbation $V$ [15, Lemma 3.3] (cf. also [6]) can be extended to the case of a $\tau$-Hilbert–Schmidt perturbation. $\Box$

The proof of Theorem 5.2 for $p = 3$ will be based on the fact (see the lemma below) that if a measure (possibly complex-valued) satisfies (25) for $f = f_z$, then it satisfies (25) for any $f\in\mathcal{R}_b$.

Lemma 8.1. Let $H_0 = H_0^*$ be an operator affiliated with $\mathcal{M}$ and $V = V^*\in L^2(\mathcal{M},\tau)$. Let $\nu_p$, with $p = 3$, be a Borel measure satisfying
$$R_{p,H_0,V}(f_z) = p!\int_{\mathbb{R}}\frac{1}{(z-t)^{p+1}}\,d\nu_p(t).$$
Then, for all $f\in\mathcal{R}_b$,
$$R_{p,H_0,V}(f) = \int_{\mathbb{R}} f^{(p)}(t)\,d\nu_p(t). \tag{59}$$
If, in addition, $H_0$ is bounded and $\nu_p$ is compactly supported, then (59) holds for $f\in\mathcal{R}$.

To prove Lemma 8.1, we need a simple lemma below.

Lemma 8.2. Assume that the trace formula (25) holds for $f = f_z$ with a finite measure $\nu_p$. Then,
$$G^{(p)}_{\nu_p}(z) = (-1)^p\,\tau\Big[(zI-H_0-V)^{-1} - \sum_{j=0}^{p-1}(zI-H_0)^{-1}\big(V(zI-H_0)^{-1}\big)^{j}\Big] \tag{60}$$
$$\phantom{G^{(p)}_{\nu_p}(z)} = (-1)^p\,\tau\Big[(zI-H_0-V)^{-1}\big(V(zI-H_0)^{-1}\big)^{p}\Big]. \tag{61}$$
Proof. Differentiating the integral in (30) gives
$$G^{(p)}_{\nu_p}(z) = (-1)^p\,p!\int_{\mathbb{R}}\frac{1}{(z-t)^{p+1}}\,d\nu_p(t),\qquad \mathrm{Im}(z)\neq0. \tag{62}$$
Applying the trace formula (25) to $f = f_z$ ensures
$$\tau\Big[(zI-H_0-V)^{-1} - \sum_{j=0}^{p-1}(zI-H_0)^{-1}\big(V(zI-H_0)^{-1}\big)^{j}\Big] = p!\int_{\mathbb{R}}\frac{1}{(z-t)^{p+1}}\,d\nu_p(t). \tag{63}$$
Comparing (63) with (62) completes the proof of (60); comparing (63) with (19) of Lemma 3.6 completes the proof of (61). $\Box$
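The passage from (60) to (61) uses an operator identity that is quoted from Lemma 3.6. In finite dimensions it follows by iterating the second resolvent identity $R = R_0 + RVR_0$, where $R = (zI-H_0-V)^{-1}$ and $R_0 = (zI-H_0)^{-1}$. The sketch below checks the identity for $2\times2$ matrices; it is only an illustration under that finite-dimensional assumption, and the helper names and sample matrices are hypothetical.

```python
def mat_mul(A, B):
    """Product of two 2x2 matrices given as nested lists."""
    return [[sum(A[i][k] * B[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]


def mat_add(A, B, sign=1):
    """Entrywise A + sign * B for 2x2 matrices."""
    return [[A[i][j] + sign * B[i][j] for j in range(2)] for i in range(2)]


def mat_inv(A):
    """Inverse of an invertible 2x2 matrix."""
    det = A[0][0] * A[1][1] - A[0][1] * A[1][0]
    return [[A[1][1] / det, -A[0][1] / det],
            [-A[1][0] / det, A[0][0] / det]]


def resolvent(H, z):
    """(zI - H)^{-1} for a 2x2 matrix H."""
    return mat_inv([[z - H[0][0], -H[0][1]], [-H[1][0], z - H[1][1]]])


def remainder_as_difference(H0, V, z, p):
    """(zI-H0-V)^{-1} - sum_{j<p} (zI-H0)^{-1} (V (zI-H0)^{-1})^j, as in (60)."""
    R0 = resolvent(H0, z)
    VR0 = mat_mul(V, R0)
    out = resolvent(mat_add(H0, V), z)
    term = R0  # equals R0 (V R0)^j, starting with j = 0
    for _ in range(p):
        out = mat_add(out, term, sign=-1)
        term = mat_mul(term, VR0)
    return out


def remainder_as_product(H0, V, z, p):
    """(zI-H0-V)^{-1} (V (zI-H0)^{-1})^p, as in (61)."""
    VR0 = mat_mul(V, resolvent(H0, z))
    out = resolvent(mat_add(H0, V), z)
    for _ in range(p):
        out = mat_mul(out, VR0)
    return out


if __name__ == "__main__":
    H0 = [[1.0, 2.0], [2.0, -1.0]]    # hypothetical self-adjoint H0
    V = [[0.5, -1.0], [-1.0, 0.25]]   # hypothetical self-adjoint V
    z = 0.7 + 1.3j
    print(remainder_as_difference(H0, V, z, 3))
    print(remainder_as_product(H0, V, z, 3))
```

The two printed matrices agree up to rounding, so their traces agree as well, which is the scalar statement used in Lemma 8.2.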
Proof of Lemma 8.1. Step 1. Assume that $H_0$ is bounded. We prove the claim for $f$ a polynomial. For $z\in\mathbb{C}\setminus\mathbb{R}$, with $|z|$ large enough,
$$G_{\nu_p}(z) = \sum_{k=0}^{\infty}z^{-(k+1)}\int_{\mathbb{R}}t^k\,d\nu_p(t),$$
and hence,
$$(-1)^pG^{(p)}_{\nu_p}(z) = \sum_{k=0}^{\infty}z^{-(k+p+1)}(k+1)(k+2)\cdots(k+p)\int_{\mathbb{R}}t^k\,d\nu_p(t). \tag{64}$$
On the other hand,
$$\begin{aligned}
(-1)^pG^{(p)}_{\nu_p}(z) &= \tau\Big[(zI-H_0-V)^{-1} - \sum_{j=0}^{p-1}(zI-H_0)^{-1}\big(V(zI-H_0)^{-1}\big)^{j}\Big]\\
&= \tau\Big[\frac{1}{z}\Big(I-\frac{H_0+V}{z}\Big)^{-1} - \sum_{j=0}^{p-1}\frac{1}{z^{j+1}}\Big(I-\frac{H_0}{z}\Big)^{-1}\Big(V\Big(I-\frac{H_0}{z}\Big)^{-1}\Big)^{j}\Big].
\end{aligned} \tag{65}$$
Employing the power series expansion in (65) gives
$$\begin{aligned}
(-1)^pG^{(p)}_{\nu_p}(z) &= \tau\Big[\sum_{m=0}^{\infty}\frac{1}{z}\Big(\frac{H_0+V}{z}\Big)^{m} - \sum_{j=0}^{p-1}\sum_{i=0}^{\infty}\frac{1}{z^{j+1}}\sum_{\substack{k_0,k_1,\dots,k_j\ge0\\ k_0+k_1+\dots+k_j=i}}\Big(\frac{H_0}{z}\Big)^{k_0}V\Big(\frac{H_0}{z}\Big)^{k_1}V\cdots V\Big(\frac{H_0}{z}\Big)^{k_j}\Big]\\
&= \tau\Big[\sum_{m=0}^{\infty}z^{-(m+1)}(H_0+V)^m - \sum_{j=0}^{p-1}\sum_{i=0}^{\infty}z^{-(j+1)}z^{-i}\sum_{\substack{k_0,k_1,\dots,k_j\ge0\\ k_0+k_1+\dots+k_j=i}}H_0^{k_0}VH_0^{k_1}V\cdots VH_0^{k_j}\Big].
\end{aligned} \tag{66}$$
By expanding $(H_0+V)^m$ one can see that
$$\tau\Big[\sum_{m=0}^{p-1}z^{-(m+1)}(H_0+V)^m - \sum_{j=0}^{p-1}\sum_{i=0}^{p-1-j}z^{-(j+1)}z^{-i}\sum_{\substack{k_0,k_1,\dots,k_j\ge0\\ k_0+k_1+\dots+k_j=i}}H_0^{k_0}VH_0^{k_1}V\cdots VH_0^{k_j}\Big] = 0. \tag{67}$$
Subtracting (67) from (66) yields
$$\begin{aligned}
(-1)^pG^{(p)}_{\nu_p}(z) &= \tau\Big[\sum_{m=p}^{\infty}z^{-(m+1)}(H_0+V)^m - \sum_{j=0}^{p-1}\sum_{i=p-j}^{\infty}z^{-(i+j+1)}\sum_{\substack{k_0,k_1,\dots,k_j\ge0\\ k_0+k_1+\dots+k_j=i}}H_0^{k_0}VH_0^{k_1}V\cdots VH_0^{k_j}\Big]\\
&= \tau\Big[\sum_{m=p}^{\infty}z^{-(m+1)}\Big((H_0+V)^m - \sum_{j=0}^{p-1}\sum_{\substack{k_0,k_1,\dots,k_j\ge0\\ k_0+k_1+\dots+k_j=m-j}}H_0^{k_0}VH_0^{k_1}V\cdots VH_0^{k_j}\Big)\Big].
\end{aligned} \tag{68}$$
By the continuity of the trace $\tau$, (68) can be rewritten as
$$\begin{aligned}
(-1)^pG^{(p)}_{\nu_p}(z) &= \sum_{m=p}^{\infty}z^{-(m+1)}\,\tau\Big[(H_0+V)^m - \sum_{j=0}^{p-1}\sum_{\substack{k_0,k_1,\dots,k_j\ge0\\ k_0+k_1+\dots+k_j=m-j}}H_0^{k_0}VH_0^{k_1}V\cdots VH_0^{k_j}\Big]\\
&= \sum_{k=0}^{\infty}z^{-(k+p+1)}\,\tau\Big[(H_0+V)^{k+p} - \sum_{j=0}^{p-1}\sum_{\substack{k_0,k_1,\dots,k_j\ge0\\ k_0+k_1+\dots+k_j=k+p-j}}H_0^{k_0}VH_0^{k_1}V\cdots VH_0^{k_j}\Big].
\end{aligned} \tag{69}$$
By comparing the representations for $(-1)^pG^{(p)}_{\nu_p}(z)$ of (64) and (69), we obtain that for any $k\in\{0\}\cup\mathbb{N}$,
$$\tau\Big[(H_0+V)^{k+p} - \sum_{j=0}^{p-1}\sum_{\substack{k_0,k_1,\dots,k_j\ge0\\ k_0+k_1+\dots+k_j=k+p-j}}H_0^{k_0}VH_0^{k_1}V\cdots VH_0^{k_j}\Big] = (k+1)(k+2)\cdots(k+p)\int_{\mathbb{R}}t^k\,d\nu_p(t),$$
along with Lemma 3.1 proving the trace formula (25) for all polynomials. We note that under the assumptions of Step 1, $p$ can be any natural number.

Step 2. Assume that $f\in\mathcal{R}_b$, with $H_0$ not necessarily bounded. It is enough to prove the statement for $f(t) = \frac{1}{(z-t)^{k+1}}$, $k\in\{0\}\cup\mathbb{N}$. Applying Lemma 3.7 gives
$$p!\int_{\mathbb{R}}\frac{1}{(z-t)^{p+1}}\,d\nu_p(t) = (p-1)!\int_{\mathbb{R}}\frac{1}{(z-t)^{p}}\,d\nu_{p-1}(t) - \tau\Big[\big((zI-H_0)^{-1}V\big)^{p-1}(zI-H_0)^{-1}\Big]. \tag{70}$$
Differentiating (70) $k$ times with respect to $z$ gives
$$(-1)^k(p+k)!\int_{\mathbb{R}}\frac{1}{(z-t)^{p+1+k}}\,d\nu_p(t) = (-1)^k(p-1+k)!\int_{\mathbb{R}}\frac{1}{(z-t)^{p+k}}\,d\nu_{p-1}(t) - \frac{d^k}{dz^k}\,\tau\Big[\big((zI-H_0)^{-1}V\big)^{p-1}(zI-H_0)^{-1}\Big]. \tag{71}$$
Dividing by $(-1)^kk!$ on both sides of (71) implies
$$\frac{(p+k)!}{k!}\int_{\mathbb{R}}\frac{1}{(z-t)^{p+1+k}}\,d\nu_p(t) = \frac{(p-1+k)!}{k!}\int_{\mathbb{R}}\frac{1}{(z-t)^{p+k}}\,d\nu_{p-1}(t) - \frac{(-1)^k}{k!}\,\frac{d^k}{dz^k}\,\tau\Big[\big((zI-H_0)^{-1}V\big)^{p-1}(zI-H_0)^{-1}\Big]. \tag{72}$$
Making use of the representation
$$R_{p-1,H_0,V}\Big(\frac{1}{(z-t)^{k+1}}\Big) = \frac{(p-1+k)!}{k!}\int_{\mathbb{R}}\frac{1}{(z-t)^{p+k}}\,d\nu_{p-1}(t)$$
(see Theorem 5.2 for Koplienko’s spectral shift function) and Lemma 3.2 converts (72) to
$$\frac{(p+k)!}{k!}\int_{\mathbb{R}}\frac{1}{(z-t)^{p+1+k}}\,d\nu_p(t) = R_{p-1,H_0,V}\Big(\frac{1}{(z-t)^{k+1}}\Big) - \frac{1}{2}\,\tau\Big[\frac{d^2}{dx^2}(zI-H_0-xV)^{-k-1}\Big|_{x=0}\Big]. \tag{73}$$
By (1),
$$R_{p,H_0,V}\Big(\frac{1}{(z-t)^{k+1}}\Big) = R_{p-1,H_0,V}\Big(\frac{1}{(z-t)^{k+1}}\Big) - \frac{1}{2}\,\tau\Big[\frac{d^2}{dx^2}(zI-H_0-xV)^{-k-1}\Big|_{x=0}\Big]. \tag{74}$$
Comparing (73) and (74) completes the proof of (25) for $f(t) = \frac{1}{(z-t)^{k+1}}$. $\Box$
Proof of Theorem 5.2 for p = 3. When H0 is bounded, Lemma 8.1 and Theorem 6.1 prove the theorem for f ∈ R. Repeating the argument of Step 2 from the proof of Theorem 5.1(i) for τ the standard trace extends (ii) and (iii) of Theorem 5.2 to f ∈ Wp for H0 bounded. Repeating the argument of Step 3 from the proof of Theorem 5.1(i) on each segment of R extends (i) to f ∈ Cc∞ (R) for H0 unbounded. 2
Proof of Theorem 5.6. (i) Due to Theorem 4.5, there exists a bounded measure νp satisfying the trace formula (25) for f ∈ Wp . The proof repeats the proof of Theorem 5.1 for the standard trace. (ii) Using the moment-cumulant formula (see [27, Theorem 2.17]), we have
$$\tau\Big[\big((zI-H_0)^{-1}V\big)^{p-1}\Big] = \sum_{\pi=\{B_1,\dots,B_\ell\}\in NC(p-1)}k_{K(\pi)}[V,\dots,V]\prod_{j=1}^{\ell}\tau\big((zI-H_0)^{-|B_j|}\big), \tag{75}$$
where (see the proof of Theorem 4.5 for a bit of explanation, or [27, Theorem 2.17] for a thorough description) kK(π) [V , . . . , V ] is a polynomial of τ (V ), τ (V 2 ), . . . , τ (V p−1 ). Since for b 1,
τ (zI − H0 )−b =
1 τ EH0 (dλ1 ) · · · EH0 (dλ1 ) , (z − λ1 ) . . . (z − λb )
Rb
we have
τ (zI − H0 )−b = j =1
Rp−1
1 dγp−1,π (λ1 , . . . , λp ), (z − λ1 ) · · · (z − λp )
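where $\gamma_{p-1,\pi}$ is the measure described at (21). Combining (75) and (20) gives
$$\tau\Big[\big((zI-H_0)^{-1}V\big)^{p-1}\Big] = \int_{\mathbb{R}^{p-1}}\Delta^{(p-2)}_{\lambda_1,\dots,\lambda_{p-1}}\Big(\frac{1}{z-\lambda}\Big)\,dm_{p-1,H_0,V}(\lambda_1,\dots,\lambda_{p-1}).$$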
Following the lines in the proof of Theorem 6.1 completes the proof of the absolute continuity of $\nu_p$ and repeating the proof of Lemma 8.1, Step 1, proves (25) for $f$ a polynomial. $\Box$

9. Spectral shift functions via basic splines

We represent the density of the measure $\nu_p$ provided by Theorem 5.1 as an integral of a basic spline against a certain multiple spectral measure when $H_0$ and $V$ are matrices. In addition, we show that existence of Krein’s spectral shift function can be derived from the representation of the Cauchy transform via basic splines when $\mathcal{M}$ is finite. The representation of the Cauchy transform via basic splines, in its turn, follows from the double integral representation of $f(H_0+V)-f(H_0)$.

Lemma 9.1. Let $\dim(\mathcal{H})<\infty$ and $H_0 = H_0^*$, $V = V^*\in\mathcal{M} = \mathcal{B}(\mathcal{H})$. Then the Cauchy transform of the measure $\nu_p$ satisfying (25) equals
$$G^{(p)}_{\nu_p}(z) = \frac{d^p}{dz^p}\Big[(-1)^p\int_{\mathbb{R}^{p+1}}\Delta^{(p)}_{\lambda_1,\dots,\lambda_{p+1}}\Big(\frac{1}{(p-1)!}(z-\lambda)^{p-1}\log(z-\lambda)\Big)\,dm^{(2)}_{p,H_0,V}(\lambda_1,\lambda_2,\dots,\lambda_{p+1})\Big],\qquad \mathrm{Im}(z)\neq0.$$

Proof. Upon applying Remark 4.12 and Lemma 8.2, we obtain
$$G^{(p)}_{\nu_p}(z) = (-1)^p\int_{\mathbb{R}^{p+1}}\Delta^{(p)}_{\lambda_1,\dots,\lambda_{p+1}}\Big(\frac{1}{z-\lambda}\Big)\,dm^{(2)}_{p,H_0,V}(\lambda_1,\lambda_2,\dots,\lambda_{p+1}).$$
By Lemma 2.4, one of the antiderivatives of order $p$ of the function
$$z\mapsto\Delta^{(p)}_{\lambda_1,\dots,\lambda_{p+1}}\Big(\frac{1}{z-\lambda}\Big)$$
equals
$$\Delta^{(p)}_{\lambda_1,\dots,\lambda_{p+1}}\Big(\frac{1}{(p-1)!}(z-\lambda)^{p-1}\log(z-\lambda) - c_{p-1}(z-\lambda)^{p-1}\Big),$$
where $c_{p-1}$ is a constant. Applying Proposition 2.2(4) gives
$$\Delta^{(p)}_{\lambda_1,\dots,\lambda_{p+1}}\big(c_{p-1}(z-\lambda)^{p-1}\big) = 0,$$
completing the proof of the lemma. $\Box$
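Lemma 9.2. Let $D_{p+1} = \{(\lambda_1,\lambda_2,\dots,\lambda_{p+1}) : \lambda_1=\lambda_2=\dots=\lambda_{p+1}\in\mathbb{R}\}$. Then, for any $(\lambda_1,\lambda_2,\dots,\lambda_{p+1})\in\mathbb{R}^{p+1}\setminus D_{p+1}$ and $z\in\mathbb{C}\setminus\mathbb{R}$,
$$\Delta^{(p)}_{\lambda_1,\dots,\lambda_{p+1}}\Big(\frac{1}{(p-1)!}(z-\lambda)^{p-1}\log(z-\lambda)\Big) = \frac{1}{(p-1)!}\int_{\mathbb{R}}\frac{(-1)^p}{z-t}\,\Delta^{(p)}_{\lambda_1,\dots,\lambda_{p+1}}\big((\lambda-t)_+^{p-1}\big)\,dt.$$

Proof. By Proposition 2.2(6),
$$\begin{aligned}
\Delta^{(p)}_{\lambda_1,\dots,\lambda_{p+1}}\Big(\frac{1}{(p-1)!}(z-\lambda)^{p-1}\log(z-\lambda)\Big)
&= \frac{1}{(p-1)!}\int_{\mathbb{R}}\frac{\partial^p}{\partial t^p}\Big(\frac{1}{(p-1)!}(z-t)^{p-1}\log(z-t)\Big)\,\Delta^{(p)}_{\lambda_1,\dots,\lambda_{p+1}}\big((\lambda-t)_+^{p-1}\big)\,dt\\
&= \frac{1}{(p-1)!}\int_{\mathbb{R}}\frac{(-1)^p}{z-t}\,\Delta^{(p)}_{\lambda_1,\dots,\lambda_{p+1}}\big((\lambda-t)_+^{p-1}\big)\,dt. \qquad\Box
\end{aligned}$$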
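The kernel $\Delta^{(p)}_{\lambda_1,\dots,\lambda_{p+1}}\big((\lambda-t)_+^{p-1}\big)$ in Lemma 9.2 is the basic spline for which the paper cites Proposition 2.2(5): it is non-negative and integrable with $L^1$-norm $1/p$. The sketch below checks these two properties numerically for distinct nodes. It assumes the sum formula for divided differences over distinct nodes and uses a plain midpoint rule; the function names and node values are hypothetical.

```python
def truncated_power(x, n):
    """(x)_+^n, taken to be 0 for x <= 0 (also when n == 0)."""
    return x ** n if x > 0 else 0.0


def basic_spline(t, nodes):
    """Divided difference of order p over p+1 distinct nodes of the
    function lambda -> (lambda - t)_+^{p-1}."""
    p = len(nodes) - 1
    total = 0.0
    for k, xk in enumerate(nodes):
        denom = 1.0
        for j, xj in enumerate(nodes):
            if j != k:
                denom *= (xk - xj)
        total += truncated_power(xk - t, p - 1) / denom
    return total


def l1_norm(nodes, steps=20000):
    """Midpoint-rule approximation of the integral of |basic_spline| over
    [min(nodes), max(nodes)], which contains the support of the spline."""
    a, b = min(nodes), max(nodes)
    h = (b - a) / steps
    return sum(abs(basic_spline(a + (i + 0.5) * h, nodes))
               for i in range(steps)) * h


if __name__ == "__main__":
    for nodes in ([0.0, 2.0], [0.0, 1.0, 3.0], [-1.0, 0.0, 2.0, 5.0]):
        p = len(nodes) - 1
        print(p, l1_norm(nodes), 1.0 / p)
```

For $p=1$ the spline reduces to the normalized indicator $\chi_{(\lambda_1,\lambda_2)}/|\lambda_2-\lambda_1|$, which is the kernel that appears in the formula for $\eta_1$ in Theorem 9.5.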
Theorem 9.3. Let $\dim(\mathcal{H})<\infty$ and $H_0 = H_0^*$, $V = V^*\in\mathcal{M} = \mathcal{B}(\mathcal{H})$. Then the Cauchy transform of the measure $\nu_p$ satisfying (25) equals
$$G_{\nu_p}(z) = \int_{\mathbb{R}}\frac{1}{z-t}\,\frac{1}{(p-1)!}\int_{\mathbb{R}^{p+1}\setminus D_{p+1}}\Delta^{(p)}_{\lambda_1,\dots,\lambda_{p+1}}\big((\lambda-t)_+^{p-1}\big)\,dm^{(2)}_{p,H_0,V}(\lambda_1,\lambda_2,\dots,\lambda_{p+1})\,dt.$$

Proof. By Lemmas 9.1 and 9.2,
$$\begin{aligned}
G_{\nu_p}(z) = \mathrm{pol}_p(z) &+ \frac{1}{(p-1)!}\int_{\mathbb{R}^{p+1}\setminus D_{p+1}}\int_{\mathbb{R}}\frac{1}{z-t}\,\Delta^{(p)}_{\lambda_1,\dots,\lambda_{p+1}}\big((\lambda-t)_+^{p-1}\big)\,dt\,dm^{(2)}_{p,H_0,V}(\lambda_1,\lambda_2,\dots,\lambda_{p+1})\\
&+ \frac{1}{(p-1)!}\int_{D_{p+1}}\frac{1}{z-\lambda}\,dm^{(2)}_{p,H_0,V}(\lambda,\lambda,\dots,\lambda),
\end{aligned} \tag{76}$$
where $\mathrm{pol}_p(z)$ is a polynomial of degree $p$. As stated in Proposition 2.2(5), the basic spline $\Delta^{(p)}_{\lambda_1,\dots,\lambda_{p+1}}\big((\lambda-t)_+^{p-1}\big)$ is non-negative and integrable, with the $L^1$-norm equal to $1/p$. By Corollary 4.4, the measure $m^{(2)}_{p,H_0,V}$ has bounded variation. On one hand, this guarantees that the first integral in (76) is $O(1/\mathrm{Im}(z))$ as $\mathrm{Im}(z)\to+\infty$. On the other hand, it allows us to change the order of integration in the first integral in (76). By Lemma 4.8, the second integral in (76) equals 0. Comparing the asymptotics of $G_{\nu_p}(z)$ and the integrals in (76) as $\mathrm{Im}(z)\to+\infty$ implies that $\mathrm{pol}_p(z) = 0$, completing the proof of the theorem. $\Box$

Corollary 9.4. Let $\dim(\mathcal{H})<\infty$ and $H_0 = H_0^*$, $V = V^*\in\mathcal{M} = \mathcal{B}(\mathcal{H})$. Then the density of the measure $\nu_p$ satisfying (25) equals
$$\eta_p(t) = \frac{1}{(p-1)!}\int_{\mathbb{R}^{p+1}\setminus D_{p+1}}\Delta^{(p)}_{\lambda_1,\dots,\lambda_{p+1}}\big((\lambda-t)_+^{p-1}\big)\,dm^{(2)}_{p,H_0,V}(\lambda_1,\lambda_2,\dots,\lambda_{p+1}),$$
for a.e. t ∈ R. Proof. By Theorem 9.3, the Cauchy transforms of νp and ηp (t)dt coincide. This implies (see Step 1 in the proof of Lemma 8.1) that the functionals given by νp and ηp (t)dt coincide on the polynomials defined on [a, b], where [a, b] contains the spectra of H0 and H0 + V . Hence, dνp = ηp (t)dt. 2 Below, we prove absolute continuity of ν1 by techniques different from those of [16]. Theorem 9.5. Let τ be finite. Let H0 = H0∗ be an operator affiliated with M and V = V ∗ ∈ M. The trace formula (25) with p = 1 holds for every f ∈ W1 , with ν1 absolutely continuous. The density η1 of ν1 is given by the formula
$$\eta_1(t) = \int_{\mathbb{R}^2\setminus D_2}\frac{1}{|\mu-\lambda|}\,\chi_{(\min\{\lambda,\mu\},\max\{\lambda,\mu\})}(t)\,dm^{(2)}_{1,H_0,V}(\lambda,\mu), \tag{77}$$
for a.e. $t\in\mathbb{R}$. If, in addition, $H_0$ is bounded, then (25) holds for $f\in\mathcal{R}$.

Proof. Repeating the argument in the proof of Theorem 9.3 leads to the formula (76). By Lemma 4.6, the measure $m^{(2)}_{1,H_0,V}$ is real-valued. Then by Lemma 4.8 and the Poisson inversion, for any $x\in\mathbb{R}$,
$$\lim_{\varepsilon\to0^+}\mathrm{Im}\int_{D_2}\frac{1}{x+i\varepsilon-\lambda}\,dm^{(2)}_{1,H_0,V}(\lambda,\lambda) = 0,$$
proving Krein’s trace formula for $f = f_z$ with $\eta_1$ given by (77). Adjusting the argument in the proof of Lemma 8.1, Step 2, extends (25) to $f\in\mathcal{R}_b$. Repeating the argument in the proof of [29, Lemma 8.3.2] extends the result of the theorem from $f\in\mathcal{R}$ to $f\in\mathcal{W}_1$ with the same absolutely continuous measure $d\nu_1(t) = \eta_1(t)\,dt$. $\Box$

10. Higher order spectral averaging formulas

Theorem 10.1. Assume that $H_0 = H_0^*\in\mathcal{M}$ and either $\tau$ is standard or $p = 2$. Let $V\in L^2(\mathcal{M},\tau)$. Then the measure
$$\int_0^1(1-x)^{p-1}\,\tau\Big[\big(E_{H_0+xV}(dt)V\big)^{p}\Big]\,dx$$
is absolutely continuous with the density equal to
$$\eta_p(t)\,(p-1)! - p\int_0^1(1-x)^{p-1}\int_{\mathbb{R}^{p+1}\setminus D_{p+1}}\Delta^{(p)}_{\lambda_1,\dots,\lambda_{p+1}}\big((\lambda-t)_+^{p-1}\big)\,dm^{(1)}_{p,H_0+xV,V}(\lambda_1,\dots,\lambda_{p+1})\,dx.$$
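Proof. Let $[a,b]\supset\sigma(H_0)\cup\sigma(H_0+V)$. Then by Theorem 1.1 (2) and Remark 4.13,
$$\begin{aligned}
\tau\big[R_{p,H_0,V}(f)\big] &= \frac{1}{(p-1)!}\int_0^1(1-x)^{p-1}\,\frac{d^p}{dx^p}\,\tau\big[f(H_0+xV)\big]\,dx\\
&= \frac{1}{(p-1)!}\int_0^1(1-x)^{p-1}\,p!\int_{\mathbb{R}^{p+1}}\Delta^{(p)}_{\lambda_1,\dots,\lambda_{p+1}}(f)\,dm^{(1)}_{p,H_0+xV,V}(\lambda_1,\dots,\lambda_{p+1})\,dx,
\end{aligned}$$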
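for $f\in C_c^\infty(\mathbb{R})$ such that $f|_{[a,b]}$ coincides with a polynomial. Applying Proposition 2.2(6) and then changing the order of integration yield
$$\begin{aligned}
\tau\big[R_{p,H_0,V}(f)\big] &= \frac{p}{(p-1)!}\int_0^1(1-x)^{p-1}\int_{\mathbb{R}^{p+1}\setminus D_{p+1}}\int_{\mathbb{R}}f^{(p)}(t)\,\Delta^{(p)}_{\lambda_1,\dots,\lambda_{p+1}}\big((\lambda-t)_+^{p-1}\big)\,dt\,dm^{(1)}_{p,H_0+xV,V}(\lambda_1,\dots,\lambda_{p+1})\,dx\\
&\quad+ \frac{1}{(p-1)!}\int_0^1(1-x)^{p-1}\int_{D_{p+1}}f^{(p)}(\lambda)\,dm^{(1)}_{p,H_0+xV,V}(\lambda,\dots,\lambda)\,dx\\
&= \int_{\mathbb{R}}f^{(p)}(t)\,\frac{p}{(p-1)!}\int_0^1(1-x)^{p-1}\int_{\mathbb{R}^{p+1}\setminus D_{p+1}}\Delta^{(p)}_{\lambda_1,\dots,\lambda_{p+1}}\big((\lambda-t)_+^{p-1}\big)\,dm^{(1)}_{p,H_0+xV,V}(\lambda_1,\dots,\lambda_{p+1})\,dx\,dt\\
&\quad+ \int_{\mathbb{R}}f^{(p)}(t)\,\frac{1}{(p-1)!}\int_0^1(1-x)^{p-1}\,\tau\Big[\big(E_{H_0+xV}(dt)V\big)^{p}\Big]\,dx.
\end{aligned} \tag{78}$$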
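Along with Theorem 5.1 in the case of $\mathcal{M} = \mathcal{B}(\mathcal{H})$ or Theorem 5.2 in the case of a general $\mathcal{M}$, respectively, (78) implies that
$$\begin{aligned}
\int_{\mathbb{R}}f^{(p)}(t)\,\frac{1}{(p-1)!}\int_0^1(1-x)^{p-1}\,\tau\Big[\big(E_{H_0+xV}(dt)V\big)^{p}\Big]\,dx
&= \int_{\mathbb{R}}f^{(p)}(t)\,\eta_p(t)\,dt\\
&\quad- \int_{\mathbb{R}}f^{(p)}(t)\,\frac{p}{(p-1)!}\int_0^1(1-x)^{p-1}\int_{\mathbb{R}^{p+1}\setminus D_{p+1}}\Delta^{(p)}_{\lambda_1,\dots,\lambda_{p+1}}\big((\lambda-t)_+^{p-1}\big)\,dm^{(1)}_{p,H_0+xV,V}(\lambda_1,\dots,\lambda_{p+1})\,dx\,dt,
\end{aligned} \tag{79}$$
from which the statement of the theorem follows. $\Box$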
Remark 10.2. The assertion of Theorem 10.1 remains true if $\tau$ is finite, $H_0$ and $V$ are free in $(\mathcal{M},\tau)$, and $p\ge2$.
Remark 10.3. The argument in the proof of Theorem 10.1 can be repeated for $p = 1$, provided $V\in L^1(\mathcal{M},\tau)$. Since $m^{(1)}_{1,H_0+xV,V}(A_1\times A_2) = \tau[E_{H_0+xV}(A_1\cap A_2)V]$, for Borel sets $A_1, A_2\subseteq\mathbb{R}$, one has that $m^{(1)}_{1,H_0+xV,V}(\mathbb{R}^{p+1}\setminus D_{p+1}) = 0$. Therefore, (78) converts to
$$\tau\big[f(H_0+V)-f(H_0)\big] = \int_0^1\int_{\mathbb{R}}f'(t)\,\tau\big[E_{H_0+xV}(dt)V\big]\,dx = \int_{\mathbb{R}}f'(t)\int_0^1\tau\big[E_{H_0+xV}(dt)V\big]\,dx.$$
Along with Krein’s trace formula the latter implies that
$$\int_0^1\tau\big[E_{H_0+xV}(dt)V\big]\,dx = \eta_1(t)\,dt.$$
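In finite dimensions, with $\tau$ the usual trace, the first displayed identity of Remark 10.3 reduces to $\mathrm{tr}[f(H_0+V)-f(H_0)] = \int_0^1\mathrm{tr}[f'(H_0+xV)V]\,dx$. The sketch below verifies this exactly for $f(t)=t^3$ and $2\times2$ rational matrices. It is only an illustration under these assumptions: the polynomial $f$, the helper names and the sample matrices are hypothetical choices. Since the integrand is a quadratic polynomial in $x$, Simpson's rule integrates it exactly.

```python
from fractions import Fraction


def mat_mul(A, B):
    """Product of two 2x2 matrices given as nested lists."""
    return [[sum(A[i][k] * B[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]


def mat_axpy(A, B, c):
    """Entrywise A + c * B for 2x2 matrices."""
    return [[A[i][j] + c * B[i][j] for j in range(2)] for i in range(2)]


def trace(A):
    return A[0][0] + A[1][1]


def cube(A):
    return mat_mul(mat_mul(A, A), A)


def difference_side(H0, V):
    """tr[f(H0 + V) - f(H0)] for f(t) = t^3."""
    return trace(cube(mat_axpy(H0, V, 1))) - trace(cube(H0))


def averaged_side(H0, V):
    """Integral over x in [0, 1] of tr[f'(H0 + xV) V] for f(t) = t^3,
    computed by Simpson's rule (exact, the integrand being quadratic in x)."""
    def g(x):
        Hx = mat_axpy(H0, V, x)
        return 3 * trace(mat_mul(mat_mul(Hx, Hx), V))
    return (g(Fraction(0)) + 4 * g(Fraction(1, 2)) + g(Fraction(1))) / 6


if __name__ == "__main__":
    H0 = [[Fraction(1), Fraction(2)], [Fraction(2), Fraction(-1)]]
    V = [[Fraction(3), Fraction(-1)], [Fraction(-1), Fraction(2)]]
    print(difference_side(H0, V), averaged_side(H0, V))
```

The equality holds because $\frac{d}{dx}\mathrm{tr}[(H_0+xV)^3] = 3\,\mathrm{tr}[(H_0+xV)^2V]$ by cyclicity of the trace.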
Acknowledgments

The authors thank Bojan Popov for helpful conversations about splines.

References

[1] N.A. Azamov, A.L. Carey, P.G. Dodds, F.A. Sukochev, Operator integrals, spectral shift, and spectral flow, Canad. J. Math., in press.
[2] N.A. Azamov, P.G. Dodds, F.A. Sukochev, The Krein spectral shift function in semifinite von Neumann algebras, Integral Equations Operator Theory 55 (2006) 347–362.
[3] M.Sh. Birman, Spectral shift function and double operator integrals, in: Linear and Complex Analysis Problem, Book 3, in: V.P. Havin, et al. (Eds.), Lecture Notes in Math., vol. 1573, Springer-Verlag, Berlin, 1994, pp. 272–273.
[4] M.Sh. Birman, M.Z. Solomyak, Remarks on the spectral shift function, Zap. Nauchn. Sem. LOMI 27 (1972) 33–46 (in Russian); translation in: J. Soviet Math. 3 (1975) 408–419.
[5] M.Sh. Birman, M.Z. Solomyak, Tensor product of a finite number of spectral measures is always a spectral measure, Integral Equations Operator Theory 24 (1996) 179–187.
[6] J.M. Bouclet, Trace formulae for relatively Hilbert–Schmidt perturbations, Asymptot. Anal. 32 (2002) 257–291.
[7] R.W. Carey, J.D. Pincus, Mosaics, principal functions, and mean motion in von Neumann algebras, Acta Math. 138 (1977) 153–218.
[8] Yu.L. Daletskii, S.G. Krein, Integration and differentiation of functions of Hermitian operators and application to the theory perturbations, in: Trudy Sem. Funktsion. Anal., vol. 1, Voronezh Gos. Univ., 1956, pp. 81–105 (in Russian).
[9] R.A. DeVore, G.G. Lorentz, Constructive Approximation, Grundlehren Math. Wiss., vol. 303, Springer-Verlag, Berlin, 1993.
[10] M. Dostanić, Trace formula for nonnuclear perturbations of selfadjoint operators, Publ. Inst. Math. 54 (68) (1993) 71–79.
[11] F. Gesztesy, K.A. Makarov, S.N. Naboko, The spectral shift operator, in: J. Dittrich, P. Exner, M. Tater (Eds.), Mathematical Results in Quantum Mechanics, in: Oper. Theory Adv. Appl., vol. 108, Birkhäuser, Basel, 1999, pp. 59–90.
[12] F. Gesztesy, A. Pushnitski, B. Simon, On the Koplienko spectral shift function, I. Basics, Zh. Mat. Fiz. Anal. Geom. 4 (1) (2008) 63–107.
[13] U. Haagerup, H. Schultz, Invariant subspaces for operators in a general II$_1$-factor, preprint arXiv:math/0611256.
[14] S. Karni, E. Mezbach, On the extension of bimeasures, J. Anal. Math. 55 (1990) 1–16.
[15] L.S. Koplienko, Trace formula for perturbations of nonnuclear type, Sibirsk. Mat. Zh. 25 (1984) 62–71 (in Russian); translation in: Trace formula for nontrace-class perturbations, Siberian Math. J. 25 (1984) 735–743.
[16] M.G. Krein, On a trace formula in perturbation theory, Mat. Sb. 33 (1953) 597–626 (in Russian).
[17] I.M. Lifshits, On a problem of the theory of perturbations connected with quantum statistics, Uspekhi Mat. Nauk 7 (1952) 171–180 (in Russian).
[18] K.A. Makarov, A. Skripka, Some applications of the perturbation determinant in finite von Neumann algebras, Canad. J. Math., in press.
[19] B. de Pagter, F.A. Sukochev, Differentiation of operator functions in non-commutative $L_p$-spaces, J. Funct. Anal. 212 (1) (2004) 28–75.
[20] B.S. Pavlov, On multidimensional operator integrals, in: Problems of Mathematical Analysis, vol. 2, Leningrad State Univ., 1969, pp. 99–121 (in Russian).
[21] V.V. Peller, Hankel operators in the perturbation theory of unbounded self-adjoint operators, in: Analysis and Partial Differential Equations, in: Lect. Notes Pure Appl. Math., vol. 122, Dekker, New York, 1990, pp. 529–544.
[22] V.V. Peller, An extension of the Koplienko–Neidhardt trace formulae, J. Funct. Anal. 221 (2005) 456–481.
[23] V.V. Peller, Multiple operator integrals and higher operator derivatives, J. Funct. Anal. 223 (2006) 515–544.
[24] J.T. Schwartz, Nonlinear Functional Analysis, Gordon and Breach Science Publishers, New York, London, Paris, 1969.
[25] B. Simon, Spectral averaging and the Krein spectral shift, Proc. Amer. Math. Soc. 126 (5) (1998) 1409–1413.
[26] A. Skripka, Trace inequalities and spectral shift, Oper. Matrices, in press.
[27] R. Speicher, Free calculus, in: Quantum Probability Communications, vol. XII, Grenoble, 1998, World Sci. Publ., River Edge, NJ, 2003, pp. 209–235.
[28] D.V. Voiculescu, K.J. Dykema, A. Nica, Free Random Variables, CRM Monogr. Ser., vol. 1, Amer. Math. Soc., 1992.
[29] D.R. Yafaev, Mathematical Scattering Theory: General Theory, Amer. Math. Soc., Providence, RI, 1992.
Journal of Functional Analysis 257 (2009) 1133–1143 www.elsevier.com/locate/jfa
A duality principle for groups Dorin Dutkay a , Deguang Han a,∗ , David Larson b a Department of Mathematics, University of Central Florida, Orlando, FL 32816, United States b Department of Mathematics, Texas A&M University, College Station, TX, United States
Received 16 February 2009; accepted 9 March 2009 Available online 24 March 2009 Communicated by D. Voiculescu
Abstract The duality principle for Gabor frames states that a Gabor sequence obtained by a time–frequency lattice is a frame for L2 (Rd ) if and only if the associated adjoint Gabor sequence is a Riesz sequence. We prove that this duality principle extends to any dual pairs of projective unitary representations of countable groups. We examine the existence problem of dual pairs and establish some connection with classification problems for II1 factors. While in general such a pair may not exist for some groups, we show that such a dual pair always exists for every subrepresentation of the left regular unitary representation when G is an abelian infinite countable group or an amenable ICC group. For free groups with finitely many generators, the existence problem of such a dual pair is equivalent to the well-known problem about the classification of free group von Neumann algebras. © 2009 Elsevier Inc. All rights reserved. Keywords: Group representations; Frame vectors; Bessel vectors; Duality principle; Von Neumann algebras; II1 factors
1. Introduction Motivated by the duality principle for Gabor representations in time–frequency analysis we establish a general duality theory for frame representations of infinite countable groups, and build its connection with the classification problem [2] of II1 factors. We start by recalling some notations and definitions about frames. * Corresponding author.
E-mail addresses: [email protected] (D. Dutkay), [email protected] (D. Han), [email protected] (D. Larson). 0022-1236/$ – see front matter © 2009 Elsevier Inc. All rights reserved. doi:10.1016/j.jfa.2009.03.007
A frame [6] for a Hilbert space $H$ is a sequence $\{x_n\}$ in $H$ with the property that there exist positive constants $A, B > 0$ such that
$$A\|x\|^2 \le \sum_{n}|\langle x,x_n\rangle|^2 \le B\|x\|^2 \tag{1.1}$$
holds for every $x\in H$. A tight frame refers to the case when $A = B$, and a Parseval frame refers to the case when $A = B = 1$. In the case that (1.1) holds only for all $x\in\overline{\mathrm{span}}\{x_n\}$, we say that $\{x_n\}$ is a frame sequence, i.e., it is a frame for its closed linear span. If we only require the right-hand side of the inequality (1.1), then $\{x_n\}$ is called a Bessel sequence.

One of the well studied classes of frames is the class of Gabor (or Weyl–Heisenberg) frames: Let $K = A\mathbb{Z}^d$ and $L = B\mathbb{Z}^d$ be two full-rank lattices in $\mathbb{R}^d$, and let $g\in L^2(\mathbb{R}^d)$ and $\Lambda = L\times K$. Then the Gabor (or Weyl–Heisenberg) family is the following collection of functions in $L^2(\mathbb{R}^d)$:
$$G(g,\Lambda) = G(g,L,K) := \big\{e^{2\pi i\langle\ell,x\rangle}g(x-\kappa) : \ell\in L,\ \kappa\in K\big\}.$$
For convenience, we write $g_\lambda = g_{\kappa,\ell} = e^{2\pi i\langle\ell,x\rangle}g(x-\kappa)$, where $\lambda = (\kappa,\ell)$. If $E_\ell$ and $T_\kappa$ are the modulation and translation unitary operators defined by $E_\ell f(x) = e^{2\pi i\langle\ell,x\rangle}f(x)$ and $T_\kappa f(x) = f(x-\kappa)$ for all $f\in L^2(\mathbb{R}^d)$, then we have $g_{\kappa,\ell} = E_\ell T_\kappa g$. The well-known Ron–Shen duality principle states that a Gabor sequence $G(g,\Lambda)$ is a frame (respectively, Parseval frame) for $L^2(\mathbb{R}^d)$ if and only if the adjoint Gabor sequence $G(g,\Lambda^o)$ is a Riesz sequence (respectively, orthonormal sequence), where $\Lambda^o = (B^t)^{-1}\mathbb{Z}^d\times(A^t)^{-1}\mathbb{Z}^d$ is the adjoint lattice of $\Lambda$.

Gabor frames can be viewed as frames obtained by projective unitary representations of the abelian group $\mathbb{Z}^d\times\mathbb{Z}^d$. Let $\Lambda = A\mathbb{Z}^d\times B\mathbb{Z}^d$ with $A$ and $B$ being $d\times d$ invertible real matrices. The Gabor representation $\pi_\Lambda$ defined by $(m,n)\to E_{Am}T_{Bn}$ is not necessarily a unitary representation of the group $\mathbb{Z}^d\times\mathbb{Z}^d$. But it is a projective unitary representation of $\mathbb{Z}^d\times\mathbb{Z}^d$. Recall (cf. [25]) that a projective unitary representation $\pi$ for a countable group $G$ is a mapping $g\to\pi(g)$ from $G$ into the group $U(H)$ of all the unitary operators on a separable Hilbert space $H$ such that $\pi(g)\pi(h) = \mu(g,h)\pi(gh)$ for all $g,h\in G$, where $\mu(g,h)$ is a scalar-valued function on $G\times G$ taking values in the circle group $\mathbb{T}$.
This function μ(g, h) is then called a multiplier or 2-cocycle of π . In this case we also say that π is a μ-projective unitary representation. It is clear from the definition that we have (i) μ(g1 , g2 g3 )μ(g2 , g3 ) = μ(g1 g2 , g3 )μ(g1 , g2 ) for all g1 , g2 , g3 ∈ G, (ii) μ(g, e) = μ(e, g) = 1 for all g ∈ G, where e denotes the group unit of G. Any function μ : G × G → T satisfying (i)–(ii) above will be called a multiplier for G. It follows from (i) and (ii) that we also have (iii) μ(g, g −1 ) = μ(g −1 , g) holds for all g ∈ G.
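Conditions (i)–(iii) can be tested numerically on a concrete candidate multiplier. The sketch below takes $G = \mathbb{Z}\times\mathbb{Z}$ (written additively) and the function $\mu((m,n),(m',n')) = e^{2\pi i\theta nm'}$, which is of the type produced by commuting a translation past a modulation in dimension $d = 1$. The parameter `THETA`, the function names and the finite test window are hypothetical choices made only for illustration.

```python
import cmath
from itertools import product

THETA = 0.37  # hypothetical real parameter


def mu(g, h):
    """Candidate multiplier on Z x Z: mu((m, n), (m', n')) = exp(2*pi*i*THETA*n*m')."""
    (_, n1), (m2, _) = g, h
    return cmath.exp(2j * cmath.pi * THETA * n1 * m2)


def mul(g, h):
    """Group law of Z x Z."""
    return (g[0] + h[0], g[1] + h[1])


def inv(g):
    """Inverse in Z x Z."""
    return (-g[0], -g[1])


def satisfies_multiplier_conditions(elements, tol=1e-9):
    """Check (i), (ii) and (iii) on all tuples drawn from `elements`."""
    e = (0, 0)
    for g in elements:
        if abs(mu(g, e) - 1) > tol or abs(mu(e, g) - 1) > tol:      # (ii)
            return False
        if abs(mu(g, inv(g)) - mu(inv(g), g)) > tol:                # (iii)
            return False
    for g1, g2, g3 in product(elements, repeat=3):                  # (i)
        lhs = mu(g1, mul(g2, g3)) * mu(g2, g3)
        rhs = mu(mul(g1, g2), g3) * mu(g1, g2)
        if abs(lhs - rhs) > tol:
            return False
    return True


if __name__ == "__main__":
    window = list(product(range(-2, 3), repeat=2))  # 25 group elements
    print(satisfies_multiplier_conditions(window))
```

A check on a finite window is of course not a proof; for this particular $\mu$ the identity (i) can be verified by hand, since both sides have exponent $n_1m_2+n_1m_3+n_2m_3$.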
D. Dutkay et al. / Journal of Functional Analysis 257 (2009) 1133–1143
Examples of projective unitary representations include unitary group representations (i.e., μ ≡ 1) and the Gabor representations in time–frequency analysis. Similar to the group unitary representation case, the left and right regular projective representations with a prescribed multiplier μ for G can be defined by

\[ \lambda_g \chi_h = \mu(g, h)\,\chi_{gh}, \quad h \in G, \]

and

\[ \rho_g \chi_h = \mu(h, g^{-1})\,\chi_{hg^{-1}}, \quad h \in G, \]
where $\{\chi_g : g \in G\}$ is the standard orthonormal basis for $\ell^2(G)$. Clearly, $\lambda_g$ and $\rho_g$ are unitary operators on $\ell^2(G)$. Moreover, λ is a μ-projective unitary representation of G with multiplier μ(g, h) and ρ is a projective unitary representation of G with multiplier μ(g, h). The representations λ and ρ are called the left regular μ-projective representation and the right regular μ-projective representation, respectively, of G. Let L and R be the von Neumann algebras generated by λ and ρ, respectively. It is known (cf. [9]), similarly to the case for regular group representations, that both R and L are finite von Neumann algebras, and that R is the commutant of L. Moreover, if for each $e \neq u \in G$, either $\{vuv^{-1} : v \in G\}$ or $\{\mu(vuv^{-1}, v)\mu(v, u) : v \in G\}$ is an infinite set, then both L and R are factor von Neumann algebras.

Notations. In this paper, for a subset M of a Hilbert space H and a subset $\mathcal{A}$ of the set B(H) of all the bounded linear operators on H, we will use [M] to denote the closed linear span of M, and $\mathcal{A}'$ to denote the commutant $\{T \in B(H) : TA = AT,\ \forall A \in \mathcal{A}\}$ of $\mathcal{A}$. So we have $L = \lambda(G)'' = \rho(G)'$ and $R = \rho(G)'' = \lambda(G)'$. We also use $M \simeq N$ to denote two ∗-isomorphic von Neumann algebras M and N. For any projection $P \in \lambda(G)'$ (where λ is the left regular projective representation) we use $\lambda|_P$ to denote the restriction of λ to PH. We refer to [17] for any other notations and terminologies about von Neumann algebras used in this paper.

Given a projective unitary representation π of a countable group G on a Hilbert space H, a vector ξ ∈ H is called a complete frame vector (resp. complete tight frame vector, complete Parseval frame vector) for π if $\{\pi(g)\xi\}_{g \in G}$ (here we view this as a sequence indexed by G) is a frame (resp. tight frame, Parseval frame) for the whole Hilbert space H, and is just called a frame vector (resp. tight frame vector, Parseval frame vector) for π if $\{\pi(g)\xi\}_{g \in G}$ is a frame sequence (resp. tight frame sequence, Parseval frame sequence). A Bessel vector for π is a vector ξ ∈ H such that $\{\pi(g)\xi\}_{g \in G}$ is Bessel. We will use $B_\pi$ to denote the set of all the Bessel vectors of π.

For x ∈ H, let $\Theta_x$ be the analysis operator for $\{\pi(g)x\}_{g \in G}$ (see Section 2). It is useful to note that if ξ and η are Bessel vectors for π, then $\Theta_\eta^* \Theta_\xi$ commutes with π(G). Thus, if ξ is a complete frame vector for π, then $\eta := S_\xi^{-1/2}\xi$ is a complete Parseval frame vector for π, where $S_\xi = \Theta_\xi^* \Theta_\xi$ is called the frame operator for ξ (or Bessel operator if ξ is a Bessel vector). Hence, a projective unitary representation has a complete frame vector if and only if it has a complete Parseval frame vector. In this paper the terminology frame representation refers to a projective unitary representation that admits a complete frame vector.

Proposition 1.1. (See [9,23].) Let π be a projective unitary representation of a countable group G on a Hilbert space H. Then π is a frame representation if and only if π is unitarily equivalent to a subrepresentation of the left regular projective unitary representation of G. Consequently, if π is a frame representation, then both $\pi(G)'$ and $\pi(G)''$ are finite von Neumann algebras.
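The passage from a complete frame vector ξ to the complete Parseval frame vector $S_\xi^{-1/2}\xi$ can be made concrete in a finite toy model. The sketch below assumes the finite cyclic group $\mathbb{Z}_N$ with trivial multiplier and the representation π(n) = diag(ω^{kn}), k ∈ E, which is the Fourier picture of a subrepresentation of the regular representation; the values of `N`, `E`, `XI` and all function names are hypothetical choices made for illustration and do not come from the paper.

```python
import cmath
import math

N = 8                                   # order of the cyclic group Z_N (illustrative)
E = [0, 2, 3, 5]                        # distinct characters kept by the projection
XI = [1.0 + 0j, 2.0j, -0.5 + 0.5j, 3.0 + 0j]   # candidate frame vector, no zero entry

def omega(k, n):
    """Character value exp(2*pi*i*k*n/N)."""
    return cmath.exp(2j * math.pi * k * n / N)

def orbit(vec):
    """Orbit {pi(n) vec : n in Z_N} with pi(n) = diag(omega(k, n)), k in E."""
    return [[omega(k, n) * v for k, v in zip(E, vec)] for n in range(N)]

def inner(x, y):
    """Inner product, linear in the first argument."""
    return sum(a * b.conjugate() for a, b in zip(x, y))

def frame_operator(vec):
    """Matrix of S y = sum_n <y, pi(n)vec> pi(n)vec, i.e. S_ij = sum_n v_i * conj(v_j)."""
    vs = orbit(vec)
    d = len(vec)
    return [[sum(v[i] * v[j].conjugate() for v in vs) for j in range(d)]
            for i in range(d)]

def parseval_vector(vec):
    """eta = S^{-1/2} vec; S is diagonal here because the characters in E are distinct."""
    S = frame_operator(vec)
    return [v / math.sqrt(S[i][i].real) for i, v in enumerate(vec)]

def analysis_energy(vec, y):
    """sum_n |<y, pi(n)vec>|^2, the squared norm of the analysis operator applied to y."""
    return sum(abs(inner(y, v)) ** 2 for v in orbit(vec))

if __name__ == "__main__":
    eta = parseval_vector(XI)
    y = [0.3 - 1j, 2.0 + 0j, -1.5j, 0.25 + 0.75j]
    print("||y||^2           :", inner(y, y).real)
    print("energy w.r.t. xi  :", analysis_energy(XI, y))
    print("energy w.r.t. eta :", analysis_energy(eta, y))
```

In this model the frame operator commutes with every π(n), so it is diagonal, and rescaling each coordinate of ξ is enough to obtain a Parseval frame vector. The infinite-dimensional statement in the text needs the operator-theoretic argument and is not replaced by this computation.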
The duality principle for Gabor frames was independently and essentially simultaneously discovered by Daubechies, H. Landau, and Z. Landau [3], Janssen [16], and Ron and Shen [24], and the techniques used in these three articles to prove the duality principle are completely different. We refer to [15] for more details about this principle and its important applications. For Gabor representations, let $\Lambda^o$ be the adjoint lattice of a lattice Λ. The well-known density theorem (cf. [14]) implies that one of the two projective unitary representations $\pi_\Lambda$ and $\pi_{\Lambda^o}$ for the group $G = \mathbb{Z}^d \times \mathbb{Z}^d$ must be a frame representation and the other admits a Riesz vector. So we can always assume that $\pi_\Lambda$ is a frame representation of $\mathbb{Z}^d \times \mathbb{Z}^d$ and hence $\pi_{\Lambda^o}$ admits a Riesz vector. Moreover, we also have $\pi_\Lambda(G)' = \pi_{\Lambda^o}(G)''$, and both representations share the same Bessel vectors. Rephrasing the duality principle in terms of Gabor representations, it states that $\{\pi_\Lambda(m, n)g\}_{m,n \in \mathbb{Z}^d}$ is a frame for $L^2(\mathbb{R}^d)$ if and only if $\{\pi_{\Lambda^o}(m, n)g\}_{m,n \in \mathbb{Z}^d}$ is a Riesz sequence. Our first main result reveals that this duality principle is not accidental and in fact it is a general principle for any commutant pair of projective unitary representations.

Definition 1.1. Two projective unitary representations π and σ of a countable group G on the same Hilbert space H are called a commutant pair if $\pi(G)' = \sigma(G)''$.

Theorem 1.2. Let π be a frame representation and (π, σ) be a commutant pair of projective unitary representations of G on H such that π has a complete frame vector which is also a Bessel vector for σ. Then
(i) $\{\pi(g)\xi\}_{g \in G}$ is a frame sequence if and only if $\{\sigma(g)\xi\}_{g \in G}$ is a frame sequence,
(ii) if, in addition, σ admits a Riesz sequence, then $\{\pi(g)\xi\}_{g \in G}$ is a frame (respectively, a Parseval frame) for H if and only if $\{\sigma(g)\xi\}_{g \in G}$ is a Riesz sequence (respectively, an orthonormal sequence).

For a frame representation π, we will call (π, σ) a dual pair if (π, σ) is a commutant pair such that π has a complete frame vector which is also a Bessel vector for σ, and σ admits a Riesz sequence. We remark that this duality property is not symmetric for π and σ since π is assumed to be a frame representation and σ in general is not. Theorem 1.2 naturally leads to the following existence problem:

Problem 1. Let G be an infinite countable group and μ be a multiplier for G. Does every μ-projective frame representation π of G admit a dual pair (π, σ)?

While we may be able to answer this problem for some special classes of groups, it is in general open due to its connections (see Theorem 1.4) with the classification problem of II_1 factors, which is one of the big problems in von Neumann algebra theory. It has been a longstanding unsolved problem to decide whether the factors obtained from the free groups with n and m generators respectively are isomorphic if n is not equal to m with both n, m > 1. This problem was one of the inspirations for Voiculescu's theory of free probability. Recall that the fundamental group F(M) of a type II_1 factor M is an invariant that was considered by Murray and von Neumann in connection with their notion of continuous dimension in [18], where they proved that $F(M) = \mathbb{R}^*_+$ when M is isomorphic to a hyperfinite type II_1 factor, and more generally when it splits off such a factor.
For free groups $F_n$ with n generators, by using Voiculescu's free probability theory [26], Radulescu [21,22] showed that the fundamental group $F(M) = \mathbb{R}^*_+$ for $M = \lambda(F_\infty)'$. But the problem of calculating F(M) for $M = \lambda(F_n)'$ with $2 \le n < \infty$ remains a central open problem in the classification of II_1 factors, and it can be rephrased as:

Problem 2. Let $F_n$ (n > 1) be the free group with n generators and let $P \in \lambda(F_n)'$ be a nontrivial projection. Is $\lambda(F_n)'$ ∗-isomorphic to $P\lambda(F_n)'P$?

It was proved independently by Dykema [7] and Radulescu [22] that either all the von Neumann algebras $P\lambda(F_n)'P$ ($0 \neq P \in \lambda(F_n)'$) are ∗-isomorphic, or no two of them are ∗-isomorphic. Our second main result establishes the equivalence of these two problems for free groups.

Theorem 1.3. Let $\pi = \lambda|_P$ be a subrepresentation of the left regular representation of an ICC (infinite conjugacy class) group G and let $P \in \lambda(G)'$ be a projection. Then the following are equivalent:
(i) $\lambda(G)'$ and $P\lambda(G)'P$ are isomorphic von Neumann algebras;
(ii) there exists a group representation σ such that (π, σ) form a dual pair.

The above theorem implies that the answer to Problem 1 is negative in general, but is affirmative for amenable ICC groups.

Theorem 1.4. Let G be a countable group and λ be its left regular unitary representation (i.e. μ ≡ 1). Then we have
(i) If G is either an abelian group or an amenable ICC group, then for every projection $0 \neq P \in \lambda(G)'$, there exists a unitary representation σ of G such that $(\lambda|_P, \sigma)$ is a dual pair.
(ii) There exist ICC groups (e.g., $G = \mathbb{Z}^2 \rtimes SL(2, \mathbb{Z})$) such that none of the nontrivial subrepresentations $\lambda|_P$ admits a dual pair.

We will give the proof of Theorem 1.2 in Section 2. The proof requires some recent work by the present authors, including the results on parameterizations and dilations of frame vectors [10–12], and some recent results on the "duality properties" for π-orthogonal and π-weakly equivalent vectors [13]. The proofs of Theorems 1.3 and 1.4 will be provided in Section 3, and additionally we will also discuss some concrete examples including the subspace duality principle for Gabor representations.

2. The duality principle

We need a series of preparations in order to prove Theorem 1.2. For any projective representation π of a countable group G on a Hilbert space H and x ∈ H, the analysis operator $\Theta_x$ for x from $D(\Theta_x)$ (⊆ H) to $\ell^2(G)$ is defined by

\[ \Theta_x(y) = \sum_{g \in G} \langle y, \pi(g)x \rangle \chi_g, \]

where $D(\Theta_x) = \{ y \in H : \sum_{g \in G} |\langle y, \pi(g)x \rangle|^2 < \infty \}$ is the domain space of $\Theta_x$. Clearly, $B_\pi \subseteq D(\Theta_x)$ holds for every x ∈ H. In the case that $B_\pi$ is dense in H, we have that $\Theta_x$ is a densely defined and closable linear operator from $B_\pi$ to $\ell^2(G)$ (cf. [8]). Moreover, $x \in B_\pi$ if and only if $\Theta_x$ is a bounded linear operator on H, which in turn is equivalent to the condition that $D(\Theta_x) = H$.

Lemma 2.1. (See [8].) Let π be a projective representation of a countable group G on a Hilbert space H such that $B_\pi$ is dense in H. Then for any x ∈ H, there exists $\xi \in B_\pi$ such that
(i) $\{\pi(g)\xi : g \in G\}$ is a Parseval frame for [π(G)x];
(ii) $\Theta_\xi(H) = [\Theta_x(B_\pi)]$.

Lemma 2.2. Assume that π is a projective representation of a countable group G on a Hilbert space H such that π admits a Riesz sequence and $B_\pi$ is dense in H. If $[\Theta_\xi(H)] \neq \ell^2(G)$, then there exists $0 \neq x \in H$ such that $[\Theta_x(H)] \perp [\Theta_\xi(H)]$.

Proof. Assume that $\{\pi(g)\eta\}_{g \in G}$ is a Riesz sequence. Then we have that $\Theta_\eta(H) = \ell^2(G)$ and $\Theta_\eta$ is invertible when restricted to [π(G)η]. Let P be the orthogonal projection from $\ell^2(G)$ onto $[\Theta_\xi(H)]$. Then $P \in \lambda(G)'$ and $P \neq I$. Let $x = \Theta_\eta^{-1} P^\perp \chi_e$. Then $x \neq 0$ and $[\Theta_x(H)] \perp [\Theta_\xi(H)]$. □

Lemma 2.3. (See [8,11].) Assume that π is a projective representation of a countable group G on a Hilbert space H such that π admits a complete frame vector ξ. If $\{\pi(g)\eta\}_{g \in G}$ is a frame sequence, then there exists a vector h ∈ H such that η and h are π-orthogonal and $\{\pi(g)(\eta + h)\}_{g \in G}$ is a frame for H.

Two other concepts are needed.

Definition 2.1. Suppose π is a projective unitary representation of a countable group G on a separable Hilbert space H such that the set $B_\pi$ of Bessel vectors for π is dense in H. We will say that two vectors x and y in H are π-orthogonal if the ranges of $\Theta_x$ and $\Theta_y$ are orthogonal, and that they are π-weakly equivalent if the closures of the ranges of $\Theta_x$ and $\Theta_y$ are the same.

The following result obtained in [13] characterizes π-orthogonality and π-weak equivalence in terms of the commutant of π(G).

Lemma 2.4. Let π be a projective representation of a countable group G on a Hilbert space H such that $B_\pi$ is dense in H, and let x, y ∈ H.
Then
(i) x and y are π-orthogonal if and only if $[\pi(G)'x] \perp [\pi(G)'y]$ (or equivalently, $x \perp \pi(G)'y$);
(ii) x and y are π-weakly equivalent if and only if $[\pi(G)'x] = [\pi(G)'y]$.

We also need the following parameterization result [10–12].

Lemma 2.5. Let π be a projective representation of a countable group G on a Hilbert space H such that $\{\pi(g)\xi\}_{g \in G}$ is a Parseval frame for H. Then
(i) $\{\pi(g)\eta\}_{g \in G}$ is a Parseval frame for H if and only if there is a unitary operator $U \in \pi(G)''$ such that $\eta = U\xi$;
(ii) $\{\pi(g)\eta\}_{g \in G}$ is a frame for H if and only if there is an invertible operator $U \in \pi(G)''$ such that $\eta = U\xi$;
(iii) $\{\pi(g)\eta\}_{g \in G}$ is a Bessel sequence if and only if there is an operator $U \in \pi(G)''$ such that $\eta = U\xi$, i.e., $B_\pi = \pi(G)''\xi$.

There are several other interesting parameterization results for frame vectors. In particular, all the frame vectors can be parameterized by a special class of k-tuples of operators from $\pi(G)'$ [4], where k is the cyclic multiplicity of $\pi(G)'$. As a consequence of Lemma 2.5 we have

Corollary 2.6. Let π be a frame representation of a countable group G on a Hilbert space H. Then
(i) $B_\pi$ is dense in H;
(ii) π has a complete frame vector which is also a Bessel vector for σ if and only if $B_\pi \subseteq B_\sigma$.

Proof. (i) follows immediately from Lemma 2.5(iii). For (ii), assume that $\{\pi(g)\xi\}_{g \in G}$ is a frame for H and $\{\sigma(g)\xi\}_{g \in G}$ is also Bessel. Then for every $\eta \in B_\pi$, we have by Lemma 2.5(iii) that there is $A \in \pi(G)''$
such that $\eta = A\xi$. Thus $\{\sigma(g)\eta\}_{g \in G} = A\{\sigma(g)\xi\}_{g \in G}$ is Bessel, and so $\eta \in B_\sigma$. Therefore we get $B_\pi \subseteq B_\sigma$. The other direction is trivial. □

Now we are ready to prove Theorem 1.2. We divide the proof into two propositions.

Proposition 2.7. Let π be a frame representation and (π, σ) be a commutant pair of projective unitary representations of G on H such that π has a complete frame vector which is also a Bessel vector for σ. Then $\{\pi(g)\xi\}_{g \in G}$ is a frame sequence (respectively, a Parseval frame sequence) if and only if $\{\sigma(g)\xi\}_{g \in G}$ is a frame sequence (respectively, a Parseval frame sequence).

Proof. "⇒": Assume that $\{\pi(g)\xi\}_{g \in G}$ is a frame sequence. Since π is a frame representation, by the dilation result (Lemma 2.3), there exists h ∈ H such that ξ and h are π-orthogonal and $\{\pi(g)(\xi + h)\}_{g \in G}$ is a frame for H. If we prove that $\{\sigma(g)(\xi + h)\}_{g \in G}$ is a frame sequence, then $\{\sigma(g)\xi\}_{g \in G}$ is a frame sequence. In fact, using the π-orthogonality of ξ and h and Lemma 2.4, we get that $[\pi(G)'\xi] \perp [\pi(G)'h]$, and hence $[\sigma(G)\xi] \perp [\sigma(G)h]$ since $\sigma(G)'' = \pi(G)'$. Therefore, projecting $\{\sigma(g)(\xi + h)\}_{g \in G}$ onto [σ(G)ξ] we get that $\{\sigma(g)\xi\}_{g \in G}$ is a frame sequence as claimed.

Thus, without loss of generality, we can assume that $\{\pi(g)\xi\}_{g \in G}$ is a frame for H. By Corollary 2.6, we have $\xi \in B_\pi \subseteq B_\sigma$. From Lemma 2.1 we can choose $\eta \in [\sigma(G)\xi] =: M$ such that ξ and η are σ-weakly equivalent and $\{\sigma(g)\eta\}_{g \in G}$ is a Parseval frame for [σ(G)ξ]. By the parameterization theorem (Lemma 2.5) there exists an operator $A \in \sigma(G)''|_M$ such that $\xi = A\eta$. Assume that C is the lower frame bound for $\{\pi(g)\xi\}_{g \in G}$. Then for every x ∈ M we have

\[ \|x\|^2 \le \frac{1}{C} \sum_{g \in G} |\langle x, \pi(g)\xi \rangle|^2 = \frac{1}{C} \sum_{g \in G} |\langle x, \pi(g)A\eta \rangle|^2 = \frac{1}{C} \sum_{g \in G} |\langle A^* x, \pi(g)\eta \rangle|^2 = \frac{1}{C} \|A^* x\|^2. \]
Thus $A^*$ is bounded from below and therefore it is invertible since $\sigma(G)''|_M$ is a finite von Neumann algebra (Proposition 1.1). This implies that A is invertible (on M) and so $\{\sigma(g)\xi\}_{g \in G}$ ($= \{A\pi(g)\eta\}_{g \in G}$) is a frame for M.

"⇐": Assume that $\{\sigma(g)\xi\}_{g \in G}$ is a frame sequence. Applying Lemma 2.1 again there exists $\eta \in [\pi(G)\xi]$ such that η and ξ are π-weakly equivalent, and $\{\pi(g)\eta\}_{g \in G}$ is a Parseval frame for [π(G)ξ]. Using the converse statement proved above, we get that $\{\sigma(g)\eta\}_{g \in G}$ is a frame sequence for M := [σ(G)η]. Since ξ and η are π-weakly equivalent, we have by Lemma 2.4 that $[\pi(G)'\xi] = [\pi(G)'\eta]$ and so M = [σ(G)η] = [σ(G)ξ]. Thus $\{\sigma(g)\eta\}_{g \in G}$ is a frame for [σ(G)ξ]. By the parameterization theorem (Lemma 2.5), there exists an invertible operator $A \in \sigma(G)''|_M$ such that $\xi = A\eta$. Extending A to an invertible operator B in $\sigma(G)''$, we have $A\eta = B\eta$, and so $\pi(g)\xi = \pi(g)A\eta = \pi(g)B\eta = B\pi(g)\eta$. Thus $\{\pi(g)\xi\}_{g \in G}$ is a frame sequence since $\{\pi(g)\eta\}_{g \in G}$ is a frame sequence and B is bounded invertible.

For the Parseval frame sequence case, all the operators A and B involved in the parameterization are unitary operators and the rest of the argument is identical to the frame sequence case. □

Proposition 2.8. Let π be a frame representation of G on H. Assume that (π, σ) is a commutant pair of projective unitary representations of G on H such that π has a complete frame vector which is also a Bessel vector for σ. If σ admits a Riesz sequence, then
(i) $\{\pi(g)\xi\}_{g \in G}$ is a frame for H if and only if $\{\sigma(g)\xi\}_{g \in G}$ is a Riesz sequence;
(ii) $\{\pi(g)\xi\}_{g \in G}$ is a Parseval frame for H if and only if $\{\sigma(g)\xi\}_{g \in G}$ is an orthonormal sequence.

Proof. (i) "⇒": Assume that $\{\pi(g)\xi\}_{g \in G}$ is a frame for H. Then from Proposition 2.7 we have that $\{\sigma(g)\xi\}_{g \in G}$ is a frame sequence. Thus, in order to show that $\{\sigma(g)\xi\}_{g \in G}$ is a Riesz sequence, it suffices to show that $[\Theta_{\sigma,\xi}(H)] = \ell^2(G)$, where $\Theta_{\sigma,\xi}$ is the analysis operator of $\{\sigma(g)\xi\}_{g \in G}$. We prove this by contradiction. Assume that $[\Theta_{\sigma,\xi}(H)] \neq \ell^2(G)$. Then, by Lemma 2.2, there is a vector $0 \neq x \in H$ such that $\Theta_{\sigma,x}(H) \perp \Theta_{\sigma,\xi}(H)$. Since $B_\sigma$ is dense in H (recall that $B_\pi$ is dense in H since π is a frame representation), we get by Lemma 2.4 that $[\sigma(G)'x] \perp [\sigma(G)'\xi]$ and so $[\pi(G)x] \perp [\pi(G)\xi]$ since $\sigma(G)' = \pi(G)''$. On the other hand, since $\{\pi(g)\xi\}_{g \in G}$ is a frame for H, we have [π(G)ξ] = H and so we have x = 0, a contradiction.

"⇐": Assume that $\{\sigma(g)\xi\}_{g \in G}$ is a Riesz sequence. Then, again by Proposition 2.7, we have that $\{\pi(g)\xi\}_{g \in G}$ is a frame sequence. So we only need to show that [π(G)ξ] = H. Let $\eta \perp [\pi(G)\xi]$. So we have $[\pi(G)\eta] \perp [\pi(G)\xi]$. By Lemma 2.4, we have that $\Theta_{\sigma,\eta}(H) \perp \Theta_{\sigma,\xi}(H)$. But $\Theta_{\sigma,\xi}(H) = \ell^2(G)$ since $\{\sigma(g)\xi\}_{g \in G}$ is a Riesz sequence. This implies that η = 0, and so [π(G)ξ] = H, as claimed.

(ii) Replacing "frame" by "Parseval frame" and "Riesz" by "orthonormal", the rest is exactly the same as in (i). □
3. The existence problem

We will divide the proof of Theorem 1.4 into two cases: the abelian group case and the ICC group case. We deal with the abelian group case first, and start with a simple example when $G = \mathbb{Z}$.

Example 3.1. Consider the unitary representation of $\mathbb{Z}$ defined by $\pi(n) = M_{e^{2\pi i n t}}$ on the Hilbert space $L^2[0, 1/2]$. Then $\sigma(n) = M_{e^{2\pi i 2nt}}$ is another unitary representation of $\mathbb{Z}$ on $L^2[0, 1/2]$. Note that $\{\sigma(n)1_{[0,1/2]}\}_{n \in \mathbb{Z}}$ is an orthogonal basis for $L^2[0, 1/2]$. We have that $\sigma(\mathbb{Z})''$ is maximal abelian and hence $\sigma(\mathbb{Z})'' = M_\infty = \pi(\mathbb{Z})'$. Moreover a function $f \in L^2[0, 1/2]$ is a Bessel vector for π (respectively, σ) if and only if $f \in L^\infty[0, 1/2]$. So π and σ share the same Bessel vectors. Therefore (π, σ) is a commutant pair with the property that $B_\pi = B_\sigma$, and σ admits a Riesz sequence.

It turns out that this example is generic for abelian countable discrete groups.

Proposition 3.1. Let π be a unitary frame representation of an abelian infinite countable discrete group G on H. Then there exists a group representation σ such that (π, σ) is a dual pair.

Proof. Let $\hat{G}$ be the dual group of G. Then $\hat{G}$ is a compact space. Let μ be the unique Haar measure of $\hat{G}$. Any frame representation π of G is unitarily equivalent to a representation of the form $g \mapsto e_g|_E$, where E is a measurable subset of $\hat{G}$ with positive measure, and $e_g$ is defined by $e_g(\chi) = \langle g, \chi \rangle$ for all $\chi \in \hat{G}$. So without loss of generality, we can assume that $\pi(g) = e_g|_E$. Let $\nu(F) := \frac{1}{\mu(E)}\mu(F)$ for any measurable subset F of E. Then both μ and ν are Borel probability measures without any atoms. Hence (see [5]) there is a measure preserving bijection ψ from E onto $\hat{G}$. Define a unitary representation σ of G on $L^2(E)$ by

\[ \sigma(g)f(\chi) = e_g(\psi(\chi)) f(\chi), \quad f \in L^2(E). \]

Then by the same arguments as in Example 3.1 we have that $\{\sigma(g)1_E\}_{g \in G}$ is an orthogonal basis for $L^2(E)$, and (π, σ) satisfies all the requirements of this proposition. □

Proof of Theorem 1.3. "(i) ⇒ (ii)": Let $\Phi : \lambda(G)' \to P\lambda(G)'P$ be an isomorphism between the two von Neumann algebras. Note that $\mathrm{tr}(A) = \langle A\chi_e, \chi_e \rangle$ is a normalized normal trace for $\lambda(G)'$. Define τ on $\lambda(G)'$ by

\[ \tau(A) = \frac{1}{\mathrm{tr}(P)} \mathrm{tr}(\Phi(A)), \quad \forall A \in \lambda(G)'. \]
Then τ is also a normalized normal trace for $\lambda(G)'$. Thus τ(·) = tr(·) since $\lambda(G)'$ is a factor von Neumann algebra. In particular we have that

\[ \frac{1}{\mathrm{tr}(P)} \mathrm{tr}(\Phi(\rho_g)) = \tau(\rho_g) = \mathrm{tr}(\rho_g) = \delta_{g,e}. \]

Therefore, if we define $\sigma(g) = \Phi(\rho_g)$, then σ is a unitary representation of G such that $\sigma(G)'' = P\lambda(G)'P = (\lambda(G)P)' = \pi(G)'$ and σ admits an orthogonal sequence $\{\sigma(g)\xi\}_{g \in G}$, where $\xi = P\chi_e$. Moreover, for any $A \in \pi(G)''$ we have that $\sigma(g)A\xi = A\sigma(g)\xi$ and so Aξ is a Bessel vector for σ. By Lemma 2.5(iii), we know that $B_\pi = \pi(G)''\xi$. Thus we get $B_\pi \subseteq B_\sigma$ and therefore (π, σ) is a dual pair.

"(ii) ⇒ (i)": Assume that (π, σ) is a dual pair. Let $\{\sigma(g)\psi\}_{g \in G}$ be a Riesz sequence, $\sigma_1(g) := \sigma(g)|_M$ and $\sigma_2(g) := \sigma(g)|_{M^\perp}$, where M = [σ(g)ψ]. Then σ is unitarily equivalent to the group representation $\zeta := \sigma_1 \oplus \sigma_2$ acting on the Hilbert space $K := M \oplus M^\perp$. Since $\sigma_1$ is unitarily equivalent to the right regular representation of G (because of the Riesz sequence), we have that $\sigma_1(G)'' \simeq \lambda(G)'$. Let q be the orthogonal projection from K onto M ⊕ 0. Then $q \in \zeta(G)'$. Clearly, $\zeta(G)''q \simeq \sigma_1(G)''$. Since $\zeta(G)''$ is a factor, we also have that $\zeta(G)'' \simeq \zeta(G)''q$, and hence $\sigma(G)'' \simeq \lambda(G)'$, i.e., $\lambda(G)' \simeq P\lambda(G)'P$ since $\sigma(G)'' = P\lambda(G)'P$. □

Remark 3.1. Although we stated the result in Theorem 1.3 for group representations, the proof works for general projective unitary representations when the von Neumann algebra generated by the left regular projective unitary representation of G is a factor.

Proof of Theorem 1.4. (i) The abelian group case is proved in Proposition 3.1. If G is an amenable ICC group, then the statement follows immediately from Theorem 1.3 and the famous result of A. Connes [2] that when G is an amenable ICC group, $\lambda(G)'$ is the hyperfinite II_1 factor, and we have that $\lambda(G)'$ and $P\lambda(G)'P$ are isomorphic for any non-zero projection $P \in \lambda(G)'$.

(ii) Recall that the fundamental group of a type II_1 factor M is the set of numbers t > 0 for which the "amplification" of M by t is isomorphic to M, $F(M) = \{t > 0 : M \simeq M^t\}$. Let $G = \mathbb{Z}^2 \rtimes SL(2, \mathbb{Z})$. Then, by [19,20], the fundamental group of $\lambda(G)'$ is {1}, which implies that the von Neumann algebra $P\lambda(G)'P$ is not ∗-isomorphic to $\lambda(G)'$ for any nontrivial projection $P \in \lambda(G)'$. Thus, by Theorem 1.3, none of the nontrivial subrepresentations $\lambda|_P$ admits a dual pair. □

Example 3.2. Let $G = F_\infty$. Using Voiculescu's free probability theory, Radulescu [21,22] proved that the fundamental group $F(M) = \mathbb{R}^*_+$ for $M = \lambda(F_\infty)'$. Therefore $\lambda(F_\infty)' \simeq P\lambda(F_\infty)'P$ for any nonzero projection $P \in \lambda(F_\infty)'$, and thus $\lambda|_P$ admits a dual pair for the free group $F_\infty$.

Example 3.3. Let $G = \mathbb{Z}^d \times \mathbb{Z}^d$, and let $\pi_\Lambda(m, n) = E_{Am}T_{Bn}$ be the Gabor representation of G on $L^2(\mathbb{R}^d)$ associated with the time–frequency lattice $\Lambda = A\mathbb{Z}^d \times B\mathbb{Z}^d$. Since G is abelian, we have that the von Neumann algebra $\pi_\Lambda(G)'$ is amenable (cf. [1]). Thus, if $\pi_\Lambda(G)'$ is a factor, then for every $\pi_\Lambda$-invariant subspace M of $L^2(\mathbb{R}^d)$, we have by the remark after the proof of Theorem 1.3 that $\pi_\Lambda|_M$ admits a dual pair. Therefore the duality principle in Gabor analysis holds also for subspaces at least for the factor case (e.g., d = 1, A = a and B = b with ab irrational).
In the case that A = B = I, the Gabor representation $\pi_\Lambda$ is a unitary representation of the abelian group $\mathbb{Z}^d \times \mathbb{Z}^d$, and so, from Proposition 3.1, the duality principle holds for subspaces in this case as well. In fact, in this case a concrete representation σ can be constructed by using the Zak transform.

Acknowledgments

The authors thank their colleagues Ionut Chifan, Ken Dykema, Junsheng Fang, Palle Jorgensen and Roger Smith for many helpful conversations and comments on this paper. They also thank the anonymous referee for some suggestions in the final version of this paper.
References

[1] E. Bédos, R. Conti, On twisted Fourier analysis and convergence of Fourier series on discrete groups, preprint, 2008.
[2] A. Connes, Classification of injective factors. Cases II_1, II_∞, III_λ, λ ≠ 1, Ann. of Math. 104 (1976) 73–115.
[3] I. Daubechies, H. Landau, Z. Landau, Gabor time–frequency lattices and the Wexler–Raz identity, J. Fourier Anal. Appl. 1 (1995) 437–478.
[4] D. Dutkay, D. Han, G. Picioroaga, Frames for ICC groups, J. Funct. Anal., in press.
[5] J. Dixmier, Von Neumann Algebras, with a preface by E.C. Lance, translated from the second French edition by F. Jellett, North-Holland Math. Library, vol. 27, North-Holland Publishing Co., Amsterdam–New York, 1981, xxxviii+437 pp.
[6] R.J. Duffin, A.C. Schaeffer, A class of nonharmonic Fourier series, Trans. Amer. Math. Soc. 72 (1952) 341–366.
[7] K. Dykema, Interpolated free group factors, Pacific J. Math. 163 (1994) 123–135.
[8] J.-P. Gabardo, D. Han, Subspace Weyl–Heisenberg frames, J. Fourier Anal. Appl. 7 (2001) 419–433.
[9] J.-P. Gabardo, D. Han, Frame representations for group-like unitary operator systems, J. Operator Theory 49 (2003) 223–244.
[10] J.-P. Gabardo, D. Han, The uniqueness of the dual of Weyl–Heisenberg subspace frames, Appl. Comput. Harmon. Anal. 17 (2004) 226–240.
[11] D. Han, Frame representations and Parseval duals with applications to Gabor frames, Trans. Amer. Math. Soc. 360 (2008) 3307–3326.
[12] D. Han, D. Larson, Frames, bases and group representations, Mem. Amer. Math. Soc. 697 (2000).
[13] D. Han, D. Larson, Frame duality properties for projective unitary representations, Bull. London Math. Soc. 40 (2008) 685–695.
[14] D. Han, Y. Wang, Lattice tiling and Weyl–Heisenberg frames, Geom. Funct. Anal. 11 (2001) 742–758.
[15] C. Heil, History and evolution of the density theorem for Gabor frames, J. Fourier Anal. Appl. 13 (2007) 113–166.
[16] A. Janssen, Duality and biorthogonality for Weyl–Heisenberg frames, J. Fourier Anal. Appl. 1 (1995) 403–436.
[17] R. Kadison, J. Ringrose, Fundamentals of the Theory of Operator Algebras, vols. I and II, Academic Press, Inc., 1983–1985.
[18] F.J. Murray, J. von Neumann, On rings of operators. IV, Ann. of Math. 44 (1943) 716–808.
[19] S. Popa, On the fundamental group of type II_1 factors, Proc. Natl. Acad. Sci. 101 (2004) 723–726.
[20] S. Popa, On a class of type II_1 factors with Betti numbers invariants, Ann. of Math. 163 (2006) 809–899.
[21] F. Radulescu, The fundamental group of the von Neumann algebra of a free group with infinitely many generators is R_+ \ {0}, J. Amer. Math. Soc. 5 (1992) 517–532.
[22] F. Radulescu, Random matrices, amalgamated free products and subfactors of the von Neumann algebra of a free group, of noninteger index, Invent. Math. 115 (1994) 347–389.
[23] M.A. Rieffel, Von Neumann algebras associated with pairs of lattices in Lie groups, Math. Ann. 257 (1981) 403–413.
[24] A. Ron, Z. Shen, Weyl–Heisenberg frames and Riesz bases in L^2(R^d), Duke Math. J. 89 (1997) 237–282.
[25] V.S. Varadarajan, Geometry of Quantum Theory, second ed., Springer-Verlag, New York–Berlin, 1985.
[26] D.V. Voiculescu, K.J. Dykema, A. Nica, Free Random Variables, A Noncommutative Probability Approach to Free Products with Applications to Random Matrices, Operator Algebras and Harmonic Analysis on Free Groups, CRM Monogr. Ser., Amer. Math. Soc., Providence, RI, 1992, vi+70.
Journal of Functional Analysis 257 (2009) 1144–1174 www.elsevier.com/locate/jfa
Energy image density property and the lent particle method for Poisson measures

Nicolas Bouleau a,∗, Laurent Denis b

a Ecole des Ponts, ParisTech, Paris-Est, 6 Avenue Blaise Pascal, 77455 Marne-La-Vallée Cedex 2, France
b Equipe Analyse et Probabilités, Université d'Evry-Val-d'Essonne, Boulevard François Mitterrand, 91025 Evry Cedex, France

Received 7 February 2009; accepted 9 March 2009
Available online 2 April 2009
Communicated by Paul Malliavin

Abstract

We introduce a new approach to absolute continuity of laws of Poisson functionals. It is based on the energy image density property for Dirichlet forms. The associated gradient is a local operator and gives rise to a nice formula called the lent particle method which consists in adding a particle and taking it back after some calculation.
© 2009 Elsevier Inc. All rights reserved.

Keywords: Poisson functionals; Dirichlet forms; Energy image density; Lévy processes; Gradient
Contents

1. Introduction
2. The Energy Image Density property (EID)
   2.1. A sufficient condition on (R^r, B(R^r))
   2.2. The case of a product structure
   2.3. The case of structures obtained by injective images
3. Dirichlet structure on the Poisson space related to a Dirichlet structure on the states space
   3.1. Density lemmas
   3.2. Construction using the Friedrichs' argument
      3.2.1. Basic formulas and pre-generator
      3.2.2. Particle-wise product of a Poisson measure and a probability
      3.2.3. Gradient and welldefinedness
      3.2.4. Upper structure and first properties
      3.2.5. Link with the Fock space
   3.3. Extension of the representation of the gradient and the lent particle method
      3.3.1. Extension of the representation of the gradient
      3.3.2. The lent particle method: first application
4. (EID) property on the upper space from (EID) property on the bottom space and the domain D_loc
5. Examples
   5.1. Upper bound of a process on [0, t]
   5.2. Regularizing properties of Lévy processes
   5.3. A regular case violating Hörmander conditions
References

* Corresponding author.
E-mail addresses: [email protected] (N. Bouleau), [email protected] (L. Denis).
0022-1236/$ – see front matter © 2009 Elsevier Inc. All rights reserved. doi:10.1016/j.jfa.2009.03.004
1. Introduction The aim of this article is to improve some tools provided by Dirichlet forms for studying the regularity of Poisson functionals. First, the energy image density property (EID) which guarantees the existence of a density for Rd -valued random variables whose carré du champ matrix is almost surely regular. Second, the Lipschitz functional calculus for a local gradient satisfying the chain rule, which yields regularity results for functionals of Lévy processes. For a local Dirichlet structure with carré du champ, the energy image density property is always true for real-valued functions in the domain of the form (Bouleau [5], Bouleau and Hirsch [10, Chap. I, §7]). It has been conjectured in 1986 (Bouleau and Hirsch [9, p. 251]) that (EID) was true for any Rd -valued function whose components are in the domain of the form for any local Dirichlet structure with carré du champ. This has been shown for the Wiener space equipped with the Ornstein–Uhlenbeck form and for some other structures by Bouleau and Hirsch (cf. [10, Chap. II, §5 and Chap. V, Example 2.2.4]) and also for the Poisson space by A. Coquio [12] when the intensity measure is the Lebesgue measure on an open set, but this conjecture being at present neither refuted nor proved in full generality, it has to be established in every particular setting. We will proceed in two steps: first (Section 2) we prove sufficient conditions for (EID) based mainly on a study of Shiqi Song [31] using a characterization of Albeverio and Röckner [2], then (Section 4) we show that the Dirichlet structure on the Poisson space obtained from a Dirichlet structure on the states space inherits from that one the (EID) property. 
If we think of a local Dirichlet structure with carré du champ $(X, \mathcal{X}, \nu, \mathbf{d}, \gamma)$ as a description of the Markovian movement of a particle on the space $(X, \mathcal{X})$ whose transition semi-group $p_t$ is symmetric with respect to the measure $\nu$ and strongly continuous on $L^2(\nu)$, the construction of the Poisson measure allows one to associate to this structure a structure on the Poisson space $(\Omega, \mathcal{A}, \mathbb{P}, \mathbb{D}, \Gamma)$ which describes similarly the movement of a family of independent identical particles whose initial law is the Poisson measure with intensity $\nu$. This construction is old and may be performed in several ways. The simplest one, from the point of view of Dirichlet forms, is based on products and follows faithfully the probabilistic construction (Bouleau [6], Denis [14], Bouleau [7, Chap. VI, §3]).
N. Bouleau, L. Denis / Journal of Functional Analysis 257 (2009) 1144–1174
The cuts that this method introduces are harmless for the functional calculus with the carré du champ $\Gamma$, but it does not clearly show what happens for the generator and its domain. Another way consists in using the transition semi-groups (Martin-Löf [20], Wu [33], partially Bichteler, Gravereaux and Jacod [4], Surgailis [32]). It is supposed that there exists a Markov process $x_t$ with values in $X$ whose transition semi-group $\pi_t$ is a version of $p_t$ (cf. Ma and Röckner [21, Chap. IV, §3]), the process starting at the point $z$ is denoted by $x_t(z)$ and a probability space $(W, \mathcal{W}, \Pi)$ is considered where a family $(x_t(z))_{z \in X}$ of independent processes is realized. For a symmetric function $F$, the new semi-group $P_t$ is directly defined by
$$(P_t F)(z_1, \dots, z_n, \dots) = \int F\big(x_t(z_1), \dots, x_t(z_n), \dots\big)\, d\Pi.$$
Choosing as initial law the Poisson measure with intensity $\nu$ on $(X, \mathcal{X})$, it is possible to show the symmetry and the strong continuity of $P_t$. This method, based on a deep physical intuition and often used in the study of infinite systems of particles, needs a careful formalization in order to prevent any drawback from the fact that the mapping $X \ni z \mapsto x_t(z)$ is not measurable in general due to the independence. For extensions of this method see [19]. In any case, the formulas involving the carré du champ and the gradient require computations and key results on the configuration space from which the construction may be performed as starting point. From this point of view the works are based either on the chaos decomposition (Nualart and Vives [25]) and provide tools in analogy with the Malliavin calculus on Wiener space, but non-local (Picard [26], Ishikawa and Kunita [17], Picard [27]), or on the expression of the generator on a sufficiently rich class and Friedrichs' argument (cf. what may be called the German school in spite of its cosmopolitanism, especially [1] and [22]). We will follow a way close to this last one. Several representations of the gradient are possible (Privault [28]) and we will propose here a new one with the advantages of both locality (chain rule) and simplicity on usual functionals. It provides a new method of computing the carré du champ $\Gamma$ (the lent particle method) whose efficiency is displayed on some examples. With respect to the announcement [8] we have introduced a clearer new notation, the operator $\varepsilon^-$ being shared from the integration by $N$. Applications to stochastic differential equations driven by Lévy processes will be gathered in another article.
2. The Energy Image Density property (EID)
In this section we give sufficient conditions for a Dirichlet structure to fulfill the (EID) property. These conditions concern finite dimensional cases and will be extended to the infinite dimensional setting of Poisson measures in Section 4.
For each positive integer $d$, we denote by $\mathcal{B}(\mathbb{R}^d)$ the Borel σ-field on $\mathbb{R}^d$ and by $\lambda^d$ the Lebesgue measure on $(\mathbb{R}^d, \mathcal{B}(\mathbb{R}^d))$; as usual, when no confusion is possible, we shall denote it by $dx$. For $f$ measurable, $f_*\nu$ denotes the image of the measure $\nu$ by $f$. For a σ-finite measure $\Lambda$ on some measurable space, a Dirichlet form on $L^2(\Lambda)$ with carré du champ $\gamma$ is said to satisfy (EID) if for any $d$ and for any $\mathbb{R}^d$-valued function $U$ whose components are in the domain of the form
$$U_*\big[(\det \gamma[U, U^t]) \cdot \Lambda\big] \ll \lambda^d,$$
where det denotes the determinant.
2.1. A sufficient condition on $(\mathbb{R}^r, \mathcal{B}(\mathbb{R}^r))$
Given $r \in \mathbb{N}^*$, for any $\mathcal{B}(\mathbb{R}^r)$-measurable function $u : \mathbb{R}^r \to \mathbb{R}$, all $i \in \{1, \dots, r\}$ and all $x = (x_1, \dots, x_{i-1}, x_{i+1}, \dots, x_r) \in \mathbb{R}^{r-1}$, we consider the function $u_x^{(i)} : \mathbb{R} \to \mathbb{R}$ defined by
$$u_x^{(i)}(s) = u\big((x, s)_i\big), \quad \forall s \in \mathbb{R},$$
where $(x, s)_i = (x_1, \dots, x_{i-1}, s, x_{i+1}, \dots, x_r)$. Conversely if $x = (x_1, \dots, x_r)$ belongs to $\mathbb{R}^r$ we set $x^i = (x_1, \dots, x_{i-1}, x_{i+1}, \dots, x_r)$. Then following standard notation, for any $\mathcal{B}(\mathbb{R})$-measurable function $\rho : \mathbb{R} \to \mathbb{R}^+$, we denote by $R(\rho)$ the largest open set on which $\rho^{-1}$ is locally integrable. Finally, we are given a Borel function $k : \mathbb{R}^r \to \mathbb{R}^+$ and an $\mathbb{R}^{r \times r}$-valued symmetric Borel function $\xi = (\xi_{ij})_{1 \le i,j \le r}$. We make the following assumptions which generalize Hamza's condition (cf. Fukushima, Oshima and Takeda [16, Chap. 3, §3.1 (3°), p. 105]):
Hypotheses (HG):
1. For any $i \in \{1, \dots, r\}$ and $\lambda^{r-1}$-almost all $x \in \{y \in \mathbb{R}^{r-1} : \int_{\mathbb{R}} k_y^{(i)}(s)\, ds > 0\}$, $k_x^{(i)} = 0$ $\lambda^1$-a.e. on $\mathbb{R} \setminus R(k_x^{(i)})$.
2. There exists an open set $O \subset \mathbb{R}^r$ such that $\lambda^r(\mathbb{R}^r \setminus O) = 0$ and $\xi$ is locally elliptic on $O$ in the sense that for any compact subset $K$ in $O$, there exists a positive constant $c_K$ such that
$$\forall x \in K,\ \forall c \in \mathbb{R}^r, \quad \sum_{i,j=1}^{r} \xi_{ij}(x) c_i c_j \ge c_K |c|^2.$$
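Following Albeverio and Röckner, Theorems 3.2 and 5.3 in [2] and also Röckner and Wielens, Section 4 in [29], we consider $\mathbf{d}$ the set of $\mathcal{B}(\mathbb{R}^r)$-measurable functions $u$ in $L^2(k\,dx)$, such that for any $i \in \{1, \dots, r\}$, and $\lambda^{r-1}$-almost all $x \in \mathbb{R}^{r-1}$, $u_x^{(i)}$ has an absolutely continuous version $\tilde u_x^{(i)}$ on $R(k_x^{(i)})$ (defined $\lambda^1$-a.e.) and such that $\sum_{i,j} \xi_{ij} \frac{\partial u}{\partial x_i} \frac{\partial u}{\partial x_j} \in L^1(k\,dx)$, where
$$\frac{\partial u}{\partial x_i} = \frac{d \tilde u_x^{(i)}}{ds}.$$
Sometimes, we will simply denote $\frac{\partial}{\partial x_i}$ by $\partial_i$. And we consider the following bilinear form on $\mathbf{d}$:
$$\forall u, v \in \mathbf{d}, \quad e[u, v] = \frac{1}{2} \int_{\mathbb{R}^r} \sum_{i,j} \xi_{ij}(x)\, \partial_i u(x)\, \partial_j v(x)\, k(x)\, dx.$$
As usual we shall simply denote $e[u, u]$ by $e[u]$. We have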
Proposition 1. $(\mathbf{d}, e)$ is a local Dirichlet form on $L^2(k\,dx)$ which admits a carré du champ operator $\gamma$ given by
$$\forall u, v \in \mathbf{d}, \quad \gamma[u, v] = \sum_{i,j} \xi_{ij}\, \partial_i u\, \partial_j v.$$
Proof. All is clear except the fact that $e$ is a closed form on $\mathbf{d}$. To prove it, let us consider a sequence $(u_n)_{n \in \mathbb{N}^*}$ of elements in $\mathbf{d}$ which converges to $u$ in $L^2(k\,dx)$ and such that $\lim_{n,m \to +\infty} e[u_n - u_m] = 0$. Let $W$ be an open subset which satisfies $\overline{W} \subset O$ and such that $\overline{W}$ is compact. Let $\mathbf{d}_W$ be the set of $\mathcal{B}(\mathbb{R}^r)$-measurable functions $u$ in $L^2(1_W \times k\,dx)$, such that for any $i \in \{1, \dots, r\}$, and $\lambda^{r-1}$-almost all $x \in \mathbb{R}^{r-1}$, $u_x^{(i)}$ has an absolutely continuous version $\tilde u_x^{(i)}$ on $R((1_W \times k)_x^{(i)})$ and such that $\sum_{i,j} \xi_{ij} \frac{\partial u}{\partial x_i} \frac{\partial u}{\partial x_j} \in L^1(1_W \times k\,dx)$, equipped with the bilinear form
$$\forall u, v \in \mathbf{d}_W, \quad e_W[u, v] = \frac{1}{2} \int_W \sum_i \partial_i u(x)\, \partial_i v(x)\, k(x)\, dx = \frac{1}{2} \int_W \nabla u(x) \cdot \nabla v(x)\, k(x)\, dx.$$
One can easily verify, since $W$ is an open set, that for all $x \in \mathbb{R}^{r-1}$
$$S_x^i(W) \cap R\big(k_x^{(i)}\big) \subset R\big((1_W \times k)_x^{(i)}\big), \tag{1}$$
where $S_x^i(W)$ is the open set $\{s \in \mathbb{R} : (x, s)_i \in W\}$. Then it is clear that the function $1_W \times k$ satisfies property 1 of (HG) and as a consequence of Theorems 3.2 and 5.3 in [2], $(\mathbf{d}_W, e_W)$ is a Dirichlet form on $L^2(1_W \times k\,dx)$. We have for all $n, m \in \mathbb{N}$
$$e_W(u_n - u_m) = \frac{1}{2} \int_W \big|\nabla u_n(x) - \nabla u_m(x)\big|^2 k(x)\, dx \le \frac{1}{c_W}\, e(u_n - u_m),$$
and as $(\mathbf{d}_W, e_W)$ is a closed form, we conclude that $u$ belongs to $\mathbf{d}_W$. Consider now an exhaustive sequence $(W_m)$ of relatively compact open sets in $O$ such that for all $m \in \mathbb{N}$, $\overline{W}_m \subset W_{m+1} \subset O$. We have that for all $m$, $u$ belongs to $\mathbf{d}_{W_m}$, hence by Theorems 3.2 and 5.3 in [2], for all $i \in \{1, \dots, r\}$, and $\lambda^{r-1}$-almost all $x \in \mathbb{R}^{r-1}$, $u_x^{(i)}$ has an absolutely continuous version on $\bigcup_{m=1}^{+\infty} R((1_{W_m} \times k)_x^{(i)})$. Using relation (1), we have
$$S_x^i(O) \cap R\big(k_x^{(i)}\big) = \bigcup_{m=1}^{+\infty} S_x^i(W_m) \cap R\big(k_x^{(i)}\big) \subset \bigcup_{m=1}^{+\infty} R\big((1_{W_m} \times k)_x^{(i)}\big).$$
As $\lambda^r(\mathbb{R}^r \setminus O) = 0$, we get that for almost all $x \in \mathbb{R}^{r-1}$, $\bigcup_{m=1}^{+\infty} R((1_{W_m} \times k)_x^{(i)}) = R(k_x^{(i)})$ $\lambda^1$-a.e. Moreover, by a diagonal extraction, we have that a subsequence of $(\nabla u_n)$ converges $k\,dx$-a.e. to $\nabla u$, so by Fatou's lemma, we conclude that $u \in \mathbf{d}$ and then $\lim_{n \to +\infty} e[u_n - u] = 0$, which is the desired result. □
For any $d \in \mathbb{N}^*$, if $u = (u_1, \dots, u_d)$ belongs to $\mathbf{d}^d$, we shall denote by $\gamma[u]$ the matrix $(\gamma[u_i, u_j])_{1 \le i,j \le d}$.
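Theorem 2. (EID) property: the structure $(\mathbb{R}^r, \mathcal{B}(\mathbb{R}^r), k\,dx, \mathbf{d}, \gamma)$ satisfies
$$\forall d \in \mathbb{N}^*,\ \forall u \in \mathbf{d}^d, \quad u_*\big[(\det \gamma[u]) \cdot k\,dx\big] \ll \lambda^d.$$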
Proof. Let us mention that a proof was given by S. Song in [31, Theorem 16], in the more general case of classical Dirichlet forms. Following his ideas, we present here a shorter proof. The proof is based on the co-area formula stated by H. Federer in [15, Theorems 3.2.5 and 3.2.12]. We first introduce the subset $A \subset \mathbb{R}^r$:
$$A = \big\{x \in \mathbb{R}^r : x_i \in R\big(k_{x^i}^{(i)}\big),\ i = 1, \dots, r\big\}.$$
As a consequence of property 1 of (HG), $\int_{A^c} k(x)\, dx = 0$. Let $u = (u_1, \dots, u_d) \in \mathbf{d}^d$. We follow the notation and definitions introduced by Bouleau and Hirsch in [10, Chap. II, Section 5.1]. Thanks to Theorem 3.2 in [2] and Stepanoff's theorem (see Theorem 3.1.9 in [15] or Remark 5.1.2, Chap. II in [10]), it is clear that for almost all $a \in A$, the approximate derivatives $\operatorname{ap} \frac{\partial u}{\partial x_i}$ exist for $i = 1, \dots, r$, and if we set
$$J u = \Big[\det\Big(\Big(\sum_{k=1}^{r} \partial_k u_i\, \partial_k u_j\Big)_{1 \le i,j \le d}\Big)\Big]^{1/2},$$
this is equal $k\,dx$-a.e. to the determinant of the approximate Jacobian matrix of $u$. Then, by Theorem 3.1.4 in [15], $u$ is approximately differentiable at almost all points $a$ in $A$. We denote by $\mathcal{H}^{r-d}$ the $(r - d)$-dimensional Hausdorff measure on $\mathbb{R}^r$. As a consequence of Theorems 3.1.8, 3.1.16 and Lemma 3.1.7 in [15], for all $n \in \mathbb{N}^*$, there exists a map $u_n : \mathbb{R}^r \to \mathbb{R}^d$ of class $C^1$ such that
$$\lambda^r\big(A \setminus \{x : u(x) = u_n(x)\}\big) \le \frac{1}{n}$$
and
$$\forall a \in \{x : u(x) = u_n(x)\}, \quad \operatorname{ap} \frac{\partial u}{\partial x_i}(a) = \operatorname{ap} \frac{\partial u_n}{\partial x_i}(a), \quad i = 1, \dots, r.$$
Assume first that $d \le r$. Let $B$ be a Borel set in $\mathbb{R}^d$ such that $\lambda^d(B) = 0$. Thanks to the co-area formula we have
$$\begin{aligned}
\int_{\mathbb{R}^r} 1_B(u(x))\, J u(x)\, k(x)\, dx
&= \int_A 1_B(u(x))\, J u(x)\, k(x)\, dx \\
&= \lim_{n \to +\infty} \int_{A \cap \{u = u_n\}} 1_B(u(x))\, J u(x)\, k(x)\, dx \\
&= \lim_{n \to +\infty} \int_{A \cap \{u = u_n\}} 1_B(u_n(x))\, J u_n(x)\, k(x)\, dx \\
&= \lim_{n \to +\infty} \int_{\mathbb{R}^d} \Big( \int_{(u_n)^{-1}(y)} 1_{A \cap \{u = u_n\}}(x)\, 1_B(u_n(x))\, k(x)\, d\mathcal{H}^{r-d}(x) \Big)\, d\lambda^d(y) \\
&= \lim_{n \to +\infty} \int_{\mathbb{R}^d} 1_B(y) \Big( \int_{(u_n)^{-1}(y)} 1_{A \cap \{u = u_n\}}(x)\, k(x)\, d\mathcal{H}^{r-d}(x) \Big)\, d\lambda^d(y) \\
&= 0.
\end{aligned}$$
So that $u_*(J u \cdot k\,dx) \ll \lambda^d$. We have
$$J u = \big[\det\big(Du \cdot (Du)^t\big)\big]^{1/2} \quad \text{and} \quad \gamma(u) = Du \cdot \xi \cdot Du^t,$$
where $Du$ is the $d \times r$ matrix $\big(\frac{\partial u_i}{\partial x_k}\big)_{1 \le i \le d;\ 1 \le k \le r}$. As $\xi(x)$ is symmetric and positive definite on $O$ and $\lambda^r(\mathbb{R}^r \setminus O) = 0$, we have
$$\{x \in A;\ J u(x) > 0\} = \{x \in A;\ \det \gamma(u)(x) > 0\} \quad \text{a.e.},$$
and this ends the proof in this case. Now, if $d > r$, $\det(\gamma(u)) = 0$ and the result is trivial. □
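The last step of the proof rests on a linear algebra fact: for a symmetric positive definite $\xi$, $\det(Du \cdot \xi \cdot Du^t) > 0$ exactly when $\det(Du \cdot Du^t) > 0$, that is, when $Du$ has full rank $d$. The following is a minimal sketch of that equivalence on small integer matrices. The matrices `XI`, `FULL_RANK` and `DEGENERATE` are hypothetical illustrations, not taken from the paper, and the code checks only two instances rather than proving anything.

```python
from itertools import permutations


def matmul(a, b):
    """Product of two matrices given as lists of rows."""
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))] for i in range(len(a))]


def transpose(a):
    return [list(row) for row in zip(*a)]


def det(m):
    """Determinant by Leibniz expansion (adequate for tiny integer matrices)."""
    n = len(m)
    total = 0
    for perm in permutations(range(n)):
        # Sign of the permutation from its inversion count.
        inversions = sum(1 for i in range(n) for j in range(i + 1, n)
                         if perm[i] > perm[j])
        term = -1 if inversions % 2 else 1
        for i in range(n):
            term *= m[i][perm[i]]
        total += term
    return total


def gram_and_gamma(du, xi):
    """Return (det(Du Du^t), det(Du xi Du^t)) for a d x r matrix Du."""
    dut = transpose(du)
    return det(matmul(du, dut)), det(matmul(matmul(du, xi), dut))


# Hypothetical symmetric positive definite xi on R^3 (leading minors 2, 5, 8).
XI = [[2, 1, 0], [1, 3, 1], [0, 1, 2]]
FULL_RANK = [[1, 0, 2], [0, 1, 1]]    # rank 2: both determinants are positive
DEGENERATE = [[1, 2, 3], [2, 4, 6]]   # rank 1: both determinants vanish

print(gram_and_gamma(FULL_RANK, XI))
print(gram_and_gamma(DEGENERATE, XI))
```

Integer entries keep the arithmetic exact, so the degenerate case returns true zeros rather than rounding noise.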
2.2. The case of a product structure
We consider a sequence of functions $\xi^i$ and $k_i$, $i \in \mathbb{N}^*$, $k_i$ being non-negative Borel functions such that $\int_{\mathbb{R}^r} k_i(x)\, dx = 1$. We assume that for all $i \in \mathbb{N}^*$, $\xi^i$ and $k_i$ satisfy hypotheses (HG) so that we can construct, as for $k$ in the previous subsection, the Dirichlet form $(\mathbf{d}_i, e_i)$ on $L^2(\mathbb{R}^r, k_i\,dx)$ associated to the carré du champ operator $\gamma_i$ given by:
$$\forall u, v \in \mathbf{d}_i, \quad \gamma_i[u, v] = \sum_{k,l} \xi^i_{kl}\, \partial_k u\, \partial_l v.$$
We now consider the product Dirichlet form $(\tilde{\mathbf{d}}, \tilde e) = \prod_{i=1}^{+\infty} (\mathbf{d}_i, e_i)$ defined on the product space $((\mathbb{R}^r)^{\mathbb{N}^*}, (\mathcal{B}(\mathbb{R}^r))^{\mathbb{N}^*})$ equipped with the product probability $\Lambda = \prod_{i=1}^{+\infty} k_i\,dx$. We denote by $(X_n)_{n \in \mathbb{N}^*}$ the coordinates maps on $(\mathbb{R}^r)^{\mathbb{N}^*}$. Let us recall that $U = F(X_1, X_2, \dots, X_n, \dots)$ belongs to $\tilde{\mathbf{d}}$ if and only if:
1. $U$ belongs to $L^2((\mathbb{R}^r)^{\mathbb{N}^*}, (\mathcal{B}(\mathbb{R}^r))^{\mathbb{N}^*}, \Lambda)$.
2. For all $k \in \mathbb{N}^*$ and $\Lambda$-almost all $(x_1, \dots, x_{k-1}, x_{k+1}, \dots)$ in $(\mathbb{R}^r)^{\mathbb{N}^*}$, $F(x_1, \dots, x_{k-1}, \cdot, x_{k+1}, \dots)$ belongs to $\mathbf{d}_k$.
3. $\tilde e(U) = \sum_k \int_{(\mathbb{R}^r)^{\mathbb{N}^*}} e_k\big(F(X_1(x), \dots, X_{k-1}(x), \cdot, X_{k+1}(x), \dots)\big)\, \Lambda(dx) < +\infty$.
Where as usual, the form $e_k$ acts only on the $k$th coordinate. It is also well known that $(\tilde{\mathbf{d}}, \tilde e)$ admits a carré du champ $\tilde\gamma$ given by
$$\tilde\gamma[U] = \sum_k \gamma_k\big[F(X_1, \dots, X_{k-1}, \cdot, X_{k+1}, \dots)\big](X_k).$$
To prove that (EID) is satisfied by this structure, we first prove that it is satisfied for a finite product. So, for all $n \in \mathbb{N}^*$, we consider $(\tilde{\mathbf{d}}_n, \tilde e_n) = \prod_{i=1}^{n} (\mathbf{d}_i, e_i)$ defined on the product space $((\mathbb{R}^r)^n, (\mathcal{B}(\mathbb{R}^r))^n)$ equipped with the product probability $\Lambda_n = \prod_{i=1}^{n} k_i\,dx$. By restriction, we keep the same notation as the one introduced for the infinite product. We know that this structure admits a carré du champ operator $\tilde\gamma_n$ given by $\tilde\gamma_n = \sum_{i=1}^{n} \gamma_i$.
Lemma 3. For all $n \in \mathbb{N}^*$, the Dirichlet structure $(\tilde{\mathbf{d}}_n, \tilde e_n)$ satisfies (EID):
$$\forall d \in \mathbb{N}^*,\ \forall U \in (\tilde{\mathbf{d}}_n)^d, \quad U_*\big[(\det \tilde\gamma_n[U]) \cdot \Lambda_n\big] \ll \lambda^d.$$
Proof. The proof consists in remarking that this is nothing but a particular case of Theorem 2 on $\mathbb{R}^{nr}$, $\xi$ being replaced by $\Xi$, the block-diagonal matrix of the $\xi^i$, and the density being the product density. □
As a consequence of Chapter V, Proposition 2.2.3 in Bouleau and Hirsch [10], we have
Theorem 4. The Dirichlet structure $(\tilde{\mathbf{d}}, \tilde e)$ satisfies (EID):
$$\forall d \in \mathbb{N}^*,\ \forall U \in \tilde{\mathbf{d}}^d, \quad U_*\big[(\det \tilde\gamma[U]) \cdot \Lambda\big] \ll \lambda^d.$$
2.3. The case of structures obtained by injective images
The following result could be extended to more general images (see Bouleau and Hirsch [10, Chapter V, §1.3, p. 196 et seq.]). We give the statement in the most useful form for Poisson measures and processes with independent increments. Let $(\mathbb{R}^p \setminus \{0\}, \mathcal{B}(\mathbb{R}^p \setminus \{0\}), \nu, \mathbf{d}, \gamma)$ be a Dirichlet structure on $\mathbb{R}^p \setminus \{0\}$ satisfying (EID). Thus $\nu$ is σ-finite, $\gamma$ is the carré du champ operator and the Dirichlet form is $e[u] = \frac{1}{2} \int \gamma[u]\, d\nu$. Let $U : \mathbb{R}^p \setminus \{0\} \to \mathbb{R}^q \setminus \{0\}$ be an injective map such that $U \in \mathbf{d}^q$. Then $U_*\nu$ is σ-finite. If we put
$$\mathbf{d}_U = \big\{\varphi \in L^2(U_*\nu) : \varphi \circ U \in \mathbf{d}\big\}, \quad e_U[\varphi] = e[\varphi \circ U], \quad \gamma_U[\varphi] = \frac{d\, U_*(\gamma[\varphi \circ U] \cdot \nu)}{d\, U_*\nu},$$
we have
Proposition 5. The term $(\mathbb{R}^q \setminus \{0\}, \mathcal{B}(\mathbb{R}^q \setminus \{0\}), U_*\nu, \mathbf{d}_U, \gamma_U)$ is a Dirichlet structure satisfying (EID).
Proof. (a) That $(\mathbb{R}^q \setminus \{0\}, \mathcal{B}(\mathbb{R}^q \setminus \{0\}), U_*\nu, \mathbf{d}_U, \gamma_U)$ is a Dirichlet structure is general and does not use the injectivity of $U$ (cf. the case $\nu$ finite in Bouleau and Hirsch [10, Chap. V, §1, p. 186 et seq.]).
(b) By the injectivity of $U$, we see that for $\varphi \in \mathbf{d}_U$
$$\gamma_U[\varphi] \circ U = \gamma[\varphi \circ U] \quad \nu\text{-a.s.}$$
so that if $f \in (\mathbf{d}_U)^r$
$$f_*\big[\det \gamma_U[f] \cdot U_*\nu\big] = (f \circ U)_*\big[\det \gamma[f \circ U] \cdot \nu\big],$$
which proves (EID) for the image structure. □
Remark 1. Applying this result yields examples of Dirichlet structures on $\mathbb{R}^n$ satisfying (EID) whose measures are carried by a (Lipschitzian) curve in $\mathbb{R}^n$ or, under some hypotheses, a countable union of such curves, and therefore without density.
3. Dirichlet structure on the Poisson space related to a Dirichlet structure on the states space
Let $(X, \mathcal{X}, \nu, \mathbf{d}, \gamma)$ be a local symmetric Dirichlet structure which admits a carré du champ operator, i.e. $(X, \mathcal{X}, \nu)$ is a measured space called the bottom space, $\nu$ is σ-finite and the bilinear form
$$e[f, g] = \frac{1}{2} \int \gamma[f, g]\, d\nu$$
is a local Dirichlet form with domain $\mathbf{d} \subset L^2(\nu)$ and carré du champ operator $\gamma$ (see Bouleau and Hirsch [10, Chap. I]). We assume that for all $x \in X$, $\{x\}$ belongs to $\mathcal{X}$ and that $\nu$ is diffuse ($\nu(\{x\}) = 0$ $\forall x$). The generator associated to this Dirichlet structure is denoted by $a$, its domain is $\mathcal{D}(a) \subset \mathbf{d}$ and it generates the Markovian strongly continuous semi-group $(p_t)_{t \ge 0}$ on $L^2(\nu)$. Our aim is to study, thanks to Dirichlet forms methods, functionals of a Poisson measure $N$ associated to $(X, \mathcal{X}, \nu)$. It is defined on the probability space $(\Omega, \mathcal{A}, \mathbb{P})$ where $\Omega$ is the configuration space, the set of measures which are countable sums of Dirac measures on $X$, $\mathcal{A}$ is the sigma-field generated by $N$ and $\mathbb{P}$ is the law of $N$ (see Neveu [24]). The probability space $(\Omega, \mathcal{A}, \mathbb{P})$ is called the upper space.
3.1. Density lemmas
Let $(F, \mathcal{F}, \mu)$ be a probability space such that for all $x \in F$, $\{x\}$ belongs to $\mathcal{F}$ and $\mu$ is diffuse. Let $n \in \mathbb{N}^*$. We denote by $x_1, x_2, \dots, x_n$ the coordinates maps on $(F^n, \mathcal{F}^{\otimes n}, \mu^{\times n})$ and we consider the random measure $m = \sum_{i=1}^{n} \varepsilon_{x_i}$.
Lemma 6. Let $\mathcal{S}$ be the symmetric sub-sigma-field in $\mathcal{F}^{\otimes n}$ and $p \in [1, +\infty[$. The sets $\{m(g_1) \cdots m(g_n) : g_i \in L^\infty(\mu)\ \forall i = 1, \dots, n\}$ and $\{e^{m(g)} : g \in L^\infty(\mu)\}$ are both total in $L^p(F^n, \mathcal{S}, \mu^{\times n})$ and the set $\{e^{i m(g)} : g \in L^\infty(\mu)\}$ is total in $L^p(F^n, \mathcal{S}, \mu^{\times n}; \mathbb{C})$.
Proof. Because $\mu$ is diffuse, the set $\{g_1(x_1) \cdots g_n(x_n) : g_i \in L^\infty(\mu),\ g_i \text{ with disjoint supports } \forall i = 1, \dots, n\}$ is total in $L^p(\mu^{\times n})$. Let $G(x_1, \dots, x_n)$ be a linear combination of such functions. If $F(x_1, \dots, x_n)$ is symmetric and belongs to $L^p(\mu^{\times n})$ then the distance in $L^p(\mu^{\times n})$ between $F(x_1, \dots, x_n)$ and $G(x_{\sigma(1)}, \dots, x_{\sigma(n)})$, for $\sigma \in \mathfrak{S}$ the set of permutations on $\{1, \dots, n\}$, does not depend on $\sigma$ and as a consequence is larger than the distance between $F(x_1, \dots, x_n)$ and the barycenter $\frac{1}{n!} \sum_{\sigma \in \mathfrak{S}} G(x_{\sigma(1)}, \dots, x_{\sigma(n)})$. So, the set $\{\frac{1}{n!} \sum_{\sigma \in \mathfrak{S}} G(x_{\sigma(1)}, \dots, x_{\sigma(n)}) : g_i \in L^\infty(\mu),\ g_i \text{ with disjoint supports } \forall i = 1, \dots, n\}$ is total in $L^p(F^n, \mathcal{S}, \mu^{\times n})$. We conclude by using the following property: if $f_i$, $i = 1, \dots, n$, are $\mathcal{F}$-measurable functions with disjoint supports then
$$m(f_1) \cdots m(f_n) = \sum_{\sigma \in \mathfrak{S}} f_1(x_{\sigma(1)}) \cdots f_n(x_{\sigma(n)}). \qquad \square$$
Lemma 7. Let $N_1$ be a random Poisson measure on $(F, \mathcal{F}, \mu_1)$ where $\mu_1$, the intensity of $N_1$, is a finite and diffuse measure, defined on some probability space $(\Omega_1, \mathcal{A}_1, \mathbb{P}_1)$ where $\mathcal{A}_1 = \sigma(N_1)$. Then, for any $p \in [1, +\infty[$, the set $\{e^{-N_1(f)} : f \ge 0,\ f \in L^\infty(\mu_1)\}$ is total in $L^p(\Omega_1, \mathcal{A}_1, \mathbb{P}_1)$ and $\{e^{i N_1(f)} : f \in L^\infty(\mu_1)\}$ is total in $L^p(\Omega_1, \mathcal{A}_1, \mathbb{P}_1; \mathbb{C})$.
Proof. Let us put $P = N_1(F)$; it is an integer-valued random variable. As $\{e^{i\lambda P} : \lambda \in \mathbb{R}\}$ is total in $L^p(\mathbb{N}, \mathcal{P}(\mathbb{N}), \mathbb{P}_P)$ where $\mathbb{P}_P$ is the law of $P$, for any $n \in \mathbb{N}^*$ and any $g \in L^\infty(\mu_1)$, one can approximate in $L^p(\Omega_1, \mathcal{A}_1, \mathbb{P}_1; \mathbb{C})$ the random variable $1_{\{P = n\}} e^{i N_1(g)}$ by a sequence of variables of the form $\sum_{k=1}^{K} a_k e^{i \lambda_k P} e^{i N_1(g)}$ with $a_k, \lambda_k \in \mathbb{R}$, $k = 1, \dots, K$. But, as a consequence of the previous lemma, we know that $\{1_{\{P = n\}} e^{i N_1(f)} : f \in L^\infty(\mu_1)\}$ is total in $L^p(\{P = n\}, \mathcal{A}_1|_{\{P = n\}}, \mathbb{P}_1|_{\{P = n\}}; \mathbb{C})$, which provides the result. □
We now give the main lemma, with the notation introduced at the beginning of this section.
Lemma 8. For $p \in [1, \infty[$, the set $\{e^{-N(f)} : f \ge 0,\ f \in L^1(\nu) \cap L^\infty(\nu)\}$ is total in $L^p(\Omega, \mathcal{A}, \mathbb{P})$ and $\{e^{i N(f)} : f \in L^1(\nu) \cap L^\infty(\nu)\}$ is total in $L^p(\Omega, \mathcal{A}, \mathbb{P}; \mathbb{C})$.
Proof. Assume that $\nu$ is non-finite. Let $(F_k)_{k \in \mathbb{N}}$ be a partition of $X$ such that for all $k$, $\nu(F_k)$ is finite. By restriction of $N$ to each set $F_k$, we construct a sequence of independent Poisson measures $(N_k)$ such that $N = \sum_k N_k$. As any variable in $L^p$ is the limit of variables which depend only on a finite number of the $N_k$, we conclude thanks to the previous lemma. □
3.2. Construction using the Friedrichs' argument
3.2.1. Basic formulas and pre-generator
We set $\tilde N = N - \nu$. Then the identity $\mathbb{E}[(\tilde N(f))^2] = \int f^2\, d\nu$, for $f \in L^1(\nu) \cap L^2(\nu)$, can be extended uniquely to $f \in L^2(\nu)$ and this makes it possible to define $\tilde N(f)$ for $f \in L^2(\nu)$.
The Laplace characteristic functional
$$\mathbb{E}\big[e^{i \tilde N(f)}\big] = e^{-\int (1 - e^{if} + if)\, d\nu}, \quad f \in L^2(\nu), \tag{2}$$
yields:
Proposition 9. For all $f \in \mathbf{d}$ and all $h \in \mathcal{D}(a)$,
$$\mathbb{E}\Big[e^{i \tilde N(f)} \Big(\tilde N(a[h]) + \frac{i}{2} N(\gamma[f, h])\Big)\Big] = 0. \tag{3}$$
Proof. Differentiating at 0 the map $t \mapsto \mathbb{E}[e^{i \tilde N(f + t a[h])}]$, we have thanks to (2),
$$\mathbb{E}\Big[e^{i \tilde N(f) + \int (1 - e^{if} + if)\, d\nu}\, \tilde N(a[h])\Big] = \int \big(e^{if} - 1\big)\, a[h]\, d\nu, \tag{4}$$
then using the fact that the function $x \mapsto e^{ix} - 1$ is Lipschitz and vanishes at 0, and the functional calculus related to a local Dirichlet form (see Bouleau and Hirsch [10, Section I.6]), we get that the right-hand side of (4) is equal to
$$-\frac{1}{2} \int \gamma\big[e^{if} - 1, h\big]\, d\nu = -\frac{i}{2} \int e^{if}\, \gamma[f, h]\, d\nu.$$
We conclude by applying once more (4) with $\gamma[f, h]$ instead of $a[h]$. □
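Formula (2) and the isometry $\mathbb{E}[(\tilde N(f))^2] = \int f^2\, d\nu$ can be checked numerically in the simplest case $f = c\, 1_A$ with $\nu(A) = \lambda$, where $N(f) = c\,P$ and $P$ is a Poisson variable of parameter $\lambda$. The sketch below sums the Poisson series directly. The function names and the truncation at 200 terms are assumptions made for the illustration, not part of the paper.

```python
import cmath
import math


def poisson_pmf(n, lam):
    """P(P = n) for a Poisson variable of parameter lam > 0, via logarithms."""
    return math.exp(-lam + n * math.log(lam) - math.lgamma(n + 1))


def lhs_char_functional(c, lam, terms=200):
    """E[exp(i * Ntilde(f))] for f = c * 1_A with nu(A) = lam (truncated series)."""
    return sum(poisson_pmf(n, lam) * cmath.exp(1j * c * (n - lam))
               for n in range(terms))


def rhs_char_functional(c, lam):
    """exp(-integral of (1 - e^{if} + i f) dnu) for the same f."""
    return cmath.exp(-lam * (1 - cmath.exp(1j * c) + 1j * c))


def second_moment(c, lam, terms=200):
    """E[Ntilde(f)^2], to be compared with the integral of f^2, i.e. c^2 * lam."""
    return sum(poisson_pmf(n, lam) * (c * (n - lam)) ** 2 for n in range(terms))


c, lam = 0.7, 2.5
print(abs(lhs_char_functional(c, lam) - rhs_char_functional(c, lam)))
print(second_moment(c, lam), c * c * lam)
```

For moderate $\lambda$ the truncated tail is far below floating point precision, so both comparisons agree up to rounding.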
The linear combinations of variables of the form $e^{i \tilde N(f)}$ with $f \in \mathcal{D}(a) \cap L^1(\nu)$ are dense in $L^2(\Omega, \mathcal{A}, \mathbb{P}; \mathbb{C})$ thanks to Lemma 8. This is a natural choice for test functions but, for technical reasons, we need in addition that $\gamma[f]$ belongs to $L^2(\nu)$. So we suppose:
Bottom core hypothesis (BC). The bottom structure is such that there exists a subspace $H$ of $\mathcal{D}(a) \cap L^1(\nu)$ such that $\forall f \in H$, $\gamma[f] \in L^2(\nu)$, and the space $\mathcal{D}_0$ of linear combinations of $e^{i \tilde N(f)}$, $f \in H$, is dense in $L^2(\Omega, \mathcal{A}, \mathbb{P}; \mathbb{C})$.
This hypothesis will be fulfilled in all cases on $\mathbb{R}^r$ where $\mathcal{D}(a)$ contains the $C^\infty$ functions with compact support and $\gamma$ operates on them. If $U = \sum_p \lambda_p e^{i \tilde N(f_p)}$ belongs to $\mathcal{D}_0$, we put
$$A_0[U] = \sum_p \lambda_p e^{i \tilde N(f_p)} \Big(i \tilde N(a[f_p]) - \frac{1}{2} N(\gamma[f_p])\Big). \tag{5}$$
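This is a natural choice as candidate for the pre-generator of the upper structure, since, as easily seen using (5), it induces the relation $\Gamma[N(f)] = N(\gamma[f])$ between the carré du champ operators of the upper and the bottom structures, which is satisfied in the case $\nu(X) < \infty$. One has to note that for the moment, $A_0$ is not uniquely determined since a priori $A_0[U]$ depends on the expression of $U$, which is possibly non-unique.
Proposition 10. Let $U, V \in \mathcal{D}_0$, $U = \sum_p \lambda_p e^{i \tilde N(f_p)}$ and $V = \sum_q \mu_q e^{i \tilde N(g_q)}$. One has
$$-\mathbb{E}\big[A_0[U]\, \overline{V}\big] = \frac{1}{2}\, \mathbb{E}\Big[\sum_{p,q} \lambda_p \overline{\mu_q}\, e^{i \tilde N(f_p - g_q)}\, N(\gamma[f_p, g_q])\Big] \tag{6}$$
which is also equal to
$$\frac{1}{2}\, \mathbb{E}\Big[\sum_{p,q} F'_p\, \overline{G'_q}\, N(\gamma[f_p, g_q])\Big], \tag{7}$$
where $F$ and $G$ are such that $U = F(\tilde N(f_1), \dots, \tilde N(f_n))$ and $V = G(\tilde N(g_1), \dots, \tilde N(g_m))$ and $F'_p = \frac{\partial F}{\partial x_p}(\tilde N(f_1), \dots, \tilde N(f_n))$, $G'_q = \frac{\partial G}{\partial x_q}(\tilde N(g_1), \dots, \tilde N(g_m))$.
Proof. We have
$$-\mathbb{E}\big[A_0[U]\, \overline{V}\big] = -\mathbb{E}\Big[\sum_{p,q} \lambda_p \overline{\mu_q}\, e^{i \tilde N(f_p - g_q)} \Big(i \tilde N(a[f_p]) - \frac{1}{2} N(\gamma[f_p])\Big)\Big].$$
Thanks to Proposition 9,
$$\begin{aligned}
-\mathbb{E}\Big[\sum_{p,q} \lambda_p \overline{\mu_q}\, e^{i \tilde N(f_p - g_q)}\, i \tilde N(a[f_p])\Big]
&= -\frac{1}{2}\, \mathbb{E}\Big[\sum_{p,q} \lambda_p \overline{\mu_q}\, e^{i \tilde N(f_p - g_q)}\, N(\gamma[f_p, f_p - g_q])\Big] \\
&= \frac{1}{2}\, \mathbb{E}\Big[\sum_{p,q} \lambda_p \overline{\mu_q}\, e^{i \tilde N(f_p - g_q)}\, N(\gamma[f_p, g_q])\Big] - \frac{1}{2}\, \mathbb{E}\Big[\sum_{p,q} \lambda_p \overline{\mu_q}\, e^{i \tilde N(f_p - g_q)}\, N(\gamma[f_p])\Big]
\end{aligned}$$
which gives the statement. □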
It remains to prove that $A_0$ is uniquely determined and so is an operator acting on $\mathcal{D}_0$. To this end, thanks to the previous proposition, we just have to prove that the quantity $\sum_{p,q} F'_p\, \overline{G'_q}\, N(\gamma[f_p, g_q])$ does not depend on the choice of representations for $U$ and $V$. In the same spirit as Ma and Röckner (see [22]), the introduction of a gradient will yield this non-dependence. Let us mention that the gradient we introduce is different from the one considered by these authors and is based on a notion that we present now.
3.2.2. Particle-wise product of a Poisson measure and a probability
We are still considering $N$ the random Poisson measure on $(X, \mathcal{X}, \nu)$ and we are given an auxiliary probability space $(R, \mathcal{R}, \rho)$. We construct a random Poisson measure $N \odot \rho$ on $(X \times R, \mathcal{X} \otimes \mathcal{R}, \nu \times \rho)$ such that if $N = \sum_i \varepsilon_{x_i}$ then $N \odot \rho = \sum_i \varepsilon_{(x_i, r_i)}$ where $(r_i)$ is a sequence of i.i.d. random variables independent of $N$ whose common law is $\rho$. Such a random Poisson measure $N \odot \rho$ is sometimes called a marked Poisson measure. The construction of $N \odot \rho$ follows line by line the one of $N$. Let us recall it. We first study the case where $\nu$ is finite and we consider the probability space
$$\big(\mathbb{N}, \mathcal{P}(\mathbb{N}), P_{\nu(X)}\big) \times \Big(X, \mathcal{X}, \frac{\nu}{\nu(X)}\Big)^{\mathbb{N}^*},$$
where $P_{\nu(X)}$ denotes the Poisson law with intensity $\nu(X)$ and we put
$$N = \sum_{i=1}^{Y} \varepsilon_{x_i} \quad \Big(\text{with the convention } \sum_{1}^{0} = 0\Big)$$
where $Y, x_1, \dots, x_n, \dots$ denote the coordinates maps. We introduce the probability space
$$(\hat\Omega, \hat{\mathcal{A}}, \hat{\mathbb{P}}) = (R, \mathcal{R}, \rho)^{\mathbb{N}^*},$$
and the coordinates are denoted by $r_1, \dots, r_n, \dots$. On the probability space $(\mathbb{N}, \mathcal{P}(\mathbb{N}), P_{\nu(X)}) \times (X, \mathcal{X}, \frac{\nu}{\nu(X)})^{\mathbb{N}^*} \times (\hat\Omega, \hat{\mathcal{A}}, \hat{\mathbb{P}})$, we define the random measure $N \odot \rho = \sum_{i=1}^{Y} \varepsilon_{(x_i, r_i)}$. It is a Poisson random measure on $X \times R$ with intensity measure $\nu \times \rho$. For $f \in L^1(\nu \times \rho)$
$$\hat{\mathbb{E}}\Big[\int_{X \times R} f\, dN \odot \rho\Big] = \int_X \int_R f(x, r)\, d\rho(r)\, N(dx) \quad \mathbb{P}\text{-a.e.} \tag{8}$$
and if $f \in L^2(\nu \times \rho)$
$$\hat{\mathbb{E}}\Big[\Big(\int_{X \times R} f\, dN \odot \rho\Big)^2\Big] = \Big(\int_X \int_R f\, d\rho\, dN\Big)^2 - \int_X \Big(\int_R f\, d\rho\Big)^2 dN + \int_X \int_R f^2\, d\rho\, dN, \tag{9}$$
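where $\hat{\mathbb{E}}$ stands for the expectation under the probability $\hat{\mathbb{P}}$. If $\nu$ is σ-finite, we extend this construction by a standard product argument. Eventually in all cases, we have constructed $N$ on $(\Omega, \mathcal{A}, \mathbb{P})$ and $N \odot \rho$ on $(\Omega, \mathcal{A}, \mathbb{P}) \times (\hat\Omega, \hat{\mathcal{A}}, \hat{\mathbb{P}})$; it is a random Poisson measure on $X \times R$ with intensity measure $\nu \times \rho$. We are now able to generalize identities (8) and (9):
Proposition 11. Let $F$ be an $\mathcal{A} \otimes \mathcal{X} \otimes \mathcal{R}$-measurable function such that $\mathbb{E} \int_{X \times R} F^2\, d\nu\, d\rho$ and $\mathbb{E} \int_R \big(\int_X |F|\, d\nu\big)^2 d\rho$ are both finite. Then the following relation holds
$$\hat{\mathbb{E}}\Big[\Big(\int_{X \times R} F\, dN \odot \rho\Big)^2\Big] = \Big(\int_X \int_R F\, d\rho\, dN\Big)^2 - \int_X \Big(\int_R F\, d\rho\Big)^2 dN + \int_X \int_R F^2\, d\rho\, dN. \tag{10}$$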
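Proof. Approximating first $F$ by a sequence of elementary functions and then introducing a partition $(B_k)$ of subsets of $X$ of finite $\nu$-measure, this identity is seen to be a consequence of (9). □
We denote by $\mathbb{P}_N$ the measure $\mathbb{P}_N = \mathbb{P}(dw)\, N_w(dx)$ on $(\Omega \times X, \mathcal{A} \otimes \mathcal{X})$. Let us remark that $\mathbb{P}_N$ and $\mathbb{P} \times \nu$ are singular because $\nu$ is diffuse. We will use the following consequence of the previous proposition:
Corollary 12. Let $F$ be an $\mathcal{A} \otimes \mathcal{X} \otimes \mathcal{R}$-measurable function. If $F$ belongs to $L^2(\Omega \times X \times R, \mathbb{P}_N \times \rho)$ and $\int F(w, x, r)\, \rho(dr) = 0$ for $\mathbb{P}_N$-almost all $(w, x)$, then $\int F\, dN \odot \rho$ is well defined and belongs to $L^2(\mathbb{P} \times \hat{\mathbb{P}})$; moreover
$$\hat{\mathbb{E}}\Big[\Big(\int_{X \times R} F\, dN \odot \rho\Big)^2\Big] = \int F^2\, dN\, d\rho \quad \mathbb{P}\text{-a.e.} \tag{11}$$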
Proof. If $F$ satisfies the hypotheses of Proposition 11 then the result is clear. The general case is obtained by approximation. □
3.2.3. Gradient and welldefinedness
From now on, we assume that the Hilbert space $\mathbf{d}$ is separable so that (see Bouleau and Hirsch [10, Ex. 5.9, p. 242]) the bottom Dirichlet structure admits a gradient operator in the sense that there exist a separable Hilbert space $H$ and a continuous linear map $D$ from $\mathbf{d}$ into $L^2(X, \nu; H)$ such that
• $\forall u \in \mathbf{d}$, $\|D[u]\|_H^2 = \gamma[u]$.
• If $F : \mathbb{R} \to \mathbb{R}$ is Lipschitz then $\forall u \in \mathbf{d}$, $D[F \circ u] = (F' \circ u)\, Du$.
• If $F$ is $C^1$ (continuously differentiable) and Lipschitz from $\mathbb{R}^d$ into $\mathbb{R}$ (with $d \in \mathbb{N}$) then $\forall u = (u_1, \dots, u_d) \in \mathbf{d}^d$, $D[F \circ u] = \sum_{i=1}^{d} (F'_i \circ u)\, D[u_i]$.
As only the Hilbertian structure plays a role, we can choose for $H$ the space $L^2(R, \mathcal{R}, \rho)$ where $(R, \mathcal{R}, \rho)$ is a probability space such that the dimension of the vector space $L^2(R, \mathcal{R}, \rho)$ is infinite. As usual, we identify $L^2(\nu; H)$ and $L^2(X \times R, \mathcal{X} \otimes \mathcal{R}, \nu \times \rho)$ and we denote the gradient $D$ by $\flat$:
$$\forall u \in \mathbf{d}, \quad Du = u^\flat \in L^2(X \times R, \mathcal{X} \otimes \mathcal{R}, \nu \times \rho).$$
Without loss of generality, we assume moreover that the operator $\flat$ takes its values in the orthogonal space of 1 in $L^2(R, \mathcal{R}, \rho)$; in other words we take for $H$ the orthogonal of 1. So that we have
$$\forall u \in \mathbf{d}, \quad \int u^\flat\, d\rho = 0 \quad \nu\text{-a.e.} \tag{12}$$
Let us emphasize that hypothesis (12), although restriction-free, is a key property here (as in many applications to error calculus, cf. [7, Chap. V, p. 225 et seq.]). Thanks to Corollary 12, it is the feature which will avoid non-local finite difference calculation on the upper space. Finally, although not necessary, we assume for simplicity that constants belong to $\mathbf{d}_{loc}$ (see Bouleau and Hirsch [10, Chap. I, Definition 7.1.3]):
$$1 \in \mathbf{d}_{loc}, \tag{13}$$
which implies $\gamma[1] = 0$ and $1^\flat = 0$.
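The role of the centring condition (12) can be seen on a toy case of identity (9): for a fixed finite configuration $x_1, \dots, x_n$ and i.i.d. marks drawn uniformly from a finite mark space, both sides of (9) can be computed exactly, and the first two terms on the right cancel as soon as each row is centred, which is the content of (11). The sketch below is a hypothetical finite illustration with exact rational arithmetic; the function names and the sample table of values are assumptions, not objects from the paper.

```python
from fractions import Fraction
from itertools import product


def mark_expectation_squared(values):
    """Exact E_hat[(sum_i F(x_i, r_i))^2] for i.i.d. uniform marks.

    values[i][r] is F(x_i, r) for the i-th point of a fixed configuration
    and the r-th element of a finite mark space (rho uniform).
    """
    n_marks = len(values[0])
    total = Fraction(0)
    for marks in product(range(n_marks), repeat=len(values)):
        s = sum(values[i][r] for i, r in enumerate(marks))
        total += Fraction(s * s)
    return total / n_marks ** len(values)


def right_hand_side(values):
    """(sum_i mean_i)^2 - sum_i mean_i^2 + sum_i mean of squares, as in (9)."""
    n_marks = len(values[0])
    means = [Fraction(sum(row), n_marks) for row in values]
    mean_squares = [Fraction(sum(v * v for v in row), n_marks) for row in values]
    return sum(means) ** 2 - sum(m * m for m in means) + sum(mean_squares)


def center(values):
    """Subtract the rho-mean of each row, enforcing the centring condition (12)."""
    n_marks = len(values[0])
    return [[Fraction(v) - Fraction(sum(row), n_marks) for v in row]
            for row in values]


VALUES = [[1, 2, 4], [0, 3, 3], [5, -1, 2]]  # hypothetical F(x_i, r)
CENTERED = center(VALUES)
print(mark_expectation_squared(VALUES) == right_hand_side(VALUES))
print(mark_expectation_squared(CENTERED))
```

After centring, only the last term of (9) survives, so the second moment is a sum over particles with no cross terms between distinct points.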
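We now introduce the creation and annihilation operators $\varepsilon^+$ and $\varepsilon^-$ well known in quantum mechanics (see Meyer [23], Nualart and Vives [25], Picard [26], etc.) in the following way:
$$\forall x, w \in \Omega, \quad \varepsilon_x^+(w) = w\, 1_{\{x \in \operatorname{supp} w\}} + (w + \varepsilon_x)\, 1_{\{x \notin \operatorname{supp} w\}},$$
$$\forall x, w \in \Omega, \quad \varepsilon_x^-(w) = w\, 1_{\{x \notin \operatorname{supp} w\}} + (w - \varepsilon_x)\, 1_{\{x \in \operatorname{supp} w\}}.$$
One can verify that for all $w \in \Omega$,
$$\varepsilon_x^+(w) = w \quad \text{and} \quad \varepsilon_x^-(w) = w - \varepsilon_x \quad \text{for } N_w\text{-almost all } x \tag{14}$$
and
$$\varepsilon_x^+(w) = w + \varepsilon_x \quad \text{and} \quad \varepsilon_x^-(w) = w \quad \text{for } \nu\text{-almost all } x. \tag{15}$$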
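As a minimal sketch of these two operators, one can represent a finite configuration without multiple points as a `frozenset` of point labels. This representation is an assumption made for illustration; it is reasonable here because $\nu$ is diffuse, so configurations almost surely have no repeated points. The helper `increment` is likewise a hypothetical name for the difference $F(\varepsilon_x^+ w) - F(w)$.

```python
def creation(x, w):
    """epsilon^+_x: add the point x to the configuration w unless already present."""
    return w if x in w else w | {x}


def annihilation(x, w):
    """epsilon^-_x: remove the point x from w if present, else leave w unchanged."""
    return w - {x} if x in w else w


def increment(functional, x, w):
    """F(epsilon^+_x w) - F(w): change of F when the particle x is added to w."""
    return functional(creation(x, w)) - functional(w)


w = frozenset({1, 4})
print(creation(1, w), annihilation(1, w))   # x in supp w, as in (14)
print(creation(3, w), annihilation(3, w))   # x not in supp w, as in (15)
print(increment(lambda c: sum(p * p for p in c), 3, w))
```

For a linear functional $F(w) = \sum_{y \in w} f(y)$ the increment at a point outside the support is just $f(x)$, which is why adding a particle turns computations on the upper space into computations on the bottom space.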
We extend these operators to the functionals by setting:
$$\varepsilon^+ H(w, x) = H(\varepsilon_x^+ w, x) \quad \text{and} \quad \varepsilon^- H(w, x) = H(\varepsilon_x^- w, x).$$
The next lemma shows that the image of $\mathbb{P} \times \nu$ by $\varepsilon^+$ is nothing but $\mathbb{P}_N$, whose image by $\varepsilon^-$ is $\mathbb{P} \times \nu$:
Lemma 13. Let $H$ be $\mathcal{A} \otimes \mathcal{X}$-measurable and non-negative, then
$$\mathbb{E} \int \varepsilon^+ H\, d\nu = \mathbb{E} \int H\, dN \quad \text{and} \quad \mathbb{E} \int \varepsilon^- H\, dN = \mathbb{E} \int H\, d\nu.$$
Proof. Let us assume first that $H = e^{-N(f)} g$ where $f$ and $g$ are non-negative and belong to $L^1(\nu) \cap L^2(\nu)$. We have:
$$\mathbb{E} \int \varepsilon^+ H\, d\nu = \mathbb{E} \int e^{-N(f)} e^{-f(x)} g(x)\, d\nu(x),$$
and by standard calculations based on the properties of the Laplace functional we obtain that
$$\mathbb{E} \int e^{-N(f)} e^{-f(x)} g(x)\, d\nu(x) = \mathbb{E}\big[e^{-N(f)} N(g)\big] = \mathbb{E} \int H\, dN.$$
We conclude using a monotone class argument, and similarly for the second equation. □
Let us also remark that if $F \in L^2(\mathbb{P}_N \times \rho)$ satisfies $\int F\, d\rho = 0$ $\mathbb{P}_N$-a.e. then if we put $\varepsilon^+ F(w, x, r) = F(\varepsilon_x^+(w), x, r)$ we have
$$\int \varepsilon^+ F\, dN \odot \rho = \int F\, dN \odot \rho \quad \mathbb{P}\text{-a.e.} \tag{16}$$
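Indeed $\int (\varepsilon^+ F - F)^2\, dN\, d\rho = 0$ $\mathbb{P}$-a.e. because $\varepsilon_x^+(w) = w$ for $N_w$-almost all $x$.
Definition 14. For all $F \in \mathcal{D}_0$, we put
$$F^\sharp = \int \varepsilon^-\big((\varepsilon^+ F)^\flat\big)\, dN \odot \rho.$$
Thanks to hypothesis (13) we have the following representation of $F^\sharp$:
$$F^\sharp(w, \hat w) = \int_{X \times R} \varepsilon^-\Big(\big(F(\varepsilon_\cdot^+(w)) - F(w)\big)^\flat\Big)(x, r)\, N \odot \rho(dx\, dr).$$
Let us also remark that Definition 14 makes sense because for all $F \in \mathcal{D}_0$ and $\mathbb{P}$-almost all $w \in \Omega$, the map $y \mapsto F(\varepsilon_y^+(w)) - F(w)$ belongs to $\mathbf{d}$. To see this, take $F = e^{i \tilde N(f)}$ with $f \in \mathcal{D}(a) \cap L^1(\nu)$, then
$$F(\varepsilon_y^+(w)) - F(w) = e^{i \tilde N(f)} \big(e^{i f(y)} - 1\big),$$
and we know that $e^{if} - 1 \in \mathbf{d}$. We now proceed and obtain
$$\big(e^{i \tilde N(f)}\big)^\sharp = \int \varepsilon^-\Big(e^{i \tilde N(f)} \big(e^{if} - 1\big)^\flat\Big)\, dN \odot \rho = \int \varepsilon^-\Big(e^{i \tilde N(f) + if} (if)^\flat\Big)\, dN \odot \rho$$
and eventually
$$\big(e^{i \tilde N(f)}\big)^\sharp = \int e^{i \tilde N(f)} (if)^\flat\, dN \odot \rho.$$
So, if $F, G \in \mathcal{D}_0$, $F = \sum_p \lambda_p e^{i \tilde N(f_p)}$, $G = \sum_q \mu_q e^{i \tilde N(g_q)}$, as $\int f_p^\flat\, d\rho = \int g_q^\flat\, d\rho = 0$ and thanks to Corollary 12, we have
$$\hat{\mathbb{E}}\big[F^\sharp\, \overline{G^\sharp}\big] = \sum_{p,q} \lambda_p \overline{\mu_q}\, e^{i \tilde N(f_p - g_q)} \int (i f_p)^\flat\, \overline{(i g_q)^\flat}\, dN\, d\rho,$$
and so
$$\hat{\mathbb{E}}\big[F^\sharp\, \overline{G^\sharp}\big] = \sum_{p,q} \lambda_p \overline{\mu_q}\, e^{i \tilde N(f_p - g_q)}\, N(\gamma(f_p, g_q)). \tag{17}$$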
But, by Definition 14, it is clear that $F^\sharp$ does not depend on the representation of $F$ in $\mathcal{D}_0$, so as a consequence of the previous identity $\sum_{p,q} \lambda_p \overline{\mu_q}\, e^{i \tilde N(f_p - g_q)} N(\gamma(f_p, g_q))$ depends only on $F$ and $G$ and thanks to (6), we conclude that $A_0$ is well defined and is a linear operator from $\mathcal{D}_0$ into $L^2(\mathbb{P})$.
3.2.4. Upper structure and first properties
As a consequence of Proposition 10, it is clear that $A_0$ is symmetric and non-positive on $\mathcal{D}_0$; therefore (see Bouleau and Hirsch [10, p. 4]) it is closable and we can consider its Friedrichs extension $(A, \mathcal{D}(A))$ which generates a closed Hermitian form $\mathcal{E}$ with domain $\mathbb{D} \supset \mathcal{D}(A)$ such that
$$\forall U \in \mathcal{D}(A),\ \forall V \in \mathbb{D}, \quad \mathcal{E}(U, V) = -\mathbb{E}\big[A[U]\, \overline{V}\big].$$
Moreover, thanks to Proposition 10, it is clear that contractions operate, so (see Bouleau and Hirsch [10, Ex. 3.6, p. 16]) $(\mathbb{D}, \mathcal{E})$ is a local Dirichlet form which admits a carré du champ operator $\Gamma$. The upper structure that we have obtained, $(\Omega, \mathcal{A}, \mathbb{P}, \mathbb{D}, \Gamma)$, satisfies the following properties:
• $\forall f \in \mathbf{d}$, $\tilde N(f) \in \mathbb{D}$ and
$$\Gamma\big[\tilde N(f)\big] = N(\gamma[f]), \tag{18}$$
moreover the map $f \mapsto \tilde N(f)$ is an isometry from $\mathbf{d}$ into $\mathbb{D}$.
• $\forall f \in \mathcal{D}(a)$, $e^{i \tilde N(f)} \in \mathcal{D}(A)$, and
$$A\big[e^{i \tilde N(f)}\big] = e^{i \tilde N(f)} \Big(i \tilde N(a[f]) - \frac{1}{2} N(\gamma[f])\Big). \tag{19}$$
• The operator $\sharp$ (defined on $\mathcal{D}_0$) admits an extension on $\mathbb{D}$, still denoted $\sharp$; it is a gradient associated to $\Gamma$ and for all $f \in \mathbf{d}$:
$$\big(\tilde N(f)\big)^\sharp = \int_{X \times R} f^\flat\, dN \odot \rho. \tag{20}$$
As a gradient for the Dirichlet structure $(\Omega, \mathcal{A}, \mathbb{P}, \mathbb{D}, \Gamma)$, $\sharp$ is a closed operator from $L^2(\mathbb{P})$ into $L^2(\mathbb{P} \times \hat{\mathbb{P}})$. It satisfies the chain rule and operates on the functionals of the form $\Phi(\tilde N(f))$, $\Phi$ Lipschitz, $f \in \mathbf{d}$, or more generally $\Psi(\tilde N(f_1), \dots, \tilde N(f_n))$ with $\Psi$ Lipschitz and $C^1$ and $f_1, \dots, f_n$ in $\mathbf{d}$. Let us also remark that if $F$ belongs to $\mathcal{D}_0$,
$$A[F] = N\big(\varepsilon^-\big(a[\varepsilon^+ F]\big)\big). \tag{21}$$
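3.2.5. Link with the Fock space
The aim of this subsection is to make the link with other existing works and to present another approach based on the Fock space. It is independent of the rest of this article. Let $g \in \mathcal{D}(a) \cap L^1(\nu)$ be such that $-\frac{1}{2} \le g \le 0$ and $a[g] \in L^1(\nu)$. Clearly, $f = -\log(1 + g)$ is non-negative and belongs to $\mathbf{d}$. We have for all $v \in \mathbf{d} \cap L^1(\nu)$
$$\begin{aligned}
\mathcal{E}\big[e^{-N(f)}, e^{-N(v)}\big] &= \frac{1}{2}\, \mathbb{E}\big[e^{-N(f)} e^{-N(v)}\, \Gamma[N(f), N(v)]\big] \\
&= \frac{1}{2}\, \mathbb{E}\big[e^{-N(f)} e^{-N(v)}\, N(\gamma[f, v])\big] \\
&= \frac{1}{2}\, e^{-\int_X (1 - e^{-f - v})\, d\nu} \int_X \gamma[f, v]\, e^{-f - v}\, d\nu.
\end{aligned}$$
As a consequence of the functional calculus
$$\int_X \gamma[f, v]\, e^{-f - v}\, d\nu = \int_X \gamma\big[g, e^{-v}\big]\, d\nu = -2 \int_X a[g]\, e^{-v}\, d\nu,$$
this yields
$$\mathcal{E}\big[e^{-N(f)}, e^{-N(v)}\big] = -\mathbb{E}\Big[e^{-N(v)}\, e^{-N(f)}\, N\Big(\frac{a[g]}{1 + g}\Big)\Big]. \tag{22}$$
Thus by Lemma 8, we obtain
Proposition 15. Let $g \in \mathcal{D}(a) \cap L^1(\nu)$ be such that $-\frac{1}{2} \le g \le 0$ and $a[g] \in L^1(\nu)$, then
$$e^{N(\log(1 + g))} \in \mathcal{D}(A) \quad \text{and} \quad A\big[e^{N(\log(1 + g))}\big] = e^{N(\log(1 + g))}\, N\Big(\frac{a[g]}{1 + g}\Big). \tag{23}$$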
Let us recall that $(p_t)$ is the semi-group associated to the bottom structure. If $g$ satisfies the hypotheses of the previous proposition, $p_t g$ also satisfies them. The map $\Psi : t \mapsto e^{N(\log(1+p_t g))}$ is differentiable and $\frac{d\Psi}{dt} = A\Psi$ with $\Psi(0) = e^{N(\log(1+g))}$, hence $\Psi(t) = P_t\big[e^{N(\log(1+g))}\big]$ where $(P_t)$ is the strongly continuous semi-group generated by $A$. So, we have proved

Proposition 16. Let $g$ be a measurable function with $-\frac12 \le g \le 0$, then
$$\forall t \ge 0, \quad P_t\big[e^{N(\log(1+g))}\big] = e^{N(\log(1+p_t g))}.$$
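For any $m \in \mathbb{N}^*$, we denote by $L^2_{sym}(X^m, \mathcal{X}^{\otimes m}, \nu^{\times m})$ the set of symmetric functions in $L^2(X^m, \mathcal{X}^{\otimes m}, \nu^{\times m})$ and we recall that $\nu$ is diffuse. For all $F \in L^2_{sym}(X^m, \mathcal{X}^{\otimes m}, \nu^{\times m})$, we put
$$I_m(F) = \int_{X^m} F(x_1, \dots, x_m)\, 1_{\{\forall i \ne j,\ x_i \ne x_j\}}\, \tilde N(dx_1) \cdots \tilde N(dx_m).$$
One can easily verify that for all $n, m \in \mathbb{N}^*$, all $F \in L^2_{sym}(X^m, \mathcal{X}^{\otimes m}, \nu^{\times m})$ and all $G \in L^2_{sym}(X^n, \mathcal{X}^{\otimes n}, \nu^{\times n})$, $\mathbb{E}[I_m(F) I_n(G)] = 0$ if $n \ne m$, and for $F, G \in L^2_{sym}(X^n, \mathcal{X}^{\otimes n}, \nu^{\times n})$,
$$\mathbb{E}\big[I_n(F) I_n(G)\big] = n!\,\langle F, G\rangle_{L^2_{sym}(X^n, \mathcal{X}^{\otimes n}, \nu^{\times n})},$$
where $\langle\cdot,\cdot\rangle_{L^2_{sym}(X^n, \mathcal{X}^{\otimes n}, \nu^{\times n})}$ denotes the scalar product in $L^2_{sym}(X^n, \mathcal{X}^{\otimes n}, \nu^{\times n})$. For all $n \in \mathbb{N}^*$, we consider $C_n$, the Poisson chaos of order $n$, i.e. the sub-vector space of $L^2(\Omega, \mathcal{A}, \mathbb{P})$ generated by the variables $I_n(F)$, $F \in L^2_{sym}(X^n, \mathcal{X}^{\otimes n}, \nu^{\times n})$. The fact that
$$L^2(\Omega, \mathcal{A}, \mathbb{P}) = \mathbb{R} \oplus \bigoplus_{n=1}^{+\infty} C_n$$
has been proved by K. Ito (see [18]) in 1956. This proof is based on the fact that the set $\{N(E_1) \cdots N(E_k),\ (E_i) \text{ disjoint sets in } \mathcal{X}\}$ is total in $L^2(\Omega, \mathcal{A}, \mathbb{P})$. Another approach, quite natural, consists in studying carefully, for $g \in L^1 \cap L^\infty(\nu)$, what has to be subtracted from the integral with respect to the product measure
$$\int_{X^n} g(x_1) \cdots g(x_n)\, \tilde N(dx_1) \cdots \tilde N(dx_n)$$
to obtain the Poisson stochastic integral
$$I_n\big(g^{\otimes n}\big) = \int_{X^n} g(x_1) \cdots g(x_n)\, 1_{\{\forall i \ne j,\ x_i \ne x_j\}}\, \tilde N(dx_1) \cdots \tilde N(dx_n).$$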
This can be done in an elegant way by the use of lattices of partitions and the Möbius inversion formula (see Rota and Wallstrom [30]). This leads to the following formula (observe the tilde on the first $N$ only):
$$I_n\big(g^{\otimes n}\big) = \sum_{k=1}^{n} B_{n,k}\Big(\tilde N(g),\, -1!\,N(g^2),\, 2!\,N(g^3),\, \dots,\, (-1)^{n-k}(n-k)!\,N(g^{n-k+1})\Big),$$
where the $B_{n,k}$ are the exponential Bell polynomials given by
$$B_{n,k}(x_1, x_2, \dots) = \sum \frac{n!}{c_1!\,c_2! \cdots (1!)^{c_1} (2!)^{c_2} \cdots}\; x_1^{c_1} x_2^{c_2} \cdots,$$
the sum being taken over all the non-negative integers $c_1, c_2, \dots$ such that
$$c_1 + 2c_2 + 3c_3 + \cdots = n, \qquad c_1 + c_2 + \cdots = k.$$
$I_n(g^{\otimes n})$ is a homogeneous function of order $n$ with respect to $g$. If we express the Taylor expansion of $e^{N(\log(1+tg))}$ and compute the $n$th derivative with respect to $t$ thanks to the formula of the composed functions (see Comtet [11]) we obtain
$$e^{N(\log(1+tg)) - t\nu(g)} = 1 + \sum_{n=1}^{+\infty} \frac{t^n}{n!} \sum_{k=1}^{n} B_{n,k}\Big(\tilde N(g),\, -1!\,N(g^2),\, \dots,\, (-1)^{n-k}(n-k)!\,N(g^{n-k+1})\Big),$$
this yields
$$e^{N(\log(1+g)) - \nu(g)} = 1 + \sum_{n=1}^{+\infty} \frac{1}{n!}\, I_n\big(g^{\otimes n}\big). \tag{24}$$
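The generating-function identity behind (24) can be checked numerically on a finite point configuration. The following Python sketch is only an illustration under stated assumptions: the configuration `points`, the function `g` and the number `nu_g` standing for $\nu(g)$ are hypothetical toy inputs, and the sum $\sum_k B_{n,k}$ (the complete Bell polynomial) is computed by the standard recurrence rather than by the sum over $c_1, c_2, \dots$ given above.

```python
import math

def complete_bell(x, n_max):
    """Complete Bell polynomials B_0..B_n_max at x[0]=x_1, x[1]=x_2, ...

    Uses the standard recurrence
    B_{n+1} = sum_{i=0}^{n} C(n, i) * B_{n-i} * x_{i+1},  B_0 = 1,
    where B_n = sum_k B_{n,k} is the sum of the partial Bell polynomials.
    """
    bell = [1.0]
    for n in range(n_max):
        bell.append(sum(math.comb(n, i) * bell[n - i] * x[i]
                        for i in range(n + 1)))
    return bell

def chaos_series(points, g, nu_g, t=1.0, n_max=60):
    """Truncated series 1 + sum_n t^n/n! * sum_k B_{n,k}(...)."""
    # For a finite configuration, N(g^j) is a plain sum over the points.
    n_pow = [sum(g(p) ** j for p in points) for j in range(1, n_max + 1)]
    # Arguments: x_1 = N~(g) = N(g) - nu(g), x_j = (-1)^(j-1) (j-1)! N(g^j).
    x = [n_pow[0] - nu_g] + [
        (-1) ** (j - 1) * math.factorial(j - 1) * n_pow[j - 1]
        for j in range(2, n_max + 1)
    ]
    bell = complete_bell(x, n_max)
    return sum(t ** n / math.factorial(n) * bell[n] for n in range(n_max + 1))

def closed_form(points, g, nu_g, t=1.0):
    """Closed form exp(N(log(1 + t g)) - t nu(g))."""
    return math.exp(sum(math.log1p(t * g(p)) for p in points) - t * nu_g)

# Toy data (hypothetical): three points, g with values in [-1/2, 0].
points = [0.1, 0.4, 0.9]
g = lambda x: -0.5 * x
nu_g = -0.25  # stands for nu(g); any finite value works for this check
lhs = closed_form(points, g, nu_g)
rhs = chaos_series(points, g, nu_g)
```

With $|g| \le 1/2$ the series converges geometrically at $t = 1$, so truncating at 60 terms is more than enough for the two values to agree to rounding error.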
The density of the chaos is now a consequence of Lemma 8. Conversely, one can prove formula (24) thanks to the density of the chaos, see for instance Surgailis [32]. By transportation of structure, the density of the chaos has a short proof using stochastic calculus for the Poisson process on $\mathbb{R}_+$, cf. Dellacherie, Maisonneuve and Meyer [13, p. 207], see also Applebaum [3, Theorems 4.1 and 4.3].

3.3. Extension of the representation of the gradient and the lent particle method

3.3.1. Extension of the representation of the gradient

The goal of this subsection is to extend the formula of Definition 14 to any $F \in \mathbb{D}$. To this aim, we introduce an auxiliary vector space $\underline{\mathbb{D}}$ which is the completion of the algebraic tensor product $\mathcal{D}_0 \otimes \mathbf{d}$ with respect to the norm $\|\cdot\|_{\underline{\mathbb{D}}}$ which is defined as follows. Considering $\eta$, a fixed strictly positive function on $X$ such that $N(\eta)$ belongs to $L^2(\mathbb{P})$, we set for all $H \in \mathcal{D}_0 \otimes \mathbf{d}$:
$$\begin{aligned}
\|H\|_{\underline{\mathbb{D}}} &= \Big(\mathbb{E}\int_X \varepsilon^-\big(\gamma[H]\big)(w,x)\,N(dx)\Big)^{1/2} + \mathbb{E}\int_X \varepsilon^-\big(|H|\big)(w,x)\,\eta(x)\,N(dx) \\
&= \Big(\mathbb{E}\int_X \gamma[H](w,x)\,\nu(dx)\Big)^{1/2} + \mathbb{E}\int_X |H|(w,x)\,\eta(x)\,\nu(dx).
\end{aligned}$$
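One has to note that if $F \in \mathcal{D}_0$ then $\varepsilon^+F - F \in \mathcal{D}_0 \otimes \mathbf{d}$, and if $F = \sum_p \lambda_p e^{i\tilde N(f_p)}$, we have
$$\gamma[\varepsilon^+F - F] = \sum_{p,q} \lambda_p \bar\lambda_q\, e^{i\tilde N(f_p - f_q)}\, e^{i(f_p - f_q)}\, \gamma[f_p, f_q],$$
so that
$$\int_X \varepsilon^-\gamma[\varepsilon^+F - F]\,dN = \sum_{p,q} \lambda_p \bar\lambda_q\, e^{i\tilde N(f_p - f_q)} \int_X \gamma[f_p, f_q]\,dN;$$
by the construction of Proposition 10, this last term is nothing but $\Gamma[F]$. Thus, if $F \in \mathcal{D}_0$ then $\varepsilon^+F - F \in \underline{\mathbb{D}}$ and
$$\|\varepsilon^+F - F\|_{\underline{\mathbb{D}}} = \big(\mathbb{E}\,\Gamma[F]\big)^{1/2} + \mathbb{E}\int_X \varepsilon^-\big(|\varepsilon^+F - F|\big)\,\eta\,dN \le \big(2\mathcal{E}[F]\big)^{1/2} + 2\,\|F\|_{L^2(\mathbb{P})}\,\|N(\eta)\|_{L^2(\mathbb{P})}.$$
As a consequence, $\varepsilon^+ - I$ admits a unique extension on $\mathbb{D}$. It is a continuous linear map from $\mathbb{D}$ into $\underline{\mathbb{D}}$. Since by (13) $\gamma[\varepsilon^+F - F] = \gamma[\varepsilon^+F]$ and $(\varepsilon^+F - F)^\flat = (\varepsilon^+F)^\flat$, this leads to the following theorem:

Theorem 17. The formula
$$\forall F \in \mathbb{D}, \quad F^\sharp = \int_{X\times R} \varepsilon^-\big((\varepsilon^+F)^\flat\big)\, d(N\odot\rho), \tag{25}$$
is justified by the following decomposition:
$$F \in \mathbb{D} \;\xrightarrow{\ \varepsilon^+ - I\ }\; \varepsilon^+F - F \in \underline{\mathbb{D}} \;\xrightarrow{\ \varepsilon^-((\cdot)^\flat)\ }\; \varepsilon^-\big((\varepsilon^+F)^\flat\big) \in L^2_0(\mathbb{P}_N\times\rho) \;\xrightarrow{\ d(N\odot\rho)\ }\; F^\sharp \in L^2(\mathbb{P}\times\hat{\mathbb{P}})$$
where each operator is continuous on the range of the preceding one and where $L^2_0(\mathbb{P}_N\times\rho)$ is the closed set of elements $G$ in $L^2(\mathbb{P}_N\times\rho)$ such that $\int_R G\,d\rho = 0$ $\mathbb{P}_N$-a.e. Moreover, we have for all $F \in \mathbb{D}$
$$\Gamma[F] = \hat{\mathbb{E}}\big[(F^\sharp)^2\big] = \int_X \varepsilon^-\big(\gamma[\varepsilon^+F]\big)\,dN.$$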
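Proof. Let $H \in \underline{\mathbb{D}}$. There exists a sequence $(H_n)$ of elements in $\mathcal{D}_0 \otimes \mathbf{d}$ which converges to $H$ in $\underline{\mathbb{D}}$, and we have for all $n \in \mathbb{N}$
$$\int \big(\varepsilon^-(H_n^\flat)\big)^2\, d\mathbb{P}_N\, d\rho = \mathbb{E}\int \varepsilon^-\gamma[H_n]\,dN \le \|H_n\|^2_{\underline{\mathbb{D}}},$$
therefore $(\varepsilon^-(H_n^\flat))$ is a Cauchy sequence in $L^2_0(\mathbb{P}_N\times\rho)$, hence converges to an element in $L^2_0(\mathbb{P}_N\times\rho)$ that we denote by $\varepsilon^-(H^\flat)$. Moreover, if $K \in L^2_0(\mathbb{P}_N\times\rho)$, we have
$$\mathbb{E}\hat{\mathbb{E}}\Big[\Big(\int_{X\times R} K(w,x,r)\,N\odot\rho(dx\,dr)\Big)^2\Big] = \mathbb{E}\int_{X\times R} K^2\,dN\,d\rho = \|K\|^2_{L^2(\mathbb{P}_N\times\rho)}.$$
This provides the assertion of the statement. □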
The functional calculus for $\sharp$ and $\Gamma$ involves mutually singular measures and may be followed step by step. Let us first recall that by Lemma 13 the map $(w,x) \mapsto (\varepsilon^+_x(w), x)$ applied to classes of functions $\mathbb{P}_N$-a.e. yields classes of functions $\mathbb{P}\times\nu$-a.e., and also the map $(w,x) \mapsto (\varepsilon^-_x(w), x)$ applied to classes of functions $\mathbb{P}\times\nu$-a.e. yields classes of functions $\mathbb{P}_N$-a.e. But product functionals of the form $F(w,x) = G(w)g(x)$, where $G$ is a class $\mathbb{P}$-a.e. and $g$ a class $\nu$-a.e., belong necessarily to a single class $\mathbb{P}_N$-a.e. Hence, if we apply $\varepsilon^+$ to such a functional, this yields a unique class $\mathbb{P}\times\nu$-a.e. In particular with $F = e^{i\tilde N f} g$:
$$\varepsilon^+\big(e^{i\tilde N f} g\big) = e^{i\tilde N f}\, e^{if} g \qquad \mathbb{P}\times\nu\text{-a.e.}$$
From this class the operator $\varepsilon^-$ yields a class $\mathbb{P}_N$-a.e.
$$\varepsilon^-\big(e^{i\tilde N f}\, e^{if} g\big) = e^{i\tilde N f} g \qquad \mathbb{P}_N\text{-a.e.}$$
and this result is the same as $F$ $\mathbb{P}_N$-a.e. This applies to the case where $F$ depends only on $w$ and is defined $\mathbb{P}$-a.e.; then
$$\varepsilon^-(\varepsilon^+F) = F \qquad \mathbb{P}_N\text{-a.e.}$$
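Thus the functional calculus decomposes as follows:

Proposition 18. Let us consider the subset of $\mathbb{D}$ of functionals of the form $H = \Phi(F_1, \dots, F_n)$ with $\Phi \in C^1 \cap \mathrm{Lip}(\mathbb{R}^n)$ and $F_i \in \mathbb{D}$. Putting $F = (F_1, \dots, F_n)$ we have the following:

(a)
$$(\varepsilon^+H)^\flat = \sum_i \Phi_i'(\varepsilon^+F)\,(\varepsilon^+F_i)^\flat \qquad \mathbb{P}\times\nu\times\rho\text{-a.e.},$$
$$\gamma[\varepsilon^+H] = \sum_{ij} \Phi_i'(\varepsilon^+F)\,\Phi_j'(\varepsilon^+F)\,\gamma[\varepsilon^+F_i, \varepsilon^+F_j] \qquad \mathbb{P}\times\nu\text{-a.e.},$$

(b)
$$\varepsilon^-(\varepsilon^+H)^\flat = \sum_i \Phi_i'(F)\,\varepsilon^-(\varepsilon^+F_i)^\flat \qquad \mathbb{P}_N\times\rho\text{-a.e.},$$
$$\varepsilon^-\gamma[\varepsilon^+H] = \sum_{ij} \Phi_i'(F)\,\Phi_j'(F)\,\varepsilon^-\gamma[\varepsilon^+F_i, \varepsilon^+F_j] \qquad \mathbb{P}_N\text{-a.e.},$$

(c)
$$H^\sharp = \int \varepsilon^-(\varepsilon^+H)^\flat\,d(N\odot\rho) = \sum_i \Phi_i'(F) \int \varepsilon^-(\varepsilon^+F_i)^\flat\,d(N\odot\rho) \qquad \mathbb{P}\times\hat{\mathbb{P}}\text{-a.e.},$$
$$\Gamma[H] = \int \varepsilon^-\gamma[\varepsilon^+H]\,dN = \sum_{ij} \Phi_i'(F)\,\Phi_j'(F) \int \varepsilon^-\gamma[\varepsilon^+F_i, \varepsilon^+F_j]\,dN \qquad \mathbb{P}\text{-a.e.}$$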
Remark 2. The projection of the measure $\mathbb{P}_N$ on $\Omega$ is a (possibly non $\sigma$-finite) measure equivalent to $\mathbb{P}$ only if $\nu(X) = +\infty$, i.e. if $\mathbb{P}\{N(1) > 0\} = 1$. If $\nu(X) = \|\nu\| < +\infty$, then $\mathbb{P}\{N(1) = 0\} = e^{-\|\nu\|} > 0$, and the sufficient condition for existence of density, $\Gamma[F] > 0$ $\mathbb{P}$-a.s., is never fulfilled because $\Gamma[F] = \int \varepsilon^-(\gamma[\varepsilon^+F])\,dN$ vanishes on $\{N(1) = 0\}$. Conditioning arguments with respect to the set $\{N(1) > 0\}$ have to be used.

3.3.2. The lent particle method: first application

The preceding theorem provides a new method to study the regularity of Poisson functionals, that we present on an example. Let us consider, for instance, a real process $Y_t$ with independent increments and Lévy measure $\sigma$ integrating $x^2$, $Y_t$ being supposed centered without Gaussian part. We assume that $\sigma$ has a density satisfying Hamza's condition (Fukushima, Oshima and Takeda [16, p. 105]) so that a local Dirichlet structure may be constructed on $\mathbb{R}\setminus\{0\}$ with carré du champ $\gamma[f] = x^2 f'^2(x)$. We suppose also hypothesis (BC) (cf. Section 3.2.1). If $N$ is the random Poisson measure with intensity $dt\times\sigma$ we have $\int_0^t h(s)\,dY_s = \int 1_{[0,t]}(s)\,h(s)\,x\,\tilde N(ds\,dx)$, and the choice done for $\gamma$ gives $\Gamma[\int_0^t h(s)\,dY_s] = \int_0^t h^2(s)\,d[Y,Y]_s$ for $h \in L^2_{loc}(dt)$. In order to study the regularity of the random variable $V = \int_0^t \varphi(Y_{s-})\,dY_s$, where $\varphi$ is Lipschitz and $C^1$, we have two ways:

(a) We may represent the gradient $\sharp$ as $Y_t^\sharp = B_{[Y,Y]_t}$ where $B$ is a standard auxiliary independent Brownian motion. Then by the chain rule
$$V^\sharp = \int_0^t \varphi'(Y_{s-})\,(Y_{s-})^\sharp\,dY_s + \int_0^t \varphi(Y_{s-})\,dB_{[Y]_s};$$
now using $(Y_{s-})^\sharp = (Y_s^\sharp)_-$, a classical but rather tedious stochastic calculus yields
$$\Gamma[V] = \hat{\mathbb{E}}\big[(V^\sharp)^2\big] = \sum_{\alpha \le t} \Delta Y_\alpha^2 \Big(\int_{]\alpha}^t \varphi'(Y_{s-})\,dY_s + \varphi(Y_{\alpha-})\Big)^2, \tag{26}$$
where $\Delta Y_\alpha = Y_\alpha - Y_{\alpha-}$. Since $V$ has real values the energy image density property holds for $V$, and $V$ has a density as soon as $\Gamma[V]$ is strictly positive a.s., which may be discussed using the relation (26).

(b) Another, more direct, way consists in applying the theorem. For this we define $\flat$ by choosing $\xi$ such that $\int_0^1 \xi(r)\,dr = 0$ and $\int_0^1 \xi^2(r)\,dr = 1$ and putting $f^\flat = x f'(x)\,\xi(r)$.
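1°. First step. We add a particle $(\alpha, x)$, i.e. a jump to $Y$ at time $\alpha$ with size $x$, which gives
$$\varepsilon^+V - V = \varphi(Y_{\alpha-})\,x + \int_{]\alpha}^t \big(\varphi(Y_{s-} + x) - \varphi(Y_{s-})\big)\,dY_s.$$
2°. $V^\flat = 0$ since $V$ does not depend on $x$, and
$$(\varepsilon^+V)^\flat = \Big(\varphi(Y_{\alpha-})\,x + \int_{]\alpha}^t \varphi'(Y_{s-} + x)\,x\,dY_s\Big)\,\xi(r)$$
because $x^\flat = x\,\xi(r)$.

3°. We compute
$$\gamma[\varepsilon^+V] = \int \big((\varepsilon^+V)^\flat\big)^2\,dr = \Big(\varphi(Y_{\alpha-})\,x + \int_{]\alpha}^t \varphi'(Y_{s-} + x)\,x\,dY_s\Big)^2.$$
4°. We take back the particle we gave, in order to compute $\int \varepsilon^-\gamma[\varepsilon^+V]\,dN$. That gives
$$\int \varepsilon^-\gamma[\varepsilon^+V]\,dN = \int \Big(\varphi(Y_{\alpha-}) + \int_{]\alpha}^t \varphi'(Y_{s-})\,dY_s\Big)^2 x^2\,N(d\alpha\,dx).$$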
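The four steps can be mimicked on a toy path. The Python sketch below is an illustration only, under simplifying assumptions that are not those of the text: the path `jumps` has finitely many jumps and no compensating drift, so that $V = \sum_i \varphi(Y_{\alpha_i-})\,x_i$, and the derivative in the size of the lent particle is taken by a central finite difference. The function names are hypothetical. Lending each particle in turn, differentiating in its size and applying the weight $x^2$ should reproduce the right-hand side of (26).

```python
import math

def stochastic_integral(jumps, phi):
    """V = sum_i phi(Y_{alpha_i -}) * x_i for a pure-jump path Y.

    `jumps` lists the jump sizes x_i in chronological order; Y is the
    running sum of the jumps (toy assumption: finitely many jumps, no drift).
    """
    y_left, total = 0.0, 0.0
    for x in jumps:
        total += phi(y_left) * x
        y_left += x
    return total

def gamma_lent_particle(jumps, phi, h=1e-6):
    """Gamma[V] computed as sum_i x_i^2 * (dV/dx_i)^2.

    Each particle is lent in turn: its size is perturbed, V is differentiated
    numerically in that size, and the weight x_i^2 of the carre du champ
    gamma[f] = x^2 f'^2 is applied at the original size.
    """
    total = 0.0
    for i, x in enumerate(jumps):
        up = jumps[:i] + [x + h] + jumps[i + 1:]
        down = jumps[:i] + [x - h] + jumps[i + 1:]
        dv = (stochastic_integral(up, phi) - stochastic_integral(down, phi)) / (2 * h)
        total += x * x * dv * dv
    return total

def gamma_formula_26(jumps, phi, dphi):
    """Right-hand side of (26) for the same pure-jump path."""
    # Left limits Y_{alpha_i -} of the path at each jump time.
    y_left = [sum(jumps[:i]) for i in range(len(jumps))]
    total = 0.0
    for i, x in enumerate(jumps):
        # Integral of phi'(Y_{s-}) dY_s over ]alpha_i, t]: later jumps only.
        tail = sum(dphi(y_left[k]) * jumps[k] for k in range(i + 1, len(jumps)))
        total += x * x * (tail + phi(y_left[i])) ** 2
    return total

jumps = [0.3, -0.7, 1.1, 0.2]   # hypothetical jump sizes
phi, dphi = math.sin, math.cos  # a Lipschitz C^1 integrand and its derivative
lent = gamma_lent_particle(jumps, phi)
closed = gamma_formula_26(jumps, phi, dphi)
```

The two numbers `lent` and `closed` agree up to the finite-difference error, which reflects the locality obtained when the lent particle is taken back.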
This gives back (26). We remark that both operators $F \mapsto \varepsilon^+F$ and $F \mapsto (\varepsilon^+F)^\flat$ are non-local, but instead $F \mapsto \int \varepsilon^-(\varepsilon^+F)^\flat\,d(N\odot\rho)$ and $F \mapsto \int \varepsilon^-\gamma[\varepsilon^+F]\,dN$ are local: taking back the lent particle gives the locality. We will deepen this example in dimension $p$ in Section 5.

4. (EID) property on the upper space from (EID) property on the bottom space and the domain $\mathbb{D}_{loc}$

From now on, we make additional hypotheses on the bottom structure $(X, \mathcal{X}, \nu, \mathbf{d}, \gamma)$ which are stronger but satisfied in most of the examples.

Hypothesis (H1): $X$ admits a partition of the form $X = B \cup (\bigcup_{k=1}^{+\infty} A_k)$ where for all $k$, $A_k \in \mathcal{X}$ with $\nu(A_k) < +\infty$ and $\nu(B) = 0$, in such a way that for any $k \in \mathbb{N}^*$ may be defined a local Dirichlet structure with carré du champ:
$$S_k = (A_k, \mathcal{X}|_{A_k}, \nu|_{A_k}, \mathbf{d}_k, \gamma_k),$$
with
$$\forall f \in \mathbf{d}, \quad f|_{A_k} \in \mathbf{d}_k \quad\text{and}\quad \gamma[f]|_{A_k} = \gamma_k[f|_{A_k}].$$
Hypothesis (H2): Any finite product of structures $S_k$ satisfies (EID).

Remark 3. In many examples where $X$ is a topological space, (H1) is satisfied by choosing for $(A_k)$, $k \in \mathbb{N}^*$, a regular open set. Let us remark that (H2) is satisfied for the structures studied in Section 2.
The main result of this section is the following:

Proposition 19. If the bottom structure $(X, \mathcal{X}, \nu, \mathbf{d}, \gamma)$ satisfies (H1) and (H2) then the upper structure $(\Omega, \mathcal{A}, \mathbb{P}, \mathbb{D}, \Gamma)$ satisfies (EID).

Proof. For all $k \in \mathbb{N}^*$, since $\nu(A_k) < +\infty$, we consider an upper structure $S_k = (\Omega_k, \mathcal{A}_k, \mathbb{P}_k, \mathbb{D}_k, \Gamma_k)$ associated to $S_k$ as a direct application of the construction by product (see Section 3.3.2 above or Bouleau [7, Chap. VI.3]). Let $k \in \mathbb{N}^*$; we denote by $N_k$ the corresponding random Poisson measure on $A_k$ with intensity $\nu|_{A_k}$ and we consider $N^*$, the random Poisson measure on $X$ with intensity $\nu$, defined on the product probability space
$$(\Omega^*, \mathcal{A}^*, \mathbb{P}^*) = \prod_{k=1}^{+\infty} (\Omega_k, \mathcal{A}_k, \mathbb{P}_k),$$
by
$$N^* = \sum_{k=1}^{+\infty} N_k.$$
In a natural way, we consider the product Dirichlet structure
$$S^* = (\Omega^*, \mathcal{A}^*, \mathbb{P}^*, \mathbb{D}^*, \Gamma^*) = \prod_{k=1}^{+\infty} S_k.$$
In the third section, we have built, using the Friedrichs argument, the Dirichlet structure $S = (\Omega, \mathcal{A}, \mathbb{P}, \mathbb{D}, \Gamma)$; let us now make the link between those structures. First of all, thanks to Theorem 2.2.1 and Proposition 2.2.2 of Chapter V in Bouleau and Hirsch [10], we know that a function $\varphi$ in $L^2(\mathbb{P}^*)$ belongs to $\mathbb{D}^*$ if and only if:

1. For all $k \in \mathbb{N}^*$ and $\prod_{n \ne k} \mathbb{P}_n$-almost all $\xi_1, \dots, \xi_{k-1}, \xi_{k+1}, \dots$, the map
$$\xi \mapsto \varphi(\xi_1, \dots, \xi_{k-1}, \xi, \xi_{k+1}, \dots)$$
belongs to $\mathbb{D}_k$.

2. $\sum_k \Gamma_k[\varphi] \in L^1(\mathbb{P}^*)$, and we have $\Gamma^*[\varphi] = \sum_k \Gamma_k[\varphi]$.

Consider $f \in \mathbf{d} \cap L^1(\nu)$; then clearly
$$N(f) = \sum_k N_k(f|_{A_k})$$
belongs to $\mathbb{D}^*$ and in the same way
$$e^{i\tilde N(f)} = \prod_k e^{i\tilde N_k(f|_{A_k})} \in \mathbb{D}^*.$$
Moreover, by hypothesis (H1):
$$\Gamma^*\big[e^{i\tilde N(f)}\big] = \sum_k \Big|\prod_{l \ne k} e^{i\tilde N_l(f|_{A_l})}\Big|^2\, \Gamma_k\big[e^{i\tilde N_k(f|_{A_k})}\big] = \sum_k N_k\big(\gamma[f]|_{A_k}\big) = N(\gamma[f]) = \Gamma\big[e^{i\tilde N(f)}\big].$$
Thus as $\mathcal{D}_0$ is dense in $\mathbb{D}$, we conclude that $\mathbb{D} \subset \mathbb{D}^*$ and $\Gamma = \Gamma^*$ on $\mathbb{D}$. As for all $k$, $S_k$ is a product structure, thanks to hypothesis (H2) and Proposition 2.2.3 in Bouleau and Hirsch [10, Chapter V], we conclude that $S^*$ satisfies (EID), hence $S$ too. □

Main case. Let $N$ be a random Poisson measure on $\mathbb{R}^d$ with intensity measure $\nu$ satisfying one of the following conditions:

(i) $\nu = k\,dx$ and a function $\xi$ (the carré du champ coefficient matrix) may be chosen such that hypotheses (HG) hold (cf. Section 2.1),
(ii) $\nu$ is the image by a Lipschitz injective map of a measure satisfying (HG) on $\mathbb{R}^q$, $q \le d$,
(iii) $\nu$ is a product of measures like (ii),

then the associated Dirichlet structure $(\Omega, \mathcal{A}, \mathbb{P}, \mathbb{D}, \Gamma)$ constructed (cf. Section 3.2.4) with $\nu$ and the carré du champ obtained by the $\xi$ of (i) or induced by operations (ii) or (iii) satisfies (EID).

We end this section by a few remarks on the localization of this structure, which permits to extend the functional calculus related to $\Gamma$ or $\sharp$ to bigger spaces than $\mathbb{D}$, which is often convenient from a practical point of view. Following Bouleau and Hirsch (see [10, pp. 44–45]) we recall that $\mathbb{D}_{loc}$ denotes the set of functions $F : \Omega \to \mathbb{R}$ such that there exists a sequence $(E_n)_{n \in \mathbb{N}^*}$ in $\mathcal{A}$ such that
$$\Omega = \bigcup_n E_n \quad\text{and}\quad \forall n \in \mathbb{N}^*,\ \exists F_n \in \mathbb{D},\quad F_n = F \text{ on } E_n.$$
Moreover if $F \in \mathbb{D}_{loc}$, $\Gamma[F]$ is well defined and satisfies (EID) in the sense that $F_*(\Gamma[F] \cdot \mathbb{P}) \ll \lambda^1$. More generally, if $(\Omega, \mathcal{A}, \mathbb{P}, \mathbb{D}, \Gamma)$ satisfies (EID),
$$\forall F \in (\mathbb{D}_{loc})^n, \quad F_*\big(\det \Gamma[F] \cdot \mathbb{P}\big) \ll \lambda^n.$$
We can consider another space bigger than $\mathbb{D}_{loc}$ by considering a partition of $\Omega$ consisting in a sequence of sets with negligible boundary. More precisely, we denote by $\mathbb{D}_{LOC}$ the set of functions $F : \Omega \to \mathbb{R}$ such that there exists a sequence of disjoint sets $(A_n)_{n \in \mathbb{N}^*}$ in $\mathcal{A}$ such that $\mathbb{P}(\Omega \setminus \bigcup_n A_n) = 0$ and
$$\forall n \in \mathbb{N}^*,\ \exists F_n \in \mathbb{D},\quad F_n = F \text{ on } A_n.$$
One can easily verify that it contains the localized domain of any structure $S^*$ as considered in the proof of Proposition 19, that $\Gamma$ is well defined on $\mathbb{D}_{LOC}$, that the functional calculus related to $\Gamma$ or $\sharp$ remains valid and that it satisfies (EID), i.e. if $(\Omega, \mathcal{A}, \mathbb{P}, \mathbb{D}, \Gamma)$ satisfies (EID),
$$\forall F \in (\mathbb{D}_{LOC})^n, \quad F_*\big(\det \Gamma[F] \cdot \mathbb{P}\big) \ll \lambda^n.$$
5. Examples

5.1. Upper bound of a process on $[0, t]$

Let $Y$ be a real process with stationary independent increments satisfying the hypotheses of example 3.3.2. We consider a real càdlàg process $K$ independent of $Y$ and put $H_s = Y_s + K_s$.

Proposition 20. If $\sigma(\mathbb{R}\setminus\{0\}) = +\infty$ and if $\mathbb{P}[\sup_{s \le t} H_s = H_0] = 0$, the random variable $\sup_{s \le t} H_s$ possesses a density.

Proof. (a) We may suppose that $K$ satisfies $\sup_{s \le t} |K_s| \in L^2$. Indeed, if random variables $X_n$ have densities and $\mathbb{P}[X_n \ne X] \to 0$, then $X$ has a density. Hence the assertion is obtained by considering $(K_s \wedge k) \vee (-k)$.

(b) Let us put $M = \sup_{s \le t} H_s$. Applying the lent particle method gives
$$\begin{aligned}
(\varepsilon^+M)(\alpha, x) &= \sup_{s \le t}\big((Y_s + K_s)\,1_{\{s < \alpha\}} + (Y_s + x + K_s)\,1_{\{s \ge \alpha\}}\big) \\
&= \max\Big(\sup_{s < \alpha}(Y_s + K_s),\ \sup_{s \ge \alpha}(Y_s + x + K_s)\Big),
\end{aligned}$$
$$\gamma[\varepsilon^+M](\alpha, x) = 1_{\{\sup_{s \ge \alpha}(Y_s + x + K_s)\ \ge\ \sup_{s < \alpha}(Y_s + K_s)\}}\ \gamma[j](x)$$
where $j$ is the identity map $j(x) = x$. We take back the lent particle before integrating with respect to $N$ and obtain, since $\gamma[j](x) = x^2$,
$$\Gamma[M] = \int \varepsilon^-\gamma[\varepsilon^+M]\,N(d\alpha\,dx) = \sum_{\alpha \le t} \Delta Y_\alpha^2\ 1_{\{\sup_{s \ge \alpha}(Y_s + K_s)\ \ge\ \sup_{s < \alpha}(Y_s + K_s)\}}.$$
As $\sigma(\mathbb{R}\setminus\{0\}) = +\infty$, $Y$ has infinitely many jumps on every time interval, so that
$$\Gamma[M] = 0 \ \Rightarrow\ \forall \alpha \le t, \quad \sup_{s \ge \alpha}(Y_s + K_s) < \sup_{s < \alpha}(Y_s + K_s)$$
and choosing $\alpha$ decreasing to zero, we obtain
$$\Gamma[M] = 0 \ \Rightarrow\ \sup_{t \ge s \ge 0} H_s = H_0,$$
and the proposition follows. □
It follows that any real Lévy process $X$ starting at zero and immediately entering $\mathbb{R}^*_+$, whose Lévy measure dominates a measure $\sigma$ that satisfies Hamza's condition and is infinite, is such that $\sup_{s \le t} X_s$ has a density.

5.2. Regularizing properties of Lévy processes

Let $Y$ be again a real process with stationary independent increments satisfying the hypotheses of example 3.3.2. By Hamza's condition, hypothesis (H1) is fulfilled and hypothesis (H2) ensues from Theorem 2, so that the upper structure verifies (EID). Let $S$ be an $\mathbb{R}^p$-valued semi-martingale independent of $Y$. We will say that $S$ is pathwise $p$-dimensional on $[0, t]$ if almost every sample path of $S$ on $[0, t]$ spans a $p$-dimensional vector space. We consider the $\mathbb{R}^p$-valued process $Z$ whose components are given by
$$Z^1_t = S^1_t + Y_t \quad\text{and}\quad Z^i_t = S^i_t \quad \forall i \ge 2,$$
and the stochastic integral
$$R = \int_0^t \psi(Z_{s-})\,dZ_s$$
where $\psi$ is a Lipschitz and $C^1$ mapping from $\mathbb{R}^p$ into $\mathbb{R}^{p\times p}$.

Proposition 21. If $\sigma(\mathbb{R}\setminus\{0\}) = +\infty$, if the Jacobian determinant of the column vector $\psi_{\cdot 1}$ does not vanish and if $R$ is pathwise $p$-dimensional on $[0, t]$, then the law of $R$ is absolutely continuous with respect to $\lambda^p$.

Proof. We apply the lent particle method. Putting $\bar x = (x, 0, \dots, 0)$ and
$$R^i = \sum_j \int_0^t \psi_{ij}(Z_{s-})\,dZ^j_s,$$
we have
$$\varepsilon^+R^i - R^i = \psi_{i1}(Z_{\alpha-})\,x + \int_{]\alpha}^t \big(\psi_{i1}(Z_{s-} + \bar x) - \psi_{i1}(Z_{s-})\big)\,dY_s;$$
as in example 3.3.2,
$$(\varepsilon^+R^i)^\flat = \Big(\psi_{i1}(Z_{\alpha-})\,x + \int_{[\alpha}^t \partial_1\psi_{i1}(Z_{s-} + \bar x)\,x\,dY_s\Big)\,\xi(r)$$
and
$$\gamma[\varepsilon^+R^i, \varepsilon^+R^j] = \Big(\psi_{i1}(Z_{\alpha-}) + \int_{[\alpha}^t \partial_1\psi_{i1}(Z_{s-} + \bar x)\,dY_s\Big)\Big(\psi_{j1}(Z_{\alpha-}) + \int_{[\alpha}^t \partial_1\psi_{j1}(Z_{s-} + \bar x)\,dY_s\Big)\,x^2.$$
We take back the lent particle before integrating in $N$:
$$\Gamma[R^i, R^j] = \int \varepsilon^-\gamma[\varepsilon^+R^i, \varepsilon^+R^j]\,dN = \sum_{\alpha \le t} \Delta Y_\alpha^2\,\big(U_\alpha U_\alpha^t\big)_{ij}$$
where $U_\alpha$ is the column vector $\psi_{\cdot 1}(Z_{\alpha-}) + \int_{[\alpha}^t \partial_1\psi_{\cdot 1}(Z_{s-})\,dY_s$. Let $JT$ be the set of jump times of $Y$ on $[0, t]$; we conclude that
$$\det \Gamma[R, R^t] = 0 \ \Leftrightarrow\ \dim \mathcal{L}(U_\alpha;\ \alpha \in JT) < p.$$
Let $A = \{\omega : \dim \mathcal{L}(U_\alpha;\ \alpha \in JT) < p\}$. Reasoning on $A$, there exist $\lambda_1, \dots, \lambda_p$ such that
$$\sum_{k=1}^{p} \lambda_k \Big(\psi_{k1}(Z_{\alpha-}) + \int_{[\alpha}^t \partial_1\psi_{k1}(Z_{s-})\,dY_s\Big) = 0 \quad \forall \alpha \in JT; \tag{27}$$
now, since $\sigma(\mathbb{R}\setminus\{0\}) = +\infty$, $JT$ is a dense countable subset of $[0, t]$, so that taking left limits in (27), using (27) anew and the fact that $\psi$ is $C^1$, we obtain
$$\sum_{k=1}^{p} \lambda_k\,\psi_{k1}(Z_{\alpha-}) = 0 \quad \forall \alpha \in JT, \text{ hence } \forall \alpha \in\ ]0, t],$$
thus, on $A$, we have $\dim \mathcal{L}(\psi_{\cdot 1}(Z_{s-});\ s \in\ ]0, t]) < p$. Then the (EID) property yields the conclusion. □

The lent particle method and the (EID) property may be applied to density results for solutions of stochastic differential equations driven by Lévy processes or random measures under Lipschitz hypotheses. Let us mention also that the gradient defined in Section 3.2 has the property of being easily iterated; this allows one to obtain conclusions on $C^\infty$-regularity in the case of smooth coefficients. These applications will be investigated in forthcoming articles.
5.3. A regular case violating Hörmander conditions

In spite of the difficulty of the proofs, applying the method is quite easy. This will be pushed forward in another article; we are just showing here an extremely simple case, an example of situations rarely taken into account in the literature.

(a) Let us consider the following SDE driven by a two-dimensional Brownian motion
$$\begin{cases}
X^1_t = z_1 + \displaystyle\int_0^t dB^1_s, \\[2mm]
X^2_t = z_2 + \displaystyle\int_0^t 2X^1_s\,dB^1_s + \int_0^t dB^2_s, \\[2mm]
X^3_t = z_3 + \displaystyle\int_0^t X^1_s\,dB^1_s + 2\int_0^t dB^2_s.
\end{cases} \tag{28}$$
This diffusion is degenerate and the Hörmander conditions are not fulfilled. The generator is $A = \frac12(U_1^2 + U_2^2) + V$ and its adjoint $A^* = \frac12(U_1^2 + U_2^2) - V$ with $U_1 = \frac{\partial}{\partial x_1} + 2x_1\frac{\partial}{\partial x_2} + x_1\frac{\partial}{\partial x_3}$, $U_2 = \frac{\partial}{\partial x_2} + 2\frac{\partial}{\partial x_3}$ and $V = -\frac{\partial}{\partial x_2} - \frac12\frac{\partial}{\partial x_3}$. The Lie brackets of these vectors vanish and the Lie algebra is of dimension 2: the diffusion remains on the quadric of equation $\frac34 x_1^2 - x_2 + \frac12 x_3 - \frac34 t = C$.

(b) Let us now consider the same equation driven by a Lévy process:
$$\begin{cases}
Z^1_t = z_1 + \displaystyle\int_0^t dY^1_s, \\[2mm]
Z^2_t = z_2 + \displaystyle\int_0^t 2Z^1_{s-}\,dY^1_s + \int_0^t dY^2_s, \\[2mm]
Z^3_t = z_3 + \displaystyle\int_0^t Z^1_{s-}\,dY^1_s + 2\int_0^t dY^2_s
\end{cases}$$
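under hypotheses on the Lévy measure such that the bottom space may be equipped with the carré du champ operator $\gamma[f] = y_1^2 f_1'^2 + y_2^2 f_2'^2$ satisfying (BC) and our hypotheses yielding (EID). Let us apply the lent particle method. For $\alpha \le t$,
$$\varepsilon^+_{(\alpha, y_1, y_2)} Z_t = Z_t + \begin{pmatrix} y_1 \\ 2Y^1_{\alpha-}\,y_1 + 2\int_{]\alpha}^t y_1\,dY^1_s + y_2 \\ Y^1_{\alpha-}\,y_1 + \int_{]\alpha}^t y_1\,dY^1_s + 2y_2 \end{pmatrix} = Z_t + \begin{pmatrix} y_1 \\ 2y_1 Y^1_t + y_2 \\ y_1 Y^1_t + 2y_2 \end{pmatrix},$$
where we have used $Y^1_{\alpha-} = Y^1_\alpha$ because $\varepsilon^+$ sends into $\mathbb{P}\times\nu$ classes. That gives (the entries marked "id" follow by symmetry)
$$\gamma[\varepsilon^+Z_t] = \begin{pmatrix}
y_1^2 & y_1^2\,2Y^1_t & y_1^2\,Y^1_t \\
\mathrm{id} & y_1^2\,4(Y^1_t)^2 + y_2^2 & y_1^2\,2(Y^1_t)^2 + 2y_2^2 \\
\mathrm{id} & \mathrm{id} & y_1^2\,(Y^1_t)^2 + 4y_2^2
\end{pmatrix}$$
and
$$\varepsilon^-\gamma[\varepsilon^+Z_t] = \begin{pmatrix}
y_1^2 & y_1^2\,2(Y^1_t - \Delta Y^1_\alpha) & y_1^2\,(Y^1_t - \Delta Y^1_\alpha) \\
\mathrm{id} & y_1^2\,4(Y^1_t - \Delta Y^1_\alpha)^2 + y_2^2 & y_1^2\,2(Y^1_t - \Delta Y^1_\alpha)^2 + 2y_2^2 \\
\mathrm{id} & \mathrm{id} & y_1^2\,(Y^1_t - \Delta Y^1_\alpha)^2 + 4y_2^2
\end{pmatrix},$$
hence
$$\Gamma[Z_t] = \sum_{\alpha \le t} \big(\Delta Y^1_\alpha\big)^2 \begin{pmatrix}
1 & 2(Y^1_t - \Delta Y^1_\alpha) & (Y^1_t - \Delta Y^1_\alpha) \\
\mathrm{id} & 4(Y^1_t - \Delta Y^1_\alpha)^2 & 2(Y^1_t - \Delta Y^1_\alpha)^2 \\
\mathrm{id} & \mathrm{id} & (Y^1_t - \Delta Y^1_\alpha)^2
\end{pmatrix} + \big(\Delta Y^2_\alpha\big)^2 \begin{pmatrix} 0 & 0 & 0 \\ 0 & 1 & 2 \\ 0 & 2 & 4 \end{pmatrix}.$$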
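The matrix $\Gamma[Z_t]$ can be assembled by hand for a toy jump configuration, which makes the role of jumps of different sizes visible. The Python sketch below is illustrative only: it assumes finitely many jumps, takes $Y^1_t$ to be the plain sum of the jumps of $Y^1$, and uses hypothetical helper names. Each jump of $Y^1$ contributes $(\Delta Y^1_\alpha)^2\,u u^t$ with $u = (1, 2s, s)$, $s = Y^1_t - \Delta Y^1_\alpha$, and each jump of $Y^2$ contributes $(\Delta Y^2_\alpha)^2\,w w^t$ with $w = (0, 1, 2)$.

```python
def outer_scaled(v, c):
    """Return the 3x3 matrix c * v v^T as nested lists."""
    return [[c * a * b for b in v] for a in v]

def add(m1, m2):
    """Entrywise sum of two matrices given as nested lists."""
    return [[a + b for a, b in zip(r1, r2)] for r1, r2 in zip(m1, m2)]

def det3(m):
    """Determinant of a 3x3 matrix by cofactor expansion."""
    (a, b, c), (d, e, f), (g, h, i) = m
    return a * (e * i - f * h) - b * (d * i - f * g) + c * (d * h - e * g)

def gamma_z(jumps_y1, jumps_y2):
    """Assemble Gamma[Z_t] from the jump sizes of Y^1 and Y^2 on [0, t]."""
    y1_t = sum(jumps_y1)  # toy assumption: Y^1_t is the sum of its jumps
    gamma = [[0.0] * 3 for _ in range(3)]
    for dy in jumps_y1:
        s = y1_t - dy  # Y^1_t with the jump at alpha taken back
        gamma = add(gamma, outer_scaled([1.0, 2.0 * s, s], dy * dy))
    for dy in jumps_y2:
        gamma = add(gamma, outer_scaled([0.0, 1.0, 2.0], dy * dy))
    return gamma

det_distinct = det3(gamma_z([1.0, 2.0], [0.5]))  # two different jump sizes
det_equal = det3(gamma_z([1.5, 1.5], [0.5]))     # two equal jump sizes
```

In this toy setting the determinant is nonzero when $Y^1$ has two jumps of different sizes, and it vanishes when the two jumps have the same size, since all vectors $u$ then coincide and the rank drops to 2.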
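If the Lévy measures of $Y^1$ and $Y^2$ are infinite, it follows that $Z_t$ has a density as soon as
$$\dim \mathcal{L}\left\{ \begin{pmatrix} 1 \\ 2(Y^1_t - \Delta Y^1_\alpha) \\ (Y^1_t - \Delta Y^1_\alpha) \end{pmatrix},\ \begin{pmatrix} 0 \\ 1 \\ 2 \end{pmatrix};\ \alpha \in JT \right\} = 3.$$
But $Y^1$ possesses necessarily jumps of different sizes, hence $Z_t$ has a density on $\mathbb{R}^3$. It follows that the integro-differential operator
$$\tilde A f(z) = \int \left[ f\begin{pmatrix} z_1 + y_1 \\ z_2 + 2z_1 y_1 + y_2 \\ z_3 + z_1 y_1 + 2y_2 \end{pmatrix} - f(z) - \big(f_1'(z)\ \ f_2'(z)\ \ f_3'(z)\big) \begin{pmatrix} y_1 \\ 2z_1 y_1 + y_2 \\ z_1 y_1 + 2y_2 \end{pmatrix} \right] \sigma(dy_1\,dy_2)$$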
is hypoelliptic at order zero, in the sense that its semi-group Pt has a density. No minoration is supposed of the growth of the Lévy measure near 0 as assumed by many authors. This result implies that for any Lévy process Y satisfying the above hypotheses, even a subordinated one in the sense of Bochner, the process Z is never strictly subordinated of the Markov process X solution of Eq. (28). References [1] S. Albeverio, Y. Kondratiev, M. Röckner, Differential geometry of Poisson space, C. R. Acad. Sci. Paris 323 (1996) 1129–1134; S. Albeverio, Y. Kondratiev, M. Röckner, Analysis and geometry on configuration spaces, J. Funct. Anal. 154 (1998) 444–500. [2] S. Albeverio, M. Röckner, Classical Dirichlet forms on topological vector spaces—closability and a Cameron– Martin formula, J. Funct. Anal. 88 (1990) 395–436. [3] D. Applebaum, Universal Malliavin calculus in Fock space and Lévy–Itô space, preprint, 2008. [4] K. Bichteler, J.-B. Gravereaux, J. Jacod, Malliavin Calculus for Processes with Jumps, Gordon and Breach, 1987. [5] N. Bouleau, Décomposition de l’énergie par niveau de potentiel, Lecture Notes in Math., vol. 1096, Springer, 1984. [6] N. Bouleau, Construction of Dirichlet structures, in: J. Král, J. Luke˘s, I. Netuka, J. Veselý (Eds.), Potential Theory – ICPT94, de Gruyter, 1996. [7] N. Bouleau, Error Calculus for Finance and Physics, the Language of Dirichlet Forms, de Gruyter, 2003. [8] N. Bouleau, Error calculus and regularity of Poisson functionals: The lent particle method, C. R. Math. Acad. Sci. Paris 346 (13–14) (2008) 779–782.
[9] N. Bouleau, F. Hirsch, Formes de Dirichlet générales et densité des variables aléatoires réelles sur l’espace de Wiener, J. Funct. Anal. 69 (2) (1986) 229–259. [10] N. Bouleau, F. Hirsch, Dirichlet Forms and Analysis on Wiener Space, de Gruyter, 1991. [11] L. Comtet, Advanced Combinatorics, Springer, 1974. [12] A. Coquio, Formes de Dirichlet sur l’espace canonique de Poisson et application aux équations différentielles stochastiques, Ann. Inst. H. Poincaré 19 (1) (1993) 1–36. [13] C. Dellacherie, B. Maisonneuve, P.-A. Meyer, Probabilités et Potentiel, Chap. XVII à XXIV, Hermann, 1992. [14] L. Denis, A criterion of density for solutions of Poisson-driven SDEs, Probab. Theory Related Fields 118 (2000) 406–426. [15] H. Federer, Geometric Measure Theory, Springer, 1969. [16] M. Fukushima, Y. Oshima, M. Takeda, Dirichlet Forms and Symmetric Markov Processes, de Gruyter, 1994. [17] Y. Ishikawa, H. Kunita, Malliavin calculus on the Wiener–Poisson space and its application to canonical SDE with jumps, Stochastic Process. Appl. 116 (2006) 1743–1769. [18] K. Ito, Spectral type of the shift transformation of differential processes with stationary increments, Trans. Amer. Math. Soc. 81 (1956) 253–263. [19] Y. Kondratiev, E. Lytvynov, M. Röckner, The semigroup of the Glauber dynamics of a continuous system of free particles, arXiv:math/0407359v1, 2004. [20] A. Martin-Löf, Limit theorems for the motion of a Poisson system of independent Markovian particles with high density, Z. Wahrscheinlichkeitstheorie Verw. Gebiete 34 (1976) 205–223. [21] Z.-M. Ma, M. Röckner, Introduction to the Theory of (Non-symmetric) Dirichlet Forms, Springer, 1992. [22] Z.-M. Ma, M. Röckner, Construction of diffusion on configuration spaces, Osaka J. Math. 37 (2000) 273–314. [23] P.-A. Meyer, Eléments de probabilités quantiques, in: Sém. Prob. XX, in: Lecture Notes in Math., vol. 1204, Springer, 1986. [24] J. Neveu, Processus Ponctuels, Lecture Notes in Math., vol. 598, Springer, 1977. [25] D. 
Nualart, J. Vives, Anticipative calculus for the Poisson process based on the Fock space, in: Sém. Prob. XXIV, in: Lecture Notes in Math., vol. 1426, Springer, 1990. [26] J. Picard, On the existence of smooth densities for jump processes, Probab. Theory Related Fields 105 (1996) 481–511. [27] J. Picard, Brownian excursions, stochastic integrals and representation of Wiener functionals, Electron. J. Probab. 11 (2006) 199–248. [28] N. Privault, A pointwise equivalence of gradients on configuration spaces, C. R. Acad. Sci. Paris 327 (7) (1998) 677–682. [29] M. Röckner, N. Wielens, Dirichlet forms—Closability and change of speed measure, in: S. Albeverio (Ed.), Infinite Dimensional Analysis and Stochastic Processes, in: Res. Notes Math., vol. 124, Pitman, 1985, pp. 119–144. [30] G.-C. Rota, T. Wallstrom, Stochastic integrals: A combinatorial approach, Ann. Probab. 25 (3) (1997) 1257–1283. [31] Sh. Song, Admissible vectors and their associated Dirichlet forms, Potential Anal. 1 (4) (1992) 319–336. [32] D. Surgailis, On multiple Poisson stochastic integrals and associated Markov processes, Probab. Math. Statist. 3 (2) (1984) 217–239. [33] L. Wu, Construction de l’opérateur de Malliavin sur l’espace de Poisson, in: Sém. Probabilité XXI, in: Lecture Notes in Math., vol. 1247, Springer, 1987.
Journal of Functional Analysis 257 (2009) 1175–1221 www.elsevier.com/locate/jfa
Mass transportation proofs of free functional inequalities, and free Poincaré inequalities Michel Ledoux a , Ionel Popescu b,c,∗ a Institut de Mathématiques de Toulouse, Université de Toulouse, F-31062 Toulouse, France b Georgia Institute of Technology, 686 Cherry Street, Atlanta, GA 30332, United States c IMAR 21, Calea Grivitei Street 010702-Bucharest, Sector 1, Romania
Received 13 February 2009; accepted 17 March 2009 Available online 31 March 2009 Communicated by D. Stroock
Abstract This work is devoted to direct mass transportation proofs of families of functional inequalities in the context of one-dimensional free probability, avoiding random matrix approximation. The inequalities include the free form of the transportation, Log-Sobolev, HWI interpolation and Brunn–Minkowski inequalities for strictly convex potentials. Sharp constants and some extended versions are put forward. The paper also addresses two versions of free Poincaré inequalities and their interpretation in terms of spectral properties of Jacobi operators. The last part establishes the corresponding inequalities for measures on R+ with the reference example of the Marcenko–Pastur distribution. © 2009 Elsevier Inc. All rights reserved. Keywords: Functional inequalities; Mass transport; Spectral gap; Random matrices
1. Introduction A distinguished role in the world of functional inequalities is played by the logarithmic Sobolev (Log-Sobolev) inequality and the Talagrand or transportation cost inequality. There is an extensive literature dedicated to these inequalities in the classical setting of Euclidean and Riemannian spaces (cf. e.g. [2,23,29,32]). * Corresponding author at: Georgia Institute of Technology, 686 Cherry Street, Atlanta, GA 30332, United States.
E-mail address: [email protected] (I. Popescu). 0022-1236/$ – see front matter © 2009 Elsevier Inc. All rights reserved. doi:10.1016/j.jfa.2009.03.011
M. Ledoux, I. Popescu / Journal of Functional Analysis 257 (2009) 1175–1221
Given a probability measure $\nu$ on $\mathbb{R}^d$, the transportation cost inequality states that for some $\rho > 0$ and any other probability measure $\mu$ on $\mathbb{R}^d$,
$$\rho\,W_2^2(\mu, \nu) \le E(\mu \mid \nu). \tag{$T(\rho)$}$$
Here $W_2(\mu, \nu)$ is the Wasserstein distance between $\mu$ and $\nu$ of finite second moment defined by
$$W_2(\mu, \nu) = \inf_{\pi \in \Pi(\mu, \nu)} \Big(\iint |x - y|^2\,\pi(dx, dy)\Big)^{1/2}$$
with $\Pi(\mu, \nu)$ denoting the set of probability measures on $\mathbb{R}^{2d}$ with marginals $\mu$ and $\nu$, and
$$E(\mu \mid \nu) = \int \log\frac{d\mu}{d\nu}\,d\mu$$
is the relative entropy of $\mu$ with respect to $\nu$ if $\mu \ll \nu$ and $+\infty$ otherwise. The Log-Sobolev inequality is that for any $\mu$
$$E(\mu \mid \nu) \le \frac{1}{2\rho}\,I(\mu \mid \nu) \tag{$LSI(\rho)$}$$
where
$$I(\mu \mid \nu) = \int \Big|\nabla \log\frac{d\mu}{d\nu}\Big|^2\,d\mu$$
is the Fisher information of $\mu$ with respect to $\nu$, which is defined in the case $\mu \ll \nu$ with $\frac{d\mu}{d\nu}$ being differentiable. A more subtle inequality is the HWI inequality relating entropy (notice that $E(\mu \mid \nu)$ is $H(\mu \mid \nu)$ in [25], which explains the H), Wasserstein distance W, and Fisher information I:
$$E(\mu \mid \nu) \le \sqrt{I(\mu \mid \nu)}\;W_2(\mu, \nu) - \frac{\rho}{2}\,W_2^2(\mu, \nu). \tag{$HWI(\rho)$}$$
Poincaré's inequality in this classical context is that for any compactly supported and smooth function $\psi$ on $\mathbb{R}^d$,
$$\rho\,\mathrm{Var}_\mu(\psi) \le \int |\nabla\psi|^2\,\mu(dx) \tag{$P(\rho)$}$$
where $\mathrm{Var}_\mu(\psi) = \int \psi^2(x)\,\mu(dx) - (\int \psi(x)\,\mu(dx))^2$ is the variance of $\psi$ with respect to $\mu$.

Starting with Gaussian measures [14,28], these inequalities were established for measures on $\mathbb{R}^d$ with strictly convex potentials by the Bakry–Émery criterion [2,23,29,32]. More precisely, if $\nu(dx) = e^{-V(x)}\,dx$, with $V(x) - \rho|x|^2$ convex on $\mathbb{R}^d$ for some $\rho > 0$, both $T(\rho)$ and $LSI(\rho)$ hold true. Otto and Villani generated interest in this topic through their remarkable paper [25], in which they showed that the logarithmic Sobolev inequality implies the transportation inequality, in a rather general setting. This connection was actually put further through the stronger $HWI(\rho)$ inequality, which was shown in [25] to be valid in the case $V(x) - \rho|x|^2$ is convex for some $\rho \in \mathbb{R}$. When $\rho > 0$, $LSI(\rho)$ is a consequence of $HWI(\rho)$. Subsequently the main result from [25] was simplified and extended, for example in [5] and recently in [13], to mention only two sources.
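The three inequalities can be checked in closed form in the simplest Gaussian case. The Python sketch below is an illustration under explicit assumptions: dimension one, reference measure $\nu = N(0, 1)$, so that $V(x) = x^2/2$ up to a constant and $V(x) - \rho x^2$ is convex for $\rho = 1/2$, and test measures $\mu = N(m, s^2)$. The standard closed forms $W_2^2 = m^2 + (s-1)^2$, $E(\mu \mid \nu) = \frac12(s^2 + m^2 - 1) - \log s$ and $I(\mu \mid \nu) = m^2 + (s - 1/s)^2$ are used; the function names are hypothetical.

```python
import math

RHO = 0.5  # V(x) = x^2/2, so V(x) - RHO * x^2 = 0 is convex

def w2_squared(m, s):
    """Squared Wasserstein distance between N(m, s^2) and N(0, 1)."""
    return m * m + (s - 1.0) ** 2

def entropy(m, s):
    """Relative entropy E(mu | nu) of mu = N(m, s^2) w.r.t. nu = N(0, 1)."""
    return 0.5 * (s * s + m * m - 1.0) - math.log(s)

def fisher(m, s):
    """Fisher information I(mu | nu) = int |d/dx log(dmu/dnu)|^2 dmu."""
    return m * m + (s - 1.0 / s) ** 2

def check(m, s, tol=1e-12):
    """Return whether T(rho), LSI(rho) and HWI(rho) hold for N(m, s^2)."""
    w2sq, ent, fis = w2_squared(m, s), entropy(m, s), fisher(m, s)
    t_ok = RHO * w2sq <= ent + tol
    lsi_ok = ent <= fis / (2.0 * RHO) + tol
    hwi_ok = ent <= math.sqrt(fis * w2sq) - 0.5 * RHO * w2sq + tol
    return t_ok, lsi_ok, hwi_ok

results = [check(m, s) for m in (-2.0, 0.0, 0.5, 3.0) for s in (0.2, 1.0, 1.7, 4.0)]
```

For pure translations ($s = 1$) the transportation inequality becomes an equality, $\rho W_2^2 = E(\mu \mid \nu) = m^2/2$, which is why a small tolerance is used in the comparisons.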
Another interesting connection in these families of functional inequalities is that any of $T(\rho)$, $LSI(\rho)$ or $HWI(\rho)$ imply the Poincaré inequality $P(\rho)$.

The work [25] by Otto and Villani introduced in a powerful way the use of mass transportation ideas in the context of functional inequalities. Starting from this, Cordero-Erausquin used in [9] direct convexity arguments combined with mass transport methods to reprove the Log-Sobolev, transportation and HWI inequalities for measures with strictly convex potentials. The strategy goes back to the original approach of [28] to the transportation inequality (see also [4]).

In the world of free probability, as it was shown by Ben Arous and Guionnet in [1], one can realize the free entropy as the rate function of the large deviations for the distribution of eigenvalues of some $n \times n$ complex random matrix ensembles (see also [19]). To be a little more specific, let $V : \mathbb{R} \to \mathbb{R}$ be a nice function with enough growth at infinity and define the probability distribution
$$P_n(dM) = \frac{1}{Z_n}\,e^{-n \mathrm{Tr}_n(V(M))}\,dM$$
on the set $\mathcal{H}_n$ of complex Hermitian $n \times n$ matrices, where $dM$ is the Lebesgue measure on $\mathcal{H}_n$. For a matrix $M$, let $\mu_n(M) = \frac1n \sum_{k=1}^n \delta_{\lambda_k(M)}$ be the distribution of eigenvalues of $M$. These are random variables with values in $\mathcal{P}(\mathbb{R})$, the set of probability measures on $\mathbb{R}$, which converge almost surely to a non-random measure $\mu_V$ on $\mathbb{R}$. For a measure $\mu$ on $\mathbb{R}$, its logarithmic energy with external field $V$ is defined by
$$E(\mu) = \int V(x)\,\mu(dx) - \iint \log|x - y|\,\mu(dx)\,\mu(dy).$$
The minimizer of $E(\mu)$ over all probability measures on $\mathbb{R}$ is exactly the measure $\mu_V$. From [1] we learned that the distributions of $\{\mu_n\}_{n \ge 1}$ under $P_n$ satisfy a large deviations principle with scaling $n^2$ and rate function given by $R(\mu) = E(\mu) - E(\mu_V)$. The example of the quadratic potential $V(x) = x^2$ defining the paradigmatic Gaussian Unitary Ensemble in random matrix theory gives rise to the celebrated semicircular law as equilibrium measure.

Within this random matrix framework, if $V(x) - \rho x^2$ is smooth and convex for some $\rho > 0$, then the function $\Phi(M) = \mathrm{Tr}_n(V(M))$ is strongly convex ($\Phi(M) - n\rho|M|^2$ is convex) on $\mathbb{R}^{n^2} = \mathcal{H}_n$. An application of the classical $LSI(n\rho)$ on $\mathcal{H}_n$ for large $n$ was used by Biane [3] to prove a Log-Sobolev inequality in the context of one-dimensional free probability which holds (cf. [18]) in the following form
$$E(\mu) - E(\mu_V) \le \frac{1}{4\rho}\,I(\mu) \tag{1.1}$$
for any probability measure $\mu$ on $\mathbb{R}$ whose density with respect to the Lebesgue measure is in $L^3(\mathbb{R})$, where
$$I(\mu) = \int \big(H\mu(x) - V'(x)\big)^2\,\mu(dx) \quad\text{with}\quad H\mu(x) = 2\int \frac{1}{x - y}\,\mu(dy)$$
being the Hilbert transform of $\mu$.
M. Ledoux, I. Popescu / Journal of Functional Analysis 257 (2009) 1175–1221
More precisely, Biane and Voiculescu used the free Ornstein–Uhlenbeck process and the complex Burgers equation. Using the large random matrix strategy, Hiai, Petz and Ueda [18] reproved and extended the result of Biane and Voiculescu in the following form. If $V(x) - \rho x^2$ is convex for some $\rho > 0$, then for every probability measure $\mu$ on $\mathbb{R}$,
\[ \rho W_2^2(\mu, \mu_V) \le E(\mu) - E(\mu_V). \tag{1.2} \]
Later, the first author [24] gave a simpler proof of (1.1) and (1.2) based on a free version of the geometric Brunn–Minkowski inequality obtained as a random matrix limiting case of its classical counterpart. He also showed the free analog of the Otto–Villani theorem indicating that the free Log-Sobolev inequality implies the free transportation inequality (1.2).
The first aim of this paper is to provide direct proofs of the preceding functional inequalities in free probability without random matrix approximation. The second author of this paper gave in [26] a simple proof of the transportation inequality (1.2), along the same line of ideas as in [28] for the classical case, in which random matrix theory is entirely avoided. In this paper, following the approach of Cordero-Erausquin [9] (see also [4]), we use a combination of mass transport and convex analysis which applies to strictly convex potentials. The methods moreover allow us to enlarge the class of potentials under consideration, in particular in instances which lack a proper random matrix approximation. For example, we cover potentials $V$ on the line such that $V(x) - \rho|x|^p$ is convex for some $\rho > 0$ and $p > 1$, as well as a class of bounded perturbations of convex potentials. Using this approach, we present here a free HWI inequality for various cases of potentials. For the case $V(x) - \rho x^2$ convex for some $\rho \in \mathbb{R}$, this is
\[ E(\mu) - E(\mu_V) \le \sqrt{I(\mu)}\, W_2(\mu, \mu_V) - \rho W_2^2(\mu, \mu_V). \tag{1.3} \]
A Brunn–Minkowski inequality receives a direct proof as well. One interesting byproduct of our method is that some constants may be shown to be sharp. For the case of a quadratic $V$, Eqs. (1.1), (1.2) and (1.3) are sharp. Another topic, discussed in Section 3, is a free form of the transportation inequality which does not depend on the potential and which might be thought of as a version of the celebrated Pinsker inequality comparing total variation distance and entropy between probability measures. As opposed to the classical case, the free counterpart is more delicate.
The second part of this work is devoted to free one-dimensional Poincaré inequalities. Using random matrix approximations and the classical Poincaré inequality, we first give an ansatz for what could be a possible Poincaré inequality in the free probability world. In the case of $V(x) - \rho x^2$ convex for some $\rho > 0$, such that the measure $\mu_V$ has support $[-1, 1]$, this states that
\[ \int \phi'(x)^2\, \mu_V(dx) \ge \frac{\rho}{2\pi^2} \int_{-1}^{1}\int_{-1}^{1} \left( \frac{\phi(x) - \phi(y)}{x-y} \right)^2 \frac{1 - xy}{\sqrt{1-x^2}\,\sqrt{1-y^2}}\, dx\, dy, \tag{1.4} \]
for any smooth function $\phi$ on the interval $[-1, 1]$. There is also a second version of the Poincaré inequality, which is discussed in [3] for the case of the semicircular law. This inequality has a natural meaning in the context of free probability, as the derivative $\nabla\phi$ of a function from the classical $P(\rho)$ is replaced by the noncommutative derivative $\frac{\phi(x) - \phi(y)}{x-y}$, and thus our second version takes the form
\[ \iint \left( \frac{\phi(x) - \phi(y)}{x-y} \right)^2 \mu(dx)\, \mu(dy) \ge C\, \mathrm{Var}_\mu(\phi) \quad \text{for every } \phi \in C_0^1(\mathbb{R}). \tag{1.5} \]
As opposed to (1.4), which requires certain conditions on the measure $\mu_V$, it turns out that (1.5) is always satisfied, with some constant, for any compactly supported measure $\mu$. As was shown in [3] for the semicircular law, one can completely characterize the distribution in terms of the constant $C$. After the use of convexity, inequality (1.4) may actually be interpreted as a spectral gap as follows. On $L^2\big(\mathbf{1}_{[-2,2]}(x)\, \frac{dx}{\sqrt{4-x^2}}\big)$ take the Jacobi operator
\[ Lf = -(1 - x^2) f''(x) + x f'(x) \]
and the counting number operator defined by $N T_n = n T_n$, where $T_n$ are the Chebyshev polynomials of the first kind, which are orthogonal in $L^2\big(\mathbf{1}_{[-2,2]}(x)\, \frac{dx}{\sqrt{4-x^2}}\big)$. Then, (1.4) for $V(x) = x^2/2$ is equivalent to
\[ L \ge N. \]
Inequality (1.5) in the case of $V(x) = x^2/2$ can also be seen as the spectral gap for the counting number operator on $L^2\big(\mathbf{1}_{[-2,2]}(x) \sqrt{4-x^2}\, dx\big)$ with respect to the basis given by the Chebyshev polynomials of the second kind. A more general situation, which includes both versions of the Poincaré inequalities, is discussed in Section 9. As we mentioned already, in the classical setting, the Log-Sobolev and the transportation inequality imply the Poincaré inequalities. We do not have a satisfactory picture of these implications in the free context, for either of the two versions of the Poincaré inequality discussed here.
In the final part, we investigate the preceding families of functional inequalities for probability measures supported on the positive real axis. The random matrix context is the one of Wishart ensembles, with reference measure the Marcenko–Pastur distribution as opposed to the semicircular law, and the free functional inequalities correspond formally to the case of potentials $V(x) = rx - s\log(x)$ for $r > 0$, $s \ge 0$ on $\mathbb{R}_+$. Using the mass transportation method, we prove transportation, Log-Sobolev and HWI inequalities which were not investigated previously. A version of the Poincaré inequality is also discussed.
The structure of the paper is as follows. Sections 2, 4, 5 and 6 deal with the mass transportation proofs of, respectively, the transportation, Log-Sobolev, HWI and Brunn–Minkowski inequalities. Section 3 studies transportation inequalities which involve some metric on the probabilities and which are independent of the potential $V$. Sections 7 and 8 are devoted to the two versions of the Poincaré inequality in the free context, related in Section 9 through Jacobi operators. Section 10 investigates the preceding inequalities with respect to the Marcenko–Pastur distribution and its convex extensions.
2. Transportation inequality
Throughout this paper we consider lower semicontinuous potentials $V : \mathbb{R} \to \mathbb{R}$ such that
\[ \lim_{|x| \to \infty} \big( V(x) - 2\log|x| \big) = \infty. \tag{2.1} \]
For a given Borel set $\Gamma \subset \mathbb{R}$, denote by $\mathcal{P}(\Gamma)$ the set of probability measures supported on $\Gamma$. The logarithmic energy with external potential $V$ is defined by
\[ E_V(\mu) := \int V(x)\, \mu(dx) - \iint \log|x-y|\, \mu(dx)\, \mu(dy) \]
whenever both integrals exist and have finite values. In particular, for measures $\mu$ which have atoms, $E_V(\mu) = +\infty$ because the second term is $+\infty$. It is known (see [27] or [11]) that under condition (2.1) there exists a unique minimizer of $E_V$ in the set $\mathcal{P}(\mathbb{R})$ and that the solution $\mu_V$ is compactly supported. The variational characterization of the minimizer $\mu_V$ (cf. [27, Theorem 1.3]) is that for a constant $C \in \mathbb{R}$,
\[ \begin{aligned} V(x) &\ge 2\int \log|x-y|\, \mu_V(dy) + C && \text{for quasi-every } x \in \mathbb{R}, \\ V(x) &= 2\int \log|x-y|\, \mu_V(dy) + C && \text{for quasi-every } x \in \mathrm{supp}(\mu_V), \end{aligned} \tag{2.2} \]
where $\mathrm{supp}(\mu_V)$ stands for the support of $\mu_V$. If $\mu$ is such that $E_V(\mu) < \infty$, then Borel quasi-everywhere sets have $\mu$ measure 0 and thus the properties above hold almost surely with respect to $\mu$. For simplicity of notation, we will drop the subscript $V$ from $E_V$ unless the dependence on the potential has to be highlighted.
Now we summarize some known facts about the equilibrium measure and its support, as one can easily deduce them from [27, Chapter IV] and [11, Chapter 6].
Theorem 1.
1. Let $V$ be a potential satisfying (2.1) and $\alpha \ne 0$, $\beta \in \mathbb{R}$. Set $V_{\alpha,\beta}(x) = V(\alpha x + \beta)$. Then, $\mu_{V_{\alpha,\beta}} = ((\mathrm{id} - \beta)/\alpha)_\# \mu_V$ and
\[ E_V(\mu_V) = E_{V_{\alpha,\beta}}(\mu_{V_{\alpha,\beta}}) - \log|\alpha|. \tag{2.3} \]
2. If $V$ is convex satisfying (2.1), then the support of the equilibrium measure $\mu_V$ consists of one interval $[a, b]$, where $a$ and $b$ solve the system
\[ \begin{cases} \dfrac{1}{2\pi} \displaystyle\int_a^b V'(x) \sqrt{\dfrac{x-a}{b-x}}\, dx = 1, \\[2ex] \dfrac{1}{2\pi} \displaystyle\int_a^b V'(x) \sqrt{\dfrac{b-x}{x-a}}\, dx = -1. \end{cases} \tag{2.4} \]
3. Let $V$ be a $C^2$ potential satisfying (2.1) whose equilibrium measure has support $[a, b]$. Then the equilibrium measure $\mu_V$ has density $g(x)$, given by
\[ g(x) = \mathbf{1}_{[a,b]}(x)\, \frac{\sqrt{(x-a)(b-x)}}{2\pi^2} \int_a^b \frac{V'(y) - V'(x)}{(y-x)\sqrt{(y-a)(b-y)}}\, dy. \tag{2.5} \]
4. If $V$ is $C^2$, then
\[ V'(x) = \mathrm{p.v.}\int \frac{2}{x-y}\, \mu_V(dy) \quad \text{for } \mu_V\text{-almost all } x \in \mathrm{supp}(\mu_V), \tag{2.6} \]
where p.v. stands for the principal value integral. Notice that the principal value makes sense as $\mu_V$ has a continuous density.
We mention as a basic example that if $V(x) = \rho x^2$ is quadratic, then $\mu_V$ is the semicircular law
\[ \mu_V(dx) = \mathbf{1}_{[-\sqrt{2/\rho},\, \sqrt{2/\rho}]}(x)\, \frac{\sqrt{2\rho - \rho^2 x^2}}{\pi}\, dx. \]
In this work, for $p \ge 1$, we use $W_p(\mu, \nu)$ for the Wasserstein distance on the space of probability measures on $\mathbb{R}$, defined as
\[ W_p(\mu, \nu) = \left( \inf_{\pi \in \Pi(\mu,\nu)} \iint |x-y|^p\, \pi(dx, dy) \right)^{1/p} \tag{2.7} \]
with $\Pi(\mu, \nu)$ denoting the set of probability measures on $\mathbb{R}^2$ with marginals $\mu$ and $\nu$. Note here that if $\theta$ is the (non-decreasing) transport map such that $\theta_\# \mu = \nu$, then
\[ W_p^p(\mu, \nu) = \int |\theta(x) - x|^p\, \mu(dx). \tag{2.8} \]
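The monotone-rearrangement formula (2.8) can be illustrated on empirical measures. The following is only a minimal sketch, assuming two samples with the same number of equally weighted atoms; the function name `wasserstein_1d` and the sample values are hypothetical and not part of the paper. Sorting both samples plays the role of the non-decreasing transport map.

```python
def wasserstein_1d(xs, ys, p=2.0):
    """W_p between two empirical measures with equally many, equally weighted atoms.

    On the real line the non-decreasing rearrangement is optimal for p >= 1,
    so sorting both samples pairs the i-th smallest points with each other.
    """
    if len(xs) != len(ys):
        raise ValueError("this sketch assumes equally many atoms in both samples")
    a, b = sorted(xs), sorted(ys)
    cost = sum(abs(u - v) ** p for u, v in zip(a, b)) / len(a)
    return cost ** (1.0 / p)


# A translation theta(x) = x + m moves a measure by exactly |m| in every W_p.
base = [-1.0, -0.25, 0.0, 0.5, 2.0]
shifted = [x + 0.75 for x in base]
print(wasserstein_1d(base, shifted, p=2))
print(wasserstein_1d(base, shifted, p=3))
```

Translations are the case used below to show that the constant in (2.9) is sharp.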
For a detailed discussion on this topic we refer the reader to [29].
Our first result concerns the free version of the transportation cost inequality. As discussed in the introduction, the first assertion for strictly convex potentials was initially proved by large matrix approximation in [18]. The strategy of proof is inspired by [4,9,28] (see [26]).
Theorem 2 (Transportation inequality).
1. If $V$ is $C^2$ and $V(x) - \rho x^2$ is convex for some $\rho > 0$, then for any probability measure $\mu$ on $\mathbb{R}$,
\[ \rho W_2^2(\mu, \mu_V) \le E(\mu) - E(\mu_V). \tag{2.9} \]
If $V(x) = \rho x^2$, then equality in (2.9) is attained for measures $\mu = \theta_\# \mu_V$ with $\theta(x) = x + m$; therefore the constant $\rho$ in front of $W_2^2(\mu, \mu_V)$ is sharp.
2. Assume that $V$ is $C^2$, convex and $V''(x) \ge \rho > 0$ for all $|x| \ge r$. Then, there is a constant $C = C(r, \rho, \mu_V, V) > 0$ such that
\[ C W_2^2(\mu, \mu_V) \le E(\mu) - E(\mu_V). \tag{2.10} \]
3. In the case $V$ is $C^2$ and $V(x) - \rho|x|^p$ is convex for some real number $p > 1$, then, for any probability measure $\mu$ on $\mathbb{R}$,
\[ c_p\, \rho\, W_p^p(\mu, \mu_V) \le E(\mu) - E(\mu_V) \tag{2.11} \]
where $c_p = \inf_{x \in \mathbb{R}} \big( |1+x|^p - |x|^p - p\, \mathrm{sign}(x)|x|^{p-1} \big) > 0$.
Proof. 1. Since there is nothing to prove in the case $E(\mu) = \infty$, we assume that $E(\mu) < \infty$. In this case the measures $\mu$ and $\mu_V$ both have finite second moments. Now we take the non-decreasing transportation map $\theta$ such that $\theta_\# \mu_V = \mu$, which exists due to the lack of atoms of $\mu_V$. Using the transport map $\theta$, we first write
\[ \begin{aligned} E(\mu) - E(\mu_V) ={}& \int \big( V(\theta(x)) - V(x) - V'(x)(\theta(x) - x) \big)\, \mu_V(dx) \\ &+ \iint \left( \frac{\theta(x) - \theta(y)}{x-y} - 1 - \log\frac{\theta(x) - \theta(y)}{x-y} \right) \mu_V(dx)\, \mu_V(dy), \end{aligned} \tag{2.12} \]
where in between we used the variational equation (2.6) to justify that
\[ \int V'(x)(\theta(x) - x)\, \mu_V(dx) = 2\iint \frac{\theta(x) - x}{x-y}\, \mu_V(dy)\, \mu_V(dx) = \iint \frac{(\theta(x) - x) - (\theta(y) - y)}{x-y}\, \mu_V(dy)\, \mu_V(dx). \]
Since $V(x) - \rho x^2$ is convex, for any $x, y$ the following holds:
\[ V(y) - V(x) - V'(x)(y-x) \ge \rho\big( y^2 - x^2 - 2x(y-x) \big) = \rho(y-x)^2. \]
On the other hand, since $a - 1 \ge \log(a)$ for any $a \ge 0$, Eqs. (2.12) and (2.8) yield (2.9). In the case $V(x) = \rho x^2$ it is easy to see that for $\theta(x) = x + m$ all inequalities involved become equalities, thus we attain equality in (2.9) for translations of $\mu_V$.
2. We start the proof with (2.12), but this time we need to exploit the logarithmic term to get our inequality. The idea is to use the strong convexity where $\psi(x) := \theta(x) - x$ takes large values, and for small values of $\psi(x)$ we try to compensate with the second integral of (2.12). Notice in the first place that by Taylor's theorem we have that
\[ V(y) - V(x) - V'(x)(y-x) = (y-x)^2 \int_0^1 V''\big( (1-\tau)x + \tau y \big)(1-\tau)\, d\tau. \tag{2.13} \]
Now, let us assume that the support of the equilibrium measure $\mu_V$ is $[a, b]$. Next, $V''(x) \ge 0$ and $V''(x) \ge \rho$ for $|x| \ge r$ imply that for $|y| \ge 2r + 2\max\{|a|, |b|\}$, we obtain that
\[ V(y) - V(x) - V'(x)(y-x) \ge (y-x)^2 \int_{1/2}^1 V''\big( (1-\tau)x + \tau y \big)(1-\tau)\, d\tau \ge \rho(y-x)^2/8 \quad \text{for any } x \in [a, b]. \]
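Now write $\theta(x) = x + \psi(x)$. Thus using (2.12), and denoting $R = 2r + 2\max\{|a|, |b|\}$, we continue with
\[ \begin{aligned} \int \big( V(\theta(x)) - V(x) - V'(x)(\theta(x) - x) \big)\, \mu_V(dx) \ge{}& \frac{1}{2} \int \psi^2(x) \int_0^1 V''\big( x + \tau\psi(x) \big)(1-\tau)\, d\tau\, \mu_V(dx) \\ &+ \frac{\rho}{16} \int_{|\psi| \ge R} \psi^2(x)\, \mu_V(dx). \end{aligned} \tag{2.14} \]
This inequality provides a lower bound of the first term in (2.12). Further, it is not hard to check that
\[ \begin{aligned} \int_{|\psi| \ge R} \psi^2(x)\, \mu_V(dx) &= \frac{1}{2}\int \mathbf{1}_{\{|\psi| \ge R\}}(x)\, \psi^2(x)\, \mu_V(dx) + \frac{1}{2}\int \mathbf{1}_{\{|\psi| \ge R\}}(y)\, \psi^2(y)\, \mu_V(dy) \\ &\ge \frac{1}{8} \iint \mathbf{1}_{\{|\psi(x) - \psi(y)| \ge 2R\}}(x, y)\, \big( \psi(x) - \psi(y) \big)^2\, \mu_V(dx)\, \mu_V(dy). \end{aligned} \tag{2.15} \]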
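Now we treat the second integral on the right-hand side of (2.12). Use that $t - \log(1+t) \ge |t| - \log(1+|t|)$ for any $t > -1$, together with the fact that $t - \log(1+t)$ is an increasing function for $t \ge 0$, to argue that
\[ \begin{aligned} &\iint \left( \frac{\psi(x) - \psi(y)}{x-y} - \log\left( 1 + \frac{\psi(x) - \psi(y)}{x-y} \right) \right) \mu_V(dx)\, \mu_V(dy) \\ &\qquad \ge \iint \left( \frac{|\psi(x) - \psi(y)|}{b-a} - \log\left( 1 + \frac{|\psi(x) - \psi(y)|}{b-a} \right) \right) \mu_V(dx)\, \mu_V(dy). \end{aligned} \]
Further, for $s \ge 0$ and $u, v > 0$ we have
\[ u s^2 + s - \log(1+s) \ge \begin{cases} \dfrac{v - \log(1+v)}{v^2}\, s^2, & 0 \le s \le v, \\[1.5ex] u s^2, & v \le s, \end{cases} \;\ge\; \min\left\{ u, \frac{v - \log(1+v)}{v^2} \right\} s^2. \tag{2.16} \]
This inequality, used for $u = \frac{\rho(b-a)^2}{128}$ and $v = \frac{2R}{b-a}$ in combination with (2.15) and (2.16), yields for the choice of $c = \min\{u, (v - \log(1+v))/v^2\}$ that
\[ \begin{aligned} &\frac{\rho}{16} \int_{|\psi| \ge R} \psi^2(x)\, \mu_V(dx) + \iint \left( \frac{\psi(x) - \psi(y)}{x-y} - \log\left( 1 + \frac{\psi(x) - \psi(y)}{x-y} \right) \right) \mu_V(dx)\, \mu_V(dy) \\ &\qquad \ge c \iint \big( \psi(x) - \psi(y) \big)^2\, \mu_V(dx)\, \mu_V(dy) = c \left[ \int \psi^2(x)\, \mu_V(dx) - \left( \int \psi(x)\, \mu_V(dx) \right)^2 \right]. \end{aligned} \tag{2.17} \]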
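This shows that $E(\mu) - E(\mu_V)$ is bounded below by a constant times the variance of $\psi$. Notice that $W_2^2(\mu, \mu_V) = \int \psi^2(x)\, \mu_V(dx)$, and in order to complete the proof we have to replace the variance of $\psi$ by the integral of $\psi^2$ with respect to $\mu_V$. This boils down to estimating the $\mu_V$ integral of $\psi$ in terms of the integral of $\psi^2$. To this end, use Cauchy's inequality:
\[ \left( \int \psi(x)\, \mu_V(dx) \right)^2 \le \int \psi^2(x) \left( 1 + \frac{1}{2c} \int_0^1 V''\big( x + \tau\psi(x) \big)(1-\tau)\, d\tau \right) \mu_V(dx) \times \int \frac{1}{1 + \frac{1}{2c} \int_0^1 V''\big( x + \tau\psi(x) \big)(1-\tau)\, d\tau}\, \mu_V(dx). \]
This inequality, combined with Eqs. (2.12), (2.14) and (2.17), results in
\[ \begin{aligned} E(\mu) - E(\mu_V) &\ge \int \psi^2(x) \left( c + \frac{1}{2} \int_0^1 V''\big( x + \tau\psi(x) \big)(1-\tau)\, d\tau \right) \mu_V(dx) \times \int \frac{\int_0^1 V''(x + \tau\psi(x))(1-\tau)\, d\tau}{2c + \int_0^1 V''(x + \tau\psi(x))(1-\tau)\, d\tau}\, \mu_V(dx) \\ &\ge c \int \frac{\int_0^1 V''(x + \tau\psi(x))(1-\tau)\, d\tau}{2c + \int_0^1 V''(x + \tau\psi(x))(1-\tau)\, d\tau}\, \mu_V(dx)\; W_2^2(\mu, \mu_V), \end{aligned} \]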
where we used the convexity encoded into $V'' \ge 0$ and the fact that $W_2^2(\mu, \mu_V) = \int \psi^2(x)\, \mu_V(dx)$ to get the lower bound of the first integral. From the previous inequality, it becomes clear that we are done as soon as we prove that the quantity in front of $W_2^2(\mu, \mu_V)$ is bounded from below by a positive constant uniformly in $\psi$. To carry this out, notice that $V''$ cannot be identically zero on $[a, b]$. Indeed, if $V''$ were identically zero on $[a, b]$, then we would have that $V'(x) = K$ for all $x \in [a, b]$, and this plugged into Eq. (2.4) yields that $K(b-a) = 2$ and $K(b-a) = -2$, a system without a solution. Therefore $V''$ is not identically 0 on $[a, b]$. If $|\psi(x)| > R$, then $V''(x + \tau\psi(x)) \ge \rho$ for $1/2 \le \tau < 1$, which implies $\int_0^1 V''(x + \tau\psi(x))(1-\tau)\, d\tau \ge \rho/8$. On the other hand, if $|\psi(x)| \le R$, then
\[ \int_0^1 V''\big( x + \tau\psi(x) \big)(1-\tau)\, d\tau \ge \int_0^\delta V''\big( x + \tau\psi(x) \big)(1-\tau)\, d\tau \ge \frac{\delta}{2} \inf_{|y-x| \le \delta R} V''(y) \]
for all $0 \le \delta \le 1$. Define
\[ w(x) = \sup_{\delta \in [0,1]} \min\left\{ \frac{\rho}{8},\; \frac{\delta}{2} \inf_{|y-x| \le \delta R} V''(y) \right\}. \]
Since $V''$ is not identically 0 on $[a, b]$, it follows that $w$ is not identically zero on $[a, b]$. With this we obtain that
\[ \int_0^1 V''\big( x + \tau\psi(x) \big)(1-\tau)\, d\tau \ge w(x) \ge 0, \]
and then that
\[ C = c \int \frac{\int_0^1 V''(x + \tau\psi(x))(1-\tau)\, d\tau}{2c + \int_0^1 V''(x + \tau\psi(x))(1-\tau)\, d\tau}\, \mu_V(dx) \ge \int \frac{c\, w(x)}{2c + w(x)}\, \mu_V(dx) > 0, \]
which finishes the proof of (2.10) with this choice of $C$.
3. For the inequality (2.11), we follow the same route as in the proof of (2.9), the only change this time being that $V(x) - \rho|x|^p$ is convex, and thus we obtain
\[ V(y) - V(x) - V'(x)(y-x) \ge \rho\big( |y|^p - |x|^p - p\, \mathrm{sign}(x)|x|^{p-1}(y-x) \big). \tag{2.18} \]
Writing $\theta(x) = x + \psi(x)$, and using (2.12) together with $a - 1 \ge \log(a)$ for $a \ge 0$, one arrives at
\[ E(\mu) - E(\mu_V) \ge \rho \int \big( |x + \psi(x)|^p - |x|^p - p\, \mathrm{sign}(x)|x|^{p-1}\psi(x) \big)\, \mu_V(dx). \]
Now we use the fact that for all $a, b \in \mathbb{R}$,
\[ |a+b|^p - |b|^p - p\, \mathrm{sign}(b)|b|^{p-1} a \ge c_p |a|^p, \tag{2.19} \]
which, applied to the above inequality in conjunction with (2.8), yields inequality (2.11). □
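For the even exponents $p = 2$ and $p = 4$ the constant $c_p$ from (2.11) has a closed form, which gives a quick numerical sanity check of its definition: for $p = 2$ the function being minimized is identically 1, and for $p = 4$ it equals $6x^2 + 4x + 1$, with minimum $1/3$ at $x = -1/3$. The sketch below is illustrative only; the helper names `gap` and `c_p_on_grid` and the search window $[-5, 5]$ are assumptions, and a grid minimum is of course not a proof.

```python
def gap(x, p):
    """|1 + x|^p - |x|^p - p * sign(x) * |x|^(p - 1), the function minimised in c_p."""
    sign = (x > 0) - (x < 0)
    return abs(1.0 + x) ** p - abs(x) ** p - p * sign * abs(x) ** (p - 1)


def c_p_on_grid(p, half_width=5.0, steps=200_000):
    """Minimum of gap(., p) over a uniform grid on [-half_width, half_width].

    Hypothetical helper: the truncation to a bounded window is an assumption
    that is harmless for p = 2 and p = 4, where the minimiser is known.
    """
    h = 2.0 * half_width / steps
    return min(gap(-half_width + i * h, p) for i in range(steps + 1))


print(c_p_on_grid(2))  # compare with the closed form 1
print(c_p_on_grid(4))  # compare with the closed form 1/3
```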
Remark 1.
1. The $C^2$ regularity of $V$ for (2.9) can be dropped (see [26]), but to simplify the presentation here we decided to consider only this case.
2. If $V(x) - \rho|x|^p$ is convex, then using inequalities (2.11), (2.10) and Young's inequality we obtain that for any $2 \le k \le p$, there exists a constant $c = c(k, p, \rho, \mu_V, V)$ such that $c\, W_k^k(\mu, \mu_V) \le E(\mu) - E(\mu_V)$.
3. We want to point out that the inequalities (2.11) and (2.10) are somehow complementary to each other. For example, if we take $V(x) = \rho|x|^p$ with $p > 1$ and the measure $\mu = \theta_\# \mu_V$ for $\theta(x) = x + m$, then Eq. (2.11) takes the form
\[ c_p |m|^p \le \int \big( |x+m|^p - |x|^p \big)\, \mu_V(dx) \tag{2.20} \]
while Eq. (2.10) becomes
\[ C m^2 \le \int \big( |x+m|^p - |x|^p \big)\, \mu_V(dx), \]
which, because it is easy to check that $\mu_V$ is symmetric, is the same as
\[ C m^2 \le \int \big( |x+m|^p - |x|^p - p\, \mathrm{sign}(x)|x|^{p-1} m \big)\, \mu_V(dx). \tag{2.21} \]
Notice here that (2.20) is in the right scale for large $m$, whereas (2.21) is in the right scale for $m$ close to 0, because in this case the integrand is of the size $m^2$. It seems that Talagrand's transportation inequality in this context has two aspects: one is the large $W_p(\mu, \mu_V)$ regime, which is dictated by the potential $V$ for large values and results in Eq. (2.11), and the other is the small $W_2(\mu, \mu_V)$ regime, which is dictated by the repulsion effect of the logarithm and results in Eq. (2.10).
4. It is not clear whether inequality (2.10) still holds for the case of a potential $V$ which is not convex. Of interest would be the particular case $V(x) = ax^4 + bx^2$ for some $a > 0$ and $b < 0$. This example actually raises the question of the stability of the transportation inequality under bounded perturbations.
5. Very likely the constant $c_p$ in (2.11) is not sharp.
3. Potential independent transportation inequalities
In this section, we investigate some potential independent transportation inequalities. A transportation inequality in the form of (2.10) cannot possibly hold without quadratic growth at infinity. Also, the proof of (2.10) might lead to the conclusion that the logarithmic term plays a more important role. Therefore the natural question one may ask is whether there is a manifestation of this fact in some sort of transportation type inequality which is independent of the potential involved. The main question reduces to finding an appropriate distance to replace the Wasserstein distance in Theorem 2. We investigate in this section several possibilities, starting with the free version of the classical Pinsker inequality. Pinsker's inequality classically states that (cf. [10] and [21])
\[ 2\|\mu - \nu\|_v^2 \le E(\mu \mid \nu) \quad \text{for any probability measures } \mu, \nu \text{ on } \mathbb{R}, \]
where $\|\mu - \nu\|_v$ is the total variation distance between $\mu$ and $\nu$ and $E(\mu \mid \nu)$ is the relative entropy between $\mu$ and $\nu$. This in particular shows that if $\mu_n$ converges to $\mu$ in entropy, then $\mu_n$ converges to $\mu$ in a very strong sense.
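The classical Pinsker inequality can be checked numerically on small examples. The sketch below is an illustration under stated assumptions: it uses distributions on a finite set rather than measures on the line, and it takes the total variation distance to be normalized as $\sup_A |\mu(A) - \nu(A)|$ (half the $\ell^1$ distance), which is the normalization for which the constant 2 applies. The function names and the sample distributions are hypothetical.

```python
import math


def total_variation(p, q):
    """sup_A |P(A) - Q(A)| on a finite set, i.e. half the l1 distance."""
    return 0.5 * sum(abs(a - b) for a, b in zip(p, q))


def relative_entropy(p, q):
    """E(P | Q) = sum_i p_i log(p_i / q_i), with the convention 0 log 0 = 0."""
    return sum(a * math.log(a / b) for a, b in zip(p, q) if a > 0)


pairs = [
    ([0.5, 0.5], [0.9, 0.1]),
    ([0.2, 0.3, 0.5], [0.4, 0.4, 0.2]),
    ([0.01, 0.99], [0.5, 0.5]),
]
for p, q in pairs:
    lhs = 2.0 * total_variation(p, q) ** 2
    rhs = relative_entropy(p, q)
    print(f"2*TV^2 = {lhs:.4f}  <=  E(P|Q) = {rhs:.4f}")
```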
The same natural question can be posed in the logarithmic entropy context. For a given potential $V$, is there an inequality of the form $C\|\mu - \mu_V\|_v^2 \le E(\mu) - E(\mu_V)$ for a given constant $C > 0$ and any probability distribution $\mu$ on $\mathbb{R}$? It turns out that these inequalities do not hold for the logarithmic energy. In fact, we will show that even a weaker inequality of the form
\[ C\, |F_\mu - F_{\mu_V}|_u^2 \le E(\mu) - E(\mu_V) \tag{3.1} \]
does not hold, where $F_\mu$ denotes the cumulative distribution function of a probability measure $\mu$ on the line. Even though the uniform distance does not have the same widespread use in probability, it appears for example in the Berry–Esseen type estimates for the convergence in the central limit theorem. This is the reason why we consider this distance as the next best candidate when the total variation fails. Clearly this metric gives a stronger topology than the topology of weak convergence.
We will construct a counterexample to (3.1) in the case of $V(x) = 2x^2$, for which the equilibrium measure is
\[ \mu_V(dx) = \mathbf{1}_{[-1,1]}(x)\, \frac{2\sqrt{1-x^2}}{\pi}\, dx, \]
the semicircular law on $[-1, 1]$. Consider now the sequence
\[ \mu_n(dx) = \mathbf{1}_{[-1,1]}(x) \left[ \frac{2\sqrt{1-x^2}}{\pi} + \frac{\sum_{k=2}^{2n-1} (-1)^k T_{2k+1}(x)}{4(n^2-1)\pi\sqrt{1-x^2}} \right] dx, \]
where $T_k$ is the $k$th Chebyshev polynomial of the first kind. With these choices we have that
\[ E(\mu_n) - E(\mu_V) \le \frac{\pi^2}{\log(n/3)}\, |F_{\mu_n} - F_{\mu_V}|_u^2 \quad \text{for all } n \ge 4. \tag{3.2} \]
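That the measures $\mu_n$ in this construction are probability measures can be checked numerically. Under the substitution $x = \cos t$, with $T_{2k+1}(\cos t) = \cos((2k+1)t)$, the density of $\mu_n$ becomes a trigonometric polynomial on $[0, \pi]$. The sketch below evaluates it on a midpoint grid and reports its minimum and total mass; the helper names and grid size are assumptions, and a grid check is only a sanity test, not a proof.

```python
import math


def f_n(t, n):
    """Density on [0, pi] of mu_n pulled back through x = cos t."""
    bump = sum((-1) ** k * math.cos((2 * k + 1) * t) for k in range(2, 2 * n))
    return (1.0 - math.cos(2.0 * t)) / math.pi + bump / (4.0 * math.pi * (n * n - 1))


def min_and_mass(n, steps=20_000):
    """Smallest grid value and midpoint-rule integral of f_n over [0, pi]."""
    h = math.pi / steps
    values = [f_n((i + 0.5) * h, n) for i in range(steps)]
    return min(values), h * sum(values)


for n in (4, 7, 12):
    smallest, mass = min_and_mass(n)
    print(n, smallest, mass)
```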
Let us point out that $\mu_n$ is indeed a probability measure. This requires a little proof, but it is entirely elementary and is left to the reader. To prove (3.2), notice that since the support of $\mu_n$ is the same as the support of $\mu_V$, we have from (2.2) that
\[ E(\mu_n) - E(\mu_V) = -\iint \log|x-y|\, (\mu_n - \mu_V)(dx)\, (\mu_n - \mu_V)(dy). \tag{3.3} \]
Next remark that $\mu_n = \cos_\#(f_n \lambda)$ and $\mu_V = \cos_\#(g\lambda)$, where $\lambda$ is the Lebesgue measure on $[0, \pi]$ and
\[ f_n(t) = \frac{1 - \cos(2t)}{\pi} + \frac{1}{4\pi(n^2-1)} \sum_{k=2}^{2n-1} (-1)^k \cos\big( (2k+1)t \big), \qquad g(t) = \frac{1 - \cos(2t)}{\pi}, \]
and further
\[ -\iint \log|x-y|\, (\mu_n - \mu_V)(dx)\, (\mu_n - \mu_V)(dy) = -\int_0^\pi \int_0^\pi \log|\cos t - \cos s|\, h_n(t)\, h_n(s)\, dt\, ds, \quad \text{where } h_n = f_n - g. \]
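Now we provide a formula for the logarithmic energy which we learnt from [15] and have not seen elsewhere. Here is a quick description. Write first $\cos t = (e^{it} + e^{-it})/2$ and $\cos s = (e^{is} + e^{-is})/2$, so that $|\cos t - \cos s| = |(e^{it} + e^{-it})/2 - (e^{is} + e^{-is})/2| = |1 - e^{i(t+s)}|\, |1 - e^{i(t-s)}|/2$, and so, for $t \ne s$, and $t$ or $s$ not equal to $\pi$,
\[ \begin{aligned} \log|\cos t - \cos s| &= -\log 2 + \mathrm{Re}\Big( \log\big( 1 - e^{i(t+s)} \big) + \log\big( 1 - e^{i(t-s)} \big) \Big) \\ &= -\log 2 - \sum_{\ell=1}^{\infty} \mathrm{Re}\Big( e^{i\ell(t+s)}/\ell + e^{i\ell(t-s)}/\ell \Big) \\ &= -\log 2 - \sum_{\ell=1}^{\infty} \frac{2}{\ell} \cos(\ell t)\cos(\ell s). \end{aligned} \]
From this, one gets
\[ -\int_0^\pi \int_0^\pi \log|\cos t - \cos s|\, h_n(t)\, h_n(s)\, dt\, ds = \sum_{\ell=1}^{\infty} \frac{2}{\ell} \left( \int_0^\pi \cos(\ell t)\, h_n(t)\, dt \right)^2. \tag{3.4} \]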
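But now,
\[ \int_0^\pi \cos(\ell t)\, h_n(t)\, dt = \frac{1}{4\pi(n^2-1)} \sum_{k=2}^{2n-1} (-1)^k \int_0^\pi \cos(\ell t)\cos\big( (2k+1)t \big)\, dt = \begin{cases} \dfrac{(-1)^{(\ell-1)/2}}{8(n^2-1)}, & 4 \le \ell \le 4n \text{ and } \ell \text{ odd}, \\[1.5ex] 0, & \text{otherwise}, \end{cases} \]
and thus
\[ -\int_0^\pi \int_0^\pi \log|\cos t - \cos s|\, h_n(t)\, h_n(s)\, dt\, ds = \sum_{\ell=1}^{\infty} \frac{2}{\ell} \left( \int_0^\pi \cos(\ell t)\, h_n(t)\, dt \right)^2 = \frac{1}{32(n^2-1)^2} \sum_{\ell=2}^{2n-1} \frac{1}{2\ell+1}. \]
On the other hand, $|F_{\mu_n} - F_{\mu_V}|_u = |F_{f_n\lambda} - F_{g\lambda}|_u = \sup_{x \in [0,\pi]} \big| \int_0^x h_n(t)\, dt \big|$ and
\[ \int_0^x h_n(t)\, dt = \frac{1}{4\pi(n^2-1)} \sum_{\ell=2}^{2n-1} \frac{(-1)^\ell \sin\big( (2\ell+1)x \big)}{2\ell+1}, \tag{3.5} \]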
from which, for $x = \pi/2$ (where $\sin((2\ell+1)x) = (-1)^\ell$), we obtain
\[ |F_{\mu_n} - F_{\mu_V}|_u = \sup_{x \in [0,\pi]} \left| \int_0^x h_n(t)\, dt \right| \ge \frac{1}{4\pi(n^2-1)} \sum_{\ell=2}^{2n-1} \frac{1}{2\ell+1}. \tag{3.6} \]
Combining (3.5) and (3.6) we get
\[ \frac{\pi^2}{2\sum_{\ell=2}^{2n-1} \frac{1}{2\ell+1}}\, |F_{\mu_n} - F_{\mu_V}|_u^2 \ge -\iint \log|x-y|\, (\mu_n - \mu_V)(dx)\, (\mu_n - \mu_V)(dy), \tag{3.7} \]
which, together with the fact that $\sum_{\ell=2}^{2n-1} \frac{1}{2\ell+1} \ge \frac{1}{2}\log(n/3)$ for $n \ge 4$ and (3.3), finally gives (3.2).
The example shown above has the property that $E(\mu_n) - E(\mu_V)$ converges to 0 when $n$ goes to infinity, and also that $|F_{\mu_n} - F_{\mu_V}|_u$ converges to zero. Despite the fact that (3.1) does not hold, we will see below in Corollary 1 that if $E(\mu_n) - E(\mu_V)$ converges to 0, then $|F_{\mu_n} - F_{\mu_V}|_u$ always converges to 0.
We consider now a weak form of (3.1). To do this we define the distance
\[ d(\mu, \nu) = \sup_{a, b \in \mathbb{R}} \left| \int e^{-|ax+b|}\, \mu(dx) - \int e^{-|ax+b|}\, \nu(dx) \right|. \tag{3.8} \]
With this definition we have the following result.
Theorem 3. For any potential $V$ satisfying (2.1), we have that for any compactly supported measure $\mu$,
\[ 4\pi^3 d^2(\mu, \mu_V) \le E(\mu) - E(\mu_V). \tag{3.9} \]
Proof. Using Eqs. (2.1) and (2.2), we get for any compactly supported measure $\mu$ with $E(\mu)$ finite,
\[ E(\mu) - E(\mu_V) \ge -\iint \log|x-y|\, (\mu - \mu_V)(dx)\, (\mu - \mu_V)(dy). \]
We will prove that for any measures $\mu$ and $\nu$ with compact support such that $-\iint \log|x-y|\, \mu(dx)\, \mu(dy) < \infty$ and $-\iint \log|x-y|\, \nu(dx)\, \nu(dy) < \infty$, we have that
\[ 4\pi^3 d^2(\mu, \nu) \le -\iint \log|x-y|\, (\mu - \nu)(dx)\, (\mu - \nu)(dy), \tag{3.10} \]
which shows that (3.10) implies (3.9). Now we use [11, Eq. (6.45)] to write
\[ -\iint \log|x-y|\, (\mu - \mu_V)(dx)\, (\mu - \mu_V)(dy) = \int_0^\infty \frac{|\hat{\mu}(t) - \hat{\mu}_V(t)|^2}{t}\, dt \tag{3.11} \]
where the hat stands for the Fourier transform, and continue with
\[ \int_0^\infty \frac{|\hat{\mu}(t) - \hat{\nu}(t)|^2}{t}\, dt = \frac{1}{2}\int_{-\infty}^{\infty} \frac{|\hat{\mu}(t) - \hat{\nu}(t)|^2}{|t|}\, dt \ge |a| \int_{-\infty}^{\infty} \frac{|\hat{\mu}(t) - \hat{\nu}(t)|^2}{a^2 + t^2}\, dt \ge \frac{a^2}{\pi} \left| \int_{-\infty}^{\infty} \frac{(\hat{\mu}(t) - \hat{\nu}(t))\, e^{-ict}}{a^2 + t^2}\, dt \right|^2 \]
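for any $a, c \in \mathbb{R}$ with $a \ne 0$. Further, using the inversion formula for the Fourier transform, one has
\[ \int_{-\infty}^{\infty} \frac{(\hat{\mu}(t) - \hat{\nu}(t))\, e^{-ict}}{a^2 + t^2}\, dt = 2\pi \int \hat{\phi}(x)\, (\mu - \nu)(dx) = \frac{2\pi^2}{|a|} \int e^{-|a(x+c)|}\, (\mu - \nu)(dx) \tag{3.12} \]
because for $\phi(t) = \frac{e^{ict}}{a^2 + t^2}$,
\[ \hat{\phi}(x) = \int \frac{e^{i(x+c)t}}{t^2 + a^2}\, dt = \frac{\pi e^{-|a(x+c)|}}{|a|}. \]
From here, (3.10) follows immediately. □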
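The closed form used for $\hat{\phi}$ in the proof above, $\int_{-\infty}^{\infty} \frac{e^{iut}}{t^2 + a^2}\, dt = \frac{\pi e^{-|au|}}{|a|}$ with $u = x + c$, can be sanity-checked numerically. The sketch below is only an approximation under stated assumptions: the integral is truncated to $[-200, 200]$ and evaluated by the trapezoid rule, and only the cosine part is integrated because the sine part vanishes by symmetry. The function name, cutoff and step count are hypothetical choices.

```python
import math


def kernel_transform(u, a, cutoff=200.0, steps=200_000):
    """Trapezoid approximation of the integral of cos(u t) / (t^2 + a^2) over [-cutoff, cutoff]."""
    h = cutoff / steps

    def f(t):
        return math.cos(u * t) / (t * t + a * a)

    half = 0.5 * (f(0.0) + f(cutoff)) + sum(f(i * h) for i in range(1, steps))
    return 2.0 * h * half  # the integrand is even in t


for u, a in [(1.0, 1.0), (0.5, 2.0)]:
    exact = math.pi * math.exp(-abs(a * u)) / abs(a)
    print(u, a, kernel_transform(u, a), exact)
```

The truncation error decays like $1/(u\,\mathrm{cutoff}^2)$ for $u \ne 0$, so the check is not meaningful at $u = 0$ with this cutoff.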
Remark 2. From Eq. (3.11) it seems that the distance one should consider is the Sobolev norm with exponent $-1/2$. This is another possible candidate for the role played here by $d$, although it is not always finite. We chose the metric $d$ as its definition is somehow close to the uniform norm of the difference of the Laplace transforms of the measures. It is also always defined and bounded by 1, thus resembling the total variation distance.
The next result collects facts about how strong the topology induced by $d$ is.
Proposition 1.
1. $d$ is a distance on $\mathcal{P}(\mathbb{R})$, and if $d(\mu_n, \mu) \to 0$ as $n \to \infty$, then $\mu_n \to \mu$ in the weak topology. In addition, $d(\delta_a, \delta_b) = 1$ for $a \ne b$, thus the topology induced by $d$ is strictly stronger than the weak convergence topology.
2. For any two probability measures $\mu$ and $\nu$,
\[ d(\mu, \nu) \le 2|F_\mu - F_\nu|_u. \tag{3.13} \]
3. If $V$ satisfies condition (2.1), then $E_V(\mu_n) \to E_V(\mu_V)$ as $n \to \infty$ implies $|F_{\mu_n} - F_{\mu_V}|_u \to 0$ as $n \to \infty$.
Proof. 1. To prove that $d$ is a distance, the only nontrivial fact is that for two probability measures $\mu$ and $\nu$, $d(\mu, \nu) = 0$ implies $\mu = \nu$. From Eq. (3.12), we obtain for $a = 1$ that for all $c \in \mathbb{R}$,
\[ \int_{-\infty}^{\infty} \frac{(\hat{\mu}(t) - \hat{\nu}(t))\, e^{-ict}}{1 + t^2}\, dt = 0. \]
Since this holds true for any $c \in \mathbb{R}$, it implies that the Fourier transform of the function $t \mapsto \frac{\hat{\mu}(t) - \hat{\nu}(t)}{1 + t^2}$ is 0, which means that the function in question must be 0. This means that $\hat{\mu} = \hat{\nu}$, or equivalently that $\mu = \nu$.
Let $L(\mu, \nu)$ stand for the Lévy distance, which induces the weak topology on $\mathcal{P}(\mathbb{R})$. Let $d(\mu_n, \mu) \to 0$ as $n \to \infty$. Assume now that there exist $\epsilon > 0$ and a subsequence such that $L(\mu_{n_k}, \mu) \ge \epsilon$. In other words, the sequence $\mu_n$ has a subsequence which does not converge to $\mu$. Since we are dealing with probability measures, there is a subsequence $\mu_{n_{k_l}}$ which is vaguely convergent to a measure $\nu$ with total mass at most 1. This means that for any continuous function $\phi$ which vanishes at infinity, we have that
\[ \int \phi\, d\mu_{n_{k_l}} \xrightarrow[l \to \infty]{} \int \phi\, d\nu. \]
We can apply this to the functions $\phi(x) = e^{-|ax+b|}$ where $a \ne 0$ and infer that
\[ \int e^{-|ax+b|}\, \mu_{n_{k_l}}(dx) \xrightarrow[l \to \infty]{} \int e^{-|ax+b|}\, \nu(dx) \quad \text{for all } a \ne 0,\ b \in \mathbb{R}. \]
On the other hand, because $d(\mu_{n_{k_l}}, \mu) \to 0$ as $l \to \infty$, these considerations result in
\[ \int e^{-|ax+b|}\, \mu(dx) = \int e^{-|ax+b|}\, \nu(dx) \quad \text{for all } a \ne 0,\ b \in \mathbb{R}. \]
Further, using dominated convergence for $b = 0$ and $a \to 0$, we obtain that $\nu$ is a probability measure. From the discussion at the beginning of this proof, it also follows that $\nu = \mu$, and this in turn results in $\mu_{n_{k_l}}$ being weakly convergent to $\mu$, a contradiction. This proves that convergence in the metric $d$ implies weak convergence.
It is obvious that $d(\mu, \nu) \le 1$ for any measures $\mu$ and $\nu$. For the case of discrete measures, we also have that $1 \ge d(\delta_a, \delta_b) \ge \big| \int e^{-\alpha|x-a|}\, \delta_a(dx) - \int e^{-\alpha|x-a|}\, \delta_b(dx) \big|$ for any $\alpha > 0$, which yields that $1 \ge d(\delta_a, \delta_b) \ge 1 - e^{-\alpha|b-a|}$ for all $\alpha > 0$. Letting $\alpha \to \infty$, we get that $d(\delta_a, \delta_b) = 1$ for $a \ne b$, which shows that convergence in $d$ is strictly stronger than convergence in the weak topology.
2. From the fact that for any finite positive measure $\mu$,
\[ \int_{(0,\infty)} \big( 1 - e^{-\alpha y} \big)\, \mu(dy) = \int_{(0,\infty)} \alpha e^{-\alpha y}\, \mu\big( (y, \infty) \big)\, dy, \]
we deduce that
\[ \int e^{-\alpha|x-a|}\, (\mu - \nu)(dx) = \int_{(0,\infty)} \alpha e^{-\alpha y} \big( F_\mu(a-y) - F_\mu(a+y) - F_\nu(a-y) + F_\nu(a+y) \big)\, dy, \]
which easily yields (3.13).
3. We actually show that if $\mu_n$ and $\mu$ are compactly supported probability measures such that
\[ -\iint \log|x-y|\, \mu(dx)\, \mu(dy) < \infty, \qquad -\iint \log|x-y|\, \mu_n(dx)\, \mu_n(dy) < \infty \]
and
\[ \lim_{n \to \infty} \iint \log|x-y|\, (\mu_n - \mu)(dx)\, (\mu_n - \mu)(dy) = 0, \]
then $|F_{\mu_n} - F_\mu|_u \to 0$ as $n \to \infty$. From (3.10) and the first part, we obtain that $\mu_n$ converges weakly to $\mu$. In addition, none of the measures $\mu_n$ or $\mu$ have atoms. Thus $F_{\mu_n}$ and $F_\mu$ are continuous functions, which combined with the weak convergence implies that $F_{\mu_n}$ converges pointwise to $F_\mu$. Since the functions $F_{\mu_n}$ and $F_\mu$ are distribution functions of probability measures, it is an easy matter to check that the convergence is actually uniform. □
Remark 3. We do not know if the topology of convergence in $d$ is the same as the one defined by the metric $|F_\mu - F_\nu|_u$. This result might leave one wondering if a stronger convergence takes place. In other words, is it true that $E_V(\mu_n) \to E_V(\mu_V)$ implies $\|\mu_n - \mu_V\|_v \to 0$ as $n \to \infty$? To this end, we can consider $V(x) = \log\left| \frac{|x| + \sqrt{x^2-1}}{2} \right|$ and notice (see [27, p. 46]) that $\mu_V$ is the arcsine law of $[-1, 1]$. Thus if we consider
\[ \mu_V(dx) = \mathbf{1}_{[-1,1]}(x)\, \frac{dx}{\pi\sqrt{1-x^2}}, \qquad \mu_n(dx) = \mathbf{1}_{[-1,1]}(x)\, \frac{(1 - T_n(x))\, dx}{\pi\sqrt{1-x^2}}, \]
then, using the same argument which led us to (3.4), with $h_n$ there replaced by $h_n(x) = \cos(nx)$ here, one arrives at $E(\mu_n) - E(\mu_V) = \frac{1}{n}$, while the total variation distance is $\|\mu_n - \mu_V\|_v \ge 1/4$.
4. Log-Sobolev inequality
In this section, we similarly develop the mass transportation method to prove the Log-Sobolev inequality in the free context. Note again that, as discussed in the introduction, the first assertion for strictly convex potentials was initially proved by large matrix approximation in [3]. Before we state the main result, we define, inspired by Voiculescu [31], the relative free Fisher information as
\[ I(\mu) = \int \big( H\mu(x) - V'(x) \big)^2\, \mu(dx) \quad \text{with} \quad H\mu(x) = \mathrm{p.v.}\int \frac{2}{x-y}\, \mu(dy) \tag{4.1} \]
for measures $\mu$ on $\mathbb{R}$ which have density $p = d\mu/dx$ in $L^3(\mathbb{R})$. In this case the principal value integral is a function in $L^3$. Otherwise we let $I(\mu)$ be equal to $+\infty$.
Theorem 4 (Log-Sobolev).
1. If $V$ is $C^2$ and $V(x) - \rho x^2$ is convex for some $\rho > 0$, then for any probability measure $\mu$ on $\mathbb{R}$,
\[ E(\mu) - E(\mu_V) \le \frac{1}{4\rho}\, I(\mu). \tag{4.2} \]
Equality is attained for the case $V(x) = \rho x^2$ and $\mu = \theta_\# \mu_V$, where $\theta(x) = x + m$. Thus the inequality (4.2) is sharp for translations of $\mu_V$.
2. If $V$ is $C^2$ and $V(x) - \rho|x|^p$ is convex for some $\rho > 0$ and $p > 1$, then for any probability measure $\mu$ on $\mathbb{R}$,
\[ E(\mu) - E(\mu_V) \le \frac{k_p}{\rho^{q/p}}\, I_q(\mu), \quad \text{where} \quad I_q(\mu) = \int \big| H\mu(x) - V'(x) \big|^q\, \mu(dx), \tag{4.3} \]
where $q$ is the conjugate of $p$, i.e. $1/q + 1/p = 1$, and the constant is $k_p = (p c_p)^{-q/p}/q$, with $c_p$ from (2.11).
Proof. 1. We will assume that the measure $\mu$ has a smooth compactly supported density, as the general case follows via approximation arguments discussed in detail in [18]. Take the (increasing) transport map $\theta$ from $\mu_V$ into $\mu$. We write the inequality (4.2) in the following equivalent way:
\[ \begin{aligned} &\frac{1}{4\rho} \int \big( H\mu(\theta(x)) - V'(\theta(x)) \big)^2\, \mu_V(dx) + \int \big( V(x) - V(\theta(x)) - V'(\theta(x))(x - \theta(x)) \big)\, \mu_V(dx) \\ &\quad - \int \big( H\mu(\theta(x)) - V'(\theta(x)) \big)(x - \theta(x))\, \mu_V(dx) \\ &\quad + \int H\mu(\theta(x))(x - \theta(x))\, \mu_V(dx) - \iint \log\frac{x-y}{\theta(x) - \theta(y)}\, \mu_V(dx)\, \mu_V(dy) \ge 0. \end{aligned} \tag{4.4} \]
Notice now that from the convexity of $V(x) - \rho x^2$, one obtains that
\[ V(x) - V(\theta(x)) - V'(\theta(x))(x - \theta(x)) \ge \rho\big( x^2 - \theta(x)^2 - 2\theta(x)(x - \theta(x)) \big) = \rho\big( x - \theta(x) \big)^2. \tag{4.5} \]
Now,
\[ \int H\mu(\theta(x))(x - \theta(x))\, \mu_V(dx) = \iint (x - \theta(x))\, \frac{2}{\theta(x) - \theta(y)}\, \mu_V(dy)\, \mu_V(dx) = \iint \left( \frac{x-y}{\theta(x) - \theta(y)} - 1 \right) \mu_V(dx)\, \mu_V(dy), \tag{4.6} \]
where one has to interpret the second integral here in the principal value sense; however, since $\theta$ is increasing, the last integral is actually taken in the Lebesgue sense. Using (4.5) and (4.6), the left-hand side of (4.4) is bounded below by
\[ \frac{1}{4\rho} \int \Big( H\mu(\theta(x)) - V'(\theta(x)) - 2\rho\big( x - \theta(x) \big) \Big)^2\, \mu_V(dx) + \iint \left( \frac{x-y}{\theta(x) - \theta(y)} - 1 - \log\frac{x-y}{\theta(x) - \theta(y)} \right) \mu_V(dx)\, \mu_V(dy) \ge 0, \]
which is seen to hold since $u - 1 - \log(u) \ge 0$ for $u \ge 0$. Equality is attained for the case $V(x) = \rho x^2$ and $\theta(x) = x + c$, which corresponds to the translations of the measure $\mu_V$.
2. With the same arguments used in the above proof and the proof of Theorem 2, we use Eqs. (2.18) and (2.19) to argue that
\[ \begin{aligned} &\frac{k_p}{\rho^{q/p}} \int \big| H\mu(x) - V'(x) \big|^q\, \mu(dx) - E(\mu) + E(\mu_V) \\ &\quad \ge \int \left[ \frac{k_p}{\rho^{q/p}} \big| H\mu(\theta(x)) - V'(\theta(x)) \big|^q + \big( V'(\theta(x)) - H\mu(\theta(x)) \big)(x - \theta(x)) + c_p\, \rho\, \big| x - \theta(x) \big|^p \right] \mu_V(dx) \\ &\qquad + \iint \left( \frac{x-y}{\theta(x) - \theta(y)} - 1 - \log\frac{x-y}{\theta(x) - \theta(y)} \right) \mu_V(dx)\, \mu_V(dy) \\ &\quad \ge 0, \end{aligned} \]
where we used Young's inequality $a^q/q + b^p/p \ge ab$ for $a, b \ge 0$ and the constant $k_p = (p c_p)^{-q/p}/q$. □
Remark 4. It was proved in [24] that a Log-Sobolev inequality always implies a transportation inequality.
5. HWI inequality
This section is devoted to the free analog of the HWI inequality of Otto and Villani [25] in the classical context, which connects the (free) entropy, the Wasserstein distance and the Fisher information. As we will see, the HWI inequality implies the Log-Sobolev inequality for strictly convex potentials. This free HWI inequality was not considered before, and in particular it is not clear whether there is a random matrix proof, since delicate points involving the Wasserstein distance enter into the proof.
Theorem 5 (HWI inequality).
1. Assume that $V$ is $C^2$ and such that for some $\rho \in \mathbb{R}$, $V(x) - \rho x^2$ is convex. Then, for any measure $\mu \in \mathcal{P}(\mathbb{R})$,
\[ E(\mu) - E(\mu_V) \le \sqrt{I(\mu)}\, W_2(\mu, \mu_V) - \rho W_2^2(\mu, \mu_V). \tag{5.1} \]
In the case $V(x) = \rho x^2$, the inequality is sharp.
2. If $V$ is $C^2$ and $V(x) - \rho|x|^p$ is convex for some $\rho \ge 0$ and $p > 1$, then for the same constant $c_p$ appearing in Theorem 2, we have that
\[ E(\mu) - E(\mu_V) \le I_q^{1/q}(\mu)\, W_p(\mu, \mu_V) - \rho c_p\, W_p^p(\mu, \mu_V), \tag{5.2} \]
where $1/p + 1/q = 1$.

Proof. 1. We employ here the notations used in Theorem 4 and we will give a proof of the inequality for the case of a measure $\mu$ with smooth and compactly supported density; the general case follows through the careful approximations pointed out in [18]. The inequality to be proved can be restated as (5.3) + (5.4) + (5.5) $\ge 0$, where
\[ (5.3) = \Big(\int \big(H\mu(\theta(x)) - V'(\theta(x))\big)^2 \mu_V(dx)\Big)^{1/2} \Big(\int (\theta(x) - x)^2 \mu_V(dx)\Big)^{1/2} - \int \big(H\mu(\theta(x)) - V'(\theta(x))\big)(x - \theta(x))\, \mu_V(dx), \tag{5.3} \]
\[ (5.4) = \int \Big(V(x) - V(\theta(x)) - V'(\theta(x))(x - \theta(x)) - \rho\,(\theta(x) - x)^2\Big) \mu_V(dx), \tag{5.4} \]
\[ (5.5) = \int H\mu(\theta(x))(x - \theta(x))\, \mu_V(dx) - \iint \log\frac{x - y}{\theta(x) - \theta(y)}\, \mu_V(dx)\, \mu_V(dy). \tag{5.5} \]
A simple application of Cauchy's inequality shows that (5.3) $\ge 0$. Using the convexity of $V(x) - \rho x^2$ we have from Eq. (4.5) that (5.4) $\ge 0$. Finally, using (4.6), we have that
\[ (5.5) = \iint \Big(\frac{x - y}{\theta(x) - \theta(y)} - 1 - \log\frac{x - y}{\theta(x) - \theta(y)}\Big) \mu_V(dx)\, \mu_V(dy) \ge 0, \]
which finishes the proof of (5.1). For the case $V(x) = \rho x^2$, we have equality if $\theta(x) = x + m$.

2. The inequality we want to prove is equivalent to the statement that (5.6) + (5.7) + (5.8) $\ge 0$, where
\[ (5.6) = \Big(\int \big|H\mu(\theta(x)) - V'(\theta(x))\big|^q \mu_V(dx)\Big)^{1/q} \Big(\int |\theta(x) - x|^p \mu_V(dx)\Big)^{1/p} - \int \big(H\mu(\theta(x)) - V'(\theta(x))\big)(x - \theta(x))\, \mu_V(dx), \tag{5.6} \]
\[ (5.7) = \int \Big(V(x) - V(\theta(x)) - V'(\theta(x))(x - \theta(x)) - \rho c_p\, |\theta(x) - x|^p\Big) \mu_V(dx), \tag{5.7} \]
\[ (5.8) = \int H\mu(\theta(x))(x - \theta(x))\, \mu_V(dx) - \iint \log\frac{x - y}{\theta(x) - \theta(y)}\, \mu_V(dx)\, \mu_V(dy). \tag{5.8} \]
Now, (5.6) is non-negative thanks to Hölder's inequality, the non-negativity of (5.7) follows from the convexity of $V(x) - \rho|x|^p$ and the combination of (2.18) and (2.19), while (5.8) is the same as (5.5). □
As pointed out in [25], HWI inequalities for $\rho > 0$ always imply Log-Sobolev. We give here the following formal corollary of the HWI inequality.

Corollary 1.
1. If $\rho > 0$, then inequality (5.1) implies (4.2) and (5.2) implies (4.1).
2. If $V(x) - \rho x^2$ is convex for some $\rho \in \mathbb{R}$, then Talagrand's free transportation inequality with constant $C > \max\{0, -\rho\}$ implies the free Log-Sobolev inequality with constant $K = \max\{\rho, \frac{(C+\rho)^2}{32C}\}$. More precisely,
\[ \forall \mu \in \mathcal{P}(\mathbb{R}),\quad C\, W_2^2(\mu, \mu_V) \le E(\mu) - E(\mu_V) \quad\Longrightarrow\quad \forall \mu \in \mathcal{P}(\mathbb{R}),\quad E(\mu) - E(\mu_V) \le \frac{1}{4K}\, I(\mu). \]
3. In particular, if $V$ is convex and $C^2$ such that $V''(x) \ge \rho > 0$ for $|x| \ge r$, then the free Log-Sobolev inequality holds with the constant $C > 0$ from (2.10).

Proof. 1. It follows as an application of Young's inequality $a^p/p + b^q/q \ge ab$ for $a, b \ge 0$.
2. For $\rho > 0$, everything is clear. In the case $\rho \le 0$, from (5.1) and Talagrand's transportation inequality, one has for $\delta > 0$ that
\[ E(\mu) - E(\mu_V) \le \sqrt{I(\mu)}\, W_2(\mu, \mu_V) - \rho\, W_2^2(\mu, \mu_V) \le 4\delta\, I(\mu) + \Big(\frac{1}{C\delta} - \frac{\rho}{C}\Big)\big(E(\mu) - E(\mu_V)\big), \]
which yields for any $\delta > \frac{1}{C+\rho}$
\[ E(\mu) - E(\mu_V) \le \frac{4C\delta^2}{(C+\rho)\delta - 1}\, I(\mu). \]
Taking the minimum over $\delta > \frac{1}{C+\rho}$ gives the conclusion.
3. In the case $V$ is convex, $C^2$ and strongly convex for large values, part 2 of Theorem 2 does the rest. □
6. Brunn–Minkowski inequality

The (one-dimensional) free Brunn–Minkowski inequality was put forward in [24], again through random matrix approximation. We provide here a direct mass transportation proof similar to the one of its classical (one-dimensional) counterpart (see e.g. [12]). As discussed in [24], this inequality may be used to deduce in an easy way both the Log-Sobolev and transportation inequalities. The main result of this section is the following theorem.

Theorem 6. Assume that $V_1, V_2, V_3$ are some potentials satisfying (2.1) such that for some $a \in (0, 1)$,
\[ a V_1(x) + (1 - a) V_2(y) \ge V_3\big(ax + (1 - a)y\big) \quad\text{for all } x, y \in \mathbb{R}. \tag{6.1} \]
Then
\[ a E_{V_1}(\mu_{V_1}) + (1 - a) E_{V_2}(\mu_{V_2}) \ge E_{V_3}(\mu_{V_3}). \tag{6.2} \]

Proof. Take the (increasing) transportation map $\theta$ from $\mu_{V_1}$ into $\mu_{V_2}$. This certainly exists as the measure $\mu_{V_1}$ has no atoms. Notice that for any measure with finite logarithmic energy, we have the obvious equality
\[ \iint \log|x - y|\, \mu(dx)\, \mu(dy) = 2 \iint_{x > y} \log(x - y)\, \mu(dx)\, \mu(dy). \]
Using this, and the fact that $\theta$ pushes $\mu_{V_1}$ forward to $\mu_{V_2}$, we argue that
\[ \begin{aligned}
a E_{V_1}(\mu_{V_1}) + (1 - a) E_{V_2}(\mu_{V_2})
&= \int \big(a V_1(x) + (1 - a) V_2(\theta(x))\big)\, \mu_{V_1}(dx) \\
&\quad - 2 \iint_{x > y} \big(a \log(x - y) + (1 - a) \log(\theta(x) - \theta(y))\big)\, \mu_{V_1}(dx)\, \mu_{V_1}(dy) \\
&\ge \int V_3\big(ax + (1 - a)\theta(x)\big)\, \mu_{V_1}(dx) \\
&\quad - 2 \iint_{x > y} \log\Big(ax + (1 - a)\theta(x) - \big(ay + (1 - a)\theta(y)\big)\Big)\, \mu_{V_1}(dx)\, \mu_{V_1}(dy) \\
&= E_{V_3}(\nu) \ge E_{V_3}(\mu_{V_3}),
\end{aligned} \]
where $\nu = (a\,\mathrm{id} + (1 - a)\theta)_\# \mu_{V_1}$ and we used (6.1) and the concavity of the logarithm on $(0, \infty)$. The proof is complete. □

7. Random matrices and a first version of Poincaré inequality

In the next three sections, we investigate Poincaré type inequalities in the free (one-dimensional) context. We discuss two versions of it. The first one is suggested by large matrix approximations and the classical Poincaré inequality for strictly convex potentials, but will be proved directly. Recall first the classical Poincaré inequality (cf. e.g. [2,23,29,32]).

Theorem 7. Let $\mu(dx) = e^{-W(x)}\,dx$ be a probability measure on $\mathbb{R}^d$ such that $W(x) - r|x|^2$ is convex. Then for any compactly supported and smooth function $\phi : \mathbb{R}^d \to \mathbb{R}$, we have that
\[ \int |\nabla\phi|^2\, d\mu \ge r\, \mathrm{Var}_\mu(\phi). \tag{7.1} \]
Assume now that $V$ is a potential on $\mathbb{R}$ with enough growth at infinity. Consider the matrix models on $H_n$, the space of Hermitian $n \times n$ matrices with the inner product $\langle A, B\rangle = \mathrm{Tr}(AB^*)$ and the probability measure given by
\[ P_n(dM) = \frac{1}{Z_n(V)}\, e^{-n\,\mathrm{Tr}(V(M))}\, dM, \]
where $dM$ is the standard Lebesgue measure on $H_n$. We have that for any bounded continuous function $F : \mathbb{R} \to \mathbb{R}$,
\[ \int \frac{1}{n}\, \mathrm{Tr}\, F(M)\, P_n(dM) \xrightarrow[n \to \infty]{} \int F(x)\, \mu_V(dx). \tag{7.2} \]
Assume in addition that $V(x) - \rho x^2$ is a convex function on $\mathbb{R}$. Then, consider $\Phi(M) = \mathrm{Tr}\, \phi(M)$, where $\phi : \mathbb{R} \to \mathbb{R}$ is a compactly supported and smooth function. Notice that $\nabla\Phi(M) = \phi'(M)$ and thus $|\nabla\Phi(M)|^2 = |\phi'(M)|^2 = \mathrm{Tr}(\phi'(M)^2)$. Since $n\,\mathrm{Tr}(V(M)) - n\rho|M|^2$ is convex as a function of $M$, we can apply Poincaré's inequality on $H_n$ to obtain that
\[ \int \mathrm{Tr}\big(\phi'(M)^2\big)\, P_n(dM) \ge n\rho\, \mathrm{Var}_{P_n}\big(\mathrm{Tr}\, \phi(M)\big). \tag{7.3} \]
The first term in this inequality divided by $n$ (cf. Eq. (7.2)) converges to $\int \phi'(x)^2 \mu_V(dx)$. To understand the second term in the above equation, notice that $\mathrm{Var}(\mathrm{Tr}(\phi(M))) = E[(\mathrm{Tr}(\phi(M)) - E[\mathrm{Tr}(\phi(M))])^2]$. The study of the asymptotics of the linear statistics $\mathrm{Tr}(\phi(M)) - E[\mathrm{Tr}(\phi(M))]$ is known in the random matrix literature as "fluctuations". From Johansson's paper [19], it is known that this is universal in the sense that the limit in distribution of the fluctuations is Gaussian and, at least in the case of polynomial $V$ (for which $V(x) - \rho x^2$ fulfills the conditions in there), the variance of the Gaussian limit depends only on the endpoints of the support of $\mu_V$. Moreover, in the particular case of $V(x) = 2x^2$, the variance of the distribution was computed for example in [22] and [19] as
\[ \frac{1}{2\pi^2} \int_{-1}^{1}\int_{-1}^{1} \Big(\frac{\phi(t) - \phi(s)}{t - s}\Big)^2 \frac{1 - ts}{\sqrt{1 - t^2}\,\sqrt{1 - s^2}}\, dt\, ds. \tag{7.4} \]
This variance is interpreted in [8] in terms of the number operator of the arcsine law. We will come back to this aspect in Section 9. Dividing the inequality in Eq. (7.3) by $n$ and taking the limit when $n \to \infty$, these heuristics (after a simple rescaling) suggest the following result.

Theorem 8. Assume that $V(x) - \rho x^2$ is convex for some $\rho > 0$. Then for any smooth function $\phi$, one has that
\[ \int \phi'(x)^2\, \mu_V(dx) \ge \frac{\rho}{2\pi^2} \int_a^b\int_a^b \Big(\frac{\phi(x) - \phi(y)}{x - y}\Big)^2 \frac{-2ab + (a + b)(x + y) - 2xy}{2\sqrt{(x - a)(b - x)}\,\sqrt{(y - a)(b - y)}}\, dx\, dy, \tag{7.5} \]
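where $\mathrm{supp}(\mu_V) = [a, b]$. Equality is attained for $V(x) = \rho(x - \alpha)^2 + \beta$ and $\phi(x) = c_1 + c_2 x$ for some constants $c_1, c_2$.

The reader may wonder if the numerator in the second fraction of (7.5) is nonnegative. This is so because
\[ -2ab + (a + b)(x + y) - 2xy = 2\Big[\Big(\frac{b - a}{2}\Big)^2 - \Big(x - \frac{a + b}{2}\Big)\Big(y - \frac{a + b}{2}\Big)\Big] \ge 0 \]
for any $x, y \in [a, b]$.

Proof. Using a simple rescaling we may assume without loss of generality that $a = -1$ and $b = 1$, and the inequality we have to show reduces to
\[ \int \phi'(x)^2\, \mu_V(dx) \ge \frac{\rho}{2\pi^2} \int_{-1}^{1}\int_{-1}^{1} \Big(\frac{\phi(x) - \phi(y)}{x - y}\Big)^2 \frac{1 - xy}{\sqrt{1 - x^2}\,\sqrt{1 - y^2}}\, dx\, dy. \tag{7.6} \]
Then, based on Eq. (2.5), the density $g$ of $\mu_V$ is given by
\[ g(x) = \frac{\sqrt{1 - x^2}}{2\pi^2} \int_{-1}^{1} \frac{V'(y) - V'(x)}{\sqrt{1 - y^2}\,(y - x)}\, dy. \]
From the convexity of $V(x) - \rho x^2$, we learn that $\frac{V'(y) - V'(x)}{y - x} \ge 2\rho$ and thus that
\[ g(x) \ge \frac{\rho}{\pi}\sqrt{1 - x^2}, \tag{7.7} \]
which implies
\[ \int \phi'(x)^2\, \mu_V(dx) \ge \frac{\rho}{\pi} \int_{-1}^{1} \phi'(x)^2 \sqrt{1 - x^2}\, dx. \]
Therefore it is enough to check that
\[ \int_{-1}^{1} \phi'(x)^2 \sqrt{1 - x^2}\, dx \ge \frac{1}{2\pi} \int_{-1}^{1}\int_{-1}^{1} \Big(\frac{\phi(x) - \phi(y)}{x - y}\Big)^2 \frac{1 - xy}{\sqrt{1 - x^2}\,\sqrt{1 - y^2}}\, dx\, dy \tag{7.8} \]
for any smooth $\phi$. Now, we make the change of variables $x = \cos t$ to justify
\[ \int_{-1}^{1} \phi'(x)^2 \sqrt{1 - x^2}\, dx = \int_0^\pi \phi'(\cos t)^2 \sin^2(t)\, dt = \int_0^\pi \psi'(t)^2\, dt, \]
where $\psi(t) = \phi(\cos t)$.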
On the other hand, using the change of variable $x = \cos t$, $y = \cos s$ on the right-hand side, inequality (7.8) becomes
\[ \int_0^\pi \psi'(t)^2\, dt \ge \frac{1}{2\pi} \int_0^\pi\int_0^\pi \Big(\frac{\psi(t) - \psi(s)}{\cos t - \cos s}\Big)^2 (1 - \cos t \cos s)\, dt\, ds. \tag{7.9} \]
To show this, we write $\psi(t) = \sum_{k=0}^\infty a_k \cos kt$ and then, because $\psi$ is a smooth function, we can differentiate term by term to get $\psi'(t) = -\sum_{k=1}^\infty k a_k \sin kt$, therefore
\[ \int_0^\pi \psi'(t)^2\, dt = \frac{\pi}{2} \sum_{k=1}^\infty k^2 a_k^2 \]
and
\[ \int_0^\pi\int_0^\pi \Big(\frac{\psi(t) - \psi(s)}{\cos t - \cos s}\Big)^2 (1 - \cos t \cos s)\, dt\, ds = \sum_{k,l=1}^\infty a_k a_l \int_0^\pi\int_0^\pi \frac{(\cos kt - \cos ks)(\cos lt - \cos ls)(1 - \cos t \cos s)}{(\cos t - \cos s)^2}\, dt\, ds. \]
To compute the integrals on the right-hand side of the above equation, we take the generating function of these numbers, and with a little algebra one can show that
\[ \begin{aligned}
&\sum_{k,l=1}^\infty u^k v^l \int_0^\pi\int_0^\pi \frac{(\cos kt - \cos ks)(\cos lt - \cos ls)(1 - \cos t \cos s)}{(\cos t - \cos s)^2}\, dt\, ds \\
&\quad = \int_0^\pi\int_0^\pi \frac{(u - u^3)(v - v^3)(1 - \cos t \cos s)}{(1 + u^2 - 2u\cos t)(1 + u^2 - 2u\cos s)(1 + v^2 - 2v\cos t)(1 + v^2 - 2v\cos s)}\, dt\, ds \\
&\quad = \frac{\pi^2 uv}{(1 - uv)^2} = \pi^2 \sum_{k=1}^\infty k\, u^k v^k
\end{aligned} \tag{7.10} \]
for all $u, v \in (-1, 1)$. The last integral can be computed as follows. First use partial fractions, together with $\int_0^\pi \frac{dt}{1 + u^2 - 2u\cos t} = \frac{\pi}{1 - u^2}$ for $|u| < 1$, to justify
\[ \int_0^\pi \frac{(A + B\cos t)\, dt}{(1 + u^2 - 2u\cos t)(1 + v^2 - 2v\cos t)} = \int_0^\pi \frac{C\, dt}{1 + u^2 - 2u\cos t} + \int_0^\pi \frac{D\, dt}{1 + v^2 - 2v\cos t} = \frac{\pi C}{1 - u^2} + \frac{\pi D}{1 - v^2}, \]
where the constants $C, D$ are linear combinations of $A$ and $B$. Further, taking $A = 1$ and $B = -\cos s$ and repeating once more the partial fractions argument, one can carry out the proof of (7.10). The main consequence of the above calculation is that
\[ \int_0^\pi\int_0^\pi \frac{(\cos kt - \cos ks)(\cos lt - \cos ls)(1 - \cos t \cos s)}{(\cos t - \cos s)^2}\, dt\, ds = \pi^2 k\, \delta_{kl} \]
and that
\[ \int_0^\pi\int_0^\pi \Big(\frac{\psi(t) - \psi(s)}{\cos t - \cos s}\Big)^2 (1 - \cos t \cos s)\, dt\, ds = \pi^2 \sum_{k=1}^\infty k\, a_k^2. \tag{7.11} \]
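The orthogonality relation behind (7.11) can be checked numerically. The sketch below is not part of the original argument: the function names are hypothetical, and it assumes the product-to-sum identity $\cos kt - \cos ks = -2\sin\frac{k(t+s)}{2}\sin\frac{k(t-s)}{2}$ to evaluate the divided difference stably on the diagonal $t = s$. Since the integrand is a trigonometric polynomial in $t$ and $s$, the midpoint rule on $[0, \pi]$ should reproduce the integral up to rounding once the number of nodes exceeds the degree.

```python
import math

def dirichlet_ratio(k, x):
    """sin(k x) / sin(x), with the removable singularity at sin(x) = 0 filled in."""
    s = math.sin(x)
    if abs(s) < 1e-12:
        # On the grid used below this only happens at x = 0, where the limit is k.
        return float(k)
    return math.sin(k * x) / s

def divided_difference(k, t, s):
    """(cos kt - cos ks) / (cos t - cos s), rewritten via product-to-sum identities."""
    return dirichlet_ratio(k, (t + s) / 2) * dirichlet_ratio(k, (t - s) / 2)

def kernel_integral(k, l, n=64):
    """Midpoint-rule value of the double integral over [0, pi]^2 appearing before (7.11)."""
    h = math.pi / n
    nodes = [(i + 0.5) * h for i in range(n)]
    total = 0.0
    for t in nodes:
        for s in nodes:
            weight = 1.0 - math.cos(t) * math.cos(s)
            total += divided_difference(k, t, s) * divided_difference(l, t, s) * weight
    return total * h * h

if __name__ == "__main__":
    for k in range(1, 4):
        for l in range(1, 4):
            # Compare against pi^2 * k * delta_{kl}.
            print(k, l, kernel_integral(k, l), math.pi ** 2 * k * (k == l))
```

For small $k, l$ the computed values should agree with $\pi^2 k\,\delta_{kl}$ to machine precision, which is the diagonalization used to pass from (7.9) to a comparison of the sums $\sum k^2 a_k^2$ and $\sum k a_k^2$.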
Therefore inequality (7.9) becomes equivalent to
\[ \frac{\pi}{2} \sum_{k=1}^\infty k^2 a_k^2 \ge \frac{\pi}{2} \sum_{k=1}^\infty k\, a_k^2, \]
which is obviously true. Notice that equality in this inequality is attained for the case $a_k = 0$ for all $k \ge 2$ and arbitrary $a_1$. This corresponds to the case $\psi(t) = c_1 + c_2 \cos t$ or $\phi(x) = c_2 x + c_1$ for some $c_1, c_2$. Finally we point out that equality in (7.6) is attained if equality is attained in (7.7) and (7.9). From there one can easily see from rescaling that equality in (7.5) is attained for $V(x) = \rho(x - \alpha)^2 + \beta$ and $\phi(x) = c_1 + c_2 x$. The proof of Theorem 8 is complete. □

In the above proof we showed a direct calculation for Eq. (7.11) which is natural in the course of the proof. However, there is another way of looking at it which will appear below in Section 9 as the kernel of the number operator.

8. A second version of Poincaré inequality

The second version of the Poincaré inequality is motivated by the free calculus and the noncommutative derivative. It was already investigated by Biane [3] for the case of the semicircular law.

Definition 1. For a given probability measure $\mu$ on $\mathbb{R}$, we say that it satisfies a Poincaré inequality if there is a constant $C > 0$ such that
\[ \iint \Big(\frac{\phi(x) - \phi(y)}{x - y}\Big)^2 \mu(dx)\, \mu(dy) \ge C\, \mathrm{Var}_\mu(\phi) \quad\text{for every } \phi \in C_0^1(\mathbb{R}). \tag{8.1} \]
By the best constant we mean the largest $C > 0$ for which the above inequality is satisfied and we denote it by $\mathrm{Poin}(\mu)$ or $\lambda_1(\mu)$ or $\mathrm{SG}(\mu)$.
In the noncommutative setting, for a given function $\phi$, we can think of $D\phi(x, y) = \frac{\phi(x) - \phi(y)}{x - y}$ as the noncommutative derivative of $\phi$. As pointed out by Voiculescu in [30], this is the unique map $D : \mathbb{C}\langle x\rangle \to \mathbb{C}\langle x\rangle \otimes \mathbb{C}\langle x\rangle$ such that
1. $D1 = 0$.
2. $D(fg) = D(f)g + fD(g)$ for any $f, g \in \mathbb{C}\langle x\rangle$.

First we collect a couple of obvious properties of the Poincaré constant.

Proposition 2.
1. For any $a \ne 0$,
\[ \mathrm{Poin}\big((ax + b)_\# \mu\big) = \frac{1}{a^2}\, \mathrm{Poin}(\mu), \]
where here and elsewhere, for a given function $f : \mathbb{R} \to \mathbb{R}$, $f_\# \mu$ is the push forward measure given by $(f_\# \mu)(A) = \mu(f^{-1}(A))$.
2. If $f : \mathbb{R} \to \mathbb{R}$ is a differentiable map such that $|f'(x)| \ge c > 0$ for all $x \in \mathbb{R}$, then $\mathrm{Poin}(\mu) \ge c^2\, \mathrm{Poin}(f_\# \mu)$.
3. If $\{\mu_n\}_{n \ge 1}$ is a sequence of probability measures which converges weakly to $\mu$, then
\[ \mathrm{Poin}(\mu) \ge \limsup_{n \to \infty} \mathrm{Poin}(\mu_n). \]

Next we describe some bounds for the Poincaré constant.

Theorem 9. Assume that the measure $\mu$ has compact support and is not concentrated at one point. Then $\mu$ satisfies a Poincaré inequality with
\[ \frac{2}{d^2(\mu)} \le \mathrm{Poin}(\mu) \le \frac{1}{\mathrm{Var}(\mu)}, \tag{8.2} \]
where $d(\mu) = \mathrm{diam}(\mathrm{supp}(\mu))$ is the diameter of the support of $\mu$ and $\mathrm{Var}(\mu) = \int x^2 \mu(dx) - \big(\int x\, \mu(dx)\big)^2$. Equality on the left in (8.2) is attained only for the case
\[ \mu = \alpha\delta_a + (1 - \alpha)\delta_b, \qquad a < b,\quad 0 < \alpha < 1. \]
Equality on the right of (8.2) is attained only for the case of a semicircular law ($a \in \mathbb{R}$, $r > 0$)
\[ \mu(dx) = \frac{1}{2\pi r^2}\, 1_{[a - 2r,\, a + 2r]}(x)\, \sqrt{4r^2 - (x - a)^2}\, dx. \]
In addition, assume that $V$ is a $C^2$ potential on $\mathbb{R}$ such that for some integer $p$ and real $\rho > 0$, $V(x) - \rho x^{2p}$ is convex and $\mu$ is the minimizer of
\[ \int V(x)\, \mu(dx) - \iint \log|x - y|\, \mu(dx)\, \mu(dy) \]
over all probability measures on $\mathbb{R}$. Then
\[ \frac{1}{8}\Big(p\rho \binom{2p}{p}\Big)^{1/p} \le \mathrm{Poin}(\mu). \tag{8.3} \]
In particular, if $p = 1$, we get that $\frac{\rho}{4} \le \mathrm{Poin}(\mu)$.
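Proof. For a given function $\phi \in C_0^1(\mathbb{R})$, the left-hand side of (8.2) follows from
\[ \begin{aligned}
\mathrm{Var}_\mu(\phi) &= \frac{1}{2} \iint \big(\phi(x) - \phi(y)\big)^2 \mu(dx)\, \mu(dy) \\
&= \frac{1}{2} \iint (x - y)^2 \Big(\frac{\phi(x) - \phi(y)}{x - y}\Big)^2 \mu(dx)\, \mu(dy)
\le \frac{d^2(\mu)}{2} \iint \Big(\frac{\phi(x) - \phi(y)}{x - y}\Big)^2 \mu(dx)\, \mu(dy).
\end{aligned} \tag{8.4} \]
The right-hand side of (8.2) follows from (8.1) for a $\phi \in C_0^1(\mathbb{R})$ such that $\phi(x) = x$ on the support of $\mu$. For measures $\mu = \alpha\delta_a + (1 - \alpha)\delta_b$, condition (8.1) is equivalent to
\[ C\alpha(1 - \alpha)\big(\phi(b) - \phi(a)\big)^2 \le \alpha^2 \phi'(a)^2 + (1 - \alpha)^2 \phi'(b)^2 + 2\alpha(1 - \alpha)\Big(\frac{\phi(b) - \phi(a)}{b - a}\Big)^2 \quad\text{for any } \phi \in C_0^1(\mathbb{R}). \]
Since for any function $\phi \in C_0^\infty(\mathbb{R})$ we can find another function $\psi \in C_0^1(\mathbb{R})$ so that $\phi(a) = \psi(a)$, $\phi(b) = \psi(b)$ and $\psi'(a) = 0$, $\psi'(b) = 0$, this is also equivalent to
\[ C\alpha(1 - \alpha)\big(\psi(b) - \psi(a)\big)^2 \le 2\alpha(1 - \alpha)\Big(\frac{\psi(b) - \psi(a)}{b - a}\Big)^2 \quad\text{for any } \psi \in C_0^1(\mathbb{R}). \]
This amounts to $C \le 2/(b - a)^2$ and therefore, in this case, $\mathrm{Poin}(\mu) = \frac{2}{d^2(\mu)}$.

Conversely, if $\mu$ is a measure so that $\mathrm{Poin}(\mu) = \frac{2}{d^2(\mu)}$, then, for $1 > \epsilon > 0$, there is a function $\phi_\epsilon \in C_0^1(\mathbb{R})$ such that
\[ \Big(\frac{2}{d^2(\mu)} + \epsilon^2\Big) \mathrm{Var}_\mu(\phi_\epsilon) > \iint \Big(\frac{\phi_\epsilon(x) - \phi_\epsilon(y)}{x - y}\Big)^2 \mu(dx)\, \mu(dy). \]
Without loss of generality we can assume that $0 = \inf \mathrm{supp}(\mu)$, $1 = \sup \mathrm{supp}(\mu)$ and $\int \phi_\epsilon\, d\mu = 0$, $\int \phi_\epsilon^2\, d\mu = 1$, where we recall that $\mathrm{supp}(\mu)$ stands for the support of $\mu$. In this case, the above inequality implies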
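\[ \begin{aligned}
2 + \epsilon^2 &\ge \iint_{|x - y| \ge 1 - \epsilon} \Big(\frac{\phi_\epsilon(x) - \phi_\epsilon(y)}{x - y}\Big)^2 \mu(dx)\, \mu(dy) + \iint_{|x - y| < 1 - \epsilon} \Big(\frac{\phi_\epsilon(x) - \phi_\epsilon(y)}{x - y}\Big)^2 \mu(dx)\, \mu(dy) \\
&\ge \iint_{|x - y| \ge 1 - \epsilon} \big(\phi_\epsilon(x) - \phi_\epsilon(y)\big)^2 \mu(dx)\, \mu(dy) + \frac{1}{(1 - \epsilon)^2} \iint_{|x - y| < 1 - \epsilon} \big(\phi_\epsilon(x) - \phi_\epsilon(y)\big)^2 \mu(dx)\, \mu(dy) \\
&= -\frac{\epsilon(2 - \epsilon)}{(1 - \epsilon)^2} \iint_{|x - y| \ge 1 - \epsilon} \big(\phi_\epsilon(x) - \phi_\epsilon(y)\big)^2 \mu(dx)\, \mu(dy) + \frac{2}{(1 - \epsilon)^2},
\end{aligned} \]
which results in
\[ \iint_{|x - y| \ge 1 - \epsilon} \big(\phi_\epsilon(x) - \phi_\epsilon(y)\big)^2 \mu(dx)\, \mu(dy) \ge 2 - \frac{\epsilon(1 - \epsilon)^2}{2 - \epsilon}. \tag{8.5} \]
Now,
\[ \iint_{|x - y| \ge 1 - \epsilon} \big(\phi_\epsilon(x) - \phi_\epsilon(y)\big)^2 \mu(dx)\, \mu(dy) \le \iint_{\substack{|x - 1/2| \ge 1/2 - \epsilon \\ |y - 1/2| \ge 1/2 - \epsilon}} \big(\phi_\epsilon(x) - \phi_\epsilon(y)\big)^2 \mu(dx)\, \mu(dy) \le 2\mu\big(|x - 1/2| \ge 1/2 - \epsilon\big). \tag{8.6} \]
Thus (8.5) and (8.6) give
\[ \mu\big(|x - 1/2| \ge 1/2 - \epsilon\big) \ge 1 - \frac{\epsilon(1 - \epsilon)^2}{4 - 2\epsilon} \quad\text{for any } 1 > \epsilon > 0. \]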
This shows that $\mu((0, 1)) = 0$ and therefore $\mu = \alpha\delta_0 + (1 - \alpha)\delta_1$.

The other extreme case of inequality (8.2) is contained in Biane's paper [3] in the more general context of several noncommutative variables. For completeness we will provide here a self-contained proof. In the first place, using Proposition 8.1, we may assume that
\[ \mu(dx) = \frac{1}{2\pi}\, 1_{[-2, 2]}(x)\, \sqrt{4 - x^2}\, dx \]
is the semicircular law on $[-2, 2]$. Take $U_n$ to be the Chebyshev polynomials of the second kind, defined by $U_n(\cos\theta) = \frac{\sin(n + 1)\theta}{\sin\theta}$. With this choice, $U_n(\frac{x}{2})$ are the orthogonal polynomials with respect to $\mu$. The generating function of $U_n$ is given by
\[ \sum_{n=0}^\infty r^n U_n(x) = \frac{1}{1 - 2rx + r^2} \quad\text{for } |x|, |r| < 1, \]
from which one gets
\[ \sum_{n=0}^\infty r^n\, \frac{U_n(x) - U_n(y)}{x - y} = \frac{2r}{(1 - 2rx + r^2)(1 - 2ry + r^2)} = 2 \sum_{n=0}^\infty r^n \sum_{k=0}^{n-1} U_k(x) U_{n-1-k}(y), \]
and then
\[ \frac{U_n(x) - U_n(y)}{x - y} = 2 \sum_{k=0}^{n-1} U_k(x) U_{n-1-k}(y). \tag{8.7} \]
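Identity (8.7) is easy to test pointwise. The following sketch is only an illustration (the function names are ours, not the paper's): it evaluates $U_n$ through the standard recurrence $U_{n+1}(x) = 2xU_n(x) - U_{n-1}(x)$ and compares both sides of (8.7) at a pair of sample points.

```python
def chebyshev_u(n, x):
    """Chebyshev polynomial of the second kind U_n(x), via the three-term recurrence."""
    u_prev, u_curr = 1.0, 2.0 * x  # U_0 and U_1
    if n == 0:
        return u_prev
    for _ in range(n - 1):
        u_prev, u_curr = u_curr, 2.0 * x * u_curr - u_prev
    return u_curr

def divided_difference_u(n, x, y):
    """Left-hand side of (8.7): (U_n(x) - U_n(y)) / (x - y), for x != y."""
    return (chebyshev_u(n, x) - chebyshev_u(n, y)) / (x - y)

def convolution_sum_u(n, x, y):
    """Right-hand side of (8.7): 2 * sum_{k=0}^{n-1} U_k(x) U_{n-1-k}(y)."""
    return 2.0 * sum(chebyshev_u(k, x) * chebyshev_u(n - 1 - k, y) for k in range(n))

if __name__ == "__main__":
    x, y = 0.3, -0.7  # arbitrary sample points in (-1, 1)
    for n in range(7):
        print(n, divided_difference_u(n, x, y), convolution_sum_u(n, x, y))
```

The two columns should agree up to rounding for every $n$; for $n = 0$ both sides vanish, the sum on the right being empty.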
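Now, for a given $\phi \in C_0^1(\mathbb{R})$, we can write in the $L^2(\mu)$ sense
\[ \phi(x) = \sum_{n=0}^\infty \alpha_n U_n\Big(\frac{x}{2}\Big), \]
yielding from orthogonality and (8.7) that
\[ \mathrm{Var}_\mu(\phi) = \int \phi^2\, d\mu - \Big(\int \phi\, d\mu\Big)^2 = \sum_{n=1}^\infty \alpha_n^2 \quad\text{and}\quad \iint \Big(\frac{\phi(x) - \phi(y)}{x - y}\Big)^2 \mu(dx)\, \mu(dy) = \sum_{n=1}^\infty n\, \alpha_n^2. \]
It follows that in this case $\mathrm{Poin}(\mu) = 1 = 1/\mathrm{Var}(\mu)$ and equality is attained only for $\phi(x) = c_1 + c_2 U_1(x) = c_1 + c_2 x$ for some constants $c_1, c_2$.

To prove the converse, take a compactly supported measure $\mu$ and assume that $\int x\, \mu(dx) = 0$ and $\int x^2 \mu(dx) = 1$. In order to show that $\mu$ is the semicircular distribution, it suffices to show that $\int U_n(\frac{x}{2})\, \mu(dx) = 0$ for all $n \ge 1$. We use induction for this task. Assuming the claim true for $U_1, U_2, \ldots, U_n$, and using $U_{n+1}(x) = 2xU_n(x) - U_{n-1}(x)$, we need to show that $xU_n(\frac{x}{2})$ integrates to 0 against $\mu$. Applying Poincaré's inequality to $U_n(\frac{x}{2}) + rU_1(\frac{x}{2})$ together with the induction hypothesis and equation (8.7), we get that for any $r \in \mathbb{R}$,
\[ \int U_n^2\Big(\frac{x}{2}\Big)\, \mu(dx) + 2r \int x\, U_n\Big(\frac{x}{2}\Big)\, \mu(dx) \le \iint \Big(\frac{U_n(\frac{x}{2}) - U_n(\frac{y}{2})}{x - y}\Big)^2 \mu(dx)\, \mu(dy), \]
which implies that
\[ \int x\, U_n\Big(\frac{x}{2}\Big)\, \mu(dx) = 0. \]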
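In the case of the equilibrium measure of a convex potential $V$, the support of the measure consists of one interval $[a, b]$, and $a, b$ solve the system (cf. Eq. (2.4))
\[ \frac{1}{2\pi} \int_a^b V'(x) \sqrt{\frac{x - a}{b - x}}\, dx = 1 \quad\text{and}\quad \frac{1}{2\pi} \int_a^b V'(x) \sqrt{\frac{b - x}{x - a}}\, dx = -1. \]
If we denote $c = (b - a)/2$ and $\beta = (a + b)/2$, the system above can be rewritten in terms of $\beta$ and $c$ as
\[ \frac{c}{2\pi} \int_{-1}^{1} V'(\beta + ct)\, \frac{1 + t}{\sqrt{1 - t^2}}\, dt = 1 \quad\text{and}\quad \frac{c}{2\pi} \int_{-1}^{1} V'(\beta + ct)\, \frac{1 - t}{\sqrt{1 - t^2}}\, dt = -1, \]
which is equivalent to
\[ \frac{c}{2\pi} \int_{-1}^{1} V'(\beta + ct)\, \frac{t}{\sqrt{1 - t^2}}\, dt = 1 \quad\text{and}\quad \int_{-1}^{1} V'(\beta + ct)\, \frac{1}{\sqrt{1 - t^2}}\, dt = 0. \]
Since $V$ is $C^2$, the first equation can be integrated by parts to get that
\[ \frac{c^2}{2\pi} \int_{-1}^{1} V''(\beta + ct) \sqrt{1 - t^2}\, dt = 1. \]
On the other hand we know that $V''(x) \ge 2p(2p - 1)\rho x^{2p - 2}$, hence
\[ \begin{aligned}
1 &\ge \frac{2p(2p - 1)\rho c^2}{2\pi} \int_{-1}^{1} (ct + \beta)^{2p - 2} \sqrt{1 - t^2}\, dt
\ge \frac{2p(2p - 1)\rho c^{2p}}{2\pi} \int_{-1}^{1} t^{2p - 2} \sqrt{1 - t^2}\, dt \\
&= \frac{p(2p - 1)\rho c^{2p} \binom{2p}{p}}{4^p (2p - 1)} = \frac{p\rho \binom{2p}{p} c^{2p}}{4^p}.
\end{aligned} \]
This yields
\[ c \le 2 \Big(p\rho \binom{2p}{p}\Big)^{-\frac{1}{2p}}. \]
Finally, because $d(\mu) = b - a = 2c$, we arrive at (8.3). □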
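The closed form used in the last display, $\int_{-1}^{1} t^{2p-2}\sqrt{1 - t^2}\, dt = \pi\binom{2p}{p} / (4^p (2p - 1))$, and the resulting constant in (8.3) can be sanity-checked numerically. The sketch below is an illustration under our own naming; it uses the substitution $t = \cos\theta$, for which the midpoint rule is exact on trigonometric polynomials of low degree.

```python
import math

def moment_integral(p, n=256):
    """Midpoint rule for int_{-1}^{1} t^(2p-2) sqrt(1 - t^2) dt after t = cos(theta)."""
    h = math.pi / n
    total = 0.0
    for i in range(n):
        theta = (i + 0.5) * h
        total += math.cos(theta) ** (2 * p - 2) * math.sin(theta) ** 2
    return total * h

def moment_closed_form(p):
    """Closed form pi * binom(2p, p) / (4^p (2p - 1)) used in the proof."""
    return math.pi * math.comb(2 * p, p) / (4 ** p * (2 * p - 1))

def half_width_bound(p, rho):
    """Upper bound on c = (b - a)/2 obtained at the end of the proof."""
    return 2.0 * (p * rho * math.comb(2 * p, p)) ** (-1.0 / (2 * p))

def poincare_lower_bound(p, rho):
    """Left-hand side of (8.3)."""
    return (p * rho * math.comb(2 * p, p)) ** (1.0 / p) / 8.0

if __name__ == "__main__":
    for p in range(1, 5):
        print(p, moment_integral(p), moment_closed_form(p))
    rho = 3.0  # arbitrary sample value
    c = half_width_bound(1, rho)
    # 2 / d^2 with d = 2c should reproduce the bound rho / 4 when p = 1.
    print(2.0 / (2.0 * c) ** 2, poincare_lower_bound(1, rho), rho / 4.0)
```

The last line illustrates how (8.3) follows from $\mathrm{Poin}(\mu) \ge 2/d^2(\mu)$ with $d(\mu) = 2c$.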
To conclude this section, we present an inequality which relates the equilibrium measure of a strongly convex potential and the arcsine law.

Theorem 10. Assume that $V(x) - \rho x^2$ is convex for some $\rho > 0$ and the equilibrium measure $\mu_V$ has support $[a, b]$. Let $\mathrm{arcsine}_{a,b} = 1_{[a,b]}(x)\, \frac{dx}{\pi\sqrt{(b - x)(x - a)}}$ be the arcsine law with support $[a, b]$. Then for any smooth function supported on $[a, b]$,
\[ \int \phi'(x)^2\, \mu_V(dx) \ge \rho\, \mathrm{Var}_{\mathrm{arcsine}_{a,b}}(\phi), \tag{8.8} \]
where the variance is considered with respect to the $\mathrm{arcsine}_{a,b}$ law.

Proof. It suffices to deal with the case $a = -1$, $b = 1$, the rest following by simple rescaling. Recall that in the proof of Theorem 8, we used convexity to get that the density $g(x)$ of $\mu_V$ satisfies $g(x) \ge \frac{\rho}{\pi}\sqrt{1 - x^2}$. Thus the proof reduces to
\[ \frac{1}{\pi} \int_{-1}^{1} \phi'(x)^2 \sqrt{1 - x^2}\, dx \ge \mathrm{Var}_{\mathrm{arcsine}}(\phi). \tag{8.9} \]
For this, write $\phi = \sum_{n=0}^\infty \alpha_n T_n(x)$, the expansion of $\phi$ in terms of Chebyshev polynomials of the first kind. Now, $T_n' = nU_{n-1}$ and thus the above inequality reduces to the obvious inequality $\frac{1}{2}\sum_{n=1}^\infty n^2 \alpha_n^2 \ge \frac{1}{2}\sum_{n=1}^\infty \alpha_n^2$. □

We will actually see below that inequality (8.9) is simply the spectral gap for the Jacobi operator associated to the arcsine law.

9. Poincaré inequalities and Jacobi operators

In this section we show how the two versions of the Poincaré inequalities can be viewed as spectral gaps for some Jacobi operators. This discussion is mainly driven by the work [8] of Cabanal-Duvillard and his interpretation of the variance in (7.4) in terms of the number operator of the Jacobi operator associated to the arcsine law. This viewpoint allows for a unified perspective on the Poincaré inequalities presented in the preceding sections. For our purpose we consider here the Jacobi operators given, for smooth functions on $(-1, 1)$, by
\[ L_\lambda f(x) = -(1 - x^2) f''(x) + (2\lambda + 1)\, x f'(x) \tag{9.1} \]
for $\lambda \ge 0$. We consider the Gegenbauer polynomials $C_n^\lambda$, $\lambda > 0$, defined by the generating function
\[ \sum_{n=0}^\infty r^n C_n^\lambda(x) = \frac{1}{(1 - 2rx + r^2)^\lambda}. \]
For $\lambda = 0$ we set $C_n^\lambda(x) = T_n(x)/n$, $n \ge 1$, where $T_n$ are the Chebyshev polynomials of the first kind.
It is known that the $C_n^\lambda$ are eigenfunctions of $L_\lambda$, with eigenvalue $n(n + 2\lambda)$, i.e. $L_\lambda C_n^\lambda = n(n + 2\lambda) C_n^\lambda$. On the other hand the Gegenbauer polynomials are orthogonal with respect to the probability measure
\[ \nu_\lambda = \frac{2^{2\lambda}\, \Gamma^2(\lambda + 1)}{\pi\, \Gamma(2\lambda + 1)}\, 1_{[-1, 1]}(x)\, (1 - x^2)^{\lambda - 1/2}. \]
Notice that in the case of $\lambda = 0$, this becomes the arcsine law and for $\lambda = 1$, this is the semicircular law, while for $\lambda = 1/2$, this becomes the uniform measure on $[-1, 1]$. Take now the normalized Gegenbauer polynomials $\phi_n^\lambda = C_n^\lambda / \sqrt{c_n^\lambda}$, where $c_n^\lambda = \int (C_n^\lambda(x))^2\, \nu_\lambda(dx)$. Then the $\phi_n^\lambda$ form an orthonormal basis of $L^2(\nu_\lambda)$ and thus the operator $L_\lambda$ is diagonalized in this basis. Consider $N_\lambda$ to be the counting number operator with respect to the basis $\phi_n^\lambda$, i.e.
\[ N_\lambda \phi_n^\lambda = n\, \phi_n^\lambda. \tag{9.2} \]
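The eigenvalue relation $L_\lambda C_n^\lambda = n(n + 2\lambda) C_n^\lambda$ quoted above can be verified exactly for small $n$. The sketch below is an illustration with hypothetical helper names; it assumes the standard three-term recurrence $nC_n^\lambda(x) = 2(n + \lambda - 1)xC_{n-1}^\lambda(x) - (n + 2\lambda - 2)C_{n-2}^\lambda(x)$ with $C_0^\lambda = 1$, $C_1^\lambda = 2\lambda x$ (valid for $\lambda > 0$), and represents polynomials by their coefficient lists in exact rational arithmetic.

```python
from fractions import Fraction

def gegenbauer_coefficients(n, lam):
    """Coefficient lists (lowest degree first) of C_0^lam, ..., C_n^lam for lam > 0."""
    lam = Fraction(lam)
    polys = [[Fraction(1)], [Fraction(0), 2 * lam]]
    for m in range(2, n + 1):
        a = 2 * (m + lam - 1) / m          # coefficient of x * C_{m-1}
        b = (m + 2 * lam - 2) / m          # coefficient of C_{m-2}
        shifted = [Fraction(0)] + [a * c for c in polys[m - 1]]
        padded = polys[m - 2] + [Fraction(0)] * (len(shifted) - len(polys[m - 2]))
        polys.append([s - b * c for s, c in zip(shifted, padded)])
    return polys[: n + 1]

def jacobi_operator(coeffs, lam):
    """Coefficients of -(1 - x^2) f'' + (2 lam + 1) x f' for f given by coeffs."""
    lam = Fraction(lam)
    out = [Fraction(0)] * len(coeffs)
    for j, c in enumerate(coeffs):
        # x^j is sent to -j(j-1) x^(j-2) + (j(j-1) + (2 lam + 1) j) x^j
        if j >= 2:
            out[j - 2] -= j * (j - 1) * c
        out[j] += (j * (j - 1) + (2 * lam + 1) * j) * c
    return out

if __name__ == "__main__":
    lam = Fraction(3, 2)  # arbitrary sample parameter
    for n, poly in enumerate(gegenbauer_coefficients(5, lam)):
        eigenvalue = n * (n + 2 * lam)
        print(n, jacobi_operator(poly, lam) == [eigenvalue * c for c in poly])
```

With $\lambda = 1$ the recurrence reproduces the Chebyshev polynomials of the second kind, consistent with $\nu_1$ being the semicircular law.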
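This implies that $L_\lambda = N_\lambda^2 + 2\lambda N_\lambda$. Therefore we have the following two inequalities
\[ L_\lambda \ge (2\lambda + 1) N_\lambda \quad\text{and}\quad N_\lambda \ge 1 - P_\lambda, \tag{9.3} \]
where $P_\lambda$ stands for the projection on constant functions in $L^2(\nu_\lambda)$. In other words, $P_\lambda \phi = \int \phi\, d\nu_\lambda$. Notice that Eq. (9.3) includes two statements. The first one is the comparison of $L_\lambda$ and $N_\lambda$, with the spectral gap $2\lambda + 1$, while the second one is the spectral gap of the counting number operator, which equals 1. In the sequel we want to translate these spectral gaps in terms of Poincaré type inequalities. For this matter we need to find the kernel of the operator $N_\lambda$. For any function $\phi$ in the domain of definition of $L_\lambda$, we have $\phi = \sum_{n=0}^\infty \alpha_n \phi_n^\lambda$, and then
\[ \langle L_\lambda \phi, \phi\rangle_{L^2(\nu_\lambda)} = \sum_{n=0}^\infty n(n + 2\lambda)\, \alpha_n^2. \]
On the other hand, using integration by parts, we can justify that
\[ \langle L_\lambda \phi, \phi\rangle_{L^2(\nu_\lambda)} = \int \phi\, L_\lambda \phi\, d\nu_\lambda = \int \phi'(x)^2 (1 - x^2)\, \nu_\lambda(dx). \]
For the number operator, we have that
\[ \int \phi\, N_\lambda \phi\, d\nu_\lambda = \sum_{n=0}^\infty n\, \alpha_n^2 = \lim_{r \uparrow 1} \sum_{n=0}^\infty n r^{n-1} \alpha_n^2. \]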
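Now, for $-1 < r < 1$,
\[ \sum_{n=0}^\infty n r^{n-1} \alpha_n^2 = \iint \phi(x)\phi(y) \sum_{n=0}^\infty n r^{n-1} \phi_n^\lambda(x) \phi_n^\lambda(y)\, \nu_\lambda(dx)\, \nu_\lambda(dy). \]
Furthermore, since $\int \phi_n^\lambda\, d\nu_\lambda = 0$ for $n \ge 1$, we also obtain that $\iint \phi^2(x)\, \phi_n^\lambda(y)\, \nu_\lambda(dx)\, \nu_\lambda(dy) = 0$ for $n \ge 1$ and thus, denoting $K_\lambda(r, x, y) = -\sum_{n=0}^\infty n r^{n-1} \phi_n^\lambda(x) \phi_n^\lambda(y)$,
\[ \iint \phi(x)\phi(y) \sum_{n=0}^\infty n r^{n-1} \phi_n^\lambda(x) \phi_n^\lambda(y)\, \nu_\lambda(dx)\, \nu_\lambda(dy) = \frac{1}{2} \iint \big(\phi(x) - \phi(y)\big)^2 K_\lambda(r, x, y)\, \nu_\lambda(dx)\, \nu_\lambda(dy). \]
The following formula is essentially due to Watson [33] and valid for $\lambda > 0$:
\[ \sum_{n=0}^\infty r^n \phi_n^\lambda(x) \phi_n^\lambda(y) = \frac{(1 - r^2)\, \Gamma(2\lambda)}{2^{2\lambda - 1}\, \Gamma^2(\lambda)} \int_{-1}^{1} \frac{(1 - z^2)^{\lambda - 1}}{\big(1 - 2r\big(xy + z\sqrt{(1 - x^2)(1 - y^2)}\big) + r^2\big)^{1 + \lambda}}\, dz. \]
For $\lambda = 0$, we have to deal with the Chebyshev polynomials of the first kind, which is more or less what appeared in the proof of Theorem 8. For this case, we have that (denoting $x = \cos t$ and $y = \cos s$)
\[ \sum_{n=0}^\infty \frac{r^n}{c_n}\, T_n(x) T_n(y) = \frac{1 - r\cos(t + s)}{1 - 2r\cos(t + s) + r^2} + \frac{1 - r\cos(t - s)}{1 - 2r\cos(t - s) + r^2} - 1, \]
where $c_n = \int T_n^2\, d\nu_0 = 1$ for $n = 0$ and $1/2$ otherwise. Thus, we obtain, after differentiation with respect to $r$ and then taking the limit $r \uparrow 1$, that
\[ K_\lambda(x, y) = \lim_{r \uparrow 1} K_\lambda(r, x, y) =
\begin{cases}
\dfrac{\Gamma(2\lambda)}{2^{3\lambda - 1}\, \Gamma^2(\lambda)} \displaystyle\int_{-1}^{1} \frac{(1 - z^2)^{\lambda - 1}}{\big(1 - xy - z\sqrt{(1 - x^2)(1 - y^2)}\big)^{1 + \lambda}}\, dz, & \lambda > 0, \\[2ex]
\dfrac{1 - xy}{(x - y)^2}, & \lambda = 0, \\[2ex]
\dfrac{1}{2(x - y)^2}, & \lambda = 1.
\end{cases} \tag{9.4} \]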
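As a consistency check on (9.4), the $\lambda > 0$ branch evaluated at $\lambda = 1$ should reproduce the closed form $1/(2(x - y)^2)$. The sketch below is illustrative only (names are ours); it integrates the $\lambda > 0$ formula with a composite Simpson rule, which is adequate for $\lambda \ge 1$ and $x \ne y$, where the integrand is smooth on $[-1, 1]$.

```python
import math

def kernel_lambda_positive(lam, x, y, n=4000):
    """Composite Simpson approximation of the lambda > 0 branch of (9.4); n must be even."""
    a = 1.0 - x * y
    s = math.sqrt((1.0 - x * x) * (1.0 - y * y))
    const = math.gamma(2.0 * lam) / (2.0 ** (3.0 * lam - 1.0) * math.gamma(lam) ** 2)

    def integrand(z):
        return (1.0 - z * z) ** (lam - 1.0) / (a - z * s) ** (1.0 + lam)

    h = 2.0 / n
    total = integrand(-1.0) + integrand(1.0)
    for i in range(1, n):
        total += (4.0 if i % 2 else 2.0) * integrand(-1.0 + i * h)
    return const * total * h / 3.0

def kernel_semicircle(x, y):
    """Closed form of K_1(x, y) stated in (9.4)."""
    return 1.0 / (2.0 * (x - y) ** 2)

if __name__ == "__main__":
    for (x, y) in [(0.3, -0.5), (0.2, 0.6)]:
        print(x, y, kernel_lambda_positive(1.0, x, y), kernel_semicircle(x, y))
```

The agreement reflects the elementary identity $(1 - xy)^2 - (1 - x^2)(1 - y^2) = (x - y)^2$, which is also what produces the singular factor $(x - y)^{-2}$ in the kernel.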
The integrand is not a rational function. In some cases, it is algebraic since $\lambda \ge 0$ need not be an integer. To reveal the singularity of this kernel, we make the change of variable
\[ 1 - xy - z\sqrt{(1 - x^2)(1 - y^2)} = t\,\Big(1 - xy - \sqrt{(1 - x^2)(1 - y^2)}\Big). \]
Then, after simple algebraic manipulations, setting $f_\lambda : (0, 1) \to \mathbb{R}$,
\[ f_\lambda(u) = \int_1^{1/u} \frac{[(t - 1)(1 - ut)]^{\lambda - 1}}{t^{\lambda + 1}}\, dt, \]
and
\[ H_\lambda(x, y) =
\begin{cases}
\dfrac{\Gamma(2\lambda)\big(1 - xy + \sqrt{(1 - x^2)(1 - y^2)}\big)^{\lambda}}{2^{3\lambda - 1}\, \Gamma^2(\lambda)\, \big((1 - x^2)(1 - y^2)\big)^{\lambda - 1/2}}\; f_\lambda\!\left(\dfrac{(x - y)^2}{\big(1 - xy + \sqrt{(1 - x^2)(1 - y^2)}\big)^2}\right), & \lambda > 0, \\[2ex]
1 - xy, & \lambda = 0, \\[1ex]
\dfrac{1}{2}, & \lambda = 1,
\end{cases} \tag{9.5} \]
we can rewrite Eq. (9.4) for $|x|, |y| < 1$ as
\[ K_\lambda(x, y) = \frac{H_\lambda(x, y)}{(x - y)^2}, \tag{9.6} \]
where $H_\lambda(x, y)$ is a continuous function of $x, y \in [-1, 1]$. Now, from (9.3), we obtain the following result.

Theorem 11. For any $\lambda \ge 0$ and any $\phi \in C^1([-1, 1])$, one has
\[ \int \phi'(x)^2 (1 - x^2)\, \nu_\lambda(dx) \ge \frac{2\lambda + 1}{2} \iint \Big(\frac{\phi(x) - \phi(y)}{x - y}\Big)^2 H_\lambda(x, y)\, \nu_\lambda(dx)\, \nu_\lambda(dy) \tag{9.7} \]
and
\[ \iint \Big(\frac{\phi(x) - \phi(y)}{x - y}\Big)^2 H_\lambda(x, y)\, \nu_\lambda(dx)\, \nu_\lambda(dy) \ge 2\, \mathrm{Var}_{\nu_\lambda}(\phi). \tag{9.8} \]
Remark 5.
1. Eq. (9.7) for $\lambda = 0$ is the statement of Theorem 8 for the case $V(x) = 2x^2$ (more precisely, Eq. (7.8)), while Eq. (9.8) for $\lambda = 1$ is the statement of the second Poincaré inequality contained in Theorem 9 for the semicircular law. The combination of these two inequalities is equation (8.9). In other words, for the measures $\nu_\lambda$, the first Poincaré type inequality is driven by the comparison of the Jacobi and counting number operators defined in (9.1) and (9.2), while the second Poincaré type inequality is the spectral gap of the counting number operator.
2. Combining Eqs. (9.7) and (9.8), we also get a Brascamp–Lieb type inequality:
\[ \int \phi'(x)^2 (1 - x^2)\, \nu_\lambda(dx) \ge (2\lambda + 1)\, \mathrm{Var}_{\nu_\lambda}(\phi). \tag{9.9} \]
For $\lambda \ge 1/2$, the measure $\nu_\lambda$ is of the form $e^{-V(x)}\,dx$, where $V(x) = -c_\lambda - (\lambda - 1/2)\log(1 - x^2)$ is a strictly convex function on $(-1, 1)$, and according to the classical Brascamp–Lieb inequality [6],
\[ \int \phi'(x)^2\, \frac{(1 - x^2)^2}{1 + x^2}\, \nu_\lambda(dx) \ge (2\lambda - 1)\, \mathrm{Var}_{\nu_\lambda}(\phi). \tag{9.10} \]
Notice here that neither (9.9) nor (9.10) implies the other, which means that they complement each other in some sense. For example, if $\phi$ has support in $[-\frac{1}{2\lambda}, \frac{1}{2\lambda}]$, (9.9) implies (9.10), while if $\phi$ is supported on $[-1, 1] \setminus [-\frac{1}{2\lambda}, \frac{1}{2\lambda}]$, (9.10) implies (9.9).
10. Wishart ensembles and Marcenko–Pastur distributions

In this section, we address the preceding functional inequalities for probability measures on the real positive axis in the context of the Wishart ensembles from random matrix theory and their associated Marcenko–Pastur distributions. We start with the random matrix heuristics although, as far as we know, it has not been used towards functional inequalities as before. The problem of large deviation principles for the distribution of the eigenvalues of Wishart ensembles is discussed in [16]. The model is as follows. Take $T(n)$ an $n \times p(n)$ random matrix with all the entries being iid $N(0, 1)$ random variables. Then $T(n)T(n)^t$ for $n < p(n)$ is known as the nonsingular Wishart random ensemble. According to [17, p. 129], the distribution of the Wishart ensembles is given by
\[ C_{np}\, e^{-\frac{p(n)}{2}\mathrm{Tr}\, M} (\det M)^{(p - n - 1)/2}\, dM, \]
where the measure $dM = \prod_{i \le j} dM_{ij}$ is the restriction of the Lebesgue measure to the set of $n \times n$ non-negative matrices. It is also known (for example [17, p. 129]) that the joint distribution of the eigenvalues $(\lambda_1, \lambda_2, \ldots, \lambda_n)$ of $\frac{1}{p(n)} T(n)T(n)^t$ is given by
\[ \frac{1}{Z_n}\, e^{-\frac{p(n)}{2}\sum_{i=1}^n \lambda_i} \prod_{i=1}^n \lambda_i^{(p(n) - n - 1)/2} \prod_{1 \le i < j \le n} |\lambda_i - \lambda_j|. \]
Our interest is in the limit distribution of $\mu_n = \frac{1}{n}\sum_{i=1}^n \delta_{\lambda_i}$. The classical result states that if $n/p(n) \to \alpha \in (0, 1]$ as $n \to \infty$, then the limit distribution of $\mu_n$ is the so-called Marcenko–Pastur distribution given by
\[ 1_{[(1 - \sqrt{\alpha})^2,\, (1 + \sqrt{\alpha})^2]}(x)\, \frac{\sqrt{4\alpha - (x - 1 - \alpha)^2}}{2\pi\alpha x}\, dx. \]
This is a particular model for the standard Wishart ensembles. However, one can consider a more general example in which the distribution of the matrix is driven by a potential $Q : [0, \infty) \to \mathbb{R}$,
\[ C_n\, e^{-p(n)\,\mathrm{Tr}\, Q(M)} (\det M)^{\gamma(n)}\, dM, \]
where $dM$ stands for the Lebesgue measure on $n \times n$ positive definite matrices. The distribution of the eigenvalues of $M$ is given by
\[ \frac{1}{Z_n}\, e^{-p(n)\sum_{i=1}^n Q(t_i)} \prod_{i=1}^n t_i^{\gamma(n)} \prod_{1 \le i < j \le n} |t_i - t_j|. \]
The main result of [16] is that the distributions of the random measures $\mu_n = \frac{1}{p(n)}\sum_{i=1}^{p(n)} \delta_{\lambda_i}$, under the conditions $n/p(n) \to \alpha \in (0, 1]$ and $\gamma(n)/n \to \gamma > 0$ as $n \to \infty$, satisfy a large deviation principle with scale $n^{-2}$ and rate function given by
\[ R(\mu) = \tilde{E}_Q(\mu) - \inf_{\mu \in \mathcal{P}([0, \infty))} \tilde{E}_Q(\mu), \]
where
\[ \tilde{E}_Q(\mu) = \alpha \int \big(Q(x) - \gamma\log(x)\big)\, \mu(dx) - \frac{\alpha^2}{2} \iint \log|x - y|\, \mu(dx)\, \mu(dy). \]
This gives the following motivation. Assume that $V : [0, \infty) \to \mathbb{R} \cup \{+\infty\}$ is a lower semicontinuous potential such that $\lim_{|x| \to \infty}(V(x) - 2\log|x|) = \infty$. Then, according to the results in [27], we know that there is a unique minimizer of
\[ \inf_{\mu \in \mathcal{P}([0, \infty))} E_V(\mu). \]
In addition the equilibrium measure $\mu_V$ has compact support. A particular case of interest is $V(x) = rx - s\log(x)$ with $r > 0$, $s \ge 0$, for which we know [27, p. 207] that the equilibrium measure is given by
\[ \mu_V(dx) = 1_{[a, b]}(x)\, \frac{r\sqrt{(x - a)(b - x)}}{2\pi x}\, dx, \quad\text{where } a = \frac{s + 2 - 2\sqrt{s + 1}}{r},\quad b = \frac{s + 2 + 2\sqrt{s + 1}}{r}. \tag{10.1} \]
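Formula (10.1) can be cross-checked against the Marcenko–Pastur law stated earlier in this section. The sketch below is illustrative and uses our own function names: for the choice $r = 1/\alpha$, $s = (1 - \alpha)/\alpha$ it compares the endpoints $a, b$ with $(1 \mp \sqrt{\alpha})^2$ and approximates the total mass of $\mu_V$ with the substitution $x = \frac{a + b}{2} + \frac{b - a}{2}\cos\theta$ and a midpoint rule. It assumes $s > 0$, so that $a > 0$ and the integrand stays smooth.

```python
import math

def endpoints(r, s):
    """Endpoints a, b of the support in (10.1)."""
    root = 2.0 * math.sqrt(s + 1.0)
    return (s + 2.0 - root) / r, (s + 2.0 + root) / r

def total_mass(r, s, n=4000):
    """Midpoint approximation of the integral of the density in (10.1) over [a, b]."""
    a, b = endpoints(r, s)
    mid, half = (a + b) / 2.0, (b - a) / 2.0
    h = math.pi / n
    total = 0.0
    for i in range(n):
        theta = (i + 0.5) * h
        x = mid + half * math.cos(theta)
        # sqrt((x - a)(b - x)) = half * sin(theta) and dx = half * sin(theta) d(theta)
        total += (half * math.sin(theta)) ** 2 / x
    return r / (2.0 * math.pi) * total * h

if __name__ == "__main__":
    alpha = 0.25  # arbitrary sample ratio in (0, 1)
    r, s = 1.0 / alpha, (1.0 - alpha) / alpha
    a, b = endpoints(r, s)
    print(a, (1.0 - math.sqrt(alpha)) ** 2)
    print(b, (1.0 + math.sqrt(alpha)) ** 2)
    print(total_mass(r, s))
```

The mass should come out as 1 up to quadrature error, in agreement with $\int_a^b \frac{\sqrt{(x - a)(b - x)}}{x}\, dx = \pi\big(\frac{a + b}{2} - \sqrt{ab}\big) = \frac{2\pi}{r}$.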
One recovers the Marcenko–Pastur distribution for $V(x) = rx - s\log(x)$, $r > 0$, $s \ge 0$, with $r = 1/\alpha$ and $s = (1 - \alpha)/\alpha$.

The natural way to deal with functional inequalities in the context of measures on the positive axis $[0, \infty)$ is to transfer measures from $[0, \infty)$ into measures on the whole of $\mathbb{R}$. For a measure $\mu$ on $[0, \infty)$, consider thus the associated symmetric measure $\tilde\mu$ on $\mathbb{R}$ defined as
\[ \mu(F) = \tilde\mu\big(\{x : x^2 \in F\}\big) \tag{10.2} \]
for any measurable set $F$ of $[0, \infty)$. Defining $\tilde{V}(x) = V(x^2)/2$, it is then an easy exercise to check that
\[ E_V(\mu) = 2E_{\tilde{V}}(\tilde\mu). \tag{10.3} \]
In addition, the minimizer of $E_{\tilde{V}}$ is $\mu_{\tilde{V}} = \tilde\mu_V$. Further, for the non-decreasing transportation map $\theta$ of $\mu_V$ into $\mu$, define
\[ \tilde\theta(x) = \mathrm{sign}(x)\sqrt{\theta(x^2)}, \tag{10.4} \]
which transports $\tilde\mu_{\tilde{V}}$ into $\tilde\mu$.
In addition, as it was pointed out in [18], the relative free Fisher information IV (μ) is defined for measures μ on [0, ∞) with density p = dμ/dx in L3 ([0, ∞), x dx) as ∞ IV (μ) =
2 x H μ(x) − V (x) μ(dx) with H μ(x) = p.v.
0
2 μ(dy). x −y
(10.5)
Otherwise we take IV (μ) = +∞. The main reason for defining this in this way is because, cf. [18, Lemma 6.3] and the discussion following, one has ˜ IV (μ) = 2IV˜ (μ),
(10.6)
where IV˜ is defined by (4.1). To state the transportation cost result, we define the appropriate distance. For any μ, ν ∈ P([0, ∞)), set the distance as W (μ, ν) =
inf
π∈Π(μ,ν)
1/2 √ √ 2 x − y π(dx, dy)
(10.7)
where $\Pi(\mu,\nu)$ is the set of probability measures on $\mathbb{R}^2$ with marginals $\mu$ and $\nu$. In this context we have the following transportation cost inequality.

Theorem 12. Assume that $V : (0,\infty) \to \mathbb{R}$ is $C^2((0,\infty))$ such that $V(x^2) - \rho x^2$ is convex on $(0,\infty)$ for some $\rho > 0$ and let $\mu_V$ be the equilibrium measure of $V$ on $[0,\infty)$. Then, for any probability measure $\mu$ on $[0,\infty)$, we have that
\[
\rho\,W^2(\mu,\mu_V) \le E_V(\mu) - E_V(\mu_V). \tag{10.8}
\]
In the case of $V(x) = rx - s\log(x)$ with $r > 0$ and $s \ge 0$, this inequality with $\rho = r$ is sharp.

Proof. As announced, the idea is to interpret this inequality as an inequality for potentials on the whole real line instead of $[0,\infty)$. Using the measures $\tilde\mu$ and $\tilde\mu_V$ from Eq. (10.2) together with (10.3), we have that
\[
E_V(\mu) - E_V(\mu_V) = 2\big(E_{\tilde V}(\tilde\mu) - E_{\tilde V}(\tilde\mu_V)\big).
\]
On the other hand, if $\theta$ is the (increasing) transportation map of $\mu_V$ into $\mu$, then it is not hard to check that
\[
W^2(\mu,\mu_V) = \int \big(\sqrt{x} - \sqrt{\theta(x)}\big)^2\,\mu_V(dx) = \int \big(x - \tilde\theta(x)\big)^2\,\tilde\mu_V(dx).
\]
In this framework the inequality (10.8) translates as
\[
\frac{\rho}{2}\,W_2^2(\tilde\mu,\tilde\mu_V) \le E_{\tilde V}(\tilde\mu) - E_{\tilde V}(\tilde\mu_V). \tag{10.9}
\]
From here we will use the same argument as in the proof of Theorem 2. Start with
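\[
\begin{aligned}
E_{\tilde V}(\tilde\mu) - E_{\tilde V}(\tilde\mu_V)
&= \int \Big(\tilde V(\tilde\theta(x)) - \tilde V(x) - \tilde V'(x)\big(\tilde\theta(x) - x\big)\Big)\,\tilde\mu_V(dx)\\
&\quad + \iint \Big(\frac{\tilde\theta(x) - \tilde\theta(y)}{x-y} - 1 - \log\frac{\tilde\theta(x) - \tilde\theta(y)}{x-y}\Big)\,\tilde\mu_V(dx)\,\tilde\mu_V(dy)
\end{aligned}
\]
and notice that the second line of this is non-negative. For the first line we point out that because $\tilde V(x) - \frac{\rho}{2}x^2$ is convex and $x$ and $\tilde\theta(x)$ have the same sign, for any $x$,
\[
\tilde V(\tilde\theta(x)) - \tilde V(x) - \tilde V'(x)\big(\tilde\theta(x) - x\big) \ge \frac{\rho}{2}\big(\tilde\theta(x) - x\big)^2,
\]
which implies (10.8).

In the case $V(x) = rx - s\log(x)$, take $\theta(x) = (\sqrt{x} + m)^2$ for large $m$ and notice that $\tilde\theta(x) = x + m\,\operatorname{sign}(x)$. Therefore inequality (10.9) with $\rho = r$ becomes
\[
\begin{aligned}
r m^2 \le{}& r m^2 + 2rm\int |x|\,\tilde\mu_V(dx) - 2s\int \log\frac{|x + m\,\operatorname{sign}(x)|}{|x|}\,\tilde\mu_V(dx)\\
& - 2\iint \log\Big(1 + m\,\frac{\operatorname{sign}(x) - \operatorname{sign}(y)}{x-y}\Big)\,\tilde\mu_V(dx)\,\tilde\mu_V(dy),
\end{aligned}
\]
which is sharp for large $m$, since the terms other than $rm^2$ on the right-hand side are of lower order in $m$. □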
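The next result is the Log-Sobolev type inequality, which was conjectured by Cabanal-Duvillard in [7, p. 140] for the case of the Marcenko–Pastur distribution.

Theorem 13. Let $V$ be as in the previous theorem. Then, with the definition from (10.5) and for any measure $\mu \in P([0,\infty))$,
\[
E_V(\mu) - E_V(\mu_V) \le \frac{1}{2\rho}\,I_V(\mu). \tag{10.10}
\]
In the case $V(x) = rx - s\log(x)$, $r > 0$ and $s \ge 0$, inequality (10.10) with $\rho = r$ is sharp.

Proof. We will discuss here the proof only in the case when $\mu$ has a smooth compactly supported density, careful approximations being described in [18]. From (10.6), we have $I_V(\mu) = 2I_{\tilde V}(\tilde\mu)$, where $I_{\tilde V}(\tilde\mu) = \int \big(H\tilde\mu(x) - \tilde V'(x)\big)^2\,\tilde\mu(dx)$. Rewriting everything in terms of $\tilde\mu$ and the associated quantities, the inequality to be proven can be written in the same way as we did in the proof of Theorem 4,
\[
\begin{aligned}
&\frac{1}{2\rho}\int \big(H\tilde\mu(\tilde\theta(x)) - \tilde V'(\tilde\theta(x))\big)^2\,\tilde\mu_{\tilde V}(dx)\\
&\quad + \int \Big(\tilde V(x) - \tilde V(\tilde\theta(x)) - \tilde V'(\tilde\theta(x))\big(x - \tilde\theta(x)\big)\Big)\,\tilde\mu_{\tilde V}(dx)\\
&\quad - \int \big(H\tilde\mu(\tilde\theta(x)) - \tilde V'(\tilde\theta(x))\big)\big(x - \tilde\theta(x)\big)\,\tilde\mu_{\tilde V}(dx)\\
&\quad + \int H\tilde\mu(\tilde\theta(x))\big(x - \tilde\theta(x)\big)\,\tilde\mu_{\tilde V}(dx) - \iint \log\frac{x-y}{\tilde\theta(x) - \tilde\theta(y)}\,\tilde\mu_{\tilde V}(dx)\,\tilde\mu_{\tilde V}(dy) \ge 0.
\end{aligned} \tag{10.11}
\]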
Notice that $\tilde V(x) - \frac{\rho}{2}x^2$ is not convex on the whole real line but it is convex on the intervals $(0,\infty)$ and $(-\infty,0)$. The key to everything here is that $\tilde\theta(x)$ has the same sign as $x$ and this allows us to apply convexity of $\tilde V(x) - \frac{\rho}{2}x^2$ on each of the intervals $(-\infty,0)$ and $(0,\infty)$ to conclude that
\[
\begin{aligned}
\tilde V(x) - \tilde V(\tilde\theta(x)) - \tilde V'(\tilde\theta(x))\big(x - \tilde\theta(x)\big)
&\ge \frac{\rho}{2}\Big(x^2 - \tilde\theta(x)^2 - 2\tilde\theta(x)\big(x - \tilde\theta(x)\big)\Big)\\
&= \frac{\rho}{2}\big(x - \tilde\theta(x)\big)^2.
\end{aligned} \tag{10.12}
\]
From here we can follow word by word the proof of Theorem 4.

For the case $V(x) = rx$, we have equality in (10.10) if $\tilde\theta(x) = x + m\,\operatorname{sign}(x)$ and thus this means $\theta(x) = (\sqrt{x} + m)^2$. In the case $V(x) = rx - s\log(x)$, we look at $\tilde\theta(x) = x + m\,\operatorname{sign}(x)$ for large $m$. In this case $\tilde V(x) = rx^2/2 - s\log|x|$ and then a simple calculation shows that (10.10) is equivalent to
\[
\begin{aligned}
& r m^2 + 2mr\int |x|\,\tilde\mu_V(dx) - 2s\int \log\frac{|x + m\,\operatorname{sign}(x)|}{|x|}\,\tilde\mu_V(dx)
 - 2\iint \log\Big(1 + m\,\frac{\operatorname{sign}(x) - \operatorname{sign}(y)}{x-y}\Big)\,\tilde\mu(dx)\,\tilde\mu(dy)\\
&\qquad \le \frac{m^2}{\rho}\int \Big(r - \frac{s}{x\,(x + m\,\operatorname{sign}(x))}\Big)^2\,\tilde\mu_V(dx).
\end{aligned}
\]
Dividing both sides by $m^2$ and taking the limit of $m$ to infinity implies that $\rho \le r$. On the other hand $\rho = r$ validates (10.10), hence $\rho = r$ is the best constant. □

Next in line is the HWI inequality which is the content of the following statement.

Theorem 14. Assume $V$ is as in Theorem 12 and the distance $W$ given by (10.7). Then for any measure $\mu \in P([0,\infty))$,
\[
E_V(\mu) - E_V(\mu_V) \le \sqrt{2I_V(\mu)}\;W(\mu,\mu_V) - \rho\,W^2(\mu,\mu_V). \tag{10.13}
\]
For the case of $V(x) = rx - s\log(x)$, $r > 0$, $s \ge 0$, this inequality for $\rho = r$ is sharp.

Proof. As it was made clear in the previous two theorems, we translate this inequality in terms of the associated symmetric measures on $\mathbb{R}$. Following upon the proofs of the above theorems, we can rewrite (10.13) in the following form:
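\[
\begin{aligned}
&\Big(\int \big(H\tilde\mu(\tilde\theta(x)) - \tilde V'(\tilde\theta(x))\big)^2\,\tilde\mu_V(dx)\Big)^{1/2}
 \Big(\int \big(\tilde\theta(x) - x\big)^2\,\tilde\mu_V(dx)\Big)^{1/2}\\
&\quad - \int \big(H\tilde\mu(\tilde\theta(x)) - \tilde V'(\tilde\theta(x))\big)\big(x - \tilde\theta(x)\big)\,\tilde\mu_V(dx)\\
&\quad + \int \Big(\tilde V(x) - \tilde V(\tilde\theta(x)) - \tilde V'(\tilde\theta(x))\big(x - \tilde\theta(x)\big) - \frac{\rho}{2}\big(\tilde\theta(x) - x\big)^2\Big)\,\tilde\mu_V(dx)\\
&\quad + \int H\tilde\mu(\tilde\theta(x))\big(x - \tilde\theta(x)\big)\,\tilde\mu_V(dx) - \iint \log\frac{x-y}{\tilde\theta(x) - \tilde\theta(y)}\,\tilde\mu_V(dx)\,\tilde\mu_V(dy) \ge 0.
\end{aligned}
\]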
Using the fact that $\tilde V(x) - \frac{\rho}{2}x^2$ is convex on each interval $(-\infty,0)$ and $(0,\infty)$ combined with the fact that $x$ and $\tilde\theta(x)$ have the same sign, the rest of the proof is the same as the one of Theorem 5. For the case $V(x) = rx - s\log(x)$, using $\theta(x) = (\sqrt{x} + m)^2$, one can show that $\rho = r$ is sharp. □

At last, we would like to discuss a Poincaré type inequality in this context. As in Section 7, for the heuristics, we consider the general model of random matrices with distribution
\[
P_n(dM) = C_n e^{-nr\operatorname{Tr} M}(\det M)^{sn}\,dM = C_n e^{-n\operatorname{Tr}(rM - s\log(M))}\,dM = C_n e^{-n\operatorname{Tr}(V(M))}\,dM \tag{10.14}
\]
where $dM$ stands for the Lebesgue measure on $n \times n$ positive definite matrices and $s \ge 0$. For a given smooth compactly supported function $\phi : [0,\infty) \to \mathbb{R}$, we want to apply the Brascamp–Lieb inequality [6] to the function $\Phi(M) = \operatorname{Tr}\phi(M)$ on the space of positive definite matrices. Now, $\nabla\Phi(M) = \phi'(M)$. The Hessian of $\Psi(M) := \operatorname{Tr}(V(M))$ can be interpreted as a linear map from $H_n$ ($n \times n$ Hermitian matrices) into itself which is given by $\nabla^2\Psi(M)X = sM^{-1}XM^{-1}$. Hence the inverse of the Hessian is then $(\nabla^2\Psi(M))^{-1}X = \frac{1}{s}MXM$. Thus we obtain from Brascamp–Lieb that
\[
\operatorname{Var}_{P_n}\big(\Phi(M)\big) \le \frac{1}{n}\int \operatorname{Tr}\Big(\big(\nabla^2\Psi(M)\big)^{-1}\phi'(M)^2\Big)\,P_n(dM).
\]
On the other hand, from [20] or [8] the variance of $\Phi(M)$ converges to $\frac14\operatorname{Var}_{\mathrm{arcsine}[a,b]}(\phi)$, where $\mathrm{arcsine}[a,b] = \frac{dx}{\pi\sqrt{(x-a)(b-x)}}$ is the arcsine law on the support $[a,b]$ of $\mu_V$. Next, we recall that $\frac{1}{n}\operatorname{Tr}\big((\nabla^2\Psi(M))^{-1}\phi'(M)^2\big) = \frac{1}{sn}\operatorname{Tr}\big((\phi'(M)M)^2\big)$, whose integral against $P_n$ converges to the integral of $\frac{1}{s}x^2\phi'(x)^2$ against the equilibrium measure $\mu_V$ from Eq. (10.1). These considerations suggest that
\[
\int x^2\phi'(x)^2\,\mu_V(dx) \ge \frac{s}{4}\operatorname{Var}_{\mathrm{arcsine}[a,b]}(\phi). \tag{10.15}
\]
Notice here that one can actually make this heuristic into an actual proof of this inequality. Motivated by these heuristics and also inspired by Theorem 8, we have the following stronger result.

Theorem 15. Assume that $Q : [0,\infty) \to \mathbb{R}$ is a convex potential and let $V(x) = Q(x) - s\log(x)$ for $s > 0$ satisfy $\lim_{x\to\infty}(V(x) - 2\log(x)) = \infty$. Assume that the support of $\mu_V$ is $[a,b]$. Then for any smooth function $\phi$ on $[a,b]$, the following holds,
\[
\int x^2\phi'(x)^2\,\mu_V(dx) \ge \frac{s}{4\pi^2}\int_a^b\!\!\int_a^b \Big(\frac{\phi(x) - \phi(y)}{x-y}\Big)^2 \frac{-2ab + (a+b)(x+y) - 2xy}{2\sqrt{(x-a)(b-x)}\sqrt{(y-a)(b-y)}}\,dx\,dy. \tag{10.16}
\]
If $Q(x) = rx + t$, equality is attained for $\phi(x) = c_1 + \frac{c_2}{x}$, therefore (10.16) is sharp.
In particular, combining (10.16) with (9.8) for $\lambda = 0$, we get an improvement of (10.15) as
\[
\int x^2\phi'(x)^2\,\mu_V(dx) \ge \frac{s}{2}\operatorname{Var}_{\mathrm{arcsine}[a,b]}(\phi).
\]
Equality though is attained only for $\phi$ identically 0.

In the case $V(x) = rx$, $r > 0$, on $[0,\infty)$, there is no constant $C > 0$ such that inequality (10.16) holds with $C$ instead of $s/4\pi^2$. Nevertheless, for every smooth $\phi$ on $[a,b]$, the following holds,
\[
\int x\,\phi'(x)^2\,\mu_V(dx) \ge \frac{r}{4\pi^2}\int_a^b\!\!\int_a^b \Big(\frac{\phi(x) - \phi(y)}{x-y}\Big)^2 \frac{-2ab + (a+b)(x+y) - 2xy}{2\sqrt{(x-a)(b-x)}\sqrt{(y-a)(b-y)}}\,dx\,dy, \tag{10.17}
\]
with equality for $\phi(x) = c_1 + c_2 x$.

As remarked after the statement of Theorem 8, the numerator in (10.17) is nonnegative.

Proof. The same argument as in the proof of Theorem 8 shows that the density $g(x)$ of $\mu_V$ satisfies
\[
g(x) \ge \frac{s\sqrt{(x-a)(b-x)}}{2\pi x\sqrt{ab}},
\]
therefore it suffices to show that
\[
\frac{1}{\pi\sqrt{ab}}\int_a^b x\,\phi'(x)^2\sqrt{(x-a)(b-x)}\,dx \ge \frac{1}{2\pi^2}\int_a^b\!\!\int_a^b \Big(\frac{\phi(x) - \phi(y)}{x-y}\Big)^2 \frac{-2ab + (a+b)(x+y) - 2xy}{2\sqrt{(x-a)(b-x)}\sqrt{(y-a)(b-y)}}\,dx\,dy.
\]
Next, making the change of variable $x = (a+b)/2 + u(b-a)/2$ and denoting $\zeta(u) = \phi((a+b)/2 + u(b-a)/2)$, we reduce the problem to showing that for any smooth function $\zeta$ on $[-1,1]$, we have
\[
\frac{1}{\pi\sqrt{ab}}\int_{-1}^{1}\Big(\frac{a+b}{2} + u\,\frac{b-a}{2}\Big)\zeta'(u)^2\sqrt{1-u^2}\,du \ge \frac{1}{2\pi^2}\int_{-1}^{1}\!\!\int_{-1}^{1}\Big(\frac{\zeta(u) - \zeta(v)}{u-v}\Big)^2\frac{1-uv}{\sqrt{1-u^2}\sqrt{1-v^2}}\,du\,dv.
\]
Here we used that, with $h = (b-a)/2$, the numerator $-2ab + (a+b)(x+y) - 2xy$ equals $2h^2(1-uv)$. Denoting $\beta = \frac{b-a}{b+a}$, we have that $\frac{a+b}{2\sqrt{ab}} = \frac{1}{\sqrt{1-\beta^2}}$, and the preceding inequality reformulates as
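\[
\int_{-1}^{1}(1+\beta u)\,\zeta'(u)^2\sqrt{1-u^2}\,du \ge \frac{\sqrt{1-\beta^2}}{2\pi}\int_{-1}^{1}\!\!\int_{-1}^{1}\Big(\frac{\zeta(u) - \zeta(v)}{u-v}\Big)^2\frac{1-uv}{\sqrt{1-u^2}\sqrt{1-v^2}}\,du\,dv. \tag{10.18}
\]
To show this, take $\psi(t) = \zeta(\cos(t))$ and then after the change of variable $u = \cos(t)$ we need to check
\[
\int_0^\pi \big(1 + \beta\cos(t)\big)\psi'(t)^2\,dt \ge \frac{\sqrt{1-\beta^2}}{2\pi}\int_0^\pi\!\!\int_0^\pi \Big(\frac{\psi(t) - \psi(s)}{\cos(t) - \cos(s)}\Big)^2\big(1 - \cos(t)\cos(s)\big)\,dt\,ds.
\]
Writing $\psi(t) = \sum_{n=0}^\infty a_n\cos(nt)$ and using that $\psi'(t) = -\sum_{n=1}^\infty n a_n\sin(nt)$, together with the fact that
\[
\int_0^\pi \cos(t)\sin(nt)\sin(mt)\,dt =
\begin{cases}
\pi/4 & \text{for } |m-n| = 1,\\
0 & \text{otherwise},
\end{cases}
\]
and Eq. (7.11), the inequality becomes
\[
\sum_{n\ge1}\big(n^2a_n^2 + \beta n(n+1)a_na_{n+1}\big) \ge \sqrt{1-\beta^2}\sum_{n\ge1} n a_n^2. \tag{10.19}
\]
Let $\delta = \frac{1-\sqrt{1-\beta^2}}{\beta}$ be the solution $0 < \delta < 1$ of $\beta\delta^2 - 2\delta + \beta = 0$. Notice that for any $n \ge 1$, we have
\[
a_na_{n+1} \ge -\frac{\delta}{2}a_n^2 - \frac{1}{2\delta}a_{n+1}^2,
\]
which implies that
\[
\begin{aligned}
\sum_{n\ge1}\big(n^2a_n^2 + \beta n(n+1)a_na_{n+1}\big)
&\ge \sum_{n\ge1}\Big(n^2a_n^2 - \frac{\beta n(n+1)}{2}\Big(\delta a_n^2 + \frac{1}{\delta}a_{n+1}^2\Big)\Big)\\
&= \sum_{n\ge1}\frac{n\beta(1-\delta^2)}{2\delta}\,a_n^2 = \sqrt{1-\beta^2}\sum_{n\ge1} n a_n^2,
\end{aligned}
\]
what we had to prove. Notice here that equality is attained in this inequality if and only if $a_{n+1} = -\delta a_n$ for all $n \ge 1$, which means that $a_n = (-1)^{n-1}\delta^{n-1}a_1$. This corresponds to the function $\psi(t) = a_1\frac{\delta + \cos t}{1 + \delta^2 + 2\delta\cos t}$, or $\zeta(u) = a_1\frac{\delta + u}{1 + \delta^2 + 2\delta u}$, which means that $\phi(x) = a_1(r - s/x)$. Therefore equality holds also for $\phi(x) = c_1 + c_2/x$.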
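The algebra behind (10.19) can be sanity-checked numerically. The sketch below is an illustration only and not part of the proof: it assumes finitely supported coefficient sequences (so $a_n = 0$ beyond the list), checks the defining relation of $\delta$, tests the inequality on random sequences, and evaluates the geometric sequence $a_n = (-\delta)^{n-1}$, for which the two sides should agree up to truncation. All function names are hypothetical.

```python
import math
import random

def delta(beta):
    """Root in (0, 1) of beta*d^2 - 2*d + beta = 0."""
    return (1.0 - math.sqrt(1.0 - beta * beta)) / beta

def lhs_10_19(a, beta):
    """sum_n n^2 a_n^2 + beta*n*(n+1)*a_n*a_{n+1}; a[0] plays the role of a_1."""
    total = 0.0
    for i, value in enumerate(a):
        n = i + 1
        total += n * n * value * value
        if i + 1 < len(a):
            total += beta * n * (n + 1) * value * a[i + 1]
    return total

def rhs_10_19(a, beta):
    """sqrt(1 - beta^2) * sum_n n a_n^2."""
    return math.sqrt(1.0 - beta * beta) * sum((i + 1) * v * v for i, v in enumerate(a))

def smallest_gap(beta, trials=200, length=8, seed=0):
    """Minimum of lhs - rhs over random sequences; should be non-negative."""
    rng = random.Random(seed)
    gaps = []
    for _ in range(trials):
        a = [rng.uniform(-1.0, 1.0) for _ in range(length)]
        gaps.append(lhs_10_19(a, beta) - rhs_10_19(a, beta))
    return min(gaps)

beta = 0.6
d = delta(beta)
extremal = [(-d) ** k for k in range(60)]   # a_n = (-delta)^(n-1), truncated
print(smallest_gap(beta), lhs_10_19(extremal, beta) - rhs_10_19(extremal, beta))
```

The truncation at 60 terms is harmless here because the extremal coefficients decay geometrically.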
For the second part, in the case $V(x) = rx$ with $r > 0$, notice that if there is a $C > 0$ so that (10.16) holds with $C$ instead of $s/4\pi^2$, then, following the same argument as above, we would have the equivalent of (10.19) as
\[
\sum_{n\ge1}\big(n^2a_n^2 + n(n+1)a_na_{n+1}\big) \ge C\sum_{n\ge1} n a_n^2.
\]
Taking in this $a_n = \frac{(-\gamma)^n}{n}$ for $0 < \gamma < 1$, we have that $\gamma^2/(\gamma+1) \ge -C\log(1-\gamma^2)$, and this is certainly false for $\gamma$ close to 1.

For Eq. (10.17), notice that in this case the equilibrium measure is $\mu_V(dx) = \frac{r\sqrt{b-x}}{2\pi\sqrt{x}}\,dx$ and then after a simple rescaling this follows from Eq. (7.8). This completes the proof of the theorem. □

It is interesting to look at this inequality as a spectral gap result as in Section 9. For example in the case of the Marcenko–Pastur measure ($Q(x) = rx$), the inequality (10.16) is actually equivalent to inequality (10.18). Using the interpretation from Section 9, we can rephrase this as, for a given $\beta \in (0,1)$,
\[
\int (1+\beta x)(1-x^2)\,\phi'(x)^2\,\nu_0(dx) \ge \sqrt{1-\beta^2}\,\langle N\phi,\phi\rangle_{\nu_0}
\]
where $\nu_0$ is the arcsine law on $[-1,1]$ and $N$ is the number operator. Now we can define the operator
\[
L_\beta\phi(x) = -(1+\beta x)(1-x^2)\,\phi''(x) - \big(\beta - x - 2\beta x^2\big)\phi'(x).
\]
With this definition,
\[
\langle L_\beta\phi,\phi\rangle_{\nu_0} = \frac{1}{\pi}\int (1+\beta x)\,\phi'(x)^2\sqrt{1-x^2}\,dx
\]
and then inequality (10.18) becomes
\[
\langle L_\beta\phi,\phi\rangle_{\nu_0} \ge \sqrt{1-\beta^2}\,\langle N\phi,\phi\rangle_{\nu_0}
\]
for any smooth function $\phi$ on $[-1,1]$. In particular this means that $L_\beta \ge \sqrt{1-\beta^2}\,N$. On the other hand it is clear that the operator $L_\beta$ cannot be diagonalized by the Chebyshev polynomials of the first kind, therefore the orthogonal polynomial approach given in Section 9 does not work the same way here.

Remark 6. We want to point out that for the case $V(x) = rx - s\log(x)$ for $r > 0$ and $s \ge 0$, the parameter $r$ appears in the transportation, Log-Sobolev and HWI inequalities, while the parameter $s$ plays the dominant role in the Poincaré inequality.
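Inequality (10.17) for $V(x) = rx$ can be illustrated numerically on polynomial test functions. The sketch below is a hedged check, not the authors' method: it assumes $a = 0$ and the density $\frac{r}{2\pi}\sqrt{(b-x)/x}$ on $[0,b]$, whose normalisation forces $b = 4/r$, and it uses the substitution $x = \frac{b}{2}(1+\cos t)$ with a midpoint rule. To avoid dividing by zero, the difference quotient $(\phi(x)-\phi(y))/(x-y)$ is passed in closed form; the function names are hypothetical.

```python
import math

def nodes(b, n):
    """Midpoint nodes x_i = b(1 + cos t_i)/2, t_i uniform in [0, pi]."""
    return [0.5 * b * (1.0 + math.cos(math.pi * (i + 0.5) / n)) for i in range(n)]

def lhs_10_17(dphi, r, n=200):
    """Integral of x * phi'(x)^2 against mu_V; under the substitution,
    sqrt((b - x)/x) dx becomes (b - x) dt."""
    b = 4.0 / r
    avg = sum(x * dphi(x) ** 2 * (b - x) for x in nodes(b, n)) / n
    return 0.5 * r * avg

def rhs_10_17(diff_quotient, r, n=200):
    """r/(4 pi^2) times the double integral in (10.17) with a = 0; the weight
    dx dy / (sqrt(x(b-x)) sqrt(y(b-y))) becomes dt ds."""
    b = 4.0 / r
    xs = nodes(b, n)
    total = 0.0
    for x in xs:
        for y in xs:
            total += diff_quotient(x, y) ** 2 * (b * (x + y) - 2.0 * x * y) / 2.0
    return 0.25 * r * total / (n * n)

r = 2.0
# phi(x) = x: derivative 1, difference quotient 1 (equality case).
print(lhs_10_17(lambda x: 1.0, r), rhs_10_17(lambda x, y: 1.0, r))
# phi(x) = x^2: derivative 2x, difference quotient x + y (strict inequality).
print(lhs_10_17(lambda x: 2.0 * x, r), rhs_10_17(lambda x, y: x + y, r))
```

For $\phi(x) = x$ both sides equal $1/r$, in line with the stated equality case $\phi(x) = c_1 + c_2x$.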
Acknowledgments

We would like to thank D. Cabanal-Duvillard for pointing out to us the formula of the fluctuation for Wishart ensembles and for informing us about his Log-Sobolev conjecture in [7]. Many thanks to the anonymous referee for the pertinent and scholarly comments which pointed out several shortcomings of the submitted version and led to an overall improvement of this paper.

References

[1] G. Ben Arous, A. Guionnet, Large deviations for Wigner’s law and Voiculescu’s non-commutative entropy, Probab. Theory Related Fields 108 (2) (1997) 183–215.
[2] D. Bakry, L’hypercontractivité et son utilisation en théorie des semigroupes, in: Lectures on Probability Theory, Saint–Flour, 1992, in: Lecture Notes in Math., vol. 1581, Springer, Berlin, 1994, pp. 1–114.
[3] P. Biane, Logarithmic Sobolev inequalities, matrix models and free entropy, Acta Math. Sin. (Engl. Ser.) 19 (3) (2003) 497–506.
[4] G. Blower, The Gaussian isoperimetric inequality and transportation, Positivity 7 (3) (2003) 203–224.
[5] S.G. Bobkov, I. Gentil, M. Ledoux, Hypercontractivity of Hamilton–Jacobi equations, J. Math. Pures Appl. (9) 80 (7) (2001) 669–696.
[6] H.J. Brascamp, E.H. Lieb, On extensions of the Brunn–Minkowski and Prékopa–Leindler theorems, including inequalities for log concave functions, and with an application to the diffusion equation, J. Funct. Anal. 22 (4) (1976) 366–389.
[7] T. Cabanal-Duvillard, Probabilités libres et calcul stochastique, application aux grandes matrices aléatoires, Université Paris VI PhD thesis, 1999.
[8] T. Cabanal-Duvillard, Fluctuations de la loi empirique de grandes matrices aléatoires, Ann. Inst. H. Poincaré Probab. Statist. 37 (3) (2001) 373–402.
[9] D. Cordero-Erausquin, Some applications of mass transport to Gaussian-type inequalities, Arch. Ration. Mech. Anal. 161 (3) (2002) 257–269.
[10] I. Csiszár, Information-type measures of difference of probability distributions and indirect observations, Studia Sci. Math. Hungar. 2 (1967) 299–318.
[11] P.A. Deift, Orthogonal Polynomials and Random Matrices: A Riemann–Hilbert Approach, Courant Lecture Notes in Math., vol. 3, New York University, Courant Institute of Mathematical Sciences, New York, 1999.
[12] R.J. Gardner, The Brunn–Minkowski inequality, Bull. Amer. Math. Soc. (N.S.) 39 (3) (2002) 355–405 (in electronic).
[13] N.A. Gozlan, A characterization of dimension free concentration in terms of transportation inequalities, preprint, 2008.
[14] L. Gross, Logarithmic Sobolev inequalities, Amer. J. Math. 97 (4) (1975) 1061–1083.
[15] U. Haagerup, Seminar notes on free probability.
[16] F. Hiai, D. Petz, Eigenvalue density of the Wishart matrix and large deviations, Infin. Dimens. Anal. Quantum Probab. 1 (1998) 633–646.
[17] F. Hiai, D. Petz, The Semicircle Law, Free Random Variables and Entropy, Math. Surveys Monogr., vol. 77, Amer. Math. Soc., Providence, RI, 2000.
[18] F. Hiai, D. Petz, Y. Ueda, Free transportation cost inequalities via random matrix approximation, Probab. Theory Related Fields 130 (2004) 199–221.
[19] K. Johansson, On fluctuations of random hermitian matrices, Duke Math. J. 91 (1998) 1–24.
[20] D. Jonsson, Some limit theorems for the eigenvalues of a sample covariance matrix, J. Multivariate Anal. 12 (1) (1982) 1–38.
[21] J.H.B. Kemperman, On the optimum rate of transmitting information, Ann. Math. Statist. 40 (1969) 2156–2177.
[22] A.M. Khorunzhy, B.A. Khoruzhenko, L. Pastur, Asymptotic properties of large random matrices with independent entries, J. Math. Phys. 37 (10) (1996) 5033–5060.
[23] M. Ledoux, The Concentration of Measure Phenomenon, Math. Surveys Monogr., vol. 89, Amer. Math. Soc., Providence, RI, 2001.
[24] M. Ledoux, A (one-dimensional) free Brunn–Minkowski inequality, C. R. Acad. Sci. Paris 340 (2005) 301–304.
[25] F. Otto, C. Villani, Generalization of an inequality by Talagrand and links with the logarithmic Sobolev inequality, J. Funct. Anal. 173 (2) (2000) 361–400.
[26] I. Popescu, Talagrand inequality for the semicircular law and energy of the eigenvalues of beta ensembles, Math. Res. Lett. 14 (6) (2007) 1023–1032.
[27] E.B. Saff, V. Totik, Logarithmic Potentials with External Fields, Grundlehren Math. Wiss. (Fundamental Principles of Mathematical Sciences), vol. 316, Springer-Verlag, Berlin, 1997.
[28] M. Talagrand, Transportation cost for Gaussian and other product measures, Geom. Funct. Anal. 6 (3) (1996) 587–600.
[29] C. Villani, Topics in Optimal Transportation, Grad. Stud. Math., vol. 58, Amer. Math. Soc., Providence, RI, 2003.
[30] D. Voiculescu, The analogues of entropy and of Fisher’s information measure in free probability theory. V. Noncommutative Hilbert transforms, Invent. Math. 132 (1) (1998) 189–227.
[31] D. Voiculescu, The analogues of entropy and of Fisher’s information measure in free probability theory. I, Comm. Math. Phys. 155 (1) (1993) 71–92.
[32] F.-Y. Wang, Functional Inequalities, Markov Properties and Spectral Theory, Science Press, Beijing–New York, 2005.
[33] G.N. Watson, Notes on generating functions of polynomials—(3) polynomials of Legendre and Gegenbauer, J. London Math. Soc. 8 (1933) 289–292.
Journal of Functional Analysis 257 (2009) 1222–1250 www.elsevier.com/locate/jfa
Holomorphic transforms with application to affine processes ✩ Denis Belomestny, Jörg Kampen ∗ , John Schoenmakers Weierstraß Institute for Applied Analysis and Stochastics (WIAS), Mohrenstraße 39, 10117 Berlin, Germany Received 16 March 2009; accepted 20 March 2009 Available online 5 April 2009 Communicated by Paul Malliavin
Abstract In a rather general setting of Itô–Lévy processes we study a class of transforms (Fourier for example) of the state variable of a process which are holomorphic in some disc around time zero in the complex plane. We show that such transforms are related to a system of analytic vectors for the generator of the process, and we state conditions which allow for holomorphic extension of these transforms into a strip which contains the positive real axis. Based on these extensions we develop a functional series expansion of these transforms in terms of the constituents of the generator. As application, we show that for multi-dimensional affine Itô–Lévy processes with state dependent jump part the Fourier transform is holomorphic in a time strip under some stationarity conditions, and give log-affine series representations for the transform. © 2009 Elsevier Inc. All rights reserved. Keywords: Itô–Lévy processes; Holomorphic transforms; Affine processes
1. Introduction

Transforms are an important tool in the theory of (ordinary and partial) differential equations and in stochastic analysis. In probability theory the Fourier transform of a random variable, which represents the characteristic function of the corresponding distribution, is widely used.

✩ Partially supported by SFB 649 ‘Economic Risk’ and by DFG Research Center MATHEON ‘Mathematics for Key Technologies’ in Berlin.
* Corresponding author. E-mail addresses: [email protected] (D. Belomestny), [email protected] (J. Kampen), [email protected] (J. Schoenmakers).
0022-1236/$ – see front matter © 2009 Elsevier Inc. All rights reserved. doi:10.1016/j.jfa.2009.03.013
D. Belomestny et al. / Journal of Functional Analysis 257 (2009) 1222–1250
Fourier (and Laplace) transforms have become increasingly popular in mathematical finance as well. On the one hand, for example, via a complex Laplace transform and the convolution theorem one may derive pricing formulas for European options (e.g., see [7,8]). On the other hand, Laplace and Fourier transforms are known in closed form for many classes of processes. A famous example is the so-called Lévy–Khintchine formula which provides an explicit expression for the characteristic function of a Lévy process. More recent financial literature goes well beyond Lévy processes and attempts to establish explicit or semi-explicit formulas for derivatives where underlyings are modelled, for instance, by affine processes (see among others [5,6,9]) or more general Itô–Lévy processes (e.g., see [14]).

Theoretical analysis of affine processes is done in the seminal paper by Duffie, Filipović and Schachermayer [5] and has led to a unique characterization of affine processes. In particular, it is shown that the problem of determining the (conditional) Fourier transform of an affine process $X_s$ corresponds to the problem of solving a system of generalized Riccati differential equations in the time variable $s$ (see [5]). Although closed form solutions of this system can be found in important cases, there is no generic approach to solve such a system in the general multi-dimensional case.

In this article we establish some kind of functional series representation for the Fourier transform, hence the characteristic function of the process under consideration and, in principle, for more general transforms. The most natural one is a Taylor expansion in time $s$ around $s_0 = 0$. Unfortunately, it turns out that in many cases the resulting power series converges only at $s = s_0 = 0$.
This problem corresponds to a difficulty which is well known in semi-group theory and in the theory of parabolic differential equations: small time expansions for the solutions of parabolic equations are usually possible in a neighborhood of some $s_0 > 0$, while an expansion around $s_0 = 0$ may have zero convergence radius. In this paper we prove that for a generator with affine coefficients the Fourier transform extends holomorphically into a disc around $s_0 = 0$ and a strip containing the positive real axis, under some mild regularity conditions. Then, for multi-dimensional affine processes we obtain convergent expansions for the Fourier transform and its logarithm on the whole time line. Hence, we have (affine) series representations for the exponent of the characteristic function of a general multi-dimensional affine process. More generally, we develop a framework based on a concept of analytic vectors which allows for functional series expansions for a class of holomorphic transforms which covers the standard Fourier transform.

The outline of the paper is as follows. The basic setup is described in Section 2. In Section 3 we introduce the notion of analytic vectors associated with a given generator and study functional series expansions for the corresponding transform. Section 4 is devoted to the Cauchy problem for affine generators. In Section 5 we derive series representations for the logarithm of the Fourier transform corresponding to a generator with affine coefficients. Section 6 gives an explicit representation for these expansions in a one-dimensional case. Finally, Section 7 contains results for affine Itô–Lévy processes which mainly follow from previous sections. More technical proofs are given in Appendix A.

2. Basic setup

Let $(\Omega, \mathcal{F}, (\mathcal{F}_t)_{t\ge0}, P)$ be a standard filtered probability space where the filtration $(\mathcal{F}_t)$ satisfies ‘the usual conditions’.
On this space we consider for each $x \in \mathbb{R}^n$ a compensated Poisson random measure $\tilde N(x, dt, dz, \omega) = N(x, dt, dz, \omega) - v(x, dz)\,dt$ on $\mathbb{R}_+ \times \mathbb{R}^n$, where $N$ is a Poisson random measure with (deterministic) intensity kernel of the form $v(x, dz)\,dt = \mathbb{E}N(x, dt, dz)$
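satisfying $v(x, B) < \infty$ for any $B \in \mathcal{B}(\mathbb{R}^n)$ such that $0 \notin \bar B$ (closure of $B$). Hence $N$ is determined by
\[
P\big(N(x, (0,t], B) = k\big) = \exp\big(-t\,v(x,B)\big)\,\frac{t^k v^k(x,B)}{k!}, \qquad k = 0, 1, 2, \ldots.
\]
In particular, for any $B \in \mathcal{B}(\mathbb{R})$ with $0 \notin \bar B$ and $x \in \mathbb{R}^n$, the process $M_t^{B,x} := \tilde N(x, (0,t], B)$ is a (true) martingale. Further, we assume that the kernel $v$ satisfies
\[
v\big(x, \{0\}\big) = 0, \qquad \int_{\mathbb{R}^n}\big(|z|^2 \wedge |z|\big)\,v(x, dz) < \infty, \qquad x \in \mathbb{R}^n.
\]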
Let us assume $W(t)$ to be a standard Brownian motion in $\mathbb{R}^m$ living on our basic probability space, and consider the Itô–Lévy SDE:
\[
dX_t = b(X_t)\,dt + \sigma(X_t)\,dW(t) + \int_{\mathbb{R}^n} z\,\tilde N(X_{t-}, dt, dz), \qquad X_0 = x, \tag{2.1}
\]
for deterministic functions $b : \mathbb{R}^n \to \mathbb{R}^n$, $\sigma : \mathbb{R}^n \to \mathbb{R}^n \times \mathbb{R}^m$, which satisfy sufficient regularity and/or mutual consistency conditions such that (2.1) has a unique strong solution $X$, called an Itô–Lévy process, which can be regarded as a strong Markov process (e.g., see [14,15]).

As a well-known fact, the above process $X$ can be connected to some kind of evolution equation in a natural way. In this context we consider a ‘pseudo generator’
\[
A^\sharp : D(A^\sharp) \subset C^{(2)} \subset C \longrightarrow C, \tag{2.2}
\]
where $C := C(\mathbb{R}^n)$ is the space of continuous functions $f : \mathbb{R}^n \to \mathbb{C}$, equipped with the topology of uniform convergence on compacta, and $C^{(2)}$ is the space of functions $f \in C$ which are two times continuously differentiable. Further, $f \in D(A^\sharp)$ iff $f \in C^{(2)}$, and
\[
A^\sharp f(x) := \frac{1}{2}\sum_{i,j=1}^n a_{ij}(x)\frac{\partial^2 f}{\partial x_i\partial x_j} + \sum_{i=1}^n b_i(x)\frac{\partial f}{\partial x_i}
+ \int_{\mathbb{R}^n}\Big[f(x+z) - f(x) - \frac{\partial f}{\partial x}\cdot z\Big]\,v(x, dz), \qquad \text{with } a := \sigma\sigma^\top, \tag{2.3}
\]
exists and is such that $A^\sharp f \in C$. In this respect we assume that the building blocks of the operator (2.3), $a(x)$, $b(x)$, and $v(x, B)$ for any $B$ with $0 \notin \bar B$, have bounded derivatives of any order. Clearly, $D(A^\sharp)$ is dense in $C$ and we henceforth require that the operator $A^\sharp$ thus defined is closable. In general, closability of an integro-differential operator of type (2.3) will in particular depend on the characteristics of the measure $v$ and the chosen topology. In this respect, the following proposition provides sufficient conditions on the measure $v$ for the above specified topology.
Proposition 2.1. Let the measure $v$ be of the form $v(x, dz) = \gamma(x, z)\,\nu(dz)$, where the function $\gamma$ has bounded partial derivatives of any order, and $\nu$ is a Borel measure with bounded support satisfying $\int |z|^2\,\nu(dz) < \infty$. Then the operator $A^\sharp$ in (2.3) is closable.

The proof is given in Appendix A.

Remark 2.2. Without going into further detail we note that the above closability restrictions on the measure $v(x, dz)$ may be relaxed, to unbounded support for example, when the domain of $A^\sharp$ is restricted to functions with certain growth constraints. However, the restrictions of Proposition 2.1 do not exclude (pure) jump-processes with infinite activity. Also, in practice, heavy tailed jump measures restricted to a large enough bounded domain may be considered.

The closure of $A^\sharp$ is denoted by
\[
A : D(A) \subset C \longrightarrow C. \tag{2.4}
\]
As such $A$ can be seen as a relaxation of the notion of a generator of a strongly continuous Feller–Dynkin semigroup associated with the process $X$, for which the Hille–Yosida theorem applies. This semigroup is usually defined on the Banach space $C_0(\mathbb{R}^n)$, i.e. the set of continuous functions on $\mathbb{R}^n$ which vanish at infinity, equipped with the supremum norm, and its generator coincides literally with $A$ on a dense subdomain of $C_0(\mathbb{R}^n)$. By slight abuse of terminology however, we will also refer to $A$ as ‘generator’ when $A$ is considered in connection with the process $X$ given via (2.1).

Let now $F := \{f_u,\ u \in I\} \subset C$ be a dense subset of bounded continuous functions $f_u : \mathbb{R}^n \to \mathbb{C}$ which have bounded derivatives of any order. With respect to the (closed) generator (2.4) we consider for each $f_u \in F$ the (generalized) Cauchy problem
\[
\frac{\partial \hat p}{\partial s}(s, x, u) = A\hat p(s, x, u), \qquad \hat p(0, x, u) = f_u(x), \qquad s \ge 0,\ x \in X \subset \mathbb{R}^n, \tag{2.5}
\]
where $X$ is some open (maximal) domain, and assume that problem (2.5) has a unique solution for $0 \le s < \infty$. In this context existence and uniqueness results can be found in [10] (see also [2] for generalizations and results for unbounded initial data). Another proof can be obtained via purely stochastic methods using Malliavin calculus, see [3]. In particular, if some global ellipticity condition is satisfied we may have $X = \mathbb{R}^n$. For mixed type operators, i.e. for operators where the type of the second order differential part may vary in space (which may happen for differential operators with affine coefficients for example), existence, uniqueness, and the maximal domain $X$ have to be considered case by case. We underline, however, that in this article the main focus is on functional series representations for the solution of (2.5), and we therefore merely assume that sufficient regularity conditions for the coefficients in (2.3) (hence (2.1)) are fulfilled.

Remark 2.3. In our analysis we often consider the pseudo generator (2.3) and its closure (2.4) on $C := C(X)$, for an open domain $X \subset \mathbb{R}^n$, rather than $C(\mathbb{R}^n)$. For notational convenience (while slightly abusing notation) we will denote these respective operators with $A^\sharp$ and $A$ also.
If $A$ is the generator of the process (2.1), the solution $\hat p(s, x, u)$ has the probabilistic representation
\[
\hat p(s, x, u) = \mathbb{E}\big[f_u\big(X_s^{0,x}\big)\big],
\]
where $X^{0,x}$ is the unique strong solution of (2.1) with $X_0^{0,x} = x$ a.s. We refer to $\hat p(s, x, u)$ as generalized transform of the process $X_s^{0,x}$ associated with $F$. As a canonical example we may consider
\[
f_u(x) := e^{iu^\top x}, \qquad u \in \mathbb{R}^n, \tag{2.6}
\]
in which case (2.5) yields the characteristic function $\hat p(s, x, u) = \mathbb{E}\big[e^{iu^\top X_s^{0,x}}\big]$.

By using multi-index notation, the integral term in (2.3) may be formally expanded as
\[
\int_{\mathbb{R}^n}\Big[f(x+z) - f(x) - \frac{\partial f}{\partial x}\cdot z\Big]\,v(x, dz)
= \sum_{|\alpha|\ge2}\frac{1}{\alpha!}\,\partial_{x^\alpha}f(x)\int z^\alpha\,v(x, dz)
=: \sum_{|\alpha|\ge2}\frac{1}{\alpha!}\,m_\alpha(x)\,\partial_{x^\alpha}f(x).
\]
Hence, we may write formally the generator as an infinite order differential operator
\[
A = \sum_{|\alpha|>0} a_\alpha(x)\,\partial_{x^\alpha} \tag{2.7}
\]
with obvious definitions of the coefficients $a_\alpha(x)$ for $|\alpha| > 0$.

3. Analytic vectors and transforms

First we introduce the notion of a set of analytic vectors associated with an operator $A$.

Definition 3.1. $F = \{f_u,\ u \in I\}$ is a set of analytic vectors for an operator $A$ in an open region $X$, if
(i) $A^k f_u$ exists for any $u \in I$ and $k \in \mathbb{N}$,
(ii) for every $u \in I$ there exists $R_u > 0$ such that for all $x \in X$,
\[
\lim_{k\to\infty}\,\sup_{r\ge k}\,\sqrt[r]{\frac{|A^r f_u(x)|}{r!}} \le R_u^{-1},
\]
where the limit is uniform over any compact subset of $X$ (hence a locally uniform Cauchy–Hadamard criterion).

Example 3.2. For unbounded self-adjoint operators on a Hilbert space, analytic vectors can be constructed via their spectral resolution [16]. In [18] unbounded operators on sequence spaces are studied which are represented by exponentiable infinite matrices. For such operators the standard basis in fact forms a system of analytic vectors.
If $F$ is a set of analytic vectors in the sense of Definition 3.1 then for all $x \in X$ the map
\[
s \mapsto P_s f_u(x) := \sum_{k=0}^\infty \frac{s^k}{k!}\,A^k f_u(x), \qquad |s| < R_u, \tag{3.1}
\]
is holomorphic in the complex disc $D_0 := \{s \in \mathbb{C} : |s| < R_u\}$ and the series converges uniformly in $x$ over any compact subset of $X$. In fact, $P_s f_u(x)$ coincides with $\hat p(s, x, u)$ for $0 \le s < R_u$.

Proposition 3.3. If $F$ is a set of analytic vectors, the map $(s, x) \mapsto P_s f_u(x)$ defined in (3.1) satisfies (2.5) for all $s$, $|s| < R_u$ and $x \in X$. In particular we have $P_s f_u(x) = \hat p(s, x, u)$, $0 \le s < R_u$.

Proof. Obviously, $P_0 f_u(x) = f_u(x)$ for $x \in X$. Set (see Remark 2.3)
\[
P_s^{(N)} f_u(x) := \sum_{k=0}^N \frac{s^k}{k!}\,A^k f_u(x),
\]
then both $P_s^{(N)} f_u(x)$ and
\[
A P_s^{(N)} f_u(x) = \sum_{k=0}^N \frac{s^k}{k!}\,A^{k+1} f_u(x)
\]
converge uniformly for any $x$ in a compact subset of $X$ and for any $s$ satisfying $|s| < R_u - \varepsilon$ with arbitrary small $\varepsilon$. Hence, since $A$ is closed,
\[
A P_s f_u(x) = \sum_{k=0}^\infty \frac{s^k}{k!}\,A^{k+1} f_u(x) = \frac{\partial}{\partial s}\sum_{k=0}^\infty \frac{s^k}{k!}\,A^k f_u(x) = \frac{\partial}{\partial s}P_s f_u(x),
\]
and we are done. □
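The power series (3.1) is easiest to see in a finite-dimensional toy case. The sketch below is an illustration under a strong simplifying assumption that is not made in the paper: the generator is replaced by the $2\times2$ rate matrix of a two-state Markov chain (rates `a` and `b` are hypothetical), for which every vector is analytic and the transition semigroup has a known closed form. The partial sums of (3.1) are compared with that closed form.

```python
import math

def apply_generator(Q, f):
    """Matrix-vector product (Q f)_i = sum_j Q[i][j] f[j]."""
    return [sum(Q[i][j] * f[j] for j in range(len(f))) for i in range(len(f))]

def power_series_transform(Q, f, s, n_terms=60):
    """Partial sum of (3.1): sum over k < n_terms of s^k / k! * Q^k f."""
    term = list(f)       # k = 0 term
    total = list(f)
    for k in range(1, n_terms):
        # s^k/k! Q^k f obtained from the previous term by one more Q and s/k
        term = [s / k * v for v in apply_generator(Q, term)]
        total = [t + v for t, v in zip(total, term)]
    return total

def two_state_closed_form(a, b, f, s):
    """E[f(X_s) | X_0 = i] for the chain with rates a (0 -> 1) and b (1 -> 0)."""
    mean = (b * f[0] + a * f[1]) / (a + b)     # stationary average of f
    decay = math.exp(-(a + b) * s)
    return [mean + decay * (v - mean) for v in f]

a, b = 1.5, 0.5
Q = [[-a, a], [b, -b]]
f = [1.0, -2.0]
print(power_series_transform(Q, f, 0.8), two_state_closed_form(a, b, f, 0.8))
```

For a bounded generator the series converges for every $s$; the difficulty addressed in this section arises only for unbounded operators.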
In order to study generalized transforms associated with a set of analytical vectors $F$ in domains containing the non-negative real axis we introduce for $\eta > 0$ the sequence
\[
\begin{aligned}
q_k^{(\eta)}(x, u) &:= \frac{1}{k!}\prod_{r=0}^{k-1}\big(\eta^{-1}A + rI\big)f_u(x)\\
&=: \frac{1}{k!}\sum_{r=0}^{k} c_{k,r}\,\eta^{-r}A^r f_u(x), \qquad x \in X,\ u \in I,\ k = 0, 1, 2, \ldots.
\end{aligned} \tag{3.2}
\]
In (3.3) the coefficients $c_{k,r}$, $0 \le r \le k$, are determined by the identity
\[
\prod_{r=0}^{k-1}(z + r) = z(z+1)\cdots(z+k-1) \equiv \sum_{r=0}^{k} c_{k,r}\,z^r, \tag{3.3}
\]
and are usually called unsigned Stirling numbers of the first kind. These numbers satisfy $c_{0,0} = 1$ and
\[
c_{k,0} \equiv 0, \qquad c_{k,k} \equiv 1, \qquad c_{k+1,j} = k\,c_{k,j} + c_{k,j-1}, \quad 1 \le j \le k, \tag{3.4}
\]
if $k \ge 1$. Obviously, the following recursion is equivalent to (3.2),
\[
(k+1)\,q_{k+1}^{(\eta)}(x, u) = \eta^{-1}A\,q_k^{(\eta)}(x, u) + k\,q_k^{(\eta)}(x, u), \qquad k \ge 0,\ x \in X. \tag{3.5}
\]
The next theorem provides a functional series representation for the solution of (2.5) for all $s \ge 0$, under certain conditions.

Theorem 3.4. Let F be a set of analytic vectors in the sense of Definition 3.1, $u \in I$ be fixed, and the sequence $(q_k^{(\eta)})$ be defined as in (3.3). Let $\hat p$ be the solution of the Cauchy problem (2.5). Then the following statements are equivalent:

(i) There exists a constant $R_u > 0$ such that for each $x \in X$, the map $s \to \hat p(s, x, u)$ has a holomorphic extension to the domain
\[ G_{R_u} := \{z: |z| < R_u\} \cup \{z: \operatorname{Re} z > 0 \wedge |\operatorname{Im} z| < R_u\}, \]
see Fig. 1.

(ii) There exists an $\eta_u > 0$ such that for each $x \in X$ the following series representation holds:
\[ \hat p(s, x, u) = \sum_{k=0}^{\infty} q_k^{(\eta_u)}(x, u) \bigl(1 - e^{-\eta_u s}\bigr)^k, \qquad 0 \le s < \infty. \]
Moreover, the series converges uniformly for $(x, s)$ running through any compact subset of $X \times \{s \in \mathbb{R}: 0 \le s < R_u\}$.

(iii) The solution $\hat p$ of the Cauchy problem (2.5) is holomorphically extendable to $[0, \infty)$, there exists $\eta_u > 0$ such that
\[ \limsup_{k \to \infty} \sqrt[k]{\bigl|q_k^{(\eta_u)}(x, u)\bigr|} \le 1, \qquad x \in X, \tag{3.6} \]
and, there exists $\varepsilon_u$, $0 < \varepsilon_u < 1$, such that the series
\[ \sum_{k=0}^{\infty} q_k^{(\eta_u)}(x, u)\, w^k \tag{3.7} \]
converges uniformly for $(x, w)$ running through any compact subset of $X \times \{w \in \mathbb{C}: |w| < 1 - \varepsilon_u\}$.

Proof. See Appendix A. □
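To see the representation in (ii) at work, consider the simplest possible situation, which is an assumption made here for illustration and not an example from the paper: $f_u$ is an eigenfunction, $A f_u = -\lambda f_u$. Then recursion (3.5) becomes scalar, $(k+1) q_{k+1} = (-\lambda/\eta + k)\, q_k$, and the series in (ii) reduces to the binomial series of $(1 - w)^{\lambda/\eta}$ with $w = 1 - e^{-\eta s}$, which equals $e^{-\lambda s}$. The sketch below sums the series and compares it with the exponential; `series_representation` is a hypothetical helper name.

```python
import cmath
import math


def series_representation(lam, eta, s, n_terms=400):
    """Partial sum of sum_k q_k * (1 - exp(-eta * s))**k, normalised by f_u(x),
    where q_0 = 1 and (k + 1) q_{k+1} = (-lam / eta + k) q_k.
    This is recursion (3.5) under the illustrative assumption A f_u = -lam * f_u."""
    w = 1.0 - math.exp(-eta * s)
    q = 1.0 + 0.0j      # q_0 / f_u(x)
    w_power = 1.0       # w**k
    total = 0.0j
    for k in range(n_terms):
        total += q * w_power
        q = q * (-lam / eta + k) / (k + 1)
        w_power *= w
    return total


if __name__ == "__main__":
    lam, eta = 0.7 + 0.3j, 1.0
    for s in (0.5, 1.0, 2.0):
        approx = series_representation(lam, eta, s)
        exact = cmath.exp(-lam * s)
        print(s, abs(approx - exact))
```

The partial sums converge geometrically in $w$, so the number of terms needed grows as $s$ increases and $w$ approaches 1.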
Fig. 1. Domain GRu on the complex plane.
Remark 3.5. From the proof of Theorem 3.4 it is clear that the implication (iii) ⇒ (i), where statement (iii) consists of (3.6), and (3.7) with $\varepsilon_u = 0$, holds as well. That is, loosely speaking, if in (iii) series (3.7) converges uniformly on all compact subsets of $X \times \{w \in \mathbb{C}: |w| < 1\}$, the holomorphy assumption on $\hat p$ can be dropped.

Remark 3.6. In order to use the representation in (ii) one has to choose $\eta_u$. In fact, $\eta_u$ can be related to $R_u$ via $\eta_u = \pi/R_u$ and hence increases with decreasing $R_u$.

It is important to note that Theorem 3.4 concerns the solution of the Cauchy problem (2.5) connected with a general operator A. In particular, all criteria in this theorem are of pure analytic nature and via (3.2), respectively (3.3), exclusively formulated in terms of the $A^k f_u(x)$, i.e. coefficients in Definition 3.1. In the case where A is the generator of a Feller–Dynkin process one can formulate a sufficient probabilistic criterion for Theorem 3.4(i).

Proposition 3.7. Let F be a set of analytic vectors in the sense of Definition 3.1 and let the Markov process $\{X_t\}$ be associated with the generator A. If in addition, for every $u \in I$ there exists a radius $R_u$ such that for any $t \ge 0$
\[ \sum_{k=0}^{\infty} \frac{s^k}{k!} \Bigl| A^k E\bigl[f_u\bigl(X_t^{0,x}\bigr)\bigr] \Bigr| < \infty, \qquad 0 \le s < R_u, \tag{3.8} \]
uniformly in $x$ over any compact subset of $X$, then Theorem 3.4(i) holds.

The statement is a direct consequence of the following “quasi” semi-group property of the transition operator $P_t$.
Proposition 3.8. Let F be a set of analytic vectors satisfying (3.8). Then, for all $x \in X$ and all $t \ge 0$, the generalized transform $\hat p(t + s, x, u)$ can be represented as
\[ \hat p(t + s, x, u) = \sum_{k=0}^{\infty} \frac{s^k}{k!} A^k E\bigl[f_u\bigl(X_t^{0,x}\bigr)\bigr], \qquad 0 \le s < R_u, \tag{3.9} \]
where the series converges uniformly in $x$ over any compact subset of $X$.

Proof. Denote the right-hand side of (3.9) by $\tilde p(t, s, x, u)$. Obviously, $\tilde p(t, 0, x, u) = E[f_u(X_t^{0,x})]$. Set
\[ \tilde p^{(N)}(t, s, x, u) := \sum_{k=0}^{N} \frac{s^k}{k!} A^k E\bigl[f_u\bigl(X_t^{0,x}\bigr)\bigr], \]
then both $\tilde p^{(N)}(t, s, x, u)$ and
\[ A \tilde p^{(N)}(t, s, x, u) = \sum_{k=0}^{N} \frac{s^k}{k!} A^{k+1} E\bigl[f_u\bigl(X_t^{0,x}\bigr)\bigr] \]
converge for $N \to \infty$ uniformly over any compact subset of $X$, and $s$ with $|s| < R_u - \varepsilon$, for an arbitrary small $\varepsilon > 0$. Hence, for $|s| < R_u - \varepsilon$, we have
\[ \frac{\partial}{\partial s} \tilde p^{(N)}(t, s, x, u) = \sum_{k=0}^{N-1} \frac{s^k}{k!} A^{k+1} E\bigl[f_u\bigl(X_t^{0,x}\bigr)\bigr] = A \tilde p^{(N-1)}(t, s, x, u), \qquad \tilde p(t, 0, x, u) = \hat p(t, x, u), \]
and thus, by closedness of the operator A and uniqueness of the Cauchy problem (2.5)–(2.7), we have $\tilde p(t, s, x, u) = \hat p(t + s, x, u)$. □

The following proposition provides a situation in a semigroup context where a much stronger version of the condition (i) in Theorem 3.4 applies. It also sheds light on the connection between semi-group theory and holomorphic properties of generalized transforms.

Proposition 3.9. Let $C_0(\mathbb{R}^n)$ be the Banach space of continuous functions $f: \mathbb{R}^n \to \mathbb{C}$ which vanish at infinity, equipped with supremum norm: $\|f\| := \sup_{x \in \mathbb{R}^n} |f(x)|$. Let $A: D(A) \subset C_0(\mathbb{R}^n) \to C_0(\mathbb{R}^n)$ be the generator of the Feller–Dynkin semi-group $(P_s)_{s \ge 0}$ associated with the process X, i.e. $P_s f(x) = E[f(X_s^{0,x})]$, $f \in C_0(\mathbb{R}^n)$. Suppose that the family F is such that $f_u \in D(A^k)$ for each $u \in I$ and all integer $k \ge 0$, and that for each $u \in I$,
\[ \sum_{k=0}^{\infty} \frac{s^k}{k!} \bigl\|A^k f_u\bigr\| < \infty, \qquad 0 \le s < R_u. \]
Then for each $u \in I$,
\[ P_s f_u = \sum_{k=0}^{\infty} \frac{s^k}{k!} A^k f_u, \qquad 0 \le s < R_u, \tag{3.10} \]
with convergence in $C_0(\mathbb{R}^n)$. Thus, the map $s \to P_s f_u$ for $0 \le s < R_u$ extends via (3.10) to the complex disc $D_0 := \{s \in \mathbb{C}: |s| < R_u\}$. In particular, for each $x \in \mathbb{R}^n$ the map $s \to P_s f_u(x)$ is holomorphic in $D_0$. Moreover, for each $t \ge 0$, we may extend the map $s \to P_{t+s} f_u$, $0 \le s < R_u$, to the disc $D_0$ via
\[ P_{s+t} f_u = P_t P_s f_u = \sum_{k=0}^{\infty} \frac{s^k}{k!} P_t A^k f_u = \sum_{k=0}^{\infty} \frac{s^k}{k!} A^k P_t f_u, \qquad s \in D_0. \tag{3.11} \]

Proof. See Appendix A. □
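A finite-state analogue may help to make (3.10) concrete. The sketch below is an illustration under assumptions that are not part of the paper: the state space has two points, the generator is the rate matrix $Q = \begin{pmatrix} -a & a \\ b & -b \end{pmatrix}$, and $f$ is any vector. Since $Q$ is bounded, the exponential series converges for every $s$, and it can be compared with the well-known closed form of the two-state transition semigroup. All function names are hypothetical.

```python
import math


def mat_vec(Q, v):
    """Plain matrix-vector product for small dense matrices."""
    return [sum(Q[i][j] * v[j] for j in range(len(v))) for i in range(len(Q))]


def taylor_semigroup(Q, f, s, n_terms=60):
    """Partial sum of sum_k s^k / k! * Q^k f, the finite-state analogue of (3.10)."""
    term = list(f)      # holds s^k / k! * Q^k f
    total = list(f)
    for k in range(1, n_terms):
        term = [s / k * x for x in mat_vec(Q, term)]
        total = [t + x for t, x in zip(total, term)]
    return total


def two_state_exact(a, b, f, s):
    """Closed form of exp(sQ) f for Q = [[-a, a], [b, -b]]."""
    mean = (b * f[0] + a * f[1]) / (a + b)   # integral of f against the stationary law
    decay = math.exp(-(a + b) * s)
    return [mean + decay * (f[0] - mean), mean + decay * (f[1] - mean)]


if __name__ == "__main__":
    a, b, f, s = 1.5, 0.5, [1.0, -2.0], 0.8
    Q = [[-a, a], [b, -b]]
    print(taylor_semigroup(Q, f, s))
    print(two_state_exact(a, b, f, s))
```

In the setting of Proposition 3.9 the generator is typically unbounded, so the radius $R_u$ is finite and depends on $f_u$; the bounded toy case only shows the mechanics of the expansion.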
Under the conditions of Proposition 3.9, $F = \{f_u, u \in I\}$ is a set of analytic vectors for the generator A in the sense of Definition 3.1 with $X = \mathbb{R}^n$. Moreover, due to Proposition 3.9 the map
\[ s \to P_{s+t} f_u(x) = \sum_{k=0}^{\infty} \frac{s^k}{k!} P_t A^k f_u(x) = \sum_{k=0}^{\infty} \frac{s^k}{k!} A^k E\bigl[f_u\bigl(X_t^{0,x}\bigr)\bigr] \]
is holomorphic in $D_0$ for each $x \in \mathbb{R}^n$ and hence Theorem 3.4(i) is fulfilled.

In this paper we do not stick to the semigroup framework because we want to avoid the narrow corset conditions of Proposition 3.9. We also want to consider operators A with unbounded (for instance, affine) coefficients and sets F of functions that do not vanish at infinity (for example, (2.6)). Such situations may lead to the violation of condition $f_u \in D(A^k)$, $k \in \mathbb{N}$, in the sense of Proposition 3.9. In particular, in the next Sections 4–5 we will focus on general operators A with affine coefficients and in Section 7 on affine processes related to affine generators satisfying a kind of admissibility conditions.

4. Affine generators

Let us now consider generators of the form (2.3) with affine coefficients. In this section A may or may not be a generator of some Feller–Dynkin process. The next theorem and its corollaries say that the extended Fourier basis $\{\tilde f_u(x) := e^{iu^\top x}, u \in \mathbb{C}^n\}$ is a set of analytical vectors for an operator A of the form (2.7), where the coefficients $a_\alpha(x)$ are affine functions of $x$ and satisfy certain growth conditions for $|\alpha| \to \infty$. Moreover, an explicit estimate for the radius of convergence is given.

Theorem 4.1. Let A be a generator of the form (2.7) with affine coefficients $a_\alpha(x)$, i.e. for all multi-indexes $\alpha$,
\[ a_\alpha(x) =: c_\alpha + x^\top d_\alpha, \qquad x \in X, \tag{4.1} \]
where $X \subset \mathbb{R}^n$ is an open region, $c_\alpha$ is a scalar constant, and $d_\alpha \in \mathbb{R}^n$ is a constant vector. Assume that the series
\[ \sum_{|\alpha| > 0} a_\alpha(x)\, u^\alpha \tag{4.2} \]
converges absolutely for all $u \in \mathbb{C}^n$. This is in particular fulfilled under the conditions of Proposition 2.1. Then, for every $u \in \mathbb{C}^n$ and $x \in X$ it holds
\[ \frac{\bigl|A^r \tilde f_u(x)\bigr|}{\bigl|\tilde f_u(x)\bigr|} \le (r + 1)!\, 2^{nr} \bigl(1 + \|x\|\bigr)^r\, \theta^r\bigl(\|u\|\bigr) \tag{4.3} \]
with $\|x\| = \max_{i=1,\ldots,n} |x_i|$,
\[ \theta(v) := \sum_{k \ge 1} 2^k (1 + v)^k D_k^a, \qquad v \in \mathbb{R}_+, \tag{4.4} \]
and
\[ D_k^a := \sup_{x \in X}\ \max_{|\alpha| = k,\ |\beta| \le 1} \frac{\bigl|\partial_{x^\beta} a_\alpha(x)\bigr|}{1 + \|x\|}. \]
The proof of Theorem 4.1 is given in Appendix A.

Corollary 4.2. If in Theorem 4.1 the region X is bounded, the generalized Fourier basis constitutes a set of analytic vectors for the affine operator A in X.

Corollary 4.3. If in Theorem 4.1 there exists for any $\varsigma > 0$ a constant M (which may depend on $\varsigma > 0$) such that
\[ D_k^a \le M \varsigma^k / k!, \qquad k \ge 1, \]
then $\theta(v) \le M \exp\bigl(2\varsigma(1 + v)\bigr)$.

Corollary 4.4. If in Theorem 4.1 it holds that $a_\alpha(x) \equiv 0$ for $|\alpha| > 2$ (generator of diffusion type) then
\[ \theta(v) \le C(1 + v)^2, \qquad C > 0. \]
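The bound in Corollary 4.3 follows by inserting $D_k^a \le M\varsigma^k/k!$ into (4.4) and summing the exponential series. A short numerical sketch of the extremal case $D_k^a = M\varsigma^k/k!$ is given below; the parameter values and the helper name `theta_truncated` are assumptions chosen only for illustration.

```python
import math


def theta_truncated(v, D, kmax=60):
    """Truncation of theta(v) = sum_{k >= 1} 2^k (1 + v)^k D(k) from (4.4)."""
    return sum(2.0**k * (1.0 + v) ** k * D(k) for k in range(1, kmax + 1))


if __name__ == "__main__":
    M, sigma, v = 2.0, 0.3, 1.5          # illustrative constants
    D = lambda k: M * sigma**k / math.factorial(k)
    theta = theta_truncated(v, D)
    # In the extremal case the sum is M * (exp(2 * sigma * (1 + v)) - 1).
    print(theta, M * math.expm1(2.0 * sigma * (1.0 + v)), M * math.exp(2.0 * sigma * (1.0 + v)))
```

The first two printed numbers should coincide up to rounding, and both stay below the third, which is the bound stated in the corollary.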
For an affine operator A the sequence (3.2) can be explicitly constructed via the next proposition, which is proved in Appendix A.
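Proposition 4.5. Let A be an affine generator as in Theorem 4.1 and define
\[ b_\beta(x, u) := \partial_{u^\beta} \frac{A f_u(x)}{f_u(x)} = \sum_{\alpha \ge 0} \frac{(\alpha + \beta)!}{\alpha!} (iu)^\alpha a_{\alpha+\beta}(x) =: b_\beta^0(u) + \sum_{\kappa,\ |\kappa| = 1} b_{\beta,\kappa}^1(u)\, x^\kappa, \]
with $a_0 := 0$. We set $A^r f_u(x) =: g_r(x, u) f_u(x)$ and, for fixed $\eta > 0$, $q_r^{(\eta)}(x, u) =: h_r(x, u) f_u(x)$ (the dependence on $\eta$ is suppressed in $h$ for notational convenience), where both $g_r$ and $h_r$ are polynomials in $x$ of degree $r$. It holds
\[ g_r(x, u) =: \sum_{|\gamma| \le r} g_{r,\gamma}(u)\, x^\gamma, \qquad h_r(x, u) =: \sum_{|\gamma| \le r} h_{r,\gamma}(u)\, x^\gamma, \tag{4.5} \]
where $g_r$ and $h_r$ satisfy $g_0 \equiv g_{0,0} \equiv h_0 \equiv h_{0,0} \equiv 1$, and for $r \ge 0$, respectively,
\[ g_{r+1,\gamma} = \sum_{|\beta| \le r - |\gamma|} \binom{\gamma + \beta}{\beta} g_{r,\gamma+\beta}\, b_\beta^0 + \sum_{|\kappa| = 1,\ \kappa \le \gamma}\ \sum_{|\beta| \le r + 1 - |\gamma|} \binom{\gamma - \kappa + \beta}{\beta} g_{r,\gamma-\kappa+\beta}\, b_{\beta,\kappa}^1, \quad \text{and} \]
\[ (r + 1)\, h_{r+1,\gamma} = \eta^{-1} \sum_{|\beta| \le r - |\gamma|} \binom{\gamma + \beta}{\beta} h_{r,\gamma+\beta}\, b_\beta^0 + \eta^{-1} \sum_{|\kappa| = 1,\ \kappa \le \gamma}\ \sum_{|\beta| \le r + 1 - |\gamma|} \binom{\gamma - \kappa + \beta}{\beta} h_{r,\gamma-\kappa+\beta}\, b_{\beta,\kappa}^1 + r\, h_{r,\gamma}(u), \tag{4.6} \]
where $|\gamma| \le r + 1$, and empty sums are defined to be zero.

Remark 4.6. Depending on the open set X we may consider instead of (4.5) for an $x_0 \in X$ expansions in $x - x_0$ rather than in $x$. For simplicity we henceforth assume $0 \in X$ which, if necessary, may be realized by a translation of the state space.

A natural question is whether affine generators are the only ones for which the Fourier basis constitutes a set of analytic vectors. For this paper we leave this issue as an open problem but the following proposition shows that in any case the set of such generators is rather “thin”. Let us put $X = [-\pi, \pi]$ and
\[ A = \frac{1}{2} a(x) \frac{\partial^2}{\partial x^2} + b(x) \frac{\partial}{\partial x}. \]

Proposition 4.7. The set of coefficients $(a(x), b(x))$ such that for an arbitrary $M > 0$
\[ \bigl\|A^N f_u\bigr\|_{L^2(X)} \ge M^N N!, \qquad N \to \infty, \]
is dense in $L^2(X) \times L^2(X)$.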
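Proof. Without loss of generality let us assume that $b(x) \equiv 0$ and $u > 0$. The general case can be considered along the same ideas and is only formally more complicated. Any $a \in L^2(X)$ may be approximated (in $L^2$-sense) by a finite Fourier series
\[ a(x) \approx \sum_{l=1}^{n} a_l e^{ilx}. \]
Thus, for given $\varepsilon > 0$ we can find natural $n$ and amplitudes $a_l$ ($a_n \ne 0$) such that
\[ \Bigl\| a(x) - \sum_{l=1}^{n} a_l e^{ilx} \Bigr\|_{L^2(X)} \le \varepsilon. \]
The corresponding approximative operator is given by
\[ \tilde A := \sum_{l=1}^{n} \tilde A_l = \sum_{l=1}^{n} a_l e^{ilx} \frac{\partial^2}{\partial x^2}. \]
Using the fact that for any $s_1, \ldots, s_k \in \mathbb{N}$,
\[ \tilde A_{s_1} \cdots \tilde A_{s_k} e^{iux} = (-1)^k a_{s_1} \cdots a_{s_k} \prod_{l=0}^{k-1} \Bigl(u + \sum_{j=0}^{l} s_j\Bigr)^2 e^{i(u + \sum_{j=1}^{k} s_j)x}, \]
and setting
\[ F_k := \frac{1}{2\pi} \int_{-\pi}^{\pi} e^{-ikx} f_u^{-1}(x)\, \tilde A^N f_u(x)\, dx, \qquad k \in \mathbb{N}, \]
we have $F_k = 0$ for $k > nN$, and for $N \to \infty$
\[ F_{nN} = (-1)^N a_n^N \prod_{l=0}^{N-1} (u + nl)^2 \sim (-1)^N a_n^N n^{2N} \bigl((N - 1)!\bigr)^2 (N - 1)^{2u/n}. \]
Further, by Parseval’s identity it holds
\[ \bigl\|\tilde A^N f_u\bigr\|_{L^2(X)} = \Bigl(2\pi \sum_{k=0}^{nN} |F_k|^2\Bigr)^{1/2}, \]
and then we are done. □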
Obviously, Proposition 4.7 may be formulated with respect to any bounded interval.
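The decisive point in the proof is that the top Fourier coefficient $F_{nN}$ grows like $(N!)^2$ up to geometric factors, which eventually dominates $M^N N!$ for any fixed $M$. The following sketch compares the two quantities on a logarithmic scale; the helper name `log_growth_ratio` and the sample parameters are assumptions made for illustration.

```python
import math


def log_growth_ratio(N, u, n, M, a_n=1.0):
    """log(|F_{nN}|) - log(M^N N!), where |F_{nN}| = |a_n|^N prod_{l<N} (u + n l)^2.
    Positive values mean that |F_{nN}| already exceeds M^N N!."""
    log_F = N * math.log(abs(a_n)) + 2.0 * sum(math.log(u + n * l) for l in range(N))
    log_bound = N * math.log(M) + math.lgamma(N + 1)
    return log_F - log_bound


if __name__ == "__main__":
    for N in (10, 100, 1000):
        print(N, log_growth_ratio(N, u=1.0, n=1, M=100.0))
```

For small N the ratio is negative, but it turns positive and keeps growing once N is large enough, in line with the factorial-squared asymptotics of $F_{nN}$.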
5. Log-affine representations for the affine Cauchy problem

In [5] a Markov process X is called regular affine if for every $s$, $s \ge 0$, the characteristic function $\hat p$ of $X_s^{0,x}$ has the form
\[ \hat p(s, x, u) = E\bigl[f_u\bigl(X_s^{0,x}\bigr)\bigr] = \exp\bigl(C(s, u) + x^\top D(s, u)\bigr), \tag{5.1} \]
with $f_u(x) = \exp(iu^\top x)$. As a main result, it is shown in [5] that (under certain conditions) the vector valued function D satisfies for all $s > 0$ a generalized system of Riccati equations,
\[ \partial_s D(s, u) = R\bigl(D(s, u)\bigr), \tag{5.2} \]
and the function C is then obtained by
\[ C(s, u) = \int_0^s Q\bigl(D(\tau, u)\bigr)\, d\tau. \tag{5.3} \]
The vector valued function R and real valued function Q in (5.2) and (5.3) respectively, are determined by the relation
\[ \bigl(Q(iu) + x^\top R(iu)\bigr) f_u(x) = A f_u(x). \tag{5.4} \]
In general the equation for D in (5.2) is impossible to solve analytically. In this section we will derive (under certain conditions) general functional series expansions for (5.1), hence in particular for C and D, for which all ingredients can be obtained from the generator A in a direct algebraic way.

Consider the Cauchy problem (2.5) for affine generators A of the form (2.7), under the assumption (4.2). As in (4.1) we set $a_\alpha(x) = c_\alpha + x^\top d_\alpha$. The ansatz
\[ \hat p(s, x, u) = \exp\bigl(C(s, u) + x^\top D(s, u)\bigr), \tag{5.5} \]
for scalar $C(s, u)$ and vector valued $D(s, u)$, where $C(0, u) = 0$ and $D(0, u) = iu$, for the Cauchy problem (2.5) yields
\[ \partial_s C + x^\top \partial_s D = \sum_{|\alpha| > 0} a_\alpha(x) D^\alpha = \sum_{|\alpha| > 0} c_\alpha D^\alpha + \sum_{|\alpha| > 0} x^\top d_\alpha D^\alpha, \]
and so
\[ \partial_s C = \sum_{|\alpha| > 0} c_\alpha D^\alpha, \quad C(0, u) = 0, \qquad \partial_s D = \sum_{|\alpha| > 0} d_\alpha D^\alpha, \quad D(0, u) = iu. \]
We thus have a system of ordinary differential equations (ODEs), which reads component-wise
\[ \partial_s C = \sum_{|\alpha| > 0} c_\alpha D^\alpha, \quad C(0, u) = 0, \qquad \partial_s D_j = \sum_{|\alpha| > 0} d_\alpha^{(j)} D^\alpha, \quad D_j(0, u) = iu_j, \quad j = 1, \ldots, n. \tag{5.6} \]
By assumption (4.2), the series
\[ \sum_{|\alpha| > 0} c_\alpha y^\alpha, \qquad \sum_{|\alpha| > 0} d_\alpha^{(j)} y^\alpha, \qquad j = 1, \ldots, n, \tag{5.7} \]
are absolutely convergent for all $y \in \mathbb{R}^n$, and thus define term-wise differentiable $C^{(\infty)}(\mathbb{R}^n)$ functions. In particular, they are locally Lipschitz and so according to standard ODE theory the system (5.6) has for fixed $u \in \mathbb{R}^n$ a unique solution $(C(s, u), D(s, u))$ for $0 \le s < s_u^\infty \le \infty$, where $(s, C(s, u), D(s, u))$ leaves any compact subset of $\mathbb{R} \times \mathbb{R} \times \mathbb{R}^n$, when $s \uparrow s_u^\infty$.

Remark 5.1. By a general theorem from analysis (e.g., see [4]), it follows that the solution of (5.6) extends component-wise holomorphically in $s$ into a disc around $s = 0$, due to the analyticity of (5.7). This implies that (5.5) is holomorphic in $s$. So, besides Theorem 4.1, also along this line one may show that (5.5) can be represented as a power series of the form (3.1). I.e., in particular, the Fourier basis (2.6) constitutes a set of analytic vectors for the affine operator A. However, the direct approach in the proof of Theorem 4.1 (see Appendix A) leads to an explicit estimate (4.3) and allows for investigating possible extensions of $\hat p(s, x, u)$ into a strip containing the real axis in the complex plane (see Theorem 5.4). Moreover, it also suggests the line to follow in cases where A is not affine and/or the function base is not of the form (2.6).

Let us suppose that for fixed $u \in \mathbb{R}^n$ the statements of Theorem 3.4 hold. Then we obtain for $0 \le s < s_u^\infty \le \infty$, $x \in X$,
\[ \hat p(s, x, u) = \exp\bigl(C(s, u) + x^\top D(s, u)\bigr) = \sum_{k=0}^{\infty} q_k^{(\eta_u)}(x, u) \bigl(1 - e^{-\eta_u s}\bigr)^k. \tag{5.8} \]
Since $q_0^{(\eta_u)}(x, u) = f_u(x) = \exp(iu^\top x) \ne 0$ we have, taking into account the boundary conditions for C and D, at least for small enough $\varepsilon > 0$,
\[ C(s, u) + x^\top D(s, u) = \sum_{k=0}^{\infty} \rho_k^{(\eta_u)}(x, u) \bigl(1 - e^{-\eta_u s}\bigr)^k, \qquad 0 \le s < \varepsilon, \tag{5.9} \]
where by a standard lemma on the power series expansion of the logarithm of a power series,
\[ \rho_0^{(\eta_u)}(x, u) = \ln q_0^{(\eta_u)}(x, u) = iu^\top x, \]
\[ \rho_k^{(\eta_u)}(x, u) = \frac{1}{f_u(x)} \Bigl( q_k^{(\eta_u)}(x, u) - \frac{1}{k} \sum_{j=1}^{k-1} j\, \rho_j^{(\eta_u)}(x, u)\, q_{k-j}^{(\eta_u)}(x, u) \Bigr), \qquad k \ge 1. \tag{5.10} \]
Thus by (5.9), the $\rho_k^{(\eta_u)}$ are necessarily affine in $x$!
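The recursion (5.10) is the usual coefficient formula for the logarithm of a power series: differentiating $\ln Q(w)$ gives $Q' = (\ln Q)'\, Q$, and comparing coefficients of $w^{k-1}$ yields the stated relation. The following sketch implements it for a generic coefficient list and checks it on $Q(w) = 1/(1 - w)$, whose logarithm has coefficients $1/k$; the function name `log_series_coeffs` is an illustrative choice.

```python
import cmath


def log_series_coeffs(q):
    """Given coefficients q[0..K] of Q(w) with q[0] != 0, return rho[0..K] such that
    log Q(w) = sum_k rho[k] w^k, using
    rho_k = (q_k - (1/k) * sum_{j=1}^{k-1} j * rho_j * q_{k-j}) / q_0,
    which mirrors (5.10) with q_0 in the role of f_u(x)."""
    rho = [cmath.log(q[0])]
    for k in range(1, len(q)):
        acc = sum(j * rho[j] * q[k - j] for j in range(1, k))
        rho.append((q[k] - acc / k) / q[0])
    return rho


if __name__ == "__main__":
    # Q(w) = 1 / (1 - w) has all coefficients equal to 1 and log Q(w) = sum_k w^k / k.
    for k, r in enumerate(log_series_coeffs([1.0] * 6)):
        print(k, r)
```

Applied to the coefficients $q_k^{(\eta_u)}(x, u)$ at a fixed point $x$, the same routine would return the values $\rho_k^{(\eta_u)}(x, u)$.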
Remark 5.2. It is possible to prove directly that the functions $\rho_k^{(\eta_u)}$ defined above are affine in $x$ using Proposition 4.5 via a (rather laborious) induction procedure, so without using a local solution of (5.6).

Theorem 5.3. Suppose that for fixed $u \in I$ the statement of Theorem 3.4(i) holds for an open region X and, in addition, for $s \in G_{R_u}$ and $x \in X$ it holds $\hat p(s, x, u) \ne 0$. Then, with $\rho_k^{(\eta_u)}(x, u) =: \rho_k^{(\eta_u,0)}(u) + x^\top \rho_k^{(\eta_u,1)}(u)$ determined by (5.10), we have
\[ \hat p(s, x, u) = \exp\Bigl( \sum_{k=0}^{\infty} \bigl[\rho_k^{(\eta_u,0)}(u) + x^\top \rho_k^{(\eta_u,1)}(u)\bigr] \bigl(1 - e^{-\eta_u s}\bigr)^k \Bigr), \qquad 0 \le s < \infty. \]

Proof. Let $u \in I$ and $x \in X$ be fixed. Since $G_{R_u}$ is simply connected and $s \to \hat p(s, x, u)$ is holomorphic and non-zero in $G_{R_u}$, there exists a single-valued branch $s \to L(s, x, u)$ of the multivalued complex logarithm such that $\hat p(s, x, u) = \exp(L(s, x, u))$ for all $s \in G_{R_u}$. Along the same line as in Theorem 3.4 we then argue that there exists an $\eta_u > 0$ such that $w \to L(\Phi_{\eta_u}(w), x, u)$ (see the proof of Theorem 3.4) is holomorphic in the unit disc $\{w: |w| < 1\}$, hence, there exists $\tilde\rho_k(x, u)$ such that
\[ L(\Phi_{\eta_u}(w), x, u) = \sum_{k \ge 0} \tilde\rho_k(x, u)\, w^k, \qquad 0 \le |w| < 1, \]
and so
\[ L(s, x, u) = \sum_{k \ge 0} \tilde\rho_k(x, u) \bigl(1 - e^{-\eta_u s}\bigr)^k, \qquad 0 \le s < \infty. \]
Since the latter expansion must coincide with (5.9) for small $s$, it follows that necessarily $\tilde\rho_k(x, u) = \rho_k^{(\eta_u)}(x, u)$ and the theorem is proved. □

Let us now pass to another interesting log-affine representation for the characteristic function. From (4.5) and (4.6) we derive formally
\[ \sum_{r \ge 0} q_r^{(\eta)}(x, u) \bigl(1 - e^{-\eta s}\bigr)^r = e^{iu^\top x} \sum_{r \ge 0} \sum_{\gamma \ge 0} h_{r,\gamma}(u)\, x^\gamma \bigl(1 - e^{-\eta s}\bigr)^r 1_{|\gamma| \le r} = e^{iu^\top x} \sum_{\gamma \ge 0} x^\gamma \sum_{r \ge 0} h_{|\gamma|+r,\gamma}(u) \bigl(1 - e^{-\eta s}\bigr)^{|\gamma|+r}. \]
Suppose that the requirements of Theorem 5.3 hold for fixed $u$ and $\eta = \eta_u$. Then, using Theorem 4.1, it is easy to show that for small enough $\varepsilon > 0$,
\[ \sum_{\gamma \ge 0} \|x\|_\infty^{|\gamma|} \sum_{r \ge 0} \bigl|h_{|\gamma|+r,\gamma}(u)\bigr|\, |w|^{|\gamma|+r} < \infty, \qquad \text{if } |w| < \varepsilon,\ \|x\|_\infty < \varepsilon \]
(see Remark 4.6). Thus, for $|s|$ and $\|x\|_\infty$ small enough we obtain
\[ \ln \hat p(s, x, u) = iu^\top x + \ln\Bigl( \sum_{\gamma \ge 0} x^\gamma \sum_{r \ge 0} h_{|\gamma|+r,\gamma}(u) \bigl(1 - e^{-\eta_u s}\bigr)^{|\gamma|+r} \Bigr) = C(s, u) + x^\top D(s, u), \]
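with (in multi-index notation)
\[ C(s, u) = \ln\Bigl( \sum_{r \ge 0} h_{r,0}(u) \bigl(1 - e^{-\eta_u s}\bigr)^r \Bigr), \qquad D^\kappa(s, u) = iu^\kappa + \frac{\sum_{r \ge 1} h_{r,\kappa}(u) (1 - e^{-\eta_u s})^r}{\sum_{r \ge 0} h_{r,0}(u) (1 - e^{-\eta_u s})^r}, \quad |\kappa| = 1. \tag{5.11} \]
However, the left- and right-hand sides of (5.11) are holomorphic for all $s \in G_{R_u}$ and we so arrive at the representation
\[ \hat p(s, x, u) = \exp\Bigl( \ln\Bigl( \sum_{r \ge 0} h_{r,0}(u) \bigl(1 - e^{-\eta_u s}\bigr)^r \Bigr) + iu^\top x + x^\top \frac{\sum_{r \ge 1} h_r(u) (1 - e^{-\eta_u s})^r}{\sum_{r \ge 0} h_{r,0}(u) (1 - e^{-\eta_u s})^r} \Bigr), \qquad s \in G_{R_u},\ x \in X, \tag{5.12} \]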
with $h_r(u) := [h_{r,i}(u)]_{i=1,\ldots,n}$, where for $1 \le i \le n$, the multi-index $(\delta_{ij})_{j=1,\ldots,n}$ is identified with $i$.

Particularly due to the explicit estimate (4.3) for affine generators in Theorem 4.1, we may prove the next theorem (which is a non-probabilistic version of Proposition 3.7 in the situation where A is affine).

Theorem 5.4. Let X be a bounded domain. Assume that the system (5.6) is non-exploding, i.e. $s_u^\infty = \infty$, and that for any fixed $u \in \mathbb{R}^n$ the solution $D(s, u)$ remains bounded as $s \to \infty$. Then, there exists a radius $R_u > 0$ such that for any $t \ge 0$ the map $s \to \hat p(t + s, x, u)$, $0 \le s < R_u$, has a holomorphic extension to the disc $\{s \in \mathbb{C}: |s| < R_u\}$. Moreover, it holds
\[ \hat p(t + s, x, u) = \sum_{k=0}^{\infty} \frac{s^k}{k!} A^k \hat p(t, \cdot, u)(x), \qquad |s| < R_u,\ x \in X. \tag{5.13} \]

Remark 5.5. The maximal extension radius $R_u$ satisfies
\[ R_u \ge \frac{1}{2^n\, \theta\bigl(\|D^*(u)\|\bigr)} \inf_{x \in X} \frac{1}{1 + \|x\|}, \tag{5.14} \]
where function $\theta$ is defined in (4.4) and $\|D^*(u)\| = \sup_{s > 0} \|D(s, u)\|$.

Proof. Denote the right-hand side of (5.13) by $\tilde p(t, s, x, u)$. Obviously, $\tilde p(t, 0, x, u) = \hat p(t, x, u)$. Let us define
\[ \tilde p^{(N)}(t, s, x, u) := \sum_{k=0}^{N} \frac{s^k}{k!} A^k \hat p(t, \cdot, u)(x). \]
Since $\hat p(t, x, u) = \exp(C(t, u) + x^\top D(t, u))$, Theorem 4.1 implies that the series
\[ \sum_{k=0}^{\infty} \frac{s^k}{k!} A^k \hat p(t, \cdot, u)(x) \tag{5.15} \]
is absolutely and uniformly convergent on any compact subset of $X \times \{s \in \mathbb{C}: |s| < R_u\}$ if $R_u$ satisfies (5.14). So, both $\tilde p^{(N)}(t, s, x, u)$ and
\[ A \tilde p^{(N)}(t, s, x, u) = \sum_{k=0}^{N} \frac{s^k}{k!} A^{k+1} \hat p(t, \cdot, u)(x) \]
converge for $N \to \infty$ uniformly over any compact subset of X and $s$ with $|s| < R_u - \varepsilon$ for any $\varepsilon > 0$. Hence, for $|s| < R_u - \varepsilon$,
\[ \frac{\partial}{\partial s} \tilde p^{(N)}(t, s, x, u) = \sum_{k=0}^{N-1} \frac{s^k}{k!} A^{k+1} \hat p(t, \cdot, u) = A \tilde p^{(N-1)}(t, s, x, u), \qquad \tilde p(t, 0, x, u) = \hat p(t, x, u), \]
and thus, by closedness of the operator A and uniqueness of the Cauchy problem (2.5)–(2.7), we have $\tilde p(t, s, x, u) = \hat p(t + s, x, u)$. □

6. Full expansion of a specially structured one-dimensional affine system
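Let us consider Cauchy problem (2.5) for $n = 1$ with $f_u(x) = \exp(iux)$, where the jump kernel in the generator A (see (2.3)) has a special affine structure of the form
\[ v(x, dz) =: (\lambda_0 + \lambda_1 x)\, \nu(dz), \]
and where the diffusion coefficients have a similar structure,
\[ b(x) = (\lambda_0 + \lambda_1 x)\theta, \qquad a(x) = (\lambda_0 + \lambda_1 x)\vartheta, \]
for some constants $\lambda_0, \lambda_1, \theta, \vartheta \in \mathbb{R}$, and measure $\nu$. So, in Proposition 4.5 the $a_\alpha$ have the form $a_l =: (\lambda_0 + \lambda_1 x)\mu_l$ where
\[ \mu_0 := 0, \qquad \mu_1 = \theta, \qquad \mu_2 := \frac{1}{2}\Bigl(\vartheta + \int z^2\, \nu(dz)\Bigr), \qquad \mu_l := \frac{1}{l!} \int z^l\, \nu(dz), \quad l > 2. \]
Hence, in Proposition 4.5, the $b_\beta$ in (4.6) have the form
\[ b_r(x, u) = b_r^0(u) + x\, b_r^1(u) =: (\lambda_0 + \lambda_1 x) \sum_{l \ge 0} \mu_{l+r} \frac{(l + r)!}{l!} (iu)^l =: (\lambda_0 + \lambda_1 x) \frac{d^r}{du^r} h(u), \qquad r \ge 0, \]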
where $h(u) := \sum_{l \ge 0} \mu_l (iu)^l$. It is now possible to show via (4.6) that for $r \ge 1$,
gr (x, u) =
p>0, q0 0
× h (u) p
q
h
nj −1
j =1
1 (p) r−p π λ (λ0 + λ1 x)p r! (n1 ,m1 ),...,(nq ,mq ) 1
mj d nj (u) nj h(u) , du
(6.1)
with the following integer recursion procedure:

Initialization: $\pi^{(p)} \equiv 1$, $\pi^{(0)}_{(n_1,m_1),\ldots,(n_q,m_q)} \equiv 0$, $p, q \ge 1$.

For all $n_i > 0$, $m_i \ge 0$, with $1 \le i \le q$, $p, q \ge 1$:

Reduction rule I: If $m_j = 0$ for some $j$, $1 \le j \le q$, then
\[ \pi^{(p)}_{(n_1,m_1),\ldots,(n_q,m_q)} = \pi^{(p)}_{(n_1,m_1),\ldots,(n_{j-1},m_{j-1}),(n_{j+1},m_{j+1}),\ldots,(n_q,m_q)}. \]

Reduction rule II:
\[ \pi^{(p)}_{(n_1,m_1),\ldots,(n_{q-1},m_{q-1}),(n_q,m_q)} = \sum_{j=1}^{q} \binom{p + n_j - 1}{n_j} \pi^{(p+n_j-1)}_{(n_1,m_1),\ldots,(n_j,m_j-1),\ldots,(n_q,m_q)} + \pi^{(p-1)}_{(n_1,m_1),\ldots,(n_q,m_q)}. \]
In fact, the above recursion procedure follows automatically after substituting (6.1) as ansatz into (4.6). Finally, we obtain the $q_k^{(\eta)}$ for the series expansion in Theorem 3.4 by (3.3), i.e.
\[ q_k^{(\eta)}(x, u) =: \frac{1}{k!} \sum_{r=0}^{k} c_{k,r}\, \eta^{-r} g_r(x, u) f_u(x). \]
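Remark 6.1. If the measure $\nu$ is finite,
\[ h(u) = iu\theta - \frac{1}{2}\vartheta u^2 + \sum_{l \ge 2} \frac{(iu)^l}{l!} \int z^l\, \nu(dz) = \varphi(u) - 1 + \bigl(i\theta - \varphi'(0)\bigr) u - \frac{1}{2}\vartheta u^2, \]
where $\varphi$ is the characteristic function of $\nu$. Hence, in this case $h$ and all its derivatives may be computed from $\varphi$.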
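The identity in Remark 6.1 can be verified numerically in a toy case. The sketch below assumes, purely for illustration, that $\nu$ is a unit point mass at $z_0$, so that $\varphi(u) = e^{iuz_0}$ and $\varphi'(0) = iz_0$; it evaluates $h$ once from the moment coefficients $\mu_l$ of Section 6 and once from the closed form. The function names and the parameter values are hypothetical.

```python
import cmath
import math


def h_from_moments(u, theta, vartheta, z0, lmax=60):
    """h(u) = sum_{l >= 1} mu_l (iu)^l with mu_1 = theta,
    mu_2 = (vartheta + z0^2) / 2 and mu_l = z0^l / l! for l > 2,
    i.e. the coefficients of Section 6 when nu is a unit point mass at z0."""
    total = theta * (1j * u)
    total += 0.5 * (vartheta + z0**2) * (1j * u) ** 2
    for l in range(3, lmax + 1):
        total += (z0**l / math.factorial(l)) * (1j * u) ** l
    return total


def h_closed_form(u, theta, vartheta, z0):
    """Right-hand side of Remark 6.1 with phi(u) = exp(i u z0)."""
    phi = cmath.exp(1j * u * z0)
    dphi0 = 1j * z0
    return phi - 1 + (1j * theta - dphi0) * u - 0.5 * vartheta * u**2


if __name__ == "__main__":
    args = (1.7, 0.4, 0.25, 0.9)   # u, theta, vartheta, z0
    print(h_from_moments(*args), h_closed_form(*args))
```

For a general finite measure the same comparison applies with $\varphi$ replaced by the corresponding characteristic function, provided the moments exist.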
7. Application to affine processes

Affine processes have become very popular in recent years due to their analytical tractability in the context of option pricing, and their rather rich dynamics. Many well-known models such as Heston and Bates models fall into the class of affine jump diffusions. Option pricing in these models is usually done via the Fourier method which requires knowledge of the Fourier transform of the process in closed form (see e.g. [6]). The functional series representations for affine generators developed in this paper, in particular (5.12), can be directly applied to affine processes. Let us recall the characterization of a regular affine process as given in [5].

Definition 7.1. We call a strong Markov process $\{X_t\}$ with generator A a regular affine process if A is of the form (2.3) and all functions
\[ a_{ij}(x), \qquad b_i(x), \qquad v(x, dz), \qquad i, j = 1, \ldots, m, \]
are affine in $x$ (see (4.1)), and satisfy the set of admissibility conditions spelled out in [5, Definition 2.6]. These conditions guarantee that A is the generator of a Feller–Dynkin (strong) Markov process X in a subspace of the form $\mathbb{R}^l \times \mathbb{R}_+^{n-l} \subset \mathbb{R}^n$ for some $0 \le l \le n$.

The next theorem provides a sufficient condition for convergence of the series representation in Theorem 3.4(ii), hence representation (5.12), for regular affine processes.

Theorem 7.2. Let $\{X_s\}$ be a regular affine process which has a non-degenerated limiting distribution for $s \to \infty$, and has a generator A which satisfies the moment condition (4.2). Then the (conditional) characteristic function $\hat p(s, x, u) = E[f_u(X_s^{0,x})]$, with $f_u(x) = e^{iu^\top x}$, has a representation according to Theorem 3.4(ii):
\[ \hat p(s, x, u) = \sum_{k=0}^{\infty} q_k^{(\eta_u)}(x, u) \bigl(1 - e^{-\eta_u s}\bigr)^k, \qquad 0 \le s < \infty. \]
Moreover, the scaling factor $\eta_u$ may be chosen according to the inequality
\[ \eta_u \ge C\, \theta\bigl(L\bigl(1 + \|u\|_2^2\bigr)\bigr), \]
where the (monotonic) function $\theta$ is defined in (4.4), $L > 0$ is a constant independent of $x$, and C is a constant generally depending on $x$.

Proof. Following [5], $\hat p(s, x, u)$ has a representation of the form (5.5) for $0 \le s < \infty$. The existence of a limiting distribution implies in particular that $D(s, u)$ in (5.5) is bounded for all $s \ge 0$. Moreover, as shown in [5, Section 7], $\hat p(s, x, u)$ is the characteristic function of some infinitely divisible distribution for all $s > 0$ (hence also in the limit $s \to \infty$). As a consequence (see [17]), there exists an $M > 0$ independent of $s$ such that
\[ \limsup_{\|u\|_2 \to \infty} \|u\|_2^{-2} \bigl|\log \hat p(s, x, u)\bigr| < M. \]
This implies that $\|D(s, u)\| \le L(1 + \|u\|_2^2)$ for some constant $L > 0$ not depending on $x$ and $s \ge 0$. Now we apply Theorem 5.4 and Remark 5.5. □
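A standard example with a limiting distribution and bounded $D(s, u)$ is the one-dimensional Ornstein–Uhlenbeck process, which is not treated explicitly in the paper and is used here only as an illustration. With generator $A = -\kappa x\, \partial_x + \tfrac{1}{2}\sigma^2 \partial_x^2$ the coefficients in (4.1) are $d_1 = -\kappa$, $c_2 = \sigma^2/2$ and zero otherwise, so the system (5.6) reads $\partial_s D = -\kappa D$, $\partial_s C = \tfrac{1}{2}\sigma^2 D^2$. The sketch integrates it with a classical Runge–Kutta scheme and compares with the known closed form; all names and parameter values are assumptions.

```python
import cmath


def riccati_rhs(D, kappa, sigma):
    """Right-hand side of (5.6) for the OU generator: returns (dC/ds, dD/ds)."""
    return 0.5 * sigma**2 * D * D, -kappa * D


def solve_riccati(u, s, kappa, sigma, n_steps=2000):
    """Classical fourth-order Runge-Kutta for (C, D) with C(0) = 0, D(0) = i u."""
    h = s / n_steps
    C, D = 0j, 1j * u
    for _ in range(n_steps):
        k1c, k1d = riccati_rhs(D, kappa, sigma)
        k2c, k2d = riccati_rhs(D + 0.5 * h * k1d, kappa, sigma)
        k3c, k3d = riccati_rhs(D + 0.5 * h * k2d, kappa, sigma)
        k4c, k4d = riccati_rhs(D + h * k3d, kappa, sigma)
        C += h * (k1c + 2 * k2c + 2 * k3c + k4c) / 6.0
        D += h * (k1d + 2 * k2d + 2 * k3d + k4d) / 6.0
    return C, D


def ou_closed_form(u, s, kappa, sigma):
    """Exact solution: D = i u exp(-kappa s), C = -sigma^2 u^2 (1 - exp(-2 kappa s)) / (4 kappa)."""
    D = 1j * u * cmath.exp(-kappa * s)
    C = -sigma**2 * u**2 * (1.0 - cmath.exp(-2.0 * kappa * s)) / (4.0 * kappa)
    return C, D


if __name__ == "__main__":
    print(solve_riccati(1.3, 1.5, 0.8, 0.6))
    print(ou_closed_form(1.3, 1.5, 0.8, 0.6))
```

Here $|D(s, u)| \le |u|$ for all $s \ge 0$, so the boundedness requirement of Theorem 5.4 is visibly satisfied in this example.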
Remark 7.3. The existence of a limiting stationary distribution is a sufficient condition for the boundedness of $D(s, u)$. In fact, there are affine processes which have no limit distribution but bounded $D(s, u)$ (a trivial example is standard Brownian motion). The study of existence of limiting (and stationary) distributions for affine processes is currently under active research, e.g. see [13] or [11].

Appendix A

A.1. Proof of Proposition 2.1

Let us split the operator A in a diffusion and an integral component,
\[ A_D f(x) := \frac{1}{2} \sum_{i,j=1}^{n} a_{ij}(x) \frac{\partial^2 f}{\partial x_i \partial x_j} + \sum_{i=1}^{n} b_i(x) \frac{\partial f}{\partial x_i}, \qquad A_I f(x) := A f(x) - A_D f(x). \]
Suppose that in the given topology,
\[ f_n \to 0 \quad \text{and} \quad A f_n \to g. \tag{A.1} \]
Let $\varphi \in C_\kappa^\infty(\mathbb{R}^n)$ be an arbitrary test function with compact support. Then $\int \varphi(x) A_D f_n(x)\, dx = \int f_n(x) A_D^* \varphi(x)\, dx$ where
\[ A_D^* \varphi(x) = \frac{1}{2} \sum_{i,j=1}^{n} \frac{\partial^2}{\partial x_i \partial x_j} \bigl(a_{ij}(x) \varphi(x)\bigr) - \sum_{i=1}^{n} \frac{\partial}{\partial x_i} \bigl(b_i(x) \varphi(x)\bigr). \]
Since $f_n$ is uniformly bounded on the compact support of $\varphi$ we have by dominated convergence $\int f_n(x) A_D^* \varphi(x)\, dx \to 0$, hence
\[ \int \varphi(x) A_D f_n(x)\, dx \to 0. \tag{A.2} \]
For the integral part we have by Fubini,
\[ \int \varphi(x) A_I f_n(x)\, dx = \int \nu(dz) \int \Bigl( f_n(x + z) - f_n(x) - z \cdot \frac{\partial f_n}{\partial x} \Bigr) \varphi(x) \gamma(x, z)\, dx \]
\[ \qquad = \int \nu(dz) \int f_n(x) \Bigl( \varphi(x - z) \gamma(x - z, z) - \varphi(x) \gamma(x, z) - z \cdot \frac{\partial}{\partial x} \bigl(\varphi(x) \gamma(x, z)\bigr) \Bigr)\, dx. \]
Since by assumption both the measure $\nu$ and the function $\varphi$ have compact support, the latter integral can be restricted to $(z, x) \in K_1 \times K_2$ for compact subsets $K_{1,2} \subset \mathbb{R}^n$. Thus, since by assumption the derivatives of $x \to \gamma(x, z)$ are continuous in $(x, z)$, we have
\[ \varphi(x - z) \gamma(x - z, z) - \varphi(x) \gamma(x, z) - z \cdot \frac{\partial}{\partial x} \bigl(\varphi(x) \gamma(x, z)\bigr) = O\bigl(|z|^2\bigr), \qquad (z, x) \in K_1 \times K_2. \]
Thus, since $\int |z|^2\, \nu(dz) < \infty$, it follows by (A.1) and dominated convergence that $\int \varphi(x) A_I f_n(x)\, dx \to 0$, and then by (A.2), $\int \varphi(x) A f_n(x)\, dx \to 0$. On the other hand also $A f_n$ is uniformly bounded on the compact support of $\varphi$, so again by dominated convergence and (A.1),
\[ \int \varphi(x) A f_n(x)\, dx \to \int \varphi(x) g(x)\, dx = 0. \]
The function $\varphi$ is arbitrary, hence $g = 0$.

A.2. Proof of Theorem 3.4

(i) ⇒ (ii): Let $U := \{z \in \mathbb{C}: |z| < 1\}$ be the unit disc in the complex plane. Consider for $\eta > 0$ the map
\[ \Phi_\eta: z \longrightarrow -\frac{1}{\eta} \operatorname{Ln}(1 - z). \]
Obviously, there exists an $\eta_u > 0$ such that $(0-, \infty) \subset \Phi_{\eta_u}(U) \subset G_{R_u}$ (i.e., for $\varepsilon_0 > 0$ small enough $(-\varepsilon_0, \infty) \subset \Phi_{\eta_u}(U)$). Moreover, the map $\Phi_{\eta_u}$ is injective on $U$. Thus, (denoting the extension in (i) with $\hat p$ as well) for each $x \in X$, the function $\hat p(\Phi_{\eta_u}(w), x, u)$ is holomorphic in $U$ and has a series expansion
\[ w \longrightarrow \hat p(\Phi_{\eta_u}(w), x, u) =: \sum_{k=0}^{\infty} \tilde q_k(x, u)\, w^k, \qquad |w| < 1, \]
and as a consequence,
\[ \hat p(z, x, u) = \sum_{k=0}^{\infty} \tilde q_k(x, u) \bigl(1 - e^{-\eta_u z}\bigr)^k, \qquad z \in \Phi_{\eta_u}(U),\ x \in X. \tag{A.3} \]
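Since (A.3) holds in particular for $z \in (0-, \infty)$, we have in a (possibly small) $\varepsilon$-disk around $z = 0$,
\[ \hat p(z, x, u) = \sum_{k=0}^{\infty} \tilde q_k(x, u) \bigl(1 - e^{-\eta_u z}\bigr)^k = \sum_{k=0}^{\infty} \frac{z^k}{k!} A^k f_u(x), \qquad 0 \le |z| < \varepsilon, \tag{A.4} \]
due to Proposition 3.3. By taking $z = 0$ we have $\hat p(0, x, u) = \tilde q_0(x, u) = f_u(x)$. We know from the exponential generating function of Stirling numbers of the second kind $S_{n,k}$ that
\[ \frac{(\exp(z) - 1)^k}{k!} = \sum_{n=0}^{\infty} S_{n,k} \frac{z^n}{n!} \tag{A.5} \]
(cf. for example [12, Vol. II, Section 21.9, (9.1), p. 1041]).¹ Hence,
\[ (-1)^k \eta_u^{-k} A^k f_u(x) = (-1)^k \eta_u^{-k} \frac{\partial^k}{\partial z^k} \sum_{l=0}^{\infty} \tilde q_l(x, u) \bigl(1 - \exp(-\eta_u z)\bigr)^l \Big|_{z=0} = \sum_{l=0}^{k} S_{k,l} (-1)^l\, l!\, \tilde q_l(x, u). \]
Using Stirling inversion (see [1] for example), we have
\[ u_k = \sum_{l=0}^{k} S_{k,l} v_l \quad \Leftrightarrow \quad v_l = \sum_{k=0}^{l} (-1)^{l-k} c_{l,k} u_k, \]
with $c_{l,k}$ defined in (3.4). Hence,
\[ \tilde q_l(x, u) = \frac{1}{l!} \sum_{k=0}^{l} c_{l,k}\, \eta_u^{-k} A^k f_u(x). \]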
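The Stirling inversion used in this step can be checked with exact integer arithmetic. The sketch below builds both tables from their standard recurrences (the one for $c_{k,j}$ is (3.4); the one for $S_{k,j}$, namely $S_{k+1,j} = jS_{k,j} + S_{k,j-1}$, is the classical recurrence for Stirling numbers of the second kind and is not stated in the paper) and verifies that the two transforms are inverse to each other. Function names are illustrative.

```python
def stirling_tables(kmax):
    """Unsigned Stirling numbers of the first kind c[k][j] and Stirling numbers
    of the second kind S[k][j] for 0 <= j <= k <= kmax."""
    c = [[0] * (kmax + 1) for _ in range(kmax + 1)]
    S = [[0] * (kmax + 1) for _ in range(kmax + 1)]
    c[0][0] = S[0][0] = 1
    for k in range(kmax):
        for j in range(1, k + 2):
            c[k + 1][j] = k * c[k][j] + c[k][j - 1]
            S[k + 1][j] = j * S[k][j] + S[k][j - 1]
    return c, S


def forward(v, S):
    """u_k = sum_{l <= k} S[k][l] * v_l."""
    return [sum(S[k][l] * v[l] for l in range(k + 1)) for k in range(len(v))]


def inverse(u, c):
    """v_l = sum_{k <= l} (-1)^(l - k) * c[l][k] * u_k."""
    return [sum((-1) ** (l - k) * c[l][k] * u[k] for k in range(l + 1)) for l in range(len(u))]


if __name__ == "__main__":
    c, S = stirling_tables(5)
    v = [3, -1, 4, 1, -5, 9]
    u = forward(v, S)
    print(u, inverse(u, c))
```

The second printed list should reproduce `v` exactly, since all operations are on integers.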
Next we prove the uniform convergence as stated in (ii). Let us take an arbitrarily fixed compact subset $K \subset X$, and an arbitrarily fixed $\kappa$ with $0 < \kappa < R_u$. Let now $\varepsilon > 0$. Take a fixed $\delta$ with $0 < \delta < 1$ such that $\kappa/(\delta R_u) < 1$. Due to Definition 3.1 there exists a number $N_\delta$ such that
\[ \bigl|A^r f_u(x)\bigr| \le \frac{r!}{(\delta R_u)^r} \]
¹ We thank an anonymous referee for his remarks and references concerning the theory of Stirling numbers, which have led to a substantial shortening of our original proof.
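for all $r > N_\delta$ and $x \in K$. With $w := 1 - e^{-\eta_u s}$, we then have, using a well-known property of Stirling numbers, for all $0 \le s \le \kappa < R_u$, $x \in K$, and all $N > N_\delta$,
\[ \Bigl| \sum_{k=N}^{\infty} q_k^{(\eta_u)}(x, u)\, w^k \Bigr| = \Bigl| \sum_{k=N}^{\infty} \frac{w^k}{k!} \sum_{r=0}^{k} c_{k,r}\, \eta_u^{-r} A^r f_u(x) \Bigr| = \Bigl| \sum_{k=0}^{\infty} \sum_{r=0}^{\infty} \frac{w^k}{k!} c_{k,r}\, \eta_u^{-r} A^r f_u(x)\, 1_{r \le k} 1_{N \le k} \Bigr| \]
\[ \qquad \le \sum_{r=0}^{\infty} \eta_u^{-r} \bigl|A^r f_u(x)\bigr| \sum_{k=\max(N,r)}^{\infty} \frac{|w|^k}{k!} c_{k,r} \]
\[ \qquad = \sum_{r=0}^{N} \eta_u^{-r} \bigl|A^r f_u(x)\bigr| \sum_{k=N}^{\infty} \frac{|w|^k}{k!} c_{k,r} + \sum_{r=N+1}^{\infty} \eta_u^{-r} \bigl|A^r f_u(x)\bigr| \sum_{k=r}^{\infty} \frac{|w|^k}{k!} c_{k,r} =: (I) + (II). \]
For $N > N_\delta$, we have for the second term
\[ (II) \le \sum_{r=N+1}^{\infty} \frac{r!}{(\eta_u \delta R_u)^r} \sum_{k=r}^{\infty} \frac{|w|^k}{k!} c_{k,r} = \sum_{r=N+1}^{\infty} \frac{1}{(\delta R_u)^r} \Bigl| \frac{\ln(1 - |w|)}{\eta_u} \Bigr|^r \le \sum_{r=N+1}^{\infty} \Bigl( \frac{\kappa}{\delta R_u} \Bigr)^r. \]
Thus, since $\kappa/(\delta R_u) < 1$, $(II) < \varepsilon/2$ for $N > N_1 > N_\delta$. For the first term we may write
\[ (I) = \sum_{r=0}^{N} \eta_u^{-r} \bigl|A^r f_u(x)\bigr| \sum_{k=N}^{\infty} \frac{|w|^k}{k!} c_{k,r} = \sum_{r=0}^{N} \frac{|A^r f_u(x)|}{r!}\, r!\, \eta_u^{-r} \sum_{k=N}^{\infty} \frac{|w|^k}{k!} c_{k,r} = \sum_{r=0}^{N} \frac{|A^r f_u(x)|}{r!} \Bigl| \frac{\ln(1 - |w|)}{\eta_u} \Bigr|^r \zeta_{r,N} \]
with
\[ \zeta_{r,N} = 1 - \frac{r! \sum_{k=r}^{N-1} \frac{|w|^k}{k!} c_{k,r}}{|\ln(1 - |w|)|^r}. \]
Note that $0 \le \zeta_{r,N} \le 1$, and that $\lim_{N \to \infty} \zeta_{r,N} = 0$ for all $r \ge 0$. Due to our assumptions we then have
\[ (I) \le C \sum_{r=0}^{N_\delta} \zeta_{r,N} 1_{r \le N} + \sum_{r=0}^{\infty} \Bigl( \frac{\kappa}{\delta R_u} \Bigr)^r \zeta_{r,N} 1_{r \le N} \tag{A.6} \]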
for all $0 \le s \le \kappa < R_u$, $x \in K$. Due to dominated convergence, the right-hand side of (A.6) converges to zero as $N \to \infty$.

(ii) ⇒ (iii): Is obvious, take $\varepsilon_u := 1 - e^{-\eta_u R_u}$.

(iii) ⇒ (i): Let $\eta_u$ and $\varepsilon_u$ be such that (iii) holds. We may then define (see the proof of (ii))
\[ \tilde p(z, x, u) = \sum_{k=0}^{\infty} q_k^{(\eta_u)}(x, u) \bigl(1 - e^{-\eta_u z}\bigr)^k, \qquad x \in X, \tag{A.7} \]
which is holomorphic in $z \in \Phi_{\eta_u}(U)$. We first note that $\tilde p(0, x, u) = f_u(x)$. Next we consider for $0 \le s < -\eta_u^{-1} \ln \varepsilon_u$,
\[ \tilde p^{(N)}(s, x, u) = \sum_{k=0}^{N} q_k^{(\eta_u)}(x, u) \bigl(1 - e^{-\eta_u s}\bigr)^k, \]
which satisfies
\[ \frac{\partial}{\partial s} \tilde p^{(N)}(s, x, u) = \sum_{k=1}^{N} k\, q_k^{(\eta_u)}(x, u) \bigl(1 - e^{-\eta_u s}\bigr)^{k-1} \eta_u e^{-\eta_u s} \]
\[ \qquad = -\eta_u \sum_{k=1}^{N} k\, q_k^{(\eta_u)}(x, u) \bigl(1 - e^{-\eta_u s}\bigr)^k + \eta_u \sum_{k=0}^{N-1} (k + 1)\, q_{k+1}^{(\eta_u)}(x, u) \bigl(1 - e^{-\eta_u s}\bigr)^k \tag{A.8} \]
\[ \qquad = -\eta_u N q_N^{(\eta_u)}(x, u) \bigl(1 - e^{-\eta_u s}\bigr)^N + \sum_{k=0}^{N-1} A q_k^{(\eta_u)}(x, u) \bigl(1 - e^{-\eta_u s}\bigr)^k \tag{A.9} \]
by some rearranging and using (3.5). Since due to (iii) the first term in (A.9) vanishes for $N \to \infty$, we obtain
\[ \frac{\partial}{\partial s} \tilde p(s, x, u) = \lim_{N \to \infty} \frac{\partial}{\partial s} \tilde p^{(N)}(s, x, u) = \lim_{N \to \infty} A \tilde p^{(N-1)}(s, x, u), \]
together with
\[ \lim_{N \to \infty} \tilde p^{(N)}(s, x, u) = \tilde p(s, x, u). \]
From the uniform convergence as stated in (iii) it follows easily that the two series in (A.8), the first term in (A.9), and so also the second term in (A.9) converge uniformly in the same sense. Thus, the above limits are uniform on compacta accordingly. Since the operator A is closed, we so obtain
\[ \frac{\partial}{\partial s} \tilde p(s, x, u) = A \tilde p(s, x, u), \qquad 0 \le s < -\eta_u^{-1} \ln \varepsilon_u, \]
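and by uniqueness of the Cauchy problem associated with the operator A we thus have
\[ \hat p(s, x, u) = \tilde p(s, x, u) = \sum_{k=0}^{\infty} q_k^{(\eta_u)}(x, u) \bigl(1 - e^{-\eta_u s}\bigr)^k, \qquad 0 \le s < -\eta_u^{-1} \ln \varepsilon_u. \]
Because of the assumption that $\hat p(s, x, u)$ is holomorphically extendable in each $s$, $0 \le s < \infty$, we then must have $\hat p(s, x, u) = \tilde p(s, x, u)$ for $0 \le s < \infty$. Finally, it is not difficult to see that there exists $R_u > 0$ such that $G_{R_u} \subset \Phi_{\eta_u}(U)$, hence (i) is proved.

A.3. Proof of Proposition 3.9

From the Taylor formula for semi-groups it follows that
\[ P_s f_u = \sum_{k=0}^{r} \frac{s^k}{k!} A^k f_u + \frac{1}{r!} \int_0^s (s - \tau)^r P_\tau A^{r+1} f_u\, d\tau. \]
Due to (3.10), for $0 \le \tau \le s < R_u$ we have
\[ \bigl\|P_\tau A^{r+1} f_u\bigr\| \le \sup_{0 \le \tau \le s} \bigl\|P_\tau A^{r+1} f_u\bigr\| \le \sup_{0 \le \tau \le s} \|P_\tau\| \Bigl( \frac{1}{R_u} + \varepsilon \Bigr)^{r+1} (r + 1)! \]
for any $\varepsilon > 0$. It thus follows that
\[ \Bigl\| P_s f_u - \sum_{k=0}^{r} \frac{s^k}{k!} A^k f_u \Bigr\| \le \Bigl( \frac{1}{R_u} + \varepsilon \Bigr)^{r+1} s^{r+1} \sup_{0 \le \tau \le s} \|P_\tau\|, \]
which converges to zero when $r \to \infty$, if $|s| < R_u/(1 + \varepsilon R_u)$. Since $\varepsilon > 0$ is arbitrary, the first statement is proved. The commutation property $A^k P_t f_u = P_t A^k f_u$ and the boundedness of $P_t$ for $t \ge 0$ imply that for $|s| < R_u$,
\[ \sum_{k=0}^{\infty} \frac{|s|^k}{k!} \bigl\|A^k P_t f_u\bigr\| = \sum_{k=0}^{\infty} \frac{|s|^k}{k!} \bigl\|P_t A^k f_u\bigr\| \le \|P_t\| \sum_{k=0}^{\infty} \frac{|s|^k}{k!} \bigl\|A^k f_u\bigr\| < \infty. \]
Since $P_t f_u \in D(A^k)$ for all $k \ge 0$, (3.11) follows.

A.4. Proof of Theorem 4.1

For $r \ge 0$ define $A^r \tilde f_u =: g_r \tilde f_u$ with $\tilde f_u(x) = \exp[iu^\top x]$, and write
\[ A^{r+1} \tilde f_u = A\bigl(g_r \exp\bigl(iu^\top x\bigr)\bigr) = \sum_{|\alpha| \ge 1} a_\alpha(x)\, \partial_{x^\alpha} \bigl(g_r \exp\bigl(iu^\top x\bigr)\bigr). \]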
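Leibniz formula implies
\[ A^{r+1} \tilde f_u = \sum_{|\alpha| \ge 1} a_\alpha(x) \sum_{\beta \le \alpha} \frac{\alpha!}{\beta!(\alpha - \beta)!}\, \partial_{x^\beta} g_r\, \partial_{x^{\alpha-\beta}} \exp\bigl(iu^\top x\bigr) = \sum_{|\alpha| \ge 1} a_\alpha(x) \sum_{\beta \le \alpha} \frac{\alpha!}{\beta!(\alpha - \beta)!} (iu)^{\alpha-\beta}\, \partial_{x^\beta} g_r \exp\bigl(iu^\top x\bigr). \]
Hence, the following recurrent formula holds
\[ g_{r+1} = \sum_{|\alpha| \ge 1} a_\alpha(x) \sum_{\beta \le \alpha} \frac{\alpha!}{\beta!(\alpha - \beta)!} (iu)^{\alpha-\beta}\, \partial_{x^\beta} g_r. \tag{A.10} \]
Similar formulas for derivatives of $g_{r+1}$ can be obtained:
\[ \partial_{x^\rho} g_{r+1} = \sum_{|\alpha| \ge 1} \sum_{\beta \le \alpha} \frac{\alpha!}{\beta!(\alpha - \beta)!} (iu)^{\alpha-\beta} \sum_{\eta \le \rho} \frac{\rho!}{\eta!(\rho - \eta)!}\, \partial_{x^\eta} a_\alpha\, \partial_{x^{\rho-\eta+\beta}} g_r. \]
Since the underlying process is affine, all derivatives of $a$ of order higher than one are zero and thus, by induction, $g_r$ is polynomial in $x$ of degree at most equal $r$. We so get for $|\rho| \le r + 1$,
\[ \partial_{x^\rho} g_{r+1} = \sum_{\eta \le \rho,\ |\eta| \le 1} \frac{\rho!}{\eta!(\rho - \eta)!} \sum_{|\alpha| \ge 1} \sum_{\beta \le \alpha} \frac{\alpha!}{\beta!(\alpha - \beta)!} (iu)^{\alpha-\beta}\, \partial_{x^\eta} a_\alpha\, \partial_{x^{\rho-\eta+\beta}} g_r. \]
By defining
\[ \Gamma_r := \max_{|\beta| \le r} \bigl|\partial_{x^\beta} g_r\bigr|, \]
we obtain the following estimate for $x \in X$,
\[ \bigl|\partial_{x^\rho} g_{r+1}\bigr| \le \Gamma_r \bigl(1 + \|x\|\bigr) \sum_{\eta \le \rho,\ |\eta| \le 1} \frac{\rho!}{\eta!(\rho - \eta)!} \sum_{|\alpha| \ge 1} \sum_{\beta \le \alpha} \frac{\alpha!}{\beta!(\alpha - \beta)!} |u|^{\alpha-\beta} D_{|\alpha|}^a \]
with $|u| := [|u_1|, \ldots, |u_n|]$. Hence, by the simple relations
\[ \sum_{\{\eta:\ |\eta| \le 1\}} \frac{\rho!}{\eta!(\rho - \eta)!} = 1 + |\rho|, \qquad \sum_{\beta \le \alpha} \frac{\alpha!}{\beta!(\alpha - \beta)!} |u|^{\alpha-\beta} \le \bigl(1 + \|u\|\bigr)^{|\alpha|}, \qquad \sum_{\alpha,\ |\alpha| = k} 1 = \frac{(k + n - 1)!}{k!(n - 1)!} \le 2^{n+k}, \]
with $\|u\| = \max_{i=1,\ldots,n} |u_i|$, we have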
k k Γr+1 2n Γr (r + 2) 1 + x 2 1 + u Dka k1
= 2 Γr (r + 2) 1 + x θ u , n
(A.11)
where the series in (A.11) is convergent due to assumption (4.2). As a consequence, (4.3) holds.

A.5. Proof of Proposition 4.5

From (A.10) we have, with $a_0 := 0$,
\begin{align*}
g_{r+1} &= \sum_{\alpha,\beta,\gamma\ge 0} a_\alpha \frac{\alpha!}{\beta!(\alpha-\beta)!} \frac{\gamma!}{(\gamma-\beta)!}\, (iu)^{\alpha-\beta} g_{r,\gamma}\, x^{\gamma-\beta}\, 1_{|\gamma|\le r}\, 1_{\beta\le\alpha}\, 1_{\beta\le\gamma} \\
&= \sum_{\beta,\gamma\ge 0} g_{r,\gamma+\beta} \binom{\gamma+\beta}{\beta} x^\gamma\, 1_{|\gamma+\beta|\le r}\, b_\beta \\
&= \sum_{|\gamma|\le r} x^\gamma \sum_{|\beta|\le r-|\gamma|} \binom{\gamma+\beta}{\beta} b^0_\beta\, g_{r,\gamma+\beta}
 + \sum_{|\gamma|\le r} \sum_{|\beta|\le r-|\gamma|} g_{r,\gamma+\beta} \binom{\gamma+\beta}{\beta} \sum_{\kappa,\ |\kappa|=1} b^1_{\beta,\kappa}\, x^{\gamma+\kappa} \\
&= \sum_{|\gamma|\le r+1} x^\gamma \sum_{|\beta|\le r-|\gamma|} g_{r,\gamma+\beta} \binom{\gamma+\beta}{\beta} b^0_\beta
 + \sum_{|\gamma|\le r+1} x^\gamma \sum_{|\kappa|=1,\ \kappa\le\gamma}\ \sum_{\beta,\ |\beta|\le r+1-|\gamma|} g_{r,\gamma-\kappa+\beta} \binom{\gamma-\kappa+\beta}{\beta} b^1_{\beta,\kappa},
\end{align*}
where empty sums are to be interpreted as zero. The second recursion follows from $(r+1)h_{r+1} = \eta^{-1} \tilde h_{r+1} + r h_r$ with $A(h_r f_u) = \tilde h_{r+1} f_u$ and $\tilde h_{r+1}$ computed via (4.6) with $g_r$ replaced by $h_r$.

References
[1] M. Aigner, A Course in Enumeration, Springer, Berlin, 2007.
[2] A.L. Amadori, Nonlinear integro-differential evolution problems arising in option pricing: A viscosity solutions approach, Differential Integral Equations 16 (2003) 787–811.
[3] K. Bichteler, J.-B. Gravereaux, J. Jacod, Malliavin Calculus for Processes with Jumps, Gordon & Breach, 1987.
[4] J. Dieudonné, Foundations of Modern Analysis, Academic Press, New York, 1960.
[5] D. Duffie, D. Filipović, W. Schachermayer, Affine processes and applications in finance, Ann. Appl. Probab. 13 (2003) 984–1053.
[6] D. Duffie, J. Pan, K. Singleton, Transform analysis and asset pricing for affine jump diffusions, Econometrica 68 (2000) 1343–1376.
[7] E. Eberlein, K. Glau, A. Papapantoleon, Analysis of valuation formulae and applications to exotic options in Lévy models, arXiv:0809.3405, 2009.
[8] E. Eberlein, A. Papapantoleon, Symmetries and pricing of exotic options in Lévy models, in: A. Kyprianou, W. Schoutens, P. Wilmott (Eds.), Exotic Option Pricing and Advanced Lévy Models, Wiley, 2005, pp. 99–128.
[9] D. Filipović, A general characterization of one-factor affine term structure models, Finance Stoch. 5 (2001) 389–412.
[10] M.G. Garroni, J.L. Menaldi, Second Order Elliptic Integro-Differential Problems, Res. Notes Math., vol. 430, Chapman & Hall, 2002.
[11] P. Glasserman, K.-K. Kim, Moment explosions and stationary distributions in affine diffusion models, Working paper, Columbia Business School, 2007.
[12] R.L. Graham, M. Grötschel, L. Lovász, Handbook of Combinatorics, vols. 1, 2, Elsevier Science B.V., Amsterdam, 1995.
[13] M. Keller-Ressel, T. Steiner, Yield curve shapes and the asymptotic short rate distribution in affine one-factor models, Finance Stoch. 12 (2) (2008) 149–172.
[14] B. Øksendal, A. Sulem, Applied Stochastic Control of Jump Diffusions, Springer, 2007.
[15] P. Protter, Stochastic Integration and Differential Equations, Springer, 1990.
[16] M. Reed, B. Simon, Methods of Modern Mathematical Physics, Academic Press, 1980.
[17] K. Sato, Lévy Processes and Infinitely Divisible Distributions, Cambridge Univ. Press, 1999.
[18] J. Schoenmakers, Locally equi-continuous semigroups on locally convex sequence spaces (in Dutch), Master's thesis, Eindhoven University of Technology, 1988.
Journal of Functional Analysis 257 (2009) 1251–1260 www.elsevier.com/locate/jfa
Complex symmetric partial isometries
Stephan Ramon Garcia a,1, Warren R. Wogen b,∗
a Department of Mathematics, Pomona College, Claremont, CA 91711, USA
b Department of Mathematics, CB #3250, Phillips Hall, Chapel Hill, NC 27599, USA
Received 7 April 2009; accepted 13 April 2009 Available online 18 April 2009 Communicated by D. Voiculescu
Abstract
An operator $T \in B(\mathcal{H})$ is complex symmetric if there exists a conjugate-linear, isometric involution $C : \mathcal{H} \to \mathcal{H}$ so that $T = CT^*C$. We provide a concrete description of all complex symmetric partial isometries. In particular, we prove that any partial isometry on a Hilbert space of dimension $\le 4$ is complex symmetric.
© 2009 Elsevier Inc. All rights reserved.
Keywords: Complex symmetric operator; Isometry; Partial isometry
1. Introduction

The aim of this note is to complete the classification of complex symmetric partial isometries which was started in [9]. In particular, we give a concrete necessary and sufficient condition for a partial isometry to be a complex symmetric operator. Before proceeding any further, let us first recall a few definitions. In the following, $\mathcal{H}$ denotes a separable, complex Hilbert space and $B(\mathcal{H})$ denotes the collection of all bounded linear operators on $\mathcal{H}$.

* Corresponding author.
E-mail addresses: [email protected] (S.R. Garcia), [email protected] (W.R. Wogen). URLs: http://pages.pomona.edu/~sg064747 (S.R. Garcia), http://www.math.unc.edu/Faculty/wrw (W.R. Wogen). 1 First author partially supported by National Science Foundation Grant DMS-0638789. 0022-1236/$ – see front matter © 2009 Elsevier Inc. All rights reserved. doi:10.1016/j.jfa.2009.04.005
S.R. Garcia, W.R. Wogen / Journal of Functional Analysis 257 (2009) 1251–1260
Definition. A conjugation is a conjugate-linear operator $C : \mathcal{H} \to \mathcal{H}$ which is both involutive (i.e., $C^2 = I$) and isometric (i.e., $\langle Cx, Cy\rangle = \langle y, x\rangle$).

Definition. We say that $T \in B(\mathcal{H})$ is C-symmetric if $T = CT^*C$. We say that $T$ is complex symmetric if there exists a conjugation $C$ with respect to which $T$ is C-symmetric.

It is not hard to see that $T$ is a complex symmetric operator if and only if $T$ is unitarily equivalent to a symmetric matrix with complex entries, regarded as an operator acting on an $\ell^2$-space of the appropriate dimension (see [4, Sect. 2.4] or [7, Prop. 2]). One can also easily show that if $\dim\ker T \ne \dim\ker T^*$, then $T$ is not a complex symmetric operator. For instance, the unilateral shift is perhaps the most ubiquitous example of a partial isometry which is not complex symmetric (see [7, Prop. 1], [4, Ex. 2.14], [5, Cor. 7]). On the other hand, we have [9, Thm. 4]:

Theorem 1. Let $T \in B(\mathcal{H})$ be a partial isometry.
(i) If $\dim\ker T = \dim\ker T^* = 1$, then $T$ is a complex symmetric operator.
(ii) If $\dim\ker T \ne \dim\ker T^*$, then $T$ is not a complex symmetric operator.
(iii) If $2 \le \dim\ker T = \dim\ker T^* \le \infty$, then either possibility can (and does) occur.

Although these results are the sharpest possible statements that can be made given only the data $(\dim\ker T, \dim\ker T^*)$, they are in some sense unsatisfactory. While it is known that there exist partial isometries in $B(\mathcal{H})$ that are not complex symmetric whenever $\dim\mathcal{H} \ge 5$, it turns out that every partial isometry in $B(\mathcal{H})$ is complex symmetric if $\dim\mathcal{H} \le 3$. The authors were unable to settle the issue in the case $\dim\mathcal{H} = 4$. To be more specific, the techniques used in [9] were insufficient to handle the case $\dim\mathcal{H} = 4$ and $\dim\ker T = 2$. Significant numerical evidence in favor of the assertion that all partial isometries on a four-dimensional Hilbert space are complex symmetric has recently been produced by J. Tener [11]. Let us now describe our results and the resolution of this problem.
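The two definitions above can be checked numerically in finite dimensions. The sketch below assumes the canonical conjugation on $\mathbb{C}^n$ (entry-by-entry complex conjugation); the helper names are hypothetical and only illustrate the definitions. For this particular $C$, the condition $T = CT^*C$ reduces to $T$ being equal to its transpose.

```python
def C(x):
    # Canonical conjugation on C^n (an illustrative choice of conjugation).
    return [complex(z).conjugate() for z in x]

def inner(x, y):
    # Inner product, linear in the first argument.
    return sum(a * complex(b).conjugate() for a, b in zip(x, y))

def apply(M, x):
    return [sum(M[i][k] * x[k] for k in range(len(x))) for i in range(len(M))]

def adjoint(M):
    n, m = len(M), len(M[0])
    return [[complex(M[i][j]).conjugate() for i in range(n)] for j in range(m)]

def is_C_symmetric(T, tol=1e-12):
    """Check T = C T* C on the standard basis (both sides are linear maps)."""
    n = len(T)
    T_star = adjoint(T)
    for j in range(n):
        e = [1.0 if k == j else 0.0 for k in range(n)]
        lhs = apply(T, e)
        rhs = C(apply(T_star, C(e)))
        if max(abs(a - b) for a, b in zip(lhs, rhs)) > tol:
            return False
    return True

x, y = [1 + 1j, 2.0], [3j, 1 - 1j]
involutive = C(C(x)) == [complex(z) for z in x]            # C^2 = I
isometric = abs(inner(C(x), C(y)) - inner(y, x)) < 1e-12   # <Cx, Cy> = <y, x>

symmetric = [[1 + 2j, 3j], [3j, -1.0]]   # equal to its transpose
nilpotent = [[0.0, 1.0], [0.0, 0.0]]     # not equal to its transpose
```

Note that `is_C_symmetric(nilpotent)` failing only says that this matrix is not symmetric with respect to the canonical conjugation; as recalled later in the text, every operator on a two-dimensional space is complex symmetric with respect to some conjugation.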
Suppose that $T$ is a partial isometry on $\mathcal{H}$ and let
\[
\mathcal{H}_1 = (\ker T)^\perp = \operatorname{ran} T^* \tag{1}
\]
denote the initial space of $T$ and $\mathcal{H}_2 = (\mathcal{H}_1)^\perp = \ker T$ denote its orthogonal complement (see [10, Pr. 127] or [3, Ch. VIII, Sect. 3] for terminology). With respect to the orthogonal decomposition $\mathcal{H} = \mathcal{H}_1 \oplus \mathcal{H}_2$, we have
\[
T = \begin{pmatrix} A & 0 \\ B & 0 \end{pmatrix} \tag{2}
\]
where $A : \mathcal{H}_1 \to \mathcal{H}_1$ and $B : \mathcal{H}_1 \to \mathcal{H}_2$. Furthermore, the fact that $T^*T$ is the orthogonal projection onto $\mathcal{H}_1$ yields the identity $A^*A + B^*B = I$, where $I$ denotes the identity operator on $\mathcal{H}_1$. Finally, observe that the operator $A \in B(\mathcal{H}_1)$ is simply the compression of the partial isometry $T$ to its initial space.
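As a small illustration of the block form (2), the following sketch uses a hypothetical example that does not appear in the text: the truncated shift on $\mathbb{C}^3$, whose standard basis is already adapted to $\mathcal{H}_1 \oplus \mathcal{H}_2$. It checks that $T^*T$ is a projection, reads off $A$ and $B$, and verifies $A^*A + B^*B = I$.

```python
def matmul(X, Y):
    return [[sum(X[i][k] * Y[k][j] for k in range(len(Y)))
             for j in range(len(Y[0]))] for i in range(len(X))]

def adjoint(X):
    return [[complex(X[i][j]).conjugate() for i in range(len(X))]
            for j in range(len(X[0]))]

def close(X, Y, tol=1e-12):
    return all(abs(a - b) <= tol for rx, ry in zip(X, Y) for a, b in zip(rx, ry))

# Truncated shift: T e1 = e2, T e2 = e3, T e3 = 0, so ker T = span{e3}.
T = [[0, 0, 0],
     [1, 0, 0],
     [0, 1, 0]]

P = matmul(adjoint(T), T)                      # T*T
is_partial_isometry = close(matmul(P, P), P)   # T*T is a projection
initial_space_ok = close(P, [[1, 0, 0], [0, 1, 0], [0, 0, 0]])

# Blocks with respect to H1 = span{e1, e2}, H2 = span{e3}.
A = [row[:2] for row in T[:2]]                 # compression of T to H1
B = [row[:2] for row in T[2:]]                 # maps H1 into H2
second_column_block_zero = all(abs(row[2]) < 1e-12 for row in T)

AA = matmul(adjoint(A), A)
BB = matmul(adjoint(B), B)
total = [[AA[i][j] + BB[i][j] for j in range(2)] for i in range(2)]
identity_on_H1 = close(total, [[1, 0], [0, 1]])
```

Here $A$ acts on a two-dimensional space, which is the situation covered by the rank-two case discussed below.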
The main result of this note is the following concrete description of complex symmetric partial isometries:

Theorem 2. Let $T \in B(\mathcal{H})$ be a partial isometry. If $A$ denotes the compression of $T$ to its initial space, then $T$ is a complex symmetric operator if and only if $A$ is a complex symmetric operator.

Because the proof is somewhat lengthy and computational, we defer it until Section 2. We remark that Theorem 2 remains true if one instead considers the final space of $T$. Indeed, simply apply the theorem with $T^*$ in place of $T$ and then take adjoints.

Corollary 1. Every partial isometry of rank $\le 2$ is complex symmetric.

Proof. Let $T \in B(\mathcal{H})$ be a partial isometry such that $\operatorname{rank} T \le 2$. If $\operatorname{rank} T = 0$, then $T = 0$ and there is nothing to prove. If $\operatorname{rank} T = 1$, then we may appeal to [9, Cor. 5], which asserts that every rank-one operator is complex symmetric. If $\operatorname{rank} T = 2$, then we may write
\[
T = \begin{pmatrix} A & 0 \\ B & 0 \end{pmatrix}
\]
where $A$ is an operator on a two-dimensional space. Since every operator on a two-dimensional Hilbert space is complex symmetric (see [1, Cor. 3], [2, Cor. 3.3], [7, Ex. 6], [9, Cor. 1], or [11, Cor. 3]), the desired conclusion follows immediately from Theorem 2. □

Corollary 2. Every partial isometry on a Hilbert space of dimension $\le 4$ is complex symmetric.

Proof. As mentioned earlier, the results of [9] indicate that only the case $\dim\mathcal{H} = 4$ and $\dim\ker T = 2$ requires resolution. The corollary is now an immediate consequence of Theorem 2 and the fact that every operator on a two-dimensional Hilbert space is complex symmetric. □

We conclude this section with the following theorem, which asserts that each C-symmetric partial isometry can be extended to a C-symmetric unitary operator on the whole space (the significance lies in the fact that the corresponding conjugations for these two operators are the same).

Theorem 3. If $T$ is a C-symmetric partial isometry, then there exists a C-symmetric unitary operator $U$ and an orthogonal projection $P$ such that $T = UP$.
Proof. Since $T$ is a C-symmetric partial isometry, it follows that $|T| = P$ is an orthogonal projection and that $T = CJP$ where $J$ is a partial conjugation supported on $\operatorname{ran} P$ which commutes with $P$ [8, Sect. 2.2]. We may extend $J$ to a conjugation $\tilde{J}$ on all of $\mathcal{H}$ by letting $\tilde{J} = J \oplus J'$ where $J'$ is any conjugation on $\ker P$. The operator $U = C\tilde{J}$ is the desired C-symmetric unitary operator. □

2. Proof of Theorem 2

This entire section is devoted to the proof of Theorem 2. We first require the following lemma:

Lemma 1. If $\mathcal{H}$, $\mathcal{K}$ are separable complex Hilbert spaces, then $T \in B(\mathcal{H})$ is a complex symmetric operator if and only if $T \oplus 0 \in B(\mathcal{H} \oplus \mathcal{K})$ is a complex symmetric operator.
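Proof. If $T$ is a C-symmetric operator on $\mathcal{H}$, then it is easily verified that $T \oplus 0$ is $(C \oplus J)$-symmetric on $\mathcal{H} \oplus \mathcal{K}$ for any conjugation $J$ on $\mathcal{K}$. The other direction is slightly more difficult to prove. Suppose that $S = T \oplus 0$ is a complex symmetric operator on $\mathcal{H} \oplus \mathcal{K}$. Before proceeding any further, let us remark that it suffices to consider the case where
\[
\mathcal{H} = \overline{\operatorname{ran} T + \operatorname{ran} T^*}. \tag{3}
\]
Otherwise let $\mathcal{H}_1 = \overline{\operatorname{ran} T + \operatorname{ran} T^*}$ and note that $\mathcal{H}_1$ is a reducing subspace of $T$. If $\mathcal{H}_2$ denotes the orthogonal complement of $\mathcal{H}_1$ in $\mathcal{H}$, then with respect to the orthogonal decomposition $\mathcal{H}_1 \oplus \mathcal{H}_2 \oplus \mathcal{K}$, the operator $S$ has the form $T' \oplus 0 \oplus 0$, where $T'$ denotes the restriction of $T$ to $\mathcal{H}_1$. By now considering $S$ with respect to the orthogonal decomposition $\mathcal{H} \oplus \mathcal{K} = \mathcal{H}_1 \oplus (\mathcal{H}_2 \oplus \mathcal{K})$, it follows that we need only consider the case where (3) holds.

Suppose now that (3) holds and that $S$ is C-symmetric where $C$ denotes a conjugation on $\mathcal{H} \oplus \mathcal{K}$. Writing the equations $CS = S^*C$ and $CS^* = SC$ in terms of the $2 \times 2$ block matrices
\[
S = \begin{pmatrix} T & 0 \\ 0 & 0 \end{pmatrix}, \qquad
C = \begin{pmatrix} C_{11} & C_{12} \\ C_{21} & C_{22} \end{pmatrix} \tag{4}
\]
(the entries $C_{ij}$ of $C$ are conjugate-linear operators), we find that
\begin{align}
C_{11}T &= T^*C_{11}, \tag{5} \\
C_{21}T &= C_{21}T^* = 0, \tag{6} \\
TC_{12} &= T^*C_{12} = 0. \tag{7}
\end{align}
Since $C_{21}T = C_{21}T^* = 0$, it follows that $C_{21}$ vanishes on $\operatorname{ran} T + \operatorname{ran} T^*$ and hence on $\mathcal{H}$ itself by (3). On the other hand, (7) implies that the range of $C_{12}$ is contained in $\ker T \cap \ker T^*$, which is the orthogonal complement of $\operatorname{ran} T + \operatorname{ran} T^*$ in $\mathcal{H}$. By (3), this implies that $C_{12}$ vanishes identically. It follows immediately from (4) that $C_{11}$ and $C_{22}$ must be conjugations on $\mathcal{H}$ and $\mathcal{K}$, respectively, whence $T$ is $C_{11}$-symmetric by (5). This concludes the proof of the lemma. □

Now let us suppose that $T$ is a partial isometry on $\mathcal{H}$ and let $\mathcal{H}_1 = (\ker T)^\perp = \operatorname{ran} T^*$ and $\mathcal{H}_2 = \ker T$. With respect to the decomposition $\mathcal{H} = \mathcal{H}_1 \oplus \mathcal{H}_2$, it follows that
\[
T = \begin{pmatrix} A & 0 \\ B & 0 \end{pmatrix}
\]
where $A : \mathcal{H}_1 \to \mathcal{H}_1$, $B : \mathcal{H}_1 \to \mathcal{H}_2$, and
\[
A^*A + B^*B = I. \tag{8}
\]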
(⇒) Suppose that $T$ is a complex symmetric operator. For an operator with polar decomposition $T = U|T|$ (i.e., $U$ is a partial isometry satisfying $\ker U = \ker T$ and $|T|$ denotes the positive operator $\sqrt{T^*T}$), the Aluthge transform of $T$ is defined to be the operator $\tilde{T} = |T|^{1/2} U |T|^{1/2}$. Noting that
\[
T^*T = \begin{pmatrix} I & 0 \\ 0 & 0 \end{pmatrix},
\]
we find that
\[
\tilde{T} = \begin{pmatrix} A & 0 \\ 0 & 0 \end{pmatrix}.
\]
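The Aluthge transform computed above can be reproduced on a small example. The sketch below is only an illustration under stated assumptions: it uses the truncated shift on $\mathbb{C}^3$ (a hypothetical example not taken from the text) and relies on the fact that, for a partial isometry, $|T| = T^*T$ is a projection, so that $|T|^{1/2} = |T|$ and one may take $U = T$ in the polar decomposition.

```python
def matmul(X, Y):
    return [[sum(X[i][k] * Y[k][j] for k in range(len(Y)))
             for j in range(len(Y[0]))] for i in range(len(X))]

def adjoint(X):
    return [[complex(X[i][j]).conjugate() for i in range(len(X))]
            for j in range(len(X[0]))]

def close(X, Y, tol=1e-12):
    return all(abs(a - b) <= tol for rx, ry in zip(X, Y) for a, b in zip(rx, ry))

# Truncated shift on C^3: a partial isometry with initial space span{e1, e2}.
T = [[0, 0, 0],
     [1, 0, 0],
     [0, 1, 0]]

abs_T = matmul(adjoint(T), T)            # |T| = T*T because T*T is a projection
polar_ok = close(matmul(T, abs_T), T)    # T = U|T| with U = T

# Aluthge transform |T|^{1/2} U |T|^{1/2} = |T| T |T| for a partial isometry.
aluthge = matmul(matmul(abs_T, T), abs_T)

# Expected block form A (+) 0 with A = [[0, 0], [1, 0]] on span{e1, e2}.
A_plus_zero = [[0, 0, 0],
               [1, 0, 0],
               [0, 0, 0]]
matches_block_form = close(aluthge, A_plus_zero)
```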
By [6, Thm. 1], we know that the Aluthge transform of a complex symmetric operator is complex symmetric. Applying Lemma 1 to $\tilde{T}$, we conclude that $A$ is complex symmetric, as desired.

(⇐) Let us now consider the more difficult implication of Theorem 2, namely that if $A$ is a complex symmetric operator, then $T$ is as well. We claim that it suffices to consider the case where $\overline{\operatorname{ran} B} = \mathcal{H}_2$. In other words, we argue that if $\mathcal{K} = \overline{\operatorname{ran} T + \operatorname{ran} T^*}$, then we may suppose that $\mathcal{K} = \mathcal{H}$. Indeed, $\mathcal{K}$ is a reducing subspace for $T$ and $T = 0$ on $\mathcal{K}^\perp$. By Lemma 1, if $T|_{\mathcal{K}}$ is a complex symmetric operator, then so is $T$.

Write $B = V|B|$ where $V : \mathcal{H}_1 \to \mathcal{H}_2$ is a partial isometry with initial space $(\ker B)^\perp \subseteq \mathcal{H}_1$ and final space $\mathcal{H}_2$ (since $\overline{\operatorname{ran} B} = \mathcal{H}_2$). In particular, we have the relations
\[
V^*B = |B| = B^*V, \qquad |B| = \sqrt{I - A^*A}. \tag{9}
\]
By hypothesis, the operator $A \in B(\mathcal{H}_1)$ is complex symmetric. Therefore suppose that $K$ is a conjugation on $\mathcal{H}_1$ such that $KA = A^*K$ and observe that the equations
\begin{align*}
A\sqrt{I - A^*A} &= \sqrt{I - AA^*}\,A, \\
A^*\sqrt{I - AA^*} &= \sqrt{I - A^*A}\,A^*, \\
K\sqrt{I - A^*A} &= \sqrt{I - AA^*}\,K, \\
K\sqrt{I - AA^*} &= \sqrt{I - A^*A}\,K
\end{align*}
follow from a standard polynomial approximation argument (i.e., if $p(x) \in \mathbb{R}[x]$, then $Ap(A^*A) = p(AA^*)A$ and $Kp(A^*A) = p(AA^*)K$ hold, so that the desired identities follow upon passage to the norm limit). In particular, it follows from the preceding that
\[
(KA)\sqrt{I - A^*A} = \sqrt{I - A^*A}\,(KA),
\]
that is,
\[
KA|B| = |B|KA, \qquad A^*K|B| = |B|A^*K. \tag{10}
\]
Let us now define a conjugate-linear operator $C$ on $\mathcal{H}$ by the formula
\[
C = \begin{pmatrix} AK & KB^* \\ BK & -VA^*KV^* \end{pmatrix}. \tag{11}
\]
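The operator $C$ of (11) can be assembled numerically in the case $\dim\mathcal{H} = 4$, $\dim\ker T = 2$. The following is a sketch under explicit assumptions, not the authors' computation: $K$ is taken to be the canonical conjugation on $\mathbb{C}^2$ (so $K$-symmetry of $A$ means $A = A^{\mathsf T}$), the particular matrices `A` and `V` are hypothetical, and the conjugate-linear map $C$ is encoded by a matrix `M` via $Cx = M\bar{x}$. With this encoding, $C^2 = I$ reads $M\overline{M} = I$, isometry reads $M^*M = I$, and $T = CT^*C$ reads $T = M T^{\mathsf T} \overline{M}$.

```python
import math

def matmul(X, Y):
    return [[sum(X[i][k] * Y[k][j] for k in range(len(Y)))
             for j in range(len(Y[0]))] for i in range(len(X))]

def conj(X):
    return [[complex(z).conjugate() for z in row] for row in X]

def transpose(X):
    return [list(col) for col in zip(*X)]

def adjoint(X):
    return conj(transpose(X))

def identity(n):
    return [[1.0 if i == j else 0.0 for j in range(n)] for i in range(n)]

def block(X11, X12, X21, X22):
    return [a + b for a, b in zip(X11, X12)] + [a + b for a, b in zip(X21, X22)]

def max_diff(X, Y):
    return max(abs(a - b) for rx, ry in zip(X, Y) for a, b in zip(rx, ry))

def sqrt_psd_2x2(P):
    # Closed form for a 2x2 positive semidefinite P: (P + sqrt(det) I) / sqrt(tr + 2 sqrt(det)).
    t = (P[0][0] + P[1][1]).real
    s = math.sqrt(max((P[0][0] * P[1][1] - P[0][1] * P[1][0]).real, 0.0))
    tau = math.sqrt(t + 2.0 * s)
    return [[(P[i][j] + (s if i == j else 0.0)) / tau for j in range(2)] for i in range(2)]

# Hypothetical data: A symmetric (hence K-symmetric) with norm < 1, V unitary.
A = [[0.3, 0.2j], [0.2j, 0.1 + 0.4j]]
r = 1.0 / math.sqrt(2.0)
V = [[r, r * 1j], [r * 1j, r]]
Z = [[0.0, 0.0], [0.0, 0.0]]

AstarA = matmul(adjoint(A), A)
abs_B = sqrt_psd_2x2([[identity(2)[i][j] - AstarA[i][j] for j in range(2)] for i in range(2)])
B = matmul(V, abs_B)                      # B = V|B|, |B| = sqrt(I - A*A)
T = block(A, Z, B, Z)                     # the partial isometry (2)

lower_right = [[-z for z in row] for row in matmul(matmul(V, adjoint(A)), transpose(V))]
M = block(A, transpose(B), B, lower_right)   # matrix of C in (11): C x = M conj(x)

err_partial_isometry = max_diff(matmul(adjoint(T), T), block(identity(2), Z, Z, Z))
err_involution = max_diff(matmul(M, conj(M)), identity(4))                # C^2 = I
err_isometry = max_diff(matmul(adjoint(M), M), identity(4))               # C isometric
err_symmetry = max_diff(matmul(matmul(M, transpose(T)), conj(M)), T)      # T = C T* C
```

All four error quantities should be at the level of floating-point rounding. The encoding by `M` is a design choice that keeps the conjugate-linear operator inside ordinary matrix arithmetic; any other conjugation $K$ on $\mathcal{H}_1$ could be handled by a unitary change of basis.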
Assuming for the moment that $C$ is a conjugation on $\mathcal{H}$, we observe that
\[
\underbrace{\begin{pmatrix} A & 0 \\ B & 0 \end{pmatrix}}_{T}
= \underbrace{\begin{pmatrix} AK & KB^* \\ BK & -VA^*KV^* \end{pmatrix}}_{C}
\underbrace{\begin{pmatrix} K & 0 \\ 0 & 0 \end{pmatrix}}_{J}
\underbrace{\begin{pmatrix} I & 0 \\ 0 & 0 \end{pmatrix}}_{|T|}.
\]
Since it is clear that $J$ is a partial conjugation which is supported on the range of $|T|$ and which commutes with $|T|$, it follows immediately that $T$ is a C-symmetric operator (see [8, Thm. 2]). To complete the proof of Theorem 2, we must therefore show that $C$ is a conjugation on $\mathcal{H}$. In other words, we must check that $C^2$ is the identity operator on $\mathcal{H}$ and that $C$ is isometric. Since these computations are somewhat lengthy, we perform them separately:

Claim. $C^2 = I$.

Proof of Claim. We first expand out $C^2$ as a $2 \times 2$ block matrix:
\begin{align*}
C^2 &= \begin{pmatrix} AK & KB^* \\ BK & -VA^*KV^* \end{pmatrix} \begin{pmatrix} AK & KB^* \\ BK & -VA^*KV^* \end{pmatrix} \\
&= \begin{pmatrix} AKAK + KB^*BK & AKKB^* - KB^*VA^*KV^* \\ BKAK - VA^*KV^*BK & BKKB^* + VA^*KV^*VA^*KV^* \end{pmatrix} \\
&= \begin{pmatrix} AA^* + KB^*BK & AB^* - KB^*VA^*KV^* \\ BA^* - VA^*KV^*BK & BB^* + VA^*KV^*VA^*KV^* \end{pmatrix}.
\end{align*}
To obtain the preceding line, we used the fact that $K$ is a conjugation and $A$ is $K$-symmetric. Letting $E_{ij}$ denote the entries of the preceding block matrix we find that
\[
E_{11} = AA^* + KB^*BK = AA^* + K(I - A^*A)K = AA^* + (I - AA^*) = I,
\]
\begin{align*}
E_{12} &= AB^* - KB^*VA^*KV^* \\
&= AB^* - K|B|A^*KV^* && \text{by (9)} \\
&= AB^* - KA^*K|B|V^* && \text{by (10)} \\
&= AB^* - A|B|V^* \\
&= AB^* - AB^* && \text{since } B^* = |B|V^* \\
&= 0,
\end{align*}
\begin{align*}
E_{21} &= BA^* - VA^*KV^*BK \\
&= BA^* - VA^*K|B|K && \text{since } V^*B = |B| \\
&= BA^* - V|B|A^*KK && \text{by (10)} \\
&= BA^* - V|B|A^* \\
&= BA^* - BA^* && \text{since } B = V|B| \\
&= 0.
\end{align*}
As for $E_{22}$, it suffices to show that $E_{22}$ agrees with $I$ (the identity operator on $\mathcal{H}_2$) on the range of $B$, which is dense in $\mathcal{H}_2$. In other words, we wish to show that $E_{22}Bx = Bx$ for all $x \in \mathcal{H}_1$, which is equivalent to showing that
\[
E_{22}Bx = BB^*Bx + VA^*KV^*VA^*KV^*Bx = Bx \tag{12}
\]
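for all $x \in \mathcal{H}_1$. Let us investigate the second term of (12):
\begin{align*}
VA^*KV^*VA^*KV^*Bx &= VA^*KV^*VA^*K|B|x && \text{by (9)} \\
&= VA^*KV^*V|B|A^*Kx && \text{by (10)} \\
&= VA^*K|B|A^*Kx && \text{since } V^*V = P_{\overline{\operatorname{ran}}\,|B|} \\
&= V|B|A^*KA^*Kx && \text{by (10)} \\
&= BA^*KA^*Kx && \text{since } B = V|B| \\
&= BA^*Ax \\
&= B(I - B^*B)x && \text{since } A^*A + B^*B = I \\
&= Bx - BB^*Bx.
\end{align*}
Putting this together with (12), we find that $E_{22}Bx = Bx$ for all $x \in \mathcal{H}_1$, whence $E_{22} = I$, as claimed. □

Claim. $C$ is isometric.

Proof of Claim. The proof requires three steps:
(i) Show that $C$ is isometric on $\mathcal{H}_1$.
(ii) Show that $C$ is isometric on $B\mathcal{H}_1$, which is dense in $\mathcal{H}_2$.
(iii) Show that $C\mathcal{H}_1 \perp C(B\mathcal{H}_1)$.

For the first portion, observe that
\begin{align*}
\left\| C \begin{pmatrix} x \\ 0 \end{pmatrix} \right\|^2
&= \left\| \begin{pmatrix} AK & KB^* \\ BK & -VA^*KV^* \end{pmatrix} \begin{pmatrix} x \\ 0 \end{pmatrix} \right\|^2
= \left\| \begin{pmatrix} AKx \\ BKx \end{pmatrix} \right\|^2 \\
&= \langle AKx, AKx\rangle + \langle BKx, BKx\rangle \\
&= \langle A^*AKx, Kx\rangle + \langle B^*BKx, Kx\rangle \\
&= \langle (A^*A + B^*B)Kx, Kx\rangle \\
&= \langle Kx, Kx\rangle = \|Kx\|^2 = \|x\|^2.
\end{align*}
Thus (i) holds. Now for (ii):
\begin{align*}
\left\| C \begin{pmatrix} 0 \\ Bx \end{pmatrix} \right\|^2
&= \left\| \begin{pmatrix} AK & KB^* \\ BK & -VA^*KV^* \end{pmatrix} \begin{pmatrix} 0 \\ Bx \end{pmatrix} \right\|^2
= \left\| \begin{pmatrix} KB^*Bx \\ -VA^*KV^*Bx \end{pmatrix} \right\|^2 \\
&= \|KB^*Bx\|^2 + \|VA^*KV^*Bx\|^2 \\
&= \|B^*Bx\|^2 + \|VA^*K|B|x\|^2 \\
&= \|B^*Bx\|^2 + \|V|B|A^*Kx\|^2 \\
&= \|B^*Bx\|^2 + \|BA^*Kx\|^2 \\
&= \|B^*Bx\|^2 + \langle BA^*Kx, BA^*Kx\rangle \\
&= \|B^*Bx\|^2 + \langle B^*BA^*Kx, A^*Kx\rangle \\
&= \|B^*Bx\|^2 + \langle (I - A^*A)A^*Kx, A^*Kx\rangle \\
&= \|B^*Bx\|^2 + \langle A^*K(I - A^*A)x, A^*Kx\rangle \\
&= \|B^*Bx\|^2 + \langle K(I - A^*A)x, AA^*Kx\rangle \\
&= \langle B^*Bx, B^*Bx\rangle + \langle KAA^*Kx, (I - A^*A)x\rangle \\
&= \langle (I - A^*A)x, (I - A^*A)x\rangle + \langle A^*Ax, (I - A^*A)x\rangle \\
&= \langle x, (I - A^*A)x\rangle - \langle A^*Ax, (I - A^*A)x\rangle + \langle A^*Ax, (I - A^*A)x\rangle \\
&= \langle x, (I - A^*A)x\rangle = \langle x, B^*Bx\rangle = \langle Bx, Bx\rangle = \|Bx\|^2.
\end{align*}
Thus (ii) holds. Now for (iii):
\begin{align*}
\left\langle C \begin{pmatrix} x \\ 0 \end{pmatrix}, C \begin{pmatrix} 0 \\ By \end{pmatrix} \right\rangle
&= \left\langle \begin{pmatrix} AK & KB^* \\ BK & -VA^*KV^* \end{pmatrix} \begin{pmatrix} x \\ 0 \end{pmatrix}, \begin{pmatrix} AK & KB^* \\ BK & -VA^*KV^* \end{pmatrix} \begin{pmatrix} 0 \\ By \end{pmatrix} \right\rangle \\
&= \left\langle \begin{pmatrix} AKx \\ BKx \end{pmatrix}, \begin{pmatrix} KB^*By \\ -VA^*KV^*By \end{pmatrix} \right\rangle \\
&= \langle AKx, KB^*By\rangle - \langle BKx, VA^*KV^*By\rangle \\
&= \langle B^*By, KAKx\rangle - \langle BKx, VA^*K|B|y\rangle \\
&= \langle B^*By, A^*x\rangle - \langle BKx, V|B|A^*Ky\rangle \\
&= \langle AB^*By, x\rangle - \langle BKx, BA^*Ky\rangle \\
&= \langle AB^*By, x\rangle - \langle B^*BKx, A^*Ky\rangle \\
&= \langle AB^*By, x\rangle - \langle (I - A^*A)Kx, A^*Ky\rangle \\
&= \langle AB^*By, x\rangle - \langle K(I - AA^*)x, A^*Ky\rangle \\
&= \langle AB^*By, x\rangle - \langle KA^*Ky, (I - AA^*)x\rangle \\
&= \langle AB^*By, x\rangle - \langle Ay, (I - AA^*)x\rangle \\
&= \langle AB^*By, x\rangle - \langle (I - AA^*)Ay, x\rangle \\
&= \langle AB^*By, x\rangle - \langle A(I - A^*A)y, x\rangle \\
&= \langle AB^*By, x\rangle - \langle AB^*By, x\rangle \\
&= 0.
\end{align*}
By the polarization identity, it follows that
\[
\left\langle C \begin{pmatrix} x_1 \\ By_1 \end{pmatrix}, C \begin{pmatrix} x_2 \\ By_2 \end{pmatrix} \right\rangle
= \left\langle \begin{pmatrix} x_2 \\ By_2 \end{pmatrix}, \begin{pmatrix} x_1 \\ By_1 \end{pmatrix} \right\rangle
\]
holds for all $x_1, x_2, y_1, y_2 \in \mathcal{H}_1$, whence $C$ is isometric on $\mathcal{H}$. □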
3. Partial isometries and the norm closure problem

Partial isometries on infinite-dimensional spaces often provide examples of note. For instance, one can give a simple example of a partial isometry $T$ satisfying $\dim\ker T = \dim\ker T^* = \infty$ which is not a complex symmetric operator:

Example 1. Let $S$ denote the unilateral shift on $\ell^2(\mathbb{N})$. Although $S$ is certainly not a complex symmetric operator (by (ii) of Theorem 1; see also [4, Ex. 2.14], [7, Prop. 1], or [5, Cor. 7]), part (i) of Theorem 1 ensures that the partial isometry $S \oplus S^*$ is complex symmetric. Indeed, take $N$ to be the bilateral shift on $\ell^2(\mathbb{Z})$, note that $S \oplus S^*$ is unitarily equivalent to $N - Ne_0 \otimes e_0$, and appeal to [9, Thm. 3]. That $S \oplus S^*$ is complex symmetric can also be verified by a direct computation [8, Ex. 5]. On the other hand, the partial isometry $T = S \oplus 0$ on $\ell^2(\mathbb{N}) \oplus \ell^2(\mathbb{N})$ is not a complex symmetric operator by Lemma 1.

Let $S(\mathcal{H})$ denote the subset of $B(\mathcal{H})$ consisting of all bounded complex symmetric operators on $\mathcal{H}$. There are several ways to think about $S(\mathcal{H})$. By definition, we have
\[
S(\mathcal{H}) = \{ T \in B(\mathcal{H}) : \exists \text{ a conjugation } C \text{ s.t. } T = CT^*C \}.
\]
If $C$ is a fixed conjugation on $\mathcal{H}$, then we also have
\[
S(\mathcal{H}) = \{ UTU^* : T = CT^*C,\ U \text{ unitary} \}.
\]
Thus if we identify $\mathcal{H}$ with $\ell^2(\mathbb{N})$ and $C$ denotes the canonical conjugation on $\ell^2(\mathbb{N})$ (i.e., entry-by-entry complex conjugation), we can think of $S(\mathcal{H})$ as being the unitary orbit of the set of all bounded (infinite) complex symmetric matrices.

The following example shows that the set $S(\mathcal{H})$ is not closed in the strong operator topology (SOT):

Example 2. We maintain the notation of Example 1. For $n \in \mathbb{N}$, let $P_n$ denote the orthogonal projection onto the span of the basis vectors $\{e_i : i \ge n\}$ of $\ell^2(\mathbb{N})$. Now observe that each operator $T_n = P_nS \oplus S^*$ is unitarily equivalent to $S \oplus 0_n \oplus S^*$ where $0_n$ denotes the zero operator on an $n$-dimensional Hilbert space. Each $T_n$ is complex symmetric since $S \oplus S^*$ is complex symmetric (by Lemma 1). On the other hand, since $P_nS$ is SOT-convergent to $0$, it follows that the SOT-limit of the sequence $T_n$ is $0 \oplus S^*$, which is not a complex symmetric operator (by Lemma 1).

The preceding example demonstrates that the set of all complex symmetric operators (on a fixed, infinite-dimensional Hilbert space $\mathcal{H}$) is not SOT-closed. We also remark that the conjugations corresponding to the operators $T_n$ from Example 2 depend on $n$. In contrast, if we fix a conjugation $C$, then it is elementary to see that the set of C-symmetric operators is an SOT-closed subspace of $B(\mathcal{H})$. We conclude with a related question, which we have been unable to resolve:

Question. Is $S(\mathcal{H})$ norm closed?

References
[1] L. Balayan, S.R. Garcia, Unitary equivalence to a complex symmetric matrix: Geometric criteria, preprint.
[2] N. Chevrot, E. Fricain, D. Timotin, The characteristic function of a complex symmetric contraction, Proc. Amer. Math. Soc. 135 (2007) 2877–2886, MR2317964 (2008c:47025).
[3] J.B. Conway, A Course in Functional Analysis, second ed., Grad. Texts in Math., vol. 96, Springer-Verlag, New York, 1990.
[4] S.R. Garcia, Conjugation and Clark operators, in: Contemp. Math., vol. 393, 2006, pp. 67–112, MR2198373 (2007b:47073).
[5] S.R. Garcia, Means of unitaries, conjugations, and the Friedrichs operator, J. Math. Anal. Appl. 335 (2007) 941–947, MR2345511 (2008i:47070).
[6] S.R. Garcia, Aluthge transforms of complex symmetric operators, Integral Equations Operator Theory 60 (3) (2008) 357–367, MR2392831.
[7] S.R. Garcia, M. Putinar, Complex symmetric operators and applications, Trans. Amer. Math. Soc. 358 (2006) 1285–1315, MR2187654 (2006j:47036).
[8] S.R. Garcia, M. Putinar, Complex symmetric operators and applications II, Trans. Amer. Math. Soc. 359 (2007) 3913–3931, MR2302518 (2008b:47005).
[9] S.R. Garcia, W.R. Wogen, Some new classes of complex symmetric operators, Trans. Amer. Math. Soc., in press.
[10] P.R. Halmos, A Hilbert Space Problem Book, second ed., Springer-Verlag, New York, 1982.
[11] J.E. Tener, Unitary equivalence to a complex symmetric matrix: An algorithm, J. Math. Anal. Appl. 341 (1) (2008) 640–648, MR2394112 (2008m:15062).