, p, q; μ)    (2.1)
in which $\mathcal{R}$, $\mathcal{P}$ and $f$ are real analytic for $I$ in a domain of $\mathbb{R}^{d-1}$, $\varphi \in \mathbb{T}^{d-1}$, $p$ and $q$ are real in a neighborhood of the origin, and $\mu$ is a small parameter, is called a-priori unstable if
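$$\partial_p \mathcal{P}(I,0,0;\mu) = \partial_p(\mathcal{P}+\mathcal{R})(I,0,0;\mu) = 0 = \partial_q \mathcal{P}(I,0,0;\mu) = \partial_q(\mathcal{P}+\mathcal{R})(I,0,0;\mu)$$
and
$$\det \partial^2_{(p,q)}\mathcal{P}(I,p,q;\mu) = \det \partial^2_{(p,q)}(\mathcal{P}+\mathcal{R})(I,p,q;\mu) \le -C \qquad (2.2)$$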
when $I$, $p$ and $q$ vary in their own sets of definition, and $C$ is a positive constant, independent of $\mu$. Some authors refer to a-priori unstable systems as "initially hyperbolic". Condition (2.2) means that $p = 0 = q$ is a hyperbolic equilibrium. An example of an a-priori unstable system is obtained by choosing $\mathcal{R}$ as free rotators,
$$\mathcal{R} = \frac{1}{2}\,(I_1^2 + \dots + I_{d-1}^2), \qquad (2.3)$$
and $\mathcal{P}$ as a pendulum,
$$\mathcal{P} = \frac{1}{2}\,p^2 + g^2(\cos q - 1), \qquad (2.4)$$
where $g$ is a constant. As it turns out, a-priori stable systems also have partially hyperbolic orbits near simple resonances. In the distinction between a-priori stable and a-priori unstable systems, a crucial role is played by the size of the Lyapunov exponent near hyperbolic equilibria. This exponent is of order one in the case of an a-priori unstable system because of (2.2), while it is of order $\sqrt{\varepsilon}$ near the simple resonances of a generic a-priori stable system. This will be clarified in Section 6. To better understand the previous remark, the reader may check that the following example [to be compared with the previous (2.3)-(2.4)] is a-priori stable: S(I,
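', 0, 0), φ' ∈ T^{d-1}}    (3.14)
are invariant under the Hamiltonian flow of $h_\infty$. Moreover, it follows that $\mathcal{T}_I$ is contained in the manifold with boundary
$$\mathcal{S}_I = \{(\mathcal{I}^I_\infty(p'q';\mu), \varphi', p', q'),\ \varphi' \in \mathbb{T}^{d-1},\ |p'| < R_\infty,\ |q'| < R_\infty\},$$
which is locally invariant and on which the motion is simply
$$\Phi^t_{h_\infty}(\mathcal{I}^I_\infty(p'q';\mu), \varphi', p', q') = (\mathcal{I}^I_\infty(p'q';\mu),\ \varphi' + \omega_\infty t,\ p' e^{-\lambda_\infty t},\ q' e^{\lambda_\infty t}), \qquad (3.15)$$
provided that $|p' e^{-\lambda_\infty t}|,\ |q' e^{\lambda_\infty t}| < R_\infty$, where $\omega_\infty = \partial_I h_\infty$ and $\lambda_\infty = \partial_{\zeta'} h_\infty$ depend only on $\zeta' = p'q'$, $I$ and $\mu$. In particular, the whiskers are (locally) parameterized as
$$W^s_I = \{(\mathcal{I}^I_\infty(0;\mu), \varphi', p', 0),\ \varphi' \in \mathbb{T}^{d-1},\ |p'| < R_\infty\}$$
and
$$W^u_I = \{(\mathcal{I}^I_\infty(0;\mu), \varphi', 0, q'),\ \varphi' \in \mathbb{T}^{d-1},\ |q'| < R_\infty\}. \qquad (3.16)$$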
We propose the name fan for sets of the type $\Omega_\mu \times \mathbb{T}^{d-1}$, which collect the tori, their whiskers, and their normal hyperbolic trajectories.
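The motion described by (3.15) can be sketched numerically. The following is only an illustration under stated assumptions: the frequency vector, the exponent `lam`, the radius and the initial data are hypothetical values, and the actions are held fixed. It shows that the product p'q' is conserved along the flow and that the formula is valid only until the expanding variable leaves the domain.

```python
import math

def normal_form_flow(phi, p, q, omega, lam, t):
    """Flow (3.15) at fixed actions: rotate the angles, contract p, expand q."""
    phi_t = [(a + w * t) % (2.0 * math.pi) for a, w in zip(phi, omega)]
    return phi_t, p * math.exp(-lam * t), q * math.exp(lam * t)

def exit_time(q, lam, radius):
    """Time at which |q e^{lam t}| reaches radius (infinite when q = 0)."""
    if q == 0.0:
        return math.inf
    return math.log(radius / abs(q)) / lam

# Hypothetical data: two angles, frequency (1, golden mean), exponent 0.7.
omega = (1.0, (1.0 + math.sqrt(5.0)) / 2.0)
lam, radius = 0.7, 1.0
phi0, p0, q0 = (0.0, 0.0), 0.5, 1e-3

t = 0.9 * exit_time(q0, lam, radius)
phi_t, p_t, q_t = normal_form_flow(phi0, p0, q0, omega, lam, t)
print("p'q' before:", p0 * q0, "after:", p_t * q_t)
print("still inside the domain:", abs(p_t) < radius and abs(q_t) < radius)
```

Points with q' = 0 never leave the domain in forward time, which is why that set plays the role of a local whisker.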
The method of proof used here yields a very strong normal form, since (3.15) describes exactly the motion of a $(d+1)$-dimensional neighborhood of the torus (even if it does not determine all the motions near the torus). This normal form is at the basis of the construction of the unstable orbits presented here in Section 4 (as well as in [11] and in [12]). Another fundamental ingredient in the construction of such unstable trajectories will be the fact that Diophantine properties (or, more generally, rational independence) of the "old frequency" $\partial_I h$ are preserved for the "new frequency" $\partial_{I'} h_\infty$, according to (P1) of Theorem 3.1. We remark that the hypotheses of [12] can be readily derived from the conclusions of our KAM Theorem. Namely, hypothesis (ii) of [12] follows from (3.2), (3.3), (P1) and (3.15); hypothesis (iii) of [12] follows from (3.14) and (3.16). We also recall that for "isochronous" systems a much stronger normal form holds; see [20]. We also remark that, leaving out the hyperbolic variables $p$ and $q$, our proof also establishes the classical KAM Theorem for Lagrangian tori in isoenergetically non-degenerate systems. Moreover, the same result as Theorem 3.1 holds for Hamiltonians depending on several small parameters $\mu^{(1)}, \dots, \mu^{(m)}$: the proof would remain the same, denoting $\mu = (\mu^{(1)}, \dots, \mu^{(m)})$ and considering it as a vector. The proof of Theorem 3.1 is deferred to Section 7.

We now derive from Theorem 3.1 a KAM result for Hamiltonians depending on two parameters $\varepsilon$ and $\mu$, in which the parameter $\varepsilon$ plays the role of a fixed singular-perturbation parameter, while the dependence on $\mu$ will be uniform. We will apply the following Corollary 3.2 in the a-priori stable setting, in which the Lyapunov exponent is not bounded away from zero uniformly in the parameter. In reference to this, see Lemma 6.3 below.

Corollary 3.2 Fix $I^* \in \mathbb{R}^{d-1}$, $E \in \mathbb{R}$. Consider the Hamiltonian
$$H(I,\varphi,p,q) = h(I,pq;\varepsilon) + f(I,\varphi,p,q;\varepsilon,\mu) \qquad (3.17)$$
•with h and f real for any real value of (I,
0 such that, if \/i\ < iio, the energy level is filled by whiskered tori with density at least 1 — 0(^/jlo). More precisely: There exist fio and R^, 0 < p 0 < ft and 0 < Roo < R, l')- From (7.4): Oo.Xo} • 'P(E') is a bound on
sucn
that, for \fi\ < fio, there exist:
(e1) a smooth canonical transformation $\Phi$, close to the identity and real analytic (for a fixed action) in the angles, in the hyperbolic variables and in the parameter $\mu$,
(e2) a function $h_\infty : \mathbb{R}^{d-1} \times \mathbb{R} \times \mathbb{R} \to \mathbb{R}$ with the same smoothness as $\Phi$,
(e3) a set $\Omega_\mu \subset \mathbb{R}^{(d-1)+1+1}$, with density at least $1 - O(\sqrt{\mu_0})$, such that, for fixed $|\mu| < \mu_0$, ∂^n(H∘Φ(I',
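$\forall (I',p',q') \in \Omega_\mu$. Moreover, setting $\zeta' = p'q'$,
$$|\partial_{p'}(H\circ\Phi)| = |q'\,\partial_{\zeta'} h_\infty| > 0, \qquad \forall (I',p',q') \in \Omega_\mu,\ \varphi' \in \mathbb{T}^{d-1},\ q' \ne 0,$$
$$|\partial_{q'}(H\circ\Phi)| = |p'\,\partial_{\zeta'} h_\infty| > 0, \qquad \forall (I',p',q') \in \Omega_\mu,\ \varphi' \in \mathbb{T}^{d-1},\ p' \ne 0.$$
More precisely: one can find a ball $B \subset \mathbb{R}^{d-1}$ in the actions and a set $\mathcal{D}_\tau$ of the form
$$\mathcal{D}_\tau = \{I \in B \text{ and } \partial_I H_0(I,0;0) \text{ is } (\gamma_0,\tau)\text{-Diophantine}\}$$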
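with a suitable $\gamma_0$ [i.e. $\gamma_0 = O(\sqrt{\mu_0})$], such that there exist
(E1) a function $\mathcal{I}^I_\infty(\zeta;\mu)$, with range in the action space, which is smooth in $I$, $\zeta$, $\mu$, and (for a fixed $I$) real analytic in $\zeta$ and $\mu$, for $|\zeta| < R_\infty$, $|\mu| < \mu_0$, verifying $\mathcal{I}^I_\infty(0;0) = I$,
(E2) a function $\alpha^I_\infty(\zeta;\mu)$ with the same regularity as $\mathcal{I}^I_\infty$, that verifies $\alpha^I_\infty(0;0) = 0$,
such that:
(P1) $\partial_{I'} h_\infty(\mathcal{I}^I_\infty(\zeta;\mu),\zeta;\mu) = \partial_I H_0(I,0;0)\cdot(1+\alpha^I_\infty(\zeta;\mu))$, $\forall I \in \mathcal{D}_\tau$.
(P2) The set $\Omega_\mu$ in (e3) can be described in the following two ways:
$$\Omega_\mu = \{(\mathcal{I}^I_\infty(p'q';\mu), p', q'),\ I \in \mathcal{D}_\tau,\ |p'| < R_\infty,\ |q'| < R_\infty\}$$
$$= \{(I',p',q') \text{ s.t. } |p'| < R_\infty,\ |q'| < R_\infty,\ h_\infty(I',p'q';\mu) = E \text{ and } \exists I \in \mathcal{D}_\tau \text{ s.t. } \partial_{I'} h_\infty(I',p'q';\mu) = \partial_I H_0(I,0;0)\cdot(1+\alpha^I_\infty(p'q';\mu))\}.$$
(P3) Denoting by $\mathrm{Dens}_E$ [resp., by $\mathrm{dens}_E$] the $(2d-1)$-dimensional restriction of the Lebesgue density on the energy level $\{(I',p',q') \text{ s.t. } h_\infty(I',p'q';\mu) = E\}$ [resp., the $d$-dimensional restriction in the space of the actions of the Lebesgue density on the energy level], we have
$$\mathrm{dens}_E\,\Omega_\mu \ge 1 - O(\sqrt{\mu_0}), \qquad \mathrm{Dens}_E(\Omega_\mu \times \mathbb{T}^{d-1}) \ge 1 - O(\sqrt{\mu_0}). \qquad (5.6)$$
(P4) We have the following equality of sets:
$$\Omega^0_\mu = \{\mathcal{I}^I_\infty(0;\mu),\ I \in \mathcal{D}_\tau\} = \{I' \text{ s.t. } h_\infty(I',0;\mu) = E \text{ and } \exists I \in \mathcal{D}_\tau \text{ s.t. } \partial_{I'} h_\infty(I',0;\mu) = \partial_I H_0(I,0;0)\cdot(1+\alpha^I_\infty(0;\mu))\}. \qquad (5.7)$$
(P5) Denoting by $\mathrm{dens}_E$ the $(d-2)$-dimensional restriction of the Lebesgue density to the manifold defined by the energy relation $h_\infty(I',0;\mu) = E$,
$$\mathrm{dens}_E\,\Omega^0_\mu \ge 1 - O(\sqrt{\mu_0}).$$
Finally, one can take
$$\mu_0 = O\big[\inf(-\det \partial^2_{(p,q)}\mathcal{P})\big]. \qquad (5.8)$$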
Proof. Using Lemma 5.1, we obtain the new Hamiltonian
$$H(I',\varphi',p',q') = h^*(I',p'q';\mu) + f^*(I',\varphi',p',q';\mu).$$
Notice that by (5.2) the matrices of isoenergetic non-degeneracy of $h^*$ and $H_0$ agree at the origin of the hyperbolic coordinates; and by (5.1) $\partial_{\zeta'} h^* > 0$, where $\zeta' = p'q'$. Therefore, Theorem 3.1 can be applied. □
6 Whiskered tori for a-priori stable systems
In this section, $\varepsilon$ will be a strictly positive, fixed, small parameter. Our goal is to look at an a-priori stable system near a simple resonance and to recognize that these systems (under extremely mild conditions) are "hyperbolic in the first order". In this way we will be able to apply the previous results to the a-priori stable case too. We note that this implies that the $d$-dimensional resonant tori break down for generic perturbations, creating $(d-1)$-dimensional whiskered tori. The mechanism of such a breakdown was considered, without measure estimates, in [38] and [25].

Lemma 6.1 Consider the function
$$h^{[0]}(I,p,q;\varepsilon) = h(I,p;\varepsilon) + \varepsilon f(I,p,q;\varepsilon) \qquad (6.1)$$
with $h$ and $f$ real analytic for $(I,p)$ in a domain of $\mathbb{R}^{d-1} \times \mathbb{R}$ and $q \in S^1$. Assume that there exists $(\bar I,\bar p) \in \mathbb{R}^{d-1} \times \mathbb{R}$ verifying $\partial_p h(\bar I,\bar p;0) = 0$ and $\partial_p^2 h(\bar I,\bar p;0) \ne 0$. Assume that the function $\bar f(q) = f(\bar I,\bar p,q;0)$ has a nonsingular critical point, i.e. there exists $\bar q$ such that $\partial_q \bar f(\bar q) = 0$ and $\partial_q^2 \bar f(\bar q) \ne 0$. Then there exist two functions $\bar p(I;\varepsilon)$ and $\bar q(I;\varepsilon)$, real analytic for $I$ near $\bar I$ and $\varepsilon$ small, with $\bar p(\bar I;0) = \bar p$, $\bar q(\bar I;0) = \bar q$, such that
$$\partial_p h^{[0]}(I,\bar p(I;\varepsilon),\bar q(I;\varepsilon);\varepsilon) = 0 = \partial_q h^{[0]}(I,\bar p(I;\varepsilon),\bar q(I;\varepsilon);\varepsilon). \qquad (6.2)$$
Moreover, if $\bar f$ has a nonsingular maximum and a nonsingular minimum, we can make the previous choice of $\bar q$ in order to verify
$$\partial_p^2 h(\bar I,\bar p;0)\,\partial_q^2 f(\bar I,\bar p,\bar q;0) < 0. \qquad (6.3)$$
Proof. Apply the Implicit Function Theorem to
$$C(p,q,I,\varepsilon) = \big(\partial_p h^{[0]}(I,p,q;\varepsilon),\ \partial_q f(I,p,q;\varepsilon)\big)$$
near $I = \bar I$, $p = \bar p$, $q = \bar q$ and $\varepsilon = 0$. □
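The Implicit Function Theorem argument above can be illustrated by following the critical point numerically with a Newton iteration. This is a sketch on a toy model chosen for illustration, not taken from the paper: the action is suppressed, h(p) = p²/2 and f(p, q) = cos q + p sin(q + 1) are hypothetical choices, and the unperturbed critical point is p = 0, q = 0.

```python
import math

def critical_point(eps, p=0.0, q=0.0, steps=20):
    """Newton iteration for (d_p h0, d_q f) = 0 on a toy model.

    Toy model (an assumption made for this sketch):
        h(p) = p**2 / 2,  f(p, q) = cos(q) + p * sin(q + 1),
        h0 = h + eps * f.
    """
    for _ in range(steps):
        g1 = p + eps * math.sin(q + 1.0)           # d_p h0
        g2 = -math.sin(q) + p * math.cos(q + 1.0)  # d_q f
        # Jacobian of (g1, g2) with respect to (p, q)
        a, b = 1.0, eps * math.cos(q + 1.0)
        c, d = math.cos(q + 1.0), -math.cos(q) - p * math.sin(q + 1.0)
        det = a * d - b * c
        # Newton step: solve J * (dp, dq) = (g1, g2) by Cramer's rule
        dp = (g1 * d - b * g2) / det
        dq = (a * g2 - c * g1) / det
        p, q = p - dp, q - dq
    return p, q

for eps in (0.0, 0.01, 0.05):
    p_bar, q_bar = critical_point(eps)
    print(f"eps={eps}: p_bar={p_bar:.6f}, q_bar={q_bar:.6f}")
```

At eps = 0 the Jacobian determinant equals the product of the second derivatives of h in p and of f in q, so the nondegeneracy assumptions of Lemma 6.1 are exactly what makes the Newton step well defined.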
The following Lemma moves the equilibria found in (6.2) to the origin: Lemma 6.2 Consider the Hamiltonian system (6.1), under the same assumptions as the previous Lemma. Let
q{Im; e ) ) PW + p{IW; e) sin (q - <j(/W;£))+• Z11'
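sends the Hamiltonian (6.1) into a new Hamiltonian $h^{[1]}(I^{[1]},p^{[1]},q^{[1]};\varepsilon)$ verifying
$$\partial_{p^{[1]}} h^{[1]}(I^{[1]},0,0;\varepsilon) = 0 = \partial_{q^{[1]}} h^{[1]}(I^{[1]},0,0;\varepsilon). \qquad (6.4)$$
Proof. Straightforward check. □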
We now inspect the hyperbolic structure of the above $h^{[1]}$ near $I = \bar I$ and $p = 0 = q$, showing that, for $\varepsilon$ small enough, $h^{[1]}$ inherits such a hyperbolic structure from the one of $h^{[0]}$ stated in (6.3). In detail:

Lemma 6.3 Let $h^{[1]}$ be the Hamiltonian obtained from $h^{[0]}$ in the previous Lemma. Define
$$\lambda(I,p,q;\varepsilon) = \sqrt{-\det \partial^2_{(p^{[1]},q^{[1]})} h^{[1]}(I,p,q;\varepsilon)}. \qquad (6.5)$$
Then,
$$(\lambda(I,0,0;\varepsilon))^2 = -\varepsilon\,(\partial_p^2 h\,\partial_q^2 f) - \varepsilon^2 \det \partial^2_{(p,q)} f,$$
with the functions on the right hand side evaluated at $p = \bar p(I;\varepsilon)$ and $q = \bar q(I;\varepsilon)$. In particular, if $\varepsilon$ is small enough, $\lambda(I,0,0;\varepsilon)$ is real and positive, and $|\lambda(I,p,q;\varepsilon)| \ge c\sqrt{\varepsilon}$, for a suitable constant $c$, for any $I$ in a suitable neighborhood of $\bar I$ and $p$ and $q$ near 0.
Proof. Straightforward check. □
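The size of the exponent in Lemma 6.3 can be checked on numbers. The sketch below assumes that the exponent at an equilibrium is the square root of minus the determinant of the Hessian of h + εf in (p, q); the second-derivative values are hypothetical and chosen so that the sign condition (6.3) holds. For comparison, the pendulum (2.4) has Hessian diag(1, -g²) at q = 0, hence an exponent g that does not depend on any small parameter.

```python
import math

def lyapunov_exponent(h_pp, f_pp, f_pq, f_qq, eps):
    """sqrt(-det Hessian_{(p,q)}(h + eps*f)) at an equilibrium."""
    det = (h_pp + eps * f_pp) * (eps * f_qq) - (eps * f_pq) ** 2
    if det >= 0.0:
        raise ValueError("equilibrium is not hyperbolic")
    return math.sqrt(-det)

def pendulum_exponent(g):
    """Exponent of P = p^2/2 + g^2 (cos q - 1) at p = 0 = q."""
    return math.sqrt(-(1.0 * (-g * g)))

# Hypothetical second derivatives with h_pp * f_qq < 0, as in (6.3).
h_pp, f_pp, f_pq, f_qq = 1.0, 0.5, 0.3, -1.0

for eps in (1e-2, 1e-4, 1e-6):
    lam = lyapunov_exponent(h_pp, f_pp, f_pq, f_qq, eps)
    print(f"eps={eps:g}: lambda={lam:.6e}, lambda/sqrt(eps)={lam / math.sqrt(eps):.6f}")

print("pendulum exponent for g = 1.5:", pendulum_exponent(1.5))
```

The ratio of the exponent to the square root of ε tends to a constant as ε decreases, which is the a-priori stable behaviour, while the pendulum exponent stays of order one.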
Lemma 6.4 Consider the system (6.1). Assume that $(\bar I,\bar p) \in \mathbb{R}^{d-1} \times \mathbb{R}$ verifies $\partial_p h(\bar I,\bar p;0) = 0$ and $\partial_p^2 h(\bar I,\bar p;0) \ne 0$. Assume that the function $\bar f(q) = f(\bar I,\bar p,q;0)$ has a nonsingular maximum and a nonsingular minimum. Then, there exists a canonical transformation $(I,\varphi,p,q) \longleftrightarrow (I^{[2]},\varphi^{[2]},p^{[2]},q^{[2]})$, defined for $p^{[2]}$ and $q^{[2]}$ in a neighborhood of 0, $I^{[2]}$ in a neighborhood of $\bar I$ and $\varphi^{[2]} \in \mathbb{T}^{d-1}$, with new Hamiltonian $h^{[2]}(I^{[2]},\zeta^{[2]};\varepsilon)$ verifying
$$|\partial_{\zeta^{[2]}} h^{[2]}(I^{[2]},p^{[2]}q^{[2]};\varepsilon)| = |\lambda(I^{[2]},p^{[2]},q^{[2]};\varepsilon)| \ge c\sqrt{\varepsilon} \qquad (6.6)$$
for a suitable constant $c$, for any $I^{[2]}$ in a suitable neighborhood of $\bar I$, $p^{[2]}$ and $q^{[2]}$ near 0, where we defined $\zeta^{[2]} = p^{[2]}q^{[2]}$ and $\lambda(I,p,q;\varepsilon)$ is defined in (6.5). Furthermore,
$$\partial^n_{I^{[2]}} h^{[2]}(I^{[2]},0;\varepsilon) = \partial^n_I (h + \varepsilon f)(I^{[2]},\bar p(I^{[2]};\varepsilon),\bar q(I^{[2]};\varepsilon);\varepsilon), \qquad \forall n \in \mathbb{N}^{d-1}. \qquad (6.7)$$
Proof. First apply Lemma 6.2 to obtain a Hamiltonian like (6.4), and recall also Lemma 6.3. Then apply Lemma 5.1. □

The next theorem will show the existence of whiskered tori near simple resonances for a-priori stable systems. It will follow via Corollary 3.2, applying the previous Lemmas, where $(J_1,\dots,J_d)$ and $(\psi_1,\dots,\psi_d)$ in the next statement will correspond respectively to $(I_1,\dots,I_{d-1},p)$ and $(\varphi_1,\dots,\varphi_{d-1},q)$ of the Lemmas above. This is done making use of a classical result in perturbation theory, namely the Averaging Theorem (see, for instance, §5 of [3] and §52 of [4]).
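The averaging over the fast angles that underlies the Averaging Theorem can be illustrated numerically. The sketch assumes d = 2 (one fast angle `psi`, one slow angle `x`) and a hypothetical perturbation `f`; it computes the mean over the fast angle and checks, by finite differences, that the averaged function has a nonsingular maximum and a nonsingular minimum.

```python
import math

def average_over_fast_angle(f, x, samples=64):
    """Mean of f(psi, x) over psi in [0, 2*pi), by an equispaced Riemann sum."""
    step = 2.0 * math.pi / samples
    return sum(f(k * step, x) for k in range(samples)) / samples

def f(psi, x):
    # Hypothetical perturbation: its average over psi is cos(x).
    return math.cos(x) + 0.5 * math.cos(psi + x) + 0.2 * math.sin(2.0 * psi) * math.cos(x)

def second_derivative(g, x, h=1e-3):
    """Centered finite-difference approximation of g''(x)."""
    return (g(x + h) - 2.0 * g(x) + g(x - h)) / (h * h)

def averaged(x):
    return average_over_fast_angle(f, x)

print("averaged(0.3) =", averaged(0.3), " cos(0.3) =", math.cos(0.3))
print("second derivative at 0:", second_derivative(averaged, 0.0))
print("second derivative at pi:", second_derivative(averaged, math.pi))
```

The equispaced sum is exact, up to rounding, for trigonometric polynomials of degree below the number of samples, which is why no finer quadrature is needed in this toy case.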
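Theorem 6.5 Fix $\nu \in \mathbb{N}$, $\nu > 2$. Consider the system $H(J,\psi) = h(J) + \varepsilon f(J,\psi;\varepsilon)$, with $h$ and $f$ real analytic for $J$ in a domain of $\mathbb{R}^d$ and $\psi \in \mathbb{T}^d$. Assume that $h$ is isoenergetically non-degenerate on the energy level $h = E$ with respect to the first $(d-1)$ action variables. Let $\bar J$ be such that $\partial_{J_d} h(\bar J) = 0$, $\partial^2_{J_d} h(\bar J) \ne 0$ and let $\partial_{(J_1,\dots,J_{d-1})} h(\bar J)$ be rationally independent. Set
$$\mathcal{F}_{\bar J}(x) = \frac{1}{\mathrm{meas}\,\mathbb{T}^{d-1}} \int_{\mathbb{T}^{d-1}} f(\bar J,\psi_1,\dots,\psi_{d-1},x;0)\,d\psi_1 \dots d\psi_{d-1}.$$
Assume that $\mathcal{F}_{\bar J}$ has nonsingular maximum and minimum. Then, a suitable subset (depending on $\nu$) of the energy level near $\bar J$ is filled by whiskered invariant tori with density at least $1 - O(\varepsilon^{\nu/2})$, provided that $\varepsilon$ is small enough; more precisely, the tori [resp., the fan$^5$] fill the space, near $\bar J$, with $(2d-3)$-dimensional density [resp., $(2d-1)$-dimensional density] at least $1 - O(\varepsilon^{\nu/2})$. More precisely: there exist
(i) a smooth canonical transformation $(J,\psi) = \Phi(I',\varphi',p',q')$, with $I' \in \mathbb{R}^{d-1}$, $p',q' \in \mathbb{R}$, $\varphi' \in \mathbb{T}^{d-1}$,
(ii) a smooth function $h_\infty : \mathbb{R}^{d-1} \times \mathbb{R} \to \mathbb{R}$,
(iii) a set $\Omega_{\varepsilon,\nu} \subset \mathbb{R}^{(d-1)+1+1}$, with density at least $1 - O(\varepsilon^{\nu/2})$,
such that:
$$\partial^n\big(H\circ\Phi(I',\varphi',p',q')\big) = \partial^n h_\infty(I',p'q';\varepsilon), \qquad \forall (I',p',q') \in \Omega_{\varepsilon,\nu},\ \varphi' \in \mathbb{T}^{d-1},\ n \in \mathbb{N}^{2d},$$
$$h_\infty(I',p'q';\varepsilon) = E, \qquad \forall (I',p',q') \in \Omega_{\varepsilon,\nu}.$$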
In the coordinates (I',
= {(/V,
0,0),
^'GT"-1},
for I' in a suitable set fi* u, whose density is at least 1 — 0(e"^2). are W\I') W(I')
The corresponding (local) whiskers
= {(/', V ' , p ' , 0 ) , a/ 6 T*" 1 \p'\ < R^} = {(I',V',0,q'),
for a suitable R^, > 0. Furthermore: for any I' G 12° „ there exists a smooth function lr,e,v 2/',e,i/(0) — I' and
'• K —* R d _ 1 such that
n*,» = { ( ^ , £ , . ( P V ) , P ' , 9 ' ) r e i2°„ \P'\
p'e-^C'^',
gV-CWX) ,
provided that $|p' e^{-\lambda_\infty(I',p'q')t}|,\ |q' e^{\lambda_\infty(I',p'q')t}| < R_\infty$.

5 Recall the notation of the fan at page 10.

Proof. Making use of the Averaging Theorem, we can find a canonical transformation, close to the identity for small $\varepsilon$, sending the Hamiltonian $H(J,\psi)$ of the hypothesis into
$$H^\flat(I,\varphi,p,q) = h(I,p) + \varepsilon f^\flat(I,p,q;\varepsilon) + O(\varepsilon^\nu).$$
Such a transformation is defined in a suitable neighborhood of $\bar J$ (which is small if $\nu$ is big). Moreover
$$f^\flat(I,p,q;0) = \frac{1}{\mathrm{meas}\,\mathbb{T}^{d-1}} \int_{\mathbb{T}^{d-1}} f(I,p,\psi_1,\dots,\psi_{d-1},q;0)\,d\psi_1 \dots d\psi_{d-1}.$$
Then, use Lemma 6.4 and Corollary 3.2 with $\mu = \varepsilon^\nu$. Notice also that it is important that Theorem 3.1 contains a quantitative estimate on how small $\mu_0$ is. In particular, it must be smaller than a constant multiple of a power of $\lambda_0$, and this estimate is satisfied if $\lambda \approx \sqrt{\varepsilon}$ and $\mu \approx \varepsilon^\nu$, with $\nu > 2$. □

The statement of the previous Theorem can be sharpened by considering Diophantine simple resonances and optimizing the choice of $\nu$ as done in the Nekhoroshev theory:

Theorem 6.6 Consider the system $H(J,\psi) = h(J) + \varepsilon f(J,\psi;\varepsilon)$, under the same assumptions as Theorem 6.5. Assume also that $\partial_{(J_1,\dots,J_{d-1})} h(\bar J)$ is $(\gamma,\tau)$-Diophantine. Then, if $\varepsilon$ is small enough, a neighborhood of $\bar J$ in the energy level is filled by whiskered invariant tori with density at least $1 - O(e^{-O(1/\varepsilon^c)})$, where $c > 0$ is a suitable constant. More precisely, the tori [resp., the fan] fill the space, near $\bar J$, with $(2d-3)$-dimensional density [resp., $(2d-1)$-dimensional density] at least $1 - O(e^{-O(1/\varepsilon^c)})$.

Proof. Following the notations of [34], we set $\Lambda = \{n = (n_1,\dots,n_d) \in \mathbb{Z}^d \text{ s.t. } n_1 = \dots = n_{d-1} = 0\}$, and
where the Cj's are suitable constants, chosen so that the hypotheses of the "Normal Form Lemma" of [34], page 192, are verified. Applying it to H(J,tp), it leads to the new Hamiltonian H^(I,ip,p,q) = h(I,p) + ef\l,p,q;£) + f,(I,
=
-J3J- / meas 11a l Jv*-i
f(l,p,i>1,...,ibd-i,q;0)dll>i...dibd-i,
and the size of $f_*$ is controlled by $\varepsilon e^{-O(1/\varepsilon^c)}$. Then, as in the proof of the previous Theorem, apply Lemma 6.4 and Theorem 4.1. □

Notice that, in the proof of the previous Theorem, an explicit dependence of the constants on the size of the domain of analyticity can be easily worked out. Namely, if the strip of analyticity in the angles $\psi$ has width $\xi$, then the "Normal Form Lemma" bounds the size of $f_*$ by $\varepsilon \exp(-K\xi/6)$, where $K$ is defined in (6.8). Related measure estimates for elliptic equilibria can be found in [15] and in Section 4.1.5 of [7].
7 Proof of the KAM Theorem about partially hyperbolic tori
Proof of Theorem 3.1. The proof presented here makes use of a Newton-type algorithm, which will provide a sequence of canonical transformations converging on a suitable Cantor set. The general step of the algorithm can be summarized as follows: Defining recursively suitable quantities as in (7.27)-(7.35), and assuming condition (7.36) [which is fulfilled by $\gamma_0 = O(\sqrt{\mu_0})$], there exists a sequence of canonical changes of variables $\Phi_j$, converging in a suitable Cantor set, transforming the Hamiltonian (3.1) into $H_j = h_j + f_j$, with $h_j$ depending only on the actions and on the product of the hyperbolic variables, and $\sup_{V_j} |f_j| \le \theta_j$, where $V_j$ is a sequence of sets, converging to a Cantor set, and $\theta_j$ converges to zero super-exponentially fast. Also, the set $V_j$ can be written as follows:
$$V_j = \{(I,\varphi,p,q;\mu) \in \mathbb{C}^{(d-1)+(d-1)+1+1+1} \text{ s.t. } |p| < R_j,\ |q| < R_j,\ |\Im\varphi| < \xi_j,\ |\mu| < \mu_0 \text{ and there exists } \bar I \in \mathcal{D}_\tau \text{ s.t. } |I - \mathcal{I}^{\bar I}_j(pq;\mu)| < \rho_j\},$$
where $\mathcal{D}_\tau$ is defined in (3.4), the quantities $R_j$, $\rho_j$ and $\xi_j$ are defined in (7.27)-(7.35), and $\mathcal{I}_j$ and $\alpha_j$ are functions defined via the Implicit Function Theorem by the relations
$$\partial_I h_j(\mathcal{I}^I_j(\zeta;\mu),\zeta;\mu) = \partial_I h(I,0;0)\cdot(1+\alpha^I_0(\zeta;\mu)) \cdots (1+\alpha^I_j(\zeta;\mu)),$$
$$h_j(\mathcal{I}^I_j(\zeta;\mu),\zeta;\mu) = E.$$
The fact that the KAM tori are of codimension (not higher than) one, i.e. p and q are (at most) one-dimensional, is crucial, in this argument, for the estimate on the small divisors. In order to have dimensional estimates, we introduce a constant c with the dimensions of the inverse of an action. This is done only to have "dimensional" estimates: the reader who does not find it useful may set c = 1 in the sequel. In this way the matrix of isoenergetic non-degeneracy becomes U0 = I
'
n ) '
wnere w
= &ih-
In the sequel, we will often make use of the following easy relation: for 6 < 1
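Also, we will use that, if $a > 0$, $0 < \delta < 1$, then there exist two constants $C$ and $C'$ (depending only on $d$ and $a$) such that
$$\sum_{n \in \mathbb{Z}^d} |n|^a e^{-|n|\delta} \le C\,\delta^{-(d+a)}, \qquad \sum_{n \in \mathbb{Z}^d,\ |n| > N} |n|^a e^{-|n|\delta} \le C'\,\delta^{-(d+a)} e^{-N\delta/2}. \qquad (7.2)$$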
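The scaling of the first sum in (7.2) can be checked by brute force. The sketch below makes several assumptions that are not in the text: dimension d = 2, exponent a = 1, and |n| taken as the sum of the absolute values of the components; the lattice is truncated to a box outside which every term carries a factor below exp(-cutoff).

```python
import math

def lattice_sum(delta, a=1, cutoff=30.0):
    """Sum of |n|^a * exp(-|n|*delta) over n in Z^2, with |n| = |n1| + |n2|.

    Only |n1|, |n2| <= cutoff/delta are kept; every dropped term
    contains a factor smaller than exp(-cutoff).
    """
    k_max = int(cutoff / delta)
    total = 0.0
    for n1 in range(-k_max, k_max + 1):
        for n2 in range(-k_max, k_max + 1):
            norm = abs(n1) + abs(n2)
            total += norm ** a * math.exp(-norm * delta)
    return total

d, a = 2, 1
ratios = {delta: lattice_sum(delta, a) * delta ** (d + a)
          for delta in (0.5, 0.25, 0.125)}
for delta, ratio in ratios.items():
    print(f"delta={delta}: sum * delta^(d+a) = {ratio:.4f}")
```

The rescaled sums stay bounded as delta decreases, which is the content of the first inequality; the bound itself is the constant C.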
Now we start the iterative process. The first step is slightly different from the other ones, since we need to build the first couple of functions 2^ and a'0 as follows. THE F I R S T STEP. Set h0(I,pq-/j.) = h(I,pq;/i), f0(I,
j^(o ; o) = / , Po (C; l*)-I\<
dthoiJ&iCriiCv) = "«(/) (i + a 0 «; M )), Po and
hQ(i^c,p)X;^ = E,
\a'Q(C, /i)| < cp0 ,
(7.3) where wo(I) = 9//i 0 (/,0;0). Let 90, Ao, B0 and L0 be such that s u p | / 0 | < 00, s\ip\dfh0\ < A0, sup \UQ I < Bo, and sup \djho\ < Lo, where the sup is done over OpA R li0. Obviously, we may choose 00 = 0(^0). For any real analytic F(I,
=
£
Fkjn(I;fi)pkqUm".
fc,>£N -ez-'-i
Also, without loss of generality, we may assume that V|7 — I"\ < 2p0, \p\ < 2RQ, \q\ < 2Ro, \p.\ < 2p.0, we have that \?ftd<;ho(I, pq; p)\ > Ao/2. THE I T E R A T I V E S C H E M E . Fix 7V0 suitably large (see (7.7) below) and p 0 suitably small (see (7.11) below). Also define £0 s f/2 and fix 50, 0 < S0 < m i n { l , £ 0 / 4 } . Denote f%jn the Taylor-Fourier terms of /o. Set
X0(/V P
' '*'^ .JL-, |fc-J|H-i"l>0, | n | < 7 V 0
(^-^^(^^(^lo(^PY;,).n^fc^e'-' (7.4)
40 defined on the set ^
A A J
,
=
{(I,V,P,q;ti)
€ C (
%
, |c^| <
&
,
]/x |
<
Mo ,
and there exists I € Z>T st \I - IjJ (pg; /n)| < /?o } • In the definition of V?,R0,(0tlLO, the index "0" high above refers to the index "0" of X^. We now consider the Lie transform (I,ip,p,q) = §lxa(I',
=
-fQ(I',f',p',g';ii)+
The next c*'s in this section stand for suitable constants (that can be explicitly determined by the algorithm). Set ^ = min{7o, Ao}- From (7.2): V°
sup
IE »,,«.»«-- /&„(/'; n) ( P T ( ? T ^""'1 <
.
M>N0 l«|>N 0
<
d+2)
e-N°l°V
Cle0S^
= 0$6^d+2)
EJ ( 7 o *)- 2 ,
(7.6)
where we have chosen 50
c*E0o
Furthermore, from (7.5), (7.1) and (7.2): sup
\{ho,Xo}\
=
o
sup
\
J2
fttnWwWfWY^Z
\n\
<
ciflo^
-2
(78)
-
E s t i m a t e s o n t h e small divisors. Assume that
Define 6 AQ = Ao. Assume also that 70 < 2AJiV0T+1 min{p 0 /2, i ^ } .
(7.10)
Define
the inequality above following from (7.10). Now, V/ e Z>T and n 6 Z d _ 1 - {0}, IS/Ao ( 4 ( C ; / i ) , Ci /*) • n| = |(1 + 4(C;/*))wo(/) • n| > (1 - cpo) M / ) • "I > I ^ p •
Thus, if
(7-12)
|/-4(C;//)|
|0JMJ>C;AO -n| > |a//io fzi(C;/i),C;/x) -«l - \9ih0 (zftCsAO.C;/*) -diho(I,C;n)\N0 > 6 One needs the dummy definition of A* just to make the notation uniform with the j— th step of the algorithm, in which one will set A*. ~ max{4o, Aj}.
41
>
^ 7 7 , o |n|
Vn e Z ^ 1 - {0}.
(7.13)
T
Besides, since uio(I) is real, \ik ^ j and | I — Jg (£; p)\ ^ Po: >
|iS//i 0 (/, C; p.) • n + 9cM-f> <; M) (fc - j)l > istyd/fco (x7(C; *0, C; M) • n + dfhoil,C;AO {k - j)]| - A0/>oWo >
>
|SR[i ( l + a'0(C, /*)) w0(/) • n + d(h0(I, C; /*) (k - j)]| - A„/8 =
= \X[ic&(.<;A*)"OCO • « + W - C;AO (* - j)]| - Ao/8 > > |fc - j | |3*dcho(J,C;A0l - I*[*«5(C;M)^(-0 • HI - V » > > |S3CMA<;A0M*ao(C;A0^-«|-V8>Ao/4(7.14) The estimate on the small divisors in xo is thus given by (7.13) and (7.14). These inequalities also show the convergence of the series defining xo on V? /;„«„„„• E s t i m a t e s o n t h e Lie transform. From the estimates on the small denominators and (7.2), it follows t h a t sup |Xo|< C 2 % V ° (7-15) PO.HO*
.«o-*o/2.eo
so that, by t h e Cauchy Estimate: a
sup ,
v°
sup |9/fXo| i
sup
sup
V" , l
13,'Xol
< c 3 - p r 5QK1 7o Pa <
*
C3-^6QKI To a
sup
\dp'Xo\
sup
|3g-Xo|
V" v°
c3^~-5oKi 7o -Ho < c3 - A - ^ . <
.
(7-16)
7o -Ho
where / q ' s d e n o t e suitable c o n s t a n t s (depending only on d and r ) . Hence, using Lemma A.3, V|£| < 3 , * X o ( ^ o / 4 , r t o e - ' " ' > , £ o - 4 c S o , * ' o ) - Vpo/3,Roe-3So,£0-350,no
— ^po
(7-17)
provided t h a t CS-^SQK*<1.
(7.18)
E s t i m a t e s o n t h e n e w H a m i l t o n i a n . Define h0(I',
=
f Jo
f0Hi',v',P',q';p)
= f
(l-t){{ho,Xo},Xo}o^Xo(I'^',p',q';fj,)dt
{fo,xo}°&m(i',
Jo
f5V',M;n) h^I'^'q'; n) MI',
^
Y,&0(i';f)W)k
ken = h0(I',p'q'; ») + / 0 *(/',pV; /*) = hUr,V>',p',q';n) + fl{I'^\p',q';fl)+
£
fc.jew, I E Z ^ - 1
|n|>JV0
#i(J'V.J>V)
=
ffcj'f/'yy.rt
f°kjn(l';p)p'kq'j
ein^
42 Using Lemma A.5 (at the first order for /o and at the second order for ho) one has ^o^io
=
h
foo^l
=
fo + fo~.
o + {ho,Xo} + h0
This implies, by (7.5), that H^I',V',p',q')
= h1(r,p'q';ll)
+ h{I',
(7.19)
By Lemma A.6, (7.15) and (7.8), making use of (7.17) to control the domains, we obtain: sup
|/ij| < v°
PO/ 4 . R O«~ 45 °.«O—«o->*o
sup v°
\{{h0,Xo},Xo}\
sup
v°
(7.20) 7o Po
Po/3.fioc_3*°.«o-35o.fo
ce-^-Sr3-
I/* I <
„
(7-21)
7o Po
TO/I.«O"~4SO.£O-«O.>'O
Hence, by (7.6) sup v°
|/i|
,»
4
= 0i
(7-22)
To Po
Then, setting p\ = po/8, Ri s RQ e~45°, £i = £o — 45o> we obtain a new Hamiltonian like (7.19) with sup v o | / i | < Si. By the Implicit Function Theorem, we obtain two functions Z[(£;^) and a{(£;/u), real analytic for |C| < Rl and |/x| < /JO verifying |Tf(C;M)-^(C;/*)l |O{(C;A*)I
< <
Pi/2 cpi/2
(7-23) (7.24)
fl/Ai (xf (C; /*), C; /i)
= 9 / ^ (zjfc; A*). <; A*) • (i + «i(C; /0) = = ^ ( 7 ) . (l + oS(C;A»))-(l + a{(<;/•)) fcitrf(C;/*),C;/*) = « .
By construction V}
R c
„ C V? „ . „ , so that sup v i
(7-25) (7.26)
l/il < 9\, and we can iterate the
previous arguments (writing the appropriate index instead of the index 0), from (7.4) onwards. I T E R A T I O N OF T H E A L G O R I T H M . Set 7* = min{7o, A 0 /2}. Fix a suitable l>\, recursively:
f
.
^
=|
*' - i-yM Pi
=
—rr
ft+i = I i?j+1
6+1
and define
(7-27)
(728)
-
(7-29)
(7-30)
EE ^ . e - 4 ^ = J ? o e - 4 S « = o * '
(7.31)
= ^-4*^=&-453«*
(7-32)
43
9j+i
" c74fisrt
. 3. 7*
= s
^ Cl(7-)
(7 33)
'
(7.34)
2
min{7;_1,A,}.
(7.35)
Obviously 2
1 £
°3
3
Iterating the scheme, one obtains Hi{I,
where

In order to apply the algorithm above recursively, one has to check that the following conditions are satisfied at the general $j$-th step of the scheme:
(C1) The sup [resp., the inf] over $V_j$ of a quantity involving only $h_j$ (or its derivatives up to a suitable order) is less than or equal to twice the corresponding sup [resp., greater than or equal to half of the corresponding inf] over $V_0$ of the corresponding quantity with index 0 (e.g.: $\lambda_j = \inf_{V_j} |\partial_\zeta h_j| \ge \lambda_0/2$, etc.),
(C2) The matrix $U_j$ is nonsingular on $V_j$,
(C3) $\gamma_j \le 2\lambda^*_j N_j^{\tau+1} \min\{\rho_j/2, R_j\}$ and $\rho_j \le \lambda_j/(8 c L_j N_j)$,
(C4) There exists a constant C* such that ej < (C* AJ £o)2' • To prove (C1)-(C4), we will assume the following m a i n condition: Ki(\
C l ( m in{ 7 0 ,A 0 }) l0g
SW0
2 K
\ *
(7 36)
'
where A"i and K2 are suitable constants. We remark [see (7.42) below] that this condition is satisfied choosing 70 = O( v / 5o), for #0 small enough, 9Q < O(Xg). The proof of (C1)-(C4) is by induction, assuming them true for i = 1 , . . . ,J — 1. In these pages, fcj's will stand for suitable constants. First notice that, by definition of /ij, the relation Xj > Aj_i/2 follows, and so 7? > 7|_x/2. Notice also that e< > e'l_1, VI < i < J, so £i > 4-1 > • • • > 4
, VI < i < J.
(7.37)
Therefore, defining Ao = log(l/£o), tyi+lpi
Ni < —— <>o
1
tyi+lpi
log — = - — - Ao , VI < i < J £0
(7.38)
°o
Making use of the inductive hypothesis, this implies that P^
,» ui 4, *? , *T '+ I - V l < i < j - 1 .
(7.39)
44 Thus
Then, £j
< *f 4 , A J + 1 c - 2 ^ - 1 e?_! .
(7.40)
Iterating (7.40), one gets (C4). Furthermore, it is easy to see that sup v . \Uj —Uj-i\ < BQ13~:',
hence
sup % - Woi < y > u p iw< - w<-ii < Bo-1533-* = —-. This implies (C2), via Lemma A.2. Incidentally, we have also proved that Bj < 2Bn- The other relations in (CI) follow in the same way. Also, the already proved (C4) implies that 2^+l0 ^ > ^ — do then, recalling (7.38), one obtains (C3).
1 logp^-r— • G*Ao£o
(7.41)
P a s s a g e t o t h e limit. From (7.23):
sup
\l'+m -1]\< £
^p
\ll, -2/1 < J2
ft^ElT'
showing the uniform convergence of J j to a suitable Z^, for |£| < R^ and |/J| < fioAlso, if we set 00
<4(C;AO = I I ( 1
+ Q
J K ; A O ) - I , v|cl <«!„, \H\
j=0
using the fact that \aj\ < cpj < cpo/V it is easy to prove that the above product converges uniformly and that Ictcd < cp. Via iteration of (7.15), the convergence of the transformation $ j = $ £ o . . . o $£,o readily follows. Since the convergences are uniform for complex \fi\ < fio, \p\ < Rao, \q\ < Rao, |3<£>| < £00, we obtain the claimed analyticity in the angles, in the hyperbolic variables and in the parameter fi. C H A R A C T E R I Z A T I O N O F T H E S E T S n„ A N D n» OF V A L I D I T Y OF T H E T H E O R E M [see (3.6) and (3.10)] A N D M E A S U R E OF THE P R E S E R V E D TORI. Let x such that if x > x then c\K^ (loga;)^ 2 a r 1 < 1. Set
7o
= J ^ - ^ = 0 ( ^ ) = 0( v ^oT).
(7.42)
Then, the KAM condition (7.36) is satisfied. Since hoc, is isoenergetically non-degenerate for J and C sufficiently close to I* and 0 respectively and for tin small enough, without loss of generality, we may assume that inf \d[d, /loo I > 0, /es,|
45 is a diffeomorphism. This proves that / = lUO; V) <=» M A 0 ; p) = E and djh^I,0;
p) = w0( J) • (1 + o 4 ( 0 ; A*)) -
(7.44)
And this implies (3.10) and (3.11). We now prove that the function 1^ is locally invertible on the U-energy level, i.e., there exist a suitable p' > 0 such that for any I, \I - I*\ < p', hoo(1,0; p.) — E, there exists a unique I, \I — I*\ < p such that h0(1,0; 0) = E and / = 1^(0; p.).
(7.45)
To show this, recall Proposition 2.5 and consider the local diffeomorphism G(I, a) = (crdiho(1,0; 0), h0(1,0; 0)) .
(7.46)
1
Given / , let I be the component in the actions of G" (djhoo{I,Q; p),E). From (7.43) it follows that .4(7) = A(i^,(0;p)), proving the existence in (7.45). The uniqueness follows from the fact that Z£(0;/*) = 2 ^ ( 0 ; p ) and h0(1,0;0) = E = h0(y,0;0) imply G(1,1 + o£,(0;/i)) = G(y, 1 + c&(0;/x)). Then, (3.12) readily follows from (7.45). The other characterizations of the set of validity of the Theorem and the related estimates on the measure can be proved with similar arguments. •
*****
Appendix A Some technical Lemmas

A.1 Some linear algebra
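Lemma A.1 Consider $A \in \mathrm{Mat}(n \times n)$. Let $a$, $b$, $c$, $d$ be $n$-dimensional column vectors and $\alpha, \beta, \gamma \in \mathbb{R}$, with $\beta \ne 0 \ne \gamma$. Then
$$\det\begin{pmatrix} A & a & b \\ c^T & \alpha & \gamma \\ d^T & \beta & 0 \end{pmatrix} = (-\gamma\beta)^{-n+1} \det\big((\gamma a - \alpha b)d^T - \beta(\gamma A - bc^T)\big). \qquad (A.1)$$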
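The identity (A.1) can be tested on random integer data in exact arithmetic. This is a sanity check written for illustration, not part of the proof; the helper names and the range of the random entries are arbitrary choices.

```python
from fractions import Fraction
import random

def det(m):
    """Determinant by Gaussian elimination in exact rational arithmetic."""
    m = [[Fraction(v) for v in row] for row in m]
    n, sign, result = len(m), 1, Fraction(1)
    for col in range(n):
        pivot = next((r for r in range(col, n) if m[r][col] != 0), None)
        if pivot is None:
            return Fraction(0)
        if pivot != col:
            m[col], m[pivot] = m[pivot], m[col]
            sign = -sign
        result *= m[col][col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            m[r] = [x - factor * y for x, y in zip(m[r], m[col])]
    return sign * result

def check_identity(n, rng):
    """Compare both sides of (A.1) for random integer A, a, b, c, d."""
    A = [[rng.randint(-5, 5) for _ in range(n)] for _ in range(n)]
    a, b, c, d = ([rng.randint(-5, 5) for _ in range(n)] for _ in range(4))
    alpha = rng.randint(-5, 5)
    beta = rng.choice([-3, -2, -1, 1, 2, 3])    # beta != 0
    gamma = rng.choice([-3, -2, -1, 1, 2, 3])   # gamma != 0
    # Left-hand side: the bordered (n+2) x (n+2) matrix.
    big = [A[i] + [a[i], b[i]] for i in range(n)]
    big.append(c + [alpha, gamma])
    big.append(d + [beta, 0])
    lhs = det(big)
    # Right-hand side of (A.1).
    inner = [[(gamma * a[i] - alpha * b[i]) * d[j]
              - beta * (gamma * A[i][j] - b[i] * c[j])
              for j in range(n)] for i in range(n)]
    rhs = Fraction(-gamma * beta) ** (1 - n) * det(inner)
    return lhs == rhs

rng = random.Random(0)
results = [check_identity(n, rng) for n in (1, 2, 3, 4) for _ in range(20)]
print("identity holds in all trials:", all(results))
```

Exact rational arithmetic is used so that the comparison is an equality test rather than a floating-point tolerance.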
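Proof. Multiplying the first $n$ rows by $\gamma$, and then subtracting from them $b$ times the $(n+1)$-th row, we get
$$\det\begin{pmatrix} A & a & b \\ c^T & \alpha & \gamma \\ d^T & \beta & 0 \end{pmatrix} = \gamma^{-n} \det\begin{pmatrix} \gamma A & \gamma a & \gamma b \\ c^T & \alpha & \gamma \\ d^T & \beta & 0 \end{pmatrix} = \gamma^{-n} \det\begin{pmatrix} \gamma A - bc^T & \gamma a - \alpha b & 0 \\ c^T & \alpha & \gamma \\ d^T & \beta & 0 \end{pmatrix}$$
$$= (\gamma\beta)^{-n} \det\begin{pmatrix} \beta(\gamma A - bc^T) & \gamma a - \alpha b & 0 \\ \beta c^T & \alpha & \gamma \\ \beta d^T & \beta & 0 \end{pmatrix} = (\gamma\beta)^{-n} \det\begin{pmatrix} \beta(\gamma A - bc^T) - (\gamma a - \alpha b)d^T & \gamma a - \alpha b & 0 \\ \beta c^T - \alpha d^T & \alpha & \gamma \\ 0 & \beta & 0 \end{pmatrix}$$
$$= (\gamma\beta)^{-n} \cdot (-1)^{2n+3}\,\gamma \det\begin{pmatrix} \beta(\gamma A - bc^T) - (\gamma a - \alpha b)d^T & \gamma a - \alpha b \\ 0 & \beta \end{pmatrix}$$
$$= (\gamma\beta)^{-n} \cdot (-1)^{2n+3}\,\gamma\,(-1)^{2n+2}\,\beta \det\big(\beta(\gamma A - bc^T) - (\gamma a - \alpha b)d^T\big),$$
proving (A.1). □

A.2 Perturbations of nonsingular matrices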
Lemma A.2 Let $M$, $N$ be square matrices of the same order. If $M$ is nonsingular and $|N| < 1/|M^{-1}|$, then $(M+N)$ is nonsingular too. Moreover
$$|(M+N)^{-1}| \le \frac{|M^{-1}|}{1 - |M^{-1}|\,|N|}.$$
In particular, if $M$ is nonsingular and $|N| \le 1/(2|M^{-1}|)$, then $(M+N)$ is nonsingular too and $|(M+N)^{-1}| \le 2|M^{-1}|$.
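The bound of Lemma A.2 can be checked on a small numerical example. The sketch assumes that the norm is the maximum absolute row sum, which is submultiplicative (the text does not fix a norm), and uses hypothetical 2 x 2 matrices.

```python
def inverse(m):
    """Inverse of a 2 x 2 matrix given as nested lists."""
    (a, b), (c, d) = m
    det = a * d - b * c
    return [[d / det, -b / det], [-c / det, a / det]]

def norm(m):
    """Maximum absolute row sum (an operator norm, assumed for this sketch)."""
    return max(sum(abs(v) for v in row) for row in m)

def add(m, n):
    return [[x + y for x, y in zip(r, s)] for r, s in zip(m, n)]

# Hypothetical data.
M = [[2.0, 1.0], [0.5, 3.0]]
N = [[0.3, -0.2], [0.1, 0.25]]

m_inv = norm(inverse(M))
assert norm(N) < 1.0 / m_inv              # hypothesis of the lemma

lhs = norm(inverse(add(M, N)))
rhs = m_inv / (1.0 - m_inv * norm(N))
print("|(M+N)^-1| =", lhs, " bound =", rhs)
print("second form applies:", norm(N) <= 1.0 / (2.0 * m_inv), " 2|M^-1| =", 2.0 * m_inv)
```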
Proof. We have that $(M+N)^{-1} = \sum_{k \ge 0} (-1)^k (M^{-1}N)^k M^{-1}$. □

A.3 Estimates on Lie transforms
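Lemma A.3 Let $V$ be a domain of $\mathbb{C}^d \times \mathbb{C}^d$. Fixed $r = (r_1,\dots,r_{2d}) \in \mathbb{R}^{2d}$, with $r_i > 0$, we call
$$V_r = \{z = (z_1,\dots,z_{2d}) \in \mathbb{C}^{2d} \text{ s.t. } \exists w = (w_1,\dots,w_{2d}) \in V \text{ s.t. } |z_i - w_i| < r_i\}.$$
Assume that $\chi(x,y)$ is real analytic on $V_r$. Fixed $r' \in \mathbb{R}^{2d}$, $0 < r'_i < r_i$, then $\Phi^t_\chi(V_{r'}) \subset V_r$ provided that $|t| < t_0$ with
$$t_0 \max\left\{ \frac{\sup_{V_r} |\partial_{y_i}\chi|}{r_i - r'_i},\ \frac{\sup_{V_r} |\partial_{x_i}\chi|}{r_{i+d} - r'_{i+d}},\ i = 1,\dots,d \right\} \le 1. \qquad (A.2)$$
Proof. If $(x,y) \in V_{r'}$, then $\exists (\bar x,\bar y) \in V$ such that $|x_i - \bar x_i| < r'_i$ and $|y_i - \bar y_i| < r'_{i+d}$, $1 \le i \le d$. Set $(x(t),y(t)) = \Phi^t_\chi(x,y)$. If the thesis were false, there would exist $\bar t$, $|\bar t| < t_0$, which is the time of "first exit" from $V_r$. Explicitly, $(x(t),y(t)) \in V_r$ for all $|t| < |\bar t|$ and $(x(\bar t),y(\bar t)) \in \partial V_r$. But
$$|x_i(\bar t) - x_i| = |x_i(\bar t) - x_i(0)| \le \sup |\dot x_i(t)|\,|\bar t| < \sup_{V_r} |\partial_{y_i}\chi|\,t_0 \le r_i - r'_i, \qquad (A.3)$$
which$^7$ shows $|x_i(\bar t) - \bar x_i| \le |x_i(\bar t) - x_i| + |x_i - \bar x_i| < r_i - r'_i + r'_i = r_i$. In the same way one sees $|y_i(\bar t) - \bar y_i| < r_{i+d}$. So $(x(\bar t),y(\bar t)) \in \mathrm{Int}\,V_r$, in contrast with $(x(\bar t),y(\bar t)) \in \partial V_r$. □

We denote by $L_\chi H = \{H,\chi\}$ the Poisson operator and by $L^j_\chi$ the operator $L_\chi$ applied $j$ times.

Lemma A.4 For all $m \in \mathbb{N}$
$$\frac{d^m}{dt^m}\big(H\circ\Phi^t_\chi(x,y)\big) = (L^m_\chi H)\circ\Phi^t_\chi(x,y).$$
Proof. Induction over $m$. □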
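Lemma A.5 Assume the hypotheses over $r$, $r'$, $\chi$, $t_0$ in Lemma A.3. Fix $k \in \mathbb{N}$, $k \ge 1$. Then
$$H\circ\Phi^t_\chi(x,y) = \sum_{j=0}^{k-1} \frac{t^j}{j!} L^j_\chi H + t^k R_k(x,y;t),$$
with
$$R_k(x,y;t) = \int_0^1 \frac{(1-s)^{k-1}}{(k-1)!} \left.\frac{d^k}{d\tau^k}\big(H\circ\Phi^\tau_\chi(x,y)\big)\right|_{\tau = ts} ds \qquad (A.4)$$
and
$$\sup_{V_{r'}} |R_k| \le \frac{\sup_{V_r} |H|}{(t_0 - |t|)^k}, \qquad \forall |t| < t_0. \qquad (A.5)$$

7 In (A.3) we assumed $\sup_{V_r} |\partial_{y_i}\chi| \ne 0$; if not, obviously $|x_i(\bar t) - x_i| = 0 < r_i - r'_i$ and the argument goes on in the same way.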
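Proof. Set $F(t) = H(\Phi^t_\chi(x,y))$. By the Taylor expansion one has
$$F(t) = \sum_{j=0}^{k-1} \frac{t^j}{j!} F^{(j)}(0) + t^k \int_0^1 \frac{(1-s)^{k-1}}{(k-1)!} F^{(k)}(ts)\,ds,$$
so that (A.4) follows from Lemma A.4. Then, by the Cauchy Estimate and Lemma A.3,
$$\sup_{V_{r'}} |R_k| \le \frac{1}{k!} \sup_{(x,y) \in V_{r'},\ |\tau| \le |t|} \left|\frac{d^k}{d\tau^k} F(\tau)\right| \le \frac{\sup_{(x,y) \in V_{r'},\ |\tau| < t_0} |H\circ\Phi^\tau_\chi(x,y)|}{(t_0 - |t|)^k} \le \frac{\sup_{V_r} |H|}{(t_0 - |t|)^k},$$
proving (A.5). □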
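The expansion of Lemma A.5 can be tried on a case where the flow is known in closed form. The sketch works with polynomials in one pair (x, y) and assumes the bracket convention {F, G} = F_x G_y - F_y G_x, which is not fixed by the text; with this convention the generator chi = x y has the flow (x e^t, y e^{-t}). The Hamiltonian H and the evaluation point are hypothetical.

```python
from math import exp, factorial

# Polynomials in (x, y) are stored as {(i, j): coeff}, meaning coeff * x**i * y**j.

def add(f, g, scale=1.0):
    """Return f + scale * g, dropping zero coefficients."""
    out = dict(f)
    for key, c in g.items():
        out[key] = out.get(key, 0.0) + scale * c
    return {k: c for k, c in out.items() if c != 0.0}

def mul(f, g):
    out = {}
    for (i, j), c in f.items():
        for (k, m), e in g.items():
            out[(i + k, j + m)] = out.get((i + k, j + m), 0.0) + c * e
    return out

def dx(f):
    return {(i - 1, j): i * c for (i, j), c in f.items() if i > 0}

def dy(f):
    return {(i, j - 1): j * c for (i, j), c in f.items() if j > 0}

def lie(h, chi):
    """L_chi H = {H, chi} = H_x chi_y - H_y chi_x (assumed convention)."""
    return add(mul(dx(h), dy(chi)), mul(dy(h), dx(chi)), scale=-1.0)

def evaluate(f, x, y):
    return sum(c * x**i * y**j for (i, j), c in f.items())

def lie_series(h, chi, t, k):
    """Truncated series: sum over j < k of t^j / j! * L_chi^j H."""
    total, term = {}, dict(h)
    for j in range(k):
        total = add(total, term, scale=t**j / factorial(j))
        term = lie(term, chi)
    return total

chi = {(1, 1): 1.0}               # chi = x*y, flow (x e^t, y e^-t)
H = {(2, 1): 1.0, (0, 1): 1.0}    # H = x^2 y + y
x0, y0, t = 0.7, -0.4, 0.1

exact = evaluate(H, x0 * exp(t), y0 * exp(-t))
errors = [abs(evaluate(lie_series(H, chi, t, k), x0, y0) - exact)
          for k in (1, 2, 3, 4)]
print(errors)
```

Each additional term of the series reduces the error by roughly a factor of t, in line with the remainder being multiplied by t^k.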
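Lemma A.6 Assume the hypotheses over $r$, $r'$ and $\chi$ in Lemma A.3. Assume also that $\chi$ is real analytic on $V_R$ with $R_i > r_i$ and that $H$ is real analytic in $V_r$. Then,
$$\sup_{V_{r'}} |L^j_\chi H| \le j!\,\sup_{V_r} |H|\,\Big(\sup_{V_R} |\chi|\Big)^j \left( \max_i \left\{ \frac{1}{(r_i - r'_i)(R_{i+d} - r_{i+d})},\ \frac{1}{(r_{i+d} - r'_{i+d})(R_i - r_i)} \right\} \right)^j. \qquad (A.6)$$
Proof. If we set
$$\frac{1}{t_0} = \max_i \left\{ \frac{\sup_{V_r} |\partial_{y_i}\chi|}{r_i - r'_i},\ \frac{\sup_{V_r} |\partial_{x_i}\chi|}{r_{i+d} - r'_{i+d}} \right\},$$
we have that $t_0$ verifies (A.2). By Lemmas A.4 and A.3, one gets by the Cauchy Estimate
$$\sup_{V_{r'}} |L^j_\chi H| = \sup_{V_{r'}} \left| \frac{d^j}{dt^j}\Big|_{t=0} H\circ\Phi^t_\chi \right| \le \frac{j!}{t_0^j} \sup_{V_{r'},\ |t| < t_0} |H\circ\Phi^t_\chi| \le \frac{j!}{t_0^j} \sup_{V_r} |H|$$
$$= j!\,\sup_{V_r} |H| \left( \max_i \left\{ \frac{\sup_{V_r} |\partial_{y_i}\chi|}{r_i - r'_i},\ \frac{\sup_{V_r} |\partial_{x_i}\chi|}{r_{i+d} - r'_{i+d}} \right\} \right)^j$$
$$\le j!\,\sup_{V_r} |H|\,\Big(\sup_{V_R} |\chi|\Big)^j \left( \max_i \left\{ \frac{1}{(r_i - r'_i)(R_{i+d} - r_{i+d})},\ \frac{1}{(r_{i+d} - r'_{i+d})(R_i - r_i)} \right\} \right)^j,$$
proving (A.6). □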
References [1] Arnol'd, V.I., Small denominators and problems of stability of motion in classical and celestial mechanics, in Russian Mathematical Survey 18(6), 1963, p.91. [2] Arnol'd, V.I., Instability of Dynamical Systems with several degrees of freedom, in Soviet Mathematics - Doklady 5(3), 1964, p.581.
_ ^ })' , D
48 [3] Arnol'd, V.I. ed., Dynamical Systems III - Encyclopaedia ences, Vol. 3, New York, Springer-Verlag, 1985. [4] Arnol'd, V.I., Mathematical Verlag, 1989.
Sci-
methods of Classical Mechanics New York, S p r i n g e r -
[5] Arnol'd, V.I. - Avez, A., Ergodic problems Benjamin, 1968. [6] Bourgain, J., On Melnikov's p.445.
of Mathematical
persistency
of classical mechanics,
New York,
problem, in Math. Res. Lett., 4(4), 1997,
[7] Broer, H.W. - Huitema, G.B. - Sevryuk, M.B., Quasi-Periodic Motions ilies of Dynamical Systems, New York, Springer-Verlag, 1996.
in
Fam-
[8] Cheng, C. - Wang, S., The surviving of lower-dimensional tori from a resonant torus of Hamiltonian systems, in Journal of Diff. Equations, 155(2), 1999, p.311. [9] Chierchia, L., A direct method for constructing Equation, in Meccanica, 25(4), 1990, p.246.
solutions
of
Hamilton-Jacobi
[10] Chierchia, L. - Celletti, A., Construction of analytic KAM surfaces stability bounds, in Comm. Math. Phys., 118(1), 1988, p.119. [11] Chierchia, L. - Gallavotti, G., Drift and diffusion H. Poincare Phys. Theor., 60(1), 1994, p . l .
and
effective
in phase space, in A n n . Inst.
[12] Chierchia, L. - Valdinoci, E., A note on the construction of Hamiltonian tories along heteroclinic chains, in Forum Math., 12(2), 2000, p.247.
trajec-
[13] DeLatte, D., On normal forms in Hamiltonian dynamics, a new approach to some convergence questions, in Ergodic Theory Dynam. Systems, 15(1), 1995, p.49. [14] Delshams, A. - Gelfreich, V. - Jorba, A. - Seara, T.M., Exponentially small splitting of separatrices under fast quasiperiodic forcing, in Comm. M a t h . P h y s . , 189(1), 1997, p.35. [15] Delshams, A. - Gutierrez, P., Estimates on invariant tori near an elliptic equilibrium point of a Hamiltonian system, in Journal of Diff. Equations, 131(2), 1996, p.277. [16] Delshams, A. - de la Llave, R. - Seara, T.M., A geometric approach to the existence of orbits with unbounded energy in generic periodic perturbations by a potential of generic geodesic flows oft2, Comm. Math. Phys., 209(2), 2000, p.353. [17] Eliasson, L.H., Perturbations of stable invariant tori for Hamiltonian Ann. Scuola Norm. Sup. Pisa CI. Sci. IV, 15(1), 1989, p.115. [18] Eliasson, L.H., Biasymptotic solutions of perturbed integrable tems, in Bol. Soc. Brasil. Mat. (N.S.), 25(1), 1994, p.56.
systems,
Hamiltonian
[19] Fontich E. - Martin P., Arnold diffusion in perturbations of analytic Hamiltonian systems, [email protected], #98-319.
in sys-
integrable
[20] Gallavotti, G., Hamilton-Jacobins equation and Arnold's diffusion near invariant tori in a priori unstable isochronous systems, [email protected], # 9 7 - 5 5 5 . [21] Gentile, G., A proof of existence of whiskered tori with quasi-flat homoclinic intersections in a class of almost integrable Hamiltonian systems, Forum M a t h . 7(6), 1995, p.709. [22] Graff, S.M., On the conservation of Hyperbolic Invariant Systems, in Journal of Diff. Equations, 15, 1974, p . l .
Tori for
Hamiltonian
[23] Huang, D. - Zengrong, L., On the persistence of lower-dimensional invariant hyperbolic tori for smooth Hamiltonian systems, to appear in Nonlinearity. [24] Jorba, A. - de la Llave, R. - Zou, M., Lindstedt series for lower dimensional tori, in Hamiltonian systems with three or more degrees of freedom, N A T O A d v . Sci. Inst. Ser. C Math. Phys. Sci. 533, Dordrecht, Kluwer Acad. Publ., 1999, p.151. [25] de la Llave, R. - Wayne, C.E., Whiskered and low dimensional integrable Hamiltonian systems, preprint, 1996. [26] Malgrange, B., Ideals of differentiable functions, 1967.
tori v< nearly-
London, Oxford University Press,
[27] Moser, J., The Analytic Invariants of an Area-Preserving Mapping near a Hyperbolic Fixed Point, in Comm. on Pure and Applied Mathematics, 9, 1956, p.673. [28] Moser, J., Stable and random motions in dynamical systems, matics Studies, 77, Princeton University Press, 1973.
Annals of M a t h e -
49 [29] Neishtadt, A.I., Estimates in the Kolmogorov Theorem on Conservation of Conditionally Periodic Motions, in Journal of Applied Mathematics and Mechanics, 45(6), 1982, p.766. [30] Niederman, L., Dynamic around a chain of simple resonant tori in nearly grable Hamiltonian systems, preprint, [email protected], #97-142. [31] Poincare, H., Les methodes Paris, 1899.
nouvelles
de la mecanique
inte-
celeste, Gauthier Villars,
[32] Poschel, J., Integrabiiity of Hamiltonian Systems on Cantor Sets, in Comm. on P u r e and Applied M a t h e m a t i c s , 35(5), 1982, p.653. [33] Poschel, J., On elliptic lower-dimensional Zeitschrift, 202(4), p.559. [34] Poschel, J., Nekhoroshev estimates for M a t h . Zeitschrift, 213(2), 1993, p.187.
tori in Hamiltonian quasi-convex
Hamiltonian
[35] Rudnev, M. - Wiggins, S., KAM Theory Near Multiplicity faces, in Journal of Nonlinear Science, 7, 1997, p . 177. [36] Sevryuk, M.B., Lower-dimensional p. 160.
tori in reversible
[37] Sevryuk, M.B., Invariant sets of degenerate [email protected], # 9 8 - 6 8 4 .
in Math.
systems,
One Resonant
in Sur-
systems, in Chaos, 1(2), 1991,
Hamiltonian
systems near
[38] Treshchev, D.V., The mechanism of destruction of resonance Systems, in Math. U S S R Sbornik, 68(1), 1991, p.181. [39] Valdinoci, E., Tori di transizione nella teoria KAM, di Roma 3, [email protected], # 9 8 - 1 5 2 . [40] Whitney, H., Analytical extension of differentiate in TYans. A.M.S., 36, 1934, p . 6 3 .
systems,
tori of
equilibria, Hamiltonian
tesi di laurea at Universita
functions
defined on closed set,
[41] Xu, J., Persistence of elliptic lower-dimensional invariant tori for small perturbation of degenerate integrable Hamiltonian systems, in J. Math. Anal. Appl., 208(2), 1997, p.372. [42] You, J., A KAM theorem for hyperbolic-type degenerate lower-dimensional in Hamiltonian systems, in C o m m . M a t h . Phys., 192(1), 1998, p.145. [43] You, J., Perturbations of lower-dimensional nal of Diff. Equations, 152(1), 1999, p . l .
tori for Hamiltonian
tori
systems, in Jour-
[44] Zehnder, E., Generalized implicit function theorems with applications to some small divisor problems II, in C o m m . P u r e Appl. Math., 29(1), 1976, p.49.
MATHEMATICAL PHYSICS ELECTRONIC JOURNAL
ISSN 1086-6655 Volume 6, 2000 Paper 3 Received: Oct 28, 1999, Revised: Mar 21, 2000, Accepted: Mar 22, 2000 Editor: G. Benettin
Computer-Assisted Proofs for Fixed Point Problems in Sobolev Spaces
Alain Schenkel (1,*), Jan Wehr (2,‡), and Peter Wittwer (3,†)
(1) Department of Mathematics, Helsinki University, P.O. Box 4, 00014 Helsinki, Finland
(2) Interdisciplinary Center for Mathematical and Computer Modeling, Warsaw University, Pawinskiego 5a, Warszawa 02 106, Poland
(3) Département de Physique Théorique, Université de Genève, CH-1211 Genève 4, Switzerland
Abstract. In this paper we extend the technique of computer-assisted proofs to fixed point problems in Sobolev spaces. Up to now the method was limited to the case of spaces of analytic functions. The possibility of working with Sobolev spaces is an important step forward and opens up many new domains of application. Our discussion is centered around a concrete problem that arises in the theory of critical phenomena and describes the phase transition in a hierarchical system of random resistors. For this problem we have implemented in particular the convolution product based on the Fast Fourier Transform (FFT) algorithm with rigorous error estimates.
Key words: computer-assisted proofs, constructive analysis in Sobolev spaces, phase transitions in random media, discrete convolutions are convolutions of splines.
The research of these authors was supported in part by the Fonds National Suisse. The research of this author was supported in part by the NSF Probability and Mathematical Statistics grant 9706915. Permanent address: Department of Mathematics, University of Arizona, Tucson AZ 85721, USA.
Contents
1 Introduction
  1.1 The Model
  1.2 Main Result
  1.3 Computer-Assisted Proofs
2 Organization of the Proof
3 Constructive Analysis in $B_{\alpha\beta}$
  3.1 Operations Involving Real and Complex Numbers
  3.2 Standard Sets of $B_{\alpha\beta}$
4 Operations Involving Functions
  4.1 Elementary Operations
  4.2 The Scaling Operator
  4.3 The Operator $\mathcal{T}$
  4.4 The Convolution
  4.5 The Identity
5 The Maps $\mathcal{N}_\lambda$
  5.1 A Bound on $\mathcal{M}_\lambda$
  5.2 Existence of the Family of Fixed Points: First Estimate
  5.3 Existence of the Fixed Point $f_\lambda$
6 Contractivity Properties of $D\mathcal{M}_\lambda$
  6.1 Oscillatory Functions
  6.2 Functions with Support Near the Origin
  6.3 Functions with Support Near Infinity
  6.4 Piecewise Constant Functions
7 The Tangent Maps $D\mathcal{M}_{\lambda,\kappa}$
  7.1 Decomposition of the Operator Norm
  7.2 The Operator $M$
  7.3 Existence of the Family of Fixed Points: Second Estimate
8 The Program
Acknowledgments
Appendix
References
1. Introduction
In this paper we study certain aspects of a model that describes the conductivity in a disordered material. A disordered material is often modeled in statistical mechanics by what is known as a network of random resistors. One would like to be able to describe, for example, what happens in a $d$-dimensional cubic lattice where each link represents a resistor whose resistance is a random variable. One of the quantities of interest is the effective conductivity of such a network. This conductivity could be defined, for instance, by taking the limit as $L$ goes to infinity of the conductivity
1.1. The Model
The hierarchical network that we study in this paper is constructed recursively as indicated in Fig. 1.1.
Fig. 1.1: The hierarchical lattice at order 2.
Consider the mapping on graphs that consists of replacing every link by two pairs of links. If we start with a graph consisting of one link connecting two sites, then after applying this operation we end up with four sites and after $n$ applications we end up with a graph of $4^n$ links. For $n \ge 1$, the network of random resistors that we consider is obtained after $n$ iterations of the procedure outlined above, and consists of resistors with conductivities described by $4^n$ independent copies $\Sigma_0^{(i)}$ of a random variable $\Sigma_0$. The random variables $\Sigma_0^{(i)}$ are therefore i.i.d. The choice of a hierarchical geometry permits a simple formulation of the effective conductivity of the network, when measured between the two vertices of the initial link, in terms of a map. Indeed, using the composition laws for conductors connected in series and in parallel, the conductivity of a circuit that consists of four resistors with conductivities $\sigma_1, \sigma_2, \sigma_3, \sigma_4$ arranged in a loop is
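$$\sigma = D_c(\sigma_1,\sigma_2,\sigma_3,\sigma_4) = \frac{1}{\frac{1}{\sigma_1}+\frac{1}{\sigma_2}} + \frac{1}{\frac{1}{\sigma_3}+\frac{1}{\sigma_4}}. \tag{1.1}$$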
If the conductivities $\sigma_1, \dots$ are obtained for the network of order $n-1$ are also i.i.d. They are independent copies of the random variable given by $\Sigma_1 = D_c(\Sigma_0^{(1)}, \Sigma_0^{(2)}, \Sigma_0^{(3)}, \Sigma_0^{(4)})$.
By applying successively the nonlinear average $D_c$, one can therefore go up the hierarchy to compute the effective conductivity $\Sigma_n$ of the network of order $n$. We note that by applying one more time the average $D_c$ to four independent copies of $\Sigma_n$, we obtain the conductivity $\Sigma_{n+1}$ of the hierarchical network of order $n+1$ made up from resistors with conductivities given by $4^{n+1}$ independent copies of the random variable $\Sigma_0$. We are interested in the limit as $n \to \infty$ of the sequence $\Sigma_n$ that we have just defined. This limit corresponds to the effective conductivity of our hierarchical network in the infinite volume limit. It is not difficult to see that our model is an approximation to the renormalization on a simple square lattice in $d = 2$ dimensions. In [SW] a detailed discussion of this approximation can be found. See also [BO]. The questions that arise naturally are the following. First of all, a phenomenon of self-averaging should lead to a deterministic effective conductivity for the infinite network. Next, it is interesting to know in what way this conductivity depends on the conductivity of each link of the network, that is on the distribution of the initial random variable $\Sigma_0$. Finally, once the convergence of the effective conductivity is established, one can study the fluctuations of $\Sigma_n$ around this limit. A certain amount of information can easily be obtained by studying the parameter
$$p_n = P(\Sigma_n > 0). \tag{1.2}$$
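The nonlinear average (1.1) is easy to experiment with numerically. The following is a minimal sketch, assuming the convention that a broken link has conductivity zero; the function names `d_c` and `sample_sigma` and the two-valued choice for the initial variable are illustrative assumptions, not part of the paper.

```python
import random


def d_c(s1, s2, s3, s4):
    """Effective conductivity of the four-resistor loop, Eq. (1.1)."""

    def series(a, b):
        # Two conductors in series; a broken link (conductivity 0) blocks the branch.
        if a == 0.0 or b == 0.0:
            return 0.0
        return 1.0 / (1.0 / a + 1.0 / b)

    # The two series branches are in parallel, so their conductivities add.
    return series(s1, s2) + series(s3, s4)


def sample_sigma(n, p, rng):
    """Draw one sample of the order-n conductivity.

    Illustrative assumption: the initial variable equals 1 with
    probability p and 0 (broken link) otherwise.
    """
    if n == 0:
        return 1.0 if rng.random() < p else 0.0
    return d_c(*(sample_sigma(n - 1, p, rng) for _ in range(4)))
```

Averaging the indicator of `sample_sigma(n, p, rng) > 0` over many draws gives a Monte Carlo estimate of the parameter in (1.2).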
Recall that we permit the links to be perfect insulators with a nonzero probability. Here, $p_0 = p$ is the probability that a resistor of the original (infinite) network is not broken. From (1.1) it is not difficult to see that the conductivity of a diamond circuit is nonzero with probability
$$g(p) = 1-(1-p^2)^2, \tag{1.3}$$
where $p$ is the probability that each of the four resistors has a nonzero conductivity. The function $g$ is characteristic for percolation problems [G]. The function $g$ is increasing on the interval $[0,1]$ and has, in addition to the fixed points at zero and one, a unique unstable fixed point $p_c$ in the interval $(0,1)$. The value of $p_c$ is
$$p_c = \frac{\sqrt{5}-1}{2} = 0.618\ldots \tag{1.4}$$
There are therefore three cases. If $p < p_c$, it follows immediately that $p_n \to 0$ as $n \to \infty$. Hence, the effective conductivity of the network is zero with probability one in this case. For $p > p_c$, one has $\lim_{n\to\infty} p_n = 1$. This means that with probability one there is a path made from resistors with nonzero conductivity that connects the two sites of the lattice for $n = 0$. In other words, the percolation threshold of the network is given by $p_c$.
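The trichotomy can be checked by iterating the map $g$ directly. The sketch below is only an illustration: it uses the fact that $g(p) = p$ reduces on $(0,1)$ to $p^2 + p - 1 = 0$, and the names `g`, `P_C` and `iterate` are chosen here for convenience.

```python
import math


def g(p):
    """Probability that a diamond of four links conducts, Eq. (1.3)."""
    return 1.0 - (1.0 - p * p) ** 2


# Nontrivial fixed point: g(p) = p reduces to p**2 + p - 1 = 0 on (0, 1).
P_C = (math.sqrt(5.0) - 1.0) / 2.0


def iterate(p0, n):
    """Return p_n, obtained by applying g n times to p_0."""
    p = p0
    for _ in range(n):
        p = g(p)
    return p
```

Starting slightly below `P_C` the iterates fall to zero, and starting slightly above they rise to one, which is the instability of the critical fixed point described in the text.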
We note that this does not imply that the effective conductivity is nonzero. However, it has been proved in [W1] that for $p > p_c$ the sequence $\Sigma_n$ converges with probability one to a constant $\sigma^*(p)$, and in [Sh] that this constant is strictly positive. Therefore, the percolation threshold corresponds exactly to the phase transition of the effective conductivity. At the critical point $p = p_c$, one has $p_n = p_c$ for all $n$. This means that the probability $P(\Sigma_n = 0)$ is invariant. In the following we are going to be interested in the part of the distribution of $\Sigma_n$ supported on $(0, \infty)$. An argument based on the study of the expectation of $\Sigma_n$ shows, however, that the effective conductivity of the network in the infinite volume limit is still zero in this case. Indeed, if we denote by $E(X)$ the expectation of a random variable $X$, one has
$$E(\Sigma_{n+1}) = E\big(D_c(\Sigma_n^{(1)},\dots,\Sigma_n^{(4)})\big) = E\left(\frac{2}{\frac{1}{\Sigma_n^{(1)}}+\frac{1}{\Sigma_n^{(2)}}}\right),$$
where we have used independence of the random variables. Since, in addition, we have the following inequality between the harmonic mean and the arithmetic mean,
$$\frac{2}{\frac{1}{x}+\frac{1}{y}} \le \frac{x+y}{2},$$
and since the left hand side of this expression is equal to zero if $x$ or $y$ is zero, we can bound $E(\Sigma_{n+1})$ by the expectation of the average of $\Sigma_n^{(1)}$ and $\Sigma_n^{(2)}$ over the set where the random variables are strictly positive. Therefore, one obtains $E(\Sigma_{n+1}) \le p_c\,E(\Sigma_n)$, so that $E(\Sigma_n) \to 0$ as $n \to \infty$, and the sequence $\Sigma_n$ converges to zero.
The goal of this paper is to describe how the $\Sigma_n$ converge to zero at the critical point, that is to describe the fluctuations of the $\Sigma_n$ around their limiting value. A numerical study [SW] of the probability densities of the random variables $\Sigma_n$ indicates that if one normalizes $\Sigma_n$ with an appropriate factor $\mu_n$ that fixes the expectation of the random variable $\mu_n\Sigma_n$ at the expectation of $\Sigma_0$, the sequence $\mu_n\Sigma_n$ converges in distribution to a multiple of a (universal) random variable $\Sigma_*$, and
$$\lim_{n\to\infty}\frac{\mu_{n+1}}{\mu_n} = \lambda_* \approx 1.756,$$
independently of the choice of $\Sigma_0$. This means that the fluctuations of the effective conductivity of the network present a certain universality in the limit $n \to \infty$: the limiting probability densities for different initial random variables $\Sigma_0$ distinguish themselves only by a change of scale. Furthermore, the probability density for the positive values of $\Sigma_*$ decays faster than exponentially at zero and infinity. Therefore, at the critical point, the behavior of the fluctuations distinguishes itself from the supercritical case for which a perturbative computation [SS] indicates that the sequence of properly normalized random variables $\Sigma_n$ converges to a normal distribution (a proof of this fact will appear in [WW]). Also, since $\lambda_* < 2$, conductance fluctuations can be thought of as anomalously large compared to the supercritical case. In this work we address the question of the existence of a positive real number $\lambda_*$ and a random variable $\Sigma_*$ such that
$$\Sigma_* = \lambda_* D_c(\Sigma_*^{(1)},\dots,\Sigma_*^{(4)}). \tag{1.5}$$
We note that the map $D_c$ is homogeneous of degree one and the random variables $A\Sigma_*$ are therefore solutions of (1.5) for all $A > 0$. The number $\lambda_*$ gives the dynamics of the renormalization group $D_c$ at the critical point on the whole of the set $A\Sigma_*$ and is related to the critical exponent $t$ that describes the phase transition of the effective conductivity,
$$\sigma^*(p) \sim (p-p_c)^t, \qquad p > p_c,$$
through the formula, cf. [SW],
$$t = \frac{\log \lambda_*}{\log g'(p_c)}.$$
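As a quick numerical illustration of this formula, the exponent can be evaluated from the rounded value $\lambda_* \approx 1.756$ quoted above. The sketch assumes $g(p) = 1-(1-p^2)^2$, so that $g'(p) = 4p(1-p^2)$, and uses the root of $p^2 + p - 1 = 0$ for $p_c$; the resulting value is only as accurate as the rounded $\lambda_*$.

```python
import math

P_C = (math.sqrt(5.0) - 1.0) / 2.0   # unstable fixed point of g on (0, 1)
LAMBDA_STAR = 1.756                  # rounded numerical value quoted in the text


def g_prime(p):
    """Derivative of g(p) = 1 - (1 - p**2)**2."""
    return 4.0 * p * (1.0 - p * p)


# Critical exponent t = log(lambda_*) / log(g'(p_c)).
t = math.log(LAMBDA_STAR) / math.log(g_prime(P_C))
```

With these inputs `t` comes out at roughly 1.33.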
1.2. Main Result
In order to study the fluctuations of the effective conductivity of our hierarchical network at the critical point, we work in the framework of functional analysis. The part of the distribution of the random variables $\Sigma_n$ that interests us is the one that is supported on $(0, \infty)$. One assumes that this part of the distribution of $\Sigma_0$ is absolutely continuous with respect to Lebesgue measure and one derives the functional equation for the probability densities that correspond to the nonlinear average (1.1). It turns out to be simpler to work with the resistivities instead of the conductivities. One considers therefore the random variables $T$ for the resistivity given by $T = 1/\Sigma$.
If $\sigma$ is the density of the random variable $\Sigma$, then the density $\rho$ of $T$ is given by
$$\rho(x) = (\mathcal{T}\sigma)(x) = \frac{1}{x^2}\,\sigma\!\left(\frac{1}{x}\right). \tag{1.6}$$
The average $D_c$ for the conductivities can be rewritten for the resistivities as
$$D_r(r_1,\dots,r_4) = \frac{1}{\frac{1}{r_1+r_2}+\frac{1}{r_3+r_4}}. \tag{1.7}$$
The functional equation for the density $\mathcal{D}_r(\rho)$ of the average (1.7) of four independent copies of a random variable $T$ with density $\rho$ is therefore given in terms of the map $\mathcal{T}$ and the convolution operator. Indeed, the probability density $\rho$ of a sum of two random variables with densities $\rho_1$ and $\rho_2$ is given by the convolution of $\rho_1$ and $\rho_2$, that is by
$$\rho(x) = (\rho_1 * \rho_2)(x) = \int_0^x \rho_1(y)\,\rho_2(x-y)\,dy. \tag{1.8}$$
Therefore, it follows from (1.7) that
$$\mathcal{D}_r(\rho) = \mathcal{T}\big(\mathcal{T}(\rho*\rho) * \mathcal{T}(\rho*\rho)\big). \tag{1.9}$$
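The convolution (1.8) on the half line can be approximated by elementary quadrature. The following sketch uses a plain trapezoidal rule and is not the FFT-based routine with rigorous error bounds that the paper implements; the function names and the exponential test density are assumptions made for illustration.

```python
import math


def convolve(f, g, x, n=2000):
    """Trapezoidal approximation of (f*g)(x) = int_0^x f(y) g(x - y) dy, Eq. (1.8)."""
    if x <= 0.0:
        return 0.0
    h = x / n
    # Endpoint terms carry weight 1/2 in the trapezoidal rule.
    total = 0.5 * (f(0.0) * g(x) + f(x) * g(0.0))
    for k in range(1, n):
        y = k * h
        total += f(y) * g(x - y)
    return h * total


def exp_density(x):
    """Density of an exponential random variable with mean 1 (illustrative choice)."""
    return math.exp(-x)
```

For two exponential densities the exact convolution is $x e^{-x}$, which gives a simple sanity check of the routine.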
One observes that, formally, the Dirac densities $\delta(x - a)$ are fixed points of this transformation for all $a > 0$. They correspond to the limiting densities in the non-critical cases. In order to obtain an equation for the probability densities with support on $(0, \infty)$ we determine the contributions of the four resistors to the finite value of $D_r(r_1, \dots, r_4)$. By inspecting (1.7), one observes that they are of two types: either $r_1 + r_2 = \infty$ and $r_3 + r_4 < \infty$ (with the corresponding symmetric case $r_1 + r_2 < \infty$ and $r_3 + r_4 = \infty$), or all of the four resistivities are finite. At the critical point $p_c$, one determines easily that the probability of the first case is given by
$$c_1 = 2p_c(1-p_c^2) = 0.763\ldots, \tag{1.10}$$
whereas the probability of the second case is
$$c_2 = p_c^3 = 0.236\ldots \tag{1.11}$$
Obviously $c_1 + c_2 = 1$. Therefore, the operator that acts on the probability densities and corresponds to the finite part of the map $D_r$ on the random variables is given at the critical point by
$$\mathcal{D}(\rho) = c_1(\rho*\rho) + c_2\,\mathcal{T}\big(\mathcal{T}(\rho*\rho) * \mathcal{T}(\rho*\rho)\big). \tag{1.12}$$
In order to rewrite the fixed point problem (1.5) in terms of the probability densities, one uses that for $\lambda > 0$ the probability density of a random variable $T/\lambda$ is given by $S_\lambda\rho$, where $\rho$ is the probability density of $T$ and where $S_\lambda$ is the operator that changes the scale,
$$S_\lambda f(x) = \lambda f(\lambda x), \qquad \lambda > 0. \tag{1.13}$$
Therefore the fixed point problem (1.5) can be rewritten as
$$\rho^* = S_{\lambda_*}\mathcal{D}(\rho^*). \tag{1.14}$$
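The two weights in (1.12) can be recovered by a short computation. The sketch below assumes that each link conducts independently with probability $p$, so that a series branch conducts with probability $p^2$, and conditions on the diamond having finite resistance; the function names are illustrative.

```python
import math


def g(p):
    """Probability that the diamond conducts, Eq. (1.3)."""
    return 1.0 - (1.0 - p * p) ** 2


def weights(p):
    """Conditional probabilities of the two cases, given a finite resistance."""
    one_branch = 2.0 * p ** 2 * (1.0 - p ** 2)  # exactly one series branch conducts
    both_branches = p ** 4                      # all four resistivities are finite
    return one_branch / g(p), both_branches / g(p)


P_C = (math.sqrt(5.0) - 1.0) / 2.0
C1, C2 = weights(P_C)
```

At $p = p_c$, where $g(p_c) = p_c$, the two quotients reduce to the closed forms (1.10) and (1.11).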
The proof of the existence of a real $\lambda_*$ and a function $\rho^*$ satisfying (1.14) which we present here is constructive. In particular, we will be able to provide explicit bounds on $\lambda_*$ as well as an approximation to the function $\rho^*$. The graph of the approximation that we have obtained this way is represented in Fig. 1.2.

Fig. 1.2: The fixed point $\rho^*$.
The operator (1.12) was studied by mathematically rigorous, computer-assisted, constructive analysis. Before stating with precision the result that is proved in this paper, we define the function spaces with which we work. We first define the notation that will be used later.
Notation. We denote by $\mathbb{R}_+$ the set of nonnegative real numbers. The set of positive real numbers will be denoted by $\mathbb{R}_+^*$. For an interval $I \subset \mathbb{R}$, we denote by $C^n(I)$ the set of functions that are $n$ times continuously differentiable on $I$. The derivative of a function $f$ of one variable will be denoted by $f'$. For a positive function $\mu$, we denote by $L^1(\mathbb{R}_+, \mu(x)\,dx)$ the space of functions defined on $\mathbb{R}_+$ and integrable with respect to the measure $\mu(x)\,dx$. Finally $W^{1,1}(\mathbb{R}_+, \mu(x)\,dx)$ is the Sobolev space of functions of $L^1(\mathbb{R}_+, \mu(x)\,dx)$ with one distributional derivative in $L^1(\mathbb{R}_+, \mu(x)\,dx)$. For $r > 0$ and $x$ in a metric space, $B_r(x)$ will denote the open ball of radius $r$ centered at $x$.
Definition 1.1. For $\alpha, \beta \ge 0$ and functions $w_{\alpha\beta}$ given by
$$w_{\alpha\beta}(x) = \exp\!\left(\frac{\alpha}{x} + \beta x\right), \tag{1.15}$$
we define $B_{\alpha\beta}$ to be the Banach space $L^1(\mathbb{R}_+, w_{\alpha\beta}(x)\,dx)$. We denote the norm of $f \in B_{\alpha\beta}$ by $\|f\|_{\alpha\beta}$, that is,
$$\|f\|_{\alpha\beta} = \int_0^\infty w_{\alpha\beta}(x)\,|f(x)|\,dx. \tag{1.16}$$
We furthermore define
$$B = \bigcup_{\alpha>0,\ \beta>0} B_{\alpha\beta}. \tag{1.17}$$
Remark 1.2. It is clear that we have the inclusion $B_{\sigma\tau} \subset B_{\alpha\beta}$ for $\sigma \ge \alpha$ and $\tau \ge \beta$. The inclusion is strict unless $\sigma = \alpha$ and $\tau = \beta$.
We will also need the following definitions of the mass and expectation.
Definition 1.3. We define the mass $M(f)$ of a function $f \in L^1(\mathbb{R}_+)$ by
$$M(f) = \int_0^\infty f(x)\,dx. \tag{1.18}$$
If $f \in L^1(\mathbb{R}_+, (1+|x|)\,dx)$, we define the expectation $E(f)$ by
$$E(f) = \int_0^\infty x f(x)\,dx. \tag{1.19}$$
Finally, we will need to exclude functions $f$ such that $M(f)E(f) = 0$, i.e., functions in
$$\mathcal{H} = \{f \in L^1(\mathbb{R}_+, (1+|x|)\,dx) \mid M(f)E(f) = 0\}. \tag{1.20}$$
We can now state the main result of this paper.
Theorem 1.4. There exists a real number $\lambda_*$ and a function $f^* \in B\setminus\mathcal{H}$ that satisfy the equation
$$f^* = S_{\lambda_*}\mathcal{D}(f^*).$$
In addition $f^*$ has the following two properties:
(1) $M(f^*) = 1$,
(2) $f^* \in C^\infty(\mathbb{R}_+)$.
Note that this theorem does not imply that the fixed point $f^*$ is a probability density, since $f^*$ is not necessarily a positive function. While we see strong numerical evidence for positivity of the fixed point $f^*$, we have no proof of this fact. Before ending this section, we summarize in the following lemma some of the properties satisfied by the maps that are contained in (1.14).
Lemma 1.5. The maps $S_\lambda$ form a multiplicative group, i.e., $S_1 = I$, and $S_{\lambda_1}S_{\lambda_2} = S_{\lambda_1\lambda_2}$. Moreover, for $f$ and $g$ integrable functions one has
$$S_\lambda(f*g) = S_\lambda f * S_\lambda g, \tag{1.21}$$
$$S_\lambda\mathcal{T} = \mathcal{T}S_{1/\lambda}. \tag{1.22}$$
If $f, g \in L^1(\mathbb{R}_+, (1+|x|)\,dx)$, then the mass and expectation satisfy the following identities:
$$M(f) = M(S_\lambda f) = M(\mathcal{T}f), \qquad M(f*g) = M(f)M(g),$$
$$E(S_\lambda f) = \frac{1}{\lambda}E(f), \qquad E(f*g) = M(f)E(g) + E(f)M(g). \tag{1.23}$$
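Some of the identities (1.23) can be checked numerically on a concrete function. The sketch below is an illustration only: it assumes the operator $\mathcal{T}$ acts as $(\mathcal{T}f)(x) = f(1/x)/x^2$ (the density of $1/X$ when $X$ has density $f$), uses a test function chosen for its fast decay at zero and infinity, and integrates with a trapezoidal rule after the substitution $x = e^u$.

```python
import math


def integrate(func, u_min=-12.0, u_max=12.0, n=6000):
    """Trapezoidal rule for int_0^inf func(x) dx after substituting x = exp(u)."""
    h = (u_max - u_min) / n
    total = 0.0
    for k in range(n + 1):
        x = math.exp(u_min + k * h)
        weight = 0.5 if k in (0, n) else 1.0
        total += weight * func(x) * x  # dx = x du
    return h * total


def f(x):
    """Test function exp(-1/x - 2x); an illustrative choice, not the fixed point."""
    return math.exp(-1.0 / x - 2.0 * x)


def scale(func, lam):
    """S_lambda f(x) = lambda f(lambda x), Eq. (1.13)."""
    return lambda x: lam * func(lam * x)


def invert(func):
    """(T f)(x) = f(1/x) / x**2 (assumed form of the operator T)."""
    return lambda x: func(1.0 / x) / (x * x)


def mass(func):
    """M(f), Eq. (1.18)."""
    return integrate(func)


def expectation(func):
    """E(f), Eq. (1.19)."""
    return integrate(lambda x: x * func(x))
```

Comparing `mass(invert(f))` with `mass(f)`, and `expectation(scale(f, lam))` with `expectation(f) / lam`, reproduces the first and third identities up to quadrature error.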
Even though the proof of Theorem 1.4 relies in part on a computer, the properties (1) and (2) follow directly from the existence of the fixed point. The regularity of the fixed point will be established later, cf. Proposition 2.3, whereas property (1) is proved in the following lemma.
Lemma 1.6. Let $f \in L^1(\mathbb{R}_+)\setminus\mathcal{H}$ and let $\lambda > 0$ be arbitrary. If $f$ satisfies $f = S_\lambda\mathcal{D}(f)$, then $M(f) = 1$.
Proof. Using the relations (1.23) together with $M(f) \ne 0$, one computes from the identity $M(f) = M(S_\lambda\mathcal{D}(f))$ that
$$1 = c_1 M(f) + c_2 M(f)^3.$$
Using the monotonicity of the function $x \mapsto c_2x^3 + c_1x - 1$, one verifies that if $c_1 + c_2 = 1$, then the only zero is given by $x_0 = 1$. Therefore, $M(f) = 1$.
1.3. Computer-Assisted Proofs
The rest of the proof of Theorem 1.4 is the main part of this paper. It is based on a very large number of inequalities proved rigorously with the help of a computer. The use of a computer for proving theorems in analysis has become standard by now. This method, which allows one to do constructive functional analysis on a computer, was developed by O.E. Lanford in his seminal paper [L1], and was then generalized in [EKW2]. This technique of proof has since been applied to problems of various origins. See for example [BS, C, CC, dlL, EKW1-2, EW1-2, FL, FS, KP, KSW, KW1-7, L1-3, LR1-3, M, R, Se1-2, St]. The proofs constructed up to now have in common that they all deal with spaces of analytic functions. One important novelty of the work presented here is that a proof is constructed for function spaces of $L^1$-type. The basic ideas underlying the proof remain the same, but the generalization to $L^1$ spaces uses approximation methods which are typical in numerical analysis, and we will explain how to control discretization errors in this context. A computer-assisted proof is complete once the program has come to an end without a "domain error".
2. Organization of the Proof
In order to prove existence of a fixed point $f^*$ for the map $S_{\lambda_*}\mathcal{D}$, we will rely on the contraction mapping principle. The following argument shows, however, that $S_{\lambda_*}\mathcal{D}$ cannot even be hyperbolic due to the presence of a symmetry. Recall that the nonlinear average $D_r$ is homogeneous of order 1. This causes the scaling operator $S_\lambda$ to commute with $\mathcal{D}$ for every $\lambda > 0$. Hence, existence of a fixed point $f^*$ implies existence of a one parameter family of fixed points $\{S_\lambda f^*\}_{\lambda>0}$. In the case $E(f^*) \ne 0$, this family can be parameterized by the expectation $E(S_\lambda f^*) = E(f^*)/\lambda$. We first remove this symmetry and make the fixed point problem hyperbolic by introducing the family of maps $\{\mathcal{N}_\lambda\}_{\lambda>0}$ defined by
$$\mathcal{N}_\lambda(f) = S_\lambda\big(c_\lambda(f)\, f*f + c_2\,\mathcal{T}(\mathcal{T}(f*f) * \mathcal{T}(f*f))\big), \tag{2.1}$$
where $c_\lambda(f)$ is such that
$$E(\mathcal{N}_\lambda(f)) = 1. \tag{2.2}$$
The expression inside the outer brackets on the RHS of (2.1) differs from the map $\mathcal{D}$ only by the coefficient $c_\lambda(f)$. The following remark relates the maps $S_\lambda\mathcal{D}$ and $\mathcal{N}_\lambda$: If $c_\lambda(f_\lambda) = c_1$ for some fixed $\lambda > 0$ and some $f_\lambda$, then $S_\lambda\mathcal{D}(f_\lambda) = \mathcal{N}_\lambda(f_\lambda)$ by definition of $\mathcal{D}$ and $\mathcal{N}_\lambda$. This leads to the following criterion for the existence of a fixed point of $S_{\lambda_*}\mathcal{D}$.
Lemma 2.1. If there exist a real $\lambda_* > 0$ and a fixed point $f_{\lambda_*}$ of $\mathcal{N}_{\lambda_*}$ with $c_{\lambda_*}(f_{\lambda_*}) = c_1$, then $f_{\lambda_*}$ is a solution of the functional equation $f = S_{\lambda_*}\mathcal{D}(f)$.
To prove existence of $\lambda_*$ and $f_{\lambda_*}$, we will study the family $\{\mathcal{N}_\lambda\}$ in a neighborhood $(\lambda^-, \lambda^+)$ of our best numerical value for $\lambda_*$. For each value of $\lambda$ in this neighborhood, the contraction mapping principle will be used to prove existence of a fixed point $f_\lambda$ of $\mathcal{N}_\lambda$. The maps $\mathcal{N}_\lambda$ are hyperbolic but not contracting in the neighborhood of their fixed point, due to an unstable direction that, roughly speaking, crosses transversally the manifold of functions with total mass equal to one. To cope with this problem, we will adopt later a standard strategy that consists of applying a variant of Newton's method. Once the existence of a fixed point $f_\lambda = \mathcal{N}_\lambda(f_\lambda)$ is established for all $\lambda \in [\lambda^-, \lambda^+]$, we will show that $c_{\lambda^-}(f_{\lambda^-}) < c_1 < c_{\lambda^+}(f_{\lambda^+})$. A continuity argument will finally yield the existence of a $\lambda_* \in (\lambda^-, \lambda^+)$ and a function $f_{\lambda_*}$ satisfying the hypothesis of Lemma 2.1.
Before entering into more details, we introduce some notation and state a few results concerning the domains of definition and target spaces of $\mathcal{N}_\lambda$. Using the commutation and distributivity properties (1.21) and (1.22), one rewrites $\mathcal{N}_\lambda$ as
$$\mathcal{N}_\lambda(f) = c_\lambda(f)\,\mathcal{N}^1_\lambda(f) + c_2\,\mathcal{N}^2_\lambda(f), \tag{2.3}$$
where
$$\mathcal{N}^1_\lambda(f) = S_\lambda(f*f), \tag{2.4}$$
$$\mathcal{N}^2_\lambda(f) = \mathcal{T}\big(\mathcal{T}\mathcal{N}^1_\lambda(f) * \mathcal{T}\mathcal{N}^1_\lambda(f)\big). \tag{2.5}$$
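From the condition (2.2), the coefficient $c_\lambda(f)$ is expressed in terms of the expectation of $\mathcal{N}^1_\lambda(f)$ and $\mathcal{N}^2_\lambda(f)$. Since $E(\mathcal{N}^1_\lambda(f)) = 2M(f)E(f)/\lambda$ and
$$\lambda E(\mathcal{N}^2_\lambda(f)) = E\big(\mathcal{T}(\mathcal{T}(f*f) * \mathcal{T}(f*f))\big) \equiv E_2(f), \tag{2.6}$$
one gets
$$c_\lambda(f) = \frac{\lambda - c_2E_2(f)}{2M(f)E(f)}. \tag{2.7}$$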
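The normalization behind (2.7) can be verified with a few lines of arithmetic. This is a minimal sketch, assuming the three moments $M(f)$, $E(f)$ and $E_2(f)$ are already known as numbers; the function names and the numerical values used in the usage note are hypothetical.

```python
def c_lambda(lam, mass, expectation, e2, c2):
    """Coefficient c_lambda(f) of Eq. (2.7), given M(f), E(f) and E_2(f)."""
    denom = 2.0 * mass * expectation
    if denom == 0.0:
        # Functions with M(f) E(f) = 0 are exactly the excluded set H.
        raise ValueError("M(f) E(f) = 0: f lies in the excluded set H")
    return (lam - c2 * e2) / denom


def expectation_of_n(lam, mass, expectation, e2, c2):
    """E(N_lambda(f)) assembled from (2.3): c_lambda * 2 M E / lambda + c2 * E_2 / lambda."""
    c = c_lambda(lam, mass, expectation, e2, c2)
    return c * 2.0 * mass * expectation / lam + c2 * e2 / lam
```

For any admissible inputs, for instance `expectation_of_n(1.756, 1.0, 0.8, 1.3, 0.236)`, the result equals one up to rounding, which is the condition (2.2).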
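We will see that $E_2(f)$ is finite for $f \in B_{\alpha\beta}$ with $\beta > 0$. Hence, $c_\lambda(f)$ is finite provided one excludes functions $f$ in $\mathcal{H}$, i.e., functions such that $M(f)E(f) = 0$. We now state a result about the domains of definition and target spaces of the maps $\mathcal{N}_\lambda$.
Proposition 2.2. For $\alpha > 0$, $\beta > 0$ and $\lambda > 0$, $\mathcal{N}_\lambda$ is well defined as a map from $B_{\alpha\beta}\setminus\mathcal{H}$ to $B_{\sigma\tau}$ for all $\sigma \le 4\alpha/\lambda$ and $\tau \le \lambda\beta$.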
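Proof. The operators $S_\lambda$, $\mathcal{T}$ and the convolution $f \mapsto f*f$ are well defined on $L^1(\mathbb{R}_+)$. Next, we show that the convolution product maps $B_{\zeta\eta} \times B_{\zeta\eta}$ into $B_{(4\zeta)\eta}$. Using Fubini's theorem, we get the following inequalities
$$\begin{aligned}
\|f*g\|_{(4\zeta)\eta} &\le \int_0^\infty dx\, w_{(4\zeta)\eta}(x) \int_0^x dy\, |f(y)|\,|g(x-y)| \\
&= \int_0^\infty dy\, |f(y)| \int_0^\infty dx\, w_{(4\zeta)\eta}(x+y)\,|g(x)| \\
&\le \sup_{x,y>0}\left(\frac{w_{(4\zeta)\eta}(x+y)}{w_{\zeta\eta}(x)\,w_{\zeta\eta}(y)}\right) \|f\|_{\zeta\eta}\,\|g\|_{\zeta\eta} = \|f\|_{\zeta\eta}\,\|g\|_{\zeta\eta}.
\end{aligned} \tag{2.8}$$
The last equality follows from
$$\sup_{x,y>0}\frac{w_{(4\zeta)\eta}(x+y)}{w_{\zeta\eta}(x)\,w_{\zeta\eta}(y)} = \sup_{x,y>0}\exp\big(-\zeta\,h(x,y)\big),$$
and
$$h(x,y) = \frac{x+y}{xy} - \frac{4}{x+y} \ge 0$$
for all $x, y > 0$. Next, $S_\lambda$ is bounded as a map from $B_{\zeta\eta}$ to $B_{(\zeta/\lambda)(\lambda\eta)}$:
$$\|S_\lambda f\|_{(\zeta/\lambda)(\lambda\eta)} = \int_0^\infty w_{(\zeta/\lambda)(\lambda\eta)}(x/\lambda)\,|f(x)|\,dx \le \sup_{x>0}\left(\frac{w_{(\zeta/\lambda)(\lambda\eta)}(x/\lambda)}{w_{\zeta\eta}(x)}\right)\|f\|_{\zeta\eta} = \|f\|_{\zeta\eta}. \tag{2.9}$$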
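Therefore $\mathcal{N}^1_\lambda$ maps $B_{\alpha\beta}$ into $B_{(4\alpha/\lambda)(\lambda\beta)}$, and hence into $B_{\sigma\tau}$ for $\sigma \le 4\alpha/\lambda$ and $\tau \le \lambda\beta$. In order to check that $\mathcal{N}^2_\lambda$ maps $B_{\alpha\beta}$ into $B_{(4\alpha/\lambda)(\lambda\beta)}$, we first note that $\mathcal{T}$ is obviously bounded as a map from $B_{\zeta\eta}$ to $B_{\eta\zeta}$, with
$$\|\mathcal{T}f\|_{\eta\zeta} = \|f\|_{\zeta\eta}, \tag{2.10}$$
since $w_{\zeta\eta}(1/x) = w_{\eta\zeta}(x)$ for all $x > 0$. Using (2.8), (2.10) and the bound on $\mathcal{N}^1_\lambda$, one concludes that $\mathcal{N}^2_\lambda(f) \in B_{(4\alpha/\lambda)(4\lambda\beta)} \subset B_{(4\alpha/\lambda)(\lambda\beta)}$ for $f \in B_{\alpha\beta}$. Finally, the bounds (2.8), (2.10), and the fact that the expectation of a function in $B_{\sigma\tau}$ is finite for $\tau > 0$ imply that $c_\lambda(f)$ is finite for $f \in B_{\alpha\beta}\setminus\mathcal{H}$ and $\beta > 0$.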
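The key inequality behind the convolution bound is that the ratio of weights never exceeds one. The sketch below spot-checks this on sample points, assuming the weight $w_{\alpha\beta}(x) = \exp(\alpha/x + \beta x)$; it is a numerical sanity check and not a substitute for the proof, and the function names are illustrative.

```python
import math


def w(alpha, beta, x):
    """Weight w_{alpha beta}(x) = exp(alpha / x + beta * x)."""
    return math.exp(alpha / x + beta * x)


def ratio(zeta, eta, x, y):
    """w_{(4 zeta) eta}(x + y) / (w_{zeta eta}(x) * w_{zeta eta}(y))."""
    return w(4.0 * zeta, eta, x + y) / (w(zeta, eta, x) * w(zeta, eta, y))


def h(x, y):
    """h(x, y) = (x + y)/(x y) - 4/(x + y), equal to (x - y)**2 / (x y (x + y))."""
    return (x + y) / (x * y) - 4.0 / (x + y)
```

The ratio equals $\exp(-\zeta h(x,y))$, and since $h(x,y) = (x-y)^2/(xy(x+y)) \ge 0$, it is at most one, with equality on the diagonal $x = y$.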
Proposition 2.2 together with Remark 1.2 immediately imply that every fixed point $f \in B_{\alpha\beta}\setminus\mathcal{H}$ of $\mathcal{N}_\lambda$ with $\alpha, \beta > 0$ and $\lambda \in (1,4)$ satisfies $f \in B\setminus\mathcal{H}$. Furthermore, using the regularization properties of the convolution, one can show that every such fixed point is a smooth function. More precisely, we have the following proposition, whose proof can be found in the appendix.
Proposition 2.3. Let $\alpha, \beta > 0$, $\lambda \in (1,4)$, and let $f \in B_{\alpha\beta}\setminus\mathcal{H}$ be a fixed point of $\mathcal{N}_\lambda$. Then $f \in B\setminus\mathcal{H}$, $f$ is of class $C^\infty(\mathbb{R}_+)$, and $f' \in B$.
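The following theorem implies Theorem 1.4.
Theorem 2.4. Let $\lambda^- = 1.7562035$ and $\lambda^+ = 1.7562048$. Then,
(a) For some $\alpha, \beta > 0$, there is a continuous family $\{f_\lambda\}_{\lambda\in[\lambda^-,\lambda^+]}$ of functions in $B_{\alpha\beta}\setminus\mathcal{H}$ such that $\mathcal{N}_\lambda(f_\lambda) = f_\lambda$ for all $\lambda \in [\lambda^-, \lambda^+]$,
(b)
$$c_{\lambda^-}(f_{\lambda^-}) < c_1 < c_{\lambda^+}(f_{\lambda^+}). \tag{2.11}$$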
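Our main result follows from Theorem 2.4 and Proposition 2.3.
Proof of Theorem 1.4. Assume first that the map $\lambda \mapsto c_\lambda(f_\lambda)$ is continuous. Then Theorem 2.4 implies the existence of a $\lambda_* \in (\lambda^-, \lambda^+)$ for which $c_{\lambda_*}(f_{\lambda_*}) = c_1$ and $f_{\lambda_*} = \mathcal{N}_{\lambda_*}(f_{\lambda_*})$, which using Lemma 2.1 implies that $S_{\lambda_*}\mathcal{D}(f_{\lambda_*}) = f_{\lambda_*}$, and using Proposition 2.3 that $f_{\lambda_*} \in (B\setminus\mathcal{H}) \cap C^\infty(\mathbb{R}_+)$. It remains to be checked that the map $\lambda \mapsto c_\lambda(f_\lambda)$ is indeed continuous. One first observes that the linear functionals $f \mapsto M(f)$ and $f \mapsto E(f)$ are bounded as maps from $B_{\sigma\tau}$ to $\mathbb{R}$, provided $\tau > 0$, and that
$$|M(f)| \le \sup_{x>0}\left(\frac{1}{w_{\sigma\tau}(x)}\right)\|f\|_{\sigma\tau}, \tag{2.12}$$
$$|E(f)| \le \sup_{x>0}\left(\frac{x}{w_{\sigma\tau}(x)}\right)\|f\|_{\sigma\tau}. \tag{2.13}$$
Hence, $f \mapsto E_2(f)$ is continuous as a map from $B_{\alpha\beta}$ to $\mathbb{R}$ for every $\alpha, \beta > 0$, using in addition the bounds derived in the proof of Proposition 2.2. Therefore $f \mapsto c_\lambda(f)$ is continuous as a map from $B_{\alpha\beta}\setminus\mathcal{H}$ to $\mathbb{R}$ for every $\alpha, \beta > 0$ and $\lambda \in \mathbb{R}$. Next, for each $f \in B_{\alpha\beta}\setminus\mathcal{H}$ with $\alpha, \beta > 0$, the map $\lambda \mapsto c_\lambda(f)$ is continuous. The continuity of $\lambda \mapsto c_\lambda(f_\lambda)$ as a map from $[\lambda^-, \lambda^+]$ to $\mathbb{R}$ finally follows from the continuity of the family $\{f_\lambda\}$.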
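The proof of Theorem 2.4 is in part computer-assisted. Once (a) is established, the verification of part (b) involves mainly an explicit calculation, that will be given in Section 5.3. The remainder of this section is devoted to the proof of part (a). First, in order to simplify further our estimates, we introduce yet another family of operators, closely related to $\{\mathcal{N}_\lambda\}_{\lambda>0}$. This family is defined by
$$\mathcal{N}_{\lambda,\kappa} = S_\kappa\,\mathcal{N}_\lambda, \tag{2.14}$$
where $\lambda, \kappa > 0$.
Lemma 2.5. Let $\alpha \ge 0$ and $\beta, \lambda, \kappa > 0$. Then $\mathcal{N}_{\lambda,\kappa}$ is well defined as a map from $B_{\alpha\beta}\setminus\mathcal{H}$ to $B_{\sigma\tau}$ for every $\sigma \le 4\alpha/\kappa\lambda$ and $\tau \le \kappa\lambda\beta$, and one has
$$S_{1/\kappa}\,\mathcal{N}_{\lambda,\kappa} = \mathcal{N}_{\kappa\lambda}\,S_{1/\kappa}, \qquad S_{1/\kappa} : B_{\alpha\beta} \to B_{(\kappa\alpha)(\beta/\kappa)}. \tag{2.15}$$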
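Proof. First, our previous result on the domains of definition and target spaces of $\mathcal{N}_\lambda$ implies that the operator $\mathcal{N}_{\lambda,\kappa}$ is well defined as a map from $B_{\alpha\beta}\setminus\mathcal{H}$ to $B_{(4\alpha/\kappa\lambda)(\kappa\lambda\beta)}$ whenever $\alpha \ge 0$ and $\beta, \lambda, \kappa > 0$. We now show that
$$c_\lambda(f) = c_{\kappa\lambda}(S_{1/\kappa}f). \tag{2.16}$$
Using (2.7), we see that
$$c_{\kappa\lambda}(S_{1/\kappa}f) = \frac{\kappa\lambda - c_2E_2(S_{1/\kappa}f)}{2M(S_{1/\kappa}f)E(S_{1/\kappa}f)},$$
and the relations $M(S_{1/\kappa}f) = M(f)$, $E(S_{1/\kappa}f) = \kappa E(f)$ and $E_2(S_{1/\kappa}f) = \kappa E_2(f)$ lead to (2.16). Using (2.16), we compute
$$\mathcal{N}_{\kappa\lambda}(S_{1/\kappa}f) = c_\lambda(f)\,\mathcal{N}^1_\lambda(f) + c_2\,\mathcal{N}^2_\lambda(f) = \mathcal{N}_\lambda(f),$$
and conclude by observing that $\mathcal{N}_\lambda = S_{1/\kappa}\mathcal{N}_{\lambda,\kappa}$.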
From Lemma 2.5 it follows that the fixed points of Nx for A 6 [ A - , A + ] are related to the fixed points of Afx+ K for K £ [A~/A + , 1]. Furthermore, the operators J V A K are well defined as maps from Bag\H to BaB for A and K satisfying ! < « < ! .
(2.17)
This condition is easily seen to hold for $\kappa \in [\lambda^-/\lambda^+, 1]$ and $\lambda \in [\lambda^-, \lambda^+]$. As mentioned earlier, the fixed points $f_\lambda$ of the maps $\mathcal{N}_\lambda$ are not attractive. Numerically, the two largest eigenvalues of $D\mathcal{N}_\lambda(f_\lambda)$ are roughly 1.37 and 0.54 for $\lambda \approx \lambda^*$. In the context of computer-assisted proofs, the standard way of solving a hyperbolic fixed point problem is to turn it into a fixed point problem for a contraction by proceeding in the following way. We choose an invertible linear map $M$ close to the inverse of $1 - D\mathcal{N}_{\lambda^*}(f_{\lambda^*})$ and define
$$\mathcal{M}_{\lambda,\kappa} = 1 + M(\mathcal{N}_{\lambda,\kappa} - 1). \qquad (2.18)$$
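The construction (2.18) can be illustrated on a toy problem. The following is a minimal sketch, not the operators of the paper: `n_map` is a hypothetical two-dimensional map whose linearization at its fixed point has the eigenvalues 1.37 and 0.54 quoted above, and `M` is a hand-picked diagonal matrix close to the inverse of $1 - DN$. Plain iteration of `n_map` leaves the fixed point along the unstable direction, whereas the Newton-like map contracts towards it.

```python
def n_map(x):
    """Hypothetical map with a hyperbolic fixed point at (0, 0)."""
    u, v = x
    return (1.37 * u + 0.1 * v * v, 0.54 * v + 0.1 * u * u)

# Diagonal matrix close to (but not exactly) the inverse of 1 - DN(0),
# where DN(0) = diag(1.37, 0.54): 1/(1-1.37) ~ -2.703, 1/(1-0.54) ~ 2.174.
M = (-2.7, 2.17)

def m_map(x):
    """Newton-like map x + M (N(x) - x), mirroring the form of (2.18)."""
    nx = n_map(x)
    return (x[0] + M[0] * (nx[0] - x[0]), x[1] + M[1] * (nx[1] - x[1]))

def iterate(f, x, steps):
    """Apply f to x the given number of times."""
    for _ in range(steps):
        x = f(x)
    return x

start = (0.05, 0.05)
print(iterate(n_map, start, 10))  # drifts away along the unstable direction
print(iterate(m_map, start, 8))   # settles at the fixed point (0, 0)
```

Both maps have the same fixed points as long as `M` is invertible, which is the property used in the text to transfer the result back to the original operator.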
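In Section 7.2, we give a detailed description of $M$ and establish its invertibility. Furthermore, we will see that $M$ is bounded in $\mathcal{B}_{\alpha\beta}$ for all $\alpha, \beta > 0$. Hence, $\mathcal{M}_{\lambda,\kappa}$ is well defined as a map from $\mathcal{B}_{\alpha\beta}\setminus\mathcal{H}$ to $\mathcal{B}_{\alpha\beta}$ for all $\alpha \ge 0$, $\beta > 0$, and $\kappa, \lambda$ satisfying (2.17). The existence of the continuous family of fixed points $\{f_\lambda\}$ will follow from estimates on the contractions $\mathcal{M}_{\lambda^+,\kappa}$. These estimates are collected in the following proposition.

Proposition 2.6. Let $\lambda^+$ and $\lambda^-$ be defined as in Theorem 2.4. Then, for $\mu = 0.5$, $\nu = 0.9$ and $r = 9 \cdot 10^{-4}$, there is a function $f^0_{\lambda^+} \in \mathcal{B}_{\mu\nu}$ and two positive real numbers $q < 1$ and $\varepsilon < r(1 - q)$ for which the following holds. For all $\kappa \in [\lambda^-/\lambda^+, 1]$, the operator $\mathcal{M}_{\lambda^+,\kappa}$ is well defined and continuously differentiable as a map from the closed ball $B_r(f^0_{\lambda^+}) \subset \mathcal{B}_{\mu\nu}\setminus\mathcal{H}$ to $\mathcal{B}_{\mu\nu}$ and satisfies, for all $f \in B_r(f^0_{\lambda^+})$,
$$\|\mathcal{M}_{\lambda^+,\kappa}(f^0_{\lambda^+}) - f^0_{\lambda^+}\|_{\mu\nu} \le \varepsilon, \qquad (2.19)$$
$$\|D\mathcal{M}_{\lambda^+,\kappa}(f)\| \le q. \qquad (2.20)$$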
The fact that $B_r(f^0_{\lambda^+})$ does not contain any function in $\mathcal{H}$ will follow from computing explicit bounds on the inverse of $M(f)E(f)$ for all $f \in B_r(f^0_{\lambda^+})$. These bounds will be computed when evaluating the quantity $c_{\lambda^+}(f)$, cf. the remark preceding Section 6.1. We now show that Proposition 2.6 implies part (a) of Theorem 2.4.

Proof of Theorem 2.4 (a). By the contraction mapping principle, Proposition 2.6 implies the existence of a fixed point $f_{\lambda^+,\kappa} \in \mathcal{B}_{\mu\nu}\setminus\mathcal{H}$ of $\mathcal{M}_{\lambda^+,\kappa}$ for all $\kappa \in [\lambda^-/\lambda^+, 1]$.
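From the invertibility of the operator $M$, the functions $f_{\lambda^+,\kappa}$ are also fixed points of the operators $\mathcal{N}_{\lambda^+,\kappa}$. Hence, the conjugation relation (2.15) ensures the existence of the family $\{f_\lambda\}$ of fixed points of $\mathcal{N}_\lambda$ for $\lambda \in [\lambda^-, \lambda^+]$. These fixed points are given by $f_\lambda = S_{1/\kappa} f_{\lambda^+,\kappa}$, $\kappa = \lambda/\lambda^+$. Since $S_{1/\kappa} f_{\lambda^+,\kappa} \in \mathcal{B}_{(\lambda^-\mu/\lambda^+),\nu}$ for all $\kappa \in [\lambda^-/\lambda^+, 1]$, it follows that for all $\lambda \in [\lambda^-, \lambda^+]$, $f_\lambda \in \mathcal{B}_{\alpha\beta}\setminus\mathcal{H}$ for some $\alpha, \beta > 0$.

Finally, we prove the continuity of the family $\{f_\lambda\}$. From Proposition 2.3, it follows that $f_\lambda \in \mathcal{B} \cap C^\infty(\mathbb{R}_+)$ with $f'_\lambda \in \mathcal{B}$. Since the functions $f_{\lambda^+,\kappa} = S_\kappa f_{\kappa\lambda^+}$ have the same properties, $\kappa \mapsto S_{1/\kappa} f_{\lambda^+,\kappa}$ is continuous as a map from $[\lambda^-/\lambda^+, 1]$ to $\mathcal{B}_{\alpha\beta}$ provided that the family $\{f_{\lambda^+,\kappa}\}_\kappa$ is continuous in $\mathcal{B}_{\mu\nu}$. In order to show that this is indeed the case, we check that $\{f_{\lambda^+,\kappa}\}_\kappa$ is continuous at $\kappa = \kappa_0$ for each $\kappa_0 \in [\lambda^-/\lambda^+, 1]$. Let us fix such a $\kappa_0$ and denote $f_0 = f_{\lambda^+,\kappa_0}$. First, since the contraction mapping principle and Proposition 2.6 imply that $f_0$ belongs to the ball $B_{\tilde r}(f^0_{\lambda^+})$, where $\tilde r = \varepsilon/(1 - q) < r$, then, for every $\tilde\varepsilon > 0$ satisfying $\tilde\varepsilon < r - \tilde r$, the ball $B_{\tilde\varepsilon}(f_0)$ is contained in $B_r(f^0_{\lambda^+})$. Hence, by (2.20), the operators $\mathcal{M}_{\lambda^+,\kappa}$ are strict contractions there with rate $q$. Next, since $f_0 \in \mathcal{B} \cap C^\infty(\mathbb{R}_+)$ with $f'_0 \in \mathcal{B}$, $\kappa \mapsto \mathcal{M}_{\lambda^+,\kappa}(f_0)$ is continuous as a map from $\mathbb{R}^*_+$ to $\mathcal{B}_{\alpha\beta}$ for all $\alpha, \beta > 0$. This implies that there is a $\delta = \delta(\tilde\varepsilon)$ such that $\mathcal{M}_{\lambda^+,\kappa}$ maps the ball $B_{\tilde\varepsilon}(f_0)$ into itself for each $\kappa$ with $|\kappa - \kappa_0| < \delta$: for $f \in B_{\tilde\varepsilon}(f_0)$, it follows from the continuity of the map $\kappa \mapsto \mathcal{M}_{\lambda^+,\kappa}(f_0)$ and $q < 1$ that
$$\|\mathcal{M}_{\lambda^+,\kappa}(f) - f_0\|_{\mu\nu} \le \|\mathcal{M}_{\lambda^+,\kappa}(f) - \mathcal{M}_{\lambda^+,\kappa}(f_0)\|_{\mu\nu} + \|\mathcal{M}_{\lambda^+,\kappa}(f_0) - f_0\|_{\mu\nu} \le q\tilde\varepsilon + \|\mathcal{M}_{\lambda^+,\kappa}(f_0) - \mathcal{M}_{\lambda^+,\kappa_0}(f_0)\|_{\mu\nu} \le \tilde\varepsilon.$$
Therefore, the contraction mapping principle implies the existence of a fixed point of $\mathcal{M}_{\lambda^+,\kappa}$ in the ball $B_{\tilde\varepsilon}(f_0)$ whenever $|\kappa - \kappa_0| < \delta$. By uniqueness of the fixed points $f_{\lambda^+,\kappa}$, one concludes that $\|f_{\lambda^+,\kappa} - f_{\lambda^+,\kappa_0}\|_{\mu\nu} \le \tilde\varepsilon$ for all $\kappa$ satisfying $|\kappa - \kappa_0| < \delta$.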
Proposition 2.6 reduces the proof of Theorem 1.4 to the verification of the estimates (2.11), (2.19) and (2.20). This verification is computer-assisted, and yields $q \approx 0.85$ and $\varepsilon \approx 1.15 \cdot 10^{-4}$. The function $f^0_{\lambda^+}$ has been numerically determined to be a very good approximation of the fixed point $f_{\lambda^+}$. It is given by the linear interpolation of $2^{17}$ positive numbers at well chosen points, and has been obtained by iterating a numerical version of the map $\mathcal{N}_{\lambda^+}$ (as described in Sections 4 and 5) and renormalizing the mass properly after each iteration in order to remove the unstable direction. Regarding the computation of the norm of the tangent map $D\mathcal{M}_{\lambda^+,\kappa}(f)$, we will take advantage of the fact that $D\mathcal{N}_\lambda(f)$ has very good contraction properties on certain subspaces of finite codimension provided $f$ has some regularity. In particular, the nontrivial action of the operator $M$ can be restricted to a finite dimensional subspace, and the computation of the norm of $D\mathcal{M}_{\lambda^+,\kappa}(f)$ essentially requires explicitly evaluating $D\mathcal{M}_{\lambda^+,\kappa}(f)$ on finitely many basis vectors.
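As a worked check of how the numbers fit together, the sketch below evaluates the radius $\varepsilon/(1-q)$ of the ball in which the contraction mapping principle places the fixed point, using the approximate values quoted in the text. The function name is ours, and ordinary floating point is used, so this is only an arithmetic illustration, not part of the rigorous verification.

```python
def fixed_point_radius(eps, q):
    """Radius eps / (1 - q) of the ball, centered at the approximate fixed
    point, that contains the true fixed point of a q-contraction."""
    if not 0.0 <= q < 1.0:
        raise ValueError("q must lie in [0, 1)")
    return eps / (1.0 - q)

r = 9e-4        # radius of the ball in Proposition 2.6
q = 0.85        # approximate contraction rate quoted in the text
eps = 1.15e-4   # approximate defect quoted in the text

radius = fixed_point_radius(eps, q)
print(radius, radius < r, eps < r * (1.0 - q))
```

With these values the radius is about $7.7 \cdot 10^{-4}$, which is below $r = 9 \cdot 10^{-4}$, in agreement with the requirement $\varepsilon < r(1-q)$.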
The remainder of this paper is devoted to the proof of Proposition 2.6 and the verification of inequality (2.11). In Section 3, we review the basic approach of computer-assisted proofs, and extend it to function spaces of $L^1$-type. In Sections 4 and 5, we give a detailed account of the rigorous implementation on a computer of the maps $\mathcal{N}_\lambda$ and of the computation of bound (2.19) and inequality (2.11). Section 6 is devoted to the tangent maps $D\mathcal{N}_\lambda$ and their contraction properties, whereas Section 7 deals with $D\mathcal{M}_{\lambda^+,\kappa}$ and the computation of the bound (2.20). Section 8 is available as a supplement to this paper, and contains the source code of the program (proof.f) and two input data files (fpoint.lp and fpoint.lm). The program has been written in Fortran 77 and consists of a (short) main program and several subroutines ordered in a "bottom-up" hierarchy, according to the organization of the paper. Except for Section 3, references to the program are collected in remarks at the end of each section. For a description of the input data files, see Sections 5.2 and 5.3.
3. Constructive Analysis in $\mathcal{B}_{\alpha\beta}$

Computer-assisted proofs rely on the ability, first, to discretize the problem under study in terms of objects that are representable on a computer, and, second, to have a rigorous control on the errors arising from the discretization. We note that in this respect, arithmetic operations are special, since controlling them rigorously requires an explicit knowledge of how rounding is performed by the computer. Nevertheless, we emphasize that the main difficulties related to discretization are usually concerned with the specific transformations involved in the functional equation under study, the control of numerical rounding being typically of no particular relevance. To address discretization issues, one introduces the notions of bounds and standard sets. Denoting, for any set $\Sigma$, by $\mathcal{P}(\Sigma)$ the set of all subsets of $\Sigma$, we start by defining what we call a bound in the context of computer-assisted proofs.

Definition 3.1. Let
given a map $\varphi : \Sigma \supset D_\varphi \to \Sigma'$, we construct a bound on $\varphi$ within the class of maps $g : \mathrm{std}(\Sigma) \supset D_g \to \mathrm{std}(\Sigma')$. Finally, it is in general possible to characterize the images of $g$ in $\mathrm{std}(\Sigma')$ constructively and implement this map on the computer. We note that one can usually choose $\mathrm{std}(\Sigma)$ and $\mathrm{std}(\Sigma')$ specifically adapted to the map $\varphi$ in order to improve the bounds $g$ that can be constructed. Unless specified otherwise, the standard sets for a Cartesian product $\Sigma \times \Sigma'$ will be defined by setting $\mathrm{std}(\Sigma \times \Sigma') = \mathrm{std}(\Sigma) \times \mathrm{std}(\Sigma')$.
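To make the idea of a bound concrete in the simplest setting, the sketch below takes closed intervals of floating-point numbers as standard sets of the reals and encloses sums and products by widening the rounded result by one unit in the last place in each direction. This is an illustration under our own assumptions, not the paper's Fortran code: the names `point`, `add`, `mul` and `contains` are hypothetical, `math.nextafter` requires Python 3.9 or later, and overflow and underflow are ignored.

```python
import math
from fractions import Fraction

def point(r):
    """Singleton interval [r, r] for a representable number r."""
    return (r, r)

def add(x, y):
    """Interval containing {a + b : a in x, b in y}."""
    lo = math.nextafter(x[0] + y[0], -math.inf)  # push lower end down
    hi = math.nextafter(x[1] + y[1], math.inf)   # push upper end up
    return (lo, hi)

def mul(x, y):
    """Interval containing {a * b : a in x, b in y}."""
    products = [a * b for a in x for b in y]
    return (math.nextafter(min(products), -math.inf),
            math.nextafter(max(products), math.inf))

def contains(interval, exact):
    """True if the exact rational value lies in the interval."""
    return Fraction(interval[0]) <= exact <= Fraction(interval[1])

s = add(point(0.1), point(0.2))
print(s, contains(s, Fraction(0.1) + Fraction(0.2)))
```

Since the rounded result of one operation differs from the exact one by at most half a unit in the last place under round-to-nearest, stepping one representable number outward is a safe, slightly pessimistic choice.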
3.1. Operations Involving Real and Complex Numbers

In our application of the above mentioned procedure to the case of real numbers, we have followed the approach of [KSW], which is based on the 64 bit IEEE standard for floating point arithmetic. This standard specifies two things: a format for floating point numbers (IEEE numbers) and rules concerning rounding after the operations $+$, $-$, $*$, $/$ and $\sqrt{\cdot}$. We will not discuss the details of the implementation, but refer the interested reader to the corresponding section of [KSW]. We first choose a subset $\mathcal{S}$ of IEEE numbers, the "safe range", for which neither underflows nor overflows can occur. The standard sets of $\mathbb{R}$ and $\mathbb{R}_+$ are defined as follows.

Definition 3.2. We define $\mathrm{std}(\mathbb{R})$ as the collection of all (closed) intervals $[a, b]$ with a 0.

To represent an interval in $\mathrm{std}(\mathbb{R})$ on the computer, we use for convenience the data type for complex numbers available in Fortran. Given $a \le b \in \mathcal{S}$, the procedure sbound returns the interval $[a, b]$, whereas, given the interval $[a, b] \in \mathrm{std}(\mathbb{R})$, rl and ru return $a$ and $b$, respectively. We add two more functions, siconst and srconst, which, given $r \in \mathcal{S}$ an integral constant, and, respectively, $r \in \mathcal{S}$ an IEEE number, return the (unique) singleton in $\mathrm{std}(\mathbb{R})$ containing $r$. By using the IEEE specifications related to the rounding occurring after the operations $+$, $-$, $*$, $/$ and $\sqrt{\cdot}$, one first writes two functions, rup and rdown, which, given $r_c$ the rounded result of any of these operations, compute an upper bound and a lower bound, respectively, on the exact result $r$. If these bounds do not belong to $\mathcal{S}$, a flag is raised and the program stops. In a straightforward manner, one next constructs bounds in $\mathrm{std}(\mathbb{R})$ (in the sense of Definition 3.1) on the maps $x \mapsto -x$ (sneg), $|x|$ (sabs), $1/x$ (sinv), $x^2$ (spower2), $\sqrt{x}$ (ssqrt), and $(x, y) \mapsto x + y$ (ssum), $x - y$ (sdiff), $x * y$ (sprod) and $x/y$ (squot). We will also need a bound on the function $x \mapsto \exp(x)$.
The precision with which this function is evaluated is not specified by the IEEE standard. Hence, we make use of the bounds constructed so far and compose them in the following way. First, we use $\exp(nx) = \exp(x)^n$ to restrict the Taylor expansion of $\exp(x)$ to cases where $|x| < 0.03$. Next, we compute the first three terms in the expansion and bound the tail by a geometrical series. This bound is implemented in the function sexp. We note that it is only involved in the computation of the weight $w_{\alpha\beta}$ and is not required to be of great accuracy. Finally, we will need to evaluate for $x$ close to zero and $n = 0, \ldots, 3$,
$$\mathrm{Log}_n(x) = -(-x)^{-n} \sum_{k=n+1}^{\infty} \frac{(-x)^k}{k}. \qquad (3.1)$$
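For orientation, the series in (3.1) can be evaluated by truncation. After re-indexing, $\mathrm{Log}_n(x) = -\sum_{k \ge 1} (-x)^k/(k+n)$, which for $0 \le x < 1$ is an alternating series with decreasing terms, so the truncation error is at most the first omitted term. The sketch below relies on that assumption on the sign of $x$; the function name is ours, and floating-point rounding is not controlled.

```python
import math

def log_n_enclosure(x, n, m=4):
    """Return (lower, upper) for Log_n(x) = -sum_{k>=1} (-x)**k / (k + n),
    using m terms and the alternating series estimate, for 0 <= x < 1.
    Floating-point rounding is ignored in this sketch."""
    if not 0.0 <= x < 1.0:
        raise ValueError("this sketch assumes 0 <= x < 1")
    partial = -sum((-x) ** k / (k + n) for k in range(1, m + 1))
    tail = x ** (m + 1) / (m + n + 1)  # size of the first omitted term
    return partial - tail, partial + tail

lo, hi = log_n_enclosure(0.02, 0)
print(lo, math.log1p(0.02), hi)  # Log_0(x) = log(1 + x)
```

For $x = 0.02$ and $m = 4$ the enclosure has width below $2 \cdot 10^{-9}$, which illustrates why a few terms suffice for arguments close to zero.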
Note that the second factor is just the tail of the Taylor expansion of $\log(1 + x)$. In particular, $\mathrm{Log}_0(x) = \log(1 + x)$. One easily checks that the inequalities
$$-\sum_{k=1}^{m} \frac{(-x)^k}{k+n} - \left|\frac{x^{m+1}}{m+n+1}\right| \le \mathrm{Log}_n(x) \le -\sum_{k=1}^{m} \frac{(-x)^k}{k+n} + \left|\frac{x^{m+1}}{m+n+1}\right|$$
are valid for all $m \ge 1$. With $m = 4$, a sufficiently accurate bound is constructed in selogne from the previous inequalities.

We end this section with the discussion of a bound on the discrete convolution. For $r = (r_0, \ldots, r_{n-1})$ and $s = (s_0, \ldots, s_{n-1}) \in \mathbb{R}^n$, the discrete convolution $r * s$ is the element of $\mathbb{R}^{2n-1}$ given by
$$(r * s)_k = \sum_{i+j=k} r_i s_j, \qquad k = 0, \ldots, 2n-2. \qquad (3.2)$$
Computing $r * s$ according to (3.2) involves $O(n^2)$ operations, and becomes impractical for large $n$. The standard strategy is to go into Fourier space, where the convolution becomes the pointwise product of vectors. The gain in computational time is due to the fact that the discrete Fourier transform can be implemented with an $O(n \log_2 n)$ algorithm, known as the Fast Fourier Transform (FFT). More precisely, the discrete Fourier transform is a map from
$$(\mathcal{F}(z))_k = \sum_{j=0}^{n-1} \exp\Bigl(i\,\frac{2\pi jk}{n}\Bigr) z_j, \qquad k = 0, \ldots, n-1, \qquad (3.3)$$
for $z = (z_0, \ldots, z_{n-1}) \in \mathbb{C}^n$. The inverse Fourier transform $\mathcal{F}^{-1}$ is given by
$$(\mathcal{F}^{-1}(z))_k = \frac{1}{n} \sum_{j=0}^{n-1} \exp\Bigl(-i\,\frac{2\pi jk}{n}\Bigr) z_j, \qquad k = 0, \ldots, n-1. \qquad (3.4)$$
For $r, s \in \mathbb{R}^n$ as above, one has the well known relation
$$(r * s)_k = \bigl(\mathcal{F}^{-1}(\mathcal{F}(\tilde r) \cdot \mathcal{F}(\tilde s))\bigr)_k, \qquad k = 0, \ldots, 2n-2, \qquad (3.5)$$
where $\tilde r = (r_0, \ldots, r_{n-1}, 0, \ldots, 0) \in \mathbb{R}^{2n}$ and $\tilde s = (s_0, \ldots, s_{n-1}, 0, \ldots, 0) \in \mathbb{R}^{2n}$. An efficient implementation of (3.3) and (3.4) follows from the observation that if $n$ is even, then $\mathcal{F}(z)$ can be decomposed into the sum of the Fourier transform of two vectors in
$$= \{x + i \cdot y \in \mathbb{C} \mid x \in R,\ y \in I\} \qquad (3.6)$$
with $R$ and $I$ elements of $\mathrm{std}(\mathbb{R})$. The only operations in $\mathbb{C}$ involved in the FFT algorithm are the addition and product, bounds on which are readily implemented from our bounds acting on $\mathrm{std}(\mathbb{R})$. One also needs bounds on the trigonometric factors appearing in (3.3) and (3.4). From the periodicity properties of the functions sin and cos, one first notes that it is sufficient to construct a bound on the maps $(l, m) \mapsto \sin(l\pi/m), \cos(l\pi/m)$ where $m \ge 4$ is a power of 2 and $l$ ranges in $\{0, \ldots, m/4\}$. The case $l = 0$ is trivial. For $l = 1$ and $m = 4, 8, \ldots$, one evaluates sin and cos recursively: for $m = 4$ one has $\cos(\pi/4) = \sin(\pi/4) = 1/\sqrt{2}$, and for $m > 4$ a power of 2 one uses the half angle formulas
$$\cos\Bigl(\frac{x}{2}\Bigr) = \sqrt{\frac{1 + \cos(x)}{2}}, \qquad \sin\Bigl(\frac{x}{2}\Bigr) = \frac{1}{2}\,\frac{\sin(x)}{\cos(x/2)}.$$
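The recursion for $l = 1$ can be sketched as follows. Starting from $\cos(\pi/4) = \sin(\pi/4) = 1/\sqrt{2}$, each step halves the angle using only additions, divisions and square roots, which are exactly the operations for which rounding is specified. The function name is ours, and plain floating point replaces interval arithmetic here.

```python
import math

def cos_sin_pi_over(m):
    """Return (cos(pi/m), sin(pi/m)) for m = 4, 8, 16, ... by repeated
    use of the half-angle formulas."""
    if m < 4 or m & (m - 1):
        raise ValueError("m must be a power of 2 with m >= 4")
    c = s = 1.0 / math.sqrt(2.0)  # cos(pi/4) = sin(pi/4)
    k = 4
    while k < m:
        c_half = math.sqrt((1.0 + c) / 2.0)  # cos(x/2)
        s = 0.5 * s / c_half                 # sin(x/2) = sin(x) / (2 cos(x/2))
        c = c_half
        k *= 2
    return c, s

print(cos_sin_pi_over(1024))
```

The expression used for the sine avoids the subtraction $1 - \cos(x)$, which would lose accuracy for small angles.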
Finally, for $l > 1$, one applies the double angle formulas.

Remark. Bounds on the Fourier transform and inverse Fourier transform are implemented in the procedure fft according to the FFT algorithm. We do not enter into the details of this algorithm, and refer the reader to [PFTV], from which the code has been adapted to interval analysis using the bounds described above. Adapting again to interval analysis a code from [PFTV], a bound on the discrete convolution $r \mapsto r * r$ is implemented in the procedure fastconvolution1, while the general case $(r, s) \mapsto r * s$ is implemented in fastconvolution2. Those bounds are restricted to vectors whose dimension is a power of 2. In the sequel, we actually compute the discrete convolution of vectors of the form $r = (0, r_1, \ldots, r_n, 0)$, $n$ a power of 2. The first two elements and last two elements of such a convolution are trivially zero and are updated directly in the procedures fastconvolution1 and fastconvolution2. For convenience later on, see Section 4.4, we also add one zero element at the beginning and at the end of the result.
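The zero-padding relation between the discrete convolution and the Fourier transform can be checked with a short recursive radix-2 transform. This is a minimal floating-point sketch, not the interval version adapted from [PFTV]; the function names and the sign convention (positive exponent in the forward transform) are assumptions of the example.

```python
import cmath

def fft(z, sign=+1):
    """Radix-2 transform F(z)_k = sum_j exp(sign*2*pi*i*j*k/n) z_j,
    for n = len(z) a power of 2, by splitting into even and odd entries."""
    n = len(z)
    if n == 1:
        return list(z)
    if n % 2:
        raise ValueError("length must be a power of 2")
    even = fft(z[0::2], sign)
    odd = fft(z[1::2], sign)
    out = [0j] * n
    for k in range(n // 2):
        w = cmath.exp(sign * 2j * cmath.pi * k / n) * odd[k]
        out[k] = even[k] + w
        out[k + n // 2] = even[k] - w
    return out

def convolve_direct(r, s):
    """Discrete convolution with O(n^2) operations."""
    out = [0.0] * (len(r) + len(s) - 1)
    for i, ri in enumerate(r):
        for j, sj in enumerate(s):
            out[i + j] += ri * sj
    return out

def convolve_fft(r, s):
    """Same result via zero-padding to length 2n and pointwise products."""
    n = len(r)
    if n == 0 or len(s) != n or n & (n - 1):
        raise ValueError("both vectors need the same power-of-2 length")
    fr = fft(list(r) + [0.0] * n, +1)
    fs = fft(list(s) + [0.0] * n, +1)
    inverse = fft([a * b for a, b in zip(fr, fs)], -1)
    return [(v / (2 * n)).real for v in inverse[: 2 * n - 1]]

r = [1.0, 2.0, 3.0, 4.0]
s = [0.5, -1.0, 2.0, 0.0]
print(convolve_direct(r, s))
print(convolve_fft(r, s))
```

The padding to length $2n$ is what prevents the cyclic wrap-around of the transform from mixing the entries of the convolution.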
3.2. Standard Sets of $\mathcal{B}_{\alpha\beta}$

We now describe the standard sets of the Banach spaces $\mathcal{B}_{\alpha\beta}$. As mentioned above, the choice of these sets should be adapted to the problem in order to optimize the bounds that one needs to construct. Although functions in $\mathcal{B}_{\alpha\beta}$ are in general irregular, the fixed points of the maps $\mathcal{N}_\lambda$ are smooth. Furthermore, these maps are continuous and preserve the regularity. Therefore, we will take for our standard sets of $\mathcal{B}_{\alpha\beta}$ balls centered at regular functions. To represent a regular function on the computer, we will rely on the approximation scheme of spline interpolation. A spline function of order $n$ is a function in $C^{n-1}$ which is piecewise polynomial, each of the polynomials being of degree $n$. For our purpose, it is sufficient to consider splines of order one as the centers of our standard sets, i.e., continuous piecewise affine functions. This choice is a compromise between the quality of the approximation and the simplicity of the bounds that we will have to construct. Note that increasing the order of the interpolation does not lead in general to better approximations. Indeed, for a function $f \in C^\infty([a, b])$ and a typical partition of $[a, b]$ with mesh size $\varepsilon > 0$, the associated interpolation of $f$ by a spline $g$ of order $n - 1$ satisfies for a norm of $L^1$-type
$$\|f - g\| \approx \varepsilon^n \|f^{(n)}\|.$$
Hence, depending upon the behavior of $f^{(n)}$, it can become better to consider finer partitions of $[a, b]$ rather than to increase the order of the interpolation. We now introduce a few objects that will be used to define the standard sets of $\mathcal{B}_{\alpha\beta}$.
Definition 3.4. For $n \ge 2$, we denote by $\mathcal{P}_n$ the set of all partitions $p$ of $\mathbb{R}_+$ of the form
$$p = \{0 < x_0 < x_1 < \ldots < x_n < \infty\}. \qquad (3.7)$$
Furthermore, we denote by $\mathcal{P}^u_n$ the subset of $\mathcal{P}_n$ made of uniform partitions, i.e., partitions $p = \{x_i\}_{i=0}^n \in \mathcal{P}_n$ satisfying $x_i - x_{i-1} = \varepsilon$, $i = 1, \ldots, n$, for some $\varepsilon > 0$.

The uniform partitions have been introduced in order to simplify the implementation of a bound on the convolution operator. For $p = \{x_i\}_{i=0}^n \in \mathcal{P}_n$ and $\lambda > 0$, we adopt the convention to denote by $\lambda p$ the partition $\{\lambda x_i\}_{i=0}^n$. Next, we describe more precisely the piecewise affine functions we will work with.

Definition 3.5. We define $\mathcal{A}$ to be the set of all functions $\rho \in C^0(\mathbb{R}_+)$ for which there is an $n \ge 2$ and a partition $p = \{x_i\}_{i=0}^n \in \mathcal{P}_n$ such that $\rho$ is affine on $[x_{i-1}, x_i]$ for $i = 1, \ldots, n$ and $\rho(x) = 0$ for $x \notin (x_0, x_n)$. Furthermore, $\mathcal{A}^u$ denotes the subset of $\mathcal{A}$ consisting of those functions for which $p$ can be chosen uniform.

We note that $\mathcal{A}, \mathcal{A}^u \subset \mathcal{B}_{\alpha\beta}$ for all $\alpha, \beta > 0$. Given a partition $p = \{x_i\}_{i=0}^n \in \mathcal{P}_n$ and a set of values $v = \{v_i\}_{i=0}^n \in \mathbb{R}^{n+1}$ satisfying $v_0 = v_n = 0$, we denote by $\mathcal{T}_1(p, v)$
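the linear interpolation of $(p, v)$ in $\mathcal{A}$, i.e.,
$$\mathcal{T}_1(p, v)(x) = \begin{cases} v_i + \dfrac{v_{i+1} - v_i}{x_{i+1} - x_i}\,(x - x_i) & \text{for } x \in [x_i, x_{i+1}] \text{ and } i \in \{0, \ldots, n-1\}, \\ 0 & \text{otherwise.} \end{cases} \qquad (3.8)$$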
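A direct transcription of (3.8) into code may help fix the conventions: the nodes are strictly increasing, the values vanish at the two end nodes, and the function is zero outside the partition. The sketch below uses plain floats and a hypothetical function name `t1`; it is not the interval representation used in the program.

```python
from bisect import bisect_right

def t1(p, v):
    """Return the piecewise affine function through the points (p[i], v[i]),
    extended by zero outside [p[0], p[-1]]."""
    if len(p) != len(v) or len(p) < 3:
        raise ValueError("need n + 1 >= 3 nodes and as many values")
    if any(a >= b for a, b in zip(p, p[1:])):
        raise ValueError("nodes must be strictly increasing")

    def rho(x):
        if x < p[0] or x > p[-1]:
            return 0.0
        # index i of the interval [p[i], p[i+1]] containing x
        i = min(bisect_right(p, x) - 1, len(p) - 2)
        slope = (v[i + 1] - v[i]) / (p[i + 1] - p[i])
        return v[i] + slope * (x - p[i])

    return rho

rho = t1([1.0, 2.0, 3.0, 4.0], [0.0, 2.0, 1.0, 0.0])
print(rho(1.5), rho(2.0), rho(3.5), rho(0.5))
```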
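Conversely, associated with every function $\rho \in \mathcal{A}$, there is a pair $(p, v)$ in $\mathcal{P}_n \times \mathbb{R}^{n+1}$, for some $n \ge 2$, satisfying $\mathcal{T}_1(p, v) = \rho$. If $\rho \ne 0$ and if one imposes a minimality condition on $n$, then the associated pair $(p, v)$ is unique.

Definition 3.6. For $\rho \ne 0$ a function in $\mathcal{A}$, let $n(\rho) = \min\{n \ge 2 \mid \exists\, (p, v) \in \mathcal{P}_n \times \mathbb{R}^{n+1} \text{ such that } \mathcal{T}_1(p, v) = \rho\}$, and define $\pi(\rho)$ to be the (unique) element of $\mathcal{P}_{n(\rho)} \times \mathbb{R}^{n(\rho)+1}$ satisfying
$$\mathcal{T}_1(\pi(\rho)) = \rho.$$
Note that by definition of $\mathcal{A}$, one always has $\pi(\rho) = (\cdot, \{\rho_i\}_{i=0}^{n(\rho)})$ with $\rho_0 = \rho_{n(\rho)} = 0$. In order to define the standard sets of $\mathcal{A}$ and $\mathcal{A}^u$, we need to choose the standard sets of $\mathcal{P}_n$ and $\mathcal{P}^u_n$.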
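Definition 3.7. For $n \ge 2$, we define $\mathrm{std}(\mathcal{P}_n)$ to be the collection of all sets $(X_0, \ldots, X_n)$ of the form
$$(X_0, \ldots, X_n) = \{\{x_i\}_{i=0}^n \in \mathcal{P}_n \mid x_0 \in X_0, \ldots, x_n \in X_n\}, \qquad (3.9)$$
with $X_0, \ldots, X_n$ any increasing sequence of $n + 1$ pairwise disjoint elements of $\mathrm{std}(\mathbb{R}_+)$. Similarly, we define $\mathrm{std}(\mathcal{P}^u_n)$ as the collection of all sets $(A, E)$ of the form
$$(A, E) = \{p \in \mathcal{P}^u_n \mid p = \{a + i\varepsilon\}_{i=0}^n,\ a \in A \text{ and } \varepsilon \in E\}, \qquad (3.10)$$
with $A, E \in \mathrm{std}(\mathbb{R}^*_+)$.

Note that $\mathrm{std}(\mathcal{P}^u_n)$ is not a subset of $\mathrm{std}(\mathcal{P}_n)$. Indeed, the sets $(A, E)$ contain only uniform partitions, whereas there are always non-uniform partitions in each set $(X_0, \ldots, X_n)$ which is not a singleton. The standard sets of $\mathcal{A}$ and $\mathcal{A}^u$ are defined in terms of $\mathrm{std}(\mathcal{P}_n)$ and $\mathrm{std}(\mathcal{P}^u_n)$ as follows.

Definition 3.8. Let $N = 2^{20}$. We define $\mathrm{std}(\mathcal{A})$, respectively $\mathrm{std}(\mathcal{A}^u)$, to be the collection of all sets $(P, V)$ of the form
$$(P, V) = \{\rho \in \mathcal{A} \mid \rho = \mathcal{T}_1(p, v),\ p \in P \text{ and } v \in V\}, \qquad (3.11)$$
with $P \in \mathrm{std}(\mathcal{P}_n)$, respectively $P \in \mathrm{std}(\mathcal{P}^u_n)$, $V \in \mathrm{std}(\mathbb{R}^{n+1})$ and $2 \le n \le N$.

Finally, we introduce the standard sets of $\mathcal{B}_{\alpha\beta}$. They will be of two types, denoted by $\mathrm{std}(\mathcal{B}_{\alpha\beta})$ and $\mathrm{std}(\mathcal{B}_{\alpha\beta})^u$.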
Definition 3.9. Let $\alpha \ge 0$ and $\beta > 0$. We define $\mathrm{std}(\mathcal{B}_{\alpha\beta})$, respectively $\mathrm{std}(\mathcal{B}_{\alpha\beta})^u$, to be the collection of all sets $(P, V, G)$ of the form
$$(P, V, G) = \{f \in \mathcal{B}_{\alpha\beta} \mid f = \rho + g,\ \rho \in (P, V),\ g \in \mathcal{B}_{\alpha\beta} \text{ and } \|g\|_{\alpha\beta} \le G\}, \qquad (3.12)$$
with $(P, V) \in \mathrm{std}(\mathcal{A})$, respectively $(P, V) \in \mathrm{std}(\mathcal{A}^u)$, and $G \in \mathcal{S}$, $G \ge 0$.

Hence, a set $(P, V, G)$ is the union of all balls of radius $G$ that are centered at piecewise affine functions belonging to $(P, V)$.

Remark. In our program, the data type with which a set $(P, V, G)$ is represented, with $P \in \mathrm{std}(\mathcal{P}_n)$ and $V \in \mathrm{std}(\mathbb{R}^{n+1})$, is a $2 \times (n + 2)$ matrix, say f, with entries of complex data type. (Recall that we use the Fortran data type for complex numbers to represent the elements of $\mathrm{std}(\mathbb{R})$ and $\mathrm{std}(\mathbb{R}_+)$.) The entries f(1,0) up to f(1,n) contain $V$, and f(1,n+1) contains the interval $[0, G]$. The entries f(0,0) up to f(0,n) contain the partition $P$. If $P = (A, E) \in \mathrm{std}(\mathcal{P}^u_n)$, then f(0,0) = $A$ and f(0,n+1) = $E$. If $P \notin \mathrm{std}(\mathcal{P}^u_n)$, then f(0,n+1) = $[0, 0]$. Given an integer $n \ge 2$ and $a, \varepsilon \in \mathcal{S}$, $a, \varepsilon > 0$, the procedure fzero returns a standard set $(P, V, G) \in \mathrm{std}(\mathcal{B}_{\alpha\beta})^u$ where $G = 0$, $V = ([0, 0], \ldots, [0, 0])$ and $P$ contains the partition $\{a + i\varepsilon\}_{i=0}^n$. Finally, given $(P, V, G) \in \mathrm{std}(\mathcal{B}_{\alpha\beta})$ and an integer $i$, the procedure get_f_on_i returns two elements of $\mathrm{std}(\mathbb{R}_+)$ and two elements of $\mathrm{std}(\mathbb{R})$ containing respectively $x_{i-1}$, $x_i$, $\rho_{i-1}$ and $\rho_i$ for every $\rho \in (P, V)$, $\pi(\rho) = (\{x_j\}, \{\rho_j\})_{j=0}^n$.

We end this section with a few comments about the strategy that we will adopt when constructing bounds on the various maps that enter the definition of $\mathcal{N}_\lambda$. Some of these maps are linear and preserve $\mathcal{A}$. Let $\mathcal{L}$ be such a map. Then, for $f = \rho + g$ with $\rho \in \mathcal{A}$, the piecewise affine part $\rho$ and the general term $g$ can be treated separately, and since the piecewise affine parts carry the relevant information, it is natural to describe the affine part of $\mathcal{L}(f)$ by $\mathcal{L}(\rho)$ and its general term by $\mathcal{L}(g)$. Moreover, the choice of the standard set image in $\mathrm{std}(\mathcal{A})$ containing $\mathcal{L}(\rho)$ is straightforward. For instance, the product of a function $f \in \mathcal{B}_{\alpha\beta}$ by a scalar $\lambda \in \mathbb{R}$ is bounded using $\pi(\lambda\rho) = (p, \lambda v)$, where $(p, v) = \pi(\rho)$, and $\|\lambda g\|_{\alpha\beta} = |\lambda|\,\|g\|_{\alpha\beta}$. This bound is implemented in the procedure fmult.
In general, however, the maps that will be considered do not preserve $\mathcal{A}$. Let $\mathcal{U} : \mathcal{B}_{\alpha\beta} \to \mathcal{B}_{\mu\nu}$ be such a transformation. For $f = \rho + g$ with $\rho \in \mathcal{A}$ and $g \in \mathcal{B}_{\alpha\beta}$, we write
$$\mathcal{U}(\rho + g) = \mathcal{U}(\rho) + \mathcal{W}(\rho, g).$$
Since $\mathcal{U}(\rho) \notin \mathcal{A}$, we will consider the linear interpolation $\tilde\rho \in \mathcal{A}$ of $\mathcal{U}(\rho)$ at well chosen points. This choice will usually be a compromise between the quality of the approximation and the simplicity of the implementation. One then has
$$\mathcal{U}(\rho + g) = \tilde\rho + \mathcal{W}(\rho, g) + (\mathcal{U}(\rho) - \tilde\rho),$$
and the last two terms on the RHS correspond to the general term $\tilde g$ of $\mathcal{U}(\rho + g)$. They will be bounded using first
$$\|\tilde g\|_{\mu\nu} \le \|\mathcal{W}(\rho, g)\|_{\mu\nu} + \|\mathcal{U}(\rho) - \tilde\rho\|_{\mu\nu}.$$
Next, explicit formulas involving $\rho$, $g$ and the values of $\mathcal{U}(\rho)$ at the chosen interpolation points, together with the use of interval analysis, will lead to a rigorous upper bound on the previous expression, and hence to the representable $G \in \mathcal{S}$ entering Definition 3.9. We note that the elements of $\mathrm{std}(\mathbb{R})$ defining the piecewise affine function $\tilde\rho$ consist in general of intervals of nonzero length. Nevertheless, since the bound $G$ has been computed for all reals in those intervals, one can "close" each of them by picking arbitrarily one of the representable numbers it contains. This will prevent the standard sets containing the piecewise affine part from "opening up" substantially when bounds are composed, in particular when evaluating convolution products.
4. Operations Involving Functions

In this section, we construct bounds (in the sense of Section 3) on the various maps that enter the definition of the transformations $\mathcal{N}_\lambda$. Most of the bounds given here follow from direct calculations and are easy to prove. We have grouped some of these calculations in the appendix. In the following, we will usually consider $f \in \mathcal{B}_{\alpha\beta}$ of the form $f = \rho + g$, where $\rho \in \mathcal{A}$ will always stand for the piecewise affine part of $f$ and $g \in \mathcal{B}_{\alpha\beta}$ for the general term of $f$. Furthermore, when not explicitly mentioned otherwise, $n(\rho)$ and $\pi(\rho) = (p, v)$ will be denoted by $n$ and $(\{x_i\}_{i=0}^n, \{\rho_i\}_{i=0}^n)$, respectively. Finally, we denote the interval $[x_{i-1}, x_i]$ by $I_i$, $i = 1, \ldots, n$.
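4.1. Elementary Operations

We start with the map $f \mapsto \|f\|_{\alpha\beta}$, a bound on which is constructed from $\mathrm{std}(\mathcal{B}_{\alpha\beta})$ to $\mathrm{std}(\mathbb{R}_+)$ using the triangle inequality and, for the piecewise affine part, the estimate
$$\|\rho\|_{\alpha\beta} = \sum_{i=1}^{n} \int_{I_i} w_{\alpha\beta}(x)\,|\rho(x)|\,dx \le \sum_{i=1}^{n} \sup_{x \in I_i} w_{\alpha\beta}(x)\,(x_i - x_{i-1})\,\frac{|\rho_{i-1}| + |\rho_i|}{2}. \qquad (4.1)$$
The convexity of $w_{\alpha\beta}$ leads to
$$\sup_{x \in I_i} w_{\alpha\beta}(x) = \max\{w_{\alpha\beta}(x_{i-1}), w_{\alpha\beta}(x_i)\}. \qquad (4.2)$$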
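The estimate for the piecewise affine part can be tried out numerically. The sketch below assumes the weight $w_{\alpha\beta}(x) = \exp(\alpha/x + \beta x)$, which is consistent with the suprema (4.5) and (4.6) quoted in this section but is not restated here, and compares the interval-wise upper bound with a midpoint-rule approximation of the weighted integral. Function names are ours and rounding errors are not controlled.

```python
import math

def w(x, alpha, beta):
    """Assumed weight exp(alpha/x + beta*x); convex on x > 0."""
    return math.exp(alpha / x + beta * x)

def norm_upper_bound(p, v, alpha, beta):
    """Sum over intervals of sup(w) * length * (|v_left| + |v_right|) / 2,
    with sup(w) taken at an endpoint by convexity."""
    total = 0.0
    for i in range(1, len(p)):
        sup_w = max(w(p[i - 1], alpha, beta), w(p[i], alpha, beta))
        total += sup_w * (p[i] - p[i - 1]) * (abs(v[i - 1]) + abs(v[i])) / 2.0
    return total

def norm_numeric(p, v, alpha, beta, samples=2000):
    """Midpoint-rule approximation of the weighted L1 norm."""
    total = 0.0
    for i in range(1, len(p)):
        length = p[i] - p[i - 1]
        h = length / samples
        for k in range(samples):
            x = p[i - 1] + (k + 0.5) * h
            t = (x - p[i - 1]) / length
            rho = v[i - 1] + t * (v[i] - v[i - 1])
            total += w(x, alpha, beta) * abs(rho) * h
    return total

nodes = [0.5, 1.0, 1.5, 2.0]
values = [0.0, 1.0, -0.5, 0.0]
print(norm_numeric(nodes, values, 0.5, 0.9), norm_upper_bound(nodes, values, 0.5, 0.9))
```

The gap between the two numbers reflects the replacement of the weight by its supremum on each interval, so it shrinks as the partition is refined.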
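We next consider the mass $M(f)$ and the expectation $E(f)$ of a function $f \in \mathcal{B}_{\alpha\beta}$, for which it will be sufficient to construct bounds from $\mathrm{std}(\mathcal{B}_{\alpha\beta})^u$ to $\mathrm{std}(\mathbb{R})$. By linearity, we can first treat separately the affine part $\rho$, and by using $\rho_0 = \rho_n = 0$, a direct calculation yields
$$M(\rho) = \varepsilon \sum_{i=1}^{n-1} \rho_i, \qquad (4.3)$$
$$E(\rho) = \varepsilon \sum_{i=1}^{n-1} \rho_i x_i, \qquad (4.4)$$
where $\varepsilon = x_1 - x_0$. Using $M(\rho) - |M(g)| \le M(f) \le M(\rho) + |M(g)|$ and the corresponding inequality for $E(f)$, we get the desired bounds by estimating the mass and expectation of the general term $g$ with (2.12) and (2.13). The supremum of $1/w_{\alpha\beta}$ appearing in (2.12) is taken at $x_c = \sqrt{\alpha/\beta}$, which leads to
$$\sup_{x>0}\Bigl(\frac{1}{w_{\alpha\beta}(x)}\Bigr) = \exp(-2\sqrt{\alpha\beta}). \qquad (4.5)$$
Similarly, one computes
$$\sup_{x>0}\Bigl(\frac{x}{w_{\alpha\beta}(x)}\Bigr) = \frac{1 + \sqrt{1 + 4\alpha\beta}}{2\beta}\,\exp\bigl(-\sqrt{1 + 4\alpha\beta}\bigr). \qquad (4.6)$$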
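The closed forms for the mass and expectation on a uniform partition, and the two suprema, are easy to sanity-check. The sketch below again assumes the weight $w_{\alpha\beta}(x) = \exp(\alpha/x + \beta x)$, which is what makes both suprema come out as stated; all function names are hypothetical and ordinary floating point is used.

```python
import math

def mass(eps, values):
    """eps * sum of interior values, for values[0] = values[-1] = 0."""
    return eps * sum(values[1:-1])

def expectation(x0, eps, values):
    """eps * sum of interior values times their nodes x0 + i*eps."""
    return eps * sum(values[i] * (x0 + i * eps) for i in range(1, len(values) - 1))

def sup_inverse_weight(alpha, beta):
    """Closed form of sup over x > 0 of 1 / w(x), attained at sqrt(alpha/beta)."""
    return math.exp(-2.0 * math.sqrt(alpha * beta))

def sup_x_over_weight(alpha, beta):
    """Closed form of sup over x > 0 of x / w(x)."""
    s = math.sqrt(1.0 + 4.0 * alpha * beta)
    return (1.0 + s) / (2.0 * beta) * math.exp(-s)

def grid_sup(func, start=0.001, stop=20.0, step=0.001):
    """Crude maximum of func over a uniform grid, for comparison only."""
    count = int(round((stop - start) / step))
    return max(func(start + k * step) for k in range(count + 1))

vals = [0.0, 1.0, 3.0, 2.0, 0.0]
print(mass(0.5, vals), expectation(1.0, 0.5, vals))
print(sup_inverse_weight(0.5, 0.9), sup_x_over_weight(0.5, 0.9))
```

The mass formula is the trapezoidal rule, which is exact for piecewise affine functions, and the expectation formula follows because each hat function of width $2\varepsilon$ has mass $\varepsilon$ and mean equal to its center.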
We end this section with the discussion of a bound on the addition of two functions $f_1, f_2 \in \mathcal{B}_{\alpha\beta}$. Due to the linearity of this map and the fact that the addition of two functions in $\mathcal{A}$ is again in $\mathcal{A}$, it is natural to choose for the general term of $f_1 + f_2$ the addition of the general terms of those two functions, whose norm is bounded by using the triangle inequality. It then remains to construct a bound on the map $+ : \mathcal{A} \times \mathcal{A} \to \mathcal{A}$. Let $\rho_1, \rho_2 \in \mathcal{A}$ be such that $\pi(\rho_1) = (p_1, v_1) \in \mathcal{P}_n \times \mathbb{R}^{n+1}$ and $\pi(\rho_2) = (p_2, v_2) \in \mathcal{P}_m \times \mathbb{R}^{m+1}$. If $p_1$ and $p_2$ have no common nodes, then $(p, w) = \pi(\rho_1 + \rho_2) \in \mathcal{P}_{n+m+1} \times \mathbb{R}^{n+m+2}$, with $p$ being the refined partition made of the ordered union of $p_1$ and $p_2$. This is valid only if $p_1$ and $p_2$ have no common nodes, and we shall construct a bound whose domain is restricted to such cases. Hence, denoting $p = \{y_i\}_{i=0}^{n+m+1}$ and $w = \{w_i\}_{i=0}^{n+m+1}$, one defines for each $i = 0, \ldots, n+m+1$,
$$w_i = (\rho_1 + \rho_2)(y_i) = \begin{cases} (v_1)_j + \rho_2((p_1)_j) & \text{if } \exists\, j \text{ such that } y_i = (p_1)_j, \\ (v_2)_j + \rho_1((p_2)_j) & \text{if } \exists\, j \text{ such that } y_i = (p_2)_j. \end{cases}$$
To implement this bound with interval analysis, we must check first that the nodes in $\mathrm{std}(\mathbb{R}_+)$ of the standard sets $P_1 \in \mathrm{std}(\mathcal{P}_n)$ and $P_2 \in \mathrm{std}(\mathcal{P}_m)$ containing the partitions $p_1$ and $p_2$ are pairwise disjoint intervals. This implies that every function in $(P_1, V_1)$ is linear on each node of $P_2$, and vice versa. This in turn implies that a bound on the evaluations $\rho_2((p_1)_j)$ and $\rho_1((p_2)_j)$ is readily obtained from (3.8) using interval analysis. We note finally that when the standard set containing $\rho_1$ ($\rho_2$) is in $\mathrm{std}(\mathcal{A}^u)$, i.e., with $P_1$ ($P_2$) of the form $(A, E)$, we first proceed to the evaluation of the nodes in terms of $A$ and $E$.
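The addition on the merged partition can be sketched with exact nodes in place of interval nodes. The helper names below are ours; the function refuses partitions with common nodes, mirroring the restriction of the domain described above.

```python
def interp(p, v, x):
    """Evaluate the piecewise affine function through (p[i], v[i]) at x,
    with value zero outside [p[0], p[-1]]."""
    if x < p[0] or x > p[-1]:
        return 0.0
    for i in range(len(p) - 1):
        if p[i] <= x <= p[i + 1]:
            return v[i] + (v[i + 1] - v[i]) / (p[i + 1] - p[i]) * (x - p[i])

def add_piecewise_affine(p1, v1, p2, v2):
    """Return (nodes, values) of the sum on the ordered union of the nodes."""
    if set(p1) & set(p2):
        raise ValueError("partitions with common nodes are not handled")
    nodes = sorted(p1 + p2)
    values = [interp(p1, v1, y) + interp(p2, v2, y) for y in nodes]
    return nodes, values

nodes, values = add_piecewise_affine(
    [1.0, 2.0, 3.0], [0.0, 1.0, 0.0],
    [1.5, 2.5, 3.5], [0.0, 2.0, 0.0],
)
print(nodes)
print(values)
```

With $n = m = 2$ the result has $n + m + 2 = 6$ nodes, in line with the count given in the text.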
Finally, one constructs a bound on the difference of two functions by composing the previous bound with a bound on the unary minus $\mathcal{B}_{\alpha\beta} \ni f \mapsto -f$ obtained from $\|-g\|_{\alpha\beta} = \|g\|_{\alpha\beta}$ and $\pi(-\rho) = (p, -v)$, where $(p, v) = \pi(\rho)$.

Remark. A bound on the weight $w_{\alpha\beta}$ is computed in the procedure sw. The inequality (4.1) is implemented in snorm_pl, whereas (4.2) and (4.6) are implemented in ssup_of_w and ssup_of_x_over_w, respectively. For several intervals $I \subset \mathbb{R}_+$, we will need to evaluate later the quantities $\sup_{x \in I} 1/w_{\alpha\beta}(x)$, $\int_I w$, and $|I|^{-1} \int_I w$, where $|I|$ denotes the length of $I$. By using the convexity of $w$, a bound on the first quantity is computed in ssup_of_winverse, whereas the other two quantities are bounded in sint_of_w. The other bounds described in this section are implemented in the procedures snorm, smass, sexpectation, fadd and fdiff.
4.2. The Scaling Operator

It will be sufficient for our purpose to construct a bound on
$$S_\lambda : \mathcal{B}_{\alpha\beta} \to \mathcal{B}_{(\alpha/4),\gamma}, \qquad f(x) \mapsto h(x) = \lambda f(\lambda x), \qquad (4.7)$$
acting from $\mathrm{std}(\mathcal{B}_{\alpha\beta})^u$ to $\mathrm{std}(\mathcal{B}_{(\alpha/4),\gamma})^u$. We recall that the scaling operators are bounded under constraints which translate in this particular case into $\lambda \le 4$ (if $\alpha > 0$) and $\gamma \le \lambda\beta$. It will be checked by the program that these inequalities are satisfied for the values of $\lambda$, $\alpha$, $\beta$ and $\gamma$ we will use. Since $S_\lambda$ is linear and preserves $\mathcal{A}^u$, we can treat separately the piecewise affine part and the general term. A bound on $S_\lambda : \mathcal{A}^u \to \mathcal{A}^u$ is obtained from the relation $\pi(S_\lambda \rho) = (p/\lambda, \lambda v)$, where $(p, v) = \pi(\rho)$. For the general term we estimate, using (4.5),
$$\|S_\lambda g\|_{(\alpha/4),\gamma} \le \sup_{x>0} \frac{w_{(\alpha/4),\gamma}(x/\lambda)}{w_{\alpha\beta}(x)}\,\|g\|_{\alpha\beta} = \exp\Bigl(-2\sqrt{\alpha(1 - \lambda/4)(\beta - \gamma/\lambda)}\Bigr)\,\|g\|_{\alpha\beta}, \qquad (4.8)$$
the last equality being valid under the conditions on $\lambda$, $\alpha$, $\beta$ and $\gamma$ mentioned above. Note that the scaling operator (4.7) is a strict contraction for $\gamma/\beta < \lambda < 4$. This will be used to improve the bound on $\mathcal{N}_\lambda(f) = S_\lambda f * S_\lambda f$ that we shall construct later. In (4.7), taking a larger target space, i.e., $\mathcal{B}_{(\alpha/a),\gamma}$ with $a > 4$, would lead to a better contraction. Nevertheless, $a = 4$ is the largest value for which $\mathcal{N}_\lambda$ maps $\mathcal{B}_{\alpha\beta}$ into $\mathcal{B}_{\alpha\gamma}$, cf. (2.8).

Remark. A bound on $S_\lambda : \mathcal{A}^u \to \mathcal{A}^u$ is implemented in the procedure fscale_pl, and the bound (4.8) in fscale_gen. Those two procedures are called in fscale to build the desired bound on the operator (4.7).
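The contraction factor in (4.8) can be compared with a brute-force supremum of the ratio of weights. The sketch assumes, as elsewhere in these examples, the weight $w_{\alpha\beta}(x) = \exp(\alpha/x + \beta x)$; under that assumption the ratio equals $1/w_{a,b}(x)$ with $a = \alpha(1 - \lambda/4)$ and $b = \beta - \gamma/\lambda$, so the closed form follows from (4.5). The function names and parameter values are illustrative only.

```python
import math

def scaling_factor(lam, alpha, beta, gamma):
    """Closed-form factor exp(-2 sqrt(alpha (1 - lam/4) (beta - gamma/lam)))."""
    if not (0.0 < lam <= 4.0 and gamma <= lam * beta):
        raise ValueError("need 0 < lam <= 4 and gamma <= lam * beta")
    return math.exp(-2.0 * math.sqrt(alpha * (1.0 - lam / 4.0) * (beta - gamma / lam)))

def weight_ratio(x, lam, alpha, beta, gamma):
    """w_{alpha/4, gamma}(x / lam) / w_{alpha, beta}(x) for the assumed weight."""
    numerator = (alpha / 4.0) / (x / lam) + gamma * (x / lam)
    denominator = alpha / x + beta * x
    return math.exp(numerator - denominator)

def grid_sup_ratio(lam, alpha, beta, gamma, start=0.001, stop=20.0, step=0.001):
    """Crude maximum of the ratio over a uniform grid, for comparison only."""
    count = int(round((stop - start) / step))
    return max(weight_ratio(start + k * step, lam, alpha, beta, gamma)
               for k in range(count + 1))

print(scaling_factor(2.0, 0.5, 0.9, 0.9), grid_sup_ratio(2.0, 0.5, 0.9, 0.9))
```

For these sample values the factor is roughly 0.51, a strict contraction as expected for $\gamma/\beta < \lambda < 4$.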
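4.3. The Operator T

We now construct a bound on the operator
$$T : \mathcal{B}_{\alpha\beta} \to \mathcal{B}_{\beta\alpha}, \qquad f(x) \mapsto h(x) = \frac{1}{x^2}\,f\Bigl(\frac{1}{x}\Bigr), \qquad (4.9)$$
acting from $\mathrm{std}(\mathcal{B}_{\alpha\beta})^u$ to $\mathrm{std}(\mathcal{B}_{\beta\alpha})$. For $f = \rho + g$ with $\rho \in \mathcal{A}^u$ and $g \in \mathcal{B}_{\alpha\beta}$, one has $Tf = T\rho + Tg$. Since $T\rho$ is not piecewise linear, we must first choose a function $\tilde\rho \in \mathcal{A}$ which approximates $T\rho$. Denoting again $\pi(\rho) = (\{x_i\}_{i=0}^n, \{\rho_i\}_{i=0}^n)$, we consider for $\tilde\rho$ the linear interpolation of $T\rho$ at the nodes
$$\tilde x_i = 1/x_{n-i}, \qquad i = 0, \ldots, n. \qquad (4.10)$$
Therefore, we define $\tilde\rho$ to be
$$\tilde\rho = \mathcal{T}_1(\tilde p, \tilde v), \qquad (4.11)$$
where $\tilde p = \{\tilde x_i\}_{i=0}^n$, and $\tilde v = \{\tilde\rho_i\}_{i=0}^n$ with
$$\tilde\rho_i = (T\rho)(\tilde x_i) = x_{n-i}^2\,\rho_{n-i}, \qquad i = 0, \ldots, n. \qquad (4.12)$$
Next, the general term $\tilde g$ of $Tf$ is given by $\tilde g = T\rho - \tilde\rho + Tg$, and we use (2.10) to get
$$\|\tilde g\|_{\beta\alpha} \le \|T\rho - \tilde\rho\|_{\beta\alpha} + \|g\|_{\alpha\beta}. \qquad (4.13)$$
In order to bound the first term on the RHS of (4.13), we use again (2.10) together with the linearity of $T$ and the fact that it is an involution. This leads to
$$\|T\rho - \tilde\rho\|_{\beta\alpha} = \|\rho - T\tilde\rho\|_{\alpha\beta} \le \sum_{i=1}^{n} \sup_{x \in I_i} w_{\alpha\beta}(x) \int_{I_i} |(\rho - T\tilde\rho)(x)|\,dx. \qquad (4.14)$$
Finally, an explicit bound on the integral appearing in the previous expression follows from a direct calculation and is given in the next lemma.

Lemma 4.1. Let $\rho \in \mathcal{A}^u$, and $\tilde\rho$ be defined as in (4.11). With $\pi(\rho) = (\{x_i\}_{i=0}^n, \{\rho_i\}_{i=0}^n)$, $\varepsilon = x_1 - x_0$ and $I_i = [x_{i-1}, x_i]$, one has
j f \{p - Tp){x)\ dx<£-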
( l o g ( l + - £ - ) |p, - p . . ,J
ft
ft-l
I i ,,/•„.
../oil
ft
ft-1
h
//i i c \
Remark. A bound on $T : \mathcal{B}_{\alpha\beta} \to \mathcal{B}_{\beta\alpha}$ is implemented as described here in the procedure ft.
4.4. The Convolution

For $\alpha, \beta > 0$ and $\gamma \in [\alpha, 4\alpha]$, we consider in this section the operator
$$\mathcal{B}_{\alpha\beta} \times \mathcal{B}_{\alpha\beta} \to \mathcal{B}_{\gamma\beta}, \qquad (f, h) \mapsto f * h. \qquad (4.16)$$
As mentioned before, we specifically introduced standard sets of piecewise affine functions defined on uniform partitions in order to simplify the construction of a bound on the convolution. Hence, our bound will act from $\mathrm{std}(\mathcal{B}_{\alpha\beta})^u \times \mathrm{std}(\mathcal{B}_{\alpha\beta})^u$ to $\mathrm{std}(\mathcal{B}_{\gamma\beta})^u$. To simplify further the explicit expressions which we shall derive below, we restrict its domain to pairs $(F_1, F_2)$ for which the standard sets $(A_1, E_1)$ and $(A_2, E_2)$ containing the partitions associated with the affine functions in $F_1$ and $F_2$ satisfy $E_1 = E_2$ and both $E_1$ and $E_2$ are singletons. This ensures that all affine functions in $F_1$ and $F_2$ are defined on (uniform) partitions with identical mesh size. Let $f = \rho + g_f$ and $h = \sigma + g_h$, with $\rho, \sigma \in \mathcal{A}^u$ and $g_f, g_h \in \mathcal{B}_{\alpha\beta}$. Then, one has
$$f * h = \rho * \sigma + \rho * g_h + g_f * h. \qquad (4.17)$$
The relevant information is carried by the term $\rho * \sigma$. Since it does not belong to $\mathcal{A}$, we will proceed as in the previous section and approximate it by a function $\tilde\rho \in \mathcal{A}^u$. The general term of $f * h$ will be given by
$$\tilde g = (\rho * \sigma - \tilde\rho) + \rho * g_h + g_f * h. \qquad (4.18)$$
One can estimate the last two terms on the RHS of the previous expression using the bound (2.8). However, estimating the norm of the first term requires an explicit expression for $(\rho * \sigma)(x)$, $x > 0$. We now derive this expression, which will be used also to specify $\tilde\rho$. We first state an intermediate result whose proof can be found in the appendix.

Lemma 4.2. If $\rho, \sigma \in \mathcal{A}^u$ have uniform partitions with identical mesh size $\varepsilon$, then $(\rho * \sigma)'' \in \mathcal{A}^u$. Furthermore, assume $n(\rho) = n(\sigma)$ [...] Then $n((\rho * \sigma)'') = 2n$, and $(\rho * \sigma)'' = \mathcal{T}_1(\{z_k\}_{k=0}^{2n}, \{u_k\}_{k=0}^{2n})$, where
$$z_k = x_0 + y_0 + k\varepsilon, \qquad (4.20)$$
$$u_k = \frac{1}{\varepsilon}\,(s_{k+1} - 2s_k + s_{k-1}), \qquad (4.21)$$
with the convention $s_{-1} = s_{2n+1} = 0$.

We now specify the nature of $\rho * \sigma$.
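Lemma 4.3. Let $\rho$, [...] over all [...]

Let [...]
$$\int_a^b \varphi''(x)^2\,dx = \int_a^b \varphi_0''(x)^2\,dx + 2\int_a^b \varphi_0''(x)\,(\varphi - \varphi_0)''(x)\,dx + \int_a^b \bigl((\varphi - \varphi_0)''(x)\bigr)^2\,dx. \qquad (4.22)$$
We will see that the second term on the RHS is zero, yielding
$$\int_a^b \varphi''(x)^2\,dx \ge \int_a^b \varphi_0''(x)^2\,dx.$$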
The conclusion then follows from a well known result in spline theory, see for instance [N], which ensures that such a minimization problem has a unique solution given by the natural cubic spline interpolation of the data points entering the constraints of the minimization problem. It remains to see that the second term on the RHS of (4.22) is zero, i.e., that $\varphi_0''$ and $(\varphi - \varphi_0)''$ are orthogonal in $L^2[a, b]$. From Lemma 4.2, we have $\varphi_0'' \in \mathcal{A}_p$, where $\mathcal{A}_p$ denotes the subspace of $L^2[a, b]$ consisting of all functions $\tau \in \mathcal{A}$ with $\pi(\tau) = (p_\tau, \cdot)$ and $p_\tau$ a subpartition of $p$. Next, we observe that every $\psi \in C^2[a, b]$ with $\psi(z_k) = 0$, $k = 0, \ldots, 2n$, satisfies $\psi'' \in \mathcal{A}_p^\perp$: a basis of $\mathcal{A}_p$ is given by $\{\tau_k\}_{k=1}^{2n-1}$, $\tau_k$ being the "hat" function centered at $z_k$, i.e., with $\chi_I$ the characteristic function of the interval $I$,
$$\tau_k(x) = \chi_{[z_{k-1}, z_k]}(x)\,(x - z_{k-1}) + \chi_{(z_k, z_{k+1}]}(x)\,(z_{k+1} - x),$$
and a simple calculation using integration by parts leads to
$$\int_a^b \tau_k(x)\,\psi''(x)\,dx = 0,$$
$k = 1, \ldots, 2n - 1$. We conclude the proof by noting that the conditions of the minimization problem are [...]
As a consequence, it follows that p * a is given on each interval [Zfc,zfc+1], k = 0 , . . . , 2n — 1, by the cubic polynomial (p * a)(zk + 0) = C0{k) + dObJfl + C2(k)62 + C3(k)63 ,
(4.23)
6 6 [0, e], where the coefficients Ct(k) take the form Co(fc) = g(»*+i + 4»fc + »*-i)» CiW= gK+i-^-i)' c
k
2(
)
C3W
= ^( s fc+i =
(4.24)
2s
k + *fc-i).
g^2 (Sfc+2 ~~ 3 s fc+l +
3s
fc
_
S
fc-l)>
using again the convention s_1 = s2n+1 = 0. Indeed, C2(k) is just (p * a)"(zk)/2 and has been directly computed in Lemma 4.2. Using (p * cr)(z0) = (p * &)(z2n) — 0, the remaining coefficients are obtained from C2(k) and the formula for natural cubic spline interpolation, see for instance [ANW]. A natural choice for the affine part p of p *
\\g\\10 <\\p*o~ p\\lP + I M U I k l U + \\9f\\ap\Hap-
(4-26)
The bound on the m a p / i-> ||/|| Q a described earlier allows us to estimate the last two terms on the RHS of (4.26). A bound on the first term is obtained by a direct calculation using (4.25) and the explicit expression (4.23). The result is formulated in the next lemma. L e m m a 4.4. Let p, a and p defined as above. \\p**-p\\10<e3J2supw10(X)(hc2(2l (=0
where J, =
[z2l,z2l+2\.
X
^1'
Then
+ l)\ + ^(|C3(2/)|
+ |C 3 (2/ + 1 ) | ) ) . (4.27)
Remarks.
• By definition of $\mathcal{A}^u$, the first and last two elements of the discrete convolution (4.19) are trivially zero, so that only the convolution of $\{\rho_j\}_{j=1}^{n-1}$ and $\{\sigma_j\}_{j=1}^{n-1}$ needs to be computed. Furthermore, recall that in order to simplify the implementation of the bound on the discrete convolution, we have restricted its domain to the standard sets of std($\mathbb{R}^n$) for which $n$ is a power of 2. Hence, our bound on the convolution (4.16) is defined only on elements of std($\mathcal{B}_{\alpha\beta})^u$ with partitions in std($\mathcal{P}^u_n$) such that $n - 1$ is a power of 2.
• Bounds on the quantities $\varepsilon^{i-1} C_i(k)$, $i = 0, \ldots, 3$, respectively, are computed in the procedure cubic_spline_coeff and saved in the vectors st0, st1, st2 and st3. Note that the interpolation (4.25) and the bound (4.27) provide a bound on the convolution from std($\mathcal{A}^u$) $\times$ std($\mathcal{A}^u$) to std($\mathcal{B}_{\gamma\beta}$). It is implemented in the procedure fcubic_to_pwlinear. Finally, fconvolute2 computes the desired bound on (4.16), making first use of fastconvolution2 to get the discrete convolution (4.19), whereas fconvolute1 is adapted to the special case $f = h$. Those two subroutines have a call to sexp_of_tconv, which has been introduced to compute an accurate bound on the expectation of $\mathcal{N}^2_\lambda(f)$, cf. (2.6), and will be explained in Section 5.1.
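The coefficient formulas (4.24) can be illustrated by a short sketch. It assumes that the values $s_k$ of the discrete convolution are available as a plain list of floats, and it uses the scaling in which $\varepsilon^{i-1} C_i(k)$ does not depend on $\varepsilon$. The names `spline_coefficients` and `evaluate` are hypothetical; the code works in ordinary floating point and is not the rigorous procedure cubic_spline_coeff.

```python
def spline_coefficients(s, eps):
    """Return [(C0, C1, C2, C3)] for k = 0..len(s)-2 from node data s_0..s_{2n}.

    Assumes the convention s_{-1} = s_{2n+1} = 0 and a uniform mesh size eps.
    """
    m = len(s)

    def at(j):
        # Entries outside the range 0..2n vanish by convention.
        return s[j] if 0 <= j < m else 0.0

    coeffs = []
    for k in range(m - 1):
        c0 = eps * (at(k + 1) + 4.0 * at(k) + at(k - 1)) / 6.0
        c1 = (at(k + 1) - at(k - 1)) / 2.0
        c2 = (at(k + 1) - 2.0 * at(k) + at(k - 1)) / (2.0 * eps)
        c3 = (at(k + 2) - 3.0 * at(k + 1) + 3.0 * at(k) - at(k - 1)) / (6.0 * eps ** 2)
        coeffs.append((c0, c1, c2, c3))
    return coeffs


def evaluate(coeffs, k, theta):
    """Evaluate the cubic piece on [z_k, z_k + eps] at the offset theta."""
    c0, c1, c2, c3 = coeffs[k]
    return c0 + c1 * theta + c2 * theta ** 2 + c3 * theta ** 3
```

A useful sanity check is continuity at the nodes: the piece on $[z_k, z_{k+1}]$ evaluated at $\theta = \varepsilon$ should reproduce $C_0(k+1)$.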
4.5. The Identity

Another operator we need to consider is the identity. Indeed, we recall that ultimately we want to compose the bounds constructed so far in order to get bounds on the maps of interest. However, these bounds do not always have matching range and domain, and cannot in general be composed as such. In particular, the bound on the operator $T$ applies in std($\mathcal{B}_{\alpha\beta}$), whereas the convolution is defined only on std($\mathcal{B}_{\alpha\beta})^u \times$ std($\mathcal{B}_{\alpha\beta})^u$. Furthermore, the bound on the convolution is defined for pairs whose affine parts satisfy constraints on the mesh of their partitions. Hence, we need a bound on the identity map $I : \mathcal{B}_{\alpha\beta} \to \mathcal{B}_{\alpha\beta}$, defined from std($\mathcal{B}_{\alpha\beta}$) to std($\mathcal{B}_{\alpha\beta})^u$, such that the affine part of all functions in every standard set image is ensured to possess a given partition.

Let $\mathfrak{p} = (x_0, \ldots, x_n) \in \mathcal{P}^u_n$ be a fixed but arbitrary uniform partition, and $f = \rho + g$ with $\rho \in \mathcal{A}$ and $g \in \mathcal{B}_{\alpha\beta}$. For the new affine part $\tilde\rho$ of $f$ with partition $\mathfrak{p}$, we would like to consider the linear spline interpolation of $\rho$ at the nodes of $\mathfrak{p}$. However, in order for $\tilde\rho$ to be in $\mathcal{A}^u$, one must ensure $\tilde\rho$ to be continuous, so that we define
$$\tilde\rho = \pi^{-1}\bigl(\mathfrak{p}, \{0, \rho(x_1), \ldots, \rho(x_{n-1}), 0\}\bigr). \tag{4.28}$$
Then, from
$$f = \tilde\rho + (\rho - \tilde\rho) + g,$$
the new general term $\tilde g$ reads $(\rho - \tilde\rho) + g$ and its norm is simply bounded by
$$\|\rho - \tilde\rho + g\|_{\alpha\beta} \le \|\rho - \tilde\rho\|_{\alpha\beta} + \|g\|_{\alpha\beta}. \tag{4.29}$$
The first term on the RHS is bounded using the bounds constructed previously on the norm in $\mathcal{B}_{\alpha\beta}$ and on the difference of two functions. For every uniform partition $\mathfrak{p}$, the previous construction leads to a specific bound on the identity map. This bound can be optimized from case to case by adapting $\mathfrak{p}$ to the function $\rho$ so that $\|\rho - \tilde\rho\|_{\alpha\beta}$ is minimal. Again, our approach is to seek a compromise between accuracy and simplicity of the implementation. First, we choose not to increase the number of parameters from $\rho$ to $\tilde\rho$, so that $n \le n(\rho)$. Hence, the only free parameters for $\mathfrak{p}$ are the first node $x_0$ and the last node $x_n$. Given $\tau > 0$, the interval $(x_0, x_n)$ is chosen to be the smallest interval such that $|\rho(x)| < \tau$ for $x \notin (x_0, x_n)$. This interval might be fairly different from supp($\rho$), leading to a mesh $\varepsilon$ smaller than $|\mathrm{supp}(\rho)|/n$ and hence to a better approximation of $\rho$ on regions where the information is more relevant. The cutoff $\tau$ may vary from place to place in the proof and has been determined empirically.

Remark. Given an integer $n > 2$, and two positive representable numbers $x_0$ and $s$, the procedure fidentity constructs a standard set in std($\mathcal{P}^u_n$) containing the uniform partition $\mathfrak{p}$ with supp($\mathfrak{p}$) $= (x_0, x_0 + s)$, and computes a bound on the identity map as described above. The representable numbers $x_0$ and $s$ which describe the support adapted to a given function are determined in the procedure rsupport.

5. The Maps $\mathcal{N}_\lambda$

The goal of this section is to explain how bound (2.19) of Proposition 2.6 is computed and how inequality (2.11) of Theorem 2.4 is checked. A major step is to compute $\mathcal{N}_{\lambda^+}$ and $\mathcal{N}_{\lambda^-}$ on various functions of $\mathcal{B}_{\mu\nu}$, with $\mu$, $\nu$, $\lambda^-$ and $\lambda^+$ as in Proposition 2.6. We will see in Section 5.2 that these maps must be estimated from $\mathcal{B}_{\mu\nu}$ to $\mathcal{B}_{\mu(\lambda^+\nu/\lambda^-)}$, a space slightly smaller than $\mathcal{B}_{\mu\nu}$, since $\lambda^+/\lambda^- \approx 1 + 7 \cdot 10^{-7}$. In the sequel we denote $\delta = \lambda^-/\lambda^+$, and begin in the next section by describing the construction of a bound on the maps
$$\mathcal{N}_\lambda : \mathcal{B}_{\mu\nu} \to \mathcal{B}_{\mu(\nu/\delta)}.$$
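5.1. A Bound on $\mathcal{N}_\lambda$

We recall that for $\lambda > 0$ the map $\mathcal{N}_\lambda$, acting between the spaces specified in Proposition 2.2, is given by
$$\mathcal{N}_\lambda(f) = c_\lambda(f)\,\mathcal{N}^1_\lambda(f) + c_2\,\mathcal{N}^2_\lambda(f), \tag{5.1}$$
where
$$\mathcal{N}^1_\lambda(f) = S_\lambda(f * f), \tag{5.2}$$
$$\mathcal{N}^2_\lambda(f) = T\bigl(T\mathcal{N}^1_\lambda(f) * T\mathcal{N}^1_\lambda(f)\bigr), \tag{5.3}$$
$$c_\lambda(f) = \lambda\,\frac{1 - c_2 E(\mathcal{N}^2_\lambda(f))}{2 M(f) E(f)}. \tag{5.4}$$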
The expression (5.4) is more convenient for our present purpose than (2.7). In principle, one readily gets a bound on $\mathcal{N}_\lambda$ by composing the bounds constructed in the previous section. However, one can without too much effort improve this bound in two ways. First, the distributivity and commutativity properties of the operators involved in (5.1) give us the freedom to choose the order in which the bounds are composed. The order can affect the estimates, since in general these properties are not shared by the bounds. Regarding $\mathcal{N}^1_\lambda$, the fact that $S_\lambda$ preserves $\mathcal{A}^u$ yields slightly better estimates by using
$$\mathcal{N}^1_\lambda(f) = S_\lambda f * S_\lambda f \tag{5.5}$$
instead of (5.2). Furthermore, in order to get as much contraction as possible from the scaling operator, cf. Section 4.2, one chooses the sequence of spaces
A bound on (5.6) follows by composing the bounds of Section 4. We now turn to A/jJ and, as above, let Sx act first, considering (5.3) with AAj as in (5.5). Regarding the choice of spaces, we note that one could exploit the operators T and the outer convolution to estimate A/jJ in the smaller space B ,v,&\, as needed. That would permit us to consider v instead of v/5 in (5.6) for which Sx is a better contraction. Nevertheless, 6 is so close to one that it does not lead to any significant improvement, and for convenience one simply constructs a bound on A/jJ by composing the previous bound on AAj and a bound on the map T * T The target space of the convolution above is chosen in order to minimize the norm of the general term arising from the convolution of the piecewise affine part. The second improvement concerns the computation of the coefficient cx(f). An estimation of the quantity E(J\f^{f)) entering (5.4) is of poor quality if obtained by composing the bound on A/jf described above and the bound on the expectation as given in the previous section. Exploiting the structure of A/"jJ and the fact that it maps into a smaller space, due to the outer convolution, leads to a substantial improvement. Defining £ : Ba3 x BaB -»• R a0 a/J (5.8)
{f,h)^E(T(f*h)),
one has E{Mx{f)) = £(TAfxL{f),TAfxl(f)). In order to construct a bound on £ acting from std(Ba/3)u x atd(Ba0)u to std(]R), we first observe that for g e B^, f°°
\E(Tg)\ < / JO
1
-\g(x)\
x
T
dx < ^—-—\\g\\ *>0
W
(5.9)
X
VC,\
)
84 Next, for / = p + gf and h = a + gh, with p, a € Aw and gf, gh e Bap, one has £{f, h) = E{T(p * &)) +E(T(p
*gh+gf*
h)),
(5.10)
and, since p*g/l + gf*h£ Bua\p, one obtains from (5.9) the following estimate on the second term in the RHS of (5.10), \E(T(p*gh
+ gf*h))\<
sup —JL—Q\p\\\\gh\\ *>0 w0(4a)
+ \\gf\\a0\\h\\a(3).
(x)
(5.11)
Finally, the first term on the RHS of (5.10) can be computed explicitly. We use the same notation as in Section 4.4. Then, $\rho * \sigma$ is given on each interval $[z_k, z_{k+1}]$, $k = 0, \ldots, 2n-1$, by the cubic polynomial
$$(\rho * \sigma)(z_k + \theta) = C_0(k) + C_1(k)\,\theta + C_2(k)\,\theta^2 + C_3(k)\,\theta^3,$$
where the coefficients $C_i(k)$ are given by (4.24). Hence,
$$E(T(\rho * \sigma)) = \int_0^\infty \frac{1}{x}\,(\rho * \sigma)(x)\,dx = \sum_{k=0}^{2n-1} \int_0^1 \frac{\varepsilon}{\varepsilon\theta + z_k}\,\bigl(C_0(k) + \varepsilon\theta\,C_1(k) + \varepsilon^2\theta^2 C_2(k) + \varepsilon^3\theta^3 C_3(k)\bigr)\,d\theta, \tag{5.12}$$
and using
$$\int_0^1 \frac{x^n}{x + a}\,dx = \mathrm{Log}_n(a),$$
where $\mathrm{Log}_n$ is defined in (3.1), one can integrate each term in (5.12) and gets finally
$$E(T(\rho * \sigma)) = \sum_{k=0}^{2n-1} \sum_{m=0}^{3} \varepsilon^m C_m(k)\,\mathrm{Log}_m\!\Bigl(\frac{z_k}{\varepsilon}\Bigr). \tag{5.13}$$
Remark. A bound on (5.8) is implemented in sexp_of_tconv. Since the quantities entering (5.13) are computed during the estimation of the convolution, this subroutine is called in fconvolute1 and fconvolute2. A bound on $\mathcal{N}_\lambda$ is implemented in fN. This subroutine also returns a standard set containing the coefficient $c_\lambda(f)$ that will be used to check (2.11), treating separately the special case where the value of $E(f)$ is known exactly, cf. Section 5.3.
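The double sum (5.13) can be sketched numerically as follows. The sketch assumes that $\mathrm{Log}_n(a)$ denotes the integral of $x^n/(x+a)$ over $[0,1]$ for $a > 0$ and evaluates it through the elementary recursion $\mathrm{Log}_n(a) = 1/n - a\,\mathrm{Log}_{n-1}(a)$. The function names are hypothetical, the arithmetic is plain floating point, and the recursion loses accuracy for large $a$, so this is an illustration only and not the rigorous bound of sexp_of_tconv.

```python
import math


def log_n(n, a):
    """Integral over [0, 1] of x**n / (x + a), for a > 0 (assumed definition of Log_n)."""
    value = math.log((1.0 + a) / a)  # case n = 0
    for m in range(1, n + 1):
        # Uses x**m / (x + a) = x**(m-1) - a * x**(m-1) / (x + a).
        value = 1.0 / m - a * value
    return value


def expectation_of_t_conv(coeffs, z, eps):
    """Evaluate the double sum of eps**m * C_m(k) * Log_m(z_k / eps).

    coeffs[k] holds (C0, C1, C2, C3) for the interval starting at z[k];
    every z[k] is assumed to be strictly positive.
    """
    total = 0.0
    for k, c in enumerate(coeffs):
        for m in range(4):
            total += eps ** m * c[m] * log_n(m, z[k] / eps)
    return total
```

For a single interval $[1, 2]$ with $\varepsilon = 1$ and a constant piece equal to one, the sum reduces to the integral of $1/x$ over $[1,2]$.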
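5.2. Existence of the Family of Fixed Points: First Estimate

We now explain how the quantity $\epsilon$ entering inequality (2.19) of Proposition 2.6 is computed. Recall that it consists in an upper bound on
$$\|\mathcal{M}_{\lambda^+,\kappa}(f^0_{\lambda^+}) - f^0_{\lambda^+}\|_{\mu\nu}, \tag{5.14}$$
uniform in $\kappa \in [\delta, 1]$, $\delta = \lambda^-/\lambda^+$, where $\lambda^- < \lambda^+$, $\mu$ and $\nu$ are given in Proposition 2.6, and $f^0_{\lambda^+}$ is an approximate fixed point in $\mathcal{A}^u$. From the definition of $\mathcal{N}_{\lambda^+,\kappa}$ and $\mathcal{M}_{\lambda^+,\kappa}$, cf. (2.14) and (2.18), it follows that
$$\|\mathcal{M}_{\lambda^+,\kappa}(f^0_{\lambda^+}) - f^0_{\lambda^+}\|_{\mu\nu} \le \|M\|\,\|\mathcal{N}_{\lambda^+,\kappa}(f^0_{\lambda^+}) - f^0_{\lambda^+}\|_{\mu\nu}, \tag{5.15}$$
and from (2.9) one obtains
$$\begin{aligned}
\|\mathcal{N}_{\lambda^+,\kappa}(f^0_{\lambda^+}) - f^0_{\lambda^+}\|_{\mu\nu}
&\le \|S_\kappa \mathcal{N}_{\lambda^+}(f^0_{\lambda^+}) - S_\kappa f^0_{\lambda^+}\|_{\mu\nu} + \|S_\kappa f^0_{\lambda^+} - f^0_{\lambda^+}\|_{\mu\nu} \\
&\le \|\mathcal{N}_{\lambda^+}(f^0_{\lambda^+}) - f^0_{\lambda^+}\|_{\mu(\nu/\delta)} + \|(S_\kappa - 1) f^0_{\lambda^+}\|_{\mu\nu}.
\end{aligned} \tag{5.16}$$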
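The last inequality is valid since $f^0_{\lambda^+}$ and $\mathcal{N}_{\lambda^+}(f^0_{\lambda^+})$ belong to $\mathcal{B}_{\mu(\nu/\delta)}$. Indeed, $f^0_{\lambda^+}$ has compact support in $(0, \infty)$, and $\mathcal{N}_{\lambda^+}$ preserves this property. Therefore, $\mathcal{N}_{\lambda^+}(f^0_{\lambda^+}) \in \mathcal{B}_{\alpha\beta}$ and $f^0_{\lambda^+} \in \mathcal{B}_{\alpha\beta}$ for all $\alpha, \beta > 0$. By composing the bounds constructed in the previous sections, one gets an estimate for the first term on the RHS of (5.16). At this point, the only dependence on the parameter $\kappa$ lies in the second term of (5.16), which one bounds uniformly using the following result.

Lemma 5.1. Let $0 < \kappa \le 1$ and $f \in W^1_1(\mathbb{R}_+, w_{\alpha\beta}(x)\,dx)$. If $f \in \mathcal{B}_{\alpha\gamma}$ for some $\gamma > \beta/\kappa$, then
$$\|(S_\kappa - 1) f\|_{\alpha\beta} \le (1 - \kappa)\bigl(\|f\|_{\alpha\beta} + \|x f'(x)\|_{\alpha(\beta/\kappa)}\bigr). \tag{5.17}$$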
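By definition, the function $f^0_{\lambda^+} \in \mathcal{A}^u$ satisfies the hypothesis of Lemma 5.1 for all $\alpha, \beta > 0$. Furthermore, the bound (5.17) is decreasing in $\kappa$. Hence, one has for all $\kappa \in [\delta, 1]$,
$$\|(S_\kappa - 1) f^0_{\lambda^+}\|_{\mu\nu} \le (1 - \delta)\bigl(\|f^0_{\lambda^+}\|_{\mu\nu} + \|x (f^0_{\lambda^+})'(x)\|_{\mu(\nu/\delta)}\bigr). \tag{5.18}$$
Collecting the inequalities (5.15), (5.16) and (5.18) yields the desired bound $\epsilon$ on (5.14), uniform in $\kappa \in [\delta, 1]$. The only missing information is the norm of the operator $M$ appearing in (5.15), which will be given in Section 7.2. We end this section with the

Proof of Lemma 5.1. For $f \in W^1_1(\mathbb{R}_+, w_{\alpha\beta}(x)\,dx)$, one can rewrite
$$S_\kappa f(x) - f(x) = (\kappa - 1) f(x) + \kappa \int_x^{\kappa x} f'(y)\,dy.$$
Hence,
$$\|(S_\kappa - 1) f\|_{\alpha\beta} \le (1 - \kappa)\|f\|_{\alpha\beta} + \kappa \int_0^\infty dx\, w_{\alpha\beta}(x) \int_{\kappa x}^{x} |f'(y)|\,dy. \tag{5.19}$$
Furthermore,
$$\begin{aligned}
\int_0^\infty dx\, w_{\alpha\beta}(x) \int_{\kappa x}^{x} |f'(y)|\,dy
&= \int_0^\infty dy\, |f'(y)| \int_y^{y/\kappa} w_{\alpha\beta}(x)\,dx \\
&\le \frac{1 - \kappa}{\kappa} \int_0^\infty y\,|f'(y)|\,\max\{w_{\alpha\beta}(y), w_{\alpha\beta}(y/\kappa)\}\,dy \\
&\le \frac{1 - \kappa}{\kappa}\, \sup_{y > 0} \frac{\max\{w_{\alpha\beta}(y), w_{\alpha\beta}(y/\kappa)\}}{w_{\alpha(\beta/\kappa)}(y)}\; \|x f'(x)\|_{\alpha(\beta/\kappa)}.
\end{aligned}$$
Finally, one checks that the supremum in the previous expression is bounded by one for $\kappa \le 1$, which leads to (5.17).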
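The inequality (5.17) can be checked numerically on a simple example. The sketch below assumes the weight $w_{\alpha\beta}(x) = \exp(\alpha/x + \beta x)$, the norm $\|f\|_{\alpha\beta} = \int_0^\infty w_{\alpha\beta}(x)|f(x)|\,dx$ and the scaling $S_\kappa f(x) = \kappa f(\kappa x)$, which are the forms consistent with the estimates of this section. The test function (a hat function on $[1,3]$), the midpoint quadrature and all names are illustrative choices, not part of the proof.

```python
import math


def weight(alpha, beta, x):
    """Assumed weight w_{alpha beta}(x) = exp(alpha / x + beta * x)."""
    return math.exp(alpha / x + beta * x)


def hat(x):
    """Piecewise linear test function supported on [1, 3] with peak at x = 2."""
    return max(0.0, 1.0 - abs(x - 2.0))


def hat_prime(x):
    """Derivative of the hat function away from its three nodes."""
    if 1.0 < x < 2.0:
        return 1.0
    if 2.0 < x < 3.0:
        return -1.0
    return 0.0


def integrate(g, lo, hi, steps=20000):
    """Midpoint rule on [lo, hi]."""
    h = (hi - lo) / steps
    return h * sum(g(lo + (i + 0.5) * h) for i in range(steps))


def check_lemma(alpha, beta, kappa):
    """Return (lhs, rhs) of the inequality for the hat function.

    The window [0.5, 4] contains the supports of hat and of S_kappa hat
    as long as kappa >= 0.75.
    """
    lo, hi = 0.5, 4.0
    lhs = integrate(
        lambda x: weight(alpha, beta, x) * abs(kappa * hat(kappa * x) - hat(x)), lo, hi
    )
    norm_f = integrate(lambda x: weight(alpha, beta, x) * hat(x), lo, hi)
    norm_xdf = integrate(
        lambda x: weight(alpha, beta / kappa, x) * abs(x * hat_prime(x)), lo, hi
    )
    return lhs, (1.0 - kappa) * (norm_f + norm_xdf)
```

Calling `check_lemma(1.0, 1.0, 0.9)` returns the two sides of the inequality for this example; the left side should not exceed the right side.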
R e m a r k . The quantity e is computed in the procedure compute_residual. This procedure also returns a bound on the first term in the RHS of (5.16), a quantity that will be used in Section 5.3. T h e bound (5.17) is implemented in the procedure snorm_of_Skappaml, where the second term on the RHS of (5.17) is estimated using, for p 6 A with 7v(p) = ( { x j , { ft }) t " =0 ,
\\XP'(X)\L(f}/K)
1 " < 5 X l S U P Wc(P/K)iX)\Pi ST»=ei,
~ Pi-l\(Xi
+ Xi-l)-
(5-20)
Finally, the (positive) representable numbers that describe the approximate fixed point /°+ € Au are contained in the file f p o i n t . l p . The first two numbers in this file are the boundary points of s u p p ( / ° + ) . They determine the partition p € V„, n = 2 1 7 + 1, satisfying 7r(/° + ) = (p, •). The last 2 1 7 numbers are the (nonzero) entries of the vector v, where TT(/°+) = (•, v). Given a nonnegative G €. S, the subroutine reacLfp reads the file f p o i n t . I p and constructs a standard set (P, V, G) with (P, V) £ std(.4 u ) containing f°
5.3. E x i s t e n c e o f t h e F i x e d P o i n t / A . Recall that once t h e existence of the continuous family {/A} of fixed points of JVA is established for A G [ A - , A + ], our main result, namely the existence of a A* g [A - , A + ] and a function / * satisfying Sx.V{f*) = / * , follows from C
A-(/A-)
where cx is given by (1.10) and cx(f) by (5.4). Checking this inequality amounts to computing for each of the three quantities involved a standard set in std(R).
87 Let us start with c A + ( / A + ) . Suppose that one has a standard set, say in std(f?MI/), containing the fixed point fx+. Then one readily gets a standard set in std(lR) containing C A+ (f\+) by composing our bounds to compute
$$c_{\lambda^+}(f_{\lambda^+}) = \lambda^+\,\frac{1 - c_2 E\bigl(\mathcal{N}^2_{\lambda^+}(f_{\lambda^+})\bigr)}{2 M(f_{\lambda^+})}. \tag{5.21}$$
The previous expression follows from (5.4) and E(fx+) = 1, a property satisfied by definition of the maps Mx, cf. (2.2). In order to check that cx < c A + ( / A + ) , the size of the standard set obtained from (5.21) must be small enough, which ultimately requires to localize well enough the fixed point fx+. In particular, Proposition 2.6 implies only that fx+ € Bf(fx+) with f = e/(l — q). This cannot be used to construct a suitable standard set containing / A + , since the ball Bf(fx+) also contains the fixed point fx+ $ = Ssfxfor which cx+(Ssfx-) = c A _(/ A _) < c r In order to get a suitable standard set, we first use t h a t the approximate fixed point / A + has been numerically determined as a very good approximation of fx+, and exploit our bounds to compute
\\MX+(&)
- / M u < ||M|| |py/-A+(/A°+) - / A ° + iu < e',
(5.22)
with e' w 4.97 • 1 0 - 7 (to be compared with e w 1.15 • 10~ 4 ). Next, since by Proposition 2.6, Mx+ = M.x+ x is a contraction on the ball Br(fx+) 6 B with rate q < 1, one infers from (5.22) and the contraction mapping principle that I I / A + - / A ° + I" U" <- 1 - 9 ' Finally, one constructs in std(B / i l / ) the standard set whose affine part is given by the singleton {/A+ } and whose general term has norm e'/(l — q). This set contains / A + and allows to check that cx < c A + ( / A + ) . We now consider c A _(/ A _), setting again 6 = A~/A + . For convenience, we work with the fixed point / A + s of M.x+ s whose existence is guaranteed in Br(fx+) £ B^ by Proposition 2.6. Lemma 2.5 implies that / A _ = S1,sfx+S and identity (2.16) leads to ,. , ,. , ,A+l-C2ff(A/-A2+(/A+i,)) c A _(/ A _) = cx+(fx+iS) =5 M{h+s)
(5.23)
where E(fx+ &) = 1/5 has been used. In order to check that c A -(/ A _) < cx using the previous relation, we must localize / A + s closely enough. For this purpose, we have determined a very good approximation / A _ to the fixed point / A _ . As / A + , it is given by the linear interpolation of 2 1 7 positive values at well chosen points. First, we check using our bounds that / A _ satisfies l|S (5 / A °--/A , + I U < r ,
(5.24)
88 with r as in Proposition 2.6, and, \\Mx+>s(S6fx-)
- Ssfl-||„„
< ||M|| \\Nx+j{Ssfl-)
- 5,/A°_|U
= ||M||||5^ A + (/ A °-)-^/MU <||M||||^A-(/A0-)-/A°-IUA) < e",
(5.25)
with e" w 4.97 • 10~ 7 . Inequality (5.24) ensures t h a t £,5/°- € Br(f°+). Hence, Proposition 2.6 and inequality (5.25) imply by the contraction mapping principle that e" II/A+,5 _ Ssfx- Wfif ^ 1 — a'
As above, this leads to the construction of a suitable standard set in std(B /XI/ ) containing the fixed point / A + s. To conclude, we emphasize that the accuracy of the bounds on (5.21) and (5.23) is crucial, since it determines how close to A* one can take A - and A + , and since, on the other hand, the size of the interval [A - , A + ] must be small enough in order to prove the existence of the family {/A}Ae[A-,A+lRemark. The bounds e' and e" are computed in the subroutine compute_residual introduced earlier. The computations of c A+ (/ A +) and cA_ (/ A _ )/5 are carried out in the subroutine f N, in which a bound on the maps Afx is implemented as explained in Section 5.1. The remainder of the procedure described in this section is worked out at the end of the main program. The (positive) representable numbers t h a t describe / A _ are contained in the file f p o i n t . lm. This file is organized in the same way as f p o i n t . l p , and a standard set containing / A _ is constructed by the subroutine read_f p described in Section 5.2.
6. Contractivity Properties of $D\mathcal{N}_\lambda$

As mentioned in Section 2, the tangent map $D\mathcal{N}_\lambda(f)$ is a contraction on certain subspaces of $\mathcal{B}_{\alpha\beta}$ with finite codimension. The main goal of this section is to describe these subspaces and compute the contraction factors. Those will be used in Section 7 to estimate the norm of the tangent map of $\mathcal{M}_{\lambda^+,\kappa}$. We first introduce some notation and check that $\mathcal{N}_\lambda$ is $C^1$ on its domain of definition. The (Fréchet) derivative of $\mathcal{N}_\lambda$ at $f \in \mathcal{B}_{\alpha\beta}$ is explicitly given by
$$D\mathcal{N}_\lambda(f)h = S_\lambda\bigl(2 c_\lambda(f)\, f * h + 4 c_2\, T(T(f * f) * T(f * h)) + \delta_\lambda(f, h)\, f * f\bigr), \tag{6.1}$$
where the variation $\delta_\lambda(f, h)$ of $c_\lambda(f)$ is such that
$$E(D\mathcal{N}_\lambda(f)h) = 0. \tag{6.2}$$
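Indeed, since all functions in the range of $\mathcal{N}_\lambda$ have the same expectation, the tangent space contains only functions with expectation zero. Defining $\mathcal{N}^1_\lambda$ and $\mathcal{N}^2_\lambda$ as in (2.4) and (2.5), we rewrite (6.1) as
$$D\mathcal{N}_\lambda(f)h = c_\lambda(f)\, D\mathcal{N}^1_\lambda(f)h + c_2\, D\mathcal{N}^2_\lambda(f)h + \delta_\lambda(f, h)\, \mathcal{N}^1_\lambda(f), \tag{6.3}$$
where
$$D\mathcal{N}^1_\lambda(f)h = 2 S_\lambda(f * h), \tag{6.4}$$
$$D\mathcal{N}^2_\lambda(f)h = 2 T\bigl(T\mathcal{N}^1_\lambda(f) * T D\mathcal{N}^1_\lambda(f)h\bigr). \tag{6.5}$$
From the condition (6.2), $\delta_\lambda$ is expressed in terms of the expectation of the three terms on the RHS of (6.3). Using the relations (1.23), one gets
$$\delta_\lambda(f, h) = -c_\lambda(f)\Bigl[\frac{M(h)}{M(f)} + \frac{E(h)}{E(f)}\Bigr] - \lambda c_2\,\frac{E(D\mathcal{N}^2_\lambda(f)h)}{2 M(f) E(f)}. \tag{6.6}$$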
Now, for a > 0 and /3, A > 0, one easily checks that the estimates of Proposition 2.2 imply that whenever Afx : B^/H —>• BaT is well defined, i.e., a < 4a/A, r < A/3, it is continuously differentiable on Bap/?{. We will need later to estimate DMx(f) for Afx : Bap —> Baj with 7 slightly larger than p. In this case, the conditions for DAfx(f) to be bounded become 7//? < A < 4. We will see below t h a t under the stronger conditions j/p < A < 4, the tangent map DJ\fx(f) is actually compact provided / is sufficiently regular, i.e., there is a sequence of subspaces of finite codimension on which DAfx (/) converges to zero. These subspaces are defined as follows. Definition 6 . 1 . For p = { x 0 , . . . ,xn} following subspaces ofBap, Caa = {heBav,V
a partition
in Vn and a,b > 0, we define the
= 0\ supp(/i) C (0,a)},
Cp = {h e L 1 ( M + ) | supp(/i) C (a; 0 ,x„), /
h(x) dx — 0 for i = 1 , . . . , n},
JXi — i
Kbp = {h€ B<0, C = 0 | supp(ft) C (6,00)}. Furthermore, we denote by BvaQ the following subspace of Bap, Bpat3=C^®Cp®nl".
(6.7)
For a small enough and b large enough, it turns out that when restricted to £™, 7^0, and C respectively, the tangent map of J\fx at a function / in W±(H+, wa/3(x)dx) has norms of value O(e-1/a\\f\\a0), O(e-b\\f\\a0), and 0(\p\ \\f'\\afi), where \p\ denotes the mesh size of the partition p. More generally, if Cp consists of functions whose n — 1
90 first moments vanish on every interval of the partition p, the norm of the tangent m a p restricted to Cp is C ? ( | p | n | | / ^ | | a / 3 ) , provided that the base function / is regular enough. For our purpose, it is sufficient to consider n = 1. Before deriving explicitly the contraction factors, we remark that we will need to evaluate later DNX{J) on the complement of B^g in BaB. It is easily seen that for a given partition p, every h 6 BaB can be uniquely decomposed into a sum h = g + T where g € B^3 and where r is constant on each interval of the partition p and satisfies supp(r) = supp(p). More precisely, with p = {x0,... ,xn} and \i the characteristic function of the interval I, one has B ^ = ^ © ^ ,
(6.8)
where V p is the n-dimensional vector space defined by V p = {r \T = £
AiXt,,.!,,,], \ e R } .
(6.9)
i=l
Section 6.4 is devoted to the construction of a bound on the map DAfx(f)
: V —> Bai.
Remark. In (6.3), there are factors that depend only on the base function / . These factors, namely Mx(f), TAf^(f) and their norms, together with c A (/), E(f) and M(f), are computed once and for all in the subroutine compute_constant_terms using the bounds of Section 4. (This subroutine makes use of snorm_of _der_pl, a function commented in the final remark of Section 6.1.) According to Proposition 2.6, A = A + and / is represented by the standard set in std(23„„)" whose affine part is the singleton {/A+} and whose general term g satisfies ||<7|L„ < 9 • 10~ 4 . Finally, for given standard sets containing M(h),E{h) and E{DN^{f)h), a bound on Sx(f,h) is computed in the procedure s d e l t a l using (6.6).
6.1. Oscillatory Functions We derive now an upper bound on the norm of the operator DM"x(f) : Cp —> Bay, with Cp as in Definition 6.1 and with f = p + g e BaB/%, p G Au• For the first two terms in (6.3), and for ||g|| a/ 3 small, the contraction factor will come from the convolution in DNX- Hence, we first use the bounds obtained in Proposition 2.2 and get in full generality \\Mxl(f)\\aT (6.10) In the previous expression, only the quantities that depend on h remain to be estimated. \\DNx{f)h\\ai
< ( | c A ( / ) | + 2c2||A/-A1(/)||a7) \\DMlx{f)h\\ai
+ \Sx(f,h)\
Let us begin with Sx(f,h). For h € Cp, one has M(h) — 0 and the first term in (6.6) vanishes. Next, E(h) is expressed in term of the largest interval in the partition p.
91 Denoting p — { x 0 , . . . , xn} and Ii — fo^, x j , i = 1 , . . . , n, the identity fj h{x)dx = 0 implies \f
= | | (x-Xi
xh{x)dx\
+
*i-1)h(x)dx\
< liXi-x^J
\h(x)\dx,
which in t u r n yields \E(h)\ < I
max { * . _ * , _ , } 8 u p ( - ^ ) | | A | | a / 8 .
(6.11)
Finally, since DAf%(f)h € # a ( 4 7 ) , it follows from (2.13) that \E{DU2x{f)h)\
< 2sup(
* (A\Wl{f)\\^\\DUl{f)h\\ar
(6.12)
Inserting (6.11) and (6.12) into (6.6) leads to an estimate for the second term on the RHS of (6.10). In order to bound the RHS of (6.12) and the first term on the RHS of (6.10), it then remains to estimate \\DAfl(f)h\\a^. In order to treat DAfx(f), one has the possibility to exploit, as in the previous section, the distributivity of the scahng operator Sx with respect to the convolution. It turns out t h a t the order is not crucial and we consider for simplicity DMl : Baf} x Cp — ? - > B(ia)p
2S —±> Bar
(6.13)
We begin with the convolution and use the following result. L e m m a 6.2. Let f 6 W*(R+,
wap(x) dx) and h e Cp with p = {x0,. . . , £ „ } .
11/ * ft|l(4a)/» < \ca0(p)\\f\\a0\\h\\a0, where, denoting 1\ =
Then, (6-14)
[xi_1,xi],
^ . - ^ J j - / . . , , . ) . ) . »=l,...,n lxeIi
Wap(X) JT.
*
,6,5,
J
Since for p e A one has by definition p g Wi(M.+,wap(x)dx), the previous lemma together with Proposition 2.2 imply, with / = p + g, p e Aw, g G B a / 3 , and h eCp,
11/ * &ll(4a)/J < IIP * h\\(4a)0 + 11^IIa/9II^HajS < ( ^ ( P ) I I ^ I U + llfllla/l)ll*lla/»-
(6-16)
92 Next, since 7//? < A < 4, inequality (4.8) applies (with a replaced by 4 a ) , and we finally obtain
\\DNl(f)h\\ai < le-A{\caM\\P\U
+ \\9\Q\\h\\af),
(6.17)
where A = 2^/a{4 - A)(/3 - 7/A). A few comments are in order. In (6.17), the contraction factor is not only given by cap(p) but also by how close in Bap the base function / is to a regular function together with the norm of that function in W 1 1 (]R + ,w Q/3 (x)dx). The fact t h a t the fixed point whose existence we want to prove is smooth plays an important role here. To make a connection with Proposition 2.6, the quantity ||<7||a/3 in (6.17) is the radius of the ball on which the tangent maps DMX+ K need to be contractions. All the other terms can be made as small as we wish by letting the size of the largest interval in p go to zero, cf. (6.11) and (6.15). Note that cag(p) depends sensitively on a and /3, and optimizing this factor requires to consider a partition p with smaller intervals where the weight wap varies strongly. We will encounter later other optimization criteria for p. We shall denote by pr the partition p which we will eventually choose, cf. Section 7.1. We end this section with the P r o o f of Lemma 6.2. Define the function hx by
A1(a:)=
fh(S)d£, JXQ
for x e (x0,xn), and ht(x) = 0 otherwise. Note that h[ — h and, by definition of h, hi(xi) = 0 for i = 0 , . . . , ra. Hence, integration by parts leads to
11/ * A||(4a)/J = 11/' * Alll(4a)/» < ll/'MIMa/JIt remains to estimate the norm of hx in term of h. For i = 1 , . . . , n and x € one has
[xi_1,xi],
which in turn yields n
.
IKIIa/3 == X / i=l
^llbf
Jl
W
apiX) \hl(X)\
dx
,
'
w
<*p(x) [ \H0\dddx
•
93 Remark. The quantity ca0(pr) is computed in the subroutine swsupint. (See the final remark of Section 7.1 for a description of the parameters related to the partition pr.) The estimates (6.11), (6.12) and (6.17) are implemented in fDN_center to compute (6.10), with a call to the subroutine s d e l t a l to get 5x(f,h). The quantity ||/o'||a;8 entering (6.17) is bounded in snorm_of _der_pl by
WWaff < £ where ir{p) = ({yj},{Pj})T=o
SU
P VafiWP,
and Ij =
~ Pj-l\>
( 6 - 18 )
[y^^yj]-
6.2. Functions with S u p p o r t N e a r t h e O r i g i n In this section, we consider DNx(f) acting on functions h € Laa for a small enough. As in the previous section, but for different reasons, the contractivity properties of DMx{f) are entirely due to the term DAf^(f). Indeed, since the functions / which will be considered have in general a support given by JR+, the support of DAfl(f)h for h 6 Caa is also equal to R + due to the convolution. Hence, the size of DAf£(f)h is essentially given by the size of DAf£ ( / ) h, and we proceed as before starting with the bound (6.10) on | | Z W A ( / ) f c | | a r The last term in the expression (6.6) for Sx(f, h) is again bounded using (6.12). The /j-dependent coefficients of the first two terms in (6.6) are given by M(h) and E(h), which are bounded using
'MWI ^ ^ T ^ I W U
( 6 - 19 )
I^WI < I ^ r l W U
( 6 - 2 °)
provided a < sja/fi for the first inequality, and a < (l + y/1 + 4af3)/2/3 for the second inequality, cf. the discussion of (4.5) and (4.6). It remains to bound DAf£(f)h
in Bay.
DMl : Bap x CI — ^
We consider B{r)a)f3
— ^
Bai,
(6.21)
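with $\eta \in [\lambda, 4]$ a parameter to be chosen later. For the convolution, we use the

Lemma 6.3. Let $f \in \mathcal{B}_{\alpha\beta}$ and $h \in \mathcal{L}^a_{\alpha\beta}$. Then, for $1 \le \eta \le 4$,
$$\|f * h\|_{(\eta\alpha)\beta} \le \exp\Bigl(-\frac{\alpha\sqrt{\eta}\,(2 - \sqrt{\eta})}{a}\Bigr)\, \|f\|_{\alpha\beta}\, \|h\|_{\alpha\beta}. \tag{6.22}$$

Proof. Exploiting supp($h$) $\subset (0, a)$, we proceed as in Proposition 2.2 and get
$$\|f * h\|_{(\eta\alpha)\beta} \le \sup_{x > 0,\; 0 < y < a} \exp(-\alpha\, g(x, y))\, \|f\|_{\alpha\beta}\, \|h\|_{\alpha\beta},$$
where
$$g(x, y) = \frac{x + y}{xy} - \frac{\eta}{x + y}.$$
Since $g(x, y) \ge 0$ for $\eta \le 4$, one has $\sup \exp(-\alpha g) = \exp(-\alpha \inf g)$ and, using $\eta \ge 1$, we compute
$$\inf_{x > 0,\; 0 < y < a} g(x, y) = \inf_{0 < y < a} \frac{\sqrt{\eta}\,(2 - \sqrt{\eta})}{y} = \frac{\sqrt{\eta}\,(2 - \sqrt{\eta})}{a}.$$

We now turn to the scaling operator. Since supp($f * h$) $= \mathbb{R}_+$ for the functions $f$ that will be considered, the following general bound is optimal,
$$\|S_\lambda g\|_{\alpha\gamma} \le \exp\bigl(-2\sqrt{\alpha(\eta - \lambda)(\beta - \gamma/\lambda)}\bigr)\, \|g\|_{(\eta\alpha)\beta}, \tag{6.23}$$
which is valid provided $\gamma/\beta \le \lambda \le \eta$. From (6.23) and (6.22), we get a bound on $\|D\mathcal{N}^1_\lambda(f)h\|_{\alpha\gamma}$. We now optimize the parameter $\eta$. Since ultimately we will get the needed contraction factor by choosing $a$ small enough, and since (6.23) does not depend on $a$, we consider (6.22) only. For $\lambda \ge 1$, the maximum of $\sqrt{\eta}\,(2 - \sqrt{\eta})$ on $[\lambda, 4]$ is taken at $\eta = \lambda$, and one gets finally
$$\|D\mathcal{N}^1_\lambda(f)h\|_{\alpha\gamma} \le 2 \exp\Bigl(-\frac{\alpha\sqrt{\lambda}\,(2 - \sqrt{\lambda})}{a}\Bigr)\, \|f\|_{\alpha\beta}\, \|h\|_{\alpha\beta}. \tag{6.24}$$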
Recall that (6.24) is valid provided $\gamma/\beta \le \lambda \le 4$. Furthermore, it leads for $a$ small enough to a strict contraction only if $\lambda < 4$: this is the first compactness condition. Before ending this section, let us comment on the optimization of the contraction factor. Instead of (6.21), one can consider $D\mathcal{N}^1_\lambda(f)h = 2(S_\lambda f * S_\lambda h)$ with $S_\lambda : \mathcal{B}_{\alpha\beta} \to \mathcal{B}_{(\alpha/\eta)\gamma}$ and $\eta \in [\lambda, 4]$ a parameter to be optimized. Since the norm of $S_\lambda h \in \mathcal{L}^{a/\lambda}_{(\alpha/\eta)\gamma}$ is also of order $O(e^{-1/a})$ if $\eta > \lambda$, one gets a second $a$-dependent contraction factor from the convolution. However, optimizing $\eta$ leads to the same bound as (6.24), and we use (6.21) for convenience of implementation.

Remark. The bounds (6.19), (6.20) and (6.24) are implemented in the procedure fDN_left to compute (6.10). The conditions on $a$ under which (6.19) and (6.20) are valid are first checked, namely $a \le \sqrt{\alpha/\beta}$ and $a \le (1 + \sqrt{1 + 4\alpha\beta})/(2\beta)$. An explicit check of $\gamma/\beta \le \lambda \le 4$ is also necessary. Up to now, this inequality was implicitly verified when bounds were computed, as in (6.17) for instance.
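The infimum computed in the proof of Lemma 6.3 can be cross-checked by a crude grid search. The sketch below assumes nothing beyond the function $g$ and the closed form $\sqrt{\eta}\,(2 - \sqrt{\eta})/a$ stated above; the grid resolution, the cutoff $x \le 10a$ and the function names are arbitrary illustrative choices.

```python
import math


def g(x, y, eta):
    """Exponent g(x, y) = (x + y) / (x y) - eta / (x + y)."""
    return (x + y) / (x * y) - eta / (x + y)


def infimum_formula(a, eta):
    """Closed form sqrt(eta) * (2 - sqrt(eta)) / a for the infimum of g."""
    return math.sqrt(eta) * (2.0 - math.sqrt(eta)) / a


def grid_minimum(a, eta, points=400):
    """Minimum of g over a grid with y in (0, a] and x in (0, 10 a]."""
    best = float("inf")
    for i in range(1, points + 1):
        y = a * i / points
        for j in range(1, points + 1):
            x = 10.0 * a * j / points
            best = min(best, g(x, y, eta))
    return best
```

For $\eta > 1$ the infimum is approached at $y = a$ and $x = a/(\sqrt{\eta} - 1)$, so evaluating `g` at that point should reproduce `infimum_formula`, and the grid minimum should lie slightly above it.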
6.3. Functions with Support Near Infinity

We now consider functions $h \in \mathcal{R}^b_{\alpha\beta}$ with $b$ large enough. Here, the situation differs from the previous cases in the sense that the term $D\mathcal{N}^2_\lambda(f)h$ is small independently of the size of $D\mathcal{N}^1_\lambda(f)h$. Indeed, the property of $h$ to have support away from the origin is preserved by $D\mathcal{N}^1_\lambda(f)$. After applying the transformation $T$, one obtains a function whose support is near the origin, and the result from the previous section related to the convolution yields a second exponentially small factor. Hence, we simply start with the triangle inequality to get from (6.3)
$$\|D\mathcal{N}_\lambda(f)h\|_{\alpha\gamma} \le |c_\lambda(f)|\,\|D\mathcal{N}^1_\lambda(f)h\|_{\alpha\gamma} + c_2\,\|D\mathcal{N}^2_\lambda(f)h\|_{\alpha\gamma} + |\delta_\lambda(f, h)|\,\|\mathcal{N}^1_\lambda(f)\|_{\alpha\gamma}. \tag{6.25}$$
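Let us begin with the first term. The main contraction factor is here entirely due to the scaling operator acting on $\mathcal{R}^b_{\alpha\beta}$. Furthermore, for $f \in \mathcal{B}_{\alpha\beta}$, the map $h \mapsto f * h$ preserves $\mathcal{R}^b_{\alpha\beta}$. Hence, one has the choice of the order in which the scaling and the convolution are composed. By letting the scaling act first, one gains a ($b$-independent) contraction factor when applying this operator to the function $f$. Recall that $S_\lambda : \mathcal{B}_{\alpha\beta} \to \mathcal{B}_{(\alpha/4)\gamma}$ is a strict contraction for $\gamma/\beta < \lambda < 4$. One can improve this factor by considering $\mathcal{B}_{0\gamma}$ for the target space of $S_\lambda$. Hence, we consider finally
$$D\mathcal{N}^1_\lambda/2 : \mathcal{B}_{\alpha\beta} \times \mathcal{R}^b_{\alpha\beta} \xrightarrow{\;S_\lambda \times S_\lambda\;} \mathcal{B}_{0\gamma} \times \mathcal{R}^{b/\lambda}_{0\gamma} \xrightarrow{\;*\;} \mathcal{B}_{\alpha\gamma}. \tag{6.26}$$
Provided $\gamma/\beta \le \lambda$, the scaling operator in (6.26) is bounded, and, since $S_\lambda h$ has again support away from the origin, the convolution above is well defined even for $\alpha > 0$. For $f \in \mathcal{B}_{\alpha\beta}$, one estimates as usual
$$\|S_\lambda f\|_{0\gamma} \le \sup_{x > 0} \frac{w_{0\gamma}(x/\lambda)}{w_{\alpha\beta}(x)}\, \|f\|_{\alpha\beta} = \exp\bigl(-2\sqrt{\alpha(\beta - \gamma/\lambda)}\bigr)\, \|f\|_{\alpha\beta}, \tag{6.27}$$
and for $h \in \mathcal{R}^b_{\alpha\beta}$, one uses the knowledge about the support of $h$ to get
$$\|S_\lambda h\|_{0\gamma} \le \sup_{x > b} \frac{w_{0\gamma}(x/\lambda)}{w_{\alpha\beta}(x)}\, \|h\|_{\alpha\beta} = \exp\bigl(-\alpha/b - b(\beta - \gamma/\lambda)\bigr)\, \|h\|_{\alpha\beta}, \tag{6.28}$$
the last equality being valid if $b \ge \sqrt{\alpha/(\beta - \gamma/\lambda)}$. Next, we consider the convolution in (6.26). For $f \in \mathcal{B}_{0\gamma}$ and $h \in \mathcal{R}^{b/\lambda}_{0\gamma}$, we proceed as in Proposition 2.2 and get
$$\|f * h\|_{\alpha\gamma} \le \sup_{x > 0,\; y > b/\lambda} \frac{w_{\alpha\gamma}(x + y)}{w_{0\gamma}(x)\, w_{0\gamma}(y)}\, \|f\|_{0\gamma}\, \|h\|_{0\gamma} = \exp(\alpha\lambda/b)\, \|f\|_{0\gamma}\, \|h\|_{0\gamma}. \tag{6.29}$$
Finally, (6.27), (6.28) and (6.29) lead to
$$\|D\mathcal{N}^1_\lambda(f)h\|_{\alpha\gamma} \le 2 e^{-A} \exp\bigl(-b(\beta - \gamma/\lambda)\bigr)\, \|h\|_{\alpha\beta}\, \|f\|_{\alpha\beta}, \tag{6.30}$$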
where A = 2^/a(/3 — 7/A) — a(X — l ) / 6 . Although the convolution deteriorates the 6-independent factor given by the scaring, (6.26) is still a good choice due to the large values of b t h a t will be considered. Proceeding in this way is not crucial, but allows to take smaller values for b, thereby saving about 10 percent of the computation time devoted to the evaluation of DNx(f) on V, the space of piecewise constant functions. We conclude by observing that (6.30) yields a bound which is exponentially small in b only if j/P < A: this is the second compactness condition. Next, we consider the second t e r m in (6.25). One has I|£WA(/WU
<
2\\TJ^(f)*TDJ^(f)h\\^a.
From DNl(f)h € nh^x it follows t h a t TDM\(f)h with 77 = 1 leads t o \\DAfi(f)h\\^
€ £* / f \ and applying Lemma 6.3
< 2exP(-^)\\Afx\f)\\aj\\DAfxl(f)h\\ar
It remains t o estimate Sx(f, h). The expectation of DJ\fx(f)h
(6.31)
is simply bounded
by \E(DAf*(f)h)\ < W—^)\\DNi{f)h\\ ar
(6.32)
Note that in the previous cases, we used the properties of the convolution near the origin to bound this quantity according to (6.12). Here, these properties have already been used in the bound (6.31) to extract a second exponentially small factor in $b$. Therefore, inserting (6.31) into (6.32) leads to a better estimate than (6.12). Finally, for $h \in \mathcal{R}^b_\beta$ and $b$ large, one has the following bounds on $M(h)$ and $E(h)$:

$$|M(h)| \le \frac{1}{w_{\alpha\beta}(b)}\,\|h\|_{\alpha\beta}, \qquad |E(h)| \le \frac{b}{w_{\alpha\beta}(b)}\,\|h\|_{\alpha\beta}, \tag{6.33}$$

provided $b > \sqrt{\alpha/\beta}$ for the first inequality, and $b > (1+\sqrt{1+4\alpha\beta})/(2\beta)$ for the second inequality.
Remark. The bounds (6.30), (6.31), (6.32) and (6.33) are implemented in fDN_right to estimate (6.25). The validity conditions of (6.28) and (6.33), namely $b > \sqrt{\alpha/(\beta-\gamma/\lambda)}$ $(> \sqrt{\alpha/\beta})$ and $b > (1+\sqrt{1+4\alpha\beta})/(2\beta)$, are explicitly checked.
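The validity checks and the prefactor of (6.30) can be illustrated with a short sketch. It uses ordinary floating point rather than interval arithmetic, so it is not a rigorous bound; the function names and the parameter values in the usage line are hypothetical and are not taken from the program described in the Remark.

```python
import math

def check_validity(alpha, beta, gamma, lam, b):
    """Return True if b satisfies the validity conditions of (6.28) and (6.33)."""
    rate = beta - gamma / lam              # positive iff gamma/beta < lam
    if rate <= 0:
        return False
    cond_628 = b > math.sqrt(alpha / rate)
    cond_633 = b > (1 + math.sqrt(1 + 4 * alpha * beta)) / (2 * beta)
    return cond_628 and cond_633

def prefactor_630(alpha, beta, gamma, lam, b):
    """Factor 2 e^{-A} exp(-b (beta - gamma/lam)) multiplying ||h|| ||f|| in (6.30)."""
    if not check_validity(alpha, beta, gamma, lam, b):
        raise ValueError("b violates the validity conditions")
    rate = beta - gamma / lam
    A = 2 * math.sqrt(alpha * rate) - alpha * (lam - 1) / b
    return 2 * math.exp(-A) * math.exp(-b * rate)

# Hypothetical parameter values, chosen only for illustration.
print(prefactor_630(alpha=0.5, beta=0.9, gamma=0.9, lam=2.0, b=10.0))
```

The factor decays exponentially in $b$ only when `rate` is positive, which is the second compactness condition $\gamma/\beta < \lambda$.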
6.4. Piecewise Constant Functions

Finally, we consider the case of functions $h$ in $\mathcal{V}$. On this space, the tangent map $D\mathcal{N}_\lambda(f)$ is not a contraction, and the relevant information is contained in the images $D\mathcal{N}_\lambda(f)h$ of the basis vectors $h$ of $\mathcal{V}_{\mathfrak p}$. Therefore, in order to keep track of this information, we need to construct a bound on the tangent map in the sense of Section 3. For $\mathfrak p = \{x_0,\dots,x_n\}$ and $I_i = (x_{i-1}, x_i]$, a basis of $\mathcal{V}_{\mathfrak p}$ is given by $\{\chi_{I_i}\}_{i=1}^n$. Hence, we introduce the following set $X$ of characteristic functions,

$$X = \{\,C\chi_{[a,a+\delta]} \mid C \in \mathbb{R},\ a > 0,\ \delta > 0\,\},$$
and we construct a bound on DAfx : Bag x X —• Ba^ acting from std(Bap)u x std(<;f) to s t d ( B a 7 ) , where we define std(
(6.34)
for $C \in \mathrm{std}(\mathbb{R})$ and $A, B \in \mathrm{std}(\mathbb{R}_+)$. Note that, once a bound on $D\mathcal{N}^1_\lambda : \mathcal{B}_{\alpha\beta} \times X \to \mathcal{B}_{\alpha\gamma}$ has been obtained, composing it with the bounds of Section 4 readily yields bounds on the first two terms of $D\mathcal{N}_\lambda(f)h$, cf. (6.3) and (6.5). To compute the coefficient $\delta_\lambda(f,h)$ in the third term of (6.3), the only missing quantities are the mass and the expectation of $h \in X$. Those are obtained from the equalities

$$M(\chi_{[a,a+\delta]}) = \delta, \qquad E(\chi_{[a,a+\delta]}) = \delta\,(a+\delta/2). \tag{6.35}$$
It remains to construct a bound on DAfx. We consider DHlll
:Ba0xX
- A + B,7 x X —L+£
a r
(6.36)
The reason for this choice is as follows. Some of the functions $h$ will have support close to the origin or far away from the origin. In such cases, we know from the previous sections that the scaling in (6.36) is a very good contraction. Hence, considering (6.36) will automatically yield an extra contraction factor and improve the bound on the convolution between $S_\lambda h$ and the general term of $S_\lambda f$. A bound on $S_\lambda : X \to X$ is easily obtained from

$$S_\lambda\chi_{[a,a+\delta]} = \lambda\,\chi_{[a/\lambda,\,(a+\delta)/\lambda]}.$$
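As a small illustration of the scaling rule and of the equalities (6.35), an element $C\chi_{[a,a+\delta]}$ of $X$ can be stored as a triple $(a, \delta, C)$. The sketch below uses exact rationals in place of standard sets, and the function names are hypothetical; it only checks that the scaling preserves the mass and divides the expectation by $\lambda$.

```python
from fractions import Fraction

def scale_chi(a, delta, c, lam):
    """S_lambda applied to c * chi_[a, a+delta]: returns the new triple (a, delta, c)."""
    return (a / lam, delta / lam, lam * c)

def mass(a, delta, c):
    """Integral of c * chi_[a, a+delta], cf. (6.35)."""
    return c * delta

def expectation(a, delta, c):
    """Integral of x * c * chi_[a, a+delta], cf. (6.35)."""
    return c * delta * (a + delta / 2)

chi = (Fraction(1), Fraction(1, 2), Fraction(3))
scaled = scale_chi(*chi, Fraction(2))
print(scaled, mass(*scaled), expectation(*scaled))
```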
Next, we construct a bound on the convolution defined from std(S^_) u x std(A^) to std(BJT))u, with 7 € [C, 4(]. Let / = p+g, p € Au and g € B
$\pi(p) = (\{x_j\}, \{p_j\})_{j=0}^{n}$, and denote by $\varepsilon$ the mesh of the uniform partition associated with $p$. If $\varepsilon > \delta$, the function $p * h$ takes a simpler form than in the case $\varepsilon < \delta$, and we restrict the domain of our bound to such cases in order to simplify the implementation. Define

$$y_k = a + x_0 + k\varepsilon, \qquad k = 0,\dots,n+1, \tag{6.37}$$

and $I_k = [y_k, y_{k+1}]$, $k = 0,\dots,n$. It is clear from the properties of the convolution that $p * h$ is continuous and has a support equal to $(y_0, y_n+\delta)$. Next, a short computation shows that, provided $\varepsilon > \delta$, $p * h$ is given on the interval $I_k$ by
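$$(p*h)(y_k+\theta) = \begin{cases} \delta\,(p_k - \delta p'_{k-1}/2) + \theta\delta\,p'_{k-1} + \theta^2(p'_k - p'_{k-1})/2, & 0 < \theta < \delta, \\ \delta\,(p_k - \delta p'_k/2) + \theta\delta\,p'_k, & \delta < \theta < \varepsilon, \end{cases} \tag{6.38}$$

with the convention that $p_{-1} = p_{n+1} = 0$, and where

$$p'_k = \frac{p_{k+1} - p_k}{\varepsilon}.$$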
Indeed, one has

$$(p*h)(y_k+\theta) = \int_a^{a+\delta} p(y_k+\theta-x)\,dx = \int_0^{\delta} p(x_k+\theta-\xi)\,d\xi. \tag{6.39}$$

Two cases arise: if $\theta > \delta$, the function $p$ in the above integral is given by

$$p(x) = p_k + (x-x_k)\,p'_k. \tag{6.40}$$

Inserting (6.40) into (6.39) and integrating lead to the second part of (6.38). For $\theta < \delta$, we rewrite (6.39) as

$$(p*h)(y_k+\theta) = \int_0^{\theta} p(x_k+\theta-\xi)\,d\xi + \int_\theta^{\delta} p(x_k+\theta-\xi)\,d\xi. \tag{6.41}$$

In the first term, $p$ is again given by (6.40), whereas in the second term one has

$$p(x) = p_k + (x-x_k)\,p'_{k-1}. \tag{6.42}$$
Inserting (6.40) and (6.42) into (6.41) yields the first part of (6.38). Next, we define the affine part $\tilde p$ of $f * h$ to be the linear interpolation of $p * h$ at the nodes $\{y_k\}$. More precisely, we consider

$$\tilde p = \pi^{-1}\bigl(\{y_k\}_{k=0}^{n+1}, \{\tilde p_k\}_{k=0}^{n+1}\bigr), \tag{6.43}$$

where

$$\tilde p_k = \delta\,(p_k - \delta p'_{k-1}/2), \qquad k = 0,\dots,n+1. \tag{6.44}$$
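The closed form (6.38) can be compared with a direct quadrature of the convolution integral (6.39). The sketch below is only a numerical sanity check in floating point, under the assumption that the piecewise linear function $p$ vanishes at its first and last nodes and outside $[x_0, x_n]$; all function names and the sample data are hypothetical.

```python
def conv_formula(nodes, eps, delta, k, theta):
    """(p*h)(y_k + theta) from (6.38), for 0 <= theta <= eps and delta < eps."""
    def val(j):
        # Nodal value p_j, with the convention p_j = 0 outside 0..n.
        return nodes[j] if 0 <= j < len(nodes) else 0.0
    def slope(j):
        # p'_j = (p_{j+1} - p_j) / eps
        return (val(j + 1) - val(j)) / eps
    pk, sk, skm = val(k), slope(k), slope(k - 1)
    if theta <= delta:
        return (delta * (pk - delta * skm / 2) + theta * delta * skm
                + theta ** 2 * (sk - skm) / 2)
    return delta * (pk - delta * sk / 2) + theta * delta * sk

def p_eval(nodes, x0, eps, x):
    """Piecewise linear p with nodes x_j = x0 + j*eps, zero outside [x_0, x_n]."""
    t = (x - x0) / eps
    n = len(nodes) - 1
    if t <= 0 or t >= n:
        return 0.0
    j = int(t)
    return nodes[j] + (t - j) * (nodes[j + 1] - nodes[j])

def conv_quadrature(nodes, x0, eps, a, delta, y, m=2000):
    """Midpoint rule for the integral of p(y - x) over x in [a, a + delta]."""
    step = delta / m
    return step * sum(p_eval(nodes, x0, eps, y - (a + (i + 0.5) * step))
                      for i in range(m))

nodes = [0.0, 1.0, 3.0, 2.0, 0.0]          # p_0, ..., p_n with p_0 = p_n = 0
x0, eps, a, delta = 0.5, 0.2, 1.0, 0.05    # delta < eps, as required
y = a + x0 + 2 * eps + 0.03                # y_2 + theta with theta = 0.03
print(conv_formula(nodes, eps, delta, 2, 0.03),
      conv_quadrature(nodes, x0, eps, a, delta, y))
```

Setting `theta = 0` in `conv_formula` returns the nodal values $\tilde p_k$ of (6.44).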
99 Note that p e Au. and one gets
Finally, the general term of / * h is given by g = p* h-p + g *h \\9\\^<\\p*h-p\\m
For h = X[a,a+S]i
one
+ \\g\\(„\\h\\
(6.45)
simply uses that IWI C „<<5
sup wCv{x). x£[a,a+6]
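To bound the first term on the RHS of (6.45), we first note that on the interval $I_k$, $k = 0,\dots,n$,

$$\tilde p(y) = \delta\,(p_k - \delta p'_{k-1}/2) + \delta\,(y-y_k)\bigl(p'_k - \delta(p'_k - p'_{k-1})/(2\varepsilon)\bigr).$$

From this formula and the expression (6.38) for $p*h$, one computes for $\theta \in [0,\delta]$,

$$|(p*h-\tilde p)(y_k+\theta)| = \theta\delta\Bigl(1 - \frac{\theta}{2\delta} - \frac{\delta}{2\varepsilon}\Bigr)|p'_k - p'_{k-1}|, \tag{6.46}$$

and for $\theta \in [\delta,\varepsilon]$,

$$|(p*h-\tilde p)(y_k+\theta)| = \frac{\delta^2}{2}\Bigl(1 - \frac{\theta}{\varepsilon}\Bigr)|p'_k - p'_{k-1}|. \tag{6.47}$$

Therefore, integrating (6.46) and (6.47) leads to

$$\|p*h-\tilde p\|_{\zeta\gamma} \le \sum_{k=0}^{n}\sup_{x\in I_k} w_{\zeta\gamma}(x)\int_{I_k}|(p*h-\tilde p)(y)|\,dy = \delta^2\Bigl(\frac14 - \frac{\delta}{6\varepsilon}\Bigr)\sum_{k=0}^{n}\sup_{x\in I_k} w_{\zeta\gamma}(x)\,|p_{k+1} - 2p_k + p_{k-1}|, \tag{6.48}$$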
with the convention $p_{-1} = p_{n+1} = 0$.

Remark. A set $(A, B, C) \in \mathrm{std}(X)$ is represented on the computer by a vector, say fb, with fb(1) = A, fb(2) = B, and fb(3) = C. The scaling $S_\lambda : X \to X$ is implemented in fscale_chi. A bound on the convolution in (6.36) is implemented in the procedure fconv_chi from (6.43), (6.45) and (6.48), where we first check the condition $\varepsilon > \delta$. We note that for the purpose of the proof of Proposition 2.6, the base function $f$ is always represented by the same standard set in $\mathrm{std}(\mathcal{B}_{\alpha\beta})^u$. Hence, the only quantity in (6.44) that may change from basis vector to basis vector is $\delta$. By choice of the partition $\mathfrak p_r$, see Section 7.1, most of the basis vectors have equal $\delta$, and the computation of the $\tilde p_k$'s is carried out only once for such basis vectors. Finally, the bounds on the scaling and on the convolution in (6.36) and the bounds from Section 4 are composed in the subroutine fDN_chi to implement a bound on $D\mathcal{N}_\lambda(f) : X \to \mathcal{B}_{\alpha\gamma}$, $f \in \mathcal{B}_{\alpha\beta}$.
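The constant appearing in (6.48) can be checked by integrating the right-hand sides of (6.46) and (6.47) over a single interval $I_k$. The following worked computation is a reader's sketch; it uses only those two formulas and the identity $p'_k - p'_{k-1} = (p_{k+1} - 2p_k + p_{k-1})/\varepsilon$.

```latex
\begin{align*}
\int_0^{\delta} \theta\delta\Bigl(1-\frac{\theta}{2\delta}-\frac{\delta}{2\varepsilon}\Bigr)\,d\theta
  &= \frac{\delta^3}{2}-\frac{\delta^3}{6}-\frac{\delta^4}{4\varepsilon}
   = \delta^3\Bigl(\frac13-\frac{\delta}{4\varepsilon}\Bigr),\\
\int_{\delta}^{\varepsilon} \frac{\delta^2}{2}\Bigl(1-\frac{\theta}{\varepsilon}\Bigr)\,d\theta
  &= \frac{\delta^2(\varepsilon-\delta)^2}{4\varepsilon},\\
\delta^3\Bigl(\frac13-\frac{\delta}{4\varepsilon}\Bigr)+\frac{\delta^2(\varepsilon-\delta)^2}{4\varepsilon}
  &= \delta^2\Bigl(\frac{\varepsilon}{4}-\frac{\delta}{6}\Bigr),\\
\int_{I_k}|(p*h-\tilde p)(y)|\,dy
  &= \delta^2\Bigl(\frac{\varepsilon}{4}-\frac{\delta}{6}\Bigr)\frac{|p_{k+1}-2p_k+p_{k-1}|}{\varepsilon}
   = \delta^2\Bigl(\frac14-\frac{\delta}{6\varepsilon}\Bigr)|p_{k+1}-2p_k+p_{k-1}|.
\end{align*}
```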
7. The Tangent Maps $D\mathcal{M}_{\lambda_+,\kappa}$

In this section we explain how a uniform upper bound on the contraction rate of the operators $\mathcal{M}_{\lambda_+,\kappa}$ in a neighborhood of the fixed point $f^*$ is obtained for all $\kappa \in [\lambda_-/\lambda_+, 1]$. This will complete the proof of Proposition 2.6. We recall that the operators $\mathcal{M}_{\lambda_+,\kappa}$ are given in terms of the original maps $\mathcal{N}_{\lambda_+,\kappa} = S_\kappa\mathcal{N}_{\lambda_+}$ by

$$\mathcal{M}_{\lambda_+,\kappa} = 1 + M(\mathcal{N}_{\lambda_+,\kappa} - 1), \tag{7.1}$$

where $M$ is some fixed invertible linear map close to the inverse of $1 - D\mathcal{N}_{\lambda_*}(f_{\lambda_*})$. Since $\mathcal{N}_{\lambda_+,\kappa}$ is already a good contraction on the subspaces $\mathcal{B}^{\mathfrak p}_{\alpha\beta}$ for certain partitions $\mathfrak p$, we need $M$ to be different from the identity only on the finite dimensional subspace $\mathcal{V}_{\mathfrak p}$, cf. (6.8). In Section 7.1, we introduce some notation and express the norm of a linear map in $\mathcal{B}_{\alpha\beta}$ in terms of its norms when restricted to $\mathcal{B}^{\mathfrak p}_{\alpha\beta}$ and $\mathcal{V}_{\mathfrak p}$. The description of $M$ is given in Section 7.2. The last section is devoted to the final estimate needed to prove Proposition 2.6.
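7.1. Decomposition of the Operator Norm

Let $\mathfrak p = \{x_0,\dots,x_n\}$ be a partition in $\mathcal{P}_n$. In order to express the projector on $\mathcal{V}_{\mathfrak p}$, we introduce two maps associated with $\mathfrak p$: the finite rank operator $\mathcal{I}_{\mathfrak p} : \mathcal{B}_{\alpha\beta} \to \mathbb{R}^n$ defined by

$$\mathcal{I}_{\mathfrak p} f = \Bigl\{\frac{1}{|I_i|}\int_{I_i} f(x)\,dx\Bigr\}_{i=1}^{n}, \tag{7.2}$$

and $J_{\mathfrak p} : \mathbb{R}^n \to \mathcal{V}_{\mathfrak p}$ defined by

$$J_{\mathfrak p}\{v_i\}_{i=1}^{n} = \sum_{i=1}^{n} v_i\,\chi_{I_i}, \tag{7.3}$$

where $I_i = (x_{i-1}, x_i]$ and $|I|$ is the Lebesgue measure of $I \subset \mathbb{R}$. With this notation, the projector $Q_{\mathfrak p}$ on $\mathcal{V}_{\mathfrak p}$ may be written as

$$Q_{\mathfrak p} = J_{\mathfrak p}\mathcal{I}_{\mathfrak p}. \tag{7.4}$$

Let $A$ be a bounded linear map in $\mathcal{B}_{\alpha\beta}$. One has

$$\|Af\|_{\alpha\beta} \le \|f\|_{\mathfrak p}\,\max\bigl\{\|A|_{\mathcal{V}_{\mathfrak p}}\|,\ \|A|_{\mathcal{B}^{\mathfrak p}_{\alpha\beta}}\|\bigr\}, \tag{7.5}$$

where $\|\cdot\|_{\mathfrak p}$ is the norm in $\mathcal{B}_{\alpha\beta}$ given by

$$\|f\|_{\mathfrak p} = \|Q_{\mathfrak p}f\|_{\alpha\beta} + \|(1-Q_{\mathfrak p})f\|_{\alpha\beta}. \tag{7.6}$$

The norms $\|\cdot\|_{\mathfrak p}$ and $\|\cdot\|_{\alpha\beta}$ are equivalent, with

$$\|f\|_{\alpha\beta} \le \|f\|_{\mathfrak p} \le K^{\alpha\beta}_{\mathfrak p}\|f\|_{\alpha\beta} \tag{7.7}$$

for some constant $K^{\alpha\beta}_{\mathfrak p}$. From the definition of $\mathcal{B}^{\mathfrak p}_{\alpha\beta}$ and its subspaces $\mathcal{L}^{x_0}$, $\mathcal{C}_{\mathfrak p}$, $\mathcal{R}^{x_n}$, it follows that

$$\|A|_{\mathcal{B}^{\mathfrak p}_{\alpha\beta}}\| = \max\bigl\{\|A|_{\mathcal{L}^{x_0}}\|,\ \|A|_{\mathcal{C}_{\mathfrak p}}\|,\ \|A|_{\mathcal{R}^{x_n}}\|\bigr\}. \tag{7.8}$$

Furthermore, one has

$$\|A|_{\mathcal{V}_{\mathfrak p}}\| = \max_{i=1,\dots,n}\|A\eta_i\|_{\alpha\beta}, \tag{7.9}$$

with $\eta_i$ the characteristic function of $I_i$ normalized in $\mathcal{B}_{\alpha\beta}$, i.e.,

$$\eta_i = \Bigl(\int_{I_i} w_{\alpha\beta}(x)\,dx\Bigr)^{-1}\chi_{I_i}. \tag{7.10}$$

Inserting (7.7), (7.8) and (7.9) into (7.5), one gets

$$\|A\| \le K^{\alpha\beta}_{\mathfrak p}\,\max\bigl\{\{\|A\eta_i\|_{\alpha\beta}\}_{i=1}^{n},\ \|A|_{\mathcal{L}^{x_0}}\|,\ \|A|_{\mathcal{C}_{\mathfrak p}}\|,\ \|A|_{\mathcal{R}^{x_n}}\|\bigr\}. \tag{7.11}$$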
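The two maps entering the projector of (7.4) are easy to sketch numerically. In the illustration below, a function is a Python callable, the averages of (7.2) are approximated by a midpoint rule, and the step function of (7.3) is returned as a closure; the names are hypothetical and the computation is in floating point only.

```python
def interval_averages(nodes, f, m=200):
    """Approximate (I_p f)_i, the average of f over I_i = (x_{i-1}, x_i]."""
    averages = []
    for left, right in zip(nodes, nodes[1:]):
        step = (right - left) / m
        total = sum(f(left + (j + 0.5) * step) for j in range(m)) * step
        averages.append(total / (right - left))
    return averages

def step_function(nodes, values):
    """J_p v: the piecewise constant function sum_i v_i chi_{I_i}."""
    def g(x):
        for left, right, v in zip(nodes, nodes[1:], values):
            if left < x <= right:
                return v
        return 0.0
    return g

def project(nodes, f):
    """Q_p f = J_p I_p f."""
    return step_function(nodes, interval_averages(nodes, f))

nodes = [0.0, 0.5, 1.0, 2.0]
q = project(nodes, lambda x: x)
print(interval_averages(nodes, lambda x: x), q(0.75))
```

Applying `interval_averages` to the output of `project` returns the same averages, which reflects $\mathcal{I}_{\mathfrak p}J_{\mathfrak p} = 1$ on $\mathbb{R}^n$ and hence $Q_{\mathfrak p}^2 = Q_{\mathfrak p}$.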
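For $A = D\mathcal{M}_{\lambda_+,\kappa}(f)$, evaluating the quantities in the RHS of this expression will yield the desired bound on the norm of the tangent map of $\mathcal{M}_{\lambda_+,\kappa}$. The bounds obtained in the previous section will allow us to estimate each of the last three quantities in one step, by evaluating in turn $\|D\mathcal{M}_{\lambda_+,\kappa}(f)h\|_{\alpha\beta}$ for all $h$ in the unit balls of $\mathcal{L}^{x_0}$, $\mathcal{C}_{\mathfrak p}$ and $\mathcal{R}^{x_n}$. In contrast, the contractivity of $\mathcal{M}_{\lambda_+,\kappa}$ on $\mathcal{V}$ follows from the specific choice of the operator $M$, and an explicit computation of the $n$ quantities $\|A\eta_i\|_{\alpha\beta}$ is required. This accounts for most of the computation time of the proof.

This leads us to the problem of optimizing the partition $\mathfrak p$ in (7.11) with respect to $A = D\mathcal{M}_{\lambda_+,\kappa}(f)$. Roughly speaking, the size of the intervals in $\mathfrak p = (x_0,\dots,x_n)$ is determined by the contraction rate of $A$ on $\mathcal{C}_{\mathfrak p}$ that we need to obtain. Hence, the number of intervals $n$ is fixed by $x_0$ and $x_n$. In order to minimize $n$, we want to maximize $x_0$ and minimize $x_n$. These two parameters determine the contraction rate of $A$ on $\mathcal{L}^{x_0}$ and $\mathcal{R}^{x_n}$. Increasing $\alpha$ and $\beta$ improves the contraction and allows larger $x_0$ and smaller $x_n$ to be considered. However, large values of $\alpha$ and $\beta$ deteriorate the estimate (2.19) of Proposition 2.6, i.e., the precision of the approximate fixed point. Good values for $\alpha$ and $\beta$ have been found empirically to be $\alpha = 0.5$ and $\beta = 0.9$, for which $x_0 = 0.065$, $x_n = 11.83$ and a (non-uniform) partition of 5050 intervals give the desired bound (2.20). In the sequel, we will refer to this partition as $\mathfrak p_r$ and denote $n_r = 5050$.

We end this section with the computation of the equivalence constant $K^{\alpha\beta}_{\mathfrak p}$. First, we estimate $\|Q_{\mathfrak p}f\|_{\alpha\beta}$. From

$$|(\mathcal{I}_{\mathfrak p}f)_i| = \Bigl|\frac{1}{|I_i|}\int_{I_i} f(x)\,dx\Bigr| \le \frac{1}{|I_i|}\,\sup_{x\in I_i}\frac{1}{w_{\alpha\beta}(x)}\int_{I_i} w_{\alpha\beta}(x)\,|f(x)|\,dx, \tag{7.12}$$

it follows

$$\|Q_{\mathfrak p}f\|_{\alpha\beta} = \sum_{i=1}^{n}|(\mathcal{I}_{\mathfrak p}f)_i|\int_{I_i} w_{\alpha\beta}(x)\,dx \le \max_{i=1,\dots,n}\Bigl(\frac{1}{|I_i|}\,\sup_{x\in I_i}\frac{1}{w_{\alpha\beta}(x)}\int_{I_i} w_{\alpha\beta}(x)\,dx\Bigr)\|f\|_{\alpha\beta}.$$

Hence, the following inequality

$$\|f\|_{\mathfrak p} = \|Q_{\mathfrak p}f\|_{\alpha\beta} + \|(1-Q_{\mathfrak p})f\|_{\alpha\beta} \le \|f\|_{\alpha\beta} + 2\|Q_{\mathfrak p}f\|_{\alpha\beta}$$

implies

$$K^{\alpha\beta}_{\mathfrak p} \le 1 + 2\max_{i=1,\dots,n}\Bigl(\frac{1}{|I_i|}\,\sup_{x\in I_i}\frac{1}{w_{\alpha\beta}(x)}\int_{I_i} w_{\alpha\beta}(x)\,dx\Bigr). \tag{7.13}$$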
Note that the previous upper bound tends to 3 from above when $n$ increases and when the size of each interval goes to zero. Also, the weight contributes to this bound by its largest variation on the intervals $\{I_i\}$. We have already encountered a similar situation, cf. (6.15), and we chose to consider a non-uniform partition with a higher density of nodes where the weight varies strongly. For the partition $\mathfrak p_r$ introduced above, one has $K^{\alpha\beta}_{\mathfrak p_r} < 3.15$.

Remark. An upper bound on the equivalence constant $K^{\alpha\beta}_{\mathfrak p_r}$ is computed in the procedure compute_equiv_const, using swsupint to estimate the second term in the RHS of (7.13). The first and last points in the partition $\mathfrak p_r$ are $x_0 = 0.065$ and $x_{n_r} = 11.83$, respectively. The first 100 (npr1) intervals are uniform with mesh $\varepsilon_{r1} = (x_{n_r} - x_0)\,10^{-4}$ (sepspr1), whereas the remaining 4950 (npr2) intervals are uniform with mesh $\varepsilon_{r2} = 2\varepsilon_{r1}$ (sepspr2). The exponent $-4$ follows from the interval counts, since $100\,\varepsilon_{r1} + 4950 \cdot 2\varepsilon_{r1} = 10^4\,\varepsilon_{r1}$ must equal $x_{n_r} - x_0$.
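The partition described in the Remark and the right-hand side of (7.13) can be reproduced approximately. The sketch below is a floating-point estimate, not the rigorous interval computation of compute_equiv_const. It assumes the weight $w_{\alpha\beta}(x) = \exp(\alpha/x + \beta x)$, a form that is consistent with the exponentials in (6.27) and (6.28) but is not restated in this section; the function names are hypothetical.

```python
import math

ALPHA, BETA = 0.5, 0.9
X0, XN = 0.065, 11.83

def weight(x):
    # Assumed form of w_{alpha, beta}; see the lead-in paragraph.
    return math.exp(ALPHA / x + BETA * x)

def build_partition():
    """Nodes of p_r: 100 intervals of mesh eps1, then 4950 intervals of mesh 2*eps1."""
    eps1 = (XN - X0) * 1e-4
    eps2 = 2 * eps1
    nodes = [X0 + i * eps1 for i in range(101)]
    nodes += [nodes[100] + j * eps2 for j in range(1, 4951)]
    return nodes

def equivalence_constant(nodes, m=16):
    """Floating-point estimate of the RHS of (7.13); m must be even."""
    worst = 0.0
    for left, right in zip(nodes, nodes[1:]):
        step = (right - left) / m
        vals = [weight(left + j * step) for j in range(m + 1)]
        # Composite Simpson rule for the integral of the weight over the interval.
        integral = (vals[0] + vals[-1] + 4 * sum(vals[1:-1:2])
                    + 2 * sum(vals[2:-1:2])) * step / 3
        sup_inverse = 1.0 / min(vals)      # sampled sup of 1/w on the interval
        worst = max(worst, sup_inverse * integral / (right - left))
    return 1 + 2 * worst

nodes = build_partition()
print(len(nodes) - 1, nodes[-1], equivalence_constant(nodes))
```

Under this assumed weight, the largest contribution comes from the first interval, where $\alpha/x^2$ is largest; this is why the mesh is finer near $x_0$.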
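7.2. The Operator M

As mentioned earlier, $M$ should be a good approximation to the inverse of $1 - D\mathcal{N}_{\lambda_*}(f_{\lambda_*})$, and needs to be different from the identity on the finite dimensional space $\mathcal{V}_{\mathfrak p_r}$ only. Hence, for a certain partition $\mathfrak p \in \mathcal{P}_m$ to be chosen later, we write

$$M = \bigl(1 - Q_{\mathfrak p}\,D\mathcal{N}_{\lambda_+}(f^0_{\lambda_+})\,Q_{\mathfrak p}\bigr)^{-1}, \tag{7.14}$$

where $f^0_{\lambda_+}$ is the explicit approximate fixed point of $\mathcal{N}_{\lambda_+}$ entering the statement of Proposition 2.6. The previous expression involves the $m \times m$ matrix

$$A = \mathcal{I}_{\mathfrak p}\,D\mathcal{N}_{\lambda_+}(f^0_{\lambda_+})\,J_{\mathfrak p}, \tag{7.15}$$

and can be rewritten as

$$M = (1 - J_{\mathfrak p}A\,\mathcal{I}_{\mathfrak p})^{-1} = 1 + J_{\mathfrak p}A(1-A)^{-1}\mathcal{I}_{\mathfrak p}. \tag{7.16}$$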
Since we look only for an approximation, the operations involved in the computation of the matrices $A$ and $A(1-A)^{-1}$ need not be exact. Hence, the use of interval analysis is not required here and we will rely on numerics only. The result of this operation will be denoted by $B$, i.e.,

$$B \approx A(1-A)^{-1}. \tag{7.17}$$

With the notation $C_{\mathfrak p} = J_{\mathfrak p}C\,\mathcal{I}_{\mathfrak p}$ for $C$ an $m \times m$ real matrix, $M$ is finally defined by

$$M = 1 + B_{\mathfrak p}. \tag{7.18}$$

We note that the numerical invertibility of $1 - A$ does not imply the invertibility of $M$. Since this property is required in order for the fixed points of $\mathcal{N}_{\lambda_+,\kappa}$ and of $1 + M(\mathcal{N}_{\lambda_+,\kappa} - 1)$ to be in correspondence, we must check that $M$ is indeed invertible. We exhibit a matrix $C$ for which $(1+B)C$ is invertible. This implies that the matrix $1 + B$ is invertible, which in turn ensures the invertibility of $M$. For $C$, we consider the matrix $1 - A$ that has been previously numerically determined. Then, we check rigorously with interval analysis that the matrix $X$ given by

$$X = (1+B)C - 1 \tag{7.19}$$

satisfies

$$\|X\| < 1 \tag{7.20}$$

for some norm on $\mathbb{R}^m$. From this inequality, it then follows that $1 + X = (1+B)C$ is invertible. The norm on $\mathbb{R}^m$ we use in the program is $\|x\| = \max_{i=1,\dots,m}|x_i|$, that is, for $C$ a real matrix with coefficients $\{c_{ij}\}$,

$$\|C\| = \max_{i=1,\dots,m}\sum_{j=1}^{m}|c_{ij}|. \tag{7.21}$$
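The invertibility check of (7.19) and (7.20) can be sketched on a small matrix. Below, $B$ is computed in floating point by Gauss-Jordan elimination, and the defect $X = (1+B)C - 1$ with $C = 1 - A$ is then evaluated exactly with rationals, which here stand in for the interval arithmetic used in the proof. The function names and the sample matrix are hypothetical.

```python
from fractions import Fraction

def gauss_jordan_inverse(mat):
    """Floating-point inverse by Gauss-Jordan elimination with partial pivoting."""
    m = len(mat)
    aug = [list(map(float, row)) + [1.0 if i == j else 0.0 for j in range(m)]
           for i, row in enumerate(mat)]
    for col in range(m):
        pivot = max(range(col, m), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        piv = aug[col][col]
        aug[col] = [v / piv for v in aug[col]]
        for r in range(m):
            if r != col:
                factor = aug[r][col]
                aug[r] = [v - factor * w for v, w in zip(aug[r], aug[col])]
    return [row[m:] for row in aug]

def mat_mul(P, Q):
    return [[sum(P[i][k] * Q[k][j] for k in range(len(Q)))
             for j in range(len(Q[0]))] for i in range(len(P))]

def row_sum_norm(C):
    """Matrix norm (7.21) induced by the max norm on R^m."""
    return max(sum(abs(c) for c in row) for row in C)

def invertibility_defect(A):
    """Exact value of ||(1 + B)(1 - A) - 1|| for the numerically computed B."""
    m = len(A)
    C = [[(1.0 if i == j else 0.0) - A[i][j] for j in range(m)] for i in range(m)]
    B = mat_mul(A, gauss_jordan_inverse(C))          # numerics only, cf. (7.17)
    Bq = [[Fraction(b) + (1 if i == j else 0) for j, b in enumerate(row)]
          for i, row in enumerate(B)]                # exact 1 + B
    Cq = [[Fraction(c) for c in row] for row in C]
    prod = mat_mul(Bq, Cq)
    X = [[prod[i][j] - (1 if i == j else 0) for j in range(m)] for i in range(m)]
    return row_sum_norm(X)

A = [[0.2, 0.1, 0.0], [0.05, 0.3, 0.1], [0.0, 0.2, 0.1]]
print(float(invertibility_defect(A)))
```

A defect below 1 certifies that $(1+B)C$, and hence $1+B$, is invertible, whatever rounding errors occurred while computing $B$.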
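We now discuss the choice of the partition $\mathfrak p$ used in the definition of $M$. This partition will be denoted by $\mathfrak p_s$. Since the decomposition $\mathcal{B}^{\mathfrak p_r}_{\alpha\beta} \oplus \mathcal{V}_{\mathfrak p_r}$ has been introduced in order to isolate the subspace $\mathcal{B}^{\mathfrak p_r}_{\alpha\beta}$ on which $\mathcal{N}_{\lambda_+,\kappa}$ is a contraction, and since the nontrivial action of $M$ should turn $\mathcal{M}_{\lambda_+,\kappa}$ into a contraction on $\mathcal{V}_{\mathfrak p_r}$, it is natural to require $\mathcal{B}^{\mathfrak p_r}_{\alpha\beta} \subset \operatorname{Ker}(B_{\mathfrak p_s})$. This is in particular true if $\mathfrak p_s$ is a subpartition of $\mathfrak p_r$, i.e.,

$$\mathfrak p_s \subset \mathfrak p_r. \tag{7.22}$$

There is no need for $\mathfrak p_s$ to be equal to $\mathfrak p_r$. In particular, $\mathfrak p_s$ could have fewer nodes than $\mathfrak p_r$, which would improve performance with respect to memory and computation time.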
By trial and error, we have determined a small partition which satisfies (7.22) and leads to a contraction on $\mathcal{V}_{\mathfrak p_r}$. This (uniform) partition contains $m_s = 500$ intervals. Hence, $B$ is a $500 \times 500$ matrix with entries in $S$, the set of (safe) representable real numbers. For technical reasons, the matrix $A$ is not computed according to (7.15) with $\mathfrak p = \mathfrak p_s$. This would amount to computing the matrix elements $a_{ij} = (\mathcal{I}_{\mathfrak p_s}D\mathcal{N}_{\lambda_+}(f^0_{\lambda_+})J_{\mathfrak p_s}x_j)_i$, where $\{x_j\}$ is the canonical basis of $\mathbb{R}^{m_s}$. To avoid writing special procedures, we want to use our bound on $D\mathcal{N}_{\lambda_+}$ acting on $X$ even though interval analysis is not required. However, the intervals in $\mathfrak p_s$ are too large for the $J_{\mathfrak p_s}x_j$ to be in the domain of this bound. (Recall the restriction on the domain of the convolution between a characteristic and a piecewise linear function in Section 6.4.) Hence, we first divide each interval in $\mathfrak p_s$ into $d$ subintervals. This leads to a partition $\mathfrak p_t \in \mathcal{P}_{dm_s}$ whose intervals are now small enough for $d = 10$. With $\{y_k\}$ denoting the canonical basis of $\mathbb{R}^{dm_s}$, one has $J_{\mathfrak p_s}x_j = \sum_{i=1}^{d} J_{\mathfrak p_t}y_{d(j-1)+i}$. Next, in order to save some computation time, we exploit the continuity of $D\mathcal{N}_{\lambda_+}(f^0_{\lambda_+})$ to compute an approximated matrix $\tilde A$ given by

$$\tilde a_{ij} = \bigl(\mathcal{I}_{\mathfrak p_s}D\mathcal{N}_{\lambda_+}(f^0_{\lambda_+})\,\tilde J_{\mathfrak p_t}\,y_{k(j)}\bigr)_i, \tag{7.23}$$

where $\tilde J_{\mathfrak p_t} = d\,J_{\mathfrak p_t}$ and $k(j) = d(j - 1/2)$.

We recall that in (7.23), the function $D\mathcal{N}_{\lambda_+}(f^0_{\lambda_+})\tilde J_{\mathfrak p_t}y_{k(j)}$ is given by our bound as a sum $p + g$, with $p \in \mathcal{A}$ and $g$ a general term. For the purpose of computing $\tilde A$, $g$ is discarded and it remains to discuss the map $\mathcal{I}_{\mathfrak p_s} : \mathcal{A} \to \mathbb{R}^{m_s}$. We will need later to evaluate this map rigorously and we now describe how to bound it. Let $\mathfrak p \in \mathcal{P}_m$ and $\pi(p) = (\mathfrak p_p, \cdot)$. Define $\bar{\mathfrak p} = \mathfrak p \cup \mathfrak p_p = \{y_j\}_{j \ge 0}$ and $p_j = p(y_j)$. Then, writing $l_j = (y_{j-1}, y_j)$, one has for $i = 1,\dots,m$,
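We restrict the domain of this bound to those $p$'s for which the support of the partition $\mathfrak p$ contains the support of $p$. By proceeding so, we ensure that no information is lost when projecting on $\mathcal{V}_{\mathfrak p}$.

We end this section by deriving an expression for the operator norm of $M$ in $\mathcal{B}_{\alpha\beta}$. Recall that this quantity was needed in Section 5, cf. (5.15), (5.22) and (5.25). We start with the trivial estimate

$$\|M\| \le 1 + \|B_{\mathfrak p_s}\|, \tag{7.25}$$

and express the norm of the finite rank operator $B_{\mathfrak p_s}$ in terms of the partition $\mathfrak p_s$ and the matrix elements of $B$. Let $\mathfrak p = \{x_0,\dots,x_n\} \in \mathcal{P}_n$, $I_i = (x_{i-1}, x_i]$ and let $C$ be an $n \times n$ matrix with real entries $\{c_{ij}\}$. For $f \in \mathcal{B}_{\alpha\beta}$, one estimates

$$\|J_{\mathfrak p}C\,\mathcal{I}_{\mathfrak p}f\|_{\alpha\beta} = \sum_{i=1}^{n}\Bigl|\sum_{j=1}^{n} c_{ij}(\mathcal{I}_{\mathfrak p}f)_j\Bigr|\int_{I_i} w_{\alpha\beta}(x)\,dx,$$

and, using our previous bound (7.12) on $|(\mathcal{I}_{\mathfrak p}f)_j|$, one gets

$$\|C_{\mathfrak p}\| \le N^{\alpha\beta}_{\mathfrak p}(C), \tag{7.26}$$

where

$$N^{\alpha\beta}_{\mathfrak p}(C) = \max_{j=1,\dots,n}\Bigl[\Bigl(\frac{1}{|I_j|}\,\sup_{x\in I_j}\frac{1}{w_{\alpha\beta}(x)}\Bigr)\sum_{i=1}^{n}|c_{ij}|\int_{I_i} w_{\alpha\beta}(x)\,dx\Bigr]. \tag{7.27}$$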
Remark. A bound on the map $\mathcal{I}_{\mathfrak p} : \mathcal{A} \to \mathbb{R}^n$ is implemented in the procedure projection, checking first the condition on its domain of definition and using fadd to compute $\bar{\mathfrak p}$ and $\{p_j\}$. In the procedure compute_matrix, the matrix $B$ (bm) is computed using (7.17) and (7.23). A call to show_invertibility verifies that $1 + B$ is invertible. For the numerical inversion of $1 - A$, we use the standard algorithm of Gauss elimination, implemented in the subroutine gaussj. The operator norms of $M$ and $B_{\mathfrak p_s}$ are computed in compute_matrix_norm. Finally, the partition $\mathfrak p_s$ satisfies $\operatorname{supp}(\mathfrak p_s) = \operatorname{supp}(\mathfrak p_r)$ and is uniform with mesh $\varepsilon_s = 10\,\varepsilon_{r2}$ (sepsps). Hence, it contains 500 (mps) intervals.

7.3. Existence of the Family of Fixed Points: Second Estimate

In this section, we derive a uniform bound on the norm of the tangent maps $D\mathcal{M}_{\lambda_+,\kappa}(f)$ for $\kappa$ in $[\lambda_-/\lambda_+, 1]$, and $f \in B_r(f^0_{\lambda_+}) \subset \mathcal{B}_{\mu\nu}$ with $\mu = 0.5$, $\nu = 0.9$ and r = 9 • 1 0 _ i. In the sequel, we set $\delta = \lambda_-/\lambda_+$. By definition, one has

$$D\mathcal{M}_{\lambda_+,\kappa}(f) = 1 + M\bigl(S_\kappa D\mathcal{N}_{\lambda_+}(f) - 1\bigr). \tag{7.28}$$
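For $f \in \mathcal{B}_{\alpha\beta}|_H$, $D\mathcal{N}_\lambda(f)$ is bounded from $\mathcal{B}_{\alpha\beta}$ to $\mathcal{B}_{\alpha\gamma}$ provided $\gamma/\beta < \lambda < 4$. Hence, with $\beta = \nu$ and $\gamma = \nu/\kappa$, $S_\kappa D\mathcal{N}_{\lambda_+}(f)$ is bounded as a map from $\mathcal{B}_{\mu\nu}$ to $\mathcal{B}_{\mu\nu}$ provided $1/\kappa < \lambda_+ < 4$. One concludes that for all $\kappa \in [\delta, 1]$, $D\mathcal{M}_{\lambda_+,\kappa}(f)$ is bounded as a map from $\mathcal{B}_{\mu\nu}$ to $\mathcal{B}_{\mu\nu}$ provided $1/\delta < \lambda_+ < 4$. For the values of $\lambda_+$ and $\lambda_-$ as given in the statement of Proposition 2.6, the previous condition is satisfied. To estimate the norm of $D\mathcal{M}_{\lambda_+,\kappa}(f)$ in $\mathcal{B}_{\mu\nu}$, we proceed as outlined in Section 7.1 and bound each term on the RHS of (7.11). We start with the simple case $h \in \mathcal{B}^{\mathfrak p_r}_{\mu\nu}$. The property $\mathfrak p_s \subset \mathfrak p_r$ implies $B_{\mathfrak p_s}h = 0$, so that $Mh = h$. Hence,

$$\|D\mathcal{M}_{\lambda_+,\kappa}(f)h\|_{\mu\nu} \le \|M\|\,\|S_\kappa D\mathcal{N}_{\lambda_+}(f)h\|_{\mu\nu} \le \|M\|\,\|D\mathcal{N}_{\lambda_+}(f)h\|_{\mu(\nu/\kappa)} \le \|M\|\,\|D\mathcal{N}_{\lambda_+}(f)h\|_{\mu(\nu/\delta)}, \tag{7.29}$$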
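for all $\kappa \in [\delta, 1]$. An upper bound on $\|M\|$ was described in the previous section. Representing $f$ by the standard set in $\mathrm{std}(\mathcal{B}_{\mu\nu})^u$ whose affine part is the singleton $\{f^0_{\lambda_+}\}$ and whose general term has norm $r$, the bounds of Section 6 yield 0.85 as an upper bound on the RHS of (7.29) for all $h$ in the unit ball of $\mathcal{L}^{x_0}$, $\mathcal{C}_{\mathfrak p_r}$, and $\mathcal{R}^{x_n}$.

Next we consider the more delicate case of $h \in \mathcal{V}_{\mathfrak p_r}$. According to (7.11), one has to estimate the $n_r$ quantities $\|D\mathcal{M}_{\lambda_+,\kappa}(f)\eta_i\|_{\mu\nu}$, where $\eta_i$ is the normalized characteristic function of the $i$th interval $I_i$ in the partition $\mathfrak p_r$. Recalling that $M = 1 + B_{\mathfrak p_s}$, we get from (7.28)

$$D\mathcal{M}_{\lambda_+,\kappa}(f)\eta_i = S_\kappa D\mathcal{N}_{\lambda_+}(f)\eta_i + B_{\mathfrak p_s}\bigl(S_\kappa D\mathcal{N}_{\lambda_+}(f) - 1\bigr)\eta_i. \tag{7.30}$$

The bound on the map $D\mathcal{N}_\lambda(f) : X \to \mathcal{B}_{\alpha\gamma}$ previously constructed yields the function $D\mathcal{N}_{\lambda_+}(f)\eta_i$ represented by a standard set in $\mathrm{std}(\mathcal{A})$ and a general term. We denote the former by $p_i$ and the latter by $g_i$, i.e.,

$$D\mathcal{N}_{\lambda_+}(f)\eta_i = p_i + g_i,$$

and rewrite (7.30) as

$$D\mathcal{M}_{\lambda_+,\kappa}(f)\eta_i = S_\kappa(p_i+g_i) + B_{\mathfrak p_s}\bigl(S_\kappa(p_i+g_i) - \eta_i\bigr) = MS_\kappa g_i + S_\kappa p_i + B_{\mathfrak p_s}(S_\kappa p_i - \eta_i). \tag{7.31}$$

The norm of the first term is bounded as before for all $\kappa \in [\delta, 1]$ by

$$\|MS_\kappa g_i\|_{\mu\nu} \le \|M\|\,\|g_i\|_{\mu(\nu/\delta)}. \tag{7.32}$$

To treat the remaining terms in (7.31), we express them as

$$S_\kappa p_i + B_{\mathfrak p_s}(S_\kappa p_i - \eta_i) = S_\kappa\bigl(p_i + B_{\mathfrak p_s}(p_i - \eta_i)\bigr) + B_{\mathfrak p_s}(S_\kappa - 1)p_i + (1 - S_\kappa)B_{\mathfrak p_s}(p_i - \eta_i). \tag{7.33}$$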
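for $\mathfrak p \in \mathcal{P}_n$ and $C$ an $n \times n$ matrix, restricted to intervals $I$ satisfying $I \subset I_i$ for some interval $I_i$ in the partition $\mathfrak p$. A bound on $p \mapsto \mathcal{I}_{\mathfrak p}p$ has already been discussed in the previous section. Denoting by $I_i$ the $i$th interval in the partition $\mathfrak p$, one has $(\mathcal{I}_{\mathfrak p}\chi_I)_i = 0$ if $I \cap I_i = \emptyset$, and otherwise

$$(\mathcal{I}_{\mathfrak p}\chi_I)_i = |I|/|I_i|. \tag{7.36}$$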
A bound on the map C : H —> TR is readily implemented with interval analysis, and it only remains to consider the map (p,v) >->• | | p + Jpv\\ , (p,v) e A x ]R". Let us denote 7r(p) = (p p , •), p = p U p p = {yj}^L0 and p^- = p(2A,). Imposing the restriction supp(p p ) C supp(p), one obtains II _l_ T II
^V"Y^ i=i
j
/ M f J ^ + ^il + I P j - l + ^ l
,_,_.
is/,
where $l_j$ stands for the $j$th interval in $\bar{\mathfrak p}$. This finishes the construction of a bound on the map (7.35), which, given standard sets containing $p_i$ and $\eta_i$, provides an estimate on the RHS of (7.34). Next, the second term in (7.33) is simply bounded by

$$\|B_{\mathfrak p_s}(S_\kappa - 1)p_i\|_{\mu\nu} \le \|B_{\mathfrak p_s}\|\,\|(S_\kappa - 1)p_i\|_{\mu\nu}. \tag{7.38}$$
The operator norm of B p > has been determined in the previous section, and Lemma 5.1 provides a bound uniform in K for the second factor, namely
II(SK - l ) f t | U < (1 - S){\\Pi\\^ +
\\XPX(»/5))-
To treat the last term in (7.33), we use the L e m m a 7.1. Let p = { x 0 , . . . , ar n } €Vn, C annxn matrix with coefficients {c^}, and 0 < K < 1. Then the operator norm of (1 — SK)Cp in Ba/3 satisfies \\(l-SK)Cp\\
(7.39)
is given by (7.27) and
Rf(K) = (!-.) + ^
^ w^{x)dx(£^'
Wa^x)dx
The only dependence on K in (7.39) is in the factor RP0(K). is decreasing in K. Hence, one obtains
||(i - sK)BPm(Pi - vM^
+ Jj
wa0(X)dX).
Furthermore,
< J C ^ W ^ N I I f t l U +!).
RP0(K)
( 7 - 4 °)
for all $\kappa \in [\delta, 1]$. Finally, a bound on $\|D\mathcal{M}_{\lambda_+,\kappa}(f)\eta_i\|_{\mu\nu}$ follows from (7.32), (7.34), (7.38) and (7.40). As mentioned earlier, computing this bound for the $n_r = 5050$ basis vectors $\eta_i$ of $\mathcal{V}_{\mathfrak p_r}$ accounts for most of the computation time. In the terms involving $\eta_i$ explicitly, one can use linearity to factor out the value of $\eta_i$, that is, $(\int_{I_i} w_{\mu\nu})^{-1}$. Therefore, one only needs to compute an upper bound on this quantity. Furthermore, by proceeding like this one can take advantage of the fact that the value of $\chi_{I_i}$ can be represented by the standard set containing only the representable number one. This leads to a standard set containing $p_i + g_i$ which is more localized and improves the quality of the final bound. We end this section with the Proof of Lemma 7.1. For $f \in \mathcal{B}_{\alpha\beta}$, one has
||(1 - SK)Cpf\\a/3 = /
wa0(x) (1 - SK) E E ^ ^ A X . W \dx
^ E KV)il E M / ^ W K 1 - sK)xiM)\dx-
(7-41)
Furthermore, one has
JO
Jxi-i
Jli
JXi
Factorizing fj wap in the previous expression and inserting the bound (7.12) on \(1 into (7.41) finally leads to (7.39).
f)t\
Remark. A bound on the map $(p, v) \mapsto \|p + J_{\mathfrak p}v\|_{\alpha\beta}$ is implemented in the procedure snorm_add, and the product $Cv$ is implemented in linear_app. A uniform bound on the norm of $(1 - S_\kappa)B_{\mathfrak p_s}$ is computed in compute_norm_of_lmSB. Given $i \in \{1,\dots,n_r\}$, the subroutine init_chi returns both a standard set in $\mathrm{std}(X)$ containing $\chi_{I_i}$ and the value of $\eta_i$, whereas the subroutine fDM_chi computes a bound on $\|D\mathcal{M}_{\lambda_+,\kappa}(f)\eta_i\|_{\mu\nu}$. Finally, for all $f \in B_r(f^0_{\lambda_+})$ and $\kappa \in [\delta, 1]$, a uniform bound on the norm of the tangent maps $D\mathcal{M}_{\lambda_+,\kappa}(f)$ is implemented according to (7.11) in compute_norm_of_DM.
Acknowledgments J.W. would like to thank Jean-Pierre Eckmann, Peter Wittwer, and the Department of Theoretical Physics at the University of Geneva for their warm hospitality while part of this work was carried out. A.S. and P.W. would like to thank Jan Wehr and the Department of Mathematics at the University of Arizona at Tucson for their warm hospitality and for providing us generously with computer resources while part of this work was carried out.
Appendix Proof of P r o p o s i t i o n 2 . 3 . If for some fixed A e (1,4) and a,/3>0, fx is a fixed point of Mx and belongs to Bap\H, then Remark 1.2 a n d Proposition 2.2 imply that / A e B. We now prove that, in addition, / A is at least once differentiable, with f'x G B. The regularization properties of the convolution imply then immediately that fx is of class C°°(1R + ). For (,T] > 0, let B^v denote the Sobolev space of functions in 5^ with one (distributional) derivative in B c „, i.e., with the norm
ll/l&, = ll/llc„ + l l / V In the sequel, we adopt the shorter notation B( = #<-<- and B) = Bis. One shows that the fixed point / A belongs t o B^ for all £ > 0 by the following argument. One exhibits an h 6 B\ and two sequences {/„}„> 0 and {fir„}„>0 satisfying fx=h + fn+gn for all n > 0, such t h a t { / n } n > 0 is Cauchy in B^ and {gn}n>0 converges to zero in B^. Hence, / A is equal in B,- to a function belonging to B\. Since JVA preserves the regularity, this function is also a fixed point of JV A . Therefore, it is equal to / A in B). We first construct recursively the sequences { / „ } n > 0 and {„}„>o- Since / A belongs to B(- for all C, > 0, and since CQ°(1R_|_) is dense in B^1 there exist for every 5Q > 0 an h e B^ and a g0 € B^ satisfying h = h + g0, (A.1) with llffollc < V
(A.2)
/o = 0-
(A.3)
Moreover, one defines Denoting c A (/ A ) = cA and J7 x = cxAfx + C 2 JV A , we now define for all n > 0, fn+i=rfx(h gn+1
+ fn)-h
=A7"A(SJ>
+
Cx(fn,gn),
110 where
Cx(f,g) =77x(h + f + g) -77x(h + f) -77x(g). Note that Cx(f,g) contains only cross terms between h + f and g. We now check t h a t the sequences {/„}„> 0 and {gn}n>0 have the desired properties, i.e., fx = h + fn + gn, {gn}n>o converges to zero in B^, and {/„}„> 0 is Cauchy in B^. Since fx is a fixed point of 77x, it first follows from (A.l) and (A.4) that fx = h + fn + gn, for all n > 0. Furthermore, C > 0 and A e (1,4) together with Proposition 2.2 imply that {<7„}„>0 converges to zero in B<*. Indeed, 77x is well defined as a map from B,- to B(-, and the bounds obtained in the proof of Proposition 2.2 lead to llff»llc^ g Allff„-lll<+«2llff B -lllcApplying this inequality recursively and using ||p 0 ||^ < S0 < 1, one gets for all n > 1
HflJIcf^2",
(A.5)
where 5 = (cx + c2)S0 < 1 for S0 small enough. Note that (A.l), (A.2) and (A.5) imply, for 50 small enough, the uniform bound
\\fnk<\\fx-h\\(
+ \\Sn\k
< 2S0.
(A.6)
Next, in order to show that fn G B^ for all n > 0, one proceeds as in Proposition 2.2 and studies the maps which enter the definition of J7X and Cx, i.e., Sx, T and the convolution operator. From (2.8) and ( / * g)' = f' * g, it follows that
ll/*ffll(4.)^ll/ll-IHU, whereas (2.9) together with A > 1 and (Sxf)'
(A.7)
= XSxf' leads to
ll^/li(V/A)(A.)
(A.8)
(A.7) and (A.8) imply in particular IISA(/*9)ll(4C/A)(*c)
(A.9)
Ill We now show that for all r > r', T is a bounded operator from B\T to B\,a. One has \\Tf\\T,a = 11/IU < H/IU, and using (Tf)'(x) = - 1 ( ^ ( T / ) ^ ) + (T/')(x)), one gets l l ( r / ) ' | U < 2 / x- « / T , » |T/|( a ;)dr+ / -x , « V » |T/'|(x)dx Jo Jo /»oo
= 2/
/*oo
aru;
Jo
XVT'(*)I/'I(*)<^
JO
< 2 s u p Ii±^ f (£)| l / ,|i T *>0
W^r (2;)
(A-10)
where CTT, is finite as long as r > T'. In particular, since A > 1, (A.9) and (A.IO) imply l|T5 A (/* 5 )||J ( 4 C / A )
* g) * TSx(g * g))||* < C||/||J||ff|| c ||p|| c P|| c .
(A.ll)
Therefore, 77x is well defined as a map from Bl to B} for £ > 0 and A € (1,4). Assume now that fn € B^ and that S0 is small enough. Then, (A.6), (A.9), and (A.ll) lead to Wx(f*
+ fn)\\l
< WxW\\l+C\\fJ((Ml
+
\\fn\\l)
( A - 12 )
< Cl + ^ll/nllo for some positive 5 < 1. Similarly, using (A.5) and (A.6), one gets \\Cx(f«,9j\\\
< C\\gJS\fJl
+ \\h\\\).
(A-13)
Therefore, (A.12) and (A.13) lead, together with (A.5), to Wfn+l\\l
(A-14)
112 Finally, we check that the sequence {/ n }„> 0 is Cauchy in Bl. Since (A.5), (A.13) and (A.14) imply t h a t lim \\Cx(fn,gn)\\l = 0, it only remains to show that {Afx{h + fn)}n>0 sequence {/i„}„> 0 , with
is Cauchy in B^. We first verify that the
hn=Ml{h + fn), is Cauchy in S(4£/A)(AO' Since A G (1,4), this implies in particular the convergence of {Atf(h + / J } „ > 0 in B*. Defining Afxl(f,g)
=
Sx{f*g),
one has ||fc„ - hm\\lT < 2\\Ml(h, /„) -J%(h, 1
< 2||A^ (/J,/„ - fm)\\iT
fm)\\iT
+ \\ATlx(fn) -
+ \\tft(fn, fn - fm)\\lT
Afl{fm)\\lT
+ ||j\ft(/ m , fm -
fn)\\lT,
which leads, with (A.9) and (A.14), to
\\K - ^Ilk/Ax*) < c(2\\h\\\ + \\fJl + \\U\l) \\fn - / j | c
+ fn) =
F(hn),
where
F(f) =
T(Tf*Tf),
and that the bounds obtained above imply the continuity of F as a m a p from #(4C/A)(AO to B^ for C > 0 and A e (1,4). Hence, the convergence of {Mx(h + / „ ) } in B^ follows from the convergence of {hn} in $(4(-/A)(A<:V
Proof of L e m m a 4 . 1 . For x e / j , the functions p and Tp are given by p(x) = -{pi{x
- x^j)
+ pi_l{xi
- x)J,
= ^ ( 4 - i f t - i ( * i - x) + x3iPi(x - *<-i))»
(A.15)
(A-16)
113 and one computes \p(x)-Tp(x)\=\p(x)-p(l/x)/x2\
= Jh-i(^-»)(i-^r)-ft^-^-i)(^-i)| (X — Xi_1)(Xi
=
< £flfr-ft-il —
— X) .
^5 4V
2
IPi-i^-^i-i) +
\xiPi ~ xi-iPi-i\
-Pi(*i-X)
+
~ MxiPi ~
x3
|
x
\*1PJ-XLIPJ-I\\
a;2
a;
2
i-iPi-i)\ , A 17)
/
Integrating the expression on the RHS of (A.17) leads to the stated result.
Proof of Lemma 4.2. By definition of A, it is clear that (p * a)" 6 A- Furthermore, because p and a have a uniform partition with identical mesh size e, {p * a)" is also defined on a uniform partition. It is given by {zk}%L0 where zk is defined in (4.20). It remains to compute vk = (p*a)"{zk). With p'(x) = TZ=iP'iXiM) a n d St) = E J U ^ - X J / * ) , where Pi = (Pi - Pi-i)/£ a n d v'j = Wj - Vj-i)/e, o n e S e t s
v
k = £ Yl far i+j=k+l
Expressing the RHS of the previous equality in terms of the coefficients (4.19) finally leads to the relation (4.21).
Proof of Lemma 4.4. For k = 21, I = 0,...,n - 1, and 6 6 (0, 2e), one has 9_ p(zk + 0) = C0(k) + ^(c0(k 2e'
+ 2) - C0(kj).
The continuity properties of p * a imply C0(k + 1) = C0(k) + eC^k) + e2C2{k) + e3C3(k), Cx{k + 1) = C^k) + 2eC2(k) + 3e2C3(k), C2(k + 1) = C2(k) + 3eC3(k), k = 0,..., 2n — 1. For 9 € (0, e), these relations allow us to write 5e p(zk + 9) = C0(k) +flC1(fc) +eO(2C2(k + 1) - -|c 3 (fc) + |c 3 (fc + l)),
114 and, from (4.23), (p * a)(zk + 9) = C0(k) + 9Cx{k) + 92{c2(k
+ 1) - (3e -
9)C3{k)).
Hence, (p - p * a)(zk + 9) = 0(2e - 0)C2(k + 1) + fl(0(3e - 0) - ^ - ) c 3 ( f c ) + ^ C 3 ( & + 1), which leads to the estimate \(p-p**)(y)\dy<e3(-\C2(k
+ l)\ + -\C3(k)\
+ ^\C3(k + l)\).
For 9 e (e, 2e), one proceeds similarly and obtains \V>-p*
+ l)\ + -\C3(k)\
+ l\C3(k
+ l)\),
and the bound (4.27) follows immediately.
[LR3] de la Llave, R. and D. Rana: Accurate strategies in K.A.M. problems and their implementation. In: Computer Aided Proofs in Analysis, K. Meyer and D. Schmidt (eds.), T h e IMA Volumes in Mathematics 28, 127-146 (1991). [M] Mestel, B.D.: A computer assisted proof of universality for cubic critical maps of the circle with golden mean rotation number. Ph.D. Thesis, Math. Dept., University of Warwick (1985). [N] Niirnberger, G.: Approximation
by spline functions, Springer-Verlag, Berlin (1989).
117 [PFTV] Press, W.H., B.P. Flannery, S.A. Teukolsky and W.T Vetterling: Numerical Recipes in Fortran: The Art of Scientific Computing, Cambridge University Press, Cambridge, 2nd ed. (1992). [R] Rana, D.: Proof of accurate upper and lower bounds to stability domains in small denominator problems. Ph.D. Thesis, Princeton University (1987). [Sel] Seco, L.: Lower bounds for the ground state energy of atoms. Ph.D. Thesis, Princeton University (1989). [Se2] Seco, L.: Computer Assisted Lower Bounds for Atomic Energies. In: Computer Aided Proofs in Analysis, K. Meyer and D. Schmidt (eds.), The IMA Volumes in Mathematics 28, 241-251 (1991). [Sh] Shneiberg, I.: Hierarchical Sequences of Random Variables. Theory Probab. Appl. 3 1 , 137-141 (1987). [St] Stirnemann, A.: Existence of the Siegel disc renormalization fixed point. Nonlinearity 7, 959-974 (1994). [SS] Schlosser, T. and H. Spohn: Sample-to-Sample Fluctuations in the Conductivity of a Disordered Medium. J. Stat. Phys. 69, 955-967 (1992). [SW] Stinchcombe, R.B. and P.B. Watson: Renormalization group approach for percolation conductivity. J. Phys. C: Solid State Phys. 9, 3221-3247 (1976). [Wl] Wehr, J.: A strong law of large numbers for iterated functions of independent random variables. J. Statist. Phys. 86, no 5-6, 1373-1384 (1997). [W2] Wehr, J.: A lower bound on the variance of conductance in random resistor networks. J. Statist. Phys. 86, no 5-6, 1359-1365 (1997). [WW] Wehr, J. and J.-M. Woo: A central limit theorem for nonlinear hierarchical sequences of random variables. Annals of Probability (to appear). [Z] Ziman, J.M.: The localization of electrons in ordered and disordered systems: I. Percolation of classical particles. J. Phys. C 1, 1532-1538 (1968).
MATHEMATICAL PHYSICS ELECTRONIC JOURNAL ISSN 1086-6655 Volume 6, 2000 Paper 4 Received: Jun 17, 2000, Revised: Jul 17, 2000, Accepted: Aug 14, 2000 Editor: J. Avron
Degenerate space-time paths and the non-locality of quantum mechanics in a Clifford substructure of space-time Kaare Borchsenius *
Abstract The quantized canonical space-time coordinates of a relativistic point particle are expressed in terms of the elements of a complex Clifford algebra which combines the complex properties of SL(2,C) and quantum mechanics. When the quantum measurement principle is adapted to the generating space of the Clifford algebra we find that the transition probabilities for twofold degenerate paths in space-time equal the transition amplitudes for the underlying paths in Clifford space. This property is used to show that the apparent non-locality of quantum mechanics in a double slit experiment and in an EPR type of measurement is resolved when analyzed in terms of the full paths in the underlying Clifford space. We comment on the relationship of this model to the time symmetric formulation of quantum mechanics and to the Wheeler-Feynman model. *Bollerisvej 8, 3782 Klemensker, Denmark, e-mail: [email protected]
1
Substructure of the canonical space-time coordinates
The fact that half-integer spin representations of the Lorentz group are realized in nature casts doubt on the assumption that space-time is a primary space. More specifically, as pointed out by Penrose [1], the fact that different spatial directions of a spin-one-half particle correspond to different complex linear combinations of the two quantum states suggests that there is a direct connection between the structure of space and the need for complex state vectors in quantum mechanics. Taken together, considerations like these point to the existence of a substructure of space-time which combines the complex properties of the Lorentz group and quantum mechanics. Substructures of space-time have been discussed in Schwarz and Van Nieuwenhuizen [2] and in Borchsenius [3, 4, 5]. To determine the nature of such a complex substructure of space-time we shall use the canonical quantization of a relativistic point particle as a model. We shall adopt Dirac's method in which space and time are treated on an equal footing, both being regarded as functions of a parameter-time τ. Reparametrization invariance imposes a constraint which can be used to define a Hamiltonian together with a set of canonical variables. The quantization results in a set of hermitian canonical space-time coordinates, the components of which satisfy
$(X^{\mu}_{ab})^{*} = X^{\mu}_{ba}$ (1)
These components transform under a Lorentz transformation in the index μ and under a unitary change of basis in Hilbert space in the indices a and b. To bring out the complex properties of the Lorentz group, we make use of the connection between a real four-vector and a second-rank hermitian spinor
$V^{\mu} = \tfrac{1}{2}\,\sigma^{\mu}_{A\dot{B}}\,V^{A\dot{B}}, \qquad V^{A\dot{B}} = \sigma_{\mu}^{A\dot{B}}\,V^{\mu}$ (2)
where $\sigma_{\mu}$ are the four hermitian Pauli matrices. The spinor form of the canonical space-time coordinates exhibits two hermitian properties, one related to SL(2,C) and the other to the unitary group in Hilbert space. To find a substructure of X corresponding to these two groups, we observe that the components
$X^{A\dot{B}}_{ab} = \sigma_{\mu}^{A\dot{B}}\,X^{\mu}_{ab}$ (3)
form a hermitian matrix in the combined indices (A, a) and (B, b):
$(X^{A\dot{B}}_{ab})^{*} = X^{B\dot{A}}_{ba}$ (4)
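The correspondence (2) between a real four-vector and a hermitian second-rank spinor can be checked numerically. The following is a minimal sketch, assuming the standard Pauli matrices with $\sigma_0$ the identity and ignoring the distinction between upper and lower Lorentz indices in the inversion formula; the function names are hypothetical and chosen only for this illustration.

```python
import numpy as np

# The four hermitian Pauli matrices sigma_0..sigma_3 (sigma_0 is the identity).
SIGMA = np.array([
    [[1, 0], [0, 1]],
    [[0, 1], [1, 0]],
    [[0, -1j], [1j, 0]],
    [[1, 0], [0, -1]],
], dtype=complex)


def vector_to_spinor(v):
    """Map a real four-vector v to the hermitian 2x2 matrix sum_mu v[mu] * sigma_mu."""
    return np.einsum("m,mab->ab", np.asarray(v, dtype=complex), SIGMA)


def spinor_to_vector(V):
    """Recover the components via v[mu] = (1/2) tr(sigma_mu V)."""
    return np.real(0.5 * np.einsum("mab,ba->m", SIGMA, V))


def minkowski_norm(v):
    """v_0^2 - v_1^2 - v_2^2 - v_3^2, which equals det of the spinor form."""
    return v[0] ** 2 - v[1] ** 2 - v[2] ** 2 - v[3] ** 2
```

The determinant of the 2x2 matrix reproduces the Minkowski norm of the four-vector, which is why SL(2,C) transformations of the spinor (they preserve the determinant) act as Lorentz transformations on the vector.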
As shown in the appendix, any hermitian matrix can be expressed in terms of the elements of a complex Clifford algebra according to (66). For the canonical space-time coordinates (4) this implies that there exists a complex Clifford algebra with elements $c^{A}_{a}$ so that
$X^{A\dot{B}}_{ab} = \{c^{A}_{a}, c^{*\dot{B}}_{b}\}, \qquad \{c^{A}_{a}, c^{B}_{b}\} = 0$ (5)
The complex linear space which generates the Clifford algebra, and to which the c's belong, we shall call Clifford space, and we shall refer to its elements as Clifford coordinates, borrowing from space-time terminology. To write (5) in abstract form we shall adopt the following notation. The components $c^{A}_{a}$, which transform like a right-handed two-component spinor in the index A and as a ket vector in the index a, shall be written as $\overset{>}{C}{}^{A}$, where the ket on top is used to distinguish it from a quantum operator and an ordinary eigenvector. Likewise $c^{*\dot{B}}_{b}$ will be written as the bra vector $\overset{<}{C}{}^{\dot{B}} = (\overset{>}{C}{}^{B})^{\dagger}$, where † performs both the complex involution of the Clifford algebra and the quantum conjugation in Hilbert space. The commutator between a ket vector $\overset{>}{\chi}$ and a bra vector $\overset{<}{\psi}$ shall be defined as
$\{\overset{>}{\chi}, \overset{<}{\psi}\}_{ab} = \{\chi_{a}, \psi_{b}\}, \qquad \{\overset{<}{\psi}, \overset{>}{\chi}\} = \{\psi_{a}, \chi_{a}\}$ (6)
that is, we adopt the convention that the order of the ket and bra vectors in the first term in the commutator determines whether both terms are direct products or contractions. With this notation, (5) can be written in the abstract form
$X^{A\dot{B}} = \{\overset{>}{C}{}^{A}, \overset{<}{C}{}^{\dot{B}}\}, \qquad \{\overset{>}{C}{}^{A}, \overset{>}{C}{}^{B}\} = 0$ (7)
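X and C can be expressed in terms of a complete set of eigenstates $|x_{r}\rangle$ and their eigenvalues (summation over the repeated index r is implied):
$X^{\mu} = |x_{r}\rangle\, x^{\mu}_{r}\, \langle x_{r}|$ (8)
$\overset{>}{C}{}^{A} = |x_{r}\rangle\, c^{A}_{r}, \qquad c^{A}_{r} \overset{\mathrm{def}}{=} \langle x_{r}|\, \overset{>}{C}{}^{A}$ (9)
When these expressions are inserted into (7) we obtain
$\{c^{A}_{r}, c^{*\dot{B}}_{s}\} = \delta_{rs}\, x^{A\dot{B}}_{r}, \qquad \{c^{A}_{r}, c^{B}_{s}\} = 0$ (10)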
Hence the eigenvalues of X are determined by a set of mutually orthogonal elements $c^{A}_{r}$ of the Clifford algebra. To make our discussion more transparent we shall refer to these elements as 'eigenvalues' and write the eigenstates $|x_{r}\rangle$ as $|c_{r}\rangle$. By use of (7) we obtain the expression for the expectation value of X in the state $|s\rangle$
$\langle s|X^{A\dot{B}}|s\rangle = \langle s|\{\overset{>}{C}{}^{A}, \overset{<}{C}{}^{\dot{B}}\}|s\rangle = \{c^{A}, c^{*\dot{B}}\}$ (11)
$c^{A} \overset{\mathrm{def}}{=} \langle s|\, \overset{>}{C}{}^{A}$ (12)
(12) are the Clifford coordinates corresponding to the expectation value of the space-time coordinates. Applying (9) they become
$c^{A} = \langle s|x_{r}\rangle\, c^{A}_{r}$ (13)
The relationship of this equation to the expression for the expectation value of the space-time coordinates
$x^{\mu} = |\langle s|x_{r}\rangle|^{2}\, x^{\mu}_{r}$ (14)
can be described as a linear extraction of the quantum amplitudes as a complex substructure of the probabilities, and of the Clifford coordinates as a complex substructure of the space-time coordinates. If, conversely, we had sought a substructure of space-time which had the quantum amplitudes as a linear space of weights as in (13), we would have been led to something of the nature of the orthogonality relations (10). In the continuum limit X has a continuous spectrum and in the coordinate representation (9) and (10) become
$\overset{>}{C}{}^{A} = \int |x\rangle\, c^{A}(x)\, dx$ (15)
$\{c^{A}(x), c^{*\dot{B}}(x')\} = x^{A\dot{B}}\,\delta(x - x'), \qquad \{c^{A}(x), c^{B}(x')\} = 0$ (16)
(16) generates an infinite-dimensional Clifford algebra of a type well known from the algebra of creation and annihilation operators for a Fermi field. The stability of Clifford space under SL(2,C) implies that there are at least two values c and −c of the Clifford coordinates, which correspond to the same space-time coordinates x. The well-known degeneracy of SO(1,3) transformations with respect to SL(2,C) transformations is hereby extended to space-time itself. As we shall see in sections 3 and 4, this has the consequence that the path of a particle in Clifford space has a starting point in physical time. This is the only physical consequence of the Clifford model which differs from those of conventional quantum mechanics. To reconcile these starting times with experience in a satisfactory way, the Clifford substructure would presumably have to be applied to objects more fundamental than point particles (e.g. fields or strings) and examined in the context of a cosmological model. The viability of our model rests upon the possible success of such a program.
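The twofold degeneracy of SO(1,3) with respect to SL(2,C) mentioned above can be illustrated numerically. The sketch below assumes the usual action $V \mapsto S V S^{\dagger}$ of an SL(2,C) matrix S on the hermitian spinor form of a four-vector; the helper names and the random test matrix are hypothetical and serve only this illustration.

```python
import numpy as np

# Pauli matrices sigma_0..sigma_3, with sigma_0 the identity.
SIGMA = np.array([
    [[1, 0], [0, 1]],
    [[0, 1], [1, 0]],
    [[0, -1j], [1j, 0]],
    [[1, 0], [0, -1]],
], dtype=complex)

ETA = np.diag([1.0, -1.0, -1.0, -1.0])  # Minkowski metric, signature (+,-,-,-)


def random_sl2c(seed=0):
    """Return a 2x2 complex matrix rescaled to unit determinant."""
    rng = np.random.default_rng(seed)
    M = rng.normal(size=(2, 2)) + 1j * rng.normal(size=(2, 2))
    return M / np.sqrt(np.linalg.det(M))


def lorentz_from_sl2c(S):
    """Lambda[m, n] = (1/2) tr(sigma_m S sigma_n S^dagger), induced by V -> S V S^dagger."""
    L = 0.5 * np.einsum("mab,bc,ncd,da->mn", SIGMA, S, SIGMA, S.conj().T)
    return np.real(L)
```

Because S enters quadratically, S and −S give the same matrix Λ, which is the degeneracy referred to in the text; Λ also satisfies $\Lambda^{T}\eta\Lambda = \eta$.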
2
Canonical equations
We consider the action:
$\int L(c(\tau), \dot{c}(\tau))\, d\tau$ (17)
Since the Lagrangian is real-valued it is natural to assume that the Clifford variables c and ċ occur within anticommutators. In this case the variation of L can be expressed as
$\delta L = \left\{\frac{\partial L}{\partial c^{A}}, \delta c^{A}\right\} + c.c. + \left\{\frac{\partial L}{\partial \dot{c}^{A}}, \delta \dot{c}^{A}\right\} + c.c.$ (18)
which defines the derivatives with respect to c and ċ up to terms which anticommute with δc. The conjugate to c is defined as
$d^{*}_{A} = \frac{\partial L}{\partial \dot{c}^{A}}$ (19)
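If ċ can be eliminated in favour of d* the Hamiltonian becomes
$H(c, d) = \{\dot{c}^{A}, d^{*}_{A}\} + c.c. - L(c, \dot{c})$ (20)
with the equations of motion
$\dot{c}^{A} = \frac{\partial H}{\partial d^{*}_{A}}, \qquad \dot{d}^{*}_{A} = -\frac{\partial H}{\partial c^{A}}$ (21)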
In case the action (17) has local symmetries the Hamiltonian is found by the methods of constrained dynamics. We shall only consider Hamiltonians which can be expressed in the form
$x^{A\dot{B}} = \{c^{A}, c^{*\dot{B}}\}, \qquad H(c, d) = H(x, p), \qquad p_{A\dot{B}} = \{d_{\dot{B}}, d^{*}_{A}\}$ (22)
The system corresponding to the action (17) cannot be quantized in the usual way through Poisson brackets because c and d* become vectors C and D and not operators in Hilbert space. Instead we shall determine the conditions which have to be imposed on C and D in order to obtain the usual canonical quantization of the system (22) with p as the momenta conjugate to x. For the Hamiltonian (22) the equations of motion (21) become
$\dot{c}^{A} = \frac{\partial H}{\partial p_{A\dot{B}}}\, d_{\dot{B}}, \qquad \dot{d}^{*}_{A} = -\frac{\partial H}{\partial x^{A\dot{B}}}\, c^{*\dot{B}}$ (23)
The quantized form of these equations will be
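$X^{A\dot{B}} = \{\overset{>}{C}{}^{A}, \overset{<}{C}{}^{\dot{B}}\}, \qquad P_{A\dot{B}} = \{\overset{>}{D}_{\dot{B}}, \overset{<}{D}_{A}\}$ (25)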
Applying the equations of motion (24) to (25) gives
tF**
=
-^"•PEB\{C'!,DA)
- ^{D6,C6)\H,PAi\
(26)
For these equations to reduce to the usual space-time canonical equations of motion we must impose the commutation relations
$\{\overset{>}{C}{}^{A}, \overset{<}{D}_{B}\} = \delta^{A}_{B}\,\mu(\tau)$ (27)
where μ(τ) is a real scalar function of τ. Then (26) becomes
'X** = -<£>[*,*•*], i * U , ~ * W „ l
(28)
or in reparametrized form
Though we obtain the standard space-time canonical equations of motion, they are subject to the (as we shall see) important restriction that the parameter f is only well defined for μ(τ) ≠ 0. Normally the compatibility of the commutation relations with the equations of motion is ensured by the Poisson brackets. This also applies in the present case to the space-time commutation relations
$[X^{\mu}, X^{\nu}] = 0, \qquad [P_{\mu}, P_{\nu}] = 0, \qquad [X^{\mu}, P_{\nu}] = i\hbar\,\delta^{\mu}_{\nu}$ (30)
which are compatible with the equations of motion (28). These equations, however, assume the validity of the Clifford commutation relations (27), which are not related to any Poisson brackets. We shall prove the compatibility of these commutation relations with the equations of motion in the classical case, where they reduce to
$\{c^{A}, d^{*B}\} = -\epsilon^{AB}\,\mu(\tau)$ (31)
Since all skew-symmetric second-rank spinors are proportional to $\epsilon^{AB}$, (31) is equivalent to the vanishing of the symmetric part of the commutator
$\{c^{(A}, d^{*B)}\} = 0$ (32)
The equations of motion (21) subject to the constraint (32) can be obtained from
$\int \{\dot{c}^{A}, d^{*}_{A}\} + c.c. - H(c, d) + \lambda_{AB}(\tau)\{c^{(A}, d^{*B)}\} + c.c.\; d\tau, \qquad \lambda_{AB} = \lambda_{BA}$ (33)
by independent variation of c and d, where $\lambda_{AB}$ are six Lagrange multipliers. A local SL(2,C) transformation
$c^{A} = S^{A}{}_{E}(\tau)\, c^{E}, \qquad d^{*}_{A} = S_{A}{}^{E}(\tau)\, d^{*}_{E}$ (34)
turns (33) into
$\int \{\dot{c}^{A}, d^{*}_{A}\} + c.c. - H(c, d) + (\dot{S}_{EA} S^{E}{}_{B} + \lambda_{AB})\{c^{(A}, d^{*B)}\} + c.c.\; d\tau$ (35)
The last two terms in (35) can be made to vanish if
$\dot{S}_{EA}(\tau)\, S^{E}{}_{B}(\tau) = -\lambda_{AB}(\tau)$ (36)
Taking λ to be small, the infinitesimal SL(2,C) transformation
$S_{AB}(\tau) = \epsilon_{AB} + \kappa_{AB}(\tau), \qquad \kappa_{AB} = \kappa_{BA}$ (37)
turns (36) into
$\dot{\kappa}_{AB} = -\lambda_{AB}$ (38)
which can always be solved for $\kappa_{AB}(\tau)$ in terms of $\lambda_{AB}(\tau)$. The constraint (32) can therefore be absorbed into a local SL(2,C) transformation of the dynamical variables and will accordingly preserve the form of the equations of motion. In section 4 we shall examine a specific model of the relativistic point particle and find that the quantum form of the Clifford commutation relations also leads to a consistent result.
3
Degenerate space-time paths
When μ(τ) in (27) has a zero, the parameter f of the space-time equations of motion is ill-defined. We should therefore be prepared to encounter complete solutions C(τ), D(τ) to the equations of motion (24) which generate incomplete solutions X(τ), P(τ) to the space-time equations of motion. To understand what happens, let us assume that μ(0) = 0. Then for τ = 0 the commutation relations (27) reduce to
$\{\overset{>}{C}{}^{A}(0), \overset{<}{D}_{B}(0)\} = 0$ (39)
Let us expand C(τ):
$C(\tau) = C(0) + \ldots + \frac{1}{n!}\, C^{(n)}(0)\, \tau^{n} + \ldots$ (40)
The higher order derivatives $C^{(n)}(0)$ are obtained by differentiating the equations of motion (24) and reinserting the expressions for Ẋ and Ṗ obtained from (26). Because of the commutation relations (39) this can only result in coefficients which contain terms of the form
$F^{A}{}_{B}(X(0), P(0))\, C^{B}(0) \quad \text{or} \quad G^{A\dot{B}}(X(0), P(0))\, D_{\dot{B}}(0)$ (41)
When C(τ), D(τ) is a solution to (24), so is −C(−τ), D(τ). Thus C(τ) must be odd under a change of sign of C(0) and τ. It follows that in the expansion (40) all terms of even order must have coefficients of the first type in (41) and all terms of odd order must have coefficients of the second type. When therefore the expansion (40) is inserted into (5) to determine X(τ) we find that, because of the commutation relations (39), all anti-commutators between terms of odd order and terms of even order vanish. Accordingly X(τ) can only contain terms of even order and must therefore be an even function of τ:
$X(-\tau) = X(\tau)$ (42)
This implies that C(τ) reproduces X(τ) twice, making it twofold degenerate. Hence there exist complete paths in Clifford space which have either a beginning or an end in physical time. If we assume that it is the first possibility which applies, then the only way to avoid a contradiction with experience is to assume that the 'starting times' of the particles are of cosmological origin. The viability of the Clifford model therefore depends on the construction of a cosmological model which would presumably go beyond the framework of the quantum mechanics of point particles. The classical paths will, like X(τ), be even functions of τ. In the quantum regime, however, paths for which X(τ) ≠ X(−τ) will also contribute to the transition amplitudes. Consequently the Clifford model will seem to be nonlocal from a space-time point of view. We shall interpret this non-locality in section 6.
4
The relativistic point particle
Since there exists no SL{2.C) invariant hermitian second rank spinor, but only the real skewsymmetric metric CAB, the simplest reparametrization invariant action for a relativistic point particle which only depends on c is ^{cA,c*B}{cA,c*B}dr
-2*V^f
(43)
The conjugate to c is d*A = - 2 f ^({cE,c^}{cE,c^})-3HcA,cB}c*B
(44)
Not unexpectedly the Hamiltonian (20) vanishes because of reparametrization invariance. By use of the relation
$V^{A\dot{E}}\, V_{B\dot{E}} = \delta^{A}_{B}\, V_{\mu} V^{\mu}$ (45)
for a hermitian second rank spinor, we obtain from (44) the associated constraint
$\tfrac{1}{2}\{d_{\dot{B}}, d^{*}_{A}\}\{d^{\dot{B}}, d^{*A}\} = m^{2}$ (46)
or
$p_{\mu} p^{\mu} = m^{2}$ (47)
This is the same constraint as would have been obtained from the usual space-time Lagrangian $m\sqrt{\dot{x}^{2}}$, but with the important difference that $p_{\mu}$ is no longer a primary dynamical variable. The new Hamiltonian is proportional to the constraint:
$H(C, D) = v(\tau)\,(P_{\mu} P^{\mu} - m^{2})$ (48)
The gauge is fixed by choosing V{T) = ^ . By use of the space-time commutation relations the equations of motion (24) become
ic>A=LpAi^
£D<*=°
<*»
with the solution CA (r) =CA (0) + ^ ^ ( O ) Dj. (0)r,
A * (r) = D A (0)
(50)
(r)} = {CA (0),DB (0)} + ^ - P ^ ( 0 ) P ^ ( 0 ) r
(51)
Prom (50) we obtain {CA (r), A
Applying (45) and the quantum form of (47) to (51) it becomes {CA ( r ) , D B (r)} = {CA (0),D B (0)} + ^ | r
(52)
Accordingly, the Clifford commutation relations (27) are preserved in time by the equations of motion, and with the choice r = 0 for the zero-point of /x(r) we obtain /*W = j r , f= -T2 (53) The corresponding space-time solution is X " ( f ) = X"{0) + ^P"(0)f,
P„(f) = P„(0)
(54)
In accordance with the general result in section 3, the complete solutions to the Clifford equations of motion (49) are double coverings of the incomplete solutions X(f),f > 0 to the space-time equations of motion.
5
Measurement principle
The measurement principle in quantum mechanics says that the (abstract) state vector is constant in time as long as no measurement is being performed. After a measurement has been performed the state vector is replaced by the eigenvector of the measured quantity for subsequent times ('state vector reduction'). This measurement principle applies equally well to Dirac's parameter-time formalism when 'time' is taken to be a parameter-time with the same direction as our f in the foregoing. Recognizing the primary character of Clifford space, we shall instead assume that the reduction of the state vector takes place in the positive direction of τ itself, which therefore comes to represent the true direction of causality:
Measurement Principle. The state vector of the particle is constant in parameter-time τ as long as no measurement is being performed. When the particle is measured to be in the eigenstate $|x_{P}\rangle$ of X, the state vector is replaced by $|c_{P}\rangle = |x_{P}\rangle$ for parameter-times $\tau > \tau_{P}$, where $c_{P}$ is an 'eigenvalue' of $\overset{>}{C}(\tau_{P})$, and $c_{P}$ and $\overset{>}{C}(\tau_{P})$ satisfy $\{c_{P}, c_{P}^{*}\} = x_{P}$ and $\{\overset{>}{C}(\tau_{P}), \overset{<}{C}(\tau_{P})\} = X$ respectively.
Using a convenient terminology we shall say that the Clifford position of the particle has been measured to be $c_{P}$ at $\tau = \tau_{P}$. The measurement principle respects the fact that since the interaction Hamiltonians used for measuring space-time positions depend only on C through X, the state vector reduction in Clifford space should also be defined through X and its eigenvalues. In the following we shall examine the consequences of this principle. Let the space-time position of the particle have been measured to be $x_{Q}$. From (42) and (10) it follows that C(τ) satisfies the criteria in the measurement principle at two parameter-times $\tau = \pm\tau_{Q}$. Let us call the corresponding 'eigenvalues' $c_{Q+}$ and $c_{Q-}$. Hence the state vector will be $|c_{Q-}\rangle$ and $|c_{Q+}\rangle$ (both equal to $|x_{Q}\rangle$) right after $\tau = -\tau_{Q}$ and $\tau = \tau_{Q}$ respectively. If no measurement is being performed between $\tau = -\tau_{Q}$ and $\tau = \tau_{Q}$ the particle will arrive at $\tau = \tau_{Q}$ in the state $|c_{Q-}\rangle$. Since after $\tau = \tau_{Q}$ the state is $|c_{Q+}\rangle$, the transition amplitude is $\langle c_{Q-}|c_{Q+}\rangle = \langle x_{Q}|x_{Q}\rangle = 1$. Therefore the measurement principle is self-consistent as long as no measurement is being performed between $\tau = -\tau_{Q}$ and $\tau = \tau_{Q}$. Let us now assume that such a measurement is being performed, resulting in the space-time position $x_{P}$ corresponding to the Clifford positions $c_{P\pm}$ at $\tau = \pm\tau_{P}$ respectively, where $\tau_{P} < \tau_{Q}$. The amplitude for the particle to travel from $c_{Q-}$ to $c_{Q+}$ is then
$\langle c_{Q-}|c_{P-}\rangle\langle c_{P+}|c_{Q+}\rangle = |\langle x_{P}|x_{Q}\rangle|^{2}$ (55)
and therefore equals the transition probability for the particle to move from $x_{P}$ to $x_{Q}$. We conclude that the space-time transition probabilities arise as transition amplitudes for the complete paths in Clifford space. Note that viewed from space-time it appears as if there are two amplitudes, one moving forward in time from $x_{P}$ to $x_{Q}$ and the other moving backwards in time from $x_{Q}$ to $x_{P}$. This resembles the situation in the time symmetric formulation of quantum mechanics by Aharonov and Vaidman [6], Costa de Beauregard [7], and Werbos [8]. In the present model the two state vectors of time symmetric quantum mechanics are recognized to be one and the same, propagating along a path which covers the space-time path twice. The use of parameter-time in our model is necessitated by the secondary character of physical time, but it has the added advantage of ensuring manifest Lorentz invariance. The present model should also be compared to the so-called 'double space-time interpretation of quantum mechanics' of Bialynicki-Birula [9], inspired by Schwinger's time loop integrated amplitudes. The main problem in this interpretation is how to join the two space-time sheets at infinity to allow a particle to travel along a single path on the two sheets. The choice of taking the causal direction of state vector reduction to be in the positive direction of parameter-time τ rather than of 'affine time' f strongly suggests that the same should apply to the direction of propagation of classical fields. The following heuristic observation shows that this is not necessarily inconsistent with experience. Let the union of all possible particle trajectories for τ < 0 and for τ > 0 form regions $\Omega_{-}$ and $\Omega_{+}$ of Clifford space which
correspond to the same space-time region. For the field to propagate in the positive direction of r we should choose the advanced field on fi_ and the retarded field on Q+. The contribution to the electrodynamic action in the proper-time interval [ff, 7=2] of a test-particle with charge e traversing this region is
u. -7-1
_ _
J
I-T2
m v x 2 + Aadve{—&) dr + - / 2
r2
mvx2 +
AretexdT
JTI
ff2 rr1 1 = / mVx2 + (-Aadv + -Aret)exdr
-
(56)
The test-particle will therefore detect the effective field to be the time symmetric half-advanced plus half-retarded field. Assuming complete absorption and no self-interaction Wheeler and Feynman [10] have shown that this time symmetric field leads to the conventional rules of electrodynamics.
6
Interpretation of non-locality
Consider a particle which travels in space-time from a point P to a point Q and is forced to travel through two alternative points $S_{1}$ and $S_{2}$. This corresponds to the double slit experiment with the two slits being opened at given times. As follows from our foregoing discussion the particle can follow four alternative sets of paths in Clifford space corresponding to the four sequences of positions in Clifford space ordered according to parameter time:
$c_{Q-},\; c_{S_{i}-},\; c_{P-},\; c_{P+},\; c_{S_{j}+},\; c_{Q+}, \qquad i, j = 1, 2$ (57)
The amplitude for the particle to travel from $c_{Q-}$ to $c_{Q+}$ is the sum of the amplitudes for all four different sets of paths
$\sum_{i,j=1}^{2} \langle c_{Q-}|c_{S_{i}-}\rangle\langle c_{S_{i}-}|c_{P-}\rangle\langle c_{P+}|c_{S_{j}+}\rangle\langle c_{S_{j}+}|c_{Q+}\rangle = |\langle x_{P}|x_{S_{1}}\rangle\langle x_{S_{1}}|x_{Q}\rangle + \langle x_{P}|x_{S_{2}}\rangle\langle x_{S_{2}}|x_{Q}\rangle|^{2}$ (58)
which is the well-known probability for the particle to travel from P to Q. The customary interpretation of this transition probability is that there are two alternative paths and that the transition probability is the sum of the probabilities for each path plus two interference terms which seem to signal a non-local influence of one path on the other. From (58) we see that there are really four different sets of paths and that the two interference terms are the amplitudes for the two sets of paths where the particle goes through each slit at opposite parameter times. The apparent non-locality can be entirely attributed to the twofold degeneracy of the space-time paths. If we measure the position of the particle at one of the slits, say $S_{1}$, then according to our measurement principle the particle has to travel through both $c_{S_{1}-}$ and $c_{S_{1}+}$ or neither of them, and this excludes the two sets of paths where the particle passes through both slits. This removes the interference terms in accordance with the space-time view of quantum mechanics. This analysis is readily extended to a many-slit experiment by observing that all interference terms arise from pairs of slits. As the second example of non-locality we shall consider an EPR type of measurement. Consider a composite system (PQ) consisting of two spin-½ particles P and Q. First the position and the total spin of the composite system are measured to be $x_{(PQ)}$ and 0. After this measurement P and Q become separated by a spacelike distance and their position and spin along some axis are measured to be $x_{P}$ and ½ and $x_{Q}$ and −½ respectively. The last two measurements appear to be correlated despite the spacelike separation of P and Q, thereby giving the impression of 'action at a distance'. However, according to our measurement principle, the position measurements, and together with them the spin measurements, each correspond to two measurements in Clifford space at opposite values of τ. If the measurements of $x_{(PQ)}$, $x_{P}$ and $x_{Q}$ correspond to parameter-times $\tau = \pm\tau_{(PQ)}$, $\tau = \pm\tau_{P}$ and $\tau = \pm\tau_{Q}$ respectively, then the sequence of events for negative τ can be described as follows. First, at parameter-times $\tau = -\tau_{P}$ and $\tau = -\tau_{Q}$, the spins along some axis of P and Q are measured to be ½ and −½ respectively. At the later parameter-time $\tau = -\tau_{(PQ)} > -\tau_{P}, -\tau_{Q}$, P and Q merge into a composite system (PQ) and the total spin is measured to be 0. We would not object to this last sequence of events because it suggests no correlation between the spacelike separated states of P and Q. Rather, it suggests an obvious correlation between the states of P and Q on the one hand and the state of the composite system (PQ) on the other, which invokes no need for 'action at a distance'.
Nevertheless these two sequences of events, corresponding to opposite values of r, together form a single series of events in the causal direction of state vector reduction in Clifford space and are both the result of the same space-time measurements on a degenerate space-time path. They are therefore on an equal footing and we conclude that it has no absolute meaning to say whether the spin-measurements on P and Q are correlated or independent. Accordingly, the apparent manifestation of 'action at a distance' loses its significance.
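The algebraic content of (58) is that a double sum over the two slits, taken once on the negative-τ half of the path and once on the positive-τ half, factorizes into the squared modulus of the usual two-path amplitude. A minimal numerical sketch follows; the amplitudes a1 and a2 stand for $\langle x_{P}|x_{S_{i}}\rangle\langle x_{S_{i}}|x_{Q}\rangle$ and their values are arbitrary illustrative numbers, not taken from the paper.

```python
import itertools


def four_path_amplitude(a1, a2):
    """Sum over the four sets of Clifford-space paths in (58).

    The negative-tau half of each path contributes the complex conjugate of
    the space-time amplitude a_i, the positive-tau half contributes a_j.
    """
    a = (complex(a1), complex(a2))
    return sum(a[i].conjugate() * a[j]
               for i, j in itertools.product(range(2), repeat=2))


def two_path_probability(a1, a2):
    """Conventional double-slit probability |a1 + a2|^2."""
    return abs(complex(a1) + complex(a2)) ** 2


def interference_terms(a1, a2):
    """The two cross terms (i != j), i.e. paths through different slits at opposite tau."""
    a1, a2 = complex(a1), complex(a2)
    return a1.conjugate() * a2 + a2.conjugate() * a1
```

Dropping the two cross terms, as happens when the position is measured at one slit, leaves |a1|² + |a2|², the sum of the single-path probabilities.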
A
Appendix. Clifford algebras and Hermitian quadratic forms
A real Clifford algebra arises naturally as the 'square root' of a real quadratic form Q on a linear space V:
$v^{2} = Q(v), \qquad v \in V$ (59)
Q can have any signature $(N_{+}, N_{0}, N_{-})$. In case Q is degenerate ($N_{0} \neq 0$), the algebra contains Grassmann elements. When v is expanded on an orthogonal basis $e_{i}$ of V it follows that (59) is satisfied if
$\tfrac{1}{2}\{e_{i}, e_{j}\} = \delta_{ij}\, Q(e_{i})$ (60)
The basis $e_{i}$ generates the Clifford algebra. Consider now a quadratic form Q with signature $(2N_{+}, 2N_{0}, 2N_{-})$. We can rearrange the generators $e_{i}$ into two sets $a_{i}$ and $b_{i}$, $i = 1, \ldots, N$, which when normalized satisfy
$\tfrac{1}{2}\{a_{i}, a_{j}\} = \tfrac{1}{2}\{b_{i}, b_{j}\} = \delta_{ij}\, Q(a_{i}), \qquad \{a_{i}, b_{j}\} = 0$ (61)
$a_{j}$ and $b_{j}$ can be used as 'real' and 'imaginary' parts to define the complex quantities
$f_{j} = a_{j} + i\, b_{j}$ (62)
where i is the imaginary unit. The elements $f_{j}$ are seen to satisfy the commutation relations
$\tfrac{1}{4}\{f_{i}, f_{j}^{*}\} = \delta_{ij}\, Q(a_{i}), \qquad \{f_{i}, f_{j}\} = 0$ (63)
where * is any complex involution induced by a self-involution in the real algebra. The algebra generated by $f_{i}$ is a complex Clifford algebra. For any hermitian quadratic form H on a linear space V there exists a complex Clifford algebra generated by V which satisfies
$\tfrac{1}{4}\{v, v^{*}\} = H(v), \qquad v^{2} = 0, \qquad v \in V$ (64)
The proof follows by expanding v on an orthogonal basis $f_{i}$ of V. (64) is seen to be satisfied if
$\tfrac{1}{4}\{f_{i}, f_{j}^{*}\} = \delta_{ij}\, H(f_{i}), \qquad \{f_{i}, f_{j}\} = 0$ (65)
which is recognized to be the generating algebra of a complex Clifford algebra. Expressed in matrix language, (64) implies that any hermitian matrix $H_{ij}$ can be expressed in terms of elements $v_{i}$ of a complex Clifford algebra:
$H_{ij} = \{v_{i}, v_{j}^{*}\}, \qquad \{v_{i}, v_{j}\} = 0$ (66)
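The statement (66) can be made concrete with explicit matrices. The following is a sketch under stated assumptions, not the construction used in the paper: the generators are realized by Jordan-Wigner matrices for fermionic modes, the hermitian matrix is diagonalized numerically, and the involution * is taken to be the matrix adjoint multiplied by the sign of the corresponding eigenvalue so that negative eigenvalues are also covered. All function names are hypothetical.

```python
import numpy as np


def fermion_modes(n):
    """Jordan-Wigner matrices a_0..a_{n-1}: {a_k, a_l^dagger} = delta_kl, {a_k, a_l} = 0."""
    identity = np.eye(2, dtype=complex)
    parity = np.diag([1.0, -1.0]).astype(complex)
    lowering = np.array([[0, 1], [0, 0]], dtype=complex)
    modes = []
    for k in range(n):
        op = np.eye(1, dtype=complex)
        for factor in [parity] * k + [lowering] + [identity] * (n - k - 1):
            op = np.kron(op, factor)
        modes.append(op)
    return modes


def anticommutator(a, b):
    return a @ b + b @ a


def clifford_elements(H):
    """Return lists (v, v_star) with {v_i, v_star_j} = H_ij * 1 and {v_i, v_j} = 0."""
    h, U = np.linalg.eigh(H)  # H = U diag(h) U^dagger
    n = len(h)
    a = fermion_modes(n)
    # Orthogonal generators f_k with {f_k, f_star_l} = delta_kl * h_k.
    f = [np.sqrt(abs(h[k])) * a[k] for k in range(n)]
    f_star = [np.sign(h[k]) * np.sqrt(abs(h[k])) * a[k].conj().T for k in range(n)]
    # Rotate from the eigenbasis to the basis in which H is given.
    v = [sum(U[i, k] * f[k] for k in range(n)) for i in range(n)]
    v_star = [sum(np.conj(U[i, k]) * f_star[k] for k in range(n)) for i in range(n)]
    return v, v_star
```

For an n x n hermitian matrix the elements are represented by matrices of size 2^n, so this is only practical for small n; it is meant as a check of the algebraic relations rather than as an efficient construction.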
References
[1] Penrose, R. (1967). In Battelle Rencontres, C.M. DeWitt and J.A. Wheeler, eds., Benjamin, New York.
[2] Schwarz, J.H. and Van Nieuwenhuizen, P. (1982). Lettere al Nuovo Cimento, 34, 21.
[3] Borchsenius, K. (1987). General Relativity and Gravitation, 19, 643.
[4] Borchsenius, K. (1989). General Relativity and Gravitation, 21, 959.
[5] Borchsenius, K. (1995). International Journal of Theoretical Physics, 34, 1863.
[6] Aharonov, Y. and Vaidman, L. (1990). Phys. Rev., A41, 11.
[7] Costa de Beauregard, O. (1989). In Bell's theorem, quantum theory, and conceptions of the universe (ed. M. Kafatos). Kluwer, Dordrecht.
[8] Werbos, P. (1989). In Bell's theorem, quantum theory, and conceptions of the universe (ed. M. Kafatos). Kluwer, Dordrecht.
[9] Bialynicki-Birula, I. (1986). In Quantum Concepts in Space and Time, R. Penrose and C.J. Isham, eds., Clarendon Press, Oxford, p. 226.
[10] Wheeler, J.A. and Feynman, R. (1945). Reviews of Modern Physics, 17, 157.
MATHEMATICAL PHYSICS ELECTRONIC JOURNAL
ISSN 1086-6655 Volume 6, 2000 Paper 5 Received: Mar 16, 2000, Revised: Nov 9, 2000, Accepted: Nov 27, 2000 Editor: R. de la Llave
Periodic orbits of renormalisation for the correlations of strange nonchaotic attractors B. D. Mestel School of Mathematical Sciences University of Exeter Exeter EX4 4QE, UK
A. H. Osbaldestin Department of Mathematical Sciences Loughborough University Loughborough LE11 3TU, UK
Abstract We calculate all piecewise-constant periodic orbits (with values ±1) of the renormalisation recursion arising in the analysis of correlations of the orbit of a point on a strange nonchaotic attractor. Our results make rigorous and generalise previous numerical results.
1 Introduction
The occurrence and robustness of strange nonchaotic attractors was first noted by Grebogi et al in their seminal paper [4]. A strange nonchaotic attractor is an attractor whose geometry is "strange", and on which the dynamics is "nonchaotic" (i.e. for which there is no positive Lyapunov exponent). Grebogi et al [4] considered quasiperiodically forced systems of the type
$$x_{n+1} = f(x_n, \theta_n), \tag{1.1}$$
$$\theta_{n+1} = \theta_n + \omega \pmod 1, \tag{1.2}$$
in which $\omega$ is irrational, the dynamical variable $x$ (and $f$) may be scalar or higher dimensional, and $f$ satisfies $f(x, \theta + 1) = f(x, \theta)$. (Such systems are examples of skew-product systems.) Strange nonchaotic attractors have since been reported in other theoretical and experimental situations. References to such occurrences may be found in [7]. In the scalar example studied in some detail in [4] the function $f$ in equation (1.1) takes the form
$$f(x, \theta) = 2\lambda \tanh(x) \sin(2\pi\theta). \tag{1.3}$$
For $|\lambda| < 1$ the invariant line $x = 0$ is the attractor. When $|\lambda| > 1$ this invariant line is no longer an attractor; however, since orbits are confined to a bounded region of phase space an attractor does exist. This is shown to be strange and nonchaotic in [4]. In [9] the autocorrelation of the orbit on the strange attractor is seen to be self-similar and possess a singular continuous spectrum. As in [2], however, we shall confine our attention to a coarser description of the dynamics. Namely we consider only the sign of the variable $x$, defining
$$y = -\operatorname{sign}(x). \tag{1.4}$$
For the systems under consideration the dynamics are thereby reduced to the linear circle map (1.2) together with a recording ($y$) of whether $\theta$ is in $[0, 1/2)$ or $(1/2, 1)$. In the case of golden mean forcing, the autocorrelation function of $y$ is seen to be self-similar with structure determined by the renormalisation recursion relation
$$Q_n(x) = Q_{n-1}(-\omega x)\, Q_{n-2}(\omega^2 x + \omega), \tag{1.5}$$
where $\omega = (\sqrt{5} - 1)/2$ is the golden mean. For completeness we shall include from [2] the derivation of this equation, in section 2. In [2] Feudel et al numerically found a piecewise-constant period-6 orbit of this recursion. This periodic orbit is shown in figure 1. In this paper we shall give an explicit construction of this periodic orbit, and moreover analyse all piecewise-constant periodic orbits. These periodic orbits correspond to taking a different coarse-grained description from merely noting in which half of the interval $\theta$ lies. In a different but related work, Kuznetsov et al [7] have given an elegant analysis of the birth of a strange nonchaotic attractor. The same recursion is used to explain the occurrence of universal scaling factors. In this case however periodic orbits of (1.5) of a different nature are considered.
Figure 1: The period-6 orbit discovered by Feudel et al ([2]).

Remarkably Ketoja and Satija [5] also derive this same equation in their analysis of the self-similar fluctuations of the localized eigenstates of the golden mean Harper equation (also known as the almost Mathieu equation)
$$\psi_{n+1} + \psi_{n-1} + 2\lambda \cos(2\pi(n\omega + \phi))\, \psi_n = E \psi_n \tag{1.6}$$
in the supercritical regime $\lambda > 1$. This finite difference eigenvalue equation is valuable in the study of the localization transition in incommensurate systems. The recursion (1.5) helps explain the universality of the supercritical regime. Note that in [5] the iteration occurs in the form
$$Q_n(x) = -Q_{n-1}(-\omega x)\, Q_{n-2}(\omega^2 x + \omega), \tag{1.7}$$
but the substitution $Q_n \mapsto -Q_n$ renders it equivalent to (1.5). A fixed point of this recursion characterises the universal fluctuations, and this is numerically found in [5]. The same recursion is also used in [5] to analyse a generalised Harper equation describing Bloch electrons on a square lattice with nearest neighbour anisotropy as in (1.6), and the addition of a next-nearest neighbour coupling term. Many periodic orbits are found, and Ketoja and Satija [5] conjecture the existence of a universal strange attractor under the action of the renormalization operator. In a recent paper [8] we have proved that indeed there is a fixed point of the type numerically found in [5]. (See also [6].) We hope to be able to extend our results on smooth solutions to shed more light on the work of Kuznetsov et al [7] on the scenario of the birth of a strange nonchaotic attractor. In [6] these two seemingly distinct scenarios are linked, and indeed an analogy with the critical dissipative standard map is also drawn.
In this paper we study periodic orbits of (1.5) for which $Q_n$ is piecewise-constant with $Q_n$ taking values $\pm 1$ for all $x \in \mathbb{R}$. By piecewise-constant we mean that for each $n$ the function $Q_n(x) = \pm 1$ has finitely many discontinuities in any bounded interval of $\mathbb{R}$, although $Q_n$ may, and generally will, have infinitely many discontinuities on $\mathbb{R}$. Although this condition might appear somewhat restrictive, we shall see in section 2 that this is the appropriate condition for the renormalisation analysis given by Feudel et al in [2] of the correlation function of the sign of orbits in strange nonchaotic attractors. Moreover, as we shall see, the periodic orbit structure for (1.5) is already very rich in this case. Let us define, for $x \in \mathbb{R}$, the discontinuity function
$$R_n(x) = \frac{Q_n(x+)}{Q_n(x-)}, \tag{1.8}$$
the ratio of the right-hand limit to the left-hand limit of $Q_n$ at $x$. Then, since every discontinuity of $Q_n$ is isolated, $R_n$ is well defined. Because we are not primarily interested in the value of $Q_n$ at the discontinuity points, we shall identify any two functions having the same discontinuity points (i.e. those $x$ with $R_n(x) = -1$) and agreeing at all continuity points (i.e. at those $x$ with $R_n(x) = 1$). We are now in a position to give a summary of the main results of the paper. In what follows we shall show that, if $Q_n$ is a periodic orbit of (1.5) of period $p$, then $R_n$ is also periodic with period $m$ where $m \mid p$. (Here, and subsequently, the period is understood to refer to the minimal period.) Moreover we shall see that $p = m$, $2m$, or $3m$. Reducing the study of periodic orbits of (1.5) on $\mathbb{R}$ to a neighbourhood of the fundamental interval $[-\omega, 1]$, we shall identify the set of discontinuities on $[-\omega, 1]$ for the orbit and show that it is a finite union of periodic orbits of the map $F : [-\omega, 1] \to [-\omega, 1]$ given by
$$F(x) = \begin{cases} -\omega^{-1} x, & x \in [-\omega, \omega^2]; \\ \omega^{-2} x - \omega^{-1}, & x \in [\omega^2, 1]. \end{cases} \tag{1.9}$$
Such periodic orbits are classified by their codes (also called itineraries or kneading sequences) and we shall determine the possible values of $m$ in terms of the codes of these orbits. We shall also identify in detail the cases in which $p = m$, $2m$ and $3m$ can occur. This latter analysis is somewhat complicated and involves some non-intuitive number-theoretic conditions on the codes. A consequence of this analysis is that we shall show inter alia

Theorem 1. For every positive integer $p > 1$ there is a periodic orbit $Q_n$ of (1.5) of period $p$.

The paper is organised as follows. In the next section, closely following Feudel et al [2], we briefly review how the recursion (1.5) arises in the renormalisation analysis of the autocorrelation function for a strange nonchaotic attractor. In section 3 we establish some notation and indicate how an iterated function system and its 'inverse', the function $F$ above (1.9), naturally arise in the recursion. The iterated function system has as invariant set the interval $[-\omega, 1]$, and we show in section 4 that it suffices to consider the recursion (1.5) restricted to this interval. Since we are solely concerned with piecewise-constant functions $Q_n$ taking the values $\pm 1$, much of the nature of the recursion can be understood from a study of the discontinuity function $R_n$ defined above (1.8). This we consider in detail in section 5. However an analysis of the discontinuity function is not in itself sufficient, and in section 6 we relate the periodicity of the discontinuities to that of $Q_n$ itself. This relationship is nontrivial and requires a careful consideration of the orbits of the map $F$. The results are summarised in section 7. In section 8 we give an analysis of the construction of periodic orbits of (1.5). The period-6 orbit of Feudel et al [2] shown in figure 1 is seen to be but one example.
2 Renormalisation analysis of the autocorrelation function
In this section we review the work of Feudel et al [2] and show in particular how equation (1.5) arises in a renormalisation analysis of the autocorrelation function for a strange nonchaotic attractor. In all that follows we shall take $\omega = (\sqrt{5} - 1)/2$ and assume that $\lambda > 0$. Recall that the Fibonacci numbers are given by: $F_0 = 0$, $F_1 = 1$, $F_n = F_{n-1} + F_{n-2}$, for $n > 1$. In terms of the discrete variable $y$ defined above (1.4), our mapping (1.1)-(1.2), with the choice of $f$ given by a function of the form (1.3), is now just
$$y_{n+1} = y_n \Phi(\theta_n), \tag{2.1}$$
$$\theta_{n+1} = \theta_n + \omega \pmod 1, \tag{2.2}$$
where the "modulation function"
$$\Phi(\theta) = \begin{cases} -1, & 0 \le \theta < 1/2; \\ +1, & 1/2 < \theta < 1. \end{cases} \tag{2.3}$$
Thus
$$y_n = \prod_{k=0}^{n-1} \Phi(\theta_k), \tag{2.4}$$
$$\theta_n = \theta_0 + n\omega \pmod 1, \tag{2.5}$$
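where we take $y_0 = 1$. The dynamics of $y$ are nothing other than the recording of the location of iterates of the linear circle map, and depend only on the initial angle $\theta_0$. The autocorrelation function $C(t)$ of $y$ (which has zero mean and unit variance) is the limit time average
$$C(t) = \lim_{T \to \infty} \frac{1}{T} \sum_{i=0}^{T-1} y_i\, y_{i+t}, \tag{2.6}$$
which in view of (2.4), and the fact that $\Phi = \pm 1$, is
$$C(t) = \lim_{T \to \infty} \frac{1}{T} \sum_{i=0}^{T-1} \prod_{k=0}^{i-1} \Phi(\theta_k) \prod_{k=0}^{i+t-1} \Phi(\theta_k) = \lim_{T \to \infty} \frac{1}{T} \sum_{i=0}^{T-1} \prod_{k=i}^{i+t-1} \Phi(\theta_k). \tag{2.7}$$
Now the ergodicity of the linear circle map allows us to write
$$\lim_{T \to \infty} \frac{1}{T} \sum_{i=0}^{T-1} \prod_{k=i}^{i+t-1} \Phi(\theta_k) = \int_0^1 \prod_{k=i}^{i+t-1} \Phi(\theta_k)\, d\theta_0, \tag{2.8}$$
and, since $\Phi$ has unit period, we may also change the integration variable (initial condition $\theta_0$) to $\theta_0 - i\omega$ resulting in
$$C(t) = \int_0^1 \prod_{k=0}^{t-1} \Phi(\theta_k)\, d\theta = \int_0^1 y_t(\theta)\, d\theta, \tag{2.9}$$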
where we explicitly note the dependence of $y_t$ on the (initial) angle $\theta$. The autocorrelation function is observed to have scaling about Fibonacci times, and so to analyse this we define $S_n(\theta) = y_{F_n}(\theta)$, and have, with $\theta_k = \theta + k\omega \pmod 1$,
$$S_n(\theta) = \prod_{k=0}^{F_n - 1} \Phi(\theta_k) \tag{2.10}$$
$$= \prod_{k=0}^{F_{n-1} - 1} \Phi(\theta_k) \prod_{k=F_{n-1}}^{F_n - 1} \Phi(\theta_k) \tag{2.11}$$
$$= S_{n-1}(\theta)\, S_{n-2}(\theta + F_{n-1}\omega), \tag{2.12}$$
which, using the fact that $F_{n-1}\omega = F_{n-2} - (-\omega)^{n-1}$, is
$$S_n(\theta) = S_{n-1}(\theta)\, S_{n-2}(\theta - (-\omega)^{n-1}). \tag{2.13}$$
To analyse the scaling, we define $Q_n(x) = S_n((-\omega)^n x)$ giving
$$Q_n(x) = Q_{n-1}(-\omega x)\, Q_{n-2}(\omega^2 x + \omega), \tag{2.14}$$
which is equation (1.5). As noted in [2]
$$C(F_n) = \int_0^1 y_{F_n}(\theta)\, d\theta = \int_0^1 S_n(\theta)\, d\theta = \frac{1}{(-\omega)^{-n}} \int_0^{(-\omega)^{-n}} Q_n(x)\, dx. \tag{2.15}$$
Thus the autocorrelation function for Fibonacci times can be determined from the average of the function $Q_n$. For $n$ not a multiple of three we have that $F_n$ is odd which gives $C(F_n) = 0$. Indeed, as above, by changing the range of the product we may write
$$C(2m+1) = \int_0^1 \prod_{k=0}^{2m} \Phi(\theta_k)\, d\theta = \int_0^1 \prod_{k=-m}^{m} \Phi(\theta_k)\, d\theta = \int_0^1 \Phi(\theta) \prod_{k=1}^{m} \Phi(\theta_k) \Phi(\theta_{-k})\, d\theta = 0, \tag{2.16}$$
since the integrand is odd about 1/2. When n is a multiple of three it is numerically observed in [2] that the average approaches approximately 0.55 for large n. This is the relative height of the secondary peaks in the autocorrelation function. The results of this paper explain the periodic behaviour of the functions Qn in the specific example studied by Feudel et al, and also determine the behaviour in the presence of more general modulation than equation (2.3).
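The derivation in this section can be checked numerically. The following is a minimal sketch, not taken from the paper: the function names (`phi_mod`, `y`, `S`, `Q`, `C`) and the midpoint-rule grid size are illustrative assumptions. It evaluates the product (2.4) directly, tests the recursions (2.13) and (2.14) at a few sample points, and estimates $C(t)$ from (2.9).

```python
import math

OMEGA = (math.sqrt(5) - 1) / 2  # golden mean, omega^2 + omega = 1


def phi_mod(theta):
    """Modulation function (2.3): -1 on [0, 1/2), +1 on [1/2, 1)."""
    return -1 if (theta % 1.0) < 0.5 else 1


def y(t, theta):
    """y_t(theta) = prod_{k=0}^{t-1} Phi(theta + k*omega), as in (2.4)."""
    prod = 1
    for k in range(t):
        prod *= phi_mod(theta + k * OMEGA)
    return prod


def fib(n):
    """Fibonacci numbers with F_0 = 0, F_1 = 1."""
    a, b = 0, 1
    for _ in range(n):
        a, b = b, a + b
    return a


def S(n, theta):
    """S_n(theta) = y_{F_n}(theta)."""
    return y(fib(n), theta)


def Q(n, x):
    """Rescaled function Q_n(x) = S_n((-omega)^n x)."""
    return S(n, (-OMEGA) ** n * x)


def C(t, grid=2000):
    """Midpoint-rule estimate (an assumed discretisation) of the integral (2.9)."""
    return sum(y(t, (j + 0.5) / grid) for j in range(grid)) / grid


if __name__ == "__main__":
    for n in range(3, 10):
        print(n, fib(n), round(C(fib(n)), 3))
```

Because the integrand is piecewise constant, the midpoint rule is only accurate to roughly the number of discontinuities divided by the grid size, so the printed values for odd $F_n$ should be read as "close to zero" rather than exactly zero.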
3 Iterated function system and the inverse map F
We may write equation (1.5) in the form
$$Q_n(x) = Q_{n-1}(\phi_1(x))\, Q_{n-2}(\phi_2(x)), \tag{3.1}$$
where
$$\phi_1(x) = -\omega x, \qquad \phi_2(x) = \omega^2 x + \omega, \tag{3.2}$$
and $\omega = (\sqrt{5} - 1)/2$ is the golden mean satisfying $\omega^2 + \omega = 1$. Associated with this equation is an iterated function system (IFS) on $\mathbb{R}$ given by the two contractions $\phi_1$, $\phi_2$ satisfying the following properties:
1. $\phi_1$ and $\phi_2$ are linear contractions with fixed points 0 and 1 respectively, and with $\phi_1'(x) = -\omega$ and $\phi_2'(x) = \omega^2$.

2. The interval $I = [-\omega, 1]$ is the fixed point set for the IFS. Indeed
$$\phi_1([-\omega, 1]) = [-\omega, \omega^2], \qquad \phi_2([-\omega, 1]) = [\omega^2, 1], \tag{3.3}$$
so that
$$\phi_1(I) \cup \phi_2(I) = I. \tag{3.4}$$
We shall henceforth refer to $I$ as the fundamental interval.

3. The fundamental interval $I$ is the attractor for the IFS. Indeed given any compact subset $K \subset \mathbb{R}$ and any $\varepsilon > 0$, there exists $N \in \mathbb{N}$ such that for any $k > N$ and any choice $i_1, \ldots, i_k \in \{1, 2\}$ we have
$$\phi_{i_1} \circ \cdots \circ \phi_{i_k}(x) \in [-\omega - \varepsilon, 1 + \varepsilon] \tag{3.5}$$
for any $x \in K$. This property will be important when we consider the behaviour of equation (1.5) outside the fundamental interval $I$. We refer the reader to the book [1] for the theory of iterated function systems. On the fundamental interval we may define a unique inverse map to the pair $\phi_1$, $\phi_2$. Let $F : [-\omega, 1] \to [-\omega, 1]$ be defined by
$$F(x) = \begin{cases} \phi_1^{-1}(x) = -\omega^{-1} x, & x \in [-\omega, \omega^2]; \\ \phi_2^{-1}(x) = \omega^{-2} x - \omega^{-1}, & x \in [\omega^2, 1], \end{cases} \tag{3.6}$$
as drawn in figure 2. We shall see below that periodic points of $F$ correspond to discontinuities of the periodic solutions of (1.5). It is therefore appropriate to study the periodic orbit structure of $F$, but, before so doing, it is worth noting that for any periodic point $y \in [-\omega, 1]$, precisely one of $\phi_1(y)$, $\phi_2(y)$ is also a periodic point of $F$. For suppose $F^\ell(y) = y$ for some $\ell \in \mathbb{N}$. Then $F(F^{\ell-1}(y)) = y$, so $\phi_i^{-1}(F^{\ell-1}(y)) = y$ for some $i \in \{1, 2\}$, which depends on whether the periodic point $F^{\ell-1}(y) \in [-\omega, \omega^2]$ (in which case $i = 1$), or $F^{\ell-1}(y) \in [\omega^2, 1]$ (in which case $i = 2$). We have that $F^{\ell-1}(y) \ne \omega^2$, since $\omega^2$ is not periodic under $F$. Thus one of $\phi_1(y)$, $\phi_2(y)$ equals $F^{\ell-1}(y)$, which is periodic. Now suppose that both $\phi_1(y)$ and $\phi_2(y)$ are periodic. Then there exist $\ell_1, \ell_2 \in \mathbb{N}$ such that $F^{\ell_1}(\phi_1(y)) = \phi_1(y)$, $F^{\ell_2}(\phi_2(y)) = \phi_2(y)$. Then $\phi_1(y) = F^{\ell_1 \ell_2}(\phi_1(y)) = \ldots$

Figure 2: The function F.

Let the interval $[-\omega, \omega^2)$ be encoded with the symbol 1 and $(\omega^2, 1]$ with the symbol 2. We define the code of $x \in I$ to be the sequence $(a_n)_{n \ge 0}$ in $\{1, 2\}^{\mathbb{N}_0}$ given by
$$a_n = \begin{cases} 1, & F^n(x) \in [-\omega, \omega^2); \\ 2, & F^n(x) \in (\omega^2, 1]. \end{cases} \tag{3.7}$$
As is usual we ignore the (countable) set of points whose orbits under $F$ include the point $\omega^2$. (Such points are not periodic points of $F$.) Hence the codes are all infinite sequences. In terms of the code $a_0 a_1 a_2 \ldots$ of a point $x \in [-\omega, 1]$, we have
$$F(x) = (-\omega^{-1})^{a_0} x - (a_0 - 1)\, \omega^{-1}. \tag{3.8}$$
Since $F$ is uniformly expanding ($|F'(x)| \ge \omega^{-1}$) every point $x \in I$ corresponds to a unique code and vice versa. In particular, periodic orbits of $F$ correspond to periodic codes in $\{1, 2\}^{\mathbb{N}_0}$ under the shift map $\sigma$:
$$\sigma(a_0 a_1 a_2 \ldots) = a_1 a_2 a_3 \ldots \tag{3.9}$$
A periodic orbit $y_0, y_1, \ldots, y_{k-1}$ of period $k$ of $F$ is given uniquely by a periodic code
$$a_0 a_1 \ldots a_{k-1}\, a_0 a_1 \ldots a_{k-1} \ldots, \tag{3.10}$$
which we henceforth denote by $a_0 a_1 \ldots a_{k-1}$. It is straightforward to calculate the periodic orbit $y_0, y_1, \ldots, y_{k-1}$ of $F$ corresponding to a given code $a_0 a_1 \ldots a_{k-1}$. For we have
$$y_0 = \frac{-\sum_{j=0}^{k-1} (a_j - 1)\, (-\omega)^{1 + \sum_{i=0}^{j-1} a_i}}{1 - (-\omega)^{\sum_{i=0}^{k-1} a_i}}, \tag{3.11}$$
where empty sums are defined to be zero. The other points of the orbit may be calculated by applying this formula with the code $a_0 a_1 \ldots a_{k-1}$ cyclically permuted. For example, $F$ has two fixed points: $y = 0$ with code 1, and $y = 1$ with code 2. The period-2 orbit with code 21 is given by $y_0 = 1/2$ and $y_1 = -\omega/2$. It is the fixed point $y = 0$ and this period-2 orbit that are the discontinuity points in the fundamental interval of the period-6 orbit shown in figure 1. As an example of applying the formula (3.11), we calculate the period-4 orbit of $F$ with code 1211:
$$y_0 = -\omega^2/(1 + \omega^5), \quad y_1 = \omega/(1 + \omega^5), \quad y_2 = -\omega^4/(1 + \omega^5), \quad y_3 = \omega^3/(1 + \omega^5).$$
In what follows it will be the code of the periodic orbit that is important, not the orbit itself. Therefore, from now on, we shall principally refer to periodic orbits of F just by their codes.
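The formula (3.11) is easy to evaluate in floating point. The following is a minimal sketch, with function names (`phi`, `F`, `periodic_point`, `orbit`) chosen here for illustration rather than taken from the paper; it computes the orbit belonging to a finite code and can be used to confirm that $F$ maps each orbit point to the next.

```python
import math

OMEGA = (math.sqrt(5) - 1) / 2  # golden mean


def phi(a, x):
    """Contractions (3.2): phi_1(x) = -omega*x, phi_2(x) = omega^2*x + omega."""
    return (-OMEGA) ** a * x + (a - 1) * OMEGA


def F(x):
    """Inverse map (3.6) on the fundamental interval [-omega, 1]."""
    if x < OMEGA ** 2:
        return -x / OMEGA
    return x / OMEGA ** 2 - 1 / OMEGA


def periodic_point(code):
    """First point y_0 of the periodic orbit with the given code, eq. (3.11)."""
    num = 0.0
    partial = 0  # sum of a_i for i < j; the empty sum is zero
    for a in code:
        num -= (a - 1) * (-OMEGA) ** (1 + partial)
        partial += a
    return num / (1 - (-OMEGA) ** sum(code))


def orbit(code):
    """Points y_0, ..., y_{k-1}, obtained by cyclically permuting the code."""
    return [periodic_point(code[i:] + code[:i]) for i in range(len(code))]


if __name__ == "__main__":
    print(orbit([2, 1]))
    print(orbit([1, 2, 1, 1]))
```

Since the code determines which branch of $F$ applies at each point, the same routine also gives a quick consistency check of a proposed code: applying `F` to `orbit(code)[i]` should reproduce `orbit(code)[i + 1]`.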
4 Reduction of $Q_n$ on $\mathbb{R}$ to the fundamental interval
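In this section we consider equation (3.1) outside the fundamental interval $[-\omega, 1]$, i.e., on the whole of $\mathbb{R}$. In what follows we restrict to $Q_n$ taking values $\pm 1$. Because the fundamental interval $I$ attracts points under the IFS (3.2), we have the following lemma:

Lemma 1. Let $Q_0$, $Q_1$ be initial conditions on $\mathbb{R}$ and let $\varepsilon > 0$ be such that $Q_0(x) = Q_1(x) = 1$, for all $x \in [-\omega - \varepsilon, 1 + \varepsilon]$, and let $Q_n$ satisfy equation (3.1). Then for each $L > 1$, there exists $N > 0$ (depending only on $L$) such that $Q_n(x) = 1$ for all $x \in [-L, L]$ and all $n > N$.

Proof. Let $Q_0$, $Q_1$ satisfy the hypotheses of the lemma, and let $L > 0$ be given. Since $I$ is the attractor for the IFS, there exists $N_1 \in \mathbb{N}$ such that, for any $k \ge N_1$ and any $i_1, \ldots, i_k \in \{1, 2\}$,
$$\phi_{i_1} \circ \cdots \circ \phi_{i_k}(x) \in [-\omega - \varepsilon, 1 + \varepsilon] \tag{4.1}$$
for all $x \in [-L, L]$. Iterating (3.1), we see that $Q_n(x)$ may be written as a product
$$Q_n(x) = \prod_{i_1, i_2, \ldots, i_k \in \{1, 2\}} Q_{n - \sum_{j=1}^{k} i_j}\big(\phi_{i_1} \circ \cdots \circ \phi_{i_k}(x)\big). \tag{4.2}$$
Hence setting $N = 2N_1$, we have for $n > N$ and $x \in [-L, L]$,
$$Q_n(x) = \prod_{i_1, i_2, \ldots, i_k \in \{1, 2\}} Q_{n - \sum_{j=1}^{k} i_j}\big(\phi_{i_1} \circ \cdots \circ \phi_{i_k}(x)\big) = 1. \tag{4.3}$$
This completes the proof of the lemma. ∎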
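Lemma 1 can be illustrated by iterating (3.1) directly. The sketch below is an illustration under stated assumptions, not part of the paper: the initial condition `q_init` (equal to $+1$ on $[-\omega - \varepsilon, 1 + \varepsilon]$ and $-1$ elsewhere) and the value $\varepsilon = 0.1$ are hypothetical choices. With $L = 5$, each application of $\phi_1$ or $\phi_2$ shrinks the distance to $I$ by a factor of at most $\omega$, so eight applications suffice, and every branch of the recursion from $n \ge 16$ uses at least eight.

```python
import math

OMEGA = (math.sqrt(5) - 1) / 2  # golden mean
EPS = 0.1                        # hypothetical epsilon for the initial data


def q_init(x):
    """Assumed initial data: +1 on [-omega - eps, 1 + eps], -1 elsewhere."""
    return 1 if -OMEGA - EPS <= x <= 1 + EPS else -1


def Q(n, x):
    """Q_n(x) from the recursion (3.1), with Q_0 = Q_1 = q_init."""
    if n < 2:
        return q_init(x)
    return Q(n - 1, -OMEGA * x) * Q(n - 2, OMEGA ** 2 * x + OMEGA)


if __name__ == "__main__":
    grid = [-5 + 0.5 * j for j in range(21)]  # sample points in [-5, 5]
    for n in (1, 16, 17, 18):
        print(n, [Q(n, x) for x in grid])
```

The plain recursion costs on the order of $F_n$ evaluations per point, which is acceptable for the small $n$ used here; it is written this way to mirror (3.1) rather than for efficiency.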
From the lemma we may prove the following proposition:

Proposition 1. Let $Q_n$ be a piecewise-constant periodic solution of (3.1) of period $p$ on $\mathbb{R}$ with $Q_n(1+) = Q_n(1)$. Then $Q_n$ is periodic with period $p$ on the fundamental interval $I$. Conversely, suppose that $Q_n$ is periodic with period $p$ on $I$. Then $Q_n$ has a unique extension to $\mathbb{R}$ that is periodic on $\mathbb{R}$ with period $p$.

Proof. First of all let $Q_n$ be periodic on $\mathbb{R}$ with period $p$, and with $Q_n(1+) = Q_n(1)$. Then, clearly, $Q_n$ is periodic on $I$ with period $p'$ dividing $p$. Let $\varepsilon > 0$ be chosen so that for all $n \ge 0$ there are no discontinuities of $Q_n$ in the intervals $[-\omega - \varepsilon, -\omega)$ and $(1, 1 + \varepsilon]$, and such that $\phi_1$ and $\phi_2$ map $[-\omega - \varepsilon, -\omega)$ into $I$. Such an $\varepsilon$ exists since the discontinuities are isolated on $\mathbb{R}$, $Q_n$ is periodic, and $\phi_1(-\omega) = \phi_2(-\omega) = \omega^2$, which is not a discontinuity of any $Q_n$. Furthermore, since $Q_n$ has no discontinuities on $(1, 1 + \varepsilon]$, we have that $Q_n(x) = Q_n(1)$ for $x \in [1, 1 + \varepsilon]$ and so $Q_n$ is periodic on $[1, 1 + \varepsilon]$ with period dividing $p'$. Now for $x \in [-\omega - \varepsilon, -\omega)$, we have that $\phi_1(x)$, …
5 Analysis of the discontinuities
In order to study the piecewise-constant periodic orbits of the recurrence (1.5) with $Q_n(x) = \pm 1$ and with initial conditions $Q_0$, $Q_1$, it is helpful to consider the dynamics of the discontinuities of $Q_n$. We may define, for each $x \in \mathbb{R}$ and $n \ge 0$,
$$R_n(x) = \frac{Q_n(x+)}{Q_n(x-)}, \tag{5.1}$$
the ratio of the right-hand limit at $x$ to the left-hand limit at $x$. Since $Q_n(x) = \pm 1$, we have $R_n(x) = \pm 1$, and it is clear that $R_n(x) = -1$ if and only if $Q_n$ has a discontinuity at $x$. Since $Q_n$ has at most finitely many discontinuities in any compact interval we have that $R_n$ is well defined. Because of the multiplicative nature of the recurrence (1.5), and because $\phi_1$ is orientation-reversing whereas $\phi_2$ is orientation-preserving, we have
$$R_n(x) = \frac{Q_n(x+)}{Q_n(x-)} = \frac{Q_{n-1}(\phi_1(x)-)\, Q_{n-2}(\phi_2(x)+)}{Q_{n-1}(\phi_1(x)+)\, Q_{n-2}(\phi_2(x)-)}, \tag{5.2}$$
so, using $R_n(x) = 1/R_n(x)$, we obtain
$$R_n(x) = R_{n-1}(\phi_1(x))\, R_{n-2}(\phi_2(x)), \tag{5.3}$$
and $R_n$ satisfies the same recurrence relation as $Q_n$. However $R_n(x) = 1$ except at points of discontinuity of $Q_n$, where $R_n(x) = -1$. We first of all discuss the dynamics of $R_n$ and then relate the dynamics of $Q_n$ to those of $R_n$. Indeed, it is clear that if $Q_n$ is periodic with period $p \in \mathbb{N}$, then $R_n$ is also periodic with period $m$ dividing $p$. Our task is to determine the possible periods $m$ of $R_n$ and relate $m$ to $p$, the period of $Q_n$. From now on we assume that $Q_n$ is periodic with period $p$ and that $R_n$ is periodic with period $m$, and, in view of proposition 1, we only consider the behaviour of $Q_n$ and $R_n$ on the fundamental interval $[-\omega, 1]$. We denote by
$$D = \{x \in [-\omega, 1] : R_n(x) = -1 \text{ for some } n \ge 0\}, \tag{5.4}$$
the restricted discontinuity set. Then $D$ is the set of points in the fundamental interval $[-\omega, 1]$ for which $Q_n$ has a discontinuity for at least one $n \ge 0$. One important observation is that since each $Q_n$ is piecewise-constant (and so the set of discontinuities of $Q_n$ on $[-\omega, 1]$ is finite), and since $Q_n$ is periodic, it follows that $D$ is a finite set.
5.1 The restricted discontinuity set and the map F
In this section we show that the restricted discontinuity set $D$ consists of finitely many periodic orbits of the map $F$. Indeed we have the following result:

Proposition 3. Let $Q_n$ be a periodic orbit of (1.5) with $Q_n(x) = \pm 1$, and let $D$ be the restricted discontinuity set. Then $D$ consists of a finite collection of periodic orbits of the map $F$.

For suppose $y \in D$. Then $R_n(y) = -1$ for some $n > 0$. From (5.3) we have that $R_{n - i_1}(\phi_{i_1}(y)) = -1$ for some $i_1 \in \{1, 2\}$. We therefore have $\phi_{i_1}(y) \in D$. Continuing in this way, we obtain a sequence $i_1, i_2, \ldots \in \{1, 2\}$ such that
$$\phi_{i_j} \circ \cdots \circ \phi_{i_1}(y) \in D \quad \text{for all } j \ge 1. \tag{5.5}$$
…
$$\phi_{a_{i-1}}(y_i) = y_{i-1}, \tag{5.6}$$
where here, and in what follows, we assume that expressions relating to the periodic orbit $y_0, y_1, \ldots, y_{k-1}$ are reduced modulo $k$. Moreover, by the results of section 3, we have that precisely one of $\phi_1(y_i)$, $\phi_2(y_i)$ …
$$R_n(y_i) = \begin{cases} R_{n-1}(y_{i-1}), & a_{i-1} = 1; \\ R_{n-2}(y_{i-1}), & a_{i-1} = 2, \end{cases} \tag{5.7}$$
where we have used the facts that $R_{n-2}(\ldots$ …
$$R_n(y_i) = R_{n - a_{i-1}}(y_{i-1}). \tag{5.8}$$
From this we see that $R_{n + a_0 + \cdots + a_{i-1}}(y_i) = R_n(y_0)$, so that if $y_0 \in D$ and $R_n(y_0) = -1$ we have $y_i \in D$, since $R_{n + a_0 + \cdots + a_{i-1}}(y_i) = R_n(y_0) = -1$. We conclude that not only must every point $y$ in $D$ be a periodic point of $F$, but that every point on the periodic orbit of $y$ also lies in $D$, so that $D$ consists of complete orbits of $F$. Since $D$ is finite, proposition 3 now follows. From (5.8) we see that only one of the factors in the right-hand side of (5.3) is different from $+1$, although which one depends on the code $a_0 a_1 \ldots a_{k-1}$. We also observe that in (5.8) $n$ decreases by $a_{i-1}$. Now, over the whole of the orbit $y_0, y_1, \ldots, y_{k-1}$ we have that $n$ decreases by
$$\ell = \sum_{i=0}^{k-1} a_i, \tag{5.9}$$
i.e.,
$$R_n(y_i) = R_{n - \ell}(y_i), \tag{5.10}$$
for $0 \le i \le k - 1$. It follows that we must have $m \mid \ell$. We therefore conclude the following:

Proposition 4. The period $m$ of the discontinuity function $R_n$ restricted to a periodic orbit $y_0, \ldots, y_{k-1}$ of $F$ divides $\ell$, the sum of the code over the orbit of $F$.

We now introduce three examples of periodic orbits of $F$ which we shall use to illustrate the theory as it develops.

Example 1. Period-4 orbit of $F$ with code 1122. Then $\ell = 6$.
Example 2. Period-4 orbit of $F$ with code 1211. Then $\ell = 5$.
Example 3. Period-6 orbit of $F$ with code 111222. Then $\ell = 9$.
5.2 The discontinuity matrix
Let $y_0 y_1 \ldots y_{k-1}$ be a periodic orbit of $F$ with code $a_0 a_1 \ldots a_{k-1}$, and with $\ell$ given by equation (5.9). We shall first consider the dynamics of $R_n$ on this orbit. It is helpful at this point to introduce an $\ell \times k$ matrix $M$, the discontinuity matrix, with entries $\pm 1$ defined by
$$M_{n,i} = R_n(y_i), \tag{5.11}$$
for $0 \le n \le \ell - 1$ and $0 \le i \le k - 1$. By (5.8), the entries satisfy
$$M_{n,i} = M_{n - a_{i-1},\, i-1}, \tag{5.12}$$
where here, and in what follows, indices referring to the periodicity of $R_n$ are reduced modulo $\ell$. The structure (5.12) can be more easily understood as follows. Column $i$ of the matrix $M$ is simply column $(i - 1)$ cyclically permuted downwards by $a_{i-1}$ single cyclic permutations. This observation also holds when $i = 0$, for then (5.12) becomes
$$M_{n,0} = M_{n - a_{k-1},\, k-1}. \tag{5.13}$$
Let us denote the column 0 by $(X_0, X_1, \ldots, X_{\ell-1})$, i.e., $M_{n,0} = X_n$ for $0 \le n \le \ell - 1$. Then the relation (5.12) tells us that
$$M_{n,1} = M_{n - a_0,\, 0} = X_{n - a_0}, \tag{5.14}$$
and, in general,
$$M_{n,i} = X_{n - \sum_{j=0}^{i-1} a_j}, \tag{5.15}$$
so that the columns of $M$ are simply cyclic permutations of the column 0 of $M$. As an illustration consider example 2. Recall that the code is 1211, the period $k$ is 4, and $\ell = 5$. The matrix $M$ is
$$M = \begin{pmatrix} X_0 & X_4 & X_2 & X_1 \\ X_1 & X_0 & X_3 & X_2 \\ X_2 & X_1 & X_4 & X_3 \\ X_3 & X_2 & X_0 & X_4 \\ X_4 & X_3 & X_1 & X_0 \end{pmatrix}. \tag{5.16}$$
Note that the column $i$ is obtained from the column $(i - 1)$ by cyclically permuting the column $(i - 1)$ downwards by $a_{i-1}$.
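The cyclic structure (5.15) translates directly into code. The following is a small sketch, with the function name `discontinuity_matrix` chosen here for illustration; it builds $M$ from a code and a choice of column 0. Passing string labels instead of $\pm 1$ makes the pattern of example 2 visible.

```python
def discontinuity_matrix(code, column0):
    """Build the ell x k matrix with M[n][i] = X_{n - (a_0 + ... + a_{i-1})}, eq. (5.15).

    code    -- list of symbols a_0, ..., a_{k-1}, each 1 or 2
    column0 -- sequence (X_0, ..., X_{ell-1}) with ell = sum(code)
    """
    ell, k = sum(code), len(code)
    if len(column0) != ell:
        raise ValueError("column 0 must have length equal to the sum of the code")
    shifts = [sum(code[:i]) for i in range(k)]  # partial sums; empty sum is 0
    return [[column0[(n - shifts[i]) % ell] for i in range(k)] for n in range(ell)]


if __name__ == "__main__":
    labels = ["X0", "X1", "X2", "X3", "X4"]
    for row in discontinuity_matrix([1, 2, 1, 1], labels):
        print(" ".join(row))
```

Row indices are reduced modulo $\ell$, which is the convention stated after (5.12), so the wrap-around relation (5.13) for column 0 holds automatically.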
5.3 Periodicity of the discontinuities
Not only does any periodic orbit $R_n$ with discontinuities at a periodic orbit $y_0, y_1, \ldots, y_{k-1}$ of $F$ have a discontinuity matrix $M$ with the structure (5.15), but also, conversely, any matrix $M$ satisfying (5.15) corresponds to a periodic orbit of $R_n$, by defining $R_n(x) = 1$ except on the points $y_0, \ldots, y_{k-1}$ where we define $R_n(y_i) = M_{n \bmod \ell,\, i}$ for $n \ge 0$. The period of $R_n$ certainly divides $\ell$, but may not actually be equal to $\ell$. Indeed, trivially, setting $M_{n,i} = -1$ for all $i$, $n$ gives a periodic orbit of period 1 for $R_n$. In fact the period of $R_n$ depends only on the period of column 0 of $M$ viewed as a sequence of $\pm 1$. This is because this column is periodic with period $m$ if and only if it is invariant under $m$ single cyclic permutations and $m$ is the least positive integer for which this is true, i.e., $X_{n+m} = X_n$ for all $n$ and, for each $1 \le j < m$, $X_{n+j} \ne X_n$ for some $n$. Now the other columns of $M$ are obtained from column 0 by cyclically permuting and thus they will also have period $m$. In fact, since any column of $M$ can be obtained from any other one by cyclic permutations it follows that all columns of $M$ have the same period $m$. Indeed, for $r \in \mathbb{N}$
$$M_{n+r,i} = X_{n + r - \sum_{j=0}^{i-1} a_j} = X_{n - \sum_{j=0}^{i-1} a_j} = M_{n,i} \tag{5.17}$$
if, and only if, $m \mid r$. This is because the $X_n$ have period $m$. Thus each column of $M$ has period $m$. We conclude that the period $m$ of $R_n$ is the period of the first column $(X_0, X_1, \ldots, X_{\ell-1})$ of $M$. It is now clear that $m \mid \ell$ and that for every $m$ dividing $\ell$ we can find a column $(X_0, X_1, \ldots, X_{\ell-1})$ with period $m$. It is worth remarking that the first two rows $n = 0$ and $n = 1$ of $M$ are not independent, so that, although the recursion (1.5) is second order, we cannot choose $R_0$ and $R_1$ arbitrarily on $y_0, \ldots, y_{k-1}$ and obtain a periodic orbit. We have therefore solved the question of the periodic behaviour of the discontinuities $R_n$ for a single periodic orbit $y_0, y_1, \ldots, y_{k-1}$ of $F$ with code $a_0 a_1 \ldots a_{k-1}$. In summary, the period $m$ of $R_n$ corresponds precisely to the period of a single column of $M$, i.e., $R_n(y_i)$ for any $0 \le i \le k - 1$. We have $m$ divides $\ell = \sum_{j=0}^{k-1} a_j$ and conversely, for every $m$ dividing $\ell$, we can, for suitable choice of the column 0 of $M$, viz., $(X_0, X_1, \ldots, X_{\ell-1})$, arrange for $(X_0, X_1, \ldots, X_{\ell-1})$, and thus $M$ and $R_n$, to have period $m$. Consider example 2. Here the first column is $(X_0, X_1, X_2, X_3, X_4)$. The only positive integers dividing $\ell = 5$ are 1 and 5, so the only possible periods in this case are $m = 1$ and $m = 5$. Setting $X_0 = X_1 = X_2 = X_3 = X_4 = -1$ gives period 1, whilst any other choice (with at least one $-1$) gives period 5. Setting $X_0 = X_1 = X_2 = X_3 = X_4 = 1$ gives period 1, but then the orbit of $F$ will not lie in $D$.
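The statement that the attainable periods of column 0 are exactly the divisors of $\ell$ can be checked by brute force for small $\ell$. The sketch below is illustrative only; the helper names `minimal_period` and `attainable_periods` are not from the paper.

```python
from itertools import product


def minimal_period(column):
    """Least m >= 1 with X_{n+m} = X_n for all n, indices taken modulo len(column)."""
    ell = len(column)
    for m in range(1, ell + 1):
        if ell % m == 0 and all(column[(n + m) % ell] == column[n] for n in range(ell)):
            return m


def attainable_periods(ell):
    """Set of minimal periods over all +/-1 columns of length ell, sorted."""
    return sorted({minimal_period(col) for col in product((1, -1), repeat=ell)})


if __name__ == "__main__":
    for ell in (5, 6, 9):  # the values of ell in examples 2, 1 and 3
        print(ell, attainable_periods(ell))
```

The enumeration runs over all $2^\ell$ columns, so it is only meant for the small values of $\ell$ that arise in the examples.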
5.4 Multiple periodic orbits in D
Having considered the dynamics of the discontinuity function $R_n$ on a single periodic orbit of $F$, we now consider the case in which the restricted discontinuity set $D$ consists of more than one periodic orbit of $F$. To do this, we must establish some notation. Firstly, let $t$ be the number of periodic orbits of $F$ in $D$. For $0 \le s \le t - 1$, let orbit $s$ consist of the points $y_0^s, \ldots, y_{k^s - 1}^s$, with period $k^s$ and code $a_0^s a_1^s \ldots a_{k^s - 1}^s$, and let
$$\ell^s = \sum_{i=0}^{k^s - 1} a_i^s. \tag{5.18}$$
Now, from the multiplicative structure of (5.3), we have that a product of solutions is again a solution of the equation. Moreover, because the periodic orbits in $D$ are distinct, and are never mapped to each other under the two maps $\phi_1$, $\phi_2$, we have that the dynamics of $R_n$ on each of the periodic orbits in $D$ are independent. Indeed, we may write
$$R_n(x) = \prod_{s=0}^{t-1} R_n^s(x), \tag{5.19}$$
where $R_n^s$ is the restriction of $R_n$ to the periodic orbit $s$, i.e.,
$$R_n^s(x) = \begin{cases} R_n(x), & x \in \{y_0^s, \ldots, y_{k^s - 1}^s\}; \\ 1, & \text{otherwise}. \end{cases} \tag{5.20}$$
We may apply the analysis of the previous subsections to each of the functions $R_n^s$. This is because $R_n^s(x) = 1$, except when $x$ is one of the points on the periodic orbit $y_0^s \ldots y_{k^s - 1}^s$ of $F$. In particular, for each orbit in $D$ we can formulate the $\ell^s \times k^s$ discontinuity matrix $M^s$, where, for $0 \le n \le \ell^s - 1$ and $0 \le i \le k^s - 1$,
$$M_{n,i}^s = R_n^s(y_i^s). \tag{5.21}$$
We observe that these matrices are independent of each other since the dynamics of $R_n$ on each periodic orbit in $D$ are independent. The theory for $R_n$ that we discussed above carries over in a straightforward manner to the function $R_n^s$. To simplify notation, we adopt the convention that, when dealing with periodic orbit $s$ and its matrix, expressions relating to the periodic orbit $y_0^s, \ldots, y_{k^s - 1}^s$ are reduced modulo $k^s$ whilst those relating to the periodicities of $R_n^s$ are reduced modulo $\ell^s$. Thus, as in (5.12), we have
$$M_{n,i}^s = M_{n - a_{i-1}^s,\, i-1}^s, \tag{5.22}$$
for $0 \le n \le \ell^s - 1$ and $0 \le i \le k^s - 1$. Writing column 0 of $M^s$ as $(X_0^s, X_1^s, \ldots, X_{\ell^s - 1}^s)$, we have
$$M_{n,i}^s = X^s_{n - \sum_{j=0}^{i-1} a_j^s}, \tag{5.23}$$
and the period $m^s$ of the column 0 is precisely the row period of $M^s$. We also have $m^s \mid \ell^s$. Conversely, let $\ell = \operatorname{lcm}(\ell^0, \ldots, \ell^{t-1})$. Then for any $m \mid \ell$ we define $m^s = \gcd(m, \ell^s)$. Then $m^s \mid \ell^s$ and by appropriate choices of $(X_0^s, X_1^s, \ldots, X_{\ell^s - 1}^s)$ we may construct a matrix $M^s$ with row period any $m^s$ dividing $\ell^s$, and, extending periodically to all $n \ge 0$, we have that $R_n^s$ has period $m^s$ restricted to the orbit $y_0^s, \ldots, y_{k^s - 1}^s$. We therefore have the following proposition for piecewise-constant, right-continuous functions taking the values $\pm 1$:

Proposition 5. Let $Q_n$ be a periodic orbit of (1.5). Then the period $m$ of the discontinuity function $R_n$ is given by
$$m = \operatorname{lcm}(m^0, \ldots, m^{t-1}), \tag{5.24}$$
where $m^s$ is the period of the function $R_n^s$ and is given by the period of $(X_0^s, X_1^s, \ldots, X_{\ell^s - 1}^s)$, i.e., column 0 of the discontinuity matrix $M^s$. Furthermore, $m$ divides
$$\ell = \operatorname{lcm}(\ell^0, \ldots, \ell^{t-1}). \tag{5.25}$$
Moreover, by appropriate choices of $(X_0^s, X_1^s, \ldots, X_{\ell^s - 1}^s)$, for any $m$ dividing $\ell$ we may construct a periodic orbit of $R_n$ with period $m$.
Let us illustrate this result when $D$ is the union of examples 1-3 in subsection 5.1. Then $\ell = \operatorname{lcm}(6, 5, 9) = 90$. Hence $R_n$ has period dividing 90 and, conversely, for any $m$ dividing 90, we may ensure that $R_n$ has period $m$.
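The arithmetic behind this illustration can be sketched as follows. The helper names (`lcm_all`, `orbit_periods`) are illustrative and not from the paper; the check confirms that, for every $m$ dividing 90, the choice $m^s = \gcd(m, \ell^s)$ used above recovers $m$ through (5.24).

```python
from functools import reduce
from math import gcd


def lcm(a, b):
    """Least common multiple of two positive integers."""
    return a * b // gcd(a, b)


def lcm_all(values):
    """Least common multiple of a list of positive integers."""
    return reduce(lcm, values, 1)


def orbit_periods(m, ells):
    """Per-orbit periods m^s = gcd(m, ell^s), as in the construction above."""
    return [gcd(m, ell) for ell in ells]


if __name__ == "__main__":
    ells = [6, 5, 9]  # code sums of examples 1, 2 and 3
    ell = lcm_all(ells)
    print("ell =", ell)
    for m in (d for d in range(1, ell + 1) if ell % d == 0):
        print(m, orbit_periods(m, ells), lcm_all(orbit_periods(m, ells)))
```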
6 The relationship between $Q_n$ and $R_n$
Let $D$ be the restricted discontinuity set. We now consider the period of the functions $Q_n$ and relate it to that of the discontinuity functions $R_n$. We first of all note that $R_n$ does not completely determine $Q_n$. However, $Q_n$ is determined by $R_n$ together with the value $Q_n(x)$ at a single point $x$. Although any choice of $x$ would be sufficient, for our purposes it is convenient to take $x = 1+$, the right-hand limit at $x = 1$. We write
$$Q_n^+ = Q_n(1+). \tag{6.1}$$
Indeed, since $Q_n$ is right-continuous, this is just $Q_n(1)$, but we write $Q_n^+$ to emphasise the fact that it is the right-hand limit. Now, on the fundamental interval, we have
$$Q_n(x) = Q_n(x+) = Q_n^+ \prod_{\substack{y \in D \\ x < y \le 1}} R_n(y) \tag{6.2}$$
and
$$Q_n(x-) = Q_n^+ \prod_{\substack{y \in D \\ x \le y \le 1}} R_n(y), \tag{6.3}$$
for $x \in [-\omega, 1]$. It follows that $Q_n$ is periodic with period $p$ if and only if $R_n$ is periodic with period $m$ dividing $p$ and $Q_n^+$ is periodic with period $p$, or $R_n$ is periodic with period $p = m$ and $Q_n^+$ is periodic with period dividing $p$. We can therefore reduce the problem of the periodicity of $Q_n$ to that of $Q_n^+$ and of $R_n$ on $[-\omega, 1]$. To simplify the notation in what follows we introduce the quantities
$$D_n = \prod_{y \in D} R_n(y), \qquad D_n^s = \prod_{i=0}^{k^s - 1} R_n^s(y_i^s). \tag{6.4}$$
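We now evaluate (1.5) at $x = 1+$ to obtain
$$Q_n^+ = Q_n(1+) = Q_{n-1}((-\omega)-)\, Q_{n-2}(1+) \tag{6.5}$$
$$= Q_{n-1}^+\, Q_{n-2}^+\, D_{n-1} \tag{6.6}$$
$$= (Q_{n-2}^+)^2\, Q_{n-3}^+\, D_{n-1} D_{n-2} \tag{6.7}$$
$$= Q_{n-3}^+\, D_{n-1} D_{n-2}, \tag{6.8}$$
where we have used the fact that $(Q_{n-2}^+)^2 = 1$ since $Q_{n-2}^+ = \pm 1$. Now each of the products $D_{n-1}$, $D_{n-2}$ in (6.8) is a product of entries in the matrices $M^s$ for $0 \le s \le t - 1$. Indeed for each $n \ge 0$, with the row index of $M^s$ reduced modulo $\ell^s$,
$$D_n = \prod_{s=0}^{t-1} D_n^s = \prod_{s=0}^{t-1} \prod_{i=0}^{k^s - 1} M_{n,i}^s, \tag{6.9}$$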
so in (6.8) we have expressed $Q_n^+$ in terms of $Q_{n-3}^+$ and a product of entries in the matrices $M^s$, $0 \le s \le t - 1$, and hence of the $X_n^s$. Now an orbit of the second order recurrence (1.5) is periodic with period $p$ if and only if $Q_0 = Q_p$ and $Q_1 = Q_{p+1}$, where $p$ is the least such positive integer. We know that $p$ is a multiple of $m$, the period of $R_n$. To obtain the relationship between $p$ and $m$, we investigate $Q_m^+/Q_0^+$ and $Q_{m+1}^+/Q_1^+$. We note that $p = m$ if and only if both of these ratios have value 1, i.e.,
$$\frac{Q_m^+}{Q_0^+} = \frac{Q_{m+1}^+}{Q_1^+} = 1. \tag{6.10}$$
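The role of the ratios in (6.10) can be explored with a toy computation. The sketch below makes a simplifying assumption that is not in the paper: $D_n$ is treated as a freely chosen $\pm 1$ sequence repeating with period $m$ (given as the list `d_seq`), rather than being derived from discontinuity matrices. It iterates (6.6) and reports the smallest $c \in \{1, 2, 3\}$ for which the pair $(Q_n^+, Q_{n+1}^+)$ returns to its initial value after $cm$ steps; the function names are illustrative.

```python
from itertools import product


def q_plus(d_seq, q0, q1, length):
    """Iterate Q+_n = Q+_{n-1} * Q+_{n-2} * D_{n-1}, eq. (6.6), with D of period len(d_seq)."""
    m = len(d_seq)
    q = [q0, q1]
    for n in range(2, length):
        q.append(q[n - 1] * q[n - 2] * d_seq[(n - 1) % m])
    return q


def period_ratio(d_seq, q0, q1):
    """Smallest c in {1, 2, 3} with (Q+_{cm}, Q+_{cm+1}) = (Q+_0, Q+_1), else None."""
    m = len(d_seq)
    q = q_plus(d_seq, q0, q1, 3 * m + 2)
    for c in (1, 2, 3):
        if q[c * m] == q0 and q[c * m + 1] == q1:
            return c
    return None


if __name__ == "__main__":
    for d_seq in product((1, -1), repeat=3):
        for q0, q1 in product((1, -1), repeat=2):
            print(d_seq, (q0, q1), period_ratio(d_seq, q0, q1))
```

In this toy setting the returned value is the ratio $p/m$ for the sequence $Q_n^+$ alone, which is consistent with the statement in the introduction that $p = m$, $2m$ or $3m$.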
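In what follows we shall need to evaluate products of the form
$$\prod_{\substack{n \not\equiv r \bmod 3 \\ r < \cdots}} D_n, \tag{6.11}$$
where $m > 1$ and $r \in \{0, 1, 2\}$. We therefore prove the following lemma, which we shall use in our subsequent work.

Lemma 2. Let $a'$ be a positive integer and let $D_n = \pm 1$ have period dividing $a'$, i.e., $D_{n + a'} = D_n$ for all $n$. Let $b'$ be a positive integer with $b' \equiv 0 \bmod 3$ and $a' \mid b'$, and let $r \in \{0, 1, 2\}$. Then:

1. if $a' \equiv 0 \bmod 3$ then
$$\prod_{\substack{n \not\equiv r \bmod 3 \\ r < \cdots}} D_n = \Big( \prod_{\substack{n \not\equiv r \bmod 3 \\ r < \cdots}} D_n \Big)^{b'/a'}; \tag{6.12}$$

2. if $a' \equiv 0 \bmod 3$ then
$$\prod_{\substack{n \not\equiv r \bmod 3 \\ r < \cdots}} D_n = 1; \tag{6.13}$$

3. if $a' \not\equiv 0 \bmod 3$ then
$$\prod_{\substack{n \not\equiv r \bmod 3 \\ r < \cdots}} D_n = 1; \tag{6.14}$$

4. if $a' \not\equiv 0 \bmod 3$ then
$$\prod_{\substack{n \not\equiv r \bmod 3 \\ r < \cdots}} D_n = 1. \tag{6.15}$$

Proof. Suppose that $a' \equiv 0 \bmod 3$. Then
$$\prod_{\substack{n \not\equiv r \bmod 3 \\ r < \cdots}} D_n = \prod_{j=0}^{b'/a' - 1}\ \prod_{\substack{n \not\equiv r \bmod 3 \\ ja' + r < \cdots}} D_n \tag{6.16}$$
$$= \prod_{j=0}^{b'/a' - 1}\ \prod_{\substack{n - ja' \not\equiv r \bmod 3 \\ r < \cdots}} D_n \tag{6.17}$$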