Global Analysis of Dynamical Systems


Author: H. W. Broer | B. Krauskopf | Gert Vegter (editors)


Global Analysis of Dynamical Systems

Festschrift dedicated to Floris Takens for his 60th birthday

Edited by

Henk W Broer
University of Groningen, The Netherlands

Bernd Krauskopf
University of Bristol, UK

and

Gert Vegter
University of Groningen, The Netherlands

Institute of Physics Publishing Bristol and Philadelphia

© IOP Publishing Ltd 2001

All rights reserved. No part of this publication may be reproduced, stored in a retrieval system or transmitted in any form or by any means, electronic, mechanical, photocopying, recording or otherwise, without the prior permission of the publisher. Multiple copying is permitted in accordance with the terms of licences issued by the Copyright Licensing Agency under the terms of its agreement with the Committee of Vice-Chancellors and Principals.

British Library Cataloguing-in-Publication Data
A catalogue record for this book is available from the British Library.

ISBN 0 7503 0803 6

Library of Congress Cataloging-in-Publication Data are available

Commissioning Editor: James Revill
Production Editor: Simon Laurenson
Production Control: Sarah Plenty
Cover Design: Frédérique Swist
Marketing Executive: Colin Fenton

Published by Institute of Physics Publishing, wholly owned by The Institute of Physics, London

Institute of Physics Publishing, Dirac House, Temple Back, Bristol BS1 6BE, UK

US Office: Institute of Physics Publishing, The Public Ledger Building, Suite 1035, 150 South Independence Mall West, Philadelphia, PA 19106, USA

Typeset in TeX using the IOP Bookmaker Macros

Printed in the UK by MPG Books Ltd, Bodmin

Contents

Preface

1 Forced oscillations and bifurcations
  Floris Takens

2 Historical behaviour in smooth dynamical systems
  David Ruelle
  References

3 Implicit formalism for affine-like maps and parabolic composition
  Jacob Palis and Jean-Christophe Yoccoz
  On Floris as a friend
  3.1 On homoclinic bifurcations
  3.2 Implicit formalism for affine-like maps
  3.3 Parabolic composition
  References

4 Strong resonances and Takens’s Utrecht preprint
  Bernd Krauskopf
  4.1 Setting and notation
  4.2 Zq-equivariant normal forms
  4.3 Weak resonance
  4.4 Strong resonances
  4.5 Strong resonance for q = 4
  4.6 From the normal form to the full dynamics
  References

5 Semi-local analysis of the k : 1 and k : 2 resonances in quasi-periodically forced systems
  Florian Wagener
  5.1 Preliminaries
  5.2 Normal form analysis
  5.3 Semi-local bifurcation analysis of the k : 1 resonance
  5.4 Semi-local bifurcation analysis of the k : 2 resonance
  5.5 Conclusions
  Acknowledgements
  References

6 Generic unfolding of the nilpotent saddle of codimension four
  Freddy Dumortier, Peter Fiddelaers and Chengzhi Li
  6.1 General setting and some history of the subject
  6.2 Specific setting and presentation of the results
  6.3 Local bifurcations
  6.4 Rescalings
  6.5 Bifurcations of heteroclinic saddle connections
  6.6 Study of $X_{\lambda,\delta} + O(\delta^2)$ as a perturbation of a family of Hamiltonian systems
  6.7 Uniformity of local bifurcation diagrams with respect to $\delta$
  Acknowledgements
  References

7 Exponential confinement of chaos in the bifurcation sets of real analytic diffeomorphisms
  Henk W Broer and Robert Roussarie
  7.1 Motivation: the Hopf–Takens bifurcation
  7.2 Setting of the problem and main result
  7.3 Applications
  7.4 Averaging
  7.5 Linearization near a hyperbolic orbit
  7.6 Exponential confinement: proof of the main theorem 7.1
  7.7 Concluding remarks
  Acknowledgements
  References

8 Takens–Bogdanov bifurcations without parameters and oscillatory shock profiles
  Bernold Fiedler and Stefan Liebscher
  8.1 Examples
  8.2 Bifurcations from lines of equilibria
  8.3 Bifurcations from planes of equilibria
  8.4 Normal forms
  8.5 Scaling, alias blow-up
  8.6 Complete integrability to scaling order zero
  8.7 Slow flow of first integrals to order ε
  8.8 Elliptic integrals and Weierstrass functions
  8.9 Poincaré flows of first integrals
  8.10 Higher order: Poincaré flow, averaging, Melnikov
  8.11 Geometry of Poincaré maps
  8.12 Stiff hyperbolic balance laws
  Appendix. Derivation of the normal form
  Acknowledgements
  References

9 Global bifurcations of periodic orbits in the forced Van der Pol equation
  John Guckenheimer, Kathleen Hoffmann and Warren Weckesser
  9.1 The slow flow and its bifurcations
  9.2 Symmetry and return maps
  9.3 Degenerate decompositions and fixed points of H
  9.4 Concluding remarks
  Acknowledgements
  References

10 An unfolding theory approach to bursting in fast–slow systems
  Martin Golubitsky, Krešimir Josić and Tasso J Kaper
  10.1 A framework for classifying bursters
  10.2 The codimension-one burster
  10.3 Codimension-two bursters
  10.4 Codimension-three bursters
  Appendix. Case-by-case analysis of Hopf–steady-state bursters of section 10.3.4
  Acknowledgements
  References

11 The intermittency route to chaotic dynamics
  Lorenzo J Díaz, Isabel L Rios and Marcelo Viana
  11.1 Saddle–nodes of diffeomorphisms
  11.2 Transition maps
  11.3 Global aspects: ghost dynamics
  11.4 Prevalence of local and global strange attractors
  11.5 Persistence of tangencies
  Acknowledgements
  References

12 Homoclinic points in complex dynamical systems
  Robert L Devaney
  12.1 Baby Mandelbrot sets
  12.2 The complex saddle–node
  12.3 Dynamics of $P_0$
  12.4 Homoclinic points
  12.5 Ecalle cylinders for $P_c$
  12.6 Polynomial-like maps
  12.7 Proof of theorem 12.1
  References

13 Excitation of elliptic normal modes of invariant tori in volume preserving flows
  Mikhail B Sevryuk
  13.1 Background on elliptic normal modes
  13.2 Aim and notation
  13.3 Persistence theorem for tori of codimension greater than one
  13.4 Persistence theorem for tori of codimension one
  13.5 Excitation of the elliptic normal mode
  Acknowledgements
  References

14 On the global dynamics of Kirchhoff’s equations: rigid body models for underwater vehicles
  Heinz Hanßmann and Philip Holmes
  14.1 Setting of the problem
  14.2 Normalization of the Hamiltonian
  14.3 Dynamics of the normal forms
  14.4 Implications for the original system
  14.5 Conclusions
  Acknowledgements
  References

15 Global dynamics and fast indicators
  Carles Simó
  15.1 A model problem in 1½ degrees of freedom
  15.2 Bounding the region of interest
  15.3 Estimating the fraction of integrability
  15.4 A model problem in 2½ degrees of freedom
  15.5 Computation of the local exponential growth of the distance
  15.6 Conclusion
  Appendix. Taylor series methods of integration
  Acknowledgements
  References

16 A general nonparametric bootstrap test for Granger causality
  Cees Diks and Jacob DeGoede
  16.1 Granger causality
  16.2 Information theoretic test statistic
  16.3 Bootstrap procedures
  16.4 Monte Carlo simulations
  16.5 Summary and discussion
  References

17 Birkhoff averages and bifurcations
  Ale Jan Homburg and Todd Young
  17.1 General assumptions and notations
  17.2 Likely rotation numbers
  17.3 Discontinuity of averages
  17.4 Local embedding flows for the saddle–node
  17.5 The Mather invariant and return maps
  17.6 Intermittency near the saddle–node bifurcation
  17.7 Boundary crisis bifurcations
  References

18 The multifractal analysis of Birkhoff averages and large deviations
  Yakov Pesin and Howard Weiss
  18.1 Main result
  18.2 Applications and connections to probability and number theory
  Appendix. Equilibrium measures and thermodynamic formalism
  Acknowledgements
  References

19 Existence of absolutely continuous invariant probability measures for multimodal maps
  Henk Bruin and Sebastian van Strien
  19.1 Setting of the problem
  19.2 The multimodal summability condition
  19.3 Proof of theorem 19.1, part (a)
  19.4 Proof of theorem 19.1, part (b)
  References

20 On the dynamics of the renormalization operator
  Artur Avila, Welington de Melo and Marco Martens
  20.1 Formulation of the main result
  20.2 Decompositions
  20.3 Proof of theorem 20.1
  Appendix. Behavior of the composition operator
  Acknowledgements
  References

Author index


Preface

nunc demittis servum tuum Domine secundum verbum tuum in pace quia viderunt oculi mei salutare tuum Lc 2: 29–30

This book is a liber amicorum for Floris Takens in celebration of his 60th birthday. It was offered to him at the workshop Global Analysis of Dynamical Systems, held at the Lorentz Center of Leiden University in June 2001. This collection of contributions from long-time collaborators, colleagues and friends of Floris constitutes an up-to-date document of central developments in Dynamical Systems that have emerged around his work.

Floris Takens: a total mathematician

We present some historical details, without attempting to be complete. In 1969 Floris Takens finished his thesis, entitled The minimal number of critical points of a function on a compact manifold and the Lusternik–Schnirelman category, under the supervision of Nico Kuiper in Amsterdam. In 1972, at the age of 31, he was appointed professor at the Department of Mathematics of Groningen University. His chair was officially in Differential Topology, in particular Dynamical Systems.

Beginning with the work of Poincaré on celestial mechanics, geometrical methods had been introduced into dynamical systems research. In the 1960s and 1970s the area of Dynamical Systems gained further momentum in this direction through the impact of the Fields Medalists Stephen Smale (University of California at Berkeley) and René Thom (Institut des Hautes Études Scientifiques), who were both originally topologists. Thom later became famous for his catastrophe theory, to which Christopher Zeeman (University of Warwick) also contributed a lot.

Given his background and abilities Floris Takens fitted perfectly into this culture. During 1969–70, he had been a guest at the Institut des Hautes Études Scientifiques (IHES) in Bures-sur-Yvette near Paris. This visit brought him into contact with René Thom and David Ruelle, who strongly affected his further development. One of the members of Takens’s own generation is the Brazilian Jacob Palis, who did his PhD research with Smale. Since 1971 Palis and Takens have been involved in an extensive and extremely fruitful scientific collaboration. For many years Floris has been a guest of IMPA, the Instituto de Matemática Pura e Aplicada in Rio de Janeiro, for several months at a time. Many other colleagues also participated in this program of visits, one of them being Sheldon Newhouse (University of North Carolina, later Michigan State University).

Floris Takens became one of the founding fathers of the modern discipline of Dynamical Systems. He has built up a great scientific reputation and has established many international contacts. He was the supervisor of about 20 PhD students, often in co-supervision with colleagues. We mention here in chronological order his Groningen PhD graduates Albert Hummel, Henk Broer, Gert Vegter, Fopke Klok, Jan Barkmeijer, Cars Hommes, Ale Jan Homburg, Bernd Krauskopf, Florian Wagener and Evgeny Verbitsky. Outside Groningen he co-supervised Freddy Dumortier, Bert Jongen and Sebastian van Strien, and later also Jan-Pieter Pijn, Pieter Been, Cees Diks, Marcel van der Heijden and Michel van der Stappen. Many of Floris’s students are now enjoying scientific careers of international significance, quite a few of them building their own schools.

In the 1990s Floris Takens chaired the Department of Mathematics at Groningen for several years. At the national level, he chaired the Mathematical Research Institute, a cooperation between the Universities of Groningen, Nijmegen, Utrecht and Twente. He helped organize several programmes of the Netherlands Organization for Scientific Research (NWO) that were concerned with Nonlinear Dynamics, one of them being the successful Mathematical Physics programme. Since 1991 he has been an active member of the Royal Dutch Academy of Sciences (KNAW). On 5th January 2001 Floris Takens received an honorary doctorate from the Technical University of Delft in recognition of his work in mathematics and its applications in other sciences.

Floris is known for his strict self-discipline and a great sense of duty. He always arrived at the Department early in the morning, even if the night before there had been a heavy dinner. Perhaps this is why he dislikes it when others are late for appointments. We also mention Floris’s great interest in culture and art. He plays the flute and regularly attends performances of music in the Province of Groningen, in Feerwerd, Thesinghe or the Oosterpoort Theatre, to which he characteristically goes by bike.

In December 1999 Floris retired from the University of Groningen. He kept his office for, among other things, his editorial duties for the Springer Lecture Notes in Mathematics series. He also remains a faithful attender of colloquia and seminars. For Floris teaching obligations have now come to an end, but the work is not completed at all. Several papers and books are under construction. However, he now has more time to enjoy Groningen’s wealth of Northern German organs and other music performances, and to enjoy the observatory that has been constructed on top of his house in the Groningen village of Bedum.


The influence of Floris Takens

For Floris Takens the discipline of Mathematics is one entity, and this includes applications. This view agrees with his own career, where ‘pure’ differential topology and ‘applied’ time series analysis coexist. In his research papers, analysis, geometry and measure theory all have their natural place next to one another. This book bears witness to the influence of Floris Takens in the world of Dynamical Systems. We briefly sketch here only two areas; the contributions speak for themselves.

A major line of work of Floris Takens lies in bifurcation theory. Two important papers of his appeared in 1974: Singularities of vector fields (Publ. Math. IHES 43) and Forced oscillations and bifurcations. The latter is the often-cited ‘Utrecht preprint’, which contains, among other things, the treatment of what is now generally called the Bogdanov–Takens bifurcation. This paper is reprinted in its original form as chapter 1 of this book, so that it is finally generally accessible. In these 1974 papers the influence of Floris’s IHES visit, and in particular his contact with René Thom, can clearly be traced. This early work has stimulated much follow-up research in bifurcation theory, by Floris Takens himself, by his co-workers, by his PhD students and by many other colleagues. As several contributions in this book show, his influence lasts until the present day.

The paper On the nature of turbulence (published in Commun. Math. Phys. 20 1971) that Floris Takens wrote with David Ruelle is also connected with his work on bifurcation theory. It was a real breakthrough and has attracted a lot of attention from other disciplines, notably from Physics. A new idea was created, contradicting the standard views of Landau and Lifschitz regarding the onset of turbulence in fluid motion. The new concept was that of a strange attractor, later associated with the term chaos. This development is nicely described by James Gleick in his book Chaos: Making a New Science (Penguin Books 1987).

A second major line of his work started with Floris Takens’s paper Detecting strange attractors in turbulence (in Proc. Warwick Conference on Dynamical Systems and Turbulence (Springer Lecture Notes in Mathematics 898) 1980). It contains a much cited result that is now often called the Takens Embedding or Takens Reconstruction Theorem. It provides a highly effective and also purely behavioristic way of dealing with time series. Indeed, it is only assumed that the time series under consideration is generated by a deterministic dynamical system, but the equations of motion do not need to be known. Apart from its interest within the field of Dynamical Systems, which is still felt today, this paper was of great influence in many areas of applications, ranging from research on epilepsy to chemical process technology.
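As a rough illustration of the reconstruction idea mentioned above, the following Python sketch builds delay-coordinate vectors from a scalar time series. The names `delay_embed` and `logistic_series`, the parameters `dim` and `lag`, and the choice of the logistic map as a test signal are all assumptions made for this illustration; none of them come from the paper itself, and the sketch does not address how an embedding dimension or lag should be chosen in practice.

```python
from typing import List, Sequence, Tuple


def delay_embed(series: Sequence[float], dim: int, lag: int = 1) -> List[Tuple[float, ...]]:
    """Return the delay vectors (s[i], s[i+lag], ..., s[i+(dim-1)*lag])."""
    if dim < 1 or lag < 1:
        raise ValueError("dim and lag must be positive integers")
    span = (dim - 1) * lag        # index distance between first and last coordinate
    count = len(series) - span    # number of complete delay vectors available
    return [
        tuple(series[i + j * lag] for j in range(dim))
        for i in range(max(count, 0))
    ]


def logistic_series(x0: float, n: int) -> List[float]:
    """Hypothetical test signal: n iterates of x -> 4x(1-x) starting at x0."""
    xs = [x0]
    for _ in range(n - 1):
        xs.append(4.0 * xs[-1] * (1.0 - xs[-1]))
    return xs


series = logistic_series(0.2, 200)
vectors = delay_embed(series, dim=2, lag=1)
# Each vector is (x_i, x_{i+1}). Because the signal is deterministic, these
# points lie on the curve y = 4x(1-x), although delay_embed only sees the data.
```

In this toy case the structure of the generating rule shows up in the cloud of delay vectors without the rule being supplied to the reconstruction step, which is the behavioristic point made in the text.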


About this book

While the contributions in this book cover a broad range of topics in Dynamical Systems, we have tried to loosely organize them into groups. As mentioned earlier, chapter 1 is a reproduction of Floris Takens’s Utrecht preprint Forced oscillations and bifurcations. Chapters 2 and 3 are the contributions of Floris’s long-time collaborators David Ruelle, and Jacob Palis and Jean-Christophe Yoccoz, respectively. Chapters 4 to 10 deal with questions in bifurcation theory, chapters 11 and 12 are concerned with homoclinic bifurcations, and chapters 13 to 15 are in the general area of Hamiltonian dynamics. Chapter 16 is on time series analysis, and chapters 17 to 20 deal with questions of ergodic theory and one-dimensional maps.

All chapters are dedicated to Floris Takens on the occasion of his 60th birthday. We are very grateful to all contributors for their efforts and enthusiasm and for working within the strict deadlines we needed to impose. Finally, we thank Jim Revill and Simon Laurenson at Institute of Physics Publishing for their very pleasant and efficient cooperation in getting this book published on time.

Henk Broer, Bernd Krauskopf, Gert Vegter
Groningen and Bristol
April 2001

Chapter 1

Forced oscillations and bifurcations

Floris Takens
University of Groningen

This chapter is a reproduction of the original paper Takens F 1974 Forced oscillations and bifurcations In Applications of Global Analysis I Communications of the Mathematical Institute Rijksuniversiteit Utrecht 3

Note by the author

After the original publication in 1974, V I Arnol’d informed me that a complete proof of theorem 6.1 (Bogdanov–Takens bifurcation) was already available in Moscow in a manuscript by R I Bogdanov.

Note by the editors

We are very happy to reprint Floris Takens’s celebrated 1974 paper Forced oscillations and bifurcations in this book. This paper was available only as a preprint from the Mathematical Institute of the University of Utrecht. Therefore, it is known as Takens’s Utrecht Preprint in some circles. The paper is a worked-out manuscript of a talk given at the symposium Global Applications of Dynamical Systems held on 8th February 1973 at the Mathematical Institute in Utrecht. It is the first paper in Applications of Global Analysis I; the second paper is the paper L’Évolution Temporelle des Catastrophes by René Thom (not reprinted here). However, Forced oscillations and bifurcations was also available as a preprint on its own without Thom’s paper. It is estimated that a total of 60 preprints were printed, which were available from the Mathematical Institute in Utrecht for 6 Dutch Guilders.


Even though it was hard to come by, Forced oscillations and bifurcations has had much impact on the development of bifurcation theory. In particular, it contains the study of what is now known as the Bogdanov–Takens bifurcation. The influence of this work is still evident up to this date. For example, seven contributions in this book refer to this paper. And on a personal note, for all three of us, Forced oscillations and bifurcations, and Floris’s expertise and guidance in general, have been very influential for our careers in dynamical systems. We thank Floris for contributing his paper and for providing information on its history. Furthermore, we are very grateful to the Mathematical Institute of the University of Utrecht, and especially to Eduard Looijenga, its director of research, for agreeing to this reproduction and for providing us with the original typescript.

To aid referencing, the original page numbers appear in the running head above the original pages.

[The original contents page and original pp 1–59 of Forced oscillations and bifurcations are reproduced at this point in the book.]


Chapter 2

Historical behaviour in smooth dynamical systems

David Ruelle
Institut des Hautes Études Scientifiques

Dedicated to Floris Takens on the occasion of his 60th birthday.

When Floris Takens and myself proposed that hydrodynamic turbulence is described by what we called strange attractors, we expected neither the opposition that this idea would at first encounter, nor the importance it would later achieve in the development of chaos theory. On the occasion of this joint work [14] (and later) I noted the ability of Floris to combine the focused rigorous thinking appropriate for mathematical work, and the openmindedness necessary to study physical applications. Some openmindedness will also be required from the reader of the present note, because I raise some questions about smooth dynamical systems, which I am quite unable to attack in a proper mathematical way. But, by necessity, one must ask questions before they can be answered, and smooth dynamics is a particularly daunting subject in view of the many questions which one has absolutely no idea how to answer.

A question of great interest for physical applications is that of the long time behaviour of smooth dynamical systems. To fix ideas, let $f$ be a diffeomorphism of a compact manifold $M$, and let $x \in M$. What can one say about $f^n x$ for $n \to \infty$? To make the problem somewhat manageable we have to put restrictions on $f$ and $x$. A natural restriction is that we may assume $x \notin N$ where $N$ is some subset of zero Lebesgue measure of $M$. (By Lebesgue measure we mean a measure with smooth density w.r.t. Lebesgue, in charts of $M$. The concept of zero measure is then independent of choices.) As far as $f$ is concerned, one is happy if one can handle some nonempty open set in some topological space of diffeomorphisms.

The problem just outlined has been heavily researched, from Hadamard to Poincaré, through Kolmogorov and the Russian school, to Smale and his friends, to Jacob Palis and the Brazilian school, in which we may include Floris Takens. (There are many books, among which we may quote [4, 8, 9, 13, 15].) In the simplest examples, for $f$ in some open set (Morse–Smale diffeomorphisms), $f^n x$ tends to an attracting periodic orbit for Lebesgue almost all $x \in M$. This is what we would call predictable behaviour. In more general situations, for $f$ in some open set (Axiom A + No Cycles), and for Lebesgue almost all $x \in M$, $f^n x$ tends for $n \to \infty$ to an attractor $A$ (there are finitely many such attractors in $M$). In this situation, and again for Lebesgue almost all $x$, the measure

$$\frac{1}{n} \sum_{k=0}^{n-1} \delta_{f^k x} \tag{2.1}$$

tends weakly to a limiting measure $\mu$ when $n \to \infty$. For each attractor $A$ there is a unique $\mu$, which is ergodic, and is called the SRB measure; see [1, 12, 16]. In another situation (which is the object of KAM theory; see for instance [6, 7]) for $f$ in an open set of volume preserving diffeomorphisms and $x$ in a set of positive Lebesgue measure, $f^n x$ is confined to an $m$-torus, and (2.1) tends to an ergodic measure $\mu$ on this torus, so that the dynamical system $(\mu, f)$ is quasiperiodic.

In all these examples we have recurrent behaviour, described by a measure $\mu$. If this measure has positive entropy we say that we have chaos. Computer studies have shown that chaotic behaviour is quite common, but its study remains difficult in spite of the work of Pesin, L-S Young, Viana, and many others; see in particular [10, 11, 18, 19, 20].

But what about nonrecurrent behaviour? Is it possible that for a large set of diffeomorphisms $f$, and a set of positive measure of points $x \in M$, the measure (2.1) has no limit? This absence of limit is what we want to call historical behaviour. This means that, as the time $n$ tends to $\infty$, the point $f^n x$ keeps having new ideas about what it wants to do. Can such historical, nonrecurrent behaviour occur in a stable manner? (It is easy enough to concoct an example in two dimensions where $f^n x$ oscillates ever more slowly between two fixed points, so that (2.1) does not have a limit, but this example disappears after perturbation.)

It is apparently not known if historical behaviour, as described above, can occur in a persistent manner. (I am indebted to Michel Herman for confirming that.) Making a conjecture would involve choosing a topological space of diffeomorphisms, etc. Rather than going into such premature details let me explore the possibility that historical behaviour could be persistent. Here are a few arguments in favour of this possibility.

1. As discussed earlier in the text we have a two-dimensional example of historical behaviour, easily killed by perturbation. But features (like nonhyperbolic behaviour) which are nongeneric in low dimension, often become generic in higher dimension. This might happen for historical behaviour.

2. Computer studies of all but the simplest dynamical systems show that the convergence of (2.1) when $n \to \infty$ is often extraordinarily slow, if it takes place at all. (In this respect see for example [3].)

3. There are physical systems which are believed to have historical behaviour, and which somewhat resemble smooth dynamical systems. Specifically, spin glasses (see [5]) are infinite systems of spin with historical behaviour, and Markov partitions give a representation of hyperbolic systems as infinite systems of spins. (The resemblance is not close: hyperbolic systems correspond to one-dimensional chains of spins, while one would like to consider spin glasses in higher dimension. Also, the time evolutions are totally different.) Note the following intuitive view of why one expects historical behaviour for spin glasses. Their evolution is pictured as a random walk in a random potential. At a given time one is trapped in a valley of the potential. Eventually one crosses a barrier to another valley. As time goes on, deeper and deeper valleys are explored where one stays trapped for longer and longer. (For a rigorous study of this situation in one dimension, see [17].) Can a smooth dynamical system emulate a random walk in a random potential (and this in a persistent manner)?

We know that there are simple dynamical systems with very inventive time evolution, particularly among cellular automata (think of Conway’s Game of Life [2]). The question here is whether or not, for smooth dynamical systems, it is possible to get rid of ‘historical’ behaviour by eliminating ‘negligible sets’ of diffeomorphisms and of initial conditions.
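To make the distinction between convergent and non-convergent time averages concrete, the following Python sketch computes the running averages of an observable along a sequence, which corresponds to integrating that observable against the measure (2.1). Everything named here is an assumption made for illustration: the functions `running_averages`, `rotation_orbit` and `switching_sequence`, the golden-mean rotation number, and the observables. The circle rotation stands in for the simplest quasiperiodic case. The switching sequence is only a symbolic caricature of ever-longer sojourns near two states; it is not the orbit of a diffeomorphism and says nothing about persistence under perturbation.

```python
import math
from typing import Callable, List, Sequence


def running_averages(orbit: Sequence[float],
                     observable: Callable[[float], float]) -> List[float]:
    """Averages (1/n) * sum_{k=0}^{n-1} observable(orbit[k]) for n = 1..len(orbit)."""
    total = 0.0
    averages = []
    for n, point in enumerate(orbit, start=1):
        total += observable(point)
        averages.append(total / n)
    return averages


def rotation_orbit(x0: float, alpha: float, n: int) -> List[float]:
    """First n points of the circle rotation x -> x + alpha (mod 1)."""
    return [(x0 + k * alpha) % 1.0 for k in range(n)]


def switching_sequence(n: int) -> List[float]:
    """Toy sequence: sit at 0, then at 1, alternating, each stay twice as long as the last."""
    values: List[float] = []
    state, block = 0.0, 1
    while len(values) < n:
        values.extend([state] * block)
        state = 1.0 - state
        block *= 2
    return values[:n]


golden = (math.sqrt(5.0) - 1.0) / 2.0
# Recurrent case: fraction of time the rotation orbit spends in [0, 1/2).
recurrent = running_averages(rotation_orbit(0.0, golden, 10_000),
                             lambda x: 1.0 if x < 0.5 else 0.0)
# Caricature of historical behaviour: averages of the 0/1 switching sequence.
historical = running_averages(switching_sequence(2**16 - 1), lambda x: x)
```

For the rotation, the running average settles near 1/2, the length of the interval. For the switching sequence, the average at the end of each block alternates between values near 1/3 and 2/3, so it has no limit; this is the same mechanism as the example of oscillating ever more slowly between two fixed points.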

References

[1] Bowen R and Ruelle D 1975 The ergodic theory of Axiom A flows Inv. Math. 29 181–202
[2] Gardner M 1970 The fantastic combinations of John Conway’s new solitaire game ‘life’ Scientific American 223(October) 120–3
[3] Grebogi C, Ott E and Yorke J 1985 Super persistent chaotic transients Ergod. Theor. Dynam. Syst. 5 341–72
[4] Katok A and Hasselblatt B 1995 Introduction to the Modern Theory of Dynamical Systems (Cambridge: Cambridge University Press)
[5] Mézard M, Parisi G and Virasoro M A 1987 Spin Glass Theory and Beyond (Singapore: World Scientific)
[6] Moser J 1968 Lectures on Hamiltonian Systems (Memoirs AMS 81) (Providence, RI: American Mathematical Society)
[7] Moser J 1973 Stable and Random Motions in Dynamical Systems (Annals of Mathematical Studies) (Princeton, NJ: Princeton University Press)
[8] Palis J and de Melo W 1982 Geometric Theory of Dynamical Systems (Berlin: Springer)
[9] Palis J and Takens F 1993 Hyperbolicity and Sensitive Chaotic Dynamics at Homoclinic Bifurcations (Cambridge: Cambridge University Press)
[10] Pesin Ya B 1976 Invariant manifold families which correspond to non-vanishing characteristic exponents Izv. Akad. Nauk. SSSR Ser. Mat. 40(6) 1332–79 (Engl. transl. Math. USSR Izv. 10(6) 1261–305)
[11] Pesin Ya B 1977 Lyapunov characteristic exponents and smooth ergodic theory Usp. Mat. Nauk. 32(4) 55–112 (Engl. transl. Russ. Math. Surv. 32(4) 55–114)
[12] Ruelle D 1976 A measure associated with Axiom A attractors Am. J. Math. 98 619–54
[13] Ruelle D 1989 Elements of Differentiable Dynamics and Bifurcation Theory (Boston, MA: Academic)
[14] Ruelle D and Takens F 1971 On the nature of turbulence Commun. Math. Phys. 20 167–92
     Ruelle D and Takens F 1971 On the nature of turbulence Commun. Math. Phys. 23 343–4
[15] Shub M 1987 Global Stability of Dynamical Systems (Berlin: Springer)
[16] Sinai Ya G 1972 Gibbsian measures in ergodic theory Usp. Mat. Nauk. 27(4) 21–64 (Engl. transl. Russ. Math. Surv. 27(4) 21–69)
[17] Sinai Ya G 1982 Limit behaviour of one-dimensional random walks in random environments Teor. Veroyatn i ee Primen 27 247–58
[18] Viana M 1997 Multidimensional nonhyperbolic attractors Publ. Math. IHES 85 63–96
[19] Young L-S 1998 Statistical properties of dynamical systems with some hyperbolicity Ann. Math. 147 585–650
[20] Young L-S 1998 Developments in chaotic dynamics Not. Am. Math. Soc. 45 1318–28

Chapter 3

Implicit formalism for affine-like maps and parabolic composition

Jacob Palis
IMPA

Jean-Christophe Yoccoz
Collège de France

Dedicated to Floris Takens on his 60th birthday.

We consider two classes of surface maps, namely affine-like maps and parabolic maps, which play a central role in the analysis of non-uniformly hyperbolic dynamics. We study these maps through an implicit time-symmetric formalism and define the basic notion of distortion. We give estimates for the distortion of compositions of such maps.

On Floris as a friend

Floris Takens has been one of the main forces in the notable development of dynamical systems in the last three decades or so. One of us, Jacob, met Floris at the Institut des Hautes Études Scientifiques (IHES) in 1969. From the start Floris made a strong intellectual impression on the visitors of the Institute at that time, who, besides Jacob, included Steve Smale, Bob Williams and Michael Shub. Also, based on conversations at tea time and otherwise, the same can certainly be said with respect to the permanent members René Thom and David Ruelle. He would soon interact with several of us, most particularly Jacob himself and a bit later Sheldon Newhouse, a collaboration that spread through the years and a number of papers. Much more instantaneous in repercussion was David’s and Floris’s remarkable ‘turbulent proposal’: their paper [6] from the early seventies immediately became much commented on among mathematical physicists, fluid dynamicists and dynamicists in general, and it triggered a much


needed revival of the discussion on turbulence. The authors seemed firm but calm and rather unassuming during the heated (and lasting) debate they had created... and their fine personalities shone through. Floris had worked on singularity theory under Nicolaas Kuiper, but it is clear from the above that in no time at all he was moving at ease in dynamics, always in an elegant and enthusiastic way. At the same time, he looked very Dutch, as if he had stepped out of a painting from one of the famous Dutch schools of the Renaissance and post-Renaissance periods (for example, one by Breughel). He visited Paris several times afterwards during the seventies, but it was not there or back in Groningen that his first encounter with Jean-Christophe, then 23 years old, occurred: it was at the University of Warwick during a large meeting in dynamics in 1980. From that point on, we both have been much in touch with Floris, for instance almost every year at the Instituto de Matemática Pura e Aplicada (IMPA). Of course, many other colleagues were present on those occasions: often Sheldon, surely Welington de Melo, Ricardo Mañé and later Marcelo Viana and Lorenzo Diaz, and sometimes Dennis Sullivan and Floris’s former students Henk Broer and Sebastian van Strien, among others. On one such occasion Floris and Sheldon gave a flute–piano concert and they were much applauded. It seems that Sheldon rarely plays the piano nowadays but has turned to dancing (with Patricia), while Floris still keeps playing the flute. Welington would invite everybody to go sailing in Angra dos Reis, a real tropical paradise: Floris and Henk were famous for their strength (here physical) and nobody dared to throw them into the water, on the not so common occasions when someone had proposed this game. We must clarify that the water there is calm, clean and warm.
Warm was also the atmosphere at IMPA and that had much to do with Floris and the other colleagues who were constantly available, willing to participate in the many activities that were programmed and to interact profoundly with local colleagues. Together, they created a cosy and stimulating atmosphere, notably (but not only) in dynamics. The warmth and generosity of the visitors towards the young students, particularly on the part of Floris, became well known and these encounters were very fruitful. Indeed a number of these students grew into excellent mathematicians and they hold a special appreciation for Floris. We did see him in other places, like the International Center for Theoretical Physics (ICTP), in France at IHES, Dijon, Lyon, ..., in Groningen, and at Warwick, but in many ways our most vivid memories have to do with IMPA and Brazil. Floris also found fame for an almost infinite capacity for drinking caipirinhas at friendly reunions in the evenings: he easily won many bets having two and sometimes even three of us together ‘against’ him; more than once Jacob abandoned the field after becoming sick, while Floris seemed ready for the next round. On the occasion of celebrating the conclusion of Floris’s and Jacob’s book [3] on bifurcations of homoclinic orbits in Rio de Janeiro, Jean-Christophe took Floris home at the end of the party and later, for the first time, he had to deal (in a friendly way) with the local police for driving in a not very sober


condition... Another time, Floris and Henk threw a quick one-hour party at the end (a late afternoon) of a very good mathematical meeting in Groningen: no food was available, but there was plenty of good wine. We recall that somehow several of us, even the more reserved participants, broke our glasses before the end, and only after this happy hour did we go for dinner. We also like Floris’s simple, uncomplicated way of arranging his visits to other institutions and settling down immediately upon arrival. Of course, the secretaries rank him highly for that. Perhaps most consistently, he is known for being systematic: he always rents the same apartment when visiting Rio, which is made available to him by the lady owner in appreciation for his loyalty and gentlemanliness. Again consistently, he leaves the seminar room when the scheduled time is over, no matter who is talking.

In closing, we wish Floris the best and we hope that he will continue to be close to all of us for many years to come. Thank you, Floris.

3.1 On homoclinic bifurcations

Let $f$ be a smooth diffeomorphism of a surface $M$, and $p$ be a hyperbolic saddle periodic point for $f$. Recall that a homoclinic point for the orbit $O(p)$ of $p$ is a point $q \notin O(p)$ which belongs to both $W^s(O(p))$ and $W^u(O(p))$. When the intersection is transverse at $q$, one can continue it through a neighbourhood of $f$ in $\mathrm{Diff}^\infty(M)$. On the other hand, one says that $f$ has a non-degenerate homoclinic tangency at $q$ if $W^s(O(p))$ and $W^u(O(p))$ have a quadratic tangency at $q$. In this case, this tangency occurs on a codimension-one submanifold $U_0$ of a neighbourhood $U$ of $f$ in $\mathrm{Diff}^\infty(M)$, passing through $f$. This submanifold divides $U$ into two components $U_+$ and $U_-$: the intersection disappears in $U_-$ and ramifies into two intersections in $U_+$.

Newhouse and Palis [1] considered this setting, assuming that there is a neighbourhood $V$ of $O(p) \cup O(q)$ such that the maximal $g$-invariant set in $V$ is $O(p)$ (resp. $O(p) \cup O(q)$) when $g \in U_-$ (resp. $g \in U_0$); they proved that, for most $g \in U_+$, the maximal $g$-invariant set in $V$ is hyperbolic, and even a basic set for $g$.

Palis and Takens [2] considered the considerably more elaborate setting where the periodic point $p$ belongs to a saddle-like basic set $K$. They assumed that there exists a neighbourhood $V$ of $K \cup O(q)$ such that for $g \in U_-$ (resp. $g \in U_0$) the maximal $g$-invariant set in $V$ is $K$ (resp. $K \cup O(q)$). We have denoted here (and before) by $p$ and $K$ what are actually the hyperbolic continuations of $p$ and $K$ in $U$. They proved that, if the Hausdorff dimension of $K$ is smaller than 1, then the maximal $g$-invariant set in $V$ is again hyperbolic and even a basic set for most $g$ in $U_+$.

In the two results above, the expression ‘for most $g$’ means that the property


is true for a set of $g \in U_+$ which meets any one-parameter family transverse to $U_0$ along a set of full density at $U_0$.

In a forthcoming paper, whose results are announced in [5], we study the more difficult case where the Hausdorff dimension of $K$ is larger than 1 (but not too large). It was known from [4] that in this case one cannot hope for a uniformly hyperbolic maximal $g$-invariant set for most $g \in U_+$. However, we are able to prove that the dynamics on the maximal $g$-invariant set are, for most $g$, non-uniformly hyperbolic, and rather similar to Hénon-like attractors in a saddle-like way.

In the following, we present two classes of maps, affine-like maps and parabolic maps, which play a fundamental role in our analysis of non-uniformly hyperbolic dynamics on surfaces.

Affine-like maps occur in a natural way when looking at basic sets through Markov partitions. We use a time-symmetric implicit formalism to deal with these maps, which is close to generating functions in symplectic dynamics. The main technical ingredient is the distortion, which measures the deviation from affine maps and must be scaled in an appropriate way.

Parabolic maps are related to the quadratic tangency which occurs at the homoclinic bifurcation, as first return maps in a rectangle following the orbit of $q$. To study the full dynamics in $V$ we thus need to compose parabolic maps with affine-like maps, both on the left and on the right, and to know under which circumstances one again obtains an affine-like map. This is only possible if a transversality condition is satisfied. The most important result in this line is an estimate of the distortion of the resulting maps under the transversality hypothesis.

3.2 Implicit formalism for affine-like maps

Let $I_0^s$, $I_0^u$, $I_1^s$, $I_1^u$ be non-trivial compact intervals and $R_0 = I_0^s \times I_0^u$, $R_1 = I_1^s \times I_1^u$ be the associated rectangles in $\mathbb{R}^2$. A horizontal strip in $R_i$ is a subset of the form $\{\varphi^-(x_i) \le y_i \le \varphi^+(x_i)\}$, where $(x_i, y_i)$ denote the coordinates in $R_i$ and $\varphi^-$, $\varphi^+$ are defined and continuous on $I_i^s$, with values in $I_i^u$, and satisfy $\varphi^-(x_i) < \varphi^+(x_i)$ for all $x_i \in I_i^s$. Similarly, a vertical strip is a subset of the form $\{\psi^-(y_i) \le x_i \le \psi^+(y_i)\}$, where $\psi^-, \psi^+ \in C(I_i^u, I_i^s)$ satisfy $\psi^-(y_i) < \psi^+(y_i)$ for all $y_i \in I_i^u$.

Denote by $\pi$ the canonical projection from $R_0 \times R_1$ onto $I_0^u \times I_1^s$. Consider a $C^1$ diffeomorphism $F$ whose domain is a vertical strip in $R_0$ and whose image is a horizontal strip in $R_1$. We will say that $F$ is affine-like if the restriction of $\pi$ to the graph $\Gamma_F \subset R_0 \times R_1$ of $F$ is a diffeomorphism onto $I_0^u \times I_1^s$. In other words, for any $\bar{y}_0 \in I_0^u$, $\bar{x}_1 \in I_1^s$ the image by $F$ of the horizontal segment $\{y_0 = \bar{y}_0\}$ (in the domain of $F$) must intersect the vertical segment $\{x_1 = \bar{x}_1\}$ in a single point where the intersection is transversal.

For such a map $F$, there exist $C^1$ maps $A : I_0^u \times I_1^s \to I_0^s$ and $B : I_0^u \times I_1^s \to I_1^u$ such that the inverse of $\pi : \Gamma_F \to I_0^u \times I_1^s$ is given by
\[ (y_0, x_1) \mapsto (A(y_0, x_1), y_0, x_1, B(y_0, x_1)). \]
When $F$ is $C^r$ ($r \in [1, +\infty]$ or $r = \omega$), $A$ and $B$ are also $C^r$.

We will denote by $A_x$, $A_y$ the partial derivatives $\partial A/\partial x_1$ and $\partial A/\partial y_0$, and similarly define $B_x$, $B_y$, $A_{xx}$, $A_{xy}$, .... We will frequently omit the argument of these functions when it is obvious from the context.

On $\Gamma_F$, we have
\[ \begin{aligned} dx_0 &= A_y\,dy_0 + A_x\,dx_1 \\ dy_1 &= B_y\,dy_0 + B_x\,dx_1 \end{aligned} \]
which we rewrite as
\[ \begin{aligned} dx_1 &= A_x^{-1}\,dx_0 - A_y A_x^{-1}\,dy_0 \\ dy_1 &= B_x A_x^{-1}\,dx_0 + (B_y - B_x A_y A_x^{-1})\,dy_0. \end{aligned} \]
(Observe that $A_x$, $B_y$ do not vanish on $I_0^u \times I_1^s$ because $F$ is a diffeomorphism.) Thus the Jacobian matrix of $F$ is
\[ DF = A_x^{-1} \begin{pmatrix} 1 & -A_y \\ B_x & A_x B_y - A_y B_x \end{pmatrix}. \]
Similarly,
\[ DF^{-1} = B_y^{-1} \begin{pmatrix} A_x B_y - A_y B_x & A_y \\ -B_x & 1 \end{pmatrix} \]
and
\[ \det DF = A_x^{-1} B_y. \]
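The formula for $DF$ can be checked numerically. The following is only a sketch under assumptions of ours: the functions $A$, $B$ and the coefficients below are hypothetical choices made for illustration and do not come from the text. The code builds $F$ from its implicit description and compares a finite-difference Jacobian with the closed formula.

```python
# Hypothetical implicit data (our choice, for illustration only):
#   x0 = A(y0, x1) = A_COEF * x1 + C_COEF * y0**2
#   y1 = B(y0, x1) = B_COEF * y0 + D_COEF * x1**2
A_COEF, B_COEF, C_COEF, D_COEF = 0.2, 0.25, 0.05, 0.05


def explicit_map(x0, y0):
    """Solve x0 = A(y0, x1) for x1, then evaluate y1 = B(y0, x1)."""
    x1 = (x0 - C_COEF * y0 ** 2) / A_COEF
    y1 = B_COEF * y0 + D_COEF * x1 ** 2
    return x1, y1


def jacobian_from_formula(x0, y0):
    """DF = A_x^{-1} [[1, -A_y], [B_x, A_x B_y - A_y B_x]]."""
    x1, _ = explicit_map(x0, y0)
    a_x, a_y = A_COEF, 2 * C_COEF * y0
    b_x, b_y = 2 * D_COEF * x1, B_COEF
    return [[1 / a_x, -a_y / a_x],
            [b_x / a_x, (a_x * b_y - a_y * b_x) / a_x]]


def jacobian_by_differences(x0, y0, h=1e-6):
    """Central finite differences of the explicit map."""
    fxp, fxm = explicit_map(x0 + h, y0), explicit_map(x0 - h, y0)
    fyp, fym = explicit_map(x0, y0 + h), explicit_map(x0, y0 - h)
    return [[(fxp[0] - fxm[0]) / (2 * h), (fyp[0] - fym[0]) / (2 * h)],
            [(fxp[1] - fxm[1]) / (2 * h), (fyp[1] - fym[1]) / (2 * h)]]


def max_entry_gap(x0, y0):
    """Largest entrywise difference between the two Jacobians."""
    f = jacobian_from_formula(x0, y0)
    d = jacobian_by_differences(x0, y0)
    return max(abs(f[i][j] - d[i][j]) for i in range(2) for j in range(2))


def determinant(m):
    return m[0][0] * m[1][1] - m[0][1] * m[1][0]
```

For this choice of $A$ and $B$ the determinant of the formula matrix should agree with $A_x^{-1} B_y$, which is constant here.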

Definition 3.1 (cone condition). Let $F$ be affine-like and let $u, v, \lambda > 0$ satisfy $1 \le uv \le \lambda^2$. We say that $F$ satisfies the cone condition $C(\lambda, u, v)$ if, for any tangent vector $(X_0, Y_0)$ (at some point of the domain of $F$) with image $(X_1, Y_1)$ under $TF$, we have

(i) if $|Y_0| \le u|X_0|$ then $|Y_1| \le v^{-1}|X_1|$ and $|X_1| \ge \lambda|X_0|$,
(ii) if $|X_1| \le v|Y_1|$ then $|X_0| \le u^{-1}|Y_0|$ and $|Y_0| \ge \lambda|Y_1|$.

Lemma 3.2. The cone condition is satisfied if and only if
\[ \lambda|A_x| + u|A_y| \le 1 \quad\text{and}\quad \lambda|B_y| + v|B_x| \le 1 \]
everywhere on $I_0^u \times I_1^s$.

Proof. By the formula for $DF$ above, we have
\[ \begin{aligned} X_1 &= A_x^{-1}(X_0 - A_y Y_0) \\ Y_1 &= A_x^{-1}(B_x(X_0 - A_y Y_0) + A_x B_y Y_0). \end{aligned} \]
In order to have $|X_1| \ge \lambda|X_0|$ whenever $|Y_0| \le u|X_0|$, it is necessary and sufficient to have $u|A_y| < 1$ and $|A_x|^{-1}(1 - u|A_y|) \ge \lambda$, which is equivalent to $u|A_y| + \lambda|A_x| \le 1$. Similarly, in order to have $|Y_0| \ge \lambda|Y_1|$ whenever $|X_1| \le v|Y_1|$, it is necessary and sufficient to have $v|B_x| + \lambda|B_y| \le 1$.

Now assume that both inequalities in lemma 3.2 are satisfied, and take $|Y_0| \le u|X_0|$. Then
\[ |Y_1| \le |B_x||X_1| + |B_y||Y_0| \le |B_x||X_1| + |B_y|\,u\lambda^{-1}|X_1| \le v^{-1}|X_1| \]
because $u\lambda^{-1} \le v^{-1}\lambda$. Similarly, if $|X_1| \le v|Y_1|$ then $|X_0| \le u^{-1}|Y_0|$.
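As a sanity check of lemma 3.2, the following sketch compares the two inequalities of the lemma with a sampled test of the cone condition at a single point, where $DF$ is determined by the four numbers $A_x$, $A_y$, $B_x$, $B_y$. The numerical values used in the usage note are hypothetical choices of ours, and random sampling can only detect violations; it does not prove the cone condition.

```python
import random


def forward(ax, ay, bx, by, x0, y0):
    """Apply DF: X1 = (X0 - A_y Y0) / A_x and Y1 = B_x X1 + B_y Y0."""
    x1 = (x0 - ay * y0) / ax
    return x1, bx * x1 + by * y0


def backward(ax, ay, bx, by, x1, y1):
    """Apply DF^{-1}: Y0 = (Y1 - B_x X1) / B_y and X0 = A_x X1 + A_y Y0."""
    y0 = (y1 - bx * x1) / by
    return ax * x1 + ay * y0, y0


def lemma_inequalities(ax, ay, bx, by, lam, u, v):
    """The two inequalities of lemma 3.2 at one point."""
    return (lam * abs(ax) + u * abs(ay) <= 1
            and lam * abs(by) + v * abs(bx) <= 1)


def cone_condition(ax, ay, bx, by, lam, u, v, samples=1000, seed=0):
    """Sampled test of (i) and (ii); vectors are normalised by homogeneity."""
    rng = random.Random(seed)
    tol = 1e-12
    for _ in range(samples):
        # (i): the vector (1, y0) satisfies |Y0| <= u |X0|.
        y0 = rng.uniform(-u, u)
        x1, y1 = forward(ax, ay, bx, by, 1.0, y0)
        if abs(y1) > abs(x1) / v + tol or abs(x1) < lam - tol:
            return False
        # (ii): the image vector (x1, 1) satisfies |X1| <= v |Y1|.
        x1 = rng.uniform(-v, v)
        x0, y0 = backward(ax, ay, bx, by, x1, 1.0)
        if abs(x0) > abs(y0) / u + tol or abs(y0) < lam - tol:
            return False
    return True
```

With $\lambda = 3$, $u = v = 1.5$, the values $(A_x, A_y, B_x, B_y) = (0.2, 0.2, 0.1, 0.25)$ satisfy both inequalities, while raising $A_x$ to $0.5$ breaks the first one; the sampled test is expected to agree with the lemma in both cases.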

3.2.1 Composition of affine-like maps

Let $R_0 = I_0^s \times I_0^u$, $R_1 = I_1^s \times I_1^u$, $R_2 = I_2^s \times I_2^u$ be three rectangles as above. Let $F$ be an affine-like diffeomorphism from a vertical strip of $R_0$ onto a horizontal strip of $R_1$, and let $F'$ be an affine-like diffeomorphism from a vertical strip of $R_1$ onto a horizontal strip of $R_2$.

In general, the composition $F' \circ F$ is not affine-like. The domain of $F' \circ F$ may, for instance, not be a vertical strip of $R_0$, and may even not be connected. Nevertheless, we get nice composition properties if we assume that $F$ and $F'$ satisfy cone conditions. More specifically, assume that $F$ satisfies a cone condition $C(\lambda, u, v)$, with $1 < uv < \lambda^2$, and that $F'$ satisfies a cone condition $C(\lambda', u, v)$ with $1 < uv < \lambda'^2$. Then one checks immediately that $F' \circ F$ is affine-like and satisfies the cone condition $C(\lambda\lambda', u, v)$.

3.2.2 Formulas for the composition

Let $F$, $F'$ be affine-like and satisfying cone conditions as above, and $F'' = F' \circ F$. Let $A$, $B$ be the maps on $I_0^u \times I_1^s$ defining $F$ implicitly as
\[ F(x_0, y_0) = (x_1, y_1) \iff \begin{cases} x_0 = A(y_0, x_1) \\ y_1 = B(y_0, x_1) \end{cases} \]
and let $A'$, $B'$ (on $I_1^u \times I_2^s$) and $A''$, $B''$ (on $I_0^u \times I_2^s$) be the maps similarly defining $F'$ and $F''$.


In the following calculations we study a composition
\[ (x_0, y_0) \xrightarrow{\;F\;} (x_1, y_1) \xrightarrow{\;F'\;} (x_2, y_2) \]
and we consider the six coordinates $x_0, x_1, x_2, y_0, y_1, y_2$ as functions on the graph of $F''$. We have
\[ x_0 = A(y_0, x_1), \quad y_1 = B(y_0, x_1), \quad x_1 = A'(y_1, x_2), \quad y_2 = B'(y_1, x_2) \]
and, hence,
\[ \begin{aligned} dy_1 &= B_y\,dy_0 + B_x\,dx_1 \\ dx_1 &= A'_y\,dy_1 + A'_x\,dx_2. \end{aligned} \]
Solving this system for $dx_1$, $dy_1$, we introduce
\[ \Delta := 1 - A'_y B_x \tag{3.1} \]
and observe that $\Delta > 1 - u^{-1}v^{-1} > 0$ and, hence, we get
\[ \begin{aligned} dx_1 &= \Delta^{-1}(A'_y B_y\,dy_0 + A'_x\,dx_2) \\ dy_1 &= \Delta^{-1}(B_y\,dy_0 + A'_x B_x\,dx_2). \end{aligned} \tag{3.2} \]
We substitute these formulas into
\[ \begin{aligned} dx_0 &= A_y\,dy_0 + A_x\,dx_1 \\ dy_2 &= B'_y\,dy_1 + B'_x\,dx_2 \end{aligned} \]
to finally get
\[ \begin{aligned} dx_0 &= (A_y + \Delta^{-1} A_x A'_y B_y)\,dy_0 + A_x A'_x \Delta^{-1}\,dx_2 \\ dy_2 &= B_y B'_y \Delta^{-1}\,dy_0 + (B'_x + B'_y A'_x B_x \Delta^{-1})\,dx_2. \end{aligned} \]
This means that the first partial derivatives of $A''$, $B''$ are given by
\[ A''_x = A_x A'_x \Delta^{-1}, \qquad B''_y = B_y B'_y \Delta^{-1} \tag{3.3} \]
and
\[ A''_y = A_y + A'_y (A_x B_y \Delta^{-1}), \qquad B''_x = B'_x + B_x (A'_x B'_y \Delta^{-1}). \tag{3.4} \]
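For maps whose implicit functions are linear, (3.3) and (3.4) give the implicit coefficients of the composition exactly, which makes them easy to test. The sketch below is an illustration under that simplifying assumption; the class name and the numerical coefficients in the usage note are hypothetical and not part of the text.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class LinearAffineLike:
    """Map defined implicitly by x0 = ay*y0 + ax*x1, y1 = by*y0 + bx*x1."""
    ax: float
    ay: float
    bx: float
    by: float

    def apply(self, x0, y0):
        """Explicit form: solve the first relation for x1, then evaluate y1."""
        x1 = (x0 - self.ay * y0) / self.ax
        y1 = self.by * y0 + self.bx * x1
        return x1, y1


def compose(first, second):
    """Implicit coefficients of second o first, following (3.1), (3.3), (3.4)."""
    delta = 1 - second.ay * first.bx                      # (3.1)
    return LinearAffineLike(
        ax=first.ax * second.ax / delta,                  # (3.3)
        by=first.by * second.by / delta,                  # (3.3)
        ay=first.ay + second.ay * first.ax * first.by / delta,    # (3.4)
        bx=second.bx + first.bx * second.ax * second.by / delta,  # (3.4)
    )


def residuals(first, second, x0, y0):
    """Errors in the implicit relations of the composition at one point."""
    x1, y1 = first.apply(x0, y0)
    x2, y2 = second.apply(x1, y1)
    both = compose(first, second)
    return (abs(x0 - (both.ay * y0 + both.ax * x2)),
            abs(y2 - (both.by * y0 + both.bx * x2)))
```

For instance, with `first = LinearAffineLike(0.2, 0.2, 0.1, 0.25)` and `second = LinearAffineLike(0.25, 0.1, 0.2, 0.2)`, both residuals should vanish up to rounding error at any point $(x_0, y_0)$.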


3.2.3 Distortion

Let $F$ be a $C^2$ affine-like map; thus $A$ and $B$ are also $C^2$. We want to consider the second-order partial derivatives of $A$ and $B$ (the maps defining $F$ implicitly) as a way to measure the distortion of $F$. But in order to achieve that, we need to scale some of these partial derivatives. Recall that $|A_x|$, $|B_y|$ do not vanish on $I_0^u \times I_1^s$. We consider the six functions
\[ \partial_x \log|A_x|, \quad \partial_y \log|A_x|, \quad A_{yy}, \quad \partial_y \log|B_y|, \quad \partial_x \log|B_y|, \quad B_{xx} \]
on $I_0^u \times I_1^s$. The largest absolute value attained by any of these six functions on $I_0^u \times I_1^s$ is called the distortion of $F$ and is denoted by $D(F)$.

3.2.4 Interpretation of partial derivatives of A and B

Let $F$ be $C^2$ affine-like as above. For fixed $x_1 = \bar{x}_1$, the curve $\{x_0 = A(y_0, \bar{x}_1)\}$ is the pre-image by $F$ of the vertical $\{x_1 = \bar{x}_1\}$; the partial derivatives $A_y$ and $A_{yy}$ measure the slope (relative to the vertical) and the curvature of the pre-image. On the other hand, taking $y_0$ as a coordinate on the pre-image, the relation $y_1 = B(y_0, \bar{x}_1)$ expresses how the pre-image is sent to the vertical $\{x_1 = \bar{x}_1\}$: thus, $B_y$ measures the vertical contraction, $\partial_y \log|B_y|$ is a measure of the logarithmic variation of contraction along the curve, and $\partial_x \log|B_y|$ is a measure of the logarithmic variation of contraction transverse to the curve.

Changing $F$ to $F^{-1}$ and vertical to horizontal curves, we obtain similar interpretations for the remaining partial derivatives $B_x$, $A_x$, $B_{xx}$, $\partial_x \log|A_x|$ and $\partial_y \log|A_x|$.

3.2.5 Formulas for the second-order derivatives of a composition

Let $F$, $F'$ be $C^2$ affine-like maps as above, satisfying a cone condition $C(\lambda, u, v)$ (with $1 < uv < \lambda^2$) and let $F'' = F' \circ F$ be the composition. Let also $A$, $B$, $A'$, $B'$, $A''$, $B''$ be as above. We have already computed the first-order derivatives of $A''$, $B''$ in terms of those of $A$, $B$, $A'$, $B'$, namely
\[ \begin{aligned} \log|A''_x| &= \log|A_x| + \log|A'_x| - \log\Delta \\ A''_y &= A_y + A'_y (A_x B_y \Delta^{-1}) \end{aligned} \tag{3.5} \]
(and similar formulas for $B''_x$ and $\log|B''_y|$). Taking derivatives in (3.5), one has to be careful that the variables $x_1$ (in $A$, $B$) and $y_1$ (in $A'$, $B'$) are really functions of $y_0$, $x_2$ whose partial derivatives are given by (3.2).


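First, we take the derivatives in (3.1) and get
\[ \begin{aligned} -\Delta_x &= B_x[A'_{yy}\Delta^{-1}A'_x B_x + A'_{xy}] + B_{xx}A'_x\Delta^{-1}A'_y \\ &= A'_x[B_x\,\partial_y \log|A'_x| + B_{xx}A'_y\Delta^{-1} + B_x^2 A'_{yy}\Delta^{-1}] \end{aligned} \tag{3.6} \]
\[ -\Delta_y = B_y[A'_y\,\partial_x \log|B_y| + A'_{yy}B_x\Delta^{-1} + (A'_y)^2 B_{xx}\Delta^{-1}]. \tag{3.7} \]
We now take derivatives in (3.5) with respect to $x_2$:
\[ \partial_x \log|A''_x| = \partial_x(\log|A_x|)\,\Delta^{-1}A'_x + \partial_x \log|A'_x| + \partial_y(\log|A'_x|)\,\Delta^{-1}A'_x B_x - \Delta_x\Delta^{-1} \]
which gives
\[ \begin{aligned} \partial_x \log|A''_x| &= \partial_x \log|A'_x| + A'_x\Delta^{-1}K_0 \\ K_0 &= \partial_x \log|A_x| + 2B_x\,\partial_y \log|A'_x| + B_{xx}A'_y\Delta^{-1} + B_x^2 A'_{yy}\Delta^{-1}. \end{aligned} \tag{3.8} \]
The next step is to take derivatives in (3.5) with respect to $y_0$:
\[ \partial_y \log|A''_x| = \partial_y(\log|A_x|) + \partial_x(\log|A_x|)\,\Delta^{-1}A'_y B_y + \partial_y(\log|A'_x|)\,B_y\Delta^{-1} - \Delta_y\Delta^{-1} \]
which gives
\[ \begin{aligned} \partial_y \log|A''_x| &= \partial_y \log|A_x| + B_y\Delta^{-1}K_1 \\ K_1 &= \partial_y \log|A'_x| + A'_y(\partial_x \log|A_x| + \partial_x \log|B_y|) + A'_{yy}B_x\Delta^{-1} + (A'_y)^2 B_{xx}\Delta^{-1}. \end{aligned} \tag{3.9} \]
Next, we take derivatives with respect to $y_0$ in the second equation of (3.5):
\[ \begin{aligned} A''_{yy} = {} & A_{yy} + A_{xy}\Delta^{-1}A'_y B_y + A_x B_y\Delta^{-1}A'_{yy}B_y\Delta^{-1} - A_x B_y A'_y\Delta_y\Delta^{-2} \\ & + A_{xy}B_y A'_y\Delta^{-1} + A_x B_{yy}A'_y\Delta^{-1} + A_{xx}B_y A'_y\Delta^{-2}A'_y B_y + A_x B_{xy}A'_y\Delta^{-2}A'_y B_y \end{aligned} \]
which gives
\[ \begin{aligned} A''_{yy} &= A_{yy} + A_x B_y\Delta^{-1}K_2 \\ K_2 &= A'_y(2\,\partial_y \log|A_x| + \partial_y \log|B_y|) \\ &\quad + (A'_y)^2 B_y\Delta^{-1}(2\,\partial_x \log|B_y| + \partial_x \log|A_x|) \\ &\quad + B_y\Delta^{-2}(A'_{yy} + (A'_y)^3 B_{xx}). \end{aligned} \tag{3.10} \]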


Exchanging $A$, $B$ and $B'$, $A'$ as well as $x$, $y$ we get the other three formulas
\[ \begin{aligned} \partial_y \log|B''_y| &= \partial_y \log|B_y| + B_y\Delta^{-1}K'_0 \\ K'_0 &= \partial_y \log|B'_y| + 2A'_y\,\partial_x \log|B_y| + A'_{yy}B_x\Delta^{-1} + (A'_y)^2 B_{xx}\Delta^{-1} \end{aligned} \tag{3.11} \]
\[ \begin{aligned} \partial_x \log|B''_y| &= \partial_x \log|B'_y| + A'_x\Delta^{-1}K'_1 \\ K'_1 &= \partial_x \log|B_y| + B_x(\partial_y \log|B'_y| + \partial_y \log|A'_x|) + B_{xx}A'_y\Delta^{-1} + B_x^2 A'_{yy}\Delta^{-1} \end{aligned} \tag{3.12} \]
\[ \begin{aligned} B''_{xx} &= B'_{xx} + B'_y A'_x\Delta^{-1}K'_2 \\ K'_2 &= B_x(2\,\partial_x \log|B'_y| + \partial_x \log|A'_x|) \\ &\quad + B_x^2 A'_x\Delta^{-1}(2\,\partial_y \log|A'_x| + \partial_y \log|B'_y|) \\ &\quad + A'_x\Delta^{-2}(B_{xx} + B_x^3 A'_{yy}). \end{aligned} \tag{3.13} \]

3.2.6 Distortion of the composition of affine-like maps

Let $F$, $F'$, $F'' = F' \circ F$ and $A$, $B$, ... be as above. We assume that $F$, $F'$ satisfy a cone condition $C(\lambda, u, v)$ with $1 < uv < \lambda^2$.

Proposition 3.3. One has
\[ D(F'') \le \max\bigl(D(F) + C_0 K(F)\max(D(F), D(F')),\; D(F') + C_0 K(F')\max(D(F), D(F'))\bigr) \]
where
\[ K(F) = \max_{I_0^u \times I_1^s} \max(|A_x|, |B_y|), \qquad K(F') = \max_{I_1^u \times I_2^s} \max(|A'_x|, |B'_y|) \]
and $C_0$ depends only on $u$, $v$.

Proof. One has $|A_y| < u^{-1}$, $|A'_y| < u^{-1}$, $|B_x| < v^{-1}$, $|B'_x| < v^{-1}$, $\Delta^{-1} < \frac{uv}{uv - 1}$ and $K(F) \le \lambda^{-1} < 1$, $K(F') \le \lambda^{-1} < 1$; the formulas (3.8) to (3.13) combined with these estimates immediately imply the statement of this proposition.

Corollary 3.4. For $i \ge 0$, let $R_i = I_i^s \times I_i^u$ be a rectangle and let $F_i$ be an affine-like $C^2$-diffeomorphism from a vertical strip in $R_i$ onto a horizontal strip in $R_{i+1}$. Assume that all $F_i$ satisfy a uniform cone condition $C(\lambda, u, v)$ with $1 < uv < \lambda^2$. Let $N > 0$ and $F = F_{N-1} \circ F_{N-2} \circ \cdots \circ F_1 \circ F_0$. Then
\[ D(F) \le C_1 \max_{0 \le i < N} D(F_i) \]
where $C_1$ depends only on $u$, $v$.

Proof. The statement follows by induction on $N$ from proposition 3.3, since we have $K(F) \le \lambda^{-N} \le (uv)^{-N/2}$.

3.3 Parabolic composition

3.3.1 Parabolic maps

Let $R_0 = I_0^s \times I_0^u$ and $R_1 = I_1^s \times I_1^u$ be rectangles as above. Consider a $C^\infty$ map $G$ which is a diffeomorphism of a neighbourhood of $R_0$ onto a neighbourhood of $R_1$. We say that such a map is parabolic if there exists in $I_0^u \times I_1^s$ a smooth simple arc $\Gamma_0$, with endpoints on the boundary of $I_0^u \times I_1^s$, dividing this rectangle into two components $\Gamma_+$ and $\Gamma_-$ such that

(i) for $(y_0, x_1) \in \Gamma_0$ the image $G(I_0^s \times \{y_0\})$ meets $\{x_1\} \times I_1^u$ in a single point, interior to both paths, where the two paths have a quadratic tangency;
(ii) for $(y_0, x_1) \in \Gamma_-$ the image $G(I_0^s \times \{y_0\})$ does not intersect $\{x_1\} \times I_1^u$;
(iii) for $(y_0, x_1) \in \Gamma_+$ the image $G(I_0^s \times \{y_0\})$ intersects $\{x_1\} \times I_1^u$ in two points, interior to both paths, where the intersection is transversal.

These conditions can be reformulated as follows: consider the intersection $\Gamma_G$ of the graph of $G$ with $I_0^s \times I_0^u \times I_1^s \times I_1^u \subset \mathbb{R}^4$. The restriction of the projection $\pi : (x_0, y_0, x_1, y_1) \mapsto (y_0, x_1)$ to $\Gamma_G$ is a fold map, with image $\Gamma_0 \cup \Gamma_+$, $\Gamma_0$ being the image of the critical locus.

Because $G$ is a diffeomorphism, tangents to the smooth curve $\Gamma_0$ cannot be horizontal or vertical. This means that there exists a smooth function $\theta$ on $I_0^u \times I_1^s$ such that

(i) $\theta > 0$ on $\Gamma_+$, $\theta < 0$ on $\Gamma_-$, and $\theta \equiv 0$ on $\Gamma_0$,
(ii) the partial derivatives $\theta_y$, $\theta_x$ do not vanish on $I_0^u \times I_1^s$.

The function $\theta$ is obviously far from being uniquely defined by these properties. Having chosen such a $\theta$, one defines a smooth function $w$ on $\Gamma_G$ by
\[ w^2 = \theta \circ \pi \tag{3.14} \]
(there are two choices for $w$; the other is $-w$).


Because $\theta_x$, $\theta_y$ do not vanish, one may solve (3.14) in $x_1$ or $y_0$, defining smooth maps $Y_0$, $X_1$ by
\[ y_0 = Y_0(w, x_1) \iff w^2 = \theta(y_0, x_1) \iff x_1 = X_1(w, y_0). \]
Considering $x_0$, $y_0$, $w$, $x_1$, $y_1$ as maps from $\Gamma_G$, we factorize $G$ as $G_+ \circ G_0 \circ G_-$, or
\[ (x_0, y_0) \xrightarrow{\;G_-\;} (w, y_0) \xrightarrow{\;G_0\;} (x_1, w) \xrightarrow{\;G_+\;} (x_1, y_1) \]
where $G_0(w, y_0) = (X_1(w, y_0), w)$ and $G_0^{-1}(x_1, w) = (w, Y_0(w, x_1))$. The maps $G_-$ and $G_+$ are diffeomorphisms determined by maps $X_0$, $Y_1$ as
\[ \begin{aligned} x_0 &= X_0(w, y_0) \\ y_1 &= Y_1(w, x_1) \end{aligned} \tag{3.15} \]
and the partial derivatives $X_{0,w}$, $Y_{1,w}$ do not vanish as $G_\pm$ are diffeomorphisms.

3.3.2 Parabolic composition: the setting

Consider as above rectangles $R_0$, $R_1$ and a parabolic map $G$. Consider also rectangles $\widetilde{R}_0 = \widetilde{I}_0^s \times \widetilde{I}_0^u$, $\widetilde{R}_1 = \widetilde{I}_1^s \times \widetilde{I}_1^u$, and an affine-like map $F_0$ (resp. $F_1$) from a vertical strip of $\widetilde{R}_0$ (resp. $R_1$) to a horizontal strip of $R_0$ (resp. $\widetilde{R}_1$). We want to investigate the composition $F_1 \circ G \circ F_0$. To motivate what follows we start with an example.

Example 3.5. We take $R_0 = \widetilde{R}_0 = R_1 = \widetilde{R}_1 = [-1, +1]^2$ and
\[ G(x_0, y_0) = (x_0^2 - y_0, x_0). \]
Then $\Gamma_0 = \{x_1 + y_0 = 0\}$, $\Gamma_+ = \{x_1 + y_0 > 0\}$, $\Gamma_- = \{x_1 + y_0 < 0\}$, and one can take
\[ \begin{aligned} \theta(y_0, x_1) &= y_0 + x_1 \\ Y_0(w, x_1) &= w^2 - x_1 \\ X_1(w, y_0) &= w^2 - y_0 \\ X_0(w, y_0) &= w \\ Y_1(w, x_1) &= w. \end{aligned} \]
Let $F_0$, $F_1$ be defined by
\[ \begin{aligned} F_0(\tilde{x}_0, \tilde{y}_0) &= (\mu_0\tilde{x}_0,\; \lambda_0^{-1}\tilde{y}_0 + c_0) \\ F_1(x_1, y_1) &= (\lambda_1(x_1 - c_1),\; \mu_1^{-1}y_1) \end{aligned} \]
so that the domain of $F_0$ is the vertical strip $\{|\tilde{x}_0| \le \mu_0^{-1}\}$ in $\widetilde{R}_0$. Its image is the horizontal strip $\{|y_0 - c_0| \le \lambda_0^{-1}\}$ in $R_0$; the domain of $F_1$ is the vertical strip $\{|x_1 - c_1| \le \lambda_1^{-1}\}$ in $R_1$, while its image is the horizontal strip $\{|\tilde{y}_1| \le \mu_1^{-1}\}$ in $\widetilde{R}_1$. We assume that $|\lambda_0|, |\lambda_1|, |\mu_0|, |\mu_1| > 1$ and also
\[ |c_0| + |\lambda_0|^{-1} < 1, \qquad |c_1| + |\lambda_1|^{-1} < 1. \]
The composition $F = F_1 \circ G \circ F_0$ is then given by
\[ F(\tilde{x}_0, \tilde{y}_0) = (\lambda_1\{\mu_0^2\tilde{x}_0^2 - \lambda_0^{-1}\tilde{y}_0 - c_0 - c_1\},\; \mu_1^{-1}\mu_0\tilde{x}_0). \]
In the system
\[ \begin{aligned} \tilde{x}_1 &= \lambda_1\{\mu_0^2\tilde{x}_0^2 - \lambda_0^{-1}\tilde{y}_0 - c_0 - c_1\} \\ \tilde{y}_1 &= \mu_1^{-1}\mu_0\tilde{x}_0 \end{aligned} \]
let us try to express $\tilde{x}_0$, $\tilde{y}_1$ in terms of $\tilde{y}_0$, $\tilde{x}_1$ as
\[ \begin{aligned} \tilde{x}_0 &= \varepsilon\mu_0^{-1}\{\lambda_0^{-1}\tilde{y}_0 + \lambda_1^{-1}\tilde{x}_1 + c_0 + c_1\}^{1/2} \\ \tilde{y}_1 &= \varepsilon\mu_1^{-1}\{\lambda_0^{-1}\tilde{y}_0 + \lambda_1^{-1}\tilde{x}_1 + c_0 + c_1\}^{1/2} \end{aligned} \]
with $\varepsilon \in \{-1, +1\}$. This defines smooth functions of $(\tilde{y}_0, \tilde{x}_1) \in [-1, +1]^2$, provided that we have
\[ c_0 + c_1 > |\lambda_0|^{-1} + |\lambda_1|^{-1}. \]
When this condition is satisfied, the domain of $F$ has two components (associated with the two choices for $\varepsilon$); the restrictions of $F$ to each of these components (which are vertical strips in $[-1, +1]^2$) are affine-like maps implicitly defined as above.

We now come back to the more general setting. We want to write a condition on $F_0$, $F_1$, $G$ which guarantees that the composition $F = F_1 \circ G \circ F_0$ has a domain consisting of two components which are vertical strips in $\widetilde{R}_0$, the restrictions of $F$ to these strips being affine-like.

3.3.3 The main computation

Let $F_0$ (resp. $F_1$) be implicitly defined by smooth maps $A_0$, $B_0$ on $\widetilde{I}_0^u \times I_0^s$ (resp. $A_1$, $B_1$ on $I_1^u \times \widetilde{I}_1^s$) by
\[ F_0(\tilde{x}_0, \tilde{y}_0) = (x_0, y_0) \iff \begin{cases} \tilde{x}_0 = A_0(\tilde{y}_0, x_0) \\ y_0 = B_0(\tilde{y}_0, x_0) \end{cases} \tag{3.16} \]
\[ F_1(x_1, y_1) = (\tilde{x}_1, \tilde{y}_1) \iff \begin{cases} x_1 = A_1(y_1, \tilde{x}_1) \\ \tilde{y}_1 = B_1(y_1, \tilde{x}_1). \end{cases} \tag{3.17} \]
In the system $x_0 = X_0(w, y_0)$,

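$y_0 = B_0(\tilde{y}_0, x_0)$

we eliminate $y_0$ and solve for $x_0$, getting
\[ x_0 = \widetilde{X}_0(w, \tilde{y}_0) \tag{3.18} \]
which is possible provided, for instance, that
\[ \|B_{0,x}\|_{C^0} < \|X_{0,y}\|_{C^0}^{-1} \tag{3.19} \]
and the partial derivatives of $\widetilde{X}_0$ are then given by
\[ \begin{aligned} \widetilde{X}_{0,w} &= X_{0,w}(1 - X_{0,y}B_{0,x})^{-1} \\ \widetilde{X}_{0,y} &= X_{0,y}B_{0,y}(1 - X_{0,y}B_{0,x})^{-1}. \end{aligned} \tag{3.20} \]
We recall that functions in these formulas must be taken at appropriate values, namely $\widetilde{X}_0$ at $(w, \tilde{y}_0)$, $B_0$ at $(\tilde{y}_0, \widetilde{X}_0(w, \tilde{y}_0))$ and $X_0$ at $(w, B_0(\tilde{y}_0, \widetilde{X}_0(w, \tilde{y}_0)))$.

Similarly, from the system
\[ y_1 = Y_1(w, x_1), \qquad x_1 = A_1(y_1, \tilde{x}_1) \]
we obtain
\[ y_1 = \widetilde{Y}_1(w, \tilde{x}_1) \tag{3.21} \]
provided that
\[ \|A_{1,y}\|_{C^0} < \|Y_{1,x}\|_{C^0}^{-1} \tag{3.22} \]
the partial derivatives of $\widetilde{Y}_1$ being given by
\[ \begin{aligned} \widetilde{Y}_{1,w} &= Y_{1,w}(1 - Y_{1,x}A_{1,y})^{-1} \\ \widetilde{Y}_{1,x} &= Y_{1,x}A_{1,x}(1 - Y_{1,x}A_{1,y})^{-1}. \end{aligned} \tag{3.23} \]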

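Let us then define
\[ C(w, \tilde{y}_0, \tilde{x}_1) = w^2 - \theta\bigl(B_0(\tilde{y}_0, \widetilde{X}_0(w, \tilde{y}_0)),\; A_1(\widetilde{Y}_1(w, \tilde{x}_1), \tilde{x}_1)\bigr). \tag{3.24} \]
After plugging (3.16)–(3.18) and (3.21) into (3.14), it now reads
\[ C(w, \tilde{y}_0, \tilde{x}_1) = 0. \tag{3.25} \]
The partial derivatives of $C$ are given by
\[ \begin{aligned} C_w &= 2w - (\theta_y B_{0,x}\widetilde{X}_{0,w} + \theta_x A_{1,y}\widetilde{Y}_{1,w}) \\ C_y &= -\theta_y(B_{0,x}\widetilde{X}_{0,y} + B_{0,y}) = -\theta_y B_{0,y}(1 - X_{0,y}B_{0,x})^{-1} \\ C_x &= -\theta_x(A_{1,y}\widetilde{Y}_{1,x} + A_{1,x}) = -\theta_x A_{1,x}(1 - Y_{1,x}A_{1,y})^{-1} \end{aligned} \tag{3.26} \]
and we also have
\[ \begin{aligned} 2 - C_{ww} = {} & \theta_{yy}B_{0,x}^2\widetilde{X}_{0,w}^2 + 2\theta_{xy}B_{0,x}A_{1,y}\widetilde{X}_{0,w}\widetilde{Y}_{1,w} + \theta_{xx}A_{1,y}^2\widetilde{Y}_{1,w}^2 \\ & + \theta_y B_{0,xx}\widetilde{X}_{0,w}^2 + \theta_x A_{1,yy}\widetilde{Y}_{1,w}^2 + \theta_y B_{0,x}\widetilde{X}_{0,ww} + \theta_x A_{1,y}\widetilde{Y}_{1,ww}. \end{aligned} \tag{3.27} \]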


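We replace (3.19), (3.22) by the stronger hypothesis
\[ \|B_{0,x}\|_{C^0} \ll 1, \quad \|A_{1,y}\|_{C^0} \ll 1, \quad \|B_{0,xx}\|_{C^0} \ll 1, \quad \|A_{1,yy}\|_{C^0} \ll 1 \tag{3.28} \]
which guarantees, according to (3.26) and (3.27), that
\[ |C_{ww} - 2| \ll 1 \tag{3.29} \]
\[ |C_w - 2w| \ll 1. \tag{3.30} \]
The relations (3.29), (3.30) show that for fixed $\tilde{y}_0$, $\tilde{x}_1$ the function $w \mapsto C(w, \tilde{y}_0, \tilde{x}_1)$ has a unique minimum, attained at a point close to zero. Denote by $\bar{C}(\tilde{y}_0, \tilde{x}_1)$ the minimal value. The condition to solve (3.25) is then
\[ \bar{C}(\tilde{y}_0, \tilde{x}_1) < 0 \tag{3.31} \]
in which case we get two solutions for (3.25), namely
\[ w = W^\pm(\tilde{y}_0, \tilde{x}_1). \tag{3.32} \]
We now plug (3.32) into (3.16), (3.17) to get
\[ \tilde{x}_0 = A_0\bigl(\tilde{y}_0, \widetilde{X}_0(W^\pm(\tilde{y}_0, \tilde{x}_1), \tilde{y}_0)\bigr) =: A^\pm(\tilde{y}_0, \tilde{x}_1) \tag{3.33} \]
\[ \tilde{y}_1 = B_1\bigl(\widetilde{Y}_1(W^\pm(\tilde{y}_0, \tilde{x}_1), \tilde{x}_1), \tilde{x}_1\bigr) =: B^\pm(\tilde{y}_0, \tilde{x}_1). \tag{3.34} \]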

3.3.4 Geometrical interpretation of $\bar{C}$

Consider fixed values $\bar{y}_0$, $\bar{x}_1$ for $\tilde{y}_0$, $\tilde{x}_1$ and the curves $\gamma_0 = G \circ F_0(\{\tilde{y}_0 = \bar{y}_0\})$, $\gamma_1 = F_1^{-1}(\{\tilde{x}_1 = \bar{x}_1\})$; the curve $\gamma_0$ is ‘parabolic-like’ while $\gamma_1$ is ‘vertical-like’.

If $\bar{C}(\bar{y}_0, \bar{x}_1) > 0$ then the curves $\gamma_0$, $\gamma_1$ do not intersect and $\bar{C}(\bar{y}_0, \bar{x}_1)$ can be viewed as the distance between $\gamma_0$, $\gamma_1$.

If $\bar{C}(\bar{y}_0, \bar{x}_1) = 0$ then the curves $\gamma_0$, $\gamma_1$ intersect at a single point where they are tangent.

If $\bar{C}(\bar{y}_0, \bar{x}_1) < 0$ then the two curves $\gamma_0$, $\gamma_1$ intersect at two points where the intersection is transversal. The value $|\bar{C}(\bar{y}_0, \bar{x}_1)|$ can be viewed as the maximal horizontal distance between the parts of $\gamma_0$, $\gamma_1$ between their intersection points.

In the following, we will assume that $\bar{C}(\tilde{y}_0, \tilde{x}_1) < 0$ everywhere on $\widetilde{I}_0^u \times \widetilde{I}_1^s$ and we define
\[ \delta = \min_{\widetilde{I}_0^u \times \widetilde{I}_1^s} |\bar{C}(\tilde{y}_0, \tilde{x}_1)|. \]

Remark 3.6. By (3.26), we see that $C_x$, $C_y$ do not vanish; therefore, the same is true for the partial derivatives $\bar{C}_x$, $\bar{C}_y$. We conclude that the minimal value $\delta$ is attained at one of the corners of $\widetilde{I}_0^u \times \widetilde{I}_1^s$, a fact which is clear in the geometrical interpretation above.

Under the hypothesis just mentioned, the maps $A^\pm$, $B^\pm$ are defined on all of $\widetilde{I}_0^u \times \widetilde{I}_1^s$. This means that the domain of $F_1 \circ G \circ F_0$ has two components, which are vertical strips in $\widetilde{R}_0$, and that the restriction of $F_1 \circ G \circ F_0$ to each of these strips is affine-like, the images being horizontal strips in $\widetilde{R}_1$. We denote by $F^+$ (resp. $F^-$) the restriction which is implicitly defined by $A^+$, $B^+$ (resp. $A^-$, $B^-$).

In the following, we take $\varepsilon \in \{+, -\}$. We will now estimate the contraction rate $B_y^\varepsilon$, the expansion rate $(A_x^\varepsilon)^{-1}$, and the distortion of $F^\varepsilon$.
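The maps of Example 3.5 give a concrete instance of this construction. In that example, inserting the explicit $\theta$, $B_0$ and $A_1$ into (3.24) gives $C(w, \tilde{y}_0, \tilde{x}_1) = w^2 - s$ with $s = \lambda_0^{-1}\tilde{y}_0 + \lambda_1^{-1}\tilde{x}_1 + c_0 + c_1$ (this short computation is ours), so that $\bar{C} = -s$. The sketch below, with parameter values chosen by us for illustration, checks that the two implicit branches invert the composition and that the minimum of $|\bar{C}|$ over a grid is attained at a corner.

```python
import itertools
import math

# Parameter values are our own choice; they satisfy the assumptions of
# Example 3.5 and keep all intermediate points inside [-1, 1]^2.
LAM0, LAM1 = 8.0, 8.0
MU0, MU1 = 2.0, 2.0
C0, C1 = 0.3, 0.3


def f0(xt0, yt0):
    return MU0 * xt0, yt0 / LAM0 + C0


def g(x0, y0):
    return x0 * x0 - y0, x0


def f1(x1, y1):
    return LAM1 * (x1 - C1), y1 / MU1


def composed(xt0, yt0):
    """F = F1 o G o F0."""
    return f1(*g(*f0(xt0, yt0)))


def s(yt0, xt1):
    """Quantity under the square root; here C(w, yt0, xt1) = w**2 - s."""
    return yt0 / LAM0 + xt1 / LAM1 + C0 + C1


def branch(eps, yt0, xt1):
    """The pair (A^eps, B^eps) of Example 3.5 for eps = +1 or -1."""
    root = eps * math.sqrt(s(yt0, xt1))
    return root / MU0, root / MU1


def roundtrip_error(eps, yt0, xt1):
    """Check F(A^eps(yt0, xt1), yt0) = (xt1, B^eps(yt0, xt1))."""
    a, b = branch(eps, yt0, xt1)
    xt1_back, yt1 = composed(a, yt0)
    return max(abs(xt1_back - xt1), abs(yt1 - b))


def delta_on(points):
    """Minimum of |C-bar| = s over the given (yt0, xt1) pairs."""
    return min(s(y, x) for y, x in points)


GRID = [-1 + 2 * i / 20 for i in range(21)]
CORNERS = list(itertools.product((-1.0, 1.0), repeat=2))
```

Comparing `delta_on(CORNERS)` with `delta_on` evaluated over all pairs from `GRID` illustrates Remark 3.6 for this example; with positive $\lambda_0$, $\lambda_1$ the minimum sits at the corner $(-1, -1)$.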

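3.3.5 Partial derivatives

From (3.25), we have
\[ W_x^\varepsilon = -C_w^{-1}C_x, \qquad W_y^\varepsilon = -C_w^{-1}C_y \tag{3.35} \]
and then, from (3.33), (3.34),
\[ \begin{aligned} A_x^\varepsilon &= A_{0,x}\widetilde{X}_{0,w}W_x^\varepsilon = -A_{0,x}\widetilde{X}_{0,w}C_w^{-1}C_x \\ B_y^\varepsilon &= -B_{1,y}\widetilde{Y}_{1,w}C_w^{-1}C_y \end{aligned} \tag{3.36} \]
and
\[ \begin{aligned} A_y^\varepsilon &= A_{0,y} + A_{0,x}\widetilde{X}_{0,y} - A_{0,x}\widetilde{X}_{0,w}C_w^{-1}C_y \\ B_x^\varepsilon &= B_{1,x} + B_{1,y}\widetilde{Y}_{1,x} - B_{1,y}\widetilde{Y}_{1,w}C_w^{-1}C_x. \end{aligned} \tag{3.37} \]
Using (3.20), (3.23) and (3.26), we get from (3.36)
\[ \begin{aligned} A_x^\varepsilon &= A_{0,x}X_{0,w}C_w^{-1}\theta_x A_{1,x}(1 - X_{0,y}B_{0,x})^{-1}(1 - Y_{1,x}A_{1,y})^{-1} \\ B_y^\varepsilon &= B_{1,y}Y_{1,w}C_w^{-1}\theta_y B_{0,y}(1 - Y_{1,x}A_{1,y})^{-1}(1 - X_{0,y}B_{0,x})^{-1}. \end{aligned} \tag{3.38} \]
Let us assume that, for some $\tilde{k}$, we have
\[ |X_{0,y}| \le \tilde{k}, \quad |Y_{1,x}| \le \tilde{k}, \quad \tilde{k}^{-1} \le |X_{0,w}| \le \tilde{k}, \quad \tilde{k}^{-1} \le |Y_{1,w}| \le \tilde{k} \tag{3.39} \]
and let us write in what follows $k$ for various constants which only depend on $\tilde{k}$, provided that the constant, implied in (3.28) as a common bound for the norms, is small enough. We also assume that
\[ \tilde{k}^{-1} \le |\theta_x| \le \tilde{k}, \qquad \tilde{k}^{-1} \le |\theta_y| \le \tilde{k}. \tag{3.40} \]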


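Then, from (3.38), we see that
\[ \begin{aligned} k^{-1}|A_{0,x}||A_{1,x}||C_w|^{-1} &\le |A_x^\varepsilon| \le k|A_{0,x}||A_{1,x}||C_w|^{-1} \\ k^{-1}|B_{0,y}||B_{1,y}||C_w|^{-1} &\le |B_y^\varepsilon| \le k|B_{0,y}||B_{1,y}||C_w|^{-1}. \end{aligned} \tag{3.41} \]
We will assume that
\[ \|A_{1,x}\|_{C^0} + \|B_{0,y}\|_{C^0} \ll \delta \tag{3.42} \]
which implies by (3.26) that
\[ \|C_x\|_{C^0} + \|C_y\|_{C^0} \ll \delta \tag{3.43} \]
and guarantees, therefore, that the relative variation of $\bar{C}$ through $\widetilde{I}_0^u \times \widetilde{I}_1^s$ is very small. But then, we have
\[ k^{-1}\delta^{1/2} \le |C_w| \le k\delta^{1/2} \tag{3.44} \]
(at $(W^\varepsilon(\tilde{y}_0, \tilde{x}_1), \tilde{y}_0, \tilde{x}_1)$, which is the argument of $C_w$ in (3.35) and the subsequent relations). We finally get the estimates
\[ \begin{aligned} k^{-1}\delta^{-1/2}|A_{0,x}||A_{1,x}| &\le |A_x^\varepsilon| \le k\delta^{-1/2}|A_{0,x}||A_{1,x}| \\ k^{-1}\delta^{-1/2}|B_{0,y}||B_{1,y}| &\le |B_y^\varepsilon| \le k\delta^{-1/2}|B_{0,y}||B_{1,y}|. \end{aligned} \tag{3.45} \]
Next, we look in the formulas (3.37) for estimates of the remainders. From
\[ |\widetilde{X}_{0,y}| \le k|B_{0,y}|, \quad |\widetilde{Y}_{1,x}| \le k|A_{1,x}|, \quad k^{-1} \le |\widetilde{X}_{0,w}| \le k, \quad k^{-1} \le |\widetilde{Y}_{1,w}| \le k \tag{3.46} \]
which is clear from (3.20) and (3.23), we obtain
\[ \begin{aligned} |A_{0,x}\widetilde{X}_{0,y}| &\le k|A_{0,x}||B_{0,y}| \\ |B_{1,y}\widetilde{Y}_{1,x}| &\le k|A_{1,x}||B_{1,y}|. \end{aligned} \tag{3.47} \]
On the other hand, from (3.26), we have
\[ \begin{aligned} k^{-1}|A_{1,x}| &\le |C_x| \le k|A_{1,x}| \\ k^{-1}|B_{0,y}| &\le |C_y| \le k|B_{0,y}| \end{aligned} \tag{3.48} \]
which gives, together with (3.44), (3.46),
\[ \begin{aligned} |A_{0,x}\widetilde{X}_{0,w}C_w^{-1}C_y| &\le k|A_{0,x}||B_{0,y}|\delta^{-1/2} \\ |B_{1,y}\widetilde{Y}_{1,w}C_w^{-1}C_x| &\le k|B_{1,y}||A_{1,x}|\delta^{-1/2}. \end{aligned} \tag{3.49} \]
Combining (3.47) and (3.49) we obtain
\[ \begin{aligned} |A_y^\varepsilon - A_{0,y}| &\le k|A_{0,x}||B_{0,y}|(1 + \delta^{-1/2}) \\ |B_x^\varepsilon - B_{1,x}| &\le k|A_{1,x}||B_{1,y}|(1 + \delta^{-1/2}). \end{aligned} \tag{3.50} \]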
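The scaling in (3.45) can be observed on the explicit maps of Example 3.5. The sketch below is an illustration only: the parameter values are our own choice, the expression used for $\delta$ relies on our computation $\bar{C} = -(\lambda_0^{-1}\tilde{y}_0 + \lambda_1^{-1}\tilde{x}_1 + c_0 + c_1)$ for that example with positive $\lambda_0$, $\lambda_1$, and the resulting constants say nothing about the general $k$. The code differentiates $A^+$ numerically and compares $|A_x^+|$ with $\delta^{-1/2}|A_{0,x}||A_{1,x}|$.

```python
import math

# Parameters of Example 3.5 (values chosen by us for illustration).
LAM0, LAM1 = 8.0, 8.0
MU0 = 2.0
C0, C1 = 0.3, 0.3

# For these maps |C-bar| is smallest at the corner (-1, -1).
DELTA = C0 + C1 - 1 / LAM0 - 1 / LAM1


def a_plus(yt0, xt1):
    """The implicit function A^+ of Example 3.5."""
    return math.sqrt(yt0 / LAM0 + xt1 / LAM1 + C0 + C1) / MU0


def ratio(yt0, xt1, h=1e-6):
    """|A^+_x| divided by delta^{-1/2} |A_{0,x}| |A_{1,x}|."""
    a_eps_x = (a_plus(yt0, xt1 + h) - a_plus(yt0, xt1 - h)) / (2 * h)
    a0_x = 1 / MU0    # from x~0 = A0(y~0, x0) = x0 / mu0
    a1_x = 1 / LAM1   # from x1 = A1(y1, x~1) = c1 + x~1 / lambda1
    return abs(a_eps_x) / (DELTA ** -0.5 * a0_x * a1_x)


def ratios_on_grid(n=11):
    pts = [-1 + 2 * i / (n - 1) for i in range(n)]
    return [ratio(y, x) for y in pts for x in pts]
```

In this example the ratio stays between two fixed positive constants over the whole square, which is the content of (3.45) in this very special case.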


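3.3.6 Formulas for the second-order derivatives of $A^{\varepsilon}$ and $B^{\varepsilon}$

To get $\partial_x \log|A^{\varepsilon}_x|$, $\partial_y \log|A^{\varepsilon}_x|$, we take logarithmic derivatives in (3.36) to obtain
\begin{align}
\partial_x \log|A^{\varepsilon}_x| ={}& \partial_x(\log|A_{0,x}|)\,\widetilde X_{0,w}W^{\varepsilon}_x + \partial_w(\log|\widetilde X_{0,w}|)\,W^{\varepsilon}_x \notag\\
&+ \partial_w(\log|C_x|)\,W^{\varepsilon}_x + \partial_x \log|C_x| \notag\\
&- \partial_w(\log|C_w|)\,W^{\varepsilon}_x - \partial_x \log|C_w| \tag{3.51}
\end{align}
and
\begin{align}
\partial_y \log|A^{\varepsilon}_x| ={}& \partial_y(\log|A_{0,x}|) + \partial_x(\log|A_{0,x}|)(\widetilde X_{0,y} + \widetilde X_{0,w}W^{\varepsilon}_y) \notag\\
&+ \partial_w(\log|\widetilde X_{0,w}|)\,W^{\varepsilon}_y + \partial_y \log|\widetilde X_{0,w}| \notag\\
&+ \partial_w(\log|C_x|)\,W^{\varepsilon}_y + \partial_y \log|C_x| \notag\\
&- \partial_w(\log|C_w|)\,W^{\varepsilon}_y - \partial_y \log|C_w|. \tag{3.52}
\end{align}
We take derivatives in (3.37) to get
\begin{align}
A^{\varepsilon}_{yy} ={}& A_{0,yy} + \partial_y(\log|A_{0,x}|)\,A_{0,x}(\widetilde X_{0,y} + \widetilde X_{0,w}W^{\varepsilon}_y) \notag\\
&+ \widetilde X_{0,y}A_{0,x}\bigl(\partial_y \log|A_{0,x}| + \partial_x \log|A_{0,x}|(\widetilde X_{0,y} + \widetilde X_{0,w}W^{\varepsilon}_y)\bigr) \notag\\
&+ A_{0,x}(\widetilde X_{0,yy} + \widetilde X_{0,yw}W^{\varepsilon}_y) \notag\\
&- A_{0,x}\widetilde X_{0,w}C_w^{-1}C_y\bigl(\partial_y \log|A_{0,x}| + \partial_x \log|A_{0,x}|(\widetilde X_{0,y} + \widetilde X_{0,w}W^{\varepsilon}_y)\bigr) \notag\\
&- A_{0,x}C_w^{-1}C_y(\widetilde X_{0,yw} + \widetilde X_{0,ww}W^{\varepsilon}_y) \notag\\
&- A_{0,x}\widetilde X_{0,w}C_w^{-1}(C_{yy} + C_{wy}W^{\varepsilon}_y) \notag\\
&+ A_{0,x}\widetilde X_{0,w}C_yC_w^{-2}(C_{wy} + C_{ww}W^{\varepsilon}_y). \tag{3.53}
\end{align}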

There is a similar formula for $B^{\varepsilon}_{xx}$. For $\partial_y \log|B^{\varepsilon}_y|$, $\partial_x \log|B^{\varepsilon}_y|$ we also get formulas symmetrical to (3.51), (3.52) above.

3.3.7 Estimates for the distortion of $F^{\varepsilon}$

We now proceed, assuming (3.28), (3.39), (3.40) and (3.42), to estimate the terms in (3.51)–(3.53). We also assume that the partial derivatives of order two of $X_0$, $Y_1$, $\theta$ are bounded by $\tilde k$, and we keep the same convention for constants $k$. From (3.35), (3.44), (3.48), we have
\begin{align}
k^{-1}|A_{1,x}|\delta^{-1/2} &\le |W^{\varepsilon}_x| \le k|A_{1,x}|\delta^{-1/2} \tag{3.54}\\
k^{-1}|B_{0,y}|\delta^{-1/2} &\le |W^{\varepsilon}_y| \le k|B_{0,y}|\delta^{-1/2}. \tag{3.55}
\end{align}
Combining (3.54) with (3.46), the first term in (3.51) is bounded by
\[ |\partial_x(\log|A_{0,x}|)\,\widetilde X_{0,w}W^{\varepsilon}_x| \le k\,|\partial_x \log|A_{0,x}||\,|A_{1,x}|\,\delta^{-1/2}. \tag{3.56} \]


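Next we take logarithmic derivatives in the first relation of (3.20) to get
\begin{align}
\partial_w \log|\widetilde X_{0,w}| ={}& \partial_w \log|X_{0,w}| + \partial_y(\log|X_{0,w}|)\,B_{0,x}\widetilde X_{0,w} \notag\\
&+ (1 - B_{0,x}X_{0,y})^{-1}\bigl(X_{0,y}B_{0,xx}\widetilde X_{0,w} + B_{0,x}X_{0,yw} + X_{0,yy}B_{0,x}^2\widetilde X_{0,w}\bigr) \tag{3.57}\\
\partial_y \log|\widetilde X_{0,w}| ={}& \partial_y(\log|X_{0,w}|)(B_{0,y} + B_{0,x}\widetilde X_{0,y}) \notag\\
&+ (1 - B_{0,x}X_{0,y})^{-1}\bigl(X_{0,y}B_{0,xy} + X_{0,y}B_{0,xx}\widetilde X_{0,y} + B_{0,x}X_{0,yy}(B_{0,y} + B_{0,x}\widetilde X_{0,y})\bigr). \tag{3.58}
\end{align}
From (3.28), (3.46) and the hypothesis on $X_0$, we get
\begin{align}
|\partial_w \log|\widetilde X_{0,w}|| &\le k \tag{3.59}\\
|\partial_y \log|\widetilde X_{0,w}|| &\le k|B_{0,y}|(1 + |\partial_x \log|B_{0,y}||). \tag{3.60}
\end{align}
We also need estimates for the second-order partial derivatives of $C$. First, from (3.29), (3.44), one has
\[ |\partial_w \log|C_w|| \le k\delta^{-1/2}. \tag{3.61} \]
Then, taking logarithmic derivatives in the last formula of (3.26), we get
\begin{align}
\partial_w \log|C_x| ={}& \partial_y(\log|\theta_x|)\,B_{0,x}\widetilde X_{0,w} + \partial_x(\log|\theta_x|)\,A_{1,y}\widetilde Y_{1,w} + \partial_y(\log|A_{1,x}|)\,\widetilde Y_{1,w} \notag\\
&+ (1 - Y_{1,x}A_{1,y})^{-1}\bigl(Y_{1,x}A_{1,yy}\widetilde Y_{1,w} + Y_{1,xw}A_{1,y} + Y_{1,xx}A_{1,y}^2\widetilde Y_{1,w}\bigr) \tag{3.62}\\
\partial_y \log|C_x| ={}& \partial_y \log|\theta_x|\,(B_{0,y} + B_{0,x}\widetilde X_{0,y}) \tag{3.63}\\
\partial_x \log|C_x| ={}& \partial_x(\log|\theta_x|)(A_{1,y}\widetilde Y_{1,x} + A_{1,x}) + \partial_x \log|A_{1,x}| + \partial_y(\log|A_{1,x}|)\,\widetilde Y_{1,x} \notag\\
&+ (1 - Y_{1,x}A_{1,y})^{-1}\bigl\{Y_{1,x}A_{1,x}\,\partial_y \log|A_{1,x}| + Y_{1,x}A_{1,yy}\widetilde Y_{1,x} + A_{1,y}Y_{1,xx}(A_{1,y}\widetilde Y_{1,x} + A_{1,x})\bigr\}. \tag{3.64}
\end{align}
According to (3.28), (3.46), one obtains
\begin{align}
|\partial_w \log|C_x|| &\le k(1 + |\partial_y \log|A_{1,x}||) \tag{3.65}\\
|\partial_y \log|C_x|| &\le k|B_{0,y}| \tag{3.66}\\
|\partial_x \log|C_x|| &\le |\partial_x \log|A_{1,x}|| + k|A_{1,x}|(1 + |\partial_y \log|A_{1,x}||). \tag{3.67}
\end{align}
From (3.65), (3.44), (3.48), one has also
\[ |\partial_x \log|C_w|| \le k|A_{1,x}|\delta^{-1/2}(1 + |\partial_y \log|A_{1,x}||). \tag{3.68} \]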


We can now estimate $\partial_x \log|A^{\varepsilon}_x|$ in (3.51) from (3.54), (3.56), (3.59), (3.61), (3.65), (3.67), (3.68) as
\[ |\partial_x \log|A^{\varepsilon}_x|| \le |\partial_x \log|A_{1,x}|| + k|A_{1,x}|\delta^{-1} + k|A_{1,x}|\delta^{-1/2}\bigl(|\partial_x \log|A_{0,x}|| + |\partial_y \log|A_{1,x}||\bigr). \tag{3.69} \]
The estimate symmetric to (3.68) is
\[ |\partial_y \log|C_w|| \le k|B_{0,y}|\delta^{-1/2}(1 + |\partial_x \log|B_{0,y}||). \tag{3.70} \]
Then, we estimate $\partial_y \log|A^{\varepsilon}_x|$ in (3.52) from (3.46), (3.55), (3.59), (3.60), (3.61), (3.65), (3.66) and (3.70) as
\[ |\partial_y \log|A^{\varepsilon}_x|| \le |\partial_y \log|A_{0,x}|| + k|B_{0,y}|\delta^{-1} + k|B_{0,y}|\delta^{-1/2}\bigl(|\partial_x \log|A_{0,x}|| + |\partial_y \log|A_{1,x}|| + |\partial_x \log|B_{0,y}||\bigr). \tag{3.71} \]
Before estimating $A^{\varepsilon}_{yy}$ from (3.53), we first bound $\widetilde X_{0,yy}$ by taking derivatives in the second formula of (3.20) as
\begin{align}
\widetilde X_{0,yy} ={}& (1 - X_{0,y}B_{0,x})^{-1}\bigl\{X_{0,y}B_{0,yy} + X_{0,y}B_{0,yx}\widetilde X_{0,y} + B_{0,y}X_{0,yy}(B_{0,y} + B_{0,x}\widetilde X_{0,y})\bigr\} \notag\\
&+ (1 - X_{0,y}B_{0,x})^{-2}X_{0,y}B_{0,y}\bigl\{X_{0,y}B_{0,xy} + X_{0,y}B_{0,xx}\widetilde X_{0,y} + B_{0,x}X_{0,yy}(B_{0,y} + B_{0,x}\widetilde X_{0,y})\bigr\} \tag{3.72}
\end{align}
which gives
\[ |\widetilde X_{0,yy}| \le k\bigl(|B_{0,y}||\partial_y \log|B_{0,y}|| + |B_{0,y}|^2(1 + |\partial_x \log|B_{0,y}||)\bigr). \tag{3.73} \]
Estimating now the many terms in (3.53), we finally get
\begin{align}
|A^{\varepsilon}_{yy}| \le{}& |A_{0,yy}| + k|A_{0,x}||B_{0,y}|\delta^{-1/2}\bigl(|\partial_y \log|A_{0,x}|| + |\partial_y \log|B_{0,y}||\bigr) \notag\\
&+ k|A_{0,x}||B_{0,y}|^2\delta^{-1}\bigl(\delta^{-1/2} + |\partial_x \log|A_{0,x}|| + |\partial_x \log|B_{0,y}||\bigr). \tag{3.74}
\end{align}
The partial derivatives $\partial_y \log|B^{\varepsilon}_y|$, $\partial_x \log|B^{\varepsilon}_y|$, $B^{\varepsilon}_{xx}$ are estimated by relations which are symmetric to (3.69), (3.71), (3.74). Recalling (3.42), we have obtained the following.

Theorem 3.7. Assume that, for some $\tilde k > 1$, all functions $|X_{0,y}|$, $|Y_{1,x}|$, $|X_{0,w}|^{\pm 1}$, $|Y_{1,w}|^{\pm 1}$, $|\theta_x|^{\pm 1}$, $|\theta_y|^{\pm 1}$, and the partial derivatives of order two of $X_0$, $Y_1$, $\theta$ are bounded by $\tilde k$. Then, there exists $\beta = \beta(\tilde k)$ such that, if the functions
\[ |A_{1,y}|,\ |A_{1,yy}|,\ |B_{0,x}|,\ |B_{0,xx}|,\ \delta^{-1}|A_{1,x}|,\ \delta^{-1}|B_{0,y}| \]
are all bounded by $\beta$, then the distortion of both branches $F^{\pm}$ of the composition $F_1 \circ G \circ F_0$ satisfies
\begin{align*}
D(F^{\pm}) \le \max\bigl\{& D(F_0) + k\|B_{0,y}\|_{C^0}\delta^{-1}\bigl(1 + \delta^{1/2}\max(D(F_0), D(F_1))\bigr), \\
& D(F_1) + k\|A_{1,x}\|_{C^0}\delta^{-1}\bigl(1 + \delta^{1/2}\max(D(F_0), D(F_1))\bigr)\bigr\}.
\end{align*}

Proof. The result is straightforward from (3.69), (3.71), (3.74) and the symmetric relations.

Remark 3.8. It may look surprising that we need a strong bound for $|A_{1,y}|$, $|A_{1,yy}|$, . . . in order to get a nice estimate for $D(F)$. But actually it is not difficult to show that such a hypothesis is necessary (even to get our assertions on the domain of $F$), and it is a rather simple computation to prove that these bounds are valid in the homoclinic bifurcation context, at least when one is close enough to the bifurcation.

References

[1] Newhouse S and Palis J 1976 Cycles and bifurcations theory Asterisque 31 44–140
[2] Palis J and Takens F 1985 Cycles and measure of bifurcation sets for two-dimensional diffeomorphisms Inv. Math. 82 397–422
[3] Palis J and Takens F 1993 Hyperbolicity & Sensitive Chaotic Dynamics at Homoclinic Bifurcations (Cambridge: Cambridge University Press)
[4] Palis J and Yoccoz J-C 1994 Homoclinic tangencies for hyperbolic sets of large Hausdorff dimension Acta Math. 172 91–136
[5] Palis J and Yoccoz J-C 2001 Nonuniformly hyperbolic horseshoes unleashed by homoclinic bifurcations and zero density of attractors C. R. Acad. Sci., Paris to appear
[6] Ruelle D and Takens F 1971 On the nature of turbulence Commun. Math. Phys. 20 167–92
    Ruelle D and Takens F 1971 On the nature of turbulence Commun. Math. Phys. 23 343–4


Chapter 4

Strong resonances and Takens’s Utrecht preprint

Bernd Krauskopf
University of Bristol

The influential paper Forced oscillations and bifurcations [53] by Floris Takens, reprinted in this book as chapter 1, deals with the now classic p : q resonance problem: What happens when a periodic orbit of a vector field loses its stability by a pair of complex conjugate Floquet multipliers crossing the unit circle at a root of unity $e^{\pm 2\pi i p/q}$? By means of a normal form theorem Takens reduced the problem to the study of a generic Zq-equivariant planar vector field that approximates the qth iterate of the Poincaré map in a suitable cross section. He gave all normal forms for Zq-equivariant planar vector fields. For q ≥ 5, the case of so-called weak resonance, he proved that an invariant circle bifurcates and that there is a resonance tongue in which the dynamics is locked to a q-periodic attracting orbit on the invariant circle. The cases q ≤ 4 are called the strong resonances, and the dynamics cannot be reduced to an invariant circle. Takens presented unfoldings for all cases of strong resonance, except q = 4. Furthermore, for q ≤ 3 he in part proved and in part conjectured that these unfoldings are versal. (This is indeed now proved; see section 4.4.1.) His main tool was to write the normal form as a perturbation of a Hamiltonian vector field by means of a blow-up procedure.

In this paper we give some history of strong resonances, where main contributions have been made by V I Arnol’d and his school; see [4] and references therein. Best known is perhaps the case of the vector field normal form for q = 1, which was completely analysed independently by Arnol’d’s student R I Bogdanov [14, 15]; hence the name Bogdanov–Takens bifurcation. The main emphasis here is on the only open case of 1 : 4 resonance and the associated Z4-equivariant normal form, which Takens did not study in [53].


We give a concise description of the bifurcation set and all unfoldings for this normal form. Furthermore, we present new evidence for the correctness of the bifurcation set in the form of numerically computed surfaces of heteroclinic bifurcations.

Even though today the strong resonances are among the ‘classic’ bifurcation problems that can be found in recent textbooks, such as [30, 43], they recently received renewed interest. For example, LeBlanc [44] found the strong resonances for q = 2, 3, 4 near Hopf–Hopf interactions in 1 : q resonance, Broer and Golubitsky [17] studied the geometry of resonance tongues with a singularity theory approach, and Wagener [54] considered strong resonances in the quasiperiodically forced case. Guckenheimer and Malo [33, 45] used the case of 1 : 4 resonance to test their algorithm to obtain computer proofs for planar vector fields, and they proved the existence of two concentric periodic orbits. Needless to say, in terms of applications, strong resonances can be found in virtually any system with two independent frequencies, for example, in any forced oscillator.

This chapter is organized as follows. First we introduce the setting and some notation in section 4.1. The crucial normal form theorem is recalled in section 4.2. We then discuss weak resonance and the strong resonances for q ≤ 3 in their historic context in section 4.3 and section 4.4.1, respectively. In section 4.5 we deal with the case q = 4, where we present the bifurcation set of the normal form in section 4.5.1 and numerically computed surfaces of heteroclinic bifurcations in section 4.5.2. Finally, section 4.6 contains a short discussion of how the normal form dynamics are related to that of the original system.

Some personal words

My involvement with strong resonances and, in fact, with bifurcation theory in general, began with my PhD work on the 1 : 4 resonance problem under the supervision of Floris Takens and Henk Broer. V I Arnol’d had suggested this problem to Henk during his visit to Groningen in 1990 with the words: ‘Do you have a student for this?’ (Floris was at IMPA at the time). His original sketch drawing of the A-plane (see section 4.5) is reproduced in my thesis [38]. Indeed my first task was to understand Floris’s Utrecht preprint and Arnol’d’s work on strong resonances starting from [4].

It is my great pleasure to dedicate this chapter to Floris on his 60th birthday. I thank Floris and Henk for many valuable discussions and for their encouragement throughout my career. Special thanks also go to Alexander Khibnik, in particular for his help in digging into the Russian literature, to Christiane Rousseau for our pleasant collaboration on unfoldings ‘arising at infinity’, to Sebius Doedel for all his help with AUTO/Homcont over the years, and to John Guckenheimer for many interesting and helpful discussions (about DsTool among other things).


4.1 Setting and notation

Consider a forced oscillator
\[ \ddot x = \tilde f(x, \dot x, t; \mu, \beta). \tag{4.1} \]
Here µ and β are two real parameters and we assume that the system is forced with the constant frequency one. We can rewrite (4.1) as a vector field in complex notation (by taking $z = x + i\dot x$) as
\begin{align}
\dot z &= f(z, t; \mu, \beta) \notag\\
\dot t &= 1 \tag{4.2}
\end{align}
where $f$ can be calculated from $\tilde f$. The phase space of (4.2) is then $\mathbb{C} \times \mathbb{R}/\mathbb{Z}$. By a suitable change of coordinates we may assume that $f(0, t; \mu, \beta) = 0$ for any µ and β, which means that $\Gamma = \{ f(0, t, \mu, \beta),\ t \in (0, 1] \}$ is the trivial periodic orbit induced by the forcing. In order to study the stability of Γ consider the Poincaré map $P_{\mu,\beta}$ defined by
\[ (P_{\mu,\beta}(z), 1) = \phi^1_{\mu,\beta}(z, 0) \]
where $\phi^t_{\mu,\beta}$ is the flow of the vector field (4.2). In other words, $P_{\mu,\beta}$ is simply the stroboscopic map of the forcing period (which was set to one). By construction the Floquet multipliers of Γ are the eigenvalues of the linearization $DP_{\mu,\beta}$ of $P_{\mu,\beta}$ around the origin $0 \in \mathbb{C}$, plus the trivial Floquet multiplier 1 (corresponding to the direction along the periodic orbit). To study the stability of Γ we need to study the stability of 0 under $P_{\mu,\beta}$. Recall that the origin 0 is stable if all eigenvalues of $DP_{\mu,\beta}$ are strictly inside the unit circle. When only one parameter is changed there are three generic ways of 0 becoming unstable:

• one eigenvalue moves through +1, which leads to a saddle–node bifurcation of periodic orbits,
• one eigenvalue moves through −1, which leads to a period-doubling bifurcation, and
• a pair of complex conjugate eigenvalues moves through the unit circle, which is called critical damping in [53].
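The stroboscopic map and its multipliers can be explored numerically. The following is a minimal sketch under stated assumptions: the right-hand side `f_tilde` is a hypothetical parametrically forced oscillator chosen only so that x = 0 is a solution for all parameter values (it is not taken from the text), the integrator is a plain fourth-order Runge–Kutta scheme, and the linearization at the origin is approximated by central differences. For this example the multipliers at µ = 0 lie on the unit circle with an argument close to, but not exactly equal to, 2πβ.

```python
import cmath
import math


def f_tilde(x, v, t, mu, beta):
    """Hypothetical forced oscillator x'' = f_tilde(x, x', t; mu, beta).

    The forcing has period one and f_tilde(0, 0, t) = 0, so x = 0 is the
    trivial periodic orbit. mu < 0 acts as damping, mu > 0 as negative damping.
    """
    omega = 2.0 * math.pi * beta
    return mu * v - (omega ** 2 + 0.5 * math.cos(2.0 * math.pi * t)) * x - x ** 3


def poincare_map(x, v, mu, beta, steps=2000):
    """Stroboscopic map: integrate from t = 0 to t = 1 with classical RK4."""
    h = 1.0 / steps
    t = 0.0
    for _ in range(steps):
        k1x, k1v = v, f_tilde(x, v, t, mu, beta)
        k2x = v + 0.5 * h * k1v
        k2v = f_tilde(x + 0.5 * h * k1x, v + 0.5 * h * k1v, t + 0.5 * h, mu, beta)
        k3x = v + 0.5 * h * k2v
        k3v = f_tilde(x + 0.5 * h * k2x, v + 0.5 * h * k2v, t + 0.5 * h, mu, beta)
        k4x = v + h * k3v
        k4v = f_tilde(x + h * k3x, v + h * k3v, t + h, mu, beta)
        x += h * (k1x + 2.0 * k2x + 2.0 * k3x + k4x) / 6.0
        v += h * (k1v + 2.0 * k2v + 2.0 * k3v + k4v) / 6.0
        t += h
    return x, v


def multipliers(mu, beta, eps=1e-6):
    """Eigenvalues of the 2x2 linearization DP at the origin (central differences)."""
    xp, vp = poincare_map(eps, 0.0, mu, beta)
    xm, vm = poincare_map(-eps, 0.0, mu, beta)
    a, c = (xp - xm) / (2.0 * eps), (vp - vm) / (2.0 * eps)
    xp, vp = poincare_map(0.0, eps, mu, beta)
    xm, vm = poincare_map(0.0, -eps, mu, beta)
    b, d = (xp - xm) / (2.0 * eps), (vp - vm) / (2.0 * eps)
    trace, det = a + d, a * d - b * c
    root = cmath.sqrt(trace * trace - 4.0 * det)
    return (trace + root) / 2.0, (trace - root) / 2.0


if __name__ == "__main__":
    for mu in (-0.1, 0.0, 0.1):
        l1, l2 = multipliers(mu, 0.25)
        print(mu, abs(l1), abs(l2))
```

In this sketch the moduli of both multipliers pass through one as µ increases through zero, which is the third scenario in the list above.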

We are concerned with the third case of critical damping and now assume that, for µ = 0, $DP_{\mu,\beta}$ has eigenvalues $e^{\pm 2\pi i\beta}$. Note that the forced oscillator (4.2) is the simplest suspension of the planar diffeomorphism $P_{\mu,\beta}$. By way of center manifold techniques, the Poincaré map of any periodic orbit in $\mathbb{R}^n$ with exactly two complex conjugate Floquet multipliers crossing the unit circle will be an orientation preserving planar diffeomorphism of the form $P_{\mu,\beta}$. Clearly there are two possibilities: either β ∈ R \ Q or β = p/q ∈ Q, where p and q are relatively prime. Because of the denseness of Q in R one immediately sees that the study of critical damping is really a codimension-two problem in which both parameters µ and β are important. When β = p/q ∈ Q one speaks of a p : q resonance.

4.2 Zq-equivariant normal forms

The main tool for the study of p : q resonances, and strong resonances in particular, is the following theorem due to Arnol’d [2] and Takens [53].

Theorem 4.1. In a neighborhood of (z, µ, β) = (0, 0, p/q) the map $P_{\mu,\beta}$ can be approximated up to any prescribed order by the time-one map of a Zq-equivariant planar vector field $X_{\mu,\beta}$, composed with $J_{p/q} := DP_{0,p/q}(0)$, the rotation over 2πp/q. In short
\[ P_{\mu,\beta} = J_{p/q} \circ X^1_{\mu,\beta} + \text{flat terms in } (z, \mu, \beta - p/q). \tag{4.3} \]

About the proof. Takens’s proof of this theorem uses normal form computations in jet spaces in the presence of symmetry and is in [53]. For completeness we sketch here a proof based on the geometrical idea of following the p : q torus knots: averaging the suspension of $P_{\mu,\beta}$ in co-rotating coordinates; see [23] for more details, and see also [54]. This proof is similar to the proof by Arnol’d [4] where he averages in the Seifert foliation.

We change from the coordinates (z, t) in the suspension (4.2) of $P_{\mu,\beta}$ to co-rotating coordinates (ξ, t) by
\[ \Pi : (\xi, t) \mapsto (z, t) := (\xi e^{2\pi i t p/q}, t). \]
This gives the system
\begin{align}
\dot\xi &= \hat f_{\mu,p/q}(\xi, t) = e^{-2\pi i t p/q} f_{\mu,p/q}(\xi e^{2\pi i t p/q}, t) - (2\pi i p/q)\,\xi \notag\\
\dot t &= 1 \tag{4.4}
\end{align}
on $\mathbb{C} \times \mathbb{R}/q\mathbb{Z}$, which is invariant under the group of deck transformations
\[ (\xi, t) \mapsto (\xi e^{2\pi i p/q}, t - 1). \tag{4.5} \]
The phase space $\mathbb{C} \times \mathbb{R}/q\mathbb{Z}$ of (4.4) is a q-fold cover of the phase space $\mathbb{C} \times \mathbb{R}/\mathbb{Z}$ of (4.2) via the projection Π, called the Seifert (or Möbius) cover.

Note that $D\hat P_{0,p/q} = \mathrm{id}$, implying that $\hat\phi^1_{0,p/q}(\xi, 0) = (\hat P_{0,p/q}(\xi), 1)$ is a near identity transformation, where $\hat\phi_{\mu,\beta}$ denotes the flow of (4.4). It is a well known fact that in this situation, by averaging (4.4) over the period q, one gets the normal form for the flow
\[ \hat\phi^t_{\mu,\beta}(\xi, 0) = (X^t_{\mu,\beta}(\xi), t) + \text{flat terms in } (z, \mu, \beta - p/q). \]


Figure 4.1. The Seifert cover in the case of 1 : 2 resonance (the two open ends need to be identified).

The vector field $X_{\mu,\beta}$ is Zq-equivariant since (4.4) is invariant under the deck transformations (4.5). The result follows by lifting the flow $\phi_{\mu,\beta}$ of the suspension (4.2) to the cover. Keeping in mind that the covering projection Π satisfies Π(z, 0) = (z, 0), we get
\begin{align*}
(P_{\mu,\beta}(z), 1) &= \phi^1_{\mu,\beta}(z, 0) \\
&= \phi^1_{\mu,\beta}(\Pi(z, 0)) \\
&= \Pi(\hat\phi^1_{\mu,\beta}(z, 0)) \\
&= \Pi((X^1_{\mu,\beta}(z), 1)) + \text{flat terms} \\
&= (J_{p/q} \circ X^1_{\mu,\beta}(z), 1) + \text{flat terms}.
\end{align*}

The Seifert cover is illustrated with the example of 1 : 2 resonance in figure 4.1, which shows the suspension and the Seifert cover for a Poincaré map with the original orbit and a period-doubled orbit.


4.3 Weak resonance

Weak resonance for q ≥ 5 leads to the birth of an invariant circle, just like the case of irrational rotation numbers β ∈ R \ Q. The following and later theorems give the normal forms in complex notation where possible (as is done, e.g., by the Arnol’d school [4]).

Theorem 4.2. The normal form
\[ \dot z = \varepsilon z + A z|z|^2 + \bar z^{\,q-1} \tag{4.6} \]
is a versal model for Zq-equivariant planar vector fields for q ≥ 5.

The fact that (4.6) is a versal model means that it contains all dynamics at the level of the Zq-equivariant planar vector field approximation of the Poincaré map near a weak resonance [4]. Notice that the $S^1$-equivariant part $\varepsilon z + A z|z|^2$ of (4.6) is stronger near the origin than the Zq-equivariant part $\bar z^{\,q-1}$. The bifurcation diagram of (4.6) can be found in [53, pp 48–9]. It consists of a resonance tongue formed by two saddle–node bifurcations of periodic orbits that have contact of order (q − 2)/2. Inside the tongue there are q attracting equilibria and q saddle points on an invariant circle, while outside the tongue there is an attracting (for Re A < 0) periodic orbit. On the level of the Poincaré map this means that an invariant circle bifurcates, whose dynamics is locked to a q-periodic orbit for parameters inside the resonance tongue.

In conclusion, when µ changes through zero, generically a normally hyperbolic invariant circle is born in critical damping. Neimark studied the birth of an invariant circle for the first time [47]. It was pointed out by Sacker [51, 52] that Neimark had overlooked the important genericity condition β ≠ p/q for q = 1, 2, 3, 4. This bifurcation is called Neimark–Sacker bifurcation today, and the situation can be formulated as follows; see e.g. [43].

Theorem 4.3. For any generic one-parameter family of planar diffeomorphisms $P_{\mu,\beta}$ with eigenvalues $e^{\pm 2\pi i\beta}$ at µ = 0 where β ≠ p/q for q = 1, 2, 3, 4 there is a neighborhood of the fixed point 0 in which a unique normally hyperbolic invariant circle is born as µ goes through the bifurcation value µ = 0.

The above genericity condition is a non-resonance condition. The resonances for q ≥ 5 are called weak resonances, because they do not obstruct the birth of an invariant circle.

According to theorem 4.2 resonance tongues are attached to the line µ = 0 in the (µ, β)-plane at the weak resonance points β = ±p/q. The generic one-parameter family $P_{\mu,\beta}$ yields a curve in the (µ, β)-plane, which typically cuts infinitely many resonance tongues. This highlights again the fact that critical damping should be seen as a codimension-two phenomenon.
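The Zq-equivariance of the normal form (4.6) can be checked numerically. The sketch below is an illustration only: it evaluates the right-hand side of (4.6) and measures how far it is from commuting with the rotation over 2π/q. The function names are hypothetical, and the sample values of ε and A carry no special meaning.

```python
import cmath


def normal_form(z, eps, A, q):
    """Right-hand side of z' = eps*z + A*z*|z|^2 + conj(z)^(q-1), cf. (4.6)."""
    return eps * z + A * z * abs(z) ** 2 + z.conjugate() ** (q - 1)


def equivariance_defect(z, eps, A, q):
    """Return |f(R z) - R f(z)| for the rotation R = exp(2*pi*i/q).

    The defect vanishes (up to rounding) exactly when f commutes with R,
    which is what Zq-equivariance means for a planar vector field.
    """
    rot = cmath.exp(2j * cmath.pi / q)
    return abs(normal_form(rot * z, eps, A, q) - rot * normal_form(z, eps, A, q))


if __name__ == "__main__":
    z, eps, A = complex(0.3, -0.7), complex(0.1, 0.2), complex(-1.0, 0.5)
    for q in range(3, 9):
        print(q, equivariance_defect(z, eps, A, q))
```

The first two terms commute with every rotation (they form the $S^1$-equivariant part), while the term $\bar z^{\,q-1}$ commutes only with rotations over multiples of 2π/q.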


4.4 Strong resonances

The resonances for q ≤ 4 are called the strong resonances. At a strong resonance the dynamics cannot be reduced to the dynamics on a bifurcating invariant circle. The first sketches of codimension-two bifurcations near the strong resonances are due to Mel’nikov [46]. The strong resonances were also studied by Barsuk et al in [8], where for the cases q = 3 and q = 4 a number of phase portraits were given. For the case of q = 4 the local boundary curves are drawn, with the exception of the curve of Bogdanov–Takens points. In the spirit of Takens [53] and Arnol’d [2], we now consider the problem of finding the respective normal forms and their unfoldings for Zq-equivariant planar vector fields.

4.4.1 Strong resonances for q ≤ 3

The strong resonances for q ≤ 3 can be analysed by considering the respective normal forms as perturbations of Hamiltonian vector fields; see also [19]. These Zq-equivariant normal forms can be found in [53] in real notation. For completeness, we present them here.

Theorem 4.4. The following normal forms are versal models for a Zq-equivariant planar vector field for q ≤ 3.
\begin{align}
q = 1: &\quad (\dot x, \dot y) = (y,\ \eta_1 + \eta_2 x + x^2 + a x y) \tag{4.7}\\
q = 2: &\quad (\dot x, \dot y) = (y,\ \eta_1 x + \eta_2 y + a x^3 - x^2 y) \tag{4.8}\\
q = 3: &\quad \dot z = \varepsilon z + A z|z|^2 + \bar z^{2}. \tag{4.9}
\end{align}

Notice that for all these normal forms the Zq-equivariant part is stronger near the origin than the $S^1$-equivariant part. This is why there are always equilibria off the origin and no invariant circles. Notice that the normal form (4.9) is indeed of the same form as (4.6), but for technical reasons those for q ≤ 2 cannot be written in complex notation. The unfoldings for q ≤ 3 can be found in [53], where their versality was in part proved and in part conjectured. Let us be a bit more specific about the different cases.

The unfolding for q = 1 can be found in [53, pp 33–4]. It is in fact the well-known normal form of the Bogdanov–Takens bifurcation, or double-zero eigenvalue bifurcation; see e.g. [30, 43]. It was found and proved to be versal independently by Bogdanov in [14, 15], and this was first mentioned in [1]. Notice that the two cases a = ±1 can be transformed into each other if one allows a change in the direction of time.

There are two unfoldings for q = 2, one for a = +1 to be found in [53, p 38] and one for a = −1 to be found in [53, pp 42–3]. The proof of versality was completed by Carr in [24]. The unfoldings for q = 2 and q = 3 and the full proof of versality of the normal forms were independently obtained by Khorozov in [35]. For q = 3 there is again only one case, if one allows a change in the direction of time (which changes the sign of Re A ≠ 0).

The different versal unfoldings of the normal forms in theorem 4.4 can also be found in recent textbooks, for example, in [30, 43]. We remark that the paper [35] by Khorozov also contains a brief discussion of the case q = 4, giving the equations for all local non-degeneracy conditions and two bifurcation sequences. We turn to this case now.

4.5 Strong resonance for q = 4

The 1 : 4 resonance problem is the most complicated of the strong resonances. Concerning Z4-equivariant planar vector fields, instead of a theorem there is a classic conjecture due to Arnol’d [2].

Conjecture 4.5. A versal model of a Z4-equivariant planar vector field is given by
\[ \dot z = \varepsilon z + A z|z|^2 + \bar z^{3}. \tag{4.10} \]

Note that the two nonlinear terms in (4.10) are of the same order. Hence, their relative influence is determined only by the coefficient A. This accounts for much of the difficulty of this case. The case q = 4 marks a transition between strong and weak resonance. The normal form (4.10) (or rather its analogue in real notation) is derived but not studied in [53]; to quote Takens: ‘The case q = 4 is too complicated to be treated with the methods we have at this moment’ [53, p 46]. In order to prove conjecture 4.5 one needs to answer the following interdependent questions:

(i) What are the equivalence classes of unfoldings of (4.10) in the parameter ε for generic nonlinear terms given by A?
(ii) Are these unfoldings versal? (That is, robust under perturbations by higher-order terms.)

In [2] Arnol’d presented the first discussion of the 1 : 4 resonance problem, giving the equations for all local non-degeneracy conditions and two bifurcation sequences of (4.10). The first picture of the division of the A-plane into regions of different unfoldings, containing all local curves, can be found in [3] (and in the first edition of the English translation). In this paper Arnol’d writes that the main theorem on Zq-equivariant unfoldings has ‘the word “theorem” [. . . ] in quotation marks because it is not proved for q = 4’ [4, p 307]. The list of known unfoldings is contained in the examination problems of the book (without the conjecture that it is complete). The method of perturbation from a Hamiltonian was used by Neishtadt [48] who showed that zero, one or two limit cycles can bifurcate. In contrast to all other resonances, perturbations from a Hamiltonian do not give enough information to find all unfoldings of (4.10).

The non-local boundary curves in the A-plane, that were predicted by the work of Arnol’d [3] and Neishtadt [48], were computed via numerical continuation by Berezovskaya and Khibnik [11, 12, 13]. This resulted in the picture of the A-plane we know today [4]; see also figure 4.4 below. Their paper [12] also contains a complete list of all known unfoldings. Surveys of this ‘classic’ material on 1 : 4 resonance can be found in [4, 5, 50], and in the more recent textbooks [30, 43].

The question whether stable invariant tori bifurcate close to 1 : 4 resonance for certain values of the parameters was answered positively by Wan [55]. More recently, there are statements on the maximal number of limit cycles that can occur in certain regions of the parameter space of the Z4-equivariant normal form, which allows conclusions about possible types of bifurcation. Cheng [28] showed the existence and uniqueness of a limit cycle in the case where there are no secondary equilibria and the origin is repelling. Cheng and Sun [29] showed that the limit cycles around the secondary equilibria are inside a limit cycle around the origin. According to Zegeling [58] there can be at most four (one if the symmetry is divided out) such secondary limit cycles. Finally, Wang [56] showed that a unique secondary limit cycle is born in the secondary Hopf bifurcation.

The example of the forced Van der Pol equation can already be found in [55]. Periodically forced systems are studied asymptotically and numerically in [9, 10]. An example of a map near 1 : 4 resonance from a control system can be found in [32]. Finally, we remark that the equivalent of 1 : 4 resonance in the setting of symplectic maps has recently been studied by Bridges et al in [16].
In the series of papers [36, 37, 40] we studied the bifurcations of (4.10); see also the overview papers [39, 41] and the thesis [38]. In the next section we present the bifurcation set and all unfoldings of this normal form in a concise way. Our approach gives new insight into (4.10) and also facilitates the use of the latest numerical techniques. In section 4.5.2 we present new results on the computation of global bifurcation surfaces, which confirm the correctness of the bifurcation set presented in section 4.5.1. These methods can also be used in other bifurcation problems with three-dimensional parameter spaces.

4.5.1 The bifurcation set

We adopt now the point of view taken in [37, 40], where more details can be found. Scaling first time and then the phase plane shows that all bifurcation curves in the ε-plane of (4.10) are straight lines from the origin. Consequently, by setting $\varepsilon = e^{i\alpha}$ we encounter all bifurcations as α changes from, say, −π to π, which we call a bifurcation sequence. In combination with applying the transformation
\[ \mathrm{Re}\,A = \frac{1}{b}\cos\varphi, \qquad \mathrm{Im}\,A = \frac{1}{b}\sin\varphi \]
to (4.10) we get
\[ \dot z = e^{i\alpha} z + e^{i\varphi} z|z|^2 + b\,\bar z^{3} \tag{4.11} \]

where $b \in \mathbb{R}^+$, ϕ ∈ [0, 2π] and α ∈ (−π, π]. Notice that bifurcation sequences of (4.11) are in one-to-one correspondence with unfoldings of (4.10). As for the normal forms (4.7) and (4.8), we do not wish to distinguish between unfoldings that can be transformed into each other by
\[ (z, t) \mapsto (\bar z, t) \quad \text{and} \quad (z, t) \mapsto (\bar z, -t). \tag{4.12} \]

This means that we can restrict our attention to the region ϕ ∈ [π, 3π/2]. The unfoldings for other values of ϕ can be obtained by using the symmetries (4.12). The advantage of (4.11) over (4.10) is that all the interesting behavior occurs in a compact subset of parameter space, namely for b ∈ [0, 1]. This is helpful because the key idea is to view all parameters α, ϕ and b as parameters on equal terms. In other words, we study the bifurcation set that divides the full (b, ϕ, α)-space into regions of topologically equivalent phase portraits. We call (4.11) the compact equation and the region b ∈ [0, 1], α ∈ (−π, π], ϕ ∈ [π, 3π/2] the cube of interest. The bifurcation set of (4.11) is shown in figure 4.2. It was rendered with Mathematica [57] from formulas of the local bifurcations and additional information detailed in [37]. The bifurcation set consists of bifurcation surfaces intersecting in curves of codimension-two bifurcations. The codimension-one bifurcations, defining the surfaces, are introduced and labeled in table 4.1, and the codimension-two bifurcations, defining the curves, are introduced and labeled in table 4.2. The 15 generic phase portraits corresponding to the open regions in the complement of the bifurcation set are numbered and shown in figure 4.3. They, and all other phase portraits in this paper, were computed with the package DsTool [6]. Notice that there are only 12 topologically different phase portraits, because 4 and 9, 2 and 10, and 13 and 15 are topologically equivalent, respectively. If the second symmetry transformation in (4.12), which involves a change in the direction of time, is applied then we see that also 12 transforms into 14 and vice versa. The main point is that all information on all bifurcation sequences of (4.11), and hence on all unfoldings of (4.10), is contained in figures 4.2 and 4.3.
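The passage from (4.10) with ε = e^{iα} to the compact equation (4.11) can be checked numerically. The sketch below assumes that the change of variables is the phase-plane scaling z = √b·w with b = 1/|A| and ϕ = arg A; this particular scaling is our reading of the transformation and is not spelled out in the text. All function names are hypothetical.

```python
import cmath
import math


def rhs_410(z, eps, A):
    """Right-hand side of (4.10): z' = eps*z + A*z*|z|^2 + conj(z)^3."""
    return eps * z + A * z * abs(z) ** 2 + z.conjugate() ** 3


def rhs_411(w, alpha, phi, b):
    """Right-hand side of the compact equation (4.11)."""
    return (cmath.exp(1j * alpha) * w
            + cmath.exp(1j * phi) * w * abs(w) ** 2
            + b * w.conjugate() ** 3)


def compact_parameters(A):
    """Return (b, phi) with Re A = cos(phi)/b and Im A = sin(phi)/b."""
    b = 1.0 / abs(A)
    phi = cmath.phase(A) % (2.0 * math.pi)  # phi in [0, 2*pi)
    return b, phi


def scaling_defect(w, alpha, A):
    """Compare (4.10) after the assumed scaling z = sqrt(b)*w with (4.11)."""
    b, phi = compact_parameters(A)
    z = math.sqrt(b) * w
    scaled = rhs_410(z, cmath.exp(1j * alpha), A) / math.sqrt(b)
    return abs(scaled - rhs_411(w, alpha, phi, b))


if __name__ == "__main__":
    A = complex(-2.0, -1.5)  # |A| = 2.5, so b = 0.4, with phi in [pi, 3*pi/2]
    print(compact_parameters(A))
    print(scaling_defect(complex(0.4, 0.9), 0.7, A))
```

Under this scaling large |A| corresponds to small b, which is consistent with the remark that the interesting behavior is confined to a compact range of b.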
In order to construct a bifurcation sequence for a given (b, ϕ), all one has to do is collect the equivalence types of phase portrait that occur as α varies, say, from −π to π, much as one drills into glacial ice to learn about the climate of previous ages. The transition from one generic phase portrait to the next goes via a generic codimension-one bifurcation if the drilling only intersects codimension-one parts of the bifurcation set (surfaces), and then only transversely. In other words, the projections of the codimension-two bifurcation curves form boundary curves between regions in the (b, ϕ)-plane, and they are shown in figure 4.4. The curves S and BT were computed from the formulas in

Strong resonances and Takens’s Utrecht preprint

99

b

1

t1

0.5

Pi

0

Pi -2

2

2 #

S2

0

1

S1

-P -Pi ---2

1

t1

1 t1

-Pi -P 3 Pi ---2

' Pi

Figure 4.2. The bifurcation set for ϕ ∈ [π, 3π/2]. Some of the surfaces are partly cut away so that all surfaces can be seen. The surface ∞ is not drawn because it would block the view. For the surface labels see table 4.1. See also the colour section.

t

[37] and the global bifurcation curves S1 , S2 , S1 and T were continued numerically with AUTO/Homcont; see section 4.5.2 for details. The boundaries ∞ (ϕ = 3 π for 0 ≤ b ≤ 1) and t∞ (b = 1) are defined by a Hopf 2 bifurcation and a pitchfork bifurcation ‘at infinity’, respectively; see [40]. The different regions of bifurcation sequences are labeled according to the labeling

Bernd Krauskopf

100

Table 4.1. Symbols for surfaces of codimension-one bifurcations. Surface

Characterizing property

1

First Hopf bifurcation at 0

2

Second Hopf bifurcation at 0

S2

First saddle–node bifurcation

S2

Second saddle–node bifurcation

# 1 2

Hopf bifurcation at secondary equilibria Homoclinic loop at secondary equilibria First square connection; see figure 4.5(a1) Second square connection; see figure 4.5(a1) Clover connection; see figure 4.5(b1)

t

Saddle–node of limit cycles ∞ ∞

Pitchfork bifurcation at ∞ Hopf bifurcation at ∞

Table 4.2. Symbols for curves of codimension-two bifurcations. Curve S BT

Characterizing property Hopf bifurcation at 0 coincides with second saddle–node bifurcation Bogdanov–Takens bifurcation

T

Clover connection with zero trace at saddles

S1

Clover connection coincides with first saddle–node bifurcation

S1 S2

Square connection coincides with first saddle–node bifurcation Square connection coincides with second saddle–node bifurcation

introduced in [12] which was refined in [36]; see also [43]. All bifurcation sequences are listed in table 4.3 in the notation of tables 4.1 and 4.2; for the phase portraits involved see figure 4.3. Note that there are only 11 different bifurcation sequences, because those in regions V(a) and V(b) are equivalent. See [36] for an inventory of all bifurcation sequences consisting of phase portraits computed with DsTool. The bifurcation set is more than just a nice way of presenting what is known about the problem. Actually, it contains more information than the division of the (b, ϕ)-plane combined with the knowledge of the bifurcation sequences. This is because it also shows what happens above the boundary curves in the (b, ϕ)plane. In particular the question arises of what happens for (b, ϕ) = (1, 3π/2),

Strong resonances and Takens’s Utrecht preprint

[Figure 4.3: two cross sections of the bifurcation set, one for b ≈ 0.8 and one for b > 1, drawn with axes ϕ and α, followed by the phase portraits numbered 1 to 15.]

Figure 4.3. Two cross sections through the bifurcation set of figure 4.2 intersecting all 15 open regions, and the associated phase portraits. Note that panel 8 shows two concentric limit cycles. In fact, 2 and 10, 4 and 9, and 13 and 15 are topologically equivalent, so that there are only 12 different generic phase portraits in total.

which turns out to require a study of the bifurcations at ∞ of (4.11); see [40]. In our opinion the study of the bifurcation set is the right framework for studying the question of whether (4.11) is a versal model. We have analytical arguments and numerical evidence for the following; see [36, 37, 40] for details.


[Figure 4.4: the (b, ϕ)-plane for 0 ≤ b ≤ 1 and π ≤ ϕ ≤ 3π/2, divided into the regions I, II, III, III(a), IV, IV(a), V, V(a), V(b), VI, VII and VIII by the curves BT, S, T, S1, S2, ∞ and t∞.]

Figure 4.4. The (b, ϕ)-plane of (4.11) with the regions corresponding to different generic bifurcation sequences (roman numerals). The curves BT, S, ∞ and t∞ are known analytically and the curves S2, S1, S1 and T were computed with AUTO/Homcont. For the curve labels see table 4.2, and for a list of all generic bifurcation sequences see table 4.3.

Table 4.3. The 11 generic bifurcation sequences. Region

Bifurcation sequence when varying α from −π to π

I

12 –

II

1–

1 – 2 – S1 – 3 – S2 – 10 –

III

1–

1 – 2 – S1 – 3 –

III(a)

1–

1 – 2 – S1 – 3 – 1 – 9 – 2 – 11 – S2 – 1 1 – 2 – S1 – 4 – 1 – 3 – 2 – 9 – S2 – 10 – 2 – 1 1 – 2 – S1 – 4 – 1 – 3 – 2 – 9 – 2 – 11 – S2 – 1

1 – 13 –

1 – 14 – 2 – 15 –

1 – 9 – S2 – 10 – 2 – 1 – 3 – 2 – 9 – S2 – 10 – 2 – 1 1 – 2 – S1 – 5 – – 2 – S – 5 – – 3 – 2 – 9 – 2 – 11 – S2 – 1 1 1 – 8 – – 3 – 1 – 9 – 2 – 11 – S2 – 1 1 – 2 – S1 – 5 – – 7 – #– 5 – – 3 – 1 – 9 1 – 2 – S1 – 6 –

IV

1–

IV(a)

1–

V

1–

V(a),V(b)

1–

VI

1–

VII

1– – 2 – 11 – S2 – 1 1 – 1 – 2 – S1 – 6 – – 2 – 11 – S2 – 1

VIII

2 – 12

2 –1

–7–

#– 5 –

–8–

–3– 1 –9


Results on the Z4-equivariant normal form (4.11)

(i) The bifurcation surfaces intersect each other only in the known curves and do not have folds when projected onto the (b, ϕ)-plane in the direction of α.
(ii) Along the curves in the interior of the cube of interest the codimension-two bifurcations are generic.
(iii) For b = 1, ϕ = 3π/2 and α ≠ 0, ±π/2, π there are two types of codimension-two singularity at ∞. They unfold versally in the parameters b and ϕ in a local neighborhood.
(iv) For b = 1, ϕ = 3π/2, α = 0 there is a codimension-three singularity at ∞, which seems to unfold versally in the parameters b, ϕ and α. The point b = 1, ϕ = 3π/2, α = 0 in parameter space is an organizing center: arbitrarily close to it all equivalence types of phase portrait can be found; see also [42].
(v) Over the open regions in the (b, ϕ)-plane the surfaces in the bifurcation set are transversal to the α-direction and correspond to generic codimension-one bifurcations.

The first four statements indicate that there are indeed no more bifurcation sequences than those we already know of. The last is a robustness result that indicates that the unfoldings in the open regions are versal. The problem is not completely solved, and there are many open questions and real challenges. On the other hand, all results to this date confirm conjecture 4.5, so that the unfoldings of Z4-equivariant vector fields presented here (and indeed in textbooks such as [30, 43]) can be used with considerable confidence. It is a general fact that in many advanced bifurcation problems certain assumptions and observations can only be checked numerically at present. (An example is result (i) above.) This is why advanced numerical tools are now essential for the study of bifurcation problems and models from applications alike.

4.5.2 Surfaces of heteroclinic bifurcations

In this section we show how to compute the surface of clover connections and the two surfaces of square connections.
If one considers the bifurcation set in the extended region ϕ ∈ [π/2, 3π/2] then one realizes that the second square connection surface is the image of the first square connection surface under the symmetry (4.12); see figure 4.7(b3). Hence, it is sufficient to compute only the first square connection surface for ϕ ∈ [π/2, 3π/2]. In order to use the continuation software Homcont for homoclinic bifurcations, which is part of the package AUTO [31], we divide out the Z4-symmetry of (4.11) as follows. Writing z = r e^{iθ}, we transform (4.11) to polar coordinates

\[
\dot r = \cos\alpha\, r + (\cos\varphi + b\cos 4\theta)\, r^3, \qquad
\dot\theta = \sin\alpha + (\sin\varphi - b\sin 4\theta)\, r^2 .
\]
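The polar form can be checked numerically. The following Python sketch assumes that (4.11) reads dz/dt = e^{iα} z + e^{iϕ} z|z|² + b z̄³ (an assumption, since the equation itself is stated earlier in the chapter), and compares the polar components derived from this complex field with the formulas quoted above. All function names are illustrative only.

```python
import cmath
import math


def complex_field(z, alpha, phi, b):
    # Assumed form of (4.11): dz/dt = e^{i alpha} z + e^{i phi} z |z|^2 + b conj(z)^3.
    return (cmath.exp(1j * alpha) * z
            + cmath.exp(1j * phi) * z * abs(z) ** 2
            + b * z.conjugate() ** 3)


def polar_field(r, theta, alpha, phi, b):
    # Right-hand sides of the polar equations as quoted in the text.
    r_dot = math.cos(alpha) * r + (math.cos(phi) + b * math.cos(4 * theta)) * r ** 3
    theta_dot = math.sin(alpha) + (math.sin(phi) - b * math.sin(4 * theta)) * r ** 2
    return r_dot, theta_dot


def polar_from_complex(r, theta, alpha, phi, b):
    # Uses dz/dt = (r_dot + i r theta_dot) e^{i theta} to extract r_dot and theta_dot.
    z = r * cmath.exp(1j * theta)
    w = complex_field(z, alpha, phi, b) * cmath.exp(-1j * theta)
    return w.real, w.imag / r


def max_polar_mismatch(alpha=0.1007, phi=0.7, b=0.5):
    # Largest difference between the two computations over a small grid of points.
    worst = 0.0
    for i in range(1, 6):
        for j in range(12):
            r, theta = 0.3 * i, 0.5 * j
            rd1, td1 = polar_field(r, theta, alpha, phi, b)
            rd2, td2 = polar_from_complex(r, theta, alpha, phi, b)
            worst = max(worst, abs(rd1 - rd2), abs(td1 - td2))
    return worst
```

The default parameter values are those quoted for figure 4.5(a1); any other values should serve equally well for this consistency check.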


[Figure 4.5: panels (a1), (b1), (a2) and (b2), showing the stable and unstable manifolds Ws and Wu; coinciding manifolds are marked Ws = Wu.]

Figure 4.5. Dividing out the Z4-symmetry reduces the heteroclinic square (a1) and clover connection (b1) to the respective homoclinic connections (a2) and (b2). The parameters are b = 0.5, ϕ = 0.7, and α = 0.1007 for (a1), (a2) and α = 0.214 for (b1), (b2). As an aside, note that (a1) is (topologically equivalent to) the logo of the SIAM Conference on Applications of Dynamical Systems.

Next, we define x = r cos 4θ and y = r sin 4θ to obtain

\[
\begin{aligned}
\dot x &= \cos\alpha\, x - 4\sin\alpha\, y + (\cos\varphi\, x - 4\sin\varphi\, y)(x^2 + y^2) + b\,(x^2 + 4y^2)\sqrt{x^2 + y^2}, \\
\dot y &= \cos\alpha\, y + 4\sin\alpha\, x + (\cos\varphi\, y + 4\sin\varphi\, x)(x^2 + y^2) - 3b\,xy\sqrt{x^2 + y^2}.
\end{aligned}
\tag{4.13}
\]

Notice that the vector field (4.13) is well defined, but does not have a continuous derivative at the origin. It can be thought of as a vector field on the cone that one gets by glueing together the sides of a quadrant of the complex plane. Heteroclinic orbits of clover and square type of (4.11) are two different types of homoclinic orbit of (4.13), as is illustrated in figure 4.5. Note that both homoclinic orbits surround the origin and stay well away from it. The surface of clover connections can be represented as a family, parametrized by b, of curves

[Figure 4.6: panels (a) to (e), showing the stable manifold Ws and the center manifold Wc of the saddle–node; coinciding manifolds are marked Ws = Wc.]

Figure 4.6. Manifold organization along the lower saddle–node surface S2 , namely: saddle–node bifurcation outside a limit cycle (a), codimension-two saddle–node and square connection at S1 (b), saddle–node bifurcation on a limit cycle (c), codimension-two saddle–node and clover connection at S1 (d), and saddle–node bifurcation outside a limit cycle (e). From (a) to (e) ϕ takes the values −0.2π, 0.9265π, 1.1π, 1.265π, and 1.29π, while b = 0.5 and α = ϕ − π − arcsin b.

from S1 to the line {ϕ = 3π/2, α = π/2 and 0 ≤ b ≤ 1} (where the system is Hamiltonian); see figure 4.7. Similarly, the first square connection surface can be represented as a family, parametrized by b, of curves from the line {ϕ = π/2, α = −π/2 and 0 ≤ b ≤ 1} (where the system is Hamiltonian) to S1. For b > 1 this surface is a family of curves between (the two Hamiltonian lines) {ϕ = π/2, α = −π/2} and {ϕ = 3π/2, α = −π/2}. This information comes from the definition of the boundary curves and from the knowledge of the Hamiltonian vector fields along the lines {ϕ = 3π/2, α = ±π/2}; see [37] for details. The main idea is now to compute these curves by starting on the saddle–node surface S1 where it forms the boundary between phase portraits 2 and 3; see figure 4.3. On this part of S1 the saddle–node takes place on a limit cycle, and its boundary is given by the curves S1 and S1, corresponding to codimension-two saddle–node homoclinic orbits; cf [7]. This is shown in figure 4.6, where we plot phase portraits at the saddle–node bifurcation (defined by α = ϕ − π − arcsin b; see [37]) for the three generic codimension-one cases (a), (c), and (e), which correspond to a saddle–node bifurcation outside, on and inside a periodic orbit, respectively. Panels (b) and (d) show the codimension-two

[Figure 4.7: panels (a1), (a2), (a3) and (b1), (b2), (b3), drawn in (b, ϕ, α)-space.]

Figure 4.7. Global connection surfaces: the surface of clover connections (a) and the surfaces of first and second square connections (b), as computed with AUTO/Homcont. Panels (a1) and (b1) are the computed curves, panels (a2) and (b2) the rendered surfaces, and panels (a3) and (b3) show how these surfaces connect in (b, ϕ, α)-space with the saddle–node surfaces S1, S2 and the two Hopf surfaces. See also the colour section.


saddle–node homoclinic orbits S1 (square type) and S1 (clover type).

By increasing b ∈ [0, 1.5] in steps of 0.05 we computed the curves of the above parametrizations with Homcont. (Notice that all interesting behavior of (4.11) is captured in this range of b-values.) Concretely, we detected the saddle–node bifurcation on S1, where it takes place on a limit cycle at the boundary between phase portraits 2 and 3. Homcont allows one to detect the codimension-two homoclinic bifurcations S1 (clover type) and S1 (square type). Starting from the codimension-two point S1 of clover type (in the respective section for fixed b) we followed the homoclinic bifurcation of clover type for increasing ϕ until ϕ = 3π/2. Similarly, we started at S1 of square type (in the respective section for fixed b < 1) and followed the first square connection for decreasing ϕ until ϕ = π/2. For b ≥ 1 we found the homoclinic orbit of the first square connection with the method of homotopy, which is also implemented in AUTO/Homcont [31], and then followed it in ϕ.

The collections of curves for the chosen values of b can be seen in figure 4.7(a1) for the clover connection. Panel (a2) shows the result of rendering these data as a surface. This was done in Mathematica by importing the data from AUTO and computing a mesh from the individual data points. Figure 4.7(a3) shows how the surface sits in the bifurcation set; see also figure 4.2. In the same way, figure 4.7(b1) shows the computed curves on the first square connection surface, while panel (b2) is the rendering of the surface. In figure 4.7(b3) we show how both square connection surfaces sit in the bifurcation set, where the second is simply a plot of the symmetric transform of the first. Our computations are strong evidence in support of result (i) above.

Also with AUTO/Homcont we computed the codimension-two bifurcation curves S1 of clover type and S1 of square type (and also the curve T) in (b, ϕ, α)-space. Their projections form the respective boundary curves in the (b, ϕ)-plane in figure 4.4, where we obtained S2 from S1 by again applying the symmetry transformation (4.12).
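For orientation, the right-hand side of the reduced system (4.13), in which these homoclinic orbits are continued, can be written as a small function and checked against the polar equations via x = r cos 4θ, y = r sin 4θ. The Python sketch below is only a consistency check under that change of coordinates; it is not the AUTO/Homcont equations file used for the computations, and all names are illustrative.

```python
import math


def reduced_field(x, y, alpha, phi, b):
    # Right-hand side of (4.13); rho is the radius sqrt(x^2 + y^2).
    rho2 = x * x + y * y
    rho = math.sqrt(rho2)
    x_dot = (math.cos(alpha) * x - 4 * math.sin(alpha) * y
             + (math.cos(phi) * x - 4 * math.sin(phi) * y) * rho2
             + b * (x * x + 4 * y * y) * rho)
    y_dot = (math.cos(alpha) * y + 4 * math.sin(alpha) * x
             + (math.cos(phi) * y + 4 * math.sin(phi) * x) * rho2
             - 3 * b * x * y * rho)
    return x_dot, y_dot


def reduced_from_polar(r, theta, alpha, phi, b):
    # Chain rule applied to x = r cos(4 theta), y = r sin(4 theta).
    r_dot = math.cos(alpha) * r + (math.cos(phi) + b * math.cos(4 * theta)) * r ** 3
    theta_dot = math.sin(alpha) + (math.sin(phi) - b * math.sin(4 * theta)) * r ** 2
    c, s = math.cos(4 * theta), math.sin(4 * theta)
    return r_dot * c - 4 * r * s * theta_dot, r_dot * s + 4 * r * c * theta_dot


def max_reduced_mismatch(alpha=0.214, phi=0.7, b=0.5):
    # Largest difference between (4.13) and the transformed polar equations on a grid.
    worst = 0.0
    for i in range(1, 6):
        for j in range(12):
            r, theta = 0.3 * i, 0.13 * j
            x, y = r * math.cos(4 * theta), r * math.sin(4 * theta)
            xd1, yd1 = reduced_field(x, y, alpha, phi, b)
            xd2, yd2 = reduced_from_polar(r, theta, alpha, phi, b)
            worst = max(worst, abs(xd1 - xd2), abs(yd1 - yd2))
    return worst
```

The square root in `reduced_field` is the reason why (4.13) fails to have a continuous derivative at the origin.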
More generally, the method of computing bifurcation surfaces of global bifurcations is a useful new tool in bifurcation theory. Note that the surfaces under consideration often have (at least conjecturally) the property that they can be represented as a family of one-dimensional curves. We used this method in the study of a family of cubic Liénard equations in [34]. There we showed numerically that the bifurcation set in the respective three-dimensional parameter space has a cone structure with respect to the central singularity. This involved the computation of three different surfaces of global bifurcations. The interested reader is invited to investigate this parameter space interactively by visiting the multimedia enhancement web page that was published with [34].

4.6 From the normal form to the full dynamics

Throughout this paper we have studied the unfoldings of a generic Zq-equivariant planar vector field Xµ,β, whose time-one map is an approximation of the qth


iterate of the original Poincaré map Pµ,β near a p : q resonance; see theorem 4.1. The question is what happens when J_{p/q} ∘ X^1_{µ,β} in (4.3) is perturbed by generic flat terms. This was addressed by Takens in [53, pp 54–6], and we briefly sketch what is known about this classical problem.

• By the Implicit Function Theorem, hyperbolic elements of phase portraits of the vector field (including the local behavior of invariant manifolds) do not change qualitatively under (sufficiently small) perturbations.
• A limit cycle of the vector field gives rise to a normally hyperbolic invariant circle of the time-one map. Due to normal hyperbolicity a perturbation of this circle is present in the perturbed map. The dynamics on the invariant circle is generically quite complicated; see [22].
• If an invariant circle is born in a Hopf bifurcation, resonance tongues are attached to the locus of this Hopf bifurcation in the parameter space of the perturbed map. For parameters from a resonance tongue there is exactly one periodic attractor on the invariant circle; see [20, 21].
• A curve of heteroclinic (or homoclinic) connections of the vector field gives an exponentially sharp horn of the perturbed map in which there is heteroclinic tangle. The boundaries of the horn are formed by two curves of heteroclinic tangency bifurcations; see [20, 21, 49].
• A curve of saddle–node bifurcations of limit cycles of the vector field leads to a phenomenon for the perturbed map called the quasi-periodic saddle–node bifurcation. Close to the bifurcation curve of the vector field there are two curves for the map. On one side of both curves there exist two normally hyperbolic invariant circles of the perturbed map, on the other side of both curves there is no invariant circle. In between the two curves the dynamics is not completely understood. One can find periodic behavior suggested by resonances as well as quasi-periodic behavior on Denjoy-like Cantor sets; see [18, 25, 26, 27].
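The second item in the list above can be illustrated on a toy example. The Python sketch below uses the Hopf normal form dz/dt = (µ + iω)z − z|z|², which is chosen here purely for illustration and is not the normal form studied in this chapter. Its limit cycle |z| = √µ should appear as an attracting invariant circle of the time-one map, which is approximated with a classical fourth-order Runge–Kutta scheme. The flat perturbation terms discussed in the text are not modelled.

```python
def hopf_field(z, mu, omega):
    # Illustrative planar vector field with a limit cycle of radius sqrt(mu) for mu > 0.
    return (mu + 1j * omega) * z - z * abs(z) ** 2


def time_one_map(z, mu, omega, steps=200):
    # Approximate time-one map of the flow, using classical RK4 with step 1/steps.
    h = 1.0 / steps
    for _ in range(steps):
        k1 = hopf_field(z, mu, omega)
        k2 = hopf_field(z + 0.5 * h * k1, mu, omega)
        k3 = hopf_field(z + 0.5 * h * k2, mu, omega)
        k4 = hopf_field(z + h * k3, mu, omega)
        z = z + (h / 6.0) * (k1 + 2 * k2 + 2 * k3 + k4)
    return z


def distance_to_circle(z, mu):
    # Distance of the point z from the circle |z| = sqrt(mu).
    return abs(abs(z) - mu ** 0.5)
```

With µ = 0.25 and ω = 1, a point on the circle of radius 0.5 should be mapped back onto that circle up to discretisation error, while nearby points should move closer to it under one application of the map.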

With these general results the dynamics of the Poincaré map Pµ,β can be found from the dynamics of the Zq-equivariant planar normal form Xµ,β at least qualitatively. To obtain the dynamics of the forced vector field (4.2) near the periodic orbit one needs to suspend Pµ,β. Notice that equilibria of Xµ,β correspond to q-periodic orbits of Pµ,β. In suspension, any q-periodic orbits of Pµ,β form a p : q torus knot around this periodic orbit. Figure 4.1 is an example of a suspended phase portrait for the case q = 2, showing a period-doubled orbit or 1 : 2 torus knot.

References

[1] Arnol’d V I 1972 Lectures on bifurcations and versal families Usp. Math. Nauk. 27(5) 119–84 (Engl. transl. 1973 Russ. Math. Surv. 27(4) 54–123)

[2] Arnol’d V I 1977 Loss of stability of self-oscillation close to resonance and versal deformations of equivariant vector fields Funkts. Anal. Prilozh. 11(2) 2–10 (Engl. transl. 1977 Funct. Anal. Appl. 11 85–92)
[3] Arnol’d V I 1978 Supplementary Chapters in the Theory of Ordinary Differential Equations (Moscow: Nauka) (translated as [4])
[4] Arnol’d V I 1988 Geometrical Methods in the Theory of Ordinary Differential Equations 2nd edn (Berlin: Springer)
[5] Arnol’d V I, Afrajmovich V S, Il’yashenko Yu S and Shil’nikov L P 1994 I. Bifurcation theory Dynamical Systems V: Bifurcation Theory and Catastrophe Theory (Encyclopedia of Mathematical Sciences 5) ed V I Arnol’d (Berlin: Springer) (translated from 1986 Dinamicheskie Sistemy 5 (Moscow: VINITI))
[6] Back A, Guckenheimer J, Myers M R, Wicklin F J and Worfolk P A 1992 DsTool: computer assisted exploration of dynamical systems Not. Am. Math. Soc. 39 303–9
[7] Bai F and Champneys A R 1996 Numerical computation of saddle–node homoclinic orbits of co-dimension one and two J. Dynam. Stab. Syst. 11 327–48
[8] Barsuk L O, Belosludtsev N M, Neimark Yu I and Salganskaya N M 1968 Stability of a fixed point of a mapping in a critical case and some special bifurcations Izv. Vuzov (Radiphys.) 11(11) 1632–41 (in Russian)
[9] Belhaq M 1990 Numerical study for parametric excitation of differential equation near a 4-resonance Mech. Res. Commun. 17(4) 199–206
[10] Belhaq M 1992 4-subharmonic bifurcation and homoclinic transition near resonance point in nonlinear parametric oscillator Mech. Res. Commun. 19(4) 279–87
[11] Berezovskaya F S and Khibnik A I 1978 On the analysis of the equation ż = e^{iα}z + Az|z|² + z̄³ (in connection with the problem of stability loss of a periodic solution near 1 : 4 resonance) Report of Research Computing Center of Acad. Sci. USSR at Pushchino (in Russian)
[12] Berezovskaya F S and Khibnik A I 1979 On the problem of auto-oscillation bifurcations near 1 : 4 resonance (investigation of the model equation) Report of Research Computing Center of Acad. Sci. USSR at Pushchino (Engl. transl. 1994 On the problem of bifurcations of self-oscillations close to a 1 : 4 resonance Sel. Math. Form Sov. 13(3) 197–215)
[13] Berezovskaya F S and Khibnik A I 1980 On the bifurcations of separatrices in the problem of stability loss of auto-oscillations near 1 : 4 resonance Prikl. Math. Mech. 44 938–42 (Engl. transl. 1981 J. Appl. Math. Mech. 44 663–7)
[14] Bogdanov R I 1976 Bifurcations of a family of vector fields on the plane Tr. Semin. Im. I. G. Petrovskogo 2 23–36 (Engl. transl. 1981 Sel. Math. Sov. 1 373–87)
[15] Bogdanov R I 1976 Versal deformation of a singularity of a vector field on the plane in case of zero eigenvalues Tr. Semin. Im. I. G. Petrovskogo 2 37–65 (Engl. transl. 1981 Sel. Math. Sov. 1 389–421)
[16] Bridges T J, Furter J E and Lahiri A 1997 Instability and bifurcation near the symplectic 1 : 4 resonance Dynam. Stab. Syst. 12(4) 271–92
[17] Broer H W and Golubitsky M 2001 The geometry of resonance tongues: a singularity approach Preprint (http://www.math.uh.edu/ dynamics/reprints.html)
[18] Broer H W, Huitema G B and Takens F 1990 Toward a Quasi-Periodic Bifurcation Theory Unfoldings and Bifurcations of Quasi-Periodic Tori (Memoirs AMS 83) ed H W Broer, G B Huitema, F Takens and B L J Braaksma (Providence, RI: American Mathematical Society) p 421

[19] Broer H W and Roussarie R 2001 Exponential confinement of chaos in the bifurcation sets of real analytic diffeomorphisms, chapter 7 of this volume
[20] Broer H W, Roussarie R and Simó C 1993 On the Bogdanov–Takens bifurcation for planar diffeomorphisms Proc. Equadiff 91 ed C Perelló, C Simó and J Solà-Morales, pp 81–92
[21] Broer H W, Roussarie R and Simó C 1996 Invariant circles in the Bogdanov–Takens bifurcation for diffeomorphisms Ergod. Theor. Dynam. Syst. 16 1147–72
[22] Broer H W and Takens F 1989 Formally symmetric normal forms and genericity Dynamics Reported vol 2, ed U Kirchgraber and H O Walther (New York: Wiley) pp 39–59
[23] Broer H W and Vegter G 1992 Bifurcational aspects of parametric resonance Dynamics Reported vol 1 (New Series), ed C K R T Jones et al (Berlin: Springer) pp 1–53
[24] Carr J 1981 Applications of Centre Manifold Theory (Applied Mathematical Science 35) (Berlin: Springer)
[25] Chenciner A 1985 Bifurcation de points fixes elliptiques: I. Courbes invariantes Publ. Math. IHES 61 67–127
[26] Chenciner A 1985 Bifurcation de points fixes elliptiques: II. Orbites périodiques et ensembles de Cantor invariants Inv. Math. 80 81–106
[27] Chenciner A 1988 Bifurcation de points fixes elliptiques: III. Orbites périodiques de ‘petites’ périodes et elimination résonnantes des couples de courbes invariantes Publ. Math. IHES 66 5–91
[28] Cheng C-Q 1990 Hopf bifurcations in nonautonomous systems at points of resonance Sci. China A 33 206–19
[29] Cheng C-Q and Sun Y-S 1992 Metamorphoses of phase portraits of vector fields in the case of symmetry of order 4 J. Diff. Eqns 95 130–9
[30] Chow S-N, Li C and Wang D 1994 Normal Forms and Bifurcation of Planar Vector Fields (Cambridge: Cambridge University Press)
[31] Doedel E, Fairgrieve T, Sandstede B, Champneys A, Kuznetsov Yu A and Wang X 1998 AUTO 97: continuation and bifurcation software for ordinary differential equations (ftp://ftp.cs.concordia.ca/pub/doedel/auto/auto.ps.Z)
[32] Frouzakis C E, Adomaitis R A and Kevrekidis I G 1991 Resonance phenomena in an adaptively-controlled system Int. J. Bif. Chaos 1(1) 83–106
[33] Guckenheimer J 1995 Phase portraits of planar vector fields: computer proofs Exp. Math. 4(2) 153–65
[34] Khibnik A I, Krauskopf B and Rousseau C 1998 Global study of a family of cubic Liénard equations Nonlinearity 11(6) 1505–19, with featured multimedia enhancement available via http://www.iop.org/Journals/no
[35] Khorosov E I 1979 Versal deformations of equivariant vector fields for the case of symmetries of orders 2 and 3 Tr. Semin. Im. I. G. Petrovskogo 5 163–92 (Engl. transl. 1985 Topics Mod. Math. Petrovskij Seminar 5 207–43)
[36] Krauskopf B 1994 Bifurcation sequences at 1 : 4 resonance: an inventory Nonlinearity 7(3) 1073–91
[37] Krauskopf B 1994 The bifurcation set for the 1 : 4 resonance problem Exp. Math. 3(2) 107–28
[38] Krauskopf B 1995 On the 1 : 4 resonance problem: analysis of the bifurcation set PhD Thesis University of Groningen

[39] Krauskopf B 1995 The extended parameter space of a model for 1 : 4 resonance CWI Q. 8(3) 237–255
[40] Krauskopf B 1997 Bifurcations at ∞ in a model for 1 : 4 resonance Ergod. Theor. Dynam. Syst. 17 899–931
[41] Krauskopf B 1998 Stability loss near 1 : 4 resonance Charlemagne and His Heritage: 1200 Years of Civilization and Science in Europe Volume 2: Mathematical Arts ed P L Butzer, H Th Jongen and W Oberschelp (Brepols) pp 551–61
[42] Krauskopf B and Rousseau C 1997 Codimension-three unfoldings of reflectionally symmetric planar vector fields Nonlinearity 10(5) 1115–50
[43] Kuznetsov Yu A 1995 Elements of Applied Bifurcation Theory (Applied Mathematical Sciences 112) (Berlin: Springer)
[44] LeBlanc V G 2000 On some secondary bifurcations near resonant Hopf–Hopf interactions Dynam. Cont. Discr. Imp. Syst. 7 405–27
[45] Malo S 1994 Rigorous computer verification of planar vector field structure PhD Thesis Cornell University, Ithaca, NY
[46] Mel’nikov V K 1962 A qualitative description of resonance phenomena in nonlinear systems Dubna OINF Preprint R-1012 (in Russian)
[47] Neimark Yu I 1959 On some cases of periodic motions on parameters Dokl. Akad. Nauk. SSSR 129(4) 736–9
[48] Neishtadt A I 1978 Bifurcations of the phase pattern of an equation system arising in the problem of stability loss of selfoscillations close to 1 : 4 resonance Prikl. Math. Mech. 42(5) 830–40 (Engl. transl. 1979 J. Appl. Math. Mech. 42 896–907)
[49] Palis J and Takens F 1993 Hyperbolicity & Sensitive Chaotic Dynamics at Homoclinic Bifurcations (Cambridge: Cambridge University Press)
[50] Rousseau C 1989 Codimension 1 and 2 bifurcations of fixed points of diffeomorphisms and periodic solutions of vector fields Ann. Sci. Math. Québec 13(2) 55–91
[51] Sacker R J 1964 On invariant surfaces and bifurcations of periodic ordinary differential equations Courant Inst. Math. Sci. Technical Report IMM-NYU 333
[52] Sacker R J 1965 A new approach to the perturbation theory of invariant surfaces Commun. Pure Appl. Math. 18 717–32
[53] Takens F 1974 Forced oscillations and bifurcations Applications of Global Analysis I Communications of the Mathematical Institute Rijksuniversiteit Utrecht 3-1974 (reprinted in chapter 1 of this volume)
[54] Wagener F 2001 Semilocal analysis of the k : 1 and k : 2 resonances in quasiperiodically forced systems, chapter 5 of this volume
[55] Wan Y-H 1978 Bifurcation into invariant tori at points of resonance Arch. Rat. Mech. Anal. 68 343–57
[56] Wang D 1990 Hopf bifurcation at the nonzero foci in 1 : 4 resonance Acta Math. Sinica (new series) 6 10–17
[57] Wolfram S 1988 Mathematica, a System for Doing Mathematics by Computer (New York: Addison-Wesley)
[58] Zegeling A 1993 Equivariant unfoldings in the case of symmetry of order 4 Serdica 19 71–9


Chapter 5

Semi-local analysis of the k : 1 and k : 2 resonances in quasi-periodically forced systems

Florian Wagener
University of Amsterdam

The damped driven linear oscillator ÿ + dẏ + α²y = ε sin ωt is usually treated in a first course on mechanics, where students learn to describe its behaviour by a function A(ω) that gives the amplitude of the oscillation as a function of the driving frequency ω. In his Utrecht preprint [17] Floris Takens took the same problem, but this time for nonlinear oscillators, as a starting point for his analysis of resonances, weak and strong, at periodic forcing. He investigated the generic behaviour of parametrised families of vector fields that have, for a certain value of the parameter, closed orbits whose Floquet multipliers λ are roots of unity. Special attention was given to the case of strong resonance, where the Floquet multipliers satisfy λ^ℓ = 1 with ℓ a positive integer less than or equal to 4.

A subsequent question is the following. Let a nonlinear oscillator be given that can sustain, for certain values of the parameters, self-excited oscillations, e.g. an oscillator of Van der Pol type. It is assumed that in the space of parameters there is a codimension-one manifold of non-degenerate Hopf bifurcations of some equilibrium, which can be assumed to be the origin. What happens if a small fixed frequency quasi-periodic perturbation is applied to the system? How does the global structure of the bifurcation diagram change? For the quasi-periodic case, these types of question have been addressed by Braaksma and Broer in [4], and again by Broer et al in [5]. However, they restricted themselves to the case where the Floquet exponent of the unperturbed system at bifurcation is not in


resonance with the m-dimensional quasi-periodic frequency of the perturbation. In parameter space this leads to the existence of a Hopf bifurcation set Hc which is a subset of positive measure on some codimension-one smooth manifold E, and a family of blunt cusps, tangent to E at points of Hc with infinite order of tangency, such that for parameter values in the cusps, the system has an invariant normally hyperbolic (m + 1)-torus; see [4] for details. The complement of the cusps consists of many small holes. In analogy to the situation in the theory of quasi-periodic saddle–node bifurcations (see [5, 8, 9, 10]), these are called ‘Chenciner bubbles’.

In this chapter, and in a more detailed way in [19], resonance phenomena in these bubbles are investigated by averaging techniques as in [7], and bifurcation analysis. It turns out that the averaged system is the same as in the case of periodic perturbations, treated in [12, 13]. However, in the periodic case the Hopf bifurcation manifold in parameter space is smooth everywhere except at some isolated resonance regions. In the quasi-periodic case on the other hand, it is shown here that every resonance between the perturbing quasi-periodic frequency and the Floquet exponent of the unperturbed system is generically the end point (for ε = 0) of an open resonance tongue. Hence, in the quasi-periodic case it is to be expected that for finite perturbation strength the Hopf bifurcation curve is frayed.

In particular, the following situation is considered: given a family of vector fields Z(σ) on a two-dimensional manifold N, such that for some value σ0 of the parameter, the vector field Z0 = Z(σ0) has an equilibrium p at which the linearisation of Z0 has eigenvalues on the imaginary axis. Let these eigenvalues be denoted by ±iα; these will be the Floquet exponents of the unperturbed tori later on. Let T^m = R^m/2πZ^m be the standard m-torus.
If there is another family of vector fields X(σ, ε) on T^m × N, which is such that for ε = 0, X is of the form:

\[
X = \omega\frac{\partial}{\partial x} + Z + \varepsilon\tilde Z
\]

where ω ∈ R^m is quasi-periodic, Z is in the obvious way interpreted as a vector field on T^m × N, and Z̃ may depend on the torus coordinate x, then X is called a quasi-periodic perturbation of Z. In the case m = 1, this reduces to a periodic perturbation of Z. That case has been treated in [12]; the present chapter is part of an analysis of the quasi-periodic case. The vector field X is in normal resonance at ε = 0, if there are k ∈ Z^m, ℓ ∈ Z_+, such that:

\[
\langle k, \omega\rangle + \ell\alpha = 0
\]

where ⟨·, ·⟩ denotes the standard inner product. For ℓ ≥ 3 techniques from KAM theory (see [5]) can be used to show the existence of invariant tori close to T^m × {p} in T^m × N. These techniques break down in the case of low-order resonances, that is, in the case where ℓ ∈ {1, 2}, since in that case it is not obvious


how to control the linearisation of the flow at the torus (for ℓ = 2), or even the location of the torus (for ℓ = 1) under perturbations. In order to investigate what happens in those cases, the vector field X can be investigated for (σ, ε) close to (σ0, 0) using normal form or averaging techniques; cf [6, 7, 14]. This analysis is given in the forthcoming publication [19]: one of the main steps is the investigation of the periodic case. The main points of that analysis are repeated here, since the context differs slightly from [12]. For the averaged system the local bifurcation curves will be given, and the bifurcation diagrams are sketched (they differ in the case ℓ = 1 from those given in [12] due to different scalings), and finally results from [19] will be presented, which indicate how these pictures change if more quasi-periodic low-order normal resonances are considered.
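As a small illustration of the normal resonance condition ⟨k, ω⟩ + ℓα = 0, the following Python sketch lists the low-order resonances (ℓ = 1 or ℓ = 2) of a given frequency vector ω and Floquet exponent α among integer vectors k with bounded entries. The function name, the search bound `k_max` and the tolerance are assumptions made for this example; they do not come from the text.

```python
import itertools


def normal_resonances(omega, alpha, max_order=2, k_max=3, tol=1e-9):
    # Returns all pairs (k, ell) with |<k, omega> + ell * alpha| < tol,
    # where 1 <= ell <= max_order and every entry of k lies in [-k_max, k_max].
    hits = []
    for ell in range(1, max_order + 1):
        for k in itertools.product(range(-k_max, k_max + 1), repeat=len(omega)):
            value = sum(ki * wi for ki, wi in zip(k, omega)) + ell * alpha
            if abs(value) < tol:
                hits.append((k, ell))
    return hits
```

For instance, with ω = (1, g), where g is the golden mean (√5 − 1)/2, and α = (1 + g)/2, the search should report a single resonance with ℓ = 2, namely k = (−1, −1), and none with ℓ = 1.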

5.1 Preliminaries

Forced oscillators are modelled as vector fields X(σ) on a phase space M = T^m × R², where T^m = R^m/2πZ^m is the m-torus. Typical points of M are denoted by (x, y), with x ∈ T^m and y ∈ R². In the following only fixed frequency forcing is considered (cf [4]), that is, all vector fields have the same component along the torus, given as

\[
\omega\frac{\partial}{\partial x}
\]

where ω ∈ R^m satisfies a Diophantine condition: there are γ > 0, τ > m − 1, fixed for the remainder of the chapter, such that for all k ∈ Z^m\{0},

\[
|\langle k, \omega\rangle| \ge \gamma |k|^{-\tau} .
\]

Unless stated otherwise, all functions are assumed to be infinitely often differentiable (or smooth). Originating in the theory of Hamiltonian systems, integrability is usually taken to mean equivariance with respect to the action of a continuous group; cf [5]. In the present context this group is T^m, acting on M as τψ(x, y) = (x + ψ, y). A vector field X is then called integrable, if it is equivariant under τψ for all ψ, that is, if (τψ)∗X = X for all ψ ∈ T^m. Unforced oscillators correspond to integrable vector fields on M.
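The Diophantine condition on ω can be probed numerically over a finite range of k. The Python sketch below computes the minimum of |⟨k, ω⟩| |k|^τ over all nonzero integer vectors k with entries bounded by `k_max`. This gives only an upper bound for the admissible constant γ, not a proof that ω is Diophantine. The choice of |k| as the sum of the absolute values of the entries, and all names, are assumptions made for this example.

```python
import itertools
import math


def diophantine_estimate(omega, tau, k_max):
    # Minimum of |<k, omega>| * |k|^tau over nonzero k with entries in [-k_max, k_max].
    best = math.inf
    for k in itertools.product(range(-k_max, k_max + 1), repeat=len(omega)):
        norm = sum(abs(ki) for ki in k)
        if norm == 0:
            continue  # the condition is only required for k != 0
        small_divisor = abs(sum(ki * wi for ki, wi in zip(k, omega)))
        best = min(best, small_divisor * norm ** tau)
    return best
```

A resonant frequency vector such as ω = (1, 1/2) should give the value 0, since ⟨k, ω⟩ vanishes for k = (1, −2), whereas for ω = (1, g) with g the golden mean the estimate should stay bounded away from zero.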


Adding small amounts of forcing is modelled by adding a non-integrable term εF(x, y, σ, ε) ∂/∂y to the integrable (unforced) vector field X0(σ), with ε the (non-negative) perturbation strength. Let X denote the total vector field

\[
X(\sigma) = \omega\frac{\partial}{\partial x} + X_0(\sigma) + \varepsilon F\frac{\partial}{\partial y} .
\]

More specifically, it is assumed that X is of the form

\[
X = \omega\frac{\partial}{\partial x} + \left[ \begin{pmatrix} \mu & -\alpha \\ \alpha & \mu \end{pmatrix} y + |y|^2 C(\sigma)\, y + R(y, \sigma) + \varepsilon F(x, y, \sigma, \varepsilon) \right] \frac{\partial}{\partial y}
\tag{5.1}
\]

where R = O(|y|^5). Writing X this way introduces the assumptions that the parameter σ is of the form σ = (µ, α, σ3, ...) and that H0, the Hopf bifurcation curve of the unforced system, is of the form µ = 0. Moreover, it introduces an assumption on the dependence of the Floquet multipliers of the invariant torus on the parameters µ and α, and it assumes that the unforced system has been written in normal form coordinates at a Hopf bifurcation. Finally, if necessary after a scaling, the matrix C can be taken to be orthogonal.

This chapter is organised as follows. In the next section we present results that give a normal form of the perturbed vector field X, averaged over the torus T^m × {y = 0}; see also [19]. This normal form of X consists of an integrable part Z and a non-integrable part R that is of higher order in both |y| and ε. In sections 5.3 and 5.4, equations for bifurcation curves of local bifurcations are given, and in the final section (section 5.5) consequences of this analysis for the vector field X are drawn.

5.2 Normal form analysis

As usual in this kind of problem, complex valued coordinates simplify many expressions. They are introduced here by posing z = y1 + iy2, yielding

\[
X = \omega\frac{\partial}{\partial x} + (\lambda z + e^{i\theta}|z|^2 z + r + \varepsilon f)\frac{\partial}{\partial z} .
\]

Here λ = µ + iα and r = O(|z|^5). To analyse the location of the Hopf bifurcation curve Hε if ε is small but non-zero, some α0 ∈ R is fixed, and a normal form transformation NF (also called averaging transformation in this context) is applied to X for (z, λ) in a

Semi-local analysis of the k : 1 and k : 2 resonances


small neighbourhood of $(0, i\alpha_0)$. The averaging is performed over the group that is generated by the linear vector field

\[ X = \omega \frac{\partial}{\partial x} + i\alpha_0 z \frac{\partial}{\partial z}. \]

In the case where $\langle k, \omega\rangle + \ell\alpha \ne 0$ for any $k$ and $\ell$, this group is $T^m \times T$, generated by translations over the torus, and rotations around the torus. In the case of a low-order normal resonance, the group is represented by all maps $\varphi_\psi(x, z) = (x + \psi, e^{-i\langle k, \psi\rangle} z)$ for $\psi \in T^m$. As is shown in [19], to lowest order, the averaged vector field is of the form

\[ X = \omega \frac{\partial}{\partial x} + (\lambda z + e^{i\theta} |z|^2 z + \varepsilon g + \tilde r) \frac{\partial}{\partial z} \]

where $\tilde r = O(|z|^5 + \varepsilon^2 + \varepsilon |z|^2)$, and where $g$ (to lowest order) is of the form

\[ g(x, z, \bar z) = \begin{cases} A e^{-i\langle k, x\rangle} & \text{if } \ell = 1, \\ A e^{-i\langle \frac{1}{2} k, x\rangle}\, \bar z & \text{if } \ell = 2. \end{cases} \tag{5.2} \]

Here

\[ A = \frac{1}{(2\pi)^m} \int_{T^m} f(x, 0, 0, \sigma_0)\, e^{i\langle k, x\rangle}\, dx \]

for $\ell = 1$, and

\[ A = \frac{1}{(2\pi)^m} \int_{T^m} \frac{\partial f}{\partial z}(x, 0, 0, \sigma_0)\, e^{i\langle k, x\rangle}\, dx \]

for $\ell = 2$. Both situations ($\ell = 1$ and $\ell = 2$) can be reduced to the simpler case where $\lambda$ is close to 0 by a Van der Pol transformation $\mathrm{VdPol}$, which is given by

\[ \mathrm{VdPol}^{-1}(x, z, \sigma) = (x, e^{-i\langle k, x\rangle} z, \sigma_0 + \sigma) \]

where $\sigma_0 = (0, i\alpha_0, \sigma_3^0, \ldots)$ is a parameter on the manifold $H_0$ in $k : \ell$ resonance. Note that in the case $\ell = 2$, this transformation lifts the vector field to a covering space $\tilde M$, which is isomorphic to $T^m \times \mathbb{C}$. In [1], this is called lifting the vector field to the Seifert foliation; the present treatment is very close in spirit to [7]. The group of deck transformations on this covering space is generated by the maps $\Pi_k : \tilde M \to \tilde M$ for $k \in \{1, \ldots, m\}$, with $\Pi_k(x, z) = (x + \pi e_k, -z)$, where $e_k$ is the $k$th unit vector. Recall that the lifted vector field $\tilde X$ is equivariant with respect to these transformations, that is, $(\Pi_k)_* \tilde X = \tilde X$ for all $k$. The effect of the Van der Pol transformation is to transform the vector field to

\[ X = \frac{1}{\ell} \omega \frac{\partial}{\partial x} + (\lambda z + e^{i\theta} |z|^2 z + \varepsilon A \bar z^{\ell - 1} + \tilde r) \frac{\partial}{\partial z}. \]
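The coefficient $A$ for $\ell = 1$ is a single Fourier coefficient of the forcing. As a minimal sketch, assuming a one-dimensional torus ($m = 1$) and a made-up forcing term, it can be approximated by an equispaced sum; the function name `resonant_coefficient` and the sample forcing are hypothetical and only illustrate the integral above.

```python
import cmath
import math


def resonant_coefficient(f, k, samples=256):
    """Approximate A = (1/(2*pi)) * integral_0^{2*pi} f(x) * exp(i*k*x) dx.

    Only the case m = 1 is covered.  The equispaced sum reproduces the
    integral up to rounding when f is a trigonometric polynomial of low
    degree compared with `samples`.
    """
    h = 2 * math.pi / samples
    total = sum(f(j * h) * cmath.exp(1j * k * j * h) for j in range(samples))
    return total / samples


def forcing(x):
    """Hypothetical forcing f(x, 0, 0, sigma_0): two Fourier modes only."""
    return 3 * cmath.exp(-2j * x) + math.cos(x)


for k in (1, 2, 3):
    print(k, resonant_coefficient(forcing, k))
```

For this forcing only the modes matching $e^{-i k x}$ contribute, so the non-degeneracy condition $A \ne 0$ holds for $k = 1$ and $k = 2$ but fails for $k = 3$.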


Here the function $\tilde r$ is of order $\tilde r = O(|z|^5 + \varepsilon^2 + \varepsilon |z|^2)$. To simplify the vector field $X$ at the resonance further, a scaling transformation $\mathrm{sc} : M \times [0, 1] \to M \times [0, 1]$ is performed, where

\[ \mathrm{sc}^{-1}(x, z, \sigma, \varepsilon) = \left( x, \mu z, \mu^2 \sigma, \frac{\mu^{4 - \ell}}{A} \right). \]

The vector field $X$ takes the form

\[ X = \frac{1}{\ell} \omega \frac{\partial}{\partial x} + \mu^2 (\lambda z + e^{i\theta} |z|^2 z + \bar z^{\ell - 1} + \mu r) \frac{\partial}{\partial z} \]

with a different remainder term $r$, which is now of order $O(1)$ as $\mu \to 0$. It is assumed that $A \ne 0$. This is a non-degeneracy condition, and it is an open and dense condition on $f$; see [19] for details. Hence, for $(x, z, \sigma)$ in some compact neighbourhood of $T^m \times \{0\} \times \{\sigma_0\}$, the vector field $X$ can be written as

\[ X = \frac{1}{\ell} \omega \frac{\partial}{\partial x} + \mu^2 Z + \mu^3 r \frac{\partial}{\partial z} \tag{5.3} \]

with

\[ Z = (\lambda z + e^{i\theta} |z|^2 z + \bar z^{\ell - 1}) \frac{\partial}{\partial z}. \tag{5.4} \]

The integrable part $\frac{1}{\ell} \omega\, \partial/\partial x + \mu^2 Z$ of (5.3) incorporates the lowest order contribution of perturbation terms. By analysing it, the major effects of the forcing on the unforced system are found. After dividing out the action of the group $T^m$ of translations along the torus, the integrable part of $X$ reduces to $\mu^2 Z$. Hence, it suffices at first to restrict attention to this vector field, or rather, after rescaling time, to $Z$ itself. In the next sections, bifurcation curves and diagrams are given for $Z$ in the cases $\ell = 1$ and $\ell = 2$. These are essentially the same as found in [12] for the periodic case (where a Poincaré map instead of an averaged vector field was considered), since the averaged system is the same in both the periodic and the quasi-periodic case. Subsequently, the full vector field $X$ is interpreted as a small perturbation of the integrable vector field $\frac{1}{\ell} \omega\, \partial/\partial x + \mu^2 Z$, and persistence of the bifurcations under this perturbation is discussed.
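The effect of the scaling on the $z$-component can be checked numerically. The sketch below drops the remainder terms and takes $A$ real and positive for simplicity; the function names and all numerical values are assumptions chosen for illustration. It confirms that substituting $z = \mu w$, $\lambda = \mu^2 \Lambda$ and $\varepsilon = \mu^{4-\ell}/A$ produces an overall factor $\mu^3$, which becomes the factor $\mu^2$ in front of $Z$ once $\dot z = \mu \dot w$ is taken into account.

```python
import cmath


def averaged_field(z, lam, theta, eps, A, ell):
    """lam*z + e^{i theta}|z|^2 z + eps*A*conj(z)^(ell-1), remainder dropped."""
    cubic = cmath.exp(1j * theta) * abs(z) ** 2 * z
    return lam * z + cubic + eps * A * z.conjugate() ** (ell - 1)


def scaled_field(w, Lam, theta, ell):
    """z-component of Z in (5.4), written in the scaled variables w and Lam."""
    cubic = cmath.exp(1j * theta) * abs(w) ** 2 * w
    return Lam * w + cubic + w.conjugate() ** (ell - 1)


mu, theta, A = 0.05, 2.0, 0.7          # sample values (assumptions)
w, Lam = 0.8 + 0.3j, -0.2 + 1.1j
for ell in (1, 2):
    eps = mu ** (4 - ell) / A           # last component of the inverse scaling
    unscaled = averaged_field(mu * w, mu ** 2 * Lam, theta, eps, A, ell)
    scaled = mu ** 3 * scaled_field(w, Lam, theta, ell)
    print(ell, abs(unscaled - scaled))  # differences at rounding level
```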

5.3 Semi-local bifurcation analysis of the k : 1 resonance

This section analyses the normal form obtained in the previous section for $\ell = 1$. All equilibria of the system are determined, together with their stability. This way, a semi-local bifurcation diagram of $X$ is obtained: it is semi-local in the sense that the ‘perturbation strength’ $\mu$ is assumed to be close to 0, whereas the ‘space variable’ $z$ is varying over a compact set $K$ that is large enough to capture all ‘interesting’ phenomena, for instance, $K = \{|z| \le 10\}$.


Consider the $z$-coordinate of the vector field given in (5.4)

\[ Z = (\lambda z + e^{i\theta} |z|^2 z + 1) \frac{\partial}{\partial z}. \tag{5.5} \]

Assuming the map $\sigma \mapsto (\lambda(\sigma), \theta(\sigma))$ to have a surjective derivative, after reparametrisation $(\lambda, \theta)$ may be considered as parameters themselves.

5.3.1 Equilibria

In order to determine the equilibria of (5.5) most conveniently, we introduce auxiliary variables $\zeta$, $\psi$ and $\rho$ by $\zeta = |z|^2$, $z = \zeta^{1/2} e^{i\psi}$, and $\rho = \rho_1 + i\rho_2 = e^{-i\theta} \lambda$. Using these, if $\zeta^{1/2} e^{i\psi}$ is an equilibrium then

\[ \rho \zeta^{1/2} + \zeta^{3/2} = -e^{-i(\theta + \psi)}. \tag{5.6} \]

Taking norms of both sides yields

\[ \zeta^3 + 2\rho_1 \zeta^2 + |\rho|^2 \zeta = 1. \tag{5.7} \]

Note that $\zeta = 0$ cannot be a solution to this equation. Assuming that $\zeta$ is a positive root of (5.7), equation (5.6) determines the argument $\psi$ uniquely up to multiples of $2\pi$. Hence, every positive root of (5.7) corresponds in a one-to-one way to an equilibrium of the system.

5.3.2 Codimension-one bifurcations

Saddle–node bifurcations

Since (5.7) is a third-order equation in $\zeta$, it has at most three, and at least one positive root. Double and triple roots of (5.7) correspond to (non-degenerate) saddle–node and cusp bifurcations of equilibria respectively. Equation (5.7) has a double root if

\[ 3\zeta^2 + 4\rho_1 \zeta + |\rho|^2 = 0 \tag{5.8} \]

and a triple root if additionally

\[ 6\zeta + 4\rho_1 = 0. \tag{5.9} \]

Eliminating $\zeta$ from equations (5.7) and (5.8) gives the expression for the curve of saddle–node bifurcations

\[ \rho_1^4 \rho_2^2 + 2\rho_1^2 \rho_2^4 + \rho_2^6 + \rho_1^3 + 9\rho_1 \rho_2^2 + \frac{27}{4} = 0. \]

It is smooth everywhere except at cusp points, which are found by solving the nonlinear system (5.7)–(5.9), keeping in mind that $\zeta > 0$, yielding

\[ \rho_1 = -\frac{3}{2}, \qquad \rho_2 = \pm\frac{\sqrt{3}}{2}. \]
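The correspondence between positive roots of (5.7) and equilibria of (5.5) can be turned into a small numerical routine. The following sketch is an illustration under stated assumptions, not the method used in the chapter: it brackets simple positive roots on a grid and refines them by bisection, so roots of even multiplicity (exactly on the saddle–node curve) are not detected, and the function names are hypothetical.

```python
import cmath
import math


def positive_roots(rho, samples=20000):
    """Simple positive roots of zeta^3 + 2*rho1*zeta^2 + |rho|^2*zeta - 1 = 0, cf. (5.7)."""
    rho1, mod2 = rho.real, abs(rho) ** 2

    def f(s):
        return s ** 3 + 2 * rho1 * s ** 2 + mod2 * s - 1

    upper = 1 + max(abs(2 * rho1), mod2, 1.0)      # Cauchy bound on all roots
    grid = [upper * i / samples for i in range(samples + 1)]
    roots = []
    for a, b in zip(grid, grid[1:]):
        fa, fb = f(a), f(b)
        if fb == 0.0:
            roots.append(b)                         # grid point is an exact root
        elif fa * fb < 0:
            lo, hi = a, b
            for _ in range(60):                     # bisection on a sign change
                mid = 0.5 * (lo + hi)
                if f(lo) * f(mid) <= 0:
                    hi = mid
                else:
                    lo = mid
            roots.append(0.5 * (lo + hi))
    return roots


def equilibria(lam, theta):
    """Equilibria z of lam*z + e^{i theta}|z|^2 z + 1 = 0, built from (5.6) and (5.7)."""
    rho = cmath.exp(-1j * theta) * lam
    result = []
    for zeta in positive_roots(rho):
        # (5.6): exp(-i(theta + psi)) = -(rho + zeta) * sqrt(zeta)
        psi = -theta - cmath.phase(-(rho + zeta))
        result.append(math.sqrt(zeta) * cmath.exp(1j * psi))
    return result


theta = 0.7
for lam in (0j, -3 * cmath.exp(1j * theta)):
    zs = equilibria(lam, theta)
    residuals = [abs(lam * z + cmath.exp(1j * theta) * abs(z) ** 2 * z + 1) for z in zs]
    print(len(zs), max(residuals))
```

With $\rho = 0$ the cubic reduces to $\zeta^3 = 1$ and there is a single equilibrium, whereas $\rho = -3$ gives $\zeta(\zeta - 3)^2 = 1$ with three positive roots, in line with the count of at least one and at most three equilibria.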


Hopf bifurcations

Linearising (5.5) around an equilibrium $z_0$ by posing $z = z_0 + w$ and neglecting terms of order $O(|w|^2)$ yields

\[ \dot w = (\rho e^{i\theta} + 2 e^{i\theta} |z_0|^2) w + e^{i\theta} z_0^2 \bar w. \]

A necessary condition for a Hopf bifurcation is then, using $\zeta = |z_0|^2$,

\[ \mathrm{Re}[e^{i\theta} (\rho + 2\zeta)] = 0 \quad \text{and} \quad 3\zeta^2 + 4\rho_1 \zeta + |\rho|^2 > 0. \tag{5.10} \]

Together with (5.7), this yields the equations for the set of Hopf bifurcations

\[ (\tan^2\theta + 1) \rho_2^2 - \tfrac{1}{4} (\rho_1 - \rho_2 \tan\theta)^2 > 0 \tag{5.11} \]

\[ \rho_1^3 + \rho_1^2 \rho_2 \tan\theta + \rho_1 \rho_2^2 (4 - \tan^2\theta) - \rho_2^3 (4 \tan\theta + \tan^3\theta) = -8. \tag{5.12} \]

Equation (5.12), together with inequality (5.11), gives the locus of the Hopf bifurcation points in the $(\rho_1, \rho_2)$ space. A Euclidean rotation over angle $-\theta$ then yields the Hopf bifurcation curves in $\lambda$-space.

5.3.3 Codimension-two bifurcations

Cusp bifurcations

Cusp bifurcations have already been obtained above, at singularities of the saddle–node bifurcation curve.

Bogdanov–Takens bifurcations

Note that, if in equation (5.12) polar coordinates $r$ and $\varphi$ are introduced by $\rho_1 = r \cos\varphi$, $\rho_2 = r \sin\varphi$, the equation can be written in the form

\[ r^3 = \frac{8}{f(\theta, \varphi)}. \]

Hence, every straight line through the origin will intersect the curve $H$ given by (5.12) at most once. Note that the boundary of the region (5.11), obtained by replacing the inequality by an equality, consists of two straight lines through the origin, which intersect $H$ at precisely two points, corresponding to a situation where the linearisation of an equilibrium has two zero eigenvalues. Hence, these points satisfy the necessary linear condition to be Bogdanov–Takens bifurcation points; see [2, 3, 17]. Unfortunately, since the expressions for these points are quite complicated, it remains unchecked whether the Bogdanov–Takens bifurcations are nondegenerate. Note however that there are parameter values for which the necessary conditions of a generic codimension-three bifurcation of focus type in the sense of [11] are satisfied.
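The Hopf relations can be checked on a concrete point. In the sketch below, which is an illustration and not taken from the chapter, a value $\zeta > 0$ is prescribed, the trace condition $\mathrm{Re}[e^{i\theta}(\rho + 2\zeta)] = 0$ is used in the form $\rho_1 = \rho_2 \tan\theta - 2\zeta$, and the equilibrium equation (5.7) then becomes a quadratic in $\rho_2$. The helper names are hypothetical and $\cos\theta \ne 0$ is assumed so that $\tan\theta$ is defined.

```python
import math


def hopf_candidate(theta, zeta, branch=1):
    """Parameter rho with an equilibrium |z0|^2 = zeta of zero trace, or None.

    Inserting rho1 = rho2*tan(theta) - 2*zeta into zeta*|rho + zeta|^2 = 1 gives
    (t^2 + 1)*rho2^2 - 2*t*zeta*rho2 + zeta^2 - 1/zeta = 0 with t = tan(theta).
    """
    t = math.tan(theta)
    a, b, c = t * t + 1, -2 * t * zeta, zeta ** 2 - 1 / zeta
    disc = b * b - 4 * a * c
    if disc < 0:
        return None
    rho2 = (-b + branch * math.sqrt(disc)) / (2 * a)
    return complex(rho2 * t - 2 * zeta, rho2)


def lhs_511(rho, theta):
    """Left-hand side of inequality (5.11)."""
    t = math.tan(theta)
    return (t * t + 1) * rho.imag ** 2 - 0.25 * (rho.real - rho.imag * t) ** 2


def lhs_512(rho, theta):
    """Left-hand side of equation (5.12)."""
    t, r1, r2 = math.tan(theta), rho.real, rho.imag
    return r1 ** 3 + r1 ** 2 * r2 * t + r1 * r2 ** 2 * (4 - t * t) - r2 ** 3 * (4 * t + t ** 3)


def determinant(rho, zeta):
    """Determinant |rho + 2*zeta|^2 - zeta^2 of the real linearisation."""
    return abs(rho + 2 * zeta) ** 2 - zeta ** 2


theta, zeta = 2 * math.pi / 3, 0.5
rho = hopf_candidate(theta, zeta)
print(rho, lhs_512(rho, theta), lhs_511(rho, theta), determinant(rho, zeta))
```

At such a point the left-hand side of (5.12) equals $-8$ up to rounding, and the left-hand side of (5.11) coincides with the determinant of the linearisation, so a positive value signals a pair of purely imaginary eigenvalues.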


5.3.4 Codimension three

Substituting the cusp bifurcation parameter values into (5.10), with the inequality sign replaced by an equality, yields the location of codimension-three bifurcation points as

\[ \cos^2\left( \theta \mp \frac{\pi}{6} \right) = 1 \]

where the signs correspond to the signs of the cusp bifurcation parameters; the solutions are $\{\pi/6, 5\pi/6, 7\pi/6, 11\pi/6\}$ in $[0, 2\pi)$. Normal form calculations, too lengthy to be reproduced here, yield that these points are non-degenerate nilpotent codimension-three bifurcation points of elliptic type (in the terminology of [11]). Note that this implies the existence of degenerate Hopf bifurcation points, at least for parameters in a neighbourhood of the codimension-three points.

5.3.5 Bifurcation diagrams

The results obtained above are illustrated in figure 5.1. The bifurcation diagrams in terms of the parameters $\lambda_1 = \mathrm{Re}(\lambda)$ and $\lambda_2 = \mathrm{Im}(\lambda)$ are given for $\theta = 9\pi/10$ and $\theta = 2\pi/3$. Note that these diagrams are qualitatively different.
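The angles of section 5.3.4 can be verified directly. At a cusp point, (5.9) gives $\zeta = -2\rho_1/3 = 1$, so the trace condition reads $\mathrm{Re}[e^{i\theta}(\rho + 2)] = 0$ with $\rho = -3/2 \pm i\sqrt{3}/2$. The short sketch below (the function name is hypothetical) evaluates this expression at the four angles.

```python
import cmath
import math


def trace_at_cusp(theta, sign):
    """Re[e^{i theta}(rho + 2*zeta)] at rho = -3/2 + sign*i*sqrt(3)/2, zeta = 1."""
    rho = complex(-1.5, sign * math.sqrt(3) / 2)
    return (cmath.exp(1j * theta) * (rho + 2)).real


# Upper sign of the cusp: theta - pi/6 is a multiple of pi.
for theta in (math.pi / 6, 7 * math.pi / 6):
    print(theta, trace_at_cusp(theta, +1), math.cos(theta - math.pi / 6) ** 2)
# Lower sign of the cusp: theta + pi/6 is a multiple of pi.
for theta in (5 * math.pi / 6, 11 * math.pi / 6):
    print(theta, trace_at_cusp(theta, -1), math.cos(theta + math.pi / 6) ** 2)
```

In each case the trace vanishes to rounding accuracy while the squared cosine equals one, so the two formulations of the codimension-three condition agree.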

5.4 Semi-local bifurcation analysis of the k : 2 resonance

Here a bifurcation analysis is given of system (5.4) for $\ell = 2$

\[ Z = (\lambda z + e^{i\theta} |z|^2 z + \bar z) \frac{\partial}{\partial z}. \tag{5.13} \]

Note that this vector field is $\mathbb{Z}_2$-equivariant under the rotation $z \mapsto -z$ of the complex plane: this is the quotient action of the deck group after the torus symmetry $T^m$ has been divided out.

5.4.1 Equilibria

Introducing $\zeta$, $\psi$ and $\rho$ as in the beginning of the previous subsection, setting the right-hand side of (5.13) equal to zero gives

\[ e^{-2i\psi} \zeta^{1/2} = -e^{i\theta} \rho \zeta^{1/2} - e^{i\theta} \zeta^{3/2}. \tag{5.14} \]

This implies either $\zeta = 0$ or, after eliminating $\zeta^{1/2}$ and taking absolute values on both sides, that

\[ \zeta^2 + 2\rho_1 \zeta + |\rho|^2 = 1. \tag{5.15} \]

As before, for given $\zeta$ the argument $\psi$ can be solved from (5.14), resulting here in two roots $\psi = \psi_0$ and $\psi = \psi_0 + \pi$. Hence, any positive root $\zeta$ of (5.15) determines two equilibria. Of course, this follows also immediately from the $\mathbb{Z}_2$-equivariance.
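Since (5.15) is a quadratic, the non-zero equilibria of (5.13) can be written down explicitly. The sketch below is an illustration with a hypothetical function name: it solves (5.15) for $\zeta$, recovers the two arguments $\psi_0$ and $\psi_0 + \pi$ from (5.14), and checks the equilibrium equation numerically.

```python
import cmath
import math


def nonzero_equilibria(lam, theta):
    """Non-zero equilibria of lam*z + e^{i theta}|z|^2 z + conj(z) = 0 via (5.14)-(5.15)."""
    rho = cmath.exp(-1j * theta) * lam
    disc = 1 - rho.imag ** 2                  # quarter discriminant of (5.15)
    if disc < 0:
        return []
    root = math.sqrt(disc)
    zetas = sorted({-rho.real - root, -rho.real + root})
    result = []
    for zeta in zetas:
        if zeta <= 0:
            continue
        # (5.14): exp(-2 i psi) = -exp(i theta) * (rho + zeta)
        psi0 = -0.5 * cmath.phase(-cmath.exp(1j * theta) * (rho + zeta))
        for psi in (psi0, psi0 + math.pi):    # the Z2-related pair
            result.append(math.sqrt(zeta) * cmath.exp(1j * psi))
    return result


theta = 2 * math.pi / 3
for rho in (-2.0, 0.5, 2.0):
    lam = cmath.exp(1j * theta) * rho
    zs = nonzero_equilibria(lam, theta)
    worst = max((abs(lam * z + cmath.exp(1j * theta) * abs(z) ** 2 * z + z.conjugate())
                 for z in zs), default=0.0)
    print(rho, len(zs), worst)
```

For $\rho = -2$ both roots $\zeta = 1$ and $\zeta = 3$ are positive, giving four non-zero equilibria; for $\rho = 1/2$ only $\zeta = 1/2$ is positive, giving two; for $\rho = 2$ only the origin remains.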


[Figure 5.1: six panels (a)–(f) with axes lambda1 and lambda2; cusp points are marked C and Bogdanov–Takens points are marked BT.]

Figure 5.1. The bifurcation diagrams for the family of averaged vector fields $Z_0 = (\lambda_1 + i\lambda_2) z + e^{i\theta} |z|^2 z + 1$ with $\theta = 9\pi/10$ (left column) and $\theta = 2\pi/3$ (right column), in the $(\lambda_1, \lambda_2)$-plane. Solid curves indicate Hopf bifurcations and dashed curves saddle–node bifurcations. Panels (b) and (c), and (e) and (f) are enlargements near cusp and Bogdanov–Takens points, denoted by C and BT, respectively.

5.4.2 Codimension-one bifurcations

Saddle–node and pitchfork bifurcations

Equilibria coalesce if (5.15) has a simple root $\zeta = 0$, or if it has a double root. The first case corresponds to a pitchfork bifurcation at 0, which is a generic bifurcation in the $\mathbb{Z}_2$-equivariant context, the second to a saddle–node bifurcation. It follows immediately from equation (5.15) that pitchfork bifurcations occur for all parameters $\rho$ satisfying

\[ |\rho| = 1 \]

and saddle–nodes for

\[ \rho_2^2 = 1, \qquad \rho_1 < 0. \]

At $\rho = \pm i$, the pitchfork bifurcation is degenerate.

Hopf bifurcations

To assess the stability of an equilibrium $z_0$, equation (5.13) is linearised around $z_0$ as

\[ \dot w = e^{i\theta} (\rho + 2|z_0|^2) w + (1 + e^{i\theta} z_0^2) \bar w. \]

There are two cases. First, if $z_0 = 0$, Hopf bifurcations of the origin are considered. They occur for

\[ \mathrm{Re}(e^{i\theta} \rho) = 0, \qquad |\rho| > 1. \]

Geometrically, this set corresponds to the intersection of a line through the origin with the complement of the unit circle, resulting in two half-lines. Second, the set of Hopf bifurcations of points $z_0$ such that $|z_0|^2 = \zeta > 0$ consists of parameters satisfying the relations

\[ \tfrac{1}{4} (\rho_1 + \rho_2 \tan\theta)^2 + \rho_2^2 = 1 \tag{5.16} \]

\[ \tfrac{1}{2} \rho_1^2 + \tfrac{1}{2} \tan\theta\, \rho_1 \rho_2 + \rho_2^2 < 1 \tag{5.17} \]

\[ 0 < 2\zeta = \rho_2 \tan\theta - \rho_1. \tag{5.18} \]

Geometrically, the locus of these relations is an elliptic arc in the space of parameters.

5.4.3 Codimension-two bifurcations

Degenerate pitchfork bifurcations

Combining the conditions for saddle–node and pitchfork bifurcations yields two degenerate pitchfork bifurcation points at

\[ \rho = i \quad \text{and} \quad \rho = -i. \]

Since equation (5.15) is of second order in $\zeta$, the points are non-degenerate codimension-two bifurcation points.

Bogdanov–Takens bifurcation

If (5.17) is satisfied with equality replacing the inequality then the linearisation of the vector field at the corresponding equilibrium has a double eigenvalue zero. This occurs for

\[ \rho = |\tan\theta| \left( -1 + \frac{i}{\tan\theta} \right). \]


These points satisfy the necessary conditions of Bogdanov–Takens bifurcation points. Again, because of complexity, normal forms are not computed.

Symmetric double-zero bifurcations

The endpoints on the unit circle of the two half-lines that describe Hopf bifurcations of $z = 0$ correspond to the case where the linearisation of the vector field around the origin has a double-zero eigenvalue. In the original system, this situation corresponds to a k : 2 strong resonance bifurcation. The nondegeneracy of this bifurcation is determined by normal form analysis; see [15, 18]. This is most conveniently done in real coordinates $(y_1, y_2) \in \mathbb{R}^2$, related to $z$ by $z = y_1 + i y_2$. Likewise, real parameters $\lambda_1$, $\lambda_2$ are introduced by $\lambda = \lambda_1 + i\lambda_2$. Standard normal form theory shows that for $(\lambda_1, \lambda_2)$ close to $(0, 1)$, there exists a coordinate change transforming the system into

\[ Z = y_2 \frac{\partial}{\partial y_1} + \left( -2\sigma_2 y_1 + 2\sigma_1 y_2 + (-4 \sin\theta) y_1^3 + (8 \cos\theta) y_1^2 y_2 \right) \frac{\partial}{\partial y_2} + O(|y|^5 + |y|^3 |\sigma| + |y| |\sigma|^2). \]

Here $\lambda_1 = \sigma_1$ and $\lambda_2 = 1 + \sigma_2$. The normal form for $(\lambda_1, \lambda_2) = (\sigma_1, -1 + \sigma_2)$ close to $(0, -1)$ reads

\[ Z = y_2 \frac{\partial}{\partial y_1} + \left( 2\sigma_2 y_1 + 2\sigma_1 y_2 + (4 \sin\theta) y_1^3 + (-4 \cos\theta) y_1^2 y_2 \right) \frac{\partial}{\partial y_2} + O(|y|^5 + |y|^3 |\sigma| + |y| |\sigma|^2). \]

Note that in both cases the singularity is non-degenerate for all $\theta \ne k\pi/2$, $k \in \mathbb{Z}$, and that it is versally unfolded by the corresponding family. Moreover, note that for non-degenerate values of $\theta$, the two bifurcation points are of different type; see [1] for more information.

5.4.4 Bifurcation diagrams

The results obtained above are illustrated by the bifurcation diagram in figure 5.2. Note that the two points of symmetric double-zero bifurcations corresponding to k : 2 resonance points in the original system are necessarily of different type.
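The bifurcations of the origin in the k : 2 case follow from the trace and determinant of the linearisation $\dot w = \lambda w + \bar w$. Writing $w = u + iv$ and $\lambda = \lambda_1 + i\lambda_2$ gives a real $2 \times 2$ system with trace $2\lambda_1$ and determinant $|\lambda|^2 - 1$. The following sketch, with hypothetical function names and a tolerance chosen for illustration, classifies a parameter value accordingly.

```python
def origin_linearisation(lam):
    """Trace and determinant of w' = lam*w + conj(w) viewed as a real 2x2 system."""
    return 2 * lam.real, abs(lam) ** 2 - 1


def classify_origin(lam, tol=1e-12):
    """Name the (possibly degenerate) situation of the equilibrium z = 0."""
    trace, det = origin_linearisation(lam)
    if abs(det) < tol and abs(trace) < tol:
        return "double zero"        # lam = +i or -i
    if abs(det) < tol:
        return "pitchfork"          # |lam| = 1, one zero eigenvalue
    if abs(trace) < tol and det > 0:
        return "Hopf"               # Re(lam) = 0 and |lam| > 1
    return "hyperbolic"


for lam in (1j, -1j, 2j, 1 + 0j, 0.5j, 0.3 + 0.2j):
    print(lam, classify_origin(lam))
```

The pitchfork circle $|\lambda| = 1$ and the two Hopf half-lines on the imaginary axis meet exactly at $\lambda = \pm i$, the symmetric double-zero points.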

5.5 Conclusions

This section discusses the consequences of the semi-local bifurcation analysis of the system

\[ Z = (\lambda z + e^{i\theta} |z|^2 z + \bar z^{\ell - 1}) \frac{\partial}{\partial z} \]

for the full system

\[ X = \frac{1}{\ell} \omega \frac{\partial}{\partial x} + \mu^2 Z + \mu^3 g \frac{\partial}{\partial z}. \tag{5.19} \]

[Figure 5.2: bifurcation diagram with axes lambda1 and lambda2; the marked points are labelled BT, DPF and SDZ.]

Figure 5.2. The bifurcation diagram of the family of $\mathbb{Z}_2$-equivariant vector fields $Z_1 = (\lambda_1 + i\lambda_2) z + e^{i\theta} |z|^2 z + \bar z$ with $\theta = 2\pi/3$, in the $(\lambda_1, \lambda_2)$-plane. Solid curves indicate Hopf bifurcations, dashed lines saddle–node bifurcations, dotted curves pitchfork bifurcations of the origin (corresponding to period doubling bifurcations of the original vector field $X$). The abbreviation BT stands for Bogdanov–Takens bifurcation point, DPF for degenerate pitchfork, SDZ for the symmetric double-zero bifurcation. If a small quasi-periodic forcing is applied to the system, this bifurcation diagram persists, although bifurcation curves may alter their positions slightly. However, the interpretation of the bifurcation points changes. Hopf and Bogdanov–Takens bifurcations of equilibria become Hopf and Bogdanov–Takens bifurcations of invariant quasi-periodic tori, pitchfork and degenerate pitchfork bifurcations become period doubling and degenerate period doubling bifurcations of tori, and the symmetric double-zero bifurcation becomes a k : 2 strong resonance bifurcation of the original system. Note that the Hopf bifurcation curve will have many more resonance points, and that its structure in the full system will be rather complicated.


Recall that bifurcations of equilibria for $Z$ correspond to bifurcations of quasi-periodic tori of the integrable vector field $Y = \frac{1}{\ell} \omega\, \partial/\partial x + \mu^2 Z$. Furthermore, recall that $X$ is considered on a compact neighbourhood $\mathcal{C}$ of $T^m \times \{0\} \times \{\sigma_0\}$. For definiteness, it is assumed that $\mathcal{C}$ is of the form $T^m \times K \times \Sigma$, where $K \subset \mathbb{C}$ and $\Sigma \subset \mathbb{R}^q$ are compact neighbourhoods of $0$ and $\sigma_0$, respectively. The non-integrable vector field $X$ is then $\mu^3$-close to $Y$ in the $C^\infty$ topology, uniformly on $\mathcal{C}$. Note that the inverse scaling transformation $\mathrm{sc}^{-1}$ maps $\mathcal{C}$ to $T^m \times \mu K \times \mu^2 \Sigma$: as $\mu$ tends to zero, the corresponding portion of phase space and parameter space shrinks down to $T^m \times \{0\} \times \{0\}$. Performing more than one normal form step shows that the non-integrable phenomena in the bifurcation diagram of $X$ shrink in size faster than any power of $\mu$, as $\mu$ tends to 0; see [19] for details.

From the above, it follows that there is some constant $\mu_0 > 0$, such that if $0 \le \mu < \mu_0$, then all saddle–node bifurcations of equilibria of the family $Y$ persist in the family $X$ as saddle–node bifurcations of quasi-periodic orbits of the family $X$. This is conjectured to be also the case for quasi-periodic cusp bifurcations. Quasi-periodic Hopf bifurcations are known to persist only on subsets of positive measure of certain smooth codimension-one manifolds in parameter space. The aim of the present chapter was to shed some light on what happens outside that bifurcation set. Since the theory of global quasi-periodic bifurcations is not as well developed as the local theory, we restrict ourselves to the vague heuristic remark that all complexity connected to homoclinic bifurcations [16], associated with Bogdanov–Takens and symmetric double-zero bifurcations, is to be expected in the system.

Note that all resonances $\langle k, \omega\rangle + \ell\alpha = 0$ are subject to this analysis, which yields bifurcation diagrams for $\varepsilon$ smaller than some $\varepsilon_0$. As $\varepsilon \to 0$ the saddle–node bifurcations (in the case of the k : 1 resonances, and the period-doubling bifurcations in the case k : 2) form the boundary of resonance tongues that end at $\alpha = -\ell^{-1} \langle k, \omega\rangle$. Recall that the set of all $\alpha$ of this form is dense in $\mathbb{R}$. The resonance regions considered here form small regions in the heart of Chenciner bubbles. However, in general, $\varepsilon_0$ depends on the resonance: the present results are not uniform in $k$ and $\ell$.

More can be said about the persistence of the quasi-periodic Hopf bifurcations, if the analysis of the present chapter is reapplied. Rescaling time by $t = \mu^{-2} \tau$ and introducing a new driving frequency by setting $\frac{1}{\ell} \omega = \mu^2 \tilde\omega$, one obtains the form of the full system (5.19)

\[ X = \tilde\omega \frac{\partial}{\partial x} + Z + \mu g(x, z, \bar z, \sigma, \mu) \frac{\partial}{\partial z}. \]

Note that this can again be viewed as a quasi-periodic perturbation of a planar vector field. In particular, low-order normal resonances at the Hopf bifurcation curve will lead to resonance phenomena as analysed in this chapter; compare the sketch in figure 5.3. This analysis can be reapplied a finite number of times, leading to more and more normal resonance modifications of the original Hopf bifurcation curve. It is known that the quasi-periodic Hopf bifurcation set $H$ of a quasi-periodically driven system has a subset $H_c$ that is also a subset of a smooth codimension-one manifold $E$, nowhere dense relative to $E$ but having positive measure in $E$. The present results strengthen the conjecture that the full set $H$ has the same properties, and that the boundary of $H$ relative to $E$ includes infinitely many points of k : 1 and k : 2 bifurcations. However, it is an open question what happens at limit points of infinitely many of these low-order normal resonance points.

[Figure 5.3: sketch with axes lambda1 and lambda2; cusp points are marked C and resonances are marked k′ : 1 and k″ : 2.]

Figure 5.3. In the presence of quasi-periodicity $k' : 1$ and $k'' : 2$ resonances (with $k'$ and $k''$ presumably quite large) modify the original Hopf bifurcation curve of the bifurcation diagram; cf figure 5.1(e). Only two resonances are shown but in reality there are many more. Note the difference from the periodic case where the diagram does not change qualitatively if a periodic perturbation is applied (although the interpretation of the bifurcation points changes). Note also that the location of a $k : \ell$ resonance, for specific $k$ and $\ell$, depends on the perturbation strength $\mu$. As $\mu$ decreases, the resonance moves along the Hopf curve away from the Bogdanov–Takens point.


Acknowledgements

I gratefully acknowledge helpful remarks and comments of Claude Baesens, Henk Broer, Heinz Hanßmann, Hans de Jong, Àngel Jorba, Bernd Krauskopf, Robert Reid and Jordi Villanueva during the preparation of this chapter. Moreover, I want to thank Floris Takens for many inspiring discussions, in the course of which he kindled my interest in dynamical systems. Without him, the previous lines would not have been written. The research for this chapter has been done at Warwick University, supported by the European Commission Grant ‘Nonlinear dynamics and spatially extended systems’, and at the University of Amsterdam, supported by the CeNDEF pioneer grant of the Netherlands Organisation for Scientific Research (NWO).

References

[1] Arnol’d V I 1988 Geometrical Methods in the Theory of Ordinary Differential Equations 2nd edn (Heidelberg: Springer)
[2] Bogdanov R I 1976 Bifurcations of a limit cycle of a family of vector fields in the plane Trudy Sem. Im. I. G. Petrovskogo 2 23–36 (Engl. transl. 1981 Selecta Math. Sov. 1 373–88)
[3] Bogdanov R I 1976 A versal deformation of a singular point of a vector field in the plane in the case of vanishing eigenvalues Trudy Sem. Im. I. G. Petrovskogo 2 37–65 (Engl. transl. 1981 Selecta Math. Sov. 1 389–421)
[4] Braaksma B L J and Broer H W 1987 On a quasi-periodic Hopf bifurcation Ann. Inst. H. Poincaré Analyse Nonlinéaire 4 115–68
[5] Broer H W, Huitema G B, Takens F and Braaksma B L J 1990 Unfoldings and bifurcations of quasi-periodic tori Mem. Am. Math. Soc. 83(421) 1–175
[6] Broer H W and Takens F 1989 Formally symmetric normal forms and genericity Dynam. Rep. 2 39–59
[7] Broer H W and Vegter G 1992 Bifurcational aspects of parametric resonance Dynam. Rep. (New Series) 1 1–53
[8] Chenciner A 1985 Bifurcations de points fixes elliptiques, I. Courbes invariantes Publ. Math. IHES 61 67–127
[9] Chenciner A 1985 Bifurcations de points fixes elliptiques, II. Orbites périodiques et ensembles de Cantor invariants Inv. Math. 80 81–106
[10] Chenciner A 1988 Bifurcations de points fixes elliptiques, III. Orbites périodiques de ‘petites’ périodes et élimination résonnante des couples de courbes invariantes Publ. Math. IHES 66 5–91
[11] Dumortier F, Roussarie R, Sotomayor J and Żołądek H 1991 Bifurcations of Planar Vector Fields (Lecture Notes in Mathematics 1480) (Berlin: Springer)
[12] Gambaudo J M 1985 Perturbation of a Hopf bifurcation by an external time-periodic forcing J. Diff. Eqns 57 172–99
[13] Holmes P and Rand D 1978 Bifurcations of the forced Van der Pol oscillator Q. Appl. Math. 35 495–509
[14] Krauskopf B 2001 Strong resonances and Takens’s Utrecht preprint, chapter 4 of this volume

Implicit formalism for affine-like maps and parabolic composition


[15] Kuznetsov Y A 1995 Elements of Applied Bifurcation Theory (Heidelberg: Springer)
[16] Palis J and Takens F 1993 Hyperbolicity and Sensitive Chaotic Dynamics at Homoclinic Bifurcations (Cambridge: Cambridge University Press)
[17] Takens F 1974 Forced oscillations and bifurcations Applications of Global Analysis I Communications of the Mathematical Institute Rijksuniversiteit Utrecht 3—1974 (reprinted in chapter 1 of this volume)
[18] Takens F 1974 Singularities of vector fields Publ. Math. IHES 43 47–100
[19] Wagener F O O 2000 Low order normal resonances in the quasi-periodic Hopf bifurcation Preprint


Chapter 6

Generic unfolding of the nilpotent saddle of codimension four

Freddy Dumortier and Peter Fiddelaers
Limburgs Universitair Centrum

Chengzhi Li
Peking University

We consider the four-parameter family of planar vector fields

\[ y \frac{\partial}{\partial x} + \left[ x^3 + \mu_2 x + \mu_1 + (\nu + bx + \varepsilon x^2 + x^3 h(x, \lambda)) y + O(y^2) \right] \frac{\partial}{\partial y} \]

where $\varepsilon = \pm 1$ and $\lambda = (\mu_1, \mu_2, \nu, b)$ is small. It represents the generic universal unfolding of a nilpotent saddle of codimension four. We present the bifurcation diagram. It describes the transition between two nilpotent saddle bifurcations of codimension three, corresponding to $b > 0$ and $b < 0$ respectively. It also describes the link between two Bogdanov–Takens bifurcations of codimension three. We find that any system in this family has at most two limit cycles. Taking $h(x, \lambda) \equiv 0$ we get Liénard equations of type (3, 2) having a negative Poincaré index. Among such systems the local four-parameter family contains all possible phase portraits with at most two limit cycles and, hence, it presumably contains all possible phase portraits.

6.1 General setting and some history of the subject

At the beginning of the seventies, Floris Takens published some papers towards a generalisation of Thom’s elementary catastrophe theory to differential equations. In [19] he started classifying the singularities of vector fields in this spirit, emphasizing the occurrence in generic families or, in other words, the ‘codimension’. In [20] he started the study of the unfoldings of these singularities,


Freddy Dumortier, Peter Fiddelaers and Chengzhi Li

introducing among others what is now commonly called the Bogdanov–Takens bifurcation. Since then a lot of papers have appeared that improved or generalized his results. Concerning the classification of the singularities we can mention, for example, [5, 8]. Concerning the unfoldings or, in other words, concerning the local bifurcations, the harvest of papers is really abundant and we will not try to give a full account of them here. From dimension three on the variety of dynamical features is so rich that a full description of the topology of the unfolding seems hopeless. The unfoldings of singularities of codimension two have not yet been fully treated and codimension three reveals itself to be extremely complicated. In the plane the only delicate problem deals with the limit cycles, their number and bifurcation patterns. We can say that essentially the generic local bifurcations depending on at most three parameters are known. Some technical problems remain open, but the predicted bifurcation diagrams are most probably the correct ones. From these studies it is not yet possible to extract interesting general features that can be proven for the totality of generic local unfoldings, so that more ‘experiments’ are necessary to shed some light in this field. In this paper we study the generic unfoldings of the nilpotent saddle of codimension four. It gives us the opportunity of presenting the totality of techniques that are currently being used in the field, and of surveying technical problems that need to be solved. Moreover, as can be seen in the next section, the case under consideration is more interesting than merely being the next one in a systematic treatment.

6.2 Specific setting and presentation of the results

In [13] the generic three-parameter unfoldings of non-cuspidal nilpotent singularities of codimension three were investigated. The unfoldings have the form

\[ y \frac{\partial}{\partial x} + \left[ \varepsilon x^3 + \mu_2 x + \mu_1 + y(\nu + b(\lambda) x + x^2 + x^3 h(x, \lambda)) + y^2 Q(x, y, \lambda) \right] \frac{\partial}{\partial y} \]

where all functions are $C^\infty$, $\lambda = (\mu_1, \mu_2, \nu)$ is small, $\varepsilon = \pm 1$, $b(0) = b \ne 0$, and $Q(x, y, \lambda) = O(|x, y, \lambda|^N)$ for some a priori given $N$. The unperturbed system, corresponding to $\lambda = 0$, has a singularity at the origin of saddle, focus, or elliptic type, depending on whether $\varepsilon = 1$, $\varepsilon = -1$ and $0 < |b| < 2\sqrt{2}$, or $\varepsilon = -1$ and $|b| > 2\sqrt{2}$, respectively. Hence, by perturbation there are three topologically distinct cases. The proposed bifurcation diagrams for all three cases were given in [13]. The bifurcation diagram for the saddle case with $b > 0$ is illustrated in figures 6.1 and 6.2. Since the bifurcation set is a topological cone with vertex at $0 \in \mathbb{R}^3$, it is sufficient to draw its intersection with the boundary of a box having faces $\mu_1 = \pm 1$, $\mu_2 = \pm 1$ and $\nu = \pm N$ for $N$ sufficiently large; see figure 6.1. The most interesting part is the intersection with the face $\mu_2 = -1$ shown in figure 6.2 together with all the different structurally stable types of phase portrait.
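The threshold $2\sqrt{2}$ can be made plausible by a short computation that is not spelled out in the text and is offered here only as a hedged sketch. Keeping the leading part $\dot x = y$, $\dot y = \varepsilon x^3 + bxy$ of the unperturbed system (an assumption of this sketch), a parabola $y = c x^2$ is invariant when $2c^2 - bc - \varepsilon = 0$, whose discriminant is $b^2 + 8\varepsilon$. The function names below are hypothetical.

```python
import math


def invariant_parabola_slopes(eps, b):
    """Real c such that y = c*x**2 is invariant for x' = y, y' = eps*x**3 + b*x*y.

    Substituting y = c*x**2 gives 2*c**2*x**3 = (eps + b*c)*x**3, that is
    2*c**2 - b*c - eps = 0.
    """
    disc = b * b + 8 * eps
    if disc < 0:
        return []
    root = math.sqrt(disc)
    return sorted({(b - root) / 4, (b + root) / 4})


def nilpotent_type(eps, b):
    """Type of the singularity as stated in the text, for eps = +1 or -1 and b != 0."""
    if eps == 1:
        return "saddle"
    if abs(b) < 2 * math.sqrt(2):
        return "focus"
    if abs(b) > 2 * math.sqrt(2):
        return "elliptic"
    return "boundary case"


for eps, b in ((1, 1.0), (-1, 2.0), (-1, 3.0)):
    print(eps, b, nilpotent_type(eps, b), invariant_parabola_slopes(eps, b))
```

In this sketch the focus case is exactly the one without real invariant parabolas, which matches the condition $0 < |b| < 2\sqrt{2}$ for $\varepsilon = -1$.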

The nilpotent saddle of codimension four


Figure 6.1. Conic bifurcation diagram of the codimension-three nilpotent saddle (b > 0).

In this figure $E$ represents both the regions to the left of $SN_\ell$ and to the right of $SN_r$. The significance of the different symbols can be found in [13]. For the study of the saddle of codimension three we refer to [10, 13, 21, 22], warning, however, that there is a gap in [22]. The bifurcation diagram for the saddle with $b < 0$ is similar, since it can be transformed to the former one by means of $(x, y, \mu_1, \mu_2, \nu, t) \mapsto (-x, -y, -\mu_1, \mu_2, \nu, t)$. An interesting problem is to consider the transition between the two cases, i.e. to study the most generic passage through $b = 0$. It will also turn out to contain a link between two differently oriented Bogdanov–Takens bifurcations of codimension three, as studied in [12]. Nevertheless, in the whole bifurcation diagram we will only encounter phase portraits exhibiting at most two limit cycles. Unfortunately this fact requires a quite involved proof, based on two different rescalings (the principal rescaling and the central rescaling) as was already the case in [13]. After central rescaling we will, however, not find a limiting system consisting of a single integrable vector field, as has been the case till now, but a one-parameter family of Hamiltonian vector fields. Since the study of the zeros of the related Abelian integrals turned out to be quite technical and rather lengthy we preferred to publish it separately in [9]. In this paper we are going to show how this fits in with all the other results that we need. The aim of the present paper is, hence, to study the saddle case of codimension four with $b(0) = 0$ as extra degeneracy. More precisely, we


consider singularities which belong to one of the following two submanifolds of codimension four in the space of all germs of singular vector fields at $0 \in \mathbb{R}^2$:

\[ S^4_\pm = \left\{ X \;\middle|\; j^4 X(0) \sim y \frac{\partial}{\partial x} + \left( x^3 + y(\pm x^2 + a x^3) \right) \frac{\partial}{\partial y},\ a \in \mathbb{R} \right\}. \]

Any four-parameter family $X_\lambda$ cutting $S^4_\pm$ transversally can be brought, by $C^\infty$-equivalence, to the form

\[ y \frac{\partial}{\partial x} + \left[ (x^3 + \mu_2(\lambda) x + \mu_1(\lambda)) + y(\nu(\lambda) + b(\lambda) x + \varepsilon x^2 + x^3 h(x, \lambda)) + y^2 Q(x, y, \lambda) \right] \frac{\partial}{\partial y} \]

where $\varepsilon = \pm 1$, all functions are $C^\infty$, $\mu_1(0) = \mu_2(0) = \nu(0) = b(0) = 0$ and $Q(x, y, \lambda) = O(|x, y, \lambda|^N)$ for an a priori given $N$. The genericity of the family is given by

\[ \frac{D(\mu_1, \mu_2, \nu, b)}{D(\lambda_1, \lambda_2, \lambda_3, \lambda_4)} \ne 0. \]

Therefore, to study the generic four-parameter families described above, we can simply consider

\[ X_\lambda : \quad \begin{cases} \dot x = y, \\ \dot y = x^3 + \mu_2 x + \mu_1 + y(\nu + bx + \varepsilon x^2 + x^3 h(x, \lambda)) + y^2 Q(x, y, \lambda), \end{cases} \tag{6.1} \]

where $\varepsilon = \pm 1$, $\lambda = (\mu_1, \mu_2, \nu, b)$ is small, and $h$ and $Q$ are $C^\infty$, $Q(x, y, \lambda) = O(|x, y, \lambda|^N)$ for a given $N$. We will also keep $\varepsilon = +1$, which can be obtained from $\varepsilon = -1$ by the change $(x, y, \mu_1, \mu_2, \nu, b, t) \mapsto (-x, y, -\mu_1, \mu_2, -\nu, -b, -t)$, hence, involving a time reversal.

Clearly the critical points of (6.1) are situated on the $x$-axis, where they satisfy $x^3 + \mu_2 x + \mu_1 = 0$. Their bifurcation set $SN$ is given by $SN = \{27\mu_1^2 + 4\mu_2^3 = 0\}$. Let us also define $R_+ = \{27\mu_1^2 + 4\mu_2^3 > 0\}$ and $R_- = \{27\mu_1^2 + 4\mu_2^3 < 0\}$. The three regions are cone-like from the origin consisting of curves $(\mu_1(\tau), \mu_2(\tau), \nu(\tau), b(\tau)) = (\tau^3 \bar\mu_1, \tau^2 \bar\mu_2, \tau \bar\nu, \tau \bar b)$ with $(\bar\mu_1, \bar\mu_2, \bar\nu, \bar b) \in S^3$. Let us from now on speak of a cone-like set of ‘principal’ type, if it consists of such curves, with $(\bar\mu_1, \bar\mu_2, \bar\nu, \bar b)$ subject to a restriction on $S^3$. We will speak of a cone-like set of ‘central’ type if it consists of curves $(\tau^3 \bar\mu_1, \tau^2 \bar\mu_2, \tau^2 \bar\nu, \tau \bar b)$, with $(\bar\mu_1, \bar\mu_2, \bar\nu, \bar b)$ subject to a restriction on $S^3$.

The set $SN \cap S^3$ has the shape of a ‘flying saucer’, as is shown in figure 6.3. In figure 6.3(a) we draw the circle $\bar\nu^2 + \bar b^2 = 1$, representing $SN_0 \cap S^3$, with $SN_0 = SN \cap \{\mu_1 = \mu_2 = 0\}$, all other $(\nu, b)$-values lying inside this circle. In figure 6.3(b) we draw the shape of the intersections $SN \cap S^3 \cap \{(r\bar\nu, r\bar b) \mid r \in \mathbb{R}\}$ for the different $(\bar\nu, \bar b) \in S^1$.
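The role of the discriminant $27\mu_1^2 + 4\mu_2^3$ and of the principal-type curves can be illustrated in a few lines. In the sketch below, whose function names are hypothetical, the number of critical points is read off from the sign of the discriminant; since the discriminant is multiplied by $\tau^6$ along a principal-type curve, the region $R_+$, $R_-$ or $SN$ does not change for $\tau > 0$. The equality test with zero is only meaningful for exactly representable inputs such as integers.

```python
def critical_point_count(mu1, mu2):
    """Number of distinct real roots of x**3 + mu2*x + mu1."""
    disc = 27 * mu1 ** 2 + 4 * mu2 ** 3
    if disc > 0:
        return 1                                   # region R+
    if disc < 0:
        return 3                                   # region R-
    return 1 if mu1 == 0 and mu2 == 0 else 2       # on SN (triple or double root)


def principal_curve(point, tau):
    """Point (tau^3*mu1, tau^2*mu2, tau*nu, tau*b) on the principal-type curve."""
    mu1, mu2, nu, b = point
    return (tau ** 3 * mu1, tau ** 2 * mu2, tau * nu, tau * b)


print(critical_point_count(0, -1))    # x^3 - x: roots -1, 0, 1
print(critical_point_count(0, 1))     # x^3 + x: single root 0
print(critical_point_count(2, -3))    # (x - 1)^2 (x + 2): on SN
base = (0.1, -0.9, 0.2, 0.3)
for tau in (1.0, 0.5, 0.1):
    mu1, mu2, _, _ = principal_curve(base, tau)
    print(tau, critical_point_count(mu1, mu2))
```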

The nilpotent saddle of codimension four

Figure 6.2. The face µ2 = −1 in figure 6.1, with b > 0.


Freddy Dumortier, Peter Fiddelaers and Chengzhi Li

Figure 6.3. Representation of $SN$, the bifurcation set of critical points.

There are two curves $NS_-$ and $NS_+$ of nilpotent saddles of codimension three situated at respectively $\{\mu_1 = \mu_2 = \nu = 0,\ b < 0\}$ and $\{\mu_1 = \mu_2 = \nu = 0,\ b > 0\}$. There are two surfaces $SH^2_+$ and $SH^2_-$ of semi-hyperbolic bifurcations of codimension two, situated at respectively $\{\mu_1 = \mu_2 = 0,\ \nu > 0\}$ and $\{\mu_1 = \mu_2 = 0,\ \nu < 0\}$. We are now able to state a theorem, using the notations and definitions given above.

Theorem 6.1 (local bifurcations). Let $X_\lambda$ be a family of planar vector fields as defined in (6.1). Then for $(\mu_1, \mu_2, \nu, b) \sim (0, 0, 0, 0)$ and $(x, y) \sim (0, 0)$, the following results hold.

(i) The bifurcation set of the critical points of $X_\lambda$ is given by $SN = \{27\mu_1^2 + 4\mu_2^3 = 0\}$.

(ii) There are two curves $NS_- = \{\mu_1 = \mu_2 = \nu = 0,\ b < 0\}$ and $NS_+ = \{\mu_1 = \mu_2 = \nu = 0,\ b > 0\}$ of nilpotent saddle bifurcations of codimension three, while outside $\{\nu = 0\}$ the set $SN_0 = SN \cap \{\mu_1 = \mu_2 = 0\}$ consists of two surfaces $SH^2_+$ and $SH^2_-$ of semi-hyperbolic bifurcations of codimension two; their intersections with $S^3_\varepsilon$, for $\varepsilon > 0$ small, are represented in figure 6.3.

(iii) In the set $SN \setminus SN_0$ there are two curves $NC$ and $NC_r$ of nilpotent cusp bifurcations of codimension three having as asymptotics $(\mu_1, \mu_2, \nu, b) = (-\tfrac14 b^3, -\tfrac34 b^2, \tfrac14 b^2, b)$ for respectively $b < 0$ or $b > 0$. There are also four surfaces of Bogdanov–Takens bifurcations (of codimension two), connecting respectively $NS_+$ to $NC$, $NC$ to $NS_-$, $NS_-$ to $NC_r$ and $NC_r$ to $NS_+$. All other local bifurcations are saddle–node bifurcations (semi-hyperbolic bifurcations of codimension one).

(iv) In $R_+ = \{27\mu_1^2 + 4\mu_2^3 > 0\}$ the vector fields have as unique singularity a hyperbolic saddle and no bifurcations occur.

(v) In $R_- = \{27\mu_1^2 + 4\mu_2^3 < 0\}$ the vector fields have three singularities, all non-degenerate, of which two are hyperbolic saddles and one is an anti-saddle (a singularity having index 1). There is a 3-disc $H$, having as boundary the set of nilpotent bifurcations, and consisting of Hopf–Takens bifurcations for the anti-saddle. All these Hopf–Takens bifurcations are of codimension one, except for two surfaces of Hopf–Takens bifurcations of codimension two, connecting respectively $NC$ to $NS_+$ and $NC_r$ to $NS_-$. There are no other local bifurcations in $R_-$.

In theorem 6.1 we have a full description of all possible local bifurcations. The proof will be given in section 6.3 and will merely rely on normal form calculations. This elaboration will not permit us to get the required results on cone-like neighbourhoods of the bifurcation curves and surfaces.

Conjecture 6.2. For both $NS_-$ and $NS_+$ there exists a cone-like neighbourhood of central type consisting of a trivial one-parameter family of nilpotent saddle bifurcations of codimension three.

In order to prove this conjecture we first need to solve the problem encountered in [22] concerning the Abelian integrals in the study of the nilpotent saddle of codimension three. Besides this problem we do not expect further complications.

In theorem 6.4 below we will prove that there exist cone-like neighbourhoods of principal type of $SH^2_+$ and $SH^2_-$ consisting of a trivial two-parameter family of semi-hyperbolic bifurcations of codimension two. For the rest of the paper we then concentrate on the complement inside $R_-$ of small conic neighbourhoods of central type of $NS_- \cup NS_+$ and of small conic neighbourhoods of principal type of $SH^2_+ \cup SH^2_-$. We expect the bifurcation diagram in such a complement to look like that sketched in figure 6.4.

Let us give a short description of the bifurcation diagram represented in figure 6.4.
On the top and bottom faces of the picture we meet a slice of the bifurcation diagram of the nilpotent saddle of codimension three, illustrated in figure 6.2. On the front and back faces we have no bifurcations and encounter stable systems with two saddles and one node. On the left and right faces we find bifurcations occurring in $SN \setminus V$, i.e. bifurcations related to a double zero. The nilpotent cusp bifurcation of codimension three occurs at the points $NC$ and $NC_r$; the degenerate two-saddle cycle bifurcation happens at the points $DTSC$ and $DTSC_r$ (at one saddle the ratio of hyperbolicity is equal to one while the other saddle is attracting resp. repelling); the homoclinic loop bifurcation of order two occurs along the curve $DL$ (resp. $DL_r$), joining the points $NC$ and $DTSC$ (resp. $NC_r$ and $DTSC_r$), and is in the boundary of the respective surfaces of left (and right) double limit cycle bifurcations $DC$ and $DC_r$.

Conjecture 6.3. For both $NC$ and $NC_r$ there exist cone-like neighbourhoods of central type consisting of a trivial one-parameter family of nilpotent cusp bifurcations (Bogdanov–Takens bifurcations) of codimension three.


Figure 6.4. The bifurcation diagram inside R− \V for fixed µ2 .

The proof of this conjecture requires an adaptation of the study made in [12]. We believe that this is possible with the existing techniques and known results, but the elaboration might be quite lengthy and we do not work it out. We prefer to give full attention to the way in which all these local bifurcations of codimension one, two and three fit together. We are now able to state a second theorem.

Theorem 6.4. Let $X_\lambda$ be a family of planar vector fields as defined in (6.1). Then for $(\mu_1, \mu_2, \nu, b) \sim (0, 0, 0, 0)$ and $(x, y) \sim (0, 0)$ the following results hold.

(i) There exist cone-like neighbourhoods of principal type of $SH^2_+$ and $SH^2_-$ consisting of a trivial one-parameter family of semi-hyperbolic bifurcations of codimension two.

(ii) There exist arbitrarily small cone-like neighbourhoods of central type $V_-$, $V_+$, $V$, $V_r$ of respectively $NS_-$, $NS_+$, $NC$ and $NC_r$ and arbitrarily small cone-like neighbourhoods of principal type $W_+$ and $W_-$ of respectively $SH^2_+$ and $SH^2_-$ such that for parameter values outside these neighbourhoods and inside $\overline{R}_- = \{27\mu_1^2 + 4\mu_2^3 \le 0\}$ the bifurcation diagram of $X_\lambda$ is a trivial $\mu_2$-family of bifurcation diagrams as represented in figure 6.4.

The proof of theorem 6.4 is spread over sections 6.4–6.7. In section 6.4 we introduce two kinds of rescaling, the principal one and the central one. The necessity of using two rescalings was already present in the study of the nilpotent saddle of codimension three; see [13]. In [13] the central rescaling led to a problem of calculating Abelian integrals related to a perturbation from a unique (non-Hamiltonian) integrable system. In the codimension-four case the central rescaling leads to a perturbation problem from a one-parameter family of Hamiltonian vector fields. It is the first case in which this difficulty shows up. The estimation of the related Abelian integrals was tractable, but far from trivial. We have published the results from the lengthy calculations in a separate paper [9].

Also in section 6.4 we describe all results that can be obtained by principal rescaling, and a few results that rely on central rescaling, like the Hopf bifurcation surface $H$ and the degenerate Hopf bifurcation curves $DH$ and $DH_r$. We also position the bifurcations of saddle connections. In section 6.5 we investigate the bifurcations of saddle connections, describe the bifurcation surfaces $SC_s$ and $SC_i$, as well as the behaviour near the line $TSC$. In section 6.6 we study $X_{\tilde\lambda,\delta}$ as a perturbation of the Hamiltonian system $X_{\tilde\lambda,0}$, give precise descriptions of the bifurcation surfaces $L$, $L_r$, $DC$ and $DC_r$, and the bifurcation curves $DL$ and $DL_r$. Finally, in section 6.7, we come back to some technical problems that remain to be dealt with in order to finish the proof completely.
To conclude this introduction it is interesting to remark that using the methods of [7] it is possible to show that the quadratic family

\[
\dot x = y + (1 + b)x^2 - xy, \qquad \dot y = \lambda_1 + \lambda_2 x + \lambda_3 y + (\lambda_2 + \lambda_3)x^2 - 2xy
\]

represents a nilpotent saddle bifurcation of codimension four for $(\lambda_1, \lambda_2, \lambda_3, b) \sim (0, 0, 0, 0)$.

By taking $h \equiv 0$ and $Q \equiv 0$ in the normal form in expression (6.1) the nilpotent saddle bifurcation of codimension four clearly shows up in Liénard equations of type (3, 2). In fact, the nilpotent saddle of codimension four is the most complicated singularity occurring in Liénard equations of type (3, 2) with negative Poincaré index. It clearly contains all possible phase portraits having at most two limit cycles. Presumably two will be the maximum number of limit cycles that Liénard equations of type (3, 2), with a negative Poincaré index, can have; there is however no proof for this statement. Hence, our study is complementary to the study in [15], where Liénard equations of type (3, 2) with a positive Poincaré index were considered. We also refer to [15] for a number of models in which one can expect to find interesting applications.


6.3 Local bifurcations

In this section we will prove theorem 6.1 and, therefore, investigate directly the form $X_\lambda$ itself. The techniques we use are rather standard, based on normal form calculations and the verification of transversality conditions, partially with computer assistance. We will, hence, keep the presentation as short as possible.

The critical points of $X_\lambda$ are given by $y = 0$ and $x^3 + \mu_2 x + \mu_1 = 0$. Let $SN$ be the zero set of the discriminant of this last equation:

\[
SN = \{(\mu_1, \mu_2, \nu, b) \mid 27\mu_1^2 + 4\mu_2^3 = 0\}.
\]

We now verify that the critical points are non-degenerate outside $SN$. Let $m_0 = (x_0, 0)$ be any critical point. Calculating the 1-jet at $m_0$ we find

\[
-\mathrm{Det}(x_0, \lambda) = 3x_0^2 + \mu_2, \qquad \mathrm{Tr}(x_0, \lambda) = \nu + bx_0 + x_0^2 + x_0^3 h(x_0, \lambda).
\]

The determinant $\mathrm{Det}(x_0, \lambda)$ of the 1-jet is non-zero when $\lambda \notin SN$ and the saddle or focus/node nature is given by the sign of $\mathrm{Det}(x_0, \lambda)$. There exist three non-degenerate points in the region $R_- = \{27\mu_1^2 + 4\mu_2^3 < 0\}$ and one non-degenerate point in the region $R_+ = \{27\mu_1^2 + 4\mu_2^3 > 0\}$. The nature of these points can be described as follows: a focus or node is located between two saddles for $\lambda \in R_-$; there exists a hyperbolic saddle for $\lambda \in R_+$.

6.3.1 The Hopf singularities

The set of Hopf singularities of any codimension is contained in the set obtained by elimination of $x_0$ from the two equations

\[
\begin{cases}
\mathrm{Tr}(x_0, \lambda) = \nu + bx_0 + x_0^2 + x_0^3 h(x_0, \lambda) = 0, \\
x_0^3 + \mu_2 x_0 + \mu_1 = 0.
\end{cases}
\]

We may suppose that $\Delta = -(3x_0^2 + \mu_2) > 0$. Performing the translation $(x - x_0, y)$ and scaling both $y$ and time by a factor $\sqrt{\Delta}$, we change (6.1) into

\[
y\,\frac{\partial}{\partial x} + \Bigl( -x + \frac{3x_0}{\Delta}x^2 + \frac{x^3}{\Delta} + y\Bigl[\frac{b + 2x_0}{\sqrt{\Delta}}\,x + \frac{x^2}{\sqrt{\Delta}} + \frac{1}{\sqrt{\Delta}}\bigl((x + x_0)^3 h(x + x_0, \lambda) - x_0^3 h(x_0, \lambda)\bigr)\Bigr] + y^2 Q(x + x_0, y, \lambda) \Bigr)\frac{\partial}{\partial y}.
\]

The Lyapunov coefficient of order one is given by (see [1, 13])

\[
-\frac{1}{\sqrt{\Delta}}\Bigl(1 + A + 3\Delta\,\frac{\partial Q}{\partial y}(x_0, 0, \lambda)\Bigr) + \frac{b + 2x_0 + B}{\sqrt{\Delta}}\Bigl(\frac{-3x_0}{\Delta} - Q(x_0, 0, \lambda)\Bigr)
\tag{6.2}
\]

where

\[
\begin{aligned}
A &= \text{coefficient of } x^2 \text{ in } (x + x_0)^3 h(x + x_0, \lambda) - x_0^3 h(x_0, \lambda), \\
B &= \text{coefficient of } x \text{ in } (x + x_0)^3 h(x + x_0, \lambda) - x_0^3 h(x_0, \lambda).
\end{aligned}
\]

Clearly $A$, $B$ are of the form $A = x_0 \bar A$, $B = x_0^2 \bar B$. The expression (6.2) has the same sign as

\[
\bigl(-3x_0 - \Delta\,Q(x_0, 0, \lambda)\bigr)\bigl(b + 2x_0 + x_0^2 \bar B\bigr) - \Delta\Bigl(1 + x_0 \bar A + 3\Delta\,\frac{\partial Q}{\partial y}(x_0, 0, \lambda)\Bigr).
\]

Using the new parameter $c = b + 2x_0 + x_0^2 \bar B$ this expression becomes

\[
-3x_0 c - \Delta\Bigl(1 + x_0 \bar A + c\,Q(x_0, 0, \lambda) + 3\Delta\,\frac{\partial Q}{\partial y}(x_0, 0, \lambda)\Bigr).
\tag{6.3}
\]

The Hopf singularities of codimension $\ge 2$ can only appear when this last expression is zero. If we suppose that this holds, and calculate the second Lyapunov coefficient, we find that this coefficient has the same sign as an expression of the form

\[
x_0^2 c^2 f(x_0, c) \quad \text{with } f(0, 0) < 0.
\tag{6.4}
\]

We prefer not to include the expression or the calculation because they are both quite long. We performed it in Macsyma. Since we are only looking for small values of $x_0$ and $c$ we may conclude from (6.4) that we must have $x_0 = 0$ or $c = 0$ to have Hopf singularities of codimension $> 2$. But $x_0 = 0$ or $c = 0$, together with the condition that expression (6.3) is zero, imply that $\mathrm{Det}(x_0, \lambda) = -(3x_0^2 + \mu_2) = 0$. This shows that there are no Hopf singularities of codimension $> 2$ in the unfolding of the nilpotent saddle of codimension four. In view of proving theorem 6.4, more information on the Hopf–Takens bifurcations will be given in section 6.3.2.

6.3.2 Bifurcations along the set SN

The vector field $X_\lambda$ has a degenerate singular point for $\lambda \in SN = \{27\mu_1^2 + 4\mu_2^3 = 0\}$. Let $(x_0, 0)$ be this point. We have

\[
\begin{cases}
x_0^3 + \mu_2 x_0 + \mu_1 = 0, \\
3x_0^2 + \mu_2 = 0.
\end{cases}
\tag{6.5}
\]

The nilpotent bifurcations

The trace at the point $(x_0, 0)$ is given by

\[
\mathrm{Tr}(x_0, \lambda) = \nu + bx_0 + x_0^2 + x_0^3 h(x_0, \lambda).
\tag{6.6}
\]

The point $(x_0, 0)$ is nilpotent if $\mathrm{Tr}(x_0, \lambda) = 0$. Define $NB = SN \cap \{\mathrm{Tr}(x_0, \lambda) = 0\}$. Let $\lambda^0 \in NB \setminus \{0\}$, $\lambda^0 = (\mu_1^0, \mu_2^0, \nu^0, b^0)$. Let $\mu_1 = \mu_1^0 + M_1$, $\mu_2 = \mu_2^0 + M_2$, $\nu = \nu^0 + N$, $b = b^0 + B$, $x = x_0 + X$, $y = Y$ and $\Lambda = (M_1, M_2, N, B)$. We develop the family $X_\lambda$ in the coordinates $X$, $Y$ and the parameters $\Lambda$; $x_0$ enters the formula as an extra arbitrarily small parameter.

Taking into account that (6.5) holds and that $\mathrm{Tr}(x_0, \lambda^0) = 0$ we have

\[
\begin{aligned}
X_{\lambda^0 + \Lambda} = Y\frac{\partial}{\partial X} + \Bigl[ & (M_2 x_0 + M_1) + M_2 X + 3x_0 X^2 + X^3 \\
& + Y\bigl(N + Bx_0 + (b^0 + 2x_0 + B)X + X^2 + ((X + x_0)^3 h(X + x_0, \lambda^0 + \Lambda) - x_0^3 h(x_0, \lambda^0))\bigr) \\
& + Y^2 Q(X + x_0, Y, \lambda^0 + \Lambda) \Bigr]\frac{\partial}{\partial Y}.
\end{aligned}
\]

To determine the type of the singularity $(x_0, 0)$ we consider the 4-jet of $X_{\lambda^0}$ at $(x_0, 0)$:

\[
\begin{aligned}
j^4 X_{\lambda^0}((x_0, 0)) = Y\frac{\partial}{\partial X} + \bigl( & X^3 + b_{20}X^2 + Y(b_{11}X + b_{21}X^2 + b_{31}X^3) \\
& + Y^2(b_{02} + b_{12}X + b_{03}Y + b_{22}X^2 + b_{13}XY + b_{04}Y^2) \bigr)\frac{\partial}{\partial Y}
\end{aligned}
\tag{6.7}
\]

where

\[
\begin{aligned}
b_{20} &= 3x_0, \\
b_{21} &= 1 + \frac12 \frac{\partial^2}{\partial x^2}\bigl(x^3 h(x, \lambda^0)\bigr)\Big|_{x = x_0}, \\
b_{02} &= Q(x_0, 0, \lambda^0), \\
b_{11} &= b^0 + 2x_0 + \frac{\partial}{\partial x}\bigl(x^3 h(x, \lambda^0)\bigr)\Big|_{x = x_0}, \\
b_{31} &= \frac16 \frac{\partial^3}{\partial x^3}\bigl(x^3 h(x, \lambda^0)\bigr)\Big|_{x = x_0}, \\
b_{03} &= \frac{\partial Q}{\partial y}(x_0, 0, \lambda^0).
\end{aligned}
\]

Keeping $x_0 \neq 0$, and hence $b_{20} \neq 0$, leads to 'cuspidal' singularities. If $b_{11} \neq 0$ then $(x_0, 0)$ is a cusp singularity of codimension two. If $b_{11} = 0$, one can show (using a linear coordinate transformation, working with dual forms and using e.g. Macsyma) that (6.7) is $C^\infty$-equivalent to

\[
\begin{cases}
\dfrac{dx}{dt} = y, \\[1ex]
\dfrac{dy}{dt} = x^2 + y\bigl(c_{31}x^3 + O(x^4)\bigr) + y^2 O(|x, y|^3)
\end{cases}
\tag{6.8}
\]

with $c_{31} = (b_{20})^{-4}(-b_{21} + b_{21}b_{20}b_{02} + b_{31}b_{20} - 3b_{20}^2 b_{03}) = (3x_0)^{-4}(-1 + O(x_0))$. Since we are keeping $x_0$ small, we can conclude that the singularity $(x_0, 0)$ is a cusp singularity of codimension three.

The condition $x_0 = 0$, together with (6.5) and (6.6), implies that $\mu_1^0 = \mu_2^0 = \nu^0 = 0$; therefore $b^0 \neq 0$. Using Macsyma it is easily seen that $j^4 X_{(0,0,0,b^0)}(0, 0)$ is $C^\infty$-equivalent to

\[
\begin{cases}
\dfrac{dx}{dt} = y, \\[1ex]
\dfrac{dy}{dt} = x^3 + c_{40}x^4 + O(x^5) + y\bigl(c_{11}x + c_{21}x^2 + O(x^3)\bigr) + y^2 O(|x, y|^3)
\end{cases}
\tag{6.9}
\]

with $c_{11} = b^0$, $c_{21} = \dfrac{b_{11}b_{02} + 2b_{21}}{2}$, $c_{40} = \dfrac{b_{02}}{2}$. Condition (3 ) on p 2 of [13] becomes $5c_{21} - 3c_{11}c_{40} = 5 + O(b^0)$. Since we are looking for $b^0$ small, we can conclude that $(0, 0)$ is a nilpotent saddle of codimension three. Let us denote $NS = SN \cap \{\mu_1 = \mu_2 = \nu = 0\}$.

With the knowledge we have, we can easily prove that in each nilpotent case the family $X_{\lambda^0 + \Lambda}$ is a generic unfolding of the vector field $X_{\lambda^0}$. This gives nilpotent saddle bifurcations of codimension three and Bogdanov–Takens bifurcations of codimension two and three. The detailed elaboration follows a similar procedure as in sections 1–3 of [14].

We can also give more information about the nature of the limit cycles that bifurcate out of the cusps of codimension three and the nilpotent saddles of codimension three. Using the method of dual forms one shows that system (6.9) is $C^\infty$-equivalent to

\[
\begin{cases}
\dfrac{dx}{dt} = y, \\[1ex]
\dfrac{dy}{dt} = x^3 + y\Bigl(c_{11}x + \dfrac{5c_{21} - 3c_{11}c_{40}}{5}\,x^2 + O(x^3)\Bigr) + y^2 O(|x, y|^3).
\end{cases}
\tag{6.10}
\]

So, from (6.8) and (6.9) (see [12] and [13]) we see that the cusps and the saddles are of the same kind. In the parameter region with two limit cycles we get that the inner one is attracting and the outer one is repelling.

The semi-hyperbolic bifurcations

We suppose now that $\lambda^0 \in SN \setminus NB$, so that $\mathrm{Tr}(x_0, \lambda^0) \neq 0$. Let $\lambda^0 = (\mu_1^0, \mu_2^0, \nu^0, b^0)$, $\mu_1 = \mu_1^0 + M_1$, $\mu_2 = \mu_2^0 + M_2$, $\nu = \nu^0 + N$, $b = b^0 + B$, $x = x_0 + X$, $y = Y$ and $\Lambda = (M_1, M_2, N, B)$. Then

\[
X_{\lambda^0 + \Lambda} = Y\frac{\partial}{\partial X} + \bigl(a(\Lambda) + b(\Lambda)X + c(\Lambda)Y(1 + O((X, Y))) + 3x_0 X^2 + X^3\bigr)\frac{\partial}{\partial Y}
\]

where $a(\Lambda) = M_1 + x_0 M_2$, $b(\Lambda) = M_2$, $c(\Lambda) = \mathrm{Tr}(x_0, \lambda^0 + \Lambda)$ (with $c(0) = \mathrm{Tr}(x_0, \lambda^0) \neq 0$).

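Lemma 6.5. Let $m_0 = (x_0, 0)$, then

\[
j^2 X_{\lambda^0}(m_0) \sim c(0)\,Y\frac{\partial}{\partial Y} - \frac{3x_0}{c(0)}\,X^2\frac{\partial}{\partial X}
\]

for $\lambda^0 \in SN \setminus NB$ and $\lambda^0 \notin \{(\mu_1, \mu_2, \nu, b) \mid \mu_1 = \mu_2 = 0\}$;

\[
j^3 X_{\lambda^0}(m_0) \sim c(0)\,Y\frac{\partial}{\partial Y} - \frac{1}{c(0)}\,X^3\frac{\partial}{\partial X}
\]

for $\lambda^0 \in SN \setminus NB$ and $\lambda^0 \in \{(\mu_1, \mu_2, \nu, b) \mid \mu_1 = \mu_2 = 0\} \setminus \{0\}$.

Proof. The result follows by a standard centre manifold calculation.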

Lemma 6.6. The family $X_{\lambda^0 + \Lambda}$ is a generic semi-hyperbolic bifurcation of codimension one for $\lambda^0 \in SN \setminus (NB \cup \{(\mu_1, \mu_2, \nu, b) \mid \mu_1 = \mu_2 = 0\})$; and a generic semi-hyperbolic bifurcation of codimension two for $\lambda^0 \in (SN \setminus NB) \cap (\{(\mu_1, \mu_2, \nu, b) \mid \mu_1 = \mu_2 = 0\} \setminus \{0\})$.

Proof. Suppose $W_\Lambda : Y = \psi(X, \Lambda)$ is an equation of a centre manifold for the family. The restriction of $X_{\lambda^0 + \Lambda}$ to $W_\Lambda$ has the orbit equation

\[
\dot X = \psi(X, \Lambda)
\]

where $W_\Lambda$ is parametrised by $X$. Let us start by taking $\lambda^0 \in SN \setminus NB$. We consider

\[
\psi(X, \Lambda) = A(\Lambda) + B(\Lambda)X + C(\Lambda)X^2 + O(X^3)
\]

with $A(0) = B(0) = 0$ and $C(0) = -\dfrac{3x_0}{c(0)}$ as calculated above, and obtain the first-order terms of $A$ by standard centre manifold calculations applied to $X_{\lambda^0 + \Lambda}$. This gives

\[
A(\Lambda) = -\frac{a(\Lambda)}{c(\Lambda)} + O(\Lambda^2).
\]

Obviously $da(0) \neq 0$ and so $dA(0) \neq 0$. If $\lambda^0 \in (SN \setminus NB) \cap (\{(\mu_1, \mu_2, \nu, b) \mid \mu_1 = \mu_2 = 0\} \setminus \{0\})$, we consider

\[
\psi(X, \Lambda) = A(\Lambda) + B(\Lambda)X + C(\Lambda)X^2 + D(\Lambda)X^3 + O(X^4).
\]

Proceeding similarly we get

\[
a(\Lambda) + c(\Lambda)A(\Lambda) = O(\Lambda^2), \qquad b(\Lambda) + c(\Lambda)B(\Lambda) = O(\Lambda^2).
\]

The independence of $dA(0)$, $dB(0)$ follows from the independence of $da(0)$, $db(0)$.

6.4 Rescalings

In this section we start the proof of theorem 6.4. It will be based on the use of rescalings.

6.4.1 Principal rescaling

Exactly like in the study of the saddle of codimension three, we make a first rescaling, called the principal rescaling. It is expressed by:

\[
\begin{cases}
x = \tau \bar x, \\
y = \tau^2 \bar y, \\
\mu_1 = \tau^3 \bar\mu_1, \\
\mu_2 = \tau^2 \bar\mu_2, \\
\nu = \tau \bar\nu, \\
b = \tau \bar b, \\
t = \dfrac{1}{\tau}\,\bar t.
\end{cases}
\tag{6.11}
\]

As is usual in rescaling, we keep $(\bar x, \bar y) \in A$, with $A$ some large ball $B_R(0)$ in $\mathbb{R}^2$. This leads to results on small neighbourhoods of $(0, 0)$ in the original $(x, y)$-plane; these neighbourhoods shrink to the origin when $\tau \to 0$. However, the saddle-type dynamics permit us, exactly like in [13], to use the Poincaré–Bendixson theorem to show that the obtained results trivially extend to a uniform neighbourhood of $(0, 0)$. This would also easily follow from the family blow-up, as described in [6].

Under the principal rescaling the family $X_\lambda$ changes into

\[
y\frac{\partial}{\partial x} + \bigl(x^3 + \bar\mu_2 x + \bar\mu_1 + \bar\nu y + \tau y(\bar b x + x^2) + O(\tau^2)y\bigr)\frac{\partial}{\partial y}.
\tag{6.12}
\]

In this expression we have omitted the bars over $(x, y)$. For $\tau = 0$ this family becomes

\[
y\frac{\partial}{\partial x} + \bigl(x^3 + \bar\mu_2 x + \bar\mu_1 + \bar\nu y\bigr)\frac{\partial}{\partial y}.
\tag{6.13}
\]

This is a family of cubic Liénard equations with constant damping. The semi-hyperbolic bifurcations of codimension two can be studied in the charts $\{\bar\nu = 1\}$ and $\{\bar\nu = -1\}$ and statement (i) in theorem 6.4 immediately follows by transversality arguments.

In the principal chart $\{\bar\mu_2 = -1\}$ the phase portraits are like those in figure 6.5. Similarly as in [10, 13] one can show the genericity of the curves of superior and inferior saddle connections using the rotational property with respect to the parameter $\bar\nu$ and the semi-rotational property with respect to the parameter $\bar\mu_1$. The divergence of the vector fields (6.13) is equal to $\bar\nu$. So there can be no limit cycles for $\bar\nu \neq 0$. For $\bar\nu = 0$ we have a one-parameter family of Hamiltonian vector fields. For $\bar\nu \neq 0$, the bifurcation diagram remains the same for $\tau$ small. It is stable since it is defined by a transversality condition.
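The passage from (6.1) to (6.12) can be checked numerically. The sketch below assumes $\varepsilon = +1$ and $h \equiv Q \equiv 0$, in which case the $O(\tau^2)y$ remainder vanishes and (6.12) holds exactly; the function names and the sample values are illustrative assumptions.

```python
import math


def field(x, y, mu1, mu2, nu, b):
    """Right-hand side of (6.1) with eps = +1, assuming h = Q = 0."""
    return y, x ** 3 + mu2 * x + mu1 + y * (nu + b * x + x ** 2)


def rescaled_field(xb, yb, mu1b, mu2b, nub, bb, tau):
    """Right-hand side of (6.12) in barred variables, exact when h = Q = 0."""
    return yb, xb ** 3 + mu2b * xb + mu1b + nub * yb + tau * yb * (bb * xb + xb ** 2)


def pulled_back_field(xb, yb, mu1b, mu2b, nub, bb, tau):
    """Substitute the principal rescaling (6.11) into (6.1).

    With x = tau*xb, y = tau^2*yb and t = tb/tau one gets
    dxb/dtb = (dx/dt) / tau^2 and dyb/dtb = (dy/dt) / tau^3.
    """
    dx, dy = field(tau * xb, tau ** 2 * yb, tau ** 3 * mu1b,
                   tau ** 2 * mu2b, tau * nub, tau * bb)
    return dx / tau ** 2, dy / tau ** 3


def fields_agree(*args, rel_tol=1e-9):
    """Compare both expressions at one sample point (tolerance is an assumption)."""
    lhs = rescaled_field(*args)
    rhs = pulled_back_field(*args)
    return all(math.isclose(p, q, rel_tol=rel_tol) for p, q in zip(lhs, rhs))
```

For example, `fields_agree(0.7, -1.3, 0.2, -1.0, 0.5, -0.4, 0.05)` compares the two right-hand sides at one point of the rescaled phase plane.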


Figure 6.5. Phase portraits of the cubic Liénard equations with constant damping.

6.4.2 Central rescaling

For $\bar\nu = 0$ the situation in (6.13) degenerates. Therefore we consider the following blow-up in parameter space:

\[
\begin{cases}
\bar\nu = u\tilde\nu, \\
\bar\mu_1 = \tilde\mu_1, \\
\bar\mu_2 = \tilde\mu_2, \\
\bar b = \tilde b, \\
\tau = ur
\end{cases}
\tag{6.14}
\]

with $\tilde\nu^2 + r^2 = 1$. Using this blow-up, family (6.12) becomes

\[
y\frac{\partial}{\partial x} + \bigl(x^3 + \tilde\mu_2 x + \tilde\mu_1 + uy(\tilde\nu + r\tilde b x + r x^2) + O(u^2 r^2)y\bigr)\frac{\partial}{\partial y}.
\]

For $x$, $y$, $\tilde b$, $\tilde\mu_1$, $\tilde\mu_2$ in a compact domain there exists an $M > 0$ such that $|\tilde b x + x^2 + O(u)| < M$, if we suppose that $0 \le u \le u_0$, for a certain $u_0$ small enough. So we have

\[
\tilde\nu - rM \le \tilde\nu + r\tilde b x + r x^2 + r\,O(u) \le \tilde\nu + rM.
\]

If $\tilde\nu > 0$ and $0 \le r \le \dfrac{\tilde\nu}{2M}$, we have $\dfrac{\tilde\nu}{2} \le \tilde\nu + r\tilde b x + r x^2 + r\,O(u) \le \dfrac{3\tilde\nu}{2}$. If $\tilde\nu < 0$ and $0 \le r \le -\dfrac{\tilde\nu}{2M}$, we have $\dfrac{3\tilde\nu}{2} \le \tilde\nu + r\tilde b x + r x^2 + r\,O(u) \le \dfrac{\tilde\nu}{2}$. Under these conditions, the phase portraits and the bifurcations will be the ones given by (6.13) with $\bar\nu \neq 0$; see figure 6.5.

So, it remains to consider $r = 1$ and $|\tilde\nu| \le M_0$, with $M_0 > 0$ fixed and sufficiently big. This is the same as using the so-called central rescaling

\[
\begin{cases}
x = \delta \tilde x, \\
y = \delta^2 \tilde y, \\
\mu_1 = \delta^3 \tilde\mu_1, \\
\mu_2 = \delta^2 \tilde\mu_2, \\
\nu = \delta^2 \tilde\nu, \\
b = \delta \tilde b, \\
t = \dfrac{1}{\delta}\,\tilde t
\end{cases}
\tag{6.15}
\]

with $\delta > 0$ small, $(\tilde\mu_1, \tilde\mu_2, \tilde b) \in \{(\tilde\mu_1, \tilde\mu_2, \tilde b) \mid \tilde\mu_1^2 + \tilde\mu_2^2 + \tilde b^2 = 1\}$ or $(\tilde\mu_1, \tilde\mu_2, \tilde b)$ belonging to the boundary of some box $\{(\tilde\mu_1, \tilde\mu_2, \tilde b) \mid \max(c_1|\tilde\mu_1|, c_2|\tilde\mu_2|, c_3|\tilde b|) = 1\}$, with $c_i > 0$, and $|\tilde\nu| \le M$, $M > 0$ fixed and large. We also keep $(\tilde x, \tilde y) \in A$, with $A$ some large ball $B_R(0)$ in $\mathbb{R}^2$. Again this will apparently only lead to the description of the phase portraits of $X_\lambda$ on domains shrinking to the origin for $\delta \to 0$, but outside these domains the dynamics trivially extend to a uniform neighbourhood of $(0, 0)$.

By the rescaling (6.15), $X_\lambda$ is transformed into the expression

\[
X_{\tilde\lambda,\delta} + O(\delta^2)\frac{\partial}{\partial y}
\tag{6.16}
\]

with

\[
X_{\tilde\lambda,\delta} :
\begin{cases}
\dot x = y, \\
\dot y = x^3 + \tilde\mu_2 x + \tilde\mu_1 + \delta(\tilde\nu + \tilde b x + x^2)y,
\end{cases}
\tag{6.17}
\]

omitting the tildes over $(x, y)$.

Let us now first show how in the neighbourhood of $NS_+$ and $NS_-$ we can get results in cone-like neighbourhoods of principal type (obtained by principal rescaling) starting from the (conjectured) results in cone-like neighbourhoods of central type (obtained by central rescaling). For that we take $\tilde b = \pm 1$ with $(\tilde\mu_1, \tilde\mu_2)$ small. If $|\tilde\nu| \ge \nu_0 > 0$, then a divergence argument shows that for $(\tilde\mu_1, \tilde\mu_2)$ sufficiently small all phase portraits necessarily are the ones already described in family (6.13) for $\bar\nu \neq 0$. It remains to make a study in the cones

\[
C_\pm = \{(\delta^3\tilde\mu_1, \delta^2\tilde\mu_2, \delta^2\tilde\nu, \pm\delta) \mid (\tilde\mu_1, \tilde\mu_2, \tilde\nu) \in B,\ \delta \ge 0\}
\tag{6.18}
\]

where $B$ is some small ball $B_r(0)$ in $(\tilde\mu_1, \tilde\mu_2, \tilde\nu)$-space.

Let us now consider the complements of the cones $C_\pm$; i.e. we take $(\tilde\mu_1, \tilde\mu_2) \in S^1$, or better $\max(|\tilde\mu_1|, |\tilde\mu_2|) = 1$, and we take $(\tilde\nu, \tilde b)$ in a large compact ball $K = B_R(0)$. As we will be able to take $K$ as large as we want, this shows that we may take the ball $B$ in the definition of $C_\pm$ arbitrarily small.

If $\tilde\mu_1 = \pm 1$ with $\tilde\mu_2$ small and $(\tilde b, \tilde\nu)$ in an arbitrarily large $K$, then we have a structurally stable situation, namely a unique hyperbolic saddle point. Also for $\tilde\mu_2 = 1$ we have the same structurally stable situation.

The most interesting case is $\tilde\mu_2 = -1$ with $(\tilde\mu_1, \tilde\nu, \tilde b)$ in an arbitrarily large compact ball $L$. The bifurcation diagram (denoted by $\mathcal{B}$) is shown in figure 6.4; the situation near the line TSC (two-saddle cycle bifurcation curve) is illustrated in figure 6.6. Seen in the $(\tilde\mu_1, \tilde\nu, \tilde b)$-coordinates on $\{\tilde\mu_2 = -1\}$ we will find all the interesting changes in the bifurcation set $\mathcal{B}$ for $-\dfrac{2}{\sqrt3} \le \tilde b \le \dfrac{2}{\sqrt3}$.

Figure 6.6. Bifurcation diagram near the line TSC.

Figure 6.4 contains more than is actually detectable in the 3-space $\{\tilde\mu_2 = -1\}$. In fact, the contact of the surfaces of saddle connection $SC_s$ and $SC_i$ with the surfaces of saddle–node bifurcation cannot be detected by central rescaling. We found it by principal rescaling. In figure 6.4 we extended the bifurcation diagram $\mathcal{B}$ to the one that actually occurs for the family $X_\lambda$ in the cones $(\mu_1, \mu_2, \nu, b) \in \{(\tau^3\bar\mu_1, -\tau^2, \tau\bar\nu, \tau\bar b) \mid (\bar\mu_1, \bar\nu, \bar b) \in B_R(0)\}$ with $R > 0$ arbitrarily large.

6.4.3 Positioning local bifurcations in $X_{\tilde\lambda,\delta}$

Making a study of some local bifurcations in $X_{\tilde\lambda,\delta}$ is interesting in view of obtaining 'uniform' results, i.e. results that hold in cone-like neighbourhoods. We use the results in sections 6.3.1 and 6.3.2 for the expression $X_{\tilde\lambda,\delta}$ given in (6.17), with $\tilde\mu_2 = -1$; let us also denote it by

\[
X_{\tilde\lambda,\delta} :
\begin{cases}
\dot x = y, \\
\dot y = x^3 - x + \tilde\mu_1 + \delta(\tilde\nu + \tilde b x + x^2)y
\end{cases}
\tag{6.19}
\]

with $\tilde\lambda = (\tilde\mu_1, \tilde\nu, \tilde b)$ in an arbitrarily large compact ball $L$ and $\delta > 0$ small.

In $\tilde\lambda$-space we see that

\[
SN = \{\tilde\lambda \mid 27\tilde\mu_1^2 - 4 = 0\} = \Bigl\{\tilde\lambda \;\Big|\; \tilde\mu_1 = \pm\frac{2}{3\sqrt3}\Bigr\},
\]

with corresponding degenerate singular point at $(x_0, 0) = \bigl(\pm\frac{1}{\sqrt3}, 0\bigr)$, and

\[
NB = SN \cap \{\mathrm{Tr}(x_0, \tilde\lambda) = 0\} = \Bigl\{\tilde\lambda \;\Big|\; \tilde\nu \pm \frac{\tilde b}{\sqrt3} + \frac13 = 0\Bigr\}.
\]

After making the transformation $x = x_0 + X$, $y = Y$ and using e.g. Macsyma to perform the normal form calculations, as mentioned in section 6.3.2, we find on the two lines of $NB$ the points of nilpotent cusps of codimension three

\[
NC = \Bigl\{\tilde\lambda \;\Big|\; (\tilde\mu_1, \tilde\nu, \tilde b) = \Bigl(-\frac{2}{3\sqrt3}, \frac13, \frac{2}{\sqrt3}\Bigr)\Bigr\}, \qquad
NC_r = \Bigl\{\tilde\lambda \;\Big|\; (\tilde\mu_1, \tilde\nu, \tilde b) = \Bigl(\frac{2}{3\sqrt3}, \frac13, -\frac{2}{\sqrt3}\Bigr)\Bigr\}.
\]
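The coordinates of the two nilpotent cusp points can be cross-checked against the defining conditions. The sketch below evaluates, for (6.19), the residuals of the double-root conditions, of the trace condition (divided by $\delta$) and of the leading-order cusp degeneracy condition $b_{11} = \tilde b + 2x_0 = 0$; dropping the $h$-contribution to $b_{11}$ is an assumption matching (6.19), and the function name is illustrative.

```python
import math


def nilpotent_cusp_residuals(mu1, nu, b):
    """Residuals of the conditions defining a nilpotent cusp of (6.19).

    The double root x0 = +-1/sqrt(3) is taken with the sign of mu1.
    Returns (cubic, derivative of cubic, trace / delta, b11), all of which
    should vanish at a nilpotent cusp of codimension three.
    """
    x0 = math.copysign(1.0 / math.sqrt(3.0), mu1)
    cubic = x0 ** 3 - x0 + mu1          # critical point condition
    dcubic = 3.0 * x0 ** 2 - 1.0        # double root condition
    trace = nu + b * x0 + x0 ** 2       # Tr(x0) / delta
    b11 = b + 2.0 * x0                  # leading-order coefficient of XY
    return cubic, dcubic, trace, b11


NC = (-2.0 / (3.0 * math.sqrt(3.0)), 1.0 / 3.0, 2.0 / math.sqrt(3.0))
NC_R = (2.0 / (3.0 * math.sqrt(3.0)), 1.0 / 3.0, -2.0 / math.sqrt(3.0))
```

Evaluating `nilpotent_cusp_residuals(*NC)` and `nilpotent_cusp_residuals(*NC_R)` gives four residuals each that vanish up to rounding.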

As near the nilpotent saddle bifurcations $NS_-$ and $NS_+$ we can show that in the rest of the 3-space $\{\nu = 0\}$ the bifurcation diagram, which we have in sufficiently large cone-like neighbourhoods of central type (obtained by central rescaling), can trivially be extended to sufficiently small cone-like neighbourhoods of principal type. This can e.g. be used near $NC_r$ and $NC$ as well as near the Bogdanov–Takens bifurcations of codimension two.

Next we turn to the Hopf bifurcations for (6.19). The equation of the Hopf singularities is given by

\[
\begin{cases}
x_0^3 - x_0 + \tilde\mu_1 = 0, \\
\mathrm{Tr}(x_0, 0) = \delta(\tilde b x_0 + \tilde\nu + x_0^2) = 0
\end{cases}
\tag{6.20}
\]

with $x_0 \in \bigl(-\frac{1}{\sqrt3}, \frac{1}{\sqrt3}\bigr)$, since $\mathrm{Det}(x_0, 0) = 1 - 3x_0^2$ must be positive.

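By the change of coordinates $x' = x - x_0$, $y' = y$ and skipping the primes we can rewrite (6.19) as

\[
\begin{cases}
\dot x = y, \\
\dot y = x^3 + \alpha x^2 + \beta x + \delta(\bar b x + x^2)y
\end{cases}
\]

where $\alpha = 3x_0$, $\beta = 3x_0^2 - 1 < 0$ and $\bar b = \tilde b + 2x_0$.

Let $y = -\sqrt{-\beta}\,Y$, then the equation becomes

\[
\begin{cases}
\dot x = -\sqrt{-\beta}\,Y, \\
\dot Y = \sqrt{-\beta}\,x - \dfrac{1}{\sqrt{-\beta}}(\alpha x^2 + x^3) + \delta(\bar b x + x^2)Y.
\end{cases}
\]

The first two Lyapunov constants are

\[
V_1 = -\frac{2\alpha\delta}{\beta}\Bigl(\bar b - \frac{\beta}{\alpha}\Bigr), \qquad
V_2 = -\frac{5\sqrt{-\beta}}{48\beta^2}\,\delta \quad \text{if } V_1 = 0.
\tag{6.21}
\]

From the above results and using (6.20) we have that the Hopf bifurcation of order one is given by

\[
H :
\begin{cases}
\tilde\mu_1 = x_0 - x_0^3, \\
\tilde\nu + x_0\tilde b + x_0^2 = 0, \qquad -\dfrac{1}{\sqrt3} < x_0 < \dfrac{1}{\sqrt3}, \\
3x_0\tilde b + 3x_0^2 + 1 \neq 0
\end{cases}
\tag{6.22}
\]

where we take $x_0$ as a parameter; and the equation of Hopf bifurcation of order two is

\[
\begin{cases}
\tilde\mu_1 = x_0 - x_0^3, \\
\tilde\nu = \dfrac13, \\
\tilde b = -\Bigl(x_0 + \dfrac{1}{3x_0}\Bigr)
\end{cases}
\tag{6.23}
\]

which gives $DH_r$ if $0 < x_0 < \frac{1}{\sqrt3}$, and $DH$ if $-\frac{1}{\sqrt3} < x_0 < 0$.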
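The parametrisations (6.22) and (6.23) lend themselves to a small consistency check: along the degenerate Hopf curves the first Lyapunov constant of (6.21) should vanish and $\tilde\nu$ should equal $\frac13$. The following sketch evaluates both; the function names are illustrative assumptions and $V_1$ is computed up to the positive factor $\delta$.

```python
def hopf_point(x0, b):
    """Point (mu1, nu) of the Hopf surface H from (6.22) for the anti-saddle at x0."""
    return x0 - x0 ** 3, -x0 * b - x0 ** 2


def first_lyapunov_over_delta(x0, b):
    """V1 / delta from (6.21), written as -2*(alpha*bbar - beta)/beta.

    Here alpha = 3*x0, beta = 3*x0^2 - 1 < 0 and bbar = b + 2*x0.
    """
    alpha = 3.0 * x0
    beta = 3.0 * x0 ** 2 - 1.0
    bbar = b + 2.0 * x0
    return -2.0 * (alpha * bbar - beta) / beta


def degenerate_hopf_point(x0):
    """Point (mu1, nu, b) of the order-two Hopf curves (6.23), for x0 != 0."""
    b = -(x0 + 1.0 / (3.0 * x0))
    mu1, nu = hopf_point(x0, b)
    return mu1, nu, b
```

Calling `degenerate_hopf_point(0.3)` or `degenerate_hopf_point(-0.4)` returns a point whose second coordinate equals $\frac13$ up to rounding, and `first_lyapunov_over_delta` vanishes there.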

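Proposition 6.7. The Hopf bifurcation surface $H$, given by (6.22), passes through the straight line $\{\tilde\lambda \mid \tilde\mu_1 = \tilde\nu = 0\}$ and can be expressed in the form $\tilde\nu = \tilde\nu(\tilde\mu_1, \tilde b)$ for $(\tilde\mu_1, \tilde b) \in \bigl(-\frac{2}{3\sqrt3}, \frac{2}{3\sqrt3}\bigr) \times (-\infty, +\infty)$. The intersection of this surface with the plane $\{\tilde\lambda \mid \tilde\nu = \frac13\}$ consists of two curves $DH$ and $DH_r$ given by (6.23). They can be expressed by monotonic functions $\tilde b = \tilde b(\tilde\mu_1)$ and $\tilde b = \tilde b_r(\tilde\mu_1)$ for $\tilde\mu_1 \in \bigl(-\frac{2}{3\sqrt3}, 0\bigr)$ and $\tilde\mu_1 \in \bigl(0, \frac{2}{3\sqrt3}\bigr)$, respectively. When $\tilde\mu_1 \to -\frac{2}{3\sqrt3}$ (resp. $\frac{2}{3\sqrt3}$) then $DH$ (resp. $DH_r$) tends to the point $NC$ (resp. $NC_r$) with the property $\frac{\partial \tilde b}{\partial \tilde\mu_1} \to 1$ (resp. $\frac{\partial \tilde b_r}{\partial \tilde\mu_1} \to 1$); when $\tilde\mu_1 \to 0 - 0$ (resp. $\tilde\mu_1 \to 0 + 0$) then $\tilde b(\tilde\mu_1) \to +\infty$ (resp. $\tilde b_r(\tilde\mu_1) \to -\infty$); see figure 6.4. Along $DH$ and $DH_r$ the Hopf bifurcation of order two occurs. In $H \setminus (DH \cup DH_r)$ a Hopf bifurcation of order one occurs.

Proof. Since $|x_0| < \frac{1}{\sqrt3}$, we can get a function $x_0 = x_0(\tilde\mu_1)$ from the first equation of (6.22); substituting it into the second equation we obtain $\tilde\nu = \tilde\nu(\tilde\mu_1, \tilde b) = -x_0(\tilde\mu_1)\tilde b - x_0^2(\tilde\mu_1)$. Similarly, from the first and third equations of (6.23) we obtain $\tilde b = \tilde b(\tilde\mu_1)$ satisfying

\[
\frac{d\tilde b}{d\tilde\mu_1} = \frac{\partial \tilde b}{\partial x_0}\,\frac{\partial x_0}{\partial \tilde\mu_1} = \frac{1}{3x_0^2} > 0
\]

for $\tilde\mu_1 \in \bigl(-\frac{2}{3\sqrt3}, 0\bigr) \cup \bigl(0, \frac{2}{3\sqrt3}\bigr)$ and

\[
\lim_{\tilde\mu_1 \to \pm\frac{2}{3\sqrt3}} \frac{d\tilde b}{d\tilde\mu_1} = \lim_{x_0 \to \pm\frac{1}{\sqrt3}} \frac{1}{3x_0^2} = 1.
\]

The other conclusions are easy to obtain.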
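The slope formula used in the proof can be checked by finite differences along the parametrisation (6.23). The sketch below compares a central-difference estimate of $d\tilde b / d\tilde\mu_1$ with $1/(3x_0^2)$; the step size and the function names are illustrative assumptions.

```python
import math


def dh_curve(x0):
    """Point (mu1, b) on the degenerate Hopf curve (6.23), parametrised by x0 != 0."""
    return x0 - x0 ** 3, -(x0 + 1.0 / (3.0 * x0))


def slope_finite_difference(x0, step=1e-6):
    """Central-difference estimate of db/dmu1 along the curve (step is an assumption)."""
    mu_plus, b_plus = dh_curve(x0 + step)
    mu_minus, b_minus = dh_curve(x0 - step)
    return (b_plus - b_minus) / (mu_plus - mu_minus)


def slope_exact(x0):
    """Closed form 1 / (3*x0^2) obtained in the proof of proposition 6.7."""
    return 1.0 / (3.0 * x0 ** 2)
```

Evaluating `slope_finite_difference` at a value of $x_0$ close to $\frac{1}{\sqrt3}$ gives a number close to 1, in agreement with the limit stated in the proposition.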

6.5 Bifurcations of heteroclinic saddle connections

In this section we consider the superior and inferior saddle-connection bifurcation surfaces $SC_s$ and $SC_i$. Section 6.5.1 deals with the position of $SC_s$ and $SC_i$ globally and the related bifurcations outside small neighbourhoods of the two-saddle cycle (TSC). Since the conclusion and the arguments are exactly the same as in [13], we will only give a brief description of the situation. In section 6.5.2 we study the bifurcations near the TSC. The behaviour changes when $\tilde b$ passes through the values $\pm\frac45 + O(\delta)$, corresponding to $DTSC$ and $DTSC_r$ respectively, which are the degenerate cases of the TSC. However, we leave the study of the homoclinic loop bifurcations to section 6.6.1, although it also appears near the TSC.

6.5.1 The surfaces $SC_s$ and $SC_i$

We again consider

\[
X_{\tilde\lambda,\delta} :
\begin{cases}
\dot x = y, \\
\dot y = x^3 - x + \tilde\mu_1 + \delta(\tilde\nu + \tilde b x + x^2)y
\end{cases}
\tag{6.24}
\]

with $\tilde\lambda = (\tilde\mu_1, \tilde\nu, \tilde b)$ and $\delta > 0$ small. We note that (6.24) is a family of rotated vector fields with respect to $\tilde\nu$ in the whole $(x, y)$ plane, and it also has the rotational property with respect to $\tilde\mu_1$ on each of the two halfplanes $\{y \ge 0\}$ and $\{y \le 0\}$ (with different rotational directions), respectively.

For any fixed $\tilde b$ we consider $(\tilde\mu_1, \tilde\nu) \in \bigl(-\frac{2}{3\sqrt3}, \frac{2}{3\sqrt3}\bigr) \times [-M, M]$ with $M > 0$ large. Let $p_- = (\tilde\mu_1^*, -M)$ and $p_+ = (\tilde\mu_1^*, M)$ be the two end points of the segment $\{(\tilde\mu_1, \tilde\nu) \mid \tilde\mu_1 = \tilde\mu_1^*,\ -M \le \tilde\nu \le M\}$. Then the relative positions of the separatrices at the two saddles of the system (6.24) corresponding to $p_-$ and $p_+$ are illustrated in figure 6.7.

Figure 6.7. Different positions of saddle separatrices.

If the point $(\tilde\mu_1, \tilde\nu)$ moves from $p_-$ to $p_+$ along the straight line $\tilde\mu_1 = \tilde\mu_1^*$, by the rotational property of the vector fields with respect to $\tilde\nu$, we must have a unique point $SC_s^*$ and a unique point $SC_i^*$ on this line, corresponding to superior and inferior saddle connections respectively. Moving $\tilde\mu_1^*$ from $-\frac{2}{3\sqrt3}$ to $\frac{2}{3\sqrt3}$, we obtain two curves $SC_s$ and $SC_i$; see figure 6.8.

Figure 6.8. Relative position of $SC_i$ and $SC_s$.

On the other hand, since the superior (resp. inferior) separatrices always stay in the halfplane $\{y \ge 0\}$ (resp. $\{y \le 0\}$), by using the rotational property with respect to $\tilde\mu_1$ in the corresponding halfplane, we know that the curves $SC_s$ and $SC_i$ are also monotonic with respect to $\tilde\nu$. Hence, they have exactly one intersection point TSC. Finally, by moving $\tilde b$, we obtain the detectable part of the bifurcation surfaces

$SC_s$ and $SC_i$ in $(\tilde\mu_1, \tilde\nu, \tilde b)$-space, which looks like a product of $\mathbb{R}$ with the bifurcation diagram in figure 6.8. We will study the additional bifurcations near the intersection line TSC in section 6.5.2, and prove that TSC has coordinate expression $(\tilde\mu_1, \tilde\nu, \tilde b) = (0, -\frac15, \tilde b) + O(\delta)$, for $|\tilde b| \le M$, $M > 0$ large.

6.5.2 Behaviour near the TSC

It is easy to see that the TSC can occur only for $\tilde\mu_1$ near 0 and, therefore, it is interesting to consider the blow-up in parameter space

\[
(\tilde\nu, \tilde\mu_1, \tilde\mu_2, \tilde b, \delta) = (\nu', \eta\mu_1', \mu_2', b', \eta\delta')
\tag{6.25}
\]

with $\max(|\delta'|, |\mu_1'|) = 1$. Taking $\delta' = 1$ leads to $\tilde\mu_1 = \eta\mu_1'$ in equation (6.24), inducing

\[
\begin{cases}
\dot x = y, \\
\dot y = x^3 - x + \eta[\mu_1' + (\nu' + b'x + x^2)y]
\end{cases}
\tag{6.26}
\]

which we have to study for $\eta$ small and $\mu_1'$ in an arbitrarily large interval. The choice $\mu_1' = \pm 1$ leads to

\[
\begin{cases}
\dot x = y, \\
\dot y = x^3 - x \pm \eta[1 \pm \delta'(\nu' + b'x + x^2)y]
\end{cases}
\tag{6.27}
\]

to be studied for $\eta \sim 0$ and $\delta' \sim 0$. For $\eta = 0$ in (6.26) and (6.27) we find a Hamiltonian system with a TSC, as is sketched in figure 6.9.

Figure 6.9. Hamiltonian vector field for $\eta = 0$.

154

Freddy Dumortier, Peter Fiddelaers and Chengzhi Li

The equations of this cycle for y ≥ 0 and y ≤ 0 are, respectively,

\[ \Gamma_\pm :\; y = \pm\frac{1}{\sqrt 2}\,(1 - x^2), \qquad -1 \le x \le 1. \]

We immediately see that no superior nor inferior saddle connection is possible in system (6.27); in fact the integral

\[ \int_{\Gamma_\pm} [1 \pm \delta'(\nu' + b'x + x^2)\,y]\,dx \]

can, for δ' ∼ 0, never be zero. For system (6.26) the limiting equation, for η → 0, of the superior connection satisfies

\[ 0 = \int_{\Gamma_+} [\mu_1' + (\nu' + b'x + x^2)\,y]\,dx = \int_{-1}^{1} \Big[\mu_1' + \frac{1}{\sqrt 2}(1 - x^2)(\nu' + b'x + x^2)\Big]\,dx = 2\mu_1' + \frac{2\sqrt 2}{3}\,\nu' + \frac{2\sqrt 2}{15}. \tag{6.28} \]

Similarly, the limiting equation of the inferior saddle connection is given by

\[ 2\mu_1' - \frac{2\sqrt 2}{3}\,\nu' - \frac{2\sqrt 2}{15} = 0. \tag{6.29} \]

From (6.28) and (6.29), we get the limiting equation of the line TSC

\[ \mu_1' = 0, \qquad \nu' = -\tfrac15. \tag{6.30} \]

Using (6.25) with δ' = 1 we know that for η = δ ≠ 0 (6.30) is equivalent to

\[ \bar\mu_1 = 0, \qquad \bar\nu = -\tfrac15. \tag{6.31} \]
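The closed forms (6.28) and (6.29) can be checked numerically. The following is only a sketch under stated assumptions: it integrates along the two arcs y = ±(1 − x²)/√2 with a composite Simpson rule, and the function names (`simpson`, `connection_integral`, `closed_form`) are illustrative, not taken from the text.

```python
import math

def simpson(f, a, b, n=2000):
    """Composite Simpson rule on [a, b] with an even number n of subintervals."""
    h = (b - a) / n
    total = f(a) + f(b)
    for i in range(1, n):
        total += (4 if i % 2 else 2) * f(a + i * h)
    return total * h / 3

def connection_integral(sign, mu1, nu, b):
    """Integral over -1 <= x <= 1 of mu1 + (nu + b*x + x^2)*y, where
    y = sign*(1 - x^2)/sqrt(2) is the upper (sign=+1) or lower (sign=-1) arc."""
    def integrand(x):
        y = sign * (1.0 - x * x) / math.sqrt(2.0)
        return mu1 + (nu + b * x + x * x) * y
    return simpson(integrand, -1.0, 1.0)

def closed_form(sign, mu1, nu):
    """2*mu1 +/- (2*sqrt(2)/3*nu + 2*sqrt(2)/15), cf. (6.28) and (6.29)."""
    return 2 * mu1 + sign * (2 * math.sqrt(2) / 3 * nu + 2 * math.sqrt(2) / 15)
```

Both integrals vanish simultaneously exactly when µ1' = 0 and ν' = −1/5, independently of b', which is the line (6.30).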

For δ = η > 0, the two saddle points S1(δ) and S2(δ) of (6.26) are given by

\[ y = 0, \qquad x^3 - x + \delta\mu_1' = 0. \]

Near (x, y) = (−1, 0), we introduce u = x + 1; then (u − 1)³ − (u − 1) + δµ1' = 0 gives u = −½δµ1' + O(δ²). Hence, x = −1 − ½δµ1' + O(δ²) at S1(δ), and the trace of (6.26) at this point is

\[ \mathrm{Tr}(S_1(\delta)) = \delta(\nu' - b' + 1) + O(\delta^2). \]

Similarly, we can obtain x = 1 − ½δµ1' + O(δ²) at S2(δ), and the trace at this point is

\[ \mathrm{Tr}(S_2(\delta)) = \delta(\nu' + b' + 1) + O(\delta^2). \]

Along the line TSC (given by (6.30) or (6.31)) we have

\[ \mathrm{Tr}(S_1(\delta)) = \delta(\tfrac45 - b') + O(\delta^2), \qquad \mathrm{Tr}(S_2(\delta)) = \delta(\tfrac45 + b') + O(\delta^2). \]

Therefore, if we denote by γ(Si(δ)) the ratio of hyperbolicity at the saddle point Si(δ) (the absolute value of the negative eigenvalue over the positive eigenvalue), then for different values of b' along the line TSC we have the following five cases for 0 < δ ≪ 1:

(i) if b' < −4/5, then γ(S1(δ)) < 1 and γ(S2(δ)) > 1
(ii) if b' = −4/5, then γ(S1(δ)) < 1 and γ(S2(δ)) = 1
(iii) if −4/5 < b' < 4/5, then γ(S1(δ)) < 1 and γ(S2(δ)) < 1
(iv) if b' = 4/5, then γ(S1(δ)) = 1 and γ(S2(δ)) < 1
(v) if b' > 4/5, then γ(S1(δ)) > 1 and γ(S2(δ)) < 1.

Furthermore, the calculations show that

\[ \gamma(S_1(\delta)) \cdot \gamma(S_2(\delta)) = 1 - \frac{4\sqrt 2}{5}\,\delta + O(\delta^2) < 1 \qquad \text{for } 0 < \delta \ll 1. \]
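The trace formulas and the product of the hyperbolicity ratios can be reproduced numerically. This is a minimal sketch, assuming system (6.26) with η = δ; the helper names `saddle` and `hyperbolicity_ratio` are hypothetical. It locates a saddle by Newton iteration and reads the eigenvalues off the Jacobian, whose trace is δ(ν + bx + x²) and whose determinant is −(3x² − 1).

```python
import math

def saddle(x0, delta, mu1):
    """Newton iteration for the root of x^3 - x + delta*mu1 near x0 = -1 or +1."""
    x = x0
    for _ in range(50):
        x -= (x**3 - x + delta * mu1) / (3 * x * x - 1)
    return x

def hyperbolicity_ratio(x, delta, nu, b):
    """|negative eigenvalue| / positive eigenvalue at the saddle (x, 0)
    of x' = y, y' = x^3 - x + delta*(mu1 + (nu + b*x + x^2)*y)."""
    trace = delta * (nu + b * x + x * x)
    det = -(3 * x * x - 1)               # negative at a saddle
    root = math.sqrt(trace * trace - 4 * det)
    lam_plus = (trace + root) / 2
    lam_minus = (trace - root) / 2
    return -lam_minus / lam_plus
```

For instance, with δ = 10⁻³, µ1 = 0, ν = −1/5 and b = 0.3 both ratios are below 1 (case (iii)), and their product agrees with 1 − (4√2/5)δ up to terms of order δ².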

By using [17] (see also proposition 7.3 in [11]) we obtain the bifurcation diagram near the points DTSC_ℓ and DTSC_r (µ̄1 = 0, ν̄ = −1/5, b̄ = ±4/5) and around the line TSC, which is a smooth transition between the diagram of the strong attracting (repelling) case and that of the weak attracting (repelling) case. The bifurcation diagram near DTSC_r is illustrated in figure 6.10, where L_r and L_ℓ are the homoclinic bifurcation surfaces, related to the right and left saddle point, respectively. The curve DL_r corresponds to the homoclinic loop bifurcation of order two, and the surface DC_r corresponds to the double limit cycle bifurcation. Between DC_r and L_r the corresponding system has two limit cycles, and this is the maximal number of limit cycles for parameter values near the TSC. If we use the transformation (µ̄1, ν̄, b̄) ↦ (−µ̄1, ν̄, −b̄), then we obtain the situation near DTSC_ℓ. Figure 6.6 shows the transition between these two cases. Here we do not draw the Hopf bifurcation surface H_ℓ, because it is not near the line TSC for b̄ in any finite interval; see figure 6.4.

6.6 Study of X_{λ,δ} + O(δ²) as a perturbation of a family of Hamiltonian systems

We consider system (6.16) with X_{λ,δ} as in (6.24). When |µ̄1| < 2/(3√3) this vector field has three singularities, which are all located on the x-axis. We denote the middle one (anti-saddle) by (ξ, 0); then ξ is a root of the equation

\[ \varphi(x) = x^3 - x + \bar\mu_1 = 0 \tag{6.32} \]

and ξ has the same sign as µ̄1; see figure 6.11.

Figure 6.10. The bifurcation diagram near DTSC_r.

We let x̄ = x − ξ, ȳ = y, and rewrite (x̄, ȳ) as (x, y); then the system (6.16) becomes

\[ \dot x = y, \qquad \dot y = x^3 + \alpha x^2 + \beta x + \delta(\nu + bx + x^2)\,y + O(\delta^2) \tag{6.33} \]

where

\[ \alpha = 3\xi, \qquad \beta = 3\xi^2 - 1, \qquad \nu = \bar\nu + \bar b\,\xi + \xi^2, \qquad b = \bar b + 2\xi. \tag{6.34} \]

If δ = 0, (6.33) is a Hamiltonian system with the Hamiltonian

\[ H(x, y) = \frac{y^2}{2} - \frac14 x^4 - \frac{\alpha}{3} x^3 - \frac{\beta}{2} x^2. \tag{6.35} \]

The level curves Γ_h = {(x, y) | H(x, y) = h} are shown in figure 6.12. The saddle points (x±, 0) of (6.33) are given by x± = ½(−α ± √(α² − 4β)). We necessarily have β < 0, because the choice of ξ implies x− < 0 < x+. Hence,

\[ h_\pm := H(x_\pm, 0) = -\Big(\frac{x_\pm^4}{4} + \frac{\alpha}{3} x_\pm^3 + \frac{\beta}{2} x_\pm^2\Big) = \frac{1}{12}\,x_\pm^2\,(x_\pm^2 - 2\beta) > 0. \]


Figure 6.11. Graphs of ϕ.

If ξ > 0 (i.e. α > 0), then 0 < h+ < h−; if ξ < 0, then 0 < h− < h+; if ξ = 0, then h− = h+. Since the bifurcation diagram of (6.24) is invariant under the transformation (µ̄1, ν̄, b̄) ↦ (−µ̄1, ν̄, −b̄), it will become clear that we only need to consider the case µ̄1 ≥ 0. For a description of the situation for µ̄1 near 2/(3√3) we refer to section 6.3; hence, we will focus on the case 0 ≤ µ̄1 < 2/(3√3). To this end, we restrict the parameters ξ, α, β to the region

\[ 0 \le \xi < \frac{1}{\sqrt 3}, \qquad 0 \le \alpha < \sqrt 3, \qquad -1 \le \beta < 0. \tag{6.36} \]
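The saddle levels h± and their ordering are easy to check for a concrete ξ in the region (6.36). The sketch below assumes only the formulas (6.34) and (6.35) with δ = 0; the function names are illustrative.

```python
import math

def shifted_coefficients(xi):
    """alpha and beta from (6.34) for the anti-saddle position xi."""
    return 3 * xi, 3 * xi * xi - 1

def hamiltonian(x, y, alpha, beta):
    """H(x, y) as in (6.35)."""
    return y * y / 2 - x**4 / 4 - alpha * x**3 / 3 - beta * x * x / 2

def saddles(alpha, beta):
    """The two roots x_- < 0 < x_+ of x^2 + alpha*x + beta = 0 (beta < 0)."""
    disc = math.sqrt(alpha * alpha - 4 * beta)
    return (-alpha - disc) / 2, (-alpha + disc) / 2

def saddle_levels(xi):
    """(h_-, h_+): values of H at the left and right saddle."""
    alpha, beta = shifted_coefficients(xi)
    x_minus, x_plus = saddles(alpha, beta)
    return (hamiltonian(x_minus, 0.0, alpha, beta),
            hamiltonian(x_plus, 0.0, alpha, beta))
```

For ξ = 0.3 this gives 0 < h+ < h−, and for ξ = 0 both levels equal 1/4, in agreement with the closed form x±²(x±² − 2β)/12.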

To study the number of periodic orbits of (6.33) for small δ, we essentially need to study the number of zeros of the elliptic integral

\[ I(h) = \int_{\Gamma_h} (\nu + bx + x^2)\,y\,dx \tag{6.37} \]

where Γ_h is shown in figure 6.12 for 0 < h < h+. This was the subject of our study in [9], where we showed that the maximum number of zeros is two, that all positions and bifurcations of zeros can be described by transversality conditions, and that the bifurcation pattern coincides with the one we have represented in figure 6.4 for the limit cycles.


Figure 6.12. Level curves of H .

In fact, the implicit function theorem permits us to prove that, at least on the regular part of the Hamiltonian, the whole picture surely extends to a similar one on limit cycles for the system Xλ,δ + O(δ 2 ) with δ > 0 sufficiently small. As such we can be sure about the number of limit cycles and the limit cycle bifurcations on the regular part of the Hamiltonian for small strictly positive δ. However, further elaboration is needed near the centre, and near the homoclinic loop and the TSC, respectively. The analysis near the centre is rather standard; we will say a few more words about it in section 6.7. The TSC has been dealt with in section 6.5.2. The homoclinic loop is more challenging, and we will consider it in the next section.

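6.6.1 Homoclinic loop bifurcations away from TSC and SN

Now we let x̃ = x − x+ in (6.33) to move the origin to the right saddle point (x+, 0) of (6.33); see figure 6.13.

Figure 6.13. Hamiltonian saddle-loop.

Let η = ξ + x+ and note that x+³ + αx+² + βx+ = 0 and h+ = H(x+, 0), while equation (6.33) is changed to

\[ \dot{\tilde x} = y, \qquad \dot y = \tilde x^3 + \tilde\alpha\,\tilde x^2 + \tilde\beta\,\tilde x + \delta(\tilde\nu + \tilde b\,\tilde x + \tilde x^2)\,y \tag{6.38} \]

where we rewrite (x̃, y) as (x, y), and

\[ \tilde\alpha = 3\eta, \qquad \tilde\beta = 3\eta^2 - 1, \qquad \tilde\nu = \bar\nu + \bar b\,\eta + \eta^2, \qquad \tilde b = \bar b + 2\eta. \tag{6.39} \]

The associated Hamiltonian level curves are given by

\[ \tilde H(x, y) = \frac{y^2}{2} - \Big(\frac{x^4}{4} + \frac{\tilde\alpha}{3} x^3 + \frac{\tilde\beta}{2} x^2\Big) = \tilde h \tag{6.40} \]

where h̃ = h − h+, hence, −h+ < h̃ < 0. We note η = ξ + x+ = ½(−ξ + √(4 − 3ξ²)) and 0 < ξ < 1/√3, with the conditions

\[ \frac{1}{\sqrt 3} < \eta < 1, \qquad \tilde\alpha > \sqrt 3 \quad\text{and}\quad \tilde\beta > 0. \tag{6.41} \]

We can keep in mind that η = 1 corresponds to the TSC. We define

\[ \tilde I_k(\tilde h) = \int_{\tilde\Gamma_{\tilde h}} x^k\,y\,dx, \qquad k = 0, 1, 2 \]

and

\[ \tilde P(\tilde h) = \tilde I_1(\tilde h)/\tilde I_0(\tilde h), \qquad \tilde Q(\tilde h) = \tilde I_2(\tilde h)/\tilde I_0(\tilde h) \tag{6.42} \]

where Γ̃_h̃ = {(x, y) | H̃(x, y) = h̃, −h+ < h̃ < 0}. The calculation shows that

\[ \begin{aligned}
\tilde I_0(0) &= \tfrac43\sqrt{3\eta^2 - 1} - 2\sqrt 2\,\eta(1 - \eta^2)\ln\phi(\eta) \\
\tilde I_1(0) &= -\tfrac13\,\eta(-9\eta^2 + 13)\sqrt{3\eta^2 - 1} + \tfrac{1}{\sqrt 2}(1 - \eta^2)(7\eta^2 + 1)\ln\phi(\eta) \\
\tilde I_2(0) &= \tfrac{2}{15}(-63\eta^4 + 67\eta^2 + 8)\sqrt{3\eta^2 - 1} - \sqrt 2\,\eta(1 - \eta^2)(5\eta^2 + 3)\ln\phi(\eta)
\end{aligned} \tag{6.43} \]

where

\[ \phi(\eta) = \frac{\sqrt 2\,\eta + \sqrt{3\eta^2 - 1}}{\sqrt{1 - \eta^2}}. \tag{6.44} \]

We note that ϕ(1/√3) = 1 and ϕ'(η) > 0 for η ∈ (1/√3, 1).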
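The closed forms (6.43) can be compared with a direct quadrature over the homoclinic loop. The sketch below rests on two assumptions that are not spelled out in the text: the loop at level 0 satisfies y² = x²(x²/2 + 2ηx + 3η² − 1) for x* ≤ x ≤ 0 with x* = −2η + √(2(1 − η²)), and the orientation is chosen so that the area integral is positive. All function names are illustrative.

```python
import math

def simpson(f, a, b, n=4000):
    """Composite Simpson rule on [a, b] with an even number n of subintervals."""
    h = (b - a) / n
    total = f(a) + f(b)
    for i in range(1, n):
        total += (4 if i % 2 else 2) * f(a + i * h)
    return total * h / 3

def log_phi(eta):
    """ln(phi(eta)) with phi as in (6.44)."""
    r = math.sqrt(3 * eta * eta - 1)
    return math.log((math.sqrt(2) * eta + r) / math.sqrt(1 - eta * eta))

def loop_integrals_closed_form(eta):
    """(I0(0), I1(0), I2(0)) as in (6.43)."""
    e2 = eta * eta
    r = math.sqrt(3 * e2 - 1)
    L = log_phi(eta)
    i0 = 4 / 3 * r - 2 * math.sqrt(2) * eta * (1 - e2) * L
    i1 = -eta * (13 - 9 * e2) * r / 3 + (1 - e2) * (7 * e2 + 1) * L / math.sqrt(2)
    i2 = (2 / 15 * (-63 * e2 * e2 + 67 * e2 + 8) * r
          - math.sqrt(2) * eta * (1 - e2) * (5 * e2 + 3) * L)
    return i0, i1, i2

def loop_integral_quadrature(eta, k):
    """2 * integral over [x*, 0] of x^k * y dx on the upper branch of the loop.
    The substitution x = x* + t^2 removes the square-root endpoint behaviour."""
    a = math.sqrt(2 * (1 - eta * eta))
    x_star = -2 * eta + a           # left end of the loop
    x_far = -2 * eta - a            # second root of the quadratic factor
    t_max = math.sqrt(-x_star)
    def integrand(t):
        x = x_star + t * t
        y = -x * t * math.sqrt((x - x_far) / 2)   # y >= 0 because x <= 0
        return x**k * y * 2 * t                    # dx = 2 t dt
    return 2 * simpson(integrand, 0.0, t_max)
```

As η tends to 1 the ratios I1/I0 and I2/I0 computed from `loop_integrals_closed_form` approach −1 and 6/5.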


Lemma 6.8. Ĩ2'(h̃) Ĩ1(h̃) − Ĩ2(h̃) Ĩ1'(h̃) < 0 for h̃ ≤ 0.

Proof. Suppose that y = y(x) (≥ 0) is the function defined by H̃(x, y) = 0; then from (6.40) it is easy to see that y(x) = O(|x|) as x → 0 and y(x) = O(|x − x*|^{1/2}) as x → x*; see figure 6.13. This implies that

\[ \lim_{\tilde h \to 0} \tilde I_1'(\tilde h) = \int_{\tilde\Gamma_0} \frac{x\,dx}{y(x)} \qquad\text{and}\qquad \lim_{\tilde h \to 0} \tilde I_2'(\tilde h) = \int_{\tilde\Gamma_0} \frac{x^2\,dx}{y(x)} \]

are convergent. Hence, we can use the result in [16] to prove

\[ \frac{d}{d\tilde h}\Big(\frac{\tilde I_2(\tilde h)}{\tilde I_1(\tilde h)}\Big) < 0 \qquad\text{for } \tilde h \le 0, \]

which is equivalent to the conclusion of lemma 6.8.

Let Φ(x) = x⁴/4 + (α̃/3)x³ + (β̃/2)x²; then Φ'(x) = 0 has three zero points x̃0 < x0 < 0 (see figure 6.13), and

\[ \Phi'(x)(x - x_0) = x\,(x^2 + \tilde\alpha x + \tilde\beta)(x - x_0) < 0 \qquad\text{for } \tilde x_0 < x < 0,\ x \ne x_0. \tag{6.45} \]

Let x̃ = x̃(x) be defined by Φ(x) = Φ(x̃) for x* < x < x0; then x̃ satisfies x0 < x̃ < 0 and dx̃/dx < 0. Let

\[ \chi(x) = \frac{x^2\,\Phi'(\tilde x) - \tilde x^2\,\Phi'(x)}{x\,\Phi'(\tilde x) - \tilde x\,\Phi'(x)} = \frac{x\tilde x - \tilde\beta}{x + \tilde x + \tilde\alpha} \]

where x̃0 < x* < x < x0 and x̃ = x̃(x) is defined as above; then

\[ \frac{d}{dx}\chi(x) = \frac{1}{(x + \tilde x + \tilde\alpha)^2}\Big[(x^2 + \tilde\alpha x + \tilde\beta)\,\frac{d\tilde x}{dx} + (\tilde x^2 + \tilde\alpha\tilde x + \tilde\beta)\Big] > 0. \]

Here we use (6.45), x̃0 < x < x0 < x̃ < 0 and dx̃/dx < 0. Therefore, by theorem 2 of [16], the desired result follows.

Lemma 6.9. System (6.24) has homoclinic bifurcations with an order not exceeding two. The homoclinic loop bifurcation of order one is given by

\[ L_r : \begin{cases} \bar\mu_1 = \eta - \eta^3 \\ \bar\nu + (\tilde P(0) + \eta)\,\bar b + \eta^2 + 2\eta\tilde P(0) + \tilde Q(0) = 0 \\ \bar\nu + \eta\,\bar b + \eta^2 \ne 0 \end{cases} \tag{6.46} \]

with parameter η ∈ (1/√3, 1). The homoclinic loop bifurcation of order two is given by

\[ DL_r : \begin{cases} \bar\mu_1 = \eta - \eta^3 \\[2pt] \bar\nu = \dfrac{\tilde Q(0)}{\tilde P(0)}\,\eta + \eta^2 \\[6pt] \bar b = -\Big(\dfrac{\tilde Q(0)}{\tilde P(0)} + 2\eta\Big) \end{cases} \tag{6.47} \]

with parameter η ∈ (1/√3, 1), where P̃(0) and Q̃(0) are defined by (6.42) and (6.43).

Proof. Instead of (6.24), we consider the equivalent form (6.38). The elliptic integral for this system is

\[ \tilde I(\tilde h) = \tilde\nu\,\tilde I_0(\tilde h) + \tilde b\,\tilde I_1(\tilde h) + \tilde I_2(\tilde h). \tag{6.48} \]

On the other hand, near the loop Γ̃0, δĨ(h̃) also has the expansion (see [18]) c0 + c1 h̃ ln|h̃| + c2 h̃ + ··· with

\[ c_0 = \delta\,\tilde I(0), \qquad c_1 = \delta\,\tilde\nu, \qquad c_2 = \delta\,\tilde I'(0) \ \text{ if } c_1 = 0. \tag{6.49} \]

The homoclinic bifurcation of order one (resp. two) is given by c0 = 0, δc1 ≠ 0 (resp. c0 = c1 = 0, δc2 ≠ 0). Thus by using (6.48) and (6.49) it is easy to obtain the conditions (6.46) and (6.47). It remains to prove that c0 = c1 = 0, δ ≠ 0 implies c2 ≠ 0. This means that the order of homoclinic bifurcation does not exceed two. We suppose c0 = c1 = c2 = 0, δ ≠ 0. Noting that, by (6.48) and (6.49), c1 = 0, δ ≠ 0 is equivalent to ν̃ = 0, the conditions c0 = c2 = 0, δ ≠ 0 with ν̃ = 0 give

\[ \tilde I_1(0)\,\tilde b + \tilde I_2(0) = 0, \qquad \tilde I_1'(0)\,\tilde b + \tilde I_2'(0) = 0. \]

This contradicts lemma 6.8.

Lemma 6.10. d/dη (Ĩ2(0)/Ĩ1(0) + 2η) < 0 for η ∈ (1/√3, 1), where Ĩ1(0) and Ĩ2(0) are defined in (6.43).

Proof. From (6.43) and (6.44) we have

\[ \frac{d}{d\eta}\Big(\frac{\tilde I_2(0)}{\tilde I_1(0)} + 2\eta\Big) = \frac{G(\eta)}{\tilde I_1^2(0)}, \]

where

\[ \begin{aligned} G(\eta) ={}& \tfrac{4}{45}(3\eta^2 - 1)(-81\eta^6 + 450\eta^4 - 193\eta^2 + 64) \\ &+ \tfrac{4\sqrt 2}{15}\,\eta\sqrt{3\eta^2 - 1}\,(9\eta^6 - 7\eta^4 - 5\eta^2 - 29)\ln\phi(\eta) \\ &+ 2(1 - \eta^2)^2(7\eta^4 + 10\eta^2 - 1)(\ln\phi(\eta))^2. \end{aligned} \tag{6.50} \]

Let us show G(η) < 0 for η ∈ (1/√3, 1). By (6.50) and (6.44) we know that lim_{η→1/√3} G(η) = 0 and lim_{η→1} G(η) = −∞. For η near 1/√3 we let x = √(3η² − 1); then η² = ⅓(1 + x²), and for x near 0 we have

\[ \eta = \frac{1}{\sqrt 3}\Big(1 + \frac{x^2}{2} - \frac{x^4}{8} + \frac{x^6}{16} - \frac{5x^8}{128} + \frac{7x^{10}}{256} - \frac{21x^{12}}{1024} + \frac{33x^{14}}{2048} - \frac{429x^{16}}{32768} + \frac{715x^{18}}{65536}\Big) + O(x^{20}) \]

and

\[ \ln\phi(\eta) = \sqrt{\frac32}\,\Big(x + \frac{3x^5}{40} - \frac{x^7}{56} + \frac{3x^9}{128} - \frac{9x^{11}}{704} + \frac{159x^{13}}{13312} - \frac{9x^{15}}{1024} + \frac{4275x^{17}}{557056} - \frac{985x^{19}}{155648}\Big) + O(x^{21}). \]

Substituting the above expansions into (6.50) we obtain

\[ G(\eta) = \frac{48}{175175}\,x^{16}(-65 + 58x^2) + O(x^{20}). \]

Hence, G(η) < 0 for η > 1/√3 and η near 1/√3. If there is an η* ∈ (1/√3, 1), nearest to 1/√3, such that G(η*) = 0, then we must have G'(η*) ≥ 0. We will show that for any η ∈ (1/√3, 1), G(η) = 0 implies G'(η) < 0, and this gives the desired conclusion. We rewrite (6.50) as G(η) = A0(η) + A1(η) ln ϕ(η) + A2(η)(ln ϕ(η))². Then G(η) = 0 gives

\[ \ln\phi(\eta) = \frac{-A_1(\eta) + W(\eta)}{2A_2(\eta)} \tag{6.51} \]

where

\[ W(\eta) = \pm\sqrt{A_1^2(\eta) - 4A_0(\eta)A_2(\eta)}. \tag{6.52} \]

Using (6.51) and noting that (ln ϕ(η))' = √2 / ((1 − η²)√(3η² − 1)) we have

\[ G_1(\eta) = G'(\eta)\big|_{(6.51)} = -\frac{4\,(B_0(\eta) + B_1(\eta)W(\eta) + B_2(\eta)W^2(\eta))}{225\,(1 - \eta^2)\sqrt{3\eta^2 - 1}\;A_2^2(\eta)} \tag{6.53} \]

where B0, B1 and B2 are polynomials of η and √(3η² − 1). Hence, G1(η) is continuous in η ∈ (1/√3, 1). We claim that G1(η) ≠ 0 for η ∈ (1/√3, 1). In fact, by (6.52)

\[ (B_0 + B_2W^2)^2 - B_1^2W^2 = 20480\,(3\eta^2 - 1)^{12}(9\eta^4 - 10\eta^2 + 49)(A_2(\eta))^2 \]

which has no roots for η ∈ (1/√3, 1).

Therefore G1(η) has a fixed sign for η ∈ (1/√3, 1) and it is negative, since G1(0.8) ≈ −134.298. This finishes the proof of lemma 6.10.

Theorem 6.11. In figure 6.4 the homoclinic loop bifurcation surface L_r, given by (6.46), can be expressed in the form ν̄ = ν̄(µ̄1, b̄) for (µ̄1, b̄) ∈ (0, 2/(3√3)) × (−M, M) with M large, and it has the two lines TSC and NB_r in its boundary as µ̄1 → 0 and µ̄1 → 2/(3√3) respectively. The curve DL_r, given by (6.47), can be expressed in the form {ν̄ = ν̄(µ̄1), b̄ = b̄(µ̄1)} with the property that ∂ν̄/∂µ̄1 > 0 and ∂b̄/∂µ̄1 < 0 for 0 < µ̄1 < 2/(3√3), and

\[ \lim_{\bar\mu_1 \to 0} \frac{\partial\bar b}{\partial\bar\mu_1} = -\infty, \qquad \lim_{\bar\mu_1 \to 2/(3\sqrt 3)} \frac{\partial\bar b}{\partial\bar\mu_1} = 0. \]

Furthermore, when µ̄1 → 0 (resp. 2/(3√3)) the curve DL_r tends to the point DTSC_r (resp. NC_r). The surface L_ℓ and the curve DL_ℓ can be obtained from L_r and DL_r by the transformation (µ̄1, ν̄, b̄) ↦ (−µ̄1, ν̄, −b̄) together with (x, y) ↦ (−x, −y). Along DL_ℓ and DL_r the homoclinic loop bifurcation of order two occurs. There are no homoclinic bifurcations of a higher order.

Proof. From the first equation of (6.46) (or (6.47)) we have

\[ \frac{\partial\bar\mu_1}{\partial\eta} = 1 - 3\eta^2 < 0 \]

since η ∈ (1/√3, 1); hence, we can get a function η = η(µ̄1) from this equation. Putting this function into the second equation of (6.46) (or the last two equations of (6.47)) we obtain the desired form for L_r (or DL_r). We note that P̃(0) and Q̃(0) are functions of η.

Next, from (6.43) we have P̃(0) = Ĩ1(0)/Ĩ0(0) → −1 and Q̃(0) = Ĩ2(0)/Ĩ0(0) → 6/5 as η → 1 (µ̄1 → 0). On the other hand, Γ̃0 shrinks to the origin as η → 1/√3. By using these properties it is not difficult to verify from (6.46) (resp. (6.47)) that the surface L_r (resp. the curve DL_r) tends to the lines TSC = {µ̄1 = 0, ν̄ = −1/5} and NB_r = {µ̄1 = 2/(3√3), ν̄ + (1/√3) b̄ + 1/3 = 0} (resp. to the points DTSC_r = {(0, −1/5, −4/5)} and NC_r = {(2/(3√3), 1/3, −2/√3)}) as µ̄1 → 0 and µ̄1 → 2/(3√3), respectively.

Finally, by lemma 6.10 from (6.47) we obtain that along DL_r

\[ \frac{\partial\bar b}{\partial\eta} > 0 \qquad\text{and}\qquad \frac{\partial\bar\nu}{\partial\eta} = \frac{\tilde Q(0)}{\tilde P(0)} + \eta\,\frac{d}{d\eta}\Big(\frac{\tilde Q(0)}{\tilde P(0)} + 2\eta\Big) < 0 \]

since Q̃(0) > 0 and P̃(0) < 0. Furthermore, using (6.50) and (6.47) we see that along DL_r

\[ \frac{\partial\bar b}{\partial\bar\mu_1} = \frac{G(\eta)}{(3\eta^2 - 1)\,\tilde I_1^2(0)}, \qquad \eta \in \Big(\frac{1}{\sqrt 3}, 1\Big). \tag{6.54} \]

By using the expansions of η and ln ϕ(η) we also have

\[ \lim_{\eta \to 1/\sqrt 3} \frac{\ln\phi(\eta)}{\sqrt{3\eta^2 - 1}} = \sqrt{\frac32}. \tag{6.55} \]

Hence, it is not difficult to obtain by using (6.54), (6.50), (6.43), (6.44) and (6.55) that along DL_r

\[ \lim_{\bar\mu_1 \to 0} \frac{\partial\bar b}{\partial\bar\mu_1} = -\infty \qquad\text{and}\qquad \lim_{\bar\mu_1 \to 2/(3\sqrt 3)} \frac{\partial\bar b}{\partial\bar\mu_1} = 0. \]
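The curve DL_r of (6.47) can be traced numerically from the closed forms (6.43). The following sketch assumes those closed forms and the parametrization by η; the names `loop_integrals` and `dl_r_point` are hypothetical. It is meant as an illustration of the monotonicity and of the limit point for η → 1, not as a proof.

```python
import math

def loop_integrals(eta):
    """(I0(0), I1(0), I2(0)) from the closed forms (6.43), 1/sqrt(3) < eta < 1."""
    e2 = eta * eta
    r = math.sqrt(3 * e2 - 1)
    L = math.log((math.sqrt(2) * eta + r) / math.sqrt(1 - e2))
    i0 = 4 / 3 * r - 2 * math.sqrt(2) * eta * (1 - e2) * L
    i1 = -eta * (13 - 9 * e2) * r / 3 + (1 - e2) * (7 * e2 + 1) * L / math.sqrt(2)
    i2 = (2 / 15 * (-63 * e2 * e2 + 67 * e2 + 8) * r
          - math.sqrt(2) * eta * (1 - e2) * (5 * e2 + 3) * L)
    return i0, i1, i2

def dl_r_point(eta):
    """Point (mu1, nu, b) on the curve DL_r of (6.47) for the parameter eta."""
    i0, i1, i2 = loop_integrals(eta)
    ratio = i2 / i1                   # equals Q(0)/P(0)
    mu1 = eta - eta**3
    nu = ratio * eta + eta**2
    b = -(ratio + 2 * eta)
    return mu1, nu, b
```

Evaluating `dl_r_point` on an increasing grid of η shows µ̄1 and ν̄ decreasing and b̄ increasing, and for η close to 1 the point is close to DTSC_r = (0, −1/5, −4/5).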


6.7 Uniformity of local bifurcation diagrams with respect to δ

As we already mentioned, in dealing with local bifurcations, as for example in section 6.3, we have not been sufficiently precise about the domain in parameter space on which we are certain about the proposed bifurcation diagram and the related phase portraits. Trying to finish this study is not always trivial. In the statement of theorem 6.4 we therefore discarded cone-like neighbourhoods (of central type) of NS−, NS+, NC_ℓ and NC_r. In fact the 'uniformity' problem merely concerns the number of limit cycles and possible bifurcations of these limit cycles. Let us now give a precise survey of these technical questions that we have to deal with in order to prove the second statement of theorem 6.4. The complete treatment (including the discarded neighbourhoods of NC_ℓ and NC_r) would prove the existence of a δ̄ > 0 such that the bifurcation diagram of X_{λ,δ}, with µ̄2 = −1, as described in figure 6.4, is uniformly true for δ ∈ (0, δ̄].

From the result in section 6.4 we know that at the faces µ̄1 = ±2/(3√3) only saddle–node bifurcations and nilpotent cusps of order two and three occur. Hence, by using the results and techniques in [2, 12, 20] it must be possible to find a δ1 > 0 such that the bifurcation diagram in figure 6.4 near µ̄1 = ±2/(3√3) is uniformly true for δ ∈ (0, δ1]. Outside the discarded neighbourhoods of NC_ℓ and NC_r such a study has been made in [12].

The situation for µ̄1 near 0 was discussed in section 6.5.2. By using the results in [11, 17] we can find a δ2 > 0, such that the bifurcation diagram near µ̄1 = 0 is uniformly true for δ ∈ (0, δ2].

We finally consider the case σ1 ≤ |µ̄1| ≤ σ2, where the constants σ1 and σ2 satisfy 0 < σ1 ≪ 1, 0 < 2/(3√3) − σ2 ≪ 1. For every µ̄1* with |µ̄1*| ∈ [σ1, σ2], by using the results in section 6.6 and the implicit function theorem we can find δ* > 0, such that the bifurcation diagram near µ̄1* is uniformly true for δ ∈ (0, δ*]. By the compactness of [σ1, σ2] we can find δ3 > 0 such that the uniformity is valid for δ ∈ (0, δ3]. We take δ̄ = min(δ1, δ2, δ3), and then get the global uniformity of the bifurcation diagram for δ ∈ (0, δ̄].

We need to remark that in the third step near the Hopf and homoclinic loop bifurcation surfaces (or curves) the standard implicit function theorem is not enough. From (6.21) we see that along the Hopf surface H_ℓ we have lim_{δ→0} V1/δ ≠ 0. By using theorem 2.5 of chapter 3 in [3] we obtain the uniformity of the bifurcation diagram near H_ℓ and H_r. When V1 = 0, from (6.21) we have lim_{δ→0} V2/δ ≠ 0. Similar arguments permit us to prove the uniformity of the bifurcation diagram near DH_ℓ and DH_r. For a general result with respect to this 'uniformity' problem in the study of Hopf bifurcations we can refer to [4]. For the homoclinic loop bifurcation, we note from (6.49) and lemma 6.9 that along L_ℓ and L_r we have lim_{δ→0} c1/δ ≠ 0 and along DL_ℓ and DL_r (c0 = c1 = 0) we have lim_{δ→0} c2/δ ≠ 0. By similar arguments, as for the Hopf bifurcations, it is possible to prove the uniformity of the bifurcation diagram near L_ℓ, L_r, DL_ℓ and DL_r.


Acknowledgements

C Li thanks the Limburgs Universitair Centrum for their hospitality and financial support during a number of stays in the period 1995–98.

References

[1] Andronov A, Leontovich E, Gordon I and Maier A 1971 Theory of Bifurcations of Dynamical Systems on a Plane (Jerusalem: IPST)
[2] Bogdanov R I 1976 Versal deformation of a singularity of a vector field on the plane in the case of zero eigenvalues Seminar Petrovski (in Russian) (Engl. transl. 1981 Selecta Math. Sov. 1 389–421)
[3] Chow S-N, Li C and Wang D 1994 Normal Forms and Bifurcation of Planar Vector Fields (Cambridge: Cambridge University Press)
[4] Caubergh M and Dumortier F Hopf–Takens bifurcations and centers Preprint
[5] Dumortier F 1977 Singularities of vector fields on the plane J. Diff. Eqns 23 53–106
[6] Dumortier F 1993 Techniques in the theory of local bifurcations: blow-up, normal forms, nilpotent bifurcations, singular perturbations Bifurcations and Periodic Orbits of Vector Fields (NATO ASI Series C: Math. and Phys. Sciences 408) ed D Schlomiuk (Dordrecht: Kluwer) pp 19–73
[7] Dumortier F and Fiddelaers P 1991 Quadratic models for generic local three-parameter bifurcations on the plane Trans. Am. Math. Soc. 326 101–26
[8] Dumortier F and Ibáñez S 1998 Singularities of vector fields on R3 Nonlinearity 11 1037–47
[9] Dumortier F and Li C 2000 Perturbations from an elliptic Hamiltonian of degree four: (I) saddle loop and two saddle cycle J. Diff. Eqns to appear
[10] Dumortier F and Rousseau C 1990 Cubic Liénard equations with linear damping Nonlinearity 3 1015–39
[11] Dumortier F, Roussarie R and Rousseau C 1994 Elementary graphics of cyclicity 1 and 2 Nonlinearity 7 1001–43
[12] Dumortier F, Roussarie R and Sotomayor J 1987 Generic three-parameter families of vector fields on the plane, unfolding a singularity with nilpotent linear part. The cusp case Ergod. Theor. Dynam. Syst. 7 375–413
[13] Dumortier F, Roussarie R and Sotomayor J 1991 Generic Three-Parameter Families of Planar Vector Fields, Unfoldings of Saddle, Focus and Elliptic Singularities with Nilpotent Linear Parts (Lecture Notes in Mathematics 1480) (Berlin: Springer) pp 1–164
[14] Fiddelaers P 1992 Local bifurcations of quadratic vector fields PhD Thesis LUC
[15] Khibnik A, Krauskopf B and Rousseau C 1998 Global study of a family of cubic Liénard equations Nonlinearity 11 1505–19
[16] Li C and Zhang Z 1996 A criterion for determining the monotonicity of the ratio of Abelian integrals J. Diff. Eqns 124 407–24
[17] Mourtada A 1991 Degenerate and non-trivial hyperbolic polycycles with two vertices J. Diff. Eqns 113 68–83
[18] Roussarie R 1986 On the number of limit cycles which appear by perturbation of separatrix loop of planar vector fields Bol. Soc. Bras. Mat. 17 67–101
[19] Takens F 1974 Singularities of vector fields Publ. Math. IHES 43 47–100
[20] Takens F 1974 Forced oscillations and bifurcation Applications of Global Analysis I Commun. Math. Inst. Univ. Utrecht 3 1–59 (reprinted in chapter 1 of this volume)
[21] Xiao D 1993 Bifurcations of saddle singularity of codimension three of a planar vector field with nilpotent linear part Sci. Sinica A 23 252–60
[22] Zołądek H 1991 Abelian Integrals in Unfoldings of Codimension 3 Singular Planar Vector Fields (Lecture Notes in Mathematics 1480) (Berlin: Springer) pp 165–224

Chapter 7

Exponential confinement of chaos in the bifurcation sets of real analytic diffeomorphisms

Henk W Broer
University of Groningen

Robert Roussarie
Université de Bourgogne

The bifurcation set is considered of families of planar diffeomorphisms near a degenerate fixed point. Under quite general conditions this bifurcation set is exponentially close to that of an approximating family of vector fields, obtained by averaging. This implies that chaotic dynamics is confined to exponentially (or super-exponentially) narrow horns in the parameter space. After a suitable scaling an approximating Hamiltonian system is obtained, where the perturbation terms contain a dissipative deformation which is stable under contact equivalence. This approach covers many cases like the codimension-k Hopf–Takens bifurcation and some strong resonances.

More concretely, this paper deals with a quite general method to analyse generic bifurcations of planar diffeomorphisms. Throughout, the strong connection between diffeomorphisms and planar vector fields that depend periodically on time will be used. The connection is that the diffeomorphisms can be realized as the Poincaré mapping of time periodic vector fields, which, in general, are only C^∞ in the time t; cf [1]. Also by normal form or averaging techniques, these vector fields become autonomous to a large order, involving symmetry and integrability. It has long been known that these integrable approximations determine the dynamics so much that more complicated effects, such as chaos, are flat phenomena. In the real analytic context one can be more explicit regarding the nature of this flatness and, in fact, it can be identified as exponential or super-exponential.


7.1 Motivation: the Hopf–Takens bifurcation

As an example consider planar maps z ↦ ζ = Φ(z, λ), where λ is a (multi-)parameter. We restrict ourselves to the case where Φ(0, 0) = 0 and where spec dΦ(0, 0) ⊆ T¹ with no strong resonances up to order 4. In that case we may assume that dΦ(0, 0) = Rot_{2πα}, for some angle 2πα ≠ 2πj/ℓ for |ℓ| ≤ 4. This description generically fits the Hopf bifurcation for planar maps: a well known codimension-one bifurcation [1, 27]. A more complicated case of codimension two has been studied extensively in [10, 11, 12]. Here the 'Hopf' coefficient vanishes and has to be included as an extra parameter.

7.1.1 Preliminaries

To fix thoughts, we now consider the codimension-k Hopf bifurcation, also called the Hopf–Takens bifurcation [33, 34, 32]. This case can be described by saying that in the normal form the leading k − 1 coefficients vanish. We assume here that there are no resonances up to order k. To be more precise, if dΦ(0, 0) = S + N is the Jordan canonical splitting into semisimple and nilpotent parts, in appropriate coordinates, we can write

\[ \Phi = X^{2\pi} + O(2k + 1) \]

where X is a family of polynomial planar vector fields with X(0, 0) = α∂/∂θ. Moreover, S_*X = X. The superscript means that we take the time 2π map. In suitable polar coordinates (ρ, θ) the family X has the form

\[ \dot\theta = f(\rho^2, \lambda), \qquad \dot\rho = \rho\,g(\rho^2, \lambda) \]

where f(0, 0) = α and where g(ρ², 0) = cρ^{2k} for a constant c ≠ 0. This means that the map Φ is generated by a similar time-dependent vector field

\[ \dot\theta = f(\rho^2, \lambda) + \rho^{2k+2}F(\rho, \theta, t, \lambda), \qquad \dot\rho = \rho\,\big(g(\rho^2, \lambda) + \rho^{2k+2}G(\rho, \theta, t, \lambda)\big) \]

which depends periodically on the time t, with period 2π. For a discussion on the general relationship between periodically time-dependent vector fields and diffeomorphisms, see [1, 6]; the latter reference contains an interpolation lemma on this subject. Notably, by the interpolation of a diffeomorphism by a time-dependent vector field the dependence on t can only be C^∞.


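7.1.2 Parameters and scaling

First we put λ = 0, scaling ρ = ερ̄, and so obtaining

\[ \dot\theta = \alpha + \varepsilon^2 f(\bar\rho^2, \varepsilon) + \varepsilon^{2k+2}\bar\rho^{2k+2}F(\bar\rho, \theta, \varepsilon, t), \qquad \dot{\bar\rho} = \bar\rho\,\big(\varepsilon^{2k}c\,\bar\rho^{2k} + \varepsilon^{2k+2}g(\bar\rho^2, \varepsilon) + \varepsilon^{2k+2}\bar\rho^{2k+2}G(\bar\rho, \theta, \varepsilon, t)\big) \]

where from now on we assume 2π-periodicity in the time t. We can include parameters λ = (λ0, λ1, ..., λ_{k−1}) as follows, as suggested by singularity theory; see e.g. [15]. Indeed, scaling

\[ \lambda_0 = \varepsilon^{2k}\bar\lambda_0, \quad \lambda_1 = \varepsilon^{2k-2}\bar\lambda_1, \quad \ldots \]

we arrive at

\[ \dot\theta = \alpha + \varepsilon^2 f(\bar\rho^2, \bar\lambda, \varepsilon) + O(\varepsilon^{2k+2}), \qquad \dot{\bar\rho} = \varepsilon^{2k}\bar\rho\,\big(\bar\lambda_0 + \bar\lambda_1\bar\rho^2 + \cdots + \bar\lambda_{k-1}\bar\rho^{2(k-1)} + c\,\bar\rho^{2k} + O(\varepsilon^2)\big) \]

where the t-dependence is restricted to the O(ε^{2k+2})-terms. For notational simplicity we delete all bars. Truncating at order 2k we now obtain an integrable Hamiltonian system

\[ \dot\theta = \alpha + \varepsilon^2 f(\rho^2, \lambda, \varepsilon), \qquad \dot\rho = 0. \]

Including terms lower than order 2k + 2 yields the following dissipative perturbation:

\[ \dot\theta = \alpha + \varepsilon^2 f(\rho^2, \lambda, \varepsilon), \qquad \dot\rho = \varepsilon^{2k}\rho\,\big(\lambda_0 + \lambda_1\rho^2 + \cdots + \lambda_{k-1}\rho^{2(k-1)} + c\rho^{2k}\big). \]

This system is still integrable in the sense that it is autonomous or t-independent. Including also the O(ε^{2k+2})-terms leads to the full time-dependent system.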
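The effect of the weighted scaling can be seen in a few lines of code. The sketch below assumes the polynomial g(ρ², λ) = λ0 + λ1ρ² + ··· + λ_{k−1}ρ^{2(k−1)} + cρ^{2k} (the versal form used later in section 7.3); the function names and the sample values are illustrative only. It checks that after ρ = ερ̄ and λ_i = ε^{2k−2i}λ̄_i every term picks up the same factor ε^{2k}.

```python
import math

def g(rho, lam, c):
    """g(rho^2, lambda) = sum_i lam[i]*rho^(2i) + c*rho^(2k), with k = len(lam)."""
    k = len(lam)
    return sum(l * rho ** (2 * i) for i, l in enumerate(lam)) + c * rho ** (2 * k)

def scale_parameters(lam_bar, eps):
    """lambda_i = eps^(2k - 2i) * lambda_bar_i, the weights used in the text."""
    k = len(lam_bar)
    return [eps ** (2 * k - 2 * i) * l for i, l in enumerate(lam_bar)]

def scaled_identity_holds(rho_bar, lam_bar, c, eps):
    """True if g(eps*rho_bar, scaled lambda) == eps^(2k) * g(rho_bar, lambda_bar)."""
    k = len(lam_bar)
    lhs = g(eps * rho_bar, scale_parameters(lam_bar, eps), c)
    rhs = eps ** (2 * k) * g(rho_bar, lam_bar, c)
    return math.isclose(lhs, rhs, rel_tol=1e-12)
```

This homogeneity is what allows the common factor ε^{2k} to be pulled out of the radial equation.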

7.2 Setting of the problem and main result

The perturbation transition from the Hamiltonian, via the dissipative to the full time-dependent system, is a theme with many variations that applies, among other situations, to the Hopf bifurcations with strong resonances, among which is the Bogdanov–Takens case for maps [5, 6, 31]. Also more degenerate bifurcations can be treated this way, like the generalized Bogdanov–Takens case and codimension-three cases [13, 14]. Here the small parameter ε can occur in various ways. The simplest case is the one concerning the Hopf–Takens bifurcation described before.

A quite general case has the form

\[ \dot\theta = \varepsilon^q A(x, \lambda, \varepsilon) + \varepsilon^{q+r}R_1(\theta, x, t, \lambda, \varepsilon), \qquad \dot x = \varepsilon^p B(x, \lambda, \varepsilon) + \varepsilon^{q+r}R_2(\theta, x, t, \lambda, \varepsilon) \tag{7.1} \]

where q, p and r are non-negative integers with 0 ≤ q ≤ p < q + r. Again the t-dependence of the R_j is 2π-periodic. Here (x, θ) ∈ (a, b) × T¹, ε ∈ (0, ε0) and λ ∈ S, a compact analytic set. In the x-direction the system extends over the boundaries a and b. Rescaling the time by ε^q we arrive at the basic, integrable form

\[ \dot\theta = A(x, \lambda, \varepsilon) = \frac{\partial H}{\partial x}(x, \lambda, \varepsilon), \qquad \dot x = 0 \]

where H is a suitable Hamiltonian, which can describe several situations. The simplest case to which this approach applies is an open annulus, say, when A(x, λ) ≠ 0 for all x and λ. The unfolding then is determined by the dissipative part

\[ \dot x = \varepsilon^{p-q}B(x, \lambda, \varepsilon) \tag{7.2} \]

where the focus is on the surviving limit cycles and their bifurcations, given by the equations

\[ B(x, \lambda, \varepsilon) = 0 = \frac{\partial B}{\partial x}(x, \lambda, \varepsilon) \]

which define a subset Δ of the (λ, ε)-space. Let Δ_ε = Δ ∩ S × {ε}. The natural stability notion for this setting is that of (structural) stability under contact equivalence related to this zero-set Δ, for ε small. In fact, we now introduce our central assumption.

Assumption. The family B(x, λ, 0) is structurally stable under contact-equivalence.

To be more precise, this assumption means that by restricting ε0, the set Δ is equivalent to Δ0 × (0, ε0), modulo a real analytic, ε-preserving diffeomorphism; cf e.g. [15]. Let us describe the dynamics of the autonomous part

\[ \dot\theta = \varepsilon^q A(x, \lambda, \varepsilon), \qquad \dot x = \varepsilon^p B(x, \lambda, \varepsilon) \tag{7.3} \]

of (7.1) as a function of the parameters in the complement U := S × (0, ε0) \ Δ for small ε. Then the system (7.3) is Morse–Smale with a finite number of hyperbolic limit cycles. In each component of U the number of cycles, as well as their stability type, is constant.
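To make the role of the zero-set of B concrete, here is a small sketch for a hypothetical dissipative part B(s) = λ0 + λ1 s + c s², with s standing for a squared radius (the shape that appears for k = 2 in section 7.3.2). The choice of B and the function names are assumptions made for illustration. Simple positive zeros of B correspond to hyperbolic limit cycles of the autonomous system, and their number changes where the discriminant vanishes (double zero) or where a zero passes through s = 0.

```python
import math

def discriminant(lam0, lam1, c):
    """Discriminant of lam0 + lam1*s + c*s^2; it vanishes at a double zero."""
    return lam1 * lam1 - 4 * c * lam0

def limit_cycle_radii_squared(lam0, lam1, c):
    """Simple positive zeros s of lam0 + lam1*s + c*s^2 (c != 0), in increasing order."""
    disc = discriminant(lam0, lam1, c)
    if disc <= 0:
        return []
    r = math.sqrt(disc)
    roots = sorted([(-lam1 - r) / (2 * c), (-lam1 + r) / (2 * c)])
    return [s for s in roots if s > 0]

def count_limit_cycles(lam0, lam1, c):
    """Number of hyperbolic limit cycles of the model radial equation."""
    return len(limit_cycle_radii_squared(lam0, lam1, c))
```

With c = 1 and λ1 = −1 the count is 1 for λ0 < 0, 2 for 0 < λ0 < 1/4 and 0 for λ0 > 1/4, so the count is constant on each component of the complement of the bifurcation set.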

Figure 7.1. The bifurcation sets Δ, Δ′, Δ̃ and exponential horns.

After this the perturbation ε^{q+r}R has to be applied, and evaluated for similar and for more complicated dynamics. Before doing so a suitable finite number of averaging transformations is applied to reduce the size of the remainder R to exponential smallness in ε. In this way Δ changes to a 'broken' bifurcation set Δ′, such that dist_S(Δ′_ε, Δ_ε) = O(ε^r). As a consequence of the stability assumption, for each ε small enough, the set Δ_ε is analytically diffeomorphic to Δ0. It is one of the main results of this paper that chaos and other interesting features are a flat phenomenon, in the sense that they can only take place in an ε-exponentially narrow horn around Δ′; see figure 7.1 and compare a remark by Takens [34]. Indeed we can replace Δ′ by a smooth approximation. To be more precise, we have the following.

Theorem 7.1 (main theorem). Given a family of time-dependent vector fields (7.1)

\[ \dot\theta = \varepsilon^q A(x, \lambda, \varepsilon) + \varepsilon^{q+r}R_1(\theta, x, t, \lambda, \varepsilon), \qquad \dot x = \varepsilon^p B(x, \lambda, \varepsilon) + \varepsilon^{q+r}R_2(\theta, x, t, \lambda, \varepsilon), \]

0 ≤ q ≤ p < q + r, where the family of maps x ↦ B(x, λ, 0) is structurally stable under contact-equivalence and where the R_j (j = 1, 2) are 2π-periodic in t. Consider the bifurcation set Δ as above. Then there exists a C⁰ reparametrization (a homeomorphism) ψ(λ, ε) = (ψ_ε(λ), ε) of S × [0, ε0], such that if Δ̃ = ψ(Δ) and Δ̃_ε = ψ_ε(Δ_ε) then ε ↦ Δ̃_ε is a C^∞ family of analytic subsets of S, while

\[ \mathrm{dist}_S(\tilde\Delta_\varepsilon, \Delta_\varepsilon) = O(\varepsilon^r). \]

Also there exists a positive constant c̃ and there exist decreasing sequences C_ℓ and ε_ℓ, ℓ ∈ N, defining the exponentially narrow horn

\[ \tilde H_\ell := \{(\lambda, \varepsilon) \mid \mathrm{dist}_S(\lambda, \tilde\Delta_\varepsilon) \le C_\ell\,e^{-\tilde c/\varepsilon}\}. \]

For parameter values outside the horn H̃_ℓ and for 0 < ε < ε_ℓ, the dynamics of (7.3)_{λ,ε} are persistent in the following sense:

(i) Let C be any connected component of U. Then for each (λ, ε) ∈ C there is a one-to-one correspondence between the limit cycles of the autonomous system (7.3)_{λ,ε} and the normally hyperbolic invariant 2-tori of class C^ℓ of (7.1)_{ψ_ε(λ),ε}.
(ii) Any α- and ω-limit set is included in the union of 2-tori.

If q > 1 we can replace the exponential by e^{−(c/ε)|log ε|}, for an appropriate constant c > 0, also depending on q.
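To get a feeling for the scales involved, the following sketch compares an exponentially small width C e^{−c/ε} with the power-law distance ε^r and with the super-exponential width e^{−(c/ε)|log ε|}. The constants c = C = 1 and r = 4 are purely hypothetical choices for illustration; the theorem only asserts the existence of suitable constants.

```python
import math

def exponential_width(eps, c=1.0, C=1.0):
    """C * exp(-c/eps): the width of an exponentially narrow horn."""
    return C * math.exp(-c / eps)

def power_width(eps, r):
    """eps^r: the order of the distance between the two bifurcation sets."""
    return eps ** r

def super_exponential_width(eps, c=1.0):
    """exp(-(c/eps) * |log eps|): the super-exponential variant."""
    return math.exp(-(c / eps) * abs(math.log(eps)))
```

For ε = 0.05, 0.02 and 0.01 the exponential width is already far below ε⁴, and the super-exponential width is smaller still once ε < 1/e.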

Remark 7.2. The present approach deals with an annulus of regular Hamiltonian cycles, but also covers the case of a non-degenerate centre. Indeed, a blow-up of this centre reduces to the present annulus case. However, if there is another singularity at the boundary of the cylinder, new problems arise, which will be discussed in section 7.7. For a case study see [5, 6].

Remark 7.3. Compare [9] where a review is given of similar results in the C^∞-context. Similar programmes are possible in various other contexts, both dissipative and conservative. The limit cycles sometimes then are replaced by quasi-periodic tori. Also persistence theorems hold here, with similar flat, and in the analytic case, exponential or super-exponential estimates; see [9, 4, 3, 18]. The present paper aims to present a general method concerning dissipative effects.

7.3 Applications

Let us first define a codimension-k Hopf bifurcation, k = 1, 2, 3, . . . . Given S = Rot_{2πα} and a diffeomorphism family of the form X^{2π}, with λ a (multi-)parameter and α = α(λ). Let us assume that α = α(0) contains no resonances up to order k as before. Then, in convenient polar coordinates (θ, ϱ), the diffeomorphism is the time-2π map of the vector field

\[
\begin{aligned}
\dot\theta &= A(\varrho^{2},\lambda) + O(\varrho^{2k+2})\\
\dot\varrho &= \varrho\,B(\varrho^{2},\lambda) + O(\varrho^{2k+3})
\end{aligned}
\]

with A and B polynomial in ϱ² of degree less than k, with A(ϱ², λ) = α + a(λ)ϱ² + · · · , B(ϱ², 0) = cϱ^{2k}. One can choose ϱ_0 > 0 and |λ| sufficiently small to ensure that A(ϱ², λ) ≠ 0 for all ϱ ∈ (0, ϱ_0) and all λ in a small neighbourhood of 0. For the parameter λ = (λ_0, λ_1, . . . , λ_{k−1}) we further assume

\[ B(\varrho^{2},\lambda) = \sum_{i=0}^{k-1} \lambda_i \varrho^{2i} + c\,\varrho^{2k} \]

as a versal unfolding of the central singularity. Next we perform the scaling as before

\[ \varrho = \varepsilon\bar\varrho \quad\text{and}\quad \lambda_i = \varepsilon^{2k-2i}\bar\lambda_i \]

which yields

\[
\begin{aligned}
\dot\theta &= \alpha + O(\varepsilon^{2})\\
\dot{\bar\varrho} &= \varepsilon^{2k}\,\bar\varrho\,B(\bar\varrho^{2},\bar\lambda) + O(\varepsilon^{2k+2})
\end{aligned}
\]

where λ̄ = (λ̄_0, λ̄_1, . . . , λ̄_{k−1}) ∈ S^{k−1}. From now on we again delete bars.

7.3.1 The case k = 1 (Neimark–Sacker)

In the case k = 1 we have

\[
\begin{aligned}
\dot\theta &= \alpha + O(\varepsilon^{2})\\
\dot\varrho &= \varepsilon^{2}\varrho(\lambda_0 + c\varrho^{2}) + O(\varepsilon^{4})
\end{aligned}
\]

with λ_0 = ±1. So p = 2, q = 0, r = 4. Assuming that c > 0, the only interesting case is λ_0 = −1. Then we have

\[
\begin{aligned}
\dot\theta &= \alpha + O(\varepsilon^{2})\\
\dot\varrho &= \varepsilon^{2}\varrho(-1 + c\varrho^{2}) + O(\varepsilon^{4})
\end{aligned}
\]

with an unperturbed limit cycle ϱ_0 = 1/√c. To study this we localize to ϱ = ϱ_0 + x; since the derivative of ϱ(−1 + cϱ²) at ϱ_0 equals −1 + 3cϱ_0² = 2, this leads to the unperturbed hyperbolic limit cycle

\[
\begin{aligned}
\dot\theta &= \alpha + O(\varepsilon^{2})\\
\dot x &= \varepsilon^{2}\bigl(2x + O(x^{2})\bigr) + O(\varepsilon^{4}).
\end{aligned}
\]

This is the starting point of the further linearizing analysis, which for this case recovers the well-known result concerning persistence of the invariant circle; see e.g. [27].
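The scaling ϱ = εϱ̄, λ_i = ε^{2k−2i}λ̄_i used above can be checked by exact rational arithmetic. The sketch below is only an illustration: the function names are hypothetical, the O(·) remainders are ignored, and only the principal part ϱB(ϱ², λ) of the radial equation is treated. It also evaluates the derivative of the radial field at the unperturbed cycle in the case k = 1.

```python
from fractions import Fraction


def B(xi, lam, c):
    """B(xi, lam) = sum_i lam[i] * xi**i + c * xi**k, with k = len(lam)."""
    k = len(lam)
    return sum(l * xi ** i for i, l in enumerate(lam)) + c * xi ** k


def dB(xi, lam, c):
    """Derivative of B with respect to xi."""
    k = len(lam)
    lower = sum(i * l * xi ** (i - 1) for i, l in enumerate(lam) if i > 0)
    return lower + k * c * xi ** (k - 1)


def radial_field(rho, lam, c):
    """Principal part rho * B(rho**2, lam) of the radial equation."""
    return rho * B(rho ** 2, lam, c)


def radial_derivative(rho, lam, c):
    """d/d(rho) of rho * B(rho**2, lam)."""
    return B(rho ** 2, lam, c) + 2 * rho ** 2 * dB(rho ** 2, lam, c)


def scaling_is_consistent(eps, rho_bar, lam_bar, c):
    """Check rho*B(rho^2, lam) == eps**(2k+1) * rho_bar*B(rho_bar^2, lam_bar).

    Dividing by eps (since rho = eps * rho_bar) gives the factor eps**(2k)
    in front of the scaled radial equation.
    """
    k = len(lam_bar)
    lam = [eps ** (2 * k - 2 * i) * l for i, l in enumerate(lam_bar)]
    lhs = radial_field(eps * rho_bar, lam, c)
    rhs = eps ** (2 * k + 1) * radial_field(rho_bar, lam_bar, c)
    return lhs == rhs


if __name__ == "__main__":
    lam_bar = [Fraction(-1), Fraction(2, 5), Fraction(1, 3)]  # k = 3
    print(scaling_is_consistent(Fraction(1, 10), Fraction(3, 7), lam_bar, Fraction(4)))
    # k = 1, lambda_0 = -1, c = 4: the cycle sits at rho_0 = 1/sqrt(c) = 1/2.
    print(radial_field(Fraction(1, 2), [Fraction(-1)], Fraction(4)))
    print(radial_derivative(Fraction(1, 2), [Fraction(-1)], Fraction(4)))
```

Using `Fraction` avoids any rounding, so the equality test in `scaling_is_consistent` is exact rather than approximate.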

Figure 7.2. Bifurcation diagram in the Chenciner case (axes λ_0 and λ_1).

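7.3.2 The case k = 2 (Chenciner)

The case k = 2 is given by

\[
\begin{aligned}
\dot\theta &= \alpha + O(\varepsilon^{2})\\
\dot\varrho &= \varepsilon^{4}\varrho(\lambda_0 + \lambda_1\varrho^{2} + c\varrho^{4}) + O(\varepsilon^{6})
\end{aligned}
\]

now with (λ_0, λ_1) ∈ S^1. Here p = 4, q = 0, r = 6. The bifurcations are given by

\[
\begin{aligned}
\lambda_0 &= 0 && \text{(Hopf bifurcation)}\\
\lambda_0 &= \frac{1}{4c}\,\lambda_1^{2} && \text{(saddle–node bifurcation)}
\end{aligned}
\]

where we restrict ourselves to λ_1 < 0. We now need to linearize along the two unperturbed cycles as they occur here. To do this we need local charts in parameter space and to localize accordingly in phase space. One way to do this is by taking λ_1 = −1 and using λ_0 as the parameter, in which case we look near λ_0 = 0 and near λ_0 = 1/(4c), where the semistable unperturbed cycle occurs for ϱ_0 = 1/√(2c). The case near λ_0 = 0 goes as for k = 1. For the remaining case we write

\[ \lambda_0 = \frac{1}{4c} + \nu, \qquad \varrho = \frac{1}{\sqrt{2c}} + x \]

which leads to

\[
\begin{aligned}
\dot\theta &= \alpha + O(\varepsilon^{2})\\
\dot x &= \varepsilon^{4}\Bigl(\frac{\nu}{\sqrt{2c}} + \nu x + \sqrt{\frac{2}{c}}\,x^{2} + O(x^{3})\Bigr) + O(\varepsilon^{6}).
\end{aligned}
\]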
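The bifurcation curves of the Chenciner case can be explored numerically. The following sketch (function name hypothetical, O(·) remainders ignored) returns the positive radii ϱ solving λ_0 + λ_1ϱ² + cϱ⁴ = 0, i.e. the unperturbed limit cycles of the autonomous approximation, so that the number of cycles on either side of the Hopf line and of the saddle–node curve can be read off.

```python
import math


def chenciner_cycle_radii(lam0, lam1, c):
    """Positive radii rho with lam0 + lam1*rho**2 + c*rho**4 = 0 (c > 0).

    The equation is quadratic in xi = rho**2; only positive roots xi
    correspond to limit cycles.
    """
    disc = lam1 * lam1 - 4.0 * c * lam0
    if disc < 0:
        return []
    root = math.sqrt(disc)
    # A set merges the double root on the saddle-node curve.
    xis = {(-lam1 - root) / (2.0 * c), (-lam1 + root) / (2.0 * c)}
    return sorted(math.sqrt(xi) for xi in xis if xi > 0)


if __name__ == "__main__":
    c, lam1 = 1.0, -1.0
    for lam0 in (-0.1, 0.1, 0.25, 0.3):
        print(lam0, chenciner_cycle_radii(lam0, lam1, c))
```

With λ_1 = −1 and c = 1 one expects one cycle for λ_0 < 0, two cycles for 0 < λ_0 < 1/(4c), a single semistable cycle at ϱ_0 = 1/√(2c) on the saddle–node curve, and none beyond it.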

Figure 7.3. Bifurcation set in the Hopf–Takens case where k = 3 (axes λ_0, λ_1, λ_2; regions labelled I to IV).

Near ν = 0 a quasi-periodic saddle–node bifurcation occurs and the linearizing analysis would start near each of the corresponding hyperbolic cycles as they occur in the autonomous approximation. The present result implies that all the interesting dynamics (concerned with resonance bubbles, Cantori, strange attractors, etc) are confined to an exponentially narrow horn around the saddle–node curve given by the normal form analysis; see figure 7.2 and cf [10, 11, 12]. Similar results in the C^∞-setting were given in [4]; for a more general reference again see [9].

7.3.3 General k (Hopf–Takens)

The case for k ≥ 3 goes much the same; the question is what will be needed for our purposes. So we start out with

\[
\begin{aligned}
\dot\theta &= \alpha + O(\varepsilon^{2})\\
\dot\varrho &= \varepsilon^{2k}\varrho\bigl(\lambda_0 + \lambda_1\varrho^{2} + \cdots + \lambda_{k-1}\varrho^{2(k-1)} + c\varrho^{2k}\bigr) + O(\varepsilon^{2k+2}).
\end{aligned}
\]

Now p = 2k, q = 0, r = 2k + 2. The period T(ϱ) of the corresponding Hamiltonian system satisfies T(0) = 2π/α. For the function B(ξ, λ) = λ_0 + λ_1ξ + · · · + λ_{k−1}ξ^{k−1} + cξ^k we have to consider the bifurcation set Σ ⊆ S^{k−1} as it is given by Singularity Theory via the equations

\[ B = 0 = \frac{\partial B}{\partial \xi}. \]

This set is the union of the plane {λ_0 = 0}, corresponding to bifurcation at the origin, and of the discriminant set for positive roots of the general polynomial family cξ^k + λ_{k−1}ξ^{k−1} + · · · + λ_0. In the complement of Σ our approach applies. To be more definite, consider the case k = 3, the bifurcation set Σ of which is shown in figure 7.3. The complement of Σ has four components, each with a different number of limit cycles. In region IV, for example, three limit cycles co-exist. Our main theorem 7.1 can be applied to any of these components.

7.3.4 Further results

Any local unfolding of vector fields of finite codimension may be viewed as the normalized part of a corresponding unfolding of diffeomorphisms (or nonautonomous, time-periodic vector fields); for examples see [13, 14]. In all cases a large part of the bifurcation diagram reduces to the framework of the main theorem 7.1, i.e., to a dissipative perturbation of a Hamiltonian vector field around an annulus of regular cycles. Next consider cases where this annulus is bounded by a polycycle Γ containing saddle points. The application of our main theorem 7.1 to regular annuli near Γ, keeping track of the asymptotics at Γ, now leads us out of the realm of polynomials, Taylor series and classical analytic geometry. Indeed, the asymptotics of the occurring period and Floquet functions, etc, may have Dulac series expansions. In section 7.7 we show how to proceed in certain of these cases. First, in section 7.7.1 we briefly review the example of the local Bogdanov–Takens bifurcation for diffeomorphisms (cf [5, 6, 31]), which contains a codimension-one saddle connection. Second, in section 7.7.2 we sketch an example of a codimension-two saddle connection as it shows up in a local codimension-three bifurcation from [13, 14]. In general these applications of the main theorem 7.1 will need that q ≥ 1. For further details we refer to future research.

Remark 7.4. 
The present approach can be further generalized in various ways. One possible direction is that of the strong resonances; cf [2, 1, 19, 20, 21, 22, 34] and see also [23, 37] in this volume. In these cases the autonomous part keeps the discrete symmetry of the linear part; preservation of such symmetries is included in the main theorem 7.1. Another direction is concerned with higher-dimensional diffeomorphisms. To fix thoughts, consider the setting in [33, 34, 16], where one ends up with a higher-dimensional vector field approximation with several extra (formal) integrals. Using some of the present approach seems possible in this situation. However, in the linearization part below (cf section 7.5), one has to deal with resonances. Remark 7.5. Another question is what happens inside the exponentially narrow horns, excluded from the present study. Apart from quasi-periodic bifurcations (cf [10, 4, 7]) one also expects to find homoclinic bifurcations. Indeed, in view


of [29, 7] it seems well possible to create examples of this, using real analytic Kupka–Smale-like perturbations; cf e.g. [28, 35, 8]. In this respect one cannot help wondering how the Newhouse phenomenon of infinitely many co-existing sinks [26] relates to the infinitude of elliptic islands in the generic symplectic maps near our Hamiltonian approximation; again cf [11, 12].

7.4 Averaging

The starting point of our present investigation is a real analytic system

\[ \dot z = \varepsilon^{q} f(z,\varepsilon) + \varepsilon^{q+r} R_0(z,t,\varepsilon) \tag{7.4} \]

with z ∈ R^n ⊆ C^n and t ∈ R, where the system is 2π-periodic in t, (real) analytic in z and smooth in t. Observe that (7.4) is more general than the quite general form (7.1) from the preceding section. Indeed, we now do not make any distinction between conservative and dissipative terms. Successive averaging transformations will be applied to (7.4). At each step we obtain the general form

\[ \dot z = \varepsilon^{q} f(z,\varepsilon) + \varepsilon^{q+r}\bigl(g(z,\varepsilon) + R(z,t,\varepsilon)\bigr) \tag{7.5} \]

with an autonomous part g. In fact, in a (finite) number of averaging steps, which do not touch the general form of (7.5), we shall reduce the size of the remainder term R. In the initial case (7.4) we have g ≡ 0, but while iterating the term g is going to change, although it remains bounded: it consists of averaged parts of remainders as they show up after each iteration. The size of the terms is measured in the supremum norm on compact domains that have been suitably fattened in the imaginary direction. In fact, for any disc D ⊆ R^n and δ > 0, let

\[ D + i\delta := \{ z \in \mathbb{C}^{n} \mid \exists x \in D \text{ with } |z - x| \le \delta \}. \]

Also we shall need the time-extension of such a domain

\[ (D + i\delta)^{*} := (D + i\delta) \times \mathbb{T}^{1}. \]

For any holomorphic function h defined here we denote the corresponding supremum norm by |h|_{D+iδ} or |h|_{(D+iδ)*}. For vector valued functions we take the maximum of the norm of the components. At each iteration we allow the domain to shrink appropriately in the imaginary direction. In this way we obtain that after a suitable number of steps, the remainder R is exponentially small in ε. This result goes back to [25]; the present exposition leans heavily on [6].


7.4.1 Formulation of the averaging theorem

For the purpose of this paper we include a precise definition of 'piecewise analyticity'. On the set of real analytic functions we put the real analytic topology, defined as follows: on the complex analytic (or holomorphic) extension of the functions we use the (standard) compact-open topology.

Definition 7.6 (piecewise real analyticity). Let W ⊆ R be a domain and consider the real interval [0, ε_0]. A function g : W × [0, ε_0] → R is called piecewise real analytic if

(i) There exists a monotonic sequence {ε_N}_{N∈N}, converging to 0, such that g(z, ε) is real analytic in z ∈ W and in ε ∈ [0, ε_0], for ε ≠ ε_N, ∀N ∈ N.
(ii) For any N ∈ N the arc ε ↦ g(z, ε) has a limit in the real-analytic topology for both ε ↓ ε_N and ε ↑ ε_N.
(iii) Moreover, g(z, ε_N) = lim_{ε↑ε_N} g(z, ε).
(iv) The limit lim_{ε↓0} g(z, ε) exists in the real-analytic topology.

As will be clear, also the notion of complex piecewise analyticity exists for complex valued functions, defined on a complex domain W ⊆ C^n. The parameter ε ∈ [0, ε_0], however, remains real. It is easy to see that the set of (germs of) piecewise analytic functions forms a ring. Moreover, a map is called piecewise analytic if each of its components is. Finally, this definition sometimes will be extended to functions g(z, t, ε), depending periodically and smoothly on t ∈ T^1. We now give a formulation of the averaging theorem; the rest of the section is devoted to the proof of this theorem.

Theorem 7.7 (averaging). Given an initial real analytic system (7.4)

\[ \dot z = \varepsilon^{q} f(z,\varepsilon) + \varepsilon^{q+r} R_0(z,t,\varepsilon) \]

with z ∈ D ⊆ R^n ⊆ C^n and t ∈ R, where the system is 2π-periodic in t, (real) analytic in z, ε and smooth in t. Suppose that f is polynomial in ε of degree less than r and that (7.4) has a holomorphic extension to the domain z ∈ D + iδ; let |f| := |f|_{D+iδ}. Then, for ε_0 sufficiently small, there exists a near-identity transformation Φ = Φ(z, t, ε), which is piecewise complex analytic on a domain (D + iδ')* × (0, ε_0), with δ' < δ, such that:


(i) The map Φ is close to the identity map in the sense that |Φ − Id|_{(D+iδ')*} = O(ε^r).
(ii) The system (7.4) is taken to the form (7.5), where the functions g and R also are piecewise analytic.
(iii) There exist positive constants α, β, c, depending on |f|, |R_0|, but not on ε, such that

\[ |R|_{(D+i\delta')^{*}} \le \beta e^{-c/\varepsilon}, \qquad |g|_{(D+i\delta')^{*}} \le \alpha. \]

(iv) In the setting described before, where S ∈ gl(n, R) generates a symmetry of the initial system (7.4), the averaging transformation can be chosen such that ż = ε^q f(z, ε) + ε^{q+r} g(z, ε) has this same symmetry†.
(v) If q > 1 the estimate on |R| can be improved to

\[ |R|_{(D+i\delta')^{*}} \le \beta e^{-\frac{c}{\varepsilon}|\log\varepsilon|} \]

for an appropriate constant c > 0, also depending on q.

† Similarly for a time-reversing symmetry.

7.4.2 The algorithm

The averaging transformation is the result of an iteration consisting of N = N(ε) steps, where N(ε) ∼ K/ε, for some constant K. The iteration step is independent of N and will be described now. Suppose we are at the jth stage of the process and want to compute the (j + 1)th version. To avoid cumbersome notation, we make use of the +-notation, meaning that we suppress the index j. For the jth iteration we start with the system (7.5)

\[ \dot z = \varepsilon^{q} f(z,\varepsilon) + \varepsilon^{q+r}\bigl(g(z,\varepsilon) + R(z,t,\varepsilon)\bigr) \]

and end up with a similar system

\[ \dot w = \varepsilon^{q} f(w,\varepsilon) + \varepsilon^{q+r}\bigl(g_+(w,\varepsilon) + R_+(w,t,\varepsilon)\bigr). \tag{7.6} \]

Here we think of g (and g_+) as the averaged part, and of R (and R_+) as the 'perturbing' remainder. Initially, at the first iteration, we have g ≡ 0 and R = R_0. The systems (7.5) and (7.6) are related by a near-identity transformation of the form

\[ z = w + \varepsilon^{q+r} v(w,t,\varepsilon). \tag{7.7} \]


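Since f contains no terms of order r or higher, it will not be affected by these transformations. The approach is completely classical; see e.g. [1]. We arrive at the relation

\[
\Bigl(\mathrm{Id} + \varepsilon^{q+r}\frac{\partial v(w,t,\varepsilon)}{\partial w}\Bigr)\dot w
= \varepsilon^{q}\Bigl\{ f(w,\varepsilon) + A(w,t,\varepsilon)
+ \varepsilon^{r}\Bigl( g(w,\varepsilon) + C(w,t,\varepsilon) + R(w,t,\varepsilon) + D(w,t,\varepsilon) - \frac{\partial v(w,t,\varepsilon)}{\partial t}\Bigr)\Bigr\}
\tag{7.8}
\]

with

\[
\begin{aligned}
A(w,t,\varepsilon) &= f(w + \varepsilon^{q+r} v(w,t,\varepsilon), \varepsilon) - f(w,\varepsilon)\\
C(w,t,\varepsilon) &= g(w + \varepsilon^{q+r} v(w,t,\varepsilon), \varepsilon) - g(w,\varepsilon)\\
D(w,t,\varepsilon) &= R(w + \varepsilon^{q+r} v(w,t,\varepsilon), t, \varepsilon) - R(w,t,\varepsilon).
\end{aligned}
\]

We also define the matrix

\[ M(w,t,\varepsilon) = \varepsilon^{q+r}\frac{\partial v}{\partial w}(w,t,\varepsilon). \]

In the following the matrix norm is taken to be the operator norm. Splitting the remainder term R as follows:

\[ R = [R] + \tilde R, \quad\text{where}\quad [R](w,\varepsilon) = \frac{1}{2\pi}\int_0^{2\pi} R(w,s,\varepsilon)\,ds \]

and R̃ has zero time-average, we now choose, as usual,

\[ v(w,t,\varepsilon) := \int_0^{t} \tilde R(w,s,\varepsilon)\,ds \;\Longrightarrow\; \frac{\partial v}{\partial t} = \tilde R \]

meaning that v serves to kill the non-constant part of the remainder R. In the case of spatial symmetry‡ generated by the linear map S, we slightly change the definition of [R] by also averaging out the corresponding group-action. For instance, in the discrete case where S = R_α with α = 2π/m we have

\[ [R] = \frac{1}{2\pi m}\sum_{i=1}^{m}\int_0^{2\pi} S_*^{i} R\,ds. \]

‡ Similarly for reversibility.

What is the output of the present algorithm? To compute this we use the fact that

\[ (\mathrm{Id} + M)^{-1} = \mathrm{Id} - (\mathrm{Id} + M)^{-1} M \]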


provided that Id + M is invertible. Then the relation (7.8) becomes

\[
\dot w = \varepsilon^{q}\bigl\{ f + A + \varepsilon^{r}(g + C + D + [R]) \bigr\}
- \varepsilon^{q}(\mathrm{Id} + M)^{-1} M \bigl\{ f + A + \varepsilon^{r}(g + C + D + [R]) \bigr\}
\tag{7.9}
\]

where everything is expressed in the variables w, t, ε and where we desire that this is equal to the expression (7.6). This means that we have to identify the new terms g_+ and R_+. We take

\[
\begin{aligned}
g_+ &:= g + [R],\\
R_+ &:= \varepsilon^{-r}A + C + D - \varepsilon^{-r}(\mathrm{Id} + M)^{-1} M (f + A) - (\mathrm{Id} + M)^{-1} M (g + C + D + [R]).
\end{aligned}
\]
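The splitting R = [R] + R̃ and the choice v = ∫ R̃ ds at the heart of the averaging step can be illustrated for a scalar remainder that is a trigonometric polynomial in t. The sketch below makes several simplifying assumptions that are not part of the text: the dependence on w and ε is frozen, R is scalar, and it is stored as a dictionary of Fourier coefficients (a hypothetical representation chosen for this example).

```python
import cmath
import math


def time_average(coeffs):
    """[R]: the mean over one period, i.e. the zeroth Fourier coefficient."""
    return coeffs.get(0, 0.0)


def oscillating_part(coeffs, t):
    """R-tilde(t) = R(t) - [R], the part with zero time-average."""
    return sum(c * cmath.exp(1j * k * t) for k, c in coeffs.items() if k != 0)


def v(coeffs, t):
    """v(t) = integral from 0 to t of R-tilde(s) ds, computed term by term."""
    return sum(
        c * (cmath.exp(1j * k * t) - 1.0) / (1j * k)
        for k, c in coeffs.items()
        if k != 0
    )


if __name__ == "__main__":
    # R(t) = 0.5 + 0.5*cos(t) + 0.2*sin(t) - 0.1*sin(3t), as Fourier coefficients.
    R = {0: 0.5, 1: 0.25 - 0.1j, -1: 0.25 + 0.1j, 3: 0.05j, -3: -0.05j}
    print("average:", time_average(R))
    print("v(2*pi):", v(R, 2 * math.pi))  # v is 2*pi-periodic, so this is ~0
```

Because R̃ has zero mean, v is again 2π-periodic, which is what allows the transformation (7.7) to be defined on the time-extended domain.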

This ends the formal description of the algorithm; next we turn to the estimates involved.

7.4.3 Estimates concerning the averaging step

We start by giving the estimates that are important for one averaging step only, after this turning to the whole (finite) process. We shall introduce several constants that have to be fixed later to make the process work. Suppose that the real analytic input system (7.5) has a holomorphic extension to some complex domain D + iδ of this form. We want the output system to be defined (or at least considered) on a smaller domain D + iδ_+, putting δ_+ = δ − εσ. Next we introduce bounds for the supremum norms

\[
\alpha \ge |g|_{D+i\delta}, \qquad \beta \ge |R|_{(D+i\delta)^{*}}, \qquad
\alpha_+ \ge |g_+|_{D+i\delta_+}, \qquad \beta_+ \ge |R_+|_{(D+i\delta_+)^{*}}.
\]

First, a sufficient condition for the transformation (7.7) to take (D + iδ_+)* into (D + iδ)* is given by

\[ 2\pi\beta\,\varepsilon^{q+r-1} \le \sigma. \tag{7.10} \]

Indeed, from the definition v = ∫_0^t R̃ ds we obtain the estimate

\[ |v|_{(D+i\delta_+)^{*}} \le 2\pi\beta \]

which, together with (7.10), implies that

\[ |z - w| = \varepsilon^{q+r}|v|_{(D+i\delta_+)^{*}} \le \varepsilon\sigma = \delta - \delta_+. \]

Inequality (7.10) will be one of the conditions to deal with at the end.


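Second, we give a number of direct but useful estimates obtained by the Cauchy integral formula and the mean value theorem and which only depend on constants known at the jth stage (i.e., not with index +).

\[
\begin{aligned}
\Bigl|\frac{\partial v_k}{\partial w}\Bigr|_{(D+i\delta_+)^{*}} &\le 2\pi\frac{\beta}{\varepsilon\sigma}\\
|M|_{(D+i\delta_+)^{*}} &\le \frac{2\pi n\beta}{\varepsilon\sigma}\,\varepsilon^{q+r}\\
|(\mathrm{Id} + M)^{-1}|_{(D+i\delta_+)^{*}} &\le \Bigl(1 - 2\pi\frac{n\beta}{\varepsilon\sigma}\,\varepsilon^{q+r}\Bigr)^{-1}\\
|A|_{(D+i\delta_+)^{*}} &\le 2\pi\frac{n\beta\,|f|_{(D+i\delta)}}{\varepsilon\sigma}\,\varepsilon^{q+r}\\
|C|_{(D+i\delta_+)^{*}} &\le 2\pi\frac{n\beta\alpha}{\varepsilon\sigma}\,\varepsilon^{q+r}\\
|D|_{(D+i\delta_+)^{*}} &\le 2\pi\frac{n\beta^{2}}{\varepsilon\sigma}\,\varepsilon^{q+r}.
\end{aligned}
\]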
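The mechanism behind these bounds is the Cauchy estimate: on a strip that has been shrunk by s in the imaginary direction, the derivative of a holomorphic function is bounded by its supremum on the larger strip divided by s (in the text, s = εσ). The sketch below checks this numerically for h = sin and h' = cos on a sampled grid; the sampling helper is a hypothetical utility for this illustration and gives only a sampled supremum, not a proof.

```python
import cmath
import math


def sup_on_strip(func, delta, n_x=64, n_y=21):
    """Sampled sup of |func| on {x + iy : 0 <= x <= 2*pi, |y| <= delta}."""
    best = 0.0
    for i in range(n_x + 1):
        x = 2.0 * math.pi * i / n_x
        for j in range(n_y):
            y = -delta + 2.0 * delta * j / (n_y - 1)
            best = max(best, abs(func(complex(x, y))))
    return best


def cauchy_estimate_holds(func, dfunc, delta, shrink):
    """Check sup|func'| on the shrunk strip against sup|func| / shrink."""
    return sup_on_strip(dfunc, delta - shrink) <= sup_on_strip(func, delta) / shrink


if __name__ == "__main__":
    print(sup_on_strip(cmath.sin, 0.5), math.cosh(0.5))
    print(cauchy_estimate_holds(cmath.sin, cmath.cos, 0.5, 0.1))
```

The factor 1/(εσ) appearing in each of the estimates above is exactly this loss from shrinking the domain by εσ.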

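From this we proceed to get estimates on g_+ and R_+, and we can determine α_+ and β_+. The estimate on g_+ is simply

\[ |g_+|_{(D+i\delta_+)^{*}} \le \alpha + \beta, \tag{7.11} \]

while the estimate on R_+ is

\[
\begin{aligned}
|R_+|_{(D+i\delta_+)^{*}} \le{}& \frac{2\pi n\varepsilon^{q-1}\beta}{\sigma}\Bigl\{ |f|\Bigl(1 + \Bigl(1 - \frac{2\pi n\beta}{\sigma}\varepsilon^{q+r-1}\Bigr)^{-1}|M|\Bigr)
+ \varepsilon^{r}(\alpha + \beta)\Bigl(1 + \Bigl(1 - \frac{2\pi n\beta}{\sigma}\varepsilon^{q+r-1}\Bigr)^{-1}|M|\Bigr)\Bigr\}\\
&+ \frac{2\pi n\varepsilon^{q-1}}{\sigma}\Bigl(1 - \frac{2\pi n\beta}{\sigma}\varepsilon^{q+r-1}\Bigr)^{-1}\beta\bigl(|f| + \varepsilon^{r}(\alpha + \beta)\bigr).
\end{aligned}
\tag{7.12}
\]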

7.4.4 The averaging process

Again we shall define some constants, some of which are only specified at the end. We shall have to choose ε_0 > 0, and restrict ourselves to 0 < ε < ε_0. Given is a domain D + iδ_0 on which f has norm |f| := |f|_{D+iδ_0}. First, we fix a constant σ by σ := 10πn|f|. For any N ∈ N we shall define a process consisting of N iteration steps as described before. If

\[ \varepsilon_N := \frac{\delta_0}{2\sigma N} \]

then this process works for all ε_{N+1} < ε ≤ ε_N. This means that, for these values of ε, the Nth iterate of the averaging process satisfies the estimate of theorem 7.7 on R and g. To ensure that ε_N < ε_0 we impose the condition that

\[ \frac{\delta_0}{2\sigma} < \varepsilon_0. \tag{7.13} \]

For j ∈ N we fix the bounds α_j and β_j as follows. We can take any β_0. Since g_0 ≡ 0, we may choose α_0 > 0 to be arbitrary, and we take α_0 = 2β_0. Finally, we choose

\[ \beta_j = 2^{-j}\beta_0, \qquad \alpha_j = (4 - 2^{1-j})\beta_0 \]

which give the recurrence relation α_{j+1} = α_j + β_j; compare (7.11).

Proposition 7.8 (iteration). In the above circumstances we have that for any N ∈ N, ε_0 > 0 and δ_0 > 0 sufficiently small, on any interval ε_{N+1} < ε ≤ ε_N, the process of N iterations works.

Proof. Given N, we use induction on j ∈ {0, 1, 2, . . . , N}. First, for j = 0 there is nothing to prove, so we assume that

and

|R|(D+iδ)∗ ≤ β

having to verify the same formulas with the index +. It is easy to see that by definition |g+ |(D+iδ+ )∗ ≤ |g|(D+iδ)∗ + |R|(D+iδ)∗ ≤ α + β = α+ (see (7.11)). For the corresponding estimate on |R+ | it is sufficient to show that |R+ |(D+iδ+ )∗ ≤ 12 β. First, for fixed α0 , β0 by (7.12) we conclude that q−1

|R+ |(D+iδ+ )∗

2πnε0 ≤ σ

| f |(2 + O(ε0 ))β.

By our previous choice σ = 10πn| f |, the latter estimate implies that for ε0 sufficiently small indeed |R+ |(D+iδ+ )∗ < 12 β. We also have to satisfy the conditions (7.10) and (7.13). Indeed, given | f | we just choose ε0 and δ0 sufficiently small.
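The halving of the remainder in one step can be checked numerically. The sketch below evaluates a right-hand side of the type (7.12) under an assumed reading: the bound is taken as aβ(|f| + ε^r(α + β))(1 + κm) + aκβ(|f| + ε^r(α + β)), with a = 2πnε^{q−1}/σ, m = aβε^r the bound on |M|, and κ = (1 − m)^{−1}. The function name and the sample values are hypothetical.

```python
import math


def r_plus_bound(eps, q, r, n, f_norm, alpha, beta, sigma):
    """Upper bound for |R_+| after one averaging step (assumed reading of (7.12))."""
    a = 2.0 * math.pi * n * eps ** (q - 1) / sigma
    m = a * beta * eps ** r            # bound on the operator norm |M|
    if m >= 1.0:
        raise ValueError("Id + M is not guaranteed to be invertible")
    kappa = 1.0 / (1.0 - m)            # bound on |(Id + M)^(-1)|
    core = f_norm + eps ** r * (alpha + beta)
    return a * beta * core * (1.0 + kappa * m) + a * kappa * beta * core


if __name__ == "__main__":
    n, f_norm = 2, 1.0
    sigma = 10.0 * math.pi * n * f_norm   # the choice sigma = 10*pi*n*|f|
    for eps in (0.5, 0.1, 0.01):
        print(eps, r_plus_bound(eps, 1, 2, n, f_norm, 2.0, 1.0, sigma))
```

With σ = 10πn|f| and q = 1 the leading factor is 2/5 of β, so the bound stays below β/2 only once the O(ε) corrections are small, which is why ε_0 has to be chosen sufficiently small.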


Corollary 7.9 (exponential estimate). In the above circumstances we have for ε_{N+1} < ε ≤ ε_N,

\[ |g_N|_{(D+i\delta_N)} < 5\beta_0, \qquad |R_N|_{(D+i\delta_N)^{*}} < \beta_0 e^{-c/\varepsilon} \]

for any constant c < ½ log 2 (δ_0/σ).

Proof. For the estimate on g_N we have |g_N|_{(D+iδ_N)} ≤ α_N + β_N ≤ 5β_0. For R_N we next have

\[ |R_N|_{(D+i\delta_N)^{*}} < \beta_N \quad\text{where}\quad N = \frac{1}{2}\,\frac{\delta_0}{\sigma\varepsilon} \]

and where we use that β_N = 2^{−N}β_0.
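The bookkeeping behind the corollary can be replayed in a few lines. The sketch below checks the recurrence α_{j+1} = α_j + β_j exactly and tests, for sample values, that 2^{−N} ≤ e^{−c/ε} on the interval ε_{N+1} < ε ≤ ε_N when c is taken slightly below ½ log 2 (δ_0/σ). The safety factor 0.9 and the values of δ_0 and σ are assumptions made for this illustration only; for small N the inequality can fail, in line with the requirement that ε_0 be small.

```python
import math
from fractions import Fraction


def beta_j(j, beta0):
    """beta_j = 2**(-j) * beta_0."""
    return beta0 / 2 ** j


def alpha_j(j, beta0):
    """alpha_j = (4 - 2**(1 - j)) * beta_0."""
    return (4 - Fraction(2) ** (1 - j)) * beta0


def eps_N(N, delta0, sigma):
    """eps_N = delta_0 / (2 * sigma * N)."""
    return delta0 / (2.0 * sigma * N)


def remainder_bound_holds(N, delta0, sigma, safety=0.9):
    """Check 2**(-N) <= exp(-c/eps) at both ends of (eps_{N+1}, eps_N]."""
    c = safety * 0.5 * math.log(2.0) * delta0 / sigma
    samples = (eps_N(N, delta0, sigma), eps_N(N + 1, delta0, sigma) * (1.0 + 1e-9))
    return all(2.0 ** (-N) <= math.exp(-c / eps) for eps in samples)


if __name__ == "__main__":
    beta0 = Fraction(1)
    print([alpha_j(j, beta0) for j in range(5)])
    print([remainder_bound_holds(N, 1.0, 1.0) for N in (1, 5, 10, 50)])
```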

For the proof of theorem 7.7 we just take δ = δ_0, g = g_N, β = β_0 and R = R_N on the interval ε_{N+1} < ε ≤ ε_N.

Remark 7.10. In the exponential estimate on R we included the case q = 1; see [6]. If we restrict ourselves to the case q > 1, we obtain the stronger estimate

\[ |R_N|_{(D+i\delta_N)^{*}} < \beta_0 e^{-\frac{c}{\varepsilon}|\log\varepsilon|} \]

for an appropriate constant c > 0, also depending on q.

7.5 Linearization near a hyperbolic orbit

In this section we return to the context of the main theorem 7.1. Indeed, we shall consider the system (7.1) to which the averaging theorem 7.7 has been applied. As a result of this we obtain the time-dependent vector field Z = X + ε^{q+r}R, with system form

\[
\begin{aligned}
\dot\theta &= \varepsilon^{q}\tilde A(x,\theta,\lambda,\varepsilon) + \varepsilon^{q+r} R_1(x,\theta,t,\lambda,\varepsilon)\\
\dot x &= \varepsilon^{p}\tilde B(x,\theta,\lambda,\varepsilon) + \varepsilon^{q+r} R_2(x,\theta,t,\lambda,\varepsilon)
\end{aligned}
\tag{7.14}
\]

with

\[
\begin{aligned}
\tilde A(x,\theta,\lambda,\varepsilon) &= A(x,\lambda,\varepsilon) + O(\varepsilon^{r})\\
\tilde B(x,\theta,\lambda,\varepsilon) &= B(x,\lambda,\varepsilon) + O(\varepsilon^{p-q+r})
\end{aligned}
\]

where θ ∈ T^1, x ∈ (a, b), λ ∈ S and ε ∈ (0, ε_0), where S is a compact analytic manifold. Now we have that 0 ≤ q ≤ p < q + r. The system (7.14) is piecewise analytic in z = (x, θ, λ) and ε (and smooth in t, with 2π-periodic dependence). At this point we only need that R_1 and R_2 are bounded. We aim to prove a local linearization result around a family of hyperbolic closed limit cycles. The autonomous part of (7.14) is denoted by X_{λ,ε}(x, θ). The ε-expansion of the return map P(x, λ, ε) of this vector field, with respect to the section θ = 0, follows from the Poincaré–Melnikov formula

\[ P(x,\lambda,\varepsilon) - x = \varepsilon^{p-q} G(x,\lambda,\varepsilon) \]

with

\[
G(x,\lambda,\varepsilon) = \int_0^{2\pi} \frac{\tilde B}{\tilde A}(x,\theta,\lambda,\varepsilon)\,d\theta + O(\varepsilon)
= 2\pi\,\frac{B}{A}(x,\lambda,0) + O(\varepsilon).
\]

Then, by the central assumption on B, the family of functions G(x, λ, 0) is structurally stable under contact equivalence. Note that the principal part of the rescaled vector field ε^{−q}X_{λ,ε}(x, θ) is the same as the initial vector field (7.1), i.e., the Hamiltonian system associated with H_0(x, λ), with period

\[ T(x,\lambda) = \frac{2\pi}{\dfrac{\partial H_0}{\partial x}(x,\lambda)} = \frac{2\pi}{A(x,\lambda,0)}. \]

The codimension-k Hopf case is included by taking q = 0, p = 2k and r = 2k + 2. In that case, the vector field X is also polynomial.

7.5.1 Łojasiewicz property

First consider the case where all objects are analytic instead of piecewise analytic. We recall the definition, in section 7.2, of the bifurcation set for the autonomous part of (7.1). The averaging changes this bifurcation set to Σ ⊆ S × [0, ε_0], defined as the closure of the set

\[ \{(\lambda,\varepsilon) \mid \varepsilon > 0 \text{ and } X_{\lambda,\varepsilon} \text{ has a non-hyperbolic limit cycle}\}. \]

We also introduce Σ_ε := Σ ∩ (S × {ε}). Note that by our central stability assumption under contact equivalence

\[ \Sigma_\varepsilon \cong \Sigma_0 = \Bigl\{\lambda \Bigm| \exists x : B(x,\lambda,0) = 0 = \frac{\partial B}{\partial x}(x,\lambda,0)\Bigr\} \]

where ≅ denotes equivalence by a real analytic diffeomorphism. For (λ, ε) ∉ Σ the autonomous part X_{λ,ε} has just a finite number of hyperbolic limit cycles. In order to linearize at these limit cycles we need estimates on the minimal distance


between the limit cycles and on their Floquet exponents Fl(λ, ε)—to be defined below—in terms of b(λ, ε) := distS(λ, $ε ). Here distS denotes the Hausdorff metric within S. Since G(x, λ, ε) is only piecewise analytic, the set $ also has discontinuities in ε, whence we need a more general notion of piecewise analytic set. Moreover, for properly dealing with a function like Fl(λ, ε), which can be viewed as a linear projection composed with an analytic function, we shall also use the notion of sub-analytic sets and the piecewise analogue of this. 7.5.1.1 Piecewise analytic and sub-analytic sets We start by giving a general definition of piecewise analytic sets and then derive some properties. Definition 7.11. (i) We say that a subset * ⊆ S × [0, ε0] is piecewise analytic if it is defined by a finite number of equations f 1 (z, ε) = f 2 (z, ε) = · · · = f s (z, ε) = 0 where the functions f i (z, ε) are piecewise analytic. (ii) Let *ε := * ∩ S × {ε}. We say that the piecewise analytic set * is trivial if for a monotonous sequence {ε N } N∈N converging to 0, for each N, there exists an analytic diffeomorphism from S × [ε N+1 , ε N ] into itself, mapping * ∩ S × (ε N+1 , ε N ] to *0 × (ε N+1 , ε N ]. Remark 7.12. For any closed subset * ⊆ S × [0, ε0 ] we introduce the ‘horizontal’ distance b(λ, ε) := distS(λ, *ε ) as above. Evidently this definition depends on the metric in S. By compactness of S any equivalent metric gives rise to an equivalent function b , in the sense that cb ≤ b ≤ Cb for two positive constants c and C. For completeness we recall the Łojasewicz property. Proposition 7.13 (analytic case [24, 36]). Let the compact analytic manifold S be given and let * ⊆ S be an analytic subset. Consider the sub-analytic function f : S → R. Then, if Z ( f ) is the zero-set of f, we have (i) If Z ( f ) ⊆ *, then positive constants ν, c exist such that | f (λ)| ≥ c(dist S (λ, *))ν . 
(ii) Conversely, if Z ( f ) ⊇ *, then positive constants ν, c exist such that | f (λ)| ≤ c(dist S (λ, *))ν .


We continue by generalizing the above definition and proposition in the subanalytic context, recalling that a (real) sub-analytic set is any set obtained by a linear projection of an analytic set and that a sub-analytic function is a function whose graph is a sub-analytic set. Proposition 7.14 (piecewise analytic case). Let * ⊆ S × [0, ε0] be a trivial piecewise analytic set and f : S × [0, ε0] → R a piecewise sub-analytic function. (i) If the Z ( f ) ⊆ *, then positive constants ν, c exist such that | f (λ, ε)| ≥ cb(λ, ε)ν . (ii) Conversely, if the zero-set Z ( f ) ⊇ *, then positive constants ν, c exist such that | f (λ, ε)| ≤ cb(λ, ε)ν . Proof. First, we restrict ourselves to the simple case where * ⊆ S × [0, ε0] is a trivial analytic set, where f : S × [0, ε0] → R is analytic and where the transformation ! : S × [0, ε0 ] → S × [0, ε0 ] respecting the ε-levels, which sends * to *0 × [0, ε0 ], is just analytic. Using the fact that the function b is independent of the metric (up to equivalence), we can replace the couple (*, f ) by (!(*), f ◦ ! −1 ). This brings us into the case of proposition 7.13, where we use a product metric. Second, we consider the general case where * and f are trivially piecewise analytic and piecewise analytic, respectively. We choose a common sequence {ε N } N∈N converging to 0 and can apply the previous case to each interval [ε N+1 , ε N ]. The result then follows by the boundedness properties of definition 7.6. 7.5.1.2 Łojasewicz inequalities for the bifurcation set We now return to the bifurcation set $ , introduced at the beginning of this section. Proposition 7.15. $ ⊆ S × [0, ε0 ] for ε0 sufficiently small is a trivial piecewise analytic subset. Proof. By the averaging theorem 7.7 we know that $ is a piecewise analytic set. As was already pointed out before, the Poincar´e–Melnikov formula for the return map has the expansion P(x, λ, ε) − x = ε p−q G(x, λ, ε) B with G(x, λ, ε) = 2π (x, λ, 0) + O(ε). A

(7.15)


It follows that

\[ \Sigma_0 = \Bigl\{\lambda \Bigm| \exists x : B(x,\lambda,0) = 0 = \frac{\partial B}{\partial x}(x,\lambda,0)\Bigr\}. \]

Since by our central assumption the family B(x, λ, ε) is structurally stable under contact-equivalence, it follows that Σ is trivial, provided that ε_0 is sufficiently small.

Finally we turn to the examples mentioned before of the minimal distance between the limit cycles and their Floquet exponents.

Definition 7.16. In the above circumstances we define

(i) The distance function d(λ, ε) := inf{dist_S(γ, γ') | γ ≠ γ' are hyperbolic limit cycles of X_{λ,ε}} for (λ, ε) ∉ Σ and where we take d = −∞ if there is no limit cycle.
(ii) The Floquet function Fl(λ, ε) := inf{ε^{q−p}|Fl(γ) − 1| | γ limit cycle of X_{λ,ε}} for (λ, ε) ∈ S × [0, ε_0] \ Σ, where Fl(γ) is the Floquet exponent of γ.

Observe that both d and Fl are piecewise sub-analytic functions, while Z(d) ⊆ Σ and Z(Fl) = Σ. Applying propositions 7.14(i) and 7.15 we get the following.

Proposition 7.17 (distance and Floquet exponent). In the above circumstances there exist positive constants c_1, c_2, ν_1 and ν_2, such that

\[ d(\lambda,\varepsilon) \ge c_1 b(\lambda,\varepsilon)^{\nu_1}, \qquad \mathrm{Fl}(\lambda,\varepsilon) \ge c_2 b(\lambda,\varepsilon)^{\nu_2}. \]

7.5.2 Formulation and proof of the linearization theorem Theorem 7.18 (linearization). Consider the time-dependent vector field (7.14) Z := X λ,ε (x, θ ) + εq+r R(x, θ, t, λ, ε). Given U ⊆ S × [0, ε0 ] open and the family (λ, ε) of hyperbolic limit cycles of X λ,ε for (λ, ε) ∈ U \$, there exists a piecewise analytic family of transformations L λ,ε : (I, ϕ) "→ (θ, x)


for (λ, ε) ∈ U \ $ and (I, ϕ) ∈ [−1, +1] × T1 , such that the pull-back vector field L −1 ∗ Z has the form I˙ = ε p +1 (λ, ε)I + εq+r ,1 (I, ϕ, t, λ, ε) ϕ˙ = εq +2 (λ, ε) + εq+r ,2 (I, ϕ, t, λ, ε) where +1 is the Floquet exponent of the ‘unperturbed’ limit cycle and +2 is its frequency. Here +2 is a regular function, while |+1 (λ, ε)| ≥ c3 b(λ, ε)ν where ν = max{ν1 , ν2 } and for a positive constant c3 . Moreover, for j = 1, 2 one has |, j | ≤ c4 εq− p b−ν |R| for a constant c4 > 0. A proof of theorem 7.18 is given in several steps, which closely follow the corresponding proof in [6]. 7.5.2.1 Preliminaries We first quote from [6] two results concerning the linearization of a holomorphic map P : (C , 0) → (C , 0) around its hyperbolic fixed point 0. Writing P(z) = az + Q(z), with Q holomorphic with Q(0) = Q (0) = 0, defined on the disc D(ρ), assume that |a| =: A < 1 and that for z ∈ D(ρ) we have |Q (z)| ≤ C. In the following we will take a radius ρ0 ≤ ρ also satisfying ρ0 < A/2C, in which case the map P is injective on D(ρ0 ). Lemma 7.19 (from [6]). Taking ρ0 > 0 such that ρ0 <

\[ \frac{A}{2C}\,\bigl(1 - \sqrt{A}\bigr) \]

there exists a unique holomorphic map H with the following properties: (i) H restricted to D(ρ_0) is a diffeomorphism onto its image, (ii) H'(0) = 1, (iii) aH = H ∘ P on some open neighbourhood of 0.


Lemma 7.20 (properties of H). Let the linearizing map H of the above lemma be defined on the disc D(ρ_0) with ρ_0 = (A/2C)(1 − √A). Then H satisfies |H(z) − z| ≤ D|z|², where

\[ D = \frac{C}{A\,(1 - \sqrt{A})\,\bigl(1 - \tfrac12\sqrt{A}\bigr)\,\bigl(1 + \sqrt{A} + \tfrac12(\sqrt{A} - A)\bigr)}. \]

Moreover, D(⅜ρ_0) ⊆ H(D(½ρ_0)) while for z ∈ D(½ρ_0)

\[ |H'(z)| \le 4(1 + D\rho_0). \]
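A linearizing map with H'(0) = 1 and aH = H ∘ P can be approximated numerically by the classical Koenigs limit H(z) = lim a^{−n}P^n(z). This is a standard construction offered here as an illustration; it is not claimed to be the method of proof used in [6]. The map P(z) = az + z² with a = 1/2 is a hypothetical test case.

```python
def koenigs(P, a, z, n_iter=60):
    """Approximate H(z) = lim a**(-n) * P^n(z) for a contracting fixed point 0.

    Requires 0 < |a| < 1 and z close enough to 0 for the iterates to converge.
    """
    w = z
    for _ in range(n_iter):
        w = P(w)
    return w / a ** n_iter


if __name__ == "__main__":
    a = 0.5

    def P(z):
        return a * z + z * z

    z = 0.03 + 0.01j
    # The conjugacy a*H = H o P, evaluated at a sample point.
    print(koenigs(P, a, P(z)), a * koenigs(P, a, z))
    # H is tangent to the identity at 0.
    print(koenigs(P, a, 1e-3) / 1e-3)
```

For this P the expansion of H starts as z + z²/(a(1 − a)) + · · ·, so with a = 1/2 the quadratic coefficient is 4, which gives a quick consistency check on an estimate of the form |H(z) − z| ≤ D|z|².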

For the proof of lemma 7.20 again see [6] and use the Cauchy estimate. 7.5.2.2 Construction of L −1 Here we introduce the coordinates (I, ϕ) = L −1 (x, θ ) where we apply the above holomorphic linearization to the Poincar´e map of X λ,ε with respect to the limit cycle λ,ε . Recall that X λ,ε is the autonomous part of (7.14), which we assume to be defined on some open set containing (a, b) × T1 = {x, θ }. This allows one to speak of the global return map P defined on σ = (a, b) × {0} provided that ε is sufficiently small. Suppressing parameters from the notation, we note that here P is the Poincar´e map for any limit cycle of X. Before anything else we can define the return time of the vector field Y := ε−q X with respect to this section σ by T (x, λ, )). We now recall the piecewise analytic family λ,ε , (λ, ε) ∈ U \ $. Let the point x(λ, ε) be defined by {x(λ, ε)} = σ ∩ λ,ε . If we define T (λ, ε) as the period of λ,ε in the vector field ε−q X, then T (λ, ε) = T (x(λ, ε), λ, )). Let us consider the constants +1 , +2 and + = +1 /+2 . Here +2 (λ, ε) := 2π/T (λ, ε) and, therefore, +2 is a regular function. Moreover, we have the relationship (7.16) D P(x(λ, ε)) = exp(2πε p−q +(λ, ε))


and, therefore, 1 q− p ε log(D P(x(λ, ε))). 2π From this, and from propositions 7.14 and 7.15, it follows that +1 satisfies the Łojasewicz inequality of theorem 7.18. Linearizing the vector field X (λ, ε) near λ,ε is equivalent to finding a section σ λ,ε , say through x(λ, ε), with constant return time and linearizing the Poincar´e map on this section. Moreover, finding σ λ,ε is equivalent to defining the invariant foliation of X (λ, ε) with respect to λ,ε . The existence of this invariant foliation is proven in [17]. So let us now assume we have σ λ,ε . For the moment we suppress the dependence on the parameters (λ, ε). We first localize on σ near , setting +(λ, ε)) :=

x = x(λ, ε) + u. This u will also be used as a local parameter on σ by local transportation along the vector field X. Accordingly we denote the Poincar´e map as u "→ P(u). By this abuse of notation the return map σ → σ is also denoted by P. Let a := D P(0), then the linearizing map h is defined according to the previous subsection, i.e. it satisfies ah = h ◦ P. We now turn to the map L −1 . Given any point (x, θ ) near there is a unique point u(x, θ ) ∈ σ and a unique passage time 0 ≤ t(x, θ ) < T for the vector field Y = ε−q X to travel from u(x, θ ) to (x, θ ). Given this, we can define 2π t(x, θ ) T I (x, θ ) := h(u(x, θ )) exp +ϕ(x, θ )

ϕ(x, θ ) :=

where +=

εq− p log a. 2π

This concludes the definition of L −1 : (x, θ ) "→ (I (x, θ ), ϕ(x, θ )) where we observe that L −1 ∗ X obtains the form I˙ = ε p +1 I ϕ˙ = εq +2 .


It remains to transport the remainder term ε^{q+r}R along the map L^{−1}. Since

\[ L^{-1}_{*}(Z)(I,\varphi) = D(L^{-1})(x,\theta)\,Z(x,\theta) \]

we clearly need the derivative D(L −1 )(x, θ ) = (DL(I, ϕ))−1 for this. In the remainder of this section we estimate this derivative in terms of the parameters (λ, ε). 7.5.2.3 On the domains: preliminary estimates We now come to the definition and estimation of h, and its derivative, going from a small domain to order 1. We work on some annulus A(λ, ε) around λ,ε , which is the saturation of the interval D(c5 ε p−q bν ) = {z | |z − x(λ, ε)| ≤ c5 ε p−q bν } where, as before, ν = max{ν1 , ν2 }. Note that ν ≥ ν1 is just to have the two annuli mutually distinct, and also distinct from the singular set, whenever c5 is sufficiently small, as to be indicated later. So we identify ρ0 = c5 ε p−q bν and on behalf of H we estimate on the disc D( 12 ρ0 ) A − 1 ≈ c2 ε p−q bν2 C = O(ε p−q ) C 2C D≈ = O(b−ν2 ) √ ≈ p−q b ν2 c ε 1− A 6 c2 ν 2 b + ≈ +1 ≥ 2π +2 = O(1) T = O(1). We conclude by defining h(z) :=

1 3 8 ρ0

H (z) =

8 q− p −ν ε b H (z) 3c5

and derive from the above the following. Lemma 7.21 (properties of h). If ρ0 = c5 ε p−q bν , then on the disc D( 12 ρ0 ) we have D(1) ⊆ h(D( 12 ρ0 ))

|h (z)| = O(εq− p b−ν ).

Exponential confinement of chaos

Figure 7.4. The section σ and some passage times. (Labels shown in the figure: the sections σ and σ̄, the points u and P(u), and the times τ(u), τ(P(u)), T(0) and T(u).)

7.5.2.4 Passage times

As an intermediate step we need to consider all the ingredients introduced before. The passage time t = t(x, θ) will be split as

\[ t(x,\theta) = \bar t(x,\theta) + \tau(u(x,\theta)) \]

where τ(u) is the local passage time from σ to σ̄; see figure 7.4. We need to know the dependence of τ on the parameters (λ, ε). In σ we consider the domain $D(c_5 b^{\nu}\varepsilon^{p-q})$. The following proposition is borrowed from [6].

Proposition 7.22 (time estimates). On the domain $D(c_5 b^{\nu}\varepsilon^{p-q}) \subseteq \sigma$ we have uniform estimates

\[ |\tau(u)| = O(1) \quad\text{and}\quad \left|\frac{\partial\tau}{\partial u}\right| = O(b^{-\nu}\varepsilon^{q-p}). \]

Proof. Recall that T(x) is the return time to σ, which we now express in the local variable as T(u). Then we have

\[ \tau(u) + T(u) = T(0) + \tau(P(u)). \]

Iteration of P therefore yields

\[ \tau(P^{n-1}(u)) + T(P^{n-1}(u)) = T(0) + \tau(P^{n}(u)) \]

which, after summing over n and using $\tau(P^{n}(u)) \to 0$ as n → ∞, yields

\[ \tau(u) = \sum_{i=0}^{\infty}\bigl(T(0) - T(P^{i}(u))\bigr). \]

On the disc $D = D(c_5 b^{\nu}\varepsilon^{p-q})$ we have

\[ |T(0) - T(P^{i}(u))| \le \sup_{z\in D}\left|\frac{\partial T(z)}{\partial u}\right|\,|P^{i}(u)|. \tag{7.17} \]


Since T is a regular function, $|\partial T(z)/\partial u| = O(1)$. Moreover,

\[ |P(u)| \le \sup_{z\in D}|P'(z)|\,|u| \]

and again by the mean value theorem $|P'(z) - P'(0)| = O(|z|)$ on D since $P''$ is bounded. On D we have $|z| \le c_5 b^{\nu}\varepsilon^{p-q}$. Moreover, by proposition 7.17 we have

\[ |P'(0)| \le 1 - c_2 b^{\nu}\varepsilon^{p-q}. \]

It follows that, for sufficiently small c5, on the disc D

\[ |P'(z)| \le 1 - \tfrac12 c_2 b^{\nu}\varepsilon^{p-q}. \]

Abbreviating $B = 1 - \tfrac12 c_2 b^{\nu}\varepsilon^{p-q}$, we find $|P^{i}(u)| \le B^{i}|u|$. From this it follows that the series (7.17) converges, while

\[ |\tau(u)| = O\Bigl(\sum_{i}|P^{i}(u)|\Bigr) = O\Bigl(\frac{|u|}{1-B}\Bigr) = O(|u|\,b^{-\nu}\varepsilon^{q-p}). \]
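The geometric-series argument above can be checked numerically. The sketch below is illustrative only: the return map P, the return time T and all constants are hypothetical stand-ins (DELTA plays the role of the small quantity $b^{\nu}\varepsilon^{p-q}$), chosen so that the hypotheses of the estimate hold on the disc D.

```python
# Illustrative sketch: the model maps and constants below are assumptions
# chosen only to exercise the estimate; they are not taken from the text.

C2, C5 = 1.0, 0.1
DELTA = 1.0e-2               # stands in for b^nu * eps^(p-q)
B = 1.0 - 0.5 * C2 * DELTA   # contraction bound on the disc D
RADIUS = C5 * DELTA          # radius of the disc D


def P(u: float) -> float:
    """Hypothetical return map with P'(0) = 1 - C2 * DELTA."""
    return (1.0 - C2 * DELTA) * u + 0.5 * u * u


def T(u: float) -> float:
    """Hypothetical regular return time with |dT/du| = 1."""
    return 1.0 + u


def tau(u: float, tol: float = 1e-15, max_iter: int = 1_000_000) -> float:
    """Sum the series tau(u) = sum_i (T(0) - T(P^i(u)))."""
    total, x = 0.0, u
    for _ in range(max_iter):
        if abs(x) < tol:
            break
        total += T(0.0) - T(x)
        x = P(x)
    return total


def iterates_respect_bound(u: float, n: int = 200) -> bool:
    """Check |P^i(u)| <= B^i |u| for i = 0, ..., n - 1."""
    x = u
    for i in range(n):
        if abs(x) > B**i * abs(u) * (1.0 + 1e-12):
            return False
        x = P(x)
    return True


if __name__ == "__main__":
    u = RADIUS
    print(abs(tau(u)), "<=", abs(u) / (1.0 - B))
```

On the boundary of D the bound |u|/(1 − B) is of order 1, which is the content of the estimate |τ(u)| = O(1).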

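Using the Cauchy formula we so obtain

\[ \left|\frac{\partial\tau}{\partial u}\right| = O(b^{-\nu}\varepsilon^{q-p}) \]

as desired.

7.5.2.5 Estimation of the remainders $\Theta_1$ and $\Theta_2$

Consider the expression

\[ \varepsilon^{-q}Z(x,\theta,t,\lambda,\varepsilon) = Y_{\lambda,\varepsilon}(x,\theta) + \varepsilon^{r}R(x,\theta,t,\lambda,\varepsilon) \]

(cf theorem 7.18). We obtain for the remainder terms

\[ \begin{pmatrix}\Theta_1\\ \Theta_2\end{pmatrix} = D_{x,\theta}L^{-1}\,(R\circ L^{-1}) \]

where the parameters λ, ε are suppressed. Let us recall the formulas

\[ t(x,\theta) = \bar t(x,\theta) + \tau(u(x,\theta)), \qquad \varphi(x,\theta) = \frac{2\pi}{T}\,t(x,\theta), \qquad I(x,\theta) = h(u(x,\theta))\exp\bigl(\Omega\,\varphi(x,\theta)\bigr) \]

where $L^{-1}$ is defined by the latter two expressions. Let the flow $Y_t$ be given by

\[ Y_t(x,\theta) \equiv (\xi(t,x,\theta), \eta(t,x,\theta)) \]

where ξ denotes the x-component and η the θ-component. From the definitions of $\bar t(x,\theta)$ and u(x, θ), we obtain

\[ \eta(-\bar t(x,\theta),x,\theta) = 0, \qquad u(x,\theta) = \xi(-\bar t(x,\theta),x,\theta). \]

We now compute the derivative of $L^{-1}$ as

\[ \frac{\partial\varphi}{\partial x} = \frac{2\pi}{T}\left[\frac{\partial\bar t}{\partial x} + \frac{\partial\tau}{\partial z}(u)\left(-\frac{\partial\xi}{\partial t}\frac{\partial\bar t}{\partial x} + \frac{\partial\xi}{\partial x}\right)_{(-\bar t(x,\theta),x,\theta)}\right] \]
\[ \frac{\partial I}{\partial x} = e^{\Omega\varphi}\left[\frac{\partial h}{\partial z}(\xi)\left(-\frac{\partial\xi}{\partial t}\frac{\partial\bar t}{\partial x} + \frac{\partial\xi}{\partial x}\right)_{(-\bar t(x,\theta),x,\theta)} + \Omega\,h\,\frac{\partial\varphi}{\partial x}\right] \]

where similar expressions hold for the θ-derivatives. So in order to get the desired estimates on ∂(I, ϕ)/∂(x, θ) we need to estimate h,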

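\[ \frac{\partial h}{\partial z},\quad \frac{\partial\tau}{\partial z},\quad \frac{\partial\bar t}{\partial x},\quad \frac{\partial\bar t}{\partial\theta},\quad \frac{\partial\xi}{\partial t},\quad \frac{\partial\xi}{\partial x}\quad\text{and}\quad \frac{\partial\xi}{\partial\theta} \]

in terms of ε and the function b = b(λ, ε).

Lemma 7.23. For small ε

\[ |h| = O(1), \qquad \left|\frac{\partial h}{\partial z}\right| = O(\varepsilon^{q-p}b^{-\nu}), \qquad \left|\frac{\partial\tau}{\partial z}\right| = O(\varepsilon^{q-p}b^{-\nu}), \]
\[ \left|\frac{\partial\xi}{\partial t}\right| = O(1), \qquad \left\|\frac{\partial(\xi,\eta)}{\partial(x,\theta)}\right\| = O(1) \]

uniformly for (x, θ, λ) in a compact set.

Proof. The estimates on h, ∂h/∂z and ∂τ/∂z were already obtained in lemma 7.21 and proposition 7.22. The expression ∂ξ/∂t is just one of the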


components of the vector field Y, so the corresponding estimate is evidently valid. Finally, on behalf of the last estimate, we have

\[ -\frac{\partial\eta}{\partial t}(-\bar t(x,\theta),x,\theta)\,\frac{\partial\bar t}{\partial x} + \frac{\partial\eta}{\partial x} = 0 \]

where we use that $\eta(-\bar t(x,\theta),x,\theta) \equiv 0$. The θ-component ∂η/∂t of the vector field is bounded away from 0 in the annulus A we consider. Therefore

\[ \left|\frac{\partial\bar t}{\partial x}\right| = O\left(\left|\frac{\partial\eta}{\partial x}\right|\right). \]

This reduces the computation to the derivatives of the flow $Y_t$ with respect to x and θ. This can be done by studying the first variation equation. Observing that t remains in a compact domain, it follows that ∂(ξ, η)/∂(x, θ) is bounded. This finishes our proof of theorem 7.18.
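The implicit-differentiation step used for the passage time can be checked on a toy example. The sketch below assumes a hypothetical flow whose θ-component is η(t, x, θ) = θ + (1 + x)t, chosen only because the passage time to {θ = 0} is then explicit; it is not the flow of the text.

```python
# Illustrative sketch: a hypothetical flow (x' = 0, theta' = 1 + x) used to
# check the identity  -eta_t * dtbar/dx + eta_x = 0  by finite differences.


def eta(t: float, x: float, theta: float) -> float:
    """theta-component of the toy flow after time t."""
    return theta + (1.0 + x) * t


def t_bar(x: float, theta: float) -> float:
    """Passage time solving eta(-t_bar, x, theta) = 0."""
    return theta / (1.0 + x)


def dtbar_dx_implicit(x: float, theta: float) -> float:
    """dtbar/dx = eta_x / eta_t, evaluated at (-t_bar, x, theta)."""
    eta_t = 1.0 + x           # bounded away from 0 for x > -1
    eta_x = -t_bar(x, theta)  # partial of eta in x at t = -t_bar
    return eta_x / eta_t


def dtbar_dx_numeric(x: float, theta: float, h: float = 1e-6) -> float:
    """Central finite difference of t_bar in x."""
    return (t_bar(x + h, theta) - t_bar(x - h, theta)) / (2.0 * h)


if __name__ == "__main__":
    print(dtbar_dx_implicit(0.3, 0.7), dtbar_dx_numeric(0.3, 0.7))
```

The division by ∂η/∂t is where the lower bound on the θ-component of the vector field enters.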

7.6 Exponential confinement: proof of the main theorem 7.1

We now combine theorems 7.7 and 7.18 and see what this means for system (7.1), which reads

\[ \dot\theta = \varepsilon^{q}A(x,\lambda,\varepsilon) + \varepsilon^{q+r}R_1(\theta,x,t,\lambda,\varepsilon) \]
\[ \dot x = \varepsilon^{p}B(x,\lambda) + \varepsilon^{q+r}R_2(\theta,x,t,\lambda,\varepsilon). \]

7.6.1 Averaging

Application of the averaging theorem 7.7 gives rise to a similar time-dependent vector field $Z = X + \varepsilon^{q+r}R$, of the form (7.14)

\[ \dot\theta = \varepsilon^{q}\tilde A(x,\theta,\lambda,\varepsilon) + \varepsilon^{q+r}R_1(x,\theta,t,\lambda,\varepsilon) \]
\[ \dot x = \varepsilon^{p}\tilde B(x,\theta,\lambda,\varepsilon) + \varepsilon^{q+r}R_2(x,\theta,t,\lambda,\varepsilon) \]

where the $R_j$ have changed their connotation. Indeed, for some positive constants c, β

\[ |R_j| \le \beta e^{-c/\varepsilon}, \qquad j = 1, 2. \]

The question is what happens to the bifurcation set Σ of the initial autonomous part. The conjugacy Φ gives rise to a new autonomous part that has a piecewise analytic bifurcation set Σ′. Since $|\Phi - \mathrm{Id}|_{(D+i\delta)^{*}} = O(\varepsilon^{r})$ and because of the stability assumption, we know that $\mathrm{dist}(\Sigma_\varepsilon, \Sigma'_\varepsilon) = O(\varepsilon^{r})$. Moreover, for each ε small enough, the set $\Sigma'_\varepsilon$ is analytically diffeomorphic to $\Sigma_0$.


7.6.2 Persistence of invariant 2-tori

Our interest is with the finite number of branches of hyperbolic limit cycles, related to the complement of Σ′. To each such branch we apply theorem 7.18 to obtain the new form

\[ \dot I = \varepsilon^{p}\,\Omega_1(\lambda,\varepsilon)I + \varepsilon^{q+r}\,\Theta_1(I,\varphi,t,\lambda,\varepsilon) \]
\[ \dot\varphi = \varepsilon^{q}\,\Omega_2(\lambda,\varepsilon) + \varepsilon^{q+r}\,\Theta_2(I,\varphi,t,\lambda,\varepsilon) \]

where $\Omega_1$ is the Floquet exponent of the ‘unperturbed’ limit cycle and $\Omega_2$ its frequency. Here $\Omega_2$ is a regular function, while

\[ |\Omega_1(\lambda,\varepsilon)| \ge c_3\,b(\lambda,\varepsilon)^{\nu} \]

for positive constants $c_3$ and ν. Moreover, one has

\[ |\Theta_j| \le c_4\,\varepsilon^{q-p}b^{-\nu}|R|. \]

For the whole bifurcation problem we have to take the supremum over all branches related to the complement of Σ′. Summarizing, we get the global exponential estimate

\[ |\Theta_j| \le \beta c_4\,\varepsilon^{q-p}b^{-\nu}e^{-c/\varepsilon} \]

where c4 > 0 is sufficiently small.

We now turn to the persistence of all hyperbolic branches under consideration. Given ℓ ∈ N there exists a constant $K_\ell$, such that

\[ \varepsilon^{r}\max\left\{\varepsilon^{q-p}\left|\frac{\Theta_1}{\Omega_1}\right|,\ \left|\frac{\Theta_2}{\Omega_2}\right|\right\} < K_\ell \tag{7.18} \]

implies the persistence of the limit cycle as an invariant 2-torus of class $C^\ell$ for the non-autonomous system, see [17]. Translating this condition to the (λ, ε) parameter plane we have the following: for $\varepsilon \le \varepsilon(K_\ell, c_3, c_4, c, \beta)$, the condition (7.18) is implied by

\[ b^{2\nu} \ge \frac{\beta c_4}{c_3K_\ell}\,\varepsilon^{r+2(q-p)}e^{-c/\varepsilon} \tag{7.19} \]

where $C^\ell$-persistence holds. Observe that the complement of this domain is an exponentially narrow horn with a broken boundary, as defined in the main theorem 7.1.

7.6.3 The limit sets

We now turn to the limit sets mentioned in the main theorem 7.1. We shall prove that the limit sets are contained in the interior of the annuli around the hyperbolic limit cycles, for all parameter values defined by (7.19). To this end we recall that the expression (7.14) for the averaged vector field is an $\varepsilon^{q+r}$-perturbation of the initial equation (7.1). Our central assumption on the


family B(x, λ, 0) means that the autonomous part of (7.1) is analytically stable for contact equivalence between families of maps. Taking into account that (7.14) is just piecewise analytic in ε, this means that there exists a piecewise analytic family of diffeomorphisms, putting (7.14) in the form

\[ \dot\theta = F(x,\theta,\lambda,\varepsilon)[\varepsilon^{q}A(x,\lambda,\varepsilon)] + \varepsilon^{q+r}R_1(x,\theta,t,\lambda,\varepsilon) \]
\[ \dot x = F(x,\theta,\lambda,\varepsilon)[\varepsilon^{p}B(x,\lambda,\varepsilon)] + \varepsilon^{q+r}R_2(x,\theta,t,\lambda,\varepsilon) \tag{7.20} \]

where F is a piecewise analytic function, such that |F| is bounded from below by a constant f > 0. The new remainder R = (R1, R2) again is dominated by $\beta e^{-c/\varepsilon}$, as obtained in the averaging theorem 7.7. Similarly the estimates for the functions d(λ, ε) and Fl(λ, ε), related to the bifurcation diagram of the autonomous part, are preserved; cf proposition 7.17. This also holds for the size of the annulus of linearization in terms of ε, as well as the distance b(λ, ε) to the singular set, which now has the form

\[ \mathrm{Sing}(\lambda,\varepsilon) = \{x \mid B(x,\lambda,\varepsilon) = 0\}. \]

Compare the linearization theorem 7.18 and section 7.5.2.3. Here it is understood that the relevant constants c1, . . . , c5 entering the estimates have to be changed.

Consider the $\dot x$-component

\[ Z_2(x,\theta,t,\lambda,\varepsilon) = \varepsilon^{p}F(x,\theta,\lambda,\varepsilon)B(x,\lambda,\varepsilon) + \varepsilon^{q+r}R_2(x,\theta,t,\lambda,\varepsilon) \]

of (7.20). It is our aim to show that $Z_2$ remains non-zero outside the annuli specified in theorem 7.18. Indeed, we have

\[ |Z_2(x,\theta,t,\lambda,\varepsilon)| \ge \varepsilon^{p}f\,|B(x,\lambda,\varepsilon)| - \varepsilon^{q+r}|R_2(x,\theta,t,\lambda,\varepsilon)| \tag{7.21} \]

where $|R_2| \le \beta e^{-c/\varepsilon}$ as above.

We now first estimate |B|. The areas outside the annuli are given by

\[ \mathrm{dist}(x,\mathrm{Sing}(\lambda,\varepsilon)) \ge c_5\,\varepsilon^{p-q}b^{\nu}. \tag{7.22} \]

Also we have the global estimate

\[ |B(x,\lambda,\varepsilon)| \ge c_9\,b^{\nu}\,\mathrm{dist}(x,\mathrm{Sing}(\lambda,\varepsilon)) \]

for some positive constant c9. This inequality is derived by using the mean value theorem and the remark that $\frac{\partial B}{\partial x}(x,\lambda,\varepsilon)$ at any point x ∈ Sing(λ, ε) is proportional to the Floquet function Fl. On the domain given by (7.22) we then have the desired estimate

\[ |B(x,\lambda,\varepsilon)| \ge c_5c_9\,\varepsilon^{p-q}b^{2\nu} \]


which yields the sufficient condition for positivity of $|Z_2|$

\[ c_5c_9f\,\varepsilon^{p-q}b^{2\nu} - \beta\varepsilon^{q+r}e^{-c/\varepsilon} > 0. \tag{7.23} \]

Using the estimate (7.19), we conclude that the complement of the horn in the parameter plane is given by

\[ b^{2\nu} \ge \frac{\beta c_4}{c_3K_\ell}\,\varepsilon^{r+2(q-p)}e^{-c/\varepsilon}. \]
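To get a feeling for how narrow the horn is, the threshold in (7.19) can be evaluated directly. The sketch below is illustrative only: the function name and all numerical constants are hypothetical, and the exponents p = 2, q = 1, r = 2 are borrowed from the Bogdanov–Takens application mentioned in section 7.7.1.

```python
import math


def horn_threshold(eps, *, beta, c, c3, c4, K, nu, p, q, r):
    """Smallest b allowed by (7.19):
    b^(2 nu) >= beta c4 / (c3 K) * eps^(r + 2(q - p)) * exp(-c / eps)."""
    rhs = (beta * c4 / (c3 * K)) * eps ** (r + 2 * (q - p)) * math.exp(-c / eps)
    return rhs ** (1.0 / (2.0 * nu))


if __name__ == "__main__":
    # Hypothetical sample constants; only the exponents follow the text.
    consts = dict(beta=1.0, c=1.0, c3=1.0, c4=1.0, K=1.0, nu=1, p=2, q=1, r=2)
    for eps in (0.2, 0.1, 0.05):
        print(eps, horn_threshold(eps, **consts))
```

With these sample values the threshold reduces to exp(−c/(2ε)), so halving ε squares the already small width of the horn.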

Then condition (7.23), after simplification by a factor $\beta\varepsilon^{q+r}e^{-c/\varepsilon}$, is implied by

\[ \frac{c_4c_5c_9f}{c_3K_\ell} > 1. \]

In this expression all constants are determined by the vector field (7.20), except for $K_\ell$, which can be chosen arbitrarily small positive. In fact, $K_\ell$ determines the size of the exponential horn. Therefore, by taking $K_\ell$ sufficiently small we can establish the desired positivity. So for parameter values given by (7.19) the limit sets are contained in the hyperbolic 2-tori.

7.6.4 Smoothing the horn

We now turn to smoothing the boundaries of the horn (7.19), which makes the horn somewhat larger. Indeed, condition (7.19) is implied by a simpler estimate, for any small δ > 0, given by

\[ b \ge C e^{-c'/\varepsilon}, \quad\text{with}\quad c' = \frac{c}{2\nu} - \delta \quad\text{and}\quad C = \frac{\beta c_4}{c_3K_\ell}. \tag{7.24} \]

For ε > 0 sufficiently small, say $\varepsilon < \varepsilon' = \varepsilon'(K_\ell, c_3, c_4, c, \beta, \delta)$, this inequality guarantees the $C^\ell$-persistence of the hyperbolic limit cycles. The inverse estimate $b \le Ce^{-c'/\varepsilon}$ determines an exponentially narrow horn H around the piecewise analytic set Σ′; see figure 7.1.

Finally, we can smooth the piecewise analytic set Σ′, as follows. First, there exists a set

\[ \tilde\Sigma = \bigcup_{\varepsilon}\tilde\Sigma_\varepsilon\times\{\varepsilon\} \]

with the property that

\[ \tilde\Sigma\cap\bigl(S\times[0,\varepsilon')\bigr) \subset H. \]

Furthermore, the arc $\varepsilon\mapsto\tilde\Sigma_\varepsilon$ is a $C^\infty$-family of analytic sets in S. Moreover $\tilde\Sigma_\varepsilon$ is analytically diffeomorphic to $\Sigma_0$. Second, consider the horn

\[ \tilde H := \{(\lambda,\varepsilon) \mid \mathrm{dist}_S(\lambda,\tilde\Sigma_\varepsilon) \le Ce^{-\tilde c/\varepsilon}\} \]

where $\tilde c$ is any constant smaller than c′. Then for $0 < \varepsilon < \varepsilon'$, outside $\tilde H$ we have the $C^\ell$-persistence result. This finishes the proof of main theorem 7.1.
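The claim that the smoothed condition (7.24) implies (7.19) only for sufficiently small ε can be made concrete by comparing the two thresholds on a logarithmic scale, which avoids floating-point underflow of exp(−c/ε). This is a sketch under assumed sample constants; the function names and the values of C, c, ν and δ are hypothetical.

```python
import math


def log_threshold_719(eps, *, C, c, nu, p, q, r):
    """log of the smallest b allowed by (7.19), with C = beta c4 / (c3 K)."""
    return (math.log(C) + (r + 2 * (q - p)) * math.log(eps) - c / eps) / (2.0 * nu)


def log_threshold_724(eps, *, C, c, nu, delta):
    """log of the smallest b allowed by (7.24), where c' = c / (2 nu) - delta."""
    return math.log(C) - (c / (2.0 * nu) - delta) / eps


def smoothed_implies_original(eps, *, C, c, nu, p, q, r, delta):
    """True if every b satisfying (7.24) also satisfies (7.19)."""
    return (log_threshold_724(eps, C=C, c=c, nu=nu, delta=delta)
            >= log_threshold_719(eps, C=C, c=c, nu=nu, p=p, q=q, r=r))


if __name__ == "__main__":
    consts = dict(C=0.01, c=1.0, nu=2, p=2, q=1, r=2, delta=0.05)
    for eps in (0.1, 0.01):
        print(eps, smoothed_implies_original(eps, **consts))
```

The factor exp(δ/ε) gained from the slack δ eventually dominates every fixed constant and every power of ε, which is why a threshold ε′ depending on δ appears in the statement.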


7.7 Concluding remarks

The scope of the main theorem 7.1 concerns bifurcation diagrams of real analytic systems for the perturbation of regular limit cycles related to a compact annulus with Hamiltonian cycles. This paper is concluded by some remarks about cases where such an annulus may have a polycycle Γ in its boundary, which contains singular points; see also section 7.3.4.

The simplest case where this occurs is the (standard) Bogdanov–Takens bifurcation of codimension two; see [5, 6, 31]. Here the Hamiltonian system reads

\[ y\frac{\partial}{\partial x} + (x^2 - 1)\frac{\partial}{\partial y} \tag{7.25} \]

and it has two Morse singularities, the saddle-point of which generates a homoclinic polycycle Γ, enclosing a (punctured) disc full of Hamiltonian cycles; see figure 7.5. In the dissipative unfolding a codimension-one saddle connection occurs, where a limit cycle disappears in a ‘blue sky catastrophe’. At this bifurcation the divergence at the saddle-point is non-zero. Below we briefly summarize how this case connects to the present setting. The idea is that the main theorem 7.1 of the present paper can be applied on any compact regular annulus in the complement of Γ, where the relevant asymptotics at Γ can be computed and kept track of.

A next case occurs in local codimension-three bifurcations and also in the Bogdanov–Takens bifurcation of codimension three as studied in [13]. Here the Hamiltonian approximation coincides with (7.25), but in the dissipative unfolding a codimension-two saddle connection shows up, with vanishing divergence at the saddle-point at the moment of bifurcation. Moreover, now two limit cycles and a corresponding saddle–node are involved. The aim of this section is to give a heuristic extension of the present paper to this codimension-two case, which can be approached similarly to the codimension-one case above. In a future paper we shall give a more detailed analysis of this and related subjects.

7.7.1 The codimension-one saddle connection

Let us briefly return to [5, 6], describing how this case fits in with the current analysis. After appropriate scaling, we end up with a one-parameter family

\[ \dot x = \varepsilon y \]
\[ \dot y = \varepsilon(x^2 - 1) + \varepsilon^2 y(\nu - x) + O(\varepsilon^3). \]

As said before, Γ is a saddle connection, where the bifurcation is generically of codimension one. There exists a local section σ, transversal to Γ, with the following properties. For a local variable z, such that Γ cuts σ at z = 0, the return map on σ is given by

\[ z \mapsto z + \delta(z,\lambda) \]

Figure 7.5. The Hamiltonian vector field (7.25). (Axes x and y; the values −1 and 1 are marked.)

where λ = (ε, α) and with displacement function $\delta(z,\lambda) = \varepsilon\bar\delta(z,\lambda)$, with

\[ \bar\delta(z,\lambda) = \alpha + \beta(\varepsilon,\alpha)\,z\,\omega(z,\lambda) + O(z), \qquad \omega(z,\lambda) = \frac{z^{-\lambda} - 1}{\lambda} \tag{7.26} \]

with β(0, 0) ≠ 0. The bifurcation set then reduces to the line C = {α = 0} of saddle connections. Without loss of generality we may assume that a hyperbolic limit cycle exists for α > 0, for some z > 0. For ε = 0 we have a Hamiltonian vector field $X_H$ with the saddle connection Γ and an annulus of cycles cutting σ, see figure 7.5. The compact annulus including Γ may be considered as a generalization of the regular annulus as considered in the present paper. Note that the period T(z) of the Hamiltonian cycle tends to ∞ as z ↓ 0. Indeed, T(z) ∼ c|ln z| for a positive constant c. Also, the displacement map is no longer analytic at z = 0, as can be seen by formula (7.26).

In [6] asymptotic expressions at z = 0 were computed of all parameters $\Omega_1$, $\Omega_2$, Fl, . . . , as they enter the proof of the main theorem 7.1, carried out with p = 2, q = 1, r = 2. These computations yield an estimate similar to (7.19). As a consequence the bifurcation line C, for the non-autonomous system (or the diffeomorphism), is replaced by an exponentially narrow horn. For details see [6].

7.7.2 The codimension-two saddle connection

Let us now consider the ‘next’ case of a codimension-two saddle connection. This case shows up in some local codimension-three bifurcations and, for instance, in the Bogdanov–Takens bifurcation of codimension three. For the latter purpose we have to consider a generic three-parameter family of local diffeomorphisms (or time-periodic, non-autonomous equations), whose formal autonomous normal form is associated with the codimension-three Bogdanov–Takens family $X_\lambda$, as

Figure 7.6. Trace of the cone-like bifurcation diagram of (7.27). (Labels shown in the figure: the axes ν0 and ν1, the curves H, C and L, and the points b1, b2, c2, h2 and d.)

studied in [13]

\[ \dot x = y \]
\[ \dot y = x^2 + \mu + y(\nu_0 + \nu_1 x \pm x^3 + x^4h(x)) + \cdots \tag{7.27} \]

with λ = (µ, ν0, ν1). The bifurcation diagram now is a cone over a trace inside a transversal sphere in the parameter space; see figure 7.6. This diagram contains various lines of codimension-one bifurcations, and several codimension-two bifurcation points. In particular H denotes a line of Hopf bifurcations, C a line of saddle connections and L a line of saddle–node bifurcations of limit cycles. Furthermore, b1 and b2 denote Bogdanov–Takens points, c2 is a point of degenerate saddle connection, while h2 is a degenerate Hopf bifurcation point.

The main theorem 7.1 of this paper can be applied in the complement of any of the codimension-two points b1, b2, c2 and h2. Moreover, the result of [6] applies near the points b1 and b2. We now focus on a neighbourhood of the saddle connection point c2. The central part of the bifurcation diagram in figure 7.6, which includes a neighbourhood of the triangle d c2 h2, may be studied with the help of the rescaling

\[ x = \varepsilon^2\bar x, \quad y = \varepsilon^3\bar y, \quad \mu = -\varepsilon^4, \quad \nu_0 = \varepsilon^4\bar\nu_0, \quad \nu_1 = \varepsilon^6\bar\nu_1. \]

This reduces the family $X_\lambda$ to $\varepsilon\bar X_{\varepsilon,\bar\nu_0,\bar\nu_1}$, where $\bar X$ is given by

\[ \dot{\bar x} = \bar y \]
\[ \dot{\bar y} = \bar x^2 - 1 + \varepsilon^4\bar y(\bar\nu_0 + \bar\nu_1\bar x \pm \bar x^3) + O(\varepsilon^5). \tag{7.28} \]
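At ε = 0 the rescaled family reduces to the Hamiltonian field ẋ = y, ẏ = x² − 1 of (7.25). The following sketch checks two elementary facts about it: the function H(x, y) = y²/2 − x³/3 + x (one possible Hamiltonian, written out here as an assumption since the text does not state it) is conserved along orbits, and the equilibria at x = 1 and x = −1 are a saddle and a centre, respectively. The integrator and step size are illustrative choices.

```python
# Illustrative sketch: H below is one Hamiltonian for x' = y, y' = x^2 - 1;
# the Runge-Kutta integrator and its step size are arbitrary choices.


def hamiltonian(x: float, y: float) -> float:
    return 0.5 * y * y - x**3 / 3.0 + x


def field(x: float, y: float):
    """Right-hand side x' = dH/dy, y' = -dH/dx."""
    return y, x * x - 1.0


def rk4_step(x: float, y: float, dt: float):
    k1x, k1y = field(x, y)
    k2x, k2y = field(x + 0.5 * dt * k1x, y + 0.5 * dt * k1y)
    k3x, k3y = field(x + 0.5 * dt * k2x, y + 0.5 * dt * k2y)
    k4x, k4y = field(x + dt * k3x, y + dt * k3y)
    return (x + dt * (k1x + 2 * k2x + 2 * k3x + k4x) / 6.0,
            y + dt * (k1y + 2 * k2y + 2 * k3y + k4y) / 6.0)


def classify_equilibrium(x_eq: float) -> str:
    """The Jacobian [[0, 1], [2x, 0]] has eigenvalues with lambda^2 = 2x."""
    return "saddle" if 2.0 * x_eq > 0.0 else "centre"


def energy_drift(x0: float, y0: float, t_end: float = 10.0, dt: float = 1e-3) -> float:
    """Absolute change of H along a numerically integrated orbit."""
    x, y = x0, y0
    h0 = hamiltonian(x, y)
    for _ in range(int(round(t_end / dt))):
        x, y = rk4_step(x, y, dt)
    return abs(hamiltonian(x, y) - h0)


if __name__ == "__main__":
    print(classify_equilibrium(1.0), classify_equilibrium(-1.0))
    print(energy_drift(-1.5, 0.0))
```

The starting point (−1.5, 0) lies inside the homoclinic loop of the saddle, whose level set H = 2/3 meets the x-axis again at x = −2, so the computed orbit is one of the Hamiltonian cycles.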

Figure 7.7. The section σ and the limit cycles γi,e. (Labels shown in the figure: σ, γi, γe and the saddle-point s.)

For simplicity we omit all bars and conclude that the original non-autonomous family $X_\lambda$ + non-autonomous terms of higher order in (x, y) now becomes

\[ X_{\varepsilon,\lambda} = X_H + \varepsilon^4X^D_\lambda + \varepsilon^5R(x,y,\lambda,t,\varepsilon). \]

Here λ = (ν0, ν1) and R is the non-autonomous perturbation, T-periodic in time. As before, the Hamiltonian vector field reads

\[ X_H = y\frac{\partial}{\partial x} + (x^2 - 1)\frac{\partial}{\partial y}; \]

again see figure 7.5. Moreover,

\[ X^D_\lambda = y(\nu_0 + \nu_1x \pm x^3)\frac{\partial}{\partial y} \]

is the dissipative autonomous part.

Again the idea is to apply the main theorem 7.1 to any compact annulus of Hamiltonian cycles, using p = 5, q = 1, r = 5, keeping track of the asymptotics near the polycycle Γ. For this purpose we choose a section σ, transversal to Γ, parametrized by z, see figure 7.7. The displacement function for $X_{\varepsilon,\lambda} = X_H + \varepsilon^kX^D_\lambda + O(\varepsilon^{k+1})$ where k = 4, at an arbitrary order N is given by

\[ \delta(z,\lambda,\varepsilon) = \sum_{i,j\le N}\alpha_{i,j}(\lambda,\varepsilon)\,z^{i}\omega^{j} + \Phi_N(z,\lambda,\varepsilon). \]

As before, $\omega(z,\lambda) = (z^{-\lambda} - 1)/\lambda$. By versality of the unfolding, we have $\delta = \varepsilon^4\bar\delta$ with

\[ \bar\delta(z,\lambda,\varepsilon) = \alpha_0(\lambda,\varepsilon) - \alpha_1(\lambda,\varepsilon)z\omega + \alpha_2(\lambda,\varepsilon)z + o(z) \]

Figure 7.8. Local bifurcation diagram near the degenerate saddle connection point c2. (Labels shown in the figure: the axes α0 and α1, the curves L and C, the horn T and the point c2.)

where the coefficients $\alpha_j(\lambda,\varepsilon)$, for j = 0, 1, 2, are analytic, such that α0(λ, 0) ≡ 0 ≡ α1(λ, 0) and α2(0, 0) ≠ 0, while the map λ ↦ (α0(λ, 0), α1(λ, 0)) is invertible in a neighbourhood of λ0, the coordinate of the bifurcation point c2. This enables us to choose (α0, α1) as local parameters.

From here, for brevity, our considerations become more heuristic. Since ω is an analytic unfolding of −ln z, we expect to obtain a reasonable approximation of the phenomena when replacing $\bar\delta$ by

\[ \bar\delta(z,\alpha) = \alpha_0 - \alpha_1z\ln z + z. \]

This expectation is strongly supported by [30]. The local bifurcation diagram near the point c2: (α0, α1) = (0, 0), is given by the line C = {α0 = 0} and by the saddle–node line L of limit cycles, given by

\[ L = \left\{\bar\delta = 0 = \frac{\partial\bar\delta}{\partial z}\right\} \]

or equivalently

\[ \alpha_0 - \alpha_1z\ln z + z = 0, \qquad -\alpha_1(\ln z + 1) + 1 = 0. \]

The saddle–node of limit cycles corresponds to the z-value $z_{\alpha_1} = e^{-1}e^{1/\alpha_1}$; substituting this value into the first equation gives $\alpha_0 = -\alpha_1z_{\alpha_1}$, so the line L has the equation

\[ L = \{\alpha_0 = A_{\alpha_1} = e^{-1}|\alpha_1|e^{1/\alpha_1};\ \text{for } \alpha_1 \le 0\}. \]
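The fold computation for the model displacement function α0 − α1 z ln z + z can be verified numerically. The sketch below checks that the fold point solves both defining equations of L and, for a parameter value inside the horn, locates the two roots corresponding to the inner and outer limit cycles by bisection. The function names, the bisection routine and the sample value α1 = −0.5 are illustrative assumptions.

```python
import math


def delta_bar(z: float, alpha0: float, alpha1: float) -> float:
    """Model displacement function alpha0 - alpha1 z ln z + z."""
    return alpha0 - alpha1 * z * math.log(z) + z


def d_delta_bar_dz(z: float, alpha1: float) -> float:
    return -alpha1 * (math.log(z) + 1.0) + 1.0


def z_fold(alpha1: float) -> float:
    """z-value of the saddle-node of limit cycles, e^(-1) e^(1/alpha1)."""
    return math.exp(1.0 / alpha1 - 1.0)


def fold_alpha0(alpha1: float) -> float:
    """Value of alpha0 on the line L, equal to -alpha1 * z_fold(alpha1)."""
    return abs(alpha1) * z_fold(alpha1)


def bisect(f, lo: float, hi: float, steps: int = 200) -> float:
    """Plain bisection; assumes f changes sign on [lo, hi]."""
    f_lo = f(lo)
    for _ in range(steps):
        mid = 0.5 * (lo + hi)
        f_mid = f(mid)
        if (f_mid > 0.0) == (f_lo > 0.0):
            lo, f_lo = mid, f_mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


def limit_cycle_roots(alpha0: float, alpha1: float):
    """Roots z_i < z_fold < z_e of delta_bar for 0 < alpha0 < fold_alpha0."""
    zf = z_fold(alpha1)
    f = lambda z: delta_bar(z, alpha0, alpha1)
    return bisect(f, 1e-12, zf), bisect(f, zf, math.exp(1.0 / alpha1))


if __name__ == "__main__":
    a1 = -0.5
    a0 = 0.5 * fold_alpha0(a1)
    print(z_fold(a1), limit_cycle_roots(a0, a1))
```

For α0 between 0 and the fold value the displacement function is positive near z = 0 and at z = e^{1/α1} and negative at the fold point, which is what brackets the two roots.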


Note that L at c2 has flat contact with C. In the horn T two limit cycles γi and γe co-exist, where γi is nearest to the saddle-point s. These limit cycles correspond to the z-values $z_i$ and $z_e$, respectively, which are roots of the equation

\[ \alpha_0 = \varphi_{\alpha_1}(z) = \alpha_1z\ln z - z; \]

see figure 7.8. Therefore, for $0 \le \alpha_0 \le A_{\alpha_1}$, one has the estimates

\[ \left(\frac{\alpha_0}{|\alpha_1|}\right)^{\kappa} \le z_i \le z_{\alpha_1} = e^{-1}e^{1/\alpha_1}, \qquad e^{-1}e^{1/\alpha_1} \le z_e \le e^{1/\alpha_1} \]

where κ > 1 is an arbitrary constant. Note that for λ = (α0, α1) ∈ T the (vertical) distances from λ to C and L are α0 and $A_{\alpha_1} - \alpha_0$, respectively.

We follow the same strategy as in the previous sections, using the same names and symbols. Indeed, by direct computations one obtains

\[ |z_i - z_e| \ge *\,|\alpha_1|^{-1/2}e^{(\kappa-\frac12)/\alpha_1}(A_{\alpha_1} - \alpha_0)^{1/2} \]

and similarly for the Floquet exponents $\mathrm{Fl}_{i,e}(\lambda) = \alpha_1(\ln z_{i,e} + 1) + \kappa$ of $\gamma_{i,e}$

\[ |\mathrm{Fl}_{i,e}(\lambda)| \ge *\,|\alpha_1|^{1/2}e^{(\frac32-\kappa)/\alpha_1}(A_{\alpha_1} - \alpha_0)^{1/2}. \tag{7.29} \]

Here and elsewhere, ∗ denotes some positive constant that we do not need to remember. Moreover, the return time function T(z, λ, ε) on the transversal σ is equivalent to c(λ, ε)|ln z|, for some constant c ≠ 0. From this it follows that

\[ \Omega_2^{i} \ge \frac{2\pi}{c}\,\frac{1}{|\ln(\alpha_0/|\alpha_1|)|} \quad\text{while}\quad \Omega_2^{e} \ge *\,|\alpha_1| \tag{7.30} \]

and

\[ \Omega_2^{i,e} \le *\,|\alpha_1|. \tag{7.31} \]

From the Floquet estimates (7.29) and (7.30) we obtain

\[ \Omega_1^{i} \ge *\,|\alpha_1|^{1/2}e^{(\frac32-\kappa)/\alpha_1}|\ln\alpha_0|^{-1}|A_{\alpha_1} - \alpha_0|^{1/2} \]
\[ \Omega_1^{e} \ge *\,|\alpha_1|^{3/2}e^{(\frac32-\kappa)/\alpha_1}|A_{\alpha_1} - \alpha_0|^{1/2}. \tag{7.32} \]

The estimates on the distances $z_i$, $z_e - z_i$ and the Floquet exponents $\mathrm{Fl}_{i,e}$ lead to the following choices of annuli $A_{i,e}$ around the limit cycles $\gamma_{i,e}$, denoting by $|A_{i,e}|$ the corresponding diameters inside σ. We obtain

\[ |A_e| \sim \varepsilon^{k}|\alpha_1|e^{(\kappa-\frac12)/\alpha_1}|A_{\alpha_1} - \alpha_0|^{1/2} \]
\[ |A_i| \sim \varepsilon^{k}\alpha_0^{\kappa}|A_{\alpha_1} - \alpha_0|^{1/2}. \]


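Now we are able to estimate the times $\tau_{i,e}(u)$ related to the invariant foliations of the $\gamma_{i,e}$ as

\[ |\tau_e(u)| = O(|\alpha_1|^{1/2}e^{1/\alpha_1}) \]
\[ |\tau_i(u)| = O(|\alpha_1|^{-1/2}e^{(\frac32-\kappa)/\alpha_1}\alpha_0^{\kappa}) \]
\[ \left|\frac{\partial\tau_{i,e}}{\partial u}\right| = O(\varepsilon^{-k}|\alpha_1|^{-1/2}e^{(\frac32-\kappa)/\alpha_1}|A_{\alpha_1} - \alpha_0|^{-1/2}). \]

For the diffeomorphism $h_{i,e}$ that linearizes the Poincaré map we moreover obtain

\[ |h_{i,e}(z)| = O(1) \]
\[ |h_i'(z)| = O(\varepsilon^{-k}\alpha_0^{-\kappa}|A_{\alpha_1} - \alpha_0|^{-1/2}) \]
\[ |h_e'(z)| = O(\varepsilon^{-k}|\alpha_1|^{-1}e^{(\kappa-\frac12)/\alpha_1}|A_{\alpha_1} - \alpha_0|^{-1/2}). \]

To estimate the flow (ξ, η) of the averaged vector field $\varepsilon^{-q}Z$ in the annuli $A_{i,e}$, we have to take into consideration that s is a singular point at a distance of at least $z_i$. Then it follows that

\[ \|\varepsilon^{-q}Z\| \ge *\,|z_i| \ge *\left(\frac{\alpha_0}{|\alpha_1|}\right)^{\kappa} \]

and, hence, letting (z, θ) be coordinates along $\gamma_{i,e}$,

\[ \left\|\frac{\partial(\xi,\eta)}{\partial(z,\theta)}\right\| = O\left(\left(\frac{\alpha_0}{|\alpha_1|}\right)^{-\kappa(\sqrt2+\delta)}\right), \qquad \left|\frac{\partial t}{\partial z}\right| = O\left(\left(\frac{\alpha_0}{|\alpha_1|}\right)^{-\kappa(1+\sqrt2+\delta)}\right). \]

Using the above constants and also the fact that $\mathrm{Fl}_e = O(1)$ and $\mathrm{Fl}_i = O(|\alpha_1||\ln\alpha_0|)$, we are ready to estimate the remainders $\Theta_1$, $\Theta_2$. Indeed, in the annulus $A_i$ we have

\[ |\Theta_{1,2}| \le *\,\varepsilon^{-k}e^{-c/\varepsilon}|\alpha_0|^{-\kappa(2+\sqrt2+\delta)}|\alpha_1|^{\kappa(1+\sqrt2+\delta)}|A_{\alpha_1} - \alpha_0|^{-1/2}. \tag{7.33} \]

Note that, for a fixed value of α1 < 0, this bound diverges at a polynomial rate (as in the regular case) both as α0 → 0, i.e., as γi approaches s, and as $A_{\alpha_1} - \alpha_0 \to 0$, i.e., as γe approaches γi. Next, in the annulus $A_e$ we get

\[ |\Theta_{1,2}| \le *\,\varepsilon^{-k}e^{-c/\varepsilon}e^{(\sqrt2+\frac12+\kappa+\delta)/|\alpha_1|}|A_{\alpha_1} - \alpha_0|^{-1/2}. \tag{7.34} \]

Note that in this case the bound diverges as $A_{\alpha_1} - \alpha_0 \to 0$, at an exponential rate in 1/|α1|.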

Figure 7.9. ‘Chaotic’ layer around the local bifurcation set. (Labels shown in the figure: the curves L and C, the horn T, the point c2 and the region marked ‘Chaotic layer’.)

Normal hyperbolicity conditions for the existence of perturbed invariant circles $\tilde\gamma_{i,e}$ of class $C^\ell$ are of the form

\[ \left|\frac{\Theta_i}{\Omega_i}\right| \le K \quad\text{for } i = 1, 2. \]

Using the estimates (7.32)–(7.34), for $\tilde\gamma_i$ we obtain

\[ e^{(\frac32-\kappa)/|\alpha_1|}e^{-c/\varepsilon} \le *\,K\,|\alpha_0|^{\kappa(2+\sqrt2+\delta)}(A_{\alpha_1} - \alpha_0) \tag{7.35} \]

and, similarly, for $\tilde\gamma_e$

\[ e^{(\sqrt2+\frac12+2\kappa+\delta)/|\alpha_1|}e^{-c/\varepsilon} \le *\,K\,(A_{\alpha_1} - \alpha_0). \tag{7.36} \]

Let us consider the conditions (7.35) and (7.36) more closely. Observe that $A_{\alpha_1} - \alpha_0$ is the distance between γe and the line L within σ. So, for any choice of α1 < 0, the hyperbolic invariant curve $\tilde\gamma_e$ exists outside an exponentially flat horn around a line obtained from L by averaging. However, the coefficient

\[ e^{(\sqrt2+\frac12+2\kappa+\delta)/|\alpha_1|} \]

explodes as |α1| → 0. A similar conclusion holds for $\tilde\gamma_i$. Note that the right-hand side of (7.35) is of order $O(\alpha_0(A_{\alpha_1} - \alpha_0))$. Observe that $\alpha_0(A_{\alpha_1} - \alpha_0)$ is equivalent to the distance of the point (α0, α1) inside the horn T to the boundary of T.

A consequence of the occurrence of the coefficient $e^{\nu/|\alpha_1|}$, with ν > 0, in (7.36) and (7.35) and of the flat contact between the curves C and L, is that the hyperbolicity is limited by a condition |α1| ≥ ∗ε. Indeed, to fix thoughts, in the inequality (7.36) take into account that $|A_{\alpha_1} - \alpha_0| \le e^{-1}|\alpha_1|e^{1/\alpha_1}$, yielding

\[ -\frac{c}{\varepsilon} + \frac{\frac32+\sqrt2+2\kappa+\delta}{|\alpha_1|} \le 0 \]

or, equivalently,

\[ |\alpha_1| \ge \frac{\frac32+\sqrt2+2\kappa+\delta}{c}\,\varepsilon. \tag{7.37} \]

Similar conditions exist for $\tilde\gamma_i$.

Remark 7.24. Since in this application q = 1, by the conclusions of the main theorem 7.1, we have super-exponential control on the remainder term. It follows that the ‘chaotic’ layer, which is asymptotically flat around the bifurcation set at some distance from the saddle connection point c2, enters the horn T at a rate |α1| ∼ ∗ε/ln ε, which is sharper than the estimate (7.37). This fact is illustrated in figure 7.9. Of course, we have only shown that outside this layer we have hyperbolic behaviour. Nevertheless, it seems reasonable to guess that chaotic behaviour may appear inside the horn T, at a distance |α1| which is of order ε/log ε. Also compare our remarks concluding section 7.3.

Acknowledgements

The authors thank Jean-Marie Lions, Floris Takens and Florian Wagener for discussions during the preparation of this paper. Also, we thank Martijn van Noort for his help with some technicalities.

References

[1] Arnol’d V I 1983 Geometrical Methods in the Theory of Ordinary Differential Equations (Berlin: Springer)
[2] Broer H W and Golubitsky M 2001 The geometry of resonance tongues: a singularity approach Preprint (http://www.math.uh.edu/ dynamics/reprints.html)
[3] Broer H W, Huitema G B and Sevryuk M B 1996 Quasi-Periodic Motions in Families of Dynamical Systems (Lecture Notes in Mathematics 1645) (Berlin: Springer)
[4] Broer H W, Huitema G B, Takens F and Braaksma B L J 1990 Unfoldings and bifurcations of quasi-periodic tori Mem. Am. Math. Soc. 83(421) 1–175
[5] Broer H W, Roussarie R and Simó C 1993 On the Bogdanov–Takens bifurcation for planar diffeomorphisms Proc. Equadiff 91 ed C Perelló, C Simó and J Solà-Morales (Singapore: World Scientific) pp 81–92
[6] Broer H W, Roussarie R and Simó C 1996 Invariant circles in the Bogdanov–Takens bifurcation for diffeomorphisms Ergod. Theor. Dynam. Syst. 16 1147–72
[7] Broer H W, Simó C and Tatjer J C 1998 Towards global models near homoclinic tangencies of dissipative diffeomorphisms Nonlinearity 11 667–770
[8] Broer H W and Tangerman F M 1986 From a differentiable to a real analytic perturbation theory, applications to the Kupka Smale theorems Ergod. Theor. Dynam. Syst. 6 345–62
[9] Broer H W and Takens F 1989 Formally symmetric normal forms and genericity Dynamics Reported vol 2, ed U Kirchgraber and H O Walther (New York: Wiley) pp 39–59
[10] Chenciner A 1985 Bifurcations de points fixes elliptiques I Publ. Math. IHES 61 67–127
[11] Chenciner A 1985 Bifurcations de points fixes elliptiques II: Orbites périodiques et ensembles de Cantor invariants Inv. Math. 80(1) 81–106
[12] Chenciner A 1988 Bifurcations de points fixes elliptiques III: Orbites périodiques de ‘petites’ périodes et élimination résonnante des couples de courbes invariantes Publ. Math. IHES 66 5–91
[13] Dumortier F, Roussarie R and Sotomayor J 1987 Generic 3-parameter families of vector fields on the plane, unfolding a singularity with nilpotent linear part: the cusp case of codimension 3 Ergod. Theor. Dynam. Syst. 7 375–413
[14] Dumortier F, Roussarie R and Sotomayor J 1991 Generic 3-parameter families of vector fields, unfoldings of saddle, focus and elliptic singularities with nilpotent linear parts Bifurcation of Planar Vector Fields: Nilpotent Singularities and Abelian Integrals (Lecture Notes in Mathematics 1480) ed F Dumortier, R Roussarie, J Sotomayor and H Zoladek (Berlin: Springer) pp 1–164
[15] Gibson C G 1979 Singular Points of Smooth Mappings (Research Notes in Mathematics 25) (London: Pitman)
[16] Guckenheimer J and Holmes P 1983 Nonlinear Oscillations, Dynamical Systems and Bifurcations of Vector Fields (New York: Springer)
[17] Hirsch M W, Pugh C C and Shub M 1977 Invariant Manifolds (Lecture Notes in Mathematics 583) (Berlin: Springer)
[18] Jorba A and Villanueva J 1997 On the normal behaviour of partially elliptic lower-dimensional tori of Hamiltonian systems Nonlinearity 10 783–822
[19] Krauskopf B 1995 On the 1 : 4 resonance problem: analysis of the bifurcation set PhD Thesis University of Groningen
[20] Krauskopf B 1994 Bifurcation sequences at 1 : 4 resonance: an inventory Nonlinearity 7 1073–91
[21] Krauskopf B 1994 The bifurcation set for the 1 : 4 resonance problem Exp. Math. 3(2) 107–28
[22] Krauskopf B 1997 Bifurcations at ∞ in a model for 1 : 4 resonance Ergod. Theor. Dynam. Syst. 17(4) 899–931
[23] Krauskopf B 2001 Strong resonances and Takens’s Utrecht preprint, chapter 4 of this volume
[24] Łojasiewicz S 1959 Sur le problème de la division Stud. Math. 8 87–136
[25] Neishtadt A I 1984 The separation of motions in systems with rapidly rotation phase J. Appl. Math. Mech. 48 133–9
[26] Newhouse S E 1974 Diffeomorphisms with infinitely many sinks Topology 13 9–18
[27] Newhouse S, Palis J and Takens F 1983 Bifurcations and stability of families of diffeomorphisms Publ. Math. IHES 57 1–71
[28] Palis J and de Melo W C 1982 Geometric Theory of Dynamical Systems (Berlin: Springer)
[29] Palis J and Takens F 1993 Hyperbolicity and Sensitive Chaotic Dynamics at Homoclinic Bifurcations (Cambridge Studies in Advanced Mathematics 35) (Cambridge: Cambridge University Press)
[30] Roussarie R 1997 Smoothness properties of bifurcation diagrams Publ. Matemàtiques 41 243–68
[31] Simó C, Broer H W and Roussarie R 1992 A numerical survey on the Takens–Bogdanov bifurcation for diffeomorphisms Eur. Conf. on Iteration Theory vol 89, ed C Mira, N Netzer, C Simó and G Targonski (Singapore: World Scientific) pp 320–34
[32] Takens F 1973 Unfoldings of certain singularities of vector fields: generalized Hopf bifurcations J. Diff. Eqns 14 476–93
[33] Takens F 1974 Singularities of vector fields Publ. Math. IHES 43 47–100
[34] Takens F 1974 Forced oscillations and bifurcations Applications of Global Analysis I Commun. Math. Inst. University of Utrecht 3 1–59 (reprinted in chapter 1 of this volume)
[35] Takens F 1992 Abundance of generic homoclinic tangencies in real-analytic diffeomorphisms Bol. Soc. Bras. Mat. 22(2) 191–214
[36] Tougeron J C 1972 Idéaux de fonctions différentiables Ergebnisse der Mathematik und ihrer Grenzgebiete vol 71 (Berlin: Springer)
[37] Wagener F 2001 Semilocal analysis of the k : 1 and k : 2 resonances in quasiperiodically forced systems, chapter 5 in this volume

Chapter 8 Takens–Bogdanov bifurcations without parameters and oscillatory shock profiles Bernold Fiedler and Stefan Liebscher Free University Berlin

Bifurcation theory deals with the dynamics associated with vector fields x˙ = f x (x, λ) λ˙ = 0

(8.1)

which depend on one or several real parameters λ ∈ Rm . Mostly the analysis is local in λ. Moreover, x ∈ Rn , or x in an n-dimensional manifold, is near a fairly simple time invariant object. See for example [13, 27, 28, 29, 31, 34], and the many references there. Frequently, for example, it is assumed that 0 = f x (0, λ)

(8.2)

so that $x = 0$ is a given ‘trivial’, or ‘primary’, equilibrium solution, independently of the parameter $\lambda$. Clearly, this gives rise to a manifold of trivial equilibria of dimension $m = \dim \lambda$. In addition, the equation $\dot{\lambda} = 0$ trivially provides an invariant foliation $\lambda = \text{constant}$ of phase space $(x, \lambda)$, transverse to the equilibrium manifold $(0, \lambda)$.

In the present paper, we abandon this foliation by constant parameter values $\lambda$, but keep the requirement of a trivial equilibrium manifold in effect. Replacing $\lambda$ by $y \in \mathbb{R}^m$ to mark that difference, we consider

\[ \dot{x} = f^x(x, y), \qquad \dot{y} = f^y(x, y) \tag{8.3} \]

with the convenient abbreviation $\dot{z} = f(z)$, where $z = (x, y)$ and $f = (f^x, f^y)$. As before, we assume the existence of a manifold of trivial equilibria

\[ 0 = f(0, y) \tag{8.4} \]

for all $y \in \mathbb{R}^m$. We also assume $f \in C^\kappa$ to be sufficiently smooth throughout; $\kappa \ge 5$ is more than enough.

The celebrated Takens–Bogdanov bifurcation deals with (8.1) for two real parameters $\lambda = (\lambda_1, \lambda_2) \in \mathbb{R}^2$ and under the (generic) assumption of an algebraically double zero eigenvalue

\[ D_x f^x(x = 0, \lambda = 0) = \begin{pmatrix} 0 & 0 \\ 1 & 0 \end{pmatrix}. \tag{8.5} \]

See [3, 7, 8, 43, 44] and also [9, 10]. For more recent accounts see also [29, 4]. The standard unfolding in parameter space $\lambda = (\lambda_1, \lambda_2)$ involves a curve of stationary saddle–node bifurcations, and half-arcs of Hopf bifurcations and homoclinic orbits, respectively. In the present paper, in contrast, we will drop the foliation $\dot{\lambda} = 0$, but treat (8.3), (8.4) with $x, y \in \mathbb{R}^2$ and a nilpotent linearization (8.5) in complete analogy to the celebrated Takens–Bogdanov bifurcation.

8.1 Examples

We contend that the problem of bifurcation from manifolds of equilibria, even in the absence of any parameter foliations $\dot{\lambda} = 0$, and degenerate as it may seem, arises quite naturally in applications. We give three examples next. For further details we refer to section 8.12 below, as well as to [1, 2, 17, 21, 20, 22, 36, 37].

Example 8.1. Our first example arises in population dynamics, game theory, etc; see also [17]. For $x = (x_1, \ldots, x_n) \in \mathbb{R}^n_+$ let $B$ be a degenerate real $(n \times n)$-matrix, say with one-dimensional right/left kernel spanned by $r, l \in \mathbb{R}^n$, respectively. Consider the system

\[ \dot{x}_j = x_j \cdot ((Bx)_j - b_j) \tag{8.6} \]

$j = 1, \ldots, n$. For $b = (b_1, \ldots, b_n) \in \operatorname{range} B$, we obtain a line of equilibria

\[ x = x^0 + y \cdot r, \qquad y \in \mathbb{R} \tag{8.7} \]

which may intersect the positive orthant. Analysis is facilitated, in this example, by the presence of a first integral (alias, conserved quantity) given by $\prod_j x_j^{l_j}$. Higher-dimensional kernels of $B$ may clearly give rise to higher-dimensional equilibrium subspaces.

Example 8.2. Our second example arises in the study of viscous profiles of nonlinear hyperbolic conservation laws, mixed with stiff balance laws. We consider travelling wave solutions $u(t, \xi) = U(\varepsilon^{-1}(\xi - st))$ of systems

\[ \partial_t u + \partial_\xi F(u) = \varepsilon^{-1} G(u) + \varepsilon\delta\,\partial_\xi^2 u \tag{8.8} \]

where $\varepsilon \searrow 0$ indicates a small parameter. The associated travelling wave system is independent of $\varepsilon$ and reads

\[ \delta U'' + (sU - F(U))' + G(U) = 0. \tag{8.9} \]

The case of vanishing $G(u) \equiv 0$ of pure conservation laws has been studied most widely; see for example [42]. Putting $z = (U, U')$, we obtain a trivial equilibrium manifold $(U, 0)$ of dimension $\dim U = \tfrac{1}{2} \dim z$. Heteroclinic solutions which converge to different limits $(u_\pm, 0)$ in this manifold, for time tending to $\pm\infty$, are called viscous shock profiles of the Riemann problem with Riemann data $u_\pm$. Indeed, the solution $U(\varepsilon^{-1}(\xi - st))$ then converges in the limit $\varepsilon \searrow 0$ to a discontinuous weak solution of (8.8), a shock, which propagates at constant speed $s$. The analysis is greatly facilitated by the first integrals

\[ \delta U' + sU - F(U) \equiv \text{constant} \tag{8.10} \]

in this case. The opposite extreme, where $G(u) = 0$ only holds for isolated points $u$, does not give rise to equilibrium manifolds of positive dimension. A typical case, however, arises when some, but not all, components $G_j(u)$ vanish identically. This corresponds to conservation laws for some components $u_j$, whereas the remaining $u$-components encounter source terms. As a consequence the equilibrium system

\[ G(u) = 0 \tag{8.11} \]

becomes under-determined and equilibrium manifolds $z = (u, 0) \in M$ usually appear. The dimension $m$ of $M$ will typically be given by the number of conservation laws. For a detailed analysis of the case of single conservation laws, alias lines of equilibria, see [21, 20, 22, 36, 37], as well as sections 8.2 and 8.12 below. We explicitly mention the appearance of linearly stable weak viscous shock-profiles, which violate the Lax entropy condition and are oscillatory, in marked contrast to the case $G \equiv 0$ of pure conservation laws.

Example 8.3. Our third example is based on an observation for coupled oscillators due to [1], see also [2, 22, 36]. Let $i \in \{\pm 1, \ldots, \pm(m+1)\}$ denote the vertices of an $(m+1)$-dimensional $2^{m+1}$-hedron $C$. The neighbours $N_i$ of vertex $i$ are all vertices $i'$, except $i$ itself and $-i$. Consider the coupled system

\[ \dot{u}_i = F\Bigl(u_i, \sum_{i' \in N_i} u_{i'}\Bigr). \tag{8.12} \]

If $F(\cdot, 0)$ is odd, $F(-u, 0) = -F(u, 0)$, then the coupled oscillator system (8.12) possesses an invariant subspace, where the dynamics is described by an $(m+1)$-fold direct product of the uncoupled flows

\[ \dot{u} = F(u, 0). \tag{8.13} \]

Indeed the subspace given by

\[ u_{-i} = -u_i \tag{8.14} \]

for all $i$ is invariant and eliminates any coupling. For examples on graphs more intricate than $C$ see [2].

Suppose the uncoupled dynamics (8.13) possesses a periodic orbit. This gives rise to an $(m+1)$-dimensional invariant torus of the coupled system (8.12), foliated by periodic orbits. In a (local) Poincaré cross section the torus manifests itself as an $m$-dimensional manifold $M$ of fixed points. As was pointed out in [43], in normal form up to any finite order this Poincaré map (or its second iterate) coincides with the time-one map of an autonomous vector field in the Poincaré cross section; see [12]. The manifold $M$ of fixed points then becomes a manifold $M$ of equilibria.

Alternatively, we may assume equivariance of $F$ under an $S^1$-action such that the periodic orbits of (8.13) become group orbits, alias rotating waves, under this action. Then the full vector field (8.12) pulls back to a vector field in the Poincaré section, by the Palais construction, and the Poincaré map coincides with the time-one map of the pulled back vector field; see for example [2], and for the Palais construction [24]. Again, the manifold $M$ of fixed points becomes an equilibrium manifold.

Sufficiently motivated, as we now are, to consider vector fields with equilibrium manifolds $M$ of dimension $m$, the paper is organized as follows. In section 8.2, we recall some basic results on the case of equilibrium lines, $m = 1$. The simplest case of a non-trivial eigenvalue zero, in the linearization, is addressed, as well as the two cases of Hopf bifurcation without parameters, caused by purely imaginary eigenvalues. Section 8.3 lists several possibilities for bifurcations from equilibrium planes, $m = 2$; among them, most notably, the Takens–Bogdanov bifurcation without parameters. Focusing on only this case for the rest of this paper, we briefly discuss the relevant normal form which preserves the equilibrium plane; see section 8.4.
In section 8.5, a suitable scaling provides an expansion in an artificial small blow-up parameter $\varepsilon$. To leading order $\varepsilon^0$, the resulting scaled system becomes completely integrable, in section 8.6, though not quite Hamiltonian. The resulting slow flow of first integrals, at order $\varepsilon^1$, is derived in section 8.7, relegating discussions of elliptic integrals to section 8.8. Because it resorts, in the end, to some numerical evaluation of Weierstrass functions, this analysis remains incomplete. ‘Averaging’ of the rapid oscillations in the slow flow is performed in section 8.9. Before we draw geometric conclusions on the three essentially different types of Takens–Bogdanov bifurcation without parameters, in section 8.11, we stroll the landscape of averaging, subharmonic resonance, Melnikov functions, truncation of normal forms and discretization of the ‘averaged’ vector field, in section 8.10. We conclude, in section 8.12, with an explicit example of stiff balance laws, which exhibits all three types of Takens–Bogdanov bifurcation without parameters which are derived in this paper.

8.2 Bifurcations from lines of equilibria

As a preparation for our investigation of Takens–Bogdanov bifurcations from planes of equilibria, in section 8.3, we first recall earlier results on bifurcations from lines of equilibria. We begin with the case of normal hyperbolicity of the equilibrium line, and then proceed to address the occurrence of a non-trivial zero eigenvalue as well as purely imaginary eigenvalues. In classical bifurcation theory, characterized by a foliation $\dot{\lambda} = 0$ by a scalar parameter $\lambda \in \mathbb{R}$, these latter cases would correspond to bifurcation of non-trivial equilibria and to Hopf bifurcation of small amplitude periodic oscillations, respectively.

In the notation of (8.3), (8.4) we consider vector fields

\[ \dot{x} = f^x(x, y), \qquad \dot{y} = f^y(x, y) \tag{8.15} \]

with $x \in \mathbb{R}^n$, scalar $y \in \mathbb{R}$ replacing the parameter $\lambda$, and a trivial line of equilibria

\[ 0 = f(0, y) \tag{8.16} \]

of the vector field $f = (f^x, f^y)$. In block matrix notation, the linearization $A(y)$ of $f$ at the equilibrium $(0, y) \in \mathbb{R}^{n+1}$ is given by

\[ A(y) = \begin{pmatrix} A_0(y) & 0 \\ * & 0 \end{pmatrix} \tag{8.17} \]

with the normal part $A_0(y) := D_x f^x(0, y)$. The spectrum of $A(y)$ is given by

\[ \operatorname{spec} A(y) = \{0\} \cup \operatorname{spec} A_0(y) \tag{8.18} \]

that is, by just adding a (trivial) eigenvalue zero to the spectrum of the linearization $A_0(y)$ normal to the equilibrium line. Normal hyperbolicity simply requires all eigenvalues of $A_0(y)$ to possess non-zero real part. Standard theory of normal hyperbolicity then identifies a local centre-stable manifold $W^{cs}$ of the equilibrium line which consists of all initial conditions $(x^0, y^0) \in \mathbb{R}^{n+1}$ near $\{0\} \times \mathbb{R}$, such that $x(t)$ remains small for all positive times $t$. The centre-stable manifold $W^{cs}$ is foliated by the strong stable manifolds $W^{ss}(y)$ of those $(x^0, y^0) \in \mathbb{R}^{n+1}$ near $\{0\} \times \mathbb{R}$ selected by the additional requirement

\[ \lim_{t \to +\infty} (x(t), y(t)) = (0, y). \tag{8.19} \]

Similarly, but going backwards in time instead, we obtain the foliation of the centre-unstable manifold $W^{cu}$ by strong unstable manifolds $W^{uu}(y)$; see for example [5, 18, 19, 32] for additional background and technical details. Tangent spaces to $W^{ss}(y)$ and $W^{uu}(y)$ at $(0, y)$, for example, are given by the eigenspaces of $A(y)$ corresponding to the spectrum strictly in the left and right complex half plane, respectively. By (8.18), these eigenvalues are precisely the eigenvalues of the normal part $A_0(y)$.

Figure 8.1. A normally hyperbolic line of equilibria with flow-invariant foliation. [Axes: $x_1$, $x_2$, $y$.]

An interesting generalization of the Grobman–Hartman theorem to the case of non-hyperbolic equilibria has been proved by [41]; see [4]: locally, the flow is given as a direct product of the linearized hyperbolic part with the flow in the centre manifold $W^c$, up to $C^0$ flow equivalence. Applied to our case, where $W^c = W^{cs} \cap W^{cu}$ is just the equilibrium line, this reduces the flow to

\[ \dot{x} = A_0(y)x, \qquad \dot{y} = 0. \tag{8.20} \]

Note that $y$ has in fact become a parameter in these coordinates; see figure 8.1 for a phase portrait.

We consider the two generic possibilities of a non-hyperbolic normal part $A_0(y)$ next: a simple (non-trivial) eigenvalue zero, and a simple purely imaginary pair $\pm i$, respectively, which cross the imaginary axis at non-zero speed, as $y$ increases through $y = 0$. Eliminating the foliations due to the remaining hyperbolic part of $A_0(0)$, we reduce our attention to the centre manifold and only consider scalar $x \in \mathbb{R}$ and $x = (x_1, x_2) \in \mathbb{R}^2$, respectively.

For the case of a simple eigenvalue zero, where $x \in \mathbb{R}$, see [15, 36]. Since $f(0, y) \equiv 0$, we can factor out $x$ and obtain

\[ \dot{z} = x \tilde{f}(z) \tag{8.21} \]

for $z = (x, y) \in \mathbb{R}^2$. Dividing by the Euler multiplier $x$, which reverses time for $x < 0$, we obtain the orbits of (8.21) from the orbits of

\[ z' = \tilde{f}(z). \tag{8.22} \]

Figure 8.2. Phase portrait of failure of normal hyperbolicity due to a non-trivial simple eigenvalue zero, at $y = 0$. [Axes: $x$, $y$.]

Invoking a flow box theorem for (8.22), adapted to preserve the straight equilibrium line, reduces (8.21) to

\[ \dot{x} = a x y, \qquad \dot{y} = x. \tag{8.23} \]
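As a small numerical illustration of (8.23), the following sketch integrates the system with a hand-written Runge-Kutta step. The function names, the step size and the choice $a = 1$ are assumptions made only for this example. Since $dx/dy = ay$ wherever $x \ne 0$, the quantity $x - \tfrac{1}{2} a y^2$ is constant along orbits, so for $a = 1$ an orbit through $(x, y) = (-\tfrac12, 0)$ should connect the equilibria $y = 1$ and $y = -1$.

```python
# Sketch only: integrate x' = a*x*y, y' = x from (8.23) with classical RK4.
# All names and numerical parameters are illustrative assumptions.

def field(x, y, a=1.0):
    """Right-hand side of (8.23)."""
    return a * x * y, x

def rk4_step(x, y, h, a=1.0):
    """One fourth-order Runge-Kutta step of (signed) size h."""
    k1 = field(x, y, a)
    k2 = field(x + 0.5 * h * k1[0], y + 0.5 * h * k1[1], a)
    k3 = field(x + 0.5 * h * k2[0], y + 0.5 * h * k2[1], a)
    k4 = field(x + h * k3[0], y + h * k3[1], a)
    x += h * (k1[0] + 2 * k2[0] + 2 * k3[0] + k4[0]) / 6.0
    y += h * (k1[1] + 2 * k2[1] + 2 * k3[1] + k4[1]) / 6.0
    return x, y

def invariant(x, y, a=1.0):
    """x - a*y^2/2, constant along orbits of (8.23)."""
    return x - 0.5 * a * y * y

def integrate(x, y, t_end, h=1e-3, a=1.0):
    """Integrate up to time t_end; use negative t_end and h to go backwards."""
    for _ in range(int(round(t_end / h))):
        x, y = rk4_step(x, y, h, a)
    return x, y
```

Running `integrate(-0.5, 0.0, 20.0)` forwards and `integrate(-0.5, 0.0, -20.0, h=-1e-3)` backwards approximates the two end points of one heteroclinic orbit in the half plane $x < 0$.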

Note the normal part of the linearization

\[ A_0(y) = D_x f^x(0, y) = ay \tag{8.24} \]

and $a \neq 0$, by transverse crossing of the zero ‘eigenvalue’ $ay$ of $A_0(y)$ at $y = 0$. To avoid the case $\dot{y} = f^y(z) \equiv 0$ of a trivial foliation with ‘ordinary’ bifurcation parameter $y$, the non-degeneracy condition $D_z f^y(0, 0) \neq 0$ has to be imposed on (8.21). Since $f^y(0, y) \equiv 0$ and hence $D_y f^y(0, 0) = 0$, we obtain $D_x f^y(0, 0) \neq 0$, which accounts for the $y$-component of (8.23). See figure 8.2 for a phase portrait with $a = 1$. Note the absence of non-trivial equilibria, and the presence of heteroclinic orbits in the lower half plane $x < 0$.

The case of purely imaginary eigenvalues $y \pm i$, at $y = 0$, of the normal part linearization $A_0(y)$ leads to a normal form

\[ \dot{r} = r y, \qquad \dot{y} = \tfrac{1}{2} a r^2, \qquad \dot{\varphi} = 1 \tag{8.25} \]

in polar coordinates $x_1 + i x_2 = r e^{i\varphi}$ for $x = (x_1, x_2) \in \mathbb{R}^2$. We have used the purely imaginary Hopf eigenvalues $\pm i$ at $y = 0$ to eliminate dependence on $\varphi$, in normal form, and we have truncated at second order. For a technically more careful discussion we refer to [21]. This result was used by [15] to illustrate a more general blow-up procedure. Division by the Euler multiplier $r$ leads to the linear system

\[ r' = y, \qquad y' = \tfrac{1}{2} a r. \tag{8.26} \]
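The geometry of (8.25), (8.26) can be made concrete with a short sketch. A direct computation shows that $y^2 - \tfrac{1}{2} a r^2$ is constant along both systems, so orbits lie on ellipses for $a < 0$ and on hyperbolas for $a > 0$ in the $(r, y)$ half plane. The helper names and the sample value $a = -2$ below are assumptions chosen for illustration only.

```python
# Sketch only: the (r, y)-part of the truncated normal form (8.25).
# Function names and numerical parameters are illustrative assumptions.

def hopf_field(r, y, a):
    """(dr/dt, dy/dt) from (8.25)."""
    return r * y, 0.5 * a * r * r

def conserved(r, y, a):
    """y^2 - a*r^2/2, constant along (8.25) and along (8.26)."""
    return y * y - 0.5 * a * r * r

def integrate(r, y, a, t_end, h=1e-3):
    """Classical RK4 integration of (8.25) up to time t_end."""
    for _ in range(int(round(t_end / h))):
        k1 = hopf_field(r, y, a)
        k2 = hopf_field(r + 0.5 * h * k1[0], y + 0.5 * h * k1[1], a)
        k3 = hopf_field(r + 0.5 * h * k2[0], y + 0.5 * h * k2[1], a)
        k4 = hopf_field(r + h * k3[0], y + h * k3[1], a)
        r += h * (k1[0] + 2 * k2[0] + 2 * k3[0] + k4[0]) / 6.0
        y += h * (k1[1] + 2 * k2[1] + 2 * k3[1] + k4[1]) / 6.0
    return r, y

def elliptic_limits(r, y, a):
    """Limits (y_minus, y_plus) of the orbit through (r, y), elliptic case a < 0."""
    if a >= 0 or r <= 0:
        raise ValueError("expects the elliptic case a < 0 and r > 0")
    c = conserved(r, y, a) ** 0.5
    return c, -c
```

For $a = -2$ and initial point $(r, y) = (1, 0)$ the exact solution is $r = \operatorname{sech} t$, $y = -\tanh t$, which the integration should reproduce closely.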

We distinguish the elliptic case $a < 0$, and the hyperbolic case $a > 0$. The elliptic case was already studied by [17]; see also [20, 21, 22, 37]. In normal form, truncated at any finite order, the flow reduces to (8.25), with all orbits spiralling along ellipsoids in the elliptic case, or cones and hyperboloids, in the hyperbolic case; see figure 8.3. Note how all non-stationary orbits leave a neighbourhood of the equilibrium line in either forward or backward time, in the hyperbolic case. This observation remains true when we include higher-order terms not in normal form.

In the elliptic case, such higher-order terms may split the ellipsoids in a transverse way. Due to results of Neishtadt type [40], the splittings will be exponentially small in terms of the size of the ellipsoid, in the analytic case; see also [11, 23, 25, 26] and the references there. Note the absence of periodic orbits. In fact, any non-stationary orbit is heteroclinic from an equilibrium $y_- > 0$ to an equilibrium $y_+ < 0$, locally. The set of all $\omega$-limiting equilibria $y_+$ which occur in any fixed, two-dimensional strong unstable manifold $W^{uu}(y_-)$ may cover a closed interval in $(-\infty, 0)$, however, due to Neishtadt splitting.

Still, the system possesses a smooth Lyapunov function $V = V(x, y)$, in the elliptic case, which decreases strictly along all trajectories. We give one possible construction. It turns out that all non-stationary orbits cross the $x$-plane $y = 0$, and do so transversely; see [21, proposition 2.1]. Normalizing time to $t = 0$, at this crossing, we define

\[ V(x(t), y(t)) := p(\tanh t;\, y_-, y_+) \tag{8.27} \]

where $p(\tau; y_-, y_+)$ is the unique parabola in $\tau$ mapping the three $\tau$-values $(-1, 0, +1)$ to $(y_-, 0, y_+)$, and $y_\pm = \lim_{t \to \pm\infty} y(t)$ depend on the specific trajectory. Then $V(x, 0) = 0$, $V(0, y) = y$, and $V$ is a smooth Lyapunov function.

Just as the classical Takens–Bogdanov bifurcation in two parameters requires an understanding of stationary and Hopf bifurcations, in one parameter, all the above examples for bifurcations along lines of equilibria, without parameters, will contribute to the three types of Takens–Bogdanov bifurcation from an equilibrium plane presented in the next section.
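The interpolating parabola in (8.27) can be written out explicitly: the conditions $p(-1) = y_-$, $p(0) = 0$, $p(1) = y_+$ give $p(\tau) = \tfrac{1}{2}(y_+ + y_-)\tau^2 + \tfrac{1}{2}(y_+ - y_-)\tau$. The sketch below evaluates it; the function names and sample limits are assumptions for illustration. An elementary check of $p'(\pm 1)$ suggests that $p$ is strictly decreasing on $[-1, 1]$ when $-3y_- < y_+ < -y_-/3$, which includes the symmetric situation $y_+ = -y_-$ of the truncated normal form.

```python
import math

# Sketch only: explicit form of the parabola used in (8.27).
# Names and sample values are illustrative assumptions.

def parabola(tau, y_minus, y_plus):
    """Quadratic p with p(-1) = y_minus, p(0) = 0, p(+1) = y_plus."""
    return 0.5 * (y_plus + y_minus) * tau * tau + 0.5 * (y_plus - y_minus) * tau

def lyapunov_value(t, y_minus, y_plus):
    """V along one orbit, with time normalized so that y = 0 at t = 0."""
    return parabola(math.tanh(t), y_minus, y_plus)
```

For instance, with $y_- = 0.3$ and $y_+ = -0.2$ the values `lyapunov_value(t, 0.3, -0.2)` decrease from nearly $0.3$ to nearly $-0.2$ as $t$ increases.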

Figure 8.3. Hopf bifurcation from lines of equilibria without parameters; elliptic case (left) and hyperbolic case (right). Normal form (a), Poincaré section $x_2 = 0$ (b), and three-dimensional views (c). From [21]. See also the colour section.

Figure 8.4. Phase portraits of the integrable scaled flow, at order zero in $\varepsilon$; see section 8.6. See also the colour section. [Panels: θ = −0.6, θ = 0, θ = 0.5; axes $x_1$, $x_2$, $y$; the line of equilibria is marked.]

8.3 Bifurcations from planes of equilibria

We now proceed to an analysis of planes of equilibria arising in vector fields

\[ \dot{x} = f^x(x, y), \qquad \dot{y} = f^y(x, y) \tag{8.28} \]

where $x \in \mathbb{R}^n$, $y = (y_1, y_2) \in \mathbb{R}^2$ and $f = (f^x, f^y)$ satisfies

\[ 0 = f(0, y) \tag{8.29} \]

for all $y$. As before, we have a block matrix decomposition of the linearization

\[ A(y) = D_z f(0, y) = \begin{pmatrix} A_0(y) & 0 \\ A_1(y) & 0 \end{pmatrix} \tag{8.30} \]

with families of real $(2 \times n)$-matrices $A_1(y)$ parametrized by $y \in \mathbb{R}^2$. For the normal part $A_0(y)$, which determines the non-trivial spectrum of $A(y)$, several non-hyperbolic cases arise in generic two-parameter families; see the cases in [43] and the discussion in [4]. Specifically, $A_0(0)$ can possess

(a) a simple eigenvalue zero
(b) a pair of simple, purely imaginary eigenvalues (Hopf)
(c) an algebraically double, geometrically simple eigenvalue zero (Takens–Bogdanov)
(d) both (a) and (b) (zero Hopf)
(e) two non-resonant pairs of simple, purely imaginary eigenvalues (Hopf–Hopf).

Clearly (a), (b) will then occur along curves in $y$-space, unless additional degeneracy conditions like non-transverse eigenvalue crossings or degeneracy of other higher-order terms are imposed. Cases (c)–(e), in contrast, possess codimension two in the space of $(n \times n)$-matrices $A_0(y)$ and, hence, will generically occur at isolated values of $y \in \mathbb{R}^2$, say at $y = 0$.

As a diversion, we also note a hybrid between ‘ordinary’ bifurcation theory, where $\dot{\lambda} = 0$, and on the other hand (8.28), (8.29), which fix only the $y$-plane $x = 0$. Let $f^y = (f^{y_1}, f^{y_2})$ and assume

\[ 0 \equiv f^{y_2}(x, y) \tag{8.31} \]

for all $(x, y) \in \mathbb{R}^{n+2}$. Then $\dot{y}_2 = 0$, and $y_2$ becomes an ‘ordinary’ bifurcation parameter. Writing $(y_1, y_2) = (y, \lambda)$ with $y, \lambda \in \mathbb{R}$, in this case, we then have to discuss systems of the form

\[ \dot{x} = f^x(x, y, \lambda), \qquad \dot{y} = f^y(x, y, \lambda) \tag{8.32} \]

where $f(0, y, \lambda) \equiv 0$. These vector fields are just one-parameter versions of systems with trivial equilibrium lines, as discussed in section 8.2; see (8.15), (8.16). For fixed $\lambda$, for example, cases (a) and (b) were discussed in section 8.2. In fact, (b) gave rise to two distinct cases of Hopf bifurcation without parameters which we called elliptic and hyperbolic. Varying $\lambda$, cases (d) and (e) can then be viewed as the collision of cases (a)–(b), and (b)–(b), respectively. Similarly, as we will see below, the Takens–Bogdanov case (c) can arise from the Hopf case (b) if the Hopf frequency tends to zero as the parameter $\lambda$ varies. The half-arc of Hopf bifurcation points, in the $(y, \lambda)$-plane, then terminates at a nilpotent Jordan block $A_0$.

In this paper, we henceforth restrict to the Takens–Bogdanov case (c), where $A_0(0)$ is nilpotent. We now summarize some main results. Under suitable non-degeneracy conditions, we may choose local coordinates $(y_1, y_2)$ such that $A_0(y)$, $A_1(y)$ take the form

\[ A_0(y) = \begin{pmatrix} a(-y_1 + y_2) & -y_1 \\ 1 & 0 \end{pmatrix} \tag{8.33} \]

\[ A_1(y) = \begin{pmatrix} c_1 y_1 + c_2 y_2 & 1 \\ c_3 (c_1 y_1 + c_2 y_2) & 0 \end{pmatrix} \tag{8.34} \]

for some $a \neq 0$. For details we refer to our discussion of normal forms in section 8.4. Again we have omitted the hyperbolic part of $A_0$, so that $x = (x_1, x_2) \in \mathbb{R}^2$. Note that we may, and do, assume $a > 0$ if we allow for linear transformations of $x$, $y$ and for a reversal of time. In fact we will absorb $a$ completely into the scaling, alias blow-up (8.57) below. Note the Jordan block of order three in the full linearization

\[ A(0) = \begin{pmatrix} A_0(0) & 0 \\ A_1(0) & 0 \end{pmatrix} = \begin{pmatrix} 0 & 0 & 0 & 0 \\ 1 & 0 & 0 & 0 \\ 0 & 1 & 0 & 0 \\ 0 & 0 & 0 & 0 \end{pmatrix} \tag{8.35} \]

at $y = 0$. A normal form of the vector field $\dot{z} = f(z)$ up to second order in $z = (x, y)$ reads

\[ \begin{aligned} \dot{x}_1 &= a x_1 (-y_1 + y_2) - x_2 y_1 + a b x_2^2 \\ \dot{x}_2 &= x_1 \\ \dot{y}_1 &= x_2 + x_1 (c_1 y_1 + c_2 y_2) \\ \dot{y}_2 &= c_3 x_1 (c_1 y_1 + c_2 y_2). \end{aligned} \tag{8.36} \]

In the following sections 8.5–8.11, we use a blow-up or scaling reminiscent of the classical Takens–Bogdanov bifurcation, to exhibit features of the truncated system (8.36), locally near $x = y = 0$, which survive the addition of higher-order terms. Some of these features, which are the main result of this paper, can be summarized as follows.

Let $b \notin \{-17/12, -1\}$. Then there exists a neighbourhood $U$ of $x = y = 0$ in $\mathbb{R}^4$ such that any trajectory $(x(t), y(t))$ which remains in $U$, be it for all positive or all negative times, converges to some equilibrium $(0, y)$, in the trivial equilibrium plane $x = 0$. More specifically, let $z(t) = (x(t), y(t)) \in U$ for all $t \ge 0$. Then $\lim x(t) = 0$, for $t \to +\infty$, and there exists

\[ y_+ := \lim_{t \to +\infty} y(t). \tag{8.37} \]

Similarly, $z(t) \in U$ for all $t \le 0$ implies $\lim x(t) = 0$, for $t \to -\infty$, and

\[ y_- := \lim_{t \to -\infty} y(t) \tag{8.38} \]

exists. Of course, non-stationary heteroclinic orbits which remain in $U$ for all $t \in \mathbb{R}$, and for which both (8.37) and (8.38) hold, are possible, albeit with $y_+ \neq y_-$. In particular, $U$ is void of non-trivial equilibria, of periodic orbits, and of any homoclinic orbits. Any orbit which remains in $U$, for all times $t \in \mathbb{R}$, is stationary or heteroclinic.
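The unstable dimensions of the trivial equilibria can be read off from the normal part (8.33), which has trace $a(y_2 - y_1)$ and determinant $y_1$. The sketch below classifies an equilibrium $(0, y)$ accordingly; the function names and the default $a = 1$ are assumptions for illustration, and the classification concerns only the linearization, not the nonlinear flow. On this reading, the zero eigenvalue occurs along $y_1 = 0$ and the purely imaginary pair along the half-line $y_2 = y_1 > 0$.

```python
# Sketch only: unstable dimension of the equilibrium (0, y) from A_0(y) in (8.33).
# Names and the default a = 1 are illustrative assumptions.

def normal_part(y1, y2, a=1.0):
    """A_0(y) from (8.33) as a nested list."""
    return [[a * (-y1 + y2), -y1], [1.0, 0.0]]

def unstable_dimension(y1, y2, a=1.0):
    """Number of eigenvalues of A_0(y) with positive real part.

    Returns None on the bifurcation curves, where an eigenvalue is zero
    or the pair is purely imaginary."""
    (m11, m12), (m21, m22) = normal_part(y1, y2, a)
    trace = m11 + m22
    det = m11 * m22 - m12 * m21
    if det < 0:
        return 1              # real eigenvalues of opposite sign: saddle
    if det == 0 or trace == 0:
        return None           # zero eigenvalue, or Hopf line
    return 2 if trace > 0 else 0
```

For example, with $a = 1$ the point $y = (-1, 0)$ is a saddle, $y = (1, 2)$ has two unstable directions, and $y = (1, 0)$ has none.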

Figure 8.5. Three cases of Takens–Bogdanov bifurcations without parameters; see (8.36) and (A)–(C) for coefficients. Unstable dimensions $i$ of trivial equilibria $(0, y)$ are indicated by $(i)$; ‘$n$-het’ indicates saddle–saddle heteroclinics with $n$ revolutions around the positive $y_1$-axis. See sections 8.9 and 8.11 for a detailed description. [Panels (A), (B), (C); axes labelled $y, y_1$ and $\lambda, y_2$; markings include x-entry, x-exit, y-entry, hyp. Hopf, ell. Hopf, 1-het, n-het.]

Figure 8.6. Schematic diagrams of bifurcations from lines of equilibria. [Panels: simple zero (figure 8.2), elliptic Hopf (figure 8.3), hyperbolic Hopf (figure 8.3); entry and exit sets are marked along the $y$-axis.]

In figure 8.5 we depict the three essentially different cases which arise in the three parameter regions

Case (A): $b < -17/12$
Case (B): $-17/12 < b < -1$
Case (C): $-1 < b$.

Arrows indicate the pairings $(y_- \mapsto y_+)$ by heteroclinic orbits. We also indicate exit sets, for which the strong unstable manifolds $W^{uu}(y)$ leave $U$ in forward time, at least partially, as well as entry sets, where $W^{ss}(y)$ shares the same fate in backwards time. The difference between ‘x-entry/exit’ and ‘y-entry/exit’ will be explained in sections 8.9–8.11.

It is interesting to note that, to leading order, these diagrams do not depend on the values of $c_1, c_2, c_3$; see the blow-up construction in section 8.5. In particular, we may choose $c_1 = c_2 = 0$. This identifies $y_2$ as a parameter, to leading order, by $\dot{y}_2 = 0$. Writing $y_1 = y$, $y_2 = \lambda$ as in diversion (8.31), (8.32) above, we can equivalently view the Takens–Bogdanov bifurcations without parameters as a termination of Hopf bifurcation from a line of equilibria given by the $y$-axis, as an external parameter $\lambda$ is varied. In this setting, both the elliptic and the hyperbolic Hopf case arise, as discussed in section 8.2.

Although figure 8.5 gives an indication of heteroclinicity, it does not reveal the detailed geometry of the associated flows of $(x, y) \in \mathbb{R}^3$, as parametrized by $\lambda \in \mathbb{R}$. In fact, the results of figures 8.2 and 8.3 can be represented quite schematically by figure 8.6. All geometric intricacies are lost, like $W^{uu}(y_-)$ terminating at $y_+$-intervals in the elliptic Hopf case, or distinctions between saddle–saddle heteroclinics, saddle–node heteroclinics, and focus–node heteroclinics. For some such geometric detail we refer to section 8.11 below.

8.4 Normal forms

In this section we consider

\[ \dot{x} = f^x(x, y), \qquad \dot{y} = f^y(x, y) \tag{8.39} \]

with $x, y \in \mathbb{R}^2$, an equilibrium plane

\[ 0 = f(0, y) \tag{8.40} \]

and nilpotent normal part linearization

\[ A_0(0) = D_x f^x(0, 0) = \begin{pmatrix} 0 & 0 \\ 1 & 0 \end{pmatrix}. \tag{8.41} \]

For terms up to order two in $x, y$, we derive the normal form (8.36) such that the $y$-plane $x = 0$ remains an equilibrium plane. As a first step we consider the appropriately non-degenerate linearization

\[ A(0) = \begin{pmatrix} A_0(0) & 0 \\ A_1(0) & 0 \end{pmatrix} = \begin{pmatrix} \begin{matrix} 0 & 0 \\ 1 & 0 \end{matrix} & 0 \\ A_1(0) & 0 \end{pmatrix} \tag{8.42} \]

at $x = y = 0$. The non-degeneracy which we choose for $A_1(0)$ is such as can occur in generic two-parameter families of linearizations $A(y)$ subject to the equilibrium constraint (8.40). We fix $y = 0$ by requiring nilpotency of $A_0(0)$. Because the $y$-plane $x = 0$ is distinguished to be an equilibrium plane, we may consider

\[ \operatorname{range} A(0) \cap \{x = 0\} \tag{8.43} \]

as an invariant object. With a two-dimensional kernel of $A(0)$ at hand, generically, and one dimension of $\operatorname{range} A(0)$ fixed as the $x_2$-axis, by $A_0(0)$, the space (8.43) is one-dimensional: call it the $y_1$-axis. Therefore, $A_1(0)$ takes the form

\[ A_1(0) = \begin{pmatrix} \alpha & 1 \\ 0 & 0 \end{pmatrix} \tag{8.44} \]

generically with a non-zero upper right entry which we have normalized to 1. The upper left entry $\alpha$ can be eliminated by a skew linear transformation $\tilde{y}_1 = -\alpha x_2 + y_1$. Thus $\alpha = 0$ in (8.44) and henceforth. This proves that $A(0)$ indeed takes the form (8.35) with a Jordan block of order three, generically.

Normal forms of vector fields with higher nilpotency have been studied extensively; see for example [14]. For the definitive exposition of normal forms, in general, see [43, 47]. For our purposes we need a slight adaptation, restricting local transformations of the vector field (8.39) to those near identity diffeomorphisms $\Phi$ which map the equilibrium $y$-plane $x = 0$ into itself. We do not require this plane to be fixed pointwise. In terms of Taylor polynomials, we successively expand

\[ \Phi(z) = z + \Phi_2(z) + \cdots \tag{8.45} \]

with $\Phi_k = (\Phi_k^x, \Phi_k^y)$ homogeneous in $z$ of degree $k$. The above restriction fixing the $y$-plane $x = 0$ amounts to

\[ \Phi_k^x(0, y) = 0 \tag{8.46} \]

for all $y \in \mathbb{R}^2$, $k = 2, 3, \ldots$. In the Taylor expansion of the vector field $f(z) = Az + f_2(z) + \cdots$, transformations by $\Phi$ allow us to eliminate terms of $f$ in the range of the Lie bracket $[\cdot, f]$,

\[ [g, f](z) = g'(z) f(z) - f'(z) g(z), \tag{8.47} \]

up to any finite order $k$. Here $g$ denotes the vector field which, as an element of the Lie algebra of the diffeomorphism group, generates the transformation $\Phi$ by the exponential map, i.e. by time integration of the vector field $g$. Expanding

\[ g(z) = g_2(z) + \cdots \tag{8.48} \]

by order of $z$, we see that $g$ enables us to successively eliminate components of $f_k(z)$ in the range of $\operatorname{ad} A$,

\[ ((\operatorname{ad} A) g)(z) = [A, g](z) = A g(z) - g'(z) A z. \tag{8.49} \]

The normal form of $f$ is then given, up to any finite order $k$, by a linear complement to the range of (8.49), as $g$ is varied.

How do we keep track of the restriction (8.46) during this normal form process? Clearly the Lie algebra of associated vector fields $g = (g^x, g^y)$ just has to satisfy

\[ g_k^x(0, y) = 0 \tag{8.50} \]

for all $y \in \mathbb{R}^2$, $k = 2, 3, \ldots$. The normal form of $f$ then amounts to an element of the linear complement, in the space of vector fields $f$ satisfying (8.40), to the range of $\operatorname{ad} A$ restricted to those $g$ satisfying (8.50). Note here that $(\operatorname{ad} A) g$ satisfies (8.40) if $g$ satisfies (8.50). Indeed, transformations by the flow of $g$ preserve the $y$-plane of equilibria of $f$, and $A = A(0)$.

The normal form itself depends on the choice of a complement in (8.49), (8.50), of course. One of the many possible normal forms of $f$ then takes the form

\[ \begin{aligned} \dot{x}_1 &= x_1 h_1(2 x_1 y_1 - x_2^2, y) + x_2 h_2(2 x_1 y_1 - x_2^2, y) + x_2^2 h_3(2 x_1 y_1 - x_2^2, y) \\ \dot{x}_2 &= x_1 \\ \dot{y}_1 &= x_2 \\ \dot{y}_2 &= x_1 y_1 h_4(x_1 y_1, y) \end{aligned} \tag{8.51} \]

with suitable formal Taylor series $h_1, \ldots, h_4$, up to any finite order. (Note the restriction $h_1(0, 0) = h_2(0, 0) = 0$ due to the prescribed linearization.) Indeed with $A = A(0)$ nilpotent as in (8.35), by (8.42)–(8.44) and $g$ satisfying the restriction (8.50), we see that a complement to $\operatorname{range}(\operatorname{ad} A)$ in the space of $f$ satisfying (8.40) is spanned by $h_1, \ldots, h_4$ at any finite order $k$ of $g$. Technical details of this calculation can be found in the appendix.

Truncating the general normal form (8.51) at second order, we obtain the same expression (8.51) with first-order terms $h_1 = h_{11} y_1 + h_{12} y_2$, $h_2 = h_{21} y_1 + h_{22} y_2$, and constants $h_3, h_4$, as

\[ \begin{aligned} \dot{x}_1 &= x_1 (h_{11} y_1 + h_{12} y_2) + x_2 (h_{21} y_1 + h_{22} y_2) + h_3 x_2^2 \\ \dot{x}_2 &= x_1 \\ \dot{y}_1 &= x_2 \\ \dot{y}_2 &= h_4 x_1 y_1. \end{aligned} \tag{8.52} \]

A linear transformation,

\[ \begin{aligned} \tilde{x}_1 &= -h_{21} x_1, & \tilde{y}_1 &= -h_{21} y_1 - h_{22} y_2 \\ \tilde{x}_2 &= -h_{21} x_2, & \tilde{y}_2 &= (h_{12} h_{21} / h_{11} - h_{22}) y_2 \end{aligned} \tag{8.53} \]

which will be motivated in sections 8.5–8.7, then converts (8.52) into the previously stated truncated normal form (8.36). We specifically note

\[ a = h_{11} / h_{21}, \qquad b = -h_3 / h_{11} \tag{8.54} \]

and the non-degeneracy conditions

\[ h_{11}, h_{21} \neq 0, \qquad h_{12} h_{21} - h_{11} h_{22} \neq 0. \tag{8.55} \]
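The effect of the linear transformation (8.53) on the first component can be checked with exact rational arithmetic. The sketch below compares $d\tilde{x}_1/dt = -h_{21}\dot{x}_1$, computed from (8.52), with the first line of (8.36) evaluated in the new coordinates, using $a$ and $b$ from (8.54). The function names, the dictionary layout for the coefficients and the sample values are assumptions for illustration; the remaining components, which determine $c_1, c_2, c_3$, are not checked here.

```python
from fractions import Fraction as Fr

# Sketch only: exact check of the x1-component under the transformation (8.53).
# Names and the coefficient dictionary are illustrative assumptions.

def truncated_rhs_x1(x1, x2, y1, y2, h):
    """dx1/dt from (8.52); h maps 'h11', 'h12', 'h21', 'h22', 'h3' to numbers."""
    return (x1 * (h["h11"] * y1 + h["h12"] * y2)
            + x2 * (h["h21"] * y1 + h["h22"] * y2)
            + h["h3"] * x2 * x2)

def transform(x1, x2, y1, y2, h):
    """Linear change of coordinates (8.53)."""
    return (-h["h21"] * x1,
            -h["h21"] * x2,
            -h["h21"] * y1 - h["h22"] * y2,
            (h["h12"] * h["h21"] / h["h11"] - h["h22"]) * y2)

def normal_form_rhs_x1(X1, X2, Y1, Y2, a, b):
    """First line of (8.36), evaluated in the transformed coordinates."""
    return a * X1 * (-Y1 + Y2) - X2 * Y1 + a * b * X2 * X2

def residual(point, h):
    """Difference between both expressions for the derivative of the new x1."""
    x1, x2, y1, y2 = point
    a = h["h11"] / h["h21"]        # (8.54)
    b = -h["h3"] / h["h11"]        # (8.54)
    X1, X2, Y1, Y2 = transform(x1, x2, y1, y2, h)
    lhs = -h["h21"] * truncated_rhs_x1(x1, x2, y1, y2, h)
    return lhs - normal_form_rhs_x1(X1, X2, Y1, Y2, a, b)
```

With `Fraction` inputs the residual is computed without rounding, so a result of exactly zero at arbitrary sample points is a meaningful check of the algebra.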

8.5 Scaling, alias blow-up

Linearizing the normal form (8.36) at the $y$-plane of equilibria, we obtain the unfolding

\[ A(y) = \begin{pmatrix} a(-y_1 + y_2) & -y_1 & 0 & 0 \\ 1 & 0 & 0 & 0 \\ c_1 y_1 + c_2 y_2 & 1 & 0 & 0 \\ c_3 (c_1 y_1 + c_2 y_2) & 0 & 0 & 0 \end{pmatrix} \tag{8.56} \]

of the Jordan block of order three at $y = 0$. The analysis of the standard Takens–Bogdanov bifurcation crucially uses a scaling to near Hamiltonian form, which preserves the nilpotent Jordan block, of order two there, at $\lambda = 0$. This is impossible for nilpotencies of odd order. Instead, we present a scaling by small $0 < \varepsilon < \varepsilon_0$ to near completely integrable form

\[ \begin{aligned} x_1 &= (\varepsilon/a)^4 \tilde{x}_1, & y_1 &= (\varepsilon/a)^2 \tilde{y}_1 \\ x_2 &= (\varepsilon/a)^3 \tilde{x}_2, & y_2 &= (\varepsilon/a)^2 \tilde{y}_2 \end{aligned} \tag{8.57} \]

and $t = a \varepsilon^{-1} \tilde{t}$. Inserting into the normal form (8.36) and omitting tildes, as well as terms of order $\varepsilon^2$ and beyond, we obtain

\[ \begin{aligned} \dot{x}_1 &= -x_2 y_1 + \varepsilon (x_1 (-y_1 + y_2) + b x_2^2) \\ \dot{x}_2 &= x_1 \\ \dot{y}_1 &= x_2 \\ \dot{y}_2 &= 0. \end{aligned} \tag{8.58} \]

Note that $y_2$ has become simply a parameter in this scaling. Therefore, the two viewpoints on Takens–Bogdanov bifurcation without parameters, as presented in section 8.3, case (c), and the diversion (8.31), (8.32), coincide to order $\varepsilon$. To emphasize the foliation by $\dot{y}_2 = 0$, in the following sections, we will rename

\[ \lambda := y_2, \qquad y := y_1 \tag{8.59} \]

and discuss the rescaled normal form

\[ \begin{aligned} \dot{x}_1 &= -x_2 y + \varepsilon (x_1 (-y + \lambda) + b x_2^2) \\ \dot{x}_2 &= x_1 \\ \dot{y} &= x_2 \end{aligned} \tag{8.60} \]

for small $\varepsilon > 0$. Understanding the solutions of (8.60) with

\[ |x(t)|, |y(t)|, |\lambda| \le C \tag{8.61} \]

for all $t \in \mathbb{R}$, and for $0 \le \varepsilon < \varepsilon_0(C)$, is a significant step towards understanding all solutions of the original system (8.3), (8.4) in a neighbourhood $U \subset \mathbb{R}^4$ of the Takens–Bogdanov bifurcation at $z = (x, y) = 0$. We therefore address system (8.60) in sections 8.6–8.9 below and return to the issue of omitted higher-order terms in section 8.10.
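The scaling (8.57) can be tested numerically on the truncated normal form (8.36): rewriting (8.36) in the scaled variables and scaled time and subtracting the right-hand side of (8.58) leaves only the terms that were dropped. In this truncated setting a direct computation suggests that those terms are of order $(\varepsilon/a)^3$, so halving $\varepsilon$ should reduce the residual by a factor of about eight. All function names and sample coefficient values below are assumptions for illustration.

```python
# Sketch only: compare (8.36) in the scaled variables (8.57) with (8.58).
# Names and sample values are illustrative assumptions.

def normal_form(z, a, b, c1, c2, c3):
    """Truncated normal form (8.36); z = (x1, x2, y1, y2)."""
    x1, x2, y1, y2 = z
    s = c1 * y1 + c2 * y2
    return (a * x1 * (-y1 + y2) - x2 * y1 + a * b * x2 * x2,
            x1,
            x2 + x1 * s,
            c3 * x1 * s)

def scaled_field(w, eps, a, b, c1, c2, c3):
    """Vector field (8.36) expressed in the scaled variables and scaled time."""
    d = eps / a
    X1, X2, Y1, Y2 = w
    f = normal_form((d**4 * X1, d**3 * X2, d**2 * Y1, d**2 * Y2), a, b, c1, c2, c3)
    # t = t_scaled / d, hence d/dt_scaled = (1/d) * d/dt
    return (f[0] / d**5, f[1] / d**4, f[2] / d**3, f[3] / d**3)

def leading_field(w, eps, b):
    """Right-hand side of (8.58)."""
    X1, X2, Y1, Y2 = w
    return (-X2 * Y1 + eps * (X1 * (-Y1 + Y2) + b * X2 * X2), X1, X2, 0.0)

def max_residual(w, eps, a, b, c1, c2, c3):
    """Largest componentwise difference between the two vector fields."""
    full = scaled_field(w, eps, a, b, c1, c2, c3)
    lead = leading_field(w, eps, b)
    return max(abs(p - q) for p, q in zip(full, lead))
```

Comparing `max_residual` at $\varepsilon = 0.1$ and $\varepsilon = 0.05$ for a fixed scaled point gives a simple empirical check of the order of the neglected terms.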

8.6 Complete integrability to scaling order zero

We consider the scaled vector field (8.60) to order zero in $\varepsilon$, that is,

\[ \begin{aligned} \dot{x}_1 &= -x_2 y \\ \dot{x}_2 &= x_1 \\ \dot{y} &= x_2. \end{aligned} \tag{8.62} \]

Equivalently, we rewrite (8.62) as a third-order scalar equation

\[ 0 = \dddot{y} + y \dot{y} = \frac{d}{dt} \left( \ddot{y} + \tfrac{1}{2} y^2 \right). \tag{8.63} \]

Immediately, this provides a first integral of motion

\[ \Theta = \ddot{y} + \tfrac{1}{2} y^2 = x_1 + \tfrac{1}{2} y^2 \tag{8.64} \]

in terms of $x_1, y$. Considering (8.64) as a second-order differential equation for $y$, on the other hand, we obtain the Hamiltonian system

\[ \ddot{y} + \tfrac{1}{2} y^2 - \Theta = 0 \tag{8.65} \]

as was to be expected from a good Takens–Bogdanov-type problem. The Hamiltonian provides a second integral of motion

\[ H = \tfrac{1}{2} \dot{y}^2 + \tfrac{1}{6} y^3 - \Theta y = \tfrac{1}{2} x_2^2 - x_1 y - \tfrac{1}{3} y^3. \tag{8.66} \]

Conversely, we can parametrize all trajectories of (8.62) by $(\Theta, H, y)$:

\[ x_1 = \Theta - \tfrac{1}{2} y^2, \qquad x_2^2 = -\tfrac{1}{12} q(y). \tag{8.67} \]

Here $q(y)$ is the Weierstrass polynomial

\[ q(y) = q(y; 24\Theta, 24H) = 4 y^3 - 24 \Theta y - 24 H \tag{8.68} \]

with coefficients $24\Theta$ and $24H$. Before we perform a complete integration of (8.62) in terms of Weierstrass functions, in section 8.8, we record how to eliminate $\Theta > 0$ from (8.65) entirely by a simple scaling. Let $y(t) = y^{\Theta, H}(t)$ denote any solution of (8.65) with energy $H$. Then

\[ y^{\Theta, H}(t) = \Theta^{1/2}\, y^{1, \tilde{H}}(\Theta^{1/4} t) \tag{8.69} \]

where $y^{1, \tilde{H}}$ is a solution of (8.65) with $\Theta = 1$ and energy

\[ \tilde{H} = \Theta^{-3/2} H. \tag{8.70} \]

A similar scaling applies to the less interesting case $\Theta < 0$, where all orbits are unbounded. With these remarks it is easy to plot the phase portraits of system (8.62); see figure 8.4. Note the $x_2$-independent invariant parabola sheets $x_1 = \Theta - \tfrac{1}{2} y^2$ which, for $\Theta \ge 0$, contain all bounded orbits. Also note how the collection of homoclinic orbits confines the set of bounded orbits. In fact, the homoclinic orbits emanate from the hyperbolic saddles, at $y < 0$, and loop around the positive $y$-axis before returning to the saddle. No further stationary orbits bifurcate from the equilibrium plane, represented by the $y$-axis. All remaining bounded orbits are periodic, forming a periodic ‘bubble’.

Finally, we observe that the scaled flow is independent of the parameter $\lambda = y_2$, to order zero in $\varepsilon$. This absence of $\lambda$, which greatly facilitates the perturbation computations below and was already crucial to the simplicity of the scaling (8.69), (8.70), originated from our linear transformation (8.53) by which $h_{22}$ was eliminated from our normal form (8.52).
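The two first integrals of the order-zero system (8.62) lend themselves to a quick numerical check. The sketch below integrates (8.62) with a classical Runge-Kutta scheme and records the largest deviation of the first integral (8.64), called `theta` in the code, and of the Hamiltonian (8.66) from their initial values. The function names, step size and sample initial point are assumptions for illustration; `rescaled_energy` implements (8.70) and requires a positive value of the first integral.

```python
# Sketch only: conservation of the first integrals along the flow of (8.62).
# Names and numerical parameters are illustrative assumptions.

def field0(z):
    """Right-hand side of (8.62); z = (x1, x2, y)."""
    x1, x2, y = z
    return (-x2 * y, x1, x2)

def theta(z):
    """First integral (8.64)."""
    return z[0] + 0.5 * z[2] ** 2

def energy(z):
    """Hamiltonian (8.66) in the variables (x1, x2, y)."""
    x1, x2, y = z
    return 0.5 * x2 * x2 - x1 * y - y ** 3 / 3.0

def rescaled_energy(z):
    """Rescaled energy (8.70); only meaningful for theta(z) > 0."""
    return energy(z) / theta(z) ** 1.5

def rk4_step(z, h):
    """One classical Runge-Kutta step for (8.62)."""
    def shifted(k, s):
        return tuple(zi + s * ki for zi, ki in zip(z, k))
    k1 = field0(z)
    k2 = field0(shifted(k1, 0.5 * h))
    k3 = field0(shifted(k2, 0.5 * h))
    k4 = field0(shifted(k3, h))
    return tuple(zi + h * (a + 2 * b + 2 * c + d) / 6.0
                 for zi, a, b, c, d in zip(z, k1, k2, k3, k4))

def max_drift(z0, t_end, h=1e-3):
    """Largest deviation of (theta, energy) from their initial values."""
    t0, e0 = theta(z0), energy(z0)
    z, worst = z0, 0.0
    for _ in range(int(round(t_end / h))):
        z = rk4_step(z, h)
        worst = max(worst, abs(theta(z) - t0), abs(energy(z) - e0))
    return worst
```

At the saddle equilibrium $(x_1, x_2, y) = (0, 0, -2)$, for example, `rescaled_energy` evaluates to $\tfrac{2}{3}\sqrt{2}$, the value bounding the periodic bubble.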


8.7 Slow flow of first integrals to order ε

With complete integrability at order $\varepsilon^0$ at hand, we now consider the scaled vector field (8.60) to order $\varepsilon$, that is,
\[ \begin{aligned} \dot x_1 &= -x_2 y + \varepsilon\bigl(x_1(-y + \lambda) + b x_2^2\bigr) \\ \dot x_2 &= x_1 \\ \dot y &= x_2. \end{aligned} \tag{8.71} \]
With $z = (x_1, x_2, y)$ and obvious notation for the scaled nonlinearities $f_0, f_1$, we abbreviate (8.71) as
\[ \dot z = f_0(z) + \varepsilon f_1(z). \tag{8.72} \]
Parametrizing trajectories $z$ by $(\Theta, H, y)$, according to (8.64), (8.66), (8.67), we obtain differential equations for $\Theta, H$; for example
\[ \dot\Theta = \Theta_z \cdot \dot z = \Theta_z f_0(z) + \varepsilon\, \Theta_z f_1(z) \tag{8.73} \]
where $\Theta_z$ indicates the gradient with respect to $z$. Note that $\Theta_z f_0(z) = 0$, because $\Theta$ is a first integral of the order zero flow; see section 8.6. We thus arrive at the slow flow for the first integrals
\[ \begin{aligned} \dot\Theta &= \varepsilon\bigl[(\Theta - \tfrac12 y^2)(-y + \lambda) - \tfrac1{12}\, b\, q(y)\bigr] \\ \dot H &= -\varepsilon y\bigl[(\Theta - \tfrac12 y^2)(-y + \lambda) - \tfrac1{12}\, b\, q(y)\bigr]. \end{aligned} \tag{8.74} \]

In contrast to the slow motion of $\Theta, H$, the variable $y$ will move on a time scale of order one, for $\varepsilon \to 0$. Cutting through the bubble of periodic orbits, at order $\varepsilon^0$, we in fact have a Poincaré section $\Sigma$, transverse to the flow (8.71), defined by the half plane
\[ \Sigma = \{(x, y) \mid x_1 > 0 = x_2\}. \tag{8.75} \]
Indeed $\dot x_2 = x_1 > 0$ in $\Sigma$. The boundary of $\Sigma$ consists of the equilibrium $y$-axis. A parametrization of $\Sigma$ by $(\Theta, H)$ is given by choosing $y$ as the middle one of the three solutions $y$ of
\[ q(y; 24\Theta, 24H) = 0 \tag{8.76} \]
and putting
\[ x_1 := \Theta - \tfrac12 y^2. \tag{8.77} \]

The domain of definition of these coordinates is simply
\[ \Theta^{-3/2}|H| = |\tilde H| < \tfrac23\sqrt2, \qquad \Theta > 0. \tag{8.78} \]
The boundary of $\Sigma$ is given by the values $\tilde H = \tfrac23\sqrt2$, for the saddles at $y < 0$, and $\tilde H = -\tfrac23\sqrt2$, for the foci at $y > 0$, with $y = 0$ relegated to $\Theta = 0$. This allows us to express the Poincaré return map
\[ \Pi^\varepsilon : \Sigma \to \Sigma \tag{8.79} \]
wherever defined, in terms of the coordinates $(\Theta, H)$ as $\Pi^\varepsilon(\Theta_0, H_0) = (\bar\Theta, \bar H)$, where
\[ \begin{aligned} \bar\Theta &= \Theta_0 + \int_0^{T^\varepsilon} \dot\Theta(t)\,dt = \Theta_0 + \varepsilon I^\Theta(\varepsilon, \Theta_0, H_0) \\ \bar H &= H_0 + \int_0^{T^\varepsilon} \dot H(t)\,dt = H_0 + \varepsilon I^H(\varepsilon, \Theta_0, H_0). \end{aligned} \tag{8.80} \]

Here $T^\varepsilon = T^\varepsilon(\Theta_0, H_0)$ denotes the Poincaré return time. The form (8.80) of the Poincaré return map $\Pi^\varepsilon$ suggests computing $I = (I^\Theta, I^H)$ at $\varepsilon = 0$ and viewing $\Pi^\varepsilon$ as a time discretization of first order, with time step $\varepsilon$, of the vector field
\[ \frac{d}{dt}\begin{pmatrix}\Theta \\ H\end{pmatrix} = I(\Theta, H). \tag{8.81} \]
We call (8.81) the Poincaré flow of first integrals. The somewhat delicate issue of convergence of the integrals $I^\Theta(\varepsilon, \Theta_0, H_0)$, $I^H(\varepsilon, \Theta_0, H_0)$ to their counterparts $I(\Theta_0, H_0)$, evaluated at $\varepsilon = 0$, is postponed to section 8.10. By the form (8.74) of the $(\Theta, H)$-flow, we then conclude that
\[ I(\Theta, H) = \int_0^{T^0} \bigl[(\Theta - \tfrac12 y^2)(-y + \lambda) - \tfrac1{12}\, b\, q(y)\bigr] \begin{pmatrix} 1 \\ -y \end{pmatrix} dt. \tag{8.82} \]

Here $\varepsilon = 0$, and therefore $\Theta, H$ are fixed. The Poincaré time $T^0$ is the minimal period of the periodic orbit $y(t)$ of the integrable order zero vector field discussed in section 8.6. To evaluate the integrals (8.82) it is therefore sufficient to compute, for $k = 0, \ldots, 4$, the integrals
\[ J_k = J_k(\Theta, H) = \int_0^{T^0} (y(t))^k\,dt \tag{8.83} \]
for the periodic solution $y(t)$ of (8.65) which is determined by $\Theta, H$. The simple scaling argument (8.69), (8.70) shows that
\[ J_k(\Theta, H) = \Theta^{k/2 - 1/4} J_k(\tilde H) \tag{8.84} \]
where we have abbreviated $J_k(1, \tilde H) = J_k(\tilde H)$. In the next section we recall a recursion relation and compute the complete elliptic integrals $J_k(\tilde H)$ in terms of Weierstrass elliptic functions.

8.8 Elliptic integrals and Weierstrass functions

In this section we evaluate the complete elliptic integrals
\[ J_k(\tilde H) := \int_0^T (y(t))^k\,dt \tag{8.85} \]
for $k = 0, \ldots, 4$, as introduced in (8.83), (8.84), in terms of Weierstrass elliptic functions. We recall that $y(t) = y(t; \Theta = 1, H = \tilde H)$ is a periodic solution of the second-order equation (8.65), with $\Theta = 1$ and energy $H = \tilde H$, that is,
\[ \ddot y + \tfrac12 y^2 - 1 = 0 \tag{8.86} \]
\[ \tfrac12 \dot y^2 + \tfrac16 y^3 - y = \tilde H. \tag{8.87} \]

Note that $T = T(\tilde H) = J_0(\tilde H)$ denotes the minimal period, for $|\tilde H| < \tfrac23\sqrt2$. Also note the continuous limits
\[ J_0(-\tfrac23\sqrt2) = 2^{3/4}\pi, \qquad J_0(\tfrac23\sqrt2) = +\infty \tag{8.88} \]
corresponding to the Hamiltonian centre at $y = \sqrt2$ and to the homoclinic saddle at $y = -\sqrt2$, respectively. To establish the relation of $J_k(\tilde H)$ to complete elliptic integrals, we recall that
\[ \dot y^2 = -\tfrac1{12}\, q(y) \tag{8.89} \]
from (8.66)–(8.68), where $q(y) = 4y^3 - 24y - 24\tilde H$ is the cubic Weierstrass polynomial. Following traditional notation, let $e_1 > e_2 > e_3$ denote the three real zeros of $q$ depending on $\tilde H \in (-\tfrac23\sqrt2, \tfrac23\sqrt2)$. Obviously then $e_1, e_2$ are the maximal, minimal values, respectively, of the periodic orbit $y(t)$. By time reversal symmetry these values occur at, say, $t = T/2$ and $t = 0$, respectively. We can therefore rewrite
\[ J_k(\tilde H) = \int_0^T y(t)^k\,dt = 2\int_{e_2}^{e_1} y^k\,\frac{dy}{\dot y} = 2\int_{e_2}^{e_1} y^k\,\frac{dy}{\sqrt{-\tfrac1{12}\, q(y)}} \tag{8.90} \]
which clearly identifies $J_k$ as complete elliptic integrals. Following an elementary exposition in [45, 46], we derive a two-term recursion for the elliptic integrals $J_k$. Just differentiate
\[ \frac{d}{dy}\Bigl(y^k \sqrt{q(y)}\Bigr) = \frac{1}{\sqrt{q(y)}}\Bigl(k y^{k-1} q(y) + \tfrac12 y^k q'(y)\Bigr). \tag{8.91} \]

Integrating over $y$, from $e_2$ to $e_1$, and re-inserting the factor $\sqrt{-1/12}$ indeed provides a linear two-term recursion relation for $J_k$, because $q(e_j) = 0$. Therefore, all $J_k$ can be expressed linearly in terms of $J_0$ and $J_1$. For $k = 0, \ldots, 4$ we obtain explicitly
\[ \begin{aligned} J_0 &= J_0(\tilde H) \\ J_1 &= J_1(\tilde H) \\ J_2 &= 2 J_0 \\ J_3 &= \tfrac65\,(3 J_1 + 2\tilde H J_0) \\ J_4 &= \tfrac{12}7\,(2\tilde H J_1 + 5 J_0). \end{aligned} \tag{8.92} \]
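The recursion (8.92) can be sanity-checked by direct quadrature. The sketch below is an illustration under stated assumptions, not the authors' computation: it finds the three zeros of $q$ by the trigonometric formula for a depressed cubic and uses the substitution $y = e_2 + (e_1 - e_2)\sin^2\theta$ in (8.90), which turns $J_k$ into $4\sqrt3 \int_0^{\pi/2} y^k (y - e_3)^{-1/2}\,d\theta$ with a smooth integrand. The function names and the midpoint rule are choices made here.

```python
import math

def weierstrass_roots(h_tilde):
    """Zeros e1 >= e2 >= e3 of q(y) = 4y^3 - 24y - 24*h_tilde (three real zeros)."""
    arg = 3.0 * h_tilde / (2.0 * math.sqrt(2.0))
    phi = math.acos(max(-1.0, min(1.0, arg)))  # clamp against rounding
    r = 2.0 * math.sqrt(2.0)
    return tuple(r * math.cos((phi - 2.0 * math.pi * k) / 3.0) for k in range(3))

def J(k, h_tilde, n=2000):
    """J_k(h_tilde) of (8.85), via the midpoint rule in the angle theta."""
    e1, e2, e3 = weierstrass_roots(h_tilde)
    step = 0.5 * math.pi / n
    total = 0.0
    for i in range(n):
        s = math.sin((i + 0.5) * step)
        y = e2 + (e1 - e2) * s * s          # runs from e2 to e1
        total += y ** k / math.sqrt(y - e3)
    return 4.0 * math.sqrt(3.0) * total * step

def recursion_residuals(h_tilde):
    """Left minus right sides of the last three lines of (8.92)."""
    j0, j1, j2, j3, j4 = (J(k, h_tilde) for k in range(5))
    return (j2 - 2.0 * j0,
            j3 - 6.0 / 5.0 * (3.0 * j1 + 2.0 * h_tilde * j0),
            j4 - 12.0 / 7.0 * (2.0 * h_tilde * j1 + 5.0 * j0))

residuals = [recursion_residuals(h) for h in (-0.5, 0.0, 0.5)]
```

Close to the homoclinic value $\tilde H = \tfrac23\sqrt2$ the zeros $e_2, e_3$ nearly coincide and $J_0$ diverges, so this simple quadrature should only be trusted well inside the interval.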

We now express $J_0$ in terms of the Weierstrass function $\wp(z)$, defined by the inverse elliptic integral
\[ z + k\omega + il\omega' = \int_{\wp(z)}^{\infty} \frac{d\tilde z}{\sqrt{q(\tilde z)}}. \tag{8.93} \]

See for example [33, 35, 46] for some background. Due to the complex Riemann surface of $\sqrt{q(z)}$, two complex periods $\omega, i\omega'$ of the complex Weierstrass function $\wp(z)$ arise. Choosing $\omega, \omega' > 0$ we see that
\[ J_0(\tilde H) = 2\sqrt3\,\omega'. \tag{8.94} \]
Indeed, $\wp(\omega/2) = e_1$, $\wp((\omega + i\omega')/2) = e_2$, and (8.94) follows by multiplication with $\sqrt{-1/12}$ and $J_0, \omega' > 0$. To compute $J_1$ we note that the definition (8.93) of the Weierstrass function $\wp(z)$ implies
\[ y^{1,\tilde H}(t) = \wp\bigl(\sqrt{-1/12}\,(t + t_0)\bigr) \tag{8.95} \]
by separation of variables in (8.87). Therefore,
\[ \begin{aligned} J_1(\tilde H) &= \int_0^T y(t)\,dt = \int_0^{J_0(\tilde H)} y^{1,\tilde H}(t)\,dt \\ &= \int_0^{\sqrt{12}\,\omega'} \wp\bigl(\sqrt{-1/12}\,t\bigr)\,dt = -\sqrt{-12}\int_0^{i\omega'} \wp(z)\,dz \\ &= 2\sqrt3\, i\,\zeta(i\omega') \end{aligned} \tag{8.96} \]
where $\zeta' = -\wp$ denotes the Weierstrass $\zeta$-function. The sign of the derived expression for $J_1$ can, in case of doubt, be derived by continuation from $\tilde H = -\tfrac23\sqrt2$, where $\omega' = 2^{-1/4} 3^{-1/2}\pi$ and $J_1 = 2^{5/4}\pi$. For later reference we collect some further properties of $J_0, J_1$. It is possible to analytically show continuity at $\tilde H = \pm\tfrac23\sqrt2$ of the following expressions
\[ 2 J_1 + 3\tilde H J_0 = \begin{cases} 0 & \text{at } \tilde H = -\tfrac23\sqrt2 \\ 2^{13/4} \cdot 3 & \text{at } \tilde H = \tfrac23\sqrt2. \end{cases} \tag{8.97} \]

Figure 8.7. Plots of the nonlinearities $2J_1 + 3\tilde H J_0$, $3\tilde H J_1 + 4J_0$, and $g(\tilde H)$ over $-\tfrac23\sqrt2 \le \tilde H \le \tfrac23\sqrt2$.

Similarly, we obtain the limiting values of
\[ 3\tilde H J_1 + 4 J_0 = \begin{cases} 0 & \text{at } \tilde H = -\tfrac23\sqrt2 \\ 2^{15/4} \cdot 3 & \text{at } \tilde H = \tfrac23\sqrt2. \end{cases} \tag{8.98} \]

Numerical inspection of the expressions (8.97), (8.98) by Mathematica indicates, but of course does not quite prove, positivity
\[ 2 J_1 + 3\tilde H J_0 > 0, \qquad 3\tilde H J_1 + 4 J_0 > 0 \tag{8.99} \]
for $-\tfrac23\sqrt2 < \tilde H \le \tfrac23\sqrt2$. Concerning the quotient nonlinearity
\[ g(\tilde H) := \frac57\, \frac{3\tilde H J_1 + 4 J_0}{2 J_1 + 3\tilde H J_0} \tag{8.100} \]
which will play a crucial role in section 8.9, we note the limiting values
\[ g(\tilde H) = \begin{cases} \sqrt2 & \text{at } \tilde H = -\tfrac23\sqrt2 \\ \tfrac57\sqrt2 & \text{at } \tilde H = \tfrac23\sqrt2. \end{cases} \tag{8.101} \]
The limit at $\tilde H = \tfrac23\sqrt2$ follows from (8.97), (8.98). The limit at $\tilde H = -\tfrac23\sqrt2$ follows easily when relating $g(\tilde H)$ back to the change of stability along the Hopf line $\lambda = y$ in the equilibrium plane; see (8.112). We will also trust the numerics of Mathematica to reliably indicate that
\[ \tfrac57\sqrt2 < g(\tilde H) < \sqrt2, \qquad -\tfrac32\tilde H < g(\tilde H) \tag{8.102} \]
for $|\tilde H| < \tfrac23\sqrt2$; see figure 8.7.
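The bounds (8.102) are stated on numerical evidence, and they can be re-examined independently of Mathematica. The following sketch is only a spot check at a few sample values, not a proof. It assumes the quadrature form $\int_0^T f(y(t))\,dt = 4\sqrt3 \int_0^{\pi/2} f(y)\,(y - e_3)^{-1/2}\,d\theta$ with $y = e_2 + (e_1 - e_2)\sin^2\theta$, obtained from (8.90); function names and sample points are illustrative.

```python
import math

SQRT2 = math.sqrt(2.0)

def weierstrass_roots(h_tilde):
    """Zeros e1 >= e2 >= e3 of q(y) = 4y^3 - 24y - 24*h_tilde."""
    arg = 3.0 * h_tilde / (2.0 * SQRT2)
    phi = math.acos(max(-1.0, min(1.0, arg)))
    return tuple(2.0 * SQRT2 * math.cos((phi - 2.0 * math.pi * k) / 3.0)
                 for k in range(3))

def period_integral(f, h_tilde, n=2000):
    """Integral of f(y(t)) over one period of (8.86), by the midpoint rule."""
    e1, e2, e3 = weierstrass_roots(h_tilde)
    step = 0.5 * math.pi / n
    total = 0.0
    for i in range(n):
        s = math.sin((i + 0.5) * step)
        y = e2 + (e1 - e2) * s * s
        total += f(y) / math.sqrt(y - e3)
    return 4.0 * math.sqrt(3.0) * total * step

def g(h_tilde):
    """Quotient nonlinearity (8.100)."""
    denominator = period_integral(lambda y: 2.0 * y + 3.0 * h_tilde, h_tilde)  # 2 J1 + 3 H~ J0
    numerator = period_integral(lambda y: 3.0 * h_tilde * y + 4.0, h_tilde)    # 3 H~ J1 + 4 J0
    return 5.0 * numerator / (7.0 * denominator)

samples = {h: g(h) for h in (-0.9, -0.5, 0.0, 0.5)}
bounds_hold = all(5.0 * SQRT2 / 7.0 < value < SQRT2 and -1.5 * h < value
                  for h, value in samples.items())
```

Integrating the combinations $2y + 3\tilde H$ and $3\tilde H y + 4$ directly, rather than forming them from separately computed $J_0, J_1$, avoids cancellation near the centre value $\tilde H = -\tfrac23\sqrt2$, where both combinations tend to zero.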

8.9 Poincaré flows of first integrals

Further postponing the issue of higher-order perturbations slightly, we study the Poincaré flow (8.81) in this section. By (8.82)–(8.84) and recursions (8.92) for the complete elliptic integrals $J_k(\tilde H)$, the Poincaré flow takes the form
\[ \begin{aligned} \dot\Theta &= \tfrac25\, \Theta^{5/4} (b + 1)(2 J_1 + 3\tilde H J_0) \\ \dot H &= 2\Theta^{7/4}\bigl(\tfrac15 \lambda \Theta^{-1/2}(2 J_1 + 3\tilde H J_0) - \tfrac17 (b + 2)(3\tilde H J_1 + 4 J_0)\bigr). \end{aligned} \tag{8.103} \]

Here $J_0 = J_0(\tilde H)$, $J_1 = J_1(\tilde H)$ as in section 8.8. To derive (8.103) we evaluate the integrals $I(\Theta, H)$ in (8.82) along $y(t) = y^{\Theta,H}(t)$, then substitute $y^{1,\tilde H}$ by (8.69), replace $H$ by $\tilde H = \Theta^{-3/2} H$ according to (8.70), and apply recursions (8.92) which only hold for $\Theta = 1$, before grouping terms as in (8.103). Since $J_k = J_k(\tilde H)$ we prefer to use $\tilde H$ as a variable in (8.103) directly and write
\[ \begin{aligned} \dot\Theta &= \tfrac25\, \Theta^{5/4} (b + 1)(2 J_1 + 3\tilde H J_0) \\ \dot{\tilde H} &= \tfrac25\, \Theta^{1/4} (b + 1)(2 J_1 + 3\tilde H J_0)\Bigl(\frac{\lambda}{b + 1}\,\Theta^{-1/2} - \tfrac32 \tilde H - \alpha g(\tilde H)\Bigr) \end{aligned} \tag{8.104} \]
where we have abbreviated
\[ \alpha = \frac{b + 2}{b + 1} = 1 + \frac{1}{b + 1}, \qquad g(\tilde H) = \frac57\, \frac{3\tilde H J_1(\tilde H) + 4 J_0(\tilde H)}{2 J_1(\tilde H) + 3\tilde H J_0(\tilde H)}. \tag{8.105} \]
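The reduction from the integrals (8.82) to the closed form (8.103) can be cross-checked numerically. The sketch below is a hedged illustration: it evaluates (8.82) by quadrature at one arbitrarily chosen point of the first integrals, with arbitrary $\lambda$ and $b$, and compares the result with the right-hand sides of (8.103). It assumes $\dot y^2 = 2H + 2\Theta y - \tfrac13 y^3$, which is (8.67), (8.68) rewritten, and the same angle substitution as in (8.90). All names and parameter values are illustrative.

```python
import math

def roots(theta, h):
    """Zeros e1 >= e2 >= e3 of q(y) = 4y^3 - 24*theta*y - 24*h (theta > 0)."""
    arg = 3.0 * h / (2.0 * theta * math.sqrt(2.0 * theta))
    phi = math.acos(max(-1.0, min(1.0, arg)))
    r = 2.0 * math.sqrt(2.0 * theta)
    return tuple(r * math.cos((phi - 2.0 * math.pi * k) / 3.0) for k in range(3))

def period_integral(f, theta, h, n=2000):
    """Integral of f(y(t)) over one period of (8.65), by the midpoint rule."""
    e1, e2, e3 = roots(theta, h)
    step = 0.5 * math.pi / n
    total = 0.0
    for i in range(n):
        s = math.sin((i + 0.5) * step)
        y = e2 + (e1 - e2) * s * s
        total += f(y) / math.sqrt(y - e3)
    return 4.0 * math.sqrt(3.0) * total * step

def integrals_8_82(theta, h, lam, b):
    """Both components of (8.82), integrated directly."""
    def bracket(y):
        x2_squared = 2.0 * h + 2.0 * theta * y - y ** 3 / 3.0   # equals -q(y)/12
        return (theta - 0.5 * y * y) * (lam - y) + b * x2_squared
    return (period_integral(bracket, theta, h),
            period_integral(lambda y: -y * bracket(y), theta, h))

def closed_form_8_103(theta, h, lam, b):
    """Right-hand sides of (8.103), with J0, J1 taken at theta = 1."""
    h_tilde = h / theta ** 1.5
    j0 = period_integral(lambda y: 1.0, 1.0, h_tilde)
    j1 = period_integral(lambda y: y, 1.0, h_tilde)
    first = 2.0 * j1 + 3.0 * h_tilde * j0
    second = 3.0 * h_tilde * j1 + 4.0 * j0
    return (0.4 * theta ** 1.25 * (b + 1.0) * first,
            2.0 * theta ** 1.75 * (lam * theta ** -0.5 * first / 5.0
                                   - (b + 2.0) * second / 7.0))

direct = integrals_8_82(1.7, 0.4, 0.3, 0.5)
closed = closed_form_8_103(1.7, 0.4, 0.3, 0.5)
```

Agreement of `direct` and `closed` also reflects the cancellation of the $\lambda$-terms in the first component: the contributions $\Theta\lambda J_0$ and $-\tfrac12\lambda J_2$ cancel because $J_2 = 2\Theta J_0$ after scaling.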

We recall the properties of the right-hand sides of the Poincaré flow (8.104) as collected in section 8.8, (8.97)–(8.102). We have assumed $b \ne -1$ so that $b + 1 \ne 0$. Moreover, $\Theta > 0$ is invariant, and $2 J_1 + 3\tilde H J_0 > 0$ except for the centres $\tilde H = -\tfrac23\sqrt2$. Therefore, it makes sense to parametrize all $(\Theta, \tilde H)$-orbits over $\tau = \log\Theta$. Writing ${}' = \frac{d}{d\tau}$ this leads to the simplified equation
\[ \tilde H'(\tau) = \frac{\lambda}{b + 1}\, e^{-\tau/2} - \tfrac32 \tilde H - \alpha g(\tilde H). \tag{8.106} \]

It is obvious that we can now absorb the value $|\lambda/(b + 1)|$ into a mere shift of 'time' $\tau$, as long as $\lambda = y_2 \ne 0$. The autonomous case $\lambda = 0$ corresponds to the limiting case $\tau = +\infty$ discussed below and will be omitted for simplicity. To understand the dynamics of the Poincaré flow (8.81) for $\lambda \ne 0$, it is therefore sufficient to discuss the two vector fields
\[ \tilde H'(\tau) = \pm e^{-\tau/2} - \tfrac32 \tilde H - \alpha g(\tilde H) \tag{8.107} \]
for the various regimes of the real parameter $\alpha = 1 + 1/(b + 1)$. For each of the signs $\pm$ which is given by $\operatorname{sign}(\lambda(b + 1))$, the following three cases arise:
\[ \begin{array}{lll} \text{Case (A)} & b < -17/12 & \iff\ -7/5 < \alpha < 1 \\ \text{Case (B)} & -17/12 < b < -1 & \iff\ \alpha < -7/5 \\ \text{Case (C)} & -1 < b & \iff\ 1 < \alpha. \end{array} \tag{8.108} \]
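The case distinction (8.108) is plain arithmetic in $b$ and can be written down directly. The helper below is a small illustrative sketch; the function names and the use of exact rational arithmetic are choices made here, not part of the text.

```python
from fractions import Fraction

def alpha(b):
    """alpha = 1 + 1/(b + 1) as in (8.105); b = -1 is excluded."""
    b = Fraction(b)
    if b == -1:
        raise ValueError("b = -1 is excluded, since b + 1 must not vanish")
    return 1 + 1 / (b + 1)

def classify(b):
    """Return 'A', 'B' or 'C' according to (8.108), or None on the border b = -17/12."""
    a = alpha(b)
    if Fraction(-7, 5) < a < 1:
        return "A"
    if a < Fraction(-7, 5):
        return "B"
    if a > 1:
        return "C"
    return None  # alpha = -7/5, that is b = -17/12

cases = {b: classify(b) for b in (-2, Fraction(-5, 4), 0)}
```

Exact fractions avoid any rounding ambiguity at the border value $b = -17/12$, where $\alpha = -7/5$.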

[Figure 8.8: six panels in the $(\tau, \tilde H)$-plane, columns $\lambda < 0$ and $\lambda > 0$, rows (A) $-\tfrac75 < \alpha < 1 \iff b < -\tfrac{17}{12}$, (B) $\alpha < -\tfrac75 \iff -\tfrac{17}{12} < b < -1$, (C) $1 < \alpha \iff -1 < b$; marked features: 1-het, hyp. Hopf, ell. Hopf, $S_{\mathrm{in}}$, $S_{\mathrm{out}}$, $F_{\mathrm{in}}$, $F_{\mathrm{out}}$.]

Figure 8.8. Phase portraits of equation (8.107); see also (8.105), (8.108) and the explanations following there. Time direction refers to the flow (8.104).

Phase portraits of the six resulting cases are provided in figure 8.8. We comment on the derivation and interpretation of these phase portraits next. First we recall that orbits of (8.104) and (8.107) coincide, if we put
\[ \Theta = \Bigl(\frac{\lambda}{b + 1}\Bigr)^2 e^\tau. \tag{8.109} \]
For $b < -1$ however, the time direction of (8.107) has to be reversed to account for the time direction vested into these orbits by (8.104).

We also recall that $\tilde H = -\tfrac23\sqrt2$ refers to the equilibrium half line $x = 0$, $y_1 = y = \sqrt{2\Theta} > 0$, for any $y_2 = \lambda$, in terms of the original Takens–Bogdanov system (8.28)–(8.30), (8.33), (8.34) and its normal forms (8.36), (8.51), (8.58), (8.60); see also figure 8.5. Indeed $\Theta^{-3/2} H = \tilde H = -\tfrac23\sqrt2$, $y = \sqrt{2\Theta}$, and the definitions (8.64), (8.66)–(8.68) of $\Theta, H$ imply $x = 0$, $y > 0$. Consistent with this observation, both $\dot\Theta$ and $\dot{\tilde H}$ vanish along this line, because $2 J_1 + 3\tilde H J_0 = 0$ at $\tilde H = -\tfrac23\sqrt2$; see (8.97). In the original coordinates $(x, y)$ these equilibria are normally hyperbolic, except along the Hopf line
\[ \lambda = y > 0. \tag{8.110} \]

Indeed, the (strict) unstable dimension is 2, for $0 < y < \lambda$, and 0 for $0 < \lambda < y$. These unstable dimensions are easily detected in the right half of figure 8.8. In the $(\tilde H, \tau)$-coordinates of (8.106), the Hopf line (8.110) manifests itself by a horizontal tangent
\[ \tilde H' = 0 \tag{8.111} \]
at $\tilde H = -\tfrac23\sqrt2$, for any fixed $\lambda > 0$. Evaluating condition (8.111) together with $\lambda = y = \sqrt{2\Theta}$ immediately yields the limiting value
\[ g(-\tfrac23\sqrt2) = \sqrt2 \tag{8.112} \]
as was claimed in (8.101). For fixed $\lambda < 0$, in the left half of figure 8.8, we just as easily detect strict normal stability of the equilibria $x = 0$, $y > 0$.

For very negative $\tau$, alias small $\Theta > 0$, it is easy to discuss the orbits of (8.107). Since $|\tilde H| \le \tfrac23\sqrt2$ and $\tfrac57\sqrt2 \le g \le \sqrt2$ are uniformly bounded in the region of interest, we obtain almost vertical orbits which connect the horizontal boundaries $\tilde H = \pm\tfrac23\sqrt2$ in very short 'time' intervals $\tau$. The direction, with proper reversal for $b < -1$, is easily determined. For very positive $\tau$, as well as for $\lambda = 0$, the exponential term disappears and we are left with the autonomous limiting equation
\[ -\tilde H'(\tau) = \tfrac32 \tilde H + \alpha g(\tilde H). \tag{8.113} \]

We use properties (8.101), (8.102) of $g$ to study this limiting equation for $|\tilde H| < \tfrac23\sqrt2$. Since $\tfrac57\sqrt2 \le g \le \sqrt2$ and $-\tfrac32\tilde H \le g$, the right-hand side possesses zeros if, and only if,
\[ -\tfrac75 < \alpha < 1. \tag{8.114} \]
Only in this case (A), therefore, the orbits of (8.107) do not connect the horizontal boundaries $\tilde H = \pm\tfrac23\sqrt2$ within finite 'time' $\tau$, for all large positive $\tau$, as they did for very negative $\tau$ and still do for large positive $\tau$ in the other cases (B), (C). Rather, in case (A), the zero or, hypothetically, zeros of $\tfrac32\tilde H + \alpha g(\tilde H)$ provide semi-invariant regions, for large $\tau > 0$, which prevent orbits from connecting the horizontal boundaries. Instead,
\[ \lim_{\tau \to +\infty} \tilde H(\tau) \in \bigl(-\tfrac23\sqrt2, +\tfrac23\sqrt2\bigr) \tag{8.115} \]
exists for all orbits starting at sufficiently large $\tau > 0$. This accounts for the shaded regions labelled 'y-entry', in figure 8.5(A). Note that the backwards escape time $t < 0$ is finite in terms of the original system (8.103), due to (8.115) and positivity property (8.99) of $2J_1 + 3\tilde H J_0$.

We discuss the upper horizontal boundary $\tilde H = +\tfrac23\sqrt2$ next. This $\tilde H$-value does not only indicate the half line of saddle equilibria $x = 0$, $y = -\sqrt{2\Theta} < 0$, which can be discussed analogously to the case $\tilde H = -\tfrac23\sqrt2$ treated above. It also characterizes the fate under perturbation to positive $\varepsilon$ of the family of homoclinic orbits, which exists for $\varepsilon = 0$. This latter point of view is in fact the only appropriate one, if we insert the non-zero limiting values (8.97), (8.98), (8.101) in (8.103), (8.107), at $\tilde H = \tfrac23\sqrt2$. Then $\dot\Theta \ne 0$ along this line, and $\dot{\tilde H}$ vanishes only at the simple zero
\[ \pm e^{-\tau/2} = \tfrac32 \tilde H + \alpha g(\tilde H) = \sqrt2\, \frac{12b + 17}{7(b + 1)} \tag{8.116} \]
of the right-hand side of (8.107). In terms of (8.104) and the original (scaled) variables $y = -\sqrt{2\Theta} = \mp\sqrt2\, e^{\tau/2} \lambda/(b + 1)$ this occurs along the asymptote
\[ \lambda = -\tfrac17 (12b + 17)\, y. \tag{8.117} \]
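The location of the simple zero (8.116) and the asymptote (8.117) follow from (8.106) by inserting the saddle values $\tilde H = \tfrac23\sqrt2$ and $g = \tfrac57\sqrt2$. The sketch below solves this condition for the value of the first integral at which the zero occurs and checks it against (8.117). The function names are hypothetical, and the sign condition $\lambda/(12b + 17) > 0$ is derived here from (8.116) rather than stated in the text.

```python
import math

SQRT2 = math.sqrt(2.0)

def heteroclinic_theta(b, lam):
    """Value Theta > 0 of the simple zero (8.116) on the saddle boundary, or None.

    From (lam/(b+1)) * Theta**-0.5 = sqrt(2)*(12b+17)/(7(b+1)) one gets
    Theta**0.5 = 7*lam / (sqrt(2)*(12b+17)), which requires lam/(12b+17) > 0.
    """
    denominator = 12.0 * b + 17.0
    if denominator == 0.0 or lam / denominator <= 0.0:
        return None
    return (7.0 * lam / (SQRT2 * denominator)) ** 2

def rhs_8_106_on_saddle_boundary(theta, b, lam):
    """Right-hand side of (8.106) at H~ = (2/3)sqrt(2), where g = (5/7)sqrt(2)."""
    alpha = 1.0 + 1.0 / (b + 1.0)
    return lam / (b + 1.0) / math.sqrt(theta) - SQRT2 - alpha * 5.0 * SQRT2 / 7.0

theta_c = heteroclinic_theta(0.0, 1.0)     # case (C), lam > 0
y_c = -math.sqrt(2.0 * theta_c)            # saddle position y = -sqrt(2*Theta)
theta_a = heteroclinic_theta(-2.0, -1.0)   # case (A), lam < 0
```

For $\lambda > 0$ the condition $\lambda/(12b + 17) > 0$ means $b > -17/12$, consistent with the '1-het' labels appearing for cases (B), (C) in the right column of figure 8.8 and for case (A) in the left column.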

These points in figure 8.8, and lines in figure 8.5, are labelled '1-het'. They correspond to zeros of an associated Melnikov function and to saddle–saddle heteroclinic orbits, in the original system, as we will see in the next section. The remaining boundary regions $\tilde H = \tfrac23\sqrt2$ where $\dot{\tilde H}$ is positive or negative, respectively, indicate a splitting of the homoclinic bubble such that orbits escape in backward or forward time from a neighbourhood $U$ of the origin in the equilibrium plane $x = 0$. In figure 8.5 this behaviour is marked as 'x-entry/exit'.

8.10 Higher order: Poincaré flow, averaging, Melnikov

Before completing our analysis of Takens–Bogdanov bifurcations without parameters, in section 8.11, we now digress for a discussion of the effects of omitted higher-order terms. We recall the three approximation steps which we have applied:

(a) truncation to second-order normal form (section 8.4)
(b) omission of scaling terms of order $\varepsilon^2$ and higher (section 8.5)
(c) approximation of the Poincaré map $\Pi^\varepsilon$ by the time $\varepsilon$ map of the Poincaré flow (section 8.7, (8.79)–(8.82)).

To ensure that our original variables $((\varepsilon/a)^4 x_1, (\varepsilon/a)^3 x_2, (\varepsilon/a)^2 y, (\varepsilon/a)^2 \lambda)$ cover a neighbourhood $U$ of the origin in $\mathbb{R}^4$, we fix $C > 0$ arbitrarily large, consider the scaled variables $(x_1, x_2, y, \lambda)$ in a ball of radius $C$, and analyse the complete, non-truncated rescaled dynamics for $0 \le \varepsilon < \varepsilon_0 = \varepsilon_0(C)$.

In terms of the variables $(\tau, \tilde H, y)$, $\tau = \log\Theta$, we immediately see that solutions which do not intersect the Poincaré section
\[ \Sigma = \{(\tau, \tilde H, y) \mid \tau \in \mathbb{R},\ |\tilde H| < \tfrac23\sqrt2,\ y = e_2\} \tag{8.118} \]
either become unbounded, or else belong to some equilibrium in the closure $\bar\Sigma$ or to its strong stable or strong unstable manifold $W^{ss}(y)$, $W^{uu}(y)$. As before $e_2 = e_2(\Theta, H)$ denotes the middle zero of the Weierstrass polynomial $q(y)$, where indeed $x_2 = 0$; see (8.67), (8.68), figure 8.4, and (8.76), (8.90). We are interested in bounded non-stationary solutions which remain in $U$ and thus intersect $\Sigma$.

We have seen in section 8.7 that the Poincaré map $\Pi^\varepsilon$, wherever defined on $\Sigma$, is just some first-order discretization of the Poincaré flow (8.81), (8.82) with time step $\varepsilon$. This statement concerning approximation (c) was made for Poincaré maps $\Pi^\varepsilon$ which arise after completion of the approximation steps (a), (b). More generally, we observe that the same statement evidently remains true, when approximations (a), (b) are included. Indeed, we then include the 'parameter' $\lambda = y_2$ in our construction of $\Sigma$, $\Pi^\varepsilon$, and the Poincaré flow. Since the omitted terms perturb the Poincaré map $\Pi^\varepsilon$ by terms of order $\varepsilon^2$ or higher order, the map $\Pi^\varepsilon$ remains a first-order discretization of the unperturbed Poincaré flow.

In the remainder of this section we discuss two issues. First, we recall how the Poincaré flow and its discretization $\Pi^\varepsilon$ relate to the usual averaging approach. Second, we relate the Poincaré flow and $\Pi^\varepsilon$ to the Melnikov functions associated with the family of homoclinic orbits. In particular we address the behaviour of $\Pi^\varepsilon$ near the saddle boundary $\tilde H = \tfrac23\sqrt2$ of the Poincaré section $\Sigma$.

The averaging procedure aims at eliminating the oscillations of $y$ from the slow flow of $(\Theta, H)$ in (8.74); see [4, 6, 30] for background information. The elimination is achieved by successive $y$-dependent transformations of $(\Theta, H)$. These transformations successively lead to autonomous vector fields for $(\Theta, H)$, independent of $y$, with $y$-dependent corrections of order $\varepsilon^N$, $N = 2, 3, \ldots$. The first step $N = 2$ replaces $y^k$ in (8.74) by the time average
\[ \frac1T \int_0^T (y(t))^k\,dt = \frac{J_k}{J_0} \tag{8.119} \]
over the unperturbed $T$-periodic solution $y(t)$, in the notation of (8.83). Replacing $J_k$ by $J_k/J_0$, everywhere, converts our Poincaré flow (8.81), (8.82) to the first averaged flow of the standard approach. The Poincaré flow and the averaged flow just differ by an Euler multiplier $T = T(\Theta, H) = J_0$. In particular the analysis of section 8.9 applies. Except for our interpretation of the Poincaré map $\Pi^\varepsilon$ as the time $\varepsilon$ discretization, rather than a time $T \cdot \varepsilon$ discretization, of the $(\Theta, H)$ flow the two view points are completely equivalent, as long as $T(\Theta, H) = J_0$ remains finite.

Near homoclinicity, alias near $J_0 = \infty$ or near the upper horizontal boundary $\tilde H = \tfrac23\sqrt2$, the Poincaré flow (8.81), (8.82) offers an advantage, because the vector field $I = I(\Theta, H)$ can be interpreted directly in terms of Melnikov functions; see [13, 23] for background information. For systems
\[ \dot z = f_0(z) + \varepsilon f_1(\varepsilon, z) \tag{8.120} \]
with an unperturbed homoclinic orbit $z(t)$, at $\varepsilon = 0$, a Melnikov function $M$ associated with $z(t)$ is given by
\[ M = \int_{-\infty}^{+\infty} \psi(t)^T f_1(0, z(t))\,dt \tag{8.121} \]
where $\psi$ is a non-trivial bounded solution of the adjoint variational equation
\[ \dot\psi = -(D_z f_0(z(t)))^T \psi. \tag{8.122} \]

In the presence of first integrals $\Theta, H$ at $\varepsilon = 0$ we may choose
\[ \psi(t) = D_z\Theta(z(t)),\ D_z H(z(t)) \tag{8.123} \]
and obtain corresponding Melnikov functions $M^\Theta, M^H$. By definition (8.81), (8.82), these Melnikov functions coincide with the components of the Poincaré flow $I = (I^\Theta, I^H)$:
\[ M^\Theta = I^\Theta, \qquad M^H = I^H \tag{8.124} \]
at homoclinic orbits. As in classical Melnikov theory, which usually deals with non-degenerate homoclinic orbits rather than families of homoclinic orbits, the terms $\pm\varepsilon (I^\Theta, I^H)(\Theta_0, H_0)$ indicate the leading order in $\varepsilon$ of the return to $\Sigma$ of the strong unstable and strong stable manifolds $W^{uu}(y)$, $W^{ss}(y)$, respectively, of the saddle equilibrium at $y = -\sqrt{2\Theta_0}$, $\Theta_0^{-3/2} H_0 = \tilde H = \tfrac23\sqrt2$. This of course assumes that the corresponding point
\[ \hat\Theta = \Theta_0 \pm \varepsilon I^\Theta(\Theta_0, H_0), \qquad \hat H = H_0 \pm \varepsilon I^H(\Theta_0, H_0) \tag{8.125} \]
actually lies in $\Sigma$, to within order $\varepsilon^2$. Similarly, orbits for which $(\hat\Theta, \hat H)$ lies outside $\Sigma$, to within order $\varepsilon$, do not return. Indeed the integrals $I = (I^\Theta, I^H)(\varepsilon, \Theta_0, H_0)$ in (8.80) converge of order $\varepsilon$, uniformly for $T^\varepsilon \le \infty$, as long as the orbit starting at $(\Theta_0, H_0) \in \bar\Sigma$ does return to $\Sigma$. For saddle equilibria $(\Theta_0, H_0) \in \partial\Sigma$ where $T^\varepsilon = +\infty$, we only have to replace the return point $(\bar\Theta, \bar H)$ in $\Sigma$ by the return point of the strong unstable or strong stable manifold of $(\Theta_0, H_0)$ and observe the uniform estimate
\[ (\bar\Theta, \bar H) - (\hat\Theta, \hat H) = O(\varepsilon^2). \tag{8.126} \]

In section 8.9 we have determined 'heteroclinic' values of $\Theta_0 > 0$ for which
\[ \dot{\tilde H} = 0 \tag{8.127} \]
at the saddle boundary $\tilde H = \tfrac23\sqrt2$; see (8.116), (8.117). Because these zeros of $\dot{\tilde H}$ are simple, with respect to $\Theta$, an adaptation of standard Melnikov theory shows the existence of nearby heteroclinic orbits. To within order $\varepsilon^2$, these orbits start at the computed values of $\Theta_0$, $H_0 = \tfrac23\sqrt2\,\Theta_0^{3/2}$, $y$, $\lambda$ and terminate at the value $\hat y, \lambda$ associated with $\hat\Theta$, $\hat H = \tfrac23\sqrt2\,\hat\Theta^{3/2}$. By (8.125), (8.117) the corresponding values $y, \hat y$ differ by order $\varepsilon\lambda$. In terms of the original variables $y_1 = (\varepsilon/a)^2 y$, $y_2 = (\varepsilon/a)^2 \lambda$ we therefore obtain a cusp of saddles in the equilibrium $y$-plane, with the two half-arcs connected almost horizontally by a heteroclinic orbit. Indeed the horizontal width of the heteroclinic sector is of order $\varepsilon^3$ at distance $\varepsilon^2$ from the origin.

8.11 Geometry of Poincaré maps

In this section we complete our analysis of Takens–Bogdanov bifurcations without parameters. In particular, we complete our proof of the heteroclinic and x-entry/exit orbits indicated in figure 8.5. Although higher-order terms introduce a small drift in the 'parameter' $\lambda$, we may safely ignore this effect for simplicity of presentation. Likewise, we omit a discussion of the rather simple case $\lambda = 0$ for brevity.

In the previous section we have advocated using the $(\Theta, H)$ Poincaré flow (8.103) not only on the Poincaré section $\Sigma$ given by $|\tilde H| < \tfrac23\sqrt2$, $\tau \in \mathbb{R}$, but up to the boundary $\tilde H = \tfrac23\sqrt2$ of the saddle equilibria $y < 0$ and their strong stable and unstable manifolds $W^{ss}(y)$, $W^{uu}(y)$. Alternatively to $(\Theta, H)$, we can use the coordinates $\tau = \log(\Theta((b + 1)/\lambda)^2)$, $\tilde H = \Theta^{-3/2} H$ of (8.107) and figure 8.8. Note, however, that a time $\varepsilon$ discretization step of (8.103) as realized by the Poincaré return map $\Pi^\varepsilon$ corresponds to a $\tau$-step of size
\[ \Theta^{-1}\dot\Theta\,\varepsilon = \tfrac25\, \Theta^{1/4} (b + 1)(2 J_1 + 3\tilde H J_0) \cdot \varepsilon. \tag{8.128} \]

Near the saddle boundary where Melnikov theory applies this expression simplifies to a $\tau$-step of
\[ \Theta^{-1}\dot\Theta\,\varepsilon = \tfrac{48}5\, (2\Theta)^{1/4} (b + 1) \cdot \varepsilon \tag{8.129} \]
by (8.97). As was pointed out earlier, we may restrict our analysis to regions $|\tilde H| \le \tfrac23\sqrt2$, $\tau \le C$, and $0 < \varepsilon < \varepsilon_0(C)$.

We first consider the simplest cases: (B), (C) with $\lambda < 0$. In these two cases all orbits of the Poincaré flow are pointing strictly downwards. Therefore, the collection
\[ W^{cu} = \bigcup_{y < 0} W^{uu}(y) \tag{8.130} \]
of strong unstable manifolds of the saddles intersects $\Sigma$ in an infinite sequence of 'horizontal' lines, accumulating on the stable equilibria along $\tilde H = -\tfrac23\sqrt2$. All points of $\Sigma$ in between these lines leave the neighbourhood $U$ of $x = y = 0$, in backwards time, while converging to the stable equilibria in forward time. We can view $W^{cu}$ as a two-dimensional scroll, forward spiralling into the equilibria $x = 0$, $y > 0$. It is an interesting warm-up to visualize this scroll in figure 8.4.

The two most interesting cases are cases (B) and (C) with $\lambda > 0$ which exhibit Hopf bifurcation, without parameters of course, in coexistence with saddle heteroclinic orbits $\dot{\tilde H} = 0$ at $\tilde H = \tfrac23\sqrt2$ as discussed in section 8.10. The remaining two moderately interesting cases, (A) with $\lambda > 0$ and $\lambda < 0$, are similar to cases (B) and (C) with $\lambda > 0$, respectively: the former preserves the hyperbolic Hopf point but the heteroclinic orbit has disappeared through $\tau = +\infty$. Conversely, the latter preserves that heteroclinic orbit but has pushed the elliptic Hopf point out through $\tau = +\infty$.

We first discuss the interesting case (C), $\lambda > 0$ which involves both a saddle heteroclinic orbit and an elliptic Hopf point; see figure 8.9. For a local section near the elliptic Hopf point which involves exponentially small splittings of heteroclinic separatrices see [21] and figure 8.3(b) left. We first follow the saddle segment $S$, between the two saddle end points $y_\pm < 0$ of the saddle–saddle heteroclinic, forward under $\Pi^\varepsilon$. The (magenta) forward continuation is a piecewise smooth curve $W^c_+$ with tangent jumps of order $\varepsilon$ at the (forward) points $y_{+,n}$, $n = 0, 1, 2, \ldots$ of $W^{uu}(y_+) \cap \Sigma$. Let
\[ F_{\mathrm{in}} := \lim_{n \to \infty} y_{+,n}. \]

Typically the (green) curve $W^{ss}(F_{\mathrm{in}})$ will intersect $W^c_+$ transversely, even at the points $y_{+,n}$. The curve $W^c_+$ will therefore limit to an interval of stable equilibria $y^c_+ > 0$, around $F_{\mathrm{in}}$ to the right of the Hopf point. Note, however, that transversality has not been proved but can typically be expected to hold for a first-order time $\varepsilon$ discretization of the Poincaré flow as given, for example, by the Poincaré map $\Pi^\varepsilon$. Similar statements hold true for the (blue) backwards continuation $W^c_-$ of $S$ in $\Sigma$ and its intersection with the (red) unstable manifolds $W^{uu}(y^c_-)$. We also illustrate the behaviour of some strong stable and strong unstable manifolds of other equilibria $y > 0$ on the bottom line $\tilde H = -\tfrac23\sqrt2$. Note how these manifolds transversely connect to equilibrium intervals on the other side of the elliptic Hopf point, or disappear partially, or disappear completely as their source points $y > 0$ move away from the Hopf point through $F_{\mathrm{out}}$. Also, transverse splitting effects should not be expected to be exponentially small any more, during this transition, but to be of order $\varepsilon$.

Outside the 'Hopf bubble' depicted in figure 8.10, we will encounter 'horizontal' copies of the centre manifolds $W^{cu} \cap \Sigma$, $W^{cs} \cap \Sigma$ of the saddles to the right and left, respectively, as we have indicated. These extend from the intersection points $y_{\pm,n}$ and form smooth continuations of the piece of $W^c_\pm$ immediately above $y_{\pm,n}$. Again we note that all orbits are heteroclinic, as indicated in figure 8.5, with only one saddle–saddle heteroclinic, or also become unbounded through the split homoclinic family. These facts persist under higher-order perturbations.

Figure 8.9. Poincaré section $\Sigma$ of case (C), $\lambda > 0$ of figure 8.8. Coding: red = $W^{uu}$ (centre), green = $W^{ss}$ (centre), magenta = $W^{cu}$ (saddle), and blue = $W^{cs}$ (saddle). See the description in section 8.11. See also the colour section.

Figure 8.10. Set of bounded orbits in the Poincaré section $\Sigma$ of case (C), $\lambda > 0$ of figure 8.8.

Figure 8.11. Poincaré section $\Sigma$ of case (B), $\lambda > 0$ of figure 8.8. Coding: red = $W^{uu}$ (centre), green = $W^{ss}$ (centre), magenta = $W^{cu}$ (saddle), and blue = $W^{cs}$ (saddle). See the description in section 8.11. See also the colour section.

Figure 8.12. Set of bounded orbits in the Poincaré section $\Sigma$ of case (B), $\lambda > 0$ of figure 8.8.

We now turn to the final interesting case (B), $\lambda > 0$, which involves both a saddle heteroclinic orbit and a hyperbolic Hopf point; see figure 8.11. For a local section near the hyperbolic Hopf point see again [21] and figure 8.3(b) right. This local analysis shows that the Hopf point itself possesses a stable (and an unstable) half-arc $W^s_{\mathrm{Hopf}}$ ($W^u_{\mathrm{Hopf}}$) indicated in green (red) and terminating at saddle points $S_{\mathrm{out}}$, $S_{\mathrm{in}}$ to the right (left) of the Hopf point. We encounter the strong stable (unstable) manifolds of the equilibria $y > 0$ at $\tilde H = -\tfrac23\sqrt2$. Also note the heteroclinic saddle–saddle connection from $y_-$ to $y_+$ which was established in section 8.10.

From our analysis of the Poincaré flow in figure 8.8(B), $\lambda > 0$, we conclude that $\Pi^\varepsilon$ maps the collection $W^{cu}$ of strong unstable manifolds of saddles to the right of $y_-$ as is indicated by $W^{cu}_{+n}$ in figure 8.11. Note how these manifolds $W^{cu}_{+n}$ converge to the union of $W^u_{\mathrm{Hopf}}$ with the bottom line of normally stable equilibria to the right of the Hopf point. A similar pattern $W^{cs}_{-n}$ arises from the centre stable manifold $W^{cs}$ of saddles to the left of $y_+$ under backwards iteration of $\Pi^\varepsilon$.

By continuity of the curves $W^{cu}_{+n}$, $W^{cs}_{-n'}$, $n, n' = 0, 1, 2, \ldots$, each curve $W^{cu}_{+n}$ must intersect each curve $W^{cs}_{-n'}$ at least once, in the sector of $\Sigma$ between $W^u_{\mathrm{Hopf}}$ and $W^s_{\mathrm{Hopf}}$, say at a point $y^{n+n'}_n$. Then the $(n + n' + 1)$ points $y^{n+n'}_k$, $k = 0, \ldots, n + n'$, lie on an $(n + n')$-heteroclinic orbit from the saddle $y^{n+n'}_0$ to the saddle $y^{n+n'}_{n+n'}$. Indeed $\Pi^\varepsilon(y^{n+n'}_k) = y^{n+n'}_{k+1}$. Note $y_- = y^1_0$, $y_+ = y^1_1$, in this notation. We call these points $(n + n')$-heteroclinic because they revolve around the equilibrium line $x = 0$ $(n + n')$ times before returning to the saddle line; see figure 8.4. Note that neither (expected) uniqueness nor transversality of these infinitely many saddle–saddle heteroclinic orbits was addressed here. We have only established their existence.

In conclusion, except for equilibria and the above $(n + n')$-heteroclinic orbits $y^{n+n'}_k$, all points in $\Sigma$ leave the region $U$ in forward or backward time, due to homoclinic splitting; see figure 8.12. In figure 8.11 we have indicated lifetimes in numbers of revolution $\pm n$, according to direction of exit. In the sector between $W^u_{\mathrm{Hopf}}$ and $W^s_{\mathrm{Hopf}}$ we have indicated the total number of revolutions, because the points outside $\bigcup_n (W^{cu}_n \cup W^{cs}_{-n})$ in this sector exit in both forward and backward time. Again we note that all nonstationary orbits which remain in $U$ are heteroclinic, as was claimed in section 8.3 and in figure 8.5, including saddle–saddle $(n + n')$-heteroclinic orbits $y^{n+n'}_k$, $k = 0, \ldots, n + n'$, for any $n + n' = 1, 2, 3, \ldots$. These facts persist under higher-order perturbations. In addition to the geometric insight now gained into the three different cases of Takens–Bogdanov bifurcation without parameters, these observations complete the proof of our claims of section 8.3.


8.12 Stiff hyperbolic balance laws

In this last section we return to example 8.2 of the introduction, (8.8)–(8.11) of hyperbolic conservation laws with stiff source terms
\[ \partial_t u + \partial_\xi F(u) = \varepsilon^{-1} G(u) + \varepsilon\delta\, \partial_\xi^2 u, \qquad u \in \mathbb{R}^n,\ \xi \in \mathbb{R}. \tag{8.131} \]

Here $\delta \ge 0$ is a fixed small parameter providing a small viscous regularization, and $\varepsilon \searrow 0$ accounts for the stiffness of the source term. For the sake of simplicity of the following calculations we restrict ourselves to the case $\delta = 0$ of vanishing viscosity. Results for small $\delta$ can be obtained by a perturbation analysis as demonstrated in [20, 37] for the case of Hopf bifurcation without parameters. Rescaling $t = \tilde t/\varepsilon$, $\xi = \tilde\xi/\varepsilon$ and omitting tildes we arrive at the $\varepsilon$-independent system
\[ \partial_t u + \partial_\xi F(u) = G(u), \qquad u \in \mathbb{R}^n,\ \xi \in \mathbb{R}. \tag{8.132} \]

Strict hyperbolicity of this balance law requires $F'(u)$ to possess $n$ distinct real eigenvalues $\alpha_1(u) < \cdots < \alpha_n(u)$ for any $u$. Travelling-wave solutions $u(t, \xi) = U(\xi - st)$ are given by solutions of
\[ \dot U = (F'(U) - s)^{-1} G(U) \tag{8.133} \]
as long as $s \notin \{\alpha_1, \ldots, \alpha_n\}$. Heteroclinic orbits of (8.133) between equilibria $u_-, u_+$ correspond to travelling waves $u(t, \xi) = U(\varepsilon^{-1}(\xi - st))$ of (8.131) which connect the left and right states $u_-, u_+$ by a thin layer with width of order $\varepsilon$. In the limit $\varepsilon \searrow 0$ they tend to discontinuous weak solutions, called shocks.

For a system of $m$ pure conservation laws combined with $n - m$ balance laws we expect $m$-dimensional equilibrium manifolds of (8.133). A different mechanism that leads to manifolds of equilibria is provided by source terms $G(u)$ which only depend on some, but not all, of the components of $u$. Chemical reactions, for example, typically depend on concentrations and temperature but not on flow velocities. Aiming at Takens–Bogdanov points on such equilibrium manifolds we consider four-dimensional systems $n = 4$ with two-dimensional manifolds of vanishing source $G = 0$.

We intend to provide examples of (8.133) exhibiting Takens–Bogdanov bifurcations which are generated by the interaction of flux $F$ and source $G$. Separately, none of these components would be compatible with such complicated heteroclinic behaviour as we have observed, for example, near the Hopf line. In fact, the flux can be chosen as a gradient $F(U) = \nabla_U \Phi(U)$. Therefore the travelling wave equation that corresponds to the pure conservation law
\[ \partial_t u + \partial_\xi F(u) = \varepsilon\delta\, \partial_\xi^2 u \tag{8.134} \]
is of gradient type
\[ \dot U = \nabla\Phi(U) - sU + C, \qquad C = \text{constant}. \tag{8.135} \]

All bounded trajectories of this system converge to equilibria; no foci occur. The pure kinetics

$$\partial_t u = G(u) \tag{8.136}$$

on the other hand, can be chosen to be stabilizing: $G'(u)$ will possess real, negative eigenvalues, in addition to the two trivial eigenvalues generated by the manifold of equilibria. Each local trajectory of (8.136) then converges eventually monotonically to some point on the equilibrium manifold.

In contrast to their individual properties, the interaction of a gradient flux function $F$ with a stiff, but stable, source term $G$ in (8.131) can provide Takens–Bogdanov points with all the structure described in the preceding sections. In particular any heteroclinic orbit that we found near the bifurcation point corresponds to a small-amplitude travelling wave of the hyperbolic balance law.

To construct our example of (8.133) with Takens–Bogdanov bifurcation at the origin, we absorb the wave speed into the flux by setting $s = 0$. We have to require $G(0) = 0$. The linearization at the origin will satisfy

$$\begin{pmatrix} 0 & 0 & 0 & 0 \\ 1 & 0 & 0 & 0 \\ 0 & 1 & 0 & 0 \\ 0 & 0 & 0 & 0 \end{pmatrix} = (F'(U))^{-1} G'(U)\big|_{U=0} = (F'(0))^{-1} G'(0). \tag{8.137}$$

Moreover, $F'(0)$ must be invertible and, because $F = \nabla_U \Phi$, symmetric. This can easily be achieved, for example by the choices

$$F'(0) = \begin{pmatrix} 0 & \gamma_1 & 0 & 0 \\ \gamma_1 & 0 & \gamma_2 & 0 \\ 0 & \gamma_2 & \gamma_3 & 0 \\ 0 & 0 & 0 & \gamma_4 \end{pmatrix}, \qquad G'(0) = \begin{pmatrix} \gamma_1 & 0 & 0 & 0 \\ 0 & \gamma_2 & 0 & 0 \\ \gamma_2 & \gamma_3 & 0 & 0 \\ 0 & 0 & 0 & 0 \end{pmatrix} \tag{8.138}$$

with $\gamma_1, \gamma_2 < 0$ and $\gamma_3, \gamma_4 \ne 0$. The following example provides an expansion at the origin which directly coincides with the normal form (8.36), by the identification

$$\begin{aligned}
u &= (u_1, u_2, u_3, u_4) = (x_1, x_2, y_1, y_2) \\
F(u) &= \nabla_u \Phi(u) \\
\Phi(u) &= \gamma_1 u_1 u_2 + \gamma_2 u_2 u_3 + \tfrac12 \gamma_3 u_3^2 + \tfrac12 \gamma_4 u_4^2 \\
G(u) &= F'(u) \cdot (\text{normal form}) = \begin{pmatrix} \gamma_1 u_1 \\ \gamma_2 u_2 + \gamma_1(-u_3 + u_4)u_1 - \gamma_1 u_3 u_2 + \gamma_1 b u_2^2 \\ \gamma_2 u_1 + \gamma_3 u_2 \\ 0 \end{pmatrix}.
\end{aligned} \tag{8.139}$$

Although the flux $F$ is linear in this example, (8.139) meets all requirements for Takens–Bogdanov bifurcation discussed before. Similar examples with genuinely nonlinear flux functions $F$ can be constructed by choosing a genuinely nonlinear $F$ with linearization (8.138) at the origin and setting $G = F' \cdot (\text{normal form})$, as above.

With the parameter choices

$$\gamma_1 = -\sqrt{6}, \qquad \gamma_2 = -5, \qquad \gamma_3 = 6, \qquad \gamma_4 = -1 \tag{8.140}$$

for example, $G'(0)$ possesses eigenvalues $-5, -\sqrt{6}, 0, 0$ and $F'(0)$ possesses eigenvalues $-4, -1, +1, +9$, such that $G$ is stabilizing the origin and $F$ is strictly hyperbolic. Note that this structure will persist under small changes of the parameters as well as under small changes of the wave speed $s$.

In summary, Takens–Bogdanov bifurcations are possible in stiff hyperbolic balance laws of the form (8.131). This holds true for a genuinely nonlinear flux $F$, non-vanishing viscosity $\delta > 0$, and under small perturbations of the system.

We conclude by highlighting some of those properties of the shock solutions generated by our Takens–Bogdanov example which contradict conventional wisdom for small-amplitude shocks of systems of nonlinear, strictly hyperbolic balance laws. For hyperbolic conservation laws one usually expects viscous shock profiles to be monotonic. In particular, in numerical simulations small oscillations near the shock layer are regarded as numerical artefacts due to grid phenomena or unstable numerical schemes. In many schemes 'numerical viscosity' is used to automatically suppress such oscillations as 'spurious'. Near Takens–Bogdanov points, in contrast, all heteroclinic orbits with asymptotic states near the Hopf line correspond to travelling waves with necessarily oscillatory tails. Near elliptic Hopf points, see case (C) of figure 8.5, both tails are oscillatory.
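Returning to the parameter choice (8.140), the linear-algebra claims around (8.137) and (8.138) can be checked numerically. The following is a minimal sketch in plain Python and not part of the original derivation: the helper names `matmul`, `det` and `char_poly_at` are hypothetical, and the matrices are transcribed from (8.138) under the assumption that they are read as printed there. The sketch verifies that $F'(0)A = G'(0)$ for the nilpotent matrix $A$ on the left of (8.137), and that the characteristic polynomials of $F'(0)$ and $G'(0)$ vanish at the stated eigenvalues.

```python
import math

# Sample parameters from (8.140).
g1, g2, g3, g4 = -math.sqrt(6.0), -5.0, 6.0, -1.0

# Linearized flux F'(0) and source G'(0), transcribed from (8.138).
F0 = [[0.0, g1, 0.0, 0.0],
      [g1, 0.0, g2, 0.0],
      [0.0, g2, g3, 0.0],
      [0.0, 0.0, 0.0, g4]]
G0 = [[g1, 0.0, 0.0, 0.0],
      [0.0, g2, 0.0, 0.0],
      [g2, g3, 0.0, 0.0],
      [0.0, 0.0, 0.0, 0.0]]
# Nilpotent matrix on the left-hand side of (8.137).
A = [[0.0, 0.0, 0.0, 0.0],
     [1.0, 0.0, 0.0, 0.0],
     [0.0, 1.0, 0.0, 0.0],
     [0.0, 0.0, 0.0, 0.0]]


def matmul(X, Y):
    """Product of two square matrices given as nested lists."""
    n = len(X)
    return [[sum(X[i][k] * Y[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]


def det(M):
    """Determinant by Laplace expansion along the first row (fine for 4x4)."""
    if len(M) == 1:
        return M[0][0]
    total = 0.0
    for j, entry in enumerate(M[0]):
        minor = [row[:j] + row[j + 1:] for row in M[1:]]
        total += (-1) ** j * entry * det(minor)
    return total


def char_poly_at(M, lam):
    """Evaluate det(M - lam * I)."""
    n = len(M)
    return det([[M[i][j] - (lam if i == j else 0.0) for j in range(n)]
                for i in range(n)])


flux_eigs = (-4.0, -1.0, 1.0, 9.0)   # claimed spectrum of F'(0)
source_eigs = (-5.0, g1, 0.0)        # claimed spectrum of G'(0); 0 is double
flux_residuals = [char_poly_at(F0, lam) for lam in flux_eigs]
source_residuals = [char_poly_at(G0, lam) for lam in source_eigs]

if __name__ == "__main__":
    print("F'(0) A == G'(0):", matmul(F0, A) == G0)
    print("flux residuals:  ", flux_residuals)
    print("source residuals:", source_residuals)
```

All residuals should vanish up to rounding error. Four distinct real eigenvalues of $F'(0)$ are what strict hyperbolicity at the origin requires, and the two negative eigenvalues of $G'(0)$ reflect the stabilizing kinetics.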
In all cases (A)–(C), any heteroclinic connection between the left and right side of the curve $y^2 - 2(\lambda + 2)y + \lambda^2 = 0$ of vanishing discriminant, in figure 8.5, gives rise to a travelling wave solution of (8.132) with only one oscillatory tail. These oscillations are generated by the complex eigenvalues of the linearization (8.33) near the Hopf line. In figure 8.13 a travelling wave with oscillatory tails in the limit $\varepsilon \searrow 0$ of the stiff source is sketched. The oscillations near the shock layer resemble the Gibbs phenomenon, but in our case they are an intrinsic property of the solution. Numerical schemes should therefore resolve this 'overshoot' rather than suppress it.

Figure 8.13. Oscillatory profile in the singular limit $\varepsilon \searrow 0$ plotted on two different scales. (Axes: $u$ against $\xi$; asymptotic states $u^-$, $u^+$.)

A second paradigm of strictly hyperbolic conservation laws is the Lax admissibility criterion. An admissible shock must have a speed $s$ such that exactly one characteristic family $i$ is absorbed in the shock. In terms of the eigenvalues $\alpha_i(u)$ of $DF(u)$ the Lax criterion reads

$$\begin{aligned}
\alpha_1 < \cdots < \alpha_{i-1}(u^-) < s < \alpha_i(u^-) < \cdots < \alpha_n \\
\alpha_1 < \cdots < \alpha_i(u^+) < s < \alpha_{i+1}(u^+) < \cdots < \alpha_n.
\end{aligned} \tag{8.141}$$
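The inequalities (8.141) are easy to test mechanically. The following is a minimal sketch, assuming the characteristic speeds at the two asymptotic states are already known as plain lists of numbers; the function name `lax_family` and its arguments are hypothetical and only serve this illustration.

```python
def lax_family(alphas_left, alphas_right, s):
    """Return the 1-based index i of a characteristic family satisfying the
    Lax inequalities (8.141), or None if no family does.

    alphas_left, alphas_right: eigenvalues of DF at u^- and u^+.
    s: shock speed.
    """
    n = len(alphas_left)
    left = sorted(alphas_left)
    right = sorted(alphas_right)
    for i in range(1, n + 1):
        # alpha_{i-1}(u^-) < s < alpha_i(u^-); no lower bound when i = 1
        lower_left = left[i - 2] if i >= 2 else float("-inf")
        ok_left = lower_left < s < left[i - 1]
        # alpha_i(u^+) < s < alpha_{i+1}(u^+); no upper bound when i = n
        upper_right = right[i] if i < n else float("inf")
        ok_right = right[i - 1] < s < upper_right
        if ok_left and ok_right:
            return i
    return None


if __name__ == "__main__":
    # The second family crosses the shock speed from above to below.
    print(lax_family([-1.0, 2.0], [-1.0, 0.5], 1.0))
    # Same ordering of speeds relative to s on both sides.
    print(lax_family([-4.0, -1.0, 1.0, 9.0], [-4.0, -1.0, 1.0, 9.0], 0.0))
```

In the second call the ordering of $s$ relative to all characteristic speeds is identical at both states, so no family is absorbed and the function returns `None`.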

For weak shocks of hyperbolic conservation laws the Lax criterion is also a structural stability criterion for the heteroclinic connection of the corresponding travelling wave equation. See [42] for more background on shock waves of hyperbolic conservation laws. In our situation, in contrast, the shock speed is determined by the interaction of flux $F$ and source $\varepsilon^{-1} G$. Specifically, the order of the eigenvalues $\alpha_1, \ldots, \alpha_n$ of the flux $F$ and the wave speed $s$ turns out to be the same at both asymptotic states of all small-amplitude heteroclinic orbits near the Takens–Bogdanov bifurcation. Takens–Bogdanov points can be constructed for arbitrary relations of characteristic speeds $\alpha_i$ and wave speed $s$.

In [37] a PDE stability analysis for oscillatory connections near elliptic Hopf points (along lines of equilibria) was carried out for systems of the form (8.131). It turned out that oscillatory waves of extreme speed, i.e. of speeds faster or slower than all characteristic speeds, are convectively stable: they are linearly stable in an appropriate exponentially weighted space. Waves of intermediate speeds, in contrast, cannot be stabilized by any exponential weight. Near Takens–Bogdanov points, however, the PDE stability analysis of saddle–saddle and of saddle–centre shock profiles remains open.

Appendix. Derivation of the normal form

In this appendix we sketch a derivation of the normal form (8.51). Our derivation is semi-elementary; we use the scalar product from [16] as presented in [47]. More sophisticated results on normal forms for nilpotent linear parts, based on $SL(2, \mathbb{R})$-representations, are available in [14, 38, 39]. These methods have not yet been adapted to the constraints imposed by equilibrium planes and will not be required for our specific analysis.

We derive the normal form (8.51) in three consecutive steps. Based on the crucial observation

$$(\operatorname{ad} A)^T = \operatorname{ad}(A^T) \tag{8.142}$$

which holds for the scalar product on polynomials, introduced in [16], we first derive the normal form

$$\begin{aligned}
\dot x_1 &= h_1 x_1 + h_2 x_2 + h_3 x_2^2 \\
\dot x_2 &= x_1 + h_1 x_2 + h_{2,0}\, y_1 + 2h_3 x_2 y_1 \\
\dot y_1 &= x_2 + h_{1,0}\, y_1 + 2h_{3,0}\, y_1^2 \\
\dot y_2 &= h_4 \pi
\end{aligned} \tag{8.143}$$

see (8.154)–(8.175). Here $h_j = h_j(\pi, y)$, where $\pi := x_2^2 - 2x_1 y_1$, and $h_{j,0}(\pi, y) := h_j(\pi, y) - h_j(0, y)$. Moreover, $h_1(0, 0) = h_2(0, 0) = 0$ to avoid additional linear terms. In proposition 8.5 below we show that $\pi, y$ are generating the ring of polynomials $\psi(z)$ in $z = (x, y)$ which are invariant under $\exp(A^T t)$. The nonlinear terms on the right-hand side of (8.143) are now complementary to the range of $\operatorname{ad}(A)$.

In a second step, we add a suitable element of range $\operatorname{ad}(A)$ to convert the normal form (8.143) to the normal form

$$\begin{aligned}
\dot x_1 &= h_1 x_1 + h_2 x_2 + h_3 x_2^2 \\
\dot x_2 &= x_1 \\
\dot y_1 &= x_2 \\
\dot y_2 &= h_4 \pi
\end{aligned} \tag{8.144}$$

see (8.176)–(8.185). Note how the third-order structure in $y_1$ appears, which was so crucial to our analysis. Again $h_j = h_j(\pi, y)$ and $h_1(0, 0) = h_2(0, 0) = 0$.

In a final step, we slightly massage the term $h_4$ to obtain the form (8.51), which is more suitable for the linear substitution (8.53), on the level of second-order terms; see (8.186)–(8.190).

As a prerequisite for our derivation of the normal form (8.143), related to $\ker \operatorname{ad} A^T$, we collect some elementary facts. Define the linear differential operators $D$, $D_*$ by

$$\begin{aligned}
D &:= x_2 \partial_{y_1} + x_1 \partial_{x_2} \\
D_* &:= x_2 \partial_{x_1} + y_1 \partial_{x_2}.
\end{aligned} \tag{8.145}$$

Then we can rewrite

$$(\operatorname{ad} A)g = \begin{pmatrix} 0 \\ g_1 \\ g_2 \\ 0 \end{pmatrix} - Dg, \qquad (\operatorname{ad} A^T)H = \begin{pmatrix} H_2 \\ H_3 \\ 0 \\ 0 \end{pmatrix} - D_* H \tag{8.146}$$

with $g = (g_1, g_2, g_3, g_4)$, $H = (H_1, H_2, H_3, H_4)$. Note that $y_2$ does not appear in $D$, $D_*$ either as coefficient or as differential. Since we are interested in kernels and ranges of $\operatorname{ad} A$, $\operatorname{ad} A^T$ in certain subspaces of polynomials, we can therefore suppress the invariant $y_2$, notationally, and derive the normal form for the part $z = (x_1, x_2, y_1)$ only, which exhibits the more interesting nilpotency of $A$ of order three.

Proposition 8.4.
(i) $Dy_1 = x_2$, $Dx_2 = x_1$, $Dx_1 = 0$,
(ii) $D_* x_1 = x_2$, $D_* x_2 = y_1$, $D_* y_1 = 0$,
(iii) $D\pi = D_*\pi = 0$ for $\pi = x_2^2 - 2x_1 y_1$.

Proof. The proof is trivial, but benefits from the observation that an interchange of $x_1$ and $y_1$ converts $D$, $D_*$ into each other.

Proposition 8.5. Let $\psi = \psi(x_1, x_2, y_1)$ be a polynomial, $\pi = x_2^2 - 2x_1 y_1$.
(i) Euclidean algorithm: there exist polynomials $\tilde\psi(x_1, x_2, y_1)$, $r_0(x_1, y_1)$, $r_1(x_1, y_1)$ such that

$$\psi = \tilde\psi \pi + r_1 x_2 + r_0. \tag{8.147}$$

(ii) Let $\psi$ be invariant under $\exp(A^T t)$, that is $\psi(\exp(A^T t)z) = \psi(z)$ for all $z$ or, equivalently, $D_*\psi(z) = 0$. Then

$$\psi(z) = h(\pi, y_1) \tag{8.148}$$

for some polynomial $h$. In particular, $y_1$ and $\pi$ generate the ring of $\exp(A^T t)$-invariants ($y_2$ is suppressed).

Proof. Claim (i) follows from the Euclidean algorithm with respect to $x_2$. Note that $r_0$, $r_1$, $\tilde\psi$ are indeed polynomials in $(x_1, x_2, y_1)$, rather than rational functions, because $x_2^2$ appears with coefficient 1 in $\pi$.

To prove claim (ii), let $z(t) = (\tfrac12 t^2 y_1, t y_1, y_1)$ be the $\exp(A^T t)$-orbit of $z(0) = (0, 0, y_1)$. Then

$$\frac{d}{dt}\psi(z(t)) = D_*\psi(z(t)) = 0 \tag{8.149}$$

and, hence, $\psi_0(z) := \psi(z) - \psi(0, 0, y_1)$ satisfies

$$\psi_0(z(t)) \equiv 0 \tag{8.150}$$

for all $t$. We now apply the Euclidean algorithm (i) to $\psi_0$ and obtain a polynomial identity

$$\psi_0(z) = \tilde\psi_0 \pi + r_1 x_2 + r_0. \tag{8.151}$$

Both $\psi_0$ and $\pi$ vanish for all $z(t)$. Therefore,

$$r_1(\tfrac12 t^2 y_1, y_1) \cdot t y_1 + r_0(\tfrac12 t^2 y_1, y_1) = 0 \tag{8.152}$$

for all $t$, $y_1$. Writing (8.152) as a polynomial in $t$ with coefficients which are polynomials in $y_1$, we see that coefficients vanish for even and odd orders in $t$, alike. Hence, $r_0 = r_1 = 0$, and (8.151) implies

$$\psi(z) = \psi(0, 0, y_1) + \pi \tilde\psi_0(z). \tag{8.153}$$

Repeating this process, with $\tilde\psi_0$ replacing $\psi$, proves (8.148) and claim (ii).
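The identities of proposition 8.4, on which the argument above relies, can be checked mechanically. The following is a minimal sketch, assuming a polynomial in $(x_1, x_2, y_1)$ is stored as a dictionary mapping exponent triples `(a, b, c)` to the coefficient of $x_1^a x_2^b y_1^c$; the names `D`, `D_star` and this representation are hypothetical choices for the illustration, not notation from the text.

```python
from collections import defaultdict


def D(p):
    """Apply D = x2 d/dy1 + x1 d/dx2 to a polynomial {(a, b, c): coeff}."""
    out = defaultdict(int)
    for (a, b, c), coeff in p.items():
        if c > 0:
            out[(a, b + 1, c - 1)] += coeff * c   # x2 * d/dy1
        if b > 0:
            out[(a + 1, b - 1, c)] += coeff * b   # x1 * d/dx2
    return {e: v for e, v in out.items() if v != 0}


def D_star(p):
    """Apply D_* = x2 d/dx1 + y1 d/dx2 to a polynomial {(a, b, c): coeff}."""
    out = defaultdict(int)
    for (a, b, c), coeff in p.items():
        if a > 0:
            out[(a - 1, b + 1, c)] += coeff * a   # x2 * d/dx1
        if b > 0:
            out[(a, b - 1, c + 1)] += coeff * b   # y1 * d/dx2
    return {e: v for e, v in out.items() if v != 0}


# Coordinate monomials and the invariant pi = x2^2 - 2 x1 y1.
x1 = {(1, 0, 0): 1}
x2 = {(0, 1, 0): 1}
y1 = {(0, 0, 1): 1}
pi = {(0, 2, 0): 1, (1, 0, 1): -2}
zero = {}

if __name__ == "__main__":
    print("(i)  ", D(y1) == x2, D(x2) == x1, D(x1) == zero)
    print("(ii) ", D_star(x1) == x2, D_star(x2) == y1, D_star(y1) == zero)
    print("(iii)", D(pi) == zero, D_star(pi) == zero)
```

The empty dictionary plays the role of the zero polynomial, so the invariance $D\pi = D_*\pi = 0$ shows up as both results being empty.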

To derive normal forms we introduce the following spaces of vector polynomials in $z = (x, y)$:

$$\begin{aligned}
W &= \{ f(z) \mid f(0, y) = 0 \text{ for all } y \} \\
V &= \{ g(z) \mid g^x(0, y) = 0 \text{ for all } y \} \\
V_c &= \{ c(y) \mid c^y \equiv 0 \}.
\end{aligned} \tag{8.154}$$

Clearly $V \oplus V_c$ span all polynomials. By assumption, our original vector field $f(z) \in W$ fixes the equilibrium plane $\{x = 0\}$. Our normal form transformation $g(z)$ is taken from $V$; see (8.50). Observe that

$$\operatorname{ad} A : V \longrightarrow W \subseteq V \tag{8.155}$$

either by direct calculation, or by contemplating that our normal form transformations leave the equilibrium plane invariant. Normal forms are therefore given by a complement to $\operatorname{range}(\operatorname{ad} A)|_V$ within $W$. With respect to the scalar product of [16] we have

$$(\operatorname{range} \operatorname{ad} A)^\perp = \ker \operatorname{ad} A^T \tag{8.156}$$

in the total space $V \oplus V_c$. In our restricted situation, an orthogonal complement to $\operatorname{range}(\operatorname{ad} A)|_V$ in $W$ is therefore given by

$$H \in ((\operatorname{ad} A^T)|_W)^{-1}(V_c). \tag{8.157}$$

In coordinates $H = (H_1, \ldots, H_4)$ this means that

$$(\operatorname{ad} A^T)H = \begin{pmatrix} H_2 - D_* H_1 \\ H_3 - D_* H_2 \\ -D_* H_3 \\ -D_* H_4 \end{pmatrix} = \begin{pmatrix} c_1(y) \\ c_2(y) \\ 0 \\ 0 \end{pmatrix} \tag{8.158}$$

for $H \in W$ and suitable polynomials $c_1$, $c_2$. For example, $D_* H_4 = 0$ and proposition 8.5(ii), applied to $H_4$ instead of $\psi$, immediately implies the normal form

$$\dot y_2 = H_4(z) = h(\pi, y) = h_4(\pi, y)\pi \tag{8.159}$$

as was claimed in (8.144). Indeed, $H \in W$ implies $h(0, y) = H_4(0, y) = 0$ for all $y$.

With such encouragement around, we now return to suppressing $y_2$ entirely. The first three components of (8.158) imply

$$D_*^3 H_1 = 0 \tag{8.160}$$

because $D_* y_1 = 0$ and, henceforth suppressed, $D_* y_2 = 0$. Conversely, any solution of (8.160) generates a unique solution of (8.158) by

$$\begin{aligned}
H_2 &:= D_* H_1 + c_1(y_1) \\
H_3 &:= D_* H_2 + c_2(y_1).
\end{aligned} \tag{8.161}$$

Indeed, $c_1(y_1)$ and $c_2(y_1)$ are determined by $H \in W$ to be given by $-D_* H_1$, $-D_* H_2$, respectively, evaluated at $z = (0, 0, y_1)$.

Lemma 8.6. Let $\psi = \psi(x_1, x_2, y_1)$ be any polynomial such that $D_*^3 \psi = 0$. Then

$$\psi = \tilde h_0 + \tilde h_1 x_1 + \tilde h_2 x_2 \tag{8.162}$$

for some $\exp(A^T t)$-invariant polynomials $\tilde h_j = \tilde h_j(\pi, y_1)$, $\pi = x_2^2 - 2x_1 y_1$. Conversely, any polynomial (8.162) satisfies $D_*^3 \psi = 0$.

Proof. Since $D_* y_1 = D_* \pi = 0$, by proposition 8.4, the converse part follows trivially from

$$D_*^3 x_1 = D_*^2 x_2 = D_* y_1 = 0. \tag{8.163}$$

Now suppose $D_*^3 \psi = 0$. With $z(t) := (\tfrac12 t^2 y_1, t y_1, y_1)$ as before, and the abbreviation $\psi(t) = \psi(z(t))$, we obtain

$$\dddot\psi(t) = D_*^3 \psi(z(t)) = 0 \tag{8.164}$$

and, therefore,

$$\psi(t) = \psi(0) + \dot\psi(0)\,t + \tfrac12 \ddot\psi(0)\,t^2. \tag{8.165}$$

Obviously, the time-derivatives at $t = 0$ satisfy

$$\begin{aligned}
\psi(0) &= \psi \\
\dot\psi(0) &= D_* \psi = y_1 \partial_{x_2} \psi \\
\ddot\psi(0) &= D_*^2 \psi = y_1 \partial_{x_1} \psi + y_1^2 \partial_{x_2}^2 \psi
\end{aligned} \tag{8.166}$$

with right-hand sides evaluated at $(0, 0, y_1)$. Inserting into (8.165) implies

$$\psi(t) = a_0 + a_2 y_1 t + a_1 \tfrac12 y_1 t^2 = a_0 + a_1 x_1(t) + a_2 x_2(t) \tag{8.167}$$

for suitable polynomials $a_j = a_j(y_1)$. Now consider

$$\psi_0(z) := \psi(z) - a_0 - a_1 x_1 - a_2 x_2 \tag{8.168}$$

and apply the Euclidean algorithm of proposition 8.5, as in (8.150)–(8.153) above. Indeed

$$\psi_0(z(t)) = 0 \tag{8.169}$$

by construction, and

$$\psi_0 = \tilde\psi_0 \pi + r_1 x_2 + r_0 \tag{8.170}$$

imply $r_0 = r_1 = 0$, as before. This implies

$$\psi(z) = a_0 + a_1 x_1 + a_2 x_2 + \pi \tilde\psi_0(z). \tag{8.171}$$

Moreover, and most importantly,

$$\pi D_*^3 \tilde\psi_0(z) = D_*^3 \psi(z) = 0 \tag{8.172}$$

for all $z$, and, hence, $D_*^3 \tilde\psi_0(z) = 0$ because $a_j = a_j(y_1)$. This enables us to repeat the process, with $\tilde\psi_0$ replacing $\psi$. This proves the form (8.162) of $\psi$ and the lemma.

We are now ready to prove the $(\dot x_1, \dot x_2, \dot y_1)$-part of the normal form (8.143). Since $D_*^3 H_1 = 0$, by (8.160), and since $H \in W$, lemma 8.6 applied to $\psi := H_1$ implies

$$\begin{aligned}
\dot x_1 = H_1 &= \tilde h_0 \pi + \tilde h_1 x_1 + \tilde h_2 x_2 \\
&= (\tilde h_1 - 2\tilde h_0 y_1)x_1 + \tilde h_2 x_2 + \tilde h_0 x_2^2 \\
&= h_1 x_1 + h_2 x_2 + h_3 x_2^2
\end{aligned} \tag{8.173}$$

with obvious definitions of $h_j$. Note that indeed $\tilde h_0$ can be replaced by $\tilde h_0 \pi$, in (8.162), because $H \in W$ vanishes whenever $x = 0$. Moreover, $h_1(0, 0) = h_2(0, 0) = 0$ to ensure higher order of $H_1$.

With the abbreviation $h_{j,0}(\pi, y_1) := h_j(\pi, y_1) - h_j(0, y_1)$, (8.161), (8.173), proposition 8.4 and $H \in W$ together imply

$$\begin{aligned}
H_2 &= D_* H_1 + c_1(y_1) \\
&= h_1 x_2 + 2h_3 x_2 y_1 + h_2 y_1 + c_1(y_1) \\
&= (h_1 + 2h_3 y_1)x_2 + h_{2,0}\, y_1.
\end{aligned} \tag{8.174}$$

Note that in fact $h_{2,0} = \pi \cdot \hat h_{2,0}$, for some polynomial $\hat h_{2,0}$, because we have $h_{2,0}(0, y_1) = 0$ for all $y_1$. Similarly

$$H_3 = D_* H_2 + c_2(y_1) = h_{1,0}\, y_1 + 2h_{3,0}\, y_1^2. \tag{8.175}$$

Re-inserting $y_2$ in all $h_j$, the normal form (8.143) is proved.

To prove normal form (8.144), we add a suitable element $(\operatorname{ad} A)g$, $g \in V$, to $H$ to annihilate the terms $H_2$ and $H_3$. Specifically,

$$(\operatorname{ad} A)g + H = \begin{pmatrix} -Dg_1 + H_1 \\ g_1 - Dg_2 + H_2 \\ g_2 - Dg_3 + H_3 \\ -Dg_4 + H_4 \end{pmatrix}. \tag{8.176}$$

With the choices $g_3 = g_4 = 0$, we obtain the conditions

$$\begin{aligned}
g_2 &:= -H_3 \\
g_1 &:= -H_2 + Dg_2 = -H_2 - DH_3.
\end{aligned} \tag{8.177}$$

The new normal form then reads

$$\begin{aligned}
\dot x_1 &= H_1 - Dg_1 \\
\dot x_2 &= x_1 \\
\dot y_1 &= x_2 \\
\dot y_2 &= H_4.
\end{aligned} \tag{8.178}$$

To study the transition from (8.143) to (8.144), alias (8.178), we introduce the spaces

$$\begin{aligned}
\mathcal{H}_1 &:= \{H_1 \mid H_1(0, y_1) = 0\} \cap \ker D_*^3 \\
\mathcal{H}_* &:= ((\operatorname{ad} A^T)|_W)^{-1}(V_c)
\end{aligned} \tag{8.179}$$

see (8.160), (8.157). We conveniently restrict these spaces to homogeneous polynomials of fixed degree $N > 2$ in the variables $z = (x_1, x_2, y_1)$, with $y_2$ suppressed. Note the linear lifting isomorphism

$$\Lambda : \mathcal{H}_1 \longrightarrow \mathcal{H}_*, \qquad \Lambda(H_1) := (H_1, H_2, H_3) \tag{8.180}$$

with $H_2$, $H_3$ given by the construction (8.161) of the normal form (8.143) from $H_1$. We claim, and show below, that the map $\Gamma(H_1, H_2, H_3) := H_1 - Dg_1$, with $g_1$ given by the normal form transformation (8.177) above, in fact defines another linear isomorphism

$$\Gamma : \mathcal{H}_* \longrightarrow \mathcal{H}_1. \tag{8.181}$$

This proves normal form (8.144), because indeed (8.178) then takes the form

$$\begin{aligned}
\dot x_1 &= h_1 x_1 + h_2 x_2 + h_3 x_2^2 \\
\dot x_2 &= x_1 \\
\dot y_1 &= x_2 \\
\dot y_2 &= h_4 \pi
\end{aligned} \tag{8.182}$$

and all invariants $h_j = h_j(\pi, y)$ are admissible.

It now remains to prove that $\Gamma$ indeed maps $\mathcal{H}_*$ to $\mathcal{H}_1$ and that $\Gamma$ possesses trivial kernel. The latter fact is obvious, because the normal form transformation is given by

$$H \mapsto H + (\operatorname{ad} A)g \tag{8.183}$$

with $(\operatorname{ad} A)g$ orthogonal to $H \in \mathcal{H}_*$, and $\Gamma$ only omits components of the right-hand side of (8.183) which are already zero by construction of $g$. To prove $\Gamma(H) = H_1 - Dg_1 \in \mathcal{H}_1$, we use (8.177) to explicitly compute the correction

$$Dg_1 = -DH_2 - D^2 H_3. \tag{8.184}$$

In view of definition (8.179) of $\mathcal{H}_1$ and the characterization of $H_1$ by lemma 8.6 and (8.173), it is sufficient to show that both $DH_2$ and $D^2 H_3$ are sums of monomials of the form

$$\pi^j y_1^k x_1, \qquad \pi^j y_1^k x_2, \qquad \pi^j y_1^{k-1} x_2^2. \tag{8.185}$$

Since $D\pi = 0$, by proposition 8.4, we may as well suppress $\pi$ entirely, treating $\pi$ as a coefficient in this computation.

To treat the term $DH_2$ in (8.184) we observe that $H_2$ itself contains only monomials $y_1^{k+1}$, $y_1^k x_2$; see (8.174). Applying $D = x_2 \partial_{y_1} + x_1 \partial_{x_2}$ only produces terms (8.185) from these.

To treat the term $D^2 H_3$ in (8.184) we observe that $H_3$ itself contains only monomials $y_1^{k+1}$; see (8.175). Applying $D$ once only produces terms $y_1^k x_2$. The second application of $D$ generates terms $y_1^k x_1$, $y_1^{k-1} x_2^2$ which are already in the list (8.185). These simple observations complete our proof of normal form (8.144).

As a final step, we slightly massage the term

$$h_4 \pi = \pi h_4(\pi, y_1) \tag{8.186}$$

in (8.144) to obtain the form

$$\hat h_4 x_1 y_1 = x_1 y_1 \hat h_4(x_1 y_1, y_1) \tag{8.187}$$

which the $\dot y_2$-term takes in (8.51). As in (8.146), (8.183), a massage is defined as addition of $-Dg_4$ such that

$$h_4 \pi - Dg_4 = \hat h_4 x_1 y_1. \tag{8.188}$$

Clearly, the spaces of polynomials of fixed degree $N$ in $z = (x_1, x_2, y_1)$ which take the forms (8.186), (8.187), respectively, are of equal dimension. Our construction of $g$ will depend linearly on $h_4$ and, by orthogonality, will define a linear isomorphism between spaces of equal dimension.

We construct $g$ for each monomial $x_1^j x_2^{2k} y_1^\ell$ separately. First, note that $D = x_2 \partial_{y_1} + x_1 \partial_{x_2}$ implies

$$x_1^j x_2^{2k} y_1^\ell - \frac{1}{\ell + 1} D\big(x_1^j x_2^{2k-1} y_1^{\ell+1}\big) = -\frac{2k - 1}{\ell + 1}\, x_1^{j+1} x_2^{2k-2} y_1^{\ell+1} \tag{8.189}$$

for $j, \ell \ge 0$ and $k \ge 1$. Subtraction of a term $Dg$ has, thus, reduced the $x_2$-exponent by two. Iterating this procedure $k$ times, and starting at $x_1$-exponent $j = 0$, we obtain $g_4$ such that

$$x_2^{2k} y_1^\ell - Dg_4 = c_0 (x_1 y_1)^k y_1^\ell. \tag{8.190}$$

Since $x_2$ occurs only in even powers in $\pi = x_2^2 - 2x_1 y_1$ and in $h_4 \pi$, this construction of $g_4$, which depends linearly on $h_4$, allows us to eliminate all (even) powers $x_2^{2k}$ and replace them by terms $(x_1 y_1)^k$, as was required in (8.188). This finally proves our normal form (8.51), to arbitrary finite order.
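The iteration leading to (8.190) can be carried out explicitly. The following is a minimal sketch, assuming exact rational arithmetic and the same dictionary representation `{(a, b, c): coeff}` for $x_1^a x_2^b y_1^c$ as a working convention; the names `D`, `subtract` and `massage` are hypothetical. Each step uses $D(x_1^j x_2^{2k-1} y_1^{\ell+1}) = (\ell+1)\, x_1^j x_2^{2k} y_1^\ell + (2k-1)\, x_1^{j+1} x_2^{2k-2} y_1^{\ell+1}$, which follows directly from $D = x_2 \partial_{y_1} + x_1 \partial_{x_2}$.

```python
from collections import defaultdict
from fractions import Fraction


def D(p):
    """Apply D = x2 d/dy1 + x1 d/dx2 to a polynomial {(a, b, c): coeff}."""
    out = defaultdict(Fraction)
    for (a, b, c), coeff in p.items():
        if c > 0:
            out[(a, b + 1, c - 1)] += coeff * c   # x2 * d/dy1
        if b > 0:
            out[(a + 1, b - 1, c)] += coeff * b   # x1 * d/dx2
    return {e: v for e, v in out.items() if v != 0}


def subtract(p, q):
    """Return p - q, dropping zero coefficients."""
    out = defaultdict(Fraction, p)
    for e, v in q.items():
        out[e] -= v
    return {e: v for e, v in out.items() if v != 0}


def massage(k, l):
    """Return (g4, c0) with x2^(2k) y1^l - D(g4) = c0 * (x1 y1)^k * y1^l."""
    g4 = {}
    coeff, j, b, c = Fraction(1), 0, 2 * k, l
    while b > 0:
        # Remove coeff * x1^j x2^b y1^c via D of x1^j x2^(b-1) y1^(c+1) ...
        g4[(j, b - 1, c + 1)] = coeff / (c + 1)
        # ... which leaves a multiple of x1^(j+1) x2^(b-2) y1^(c+1).
        coeff *= Fraction(-(b - 1), c + 1)
        j, b, c = j + 1, b - 2, c + 1
    return g4, coeff


if __name__ == "__main__":
    g4, c0 = massage(2, 1)
    start = {(0, 4, 1): Fraction(1)}
    print("g4 =", g4)
    print("c0 =", c0)
    print("check:", subtract(start, D(g4)) == {(2, 0, 3): c0})
```

Since every factor $-(2k-1)/(\ell+1)$ picked up along the way is nonzero, the resulting $c_0$ never vanishes, which is what makes the replacement of $x_2^{2k}$ by $(x_1 y_1)^k$ invertible monomial by monomial.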

Acknowledgements

We are indebted to Abderrahim Azouani, Henk Broer, Sebius Doedel, Vassili Gelfreich, Ale Jan Homburg, Oliver Junge, Arnd Scheel, and Dmitry Turaev for valuable comments during the preparation of this paper. All expert typesetting was accomplished by Regina Löhr. And without guidance by the pioneering papers of Floris Takens, written more than 25 years ago, this chapter could not have been written.

References

[1] Alexander J C and Auchmuty G 1986 Global bifurcation of phase-locked oscillators Arch. Rational Mech. Anal. 93 253–70
[2] Alexander J and Fiedler B 1989 Global decoupling of coupled symmetric oscillators Differential Equations (Lecture Notes in Mathematics 118) ed C Dafermos, G Ladas and G Papanicolaou (New York: Dekker)
[3] Arnol'd V 1972 Lectures on bifurcations and versal systems Russ. Math. Surv. 27 54–123
[4] Arnol'd V 1983 Geometrical Methods in the Theory of Ordinary Differential Equations (Grundl. math. Wiss. 250) (Berlin: Springer)
[5] Aulbach B 1984 Continuous and Discrete Dynamics Near Manifolds of Equilibria (Lecture Notes in Mathematics 1058) (Berlin: Springer)
[6] Bogolyubov N and Mitropol'skij Y 1961 Asymptotic Methods in the Theory of Non-Linear Oscillations (New York: Gordon and Breach)
[7] Bogdanov R 1976 Bifurcation of the limit cycle of a family of plane vector fields Trudy Sem. Im. I. G. Petrovskogo 2 23–36 (in Russian)
[8] Bogdanov R 1976 Versal deformation of a singularity of a vector field on the plane in the case of zero eigenvalues Trudy Sem. Im. I. G. Petrovskogo 2 37–65 (in Russian)
[9] Bogdanov R 1981 Bifurcation of the limit cycle of a family of plane vector fields Sel. Mat. Sov. 1 373–87
[10] Bogdanov R 1981 Versal deformation of a singularity of a vector field on the plane in the case of zero eigenvalues Sel. Mat. Sov. 1 389–421
[11] Broer H W and Roussarie R 2001 Exponential confinement of chaos in the bifurcation set of real analytic diffeomorphisms, in this volume
[12] Broer H W and Takens F 1989 Formally symmetric normal forms and genericity Dynamics Reported vol 2, ed U Kirchgraber and H O Walther (Stuttgart: Teubner–Wiley) pp 39–59
[13] Chow S-N and Hale J 1982 Methods of Bifurcation Theory (New York: Springer)
[14] Cushman R and Sanders J 1986 Nilpotent normal forms and representation theory in sl(2, R) Multiparameter Bifurcation Theory: Proc. AMS–IMS–SIAM Joint Summer Res. Conf.: Arcata, CA, 1985 (Contemp. Math. 56) ed M Golubitsky and J Guckenheimer (Providence, RI: American Mathematical Society) pp 31–51
[15] Dumortier F and Roussarie R 1998 Geometric singular perturbation theory beyond normal hyperbolicity Preprint
[16] Elphick C, Tirapegui E, Brachet M, Coullet P and Iooss G 1987 A simple global characterization for normal forms of singular vector fields Physica D 29 95–127
[17] Farkas M 1984 ZIP bifurcation in a competition model Nonlin. Anal. Theory Meth. Appl. 8 1295–309
[18] Fenichel N 1977 Asymptotic stability with rate conditions II Indiana Univ. Math. J. 26 81–93
[19] Fenichel N 1979 Geometric singular perturbation theory for ordinary differential equations J. Diff. Eqns 31 53–89
[20] Fiedler B and Liebscher S 2000 Generic Hopf bifurcation from lines of equilibria without parameters: II Systems of viscous hyperbolic balance laws SIAM J. Math. Anal. 31(6) 1396–404
[21] Fiedler B, Liebscher S and Alexander J 2000 Generic Hopf bifurcation from lines of equilibria without parameters: I Theory J. Diff. Eqns 167 16–35
[22] Fiedler B, Liebscher S and Alexander J 2000 Generic Hopf bifurcation from lines of equilibria without parameters: III Binary oscillations Int. J. Bif. Chaos 10(7) 1613–22
[23] Fiedler B and Scheurle J 1996 Discretization of Homoclinic Orbits and Invisible Chaos (Memoirs AMS 570) (Providence, RI: American Mathematical Society)
[24] Fiedler B, Sandstede B, Scheel A and Wulff C 1996 Bifurcation from relative equilibria of noncompact group actions: Skew products, meanders, and drifts Doc. Math. 1 479–505
[25] Gelfreich V 1999 A proof of the exponentially small transversality of the separatrices for the standard map Commun. Math. Phys. 201 155–216
[26] Gelfreich V and Lazutkin V 2001 Splitting of separatrices: perturbation theory and exponential smallness Uspekhi to appear
[27] Golubitsky M, Stewart I and Schaeffer D 1985 Singularities and Groups in Bifurcation Theory I (Appl. Math. Sci. 51) (New York: Springer)
[28] Golubitsky M, Stewart I and Schaeffer D 1988 Singularities and Groups in Bifurcation Theory II (Appl. Math. Sci. 69) (New York: Springer)
[29] Guckenheimer J and Holmes P 1982 Nonlinear Oscillations, Dynamical Systems, and Bifurcations of Vector Fields (Appl. Math. Sci. 42) (New York: Springer)
[30] Hale J 1963 Oscillations in Nonlinear Systems (New York: McGraw-Hill)
[31] Hale J and Koçak H 1991 Dynamics and Bifurcations (Texts in Appl. Math. 3) (New York: Springer)
[32] Hirsch M, Pugh C and Shub M 1977 Invariant Manifolds (Lecture Notes in Mathematics 583) (Berlin: Springer)
[33] Hurwitz A and Courant R 1964 Funktionentheorie (Grundl. math. Wiss. 3) 4th edn (Berlin: Springer) (in German)
[34] Kuznetsov Yu 1995 Elements of Applied Bifurcation Theory (Appl. Math. Sci. 112) (Berlin: Springer)
[35] Lang S 1987 Elliptic Functions (Grad. Texts in Mathematics 112) 2nd edn (New York: Springer)
[36] Liebscher S 1997 Stabilität von entkopplungsphänomenen in systemen gekoppelter symmetrischer oszillatoren Diplomarbeit Freie Universität Berlin (in German)
[37] Liebscher S 2000 Stable, oscillatory viscous profiles of weak, non-lax shocks in systems of stiff balance laws PhD Thesis Freie Universität, Berlin
[38] Murdock J 1998 Asymptotic unfoldings of dynamical systems by normalizing beyond the normal form J. Diff. Eqns 143 151–90
[39] Murdock J 2001 On the structure of normal form modules Preprint
[40] Neishtadt A 1984 On the separation of motions in systems with rapidly rotating phase J. Appl. Math. Mech. 48 134–9
[41] Shoshitaishvili A 1975 Bifurcations of topological type of a vector field near a singular point Trudy Sem. Im. I. G. Petrovskogo 1 279–309 (in Russian)
[42] Smoller J 1994 Shock Waves and Reaction–Diffusion Equations (Grundl. math. Wiss. 258) (Berlin: Springer)
[43] Takens F 1974 Forced oscillations and bifurcations Applications of Global Analysis I Commun. Math. Inst. 1973-3 Rijksuniversiteit Utrecht (reprinted in chapter 1 of this volume)
[44] Takens F 1974 Singularities of vector fields Publ. Math. IHES 43 47–100
[45] Tricomi F 1937 Funzioni Ellittiche (Monogr. di Mat. Appl. per Cura d Consiglio Naz d Ricerche 248) (Bologna: Nicola Zanichelli) (in Italian)
[46] Tricomi F 1948 Elliptische Funktionen (Übersetzt und bearbeitet von M Krafft) (Leipzig: Akademische Verlagsanstalt Geest & Portig K-G) (in German)
[47] Vanderbauwhede A 1989 Centre manifolds, normal forms and elementary bifurcations Dynamics Reported vol 2, ed U Kirchgraber and H O Walther (Stuttgart: Teubner–Wiley) pp 89–169

Chapter 9

Global bifurcations of periodic orbits in the forced Van der Pol equation

John Guckenheimer, Cornell University
Kathleen Hoffman, University of Maryland
Warren Weckesser, University of Michigan

The forced Van der Pol equation, written in the form

$$\begin{aligned}
\varepsilon \dot x &= y + x - \frac{x^3}{3} \\
\dot y &= -x + a \sin(2\pi\theta) \\
\dot\theta &= \omega
\end{aligned} \tag{9.1}$$

defines a vector field on $\mathbb{R}^2 \times S^1$. This dynamical system has a long history. It was the system in which Cartwright and Littlewood [2, 3, 10, 11] first noted the existence of chaotic solutions in a dissipative system. They studied the system when $\varepsilon > 0$ is small, the region of relaxation oscillations. Van der Pol and Van der Mark [19] observed multistability and hysteresis in experimental studies of an electrical circuit roughly modeled by the Van der Pol equation. Following the observations of Van der Pol and Van der Mark, Cartwright and Littlewood proved that there are parameter values at which the system has two stable periodic orbits of different periods. A theorem of Birkhoff then implies that the basin boundary dividing the basins of attraction of these two periodic orbits cannot be a smooth torus. Eventually, Littlewood produced an intricate analysis proving the existence of chaotic solutions in the Van der Pol equation [11]. Levinson [9] gave a more accessible analysis of a simplified, piecewise linear system that inspired Smale's horseshoe [15]. Levi [8] gave a more comprehensive analysis of a still simpler system. In parameter ranges that are not stiff, periodic orbits of the Van der Pol equation have been studied numerically by Mettin et al [12]. See also Flaherty and Hoppensteadt [4].

Takens [17] took a different approach to proving the existence of chaotic solutions in the Van der Pol system, investigating the dynamics associated with codimension-two bifurcations in which a periodic orbit has a return map whose linearization has one as an eigenvalue of multiplicity two, in a regime where the system is nearly linear. He argued that there should be nearby parameters at which there is a periodic orbit with transversal intersections of its stable and unstable manifolds. If true, the Smale–Birkhoff theorem implies that there are invariant subsets on which the flow is conjugate to a subshift of finite type. Takens' analysis was based upon the properties of generic vector fields: the splittings between stable and unstable manifolds are 'beyond all orders' and cannot be computed using regular perturbation theory. To our knowledge, more sophisticated methods of singular perturbation theory have not been applied to compute these splittings in this example.

Despite the role that this system has played historically, the global dynamics and bifurcations of relaxation oscillations of the forced Van der Pol equation have not been thoroughly studied, even numerically. When $\varepsilon \le 10^{-3}$, initial value solvers are unable to follow solutions of the system that lie close to unstable portions of the critical manifold defined by $y = x^3/3 - x$ [6]. Classification of bifurcations from numerical computations is difficult in this regime because the initial value solvers do not resolve the flow near all of the bifurcations.
When ε = 0, the character of the equation changes, becoming a differential algebraic equation, similar to the ‘constrained’ systems studied by Takens [18] as models of electrical circuits. The solutions of the singular limit are described by a two-dimensional slow flow augmented by jumps from fold curves of the critical manifold where existence of solutions to the slow flow equations breaks down. We exploit this structure and investigate the global bifurcations of the differential algebraic equations, including the effects of their jumps. Previous studies have examined the global geometry of the system in terms of a cross-section to the flow defined by a constant phase of the forcing term. This is a natural thing to do for periodically forced oscillations, replacing the autonomous system as a continuous vector field in its phase plane by a discrete time diffeomorphism. The perspective here is different, however. The nature of the bifurcations of the system and their relationship to the singular limit becomes more transparent when we use a ‘cross-section’ that contains a fold curve of the critical manifold. Our work is based upon the decomposition of trajectories of (9.1) into regular, slow segments and fast segments that are almost parallel to the x-axis in this system. The singular limit of the return map is then a map of a circle that has discontinuities related to these degenerate decompositions of trajectories. This paper highlights the main features of these return maps and

Global bifurcations in the forced Van der Pol equation


the kinds of bifurcations displayed in the two-parameter family of differential algebraic equations. Asymptotic analysis of the relationship between the slow flow and the Van der Pol equation with ε > 0 is subtle and involves ‘canards’, solution segments that follow the unstable sheet of the critical manifold. The chaotic orbits of the Van der Pol equation discovered by Cartwright and Littlewood contain canards, and much of their work is devoted to describing the properties of trajectories with canards and developing topological properties of chaotic invariant sets. They did not consider the bifurcations of the stable periodic orbits. We observe here that the bifurcations of these stable periodic orbits persist in the singular limit of the differential algebraic equations, and that they can be located without analysis of canards. The key observation that we make is based upon an examination of where the slow–fast decomposition of trajectories is degenerate. Degenerate decompositions occur when a trajectory jumps from a folded singular point [1] or jumps to a point where the vector field is tangent to the projection of the fold curve along the fast flow. The first type of degeneracy gives rise to canards; the second type of degeneracy is crucial to the existence of periodic orbits of different periods in the Van der Pol equation. Tangencies of the vector field with projection of a fold curve onto another sheet of the critical manifold were described briefly by Grasman [5] and Mishchenko et al [14], but they do not appear to have been described or analysed previously in the Van der Pol equation. This work is descriptive, not rigorous. We unabashedly utilize the results of numerical computations without establishing error estimates or giving proofs.

9.1 The slow flow and its bifurcations

We study the Van der Pol system for small values of $\varepsilon$. The limit $\varepsilon = 0$ is a system of differential algebraic equations that we shall label the DAE when we want to distinguish it from the slow flow defined by (9.2). The critical manifold of the Van der Pol equation is the surface defined by $y = x^3/3 - x$. We shall use coordinates $(\theta, x)$ for the critical manifold. We denote the circle defined by $x = c$ by $S_c$. The projection of the vector field onto the critical manifold is singular on its fold curve, consisting of the circles $S_{\pm 1}$. At most points of the fold curve, trajectories arrive from both sides or they leave from both sides: the existence theorem for ordinary differential equations breaks down for the DAE at these points. However, we know that when $\varepsilon > 0$ is small, trajectories of the Van der Pol system are almost parallel to the $x$-axis off the critical manifold. Therefore, when trajectories of the DAE arrive at the fold curves $S_{\pm 1}$, we apply a discrete transformation that maps $(\theta, \pm 1)$ to the other point of intersection of a line parallel to the $x$-axis with the critical manifold, namely $(\theta, \mp 2)$. The exceptions to this rule occur at folded singularities, described below. Some trajectories of the Van der Pol equation arriving at folded singularities follow the unstable sheet of the critical manifold, jumping at a later time and place. These trajectories are called canards. We regard


John Guckenheimer, Kathleen Hoffman and Warren Weckesser

the jumps from fold curves and canards as part of the structure of the DAE. To cope better with the singularity of the DAE, we rescale time to obtain the slow flow of the system. The slow flow of the Van der Pol equations is defined by the system

\[
\begin{aligned}
\theta' &= \omega(x^2 - 1) \\
x' &= -x + a\sin(2\pi\theta).
\end{aligned}
\tag{9.2}
\]

These equations are obtained by setting $\varepsilon = 0$ in (9.1), differentiating $y - (x^3/3 - x) = 0$ to yield $\dot{y} = (x^2 - 1)\dot{x}$ and rescaling time by the factor $(x^2 - 1)$. Note that the time rescaling is singular at $x = \pm 1$ and that it reverses the direction of time in $|x| < 1$. This needs to be taken into account when relating the phase portraits of the slow flow to the DAE and to (9.1). We also take into account that there are jumps from fold curves to the circles $S_{\mp 2}$, so that tangencies of the slow flow with these lines will play a special role in our analysis.

Singular points of the slow flow occur only when $|a| \ge 1$. These are the folded singularities. They are located on the fold curves at $(\sin^{-1}(\pm 1/a)/(2\pi), \pm 1)$, where $\sin^{-1}$ is regarded as a multivalued function. They are not close to singular points of the full system (there are none), but rather represent points where the direction of motion towards or away from the fold curve changes. Figure 9.1 shows a plot of the critical manifold of the Van der Pol equation, the folded singularities, trajectories of the slow flow and a representative trajectory of the Van der Pol equation. The role of the folded singularities is apparent in this figure. Trajectories of the Van der Pol equation can turn along the critical manifold near the folded singularities without jumping immediately to another sheet.

The limit trajectories of the Van der Pol equation with initial points on $S_2$ follow trajectories of the slow flow to their intersection with $S_1$. Away from the folded singularities, they then jump to $S_{-2}$ and follow trajectories of the slow flow to $S_{-1}$ where they jump once again to $S_2$. This cycle defines a return map for the DAE. Trajectories of the DAE that reach a point that is a folded singularity for the slow flow may continue past the singularity, backwards in time along a trajectory of the slow flow in the region $|x| < 1$. These canards may jump to one of the regions with $|x| > 1$ at any point along their trajectories.
The jumps of the DAE all occur along lines parallel to the $x$-axis, so the jump from a point $(\theta, x)$ along a canard is to one of the two points $(\theta, u)$ where $u^3/3 - u = x^3/3 - x$. The DAE does not adequately resolve the canards: they all appear to occur on trajectories of the slow flow that are asymptotic to the folded singularities. Singular perturbation analysis [5] determines asymptotic properties of the canards for small values of $\varepsilon > 0$ in the Van der Pol equation. We shall not discuss these asymptotics or the geometry of canards in this paper, but note they are essential parts of the chaotic dynamics found in the Van der Pol equation. It is relatively easy to compute the limiting images of the canards for the DAE, but we do not pursue this matter here.
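The ingredients of the DAE described above (the slow flow (9.2), the folded singularities on the fold circle and the jumps parallel to the $x$-axis) can be written down in a few lines. The following Python sketch is only an illustration and is not taken from the authors' computations; the function names are hypothetical. The jump targets use the factorization $u^3/3 - u - (x^3/3 - x) = (u - x)(u^2 + xu + x^2 - 3)/3$.

```python
import math

def slow_flow(theta, x, omega, a):
    """Right-hand side of the slow flow (9.2): returns (theta', x')."""
    return omega * (x * x - 1.0), -x + a * math.sin(2.0 * math.pi * theta)

def folded_singularities_on_S1(a):
    """Theta values in [0, 1) of the folded singularities on x = 1,
    i.e. solutions of a*sin(2*pi*theta) = 1 (none exist when |a| < 1)."""
    if abs(a) < 1.0:
        return []
    t = math.asin(1.0 / a) / (2.0 * math.pi)
    return sorted({t % 1.0, (0.5 - t) % 1.0})

def jump_targets(x):
    """Roots u of u^2 + x*u + x^2 - 3 = 0: the other points of the
    critical manifold on the line through (theta, x) parallel to the x-axis."""
    disc = 12.0 - 3.0 * x * x
    if disc < 0.0:
        return []  # |x| > 2: the line meets the critical manifold only once
    r = math.sqrt(disc)
    return [(-x - r) / 2.0, (-x + r) / 2.0]
```

For a point on the fold $x = 1$ the two roots are $-2$ and $1$ itself, which recovers the rule $(\theta, 1) \mapsto (\theta, -2)$; for a canard point with $|x| < 1$ both roots differ from $x$ and give the two possible landing points.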

Figure 9.1. Three-dimensional plot of a trajectory for the Van der Pol equation and the critical manifold of the system, with axes $\theta$, $x$ and $y$. The folded saddles occur near the points on the critical manifold where the trajectory turns.

The Jacobian of the slow flow equations is

\[
\begin{pmatrix} 0 & 2\omega x \\ 2\pi a\cos(2\pi\theta) & -1 \end{pmatrix}.
\]

The trace of the Jacobian is negative, so the slow flow is area contracting. This has two immediate implications about the phase portraits of the slow flow.

• There are no equilibrium points that are sources.
• There are no unstable periodic orbits. For $a < 1$, the flow points into the strip $|x| < 1$ on its boundary, so the Poincaré–Bendixson Theorem implies that there is a stable periodic orbit in the strip.

Figure 9.2. Phase portraits of the slow flow for parameter values $(\omega, a) = (1, 1.5)$ (a), and $(\omega, a) = (5, 20)$ (b), plotted with $\theta$ from 0 to 1 and $x$ from $-2.5$ to $2.5$. The stable and unstable manifolds of the folded saddles and the circles $|x| = 1$, $|x| = 2$ are drawn.

As $a$ increases, we encounter the following local degeneracies in the slow flow at points of the fold curves $S_{\pm 1}$ and the projections $S_{\mp 2}$ of the fold curves to stable sheets of the critical manifold [13, 14, 16]:

• At $a = 1$, there is a folded saddle–node, dividing slow flows with no folded singularities from slow flows with two folded singularities, one a folded saddle $p_s = (\theta_s, 1) = \left(\frac{1}{2\pi}\sin^{-1}\left(\frac{1}{a}\right), 1\right)$ and the other a folded node or focus at $p_a = (\theta_a, 1) = \left(\frac{1}{2} - \frac{1}{2\pi}\sin^{-1}\left(\frac{1}{a}\right), 1\right)$.
• At $a = \sqrt{1 + 1/(16\pi\omega)^2}$, there is a resonant folded node dividing slow flows with folded nodes from slow flows with folded foci.
• At $a = 2$, the projections of the folds at $S_{\pm 1}$ of the slow flow onto the sheets of the critical manifold at $S_{\mp 2}$ have inflection points. At values of $a > 2$, the mappings along the slow flow from $S_{\pm 2}$ to $S_{\pm 1}$ are no longer monotone.
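The first two degeneracies in the list above can be read off from the Jacobian of the slow flow evaluated on the fold circle $x = 1$, whose determinant is $-4\pi\omega a\cos(2\pi\theta)$ and whose trace is $-1$. The sketch below is an illustrative check under that assumption, with hypothetical function names; it classifies $p_s$ and $p_a$ and evaluates the amplitude at which the node becomes a focus. The degenerate case $a = 1$ (zero determinant) is not treated separately.

```python
import math

def folded_points(a):
    """(theta_s, theta_a) for the folded singularities on x = 1, a > 1."""
    theta_s = math.asin(1.0 / a) / (2.0 * math.pi)
    return theta_s, 0.5 - theta_s

def classify_folded_singularity(theta, omega, a):
    """Type of the singular point (theta, 1) of the slow flow (9.2), from
    the Jacobian [[0, 2*omega], [2*pi*a*cos(2*pi*theta), -1]]."""
    trace = -1.0
    det = -4.0 * math.pi * omega * a * math.cos(2.0 * math.pi * theta)
    if det < 0.0:
        return "saddle"
    disc = trace * trace - 4.0 * det  # sign separates real and complex eigenvalues
    return "node" if disc > 0.0 else "focus"

def resonant_node_amplitude(omega):
    """Amplitude a at which the discriminant vanishes at p_a."""
    return math.sqrt(1.0 + 1.0 / (16.0 * math.pi * omega) ** 2)
```

For $\omega = 1$ the node-to-focus transition occurs very close to $a = 1$, so for the parameter values used in the figures $p_a$ is a folded focus.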

The separatrices of the folded saddle $p_s$ play a significant role in the dynamics of the DAE and the Van der Pol equation. In the half plane $1 \le x$, there is a stable separatrix $W_s$ that arrives at $p_s$ from the left and an unstable separatrix $W_u$ that leaves $p_s$ to the right. Figure 9.2 shows the saddle separatrices of (9.2) for parameter values $(\omega, a) = (1, 1.25)$ and $(\omega, a) = (2, 10)$. We note that $W_u$ always intersects the circle $S_1$ before making a full turn around the cylinder $S^1 \times \mathbb{R}$†.

† Here is the proof. Assume that $W_u$ intersects the line $\theta = \frac{1}{2}$ and denote the segment of $W_u$ that extends from $p_s$ to $\theta = \frac{1}{2}$ by $W_{u2}$. Set $\bar{W}_{u2}$ to be the reflection of $W_{u2}$ in the line $\theta = \frac{1}{2}$. If $(\theta, x)$ and $(1 - \theta, x)$ are symmetric points on $W_{u2}$ and $\bar{W}_{u2}$, the vector $(\omega(x^2 - 1),\, x - a\sin(2\pi\theta))$ is tangent to $\bar{W}_{u2}$. But the vector field of the slow flow at this point is $(\omega(x^2 - 1),\, -x - a\sin(2\pi\theta))$, which points below the curve $\bar{W}_{u2}$. We conclude that $W_u$ intersects the circle $S_1$ at a point $p_{1u} = (\theta_{1u}, 1)$ with $\theta_{1u} \in (\theta_s, 1 - \theta_s)$, proving our assertion.


9.2 Symmetry and return maps

All trajectories with initial conditions on $S_{\pm 2}$ reach $S_{\pm 1}$ except those that lie in the stable manifold of a folded saddle or the strong stable manifold of a folded node. Let $P_+$ be the map along trajectories of the slow flow from $S_2$ to $S_1$, $P_-$ the map along trajectories of the slow flow from $S_{-2}$ to $S_{-1}$, $J_+(\theta, 1) = (\theta, -2)$ and $J_-(\theta, -1) = (\theta, 2)$. The return map for the DAE to the circle $S_2$ is then given by the composition $J_- P_- J_+ P_+$. The slow flow and the DAE are symmetric with respect to the transformation $T(\theta, x) = (\theta + \frac{1}{2}, -x)$. Moreover, $T^2$ is the identity on $S^1 \times \mathbb{R}$, $T P_+ = P_- T$ and $T J_+ = J_- T$. This implies that the return map $J_- P_- J_+ P_+ = J_- P_- T T J_+ P_+ = (T J_+ P_+)(T J_+ P_+)$ is the square of the map $H = T J_+ P_+$ on the circle $S_2$. Consequently, the periodic orbits of the DAE can be divided into those that are fixed by the half return map $H$ and those that are not. Because $T$ phase shifts $\theta$ by $\frac{1}{2}$, the fixed points of $H$ all yield periodic orbits whose period is an odd multiple of the forcing period $1/\omega$. The stable periodic orbits studied in the work of Cartwright and Littlewood [2, 3, 10, 11] are the ones containing points fixed by $H$. In this paper, we shall also focus upon these orbits. Figure 9.4 shows graphs of the maps $H$ for several parameter values.

There are three primary parameter ranges for $a$ in which the maps $P_+$ and $H$ have different properties. The map $P_+$ is a diffeomorphism of the circle $S_2$ to the circle $S_1$ for $0 < a < 1$. In this regime, $x$ decreases along all trajectories in the strip $1 < x < 2$, implying that $H$ is a circle diffeomorphism. Its rotation number depends upon $\omega$, increasing with $\omega$. All rotation numbers in $[\frac{1}{2}, \infty)$ are realized as $\omega$ varies in $(0, \infty)$.

When $1 < a < 2$, the map $P_+$ no longer maps the circle $S_2$ onto the circle $S_1$. Its image $I_1$ excludes the portion of $S_1$ that lies below the manifold $W_u$ defined in the previous section. The discontinuities in the domain of $P_+$ occur at points in $W_s \cap S_2$.
There is a single point of discontinuity since the circle $S_2$ is a cross-section for the flow. It also follows that the map $P_+$ remains increasing in this parameter regime. Thus, $H$ is a family of increasing maps of the circle into itself with a single point of jump discontinuity in this parameter regime. This implies that $H$ still has a well defined rotation number and periodic orbits of at most one period. Quasiperiodic trajectories are still possible, but the set of parameter values yielding quasiperiodic trajectories is likely to have measure zero [7].

When $2 < a$, the map $P_+$ is no longer monotone. There are two points $p_{2l} = \left(\frac{1}{2\pi}\sin^{-1}\left(\frac{2}{a}\right), 2\right)$ and $p_{2r} = \left(\frac{1}{2} - \frac{1}{2\pi}\sin^{-1}\left(\frac{2}{a}\right), 2\right)$ at which $P_+$ has a local maximum and minimum, respectively. On the interval $D = (\theta_{2l}, \theta_{2r})$, $P_+$ has negative slope while on $S^1 - \bar{D}$, it has positive slope. There are two crucial additional aspects to the structure of $H$ as a piecewise continuous and piecewise monotone mapping of the circle. First, there are discontinuities of $P_+$ at intersections of $D$ with $W_s$. (There may be only one such intersection point.) At the points of discontinuity in $W_s \cap S_2$, there is a jump with limit values $\frac{1}{2} + \theta_s = \theta_r$ and $\frac{1}{2} + \theta_{1u} = \theta_l$. We denote by $q_l$ and $q_r$ the points $(\theta_l, 2)$ and $(\theta_r, 2)$ in $S_2$. Second, we observe that the maximum height of $W_u$ is a decreasing function of $\omega$

Figure 9.3. The structure of the slow flow in the strip $1 < x < 2$ for $(\omega, a) = (5, 20)$ (a), and $(\omega, a) = (10, 20)$ (b). Unstable manifolds $W_u$ are drawn with dot–dash curves, stable manifolds $W_s$ are drawn solid, the trajectories originating at the points $p_{2l}$ are drawn dashed and the circles $x = 2$ are drawn as dotted lines. The points $p_s$, $p_{1u}$, $p_{2l}$, $p_{2r}$, $q_l$ and $q_r$ are labeled in both panels.

and is unbounded as $\omega \to 0$. Therefore, if $\omega > 0$ is small enough, $W_u$ intersects the circle $S_2$. When this happens, it divides $S_2$ into two intervals. The points in $S_2$ above $W_u$ have image in $I_H = [q_l, q_r]$ while the points in $S_2$ below $W_u$ have image to the left of $q_l$. (If $0 < \theta_s < \frac{1}{2} < \theta_{1u} < 1$, then this expression yields $I_H \subset [0, 1]$. Otherwise, if $0 < \theta_{1u} < \frac{1}{2}$, the circular arc $I_H$ contains 0 and it is convenient to choose a fundamental domain for the universal cover of the circle $S_2$ that contains $[q_l, q_r]$.) Note that $W_s$ lies above $W_u$.

Figure 9.3(a) shows the structure of the flow in the strip $1 < x < 2$ for $(\omega, a) = (5, 20)$, while figure 9.3(b) shows the structure of the flow in the strip $1 < x < 2$ for $(\omega, a) = (10, 20)$. In this figure, the folded saddles $p_s$ are marked by the symbol ×. Their stable separatrices are drawn as solid curves, and their unstable manifolds are drawn as dot–dashed curves. The circles $S_2$ are drawn dotted, and the points $p_{2l}$ and $p_{2r}$ are labeled. The dashed trajectories have initial condition $p_{2l}$. The intervals $I_H = [q_l, q_r]$ that are the images of $H$ are drawn as thick lines. The points $p_{1u} \in W_u \cap S_1$ are labeled and the points in $W_s \cap S_2$ are marked by large dots.

The graph of the half-return map $H$ for $(\omega, a) = (5, 20)$ is shown in figure 9.4(a). The map $H$ is discontinuous at the points of $W_s \cap S_2$, has a local maximum at $p_{2l}$ and a local minimum at $p_{2r}$. We conclude that the graph of $H$ can contain the following types of interval on which it is monotone and continuous:

• a decreasing branch with domain $[p_{2l}, p_{2r}]$ (this occurs if $W_s$ intersects $S_2$ in a single point),
• a branch containing $p_{2r}$ with a local minimum,
• a branch containing $p_{2l}$ with a local maximum,
• monotone decreasing branches in $[p_{2l}, p_{2r}]$,
• monotone increasing branches in the complement of $[p_{2l}, p_{2r}]$.

Figure 9.4. Graphs of the half-return map for parameter values $a = 20$ and $\omega = 5$ (a), $\omega = 5.38$ (b), $\omega = 5.4$ (c), and $\omega = 5.5$ (d). Both axes run from 0 to 1.
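Graphs of the half-return map such as those described here can be approximated by integrating the slow flow (9.2) from $S_2$ until it first reaches $S_1$ and then applying $T J_+$, which only shifts $\theta$ by $\frac{1}{2}$. The Python sketch below is a minimal illustration assuming a fixed-step fourth-order Runge–Kutta scheme with linear interpolation at the crossing; it is not the authors' numerical method, and the function names, step size and step limit are arbitrary choices. Initial points on or extremely close to $W_s$ never reach $S_1$ within the step limit, and the function then returns `None`.

```python
import math

def slow_flow(theta, x, omega, a):
    """Right-hand side of the slow flow (9.2)."""
    return omega * (x * x - 1.0), -x + a * math.sin(2.0 * math.pi * theta)

def rk4_step(theta, x, h, omega, a):
    """One classical Runge-Kutta step of size h."""
    k1 = slow_flow(theta, x, omega, a)
    k2 = slow_flow(theta + 0.5 * h * k1[0], x + 0.5 * h * k1[1], omega, a)
    k3 = slow_flow(theta + 0.5 * h * k2[0], x + 0.5 * h * k2[1], omega, a)
    k4 = slow_flow(theta + h * k3[0], x + h * k3[1], omega, a)
    return (theta + h * (k1[0] + 2 * k2[0] + 2 * k3[0] + k4[0]) / 6.0,
            x + h * (k1[1] + 2 * k2[1] + 2 * k3[1] + k4[1]) / 6.0)

def P_plus(theta0, omega, a, h=1e-3, max_steps=1_000_000):
    """Lift of P_+: theta (not reduced mod 1) at which the trajectory
    through (theta0, 2) first reaches x = 1, or None if it does not."""
    theta, x = theta0, 2.0
    for _ in range(max_steps):
        theta_new, x_new = rk4_step(theta, x, h, omega, a)
        if x_new <= 1.0:
            s = (x - 1.0) / (x - x_new)  # linear interpolation to x = 1
            return theta + s * (theta_new - theta)
        theta, x = theta_new, x_new
    return None

def half_return_map(theta0, omega, a, **kwargs):
    """H = T J_+ P_+ as a map of theta in [0, 1)."""
    image = P_plus(theta0, omega, a, **kwargs)
    return None if image is None else (image + 0.5) % 1.0
```

Evaluating `half_return_map` on a grid of $\theta$ values gives a sampled graph of $H$. For $a = 0$ the slow flow can be integrated in closed form, $P_+(\theta) = \theta + \omega(\frac{3}{2} - \ln 2)$, which provides a simple check of the integrator.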

We assume for the moment that all intersections of $W_s$ with $S_2$ are transverse. Then $W_s$ must have an odd number of intersections with $S_2$ and every intersection in $[\theta_{2l}, \theta_{2r}]$ is preceded by an intersection in the complement of this interval. Therefore, the number of monotone increasing branches is one larger than the number of monotone decreasing branches. Moreover, the image of all branches is contained in $I_H$ except the branch with a local minimum. We next analyse the properties of the map $H$ near a point $p \in W_s \cap S_2$. If $p \in [p_{2l}, p_{2r}]$, then the limit value on the left is $q_l$ and the limit value on the right is $q_r$. If $p \in [p_{2r}, p_{2l}]$, then the limit value on the right is $q_l$ and the limit value on the left is $q_r$. The slope of the map $H$ on the two sides of $p$ behaves very differently. The points with limit value $q_r$ cross $S_1$ just to the left of $p_s$. Since the line $S_1$ is transverse to the stable and unstable manifolds of $p_s$, the flow map


$P_+$ has a singularity of the form $(\theta_s - \theta)^{\alpha}$ with $0 < \alpha < 1$‡. The points with limit value $q_l$ follow the unstable manifold $W_u$ of $p_s$ to its first crossing with $S_1$. The map along the flow also has a power law singularity. The exponent is the ratio of the magnitudes of the stable and unstable eigenvalues of $p_s$. Since the trace of the Jacobian is negative, the stable eigenvalue has larger magnitude and the exponent is larger than 1. We conclude that the map $H$ has unbounded slope as $H(\theta)$ approaches $q_r$, and slope approaching zero as $H(\theta)$ approaches $q_l$.
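Both power law exponents can be evaluated from the eigenvalues of the slow flow at the folded saddle, which solve $\lambda^2 + \lambda - 4\pi\omega\sqrt{a^2 - 1} = 0$ because $a\cos(2\pi\theta_s) = \sqrt{a^2 - 1}$. The following Python sketch assumes, as in the footnote to this passage, that the flow near $p_s$ is linear; the function names are hypothetical and the code is an illustration rather than part of the original analysis.

```python
import math

def saddle_eigenvalues(omega, a):
    """Eigenvalues (lam, mu), lam < 0 < mu, of the Jacobian of (9.2)
    at the folded saddle p_s on x = 1 (requires a > 1)."""
    c = math.sqrt(a * a - 1.0)  # equals a*cos(2*pi*theta_s)
    root = math.sqrt(1.0 + 16.0 * math.pi * omega * c)
    return (-1.0 - root) / 2.0, (-1.0 + root) / 2.0

def exponents(lam, mu):
    """(alpha, ratio): alpha = -lam/(mu - lam) governs arrival at a line
    through the saddle; ratio = |lam|/mu governs passage along W_u."""
    return -lam / (mu - lam), abs(lam) / mu

def linear_passage(c, m, lam, mu):
    """u-coordinate where the orbit of u' = lam*u, v' = mu*v starting at
    (1, c) meets the line v = m*u, from the explicit solution."""
    t = math.log(m / c) / (mu - lam)
    return math.exp(lam * t)
```

With `lam, mu = saddle_eigenvalues(5.0, 20.0)`, the first exponent lies strictly between 0 and 1 and the second exceeds 1, which matches the unbounded slope near $q_r$ and the vanishing slope near $q_l$.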

9.3 Degenerate decompositions and fixed points of $H$

This section considers the fixed points of the half-return map $H$, which correspond to periodic orbits of odd period for the DAE of the forced Van der Pol equation. The previous section describes the structure of $H$ as a piecewise continuous, piecewise monotone mapping. There is a substantial difference between the cases for $a < 1$, $1 < a < 2$ and $2 < a$. For $a < 1$, $H$ is a diffeomorphism and for $1 < a < 2$, $H$ has a single point of discontinuity. There will be ranges of values of $\omega$ in these cases for which $H$ has a fixed point, one interval for each odd multiple of the forcing period. The case $a > 2$ is the one of most interest to us, and we assume $a > 2$ in the remainder of this section. There are two types of bifurcation that affect the number of fixed points of $H$:

• Bifurcations in the interior of intervals of monotonicity are saddle–nodes where the slope of $H$ is 1 or period-doubling bifurcations where the slope of $H$ is $-1$.
• Bifurcations that involve endpoints of the intervals of monotonicity correspond to homoclinic bifurcations of the DAE.

We have not encountered period doubling bifurcations, but have found saddle–nodes. We have found homoclinic bifurcations where the periodic orbits approach $p_s$ from the left and also ones where the periodic orbits approach $p_s$ from the right. In addition to bifurcations, there are parameter values at which the number of discontinuities and turning points of $H$ change. These correspond to degenerate decompositions for the DAE. For example, when $a = 1$, the DAE has a saddle–node and its preimage under $P_+$ is a singular point of $H$. Despite the fact that a saddle–node has a half-space of trajectories that approach the saddle–node point, all of the trajectories starting at $S_2$ reach $S_1$ under the slow flow except for the one on the strong stable manifold of the equilibrium point.

‡ This is readily computed if we assume that the flow is linear near $p_s$. Let $(u, v)$ be coordinates with the stable manifold the $u$-axis and the unstable manifold the $v$-axis. Assume that the eigenvalues are $\mu > 0$ and $\lambda < 0$. The trajectories of the flow are then graphs of functions $v = c\,u^{\mu/\lambda}$. The flow from the cross-section $u = 1$ to the line $v = mu$ sends the point $(1, c)$ to $\left((c/m)^{-\lambda/(\mu - \lambda)},\; m(c/m)^{-\lambda/(\mu - \lambda)}\right)$. Since $\lambda < 0 < \mu$, $0 < -\lambda/(\mu - \lambda) < 1$.

The other degenerate

decompositions we consider involve tangencies of the slow flow with the circle $S_2$. The general structure of $H$ for $a > 2$ was described in the previous section. There is one local maximum, one local minimum and an odd number of jump discontinuities with the same limit values at each discontinuity. Here we examine how the number of intervals of monotonicity and the number of fixed points of $H$ change.

The discontinuities occur at points in $W_s \cap S_2$. As the parameters vary, the number of these points changes only when there is a point where $W_s$ is tangent to $S_2$. The tangency points of the slow flow with $S_2$ are $p_{2l}$ and $p_{2r}$. When $\omega$ increases with $a$ fixed, the stable manifold $W_s$ tends to move downwards. Consequently, points of $W_s \cap S_2$ appear at $p_{2l}$ where trajectories of the vector field are tangent to $S_2$ from above and disappear at $p_{2r}$ where trajectories of the vector field are tangent to $S_2$ from below. Before a new branch appears at $\theta_{2l}$, $H(\theta_{2l})$ approaches $q_l$. The new branch appears with $H(\theta_{2l})$ taking a value near $\theta_l = \theta_{1u} - \frac{1}{2}$. Branches can disappear at $\theta_{2r}$ only in the regime where $W_u$ does not intersect $S_2$ since $W_s$ lies above $W_u$ and $p_{2r}$ lies below $W_u$ when $W_u \cap S_2$ is nonempty. As $\omega$ increases, the domain of the branch with local minimum at $\theta_{2r}$ shrinks and $H(\theta_{2r})$ approaches $\theta_r$. When the branch disappears, $H(\theta_{2r})$ jumps down to $\theta_l$.

In addition to changes in the number of branches of $H$ associated with tangencies of $W_s$ with $S_2$, there also are parameter values at which the point $p_{2r}$ is in the forward trajectory of $p_{2l}$ and parameter values at which $W_u$ is tangent to $S_2$. Combinations of these degeneracies are also possible at isolated points. For example, near $(\omega, a) = (3, 15)$ there appear to be parameter values for which $W_s$ passes through $p_{2l}$ and $W_u$ passes through $p_{2r}$. We observe in our numerical computations that the point $p_{1u} = W_u \cap S_1$ does not change much with varying $\omega$. The points of intersection in $W_s \cap S_2$ vary much more quickly.
Thus, in describing the bifurcations of fixed points of $H$, we shall speak as if $p_{1u}$ is independent of $\omega$ though this is not the case. The only fixed points of $H$ that we have observed occur in the branch containing $p_{2r}$ and the branch immediately to the left of this one. We denote the domains of these two branches by $I_m$ and $I_{lm}$, respectively. The branch of $H$ with domain $I_m$ is a unimodal map with a turning point at $p_{2r}$ with $\theta_{2r} < \frac{1}{2}$. As $\omega$ increases, $I_m$ shrinks. The value of $H$ at the endpoints of $I_m$ is $\theta_r = \theta_s + \frac{1}{2} > \frac{1}{2}$. As the branch shrinks, its minimum value also approaches $q_r$. Therefore, the branch contains no fixed points when it is short enough. On the other hand, for many parameter values the branch is wide enough that its right endpoint is to the right of the diagonal. The graph of $H|_{I_m}$ must then cross the diagonal an odd number of times, so we expect a single fixed point. There are two bifurcations that occur as the branch shrinks: at the first bifurcation, the right endpoint of the graph of $H|_{I_m}$ meets the diagonal, and a second fixed point appears at the right end of the branch. This fixed point is unstable since the slope of $H$ is unbounded near the endpoints of $I_m$. The second bifurcation is a saddle–node that occurs as the two fixed points collide and the graph of $H|_{I_m}$ moves completely above the diagonal.
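The saddle–node described above can be illustrated on a toy model of a single branch with a local minimum. The Python sketch below locates fixed points of a map on an interval by bracketing sign changes of $h(\theta) - \theta$ and bisecting, and labels each one by the slope of $h$. The quadratic `toy_branch` is a hypothetical stand-in chosen only for illustration; it is not the half-return map $H$ of the DAE, and the parameter `c` merely mimics lifting the graph above the diagonal.

```python
def fixed_points(h, lo, hi, n=2000, dx=1e-6):
    """Fixed points of h on [lo, hi], each returned as (theta, label) with
    label 'stable' if |h'(theta)| < 1 and 'unstable' otherwise."""
    found = []
    grid = [lo + (hi - lo) * i / n for i in range(n + 1)]
    for t0, t1 in zip(grid, grid[1:]):
        g0, g1 = h(t0) - t0, h(t1) - t1
        if g0 * g1 >= 0.0:
            continue  # no strict sign change in this cell
        for _ in range(60):  # bisection on h(t) - t
            mid = 0.5 * (t0 + t1)
            gm = h(mid) - mid
            if g0 * gm <= 0.0:
                t1 = mid
            else:
                t0, g0 = mid, gm
        t = 0.5 * (t0 + t1)
        slope = (h(t + dx) - h(t - dx)) / (2.0 * dx)  # central difference
        found.append((t, "stable" if abs(slope) < 1.0 else "unstable"))
    return found

def toy_branch(c):
    """Hypothetical unimodal branch with minimum value c at theta = 0.4."""
    return lambda t: c + 4.0 * (t - 0.4) ** 2
```

For this toy branch on $[0.2, 0.6]$, `fixed_points(toy_branch(0.45), 0.2, 0.6)` finds a stable and an unstable fixed point, while for `c = 0.47` the graph lies entirely above the diagonal and the list is empty; the two fixed points merge at $c = 0.4625$.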

Figure 9.5. Stable periodic orbits of ω-period 1 (a), and ω-period 3 (b), for parameter values $(\omega, a) = (5, 20)$, plotted with $\theta$ from $-0.5$ to $0.5$ and $x$ from $-2.5$ to $2.5$.

As $I_m$ shrinks with increasing $\omega$, the right endpoint of $I_{lm}$ moves to the right. It crosses the diagonal, creating a fixed point of $H$ in $I_{lm}$. This often occurs before the homoclinic and saddle–node bifurcation within $I_m$. Since $H$ is decreasing on $I_{lm}$, it can contain only this single fixed point of $H$. Because the slope of $H$ approaches 0 at the right endpoint of $I_{lm}$, the fixed point in $I_{lm}$ is stable when it is created. Thus, our observations indicate that as $\omega$ varies, there are parameter regions in which $H$ has two stable fixed points, and in part of this region, there is also an unstable fixed point in $I_m$.

Figure 9.4 displays graphs of $H$ for four different values of $(\omega, a)$: $a = 20$ and $\omega = 5, 5.38, 5.4, 5.5$. At $\omega = 5$, there is bistability and the right end of the branch $I_m$ lies below the diagonal. At $\omega = 5.38$, $I_m$ has shrunk, and its right endpoint is almost on the diagonal. At $\omega = 5.4$, the branch $I_m$ has shrunk still further and lies just above the diagonal. At $\omega = 5.5$, the number of branches has changed as the former $I_m$ disappeared.

The periods of the branches are evident in figure 9.3. In figure 9.3(a), the value of $\theta$ changes by less than 1 as points in $I_m$ flow from $S_2$ to $S_1$. The value of $\theta$ changes by an amount between 1 and 2 as points in $I_{lm}$ flow from $S_2$ to $S_1$. Therefore, the fixed point in $I_m$ has $\theta$-period 1 and the fixed point in $I_{lm}$ has $\theta$-period 3. These two periodic orbits are displayed in figure 9.5. In figure 9.3(b), the $\theta$-periods in $I_m$ and $I_{lm}$ are 9 and 11, respectively.

We end this section with an observation about the limit of the slow flow equations in which $a \to \infty$ and $\omega \to \infty$ with the ratio $\omega/a$ constant. If we rescale the equations by dividing by $a$ and denote $\delta = 1/a$, the slow flow equations become

\[
\begin{aligned}
\theta' &= \beta(x^2 - 1) \\
x' &= -\delta x + \sin(2\pi\theta)
\end{aligned}
\tag{9.3}
\]

where $\beta = \omega/a$. The limit $\delta = 0$ is a Hamiltonian system, with Hamiltonian

\[
E(\theta, x) = \frac{\cos(2\pi\theta) - 1}{2\pi} + \beta\left(\frac{x^3}{3} - x + \frac{2}{3}\right)
\]

so we can use perturbation analysis to characterize the phase portraits of the system when $\delta$ is small. We have only begun to carry out this analysis and record here some of the properties of the case $\delta = 0$.

• There is a saddle at $(\theta, x) = (0, 1)$ with a homoclinic connection $\gamma_u$ above the center at $(0.5, 1)$. The constant part of $E(\theta, x)$ was chosen so that $E(\theta, x) = 0$ on the saddle and the homoclinic connection.
• When $\beta > 3/(4\pi)$, there is also a homoclinic connection $\gamma_l$ that encircles the cylinder and goes below $(0.5, 1)$. The homoclinic connection $\gamma_u$ does not intersect the line $x = 2$; see figure 9.6(a).
• When $\beta = 3/(4\pi)$, the homoclinic connection $\gamma_u$ is tangent to $x = 2$ at $\theta = 0.5$. Moreover, the lower homoclinic connection $\gamma_l$ meets the saddle at $(0.5, -1)$, forming a heteroclinic cycle; see figure 9.6(b).
• When $0 < \beta < 3/(4\pi)$, the homoclinic orbit $\gamma_u$ intersects $x = 2$ twice. Moreover, the saddle at $(0.5, -1)$ has a homoclinic connection $\gamma_0$ that encircles the center $(0.5, 1)$ and is tangent to $x = 2$ at $\theta = 0.5$. This homoclinic connection does not encircle the cylinder. It intersects the fold line $x = 1$ at the points where $\cos(2\pi\theta) = 8\pi\beta/3 - 1$; see figures 9.6(c) and (d).
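The properties listed above follow from evaluating the Hamiltonian on a few special points, and they are easy to confirm numerically. The Python sketch below is an illustrative check with hypothetical function names: it encodes $E$, the vector field of (9.3) with $\delta = 0$, the critical value $\beta = 3/(4\pi)$ and the crossings of $\gamma_0$ with the fold line $x = 1$.

```python
import math

BETA_CRIT = 3.0 / (4.0 * math.pi)

def E(theta, x, beta):
    """Hamiltonian of system (9.3) with delta = 0."""
    return ((math.cos(2.0 * math.pi * theta) - 1.0) / (2.0 * math.pi)
            + beta * (x ** 3 / 3.0 - x + 2.0 / 3.0))

def vector_field(theta, x, beta):
    """(theta', x') for (9.3) with delta = 0, equal to (dE/dx, -dE/dtheta)."""
    return beta * (x * x - 1.0), math.sin(2.0 * math.pi * theta)

def fold_crossings(beta):
    """Theta values in [0, 1] where the level set of E through the saddle
    (0.5, -1) meets x = 1, valid for 0 < beta < BETA_CRIT."""
    c = 8.0 * math.pi * beta / 3.0 - 1.0
    t = math.acos(c) / (2.0 * math.pi)
    return t, 1.0 - t
```

Since $E(0.5, 2) = E(0.5, -1) = -1/\pi + 4\beta/3$ for every $\beta$, the level set through the saddle $(0.5, -1)$ always touches $x = 2$ at $\theta = 0.5$, and this common value vanishes exactly at $\beta = 3/(4\pi)$, where $\gamma_u$ becomes tangent to $x = 2$.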

9.4 Concluding remarks

This paper is an initial step towards determining the global bifurcation diagram for the forced Van der Pol equation near its singular perturbation limit. We exhibit stable periodic orbits in the singular limit of the differential algebraic equation and establish that their existence is due primarily to a phenomenon that has received little attention in the literature on singularly perturbed dynamical systems. In particular, the ‘projection’ of a fold curve along the fast flow of the system is tangent to the slow flow on a sheet of the critical manifold. These tangencies lead to a lack of monotonicity in the return map for the system along folds of the critical manifold. Our work appears to be the first time that such tangencies have been noted in the Van der Pol equation and related to its bifurcations. By viewing the system in terms of return maps to the fold curves rather than by fixing a cross-section of constant phase, the tangencies become evident.

We conjecture that the parameter region with stable periodic orbits that have $\theta$-period $2n + 1$ forms a strip in the $(\omega, a)$ plane that extends from the $\omega$-axis to $\infty$ with increasing $a$. One boundary of these strips is conjectured to be a saddle–node bifurcation of periodic orbits and the other boundary is a bifurcation that approaches a homoclinic orbit of the differential algebraic equation as $\varepsilon \to 0$.

Figure 9.6. Phase portraits of the system (9.3) with $\delta = 0$ and $\beta = 0.30 > 3/(4\pi)$ (a), $\beta = 3/(4\pi)$ (b), $\beta = 0.15 < 3/(4\pi)$ (c), and $\beta = 0.03 \ll 3/(4\pi)$ (d), plotted with $\theta$ from 0 to 1 and $x$ from $-2$ to 2. The homoclinic connections $\gamma_u$, $\gamma_l$ and $\gamma_0$ are labeled.

We have not yet characterized these bifurcations outside the singular limit, but conjecture that they become saddle–nodes as the stable periodic orbits collide with unstable periodic orbits at the edge of regions where there are periodic orbits with canards. For each pair of adjacent odd integers $2n \pm 1$, we further conjecture that there is a number $a_n > 2$ such that the strips overlap for $a > a_n$ and that no overlaps occur for $a < a_n$. For the differential algebraic equation, there is a codimension-two point with $a = a_n$ at which the period $2n - 1$ orbit has a saddle–node and the period $2n + 1$ orbit has a homoclinic bifurcation.

The differential algebraic equation can be used to give approximations to the locations of canards in the Van der Pol equation. The stable separatrix of the folded saddle in the strip $|x| < 1$ gives the approximate location of the canards, and the jumps from the canards to the stable sheets of the critical manifold can be calculated explicitly within the differential algebraic equation. This information is a starting point for asymptotic analysis that should lead to global bifurcation diagrams of the Van der Pol equation in the relaxation oscillation regime with


ε > 0 small. Additional perturbation analysis should characterize the properties of the system in the parameter regions where ω and a are large, but have a bounded ratio.

Acknowledgements

This research of John Guckenheimer was partially supported by grants from the National Science Foundation and the Department of Energy.

References [1] Arnold V, Afrajmovich V, Ilyashenko Yu and Shil’nikov L 1994 Bifurcation Theory: Dynamical Systems V (Encyclopaedia of Mathematical Sciences) (Berlin: Springer) [2] Cartwright M and Littlewood J 1945 On nonlinear differential equations of the second order: I the equation y¨ − k(1 − y 2 ) y˙ + y = bk cos(λt + a), k large J. London Math. Soc. 20 180–9 [3] Cartwright M and Littlewood J 1947 On nonlinear differential equations of the second order: II the equation y¨ − k f (y, y˙ ) y˙ + g(y, k) = p(t) = p1 (t) + kp2 (t), k > 0, f (y) ≥ 1 Ann. Math. 48 472–94 (Addendum 1949 Ann. Math. 50 504–5) [4] Flaherty J and Hoppensteadt F 1978 Frequency entrainment of a forced Van der Pol oscillator Stud. Appl. Math. 58 5–15 [5] Grasman J 1987 Asymptotic Methods for Relaxation Oscillations and Applications (Berlin: Springer) [6] Guckenheimer J, Hoffman K and Weckesser W 2000 Numerical computation of canards Int. J. Bif. Chaos 10(12) 2669–87 [7] Keener J 1980 Chaotic behavior in a piecewise continuous difference equation Trans. Am. Math. Soc. 261 589–604 [8] Levi M 1981 Qualitative Analysis of the Periodically Forced Relaxation Oscillations (Memoirs AMS 214) (Providence, RI: American Mathematical Society) [9] Levinson N 1949 A second order differential equation with singular solutions Ann. Math. 50 127–53 [10] Littlewood J 1957 On nonlinear differential equations of the second order: III the equation y¨ − k(1 − y 2 ) y˙ + y = bk cos(λt + a) for large k and its generalizations Acta Math. 97 267–308 (Errata at end of [11]) [11] Littlewood J 1957 On nonlinear differential equations of the second order: III the equation y¨ −k f (y) y˙ +g(y) = bkp(φ), φ = t +a for large k and its generalizations Acta Math. 98 1–110 [12] Mettin R, Parlitz U and Lauterborn W 1993 Bifurcation structure of the driven Van der Pol oscillator Int. J. Bif. 
Chaos 3 1529–55 [13] Mischenko E and Rozov N 1980 Differential Equations with Small Parameters and Relaxation Oscillations (New York: Plenum) [14] Mischenko E, Kolesov Yu, Kolesov A and Rozov N 1994 Asymptotic Methods in Singularly Perturbed Systems (Translated from the Russian by Irene Aleksanova) (New York: Consultants Bureau)


John Guckenheimer, Kathleen Hoffman and Warren Weckesser

[15] Smale S 1963 Diffeomorphisms with many periodic points Differential and Combinatorial Topology ed S Cairns (Princeton, NJ: Princeton University Press) pp 63–80 [16] Szmolyan P and Wechselberger M 2000 Canards in R 3 Preprint [17] Takens F 1974 Forced oscillations and bifurcations Commun. Math. Inst. Rijksuniversiteit Utrecht 3 1–59 (reprinted in chapter 1 of this volume) [18] Takens F 1975 Constrained equations; a study of implicit differential equations and their discontinuous solutions Report ZW-75-03 (Groningen: Mathematisch Institut Rijksuniversiteit) [19] Van der Pol B and Van der Mark J 1927 Frequency demultiplication Nature 120 363–4

Chapter 10
An unfolding theory approach to bursting in fast–slow systems
Martin Golubitsky, University of Houston
Krešimir Josić and Tasso J Kaper, Boston University

We dedicate this chapter to Floris Takens on the occasion of his sixtieth birthday. We have learned a lot from his fundamental contributions, especially those in [63, 64, 65] that play such important roles in the work presented here.

Many processes in nature are characterized by periodic bursts of activity separated by intervals of quiescence. In this chapter we describe a method for classifying the types of bursting that occur in models in which variables evolve on two different timescales, i.e., fast–slow systems. The classification is based on the observation that the bifurcations of the fast system that lead to bursting can be collapsed to a single local bifurcation, generally of higher codimension. The bursting is recovered as the slow variables periodically trace a closed path in the universal unfolding of this singularity. The codimension of a periodic bursting type is then defined to be the codimension of the singularity in whose unfolding it first appears. Using this definition, we systematically analyse all of the known universal unfoldings of codimension-one and -two bifurcations to classify the codimension-one and -two bursters. Takens was the first to analyse the unfolding spaces of a number of these. In addition, we identify several codimension-three bursters that arise in the unfolding space of a codimension-three degenerate Takens–Bogdanov point. Among the periodic bursters encountered in mathematical models for nerve cell electrical activity, so-called elliptical, or type III, bursters are shown to have codimension two. Other bursters studied in the literature are shown to first appear in the unfolding of the degenerate Takens–Bogdanov point and thus have codimension three. In contrast with previous classification schemes, our approach is local, provides an intrinsic notion of complexity for a bursting system, and lends itself to numerical implementation.

10.1 A framework for classifying bursters

The fields of mathematical and computational neuroscience focus on modeling electrical activity in nerve cells. Fundamental processes, such as voltage changes and ion transport across cell membranes, occur on disparate time scales. As a result, the mathematical models consist of fast and slow variables:

\[ \dot{x} = f(x, y), \qquad \dot{y} = \varepsilon g(x, y) \tag{10.1} \]

where $\varepsilon > 0$ is small.

The Hodgkin–Huxley equations, FitzHugh–Nagumo equations, Morris–Lecar equations, as well as many other fast–slow models of the above form, exhibit an extremely rich variety of nonlinear dynamical behaviors. In this work, we focus exclusively on the phenomenon of periodic bursting. Rinzel [52] defines a periodic burster as a periodic solution to a system of autonomous differential equations whose behavior alternates between near steady state and trains of approximate spike-like oscillation; see also [66].

Within the particular framework of fast–slow systems (10.1), Rinzel [50, 51, 54], Ermentrout and Kopell [23] and others use analytical methods for identifying and constructing periodic bursters. One thinks of the slow variables $y$ as providing a time-periodic forcing in the fast $x$ equation, so that the solution of the fast equation visits various invariant sets in order. As noted in [54], the slow variables may provide this forcing effectively without feedback from the fast system, in which case they oscillate periodically on their own, irrespective of how the coupling to the fast subsystem influences them. Alternatively, they may provide it with feedback from the fast system, in which case the switching between the two states is determined also by the fast variables. In the former case, the system may effectively be modeled as

\[ \dot{x} = f(x, y), \qquad \dot{y} = \varepsilon g(y) \tag{10.2} \]

with the slow component only depending on the slow variable y. A reduction to (10.2) can also be made for the latter case. However, in this second case, one needs to use invariant manifold theory, and the reduction is made separately over each of the local segments of the periodic trajectories, not globally over the entire period of the slow oscillation. There is also a geometric approach to constructing bursters that entails thinking of bursters as generalized heteroclinic orbits. Here the bursting trajectory


is a trajectory that moves from one invariant set (a steady state, a periodic solution, or a quasiperiodic solution) to another, spending a relatively short time in transition and a relatively long time near each invariant set. Golubitsky and Stewart [28] emphasized the heteroclinic structure of bursters by introducing the stylized notion of a pipe system in phase space. Pipe systems consist of joints containing hyperbolic invariant sets, such as equilibria and limit cycles, and tubes connecting the joints. They showed that primitive piecewise smooth constructions of pipe systems do lead to bursting time series of the type seen in experiments. From all of these points of view, a bursting trajectory is described by certain characteristics: the sequence of invariant states (equilibria, periodic orbits, invariant tori, etc) visited by the bursting trajectory and the ways in which these trajectories approach and leave each invariant set. The classification schemes that have been explored based on such phenomenological descriptions, including that in Izhikevich [37], all suffer from one difficulty—there is no way to know when the classification is complete. In this chapter, we suggest that bursters should be classified by their phenomenological description and by the codimension of the singularity in whose unfolding they first appear. We will use the fast–slow structure in (10.2) and the local unfolding ideas as a basis for this classification scheme. This approach is analogous to the path formulation description of bifurcation problems given in [26] where here the paths will be closed curves rather than curves and the unfolding theory will be the dynamical systems unfolding theory described in [30] rather than the singularity theory unfolding theory of Thom and Mather. This local approach has several advantages over the above global analytical and geometrical methods. 
First, it is well known [20, 26, 27, 30] that many bifurcations that are first observed globally can be more easily studied by local theory, i.e., by using the unfolding theory of degenerate singularities. In particular, the local theory provides methods by which global phenomena can be found locally using calculus and numerical techniques. Second, the local theory provides a rational method of classification by codimension that naturally indicates how complex a system needs to be in order for it to support bursters of given types. The local approach developed in this work based on the unfolding of singularities may be viewed as a logical extension of the approach taken in Bertram et al [8]. In particular, Bertram et al [8] showed that the distinct bursters known at the time, including the three from the classification of [51], as well as some found later, could be obtained by choosing appropriate paths in the bifurcation diagram of a codimension-three degenerate Takens–Bogdanov bifurcation point. Their analysis, in turn, relied heavily on the results reported in Dumortier et al [22] for the unfolding space of this singularity. The central new element we add is that the codimension of a periodic bursting type should be the codimension of the singularity in whose unfolding the burster first appears. It is this new definition, made precise in definition 10.5 below, that makes possible a rational classification scheme and that offers a natural measure of the complexity


of each bursting type. Our work also complements the classification scheme presented in de Vries [67]. There, a bifurcation map of the parameter space is developed by first finding the codimension-one bifurcation curves and then by finding the special codimension-two points along these curves that bound the regions with different bursters. The domain of the map is naturally split into separate regions, one for each distinct bursting type. This map therefore broadens the schemes developed earlier in [51, 8].

Remark 10.1. The need for a cogent mathematical classification also stems from the fact that there are difficulties inherent in relating features of time series to the topological classifications of bursters; see [8, 37]. In particular, indicators from the time series, such as phase resetting, spike-frequency profiles, and spike undershoot, only provide a limited tool for classification.

10.1.1 Phenomenological approach to bursting in fast–slow systems

A necessary condition for bursting in (10.2) is that each of the fast system invariant sets must be asymptotically stable for certain values of $y$ and lose stability at a bifurcation as $y$ changes. Assuming that the slow system varies periodically, as we do for periodic bursters, we can rewrite (10.2) as a periodically forced system of differential equations of the form

\[ \dot{x} = f(x, y(\varepsilon t)). \tag{10.3} \]

We now think of the solution of (10.3) as visiting invariant states of the frozen system

\[ \dot{x} = f(x, y^*) \tag{10.4} \]

where $y^* = y(\theta)$ for some $\theta$; i.e., the time-periodic evolution of $y(t)$ forces the solution trajectory $x(t)$ to oscillate between these invariant sets. With this structure in mind, we define a burster type as follows.

Definition 10.2. A periodic burster type consists of
(a) an ordered set $S_j$ ($j = 1, \ldots, \ell$) of stable equilibria, periodic orbits, or invariant tori for the frozen system (10.4) existing at $y^* = y(\theta)$ for $\theta$ in the interval $(\theta_j, \theta_{j+1})$, where $\theta_{\ell+1} = \theta_1$;
(b) the type of bifurcation that $S_j$ undergoes as $\theta$ varies past $\theta_{j+1}$, where these bifurcations cause the trajectory to leave the vicinities of the invariant sets;
(c) the eigenvalues (or spectra), all real (nodal) or some complex (oscillatory), associated with each $S_j$.


We assume that the eigenvalues (or spectra) of $S_j$ in (c) do not change their type as $\theta$ varies in $(\theta_j, \theta_{j+1})$, nor do the signs of their real parts change.

Remark 10.3. This definition of bursting allows for bursting in systems in which there is at most one stable state at each parameter value of the fast system. Hence, it contrasts with previous descriptions of bursters which rely primarily on bistability of states. Furthermore, our description of bursters leads to an enlargement of the class of systems that are labeled as ‘bursting’. For example, it includes bursts between two steady states, which have no spiking in the active phase.

10.1.2 A local description using singularities and their unfoldings

We now set up our framework for discussing the local birth of periodic bursters. We assume that the frozen system $\dot{x} = f(x, 0)$ has a singularity of codimension $k$ at $x = 0$ and that the $y$ variables are universal unfolding parameters for this singularity. That is, we assume that $f : \mathbb{R}^n \times \mathbb{R}^k \to \mathbb{R}^n$. In this context, we assume that the unfolding theorem is valid and $y(\theta)$ is a small amplitude periodic path in $\mathbb{R}^k$. Of course, our discussion only refers to an unspecified neighborhood of the origin in $\mathbb{R}^k$, so that the convention of writing the parameter space of $k$ parameters as $\mathbb{R}^k$ is a slight abuse of notation.

Locally, near the origin the universal unfolding defines a codimension-one transition variety $V \subset \mathbb{R}^k$. This variety consists of parameter values at which singularities of codimension at least one occur. These singularities include, but are not limited to, saddle–node bifurcations, Hopf bifurcations, and homoclinic trajectories. A diffeomorphism $\varphi : \mathbb{R}^k \to \mathbb{R}^k$ preserves $V$ if $\varphi(V) = V$ and if it maps each component manifold constituting the variety to itself. In particular, $\varphi(0) = 0$ whenever $\varphi$ preserves $V$, since 0 is the only codimension-$k$ point of $V$.

Definition 10.4. Two paths $y(\theta)$ and $z(\theta)$ are path equivalent if there exist a map $\varphi : \mathbb{R}^k \times S^1 \to \mathbb{R}^k$ and a reparametrization $\Theta : S^1 \to S^1$ such that $\varphi(\cdot, \theta)$ is a diffeomorphism that preserves $V$ for each $\theta \in S^1$ and $z(\theta) = \varphi(y(\Theta(\theta)), \theta)$.

Whenever $y(\theta)$ and $z(\theta)$ are path equivalent, the corresponding bursters have the same type. The idea is that when two paths are path equivalent they traverse the same sets of (stable) equilibria, periodic solutions, etc with the same eigenvalue types; hence they have the same burster types.


10.1.3 Classification based on minimum codimensions of singularities

We make the following definition of codimension of a periodic bursting type in order to be able to classify periodic bursters in a framework intrinsic to the fast–slow decomposition of the governing equations. The viewpoint adopted here is that, among all of those singularities whose unfoldings contain a given bursting type, the one with smallest codimension gives an intrinsic measure of the complexity of that burster:

Definition 10.5. The codimension of a periodic bursting type is the minimum codimension of a bifurcation point in the fast system in whose unfoldings that type of bursting occurs.

Definition 10.5 forms the basis for the classification presented in this work. We will show that there is a single codimension-one burster, namely that which arises through a (nondegenerate) Hopf bifurcation in the fast subsystem. Then, we classify all codimension-two bursters by systematically studying the known unfoldings of all codimension-two bifurcations. These include: the cusp singularity, degenerate Hopf bifurcation, Takens–Bogdanov bifurcation, Hopf–steady-state mode interaction, and Hopf–Hopf mode interaction. All other bursting types must be of codimension three or more, and in this work we will also discuss certain codimension-three bursters.

A surprising observation resulting from our analysis is that systems traditionally labeled as type III (or elliptic) bursters have codimension two, whereas the other most commonly studied bursters, those labeled as types Ia (square-wave), Ib, II (parabolic) and IV, first occur in the unfoldings of codimension-three bifurcations, as was shown in [8] and as we will see here.

Remark 10.6. Ultimately a local classification of bursters, as we describe here, is limited by the extent to which universal unfoldings of dynamical singularities are understood.
For example, in the classification of codimension-two bursters, the unfoldings of Hopf–steady-state and Hopf–Hopf mode interactions are not yet completely understood in a rigorous fashion. We have only reported results here for the studies of the truncated normal forms and, hence, we can only say that our classification is complete to the extent that it is covered by the known theory. See section 10.3.4 and section 10.3.5 for more discussion. It seems likely, however, that these details will have little significant impact on our conclusions. Remark 10.7. The singularity-based approach may be contrasted with that of Izhikevich [37]. Specifically, [37] has classified periodic bursters by the precise bifurcations that occur along the trajectory. Each distinct sequence of codimension-one bifurcations in the fast subsystem is considered as giving rise to a different type of burster. Also, the resultant sequence is labeled as a ‘codimension-one’ burster, e.g. a sequence is labeled as Hopf–Hopf when the active phase begins and ends with Hopf bifurcations. There is, therefore, a


mixture of local and global in this classification scheme, whereas the singularity-based approach is purely local and intrinsic. In addition, it is not clear from the classification in [37] how ‘likely’ each of the given sequences is. In the singularity-based approach the possible sequences that occur in the unfolding of a given singularity can be seen directly. Moreover, since each of these sequences arises in an unfolding of a singularity, their intrinsic complexity is that of the associated singularity.

10.1.4 Generic paths

Using transversality, loop space (the space of closed paths through the space of unfolding parameters $\mathbb{R}^k$) can be decomposed into connected components separated by a codimension-one variety. Again this decomposition into connected components is important only near the origin in loop space. We call a closed path $\mu : S^1 \to \mathbb{R}^k$ generic if $\mu$ intersects $V$ transversely. Transversality implies that $\mu$ intersects $V$ only in codimension-one components and crosses those components with nonzero speed. Therefore, transversality implies that any sufficiently small perturbation of a generic path is generic and that the two paths are path equivalent. Thus, the connected components of loop space mentioned above consist of paths that are all path equivalent. The variety separating these components consists of paths that are tangent to the variety $V$ or that intersect $V$ at points of codimension greater than one.

It seems quite difficult to classify all generic paths, that is, all components of generic paths in loop space. We sidestep this issue by mostly considering paths in the family

\[ \mu(\theta) = A + \cos(\theta) B + \sin(\theta) C \tag{10.5} \]

where $A, B, C \in \mathbb{R}^k$ are vectors in parameter space. These paths are usually our candidates for paths supporting bursters. Note that this $3k$-dimensional subspace of loop space also divides naturally into components of generic paths.
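The transversality requirement on the paths (10.5) can be probed numerically. The following is a minimal sketch, assuming that near the crossing points one sheet of the transition variety $V$ is the zero set of a smooth scalar function `h`; that function, the names `closed_path` and `transversality_margin`, and the sample paths are all hypothetical and are not taken from the chapter. The returned margin is the smallest value of $\sqrt{g^2 + (dg/d\theta)^2}$ along the loop, where $g(\theta) = h(\mu(\theta))$; a margin near zero signals a (near) tangency, i.e. a path on or close to the boundary between components of generic paths. Points of $V$ of codimension greater than one are not detected by this test.

```python
import math

def closed_path(theta, A, B, C):
    """Path (10.5): mu(theta) = A + cos(theta) B + sin(theta) C, with A, B, C in R^k."""
    c, s = math.cos(theta), math.sin(theta)
    return [a + c * b + s * cc for a, b, cc in zip(A, B, C)]

def transversality_margin(h, A, B, C, samples=2000, step=1e-6):
    """Smallest value of hypot(g, dg/dtheta) over the sampled loop, g = h(mu(theta)).

    h is an ASSUMED smooth function whose zero set models one sheet of V.
    The derivative is approximated by a central difference.
    """
    margin = float("inf")
    for k in range(samples):
        theta = 2.0 * math.pi * k / samples
        g = h(closed_path(theta, A, B, C))
        g_plus = h(closed_path(theta + step, A, B, C))
        g_minus = h(closed_path(theta - step, A, B, C))
        dg = (g_plus - g_minus) / (2.0 * step)
        margin = min(margin, math.hypot(g, dg))
    return margin

def hopf_line(mu):
    """Hypothetical example sheet: the line mu_1 = 0 in a two-parameter unfolding."""
    return mu[0]

if __name__ == "__main__":
    crossing = transversality_margin(hopf_line, (0.0, 0.0), (0.5, 0.0), (0.0, 0.5))
    tangent = transversality_margin(hopf_line, (0.5, 0.0), (0.5, 0.0), (0.0, 0.5))
    print("margin of a circle crossing the line:", crossing)
    print("margin of a circle tangent to the line:", tangent)
```

A margin that stays well away from zero under small changes of $A$, $B$, $C$ is consistent with the statement that small perturbations of a generic path remain generic.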
This choice of paths in loop space can be motivated by thinking that there is a Hopf bifurcation in the slow equations that generates a family of periodic solutions in the slow variables. These periodic solutions then force the fast equation, since the slow variables are the parameters in the unfolding of the singularity in the fast equation. The family of paths obtained in this way is, to first order in a Fourier decomposition sense, the same as those in (10.5).

10.1.5 Organization of this chapter

In section 10.2 we analyse the standard Hopf singularity that gives rise to the unique codimension-one periodic burster. In section 10.3, we systematically study the known codimension-two singularities of vector fields and identify all of the codimension-two periodic bursters that they generate. Finally, section 10.4 is devoted to a particular codimension-three singularity in whose universal


unfolding several codimension-three periodic bursters are found. We discuss the singularity in the fast system, the relevant paths through the unfolding of that singularity, and the associated time series for each burster type we consider. For the different periodic bursters identified and analysed in the following sections, we give examples and cite some of the recent literature in neurophysiology. Further examples from neurophysiology may be found in [3, 13, 15–19, 31, 33, 44–49, 53, 59, 69], though we emphasize that this is only a partial listing.

10.2 The codimension-one burster

We begin the classification with bursters of codimension one. The generic codimension-one bifurcations of flows are saddle–node and Hopf bifurcations. However, paths in the unfolding space of a saddle–node bifurcation do not lead to bursting, since the fast system contains no stable states on one side of the bifurcation point. Thus, the sole codimension-one bifurcation of interest here is the Hopf bifurcation. The nondegenerate Hopf bifurcation, also known as the Andronov–Hopf bifurcation [1, 32, 41], has the normal form (in polar coordinates)

\[ \dot{r} = r(\mu - r^2) + O(r^4), \qquad \dot{\theta} = 1 + O(r^2) \tag{10.6} \]

where $\mu$ is a real number. Here, we have chosen the coefficient in front of the cubic term to be negative, since we are interested in the bifurcation to a stable limit cycle. It is known that the dynamics of (10.6) is topologically equivalent to that of the truncated normal form, the system (10.6) without the higher-order terms. Thus, we study the truncated normal form.

For each $\mu$, the system (10.6) has an equilibrium at $r = 0$. It is a stable focus attracting all orbits at an exponential rate for each $\mu < 0$, whereas it is an unstable focus for any $\mu > 0$. The transition occurs precisely at the Hopf bifurcation point $\mu = 0$, where the origin is topologically a stable focus but orbits approach it at only an algebraic rate. For $\mu > 0$, there is also a unique limit cycle (of radius $\sqrt{\mu}$ in the truncated system (10.6)) that is asymptotically stable and attracts all nonzero initial conditions. In this case, the transition variety $V$ is just the origin.

In order to study bursting, we make the unfolding parameter vary slowly along a closed loop

\[ \mu = C \sin(\varepsilon t), \tag{10.7} \]

where $A = B = 0$ in (10.5) and $0 < \varepsilon \ll 1$. This slow variation in the unfolding parameter $\mu$ causes the system to periodically cross the Hopf transition variety. More generally, in terms of the paths $\mu(t) = A + B\cos(\varepsilon t) + C\sin(\varepsilon t)$ considered in (10.5), $A, B, C$ space divides into three regions: one in which the path is always to the right of 0, one where it is always to the left of 0, and one where the path

Figure 10.1. Time series of $r\cos\theta$ obtained from (10.6) and (10.7) with $C = 0.3$ for $\varepsilon = 0.01$ (top; note the slow passage effect) and for $\varepsilon = 0.1$ (bottom). The right panels show enlargements of a portion of the left panels so the spikes are visible. There are more spikes per burst event with $\varepsilon = 0.01$ than with $\varepsilon = 0.1$, since the frequency of the periodic orbit is $O(1)$ in the fast time and, hence, the slower the slow variable changes, the more the fast variable oscillates.

crosses the origin twice. The path given by (10.7) samples this last region and gives rise to bursting (in those cases where the slow passage effect permits it, see below). The other two regions are associated with paths that do not yield bursting.

There is an important additional phenomenon, namely slow passage through a Hopf bifurcation [42, 43], that arises in the system (10.6) with $\mu$ given by (10.7), for sufficiently small $\varepsilon$, as well as in other bifurcations in which the equilibria have eigenvalues with nonzero imaginary parts. For analytic vector fields, solutions will stay close to an unstable equilibrium point beyond the Hopf bifurcation value at which the equilibrium became unstable. Moreover, if there is a finite ($O(1)$) buffer point, then the solutions will do so until the value of $\mu$ reaches the value of that buffer point and they will all leave a neighborhood of the unstable equilibrium point in an exponentially small interval about the buffer point. This delay in the effect of the bifurcation runs counter to one’s intuition in that one would instead expect that solutions get repelled by the equilibrium as


soon as it becomes unstable. Therefore, this slow passage effect plays a prominent role in determining the amplitude profile of the burster. See figure 10.1 for the manifestation in the analytic Hopf burster, and we refer the reader to some of the by-now large literature on slow passage; see [4, 7, 14, 21, 34, 42, 43, 61, 62]. For completeness, it is also worth recalling here that noise will destroy the slow passage effect; see [4, 42, 43].
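The three regions of $A, B, C$ space and the delayed onset of each burst can be illustrated with a short computation. The sketch below is an illustration under stated assumptions, not the authors' code: it integrates only the radial equation of the truncated normal form (10.6) with $\mu = C\sin(\varepsilon t)$, takes $\theta = t$ from the truncated phase equation with $\theta(0) = 0$, and uses a hand-written fourth-order Runge–Kutta step. The function names, the initial radius `r0`, the step size and, in particular, the lower bound `r_floor` on the radius are assumptions introduced here; the floor is a crude stand-in for noise or solver tolerance, without which the radius decays so far during the quiescent phase that it has no time to regrow.

```python
import math

def classify_hopf_path(A, B, C):
    """Region of (A, B, C) space for mu(t) = A + B cos(eps t) + C sin(eps t)."""
    amplitude = math.hypot(B, C)  # mu sweeps the interval [A - amplitude, A + amplitude]
    if A > amplitude:
        return "always right of 0"
    if A < -amplitude:
        return "always left of 0"
    if abs(A) < amplitude:
        return "crosses 0 twice"
    return "tangent at 0"  # non-generic boundary case |A| == amplitude

def simulate_hopf_burster(C=0.3, eps=0.01, r0=0.1, r_floor=1e-6,
                          t_end=2000.0, dt=0.05):
    """Integrate dr/dt = r (mu - r^2), mu = C sin(eps t), with classical RK4.

    r_floor is an ASSUMED noise floor (not part of the chapter's model).
    Returns the lists of times and radii.
    """
    def rhs(t, r):
        return r * (C * math.sin(eps * t) - r * r)

    n_steps = int(round(t_end / dt))
    times, radii = [0.0], [r0]
    r = r0
    for i in range(n_steps):
        t = i * dt
        k1 = rhs(t, r)
        k2 = rhs(t + 0.5 * dt, r + 0.5 * dt * k1)
        k3 = rhs(t + 0.5 * dt, r + 0.5 * dt * k2)
        k4 = rhs(t + dt, r + dt * k3)
        r = max(r + dt * (k1 + 2.0 * k2 + 2.0 * k3 + k4) / 6.0, r_floor)
        times.append((i + 1) * dt)
        radii.append(r)
    return times, radii

def time_series(times, radii):
    """x(t) = r cos(theta) with theta = t, as in the truncated phase equation."""
    return [r * math.cos(t) for t, r in zip(times, radii)]

if __name__ == "__main__":
    print(classify_hopf_path(0.0, 0.0, 0.3))
    ts, rs = simulate_hopf_burster()
    xs = time_series(ts, rs)
    print("largest radius reached:", max(rs))
    print("largest |x| reached:", max(abs(x) for x in xs))
```

With these settings the radius is still very small some time after $\mu$ has become positive again, which is the slow passage effect described above; raising `r_floor` shortens the delay, in line with the remark that noise destroys slow passage.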

10.3 Codimension-two bursters

The next simplest bursting types are those of codimension two. In this section, we analyse each of the generic codimension-two local singularities in the fast system to classify codimension-two bursters. These are

(i) the cusp point,
(ii) the degenerate Hopf bifurcation point,
(iii) the Takens–Bogdanov bifurcation point,
(iv) the Hopf–steady-state bifurcation point, and
(v) the Hopf–Hopf bifurcation point.

These systems have two-dimensional unfolding spaces and, within the framework of section 10.1, we take the unfolding parameter $\mu$ to be of the form

\[ \mu_1 = B_1 \cos(\varepsilon t) + A_1, \qquad \mu_2 = C_2 \sin(\varepsilon t) + A_2. \tag{10.8} \]

10.3.1 The cusp

The cusp is a saddle–node bifurcation in which there is a degeneracy in the quadratic terms. This bifurcation involves the transition from one steady-state to three. The universal unfolding is given by

\[ \dot{x} = -x^3 + \mu_1 x + \mu_2 + O(x^4). \tag{10.9} \]

Here also, the dynamics of the full normal form (10.9) is topologically equivalent to that of the truncated system, i.e., (10.9) without the higher-order $O(x^4)$ terms. Hence, the bifurcation diagrams are qualitatively the same, so that one only needs to study the truncated system; see [26, ch III.12(c)] or [39, ch 8.2].

Following the framework of section 10.1, we find the transition variety for the cusp singularity. In $(\mu_1, \mu_2, x)$-space, the edges of the cusp surface are given by $(\mu_1, \mu_2) = (3x^2, -2x^3)$. Hence, the cusp curves, which are the projections of these edges onto the $\mu_1, \mu_2$ plane, are given implicitly by

\[ \left(\frac{\mu_1}{3}\right)^3 = \left(\frac{\mu_2}{2}\right)^2. \tag{10.10} \]

See figure 10.2. For points $(\mu_1, \mu_2)$ inside the cusp curve, the cubic has three distinct real roots, whereas outside of it, there is only one real root. The cusp

Figure 10.2. The projection of the cusp surface $-x^3 + \mu_1 x + \mu_2 = 0$ onto the plane of the unfolding parameters (axes $\mu_1$ and $\mu_2$). The cusp curves are given by $(\mu_1/3)^3 = (\mu_2/2)^2$. Paths 1 and 2 lead to the two different types of bursting. The corresponding time series are given in figure 10.3.

Figure 10.3. The time series $x(t)$ of the solutions of (10.9) with the unfolding parameters given by (10.8) with $B_1 = C_2 = 0.5$ and $(A_1, A_2) = (0, 0)$ (left panel) and $B_1 = C_2 = 0.3$ and $(A_1, A_2) = (0, 0.6)$ (right panel). These formulae correspond to paths 1 and 2, respectively, in figure 10.2. In both cases $\varepsilon = 0.01$.

curves correspond to bifurcations along which two equilibria exist (one of which is a saddle–node point). Among the closed paths of the type considered in section 10.1, there are two distinct types that give rise to periodic bursters. These are illustrated in figure 10.2, and the corresponding time series with the unfolding parameters µ1 , µ2 varying as in (10.8) are shown in figure 10.3. In figure 10.3(left) µ1 , µ2 vary slowly around a circle centered at the origin (path 1 in figure 10.2). Along part of this circle, the system is in the regime with only one equilibrium, while

Figure 10.4. Left panel: sketch of the family of circles that are tangent to the cusp curves and of the locus of points at which these circles are centered, as computed in section 10.3.1. Right panel: members of the six components in the space of circular paths which are separated by the families of paths depicted on the left. Both panels are drawn in the $(\mu_1, \mu_2)$ plane.

for parameter values along the remainder of this circle, the system is in the three equilibrium regime. Hence, the time series in the left frame exhibits only one rapid jump (down) each period. By contrast, the time series in figure 10.3(right) exhibits two rapid jumps (one up and one down) each period. This time series was generated instead by choosing a circle in parameter space that crosses each branch of the cusp (transition) variety twice, once in each direction; see path 2 in figure 10.2.

In the remainder of this subsection, we show how the method of section 10.1 also yields computable conditions on the parameters $B_1$, $C_2$, $A_1$ and $A_2$ under which the circular paths are tangent to the cusp curve. For simplicity of calculation throughout, we transform the unfolding parameters to $\nu_1 = -\mu_2/2$ and $\nu_2 = \mu_1/3$. Hence, the cusp curves are given by

\[ \nu_1^2 = \nu_2^3. \tag{10.11} \]
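The distinction between loops that meet the cusp curves twice and loops that meet them four times can be checked directly in the original $(\mu_1, \mu_2)$ coordinates. The sketch below is an illustration under assumptions, not part of the chapter: it uses the discriminant $4\mu_1^3 - 27\mu_2^2$ of the truncated cubic in (10.9), which is positive exactly where $(\mu_1/3)^3 > (\mu_2/2)^2$, and counts sign changes of that discriminant along a sampled circle of the form (10.8). The function names and the sampling resolution are choices made here. The first circle uses the path 1 values quoted in the caption of figure 10.3; the second circle, centred at $(0.6, 0)$ with radius $0.3$, is a hypothetical example picked so that the loop crosses each branch twice, and is not taken from the chapter.

```python
import math

def cusp_discriminant(mu1, mu2):
    """Discriminant of x^3 - mu1 x - mu2; positive inside the cusp curves (10.10)."""
    return 4.0 * mu1 ** 3 - 27.0 * mu2 ** 2

def count_equilibria(mu1, mu2):
    """Number of real equilibria of the truncated system (10.9)."""
    d = cusp_discriminant(mu1, mu2)
    if d > 0.0:
        return 3
    if d < 0.0:
        return 1
    # On the cusp curve two equilibria exist, except at the cusp point itself.
    return 1 if (mu1 == 0.0 and mu2 == 0.0) else 2

def circular_path(theta, A1, A2, B1, C2):
    """Point of the loop (10.8) at slow phase theta = eps * t."""
    return A1 + B1 * math.cos(theta), A2 + C2 * math.sin(theta)

def count_cusp_crossings(A1, A2, B1, C2, samples=3600):
    """Count sign changes of the discriminant around one full loop."""
    inside = []
    for k in range(samples):
        mu1, mu2 = circular_path(2.0 * math.pi * k / samples, A1, A2, B1, C2)
        inside.append(cusp_discriminant(mu1, mu2) > 0.0)
    # inside[k - 1] with k = 0 wraps around, so the loop is treated as closed.
    return sum(inside[k] != inside[k - 1] for k in range(samples))

if __name__ == "__main__":
    print("path 1 crossings:", count_cusp_crossings(0.0, 0.0, 0.5, 0.5))
    print("hypothetical second loop crossings:",
          count_cusp_crossings(0.6, 0.0, 0.3, 0.3))
```

A loop with two crossings enters and leaves the three-equilibrium region once per period, whereas a loop with four crossings passes each branch of the cusp curves in both directions.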

We let $B_1 = C_2 = R$ and rewrite the paths in the more convenient form

\[ (\nu_1, \nu_2) = (R\cos(\tau) + A_1, R\sin(\tau) + A_2). \tag{10.12} \]

Parametrically, the tangency condition is then $2\nu_1 \frac{d\nu_1}{d\tau} - 3\nu_2^2 \frac{d\nu_2}{d\tau} = 0$, subject of course also to the condition that (10.11) holds. Solving (10.12) for the trigonometric functions and recalling that $R > 0$, the tangency condition becomes

\[ 2\nu_1(\nu_2 - A_2) + 3\nu_2^2(\nu_1 - A_1) = 0. \tag{10.13} \]
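Condition (10.13) can be sanity-checked by constructing tangent circles explicitly. The sketch below rests on a parametrization introduced here for illustration, namely $(\nu_1, \nu_2) = (s^3, s^2)$, which satisfies (10.11) for every real $s$; the chapter itself does not use it. For $s \neq 0$ the tangent direction is parallel to $(3s, 2)$, so a unit normal is $(2, -3s)/\sqrt{4 + 9s^2}$, and moving a distance $R$ along either normal direction gives the centre $(A_1, A_2)$ of a circle of radius $R$ that touches the cusp curve at that point. The function names are hypothetical, and tangency is meant locally at the chosen point only.

```python
import math

def cusp_point(s):
    """Point (nu1, nu2) = (s^3, s^2) on the cusp curve nu1^2 = nu2^3 of (10.11)."""
    return s ** 3, s ** 2

def tangent_circle_centres(s, R):
    """Centres (A1, A2) of the two circles of radius R touching the cusp at s != 0."""
    nu1, nu2 = cusp_point(s)
    norm = math.sqrt(4.0 + 9.0 * s * s)
    n1, n2 = 2.0 / norm, -3.0 * s / norm  # unit normal, orthogonal to (3s, 2)
    return [(nu1 + sign * R * n1, nu2 + sign * R * n2) for sign in (1.0, -1.0)]

def tangency_residual(nu1, nu2, A1, A2):
    """Left-hand side of (10.13); it vanishes at a point of tangency."""
    return 2.0 * nu1 * (nu2 - A2) + 3.0 * nu2 ** 2 * (nu1 - A1)

if __name__ == "__main__":
    R = 0.3
    for s in (-0.8, 0.3, 0.5, 1.2):
        nu1, nu2 = cusp_point(s)
        for A1, A2 in tangent_circle_centres(s, R):
            print(s, (A1, A2), tangency_residual(nu1, nu2, A1, A2))
```

Sweeping $s$ at fixed $R$ traces candidate centres $(A_1, A_2)$ of tangent circles; whether a given circle also meets the cusp curves elsewhere has to be checked separately.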

Finally, the locus of points at which the circles must be centered so that they are tangent to the cusp transition variety may now be found parametrically. Fixing R


and solving (10.13) for $\nu_1$ in terms of $\nu_2$ and then substituting this into (10.11), one obtains a quartic equation for $\nu_2$ whose coefficients depend on $A_1$ and $A_2$. Analysis of this quartic then leads to the desired locus of $(A_1, A_2)$ values; see figure 10.4.

Remark 10.8. The behavior studied in this subsection would not traditionally be considered to be bursting, since the active phase involves only a stable equilibrium and not a periodic state.

10.3.2 Degenerate Hopf bifurcation

The next codimension-two singularity also arises when there is a degeneracy in a codimension-one point. In particular, we focus on the codimension-two degenerate Hopf bifurcation to a stable limit cycle; see [2, 6, 63]. The full normal form is

\[ \dot{r} = (\mu_1 + \mu_2 r^2 - r^4) r + O(r^6), \qquad \dot{\theta} = 1 + O(r^2). \tag{10.14} \]

The dynamics of (10.14) is again topologically equivalent to that of the truncated normal form, i.e., (10.14) without the higher-order correction terms. Also, the remaining codimension-two degenerate Hopf bifurcation not considered here has a plus sign in front of the quintic term, and a similar analysis can be carried out for it. Hopf bifurcations occur when $\mu_1 = 0$, and saddle–node bifurcations of periodic orbits occur along the curve $\mu_1 = -\mu_2^2/4$; see figure 10.5.

To generate the associated codimension-two burster, we let the unfolding parameters evolve as in (10.8). Examples of the four paths that lead to the four different periodic bursting types are shown in figure 10.5. Path 1 leads to nondegenerate Hopf bursting studied in section 10.2. Paths 2 and 3 also create bursting types that involve transitions between a stable equilibrium and a stable limit cycle. However, these bursters are distinguished by whether the transition between the states is a smooth one (a Hopf bifurcation) or a jump transition (caused either by a subcritical Hopf bifurcation or by a saddle–node of limit cycles); see figure 10.6 for the corresponding time series and also [36].

The bursting obtained from path 4 is a type III burster; see figure 10.7. Thus, type III bursters are of codimension two by definition 10.5. It should be noted that the states of the fast system and their bifurcations are the same as those in [8]. Due to the form of the equations in [8] the total number of bifurcations that the fast system undergoes during one period of the slow system is greater than in the present case. However, the sequence of states visited by the fast system and their bifurcations are the same in both examples. Therefore according to definition 10.2 we may say that both systems exhibit type III bursting. We also remark that the time series of this type III burster may seem slightly unfamiliar, since the radius of the limit cycle is changing, whereas in

Martin Golubitsky, Krešimir Josić and Tasso J Kaper


[Figure 10.5: diagram of the (µ1, µ2) plane; labels: Hopf, saddle node, regions with 0, 1 and 2 periodic orbits, paths 1–4.]

Figure 10.5. The (µ1, µ2) plane for degenerate Hopf bifurcation. There are three regions, with zero, one, or two periodic orbits, respectively, and in each region there is also an equilibrium at r = 0. Paths 1–4 lead to different bursting types. Path 1 leads to Hopf bursting; see section 10.2. For bursting along paths 2–4, see figures 10.6 and 10.7. Note that the discussion immediately after definition 10.4 implies that changing the direction along paths 1 and 4 does not lead to new bursting types.

[Figure 10.6: two time-series panels (left and right); horizontal axes from 0 to 5000, vertical axes from −1 to 1.]

Figure 10.6. Time series corresponding to paths 2 and 3 in figure 10.5. For path 2 (left) the subcritical Hopf bifurcation causes a jump to a stable limit cycle followed by a Hopf bifurcation; for path 3 (right) there is a Hopf bifurcation to a limit cycle followed by a saddle–node of limit cycles. For both paths B1 = C2 = 0.5 and (A1, A2) = (0, 0). Furthermore, ε = −0.01 for path 2 and ε = 0.01 for path 3, so the two paths are traversed in opposite directions. Both frames show the slow passage effect at the start of the active phases.

the standard case the radius is almost constant. A naive reckoning leads to the same conclusion, since one needs two parameters to arrange for the two essential topological characteristics of type III bursters to occur, namely that the active

An unfolding theory approach to bursting


[Figure 10.7: schematic phase portraits labeled Hopf and saddle node of periodic orbits (top); time series with horizontal axis from 0 to 1500 and vertical axis from −1 to 1 (bottom).]

Figure 10.7. A schematic representation of type III bursting (top) and the time series (bottom) of r cos θ generated by (10.14) and (10.8) with B1 = C2 = 0.15, A1 = −0.05, A2 = 0.6 and ε = 0.01 (corresponding to path 4 in figure 10.5). The stable and unstable periodic orbits collide in a saddle–node of periodic orbits, causing the system to return to the quiescent state. The slow passage effect is manifested at the beginning of each active phase.

phase begins in a subcritical Hopf bifurcation and that the burst terminates in a saddle–node of periodic orbits. Of course, type III bursters can also be found in the unfoldings of higher codimension singularities; see for example [8, Section 4] where it is shown that type III bursting occurs in the unfolding of the degenerate Takens–Bogdanov bifurcation studied in [22].

Remark 10.9. The bursters corresponding to paths 2, 3 and 4 are the first bistable bursters that we have encountered.

Remark 10.10. Here, we have studied singly degenerate Hopf bifurcations, i.e., the codimension-two case. Hopf bifurcations with higher-order degeneracies and/or symmetries also occur in the full normal form, in which case the resulting bursting type is of codimension three or higher; see [63].
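The region count shown in figure 10.5 can be checked directly from the truncated form of (10.14): a periodic orbit of radius r > 0 corresponds to a positive root s = r² of µ1 + µ2 s − s² = 0. The following Python sketch counts these roots. The function name and the sample parameter values used with it are illustrative choices, not taken from the text.

```python
import math


def count_periodic_orbits(mu1, mu2):
    """Count periodic orbits of the truncated form of (10.14).

    A periodic orbit has radius r > 0 with mu1 + mu2*r^2 - r^4 = 0,
    so we count the distinct positive roots s = r^2 of
    mu1 + mu2*s - s^2 = 0.
    """
    disc = mu2 * mu2 + 4.0 * mu1  # vanishes on the curve mu1 = -mu2^2/4
    if disc < 0.0:
        return 0
    root = math.sqrt(disc)
    candidates = {(mu2 + root) / 2.0, (mu2 - root) / 2.0}
    return sum(1 for s in candidates if s > 0.0)
```

For instance, `count_periodic_orbits(-0.01, 0.5)` lies between the Hopf line µ1 = 0 and the saddle–node curve and gives two orbits, while any point with µ1 > 0 gives exactly one. On the saddle–node curve itself the two roots coincide and the function reports the single semistable orbit.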


[Figure 10.8: diagram of the (µ1, µ2) plane with regions I–IV; labels: Hopf curve $\mu_1 = -\mu_2^2$ and homoclinic curve $\mu_1 = -(49/25)\mu_2^2 + O(\mu_2^3)$.]

Figure 10.8. Bifurcation diagram in the unfolding space of a Takens–Bogdanov point.

10.3.3 Takens–Bogdanov bifurcation

The Takens–Bogdanov bifurcation involves a double zero eigenvalue, and the universal unfolding has the form

\[
\begin{aligned}
\dot x &= y \\
\dot y &= \mu_1 + \mu_2 y + a x^2 + b x y + O(3)
\end{aligned}
\tag{10.15}
\]

where µ1 and µ2 are the unfolding parameters, a = 1, b = ±1, and O(3) indicates third-order terms in x and y. Complete studies of this unfolding were first presented in [9, 10, 64]. Moreover, as in the previous sections, here also the bifurcation diagrams of the truncated normal form are not qualitatively changed by the addition of the higher-order terms, so that it suffices to study the truncated system.

Figure 10.8 schematically depicts the dynamics in the different parameter regions of (10.15) when b = 1. The curve $\mu_1 = -\mu_2^2$ separating regions II and III is a locus of Hopf bifurcations, while the curve $\mu_1 = -(49/25)\mu_2^2 + O(\mu_2^3)$ separating regions I and II is a locus of homoclinic bifurcations; see [20, ch 4.1], [30, ch 7.3] or [39, ch 8.4]. The case b = −1 is similar. See also [38].

Although the unfolding of this bifurcation is the most complicated we have considered so far, it does not lead to any new bursting types. The system does not have any stable states when the unfolding parameters are in regions III and IV. Hence, only paths that remain in regions I and II are of interest. More specifically, only those paths crossing the locus of homoclinic bifurcations between regions I and II need to be considered. However, no such path leads to bursting, since the origin is a stable equilibrium on both sides of the homoclinic


[Figure 10.9: sketch of a path in the (µ1, µ2) plane.]

Figure 10.9. Sketch of the path (10.17) for the system (10.16) with a = 0.5, b = −1, c1 = 0, and time reversed. The corresponding time series is shown in figure 10.10.

bifurcation. Finally, if the flow of the fast system is reversed in time (i.e., the arrows in figure 10.8 point in the opposite directions), then there will be a stable limit cycle in region II and a stable point in region III and no stable states in regions I and IV. Therefore, this case only leads to the codimension-one bursting already considered in section 10.2, since regions II and III are separated by a locus of nondegenerate Hopf bifurcation points.

10.3.4 Hopf–steady-state bifurcation

In order to observe a Hopf–steady-state bifurcation, the fast system must be of dimension three or higher. The normal form of this bifurcation is

\[
\begin{aligned}
\dot r &= \mu_1 r + a r z + r z^2 + O(|r, z|^4) \\
\dot z &= \mu_2 + b r^2 - z^2 + O(|r, z|^4) \\
\dot\theta &= \omega + c_1 z + O(|r, z|^2)
\end{aligned}
\tag{10.16}
\]

where either b = 1 or b = −1; see [20, ch 4.6] or [39, ch 8.5]. We focus here exclusively on the truncated normal form, which is (10.16) without the higher-order terms and with c1 = 0. This truncated normal form is exact when there are certain symmetries in the vector field [20, 39]. However, for general systems, the higher-order terms do qualitatively alter the bifurcation diagram of the truncated system, leading to various global phenomena and chaotic states. Therefore, in this case (and in that of the Hopf–Hopf bifurcation considered in the next section) the truncated and full normal forms are not topologically equivalent, in contrast to the bifurcations analysed in sections 10.3.1–10.3.3. These two singularities were studied in [24], and the proof of uniqueness of the limit cycle was first given in [71, 72]. The main new bursting observed here involves a stable equilibrium bifurcating into a stable limit cycle and then into a stable two-torus, along with the

[Figure 10.10: two time-series panels; left panel with horizontal axis from 0 to 10 × 10^4, right panel an enlargement with horizontal axis from 5.4 × 10^4 to 7.2 × 10^4.]

Figure 10.10. Left panel: Time series of the variable x = r cos θ generated by (10.16) with the path (10.17) shown in figure 10.9 and with ε = 0.0001. The stable equilibrium at r = 0 undergoes a Hopf bifurcation to a stable limit cycle (with r(t) slowly growing), which in turn bifurcates into a stable two-torus (with r(t) oscillating rapidly) via a Neimark–Sacker bifurcation. The delayed passage effect is clearly visible in both bifurcations. Right panel: Enlargement of the transitions from periodic to quasiperiodic to periodic solutions.

attendant quasiperiodic spiking observed during the active phase. This occurs, for example, in the time-reversed (10.16) with a = 0.5, b = −1, c1 = 0, ω = 0.005 and the path

\[
(\mu_1, \mu_2) = 0.03\,\bigl(\cos g(\varepsilon t),\ \sin g(\varepsilon t)\bigr)
\quad\text{where}\quad
g(\varepsilon t) = 0.865 + 0.815 \cos(\varepsilon t).
\tag{10.17}
\]

See figure 10.9 and figure 10.10. Bursting to a two-torus occurs more generally when b = −1, a > 0 and time is reversed in (10.16), i.e., when one is in what is traditionally labeled as case III (with time reversed) in the analysis of the Hopf–steady-state bifurcation. In this case, there are both a Hopf bifurcation curve and a Neimark–Sacker bifurcation curve (bifurcation of a limit cycle into a two-torus); see [20, 29, 40]. The other cases, corresponding to b = 1 and a > 0, b = 1 and a < 0, and b = −1 and a < 0, respectively, are analysed in the appendix. However, a systematic search reveals that there are no other new types of bursting.
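To make the path (10.17) concrete, the following Python sketch evaluates it and the truncated vector field of (10.16) in the (r, z) variables. It is only an illustration under stated assumptions: the function names, the use of `tau` for the slow time εt, and the `sign` argument used to reverse time are choices made here, not notation from the text.

```python
import math


def g(tau):
    """Angle along the path (10.17); tau stands for the slow time eps*t."""
    return 0.865 + 0.815 * math.cos(tau)


def path(tau):
    """(mu1, mu2) on the circle of radius 0.03, as in (10.17)."""
    angle = g(tau)
    return 0.03 * math.cos(angle), 0.03 * math.sin(angle)


def truncated_field(r, z, mu1, mu2, a=0.5, b=-1.0, sign=-1.0):
    """(dr/dt, dz/dt) for (10.16) without higher-order terms.

    sign = -1 gives the time-reversed system used for figure 10.10.
    """
    dr = mu1 * r + a * r * z + r * z * z
    dz = mu2 + b * r * r - z * z
    return sign * dr, sign * dz
```

Since g ranges over [0.05, 1.68] and 1.68 exceeds π/2, the path is an arc of the circle of radius 0.03 that stays in the upper half plane µ2 > 0 and crosses µ1 = 0 twice per slow period. For µ2 > 0 the truncated field has equilibria on the axis r = 0 at z = ±√µ2, which can be confirmed by evaluating `truncated_field(0.0, math.sqrt(mu2), mu1, mu2)`.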

10.3.5 Nonresonant Hopf–Hopf bifurcation

Hopf–Hopf bifurcations arise when there is a singularity that has two pairs of nonresonant, purely imaginary eigenvalues. The fast system must be of dimension four or higher. The normal form [20] is

\[
\begin{aligned}
\dot r_1 &= r_1(\varepsilon_1 + p_1 r_1^2 + p_2 r_2^2 + q_1 r_1^4 + q_2 r_1^2 r_2^2 + q_3 r_2^4) + O(|r|^6) \\
\dot r_2 &= r_2(\varepsilon_2 + p_3 r_1^2 + p_4 r_2^2 + q_4 r_1^4 + q_5 r_1^2 r_2^2 + q_6 r_2^4) + O(|r|^6) \\
\dot\theta_1 &= \omega_1 + O(|r|^2) \\
\dot\theta_2 &= \omega_2 + O(|r|^2).
\end{aligned}
\tag{10.18}
\]

As in section 10.3.4, we consider the truncated normal form here, and the same comment about the impact of the higher-order terms also applies here. Since $\dot\theta_1$ and $\dot\theta_2$ are constant to second order, it is sufficient to consider the planar system (r1, r2). A change of coordinates, together with a nondegeneracy condition, takes the first two equations of system (10.18) up to sixth order to

\[
\begin{aligned}
\sigma \dot x &= x\,(\mu_1 + \eta x - y + q_1 x^2 + q_2 x y + q_3 y^2) \\
\sigma \dot y &= y\,\Bigl(\mu_2 - \eta\,\frac{\alpha+1}{\beta}\,x + \frac{\alpha}{\beta+1}\,y + q_4 x^2 + q_5 x y + q_6 y^2\Bigr)
\end{aligned}
\tag{10.19}
\]

where σ, η = ±1, x ≥ 0, y ≥ 0, and µ1 and µ2 are the unfolding parameters. The sign of σ allows us to change the direction of time in the normal form. The study of this system can be reduced to 16 cases, some of which are equivalent. We refer the reader to [20, ch 4.7] for details. We have carried out a full analysis of these 16 cases, and many of the bursters occurring in the unfoldings are of types that have already been described. Therefore, rather than presenting a full description of all the unfoldings, we will concentrate on the unfoldings that lead to new bursting types. We also emphasize that the results below are stated in terms of the planar system (10.19), unless otherwise stated.

Case 1. η = −1, α > 0, β > α, and σ = −1 (case a− with time reversed in [20]). Here, we find a new periodic bursting type that involves a stable three-torus. In particular, the following sequence of bifurcations can be observed for the path

\[
\mu_1(\varepsilon t) = 0.3
\quad\text{and}\quad
\mu_2(\varepsilon t) = -0.12 + 0.13 \sin(\varepsilon t)
\tag{10.20}
\]

(see figure 10.11) with ε = 0.003, α = 1 and β = 2, as well as for general paths like it in the (µ1, µ2) unfolding space. The corresponding time series is shown in figure 10.12. A stable state appears from the origin in a saddle–node bifurcation. This new stable state on the y-axis loses its stability, giving birth to another stable state in the region x > 0, y > 0, as the path crosses the curve M. This last stable state loses its stability via a supercritical Hopf bifurcation (on the curve marked H in figure 10.11). As the second half of the path is traversed, these bifurcations occur in reverse order. Note that, in the full system (10.18), this


[Figure 10.11: part of the (µ1, µ2) plane showing the curves M and H.]

Figure 10.11. Part of the parameter space for case 1 of (10.19).

[Figure 10.12: two time-series panels with horizontal axis t from 0 to 7000; the top panel ranges from 0 to 1, and the bottom panel shows y from 0 to 0.7.]

Figure 10.12. Time series of x = r1 cos θ1 (top) and of y = r2 cos θ2 (bottom) for case 1 of the Hopf–Hopf bifurcation, generated by (10.19) with q1 = 1 and q2 = · · · = q6 = 0. The slow passage effect causes the delayed (and abrupt) onset of the oscillation.

sequence of bifurcations becomes: a stable equilibrium at the origin undergoes a Hopf bifurcation to a stable limit cycle, which undergoes a Neimark–Sacker bifurcation to a stable two-torus, which finally bifurcates to a stable three-torus (with the appropriate slow passage effect at each stage), and then back again. Since there is only one stable state for each of the parameter regions, this bursting type is not bistable†. For a general treatment of tori in dynamical systems and quasiperiodic dynamics on them, we refer the reader to [11, 12].

† Although it is of codimension two, this bursting type is not listed in the classification of [37].


Case 2. η = −1, β > 0, β < −α − 1, and σ = 1 (cases d− and h− in [20]). For a range of parameter values, there is a bistable regime with stable states on the x- and y-axes. At each of the borders of this regime a saddle from the region x > 0, y > 0 collides with one of these states, leading in both cases to an exchange of stability. Therefore, a path traversing this region in parameter space will lead to the fast system jumping between the two stable states (limit cycles). In other words, in the full system, one pair of variables will be active while the other pair is quiescent, and at each bifurcation the roles of the two pairs are reversed.

Case 3. η = +1, α > 0, β > α, and σ = −1 (case a+ with time reversed in [20]). This case is similar to the preceding one, except that there is no bistability. There is a region of parameter space in which the origin is stable, and the origin gives birth to stable states on the x- and y-axes, respectively, at the boundaries of this region. A path traversing this region and part of the adjacent regions will lead the fast system to approach these three stable states in turn. In the full system this will lead to bursting similar to the one discussed in the previous case. The main difference is that, while there are abrupt jumps between the two oscillating states in the previous case, in this case the amplitudes decay to the point where both pairs of variables are quiescent before the cycle starts again.

Case 4. η = +1, β > 0, β < −α − 1, and σ = 1 (case d+ in [20]). In this case, a sequence of bifurcations starting from a stable state at the origin leads to the appearance of a stable limit cycle, as in the first case. Therefore, an appropriate path in the parameter space will lead to the same type of bursting. However, at the parameter values at which the limit cycle exists, the origin is also stable, leading to the possibility of bursting from the quiescent state to a three-torus in the full system. Moreover, since the size of the limit cycle becomes greater than O(ε1, ε2) before it loses stability, the normal form cannot give a full picture of this case.

Remark 10.11. These bursting types also occur at other parameter values. They represent all the new bursting types in the case of two purely imaginary eigenvalues without resonances. See also [25]. We refer the reader to [35], for example, for the normal form in the case of the 1:1 resonant Hopf–Hopf bifurcation.
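As an illustration of case 1, the planar system (10.19) can be integrated along the path (10.20) with a few lines of Python. This is a minimal sketch and not the authors' computation. It assumes that the coefficients of x and y in the second equation of (10.19) are −η(α+1)/β and α/(β+1), that q1 = 1 and q2 = · · · = q6 = 0 as in the caption of figure 10.12, and it uses a hand-written fourth-order Runge–Kutta step. The initial condition (0.01, 0.01), the step size and the function names are arbitrary choices made here.

```python
import math


def rhs(t, x, y, eps=0.003, eta=-1.0, alpha=1.0, beta=2.0, sigma=-1.0, q1=1.0):
    """Right-hand side of (10.19) along the path (10.20), case 1.

    Assumed reading of (10.19):
      sigma*x' = x*(mu1 + eta*x - y + q1*x^2)
      sigma*y' = y*(mu2 - eta*(alpha+1)/beta*x + alpha/(beta+1)*y)
    """
    mu1 = 0.3
    mu2 = -0.12 + 0.13 * math.sin(eps * t)
    fx = x * (mu1 + eta * x - y + q1 * x * x)
    fy = y * (mu2 - eta * (alpha + 1.0) / beta * x + alpha / (beta + 1.0) * y)
    return fx / sigma, fy / sigma  # sigma = +1 or -1


def simulate(t_end=7000.0, dt=0.1, x0=0.01, y0=0.01):
    """Integrate with classical RK4; returns lists of t, x and y values."""
    steps = int(round(t_end / dt))
    x, y = x0, y0
    ts, xs, ys = [0.0], [x0], [y0]
    for i in range(steps):
        t = i * dt
        k1x, k1y = rhs(t, x, y)
        k2x, k2y = rhs(t + 0.5 * dt, x + 0.5 * dt * k1x, y + 0.5 * dt * k1y)
        k3x, k3y = rhs(t + 0.5 * dt, x + 0.5 * dt * k2x, y + 0.5 * dt * k2y)
        k4x, k4y = rhs(t + dt, x + dt * k3x, y + dt * k3y)
        x += dt * (k1x + 2.0 * k2x + 2.0 * k3x + k4x) / 6.0
        y += dt * (k1y + 2.0 * k2y + 2.0 * k3y + k4y) / 6.0
        ts.append(t + dt)
        xs.append(x)
        ys.append(y)
    return ts, xs, ys
```

Calling `ts, xs, ys = simulate()` covers the time window of figure 10.12. Because both coordinate axes are invariant for (10.19), a trajectory started in the open quadrant x > 0, y > 0 stays there, so the amplitudes x and y can be plotted directly against t and compared with the figure.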

10.4 Codimension-three bursters

Whereas in the previous sections we were able to give an exhaustive treatment of all of the known generic codimension-one and -two bifurcations, our goal here is limited to showing that the traditional bursters of types Ia, II and IV can be found explicitly using the framework of section 10.1 in the unfolding of a particular codimension-three singularity. This was already noted in [8]. Our goal must necessarily be limited since the unfoldings of only very few codimension-three singularities are known. Those unfoldings that are known


are for codimension-three singularities of generic three-parameter planar vector fields, namely the swallow-tail bifurcation, the Takens–Hopf bifurcation, and the degenerate Takens–Bogdanov bifurcations with either a double or a triple equilibrium; see the bibliographical notes in [39, ch 8]. In contrast, the study of codimension-three singularities in higher-dimensional fast systems is far from complete.

10.4.1 Type Ia bursting

Type Ia bursters, also known as square wave bursters, are characterized by monotonically decreasing spike frequency, i.e., increasing inter-spike intervals. The fast system is typically two dimensional and bistable, and the variation of a single slow variable causes the fast system to visit both attractors in a time-periodic manner; see figure 10.15. The burst (or active phase) begins at a saddle–node of equilibria in the fast system, where the trajectory jumps from a branch of stable equilibria to a branch of stable periodic orbits. The frequency of these periodic orbits decreases during the active phase until the family of periodic orbits disappears in a saddle-loop connection (homoclinic bifurcation). This saddle-loop connection, in turn, marks the end of the active phase, since near it the trajectory must jump back to the original branch of stable equilibria. Geometric singular perturbation theory treatments of type Ia bursters are given in [56, 66].

Thus, the bursting behavior is due to two dynamic bifurcations in the system: a saddle–node and the breaking of a homoclinic connection. From the classifications of codimension-one and -two bursters given in the previous sections, we see that these two bifurcations were not encountered in the unfoldings studied there. Hence, we know that the minimum codimension of a type Ia burster must be at least three. The codimension turns out to be exactly three, as we now show.

In particular, we study a codimension-three degenerate Takens–Bogdanov point and use its truncated normal form as the fast system of our burster to show explicitly how type Ia bursting occurs in the framework of section 10.1:

\[
\begin{aligned}
\dot x_1 &= x_2 \\
\dot x_2 &= -x_1^3 + \mu_2 x_1 + \mu_1 + x_2(\nu + 3x_1 + x_1^2).
\end{aligned}
\tag{10.21}
\]

We focus here on those features of (10.21) that are of interest to us and, hence, do not consider the entire universal unfolding, which is quite complicated (see [22], though note that we have taken a slightly different form). Our principal goal is to locate a point at which both a homoclinic (saddle-loop) bifurcation and a saddle–node bifurcation occur. We label such a point HSN. Moreover, the procedure we use here to find the HSN point can be implemented numerically, so that this example also illustrates the computational advantage of the singularity-based approach.

[Figure 10.13: bifurcation diagram in the (µ1, ν) plane; labels: saddle node lines at µ1 = −2/27 and µ1 = 2/27, TB, Hopf, homoclinic, HSN.]

Figure 10.13. Part of the bifurcation diagram of (10.21) on P. The codimension-two point at which both a homoclinic and a saddle–node bifurcation occur is labeled HSN.

The fixed points of (10.21) are given by

\[
x_1^3 = \mu_2 x_1 + \mu_1, \qquad x_2 = 0
\tag{10.22}
\]

and the Jacobian of the vector field at (x1, 0) equals

\[
DF(x_1, 0) =
\begin{pmatrix}
0 & 1 \\
-3x_1^2 + \mu_2 & \nu + 3x_1 + x_1^2
\end{pmatrix}.
\tag{10.23}
\]

The condition that a fixed point is also a saddle–node bifurcation point is then that the bottom left entry of (10.23) vanishes, so that for µ2 > 0, we find $x_1^2 = \mu_2/3$. Now, to further simplify our calculations, we choose µ2 = 1/3 and illustrate the results below in the (µ1, ν) parameter plane $P = \{(\mu_1, \mu_2, \nu) \in \mathbb{R}^3 : \mu_2 = 1/3\}$. Computations for other values of µ2 proceed in the same fashion. This simplifying choice implies that the fixed points (x1 = ±1/3, x2 = 0) are saddle–node points, which due to condition (10.22) exist when µ1 = ∓2/27. Hence, on P, the vertical lines µ1 = ∓2/27 are labeled as saddle–node lines. Furthermore, there is a single point on each of these saddle–node lines at which the fixed point degenerates into a Takens–Bogdanov point. These occur precisely where the bottom right entry of the Jacobian also vanishes, i.e., at ν = −10/9 and at ν = 8/9, respectively, for µ1 = ∓2/27 and x1 = ±1/3.

In the remainder of this analysis, we focus only on the first Takens–Bogdanov point at (µ1, ν) = (−2/27, −10/9) on P and on the bifurcation curves emanating from it. A similar analysis may be performed for the second Takens–Bogdanov point. There exists a Hopf bifurcation curve emanating from the first Takens–Bogdanov point, and on P this curve reaches the opposite saddle–node


[Figure 10.14: magnified (µ1, ν) plane near HSN; labels: homoclinic points, saddle node, regions I, II and III, path γ1.]

Figure 10.14. The path γ1 in the universal unfolding of (10.21) leading to type Ia bursting.

line (µ1 = 2/27) at the point with ν = −22/9. This may be seen by observing that when µ1 = 2/27, the second fixed point (in addition to the one with x1 = −1/3) of the system (10.21) is at (x1 = 2/3, x2 = 0) and that the condition for this second point to be a Hopf bifurcation point is that the bottom right entry in the Jacobian vanishes there, so that we find ν = −22/9. Finally, it can be checked numerically that the locus of homoclinic bifurcations emanating from the first Takens–Bogdanov point intersects the opposite vertical line µ1 = 2/27 on P at approximately ν = −2.083. Hence, the desired HSN point is located at (2/27, −2.083); see figure 10.13.

We now choose a path that intersects both a surface of homoclinic bifurcations and a surface of saddle–node bifurcations. In particular, the path we choose in P is as shown in figure 10.14, which is a magnification of figure 10.13 around the point HSN:

\[
(\mu_1(t), \nu(t)) = \bigl(0.07 + 0.015 \sin(\varepsilon t),\ -2.1 - 0.15 \sin(\varepsilon t)\bigr).
\tag{10.24}
\]
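The special points quoted above for the plane P follow from (10.22) and (10.23) by elementary arithmetic, which can be checked exactly with rational numbers. The Python sketch below does this; the function and variable names are illustrative and not part of the original analysis.

```python
from fractions import Fraction

MU2 = Fraction(1, 3)  # the plane P is the slice mu2 = 1/3


def mu1_of_fixed_point(x1, mu2=MU2):
    """Value of mu1 for which (x1, 0) is a fixed point, from (10.22)."""
    return x1**3 - mu2 * x1


def jacobian_bottom_row(x1, nu, mu2=MU2):
    """Bottom row of DF(x1, 0) in (10.23); the top row is (0, 1)."""
    return (-3 * x1**2 + mu2, nu + 3 * x1 + x1**2)


def nu_with_zero_trace(x1):
    """Value of nu for which the bottom right entry of (10.23) vanishes."""
    return -(3 * x1 + x1**2)


# Saddle-node points: the bottom left entry vanishes when x1^2 = mu2/3.
saddle_nodes = {x1: mu1_of_fixed_point(x1)
                for x1 in (Fraction(1, 3), Fraction(-1, 3))}

# Takens-Bogdanov points: both entries of the bottom row vanish.
takens_bogdanov = {x1: nu_with_zero_trace(x1) for x1 in saddle_nodes}

# Hopf point on the line mu1 = 2/27: the second fixed point is x1 = 2/3.
x_hopf = Fraction(2, 3)
hopf = (mu1_of_fixed_point(x_hopf), nu_with_zero_trace(x_hopf))
```

At the Hopf point the bottom row of the Jacobian is (−1, 0), so the determinant equals 1 and the trace equals 0, which gives a pair of purely imaginary eigenvalues as required. The homoclinic value ν ≈ −2.083 is not reproduced here, since it requires a numerical continuation rather than algebra.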

The sequence of bifurcations corresponds exactly to the sequence of bifurcations in type Ia bursting.

Remark 10.12. We refer the reader to [58] for a detailed analysis of the saddle–node separatrix-loop bifurcation.

Remark 10.13. Type Ib bursters are similar to type Ia bursters in that the two-dimensional fast system is also bistable and the spike frequency decreases during the active phase until a homoclinic bifurcation is reached. However, type Ib bursting differs in that the stable periodic orbits encircle three equilibria of the

[Figure 10.15: schematic phase portraits labeled saddle node and homoclinic (top); time series with horizontal axis from 0 to 4000 and vertical axis from −0.6 to 1.2 (bottom).]

Figure 10.15. A schematic representation of type Ia bursting (top) and the time series generated by (10.21) with (10.24) and ε = 0.005 and µ2 = 1/3 (bottom). This corresponds to the path γ1 through the unfolding space shown in figure 10.14. In the top sequence of phase portraits, the lower fixed point disappears in a saddle–node bifurcation, and the system jumps to the periodic orbit. After the periodic orbit disappears via a homoclinic connection, the system returns to the fixed point, and the cycle repeats. The filled circles represent stable fixed points.

fast system, rather than just one, which results in the spikes being more widely spaced and in there being an undershoot after each spike independently of the chosen projection. Moreover, there need not be a spike plateau. Three is also the minimum codimension of the singularity in the fast system that is needed to support type Ib bursting, and hence type Ib periodic bursters are also of codimension three according to definition 10.5.

10.4.2 Type II bursting

Type II bursting is characterized by a parabolic plot of time-versus-spike frequency in which the frequency is small at the beginning and end of the active phase and larger in the middle. Hence, this phenomenon is also known as parabolic bursting. The two-dimensional fast system possesses an invariant circle; and, as a result of time-periodic changes in a two-dimensional slow variable, fixed points are created and destroyed in saddle–node bifurcations on this circle. In

[Figure 10.16: schematic phase portraits labeled saddle node on periodic orbit (top); time series with horizontal axis from 0 to 8000 and vertical axis from −0.6 to 1.4 (bottom).]

Figure 10.16. A schematic representation of type II bursting (top) and the time series generated by (10.21) and (10.25) with ε = 0.003 (bottom). The oscillation stops due to a saddle–node bifurcation on the periodic orbit. The corresponding bifurcation diagram is given in section 10.4.2.

particular, during the quiescent phase, there are two equilibria on the circle, and the system is near the stable equilibrium. The disappearance of these equilibria in a saddle–node triggers the onset of the active phase, since then orbits on the invariant circle are free to travel around the circle periodically in time, producing spikes. Moreover, the frequency increases as the system moves further from the bifurcation point until it reaches a maximum, which corresponds to the vertex of the parabola. During the remainder of the active phase, the frequency decreases, and the active phase ends when there is again a saddle–node bifurcation in which the two equilibria re-emerge on the invariant circle. Examples are given in [5, 23, 55, 60, 68]. The saddle–node bifurcation on an invariant circle (SNIC) is a global bifurcation. In the classification of codimension-one and -two bursters given in the previous sections, we did not encounter it in the unfoldings of any of the local singularities. In fact, to observe a SNIC in an unfolding of a local bifurcation point, one needs to look at a singularity of codimension three or higher. We will establish, in this section, that type II bursting is of codimension three, according to definition 10.5. We do this by studying the same explicit example (10.21) of a codimension-three singularity. In the unfolding space of this singularity one can also construct paths leading to type II bursting. Such a degenerate point has

[Figure 10.17: schematic phase portraits; labels: saddle node of periodics, saddle node, homoclinic.]

Figure 10.17. A path γ in the unfolding of (10.21) leading to type IV bursting.

previously also been used to generate type II bursting in [8]. Examination of figure 10.14 shows that a SNIC bifurcation occurs on the line separating regions I and III above the point HSN. Therefore, a path γ crossing the surface of saddle–node bifurcations from region I into region III will lead to type II bursting. The paths

\[
(\mu_2(t), \nu(t)) = \bigl(0.333 + 0.02 \sin(\varepsilon t),\ -2.05\bigr)
\]
\[
(\mu_2(t), \nu(t)) = \bigl(0.333 + 0.02 \sin(\varepsilon t),\ -2.05 + 0.002 \cos(\varepsilon t)\bigr)
\]

with ε = 0.003 lead to type II bursting; see figure 10.16.
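The creation and destruction of equilibria along such a path can be followed with the discriminant of the cubic in (10.22): the equation x1³ − µ2 x1 − µ1 = 0 has three real roots when 4µ2³ > 27µ1² and one real root when 4µ2³ < 27µ1². The Python sketch below counts equilibria along the first component of the paths quoted above. The text does not state the value of µ1 along these paths, so the value 2/27 used here (the saddle–node line of figure 10.13) is an assumption made purely for illustration, as are the function names.

```python
import math
from fractions import Fraction


def count_equilibria(mu1, mu2):
    """Number of distinct equilibria of (10.21), i.e. real roots of
    x1^3 - mu2*x1 - mu1 = 0, read off from the cubic discriminant."""
    disc = 4 * mu2**3 - 27 * mu1**2
    if disc > 0:
        return 3
    if disc < 0:
        return 1
    # disc == 0: a triple root at the origin, or a double and a simple root
    return 1 if mu2 == 0 else 2


def mu2_on_path(tau):
    """First component of the type II paths; tau stands for eps*t."""
    return 0.333 + 0.02 * math.sin(tau)


# Hypothetical fixed value of mu1 (not given in the text).
MU1_ASSUMED = 2.0 / 27.0

# Equilibrium count at eight equally spaced points of one slow period.
counts = [count_equilibria(MU1_ASSUMED, mu2_on_path(2.0 * math.pi * k / 8))
          for k in range(8)]
```

Under this assumption the count alternates between one and three during each slow period, which is the pattern of equilibria appearing and disappearing in saddle–node bifurcations described above. With exact arithmetic, `count_equilibria(Fraction(2, 27), Fraction(1, 3))` lands on the saddle–node surface itself, in agreement with the saddle–node lines µ1 = ∓2/27 on P.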

10.4.3 Type IV bursting

Although type IV bursting is similar to type III bursting (encountered in section 10.3.2) since the bursting phase is terminated by a saddle–node of periodic orbits, it is of higher complexity than the latter. The reason for this is that, at the end of the quiescent phase, the stable fixed point loses stability through a saddle–node bifurcation rather than a Hopf bifurcation as in the case of type III bursting; see figure 10.17. This situation does not occur in any of the unfoldings of codimension-one or -two singularities, but it does occur in the unfolding of a degenerate Takens–Bogdanov point of focus type, as is shown in [8]. Therefore, this bursting type has codimension three. One can carry out an analysis of the truncated normal form as in section 10.4.1 to find a path in parameter space that leads to this type of bursting. The details of the calculations are very similar to those given there.


Appendix. Case-by-case analysis of Hopf–steady-state bursters of section 10.3.4

As stated in section 10.3.4, we present the case-by-case analysis of the unfolding (10.16) of the Hopf–steady-state singularity in order to verify that there are indeed no new bursters found here. We follow the enumeration of the cases used in [20].

Case I. Under either forward or backward time, there is only one region in the (µ1, µ2) unfolding space in which there is a stable equilibrium point. Furthermore, this equilibrium disappears on the bifurcation curves that bound these regions, and there are no other stable invariant sets. In fact, most of the other invariant sets are saddle periodic orbits in the 3D system. Hence, no bursting is observed with paths of the form (10.8).

Case II. All paths must avoid regions IIIa and IIIb in (10.16). Paths contained in the remaining regions exhibit either codimension-one bursting of the type created by nondegenerate Hopf bifurcations (see section 10.2) or the transition from a stable limit cycle to a stable two-torus (by crossing from region Ia or Ib into region II), which is precisely the new transition already described in section 10.3.4, or both.

Case III. In this case, we have already seen in section 10.3.4 that there is a new burster, bifurcation to a two-torus, that exists under time reversal in (10.16). To study the remaining possible bursters in this case, we begin by observing that all interesting paths must lie in the upper half of the parameter plane. However, in both forward and backward time, any such path other than that giving rise to two-torus bursting can only give rise to the bursting already found in the nondegenerate Hopf bifurcation, e.g., crossing between regions II and III or from II into IV via III.

Case IV. Again, here, all interesting paths must lie in the upper half of the (µ1, µ2) plane. There is one stable equilibrium there, but it can only disappear via a saddle–node bifurcation of equilibria or lose stability via a subcritical Hopf bifurcation. Hence, there is no bursting.

Acknowledgements

We thank Ian Stewart for many helpful discussions about the local structure of bursters. The research of MG was supported in part by NSF Grant DMS-0071735 and the Center for BioDynamics, Boston University. The research of TK was supported in part by NSF Grant DMS-0072596.


References

[1] Andronov A and Leontovich E 1939 Some cases of the dependence of limit cycles upon parameters Uch. Zap. Gork. Univ. 6 3–24 (in Russian)
[2] Arnold V I 1972 Lectures on bifurcations in versal families Russ. Math. Surv. 27 54–123
[3] Av-Ron E, Parnas H and Segel L A 1993 A basic biophysical model for bursting neurons Biol. Cybern. 69 87–95
[4] Baer S M, Erneux T and Rinzel J 1989 The slow passage through a Hopf bifurcation: delay, memory effects, and resonance SIAM J. Appl. Math. 49 55–71
[5] Baer S M, Rinzel J and Carrillo H 1995 Analysis of an autonomous phase model for neuronal parabolic bursting J. Math. Biol. 33 309–33
[6] Bautin N 1949 Behavior of Dynamical Systems near the Boundaries of Stability Regions (Leningrad and Moscow: Ogiz Gostexizdat) (in Russian)
[7] Benoit E 1991 Dynamic Bifurcations: Luminy 1990 (Lecture Notes in Mathematics 1493) (Berlin: Springer)
[8] Bertram R, Butte M J, Kiemel T and Sherman A 1995 Topological and phenomenological classification of bursting oscillations Bull. Math. Biol. 57 413–39
[9] Bogdanov R 1975 Versal deformations of a singular point in the plane in the case of zero eigenvalues Funct. Anal. Appl. 9 144–5
[10] Bogdanov R 1976 Versal deformations of a singular point in the plane in the case of zero eigenvalues Proc. Petrovski Sem (Moscow University) 2 23–35 (in Russian) (Engl. transl. Sel. Math. Sov. 1(4) 373–88)
[11] Broer H W, Huitema G B, Takens F and Braaksma B L J 1990 Unfoldings and bifurcations of quasi-periodic tori Mem. Am. Math. Soc. 83 421
[12] Broer H W, Huitema G B and Sevryuk M B Quasi-Periodic Tori in Families of Dynamical Systems: Order Amidst Chaos (Lecture Notes in Mathematics 1645) (Berlin: Springer)
[13] Canavier C C, Clark J W and Byrne J H 1991 Simulation of the bursting activity of neuron R15 Aplysia: role of ionic currents, calcium balance, and modulatory transmitters J. Neurophys. 66 2107–24
[14] Candelpergher B, Diener F and Diener M 1990 Retard à la bifurcation: du local au global Bifurcations of Planar Vector Fields: Luminy 1989 (Lecture Notes in Mathematics 1455) (Berlin: Springer) pp 1–19
[15] Chay T R 1986 On the effect of intracellular calcium-sensitive K+ channel in the bursting pancreatic β-cell Biophys. J. 50 765–77
[16] Chay T R and Cook D I 1988 Endogenous bursting patterns in excitable cells Math. Biosci. 90 139–53
[17] Chay T R and Keizer J 1983 Minimal model for membrane oscillations in the pancreatic β-cell Biophys. J. 42 181–90
[18] Chay T R and Keizer J 1985 Theory of the effect of extracellular potassium on oscillations in the pancreatic β-cell Biophys. J. 48 815–27
[19] Chay T R and Rinzel J 1985 Bursting, beating, and chaos in an excitable membrane model Biophys. J. 47 357–66
[20] Chow S-N, Li C and Wang D 1994 Normal Forms and Bifurcations of Planar Vector Fields (Cambridge: Cambridge University Press)
[21] Diener F and Diener M 1991 Maximal delay Dynamic Bifurcations: Luminy 1990 (Lecture Notes in Mathematics 1493) ed E Benoit (Berlin: Springer) pp 71–86
[22] Dumortier F, Roussarie R and Sotomayor J 1991 Generic three-parameter families of planar vector fields, unfoldings of saddle, focus, and elliptic singularities with nilpotent linear parts Bifurcations of Planar Vector Fields: Nilpotent Singularities and Abelian Integrals (Lecture Notes in Mathematics 1480) ed F Dumortier, R Roussarie, J Sotomayor and H Zoladek (Berlin: Springer) pp 1–164
[23] Ermentrout G B and Kopell N 1986 Parabolic bursting in an excitable system coupled with a slow oscillation SIAM J. Appl. Math. 46 233–53
[24] Gavrilov N K 1978 On some bifurcations of an equilibrium with one zero and a pair of pure imaginary roots Methods of Qualitative Theory of Differential Equations (Gorki: Gorki University Press) pp 33–40 (in Russian)
[25] Gavrilov N K 1980 On bifurcations of an equilibrium with two pairs of pure imaginary roots Methods of Qualitative Theory of Differential Equations (Gorki: Gorki University Press) pp 17–30 (in Russian)
[26] Golubitsky M and Schaeffer D 1985 Singularities and Groups in Bifurcation Theory I (Applied Mathematical Sciences Series 51) (New York: Springer)
[27] Golubitsky M, Stewart I and Schaeffer D 1988 Singularities and Groups in Bifurcation Theory II (Applied Mathematical Sciences Series 69) (New York: Springer)
[28] Golubitsky M and Stewart I 2001 The Symmetry Perspective submitted
[29] Guckenheimer J 1981 On a codimension two bifurcation Dynamical Systems and Turbulence: Warwick 1980 (Lecture Notes in Mathematics 898) ed D A Rand and L-S Young (Berlin: Springer) pp 99–142
[30] Guckenheimer J and Holmes P 1990 Nonlinear Oscillations, Dynamical Systems and Bifurcations of Vector Fields revised and corrected reprint of the 1983 original (Berlin: Springer)
[31] Guckenheimer J, Gueron S and Harris-Warrick R M 1993 Mapping the dynamics of a bursting neuron Phil. Trans. R. Soc. 341 345–59
[32] Hopf E 1942 Abzweigung einer periodischen Lösung von einer stationären Lösung eines Differentialsystems Berlin Math.-Phys. Kl Sachs: Acad. Wiss. Leipzig 94 1–22
[33] Hoppensteadt F C and Izhikevich E M 1997 Weakly Connected Neural Networks (Berlin: Springer)
[34] Hayes M G 1999 Geometric analysis of delayed bifurcations PhD Thesis Boston University
[35] Iooss G and Pérouème M C 1993 Perturbed homoclinic solutions in reversible 1:1 resonance vector fields J. Diff. Eqns 102 62–88
[36] Izhikevich E 2000 Subcritical elliptic bursting of Bautin type SIAM J. Appl. Math. 60 503–35
[37] Izhikevich E 2000 Neural excitability, spiking and bursting Int. J. Bif. Chaos 10 1171–266
[38] Keener J 1981 Infinite period bifurcation and global bifurcation branches SIAM J. Appl. Math. 41 127–44
[39] Kuznetsov Y A 1995 Elements of Applied Bifurcation Theory (Applied Mathematical Sciences Series 112) (Berlin: Springer)
[40] Langford W F 1979 Periodic and steady mode interactions lead to tori SIAM J. Appl. Math. 37 22–48
[41] Marsden J and McCracken M 1976 The Hopf Bifurcation and its Applications

An unfolding theory approach to bursting

307

(Berlin: Springer) [42] Neihstadt A I 1987 Persistence of stability loss for dynamical bifurcations I J. Diff. Eqns 23 1385–91 [43] Neihstadt A I 1988 Persistence of stability loss for dynamical bifurcations II J. Diff. Eqns 24 171–96 [44] Pernarowski M 1994 Fast subsystem bifurcations in a slowly-varying Li´enard system exhibiting bursting SIAM J. Appl. Math. 54 814–32 [45] Pernarowski M, Miura R M and Kevorkian J 1992 Perturbation techniques for models of bursting electrical activity in pancreatic β-cells SIAM J. Appl. Math. 52 1627–50 [46] Pinsky R F and Rinzel J 1994 Intrinsic and network rhythmogenesis in a reduced Traub model for CA3 neurons J. Comput. Neurosci. 1 39–60 [47] Rhodes P A and Gray C M 1994 Simulations of intrinsically bursting neocortical pyramidal neurons Neural. Comput. 6 1086–110 [48] Rinzel J 1978 Repetitive activity and Hopf bifurcation under stimulation for a simple FitzHugh-Nagumo nerve conduction model J. Math. Biol. 5 363–82 [49] Rinzel J 1981 Models in neurobiology Nonlinear Phenomena in Physics and Biology ed R H Enns, B L Jones, R M Miura and S S Rangnekar (New York: Plenum) pp 347–67 [50] Rinzel J 1985 Bursting oscillation in an excitable membrane model Ordinary and Partial Differential Equations (Lecture Notes in Mathematics 1151) ed B D Sleeman and R D Jarvis (Berlin: Springer) pp 304–16 [51] Rinzel J 1987 A formal classification of bursting mechanisms in excitable systems Mathematical Topics in Population Biology, Morphogenesis and Neurosciences (Lecture Notes in Biomathematics 71) ed E Teramoto and M Yamaguti (Berlin: Springer) pp 267–81 [52] Rinzel J 1987 A formal classification of bursting mechanisms in excitable systems Proc. Int. Cong. Math. 
1987 ed A M Gleason (Providence, RI: American Mathematical Society) pp 1578–93 [53] Rinzel J and Ermentrout G B 1989 Analysis of neural excitability and oscillations Methods in Neuronal Modeling: From Synapses to Networks ed C Koch and I Segev (Cambridge: MIT Press) pp 135–69 [54] Rinzel J and Lee Y S 1986 On different mechanisms for membrane potential bursting Nonlinear Oscillations in Biology and Chemistry (Lecture Notes in Biomathematics 66) ed H G Othmer (Berlin: Springer) pp 19–33 [55] Rinzel J and Lee Y S 1987 Dissection of a model for neuronal parabolic bursting J. Math. Biol. 25 653–75 [56] Rubin J E and Terman D 1999 Geometric singular perturbation analysis of neuronal dynamics Handbook of Dynamical Systems III: Toward Applications ed F Takens to appear [57] Rush M E and Rinzel J 1994 Analysis of bursting in a thalamic neuron model Biol. Cybern. 71 281–91 [58] Schecter S 1987 The saddle–node separatrix-loop bifurcation SIAM J. Math. Anal. 18 1142–56 [59] Sherman A, Rinzel J and Keizer J 1988 Emergence of organized bursting in clusters of pancreatic β-cells by channel sharing Biophys. J. 54 411–25 [60] Soto-Trevi˜no C, Kopell N and Watson D 1996 Parabolic bursting revisited J. Math. Biol. 35 114–28 [61] Su J 1993 Delayed oscillation phenomena in the FitzHugh–Nagumo equation J. Diff.

308

Martin Golubitsky, Kreˇsimir Josi´c and Tasso J Kaper

Eqns 105 180–215 [62] Su J 1997 Effects of periodic forcing on delayed bifurcations J. Dynam. Diff. Eqns 9 561–625 [63] Takens F 1973 Unfoldings of certain singularities of vector fields: generalized Hopf bifurcations J. Diff. Eqns 14 476–93 [64] Takens F 1974 Singularities of vector fields Publ. Math. IHES 43 47–100 [65] Takens F 1976 Constrained equations: a study of implicit differential equations and their discontinuous solutions Structural Stability, the Theory of Catastrophes, and Applications in the Sciences (Lecture Notes in Mathematics 525) ed P Hilton (Berlin: Springer) pp 143–234 [66] Terman D 1991 Chaotic spikes arising from a model of bursting in excitable membranes SIAM J. Appl. Math. 51 1418–50 [67] de Vries G 1998 Multiple bifurcations in a polynomial model of bursting oscillations J. Nonlin. Sci. 8 281–316 [68] de Vries G and Miura R M 1998 Analysis of a class of models of bursting electrical activity in pancreatic β-cells SIAM J. Appl. Math. 58 607–35 [69] Wang X-J and Rinzel J 1994 Oscillatory and bursting properties of neurons The Handbook of Brain Theory and Neural Networks ed M A Arbib (Cambridge: MIT Press) pp 686–91 [70] Zeeman E C 1973 Differential equations for heartbeat and nerve impulse Dynamical Systems ed M Peixoto (New York: Academic) pp 683–741 [71] Zoladek H 1984 On the versality of a family of symmetric vector fields in the plane Math. USSR Sb. 48 463–98 [72] Zoladek H 1987 Bifurcations of a certain family of planar vector fields tangent to axes J. Diff. Eqns 67 1–55

Chapter 11

The intermittency route to chaotic dynamics

Lorenzo J Díaz, PUC-Rio
Isabel L Rios, Universidade Federal Fluminense
Marcelo Viana, IMPA

To Floris, whose work has been a continuous source of inspiration.

The expression intermittency describes a mechanism of transition from simple behavior to turbulence in dissipative convective fluids, and many other dissipative dynamical systems. The pioneering work of Pomeau and Manneville [26] analysed intermittency in the Lorenz model, as well as in families of systems unfolding a saddle–node, a flip, or a Hopf bifurcation. Their article presented numerical evidence indicating that in these bifurcations the Lyapunov exponent grows continuously from zero beyond the bifurcation threshold.

A conceptual formulation of intermittency in a broad setting was proposed by Floris Takens in [30]: an arc (one-parameter family) of diffeomorphisms $(\phi_\mu)_\mu$ on a manifold has an intermittency bifurcation for $\mu = \mu_0$ at a compact invariant set $K$ if:

• for every $\mu < \mu_0$ the diffeomorphism $\phi_\mu$ has an attracting compact set $K_\mu$ (not necessarily transitive), converging to $K$ in the Hausdorff sense when $\mu$ tends to $\mu_0$ from below;
• for $\mu > \mu_0$ close to $\mu_0$ there are no $\phi_\mu$-attracting sets near $K$, yet the $\phi_\mu$-orbit of Lebesgue almost every point in a neighborhood of $K$ returns close to $K$ infinitely often.


Figure 11.1. Local dynamics at a saddle–node bifurcation. (Panels, from left to right: $\mu < \mu_0$, $\mu = \mu_0$, $\mu > \mu_0$; labels: $\mathcal{F}^{ss}$, $W^{ss}$, $W^c$, $A_\mu$, $S_\mu$, $P$.)

Such bifurcations are accompanied by profound changes of the dynamics, both at the local level (in a neighborhood of the compact set $K$) and at the global level. As we shall see, these global changes are mainly influenced by the way points return to the vicinity of $K$ depending on the bifurcation parameter. The best studied situations correspond to the case where the set $K$ consists of a unique fixed (or periodic) orbit of saddle–node type: one multiplier is equal to 1 and all others are less than 1 in norm. This is also the setting we have in mind in this review, especially when the global recurrence stems from the presence of a cycle, that is, periodic points with cyclic intersections of their stable and unstable manifolds. Other interesting cases include, for example, transitions derived from Anosov diffeomorphisms [29, 32], as well as certain bifurcations of partially hyperbolic sets in dimension three or higher.

11.1 Saddle–nodes of diffeomorphisms

11.1.1 Definitions and basic facts

A saddle–node of a $C^r$ diffeomorphism $\phi : M \to M$ is a fixed (or periodic) point $P$ of $\phi$ such that $D\phi(P)$ has one multiplier equal to 1 and all others less than 1 in norm. The tangent space $T_P M$ splits into two $D\phi$-invariant spaces: the one-dimensional center space $E^c$, which is the eigenspace associated with the multiplier 1, and the stable space $E^{ss}$, corresponding to the remaining multipliers. By normal hyperbolicity theory [11, 19], there exist a locally invariant immersed center manifold $W^c$ and a strong stable manifold $W^{ss}$, tangent at $P$ to $E^c$ and to $E^{ss}$, respectively. The strong stable manifold is unique and of class $C^r$. In general, there are several center manifolds, and they may be less smooth than the diffeomorphism $\phi$. It is part of the definition of a saddle–node that, for some choice of $W^c$, the restriction of $\phi$ to the center manifold has a non-vanishing 2-jet at $P$: there is a coordinate $x$ on $W^c$ (with $P$ corresponding to $x = 0$) such that

\[ \phi(x) = x + \alpha x^2 + O(|x|^3) \quad \text{with } \alpha \neq 0. \]


Then $P$ is a semi-attractor restricted to $W^c$, as depicted at the center of figure 11.1, and it also follows that the center manifold is of class $C^r$. The unstable manifold $W^u$ of $P$ is an immersed half-line contained in $W^c$. The stable manifold $W^s$ is a closed half-space with $W^{ss}$ as its boundary. Moreover, there is a unique $\phi$-invariant foliation of the stable manifold of $P$ by codimension-one submanifolds having $W^{ss}$ as a leaf. It is called the strong stable foliation $\mathcal{F}^{ss}$ of the saddle–node.

11.1.2 Unfolding saddle–nodes

Saddle–nodes are obtained by collapsing a saddle $S_\mu$ and a periodic attractor (node) $A_\mu$ into a single point, as described in figure 11.1. After the bifurcation, the periodic points disappear and there is no attracting set in the region where the saddle–node $P$ was formed. The first part of Takens's definition of intermittency is fulfilled by taking $K = P$ and $K_\mu$ to be the closure of the separatrix connecting $S_\mu$ to $A_\mu$. To have the second part, we shall assume later that the saddle–node is part of a cycle.

An arc of diffeomorphisms $(\phi_\mu)_\mu$ unfolds generically a saddle–node $P$ of a diffeomorphism $\phi = \phi_{\mu_0}$ if it cuts the hyper-surface of diffeomorphisms with a saddle–node point transversely at $\phi$. Here is an alternative formulation, in terms of local expressions. One considers a continuation $W^c_\mu$ of the center manifold $W^c$, for nearby parameter values (which exists because the invariant manifold $W^c$ is normally hyperbolic [11] for the diffeomorphism $\phi = \phi_{\mu_0}$). Generic unfolding means that, up to a convenient choice of coordinates $x$ in $W^c_\mu$, and a re-parametrization of the family, the restriction of $\phi_\mu$ to $W^c_\mu$ has the form

\[ \phi_\mu(x) = x + \mu + \alpha x^2 + \beta x\mu + \gamma\mu^2 + O(|\mu|^3 + |x|^3). \]

After re-parametrization, the bifurcation parameter has become $\mu = 0$. From now on we shall always consider $\mu_0 = 0$.
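The local picture of the unfolding can be sketched numerically. The following Python snippet is only an illustration under stated assumptions: it uses the truncated model $\phi_\mu(x) = x + \mu + \alpha x^2$ with $\alpha = 1$ (higher-order terms and the terms in $\beta$, $\gamma$ are dropped), and the function names and sample values are hypothetical. For $\mu < 0$ the model has two fixed points $\mp\sqrt{-\mu/\alpha}$, with multipliers below and above 1 along the center direction (playing the roles of the node $A_\mu$ and the saddle $S_\mu$); for $\mu = 0$ points with $x < 0$ converge to the saddle–node and never cross it; for $\mu > 0$ there are no fixed points and every orbit crosses, more slowly the smaller $\mu$ is.

```python
import math

ALPHA = 1.0  # assumed value of the coefficient alpha > 0


def phi(mu, x, alpha=ALPHA):
    """Truncated model of the arc restricted to the center manifold."""
    return x + mu + alpha * x * x


def fixed_points(mu, alpha=ALPHA):
    """Fixed points of phi(mu, .) paired with their multipliers 1 + 2*alpha*x."""
    if mu > 0:
        return []  # after the bifurcation there are no fixed points
    r = math.sqrt(-mu / alpha)
    points = [-r, r] if r > 0 else [0.0]
    return [(x, 1 + 2 * alpha * x) for x in points]


def escape_time(mu, x0=-0.5, x_exit=0.5, max_iter=100_000):
    """Number of iterates needed to move from x0 past x_exit (None if never)."""
    x = x0
    for k in range(max_iter):
        if x > x_exit:
            return k
        x = phi(mu, x)
    return None


if __name__ == "__main__":
    print("mu = -0.01:", fixed_points(-0.01))   # node and saddle
    print("mu =  0.00:", fixed_points(0.0))     # the saddle-node itself
    for mu in (0.01, 0.0025, 0.0001):
        print("mu =", mu, "escape time:", escape_time(mu))
```

In this model the escape time grows without bound as $\mu$ decreases to zero, which is the slow passage near the former saddle–node that underlies intermittency.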
The notion of a saddle–node may be extended to include other nonhyperbolic periodic points obtained by collapsing two saddle-points with different stable dimensions: they have one unique multiplier equal to 1 and all others different from 1 in norm. See [7] for results in this setting.

11.1.3 Saddle–node cycles

A diffeomorphism $\phi$ has a saddle–node $k$-cycle, $k \in \mathbb{N}$, if there are a saddle–node $p_0$ and hyperbolic periodic saddles $p_1, \ldots, p_{k-1}$, such that $W^u(p_{j-1})$ intersects $W^s(p_j)$ transversely for every $j$ and $W^u(p_{k-1})$ meets $W^s(p_0)$. The cycle is critical if $W^u(p_{k-1})$ is non-transverse to the strong stable foliation of the saddle–node. Otherwise, it is called non-critical. Figure 11.2 exhibits three different types of saddle–node cycle: from left to right we have a critical 1-cycle, a critical saddle–node horseshoe, and a non-critical 2-cycle (non-critical saddle–node horseshoe).

Figure 11.2. Saddle–node cycles.

An arc of diffeomorphisms $(\phi_\mu)_\mu$ unfolds generically a saddle–node cycle of $\phi = \phi_0$ if it unfolds generically the saddle–node $p_0$ involved in that cycle. This is a remarkably rich mechanism of bifurcation. For instance, one has the following.

Theorem 11.1 (Newhouse et al [19]). If an arc $(\phi_\mu)_\mu$ of surface diffeomorphisms unfolds generically a critical saddle–node cycle of $\phi_0$, then there is a sequence of parameters $\nu_n \to 0$ such that, for every $\nu_n$, the diffeomorphism $\phi_{\nu_n}$ has a homoclinic tangency which is unfolded generically by the family $(\phi_\mu)_\mu$.

This result extends to arbitrary dimension; see [8]. Moreover, the converse is also true (L Mora): the generic unfolding of a homoclinic tangency by a family of surface diffeomorphisms always includes the formation and generic unfolding of critical saddle–node cycles.

From theorem 11.1 one deduces that any phenomena occurring during a homoclinic bifurcation (e.g. the creation of attractors) are also present when a critical saddle–node cycle is unfolded. However, saddle–node bifurcations have a very distinctive feature, that we state as the following informal principle: persistent phenomena (positive Lebesgue measure of values of $\mu$) are, actually, prevalent (positive Lebesgue density at $\mu = 0$). More precise statements and an explanation of the mechanism behind this property are provided in the next sections.

11.1.4 Persistence and prevalence

Let $(\phi_\mu)_\mu$ be an arc of diffeomorphisms on a manifold $M$, going through some bifurcation at $\mu = 0$. Let $\mathcal{P}$ be some dynamical property, like hyperbolicity, co-existence of infinitely many sinks, or presence of non-hyperbolic strange attractors. The property $\mathcal{P}$ is persistent after the bifurcation if for every $\varepsilon > 0$ the subset $E_\varepsilon \subset [0, \varepsilon]$ of parameter values for which $\phi_\mu$ satisfies $\mathcal{P}$ has positive Lebesgue measure. $\mathcal{P}$ is called prevalent at the bifurcation if

\[ \liminf_{\varepsilon \to 0} \frac{|E_\varepsilon|}{\varepsilon} > 0 \]

where $|E_\varepsilon|$ denotes the Lebesgue measure of $E_\varepsilon$. Finally, $\mathcal{P}$ is fully prevalent if the above limit is 1.

For instance, Newhouse, Palis and Takens [18, 20, 21] prove that hyperbolicity is fully prevalent in arcs of surface diffeomorphisms unfolding homoclinic tangencies associated with hyperbolic sets with Hausdorff dimension less than 1. This is not true if the Hausdorff dimension is bigger than 1, according to Palis and Yoccoz [23], but the union of hyperbolicity and persistent tangencies (Newhouse's phenomenon [17]) is always fully prevalent at homoclinic bifurcations in dimension two, according to Moreira and Yoccoz [15].

In the same setting, Mora and Viana [13] proved that existence of nonhyperbolic strange attractors is a persistent phenomenon. According to a recent result of Palis and Yoccoz [25], it cannot be prevalent. On the other hand, as we shall see in a while, non-hyperbolic strange attractors are always a prevalent phenomenon in the unfolding of critical saddle–node cycles. This is a striking realization of the informal principle we stated before: in saddle–node bifurcations persistent properties tend to be prevalent.

This remarkable feature results from the existence of a repetition pattern in parameter space that is characteristic of intermittency bifurcations: one can find sequences $\mu_n$ converging to the bifurcation value 0 such that the arcs obtained by restricting the parameter to each interval $[\mu_{n+1}, \mu_n]$ have roughly the same dynamics for all large $n$, up to convenient parametrization. This is properly explained by means of the following construction of Newhouse et al [19], that plays a crucial role in what follows. For clarity, we shall restrict ourselves to the case of surface diffeomorphisms. However, this construction extends to any dimension [8].
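The distinction between persistence and prevalence defined in section 11.1.4 can be illustrated with two artificial parameter sets. The sketch below is not tied to any dynamical property: the sets `thin` and `fat` are hypothetical unions of intervals accumulating at 0, chosen only so that both have positive measure in every $[0, \varepsilon]$ while only the second has positive density at 0. Both unions are truncated at a finite index, which is an approximation.

```python
def measure_below(intervals, eps):
    """Lebesgue measure of a union of disjoint intervals intersected with [0, eps]."""
    return sum(max(0.0, min(hi, eps) - lo) for lo, hi in intervals)


def density(intervals, eps):
    """The ratio |E_eps| / eps from the definition of prevalence."""
    return measure_below(intervals, eps) / eps


N_MAX = 60  # truncation index (assumption: the neglected tail is negligible)

# "Persistent but not prevalent": intervals of length 4^-n placed at 2^-n.
thin = [(2.0 ** -n, 2.0 ** -n + 4.0 ** -n) for n in range(1, N_MAX)]

# "Prevalent": each interval fills half of the dyadic block [2^-n, 2^-(n-1)].
fat = [(2.0 ** -n, 1.5 * 2.0 ** -n) for n in range(1, N_MAX)]

if __name__ == "__main__":
    for k in (5, 10, 20):
        eps = 2.0 ** -k
        print(k, density(thin, eps), density(fat, eps))
```

Along $\varepsilon = 2^{-k}$ the density of `thin` tends to zero while that of `fat` stays near $1/2$, so a property holding exactly on `fat` would be prevalent but not fully prevalent.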

11.2 Transition maps

Let $(\phi_\mu)_\mu$ be an arc of diffeomorphisms unfolding generically a saddle–node of $\phi = \phi_0$. Fix, once and for all, a continuation $W^c_\mu$ of a center manifold, and a coordinate system $x$ in each $W^c_\mu$ so that

\[ \phi_\mu(x) = x + \mu + \alpha x^2 + \beta x\mu + \gamma\mu^2 + O(|\mu|^3 + |x|^3). \]

It is no restriction to assume $\alpha > 0$. Then, for $\mu = 0$, the subsets $\{x < 0\}$ and $\{x > 0\}$ of the center manifold of the saddle–node are contained in its stable and unstable manifolds, respectively; see figure 11.1.


Figure 11.3. Dynamical normalizations of parameter space.

11.2.1 Finite-time transition maps

For $\mu = 0$, the presence of the fixed point prevents the transition of orbits from the left-hand side $\{x < 0\}$ to the right-hand side $\{x > 0\}$. However, this obstruction disappears when the parameter $\mu$ becomes positive. We can then define transition maps in the following way. Fix compact fundamental domains $D^- \subset \{x < 0\}$ and $D^+ \subset \{x > 0\}$ of $\phi_\mu$ restricted to $W^c_\mu$. Their dependence on $\mu$ is not relevant here, so we omit it in our notations. For each $\mu > 0$ let $k = k(\mu)$ be the smallest integer such that $\phi_\mu^k(D^-)$ intersects $D^+$. As $\mu$ decreases to zero, more and more iterates are needed for $D^-$ to reach $D^+$, which means that $k(\mu) \to \infty$ as $\mu$ tends to zero from above. There is a decreasing sequence of parameters $\mu_n \to 0$ such that $k(\mu) = n$ for all $\mu \in [\mu_{n+1}, \mu_n)$ and $\phi_{\mu_n}^n(D^-) = D^+$; see figure 11.3.

It is useful to identify points in $\{x < 0\}$ if they are in the same orbit of $\phi_\mu$, and similarly in $\{x > 0\}$, and we shall often do this in the following. This identification turns $D^-$ and $D^+$ into smooth circles. For each large $n$ and $\mu \in [\mu_{n+1}, \mu_n)$ we consider the circle map $\tilde{T}_n(\mu, \cdot) : D^- \to D^+$ induced by the $n$th iterate $\phi_\mu^n$, and call it the time-$n$ transition map of the saddle–node arc $(\phi_\mu)_\mu$. The repetition pattern we announced before comes from the fact that these arcs of finite-time transitions behave roughly the same when $n$ is large: up to dynamically defined normalizations of the domain in parameter space, the arcs $\tilde{T}_n$ converge to some limit $T_\infty$ when $n$ tends to infinity.

11.2.2 Parameter normalization and infinite-time transition

A one-parameter family of vector fields $(X_\mu)_\mu$ is a saddle–node arc if (in local coordinates around the origin) the vector fields are of the form

\[ X_\mu(x) = \mu + \alpha x^2 + \beta x\mu + \gamma\mu^2 + O(|\mu|^3 + |x|^3) \]

for some constants $\alpha, \beta, \gamma$ with $\alpha > 0$. The arc $(X_\mu)_\mu$ is adapted to $(\phi_\mu)_\mu$ if $\phi_\mu(x)$ coincides with $X_\mu^1(x)$ for all $\mu \geq 0$ and $x$ close to zero, where $X_\mu^1$ is the time-1 map of the vector field $X_\mu$ ([19] use a weaker condition; the present definition is from [8]). For the existence of adapted arcs of vector fields see [12, 33].

Let us write $D^- = [a, \phi_\mu(a)]$ and $D^+ = [b, \phi_\mu(b)]$ with $a < 0 < b$. By the definition of the $\mu_n$, the point $\phi_{\mu_n}^{n+1}(a)$ coincides with the right end-point $\phi_\mu(b)$ of $D^+$, whereas $\phi_{\mu_{n+1}}^{n+1}(a)$ coincides with the left end-point $b$ of $D^+$. Moreover, the map

\[ [\mu_{n+1}, \mu_n] \ni \mu \mapsto \phi_\mu^{n+1}(a) \in D^+ \]

is increasing (if $n$ is large). For each $\mu \in [\mu_{n+1}, \mu_n]$ we denote by $\xi_n(\mu)$ the time the flow of the adapted arc of vector fields $X_\mu$ takes to go from $\phi_\mu^{n+1}(a)$ to $\phi_\mu(b)$. That is,

\[ X_\mu^{\xi_n(\mu)}\bigl(\phi_\mu^{n+1}(a)\bigr) = \phi_\mu(b) \quad \Leftrightarrow \quad X_\mu^{n + \xi_n(\mu)}(a) = b. \]

$\xi_n$ maps $[\mu_{n+1}, \mu_n]$ onto $[0, 1]$ in a decreasing fashion. We define the $n$th parameter space normalization $\upsilon_n : [0, 1] \to [\mu_{n+1}, \mu_n]$ to be the inverse of this map $\xi_n$.

The adapted arc $(X_\mu)_\mu$ also allows us to exhibit infinite-time transition maps $T_\infty : [0, 1] \times D^- \to D^+$, given by

\[ T_\infty(\sigma, x) = X_0^{t(x) - \sigma}(b) \]

where $t(x)$ is the time the flow spends from $a$ to $x$, that is, $X_0^{t(x)}(a) = x$. Keep in mind that we think of $D^-$ and $D^+$ as circles, under identifications of points in the same orbit. Note that, if one takes $t(x) \bmod 1$ as a new coordinate in $D^-$ and, similarly, considers the time the flow of $X_0$ takes to go from $b$ to any point in $D^+$ as a new coordinate in $D^+$, these $T_\infty(\sigma, \cdot)$ become circle isometries. In fact, each $T_\infty(\sigma, \cdot)$ is obtained by composing $T_\infty(0, \cdot)$ with the rigid rotation of angle $-\sigma$.

11.2.3 Convergence and distortion properties

Let $T_n$ be the arcs of transformations from $D^-$ to $D^+$ obtained by reparametrizing the finite-time transitions $\tilde{T}_n$ according to $\upsilon_n$:

\[ T_n : [0, 1] \times D^- \to D^+, \qquad T_n(\sigma, x) = \tilde{T}_n(\upsilon_n(\sigma), x). \]

That is, $T_n(\sigma, \cdot)$ is the map induced by the restriction of $\phi^n_{\upsilon_n(\sigma)}$ to the center manifold, in the quotient spaces obtained by identifying points in the same orbit, on $\{x < 0\}$ and on $\{x > 0\}$. Here is the convergence statement we had announced.


Theorem 11.2 (Newhouse et al [19], Díaz et al [8]). The sequence of maps $T_n : [0, 1] \times D^- \to D^+$ converges to $T_\infty : [0, 1] \times D^- \to D^+$ in the $C^r$-topology when $n \to \infty$.

Most important for the kind of problems we want to deal with, the reparametrizations $\upsilon_n$ have uniformly bounded distortion.

Proposition 11.3 ([8], proposition 2.2). For every $\varepsilon > 0$ there is $n_0$ such that

\[ (1 - \varepsilon)|A| < \frac{|\upsilon_n(A)|}{\mu_n - \mu_{n+1}} < (1 + \varepsilon)|A| \]

for every measurable subset $A$ of $[0, 1]$ and every $n \geq n_0$.

We have been concerned only with the dynamics restricted to the center manifold. The reason is that the dynamics of the transition maps transverse to $W^c_\mu$ vanishes when $\mu$ approaches zero: all that is left is the dynamics along the center manifold, described by $T_\infty$. Here is a more precise explanation. Consider neighborhoods $C^-$ and $C^+$ of $D^-$ and $D^+$. If $C^-$ and $C^+$ are conveniently chosen, their quotients after identification of points in the same orbit (that we continue denoting in the same way) are diffeomorphic to cylinders $D^\pm \times [-1, 1]$. Define $\hat{T}_n(\sigma, \cdot)$ to be the map from $C^-$ to $C^+$ induced by the diffeomorphism $\phi^n_{\upsilon_n(\sigma)}$ (now we do not restrict ourselves to the center manifold). Since our diffeomorphisms are contracting transversely to the center manifold, the image of $\hat{T}_n(\sigma, \cdot)$ gets closer and closer to the equator $D^+ \times \{0\}$ of $C^+$ when $n$ increases. Indeed, we have the following higher-dimensional version of theorem 11.2.

Theorem 11.4 ([8], theorem 2.6). The sequence $\hat{T}_n : [0, 1] \times C^- \to C^+$ converges to the arc

\[ \hat{T}_\infty : [0, 1] \times C^- \to C^+, \qquad \hat{T}_\infty(\sigma, x, y) = (T_\infty(\sigma, x), 0) \]

in the $C^r$ topology when $n \to \infty$.
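The parameter normalization and the bounded distortion statement of proposition 11.3 can be explored numerically in a model case. The sketch below rests on several assumptions that are not part of the text: the vector field is the truncated model $X_\mu(x) = \mu + x^2$ (so $\alpha = 1$, $\beta = \gamma = 0$), the diffeomorphism is taken to be exactly its time-1 map (so the arc is adapted by construction), and the end-points $a = -0.5$, $b = 0.5$ are arbitrary choices. In this model the time the flow needs to go from $a$ to $b$ is explicit, $\tau(\mu) = \bigl(\arctan(b/\sqrt{\mu}) - \arctan(a/\sqrt{\mu})\bigr)/\sqrt{\mu}$, and the relation $X_\mu^{n + \xi_n(\mu)}(a) = b$ reads $\xi_n(\mu) = \tau(\mu) - n$.

```python
import math

A_LEFT, B_RIGHT = -0.5, 0.5  # assumed left end-points a < 0 < b of D^- and D^+


def transit_time(mu):
    """Time the flow of X_mu(x) = mu + x^2 needs to go from a to b (mu > 0)."""
    s = math.sqrt(mu)
    return (math.atan(B_RIGHT / s) - math.atan(A_LEFT / s)) / s


def parameter_for_time(t, lo=1e-12, hi=100.0):
    """Solve transit_time(mu) = t by bisection; transit_time is decreasing in mu."""
    for _ in range(200):
        mid = 0.5 * (lo + hi)
        if transit_time(mid) > t:
            lo = mid  # transit still too slow: mu has to be larger
        else:
            hi = mid
    return 0.5 * (lo + hi)


def upsilon(n, sigma):
    """Model normalization upsilon_n, the inverse of xi_n(mu) = transit_time(mu) - n."""
    return parameter_for_time(n + sigma)


def relative_size(n, s0, s1):
    """|upsilon_n([s0, s1])| / (mu_n - mu_{n+1}) for 0 <= s0 < s1 <= 1."""
    mu_n, mu_next = upsilon(n, 0.0), upsilon(n, 1.0)
    return (upsilon(n, s0) - upsilon(n, s1)) / (mu_n - mu_next)


if __name__ == "__main__":
    for n in (5, 50, 500):
        print(n, upsilon(n, 0.0),
              relative_size(n, 0.0, 0.5), relative_size(n, 0.25, 0.5))
```

For large $n$ the relative sizes approach the lengths $0.5$ and $0.25$ of the intervals in $[0, 1]$, in line with the inequality of proposition 11.3 for this model; the printed values of $\mu_n$ also decay roughly like $\pi^2/n^2$, reflecting a transit time of order $\pi/\sqrt{\mu}$.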

11.3 Global aspects: ghost dynamics

Now we analyse the unfolding of a saddle–node cycle from the global point of view. The situation when the saddle–node is the unique periodic point involved in the cycle deserves a separate treatment.


Figure 11.4. Ghost circle maps.

11.3.1 A return map for 1-cycles

Let $(\phi_\mu)_\mu$ be an arc of diffeomorphisms generically unfolding a critical 1-cycle. Fix fundamental domains $D^-$ and $D^+$ as in the previous section. We assume that the unstable manifold of the saddle–node $P$ is contained in its stable manifold. Then there exists $l \geq 1$ such that $\phi_0^l(D^+)$ is contained in the region $\{x < 0\}$, inside the local stable manifold of $P$; see figure 11.4.

Fix fundamental regions $C^- \supset D^-$ and $C^+ \supset D^+$ as before, such that $\phi_\mu^l(C^+)$ is contained in $\{x < 0\}$ for every $\mu$ close to zero, and the orbit of any point of $\phi_\mu^l(C^+)$ has a representative in $C^-$: it suffices that $C^+$ be sufficiently short, and $C^-$ be long enough along the vertical (strong-stable) direction. Then, identifying points in the same $\phi_\mu$-orbit as we have been doing, there is a well defined arc of smooth maps $\Psi_\mu : C^+ \to C^-$ from the cylinder $C^+$ to the cylinder $C^-$, induced by $\phi_\mu^l$. Moreover, if $\pi$ denotes the projection from the stable manifold onto $W^c$ along the leaves of the strong-stable foliation, we can define a smooth circle map $\psi_0 : D^+ \to D^-$ from the circle $D^+$ to the circle $D^-$, induced by $\pi \circ \phi_0^l$. Observe that if the cycle is critical then this circle map exhibits (at least two) critical points. This is the case figure 11.4 refers to, and the one we are most interested in for the time being.

Composing the $\Psi_\mu$ with the transition maps that were introduced before, we obtain arcs of global return maps

\[ R_n : [0, 1] \times C^+ \to C^+, \qquad R_n(\sigma, \cdot) = \hat{T}_n(\sigma, \cdot) \circ \Psi_{\upsilon_n(\sigma)}(\cdot). \]

These maps encode the whole dynamics of the diffeomorphisms $\phi_\mu$ close to the cycle. Moreover, by theorems 11.2 and 11.4, the sequence $R_n$ converges, in the $C^r$ topology, to the arc of ghost maps

\[ R_\infty : [0, 1] \times C^+ \to C^+, \qquad R_\infty(\sigma, x, y) = \hat{T}_\infty(\sigma, \psi_0(x), 0) = (T_\infty(\sigma, \psi_0(x)), 0). \]

It is important to observe that, since the last variable $y$ plays no role in $R_\infty$, we may also think of it as an arc of circle maps

\[ R_\infty : [0, 1] \times D^+ \to D^+, \qquad R_\infty(\sigma, x) = T_\infty(\sigma, \psi_0(x)). \]

Thus, the unfolding of the saddle–node cycle may, to some extent, be reduced to a one-dimensional problem: from understanding the dynamics of these circle maps $R_\infty(\sigma, \cdot)$ one may draw conclusions about the behavior of $\phi_\mu$ for small $\mu > 0$. Next comes an important application of this idea.

11.3.2 Prevalence of hyperbolicity

Suppose $\mathcal{P}$ is a robust property, that is, the set of dynamical systems that satisfy $\mathcal{P}$ is open. Suppose, in addition, that $\mathcal{P}$ holds for some ghost circle map $R_\infty(\sigma, \cdot) : D^+ \to D^+$. Then, by robustness, $\mathcal{P}$ is satisfied by $R_n(\sigma, \cdot)$ for every large $n$ and every $\sigma$ in some interval $J \subset [0, 1]$. Since each $R_n(\sigma, \cdot)$ is a quotient map of an iterate of $\phi_{\upsilon_n(\sigma)}$ (identification of points in the same orbit), we conclude that, up to convenient translation, property $\mathcal{P}$ is satisfied by $\phi_\mu$ for all parameters $\mu$ in the set $E = \bigcup_n \upsilon_n(J)$. On the other hand, by the bounded distortion property in proposition 11.3,

\[ \frac{|E \cap [\mu_{n+1}, \mu_n]|}{|[\mu_{n+1}, \mu_n]|} \geq (1 - \varepsilon)|J| \geq \frac{1}{2}|J| \]

for every large $n$. This means that $E$ has positive density at $\mu = 0$. In other words, the property $\mathcal{P}$ is prevalent at the bifurcation for the arc $(\phi_\mu)_\mu$.

For instance, take $\mathcal{P}$ to be hyperbolicity (Axiom A plus strong transversality [29]). It is not difficult to ensure, for a critical saddle–node arc $(\phi_\mu)_\mu$, that some ghost circle map $R_\infty(\sigma, \cdot)$ is hyperbolic. For instance, one may choose $R_\infty(1/2, \cdot)$ such that it has exactly two critical points, both contained in the basin of attraction of a fixed point $s_0$, and the norm of the derivative is larger than 1 outside neighborhoods of the critical points contained in the basin of $s_0$. Then the non-wandering set of $R_\infty(1/2, \cdot)$ is hyperbolic (implying Axiom A) and the map satisfies the strong transversality condition. It follows, by robustness of hyperbolicity, that $\phi_\mu$ is hyperbolic for a sizable subset of parameters $\mu$. Along these lines one gets the following.

Theorem 11.5 (Díaz et al [8]). There exists an open set of arcs of diffeomorphisms unfolding a critical saddle–node 1-cycle for which hyperbolicity is a prevalent property at the bifurcation.
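The kind of ghost circle map described in the example of section 11.3.2 can be imitated by a toy family. The sketch below is purely illustrative and every formula in it is an assumption: the circle is written as $\mathbb{R}/\mathbb{Z}$, the map $\psi_0$ is replaced by the hypothetical degree-zero bimodal map $x \mapsto 0.5\sin(2\pi x) + 0.25$, and the transition is taken to be the rigid rotation by $-\sigma$. For $\sigma = 1/2$ both critical points are mapped to the fixed point $s_0 = 1/4$. The code checks numerically (it proves nothing) the three ingredients named in the text: the critical orbits converge to an attracting fixed point, neighborhoods of the critical points lie in its basin, and the derivative has norm larger than 1 outside those neighborhoods.

```python
import math

CRITICAL_POINTS = (0.25, 0.75)


def ghost_map(sigma, x):
    """Toy ghost circle map: a bimodal degree-zero map followed by rotation by -sigma."""
    return (0.5 * math.sin(2 * math.pi * x) + 0.25 - sigma) % 1.0


def derivative(x):
    """Derivative of ghost_map(sigma, .); it does not depend on sigma."""
    return math.pi * math.cos(2 * math.pi * x)


def circle_dist(x, y):
    d = abs(x - y) % 1.0
    return min(d, 1.0 - d)


def iterate(sigma, x, iterations=500):
    for _ in range(iterations):
        x = ghost_map(sigma, x)
    return x


def attracting_fixed_point(sigma, tol=1e-9):
    """Return a fixed point attracting both critical orbits, or None."""
    ends = [iterate(sigma, c) for c in CRITICAL_POINTS]
    p = ends[0]
    ok = (circle_dist(ghost_map(sigma, p), p) < tol
          and circle_dist(ends[0], ends[1]) < tol
          and abs(derivative(p)) < 1.0)
    return p if ok else None


def neighborhoods_in_basin(sigma, p, radius=0.06, samples=50, tol=1e-9):
    """Check that sampled points within `radius` of each critical point converge to p."""
    for c in CRITICAL_POINTS:
        for i in range(samples + 1):
            x = (c - radius + 2 * radius * i / samples) % 1.0
            if circle_dist(iterate(sigma, x), p) > tol:
                return False
    return True


def expands_outside(radius=0.06, samples=2000):
    """Check |derivative| > 1 on a grid away from the two critical points."""
    grid = (i / samples for i in range(samples))
    return all(abs(derivative(x)) > 1.0 for x in grid
               if min(circle_dist(x, c) for c in CRITICAL_POINTS) >= radius)


if __name__ == "__main__":
    s0 = attracting_fixed_point(0.5)
    print("fixed point:", s0, neighborhoods_in_basin(0.5, s0), expands_outside())
    good = [k / 1000 for k in range(450, 551)
            if attracting_fixed_point(k / 1000) is not None]
    print("sampled sigma values with an attracting fixed point:", len(good))
```

The final scan mimics the robustness step of the argument: the attracting fixed point survives for $\sigma$ in a small interval $J$ around $1/2$, and it is such an interval that the normalizations $\upsilon_n$ spread over a set of positive density in parameter space.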

Figure 11.5. Saddle–node horseshoes: partially defined ghost maps.

This result extends to critical saddle–node $l$-cycles for any $l \geq 1$ [8].

Question 11.6. Is prevalence of hyperbolicity a generic property (open and dense) among arcs of diffeomorphisms unfolding critical saddle–node cycles with finitely many criticalities (for the ghost circle maps)?

One way to prove this would be to show that, given a generic multi-modal map $R$ of the circle (finitely many critical points), there exists $\sigma$ such that $R - \sigma$ (composition with the rotation by $-\sigma$) is hyperbolic.

11.3.3 Saddle–node horseshoes

The kind of system described in the central part of figure 11.2 was first treated by Zeeman [34], and it was pointed out by Takens [30] as an important model of intermittency. One considers a two-dimensional disk $D$ and an embedding $\phi : D \to D$ whose limit set in $D$ consists of a horseshoe $\Lambda$ and a periodic attractor. Then one lets the attractor and the accessible fixed point of the horseshoe collapse into a saddle–node. At the bifurcation the limit set $\Lambda_0$ is topologically conjugate to the initial horseshoe, but it is no longer hyperbolic, as it contains the saddle–node. Since $\Lambda_0$ has a dense subset of periodic points, the diffeomorphism exhibits saddle–node $l$-cycles for any $l \geq 2$.

A key difference with respect to the case of 1-cycles we discussed above is that now the unstable manifold of the saddle–node $P$ is not completely contained in its stable manifold: for instance, $W^u(P)$ intersects the stable manifolds of all the other periodic points in the non-hyperbolic horseshoe $\Lambda_0$. This means that there is no family of global return maps, as we were able to construct in the previous case. However, it is possible to construct partially defined return maps as follows. One fixes fundamental domains $D^-$ and $D^+$ as before, and considers a maximal open subinterval $I$ of $D^+$ contained in $W^u(P)$ and whose extremes are points of the strong stable manifold $W^{ss}(P)$. Then one defines, in much the same way as before, an arc of ghost return maps $R_\infty(\sigma, \cdot)$ from $I$ to $D^+$. In the example described in figure 11.5, the return maps have a unique critical point. Note that the norm of the derivative goes to infinity at the boundary of $I$. The convergence results of theorems 11.2 and 11.4 remain valid on compact subsets of $I$.

Partially defined ghost maps are used by Costa [4] in her proof that global strange attractors are a prevalent phenomenon in the unfolding of saddle–node horseshoes, in a robust (open) class of cases. Prevalence of hyperbolicity had been proven in [8] for another robust class. A detailed study of these return maps $R_\infty(\sigma, \cdot)$ is carried out by Díaz and Rios [6], who provide a geometric model for the unfolding of saddle–node horseshoes. Another use of partially defined return maps, by Díaz and Ures [9], will be discussed in a forthcoming section.

In a related setting, Crovisier [5] shows, in great generality, that saddle–node horseshoes give rise to true (hyperbolic) horseshoes when the saddle–node is unfolded in the direction of negative parameters. Cao and Kiriki [3] study the unfolding of non-critical horseshoes, as on the right-hand side of figure 11.2.

11.4 Prevalence of local and global strange attractors

An attractor of a diffeomorphism $\varphi : M \to M$ is a compact invariant subset $\Lambda$ of $M$ that is transitive (dense orbits) and whose basin (or stable set) $W^s(\Lambda) = \{x \in M : \varphi^n(x) \to \Lambda \text{ as } n \to +\infty\}$ has positive Lebesgue measure. A repeller of $\varphi$ is just an attractor of the inverse map $\varphi^{-1}$. One calls the attractor strange if orbits in the basin are sensitive with respect to initial conditions: almost every pair of orbits starting in nearby points diverge from each other as time increases. In this section we discuss saddle–node cycles as a privileged mechanism for creating strange attractors, especially non-hyperbolic ones.

11.4.1 A general prevalence result

According to theorem 11.1, the generic unfolding of a critical saddle–node cycle always involves the formation and generic unfolding of homoclinic tangencies. On the other hand, Mora and Viana [13] prove, based on the work of Benedicks and Carleson [1], that the presence of non-hyperbolic strange attractors is a persistent phenomenon in generic arcs of surface diffeomorphisms unfolding a homoclinic tangency. See also [28, 31] for the extension to arbitrary dimension. It follows that strange attractors are persistent also in the unfolding of saddle–node critical cycles. In view of the ideas discussed in section 11.3.2, one may expect the presence of strange attractors to be a prevalent phenomenon in this setting of saddle–node cycles. However, one should stress that the situation is much more subtle

The intermittency route to chaotic dynamics

Figure 11.6. Global invariant region for 1-cycles.

than in the case of hyperbolicity (which we settled in section 11.3.2) because in the present context one lacks robustness: the sets of systems constructed in [1, 13, 31], for which strange attractors are known to exist, have empty interior. Thus, a delicate analysis of the bifurcation mechanisms is needed to justify that expectation.

Theorem 11.7 (Díaz et al [8]). Existence of non-hyperbolic strange attractors is a prevalent property at the bifurcation for every arc of diffeomorphisms $(\varphi_\mu)_\mu$ unfolding generically a critical saddle–node cycle.

11.4.2 Global strange attractors

The strange attractors obtained by the previous construction have a local nature: they are periodic, with high periods, and their basins have a large number of connected components, with small total Lebesgue measure. This is entirely in the nature of things: without further assumptions about the geometry at the bifurcation, the set of points whose forward orbits remain forever close to the cycle may have small volume, for all positive values of the parameter $\mu$. On the other hand, in some relevant cases one can identify a global region around the cycle that remains forward invariant for all parameter values close to zero. An important example, corresponding to a saddle–node 1-cycle, is described in figure 11.6, where the invariant region is an annulus. In such cases, it is natural to ask whether a unique attractor can be found, in a persistent or even prevalent way, which accounts for the whole dynamical behavior in the sense that its basin contains the entire invariant region. The first construction of non-hyperbolic strange attractors with such a global character was the following.

Theorem 11.8 (Díaz et al [8]). The presence of a global non-hyperbolic strange attractor is prevalent at the bifurcation for an open class of arcs of

Figure 11.7. Persistent tangencies between invariant foliations.

diffeomorphisms unfolding a critical saddle–node 1-cycle. Other constructions appeared subsequently, including [4] in the setting of saddle–node horseshoes, where one may take a disk as the forward invariant region.

11.5 Persistence of tangencies

In this section we discuss fractal dimensions and the phenomenon of persistent tangencies in the context of saddle–node bifurcations.

11.5.1 Fractal dimensions in homoclinic bifurcations

Starting in the early seventies, works of Newhouse, Palis and Takens [18, 21, 20], and later also Yoccoz, Moreira and Palis [15, 23], have unveiled a deep connection between fractal dimensions (such as the Hausdorff dimension) of invariant sets and the frequency of hyperbolicity in the unfolding of homoclinic tangencies of surface diffeomorphisms. Let us outline this connection. One considers a homoclinic tangency associated with a periodic point $P$ contained in a horseshoe $\Lambda$; see figure 11.7. The existence of a homoclinic tangency implies that the invariant (stable and unstable) foliations of $\Lambda$ are tangent along a differentiable curve $\gamma$ containing the homoclinic point in its interior and transverse to both foliations. The intersection of $\gamma$ with the leaves of the foliations corresponding to points of the hyperbolic set $\Lambda$ defines two Cantor sets $\Lambda^s$ and $\Lambda^u$.

Given an arc $(\varphi_\mu)_\mu$ of diffeomorphisms unfolding the tangency, one considers the corresponding intersections $\Lambda^s_\mu$ and $\Lambda^u_\mu$ of $\gamma$ with the stable and unstable leaves through the points of the hyperbolic continuation $\Lambda_\mu$ of $\Lambda$. Clearly, if the sets $\Lambda^s_\mu$ and $\Lambda^u_\mu$ have non-empty intersection then there is a homoclinic tangency associated with $\Lambda_\mu$. Identifying $\gamma$ with an interval of $\mathbb{R}$ one can think of $\Lambda^s_\mu$ and $\Lambda^u_\mu$ as $\mu$-translations of the Cantor sets $\Lambda^s$ and $\Lambda^u$.

Newhouse [16] introduced a notion of thickness, which allowed him to give a sufficient criterion for two Cantor sets to intersect. It is defined as follows. Consider the process of construction of the Cantor set, by successively removing the corresponding gaps, in a non-increasing order of their lengths. Each time a gap is removed, compute the ratios of the lengths of the two remaining adjacent intervals to the length of the gap itself. The thickness is the infimum of all these ratios. Newhouse's gap lemma [16] states that two Cantor sets with a product of their thicknesses larger than 1 must intersect, unless one of them is contained in a gap of the other. Building on this, he was able to construct examples of arcs of diffeomorphisms $(\varphi_\mu)_\mu$ generically unfolding a homoclinic tangency of $\varphi = \varphi_0$ such that for a dense subset of a whole interval $[0, \varepsilon]$ of values of $\mu$ the diffeomorphism $\varphi_\mu$ has another homoclinic tangency. One speaks of an interval of persistent tangencies. Later, he proved in [17] that persistent tangencies occur in any generic unfolding of any homoclinic tangency by an arc of surface diffeomorphisms.

Then, the series of papers by Newhouse, Palis, Takens, Yoccoz and Moreira mentioned above identified the Hausdorff dimension as a key fractal invariant determining the frequency of hyperbolicity in the unfolding of homoclinic tangencies on surfaces. In general terms, hyperbolicity is prevalent at the bifurcation if and only if the Hausdorff dimension of the horseshoe $\Lambda$ is less than 1.
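To make the definition of thickness concrete, the following sketch computes it for the standard middle-$\alpha$ Cantor set in $[0, 1]$, together with the Hausdorff dimension of the same set. This is only an illustration: the middle-$\alpha$ family, the function names and the finite number of construction levels are assumptions made here, not objects taken from the text. For this self-similar family every gap removal yields the same ratio $(1 - \alpha)/(2\alpha)$, so a few levels already give the infimum.

```python
from fractions import Fraction
from math import log


def thickness_middle_cantor(alpha, levels=6):
    """Thickness of the middle-alpha Cantor set in [0, 1] (illustrative helper).

    Gaps are removed level by level, which is a non-increasing order of
    lengths. Each time a gap is removed we take the smaller of the two
    adjacent remaining intervals, divide by the gap length, and keep the
    infimum of these ratios over all removals.
    """
    intervals = [(Fraction(0), Fraction(1))]
    tau = None
    for _ in range(levels):
        next_intervals = []
        for a, b in intervals:
            length = b - a
            side = length * (1 - alpha) / 2   # each surviving interval
            gap = length * alpha              # removed middle gap
            ratio = side / gap                # both neighbours have length `side`
            tau = ratio if tau is None else min(tau, ratio)
            next_intervals.append((a, a + side))
            next_intervals.append((b - side, b))
        intervals = next_intervals
    return tau


def hausdorff_dim_middle_cantor(alpha):
    """Hausdorff dimension of the middle-alpha Cantor set (similarity dimension)."""
    contraction = (1 - float(alpha)) / 2
    return log(2) / log(1 / contraction)


def gap_lemma_applies(tau1, tau2):
    """Thickness hypothesis of the gap lemma: product of thicknesses > 1."""
    return tau1 * tau2 > 1


if __name__ == "__main__":
    t_third = thickness_middle_cantor(Fraction(1, 3))
    t_fifth = thickness_middle_cantor(Fraction(1, 5))
    print(t_third, t_fifth, gap_lemma_applies(t_third, t_fifth))
    print(hausdorff_dim_middle_cantor(Fraction(1, 3)))
```

Exact rational arithmetic is used so that the ratios are not blurred by rounding. The check `gap_lemma_applies` tests only the thickness condition; the lemma also requires that neither set lies in a gap of the other.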
More recent results of Moreira, Palis and Viana [14, 24] and Romero [28] have shown that this principle remains valid on manifolds with arbitrary dimension. In dimension larger than two there are other mechanisms (not involving fractal dimensions explicitly) yielding persistence of tangencies in the $C^1$ topology; see Bonatti and Díaz [2]. Moreover, Rios [27] extended many of the previous results to the unfolding of homoclinic tangencies accumulated by periodic points (where the homoclinic orbit is contained in the limit set of the diffeomorphism).

11.5.2 Thick horseshoes in saddle–node cycles

Saddle–node cycles exhibit some original features from the point of view of the discussion in the previous section. One of the most striking is the possibility for thick horseshoes to be created 'out of nowhere', immediately after the bifurcation. In fact, such horseshoes may be seen as a kind of continuation of thick invariant sets of the ghost return maps. Let us explain this in the case of critical 1-cycles.

We may construct examples of critical saddle–node 1-cycles such that the ghost circle map $R_\infty(\sigma, \cdot)$ has a hyperbolic Cantor set with large thickness for some subset of parameters $\sigma \in [0, 1]$. For instance, one may take for $R_\infty(\sigma, \cdot)$ a circle map such that the derivative is larger than 1 in norm outside two intervals (around the critical points) whose lengths are bounded by some small $\delta > 0$. Then the maximal invariant set $\Lambda_\sigma$ of $R_\infty(\sigma, \cdot)$ in the complement of these two intervals is hyperbolic and its thickness is of order $1/\delta$. Then, using the convergence theorems 11.2 and 11.4, and the continuous dependence of the thickness on the diffeomorphism [17], one gets that the diffeomorphism $\varphi_\mu$, $\mu = \upsilon_k(\sigma)$, has a hyperbolic set with stable thickness (transverse thickness of the stable foliation) of order $1/\delta$ for every large $k$.

This observation is at the origin of a result of Díaz and Ures [9] we are going to state next, saying that the unfolding of certain saddle–node cycles leads to an interval of persistence of tangencies immediately after the bifurcation (the interval is of the form $[0, \varepsilon_0]$ for some $\varepsilon_0 > 0$), even if the Hausdorff dimension of the limit set at the bifurcation is smaller than 1. However, the previous construction is not sufficient to prove such a result. One problem is that it proves the existence of thick horseshoes only for certain subintervals in the space of parameters $\mu$. Another, more serious, difficulty is that the hyperbolic sets one gets in this way might have very small unstable thickness, and so the gap lemma might not apply to them.

11.5.3 Thick horseshoes from saddle–node horseshoes

These difficulties can be bypassed for certain robust classes of arcs of diffeomorphisms unfolding a saddle–node horseshoe: one obtains hyperbolic sets with large product of stable and unstable thicknesses for all small values of the parameter $\mu$, even if the saddle–node horseshoe itself is thin.
As we have seen in section 11.3.3, in this situation ghost return maps $R_\infty(\sigma, \cdot)$ may be defined on convenient subintervals $I$ of the fundamental domain $D^+$. The end-points of $I$ correspond to points of the strong stable manifold of the saddle–node, and the norm of the derivative of $R_\infty(\sigma, \cdot)$ goes to infinity at the end-points; see figures 11.5 and 11.8. One proves that, in an open class of cases, the map $R_\infty(\sigma, \cdot)$ has a hyperbolic Cantor set $\Lambda_\sigma$ with large stable thickness, for every parameter $\sigma$. In fact, the stable thickness admits a lower bound $M$ that is of the order $1/|B|$ where $B$ is the smallest of the following intervals: the connected components of $D^+ \setminus I$ and an interval around the critical point outside of which the derivative is larger than 1. Assuming the gap of the initial horseshoe is big enough, we can take $I$ proportionally big in $D^+$, and then we can make $M$ as large as we like.

Next, one has to ensure that the unstable thickness remains bounded away from zero by some small constant that may be fixed independently of $M$. For this one argues that almost all (a subset with nearly the same thickness) of the initial saddle–node horseshoe persists, as a hyperbolic horseshoe, after the unfolding

Figure 11.8. Thick invariant Cantor sets for the maps $R_\infty(\sigma, \cdot)$.

of the saddle–node. This uses also the continuity of the thickness with the dynamics. Since the unstable thickness of the saddle–node horseshoe is positive, we conclude that the unstable thickness of the hyperbolic sub-horseshoe is bounded away from zero by some $m > 0$. Since $M$ and $m$ depend on the geometry of the saddle–node horseshoe in different directions (stable and unstable, respectively), we may indeed increase $M$ without reducing $m$, so that their product is larger than 1. This is a main ingredient in the proof of the following result.

Theorem 11.9 (Díaz and Ures [9]). For every $\varepsilon > 0$ there is an open set of arcs $(\varphi_\mu)_\mu$ unfolding at $\mu = 0$ a critical saddle–node horseshoe of Hausdorff dimension less than $1/2 + \varepsilon$ such that some $(0, \mu_0]$ is an interval of persistence of tangencies.

Let us observe that by [10] a saddle–node horseshoe always has Hausdorff dimension strictly bigger than 1/2.

Question 11.10. Is there a necessary and sufficient condition involving fractal dimensions of the saddle–node horseshoe $\Lambda_0$ guaranteeing the existence of an interval $J$ of the form $(0, \mu_0)$ of persistence of tangencies?

A corresponding question was originally asked by Palis and Takens [22, section 7], in the context of homoclinic bifurcations. As we explained, in that context the frequency of hyperbolicity is essentially determined by the Hausdorff dimension of the hyperbolic set associated with the tangency. Here, in view of the previous observations, a natural approach would be to consider not only the

dimension of the saddle–node horseshoe but also the dimensions of the hyperbolic sets of the circle maps $R_\infty(\sigma, \cdot)$.

Question 11.11. Does there exist a non-empty open subset of the space $O(M)$ of arcs $(\varphi_\mu)_\mu$ of diffeomorphisms unfolding generically a critical saddle–node 1-cycle, such that for any arc in this subset the diffeomorphisms $\varphi_\mu$ are non-hyperbolic for all small $\mu > 0$?

This final question should be related to the problem of the density of hyperbolic surface diffeomorphisms in the $C^1$ topology.

Acknowledgements

The authors are partially supported by CNPq 001/2000, Faperj, and Pronex-Dynamical Systems.

References

[1] Benedicks M and Carleson L 1991 The dynamics of the Hénon map Ann. Math. 133 73–169
[2] Bonatti C and Díaz L J 1999 Connexions hétéroclines et généricité d'une infinité de puits et de sources Ann. ENS 32 135–50
[3] Cao Y and Kiriki S 2000 An isolated saddle–node bifurcation occurring inside a horseshoe Dynam. Stab. Syst. 15 11–22
[4] Costa M J 1998 Saddle–node horseshoes giving rise to global Hénon-like attractors Anais Acad. Bras. Ciências 70 393–400
[5] Crovisier S 2000 Saddle–node bifurcations for hyperbolic sets Preprint Mathématiques: University Paris-Sud
[6] Díaz L J and Rios I L Critical saddle–node horseshoes: a geometrical model for intermittencies, in preparation
[7] Díaz L J and Rocha J 1997 Non-critical saddle–node cycles and robust nonhyperbolic dynamics Dynam. Stab. Syst. 12 109–35
[8] Díaz L J, Rocha J and Viana M 1996 Strange attractors in saddle–node cycles: prevalence and globality Inv. Math. 125 37–74
[9] Díaz L J and Ures R 2000 Critical saddle–node cycles: Hausdorff dimension and persistence of tangencies Preprint PUC-Rio
[10] Díaz L J and Viana M 1989 Discontinuity of Hausdorff dimension Ergod. Theor. Dynam. Syst. 9 403–25
[11] Hirsch M, Pugh C and Shub M 1977 Invariant Manifolds (Lecture Notes in Mathematics 583) (Berlin: Springer)
[12] Il'Yashenko Yu and Yakovenko S 1993 Nonlinear Stokes phenomena in smooth classification problems Adv. Sov. Math. 112 541–76
[13] Mora L and Viana M 1993 Abundance of strange attractors Acta Math. 171 125–50
[14] Moreira C G, Palis J and Viana M 2001 Homoclinic bifurcations and fractal invariants in arbitrary dimension Preprint IMPA
[15] Moreira C G and Yoccoz J C 2001 Stable intersections of regular Cantor sets with large Hausdorff dimensions Ann. Math. to appear
[16] Newhouse S 1974 Diffeomorphisms with infinitely many sinks Topology 13 9–18
[17] Newhouse S 1979 The abundance of wild hyperbolic sets and non-smooth stable set for diffeomorphisms Publ. Math. IHES 50 101–51
[18] Newhouse S and Palis J 1976 Cycles and bifurcation theory Astérisque 31 44–140
[19] Newhouse S, Palis J and Takens F 1983 Bifurcations and stability of families of diffeomorphisms Publ. Math. IHES 57 7–71
[20] Palis J and Takens F 1985 Cycles and measure of bifurcation sets for two-dimensional diffeomorphisms Inv. Math. 82 397–422
[21] Palis J and Takens F 1987 Hyperbolicity and the creation of homoclinic orbits Ann. Math. 125 337–74
[22] Palis J and Takens F 1993 Hyperbolicity and Sensitive Chaotic Dynamics at Homoclinic Bifurcations (Cambridge: Cambridge University Press)
[23] Palis J and Yoccoz J C 1994 Homoclinic tangencies for hyperbolic sets of large Hausdorff dimension Acta Math. 172 91–136
[24] Palis J and Viana M 1994 High dimension diffeomorphisms displaying infinitely many periodic attractors Ann. Math. 140 207–50
[25] Palis J and Yoccoz J-C 2001 Non-uniformly hyperbolic horseshoes unleashed by homoclinic tangencies C. R. Acad. Sci., Paris to appear
[26] Pomeau Y and Manneville P 1980 Intermittent transitions to turbulence in dissipative dynamical systems Commun. Math. Phys. 74 189–97
[27] Rios I L 2001 Unfolding homoclinic tangencies inside horseshoes: hyperbolicity, fractal dimensions and persistent tangencies Nonlinearity 14 431–62
[28] Romero N 1995 Persistence of homoclinic tangencies in higher dimensions Ergod. Theor. Dynam. Syst. 15 735–57
[29] Smale S 1967 Differentiable dynamical systems Bull. Am. Math. Soc. 73 747–817
[30] Takens F 1988 Intermittency: Global Aspects (Lecture Notes in Mathematics 1331) (Berlin: Springer) pp 213–39
[31] Viana M 1993 Strange attractors in higher dimensions Bull. Braz. Math. Soc. 24 13–62
[32] Williams R 1970 The DA maps of Smale and structural stability Global Analysis (Proc. Symp. Pure Math. 14) pp 329–34
[33] Yoccoz J-C 1984 Conjugaison différentiable des difféomorphismes du cercle dont le nombre de rotation vérifie une condition Diophantienne Ann. ENS 17 333–61
[34] Zeeman C 1981 Bifurcations, Catastrophe and Turbulence (Proc. Cent. Conf. Case Western Reserve Univ. Cleveland: New Directions in Applied Mathematics) ed P J Hilton and G S Young (Berlin: Springer) pp 109–53

Chapter 12 Homoclinic points in complex dynamical systems Robert L Devaney Boston University

One of the most influential papers that I ever read in my mathematical career was Floris Takens's paper entitled Homoclinic points in conservative systems [12]. In a very real sense I have returned over and over again to the topic addressed by Floris in this paper: under what generic conditions does a particular type of system admit homoclinic points? In several papers in the 1980s, I considered the situation of homoclinic orbits to equilibrium points of Hamiltonian flows [1, 2]. Later I worked on homoclinic points in the area-conserving Hénon map, a particular case of Takens's result [3]. Most recently, I have been looking at the existence and structure of homoclinic orbits in complex analytic dynamical systems. This is the topic of this paper. It is a pleasure to acknowledge in this paper the tremendous inspiration that Floris has provided for me over the years. Special thanks to Adrien Douady who guided me in most of this research.

More specifically, we will investigate the dynamics of a polynomial map of the complex plane near a saddle–node fixed point, i.e., a fixed point with multiplier 1. We will give specific conditions under which these maps admit homoclinic points. In the associated parameter space, we show that the particular saddle–node parameter value is then the limit of infinitely many 'baby' Mandelbrot sets. For parameter values that lie in these Mandelbrot sets, the corresponding maps admit invariant sets on which an iterate of the map is dynamically equivalent to a quadratic map of the form $Q_c(z) = z^2 + c$.

12.1 Baby Mandelbrot sets

As is well known, the Mandelbrot set is the parameter space for the quadratic family $Q_c(z) = z^2 + c$. As is also well known, this set features infinitely

Figure 12.1. Baby Mandelbrot sets accumulating on a cusp point in the Mandelbrot set.

many small copies of itself. Each of these 'baby' Mandelbrot sets admits a main cardioid for which the corresponding quadratic polynomials have attracting cycles of some period $n > 1$. At the cusp of these cardioids, this periodic point is a neutral fixed point for $Q_c^n$ which satisfies the hypotheses of the theorem below. Hence, we expect to see infinitely many baby Mandelbrot sets accumulating on this cusp. This is indeed the case, as is shown in figure 12.1. We remark that the 'cauliflower-like' structure that surrounds these copies of the Mandelbrot sets has been studied in [4]. We also remark that this result is a special case of a theorem of McMullen that asserts that baby Mandelbrot sets are dense in the boundary of the full Mandelbrot set [7].
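As a small numerical companion to this description, the sketch below tests membership in the Mandelbrot set by the usual escape-time criterion for the critical orbit of $Q_c(z) = z^2 + c$, and checks that a parameter near the centre of the period-3 baby Mandelbrot set on the real axis has a (nearly) periodic critical orbit. The iteration cap, the escape radius 2 and the numerical value $c \approx -1.7548776662$ (the real root of $c^3 + 2c^2 + c + 1 = 0$) are choices made for this illustration, and a finite iteration cap can only suggest, not prove, boundedness.

```python
def critical_orbit(c, n):
    """Return Q_c^n(0) for Q_c(z) = z^2 + c."""
    z = 0j
    for _ in range(n):
        z = z * z + c
    return z


def in_mandelbrot(c, max_iter=500, radius=2.0):
    """Escape-time test: True if the critical orbit stays within `radius`
    for `max_iter` iterations (numerical evidence that c is in M)."""
    z = 0j
    for _ in range(max_iter):
        z = z * z + c
        if abs(z) > radius:
            return False
    return True


if __name__ == "__main__":
    # c = 1/4 is the cusp of the main cardioid; c slightly larger escapes.
    for c in (0, -1, 0.25, 0.26, 1):
        print(c, in_mandelbrot(c))
    # Assumed approximate centre of the period-3 copy on the real axis.
    c3 = -1.7548776662
    print(abs(critical_orbit(c3, 3)))
```

The parameter $c = 1/4$ is itself a saddle–node parameter: the critical orbit creeps towards the neutral fixed point $1/2$ and stays bounded, while for $c$ slightly larger than $1/4$ it eventually escapes.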

12.2 The complex saddle–node

Our goal in this section is to investigate the dynamical behavior of complex analytic maps near a saddle–node bifurcation point. To be specific, we consider a family of complex analytic maps given by $P_c$ which depends analytically on the complex parameter $c$. We assume that the degree of $P_c$ is three or more. When $c = 0$, we assume that this map has a fixed point at $z_0$ with multiplier 1. We will show below that, in many cases, there is a homoclinic orbit associated with $z_0$, i.e., an orbit that tends to $z_0$ under forward and backward iterations of $P_0$. We make two global assumptions about the dynamics of $P_0$. Our first assumption is that the homoclinic orbit is non-degenerate (defined below). Our

Figure 12.2. The graphs of $P_0$ and $P_c$.

second assumption is that $P_0$ admits a unique critical point in the immediate basin of attraction of $z_0$, and that this critical point is of order two. Our main result is then as follows.

Theorem 12.1. Under the above conditions, the parameter space for $P_c$ (the $c$-plane) admits infinitely many subsets $M_j$, $j > J$, each of which is homeomorphic to the standard Mandelbrot set $M$ via a map $\varphi_j : M_j \to M$. The $M_j$ converge in parameter space to $c = 0$ as $j \to \infty$. Moreover, for each $c \in M_j$, there is a subset $\Lambda_c$ of the Julia set of $P_c$ on which some iterate of $P_c$ is topologically conjugate to the map $z \mapsto z^2 + \varphi_j(c)$ on its Julia set.

In the case of a real polynomial, our assumptions are equivalent to the assumption that the graph of $P_0$ and one nearby $P_c$ are as depicted in figure 12.2. This figure shows that $P_0$ admits a point whose orbit is both forward and backward asymptotic to the indifferent fixed point. This is the homoclinic orbit. For $P_c$, the indifferent fixed point has disappeared, as has its basin of attraction. This allows orbits which previously tended to the fixed point to escape but then return over and over to a neighborhood of the fixed point. The resulting cyclic motion of orbits has been termed intermittency by Pomeau and Manneville [9]. In this regard, our results can be interpreted as a description of the dynamical motions possible near intermittency.

Homoclinic points play a central role in smooth (non-analytic) dynamics. As is well known, the existence of a transverse homoclinic point for a map implies the existence of complicated orbit structure nearby (a Smale horseshoe; see [10]). An important question in dynamics is how this complicated behavior arises as a parameter is varied and a homoclinic point is 'born'. The results in this section give a partial answer to this question in the special case of a complex analytic map where the homoclinic point is associated with a neutral fixed point.
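The slow passage past the 'ghost' of the vanished fixed point can be observed in a toy computation. The sketch below uses the real quadratic model $x \mapsto x + x^2 + c$ with small $c > 0$, which is only the leading part of a saddle–node normal form and is an assumption of this illustration (the maps $P_c$ of the text have degree at least three). It counts the iterations an orbit needs to cross the window from $-0.5$ to $0.5$; the window endpoints and the function name are likewise hypothetical choices.

```python
from math import sqrt


def passage_time(c, start=-0.5, stop=0.5, max_iter=10**7):
    """Number of iterations of x -> x + x^2 + c needed to go from
    `start` to beyond `stop`, for a small parameter c > 0."""
    x = start
    for n in range(1, max_iter + 1):
        x = x + x * x + c
        if x > stop:
            return n
    raise RuntimeError("orbit did not leave the channel; is c > 0?")


if __name__ == "__main__":
    # Dividing c by 4 should roughly double the passage time,
    # i.e. the time spent in the channel grows like 1/sqrt(c).
    for c in (1e-3, 2.5e-4, 6.25e-5):
        t = passage_time(c)
        print(c, t, t * sqrt(c))
```

In this model the product of the passage time and $\sqrt{c}$ stays roughly constant as $c \to 0$, which is the long laminar phase characteristic of intermittency.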

Figure 12.3. Ecalle cylinders for $P_0$.

12.3 Dynamics of $P_0$

We may assume that the family of maps assumes the form $P_c(z) = z + c + z^2 + \cdots$. Thus $P_0(0) = 0$, $P_0'(0) = 1$, and $P_0''(0) \neq 0$. The local dynamical behavior of $P_0$ near 0 is well understood. Since $P_0'(0) = 1$, there is a neighborhood $O$ of 0 in which the local inverse $P_0^{-1}$ is well-defined. Since $P_0''(0) \neq 0$, there exist open disks $D_-$ and $D_+$ in $O$ which contain 0 in their boundary and satisfy

$$P_0(D_-) \subset D_- \quad \text{and} \quad P_0^{-1}(D_+) \subset D_+.$$

Moreover, each point in $D_-$ has forward orbit which is asymptotic to 0, while each point in $D_+$ has forward orbit under $P_0^{-1}$ which is asymptotic to 0. The disks $D_-$ and $D_+$ may be chosen small enough so that $D_- - P_0(D_-)$ and $D_+ - P_0^{-1}(D_+)$ are fundamental domains for the dynamics of $P_0$ near 0. Using the map $P_0$, we may glue together the edges of $D_- - \{P_0(D_-) \cup \{0\}\}$ to form a cylinder which we denote by $C_0^-$. Using $P_0^{-1}$, we may similarly construct $C_0^+$. $C_0^-$ and $C_0^+$ are called Ecalle cylinders [5, 11]; see figure 12.3.

Let $B_0$ denote the immediate basin of attraction of 0 for $P_0$. It is known that $B_0$ is an open disk containing 0 in its boundary. It is also known that the boundary $\partial B_0$ of $B_0$ lies in the Julia set of $P_0$ and that $\partial B_0$ is invariant under $P_0$. $B_0$ must contain at least one of the critical points of $P_0$. We will make the following simplifying assumption throughout.

Hypothesis A. $\overline{B}_0$ contains no asymptotic values of $P_0$ and exactly one critical point $z_0$ of $P_0$ which satisfies $P_0''(z_0) \neq 0$. Moreover, all other critical or asymptotic values of $P_0$ are attracted to an attracting or a parabolic cycle.

The second part of hypothesis A is included mainly for convenience and can be weakened significantly. From hypothesis A, it follows that $P_0 | B_0$ is two-to-one, except at the critical value $P_0(z_0)$, which has only one preimage. Moreover, both $z_0$ and $P_0(z_0)$ lie in $B_0$, not on $\partial B_0$.
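The local picture near the neutral fixed point described above can be checked numerically on the truncated model $P_0(z) = z + z^2$. Dropping the higher-order terms is an assumption of this sketch, as are the starting points and the function name. Points slightly to the left of 0 play the role of points in $D_-$: their forward orbits converge to 0, but only at the slow rate $|z_n| \sim 1/n$ typical of a multiplier-1 fixed point, whereas points slightly to the right are pushed away.

```python
def iterate_p0(z, n):
    """Apply the model map P_0(z) = z + z^2 to z, n times."""
    for _ in range(n):
        z = z + z * z
    return z


if __name__ == "__main__":
    # Attracting side: slow (non-geometric) convergence to the fixed point 0.
    for n in (10, 100, 1000):
        z = iterate_p0(-0.1, n)
        print(n, z, n * abs(z))
    # A complex starting point in the attracting region behaves the same way.
    print(abs(iterate_p0(-0.1 + 0.05j, 1000)))
    # Repelling side: the orbit moves away from 0.
    print(iterate_p0(0.1, 5))
```

The quantity $n\,|z_n|$ approaching a constant, rather than $|z_n|$ decaying geometrically, is what distinguishes this saddle–node (parabolic) behavior from that of an attracting fixed point.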

As a consequence of hypothesis A, it also follows that $P_0 | \partial B_0$ is expanding. This follows in the polynomial case from [5] (part 2, X proposition 3). In the case of an entire map of finite type, this follows from results in [8]. Hence, we may choose an open set $B$ which contains $\overline{B}_0$ in its interior and which satisfies $P_0(B) \supset B$. The set $B$ is called an overflowing neighborhood of the immediate basin $B_0$.

12.4 Homoclinic points

In this section we make a second assumption, more global in nature, about the dynamics of $P_0$. A point $w \in D_+ - \overline{B}_0$ is called a homoclinic point if there exists $N > 0$ such that $P_0^N(w) \in B_0$. The orbit of $w$ is therefore both forward and backward asymptotic to 0, with the backward orbit constructed using $P_0^{-1}$.

Proposition 12.2. There exist homoclinic points for $P_0$.

Proof. Recall that $P_0''(0) \neq 0$. Since $P_0$ has degree three or higher, it follows that there is a non-zero preimage of 0, say $z_0$. By hypothesis A, $z_0 \notin \partial B_0$. Since 0 belongs to the Julia set of $P_0$, it follows that $z_0$ also belongs to the Julia set. Therefore, given any neighborhood $W$ of 0, we may find an iterate $n$ for which $z_0 \in P_0^n(W)$. Now let $V$ be a small neighborhood of $z_0$ such that $V \subset P_0^n(W)$. We therefore have that $P_0^{-n}(V)$ lies in $D_+ - B_0$ provided $W$ is chosen small enough. On the other hand, $P_0(V)$ meets $B_0$ since $P_0(z_0) = 0$. It follows that there are points in $P_0^{-n}(V)$ that are homoclinic points.

Remark 12.3. The assumption that $P_0$ has degree at least three is important here, since the quadratic map $Q(z) = z^2 + 1/4$ has a saddle–node fixed point at $z_0 = 1/2$, but this fixed point admits no homoclinic orbits. Similarly, there are other 'homoclinic-less' neutral fixed points for higher degree polynomials, but these all violate hypothesis A.

Let us assume that the homoclinic point $w \in D_+$ satisfies $P_0^i(w) \notin B_0 \cup D_+$ for $1 \le i < N$, but $P_0^N(w) \in B_0$. Then there is an open connected set $U$ containing $w$ and having the property that $P_0^N(U) = B$, where $B$ is the overflowing neighborhood of $B_0$ constructed above. We say that $w$ is a non-degenerate homoclinic point if $P_0^N : U \to B$ is an isomorphism; see figure 12.4. We remark that it is entirely possible for some of the $P_0^i(U)$, $1 \le i < N$, to contain critical points of $P_0$, in which case $P_0^N | U$ would not be one-to-one. In many cases, however, this assumption may be readily verified.

Our second main assumption about $P_0$ is as follows.

Hypothesis B. $P_0$ admits a non-degenerate homoclinic point.

Figure 12.4. Homoclinic orbits for $P_0$.

By adjusting $D_+$, we may assume that $U \subset D_+ - P_0^{-1}(D_+)$. Hence it follows that $U_{-k} = P_0^{-k}(U)$ are disjoint open sets in $D_+ - B_0$ which tend to 0 as $k \to \infty$.

12.5 Ecalle cylinders for $P_c$

We may also erect Ecalle cylinders for $P_c$. By the results of [6], there is a wedge-shaped region $R$ in the $c$-plane such that, if $c \in R$, then $P_c$ has a pair of repelling fixed points which we denote by $p_-(c)$ and $p_+(c)$. Moreover, we may choose Ecalle cylinders $C_c^-$ and $C_c^+$ for each $c \in R$ with vertices at $p_-(c)$ and $p_+(c)$. The $C_c^\pm$ may be chosen so that their boundaries depend continuously on $c$ and tend to the boundaries of $C_0^\pm$ as $c \to 0$ in $R$; see figure 12.5. Let $z_c$ denote the critical point of $P_c$ in $B$. Without loss of generality, we may assume that the critical value $P_c(z_c) \in C_c^-$. As above, we may use the maps $P_c$ and $P_c^{-1}$ to glue together the boundaries of $C_c^-$ and $C_c^+$. We choose the critical value as basepoint in $C_c^-$ and any point in $C_c^+$ as basepoint. Then there are isomorphisms

$$\pi_c^- : C_c^- \to \mathbb{C}/\mathbb{Z}, \qquad \pi_c^+ : C_c^+ \to \mathbb{C}/\mathbb{Z}$$

which take basepoints to 0. Unlike the case when $c = 0$, there is a well-defined transit map $\varphi_c : C_c^- \to C_c^+$ defined by $\varphi_c(z) = P_c^k(z)$

Figure 12.5. Ecalle cylinders for $P_c$.

where $k$ is the smallest positive integer for which $P_c^k(z) \in C_c^+$. Note that $\varphi_c$ may be discontinuous. However, the projection of this map to $\mathbb{C}/\mathbb{Z}$ given by $\Phi_c$, where

$$\Phi_c \circ \pi_c^-(z) = \pi_c^+ \circ \varphi_c(z),$$

is an isomorphism. For $c \in R$, there is also defined a map

$$\Psi : R \to \mathbb{C}/\mathbb{Z}$$

given by

$$\Psi(c) = \Phi_c(\pi_c^-(P_c(z_c))) = \Phi_c(0).$$

The map $\Psi$ determines the image of the critical value in $C_c^+$ after it makes its transit between $C_c^-$ and $C_c^+$. Note that $\Psi(0)$ is not defined. However, according to [6] (theorem 16.11) $\Psi$ is given asymptotically by

$$\Psi(c) = \omega_0 + \frac{-2\pi}{\sqrt{c}} + o(1)$$

for some constant $\omega_0$. Note that $\Psi$ therefore wraps the wedge $R$ infinitely often around $\mathbb{C}/\mathbb{Z}$ as $c \to 0$ in $R$. In particular, if $\tilde{U}$ is an open disk with compact closure in $\mathbb{C}/\mathbb{Z}$, then $\Psi^{-1}(\tilde{U})$ consists of infinitely many disjoint components $V_j$ converging to 0 in $R$ and on which $\Psi$ induces an isomorphism $V_j \to \tilde{U}$.

12.6 Polynomial-like maps In this section we recall some of the main results of [6]. Let D denote the closed disk. Suppose, for each λ ∈ D, there exist:

(i) Open disks $U'_\lambda$, $U_\lambda$ depending continuously on $\lambda$ and satisfying $\overline{U'_\lambda} \subset U_\lambda$.
(ii) An analytic family of maps $F_\lambda : U'_\lambda \to U_\lambda$ depending analytically on $\lambda$ with the property that $F_\lambda : U'_\lambda \to U_\lambda$ is of degree two. Any map with this property is said to be polynomial-like of degree two.
(iii) For each $F_\lambda$, there is a unique critical point $z_\lambda \in U'_\lambda$. For $\lambda \in \partial D$, we assume that the map $\lambda \mapsto F_\lambda(z_\lambda)$ describes a curve in $U_\lambda - U'_\lambda$ which has winding number one with respect to each $z \in U'_\lambda$. A family of maps with this property is said to have parametric degree one.

Any family of maps satisfying (i)–(iii) is called a family of polynomial-like maps of degree two with parametric degree one. A major result in [6] (their theorem 4) asserts that such a family admits a subset $M \subset D$ which is homeomorphic to the standard Mandelbrot set via a map $c = c(\lambda)$. Moreover, for each $\lambda \in M$, $F_\lambda | U'_\lambda$ is topologically conjugate to $z \mapsto z^2 + c(\lambda)$ on the filled Julia sets of each. That is, the dynamics of $F_\lambda$ are equivalent to those of one of the quadratic maps $z \mapsto z^2 + c$ on $U'_\lambda$. Furthermore, this result asserts that all possible quadratic dynamical behavior occurs in the family $F_\lambda$.

12.7 Proof of theorem 12.1

Our goal in this section is to combine the results about Ecalle cylinders and parametrized families of analytic maps to prove that the maps $P_c$ admit infinitely many copies of the Mandelbrot set $M_j$, $j > J$, in the region $R$ in the $c$-plane.

Recall that there is an open set $U \subset D_+ - P_0^{-1}(D_+)$ with the property that $P_0^N : U \to B$ is an isomorphism where $B$ is an overflowing neighborhood of the immediate basin $B_0$ of 0. This follows from hypothesis B. In particular, $P_0(B) \supset \overline{B}_0$. For $c$ small enough, it follows that $P_c^N : U \to \mathbb{C}$ is also an isomorphism. Moreover, if $U_c$ is an open set sufficiently close to $U$, then $P_c^N | U_c$ is also an isomorphism, provided $c$ is close enough to 0.

For each $j \in \mathbb{Z}^+$ sufficiently large, we will determine a subset $V_j$ of $R$ such that the family of maps $P_c^{j+N+1}$ for $c \in V_j$ is a family of polynomial-like maps of degree two with parametric degree one. Each of the $V_j$ will be disjoint, and the $V_j \to 0$ as $j \to \infty$.

To construct $V_j$, recall that $U \subset C_0^+ - B_0$ is an open disk on which $P_0^N$ is an isomorphism carrying $U$ onto $B$. Let $\tilde{U} = \pi_0^+(U)$. $\tilde{U}$ is a disk in $\mathbb{C}/\mathbb{Z}$. By our remarks at the end of section 12.5, the preimage $\Psi^{-1}(\tilde{U})$ of $\tilde{U}$ under the map $\Psi : R \to \mathbb{C}/\mathbb{Z}$ defined there consists of infinitely many components. These are the $V_j$. We may assume that the index $j$ is chosen so that $c \in V_j$ implies that $P_c^{j+1}(z_c) \in C_c^+$, i.e., that $j$ gives the 'time' of transit of the critical point from $C_c^-$ to $C_c^+$.

Let $U_c = (\pi_c^+)^{-1}(\tilde{U})$. For each $c$ sufficiently small, $P_c^N$ is an isomorphism which maps $U_c$ onto an overflowing neighborhood of $B$. Let $W_c = P_c^{-j}(U_c)$. Since $c \in V_j$, it follows that $W_c$ contains the critical value of $P_c$. Now $P_c^j : W_c \to U_c$ is an isomorphism. Let $\tilde{W}_c$ denote the preimage of $W_c$ containing $z_c$. $P_c | \tilde{W}_c$ is a degree two map onto $W_c$ by hypothesis A. Note that, if $c$ is sufficiently small, $\tilde{W}_c \subset B$. Therefore we have the fact that

$$P_c^{j+1+N} : \tilde{W}_c \to B$$

is a polynomial-like map of degree two, if $c \in V_j$.

It remains to show that this family has parametric degree one. For this we first observe that $\Psi : V_j \to \tilde{U}$ is an isomorphism. So $\Psi$ maps $\partial V_j$ to $\partial \tilde{U}$ with winding number one relative to the interior of $\tilde{U}$. If we lift this curve via $(\pi_c^+)^{-1}$, the result is still a degree one curve $c \mapsto (\pi_c^+)^{-1} \circ \Psi(c)$ which is close to the boundary of $U$. Since $P_c^N$ is $C^0$-close to $P_0^N$, it follows that, for $c \in \partial V_j$, $P_c^{j+1+N}(z_c)$ is a curve which wraps once around $B_0$. This completes the proof.

Remark 12.4. Much more can be said about the small copies of $M$: each is 'encaged' in 'cauliflowers'. These are collections of Cantor-like sets that nest down to $M$. For more details, we refer to [4].

References

[1] Devaney R L 1976 Homoclinic orbits in Hamiltonian systems J. Diff. Eqns 21 431–8
[2] Devaney R L 1978 Transversal homoclinic orbits in an integrable system Am. J. Math. 100 631–42
[3] Devaney R L 1984 Homoclinic bifurcations and the area-conserving Hénon map J. Diff. Eqns 51 254–66
[4] Douady A, Buff X, Devaney R L and Sentenac P 2000 Baby Mandelbrot sets are born in cauliflowers The Mandelbrot Set: Theme and Variations (London Mathematical Society Lecture Notes 274) ed Tan Lei (Cambridge: Cambridge University Press) pp 19–36
[5] Douady A and Hubbard J 1982 Itération des polynômes quadratiques complexes C. R. Acad. Sci., Paris 29 123–6
[6] Douady A and Hubbard J 1985 On the dynamics of polynomial-like mappings Ann. Sci. Éc. Norm. Sup. 4e Séries 18 287–343
[7] McMullen C 2000 The Mandelbrot set is universal The Mandelbrot Set: Theme and Variations (London Mathematical Society Lecture Notes 274) ed Tan Lei (Cambridge: Cambridge University Press) pp 1–18
[8] McMullen C 1987 Area and Hausdorff dimension of Julia sets of entire functions Trans. Am. Math. Soc. 300 329–42
[9] Pomeau Y and Manneville P 1980 Intermittent transition to turbulence in dissipative dynamical systems Commun. Math. Phys. 74 189–97
[10] Smale S 1964 Diffeomorphisms with many periodic points Differential and Combinatorial Topology (Princeton, NJ: Princeton University Press) pp 63–80
[11] Shishikura M 2000 Bifurcation of parabolic fixed points The Mandelbrot Set: Theme and Variations (London Mathematical Society Lecture Notes 274) ed Tan Lei (Cambridge: Cambridge University Press) pp 325–64
[12] Takens F 1972 Homoclinic points in conservative systems Inv. Math. 18 267–92

Chapter 13

Excitation of elliptic normal modes of invariant tori in volume preserving flows

Mikhail B Sevryuk
Russian Academy of Sciences

To Professor Floris Takens on the occasion of his sixtieth birthday.

One of the most important achievements in the theory of dynamical systems in the twentieth century is the Kolmogorov–Arnol’d–Moser (KAM) theory named after its founders A N Kolmogorov [25], V I Arnol’d [2, 3, 4], and J Moser [27]. This theory studies quasi-periodic motions in flows and diffeomorphisms. The main ‘informal’ conclusion of KAM theory is that quasi-periodic motions (and invariant tori carrying these motions) constitute one of the ‘typical’ kinds of motion in dynamical systems and often occur in large quantities.

For instance, consider a completely integrable analytic Hamiltonian system with $n > 1$ degrees of freedom whose phase space is foliated into invariant $n$-tori $\{I = \mathrm{const}\}$ carrying conditionally periodic motions $\dot\varphi = \omega(I)$, where $(I, \varphi)$ are the action–angle variables. It turns out that, if the frequency map $I \mapsto \omega(I)$ is a local diffeomorphism, then most of the tori $\{I = \mathrm{const}\}$ are not destroyed under small analytic Hamiltonian perturbations of the original system but only undergo a slight deformation (Kolmogorov’s theorem [25, 2]). More precisely, if the frequency vector $\omega^0 = \omega(I^0)$ is Diophantine, that is, there exist $Q > 0$ and $\gamma > 0$ such that $|\langle k, \omega^0 \rangle| \ge \gamma |k|^{-Q}$ for each $k \in \mathbb{Z}^n \setminus \{0\}$, then the torus $\{I = I^0\}$ survives sufficiently small perturbations of the system, and the perturbed torus again carries quasi-periodic motions with the same frequency vector $\omega^0$. (The angle brackets denote the standard inner product of two vectors in $\mathbb{R}^n$, and $|k| = |k_1| + \cdots + |k_n|$.) The subsequent development of the theory has shown that the analyticity condition imposed on the unperturbed and perturbed Hamilton functions, the nondegeneracy condition $\det(\partial\omega/\partial I) \ne 0$ imposed on the frequency map, and the Diophantine condition imposed on the individual frequency vector $\omega^0$ can be relaxed greatly; see e.g. [11, 31] for relevant theorems and references.

Invariant $n$-tori filled with quasi-periodic motions are not isolated but organized generically into Cantor-like families (for $n \ge 2$ in the case of flows and for $n \ge 1$ in the case of diffeomorphisms). Even if such a torus is isolated in the phase space of the individual system, the adequate statement of the problem is sure to involve external parameters, and the tori carrying quasi-periodic motions are not isolated in the product of the phase space and the parameter space. The families of tori are smooth in the sense of Whitney; see [11, 28, 23, 12, 26, 9, 10] and references therein. A system can admit many Cantor families of invariant tori of various dimensions, and the mutual arrangement of these families is often rather complicated.

The properties of Whitney-smooth families of quasi-periodic motions in dynamical systems strongly depend on the phase space structures the system is assumed to preserve. Up to now, such families have been explored for general (‘dissipative’) systems, volume preserving systems, Hamiltonian systems (i.e. Hamiltonian flows and symplectic mappings), and reversible systems; see [11, 23, 10] where all these four ‘contexts’ are treated from a unified viewpoint. Although most studies in KAM theory have been devoted to the Hamiltonian context, ample information on quasi-periodic motions for other contexts has been obtained as well. For instance, invariant tori filled with quasi-periodic motions in multidimensional volume preserving flows (i.e. with the phase space of dimension greater than two) were constructed in [11, 23, 12, 10, 19], and those in volume preserving diffeomorphisms in [19, 16, 44, 42, 17]. KAM-type results for multidimensional mappings possessing the so-called intersection property (which is, in fact, a relaxed version of volume preservation) were obtained in [43, 18].
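The Diophantine condition $|\langle k, \omega^0 \rangle| \ge \gamma |k|^{-Q}$ from the introduction quantifies over all nonzero integer vectors, so it cannot be verified by finite computation; what can be done is to test it for all $k$ with $|k| \le K$. The following is a minimal sketch of such a truncated check. The function name and the cut-off `K` are illustrative assumptions, not notation from the text, and a `True` result is only evidence up to order `K`.

```python
from itertools import product
from math import sqrt


def satisfies_diophantine_up_to(omega, gamma, Q, K):
    """Check |<k, omega>| >= gamma * |k|^(-Q) for all integer k with 1 <= |k| <= K.

    |k| is the l1-norm |k_1| + ... + |k_n|, as in the text. Hypothetical
    helper: it scans a finite box and therefore proves nothing about
    larger k.
    """
    n = len(omega)
    for k in product(range(-K, K + 1), repeat=n):
        norm1 = sum(abs(kj) for kj in k)
        if norm1 == 0 or norm1 > K:
            continue
        inner = sum(kj * wj for kj, wj in zip(k, omega))
        if abs(inner) < gamma * norm1 ** (-Q):
            return False
    return True


if __name__ == "__main__":
    golden = (1 + sqrt(5)) / 2
    print(satisfies_diophantine_up_to((1.0, golden), gamma=0.3, Q=1, K=30))
    print(satisfies_diophantine_up_to((1.0, 2.0), gamma=0.3, Q=1, K=5))
```

The second call fails because $(1, 2)$ is resonant: $k = (2, -1)$ gives $\langle k, \omega \rangle = 0$.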

13.1 Background on elliptic normal modes

One of the aspects of KAM theory that has attracted much attention in the last decade is the so-called excitation of elliptic normal modes of invariant tori. The general description of this phenomenon (for the case of flows, to be definite) is as follows. Let the unperturbed system (possibly depending on parameters) have an invariant $M$-dimensional submanifold $\mathcal{M}$ smoothly foliated into invariant $n$-tori carrying conditionally periodic motions, $n > 1$. The case $M = n$ where the surface $\mathcal{M}$ consists of a single torus is not excluded. The usual problem of KAM theory in this set-up is to prove that, under appropriate nondegeneracy and nonresonance conditions, any sufficiently small perturbation of the system admits many invariant $n$-tori near $\mathcal{M}$ carrying quasi-periodic motions; see the book [11] and references therein.

Assume now that the variational equation along each unperturbed $n$-torus has constant coefficients. (Then 0 is an eigenvalue of the coefficient matrix of the variational equation of multiplicity at least $M - n$ provided that $n < M$.) Suppose that for each unperturbed $n$-torus the coefficient matrix of the variational equation possesses $\nu \ge 1$ pairs of purely imaginary eigenvalues. It turns out that in this case, under suitable nondegeneracy and nonresonance conditions, the unperturbed system itself and all its sufficiently small perturbations admit many invariant tori near $\mathcal{M}$ of dimensions $n + 1, \ldots, n + \nu$. These tori carry quasi-periodic motions. This situation is called excitation of elliptic normal modes. Indeed, using physical terminology, one says that the elliptic normal modes of the unperturbed $n$-tori are excited. The cases $n = 0$ (invariant tori near equilibria) and $n = 1$ (invariant tori near periodic trajectories) are usually not considered within this scheme because they are much simpler than the case $n > 1$ and well known.

Up to now, the excitation of elliptic normal modes of invariant tori has been examined for Hamiltonian and reversible systems only. In the Hamiltonian setting, the operator $\Omega$ of the variational equation $\dot\xi = \Omega\xi$ along a torus has an invariant subspace where this operator is Hamiltonian. In the reversible setting, the operator $\Omega$ is infinitesimally reversible. For Hamiltonian or infinitesimally reversible matrices, having purely imaginary eigenvalues is one of the typical, structurally stable possibilities; see e.g. [22].

Invariant tori of dimension $n' > n$ around invariant $n$-tori in Hamiltonian systems with $n'$ degrees of freedom were first studied by V I Arnol’d [3, 1], and in Hamiltonian systems with an arbitrary number $n'' \ge n'$ of degrees of freedom, by A D Bruno [14, 15]. In the nineties, the excitation of elliptic normal modes of invariant tori in Hamiltonian flows was explored in detail by H W Broer, G B Huitema, and M B Sevryuk [11, 36] and by À Jorba and J Villanueva [40, 24]. In [40, 24] the Hamiltonian system in question is not assumed to be close to an ‘unperturbed’ system with a smooth family of invariant $n$-tori. In contrast, Jorba and Villanueva start with a single torus in the phase space and show that, under appropriate conditions, there are Cantor families of other invariant tori of various dimensions near this torus. The approach of [11, 36] and that of [40, 24] were compared in the survey [37].

The excitation of elliptic normal modes of invariant $n$-tori in reversible systems for $n \ge 2$ was first conjectured in [32]. Rigorous theorems were obtained for reversible flows in [11, 33, 35] (see also the survey [38]) and for reversible diffeomorphisms in [35, 34, 29], again via various techniques.

The excitation of elliptic normal modes of invariant tori in dissipative systems is impossible because a real matrix without any particular symmetry properties generically does not possess purely imaginary eigenvalues.

What can one say about the remaining, volume preserving, context? The coefficient matrix $\Omega$ of the variational equation $\dot\xi = \Omega\xi$ along an invariant torus of a volume preserving flow is of trace zero. Let $p$ be the phase space codimension of the torus, so that $\Omega \in sl(p, \mathbb{R})$. If $p = 1$ then $\Omega = 0$. This is a heuristic explanation of the fact that invariant tori (carrying quasi-periodic motions) of codimension one in volume preserving systems are usually organized into Whitney-smooth one-parameter families [11, 23, 12, 10, 44, 42], the Lebesgue measure of the union of the tori being positive. If $p \ge 2$ then $\Omega$ is generically nondegenerate. Invariant tori (carrying quasi-periodic motions) of codimension $p \ge 2$ in volume preserving systems are, therefore, usually isolated in the phase space. To enable such tori to exist the system should depend on an at least one-dimensional external parameter [11, 23, 12, 10]. If $p \ge 3$ then the matrix $\Omega$ generically possesses no purely imaginary eigenvalues. But for $2 \times 2$ matrices $\Omega$ of trace zero, a purely imaginary spectrum $\pm i\varepsilon$ ($\varepsilon > 0$) is one of the two typical, structurally stable possibilities. (The other possibility is a real spectrum $\pm\delta$, $\delta > 0$.) Thus, one may conjecture that codimension-two invariant tori in volume preserving systems can exhibit the excitation of the elliptic normal mode.
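The dichotomy for $2 \times 2$ matrices of trace zero can be made concrete: the characteristic polynomial of such a matrix is $\lambda^2 + \det$, so the spectrum is $\pm i\sqrt{\det}$ when the determinant is positive and $\pm\sqrt{-\det}$ when it is negative. The following minimal sketch classifies a traceless real matrix accordingly; the function name and the parametrization by three entries are illustrative assumptions, not notation from the text.

```python
def classify_sl2(a, b, c):
    """Classify the traceless real matrix [[a, b], [c, -a]] by its spectrum.

    Returns (label, value): ("elliptic", eps) for eigenvalues +-i*eps,
    ("hyperbolic", delta) for eigenvalues +-delta, or ("degenerate", 0.0)
    when the determinant vanishes. Hypothetical helper for illustration.
    """
    det = -a * a - b * c  # determinant of [[a, b], [c, -a]]
    if det > 0:
        return "elliptic", det ** 0.5
    if det < 0:
        return "hyperbolic", (-det) ** 0.5
    return "degenerate", 0.0


if __name__ == "__main__":
    print(classify_sl2(0, -2, 2))  # a rotation generator
    print(classify_sl2(1, 0, 0))   # a saddle
```

Both nondegenerate cases are open conditions on the entries, which is the sense in which each is structurally stable; only the determinant-zero case is destroyed by an arbitrarily small traceless perturbation.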

13.2 Aim and notation

The aim of this paper is to formulate and prove a precise theorem showing that this conjecture is true. It is a special pleasure for me to publish this paper in this Liber Amicorum devoted to the 60th birthday of Professor Takens who has made, together with his school, so versatile and profound a contribution (see e.g. [11, 23, 12, 10, 39, 6, 7, 8, 5, 13]) to the theory of volume preserving flows and mappings!

Our approach is very similar to that exploited in [11, 36] for the Hamiltonian context and in [11, 35] for the reversible context. The essence of this approach is to reduce the mode excitation problem to the usual persistence problem, but with very weak nondegeneracy conditions. We confine ourselves to the case of flows, although analogous statements hold for volume preserving diffeomorphisms. All the vector fields and their dependence on external parameters are assumed to be analytic, although the results carry over to the $C^\infty$ and finitely differentiable categories.

Recall some basic definitions. Let $\tau$ be a volume element (i.e. an everywhere nondegenerate differential $K$-form) on an orientable $K$-dimensional manifold $\mathcal{K}$. Then the divergence $\operatorname{div} X$ of a vector field $X$ on $\mathcal{K}$ is a real-valued function on $\mathcal{K}$ defined as

$$d(i_X \tau) = (\operatorname{div} X)\,\tau.$$

Here $i_X \tau$ is the $(K-1)$-form whose value at the vectors $X_1, \ldots, X_{K-1}$ is equal to the value of $\tau$ at the vectors $X, X_1, \ldots, X_{K-1}$. Volume preserving (or divergence-free) vector fields $X$ are those for which $\operatorname{div} X \equiv 0$, i.e. the form $i_X \tau$ is closed. A divergence-free vector field $X$ is said to be globally divergence free if the form $i_X \tau$ is not only closed but also exact. The vector fields under study in the volume preserving context of KAM theory are, in fact, globally divergence free rather than merely divergence free. For $K = 2$ volume elements are just symplectic structures and globally divergence-free vector fields are just Hamiltonian vector fields.

The angle brackets $\langle a, b \rangle$ will always designate the standard inner product of two vectors $a$ and $b$. By $|a|$ and $\|a\|$ we denote the $l_1$-norm and the $l_2$-norm of a vector $a$, respectively, that is, $|a| = \sum_j |a_j|$ and $\|a\|^2 = \sum_j |a_j|^2$. The symbols $\mathbb{N}$ and $\mathbb{Z}_+ = \mathbb{N} \cup \{0\}$ denote the set of positive integers and that of non-negative integers, respectively.

We will use the expression ‘an invariant torus carrying quasi-periodic motions’ only when the number of rationally independent frequencies of those motions is equal to the torus dimension. For parallel motions on a torus with an arbitrary number of frequencies, we will speak of ‘an invariant torus carrying conditionally periodic motions’.

The paper is organized as follows. In sections 13.3 and 13.4, we formulate the precise persistence theorems for families of invariant tori of codimensions $p \ge 2$ and codimension one, respectively, in volume preserving flows. (In fact, we will need only the statement for codimension-one tori, but for completeness, the persistence result for tori of greater codimensions is also presented here.) The excitation of the elliptic normal mode of codimension-two tori is considered in section 13.5.

13.3 Persistence theorem for tori of codimension greater than one

Let $Z$ be a neighbourhood of the origin in $\mathbb{R}^p$, diffeomorphic to an open $p$-dimensional ball. Consider an $s$-parameter family of divergence-free vector fields $X^\mu$ on $\mathbb{T}^n \times Z = (\mathbb{R}/2\pi\mathbb{Z})^n \times Z$ of the form

$$X^\mu = [\omega(\mu) + f(x, z, \mu)]\,\frac{\partial}{\partial x} + [\Omega(\mu) z + g(x, z, \mu)]\,\frac{\partial}{\partial z} \tag{13.1}$$

where $n \ge 0$, $p \ge 2$, $s \ge 0$, $x \in \mathbb{T}^n = (\mathbb{R}/2\pi\mathbb{Z})^n$, $z \in Z \subset \mathbb{R}^p$, $\mu \in B \subset \mathbb{R}^s$ is the external parameter ($B$ being a bounded connected domain in $\mathbb{R}^s$), $\omega : B \to \mathbb{R}^n$, $\Omega : B \to sl(p, \mathbb{R})$, and $f = O(|z|)$, $g = O(|z|^2)$. The volume element on the phase space is $\tau = dx \wedge dz$ where $dx = dx_1 \wedge \cdots \wedge dx_n$ and $dz = dz_1 \wedge \cdots \wedge dz_p$. The $n$-torus $T = \{(x, 0) \mid x \in \mathbb{T}^n\}$ is invariant under the flow of $X^\mu$ for each $\mu$ and carries conditionally periodic motions with frequency vector $\omega(\mu)$.

Suppose that for any value of $\mu \in B$ all the eigenvalues of the matrix $\Omega(\mu) \in sl(p, \mathbb{R})$ are simple and equal to

$$\pm\delta(\mu) \ \ [\delta(\mu) > 0] \qquad \text{or} \qquad \pm i\varepsilon(\mu) \ \ [\varepsilon(\mu) > 0]$$

for $p = 2$ (these two possibilities are called the hyperbolic case and the elliptic case, respectively), or

$$\delta_j(\mu) \ \ (1 \le j \le p - 2m), \qquad \alpha_j(\mu) \pm i\beta_j(\mu) \ \ [\beta_j(\mu) > 0] \ \ (1 \le j \le m)$$

for $p \ge 3$ where $0 \le m \le \mathrm{Entier}(p/2)$ and

$$\sum_{j=1}^{p-2m} \delta_j(\mu) + 2 \sum_{j=1}^{m} \alpha_j(\mu) \equiv 0.$$

For $p = 2$ set $r = 0$ in the hyperbolic case and $r = 1$, $\lambda = \varepsilon : B \to \mathbb{R}$ in the elliptic case. For $p \ge 3$ set $r = m$, $\lambda = \beta : B \to \mathbb{R}^m$. The vector $\lambda(\mu) \in \mathbb{R}^r$ is called the normal frequency vector of the torus $T$ invariant under the flow of $X^\mu$ [11, 23, 12, 10]. For $p \ge 3$ assume in addition that all the numbers $\delta_j(\mu)$ are non-zero ($1 \le j \le p - 2m$) for any value of $\mu \in B$.

For $n \ge 1$ and $s \ge 1$ introduce the quantities $\rho^N(\mu)$ as

$$\rho^N(\mu) = \min_{\|e\|=1} \ \max_{\kappa=0}^{N} \ \max_{\|u\|=1} \ \Bigl| \sum_{|q|=\kappa} \langle e, D^q \omega(\mu) \rangle\, u^q \Bigr|$$

($e \in \mathbb{R}^n$, $u \in \mathbb{R}^s$, $q \in \mathbb{Z}^s_+$), where $N \in \mathbb{N}$,

$$D^q \omega(\mu) = \frac{\partial^{|q|} \omega(\mu)}{\partial\mu_1^{q_1} \cdots \partial\mu_s^{q_s}}, \qquad u^q = u_1^{q_1} \cdots u_s^{q_s},$$

and the quantities $\Xi^N_\ell(\mu)$ as

$$\Xi^N_\ell(\mu) = \max_{\kappa=0}^{N} \ \max_{\|u\|=1} \ \Bigl| \sum_{|q|=\kappa} \langle \ell, D^q \lambda(\mu) \rangle\, u^q \Bigr|$$

($u \in \mathbb{R}^s$, $q \in \mathbb{Z}^s_+$), where $N \in \mathbb{N}$ and $\ell \in \mathbb{Z}^r$. Now we can formulate the persistence theorem for the torus $T$.

Theorem 13.1 ([11, 10]). Assume that $n \ge 1$ and $s \ge 1$. Let the family $X^\mu$ of divergence-free vector fields be as above and satisfy the following conditions:
(i) there exists $N \in \mathbb{N}$ such that $\rho^N(\mu) > 0$ for any $\mu \in B$,
(ii) the inequality $\langle k, \omega(\mu) \rangle \ne \langle \ell, \lambda(\mu) \rangle$ is valid for any $\mu \in B$, $\ell \in \mathbb{Z}^r$ such that $1 \le |\ell| \le 2$, and $k \in \mathbb{Z}^n$ such that

$$1 \le \|k\| \le \frac{\Xi^N_\ell(\mu)}{\rho^N(\mu)}.$$

Then the following holds. Let $\tilde X^\mu$, $\mu \in B$, be another family of divergence-free vector fields on $\mathbb{T}^n \times Z$ (a perturbation of $X^\mu$). If $\tilde X^\mu$ is sufficiently close to $X^\mu$ in the real analytic topology then there exists a set $\Upsilon \subset B$ such that for each $\mu \in \Upsilon$ the vector field $\tilde X^\mu$ possesses an invariant analytic $n$-torus $\tilde T^\mu$ which carries Diophantine quasi-periodic motions (with a frequency vector not necessarily equal to $\omega(\mu)$) and is close to the unperturbed torus $T$. The variational equation along $\tilde T^\mu$ is reducible to a constant coefficient equation. The Lebesgue measure $\mathrm{meas}_s$ of $B \setminus \Upsilon$ in $\mathbb{R}^s$ tends to zero with the size of the perturbation. Moreover, all the perturbed $n$-tori $\tilde T^\mu \times \{\mu\}$, $\mu \in \Upsilon$, constitute a Whitney-smooth foliation in the product $\mathbb{T}^n \times Z \times B$ of the phase space $\mathbb{T}^n \times Z$ and the parameter space $B$.

Remark 13.2. Since the cohomology of $\mathbb{T}^n \times Z$ is trivial in dimensions greater than $n$ and, in particular, $H^{n+p-1}(\mathbb{T}^n \times Z, \mathbb{R}) = 0$ for $p \ge 2$, any divergence-free vector field on $\mathbb{T}^n \times Z$ is globally divergence free.

Remark 13.3. As one easily sees, the inequality $\rho^N(\mu) > 0$ is equivalent to the fact that the collection of $(s + N)!/s!N!$ vectors $D^q \omega(\mu) \in \mathbb{R}^n$, $q \in \mathbb{Z}^s_+$, $0 \le |q| \le N$, spans $\mathbb{R}^n$.

Remark 13.4. For analytic frequency maps $\omega : B \to \mathbb{R}^n$ that we deal with here, the nondegeneracy condition (i) of theorem 13.1 is equivalent to the following geometric condition: the image of the frequency map $\omega$ in $\mathbb{R}^n$ does not lie in any linear hyperplane passing through the origin [11, 31, 10, 30]. Nondegeneracy conditions of this kind are obviously very weak. They were first introduced in the eighties by H Rüssmann for Hamiltonian systems; see e.g. [31, 30]. The book [11, pp 70–3] and the preprint [31, pp 9–10] discuss the optimality of nondegeneracy conditions of type (i) and nonresonance conditions of type (ii) in the persistence theorems for quasi-periodic motions for various classes of dynamical systems. Note that the nonresonance condition (ii) of theorem 13.1 is void for $r = 0$, i.e. for the hyperbolic $p = 2$ case and for the case $p \ge 3$ with $m = 0$.

Remark 13.5. According to the general theory of hyperbolic invariant manifolds [20, 21, 41], in the hyperbolic $p = 2$ case and in the $p \ge 3$ case with all the numbers $\alpha_j(\mu)$ nonzero ($1 \le j \le m$), the following holds. The torus $T$ survives, as an invariant manifold, any sufficiently small perturbation of $X^\mu$ for any value of $\mu \in B$. However, for $n \ge 2$ the perturbed tori are in general finitely smooth only, even if the initial vector fields $X^\mu$ and their perturbations $\tilde X^\mu$ are analytic, and they do not carry conditionally periodic motions. Theorem 13.1 states that under certain nondegeneracy and nonresonance conditions, the perturbed tori for most values of the parameter $\mu$ are analytic and filled with quasi-periodic motions.

Remark 13.6. Theorem 13.1 holds for any $n \ge 1$, but makes sense for $n \ge 2$ only. For $n = 1$ and $\omega(\mu) \ne 0$ the torus $T$ is an isolated (in the phase space) periodic trajectory whose persistence is a trivial fact (even for $s = 0$) which requires a much simpler nondegeneracy–nonresonance condition, namely that 1 should not be an eigenvalue of the monodromy operator $L_T(\mu)$ of $T$. This is automatically fulfilled in the hyperbolic $p = 2$ case and in the $p \ge 3$ case provided that all the numbers $\alpha_j(\mu)$ are nonzero ($1 \le j \le m$). In the elliptic $p = 2$ case, the condition $1 \notin \mathrm{Spec}\, L_T(\mu)$ is tantamount to the condition that the ratio $\varepsilon(\mu)/\omega(\mu)$ is not an integer.
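The spanning criterion of Remark 13.3 is a finite rank computation at each parameter value. The following minimal sketch illustrates it for one hypothetical example that is not taken from the text: the frequency map $\omega(\mu) = (1, \mu, \mu^2)$ with $s = 1$ and $n = 3$, for which the $(s + N)!/s!N! = N + 1$ derivative vectors span $\mathbb{R}^3$ when $N = 2$ but not when $N = 1$. All function names are assumptions made for illustration; the rank is computed exactly with rational arithmetic.

```python
from fractions import Fraction


def rank(vectors):
    """Rank of a list of equal-length vectors, by exact Gauss-Jordan elimination."""
    rows = [[Fraction(x) for x in v] for v in vectors]
    ncols = len(rows[0]) if rows else 0
    r = 0  # number of pivots found so far
    for col in range(ncols):
        pivot = next((i for i in range(r, len(rows)) if rows[i][col] != 0), None)
        if pivot is None:
            continue
        rows[r], rows[pivot] = rows[pivot], rows[r]
        for i in range(len(rows)):
            if i != r and rows[i][col] != 0:
                factor = rows[i][col] / rows[r][col]
                rows[i] = [a - factor * b for a, b in zip(rows[i], rows[r])]
        r += 1
    return r


def moment_curve_derivatives(mu, N):
    """D^q omega(mu), q = 0..N, for the example map omega(mu) = (1, mu, mu^2)."""
    table = [[1, mu, mu * mu], [0, 1, 2 * mu], [0, 0, 2]]
    return [table[q] if q < 3 else [0, 0, 0] for q in range(N + 1)]


def spans(vectors, n):
    """True if the given vectors span R^n."""
    return rank(vectors) == n


if __name__ == "__main__":
    mu = Fraction(1, 2)
    print(spans(moment_curve_derivatives(mu, 1), 3))
    print(spans(moment_curve_derivatives(mu, 2), 3))
```

In this example the image of $\omega$ is a parabola lying in the affine plane where the first coordinate equals 1 but in no linear hyperplane through the origin, which is consistent with the geometric condition of Remark 13.4.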

13.4 Persistence theorem for tori of codimension one

Let $Y$ be a finite open interval on $\mathbb{R}$. Consider an $s$-parameter family of globally divergence-free vector fields $X^\mu$ on $\mathbb{T}^n \times Y$

$$X^\mu = \omega(y, \mu)\,\frac{\partial}{\partial x}$$

where $n \ge 0$, $s \ge 0$, $x \in \mathbb{T}^n$, $y \in Y \subset \mathbb{R}$, $\mu \in B \subset \mathbb{R}^s$ is the external parameter ($B$ being a bounded connected domain in $\mathbb{R}^s$), and $\omega : Y \times B \to \mathbb{R}^n$. The volume element on the phase space is $\tau = dx \wedge dy = dx_1 \wedge \cdots \wedge dx_n \wedge dy$. The $n$-tori $T_y = \{(x, y) \mid x \in \mathbb{T}^n\}$ are invariant under the flow of $X^\mu$ and carry conditionally periodic motions with frequency vectors $\omega(y, \mu)$. The persistence theorem for the set of tori $T_y$ is as follows.

Theorem 13.7 ([11, 10]). Let $n \ge 1$ and assume that there exists $N \in \mathbb{N}$ such that for any $(y, \mu) \in Y \times B$, the collection of $(s + N + 1)!/(s + 1)!N!$ vectors

$$D^q \omega(y, \mu) \in \mathbb{R}^n \qquad (q \in \mathbb{Z}^{s+1}_+ \text{ and } 0 \le |q| \le N)$$

spans $\mathbb{R}^n$, where

$$D^q \omega(y, \mu) = \frac{\partial^{|q|} \omega(y, \mu)}{\partial y^{q_1}\, \partial\mu_1^{q_2} \cdots \partial\mu_s^{q_{s+1}}}.$$

Then the following holds. Let $\tilde X^\mu$, $\mu \in B$, be another family of globally divergence-free vector fields on $\mathbb{T}^n \times Y$ (a perturbation of $X^\mu$). If $\tilde X^\mu$ is sufficiently close to $X^\mu$ in the real analytic topology then in the product $\mathbb{T}^n \times Y \times B$ of the phase space $\mathbb{T}^n \times Y$ and the parameter space $B$, there is a set $A$ of analytic $n$-tori such that each torus $\tilde T^0 \in A$ lies wholly in one of the fibers $\mathbb{T}^n \times Y \times \{\mu^0\}$, is invariant under the flow of $\tilde X^{\mu^0}$, carries Diophantine quasi-periodic motions, and is close to one of the unperturbed tori $T_{y^0} \times \{\mu^0\}$. The variational equation along each $\tilde T^0$ is $\dot\xi = 0$ ($\xi \in \mathbb{R}$). The Lebesgue measure $\mathrm{meas}_{n+s+1}$ of the union of the tori $\tilde T^0 \in A$ in $\mathbb{T}^n \times \mathbb{R}^{s+1}$ tends to $(2\pi)^n\, \mathrm{meas}_1 Y\, \mathrm{meas}_s B$ as the size of the perturbation tends to zero. Moreover, all the perturbed $n$-tori $\tilde T^0 \in A$ constitute a Whitney-smooth foliation in $\mathbb{T}^n \times Y \times B$.

Remark 13.8. Recall that for analytic frequency maps $\omega : Y \times B \to \mathbb{R}^n$ that we deal with here, the nondegeneracy condition of theorem 13.7 is equivalent to the following geometric condition: the image of the frequency map $\omega$ in $\mathbb{R}^n$ does not lie in any linear hyperplane passing through the origin [11, 31, 10, 30]. If the latter geometric condition is not fulfilled, then all the invariant $n$-tori of $X^\mu$ can be removed by an arbitrarily small globally divergence-free perturbation; see [11, p 70].

Remark 13.9. Theorem 13.7 holds for any $n \ge 1$, but makes sense for $n \ge 2$ only. For $n = 1$ and $\omega(y, \mu) \ne 0$ the nondegeneracy condition of theorem 13.7 is met automatically (even with $N = 0$), while the theorem itself concerns the persistence of periodic trajectories of planar Hamiltonian vector fields and is trivial.

13.5 Excitation of the elliptic normal mode

Now we can proceed to the excitation of the elliptic normal mode of the unperturbed torus $T$ in the elliptic $p = 2$ case. Consider again an $s$-parameter family (13.1) of divergence-free vector fields $X^\mu$ on $\mathbb{T}^n \times Z$ described at the beginning of section 13.3. Let $n \ge 0$, $p = 2$, $s \ge 0$. Suppose that for any value of $\mu \in B$, the eigenvalues of the matrix $\Omega(\mu) \in sl(2, \mathbb{R})$ are purely imaginary and equal to $\pm i\varepsilon(\mu)$, $\varepsilon(\mu) > 0$. Define the mapping

$$\widehat\omega = \widehat\omega(\mu) = (\omega_1, \ldots, \omega_n, \varepsilon), \qquad \widehat\omega : B \to \mathbb{R}^{n+1}. \tag{13.2}$$

The theorem on the excitation of the elliptic normal mode $\varepsilon(\mu)$ of the $n$-torus $T = \{(x, z) \mid x \in \mathbb{T}^n, z = 0\}$ is as follows.

Theorem 13.10. Let $s \ge 1$ and assume that there exists $N \in \mathbb{N}$ such that for any $\mu \in B$, the collection of $(s + N)!/s!N!$ vectors

$$D^q \widehat\omega(\mu) \in \mathbb{R}^{n+1} \qquad (q \in \mathbb{Z}^s_+ \text{ and } 0 \le |q| \le N)$$

spans $\mathbb{R}^{n+1}$. Then the following holds. Given an arbitrary $\sigma > 0$, let $\tilde X^\mu$, $\mu \in B$, be another family of divergence-free vector fields on $\mathbb{T}^n \times Z$ (a perturbation of $X^\mu$). If $\tilde X^\mu$ is sufficiently close to $X^\mu$ in the real analytic topology (the required smallness of the perturbation depends on $\sigma$) then in the product $\mathbb{T}^n \times Z \times B$ of the phase space $\mathbb{T}^n \times Z$ and the parameter space $B$, there is a set $A$ of analytic $(n + 1)$-tori such that each torus $T^0 \in A$
(i) lies wholly in one of the fibers $\mathbb{T}^n \times Z \times \{\mu^0\}$ in the $\sigma$-neighbourhood of the unperturbed $n$-torus $T \times \{\mu^0\}$,
(ii) is invariant under the flow of $\tilde X^{\mu^0}$,
(iii) carries Diophantine quasi-periodic motions with the frequency vector close to $\widehat\omega(\mu^0)$.

The variational equation along each $T^0$ is $\dot\xi = 0$ ($\xi \in \mathbb{R}$). The Lebesgue measure $\mathrm{meas}_{n+s+2}$ of the union of the tori $T^0 \in A$ in $\mathbb{T}^n \times \mathbb{R}^{s+2}$ is positive. Moreover, all the $(n + 1)$-tori $T^0 \in A$ constitute a Whitney-smooth foliation in $\mathbb{T}^n \times Z \times B$.

Proof. Theorem 13.10 is an almost immediate consequence of theorem 13.7. Indeed, let

$$\tilde X^\mu = [\omega(\mu) + f(x, z, \mu) + \tilde f(x, z, \mu)]\,\frac{\partial}{\partial x} + [\Omega(\mu) z + g(x, z, \mu) + \tilde g(x, z, \mu)]\,\frac{\partial}{\partial z} \tag{13.3}$$

where $\tilde f$ and $\tilde g$ are small. Near each point in $B$ there exists an analytic matrix-valued function $\mu \mapsto C(\mu) \in SL(2, \mathbb{R})$ such that

$$[C(\mu)]^{-1}\, \Omega(\mu)\, C(\mu) = \begin{pmatrix} 0 & -\varepsilon_0(\mu) \\ \varepsilon_0(\mu) & 0 \end{pmatrix}$$

with $\varepsilon_0(\mu) = \pm\varepsilon(\mu)$. Consider the function $\mu \mapsto L(\mu) \in GL(2, \mathbb{R})$ given by

$$L(\mu) = C(\mu) \ \text{ in the case } \ \varepsilon_0(\mu) \equiv \varepsilon(\mu), \qquad L(\mu) = C(\mu) R \ \text{ in the case } \ \varepsilon_0(\mu) \equiv -\varepsilon(\mu),$$

where $R$ is the diagonal matrix $\mathrm{diag}\{-1; 1\}$. Define the linear coordinate transformation $z = L(\mu) w$ in $\mathbb{R}^2$ and introduce new coordinates $(x, \chi, \eta)$ in $\mathbb{T}^n \times Z$ via the formulas

$$x = x, \qquad w_1 = \sigma\sqrt{2\eta}\,\cos\chi, \qquad w_2 = \sigma\sqrt{2\eta}\,\sin\chi$$

where $\chi \in \mathbb{T}^1$, $\eta \in \mathbb{R}$, and $\eta$ ranges in some fixed open interval $0 < c_- < \eta < c_+$. Then $dw_1 \wedge dw_2 = \sigma^2\, d\eta \wedge d\chi$. On the other hand, $dz = dz_1 \wedge dz_2 = \pm dw_1 \wedge dw_2$ for $\varepsilon_0(\mu) = \pm\varepsilon(\mu)$, so that $\tau = dx \wedge dz = \mp\sigma^2\, dx \wedge d\chi \wedge d\eta$ for $\varepsilon_0(\mu) = \pm\varepsilon(\mu)$. Consequently, any vector field on $\mathbb{T}^n \times Z$ (globally) divergence free with respect to the volume element $\tau$ is also (globally) divergence free with respect to the volume element $\tau' = dx \wedge d\chi \wedge d\eta$.

It is easy to verify that in the coordinates $(x, \chi, \eta)$, the fields $\tilde X^\mu$ given by (13.3) determine systems of ordinary differential equations

$$\begin{aligned}
\dot x &= \omega(\mu) + f + \tilde f \\
\dot\chi &= \varepsilon(\mu) + (\sigma\sqrt{2\eta})^{-1}\,[(h_2 + \tilde h_2)\cos\chi - (h_1 + \tilde h_1)\sin\chi] \\
\dot\eta &= \sigma^{-1}\sqrt{2\eta}\,[(h_1 + \tilde h_1)\cos\chi + (h_2 + \tilde h_2)\sin\chi]
\end{aligned} \tag{13.4}$$

where

$$\begin{pmatrix} h_1 \\ h_2 \end{pmatrix} = [L(\mu)]^{-1} \begin{pmatrix} g_1 \\ g_2 \end{pmatrix}, \qquad \begin{pmatrix} \tilde h_1 \\ \tilde h_2 \end{pmatrix} = [L(\mu)]^{-1} \begin{pmatrix} \tilde g_1 \\ \tilde g_2 \end{pmatrix}$$

and the functions $f$, $\tilde f$, $g$, $\tilde g$ are evaluated at the point $(x, L(\mu) w(\sigma, \chi, \eta), \mu)$. The key observation now is that the functions $f$ and $\sigma^{-1} h$ are small whenever $\sigma$ is small because

$$f(x, z, \mu) = O(|z|), \qquad g(x, z, \mu) = O(|z|^2)$$

and $w(\sigma, \chi, \eta) = O(\sigma)$. Hence, if the perturbations $\tilde f$ and $\sigma^{-1} \tilde h$ are also small (i.e. $\tilde f$ and $\sigma^{-1} \tilde g$ are small) then systems (13.4) satisfy all the conditions of theorem 13.7, where

• the roles of $n$ and $s$ are played by $n + 1$ and $s$, respectively,
• those of $Y$ and $B$ by $(c_-, c_+)$ and $B$, respectively,
• that of $x$ by $(x, \chi)$, that of $y$ by $\eta$, that of $\mu$ by $\mu$,
• that of $\omega(y, \mu)$ by $\widehat\omega(\mu) = (\omega(\mu), \varepsilon(\mu))$.
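The change of variables used in this proof can be checked numerically. Reading the radial factor as $\sigma\sqrt{2\eta}$, so that $w_1 = \sigma\sqrt{2\eta}\cos\chi$ and $w_2 = \sigma\sqrt{2\eta}\sin\chi$, the identity $dw_1 \wedge dw_2 = \sigma^2\, d\eta \wedge d\chi$ says that the Jacobian determinant of $(w_1, w_2)$ with respect to $(\eta, \chi)$ equals $\sigma^2$ at every point. The following minimal sketch approximates that determinant by central differences; the function names and the step size `h` are illustrative assumptions.

```python
import math


def w_from_polar(sigma, chi, eta):
    """The map (chi, eta) -> (w1, w2) with radius sigma * sqrt(2 * eta)."""
    r = sigma * math.sqrt(2.0 * eta)
    return r * math.cos(chi), r * math.sin(chi)


def jacobian_det(sigma, chi, eta, h=1e-6):
    """Approximate det d(w1, w2)/d(eta, chi) by central differences.

    Hypothetical helper for illustration; the exact value is sigma**2.
    """
    w_ep = w_from_polar(sigma, chi, eta + h)
    w_em = w_from_polar(sigma, chi, eta - h)
    w_cp = w_from_polar(sigma, chi + h, eta)
    w_cm = w_from_polar(sigma, chi - h, eta)
    dw1_deta = (w_ep[0] - w_em[0]) / (2 * h)
    dw2_deta = (w_ep[1] - w_em[1]) / (2 * h)
    dw1_dchi = (w_cp[0] - w_cm[0]) / (2 * h)
    dw2_dchi = (w_cp[1] - w_cm[1]) / (2 * h)
    return dw1_deta * dw2_dchi - dw1_dchi * dw2_deta


if __name__ == "__main__":
    sigma = 0.1
    print(jacobian_det(sigma, chi=0.7, eta=1.3), sigma ** 2)
```

Because the determinant is independent of $\chi$ and $\eta$, the variable $\eta$ behaves like an action variable for the rescaled volume element, which is why the roles of $y$ and $Y$ in theorem 13.7 can be taken by $\eta$ and $(c_-, c_+)$.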

Now theorem 13.7 provides us with a Whitney-smooth foliation of analytic $(n + 1)$-tori in $\mathbb{T}^n \times (c_-, c_+) \times B$ invariant under the flows of $\tilde X^\mu$. This completes the proof.

Remark 13.11. Denote by $U_\sigma$ the $\sigma$-neighbourhood of the $(n + s)$-dimensional surface $\{z = 0\}$ in $\mathbb{T}^n \times Z \times B$, and by $W$ the union of the $(n + 1)$-tori $T^0 \in A$ described in theorem 13.10. Let $\tilde f$ and $\tilde g$ admit a holomorphic extension to a fixed ($\sigma$-independent) complex neighbourhood $\mathcal{U}$ of $\mathbb{T}^n \times \{0\} \times B \subset \mathbb{T}^n \times \mathbb{R}^2 \times \mathbb{R}^s$. Then

$$\frac{\mathrm{meas}_{n+s+2}(W \cap U_\sigma)}{\mathrm{meas}_{n+s+2}(U_\sigma)} \to 1 \qquad \text{as} \qquad \sigma + \sup_{\mathcal{U}} |\tilde f| + \sigma^{-1} \sup_{\mathcal{U}} |\tilde g| \to 0.$$

If the perturbation is absent ($\tilde f \equiv 0$, $\tilde g \equiv 0$), then conjecturally the measure $\mathrm{meas}_{n+s+2}(U_\sigma \setminus W)$ is exponentially small in $\sigma$ (probably, under some additional conditions to be imposed on the analytic vector fields $X^\mu$). For similar results concerning Hamiltonian systems, see [40, 24, 37] and references therein.

Remark 13.12. The unperturbed frequency vector (13.2) for systems (13.4) does not depend on the ‘action’ variable $\eta$, and the nondegeneracy condition of theorem 13.7 is satisfied just by the dependence on the external parameter $\mu$. Since this dependence turns out to be sufficient (due to the fact that the nondegeneracy condition of theorem 13.7 is very weak), we have dispensed with a partial Birkhoff normal form around the unperturbed $n$-torus $T$.

Remark 13.13. Theorem 13.10 holds for any $n \ge 0$, but makes sense for $n \ge 1$ only. If $n = 0$ then $\widehat\omega(\mu) \equiv \varepsilon(\mu)$, and the nondegeneracy condition of theorem 13.10 is met automatically with $N = 0$ (even for $s = 0$), while the theorem itself concerns periodic trajectories of planar Hamiltonian vector fields near elliptic equilibria and is trivial.

Remark 13.14. For $n = 1$, theorem 13.10 concerns invariant 2-tori around elliptic periodic trajectories $T$ of divergence-free vector fields with a three-dimensional phase space. Such tori can also be constructed by considering the area preserving return map on a Poincaré section of $T$ and applying the theorem on invariant curves of area preserving mappings of a plane near elliptic fixed points [3, 27]. This approach works for $s = 0$ as well but requires the Birkhoff normal form around $T$ and nondegeneracy conditions on the nonlinear terms of that normal form.

Remark 13.15. One may wonder whether the elliptic normal mode of the unperturbed $n$-torus $T$ can be excited in the case $s = 0$ and $n \ge 2$. In fact, this is plausible but has not been examined yet in the literature. Note that for $s = 0$ and $n \ge 2$ conditionally periodic dynamics on $T$ itself does not, generally speaking, survive small perturbations. It seems possible, however, that $T$ is generically surrounded by invariant $(n + 1)$-tori carrying Diophantine quasi-periodic motions, and the Cantor family of those $(n + 1)$-tori persists under small perturbations. The book [11, p 96] and the papers [35, p 561] and [38, p 142] discuss the similar problem of the excitation of elliptic normal modes of isolated invariant tori in reversible flows without external parameters.

Acknowledgements

I mastered KAM theory under the supervision of V I Arnol’d, and I am indebted to him for his guidance and generous help. I am also grateful to H W Broer and G B Huitema for a very fruitful collaboration which resulted in the paper [10] and the book [11], and to M R Herman, À Jorba, A I Neĭshtadt, J Pöschel, and F Takens for interesting discussions. Special thanks go to all the staff, students and former students of the Department of Mathematics and Computing Science at the University of Groningen for their warm hospitality during my five visits there in 1993–6.

References

[1] Arnol’d V I 1962 On the classical perturbation theory and the problem of stability of planetary systems Sov. Math. Dokl. 3 1008–12
[2] Arnol’d V I 1963 Proof of a theorem by A N Kolmogorov on the persistence of quasiperiodic motions under small perturbations of the Hamiltonian Russ. Math. Surv. 18(5) 9–36
[3] Arnol’d V I 1963 Small denominators and problems of stability of motion in classical and celestial mechanics Russ. Math. Surv. 18(6) 85–191
[4] Arnol’d V I 1964 On the instability of dynamical systems with many degrees of freedom Sov. Math. Dokl. 5 581–5
[5] Braaksma B L J and Broer H W 1982 Quasiperiodic flow near a codimension one singularity of a divergence free vector field in dimension four Bifurcation, Théorie Ergodique et Applications (Astérisque 98–99) (Paris: Soc. Math. France Press) pp 74–142
[6] Broer H W 1979 Bifurcations of singularities in volume preserving vector fields PhD Thesis Rijksuniversiteit Groningen
[7] Broer H W 1981 Formal normal form theorems for vector fields and some consequences for bifurcations in the volume preserving case Dynamical Systems and Turbulence (Lecture Notes in Mathematics 898) ed D A Rand and L-S Young (Berlin: Springer) pp 54–74
[8] Broer H W 1981 Quasiperiodic flow near a codimension one singularity of a divergence free vector field in dimension three Dynamical Systems and Turbulence (Lecture Notes in Mathematics 898) ed D A Rand and L-S Young (Berlin: Springer) pp 75–89
[9] Broer H W and Huitema G B 1995 Unfoldings of quasi-periodic tori in reversible systems J. Dynam. Diff. Eqns 7 191–212
[10] Broer H W, Huitema G B and Sevryuk M B 1996 Families of quasi-periodic motions in dynamical systems depending on parameters Nonlinear Dynamical Systems and Chaos (Progress in Nonlinear Differential Equations and their Applications 19) ed H W Broer, S A van Gils, I Hoveijn and F Takens (Basel: Birkhäuser) pp 171–211
[11] Broer H W, Huitema G B and Sevryuk M B 1996 Quasi-Periodic Motions in Families of Dynamical Systems: Order amidst Chaos (Lecture Notes in Mathematics 1645) (Berlin: Springer)
[12] Broer H W, Huitema G B and Takens F 1990 Unfoldings of quasi-periodic tori Mem. Am. Math. Soc. 83(421) 1–81
[13] Broer H W and van Strien S 1983 Infinitely many moduli of strong stability in divergence free unfoldings of singularities of vector fields Geometric Dynamics (Lecture Notes in Mathematics 1007) ed J Palis (Berlin: Springer) pp 39–59
[14] Bruno A D 1974 The sets of analyticity of a normalizing transformation Preprints 97 and 98 (The USSR Academy of Sciences: Institute of Applied Mathematics) (in Russian)
[15] Bruno A D 1989 Local Methods in Nonlinear Differential Equations (Berlin: Springer)
[16] Cheng C Q and Sun Y S 1990 Existence of invariant tori in three-dimensional measure-preserving mappings Celest. Mech. Dynam. Astron. 47 275–92
[17] Cong F and Li Y 1996 A parametrized KAM theorem for volume preserving mappings Preprint 42 (Peking University: Institute of Mathematics and School of Mathematical Sciences)
[18] Cong F, Li Y and Huang M 1996 Invariant tori for nearly twist mappings with intersection property Northeast Math. J. 12 280–98
[19] Delshams A and de la Llave R 1990 Existence of quasi-periodic orbits and absence of transport for volume preserving transformations and flows Preprint University of Texas, Austin, TX
[20] Fenichel N 1971 Persistence and smoothness of invariant manifolds for flows Indiana Univ. Math. J. 21 193–226
[21] Hirsch M W, Pugh C C and Shub M 1977 Invariant Manifolds (Lecture Notes in Mathematics 583) (Berlin: Springer)
[22] Hoveijn I 1996 Versal deformations and normal forms for reversible and Hamiltonian linear systems J. Diff. Eqns 126 408–42
[23] Huitema G B 1988 Unfoldings of quasi-periodic tori PhD Thesis Rijksuniversiteit Groningen
[24] Jorba À and Villanueva J 1997 On the normal behaviour of partially elliptic lower dimensional tori of Hamiltonian systems Nonlinearity 10 783–822
[25] Kolmogorov A N 1954 On the persistence of conditionally periodic motions under a small change of the Hamilton function Dokl. Akad. Nauk. SSSR 98 527–30 (in Russian) (Engl. transl. 1979 Stochastic Behavior in Classical and Quantum Hamiltonian Systems (Lecture Notes in Physics 93) ed G Casati and J Ford (Berlin: Springer) pp 51–6; reprinted as Kolmogorov A N 1984 Chaos ed Bai Lin Hao (Singapore: World Scientific) pp 81–6)
[26] Lazutkin V F 1993 KAM Theory and Semiclassical Approximations to Eigenfunctions (Berlin: Springer)
[27] Moser J 1962 On invariant curves of area-preserving mappings of an annulus Nachr. Akad. Wiss. Göttingen Math.-Phys. Kl. II 1 1–20
[28] Pöschel J 1982 Integrability of Hamiltonian systems on Cantor sets Commun. Pure Appl. Math. 35 653–96
[29] Quispel G R W and Sevryuk M B 1993 KAM theorems for the product of two involutions of different types Chaos 3 757–69
[30] Rüssmann H 1989 Non-degeneracy in the perturbation theory of integrable dynamical systems Number Theory and Dynamical Systems (London Math. Soc. Lect. Note Series 134) ed M M Dodson and J A G Vickers (Cambridge: Cambridge University Press) pp 5–18 (reprinted as Rüssmann H 1990 Stochastics, Algebra and Analysis in Classical and Quantum Dynamics (Math. and Appl. 59) ed S Albeverio, P Blanchard and D Testard (Dordrecht: Kluwer) pp 211–23)
[31] Rüssmann H 1998 Invariant tori in the perturbation theory of weakly non-degenerate integrable Hamiltonian systems Preprint 14 (Johannes Gutenberg-Universität Mainz: Fachbereich Mathematik)
[32] Sevryuk M B 1990 On the dimensions of invariant tori in the KAM theory Mathematical Methods in Mechanics ed V V Kozlov (Moscow: Moscow State University Press) pp 82–8 (in Russian)
[33] Sevryuk M B 1993 Invariant tori of reversible systems of intermediate dimensions Russ. Acad. Sci. Dokl. Math. 47 129–33
[34] Sevryuk M B 1993 New cases of quasiperiodic motions in reversible systems Chaos 3 211–14
[35] Sevryuk M B 1995 The iteration-approximation decoupling in the reversible KAM theory Chaos 5 552–65
[36] Sevryuk M B 1997 Excitation of elliptic normal modes of invariant tori in Hamiltonian systems Topics in Singularity Theory: V I Arnol’d’s 60th Anniversary Collection (AMS Transl. Series 2 180 Adv. Math. Sci. 34) ed A G Khovanskiĭ, A N Varchenko and V A Vassiliev (Providence, RI: American Mathematical Society) pp 209–18
[37] Sevryuk M B 1998 Invariant tori of intermediate dimensions in Hamiltonian systems Reg. Chaotic Dynam. 3(1) 39–48
[38] Sevryuk M B 1998 The finite-dimensional reversible KAM theory Physica D 112 132–47
[39] Takens F 1972 Homoclinic points in conservative systems Inv. Math. 18 267–92
[40] Villanueva J 1997 Normal forms around lower dimensional tori of Hamiltonian systems PhD Thesis Universitat Politècnica de Catalunya, Barcelona
[41] Wiggins S 1994 Normally Hyperbolic Invariant Manifolds in Dynamical Systems (New York: Springer)
[42] Xia Z 1992 Existence of invariant tori in volume-preserving diffeomorphisms Ergod. Theor. Dynam. Syst. 12 621–31
[43] Xia Z 1995 Existence of invariant tori for certain non-symplectic diffeomorphisms Hamiltonian Dynamical Systems: History, Theory, and Applications (IMA Vol. Math. and Appl. 63) ed H S Dumas, K R Meyer and D S Schmidt (New York: Springer) pp 373–85
[44] Yoccoz J-C 1992 Travaux de Herman sur les tores invariants Séminaire Bourbaki Vol 1991–92 (Astérisque 206) Exp no 754 (Paris: Soc. Math. France Press) pp 311–44

Chapter 14

On the global dynamics of Kirchhoff’s equations: rigid body models for underwater vehicles

Heinz Hanßmann and Philip Holmes
Princeton University

We study the Kirchhoff model for the motion of a rigid body submerged in an incompressible, irrotational, inviscid fluid in the absence of gravitational forces and torques. Symmetries allow reduction to a two degree-of-freedom Hamiltonian system. In [8] the existence and stability of pure and mixed mode equilibria was studied and in [8, section 5.2] the system was averaged, allowing further reduction to one degree of freedom. We give an interpretation of the averaged Hamiltonian function as a normal form of order one. Iterating the process we obtain the normal form of order two, thus resolving a degeneracy noted in [8]. This allows us to prove that the (integrable) normal form of order two has heteroclinic orbits between the ‘pure 2’ and between the ‘pure 3’ modes in a range of parameter values, and, at a critical (bifurcation) value, heteroclinic cycles linking pure 2 and pure 3 modes. We discuss the implications for the original system and the full rigid body motions.

14.1 Setting of the problem

We study the dynamics of an ellipsoidal rigid body immersed in an ideal fluid, modelled by Kirchhoff’s equations [9]. Our motivation stems from modelling the behavior of an underwater vehicle when viscous effects are negligible. The simplest possible motions are pure modes, in which the body rotates about one of its principal axes, while translating along that axis. Stable pure modes represent desirable motions that require minimal control, and rapid maneuvers may be performed by destabilising such states and allowing the system to follow


its uncontrolled dynamics. In this context, heteroclinic orbits connecting pure modes may help in the design of energy-efficient control strategies for vehicle reorientation. For background on the Kirchhoff equations and the Hamiltonian setting see [9, 10, 8].

The configuration space is the manifold $SE(3) = SO(3) \times \mathbb{R}^3$ of possible positions of e.g. the center of mass in 3-space and of possible rotations of the body about the center of mass. Correspondingly, the phase space is the cotangent bundle $T^*SE(3)$ equipped with the canonical 2-form. Following [8], we focus on the case of a neutrally buoyant body, in which also the center of mass coincides with the center of volume and gravity exerts no net forces or torques. Then the Hamiltonian function is given solely by the kinetic energy and the system admits the full symmetry group $SE(3)$, acting by left translations. Dividing out this symmetry allows reduction to the quotient space $T^*SE(3)/SE(3) \cong se(3)^* \cong \mathbb{R}^3 \times \mathbb{R}^3$, where the Poisson structure is derived from the Lie bracket on $se(3)$. Denoting the coordinates of $se(3)^*$ by $(\pi_1, \pi_2, \pi_3, p_1, p_2, p_3)$, the Poisson bracket relations read

$$\{\pi_i, \pi_j\} = -\epsilon_{ijk}\,\pi_k, \qquad \{\pi_i, p_i\} = 0, \qquad \{\pi_i, p_j\} = -\epsilon_{ijk}\,p_k, \qquad \{p_i, p_j\} = 0, \tag{14.1}$$

where the alternating Levi-Civita symbol $\epsilon_{ijk}$ denotes the sign of the permutation $\begin{pmatrix} 1 & 2 & 3 \\ i & j & k \end{pmatrix}$. The rank of this Poisson bracket (and thus the number of degrees of freedom) is two. Indeed, there are two Casimirs (functions that Poisson commute with all other functions) given by

$$\kappa_1(\pi, p) = (p \mid p) \qquad\text{and}\qquad \kappa_2(\pi, p) = (\pi \mid p),$$

where $(\cdot \mid \cdot)$ denotes the inner product on $\mathbb{R}^3$. For subsequent use, we also define $\kappa_3(\pi, p) = \sqrt{(\pi \mid \pi)}$. Given a Hamiltonian $H = H(\pi, p)$, Kirchhoff’s equations read

$$\dot\pi = \pi \times \nabla_\pi H + p \times \nabla_p H \tag{14.2}$$

$$\dot p = p \times \nabla_\pi H. \tag{14.3}$$

We remark that p and π are the linear and angular momenta (or impulses) measured in a body frame of axes and refer to [10] for further details on the reduction process.
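The structure of (14.2) and (14.3) can be checked numerically. The following is a minimal sketch, not taken from the chapter: it evaluates the right-hand sides for arbitrary gradient vectors and confirms that the two Casimirs and the energy have vanishing time derivative, whatever the Hamiltonian. The function names and the sample numbers are assumptions made for illustration only.

```python
# Sketch: Kirchhoff's equations (14.2)-(14.3) for given gradients of H.
# All names and sample values below are illustrative assumptions.

def cross(u, v):
    """Cross product of two 3-vectors."""
    return [u[1] * v[2] - u[2] * v[1],
            u[2] * v[0] - u[0] * v[2],
            u[0] * v[1] - u[1] * v[0]]


def dot(u, v):
    """Euclidean inner product (u | v)."""
    return sum(a * b for a, b in zip(u, v))


def kirchhoff_rhs(pi, p, grad_pi, grad_p):
    """Return (pi_dot, p_dot) as given by (14.2) and (14.3)."""
    pi_dot = [a + b for a, b in zip(cross(pi, grad_pi), cross(p, grad_p))]
    p_dot = cross(p, grad_pi)
    return pi_dot, p_dot


# Hypothetical state and gradient values, chosen only for illustration.
pi = [0.3, -0.7, 0.5]
p = [1.1, 0.4, -2.0]
grad_pi = [0.9, 0.2, -0.4]   # stands for the gradient of H with respect to pi
grad_p = [-0.3, 0.8, 0.6]    # stands for the gradient of H with respect to p

pi_dot, p_dot = kirchhoff_rhs(pi, p, grad_pi, grad_p)

kappa1_dot = 2.0 * dot(p, p_dot)                          # d/dt of (p | p)
kappa2_dot = dot(pi_dot, p) + dot(pi, p_dot)              # d/dt of (pi | p)
energy_dot = dot(grad_pi, pi_dot) + dot(grad_p, p_dot)    # dH/dt by the chain rule
```

All three derivatives vanish up to rounding, since each reduces to a sum of triple products that cancel in pairs.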

The kinetic energy is given by the inner product

$$H(\pi, p) = \frac{1}{2}\left( \begin{pmatrix} \pi \\ p \end{pmatrix} \,\middle|\, M \begin{pmatrix} \pi \\ p \end{pmatrix} \right), \tag{14.4}$$

where the symmetric 6 × 6 matrix $M$ contains the added inertia tensor, the added mass tensor and coupling terms between linear and angular momenta. For ellipsoidal bodies, if the body frame coincides with the principal axes, $M$ diagonalises to

$$M = \begin{pmatrix} I_1 & 0 & 0 & 0 & 0 & 0 \\ 0 & I_2 & 0 & 0 & 0 & 0 \\ 0 & 0 & I_3 & 0 & 0 & 0 \\ 0 & 0 & 0 & m_1 & 0 & 0 \\ 0 & 0 & 0 & 0 & m_2 & 0 \\ 0 & 0 & 0 & 0 & 0 & m_3 \end{pmatrix}$$

and the Hamiltonian (14.4) becomes

$$H(\pi, p) = \frac{1}{2}\sum_{i=1}^{3}\frac{\pi_i^2}{I_i} + \frac{1}{2}\sum_{i=1}^{3}\frac{p_i^2}{m_i}. \tag{14.5}$$

The mass and inertia terms derive from the lengths $l_i$ of the three principal semiaxes of the ellipsoidal body; see [8, 10]. Throughout this paper, without loss of generality, we take $l_1 > l_2 \ge l_3$. Then $m_3 \ge m_2 > m_1$ and $I_2 > I_1$, but $I_3$ may be larger than, equal to, or smaller than both $I_1$ and $I_2$. The system admits the three reversing symmetries

$$\begin{aligned} (\pi_1, \pi_2, \pi_3, p_1, p_2, p_3) &\mapsto (-\pi_1, \pi_2, \pi_3, -p_1, p_2, p_3) \\ (\pi_1, \pi_2, \pi_3, p_1, p_2, p_3) &\mapsto (\pi_1, -\pi_2, \pi_3, p_1, -p_2, p_3) \\ (\pi_1, \pi_2, \pi_3, p_1, p_2, p_3) &\mapsto (\pi_1, \pi_2, -\pi_3, p_1, p_2, -p_3). \end{aligned} \tag{14.6}$$

Since combinations of these yield the (non-reversing) symmetries

$$\begin{aligned} (\pi_1, \pi_2, \pi_3, p_1, p_2, p_3) &\mapsto (\pi_1, -\pi_2, -\pi_3, p_1, -p_2, -p_3) \\ (\pi_1, \pi_2, \pi_3, p_1, p_2, p_3) &\mapsto (-\pi_1, \pi_2, -\pi_3, -p_1, p_2, -p_3) \\ (\pi_1, \pi_2, \pi_3, p_1, p_2, p_3) &\mapsto (-\pi_1, -\pi_2, \pi_3, -p_1, -p_2, p_3) \end{aligned} \tag{14.7}$$

we conclude that, for fixed values $\kappa_1 = a^2$, $\kappa_2 = ab$ of the Casimirs, there are six (relative) equilibria $\pm(b, 0, 0, a, 0, 0)$, $\pm(0, b, 0, 0, a, 0)$ and $\pm(0, 0, b, 0, 0, a)$ of the (reduced) system. These are the pure modes, where the linear momentum is parallel to the angular momentum. As shown in [8] there exist further mixed mode equilibria in the cases $I_2 > I_3 \ge I_1$ and $I_2 > I_1 > I_3$. Figure 14.1 summarizes these findings in a bifurcation diagram with parameter $\gamma = \left|\frac{(\pi \mid p)}{(p \mid p)}\right| = \left|\frac{b}{a}\right|$, and shows stability types of the equilibria determined in [8].


Figure 14.1. Pure and mixed mode equilibria for the cases $I_3 > I_2 > I_1$ (a), $I_2 > I_3 \ge I_1$ (b), $I_2 > I_1 > I_3$, small $I_1 - I_3$ (c), and $I_2 > I_1 > I_3$, large $I_1 - I_3$ (d). Solid lines indicate linearly stable (elliptic) equilibria, while dotted, chain-dotted and dashed lines indicate unstable equilibria with eigenvalue placements shown on the right. Bold solid lines indicate nonlinearly stable equilibria determined by the energy-Casimir method. Pitchfork bifurcation points are denoted $\gamma_{iP}$, $\gamma_{iPP}$ and Hopf bifurcation points $\gamma_{iH}$. Reprinted from [8] (© 1998, with permission from Elsevier Science).

As the parameter $\gamma$ is varied, several bifurcations take place. At $\gamma = \gamma_{1H}$ the pure 1 modes undergo a Hamiltonian Hopf bifurcation, where the value $\gamma_{1H} > 0$ is given by

$$\gamma_{1H}^2 = \frac{I_2 I_3}{I_2 + I_3 - I_1}\left(F_1 + \sqrt{F_1^2 - F_2^2}\right)$$

with

$$F_1 = \frac{2I_3 - I_1}{I_3}\left(\frac{1}{m_1} - \frac{1}{m_2}\right) + \frac{2I_2 - I_1}{I_2}\left(\frac{1}{m_1} - \frac{1}{m_3}\right), \qquad F_2 = \frac{I_1}{I_3}\left(\frac{1}{m_1} - \frac{1}{m_2}\right) - \frac{I_1}{I_2}\left(\frac{1}{m_1} - \frac{1}{m_3}\right).$$

In the case $I_3 \ge I_2 > I_1$ this is the only local bifurcation to occur. For $I_2 > I_3 \ge I_1$ there are Hamiltonian pitchfork bifurcations at $\gamma = \gamma_{3P} = \alpha_{23} I_3$ and $\gamma = \gamma_{2P} = \alpha_{23} I_2$, with $\alpha_{23} > 0$ given by

$$\alpha_{23}^2 = \frac{1}{I_2 - I_3}\left(\frac{1}{m_2} - \frac{1}{m_3}\right).$$

At the first of these bifurcations the elliptic pure 3 modes become unstable, giving rise to four stable mixed 2–3 modes

$$(\pi, p) = (0, \pm\alpha_{23} I_2 p_2, \pm\alpha_{23} I_3 p_3, 0, p_2, p_3). \tag{14.8}$$
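As a plausibility check of (14.8), the sketch below evaluates Kirchhoff's equations (14.2), (14.3) for the diagonal Hamiltonian (14.5) at a mixed 2–3 mode and at a pure 3 mode. It assumes the rounded mass and inertia values quoted later in section 14.4.2; the choice of $p_2$, $p_3$ and all function names are illustrative assumptions, and the two signs in (14.8) are taken to be equal.

```python
import math

# Sketch: check that a mixed 2-3 mode (14.8) is an equilibrium of (14.2)-(14.3)
# with the diagonal Hamiltonian (14.5). Names are illustrative assumptions.

def cross(u, v):
    return [u[1] * v[2] - u[2] * v[1],
            u[2] * v[0] - u[0] * v[2],
            u[0] * v[1] - u[1] * v[0]]


def dot(u, v):
    return sum(a * b for a, b in zip(u, v))


def kirchhoff_rhs(pi, p, inertia, mass):
    """Right-hand sides of (14.2), (14.3) for H given by (14.5)."""
    grad_pi = [x / I for x, I in zip(pi, inertia)]
    grad_p = [x / m for x, m in zip(p, mass)]
    pi_dot = [a + b for a, b in zip(cross(pi, grad_pi), cross(p, grad_p))]
    return pi_dot, cross(p, grad_pi)


# Rounded example values quoted in section 14.4.2 (I2 > I3 > I1).
inertia = (5.120, 10.71, 8.707)
mass = (158.4, 166.2, 278.7)
I1, I2, I3 = inertia
m1, m2, m3 = mass

alpha23 = math.sqrt((1.0 / m2 - 1.0 / m3) / (I2 - I3))

# Mixed 2-3 mode with an arbitrary choice of p2, p3 (upper signs in (14.8)).
p2, p3 = 0.6, 0.8
p_mixed = [0.0, p2, p3]
pi_mixed = [0.0, alpha23 * I2 * p2, alpha23 * I3 * p3]
pi_dot, p_dot = kirchhoff_rhs(pi_mixed, p_mixed, inertia, mass)
residual_mixed = max(abs(x) for x in pi_dot + p_dot)

# Pure 3 mode (0, 0, b, 0, 0, a) for comparison.
pi_dot3, p_dot3 = kirchhoff_rhs([0.0, 0.0, 0.4], [0.0, 0.0, 1.0], inertia, mass)
residual_pure3 = max(abs(x) for x in pi_dot3 + p_dot3)

# The bifurcation parameter of the mixed mode lies between the pitchfork values.
gamma_mixed = abs(dot(pi_mixed, p_mixed) / dot(p_mixed, p_mixed))
gamma_3P = alpha23 * I3
gamma_2P = alpha23 * I2
```

For these inputs the residuals vanish to rounding accuracy and `gamma_mixed` falls strictly between `gamma_3P` and `gamma_2P`, in line with the branch of mixed modes connecting the two pitchfork points.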

As $\gamma$ increases from $\gamma_{3P}$ to $\gamma_{2P}$, the mixed 2–3 modes ‘approach’ the two pure 2 modes and reach them in the second Hamiltonian pitchfork bifurcation. Note that this scenario is governed by the symmetries (14.7). At $\gamma = \gamma_{23C} = \alpha_{23}\sqrt{I_2 I_3}$ the pure 2 and 3 modes are unstable saddle-centers and have the same energy, their energy-difference effectively changing sign as $\gamma$ passes through this value. This suggests that there may be heteroclinic orbits between these pairs of modes, i.e. that a connection bifurcation takes place at $\gamma = \gamma_{23C}$. The present paper was motivated by the desire to resolve this question. If $I_2 > I_1 > I_3$, there are four additional branches of unstable mixed 1–3 modes

$$(\pi, p) = (\pm\alpha_{13} I_1 p_1, 0, \pm\alpha_{13} I_3 p_3, p_1, 0, p_3)$$

with $\alpha_{13} > 0$ given by

$$\alpha_{13}^2 = \frac{1}{I_1 - I_3}\left(\frac{1}{m_1} - \frac{1}{m_3}\right)$$

extending between Hamiltonian pitchfork bifurcations at γ = γ3PP = α13 I3 and γ = γ1P = α13 I1 . There are two (sub)cases according to whether I1 − I3 is small (and, hence, γ1H < γ1P ) or large. In the latter case the pure 1 mode does not undergo a Hamiltonian Hopf bifurcation. Homoclinic orbits are commonplace in Hamiltonian systems. Because of the reversing symmetries (14.6), heteroclinic orbits between symmetry-related hyperbolic equilibria are typical as well. In [8, section 5.1] the existence of transverse homoclinic orbits to the pure 1 mode is shown in the nearly axisymmetric case m 2 ≈ m 3 , I2 ≈ I3 , implying that the system is non-integrable. In [8, section 5.2] the Hamiltonian is formally averaged along the azimuthal


angle of p with respect to π. The averaged system displays heteroclinic orbits connecting the pure +2 and −2 modes for γ < γ23C and the pure +3 and −3 modes for γ > γ23C , but turns out to be degenerate for γ = γ23C . Thus it was not possible to conclude the existence of orbits connecting the four ±2, ±3 modes at this value. The present paper further considers this possibility, and, more generally, investigates the global behavior of ‘nearly spherical’ vehicles for which m i ≈ m j and Ii ≈ I j . We note that the axially symmetric case m 2 = m 3 , I2 = I3 (analogous to the Lagrange top) is completely integrable; see, e.g., [2, 8]. This paper is organized as follows. In the next section we show that the averaged Hamiltonian obtained in [8] may be considered as a normal form of order one. We then compute the normal form of order two. Section 14.3 contains a partial analysis of these normal forms. In particular we show that the normal form of order two resolves the degeneracy of the normal form of order one. Furthermore, at γ = γ23C there are heteroclinic cycles connecting the pure 2 and pure 3 modes for the normal form of order two. In section 14.4 we discuss the consequences of these heteroclinic orbits. We summarize in section 14.5.

14.2 Normalization of the Hamiltonian

The purpose of normalization is to perform a coordinate transformation that puts the ‘lower-order terms’ in a simpler form. The small parameter $\varepsilon$ measures how far the Hamiltonian $H$ deviates from the well-understood Hamiltonian $H_0^0$. In transformed coordinates, the normal form $\bar H$ of order $n$ is then $\varepsilon^{n+1}$-close to $H$.

14.2.1 Theoretical background

Our aim is to find a coordinate transformation $\psi$ that puts the system with Hamiltonian $H = H_0^0 + \varepsilon H_1^0$ in normal form, up to order two. If $\psi$ preserves the Poisson structure this means that the transformed system is generated by the Hamiltonian

$$H \circ \psi = H_0^0 + \varepsilon H_0^1 + \frac{\varepsilon^2}{2} H_0^2 + O(\varepsilon^3),$$

with $\{H_0^n, H_0^0\} = 0$ for $n = 1, 2$. A simple way to ensure that $\psi$ is a Poisson map is to define it as the time-one-map of some Hamiltonian vector field with Hamiltonian $W$. Making the ansatz

$$W = \varepsilon W_1 + \frac{\varepsilon^2}{2} W_2$$

we obtain

$$\begin{aligned} H \circ \psi &= H + \{H, W\} + \tfrac{1}{2}\{\{H, W\}, W\} + O(\varepsilon^3) \\ &= H_0^0 + \varepsilon H_1^0 + \varepsilon\{H_0^0, W_1\} + \frac{\varepsilon^2}{2}\{H_0^0, W_2\} + \varepsilon^2\{H_1^0, W_1\} + \frac{\varepsilon^2}{2}\{\{H_0^0, W_1\}, W_1\} + O(\varepsilon^3) \end{aligned}$$

whence

$$\{W_1, H_0^0\} + H_0^1 = H_1^0 \tag{14.9}$$

and

$$\{W_2, H_0^0\} + H_0^2 = 2\{H_1^0, W_1\} + \{\{H_0^0, W_1\}, W_1\}. \tag{14.10}$$

We seek functions $W_1$, $W_2$ that solve these homological equations. To this end let $\mathcal{L}_{H_0^0} : F \mapsto \{F, H_0^0\}$ denote the Lie operator acting on elements $F \in \mathcal{F}$, where $\mathcal{F}$ is the space of (differentiable) Hamiltonian functions. Then $H_0^1$, $H_0^2$ are sought in the kernel of $\mathcal{L}_{H_0^0}$. If $\mathcal{L}_{H_0^0}$ is semi-simple, the homological equations are easily solved since $\mathcal{F}$ splits into a direct sum $\operatorname{im} \mathcal{L}_{H_0^0} \oplus \ker \mathcal{L}_{H_0^0} = \mathcal{F}$. Inverting the restriction of $\mathcal{L}_{H_0^0}$ to $\operatorname{im} \mathcal{L}_{H_0^0}$ then allows one to compute $W_1$ and $W_2$. In particular, we see that $H_0^1$ coincides with the average of $H_1^0$ along the flow generated by $H_0^0$. This iterative procedure can be generalized to yield normal forms of any order; see [4, 6, 11].

14.2.2 Ingredients of the algorithm

In [8, section 5.2] the Hamiltonian (14.5) is averaged along the periodic flow generated by $\kappa_3 = \sqrt{(\pi \mid \pi)}$. This is done in a canonical coordinate system defined on each symplectic leaf $\kappa_1 = a^2$, $\kappa_2 = ab$. Here we prefer to work with

$$H_0^0 = \frac{1}{2I}(\pi \mid \pi) + \frac{1}{2m}(p \mid p) = \frac{\kappa_3^2}{2I} + \frac{\kappa_1}{2m}$$

(note that $\kappa_3$ and $H_0^0$ generate equivalent flows), since $H_0^0$ is $\varepsilon$-close to $H$ with

$$\varepsilon = \max_i \left\{ \left|\frac{1}{I_i} - \frac{1}{I}\right|, \left|\frac{1}{m_i} - \frac{1}{m}\right| \right\}.$$

Choosing $I$ and $m$ as convex combinations of the $I_i$ and $m_i$, respectively, implies that all mutual differences $\frac{1}{I_i} - \frac{1}{I_j}$ and $\frac{1}{m_i} - \frac{1}{m_j}$ are of order $\varepsilon$.

The key ingredient in computing the inverse of the Lie operator $\mathcal{L}_{H_0^0}$ on its image is to find diagonalising coordinates. Inspired by the variables given by (14) and (15) of [8], we define the complex coordinates

$$\begin{aligned} \tau_1 &= \pi_1(\pi \mid p) - (\pi \mid \pi)p_1 + i\sqrt{(\pi \mid \pi)}\,(\pi_2 p_3 - \pi_3 p_2) \\ \tau_2 &= \pi_2(\pi \mid p) - (\pi \mid \pi)p_2 + i\sqrt{(\pi \mid \pi)}\,(\pi_3 p_1 - \pi_1 p_3) \\ \tau_3 &= \pi_3(\pi \mid p) - (\pi \mid \pi)p_3 + i\sqrt{(\pi \mid \pi)}\,(\pi_1 p_2 - \pi_2 p_1) \end{aligned}$$

and use $\kappa_1, \kappa_2, \kappa_3, \pi_1, \pi_2, \pi_3, \tau_1, \tau_2, \tau_3$ to embed our phase space in $\mathbb{R}^6 \times \mathbb{C}^3$. The Poisson structure on $\mathbb{R}^6 \times \mathbb{C}^3$ is chosen to make that embedding Poisson, i.e. $\kappa_1$, $\kappa_2$ are Casimirs and, in addition to (14.1), we have

$$\{\kappa_3, \pi_i\} = 0 \tag{14.11}$$

$$\{\kappa_3, \tau_i\} = i\tau_i \tag{14.12}$$

$$\{\pi_i, \tau_i\} = 0, \qquad \{\pi_i, \tau_j\} = -\epsilon_{ijk}\,\tau_k, \qquad \{\tau_i, \tau_j\} = 0 \tag{14.13}$$

$$\begin{aligned} \{\tau_i, \bar\tau_i\} &= 2i\kappa_3(\kappa_1\pi_i^2 - 2\kappa_1\kappa_3^2 + \kappa_2^2) \\ \{\tau_i, \bar\tau_j\} &= 2\epsilon_{ijk}\,\pi_k(2\kappa_1\kappa_3^2 - \kappa_2^2) + i\kappa_1\kappa_3\pi_i\pi_j. \end{aligned}$$

Specifically, in (14) and (15) of [8], the action-angle variables $(r, s, \rho, \sigma)$ are defined, where $r = \sqrt{\pi_1^2 + \pi_2^2 + \pi_3^2} = \kappa_3$ and $\rho$ is its conjugate angle, so that

$$\mathcal{L}_{H_0^0}(f) = \{f, H_0^0\} = \left\{f, \frac{r^2}{2I}\right\} = -\frac{r}{I}\frac{\partial f}{\partial\rho}.$$

Thus, taking $f = r, s, e^{i\rho}, e^{i\sigma}$, we obtain

$$\mathcal{L}_{H_0^0}(f) = \lambda f$$

with $\lambda = 0, 0, -\frac{ir}{I}, 0$, respectively. The coordinate $\tau_1$ effectively plays the role of $e^{i\rho}$ and $\tau_2$, $\tau_3$ are included so that the Poisson bracket operation is closed; see (14.13). Counting dimensions, there are six independent relations between our coordinates at every point. An obvious candidate is

$$\kappa_3^2 = \pi_1^2 + \pi_2^2 + \pi_3^2. \tag{14.14}$$

Taking the Poisson bracket with the $\tau_i$ leads to further relations

$$\pi_2\tau_3 - \pi_3\tau_2 = i\kappa_3\tau_1, \qquad \pi_3\tau_1 - \pi_1\tau_3 = i\kappa_3\tau_2, \qquad \pi_1\tau_2 - \pi_2\tau_1 = i\kappa_3\tau_3$$

and we encounter more relations in (14.16) below. From (14.11) and (14.12) we have

$$\{\tau_i, H_0^0\} = -\frac{i\kappa_3}{I}\tau_i \qquad\text{and}\qquad \{\pi_i, H_0^0\} = \{\kappa_i, H_0^0\} = 0, \qquad i = 1, 2, 3.$$
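The relations between $\pi$ and $\tau$ stated above lend themselves to a quick numerical check. The sketch below is an illustration only: it builds the $\tau_i$ from a sample $(\pi, p)$, verifies $\pi \times \tau = i\kappa_3\tau$, and confirms the eigenvalue relation for $\tau$ by differentiating along the flow of $H_0^0$. It assumes that this flow is obtained from (14.2), (14.3) with $\nabla_\pi H_0^0 = \pi/I$, so that $\dot\pi = 0$ and $\dot p = p \times \pi / I$. The sample state, the value of $I$ and the function names are assumptions.

```python
import math

# Sketch: numerical check of relations satisfied by the complex coordinates tau.
# Function names and sample values are illustrative assumptions.

def cross(u, v):
    # Works for real and complex entries alike.
    return [u[1] * v[2] - u[2] * v[1],
            u[2] * v[0] - u[0] * v[2],
            u[0] * v[1] - u[1] * v[0]]


def dot(u, v):
    return sum(a * b for a, b in zip(u, v))


def tau_coords(pi, p):
    """tau_j = pi_j (pi|p) - (pi|pi) p_j + i sqrt((pi|pi)) (pi x p)_j."""
    k3 = math.sqrt(dot(pi, pi))
    k2 = dot(pi, p)
    n = cross(pi, p)
    return [complex(pi[j] * k2 - k3 ** 2 * p[j], k3 * n[j]) for j in range(3)]


pi = [0.3, -0.7, 0.5]     # hypothetical sample state
p = [1.1, 0.4, -2.0]
I_REF = 6.928             # reference inertia I, value used in section 14.4.2

k3 = math.sqrt(dot(pi, pi))
tau = tau_coords(pi, p)

# Relation pi x tau = i * kappa3 * tau (componentwise).
lhs = cross(pi, tau)
relation_error = max(abs(l - 1j * k3 * t) for l, t in zip(lhs, tau))

# Along the flow of H_0^0: pi is constant and p_dot = p x pi / I (assumption
# stated in the lead-in). tau is linear in p at fixed pi, so its time
# derivative is tau evaluated at p_dot.
p_dot = [x / I_REF for x in cross(p, pi)]
tau_dot = tau_coords(pi, p_dot)
eigen_error = max(abs(td + 1j * k3 / I_REF * t) for td, t in zip(tau_dot, tau))
```

Both errors are at rounding level for generic inputs, which is consistent with the $\tau_i$ being eigenfunctions of the Lie operator with eigenvalue $-i\kappa_3/I$.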

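The Lie operator may therefore be written as

$$\mathcal{L}_{H_0^0} = -\frac{i\kappa_3}{I}\left(\tau_1\frac{\partial}{\partial\tau_1} - \bar\tau_1\frac{\partial}{\partial\bar\tau_1} + \tau_2\frac{\partial}{\partial\tau_2} - \bar\tau_2\frac{\partial}{\partial\bar\tau_2} + \tau_3\frac{\partial}{\partial\tau_3} - \bar\tau_3\frac{\partial}{\partial\bar\tau_3}\right)$$

and thus maps a monomial

$$\mu = \kappa_1^{k_1}\kappa_2^{k_2}\kappa_3^{k_3}\pi_1^{k_4}\pi_2^{k_5}\pi_3^{k_6}\tau_1^{k_7}\tau_2^{k_8}\tau_3^{k_9}\bar\tau_1^{k_{10}}\bar\tau_2^{k_{11}}\bar\tau_3^{k_{12}}$$

to

$$\mathcal{L}_{H_0^0}(\mu) = -\frac{i}{I}(k_7 + k_8 + k_9 - k_{10} - k_{11} - k_{12})\,\kappa_3\,\mu.$$

Here $k_3 \in \mathbb{Z}$ and all other $k_i \in \mathbb{N}_0$. For $\mu \in \operatorname{im} \mathcal{L}_{H_0^0}$ we obtain

$$\mathcal{L}_{H_0^0}^{-1}(\mu) = \frac{iI\kappa_3^{-1}\mu}{k_7 + k_8 + k_9 - k_{10} - k_{11} - k_{12}}.$$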
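Because the Lie operator acts diagonally on monomials, solving the homological equations reduces to bookkeeping on exponents. The following sketch shows one way this could be organised; it is not the authors' Mathematica code. A monomial is stored as the tuple $(k_1, \ldots, k_{12})$ together with a complex coefficient, and the function names and the reference value of $I$ are assumptions.

```python
# Sketch: the Lie operator and its inverse acting on monomials
# kappa1^k1 kappa2^k2 kappa3^k3 pi1^k4 pi2^k5 pi3^k6
# tau1^k7 tau2^k8 tau3^k9 conj(tau1)^k10 conj(tau2)^k11 conj(tau3)^k12.
# A term is a (coefficient, exponent tuple) pair; names are illustrative.

def tau_degree(k):
    """k7 + k8 + k9 - k10 - k11 - k12 for an exponent tuple k."""
    return k[6] + k[7] + k[8] - k[9] - k[10] - k[11]


def lie_apply(coeff, k, inertia_ref):
    """Apply the Lie operator: multiply by -(i/I) d kappa3, d the tau degree."""
    shifted = k[:2] + (k[2] + 1,) + k[3:]       # one extra power of kappa3
    return coeff * (-1j / inertia_ref) * tau_degree(k), shifted


def lie_inverse(coeff, k, inertia_ref):
    """Invert the Lie operator on a monomial in its image (tau degree != 0)."""
    d = tau_degree(k)
    if d == 0:
        raise ValueError("monomial lies in the kernel; it belongs to the normal form")
    shifted = k[:2] + (k[2] - 1,) + k[3:]       # one power of kappa3 less
    return coeff * (1j * inertia_ref) / d, shifted


I_REF = 6.928  # reference inertia I, value used in section 14.4.2

# Example monomial kappa2 * kappa3^-4 * pi1 * tau1 (tau degree 1).
mono = (0, 1, -4, 1, 0, 0, 1, 0, 0, 0, 0, 0)
w_coeff, w_mono = lie_inverse(1.0, mono, I_REF)
back_coeff, back_mono = lie_apply(w_coeff, w_mono, I_REF)

# tau1 * conj(tau1) has tau degree 0 and is annihilated by the Lie operator.
kernel_mono = (0, 0, 0, 0, 0, 0, 1, 0, 0, 1, 0, 0)
kernel_coeff, _ = lie_apply(1.0, kernel_mono, I_REF)
```

Applying `lie_apply` after `lie_inverse` returns the original term, and monomials of $\tau$ degree zero are sent to zero, so they stay in the normal form rather than the generator.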

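Thus, while those terms in $H_1^0$ that are in the kernel of $\mathcal{L}_{H_0^0}$ form the normalized $H_0^1$, the other monomials contribute to the generator $W_1$, yielding

$$\begin{aligned} W &= I\sum_{i=1}^{3}\left(\frac{1}{m_i} - \frac{1}{m}\right)\kappa_3^{-5}\,\operatorname{Im}\tau_i\left(\kappa_2\pi_i - \frac{1}{4}\operatorname{Re}\tau_i\right) \\ &= I\sum_{ijk\ \text{cyclic}}\left(\frac{1}{m_i} - \frac{1}{m}\right)\frac{\pi_j p_k - \pi_k p_j}{4(\pi \mid \pi)^2}\,\bigl(3(\pi \mid p)\pi_i + (\pi \mid \pi)p_i\bigr). \end{aligned}$$

We remark that once $W$ is known, the calculation of the transformed Hamiltonian $H \circ \psi = H_0^0 + \varepsilon H_0^1 + O(\varepsilon^2)$ may equally be done in the $(\pi, p)$ coordinates. Functions in the kernel of $\mathcal{L}_{H_0^0}$ are invariant under the flow $\varphi$ generated by $H_0^0$. Kirchhoff’s equations for $H_0^0$ are

$$\dot\pi = 0, \qquad \dot p = p \times \pi, \tag{14.15}$$

revealing that the ring of $\varphi$-invariant functions is generated by $(p \mid p) = \kappa_1$, $(\pi \mid p) = \kappa_2$, $\pi_1$, $\pi_2$, $\pi_3$. This yields the relations

$$\begin{aligned} \tau_i\bar\tau_i &= (\kappa_1\kappa_3^2 - \kappa_2^2)(\kappa_3^2 - \pi_i^2) \\ \tau_i\bar\tau_j &= (\kappa_1\kappa_3^2 - \kappa_2^2)(-\pi_i\pi_j + i\epsilon_{ijk}\,\kappa_3\pi_k). \end{aligned} \tag{14.16}$$

Note that $\kappa_1\kappa_3^2 - \kappa_2^2 = (\pi \times p \mid \pi \times p)$. We now can compute the normal form of order one from the homological equation (14.9) as

$$H_0^0 + \varepsilon H_0^1 = \sum_{i=1}^{3}\left(\frac{1}{I_i} + \frac{(\pi \mid p)^2}{(\pi \mid \pi)^2}\frac{1}{m_i}\right)\frac{\pi_i^2}{2} + (\pi \times p \mid \pi \times p)\,F, \tag{14.17}$$

with

$$F = \frac{1}{4(\pi \mid \pi)}\left(\frac{1}{m_1} + \frac{1}{m_2} + \frac{1}{m_3}\right) - \frac{1}{4(\pi \mid \pi)^2}\sum_{i=1}^{3}\frac{\pi_i^2}{m_i}.$$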
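Since $H_0^1$ is the average of $H_1^0$ along the flow of $H_0^0$, the closed form (14.17) can be compared with a direct numerical average of (14.5). The sketch below does this under the assumption that the flow of $H_0^0$ keeps $\pi$ fixed and rotates $p$ rigidly about the axis $\pi$; the average over one turn is taken with equally spaced angles, which is exact for a quadratic function of $p$. The state vector and function names are illustrative assumptions, and the mass and inertia values are the rounded ones quoted in section 14.4.2.

```python
import math

# Sketch: compare the normal form of order one (14.17) with the average of the
# Hamiltonian (14.5) over a full rotation of p about the axis pi.
# Function names and the sample state are illustrative assumptions.

def cross(u, v):
    return [u[1] * v[2] - u[2] * v[1],
            u[2] * v[0] - u[0] * v[2],
            u[0] * v[1] - u[1] * v[0]]


def dot(u, v):
    return sum(a * b for a, b in zip(u, v))


def kinetic_energy(pi, p, inertia, mass):
    """Hamiltonian (14.5)."""
    return (0.5 * sum(x * x / I for x, I in zip(pi, inertia))
            + 0.5 * sum(x * x / m for x, m in zip(p, mass)))


def averaged_energy(pi, p, inertia, mass, samples=8):
    """Average (14.5) over rotations of p about pi, using equally spaced angles."""
    k3 = math.sqrt(dot(pi, pi))
    axis = [x / k3 for x in pi]
    par = dot(axis, p)
    p_par = [par * x for x in axis]                 # component of p along pi
    p_perp = [a - b for a, b in zip(p, p_par)]      # component orthogonal to pi
    p_quarter = cross(axis, p_perp)                 # p_perp turned by 90 degrees
    total = 0.0
    for j in range(samples):
        th = 2.0 * math.pi * j / samples
        rotated = [p_par[i] + math.cos(th) * p_perp[i] + math.sin(th) * p_quarter[i]
                   for i in range(3)]
        total += kinetic_energy(pi, rotated, inertia, mass)
    return total / samples


def normal_form_order_one(pi, p, inertia, mass):
    """Right-hand side of (14.17), including the function F."""
    pp = dot(pi, pi)
    k2 = dot(pi, p)
    n = cross(pi, p)
    quad = 0.5 * sum((1.0 / I + k2 ** 2 / (pp ** 2 * m)) * x * x
                     for x, I, m in zip(pi, inertia, mass))
    F = (sum(1.0 / m for m in mass) / (4.0 * pp)
         - sum(x * x / m for x, m in zip(pi, mass)) / (4.0 * pp ** 2))
    return quad + dot(n, n) * F


inertia = (5.120, 10.71, 8.707)   # rounded values quoted in section 14.4.2
mass = (158.4, 166.2, 278.7)
pi = [0.3, -0.7, 0.5]             # hypothetical sample state
p = [1.1, 0.4, -2.0]

average = averaged_energy(pi, p, inertia, mass)
closed_form = normal_form_order_one(pi, p, inertia, mass)
difference = abs(average - closed_form)
```

The two numbers agree to rounding accuracy, which supports reading (14.17) as the first-order average of (14.5).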

As shown in [3] the reversing symmetries (14.6) are preserved by the normalization procedure, whence the $\pi_i$ only enter squared in (14.17). To obtain the normal form of order two we must add to (14.17) the second-order terms, computed from (14.10) as

$$\frac{\varepsilon^2}{2}H_0^2 = -I\,\frac{(p \mid p)(\pi \mid p)^2}{2(\pi \mid \pi)^4}\sum_{i<j}\left(\frac{1}{m_i} - \frac{1}{m_j}\right)^2\pi_i^2\pi_j^2 + (\pi \times p \mid \pi \times p)\,G \tag{14.18}$$

where $G$ is a lengthy expression with two factors of the form $\left(\frac{1}{m_i} - \frac{1}{m}\right)$ multiplying every term. We remark that there are terms of order three that also contain factors of the form $\left(\frac{1}{I_i} - \frac{1}{I}\right)$. Computation of (14.17) and (14.18) was carried out in Mathematica [15]. Equation (14.17) coincides with the result of first-order averaging in the $(r, s, \rho, \sigma)$ coordinates of [8, section 5.2].

14.3 Dynamics of the normal forms

The normal form $\bar H$ depends on $p$ only through the two Casimirs $\kappa_1$ and $\kappa_2$. Explicitly writing $\bar H = \bar H(\kappa, \pi)$ leads to the equations of motion

$$\dot\pi = \pi \times \nabla_\pi \bar H \tag{14.19}$$

$$\dot p = p \times \nabla_\pi \bar H + \frac{1}{\kappa_3}\frac{\partial \bar H}{\partial \kappa_3}\,p \times \pi.$$

Using the relation (14.14) to make $\bar H$ a function of $\kappa_1, \kappa_2, \pi_1, \pi_2, \pi_3$ alone, the second set of equations again becomes simply

$$\dot p = p \times \nabla_\pi \bar H, \tag{14.20}$$

the only difference from (14.3) being that $H$ has been replaced by its normal form $\bar H$. As the right-hand side of (14.20) is of the form $p \times \pi + O(\varepsilon)$, these are ‘fast’ equations, while the ‘slow’ equations (14.19) decouple. This is of course the purpose of the normalization procedure: the function $H_0^0$ (and thus $\kappa_3$) becomes an integral of motion and we may reduce to single degree-of-freedom dynamics governed by (14.19), and subsequently solve for $p$ by integrating (14.20). The reduced phase space is $\mathbb{R}^3$ with Poisson structure (14.1). Correspondingly, $\kappa_3$ becomes a Casimir, and, having fixed the value $\kappa_3 = r$, solutions are confined to a 2-sphere of radius $r$. Before examining the normal form of order two, we review the results of [8] on the normal form of order one in the present context.


Figure 14.2. Degenerate bifurcation diagrams for equilibria of the first order normal form for the cases I2 > I3 ≥ I1 (a), and I2 > I1 > I3 (b). The Hamiltonian Hopf bifurcation of figure 14.1 does not appear in the normal form.

14.3.1 First-order analysis

The Hamiltonian (14.17) coincides with that of a free rigid body with moments of inertia

$$J_i = \left(I_i^{-1} + \frac{a^2 b^2}{r^4}\,m_i^{-1} - \frac{a^2 r^2 - a^2 b^2}{2r^4}\,m_i^{-1}\right)^{-1} \tag{14.21}$$

after we subtract the combination of Casimirs

$$\frac{1}{4}\,(m_1^{-1} + m_2^{-1} + m_3^{-1})(\kappa_1 - \kappa_2^2\kappa_3^{-2}).$$

Consequently, there are always six relative equilibria

$$(\pm r, 0, 0), \quad (0, \pm r, 0), \quad (0, 0, \pm r) \tag{14.22}$$

of (14.19). When $r = b$ these give rise to the pure modes of the full six averaged equations. We are interested in heteroclinic orbits between these modes and therefore concentrate on $r = b$. Moreover, if $\pi \parallel p$ then the motion of $p$ is also $\varepsilon$-slow since we are perturbing from (14.15). In fact, since $p$ remains parallel to $\pi$, their dynamics are identical, implying that heteroclinic orbits of the reduced system (14.19) yield heteroclinic orbits of the full averaged system (14.19), (14.20). Hence, connecting orbits exist between each pair of unstable pure modes as long as the three ‘adjusted’ moments of inertia $J_1$, $J_2$, $J_3$ all differ: we essentially have the classical free rigid body dynamics.

Note that, when $r = b$, the expression (14.21) simplifies to $J_i = \bigl(I_i^{-1} + (\gamma^2 m_i)^{-1}\bigr)^{-1}$. Thus, for $I_2 > I_3 \ge I_1$ we have

$$J_2 = J_3 \iff \gamma = \gamma_{23C} = \alpha_{23}\sqrt{I_2 I_3}$$

and for $I_2 > I_1 > I_3$ we furthermore have

$$J_1 = J_3 \iff \gamma = \gamma_{13C} = \alpha_{13}\sqrt{I_1 I_3}.$$
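The degeneracy condition $J_2 = J_3$ can be illustrated with numbers. The short sketch below evaluates the simplified adjusted moments of inertia $J_i = (I_i^{-1} + (\gamma^2 m_i)^{-1})^{-1}$ at $\gamma = \gamma_{23C}$ and slightly away from it, using the rounded example values from section 14.4.2. The function name and the size of the offset are assumptions made for illustration.

```python
import math

# Sketch: adjusted moments of inertia J_i for r = b and the degeneracy J2 = J3
# at gamma = gamma_23C. Names and the offset 0.05 are illustrative assumptions.

def adjusted_inertia(I_i, m_i, gamma):
    """J_i = (1/I_i + 1/(gamma^2 m_i))^(-1), the r = b form of (14.21)."""
    return 1.0 / (1.0 / I_i + 1.0 / (gamma ** 2 * m_i))


# Rounded example values quoted in section 14.4.2 (I2 > I3 > I1).
I1, I2, I3 = 5.120, 10.71, 8.707
m1, m2, m3 = 158.4, 166.2, 278.7

alpha23 = math.sqrt((1.0 / m2 - 1.0 / m3) / (I2 - I3))
gamma_23C = alpha23 * math.sqrt(I2 * I3)

# At the critical value the adjusted moments J2 and J3 coincide.
J2_crit = adjusted_inertia(I2, m2, gamma_23C)
J3_crit = adjusted_inertia(I3, m3, gamma_23C)

# Away from the critical value they differ.
gamma_off = gamma_23C + 0.05
J2_off = adjusted_inertia(I2, m2, gamma_off)
J3_off = adjusted_inertia(I3, m3, gamma_off)

print(f"gamma_23C = {gamma_23C:.5f}, J2 = {J2_crit:.6f}, J3 = {J3_crit:.6f}")
```

At $\gamma_{23C}$ the first-order normal form behaves like a symmetric free rigid body in the 2–3 plane, which is the source of the degenerate equator of equilibria.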

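In these cases an equator of the invariant sphere is filled with degenerate equilibria, and the scenarios (b) and (c) of figure 14.1 degenerate to panels (a) and (b) of figure 14.2, respectively. Note that $\gamma_{3P}$ (resp. $\gamma_{3PP}$) and $\gamma_{2P}$ (resp. $\gamma_{1P}$) are $\varepsilon$-close to each other. Evidently the normal form of order one cannot adequately represent the behavior on such small scales.

14.3.2 Second-order analysis

Let $\bar H$ now denote the normal form of order two, i.e. the sum of (14.17) and (14.18). Again we focus on the dynamics on the $X_{\bar H}$-invariant submanifold $\{\pi \parallel p\}$ of $\mathbb{R}^3 \times \mathbb{R}^3$, first analysing the reduced system (14.19) on the 2-sphere $S_b^2$ of radius $r = b$ and then reconstructing the solution of (14.20) by setting $p(t) = \frac{a}{r}\pi(t) = \frac{\pi(t)}{\gamma}$. For $\pi \parallel p$ the normal form of order two reduces to

$$\bar H_{a,b}(\pi) = \sum_{i=1}^{3}\left(\frac{1}{I_i} + \frac{1}{\gamma^2 m_i}\right)\frac{\pi_i^2}{2} - \frac{I}{2\gamma^4 b^2}\sum_{i<j}\left(\frac{1}{m_i} - \frac{1}{m_j}\right)^2\pi_i^2\pi_j^2$$

and the (non-oriented) solution curves are given by the intersections $S_b^2 \cap \{\bar H_{a,b} = h\} \subseteq \mathbb{R}^3$. Most orbits are periodic, while those points where the two surfaces touch are equilibria. Furthermore, there are homoclinic and heteroclinic orbits to hyperbolic equilibria; see figure 14.3. Seeking critical points of $\bar H_{a,b}$ on $S_b^2$ we are led to the equations

$$\begin{aligned} \left[\frac{1}{I_1} + \frac{1}{\gamma^2 m_1} - \frac{I}{\gamma^4}\left(\left(\frac{1}{m_1} - \frac{1}{m_2}\right)^2\frac{\pi_2^2}{b^2} + \left(\frac{1}{m_1} - \frac{1}{m_3}\right)^2\frac{\pi_3^2}{b^2}\right)\right]\pi_1 &= \lambda\pi_1 \\ \left[\frac{1}{I_2} + \frac{1}{\gamma^2 m_2} - \frac{I}{\gamma^4}\left(\left(\frac{1}{m_2} - \frac{1}{m_1}\right)^2\frac{\pi_1^2}{b^2} + \left(\frac{1}{m_2} - \frac{1}{m_3}\right)^2\frac{\pi_3^2}{b^2}\right)\right]\pi_2 &= \lambda\pi_2 \\ \left[\frac{1}{I_3} + \frac{1}{\gamma^2 m_3} - \frac{I}{\gamma^4}\left(\left(\frac{1}{m_3} - \frac{1}{m_1}\right)^2\frac{\pi_1^2}{b^2} + \left(\frac{1}{m_3} - \frac{1}{m_2}\right)^2\frac{\pi_2^2}{b^2}\right)\right]\pi_3 &= \lambda\pi_3 \end{aligned}$$

where $\lambda$ is a Lagrange parameter. We immediately conclude that the points (14.22) are equilibria of the normal form of order two as well. Since $I_2 > I_1$ there are no valid solutions $(\pi_1, \pi_2, \pi_3)$ with both $\pi_1$ and $\pi_2$ non-zero.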


Figure 14.3. Phase portraits of the reduced order two system on $S_b^2$. Panels (a)–(e) show the cases $I_2 > I_3 \ge I_1$ and $I_2 > I_1 > I_3$, with $\gamma$ increasing from below $\gamma_-$ to above $\gamma_+$, for mass and inertia values as specified in section 14.4.2. Panel (f) shows the case $I_2 > I_1 > I_3$ with $\gamma \in\ ]\check\gamma_-, \check\gamma_+[$, for the same mass and inertia values except for $I_1 = 9.7$. See the text for further details.


If $I_2 > I_3$ we may eliminate $\lambda$ from the last two equations and obtain four additional equilibria $(0, \pm\hat\pi_2, \pm\hat\pi_3)$ for $\gamma = \left|\frac{b}{a}\right|$ in the open interval $]\gamma_-, \gamma_+[$ with

$$\gamma_\pm^2 = \alpha_{23}^2 I_2 I_3\left(\frac{1}{2} + \frac{1}{2}\sqrt{1 \pm 4\left(\frac{I}{I_3} - \frac{I}{I_2}\right)}\right). \tag{14.23}$$

At $\gamma_-$ and $\gamma_+$ the equilibria $(0, 0, \pm b)$ and $(0, \pm b, 0)$ undergo Hamiltonian pitchfork bifurcations, respectively, see figures 14.3(a), (b) and (d), (e). Note that the values (14.23) are $\varepsilon^2$-close to $\gamma_{3P}^2$ and $\gamma_{2P}^2$, while the normalizing coordinate transformation $\psi$ maps

$$\left(0, \pm\hat\pi_2, \pm\hat\pi_3, 0, \pm\frac{\hat\pi_2}{\gamma}, \pm\frac{\hat\pi_3}{\gamma}\right) \tag{14.24}$$

into an $\varepsilon^2$-neighborhood of (14.8). Hence, (14.24) correspond to the mixed 2–3 modes of the original system. At $\gamma = \gamma_{23C}$ we have $\bar H_{a,b}(0, \pm b, 0) = \bar H_{a,b}(0, 0, \pm b)$ and all four of these (hyperbolic) equilibria, the pure 2 and pure 3 modes, are connected in a heteroclinic cycle, see figure 14.3(c).

The whole $X_{\bar H}$-invariant subspace $\{\pi \parallel p\}$ of $\mathbb{R}^3 \times \mathbb{R}^3$ consists of critical values $(\bar H, \kappa_1, \kappa_2, \kappa_3) = (h, a^2, ab, b)$ of the energy–momentum mapping $(\bar H, \kappa)$. To obtain an overview of these dynamics we therefore consider

$$\left(\bar H, \frac{\kappa_3}{\sqrt{\kappa_1}} = \gamma\right) : \mathbb{R}^3 \times \mathbb{R}^3 \setminus \{0\} \longrightarrow \mathbb{R}^2. \tag{14.25}$$

The set of critical values $(h, \gamma)$ of (14.25) is sketched in figure 14.4(a); see also figure 14.1(b).

If $I_1 > I_3$ there are four further equilibria $(\pm\check\pi_1, 0, \pm\check\pi_3)$ for $\gamma \in\ ]\check\gamma_-, \check\gamma_+[$ with

$$\check\gamma_\pm^2 = \alpha_{13}^2 I_1 I_3\left(\frac{1}{2} + \frac{1}{2}\sqrt{1 \pm 4\left(\frac{I}{I_3} - \frac{I}{I_1}\right)}\right)$$

which correspond to the mixed 1–3 modes. However, the scenario is different here, since these equilibria are hyperbolic while the equilibria (14.22) are all elliptic. Therefore, it is now the mixed 1–3 modes that are connected in a heteroclinic cycle throughout $]\check\gamma_-, \check\gamma_+[$. In case $\gamma_+ < \check\gamma_-$ the symmetries (14.7) ensure that the flows for these $\gamma$ are all equivalent to that depicted in figure 14.3(f). The critical values $(h, \gamma)$ are shown in figure 14.4(b) for this case; note, however, that the (additional) intersection of the pure mode 1 and 3 branches does not correspond to a bifurcation.

In case $\gamma_{23C} > \check\gamma_-$ the two bifurcation sequences interact, resulting in fourteen coexisting equilibria (eight elliptic and six hyperbolic) and a heteroclinic cycle between the pure 2 and the mixed 1–3 modes. We end by noting that the $(r, s, \rho, \sigma)$ variables used in [8] are singular at $\pi_1 = \pm r$ (here $(s, \sigma)$ provide cylindrical coordinates on $S_b^2$) and thus that the full phase portraits of figure 14.3 could not be deduced from normalized systems in those coordinates.
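The location of the mixed 2–3 equilibria of the order-two normal form can be traced numerically. The sketch below rests on a short calculation that is not spelled out in the text: setting $\pi_1 = 0$ in the critical-point equations of section 14.3.2, eliminating $\lambda$ and writing $x = \hat\pi_2^2/b^2$ gives $1/\gamma^2 - 1/\gamma_{23C}^2 = (I\delta/\gamma^4)(1 - 2x)$ with $\delta = 1/m_2 - 1/m_3$. Under that assumption the endpoints $x = 0$ and $x = 1$ should reproduce $\gamma_-$ and $\gamma_+$ of (14.23). Variable names are illustrative; the numbers are the rounded example values of section 14.4.2.

```python
import math

# Sketch: mixed 2-3 equilibria of the order-two normal form on {pi parallel p}.
# The formula for x = pi2_hat^2 / b^2 is derived in the lead-in (an assumption
# of this sketch), not quoted from the chapter.

I1, I2, I3 = 5.120, 10.71, 8.707      # rounded values from section 14.4.2
m1, m2, m3 = 158.4, 166.2, 278.7
I_REF = 6.928                          # reference inertia I from section 14.4.2

delta = 1.0 / m2 - 1.0 / m3
alpha23_sq = delta / (I2 - I3)
gamma_C_sq = alpha23_sq * I2 * I3      # gamma_23C squared


def gamma_pm_sq(sign):
    """Squared bifurcation values gamma_-^2 (sign=-1), gamma_+^2 (sign=+1), (14.23)."""
    root = math.sqrt(1.0 + sign * 4.0 * (I_REF / I3 - I_REF / I2))
    return gamma_C_sq * (0.5 + 0.5 * root)


def mixed_fraction(gamma_sq):
    """x = pi2_hat^2 / b^2 at the mixed equilibrium for a given gamma^2."""
    return 0.5 * (1.0 - gamma_sq ** 2 / (I_REF * delta)
                  * (1.0 / gamma_sq - 1.0 / gamma_C_sq))


gamma_minus_sq = gamma_pm_sq(-1)
gamma_plus_sq = gamma_pm_sq(+1)

x_at_minus = mixed_fraction(gamma_minus_sq)   # branch leaves the pure 3 modes
x_at_plus = mixed_fraction(gamma_plus_sq)     # branch reaches the pure 2 modes
x_at_critical = mixed_fraction(gamma_C_sq)    # midway at gamma_23C
```

With these inputs `x_at_minus` is 0, `x_at_plus` is 1 and `x_at_critical` is 1/2 up to rounding, so the branch runs from the pure 3 modes at $\gamma_-$ to the pure 2 modes at $\gamma_+$ and passes $\hat\pi_2^2 = \hat\pi_3^2$ at $\gamma_{23C}$.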


Figure 14.4. The set of critical values of (14.25), as a bifurcation diagram plotted in terms of energy, for I2 > I3 > I1 (a), I2 > I1 > I3 and γ+ < γˇ− (b). Mass and inertia values are as for figures 14.3(a)–(e) and (f), respectively.

14.4 Implications for the original system

After the coordinate transformation $\psi$, the Hamiltonian of (14.5) is $\varepsilon^3$-close to the normal form $\bar H$ of order two given by (14.17), (14.18). Hence, by the usual averaging theorem (e.g. [7]), solutions of the original Kirchhoff equations follow those of the normalized system with Hamiltonian $\bar H$ within $O(\varepsilon^2)$ for times of $O(1/\varepsilon)$. However, in this case the dimension reduction implicit in the normalization procedure allows us to draw stronger conclusions. We describe these first for the model equations and then for the physical application.

14.4.1 Dynamics of the Kirchhoff equations

Since the phenomena described in section 14.3 are structurally stable in the space of single degree-of-freedom Hamiltonian systems, and further normalization at successively higher order preserves invariance of $\kappa_3$, these behaviors persist in normal forms $\bar H$ of arbitrarily high order. Therefore, we may conclude that, up to ‘splitting’ terms smaller than $\varepsilon^n$ for any $n$, e.g. heteroclinic cycles among the pure 2 and 3 and the mixed 1–3 modes exist for parameter values $\gamma = \gamma_{23C}$ and $\gamma = \gamma_{13C}$. More generally, the full dynamics tracks those of the completely integrable system $\bar H$ with errors smaller than $\varepsilon^n$ for any $n$. All these conclusions are, of course, predicated on the assumption that the (added) mass and inertia values are ‘nearly equal’:

$$\varepsilon = \max_{i,j}\left\{\left|\frac{1}{I_i} - \frac{1}{I_j}\right|, \left|\frac{1}{m_i} - \frac{1}{m_j}\right|\right\} \ll \frac{1}{I}. \tag{14.26}$$

368

Heinz Hanßmann and Philip Holmes

In contrast, in [8], it was only assumed that (14.26) held for i, j = 2, 3 and m 1 and I1 were allowed to differ from m 2 , m 3 and I2 , I3 respectively by large amounts. This is why the averaging of [8, section 5.2] is only formal. We remark that, in addition to the heteroclinic orbits to and cycles among the pure and mixed modes discussed above, the phase portraits of figures 14.3(b)–(d) also reveal homoclinic loops to the (unstable) pure 2 and pure 3 modes. These will also persist up to small splitting terms. 14.4.2 Results for underwater vehicles Unfortunately, the more stringent condition (14.26) implies that we may only draw rigorous conclusions regarding nearly-spherical vehicles, rather than for the broader class of nearly-axisymmetric, elongated (cigar-shaped) vehicles considered in [8], which are more appropriate for modeling typical streamlined underwater bodies. The heteroclinic cycle of figure 14.3(c) only occurs when I2 > I3 . The constants m 1 , m 2 , m 3 , I1 , I2 , I3 of a given ellipsoidal underwater vehicle are not independent, but all derive from the lengths l1 > l2 ≥ l3 of the principal semiaxes. In [8] a perturbation analysis from the axisymmetric case yields a threshold l1 l2 > 1.62 for passing from I3 > I2 > I1 to I2 > I3 > I1 . On the other hand, the more the li differ, the larger the differences m i − m j will be. It is therefore not a priori clear that an underwater vehicle displaying heteroclinic cycles indeed exists. We therefore considered the example l1 = 0.5, l2 = 0.3, l3 = 0.2 in more detail; here m 1 ≈ 158.4, m 2 ≈ 166.2, m 3 ≈ 278.7, I1 ≈ 5.120, I2 ≈ 10.71, I3 ≈ 8.707 and choosing I = 6.928 the relevant ‘small’ parameter (14.26) reads ε I ≈ 0.707. Thus, the normal form of order two is still a poor approximation, but can be improved upon by working with a normal form of (very) high order. 
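The size of the ‘small’ parameter for the example vehicle can be checked with a few lines of arithmetic. The sketch below assumes that (14.26) is read as ε/I = max over i, j of the differences 1/Ii − 1/Ij and 1/mi − 1/mj; the function names are illustrative and not taken from the text.

```python
from itertools import combinations

# Values quoted in the text for l1 = 0.5, l2 = 0.3, l3 = 0.2.
masses = [158.4, 166.2, 278.7]      # m1, m2, m3
inertias = [5.120, 10.71, 8.707]    # I1, I2, I3
I_ref = 6.928                       # reference value I chosen in the text


def max_reciprocal_gap(values):
    """Largest |1/a - 1/b| over all pairs (a, b) drawn from values."""
    return max(abs(1.0 / a - 1.0 / b) for a, b in combinations(values, 2))


def small_parameter(masses, inertias, I_ref):
    """epsilon under the assumed reading eps / I = max{...} of (14.26)."""
    gap = max(max_reciprocal_gap(inertias), max_reciprocal_gap(masses))
    return I_ref * gap


eps = small_parameter(masses, inertias, I_ref)
print(f"epsilon ~ {eps:.3f}")
```

With the rounded inputs above the result is about 0.706, which agrees with the quoted value 0.707 to within the rounding of the data. The maximum is attained by the inertia pair (I1, I2); the mass differences contribute far less.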
While the actual computation of such a normal form is not feasible, it suffices for us that this implies the existence of heteroclinic cycles in the full system up to splitting terms smaller than any algebraic order. Correspondingly, for the mass and inertia values given above, we were able to find the scenario of figures 14.3(a)–(e) numerically, see figures 14.5(a)–(d). These are computed with √ κ3 = π ≈ 1 and κ1 = p varying from 3 to 4, corresponding to an O(ε3 ) error of approximately 1.8×10−5 . This accounts for the close agreement between actual stable and unstable manifolds and those approximated by the normalized system. However, the physical velocities corresponding to these momenta are very low (50–100 m h−1 ). Higher speeds would reveal more significant splitting of the manifolds. The linear momentum is fixed in inertial space. A motion corresponding to the heteroclinic cycle, therefore, results in a rearrangement of the body-axis aligning with that direction. We refer to [8, section 6] for a detailed description of the full 12-dimensional underwater vehicle motion. Kirchhoff’s equations are an idealisation that neglects viscous effects and,


Figure 14.5. Projections of stable and unstable manifolds of the full system for l1 = 0.5, l2 = 0.3, l3 = 0.2 to the π space (left) and p space (right, scaled 1:3). Here we use initial conditions satisfying π = b = 1 and start solutions close to the pure 2 and 3 modes with κ1 = a 2 and κ2 = ab chosen such that: γ = | ab | = 0.25 < γ3P (a), γ = 0.32 ∈]γ3P , γ23C [ (b), γ = 0.336 037 ≈ γ23C (c), γ = 0.35 ∈]γ23C, γ2P [ (d). Compare with figure 14.3.


more significantly, assumes full pressure recovery and no separated flow; see [9]. This is reasonable for slender (cigar-shaped) bodies, but not for nearly spherical bodies traveling at realistic speeds. Addition of (phenomenological) models for separation-induced drag would introduce non-Hamiltonian perturbations that cannot be included in the present analysis.

14.5 Conclusions

In this paper we have carried out normalization up to second order for the Kirchhoff equations (14.2) and (14.3) modeling the dynamics of a nearly spherical body submerged in an incompressible, inviscid fluid of infinite extent. By effectively averaging along the fast, integrable flow of the spherical case, we find that solutions are approximated, up to terms of arbitrarily small algebraic order, by those of a completely integrable system (with lowest-order terms (14.19) and (14.20)) whose angular and linear momentum components decouple. The resulting reduced dynamics may be pictured as flows on an (approximately) constant angular momentum sphere, restricted to intersections of that sphere with the normalized Hamiltonian. The phase portraits typically contain six, ten or fourteen equilibria and the hyperbolic equilibria have heteroclinic or homoclinic orbits; see figure 14.3. This permits us to carry out a complete global analysis of the normalized system, and to conclude that the Kirchhoff equations have families of nearly heteroclinic cycles connecting certain pure and mixed mode relative equilibria, in appropriate parameter ranges. We suggest that such cycles provide ‘channels’ in the natural dynamics that will be useful in obtaining energy-efficient reorientation maneuvers of underwater vehicles.

In conclusion, we remark that the analysis of an averaged, effectively two-dimensional flow carried out here is similar in spirit to Floris Takens’s own studies of unfoldings of codimension-two bifurcations of Poincaré return maps in periodically forced oscillations [14]; see also [13]. In 1974 the chance discovery of these (still generally unpublished†) lecture notes, in the library of the Institute of Sound and Vibration Research at Southampton University, enormously encouraged and helped the second author in his fumbling attempts to learn bifurcation theory and apply it to problems in engineering. Some fifteen years later, the first author found these same two lecture notes very helpful in his ‘first steps’ in dynamical systems. We therefore believe that the present paper is an especially suitable tribute to an inspiring teacher, colleague, and friend.

† Eds: [14] is reproduced in chapter 1 of this volume.

Acknowledgements

We thank Naomi Leonard for helpful discussions and for sharing her code for computation of added masses and moments of inertia, and John Schmitt, Britta Sommer, Wolf Jung, John Vincent and Jeff Moehlis for helping produce the figures. Figures were made using the programs gnuplot [5], CorelDRAW, ESFERAS [12], Mathematica [15] and DsTool [1]. HH thanks the Max Kade Foundation for a fellowship that supported his visit to Princeton. PH thanks the US Department of Energy for partial support under DoE: DE-FG02-95ER25238.

References

[1] Back A, Guckenheimer J, Myers M, Wicklin F and Worfolk P 1992 Dstool: computer assisted exploration of dynamical systems Not. Am. Math. Soc. 39 303–9
[2] Bates L and Zou M 1993 Degeneration of Hamiltonian monodromy cycles Nonlinearity 6 313–35
[3] Churchill R C, Kummer M and Rod D L 1983 On averaging, reduction, and symmetry in Hamiltonian systems J. Diff. Eqns 49 359–414
[4] Deprit A 1969 Canonical transformations depending on a small parameter Celest. Mech. 1 12–30
[5] URL http://www.gnuplot.org
[6] Gröbner W 1967 Die Lie-Reihen und ihre Anwendungen (Berlin: Deutscher Verlag der Wissenschaften)
[7] Guckenheimer J and Holmes P 1983 Nonlinear Oscillations, Dynamical Systems and Bifurcations of Vector Fields (New York: Springer)
[8] Holmes P, Jenkins J and Leonard N E 1998 Dynamics of the Kirchhoff equations I: coincident centers of gravity and buoyancy Physica D 118 311–42
[9] Lamb H 1932 Hydrodynamics (Cambridge: Cambridge University Press)
[10] Leonard N E 1997 Stability of a bottom-heavy underwater vehicle Automatica 33 1–11
[11] Meyer K R and Hall G R 1992 Introduction to Hamiltonian Dynamical Systems and the N-Body Problem (New York: Springer)
[12] Pascua P, Rubio J L, Viartola A and Ferrer S 1996 Visualizing relative equilibria and bifurcations by painting Hamiltonians on personal computers Int. J. Bif. Chaos 6 1411–24
[13] Takens F 1973 Introduction to Global Analysis Commun. Math. Inst. 1973-2 Rijksuniversiteit Utrecht
[14] Takens F 1974 Forced oscillations and bifurcations Applications of Global Analysis I Commun. Math. Inst. 1973-3 Rijksuniversiteit Utrecht (reprinted in chapter 1 of this volume)
[15] Wolfram S 1996 The Mathematica Book (Cambridge: Cambridge University Press)


Chapter 15

Global dynamics and fast indicators

Carles Simó
Universitat de Barcelona

Dedicated to Floris Takens, who has been doing so many things for dynamical systems, on his 60th birthday.

Dynamical systems play an important role in understanding many problems in science. The variety of difficulties that present themselves to the dynamicist is huge. Local problems around some well known object (a point, a periodic or quasi-periodic orbit, an invariant manifold, etc) can be studied by different methods. A combination of analytic, geometric and topological tools provides a detailed account of this local dynamics and the bifurcations which occur when changing parameters. On the other hand, more global problems, relating to a big part of the phase space or to a large set in the parameter space, can be studied by using probabilistic methods and by computing several numerical indicators. But it can happen that we would like to combine both things: we need a relatively detailed knowledge of the dynamics in a large set. To this end, the following is useful:

• to extend the local analysis to larger domains, say by using normal forms up to some relatively large order, so that they can give good quantitative information and to unfold the bifurcations found. This analysis provides a guidance to the numerical experiments to be done,
• to perform systematic numerical experiments, such as the computation of invariant objects: fixed points, periodic orbits, tori, etc, and, if it applies, the related stable, unstable and centre manifolds, as well as intersections of the manifolds (homoclinic and heteroclinic phenomena) and quantitative measures associated with them; and finally, to continue these objects with respect to parameters and detect and analyse the bifurcations. These experiments, in turn, give hints on new phenomena to be investigated.


This programme has been carried out previously in several cases; examples are [3, 5, 6, 7, 12, 13, 14]. However, both approaches may require considerable effort. It is therefore desirable to have fast indicators aiming at a significant knowledge of the dynamics in a quick way (with the help of arrays of processors). Both the design of the indicators and the interpretation of the results must be guided by

• the known dynamical phenomena of the considered class of systems,
• the role of the numerical errors,
• computational efficiency.

In this paper we sketch some numerical tools. They are presented by showing how they apply to some examples, where we restrict ourselves to conservative systems. But they are described in sufficient generality, so that they can be used in many other problems in ‘experimental’ mathematics.

15.1 A model problem in 1½ degrees of freedom

Consider a periodically perturbed pendulum with differential equation

    x'' = (\alpha + \beta \cos t) \sin x. \qquad (15.1)

This problem has received much attention because it is one of the simplest paradigms showing most of the dynamics in its class. For motivation and recent results see [11] and, in a slightly different context (limit of two coupled pendula), see [9]. Some of the methods presented here are used in [2]. It is convenient to introduce y = x′. The study can be done by means of the stroboscopic map Pα,β(x, y) = ϕ(2π; 0, x, y, α, β), which is an area preserving map. In what follows we shall denote it simply by P. Note that the variable x ∈ S¹, but it is also useful to consider x on the lift (e.g., to distinguish different fixed points of P). For β = 0 the invariant curves of P coincide with the orbits of a pendulum. The symmetries of (15.1) imply that it is sufficient to consider both α, β ≥ 0. The dynamics is also symmetric with respect to both the x- and y-axes, that is P⁻¹Sx = SxP, P⁻¹Sy = SyP, where Sx, Sy are the reflectional symmetries with respect to the x- and y-axis, respectively.

It is clear that for α, β ≪ 1 and y small we have a ‘slow pendulum’ with a fast time-periodic perturbation. In this region the problem is suitable for the use of normal forms; see [11]. On the other hand, for (α, β) fixed and y sufficiently large, the fast angle is x and t plays the slow role. Something similar happens if 1 ≪ β < γα with γ bounded away from 1, except in a ‘neighbourhood of the separatrix’, where the two dynamics (the frozen pendulum and the perturbation) are comparable. This suggests that for any (α, β) most of the interesting dynamics occurs in a region without rotationally invariant curves (i.e., curves not crossing y = 0 that are a graph over S¹, denoted ric from now on). Outside this region most of the points belong to a ric. The width of the chaotic regions becomes exponentially small with respect to |y|.
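The stroboscopic map P can be approximated by integrating (15.1) over one period of the forcing. The following is a minimal sketch and not the integrator used for the computations in this chapter (those rely on a Taylor series method, see the appendix): a fixed-step classical Runge–Kutta scheme is swapped in for brevity, and the function names and the default number of steps are assumptions made for illustration.

```python
import math


def pendulum_rhs(t, state, alpha, beta):
    """Right-hand side of x' = y, y' = (alpha + beta cos t) sin x."""
    x, y = state
    return (y, (alpha + beta * math.cos(t)) * math.sin(x))


def stroboscopic_map(x, y, alpha, beta, steps=400):
    """Approximate P(x, y) by RK4 integration over t in [0, 2*pi].

    The x component is returned on the lift (it is not reduced mod 2*pi).
    """
    h = 2.0 * math.pi / steps
    t, z = 0.0, (x, y)
    for _ in range(steps):
        k1 = pendulum_rhs(t, z, alpha, beta)
        k2 = pendulum_rhs(t + h / 2, tuple(a + h / 2 * b for a, b in zip(z, k1)), alpha, beta)
        k3 = pendulum_rhs(t + h / 2, tuple(a + h / 2 * b for a, b in zip(z, k2)), alpha, beta)
        k4 = pendulum_rhs(t + h, tuple(a + h * b for a, b in zip(z, k3)), alpha, beta)
        z = tuple(a + h / 6 * (p + 2 * q + 2 * r + s)
                  for a, p, q, r, s in zip(z, k1, k2, k3, k4))
        t += h
    return z


# Reversibility check P^{-1} S_y = S_y P, with S_y(x, y) = (-x, y):
# applying P, then S_y, then P again should give S_y of the starting point.
x0, y0 = 0.3, 0.4
x1, y1 = stroboscopic_map(x0, y0, 0.5, 0.2)
x2, y2 = stroboscopic_map(-x1, y1, 0.5, 0.2)
print(abs(x2 + x0), abs(y2 - y0))  # both should be close to zero
```

The reversibility relation gives a cheap consistency test for any implementation of P, since it holds exactly for the true map and only up to the integration error for the numerical one.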

15.2 Bounding the region of interest

To bound the domain of interest (from now on DI) we can look for a ric close enough to the upper boundary of the domain. We shall refer loosely to this curve as ymin. To identify such a curve we can use either the values ymin(x = 0), ymin(x = π) or the rotation number ρ_ymin of P restricted to ymin. The three values are worth consideration and we intend to follow their evolution as (α, β) change. An exploration of this kind, with a similar algorithm, has been used in [15]. With our loose definition the curve ymin (say, the ‘first’ curve) is not uniquely determined. It depends on the algorithm and its implementation. Hopefully this affects only mildly the value of, say, ymin(x = π).

15.2.1 The algorithm

Given (α, β) we look for the value of ymin(x = 0). Due to the symmetry one has Sy(ymin) = ymin. We scan the y-axis searching for ymin(x = 0). To solve equation (15.1) it is convenient to use an accurate and fast method and we have adopted a Taylor series method; see the appendix. The steps of the algorithm are the following.

• For a given initial point Q0 = (0, y0), y0 > 0, we proceed to compute iterates under P. We transport also an initial ‘random’ vector (e.g. v0 = (1, 0)) under the differential DP by integration of the variational equations. In this way points Qk = P^k(Q0) = (xk, yk), with xk ∈ [0, 2π), and vectors vk = DP^k(Q0)v0 are obtained, up to a maximum of N iterates. We also keep track of the value of xk on the lift, x̂k. Let Δk = x̂k − x̂k−1. We compute the values Δmin = min_{j=1,...,k} Δj and Δmax = max_{j=1,...,k} Δj.
• We discard the current initial value y0 if yk ≤ 0. Then y0 is incremented by δy0 and the process starts again.
• If |vk| > Lmax for some fixed value of Lmax we also discard y0 and proceed as before.
• If Δk ≤ 0 or Δmax − Δmin ≥ 2π we discard y0 as before.
• Every n iterates we sort the points Qk, k = 0, ..., mn for m ∈ N by increasing order of xk. Due to the symmetry we also use the preimages Q−k = (−xk, yk). We check for the slope associated with points with nearby x values and discard y0 if the slope exceeds some Smax (which depends on (α, β)). Care must be taken not to use too close values of nearby x due to the effect of numerical errors.
• If three consecutive (after ordering) values of x correspond to iterate numbers k1, k2 and k3, let l = k2 − k1. Then one must have k3 = k3t := k2 + l, unless k3t ∉ J = [−nm, nm] or, in case k3t ∈ J, if k3 − l ∉ J. Otherwise we discard y0 and proceed as before.
• A ‘number of revolutions’ on S¹, n(k), can be assigned to every Qk as follows. Let n(0) = 0. For j > 0 we put n(j) = n(j − 1) if xj > xj−1 (in [0, 2π)). Otherwise n(j) = n(j − 1) + 1. In a similar way we set n(−j) = −n(j) − 1. Then it is possible to produce upper and lower estimates of the rotation number (on S¹), assuming that the points are on a ric, as follows. Let lj be the number of the iterate which has the x component in the jth place when it is sorted. Let qj = (n(lj+1) − n(lj))/(lj+1 − lj) ∈ R. Start with ρlow = 0, ρupp = 1 and then update ρlow, ρupp according to the rule: if lj+1 > lj set ρlow = max{ρlow, qj}, otherwise set ρupp = min{ρupp, qj}. If along the computation of the estimates we obtain ρlow > ρupp then y0 is discarded, as in the previous cases.
• If all the previous steps have been passed successfully and we reach the Nth iterate, we consider y0 as a candidate for ymin(x = 0). Then we can decrease the current value of y0 by some multiple of δy0 (say, Kδy0), decrease the current step δy0 to δy0/L for some 1 < L ∈ N and again do all the tests above. The refinement can be applied several times, but it is not convenient to start with a too large δy0. Otherwise we may completely miss zones with ‘regular’ behaviour.
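The rotation-number test in the list above can be sketched in isolation. The code below is a simplified illustration under stated assumptions: it uses forward iterates only (the preimages obtained by symmetry are left out), and it is exercised on a rigid rotation of the circle, which stands in for the restriction of P to a ric. The function name is hypothetical.

```python
import math


def rotation_number_bounds(xs):
    """Bounds (rho_low, rho_upp) from iterates xs[k] in [0, 2*pi), k = 0..N.

    Assumes the points lie on an invariant curve that is a graph over the
    circle and that each iterate advances by less than one full turn.
    """
    # n[k]: number of revolutions, incremented whenever x wraps around.
    n = [0]
    for k in range(1, len(xs)):
        n.append(n[-1] if xs[k] > xs[k - 1] else n[-1] + 1)
    # order[j]: iterate number whose x component is in the j-th place.
    order = sorted(range(len(xs)), key=lambda k: xs[k])
    rho_low, rho_upp = 0.0, 1.0
    for a, b in zip(order, order[1:]):
        q = (n[b] - n[a]) / (b - a)
        if b > a:
            rho_low = max(rho_low, q)   # later iterate lies ahead: lower bound
        else:
            rho_upp = min(rho_upp, q)   # earlier iterate lies ahead: upper bound
    return rho_low, rho_upp


# Stand-in orbit: rigid rotation by the golden-mean rotation number.
rho = (math.sqrt(5.0) - 1.0) / 2.0
xs, x = [0.0], 0.0
for _ in range(200):
    x = (x + 2.0 * math.pi * rho) % (2.0 * math.pi)
    xs.append(x)

low, upp = rotation_number_bounds(xs)
print(low, rho, upp)
```

For an orbit that is not on a ric the ordering on the circle is not preserved, and the two bounds can cross; this is the condition ρlow > ρupp used above to discard y0.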

15.2.2 Results and interpretation

Different experiments have been carried out using the previous algorithm. The values used for the different parameters are N = 2500, δy0 = 10⁻⁴, Lmax = 10⁶, n = 10, Smax = 2√(α + β), K = 5 and L = 10. Tests have been made to check the suitability of these choices. Different strategies have been used to save computer time. Most of the time (say 98%) is devoted to the computation of iterates under P and DP. Figures 15.1–15.4 show different values of ymin in selected regions of the parameter space. In all of them the points on a grid (either 1D or 2D) have been joined with lines, to allow for a clearer representation.

Figure 15.1 displays ymin(x = 0) for small values of (α, β). Notice the staircase structure or, because we are dealing with a surface, the structure of successive cliffs. At each of them there is a jump of ymin(x = 0), typically followed by a mild decrease. It is interesting to remark that all values of ymin(x = 0) found, even for large values of (α, β), are below 2.25. Hence, there are ric not too far from the origin. On the other hand, an estimate ymin(x = π) ≈ 2√(α + β) is provided by adiabatic theory. To make the structure apparent we plot the difference d(α, β) = ymin(x = π)(α, β) − 2√(α + β) in figure 15.2. A similar structure appears, but with milder cliffs. It is also remarkable that the maximal value of d(α, β) is below 0.8. The plot of ρ_ymin is more erratic, and it is harder to show a clean 3D


Figure 15.1. Plot of ymin (x = 0) for (α, β) ∈ [0, 10] × [0, 10].

Figure 15.2. Plot of the difference d(α, β) = ymin(x = π)(α, β) − 2√(α + β) for (α, β) ∈ [0, 100] × [0, 100].

representation. Instead, several sections are shown in figure 15.3, for α = 0, 50 and 100. The typical structure is clear: ρ ymin increases with β, jumping over every rational value of the image. In some cases there are jumps up and down. When a value more or less close to 1 is reached, a jump to the vicinity of 0 is produced. The meaning is clear: on the top of the ric that we label as ymin there are chains of islands of increasing ρ. Increasing β the ric and the islands become closer, the ric is broken and the first ric we encounter is on the top of the chain of islands. The process is not monotonic: the dynamics on a vicinity of the island can become smoother and a ric can reappear below the chain of islands. This


Figure 15.3. Evolution of ρ ymin (α, β) as a function of β, for α = 0 (top), α = 50 (centre), and α = 100 (bottom).

mechanism has been detected in [15]. The possible existence of strips, on (x, y), where the twist character is lost has to be taken into account. In figure 15.4 we show ymin(x = π). As mentioned earlier, it essentially behaves like 2√(α + β). Only a few upstairs jumps can be seen at this scale (this is the reason to introduce d(α, β)). They correspond to the top of the cliffs in figure 15.2 and follow a well defined pattern. It can be seen in the right part of the figure, where we plot points close to ρ_ymin = 1 (in fact places where the values are greater than 0.6 on a rough grid). These curves correspond to places where the first ric moves from the bottom to the top of an island of period 1. These islands correspond, in the unperturbed pendulum, to circulation orbits with periods of the form 2πq for q ∈ N. As an example, for the dots in figure 15.4(right) very close to (100, 100) one has q = 16.
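The scale 2√(α + β) that appears in Smax, in d(α, β) and in figure 15.4 can be recovered from a short computation. The following is one plausible reading, not spelled out in the text: it assumes that the relevant ‘frozen’ pendulum is the one at t = 0, where the coefficient α + β cos t takes its largest value a = α + β (for α, β ≥ 0).

```latex
% Frozen pendulum at t = 0 (assumption): x'' = a \sin x, with a = \alpha + \beta.
\begin{align*}
  E(x, y) &= \tfrac{1}{2} y^{2} + a \cos x
      && \text{first integral of } x'' = a \sin x, \; y = x', \\
  E(0, 0) &= a
      && \text{energy of the hyperbolic point } x = 0, \\
  y^{2} &= 2a\,(1 - \cos x) = 4a \sin^{2}\!\tfrac{x}{2}
      && \text{on the separatrix}, \\
  y(\pi) &= 2\sqrt{a} = 2\sqrt{\alpha + \beta}
      && \text{maximal height, attained at } x = \pi .
\end{align*}
```

On this reading, the first ric above the chaotic zone must pass over the largest frozen separatrix at x = π, while at x = 0 the separatrix has zero height, which is consistent with the much smaller values reported for ymin(x = 0).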

15.3 Estimating the fraction of integrability

The next step is to try to figure out the dynamics in DI. We know that for β/α small enough, most of that domain will be filled by invariant curves. What we propose is to have a measure of the ‘fraction of integrability’ inside the domain, in the spirit of the pioneering work [8]. We shall use the Lebesgue measure of the set of points with regular behaviour.


Figure 15.4. Plot of the surface ymin (x = π) as a function of (α, β) (left), and points with ρ ymin > 0.6 (right); compare with figure 15.2.

15.3.1 A fast estimator of chaotic orbits

To produce this estimate we start, for a given (α, β), by recovering the curve ymin. Together with Sx(ymin) it gives approximate bounds of the domain of interest. Due to the symmetry it is enough to consider (x, y) in the first quadrant. Our algorithm is as follows.

• We find the maximal value, ŷ, of ymin. It occurs at x = π, except on a very small domain in the (α, β)-plane bounded by the β-axis, a curve β ≈ (2α)^(1/3) (where there are equal maxima at x = 0 and x = π) and a curve β = g(α), slowly decreasing, with g(0) ≈ 0.22 (where a jump to include period-4 islands inside DI occurs). In any case this region does not go beyond α ≈ 0.008.
• We select a value M ∈ N. This will be the number of pixels we put, in the x and y directions, on the rectangle [0, π] × [0, ŷ], having the centres at the points (ξk = kπ/M, ηj = jŷ/M) for k, j = 0, ..., M. For each k we find the maximum index j(k) such that the pixel of labels (k, j) is below ymin. Hence, the total number of pixels to be scanned is Σ_{k=0}^{M} (j(k) + 1). Note that the area of the pixels with k = 0 and k = M should be multiplied by 1/2, as well as the pixels with j = 0.
• Then we take initial points of the form (ξk, ηj) and we proceed to compute iterates under P and also iterates under DP of a random initial vector v0, |v0| = 1. A maximum of N iterates is computed. The iteration is stopped when either N is reached or when the length of DP^l(ξk, ηj)v0 exceeds some value Lmax.
• When the iteration is stopped we have decided that the orbit is either regular (0) or chaotic (1). Then all the pixels of the orbit are marked with the character 0 or 1 (but a pixel that has been already marked is not marked again). Of course, most of the pixels will contain points of both kinds but, giving prevalence to the first occupancy of the pixel by some point of an orbit, we believe that this will reflect, statistically, the dominant character of the pixel. Note that if the iterate P^l(ξk, ηj) := (ξk^(l), ηj^(l)) is not in the first quadrant, the symmetric point (|ξk^(l)|, |ηj^(l)|) is marked, which has the same character.
• We proceed with the scanning, but skipping all the points whose character has been already decided. Finally we count the number of pixels, n0, n1, having been assigned the value 0 or 1, respectively. The fraction of integrability is defined as fi = n0/(n0 + n1).

Figure 15.5. Logarithm of the fraction of integrability log(fi) in DI as a function of (α, β) ∈ [0, 10] × [0, 10] (left), and its level curves (right). (The continuous curve is for log(fi) = −4, the broken curves for log(fi) = −1, −2, −3 and the dotted curve for log(fi) = −5.)
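The pixel-marking procedure above can be sketched on a cheaper stand-in. The code below is an illustration under explicit assumptions and not the computation reported in this chapter: the Chirikov standard map replaces the stroboscopic map P (so that no ODE integration is needed), the whole torus replaces the domain bounded by ymin, and the symmetry folding and the half-weights of the boundary pixels are omitted. All names and default values are hypothetical.

```python
import math

TWO_PI = 2.0 * math.pi


def standard_map_step(x, p, vx, vp, K):
    """One step of the standard map (stand-in for P) and of its differential."""
    c = K * math.cos(x)
    p_new = p + K * math.sin(x)
    x_new = (x + p_new) % TWO_PI
    dp = vp + c * vx              # tangent of p' = p + K sin x
    dx = vx + dp                  # tangent of x' = x + p'
    return x_new, p_new % TWO_PI, dx, dp


def pixel(x, p, M):
    """Indices of the pixel containing (x, p) on the M x M grid."""
    return (min(int(x / TWO_PI * M), M - 1), min(int(p / TWO_PI * M), M - 1))


def fraction_of_integrability(K, M=32, N=200, L_max=1e6):
    """Estimate f_i = n0 / (n0 + n1) by first-occupancy marking of pixels."""
    label = [[None] * M for _ in range(M)]   # None: undecided, 0: regular, 1: chaotic
    for k in range(M):
        for j in range(M):
            if label[k][j] is not None:
                continue                      # character already decided: skip
            x = (k + 0.5) * TWO_PI / M
            p = (j + 0.5) * TWO_PI / M
            vx, vp = math.cos(1.0), math.sin(1.0)   # arbitrary unit vector
            visited, chaotic = [(k, j)], 0
            for _ in range(N):
                x, p, vx, vp = standard_map_step(x, p, vx, vp, K)
                visited.append(pixel(x, p, M))
                if math.hypot(vx, vp) > L_max:
                    chaotic = 1               # tangent vector grew too much
                    break
            for a, b in visited:
                if label[a][b] is None:       # first occupancy decides the pixel
                    label[a][b] = chaotic
    n1 = sum(row.count(1) for row in label)
    n0 = sum(row.count(0) for row in label)
    return n0 / (n0 + n1)


for K in (0.0, 1.0, 8.0):
    print(K, fraction_of_integrability(K))
```

In the integrable case K = 0 the tangent vector grows only linearly and every pixel is marked regular, so the estimate equals 1. For large K most orbits cross the threshold Lmax after a few iterates, which is also why the computational effort is higher in regular zones.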

15.3.2 Results and interpretation

The algorithm has been tested with different values of M, N and Lmax. Typical values of N, Lmax are 1000 and 10⁶; no significant differences have been observed by increasing them. For M the values 256 and 512 have been used, depending on the region of the (α, β)-plane explored. The computational effort is proportional to M² and higher in regular zones. Figure 15.5 shows the results in the window [0, 10] × [0, 10], while figure 15.6 shows the results in the window [0, 100] × [0, 100] (computed with a larger step size). To better visualize the regions where fi is small, we have used a logarithmic scale. In both figures the left panel gives a 3D view and the right


Figure 15.6. Same as figure 15.5 but now in the domain (α, β) ∈ [0, 100] × [0, 100]. (On the right, the continuous curve is for log( f i) = −5, the broken curves are for log( f i) = −1, −2, −3, −4.)

panel displays level curves (log(fi) ∈ Z) of the surface. As one may expect, for β/α small fi is close to one. Fixing a not too small value, say fi = exp(−k) for k = 1, 2, 3, 4, it seems that the lines in the (α, β)-plane with these values of fi are close to straight lines. However, it looks reasonable to conjecture that they all become closer to the diagonal in the parameter plane when α increases. Close to the diagonal there is a fast decrease in fi and we find, essentially, values below 0.01. In figure 15.5 it is still possible to distinguish the effect of the cliffs of section 15.2.2. (Notice the shape of the solid curve of level log(fi) = −4.) It is clear that when we move the parameter, so that a relatively large island enters DI, the indicator fi increases. Smaller cliffs can be seen on the dotted level curve. Furthermore, close to the line β = α some structures are visible, either in figure 15.5(left) in the form of bumps or in figure 15.5(right) as oscillations in the line log(fi) = −3.

In figure 15.6 the integrable part is clearly seen, but for β > α and not too close to α = 0, it is hard to recognize any structure, except for the same bumps near α = β observed in figure 15.6(left). The levels of log(fi) are small (of the order of −6, corresponding to fi ≈ 1/400) and the relative error in the determination of fi can start to be important. The situation becomes very clear in figure 15.6(right), where we restrict the levels shown to log(fi) ≥ −5. (Levels below this one have a more complicated structure to be analysed in the future.) The levels −3 and −4 display oscillations related to the bumps in the left figure. The level log(fi) = −5 (solid curve) has a pattern of roughly parallel strips. The origin of the bumps is easily detected. They correspond to the stability regions of the point (π, 0). Concerning the strips, if we compare with the location of the cliffs as shown in figure 15.4(right), we see that they are not the same


Figure 15.7. Graphs of log( f i) as a function of β for α = 0 (top), and α = 50 (bottom). For comparison we also plot 2ymin (x = 0) − 4 + α/100; see the text.

phenomenon, although they seem related for small α. A possible explanation is the following. When increasing β for fixed α, certainly new islands enter DI, the main ones coming from fixed points. As soon as they go inside the chaotic zone they lose stability and do not contribute to f i . Furthermore, the relative width of these islands, compared to the size of DI, becomes smaller when β increases. Indeed, the islands become thinner and, at the same time, the area of DI increases. But inside DI other islands appear, mainly around fixed and period-2 points on the x-axis, on x = π and, less importantly, on x = 0. Just to cite a few of them, on the x-axis there appear systematic islands (i.e., along families of curves in the (α, β)-plane) near x = 0.25, 0.5, 0.85, 1.2, 1.5, 1.85, 2.2, and 2.6. The most relevant ones are located close to 0.5 and 2.6. When they appear they have a relatively large contribution to f i . This is illustrated in figure 15.7, where log( f i ) is plotted, with a small step size, for α = 0 and α = 50 and β ranging from 0 to 100. For convenience we display on the same plots the function ymin (x = 0)(α, β) (shifted vertically and magnified to use the same scale) for the same values of α. While in figure 15.7(top) there is a good agreement between cliffs of this last function and sudden increases of log( f i ), in figure 15.7(bottom) there are many other bumps of log( f i ) unrelated to the jumps of ymin (x = 0). Furthermore, it must be added that the different (major) islands contributing to f i have a changing importance when they are continued (in fact, we continue the fixed periodic point at the ‘centre’ of the island) with respect to α and β. Assume, for definiteness, that we follow some path in the (α, β)-plane. Assume


also that, following that path, we find a centre-saddle bifurcation where an elliptic periodic point is born, and later we find a period doubling, so that the trace T r (of D P k if the period is k) decreases monotonically from 2 to −2. Then we can ask for the behaviour of the size of the island around the periodic point. Generically it will be zero when T r = −1, the eigenvalues being cubic roots of unity, and a local minimum appears when T r = 0, when the eigenvalues are ±i. A remarkable thing (tested for different examples; see [15]) is that the maximum of the area of the island seems to occur when the rotation number of the last invariant curve around the periodic point is between 1/10 and 1/9. Here a key idea is to consider the single indicator log( f i ), such that, when its variations with respect to (α, β) are suitably interpreted, this gives valuable hints on the full dynamics in the DI.
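The special values of the trace quoted above follow from the standard relation between the trace and the rotation number at an elliptic periodic point of an area preserving map. The block below records this relation as a reminder; the symbol ρ for the rotation number of the linearized map is notation introduced here.

```latex
% D P^k at an elliptic point: determinant 1, eigenvalues on the unit circle.
\begin{align*}
  \lambda_{\pm} &= e^{\pm 2\pi i \rho}, \qquad
  \operatorname{Tr} = \lambda_{+} + \lambda_{-} = 2\cos(2\pi\rho), \\[4pt]
  \operatorname{Tr} = 2  &\iff \rho = 0
      && \text{(centre-saddle bifurcation)}, \\
  \operatorname{Tr} = 0  &\iff \rho = \tfrac{1}{4}
      && \text{(eigenvalues } \pm i\text{)}, \\
  \operatorname{Tr} = -1 &\iff \rho = \tfrac{1}{3}
      && \text{(cubic roots of unity)}, \\
  \operatorname{Tr} = -2 &\iff \rho = \tfrac{1}{2}
      && \text{(period doubling)}.
\end{align*}
```

As the trace decreases from 2 to −2 the linear rotation number therefore sweeps from 0 to 1/2, passing the 1:4 and 1:3 resonances at Tr = 0 and Tr = −1, respectively.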

15.4 A model problem in 2½ degrees of freedom

Finally we come to a problem of higher dimension. Consider the Hamiltonian

    H(y, I_2, I_3, x, \theta_2, \theta_3) = \tfrac{1}{2}(y^2 + I_2^2) + I_3 + \varepsilon(\cos x - 1)\bigl(1 + \mu(\cos\theta_2 + \cos\theta_3)\bigr). \qquad (15.2)

This is the classical Arnol’d example of diffusion [1], but we shall consider both parameters, ε and µ, as equal. This is a difficult problem, because the splitting estimates of (15.2) given in [1] strongly rely on the fact that µ can be taken exponentially small with respect to ε. Here the unperturbed Hamiltonian contains the terms of degrees 0 and 1 in ε, while the perturbation consists of the terms in ε². Furthermore, the entire character of the perturbation (cos x − 1)(cos θ2 + cos θ3) adds more difficulties, because the splitting of the separatrices of the hyperbolic tori (x = y = 0, I2 fixed and (θ2, θ3) ∈ T²) becomes much smaller than in a general analytic case; see [16, 17] for motivation and several results. Assume we are interested in the vicinity of I2 = ω. Scaling and shifting variables and time and introducing η = √ε we obtain the system

    \begin{aligned}
    x' &= \eta y, & J_2' &= \eta^3 (\cos x - 1) \sin\theta_2, \\
    y' &= \bigl(\eta + \eta^3(\cos\theta_2 + \cos\theta_3)\bigr) \sin x, & & \\
    \theta_2' &= \omega + \eta J_2, & \theta_3' &= 1.
    \end{aligned} \qquad (15.3)

We want to have an idea of the global properties of (15.3). We can start at the Poincaré section x = π, select J2 = 0 (this is irrelevant because we can change ω) and take as parameters the initial value of y and ω. The only restriction we impose is that we take the initial value of θ3 (= t) as zero.


15.5 Computation of the local exponential growth of the distance

To decide about the domains of regular and chaotic dynamics for system (15.3) it can be enough to use the method proposed in section 15.2.1. However, we shall use a different one, which also allows us to obtain the maximal Lyapunov exponent $\Lambda$. To compute a numerical approximation to $\Lambda$ it is enough (with probability 1) to take an arbitrary vector $v$ and transport it by the variational flow. Let $v(t)$ be the vector obtained. Then the expression $\log|v(T)|/T$, for an increasing sequence of values of $T$, provides approximations to $\Lambda$. Several problems can occur.

• The orbit used to transport $v$ enters different regions where the local rate of divergence of nearby orbits is very different. This would suggest not to take values of $T$ too large.

• One of the typical unpleasant problems, in this context, is that the orbit can spend long periods of time, in a quasi-random way, close to invariant objects (invariant curves or tori, known to be sticky). Then the estimates of $\Lambda$ can seem to converge to zero for a while.

• The nearby orbits diverge in a regular way, but they have periodic or quasiperiodic oscillations added to (or multiplied by) the exponential growth. Even in simple systems (e.g., in the linear case; see [4]) this can slow down the method. One has to take care about the location of the orbit at the end of the time intervals used to estimate $\Lambda$. It is convenient to use some average or smoothing of the data, or to replace them by some 'envelope', which will converge to the correct value much faster. It is also more suitable for extrapolation; see [4] for details.

Recently an averaging method has been proposed [7]. It supplies an indicator of the regular or chaotic behaviour and produces estimates of $\Lambda$.

15.5.1 The method

As described in [7] we introduce an indicator of the mean exponential growth factor of nearby orbits (MEGNO). Let us consider a differential equation $x'(t) = f(t, x(t))$ and the associated variational flow $v'(t) = D_x f(t, x(t))\,v(t)$, starting at an initial point $x_0$ with a random vector $v_0$. Then we can define the auxiliary differential equations

\[ y' = t\,\frac{(v, v')}{(v, v)}, \qquad z' = \frac{2y}{1 + t}. \tag{15.4} \]

It is clear that for regular orbits, with a linear rate of separation of nearby orbits, the quotient $2y(t)/t$ converges to 2 when $t \to \infty$. On the other hand, for orbits (and initial vectors) having a maximal Lyapunov exponent $\Lambda$, it diverges as $\Lambda t$. The variable $z$ is smoothing these quotients. Hence, $w(t) := z(t)/t$ also tends to 2 in the regular case (except in the non-generic cases of super-linear separation) and behaves as $\Lambda t/2$ in the chaotic case.

Let $T$ be the final time used in a computation and let $t_j = jH$, $j = j_0, \ldots, m$, where $H = T/m$ is some moderate time step and $j_0$ accounts for a transient to be skipped. The sequence of values $\{w(t_j)\}$ can be fitted by a line. The slope produces an estimate of $\Lambda/2$. Either with this estimate or simply with the values of $w(T)$ at some final time, one can devise a method of detecting the character of a given orbit. Equations (15.4) must be integrated simultaneously with the field equations, in our case (15.3), and the corresponding first variationals.

15.5.2 Description of results

The method described in section 15.5.1 has been applied to (15.3) for a sample of $\eta \in [0.1, 0.3]$; the result is presented in figure 15.8 for $\varepsilon = 0.0625$ ($\eta = 0.25$). The algorithm has been applied to an equispaced grid of 2000 × 2000 pixels in the domain $(y, \omega) \in [0, 5] \times [0, 2]$. The initial values of the remaining variables have been given above. The value $T = 10^4$ has been used. Orbits such that $w(T) < 1.92$ have been selected as regular, while orbits with $w(T) > 2.08$ are considered chaotic. The intermediate range is considered as, possibly, mildly chaotic. The values have been selected to have a good contrast between both kinds of orbit. In figure 15.8 the pixels corresponding to initial conditions of regular behaviour have been plotted in light grey, and those of chaotic behaviour in black.

Let us describe the meaning of what we can observe in figure 15.8. A resonance can be identified as a light grey channel surrounded by black boundaries. If it is very thin perhaps only some trace of the light grey and/or black can be seen. Hundreds of resonances can be detected in figure 15.8. In the unperturbed case (i.e., if we delete only the terms in $\eta^3$ in (15.3)) $y = 0$ (resp. $0 < y < 2$, $y = 2$, $y > 2$) corresponds to the elliptic fixed point (resp. the libration orbits, the separatrix, the circulation orbits) of the pendulum part. Recall that the section has been taken on $x = \pi$. The 'external' frequencies are now $\omega$ and 1. The resonances occur for the pendulum periodic orbits (either of libration or circulation type) whose frequency $\nu(y, \eta)$ satisfies $k_1\nu + k_2\omega + k_3 = 0$, $k_1, k_2, k_3 \in \mathbb{Z}$. All of them have zero width and are located on horizontal lines (if $k_1 = 0$), on vertical lines (if $k_2 = 0$) and on a countable set of curves $\omega = (-1/k_2)(k_1\nu + k_3)$, otherwise. Recall that $\nu(y, \eta) \to 0$ as $y \to 2$, which explains the general pattern of the resonances.

To recover the perturbed problem of (15.3) we can replace the $\eta^3$ factor by $\eta\gamma$ and let $\gamma$ move from 0 to $\eta^2$. This is quite different from changing $\eta$ in (15.3), because in this last case the frequency $\nu(y, \eta)$ also changes. The 'centre line' of each resonance channel corresponds to 2D (normally) elliptic tori, while the 'boundary lines' correspond to 2D hyperbolic tori. When two resonances cross there is a periodic orbit at the 'centre' of the crossing. If the centre of the crossing is seen as a zone of regular motion the periodic orbit is of totally elliptic type. Generically the light grey regions correspond to 3D tori.
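The procedure of section 15.5.1 can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the code used for figure 15.8: the function names `megno` and `slope`, the fixed-step RK4 integrator and the renormalization of $v$ after every step are choices of this sketch (the renormalization is harmless because only the direction of $v$ enters the equation for $y$ in (15.4)).

```python
import numpy as np

def megno(f, jac, x0, v0, T, h):
    """Return times t_j and w(t_j) = z(t_j)/t_j for x' = f(t, x).

    f(t, x) is the field and jac(t, x) its Jacobian D_x f. The state
    (x, v, y, z) is advanced with fixed-step RK4 (an assumption of this sketch).
    """
    x0 = np.asarray(x0, dtype=float)
    v0 = np.asarray(v0, dtype=float)
    n = x0.size

    def rhs(t, s):
        x, v, y = s[:n], s[n:2 * n], s[2 * n]
        dv = jac(t, x) @ v                  # variational flow v' = D_x f v
        dy = t * (v @ dv) / (v @ v)         # y' = t (v, v') / (v, v)
        dz = 2.0 * y / (1.0 + t)            # z' = 2 y / (1 + t)
        return np.concatenate([f(t, x), dv, [dy, dz]])

    s = np.concatenate([x0, v0 / np.linalg.norm(v0), [0.0, 0.0]])
    ts, ws = [], []
    for j in range(int(round(T / h))):
        t = j * h
        k1 = rhs(t, s)
        k2 = rhs(t + h / 2, s + h / 2 * k1)
        k3 = rhs(t + h / 2, s + h / 2 * k2)
        k4 = rhs(t + h, s + h * k3)
        s = s + h / 6 * (k1 + 2 * k2 + 2 * k3 + k4)
        # Only the direction of v enters y', so v can be renormalized freely.
        s[n:2 * n] /= np.linalg.norm(s[n:2 * n])
        ts.append(t + h)
        ws.append(s[2 * n + 1] / (t + h))
    return np.array(ts), np.array(ws)

def slope(ts, ws, t_min):
    """Least-squares slope of w(t_j) for t_j >= t_min; an estimate of Lambda/2."""
    keep = ts >= t_min
    return np.polyfit(ts[keep], ws[keep], 1)[0]
```

Two linear toy fields illustrate the two regimes. For the shear $q' = p$, $p' = 0$ nearby orbits separate linearly and $w(T)$ approaches 2 from below; for the saddle $q' = p$, $p' = q$ one has $\Lambda = 1$ and the fitted slope of $w$ is close to 1/2. For system (15.3) one would supply its field and Jacobian instead.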

Figure 15.8. Regions of regular motion (light grey) and chaotic motion (black) plotted in the $(y, \omega)$-plane, where $y$ (horizontal axis, from 0 to 5) is the variable of the pendulum and $\omega$ (vertical axis, from 0 to 2) the initial frequency of the angle $\theta_2$.

Note that the chaotic regions seem to be connected. This gives the possibility to have diffusion, but figure 15.8 tells us nothing about in which direction the instability takes place and with which probability. A detailed geometric study is required to check the heteroclinic connections. Alternative efficient indicators of the quantitative details of the diffusion are also desirable. Both aspects are research in progress.

15.6 Conclusion

Several tools have been presented to give detailed indications of the global dynamics of different systems. Many other tools are available, like the well known frequency analysis. The combination of different tools can give (relatively) fast and reliable methods to estimate the main features of the dynamics and their significance. This paper has demonstrated the efficiency of such fast indicators.


Appendix. Taylor series methods of integration

Consider a dynamical system given by a differential equation

\[ x' = f(t, x). \tag{15.5} \]

We look for the representation of the solution of (15.5) around some given initial conditions $(t_n, x_n)$ as $x(t_n + h) = \sum_{k \ge 0} x^{(k)} h^k$. This expansion will be truncated at a suitable order, to be discussed later. We can assume, with full generality, that the components of $f$ can be expressed as the composition of arithmetic operations (+, −, ∗, /) and elementary functions (basically log, exp, sin, cos). If the representations of two functions $q(t)$, $p(t)$ are known to order $k$, it is immediate to write the representation of $p \bullet q$, where $\bullet$ denotes any of +, −, ∗, /. To obtain the new truncated series to order $N$ the cost is $O(N^2)$. The same applies to obtaining $p = \log q$ from the relation $p'q = q'$, $p = \exp q$ from $p' = pq'$, and $p = \cos r$, $q = \sin r$ from $p' = -qr'$, $q' = pr'$. Again the cost is $O(N^2)$. Note that we should work with numerical coefficients. Routines to produce the jet to order $N$, if a code producing the evaluation of $f(t, x)$ from $t$ and $x$ is available, are easy to write. For one such implementation we refer to [10].

A key point is the selection of the suitable order $N$ and the variable time step $h$. To this end we must assume analyticity and some 'good' behaviour of the Taylor series.

Proposition 15.1. Let $\rho = \rho(t)$ be the radius of convergence around the point $(t_n, x_n)$. Assume the coefficients $x^{(k)}$, in the Taylor representation of the solution, satisfy $A_1\rho^{-k} < |x^{(k)}| < A_2\rho^{-k}$ for some $0 < A_1 < A_2$. Then, to obtain a predetermined relative error in each step, the most efficient step size (in the computational sense) tends to $h = \rho/\exp(2)$ when the error tends to zero.

Proof. Assume we want a relative error of at most $\varepsilon$ in each step. (This can be the accuracy $\varepsilon$ of the computer or, eventually, the adopted multi-precision value.) Let $\hat h = h/\rho$. Up to a constant factor it is enough to take $\hat h^N = \varepsilon$. For fixed $\varepsilon$ this gives a constraint for $h$ and $N$. On the other hand, the cost per unit of $t$ is $C \approx h^{-1}BN^2$ ($B$ being a constant), where we use the $O(N^2)$ character of the method. Since the constraint gives $N = \log\varepsilon/\log\hat h$, the cost is proportional to $1/\bigl(\hat h(\log\hat h)^2\bigr)$. Minimizing $C$ under the constraint, if $\varepsilon$ is small enough, therefore leads to maximizing $\hat h(\log\hat h)^2$ on $0 < \hat h < 1$, which occurs at $\log\hat h = -2$. From this the result follows. Note that the effects of the quotient $A_2/A_1$ and of the terms of order $> N + 1$ can be bounded by a constant, independently of $\varepsilon$, and they become negligible if $\varepsilon \to 0$.

Remark 15.2. In cases where the assumption is obviously false (e.g. the series contains only odd terms) one can modify proposition 15.1 in a suitable way with the same result. The value of $\rho(t)$ can be estimated from the behaviour of the coefficients $x^{(k)}$.


The method is especially interesting if $\varepsilon$ is small (e.g., integration in multiprecision, say $\varepsilon = 10^{-34}$). In principle the only errors it produces are the unavoidable rounding errors. However, it can lose efficiency in two cases: (i) the evaluation of $f$ requires a large number of operations, (ii) along the orbit one passes close to some singularity (i.e., $\rho$ becomes very small). This singularity can be located at some complex $t$. From proposition 15.1 it follows that the value of $N$ is, essentially, $-\tfrac12\log\varepsilon$. For a typical value of $\varepsilon = 10^{-16}$ an order around $N = 18$ is suitable. Note that as this $\varepsilon$ may not be small enough, some slightly different orders can have a better performance.
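To make the recurrences concrete, here is a small Python sketch of a Taylor integrator for the pendulum $x' = y$, $y' = -\sin x$, chosen only as a simple test field. The names `pendulum_jet`, `horner` and `taylor_integrate` are hypothetical, and the crude estimate of $\rho$ from the last few coefficients is an assumption of this sketch; it is not the implementation of [10]. The sine and cosine series are obtained from $s' = c\,x'$, $c' = -s\,x'$, at $O(N^2)$ cost, and the step is taken as $h = \rho/\exp(2)$ with $N = 18$.

```python
import math

def pendulum_jet(x0, y0, N):
    """Taylor coefficients x[k], y[k], k = 0..N, of x' = y, y' = -sin x."""
    x = [0.0] * (N + 1)
    y = [0.0] * (N + 1)
    s = [0.0] * (N + 1)   # coefficients of sin x(t)
    c = [0.0] * (N + 1)   # coefficients of cos x(t)
    x[0], y[0] = x0, y0
    s[0], c[0] = math.sin(x0), math.cos(x0)
    for k in range(N):
        m = k + 1
        x[m] = y[k] / m            # from x' = y
        y[m] = -s[k] / m           # from y' = -sin x
        # s' = c x' and c' = -s x', coefficient by coefficient
        s[m] = sum(j * x[j] * c[m - j] for j in range(1, m + 1)) / m
        c[m] = -sum(j * x[j] * s[m - j] for j in range(1, m + 1)) / m
    return x, y

def horner(coeffs, h):
    """Evaluate sum_k coeffs[k] h^k."""
    r = 0.0
    for a in reversed(coeffs):
        r = r * h + a
    return r

def taylor_integrate(x0, y0, T, N=18):
    """Advance (x, y) from t = 0 to t = T with variable steps h = rho / e^2."""
    t, x, y = 0.0, x0, y0
    while t < T:
        cx, cy = pendulum_jet(x, y, N)
        # Crude radius estimate from the last four coefficients (assumption).
        est = []
        for k in range(N - 3, N + 1):
            a = max(abs(cx[k]), abs(cy[k]))
            if a > 0.0:
                est.append(a ** (-1.0 / k))
        rho = min(est) if est else 1.0
        h = min(rho * math.exp(-2.0), T - t)
        x, y = horner(cx, h), horner(cy, h)
        t += h
    return x, y
```

Taking the maximum of $|x^{(k)}|$ and $|y^{(k)}|$ over several consecutive $k$ is one simple way to cope with series in which only odd or only even terms are present, as happens here when starting at a turning point. A convenient sanity check is the conservation of $y^2/2 - \cos x$ along the computed orbit.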

Acknowledgements

I would like to express my gratitude to H Broer, P Cincotta, A Jorba, M van Noort, T Stuchi, D Treschev and C Valls, who have greatly contributed to this paper by means of discussions, suggestions and/or questions. My thanks to B Krauskopf for his very efficient help in the presentation of this paper in final form. This work has been supported by grants BFM2000-805 (Spain), 2000SGR-27 (Catalonia) and INTAS 97-771. The computing facilities of the UB-UPC Group of Dynamical Systems have been used.

References

[1] Arnol'd V I 1964 Instability of dynamical systems with several degrees of freedom Sov. Math. Dokl. 5 581–5
[2] Broer H W, Hoveijn I, van Noort M, Simó C and Vegter G 2001 Global coherent dynamics of the parametrically forced pendulum: a case study in 1½ degrees of freedom Preprint
[3] Broer H, Roussarie R and Simó C 1996 Invariant circles in the Bogdanov–Takens bifurcation for diffeomorphisms Ergod. Theor. Dynam. Syst. 16 1147–72
[4] Broer H and Simó C 1998 Hill's equation with quasi-periodic forcing: resonance tongues, instability pockets and global phenomena Bol. Soc. Bras. Mat. 29 253–93
[5] Broer H, Simó C and Tatjer J C 1998 Towards global models near homoclinic tangencies of dissipative diffeomorphisms Nonlinearity 11 667–770
[6] Cincotta P and Simó C 1999 Conditional entropy: a tool to explore the phase space Celest. Mech. Dynam. Astron. 73 195–209
[7] Cincotta P and Simó C 2000 Simple tools to study global dynamics in nonaxisymmetric galactic potentials - I Astron. Astrophys. Suppl. 147 205–28
[8] Hénon M and Heiles C 1964 The applicability of the third integral of motion: some numerical experiments Astron. J. 69 73–9
[9] Ivanov A 2000 Study of the double mathematical pendulum - III. Melnikov's method applied to the system in the limit of small ratio of pendulums masses Reg. Chaotic Dynam. 5 329–44
[10] Jorba À and Zou M 2001 On the numerical integration of ODE by means of high-order Taylor methods Preprint
[11] van Noort M 2001 The parametrically forced pendulum. A case study in 1½ degrees of freedom PhD Thesis University of Groningen
[12] Simó C 2001 Dynamical properties of the figure eight solution of the three-body problem Preprint
[13] Simó C, Broer H and Roussarie R 1991 A numerical exploration of the Takens–Bogdanov bifurcation for diffeomorphisms Proc. Eur. Conf. on Iteration Theory (Batschuns) ed C Mira, N Netzer, C Simó and G Targonski (Singapore: World Scientific) pp 320–34
[14] Simó C and Stuchi T 2000 Central stable/unstable manifolds and the destruction of KAM tori in the planar Hill problem Physica D 140 1–32
[15] Simó C and Treschev D 2000 Evolution of the 'last' invariant curve in a family of area preserving maps: the case of the separatrix map Preprint
[16] Simó C and Valls C 2000 A formal approximation of the splitting of separatrices in the classical Arnold's example of diffusion with two equal parameters Preprint
[17] Valls C 1999 The classical Arnold example of diffusion with two equal parameters PhD Thesis Universitat Barcelona


Chapter 16

A general nonparametric bootstrap test for Granger causality

Cees Diks
University of Amsterdam

Jacob DeGoede
University of Leiden

We introduce an information theoretic test statistic for Granger causality, which can be estimated by means of correlation integrals. The significance of the test statistic is determined using bootstrap methods rather than asymptotic distribution theory. Several bootstrap strategies are suggested and compared by Monte Carlo simulations. All these bootstrap methods appear to work well, but only bootstrapping the hypothesized noncausing time series has the additional advantage that a simplified test statistic can be used.

16.1 Granger causality

For the case of two scalar-valued time series, $\{X_t\}$ and $\{Y_t\}$, intuitively, $\{Y_t\}$ is a Granger cause of $\{X_t\}$ if past and present values of $Y$ contain information about the distribution of future values of $X$, not contained in past and present observations of $X$. This causality concept, which will be defined more formally later, is useful in empirical research on causal relationships among observed time series.

Tests for Granger causality originate from econometrics and have recently been applied in fields ranging from neurology [4] to epidemiology [10]. In econometrics the challenge consists of detecting and characterising dependence within a sea of noise. Koch and Koch [7] examined contemporaneous and lead–lag relationships across national equity markets, and found that the interdependence across national markets has increased over time. They also found that most dependence occurs within 24 hours, and that Japan's market influence has increased to a size comparable to that of the USA. Tan and Cheng [16] reported the presence of causal links between money and output. Since Granger causality goes beyond dependence and focuses on causal relationships, such results are of vital importance in establishing sensible control strategies in the form of government policies, supported by empirical evidence that spurious correlations can be ruled out.

The traditional approach to testing for Granger causality consists of comparing prediction errors of an autoregressive model of $X$ with the prediction errors obtained by a model which regresses $X$ on past and current values of both $X$ and $Y$. This approach is appealing, since the test reduces to determining the significance of the coefficients of the terms in the regression that depend on past and current values of $Y$. There are, however, two disadvantages. First, parametric tests require modelling assumptions such as linearity of the regression structure and, second, tests based on prediction errors will be sensitive only to causality in the mean. Higher-order structure, such as heteroskedasticity, will be ignored.

Asimakopoulos et al [1] examined nonlinear Granger causality in currency futures returns, and found several uni-directional causal relationships. A study by Longin and Solnik [8] confirmed the often reported empirical fact that markets are linked more strongly in periods of higher volatility. This is an indication of nonlinearity and stresses the importance of general tests for Granger causality which are sensitive also to nonlinear causal relationships. The nonlinear aspect was modelled explicitly in a parametric test for causality-in-variance proposed by Cheung and Ng [3]. In this contribution we will focus on nonparametric tests, which eliminate possible problems resulting from model misspecification.
Before describing general nonparametric tests for Granger causality, it will be useful to give a more formal definition of the concept of causality. For two strictly stationary time series $\{X_t\}$ and $\{Y_t\}$, satisfying some weak mixing conditions, we define Granger causality as follows.

Definition 16.1. $\{Y_t\}$ is a nonlinear Granger cause of $\{X_t\}$ (denoted $Y \to X$) if

\[ F_{X_{t+1}}(x \mid \mathcal{F}_X(t), \mathcal{F}_Y(t)) \neq F_{X_{t+1}}(x \mid \mathcal{F}_X(t)), \]

where $F_{X_{t+1}}(x \mid \mathcal{F})$ denotes the cumulative distribution function of $X_{t+1}$ given $\mathcal{F}$, and $\mathcal{F}_X(t)$ and $\mathcal{F}_Y(t)$ denote the information sets consisting of past observations of $X$ and $Y$ up to and including time $t$.

Note that this definition relies on comparing the one-step-ahead distribution of $X$ with and without past and current observed values of $Y$. Generalizations of this definition of Granger causality which consider future joint distributions of $X$ can be formulated in a similar way. Also note that our definition should be considered an operational definition of causality, since the existence of an unobserved variable, $Z$ say, which is causing both $X$ and $Y$, leading to $Y$ being a Granger cause of $X$, can never be excluded empirically. However, the great advantage of the concept of Granger causality is that it is empirically testable.

We propose an information theoretic test for Granger causality for stationary weakly dependent time series, based on conditional entropies. These entropies can be expressed in terms of correlation integrals, the nonparametric estimation of which is straightforward. Correlation integrals originate from the study of chaotic systems, where they are important means of characterizing the dynamics of deterministic processes. The contributions of Floris Takens to this field are well known [15]. The test proposed by Hiemstra and Jones [5], who reported bi-directional Granger causality (interaction) between changes in trading volume on the New York Stock Exchange and returns of the Dow Jones Industrial Average Index, is also based on correlation integrals, and closely resembles the test proposed here. However, our test statistic is motivated by information theoretic arguments and we use bootstrap methods rather than asymptotic theory for determining the significance of the test statistic.

An important new insight is recognition of the connection between correlation integrals and information theory, see e.g. Prichard and Theiler [13]. Correlation integral based information theoretic quantities require only weak assumptions on the underlying processes, and yet turn out to be powerful tools for characterizing causal relationships and quantifying information flows. A great advantage is that applications of these methods are no longer restricted to deterministic time series but are suitable for arbitrary stationary, weakly mixing processes. A famous and widely used example of a correlation integral based test for serial independence was put forward by Brock et al [2].
The theory of bootstraps for dependent processes has also advanced considerably in recent years, and with the computational power that is now cheaply available these bootstraps have become practically feasible even for large data sets. The combined use of correlation integral based information theoretical statistics together with recently developed time series bootstrap methods promises to provide powerful and statistically sound means for studying dynamical relationships among time series.

16.2 Information theoretic test statistic

Given two time series $\{X_t\}$ and $\{Y_t\}$ we wish to test the null hypothesis

\[ H_0: \quad \{Y_t\} \text{ is not Granger causing } \{X_t\}. \]

According to the established tradition in statistics we should be speaking of testing the null hypothesis, rather than its negation. This implies that we are considering tests for Granger noncausality rather than tests for Granger causality. However, for simplicity we choose to continue this slight abuse of language.

Using information theoretic quantities, we take an approach that closely follows the definition of Granger causality in the previous section. First, from the time series $\{X_t\}$ and $\{Y_t\}$ delay vectors

\[ \mathbf{X}_t = (X_{t-m+1}, \ldots, X_t), \qquad \mathbf{Y}_t = (Y_{t-l+1}, \ldots, Y_t) \tag{16.1} \]

are constructed of embedding dimension $m$ and $l$, respectively. The idea is to quantify the average amount of extra information on $X_{t+1}$ contained in the delay vector $\mathbf{Y}_t$, given that we already know $\mathbf{X}_t$. Generally speaking, the average amount of information a random variable $X$ contains about a random variable $Y$ can be expressed as the generalized [13] mutual information of $X$ and $Y$, which, in terms of correlation integrals, reads

\[ I_q(X; Y) = \ln C_q(X, Y) - \ln C_q(X) - \ln C_q(Y) \tag{16.2} \]

where

\[ C_q(X, \epsilon) = \left[ \int \left( \int I(\|x - y\| \le \epsilon)\, \mathrm{d}\mu_X(x) \right)^{q-1} \mathrm{d}\mu_X(y) \right]^{\frac{1}{q-1}}. \tag{16.3} \]

Here $I(\cdot)$ denotes the indicator function which is equal to one if its argument is true, and is zero otherwise, and $\|\cdot\|$ denotes the supremum norm

\[ \|x\| = \sup_{i = 1, \ldots, \dim x} |x_i|. \tag{16.4} \]

For $q = 2$, $C_2(X, \epsilon)$ is nothing but the fraction of distances between two independently chosen points, according to $\mu_X$, that is smaller than or equal to $\epsilon$. Thus the second-order ($q = 2$) correlation integral is equal to the probability that a distance between two independent realizations of $X$ is smaller than or equal to $\epsilon$. For computational convenience, we use $q = 2$ in our calculations, and for simplicity the index $q$ as well as the scale parameter $\epsilon$ are omitted in the notation of the correlation integrals in the sequel.

When $X$ and $Y$ are independent, the joint correlation integral factorizes, $C(X, Y) = C(X)C(Y)$, and $I(X; Y) = 0$. In the extreme case where $X$ and $Y$ are identical, one has $\ln C(X, Y) = \ln C(X) = \ln C(Y)$, so that $I(X; Y) = -\ln C(X)$, which is non-negative because $C(X) \le 1$.

In a time series setting the average information about $X_{t+1}$ contained in $\mathbf{X}_t$ and $\mathbf{Y}_t$ jointly is given by

\[ I(\mathbf{X}_t, \mathbf{Y}_t; X_{t+1}) = \ln C(\mathbf{X}_t, \mathbf{Y}_t, X_{t+1}) - \ln C(\mathbf{X}_t, \mathbf{Y}_t) - \ln C(X_{t+1}) \tag{16.5} \]

while the average information about $X_{t+1}$ in $\mathbf{X}_t$ only is given by

\[ I(\mathbf{X}_t; X_{t+1}) = \ln C(\mathbf{X}_t, X_{t+1}) - \ln C(\mathbf{X}_t) - \ln C(X_{t+1}). \tag{16.6} \]

By subtracting these two information measures, we can quantify the average amount of extra information that $\mathbf{Y}_t$ contains about $X_{t+1}$ in addition to the information already in $\mathbf{X}_t$.


If past observations of $Y$ contain no extra information about future values of $X$, one has $I(\mathbf{X}_t, \mathbf{Y}_t; X_{t+1}) = I(\mathbf{X}_t; X_{t+1})$. If, on the other hand, past observations of $Y$ do contain information on current and future values of $X$, we expect $I(\mathbf{X}_t, \mathbf{Y}_t; X_{t+1}) > I(\mathbf{X}_t; X_{t+1})$. As our test statistic we use a correlation integral based estimator of

\[ Q = I(\mathbf{X}_t, \mathbf{Y}_t; X_{t+1}) - I(\mathbf{X}_t; X_{t+1}) = \ln C(\mathbf{X}_t, \mathbf{Y}_t, X_{t+1}) - \ln C(\mathbf{X}_t, \mathbf{Y}_t) - \ln C(\mathbf{X}_t, X_{t+1}) + \ln C(\mathbf{X}_t) \tag{16.7} \]

which gives

\[ \hat Q = \ln \hat C(\mathbf{X}_t, \mathbf{Y}_t, X_{t+1}) - \ln \hat C(\mathbf{X}_t, \mathbf{Y}_t) - \ln \hat C(\mathbf{X}_t, X_{t+1}) + \ln \hat C(\mathbf{X}_t) \tag{16.8} \]

where $\hat C(X)$ represents the estimated correlation integral of $X$. Rather than using asymptotic theory, bootstrap methods will be used to determine whether $\hat Q$ is significantly larger than zero. Since we expect the test statistic to be equal to zero under the null hypothesis and positive under alternatives, a one-sided test is called for. The null hypothesis is rejected only if $\hat Q$ is significantly larger than zero.

However, there are some subtleties involved here, and exceptions can be constructed for which the test statistic decreases in the presence of Granger causality. This is related to the fact that the correlation integral is not always larger for time series with more structure, which was pointed out to us by Floris Takens (also see [15]). Anomalies like these can be traced back to the use of the order-two correlation integral $C_2$ rather than the correlation integral $C_1$ of order one. Information theoretic quantities defined in terms of $C_1$ rather than $C_2$ have a number of nice properties which those based on $C_2$ are lacking. Some authors resolve this point by implementing tests based on $C_1$, which, however, is much more difficult to estimate. Others solve the problem by analysing ranks rather than the raw data [12]. In practice the differences between $C_1$ and $C_2$ are often small. We choose to use $C_2$ for efficiency reasons, at the expense of a possible loss of statistical power.

Notice that Hiemstra and Jones [5] test the relationship

\[ \frac{C(\mathbf{X}_t, \mathbf{Y}_t, X_{t+1})}{C(\mathbf{X}_t, \mathbf{Y}_t)} = \frac{C(\mathbf{X}_t, X_{t+1})}{C(\mathbf{X}_t)} \tag{16.9} \]

by calculating

\[ \hat T = \frac{\hat C(\mathbf{X}_t, \mathbf{Y}_t, X_{t+1})}{\hat C(\mathbf{X}_t, \mathbf{Y}_t)} - \frac{\hat C(\mathbf{X}_t, X_{t+1})}{\hat C(\mathbf{X}_t)} \tag{16.10} \]

from the data, rejecting the null hypothesis whenever $\hat T$ is too large. Upon taking logarithms on both sides of (16.9), it can be shown that they test exactly the same equality as we do. However, their test statistic is different, and cannot be mapped to ours in a one-to-one fashion. Furthermore, they use asymptotic distribution theory rather than a bootstrap test, so that the size and power of their test need not be the same as ours.
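The estimator (16.8) is easy to sketch numerically. The following Python fragment is an illustration only: the names `corr_integral` and `q_stat` are hypothetical, the correlation integral is estimated as the fraction of distinct pairs of delay vectors within sup-norm distance $\epsilon$ (a common convention, assumed here), and each series is rescaled to unit variance before use.

```python
import numpy as np

def corr_integral(Z, eps=1.0):
    """Fraction of pairs i < j of rows of Z with sup-norm distance <= eps."""
    Z = np.asarray(Z, dtype=float)
    d = np.abs(Z[:, None, :] - Z[None, :, :]).max(axis=2)
    iu = np.triu_indices(len(Z), k=1)
    return np.mean(d[iu] <= eps)

def q_stat(x, y, m=2, l=1, eps=1.0):
    """Estimate of Q in (16.8) for testing Y -> X."""
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    x = x / x.std()                      # rescale to unit variance
    y = y / y.std()
    start = max(m, l) - 1                # first t with complete delay vectors
    t = np.arange(start, len(x) - 1)
    X = np.column_stack([x[t - k] for k in range(m - 1, -1, -1)])
    Y = np.column_stack([y[t - k] for k in range(l - 1, -1, -1)])
    Xn = x[t + 1][:, None]               # X_{t+1}

    def C(*parts):
        return corr_integral(np.hstack(parts), eps)

    return np.log(C(X, Y, Xn)) - np.log(C(X, Y)) - np.log(C(X, Xn)) + np.log(C(X))
```

The all-pairs distance matrix costs $O(N^2)$ memory, which is harmless for short series such as $N = 100$ but would need a smarter neighbour search for long ones. If $Y$ merely duplicates a component already present in the delay vector of $X$, the four correlation integrals cancel in pairs and the statistic vanishes.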


16.3 Bootstrap procedures

The stationary bootstrap proposed by Politis and Romano [11] is simple to apply to univariate time series. We are dealing with the bivariate case, and there are several ways of bootstrapping the two time series. Before describing the various possibilities, a brief description of the stationary bootstrap will be given.

The stationary bootstrap replicates the time series by concatenating blocks of observations from the original time series. The blocks are selected randomly from the original time series and have a random length with a geometric distribution. To ensure stationarity of the bootstrap time series, whenever a block exceeds the end of the time series, one continues by adding observations starting from the beginning of the time series. The following implementation is used for constructing a bootstrap replication $\{X_t^*\}_{t=1}^N$ of $\{X_t\}_{t=1}^N$. An index, $i_1$, is selected randomly according to the uniform distribution on $1, \ldots, N$. The first observation of the bootstrap time series $X_1^*$ is taken to be $X_{i_1}$. Then, with probability $1 - P$, with $P$ small, one chooses $i_2 = i_1 + 1$, and with probability $P$, $i_2$ is selected randomly again from the uniform distribution on $1, \ldots, N$. The next value in the bootstrap time series is then taken to be $X_{i_2}$. In this way, one continues until a time series of length $N$ is obtained. Whenever $i_k$ becomes equal to $N + 1$, $i_k$ is set to 1, that is, to the index pointing to the first observation of the original time series.

Let us now return to the various possibilities of bootstrapping a pair of time series in the context of testing for Granger causality. We would like to test the null hypothesis that $Y$ is not Granger causing $X$. As a first attempt we could bootstrap $\{X_t\}$ and $\{Y_t\}$ independently, to obtain a distribution under the null hypothesis (the absence of Granger causality). This bootstrap method will be referred to as the XY bootstrap. Note that this destroys any dependence between $X$ and $Y$ rather than only Granger causality, if present. This may influence the distribution of the test statistic and, hence, under the null of no Granger causality may introduce deviations of the rejection probability from the nominal size. In the simulation studies presented later we will consider this point.

A slight modification is to bootstrap only the 'causing' time series $\{Y_t\}$. This bootstrap, referred to as the Y bootstrap, also ensures the absence of information of $Y$ on future values of $X$ and, hence, should also work. Since the variability of the test statistic is expected to decrease using the Y bootstrap, the power of the test may increase for Y. Also it should be examined whether the Y bootstrap method gives rise to a size which differs from the nominal size in the presence of dependence, since this dependence is also lost under the bootstrap procedure.

To preserve dependence between $X$ and $Y$ one can bootstrap $X$ and $Y$ contemporaneously. That is, whenever an index $i_k$ is selected the bootstrap values for $X_k^*$ and $Y_k^*$ are selected with the same index, $X_k^* = X_{i_k}$, and $Y_k^* = Y_{i_k}$. This we call the $(X, Y)$ bootstrap, where the brackets now indicate that the two time series are considered as one bi-variate time series. Using this approach we no longer perform bootstraps under the null hypothesis, because Granger causality in the original time series will be preserved in the bootstrap one. This actually amounts to performing a standard bootstrap procedure, in which the bootstrap distribution of the test statistics can be expected to be centred (on average) around the value of the test statistic of the original time series. Before determining the p-values from the bootstrap distribution one should first centre the bootstrap distribution of the test statistic around zero, which is the expected value of the test statistic under the null hypothesis.

In the last method considered, referred to as the $(\mathbf{X}, \mathbf{Y})$ bootstrap, delay vectors of embedding dimension $m + 1$ are bootstrapped contemporaneously instead of individual observations.

Summarizing, we examine the following bootstrap procedures:

• bootstrapping both time series independently, XY
• bootstrapping only the 'causing' time series, Y
• bootstrapping contemporaneously, $(X, Y)$
• bootstrapping delay vectors, $(\mathbf{X}, \mathbf{Y})$.

Strictly speaking, only the last two methods are bootstraps, and the first two methods are randomization procedures aimed at ‘bootstrapping under the null’. There is some analogy with randomization tests for serial independence. Under the assumption that a time series consists of independent, identically distributed observations, one can randomize the data by permuting them randomly. Since under the assumption of independence permutations are equally likely, this randomization procedure is exact. The randomization procedures proposed here are not exact, and before they are used in practice, at least some numerical evidence is required to warrant their application. In the next section some preliminary results are presented.
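The index construction of the stationary bootstrap and the first three resampling schemes can be sketched as follows. This is a minimal illustration in Python with zero-based indices; the function names and the use of a NumPy random generator are assumptions of the sketch, and the delay-vector variant is omitted.

```python
import numpy as np

def stationary_bootstrap_indices(N, P, rng):
    """Indices i_1, ..., i_N of the stationary bootstrap (zero-based)."""
    idx = np.empty(N, dtype=int)
    idx[0] = rng.integers(N)               # uniform starting index
    for k in range(1, N):
        if rng.random() < P:
            idx[k] = rng.integers(N)       # with probability P start a new block
        else:
            idx[k] = (idx[k - 1] + 1) % N  # otherwise continue, wrapping around
    return idx

def xy_bootstrap(x, y, P, rng):
    """XY bootstrap: resample both series independently."""
    return (x[stationary_bootstrap_indices(len(x), P, rng)],
            y[stationary_bootstrap_indices(len(y), P, rng)])

def y_bootstrap(x, y, P, rng):
    """Y bootstrap: resample only the hypothesized non-causing series."""
    return x, y[stationary_bootstrap_indices(len(y), P, rng)]

def joint_bootstrap(x, y, P, rng):
    """(X, Y) bootstrap: resample both series with the same indices."""
    idx = stationary_bootstrap_indices(len(x), P, rng)
    return x[idx], y[idx]
```

With switching probability $P$ the block lengths are geometric with mean $1/P$, so $P = 0.05$ corresponds to blocks of about 20 observations on average.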

16.4 Monte Carlo simulations In this section the size and power of the bootstrap test are determined numerically by Monte Carlo simulation for various bivariate time series models. This is important since it became clear in the previous section that none of the bootstrap procedures can be expected a priori to have the desired size properties. The test statistic determined from the original pair of time series {X t } and 1 . The values of the B − 1 bootstrap replications {X t∗ } and {Yt } is denoted by Q ∗ 2 , . . . , Q B . The p-value is {Yt } of the pair of time series are referred to as Q determined as B i ≥ Q 1 ) I (Q p = i=1 . (16.11) B The numbers presented in the tables are the fractions of rejections at a nominal size α = 0.05 for 1000 independent realizations, where we used B = 20. That is, the bootstrap test is applied 1000 times to independently generated realizations of the pair of time series, and the rejection rate (the relative number of times


Table 16.1. Rejection rates (size) in absence of Granger causality, X_t ∼ N(0, 1) and Y_t ∼ N(0, 1), independently; l > 0 corresponds to tests for Y → X, and l < 0 to X → Y.

  l     XY      Y       (X, Y)   (X, Y) delay vectors   HJ
 −2    0.043   0.049   0.034    0.021                  0.025
 −1    0.067   0.048   0.024    0.028                  0.018
  1    0.050   0.054   0.026    0.026                  0.011
  2    0.052   0.049   0.030    0.016                  0.019

that p ≤ 0.05) is quoted. The size is the relative number of rejections for processes that satisfy the null hypothesis. Ideally, this ‘actual’ size should be close to the nominal size. If the actual size is smaller than the nominal size the test is called conservative. If the actual size is larger than the nominal size, the rejection probability is larger than the nominal size for processes that satisfy the null hypothesis, that is, the type I error is larger than the nominal size. This certainly is undesirable for a statistical test. The rejection rate under alternatives, i.e. processes not satisfying the null hypothesis, is called the power of the test. Provided that the size does not exceed the nominal size, the larger the power of the test, the better.

The aim of this Monte Carlo study is twofold. We examine the size not only in cases where {X_t} and {Y_t} are independent, but also in cases where they are dependent, with and without Granger causality. We also want to examine the power of the test in the presence of Granger causality. Throughout we will compare the size and the power of the test with the size and power obtained with the test of Hiemstra and Jones [5]. Our ultimate goal is to estimate the effect of dependence and to examine whether we can use the ‘randomization’ approaches, in which dependence is ignored.

We use the following parameter values. In testing Y → X the embedding dimension for the X time series is set to m = 2, while the embedding dimension for Y can take the values l = 1, 2. In testing X → Y the same parameter values are used but the roles of X and Y are reversed. The scale parameter is taken to be ε = 1 (after rescaling each time series to unit variance). The time series length for the Monte Carlo simulations is N = 100. The switching probability used in the stationary bootstrap is set to P = 0.05. We used B = 20, which amounts to B − 1 = 19 bootstrap replications.
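The p-value (16.11) and the rejection rate quoted in the tables can be sketched as below, assuming the test statistics of the original pair and of its bootstrap replications have already been computed. The function names are illustrative and not from the chapter.

```python
def bootstrap_p_value(q_original, q_bootstrap):
    """p-value as in (16.11): the original statistic counts as one of the B values."""
    stats = [q_original] + list(q_bootstrap)
    exceed = sum(1 for q in stats if q >= q_original)
    return exceed / len(stats)


def rejects(q_original, q_bootstrap, alpha=0.05):
    """Reject the null hypothesis when p <= alpha."""
    return bootstrap_p_value(q_original, q_bootstrap) <= alpha


def rejection_rate(p_values, alpha=0.05):
    """Fraction of independent realizations whose p-value is at most alpha."""
    return sum(1 for p in p_values if p <= alpha) / len(p_values)
```

With B = 20 the smallest attainable p-value is 1/20 = 0.05, so at α = 0.05 the test rejects only when the original statistic strictly exceeds all 19 bootstrap values.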
16.4.1 Size

In this subsection we determine the size for two bivariate processes without Granger causality. Cases in which the time series are independent, as well as dependent, are examined. First we study an example in which {X_t} and {Y_t} are independent. We


Table 16.2. Rejection rates (size) in absence of Granger causality, but in the presence of dependence, (X_t, Y_t) ∼ BVN(0, 0, 1, 1, 1/2).

  l     XY      Y       (X, Y)   (X, Y) delay vectors   HJ
 −2    0.029   0.028   0.022    0.027                  0.023
 −1    0.030   0.038   0.023    0.024                  0.017
  1    0.028   0.041   0.020    0.019                  0.015
  2    0.047   0.034   0.031    0.024                  0.027

take both processes to consist of independent, normally distributed values, that is, X_t ∼ N(0, 1), and Y_t ∼ N(0, 1). In this case there is no Granger causality and no dependence, and all bootstrap methods should work, at least asymptotically. Table 16.1 gives the sizes obtained for this example. In all tables, positive values of the lag l correspond to tests for Y → X, and negative values of l correspond to tests for X → Y. The different columns correspond to the various methods examined. The last column, HJ, denotes the Hiemstra and Jones test. The size is close to the nominal size α = 0.05 for the first two bootstrap methods, which both ignore dependence. Note, however, that the test appears to be somewhat conservative for the two bootstraps which are constructed to preserve dependence. This holds true also for the Hiemstra and Jones test.

Next we consider an example in which there is instantaneous dependence between the two time series, but no Granger causality. We take (X_t, Y_t) to be bivariate normally distributed, with a correlation coefficient of 1/2, denoted by BVN(0, 0, 1, 1, 1/2), a bivariate normal distribution for which the two components both have mean zero and unit variance while the correlation between the components is 1/2. The resulting rejection rates, given in table 16.2, suggest that all tests are slightly conservative.

16.4.2 Size and power

As a first example with Granger causality, a case is considered with uni-directional Granger causality. We examine the previous process again, but now with the Y time series shifted in time by one time unit, so that it is running ahead. In this way, a situation is obtained in which Y Granger-causes X. The process satisfies (X_t, Y_{t−1}) ∼ BVN(0, 0, 1, 1, 1/2). The rejection rates for Y → X now amount to the power of the test, whereas the rejection rates for X → Y (l < 0) must be interpreted as sizes.
The results shown in table 16.3 suggest that again the actual size is smaller than the nominal size, and that the bootstraps preserving dependence, as well as the Hiemstra and Jones test, are slightly more conservative than the tests which ignore dependence (XY and Y ). In terms of power (l > 0) the tests which ignore dependence appear to perform slightly better than the methods


Table 16.3. Rejection rates (size and power) in the presence of uni-directional Granger causality, Y → X, with (X_t, Y_{t−1}) ∼ BVN(0, 0, 1, 1, 1/2).

  l     XY      Y       (X, Y)   (X, Y) delay vectors   HJ
 −2    0.033   0.037   0.019    0.026                  0.026
 −1    0.037   0.039   0.023    0.024                  0.021
  1    0.717   0.733   0.550    0.614                  0.591
  2    0.452   0.479   0.384    0.617                  0.355

Table 16.4. Granger causality, Y → X, linear dependence.

  l     XY      Y       (X, Y)   (X, Y) delay vectors   HJ
 −2    0.027   0.035   0.035    0.027                  0.034
 −1    0.029   0.032   0.027    0.034                  0.031
  1    0.812   0.830   0.700    0.707                  0.781
  2    0.615   0.570   0.567    0.720                  0.634

Table 16.5. Power, linear interaction case.

  l     XY      Y       (X, Y)   (X, Y) delay vectors   HJ
 −2    0.319   0.314   0.344    0.721                  0.506
 −1    0.524   0.524   0.448    0.728                  0.652
  1    0.492   0.533   0.489    0.721                  0.585
  2    0.290   0.310   0.337    0.719                  0.444

which preserve dependence and also better than the Hiemstra and Jones test. The time series generated by the model

$$X_t = 0.6 X_{t-1} + 0.5 Y_{t-1} + \varepsilon_t, \qquad Y_t = 0.6 Y_{t-1} + \eta_t \qquad (16.12)$$

where $\varepsilon_t$ and $\eta_t$ are independent and standard normally distributed, also exhibit uni-directional Granger causality, Y → X. The rejection rates for this model are given in table 16.4. For the first lag l = 1 the first two bootstrap methods again appear to have slightly more power than the other bootstrap methods and the Hiemstra and Jones test.
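Realizations of model (16.12) can be generated along the following lines. This is a sketch rather than the authors' simulation code: the burn-in period used to approach stationarity, the seed handling and the function name are assumptions.

```python
import random


def simulate_linear_causality(n, burn_in=100, seed=None):
    """Generate n observations of (X_t, Y_t) from model (16.12).

    The two noise terms are independent standard normal draws. The first
    burn_in iterates are discarded (an assumption, not from the chapter).
    """
    rng = random.Random(seed)
    x, y = 0.0, 0.0
    xs, ys = [], []
    for t in range(n + burn_in):
        # the right-hand side is evaluated with the previous values of x and y
        x, y = 0.6 * x + 0.5 * y + rng.gauss(0, 1), 0.6 * y + rng.gauss(0, 1)
        if t >= burn_in:
            xs.append(x)
            ys.append(y)
    return xs, ys
```

Because Y_{t−1} enters the equation for X_t while no lag of X enters the equation for Y_t, the causality runs from Y to X only.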


Table 16.6. Size and power for a model with nonlinear Granger causality, Y → X.

  l     XY      Y       (X, Y)   (X, Y) delay vectors   HJ
 −2    0.037   0.033   0.036    0.030                  0.022
 −1    0.024   0.030   0.028    0.032                  0.025
  1    0.965   0.956   0.889    0.910                  0.926
  2    0.869   0.861   0.805    0.895                  0.823

Next an interaction case is considered, given by

$$X_t = 0.5 X_{t-1} + 0.4 Y_{t-1} + \varepsilon_t, \qquad Y_t = 0.5 Y_{t-1} + 0.4 X_{t-1} + \eta_t. \qquad (16.13)$$

Table 16.5 shows the obtained powers for this process. The tests which ignore dependence still have some power, but this can be seen to be considerably smaller than that for the tests which preserve dependence, and the Hiemstra and Jones test. A possible explanation for the small power of the bootstrap test could be the fact that a very small value of B was used (B = 20). As shown by Hope [6] and Marriott [9] the power slightly increases when a larger number of bootstrap replications are used, but there is no need to choose an excessively large value for B. According to Marriott, typically 5 is a suitable value for αB, suggesting the choice B = 100 rather than B = 20 for α = 0.05. Indeed the power for the bootstraps was observed to improve slightly in trial runs with B = 100, but this increase was certainly not sufficient to change our conclusion that the Hiemstra and Jones test outperforms all but the (X, Y) bootstraps in this case.

The last process we consider exhibits nonlinear Granger causality Y → X. The model is

$$X_t \sim N(0, \sigma_t^2), \qquad \sigma_t = Y_{t-1} \sim N(0, 1). \qquad (16.14)$$

This is a simple model of bivariate conditional heteroskedasticity. Processes with conditional heteroskedasticity play an important role in econometrics, where they are frequently used to model a phenomenon referred to as volatility clustering. This is the tendency of stock prices to show larger price movements after periods of large price movements, and smaller price movements after periods with small price movements. Table 16.6 shows the estimated size and power for this example. The size and power appear to be comparable for all tests.
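A minimal sketch of a generator for model (16.14) follows. It assumes that the Y_t are drawn independently over time, which the model statement does not spell out, and the function name is hypothetical.

```python
import random


def simulate_heteroskedastic(n, seed=None):
    """Generate n observations of model (16.14).

    X_t is normal with mean zero and standard deviation |Y_{t-1}|, which is
    realized as Y_{t-1} times a standard normal draw. The Y_t are taken to
    be i.i.d. standard normal (an assumption of this sketch).
    """
    rng = random.Random(seed)
    y_prev = rng.gauss(0, 1)  # plays the role of Y_0
    xs, ys = [], []
    for _ in range(n):
        x = y_prev * rng.gauss(0, 1)
        y = rng.gauss(0, 1)
        xs.append(x)
        ys.append(y)
        y_prev = y
    return xs, ys
```

In such a series X_t is uncorrelated with Y_{t−1}, while X_t² is correlated with Y_{t−1}², which is why the causality is called nonlinear: a test based on linear correlations alone would not be expected to detect it.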

16.5 Summary and discussion

We argued that information theoretic quantities provide a natural means of testing for Granger causality, and we proposed an information theoretic test statistic for Granger causality. It was shown that several alternative time series bootstrap strategies perform well in terms of size and power. All bootstrap tests, and also Hiemstra and Jones’s test, were found to be conservative, at least for the examples studied here. This implies that practitioners need not fear that the rate of type I errors exceeds the nominal size. In some examples the power of the tests outperformed the Hiemstra and Jones test, whereas the opposite also occurred.

The bootstrap tests turn out to be very robust against dependence between X and Y, even when the bootstraps destroy this. A possible explanation for this phenomenon is that the test statistic picks out just the right type of dependence between two time series. The test statistic by construction is only sensitive to dependence that can be associated with Granger causality. All other dependence such as instantaneous correlation is ignored by the test statistic.

If only the noncausing time series {Y_t} is bootstrapped, the terms $\ln C(X_t^*, X_{t+1}^*)$ and $\ln C(X_t^*)$ for all bootstraps are equal, and equal to the value obtained for the original time series. Therefore, p-values determined with this method will remain unchanged upon leaving out these terms from the test statistic. This bootstrap method thus has the advantage that a simplified test statistic can be used, namely $Q = \ln C(X_t, Y_t, X_{t+1}) - \ln C(X_t, Y_t)$. The quantity $-Q$ can be interpreted as the correlation entropy of X conditional on Y. Indeed, when Y Granger causes X one would expect the conditional correlation entropy of X given Y to be smaller than when Y contains no additional information on future values of X. This suggests that a Hiemstra and Jones type of test could also be designed based on $T = C(X_t, Y_t, X_{t+1})/C(X_t, Y_t)$. If the asymptotic variance of this simplified statistic is known, conditionally on X, the significance can be determined by calculating how far, in terms of standard deviations, it is located from its value under the null hypothesis, $T_0 = C(X_t, X_{t+1})/C(X_t)$.

Although the nonparametric bootstrap tests discussed here all have nice size and power properties, they remain quite uninformative about the nature of the Granger causality involved. Information on the exact lags involved is difficult to obtain from the test results, even when the results for different lags are compared. For example, when Y Granger causes X through the first lag only, the test will also detect Granger causality for l = 2, simply because for l = 2 the delay vector of Y contains, apart from $Y_{t-1}$, also the lagged value $Y_t$ which contains information on $X_{t+1}$. A possible solution, in the spirit of Savit and Green [14] who use a similar approach for lagged dependence in a univariate time series setting, is to compare the information about $X_{t+1}$ contained in $(X_t, Y_{t-1}, Y_t)$ with that contained only in $(X_t, Y_t)$. In this way, the extra information in each of the added lagged values of Y can be examined separately.

In the examples studied the bootstrap tests and the Hiemstra and Jones test performed about equally well. The information theoretic approach, however, has the advantage that clear-cut statistical quantities can be used for time series problems that involve information flowing from one variable to another. Therefore, a future direction is the development of an asymptotic theory for the information theoretic test statistics proposed in this contribution and related


statistics. Since correlation integrals are known to be asymptotically normal for stationary mixing processes, the asymptotic distributions of the estimators of the information theoretic quantities based on them, are expected to be analytically tractable.

References

[1] Asimakopoulos I, Ayling D and Mahmood W M 2000 Nonlinear Granger causality in the currency futures returns Econom. Lett. 68 25–30
[2] Brock W A, Dechert W D, Scheinkman J A and LeBaron B 1996 A test for independence based on the correlation dimension Econom. Rev. 15 197–235
[3] Cheung Y-W and Ng L K 1996 A causality-in-variance test and its application to financial market prices J. Econom. 72 33–48
[4] Freiwald W A, Valdes P, Bosch J, Biscay R, Jimenez J C, Rodriguez L M, Rodriguez V, Kreiter A K and Singer W 1999 Testing non-linearity and directedness of interactions between neural groups in the macaque inferotemporal cortex J. Neurosci. Meth. 94 105–19
[5] Hiemstra C and Jones J D 1994 Testing for linear and nonlinear Granger causality in the stock price–volume relation J. Finance 49 1639–64
[6] Hope A C A 1968 A simplified Monte Carlo significance test procedure J. R. Stat. Soc. B 30 582–98
[7] Koch P D and Koch T W 1991 Evolution in dynamic linkages across daily national stock indices J. Int. Money Finance 10 231–51
[8] Longin F and Solnik B 1995 Is the correlation in international equity returns constant: 1960–1990? J. Int. Money Finance 14 3–26
[9] Marriott F H C 1979 Barnard’s Monte Carlo tests: how many simulations? Appl. Stat. 28 75–7
[10] Pitard A and Viel J F 1999 A model selection tool in multi-pollutant time series: the Granger-causality diagnosis Environmetrics 10 53–65
[11] Politis D N and Romano J P 1994 The stationary bootstrap J. Am. Stat. Assoc. 89 1303–13
[12] Pompe B, Blidh P, Hoyer D and Eiselt M 1998 Using mutual information to measure coupling in the cardiorespiratory system. New insights into nonlinear coordinations IEEE Eng. Med. Biol. Mag. 17 32–9
[13] Prichard D and Theiler J 1995 Generalized redundancies for time series analysis Physica D 84 476–93
[14] Savit R and Green M 1991 Time series and dependent variables Physica D 50 95–116
[15] Takens F 1993 Detecting nonlinearities in stationary time series Int. J. Bif. Chaos 3 241–56
[16] Tan K-G and Cheng C-S 1995 The causal nexus of money, output and prices in Malaysia Appl. Economics 27 1245–51


Chapter 17

Birkhoff averages and bifurcations

Ale Jan Homburg, University of Amsterdam
Todd Young, Ohio University

In studying bifurcations of dynamical systems in a regime of chaotic dynamics, one can try to obtain a partial understanding by considering how ergodic properties vary with the system. More concretely, we will consider families of dynamical systems and study the dependence on the parameters of averages of functions (Birkhoff averages) over orbits. Let $\{f_\gamma\}$ be a family of diffeomorphisms or endomorphisms on a manifold M. Let φ be a continuous function on M. For x ∈ M, consider averages

$$\bar\phi(f_\gamma, x) = \lim_{i\to\infty} \frac{1}{i} \sum_{j=0}^{i-1} \phi(f_\gamma^j(x))$$

if the limit exists. If $f_\gamma$ supports a physical or SBR measure (after Sinai, Bowen and Ruelle), then by definition the limit does exist and is constant for x from a set of positive Lebesgue measure. The central question is how the values $\bar\phi(f_\gamma, x)$ on these positive measure sets vary with the parameter γ.

Of particular interest from this point of view is intermittency, which is characterized by the existence of alternating phases in the dynamics. In one phase, the laminar phase, the dynamics appear to be nearly periodic, while in the other phase, the orbit makes large seemingly chaotic excursions away from the periodic region. These excursions are called chaotic bursts. Although much can be presented in larger generality, we will center our presentation around intermittency in families of bimodal maps on the circle. Intermittency can occur in the unfoldings of saddle–node bifurcations and of certain homoclinic bifurcations (in the context of one-dimensional dynamics


referred to as boundary crisis). The relative frequency with which the dynamics is in a laminar phase, is expressed by a Birkhoff average. For circle endomorphisms it is further natural to consider rotation numbers and determine how these depend on parameters. A rotation number is also expressed as a Birkhoff average. In the development of the theory around all of the above keywords (intermittency, rotation number, saddle–node bifurcation, homoclinic bifurcation), Floris Takens played a prominent role. It has been a great pleasure for us to write this paper on the occasion of his sixtieth birthday.

17.1 General assumptions and notations

Throughout the paper $\{f_\gamma\}$ will stand for a one parameter family of degree one maps on a circle M, of class $C^3$ jointly in x ∈ M and γ ∈ R, so that:

• $f_\gamma$ is bimodal,
• $D^2 f_\gamma(c) \ne 0$ for each turning point c,
• $f_\gamma$ has negative Schwarzian derivative, i.e., $\frac{D^3 f_\gamma}{D f_\gamma} - \frac{3}{2}\left(\frac{D^2 f_\gamma}{D f_\gamma}\right)^2 < 0$ outside the turning points.

A model family is the standard family

$$f_{b,\omega}(x) = x + \omega + \frac{b}{2\pi} \sin(2\pi x)$$

which has negative Schwarzian derivative for b > 1, when $f_{b,\omega}$ is not invertible. Much of the theory to follow can be stated in larger generality, for multimodal maps on intervals or circles.

If a is a periodic point of $f_\gamma$, then we denote by $W^s(a)$ the set of all points whose ω-limit sets consist of the orbit O(a) of a. We call any periodic orbit O(a) for which $W^s(a)$ contains open intervals a periodic attractor. The set of components of $W^s(a)$ which contain O(a) we call the immediate basin of the attractor. The unique component of $W^s(a)$ which contains a we call the immediate basin of a.

17.1.1 The saddle–node bifurcation

The family $\{f_\gamma\}$ unfolds a saddle–node bifurcation at γ = 0 if there is a k-periodic point a such that

• $D f_0^k(a) = 1$,
• $D^2 f_0^k(a) \ne 0$,
• $\frac{\partial}{\partial\gamma} f_\gamma^k(a) \ne 0$ at γ = 0.

Without loss of generality, we may assume that

$$D^2 f_0^k(a) > 0 \quad\text{and}\quad \left.\frac{\partial}{\partial\gamma} f_\gamma^k(a)\right|_{\gamma=0} > 0.$$


Thus for γ > 0 and all x in some neighborhood U of a, we have that $f_\gamma^k(x) > x$. In particular, the periodic point disappears and there is no k-periodic point in U.

17.1.2 The boundary crisis bifurcation

We say that $\{f_\gamma\}$ unfolds a boundary crisis if it satisfies the following.

• At γ = 0 there is a turning point c ∈ M and a subinterval N containing c so that $f_0^k(N) = N$, $f_0^i(N) \cap N = \emptyset$ and $f_0^i(N)$ contains no turning points for 0 < i < k,
• $D^2 f_0^k(c) \ne 0$,
• $\frac{\partial}{\partial\gamma} f_\gamma^k(c) \ne 0$ at γ = 0.

Without loss of generality we may assume that $D^2 f_0(c) < 0$. With this assumption we have:

• the interval N is of the form [a, b], where $f_0^k(a) = f_0^k(b) = a$.

The periodic point a is hyperbolic repelling since $f_\gamma$ has negative Schwarzian derivative and thus its continuation exists for γ close to 0. Hence, for γ close to 0, the continuation of N, which we denote by $N_\gamma$, exists. At γ = 0, $\bar N = \cup_{i=0}^{k-1} f_0^i(N)$ is a union of k disjoint intervals. Its continuation for γ close to 0 will be denoted by $\bar N_\gamma$. The assumption $f_0^k(N) = N$ implies that $f_0^k(c) = b \in \partial N$. The inequality $D^2 f_0^k(c)\, \frac{\partial}{\partial\gamma} f_\gamma^k(c) < 0$ implies that for γ > 0 small, $f_\gamma^k(c) \notin N$ and thus the periodicity of the interval N is destroyed and the attractor in N may ‘explode’. For γ > 0 small, let $\bar E_\gamma = \{x \in \bar N ;\ f_\gamma^k(x) \notin \bar N_\gamma\}$ be the set of points mapped outside of $\bar N_\gamma$ by $f_\gamma^k$. The set $\bar E_\gamma$ is the union of k disjoint small intervals and we will write $E_\gamma$ for the component of $\bar E_\gamma$ inside $N_\gamma$.

We say that $f_0$ is renormalizable if there is a turning point c ∈ M and a subinterval N containing c so that $f_0^k(N) \subset N$ and $f_0^i(N) \cap N = \emptyset$ for 0 < i < k. If there is no such interval with these properties we say that $f_0$ is not renormalizable. We say that $f_0$ is at most once renormalizable if it is not renormalizable or if it is exactly once renormalizable, i.e. if N does not contain a smaller interval that is mapped into itself by some iterate of $f_0$.

17.2 Likely rotation numbers

Let F be a lift of f, i.e. a 1-periodic map from R into itself such that π ◦ F = f ◦ π, where π is the usual projection map. Let

$$\rho(F, x) = F(X) - X$$

where $X \in \pi^{-1}(x)$. Given x ∈ M, consider the sequence of sums

$$\rho_i(F, x) = \frac{1}{i} \sum_{j=0}^{i-1} \rho(F, f^j(x)). \qquad (17.1)$$

The rotation number is the limit

$$\bar\rho(F, x) = \lim_{i\to\infty} \rho_i(F, x) \qquad (17.2)$$

if it exists. The set of rotation numbers of $f_\gamma$ make up an interval I(F) called the rotation interval [9]. In general, we denote by I(F, x) the set of limit points of $\rho_i(F, x)$. Observe that I(F, x) is a compact interval. Given any subinterval [α, β] ⊂ I(F), there exists x ∈ M such that I(F, x) = [α, β] [2].

In light of the usefulness of rotation numbers and rotation intervals, the measure properties of this theory were investigated in [11] and we introduce the reader to this extension. Let m denote the projected Lebesgue measure on M (m(M) = 1). We do not assume that m is invariant under f. A natural partition of M is as follows:

$$\begin{aligned} A(f) &= \{x \in M : I(F, x) = \{p/q\} \in \mathbb{Q}\} \\ B(f) &= \{x \in M : I(F, x) = \{\alpha\} \in \mathbb{R} \setminus \mathbb{Q}\} \\ C(f) &= \{x \in M : I(F, x) \text{ is not a point}\}. \end{aligned} \qquad (17.3)$$

The case m(C) = 0 is equivalent to the statement that $\{\rho_i(F, x)\}$ converges almost everywhere as i → ∞. Assign $\bar\rho(F, x)$ an arbitrary value for x ∈ C. Since $\rho_i(F, x)$ is a sequence of continuous functions, the function $\bar\rho(F, x)$ is measurable. For m(C) = 0, define µ to be a real valued function on the Borel sets of R given by

$$\mu(S) = m(\{x \in M : \bar\rho(F, x) \in S\})$$

where S is any Borel set. We call the measure µ the rotation distribution of F. We call the support of µ the likely rotation set of f. Under the condition m(C) = 0, the measure µ carries the measure theoretic information about the rotation interval. For instance if f has a periodic attractor which attracts m-almost every x ∈ S, then µ will be the atomic probability measure supported on the rotation number of the attractor.

For any i ∈ N and any Borel set S define

$$\mu_i(S) = m(\{x \in M : \rho_i(x) \in S\}).$$

We call $\mu_i$ the i-th experimental rotation distribution. We call the support of $\mu_i$ the observed rotation set.
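The finite-time averages (17.1) and the experimental rotation distribution can be explored numerically for the standard family of section 17.1. The sketch below is illustrative only: the function names, the sampling grid and the use of a fraction of grid points as a stand-in for the measure m are assumptions. It uses the fact that the sum in (17.1) telescopes to (F^i(X) − X)/i for any lift X of x.

```python
import math


def lift(X, b, omega):
    """Lift F(X) = X + omega + (b / (2 pi)) sin(2 pi X) of the standard family."""
    return X + omega + b / (2 * math.pi) * math.sin(2 * math.pi * X)


def rho_i(x, b, omega, i):
    """Finite-time average (17.1), computed as (F^i(X) - X) / i with X = x."""
    X = x
    for _ in range(i):
        X = lift(X, b, omega)
    return (X - x) / i


def observed_rotation_values(b, omega, i, samples=200):
    """rho_i evaluated on an equispaced grid of initial points in [0, 1)."""
    return [rho_i((j + 0.5) / samples, b, omega, i) for j in range(samples)]


def fraction_in(values, lo, hi):
    """Grid approximation of mu_i([lo, hi]): share of sample points landing there."""
    return sum(1 for v in values if lo <= v <= hi) / len(values)
```

For b = 0 the map is a rigid rotation and every initial point returns the value ω, so the approximate distribution is a single atom. For b > 1 one would expect a histogram of `observed_rotation_values` to spread over part of the rotation interval, depending on the parameters.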


It was shown in [16] that $\mu_i$ converges to the rotation distribution µ as i → ∞. However, each $\mu_i$ is absolutely continuous whereas µ may be singular (as in the case where m is ergodic), thus convergence, in general, is only weak.

Proposition 17.1. If m(C) = 0, then $\mu_i \to \mu$ weakly.

In light of proposition 17.1, it was proposed that a satisfactory extension of the rotation distribution should be such that whenever $\mu_i$ converges weakly, it agrees with the limit. Thus, if the sequence of measures $\mu_i$ converges weakly, then we call the limit the rotation distribution of f and denote it by µ. The support of the measure µ we call the likely rotation set of f. A single number to which µ assigns positive measure is called a likely rotation number.

In [11] observed rotation numbers were investigated numerically for the standard circle map with emphasis on the dependence on the parameter values. The dependence turns out to be quite erratic, a result which is clearly to be expected from propositions 17.2 and 17.3 below. However, for most parameter values near locking intervals it was found that the observed rotation numbers decay away from the locking value in a quite predictable way, consistent with theorem 17.12 and theorem 17.13.

17.3 Discontinuity of averages

For an isolated periodic point $p = f_\gamma^k(p)$, its unstable set $W_\gamma^u(p)$ is defined by

$$W_\gamma^u(p) = \cup_{i \ge 0} f_\gamma^i(H)$$

where H is a small neighborhood of p on which $f_\gamma^k$ is monotonic and that contains no other periodic points of period k. The limit set $\Lambda_\gamma$ of $f_\gamma$ is the union of all ω-limit points of $f_\gamma$. Denote by $P_\gamma$ the set of periodic points of $f_\gamma$. The following result shows that Birkhoff averages will, in general, change discontinuously at boundary crisis and saddle–node bifurcations. Following it we specialize and obtain a result on likely rotation numbers.

Proposition 17.2. Let $\{f_\gamma\}$ be as above, unfolding either a boundary crisis or a saddle–node bifurcation. Let $L = \{q : q \in P_0 \cap W_0^u(a),\ c \in W_0^u(q)\}$. For a continuous function φ on M, let

$$\Sigma = \bigcup_{x \in L \cap P} \bar\phi(f_0, x).$$

Then for each ε > 0 and σ in the convex hull of Σ, there is γ within distance ε of 0, so that

$$|\bar\phi(f_\gamma, x) - \sigma| < \varepsilon$$

for x from a subset of M with positive Lebesgue measure.

Proof. Consider first a transitive component $\hat L$ of L. Note that the closure of $\bigcup_{x \in \hat L \cap P} \bar\phi(f_0, x)$ is a compact interval [s, t]. For each σ ∈ [s, t] there is a periodic point $q = f_0^l(q) \in \hat L$ so that $|\bar\phi(f_0, q) - \sigma| < \frac{1}{2}\varepsilon$. For periodic points x of $f_0$ we denote the continuation, which exists for small γ, by the same letter x. The periodic point q is in the unstable manifold $W_0^u(a)$ and, hence, in the unstable manifold $W_\gamma^u(a)$ for all small γ. This implies the existence of $\bar\gamma$ arbitrarily close to 0, so that $f_{\bar\gamma}^m(c) = q$ for some positive integer m. Since c is also in the unstable manifold of q for all γ sufficiently close to 0, there is $\gamma_n$ converging to $\bar\gamma$ as n → ∞ for which c is periodic and spends all but finitely many iterates near $O_{f_{\gamma_n}}(q)$. Thus $f_{\gamma_n}$ has an SBR measure with support $O_{f_{\gamma_n}}(c)$. The Birkhoff averages $\bar\phi(f_{\gamma_n}, x)$ for x in the basin of attraction of $O_{f_{\gamma_n}}(c)$ converge to $\bar\phi(f_{\bar\gamma}, c)$ as n → ∞. Similarly one finds parameter values γ for which the orbit $O_{f_\gamma}(c)$ is periodic and spends long times near a finite union of periodic orbits $O_{f_\gamma}(q_i)$ in L.

In the following result we apply proposition 17.2 to likely rotation numbers. Let $F_\gamma$ be a lift of $f_\gamma$ and let $\bar\rho(F_\gamma, x)$, $I(F_\gamma)$ be as in the previous section. It is known that the left and right endpoints of $I(F_\gamma)$ depend continuously on γ [9].

Proposition 17.3. Let $\{f_\gamma\}$ be as above, unfolding either a boundary crisis or a saddle–node bifurcation. Assume that $f_0$ is at most once renormalizable. Assume that the second critical point c is eventually mapped inside of N. For each ε > 0 and σ in $I(F_0)$, there is γ within distance ε of 0, so that $|\bar\rho(F_\gamma, x) - \sigma| < \varepsilon$ for x from a subset of M with positive Lebesgue measure.

Proof. Consider the case where $\{f_\gamma\}$ unfolds a boundary crisis bifurcation. We claim that c is contained in the unstable set $W_0^u(q)$ of each periodic point q of $f_0$.
Let be the period of q and consider the maximal interval I around q on which f 0 is monotonic. If the orbit of a boundary point z ∈ ∂ I contains c we are ready. Otherwise, for z ∈ ∂ I , there is an non-negative integer m with f 0m (z) = c . By assumption on O(c ), a small interval H ⊂ W0u (q) containing c is eventually mapped inside N. Thus H is mapped over c by some iterate of f0 . Recall that a denotes the periodic point in ∂ N. The unstable set W0u (a) is forward invariant. Because f 0 is at most once renormalizable, W0u (a) = M. The reasoning of proposition 17.2 can now be followed to prove the result. Similar arguments can be followed for families { f γ } that unfold a saddle– node bifurcation.


17.4 Local embedding flows for the saddle–node

Denote by U a small neighborhood of a on which $f_0^k$ is invertible. Let $W^s_{loc}(a)$ and $W^u_{loc}(a)$ denote the usual local stable and local unstable sets for a.

Proposition 17.4. Let $\{f_\gamma\}$ be a family of $C^r$, r ≥ 2, maps unfolding a saddle–node. Then there exists a family of $C^r$ flows, $\{\phi_\gamma^t\}_{0 \le \gamma < \bar\gamma}$, on U such that $f_\gamma^k \equiv \phi_\gamma^1$ for each γ ≥ 0. Further, $\phi_\gamma^t(\cdot) \to \phi_0^t(\cdot)$ in the $C^1$ topology on U and in the $C^r$ topology on compact intervals away from the fixed point. The flow $\phi_0^t$ is uniquely determined by $f_0$.

Proof. The $C^\infty$ version of this theorem, for γ = 0, is due to Takens [13], compare further [9]. The $C^r$ result follows from part 2 of [14]. The case γ = 0 follows from appendix 3 of that reference. The case γ > 0 and the convergences as γ ↘ 0 follow from [14, theorem IV.2.5 and lemma IV.2.7].

Remark 17.5. This result is known as the Takens embedding theorem. A version of it appears in [6], where it is proved that one may obtain $\phi_\gamma^t(x)$ which depends $C^r$ smoothly on both x and γ, even at the fixed point, if one requires that $(x, \gamma) \mapsto f_\gamma(x)$ be $C^{R(r)}$ smooth, where R(r) may be larger than r. Proposition 17.4 allows for our weaker hypotheses and its implications are sufficient for our purposes.

Choose a point $e \in W^u_{loc}(a)$, such that

$$I_\gamma^u \equiv [e, f_\gamma^k(e)] \subset U$$

for all $0 \le \gamma < \bar\gamma$. Similarly, choose $d \in W^s_{loc}(a)$ and let

$$I_\gamma^s \equiv [d, f_\gamma^k(d)] \subset U.$$

Given γ ≥ 0 and $x \in I_\gamma^u$, define $\tau_\gamma^u(x)$ to be the unique number for which

$$\phi_\gamma^{\tau_\gamma^u(x)}(e) = x.$$

For γ ≥ 0 and $x \in I_\gamma^s$, let $\tau_\gamma^s(x)$ be defined by $\phi_\gamma^{\tau_\gamma^s(x)}(d) = x$. It follows from the smoothness of $\phi_\gamma^t(x)$ that for each γ ≥ 0, the functions $\tau_\gamma^{s,u}$ are $C^r$ diffeomorphisms from $I_\gamma^{s,u}$ to [0, 1]. From now on we will identify the interval [0, 1] with the unit circle $S^1$. We will use $\tau_\gamma^{s,u}$ as coordinates on $I_\gamma^{s,u}$.

Given d and e as above, let $\{\gamma_n\}_{n=n_0}^{\infty}$ be the sequence, $\bar\gamma > \gamma_{n_0} > \gamma_{n_0+1} > \cdots$, defined by

$$f_{\gamma_n}^{kn}(d) = e.$$

For each $n \ge n_0$ let $g_n : [0, 1] \to [\gamma_{n+1}, \gamma_n]$ be the reparametrization map defined by

$$\phi_{g_n(\theta)}^{n+\theta}(d) = e.$$


We have that $g_n(0) = \gamma_n$ and $g_n(1) = \gamma_{n+1}$. We may invert $g_n(\cdot)$, for each n, to obtain maps $\theta_n : [\gamma_{n+1}, \gamma_n] \to [0, 1]$.

Lemma 17.6. The reparametrization maps $g_n$ are smooth monotonically decreasing functions with uniformly small distortion: given ε > 0 there is N ∈ N so that for every n ≥ N and every θ ∈ [0, 1],

$$(1 - \varepsilon) \le \frac{Dg_n(\theta)}{|\gamma_n - \gamma_{n+1}|} \le (1 + \varepsilon).$$

Proof. Diaz et al [4] proved this result under the hypothesis that (x, γ ) "→ f γ (x) is C R(r) , using Il’yashenko and Li’s embedding result (remark 17.5). A proof of this result under the current hypotheses appears in [1] based on [7]. We remark that n 2 γn converges as n → ∞ (see [8]), so that γn+1 /γn → 1 as n → ∞. This fact, together with lemma 17.6 imply the next proposition [1]. Proposition 17.7. Let be a measurable subset of [0, γ¯ ) and denote n =

∩ [γn , γn−1 ]. If the limit lim m(θn ( n ))

n→+∞

exists and equals , then m( ∩ [0, γ )) = . γ 60 γ lim

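Let $L_{n,\theta}$ denote the local (first hit) map from $I_\gamma^s$ to $I_\gamma^u$ induced by $f_{g_n(\theta)}$. The convenience of using $\tau_\gamma^s$ and $\tau_\gamma^u$ as coordinates on $I_\gamma^s$ and $I_\gamma^u$ is seen in the following proposition.

Proposition 17.8. For each $n \ge n_0$ and each θ ∈ [0, 1]

$$L_{n,\theta} = (\tau^u_{\gamma_n(\theta)})^{-1} \circ R_{-\theta} \circ \tau^s_{\gamma_n(\theta)}. \qquad (17.4)$$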

Proof. This follows from proposition 17.4 and the definitions of τγs and τγu as the time variables for the embedding flow for f γk . We wish to point out that, while the Takens embedding theorem is the key tool for understanding local saddle–node bifurcations in one dimension, results by Takens [12] are also crucial for considering local saddle–node bifurcations in higher dimensions.
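The parameter sequence γ_n of this section records when the passage from d to e takes exactly n applications of f^k. A rough numerical illustration is sketched below for the local model x ↦ x + x² + γ. This normal form, the choices d = −0.5 and e = 0.5, and the function name are assumptions made for the example, not part of the text. In this model the number of iterates needed to pass the bottleneck grows like π/√γ, which is consistent with the convergence of n²γ_n noted above.

```python
import math


def passage_iterates(gamma, d=-0.5, e=0.5, max_iter=10**7):
    """Number of iterates of x -> x + x^2 + gamma needed to get from d past e.

    The map is a hypothetical local model of a saddle-node at x = 0, gamma = 0.
    """
    x, n = d, 0
    while x < e:
        x = x + x * x + gamma
        n += 1
        if n > max_iter:
            raise RuntimeError("no passage within max_iter iterates")
    return n


if __name__ == "__main__":
    for gamma in (1e-2, 1e-4, 1e-6):
        n = passage_iterates(gamma)
        # n * sqrt(gamma) should approach pi as gamma decreases
        print(gamma, n, n * math.sqrt(gamma))
```

Reading the relation backwards, the parameter at which the passage takes n iterates behaves like π²/n² in this model; for a general family the limit of n²γ_n would depend on the second derivative and on the parameter derivative at the bifurcation.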


17.5 The Mather invariant and return maps

Let $\bar G$ be the first hit map from $I^u_0$ to $W^s_{\mathrm{loc}}(a)$. With $\tau^s_\gamma : I^s_\gamma \to [0, 1]$ defined in the previous section we also define, for $\gamma = 0$, $\bar\tau^s_0 : W^s_{\mathrm{loc}}(a) \to [0, \infty)$ by $\phi_0^{\bar\tau^s_0(x)}(d) = x$. Note that $\tau^s_0 = \bar\tau^s_0|_{I^s_0}$. Define $\bar M : [0, 1] \to \mathbb{R}$ by
\[ \bar M = \bar\tau^s_0 \circ \bar G \circ (\tau^u_0)^{-1}, \]
and define $M : [0, 1] \to [0, 1]$ by $M = \bar M \bmod 1$. By identifying the endpoints of $[0, 1]$ we may consider $\bar M$ as a map from a subset of the circle $S^1$ into $\mathbb{R}$ and $M$ as a map from a subset of $S^1$ into $S^1$. Following [14], we call $\bar M$ the Mather invariant for $f_0$. One may show that $M$ is a modulus of smooth conjugation, in other words, it is invariant under differentiable changes of variables.

Let $\Omega = \Omega(f_0) \setminus O(a)$, where $\Omega(f_0)$ is the ω-limit set of $f_0$. If $\Omega$ is non-trivial and if the other critical point of $f_0$ has $O(a)$ as its limit set, then negative Schwarzian derivative implies that $\Omega$ is a hyperbolic invariant set. If this is the case, then infinitely many points from $I^u_0$ will be mapped onto $\Omega$ and this will result in infinitely many discontinuities for $M$, even when considered on $S^1$.

Given $J$, denote by $V_\gamma(J)$ the subset of $I^u_\gamma$ defined by
\[ V_\gamma(J) = \{x \in I^u_\gamma : \bar M(\tau^u(x)) < J,\ f_0^j(x) \in [d, a) \text{ for some } j < J,\ f_0^j(x) \notin [a, f_0^k(e)] \text{ for any } j < J\}. \]
The points in $V_\gamma(J)$ are those whose forward orbits enter $W^s_{\mathrm{loc}}(a)$ in a bounded number of iterations and which do not come too close to $a$ in the process, either by re-entering $W^u_{\mathrm{loc}}(a)$ or by landing in $W^s_{\mathrm{loc}}(a)$ too close to $a$. Since the forward orbit of almost every $x \in W^u_{\mathrm{loc}}(a)$ has $O(a)$ as its omega limit set, the relative measure $m(V_\gamma(J))/m(I^u_\gamma)$ may be made close to 1 by taking $J$ to be large and choosing $d$ and $e$ close to $a$.

For $\gamma > 0$, consider the first return map $\kappa_\gamma$ of the interval $I^u_\gamma$ and let $\tilde\kappa_{n,\theta}$ be the normalized map given by

\[ \tilde\kappa_{n,\theta} \equiv \tau^u_{g_n(\theta)} \circ \kappa_{g_n(\theta)} \circ (\tau^u_{g_n(\theta)})^{-1}. \]
Identifying the endpoints of $[0, 1]$, we may consider $\tilde\kappa_{n,\theta}$ as a map on the circle.

Proposition 17.9. Given any $J$,
\[ \lim_{n\to\infty} \left\| \tilde\kappa_{n,\theta}|_{\tau^u(V_\gamma(J))} - R_{-\theta} \circ M|_{\tau^u(V_\gamma(J))} \right\|_{C^r} = 0 \]
for each $0 \le \theta < 1$.

Proof. Let $G_\gamma$ denote the global (first hit) map from $I^u_\gamma$ to $I^s_\gamma$ induced by $f_\gamma$. Note that $\kappa_\gamma$ is not equal to $L_\gamma \circ G_\gamma$ since some points in $I^u_\gamma$ will return to $I^u_\gamma$ before hitting $I^s_\gamma$. However, the two maps do agree when restricted to $V_\gamma(J)$ since points in $V_\gamma(J)$ will in fact hit $I^s_\gamma$ before returning to $I^u_\gamma$. Since we are only considering a finite number of iterations, it follows from the construction of $M$ that $\tau^s_\gamma \circ G_\gamma \circ (\tau^u_\gamma)^{-1}$ converges to $M$ on the restricted set $\tau^u_\gamma(V_\gamma(J))$. Proposition 17.8 then implies that
\[ \tau^u_\gamma \circ L_{n,\theta} \circ G_\gamma \circ (\tau^u_\gamma)^{-1} \text{ converges to } R_{-\theta} \circ M \text{ on } \tau^u_\gamma(V_\gamma(J)). \tag{17.5} \]

The ideas of this section are all refinements of those used in [9], which studies the saddle–node by showing that global return maps have a limit as n → ∞. As it turns out, the Mather invariant, defined at γ = 0, is the limiting map when the time coordinates are used.

17.6 Intermittency near the saddle–node bifurcation

Definition 17.10. Given a one-dimensional, piecewise smooth map $T : X \to X$, a periodic interval is a closed interval $N \subset X$ such that $T^n(N) \subset N$ for some $n > 0$, and the orbit of $N$ is bounded away from the discontinuities of $T$. We will say that a periodic interval is hyperbolic if one of its endpoints is a repelling hyperbolic point, the other endpoint is mapped onto the first endpoint by $T^n$, and the attractor in $N$ is contained in the interior of $N$. If all the critical points of $T$ are eventually mapped into the interior of a hyperbolic periodic interval $N$ then we say that $N$ is absorbing.

Note that both hyperbolic and absorbing intervals are stable under $C^1$ perturbations of the map $T$.

Proposition 17.11. Suppose that for each $\theta \in (\theta^-, \theta^+) \subset S^1$ the map $R_{-\theta} \circ M$ has a hyperbolic interval $M_\theta$ of period $k_1$. Then there exist integers $n_0$ and $m$ and a sequence of intervals $\{(\beta_n^-, \beta_n^+)\}_{n=0}^{\infty}$ converging to 0 such that for each $\gamma \in (\beta_n^-, \beta_n^+)$ and $n \ge n_0$, $f_\gamma$ has a hyperbolic interval $N_\gamma$ of period $k_1 nk + m$. The limit
\[ \lim_{n\to\infty} \frac{|\beta_n^+ - \beta_n^-|}{|\gamma_n - \gamma_{n+1}|} = |\theta^+ - \theta^-| \tag{17.6} \]
exists. Furthermore, we may replace the word 'hyperbolic' by the word 'absorbing' in the above statement.

Proof. First assume that $R_{-\theta} \circ M$ has a hyperbolic interval $M_\theta$ for $\theta \in (\theta^-, \theta^+)$. By the definition of $M$, if we choose $J$ large enough and $\gamma$ small enough, then $(\tau^u)^{-1}(M_\theta) \subset V_\gamma(J)$. The existence of the intervals $(\beta_n^-, \beta_n^+)$ is then immediate from proposition 17.9. Further, proposition 17.9 also implies that $\theta_n(\beta_n^\pm) \to \theta^\pm$, as $n \to \infty$. Equation (17.6) then follows from lemma 17.6.


Next assume that m θ is absorbing. Note that all critical points for R−θ ◦ M correspond to points in I0u which are mapped onto one of the two critical points of f 0 . Thus, although R−θ ◦ M may have many critical points it may have at most j j two critical values. These are given by τ s (cα ), α = 1, 2, where cα ∈ I0s is in the j forward orbit of the critical point cα under f 0 . Since R−θ ◦ M maps τ s (cα ) into the interior of the interval Mθ under a finite number of iterations, it follows that cα is mapped into Nγn (θ) in a finite number of iterations. The intervals (βn− , βn+ ) are periodic windows in the parameter space. The orbits outside U¯ are equivalent for each n, but the number of iterates inside U¯ is a multiple of n. As before write {Fγ } for a lift of a family of circle maps { f γ }. Recall that ρ(F ¯ γ , x) denotes the rotation number of x. Theorem 17.12. Let { f γ } be as above, unfolding a saddle–node bifurcation at γ = 0. Suppose that (θ − , θ + ) be as in proposition 17.11. Let = − + − + ∪∞ n=n 0 (βn , βn ), where {(βn , βn )} is given by proposition 17.11. Then has positive density at 0, i.e. lim

γ 60

m( ∩ [0, γ )) > 0. γ

(17.7)

Further, there exist constants C1 , C2 and a set Dγ ⊂ M of positive measure, so that ρ(F ¯ 0 , a) − ρ(F ¯ γ , x) ≤ C2 (17.8) C1 ≤ lim √ γ ∈ ,γ 60 γ for all x ∈ Dγ . If Mθ is absorbing then, in addition, Dγ has full measure in M. Proof. Positive density of follows from proposition 17.11 and proposition 17.7. Since the periodic interval Nγ has positive measure in M it suffices to let Dγ = Nγ . Observe that all points in Nγ have the same rotation number. We may assume that the orbit of Nγ does not intersect the boundary of U¯ . The orbit O(Nγ ) spends a fixed number of iterations outside U¯ , while, as n → ∞, O(Nγ ) spends an arbitrary number of iterations in U¯ . Specifically, for a fixed n, the numbers of iterations inside and outside U¯ are the same for all γ ∈ (βn− , βn+ ) and all points in Nγ . In fact, the number of iterations outside U¯ is also the same, say , for all n ≥ n 0 . On the other hand, the number of iterations inside U¯ is equal to a multiple of n. Note that for γ ∈ (βn− , βn+ ), O(Nγ ) consists of + mn disjoint closed intervals. The specific formula in (17.8) follows from a standard calculation of the number of iterations an orbit spends inside U¯ ; see [10]. Finally, if Mθ is absorbing, proposition 17.11 implies that Nγ is absorbing for γ ∈ . That is, both critical points are eventually mapped into Nγ . Since f γ has negative Schwarzian derivative, it follows that almost all points in M are eventually mapped into Nγ .
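The square-root scaling in (17.8) comes from the time an orbit needs to pass the bottleneck left behind by the saddle–node. A minimal numerical sketch of that standard calculation follows. It assumes the model normal form $x \mapsto x + x^2 + \gamma$ on a fixed window $[-a, a]$; this model, the window size and the function name `passage_time` are illustrative choices and are not taken from the chapter.

```python
import math

def passage_time(gamma, a=0.5):
    """Count iterates of x -> x + x^2 + gamma needed to travel from -a past a.

    The map is an assumed local model of a saddle-node unfolding; for
    gamma > 0 every step moves x to the right, so the loop terminates.
    """
    x, n = -a, 0
    while x < a:
        x = x + x * x + gamma
        n += 1
    return n

if __name__ == "__main__":
    for gamma in (1e-4, 1e-5, 1e-6):
        n = passage_time(gamma)
        # The product should approach pi as gamma decreases to 0.
        print(f"gamma={gamma:.0e}  iterates={n}  iterates*sqrt(gamma)={n * math.sqrt(gamma):.4f}")
```

In this model the number of iterates grows like $\pi/\sqrt{\gamma}$, which is why the fraction of time spent away from the bottleneck, and with it the deviation of the rotation number, is of order $\sqrt{\gamma}$.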


17.7 Boundary crisis bifurcations

We will not give a complete study of boundary crisis bifurcations in circle maps. Instead we briefly discuss the boundary crisis bifurcation in families of unimodal maps. We formulate and discuss a conjecture on the boundary crisis bifurcations in circle maps. So, in this section, $\{f_\gamma\}$ will denote a family of unimodal maps unfolding a boundary crisis, so that

• $f_\gamma$ has a unique maximum at a critical point $c$,
• $D^2 f_\gamma(c) < 0$,
• $f_\gamma(\partial M) \subset \partial M$,
• $f_\gamma$ has negative Schwarzian derivative.

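We quote the following result from [5]. Let $\chi_\gamma$ denote the indicator function of $\bar N_\gamma$:
\[ \chi_\gamma(x) = \begin{cases} 0, & \text{if } x \notin \bar N_\gamma, \\ 1, & \text{if } x \in \bar N_\gamma. \end{cases} \]
Write
\[ \bar\chi_\gamma(f_\gamma, x) = \lim_{i\to\infty} \frac{1}{i} \sum_{j=0}^{i-1} \chi_\gamma(f_\gamma^j(x)) \]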

if the limit exists. Theorem 17.13. Let { f γ } be a family of unimodal maps as above, unfolding a boundary crisis bifurcation at γ = 0. There exists a set of parameter values of positive measure, with 0 a density point of , i.e. m( ∩ [0, γ )) =1 γ 60 γ lim

so that for γ ∈ , fγ possesses an absolutely continuous invariant measure νγ . Restricting ourselves to γ ∈ , χ¯ γ ( fγ , x) is a constant χ¯ (γ ) almost everywhere on M and depends continuously on γ at γ = 0. Furthermore, lim

γ ∈ ,γ 60

1 − χ(γ ¯ ) = K, √ γ | ln γ |

(17.9)

for some K . We conjecture that similar statements can be formulated and proved for the dependence of likely rotation numbers on a parameter in the unfolding of a boundary crisis bifurcation. We end this paper by giving some support to this conjecture. Suppose { f γ } is a family of bimodal circle maps unfolding a boundary crisis bifurcation, as before. Suppose also that the orbit of second critical point c is


eventually periodic at γ = 0. By [4], γ = 0 is a full density point of a set of parameter values for which |D fγi (c)| and |D f γi (c )| are bounded from below by K λi for some K > 0, λ > 1. Applying [15] and [3], fγ supports an absolutely continuous invariant measure for γ ∈ . For γ ∈ , one can try to mimic the arguments in [5].

References

[1] Afraimovich V and Young T 1998 Relative density of irrational rotation numbers in families of circle diffeomorphisms Ergod. Theor. Dynam. Syst. 18 1–16
[2] Bamón R, Malta I, Pacífico M J and Takens F 1984 Rotation intervals of endomorphisms of the circle Ergod. Theor. Dynam. Syst. 4 493–8
[3] Bruin H, Luzzatto S and Van Strien S 1999 Decay of correlations in one-dimensional dynamics Preprint
[4] Díaz L J, Rocha J and Viana M 1996 Strange attractors in saddle–node cycles: prevalence and globality Inv. Math. 125 37–74
[5] Homburg A J and Young T 2000 Intermittency in families of unimodal maps Ergod. Theor. Dynam. Syst. to appear
[6] Ilyashenko Yu and Weigu Li 1999 Nonlocal Bifurcations (Mathematical Surveys and Monographs 66) (Providence, RI: American Mathematical Society)
[7] Jonker L B 1990 The scaling of Arnol'd tongues for differentiable homeomorphisms of the circle Commun. Math. Phys. 129 1–25
[8] Misiurewicz M and Kawczyński A L 1990 At the other side of a saddle–node Commun. Math. Phys. 131 605–17
[9] Newhouse S, Palis J and Takens F 1983 Bifurcations and stability of families of diffeomorphisms Publ. Math. IHES 57 5–71
[10] Pomeau Y and Manneville P 1980 Intermittent transition to turbulence in dissipative dynamical systems Commun. Math. Phys. 74 189–97
[11] Saum M and Young T 2001 Observed rotation numbers in families of circle maps Int. J. Bif. Chaos 11 73–90
[12] Takens F 1971 Partially hyperbolic fixed points Topology 10 133–47
[13] Takens F 1973 Normal forms for certain singularities of vector fields Ann. Inst. Fourier 23 163–95
[14] Yoccoz J-C 1995 Centralisateurs et conjugaison différentiable des difféomorphismes du cercle Astérisque 231 89–242
[15] Young L-S 1998 Statistical properties of dynamical systems with some hyperbolicity Ann. Math. 147 585–650
[16] Young T 2000 Distributions of Birkhoff averages with respect to non-invariant measures Preprint


Chapter 18

The multifractal analysis of Birkhoff averages and large deviations

Yakov Pesin and Howard Weiss
The Pennsylvania State University

For one-sided subshifts of finite type, we describe the fine structure of the exceptional set in the Birkhoff ergodic theorem for Hölder continuous functions. We show an intimate connection with large deviation theory for Birkhoff averages, and we provide several applications to probability and number theory, including a problem popularized by Billingsley. We study the decomposition of the phase space into level sets of the Birkhoff average. We show that there are typically uncountably many dense level sets and that each level set carries an auxiliary equilibrium measure, with constant pointwise dimension (a type of self-similarity). These equilibrium measures, each supported on a measure zero set, are key to our analysis.

Floris Takens has been a pioneer in the dimension theory of dynamical systems and a leader in the multifractal analysis of dynamical characteristics. We dedicate this paper to him on the occasion of his 60th birthday.

18.1 Main result

Let $\sigma : \Sigma_A^+ \to \Sigma_A^+$ be a topologically mixing one-sided subshift of finite type [20], and $\phi \in C(\Sigma_A^+, \mathbb{R})$ a continuous function. Denote by $\bar\phi(x)$ the Birkhoff average of $\phi$ along the orbit of the point $x$, i.e.,
\[ \bar\phi(x) = \lim_{n\to\infty} \frac{1}{n} \sum_{k=0}^{n-1} \phi(\sigma^k x) \]
if the limit exists. The limit function is clearly $\sigma$-invariant and can be identified with the conditional expectation at $x$ of the function $\phi$ with respect to the sigma-algebra of $\sigma$-invariant functions. For any ergodic probability measure $\mu$, it follows from the Birkhoff ergodic theorem that for $\mu$-almost all $x$
\[ \bar\phi(x) \equiv \bar\phi \equiv \int_{\Sigma_A^+} \phi \, d\mu. \]
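As a concrete illustration of the definition, the following sketch computes finite Birkhoff averages on the full shift on two symbols for the function $\phi(x) = x_1$ (the first symbol), so that the sum over the first $n$ shifts is simply the number of ones among the first $n$ symbols. The two sample points and all function names are hypothetical choices made for this example: a periodic point, whose average converges to a value other than 1/2, and a point built from runs of doubling length, whose averages oscillate and do not converge.

```python
from fractions import Fraction
from itertools import cycle, islice

def birkhoff_average(symbols, n):
    """(1/n) * sum_{k<n} phi(sigma^k x) for phi(x) = first symbol of x."""
    return Fraction(sum(islice(symbols, n)), n)

def periodic(block):
    """The periodic point obtained by repeating a finite block."""
    return cycle(block)

def doubling_blocks():
    """The point 0 1 00 11 0000 1111 ...: runs of 0s and 1s of doubling length."""
    length = 1
    while True:
        for symbol in (0, 1):
            for _ in range(length):
                yield symbol
        length *= 2

if __name__ == "__main__":
    print(birkhoff_average(periodic((0, 0, 1)), 300))
    for n in (6, 10, 14, 22, 30):
        # Alternates between 1/2 and values tending to 1/3.
        print(n, birkhoff_average(doubling_blocks(), n))
```

Both points lie in the measure zero exceptional set for the (1/2, 1/2) Bernoulli measure: the first belongs to a level set with $\alpha = 1/3$, the second to the set where the Birkhoff average does not exist.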

However, the Birkhoff ergodic theorem provides no structural information about the exceptional set of measure zero. In probability theory and much of analysis, sets of measure zero have been considered negligible, since experts believed these sets carry no essential information about the measure. It was well understood that this exceptional set of measure zero supports other ergodic invariant measures, but they seem to have been of no use to study the initial measure. Recent work in the multifractal analysis of dynamical systems has changed this point of view. Out of the large collection of invariant measures supported on this exceptional measure zero set, the multifractal analysis provides a mechanism to select a special one parameter family of equilibrium measures and relate them to a moment generating function associated with the initial measure. We show that this determines the large deviation theory of Birkhoff averages. We make this precise in section 18.2.3.

Our multifractal analysis of the exceptional set for Birkhoff averages begins with some natural questions, such as whether $\bar\phi$ attains any other values, and if so, what is the range of values, and what is the dimension and topological structure of the level sets? Also, are there points for which $\bar\phi(x)$ does not exist and if so, what is the dimension and topological structure of this set? Another way of asking these questions is to describe the fine structure (in the sense of topology and dimension) of the decomposition of the phase space into level sets of the Birkhoff average $\bar\phi$, where the decomposition is given by:
\[ \Sigma_A^+ = \{x : \bar\phi(x) = \bar\phi\} \cup \bigcup_{\alpha \ne \bar\phi} B_\alpha \cup \{x : \bar\phi(x) \text{ does not exist}\} \]
where the level set $B_\alpha \equiv \{x : \bar\phi(x) = \alpha\}$.

Our study of this decomposition is important for several reasons. One practical reason, which was pointed out by Michael Fisher, is related to calculations of such averages on a computer. While computing Birkhoff averages, one finds that not only does the set of points which have a different Birkhoff limit have measure zero, but points sufficiently close to these exceptional ones also tend to have a different Birkhoff limit. Therefore, one should understand the structure and distribution of the exceptional points when doing calculations and/or simulations.

Here we provide a complete description of this decomposition for an important class of ergodic invariant measures, namely for equilibrium measures


(Gibbs measures) for Hölder continuous potentials. If $\phi$ is a Hölder continuous potential (function), we denote by $\mu_\phi$ the unique equilibrium measure for $\phi$. We have included a short appendix which contains the definition and some important properties of equilibrium measures. See [8, 16, 10] for excellent expositions.

Equilibrium measures are dynamical systems analogs of Gibbs canonical ensembles in equilibrium statistical physics, and comprise a large class of physically and dynamically interesting measures. The Hölder continuous potential in our case is the analog of the system Hamiltonian in statistical physics. A (mixing) Markov measure is an equilibrium measure, the measure of maximal entropy is an equilibrium measure, and for smooth hyperbolic attractors the natural measure (known also as the SBR measure) is an equilibrium measure.

Our results on the distribution of Birkhoff averages are a rather straightforward consequence of our multifractal analysis (MFA) for equilibrium measures for some low-dimensional hyperbolic dynamical systems. We have effected an MFA for pointwise dimension and Lyapunov exponents, and here we link the Birkhoff average with the pointwise dimension and then quote results from previous work. The relevant references are [4, 13, 14, 17, 21], but we have tried to make this presentation almost self-contained.

We study the exceptional set by decomposing $\bigcup_\alpha B_\alpha$ into uncountably many disjoint sets where each element of the decomposition carries an auxiliary equilibrium measure, and exploit properties of these auxiliary equilibrium measures. Most of our tools are from symbolic dynamics and thermodynamic formalism.

To state our main theorem, we need to define an auxiliary function, the Birkhoff spectrum for $\phi$:
\[ b_\phi(\alpha) = \dim_H B_\alpha \]
where $\dim_H F$ denotes the Hausdorff dimension of the set $F$. Let us now state our main theorem. Let $\mu_{\max}$ denote the measure of maximal entropy (see the appendix).

Theorem 18.1. Let $\sigma : \Sigma_A^+ \to \Sigma_A^+$ be a topologically mixing one-sided subshift of finite type, $\phi \in C^\gamma(\Sigma_A^+, \mathbb{R})$ a Hölder continuous function, and $\mu_\phi$ the corresponding equilibrium measure.

(i) If $\mu_\phi \ne \mu_{\max}$, then the function $b_\phi(\alpha)$ is real analytic and strictly convex on an open interval $(a, b)$. It immediately follows that $\bar\phi$ attains an interval of values.
(ii) For $a \le \alpha \le b$, each of the level sets $B_\alpha$ is an uncountable dense subset of $\Sigma_A^+$. The values $a$ and $b$ are expressible via thermodynamic formalism (see the proof below).
(iii) The interval $[a, b]$ is maximal in the sense that $\bar\phi$ does not attain any value outside this interval.
(iv) If $\mu_\phi \ne \mu_{\max}$, the set of points for which $\bar\phi(x)$ does not exist has maximal Hausdorff dimension, i.e., the Hausdorff dimension equals the Hausdorff dimension of $\Sigma_A^+$.


Comments

(i) This theorem illustrates what is sometimes called the multifractal miracle: even though the decomposition of the phase space into level sets of $\bar\phi$ is intricate and extremely complicated, the function $b_\phi$ that encodes this decomposition is smooth and convex.
(ii) A priori, one may consider the box dimension instead of the Hausdorff dimension in this definition of $b_\phi$. However, since the level sets $B_\alpha$ are dense, it follows (from the well known property that the box dimension of a set coincides with the box dimension of its closure [11]) that the box dimensions of these level sets $B_\alpha$ are equal to the box dimension of $\Sigma_A^+$.

We now define the notion of local dimension and relate it to Birkhoff averages. For a probability measure $\mu$ on $\Sigma_A^+$, we define the pointwise dimension at $x$, denoted $d_\mu(x)$, by
\[ d_\mu(x) = \lim_{r\to 0} \frac{\log \mu(B(x, r))}{\log r} \]
provided the limit exists. Here $B(x, r)$ is the ball of radius $r$ around the point $x$ measured using the metric $\rho_\lambda$ (see the appendix). From the definition we see that if $d_\mu(x) = \alpha$ then for balls of sufficiently small radius $r$ the measure $\mu$ scales as $\mu(B(x, r)) \asymp r^\alpha$. This notion of a local dimension is essentially due to Billingsley.

The classical multifractal analysis is a description of the fine-scale geometry of the decomposition of $\Sigma_A^+$ whose constituent components are the level sets
\[ K_\alpha = \{x \in \Sigma_A^+ \mid d_\mu(x) = \alpha\} \]
for $\alpha \in \mathbb{R}$. The dimension spectrum of $\mu_\phi$ is defined by
\[ f_\mu(\alpha) = \dim_H K_\alpha \]
where $\dim_H K_\alpha$ denotes the Hausdorff dimension of the level set $K_\alpha$. The dimension spectrum is one of the main objects of study of the usual MFA. The following proposition relates the Birkhoff average to the pointwise dimension for a special equilibrium measure.

Proposition 18.2. Let $\sigma : \Sigma_A^+ \to \Sigma_A^+$ be a topologically mixing one-sided subshift of finite type, $\phi \in C^\gamma(\Sigma_A^+, \mathbb{R})$ a Hölder continuous function, and $\mu_\phi$ the corresponding equilibrium measure. Then
\[ \bar\phi(x) = P(\phi) - \log\lambda \cdot d_{\mu_\phi}(x) \]
where $P(\phi)$ denotes the thermodynamic pressure of $\phi$ (see the appendix).

Proof. It follows from the definition of equilibrium measure that
\[ \mu(C_n(x)) \asymp C \cdot \exp(S_n\phi(x) - nP(\phi)) \tag{18.1} \]
where $C_n(x)$ denotes the $n$-cylinder that contains the point $x$ and
\[ S_n\phi(x) = \sum_{k=0}^{n-1} \phi(\sigma^k x). \]
It immediately follows that
\[ \lim_{n\to\infty} \frac{1}{n} \log \mu(C_n(x)) = \bar\phi(x) - P(\phi). \]
Since cylinder sets are metric balls, we have that
\[ d_{\mu_\phi}(x) = \lim_{n\to\infty} \frac{\log \mu(C_n(x))}{\log |C_n(x)|} = \lim_{n\to\infty} \frac{\frac{1}{n}\log \mu(C_n(x))}{\frac{1}{n}\log |C_n(x)|} \]
where $|C_n(x)|$ denotes the diameter of the cylinder (ball) $C_n(x)$. By definition (of the metric) there exists $B > 0$ such that $|C_n(x)| = B\lambda^{-n}$, so that $\frac{1}{n}\log|C_n(x)| \to -\log\lambda$, and we immediately obtain that
\[ \bar\phi(x) = P(\phi) - \log\lambda \cdot d_{\mu_\phi}(x). \]

It is obvious from this proposition that results on the level sets of pointwise dimension $d_{\mu_\phi}(x)$ can be translated into results on the level sets for the Birkhoff average $\bar\phi(x)$. As an immediate consequence we obtain the following relationship between the functions $b_\phi(\alpha)$ and $f_{\mu_\phi}(\alpha)$.

Corollary 18.3.
\[ b_\phi(\alpha) = f_{\mu_\phi}\!\left(\frac{P(\phi) - \alpha}{\log\lambda}\right). \tag{18.2} \]

Remarks and ideas on the proof of theorem 18.1. Let $\phi \in C^\gamma(\Sigma_A^+, \mathbb{R})$ and let $\mu = \mu_\phi$ be the corresponding equilibrium measure. Define the one parameter family of functions $\varphi_q$, $q \in (-\infty, \infty)$ on $\Sigma_A^+$ by
\[ \varphi_q(x) = -T(q)\log\lambda + q\log\psi(x) \]
where $\log\psi = \phi - P(\phi)$ and $T(q)$ is chosen such that
\[ P(\varphi_q) = P(-T(q)\log\lambda + q\log\psi(x)) = 0. \tag{18.3} \]
One can show that $T(q)$ exists for every $q \in \mathbb{R}$. It is obvious that for all $q$ the functions $\varphi_q \in C^\gamma(\Sigma_A^+, \mathbb{R})$. The following results, used to prove parts (i) and (ii), can easily be extracted from the more sophisticated argument for hyperbolic sets contained in [13, 14].

Figure 18.1. Graph of $T(q)$. From [13].

The pointwise dimension $d_{\mu_\phi}(x)$ exists for $\mu_\phi$-almost every $x \in \Sigma_A^+$ and
\[ d_{\mu_\phi}(x) = -\frac{1}{\log\lambda} \int_{\Sigma_A^+} \log\psi \, d\mu_\phi. \tag{18.4} \]

The function $T(q)$ is real analytic for all $q \in \mathbb{R}$, $T(0) = \dim_H \Sigma_A^+$, $T(1) = 0$, $T'(q) \le 0$ and $T''(q) \ge 0$; see figure 18.1. This is a manifestation of analyticity of pressure (see the appendix) and the explicit formulas for its first and second (Fréchet) derivatives.

The function $\alpha(q) = -T'(q)$ attains values in the interval $[\alpha_1, \alpha_2]$, where $0 \le \alpha_1 \le \alpha_2 < \infty$.

The function $f_{\mu_\phi}(\alpha(q)) = T(q) + q\alpha(q)$; see figure 18.2. This statement uses results about the family $\{\mu_{\varphi_q}\}$ of equilibrium measures for $\{\varphi_q\}$. The key points in the proof are verifying that $\mu_{\varphi_q}(K_{\alpha(q)}) = 1$ and that the pointwise dimension $d_{\mu_{\varphi_q}}(x) = T(q) + q\alpha(q)$ for $x \in K_{\alpha(q)}$. Since equilibrium measures are positive on open sets, the first property implies that the level sets $K_\alpha$ and $B_\alpha$ are dense. It is not too hard to see that the interval of definition of $b_\phi$ is $[a, b]$ where $a = -\lim_{q\to\infty} \alpha(q)$ and $b = -\lim_{q\to-\infty} \alpha(q)$.

Finally, if $\mu_\phi \ne \mu_{\max}$ then the functions $f_{\mu_\phi}(\alpha)$ and $T(q)$ are strictly convex and form a Legendre transform pair (see the appendix). The strict convexity follows from convexity of pressure along with explicit derivative formulas.

The proof of part (iii) is due to Schmeling [17]. The key ingredient is Simpelaere's variational formula [19]. The proof of part (iv) is due to Barreira and Schmeling [4]. See [12] (and also [18]) for a precursor result. The idea is to insert the set of points which are typical with respect to a given equilibrium measure into the set of non-typical (non-generic) points by means of a Lipschitz continuous homeomorphism.

Figure 18.2. Graph of $f_{\mu_\phi}(\alpha)$. From [13].
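The function $T(q)$ can be made explicit in the simplest case. The sketch below assumes the full shift on two symbols with $\lambda = 2$ and the Bernoulli measure with weights $(p, 1-p)$, viewed as the equilibrium measure of the locally constant potential $\phi(x) = \log p_{x_1}$, for which $P(\phi) = 0$ and $\log\psi = \phi$. Under these assumptions equation (18.3) reduces to $2^{-T(q)}(p^q + (1-p)^q) = 1$. The function names and the choice $p = 0.3$ are illustrative only, and the normalisation by $n \log 2$ in the cylinder count is an assumption made for this example.

```python
import math
from itertools import product

def T_closed_form(q, p):
    """T(q) solving 2^{-T} * (p^q + (1-p)^q) = 1 (Bernoulli case, lambda = 2)."""
    return math.log2(p ** q + (1 - p) ** q)

def T_from_cylinders(q, p, n):
    """(1 / (n log 2)) * log of the sum of mu(C_n)^q over all 2^n cylinders."""
    total = 0.0
    for word in product((0, 1), repeat=n):
        ones = sum(word)
        total += (p ** ones * (1 - p) ** (n - ones)) ** q
    return math.log(total) / (n * math.log(2))

if __name__ == "__main__":
    p = 0.3
    for q in (-1.0, 0.0, 1.0, 2.0):
        print(q, T_closed_form(q, p), T_from_cylinders(q, p, 10))
```

In this example $T(0) = 1$, the Hausdorff dimension of the full 2-shift in the metric $\rho_2$, and $T(1) = 0$, in line with the properties listed above.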

18.2 Applications and connections to probability and number theory

What follows are applications of theorem 18.1, proposition 18.2 and corollary 18.3.

18.2.1 Borel's law of large numbers and Birkhoff averages of functions over Markov chains

The Birkhoff ergodic theorem is a generalization of Borel's strong law of large numbers in the special case when the sequence of random variables $\{\phi \circ \sigma^n\}$ is IID (independent and identically distributed). Thus, theorem 18.1 yields refined new information about this fundamental law of probability theory.

More generally, let $\{X_n\}$ be a (one-sided) irreducible Markov chain on a finite state space which has a stationary distribution $\pi$ (see [5]). Via a well known construction (see [15, p 6]) this Markov chain can be identified, via a coding map $\chi$, with an ergodic one-sided subshift of finite type equipped with a Markov measure $\mu_\pi$. Since Markov measures are equilibrium measures (for Hölder continuous potentials), we can apply theorem 18.1 to describe the decomposition of the probability space $(\Omega, \mathcal{F}, \pi)$ into level sets of the Birkhoff averages for a Hölder continuous function. More precisely, one can decompose the probability space
\[ \Omega = \tilde\Omega \cup \bigcup_\alpha \tilde B_\alpha \]
where $\tilde B_\alpha = \chi(B_\alpha)$ and $\tilde\Omega$ is the set of points for which the Birkhoff averages do not converge. Note that $\nu(\tilde\Omega) = 0$ for every ergodic stationary measure $\nu$ while the set $\chi^{-1}(\tilde\Omega) \subset \Sigma_A^+$ has full Hausdorff dimension.

18.2.2 Fine distribution in Borel's normal number theorem

Consider the base 2 expansion of a number $x \in [0, 1]$, i.e.,
\[ x = \sum_{k=0}^{\infty} \frac{a_k(x)}{2^{k+1}}, \]
where $a_k \in \{0, 1\}$. With only countably many exceptions this representation is unique. Borel proved that for (Lebesgue) almost every $x$ the limit
\[ \lim_{n\to\infty} \frac{1}{n} \sum_{k=0}^{n-1} a_k(x) = \frac{1}{2}. \]

This is an easy consequence of the Birkhoff ergodic theorem applied to the full one-sided shift on two symbols $\{0, 1\}$, the (1/2, 1/2) Bernoulli measure $\mu$, and the (Hölder continuous) characteristic function $I_k$ of the cylinder set $C_k = \{x \in [0, 1] : x = k x_2 x_3 \ldots\}$. Eggleston [6] and Besicovitch [2] (see also [1]) found the remarkable formula stated in theorem 18.4 below for the dimension of the set of points where the above limit is equal to $\alpha$ for $0 \le \alpha \le 1$. We provide an alternate proof of this result using our MFA of the equilibrium measure $\mu_{I_k}$ and corollary 18.3.

We note that the deep connection between dynamical systems and dimension theory seems to have been first discovered by Billingsley [1] while studying this formula. Billingsley interpreted it in terms of ergodic theory, and reproved it using ergodic theoretical tools.

Theorem 18.4. For $0 \le \alpha \le 1$, the Hausdorff dimension
\[ \dim_H \left\{ x \in [0, 1] : \lim_{n\to\infty} \frac{1}{n} \sum_{k=0}^{n-1} a_k(x) = \alpha \right\} = H(\alpha) \]
where
\[ H(\alpha) = \frac{-(1 - \alpha)\log(1 - \alpha) - \alpha\log\alpha}{\log 2}. \]


Proof. Consider the full shift $\sigma$ on the space $\Sigma_2^+$ endowed with the metric $\rho_2$ (see the appendix). To effect an MFA, one considers the one parameter family of (locally constant) potentials $\phi_q = -T(q)\log 2 + q(I_k - P(I_k))$, where $T(q)$ is chosen such that $P(\phi_q) = 0$. A simple calculation shows that $P(I_k) = \log(1 + e)$ and thus $\phi_q = -T(q)\log 2 + qI_k - q\log(1 + e)$. Since the functions $\phi_q$ depend only on the first coordinate, i.e., $\phi_q(\omega_1\omega_2\ldots) = \phi_q(\omega_1)$, it follows that $e^{\phi_q(0)} + e^{\phi_q(1)} = e^{P(\phi_q)}$. A simple calculation shows that
\[ T(q) = \frac{1}{\log 2} \log\frac{e^q + 1}{(1 + e)^q}. \]
Differentiating the expression for $T(q)$ yields
\[ u \equiv \alpha(q) = -T'(q) = -\frac{e^q - \log(1 + e) - e^q\log(1 + e)}{\log 2 + e^q\log 2} \]
and one immediately obtains $\alpha(\infty) = (-1 + \log(1 + e))/\log 2$ and $\alpha(-\infty) = \log(1 + e)/\log 2$. It follows that the dimension spectrum $f_{\mu_{I_k}}$ is defined on the interval $[(-1 + \log(1 + e))/\log 2,\ \log(1 + e)/\log 2]$ and by corollary 18.3 we see that the Birkhoff spectrum for $I_k$ is defined on the interval $[0, 1]$. We note that $\alpha$ is invertible with inverse
\[ q \equiv \alpha^{-1}(u) = \log\frac{-u\log 2 + \log(1 + e)}{1 + u\log 2 - \log(1 + e)}. \]
By the Legendre transform relation we know that $f_{\mu_{I_k}}(\alpha(q)) = T(q) + q\alpha(q)$. It follows that $f_{\mu_{I_k}}(u) = T(\alpha^{-1}(u)) + \alpha^{-1}(u)u$. The explicit expression for $f_{\mu_{I_k}}(u)$ is complicated, but through tedious algebraic manipulation, one obtains that $f_{\mu_{I_k}}(u) = H(-u\log 2 + \log(1 + e))$ where
\[ H(u) = \frac{-(1 - u)\log(1 - u) - u\log u}{\log 2}. \]
By (18.2), it immediately follows that
\[ b_{I_k}(\alpha) = f_{\mu_{I_k}}\!\left(\frac{\log(1 + e) - \alpha}{\log 2}\right) \]
and, thus,
\[ b_{I_k}(\alpha) = H(\alpha) = \frac{-(1 - \alpha)\log(1 - \alpha) - \alpha\log\alpha}{\log 2}. \]
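The "tedious algebraic manipulation" in the proof can be checked numerically. The following sketch evaluates $T(q)$ and $\alpha(q) = -T'(q)$ from the formulas of the proof, forms the Legendre expression $T(q) + q\alpha(q)$, and compares it with $H$ evaluated at the Birkhoff level $\log(1 + e) - \alpha(q)\log 2$ given by corollary 18.3. The function names are hypothetical; the check is a numerical illustration and not a substitute for the algebra.

```python
import math

LOG2 = math.log(2)
PRESSURE = math.log(1 + math.e)  # P(I_k) for the indicator potential

def T(q):
    """T(q) = log((e^q + 1) / (1 + e)^q) / log 2."""
    return math.log((math.exp(q) + 1) / (1 + math.e) ** q) / LOG2

def alpha(q):
    """alpha(q) = -T'(q), written in an equivalent simplified form."""
    return (PRESSURE - math.exp(q) / (1 + math.exp(q))) / LOG2

def H(s):
    """Entropy function of theorem 18.4, for 0 < s < 1."""
    return (-(1 - s) * math.log(1 - s) - s * math.log(s)) / LOG2

def spectrum_point(q):
    """Return (Birkhoff level, T(q) + q * alpha(q)) for the parameter q."""
    u = alpha(q)
    return PRESSURE - LOG2 * u, T(q) + q * u

if __name__ == "__main__":
    for q in (-2.0, -0.5, 0.0, 1.0, 3.0):
        level, dimension = spectrum_point(q)
        print(f"q={q:5.1f}  level={level:.6f}  T+q*alpha={dimension:.6f}  H(level)={H(level):.6f}")
```

The Birkhoff level equals $e^q/(1 + e^q)$, so it sweeps out $(0, 1)$ as $q$ ranges over $\mathbb{R}$, and $q = 0$ gives the level 1/2 with dimension 1.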


18.2.3 Connections with large deviation theory for Birkhoff sums

In this section we establish an intimate connection between the Birkhoff spectrum for $\phi$ and (global) large deviations of the Birkhoff sums $S_n\phi$. This is an important application of our multifractal analysis for pointwise dimension.

Consider a subshift of finite type $\sigma : \Sigma_A^+ \to \Sigma_A^+$. Let $\phi$ be a Hölder continuous function on $\Sigma_A^+$ and $\mu$ the corresponding Gibbs measure. Define $N_n$ to be the total number of cylinder sets $\{C_n^k\}$ at level $n$, and consider the family of random variables $X_k = \log\mu(C_n^k)$, where the integer $k$ is randomly chosen with uniform distribution from $1, \ldots, N_n$. The moment generating function of $X_k$ is
\[ c_n(q) = \mathbb{E}[\exp(qX_k)] = (1/N_n) \sum_{C_n^j} \mu(C_n^j)^q. \]
In [13, 14] we show that
\[ \lim_{n\to\infty} \frac{\log c_n(q)}{\log n} = R(0) - R(q) \]
where
\[ R(q) = \lim_{n\to\infty} \frac{\log \sum_{C_n^j} \mu(C_n^j)^q}{\log n}. \]
We also show that $R(q)$ coincides with $T(q)$ (defined by (18.3)), with $\lambda = 2$ and $\log\psi = \phi - P(\phi)$. We stress again that as a consequence of our multifractal analysis we obtain that $T(q)$, and thus $R(q)$, is smooth and convex. Combining these results with Ellis's large deviation theorem [7], we immediately obtain the following counting formula for the dimension spectrum.

Theorem 18.5. If $\mu \ne \mu_{\max}$, then
\[ f_\mu(\alpha) = \lim_{\varepsilon\to 0} \lim_{n\to\infty} \frac{\log J_n(\alpha, \varepsilon)}{\log n} \]
where $J_n(\alpha, \varepsilon)$ is the number of cylinder sets $C_n^j$ such that $\alpha - \varepsilon < \mu(C_n^j) \le \alpha + \varepsilon$.

For each $x \in \Sigma_A^+$ and $r > 0$, let $m_r(x)$ be the positive integer defined by
\[ e^{S_{m_r}\phi(x)} \ge r \quad\text{and}\quad e^{S_{m_r+1}\phi(x)} < r. \]

As an immediate consequence of theorem 18.5 and corollary 18.3, we obtain a formula for large deviations of the Birkhoff sums $S_n\phi$. In this explicit form, the formula first appeared in the recent thesis of Kesseböhmer [9].

Theorem 18.6. For $\alpha(q) = -T'(q)$, where $T(q)$ is defined by (18.3), one has that for $q > 0$:
\[ \lim_{r\to 0} \frac{\log \mu_{\max}\left\{ x : \dfrac{S_{m_r(x)}\phi(x)}{-\log r} \ge -\alpha(q) \right\}}{-\log r} = b_\phi\!\left(\frac{P(\phi) - \alpha(q)}{\log 2}\right) - \dim_H(\Sigma_A^+). \]


Appendix. Equilibrium measures and thermodynamic formalism

This appendix contains some essential definitions and facts from symbolic dynamics and thermodynamic formalism. For details consult [3, 8, 10, 16].

Let $\sigma : \Sigma_A^+ \to \Sigma_A^+$ be a topologically mixing one-sided subshift of finite type. The space $\Sigma_A^+$ has a natural family of metrics defined by
\[ \rho_\lambda(x, y) = \sum_{k=1}^{\infty} \frac{|x_k - y_k|}{\lambda^k}, \]
where $\lambda$ is any number satisfying $\lambda > 1$. Let us choose $\lambda = 2$. The set $\Sigma_A^+$ is compact with respect to the topology induced by $\rho_\lambda$ and the shift map $\sigma : \Sigma_A^+ \to \Sigma_A^+$ is a local homeomorphism.

(i) Let $\phi \in C(\Sigma_A^+, \mathbb{R})$. We define the pressure $P : C(\Sigma_A^+, \mathbb{R}) \to \mathbb{R}$ by
\[ P(\phi) = \lim_{n\to\infty} \frac{1}{n} \log \sum_{(i_1 \ldots i_n)\ \text{admissible}} \ \inf_{x \in C_{i_1 \ldots i_n}} \exp\left( \sum_{j=0}^{n-1} \phi(\sigma^j(x)) \right). \]
(ii) The pressure function $P : C^\gamma(\Sigma_A^+, \mathbb{R}) \to \mathbb{R}$ is real analytic. Let $\varphi \in C^\gamma(\Sigma_A^+, \mathbb{R})$. The map $\mathbb{R} \to \mathbb{R}$ defined by $t \mapsto P(t\varphi)$ is convex. It is strictly convex unless $\varphi$ is cohomologous to a constant ($\varphi \sim C$), i.e., there exist a constant $C$ and $g \in C^\gamma(\Sigma_A^+, \mathbb{R})$ such that $\varphi(x) = g(\sigma x) - g(x) + C$.
(iii) Let $\varphi \in C(\Sigma_A^+, \mathbb{R})$. A Borel probability measure $\mu = \mu_\varphi$ on $\Sigma_A^+$ is called an equilibrium measure for the potential $\varphi$ if there exist constants $D_1, D_2 > 0$ such that
\[ D_1 \le \frac{\mu\{y \mid y_i = x_i,\ i = 0, \ldots, n - 1\}}{\exp\left(-nP(\varphi) + \sum_{k=0}^{n-1} \varphi(\sigma^k x)\right)} \le D_2 \]
for all $x = (x_1 x_2 \ldots) \in \Sigma_A^+$ and $n \ge 0$.

For subshifts of finite type, equilibrium measures exist for any Hölder continuous potential $\varphi$, are unique, and coincide with the equilibrium measure for $\varphi$. Two equilibrium measures $\mu_\phi$ and $\mu_\psi$ coincide if and only if the potentials $\phi$ and $\psi$ are cohomologous. The measure of maximal entropy $\mu_{\max}$ is the equilibrium measure with constant potential.

Let $f$ be a $C^2$ strictly convex map on an interval $I$, hence, $f''(x) > 0$ for all $x \in I$. The Legendre transform of $f$ is the function $g$ of a new variable $p$ defined by
\[ g(p) = \max_{x \in I} (px - f(x)). \]
It is easy to show that $g$ is strictly convex and that the Legendre transform is involutive. One can also show that strictly convex functions $f$ and $g$ form a Legendre transform pair if and only if $g(\alpha) = f(q) + q\alpha$, where $\alpha(q) = -f'(q)$ and $q = g'(\alpha)$.
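The definition $g(p) = \max_{x \in I}(px - f(x))$ and the involutive property can be illustrated numerically. The sketch below assumes the example $f(x) = e^x$, whose transform is $g(p) = p\log p - p$ for $p > 0$, and replaces the maximum over $I$ by a maximum over a finite grid; the grids, the test function and the function name `legendre` are illustrative assumptions.

```python
import math

def legendre(values, grid, p):
    """max over the grid of p * x - f(x), where values[i] = f(grid[i])."""
    return max(p * x - fx for x, fx in zip(grid, values))

xs = [i * 0.005 for i in range(-1000, 1001)]     # x in [-5, 5]
ps = [i * 0.01 for i in range(1, 1001)]          # p in (0, 10]

f_vals = [math.exp(x) for x in xs]               # f(x) = exp(x)
g_vals = [legendre(f_vals, xs, p) for p in ps]   # g(p), close to p*log(p) - p
f_back = legendre(g_vals, ps, 0.5)               # transform of g, evaluated at x = 0.5

if __name__ == "__main__":
    print("g(2)      ~", g_vals[199], " exact:", 2 * math.log(2) - 2)
    print("g^*(0.5)  ~", f_back, " exact f(0.5):", math.exp(0.5))
```

Transforming twice returns the original function up to grid error, which is the involutive property quoted above.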


Acknowledgements

YaP was partially supported by the National Science Foundation grant #DMS9403723 and by the NATO grant #CRG970161. HW was partially supported by the National Science Foundation grant #DMS9704913. The manuscript was written while HW was visiting the IPST, University of Maryland, and he wishes to thank IPST for their gracious hospitality.

References
[1] Billingsley P 1978 Ergodic Theory and Information (Huntington: Krieger)
[2] Besicovitch A S 1934 Sets of fractional dimension IV: on rational approximation to irrational numbers J. London Math. Soc. 9 126–31
[3] Bowen R 1975 Equilibrium States and the Ergodic Theory of Anosov Diffeomorphisms SLN #470 (Berlin: Springer)
[4] Barreira L and Schmeling J 2000 Sets of 'non-typical' points have full topological entropy and full Hausdorff dimension Israel J. Math. 116 29–70
[5] Durrett R 1991 Probability: Theory and Examples (Pacific Grove: Wadsworth)
[6] Eggleston H G 1952 Sets of fractional dimension which occur in some problems of number theory Proc. London Math. Soc. 54 42–93
[7] Ellis R 1984 Large deviations for a general class of random vectors Ann. Prob. 12 1–12
[8] Keller G 1998 Equilibrium States in Ergodic Theory (Cambridge: Cambridge University Press)
[9] Kesseböhmer M 1999 Multifraktale und Asymptotiken grosser Deviationen PhD Thesis University of Göttingen
[10] Parry W and Pollicott M 1990 Zeta Functions and the Periodic Orbit Structure of Hyperbolic Dynamics (Astérisque 187–188) (Paris: Société Mathématique de France)
[11] Pesin Ya 1997 Dimension Theory in Dynamical Systems: Rigorous Results and Applications (Cambridge: Cambridge University Press)
[12] Pesin Ya and Pitskel B 1997 Topological pressure and the variational principle for non-compact sets Funct. Anal. Appl. 18(4) 50–63
[13] Pesin Ya and Weiss H 1997 A multifractal analysis of equilibrium measures for conformal expanding maps and Markov Moran geometric constructions J. Stat. Phys. 86 233–75
[14] Pesin Ya and Weiss H 1997 The multifractal analysis of equilibrium measures: motivation, mathematical foundation, and examples Chaos 7 89–106
[15] Petersen K 1983 Ergodic Theory (Cambridge: Cambridge University Press)
[16] Ruelle D 1978 Thermodynamic Formalism (New York: Addison-Wesley)
[17] Schmeling J 1999 On the completeness of multifractal spectra Ergod. Theor. Dynam. Syst. 19(6) 1595–616
[18] Shereshevsky M 1991 A complement to Young's theorem on measure dimension: the difference between lower and upper pointwise dimension Nonlinearity 4 15–25


[19] Simpelaere D 1994 Dimension spectrum of axiom A diffeomorphisms II. Equilibrium measures J. Stat. Phys. 76 1359–75
[20] Walters P 1982 Introduction to Ergodic Theory (Berlin: Springer)
[21] Weiss H 1999 The Lyapunov spectrum of equilibrium measures for conformal expanding maps and axiom-A surface diffeomorphisms J. Stat. Phys. 95(3–4) 615–32


Chapter 19
Existence of absolutely continuous invariant probability measures for multimodal maps

Henk Bruin, University of Groningen
Sebastian van Strien, University of Warwick

This paper is dedicated to Floris Takens.

Let $f : I \to I$ be a $C^2$ map with negative Schwarzian derivative and finitely many non-flat critical points. We generalize an existence proof of Nowicki and Van Strien [10] of invariant probability measures to multimodal maps. More precisely, we show that if either one of the following two summability conditions

$$\sum_{c\in\mathrm{Crit}}\ \sum_{n=1}^{\infty}\left(\frac{|f^n(c)-\tilde c|^{\ell(\tilde c)-\ell(c)}}{|Df^n(f(c))|}\right)^{1/\ell(c)} < \infty$$

(with $\tilde c$ the critical point closest to $f^n(c)$), or

$$\sum_{c\in\mathrm{Crit}}\ \sum_{n=1}^{\infty} |Df^n(f(c))|^{-1/\ell_{\max}} < \infty$$

holds then $f$ admits an absolutely continuous invariant probability measure (acip from now on) $\mu$ with $\mu \in L^{\tau}$ for any $\tau < \ell_{\max}/(\ell_{\max}-1)$. Here $\ell(c)$ is the order of the critical point $c$ and $\ell_{\max} = \max\{\ell(c),\ c \in \mathrm{Crit}\}$. In particular, the second condition holds for multimodal Collet–Eckmann maps and both conditions generalize the condition in [10] for unimodal maps. Sufficiently fast polynomial growth of $D_n(c) = |Df^n(f(c))|$ is already enough for acips to exist.
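To make the role of the growth rate in the second summability condition concrete, the following sketch evaluates partial sums of $D_n^{-1/\ell_{\max}}$ for model growth sequences. The sequences `polynomial_growth` and `exponential_growth` are hypothetical stand-ins for $D_n(c)$, not derivatives of an actual map; the point is only that $D_n = n^{\beta}$ gives a convergent series exactly when $\beta > \ell_{\max}$, while Collet–Eckmann growth $D_n = C\lambda^n$ always does.

```python
def partial_sum(growth, ell_max, n_terms):
    """Partial sum of growth(n)**(-1/ell_max) for n = 1..n_terms."""
    return sum(growth(n) ** (-1.0 / ell_max) for n in range(1, n_terms + 1))


def polynomial_growth(beta):
    """Model sequence D_n = n**beta (an assumption, not a real map)."""
    return lambda n: float(n) ** beta


def exponential_growth(C, lam):
    """Model Collet-Eckmann sequence D_n = C * lam**n."""
    return lambda n: C * lam ** n


if __name__ == "__main__":
    ell_max = 3
    for beta in (3.0, 6.0):
        for n_terms in (100, 10000):
            s = partial_sum(polynomial_growth(beta), ell_max, n_terms)
            print(f"beta={beta}, N={n_terms}: partial sum = {s:.4f}")
    # beta = 3 reproduces the harmonic series (divergent),
    # beta = 6 reproduces sum 1/n^2 (convergent).
    print(partial_sum(exponential_growth(1.0, 8.0), ell_max, 50))
```

With $\ell_{\max} = 3$ the exponent $\beta = 3$ is the borderline (harmonic) case, so the partial sums keep growing, whereas $\beta = 6$ stays below $\pi^2/6$.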


19.1 Setting of the problem

Let $f : [0,1] \to [0,1]$ be a $C^2$ map with negative Schwarzian derivative and a single critical point $c$ of order $\ell$. A well known result of [10] states that if

$$\sum_{n=1}^{\infty} |Df^n(f(c))|^{-1/\ell} < \infty \tag{19.1}$$

then $f$ has an acip, i.e. an invariant probability measure which is absolutely continuous with respect to Lebesgue measure. In this paper we want to extend the result to multimodal maps, i.e. maps where the critical set Crit consists of at least 2, but at most finitely many points. Of importance are the orders $\ell(c)$ of the critical points. If all these orders are the same (and finite), then the multimodal case turns out to bring no new difficulties. The condition is the direct analogue of (19.1) and somewhat weaker than the condition used in [4], where, however, decay of correlations is also addressed.

Different critical points with different critical orders bring some new phenomena. For example, the forward Collet–Eckmann condition, i.e. there exist $C > 0$, $\lambda > 1$ such that

$$|Df^n(f(c))| \ge C\lambda^n \tag{19.2}$$

no longer implies the backward Collet–Eckmann condition, i.e. there exist $C > 0$, $\lambda > 1$ such that

$$|Df^n(z)| \ge C\lambda^n \quad\text{whenever } f^n(z) \in \mathrm{Crit}. \tag{19.3}$$

Examples are given in [6] and more explicitly in [5]. If all critical points have the same order, then (19.2) implies (19.3); see [7]. Let us also remark that (19.2) is invariant under topological conjugacy within the class of S-unimodal maps; see [9]. For multimodal maps, this is not clear; see [11] and the remark in [9, p 35].

We have two approaches to deal with the effects of different critical orders. One consists of including in the summability condition a factor concerning slow recurrence. Slow recurrence was used in the Benedicks–Carleson approach [1] to Collet–Eckmann maps in the unimodal setting. For the other approach we assume that each critical point satisfies the summability condition with the maximal critical order $\ell_{\max} = \max\{\ell(c),\ c \in \mathrm{Crit}\}$.

We remark that there are maps with an acip that do not satisfy the summability condition; see [3]. The same will of course be true in the multimodal setting. If $\ell(c) = \infty$ for some $c \in \mathrm{Crit}$, then no acip needs to exist. This depends largely on the precise form of the flatness, as is shown by Benedicks and Misiurewicz [2], and in more generality by Thunberg [12]. It remains an open question whether the summability condition (19.1), or any multimodal analogue, is a topological invariant.
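As an illustration of the quantities $D_n(c) = |Df^n(f(c))|$ entering (19.1) and (19.2), here is a minimal sketch for one explicit bimodal map. The choice of map is an assumption made for this example only: the cubic Chebyshev polynomial $T_3(y) = 4y^3 - 3y$ on $[-1,1]$ (affinely conjugate to a map of $[0,1]$), which has two quadratic critical points $\pm 1/2$ whose images land on the repelling fixed points $\mp 1$. For this map $D_n(c) = 9^n$, so (19.2) holds with $\lambda = 9$ and the series $\sum_c\sum_n D_n(c)^{-1/2}$ equals $2\sum_n 3^{-n} = 1$.

```python
CRIT = (-0.5, 0.5)  # critical points of T3 on [-1, 1]
ELL = 2             # both critical points are quadratic


def f(y):
    """Cubic Chebyshev polynomial T3 (illustrative choice of bimodal map)."""
    return 4.0 * y ** 3 - 3.0 * y


def df(y):
    """Derivative of T3."""
    return 12.0 * y ** 2 - 3.0


def D(n, c):
    """D_n(c) = |Df^n(f(c))|, computed with the chain rule along the orbit of f(c)."""
    y, prod = f(c), 1.0
    for _ in range(n):
        prod *= abs(df(y))
        y = f(y)
    return prod


def summability_partial_sum(n_terms):
    """Partial sum over both critical points of D_n(c)**(-1/ELL)."""
    return sum(D(n, c) ** (-1.0 / ELL)
               for c in CRIT for n in range(1, n_terms + 1))


if __name__ == "__main__":
    for n in (1, 2, 3):
        print(n, D(n, 0.5), D(n, -0.5))
    print(summability_partial_sum(30))
```

Because the critical orbits are eventually fixed, the orbit values are exact in floating point and the computed $D_n$ agree with $9^n$ for moderate $n$.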


19.2 The multimodal summability condition

Let $f : [0,1] \to [0,1]$ be a $C^2$ mapping with negative Schwarzian derivative, i.e. $|f'|^{-1/2}$ is a convex function on each interval where $f' \ne 0$. We assume that $f$ has a finite set Crit of critical points, all of which are non-flat. This means that for each $c \in \mathrm{Crit}$ there exists $\ell = \ell(c) < \infty$ such that $\lim_{x\to c} |f(x)-f(c)|/|x-c|^{\ell}$ exists and is different from 0. For each critical point $c$, let $D_n(c) = |Df^n(f(c))|$, and

$$\delta_n(c) = \left(\frac{|f^n(c)-\tilde c|^{\ell(\tilde c)-\ell(c)}}{D_n(c)}\right)^{1/\ell(c)}$$

where $\tilde c$ is the critical point closest to $f^n(c)$. Obviously $\delta_n(c) = D_n(c)^{-1/\ell}$ when all critical points have the same order $\ell$. Let $m$ denote Lebesgue measure. We are interested in finding an acip with respect to $m$.

Theorem 19.1. Let $f$ be a multimodal map with negative Schwarzian derivative and finitely many non-flat critical points. If

(a) $\displaystyle\sum_{c\in\mathrm{Crit}}\ \sum_{n=1}^{\infty} \delta_n(c) < \infty$

or

(b) $\displaystyle\sum_{c\in\mathrm{Crit}}\ \sum_{n=1}^{\infty} D_n(c)^{-1/\ell_{\max}} < \infty$

then $f$ admits an acip $\mu$, and $\mu \in L^{\tau}$ for any $\tau < \ell_{\max}/(\ell_{\max}-1)$.

Remark 19.2. While there are multimodal Collet–Eckmann maps that do not satisfy (a) (cf [5]), every non-flat multimodal Collet–Eckmann map satisfies (b). Most likely, there are maps satisfying (a) but not (b). If all $\ell(c)$ are the same then both conditions are of course identical.

For the proof of theorem 19.1 we will follow the proof of [10] (see also [8, ch V 4]) in the sense that we give the main arguments and refer to [8] for some of the technical details. However, in order to prove that (b) is a sufficient condition we also need some results from [5]. In several places we need a version of Koebe distortion estimates, namely the Koebe Principle, the one-sided Koebe Principle and the expansion of the cross ratio. The exact formulation and proof can be found in [8, section IV.1].

We start with the two main claims. There exists $K_0 > 0$ such that for each measurable set $A$ and $n \ge 0$

$$m(f^{-n}(A)) \le K_0\, m(A)^{1/\ell_{\max}}, \tag{19.4}$$


where $\ell_{\max} = \max\{\ell(c);\ c \in \mathrm{Crit}\}$. The invariant measure $\mu$ is constructed as the weak limit of Cesàro means $\mu_n(A) = \frac{1}{n}\sum_{i=0}^{n-1} m(f^{-i}(A))$. Claim (19.4) implies that $\mu(A) \le K_0\, m(A)^{1/\ell_{\max}}$, so absolute continuity follows, and a fortiori, the Radon–Nikodým derivative $d\mu/dm$ is an $L^{\tau}(m)$-function for any $1 \le \tau < \ell_{\max}/(\ell_{\max}-1)$; see [8, p 378].

Let $B(c,\varepsilon)$ be the $\varepsilon$-neighbourhood of $c \in \mathrm{Crit}$. Then there exists $K_1 > 0$ such that

$$m(f^{-n}(B(c,\varepsilon))) \le K_1 \varepsilon, \tag{19.5}$$

for all $c \in \mathrm{Crit}$, $\varepsilon > 0$ and $n \ge 0$.

Proposition 19.3. The claim made in equation (19.5) implies the claim made in equation (19.4).

For the proof of proposition 19.3 we need some notation and a lemma. Let $E_n(c,\varepsilon) = f^{-n}(B(c,\varepsilon))$.

Lemma 19.4. There exists $K_2 > 0$ such that for any interval $I$ for which $f^n|I$ is monotonic, $m(f^n(I)) \le \varepsilon$ and one of the boundary points of $I$ is a critical point of $f^n$, we have

$$I \subset E_i\!\left(c,\ K_2\left(\frac{\varepsilon}{D_{n-i-1}(c)}\right)^{1/\ell(c)}\right) \tag{19.6}$$

for some $0 \le i \le n$ and $c \in \mathrm{Crit}$. (If $i = n$, then put $D_{n-i-1}(c) = 1$.)

Proof. Let $a \in \partial I$ be such that $f^n$ has a critical point at $a$. Then $f^n(a) = f^j(\tilde c)$ for some $\tilde c \in \mathrm{Crit}$ and $j \le n$. If $|f^j(\tilde c) - \mathrm{Crit}| < 3\varepsilon$ then (19.6) holds for the $c \in \mathrm{Crit}$ closest to $f^j(\tilde c)$, $K_2 = 4$ and $i = n$. Assume, therefore, that $|f^j(\tilde c) - \mathrm{Crit}| \ge 3\varepsilon$. Let $B$ be the component of $f^{-n}((f^j(\tilde c) - 3\varepsilon,\ f^j(\tilde c) + 3\varepsilon))$ containing $I$. Also let $s = \max\{m \le n;\ f^m(B) \cap \mathrm{Crit} \ne \emptyset\}$. Say $f^s(B) \ni c$. As $a \in \partial I$, $n - j \le s < n$. Let $C$ be the convex hull of $c$ and $f^s(I)$. By construction, $f^{n-s-1}$ maps a neighbourhood of $f^{s+1}(B)$ monotonically onto $(f^j(\tilde c) - 3\varepsilon,\ f^j(\tilde c) + 3\varepsilon)$. By the one-sided Koebe Principle

$$D_{n-s-1}(c) \le O(1)\frac{|f^{n-s}(C)|}{|f(C)|} \le O(1)\frac{\varepsilon}{|f(C)|}.$$

Hence non-flatness gives $I \subset E_{s}\big(c,\ (O(1)\varepsilon/D_{n-s-1}(c))^{1/\ell(c)}\big)$, which is (19.6) with $i = s$.

Proof of proposition 19.3. Let $A$ be any measurable set; we only need to consider the case when $\varepsilon := m(A) > 0$. Let $f^n : (a^-, a^+) = I \to f^n(I)$ be a branch of $f^n$, and let $I^{\pm} \subset I$ be the maximal intervals such that $a^{\pm} \in \partial I^{\pm}$ and $|f^n(I^{\pm})| \le \varepsilon$. By the minimum principle (see [8, p 154]), $m(f^{-n}(A) \cap I) \le m(I^- \cup I^+)$. Lemma 19.4 yields

$$I^{\pm} \subset E_{i^{\pm}}\!\left(c^{\pm},\ K_2\left(\frac{\varepsilon}{D_{n-i^{\pm}-1}(c^{\pm})}\right)^{1/\ell(c^{\pm})}\right)$$

where $c^{\pm} = f^{i^{\pm}}(a^{\pm})$. By claim (19.5), $|I^{\pm}| \le K_1 K_2\, (\varepsilon/D_{n-i^{\pm}-1}(c^{\pm}))^{1/\ell(c^{\pm})}$. Thus, summing over all branches $I$, we obtain

$$m(f^{-n}(A)) \le \sum_{c\in\mathrm{Crit}}\ \sum_{i=0}^{n-1} K_1 K_2 \left(\frac{\varepsilon}{D_{n-i-1}(c)}\right)^{1/\ell(c)}.$$
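The Cesàro means $\mu_n(A) = \frac{1}{n}\sum_{i<n} m(f^{-i}(A))$ used to construct $\mu$ can be approximated numerically, since $m(f^{-i}(A))$ is the proportion of Lebesgue-distributed points whose $i$-th iterate lies in $A$. The sketch below does this for the unimodal map $f(x) = 4x(1-x)$, chosen here only because its acip is known explicitly (density $1/(\pi\sqrt{x(1-x)})$); the grid size and the helper names are assumptions of this example, and the computation is a floating-point experiment rather than a proof.

```python
import math


def f(x):
    """Full logistic map on [0, 1] (illustrative choice with a known acip)."""
    return 4.0 * x * (1.0 - x)


def cesaro_measure(a, b, n, grid=10000):
    """Approximate mu_n([a, b]) = (1/n) * sum_{i<n} m(f^{-i}([a, b])).

    m(f^{-i}(A)) is estimated as the fraction of grid midpoints x
    with f^i(x) in A.
    """
    points = [(k + 0.5) / grid for k in range(grid)]
    total = 0.0
    for _ in range(n):
        total += sum(1 for x in points if a <= x <= b) / grid
        points = [f(x) for x in points]
    return total / n


def arcsine_measure(a, b):
    """Exact acip of 4x(1-x): mu([a, b]) with density 1/(pi*sqrt(x(1-x)))."""
    cdf = lambda x: (2.0 / math.pi) * math.asin(math.sqrt(x))
    return cdf(b) - cdf(a)


if __name__ == "__main__":
    print(cesaro_measure(0.0, 0.25, 100))
    print(arcsine_measure(0.0, 0.25))
```

The first few terms of the average still reflect Lebesgue measure rather than $\mu$, so the agreement improves only at rate $1/n$.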

19.3 Proof of theorem 19.1, part (a)

It suffices to prove claim (19.5). We will subdivide the components $I$ of $E_n(c,\varepsilon)$ into three classes. Given $\sigma > 2\varepsilon$, let $I \subset I' \subset I''$ be the components of $E_n(c,\varepsilon)$, $E_n(c,2\varepsilon)$ and $E_n(c,\sigma)$, respectively.

• $I \in R_n$, the regular case, if $f^n$ has no critical point in $I''$.
• $I \in S_n$, the sliding case, if $f^n$ has a critical point in $I''$ but not in $I'$.
• $I \in T_n$, the transport case, if $f^n$ has a critical point in $I'$.

We treat these three cases separately.

The regular case

Let $I \in R_n(c)$; then $f^n(I'')$ is an interval of length $2\sigma$ and it contains a 2-scaled neighbourhood of $f^n(I')$. By the Koebe Principle,

$$\frac{\sigma}{|I''|} \le \frac{|f^n(I'')|}{|I''|} \le O(1)\frac{|f^n(I)|}{|I|} = O(1)\frac{\varepsilon}{|I|},$$

so there exists a constant $K_R$ such that $|I| \le K_R\,\frac{\varepsilon}{\sigma}\,|I''|$. We choose $K_R$ so that this holds for all $c \in \mathrm{Crit}$. Since the intervals $I''$ corresponding to the $I \in R_n$ are pairwise disjoint,

$$\sum_{I\in R_n} |I| \le K_R\frac{\varepsilon}{\sigma}\sum_{c\in\mathrm{Crit}}\ \sum_{I\in R_n(c)} |I''| \le K_R\frac{\varepsilon}{\sigma}. \tag{19.7}$$

Having fixed the constant $K_R$ let us explain the inductive structure of the proof of claim (19.5). The induction hypothesis is

$$\text{(IH)}\qquad m(f^{-k}(B(c,\varepsilon))) \le \frac{3K_R}{\sigma}\,\varepsilon$$


Figure 19.1. Construction for the sliding case.

for all $\varepsilon > 0$, $c \in \mathrm{Crit}$ and $k < n$. This inequality is true for $n = 2$, because for $\sigma$ sufficiently small, there are only regular intervals $I$. Assuming (IH) for all $k < n$, we proceed to prove (IH) for $n$ for the sliding and transport case.

Let $\nu(\sigma) = \min\{k \ge 1;\ f^k(\tilde c) \in \cup_c B(c,\sigma) \text{ for some } \tilde c \in \mathrm{Crit}\}$. The summability condition implies that $f^n(\mathrm{Crit}) \cap \mathrm{Crit} = \emptyset$ for all $n \ge 1$ and, therefore, $\nu(\sigma) \to \infty$ as $\sigma \to 0$. (If $f$ is a Misiurewicz map, then $\nu(\sigma) = \infty$ for $\sigma$ sufficiently small. This case is harmless; it implies that all components $I$ belong to $R_n(c)$.)

The sliding case

Let $I \in S_n(c)$. Let $T \supset I$ be the maximal interval on which $f^n$ is monotonic. We construct a sequence $n = n_0 > n_1 > \dots > n_s \ge 0$ and intervals $T^s \supset \dots \supset T^1 \supset T^0 = T$ as follows. Let $T_0 = f^{n_0}(T^0) = f^n(T)$. Because $I \in S_n$, $T_0 \supset I_0 := f^n(I) = B(c,\varepsilon)$. Let $R_0$ and $A_0$ be the components of $T_0 \setminus I_0$ such that $|R_0| \ge |A_0|$. By non-flatness, $|R_0| \ge O(1)|I_0|$. Let us write $T^0 = (\alpha_0, \alpha_{-1})$ where $\alpha_0$ is such that $f^{n_0}(\alpha_0) \in \partial A_0$. (Throughout the proof $(a,b)$ is the interval with boundary points $a$ and $b$, also if $b < a$.) We continue inductively as follows; see figure 19.1. Given $T^i = (\alpha_i, \alpha_{i-1}) \supset I$ and $n_i > 0$, let

• $n_{i+1}$ be such that $f^{n_{i+1}}(\alpha_i) \in \mathrm{Crit}$,
• $k_i = n_i - n_{i+1}$,
• $T^{i+1}$ be the maximal interval of the form $T^{i+1} = (\alpha_i, \alpha_{i+1})$ such that $f^{n_{i+1}}$ is monotonic on $T^{i+1}$; so $T^{i+1} \supset T^i$ and they have the boundary point $\alpha_i$ in common,
• $T_{i+1} = f^{n_{i+1}}(T^{i+1})$ and $I_{i+1} = f^{n_{i+1}}(I)$,
• $R_{i+1}$ be the component of $T_{i+1} \setminus I_{i+1}$ with the point $c_{i+1} := f^{n_{i+1}}(\alpha_i) \in \mathrm{Crit}$ as boundary point. Let $A_{i+1}$ be the other component. (It follows that $f^{k_i}(R_{i+1}) = A_i$.)
• $L_{i+1}$ be the subinterval of $A_{i+1}$ adjacent to $I_{i+1}$ such that $f^{k_i}(L_{i+1}) = R_i$.

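The construction ends at $n_s$ when either $n_s = 0$ or when $|A_s| \ge |R_s|$. We write $\ell_i$ for the critical order of $c_i$.

Proposition 19.5. There exists $K_4 \ge 1$ such that

$$|I_s| \le |f^n(I)| \cdot \prod_{i=0}^{s-1} K_4\,\delta_{k_i}(c_{i+1})$$

where $c_{i+1} \in \mathrm{Crit}$ is the critical point in $\partial R_{i+1}$.

Proof. The expansion of cross ratios, applied to $I_{i+1}$ inside the interval $L_{i+1} \cup I_{i+1} \cup R_{i+1} \subset T_{i+1}$,

$$\frac{|f^{k_i}(L_{i+1})|}{|L_{i+1}|}\,\frac{|f^{k_i}(R_{i+1})|}{|R_{i+1}|} \le \frac{|f^{k_i}(I_{i+1})|}{|I_{i+1}|}\,\frac{|f^{k_i}(L_{i+1}\cup I_{i+1}\cup R_{i+1})|}{|L_{i+1}\cup I_{i+1}\cup R_{i+1}|}$$

in our case gives rise to

$$|I_{i+1}| \le |I_i|\,\frac{|A_i \cup I_i \cup R_i|}{|R_i|}\,\frac{|L_{i+1}|}{|L_{i+1}\cup I_{i+1}\cup R_{i+1}|}\,\frac{|R_{i+1}|}{|A_i|}.$$

Using this for $i = 1, \dots, s-1$ we obtain

$$|I_s| \le |I_1|\,\frac{|A_1\cup I_1\cup R_1|}{|A_1|} \times \prod_{i=1}^{s-1}\frac{|R_{i+1}|}{|R_i|} \times \prod_{i=2}^{s-1}\frac{|A_i\cup I_i\cup R_i|}{|L_i\cup I_i\cup R_i|}\,\frac{|L_i|}{|A_i|} \times \frac{|L_s|}{|L_s\cup I_s\cup R_s|}. \tag{19.8}$$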

The last factor is clearly less than 1, and the one but last factor is less than 1 because of the inequality $\frac{l(a+w)}{a(l+w)} \le 1$ for all $w > 0$ and $a > l > 0$. The factors $|R_{i+1}|/|R_i|$ can be estimated as

$$\begin{aligned}
\frac{|R_{i+1}|}{|R_i|} &\le O(1)\,\frac{|f(R_{i+1})|^{1/\ell_{i+1}}}{|R_i|} && \text{(non-flatness)}\\
&\le O(1)\,\frac{1}{|R_i|}\left(\frac{|f(A_i)|}{D_{k_i}(c_{i+1})}\right)^{1/\ell_{i+1}} && \text{(one-sided Koebe)}\\
&\le O(1)\,\frac{1}{|R_i|}\,\frac{|A_i|^{\ell_i/\ell_{i+1}}}{D_{k_i}(c_{i+1})^{1/\ell_{i+1}}} && \text{(non-flatness)}\\
&\le O(1)\,|R_i|^{(\ell_i/\ell_{i+1})-1}\,\frac{1}{D_{k_i}(c_{i+1})^{1/\ell_{i+1}}} && (|A_i| < |R_i|)\\
&\le O(1)\,\delta_{k_i}(c_{i+1}) && (|R_i| \approx |f^{k_i}(c_{i+1}) - c_i|).
\end{aligned} \tag{19.9}$$

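It remains to estimate

$$\begin{aligned}
|I_1|\,\frac{|A_1\cup I_1\cup R_1|}{|A_1|} &\le |I_1|\,\frac{|L_1\cup I_1\cup R_1|}{|L_1|}\\
&\le |I_0|\,\frac{|R_1|}{|A_0|}\,\frac{|A_0\cup I_0\cup R_0|}{|R_0|} && \text{(cross ratio)}\\
&\le O(1)\,|I_0|\,\frac{|R_1|}{|A_0|} && (|I_0| \le |A_0| \le |R_0|)\\
&\le O(1)\,|I_0|\,\frac{|f(R_1)|^{1/\ell_1}}{|A_0|} && \text{(non-flatness)}\\
&\le O(1)\,\frac{|I_0|}{|A_0|}\left(\frac{|A_0|}{D_{k_0-1}(c_1)}\right)^{1/\ell_1} && \text{(one-sided Koebe)}\\
&\le O(1)\,\frac{|I_0|}{|A_0|^{1-1/\ell_1}}\left(\frac{|f^{k_0}(c_1)-c_0|^{\ell_0-1}}{D_{k_0}(c_1)}\right)^{1/\ell_1} && \text{(non-flatness)}\\
&\le O(1)\,|I_0|\cdot\delta_{k_0}(c_1) && (|A_0| \approx |f^{k_0}(c_1)-c_0|).
\end{aligned}$$

Combining these estimates gives the statement of proposition 19.5.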
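The two ingredients used repeatedly in the proof above, the expansion of the cross ratio $B(T,J) = |J||T|/(|L||R|)$ under a map with negative Schwarzian derivative and the elementary inequality $l(a+w)/(a(l+w)) \le 1$, can be checked numerically on examples. The sketch below does so for the map $f(x) = 4x(1-x)$ restricted to its increasing branch $(0, 1/2)$; the map and the sample intervals are assumptions chosen for illustration, and the check is a sanity test, not a substitute for [8, section IV.1].

```python
def f(x):
    """Quadratic map with negative Schwarzian derivative (illustrative choice)."""
    return 4.0 * x * (1.0 - x)


def cross_ratio(a, b, c, d):
    """B(T, J) = |J||T| / (|L||R|) for T = (a, d), J = (b, c), a < b < c < d."""
    return (c - b) * (d - a) / ((b - a) * (d - c))


def expansion_factor(a, b, c, d):
    """B(f(T), f(J)) / B(T, J); assumes f is increasing on [a, d]."""
    return cross_ratio(f(a), f(b), f(c), f(d)) / cross_ratio(a, b, c, d)


def elementary_ratio(l, a, w):
    """The quantity l(a+w) / (a(l+w)), which is at most 1 when a > l > 0, w > 0."""
    return l * (a + w) / (a * (l + w))


if __name__ == "__main__":
    print(expansion_factor(0.1, 0.2, 0.3, 0.4))   # at least 1
    print(elementary_ratio(1.0, 2.0, 0.5))        # at most 1
```

For a quadratic map the expansion factor has the closed form $\frac{(u_b+u_c)(u_a+u_d)}{(u_a+u_b)(u_c+u_d)}$ with $u_x = \tfrac12 - x$, which is at least 1 for ordered points.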

The next step is to compare $I$ with a component of $E_{n_s}(c_s, \varepsilon')$ for an appropriate $\varepsilon'$. Let $J \subset T^s$ be an interval such that $|J| = |I|$ and $G := f^{n_s}(J) \subset I_s \cup R_s$ is adjacent to $c_s$. Since $|A_s| > |R_s|$, the one-sided Koebe Principle gives

$$\frac{|G|}{|J|} \le O(1)\,\frac{|f^{n_s}(I)|}{|I|}.$$

Because $|I| = |J|$, proposition 19.5 gives

$$|G| \le K_S\,|f^n(I)| \prod_{i=0}^{s-1} K_4\,\delta_{k_i}(c_{i+1})$$

for some $K_S > 0$. Therefore, $J$, which has the same length as $I$, is a subinterval of a component of the set $E_{n_s}\big(c_s,\ \varepsilon K_S \prod_{i=0}^{s-1} K_4\,\delta_{k_i}(c_{i+1})\big)$. For every $s$-tuple $(k_0, \dots, k_{s-1})$ there are at most $(2\#\mathrm{Crit})^s$ intervals $I$ such that $f^{n_s}(I)$ slides to the same interval $G$. Furthermore, all the intervals $T_i$ have size $\le \sigma$, so that $k_i \ge \nu(\sigma)$ for all $0 \le i < s$. Therefore,

$$\sum_{I\in S_n(c)} |I| \le \sum_{\tilde c\in\mathrm{Crit}}\ \sum_{\substack{k_j\ge\nu(\sigma)\\ \sum_j k_j = n-n_s\le n}} (2\#\mathrm{Crit})^s\,\left|E_{n_s}\!\left(\tilde c,\ \varepsilon K_S \prod_{i=0}^{s-1} K_4\,\delta_{k_i}(c_{i+1})\right)\right|. \tag{19.10}$$

The transport case

Suppose $I \in T_n(c)$. Then $f^n$ has a critical point in $I'$. Let $k < n$ be maximal such that $f^k(I')$ contains a critical point, say $\tilde c$, in its closure. Let us say that $I \in T_n^k(\tilde c, c)$ in this case. Clearly $f^{n-k-1}$ maps $f^{k+1}(I')$ diffeomorphically into $B(c, 2\varepsilon)$.

Proposition 19.6. There exists $K_T > 0$ such that

$$\sum_{I\in T_n(c)} |I| \le \sum_{k=0}^{n-1}\ \sum_{I\in T_n^k(\tilde c, c)} |I| \le \sum_{k=0}^{n-1}\ \sum_{\tilde c\in\mathrm{Crit}} \left|E_k\big(\tilde c,\ K_T\,\varepsilon\,\delta_{n-k}(\tilde c)\big)\right| \tag{19.11}$$

for all $c \in \mathrm{Crit}$ and $n \ge 1$.

Proof. Clearly $f(\tilde c) \in f^{k+1}(I') \subset [x, y]$ where $f^{n-k-1}$ maps $[x, y]$ monotonically onto $B(c, 2\varepsilon)$. Because $B(c, 2\varepsilon)$ contains a 2-scaled neighbourhood of $B(c,\varepsilon) \supset f^n(I)$, the Koebe principle assures that $|f^n(I)|/|f^{k+1}(I)| = O(1)\,D_{n-k-1}(\tilde c)$. Therefore, there exists $K_5$ such that

$$|f^k(I)| \le K_5\left(\frac{\varepsilon}{D_{n-k-1}(\tilde c)}\right)^{1/\ell(\tilde c)}.$$

Since $f^{n-k}(\tilde c) \in B(c, 2\varepsilon)$, the Chain Rule and non-flatness give $D_{n-k}(\tilde c) < O(\ell(c))\,D_{n-k-1}(\tilde c)\,\varepsilon^{\ell(c)-1}$. Therefore, there exists $K_T$ such that

$$|f^k(I)| \le O(1)\,\frac{\varepsilon^{\ell(c)/\ell(\tilde c)}}{D_{n-k}(\tilde c)^{1/\ell(\tilde c)}} \le |f^n(I)|\,K_T\,\delta_{n-k}(\tilde c).$$

Summing over all such $I$ gives the statement of proposition 19.6.

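Conclusion of the proof of theorem 19.1, part (a). Using [8, lemma 4.9] one can choose $\sigma$ so small (and therefore $\nu(\sigma)$ so large) that for any $n \ge 1$

$$\sum_{\tilde c\in\mathrm{Crit}}\ \sum_{\substack{k_j\ge\nu(\sigma)\\ \sum_j k_j = n-n_s\le n}} 3K_S \prod_{j=0}^{s-1} 2\#\mathrm{Crit}\,K_4\,\delta_{k_j}(\tilde c_j) \le 1 \tag{19.12}$$

and

$$\sum_{k\ge\nu(\sigma)}\ \sum_{\tilde c\in\mathrm{Crit}} 3K_T\,\delta_k(\tilde c) \le 1. \tag{19.13}$$

By (19.10) and (19.11), for any $c \in \mathrm{Crit}$ we have

$$\begin{aligned}
|E_n(c,\varepsilon)| &\le \sum_{I\in R_n(c)} |I| + \sum_{I\in S_n(c)} |I| + \sum_{I\in T_n(c)} |I|\\
&\le K_R\frac{\varepsilon}{\sigma} + \sum_{\tilde c\in\mathrm{Crit}}\ \sum_{\substack{k_j\ge\nu(\sigma)\\ \sum_j k_j = n-n_s\le n}} (2\#\mathrm{Crit})^s \times \left|E_{n_s}\!\left(\tilde c,\ K_S\,\varepsilon \prod_{j=0}^{s-1} K_4\,\delta_{k_j}(\tilde c_j)\right)\right|\\
&\quad + \sum_{k=\nu(\sigma)}^{n-1}\ \sum_{\tilde c\in\mathrm{Crit}} \left|E_k\big(\tilde c,\ K_T\,\varepsilon\,\delta_{n-k}(\tilde c)\big)\right|.
\end{aligned}$$

By the induction hypothesis (IH) this is smaller than

$$K_R\frac{\varepsilon}{\sigma} + \sum_{\tilde c\in\mathrm{Crit}}\ \sum_{\substack{k_j\ge\nu(\sigma)\\ \sum_j k_j = n-n_s\le n}} \frac{3K_R}{\sigma}\,\varepsilon K_S \prod_{j=0}^{s-1} 2\#\mathrm{Crit}\,K_4\,\delta_{k_j}(\tilde c_j) + \sum_{k=\nu(\sigma)}^{n-1}\ \sum_{\tilde c\in\mathrm{Crit}} 3K_R\,\frac{\varepsilon}{\sigma}\,K_T\,\delta_{n-k}(\tilde c)$$

which by formulas (19.12) and (19.13) is smaller than $3K_R\,\varepsilon/\sigma$. This proves the induction. Therefore, claim (19.5) holds for $K_1 = 3K_R/\sigma$.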
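The quantity $\nu(\sigma)$ that controls the choice of $\sigma$ in the argument above is the first time at which some critical orbit returns to the $\sigma$-neighbourhood of Crit. A minimal sketch of its computation follows; the function name `nu`, the cut-off `max_iter` and the example map (the cubic Chebyshev polynomial on $[-1,1]$, a Misiurewicz map with critical points $\pm 1/2$) are assumptions made for this illustration. Returning `None` stands for "no return found within `max_iter` iterates", which for the Chebyshev example reflects $\nu(\sigma) = \infty$ when $\sigma$ is small.

```python
def nu(f, crit, sigma, max_iter=1000):
    """First k >= 1 with f^k(c~) in B(c, sigma) for some c~, c in crit.

    Returns None if no such k <= max_iter exists (for a Misiurewicz map
    and small sigma this corresponds to nu(sigma) = infinity).
    """
    orbit = list(crit)
    for k in range(1, max_iter + 1):
        orbit = [f(y) for y in orbit]
        if any(abs(y - c) < sigma for y in orbit for c in crit):
            return k
    return None


def chebyshev3(y):
    """T3(y) = 4y^3 - 3y on [-1, 1]; critical points -1/2 and 1/2."""
    return 4.0 * y ** 3 - 3.0 * y


if __name__ == "__main__":
    crit = (-0.5, 0.5)
    print(nu(chebyshev3, crit, 0.4))  # critical orbits stay at distance 1/2
    print(nu(chebyshev3, crit, 0.6))
```

For a map whose critical orbits are recurrent, decreasing `sigma` makes the returned value grow, which is the mechanism that makes the sums in (19.12) and (19.13) small.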

19.4 Proof of theorem 19.1, part (b)

The proof is an adaptation of the proof of part (a) combined with a result from [5]. We indicate the differences. We will again prove claim (19.4), but we change claim (19.5) into the following. There exists $K_6$ such that

$$m(f^{-n}(B(c,\varepsilon))) \le K_6\,\varepsilon^{\ell(c)/\ell_{\max}} \tag{19.14}$$

for all $c \in \mathrm{Crit}$, $\varepsilon > 0$ and $n \ge 0$. The proof that claim (19.14) implies claim (19.4) is the same as that for proposition 19.3. The construction of $\mu \in L^{\tau}$ remains the same as well. To prove claim (19.14) we use the same division into cases, with the addition that the sliding case falls into three subclasses.

The regular case

Keep the same $K_R$ as in the proof of part (a). Obviously (19.7) gives

$$\sum_{I\in R_n} |I| \le K_R\frac{\varepsilon}{\sigma} \le \frac{K_R}{\sigma}\,\varepsilon^{\ell(c)/\ell_{\max}}.$$

We use the induction hypothesis

$$\text{(IH}'\text{)}\qquad m(f^{-k}(B(c,\varepsilon))) \le \frac{5K_R}{\sigma}\,\varepsilon^{\ell(c)/\ell_{\max}}.$$

The sliding case

Take $L > 10$ so large that $50L^{-1/\ell_{\max}} \le 1$. Keeping the construction, we divide the case $I \in S_n$ into

• $I \in S'_n$, if $\ell_0 \le \ell_s$ or $|A_0| \le L^2\varepsilon$,
• $I \in S''_n$, if $\ell_{\max} = \ell_0 > \ell_s$ and $|A_0| > L^2\varepsilon$,
• $I \in S'''_n$, if $\ell_{\max} > \ell_0 > \ell_s$ and $|A_0| > L^2\varepsilon$.

First we consider the case $I \in S'_n$ as follows.

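Proposition 19.7. There exists $K_7 \ge 1$ such that for every $I \in S'_n$

$$|I_s| \le \frac{|f^n(I)|^{\ell_0/\ell_s}}{\prod_{i=0}^{s-1} K_7\,D_{k_i}(c_{i+1})^{1/\ell_s}}$$

where $c_{i+1} \in \mathrm{Crit}$ is the critical point in $\partial R_{i+1}$.

Proof. The proof is the same as the proof of proposition 19.5, except that we need to estimate the first and second factor in (19.8) a bit differently. Indeed, now we estimate $|R_{i+1}| \le O(1)\,D_{k_i}(c_{i+1})^{-1/\ell_{i+1}}\,|R_i|^{\ell_i/\ell_{i+1}}$, which follows immediately from (19.9). Hence,

$$\begin{aligned}
|R_s| &\le |R_{s-1}|^{\ell_{s-1}/\ell_s}\,\frac{O(1)}{D_{k_{s-1}}(c_s)^{1/\ell_s}}\\
&\le |R_{s-2}|^{\ell_{s-2}/\ell_s}\,\frac{O(1)}{D_{k_{s-2}}(c_{s-1})^{1/\ell_s}}\,\frac{O(1)}{D_{k_{s-1}}(c_s)^{1/\ell_s}}\\
&\ \ \vdots\\
&\le |R_1|^{\ell_1/\ell_s}\prod_{i=1}^{s-1}\frac{1}{K_7\,D_{k_i}(c_{i+1})^{1/\ell_s}}.
\end{aligned}$$

Consequently,

$$\prod_{i=1}^{s-1}\frac{|R_{i+1}|}{|R_i|} = \frac{|R_s|}{|R_1|} \le |R_1|^{(\ell_1-\ell_s)/\ell_s}\prod_{i=1}^{s-1}\frac{1}{K_7\,D_{k_i}(c_{i+1})^{1/\ell_s}}.$$

For the remaining factors we find as before

$$\begin{aligned}
|I_1|\cdot|R_1|^{(\ell_1-\ell_s)/\ell_s}\cdot\frac{|A_1\cup I_1\cup R_1|}{|A_1|} &\le O(1)\,\frac{|I_0|}{|A_0|}\,|R_1|^{\ell_1/\ell_s} && \text{(cross ratio)}\\
&\le O(1)\,\frac{|I_0|}{|A_0|}\,|f(R_1)|^{1/\ell_s} && \text{(non-flatness)}\\
&\le O(1)\,\frac{|I_0|}{|A_0|}\left(\frac{|A_0|^{\ell_0}}{D_{k_0}(c_1)}\right)^{1/\ell_s} && \text{(one-sided Koebe and non-flatness)}\\
&\le O(1)\,|I_0|^{\ell_0/\ell_s}\,\frac{1}{D_{k_0}(c_1)^{1/\ell_s}}
\end{aligned}$$

where the last inequality holds because $\ell_0 \le \ell_s$ or $|A_0| \le L^2|I_0|$. This proves the statement of proposition 19.7.

The same sliding trick now gives a $K_S \ge 1$ such that $\sum_{I\in S'_n(c)} |I|$ is at most

$$\sum_{\tilde c\in\mathrm{Crit}}\ \sum_{\substack{k_j\ge\nu(\sigma)\\ \sum_j k_j = n-n_s\le n}} (2\#\mathrm{Crit})^s\,\left|E_{n_s}\!\left(\tilde c,\ \frac{K_S\,\varepsilon^{\ell(c)/\ell(\tilde c)}}{\prod_{i=0}^{s-1} K_7\,D_{k_i}(c_{i+1})^{1/\ell(\tilde c)}}\right)\right|.$$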

For the case $I \in S''_n$ we use theorem 1.2 and proposition 3.1 from [5]. By the Mean Value Theorem there exists $\xi \in I$ such that $|Df^n(\xi)| = \varepsilon/|I| \le \varepsilon^{\ell(c)/\ell_{\max}}/|I|$. In the notation of [5], $n$ is a critical time of type (AP) for $\xi$. Choose $\ell(\xi) = \ell_0 = \ell_{\max}$. Then there exists $K_8$, independent of $\xi$, such that

$$|Df^n(\xi)| \ge \prod_{i=0}^{s'-1} K_8\,D_{k_i}(c_{i+1})^{1/\ell_{\max}}$$

where $0 = n_{s'} < n_{s'-1} < \dots < n_0 = n$ are the critical times of $\xi$ in the construction of [5]. Furthermore, $k_i = n_i - n_{i+1}$ and as before $k_i \ge \nu(\sigma)$ for all $i$. Because there are at most $(2\#\mathrm{Crit})^{s'}$ intervals $I \in S''_n$ with the same critical times $n_{s'}, n_{s'-1}, \dots, n_0$, we find

$$\sum_{I\in S''_n(c)} |I| \le \sum_{\tilde c\in\mathrm{Crit}}\ \sum_{\substack{k_j\ge\nu(\sigma)\\ \sum_j k_j = n}} \varepsilon^{\ell(c)/\ell_{\max}} \prod_{i=0}^{s'-1}\frac{2\#\mathrm{Crit}}{K_8\,D_{k_i}(c_{i+1})^{1/\ell_{\max}}}.$$

The case $S'''_n$ will be dealt with in the conclusion of the proof.


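The transport case

Proposition 19.6 still ensures the existence of $K_T \ge 1$ such that

$$\sum_{I\in T_n(c)} |I| \le \sum_{k=0}^{n-1}\ \sum_{I\in T_n^k(\tilde c, c)} |I| \le \sum_{k=0}^{n-1}\ \sum_{\tilde c\in\mathrm{Crit}} \left|E_k\!\left(\tilde c,\ \frac{K_T\,\varepsilon^{\ell(c)/\ell(\tilde c)}}{D_{n-k}(\tilde c)^{1/\ell(\tilde c)}}\right)\right|$$

for all $c \in \mathrm{Crit}$ and $n \ge 1$.

Conclusion of the proof of theorem 19.1, part (b). Choose $\sigma$ so small that for any $n \ge 1$

$$\sum_{\tilde c\in\mathrm{Crit}}\ \sum_{\substack{k_j\ge\nu(\sigma)\\ \sum_j k_j = n-n_s\le n}} 5K_S \prod_{j=0}^{s-1}\frac{2\#\mathrm{Crit}}{K_7\,D_{k_j}(\tilde c_j)^{1/\ell_{\max}}} \le 1, \tag{19.15}$$

$$\sum_{\tilde c\in\mathrm{Crit}}\ \sum_{\substack{k_j\ge\nu(\sigma)\\ \sum_j k_j = n}}\ \prod_{j=0}^{s'-1}\frac{2\#\mathrm{Crit}}{K_8\,D_{k_j}(\tilde c_j)^{1/\ell_{\max}}} \le 1 \tag{19.16}$$

and

$$\sum_{k\ge\nu(\sigma)}\ \sum_{\tilde c\in\mathrm{Crit}} \frac{5K_T}{D_k(\tilde c)^{1/\ell_{\max}}} \le 1. \tag{19.17}$$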

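First take $\varepsilon \ge \sigma/L^2$; this means that the case $S'''_n$ does not occur. Then for any $c \in \mathrm{Crit}$

$$\begin{aligned}
|E_n(c,\varepsilon)| &\le \sum_{I\in R_n(c)} |I| + \sum_{I\in S'_n(c)} |I| + \sum_{I\in S''_n(c)} |I| + \sum_{I\in T_n(c)} |I|\\
&\le K_R\left(\frac{\varepsilon}{\sigma}\right)^{\ell(c)/\ell_{\max}} + \sum_{\tilde c\in\mathrm{Crit}}\ \sum_{\substack{k_j\ge\nu(\sigma)\\ \sum_j k_j = n-n_s\le n}} (2\#\mathrm{Crit})^s \times \left|E_{n_s}\!\left(\tilde c,\ \frac{K_S\,\varepsilon^{\ell(c)/\ell(\tilde c)}}{\prod_{j=0}^{s-1} K_7\,D_{k_j}(\tilde c_j)^{1/\ell(\tilde c)}}\right)\right|\\
&\quad + \varepsilon^{\ell(c)/\ell_{\max}} \sum_{\tilde c\in\mathrm{Crit}}\ \sum_{\substack{k_j\ge\nu(\sigma)\\ \sum_j k_j = n}}\ \prod_{j=0}^{s'-1}\frac{2\#\mathrm{Crit}}{K_8\,D_{k_j}(\tilde c_j)^{1/\ell_{\max}}}\\
&\quad + \sum_{k=\nu(\sigma)}^{n-1}\ \sum_{\tilde c\in\mathrm{Crit}} \left|E_k\!\left(\tilde c,\ \frac{K_T\,\varepsilon^{\ell(c)/\ell(\tilde c)}}{D_{n-k}(\tilde c)^{1/\ell(\tilde c)}}\right)\right|.
\end{aligned}$$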


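By the induction hypothesis (IH$'$) this is smaller than

$$\begin{aligned}
&\frac{K_R}{\sigma}\,\varepsilon^{\ell(c)/\ell_{\max}} + \sum_{\tilde c\in\mathrm{Crit}}\ \sum_{\substack{k_j\ge\nu(\sigma)\\ \sum_j k_j = n-n_s\le n}} \frac{4K_R}{\sigma}\,(\varepsilon K_S)^{\ell(c)/\ell_{\max}} \prod_{j=0}^{s-1}\frac{2\#\mathrm{Crit}}{K_7\,D_{k_j}(\tilde c_j)^{1/\ell_{\max}}}\\
&\quad + \varepsilon^{\ell(c)/\ell_{\max}} \sum_{\tilde c\in\mathrm{Crit}}\ \sum_{\substack{k_j\ge\nu(\sigma)\\ \sum_j k_j = n}}\ \prod_{j=0}^{s'-1}\frac{2\#\mathrm{Crit}}{K_8\,D_{k_j}(\tilde c_j)^{1/\ell_{\max}}}
+ \sum_{k=\nu(\sigma)}^{n-1}\ \sum_{\tilde c\in\mathrm{Crit}} \frac{4K_R}{\sigma}\,\frac{K_T\,\varepsilon^{\ell(c)/\ell_{\max}}}{D_{n-k}(\tilde c)^{1/\ell_{\max}}}
\end{aligned}$$

which by formulas (19.15), (19.16) and (19.17) is smaller than $\frac{4K_R}{\sigma}\,\varepsilon^{\ell(c)/\ell_{\max}}$. This proves (IH$'$) for $n$ and all $\varepsilon \ge \sigma/L^{M+1}$ for $M = 1$.

We continue the induction in $M$, so suppose that (IH$'$) holds for all $\varepsilon \ge \sigma/L^{M+1}$, and take $\sigma/L^{M+2} \le \varepsilon < \sigma/L^{M+1}$. The calculations are as before, except that now case $S'''_n$ can occur. Suppose, therefore, that $I_0 = f^n(I)$ for some $I \in S'''_n$. Note that $|A_0| < \sigma$, otherwise we would be in the regular case. Find $N \le M$ such that

$$\frac{|A_0|}{L^{N+2}} \le \varepsilon = |I_0| < \frac{|A_0|}{L^{N+1}}.$$

Let $\tilde I \supset I$ be such that $\tilde I_0 := f^n(\tilde I)$ is an interval of length $L^N\varepsilon$ centred around $I_0$. As $L > 10$, the Koebe Principle gives $|I|/|\tilde I| \le 10\,|I_0|/|\tilde I_0|$. Let $S'''_{n,N}$ be the set of all intervals $I \in S'''_n$ with this value $N$. Then

$$\sum_{I\in S'''_{n,N}} |I| \le 10L^{-N} \sum_{I\in S'''_{n,N}} |\tilde I| \le 10L^{-N}\, m\big(f^{-n}(B(c, L^N\varepsilon))\big).$$

By (IH$'$) applied to $L^N\varepsilon$ and since $\ell_0 = \ell(c) \le \ell_{\max} - 1$, this is smaller than

$$10L^{-N}\,\frac{5K_R}{\sigma}\,(L^N\varepsilon)^{\ell(c)/\ell_{\max}} \le \frac{50K_R}{\sigma}\,L^{-N/\ell_{\max}}\,\varepsilon^{\ell(c)/\ell_{\max}}.$$

Summing over all $N \ge 1$ we find

$$\sum_{I\in S'''_n} |I| \le \sum_{N\ge 1} \frac{50K_R}{\sigma}\,L^{-N/\ell_{\max}}\,\varepsilon^{\ell(c)/\ell_{\max}} \le \frac{K_R}{\sigma}\,\varepsilon^{\ell(c)/\ell_{\max}}$$

by the choice of $L$. Thus if we add the term $\sum_{I\in S'''_n} |I|$ in the earlier calculations, we still find $|E_n(c,\varepsilon)| \le \frac{5K_R}{\sigma}\,\varepsilon^{\ell(c)/\ell_{\max}}$. This proves the induction. Therefore, claim (19.14) holds for $K_6 = 5K_R/\sigma$.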
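The size of the constant $L$ in the sliding case can be made explicit. The sketch below, with hypothetical helper names, computes the smallest integer $L$ for which the single-term condition $50L^{-1/\ell_{\max}} \le 1$ holds and, for comparison, evaluates the full geometric series $50\sum_{N\ge 1} L^{-N/\ell_{\max}}$ that arises when summing over $N$. It is only an arithmetic illustration under the assumption that $\ell_{\max}$ is an integer.

```python
def smallest_L_single_term(ell_max):
    """Smallest integer L > 10 with 50 * L**(-1/ell_max) <= 1 (integer ell_max)."""
    return max(11, 50 ** ell_max)


def geometric_series(L, ell_max):
    """Closed form of 50 * sum_{N>=1} L**(-N/ell_max)."""
    r = L ** (-1.0 / ell_max)
    return 50.0 * r / (1.0 - r)


if __name__ == "__main__":
    for ell_max in (2, 3, 4):
        L = smallest_L_single_term(ell_max)
        print(ell_max, L, geometric_series(L, ell_max),
              geometric_series(51 ** ell_max, ell_max))
```

For $\ell_{\max} = 2$ the single-term condition first holds at $L = 2500$, where the series equals $50/49$; from $L = 51^2 = 2601$ on the series itself is at most 1, so a marginally larger $L$ covers both readings of the condition.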


References
[1] Benedicks M and Carleson L 1985 On iterations of $x \mapsto 1 - ax^2$ on $(-1, 1)$ Ann. Math. 122 1–25
[2] Benedicks M and Misiurewicz M 1989 Absolutely continuous invariant measures for maps with flat tops Publ. Math. IHES 69 203–13
[3] Bruin H 1994 Topological conditions for the existence of invariant measures for unimodal maps Ergod. Theor. Dynam. Syst. 14 433–51
[4] Bruin H, Luzzatto S and Van Strien S 1999 Decay of correlations in one-dimensional dynamics Preprint University of Warwick
[5] Bruin H and Van Strien S 2000 Expansion of derivatives in one-dimensional dynamics Preprint University of Warwick
[6] Carleson L, Jones P and Yoccoz J-C 1994 Julia and John Bol. Soc. Bras. Mat. 25 1–30
[7] Graczyk J and Smirnov S 1998 Collet, Eckmann and Hölder Inv. Math. 133 69–96
[8] de Melo W and Van Strien S 1993 One-Dimensional Dynamics (Berlin: Springer)
[9] Nowicki T and Przytycki F 1998 Topological invariance of the Collet–Eckmann property for S-unimodal maps Fund. Math. 155 33–43
[10] Nowicki T and Van Strien S 1991 Absolutely continuous measures under a summability condition Inv. Math. 105 123–36
[11] Przytycki F and Rohde S 1999 Rigidity of holomorphic Collet–Eckmann repellers Ark. Mat. 37 357–71
[12] Thunberg H 1999 Positive exponent in families with flat critical point Ergod. Theor. Dynam. Syst. 19 767–807


Chapter 20
On the dynamics of the renormalization operator

Artur Avila and Welington de Melo, IMPA
Marco Martens, IBM T J Watson Research Center

We consider the renormalization operator $R$ acting on an open set in the space of $C^k$ unimodal interval maps ($k \ge 3$) of bounded combinatorial type. Each map $f$ in the domain $D$ of the operator has a periodic interval around the critical point whose period $q \ge 2$ is at most a given integer $N \ge 3$ and $R(f)$ is affinely conjugate to the restriction of $f^q$ to this interval. A map $f$ is infinitely renormalizable of combinatorial type bounded by $N$ if all iterates $R^n(f)$ belong to $D$. Sullivan proved in [5], see also [3], the existence of a compact invariant subset of $D$ such that the restriction of $R$ to this subset is a homeomorphism which is topologically conjugate to a full shift on a finite number of symbols. Furthermore, each map $g$ in this invariant set is real analytic with a holomorphic extension which is quadratic-like in the sense of Douady–Hubbard. He also proved that if $f$ is infinitely renormalizable with combinatorial type bounded by $N$ then there exists $g$ in the invariant set such that the $C^0$-distance between $R^n(f)$ and $R^n(g)$ converges to zero. Using the fundamental result of Lyubich [1] on the hyperbolicity of the renormalization operator in the space of germs of quadratic-like maps, it was proved in [4] that in fact the $C^0$-distance between $R^n(f)$ and $R^n(g)$ converges to zero exponentially fast. Here we complete the description of the dynamics of the renormalization operator by proving the exponential convergence of the $C^k$-distances of these iterates.


20.1 Formulation of the main result

To give a more precise formulation of our result we include for completeness the usual notions of unimodal renormalization. Fix $\alpha > 1$, the critical exponent, and $k \ge 3$, indicating the smoothness class. The standard folding family $q_t : [-1,1] \to [-1,1]$, $t \in [0,1]$, is

$$q_t(x) = -2t|x|^{\alpha} + 2t - 1.$$

Let $\mathrm{Diff}^k([-1,1])$ be the set of $C^k$ orientation preserving diffeomorphisms of the interval $[-1,1]$. It is an open subset of a codimension-two affine subspace of the Banach space $C^k([-1,1], \mathbb{R})$ of the $C^k$ mappings endowed with the $C^k$ norm

$$|f|_k = \sup_x \{|f(x)|, |Df(x)|, \dots, |D^k f(x)|\}.$$

The class of unimodal maps we consider here is $U = \mathrm{Diff}^k([-1,1]) \times [0,1]$ where an element $(\phi, t) \in U$ should be interpreted as the $C^k$ unimodal map $f = \phi \circ q_t : [-1,1] \to [-1,1]$ with critical exponent $\alpha > 1$. The diffeomorphism $\phi$ is called the diffeomorphic part of the unimodal map $f$. The metric on $U$ is the product of the metric of the interval with the $C^k$-distance on $\mathrm{Diff}^k([-1,1])$.

A unimodal map $f \in U$ is called renormalizable iff there exists an expanding periodic point $p \in (-1,1)$ such that the first return map to the central interval $C = [-p, p]$ is of the form $f^q : C \to C$ with $f^q(p) = p$ and $q \ge 2$. Up to rescaling, the first return map to $C$ will be a unimodal map. This unimodal map is a renormalization of $f$. Observe that a renormalization is completely determined by the periodic point $p$.

The combinatorial aspects of a renormalization are described by unimodal permutations. A permutation on a finite ordered set is a unimodal permutation if the following holds. Embed the set monotonically into the real line. Draw the graph of the permutation. If this graph can be extended to the graph of a unimodal map then the permutation is called unimodal.

A collection $\mathcal{I} = \{I_1, I_2, \dots, I_{q-1}, I_q\}$ of oriented closed intervals in $[-1,1]$ is called a cycle for the unimodal map $f$ if it has the following properties:

• there is an expanding periodic point $p \in (-1,1)$ with $I_q = [-|p|, |p|]$,
• $f : I_i \to I_{i+1}$, $i = 1, 2, \dots, q-1$, is monotone onto,
• $f(I_q) \subset I_1$ with $f(p) \in \partial I_1$, the boundary of $I_1$,
• the interiors of $I_1, I_2, \dots, I_q$ are pairwise disjoint,
• $\mathcal{I}$ inherits an order from $[-1,1]$, the map $\sigma(\mathcal{I}) : I_i \mapsto I_{i+1}$ on $\mathcal{I}$ is a unimodal permutation,
• the orientation $o_{\mathcal{I}} : \mathcal{I} \to \{-1, 1\}$ is such that $o_{\mathcal{I}}(I_i) = 1$ when $f^i(p)$ is the left boundary point of $I_i$ and $o_{\mathcal{I}}(I_i) = -1$ otherwise.

Observe the following:

• a unimodal map is renormalizable iff it has a cycle,
• the last three properties defining a cycle follow automatically once a unimodal map has a periodic point with the first three properties,
• the orientation $o_{\mathcal{I}}$ depends only on $\sigma$; we will use the notation $o_{\sigma}$.
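The definition of a unimodal permutation given above can be turned into a simple test. The sketch below assumes, as for the folding family $q_t$, that a unimodal map increases up to its turning point and decreases afterwards; under that assumption the graph of a permutation of $\{1, \dots, q\}$ extends to a unimodal map exactly when its values rise to the maximum $q$ and fall afterwards. The function name and the encoding of a permutation as a Python list are choices made for this example.

```python
def is_unimodal_permutation(perm):
    """perm[i] is the image (in 1..q) of the (i+1)-th point of the ordered set.

    Returns True when the values strictly increase up to the maximum q and
    strictly decrease afterwards (turning point assumed to be a maximum).
    """
    q = len(perm)
    if sorted(perm) != list(range(1, q + 1)):
        return False  # not a permutation of 1..q
    peak = perm.index(q)
    rising = all(perm[i] < perm[i + 1] for i in range(peak))
    falling = all(perm[i] > perm[i + 1] for i in range(peak, q - 1))
    return rising and falling


if __name__ == "__main__":
    print(is_unimodal_permutation([2, 1]))     # period doubling
    print(is_unimodal_permutation([2, 3, 1]))  # a period-three cycle
    print(is_unimodal_permutation([3, 1, 2]))  # falls, then rises
```

A monotone list such as `[1, 2, 3]` passes the test as well, since its graph extends to a unimodal map whose turning point lies beyond the last point.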

Let $\sigma$ be a unimodal permutation and

$$U_{\sigma} = \{f \in U \mid f \text{ has a cycle } \mathcal{I} \text{ with } \sigma(\mathcal{I}) = \sigma\}.$$

The unimodal maps in $U_{\sigma}$ are sometimes called $\sigma$-renormalizable to emphasize the type of renormalization under consideration. The renormalization operator $R_{\sigma} : U_{\sigma} \to U$ is defined to assign to each unimodal map in $U_{\sigma}$ the affinely rescaled first return map to the smallest central interval giving rise to a cycle $\mathcal{I}$ with $\sigma(\mathcal{I}) = \sigma$. These sets of renormalizable maps $U_{\sigma}$ are non-empty; every family $t \mapsto \phi \circ q_t \in U$ contains points in each $U_{\sigma}$. Often a unimodal map has different cycles; the sets $U_{\sigma}$ are not disjoint. They are nested, however. For each $\sigma$ there exists a unique maximal factorization $\sigma = \langle \sigma_n, \dots, \sigma_2, \sigma_1 \rangle$ such that

$$R_{\sigma} = R_{\sigma_n} \circ \dots \circ R_{\sigma_2} \circ R_{\sigma_1}.$$

A unimodal permutation $\sigma$ is called prime iff $\sigma = \langle \sigma \rangle$. Clearly each permutation in the maximal factorization is prime. Using the prime unimodal permutations we obtain a partition of the set of renormalizable unimodal maps and the renormalization operator becomes

$$R : \{\text{renormalizable maps}\} = \bigcup_{\text{prime } \sigma} U_{\sigma} \to U$$

with $R|U_{\sigma} = R_{\sigma}$. A unimodal map $f \in U$ is infinitely renormalizable iff $R^n f$ is defined for all $n \ge 1$. In particular, for each infinitely renormalizable map $f$ there is a unique sequence $\sigma_1, \sigma_2, \sigma_3, \dots$ of prime unimodal permutations such that

$$R^n f = R_{\sigma_n} \circ \dots \circ R_{\sigma_2} \circ R_{\sigma_1}(f) \quad\text{for all } n \ge 1.$$

Fix a finite set of prime unimodal permutations. Two infinitely renormalizable unimodal maps are said to be of the same bounded type if they have the same sequence of prime unimodal permutations and all these permutations are in the given finite set of permutations.

Theorem 20.1. Let $\alpha = 2$. If $f, g \in U$ are infinitely renormalizable maps of the same bounded type then there exist $C > 0$ and $\lambda < 1$ such that

$$|R^n f - R^n g|_k \le C\lambda^n.$$

The proof of theorem 20.1 is given in the following sections. The main ingredient, the notion of decompositions, is introduced in section 20.2. The technical lemmas are collected in the appendix and the actual proof is given in section 20.3.
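As a concrete companion to the definitions of section 20.1, the following sketch implements the standard folding family $q_t(x) = -2t|x|^{\alpha} + 2t - 1$ and a unimodal map $f = \phi \circ q_t$. The particular diffeomorphism `phi` (namely $x + 0.1(1 - x^2)$, which fixes $\pm 1$ and has positive derivative on $[-1,1]$) is a hypothetical choice made only for illustration.

```python
def q(t, x, alpha=2.0):
    """Standard folding map q_t(x) = -2t|x|^alpha + 2t - 1 on [-1, 1]."""
    return -2.0 * t * abs(x) ** alpha + 2.0 * t - 1.0


def phi(x):
    """Hypothetical orientation preserving diffeomorphism of [-1, 1]."""
    return x + 0.1 * (1.0 - x * x)


def unimodal(t, x, alpha=2.0):
    """The unimodal map f = phi o q_t associated with the pair (phi, t)."""
    return phi(q(t, x, alpha))


if __name__ == "__main__":
    t = 0.9
    print(q(t, 0.0))                 # critical value 2t - 1
    print(q(t, 1.0), q(t, -1.0))     # both endpoints map to -1
    print(unimodal(t, 0.0))
```

The parameter $t$ only moves the critical value $q_t(0) = 2t - 1$, while the endpoints $\pm 1$ are always sent to $-1$, so each $q_t$ maps $[-1,1]$ into itself.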

20.2 Decompositions

Applying the renormalization operator repeatedly will produce a sequence of unimodal maps. Each of the unimodal maps of this sequence has a diffeomorphic part and a standard folding part. The renormalization process causes these diffeomorphic parts not to be arbitrary. The main idea presented in [2] is the method of decomposing the diffeomorphic parts in a systematic manner. The decomposition of renormalizations shows naturally two parts: an analytic part and a $C^k$ part. This separation underlies the proof of our main result, theorem 20.1.

Fix an infinitely renormalizable map $f = (\phi, q_t) \in U$ and let $\mathcal{I}^n = \{I_1^n, I_2^n, \dots, I_{q_n}^n\}$ be the cycle corresponding to the $n$th renormalization, $n \ge 1$. Each cycle will be partitioned in sets

$$\mathcal{I}^n = \bigcup_{k\ge 0} L_k^n.$$

The definition of these level sets is by induction. Let

$$\mathcal{I}^0 = L_0^0 = \{[-1, 1]\}.$$

If $\mathcal{I}^{n+1} \ni I_i^{n+1} \subset I_j^n \in L_k^n$ and $0 \notin I_i^{n+1}$ then $I_i^{n+1} \in L_{k+1}^{n+1}$. If $0 \in I_i^{n+1}$ then $I_i^{n+1} \in L_0^{n+1}$. Observe that

$$\mathcal{I}^n = \bigcup_{k=0}^{n} L_k^n.$$

Let $I \subset [-1,1]$ be an oriented interval and consider the zoom operator

$$Z_I : \mathrm{Diff}^k([-1,1]) \to \mathrm{Diff}^k([-1,1])$$

which assigns to each diffeomorphism $\phi$ the affinely rescaled version of the orientation preserving restriction $\phi : I \to \phi(I)$. The intervals $I$ and $\phi(I)$ are oriented the same.
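A minimal sketch of the zoom operator may help fix ideas: $Z_I(\psi)$ precomposes $\psi$ with the affine map $[-1,1] \to I$ and postcomposes with the affine map $\psi(I) \to [-1,1]$. The code below assumes $I = [a, b]$ with its standard orientation, uses the hypothetical diffeomorphism $\phi(x) = x + 0.1(1 - x^2)$, and measures only the $C^0$-distance of $Z_I(\phi)$ to the identity on a sample grid. For this $\phi$ one can compute $Z_I(\phi)(x) - x = 0.05\,(1 - x^2)\,|I|/(1 - 0.1(a+b))$, so the distance shrinks linearly in $|I|$, in line with the kind of estimate stated in lemma 20.2 below (which concerns the stronger $C^k$-norm).

```python
def phi(x):
    """Hypothetical orientation preserving diffeomorphism of [-1, 1]."""
    return x + 0.1 * (1.0 - x * x)


def zoom(psi, a, b):
    """Z_I(psi) for I = [a, b] inside [-1, 1], standard orientation assumed."""
    pa, pb = psi(a), psi(b)

    def zoomed(x):
        u = a + (x + 1.0) * (b - a) / 2.0              # affine map [-1, 1] -> I
        return -1.0 + 2.0 * (psi(u) - pa) / (pb - pa)  # affine map psi(I) -> [-1, 1]

    return zoomed


def c0_distance_to_identity(g, samples=2001):
    """Maximum of |g(x) - x| over an evenly spaced grid in [-1, 1]."""
    xs = [-1.0 + 2.0 * k / (samples - 1) for k in range(samples)]
    return max(abs(g(x) - x) for x in xs)


if __name__ == "__main__":
    for (a, b) in [(-1.0, 1.0), (0.0, 0.5), (0.2, 0.3), (0.2, 0.21)]:
        d = c0_distance_to_identity(zoom(phi, a, b))
        print(f"|I| = {b - a:.2f}: C0 distance = {d:.6f}")
```

Reversing the orientation of $I$ would amount to conjugating by $x \mapsto -x$; that case is left out of the sketch.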

For each $I_i^n \in \mathcal{I}^n$, $i \ne 0$, define the orientation preserving diffeomorphisms $q_i^n : [-1, 1] \to [-1, 1]$ and $\varphi_i^n : [-1, 1] \to [-1, 1]$ by
$$q_i^n = Z_{I_i^n}(q_t) \quad\text{and}\quad \varphi_i^n = Z_{q_t(I_i^n)}(\varphi),$$
where $q_t(I_i^n)$ and $I_{i+1}^n$ are oriented in the same direction, namely the orientation $o(I_{i+1}^n)$ defined by the cycle $\mathcal{I}^n$. Furthermore, let
$$t_n = \frac{|q_t(I_0^n)|}{|\varphi^{-1}(I_1^n)|}.$$

These definitions describe the decomposition of $R^n f$. Namely,
$$R^n f = \big((\varphi_{q_n-1}^n \circ q_{q_n-1}^n) \circ \dots \circ (\varphi_2^n \circ q_2^n) \circ (\varphi_1^n \circ q_1^n),\ t_n\big).$$

Lemma 20.2. For every diffeomorphism $\varphi$ there exists a constant $C$, which depends only on the $C^k$-norm of $\varphi$ and on the $C^0$-norm of $D \log D\varphi$, such that for every interval $I \subset [-1, 1]$ we have
$$|Z_I(\varphi) - \mathrm{id}|_k \le C \cdot |I|.$$

Proof. For each diffeomorphism $\psi$ we define its nonlinearity by $\eta_\psi = D \log D\psi$. The zoom operators act on the nonlinearities as
$$|\eta_{Z_I(\psi)}|_{k-2} \le \frac{|I|}{2}\, |\eta_\psi|_{k-2}. \tag{20.1}$$
This is a straightforward chain rule computation. Moreover, the $C^{k-2}$-norm of the nonlinearity bounds the $C^k$-distance to the identity in the following way. For every $B > 0$ there exists a constant $K$ such that
$$|\psi - \mathrm{id}|_k \le K \cdot |\eta_\psi|_{k-2} \tag{20.2}$$
for every diffeomorphism $\psi$ with $|\eta_\psi|_{k-2} \le B$. The definition of the nonlinearity $\eta$ of a diffeomorphism $\psi$ implies
$$D\psi(x) = A \cdot \exp \int_{-1}^{x} \eta, \quad\text{where}\quad A = \frac{2}{\int_{-1}^{1} \exp\left(\int_{-1}^{x} \eta\right) dx}.$$


These expressions and the bound on the nonlinearity of $\psi$ imply
$$|D\psi - 1|_0 \le K \cdot |\eta|_0.$$
The expression for each higher derivative of $\psi$ is a sum of terms which are products of derivatives of $\eta$ and a factor $\exp \int_{-1}^{x} \eta$. Using the bound on the nonlinearity $\eta$ we get the estimate
$$|D^j \psi|_0 \le K \cdot |\eta|_{k-2}, \quad j \le k.$$
This proves (20.2). Observe that the above constants only depend on the $C^{k-2}$-norm of $\eta_\psi$.

To prove the lemma observe that (20.1) implies that
$$|\eta_{Z_I(\varphi)}|_{k-2} \le |\eta_\varphi|_{k-2}.$$
Hence, we can apply (20.2) with a constant independent of $I$ to get
$$|Z_I(\varphi) - \mathrm{id}|_k \le K \cdot |\eta_{Z_I(\varphi)}|_{k-2}.$$
Now apply (20.1) again and the lemma follows with a constant that depends only on $|\eta_\varphi|_{k-2}$. Note that this norm can be bounded by the $C^0$-norm of $\eta_\varphi$ and the $C^k$-norm of $\varphi$.

Lemma 20.3. For every infinitely renormalizable map of bounded type $f \in U$ there exist $C > 0$ and $\lambda < 1$ such that
• $\sum_{i=0}^{q_n-1} |\varphi_i^n - \mathrm{id}|_k \le C\lambda^n$
• $\sum_{I_i^n \in \mathcal{L}_m^n} |q_i^n - \mathrm{id}|_{k+1} \le C\lambda^m$.

Proof. This lemma relies on the a priori bounds formulated in the appendix as lemma 20.8. Observe that the a priori bounds imply that for some constants $C > 0$ and $\lambda < 1$
$$\sum_{i=0}^{q_n-1} |I_i^n| \le C\lambda^n.$$
Because the diffeomorphic part of $f$ has derivative bounded away from zero these two constants can be adjusted such that also
$$\sum_{i=0}^{q_n-1} |\varphi^{-1}(I_i^n)| \le C\lambda^n.$$

Now lemma 20.2, describing the restriction (zoom) operator, assures the first estimate of lemma 20.3.

The proof of the second estimate is by induction on $m \ge 1$. Observe that the a priori bounds imply the estimate for $m = 1$. Every diffeomorphism $q_i^n$ with $I_i^n \in \mathcal{L}_{m+1}^n$ is obtained from a diffeomorphism $q_j^{n-1}$ with $I_j^{n-1} \in \mathcal{L}_m^{n-1}$ by applying the restriction operator. In particular, the sum of the $C^{k+1}$-norms of all the diffeomorphisms obtained in such a way out of a specific $q_j^{n-1}$ is bounded, using the a priori bound, by a definite factor $1 - b$ times the $C^{k+1}$-norm of $q_j^{n-1}$. Note that the obtained bounds only depend on the contraction constant of the restriction operator, which in turn only depends on the bounded geometry of the cycles. In particular, they do not depend on $n$ but only on the number of times the restriction operator is applied, namely $m + 1$ times. This finishes the induction step.

For each infinitely renormalizable map $f \in U$ we consider the unimodal maps
$$T_n f = (q_{q_n-1}^n \circ \dots \circ q_2^n \circ q_1^n,\ t_n) \in U.$$
Observe that the maps $T_n f$ are unimodal and that the diffeomorphic part is analytic. In particular, $R^n f$ consists of the analytic terms $q_i^n$ and the restrictions $\varphi_i^n$ of the diffeomorphic part $\varphi$ of $f$. Lemma 20.3 states that the terms $\varphi_i^n$ are all very close to the identity. The proof of theorem 20.1 will show $C^k$ convergence of $T_n f$ and will compare $T_n f$ with $R^n f$.

Lemma 20.4. For each infinitely renormalizable map of bounded type $f \in U$ there are constants $C > 0$ and $\lambda < 1$ such that
$$|R^n f - T_n f|_k \le C\lambda^n.$$

Proof. Fix $n \ge 1$. Observe that lemma 20.3 and lemma 20.7 (see the appendix) imply that any composition of diffeomorphisms $q_i^n$ is uniformly bounded in the $C^{k+1}$-topology. In particular, the bounds are independent of $n \ge 1$. Similarly, any composition of diffeomorphisms $q_i^n$ and/or $\varphi_i^n$ is uniformly bounded in the $C^k$-topology. These bounds allow us to apply lemma 20.6 (see the appendix) to the following diffeomorphisms. Define for each $i = 1, \dots, q_n - 1$ the diffeomorphisms
$$\Psi_i^n = q_{q_n-1}^n \circ \dots \circ q_{i+2}^n \circ q_{i+1}^n$$
and
$$\Phi_i^n = (\varphi_i^n \circ q_i^n) \circ \dots \circ (\varphi_2^n \circ q_2^n) \circ (\varphi_1^n \circ q_1^n).$$
Observe that the diffeomorphisms $\Psi_i^n$ are uniformly bounded in the $C^{k+1}$-topology, and the diffeomorphisms $\Phi_i^n$ are uniformly bounded in the $C^k$-topology. Moreover,
$$T_n f = (\Psi_0^n, t_n) \quad\text{and}\quad R^n f = (\Phi_{q_n-1}^n, t_n).$$


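For $i = 0, \dots, q_n - 1$ define also the composition of the two diffeomorphisms just introduced,
$$H_i^n = \Psi_i^n \circ \Phi_i^n.$$
The sequence $H_i^n$ starts at $T_n f = H_0^n$ and ends in $R^n f = H_{q_n-1}^n$. Hence,
$$|R^n f - T_n f|_k \le \sum_{i=1}^{q_n-1} |H_i^n - H_{i-1}^n|_k.$$
Because
$$H_i^n = \Psi_i^n \circ \varphi_i^n \circ (q_i^n \circ \Phi_{i-1}^n) \quad\text{and}\quad H_{i-1}^n = \Psi_i^n \circ (q_i^n \circ \Phi_{i-1}^n)$$
lemma 20.6 gives the estimate
$$|H_i^n - H_{i-1}^n|_k \le E \cdot |\varphi_i^n - \mathrm{id}|_k.$$
The final step is to use lemma 20.3 to get an exponentially small distance
$$|R^n f - T_n f|_k \le E \cdot \sum_{i=1}^{q_n-1} |\varphi_i^n - \mathrm{id}|_k \le C\lambda^n.$$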
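The telescoping argument of this proof can be checked numerically in a simplified setting. The sketch below is only a $C^0$ analogue under stated assumptions: the maps $x \mapsto x + \varepsilon(1 - x^2)$ are hypothetical stand-ins for the pieces $q_i^n$ and $\varphi_i^n$, the parameter lists `q_eps` and `phi_eps` are invented, and the constant `E` is the product of the Lipschitz constants of the $q$-pieces, which replaces the constant of lemma 20.6 in the sup norm. It interpolates between the analogue of $T_n f$ and the analogue of $R^n f$ by switching on one near-identity piece at a time.

```python
from functools import reduce


def bump(eps):
    """Diffeomorphism x -> x + eps (1 - x^2) of [-1, 1]; requires |eps| < 1/2."""
    return lambda x: x + eps * (1.0 - x * x)


def compose(maps):
    """Compose [m1, m2, ..., mr] as mr o ... o m2 o m1 (m1 is applied first)."""
    return lambda x: reduce(lambda y, m: m(y), maps, x)


q_eps = [0.10, -0.05, 0.08, 0.03]        # hypothetical stand-ins for the q_i
phi_eps = [0.004, -0.002, 0.001, 0.003]  # hypothetical near-identity phi_i
qs = [bump(e) for e in q_eps]
phis = [bump(e) for e in phi_eps]


def H(i):
    """Use (phi_j o q_j) for j <= i and plain q_j for j > i."""
    head = [m for pair in zip(qs[:i], phis[:i]) for m in pair]
    return compose(head + qs[i:])


T = H(0)        # analogue of T_n f: only the q-pieces
R = H(len(qs))  # analogue of R^n f: every q-piece followed by its phi-piece

grid = [-1.0 + 2.0 * j / 400 for j in range(401)]


def dist(f, g):
    """Sup-norm distance sampled on the grid."""
    return max(abs(f(x) - g(x)) for x in grid)


steps = [dist(H(i), H(i - 1)) for i in range(1, len(qs) + 1)]
total = dist(R, T)

E = 1.0
for e in q_eps:
    E *= 1.0 + 2.0 * abs(e)  # Lipschitz bound for each q-piece
bound = E * sum(abs(e) for e in phi_eps)

print(total, sum(steps), bound)
```

The printed numbers should satisfy `total <= sum(steps) <= bound`: the first inequality is the triangle inequality along the interpolating sequence, the second uses that each step differs by one near-identity piece composed with Lipschitz maps.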

Lemma 20.5. For each infinitely renormalizable map of bounded type $f \in U$ the sequence $T_n f$, $n \ge 1$, forms a precompact family of analytic maps.

Proof. The diffeomorphic part of each $T_n f$ is an analytic diffeomorphism which maps the interval $[-1, 1]$ onto itself. More precisely, these diffeomorphic parts are compositions of pieces of standard folding maps. To prove this lemma it suffices to show that the diffeomorphic parts have univalent extensions on a fixed simply connected domain containing the interval $[-1, 1]$. The construction of such a domain is based on lemma 20.9 (see the appendix). We will use the notation of lemma 20.4. In particular, we will apply lemma 20.9 to the sequence of maps $\Psi_i^n$, which satisfy
$$\Psi_{i-1}^n = \Psi_i^n \circ q_i^n.$$
The diffeomorphisms $q_i^n$ are pieces of standard folding maps and, because of the a priori bounds, they are all univalent on a fixed disk $D_0$ containing $[-1, 1]$. Moreover, the second estimate of lemma 20.3 gives a constant $K > 0$ which is independent of $n \ge 1$ such that
$$\sum_i \big|(q_i^n - \mathrm{id})|_{D_0}\big|_1 < K.$$
This estimate and lemma 20.9 assure that there will be a definite simply connected domain containing $[-1, 1]$ on which the maps $\Psi_i^n$ are all univalent. In particular the diffeomorphic parts of $T_n f$ extend univalently to this domain.

20.3 Proof of theorem 20.1

In this section we will prove that $C^k$-exponential convergence follows from $C^0$-exponential convergence in any class of unimodal maps (e.g. $\alpha > 1$ and $k \ge 3$). In particular, let $f, g \in U$ be infinitely renormalizable $C^k$ maps of the same bounded type and assume that
$$|R^n f - R^n g|_0 \le C\lambda^n$$
where $\lambda < 1$. Observe that this exponential convergence has actually been shown in the class of $C^3$ unimodal maps with quadratic critical point, that is, $\alpha = 2$. To prove theorem 20.1 it suffices to show that there are constants, again denoted by $C$ and $\lambda < 1$, such that
$$|R^n f - R^n g|_k \le C\lambda^n.$$

First, observe that the above $C^0$-exponential convergence together with lemma 20.4 gives, for some constants $C > 0$ and $\lambda < 1$,
$$|T_n f - T_n g|_0 \le C\lambda^n.$$
Then lemma 20.5 gives that the sequences $T_n f$ and $T_n g$ form precompact families of analytic maps. This implies that the $C^k$-norm is Hölder equivalent to the $C^0$-norm for these families. In particular, the constants $C > 0$ and $\lambda < 1$ can be adjusted such that
$$|T_n f - T_n g|_k \le C\lambda^n.$$
Now the proof is finished by a second adjustment of the constants and the inequality
$$|R^n f - R^n g|_k \le |R^n f - T_n f|_k + |T_n f - T_n g|_k + |T_n g - R^n g|_k \le C\lambda^n.$$

Appendix. Behavior of the composition operator

The following two lemmas are estimates on the behavior of the composition operator. The proof of the next lemma is a straightforward chain rule computation.

Lemma 20.6. There is a constant $E = E(k, B)$ with the following property. Let $\psi_2 \in \mathrm{Diff}^{k+1}([-1, 1])$ and $\psi_1, \varphi \in \mathrm{Diff}^k([-1, 1])$ be such that
$$|\psi_2|_{k+1},\ |\varphi|_k,\ |\psi_1|_k \le B;$$
then
$$|\psi_2 \circ \varphi \circ \psi_1 - \psi_2 \circ \psi_1|_k \le E \cdot |\varphi - \mathrm{id}|_k.$$

Lemma 20.7. For every $B > 0$ and $k \ge 1$ there exists a constant $K$ such that the following holds. If
$$\sum_{i=1}^{n} |\varphi_i - \mathrm{id}|_k \le B$$
then
$$|\varphi_n \circ \dots \circ \varphi_2 \circ \varphi_1|_k \le K.$$

Proof. Let $S_k^n$ be the maximum $C^k$-norm of diffeomorphisms of the form $\varphi = \varphi_n \circ \dots \circ \varphi_2 \circ \varphi_1$, with
$$\sum_{i=1}^{n} |\varphi_i - \mathrm{id}|_k \le B.$$

The product rule implies that
$$S_1 = \sup_{n \ge 1} S_1^n \le \exp(B).$$
An inductive argument will show that each
$$S_k = \sup_{n \ge 1} S_k^n$$
is finite. Assume $S_k < \infty$. Let diffeomorphisms $\varphi_i$, $i = 1, \dots, n + 1$, be given whose $C^{k+1}$-distances to the identity sum to at most $B$. Let $\Psi$ be the composition of the first $n$ diffeomorphisms $\varphi_i$, $i = 1, \dots, n$, and $\Phi = \varphi_{n+1} \circ \Psi$. The induction hypothesis implies that $\Phi$ is bounded in the $C^k$-topology, the bound given by $S_k$. It suffices to estimate the $(k + 1)$th derivative of $\Phi$. Observe that this derivative is a polynomial expression whose terms are products of derivatives of $\Psi$ and $\varphi_{n+1}$. There is one term of the form
$$(D\varphi_{n+1} \circ \Psi) \cdot D^{k+1}\Psi.$$
The other terms involve only lower derivatives of $\Psi$, that is, lower than $k + 1$. However, each of these terms has exactly one factor which is a higher derivative of $\varphi_{n+1}$, that is, not a first derivative. This implies that there is a specific polynomial $P_{k+1}$ such that the following holds:
$$S_{k+1}^{n+1} \le (1 + |\varphi_{n+1} - \mathrm{id}|_{k+1}) \cdot S_{k+1}^n + P_{k+1}(S_k) \cdot |\varphi_{n+1} - \mathrm{id}|_{k+1}.$$
The term $P_{k+1}(S_k)$ is independent of $n$. Hence, we get
$$S_{k+1} \le \exp(B) \cdot P_{k+1}(S_k) \cdot B.$$
This finishes the induction step.
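The base case $k = 1$ of this proof can be illustrated numerically: the derivative of a composition is a product of derivatives, and a product of factors $1 + \varepsilon_i$ is at most $\exp(\sum_i \varepsilon_i)$. The following Python sketch rests on assumptions made only for this example: the maps $x \mapsto x + \varepsilon_i(1 - x^2)$ are hypothetical, the sizes `eps` are an invented summable sequence, and `B` bounds only the sup norms of $D\varphi_i - 1$.

```python
import math


def bump(eps):
    """Return (phi, Dphi) for the hypothetical map phi(x) = x + eps (1 - x^2)."""
    return (lambda x: x + eps * (1.0 - x * x)), (lambda x: 1.0 - 2.0 * eps * x)


def derivative_of_composition(maps, x):
    """Chain rule: D(phi_n o ... o phi_1)(x) is the product of Dphi_i(x_{i-1}),
    where x_0 = x and x_i = phi_i(x_{i-1})."""
    d = 1.0
    for phi, dphi in maps:
        d *= dphi(x)
        x = phi(x)
    return d


eps = [0.2 / 2 ** i for i in range(1, 30)]  # invented summable sizes
maps = [bump(e) for e in eps]
B = sum(2.0 * e for e in eps)               # sum over i of sup |Dphi_i - 1|

grid = [-1.0 + 2.0 * j / 200 for j in range(201)]
S1 = max(abs(derivative_of_composition(maps, x)) for x in grid)
print(S1, math.exp(B))
```

However many maps are composed, the sampled derivative stays below $\exp(B)$, since each factor is at most $1 + 2\varepsilon_i \le \exp(2\varepsilon_i)$.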


The following lemma is known as the a priori bounds on the geometry of cycles of infinitely renormalizable maps of bounded type.

Lemma 20.8. Let $f \in U$ be an infinitely renormalizable map of bounded type. There exists a constant $0 < b < 1$ with the following property. Let $I_{i_l}^{n+1} \subset I_j^n$, $l = 1, \dots, m_n$, be the intervals of the $(n + 1)$th cycle which are contained in the interval $I_j^n$ of the $n$th cycle. Then
• $\sum_l \dfrac{|I_{i_l}^{n+1}|}{|I_j^n|} < 1 - b$,
• $\dfrac{|I_{i_l}^{n+1}|}{|I_j^n|} > b$, $l = 1, \dots, m_n$.

A proof can be found in [3, theorem 2.1 ch VI].

Lemma 20.9. Let $D_0 \supset D_1 \supset [-1, 1]$ be strictly nested disks in the complex plane. There exists a constant $K > 0$ such that the following holds. Let $\varphi : D_0 \to \mathbb{C}$ be a univalent map that fixes the interval $[-1, 1]$, and let $[-1, 1] \subset \tilde D \subset D_1$. There exists a simply connected domain $D_\varphi \subset \tilde D$ containing $[-1, 1]$ such that
• $\varphi(D_\varphi) \subset \tilde D$
• $\rho_\varphi \ge \big(1 - K \cdot \big|(\varphi - \mathrm{id})|_{D_0}\big|_1\big)\,\rho_\psi$,
where $\rho_\psi$ and $\rho_\varphi$ are the distances from the boundaries of $D_\psi$ and $D_\varphi$, respectively, to the interval $[-1, 1]$.

Proof. Let $\rho_\varphi$ be maximal such that the $\rho_\varphi$-neighborhood $D_\varphi \subset \tilde D$ of $[-1, 1]$ satisfies
$$\varphi(D_\varphi) \subset \tilde D.$$
Because $\tilde D$ contains the $\tilde\rho$-neighborhood of $[-1, 1]$ we get
$$\rho_\varphi \ge \frac{1}{\sup_{z \in \tilde D} |D\varphi(z)|}\, \tilde\rho.$$
The map $\varphi$ fixes the interval $[-1, 1]$ and is defined in a domain $D_0$ which strictly contains the domain $\tilde D$. The Koebe Lemma implies that the derivative of $\varphi$ on $\tilde D$ is bounded. This implies the estimate as stated in the lemma.

Acknowledgements

MM was partially supported by NSF Grant DMS-0073069 and acknowledges the hospitality of IMPA where parts of this work were done. WdM was partially supported by the Pronex Project on Dynamical Systems, Faperj Grant E-26/151.896/2000.


References

[1] Lyubich M 1999 Feigenbaum–Coullet–Tresser universality and Milnor’s hairiness conjecture Ann. Math. 149 319–420
[2] Martens M 1998 The periodic points of renormalization Ann. Math. 147 543–84
[3] de Melo W and Van Strien S 1993 One-Dimensional Dynamics (Berlin: Springer)
[4] de Melo W and Pinto A 1999 Rigidity of $C^2$ infinitely renormalizable unimodal maps Commun. Math. Phys. 208 91–105
[5] Sullivan D 1992 Bounds, Quadratic Differentials and Renormalization Conjectures (AMS Centennial Publications 2: Mathematics into the Twenty-First Century) (Providence, RI: American Mathematical Society)

Author index

• Artur Avila (pp 449–60), IMPA, Instituto de Matemática Pura e Aplicada, Estrada Dona Castorina 110, 22460-320 Rio de Janeiro, Brazil

• Robert L Devaney (pp 329–38), Department of Mathematics, 111 Cummington Street, Boston University, Boston, MA 02215, USA

• Henk W Broer (pp 167–209), Department of Mathematics, University of Groningen, PO Box 800, 9700 AV Groningen, The Netherlands

• Lorenzo J Díaz (pp 309–27), Departamento de Matemática, PUC-Rio, Marquês de S. Vicente 225, 22453-900 Rio de Janeiro, Brazil

• Henk Bruin (pp 433–47), Department of Mathematics, University of Groningen, PO Box 800, 9700 AV Groningen, The Netherlands

• Jacob DeGoede (pp 391–403), Vakgroep Fysiologie, Faculteit der Geneeskunde, Universiteit Leiden, Postbus 9604, 2300 RC Leiden, The Netherlands

• Cees Diks (pp 391–403), Faculteit der Economische Wetenschappen en Econometrie, Universiteit van Amsterdam, Roetersstraat 11, 1018 WB Amsterdam, The Netherlands

• Freddy Dumortier (pp 131–66), Departement WNI, Limburgs Universitair Centrum, Universitaire Campus, B-3590 Diepenbeek, Belgium

• Peter Fiddelaers (pp 131–66), Departement WNI, Limburgs Universitair Centrum, Universitaire Campus, B-3590 Diepenbeek, Belgium

• Welington de Melo (pp 449–60), IMPA, Instituto de Matemática Pura e Aplicada, Estrada Dona Castorina 110, 22460-320 Rio de Janeiro, Brazil


• Bernold Fiedler (pp 211–59), Institut für Mathematik I, Freie Universität Berlin, Arnimallee 2-6, 14195 Berlin, Germany

• Martin Golubitsky (pp 277–308), Department of Mathematics, University of Houston, 4800 Calhoun Road, Houston, TX 77204-3476, USA

• John Guckenheimer (pp 261–76), Department of Mathematics, Cornell University, Malott Hall, Ithaca, NY 14853, USA

• Heinz Hanßmann (pp 353–71), Program for Applied and Computational Mathematics, Princeton University, Fine Hall, Washington Road, Princeton, NJ 08544-1000, USA

• Kathleen Hoffmann (pp 261–76), Department of Mathematics and Statistics, University of Maryland Baltimore County, Baltimore, MD 21250, USA

• Philip Holmes (pp 353–71), Program for Applied and Computational Mathematics, Princeton University, Fine Hall, Washington Road, Princeton, NJ 08544-1000, USA

• Ale Jan Homburg (pp 405–18), Korteweg–de Vries Institute for Mathematics, University of Amsterdam, Plantage Muidergracht 24, 1018 TV Amsterdam, The Netherlands

• Krešimir Josić (pp 277–308), Department of Mathematics and Center for BioDynamics, Boston University, 111 Cummington Street, Boston, MA 02215, USA

• Tasso J Kaper (pp 277–308), Department of Mathematics and Center for BioDynamics, Boston University, 111 Cummington Street, Boston, MA 02215, USA

• Bernd Krauskopf (pp 89–111), Department of Engineering Mathematics, University of Bristol, Queen’s Building, Bristol BS8 1TR, UK

• Chengzhi Li (pp 131–66), School of Mathematical Sciences, Peking University, Beijing 100871, People’s Republic of China

• Stefan Liebscher (pp 211–59), Institut für Mathematik I, Freie Universität Berlin, Arnimallee 2-6, 14195 Berlin, Germany

• Marco Martens (pp 449–60), IBM T J Watson Research Center, Yorktown Heights, NY 10598, USA

• Jacob Palis (pp 67–87), IMPA, Instituto de Matemática Pura e Aplicada, Estrada Dona Castorina 110, 22460-320 Rio de Janeiro, Brazil

• Yakov Pesin (pp 419–31), Department of Mathematics, The Pennsylvania State University, University Park, PA 16802, USA

• Isabel L Rios (pp 309–27), Instituto de Matemática, Universidade Federal Fluminense (UFF), Rua Mário Santos Braga s/n, Centro, Niterói, RJ, Brazil

• Robert Roussarie (pp 167–209), Université de Bourgogne, Laboratoire de Topologie, UMR CNRS 5584, UFR des Sciences et Techniques, 9, avenue Alain Savary, BP 47870, 21078 Dijon cedex, France

• David Ruelle (pp 63–6), Theoretical Physics, Institut des Hautes Études Scientifiques, 91440 Bures-sur-Yvette, France

• Mikhail B Sevryuk (pp 339–52), Laboratory for General Dynamics of Elementary Processes, Institute of Energy Problems of Chemical Physics, Russian Academy of Sciences, Lenin Prospect 38, Bldg 2, Moscow 117829, Russia

• Carles Simó (pp 373–89), Matemàtica Aplicada i Anàlisi, Universitat de Barcelona, Gran Via, 585, 08007 Barcelona, Spain

• Sebastian van Strien (pp 433–47), Mathematics Department, University of Warwick, Coventry CV4 7AL, UK

• Marcelo Viana (pp 309–27), IMPA, Instituto de Matemática Pura e Aplicada, Estrada Dona Castorina 110, 22460-320 Rio de Janeiro, Brazil

• Florian Wagener (pp 113–29), Department of Quantitative Economics, University of Amsterdam, Roetersstraat 11, 1018 WB Amsterdam, The Netherlands

• Warren Weckesser (pp 261–76), Department of Mathematics, University of Michigan, Ann Arbor, MI 48109, USA


• Howard Weiss (pp 419–31), Department of Mathematics, The Pennsylvania State University, University Park, PA 16802, USA

• Jean-Christophe Yoccoz (pp 67–87), College de France, 3, rue d’Ulm, F-75005 Paris, France

• Todd Young (pp 405–18), Department of Mathematics, Ohio University, Morton Hall, Athens, OH 45701, USA