DYNAMICAL SYSTEMS Stability, SYITlbolic DynaITlics, and Chaos
Clark Robinson
CRC Press Boca Raton
Ann Arbor
London
Tokyo
Ubral')' 01 COOI"SI CatalopDg-lo-Publicatioo Data Robinson. C. (Clark) Dynamical systems: stabilily. symbolic dynamics. and chaos f Clark Robinson. p.
cm.--{Studies in advanced mathematics)
Includes bibliographical references and index. ISBN G-8493·8493-1 I. Differentiable dynamical systems.
I. Title.
II. Serics.
QA614.8.R63 1994 94-24456
514'.74--dc20
CIP This book cont ai ns infonnation obtained from authentic and highly regarded sources. Reprinted material i. quoled with pe nnission. and sources are indicated. A wide variety of references are listed. Reasonable effons have been made to poIbIish reliable dala and infonnation. hut the author and the publisher cannol assume responsibilily for the validity of all IDIfcriais or for the consequences of their usc. Neither this book nor any pan may be reproduced or transmitted in any form or by any means. electronic or mechanical. including pholocopying. microfilming. and recording. or by any infonnation storage or retrieval system. without prior pennission in writing from the publisher. CRC Press. I nc. ·s conscnt does
not extend to copying for general distribution. for promotion. for creating new obtained in writing from CRC Press for such copying. CRC Press. Inc 2000 Corporate Blvd N.W Boca Ralon. Florida 33431.
works. or for resale. Specific penoission must be Direct all inquiries 10
.•
.•
C 1995 by-CRC �ss. InlO.
No claim 10 original U.S. Governmenl works International SIaDdard Book Number G-8493-8493-1 Ubnry of COIIpas CII'II Number 94-24456 PrInted in the United Slates of America 2 3 4 5 6 7 8 PrlIIIed on acid-free paper
9 0
.•
Preface In recent years, Dynamical Systems has had many applications to science and engineering some of which have gone under the related headings of chaos theory or nonlinear analysis. Behind these applications there lies a rich mathematical subject which we treat in this book. This subject centers on the orbits of iteration of a (nonlinear) function or of the solutions of (nonlinear) ordinary differential equations. In particular, we are interested in the properties which persist under nonlinear change of coordinates. As such, we are interested in the geometric or topological aspects of the orbits or solutions more than an explicit formula for an orbit (which may not be available in any case). However, as becomes clear in the treatment in this book, there are many properties of a particular solution or the whole system which can be measured by some quantity. Also, although the subject has a geometric or topological flavor, analytic analysis plays an important role (e.g. the local analysis near a fixed point and the stable manifold theory). There have been several books and monographs on the subject of Dynamical Systems. There are several distinctive aspects which together make this book unique. First of all, this book treats the subject from a mathematical perspective with the proofs of most of the results included: the only proofs which are omitted either (i) are left to the reader, (Ii) are too technically difficult to include in an introductory book, even at the graduate level, or (iii) concern a topic which is only included as a bridge between the material covered in the book and commonly encountered concepts in Dynamical Systems. (Much of the material concerning measures, Liapunov exponents, and fractal dimension is of this latter category.) Although it has a mathematical perspective, readers who are more interested in applied or computational aspects of the subject should find the explicit statements of the results helpful even if they do not concern themselves with the details of the proofs. In particular, the inclusion of explicit formulas for the various bifurcations should be very useful. Second, this book is meant to be a graduate textbook and not just a reference book or monograph on the subject. This aspect of the book is reflected in the way the background materials are carefully reviewed as we use them. (The particular prerequisites from undergraduate mathematics are discussed below.) The ideas are introduced through examples and at a level which is accessible to a beginning graduate student. Many exercises are included to help the student learn the meaning of the theorems and master the techniques of the proofs and topic under consideration. Since the exercises are not usually just routine applications of theorems but involve similar proofs and or calculations, they are best assigned in groups, weekly or biweekly. For this reason, they are grouped at the end of each chapter rather in the individual section. Third, the scope of the book is on the scale of a year long graduate course and is designed to be used in such a graduate level mathematics course in Dynamical Systems. This means that the book is not comprehensive or exhaustive but tries to treat the core concepts thoroughly and treat others enough so the reader will be prepared to read further in Dynamical Systems without a complete mathematical treatment. In fact, this book grew out of a graduate course that I taught at Northwestern University many iii
times between the early 1970s and the present. To the material that I covered in that course, I have added a few other topics: some of which my colleagues treat when they teach the course, others round out the treatment of a topic covered earlier in the book (e.g. Chapter XI), and others just give greater flexibility to possible courses using this book. Details on which sections form the core of the book are discussed in Section 1.4. The perspective of the book is centered on multidimensional systems of real variables. Chapters II and III concern functions of one real variable, but this is done mainly because this makes the treatment simpler analytically than that given later in higher dimensions: there are not lUly (or many) aspects introduced which are unique to one dimension. Some results are proved so they apply in Banach spaces or even complete metric but most of the results are developed in finite dimensions. In particular, no direct connection with partial differential equations or delay equations is given. The fact that the book concerns functions of real rather than complex variables explains why topics such as the Julia set, Mandelbrot set, and Measurable Riemann Mapping Theorem are not treated. This book treats the dynamics of both iteration of functions and solutions of ordinary differential equations. Many of the concepts are first introduced for iteration of functions where the geometry is simpler, but an attempt has been made to interpret these results for differential equations. A proof of the existence and continuity of solutions with respect to initial conditions is also included to establish the beginnings of this aspect of the subject. Although there is much overlap in this book and one on ordinary differential equations, the emphasis is different. The dynamical systems approach centers more on properties of the whole system or subsets of the system rather thlUl individual solutions. Even the more local theory in Chapters IV-VI deals with characterizing types of solutions under various hypotheses. Chapters VII and IX deal more directly with more global aspects: Chapter VII centers on various examples and Chapter IX gives the global theory. Finally, within the various types of Dynamical Systems, this book is most concerned with hyperbolic systems: this focus is most prominent in Chapters VII, IX, X, and XI. However, an attempt has been made to make this book valuable to people interested in various aspects of Dynamical Systems. The specific prerequisites include undergraduate analysis (including the Implicit Function Theorem), linear algebra (including the Jordan canonical form), and point set topology (including Cantor sets). For the analysis, one of the following books should be sufficient background: Apostol (1974), Marsden (1974), or Rudin (1964). For the linear algebra, one of the following books should be sufficient background: Hoffman and Kunze (1961) or Hartley and Hawkes (1970). For the point set topology, one of the following books should be sufficient background: Croom (1989), Hocking and Young (1961), or Munkres (1975). What is needed from these other subjects is an ability to use these tools; knowing a proof of the Implicit Function Theorem does not particularly help someone know how to use it. For this reason, we carefully discuss the way these tools are used just before we use them. (See the sections on the Calculus Prerequisites, Cantor Sets, Real Jordan Canonical Form, Differentiation in Higher Dimensions, Implicit Function Theorem, Inverse Function Theorem, Contraction Mapping Theorem, and Definition of a Manifold.) After using these tools in Dynamical Systems, the reader should gain a much better understanding of the importance of these "undergraduate" subjects. The terminology and ideas from differential topology or differential geometry are also used, including that of a tangent vector, the tangent bundle, and a manifold. However most surfaces or manifolds are either Euclidean space, tori, or graphs of functions so these ideas should not be too intimidating. Although someone pursuing
Dynamical Systems further should learn manifold theory, I have tried to make this book accessible to someone without prior background in this subject. Thus, the prerequisites for this book are really undergraduate analysis, linear algebra, and point set topology and not advanced graduate work. However, the reader should be warned that most beginning graduate students do not find the material at all trivial. The main complicating aspect seems to be the use of a large variety of methods and approaches. The unifying feature is not the methods used but the type of questions which we are trying to answer. By having patience and reviewing the mathematics from other subjects as they are used, the reader should find the material accessible and rich in content, both mathematical and for applications. The main topic of the book is the dynamics induced by iteration of a (nonlinear) function or by the solutions of (nonlinear) ordinary differential equations. In the usual undergraduate mathematics courses, some properties of solutions of differential equations are considered but more attention is paid to the specific form of the solution. In connection with functions, they are graphed and their minima and maxima are found, but the Iterates of a function are not often considered. To iterate a function we repeatedly have the same function act on a point and its images. Thus, for a function f with initial condition xo, we consider XI = f(xo), and then Xn = f(xn-d for n ;:: 1. We are interested in finding the qualitative features and long time limiting behavior of a typical orbit, for either an ordinary differential equation or the iterates of a function. Certainly fixed points or periodic points are important, but sometimes the orbit moves densely through a complicated set such as a Cantor set. We want to understand and bring a structure to this seemingly random behavior. It is often expressed by saying, "we want to bring order out of chaos." One way of finding this structure is via the tool of symbolic dynamics. If there is a real valued function f and a sequence of intervals J i such that the image of J i by f covers Ji+lt f(J i ) :::) Ji+lt then it is possible to show that there is a point X whose orbit passes through this sequence of intervals, fi(X) E J i . Labels for the intervals then can be used as symbols, hence the name of symbolic dynamics for this approach. Another important concept is that of structural stability. Some types of systems (iterated functions or ordinary differential equations) have dynamics which are equivalent (topologically conjugate) to that of any of its perturbations. Such a system is called structurally stable. Finally the term chaos is given a special meaning and interpretation. There is no one set definition of a chaotic system, but we discuss various ideas and measurements related to chaotic dynamics. One of the ironies is that some chaotic systems are also structurally stable. Chapter I gives a more detailed introduction into the main ideas that are treated in the book by means of examples of functions and differential equations. Suffice it to say here that these three ideas, symbolic dynamics, structural stability, and chaos, form the central part of the approach to Dynamical Systems presented in this book. In the year-long graduate course at Northwestern, we cover the the material in Chapter II, Sharkovskii's Theorem and Subshifts of Finite Type from Chapter III, Chapter IV except the Perron-Frobenius Theorem, Chapter V except some of the material on periodic orbits for planar differential equations (and sometime the proof of the Stable Manifold Theorem is omitted), a selection of examples from Chapter VII, and most of Chapter IX. In a given year, other selected topics are usually added from among the following: Chapter VI on bifurcations, the material on topological entropy in Chapter VIII, and the Kupk&-Smale Theorem. A course which did not emphasize the global hyperbolic theory as much could be obtained by skipping Chapter IX and treating additional topics, e.g. Chapter VI or more on the measurements of chaos. Section 1.4 discusses the content of the different chapters and possible selections of sections or
topics for a course using this book. There are several other books which give introductions into other aspects or approaches to Dynamical Systems. For other graduate level mathematical introductions to Dynamical Systems see Devaney (1989), Irwin (1980), Nitecki (1970), and Palis and de Melo (1982). For a more comprehensive treatment of Dynamical Systems see Katok and H_lbl8tt (11Xl4). Some boob which .mph..i.. the dynamic. of iteration of 8 function of one variable are Alseda, Llibre, and Misiurewicz (1993), Block and Coppel
(1992). and de Melo and Van Strien (1993). Carleson and Gamelin (1993) gives an introduction to the dynamics of functions of a complex variable. Chow and Hale (1982) gives a more thorough treatment of the bifurcation aspects of Dynamical Systems. The article by Boyle (1993) gives a more thorough introduction into symbolic dynamics as a separate subject and not just how it is used to analyze diffeomorphisms or vector fields. Some books which concentrate on Hamiltonian dynamics are Abraham and Marsden (1978). Arnold (1978). and Meyer and Hall (1992). For an introduction to applications of Dynamical Systems, see Guckenheimer and Holmes (1983). Hirsch and Smale (1974). Wiggins (1990. 1988). and Ott (1993). For applications to ecology. see Hirsch (1982, 1985. 1988. 1990). Hofbauer and Sigmund (1988). Hoppensteadt (1982). May (1975), and Waltman (1983). There are many books written on Dynamical Systems by people in fields outside mathematics. including Lichtenberg and Lieberman (1983), Marek and Schreiber (1991). and Rasband (1990). I have tried very hard to give references to original papers. However, there are many researchers working in Dynamical Systems and I am not always aware of (or remember) contributions by various people to which I should give credit. I apologize for my omissions. I am sure there are many. I hope the references that I have given will help the reader start finding the related work in the literature. When referring to a theorem in the same chapter, we use the number as it appears in the statement. e.g. Theorem 2.2 which is the second theorem of the second section of the current chapter. If we are referring to Theorem 2.2 from Chapter VI in a chapter other other than Chapter VI. we refer to it as Theorem VI.2.2 to indicate it comes from a different chapter. There are not any specific references in this book to using a computer to simulate a dynamical system. However, the reader would benefit greatly by seeing the dynamics as it unfolds by such simulation. The reader can either write a program for him or herself or use several of the computer packages available. On an IBM Personal Computer. I have used the program Phaser which comes with the book by K~ak (1989). Also the program Dynamics by Yorke (1990) runs on both IBM Personal Computers and Unix/Xll machines. There are several other programs for IBM Personal Computers but I have not used them myself. Also the program DsTool by J. Guckenheimer, M. R. Myers. F. J. Wicklin. and P. A. Worfolk runs on Unix/Xll machines. Many of the programming languages come with a good enough graphics library that it is not difficult to write one's own specialized program. However, for the X-Window environment on a Unix computer, I found the VOGLE library (C graphics C functions) a very helpful asset to write my own programs. There are several programs for the Macintosh including MacMath by Hubbard and West (1992), but I have not used them. Over the years, I have had many useful conversations with colleagues at Northwestern University and from elsewhere. especially people attending the Midwest Dynamical Systems Seminars. Those colleagues in Dynamical Systems at Northwestern University include Keith Burns, John Franks, Don Saari, Robert Williams, and many postdoctoral instructors and visitors. Those attending the Midwest Dynamical Systems Seminars are too numerous to list, but surely Charles Conley is one who bears mentioning and will long be remembered by many of us. I also owe a great debt to the people who taught
me about Dynamical Systems, including Morris Hirsch, Charles Pugh, and Steve Smale. The perspective on Dynamical Systems which I learned from them is still very evident in the selection and treatment of topics in this book. I would also like to thank the many people who found typographical errors, conceJ>tual errors, or points that needed to be clarified in earlier drafts of this book. I would eIIpeclaily like to thank Keith Burna, Beverly Diamond, Roger Kraft, and Mlng-Chla Li: Keith Burns taught out of a preliminary version and made many suggestions for improvements, clarifications, and changed arguments; Beverly Diamond made many suggestions for improvements in grammar and other editing matters; Roger Kraft made both mathematical and typographical corrections; in addition to noting out typographical errors, Ming-Chia Li pointed out aspects which needed clarifying. This text was typeset using A,MS-1EX. The figures were produced using DsTooI, Xfig, Maple, and Vogle graphics C Library. I would like to thank Len Evens who supplied me with some macros which were used with ~1EX to produce the chapter and section titles and numbers, and produce the index and table of contents. I was supported by several National Science Foundation grants during the years this book was written. Clark Robinson Department of Mathematics Northwestern University Evanston, Illinois 60208
[email protected]
Contents Chapter I. Introduction 1.1 Population Growth Models, One Population 1.2 Iteration of Real Valued Functions as Dynamical Systems 1.3 Higher Dimensional Systems 1.4 Outline of t.he Topics of the Chapters Chapter II. One Dimensional Dynamics by Iteration 2.1 Calculus Prerequisites *2.2 Periodic Points *2.2.1 Fixed Points for the Quadratic Family *2.3 Limit Sets and Recurrence for Maps *2.4 Invariant Cantor Sets for the Quadratic Family *2.4.1 Middle Cantor Sets *2.4.2 Construction of the Invariant Cantor Set 2.4.3 The Invariant Cantor Set for Jl > 4 *2.5 Symbolic Dynamics for the Quadratic Map *2.6 Conjugacy and Structural Stability *2.7 Conjugacy and Structural Stability of the Quadratic Map 2.8 Homeomorphisms of the Circle 2.9 Exercises Chapter III. Chaos and Its Measurement 3.1 Sharkovskii's Theorem 3.1.1 Examples for Sharkovskii's Theorem 3.2 Subshifts of Finite Type 3.3 Zeta Function 3.4 Period Doubling Cascade 3.5 Chaos 3.6 Liapunov Exponents 3.7 Exercises Chapter IV. Linear Systems 4.1 Review: Linear Maps and the Real Jordan Canonical Form *4.2 Linear Differential Equations *4.3 Solutions for Constant Coefficients *4.4 Phase Portraits *4.5 Contracting Linear Differential Equations *4.6 Hyperbolic Linear Differential Equations *4.7 Topologically Conjugate Linear Differential Equations *4.8 Nonhomogeneous Equations *4.9 Linear Maps 4.9.1 Perron-F'robenius Theorem 4.10 Exercises • Core Sections
1 2 3 5 9
13 13 15 20 22 26 26 30 33 37 40 46 49 57 63 63 70 72 78 79 81 86 88 93 93 95 97 102 106 111 113 115 116 123 127
Chapter V. Analysis Near Fixed Points and Periodic Orbits *5.1 Review: Differentiation in Higher Dimensions *5.2 Review: The Implicit Function Theorem *5.2.1 Higher Dimensional Implicit Function Theorem *5.2.2 The Inverse Function Theorem *5.2.3 Contraction Mapping Theorem *5.3 Existence of Solutions for Differential Equations *5.4 Limit Sets and Recurrence for Flows *5.5 Fixed Points for Nonlinear Differential Equations *5.5.1 Nonlinear Sinks *5.5.2 Nonlinear Hyperbolic Fixed Points *5.5.3 Liapunov Functions Near a Fixed Point *5.6 Stability of Periodic Points for Nonlinear Maps *5.7 Proof of the Hartman-Grobman Theorem *5.7.1 Proof of the Local Theorem 5.7.2 Proof of the Hartman-Grobman Theorem for Flows *5.8 Periodic Orbits for Flows 5.8.1 The Suspension of a Map 5.8.2 An Attracting Periodic Orbit for the Van der Pol Equations 5.8.3 Poincare Map for Differential Equations in the Plane *5.9 Poincare-Bendixson Theorem *5.10 Stable Manifold Theorem for a Fixed Point of a Map 5.10.1 Proof of the Stable Manifold Theorem 5.10.2 Center Manifold *5.10.3 Stable Manifold Theorem for Flows *5.11 The Inclination Lemma 5.12 Exercises Chapter VI. Bifurcation of Periodic Points 6.1 Saddle-Node Bifurcation 6.2 Saddle-Node Bifurcation in Higher Dimensions 6.3 Period Doubling Bifurcation 6.4 Andronov-Hopf Bifurcation for Diffeomorphisms 6.5 Andronov-Hopf Bifurcation for Differential Equations 6.6 Exercises Chapter VII. Examples of Hyperbolic Sets and Attractors *7.1 Definition of a Manifold *7.1.1 Topology on Space of Differentiable Functions *7.1.2 Tangent Space *7.1.3 Hyperbolic Invariant Sets *7.2 Transitivity Theorems *7.3 Two Sided Shift Spaces 7.3.1 Subshifts for Nonnegative Matrices *7.4 Geometric Horseshoe 7.4.1 Horseshoe for the Henon Map *7.4.2 Horseshoe from a Homoclinic Point 7.4.3 Melnikov Method for Homoclinic Points 7.4.4 Fractal Basin Boundaries
... Core Sections
131 131 134 136 137 138 140 146 149 150 152 154 156 158 163 165 165 171 171 176 179 181 185 197 199 200 202 211 211 213 218 223 224 231 235 235 237 238 241 244 247 247 249 255 259 268 274
*7.5 Hyperbolic Toral Automorphisms 7.5.1 Markov Partitions for Hyperbolic Toral Automorphisms 7.5.2 The Zeta Function for Hyperbolic Toral Automorphisms *7.6 Attractors *7.7 The Solenoid Attractor 7.7.1 Conjugacy of the Solenoid to an Inverse Limit 7.8 The DA Attractor 7.8.1 The Branched Manifold *7.9 Plykin Attractors in the Plane 7.10 Attractor for the Henon Map 7.11 Lorenz Attractor 7.11.1 Geometric Model for the Lorenz Equations 7.11.2 Homoclinic Bifurcation to a Lorenz Attractor *7.12 Morse-Smale Systems 7.13 Exercises Chapter VIII. Measurement of Chaos in Higher Dimensions 8.1 Topological Entropy 8.1.1 Proof of Two Theorems on Topological Entropy 8.1.2 Entropy of Higher Dimensional Examples 8.2 Liapunov Exponents 8.3 Sinai-Ruelle-Bowen Measure for an Attractor 8.4 Fractal Dimension 8.5 Exercises Chapter IX. Global Theory of Hyperbolic Systems 9.1 Fundamental Theorem of Dynamical Systems 9.1.1 Fundamental Theorem for a Homeomorphism 9.2 Stable Manifold Theorem for a Hyperbolic Invariant Set 9.3 Shadowing and Expansiveness 9.4 Anosov Closing Lemma 9.5 Decomposition of Hyperbolic Recurrent Points 9.6 Markov Partitions for a Hyperbolic Invariant Set 9.7 Local Stability and Stability of Anosov Diffeomorphisms 9.8 Stability of Anosov Flows 9.9 Global Stability Theorems 9.10 Exercises Chapter X. Generic Properties 10.1 Kupka-Smale Theorem 10.2 Transversality 10.3 Proof of the Kupka-Smale Theorem 10.4 Necessary Conditions for Structural Stability 10.5 Nondensity of Structural Stability 10.6 Exercises
• Core Sections
275 279 288 292 294 299 300 303 304 306 309 312 318 318 326 333 333 343 350 351 356 356 362 367 367 374 374 377 381 382 388 398 401 403 407 413 413 417 419 425 428 430
Chapter XI. Smoothness of Stable Manifolds and Applications 11.1 Differentiable Invariant Sections for Fiber Contractions 11.2 Differentiability of Invariant Splitting 11.3 Differentiability of the Center Manifold 11.4 Persistence of Normally Contracting Manifolds 11.5 Exercises References Index
433 433 441 444 444 448 451 463
How many are your works, 0 Lord! In wisdom you made them all; the earth is full of your creatures. -Psalm 104:24
CHAPTER I
Introduction The main goal of the study of Dynamical Systems is to understand the long terl! behavior of states in a system for which there is a deterministic rule for how a stat, evolves. The systems often involve several variables and are usually nonlinear. In variety of settings, very complicated behavior is observed even though the equation themselves are not very complicated (only "slightly nonlinear"). Thus the simple "I gebraic form of the equations does not mean that the dynamical behavior is simp)' in fact, it can be very complicated or even "chaotic." Another aspect of the chaoti nature of the system is the feature of "sensitive dependence on initial conditions." I the initial conditions are only approximately specified then the evolution of the stat· may be very different. This feature leads to another difficulty in using approximate. () even real, solutions to predict future states based on present knowledge. To develop !II understanding of these aspects of chaotic dynamics, we want to find situations whicl exhibit this behavior and yet for which we can still understand the important featuff' of how a solution evolves with time. Sometimes we cannot follow a particular solution with complete certainty becall~' there is round off error in the calculations or we are using some numerical scheme to fill' the solution. We are interested to know whether the approximate solution we calculat· is related to a true solution of the exact equations. In some of the chaotic system, we can understand how an ensemble of different initial conditions evolves, and pro\ that the approximate solution traced by a numerical scheme is shadowed by a trll solution with some nearby initial conditions, If the system models the weather, peopl may not be content to know the range of possible outcomes of the weather that COlli, develop from the known precision of the previous conditions, or to know that a sma i change of the previous conditions would have produced the weather which had be<" predicted. However, even for a subject like weather, for which quantitative as well " qualitative predictions are Important, it is stili useful to understand what factors ell: lead to instabilities in the evolution of the state of the system. It is now realized t/};I no new better simulation of weather on more accurate computers of the future will " able to predict the weather more than about fourteen days ahead, because of the VI') nonlinear nature of the evolution of the state of the weather. This type of knowled!, can itself be useful. We now proceed to discuss these ideas in a little more detail in terms of specili equation!>, some of which arise from modeling different physical situations, First a comment about notation. Throughout the book, points and vectors in R". manifold, or a metric space are written in bold face, e.g. x. Points (and vectors) on th line or the circle are denoted without bold face, e,g. x. Also the components of a poil! or vector in R" are written without bold face, e.g. x = (Xl, ... ,xn)' In a few plac, it seems strange to use different notation for a point in the line from a point in Rn f, n ~ 2, but it seems like the best choice to distinguish a vector from a number.
I.
2
INTRODUCTION
1.1 Population Growth Models, One Population In calculus, the differential equation :i;
=ax
is studied, where :i; = ~;. This equation can be used to model many different situations. In particular, when a > 0 it can model the growth of a population with unlimited resources or the effect of continuously compounding interest. When a < 0, it can model radioactive decay. For this simple equation, an explicit expression for the solutions can be given, x(t) = xoeat , where Xo is the value of x when t = o. If a > 0, the solution tends to infinity as t goes to infinity, while the solution tends to zero if a < o. The behavior of the solutions is very simple since the equation is linear. See Figure 1.1.
FIGURE 1.1.
Plot of Solutions as a FUnction of Time: for a
> 0 and a < 0
Often, we do not plot the solution as a function of t, but merely plot the solution in the space of possible values of "x" , the phase portrait. The solution curves are labeled with arrows to indicate the direction that the variable changes as t increases. (This contains more information when more than one variable is involved.) See Figure 1.2. This type of phase space analysis can often yield qualitatively important information, even when an explicit representation of the solution can not be obtained.
o
• I
FIGURE 1.2.
x
•
Phase Portrait: for a > 0 and a
<0
A more sophisticated population growth model involves a crowding factor. For this equation, it is not assumed that the growth rate, XIx, is a constant, but that this quantity decreases as x increases, XIx = 0 - bx with a, b > o. This equation can be used to model the growth of a population when the resources are limited. Thus we get the equation :i; = (0 - bx)x. This equation is sometimes called the logistic model for population growth. This equation can be solved explicitly (by separation of variables and partial fractions), yielding the solution
axo
x(t) = bxo
+ (a -
bxo)e-at
As can be seen from the differential equation or from the solution, if Xo equals either
o or alb then :i; =
0 and the solution x(t) is constant in time. Such a solution is also
1.2 ITERATION OF REAL VALUED FUNCTIONS
3
called a steady state solution or a fixed point solution. If Xo lies between 0 and alb, then :i; > 0, and the solution continues to increase with time but can never quite reach alb. By a simple argument about monotone solutions or by using the exact form of the solution, it can be seen that for these initial conditions, x(t) tends to alb as t goes to infinity. Similarly, if Xo > alb, then:i; < 0 and the solution x(t) monotonically decreases toward alb as t goes to infinity. See Figure 1.3. (Figure 1.3A shows the graphs of four solutions and Figure 1.3B shows the phase portrait of all solutions.) Thus, for any initial condition Xo > 0, the solution x(t) tends to the quantity alb as t goes to infinity. Thus, alb is the long term limit state for any positive initial condition.
x
O~------------------------
FIGURE 1.3A. Logistic Equation: Solutions as a Function of Time
o I
x
FIGURE 1.3B. Logistic Equation: Phase Portralt For differential equations on the real line (one real variable), the dynamics are always about this simple. As t goes to infinity, all solutions must tend to either a steady state solution (a fixed point) or ±oo.
1.2 Iteration of Real Valued Functions as Dynamical Systems As mentioned above, the differential equation :i; = az can be used to model the growth of capital when interest is compounded continuously. If however the Interest is compounded at set time intervals (dally, monthly, or yearly) and added to the capital, then the amount of money at the n-th time interval in terms of the previous time is given by
where ~ > 1. (Here ~ is equal to one plus the interest rate.) This equation could also be thought to model the growth of a population which reproduces at fixed time intervals (e.g. always in the spring) rather than continuously. Letting !J,.(x) = ~x, we get that Xn = f>.(Xn-l). We also write n(xo) for b, 0 f>.(xo) and ff(xo) = b, 0 fr-1(xo). For the second iterate, X:l = n(xo) = b,(xt} = ~Xl = ~f>.(xo) = ~(~xo) = ~2xo, and by
4
I.
INTRODUCTION
induction on n,
Xn = f>.(xn-d
= >'X n - l = >.(>.n-l xo )
= >.nxo· Thus, if>. > I, Xn = ff(xo) tends to infinity as n goes to infinity. If 0 < >. < 1 (some kind of penalty situation rather than adding interest to the capital), then Xn tends to o as n goes to infinity. The behavior of these iterates is very similar to the solutions of the linear differential equation. Again, the dynamics are simple, because this function is linear. In the above situation, repeatedly crediting interest corresponded to iteration of a simple function, f>.. We put in an initial value of xo, and then calculate the value of x at later "times" n by the rule Xn = h.(xn-d. This use of a real valued function is much different from the usual idea of merely graphing it. We continue to develop this idea in this introductory chapter and subsequent chapters. As a next model, we consider the growth of a population at discrete time periods n where there is crowding or competing for resources. In comparison with the continuous differential equations, we get the discrete difference equations Xn = F,.(xn-d with
F,.(x) = Ilx(1 - x). (We have scaled the variables so the two constants a and b are equal, and use the single parameter Il.) Robert May (1975) noted that this simple model for discrete time intervals does not necessarily lead to a system which tends toward a steady state population. We study this example extensively in Chapter II, but at the moment we just remark that for 3 < Il < 1 + 6 1/:1 "'" 3.449, most initial values of Xo with 0 < Xo < 1 (including Xo = 0.5) do not tend toward a steady state solution, but tend to an orbit of period two: b = F,.(a) and a = F,.(b). For Il slightly larger than 1 + 6 1/ 2 , repeated iteration of Xo by F,. tends to a orbit of period four. For Il = 3.73, the dynamics are even more complicated, and the iterates of an initial Xo do not seem to tend to any period motion but move chaotically within the interval (0,1). For Il > 4, the set of points which stay in the interval [O,IJ for all forward iterates of F,.(x) turns out to be a Cantor set. The dynamics of iteration of points in this Cantor set can be analyzed and are very chaotic. The analysis of the dynamics on the Cantor set is by means of "symbolic dynamics." At any "time" n, the iterate Xn is either in the left half of the Cantor set or the right half. This distinction can be made by assigning a symbol (a 1 or a 2) for each time n. It is shown that this crude specification of the location for all times exactly determines the location at the initial state. By specifying a sequence of symbols, an orbit with a certain type of behavior can be shown to exist. Thus, although the dynamics of individual orbits are chaotic, the dynamics of all orbits can be completely understood and described by symbolic dynamics. This example for a one dimensional map has many of the properties of what is called a horseshoe for a two dimensional map. In two dimensions, there is again an invariant Cantor set and the dynamics are determined by symbolic dynamics. Many. equations used to model phenomenon are shown to be chaotic by proving the existence of a horseshoe. The horseshoe in two and higher dimensions is treated in Section 7.3. Subsections 7.3.1-4 treat various ways the horseshoe occurs for various functions or differential equations.
1.3 HIGHER DIMENSIONAL SYSTEMS
There are several lessons which the dynamics of iterates of the function F,. teachf' us. First, the simple algebraic nature of the function does not insure simple dynamb and in fact its iterates exhibit chaotic properties. Second, for iteration of a function th, evolution of the state is deterministic. The parameter is fixed and there is a set rule fOI determining the next state from the previous one, so the evolution of the state from 011' state to the next one is very predictable. Still, by taking many iterates, the state CIlI exhibit erratic behavior. The last lesson to learn from this example is that all errati, behavior is not always caused by changes in internal forces or stochastic effects, but CIlI also result from the nonlinear nature of the deterministic systems itself. Chapters II and III treat the dynamics of iteration of real valued functions. We do nfl' study the implications for modeling physical situations, but do study the mathematic" aspects of the subject. The quadratic example is studied extensively because of the wid, variety of types of dynamics it exhibits. Another thing to note about Chapters II all< III is that only simple analytic tools are used to prove quite sophisticated results: th, Mean Value Theorem, the Intermediate Value Theorem, differentiation of real valu.., functions, and a few tools from Point Set Topology. The fact that the mathematic;, tools used are simple does not mean that Chapters II and III are trivial. In fact, mall of the key ideas of the book are introduced in this low dimensional setting. Thus, th, difficulty of the material in Chapters II and III comes from the range of ideas and th, development of the machinery and not the technical nature of the arguments.
1.3 Higher Dimensional Systems Solutions of differential equations in two variables can not exhibit chaos in the sem;, we are using the term. Thus if x E R2 and f(x) is a given function with values in R (a vector field on the plane), then
X= lex) is an ordinary differential equation in two variables. The Polncare.Bendlxson Theorl'l' states that if x(t) is a solution which stays bounded as t goes to infinity, then eithf' (i) x(t) tends to a periodic solution, or (ii) x(t) repeatedly passes near the same fixf" point. In fact in the second case the motion x(t) still can not be very complicat('': Thus, chaos does not occur for differential equations in the plane. The existence of periodic behavior itself is sometimes interesting. One example whl'l it has been shown that there is an attracting periodic solution is the Van der PI equations
x=y-x3 +x iJ = -x. These equations were originally introduced to model a "self exciting" electric circuil If (XO,J/O) is any initial condition other than (0,0) (no matter how small), then t!; solution (x(t), yet)) tends to the periodic motion. Thus the system excites itself to thi
periodic motion, and other than transitory effects, the natural motion of a solution i the single periodic motion. There are many other special equations which model population growth, nonlinel' oscillators, and many other physical situations. We do not deal with the modelilJ aspect of the subject, but develop the mathematical tools to study such equations. Iterates of functions from R2 to R2 can be used to model populations with more th .. one generation or stages in life (as well as many other situations). The iterates of SUI functions can exhibit all the complexities of the quadratic map on the real line. In t\\ variables, the map can even be invertible (which the quadratic map on the real line '
6
I. INTRODUCTION
not) and still exhibit chaos. The simplest algebraic form of such a map is the Henon map,
(:~ )
= FA,B (:) = (
A-B: - X2) .
This map has two parameters A and B. It was introduced by Henon (1976), not as a model of any particular physical situation, but as a map with a simple algebraic form which could easily be studied by means of computer simulation. He found that his map exhibited a "horseshoe" and a "strange attractor" for different parameter values. For A = 5 and B = 0.3, this map has an invariant Cantor set in the plane, a "horseshoe," and the dynamics of iteration of points in this Cantor set are chaotic. (This can be proved rigorously.) For A = 1.4 and B = -0.3, it can be proved that there is a trapping region in which points which start in this region remain bounded for all further forward iteration. There is then an "invariant set," A, of points which remain in the trapping region for both forward and backward iteration. Computer simulation indicates that this map is chaotic on A, and A can not be broken up into smaller dynamically independent pieces (FA.S appears to be topologically tmnsitive on A). See Figure 3.1. For this reason this invariant set is called a strange attractor. The fact that F1.4,-o.3 is transitive on A has not yet been proved rigorously, but much of the structure of the dynamics of this map is well understood.
FIGURE 3.1. Henon Attractor For larger values of A, A = 5 and B = -0.3, the Henon map has an invariant Cantor set A. (The set A is homeomorphic to the cross product of the usual one dimensional Cantor set C with itself, A R::: C X C.) Points not on A become unbounded under either forward or backward iteration. To a point on A there corresponds a unique string of symbols which exactly determine the orbit: in this sense the dynamics on A are equivalent to the dynamics given by the symbolic dynamics. This type of invariant set is called a "horseshoe". It has many of the properties of the map F,. on the line discussed above. We study the horseshoe and the Henon map for these parameter values more in Chapter VII. Another way that maps can arise is by considering forced differential equations. For example, we can add a forcing term to the Van der Pol equation and get :i: = y - x 3
+ X + g(t)
y= -x where g(t) is a T-periodic function of t, e.g. g(t) = cos(wt) for some fixed w. The solutions of such nonautonomous equations can cross each other as t evolves, but by
1.3 HIGHER DIMENSIONAL SYSTEMS
7
following the solutions through a complete period, we can get a well defined map = (X(T»). (Xl) Yl y(T)
Following the solution for multiples of the period yields higher iterates of the map,
(Ynxn)
=
(X( nT) )
.
y(nT)
This period map (a special case of what is called a Poincare map) can exhibit the type of chaotic behavior which we have been discussing. These differential equations can be thought of as equations in three variables, X, y, and t with dt/dt = 1. Because one of the variables is time, some people speak of these as differential equations in two and a half dimensions. Thus differential equations in two and a half dimensions (or three) can exhibit chaos. The advantage of considering iteration of maps first in this book is that this same phenomenon can be observed in lower dimensions, in fact in one dimension for noninvertible maps. A type of forced van der Pol type equation was one of the first equations for which "random" or "chaotic" behavior was observed. Cartwright and Littlewood (1945) first discovered this and later Levinson (1949) gave a much simpler analysis of the situation. Much more recently, Levi (1981) gave a more complete analysis and showed how a horseshoe occurs in this situation. Another problem which has been shown to have a horseshoe is the motion of three point masses moving under Newtonian attraction. Sitnikov (1960) studied the situation of the motion of three masses where the third mass m3 moves on the z-axis and the first two masses are equal, ml = m2, their positions remain symmetric with respect to the z-axis, and they move in an elliptical orbit. The simplest description is for m3 = 0 where the first two masses affect the motion of the third but not vice versa. Then ml and m2 move in an elliptical orbit which is nearly circular in the (x, y)-plane. The third body oscillates up and down the z-axis. The motion of the first two bodies can be solved independently of the motion of the third mass. Thus the forces of masses ml and m2 on m3 can be thought of as a time dependent effect on its motion. Since their motion is periodic, the effect is a time periodic term in the equations for the motion of m3. The state of the third mass is determined by its height up the z-axis and its velocity. Thus the equations are time periodic in two variables. By means of a calculation which has become called the Melnikov method, it is possible to show there is a horseshoe for this motion, i.e., there is an invariant Cantor set for the period map and the behavior of an orbit with initial conditions in this Cantor set can be prescribed by sequences of symbols. One way of interpreting the symbolic dynamics for the Cantor set is that it is possible to specify any sequence of integers nj (as long as all the nj ~ No for some No), and then there is an orbit for which the first two masses make Rj revolutions between the j-th and the j + 1 return of the third mass. Since the sequence is arbitrary, knowing the length of time it took for the third mass to return the last time gives no information about how long it will take to return the next time. This unpredictability, or chaotic nature, of the motion even for very deterministic motion is one of the characteristics of the type of situations we consider. In fact the number of revolutions can be set to infinity. If Rio = 00 for some jo < 0 and all the Rj for j > jo are bounded, then the orbit is a capture orbit, i.e., for this orbit the first two masses rotate about their center of mass and the third mass comes in from infinity and is captured in bounded motion as time goes to plus infinity. Similarly, if njo = 00 for some jo > 0 and the Rj for j < jo are bounded, then there is an ejection orbit. The symbols can be thought
8
I. INTRODUCTION
of 88 a very inexact determination of the state of the system at a given time. By prescribing this inexact information for all time the exact present state of the system is determined: it is shown there is a unique initial condition which goes through this lM1Quence of rough prescriptions of states at the future times. Thus, the existence of a motion of a prescribed nature can be shown by means of such symbolic dynamics. Because of its usefulness in determining types of behavior, symbolic dynamics is one of the fundamental tools we use in our study of Dynamical Systems. In addition to Sitnikov, this situation was studied extensively by Alekseev (1968a, 1968b, 1969). Also see the book by Moser (1973) and the expository paper by Alekseev (1981). A set of differential equations which have been much discussed are the Lorenz equations given by :i; = -lOx
+ lOy
iJ
= 28x - y- xz . 8 z = -3z+xy.
These equations were introduced by Lorenz (1963) as a very rough model of the fluid flow of the atmosphere (weather). He studied these equations by means of computer simulation and observed chaotic behavior. After much investigation we understand the features which are causing the chaos in these equations and have a good geometric model for their behavior, but no one has analytically been able to verify that these particular equations satisfy the conditions of the geometric model.
FIGURE 3.2.
Lorenz Attractor
So far, we have emphasized that differential equations and iterations of functions can exhibit chaos. Another important idea is determining when two systems have the same dynamics, i.e., the two systems are topologically conjugate. A topological conjugacy can be thought of 88 a continuous change of coordinates. (When this idea is introduced, we explain why we can not usually find differentiable change of coordinates, i.e., differentiable conjugacies.) Two systems which are topologically conjugate have the same long term dynamical behavior: both are chaotic or not, both have the same type of periodic motions, both are "transitive" or not, etc. Thus, topological conjugacy is an important concept when we classify systems up to equivalent dynamical behavior. With this idea of topological conjugacy in place, we then want to define what it means for a system to have the same dynamics as all nearby systems in some suitable space of dynamical systems. What is of interest in this consideration is not the stability of a particular solution of the system, but the structure of the dynamics of the whole
1.4 OUTLINE OF THE TOPICS OF THE CHAPTERS
'.
system. For this reason, Smale (1965) introduced the idea of structural stability based on earlier ideas of Andronov and Pontryagin (1937). A system J is called Itrocturnll.,. stable provided that every system 9 which is near enough to J is topologically conjugat. to J. Thus, the dynamics of J are robust under perturbations. Some systems can bl both structurally stable and chaotic, but there certainly are others with only a finitf number of periodic orbits which are structurally stable (Morse-Smale systems). The main analytic feature, which allows us to show that a system is either chaotil or structurally stable, is the existence of a "hyperbolic structure. For these system~ the attention is focused first on the points which have some kind of recurrent behavior. the set of "chain recurrent points." The rough idea is that a system has a hyperboli. structure if each point in this chain recurrent set has a splitting into directions which are contracting and those which are expanding. (For maps on the real line, there i, only one direction and it is either everywhere contracting and the motion is periodic. or everywhere expanding and the motion can be chaotic.) The idea of a hyperboli< structure is a generalization of the notion of splitting into various eigenspaces for a sing\. linear map. The splitting at the different points has to be compatible: there needs to bl a compatibility along an orbit and also as the point varies with the invariant set. Thus. some infinitesimal displacements (tangent vectors) at a point in the chain recurrent set are contracted and others are expanded. The existence of such a structure is crucial to be able either to apply the mechanism of symbolic dynamics or to show a systen I is structurally stable. Because these topics are the focus of this book, we mainly treat systems with such a hyperbolic structure on the set of all chain recurrent points. Thes.· ideas can be used in other situations where merely an invariant subset has a hyperboli. structure. The question of the existence of capture for three point masses is such " system with a hyperbolic structure on only an invariant subset of states. We do show that a homoclinic orbit (a orbit which tends to the same fixed point as t tends to both ±oo but has other behavior between) is a situation where the proper assumption le8(b to a subsystem with a "horseshoe," i.e., an invariant subsystem with chaotic dynamics. These form an important category of situations where it is possible to prove that ther.. is chaotic dynamics. It
1.4 Outline of the Topics of the Chapters In order to introduce the main perspective early in a situation where the analytic and geometric aspects are simpler, we first consider the iteration of functions of a single real variable. In this setting, we can show how to use sequences of interval~ whose images cover the next interval to determine orbits (symbolic dynamics), and how some functions have dynamics which are equivalent to that of any of its perturbations (structural stability). These two ideas which are introduced in Chapter II are central to the approach to Dynamical Systems which we present. In fact, most of Chapter II is used heavily in the rest of the book with the exception of the rotation numbel of a homeomorphism on the circle, and the rotation number is an important idea ill Dynamical Systems. Chapter III treats topics related to iteration of a function of one variable, and ill particular a situation which leads to complicated dynamics. One example of such dynamics is the invariant Cantor set for the quadratic map which we obtain in Chaptel II. However in Chapter III, we connect these ideas with the notion of chaos. We al&· separate these sections from Chapter II, because they can easily be postponed until later in the book. In fact Sharkovskii's Theorem is only used to motivate subshifts 01 finite type. In turn, subshifts of finite type are not used again until Chapter VII whell we discuss horseshoes and toral automorphisms In higher dimensions. The material 011 chaos, Liapunov exponents, period doubling cascade, and the zeta function is not used
10
I.
INTRODUCTION
but is included since these concepts often occur in the literature on Dynamical Systems. After Chapter III, we turn to higher dimensions for iteration of functions and solutions of ordinary differential equations. In this setting, the analysis near a fixed point is much more analytically complicated than the one dimensional situation treated in Chapter II. As a first step in this analysis, Chapter IV deals with the dynamics of linear systems, both for iteration of functions and for differential equations. The PerronFrobenius Theorem could easily be skipped in this chapter as well as some of the detail on solutions of linear differential equations. Chapter V treats the local dynamics of a nonlinear system near a fixed point as a perturbation of the linear system. If the linearization has only contracting and expanding directions and no neutral directions (is hyperbolic) then the dynamics of the linearization determines the local dynamics of the nonlinear system. The proof of these results uses the method of finding a contraction mapping: a mapping from a space of functions to itself is constructed, which is shown to be a contraction mapping and whose fixed point is a conjugacy between the linear and nonlinear system. In addition to determining the local behavior near a periodic point, this chapter introduces some of the methods of proof which are used for the more global results in later chapters. The method of proof of the conjugacy to the linearized system (the Hartman-Grobman Theorem) can be used to show that linear Anosov diffeomorph isms are structurally stable. The proof of the existence of the stable manifold for a fixed point can be modified to prove the existence of stable manifolds for a "hyperbolic invariant set." Thus Chapter V is also an introduction into the analytic methods for multidimensional dynamical systems. Certainly some of the material could be skipped or treated more quickly: for example, the proof of the existence of solutions of differential equations, the subsection on the Van der Pol equations, the Poincare-Bendixson Theorem, the proof of the Stable Manifold Theorem, and the Center Manifold Theorem could be skipped. This chapter is not dependent on Chapter III. It also does not use the material on the invariant Cantor set from Chapter II. In Chapter VI, we treat the local bifurcation of fixed and periodic points, i.e., how the fixed points and their stability varies as a parameter is varied. We do not give a thorough treatment but give the three basic and generic bifurcations for one parameter families of functions. Chow and Hale (1982) has a much more thorough treatment of this aspect of Dynamical Systems. The Implicit Function Theorem is heavily used to prove these theorems. The section on the one dimensional saddle-node and period doubling bifurcations depend only on Chapter II and could be covered at that time. The rest of the sections depend on Chapter IV. Chapter VI is not used extensively in later parts of the book but there are reference to some of these results. In Chapter VII, we return to more complicated invariant sets. We give a number of examples and introduce the basic ideas which can be used to understand the structure of this type of dynamics. The unifying feature of these examples is their hyperbolic nature: they have some directions which are contracted and other directions which are expanded. A rigorous expression of these ideas requires some concepts from differential topology or differential geometry. We introduce the ideas necessary and mainly discuss the situation in Euclidean space or on tori where the need for machinery is minimized. This chapter depends on the treatment of the invariant Cantor set of Chapter II (which could be combined with the section in Chapter VII). Also the main ideas of local dynamics from Chapter V are important for a clear understanding of this material. The key sections are those on the Birkhoff Transitivity Theorem, the Geometric Model Horseshoe, the Horseshoe from a homoclinic point, attractors, the Solenoid, and Morse-Smale Systems. Section 7.5.1 on Markov Partitions for Hyperbolic Toral Anosov Automorphisms also makes a good connection back to the material on symbolic dynamics but is only needed for Section 9.6 on the Markov Partition for a Hyperbolic Invariant Set. Many of the
1.4 OUTLINE OF THE TOPICS OF THE CHAPTERS
11
other examples are interesting but not essential for the later chapters. Chapter VIII returns to the theme of Chapter III on measurements of chaos. All of these concepts appear widely in the literature in Dynamical Systems. We mainly introduce the the definitions and main ideas without proofs except for the material on topological entropy. None of this material is used in an essential way in the rest of the book. Chapter IX gives the theory of the analysis of these hyperbolic systems and their structural stability. This chapter uses heavily the ideas from the Stable Manifold Theorem in Chapter V. The examples of Chapter VII give substance to the general definition of this chapter. Conley's Fundamental Theorem can be treated by itself at the very beginning, but the ideas of a Liapunov function in Chapter V and a gradient flow given in Chapter VII are helpful for motivation of the ideas. The more global theorems of this chapter are interesting and important, but much of the richness of Dynamical Systems is revealed in the examples and more local theory of the previous chapters. Chapter X gives some generic properties, Le., properties which are true for most systems in the sense of Baire category. This chapter also is the first place where perturbations of systems are treated to any extent. In the earlier chapters, we addressed conditions which assure that the dynamics can not be changed by any perturbation. In this chapter, we address the question of what types of perturbations do cause changes in dynamics. In that sense, this chapter relates to Chapter VI. Although the definition of transverse intersection is given earlier in the book, some of the proofs require more theorems from transversality theory. Therefore, we include a section which states thlf needed theorems and gives references. In terms of generic properties, we prove that. most systems have hyperbolic periodic points. We also give some necessary conditionS for structural stability and a counter example to the density of structurally stable sy!H tems. This last example can be considered as an open set of diffeomorphisms, each o~ which can be made to undergo a bifurcation by means of a correctly chosen perturbation. Finally, Chapter XI treats some additional topics in stable manifold theory, mainly concerned with smoothness of the manifolds. In addition to stating some general theorems, we prove that the hyperbolic splitting for an Anosov diffeomorphism on T2 is. differentiable. We also return to prove the differentiability of the center manifold. Certainly this last topic could be treated right after Chapter V.
CHAPTER II
One Dimensional Dynamics by Iteration In this chapter we introduce the concepts of Dynamical Systems through iterati" of functions of a single real variable. The main idea is to understand what the orbit , a point is like when iterated repeatedly by the same function: Xl = /(xo) and Xn /(X n -1) for n ~ 1. In one dimension, the graph of the function can be used quite easil to analyze the iterates of a point by a function. Also because of the restriction to 011 dimension, the concepts of a periodic orbit being attracting or repelling become a simpi consequence of the Mean Value Theorem. Two other important Ideas In Dynamic. Systems are the notion of topological conjugacy and symbolic dynamics. Two functiOl' / and 9 are said to be topologically conjugate if there is a homeomorphism h wil g(x) = h 0 / 0 h- 1 (x). In one dimension, we are able to prove quite simply th" simple examples such as /(x) = 2x and g(x) = 3x are topologically conjugate. For tb quadratic family, F,,(x) = 1'2:(1- x), we can show that if 1',1" > 4 then F,.. and F,..' 81 topologically conjugate even though both maps have infinitely many periodic point We also use the quadratic map to introduce the ideas of symbolic dynamics. The id, is that we label two intervals II and 12 and are able to show that sequences of these t\\ intervals are in one to one correspondence with orbits of the map under iteration. Tb final section of the chapter concerns iteration of homeomorphisms of the circle. In th I setting the average rotation of a point under repeated iteration, the rotation numb(' 1 determines whether the map has periodic points or not. In later chapters we return to study periodic orbits, the possibility of topologic, conjugacy, and related topics for iteration of functions in higher dimensions. We al~ apply these notions to solutions of differential equations. The main reason to treat til case of iteration of one variable first is that it reduces the topological complicatioll while introducing the new concepts of dynamical systems.
2.1 Calculus Prerequisites In this short section, we recall three results from calculus which we use extensiv('1 In the material on iterating a real valued function of one real variable. As we proce<' through the material on dynamical systems, we discuss further results from calculll which we need. In particular material on derivatives in higher dimensions and til Implicit Function Theorem is given later.
Critical Points The points where a function attains its maximum and minimum are important i studying graphs. They are also important in determining the dynamics of iteration I the function. Therefore, we review the definition and terminology of points where til derivative is zero. Assume / is differentiable. A point a is called a critical point of / provided I'(a) = t It is called a nondegenerate critical point provided I'(a) = 0 and I"(a) :j:. O. F•. example, if F,,(x) = I'x(l-x) then x = 1/2 is a nondegenerate critical point. The POitl o is a degenerate fixed point for the function /(x) = x 3 • 13
14
II. ONE DIMENSIONAL DYNAMICS BY ITERATION
Intermediate Value Theorem Assume J is an interval and I : J ~ R is a continuous function. Further assume that a, bE J with a < b, and z is a value between I(a) and I(b). Then there is a (at least one) c with a < c < b such that I(c) = z. This property says that I attains all the values between I(a) and !(b). This is essentially the result that I([a, bJ) is connected. This result is also called the Theorem 01 Darboux.
Another consequence of continuity Assume I : J -+ R is a continuous function and Cl < I(a) < C2 for a in J. Then there is an open set U about a in J such that CI < I(x) < C2 for all x E U.
Mean Value Theorem Assume J is an interval and I : J -+ R is a continuous function. Further assume that a, b E J with a < b and that I is differentiable on (a, b). Then there exists a C with a < c < b such that I(b) - I(a) = j'(c). b-a This says that there is a point C for which the slope of the tangent line at c equals the slope of the secant line from (a, I (a» to (b, I (b». This result is also called the Theorem 01 Lagrange.
The Chain Rule The chain rule concerns the derivative of a composition of functions. If I, 9 : R -+ R are two functions, then we write 10 g(x) for I(g(x», i.e., for the composition of I with g. If both I and 9 are differentiable, then (f
0
g)' (x)
= j'(g(x»g'(x)
or
= j'(y)g'(x)
where y = g(x). The point is that the derivative of I must be taken at the correct point. Composition of functions is at the heart of the dynamics of iteration of functions: taking a point Xo E R, we want to find XI = I(xo), X2 = !(xt>, and Xn = !(Xn-I). Thus Xn = 10 ... o/(xo) where we take the composition of! with itself n times. We write and by induction
!2(X) = 10 I(x) r(x)
= 10
r-I(x)
forn~1.
Note that this is composition of the function and not squaring of the formula which defines I. Thus if !(x) = x(l - x) then 12(X)
= I(x(l
- x»
= x(l -
x)[l - x(l - x)].
For higher iterates, it quickly gets impossible to write out the formula for r(x). In this context, the chain rule to the composition of a function I with itself n times can be written as
2.2 PERIODIC POINTS
15
where xi = li(xo). In particular, if I(x) = x(1 - x), Xo 1/3, and n XI = 1(1/3) = 2/9, X2 = j2(1/3) = 1(2/9) = 14/81. The derivative is f'(x) and
= 3 then = 1 - 2x,
(J3)'(1/3) = 1'(14/81)/'(2/9)/'(1/3) = (1 -
2!~)(1 - 2~)(1 - 2~)
53 5 1 = 81 ·9·3· The point is that we do not need to compute 1 3 (x) or (J3)'(x) explicitly. l is the inverse of I, r2(x) = (J-I)2(X), and r"(x) = If I is invertible then (J-I)n(x) for -n < o. We also write 1° for the identity, jO(x) = x. Using the chain rule, it can be shown that the derivative of the inverse is the reciprocal of the derivative • of the function,
r
(J-I)'(X) = I'(:-d
where X-I = I-I(x). Terminology about types of functions Throughout this book, we use some terminology about functions and the extent of their differentiability. Let J be an open subset of R (possibly all of R), and let I : J --t R be a function. If I is continuous we say that I is Co. If I is differentiable at each point of J and both I and f' are continuous, then I is said to be continuously differentiable or a C I function. Given r ~ I, if I together with J
:s :s
r
2.2 Periodic Points Throughout this section I : R ...... R is continuous. Sometimes I is only defined on an interval in R. We add the assumption that I is C l or C2 whenever we take first or second derivatives of the function. In this section we discuss the existence of a fixed point X with J(x) = x or periodic point with I"(x) = x for some n > 1. The fixed points or points of low period can be found by solving the equations J(x) = x and I"(x) = x. For higher periods, It is not practical to solve these equations. Below, we discuss an analysis using the graph of I which is more useful for determining points of higher period.
r
Definition. A point a is a periodic point 01 period n provided (a) = a and P (a) '" a for 0 < j < n. (Note that n is the least period.) If a has period one then it is called a fixed point. If a is a point of period n, then the forward orbit of a, O+(a), is called a periodic orbit. The notation we use for all points fixed by (not all of least period n) is
r
Per(n, f) = {x : r(x) = x} Fix(J) = Per(I, f).
and
Finally, a point a is eventually periodic 01 period n provided there exists an m > 0 such that Im+"(a) = Im(a), so p+n(a) = Ij(a) for j ~ m, and Im(a) is a periodic point.
16
II. ONE DIMENSIONAL DYNAMICS BY ITERATION
x. The fixed points satisfy the equation x 3 - x = x, so The points x = ±l have their first iterates go to the fixed point 0, I(±I) = 0, so ±l are eventually fixed. As mentioned in the last section, the critical points satisfy f'(x) = 3x 2 - 1 = O. So the only critical points are x = ±l/va. The second derivative is f"(x) = 6x, which is nonzero at ± 1/ va so the critical points are nondegenerate.
Example 2.1. Let I(x)
are the points x
= X3 -
= 0, ±y2.
As the above example illustrates, to find the fixed points it is enough to solve the equation I(x) = x. To find the periodic points of higher period is harder. Of course, theoretically it is possible to find the expression for the higher iterate r(x) and then solve rex) = x. For n much larger than four this becomes algebraically very complicated even in the simplest examples. Exercise 2.6 asks you to find the points of period two and four for the map FI'(x) = I'x(1 - x). Sometimes it is possible to find points of higher period by understanding the graph of See Exercises 2.4 and 2.5. Also computer iteration can be used to find the points of higher period. As mentioned in the preface, the programs Phaser by Ko<;ak (1989) and MacMath by Hubbard and West (1992) are two programs which can carry out such iteration.
r.
Definition. For a continuous function I, the lorward. orbit of a point a is the set 0+ (a) = {Jk (a) : k ~ o}. If f is invertible then the backward. orbit is defined using negative iterates: 0- (a) = {Jk(a) : k $ OJ. The (whole) orbit of a point a is the set O(a) = {Jk(a): - 00 < k < oo}. If I is not invertible then we sometimes make choices and construct X-l,X-2,'" where f(x- n ) = X-n+l or X-n E f-l(x_n+l)' (For a noninvertible map, I-l(y) = {x : f(x) = y}.) Before discussing how to find the forward orbit using the graph of the function, we give some more fundamental definitions connected with convergence and stability of periodic points.
Definition. A point q is forward. asymptotic to p provided 1J3(q) - li(p)1 goes to zero as j goes to infinity. If p is periodic of period n then q is asymptotic to p if Ifin(q) - pi goes to zero as j goes to infinity. The stable set of p is defined to be W"(p) = {q : q is forward asymptotic to pl. If f is invertible then a point q is said to be backward. asymptotic to p provided If' (q) Ji(p)1 goes to zero as j goes to minus infinity. If I is not invertible then a point q is said to be backward. asymptotic to p provided there are sequences P_j and q-i with Po = p, qo = q, I(p-j) = P-j+l, f(q-j) = q-j+l, and Iq-j - p-il goes to zero as j goes to infinity. In either case, the unstable set 0/ P is defined to be
WU(p) = {q : q is backward asymptotic to pl.
Definition. A point p is Liapunov stable (L-stable) provided given any f > 0 there is a 6> 0 such that if Ix - pi < 6 then 1J3(x) - Ji(p)1 < f for all j ~ O. This says that for x near enough to p the orbit of x stays near the orbit of p. A point p is asymptotically stable provided it is L-stable and W'(p) contains a neighborhood of p. If p is a periodic point which is asymptotically stable it is also called an attracting periodic point or a periodic sink. If p is a periodic point for which W"(p) is a neighborhood of p, it is called a repelling periodic point or periodic 80Un:e.
2.2 PERIODIC POINTS
17
Example 2.2. Let I(x) = x 3 . The fixed points satisfy I(x) :..: x 3 - X, so x 0, ±l. Note that the fixed points correspond to points where the graph of I, {(x,J(x)}, intersects the diagonal {(x,x)}. See Figure 2.1. The graph of I is monotone. On (0,1) the graph of I lies below the diagonal and I(x) < x. Thus for x e (0,1), x > I(x) > /2(X) > .. ·/"(x) > O. Because this sequence is monotone it must converge to a fixed point and so to O. (The fact that a bounded monotone sequence of points on an orbit must converge to a fixed point is left to Exercise 2.2.) Thus (0,1) c W'(O). For backward iterates, x < ,-1(X) < 1 for x e (0,1). As j goes to minus infinity, Ji(x) is monotonically increasing to 1,60 (0,1) C WU(1). A similar analysis shows that (-1,0) C W'(O) and (-1,0) C WU(-I) since for xe(-l,O), x < f(x) < 12(x) < ... < - 1
rex)
<0 < ,-n(x) < ... < ,-2(x) < ,-1(X) < x.
Because W'(O) contains a neighborhood of 0, the fixed point 0 is attracting. For x > I, P(x) is monotonically increasing. If this forward orbit were bounded it would have to converge to a fixed point. Since there is no fixed point larger than 1, the orbit must go to infinity as j goes to infinity. As j goes to minus infinity, /i(x) is monotonically decreasing to 1 60 WU(I) :> (1, (0). Again for x < -1, Ii (x) is monotonically decreasing to -00 as j goes to infinity, and li(x) is monotonically increasing as j goes to minus infinity, WU( -1) :> (-00, -1). Because we have analyzed the iterates of all points in the line, the following list summarizes the stable and unstable sets: W'(O) = (-1,1) WU(O) =
to}
W'(±I) = {±1}
W U (I) = (0, (0) W U (-I) = (-00,0).
Because WU(1) is a neighborhood of 1 and (W'(l) is not), the iterates of points near 1 move away and the fixed point 1 is repelling (unstable). Similarly, -1 is repelling.
FIGURE 2.1. Stair Step Method for Example 2.2
II.
18
ONE DIMENSIONAL DYNAMICS BY ITERATION
To understand the orbit more fully we use the graph of f. We give a general description. but the above example is useful in illustrating the construction. Take a point Xo. Draw a vertical line segment from (xo.xo) to the point on the graph (xo,J(xo)). Then draw the horizontal line segment from this point on the graph over to the point on the diagonal (f(xo),J(xo)). Graphically this shows how we map from the point Xo to XI = f(xo)· Repeating this process with line segments from (Xlo xd to (Xlo f(xd) and then from (xI.f(xd) to (f(xd.f(xd) determines the next point X2 = f(xd. Repeating this process determines the orbit. Figure 2.1 draws these stair steps for Example 2.2 for a point 0 < Xo < 1. In the figure. we extend the vertical lines down to the axis to indicated the points Xn more clearly. Because of the appearance of this figure. this process is often called the stair step method to determine the phase portrait. The reader might want to draw the stair steps for Example 2.2 with Xo > 1. -1 < Xo < O. and Xo < -1. The stair step method can easily be adapted to find inverse iterates when the function is one to one (monotonic). In this case take the point x. and draw the horizontal line segment from this point (x.x) on the diagonal to the point on the graph: since the function is monotonic there is only one such point and it is (f-I(X).X). Now draw the vertical line segment from (f-I(X).X) to the diagonal (f-I(x).f-I(x)). or to the axis (f-I(X),O). In Figure 2.1. applying this process to X3 we get X2 = f- I (X3). XI = f- 2(X3). and Xo = f-3(X3). This process can either be thought of as (i) reversing the steps to find the forward iterate or (ii) thinking of x as a function of y (interchanging the roles of X and y) which is the way the inverse function is explained in a calculus course (although we did not redraw the the graph with the x-axis up). If the graph is not monotonic. in applying the stair step method to the inverse there may be more than one point on the graph which can be reached by a horizontal line segment from a point (x. x) on the diagonal. Each of these multiple points gives a possible inverse image of x. See Figure 2.2 for f(x) = -x + x 3 • Xo = 0.231. and rl(xo)
= {x~l ~ -0.854.x~1
~ -0.246. and x~l
= l.l}.
Function f(x) = -x+x3• Xo = 0.231. and preimage f-I(XO) = {x~f ~ -0.854.x~1 ~ -0.246. and x~f = 1.l}
FIGURE 2.2.
The above process describes how to calculate the forward and backward orbit of points. As is Example 2.2. this information can be used to determine whether a fixed point. is attracting or repelling. Because this process is involved. it is useful to have a criterion for a periodic orbit to be attracting which only uses the derivative of the function at points along the periodic orbit. The next theorem gives just such a criterion.
2.2 PERIODIC POINTS
19
Theorem 2.1. Assume f : R - R is a Cl function. (a) Assume that P is a fixed point with 1f'(p)1 < 1. Then P is an attracting fixed point (or asymptotically stable or a sink), i.e., W'(p) contains a neighborhood ofp. (b) Assume that p is a periodic point of period n with IU"),(p)1 < 1. Then p is an attracting periodic point. REMARK 2.1. By the Chain Rule the derivative in part (b) can be calculated as the product of the derivative of f along the orbit: IU")'(p)l = I!,(p"-dl" . I!'(pdl1!'(Po)1 where Pi = Ji(p). PROOF. (a) Because 1f'(p)1 < I, there is an interval (p-f,p+fl and a ~ with 0 <,x < 1 such that 1f'(x)1 ~ A for x E (p - f,p + fl. Then by the Mean Value Theorem, for x E (p - f,p + fl, there is a z between x and p with If(x) - pi
= If(x) -
f(P)1
= 1!,(z)I'lx -
pi ~ 'xIx - pi < Ix - pI·
Thus f(x) is closer to p than x so f(x) E (p- f,P+f), and we can repeat the argument. By induction This argument shows that fi(x) E (p - f,p + £] for all j ~ 0 proving that pis !rstable. Because ,Xi Ix - pi goes to zero, fi(x) converges to p as j goes to infinity, proving that p is attracting. The above argument can be understood using the stair step method. There are two cases, 0 < !,(p) < 1 and -1 < f'(p) < O. (The case f'(p) = 0 can be analyzed in a similar manner, but the exact argument depends on the form of the graph of ,.) In the first case with 0 < f'(p) < I, for x > p, x > f(x) > J2(x) > ... > p as can be seen from the graph. Thus fi(x) converges to p from above. See Figure 2.1. Similarly, if x < p, Ji(x) converges to p from below. Now consider the case with -1 < f'(P) < O. If x> p, then f(x) < p and f2(x) > p. Because 0 < ( 2 )'(p) < I, p < f2(X) < x. Thus Ji(x) converges to p also. See Figure 2.3.
FIGURE 2.3.
Near an Attracting Fixed Point with Negative Derivative
II. ONE DIMENSIONAL DYNAMICS BY ITERATION
20
r.
(b) For the periodic case. consider 9 = Then g(p) = P and Ig'(p)1 < 1. By part (a). gi(x) converges to P for x near p. By continuity of P for 1 ~ j ~ n it follows that IJi(x) - Ji(p)1 goes to zero as j goes to infinity for all j and not just for multiples of n.
o
The following theorem gives the comparable criterion for a periodic point to be repelling. Theorem 2.2. Assume I: R -+ R is a C' function. Assume that p is a periodic point of period n with l(r),(p)1 > 1. Then p is repelling. Moreover. for all sufficiently small intervals 1 about p and x E 1 there is a k = k", such that J'
2.2.1 Fixed Points for the Quadratic Family IJ
In this subsection we consider the family of quadratic maps F,,(x) = ,n(1 - x) for IJ > 1. We first find the fixed points.
> 0 and mainly for
Proposition 2.3. The fixed points of the family of quadratic amps F" are 0 and
< IJ < 1 and repelling for IJ > 1. The < 3 and repelling for 0 < IJ < 1 and 3 < IJ. The
PI' = IJ - 1. The fixed point 0 is attracting for 0 IJ
fixed point PI' is attracting for 1 < IJ only critical point is x = 1/2 which is nondegenerate.
7
PROOF. The fixed points satisfy x = ,n_,n2. so are the points x = 0 and x = PI' == as claimed. To determine their stability. note that F;(x) = IJ - 21Jx. Thus IF; (0) I = IIJI so 0 is attracting for 0 < IJ < 1 and repelling for IJ > 1. On the other hand IF; (PI') I = IIJ - 2(1J - 1)1 = 12 -IJI· Thus PI' is attracting for 1 < IJ < 3 and repelling for 0 < IJ < 1 and 3 < IJ. The critical points satisfy F;(x) = 1J(1-2x) = 0. so the only critical point is x = 1/2. Finally. F::(x) = -21J so x = 1/2 is a nondegenerate critical point. 0
We leave to Exercise 2.6 the determination of the points of period two and their stability. As for the eventually fixed points. note that F,,(I) = 0 so 1 is eventually fixed. By symmetry of the graph. if we let PI' = 1 - PI' = 1/IJ then F,,(p,,) = PI' so PI' is also eventually fixed. The following proposition indicates which points of F" go to infinity. and so which other points are potentially periodic. Proposition 2.4. Assume IJ > 1. If x goes to infinity.
[0.1) then Fd(x) goes to minus infinity as j
< 0. F;(x) = IJ - 2,n > 1. Thus for Xo < 0. 0 > Xo > F,,(xo) > > ... > Fd(xo) is decreasing. If this orbit were bounded it would have to
PROOF. For x
F~(xo)
f/.
2.2.1 QUADRATIC FAMILY
converge to a fixed point which would be a negative point. Since no such fixed poil exists, Ft(xo) goes to minus infinity. If Xo > I, then F,,(xo) < 0 so Ft(xo) = Ft- I 0 F,,(xo) goes to minus infinity. The next proposition shows that all the points in (0,1) converge to the fixed poil p" for the range of I-' for which p" is attracting. The solution to Exercise 2.6 sho" that this proposition is false for I-' > 3. However for 3 < I-' < 1-' .. most points in (0, I are asymptotic to an orbit of period two. For 1-'1 < I-' < 1-'2, most points in (0,1) fl! asymptotic to an orbit of period four. This continues and there are I-'n such that t, I-'n-I < I-' < J.tn, it can be shown that most points in (0,1) are asymptotic to an orbit· period 2n. (Such a proof can not be done directly by calculating /3" .) The I-'n convpr· to 1-'00' and for I-' > 1-'00 it is not always the case that most points in (0,1) are asymptOI to a periodic orbit. In Section 2.4, we see that for I-' > 4 there are many points in (n. which are not asymptotic to a periodic orbit. In Section 3.4, we return to a furth· discussion of this period doubling cascade. Proposition 2.5. Assume 1 < I-' < 3. If x E (0,1), then F,{(x) converges to p" goes to infinity. Thus W'(p,,) = (0,1).
/l.'
PROOF. (a) First consider 1 < I-' ::; 2. The maximum of the graph occurs at x = 1; For this range of parameters, F,,(1/2) = 1-'/4 ::; 1/2. Using the graph it is then elf', that PI' ::; 1/2. The function is thus monotonically increasing on (O,p,,) and the grsl lies above the diagonal. Thus for Xo E (O,p,,), FJ(xo} is a monotonically increasil sequence which must converge to the fixed point p,.. See Figure 2.4. Similarly, on t I interval (PI" 1/2] the function is monotonically increasing and the graph lies below II diagonal. Thus for Yo E (p", 1/2] FJ(yo) monotonically decreases to p,.. Finally, I, Xo E (1/2,1), F,,(xo} E (O,I/2) so Ft(xo} converges to p,.. This completes the proof t· this range of parameters.
FIGURE 2.4. p" < Yo::; 0.5
Iteration of Xo and Yo for 1 < J.t < 2, 0 < Xo < PI" and
(b) Now assume that 2 < I-' < 3. Note that PI' > 1/2. (i) Consider the interval [1/2,p"j. Because is monotone on [1/2,p"j, to find I image it is enough to determine the iterates of the end points:
F;
F~([1/2,p,,]) = F"(IP,,,J.t/4j) =
[J.t(~)(1 - ~),p"l.
22
II. ONE DIMENSIONAL DYNAMICS BY ITERATION
0.5 F\5) P F (.5) 11
FIGURE 2.5.
11
11
Iteration of x = 0.5 for 2 < II. < 3
We want to show that this image is contained in [1/2,pj.lJ, /1.(/1./4)(1 - /1./4) > 1/2, or (/I.-2}(1J. 2-2/1.-4). The roots of JJ.2-2/1.-4 are 1±5 1/ 2 so this factor is negative for /I. < 3. The first factor /I. - 2 > 0 so the prod uct is negative as desired. Thus we have shown that F;(1/2) = /I.(/I./4}(1 - /1./4) > 1/2 and F;([1/2,pj.lj) C [1/2,pj.lJ. See Figure 2.5. The second iterate of Fj.I is monotone on [1/2,pj.lj, so the graph of intersects the diagonal once on the interval [1/2,pj.lj, and this is at Pw Because the graph of is above the diagonal but below 1/2 on [1/2,pj.lj, all the points in this interval converge to Pw (ii) Next, let Pj.l = 1//1. < 1/2 as above, so Fj.I(pj.I) = Pj.I' Fj.I ([Pj.l , 1/2]) = Fj.I([1/2,pj.I]), and F;([P", 1/2]) C [1/2,p"j. Thus all the points in [p", 1/2J also converge to p" by the results of the previous case. (iii) Now, consider Xo E (O,p,,). The function F" is monotonically increasing on this interval and the graph lies above the diagonal. Thus F~(xo) is monotonically increasing as long as the iterates stay in this interval. Because F,,(pj.l) = P", the first time that an iterate F~(xo) leaves the interval (O,p,,) it must land in [Pj.l,p"L Le., FI~(xo) E [P",P"J for some k > O. Then F!+j(xo) converges to PI' as j goes to infinity. (iv) Finally, if Xo E (p", 1), then F,,(xo) E (O,p,,), so further iterates converge to Pw Combining the cases we have proved the proposition. 0
o > /1.3 -411.2+8 =
F;
F;
2.3 Limit Sets and Recurrence for Maps We have defined periodic points and found points of low periods. In the examples we study with complicated dynamics, there are points which are not periodic but whose orbits keep returning near where it started. Orbits with such properties are said to have a kind of recurrence. In this section we introduce several such concepts. Although we give a few examples in this section, the concepts should become clearer as they are used throughout the rest of the book. In this chapter, we mainly use the concepts of the Q and w-limit sets of a point, an invariant set, and the nonwandering set. The other concept could be postponed until they are used. A related concept is that of convergence of an orbit to another. We have already . ,rious times in our defined what it means for q to be forward asymptotic to P study of dynamical systems we need a more general concep{,; .,,' give this in terms of the Q and w-limit sets of a point which is introduced in this section. These concepts are also used to define the points with a kind of recurrence called the limit set. We give these definitions in this section for a continuous map f : X --+ X where X is a complete metric space with metric d. Several of the definitions use only the forward
2.3 LIMIT SETS AND RECURRENCE FOR MAPS
23
iterates of f and make sense even if f is not invertible. We do not distinguish these cases but just assume f is a homeomorphism throughout and let the reader determine which definitions make sense for noninvertible maps. In the next section, we return to the study of the quadratic map which gives an example of several of the types of recurrence defined in this section. Definition. A point y is an w-limit point of x for f provided there exists a sequence of nk going to infinity as k goes to infinity such that lim d(r"(x),y) =
k-oo
o.
The set of all w-Iimit points of x for f is called the w-limit set of x and is denoted by w(x) or w(x, I). Other hooks also use the notation of Lw(x, I) or L+(x,f). If the map f is invertihle then the a-limit set ofx for f is defined the same way but with nk going to minus infinity. The set of all such points is denoted by a(x) or a(x, f). Again other books also use the notation of L,,(x, I) or L - (x, I).
rex)
Example 3.1. If = x is a periodic point then both limit sets equal the orbit, w(x) = a(x) = O(x) = {x, f(x), ... , r-l(x)}. In this case the w-limit set is finite, and not connected if the period n is greater than 1. Example 3.2. The Section 2.8 on difl'eomorphisms of the circle gives a more complete discussion of this example on the circle. Let p ¢ Q. If fp(x) = x + p is considered a map on the reals, then it induces a map Tp : 8 1 --+ 8 1 of the circle by taking points modulus one. For any x E 8 1, it is shown in Section 2.8 that the forward and backward orbits of x, O+(x) and O-(x), are each dense in 8 1 , and also that w(x) = a(x) = 8 1 . Next we define various types of invariance for a subset. Definition. A subset 8 c X is said to be positively invariant provided f(x) E 8 for all x E 8, i.e., f(8) c 8. A subset 8 c X is said to be negatively invariant provided /-1(8) c 8. Finally a subset 8 c X is said to be invariant provided f(8) = 8. Thus, if 8 is invariant, then the image of 8 is both into and onto 8 but we do not require that 8 is negatively invariant. If f is invertible (a homeomorphism) and 8 is an invariant subset for f, then the conditions that f(8) = 8 and that f one to one implies that 8 is negatively invariant. Notice that a periodic orbit is always an invariant set. We show below that any w(x) is always positively invariant and often in~riant. Example 3.3. Let F5(x) = 5x(1 - x) on R (which is not invertible). We show in the next section that Fs has an invariant Cantor set A. In Section 2.5, we show that there are points x* with w(x*) = A and O+(x*) dense in A. The following theorem gives many of the basic properties of the limit set of a point. Theorem 3.1. Let f : X --+ X be a continuous map on a complete metric space X. (a) For any x, w(x) = nN>ocl(Un>N{r(x)}). If f is invertible then a(x) = nN
24
11. ONE DIMENSIONAL DYNAMICS BY ITERATION
d(f"(x),w(x» goes to zero as n goes to infinity. Similarly, if O-(x) is contained in some compact subset of X, then a(x) is nonempty and compact, and d(f"(x),w(x» goes to zero as n goes to minus infinity. (e) If Dc X is closed and positively invariant, and xED then w(x) C D. Similarly, if I is invertible and D is negatively invariant and xED, then a(x) C D. (f) Ify E w(x) then w(y) C w(x) and (if I is invertible then) a(y) C w(x). Similarly, if I is invertible and y E a(x) then a(y),w(y) C w(x). PROOF. (a) Let y E w(x). Then y E cl(Un>N{f"(x)}) by the definition of w-Iimit set. Therefore y E nN>ocl(Un>N{fn(x)}). This proves one inclusion. Now assume y E nN>ocl(Un>N{fn(x)}). Then for any N, y E cl(Un>N{fn(x)}). Now for each N tl\ke kN > kN--1 with kN ~ N I\nd d(fkN(X),y) < j,. -;rhen d(fkN(x),y) goes to 1.I'm IIIC N I(Of'fl to Infinity 111111 y E w(x). Thill proves the other inclusion, so the sets are equlll. (b) This is clear by the group property of iteration. (c) For any x, w(x) is closed because it is the intersection of closed sets by part (a). To see that it is positively invariant, let y E w(x). Then there is a subsequence nk with d(f"'(x),y) going to zero as k goes to infinity. For any fixed j E N, d(fno+j(x).Jj(y» goes to zero by the continuity of I. Therefore !iCy) E w(x). This proves that w(x) is positively invariant. If I is invertible, the above argument works for any j E Z proving that w(x) is invariant. If I is not invertible but O+(x) is contained in a compact subset, then we argue as follows. Let the subsequence nk be as above with d(fno(x),y) going to zero as k goes to infinity. At least a subsequence of the points Ino-l(x) converge to a point z which also must be in w(x). By continuity, I(z) = y. This proves that w(x) is invariant. (d) If O+(x) is contained in some compact subset of X, then
cl(
U {fn(x)}) n?N
is compact, so w(x) =
n cl( n?N U {In(x)})
N?O
is compact. Also w(x) is the intersection of nested nonempty sets and so is nonempty. Finally, assume d(fn(x),w(x» does not go to zero. Then there exists some fl > 0 and a subsequence of iterates nk with nk going to infinity such that d(fni (x), w(x» ~ fl. The points f"' (x) are bounded so have a limit point z outside of w(x) contradicting the definition of w(x). Thus the distance must go to zero as n goes to infinity. (e) This follows from either part (a) or the following argument. Let xED. By the invariance of D, f"(x) E D. Because D is closed, all the limit points of f"(x) must be in D, proving that w(x), a(x) C D. (f) Let Y E w(x). The set w(x) is invariant so by part (e) w(y) C w(x). A similar 0 argument applies to the a limit sets. We now define one type of invariant set which can not be dynamically broken into smaller pieces: a minimal set. Then the next proposition characterizes a minimal set in terms of the w-Iimit sets of points. With this characterization we can give a simple example of a minimal set.
Definition. A set S is a minimal set for I provided (i) S is closed, nonempty, invariant set and (ii) if B is a closed, nonempty, invariant subset of S then B = S. Clearly, any periodic orbit is a minimal set.
2.3 LIMIT SETS AND RECURRENCE FOR MAPS
25
Proposition 3.2. Let X be a metric space, I : X -+ X a continuous map, and SeX a nonempty compact subset. Then S is a minimal set if and only if w(x) = S for all xE S. PROOF. First assume S is minimal. Since S is invariant, for any XES, w(x) C S. Since S is compact, w(x) is a nonempty invariant subset of S. Because S is minimal, w(x) = S. This proves one direction of the implication of the result. Now assume w(x) = S for all XES. Assume 0 -# Be S is closed and invariant. If x E B, then S = w(x) c B c S so B = S. 0
Example 3.2 (revisited). Let Tp be the map of the circle SI as before with p 'i. Q. The whole circle SI is a minimal set for Tp because w(x) = SI for all points. See Section 2.8. Maps of the circle are the principal examples in which we consider minimal sets in this book, although their other types of maps with minimal sets. Now we can give the basic definitions of different types of recurrence. Definition. Using the definition of w-limit set and a-limit set, we say that x is wrecurrent provided x E w(x) and that x is a-recurrent provided x E a(x). Definition. For a map I : X -+ X, a point p is called nonwandering provided for every neighborhood U of p there is an integer n > 0 such that r(U) n U -# 0. Thus there is a point q E U with r(q) E U. The set of all nonwandering points for I is called the nonwandering set and is denoted by n(f). Definition. An f-chain 01 length n from x to y for a map < f.
I is a sequence {x
"0, ... ,Xn = y} such that for all} ~ j ~ n, d(f(xj-d,xj)
Definition. Let Y
c X.
The f-chain limit set olY lor I is the set
n;-(Y,/) ={x EX: for all n ~ } there is an y E Y and an f-chain from y to x of length greater than n}.
Then the lorward chain limit set 01 Y is the set n+(Y,f) =
n n;-(Y,f) . • >0
(This set is the analogous object to the w-limit set of a point but for chains.) If Y is the whole space X, we usually write n+(f). Similarly, n;(Y, f) ={y EX: for all n
~
} there is an x E Y and an
f-chain from y to x of length greater than n}, and the backward chain limit set 01 Y is the set
W (Y, f) =
nn;
(Y, f) .
• >0
(This latter set is like the a-limit set of a point.) Finally, the chain recurrent set all is the set R(f) = {x : there is an f-chain from x to x for all f
> O}
= {x: x E n+(x,/)} = {x: x E W(x,J)}. It takes a little bit of work to see that x E n+(x,f) is equivalent to there being an f-chain from x to itself for all f (without assuming the chain is arbitrarily long).
26
II. ONE DIMENSIONAL DYNAMICS BY ITERATION
Definition. We define a relation", on RU) by x '" y if Y E n+(x) and x E n+(y). Two such points are called chain equivalent. It is clear that this is an equivalence relation. The equivalence classes are called the chain components of R. (For flows these chain component$ arc actually the connected components of R which is the reason for the name.) If f has a single chain component on an invariant set A then we say that f is chain transitive on A. With these definitions, the reader can easily check that
c1 (U{w(x, f) ; x E Xl)
c nu) c
RU).
In thi~ I'hllptl'r we mainly use the conl'ept of the nonwandering set and the a and w-Ilmit SI'!"; of \lllin!s. In Cha)lh'rS Vll, IX, and X, the chain recurrent set is used extensively.
2.4 Invariant Cantor Sets for the Quadratic Family In this section we start our study of maps with more complicated dynamics by studying the Quadratic family, FI'(x) = I'x(1-x). For suitable choices of the parameter 1', the complicated dynamics of F" is not spread out over the whole line but is concentrated on an invariant Cantor set. Cantor sets occur often in maps with complicated or "chaotic" dynamics throughout this book. The use of the Quadratic map on the line allows us to introduce such complicated dynamics in a relatively simple setting and give a rather complete analysis. After determining the invariant Cantor set in this section, the next section contains a discussion of symbolic dynamics. By introducing symbols to describe the location of a point, the dynamics of a point in the Cantor set can be determined by means of a sequence of these symbols. Because many different patterns of symbols can be written down, points with many different types of dynamics can be shown to exist. We start in Subsection 2.4.1 by giving the properties which characterize a Cantor set and reviewing the construction of the middle-a Cantor set in the line. After this treatment, we show how a Cantor set arises for the Quadratic map in Subsection 2.4.2.
2.4.1 Middle Cantor Sets Definitions. Let X be a topological space and SeX a subset. The set S is nowhere dense provided the interior of the closure of S is the empty set, int(cI(S)) = 0. The set S is totally disconnected provided the connected components are single points. In the real line a closed set is nowhere dense if and only if it is totally disconnected. In other spaces these two concepts are different. For example a curve in the plane is nowhere dense but it is not totally disconnected. The set S is perfect provided it is closed and every point p in S is the limit of points qn E S with qn t p. A set S is called a Cantor set provided it is (i) totally disconnected, (ii) perfect, and (iii) compact. We write L(K) for the length of an interval K. Construction of a Middle-a Cantor Set Let 0 < a < 1 and (3 > 0 be such that a + 2(3 = 1. Note that 0 < (3 < ~. Let So = I = [O,lJ. Start by removing the middle open interval of length a;
G = «(3,1 - (3)
and
SI = [\G.
Notice that G is the middle open interval of [ which makes up a proportion a of the whole interval. Then
2.4.1 MIDDLE CANTOR SETS
27
is the union of 2 closed intervals each of length {3, L(Jj ) = {3 for i = 0,2. We label the intervals 50 that Jo is to the left of J2. (We use 0 and 2 rather than 0 and 1 or 1 and 2 because of the connection we make below with the representation of numbers in base 3.) For i = 0,2, let G j be the middle open interval which makes up a proportion 0 of Jj , 50 L(Gj ) = 0{3. Let Jj .o be the left component of Jj \ Gj and Jj.'l be the right component. For jlth .. 0 or 2, the length of the component J j ,.,. Is (J'l, L(J".Ja) .. L(Jj,)(1 - 0)/2 = {32. Let S2
= SI \
(Go UG2)
U
=
Jj,.jo.
j,.,.-0.2
This set, S2, is the union of 22 closed intervals, each of length {32. By induction, assume Sn-I is the union of 2n - 1 closed intervals each of length pn-I , and labeled as J j , •... ,;.. _, for all comhinations of jk = 0 or 2. For each of these intervals, let Gj, ..... be the middle open interval which makes up a proportion 0 of the interval J), ..... j._,. Thus, L(Gj, ..... j._,) = oL(Ji, ..... j._,) = Opn-I. Let
i._,
where JiJ •...•in_,.o is to the left of Jj , •...•in_ ..2. Each of these components has length pn, L(Jj, .....j .. _,.j..) = L(JiJ ..... jn _,)(I- 0)/2 = pn. Let
U
U
G;II" .. ,in_l
J j ...... j ••
iJ ... ·.i .. =0.2
Since each interval Jj , .....i. _, of Sn_I'yields two closed intervals of Sn, Sn has 2(2 n - l ) = 2n closed intervals. This completes the induction step of the construction. Finally, let
We check that C satisfies the properties of a Cantor set in the following claims.
Claim 1. The set C is nowhere dense. PROOF. The intervals which make up Sn have length pn. Therefore, for any point p E C there is a point q within a distance pn which is not in Sn and 50 not in C. Therefore C has empty interior and is nowhere dense. 0
Claim 2. The set C is perfect. The sets Sn are closed, and C is the intersection of these sets, 50 C is closed. Let p E C and i a positive integer. Take n such that {3n < 2- j and let K be the component of Sn containing p. Then K n Sn+1 is made up of two intervals. Let qj be one of the endpoints of the interval of K n Sn+1 not containing p. Then (i) qj :I: p, (ii) Ip - qjl < {3n < 2- j , and (iii) q E C (since all the end points of the Sj are in C). This gives a sequence of points qj E C, qj :I: p, and qj converges to p. 0 PROOF.
REMARK 4.1. The total length of Sn is 2npn = (l-o)n. Since 2{3 < 1, the total length of Sn goes to zero as n goes to infinity. Therefore, all these niiddle-o Cantor sets have measure zero.
211
II. ONE DIMENSIONAL DYNAMICS BY ITERATION
There are Cantor sets with positive measure, but they are not formed by taking a uniform proportion of the intervals out at each step. Let On > 0 be numbers such that 0:=1 (1 - on) > 0, i.e., E:::I on < 00. At the n-th stage remove the middle-on from each of the previous intervals to form the set Sn. Then the total length of Sn is 1 - On times the total length of Sn-I. By induction, the total length of Sn is 0;=d1 - OJ). As n goes to infinity, it follows that C = n:=1 Sn has measure 0;:1(1 - OJ) > O. REMARK 4.2. In the construction, it is not important that each component of Sn has length exactly (3n. Let Ln be the maximum length of a component in Sn,
Ln = max{L(K) : K is a component of Sn}.
What is important is that Ln goes to zero as n goes to infinity. This property implies that C is nowhere dense. If the sum of the lengths of the components of Sn does not go to zero, then C has positive melL~urf'. Also to prove that. C is perff'ct, it is not necessary that for f'1l.Ch component K of S" t.hat K n S,,+ I has two component.s; what is nc<,essary is that for each compoOl'nt K of S" that there is a k ~ n + 1 such that [( n Sk has at least two components. REMARK 4.3. We show that the quadratic map has an invariant set which is a Cantor set in the line but which is not a middle-o Cantor set. Later in the book we see many other invariant sets which are Cantor sets, or locally the cross product of a Cantor set and a disk (in some dimension).
Middle-Third Cantor Set and Ternary Expansion Any number x in the interval [0,1] can be written in ternary expansion as 00
X
.
"3 Jk10 ' = '~ "=1
Most numbers have a unique expansion but
In fact for any nand
ii, ... ,in,
=~ik ( ~ik)+in+l ~ 3" 3n ~ 310 10=1
10=1
+ ~ ~. ~
k=n+1
310
Thus repeating 2's in a ternary expansion plays the same role as repeating g's in decimal expansions, and the nonunique representations given above are the only ones that occur. expansions have nonunique representations. To make the comparison between the ternary expansion and the middle-third Cantor set, we want to avoid the use of 1's as much as possible. For that reason we make the following conventions on the coefficients j = (il,h, ... ) which we use in the cases where there is a nonunique representation: (i) we use the j for which there is an n with in = 2 and i" = 0 for all k > n but do not use the j for which there is an n with in = 1 and i" = 2 for all k > n, (ii) we use the j for which there is an n with jn = 0 and j" = 2 for all k > n, but do not use the j for which there is an n with jn = 1 and j" = 0 for all k > n, and finally (iii) we use Er..1 2·3-" to represent the number 1. With these restrictions, the representation is unique.
2.4.1 MIDDLE CANTOR SETS
29
Next we consider the set of numbers which use only O's and 2's as coefficients in their ternary expansion: 00
Co =
.
{L ~: : jn = 0 or 2}. n=1
Note for points in Co there is a unique ternary expansion even without any restrictions. (Their expansions use only O's and 2'8 and automatically obey the above rules for choices of the representation.) We want to represent Co as the intersection of sets. Define 00
S~ =
.
{L ~~ : jk = 0 or 2 for 1 :5 k :5 n and jk = 0, 1, or 2 for k > n}. k-I
We want t.o show that S~ = S" hy induct.ion on 71, where the set.s Sn are those given above for the lIIiddltl-third CI\ntor sct. First, note that S; contains all numbers except t.hose which nre 1/3 + y where y has an (ternary) expansion in 3- k for k > 1; thus o < y < 1/3. The two endpoints are contained in S; because 1/3 = ~~2 2· 3- k and 2/3 can be represented with jk = 0 for k ~ 2. (We do not use expansions whose coefficients end with a 1 followed by repeated O's or a 1 followed by repeated 2's.) Therefore, S; = S I. Also note that the left ends of the two intervals in S; have ternary representations which end in all O's, and the right endpoints end in all 2's. Now assume by induction that S~_I = Sn-I, and that all the left endpoints in S~_I have ternary expansions whose coefficients are jk = 0 for k ~ n and the right endpoints have ternary expansions whose coefficients are jk = 2 for k ~ n. Let JiI .....Jn_. be an interval in Sn-I = S~_l' Since its left endpoint has a ternary expansion with jk = 0 for k ~ n, J i ......in _. \ S~ is the.set of points
where 0 < y < 3- n • (Again, the open interval is removed because we do not use expansions which end in 1 followed by repeated O's or 1 followed by repeated 2'8.) Therefore, S~ n Jj, .....)n_. = Sn n Ji ......in _. for any Jj, .....in_I' and so S~ = Sn. Also note that the left endpoints of the intervals in S~ can be represented by expansions that end in repeated O's and the right endpoints by expansions which end in repeated 2's. This completes the proof by induction that S~ = Sn for all n. Now letting C be the middle-third Cantor set,
n=O
-n 00
-
S n'
n=O
= Co.
Thus the middle-third Cantor set consists of those points whose ternary expansion contains all 0'5 or 2's and no 1'5. Finally, we note a connection between the points in C which are not endpoints and their ternary expansions. The endpoints of Sn for some n are those points in C which are the endpoint of an open interval K where K C R \ C. They are also the points
II.
30
ONE DIMENSIONAL DYNAMICS BY ITERATION
which end in repeated O's or in repeated 2's. Because there are points in C with ternary expansions which have both jk = 0 for arbitrary large k and jk' = 2 for other arbitrary large k', there are points which are not the endpoints of any ofthe S~. These are points which have points arbitrarily close which are not in C, but they are also accumulated on by points of C from both sides.
2.4.2 Construction of the Invariant Cantor Set Remember that F,,(x) = ILx(1 - x) is the quadratic map, and I = [0,1). We showed before that if x 1. I then F,,(x) goes to minus infinity as n goes to infinity. Therefore we want to find the x-values such that F::(x) E I for all n E N. Here, and in the rest of this book N is the set of nonnegative integers, N = {n E Z : n ~ a}.
The maximum of F,,(x) occurs at the critical point x = 1/2 where F,,(x) takes the value 1L/4. We consider values of the parameter IL for which this value is greater than one, IL > 4, so F,,(l) covers I and F;;I(l) n I = F;;I(I) is the union of two intervals which we label II and 12 , F;; I (I) n I = I I U h (See Figure 4.1.) Later we take IL > 2 + 5 1/ 2 which insures that IF~(x)1 x E F;;I(I). This bound on the derivative makes some calculations easier.
> 1 for
1 ...--+--~--,
o FIGURE 4.1.
1
I I X-
Intervals for the Quadratic Family
Theorem 4.1. Assume p. > 4, and let A" = {x : F::(x) is a Cantor set.
E
I for all n ~ O}. Then A"
We introduce the following notation which is used throughout the proof:
n F;;k(li.)
n-I
I io ,... ,;0_1 =
= {x : F!(x) E I;. for 0 ~ k ~ n - I}
k=O
where ik = 1 or 2, and Sn =
n
n-l
k=O
k=O
n F;;k(l) = n F;;k(ll
U 12)
=
U io.il , ...• in_1 = 1,2
1;0,;1"",;0_1'
2.4.2 CONSTRUCTION OF THE INVARIANT CANTOR SET
31
For the proof that AI' is a Cantor set, it is not necessary to label the components of the set Sn so carefully. However, in a later subsection we use the labeling to prove that the map F,. restricted to AI' is "topologically conjugate" to a map on a space of symbols. (A topological conjugacy is a homeomorphism which takes orbits of one map to orbits of another map.) We introduce the notation here to avoid proving twice the lemmas used in both proofs. The conjugacy of F,. restricted to AI' with a map on a space of symbols is interesting because it allows us to prove that the periodic points of F,. are dense in AI' and that F,. has a dense orbit in AI" We start the proof with the following lemma. Lemma 4.2. Assume Jl > 4. For all n E III the following statements are true. (a) For any choice of the labeling with io, ... , i n - I E {I, 2}, I,o .....,~_. n Sn 1'0 ......,,_ •. 1 U 1'0 ...... ,,_ •• 2 is the union of two nonempty disjoint closed intervals (which are subsets of 1'0 ..... ,,,_.). (b) For tlvo distinct choices of the labeling (io, ... , in-d f (i ,i~_d, 110 ..... ,,,_. n I,:, ..... ,~_. = 0, so Sn is tIle union of2n disjoint intervals. (c) The map F,. takes the component 110 ...... ,,_. of Sn homeomorphically onto the component 1........,,_. of Sn-I.
o,'"
REMARK 4.4.
Notice from the definition of 1'0 ......,,_., n-I
1'0 ...... ,,_. =
n F;;"(l•• ) = {x: F!(x)
E I ••
for
"=0
°$ k $ n -I},
so it is characterized by the forward orbit of points. In Exercise 2.16, the reader is asked to determine the order on the line of all the intervals with three labels, 110 ., •• ,•. In terms of images of these intervals, part (c) says that F,.(Iio ......,,_.) = 1, ...... ,"_. where the first label is dropped. The reader should check why this is true from the above characterization of these intervals. PROOF. We prove the lemma by induction on n. For n = 0, So = I = [O,IJ and there is nothing to check. For n = 1, let G I = {x E I : F,.(x) > I} and SI = F;;I(l) n I = 1\ G I . Then, G I is an open interval in the middle of I because F,.(I/2) > 1, and SI = In SI is the union of two nonempty disjoint closed intervals as claimed, II U 12 with 1'0 c I for io = 1,2. The map F,. is monotonically increasing on [0, 1/2J, so F,. maps h homeomorphically onto I = So. Similarly, F,. is monotonically decreasing on [1/2, I], so F,. also maps 12 homeomorphic ally onto I = So. This verifies the induction hypothesis for n = 1. Although we do not need this step, take n = 2. The map F" is monotone on each of the separate intervals II and h For io = 1 or 2, F,.(Iio) :) II U 12 , so S2 n 110 = 1'0 n F;I(Sd is the union of two intervals, 1'0.1 U 1'0.2 where F,.(I.o.,,) = lie, and F,. is a homeomorphism from 1'0.,. onto I, •. Since there are four intervals 110 .• " this verifies all three conditions of the induction hypothesis. Now assume the lemma is true for n and we verify it for n + 1. Let 1'0 ..... ,,,_. be a component of Sn. Then F,.(Jio''''''n_.) = 1, ....... ,,_. is a component of Sn-Io and 1••.....• ,,_. :) I"''''''n_. n Sn = I"''''''n_ •. 1 U 1, ....... ,,_ •. 2. Therefore,
1'0 ..... ,,,_. n Sn+J = F;I(Sn) n Iio''''''n_. = F;;I(Sn n 1"''''''n_.) n 110 = [F;;I(I'l''''''n- •. d U F;;I(I"''''''n_ •. 2)J n 1'0
is the union of two nonempty diSjoint closed intervals, giving condition (a). Since there are 2n choices of the index for l io ''''''n_.' Sn+l is the union of 2(2n) = 2n + 1 intervals,
II. ONE DIMENSIONAL DYNAMICS BY ITERATION
32
giving condition (b). The map F,. is monotone on lio •...• i" ••• j. so it maps homeomor· phirally onto 1.......... _•. ). giving conditions (c). This completes the verification of the induction step. and so verifies the lemma. 0 With this lemma. some of the properties of a Cantor set can be verified. Since each of the sets Sn is closed. A = Sft is closed. If the lengths of the intervals in Sft go to zero. thf'n A is perfect as before. If the lengths of the intervals do not go to zero. then the intersection contains an interval and it is perfect. In any case. A is perfect. However. to prove that A is nowhere dense. we need to show that the maximal length of an interval in Sn goes to zero. This fact is easier to prove if 1F~(x)1 > 1 for all x E II uh This condition on the derivative is true if and only if J.L > 2 + 5 1/2. Therefore in the rest of thi!l srction we assume JJ > 2 + 5 1/ 2 and return to the case for 4 < J1 ~ 2 + 5 1/2 in the next subsection.
n:=o
Lemma 4.3. TIle absolute value of the derivative is greater thall Olle for aJJ points in fl U f2 = 51. 1F;.(x)l > 1 for all x E 51. if alld only if JL > 2 + 5 1/ 2 . PROOF. The derivative of F,. is given by F~(x) = J.L - 2JLx. Also F:(x) = -2JL < O. so thl' smalll'St vahll' of 1F~(x)1 on fj occurs where F,.(x) = 1. Solving 1 = F,.(x) = JLX - JLX 2, we get x± = [JJ ± (J12 - 4JL)I/2J/(2JL). But for these points.
IF;' (x±)1
=
IlL - 2JL(JL ± (JL2 -
4J1)1/2)1
2J.L
= I 'f (JL2
- 4JL)I/21
= {jL 2 -
4JL)I/2.
Therefore, we need 1F~(x±)1 = {JL2 - 4JL)I/2 > 1, or J.L2 - 4JL -1 > O. The left side equals zero when JL = 2 ± 5 1/ 2 , and it can be checked to be greater than zero for JL > 2 + 5 1/ 2 . (The second value of JJ arose because we squared the equation when we solved it.) This proves the lemma. 0 The next lemma proves a bound on the maximal length of a component of 5ft in terms of a bound on the derivative. If J.L > 2 + 5 1/2 then this bound goes to zero as n goes to infinity. For an interval K. let L(K) be its length. Lemma 4.4. Let A = A,. = inf{IF;~(x)1 : x E II U 12}. Then the length of any component f. o ..... i • _. is bounded by A-n. L(1.o ....... _.) ~ A-n for aJJ possible choices of the labeling. PROOF. We prove the lemma by induction. Take n = 1. Let lio = [a. b]. Since F,.(1io) = [O,IJ, {F,.(a). F,.(b)} = {0.1}. i.e .• endpoints go to endpoints. By the Mean Value Thoorem, there is some c E [a. b] for which F,.(b) - F,.(a) = F~(c)(b - a). Then. 1 = IF,.(b) - F,.(all = IF~(c)I'lb - al ~ )"L(/io)' Therefore L(1io) ~ ),,-1. This proves thE- induction step for n = 1. Assume the result is true for n - 1. Take a component lio .....' •.• of Sn' Then the image. F,.(1.o ..... ,•.• ) = li ...... i . . . . is a component of Sn-l. and by induction L(1i ........ _,) ~ ),,-(n-I). As above by the Mean Value Theorem. there is acE lio •... ;._. = [a,b] with F,.(b) - F,.(a) = F;.(c)(b - a). so
L(1, ...... , •.• ) = 1F,.(b) - F,.(a)1 = 1F;'(c)(b - a)1 ~
Alb - al
= AL(1io ..... i•.• ).
and L(1.o .... .,•.• ) ~ )" -1 L(1i ..... .i •.• ) ~ step and the lemma.
)". n.
This completes the proof of the induction 0
2.4.3
THE INVARIANT CANTOR
SET
FOR,.. > 4
33
PROOF OF THEOREM 4.1. Assume that ~ > 2 + 5 1/ 2 . We mentioned above that the Sn are closed so All is closed. We have shown that the length of the components of Sn are shorter than A-n which goes to zero with n. The proof that All is perfect and nowhere dense is the same as for the Middle-o-Cantor set. Take pEAl" For j E N, take n such that A-n < 2- j . Then p E lio ..... in_1 for some choice of the component of Sn' Then lio .....• n_1 nSn+1 = lio ....... _I.1 U/io ..... in _ I .2 is the union of two intervals. Take YJ E lio ..... io_1 \ Sn+1 in the gap, and qj an endpoint of lio ..... in_l.i. where lio ..... in_l.i n is chosen so that p It lio ..... i o -I.i o ' Then Yj is not in Sn+1 and so is not in All' The Yj converge to p, so this shows that A" is nowhere dense. Also qj '" p and qj E All since it is an endpoint. The qj converge to p proving that All is perfect. This completes the verification that A" is a Cantor set for ~ > 2 + 5 1 / 2 • 0
2.4.3 The Invariant Cantor Set for J.1 > 4 The proof of Theorem 4.1 for alii' wit.h 4 < I' goes hack to the work of Fatou and Julia on complex functions. (This is the theorem that if all the critical points of a polynomial have orbits which go to infinity then the Julia set is totally disconnected.) See Blanchard (l984) or Carleson and Gamelin (1993). The first proof using strictly real variables is found in Henry (1973). (This proof does not prove that 1F;{x)1 ~ C)"n for x E All') The proof given below is mainly given in terms of real variables, but we use Schwarz Lemma of complex variables to prove the key estimate. For values of ~ with 4 < I' < 2+5 1/ 2 , there are points x E InF;I{l) with 1F;{x)1 < 1 and other points with 1F;{x)1 > 1. Thus in terms of the usual length on the line we do not have a )" > 1 such that L{/io ..... i .. ) ~ A-I L{lil ..... i n ) for all the subintervals lio ..... i •. There are several ways around this difficulty. One method is to look at higher iterates of F,. and prove there is a C ~ 0 and)" > 1 such that 1(F!),{x)1 ~ C)"k for all k ~ 0 and x E huh By taking an m ~ 1 with C)"m = ),,' > 1 we have that F;' is an expansion on II uh and L{lio, .... i n ) ~ ()"/)-IL{li m ....... ). This inequality can then be used to prove that A,. is nowhere dense. Guckenheimer (1979) and van Strien (1981) have a proof in this spirit using the fact that F,. has negative "Schwarz ian derivative." Also see Misiurewicz (1981) and de Melo and van Strien (1993). Newhouse (1979) has a proof for two dimensional maps which can be adapted to prove this result. Rather than give the details of this proof we give an alternative proof which proceeds by defining a new length on the interval [O,lJ. This idea is essentially present in the complex variable proof mentioned above. Definition. Assume that p{x) > 0 is a continuous density function on an interval K. If x,Y E K, then define the p-distance from x to y by dp(x,y) =
IL'l
p{t)dtl·
It is easy to check that this is a metric on K: dp(x, y) > 0 for x '" y, dp(x, x) = 0, dp(x, y) = dp(Y, x), and dp(x, z) ~ dp(x, y) + dp(Y, z). Let Lp(J) be the length of an interval J in terms of the distance dp • We think of p(x) defining a length or norm of a vector (infinitesimal displacement) at x by IIvllp.% = Ivlp(x) where Ivl is the usual length of a vector. REMARK 4.5. Assume K be an interval and p: K ..... R+ be a positive density function for which there exist positive constants C I and C2 with C I ~ p(x) ~ C2 for all x in K. It can be seen easily by applying estimates to the integral that
34
II. ONE DIMENSIONAL DYNAMICS BY ITERATION
for any two points r.y E K. Therefore. if we can show that I>"lengths of a nested set of iDten-als go to zero then the usual lengths also go to zero. The following lemma indicates the property which we want p(r) to have. Lemma 4.5. Let K be an interval and p : K -+ R+ be a positive density [unction. Let / : R -+ IR be a C t [unction. Assume that there is a ~ > 0 such that p(f(r»/f'(x)/ = IIf'(x)vll p./(%) > ~ p(x) I/vl/ p, % -
.1, wllf'rf' J is It sIIhllltf'rvnl o[ K with /(J) I".l/(.I» .~ >.J.,.(.I).
(Ilr.r ..
c
K. Tllen [or tllis subintf'.Tval J,
REMARK 4.6. Since the derivative of / is always greater than zero on the interval J. / is monotone on J.
PROOF. The following estimate proves the lemma:
11 11 1I ~ 1 ~p(s)
Lp(f(J» =
p(t) dtl
IE/(J)
p(f(s))j'(s) dsl
=
(where t = /(s))
IEJ
~
IEJ
p(f(s))f'(s) Ip(s) ds p(s) ds
IEJ
= ~Lp(J).
o The following proposition relates the condition given in terms of varying norm /I I/p,% in the last lemma to a condition for the standard norm. Proposition 4.6. Let / : R -+ R be a C t [unction. Let K be an interval and p : K -+ R+ be a positive density [unction [or which there exist positive constants C l and C 2 with C t :::; p(x) :::; C 2 for all x ill K. Assume that there is a ~ > 0 SUell that IIj'(x)vll p,J(%) ~ ~IIvl/p.%
provided x,f(r) E K. Then taking the positive constant C = C./C2 of the 71- til iterate of f satislies
:::;
1, the derivative
/(r)'(x)v/ ~ C~"/v/ for all n
~
0 and all r E J in terms of the usual absolute value.
f is immediately expanding in terms of some different norm II p .r, it is eventually expanding in terms of the Euclidean norm.
REMARK 4.7. Thus if
II
PROOF. The proof follows directly because the two norms are uniformly eqUivalent:
/(f")'(r)v/ ~ C;·II(r)'(x)vll p,Jn(%)
~ c;t~"1/vl/P.% ~ C;tCl~"/V/.
o To use Lemma 4.5, we are free to define the density function p(x). We want a choice that satisfies the assumptions of the above lemma. The metric p we use is related to the Schwarz Lemma in Complex Variables.
2.4.3 THE INVARIANT CANTOR SET FOR ,..
>4
35
Schwarz Lemma 4.7. Let D = {Z E C : Izl < I} be the open disk in the complex plane. Assume 1 : D --+ D is complex analytic with 1(0) = 0 and I(D) ! D (not onto). Then 1/'(0)1 < 1. For a proof, see Theorem 15.1.1 in Hille (1962).
Corollary 4.8. Assume I: D g(z) = 1 -lzl2 and
--+
D is complex analytic and /(D)
!
D (not onto). Let
1
p(z) = g(z)'
Then p(f(z»IJ'(z)1 = 1If'(z)vllp,f(z) < 1 p(z) IIvllp,z for all zED.
4.8. This norm II· IIp,z is called the Poincare norm and it induces a metric (distance between points) called the Poincare metric. The unit disk with the Poincare metric is an example of a non-Euclidean metric on a surface with negative curvature.
REMARK
PROOF. This theorem is also proved in Hille (1962), Theorem 15.1.3, but is stated in terms of the distance between points and not the length of vectors. We also give a sketch of the proof using the Schwarz Lemma, Fix Zo E D and let Wo = /(zo). For j = 1,2, there are fractional linear transformations Tj preserving D,
with lajl < 1 such that T1(0) = zo and T2 (wo) = O. Thus T2 0 /0 TdO) = O. The fractional linear transformations preserve the length of vectors in terms of II . lip,., so I = IIT2(wo)lIp,o
II 1IIp,wo _ IITl(O)lIp,%o II 1 IIp,o Thus by the Schwarz Lemma 1
> I(T2 0 f 0 Til'(O)1 > IIT2(wo)lIp,o . 1If'(zo)lIp,wo . IITl(O)lIp,zo II I IIp,Ulo 1I/'(zo)lIp,wo IIll1p,zo
IIllbo
IIll1p,o
o In Corollary 4.8 the absolute value of the derivative is less than one, while in the following lemma it is greater than one. The reason for this difference is that in Corollary 4.8, / maps D into D, while F,,(/) covers I. In Corollary 4.8, the unit disk is centered at the origin. For the map F" the corresponding interval we use is centered at 1/2, 80 a change of variables (given in the proof) modifies the Poincare norm to give the one stated in the following lemma.
36
II. ONE DIMENSIONAL DYNAMICS BY ITERATION
Lemma 4.9. Let p(x) = [x(l - X)J-I on (0,1). (This density is singular at x Then for 11 > 4 and x, FI'(x) E (0,1),
= 0,1.)
p(FI'(x))IF~(x)1
p(x)
We consider FI' as a map of a complex variable, FI'(z) = Ilz{l - z). Let '0 1/ 2 be the disk of radius 1/2 centered at the point 1/2. Notice that FI' takes circles of radius r about 1/2 onto circles of radius IJr2 about 11/4: FI' (I /2 + re i8 ) = 11/4 _llr2ei28. In particular, FI' takes the circle of radius 1/2 onto the circle of radius 11/4 about 11/4. Thus for 11 > 4, FI' takes the circle of radius 1/2 around the outside of the disk '0 1/ 2 , Also, Z = 1/2 is the critical point for FI' (the point where the derivative is 7.ero), and for I' > 4, FI' takes this critical point outside '0 1/ 2 : F I / 2 {1/2) = 11/4 is real and greater than one. Therefore FI'('O I /2) covers '0 1/ 2 twice, and F,-;I has two branches of the inverse t.aking '0 1/ 2 into itself and each being one to one. (The branches correspond to the inverse of F,. on the two intervals II and h when FI' is restricted to the real variable x.) Because there are two branches of the inverse, neither is onto all of '0 1/ 2 . By Corollary 4.8, each of these inverses is a contraction in terms of the Poincare metric on '01/ 2 , so FI' is an expansion for points z with Z, FI'{z) E '0 1/ 2 , To complete the proof we need to determine what the Poincare norm is on this disk which is not the unit disk centered at the origin. The map h{z) = 2z -1 takes '01/ 2 onto the unit disk D centered at the origin. If p«) = (1- «)-I is the usual Poincare norm on D, t.hen poh(z) = [4zz-2z -2i]-I. For real z, this gives a constant multiple of the norm stated in the lemma. However, the correct way to use the map h to "pull back" the length of vectors is not to just look at this composition but to define IIvll •.• = IIh'{z)vlip.h(.). Since h'(z) == 2, this induces the norm IIvll •.• = Ivll2zz - z - i]-I which for real z gives IItllI .... = IvI2-I[X(1-x)]-I. If we take two times this norm everywhere we get the norm stated in the lemma and it will also be expanded by F~. 0 PROOF.
The norm in the previous lemma is singular at 0 and 1. To get rid of this difficulty we modify it slightly to make the norm singular at the points - f and 1 + f which are outside of the interval [0,1]. We accomplish this change by mapping the disk of radius 1/2 + f about 1/2 onto the unit disk rather than the disk of radius 1/2. Proposition 4.10. Assume 4 < 11. Let p(x) = [(x for 0 < f < 11/4 - I and x, FI'(x) E [0,1], IIF~(x)lIp.F~("') IIll1p~ REMARK
+ ()(1 + ( -
x)]-I on [O,lJ. Then
= p{F,.(x))IF~(x)1 > 1. p(x)
4.9. Notice that this density is nonsingular on [0,1].
4.10. This Proposition can be proved by a direct calculation considering only real x rather than proving it as a corollary of the Schwarz Lemma. The difficulty is that this argument is somewhat involved and needs to consider various intervals to prove the inequality. Exercise 2.12 asks such a direct verification to be carried out for a different density function for which the calculation is simpler and for which there is no easy way t.o use a complex variable argument. REMARK
REMARK 4.11. The first iterate of FI' stretches lengths in terms of this new metric, so that we are ahle to take C = 1 and .x > 1 in the inequality 1I(F,~)'(x)lIp.F,~(",) ?: C.xk which definE'S a hYPE'rholic set.
2.5 SYMBOLIC DYNAMICS FOR THE QUADRATIC MAP
37
PROOF. We want to modify the proof for Lemma 4.9 by taking h(z) so that it takes the disk of radius 1 + t centered at 1/2 onto the standard unit disk D:
h(z) = 2(1
+ 2t)-I(z -
1/2).
Using this h a direct calculation shows that the induced norm is as follows:
IIvll •.•
(1 + 2t)2- 1 = (x + t)(1 + t - x)
Again this norm is a constant scalar multiple of the one stated in the Proposition. To prove the proposition we only need to show that (i) F" takes the critical point z = 1/2 outside the disk of radius 1/2 + f centered at 1/2, V I/ 2 +.' and (ii) F,,(V I / 2+.) covers V I/ 2 +< twice. In connection with the first condition, we saw in the proof of Lemma 4.9 that F;. mapped the critical point x = 1/2 to the point p./4, so we need t > 0 such that p./4 > 1 + tor f < p./4 -I, which we assumed. For the second condition, the proof of Lemma 4.9 showed that F" mapped circles of radius r centered at x = 1/2 onto circles of radius Ilr2 centered at p./4. Thus we need p.(1/2 + £)2 > p./4 + t, which is always true for t > D and IJ > 4. Choosing D < t < p./4 - 1 we get the result. 0 For p. > 4, this last proposition proves that the p-length of the intervals l io ..... in _. are bounded by .x.- n . Applying Remark 4.5, we get the Euclidean length is bounded by C,\ -n for some C > D. This proves Theorem 4.1 for all these values of the parameter.
2.5 Symbolic Dynamics for the Quadratic Map In this section, we show there is a way to represent the dynamics of F" on A by a map on a symbol space made up by points which are sequences of l's and 2's. The map on the symbol space is called the symbolic representation of the map and is said to give the symbolic dynamics for the map. Definition. Let N = {D, 1,2, ... } be the nonnegative integers as always. For p an integer with 2 :::: p, let {I, 2, ... ,p}N be the space of functions from N into the set { I, 2 ... ,p}. We also write this space as ~ or 1:, to shorten the notation. We define a metric on 1:, by
d(8, t) =
f 6(s;~
tk)
k=O
for s
= (so, Sl,"') and t = (to, tl,"')' where 6 i . _ (,J)-
{D
1
if i = j
ifi~j.
Exercise 2.13 is designed to clarify the topology induced by this metric. (Many authors use 2- k in the definition of the metric, but we use 3- k which makes what are called cylinder sets into balls in terms of this metric. See Exercises 2.13-14.) Finally, we define a shift map on 1:, by 0'(8) = t where tk = Sk+I, i.e., O'(so, SI, ... ) = (SI' S2, ... ). The reader can check that 0' is continuous with the above metric. The space 1:p with the shift map 0', (1:,,0'), is called the symbol space on p symbols or the full (one-sided) p-shift space. (Later when we are studying diffeomorphisms in dimensions greater than one, we discuss the two-sided p-shift where s: Z -< {1,2 ... ,p}.) Next, we define the map which takes the points in A" to points in 1:2 •
38
II. ONE DIMENSIONAL DYNAMICS BY ITERATION
Definition. Define h: A,. -+ E2 by'h(x) =j = (jo.jl •... ) where F!(x) E Ii.' Thus E F,.-k(I].) for all k. so for any n, x E n~=oF;k(Ii.) = Iio.it .... ,in which is a component of Sn+1' The map h is called the itinemry map.
x
Theorem 5.1. Let I' > 4 for the quadratic map F,. defined above. Then the itinerary map h : A,. -+ El defined above is a homeomorphism from A,. to E2 such that h 0 F,. = U 0 h. PROOF. First we check the condition that u
0
h
=h0
F,.. Let x E A,., s
= h(x),
and
t = h(F,.(x)). Then, F!(F,.(x)) E It. and F!(F,.(x)) = F!+I(X) E 1'.+1' Sotk = Sk+1, t U(5), and h(F,.(x)) u(h(x)). Next we check that h is onto. Let s E E2. As in the earlier theorem, the intersections
=
=
1. 0 , ..••• = n~=o F;k(I•• ) are nonempty intervals and are nested as n increases. There1.0....... = n~o F;; " (I•• ). If x E 1'0, ... ,'. then for 0 ~ k ~ n, fore there is a Xo E x E F,-;"{/•• ) so F!(x) E I ••. Therefore F!(xo) E I •• for 0 ~ k < 00 and h(xo) = 8. This proves that h is onto. We give two proofs of the fact that h is one to one to illustrate different ideas. In both these proofs we assume 11 > 2 + 5 1/ 2 for simplicity. Assume that s = h(x) = h(y). By the above argument F!(x), F!(y) E I •• for all 0 ~ k. so x, Y E I.O, ..... n for all n. Lemma 4.4 proved a bound on the length of I.o .... '.n' L(I'o''''''n) ~ A-(n+1) L([O, 1]). so Ix - yl ~ A-(n+l) for all n, and x = y. As a second proof that h is one to one, we use the ideas of expansiveness. If z E h uI2 then 1F~(z)1 ~ A. Therefore if z"Z2 E I j then for some z' E I j IF,.(zd - F,.(Z2)1 = IF~(z')1 . IZI - z21 ~ A· IZI - z21. If s = h(x) = h(y), then F!(x), F!(y) E I .. for all o ~ k, so 1F;(x) - F;'(y) I ~ AIF;-I(X) - F;-I(y)l ~ Anlx - YI. If x I- y then for some n, Anlx - yl > L(/. n ) and F;(x) and F;'(y) can not be in the same interval. This contradiction proves that h is one to one. Last, we need to check that h is continuous. Take x E A,. and 5 = h(x). Let f > O. Pick an n such that 3- n < f. Consider the subinterval/,o''''''n' If fJ > 0 is small enough and YEA,. with Iy - xl < fJ then y E I'o''''''n' Then for yEA,. with Iy - xl < fJ let t = h(y). Then tIc = SIc for 0 ~ k ~ n. Therefore d(h(x),h(y)) ~ L~=n+13-" = 3-(n+1)[1 - (1/3)J- I = 3- n 2- 1 < f. This proves the continuity of h. Because the sets A,. and E2 are compact and h is a one to one continuous map, it follows that h is a homeomorphism. This completes the proof of the theorem. 0
n::,=o
REMARK 5.1. Given two maps 1 : X --+ X and g : Y --+ Y, a homeomorphism h: X --+ Y such that hoI = g 0 h is called a topological conjugacy. See the next section for further discussion. The same proof given above (with only minor changes) proves the following theorem about an arbitrary function and p intervals. Theorem 5.2. Let 1 : R --+ R be a C l function and II .... ,Ip be p disjoint closed bounded intervals with p ~ 2. Let I = U~_I I j . Assume that 1(/j) ::> I for 1 ~ j ~ p. Also assume there is a A > 1 such that 1/'(x)1 ~ A for x E In rl(I), Let A = n~o I-"(I). Then A is a Cantor set. Define h : A --+ Ep by h(x) = 8 where I"(x) E I ••. Then h is a topological conjugacy from 1 on A to u on Ep. We now use the above result on the conjugacy to prove facts about the periodic points of the map on the line. Before we state the theorem, we give the definition of one more property which is preserved by the conjugacy. Definition. A map 1 : X --+ X is (topologically) tmnsitive on an invariant set Y provided the (forward) orbit of some some point p is dense in Y. This property is
2.5 SYMBOLIC DYNAMICS FOR THE QUADRATIC MAP
39
equivalent to the fact that given any two open sets U and V in Y there is a positive integer n such that r(U) n V 1: 0. (This is called the Birkhoff Transitivity Theorem and is provM in Chapter VI!.) This property indicates that f mixes up the points of Y and the set is one piece dynamically. A stronger conditions is as follows: A map f : X -. X is called topologically mixing on an invariant set Y provided for any pair of open sets U and V there is a positive integer no such that j" (U) n V 1: 0 for all n ~ no. Thus if f is topologically mixing, the iterates j"(U) intersect V for allsufllciently large values of n. Theorem 5.3. Let f : R -. R be a CI function and II, ... ,Ip be p disjoint closed bounded intervals with p ~ 2. Let I = LY;'=I I). Assume that f(Ij) :J I for 1 ~ j ~ p. Also assume there is a ,x > 1 such that 1f'(x)1 ~ ,x for x E In rl(I). Let A = n:'ork(I). Then (a) the cardinality of the number of periodic points is given by #(Fix(rIA) = pn, (b) PerUIA) are dellse ill A, and (c) f is transitive on A. In fact, f is topologically mixing on A. REMARK 5.2. We leave to Exercise 2.15 the proof that Ep and so flA is topologically mixing.
CT p
is topologically mixing on
PROOF. The map f restricted to A is conjugate to CT on Ep, so it is enough to prove these facts for CT. (This uses the fact that a conjugacy takes a periodic orbit of period n to a periodic orbit of period n. See Exercise 2.24.) (a) Given n there are pn blocks of length n made up with letters in {I, ... ,pl. If b = bo··· bn -I is one of these blocks, let b be the string which repeats the block b, b = bo .. ·bn-Ibo .. ·bn_I·" or we could write b = bb· ... Then qn(b) = b so b E Per(n, q). On the other hand, if qn(s) = s then sn+j = Sj for all j. Therefore if b = So ... Sn-I then s = b. This shows that the points that are fixed by qn are exactly those given by one of the b where b is a block of length n. There are pn distinct blocks of length n so we have proved the claim. (b) Let sEEp and t > O. Take n such that 31 - n2- 1 < t. Let b = 80'" Sn-l and t = h. Then t E Per(n,q) and d(s, t)
=
f:
D(SJ; t)
j=n 00
~
L 3-
j
= 3 1- n 2- 1
< t.
j=-R
Since t is arbitrary, this proves that there is a periodic point within t of 8, and so the periodic points are dense in Ep. (c) To prove there is a point with a dense orbit, we describe such a point. Let t be a sequence which first lists all the blocks of length one, then lists all the blocks of length two, and continuing, lists all the blocks of length n for each successive n: t
= 1,2; (1,1), (2,1), (1,2), (2, 2); (1,1,1), (2, 1,1), ....
(The use of parentheses, commas, and semicolons is merely to clarify the blocks making up the string and has no real meaning.) Then, given any SEEp and k there is a n such that qn(t) and 8 agree in the first k places. Therefore d(qn(t),s) ~ 31:-12- 1. Since k is arbitrary, the orbit of t gets arbitrarily near 8. Since s is arbitrary, the orbit of t is dense in Ep.
40
II. ONE DIMENSIONAL DYNAMICS BY ITERATION
The fact that f is topologically mixing follows from the fact that given one of the I (1 ......... ) :> I. We leave the details to the reader. This completes intervals I ••.....•• , the proof of the theorem. 0
r+
REMARK 5.3. It is no accident that we proved that the map f is transitive by proving it is conjugate to a shift map and then verified the-property for the shift map. It is nearly impossible to specify a point which has a dense orbit for a nonlinear map. However the very nature of the coding of the shift space allows us to write down a point with a dense orbit for the shift map. Then the conjugacy proves that the nonlinear map inherits this property even though we can not write down the point with the dense orbit.
2.6 Conjugacy and Structural Stability In the last section showed that the quadratic map on its invariant Cantor set is topologically conjugate to the shift map. We used the conjugacy to determine the numher of Pf'riodie points and to prove that the quadratic map is topologically t.ransit.ive on its invariant cantor set. In this section we continue our discllssion of topological conjugacy: we apply it to simple maps of the type considered in Section 2.2 and get conjugacies on intervals in the line (and not just between two Cantor sets). These examples also illustrate some of the properties which topological conjugacies preserve and which they do not. In the next section we return to the quadratic map and obtain a conjugacy between two nearby quadratic maps on the whole real line. When constructing the conjugacy for the quadratic map, we did not give any general motivation. We start with such motivation now before stating various conditions which we can place on the conjugacy. The concept of conjugacy arises in many subjects of mathematics. In Linear Algebra, the natural concept is linear conjugacy. Thus if Xl = Ax is a linear map and X = Cy is a linear change of coordinates for which C has an inverse. then the map on the y-variables is given as follows: Yl = C-IXI = C-l Ax = C-l ACy. Thus the matrix for the map in the y-variables is C- l AC. As long as t.he two maps are defined on the same space (e.g. some Rn), a conjugacy can be considered a change of coordinates of the variables on the space on which the function acts. In Dynamical Systems, we consider a conjugacy (or change of coordinates) which is continuous with a continuous inverse, i.e., a homeomorphism. We could also consider a conjugacy of two functions for which the change of coordinates h is differentiable with a differentiable inverse, i.e., h is a diffeomorphism. A third alternative is a conjugacy where the change of coordinates is an affine function, h : IR -+ IR is an affine map, h(x) = ax + b. We discuss below why we usually are only able to find a conjugacy by a homeomorphism and not a diffeomorphism. (In the last section, it would not be possible to require that h : 11." -+ E2 is differentiable because E2 is not a Euclidean space or a manifold.) Definition. Let f : X -+ X and g : Y -+ Y be two maps. A map h : X -+ Y is called a topological semi-conjugacy from f to g provided (i) h is continuous, (ii) h is onto, and (iii) h 0 f = go h. We also say that f is topologically semi-conjugate to g by h. The map h is called a topological conjugacy if it is a semi-conjugacy and (iv) h is one to one and onto and has a continuous inverse (so h is a homeomorphism). We also say that I and 9 are topologically conjugate by h, or sometimes we just say that I and 9 are conjugate. Definition. To define a differentiable conjugacy we restrict to functions on the line where we know what we mean by differentiable. Let J, K c R be intervals. Assume f : J - 0 J and 9 : K -+ K are two cr - maps for some r ~ 1. A map h : J -+ K is called a cr -conjugacy from f to 9 provided (i) h is a cr -diffeomorphism (h is onto, one to one,
2.6 CONJUGACY AND STRUCTURAL STABILITY
41
c r , with a cr inverse) and (ii) hoI = 90 h.
We also say that I and 9 are cr -conjugote by h, or diiJerentiably conjugate. If the conjugacy h is affine, h(x) = ax + b, then we say that I and g are affinely conjugote.
We defer to Exercise 2.24 to show that a topological conjugacy takes periodic orbits of one map to periodic orbits of the same period of the other map. Thus two conjugate maps "have the same dynamics" .
Example 6.1. In this first example we show how to match up two families of quadratic maps: F" and a second family 90' In this case the conjugacy can even be affine. This is a partial verification of the fact that there is only "one family of quadratic functions" up to affine change of coordinates. Define 90(Y) = ay2 - 1.
A conjugacy must match up the fixed points. The fixed points of F" are 0 and p" = 1 - 1Ill. Those of Yo are y± = [1 ± (1 + 4a)I/21/2a. Also note that (i) 90( -y+) = y+ nnd F,.(I) = 0, IUld (ii) the critical points of Yo and F" are 0 and 1/2 respectively. Assume x = h(y) = my + b is the change of coordinates. Because -y+ < y- < y+ and 0 < PI' < 1. we must have h(-y+) = 1. h(y-) = p", h(y+) = 0, and h(O) = 1/2. Substituting in h, we get the equations m(-y+)+b= 1 1 my- +b = 1 - p. my+ +b = 0 1 m·O+b= 2'
From the last equation we get that b = 1/2. Subtracting the first equation from the second (and using the form of y±), we get that m(l/a) = -1/p., or m = -a/p.. Substituting these values in the third equation we get -[1 + (1 + 4a)I/21/[2p.1 = -1/2, Il = 1 + (1 +4a) 1/2, or 4a = p.2 - 2p.. This last two expressions give necessary conditions for the maps to be conjugate: p.=I+(1+4a)l/2
or
p.2 _ 2p.
and
a= - 4 -•
h(y) = ~ _ a y . 2 p. Once we have found these conditions we can verify directly that this h indeed does work:
F"
0
h(y) = F,,( -ay/p.
+ 1/2)
= p.(-ay/p. + 1/2)(ay/p. + 1/2)
a 2 y2
I'
=
'4 --;-'
while
h 0 90(Y) = h(ay2 - 1)
a
2
= -~(ay - 1)
a2 y2
= --p.
a
1
+2 1
+ -p. +-. 2
These two quantities are equal because 4a = p.2 - 2p.. This shows that these two functions are affinely conjugate when the parameters are correctly related.
42
II. ONE DIMENSIONAL DYNAMICS BY ITERATION
Example 6.2. Let D(z) = 2z
mod 2
be the doubling map on the circle S == {z mod 2}, T
_ { 2y (y) 2(1 - y)
if 0 :5 y :5 1/2 if 1/2 :5 y :5 1
be the tent map, g2(Y) = 2y2 - 1 be the quadratic map of the last example, and .r,.(x) = /lx(l - x) be the standard family of quadratic maps. The tent map is also called the rooftop map. The doubling map is also called the Sqtlarin9 map because In complex notation it can be written as D(z) = z2 on complex numbers z with Izl = 1. We claim that (i) D is topologically semi-conjugate to both TI[O,IJ and 921[-1, IJ, and that (Ii) TIIO, 1],9211-1, I], and F 4 i[0, I] are all topologically conjugate.
(a)
(b)
(d)
(c) FIGURE 6.1. The Graphs of (a) Don on [-1,1], and (d) F4 on [0,1]
Sl
(or [0,1]), (b) Ton [0,1]' (c) 92
If we consider z is equivalent to -z (or 2 - z) on the circle S == {z mod 2}, this induces a two-to-one projection p: S --+ [0, I] given by p(z) = Izl for -1 :5 z :5 1. Also, if 1:5 z :5 2, then p(z) = 2 - z. Then for -1/2:5 z:5 1/2, Top(z) = T(lzl) = 21zl. and po D(z) = p(2lzl) = 21z1 because :5 21z1 :5 1. On the other hand, for 1/2:5 Izl :5 1, To p(z) = T(lz!) = 2 - 21z1 and po D(z) = p(2lzl) = 2 - 2Iz1 because 1 :5 2Iz1 :5 2. Therefore, for any z, To p(z) = po D(z). This proves that D and T are semi-conjugate. To construct the semi-conjugacy from D to 92 we consider z E S to be the point on the circle with angle lTZ radians and let x = h(z) = cos(lTz) be the map to the x coordinate, h : S --+ [-1, II. Then h 0 D(z) = cos(2z) = 2 C082 (Z) - 1 = 2h(z)2 - 1 = 92(h(z». This proves that D is topologically semi-conjugate to 92,
°
2.6 CONJUGACY AND STRUctuRAL STABILITY
43
Lastly, note that p(z) = p(z') if and only if z is equal to ±z' modulo 2. This is also true for h, so k(y) = h 0 p-I(y) = cos(7rY) induces a topological conjugacy from T to g2, k : [0,1) -+ [-1,1). Notice that k is differentiable everywhere, but that k- I is not differentiable at ±1. Also note, by Example 6.1, g21[-1, 1] is conjugate to F4 1[0, 1], so combining TI[O,I] is conjugate to F4 1[0, 1) by H(y) = 1/2 - (1/2)cos(7rY) = sin2 (7rY/2). Again, His differentiable everywhere on [0,1] but H- I is not differentiable at and 1.
°
In the above examples, some of the conjugacies were affine It.nd so differentiable everywhere, and other conjugacies were differentiable but had inverses which were not differentiable at the endpoints. In Dynamical Systems, we usually consider the notion of topological conjugacy and not C" -conjugacy. The question arises, why do we only require that the conjugacy h be a homeomorphism and not a diffeomorphism? As the situation of the last section illustrates, sometimes a topological conjugacy Is all that exists and it is still useful to match the trajectories of the two maps. The following proposition gives another reason: the assumption that I and 9 are differentiably conjugate puts a very restrictive condition on the relationship between the derivatives of J and 9 at their respective periodic points. After this proposition, we define another concept (structural stability) which requires a map I to be topologically conjugate to all "small perturbations" g. After this definition, we give a specific example of a map I that is topologically conjugate to all perturbations, and we also verify that the conjugacy can not be differentiable for a particular choice of g. This example thus gives a second justification that topological conjugacy is the natural concept in Dynamical Systems. Proposition 6.1. Assume I,g: R -+ R are CI. (a) Assume that I and 9 are CI-conjugate by h. Assume Xo is a n-periodic point for I, and Yo = h(xo). Then (r)'(xo) = (g")'(yo)· (b) Assume I has a point Xo of period n and assume every n-periodic point 1/0 for 9 has (r)'(xo) =I (g")'(yo). Then I and 9 are not CI-conjugate. 6.1. Since most small perturbations of a map I change the derivative at periodic points, it is very difficult for two maps to be differentiably conjugate.
REMARK
REMARK 6.2. In higher dimensions, the corresponding result is that the matrices of partial derivatives of I and 9 are linearly conjugate and so have the same eigenvalues.
Clearly part (b) follows from part (a). To prove part (a), we assume that h 0 I(x) = go h(x), so h 0 r(x) = gn 0 h(x) and r(x) = h- I 0 g" 0 h(x). Taking the derivative at Xo we get that PROOF.
(f")'(xo) = (h-I)(gnoh(xo»(g")(h(xo»h(xo) = (h-I)(I/O)(g")(I/O)h(xo) = (g")(l/o)'
(We moved the point of evaluating the derivatives to subscripts to make the product easier to read.) This proves part (a). 0 We next want to discuss maps which are conjugate to all perturbations. Such maps have the same dynamics as all small perturbations and so the structure of the dynamics of the whole map is stable. For this reason, such maps are called structurally stable. To make this definition precise, we need to define what we mean for two functions to be close, i.e., we need to define the distance between two functions.
II. ONE DIMENSIONAL DYNAMICS BY ITERATION
44
Definition. Let r ~ 0 be an integer. Let I, 9 : IR -+ IR be C r functions and J C lR be an interval (usually closed and bounded). Define the cr -distance from I to 9 by dr.J(f,g)
= sup{l/(x) -
g(x)l. I!'(x) - g'(x)l, ... ,1/(r)(x) - g(r)(x)1 : x E J}.
Obviously, I and 9 do not need to be defined on the whole real line but only need their domains to include the interval J (unless J = R). Definition. Assume r ~ 1. Let I : R -+ R be a cr function. A function I is C r structumlly stable provided there exists an t > 0 such that I is conjugate to 9 on all of R whenever g: R -+ R is a C r function with dr.R(f,g) < t. A function I is said to be structumlly stable provided it is C l structurally stable.
Example 6.3. Take I(x) = 3x and g(x) = 2x. We want to construct a conjugacy h with h o/(x) = go h, or h(x) = g-I 0 II 0 I(x). If h exists then h(x) = g-j 0 h 0 Ji(x) for any j ~ O. However, we also have h(x) = go h 0 I-I(x) so lI(x) = g-j 0 h 0 Ji(x) for any -00 < j < 00. Both maps have 0 as a fixed point. As stated before, we need that h(O) = o. The image of 1 by h, 11(1), is fairly arbitrary, but we take h(I) = 1 and h( -1) = -1. Then h(3) = II 0 1(1) = go 11(1) = g(I) = 2. The definition of II for x between 1 and 3 is also arbitrary as long as it is monotone so we let ho : [1,3]-+ [1,2] be a CI map with (i) ho(I) = 1, (ii) ho(3) = 2, (iii) "0(1) = 1, and (iv) h (3) = 2/3. Similarly define ho: [-3, -1]-+ [-2, -1]. Once h is determined on [-3, -1] U [1,3] by h(x) = ho(x), this determines it for all nonzero x as follows. Take x i' O. There is a unique j = j(x) E Z such that Ji(x) = :Vx E (-3, -1] U [1,3). Define h(x) = g-j 01100 P(x). This defines h for all x E R. By construction h is a conjugacy. In the above construction, the orbit of every x i' 0 goes through the union of the intervals J = (-3, -1] U [1,3), and and there is no proper subset of J for which this is true. Because of this property, the set J, is called a fundamental domain lor the unstable set 0/0. Often it is preferred to take a fundamental domain as closed so the characterization is changed as follows: the pair of closed intervals J = [-3, -1] U [1,3] = cl« -3, -1] U [1, 3)) is called a fundamental domain lor the unstable set 010 because the orbit of every x i' 0 goes through J and there is no proper closed subset of J for which this is true. Note for those x approaching 1 from below, h'(x) = (3/2)h (3x). The limit of this expression as x approaches 1 is equal to 1 = 110(1):
o
o
= 1
= x-I lim ho(x). x>1
Thus this extension is differentiable at x = 1. Similarly, it is differentiable at 3 and all other points ±J.1. Thus this extension is C I for x i' O. Now let Xi be an arbitrary sequence that approaches 0 as i goes to infinity. Then there is aj = j(i) such that Zi == P(i)(Xi) E (-3, -1] U [1,3). It must be that j(i) goes to infinity as i goes to infinity. Then h(x;l = g-i(i)ho(Zi) = 2- j (i)ho(Zi) and this must converge to 0 as i goes to infinity. Thus ho is continuous at x = O.
2.6 CONJUGACY AND STRUCTURAL STABILITY
45
Finally, we want to show that h is not differentiable at O. Now let Xi = 3- i . Then Xi approaches 0 as i goes to infinity. Then h(Xi) = 2- i ho(3'Xi) = 2- i ho(l) = 2- i also approaches O. However, I' 2- i . h(Xi) - h(O) I1m = Im-. Xi - 0 i-oo 3- 1
i-oo
I. (3)i = .Im -
'-00 2
=
00.
Thus, h is not differentiable at O. Another way to see that h is not continuously differentiable is to show that the derivative of h at the points Xi. h'(x,), goes to infinity as i goes to infinity: lim h'(x,) = lim (g-')'(l)h'(l)(fi)'(x,) 1-00
'_00
= lim 2- i . 1 . 3' =
00.
Notice that if 9 were any map with g'(x) ~ >. > 1 for all x, then 9 has a single fixed point and the same proof shows that f is conjugate to g. Thus f is C l structurally stable. Also notice that if g(x) = (3 + t)x then do(f,g) = 00, but the derivatives of f and 9 are close for all x. Thus when we consider the perturbations of a function which are small in terms of the distance d l on all of R (which is noncompact), this is very restrictive. Often, we can allow the CO size of the perturbation to be larger near ±oo if the derivatives are controlled.
Example 6.4. In this example we consider f(x) = -xI3. There are two differences from the previous example: the origin is attracting and not repelling, and the map switches points from one side of the fixed point to the other. However if we let J = [1, r2(1» = [1,9) then every x # 0 has a unique j E Z such that Ji(x) E [1,9). Thus for a fundamental domain we do not need to take an interval for negative x as well as positive x; [1,9] is the complete fundamental domain for the stable set of O. With this change, the proof that f is structurally stable is also the same. We leave the details to the reader. Note that the fact that the fundamental domain for f(x) = -x13 is one interval, while that for g(x) = xl3 is the union of two intervals implies that these two maps can not be topologically conjugate. In fact, f is orientation reversing on R while 9 is orientation preserving. Two maps which are topologically conjugate must either both be orientation preserving or orientation reversing. This is why f and 9 can not be topologically conjugate, even though both have a unique fixed point whose basin of attraction is the whole real line. Example 6.5. Let f(x) = x3/2 - x12. This example has fixed points at x = 0, ±3 1/ 2 and critical points at ±1/3 1/ 2 • The fixed point x = 0 is attracting and the two fixed points x = ±3 1/ 2 are repelling. The main changes from the previous examples are the existence of several fixed points and the existence of critical points. If 9 is any small C l perturbation, it will have three fixed points. However to insure that 9 has exactly two critical points we must take 9 C2 near f. We take such a 9 and let Pj for j = -1,0, 1 be the fixed points and c:l: be the two critical points, with P-I < c- < Po < c+ < PI! Po < g(c-) < c+, and c- < g(c+) < Po· Notice however that the map f has f'(x) ~ 4 > 1 for all x with x ~ 31/ 2 or x < _3 1/ 2 • Thus we take (partial) fundamental domains for the unstable sets of ±3 1/ 2 given by
46
II. ONE DIMENSIONAL DYNAMICS BY ITERATION
[- I( -4), -4J U [4,/(4)J. We can construct a conjugacy h on (-00, -3 1/ 2J U [3 1/ 2,00) from I to 9 with -00, _3 1/ 2]) = (-00, -p-d and h([3 1/ 2,00)) = [Plo 00). Now we consider the points between ±3 1/ 2 . The point 1/3 1/ 2 is a critical point which is a minimum of the function f. Thus there are points on either side of 1/3 1/ 2 on which I takes the same value. Let x, and y, sequences of points converging to 1/3 1/ 2 with X, > 1/3 1/ 2 , y, < 1/3 1/ 2 and I(x,) = I(y,). If h is a conjugacy from I to 9, then we need that 90 hex,) = h 0 I(x,) = h o/(y,) = 90 hey,). Thus the points hex,) and hey,) need to be points which have the same image by 9. Thus h(I/3 1/ 2 ) has two distinct points arbitrarily near which are taken to the same point by g, and h(I/3 1/ 2 ) must be the critical point for 9, c+. Similarly, we need h( _1/3 1/ 2 ) = C- . Once we know that h must take the critical point of f to the critical point of g, the rest of the construction of the conjugacy is straight forward. The map f is monotone for x with _1/3 1/ 2 ~ x ~ 1/3 1/ 2 . We construct a conjugacy here to the perturbation 9 much as in the last example using the fundamental domain of the stable set of 0 given by [/2(1/3 1/ 2),1/3 1/ 2]. making sure that the image of the critical point by ho, hoo I( _1/3 1/ 2 ), is 9(C-). Once the conjugacy ho is defined on this fundamental domain, extend it to [-1/3 1/ 2 , 1/3 1/ 2 J as before, with h[-1/3 1/ 2 ,1/3 1/ 2 J = [c-, c+l. (Notice that 1(1/3 1/ 2 ) > -1/3 1/2 and 1(-1/3 1/ 2 ) < 1/3 1/ 2 .) Next, we need to extend h to the interval (1/3 1/ 2 ,3 1/ 2 ). Let 1+ be the restriction of I to this interval and 9+ be the restriction of 9 to [c+ ,pd. Now for x E (1/3 1/ 2 ,3 1/ 2 ) there is a smallest j > 0 with I!(x) E [-1/3 1/ 2 ,1/3 1/ 2 J. Define hex) = (9+)-j 0 h 0 I!(x). Make a similar construction for x E (_3 1/ 2, -1/3 1/ 2J: hex) = (g_ )-j 0 h 0 I~ (x) where 9_ is the restriction of 9 to (P-1,C-J and f- is the restriction of I to (_3 1/ 2 , _1/3 1/ 2 1. This extension makes h defined and continuous on the whole real line. This completes the necessary modifications. (The reader may want to check some of the claims we made in the construction.)
h«
2.7 Conjugacy and Structural Stability of the Quadratic Map In the preface we stated that we wanted to determine which systems were dynamically equivalent t.o any of its perturbation. In this section we prove that this is the case for the quadratic maps with invariant Cantor sets: any small perturbation 9 of FI' is conjugate to FjI' Befort' we prove t.hat a conjugacy exists on the whole real line, we first show that one exists between the nonwandering set of FI' and the nonwandering set of g. Because the nonwandering sets are contained in a compact interval, J, we only require that the two functions are close on this interval J. To make the statement clearer, we introduce notation for the restriction of the functions to this interval. Let f J be the restriction of I to J. Then the nonwandering set of IJ, nUJ), are the points z such that for any open neighborhood U there is a point y E U with Ik(y) E U and f1(y) E J for 1 ~ j ~ k. We start by giving the definition of two functions being conjugate on their nonwandering sets. We also use the definitions of two functions being close in terms of their derivatives which is introduced in the last section. Sometimes we only require that these values are close on an bounded interval.
Definition. Let I : R -+ R be a C r function. As a modification of the concept of structural stability, we consider a map h that only conjugates f restricted to its nonwandering set with 9 restricted to its nonwandering set. A function I is C r n-stable on J provided there exists an f > 0 such that f restricted to J) is topologically
nu
2.7 CONJUGACY AND STABILITY OF THE QUADRATIC MAP
conjugate to 9 restricted to O(gJ) whenever 9 : J
--+
47
R is a C r function with dr,JU, g) <
f.
Theorem 7.1. Let F,..(x) = Jlx(I - x) as before witb Jl > 4. (a) Tben F,.. is C l O-stable on [-2,2), and (b) F,.. is C 2 structurally stable on R. REMARK 7.1. By stating the theorem for part (a) the way we do, it implies that F,.. is O-conjugate to Fl" for III - ll'l small. PROOF OF THEOREM 7.1 (a). We restrict our proof to the case when Jl > 2 + 5 1/ 2 but the proof can be modified for Il > 4 using the metric introduced above. Let 16 = [-b,I + bJ for b > 0. For Ii > small enough, if z E [16 n F;I(I6)1 u [-2,OJ U [1, 2J. then IF~(z)1 > A > 1. Since F~(1/2) = 0, the assumptions imply that F,..(I/2) > 1 + 6. Takef > small enough so that if dt,i-2,2IU, g) < f theng(-li) < -Ii, g(I+Ii) < -6, g(I/2) > 1 + 6, and Ig'(z)1 > A for z E [16 n g-I(16») U [-2,01 U [1, 2J. Fix such a map g. These conditions imply that gl16 covers 16 twice. The next lemma shows that the nonwandering set of 9 is contained in 16.
°
°
Lemma 1.2. If yi'(x) E [-2,2J for all k ~ 0(gll-2,2))
c
n~o g-k(l6).
PROOF. If gk(X) [-2, -IiJ, then
r;.
16 then gk+l(X)
<
°tben
max{g(1
gk+2(X) _ (-Ii)
gk(X) E 16 for all k ~ 0. Therefore,
+ Ii),g(-li)} <
< gk+2(X) < A[gk+I(X)
-Ii. If also gk+I(X) E
g( -Ii) - (-Ii)J.
Therefore gk+2(x) stays less than -Ii. As long as gk+i(x) stays in [-2, -liJ,
Since A > 1, this inequality can only last for a finite number of iterates, and gk+i(x) < -2 for some j ~ 0. Thus if gk(x) E [-2,2J for all k ~ then gk(X) E 16 for all k ~ 0. The statement about the nonwandering set follows from the first statement. 0
°
Lemma 1.3. Let Ag
= n~o g-k{l6).
Tben glAg is topologically conjugate to
(T
on
I:2·
PROOF. Because glI6 covers 16 twice, g-I(l6) n 16 is the union of two intervals If and By the Mean Value Theorem the length of each of these intervals is less than A-I times the length of 16. Also 9 is monotone on each I j with derivative greater than A in absolute value. Let IJ.k = g-I{lk)n/j . Then Sf = n~=o9-i(I6) is the union ofthe four intervals IJ.k' By the Mean Value Theorem each of these intervals has length bounded as follows: L(/j,k) S >..-IL{lk) S >..-2L{l6). By induction, S~ = n~_og-i(I6) is the union of 2n intervals of length less than or equal to >.. -n L(I6). Exactly as in the earlier proof, Ag = n~o9-k(16) is a Cantor set, and glAg is topologically conjugate to (T on
Ir
I:2
0
To ill" . , •• <' proof of the theorem, note that glAg is topologically conjugate to (T on I:2 which in turn is conjugate to F,..IA,... Because topological conjugacy is a transitive relation, we have proved part (a) of the theorem. 0
48
II. ONE DIMENSIONAL DYNAMICS BY ITERATION
PROOF OF THEOREM 7.1(b). In this part, we let h: AI' -+ Ag be the conjugacy shown to exist by part (a). For part (b), we need to define a conjugacy off the Cantor set A,.. Just as in the simpler example, Example 6.5, which has only a finite number of periodic points, a conjugacy of FI' and a perturbation on all of R must take the critical point cl' of FI' to a critical point cg of 9. Thus 9 must have a unique critical point. In order to be able to know that 9 has a unique critical point, we must restrict ourselves to 9 which are C2 near F,..
FIGURE 7.1.
Orbit of the Critical Point
Note that F,,(c,,) > I, and 0 > F~(cl') > F!(cl')' Also if 9 is C 2 near enough to FI" then g(cg) is greater than the points in Ag, and g2(Cg) > g3(cg) and both points are less than the points in A g . The conjugacy we construct must take [F!(CI')' ~(cl')1 to [g3(cg ),92(Cg )l. These two intervals are the fundamental domains for FI' and 9 respectively. (These intervals are fundamental domains of the stable set of infinity or the unstable set of the invariant Cantor sets.) Let ho be such a map. (It could be taken to be linear on this interval.) The map F" is monotone (so has an inverse) when restricted to x ~ 1/2. Let FI'_ be this restriction, and FI'+ be the restriction to x ~ 1/2. Similarly, let g_ be the restriction of 9 to points y ~ cg, and g+ be the restriction to points y ~ cg • Then 9_ and g+ are each monotone with inverses. As in Examples 6.1 - 6.3, ho can be extended to points x < 0 by h(x)
= 9:' 0 ho 0
F~_ (x)
where F~_(x) E [F;(c,,), F~(c,,)I. This extension is continuous, as a map from AI' U {x: X < O} to Ag U {y : y < z for all Z E Ag}. Next, FI'+ {x : X > I} = {x : X < O}, and h is already defined on {x : X < OJ. Also g+{y: y > z for all Z E Ag} = {y : y < z for all Z E Ag}. Thus we can extend h to {x:x>l}by h(x) = g:;1 0 h 0 FI'+(x). With this definition, the map is still continuous and a conjugacy where it is defined. Also h(FI'(1/2» = 9.:;1 0 h 0 F~(1/2) = g:;1 0 g2(cg) = g(cg ). Next, if Gl,I." is the gap at the first level for F", and Gl,l.g the gap at the first level for g, then the equation h(x)
= { 9':; 10 h 0 FI'(x) g:1
0
h 0 F,,(x)
for x E G t .I.1' and x > 1/2 for x E G u ." and x < 1/2
2.8 HOMEOMORPHISMS OF THE CIRCLE
49
extends h continuously from G I •I .,.. to GI.I.g' As above, it can be checked that h(I/2) = cg • We continue inductively tc;1 the gaps at n'h level, Gj,n,,.. for F,.. and Gj,n,9 for g. For each j, we can extend h from Gi,n,,.. to Gj,n,g by using the appropriate branch of the inverse of g, g::;1 or 9: 1 . We leave the details to the reader. 0
2.8 Homeomorphisms of the Circle In this section, we discuss the periodic orbits and recurrence of orientation preserving homeomorphisms of the circle. By restricting to invertible maps and by exploiting the fact that the circle comes back on itself, we are able to give a rather complete description of what dynamics can occur for such a homeomorphism. An important quantity which makes this determination possible is the rotation number. This number measures the average amount that a point is rotated by the homeomorphism. When this is the rational number p/q the homeomorphism has periodic points with period q. When this number is irrational, there are no periodic orbits. With the assumption that the map is a C~ diffeomorphism with irrational rotation number, it follows that every orbit is dense in the circle. We denote by 8 1 the unit circle. It can be thought of as either (i) the real numbers modulo 1 or (ii) the points in the plane at a distance one from the origin. In terms of the second way of thinking about 8 1 , we often identify R2 with the complex plane C and so 8 1 = {z E C : JzJ = I}. There is also a covering space projection 11" from R onto 8 1 • In terms of the first way of thinking about 8 1 , 1I"(t) = t mod 1. In terms of the second way of thinking about 8 1 , 1I"(t) = e2'11"". (Note we used 11" for both the map and the number 3.14· .. but throughout the rest of this section 11" usually denotes the map.) Thus 11" can be thought of as taking an angle measurement to a point on the circle. Throughout this section I : 8 1 -+ 8 1 is assumed to be an orientation preserving homeomorphism. Given such a map, there is a (nonunique) map F : R -+ R which is called a lift 011 such that 11" 0 F = 1011'. A lift, F, of I satisfies (i) F is monotonically increasing and (ii) F(t + 1) = F(t) + 1 for all t, so (F - id) has period 1. For example, if 1>. is the rotation by A on 8 1 (or by 211'A radians) then F>,(t) = t + A is a lift. But for any integer k, F>,(t) = k + t + A is also a lift. In fact for any homeomorphism I, if FI and F~ are two lifts then there is an integer k such that F2 (t) = Fdt) + k for all t. (For each t, 11' 0 F2 (t) = 10 1I'(t) = 11" 0 FI(t) so there is an integer k, with F2(t) = FI(t) + k t . Since everything is continuous and the integers are discrete, k, is independent of t.) The aim of the section is to define an invariant of I called the rotation number and prove that it can be used to determine whether I has any periodic points or not. The rotation number is a measure of the average amount of rotation of a point along an orbit. Definition. We start by defining a number for a lift F of I. Let (10 (F ,
t) -_ I'1m FR(t) - t . n-oo n
We show below that this limit exists and is independent of t, and so we denote it by (1o(F). We also show that if FI and F2 are two lifts then (1o(FI,t) - (1o(F2,t) is an integer, so p(f) = (1o(F,t) mod 1 is well defined. The number p(f) is called the rotation number of f. REMARK 8.1. The definition of rotation number easily implies that (1o(Fk) = k{1o(F) and p(fk) = kp(f) mod 1.
II. ONE DIMENSIONAL DYNAMICS BY ITERATION
50
Example 8.1. Let h. be the rotation by >. on SI with lift F>,(t) = t + >.. This map is called a rigid rotation of SI by>.. Since Ff(t) = t+n>', it is easy to see that Po(F, t) = >. c . ! ' , '!'ld p(f) = ). mod 1. In this example every point is rotated by exactly>. so the rotation number should be ).. In this example we can see the connection between the rationality of the rotation number and the existence of a periodic orbit. Assume). = p/q is rational, i.e., h. is a rational rotation. Then F!(t) = t + q). = t + p. Therefore every point is periodiC with period q. Now assume that). is irrational. For J>. to have a point x of period q, it is necessary for F!(x) = x+q>' = x+k for some integer k. Thus we need), = k/q which is impossible because ). is irrational. Thus J>. has no periodic points in this case. Below, Theorem 8.3 shows that a every point in SI has a dense orbit when>. is irrational. We start by proving that a orientation preserving homeomorphism of the circle has a rotation number which is independent of the point. Theorem 8.1. Let f : SI -+ SI be an orientation preserving homeomorphism with lift F. Then (a) for t E IR the limit defining Po(F,t) exists and is independent oft, (b) if p(f) = Po(F, t) mod 1, then this is independent of the lift F, and (c) p(f) depends continuously on /. (a) Take any two points t, s E R. There is an integer l such that t $ s+l < t+1. By the monotonicity of F, F(t) $ F(s + l) < F(t + 1) = F(t) + 1, and by induction, F"(t) $ P(s + l) < P(t) + 1. Subtracting t + 1,
PROOF.
FP(t) - t - 1 $ P(s
+ l) -
t- 1
< FP(s + l) - s - l < FP(t + 1) - t = FP(t) - t
Since FP(s
+ l)
+ 1.
- s - l = FP(s) - s, FP(t) - t - 1 < FP(s) -
Writing p+n(t) - t
= FP(Fn(t)) -
~(t)
S
< FP(t)
+ ~(t) -
- t
+ 1.
t and applying (*) with s
Applying (**) with n = p, F 2P(t) - t
< 2[FP(t)
- tj
and by induction FnP(t) - t
< k[FP(t)
- tj
+ 1,
+k -
1.
For any n 2: p, write n = kp + i where 0 $ i < p. Then by (**) and (***), Fn(t) - t = Fkp+i(t) - t
< Fkp(t) - t + Fi(t) - t + 1
< k[FP(t)
-
tJ + Fi(t) - t + k.
= ~(t),
2.8 HOMEOMORPHISMS OF THE CIRCLE
Dividing by n and using the fact that n
~
51
kp,
Fn(t) - t k[FP(t) - t] < kp n
+
Fi(t) - t k +-k' n p
In the same way using the inequality P(t) - t - I < P(s) - s and the fact that (k + I)p > n, we get
k[FP(t) - t] (k + I)p
+ Fi(t) n
t _ _ k_ < _P--,(,-,t)_-_t (k + I)p n'
Using the last two inequalities and letting n (and so k) go to infinity with p fixed,
FP(t) - t
-~-
p
I. Fn(t) - t - - ~ hm sup p
n-oo
~
FP(t) - t
I
+ -. P
p
11
Notice that this last set of inequalities shows that the Iimsup is finite. Now letting p go to infinity, Fn(t) - t FP(t) - t lim sup ~ lim inf .
n
n-oo
p
p-oo
Thus we have proved that the limit defining the rotation number Po(F, t) exists and is finite for any t E R. Inequality (*) above shows that. for two different points t, s E R,
FP(t) - t - I
FP(s) - s
P
P
-""'"--'----<
<
FP(t) - t P
+I
.
Since the rotation numbers for both t and s exist, it easily follows that Po(F, t) = Po(F, s) for any two points t, s E R. Because we have shown that the rotation number Is independent of the point, we write Po(F) from now on. (b) Assume FJ and F2 are two lifts of f. We noted above that there is an integer k independent of t such that F2 (t) = F J(t) + k. By induction F2'(t) = Ff(t) + nk for any positive integer n. Therefore '" ) = I'1m F2'(t) - t Po ( £2 R-OO n (Ff(t)-t . = I1m n-oo n = Po(Fd + k.
nk) +n
Thus Po(F2) = Po(Fd mod I as claimed. (c) Let f > O. Choose an integer n > 0 such that 2/n < f. Let F be a lift of f. There is an integer psuch thatp ~ P(O) < p+l. It then follows that p-I < Fn(t)-t < p+I for all t (possibly replacing p). For 9 near enough to f in terms of the CO topology, a lift G of 9 can be chosen so that p - I < Gn(t) - t < p + l. (Note that n is fixed.) The nk-th iterate of 0 can be written as Fnk(O) = pnk(O) - 0 k-1
= Lpn j=O
0
Fjn(O) - Fin(O)
52
II. ONE DIMENSIONAL DYNAMICS BY ITERATION
which is greater than k(p - 1) and less than k(p + 1). Therefore
k(p - 1) < F"k(O) < k(p + 1), and we also get that
k(p - 1) < Gnk(O) < k(p + 1). Because the rotation numbers for F and G exist, they can be calculated by subsequences, = Iimk_oo Fkn(O)/kn and Po(G) = Iimk_oo Gkn(O)/kn. The above inequalities easily imply that
So Po(F)
p- 1 (F)
Therefore Ipo(F) - Po(G)1 < 2/n < theorem.
t.
and
This completes the proof of part (c) and the 0
We leave to the exercises to show that if f has a periodic point then p(f) is rational. The following proposition proves that p(f) being rational is also a sufficient condition for f to have a periodic point.
Theorem 8.2. The rotation number p(f) is rational if and only if f has a periodic point. In fact, p(f) = p/q if and only if f has a point of period q. (Here p/q is assumed to be in reduced form with p and q integers and q positive.)
f has a periodic point then p(f) is rational. Conversely, assume that p(f) = p/q is rational and expressed in lowest terms. Let F be a lift of f. By the definitions of p(f) and Po(F), there is an integer k with Po(F) = (p/q) + k. Then F(t) = F(t) - k is another lift of f and Po(F) = p/q. Also, Po(Fq - p) = Po(Fq) - p = qPo(F) - p = O. Let G(t) = Fq(t) - p. We need to show that G has a fixed point on it so f has a point of period q on SI. There are three cases: (i) G(O) = 0, (ii) G(O) > 0, and (iii) G(O) < O. In the first case, 0 is a fixed point, and we are done. In case (ii), because G is increasing, 0 < G(O) < G2(O) < ... Gn(o) < .... There are two subcases. First assume that 0 < Gn(o) < 1 for all 11. Because the orbit is monot.onically increasing, G"(O) converges to some point Xo. As we have mentioned before, it follows by continuity that G(xo) = Xo and G has a fixed point. As a second subcase, assume there is a k > 0 such that Gk(O) > 1. Then G 2k (O) = G ir co Gk(O) > G k (1) by the monotonicity of G k . But G Ir (I) = Gk(O) + 1 > 2, so G 2k (O) > 2. By induction, G]k(O) > j. Therefore Gjk(O)fjk > l/k and Po(G) ~ l/k. This contradiction shows that this second subcase can not occur and case (ii) implies there is a fixed point. In case (iii) when G(O) < 0, 0 > G(O) > G2(O) > .... By reasoning as in case (ii), there must be a fixed point. This finishes the proof of the theorem. 0 PROOF. Exercise 2.29 proves that if
Example 8.2. Let f be the map on SI whose lift F is given by F(t) = t + t sin(21T1It) for n a positive integer and 0 < E < 1/(21Tn). Then x is a fixed point of f if there is an integer p with F(x) = x + E sin(21Tnx) = x + p, or E sin(21Tnx) = p. Since E < 1 it follows that p = O. The solutions are x = j /2n for j = 0, ... ,2n - 1. The derivative
2.8 HOMEOMORPHISMS OF THE CIRCLE
FIGURE 8.1.
53
Example 8.2
F'(x) = 1 + f21Tn COS(21TnX) so F'(j/2n) = 1 + f21Tn > 1 for j even and 0 < F'(j/2n) = 1 - ,211'71 < 1 for j odd. Tlu·rr.fore thr. points x = j /2n are fixed point sources for j even and fixed point sinks for j odd. See Figure 8.1. Example 8.3. Let
I be the map on F(t) = t
SI whose lift F is given by
+ l/n + f sin(21Tnt)
for n a positive integer and 0 < f < 1/(21Tn). Let xi = j/2n. Then F(xj) = XjH 80 {xi: j is even} is one periodic orbit of period n and {xi: j is odd} is another periodic orbit of period n. The derivative of F" at Xo satisfies
so the first orbit is a source. Similarly,
so the second orbit is a sink. It can be checked that any other point y (i) is not periodic, (ii) has o(y) = O(xo), and (iii) has w(y) = O(xd. Having looked at some examples with rational rotation number, we return to rotations on the circle with irrational rotation number. We prove that every point for this map has a dense orbit. Theorem 8.3. Let I>. be a rotation on SI as defined above. Assume A is irrational. Then I>. has no periodic points and every point in SI has a dense orbit in SI. Thus for any x E SI, w(x) = SI and SI is a minimal set for 1>.. PROOF. As we noted in Example 8.1, the lift F>. is given by F>.(t) = t +A, Po{F>.) = A, and p(l>.) = A mod 1. By Example 8.1 or Theorem 8.2, I>. has no periodic points because A is irrational. Now we turn to showing that all orbits are dense. Since there are no periodic points, all the points Fi(x) are distinct modulo one. (If Ff(x) = Ff'(x) for n # m then x + nA = x + TTlA + k for an integer k. Then A = k/{n - m) is rational.) The set f~(x) must have a limit point in SI. Thus given f > 0 there exist integers n # m and k such that IFf (x) - FJ:'(x) - kl < f, or in SI d(ff{x), fJ:'(x» < f. The lift F>. preserves lengths, so letting q = n - m, d(f9(X), x) < f. Then d(f29(X),/9X) < f, d(f3 9(x), f 29 x) < f, ... d(f(j+1)9(X),li9X) < f. These intervals eventually cover SI 80 the orbit of x is f-dense in SI. Because f > 0 is arbitrary, the orbit of x is dense in 8 1 , w(x) = SI, and SI is a minimal set for f>.. 0
54
II. ONE DIMENSIONAL DYNAMICS BY ITERATION
Theorem 8.4. Assume I : SI -+ SI is a continuous orientation preserving homeomorphism and p(f) is irrational. Then the following are true: (a) w(x) is independent of x, (b) w(x) is a minimal set, and (c) w(x) is either (i) all of SI or (ii) a Cantor subset of SI. We leave the proof of this result to Exercise 2.26.
Theorem 8.5. Assume I: SI -+ SI is a continuous orientation preserving homeomorphism and T = p(f) is irrational. (a) Then I is semi-conjugate to the rigid rotation map with rotation T, IT' The semi-conjugacy takes the orbits of I to orbits of IT, is at most two to one on w(x,f), and preserves orientation. (b) If w(x, f) = 51 then I is conjugate to IT' (c) If w(x, f) ~ 51 then the semi-conjugacy h from I to IT collapses the closure of eacil open interVllI I in the complement ofw(x) to a point. PROOF. Let F be a lift of I. The next lemma proves that the order of orbits on the lift is the same as the order of the lift FT of the rigid rotation.
Lemma 8.6. Choose the lift F so that Po(F) = T, with T irrational. Let t E R and let k, m, n, q E Z. Then (i) Fn(t) + m < Fk(t) + q for some t if and only if (ii) Fn(t) + m < Fk(t) + q for all t if and only if (iii) nT + m < kr + q if and only if (iv) F;'(O) + m < F:(O) + q where FT is the translation by T. The equivalence of conditions (iii) and (iv) is clear. Because the rotation number is irrational, the order of ~(t) + m and Fk(t) + q in the line is independent of t, so condition (i) is equivalent to condition (H). In showing that condition (il) implies (Hi), replacing t by F-k(t) in condition (il) gives that Fn-k(t) - t < q - m for all t. This in turn implies that PROOF.
T
= Po(F) .
= ltm
Fi(n-k)(t) _ F(i-I)(n-k)(t) L --....!...<~--:-:----'-'j(n - k) j
j-co i=1
< q-m. - n- k The inequality must be strict because the rotation number is irrational, so nT + m < kT + q which is condition (iii). Assuming condition (iii), we get that Fn-k(t) - t < q - m for some t since the rotation number is less than (q - m)/(n - k). Again the sign of this inequality must be the same for all t because the rotation number is irrational. Substituting Fk(t) for t we get Fn(t) - Fk(t) < q - m for all t, or condition (H). 0 Let Xo E w(x, f) and to E R with 1r(to) = Xo. Let B = {Fn(to) + m : n, m E c R. Let A = {F.;'(O) + m : n,m E Z}. Then the map h : B -+ A defined by ii(~(to) + m) = F;'(O) + m is order preserving by Lemma 8.6. Also h(t + 1) = h(t) + 1 by the construction. Because of the order preserving property, h has a continuous extension to the closure, h: cl(B) -+ cl(A) = R. The extension is also order preserving and satisfies h(t + 1) = h(t) + 1. The order preserving map ii : cl(B) -+ R induces a map h : w(x, f) -+ SI. This map h is semi-conjugacy of Ilw(x, f) to IT' Z}
2.8 HOMEOMORPHISMS OF THE CIRCLE
Claim 1. If cl(B) = R then h is one to one on Rand h is one to one on 8 1 . Thus part (b) of the theorem is true. PROOF. Assume there are Xl < X2 with h(x.) = h(X2). Letting i = 1,2, there are two increasing sequences of points Yn; + mi E B for j ~ 1 such that Yn; + mi converges to Xi from below. (The mi can be taken independent of j because the points Yn ' + mi J converge to a single point.) We can assume that Yn; + ml < Yn~ + m2 for all j and k by -
n~
-
n2
taking a subsequence. Then h(Yn; +ml) = FT' (O)+ml and h(Yn~ +m2) = FT J (O)+ml both converge to the same point and from the same side. Thus the two sequences in the image must be interlaced while those in the domain are not. This contradicts the order preserving property of h and proves the claim about h. The properties of h in this case follow from those of h. 0 Claim 2. 1£ cl(JB) '# R, then h takes the two elJd points of an open interval in R 'cI(B) to tile same point. Thus h has a continuous order preserving extension to R which takes the c106ure of each open interval in R 'cl(B) to a single point. This map h : R -+ R induces a semi-conjugacy h : 8 1 -+ 8 1 which satisfies the conditions of part (c) of the theorem. PROOF. Take an interval (eI,e2) in the complement of cI(B). No points in the orbit of
to intersect (el' e2). We need to prove that h(e.) = h(e2)' If h(e.) '# h(e2), then by the order preserving property of h the image h(cI(B» misses the interval (h(e.),h(e2)) in R. However h(cI(B» = cI(A) = R. This contradiction proves that it must be the case that h(e.) = h(e2)' Thus h has a continuous order preserving extension from R to R with h([el' e2]) = {h(ed}. The rest of the conclusions of the claim follow. 0 By Claims 1 and 2, there is always an order preserving map h : R -+ R with h(t+ 1) = h(t) + 1 which induces a semi-conjugacy h : 8 1 -+ 8 1 from I to IT' The two claims prove the desired properties of h. This completes the proof of Theorem 8.5. 0
Theorem 8.7 (Denjoy). Assume that I : 8 1 -+ 8 1 Is a C 2 orientation preserving diffeomorphism. Assume that T = p(f) Is irrational. Then I Is transitive so I is topologically conjugate to the rigid rotation IT' See Nitecki (1971), de Melo (1989), or de Melo and Van Strien (1993) for a proof.
Theorem 8.8 (Denjoy). Let T be irrational. Then there exists a Cl orientation preserving diffeomorphism I of 8 1 such that p(f) = T and w(x) # 8 1 . PROOF. The proof consists of placing the open intervals which correspond to the gaps of the Cantor set in the same order as an orbit of IT' the rigid rotation. Let in be a sequence of p06ltive real numbers (lengths) indexed by n e Z with (I) limn_:too(in+!/in) = I, (ii) E::'__ oo in = I, (iii) in > in+l for n ~ 0, (iv) in < in+! for n < 0, and (v) 3in+! -in> 0 for n ~ O. For example, in = T(lnl + 2)-I(lnl + 3)-1 works where T-I = E::'=_oo(lnl + 2)-I(lnl + 3)-1. Let In be an closed interval of length In. We place these intervals on the circle in the same oroNl' the order of the orbit I~(O}. SO to place an interval In, consider the sum of the lengths of the intervals I j where I~(O} is between I~(O) and O. This determines the placement of In. Since the circle has a total length of one, the measure of 8 1 \ UnEZ int(In} is zero. The next step is to define I on the union of the In. It is necessary and sufficient for f'(t) 1 on the endpoints in order for the map to have a continuous derivative when it
=
II. ONE DIMENSIONAL DYNAMICS BY ITERATION
is extended to the closure. Assume In =
[On, b..1,
SO
in = bn - an. The integral
SO
Therefore, if we define
f
f for x
() X
= an+!
E
+
In by
1'" + 1
6(in+! -in)
p-
an
then f(b n ) = an+1
+ in + in+!
-in
(b n
- t)(t -
an)dt,
n
= bn+!.
Also, f is differentiable on In with
I'(x) = 1 + 6(£n+;3 -in) (b n
-
x)(x - an).
n
Thus, f'(On)
= 1 = I'(b n ).
Notice that for n < 0, t n + 1
tn > 0, that
-
1 ~ /'(x)
< 1 + 6(tn+1 -in) (t'n)2 i~ 2 3tn+! -in 2tn
and (3tn+! -
t n )/(2in ) goes to 1 as n goes to minus infinity. Similarly for n 1 > f'(x) > 3t'n+! - in > 0
-
-
Un
~
0 and
'
f'(x) goes to 1 as n goes to plus infinity uniformly for x E In. From these facts, it follows that f is uniformly CIon the union of the interiors of the In and has a C l extension to all of SI. The second derivative I"(x) is given by
SO
!"(x) = 6(£n+;3 - in) [(b n - x) - (x - an)l, n
(i) f"((a n + bn )/2) = 0 in the middle of the interval, and (ii) as x goes to an, f"(x) converges to
SO
which is unbounded as Inl goes to infinity. Thus f is not C 2 , and this example of Denjoy does not contradict the theorem of Denjoy. Let A = SI \ UnEZ int(In). This set can be formed by successively removing open intervals. Because an open interval is eventually removed from any of the closed intervals obtained at a finite stage. A is a Cantor set.
2.9 EXERCISES
51
The orbit of a point z E A is dense in A since it is like the orbit of 0 for /r by Theorem 8.5. Thus w(z) A. If z E int(ln), then there is a smaller interval U whose closure is contained in int{ln). Since the interval In never returns to In but wanders among the other 1;, z ;. w( z). Also z ;. 0(1), the nonwandering set of ,. This proves that A w(z) 0(1) for any z E Sl. This completes the proof. 0
=
=
=
Recent work has dealt with the existence of a differentiable conjugacy between a diffeomorphism / with irrational rotation number T and /r. Arnold, Moser, and Berman have obtained results. See de Melo (1989) or the revised version de Melo and Van Strien (1993) for a discu88ion of these results and references.
2.9 Exercises Periodic Points 2.1. A homeomorphism I of It is (_trid/,) monotonicall, i"crtui", provided z < " implies that I(z) < 1(,,). It is (_trid/,) monoto"icall, decrtuin, if z < " implies that I(z) > I(y)· (a) Prove that any homeomorphism I of lit is either monotonically increasing or monotonically decreasing. (b) Prove that a homeomorphism I of lit can never have periodic points whose least period is greater than 2. 2.2. Let I : lit - lit be continuous. A88ume for one point Zo the orbit I; (zo) is a monotone sequence and is bounded. Prove that Ij(zo) converges to a fixed point. 2.3. Prove Theorem 2.2. 2.4. Let 2z for z ~ 1/2 T () z ={ 2 - 2z for z ;::: 1/2. be the tent map. (a) Sketch the graph on 1 [0,1] of T, T2, and (a representative graph of) T" for n
=
> 2.
(b) Use the graph of T" to conclude that T has exactly 2n points of period n. (These points do not nece88arily have least period n but are fixed by T".) (c) Prove that the set of all periodic points of T is dense in [0,1]. 2.5. Let F.. (z) 4z(l- z) on lit. (a) Make a rough sketch of the graph of F4'(z) for n > 2. (b) Use the graph of F4' to conclude that F: has exactly 2n fixed points. (These points do not necessarily have least period n but are fixed by F:.) 2.6. Consider the quadratic map F,. for parameter values I' > 1. (a) Find the points of period two and determine their stability. (Indicate for which parameter values they exist and the stability for different parameter values.) (b) Find the points of period four. ( c) Let II > 1 be in the range of parameters for which the orbit of period 2 is attracting. Let 0 < z < 1. Prove that either (i) there is an integer Ie ;::: 0 such that F!(z) PI' where PI' is the fixed point or (ii) the w(z) is the orbit of period 2. 5 2.7. Let I(z) z3 - 4z. (a) Find the fixed points, {O,±p}, and determine their stability. (b) Find the critical points and show they are non degenerate. Label the critical points ±c. Draw the graph of ,. (e) Find all points which satisfy I(z) -z. Use this information to find an orbit of period 2, {±q}.
=
=
=
=
II. ONE DIMENSIONAL DYNAMICS BY ITERATION
58
(d) Show that f2 is monotone on [-c,cJ where ±c are the critical points. Use this information to prove that W'(O(q)) ::) [-c,O)U(O,cJ. Then prove that W'(O(q)) = (-p, 0) U (0, p) where ±p are two of the fixed points. Limit Sets 2.8. Let f : X -+ X be a continuous map on a metric space. Let p EX. (a) Prove that cl(O+(p)) = O+(p) Uw(p). (b) Prove that if w(p) = 0 then O+(p) is a closed subset of X. (c) Assume that O+(p) is a compact subset of X. Prove that O+(p) = w(p). Invariant Cantor Sets 2.9. Let I = [0, IJ, and T.(x)
={
SX,
8(1 - x),
for x :-:; 1/2 for x ~ 1/2.
This is a generalized tent map. Prove that for S > 2, A = n{T.-k(I) : 0 ~ k < oo} is the middle-a Cantor set for some a. 2.10. Let b,(x) = x 3 - Ax. (a) Find the fixed points of b, and determine their stability for A > o. (b) Let I), = [-(A + 1)1/2, (A + 1)1/2J. For A > 0, prove that a point x with x rt I), has ff(x) --+ +00 or ff(x) --+ -00. (c) Consider A = 8. Show that
n n
Sn =
fik(/),)
k=O
contains 3n intervals. Show that
is a Cantor set (i.e., perfect and nowhere dense). Hint: If x E SI show that
I/s(x)1 > 1. 2.11. Use Lemma 4.4 to show that if I.l > 2(1 + 21/2) then Alfo has measure zero. Hint: If It > 2(1 + 21/2) then F~(x) > 2 on I. U h Remark: If I.l > 4 then Alfo has measure zero, but this fact is harder to prove. 2.12. For I.l > 4, show that FIfo(x) = I.lx(1 - x) is expanding by a factor of 2 in terms of the density p(x) = [x(l- X)J- I / 2, i.e., show that If~(x)lp(flfo(x))/p(x) ~ 2 for x E [0, IJ n F;I([O, 1]). Symbolic Dynamics 2.13. Let d be the metric on E2 = {1,2}N defined by
d(s, t)
=
f: ISj ~ tjl. j=O
(a) Given t E E2 and n
~
3J
0, prove that
Also prove that this set is a closed set and an open ball in the above metric.
2.9 EXERCISES
59
(b) Given t E E2 and n ? I, prove that
is an open set (but not an open ball). Note that the range of entries of with j = 1 and not O. (c) Given t E E2 and m ~ n, prove that
{s E E2 :
Sj
8
starts
= tj for m ~ j ~ n}
is an open set. These sets are called cylinder sets. (d) Prove that E2 with the metric d is a complete metric space. (e) Prove that E2 is compact in terms of the metric d. Remark: The topology induced by d is t,hc sanlc as the one obtained by considering E2 as the infinite product of {I, 2} with itself and using the product topology. Thus by Tychonoff's Theorem E2 is compact. In this exercise the reader is asked to verify this fact directly in this case. 2.14. Let A > 1 and P>. be the metric on E2 = {I, 2}N defined by
~
Is -tl
p>.(s, t) = L.J ~. j=O
Let d be the metric of the previous problem and Section 2.5. (a) Prove that the identity map, id, is a homeomorphism from E2 to itself when the domain is given the metric d and the range is given the metric P>.. This shows that the two metrics have the same open sets and so induce the same topology for anYA> 1. (b) Given t E E2 and n ? 0, prove that
{s E E2 :
Sj
= tj for 0 ~ j ~ n}
is an open ball in terms of the metric P>. if and only if A > 2. Note that these sets are open sets in terms terms of the metric P>. for any A > 1 by part (a). 2.15. Prove that the full p-shift is topologically mixing, i.e., (1" is topologically mixing on E" for any positive integer p ? 2. 2.16. Let I = [0,1]' Fs(x) = 5x(1 - x), and g () x = {
4x for 0 ~ x ~ 0.5 4x - 2 for 0.5 < x ~ 1.
The map g is called the Baker's map. (a) Let Fs-l(I) n 1= huh For a string
Sj
E {1,2} for 0 ~ j ~ n, let
For n = 1,2,3, indicate on a sketch all the possible intervals, 1'0 ...•"' for different choices of So •.. Sn. When is the interval 1. 0 ...."0 to the left of 1.0 ...."1 and when is it to the right? What effect does having a symbol 2 (where the slope is negative on h) have on the order of the intervals?
60
II. ONE D1MRNSJONAL DYNAMICS BY ITERATION
(b) Let g-I(I) n 1 = J I U J2; for a string
Sj
E {1,2} for O:s
j:S n, let
For n = 1,2, indicate on a sketch all these possible intervals, J. o ...•• , for different choices of so . .. SrI' What effect does having a symbol 2 (where the slope is negative on 12 ) have on the order of the intervals? What is the difference between the order of the interval in part (b) from those in part (a)?
2.17.
Let l'(x) = { 2x 2 - 2x
for x :S 1/2 for x ?: 1/2
he the tent map. Set II = [O,I/2J and 12 = [I/2,IJ. Define the map H : E2 ..... [0, IJ by
n 00
H(s) =
T-k(l •• ).
k=O
(a) Prove that H is well defined, i.e., that the intersection is a single point. (b) Prove that H is a semi-conjugacy from (1 on E2 to Ton [O,IJ. (e) What sequences correspond to x = 0,1,1/2, i.e., what are H-I(O), H-I(I), and H-I(I/2)? (d) Prove that H is one to one on most points, and at most two to one, i.e., prove that H-I(X) contains at most two sequences. What sequences go to the same point by H? (e) Prove that T is topologically transitive. 2.18. Consider F;. for It > 2 + 5 1/ 2 with invariant Cantor set AI" Prove that if t > 0 is small enough then there is a fJ > 0 such that for any fJ-chain {x j } ~o with d( x j , A,,) < fJ for all j there is a unique x E A" such that IXj - P(x)1 < t for all j. (A point x satisfying the conclusion of this exercise is said to t-shadow the fJ-chain {x) }.)
Conjugacy and Structural Stability 2.19. Which of the following are topologically conjugate on all of 1R? Which are differentiably conjugate? Prove your answer. (a) J(x) = x/2 (b) J(x) = 2x (e) J(x) = -2x (d) J(x) = 5x (e) J(x) = x 3 2.20. Let ga(Y) = ay2 - 1 and Je(x) = x 2 - e. For each paramet.er value a find a parameter value e and a linear conjugacy from ga to Je. 2.21. Prove that the periodic points of F4 are dense in [O,IJ. Hint: Use the conjugacy from the tent map T to F4 given in Example 6.2, and the fact from Exercise 2.4 that the periodic points for T are dense in [O,IJ, 2.22. Assume I : It ..... R has a fixed point at Xo. Find an affine conjugacy h from I to a new function 9 where 9 has a fixed point at O. Also give the definition of 9 in terms of the function I. Hint: Think of I as written in terms of coordinates x, XI = I(x), 9 as written in terms of coordinates y, YI = g(y), and the affine conjugacy h as transforming the y coordinates into the x coordinates by means of a translation, x = hey) = x + b, 2.23.
Prove that I(x)
= x 3 + x/2 is CI-structurally stable.
2.9 EXERCISES
61
2.24. Assume that I : X --+ X and 9 : Y --+ Yare semi-conjugate by means of the map h : X --+ Y. Let x be a point of least period n. Prove that h(x) is a periodic point for 9 whose period divides n. Also prove that if h is a conjugacy then the period of h(x) is n. Homeomorphisms of the Circle 2.25. Let I, 9 : Sl --+ Sl be two orientation preserving homeomorphisms that are topologically conjugate. Prove that they have the same rotation number, p(f) = p(g). 2.26. Let I : Sl --+ Sl be an orientation preserving homeomorphism with irrational rotation number. (a) Let 1= 1f"(x),Jm(x)] for n < m and some x E Sl. Prove that for any y E SI there is a positive k such that IIc(y) E I. Hint: Consider j-i(m-n)(l) for consecutive values of i. (b) Prove that for any two x,y E SI it must be the case that w(x) = w(y). Conclude that w(x) is minimal. Hint: Use part (a) to show that if Z E w(x) then Z E w(y). (c) Take x E SI and let E = w(x). Prove that E is perfect. Hint: for Z E E show that Z E w(z) = w(x}. What does this imply about how E must accumulate on Z? (d) For E = w(x), prove that E is nowhere dense or E = SI. Hint: Show the boundary of E is invariant. Since E is minimal, what are the possibilities for the boundary of E? 2.27. For the following two functions on the circle, (i) find all the periodic points, (ii) determine the stability of each periodic point, and (iii) describe the phase portrait. (a) I(O) = 0 + t sin (nO) mod 211' for n ~ I, t < lIn. (b) g(O) = 0 + (211'ln) + ESin(nO) mod 211' for n ~ 2, E < lIn. 2.28. This exercise applies the argument on the existence of a rotation number to show that a subadditive sequence has a limit. Suppose that an is a sequence of real numbers and C is a fixed real number for which a m+n $ am + an + c for all m, n E N. (In ergodic theory, if c = 0 this type of sequence is called subadditivf.) (a) Prove that alcp $ ka p + (k - l}c for all k,p EN. (b) Consider a fixed p > 0 and write n = kp + i where 0 $ i < n. Prove that an < ~
n - n
+ a p +c. p
(c) By letting n and p go to infinity in the right order, deduce that lim sup an $ lim inf a p , n-oo n p-oo p and so the limit limn_co anln exists. Note that the limit could be (d) If c = 0 and the sequence is subadditive, prove that
-00.
lim an = inf an. n nEN n
n-oo
2.29. Let I : SI --+ SI be an orientation preserving homeomorphism with lift F : R --+ R. Assume I has a point Xo with least period q. (a) Prove that F9(tO) = to + P for some integer p, where to E R is a lift of Xo E SI. (b) Prove that Po(F) = plq. Thus if j has a periodic point, its rotation number is rational.
CHAPTER III
Chaos and Its Measurement The theme of the chapter is complicated dynamics or chaos of maps on the line. The first section presents a theorem of Sharkovskii; it proves for certain n and k, that the existence of a periodic orbit of period n forces other orbits of period k. This theorem is not. exactly about complicated dynamics, but it does show that if a map / Oil the lille IIIL" 11 period which is not C
3.1 Sharkovskii's Theorem In Chapter II we showed that the quadratic map FI'(x) = I-'x(l - x) has an invariant Cantor set AI' with points of all periods. In this section we study the following question: if a continuous function / : R --+ R has a point of period n, does it follow that / must have a point of period k? Stated differently, which periods k are forced by which other periods n? The following theorem is very simple to state and has a relatively simple proof. Theorem 1.1, Li and Yorke (1975). Assume / : R --+ R is continuous, and there is a point a such that either (i) p(a) :5 a < /(a) < /2(a) or (ii) /3(a) 2: a > /(a) > /2(a). Then / bas points of all periods. REMARK 1.1. Note that it follows if / has a point of period three then it has points of all other periods, hence the title of the Li and Yorke paper, "Period three implies chaos".
63
III. CHAOS AND ITS MEASUREMENT
64
REMARK 1.2. There is a more general result of Sharkovskii (1964), which also was proved earlier than the result of Li and Yorke. We prove the simpler result first because the proof is simpler and the lemmas used in the proof of the result of Li and Yorke are needed for the more general result. The treatment of this whole section follows the paper Block, Guckenheimer, Misiurewicz, and Young (1980). Two good general references for these results and other related one dimensional results are A1sedil., Llibre, and Misiurewicz (1993) and Block and Coppel (1992). We assume the first ordering in the theorem with f3(a) :S a < !(a) < j2(a). (The proof for the other ordering is merely obtained by a reflection in the line.) Let I, = [a,f(a)] and 12 = [J(a), !2(a)]. Then !(ld :::) 12 and f(h) :::) I, u 12 as can be seen from the image of the endpoints of the intervals. Lemma 1.2. If I and J Me closed intervals and !(I) :::) J then there exists a subinterval K c I such that !(K) = J, !(int(K)) = int(J), and !(8K) = 8J. Let J = Ib l ,b2 ]. There exist al,a2 E I such that f(a]) = b]. Assume (The other rase is similar with suprema and infima interchanged.) Let
PROOF.
XI =
sup{x :
By continuity f(xd = bl · X2
Then !(X2)
=~.
!«XI,X2)) n8J
o
= 0.
N~te
= inf{x :
:S
al
that XI
X
XI
:S
X
a, < a2'
:S a2 such that f(x) = b.}. <
a2.
:S
a2
Next let such that !(x) = ~}.
Thus f({XI,X2}) = {b l ,b2 }. By the definitions of Xl and X2, Thus !(int([xl,X2]) = int(J) = (b l ,b2). This proves the lemma.
Definition. An interval I f -covers an interval J provided f(l) :::> J. We write I
-+
J.
Lemma 1.3. (a) Assume that there Me two points a i b with !(a) > a and f(b) < b and [a. bJ is contained in the domain of!. Then there is a lixed point between a and b. (b) If a closed interval I f-covers itself then! has a lixed point in I. PROOF. (a) Let g(x) = !(x) - x. Then g(a) > 0 and g(b) < O. By the Intermediate Value Theorem there is a point c between a and b where g(c) = 0 so f(c) = c. This result can also be seen graphically by considering the two cases where (i) a < band (ii) a > b. See Figure 1.1. f(a)
a ....------,"---.,.
f(a)
a FIGURE 1.1. Fixed Point for Lemma 1.3(a)
3.1 SHARKOVSKII'S THEOREM
65
(b) By Lemma 1.2, there is an interval K = [XI,Xl] c I with /(K) = I = [a,b]. Then either (i) /(xd = a ::; XI and /(Xl) = b ~ Xl, or (ii) /(xa) = b > XI and /(Xl) = a < X2' If the equality holds we are done. Otherwise pan (a) applies to prove there is a fixed point. 0
Lemma 1.4. Assume Jo -+ J I -+ .•• -+ I n = Jo is a loop with /(J,,) :::> J"+l for k = 0, ... ,n-1. (a.) Then there exists a. fixed point Xo of with /"(xo) E J" for k = 0, ... , n. (b) fUrther assume tha.t (i) this loop is not a product loop formed by going p times around a shorter loop of length m where mp = n, and (ii) int(J;) n int(J/c) = 0 unless J" = Ji . If the periodic point Xo of part (a) is in the interior of Jo then it has least period n.
r
REMARK 1.3. Note that the loops that we allow can repeat some intervals as for example J o ..... ./1 -+ ........ J.. - 2 ..... Jo -+ Jo, or Jo -+ J I -+ J 2 -+ J I -+ J2 -+ J o. However we do not allow a loop such as Jo -+ J I -+ Jo -+ J I . PROOF. (8) We give a proof by induction on j. The induction statement is as follows. (S) There exists a subinterval K) C Jo such that for i = 1, ... ,j, f'(K j ) C J i , f'(int(K j » C int(J.), and Ji(K j ) = J j • By Lemma 1.2 the induction hypothesis is true for j = 1. Assume (S,,-a) is true. Thus there exists a Kk-I' Then
/k(K,,_d
= IU"-I(K,,_a) = I(J,,-a)
:::> Jk
By Lemma 1.2, there exists a subinterval K" C K"_I such that I"(K,,) = J" with Ik(int(K,,» = int(J/c). By the induction assumption (S,,-a) the other statements of (S,,) are true. Now using the statement (Sn), we have r(Kn ) = Jo. By Lemma 1.3 has a fixed point Xo in Kn C Jo. Because Xo E K n , li(XO) E J i for i = 0, ... , n. This proves part (a). For part (b), since r(int(Kn» = int(Jo), if Xo E int(Jo) then Xo E int(Kn) and P(xo) E int(Ji ) for i = 1, ... , n. Because the loop is not a product, Xo must have
r
~od~
0
PROOF OF THEOREM 1.1. We assume the first case where I(a) = b > a, j2(a) = I(b) = c > I(a) = b, and 13(a) = I(c) s a. Let II = [a, bl and 12 = [b, c]. Then II I-covers 12 and 12 I-covers both II and h First F(h) :::> 12 so there is a fixed point by Lemma 1.3. Next we show that I has a point of period n for any n ~ 2. Take the loop of length n with one interval being h and n - 1 intervals being repeated copies of 12 : II -+ 12 -+ 12 -+ •.. -+ 12 -+ h. By Lemma 1.4, there exists an Xo E Ia such that In(xo) = Xo and J1(xo) E 12 for j = 1, ... , n -1. If there were a k with 1 s k < n such that /"(xo) = Xo, then we would have Xo = I"(xo) E h. Thus we would have Xo E II n 12 = {b}. We now show that Xo = b is impossible. The argument is slightly different for n = 2 and n ~ 3. In the case when n = 2, 12(b) = j2(xo) = Xo = b, contradicting j2(b) = 13(a) ::; a. In the case when n ~ 3, we must have 12(b) = 12(xO) E 12 contradicting j2(b) = /3(a) ::; a. This contradiction shows that Ji(xo) :F Xo for 1 ::; j < n, and Xo has period n. 0
Definition. In order to state the result of Sharkovskii we need to introduce a new ordering on the positive integers using the symbol c> called the Sharkovskii ordering. First the odd integers greater than one are put in the backward order:
III. CHAOS AND ITS MEASUREMENT
66
Next, all the integers which are two times an odd integer are added to the ordering, and then the odd integers times increasing powers of two: 3 [> 5 [> 7 [> [>
••• [>
2 . 3 [> 2 . 5 [>
2n . 3 [> 2n . 5 [>
••• [>
••• [>
22 . 3 [> 22 . 5 [>
2n +l . 3 [> 2n +l . 5 [>
•••
••••
Finally, all the powers of two are added to the ordering in decreasing powers:
We have now given an ordering between all positive integers. This ordering seems strange but it turns out to the be ordering which expresses which periods imply which other periods as given in the Theorem of Sharkovskii (Sharkovskii, 1964). Theorem 1.5 (Sharkovskii). Let I : I c IR -+ R be a continuous function from an interval I into the real line. Assume I has a point of period nand n [> k. Then I has a point of period k. (By period we mean least period.) Until the proof of the theorem is complete, I is assumed to be a continuous function from I to IR as given in the statement. The proof of the theorem involves finding intervals which I-cover each other in certain ways. In order to express these ideas we introduce the following definition of a type of graph. Definition. Let A = {I" ... , I.} be a partition of I into closed intervals with disjoint nonempty interiors. An A-graph 01 I is a directed graph with vertices given by the Ii and a directed edge from Ii to h if Ii I-covers h. It is also called the graph lor the partition. See Figures 1.3 and 1.4 for examples. Example 1.1. Let I have a graph as indicated in Figure 1.2 with three intervals, 1,,12,13, Then hI-covers 12, 12 I-covers 1\ and 12 , and 13 I-covers 1\, 12, and 13. Thus the graph for the partition is as given in Figure 1.3.
FIGURE 1.2. Graph of the Function in Example 1.1 REMARK 1.4. We first consider the case where n is an odd integer for which (i) n > 1 and (ii) I has a point x of period n and f has no points of odd period k with 1 < k < n (i.e., k [> n.) To prove Sharkovskii's Theorem in this case, Peter Stefan had the idea to
3.1 SHARKOVSKII'S THEOREM
67
FIGURE 1.3. Graph for the Partition in Example 1.1 prove the existence of an orbit with a special pattern on the line: let point such that Xn < Xn-2 < ... < Xa < XI < X2 < X4 < ... <
XI
be an n-periodic
Xn-I
where Xi = li-I(xd. (The reflection of this ordering is just as good.) A periodic point with such an ordering of its orbit on the line is called a Stelan c!Jdr:.. Lemma 1.6 proves that indeed such an orbit does exist. Given such an orbit, let II = [Xl, X2), 12 = [X3, Xl)' 13 = [X2,X4), I2j = [X2i+1' X2i-1I, and I2i-l = [x2i-2, X2i) for j = 2, ... ,(n - 1)/2. Because of the nature of the orbit, (i) II I-covers II and 12, (ii) Ii I-covers 1;+1 for 2 ::s j ::s n - 2, and (iii) In-I I-covers all the Ij for j odd. Thus the existence of such a special type of orbit proves that the A-graph of I contains a subgraph of the form given in Figure 1.4. This subgraph is called a Stefan graph. Applying the lemmas above to this Stefan graph can prove the existence of all the periodic implied by n in the Sharkovskii ordering.
FIOURE 1.4. Subgraph for the Partition In Lemma 1.6 We now turn to the lemma and its proof. Lemma 1.6. Assume n is an odd integer with n > 1. Assume that f has a point X of period n and I has no points of odd period k with 1 < k < n (i.e., k I> n.) Let J = [min O(x), max O(x)). Let A be the partition of J by the elements of O(x). Then the A-graph of I contains a subgraph of the following form: The II"'" In-I can be numbered with all the intervals having disjoint interiors such that (i) II I-covers II and h (ii) Ij f-covers Ij+1 for 2 ::s j ::s n - 2, and (iii) In-I f-covers all the I j for j odd. See Figure 1.4.
PROOF. Let O(x)
= {ZI' Z2,""
zn} where the Zj are ordered as on the line, ZI < Z2 < is one of the other Zj. Similarly, I(zd > ZI'
... < zn}. Then, I(zn) < Zn because I(zn)
111. CHAOS AND ITS MEASUREMENT
68
Let a = max{y E O(x) : f(y) > y}. Then, a =1= Zn. Let b be the next larger than a among O(x) in terms of the ordering of the real line. Let I, = [a, bl E A. We show that this II can be used in the statement of the lemma. There is a sequence of small steps which we state as claims. First we need to show that I, covers itself (Claim 1), and eventually covers all of J (Claim 2). Claim 4 shows that there is a shortest loop with distinct intervals I, -+ 12 -+ .•• -+ In-I -+ h. Claims 5 and 6 complete showing that these intervals are situated on the line and behave as claimed. Claim 1. The image of II covers itself, f(Id :::> h. PROOF. We know that f(a) > a so f(a) Therefore f(Id :::> I, as claimed.
~
b. Also since b > a, f(b) < b so f(b)
Claim 2. Tile (n - 2) image of I, covers tile whole interval J,
r-
2 (I1)
~
a. 0
:::> J.
I,.
P/lOOF. Since f(ld :::> fk+l(ld :::> fk(Id, so the iterates arc nested. The number of points in O(x) \ {a,b} is n - 2, so Zn E fk(Id for some 0 ~ k ~ n - 2. By the nested property, Zn E r- 2 (Id. Similarly E r-2(Id. Since II is connected, 2 (Ill :::> [z" znl = J. 0
z,
r-
Claim 3. There exists a Ko E A with Ko
=1=
h such that f(Ko) :::> I,.
PROOF. This proof uses the fact that n is odd, so there are more elements of O(x) on one side of int(h) than the other. Call P the elements of O(x) on the side of int(I,) with more elements. There is some y"Y2 E P with f(yd E P and f(Y2) E O(x) \ P. Take adjacent points YI and Y2 with iterates as above. Let Ko be the interval from YI to Y2. Then f(Ko) :::> I, and Ko =1= I, as claimed. 0 Claim 4. There is a loop I, -+ 12 loop witll k ~ 2 has k = n - 1.
--+ ••• -+
Ik
-+
II with 12
=1=
II. The shortest sucll
r-
2 (Ill :::> Ko. There PROOF. Let Ko be as in Claim 3, so f(Ko) :::> I,. By Claim 2. are only n - 1 distinct intervals in A so there exists such a loop with 2 ~ k ~ n - 1. Now assume the smallest k that works satisfies 2 ~ k < n - 1 and we get a contradiction. Since this is the shortest loop, none of the intervals can be repeated or it could be shortened. Either k or k + 1 is odd. Let m = k or k + 1 be this odd integer, so 1 < m < n. Use the loop with m intervals given by h -+ 12 --+ ••• --+ lie -+ I, or II -+ 12 -+ ..• -+ Ik -+ h -+ II depending on whether m = k or m = k + 1. By Lemma 1.4(a) there is a point Z with fm(z) = z. The point Z can not be on the boundary of the interval because these points have period n which is greater than m. Thus Z has least period m by Lemma 1.4(b). Since m is odd this contradicts the assumption on n in the Lemma. This contradiction proves that k = n - 1. 0
For the rest of the proof we fix I" h, ... ,In -, as in Claim 4. Claim 5. (a) If f(Ij) :::> I, tllen j = 1 or n - 1. (b) For j > i + 1 there is no directed edge from Ii to I j in the graph. (c) The interval I, I-covers only I, and 12. PROOF. Part (a) follows from Claim 4. Parts (b) and (c) follow because the loop is the shortest possible. 0 Claim 6. Either (i) the ordering (in terms of the real line) of the intervals I j in the loop of Claim 4 is I n -, ~ I n -3 ~ ... ~ 12 ~ I, ~ 13 ~ ... ~ In- 2 and the order of the
3.1 SHARKOVSKII'S THEOREM
69
orbit is /"-1 (a) < /"-3(a) < ... < 12(a) < a < I(a) < 13(a) < ... < /"-2(0) or (ii) both of these orderings are exactly reversed.
=
PROOF. Let II [a, bJ. The interval II I-covers only II and h 80 they must be next to each other. Assume that h ~ II. (The other possibility gives the reverse order
mentioned in the claim.) Then it must be that I(a)
= b and I(b) is the left endpoint of
h
Next, l(aI2 ) = ah Since one of these endpoints is I(a} = b which is above int(1d both endpoints of 13 must be above int(1d. Also because of Claim 5a (12 does not I-cover Id and 5b (12 does not I-cover I j for j > 3) 13 must be adjacent to II' Continue the argument by induction. For k < n -I, since I" does not I-cover II and 1" does not I-cover Ij for j > k + I, 1,,+1 must be adjacent to 1" _ I, This covers all the int.ervals in the claim. Note that we have also shown the ordering on the orbit as stated in the claim. 0 Claim 7. 'I'll(' intcrvlllln_1 I-covers 1111 the I j for j odd. PROOF. Note that In-I = [/"-I(a),/"-3(a)J. Then /(/"-I(a}) = /"(a) = a. Also /"-3(a) E I n - 3 80 /(/"-3(a» = /"-2(a) E I n - 2 is the far right endpoint of J (the largest element in the orbit O(x». Thus I(In-d :::> [a,/"-2(a)] = II U 13 U··· U I n- 2. We have proved the claim. 0
All the claims together prove Lemma 1.6.
o
Proposition 1.7. Theorem 1.5 is true if n is odd and maximal in the ordering for which the theorem is true. PROOF. Take k with nt> k. There are two cases: (a) k is even and k < n and (b) k > n with k either even or odd.
Case a. The integer k is even and k < n. PROOF. Consider the loop of length k given by In-I - In-" - In-"+1 - ... -In-I. By Lemma 1.4(a) there is a Xo E In-I with I"(xo) = Xo. The point Xo can not be an endpoint because the endpoints have period n. Therefore Xo has period k. 0
Case b. The integer k > n with k either even or odd. Consider the loop of length k given by II - 12 - ... - In-I - II - II II. Again by Lemma 1.4(a) there is a Xo E II with I"(xo) = Xo. If Xo E all then Xo has period n. Thus n divides k, 80 k ~ 2n ~ n + 3. Also since /"(xo) E II the iterate /"+1 (xo) ¢ II which contradicts the conclusion of Lemma 1.4(a}. Therefore Xo rt all. and by Lemma 1.4(b), Xo has period k. This completes the proof of Case b and Proposition 1.7. 0 PROOF.
.•• -+
The first step in proving the result for other values of n proves the existence of a point of period two whenever there is a point of even period. Lemma 1.8. If 1 has a point of even period then it has a point of period two. PROOF. Let n be the smallest integer greater than one in the usual ordering of the integers (not the Sharkovskii ordering) such that I has a point of period n. If n is odd then we are done by Proposition 1.7. Therefore we can assume that n is even. Let 0, II = [a, b], and J = [min O(a), max O(o)J = [A, B] be as before. In the proof of Lemma 1.6 we only used the fact that n is odd to show that there exists a Ko E A with Ko i: 11 and I(Ko) :::> II.
70
III. CHAOS AND ITS MEASUREMENT
First assume there is such a Ko. There is a minimal cycle as in Claim 4 with 2 :5 k :5 n - 1. As before, I" covers all the I j on the other side. Thus In-I -+ I n - 2 -+ In-I is a cycle of length two, and there is a point of period two. Next assume there is no Ko E A with Ko '" h and I(Ko} :::> h. It follows that (i) all the points Xj E O(a) with Xj :5 a have I(xj) ~ band (ii) all the points Xj E O(a) with Xj ~ b have I(xj) :5 b. Since some points in O(a) are mapped to b and B, both b, B E f(IA, aD and so f(IA, aD :::> Ib, B]. Similarly, 1(lb, BD c lA, a]. Then lA, a] -+ Ib, B] -+ lA, a] is a cycle of length two. The intervals are disjoint so there must be a point of period two. 0 The proof of Sharkovskii's Theorem now splits into the following cases. Case 1: n is odd and maximal in the Sharkovskii ordering and n I> k. Case 2: n = 2m and nl> k. Case 3: n = 2mp with p > 1 odd, m ~ 1, n is maximal in the Sharkovskii ordering, and n I> k. Case 1 is proved above in Proposition 1.7. We split Case 2 up into subcases and prove it next. Case 2: n = 2m and n I> k SO k = 2' with 0 :5 s < m. Case 2a: s = 0, i.e., 1 has a fixed point. Case 2b: s = 1. Case 2c: s > 1. PROOF OF CASE 2a. We can define a and b as before with f(a} Therefore f(la, b]) :::> la, b] and f has a fixed point.
~
band f(b} :5 a. 0
Case 2b follows from Lemma 1.8. PROOF OF CASE 2c. Let 9 = 1"/2 = p.-I. The map 9 has a point of period 2m -,+I with m - s + 1 ~ 2. Lemma 1.8 proves that 9 has a point Xo of period 2. So Xo = g2(xO} = I"(xo) and Xo '" g(xo) = 1"/2(xO)' Thus the period of Xo for 1 is 21 for some t :5 s. If t < s then Xo is fixed by 9 which is impossible. Therefore t = s and Xo is a point of period 2' = k. 0 We also split Case 3 up into subcases. Case 3: n = 2mp with p > 1 odd, m ~ 1, n is maximal in the Sharkovskii ordering for f, and n I> k. Case 3a: k = 2'q with s ~ m + 1 and 1 :5 q and q odd. Case 3b: k = 2' with s :5 m. Case 3c: k = 2mq with q odd and q > p. We leave the proof of these cases to the exercises. See Exercises 3.2 - 3.4. This completes the proof of Theorem 1.5 (Sharkovskii's Theorem). 0
3.1.1 Examples for Sharkovskii's Theorem There are examples of maps with exactly the orbits implied by the Sharkovskii ordering. First consider the case where the maximal period in the ordering is odd. Example 1.2. Let n > 3 be odd. (If n = 3 there are points of all periods and there is nothing to prove.) Let XI be a point which is a Stefan cycle for n. Make the graph be piecewise linear connecting the adjacent points (Xj,f(Xj}) on the graph by straight line segments. Let h, ... , In-I be the intervals as in the proof. See Figure 1.5. We claim that such a map does not have a point of odd period k with 1 < k < n. Assume that x is a periodic point with perino different than n. If any iterate of x hits one of the endpoints of an I j then either :r ita" period n or is not periodic and so this can not happen. Thus J1(x} E int(/i(j)} for each j. Because the A-graph is exactly the
3.1.1 EXAMPLES FOR SHARKOVSKII'S THEOREM
71
nr---,.--------.---~------_,
1._2
n
FIGURE 1-5.
Example 1.2
subgraph proved to exist by Lemma 1-6, the length of cycles in the graph are exactly those k which are implied by n in the Sharkovskii ordering. Also the graph over I, has slope -2. There is a fixed point in h and all other points must leave I, and enter 12' Thus all orbits passing through h are either fixed points or have periods at least n - 1Other orbits have to have the same period as the period of the cycle of intervals (since the orbit must pass through the interiors)_ Thus the possible periods of periodic points are exactly those implied by n in the Sharkovskii ordering. In particular there are no points of odd period with 1 < k < n_ Definition. To get other examples with certain periods we introduce the doubling operator. Let I = [O,IJ. Assume I : I -+ I is a continuous map. We denote the periods of the orbits of I by P(f). Now we define the double 01 I, V(f) = g, by
g(x) =
~ + l/(3x) { [2 + I(l)](~
x- ~
- x)
l
for 0::; x ::; for x ::; ~ for ~ ::; x ::; 1.
l ::;
See Figure 1.6. It is easily checked that 9 is continuous.
FIGURE 1-6. The original map the right figure
I
is in the left figure and the double 9 is in
72
JII. CHAOS AND ITS MEASUREMENT
The next proposition relates the periods of I with the periods of the double of this result justifies the use of the name "double" for this construction.
I;
Proposition 1.9. The set of all periods of g, P(g), are related to the set of all periods of I, 1'(1), by p(g) = 21'(1) U {I}. Moreover 9 has exactly one repelling lixed point, and for each n 9 has the same number of orbits of period 2n as I has of period n and their stability is the same. PROOF. Let II = [0,1/3]' 12 = (1/3,2/3), and 13 = [2/3,IJ. Because g(I2) :) h 9 has a fixed point XI in h. Because the absolute value of the slope of gin h is at least 2, there is exactly one fixed point in 12 and it is repelling. Also any point in 12 other than the fixed point has an orbit which leaves h Because [g(h) Ug(I3)J nI2 = 0, none of the points in 12 \ {xd can be periodic. If X E II then g2(X) = g(2/3+ 1(3x)/3) = 1(3x)/3 E II' Thus for X to be periodic, its period must be even, 2k. But by induction, g2k(x) = I k(3x)/3 for k ~ 1. Thus g2k(x) = x if and only if 1"(3x) = 3x. We have shown that these periods of 9 are exactly twice the periods of I. Moreover, since g'(t) = 1 on 13, for a point x of period 2k for g, (g2k)'(x) = (lk)'(3x) so the two orbits have the same stability type. The periodic points of 9 in 13 are the same because they are on the orbits described above. This proves the proposition. 0
Example 1.3. Let I(x) == 1/3 for x E [O,IJ. The only periodic point of I is a fixed point. Let /J = V(I). By the above proposition the periods of /J, P(lIl, are {1,2}. Also, /J has one repelling fixed point and one attracting orbit of period 2. By induction, if In = vn(f) then the periods of In, P(ln), are {I, 2, ... , 2n }, and In has one repelling periodic orbit of pt'fiod 2i for 1 $ j < n and one at.tract.ing periodic orbit of period 2n. Finally let I"",(x) = lim n_ oo In(x). We leave to an exercise to prove that 100 is continllolls and 1'(/00) = {1,2, ... ,2n , ... }, i.e., 100 has repelling periodic points of periods 2n for all n and no other periods. See Exercise 3.8. We also leave to the exercises the fact that if n = 2mp for 1 < p, p odd, and m then there is a map I for which 1'(1) = {k : nc> k}. See Exercise 3.5.
~
1,
3.2 Subshifts of Finite Type In the proof of Sharkovskii's Theorem we considered graphs where intervals I-covered each other forming a graph as given in Figure 2.1.
FIGURE 2.1.
Graph of Partition in Sharkovskii's Theorem
With this graph, a point in II can go to II or 12; a point in 12 can go to 13; a point in 13 can go to I.; a point in I. can go to Ir.; a point in Is can go to 16; a point in 16 can go to II, 13 , or I r.. Paths in the graph correspond to allowable orbits of points. We can look at only the labeling of the intervals (as we did for the quadratic map I,,(x) = 1lX(1 - x)
3.2 SUBSHIFTS OF FINITE TYPE
73
for II- > 2 + 5 1/ 2 ) and consider sequences s = 808182 ... where 1 can be followed by 1 or 2; 2 can only be followed by 3; 3 can only be followed by 4; 4 can only be followed by 5; 5 can only be followed by 6; can 6 can be followed by 1, 3, or 5. All other adjoining combinations are not allowed. Thus a sequence like 634561123456 ... is allowed. Definition. Instead of looking at the graph, we can define a transition matrix to be a matrix A = (aij) such that (i) aij = 0,1 for all i and j, (ii) E j aij ~ 1 for all i, and (iii) E i D.ij ~ 1 for all j. Given a graph of the type in Sharkovskii's Theorem, we can form a transition matrix A by letting aij = 0 if the transition from i to j is not allowed (there is no arrow in the graph from Ii to I j ) and aij = 1 if the transition from i to j is allowed (there is an arrow in the graph from Ii to I j ). The assumption that E j D.i; ~ 1 for every i means that it is possible to go to some interval from Ii; the assumption that E i aij ~ 1 for every j means that it is possible to get back to I j from some interval. In the above graph the transition matrix is 0 0 0 0) o1 01 01 0 0 0 0 0 1 0 0 ( A= 0 0 0 0 1 0 . o 0 0 001 1 0 1 010 Definition. Let En be the space of all (one-sided) sequences with symbols in the set {I, 2, ... , n} as defined in Section 2.5, and (1 : En -+ En be the shift map given by (1(S) = t where tk = 8k+I' This space has a metric as defined before. Given an n by n transition matrix A let EA = {s E En : a•••• +! = 1 for k = 0,1,2, ... }. This space EA is made up of the allowable sequences for A. Let (1A = (1IEA' The following proposition shows that (1 A acting on a sequence in (1 A gives another sequence in (1A. The map (1A : EA -+ EA is called the subshift 01 finite type lor the matrix A. The following proposition also shows that EA is closed. Proposition 2.1. (a) The subset EA is closed in En. (b) The map (1A leaves EA invariant, (1A(E A ) = EA. (a) By using cylinder sets it is easily seen that EA is closed. (b) If S E EA and t = (1 A(S) then it follows directly that all the transitions in t are allowed so t E EA. On the other hand, if t E EA then there is some 80 such that a.ol o = 1 by the standing assumptions on A. Let Sic = tk-I for k ~ 1. Then s E EA and(1A(s)=t. 0 PROOF.
Definition. In general, a subset SeEn is called a SUb8hift provided that it is closed and invariant by the shift map (1. The following example gives a subshift which is not of finite type. Example 2.1. Let S be the subset of E2 consisting of all strings s such that between any two 2's in the string s there are an even number of 1'5: i.e., if 8; = 2 = 81c with j < k then there are an even number of indices i with j < i < k for which 8; = 1. This allows the string s to start with an odd number of 1'5, and s can have an infinite tail of aliI's or all 2's. A direct check shows that S is closed and invariant under the shift map. Because the number of 1's between two 2's can be an arbitrary even number, S is not a subshift of finite type.
III. CHAOS AND ITS MEASUREMENT
74
The next thing we want to do is count the number of periodic orbits in EA. A string which has period k for q A keeps repeating the first k symbols that appear in its string, e.g. 124212421242". has period 4. Therefore it is helpful to look at finite strings of symbols which are called words. Therefore 1242 is a word of length 4. Given a transition matrix A, a word w = (wo, ... ,Wk- d is called allowable provided the transition from WJ-I to WJ isallowahle for j = 1, ... , k, i.e., aWJ_I,w, = 1 for j = 1, ... , k. As a first step (the ind uction step) to determine the number of k-periodic points for q A, we prove the following lemma about the number of words of length k + 1 which start at any symbol i and end at the symbol j. Lemma 2.2. Assume that the ij entry of Ak is p, (Ak)ij = p. Then there are p allo"wable ""'ords of length k + 1 starting at i and ending at j, i.e., words of the form iSlS2 ...
sk-d·
PROOF. We prove the result by induction on k. Let num(k, i,j) be the number of words of length k + 1 starting at i and ending at j. This result is certainly true for k = 1 where num(I,i,j) is either zero or one depending on whether there is an allowable transition from i to j or not. Now assume the lemma is true for k - 1 for all choices of i and j. By matrix multiplication
3._1
"1 .• 2 ..... 61.:-2
""-I = num(k, i,j).
The last equality is true because if a'._d = 0 then these wOl'ds from i to Sk-l do not contribute to the count of the words from i to j. On the other hand, if a'._d = 1 then each of these words contributes one word to the words from i to j. 0 Corollary 2.3. The number of fixed points of q~ is equal to the trace of Ak. PROOF. This follows because
# Fix(q~IEA)
=
Li num(k, i, i) =
Li(Ak)ii = tr(Ak). 0
Definition. An n by n matrix of O's and 1's is called reducible provided that there is a pair i. i with (A k )ij = 0 for all k ~ 1. An n by n matrix of O's and 1'5 is called irreducible provided that for each 1 ~ i,i ~ n there exists a k = k(i,j) > 0 such that (Ak)ij > 0, i.e., there is an allowable sequence from i to i for every pair of i and j. The matrix A is called positive provided Aij > 0 for all i and i and is called eventually positive provided there there exists a k which is independent of i and j such that (Ak)ij > 0 for all i and j. Thus, both positive and eventually positive matrices are irreducible. Example 2.2. (a) The following transition matrix is reducible because it is not possible to get from 3 to 1:
(i ~ ! ~). o
0 1 0
3.2 SUBSHIFTS OF FINITE TYPE
75
FIGURE 2.2. Graph for Partition in Example 2.2(a) Its graph is given in Figure 2.2. (b) The following transition matrix is irreducible:
(~
~ ~1 ~) 1
1 0 o 0
1 0
Its graph is given in Figure 2.3.
FIGURE 2.3. Graph for Partition in Example 2.2(b) We often want to exclude the case when A corresponds to a permutation of symbols. A permutation is defined to be a transition matrix where the sum of each row is equal to one and the sum of each column is also equal to one. An example of a permutation is given by
Its graph is given in Figure 2.4.
1
~
2
\1 3 FIGURE 2.4. Graph for Permutation on Three Symbols
76
III. CHAOS AND ITS MEASUREMENT
The following lemma characterizes permutations in terms of row sums only.
Lemma 2.4. Let A be a trMsition matrix. Then A is a permutation matrix if and only ifL) a'j = 1 for all i. PROOF. If every row sum, Lj aij = 1 for all i, then Li,j a'j = n. Since a transition matrix has Li ail 2: I for every j, it follows that Li ail = I for every j. This shows A is a permutation matrix. The converse is clear. 0
In the next proposition, we show that a subshift is irreducible if and only if the shift map has a dense orbit. As a preliminary step, we prove the following lemma about the dense orbit.
Lemma 2.5. Let A be a transition matrix. Assume that the point s· has a dense orbit in EA for the shift map U A and s· is not a periodic point. (This last assumption means that A is not a permutation matrix') Then for My k > 1, u~(s') has a dense orbit. PROOF. It is clear that u~+)(s') is dense in EA \ {uA(s') : 0 ~ i < k}. Thus we only need to show that this orbit accumulates on {U~ (5') : 0 ~ i < k}. It is clearly sufficient
to prove that u~+i (5') accumulates on s' by taking a higher iterate to get near the other
uA(s·). Rather than look at s', we show that s' has a preimage t' and that u~+j(s') accumulates on t·. First, we show s· has a preimage. By assumption (iii) for a transition matrix, there is an element tl that can make a transition to == tl, so there is a t' E LA with u A (t') = 5'. If t' were on the forward orbit of s· then s· would be periodic which is not allowed. Therefore t' is not on the forward orbit of 5', and so u~(s') '" t' for o ~ j ~ k. Since u~(s') is dense everywhere in E A , and ~(s') '" t' for 0 ~ j ~ k, it follows that u~+)(s') must come closer to t' than the finite set of points {u{(s') : 0 ~ j ~ k}. Therefore ~ 0 u~ (5') comes arbitrarily near t·. This completes the proof that the forward orbit of u~(s') is dense in EA. 0
So
Proposition 2.6. Let A = (ai,i) be a transition matrix. Then the following are equivalent: (a) A is irreducible, and (b) UA has a dense forward orbit in EA. PROOF. For a finite word w, let b(w) be the first letter of w (beginning of w), and e(w) be the last letter of w (end of w). Given each pair i and j let tij be a choice of the words with b(ti]) = i and with e( t ij ) = j. Such a choice exists because A is irreducible. First we show that (a) implies (b). We describe the point with a dense orbit, s·. List all the words of length one with the proper choice of the transition word tij between them to make the sequence allowable. Then list all the allowable words of length two with the proper choice of the transition word tij between them to make the sequence allowable. Continue by induction listing all the allowable words of length n with the proper choice of the transition word tij between them to make the sequence allowable. In this way we construct an infinite allowable sequence which contains all the allowable words of finite length. If u is a sequence in EA and V is a neighborhood of u, then there is some n such that any sequence which agrees with u in the first n places is contained in V, Now for this n there is a word somewhere in s' which agrees with this word of length n in u, Next there is a k such that u~(s') has this word in the first n places. Thus u~(s') E V. Since u and V were arbitrary, this proves that the orbit of s' is dense, (The reader might consider the case when A is a permutation matrix separately, but the above proof is also applies to this case.)
3.2 SlJBSHlFTS OF FINITE TYPE
77
Next we show that (b) implies (a). If A is a permutation matrix, then it is clearly the case that A is irreducible. Thus we can assume that A is not a permutation matrix. Take s· E EA whose orbit is dense in EA. Because the matrix is not a permutation matrix, s· can not be a periodic point. Take an arbitrary pair i and j. By assumption (ii) for a transition matrix, it is possible to take a E EA such that ao = i. If a point t is close enough to a then to = ao. There is some kJ such that (7~I(S') is within this distance so (7~'(S')O = ao = i, and sk, = ao = i. Thus i appears in the sequence for s' . Similarly there is abE EA such that bo = j. By Lemma 2.5, (7~I(S') has a dense forward orbit. The same argument as above shows there is a k2 > kJ with (7!'(s')o = bo = j, and so sk. = bo = j. Thus there is an allowable word in s' from the entry to the k~" entry which goes from i to j, and we can get from i to j for an arbitrary pair i and j. 0
ki"
By Lemma 2.4, in order to assume that A is not a permutation matrix it is only necessary to assume that E j aij ~ 2 for some i. We use this assumption to prove that EA is perfect. First, we prove a preliminary lemma. Lemma 2.1. Assume that A is an irreducible transition matrix such that E j aioj for some io. Then for each i there exists a k = k(i) for which Ej(Ak)ij ~ 2. PROOF. Since A is irreducible there is a word wE EA such that b(w) = i and e(w) Let the length of w be k, so there are k - 1 transitions. Thus num(k - 1, i, io) Then there are least two possible choices after io. Thus
~
2
= i o. ~
1.
o Proposition 2.8. Assume that A is an irreducible transition matrix with for some i o. Then EA is perfect.
E j aioj
~
2
PROOF. For each i there is a k = k(i) such that Ej(Ak)ij ~ 2. Take an s E EA. Take a cylinder set U as a neighborhood, U = {t E EA : ti = Si for 0 $ i $ n}. Then there exists a k = k(sn) such that Ej(Ak) •• j ~ 2. Because there is more than one choice for the transitions from the nl" to the (n + k)'" entry, there is atE U with tn+m '" Sn+m for some m with 1 $ m $ k. This is true for all s E EA and all cylinder sets, so EA is perfect. 0
Proposition 2.9. Assume that A is an eventually positive transition matrix. Then is topologically mixing on EA.
(7A
We leave the proof to the exercises. See Exercise 3.14. Proposition 2.10. Assume A is a transition matrix. (We do not assume A is irreducible.) Then the states can be ordered in such a way that A has the following block form:
..
A2
•
o
0
... ...
78
III. CHAOS AND ITS MEASUREMENT
where (i) each Aj is irreducible, (ii) the * terms are arbitrary, and (iii) all the terms below the blocks Aj are all O. Moreover, the nonwandering set O(UA) = EAt U·· ·UE Am . We defer the proof to the exercises. See Exercise 3.15. REMARK 2.1. For an introductory treatment of further topics in symbolic dynamics, see Boyle (1993).
3.3 Zeta Function Artin and Mazur had the idea to combine the number of periodic points of all periods into a single invariant (Artin and Mazur, 1965). If we list all these numbers there are countably many invariants. For certain classes of maps, when these numbers are combined together in a certain way they yield a rational function which has only a finite number of coefficients. Thus the information given by these countable number of invariants is contained in this finite set of coefficients. We proceed with the formal definitions. Definition. Let f : X --+ X be a map, and N,,(J) zeta function for f is defined to be 00
= #(Per,,(J)) = #
Fix(J"). The
1
(,(t) = exp(L TcN,,(f)tk). k=J
The zeta function is clearly invariant under topological conjugacy because the number of points of each period is preserved. For more discussion of the zeta function see Chaptt'r 5 of Franks (1982). In this section, we merely calculate the zeta function for a subshift of finite type. This theorem was originally proved by Bowen and Lanford (1970). In Chapter VII, we return to prove that the zeta function is a rational function of t for some further types of maps (toral Anosov diffeomorphisms). Before stating the throrem, we give a connection between the determinant, exponential, and trace of a matrix. (The exponential of a matrix is defined by substituting the matrix into the power series for the exponential. It is discussed further in Section 4.3 in the context of solutions of linear differential equations.) Lemma 3.1 (Liouville's Formula). Let B be a matrix. Then det(e B ) = etr(B). PROOF. Let eJ ... en be the standard basis. The following calculation uses the facts that the determinant is alternating in the columns, that e BI = (eBle J, ... ,eBlen ), that e BO = I. and that fte BI = Be B !. Then
d BI )1100 = ' " d BI ej II=O,ej+J. ... ,en ) dtdet(e L.,..det(el, ... ,ej-I'dte J
= Ldet(el, ... ,ej_J,Bej,ej+1, ... ,en )
= LbiJdet(el, ... ,ej-J,ei,ej+I, ... ,en) i,j
= Lbjjdet(el, ... ,ej-l,ej,ej+I, ... ,en) j
= tr(B).
3.4 PERIOD DOUBLING CASCADE
79
For t = to,
dtd det(eBt)lt=to
=
dtd det(eB(t-to)lt=to det(e Bto )
= tr(B) det(e Bto ).
d Solving the scalar differential equation dt det(e Bt ) = tr(B) det(e B1 ) with initial condition det(e BO ) = 1 gives that det(e Bt ) = etr(B)t. Evaluating this solution at t = 1 gives the result. 0 Theorem 3.2. Let (7 A : EA -+ EA be the subsmit of finite type for A = (aij) with E {O, I} for every pair of i and j. Then the zeta function of (7 A is rational. Moreover (.,. .. (t) = [det(I - tA)J-I. aij
PROOF. By Corollary 3.3, Nk((7A) = tr(Ak). Therefore, using the linearity of the trace, the power series expansion of the logarithm, and Lemma 3.10 we can make the following calculation.
1
(L i/ tr(A 00
(.,. .. (t) = exp
k ))
k=1 00
1
= exp (tr(L ktk Ak») k=1
= exp ( tr( -log(I -
tA»)
= det ( exp(log(I - tA) -I»)
= det ((I -
tA)-I)
= (det(I -
tA)) -I.
This proves the theorem.
o
3.4 Period Doubling Cascade The Sharkovskii Theorem tells us which periods imply which other periods. In particular, if a map / : R -+ R has finitely many periodic orbits then all the periods must be powers of 2. For the quadratic family, F,.(x) = /Lx(1 - x), we saw that it had only fixed points for 0 < /L ~ 3, and only fixed points and a point of period 2 for 3 < /L ~ 1 + 6 1/ 2 • In fact Douady and Hubbard (1985) proved that for the quadratic family as /L increases new periods are added to the list of periods appearing and never disappear once they have occurred. See de Melo and Van Strien (1993). Let /Ln be the infimum of the parameter values /L > 0 for which F,. has a point of period 2n. By the Sharkovskii Theorem, /Ln ~ /Ln+!' Notice that all the /Ln < 4 because F. has points of a1l periods. Let /Loo be the limiting value of the /Ln as n goes to infinity. The dynamics for F,."", is like the map /00 given in Exercise 3.8: there is an invariant set on which F,."", acts like an adding machine. This sequence of bifurcations is often called the period doubling route to chaos. At the bifurcation value /LI = 3 for the family F,., the fixed point PI' changes from attracting for 1 < /L < /Ll to repelling for /LI < /L. At /L = /Ll. F~. (Pp.) = -1. For p. slightly larger than /Ll, the 2-periodic orbit O(Pp,l) is attracting with derivative just less than one, 1 > (F;)'(p,., I ) > O. In Chapter VI, we study the period doubling bifurcation
III. CHAOS AND ITS MEASUREMENT
80
and show at I-' = 1-'2 where the period four orbit is created that (F;.),(p".,d = -1. Again, this 2-periodic orbit O(p",d changes from attracting to repelling as I-' moves past 1-'2. The period 4 orbit O(p",2) is initially attracting for I-' just slight larger than 1-'2 and becomes repelling for I-' > 1-'3. This process repeats itself; at I-' = I-'n the period 2n orbit O(P",n) is added. This orbit is attracting for I-'n < I-' < I-'n+1 and becomes repelling for I-' > I-'n+l. A natural question to ask is the rate of convergence of the parameter values I-'n to 1-'00' Consider a geometric sequence of numbers, >'n = Co - cl>.n, where 0 < >. < 1. For this example the limiting value >'00 = Co and >. (or>. -I) gives the rate of convergence to >'00' In general, we want to define a quantity which measures the geometric rate of convergence to the limiting value. Feigenbaum (1978) calculated the rate of convergence by means of the limit () = lim I-'n - I-'n-I . n~oo I-'n+l -I-'n This value b is called the Feigenbaum constant. Notice for the sequence I-'n = Co - cl>.n, the value () would equal>. -I: lim I-'n n~oo
-I-'n-l
I-'n+ I
-
I-'n
r
Co - cl>.n - Co
+ cl>.n-1
n~~ Co - c l >.n+1 - Co + C1>.n
= >. -I. Feigenbaum (1978) discovered that this constant is the same for several different families of functions. The value has been calculated to be () = 4.669202. . .. Both Feigenbaum (1978) and Coullet and Tresser (1978) suggested using the renormalization method to prove the universality of this constant, i.e., that the constant is the same for any one parameter family of functions which go through the period doubling sequence of bifurcations. Much of this program has now been proved by Feigenbaum, Coullet, Tresser, Collet, Eckmann, Lanford, and others, but there are some mathematical aspects of this program which are still unproven. See de Melo and Van Strien (1993), Collet and Eckmann (1980), and Lanford (1984a, 1984b, 1986). How can these parameter values I-'n be determined for the family F,,? We mentioned above that the period 2n orbit is attracting for I-'n < I-' < I-'n+1' In fact the critical point xo = 0.5 must converge to the attracting periodic orbit. See the discussion of negative Schwarzian derivative in Devaney (1989) or de Melo and Van Strien (1993). Thus to find the attracting periodic orbit we could iterate the critical point a number of times, say 1000, without recording the iterates, and then record or plot the next 1000 iterates. See Figure 4.1A. Note that for the family F" the limiting parameter value I-'oc = 3.5699456. Therefore the whole period doubling bifurcation is shown in Figure 4.1B. Since this part of the orbit is near an attracting periodic orbit, we could inspect the orbit to determine the period. By varying 1-', we could determine the value of I-' when the orbit changed from period 2n- 1 to 2n; this value of I-' gives I-'n. A second method to determine the I-'n is to note that the period 2n - 1 orbit O(p",n-d becomes unstable at I-'n and
(F;:-')'(P"n,n-d
=-1.
Thus we could use a numerical scheme (e.g. Newton's method) to search for a point and a parameter value with this property. This search would determine the I-'n. Finally, there is a third method for determining the rate of convergence given by the Feigenbaum constant by determining slightly different parameter values. We mentioned above that and
3.5 CHAOS
81
FIGURE 4.1A. The Bifurcation Diagram for the Family FI': the Horizontal Direction is the Parameter ~ Between 2.9 and 3.6; the Vertical Direction is the Space Variable x Between 0 and 1 Between these two parameter values, there is another value 2" '(Pl'~.n ) (FI'J
~~
for which
= 0,
I.e., the critical point 0.5 has least period 2". These parameter values satisfy· .. < ~ < < ~n+l < .... Using these parameter values ~~ instead of the ~ gives the same universal constant as the rate of convergence. For larger values of the parameter ~ but with ~ still less than 4, an orbit of a point for the quadratic family FI' seems to be dense in the whole interval [0,1]. In fact Jakobson (1971, 1981) proved that there is a set of parameter values Me [4 - £,4] such that (i) M has positive Lebesgue measure (and 4 is a density point) and (U) for every ~ E M, FI' has an invariant measure vI' on [0,1] that is absolutely continuous with respect to Lebesgue measure on [0,1]. This result implies that most points in [0,1] have orbits which are dense in the interval [0, I] for these parameter values. Many people have written papers on this and related results. See de Melo and Van Strien (1993) for further discussion of this result. Recently, Benedi~ks and Carleson (1991) have used results about the transitivity of this one dimensional family of maps to prove the transitivity of the two dimensional Henon family of maps for certain parameter values. We discuss this further in Chapter VII. ~~
3.5 Chaos Dynamical systems are often said to exhibit chaos without a precise definition of what this means. In this section, we discuss concepts related to the chaotic nature of maps and give some tentative definitions of a chaotic invariant set.
82
III. CHAOS AND ITS MEASUREMENT
FIGURE 4.1B. The Bifurcation Diagram for the Family F,,: the Horizontal Direction is the Parameter IJ Between 3.54 and 3.5701; the Vertical Direction is the Space Variable x Between 0.47 and 0.57 In Section 2.5, we prove that the quadratic map F", with IJ > 4, is transitive on its invariant set AI" The property of being transitive implies that this set can not be broken up into two closed disjoint invariant sets. For a set to be called "chaotic" it should be (dynamically) indecomposable in some sense of the word. A weaker notion than transitive, which still includes the some kind of dynamic indecomposability of an invariant set, is that it is chain transitive. See Section 2.3. This latter condition seems more natural than topological transitivity but it is not as strong and allows certain examples which do not seem chaotic. (Both periodic motion and "quasi-periodic" motion is chain transitive but not very chaotic.) Therefore in the first definition of a chaotic invariant set we require the map to be topologically transitive and only give chain transitivity as an alternative assumption. To define a chaotic invariant set, we also want to add a second assumption which indicates that the dynamics of the map on the invariant set are disorderly, or at least that nearby orbits do not stay near each other under iteration. The following definition of sensitive dependence on initial conditions is one possible such concept. In the next section we define the Liapunov exponents of a map. Another way to express that nearby orbits diverge is that the map has positive Liapunov exponents. See Section 3.6. Definition. A map I on a metric space X is said to have sensitive dependence on initial conditions provided there is an T > 0 (independent of the point) such that for each point x E X and for each f > 0 there is a point y E X with d(x,y) < f and a k ~ 0 such that dU"(x),f"(y» ~ T. One of the early situations where sensitive dependence was observed was in a set of differential equations in three variables. E. Lorenz was studying a the system mentioned in Section 1.3 and discussed in Section 7.10 (Lorenz, 1963). While numerically
3.5 CHAOS
83
integrating the equations, he recorded the coordinates of the trajectory to only a threedecimal-place accuracy. After calculating an orbit, he tried to duplicate the latter part of the trajectory by entering as a new initial point q the coordinates of some point part way through the initial calculation, Pt.. Because the original trajectory had more decimal places stored in memory than he entered the second time by hand, the points Pt. and q were not the same but merely nearby points. He observed that the two tra.jectories, the original trajectory and the one started with the slightly different initial condition, followed each other for a period of time and then diverged from each other rapidly. This divergence is an indication of sensitive dependence on initial conditions of the particular system which he was studying. Another way that sensitive dependence is manifest is through the round off errors of the computer. Curry (1979) reports on numerical studies of the Henon map (for A = 1.4 and B = -0.3) using two different computers. After 60 iterates, the iterates have nothing to do with each other. Thinking of the numerical orbit 011 a computer as an (-chain of the function, two different (-chains diverge, giving an indication of sensitive dependence on initial conditions. On the other hand, the plot of the orbits on the two machines seem to fill up the same subset of the plane, giving an indication that the function is topologically transitive (or at least chain transitive) on this invariant set. For further discussion of an attractor for the Henon map, see Sections 1.3 and 7.9. The concept of sensitive dependence on initial conditions is closely related to another concept called expansive: a map is expansive provided any two orbits become at least a fixed distance apart. Definition. A map I on a metric space X is said to be expansive provided there is an r > 0 (independent of the points) such that for each pair of points x, y E X there is a k ~ 0 such that d(fk(x),lk(y» ~ r. If I is a homeomorphism, then in the definition of expansive we allow k E Z and do not require that k is positive, i.e., there is an r > 0 such that for each pair of points x, y E X there is a k E Z such that d(fk (x), IA: (y)) ~ r. If I is expansive and X is a perfect metric space, then it has sensitive dependence on initial conditions. In determining the proper characterization of chaos, the assumption that the map is expansive seems too strong. Therefore we make the following definitions.
Definition. A map I on a metric space X is said to be chaotic on an invariant set Y or exhibits chaos provided (i) I is transitive on Y and (ii) I has sensitive dependence on initial conditions on Y. REMARK 5.1. The use ofthe term chaos was introduced into Dynamical Systems by Li and Yorke (1975). They proved that if a map on the line had a point of period three, then it had points of all periods. They also proved that if a map I on the line has a point of period three, then it has an invariant set S such that
limsuplfn(p) - r(q)1 > 0
and
lim inf Ir(p) - r(q)1 = 0 fI.-OO
for every p, q E S with p '" q. They considered a map with this latter property as chaotic. This property is certainly related to sensitive dependence on initial conditions. REMARK 5.2. Devaney (1989) gave an explicit definition of a chaotic invariant set in an attempt to clarify the notion of chaos. To our two assumptions, he adds the assumption that the periodic points are dense in Y. Although this property is satisfied by "uniformly hyperbolic" maps like the quadratic map, it does not seem that this condition is at the
84
III. CHAOS AND ITS MEASUREMENT
heart of the idea that the system is chaotic. (This last comment is made even though in the original paper Li and Yorke (1975) proved the existence of periodic points.) Therefore we leave out conditions about periodic points in our definition of chaos. The paper of Banks, Brooks, Cairns, Davis, and Stacey (1992) proves that any map which (i) is transitive and (ii) has dense periodic points also must have sensitive dependence on initial conditions. However as stated above, we consider the conditions that the map (i) is transitive and (ii) has sensitive dependence on initial conditions a more dynamically reasonable choice of conditions in the definition. REMARK 5.3. As stated above, an alternative definition of a chaotic invariant set Y is that (i) , is chain transitive on Y and (ii) , has sensitive dependence on initial conditions on Y. This definition allows the following example which does not seem chaotic. Let x and y both be mod 1 variables, so ({x,y) : x,y mod I} is the two torus, 'f2. Let
'(x, y)
= (x + y, y)
he a shear map. Then, preserves the y variable. The rotation in the x direction depends on the y variable. This map is chain transitive on 'f2 but not topologically transitive. The controlled nature of the trajectories make it seem non-chaotic. One way to avoid the above example and still use chain transitivity as the notion of indecomposability is to require that solutions diverge at an exponential rate. This is defined in the next section in terms of the Liapunov exponents. (Also see Section 8.2 for Liapunov exponents in higher dimensions.) Using this concept, we give an an alternative definition of a chaotic invariant set: an invariant set Y could be called chaotic provided that (i) , is chain transitive on Y and (ii) , has a positive Liapunov exponents on Y. Ruelle (1989a) has a long discussion of a chaotic attractor in which he includes the requirements that it be irreducible and has a positive Liapunov exponent. There is another measurement related to chaos which relates to the invariant set for the system. There are various concepts of (fractal) dimensions including the box dimension which allow the dimension to be an non integer value. We give some of these dimensions in Section 8.4 (in higher dimensions where the concepts seem to have their natural setting). For experimental data (without specific equations), Liapunov exp(). nents are not very computahle but the box dimension of the invariant set is computable. Therefore in the setting of experimental measurements, the box dimension seems like a reasonable measurement of chaos. See the discussion in Chapter 5 of Broer, Dumortier, van Strien, and Takens (1991). REMARK 5.4. A more mathematical solution to making the notion of chaos precise is in terms of a quantitative measurement of chaos called topological entropy which is defined in Section 8.1. The topological entropy of a map' is denoted by h(f) and is a number greater than or equal to zero and less than or equal to infinity. This quantity has a complicated definition, but can be thought of as a quantitative measurement of the amount of sensitive dependence on initial conditions of the map. If the nonwandering set of , is a finite number of periodic points then h(f) = O. In this case a transitive invariant set is just a single periodic orbit, and this does not have sensitive dependence on initial conditions. If the dynamics of , are complicated as for the quadratic map F,. on A,. for p. > 4, then h(F,.) > O. Therefore, another characterization of a chaotic invariant set A for' might be that h(fIA) > O. Using this definition, we do not need to add the condition that, is transitive: if h(fIA) > 0, then with mild assumptions there is an invariant subset A' c A on which' is transitive and for which h(fIA') > O. (This last statement is not obvious but is true based on results of Chapter IX as well as the more precise definition of topological entropy in Section 8.1.)
3.5 CHAOS
85
REMARK 5.5. Thus we have given four alternative definitions of a chaotic invariant set. The characterization of chaos in terms of topological entropy is the most satisfactory one from a mathematical perspective but is not very computable in applications (with a computer). The definition in terms of Liapunov exponents is the most computable (possible to estimate) on a computer. The box dimension is most computable for data from experimental work. Thus there are a number of related important concepts, each of which is important in the appropriate setting. We use the definition of chaos which requires sensitive dependence on initial conditions and topological transitivity for the definition of chaos. The other concepts we refer to by stating the system (i) has positive topological entropy, (ll) has a positive Liapunov exponent, or (iii) has fractional box dimension.
With the above definitions, we can state a result about the quadratic map.
Theorem 5.1. (a) The shift map u is chaotic on the full p-shift space, E,.. In fact, a is expansive on E". (b) For II- > 4, the quadratic map F,. is chaotic on its invariant Cantor set AI" i.e., F,.IA,. has sensitive dependence on initial conditions and is topological transitivity. In fact, F,.IA,. is expansive. PROOF. We proved in an earlier section that both u and F,. are transitive on their respective spaces. (The fact that F,. is transitive follows from the conjugacy to a.) To show that u is expansive and so has sensitive dependence, let r = 1. If s 1= t for two points in E", then there is a k such that Sk 1= tk. Then uk(s) and ak(t) differ in the O-th place and d(uk(s), uk(t» ~ 1. This proves that a is expansive. For F,., since the itinerary map h is a homeomorphism, if x,y E A,. are distinct points with s = hex) and t = hey), then there is a k with Sk 1= tic. Therefore F!(x) and F!(y) are in different intervals II and h Since there is a minimum distance r between these two intervals, 1F;(x) - F;(y)1 ~ r. This proves that F,. is expansive. 0 REMARK 5.6. The fact that F,. is expansive on A,. also follows from the follOWing general result that a conjugacy between maps on compact sets preserves expansiveness.
Theorem 5.2. Let f : X -+ X be conjugate to 9 : Y -+ Y where both X and Y are compact. Assume 9 has sensitive dependence (resp. is expansive) on Y. Then f has sensitive dependence (resp. is expansive) on X. PROOF. Let r > 0 be the constant for 9 for either sensitive dependence or expansiveness. Let h : X -+ Y be the conjugacy. By compactness, h is uniformly continuous. Therefore given the value r > 0 as above, there is a 6 > 0 such that if d(p, q) < 6 in X then d(h(p), h(q» < r in Y. Thus if d(h(p), h(q» ~ r in Y then d(p,q) ~ 6 in X, or denoting the points differently, if d(p,q) ~ r in Y then d(h-I(p), (h-I(q» ~ 6 in X. Now we check the sensitive dependence case. Let x E X and £ > o. Then there is an £' > 0 such that if q E Y is within £' of y = hex) then p = h-I(q) is within £ of x. Take such a q E Y that is within e' of y and k ~ 0 as given by the condition of sensitive dependence of gat y. Let p = h-I(q). Then d(gk(y),gk(q)) ~ r, so d(h-I(gk(y)),h-I(gk(q))) ~ 6. But h-I(gk(y» = fk(h-1(y» = fk(x) and h-I(gk(q» = fk(h-I(q» = fk(p). Therefore, p is within £ of x and d(fk(x),fk(q» ~ 6. Thus the 6 from the uniform continuity works as the distance by which nearby points of f move apart in the condition of sensitive dependence. The proof for expansiveness is similar. 0
86
III. CHAOS AND ITS MEASUREMENT
3.6 Liapunov Exponents In discussing chaos, we referred to Liapunov exponents which measure the (infinitesimal) exponential rate at which nearby orbits are moving apart. In this section we give a precise definition and calculate the exponents in a few examples. In Section 8.2 we return to discuss Liapunov exponents in higher dimensions. Deftnltlon. Let J : R ...... R be a C t function. For each point Xo define the Liapunov exponent of Xo, ..\(xo}, as follows: ..\(Xo} = lim sup.!. 10g(l(r),(xo)l) n-oo n n-1
= lim sup .!. n-oo
n
L 10g(lJ'(xj)l) j=O
where x 1 = f1(xo}. (The first and second limits are equal by the chain rule.) Note that the right hand side is an average along an orbit (a time average) of the logarithm of the derivative. The definition of these exponents goes back to the dissertation of Liapunov in 1892, see Liapunov (1907). For a treatment from the point of view of time dependent linear differential equations see Cesari (1959) or Hartman (1964). In higher dimensions, the definition is more complicated than the one given above in one dimension. We discuss this situation in Section 8.2. Next we give three examples where we can calculate or estimate the Liapunov exp<>nents. Example 6.1. Let T(x} = {
for 0 < x < 0.5 for 0.5 ~ x ~ l.
2x 2(1 - x)
be the tent map. If Xo is such that x} = Tj(xo) = 0.5 for some j then ..\(xo) is not defined because the derivative is not defined. Such points make up a countable set. For other points Xo E [0, 11.1f'(x1 }1 = 2 for allj, so the Liapunov exponent, ..\(xo), is log(2). Example 6.2. Let FI'(x) = J.lx(l-x) for J.l2:: 2+5 1/ 2 . Let AI' be the invariant Cantor set. Then for Xo E AI" 10g(IF~(xJ)I) 2:: ..\0 > 0 for some AO. Thus the average is larger than AO, A(XO} 2:: ..\0. Thus we may not know an exact value, but it is easy to derive an inequality and know that the exponent is positive. Before giving the last example, we make some connection between the Liapunov exponent and the space average with respect to an invariant measure. If f has an invariant Borel measure J.I with finite total measure and support on a bounded interval, then the Birkhoff Ergodic Theorem (Theorem VII.2.2) says that the limit of the quantity defining A(XO} actually exists, and is not just a lim sup, for J.I-almost all points Xo. In fact, since the measure is a Borel measure and log(If'(x)l) is continuous and bounded above, A(X} is a measurable function and
J
A(X} d/l(X} =
J
log(If'(x}1) dJ.l(x).
If f is "ergodic" with respect to J.I, then ..\(x) is constant J.I-almost everywhere and
..\(X) =
I~I
J
log(I!,(x)I) d/I(X)
J.I-almost everywhere,
3.6 L1APUNOV EXPONENTS
87
where 11'1 is the total measure of 1'. (See Section 7.2 for the definition of ergodic.) This says that the time average of the logarithm of the derivative is equal to the space average (the integral) of the logarithm of the derivative for I'-almost point. The point to understand from this discussion is that if the map preserves a reasonable measure then the Liapunov exponent is constant almost everywhere. In higher dimensions, the proof that the appropriate limit exists for almost every x requires a much more complicated ergodic theorem due to Oseledec (1968). See Section 8.2. Example 6.3. Let F4(X) = 4x(l - x) be the quadratic map for I' = 4. If Xo is such that Xj = F~ (xo) = 0.5 for some j, then 10g(IF';(xj)l) = log(IF4(0.5)1) = 10g(0) = -00. Therefore A(XO) = -00 for these Xo. If Xo = 0 or I, then A(Xo) = 10g(lF';(O)l) = log(4) > O. For points Xo E (0,1) for which Xj is never equal to 0 or 1 (and so never equals 0.5), we use the conjugacy of F4 with the tent map T, h(y) = sin 2(1I"yj2). (This conjugacy is verified in Example 11.6.2.) Note that h is differentiable on (O,lJ so there is a K > 0 such that Ih'(y)1 < K for y E [O,lJ. Also h'(y) > 0 in the open interval (0,1), so for any (small) 6 > 0 there is a bound K6 > 0 such that K6 < Ih'(y)1 for h(y) E (6,1 - 6J. For Xo as above, A(Xo) = lim sup .! 10g(I(F4)'(xo)!) R-CX) n = lim sup .! log(l(h 0 1'" R-OO n
0
= lim sup.! (Iog(!h'(Yn)l) n-oo
n
$ lim sup .! (log(K) n-oo
=
n
h- I )'(xo)l)
+ 10g(I(1'")' (yo)!) + log(l(h -I )'(xo)l)
+ nlog(2) + 10g(l(h-I),(xo)!)
log(2).
On the other hand for these Xo, we can pick a sequence of integers nj going to infinity such that xn; E [6,1 - 6J. Then letting Yo = h-I(xo) and Yn = 1'"(yo), A(XO)
~ lim sup 2. log(I(F:T(xo)l) j-oo
nj
= lim sup 2. (Iog(!h'(Yn;)I) j-oo nj
~
lim sup 2. (log(K6) j-oo nj = log(2).
+ log(I(1'"i)'(yo)l) + log(l(h-1),(xo)l)
+ nj log(2) + 10g(l(h-l),(xo)l)
Therefore A(XO) = log(2) for all these points. (Note, there are points which repeatedly come near 0.5 but never hit 0.5 for which the limit of the quantity defining the exponent does not exist but only the lim sup.) In particular, the Liapunov exponent is positive for all points whose orbit never hits 0 or 1 (and so never hits 0.5). Since T preserves Lebesgue measure, the conjugacy also induces an invariant measure I' for F4i this measure has density function ",-I(x(1-x)t l / 2 • Notice the similarity with the density functions we used to prove that F,. is transitive for 4 < I' < 2 + 51/ 2 • By the above argument, A(X) = log(2) for I'-aimost all points. Integrating with respect to
III. CHAOS AND ITS MEASUREMENT
88
this density function gives that
t,
t _t
A(X)
10 log(1F4(x)I) dll(X) = 10 71'[x(l _ x)]1/2 dx log(2) dx - 10 71'[x(1 - x)j1/2 = log(2). On the other hand
1 log(iF~(x)I) 1
dJ.l(x) =
o
11 l' 0
=
log(IFHx)I) dx 71'[x(1 - x)j1/2 log(IT'(y)1) dy
= log(2).
These are equal as the Birkhoff Ergodic Theorem says they must be. REMARK 6.1. In the last section we mentioned that topological entropy is a measure of complexity of the dynamics of a map. (The formal definition of entropy is given in the Section 8.1.) Katok (1980) has proved that if a map preserves a non-atomic (continuous) Borel probability measure J.I for which Il-almost all initial conditions have non-zero Liapunov exponents, then the topological entropy is positive, so the map is chaotic. Thus a good computational criterion for chaos is whether a function has a positive Liapunov exponent for points in a set of positive measure.
3.7 Exercises Sharkovskii's Theorem 3.1. Let x be a point of period 71 for I, I: R be the orbit of I with Xl < I2 < ... < In. Let
--t
R continuous, O(x) = {XI,.'" xn}
be the set of intervals induced by the orbit. Assume there exists II E A such that = II, and define inductively
l(Id :) I •. Define J I
for j = 2, ... , n - 1. Further assume that I n - I :) K for K E A, where n is the period of x. (a) Show that there exists I n- 2 E A such that I n-2 f-covers K and I n-2 C I n- 2. (b) Show that there exists a sequence I j E A for j = 1, ... , n - 1 with II as above, In-I = K and such that I) f-covers Ij+! for j = 1, ... , n - 2. 3.2. Assume that f : R --t R is continuous and has a point of period n. Assume n = 2mp with p > 1 odd, m ~ 1, and n is maximal in the Sharkovskii ordering. Further assume that k = 2"q with s ~ m + 1 and 1 ::; q and q odd. Prove that f has a point of period k. Thus prove Case 3a of Sharkovskii's Theorem, page 70. 3.3. Assume that f : R --t R is continuous and has a point of period n. Assume n = 2mp with p > 1 odd, m ~ 1, and n is maximal in the Sharkovskii ordering. Further
3.7 EXERCISES
89
assume that k = 2" with 8 ::; m. Prove that I has a point of period k. Thus prove Case 3b of Sharkovskii's Theorem, page 70. 3.4. Assume that I : R ..... R is continuous and has a point of period n. Assume n = 2ffip with p > 1 odd, m ~ 1, and n is maximal in the Sharkovskii ordering. Further assume that k = 2mq with q odd and q > p. Prove that I has a point of period k. Thus prove Case 3c of Sharkovskii's Theorem, page 70. 3.5. Let n = 2mp with p > 1 odd, m ~ 1. Prove that there is a continuous function I: [0, IJ ..... [0, IJ whose periods, P(f), are exactly the set {k : n~ k}. 3.6. Construct a map of R with points of all periods except 3, i.e., construct a map with all periods implied by period 5 from Sharkovskii's Theorem but no other periods. Hint: Take an orbit of period 5, {Xl < X2 < X3 < X4 < X5}, with the order given as in the proof of Sharkovskii's Theorem; let the map be linear on each interval [Xi,XHIJ; show that this map works. 3.7. Construct a map with a point of period 10 and all periods implied by 10 by the Sharkovskii ordering but no others. Hint: Take the double of the map in the problem before the last. 3.8. As defined in Example 1.3, let In : [O,IJ ..... [O,IJ be the function with exactly one point of period 2i for 0 ::; i ::; n and no other periodic points. Define lco by lco(x) = limn_co In(x). (a) Prove that 100 is continuous. (b) Prove that the periods of '00 are exactly {2i : 0::; i < oc}. (c) Prove that for each n, 100 has exactly one periodic orbit of each period 2n , that it is repelling, and that the points of this orbit lie in the gaps Gn,j which define the middle-(1/3) Cantor set. (d) Let Sn =
10, IJ \
U
GIe,j
ISi:52"-1
ISleSn be the union of the 2n intervals used to define the middle-(1/3) Cantor set. Prove that 100(Sn) = Sn· (e) Let A = nn>1 Sn. Prove that A is invariant for 100' (f) Let E2 be the set of all sequences of O's and 1'5. Define A : E2 ..... E2 by A(8081S2 ... )
=
(SOSI82 ... )
+ (1000 ... ) mod 2,
i.e., (1000 ... ) is added to (80S1S2 .•• ) mod 2 with carrying (so (110) + (10) = (0010». The map A on E2 is called the adding machine. Define h : A ..... E2 by h(p) = s where Sle = 1 if P belongs to the left hand choice of the interval in Sn-l' Prove that h is a topological conjugacy from 100 on A to A on E2. (g) Prove that the adding machine A on E2 has no periodic points, and every forward orbit is dense in E. Subshifts of Finite Type 3.9. Give the matrix of the subshift of finite type for the map in Exercise 3.6 and the intervals Ix;, XH 11. 3.10. Let A =
(~
D,
An =
(~
!:).
and
Ill. CHAOS AND ITS MEASUREMENT
90
(a) For a vector (xo. Yo). let (xn• Yn) = (xo. Yo)An. With Y_I = Xo. prove that Xn+! = Yn. and Yn+l = Yn + Yn-I· (This is a Fibonacci sequence.) (b) Use the fact that (l. O)An = (an, bn ) and (O.I)An = (c,." dn ) to prove that an = an-I + an -2 and dn = dn- I + dn- 2 . (c) Prove that tr(An) = tr(An-l) +tr(An-2). 3.11. U A.
Consider the matrix A given in the last problem. Find all the fixed points of
U~. u~, and u~. Group the points into orbits and give their least period.
3.12. Let A be an n by n matrix with aij E to, I}. Lj aij ~ 1 for all i. and Li aij ~ 1 for all j. We define i to be equivalent to j. i '" j. if there exist k = k(i.j) ~ 0 and m = m(i.j) ~ 0 such that (Ak)ij 1: 0 and (Am)ji 1: O. (Because we allow k = 0 = m. i is equivalent to it.self.) Break {I, ...• n} into equivalence classes. {I, ... , n} = SI U ... U S" with Si n Sj = 0 for i 1: j. Assume that for each equivalence class, Sq. there exists a iq E Sq such that LjESq ai.j ~ 2. Prove that EA is perfect. 3.13. Let f : R -. R he a C l function. Assume there are p closed and bounded intervals 1 1, /2 , •••• I" and A > 1 such that (i) If'(x)1 ~ A for all x E U~=I Ii == I and (ii) if f(1;) n I j 1: 0 then f(1,) :> IJ' Let A be the matrix of the subshift of finite type dE'fined byaij = 1 if f(1.) :> I j and aij = 0 if f{I;) n I j = 0. Further assume that (iii) A is transitive and irreducible. Let A = U~=I f-i(I). Prove that flA is conjugate to the subshift of finite type U A on EA. 3.14. (This exercise asks you to prove Proposition 2.9, page 77.) Let A be an eventually positive transition matrix. with (Ak)ij 1: 0 for all i and j. (a) Prove for n ~ k that (An)ij 1: 0 for all i and j. (b) Prove that UA is topologically mixing on EA. 3.15. (This exercise asks you to prove Proposition 2.10, page 77.) Assume A is a transition matrix. (We do not assume A is irreducible.) (a) Prove that the states can be ordered in such a way that A has the following block form: * * .. . AI ( o A2 * .. . A= .
o
0
0
...
:]
where (i) each Aj is irreducible, (ii) the * terms are arbitrary, and (iii) all the terms below the blocks Aj are all O. Hint: Define an ordering on the states as follows. Call i ~ j provided there is a k = k(i.j) such that A~.j 1: O. Call states i and j equivalent provided i ~ j and j ~ i. Group together the equivalent state and order all the states in terms of the above ordering. (b) Prove that the nonwandering set O(UA) = EA. U··· U EAm' Hint: Show that O(UA,) = E A, for all j so O(UA):> EA. U··· U EAm' Also show that all points in EA \ (EA. U ... U EAm) are wandering. 3.16. Let EA be a suhshift of finite type with metric d defined in Chapter II. Let 6 = 0.5. Assume {s(j) E E A } is a O.5-chain for U A on EA' Explicitly indicated the point t E EA which O.5-shadows this 0.5-chain.
Zeta Functions Let A and B each be square matrices and
3.17.
tr(Ak) - tr(Bk) k) <(t ) = exp ( ~ L k t. k=1
3.7 EXERCISES
91
Prove that (t) = det{I - tB). det{I - tA)
Chaos and Liapunov Exponents 3.18. Prove that FI' is expansive on R for p. > 4. 3.19. Consider fl'(x) = IJ.X mod 1, for IJ. > 1. (a) Calculate the Liapunov exponent. (b) Prove that fl'(x) has sensitive dependence on initial conditions. 3.20. Let f : R ..... R be CI. Assume p is a periodic point and w(xo) = O(P). Prove that the Liapunov exponents of Xo and p are equal, >,(xo) = >.(p). 3.21. Let FI'(x) = IJ.X (1 - x) as usual. For 1 < IJ. < 1 + 61 / 2 , find the Liapunov exponents for the different points x E [O,IJ. Hint: For 3 < IJ. < 1 + 6 1/ 2 , there is an attracting orbit of period 2. See Exercise 2.6. Also see Exercise 3.20. 3.22.
Let 0 <
0
< 0.5, and
fa : [0, IJ ..... [O,IJ be defined by
2(0 - x) fa = { 2(x - 0)
2(0.5 + 0
-
for 0::;: x::;: 0 for 0 ::;: x ::;: 0.5 + 0 x) + 1 for 0.5 + 0 ::;: x ::;: 1.
(a) Draw the graph of fa. (b) Find the intervals on which fa is transitive and describe its dynamics.
CHAPTER IV
Linear Systems This chapter begins our study of systems of more than one variable with the consideration of linear systems, both linear ordinary differential equations and linear maps. In the next chapter, we apply these results to the study of nonlinear systems. Chapters II and III treated only maps in one dimension and not differential equations. In this chapter, we start our consideration of differential equations with the study of linear ordinary differential equations. Most of the results obtained concern linear equations with constant coefficients, e.g. the form of the solutions, the phase portraits, and the topological conjugacy class. These results help in the study of a system of nonlinear differential equations near fixed points. A few results allow the coefficient to depend on time; these are applicable to "linearized behavior" near a general orbit of a system of nonlinear differential equations. After studying linear ordinary differential equations, we indicate the comparable results for linear maps. In one dimension the linear theory is trivial so we did not need any special tools. In several variables, we must use Linear Algebra including eigenvalues, eigenvectors, and the real Jordan Canonical Form. The first section reviews some of this material, especially the real Jordan Canonical Form. The reader may already know some of the material of this chapter and can treat it quickly. However, we give a few definitions for general flows which we use later, e.g. topological conjugacy and topological equivalence. These definitions and the material on phase portraits and topological conjugacy may be new and should be mastered before these concepts are applied to nonlinear systems in the next chapter.
4.1 Review: Linear Maps and the Real Jordan Canonical Form We consider linear maps from Rk to R n , which we denote by L(Rk, Rn). Given bases {vj}~=l of Rk and {Wi}:'.:l of Rn, a linear map M E L(Rk,Rn) determines an n x k matrix A = (ai,j) by k
n
k
M(LXjV J ) = L(LXjai,j)Wi . j_l
i-I j==l
We often identify such a linear map in M E L(Rk,Rn) with this n x k matrix. With this identification, the linear map is given by
A(] (:) where the Xj are the coefficients of the basis {Vj}~_l and the Yi are the coefficients of the basis {Wi}:'.:l' i.e., Yi = E~=l ai,jXj as indicated by the first displayed formula above. 93
94
IV. LINEAR SYSTEMS
The space L(Rk, Rn) is given the opemtor norm (also called the sup-norm) defined by ,Av, _ IIAII = su p -,-, ' v~O
V
when A E L(Rk, Rn). This norm IIAII measures the maximum stretch of the linear map. Notice that this norm depends on norms on the domain and range space. We are also often interested in the minimum stretch of the linear map. (This measurement becomes important in some covering estimates of linear or nonlinear maps.) For A E L(Rk,Rn), we defined the minimum norm (or conorm) of A by . IAvl m(A) = vpo IDf -,V-, . The minimum norm is a measure of the minimum expansion of A just as IIAII is a measure of the maximum expansion. We are often interested in the case of A E L(Rn,Rn). If such an A has m(A) > 0 then 0 is not an eigenvalue and A is invertible. In turn, if A is invertible then it is easy to verify that m(A) = IIA-Ill- i . For the rest of this section we review the Jordan Canonical Form, and in particular, the real Jordan Canonical Form. Implicitly, we review eigenvalues and eigenvectors. For the rest of this subsection, A is a n x n real matrix. First we consider the canonical form over the complex numbers in the case where there is a basis of complex eigenvectors, vi, ... ,v n . Letting V = (Vi ... v n ), then AV = VA where A = diag(AI, ... ,An). Thus V-I AV = A is a diagonal matrix. (If A is symmetric, then the eigenvalues are real and the eigenvectors can always be chosen to be real. Therefore for a symmetric matrix, there is always a real matrix V which diagonalizes A.) If an eigenvalue Aj = OIj + i/3j is complex, then its eigenvector vi = u i + iwi must also be complex. Since A is real, the complex conjugate Xj = 01- i/3 is also an eigenvalue and has eigenvector vi = u j - iw j . Since
equating the real and imaginary parts yields
Au j = OIjUj - /3jW i Awi = /3iuj + OIi wj .
and
Using the vectors u j and w j as part of a basis yields a subblock of the matrix in terms of this basis of the form D· = ( 01') J -/3i Thus if A has a basis of complex eigenvectors, then there is a real basis Zl, ... , zn in terms of which A = diag(B 1 , ••• ,Bq) where each of the blocks B j is either (i) a 1 x 1 block with real entry Ak or (ii) B j is of the form Di given above. Next we turn to the case of repeated eigenvalues where the eigenvectors do not span the whole space. If the matrix A has characteristic polynomial p(x), then by substituting A for x we get that p(A)v = 0 for all vectors v. (This is called the Cayley-Hamilton Theorem.) In particular, if AI, ... ,Aka are the distinct eigenvalues with mUltiplicities ml, ... , mka, then Sk = {v E en : (A - AkI)m·v = O} is a vector space of dimension mk. Vectors in Sk are called generalized eigenvectors.
4.2 LINEAR DIFFERENTIAL EQUATIONS
Now fix an eigenvalue A = Ak and assume that there is an m x m Jordan block. This means that there are vectors Vi, ... , v m such that (A-AI)v l = 0 and (A-Al}V) = v j - I for 2 ~ j ~ m. In terms of this (partial) basis, the m x m (sub)matrix has the form
[~ ! ~
~ ~l.
o
0
A
0
0
o o
0 0
0 0
A 0
1 A
···
..
This gives the Jordan Canonical Form over the complex numbers. If we use the real and imaginary part of the eigenvectors for the complex eigenvalues to form a basis of real vectors, we get the Real Jordan Canonical Form for A, A = diag(BI' ... ,Bq) where B j is of one of the following four types: (i) B j = (Ak) for some real eigenvalue Ak, (ii)
[~ ! ~
~ ~l
o
0
A
0
0
o
0 0
0 0
A 0
1 A
···
o
for some real eigenvalue A = Ak, (iii) B j = Dk where Dk is a 2 x 2 matrix with entries for some complex eigenvalue Ak = Ok + i/3k, or (iv) Dk 1 .. . o o Dk .. . o ( BJ = : o 0 Dk o 0 0
.. .
Ok
and ±/3k as given above
where Dk is a 2 x 2 matrix with entries Ok and ±/3k as given above for some complex eigenvalue Ak = Ok + i/3k and 1 is the 2 x 2 identity matrix. See Appendix III of Hirsch and Smale (1974) for more discussion of the Jordan Canonical Form. Also see Gantmacher (1959).
4.2 Linear Differential Equations The next few sections are concerned with the solutions of linear differential equations. These results are not only interesting for themselves, but are also used in the theory of nonlinear differential equations, e.g. to determine the stability near fixed points. We give a few definitions in the context of a general flow so we can use them throughout the book. For this reason, we give the definition of the flow of a general, possibly nonlinear, differential equation. Consider the linear equation
dx - = A(t)x
dt
with x E Rn and A(t) = (a;j(t» an n x n matrix. Often we will take the case where A does not depend on t.
96
IV
LINEAR SYSTEMS
More generally in Chapter V, we consider the ordinary differential equation x(O) = Xo and d -x= I(x) (t) dt where 1 : Rn --+ Rn is a function all of whose coordinate functions have continuous partial derivatives. The flow 01 the differential equation is a function
di
0 such that
-
dT dt = 1.
These equation are time independent (autonomous) and their solutions are essentially the same as (*) since T(t) = TO + t. Therefore the general theory implies that given Xo E R", there is a unique solution x(t) of (*) with x(O) = Xo that is defined on some open interval containing O. If there is a bound on IIA(t)1I which is independent of t, then it can be shown that (.) has solutions for all t. (See Exercise 5.5.) For the present, we shall take the existence of solutions as known. We shall actually construct solutions for the constant coefficient case (when A is independent of t), giving another proof of the existence in this case. It will also follow in this case that the solutions exist for all t. We will also show that the solutions are unique in this case more simply than in the general theory. For the moment, the following lemma gives the linearity of the set of all solutions. Lemma 2.1. If x : I --+ Rn and y : J --+ Rn are two solutions of (.) and a, b E R are two scalars, then ax( t) + by( t) is a solution of (*) on I n J. REMARK 2.1. Sometimes we allow solutions of (*) with complex values. In this context, Lemma 2.1 is true for x: I --+ en and y: J --+ en and a,b E e. We use this in Lemma 3.2 below. PROOF.
The proof is very direct:
~[ax(t) + by(t)]
+ by(t) + bA(t)y(t) = A(t)[ax(t) + by(t)], = ax(t)
= aA(t)x(t)
so ax(t)
+ by(t) is a solution on the common interval of definition.
o
Once we have shown that solutions for linear equations with constant coefficients exist for all t. the above lemma shows that the set of solutions of (.) forms a vector space. Since solutions are determined by their initial conditions, it will follow that the dimension of the set of solutions is n. We give the details later.
4.3 SOLUTIONS FOR CONSTANT COEFFICIENTS
97
4.3 Solutions for Constant Coefficients In this section, we consider the form of the solutions of (.) when A is a constant matrix which is independent of t: A = (aij) and dx =Ax dt
..
( )
with x ERn.
This equation is a generalization of the scalar equation ~~ = all with II E R which has solutions II = lIoeB !. In order to motivate the form of solutions we seek, assume there is a basis of vectors Vi, ... , v n that puts the matrix A in diagonal form, diag(AI, ... ,An). Then in these new coordinates the differential equation (•• ) becomes the n scalar equations lil = Al 1110 ... , lin = Anlln· Using the form ofthe solutions of these scalar equations, we get that the solution
where e1 is the standard unit vector with a one in the jlh place and all other coordinates zeroes and Cj = I/j(O). Back in the original basis, this solution is
Rather than write down the details of the way solutions change when we change basis, we use this discussion for motivation and we look for solutions to (•• ) of the form eAtv. The following proposition gives the conditions that are necessary for this to be a solution. Before giving the proposition, we need to make one more point. If A = or + i{3 is a complex number with or, (3 E R, then the exponential of At can be written in its real and imaginary parts as follows: e(a+i!3)t = eat cos({3t)
+ ie at sin({3t).
This can be seen to be the correct expression by comparing the power series expansion of each side. Proposition 3.1. Assume A is a real n x n constant matrix. The curve e"tv is a (real) solution of (.. ) if and only if A is a (real) eigenvalue of A with (real) eigenvector v. PROOF. If Av = Av then define x(t) = e"tv. Then :ie(t) = Ae"tv = e"t Av = Ax(t). Thus x(t) is a solution. If A and v are real then clearly the solution e"tv is real. Conversely, assume that eAtv is a solution. Then x(t) = AeAtv = Ae"tv , so AV = Av and A is an eigenvalue with eigenvector v. If A = or + i{3 and v = u + iw, then x(t) = eAtv = eat[cos({3t)u - sin({3t)w) + ie ot [sin({3t)u + cos({3t)w). For x(t) to stay real for all t, it is necessary that {3 = 0 and w = O. Thus A and v must both be real. 0
The next step is to find real solutions in the case that the eigenvalues are complex. The following lemma is the main additional step in finding the real form of the solutions in this case.
IV. LINEAR SYSTEMS
98
Lemma 3.2. Let A(t) be a real matrix which can depend on time. Ifz(t) = x(t) +iy(t) is Ii complex solution of = A(t)z then x(t) and y(t) are each real solutions.
z
PROOF. This follows from the linearity of differentiation and matrix multiplication: z(t) = x(t) + iy(t) = A(t)(x(t) + iy(t)) = A(t)x(t) + iA(t)y(t), so by equating the real and imaginary parts we get that x(t) = A(t)x(t) and y(t) = A(t)y(t), proving the result. 0 Combining the lemma with the form of the solutions for complex eigenvalues given in the proof of Proposition 3.2, we get the following result for complex eigenvalues. Proposition 3.3. Let A be Ii real constant matrix with complex eigenvector v = u+iw for the complex eigenvalue A = 0 + i{3, {3 '" O. Then eot[cos({3t)u - sin({3t)w] and eQ1[sin(pt)u + cos({3t)w] are two real solutions. REMARK 3.1. If v = u + iw is the eigenvector for the complex eigenvalue A = 0 + i{3, then v = u - iw is the eigenve('tor for the complex conjugate eigenvalue>' = u - i{3. It can t.hen be checked that these give the same two real solutions of the differential equation as given above. Next we turn to the case of repeated eigenvalues where the eigenvectors do not span the whole space. We will write out the form of the solutions when the eigenvalues are real (or the complex form of the solutions if a repeated eigenvalue is complex). We will sketch a way of deriving solutions using the exponential of a matrix and then verify that the solutions that we derive are actually valid. For a matrix B, let e B be the matrix obtained from the power series: e
B
=
1 2 1 k 1+ B + 2iB + ... + kiB + ...
It can be shown that this series converges to a matrix just as for real numbers. However care must be exercised because it is not always the case that e B +C is equal to eBe C , This is the case if BC = CB (they commute). In particular, eAI
=
e~II+(At->.It)
=
e,xlle(AI->.II)
= e,xlel(A->.I).
It can be shown that eAtv is a solution of (.. ) for any v, but we will give a more specific form of the solutions that does not involve a power series. Given that there are not always as many eigenvectors as the algebraic multiplicity of the eigenvalue, we take a generalized eigenvector w such that (A - M)k+l w = 0 but (A - M)kw '" O. Then (A - M)lw = 0 for j > k and
tk( A - M )k w = e AI{ Iw + t(A - M )w + ... + ki
+
tk +1 (k
+ 1)!(A -
ti L -:;(A -
M)k+ 1W
+ ... }
k
=
e,xt
M)iw.
i-O J.
With the above calculations as motivation, we can check directly that the final form is indeed a solution (without verifying that the series converges).
4.3 SOLUTIONS FOR CONSTANT COEFFICIENTS
99
Proposition 3.4. If (A - AI)H1 W = 0 then
is a solution to ( .. ). PROOF. Let x(t) be as in the statement. Then using the fact that (A - AI)Hl w = 0,
t L "1Jt (A - '>'I)J w + e).1 L ~(A - '>'I)'w (J ) j
k
:ic(t) = '>'e).1
;=0
t
AI)jw + e).1
i=O J k
= e).l{.>.
L
.
j=1
i L "1(A k
= '>'e).1
j - I
k
.
HI
L
tj
- I
~(A - .>.I)jw
;=1 ()
~
-;(A - '>'I)jw
+ (A
)
k
- '>'1)
j=O)'
L
~
-;(A - AI)jw}
j=o ).
ti L -;(A - AI)jw J. k
= Ae).1
j=o
= Ax(t).
o
Thus x(t) is a solution as claimed.
We have now given the real form of solutions in all cases (except for repeated complex eigenvalues). Using this we have existence of solutions and the form of these solutions as given in the following theorem. Theorem 3.5 (Existence). Given a real n x n constant matrix A and Xo E Rn, there is a solution x(t) of:ic = Ax defined for all t such that x(O) = Xo. Moreover, each coordinate function of x( t) is a linear combination of functions of the form
t k e nl cos(,Bt)
and
where a + i,B is an eigenvalue of A and k is less than the algebraic multiplicity of the eigenvalue. PROOF. The Jordan normal form says that there is always a basis of generalized eigenvectors, vl, ... ,v n . Let xj(t) be the solution with xj(O) = v j for j = l, ... ,n. Given any Xo ERn, solve for al," .,an such that Xo = E'l=l ajv j . Then x(t) = E'l-l ajxj(t) is a solution with x(O) = Xo. This shows that there is such a solution, and that it exists for all t. Also, the form of the solutions xj(t) found above proves that the coordinate functions are linear combinations of functions of the form stated in the theorem. 0
We mentioned above that for any v, eA1v is a solution. Thus eA1v is the How for equations (**). If we look only at the matrix e A1 , it can also be shown that d
dt eAI = Ae At . Thus if we allow matrix solutions to the differential equation, eAt is such a solution. However, it is not very easy to compute. In Theorem 3.8 below, we show how to use the solutions constructed above to get another matrix solution to the equation. Before giving this construction, we prove directly the uniqueness of solutions using the matrix eAI rather than using the general theorem for ordinary differential equations.
IV.
100
LINEAR SYSTEMS
Theorem 3.6 (Uniqueness). Given Xo. there is a unique solution x(t) to x = Ax with x(O) = Xo. PROOF.
Let x(t) be a solution. Let y(t) = e-Atx(t). Then y(t) = -Ae-Atx(t) + e-A1x(t) = -Ae-A1x(t) = e- A1 ( -A
+ e- A1 Ax(t)
+ A)x(t)
=0.
Therefore y(t) == y(O) = eOx(O) = Xo. and e-A1x(t) == Xo. so x(t) == eA1Xo. This proves the result. 0 REMARK 3.1. The above proof can be modified to apply to the general linear equation (.) by using the fundamental matrix solution introduced below. Finally, we can prove that the solutions form a vector space of dimension n.
Theorem 3.7. Given an n x n constant real matrix A, the set ofsolutiollS oE( •• ),
s=
{x: R - Rn
:
x(t) = Ax(t)},
forms a vector space of dimension n. PROOF. We know from above that solutions exist for all time. By Lemma 2.1, ifx,y E S and a, bE R then ax(t) + by(t) E S. This shows that S is a vector space. Let vi, ...• v n be a basis for Rn (for example, either the standard basis or a basis of generalized eigenvectors of A). Let xj(t) be the solution with xj(O) = v j for j = 1, ... , n. Given any solution Z E S, solve for al,"" an such that z(O) = E;=I aivi. Then both z(t) and E;=I ajxj(t) are solutions, and they have the same initial condition at t = o. By uniqueness, they are equal, so z(t) = E;=I ajxj(t). This proves that {xl(t), .... xn(t)} span S. If a linear combination of the solutions E;_I ajxj(t) = 0 for some al, ... ,an, then by setting t = 0 we get that E;=I ajvi = O. Because the vi are independent, it follows that aj = 0 for all j. This proves that the xi(t) are independent and so a basis. This shows that S has dimension n. 0
Returning to matrix solutions of linear equations, an n x n matrix M(t) is called a fundamental matrix solution of x = A(t)x provided
~M(t) =
A(t)M(t)
for all t and M(to) is nonsingular at one time to. The following theorem justifies the assumption the M(to) is nonsingular at one time as well as giving an alternative construction of eAI. Theorem 3.8. Let A(t) be an n x n real matrix, and consider the linear equation A(t)x. (a) Assume xi(t), ... ,xn(t) are n solutions, and let
x=
4.3 SOLUTIONS FOR CONSTANT COEFFICIENTS
101
be the n x n matrix formed by putting the vector solutions in as columns. Then M(t) satisfies the linear equation as a matrix solution:
~M(t) =
A(t)M(t).
(b) If Xl (to), . .. , xR(to) are independent for one time to, then any solution can be written as x(t) = M(t)v for some vector v. (c) If M(t) is a matrix solution such that M(to) is nonsingular at one time to, then M(t) is nonsingular at all times. (d) If M(t) is a matrix solution with M(O) nonsingular, then eAt =
PROOF.
M(t)M(O)-I.
Defining M(t) as stated,
~M(t) =
(jcl(t), ... ,jcR(t»
= (A(t)XI(t), ... , A(t)xR(t» = A(t)M(t),
as claimed. Now assume the solutions are independent at to, and x(t) is any other solution. Because M(to) is nonsingular, it is possible to solve for v such that x(to) = M(to)v. Then both x(t) and M(t)v are solutions that agree at to. By the uniqueness of solutions they are equal for all time. Part (c) is equivalent to proving that if M(t) is a matrix solution that is singular at one time, then it is singular at all times. Assume that M(t) is such a matrix solution with M(to) singular. Thus the columns of M(to) are dependent and there is a nonzero vector v with M(to)v = o. Then both M(t)v and the zero function are solutions that are equal at one time. By uniqueness they are equal for all time, M(t)v = o. Thus M(t) is singular for all time, proving the result. Now, assume M(O) is nonsingular (or is nonsingular at some time to). Then both M(t)M(O)-1 and eAt are matrix solutions that are equal at time t = O. By uniqueness of solutions, they are equal for all times. (To reduce it to vector solutions, multiply both matrix solutions by the standard basis elements ei.) 0 3.2. Notice that if M(t) is a fundamental matrix solution of (*), then by multiplying on the right by M(tO)-l we get that M(t,to) = M(t)M(to)-1 is another fundamental matrix solution with M(to. to) = I. This idea is used in the proof of both Theorem 3.8 and 3.9. To end the section, we restate Liouville's Formula for any fundamental matrix solution of a linear differential equation which depends on time. This result is used when we make the connection between the divergence of a vector field and the way that the flow distorts area. REMARK
Theorem 3.9 (Liouville's Formula). Let M(t) be a fundamental matrix solution of the linear equation (with p06Sibly nonconstant coefficients) jc = A(t)x. Then
~ det (M(t») = det (M(t») tr (A(t») det (M(t») = det (M(O») exp
(l
and tr (A(s») dS)'
IV. LINEAR SYSTEMS
102
PROOF. We let ei be the standard basis. In the following calculation we use the notation ai,j(t) for the (i,j) entry of A(t). Using the fact that the determinant is multilinear in
the columns we get the following:
ft
det (M(t)M(to)-I)lt=t. = =
did det (M(t)M(to)-le 1 , .•. , M(t)M(to)-le n ) It=t.
= L
det (M(to)M(to)-le 1 , .•• , M(to)M(to)-lei- 1 ,
M'(to)M(to)-lei, M(to)M(to)-lei+1, ... ,M(to)M(to)-len ) = "L... det (I e , ... ,e1'-1 ,A(to)e1.,e1'+1 , ... ,en) j
= L
det (e l , ... ,e1- I , Lai,j(to)ei,ei+1, ... ,en)
j
' + 1 , ... ,e) n = "L...ai,j(to)det(e 1 , ... ,e1'-I ,ei,e1 i,i
= Laj,j(to)det(el, ... ,en ) j
= tr (A(to».
Therefore, using the multiplicative feature of the determinant,
ft
det (M(t»)!t=t. =
ft
det (M(t)M(to)-I) det (M(to»
= det (M(to» tr (A(to».
Solving this scalar differential equation gives the formula in the theorem relating the quantities det (M(t», det (M(O», and the exponential of
1t tr (A(s»
ds.
0
4.4 Phase Portraits In the last section, we gave the general theory of solutions to linear equations and found the solutions of constant coefficient linear equations. In this section, we give thf' phase portraits of two dimensional constant coefficient linear equations and some examples in three dimensions. The phase portrait of a differential equation :it = I(x) is a drawing of the solution curves with the direction of increasing time indicated. In some abstract sense, the phase portrait is the drawing of all solution curves, but in practice it only includes representative trajectories. The phase space is the domain of all x's considered. Example 4.1 (A Saddle). Consider
. (1 1) x.
x= The general solution is
0
-2
4.4 PHASE PORI'RAITS
103
FIGURE 4.1. A Saddle See Figure 4.1 for the phase portrait. (Solutions along the x-axis are moving away from the origin and those along the line of slope -3 are coming in toward the origin.) Example 4.2 (A Stable Node). Consider
. (-10 -20) x.
x= The general solution is
x(t) = C1e- t
(~) + C2e-2t (~) .
See Figure 4.2 for the phase portrait. (All the solutions are coming in toward the origin.)
FIGURE 4.2.
A Stable Node
Example 4.3 (An Improper Node). Consider
. (-1 1) x.
x= The general solution is
0
-1
104
IV. LINEAR SYSTEMS
FIGURE 4.3.
An Improper Node
See Figure 4.3 for the phase portrait. (All the solutions are coming in toward the origin.) Example 4.4 (A Center). Consider
. (0w -w) o x.
x=
The eigenvalues are ±iw with eigenvectors
(~i).
x(t) = C ( - Sin(wt») 1 cos(wt)
The general solution is
+ C (c~(wt»). 2
sm(wt)
See Figure 4.4 for the phase portrait. All the solutions are on closed orbits surrounding the origin.
FIGURE 4.4.
A Center
Example 4.5 (A Stable Focus). Consider
. = (-12 -2) -1 x.
x
4.4 PHASE PORI'RAITS
The eigenvalues are -1 ± 2i with eigenvectors x(t) = C 1 e- l [c08(2t)
(~i).
105
The general80lution is
U) - sin(2t) 0)1 (n
+C2 e- l [sin(2t)
+cos(2t)
(~)I.
See Figure 4.5 for the phase portrait. (All the solutions are coming in toward the origin.)
FIGURE 4.5.
A Stable Focus
Definition. In general, a linear system is called a node provided that every orbit tends to the origin in a definite direction as t goes to infinity (if all the eigenvalues are real and negative), or as t goes to negative infinity (if all the eigenvalues are real and positive). A linear system is called a proper node provided it is a node and there is a unique orbit going into (or coming out of) the origin for each direction (all the eigenvalues are equal, real, and nonzero); it is called an improper node provided it is a node and there is a direction for which there are more than one orbit going into (or coming out of) the origin in that direction. In this terminology, both Examples 4.2 and 4.3 are improper nodes. A linear system is called a stable focus or stable spiml (respectively, unstable focus or unstable spiml) provided the solutions approach the origin as t goes to infinity (respectively, minus infinity) but not from a definite direction. Example 4.6 (A Three Dimensional Saddle). Consider
x= The eigenvalues are -1 solution is
(-1o -2 0) 2
-lOx. 0 1
±2i and 1 with eigenvectors ( ±i) and (0 ~
0
1). The general
106
IV.
LINEAR SYSTEMS
}
d- 7 :@ ~ [
(C +
FIGURE 4.6.
A Three Dimensional Saddle
See Figure 4.6 for the phase portrait.
4.5 Contracting Linear Differential Equations In this section we use the features of the solutions of (**) to characterize the solutions as contracting, expanding, or subexponentially growing. We give the definitions of Liapunov stable and asymptotically stable again in this context. They are essentially the same as we gave for one dimensional maps. Definition. The orbit of a point p is Liapunov stable for a flow tpt provided that given any! > 0 there is a 6 > 0 such that if d(x, p) < 6, then d(tpt(x), tpt(p)) < ! for all t ~ O. If p is a fixed point, then the condition can be written as d(tpt(x), p) < !. The orbit of p is calif! 0 such that if d(x, p) < 61 , then d(tp'(X), tpt(p)) goes to zero as t goes to infinity. If p is a fixed point then it is asymptotically stable provided it is Liapunov stable and 61 > 0 such that if d(x, p) < 61 then w(x) = {pl. REMARK 5.1. For a linear system, if the origin satisfies the second condition of asymptotic stability then it is also Liapunov stable. There are nonlinear systems for which this is not the case. See Example V.5.3 for an example of Vinograd (1957). Another example can be given in polar coordinates with the fixed point at r = 1 and 8 = 0 can be given (near r = 1) by r=l-r
11 = sin(9/2). Initial conditions with reO) near 1, have ret) tending to 1. Also, theta(t) tends to 0 modulo 21r. Thus for any such point, w(ro, 80 ) = {(I, O)}. On the other hand, this point is not attracting. The main theorem of the section proves that if all of the eigenvalues of A have negative real part, then the origin is attracting at an exponential rate determined by the eigenvalues. Thus one criterion for asymptotic stability of a linear system of differential equations with constant coefficients is that all the eigenvalues of the matrix have negative real part. Before stating the theorem, we give an example which illustrates the fact that for a stable linear system, the usual Euclidean norm of a nonzero solution is not always strictly dl'Creasing.
4.5 CONTRACTING LINEAR DIFFERENTIAL EQUATIONS
107
Example 5.1. Consider the system of linear differential equations given by
x = -x - y iJ
= 4x -]I.
The eigenvalue!! are -1 ± 2i, and the general solution is given by ( X(t») =e-I (COS(2t) y(t) 2sin(2t)
-(1/2)Sin(2t») (xo) cos(2t) Yo'
See Figure 5.1 for the phase portrait.
FIGURE 5.1.
Phase portrait for Example 5.1.
The matrix in the above solution, with terms involving sin(2t) and cos(2t), has norm less than 2, so
which goes to zero exponentially. The time derivative of the Euclidean norm along a nonzero solution is given by !!..(X2 dt
+ y2)1/2
=
~(X2 + ]l2)-1/2(2xx + 2yli) 2
= (x 2 + y2)-1/2( _x2 _ y2
+ 3xy).
Along the line x = y, (d/dt)(x 2 + y2)1/l = 2- 1/ 2Ixl which is positive for x = y '" O. Therefore the Euclidean norm is not monotonically decreasing, although it dOe!! go to zero at an exponential rate. See Figure 5.2. In this example, it is possible to take a different norm (the norm in the coordinates which give the Jordan Canonical Form) which is monotonically decreasing along a solution. Let I ,1. be the norm defined by l(x,y)l.
== (4x 2 + y2)1/2.
The time derivative of this norm along a nonzero solution is given by !1(x,]I)I. =
~1(x,Y)I;I(4' 2xx + 2yli)
= l(x,y)I;I(-4x2 _ y2) = -\(x, y)I •.
108
IV. LINEAR SYSTEMS
FIGURE
C and
5.2. Solution Curve for Example 5.1 Crossing Level Sets i(x,y)1 =
I(x, y)l. = C'
Solving this linear scalar equation shows that l(x(t),y(t))I. = e-ll(xo,Yo)l.
which monotonically and exponentially decreases to zero.
Theorem 5.1. Let A be an n x n real matrix, and consider the equation x = Ax. The following are equivalent. (a) There is a norm I I. on RR and a constant a condition x E RR, the solution satisfies
> 0 such that for any initial
(b) For any norm II' on RR there exist constants a> 0 and C initial condition x E RR, the solution satisfies
~
I such that for any
(c) The real parts of all the eigenvalues A of A are negative, Re(A) < O. REMARK 5.2. A norm as in Theorem 5.I(a) is called an adapted norm. It is useful to use because solutions immediately contract in terms of this norm as time goes forward. The norm II. introduced in Example 5.1 is such an adapted norm. The adapted norm for the linear equation is also used to study nonlinear equations near a fixed point. REMARK 5.3. We give two proofs that condition (c) implies condition (a), i.e., that the condition on the eigenvalues in part (c) implies the existence of a norm in which the differential equation is a contraction as given in part (a). The second (alternative) proof uses the basis which puts the matrix in Jordan canonical form to define the norm. This proof has the advantage that it is constructive; it has the disadvantage that it does not apply in the situation we consider in Chapter VII of a hyperbolic invariant set (Theorem VILl.I). The first proof averages the usual norm along the orbits. It has the advantage that it applies in the more general situation of a hyperbolic invariant set; it has the disadvantage of being more abstract and not easily computable for a particular example.
4.5 CONTRACTING LINEAR DIFFERENTIAL EQUATIONS
109
PROOF. First we show that (a) implies (b). Let II. be the norm given in (a) and II' be any other norm. Then there are constants AI, A2 > 0 such that Allxl' ~ Ixl. ~ A2Ixl'. (This is true for any two norms in finite dimensions.) Then for t ~ 0,
1
leAtxl' ~ AlleAlxl.
< ~e-allxl • - Al
~ ~~ e-allxl'· This proves that (b) is true with C = A 2 /A I • This completes the proof that (a) implies (b). Next, we show that (b) implies (c). Suppose (c) is not true: there exists an eigenvalue A = (]( + if3 with (]( ~ o. There is a solution for the corresponding eigenvector of the form eO I [sin(f3t)u + cos(f3t)wl, and this solution does not go to zero as t --+ 00. This shows that (b) is not true. Finally, we need to show that (c) implies (a). Let 0 > 0 be chosen so that Re(A) < -0 for all the eigenvalues A of A. Let v I, ... ,v" be a basis of generalized eigenvectors, and x j (t) be the solution with x j (0) = v j . By the form ofthe solutions given in Theorem 3.5, for each j there is a Tj such that leAlvil ~ e-allvjl for all t ~ Ti. (Here we use the Euclidean norm, or any other fixed norm.) Then any x it can be written as x = E;=I ci vj , so the solution eAlx = E;=I Cixj (t). The form of these solutions implies there is a T which depends on x such that leAlxl ~ e-allxl for all t ~ T. Using the compactness of {x: Ixl = I}, there is one T which works for all x with Ixl = 1. By linearity, we get that there is one T which works for all x. Fix this T. What we have is that eAlx is a contraction if we take t larger than T. We average the norm along the trajectory for times between 0 and T with a weighting factor eal and show that in terms of this averaged norm, the linear How is an immediate contraction. To this end, define
Ixl. ==
foT ea'leA'xl d,.
We now show that this norm works. Take t ~ 0 and write it as t = nT + T with T. In the calculation which follows, we split up the range of 8 so that 8 + t runs from nT + T to (n + l)T and from (n + l)T to (n + I)T + T:
o~ T <
leAlxl. = =
for ea'leA'eAlxl ds
l
r - T ea'leAnr eA(T+O)xl ds
+
o
{r
lr-T
ea'leA(n+l)r eA(T-r+O)xl ds.
Making the substitution u = T + s in the first integral, and u integral, and using the estimate above for leAnrxl, we get
leAlxl.
= T - T + s in the second
~ £r ea(u-T-nr)leAuxl du + foT ea(u+T-T-(n+l)r)leAuxl du = e-at for
eauleAuxldu
= e-allxl.· This proves that (a) holds using the norm
I I•.
o
110
LINEAR SYSTEMS
IV.
ALTERNATE PROOF THAT (C) IMPLIES (a). Given E > 0, we can find a basis B of generalized eigenvectors in terms of which A = diag{ AI, ... ,Ap} where each Ai is one of the following types:
("'0
(I
o
OJ
o
0
0). 0
c·
,
... ~}
Dj
or
D)
0
where
o}
o
0 0
0
:),
0
o
-(3i) o
'
OJ
0 d
(I and
0 0
f). OJ
0 0
Dj
n n· D}
0 0
0
1=0
Let 1 18 be the norm in terms of this basis, i.e., if x =
D}
L:7=1 XjV j
then Ixl8
[L:7=1 x;j1/2. Using the components in the above basis,
~lx(t)18
=
L:7=1 Xj(t)Xj(t)
dt
= (x(t), AX(t))8
Ix(t)18
Ix(t)18·
Assume that C 1 and C 2 are two constants with C I < Re('x) < C2 for all eigenvalues ,X of A. It can be shown that if f is small enough in the above Jordan form, then Ctlxl~ < (x, AX)8 < C2Ixl~. Therefore, fel x (t)18 C 1 < IX(t)i8 < C 2 , CI < CIt
d
dt log(lx(t)18) < C 2 , IX(t)i8
< log (lx(O)18) < C 2 t,
and ec ,IIXol8 < Ix(t)18 < e C • 1Ix oI8. If all the eigenvalues have negative real part, then taking a = -C2 > 0 above we get the result claimed. 0
From the above theorem, it follows that if the real part of each eigenvalue is negative, then the origin is asymptotically stable for the linear flow. Also note that if for t ~ 0, we set y = eA1x in the first condition of Theorem 5.1, we get that Iyl. ~ e-1ble-A1yl. for all t ~ 0, or eb1111yl. ~ leA1YI. for all t ~ o. Thus going backward in time the flow is an expansion. For this reason if all the eigenvalues of A have negative real part, then we say that. the differential equation x = Ax has the origin as a sink, the origin is attmciing, or the flow is a contmction. If all the eigenvalues of a matrix A have positive real part, then an analogous result is true and the flow is an expansion for t ~ 0 and a contraction for t ~ o. In this case we say that the origin is a source, the origin is repelling, or that the flow is an expansion.
4.6 HYPERBOLIC LINEAR DIFFERENTIAL EQUATIONS
111
4.6 Hyperbolic Linear Differential Equations In the previous section, we considered the case when the linear flow eAtx is contracting in all directions or expanding in all directions. In this section we consider the case when it is contracting in some directions and expanding in others. We define the linear differential equation x = Ax to be hyperbolic provided all the eigenvalues of A have nonzero real part. If A induces a linear hyperbolic differential equation, and there are at least two eigenvalues of A, Au and A., with Re(Au) > 0 and Re(A.) < 0, then the origin is called a (hyperbolic) saddle for the differential equation. Thus in the case of a saddle, some directions expand and some contract. Both the contracting or expanding cases are hyperbolic but are not saddles. For any linear flow (either hyperbolic or not), we want to characterize the eigenspaces of generalized eigenvectors corresponding to the eigenvalues with positive, zero, and negative real parts. We first introduce some notation and then state the result. Let A be an 7l x n matrix. Define the stable eigerl8pace, unstable eigen8pace, and center eigen8pace to be IE' = span {v : v is a generalized eigenvector for lEu = span{v : for lEe = span{v :
an eigenvalue A with Re(A) < O}, v is a generalized eigenvector an eigenvalue A with Re(A) > O}, and v is a generalized eigenvector
for an eigenvalue A with Re(A) = O}, respectively. If A is hyperbolic (so lEe = 0), then the decomposition ofRn into subspaces given by Rn = lEU €a E' is called a hyperbolic splitting. Let V' = {v: there exist a > 0 and C
~
1 such that
leAtvl ::; Ce-atlvl for t ~ O},
vu
> 0 and C ~ 1 such that leAtvl ::; Ce-altllvl for t::; O}, and {v: for all a > 0, leAtvle-altl -.0 as t -. ±oo}.
= {v: there exist a
Ve =
Thus the subspace V' is defined to be all vectors which contract exponentially forward in time; ve is defined to be the vectors which grow at most 8ubexponentially both forward and backward in time. Finally, notice that VU is defined as the set of vectors which contract backward in time and not in terms of behavior forward in time. This is done because any vector which is not in E' €a Ee expands forward in time, and this is not a subspace and does not characterize VU. The following theorem shows that the conditions defining the va characterize the subspaces EO" for (J' = s, u, c. Theorem 6.1. Consider the linear differential equation x = Ax with x in Rn. Let EU, lEe, IE', VU, ve, and V· be defined as above. Then the following are true. (a) The subspaces lEa are invariant by the Bow eAt for (J' = U, C, 8, (b) 'T'he subspace IE" = V" [or (1 = U, C, B, so eAt lIEU is an exponential expansion, e·~tllE· is an exponential contraction, and eAtllEe grows subexponentially as t .-. ±oo, and these uniquely characterize the subspaces. By the form of the solutions given in Theorem 3.5, the subspaces are invariant as claimed.
PROOF.
for
eAt
112
IV. LINEAR SYSTEMS
By Theorem 5.1, IE" C V" and E' C V'. On the other hand, if v E V" \ E" then it has a nonzero component v' in either Ee or E'. By the form of the solutions, eAtv' and so eAtv do not go to zero exponentially as t -+ -00, so v f/. V". This contraction proves that V" C EU and so V" = E". Similarly, V' = E'. For v E Ee \ {O}, by the form of the solutions, eAtv has at mOllt polynomial growth Ee then there is a nonzero component in as t -+ ±oo, so v E VU. As above, if v E either E" or E', and there is exponential growth as t goes to either +00 or -00. This contradicts the fact that v Eve. This completes the proof that lEe = and so the theorem. 0
ve \
ve,
Using Theorem 5.1, we can see that if A induces either a purely contracting linear flow or purely expanding linear flow, then nearby B will also induce hyperbolic flows of the same type. For hyperbolic linear flows, the subspaces in the hyperbolic splitting can move, but a small perturbation B of A must remain hyperbolic and must have eigenspaces of the same dimension as those for A. We state this in the following theorem. Theorem 6.2. Assume that A induces a hyperbolic flow for the linear differential equation x = Ax for x E Rn. If B is an n x n matrix with entries near enough to A, then x = Bx induces a hyperbolic linear flow with the same dimension splitting E8 $IEB as that for A. Moreover, as A varies continuously to B the subspaces also vary
continuously. PROOF. Because A is hyperbolic, Rn = lEu $ IE' for the eigenspaces of A. Let p('\) be the characteristic polynomial for A. Let -y be a curve in the left half of the complex plane that surrounds all eigenvalues with negative real part for A and is oriented counterclockwise. Let -y' be a curve in the right half of the complex plane that surrounds all eigenvalues with p08itive real part for A, again with counterclockwise orientation. By residues dim(E') = P'(z) dz 27ri .., p(z)
-1-1 1
and
. " 1 P'(z) dlm(E ) = -2' -() dz. 7r1..,'PZ
For B near enough to A, the characteristic polynomial q('\) for B does not vanish on -y or -y'. The two integrals with p( z) replaced by q( z) will count the number of roots for q inside -y and '"Y' respectively. By continuity of these integrals with respect to changes in p, the number of roots for p and q are the same (since a continuous integer valued function is a constant). In particular, all the roots of q(z) are either inside -y or -y' and noDe are on the imaginary axis. This proves that B is hyperbolic and the subspaces Ea and EB have the same dimension as those for A. Next, Pv =
~
l(zI -
27r1 ..,
A)-lvdz
is a projection from R" onto E'. See Section 148 of Riesz and Nagy (1955). Similarly,
is a projection onto the stable eigenspace Ea for B. Again, this integral varies continuously with changes from A to B, so the subspace varies continuously. Similar statements are true for the unstable subspaces.
4.7 TOPOLOGICALLY CONJUGATE EQUATIONS
113
4.1 Topologically Conjugate Linear Differential Equations Definition. We consider two flows to have the same qualitative properties, and so to be topologically similar, if we can match the trajectories of one with the trajectories of the other. There are two ways of doing this depending on whether we demand that the conjugacy match the time parameterization of the two flows or allow a reparameterization. The stronger condition requires that the flows be matched without a reparameterizIV tion. We say that two flows '{" and 1/J' on a space M (M could be R" or some manifold) are topologically conjugate provided there is a homeomorphism h : M ---+ M such that h 0 '{"(x) = 1/J' 0 h(x) for all x E M and for all t E R. Allowing a reparameterization, we say that '{" and 1/J' are topologically equivalent provided there is a homeomorphism h : M ---+ M such that h takes trajectories of '{" to trajectories of I/J' while preserving their orientation. More precisely, '{" and 1/J' are topologically equivalent if there is a homeomorphism h : M -+ M and a (reparameterization) function 0 : R x M -+ R such that h 0 ,{,O(I.X) (x) = 1/J' 0 h(x) for all x E M and for all t E R, where we assume for each fixed x that o(t, x) is monotonically increasing in t and is onto all of R. We could also assume the group property on 0 to insure that it is indeed a reparameterization: ,{,o(I+"X)(x) = ,{,O(I ...,n( •. x)(x» 0 ,{,O('.x) (x). We use these concepts repeatedly in our study of flows. In most circumstances, it is not possible to preserve the parameterization when we make perturbations. However, the following theorem proves that two linear flows with the same dimensional contracting spaces and the same dimensional expanding spaces are actually conjugate, i.e., it is possible to preserve the parameterization. Theorem 1.1. Let A and B be two n x n real matrices. (a) Assume that all the eigenvalues of A and B have negative real part (both are sinks). Then the two linear Bows eAI and eBI are topologically conjugate. (b) Assume that all the eigenvalues of A and B have nonzero real part and the dimension of the direct sum of all the eigenspaces with negative real part is the same for A and B. (Thus the dimension of the direct sum of all the eigenspaces with pa6itive real part is the same for A and B.) Then the two linear Bows eAI and eBt are topologically conjugate. (c) In particular, if all the eigenvalues of A have nonzero real part and B is near enough to A, then the two linear Bows eAI and eBI are topologically conjugate. PROOF. Part (b) follows fairly easily from part (a). The theorem that gives a conjugacy for two systems with all their eigenvalues with negative real part (on the same dimensional space) implies the same result for systems with all their eigenvalues with positive real part (on the same dimensional space). Then given two hyperbolic linear flows as in part (b), there is a conjugacy on the eigenspaces for the eigenvalues with negative real part, and a conjugacy between eAI and eBI on the eigenspaces for the eigenvalues with positive real part, i.e., there are conjugacies h" : E~ ---+ E~ between eAIIE~ and eBIIE~ for 0' = U,5. There are projections 7r" : Rn ---+ E" for 0' = U,5, so any x E R" can be written as x = 7r.. (x..) + 7r.,(x). The two conjugacies can be combined to give a conjugacy on the total space. Define h(x) = h u (7ru (x» + h.(7r.(x». Using the linearity of the flows, it is easily checked that h is a conjugacy. Part (c) follows from part (b) because if B is near enough to A, then the dimensions of the splittings for A and B are the same by Theorem 6.2. It remains to prove part (a). By Theorem 5.1, there exist norms I IA and I IB and constants a, b > 0 such that we have the estimates leAlxlA :5 e-arlxlA and leB1xlB :5
IV. LINEAR SYSTEMS
114
e-b1lxlB for t ~ 0 and for any x in Rn. Running time backward, we get the estimates leA'xL~ ~ ea111lxlA and leB1xlB ~ eb11ljxIB for t ~ O. We want to match up the trajectories of eAI with those for eBI. Using the above
estimates, we see that for each x "f 0 the trajectory eA1x crosses the unit sphere for IIA exactly once, and each trajectory eB1y crosses the unit sphere for liB exactly once. Let the unit spheres in these two norms be denoted as follows: SA = {x: IxlA = I} and SB = {x : IxlB = l}. These spheres, SA and SB, are called the fundamental domains for the two linear flows because of the property that each trajectory of eA1x for x "f 0 (respectively of eB1y) crosses SA (respectively SB) exactly once. Therefore, we first of all define a homeomorphism ho from SA to SB by ho(x) = x/lxIB' (Any homeomorphism would do.) Notice that the inverse of ho exists and is given by ha1(y) = y/lyIA' To extend ho to all Rn we need to define the time when the trajectory that starts at x crosses the unit sphere. Using the above inequalities for the flow e A1 , it follows that for any x "f 0 there is a T(X) which depends continuously on x such that leAT(x)xIA = 1, i.e., eAT(x)x E SA. Because of the definition it follows that T(eA1x) = T(X) - t. Now using this homeomorphism ho on the unit sphere and the time T(X), we can define a map (homeomorphism) h : Rn -+ Rn by e-BT{X)ho(eAT(x)x)
h(x) = { 0
for x
"f 0, and
for x =
o.
The following calculation shows that h is a conjugacy: h(eA1x) = e-BT(eA'x)ho(eAT(eA'x)eAlx)
= e-B(T(X)-l)ho(eA(T(X)-l)eA1x) = eBle-BT(X)ho(eAT(X)x) = eBth(x).
Because T and the flows eAI and e B1 are continuous, it follows that h is continuous at points x "f O. To check continuity at 0, notice that if x] converges to 0 then Tj = T(X]) goes to minus infinity. Letting Yj = ho(eATjxj), we have that IYjlB = 1. Thus Ih(x])IB = le-BT'YjIB ~ e-bIT,1 must go to zero. Therefore h(xj) converges to o = h(O). This proves the continuity at O. To show that h is one to one, take X,y with h(x) = h(y). If x = 0, then 0 = h(x) = h(y), so y = 0 = x. Now assume x "f O. Then h(y) = h(x) "f 0 so y "f O. Letting T = T(X), h(eATx) = eBT h(x) = e BT h(y) = h(eATy). This shows that h(eATy) = h(eATx) E SB, so eATy E SA and T(Y) = T(X). Since ho(eATx) = h(eATx) = h(eATy) = ho(eATy), and ho is one to one, we have eATx = eATy and so x = y. Thus, h is one to one in all cases. Reversing the roles of A and B in the arguments above, we get that h- 1 exists (and so h is onto) and is continuous. This completes the proof. 0 In contrast to the above results which proved that many different linear contractions are topologically conjugate, there is the following standard result about linear conjugacy which implies that very few different linear differential equations are linearly conjugate.
Theorem 7.2. Let A and B be two n x n matrices, and assume that the two flows etA and etB are linearly conjugate, i.e., there exists an invertible M with e lB = MetA M- 1 . Then A and B have the same eigenvalues.
4.8 NONHOMOGENEOUS EQUATIONS
115
PROOF. Differentiating the equality e tB = MetA M-I with respect to t at t = 0, we get that B = M AM-I. Then the characteristic polynomial for B equals that for A: p(~)
= det(B - AI) = det(MAM- 1
-
~MIM-I)
= det(M)det(A - AI)det(M- 1 ) = det(A - AI).
The fact that the characteristic polynomials are equal implies that they have the same 0 eigenvalues. REMARK 7.1. If B = sA for s > 0, then certainly the two flows are linearly equivalent. This is the only new feature which the notion of linearly equivalent adds to that of linearly conjugate.
4.8 Nonhomogeneous Equations In this section we consider nonhomogeneous linear equations. These results are used in the next chapter when studying the stability of fixed points of nonlinear equations and other matters. The general form of the equations considered is given by
x=
A(t)x + g(t).
(NH)
Given such an equation, we associate the corresponding homogeneous equation,
x=
(H)
A(t)x.
The following theorem gives the relationship between the solutions of (NH) and those of (H). We leave its proof to the reader. Theorem 8.1. (a) If x 1 (t) and x 2 (t) are two solutions of (NH), then x 1 (t) - x 2 (t) is a solution of (H). (b) Ifxn(t) is a solution of (NH) and xh(t) is a solution of (H), then xn(t) + xh(t) is a solution of (NH). (c) If xn(t) is a solution of (NH) and M(t) is a fundamental matrix solution of (H). then any solution of (NH) can be written as xn(t) + M(t)v. In terms of the above theorem, we know how to find solutions of the homogeneous equation, at least in the constant coefficient case. We need to find one solution of the nonhomogeneous equation. One way is to look for solutions of the same type as the forcing term. For scalar second order systems, this method is often called the method of undetermined coefficients. The other method is the method of variation of parameters. The following theorem gives this result for systems. Also, in the scalar equations there is an arbitrary choice that has to be made. For systems, no such choice is needed: the process is straight forward. Theorem 8.2 (Variation of Parameters). Let M(t) be a fundamental matrix solution of the homogeneous equation (H). Then x(t) = M(t)
(1:
M(S)-lg(S) ds
+
v)
IV.
116
LINEAR SYSTEMS
is a solution of the nonhomogeneous equation. If v is allowed to vary, then this gives the general solution of the nonhomogeneous equation. PROOF. To derive this equation, we look for a solution ofthe form x(t) = M(t)f(t). If this is a solution, then
itt) = A(t)M(t)f(t) + M(t)f'(t) = A(t)x(t)
+ M(t)f'(t).
Since x(t) is a solution this has to equal A(t)x(t) + g(t), so we need f'(t) = M(t)-lg(t). Integrating from to to t, we get f(t) =
t
1to
M(s)-lg(s)ds
+v
for an arbitrary vector v. Substituting this for f(t), we get the form of x(t) claimed. The above calculation can be worked backward (or a direct calculation of the derivative of the right-hand side can be made) to show that this indeed gives a solution of the nonhomogeneous equation. The statements about the general solution follow from 0 Theorem 8.1 above.
4.9 Linear Maps The results for linear maps are very similar to those for linear differential equations; the groupings of the eigenvalues become those of absolute value bigger than one, equal to one, or less than one rather than those of real part positive, zero, or negative. Let A be an n x n real matrix. If v is an eigenvector for the eigenvalue A, then A"v = A"V. Thus, if IAI < 1 then IA"vl = IAI"lvl which goes to zero as n goes to infinity. Using the Jordan Canonical Form, we get a result for linear maps that is similar to the one for linear differential equations.
Theorem 9.1. Let A be an n x n real matrix and consider the map Ax. The following are equivalent. (a) There is a norm I I. on R" and a constant 0 < ~ < 1 such that for any initial condition x E R" the iterates satisfy
IA"xl.
s ~"Ixl.
for all n ~
o.
(b) For any norm II' on R" there exist constants 0 < Il < 1 and C any initial condition x E R" the iterates satisfy
IA"xl' s (c) All the eigenvalues
A of A
C~"lxl'
satisfy
for all n ~
~
1 such that for
o.
IAI < 1.
REMARK 9.1. As for a linear differential equation, a norm as in Theorem 9.1(a) is called an adapted nann. The proof of this theorem is similar to that for linear differential equations with summations replacing integrals, and will be omitted. With this result, a linear map induced by a matrix all of whose eigenvalues have absolute values less than one is said to be a linear contraction, and the origin is called a linear sink or attracting fixed point for this map. If all the eigenvalues have absolute values greater than one then A is automatically nondegenerate (nonzero determinant) and so A"x is an expansion for n > 0 and a contraction for n < o. The map induced by A is called a linear expansion, and the origin is called a linear source or a repelling fixed point for this map.
4.9 LINEAR MAPS
117
Theorem 9.2. Assume that B Il1Id C are two n x n matrices which induce invertible linear contractions (01' both induce linear expansions), Bx Il1Id Cx. FUrther, assume that B Il1Id C belong to the same path components of Gl(n, R), the set of invertible n x n matrices. Then the linear map Bx is topologically conjugate to the linear map Cx. REMARK 9.2. The General Linear Group, Gl(n, R), has two path components, those with positive determinll1lt and those with negative determinll1lt. (Compare with Theorem 9.6 at the end of this section.) Thus if A and B are two elements of Gl(n, R), both of which have positive determinant (both are orientation preseroing), then A and B are topologically conjugate. Similarly, if both have negative determinant (both are orientation reversing), then they are conjugate.
PROOF. The idea of the proof is very similar to that for flows, but the conjugacy on the "fundamental domains" is different. Let BI be a curve in Gl(7I. R) with Bo = C and B\ = B. Take norms liB and lie given by Theorem 9.1(a) such that Band C are contractions in terms of these respective norms. Let
DB = {x E R" : Ixls $ I} Ss
= {x E R"
: Ixls
and
= I}.
so Ds is the standard unit ball and SB is the standard unit sphere in R" in terms of the norm liB. Similarly, we define De and Se using the norm lie. The Collowing two "annuli" As
= e1[DB \ B(DB)I
and
Ae = e1[De \ C(De)1
are called fundamental domains because for any x '" 0 there is a j such that B1 (x) E AB. and for most x '" 0 there is a unique such j. We need to construct a conjugacy ho between the two linear maps on their respective fundamental domains, ho : As ..... Ae, such that if x, Bx E AB then ho(Bx) = Cho(x). After constructing ho on As, we extend it to all ofR" as we did Cor differential equations. The conjugacy ho needs to be a homeomorphism between As and Ae taking the outer boundary to the outer boundary and the inner boundary to the inner boundary. On the outer boundary we take ho to be essentially the identity (radial projection from SB to Se), and on the inner boundary we take ho to be essentially the map CB- 1 (plus a radial projection onto CSe). If ho has these values on the two boundaries, it is not hard to see that it is a conjugacy on AB. To construct ho with these properties we separated the adjustment of the radius from the change in the "angle" variable in S"-I. For the radial variable, the radial line segments from the inside boundary to the outside boundary have different lengths for AB and Ae. In order to adjust this radial component of the points, we first define a map hB from the standard annulus [0,11 x 8"-1 to the Cundamental domain AB. Similarly, we define he from [0,11 x 8"-1 to Ae. Having adjusted the radial component by means of the maps hB and he, we define a map H from [0,11 x 8"-1 to [0,11 x S"-1 using the path from C to B in Gl(n,R). This map H preserves the t value in [0,11 and is the map in 8"-1 induced by BIB-I. It is the identity for t = 1 and CB-1 for t = 0; thus H makes the adjustments of the "angular" component in 8"-1 in a manner so that ho = he 0 H 0 h'i/ satisfies the necessary conjugacy equation for points on the outer boundary: if x E Ss then Cho(x) = ho(Bx).
118
IV. LINEAR SYSTEMS
FIGURE 9.1. The Construction of ho Beginning the actual constructions, let hB :
[O,IJ
X
sn- I
AB
--+
be given by hB(t, x) = TB(t, x)x
where TB is the affine map in t such that (i) TB(I, x) = Ixll/ so hB(I, x) = x/lxlB E SB and (ii) hB(O,x) = TB(O,X)X E BSB. For TB(O,X)X to be in BSB, we need TB(O,x)B-I X E SB, TB(O,x)IB-IXI B = 1, or TB(O,X) = IB-IXIIiI. Since we choose TB to be an affine map in t, t 1- t TB(t,X) = IxlB + IB-IxIB For any x E sn-I, hB(l, x) = x/lxlB E SB is on the outer boundary of AB, and hB(O,X) = x/IB-IxIB E BSB is on the inner boundary of A B . Thus hB takes the radial line segment [0,11 x {x} onto the radial line segment from x/IB-'xIB to x, i.e., the line segment from the inner boundary of AB to the outer boundary of AB. For use in verifying the conjugacy condition, note that for any y E SB, letting x = By/lByl, By hB(O'IByl)
By
or
= lylB = By
h'i/(By) =
(0,
I!~I)'
Similarly, we define Te and he with he: [0,11 x
Sn-I --+
Ae.
Converting the formulae for hB into those for he, for any x and
=I 0, x/lxl
E sn-I,
4.9 LINEAR MAPS
119
As stated above, next we use the curve Be in GI(n, R) with Bo = C and Bl = B to define
H: [0,11 x
sn- I -+ [0,11 X sn- I
in a manner which preserves the "radius" t E [0,11 and continuously changes the "angular" component in sn-I. In fact, define
On the outer boundary {I} x boundary {o} X sn-I,
sn- I ,
H(l,x) = (l,x) is the identity. On the inner
so
which is essentially the map x ~ CB-I X . Finally, we combine these maps and define ho = hc 0 H
0
h"i/ : As
-+
Ac.
First note that ho is one-to-one and onto because each of the maps hc, H, and hB are. Next we check that ho is a conjugacy whenever x, Bx E A B , i.e., for x E SB. For these xE SB,
Cho(x} = Chc 0 H 0 h8 1(X) = Chc
0
H(l,
1:1)
1:1)
= Chc(l,
Cx = Ixlc· On the other hand, ho(Bx) = hc 0 H
= hc 0
hSI(Bx) Bx H(O, IBxl) 0
Cx
= hc(O, ICxl)
Cx = Ixlc = Cho(x}. This checks the conjugacy of ho on AB. Having defined ho from AB to Ac, we extend it to all of Rn by h(x) = {
~-j(X)ho(Bj(X)X}
forx=O forx~O
120
IV. LINEAR SYSTEMS
where Bi(x)x E AB. The only difference from the case of the differential equations is that we need to check that h is well defined. However, if both Bjx, Bj+I X E A8 then C-i-1ho(Bj+I X ) = C-j-1ho(BBjx)
= c-j-1Cho(Bjx) = C-iho(Bi x )
from the conjugacy property for ho verified above. Thus h is well defined. The reader should also check that it is continuous at points on B(88). The continuity at 0 is similar to before: if Xi converges to zero, then j(Xi) goes to minus infinity, so C- j(x.) ho(BJ(x; )x) goes to O. The rest of the proof is similar to that for differential equations. 0 Having discussed linear contracting maps and linear expanding maps, we now consider the possibility of both types of eigenvalues. If all the eigenvalues of A have absolute value which is not equal to one, Ax is called a hyperbolic linear map. In general, we define the eigenspaces much as before, EU = span{v : v is a generalized eigenvector for an eigenvalue Ee
Awith IAI > I},
= span {v : v is a generalized eigenvector
E' = span{v :
V
for an eigenvalue A with IAI = I}. and is a generalized eigenvector for an eigenvalue
Awith IAI < I}.
These subspaces are called the unstable eigenspace, center eigenspace, and stable eigenspace, respectively. Much as before, E" is characterized as vectors which exponentially contract as n -+ 00, and EU is characterized as vectors which exponentially contract as n -+ -00. The center subspace, Ee , is characterized as vectors which grow subexponentially as n -+ ±oo, Le.,
Next we prove the preservation of the hyperbolic nature of a linear map under perturbation, which is analogous to Theorem 6.2 for differential equations. This result is then used to prove that nearby hyperbolic linear maps are topologically conjugate, which is analogous to Theorem 7.1. Let H(n, R) be the set of matrices A in Gl(n, R) such that A induces a hyperbolic linear map, Ax.
Theorem 9.3. Assume that A E H(n, R). If B is near enough to A, then B E H(n, R) and the splitting for B, E~ eE~, bas suhspaces with the same dimensions as those for A. Moreover, as A varies continuously to B the subspaces also vary continuously. PROOF. The proof is the same as that for differential equations. Let p(A) be the characteristic polynomial for A. Let "( be a simple closed curve inside the open unit disk of the complex plane that surrounds all the eigenvalues of A with negative real part and is oriented counterclockwise. Let "(' be a simple closed curve outside the closed unit disk of the complex plane that surrounds all eigenvalues of A with positive real part but does not surround the unit disk, again with counterclockwise orientation. Then
_1_
271"i
f "f
p'(z) dz p(z)
4.9 LINEAR MAPS
121
counts the number of roots of p(z) inside "Y and varies continuously with the change from p(z) to the characteristic polynomial q(z) for B. A similar statement holds for the unstable eigenvalues. Thus for B near enough to A, all the roots for q(z) are either inside "Y or -y' and so B is hyperbolic with the same dimensional subspaces as A. As before, the projections,
1 .1(z/ - C)-Ivdz, Pv = -2 71"
.,
onto the stable subspace vary continuously with changes of C from A to B, so the subspace E~ varies continuously. See Section 148 of Riesz and Nagy (1955). A simUar statement holds for Eil. Therefore the subspaces of B are near those for A. 0 Theorem 9.4. Assume that A is an n x n matrix which induces a linear hyperbolic map Ax. If B is near enough to A, then Bx is topologically conjugate to Ax. This result follows from the cases of linear contractions and linear expansions (Theorem 9.2), and the fact that the dimension of the splitting does not change (Theorem 9.3) in the same way that it did for differential equations. In contrast to the above results which proved that many different linear contractions are topologically conjugate, there is the following standard result about linear conjugacy which implies that very few different linear maps are linearly conjugate. Theorem 9.6. Let Band C be two invertible matrices in Gl(n, R). Assume Band C are linearly conjugate, i.e., there exists an M in GI(n,R) with C = MBM-I. Then B and C have the same eigenvalues. REMARK 9.3. This result is a standard fact in linear algebra. Compare with the proof of Theorem 7.2. The details are left to the reader. REMARK 9.4. By the uniqueness of the Jordan Canonical Form, if Band C are linearly conjugate, then they have the same Jordan canonical form if the blocks are ordered In the same way.
Now we return to the question of the path components of Gl(n, R) and H(n, R). Let Cont(n, R) be the set of matrices A E Gl(n, R) such that A induces a contracting linear map, Ax. Let Exp(n,R) be the set of matrices A E GI(n,R) such that A induces a expanding linear map, Ax. We have the following result. Theorem 9.6. (a) Let A, B E Cont(n, R). Assume that det(A) and det(B) have the same sign. Then there is a curve {At E Cont(n,R) : 0:5 t:5 I} such that Ao = A and Al =B. (b) Let A, BE H(n, R). Let E~ and E~ be the eigenspaces for A and E~ and Eil be the eigenspaces for B. Assume that (i) the dimension ofE~ equals the dimension ofE~ (so the dimension of E~ equals the dimension of EiI), (ii) the signs of det(AIEO) and det(BIEO) are the same, and (iii) the signs ofdet(AIEU) and det(BIEU) are the same. Then there is a curve {At E H(n,R) : 0:5 t:5 I} such that Ao = A and Al = B. REMARK 9.5. For u = s, u, det(AIEO') is calculated in terms of a basis of EO'. The condition that det(AIEO') = ±1 is really the condition that A preserves or reverses the orientation on the invariant subspace EO'. REMARK 9.6. Because the determinant is a continuous function, if A,B E Cont(n,R) have sign(det(A» # sign(det(B», then there can not be a curve in Gl(n, R), let alone in Cont(n, R), connecting A and B. Thus the result of this theorem is sharp. PROOF. We break up the proof into lemmas and leave the proof of one main step to the exercises.
122
IV. LINEAR SYSTEMS
Lemma 9.7. Let A be the matrix given as follows:
A=(~
-:).
Assume T = (02 + (32)1/2 ~ O. Let 8 be chosen so that and let 8, = (1 - t)8. Define the curve of matrices A
t
= (TCOS(8t ) T
sin(8t )
Q
= T cos(8) and (3 = T sin(8)
-TSin(8e)) T cos(8t ) ,
:s :s
:s :s
for 0 t 1. Then (i) Ao = A, (ii) Al is a diagonal matrix, and (iii) for 0 t 1, the eigenvalues of At have the same absolute value as those of A, namely T. Thus we have given a curve of matrices from a block in the Jordan Canonical Form corresponding to a complex eigenvalue to a diagonal block. The validity of this lemma is obvious. Lemma 9.S. Assume A is a real diagonal matrix, A = diag(AJ, ... , An), in H(n, R). Let if 1 < Aj if 0 < Aj < 1 /-Ii = { -0.5 if -1 < Aj < 0 -2 ifAj <-1. Define Ai,t = (1 - t)Aj + t/-lj, and At = diag(AI,t, ... , An,d for 0 t 1. Then (i) At E H(n, R) for 0 t 1, (ii) Ao = A, and (iii) AI is a diagonal matrix whose entries are ±2 and ±0.5.
~.5
:s :s
:s :s
The verification of the lemma is direct and is left to the reader. See Exercise 4.12(b). Exercises 4.11 and 4.12 ask the reader to prove the following result using Lemma 9.7. Lemma 9.9. (a) Assume A E Cont(n, R). Then there is a curve
{At E Cont(n,R) : O:S t:S I} such that (i) Ao = A and (ii) diag(0.5, ... , 0.5) AI= { . dlag(0.5, ... ,0.5, -0.5)
if det(A) > 0 . If det(A) < O.
(b) Assume A E H(n,R). Then there is a curve {At E H(n,R) : 0 that (i) Ao = A and (ii) AIlE" = { d~ag(0.5, ... , 0.5) dlag(0.5, ... , 0.5, -0.5) and Al lEu
= { d~ag(2, ... ,2) dlag(2, .. , ,2, -2)
:s t :s
I} such
~f det(AIE:) > 0 If det(AIE ) < 0,
~f det(AIE:) > 0 Jf det(AIE ) < O.
PROOF OF THEOREM 9.6. The proofs of parts (a) and (b) are essentially the same so we consider only part (a). By Lemma 9.9, since A, B E Cont(n, R) and det(A) and det(B) have the same sign, there are curves At and B t in Cont(n, R) with Ao = A, Bo = B. and AI = B 1 • We can combine these two curves by defining
:s :s :s :s :s
for 0 t 0.5 C t = { A2t B 2- 2t for 0.5 t 1. Then Co = A. CI = B, and C 1 E Cont(n,R) for O:S t 1. This completes the proof.
o
4.9.1
PERRON-FROBENIUS THEOREM
123
4.9.1 Perron-Frobenius Theorem In this subsection we return to considering irreducible matrices. These were defined for transition matrices, but the definition makes sense if all the entries are nonnegative as we give below. The proof of the theorem of this section is not essential elsewhere in this book, although we do refer to the theorem in a few situations. Definition. An n x n matrix A = (aij) is called nonnegative provided all the entries are nonnegative, aij ~ O. An n x n matrix A = (aij) is called positive provided all the entries are positive, aij > O. It is called eventually positive provided it is nonnegative and there is an integer m > 0 for which Am is positive. An n x n matrix A = (ai;) is called reducible provided that there is a pair i, j with (Am)ij = 0 for all m ~ 1. It is called irreducible provided that it is not reducible, i.e., for 1 $ i,i $ n there exists m = m(i,i) > 0 such that (Am)i; ." O. If A Is eventually positive and irreducible, then (Ai)i; > 0 for all i ~ m. With these definitions, there is the following result of Perron, see Perron (1901) and Gantmacher (1959). Theorem 9.10 (Perron-Frobenius). Assume A is an eventually positive matrix. (a) Then there is a real positive eigenvalue Al which is a simple root of the characteristic equation such that if Aj is any other eigenvalue (with i > 1) then Al > IAjl. The eigenvector vI for the eigenvalue At can be chosen with all entries strictly positive, vf > 0 for 1 $ i $ n. In fact, all other eigenvectors, whether for real or complex eigenvalues, have components of both signs. This is true for both eigenvectors on the right and on the left. (b) If Am has all positive entries and x is any unit vector with Xi ~ 0 for all i, then Ajx/IAjxl converges to vl/lvtl as i goes to infinity, and there are positive constants C I and C 2 such that C I A1 $ IAjxl $ C 2 A{ and CtA{ $ (A;X)i $ C 2 A{ for i ~ m and 1 $ i $ n. (Here (Aix)i is the i-th component of Aix .) REMARK 9.7. Frobenius proved a generalization of this result for irreducible nonnegative matrices (Frobenius, 1912). A good general reference for these results and a proof of the more general result is Gantmacher (1959). REMARK 9.8. One application of this theorem is to a system with a finite number of states where probabilities of making the transition from one state to another are known. Assume there are n states and Pi; is the probability of making the transition from state i to state i for 1 $ i,i $ n. Let P = (pijh~i.;~n be the transition matrix. Since the probability of making the transition from state i to some state is one, it is assumed that Li Pij = 1. Also, assume Pi; > 0 for all pairs (i, i), so there is a positive probability of transition from any state to any other state. Note that P preserves the set of all distributions x for which E j Xj = 1, i.e., x represents the distribution within the finite states {I, ... , n}. Also note that (1, ... , 1) is a left eigenvector for the eigenvalue 1, (1, ... ,I)P = (L,Pilo''',L,P,n) = (1 ..... 1). Since all the entries of (1 ..... 1) are positive, Al = 1 is the largest eigenvalue in absolute value by the conclusion of the Perron Theorem. The eigenvalue At = 1 also has right eigenvector s· with all the > O. Also s· can be normalized so that L; = 1. This vector s· represents the final steady state distribution within the states. because if x is any initial distribution with all the x; ~ 0 and Lj Xj = 1, then p"x converges to s· as k goes to infinity because of the inequalities between the eigenvalues and the fact that P preserves vectors with Lj Xj = 1.
s;
s;
PROOF. By replacing A with Am. we can assume that A is positive. (Or perhaps only take powers Ai for i ~ m.)
124
Let {ei : I
IV. LINEAR SYSTEMS ~ j ~
n} be the standard basis of Rn. Let
be the first "quadrant",
sn- I = {x :
Ixl
= I}
be the sphere of all unit vectors, and
A = Qnsn-I be the simplex of unit vectors in the first quadrant.
The matrix A induces a map,
lA, on sn- I by Ax IA(X) = lAx!,
Because A is positive, Aei = L. a.)e' E int(Q). Applying the map fA to these unit vectors, IA(A) C int(A,Sn). where the interior is taken relative to sn. The simplex A is homeomorphic to Dn-I, the closed unit disk in R n- I , i.e., A is the image of a homeomorphism from Dn-I into Rn. By the Brouwer Fixed Point Theorem, IA must have a fixed point, Vi, in A. Then Av l = AIV I for some positive real number AI' Thus AI is an eigenvalue with unit eigenvector vi. (Here vi is a column vector because it is multiplied on the right of A.) The components of vi are all positive because Vi E int(Q). The above argument can be repeated with the action of A on row vectors where the vector is multiplied on the left. We obtain a left eigenvector Wi with all positive entries for some real eigenvalue A· where wi is a row vector, wi A = A·W I . Alternatively, apply the above result to AIr. (We do not use the fact that A· = Al but this is true. Because both Wi and Vi have positive entries WIV I > 0 and A·WIV I = Wi Av l = AIWlv l , so A· = Ad Now define the (n - I)-dimensional subspace W by
Thus, wi is the normal (co)vector to this subspace. Any nonzero vector x E W has some component positive and other components negative, because all the components of Wi are positive and wlx = O. It follows that W is invariant by A: if x E W then wl(Ax) = (wIA)x = A·WIX = 0 so Ax E W. Thus A restricted to W has n - 1 eigenvalues (with multiplicity), A2," ., An. We need to prove that Al > IA) I for j > 1. To prove this claim, first take the case where Ai is real with real eigenvector vi E W of length one. Since vi E W, it must have both positive and negative components as claimed in the theorem. Let V be the two dimensional subspace spanned by vi and v j . We change to the inner product on V which makes these two vectors into an orthonormal basis. Let Sl be the unit sphere in V in terms of this new inner product. Any x E Sl can be represented by
for some cpo Then
A[cos(cp)vi + sin(cp)vIJ = Aj cos(cp)vi + Al sin(cp)v l , so j . ( ) I) cos(cp)v + (AdAj)sin(cp)vl ( () j +smcpv I ACOSCPV =. . Icos(cp)v) + (AdAj)sin(cp)vll
4.9.1 PERRON-FROBENIUS THEOREM
125
There is a I{)o with 0 < I{)o < 1r12,0 < cos(l{)o) < 1, such that Xo = cos(l{)o)v j +sin(tpo)v 1 is on the boundary of t:J. = Q n 8 1 relative to 8 1 , B(t:J., 8 1 ). Because A is a positive matrix, f~xo E int(t:J.,8 1 ) for k ~ 1. If I~jl > ~J, then by the form of fA above, f~(Xo) would converge to v J as k goes to infinity, which contradicts the fact that f~Xo E int(t:J., 8 1 ). On the other hand, if I~jl = ~l then ~~ = ~I and f~(Xo) would equal Xo, which again contradicts the fact that f~Xo E int(t:J., SI). Therefore when ~j is real, the only possibility left is that ~l > I~jl. Next we consider the case of a complex eigenvalue, ~j = -y[cos(!J» + isin(!J»J. Then there is a a complex eigenvector v j +iw j with vi, w j E W such that Avj = -ycos(!J»vi + ,,),sin(!J»w j and Awj = -,,),sin(!J»v j + ,,),cos(!J»w j . Since vj, w j E W, both of these vectors must have both positive and negative components as claimed in the theorem. Let V be the three dimensional subspace spanned by vi, v j , and w j . We change to the inner product on V which makes these three vectors into an orthonormal basis. Let s'l he tllfl unit sphpw in V in tl'r/ll/i of this new inner produet. Orfillr X(O) = cos(O)v j + sin(O)wJ, so x(O) E S2. A direct calculation shows that Ax(O) = ,,),x(O+!J», so fA(X(O» = x(O+!J» and these points move on the unit circle in the plane spanned by v j and wi. Any point in 8 2 C V can be represented as cos(l{)x(O) + sin(l{)v 1 , and A[cos(l{)x(O) + sin(l{)vIJ = -ycos(l{)x(O +!J» + ~I sin(tp)v l , fA (cos( )x(O) + sine )v l ) I{) I{)
Assume
I~jl
= ")' >
~I'
=
so
cos(l{)x(O +!J» + (~d")')s~n(l{)vl . Icos(l{)x(O + !J» + (~d")') sm(l{)vll
By the form of fA given above, if cos(cp) =F 0 then
f~(cos(l{)x(O) + sin(cp)vl) converges to the unit circle in the plane spanned by v j and
wi as k goes to infinity. In particular such a point which starts in the boundary of t:J. = Qns'l relative to S2, B(t:J.,S2), does not remain in the int(t:J.,8 2 ) as k goes to infinity. This contradicts the fact that A is a positive matrix. If I~jl = ")' = ~I'
fA(COS(I{)x(O) + sin(l{)vl)
= cos(l{)x(O +!J»
+ sin(l{)v l .
Therefore all points on 8 2 are recurrent for fA. (1f!J> is a rational multiple of 211' then all points in S2 are periodic or fAi if!J> is an irrational mUltiple of 21r then all points in S2 are recurrent but only ±v l are periodic.) In any case f~ (B( A, S2» can not be contained in int(A, S2) for all positive k. Again, this contradicts the fact that A is a positive matrix. Combining all the cases, we have proved part (a) of the theorem: ~I > I~jl for all j>1. To prove part (b) of the theorem, assume that x is a unit vector with Xi ~ 0 for all j. Let a = (X, vi )/lv I 12 . Note that a > 0 because all the components of vi are positive and all those of x are nonnegative. Then x - av l is in the subspace W defined above. Therefore x = av l + v· with v· E W, AkX
= Akav l + Akv· = a~~vl + Akv·,
Akx
7I
Akv·
=av l
+ >F" I
and
IV. LINEAR SYSTEMS
126
where lim IAkv"1 = k-oo
,\~
o.
From the above convergence, it follows that there are positive constants 0 < C I < C 2 such that
for 1 ~ i theorem.
~
This completes the proof of the
n.
o
9.9. The above proof shows that Al > IAjl for all j > I, so that for any x E ~, f~ (x) converges to Vi flvll as k goes to infinity. In fact, using the comparison of the eigenvalues it follows that fA is a contraction on ~. We did not prove that fA is a contraction on ~ directly, but merely used the Brouwer Fixed Point Theorem to get the eigenvector for AI. After getting the eigenvector for AI, we argued that since fA maps ~ into its interior in sn, the other eigenvalues can not be larger than '\1 in absolute value. Then as remarked above, the inequalities between the eigenvalues implies that fAI~ is a contraction by general arguments. REMARK
REMARK 9.10. The matrix A can have complex eigenvalues and off diagonal terms in its Jordan canonical form, i.e., l's in its Jordan form. In fact, let vi = (1, ... ,l)tr,
W = {x : x· Vi = OJ, and v j for 1 < j ~ n be any basis for W. Let C be any matrix in terms of the basis {v 2 •... , vn}. Clearly, C can have any Jordan canonical form. Then consider the linear map L which preserves W, has the matrix C in terms of the basis {v 2 , .•• , v n } on W, and L(v l ) = AIV I . Then any of the standard unit vectors can be represented in terms of this basis,
Because v I has all positive coefficients, and for j ~ 2 the sum of the coefficients of vi in terms of the standard basis is zero, it follows that Yj.1 > 0 for all j. Then L(ei) = L(2:>j.iVi) n
= AIYj.lv l
+ LYi.iL(v i ). ;=2
Also let A = (ai.}) be the matrix of L in terms of the standard basis, so L(ei) = Lai}e;.
Comparing coefficients. we get that n
aij = AIYj,1
+ "L
Yj.kL(V k ). ; e .
k=2
Thus for Al large enough, aij > 0 for all i and j, A is positive, and L(ei) is in the interior of Q for all j. This proves that A can have any type of Jordan canonical form for the eigenvalues which correspond to the eigenvectors on W.
4.10 EXERCISES
127
4.10 Exercises Jordan Canonical Form 4.1. Let A be an n x n matrix, and Vi, ... , v n be a basis such that Av l = AV 1 and Avi = AV i + v i - I for j = 2, ... , n. Given E > 0, find a new basis wi such that Awl = AWl and Awi = AW j + EWj - 1 for j = 2, ... , n. Hint: Try w j = ajvj for suitable choices of ai' Solutions and Phase Portraits for Constant Coefficients 4.2. Find a basis of solutions and draw the phase portrait for x = Ax for each of the following choices of A.
C ~),
( a)
(~ ~).
( e)
(~2 ~l ~), (f) (~2 o
0
(b)
3
0
(c)
C \2).
0~l).
(d)
(:2
n,
1
4.3. Consider the second order linear equation given by x = Ax. (a) Prove that x( t) = eAtv is a solution of x = Ax if and only if p. = A2 is an eigenvalue of A with eigenvector v. (b) Find a basis of (four) solutions of x = Ax for
Hyperbolic Linear Differential Equations 4.4. Let Hdiff(n, R) be the set of matrices A in Gl(n, R) that induce a hyperbolic linear differential equation, j; = Ax. Assume that A E Hdiff{n, R). (a) Prove there is a curve {At E Hdiff{n,R) : 0 ~ t ~ I} such that (i) Ao = A and (ii) Al has no nonzero off diagonal terms in its Jordan canonical form. (b) Prove there is a curve {AI E Hdiff{n, R) : 0 ~ t ~ I} such that (i) Ao = A and (ii) AdlE' =diag{-l, ... ,-l) AdIEU = diag(I, ... , 1).
and
Conjugacy and Structural Stability 4.5. Consider linear systems of constant coefficients in R3. Construct explicit conjugacies h : R2 -+ R3 between x = -x and the following other systems: (a) y = -2y, (b)
Y=
(~2
and
-2 (c) Y = ( 1
4.6. Consider linear systems of constant coefficients in R2. Construct explicit conjugacies h : R2 -+ R2 between
4.7. Suppose that A E GL{Rn) is not hyperbolic. Show that the map Ax is not structurally stable. Hint: Consider the family of maps Ar{x) = rA(x) for r E (I-e, I + f).
IV. LINEAR SYSTEMS
128
4.8. Let I,g : Rn -- Rn be defined by I(x) = ax and g(x) = bx where 1 < a < b. Find an explicit formula for a conjugacy from the map 9 to the map I on all of Rn, i.e., a homeomorphism h for which 10 h = hog. Prove that any such h which is differentiable must have all partial derivatives at the origin equal to 0, and h- 1 does not have partial derivatives at o. Nonhomogeneous Equations 4.9. Prove Theorem 8.1. Linear Maps 4.10. (This exercise gives a direct calculation of the eventual contraction of a Jordan block with a real eigenvalue absolute value less than one. Compare with Theorem 9.1.) Assume 0 < 1>'1 < I. Assume A is a matrix with Ae l = >.e l and Ae k = >.e k + ae k- I for 1
~ 7n.
(a) Prove that
ror
71 ~
j.
(b) Prove that (;) ~ n j Ii! ~ nj. (c) Prove that njl>,jl goes to zero as n goes to infinity. (d) Prove that IAnekl goes to zero as n goes to infinity for any 1 ~ k ~ m. 4.11. Let Cont(n,R) be the set of matrices A in G/(n,R) that induce a contracting linear map, Ax. (a) Assume that A E Cont(n, R). Prove there is a curve {AI E Cont(n, R) : 0 ~ t ~ I} such that (i) Ao = A and (ii) AI is diagonal with entries equal to either 0.5 or -0.5. Hint: Use Lemma 9.7. Allow for 1's in the off diagonal terms of the Jordan canonical form. (b) Exhibit a curve from diag( -0.5, -0.5) to diag(0.5, 0.5). Hint: The curve of matrices can not remain diagonal. Use Lemma 9.7. (c) Assume A E Cont(n,R). Prove there is a curve {AI E Cont(n,lR) : 0 ~ t ~ I} such that (i) Ao = A and (ii) AI =
{
diag(0.5, ... ,0.5) diag(0.5, ... ,0.5, -0.5)
if det(A) > 0 if det(A) < o.
Hint: Use parts (a) and (b). Notice that this proves Lemma 9.9(a) for contracting linear maps. 4.12. Let H(n, R) be the set of matrices A in G/(n, R) that induce a hyperbolic linear map. Ax. (a) Assume that A E H(n,R). Prove there is a curve {AI E H(n,R): 0 ~ t ~ I} such that (i) Ao = A and (ii) AI is a real diagonal matrix. Hint: Use Exercise 4.11. Notice that the contracting and expanding subspaces are not necessarily spanned by the a subset of the standard basis. (b) Assume A E H(n,R). Prove there is a curve {AI E H(n,R): 0 ~ t ~ I} such that (i) Ao = A and (ii) AI is diagonal with entries equal to either 2, 0.5, -0.5, or -2, i.e., prove Lemma 9.8. Notice that l's are allowed in the off diagonal terms in the Jordan canonical form. (c) Prove Lemma 9.9(b). More precisely, assume A E H(n,R). Prove there is a curve {AI E H(n,R) : 0 ~ t ~ I} such that (i) Ao = A and (ii)
A IE' = { diag(0.5, ... , 0.5) I diag(0.5, ... ,0.5, -0.5)
if det(AIE') > 0 if det(AIE') < 0,
4.10 EXERCISES
and
AIlE" = {d~ag(2, ... ,2) dlag(2, ... , 2, -2)
129
if det(AIE") >0 if det(AIEU) < O.
Hint: Use the previous problem as well as parts (a) and (b). (Note that for a = s, u, det(AIE") is calculated in terms of a basis of E". The condition that det(AIE") is positive or negative is really the condition that A preserves or reverses orientation on the invariant subspace E".)
CHAPTER V
Analysis Near Fixed Points and Periodic Orbits In this chapter we consider solutions of systems of nonlinear differential equations and the iteration of nonlinear functions of more than one variable. We also consider the phase portraits for both types of nonlinear systems. For nonlinear systems. even the analysis near a fixed point is more complicated. Rather than just using inequalities from the Mean Value Theorem. we must use the more complicated linear theory from the last chapter and nonlinear theorems from differential calculus like the Inverse Function Theorem. Implicit Function Theorem. and the Contraction Mapping Theorem. We are able to prove that if the linearization is hyperbolic. then the nonlinear map or differential equation is topologically conjugate to the linearization. This theorem is simple in one dimension. but requires a proof using the Contraction Mapping Theorem in higher dimensions. We also introduce the nonlinear invariant manifold which is tangent to contracting directions of the linearization. called the stable manifold. and the corresponding invariant manifold which is tangent to the expanding directions. called the unstable manifold. The proof that these manifolds exist is again a nontrivial fact which needs an involved proof. As the above summary and this chapter's title indicate. this chapter concerns only the behavior near a single periodic orbit. In Chapters VII and IX we return to consider more global and complicated dynamics such as those for the quadratic map on the Cantor set and the structural stability of certain classes of examples. In between. Chapter VI is concerned with how periodic points change or bifurcate as a parameter is varied. The first few sections present a review of differentiation of function between Euclidean spaces as a linear map. and the important theorems from differential calculus in the form we use them: Inverse Function Theorem. Implicit Function Theorem. and the Contraction Mapping Theorem. As a first application of the Contraction Mapping Theorem. we prove the existence of solutions of nonlinear differential equations. After these beginning sections. we begin our study of properties of solutions of nonlinear differential equations and the iteration of nonlinear maps.
5.1 Review: Differentiation in Higher Dimensions: The Derivative as a Linear Map The general references for this material are Dieudonne (1960). Lang (1968). Marsden (1974). and Smith (1971). Both Dieudonne (1960) and Lang (1968) talk about the derivative of functions between Banach spaces. The third reference. Marsden (1974). is concerned with derivatives between Euclidean spaces. and the approach is more elementary but not as developed. Many other books on real analysis define the total derivative as a linear map. We are concerned with maps f : U c R" ..... Rn. where U is an open subset of R". If all the partial derivatives exist and are continuous. then the map is said to be continuously differentiable, C 1 • We put the partial derivatives at a point p into a single 131
132
V. ANALYSIS NEAR FIXED POINTS AND PERIODIC ORBITS
matrix, DIp,
al (p»).
DIp = (ax'
J
We identify n x k matrices with linear maps from Rk to R n , L(R k,lR n ). Thus DIp should actually be thought of as a linear map in L(R k , Rn). Which coordinate function is used determines the row and which partial derivative is taken determines the column. With this choice of the entries in the matrix, the i-th coordinate of the product of this matrix by a vector v is given by
which is what it should be in terms of the partial derivatives. In fact, if all the partial derivatives exist in U and are continuous, then using the Mean Value Theorem it can be shown that for p E U, I(x) = I(p)
+ Dfp(x - p) + R(x,p)
where lim R(x, p) = O. x-p Ix - pi This latter condition is often expressed by saying that
R(x, p) = o(lx - pl). The fact that I(x) = I(p) + Dlp(x - p) + o(lx - pI) can be taken as the definition of I being differentiable with its derivative being the linear map DIp. If any (matrix or) linear map A E L(Rk,Rn) exists such that lim I(x) - I(p) - A(x - p) = 0 x-p Ix - pi then I is said to be differentiable at p and A is called the (Frechet) derivative at p. The derivative at p can be shown to be unique, and if the partial derivatives exist then the above matrix is the unique matrix which gives the derivative. The definition does not give a way to calculate the derivative but the partial derivatives do. The space L(R", Rn) is given the operator norm as defined in the last chapter. The derivative is called continuous provided the map D I : U --+ L(R", Rn) is continuous with respect to the Euclidean norm on the domain and the operator norm on the space of linpar maps. In fact, if the partial derivatives all exist and are continuous, then the derivative is continuous, and vice versa. Such a map is called continuously differentiable or If U is a region where I(x) is defined and C t , then we can let K = sup{IIDlxll : x E U}. By the Mean Value Theorem,
ct.
I/(x) - l(y)1 :S Klx - yl if the line segment from x to y is contained in U. See Dieudonne (1960), Lang (1968), or Marsden (1974). It is possible for a function to have this latter property but not be differentiable. In this case, if there is some K > 0 such that I/(x) - l(y)1 :S Klx - yl for all x and y, then f is called Lipschitz with Lipschitz constant K. We write Lip(f) = K
5.1 DIFFERENTIATION IN HIGHER DIMENSIONS
133
if K is the smallest constant that works. Thus if f is CI then it is Lipschitz. However, the function f(x) = Ixl on R is Lipschitz but not CI. In the same way, we call f aHolderfor a> 0 provided there is a constant K > 0 such that If(x) - f(y)1 :5 Klx-ylQ for all x and y. Thus if a function is l-Holder then it is Lipschitz. Finally, we call a function c r +n for r a positive integer and 0 < a :5 1 if the rlh order partial derivatives are a-Holder. Given the above definition of the derivative, if f : U C Rk - Rm and 9 : V c Rm Rn are differentiable at p and q = f(p), respectively, then the comp06ition, go I, is differentiable at p with derivative D(g 0 f)p = DgqDfp. Thus the derivative of the composition is the composition of the derivatives. This is called the chain rule. In calculus books this derivative of the composition is often written out as a product of partial derivatives. The reader should satisfy him or herself that these ways of expressing the derivative of the composition are compatible. Using this approach to understand the second derivative and higher derivatives seems complicated when first encountered. As stated above, the derivative of a map 1 : U C Rk __ Rn at a point p gives a map D f : U _ L(Rk, Rn) as the point p varies. This second space is itself isomorphic to the Euclidean space Rkn. It can be given either the Euclidean norm or the operator norm. (Any two norms on finite dimensional spaces are equivalent.) The second derivative at p, D2 f p , is then the derivative of this map and so is an element of L(Rk, L(Rk, Rn)). However, L(R k, L(R k, Rn)) is isomorphic to bilinear maps from Rk to Rn which we denote by L2(Rk,Rn). (Note L2(Rk,Rn) are those maps in L(Rk x Rk,Rn) which are linear in each factor of Rk separately.) An element B E L2 (R k, Rn) acts on two vectors in Rk and gives a vector in Rn. In terms of the standard bases {ei}~_1 of Rk and {stltzl of Rn, two vectors v and w can be written as v = Li Viei and w = Lj w j e3, and n
B(v, w) = ~)L 1 :5 i,j :5 kb~,jViWj)st t=1 n
=
L 1 :5 i,j :5 k(L bt,st)ViWj. t=1
Such a bilinear map is symmetric provided B(v, w) = B(w, v) for all vectors v and w. In terms of the entries bf,j' B is symmetric provided bf,j = i for all indices i, j, and t. The set of all symmetric bilinear forms from Rk to Rn is denoted by L~(Rk,Rn). Returning to the second derivative, D2/p acts on two vectors in Rk and gives a vector in Rn. If v and w are expressed in terms of the standard basis e l , ' .. ,ek as v = Li Viei and w = Lj wJ e3, then
bl
(Note that each. (f}f}2f}1
Xi Xj
)
P
is a vector.) The fact that the mixed cross partial derivatives
are equal, f}2
f
f}Xif}Xj
f}2
(p) =
f
f}Xjf}Xi
(p),
implies that D2 fp is a symmetric bilinear form, D2 fp(v, w) = D2 fp(w, v) for all v and w. This process can be continued to define higher derivatives, and the r-th derivative at p is a symmetric r-Iinear form, Dr fp E L~(Rk, Rn). We write Dr fp(vt to mean that
134
V. ANALYSIS NEAR FIXED POINTS AND PERIODIC ORBITS
the r-th derivative is acting on the same vector v r times, Dr Ip(v, ... ,v). If all the derivatives (or all the partial derivatives) of order 1 :5 j :5 r exist and are continuous then I is said to be r-continuously differentiable, or I is cr. Just as for C 1 (or as for functions of one real variable), if f : U C Rk -+ Rm and 9 : V c Rm -+ R n are C r with q = f(p) E V for p E U, then 9 0 f is cr in a neighborhood of p. The formula for the second derivative can be calculated:
This is found by differentiating D(g 0 f)p = Dg/(p)Dfp using the product rule (and the chain rule). There is one term for each place that the variable p appears in the formula. The result is very similar to functions of one variable, but all the derivatives are matrices which are applied to vectors. The higher derivatives of the composition can be calculated, but the formulas are somewhat complicated. (For explicit formulas see Abraham and Robbin (1967).) Using these higher derivatives, we can state Taylor's Theorem. If f : U C Rk -+ Rn is C r , then
where lim R(x, p) = Ix - plr
o.
x-p
This last condition is often expressed by saying that R(x,p) = o(lx - pn. If I : Rk -+ Rk is cr, DIp is a linear isomorphism at each point PERk, and I is one to one and onto, then I is called a C r -diffeomorphism. The set of C r -diffeomorphisrns on Rk is denoted by Diffr(Rk). By the Inverse Function Theorem, it follows that 1-1 is also cr. (The statement of the Inverse Function Theorem is given in Section 5.2.2.) We have completed our introduction to the notion of derivatives as linear maps which we need to start studying the dynamics of functions of several variables. Other main topics of the multidimensional differential calculus which we use are the Implicit and Inverse Function Theorems. They are not used immediately. Th ... Implicit Function Theorem is used in discussions of the Poincare map of a differential equation near a closed orbit and in bifurcation questions. The Inverse Function Theorem is used in the proof of the Stable Manifold Theorem. Because this material can be postponed until it is needed, we put it in a separate section made up of several subsection (Sections 5.2 - 5.2.2). Section 5.2.3 deals with the related topic of the Contraction Mapping Theorem which is used repeatedly, including in the proof of the existence of solutions for differential equations, Section 5.3.
5.2 Review: The Implicit Function Theorem In a course on advanced calculus or real analysis a proof of the Implicit Function Theorem is often given but often the students do not get a very good idea about how it is used. On the other hand, most calculus students learn to calculate using implicit differentiation, but have no real idea of the significance of the method of calculation. In Dynamical Systems several uses are made of the theorem for bifurcation results and results concerning the Poincare map. In thj, "tion, we want to discuss the idea and meaning of the theorem. The proof will be left to books on real analysis. See Dieudonne (1960), Lang (1968), or Marsden (1974). (Also the proof of the HartmanGrobman Theorem later in the chapter solves a functional equation by applying a similar
5.2 THE IMPLICIT FUNCTION THEOREM
135
contraction mapping argument.) In subsection 5.2.2 we give the statement of the Inverse Function Theorem. Finally, in subsection 5.2.3, we give a statement and proof of the Contraction Mapping Theorem. We will first give the statement and interpret the result for the case when the function is real valued. In the next subsection, 5.2.1, we discuss the case when the function is vector valued.
Theorem 2.1 (Implicit Function Theorem). Assume that U c Rn+1 is an open set and F : U -+ R is a C r fllnction Eor some r ~ 1. For p E R"+i we write p = (x, Y) with x E R" and y E R. Assume that (~, Yo) E U and 8F( By ~,Yo )40 r .
(i)
Let C = F(xo, Yo) E R. Then there are open sets V containing ~ and W containing Yo with V x W c U, and /1 C r function h : V -+ W such that h(~) = Yo
F(x, h(x»
(ii)
=C
for all x E V.
(iii)
Further, Eor each x E V, h(x) is the unique yEW such that F(x,y) =
c.
This theorem states several things. First, it says that indeed the set of (x, y) that satisfy F(x, y) = C can be represented locally as a graph, y = h(x). Next, it says that this function is differentiable. Once we know that it is differentiable, then implicit differentiation gives a method of calculating its derivative: differentiating F( x, h( x» = C with respect to Xj and using the chain rule, we get 8F 8F 8h so 8xj (x, h(x» + By (x, h(x» 8xj (x) = 0 8h -8 (x)
[8F ]-1 -8 (x,h(x» . Y If we combine these partial derivatives together in the (Frechet) derivative of h, we get 8h 8h Dh(x) = (-8 (x), ... , -{J (x») Xj
XI
8F = --8 (x,h(x» Xi
Xn
8F ]-1(8F 8F) =- [ a..(x,h(x» -8 (x,h(x»""'-8 (x,h(x».
vv
xn
Xl
Thus the assumption in the theorem that *(~, Yo) -F 0 is exactly the assumption which makes the method of implicit differentiation valid. Or formally, the assumption that the partial derivative with respect to y is nonzero is the correct assumption to make so that y can be solved in terms of x. A second more geometric way of understanding the assumption on the partial derivative is in terms of the tangent line. Consider the example of two variables (n = I), F(x, y) = x 2 + y2 = 1. The tangent line at Po = (xo, Yo) is given by VFpo' (x - xo,y - Yo) = O.
See Figure 2.1. If *(po) -F 0, then we can represent this line as a graph of y in terms of X, i.e., we can solve for y in terms of x: 8F aF (x - xo) ax (Po) + (y - Yo) 8y (Po) = 0, (y - Yo)
=
8F -(x - xo)-(Po) ax
8F (Po) By
136
V. ANALYSIS NEAR FIXED POINTS AND PERIODIC ORBITS
Conversely, if we can solve the tangent line equation to give y as a function of x, then ~(Po) '" 0. For example, at (xo, Yo) = (~, ~), y _
v'3 2
= -(x -
~)(2)(~)
=
_.2... (x _
(2) v'3 2
v'3
~) 2
gives the tangent line. The Implicit Function Theorem says that if the tangent line at Po can be represented as a graph of y in terms of x, then nearby the nonlinear level set F(x, y) = C can be represented as a graph y = h(x). The same ideas apply for n > 1.
FIGURE 2.1.
Gradient and Tangent Line to the Circle
Notice that in the example, x 2 + y2 = 1 at (1,0), ~(1,0) = 0, and the method breaks down. The tangent line is vertical so it can not be represented as a graph over the x variable. Also, near (1,0) the level set x 2 + y2 = 1 can not be represented as a graph of y in terms of x. Thus both the hypothesis and the conclusion fail to be true
at this point. However, ~~ (1,0) '" 0, so reversing the roles of x and y, it is possible to solve for x in terms of y, i.e., the level set is a graph of x in terms of y.
5.2.1 Higher Dimensional Implicit Function Theorem Similar results are true for functions that are vector valued as given in the following theorem. (See a calculus book on implicit differentiation involving partial derivatives, e.g. Edwards and Penny (1990), pages 797 and 806.) Theorem 2.2 (Higher Dimensional Implicit Function Theorem). Assume that U c R" X Rio is IllI open set IllId F : U ..... Rio is a cr function for some r 2:: 1. Represent a point p E U by P = (x,y) with x E R" IllId Y E Rk, and the coordinate functions of F by II> F = (/1,." ,J,.). Assume that for (Xo,Yo) E U
ali (ayj (Xo, YO») I $i,j$A: is an im'ertible k x k matrix (nonzero determinllllt). Let C = F(Xo,yo) E Rk. Then
there are open sets V containing Xo IllId W containing Yo with V x W function h : V ..... W such that h(Xo) = Yo F(x, h(x» = C
c U,
for all x E V.
FUrther, for each x E V, h(x) is the unique yEW such that F(x,y) =
c.
IllId a
cr
5.2.2 THE INVERSE FUNCTION THEOREM
137
Just as in two dimensions, once it is known that h is a C r function, it is possible to solve for the matrix of partial derivatives, (::;)l Si 9
.1
s ;sn in terms of the
two
. f t' I d ' . ( aIt) It) lSt.iS": matnces 0 par la envatlves ax; lStS".lS;Sn an d (aOy,
so
Or if we use the notation of DhXo for the Frechet derivative of h, DxFCXo.yo) for the matrix of partial derivatives with respect to the x/s, and DyFCXo.yo) for the matrix of partial derivatives with respect to the Yk'S, then this equation can be written as DhXo = -(DyFCXo.yo»)-l DxFCXo.yo)·
Notice that this formula is very similar to that for a real valued function F, where matrices have replaced numbers and the inverse of a matrix has replaced dividing by a number. Also, the theorem can be interpreted to say that if we can solve the linear (affine) equation F(Xo, Yo)
aI, (Xo, Yo) ) (x + (-a x;
Xo)
+ (ali -a (Xo, Yo) ) (y Yj
Yo) = c
for y in terms of x, then locally near (Xo,Yo) we can (theoretically) solve the nonlinear equation F(x,y) = C for y in terms of x. This means that if implicit differentiation works then indeed locally y is a differentiable function of x. Also the above linear equation gives the tangent space to the level set F(x,y) = c. If this linear tangent space can be represented as a graph with y given as a function of x, then the nonlinear equations can also be represented as a nearby graph. Just as in two dimensions, it might be that at some points
is not invertible. However, if the k x (n + k) matrix
which is the Frechet derivative at (Xo, Yo), has rank k, then it is possible to select k columns that give an invertible submatrix. The theorem then says that the corresponding k variables can be solved near the point in terms of the remaining n variables to give the level set F(x,y) = C as a graph. For a more thorough treatment see Dieudonne (1960) or Lang (1968).
5.2.2 The Inverse Function Theorem The Inverse FUnction Theorem can easily be proved from the Implicit FUnction Theorem and vice versa. However, although both theorems involve the assumption that a matrix of partial derivatives is invertible, they seem very different.
138
V. ANALYSIS NEAR FIXED POINTS AND PERIODIC ORBITS
Theorem 2.3 (Inverse Function Theorem). Assume that U c an is an open set and / : U -+ Rn is a C r function for some r ;:: 1. Assume that for Xo E u, D/xo is an invertible linear map (matrix). Then there exist open sets V containing Xo and W containing Yo = /(xo) and a C r function 9 : W -+ V which is the inverse of / on V: go/(x) = x for x E V and /0 g(y) = y for YEW. Further D9f(x) = [D/x ]-'. One way of interpreting this theorem is that if the linearized (affine) equations (function) y = A(x) = /(Xo) + D/xa(x - Xo) have an inverse, then the nonlinear equations y = /(x) have an inverse near x = Xo. In fact, in the proof of the stable manifold theorem we need a more precise statement of the neighborhoods on which the function is one to one. We give this statement in the following theorem whose proof we leave to the exercises. See Exercise 5.3. (Also see Hirsch Pugh (1970) and Lang (1968).) The statement of the theorem uses the minimum norm of a linear map which we defined in the last chapter. Also for r > 0, let 8(X,r)={yElR n
:
Iy-xl::;r}
be the closed ball of radius r centered at x. Theorem 2.4 (Covering Estimate). Assume that U C lRn is an open set and / : U -+ Rn is a C l function. Assume that Xo E U, and that L = D/xo is an invertible linear map with a bounded linear inverse. Let Yo = /(Xo). Take any {J with 0 < (J < 1. Let r > 0 be such that (i) 8(Xo,r) C U and (ii) ilL - D/xll ::; m(L}(l - (J) for all x E 8(xo, r). Then /(8(Xo,r» ::> {Yo}
+ L(8(0,{Jr»::> 8(Yo,m(L){Jr).
Moreover, every point y E {yo} + L(8(0, (Jr» has exactly one preimage x E 8(xo, r), /(x) = y, so the inverse function 9 is a C' function from {yo}
+ L(8(O, (Jr»
::> B(yo, m(L){Jr)
into B(Xo, r). As stated above, we leave the proof of the covering estimates to the exercises. See Exercise 5.3.
5.2.3 Contraction Mapping Theorem In this section we consider maps on some metric space Y which decrease distance, so called contraction mappings. Finding the fixed points of a contraction map can itself be thought of as a problem in the dynamics of iteration. The theorem below shows that such a map 9 has a unique fixed point. Besides being interesting in itself, this result is used in many of the proofs of other theorems. In various proofs, a map is constructed whose fixed point gives the desired conclusion. For example, later in this chapter we prove that a nonlinear map with a "hyperbolic" fixed point is conjugate to a linear map in a neighborhood of the fixed point (Hartman-Grobman Theorem). This result is proved by constructing a map e on a set of functions. The fixed point of e turns out to be the conjugacy. A main step in the proof is verifying that e is a contraction mapping and concluding that there exists a unique fixed point. Similarly, in the next section we use the contraction mapping method to prove the existence of solutions of differential equations. The contraction mapping method is also used in the proofs of the Implicit and Inverse Function Theorems. With this motivation, we turn to the statement and proof of the Contraction Mapping Theorem.
~.2.3
CONTRACTION MAPPING THEOREM
139
Theorem 2.5 (Contraction Mapping Theorem). Assume Y is a complete metric space with metric d, and 9 : Y -+ Y is a Lipschitz function witb Lipschitz constant Lip(g) = K, < 1. Then there is a unique fixed point y., g(y.) = y.. More specifically, if Yo is any point in Y, then {gR(YO)}REN is a Cauchy sequence and d(yo, yO) ~ d(yo, g(yo»/[1 - Lip(g)J. PROOF. By induction,
d(gR(YO),gR+l(yO» ~ ltd(gR-l(YO),gR(yo» ~
K,Rd(yo, g(yo».
Then k-I
d(gR(YO),gR+k(yO» ~ ~d(gR+j(YO),gR+J+I(yO» j=O k-I
~ ~ K,R+jd(yo,g(yo» j=O K,R ~ 1- K,d(Yo,g(yo».
This latter inequality proves that the sequence gR(yO) is Cauchy and so converges to a. limit point y •. If there were two fixed points y. and y', then d(y·,y') = d(g(y*),g(y'» ~
Itd(y*,y'),
which is impossible unless d(y*,y') = 0 and y. = y'. Therefore the fixed point is unique. To get the final conclusion, note that d(yo,Y*) = lim d(Yo,gR(yO» R-OO
R-I ~ n_(X)~ lim "d(gi (Yo), gi+1 (Yo» j=O 00
= ~d(gi(Yo),gi+l(yo»
j=o 00
~ ~ K,jd(Yo,g(yo» j=o 1 = -1-d(yo, g(yo». -If.
This last inequality proves the desired result.
o
140
V. ANALYSIS NEAR FIXED POINTS AND PERIODIC ORBITS
5.3 Existence of Solutions for Differential Equations In this section, we start our consideration of nonlinear systems of differential equations. In the last chapter we considered linear differential equations of the form :ic = Ax. d (As before:ic = dix is the derivative with respect to time.) A simple example of a nonlinear systems is
In general, we consider an equation of the form :ic = !(x), where x is some point in an open set U in Rn (or on some manifold like the torus) and ! : U c Rn _ Rn is a differentiable function. The function! is called a vector field because it assigns a vector !(x) to each point in U. We want to consider the solution, x(t), with x(to) some prescribed value. More precisely, given Xo and to, we want x(t) to be defined on an open interval of times, 1, with to E 1, x(to) = Xo, and
d
dix(t) = !(x(t))
(t)
for t in 1. Thus the tangent vector to the solution curve, :ic(t), is equal to the vector field! evaluated at the position at this time, !(x(t)). In Chapter IV, we used the concept of a flow when discussing topologically conjugate flows. Here we reintroduce this notion which is used extensively in the rest of the book and make the connection with the solutions of a nonlinear differential equation. Given the differential equation :ic = !(x), we let rpl(Xo) be the solution x(t) with the given d
initial condition Xo at t = 0: rpO(Xo) = Xo and dirpl(Xo) = !(rpl(Xo)) for all t for which it is defined. We also sometimes write rp(t, Xo) for rpl(Xo). (Other books often write rpl(Xo).) The function rpl(Xo) is called the flow of the differential equation. The function ! defining the differential equation is also called the vector field which generates the flow. The first theorem below shows that the solution is uniquely determined by the initial conditions Xo and the time t; the notation of the flow rpl(Xo) emphasizes this depende~ce. Later in the section we show that rpl(Xo) is a continuous function of the initial condition Xo. The final important property of the flow is the group property: '1'1 0 rp'(Xo) = rpl+·(Xo). Then '1'-1 0 '1'1 (xo) = rpO(Xo) = Xo so '1'-1 = ('1'1)-1, and for fixed t, '1'1 is a homeomorphism on its domain of definition. (There might be some points for which the solution is not defined up to time t.) Before verifying these properties for the flow generated by a differentiable vector field, we summarize these important properties of a flow. Since we occasionally refer to flows on metric spaces which are not the solutions of a differential equation on Rn, we state the definition in this context. Definition. For a metric space X, any continuous map 'I' : U c R x X - X defined on an open set U ::> {OJ x X is called a flow provided (i) it satisfies the group property '1'1 0 rp'(Xo) = ",1+·(Xo) and (ii) for fixed t, '1'1 is a homeomorphism on its domain of definition. We now tum to verifying the properties of the flow generated by a differentiable vector field on Rn.
5.3 EXISTENCE OF SOLUTIONS
141
Theorem 3.1 (Existence and Uniqueness of Solutions of Ordinary Differential Equations). Let U c Rn be an open set, and f : U --+ Rn be a Lipschitz function (or Cl). Let Xo E U and to E R. Then there exists an a > 0 and Ii solution, x(t), of x = f(x) defined for to - a < t < to + a such that x(to) = Xo. Moreover, if y(t) is another solution with y(to) = Xo, then x(t) = y(t) on their common interval of definition about to. PROOF OF EXISTENCE. For Xo E U take b > 0 such that B(Xo,b) == {x: Ix - Xol $ b} cU. The function f is Lipschitz so there is a K > 0 and M > 0 such that If(x) - f(y)1 $ Klx - yl and If(x)1 $ M for all x,y E B(Xo,b). If x(t) is a solution with x(to) = Xo. then
x(t)
= Xo +
l'
x(s) ds
= Xo +
l'
f(x(s» ds.
to
to
Conversely, if x : J --+ Rn satisfies (*), then x is a solution of x = f(x) with x(to) = Xo. Thus to get solutions to the ordinary differential equation we find solutions of (*). To this end, for y : J --+ B(Xo, b), we define F(y)(t) = Xo
+ J~
f(y(s» ds.
The idea is to show F is a contraction on some function space which has a fixed point which satisfies equation (*). and thus is a solution of the differential equation. We need to define the function space on which F acts. First we need to specify the length of the interval J. Take a with a < min {bIM, II K} and let J = [to - a, to + al. We are going to consider F as acting on potential solutions defined for t in J. We take a < blM so that F(y)(t) does not leave B(Xo,b) for t in J. We take a < 11K so that F is a contraction by A = aK. We now explicitly define the function space S of potential solutions on which Facts. Let S = {y: J --+ B(Xo,b) : y is CO, y(to) = Xo,Lip(y) $ M}, the space of M-Lipschitz curves that go through Xo at t = to and take their values in B(xo, b). We put the CO -sup-norm on S: for y, z E S we set lIy - zllo = sup{ly(t) - z(t)1 : t E J}. With this norm, it can be shown that S is a complete metric space. We need to show that F preserves S. i.e., if y is in S then Yl = F(y) is also in S. Clearly, Yl is continuous with F(y)(to) = Xo. Next,
This shows that F(y) is Lipschitz as a function of t with Lip(F(y» $ M. Then for It - tol $ a, Iy,(t) - Xol = Iy,(t) - ydto)1 $ Mit - tol $ Ma < b 80 F(y)(t) takes values in B(Xo, b) for It - tol $ a. Thus we have shown that for y in S, F(y) is in S, 80
F: S
--+
S.
142
V. ANALYSIS NEAR FIXED POINTS AND PERIODIC ORBITS
Finally, to show that F is a contraction on S, IIF(y) - F(z) 110 = sup IF(y)(t) - F(z)(t) I IEJ
~ sup I !'
If(y(s» - f(z(s»1 ds
110
IEJ
~ sup I !' IEJ 110 ~ sup I !' IEJ 110 ~
~
Kly(s) - z(s)1 ds Klly -
I
I
zllo ds I
zllo AllY - zllo. Kally -
Thus F is a contraction by A on S and so has a unique fixed point x· in S. But a fixed point of F clearly satisfies equation (.). This proves the existence of a solution of the differential equation. 0 The uniqueness of the fixed point proves that the solution is unique among the curves which are M-Lipschitz. For the proof of the uniqueness of the solutions among all curves together with the continuity of solutions on initial conditions, we use the following result which is called Gronwall's Inequality. Theorem 3.2 (Gronwall's Inequality). Let vet) and g(t) be continuous nonnegative scalar functions on (a, b), a < to < b, C ~ 0, and
vet)
~ C + I !' 110
Then
vet)
v(s)g(s) ds
~ C exp (11.
1
10
I
for a
< t < b.
g(s) ds I).
PROOF. First we consider the case for to ~ t < b. It is not possible to differentiate inequalities and retain the inequality, so we define
U(t) = C
+
!'
v(s)g(s) ds.
110
Then, vet) ~ U(t) and we can differentiate U. In fact, U'(t) = v(t)g(t) First, if C > 0, then U(t) > 0 so
~
U(t)g(t).
U'(t) < (t) U(t) - 9 ,
log
(g(~?») ~ U(t)
1:
g(s) ds,
~ C exp (
since U(to) = C. Using the fact that vet) :_ to ~ t < b.
and
(' qt..) ds) l
ttl, we are done when C > 0 and
5.3 EXISTENCE OF SOLUTIONS
143
If a < t :5 to and C > 0, then we define
U(t) = C
+
[0
v(s)g(s)ds,
so U'(t) = -v(t)g(t) 2: -U(t)g(t). Then,
U(t ) [ '0 log ( U(;) ) 2: - I g(s) ds, U(t) :5 C exp
([0
and
g(s) ds).
This completes the modifications for C > 0 and a < t :5 to. Next, if C = 0, we can take Cj > 0 which converge to zero and have the assumed inequality true for C j • By the first case,
vet) :5 Cj exp
(Ii
g(s) ds I)·
Since this last term goes to zero as j goes to infinity, we get that vet) == 0 in this case, 0 which verifies the conclusion. We can now give the proof of uniqueness. However, we give the proof of continuity with respect to initial conditions at the same time. Theorem 3.3 (Continuity with Respect to Initial Conditions). Witb tbe_ Bumptions of Tbeorem 3.1, tbe solution 'P'(Xo) depends continuously on tbe initial condition Xo. PROOF OF UNIQUENESS AND COIII.TINUITY. Assume that
tions with x(to)
= Xo and y(to) = Yo. x(t) - yet)
Let vet)
= Ix(t) -
= Xo -
x(t) and yet) are two solu-
Then, Yo
+ J~
J(x(s» - J(y(s» ds.
y(t)l, which is nonnegative, and veto) vet) :5 veto) + ILI/(X(S»
-
= IXo -
J(y(s))! ds
yol. We get that
I
:5 veto) + IL Kv(s)ds I· SO by Gronwall's Inequality, vet) :5 v(to)eKIt-tol, or Ix(t) - y(t)1 :5 IXo - yol eKII-tol. This clearly implies the solutions depend continuously on Xo. Also, if Xo = Yo then we get that vet) == 0, or x(t) == yet). This last statement gives the uniqueness. 0 Example 3.1. If the differential equation is not Lipschitz, then it is possible to have nonunique solutions. One example is the equation :i: = 3x i on the real line. Given any to there is a solution x(t) which equals (t - to)3 for t 2: to and 0 for t :5 to. There is yet another solution z(t) == 0 for all t. Then both x(to) and z(to) equal zero but x(t) and z(t) are not equal on any interval about to. We next discuss the maximal interoal oj definition oj the solution. That is, for fixed x, we extend the solution so 'P' (x) is defined for t E (L, t+) but no larger open interval of times t. (Here Land t+ possibly depend on x.) The interval is open by the existence of solutions on a short interval starting at any point. The following example gives an equation for which the solutions are not defined for all time.
144
V. ANALYSIS NEAR FIXED POINTS AND PERIODIC ORBITS
Example 3.2. Consider:i: = x 2 • If Xo > 0 then solving the equation by separation of variables shows that -00 then there is a tc- with L < tc- $ 0 such that 0 and K > 0 such that If(y)1 :5 M and If(y) - !(z)1 $ Kly - zl for y, Z E C. By the existence proof, as long as the solution 0 and a solution defined for (t+ - 6, t+ + 6) that agrees with
O(Xo)
Theorem 3.5 (Reparameterization). Assume f : U C Rn -+ IR" is a vector field defined on an open set U of R" with flow OJ therefore, t/J t is a reparameterization of the flow
ft[G ° W'(X)tl
= -[G ° W'(x)t2DG"'(x)F 0 wt(x)
5.3 EXISTENCE OF SOLUTIONS
145
and [G 0 ¢1(X)t1 ::; [G(X)t1
rill
+ Jo
IG 0 ¢·(x)l- l 'IIDG~'(x)II'IF 0 ¢·(x)lds
rill
::; [G(X)t1
+ Jo ds
::; [G(X)t1
+ Itl.
Thus [Go¢I(X)]-1 does not go to infinity in finite time, and the solution ¢I(X) is defined for all time. For the reparameterization,
~rpT(I.X)(X) = f 0 rpT(I,X)(x)r(t, x). dt
Therefore if r is defined as the solution of r(t, x) = go rpT(t.x) (x),
then both rpT(I.X)(X) and ¢I(X) satisfy the same differential equation, and so are equal by uniqueness of solutions. Because 9 is strictly positive, r( t, x) is a monotonically increasing function of time and the orbits of rpl and ¢I have the same orientation. 0 The final result of this section gives the group property for the flow.
Theorem 3.6. The flow rpl satisfies the following group property. Letting (L, t+) be the maximal interval of definition for the initial condition Xo, then rpl(rp·(Xo» = rpl+O(Xo)
for all t and s for which s,t + s E (L,t+). Let J(y) be the maximal interval of definition for rpl(y). Then rpl(rp·(X» is defined for t E J (rpo (x) ). Define the new function PROOF.
y(r) = {
rpT(X) rpT-·(rp·(x»
for 0 ::; r ::; s for s ::; r ::; s + t.
It is easily checked that y(r) is a solution. By uniqueness it must equal rpT(X) for s ::; r ::; s + t. Therefore s + t E J(x) and y(r) = rpT(X) for 0 ::; r ::; 8 + t, or rpl(rp"(X» = y(s + t) = rpIH(x), verifying the claim of the theorem. 0 If the flow is generated by a differentiable vector field, we have proved above that rpt(Xo) is a continuous function of Xo and a differentiable function of t. Actually, if the vector field f is CT with r ~ 1 then rpl(Xo) is cr jointly in Xo and t. We do not prove this, but a proof can be found in many graduate level ordinary differential equations books, e.g. Hale (1969) or Hartman (1964). By differentiating the equation
frp'(X) = f(rp'(X» with respect to the initial condition x and interchanging the order
ol differentiation, we get the equation
d I I (jiDrpx = Df'P'(x)Drpx
or d D rpxI V = D f'P'(x)DrpxI v (ji for any vector v. Either form of these equations is called the First Variation Equation.
146
V. ANALYSIS NEAR FIXED POINTS AND PERIODIC ORBITS
5.4 Limit Sets and Recurrence for Flows In this section, X is a metric space with distance d, and cpt is a flow on X. In the applications in this book, X is a Euclidean space or manifold and cpt is the flow of a differentiable vector field f. We start with the definition of a fixed point and a periodic orbit, and then give the definitions and results about limit sets, nonwandering sets, and chain recurrent sets. These theorems for flows are similar to those for maps. (Compare this section with Section 2.3.) Definition. A point p is called a fixed point for the flow cpt if cpt(p) = P for all t. Sometimes such a point is also called an equilibrium or singular point. If the flow is obtained as the solutions of a differential equation :it = lex), then a fixed point is a point for which !(p) = O. A point p is called a periodic point provided there is T > 0 such that cpT(p) = P and cpt(p) #- p for 0 < t < T. The time T > 0 which satisfies the above conditions is called the period of the orbit. (The period is the least period because cpt(p) #- p for 0 < t < T.) The orbit of such a point p, O(p), is called a periodic orbit. Periodic orbits are also called closed orbits since the set of points on the orbit is a closed curve. It easily follows that if p is a periodic point of period T, and q E O(p) then q is a periodic point of period T. Definition. A point y is an w-limit point of x for cpt provided there exists a sequence of tk going to infinity such that limk_oo d(cpt. (x), y) = O. The set of all w-limit points of x for cpt is denoted by w(x), w(x, cpt), Lw(x), or Lw(x, cpt), and is called the w-limit set. The o-limit set of x is defined the same way but with tk going to minus infinity. The set of all such points is denoted by o(x) or Lo(x) (or with cpt specified in the notation). All the basic results for limit sets of maps are also true for flows as indicated in the following theorem. In addition, if the forward orbit of a point 0+ (x) = {cpt (x) : t ~ O} is bounded, then w(x) is connected. Remember that this is not true for a map as the example of a periodic point shows. (The reason flows are different than maps is that an orbit for a flow is connected but not for a map.) Theorem 4.1. Let cpt be a flow on a metric space X. (a) The w-limit set can be represented in terms of the forward orbit as follows: w(x)
=
n T~O
cl U {cpt (x)}. t~T
(b) If cpt (x) = y for a real number t, then w(x) = w(y) and o(x) = o(y). (c) The limit sets, w(x) and o(x), are closed and both positively and negatively invariant (contain complete orbits). (d) If O+(x) is contained in some compact subset of X, then w(x) is nonempty, compact and connected. FUrther, d(cpl(X),W(x)) goes to zero as t goes to infinity. Similarly, if O-(x) is contained in a compact subset of X, then o(x) is nonempty, compact, and connected, and d(cpl(X),W(x)) goes to zero as t goes to minus infinity. (e) If Dc X is closed and positively invariant and xED, then w(x) C D. (f) Ify E w(x), then w(y) C w(x) and o(y) C o(x). PROOF. All the proofs are the same as for diffeomorphisms except the new result about w(x) being connected in part (d). For a flow, the sets UI>T{CPt(x)} are connected, so clUI~T{cpt(x)} is a nested collection of compact and connreted sets, and so
n cl( U{cpl(X)}) T~O
t~T
5.4 LIMIT SETS AND RECURRENCE FOR FLOWS
147
o
is connected. (See Exercise 5.10.)
In this section, we present some examples of flows in the plane which illustrate some of these possibilities of limit sets. In Section 5.9, we discuss the Poincare-Bendixson Theorem which gives restriction on the possible limit sets for flows in the plane. Example 4.1. The condition that the orbit is bounded is necessary to prove that the limit set is connected. This example gives a flow in the plane for which the orbit is unbounded and the limit set is not connected. Let YI < 0 < Y2 be two fixed values of Y and L j be the horizontal line {(x,Yj)} for j = 1,2. Let cpt be the flow for which cpt(x, yd = (x - t, yd, cpt(x, Y2) = (x + t, Y2), the origin is a fixed point, and the orbits of points q between the lines Y = YI and Y = Y2 spiral out limiting on both LI and L'l' See Figure 4.1. For q = (x, y) '" 0 and YI < Y < Y2, w(q) = {(x, y) : Y = YI or Y = Y2}, which is not connected. For these same q, o(q) = O. For q = (x,Yi) for j = lor 2, w(q) = o(q) = 0.
E
FIGURE 4.1. Example with Disconnected Limit Set Example 4.2. Consider the equations
+ In(l - x 2 _ y2) + IJy(l - x 2 _ y2)
:i: = -Y
iJ
= x
for IJ > O. In polar coordinates, this is the system
8=1 r = IJr(l -
r2).
The time derivative of r satisfies
r{ ><00
for 0 < r < 1 for 1 < r,
so for initial conditions p '" (0,0) (ro '" 0), the trajectory has w(p) = Sl = {(x, y) : x 2 + y2 = I}. Thus solutions forward in time limit on the unique periodic orbit with r == 1. Such an orbit with trajectories spiraling toward it from both sides is called a limit cycle. Example 4.3. Consider the equations :i:=y
iJ = x - x 3 -
IJy(2y2 - 2X2
+ x 4)
148
V. ANALYSIS NEAR FIXED POINTS AND PERIODIC ORBITS
for l' > O. We proceed to explain the phase portrait as given in Figure 4.2. There are three fixed points: 0 = (0,0), and p± = (±l,O). As we show later in the chapter, the two fixed points p± are repelling and the origin is a "saddle fixed point". To aid in the analysis of the phase portrait, consider the real valued (Liapunov) function L(x, y) = y2/2 - x 2/2+x4 /4. (See Section 5.5.3 for a discussion of Liapunov functions.) The time derivative of L along trajectories can be calculated as follows: . _ d L(x, y) = diL =
I
0
cp (x, y)ll=o
yy + (-x + x 3 )±
= y[x -
x 3 - l'y(2y2 - 2x2 = -4I'y 2L(x,y).
+ x 4 )] + (-x + x 3 )y
Because L(x,y) == 0 along the level curve L-1(0), the trajectories are tangent to this level curve, and so the level curve L-I(O) is invariant under the flow. (If l' = 0, then each level curve is invariant for the flow.) The level curve L -1(0) is made up of three trajectories: the fixed point 0 and two trajectories -y+ = {(x, y) : x > 0 and L(x, y) = O} and -y- = {(x, y) : x < 0 and L(x, y) = O}. For q E -y±, it can be shown that w(q) = a(q) = O. Because of this property, the trajectories -y± are called homoclinic connections for the saddle point O. See Figure 4.2.
FIGURE 4.2.
Example 4.3
For all q inside -y+, L( q) ~ 0 and L( q) > 0 if q is off the x-axis. Using these facts, it can also be shown that for ql inside -y+ but not equal to p+, w(q.) = {O} U -y+ and a(q.) = {p+}. Similarly, for q2 inside -y- but not equal to p-, W(q2) = {O} U -yand a( q2) = {p - }. Finally, for q outside both -y+ and -y-, L( q) ::s 0 and L( q) < 0 if q is off the x-axis. Again, it can be shown that for q3 outside both -y+ and -y-, W(q3) = {O} U -y+ U -y- and a(q3) = 0. At the moment we are not worrying about the fact that these particular equations have this phase portrait, but merely that this phase portrait illustrates a flow with different limit sets for different points. In particular, the a-limit set of certain points equals a single fixed point and for other points it is the empty set. The w-limit set of certain points equals a single fixed point; for other points, it is the fixed point at the origin together with one of the homoclinic connections (-y+ or -y-); for still other points, it is the fixed point at the origin together with both the homoclinic connections -y+ and ,,(-. Definition. Using the definition of w-limit set and a-limit set, we say that x is wrecurrent provided x E w(x) and that x is a-recurrent provided x E a(x). The only changes that are needed in the definitions of nonwandering and chain recurrent points from those in Section 2.3 is that the times must be large enough.
5.5 FIXED POINTS FOR NONLINEAR DIFFERENTIAL EQUATIONS
149
Definition. For a flow 1 such that
5.5 Fixed Points for Nonlinear Differential Equations For linear systems of differential equations, the asymptotic stability is determined by the real parts of the eigenvalues. For a fixed point of a nonlinear system, the real parts of the eigenvalues of the derivative also determine the stability. In the next two subsections we state this result for nonlinear equations and prove it for the case of an attracting fixed point. In Section 5.6, we give the comparable results for maps. In between, in Section 5.5.3, we define a Liapunov function for a nonlinear ordinary differential equation, which can also be used to determine the stability of a fixed point. Definition. We consider a (nonlinear) differential equation x = I(x) with x E aft. The point p is a fixed point if I(p) = O. In this case the flow fixes the point p for all t, 0 and Re(A_) < 0 (and the real parts of all the other eigenvalues are nonzero). A nonhyperbolic fixed point is said to have a center. (In a more restrictive sense of the word a center is a family of closed orbits surrounding a fixed point as occurs for a linear two dimensional center or an undamped pendulum.) We are often interested in the set of all points whose w-limit set is a fixed point (or in a more complicated set). The basin 01 attraction of a fixed point p is the set of all points q such that w(q) = {p}, which we denote by W"(p). Note this concept is most interesting and most often used when p is a sink. Theorems 5.1 and 5.2 below prove the stability of a nonlinear sink, and Theorem 5.3 and Corollary 5.4 state these results for nonlinear hyperbolic fixed points. We end this section with several examples.
150
V. ANALYSIS NEAR FIXED POINTS AND PERIODIC ORBITS
Example 5.1. Consider the second order equation ij + sin(O) = 0 and X2 = 0, this can be written as a first order system
+ 60
= O. By letting
XI
. _(XI) _f(x) = _(-sm(xd . -
X -
• X2
X2
-
)
6X2
.
The flxocl points are x .. (ml',O) for n an Integer. The derivative of I is
Dfx = ( _
CO~(Xil ~6)'
At (0,0), or (ml',O) for n even, the derivative is given by
W-
The characteristic equation is A2 +6 A+ 1 = 0 with eigenvalues A = (-6 ± 4JI/2)/2. Since 62 - 4 < 62 , for 6 > 0 the real part of both eigenvalues is negative. Thus (0,0) is a nonlinear sink. Notice that for 0 < 6 < 2, the eigenvalues are complex, and for 6 ~ 2 the eigenvalues are real. For 0 > 6, (0,0) is a source. Next, for the fixed point (71',0), or (n71',0) for n odd,
and the eigenvalues are A = (-6 ± [6~ + 4J 1/2)/2. Notice that for 6 are ±1. For any 6, the fixed point is a saddle.
= 0,
the eigenvalues
Example 5.2. Consider the equation x+ X - x 3 = O. The fixed points are at and X = O. The derivative of the system of equations is Dfx =
(-1 ~
3x 2
X
= 0, ± 1
~).
The eigenvalues for the fixed point (0,0) are ±i. Thus this fixed point is nonhyperbolic; the linearized equation is a center. The fixed points x = ±l have eigenvalues ±2 1/ 2 , and thus are saddles.
5.5.1 Nonlinear Sinks In this section we consider the dynamics near a nonlinear sink. The following theorem proves the nonlinear flow near the fixed point sink is an exponential contraction if the linear flow is. Also, because the solutions are bounded for the nonlinear flow, they exist for all t ~ O. Theorem 5.1. Let p be a fixed point for the equations x = f(x) with x E an. Also assume that there is a constant a > 0 such that all the eigenvalues A for A = Dfp have negative real part with Re(A) < -a < O. Then the following two statements are true. (a) There is a norm I I. on an and a neighborhood U c Rn of p , ....11 that for any initial condition x E U, the solution is defined for all t ~ 0 and satisfies
1'P'(x) - pl. :5 e- '4 Ix - pl.
for all t ~ O.
5.5.1 NONLINEAR SINKS
151
(b) For any norm II' on Rn there exist a neighborhood U C Rn ofp and a constant C ~ 1 such that for any initial condition x E U, the solution is defined for all t ~ 0 and satisfies
REMARK 5.1.
A norm as in part (a) ofthe theorem is called an adapted nonn.
This theorem can be proved by using either approach to the proof of Theorem IV.5.l. We present the proof using the averaged norm of the first proof. We leave to the exercises the proof using the Jordan canonical form. See Exercise 5.20. REMARK 5.2.
PROOF. We first do a change of coordinates y = x - P to move the fixed point to the origin. Then
y
+ p) + g(y)
= it = I(y = Ay
where g(y) contains all the terms which are quadratic and higher. Therefore g(O) = 0 and Dgo = O. We let .,pt(y) be the flow in the y coordinates and (l(x) be the flow in the x coordinates. Since .,pt(yo) is the solution of y = Ay + g(y) with .,pO(yo) = Yo, then thinking of g(.,pt(yo» as a known function of t, it is also a solution of the nonhomogeneous linear equation
y = Ay + g(.,pt(yo» with y = Yo at t = O. Therefore we can apply the variation of parameters formula to the solution to get
.,pt(yo) = eAt Yo + 1t eA(t-a)g(.,pa(Yo»ds. We now want to use the estimates for the linear flow to get estimates for the nonlinear flow. We have to introduce a little error because of the nonlinear terms, so we take f > 0 and b = a + f such that all the eigenvalues A for A satisfy Re(A) < -b. Now let II. be any adapted norm for the IinPRr equations which is shown to exist by Theorem IV.5.1, Le., the norm satisfies leA1Yol. S e-1oIYol. for t ~ O. The nonlinear term 9 starts with quadratic terms, so there is a {) > 0 such that for Iyl. S {) we have Ig(y)l. < flyl.· We let U' be the neighborhood of y = 0 given by U' = {y : Iyl. S {)}. Applying this estimate to the integral above, if 1.,p·(yo)l. S {) for 0 S sSt then
l.,pt(Yo)l. S leA1yol.
+ lleA(t-a)g(.,pa(Yo»I. ds
S e-btIYol.
+l
S e-b1IYol.
+
ebll.,p'(Yo)l. S IYol. +
l
e-b(t-a)lg(.,pa(Yo»I. ds
l'
ebae-b'fl.,pa(Yo)l. ds
febaW(Yo)l. dB.
We can now apply Gronwall's Inequality to the function ebll.,pI(Yo)l. and get
ebtl.,p'(Yo)l. S IYol.ed
,
l.,pt(Yo)l. S IYol.e(-b+<)1 = IYol.e- OI •
152
V. ANALYSIS NEAR FIXED POINTS AND PERIODIC ORBITS
It follows that if Yo E U' with IYol. $ h, then the solution will remain in U' for all t ~ 0: 1""(Yo)l. $ IYol.e(-b+<)1 $ h. Thus the solution is defined for all t ~ 0, and 1""(Yo)l. goes to zero exponentially. In the x coordinates, let U = {x: Ix - pl. $ h}. For Xo E U, let Yo = Xo - P and 0 as in Theorem 5.1 above such that for Ix - pl. $ 6, leA'(x - p)l. $ e-a'lx - pl. and l
h () x = {
p
for x = p for x '# p.
The proof that h is a conjugacy is now the same as in Theorem IV.7.1 as long as times are restricted so that T(X) $ t < 00. 0 Example 5.3. Vinograd (1957) gave an example of a nonlinear differential equation with the origin an isolated fixed point which is not Liapunov stable but which has a neighborhood U for every q E U has w(q) = O. The equations are . x 2(y _ x) + y5 X = -:-(x-;;"2-+-y'""'2""')[":"""1-+~(-:x2-=+'---cy2;;-:-);;;2]
. y2(y _ 2x) y = (x 2 + y2)[1 + (x2 + y2)2] .
The phase portrait is given in Figure 5.1 See Hahn (1967) page 191 for an analysis of the example.
5.5.2 Nonlinear Hyperbolic Fixed Points In the last subsection we considered nonlinear sinks. In this subsection we consider hyperbolic fixed points of nonlinear differential equations and state the results linking their stability with the real parts of the eigenvalues of the derivative of the vector field at the fixed point. We give this stability result as a corollary of the Hartman-Grobman Theorem. We defer the proof of the Hartman-Grobman Theorem to Section 5.7.
5.5.2 NONLINEAR HYPERBOLIC FIXED POINTS
FIGURE
5.1.
153
Phase Portrait for Example 5.3
Theorem 5.3 (Hartman-Grobman). Let p be a hyperbolic fixed point for x = I(x). Then the flow tpl of I is conjugate in a neighborhood ofp to the affine flow p+eAI(y_p) where A = DIp. More precisely, there are a neighborhood U of p and a homeomorphism h: U -+ U such that tpl(h(x» = h(p + eAI(y - p» as long as p + eAI(y - p) E U. We delay the proof of this theorem to Section 5.7.2 because it involves much more analysis than the special case of a sink, given in Theorem 5.2. As stated above, the stability of hyperbolic fixed points follows from the Hartman-Grobman Theorem. Corollary 5.4. Let p be a hyperbolic fixed point for x = I(x). lfp is a source or a saddle then the fixed point p is not Liapunov stable (unstable). lfp is a sink then it is asymptotically stable (attracting). The fact that a source is not Liapunov stable follows from a result like Theorem 5.1. It can be proved directly that a saddle is not Liapunov stable as is given in Hirsch and Smale (1974), page 187. Combining the conjugacy of linear hyperbolic Hows given in Theorem IV.7.1 with the Hartman-Grobman Theorem, we can state the following corollary. Theorem 5.5. Assume x = I(x) has a hyperbolic fixed point at p, and x = g(x) bas a hyperbolic fixed point at q (where both equations are in RR). Assume that the number of negative eigenvalues at the two fixed points are equal. Then there are neighborhoods U of p and V of q and a homeomorphism h : U -+ V such tbat h is a conjugacy of the flow of Ion U to the flow of 9 on V. REMARK 5.3. Since any two contracting linear Hows are topologically conjugate, it is clear that a topological conjugacy does not preserve much of the geometric nature of the phase portrait. By contrast, a differentiable conjugacy does preserve much of this structure. It is possible to prove that a nonlinear How is differentiably conjugate to the linear How near a hyperbolic fixed point if the eigenvalues satisfy a nonresonance condition. Assume that I is Coo and the eigenvalues at a hyperbolic fixed point are Aj for 1 :$ j :$ n. Assume that A/c ¥ mjAj for any choice of mj ~ 0 with mj ~ 2. Then the conjugacy h in Theorem 5.3 can be taken to be a diffeomorphism from V onto U. This result was initially proven by Sternberg (1958). In two dimensions, Hartman (1960) proved that near a hyperbolic fixed point, any C2 How is CI conjugate to its linearized How. This theorem is true even when the eigenvalue has multlpliclty two. In particular if the the linearized equations are diagonal with
E7-1
E7.1
154
V. ANALYSIS NEAR FIXED POINTS AND PERIODIC ORBITS
a negative real eigenvalue, then the solutions for the linearized equations go straight toward the fixed point. The nonlinear flow could have trajectories which spiral as they approach the origin but the increase in the angle is bounded (as follows from the C 1 conjugacy or a direct estimate). Thus the two phase portraits are similar up to changes which C 1 nonlinear change of coordinates allows. Also see Belitskii (1973) for C 1 conjugacies for the general hyperbolic case. See the discussion in Hartman (1964) on differentiable conjugacies. Finally, we remark again that the assumption of a differentiable conjugacy between two flows in neighborhoods of their fixed points implies that they have the same eigenvalues. This follows directly from Theorem IV.7.2 about linear flows. Theorem 5.6. Assume:ic = J(x) has a hyperbolic fixed point at p, and :ic = g(x) has a hyperbolic fixed point at q (where both equations are in Rn). Assume that the flows are dilferentiably conjugate in a neighborhood of these two fixed points. Then the eigenvalues of the linearization of f at p equal the eigenvalues of the linearization of 9 at q.
5.5.3 Liapunov Functions Near a Fixed Point In this section we introduce the use of a real valued function which is decreasing along trajectories. For a system of equations for an oscillator, the function is often an energy function. We start by giving a !ij)ecific example as motivation. Example 5.4. Consider the equation of a pendulum with friction,
X=y
Y=
- sin(x) - 6y
with 6 > O. The fixed point (0,0) is attracting (asymptotically stable or a fixed point sink) as we saw in Example 5.1 using the eigenvalues and Theorem 5.1. In this section we will find a second way to see that is true using the "energy function" L(x,y) = !y2 _ cos(x). (The first term is the "kinetic energy" and the second is the "potential energy" .) We are interested in the time derivative of the real valued function L along a solution curve (x(t), y(t»: .
L(x(t), y(t»
d =diL(x(t), y(t»
= y(t)!i(t) + sin(x(t»x(t) = y(t)[- sin(x(t» - 6y(t»)
+ sin(x(t»y(t)
= _6y(t)2
$0. Since this function is non increasing, an easy argument proves that the origin is Liapunov stable. (See the proof of Theorem 5.7 below.) Since the function is decreasing most of the time, we can prove that the origin is asymptotically stable. (See Theorem 5.8 below.) These theorems also give an idea (If tl.r I .in of attraction of this fixed point sink, namely all points with L(x,y) < L(1r,U) ,aud -1r < X < 1r. Functions L as in the above example are called Liapunov functions. We give their important characteristic in the following definition.
5.5.3 LIAPUNOV FUNCTIONS NEAR A FIXED POINT
155
Definition. Let:it = J(x) be a differential equation on Rn with flow L(p) and L(x) == ftL(tp'(x»I,_o SO for all x E U \ {pl. These conditions imply that L 0
Theorem 5.7. Let p be a fixed point for it = F(x). Let U be a neighborhood ofp and L : U -> R a weak Liapunov function on the neighborhood U of p for the dilferential equation. Then p is Liapunov stable. (b) If L is a (strict) Liapunov function on the neighborhood U ofp for the dilferential equation, then p is asymptotically stable. PROOF. To show that p is Liapunov stable, given any E > 0 we need to find a neighborhood U1 C B(p, f) such that a solution starting in U1 stays in B(p, E). First of all we can assume that E is small enough so that B(p,E) C U. Choose a such that
L(p)
< a < min{L(x) : x
E 8B(p,E)},
and let U 1 = {x E B(p, E) : L(x)
< a}.
If x E U 1 then L 0 tpl(X) S L(x) < a, so the solution can not cross the boundary of B(p, f). Therefore z. Since the closed ball cl(B(p, f» is compact, z E cl(B(p,f». We claim that z = p. Assume there is an w-limit point z # p. By continuity, L(
tn + s,
When applying Theorem 5.7 to the pendulum example above, we see that (0,0) is Liapunov stable for () ~ O. However, L is not strictly negative on any deleted neighborhood, so it does not imply that the point is asymptotically stable for () > O. To get this result we need Ii refinement of the above theorem given next.
Theorem 5.S. Let p be a fixed point for:ic = F(x). Let U be a neighborhood of p and L : U -+ R a weak Liapunov function on U. Suppose S C U is a c106ed, bounded, positively invariant neighborhood of p. Let
z=
{x E S : L(x) = O}.
156
V. ANALYSIS NEAR FIXED POINTS AND PERIODIC ORBITS
FUrther suppose that {p} is the largest positively invariant subset of Z. Then p is asymptotically stable, and S is contained in the basin of attraction of p. The w-limit set w(x) is nonempty because S is closed, bounded, and positively invariant. Let z be an w-limit point. By the argument in Theorem 5.7, L(<,?'(z)) = L(z) for all s > 0, so <'?'(z) E Z for all s > O. Thus z must be in a positively invariant subset of Z and z = p. 0 PROOF. Take any xES.
5.6 Stability of Periodic Points for Nonlinear Maps In this section we consider a map (usually a diffeomorphism) f : Rn -+ Rn or f : -+ At where M is a surface or manifold such as M = T" or sn. A linear map is given by f(x) = Ax where A is an n x n matrix. The reader should ('om pare the results of this section wit.h the results for maps on R given in Chapter II. The conditions on the derivative at the fixed point for one dimensional maps are replaced by conditions on the eigenvalues of the derivative.
M
Definition. A point p is a fixed point of f if /(p) = p. It is a periodic point of (least) period k if /k(p) = P but P(p) oF p for 0 < j < k. A periodic point p is called Liapunov stable provided given any E > 0, there exists a b > 0 such that IJi(x) - Ji(p)1 < E for all Ix - pi < band j ~ O. A periodic point p is called it asymptotically stable or attmcting provided it is Liapunov stable and there is a b > 0 such that if Ix - pi < b, then IP(x) - P(p)1 goes to zero as j goes to infinity. The basin of attmction of an attracting periodic orbit O(p) is the set of all points q such that w(p) = O(p). This basin is denoted by W"(O(p)). The following theorem gives the existence of an adapted metric for maps at an attracting fixed point which is similar to Theorem IV.5.l for differential equations. Theorem 6.1. Let p be a periodic point for / : Rn -+ Rn with period k. Assume there is a constant 0 < I-' < 1 sllch that all the eigen1llllues >. of D/: have 1>'1 < 1-'. (a) TlJen there is a norm I I. on Rn and a neiglJborlwod U c an of p such that for any initial condition x E U, the iterates satisfy II jk (x) - pk(p)l. ~ I-'jlx - pl. for all j ~ O. C
(b) For any norm II' on Rn tlJere exist a neighborhood U C Rn ofp and a constant 1 such that for any initial condition x E U, the iterates satisfy II jk (x) _/jk(p)I' ~ CI-'jlx - pI' for all j ~ o.
~
REMARK 6.1. As for differential equations, a norm as in part (a) of the theorem is called an adapted norm. REMARK 6.2. The proof is similar to that for Hows with integrals replaced by summations and is omitted.
To look at the case that is not a sink, for a linear map with matrix A we let
E" = span{v" : v" is a generalized eigenvector for an eigenvalue
>."
of A with
E' = span{v' : v' is a generalized eigenvector for an eigenvalue >.. of A with E = span{v C
C :
V
C
I>." I >
l},
1>..1 < l},
is a generalized eigenvector
for an eigenvalue >'c of A with I>'c! = l}.
5.6 STABILITY OF PERIODIC POINTS FOR NONLINEAR MAPS
157
I:.
Definition. For a periodic point p of period k, we consider these spaces with A = D The periodic point p of period k is called hyperbolic provided EO = to}, i.e., all the eigenvalues Aof DI; satisfy IAI "" 1. A hyperbolic periodiC point p of period k is called a .!ink provided all the eigenvalues of DI; are less than one in absolute value, IAI < 1, i.e., both E", EO = {a}. Theorem 6.1 above proves that a periodic sink is asymptotically stable (or attracting). In the same way, a hyperbolic periodic point is called a source provided all the eigenvalues are greater than one in absolute value, I~I > 1, i.e., both E' ,Eo = {o}. Applying Theorem 6.1 to the inverse, we get that a periodic source is repelling. Finally, a hyperbolic periodic point with E" "" {o} and E' "" {o} is a saddle. In the same way that Theorem 5.2 followed from Theorem 5.1, we could prove that a nonlinear sink is topologically conjugate in a neighborhood of the fixed point to the linear map induced by the derivative at the fixed point. The following theorem states the more general Hartman-Grobman Theorem for diffeomorphisrns. Theorem 6.2 (Hartman-Grobman Theorem). Let I : Rn -+ Rn be a C r diffoomorphism with a hyperbolic fixed point p. Then there exist neighborhoods U of p and V of 0 and a homeomorphism h : V -+ U such that I(h(x)) = h(Ax) for all x E V, where A = DIp. We delay the proof of this theorem to Section 5.7.1. REMARK 6.3. Just as for differential equations, it is possible to prove that a nonlinear
How is differentiably conjugate to the linear How near a hyperbolic fixed point if the eigenvalUes satisfy a non resonance condition. Assume that I is COO and the eigenvalues at a hyperbolic fixed point are Aj for 1 ::; j ::; n. Assume that At "" A';'J for any choice of mj ~ 0 with 1:;=1 mj ~ 2. Then it is possible to prove the existence of a conjugacy h as in Theorem 6.2 that is a diffeomorphism from V onto U. This result was initially proven by Sternberg (1958). In two dimensions, Hartman (1960) proved that near a hyperbolic fixed point, any C 2 diffeomorphism is C l conjugate to its linearized map. Thus the two phase portraits are similar up to changes which C l nonlinear change of coordinates allows. Also see Belitskii (1973) for C l conjugacies in the general hyperbolic case. See the discussion in Hartman (1964) on differentiable conjugacies. As in the case of a How, the stability of a fixed point follows from the HartmanGrobman Theorem.
n;=,
Corollary 6.3. Let I : R n -+ R n be a C r diffeomorphism with a hyperbolic fixed point p. If p is a source or a saddle then the fixed point p is not Liapunov stable. If p is a sink then it is asymptotically stable.
lf a fixed point is hyperbolic then the linear part determines the stability type of the fixed point. Small changes in the linear part preserve the same stability type. To end this section we give a result that says that a hyperbolic fixed point persists for small changes in the map. More specifically, we consider a one-parameter family of maps I,.. We could assume that Xo is a hyperbolic fixed point for llJO' However, it is enough to assume that 1 is not an eigenvalue of D(fIJO)x.o in order to show the fixed point persists, as the following theorem shows. Theorem 6.4. Let I,. (x) bea one-parameter family of differentiable maps with x ERn. Assume that I,.(x) is C l as a function jointly of 11 and x. Assume that llJO(xo) = Xo and 1 is not an eigenvalue of D(fIJO)xo' Then there are (i) an open set U about Xo, (ii) an interval N about 110, and (iii) a C I function p : N -+ U such that p(~) = Xo and
158
V. ANALYSIS NEAR FIXED POINTS AND PERIODIC ORBITS
1/J(p(J.L» = p{J.L). Moreover, (or J.L E N, II' has no other fixed points in U other than p(J.L). Finally,
PROOF. We want to find points x such that II' (x) = x. We define the function G(x, J.L) = I/J(x) - x and the condition becomes finding zeros of G(·, J.L). This is set up in the form where the Implicit Function Theorem might apply. Note that G(Xo, J.Lo) = 0
8G and 8x (Xo./Jo) = D(J/JO)Xo -1 is invertible since 1 is not an eigenvalue of D(J/Jo)Xo' Therefore the Implicit Function Theorem does indeed apply to give a Cl function p(J.L) such that G(p(J.L), J.L) = 0 for J.L in some open interval N about J.Lo and these are the only zeroes in some set N x U where U is an open neighborhood of Xo. The calculation of the derivative follows by implicit differentiation. 0
5.1 Proof of the Hartman-Grohman Theorem To prove the Hartman-Grobman Theorem for a diffeomorphism in a neighborhood of a fixed point, we first prove a case where the nonlinear map is defined on all of a Banach (or Euclidean) space, and it is a bounded distance away from the linear map on the whole space. Next we apply the global theorem to prove the local HartmanGrobman Theorem for a diffeomorphism, Theorem 6.2. Finally, we prove the local Hartman-Grobman Theorem for a flow, Theorem 5.3. To carry out these arguments, we need to work with a few function spaces and with bounded linear maps. For two Banach spaces EI and E 2, if A: lEI --+ 1E2 is a linear map we define the operator norm or sup-norm of A just as in finite dimensions by
IIAII
=
IA v l2
SUP-I-I . v"o v
I
The linear map A is a bounded linear map provided the norm IIAII is finite. (Note the function A does not take on "bounded" values in 1E2') We let L(1E1t1E2) be the set of bounded linear maps from lEI to 1E2 with the norm II . II. We also use the minimum norm which is also defined in the same way as we defined it in Section 4.1 for finite dimensions: m(A) = inf IAvI2. v"o Ivll If A is invertible, then m(A) = IIA-Ill-i. A function I : Rn --+ R n is said to be bounded provided there is a uniform C > 0 such that I/(x)1 ~ C for all x E Rn. (Notice the difference from the use of the term for a bounded linear map.) We let cg (Rn) = cg (Rn , Rn) be the space of all bounded continuous maps from Rn to itself. We put the CO-sup topology on cg(Rn),
IIvl - v2110 = sup IVI(x) - v2(x)l· xER"
With this norm, cg(Rn) is a complete metric space. See Dieudonne (1960). For a differentiable map 9 : Rn -+ Rk, at each point a E R n the derivative is (a matrix or) a bounded linear map, Dg. : Rn -+ Rk. We let Ct{Rn) be the set of CI functions from Rn to itself, 9 : Rn -+ Rn, such that 9 is in cg(R n) and such that there is a uniform bound on the derivatives, i.e., a constant C independent of a E Rn such that IIDg.1I ~ c.
5.7 PROOF OF THE HAKrMAN-GROBMAN THEOREM
159
We want to consider CI functions f : an -+ an such that f = A + 9 with A E L(Rn,Rn) an invertible hyperbolic linear map and 9 E CleRn). The global HartmanGrobman Theorem says that such an f can be conjugated to A by a continuous map h : Rn -+ Rn with h = id + v and v E cg(Rn).
Theorem 7.1. Let A E L(Rn,Rn) be an invertible hyperbolic linear map. There exists an t > 0 such that if 9 E Ct(Rn) with Lip(g) < E, then f = A + 9 is topologically conjugate to A by a map h = id + v with v E cg(Rn), and the conjugacy is unique among maps id+k with k E cg(R n ). In fact, let 0 < a < 1 be such that each eigenvalue A of A has eitller IAI < a or lA-II < a. If both t(1 - a)-I < 1 and m(A) - E > 0, then this t works. (The condition on the eigenvalues can also be expressed by saying that spectrum(A) C {A: IAI < a or lA-II < a}. The condition that m(A) - E > 0 insures that f is one to one.) REMARK 7.1. Palis and de Melo (1982) have two proofs of this theorem. The first on pages 59-·63 is similar to that given here. The second on pages 80-88 is a more geometrical proof (using the A-Lemma). We use this latter type of reasoning later in this book. Irwin (1980), pages 92, 113-114, has a proof similar to the one given here. MOTIVATION AND OUTLINE OF THE PROOF. Formally, the conjugacy can be proved to exist by the Implicit Function Theorem. For 9 E Ct(Rn), we want to find a map id + v which conjugates A + 9 with A, i.e.,
+ g) 0 (A + g) 0 (A
(id + v) = (id (id + v)
0
+ v) 0 A, A-I = id + v,
O=id+v-(A+g)o(id+v)oA- I ,
o=
v- A0v
0
A-I - 9
Therefore we define III : CHRn) x cg(a n)
-+
or
(id + v) 0 A-I.
0
cg(Rn) by
III (g, v) = v - A 0 v 0 A - I
-
9 0 (id
+ v) 0 A -I.
Given 9 we want to find a Vg such that lII(g, vg) = O. Such a Vg corresponds to a semiconjugacy of A + 9 with A Notice that 111(0,0) = id - A 0 id 0 A -I = 0, so we can hope to use the Implicit Function Theorem near (0,0) E Cl(Rn) x cg(Rn) to solve for the Vg with lII(g, vg) = 0. It can be proved (after some work) that III is a C' map, and the partial with respect to the second variable is
( alii) au (0.0) v- = v• - A 0 v• 0 A-I == (id - A*)v == .c(v), where A*v = AoVoA- I . (See Franks (1979), Irwin (1972, 1980).) If we were to show that III is a C' map with partial derivative.c and that.c is an isomorphism (a bounded linear map with a bounded linear inverse), then the Implicit Function Theorem would show that we can solve for v = Vg as a function of 9 such that lII(g,vg) == O. This would prove the theorem. Instead of verifying that III is CI with partial derivative .c, we verify that .c is an isomorphism and imitate (or repeat) the proof of the Implicit Function Theorem. In the direct proof of the Implicit Function Theorem (as opposed to the proof using the
160
V. ANALYSIS NEAR FIXED POINTS AND PERIODIC ORBITS
Inverse FUnction Theorem), the problem of finding a zero of III is changed into finding a fixed point by considering the function 8: Cl(an) x Gg(Rn) -+ Gg(a n ) given by
8(g, v) = C-I{Cv -1II(g,v)}. (The functions 9 are required to be bounded so that 8 takes its values in Gg(a n ).) After showing that C is an isomorphism using Lemma 7.2, we prove that if 1I.en Lip(g) < I then 8(g,') is a contraction on Gg(Rn) with a unique fixed point v g • Letting f = A + 9 and h, = id + v g , it follows that h, = f 0 h, 0 A-lor h, 0 A = f 0 h" Thus h, is a semiconjugacy. Lemma 7.4 proves that h, is one to one and Lemma 7.5 proves that h is onto, so h, is a conjugacy from A to f. This ends the outline of the proof. Before starting the proof, we give more notation and then prove a preliminary lemma. The matrix A is hyperbolic, so we can define the stable and unstable subspaces as usual: E" = span{v" : v" is a generalized eigenvector for an eigenvalue Au of A with IA"I > I}, E' = span{v' : v' is a generalized eigenvector for an eigenvalue Ao of A with IAol < I}. Then R n = E" $Eo . We use this decomposition to decompose the space of continuous bounded functions, Gg(Rn, an) = Gg(Rn, E") $ Gg(Rn,Es). We put norms on EU and ES such that the linear map A is a contraction and expansion on the two subspaces: II(AIE,,)-III ~ a < I and IIAIEsll ~ a < 1. On Rn, we put the maximum norm of the norms on EU and ES: if v = v" + v' with vO' E EO' for (1 = U,S, then
Let
A* v = A
0
v
0
A-I
be the map on Gg(Rn, Rn) as above and
C(v) = (id- A*)v = v- AovoA- I . For (1 = U,S, we also let A# = A*IGg(an,EO') and CO' = CIGg(Rn,EO'). Because we use the norms induced by the maximum norm on Rn, IICII = max{IICUII, IIC'II}. The first step is to give some results about linear maps on a Banach space. We use these results below, applied to A*, to prove that C is an isomorphism.
Lemma 7.2. Let E be,. Banach space and G,B E L(E,E). (a) If IIGII ~ a < I then id - G is an isomorphism and lI(id - G)-III ~ I~a' In fact, the inverse (id - G) -I can be represented by the series L:;:o Gj. (b) If B is an isomorpilism witll liB-III ~ a < I then B - id is IllI isomorphism with II(B - id)-III ~ al(1 - a). Again, this inverse, (B - id)-l, can be represented by a power series, L:~ I B - j . PROOF. To prove (a), given y we want to find x such that x - Gx = y, or x = y + Gx. We can find this x as a fixed point of a map, u. Let U : E x E -+ E be given by u(x.y) = y + Gx. Then
U(XltY) - U(X2'Y) = G(XI - X2). Iu(xlt y) - U(X2. y)1 ~ a I(xl - x21·
so
5.7 PROOF OF THE HARTMAN-GROBMAN THEOREM
161
Thus for y fixed, u(·, y) is a contraction. By the contraction mapping result, there Is a unique fixed point xy, Xy = u(Xy,y) = y+G(xy). Thusy = (id-G)xy. Theexlstence of Xy shows that id - G is onto. The uniqueness shows that id - G Is one to one. To get the bound on the norm of the inverse, notice that if x = (id - G)-I y , then x - Gx = y and Ixl - alxl :S Iyl, Ixl <
J!L
so
- I-a'
I(id - G)-Iyl < _1_ Iyl - 1- a·
Thus (id - G)-I is a bounded linear map and lI(id - G)-III :S 1/(1 - a). We will not bother with the details of convergence to show the inverse can be given by the series indicated. However, if the series converges then it can easily be checked that it is the inverse as follows: 00
00
00
(id-G)2: Gj = 2: Gj - 2: Gj j=O
j=O
j=l
= id.
Turning to part (b), by part (a), B-1 - id is an isomorphism. Since B - id = B(id - B-1) it follows that it is also an isomorphism. Its inverse is (B - id)-I = (id - B-1 )-1 B-1 so II(B - id)-III :S (1/(1 - a»a. We leave to the reader to check the series. This completes the proof of Lemma 7.2. 0 PROOF OF THEOREM 7.1. We want to show that each £,1 = id - A*, is invertible. Writing Gg(lE a ) for Gg(RR, Ea), the norm of A*, is given as follows:
Then for v E Gg(E a ), IIA*vllo = sup IAv (' A-Ixl xER"
= sup IAv(y)1 yER"
:S allvllo. Thus IIA*II :S a, and by Lemma 7.2(a), £. = id - A* is invertible with 11(£,)-111 :S 1/(1 - a). By a similar calculation, II(A;P-III :S a, and by Lemma 7.2(b), £u = id-A* is invertible with 1I(£u)-11i :S a/(I - a). Because the norm on Rn is the maximum of the norms on lEU and E', we get that 11£-111 :S 1/(1 - a). We have the map W(g, v) = v - A*(v) - 9 0 (id + v) 0 A-I and its 'linearization' at (0,0), £v = v - A*(v). Imitating the proofofthe Implicit Function Theorem, we let e(g, v) = Cl{£v - w(g,v)} = £-I{v - A*(v) - v + A*(v) + 9 0 (id + v) =
c
I {g 0
(id + v)
0
A -I}.
0
A-I}
162
V. ANALYSIS NEAR FIXED POINTS AND PERIODIC ORBITS
Thus 119(g,vt} - 9(g,1I2)lIo :oS
11£-111
sup Igo(id+vt}oA- l x-go(id+V2)oA- l xl xERn
1
.
:oS
(-1-) Llp(g) sup IVI (y) - v2(y)1
:oS
(1
- a
yERn
~ a) Lip(g)lIvl
- v2110.
For a fixed 9 with (l~a)Lip(g) < I, 9(g,') is a contraction on cg(Rn). Because Vg with 9(g, v g ) = v g • A direct calculation shows that this is equivalent to w(g, vg ) = O. Letting 1= A+g and hJ = id+vg, the fact that wig, vg) = 0 implies that hJ = (A+g)ohJoA -I = lohJoA-I, or h f 0 A = 1 0 h J. All that remains is to show that h = h J is a homeomorphism. Before proving this fact, we show that I = A + 9 is a diffeomorphism.
cg (Rn) is a complete metric space, there is a unique fixed point
Lemma 1.3. The map
I
is one to one and onto, so is a diffeomorphism.
PROOF. Assume that I(x) = I(y). Then, 0 Therefore
= I(x)
- I(y)
= A(x -
y)
+ g(x)
- g(y).
0= I/(x) - l(y)1 ~ m(A)lx - yl - Lip(g)lx - yl ~ (m(A) - Lip(g») Ix - yl,
so x = y. This shows that I is one to one. The map I is onto because it is a bounded distanc· . onto and one to one.
Lemma 1.4. The map h
= hJ
I"
t he
linear map A which is
o
is one to one.
PROOF. There are two types of proofs that h is one to one. One uses the uniqueness of the conjugacy h (within maps h for which h - id is bounded) and the fact that we can also solve for a unique k with Ao k = k 0 I. Then Aokoh = k 0 I 0 h = kohoA. Thus k 0 h conjugates A with itself. By the uniqueness of the maps which conjugate A with itself (within maps for which h - id is bounded), k 0 h = id. This proves that h is a one to one, and even that h is a homeomorphism. We do not give the details of this proof. The second proof uses a property called expansiveness. If h(x) = h(y), then hoAx = loh(x) = loh(y) = hoAy. By induction, h(Anx) = h(Any) for n ~ O. Using the fact that I is invertible and 1- 1 0 h = h 0 A-I, we can also show that h(Anx) = h(Any) for n :oS 0, so for all n E Z. Now we write x = XU + x' and y = yU + y' with xu, yU E lEU. If x "I y then either XU "I yU or x· "I y'. If XU "I yU then IAi x u - AiyUI ~ a- i Ixu - YUI. Thus we can take a j ~ 0 with IAi x U - AiyUI ~ 311h - idll o > O. (If h = id then h is a homeomorphism and we are done.) Then letting Xj = Aixand Yj = Ai y , h(xj)-h(Yi) = Xi -Yi+(hid)(xj)-(h-id)(Yj), soO = Ih(xj)-h(Yi)1 ~ Ixj-yjl-l(h-id)(xi)I-I(h-id)(Yi)1 ~ IIh - idll o > O. This contradiction shows that it is impossible for XU "I yU. Similarly, using negative iterates we can prove that x· = y'. This completes the proof that h is ~to~.
0
5.7.1 PROOF OF THE LOCAL THEOREM
163
Lemma 7.5. The map h = hJ is onto, so it is a homeomorphism ofRn. PROOF. The proof that h is onto uses the fact that it Is a bounded distance from the identity: let b = IIh - idll o. We use the notation that B(r) = B(O, r) Is the open ball centered at the origin of radius r, cI(B(r» is the closed ball centered at the origin of radius r, and S(r) = cI(B(r» \ B(r) is the sphere of radius r centered at the origin. Notice that for x E cI(B(r)), Ih(x)l $ Ih(x) - xl + Ixl $ b + r, so h(cI(B(r))) c cI(B(r + b». Similarly, for x E S(r), Ih(x)1 ~ Ixl -Ih(x) - xl ~ r - b, so h(S(r)) c cI{B(r + b» \ B(r - b). Because h is one to one, the Brouwer Invariance of Domain Theorem implies that h takes an open set to an open set; in particular, the images h(B(r» are open. By taking the union, h(Rn) is open. (See Dugundji (1966) page 359 for the Brouwer Invariance of Domain Theorem.) One the other hand, we show that the image h(Rn) is closed. Assume Zo E cI(h(Rn». There exists x; E Rn with h(x;) converging to Zo. Thus h(x;) is bounded with Ih(x;)1 $ IZoI + 1 == R. Since R ~ Ih(x;)1 ~ Ix;1 - b, we get that Ix; I $ R + b, and the x; are bounded. By compactness of cI(B(R + b)), there Is a subsequence x;. which converges to a point Xo E cI(B(R + b». By continuity of h, h(Xo) = ZOo Therefore Zo is in the image, and cI(h(Rn)) = h(Rn). Because h(Rn) is both open and closed in Rn and Rn is connected, h(Rn) = Rn, i.e., h is onto. In finite dimensions, a continuous bijection is a homeomorphism. We show this fact explicitly in this situation, i.e., we show that h- I is continuous. Assume Yn is a sequence of points contained in some cI(B(R» converging to yoo. By the above arguments, there are Xn and Xoo in cI(B(R + b» such that h(xn) = Yn and h(Xoo) = Yoo' Thus h-I(Yn) = Xn , h-I(yoo) = Xoo E cI(B(R + b». Assume the Xn do not converge to Xoo. Then there is a subsequence Xn, converging to p i' Xoo. By continuity of h,
h(p)
= J-OO lim h(Xn,) = Yoo = h(xoo ).
This contradicts the fact that h is one to one. Therefore h - I (y n) must converge to h-I(yoo), proving that h- I is continuous. The key idea which made the above proof work is that h is proper. A map h is called proper provided the inverse images of compact sets are compact. This completes the proof of the lemma and Theorem 7.1.
o 5.1.1 Proof of the Local Theorem To prove the local version, we need to use what are called bump functions. These are functions which make the transition from being identically zero to functions which are identically one. We give the construction in the following lemma. Lemma 7.6. Given numbers 0 < a < b, there is a Coo function {3 on Rn such that o $ {3(x) $ 1 for all x E R n and {3(x) = {I for Ixl $ a o for Ixi ~ b. PROOF. We start by defining a function of a real variable,
a(x) =
{ o
e-I/'"
forx
164
V. ANALYSIS NEAR FIXED POINTS AND PERIODIC ORBITS
A direct check shows that a is Coo. Next, for a < b, let ,.,(x) = o(x - a)o(b - x). Then ,.,(x) exactly on the open interval (a,b). Again,., is Coo. Now, if we define
~
0 and is greater than zero
C( ) _ f:,.,(s)ds x - b ' fa ,.,(s) ds then 0 ~ 6(x) ~ 1 for all x E Rand I forx
We now use this bump function to construct a map satisfying Theorem 7.1 from a nonlinear map in a neighborhood of a fixed point. Proposition 1.1. Let Uo be an open neighborhood of the origin ill Rn. Let I : Uo -+ Rn be a cr local diffeomorphism for r ~ 1 with 1(0) = 0 and A = D/o. Then given any f > 0, there is a smaller neighborhood U C Uo of 0 and e. C r extension] : Rn -+ Rn with]IU = IIU, (j - A) E Cl(Rn,Rn ), and Lip(] - A) < f. PROOF. Let .B(x) be the Coo bump function given by Lemma 7.6 on R with a = 1 and b = 2. Then there is a uniform K ~ 1 such that 1.B'(x)1 ~ K for all x E R. Let 9 = I -A, so Dgo = O. Take r > 0 such that IIDgx ll < f/(4K) for all x E B(O,2r) and B(O,2r) C Uo. Finally, let
and
r
](x) = Ax +
Clearly,] equals I inside U;: B(O,r) and A outside B(O,2r). Thus] E Cl(Rn, Rn). All that remains is to check the bound on the Lipschitz constant of
r
~ 1.B(lxl) r ~
r
l
+ 1.B(lyl)I·lg(x) - g(y)1 r
f
K· ~ 'Ix - yl· (4K) 'Ixl + 1· (4K) 'Ix - yl
1 ~ f(2 ~
.B(l!l)I·lg(x)1
1
1
+ 4K)lx - yl
flx-yl·
This completes the proof of the proposition.
o
For an arbitrary nonlinear map with a fixed point at p, there is a simple translation that brings the fixed point to the origin. By Proposition 7.7, there is an extension] which equals I in a neighborhood of 0 and satisfies Theorem 7.1. Then] and A are conjugate on all of Rn. Because] and I are equal on a neighborhood of 0, I and A are conjugate on a neighborhood of the origin. This shows how the local version of the Hartman-Grobman Theorem for a diffeomorphism, Theorem 6.2, follows from Theorem 7.1.
5.8 PERIODIC ORBITS FOR FLOWS
165
5.7.2 Proof of the Hartman-Grobman Theorem for Flows As in the statement of the theorem of the local Hartman-Grobman Theorem for flows near a fixed point, Theorem 5.3, we consider the differential equation i = I(x) in a neighborhood of a hyperbolic fixed point p. By a translation we can take p = O. Let B = D 10. Using a bump function we can find an extension I which equals I in a neighborhood of 0 and equals B outside a larger neighborhood. Let 'P' be the flow for the differential equation x = /(x) and e'B the flow for the linear equation x = Bx. Note that 'PI is also the flow for I in a neighborhood of O. Take the time one maps 'PI(X) and eBx. By Theorem 7.1 there is a conjugacy h between 'PI and eB , h 0 eB = 'PI 0 h, and h is unique among maps for which h - id is bounded. The following lemma shows that h actually conjugates 'PI and elB for all times t. Lemma 7.S. TIle J//al' " satisfies 'P'
0
h 0 e- IB = h for all real t.
PROOF. Fix a real number t. First, we show that not only h but also 'PI 0 h 0 e- IB is a conjugacy between 'PI and e B :
'PI
0
['PI
0
h
0
e- IB )
0
e- B = 'PI = 'PI
0 0
['PI
h0
0
hoe-B)
0
e- IB
e- IB
because h is a conjugacy for the time one maps of the flows. The conjugacy h is unique among maps for which h-id is a continuous bounded map. To see that 'Plohoe-IB -id is bounded, note that
The second quantity on the right hand side is bounded because h - id is bounded. The two flows are equal outside a bounded set 80 the first quantity on the right hand side is bounded by compactness. Thus 'P' 0 h 0 e- IB - id is bounded and also is a conjugacy. By the uniqueness of Theorem 7.1, 'PI 0 hoe-IB = h and we have proved the lemma. 0 By the lemma we have a conjugacy of the extensions on all of R". By restricting to a neighborhood of the fixed point, we have proved the local Hartman-Grobman Theorem for flows, Theorem 5.3. The next section discusses the behavior of a flow in a neighborhood of a periodic orbit.
5.8 Periodic Orbits for Flows We consider periodic orbits in this section and see how the eigenvalues of an appropriate matrix determine the stability. For a different approach than given here to the analysis of orbits in a neighborhood of a periodic orbit, see Hale (1969). Remember that a point p is a periodic point with (least) period T provided 'PT (p) = p and 'P'(p) ~ P for 0 < t < T. If p is a periodic point with period T, then the set of points O(p) = {'PI(p) : 0 ~ t ~ T} is called a periodic orbit or closed orbit. It is important to understand the flow in a neighborhood of a periodic orbit. One way to do this is to follow the nearby trajectories as they make one circuit around the periodic orbit. The following example gives a case where the orbits can be given as explicit functions of time.
166
V. ANALYSIS NEAR FIXED POINTS AND PERIODIC ORBITS
Example 8.1. Consider the equations :i; =
iI =
+ x(l _ X2 _ y2) x + y(l - x 2 _ y2). -y
which in polar coordinates are given by
r = r(l - r2)
0=1. If we look at orbits with 9(0) = 0 then 9(271") = 271". Thus the solutions return to the surface {9 = 0 mod 271"} after a time of 271". The map P which takes the r value at time 0 to the r value at time 271" incorporates the effect of making one circuit around the periodic orbit. This map P is called the first return map or Poincare map from the surface {9 = 0 mod 271"} to itself. The solutions for this system of differential equations, and so also the Poincare map, can be explicitly calculated, and used to show that r = 1 is an attracting periodic orbit. Also in this case, it can be seen directly that r = 1 is an attracting periodic orbit because r > 0 for r < 1 and r < 0 for r > 1. Definition. In general, let 'Y be a periodic orbit of period T with p E 'Y. Then for some k the k-th coordinate function of the vector field must be nonzero at p, /,,(p) =I o. We take the hyperplane E = {x: x" = p,,}. This hyperplane E is called a cross section or tmnsversal at p. For x E E near p the flow cpt(x) returns to E in time r(x), which is about T, as can be seen by the Implicit Function Theorem (as carried out explicitly in the proof of the following theorem.) Let VeE be an open set in E on which r(x) is a differentiable function. The first return map or Poincare map is defined to be P(x) = cpT(X) (x) for x E V. REMARK 8.1. Example 8.1 gives a trivial example where the Poincare map can be explicitly calculated. In two subsections below we determine properties of the Poincare map in two different situations of differential equations in R2. These two subsections are not necessary for the rest of the book but give some idea of how the Poincare map could be determined when the equations can not be explicitly solved. In particular, we determine the nature of the Poincare map for the Van der Pol equation, and we determine the derivative of the Poincare map for equations in two dimensions in terms of an integral of the divergence of the vector field. In the first subsection below, we show how it is possible to take a map and recover a flow which has this map as a Poincare map. This process is called the suspension of a map. The proof of the following theorem carries out the construction of the Poincare map and determines some of its properties in terms of the Implicit Function Theorem. It should also be noted that a cross section can be taken to be a hypersurface which is nonlinear but we do not include the necessary modifications.
Theorem 8.1. Consider a C T flow cpt for r ~ 1. (a) If p is on a periodic orbit 'Y of period T and E is a transversal at p, then the first return time r(x) is defined in a neighborhood V ofp in E and r : V -+ R is cr. (b) If P : V --+ E given by P(x) = cpr(x)(x) is the first return map, then Pis cr. PROOF. Let E = {x : Xlc = pic} be the cross 1J'!(x, t) = cp~(x) - Pic. Then 1J'!(x, t) = 0 if .'ii,
!
!
Sf""
",
P with f,,(p) ::j:. O. Define x) E E. Also, 1jJ(p, T) = O.
Finally, 1J'!(x, t) = /lcOcpt(X) and 1J'!(p, t)lt=T = hc(p) ::j:. O. By the Implicit Function Theorem, there is a neighborhood V of p in E such that for x E V it is possible to
5.8 PERIODIC ORBITS FOR FLOWS
167
solve for t as a function of x, t = r(x), such that 1/J(x, r(x» == O. Moreover, r(x) is a CT function of x. (The reader may want to look at the section on the statement of the Implicit Function Theorem.) Once we have that r(x) is CT, it follows that the Poincare map P is CT because rpl(X) is jointly C T in t and x. 0 Now that we have shown that the Poincare map is differentiable, we can indicate the relationship between the eigenvalues of the derivative of the Poincare map and the eigenvalues of the derivative of the flow, Drp~. The first step is given in the following lemma.
Lemma 8.2. (a) Ifrpl(x) is a solution ofi = f(x), then Drp;f(x) = f(rpl(X» for any t. (b) If"l is a periodic orbit of period T and p E "I, then Drp~ has 1 as an eigenvalue with eigenvector f(p). (c) Ifp and q are two points on a T-periodic orbit "I, then the derivatives Drp~ and Drp~ are linearly conjugate and so have the same eigenvalues. PROOF.
We have that d f(rpt(x» = dsrp·(x)I.=1 = :Srpl
0
rp·(x)I.=o
= Drp~f(x).
This proves part (a). For part (b), rpT(p) = P so f(p) = f(rpT(p» = Drp~f(p). This proves part (b). For part (c), assume that q = rpT(p). Then
so taking the spatial derivative (with respect to x) at p yields
Thus Drp~ and Drp~ are linearly conjugate by the linear map Drp~.
o
Deftnition. If "I is a periodic orbit of period T with p E "I, then the above result shows that the eigenvalues of Drp~ are I,AI," .,An-I' The n -1 eigenvalues Ab ... ,An-l are called the characteristic multipliers (or eigenvalues) of the periodic orbit. (Note that Lemma 8.2(c) shows that the characteristic multipliers do not depend on the point p.) A periodic orbit "I is called hyperbolic provided IAjl '" 1 for all the characteristic multipliers (1 ::; j ::; n - 1). It is called attracting, a periodic attractor, or a periodic sink provided all the characteristic multipliers are less than one in absolute value. It is called repelling, a periodic repeller, or a periodic source provided all the characteristic multipliers are greater than one in absolute value. A hyperbolic periodic orbit which is neither a source nor a sink is called a saddle periodic orbit. With these results and definitions, we can now state and prove the following theorem which says that the characteristic multipliers of the periodic orbit and the eigenvalues of the Poincare map are the same.
168
V. ANALYSIS NEAR FIXED POINTS AND PERIODIC ORBITS
Theorem 8.3. Let p be a point on a periodic orbit "( of period T. Then the characteristic mUltipliers of the periodic orbit are the same as the eigenmlues of the derimtive of the Poincare map at p. PROOF. Let 7r : R n ...... E be the projection along f(p). The Poincare map is P(x} =
D
and
7rD
o
Now we can state and prove the theorem giving sufficient conditions for stability in terms of the characteristic multipliers. Theorem 8.4. (a) Let "( be a periodic orbit for which all the characteristic multipliers AJ satisfy IAil < 1. Then,,( is asymptotically stable (attracting). (b) With the assumptions of part (a), if q is a point for which d( 1. Then -y is not Liapunov stable. PROOF. Let E be a cross section at p, and P : VeE -+ E be the Poincare map. Then pep) = p and all of the eigenvalues Aj of DPp have IAjl < 1. Thus p is an asymptotically stable fixed point for P, and there is a subneighborhood Vo C V of P such that for q E Vo, d(pn(q}, p} goes to zero as n goes to infinity. The next lemma shows that this property of the map carries over to the flow in a neighborhood of p in
E. Lemma 8.5. If q E Vo then lim,_co d(
0 there exists a 6 > 0 such that if d(x, p} ~ 6 and 0 ~ t ~ 2T, then f > d(
0, we have proved the lemma. 0 The above lemma proves part (a) if we start on the transversal, E. Now let U be an arbitrary neighborhood of "(. Let Yo c E be as in Lemma 8.5, and take V, C Vo a smaller neighborhood of pin E such that T(X) ~ 2T for x E VI, and U I == {
5.S PERIODIC ORBITS FOR FLOWS
169
n ~ O. Finally, let U2 == (!pt(x) : x E V, and 0 $ t $ T(X}}. Now take q E U,. Let qo = !p-IO(q) E V2 C VI, so !pt(qo) E U2 C U for 0 $ t $ T(qo) $ 2T. Thus !pt(q) E U I for 0 $ t $ T(qo) - to. Let qn = P"(qo) = !pt. (q). Then qn E VI and !p'(q) E U I C U for tn $ t $ tn+I' Thus !pl(q) E UI C U for all t ~ 0, and so 'Y is orbitally Liapunov stable. Also qn = !pt. (q) E VI and d(!pl (q), 'Y) goes to zero as t goes to infinity by an argument as in Lemma 8.5. This proves part (a). We want to find a point z E 'Y such that d(!pt(z),!pt(q» goes to zero as t goes to infinity. It suffices to consider q E Vo C E. As above, qn = P"(q) = !pt. (q) where tn = L.;~~ TO pj(q). We want to show that tn grows like nT, so we look at the difference, h n = tn - nT = L.;~~(Tj - T). We claim that h n is a Cauchy sequence.
Lemma 8.6. The numbers hn form a Cauchy sequence. PROOF. We need the fact that the derivative of T is bounded by a constant C > 0, IIDTxll :oS: C, so T is Lipschitz, Le.,IT(x) - T(Y)I $ Cd(x,y) for x,y E Vo. Also, there is a C I ~ 1 and v < 1 such that d(Qj,p) $ Clv'd(q,p) for all j ~ O. Then n+k-I
Ihn+k - hkl = I
L
(T(Qj) - T(p»1
(since T = T(p»
j=k
n+k-I :oS:
L
Cd(Qj,p)
n+k-I
$
L
CClv'd(q,p)
j=k
v"
$ CCld(q,p)--.
I-v
This last quantity goes to zero as k goes to infinity, so the lemma.
Ihn+k - h,,1 does also. This proves 0
Now, since hn is a Cauchy sequence, it approaches some limit value A. Then tn ~ A + nT. We use A to define z as z = !p-A(p). We need to show that this z works. Let h n = tn - nT or tn = h n + nT. Then d(!p'. (z), !pt. (q» = d(!p".+nT-A(p), pn(q»
= d(!p".-A(p),pn(q» $ d(!p"·-A(p), p)
(since !pnT(p) = p)
+ d(pn(p), P"(q».
On the last line, the first term goes to zero because hn - A goes to zero, and the second term goes to zero because p is asymptotically stable for the map. Therefore d(!p'(z),!p'(q» goes to zero as t = tn goes to infinity. For arbitrary large t, there is an n with tn $ t < tn+1 and Itn - tn+ll < 2T. Using the uniform continuity of !p'(x) for a bounded time interval, we get that d(!p'(Z),!pI(q» goes to zero as t goes to infinity. This proves part (b). The instability in part (c) follows as before from the Hartman-Grobman Theorem.
o
The In Phase Property does not always hold if the attraction to the periodic orbit is not at a geometric rate (given by the eigenvalues), as the following example shows.
170
V. ANALYSIS NEAR FIXED POINTS AND PERIODIC ORBITS
Example 8.2. This example is given in polar coordinates. Consider T = -(r - rO)3
iJ = 1 +r - rO, where rO > O. Then r = rO is an attracting periodic orbit. The solutions are given by for ro = rO r(t) = {ro sign(ro _ rO) [2t + (ro - ro)-2jl /2 for ro '" rO for ro = r' 9(t) = { : : : : + sign(ro - rO){[2t + (ro - ro)-2J1/ 2 -(ro - rO)-I} for ro '" rO. Let r(t) and 8(t) = 90 +t+[2t+(ro _r·)-2j1/ 2 - (ro-r·)-l be a solution with ro > r·. We would like to select a solution f(t) == r' and 8(t) = 80 + t with ro = r' and so that the distance between the solutions goes to zero. But
19(t) - 8(t)1 = 1[2t + (ro - r·)-2jI / 2 - (ro - r·)-l
+ 90
-
90 1
does not go to zero for any choice of 80 . In fact the difference of the angles goes to infinity before we take everything mod 211', which means that the solution off the periodic orbit keeps going around the angle direction more rapidly than the solution on the periodic orbit. This proves that the solutions do not approach the periodic orbit in phase with a solution on the periodic orbit. We now consider a flow near a hyperbolic periodic orbit, "Y, of period T. If p E "Y, then D
if
N
= {(q,y) : q E "Y and y E lE~ EBlE~}.
Then, we can consider a flow \{lIon this normal bundle N which is linear on the "fibers" lE~ EB lE~,
\{I1(q, y) = (
5.8.2 AN ATTRACTING PERIODIC ORBIT FOR THE VAN DER POL EQUATIONS 171
the Poincare map of lilt in a neighborhood of {OJ. From the conjugacy of the Poincare maps, it follows that the flow cpt in a neighborhood of '"Y is topologically equivalent to lilt in a neighborhood of'"Y x {OJ. 0 REMARK 8.2. The above result about the flow being in phase for a periodic sink can be used to show that the two flows are topologically conjugate in this case. In fact, if the periodic orbit is hyperbolic then it is always topologically conjugate (and not just equivalent) to the linear bundle flow. The main step is to find a transversal E such that for a neighborhood Eo of'"Y In E, cpT(Eo) C E. See Pugh and Shub (1970a) and Irwin (1970a).
5.8.1 The Suspension of a Map The above discussion took the flow near a periodic orbit and found a map on a one dimensional lower space. This is a local map defined only at points near the periodic orbit. Sometimes it is possible to make this construction more globally, but we do not investigate this. See Fried (1982). On the other hand, there is a general construction that takes a C r diffeomorphism on a space and constructs a C r flow on a space of one higher dimension. (It also works for a homeomorphism to give a continuous flow.) The new space is not a Euclidean space and is often not even the product of a Euclidean space and a circle. In any case this construction is useful in order to understand what flows are possible. The flow is called the suspension 0/ the map and is essentially what topologists call the mapping torus. Given a map / : X -+ X we consider the space X x R with the equivalence relation, (x, s + 1) '" (f(x) , s). Then we consider the space
x = X x R/ "'. To get all points it is enough to consider 0 $ s $ I, but the other points are included because they make it clear that the quotient space has a c r structure if / is cr. Now we consider the equations on X x R given by
X=O 8 = 1.
This induces a flow
5.8.2 An Attractin~ Periodic Orbit for the Van der Pol Equations In the introductory chapter, we mentioned that the Van der Pol equations
x=y-x3 +x y= -x
(1)
have an attracting periodic orbit. In this section we prove this result for a slightly more general set of equations called the Lienard equations. The importance of this example is that it gives a set of equations for which it is possible to verify the properties of the Poincare map and the existence of an attracting periodic orbit for a nontrivial example.
172
V. ANALYSIS NEAR FIXED POINTS AND PERIODIC ORBITS
Consider the second order scalar equation given by
x
+ g(X)x + x
=
o.
(2)
If g(x) is zero, this is the linear oscillator. The term involving:i; is a "frictional" term where the friction depends on the position x. In fact for small x we are going to take g(x) negative so it is an "anti-frictional" term, while for large x we are going to take g(x) positive so it is a "frictional" term. As mentioned in the introductory chapter, these equations can be used for a certain type of electrical circuit. See Hirsch and Smale (1974). We give the precise assumptions below. To change the equations into the form we use in our analysis we do not let v = x. Instead we let /(x) be the antiderivative of g(x), f'(x) = g(x) with /(0) = 0, and set y = x + /(x). Then equation (2) becomes x
= y -/(x)
(3)
iJ = -x. This set of equations is called the Lienard equations. Figure 8.1 shows the phase portrait for /(x) = -x + x 3 .
FIGURE 8.1. Phase Portrait of the Lienard Equation for /(x) = -x with -2 S x S 2 and -2 S y S x 6 4 2
-1
1
2
-2
-4 -6
FIGURE 8.2.
The Graph of /(x) = -x + x 3 for -2 S x S 1
+ x3
5.8.2 AN ATTRACTING PERIODIC ORBIT FOR THE VAN DER POL EQUATIONS 173
We now proceed to give the assumptions on / that make our analysis work. The model for / is /(x) = -x + x 3 which gives the Van der Pol equations (1) given above. Notice for /(x) = -x + x 3 that (i) / is an odd function of x, (ii) /(x) < 0 for 0 < x < 1 and /(x) > 0 for x > 1, (iii) / is a strictly monotone increasing function of x for x > 1, and (iv) /(x) goes to infinity as x goes to infinity. See Figure 8.2. We make similar assumptions on the general function which we consider. Assume (i) f is a C l odd function of x, (ii) there is an x· > 0 such that /(x) < 0 for 0 < x < x· and /(x) > 0 for x > x·, (iii) / is a monotone increasing function of x for x > x·, and (iv) /(x) goes to infinity as x goes to infinity. With these assumptions, we can state the main result. Theorem 8.8. With assumptions (i)-(iv) given above, equations (3) have a unique nontrivial periodic orbit 'Y. This orbit is globally attracting in the sense that if q ¥ 0 then w(q) = 'Y. PROOF. The first thing to note is that the only fixed point is at the origin. Next, we want to show that nontrivial solutions (solutions for q =I 0) go around the origin. To check that solutions go around the origin, we look at the curves where x = 0 and iI = O. Define y+ = {(O,y) : y > O}, y-
= {(O,y) : y
< O},
g+ = {(x, y) : y = /(x), x> O}, g- = {(x,y) : y = /(x),x
and
< O}.
Then x = 0 on g+ and g- , and iI = 0 on y+ and Y-. These curves are the isoclines for these equations. Let A be the region between y+ and g+, B be the region between g+ and v- , C be the region between y- and g- , and D be the region between g- and y+. Lemma 8.9. Every nontrivial trajectory moves clockwise around the origin crossing y+, entering A, crossing g+, entering B, crossing Y-, entering C, crossing g-, entering D, and returning to y+. PROOF. Start by assuming that (xo,Yo) E y+. Let (x(t),y(t» be the solution. Then x(O) > 0 so the solution enters A and XI = x(td > 0 for small time tl > O. As long as the solution stays in region A, x > 0 so x(t) ~ XI' In the region ((x,y): x ~ XI,y ~ Yo, (x,y) E A},
iJ
~ -XI
t2
> O.
< 0 so the solution can only exit A by crossing the curve g+ at some time
The argument in region B is slightly more delicate. Let (X2' Y2) E g+ be the solution at time t2' The trajectory enters region B for some time t3 > t2' Moreover, all along g+ , iI < 0, so once the trajectory leaves a smaIl neighborhood of g+, it can never return. Therefore along the trajectory there is an upper bound on x, x ~ -a < 0 for t > t3, as long as it stays in region B, and x(t) ~ X3 - a(t - t3) for as long as it stays in region B. The trajectory can leave B only by crossing v- or by the solution becoming unbounded. By the above bound, x(t) must become zero at least by time t = ta + xa/a. However, in this time interval iI = -x ~ -X3, so y(t) ~ Y3 - X3(X3/a). Since there is an a priori bound on y(t) on this time interval, the solution must exit by crossing y-. By the symmetry of the equations, the arguments in the other regions are similar to those given above. 0
174
V. ANALYSIS NEAR FIXED POINTS AND PERIODIC ORBITS
Because the solutions travel from v+ to v- and back to v+, we can define the Poincare maps {3 : v+ -+ y- and u : y+ -+ v+. Because of the symmetry of the equations, if (x(t), y(t» is a solution then (-x(t), -y(t» is also a solution. Therefore u(q) = -{3(-{3(q». Note {3(q) E v- so -{3(q) E v+, {3(-{3(q» E v- and -{3(-{3(q» E y+. Lemma 8.10. (a) p E v+ (b) u(p) = (c) {3(p) =
The following are equivalent: is on a periodic orbit, p, and -po
PROOF. Note that solutions do not cross themselves and all solutions on y+ enter A, so u and {3 are one to one monotone functions. Therefore the only way for p E v+ to he a periodic solution is for q(p) = p. This proves that (1) and (2) are flquivalent. If (J(p) = -p then u(p) = -~(-{J(p» = -,B(p) = p. 011 the ot.her hand, If ,B(p) < -p, then -,B(p) > p, /7(p) = -,B(-{3(p» > -,B(p) > p, and u(p) :-I p. The case for ,B(p) > -p is similar. 0 Thus to find the periodic orbits we can use the Poincare map,B. To deten;nine the properties of,B we use a "Liapunov function" L(x, y) = (1/2)(x 2 + y2). Then the time derivative along solutions is given by L(x, y)
= -x!(x),
which is not always of one sign. The change of L as solutions move from v+ to v- is given by ["(P)
6(p) == L(,B(p» - L(p) = Jo
L(x(t),y(t»dt,
where tl(p) is the time to reach y-. Then {3(p) = -p if and only if 6(p) = o. To calculate 6, we sometimes look at the solutions as functions of x and write (x, y(x», or as functions of y and write (x(y), y). We write
~~ (x, y)
for the total derivative along
the solution written as a function of x, (x, y(x», and similarly
~~ (x, y)
for the total
derivative along the solution written as a function of y, (x(y), y). Then
dL( ) _ L(x,y) . d x x,y x -x!(x) !(x)'
=yand
dL
-(x,y) = dy
BL:i; Bx y
BL By
(-)("7) +-
= x(y -
!(x»)
-x
+y .
=!(x). Remember that x' is the value where !(x') = O. There is a unique point p' E v+ that Hows to (x'. 0) when it first reaches g+. Let r' = Ipol. This value is important in determining the properties of the function 6(p).
5.8.2 AN ATTRACTING PERIODIC ORBIT FOR THE VAN DER POL EQUATIONS 175
Lemma 8.11. (a) For P E v+ with 0 < Ipi :5 r·, 6(p) > O. (b) The function 6(p) is 8 monotonically decreasing function of Ipi for Ipi (c) As Ipi goes to infinity, 6(p) goes to minus infinity.
~
r·.
If 0 < Ipi :5 r· then x(t) :5 x· along the trajectory from p to (j(p). Thus f(x(t» :5 0 (and strictly negative for most times), £(x(t), y(t» ~ 0 (and strictly positive for most times), and so 6(p) > O. This proves part (a).
PROOF.
FIGURE
8.3
Now assume that Ipi > Ip/l ~ r·. Let 1'1 be the part of the trajectory of p from v+ until it hits the vertical line {(x·, y)}. Let 1'2 be the part of the trajectory from this first time of hitting the vertical line {(x·, y)} until it hits this same vertical line again. Finally, let 1'3 be the part of the trajectory from this second crossing of {(x·, y)} until it reaches v -. Let l' be the combination of 1'1, 1'2, and 1'3, which is the trajectory from p to (j(p). See Figure 8.3. Similarly, let 1'; be the parts of the trajectory for p'. We need to compare the changes of L along 1'j and 1';. For 0 :5 x < x· if y and y' are chosen so that (x,y) E 1'1 and (X,y') E 1'~, then y > y - f(x) > y' - f(x) > 0, and -xf(x) > 0, so the changes of L along 1'1 and 1't are compared as follows:
t,
1.
.., L(x,y)dt =
< =
1"" 0
-xf(x) Y _ f(x) dx
0
-xf(x) y' - f(x) dx
1",' 1. :
£(x, y) dt .
Similarly, along 1'3 and 1'3' y < y' < 0 but x is decreasing so
1.
r' y-xf(x) _ f(x) dx
... L(x,y)dt = -
10
<-
0
=
1",' 1
-xf(x)
y' - f(x) dx.
£(X',y')dt.
'Yo
"(2
For the comparison of the changes of L along 1'2 and 1'~, we need to further split up into three subpieces. Let y~ and Y3 be the y values where -rl and 1'3 meet {(x·,y)}.
176
V.
ANALYSIS NEAR FIXED POINTS AND PERIODIC ORBITS
Let ':b be the part of 1"2 between y\ and Y3' We use this part of the trajectory as a graph over y and notice that y is decreasing. Also let YI and Y3 be the y values where 1"1 and 1"3 meet {(XO,y)}. Then
1.
L(x,y)dt = - 111; f(x)dy - 111; ,f(x)dy -
~
~
~
111; 1
<-
illS f(x)dy ~
f(x')dy
II;
=
Note that
..,;
-1
L(x', y') dt .
f(x)dy < -
..,.
L
f(x) dy
'Y.
because the x values on 1"2 are larger than the corresponding x' values on 1"~ 80 f(x) > f(x') and the integral is more negative. Combining the comparisons of integrals shown above, we get that 6(p) < 6(p'). This proves part (b). To get that the limit of 6(p) equals minus infinity, note that as Ipi goes to infinity, the x values on 'Y2 go to infinity, 80 6(p)
<-
111; II,
f(x)dy
-+ -00.
o PROOF OF THEOREM 8.8. We showed above that P E v+ is on a periodic orbit if and only if 6(p) = O. Lemma 8.11 proved that 6(p) is positive for Ipi :5 rO. For Ipi 2: rO, 6(p) is monotonically decreasing, 80 it has a unique zero at 80me Po with Ipol > rO. Thus there is a unique periodic orbit. fUrther, for any P with Ipi > IPol, 6(p) < O. The trajectory for P can never cross the periodic orbit through Po, 80 it stays outside this periodic orbit and 6(u i (p» < 0 for j > O. Therefore lui(p)1 is monotonically decreasing, and the 8OIution comes inward toward Po with each revolution. The limit of the 6(u j (p» must be a fixed point of u and 80 must be Po. Then the trajectory for p limits on the periodic orbit for Po. Similarly, if Ipi < IPoI, then 6(u j (p» > 0, lui(p)l is monotonically increasing, and 80 6(u i (p» must converge to Po. This completes the proof of the theorem. 0
5.8.3 Poincare Map for Differential Equations in the Plane In this subsection, we consider a differential equation in the plane given by
± = X(x,y)
iJ = Y(x,y). Let V(x. y) = (
;~:::~)
be the corresponding vector field. Denote the divergence of
" at q by (diT ")(q). Assume ~
E is a tnms-.'erSal and E' c E is an open subset 011
5.8.3 POINCARE MAP IN THE PLANE
177
which the Poincare map is defined, P : E' _ E. Since the differential equatiollB are in the plane, E is a curve which can be parameterized by"'( : ] - E with "'(1') = E' and I",('(s)I = 1. Let V.J. (q) be the scalar component of V perpendicular to the tangent line to E at q given by V.J. 0 "'(s) = det(",('(s), V 0 "'(s)). In the case where E is a horiwntalline, {(x, yO) : Xl < X < X2}, then V.J. (q) = Y(q). In the case where E is a vertical line, {(xo,y) : Yl < Y < Y2}. then V.J.(q) = -X(q). As in the general case, let T(q) be the return time for q E E'. so P(q) = !p1'(q)(q). With these definitions and notation we can state the main theorem of this subsection.
Theorem 8.12. Let"'(:]' -+ E' be a parameterization of the transversal E' as above with I",('(s)I = 1. Then for s E ]', V
0
(Po"'()'(s) = V .J. P
11'01'(')
"'(s)
() exp( o"'(s 0 In particular, if P(qo) = qo and "'(so) = qo, then .1. 0
.
(divV) o!pt o"'(s)dt) .
r(qo)
(P
0
"'()'(so) = exp (10
(div V) 0 !pI (qo) dt).
REMARK 8.3. This theorem is contained in Section 28 of Andronov. Leontovich, Gordon, and Maier (1973). The case where P(qo) = qo is the one most often used. In the application in the example below we use the general case. PROOF. The first variation equation states that
d
I
I
diD!Pq = DV.... (q)D!pq.
Since det(D!p~) = det(id) = I, Liouville's formula for time dependent linear equations gives that r(q)
det(D!p~(q» = exp (10
(div V) 0 !pt(q) dt).
(We leave it as an exercise to prove this time dependent version of Liouville's formula. See Exercise 5.35.) Notice that the right hand side of this equality is the integral in the formula for (P 0 "'()'(s) as stated in the theorem. Therefore to complete the proof, we must relate (P 0 ",()'(s) with det(D!p~';':)(·». Taking the derivative of P 0 "'(s) = !pTO'Y(')(",(s» with respect to s yields (P 0 ",()'(s) = (D!p~~:)(')h'(s)
+ (T "'()'(s)[V !p1' 'Y(') ("'(s»J (D!p~~:?)h'(s) + (T 0 "'()'(s)[V 0 P 0 ",(s)J.
=
0
0
0
Then (P 0 ",()'(s)[V.1.
0
Po ",(s)J
= det«P 0 "'()'(s). V
0
P
0
= det«D!p~~:)(')h'(s), V
"'(s» 0
Po "'(s»
+ det«T 0 "'()'(s)[V 0
P 0 "'(s)J, V
= det«D!p~~:)(')h'(s), (D!p~~:)('»V
= det(D!p~~:)('» det(",('(s), V
0
0
0
Po "'(s»
"'(s»
"'(s»
(TO'Y(')
= exp (10
(div V) o!pt 0 "'(s) dt)V.1. 0 "'(s).
178
V. ANALYSIS NEAR FIXED POINTS AND PERIODIC ORBITS
Dividing by V.l
0
o
po -y(s) gives the desired formula.
Example 8.3. Consider the Volterra-Lotka equations which model the populations of two species which are predator and prey: :t = x(A - By)
iJ = y(Gx - D) with all A, B, G, D > O. There is a unique fixed point in the interior of the first quadrant with x· = D/G and y. = A/B. Let E = {(x,y·) : x· :5 x < oo}. By using arguments like those for the Van der Pol equation, it can be shown that every point (x, yO) of E with x > x· returns to E, P : E -+ E. The argument below shows that this map extends differentiably so that P(x·) = x·. We show that all points in the open first quadrant except (x·, y.) lie on periodic orbits by applying Theorem 8.12. This fact is usually verified by finding a real valued function L which is constant on orbits, t == o. Let V be the vector field for the above differential equations. The divergence of V is given by (divV)(x,y) = (A - By) :t = -
x
+ (Gx
- D)
iJ
+-. y
We write P(x) for the x-value of the Poincare map of the point (x, y.). The integral in Tneorem 8.12 becomes exp (
l
dr ):t
(-
o
x
+ -iJ ) dt)
= exp (
y
l
P (r)
r
1
- dx X
+
111 '
II'
1
Y
dy)
= P(x)
x Applying the formula of Theorem 8.12 yields P'(x) =
Y(x, y.) . P(x) Y(P(x), yO) x
Defining I(s) = Y(s,y·)/s, we get that
10
P(x)P'(x) = I(x).
Integrating from x· to x yields F
0
P(x) - F
0
P(x·) = F(x) - F(x·),
where F is the antiderivative of I. Since P(x·) = x·, this gives F
0
P(x) = F(x).
Since I(s) > 0 for s > x·, F(s) is strictly monotonically increasing. Therefore, the fact that F 0 P(x) == F(x) implies that P(x) == x. This completes the proof that all orbits are periodic. For other examples applying Theorem 8.12, see Robinson (1985).
5.9
POiNCARE-BENDlXSON THEOREM
179
5.9 Poincare-Bendixson Theorem The Poincare-Bendixson Theorem is a result about flows in a region in the plane or on the two sphere, 8 2 . The reason for the restriction to these domains is that it depends on the Jordan Curve Theorem. It also depends on the fact that a transversal is one dimensional. The conclusion of the theorem is that the w-limit set of a point is either a closed orbit or contains a fixed point. In order to insure that the w-limit set is nonempty, we need to assume that the forward orbit is bounded, i.e., O+(P) is contained in a compact subset of the domain. See Section 5.4 for examples of limit sets for flows in the plane. For other references with more details on this result, see Hale (1969), Hartman (1964), and Hirsch and Smale (1974).
Theorem 9.1 (Poincare-Bendixson Theorem). Let V be a planar domain, i.e., either a simply connected subset ofR2 or V = fP. Let
Corollary 9.2. Let
x=y iI
= -x
+ y(l
- x 2 - 2y 2).
Let r be the polar coordinate. Then along a solution, d (r2) Cit 2'
.
.
= xx +yy
+ y2(l
- x 2 - 2y2) 2 = y2(l _ x _ 2y 2).
= xy - xy
Then f ~ 0 on r = 2- 1/ 2 and f:5 0 on r = 1. Thus the annulus {(x,y) : 1/2:5 r2 :5 I} is positively invariant. It also contains no fixed points. Therefore this annulus contains a periodic orbit. Now we begin the proof of the Poincare-Bendixson Theorem. PROOF OF THE POINCARE-BENDlXSON THEOREM. Let Lo be a connected open transversal to the flow, i.e., Lo is the image of an open interval. Let L be a connected open (sub-)transversal with cl(L) c Lo. Let
L'= (.
there is a tx > 0 such that
180
V. ANALYSIS NEAR FIXED POINTS AND PERIODIC ORBITS
We assume that we ace dealing with an L for which L' is nonempty. For x E L', let T(X) > 0 be the smallest time such that ."I(X) E L. Then g(x) = ."T(X)(X) is the Poincare map and depends differentiably on x E L', 9 : L' -+ L. Below we often say we have a transversal L and do not mention Lo or L'.
Lemma 9.3. If gk(x) is defined for k k in terms of the ordering on Lo.
= 0, ... , n
then gk(x) is a monotone function of
PROOF. Consider the Jordan curve r formed by the trajectory {."I(X) : 0 ~ t ~ T(X)} and the line segment on L' connecting x and g(x). Then r is the boundary of a subset Dc D. The trajectories can only leave or enter D across the line segment on L'. Points are either all leaving across this segment or all entering. If they are all entering, then gk(x) E int(D) for k ~ 1. Thus the entire sequence is on the same side of x as g(x). The same argument applies to all the iterates. This shows the sequence is monotone. 0
Lemma 9.4. Lt.·t L /),. 11
transv('f.'iIlI/~~
Il/wve. Tlwn w(p)
nL
is I1t most /JIm point.
PROOF. Assume Yo E w(p) n L. Then ."I(p) gets near Yo infinitely often. If it gets near enough to L (in a closed neighborhood of Yo) then it has to cross L. (This follows by methods like the proof of the existence of the Poincare map using the Implicit Function Theorem.) Thus the trajectory crosses L infinitely often. The sequence of points where it crosses L is monotone by Lemma 9.3. A monotone sequence can accumulate on at most one point, so L n w(p) = {Yo}. This proves the lemma. 0
Lemma 9.5. Assume O(p) is bounded, w(p) contaillS no fixed points, and q E w(p). Then q is on a periodic orbit, and w(p) = w(q) = O(q). PROOF. Since q E w(p), w(q) C w(p). Take Z E w(q). Then Z can not be a fixed point, so we can take a transversal L at z. There is a sequence of times tn such that ."I n (q) accumulates on z. These times can be chosen so that ."I n (q) E Lj they also must be in w(p) by its invariance. Since w(p) n L is at most one point, .,,1 .. (q) = .,,1 1 (q) for all n. Thus .,,1.-II(q) = q and q is a periodic point. But then w(q) = O(q) = {."I(q) : 0 ~ t ~ T}, where T = T(q) is the (minimal) period of q. It remains to prove that w(p) = O(q). Take a transversal L at q. There is a sequence of times Sn, going to infinity, such that Pn = ."On (p) ELand these points converge to q. Because q is a periodic point. the times can be chosen so that 8 n +1 - Sn = T(Pn) ::::: T, where T = T(q) is the period of q. These points are monotone on L, so they must accumulate on exactly one point from one side. Because the return times are bounded, it follows that the orbit O+(p) can only accumulate on O(q) and w(p) = O(p). This completes the proof of the lemma. 0 These lemmas complete the proof of the Poincare-Bendixson Theorem.
o
The following generalization of the Poincare-Bendixson Theorem allows w(p) to contain a fixed point. See Hale (1969), page 56 for a proof.
Theorem 9.6. Assume that O+(p) is contained in a closed bounded subset K of the planar domain D of the differential equation :it = !(x). Assume further that! has only finitely many fixed points in K. Then one of the following is satisfied: (a) w(p) is a periodic orbit, (b) w(p) is a single fixed point of !, (c) w(p) cOllSists of 11 finite number of fixed points, ql,"" qm, together with 11 finite set of orbits 11, ... ,1n, such that for each j, o(-yj) is a single fixed point and w(-Yj) is a single fixed point: w(p) = {ql,' .. , qm} U11 U· .. U1n, o(-yj) = qi(j,o), and w(-yj) = qi(j ...,)'
5.10 STABLE MANIFOLD THEOREM FOR A FIXED POINT
181
5.10 Stable Manifold Theorem for a Fixed Point of a Map In this section we consider a C" differentiable map on a Banach space, f : U c E ...... E. We allow f to be non-invertible and not onto. Assume p Is a fixed point, and let A = D fp be the derivative at p. The spectrum of A Is the set of complex numbers for which A - >'1 is not an isomorphism, spec(A) = {>. E C : A - >.I is not an isomorphism}.
In finite dimensions the spectrum is the same as the set of eigenvalues. A fixed point P for f is called hyperbolic if spec(Dfp) n {o : 101 = l} = 0. By standard results in Spectral Theory, there are invariant subspaces EU and E" corresponding to the part of t.he spectrum ollt.~ide lind inRide thl' unit circle, and constants 0 < I' < 1 and>' > 1 such that IE = IE" EllIE', spec(DfpIIE U) C {o: 101> >'}, and spec(DfpllE') C {o: 101 < ,,}. In fact, we identify IE with lEu EllE", so we write a point x E E as (XU, x") where x" E EO' for (f = U, s. In finite dimensions, these correspond to the subspaces spanned by the generalized eigenvectors for the eigenvalues of absolute value greater than one and less than one respectively. Because the spectrum of EU Is bounded away from 0 (and by the construction of lEU), DfplEu is an isomorphism on EU. FUrther, there Is C > 0 such that IIDf;IIE'1I < c"n and IIDf;nllEull < C>.-n for n > O. By the usual change in the norm on E we can take C = 1. Such a norm Is called an adapted nonn or adapted metric. The subspaces IE" and lEU are called the stable and unstable subspaces for the fixed point p respectively. Given a hyperbolic fixed point p for a C" map f, and given a neighborhood U' C U of p, the local stable manifold for p in the neighborhood U' is defined to be the following set: W·(p,U',1) = {q E U' : fj(q) E U' for j > 0 and
d(fj(q),p) ...... 0 as j ...... co}. To define the unstable manifold, we need to look at the past history. Because f is not necessarily invertible we need a replacement for the backward iterates. We define a past history of a point q to be a sequence of points {q_j }~o such that qo = q and f(q-j-d = q-j for j ~ O. The local unstable manifold for p in U' Is defined to be the following set: WU(p, U', I) = {q E U' : there exists some choice of the past history of q {q-j}~o C
U' 8uch that d(q_j,p) ...... 0 as j
-+
co}.
Sometimes we write WI~(P, I) and W~(p, I) to indicate local stable and unstable manifolds for a suitably small but not specified neighborhood U'. We also write W: (p, f) for W·(p, B(p, e), I), and similarly W.U(p, I) for WU(p, B(p, e), I). The following theorem states that these local stable and unstable manifolds are C" embedded manifolds which can be represented as the graph of a map from a disk In one of the subspaces to the other subspace. The Hartman-Grobman Theorem already proves that the stable and unstable manifolds are topological disks but it does not prove they are differentiable. In fact the hard part of the proof of the Stable Manifold Theorem is to show that these manifolds are Lipschitz. Once this Is known, it can be shown they are differentiable. Again the Hartman-Grobman Theorem does not prove they are Lipschitz. On the other hand, the Hartman-Grobman Theorem proves that
V. ANALYSIS NEAR FIXED POINTS AND PERIODIC ORBITS
182
all the orbits near the fixed point behave like the linear map, while the Stable Manifold Theorem only gives information about points on the stable and unstable manifolds. To represent a closed disk in one of the subspaces, we use the following notation: for any Banach space £ and 6 > 0, the closed disk in £ about the origin of radius 6 is represented by £(6) = {x E £ : Ixl ~ Ii}.
Theorem 10.1 (Stable Manifold Theorem). Let p be a llyperbolic fixed point for a C k map I : U c JE --+ JE with k ~ 1. We assume that the derivatives are uniformly continuous in terms of the point at which the derivative is taken. Then there is some neighborhood of p, U' c U, such that W' (p, U', f) and WU (p, U', f) are each C k embedded disks which are tangent to JE' and JEu respectively. In fact, considering JE = JEu x JE', there is a small r > 0 such that taking U' == p + (JEU(r) x JE'(r», W'(p, U', f) is the graph of Il C k function q' : JE'(r) --+ JEU(r) with q'(O) = 0 and Dq~
= 0:
W'(p, U', f) = {p + (q'(y), y) : y E JE"(r)}.
Similarly, there is a C k function aU : JE"(r) --+ JE'(r) with qU(O) = 0 and Dq~ that WU(p, U', f) = {p + (x, qU(x» : X E JEU(r)}.
Morco\'er, for r > 0 small enough and U' = p
w' (p, U', f)
= {q E U' :
= {q E U'
= 0 such
+ (JEU(r) x JE'(r»,
Ii (q) E U' for j
~ o}
: li(q) E U' for j ~ 0 and
d(P(q),p) ~ jlid(q,p) for all j ~
OJ.
This means that every point that is not on W'(p, U', f) leaves U' under forward iteration, and that points on W'(p, U', f) converge to p at an exponential rate given by the bound on the stable spectrum. Similarly, lVU(p, U', f)
= {q E U'
: there exists some choice of the past history of q
with {q-i}~O c U'} = {q E U' : there exists some choice of the past history of q with {q-i}~O
c U'
and
d(q_},p) ~ A-id(q,p) for all j ~
FIGURE 10.1. Fixed Point
OJ.
Stable and Unstable Manifolds in a Neighborhood of the
5.10 STABLE MANIFOLD THEOREM FOR A FIXED POINT
183
Once we have local stable and unstable manifolds, then the (global) unstable maniJold is obtained by WU(p,f) = UJjWU(p,U',f). j~O
If f is invertible, then the (global) stable manifold is obtained by
W"(p,f) =
U riW"(p,U',f). i~O
We end this section with a discussion of the types of proofs of the Stable Manifold Theorem. There are two basic types, the Graph Transform Method of Hadamard (1901) and the variation of parameters method of Perron (1929). For historical notes see Hartman (1964), page 271. Graph Transform Method of Hadamard In the Graph Transform Method, the approach is to take a trial function u : EU(r) E"(r) which might possibly give the unstable manifold. A new function r(0') : EU(r) E" (r) is defined so that
-+ -+
graph[f(O'») = f(graph(O'» n (EU(r) x E"(r)). The set of all such possible trial functions is defined by
E(r,L)
= {a: EU(r) -+ E"(r) : 0' is continuous, 0'(0) = O,Lip(O') :5 L}.
It Is shown that f : E(r, L) -+ E(r, L) is a contraction in the CO-sup topology. It thus has a fixed point function, f(O'U) = au, which gives the local unstable manifold as the graph of a Lipschitz function. Because the map f is not a contraction on the function space if the function space is given the C 1 or C k topology, the fixed point needs to be proven to be C 1 (and then Ck) as a second step by different means. Notice that in this method only one iterate of f is used to define f. Fur this approach see Hirsch and Pugh (1970), Hirsch, Pugh, and Shub (l
n fj(EU(r)
N-l
BN =
x E·(r».
j=O
Then B N+ 1 = f(BN) n Bo. It is prove
is the graph of a Lipschitz function that Is in fact C I . It is then proved that it is C k by induction on k. Variation of Parameters Method of Perron To explain the Perron method, it Is easier to consider an ordinary differential equation. AssulIle that :Ii: = J(x) = Ax + g(x) where A Is the linear part at 0, g(O) = 0, and Dga = O. Again take a splitting so x = (x..,x.)lr. Applying the variation of
184
V. ANALYSIS NEAR FIXED POINTS AND PERIODIC ORBITS
parameters to the equation itt) = Ax(t) known nonhomogeneous term, we get
+ g(x(t»,
thinking of the second term as a
We then break this equation into the stable and unstable components. The stable component is specified with an initial condition at t = 0, x.(O) = a., so taking t2 = t and t\ = 0 for the stable component we get x.(t) = eA.1a,
+ 10 1eA.(I-')g.(x(s»ds.
On the other hand, we can not specify the unstable component at t = 0, but it is determined by the fact that the unstable component goes to zero (or is at least bounded) as t goes to infinity. Therefore, taking t2 = t and t\ = 00 for the unstable component and xu(oo) = 0, we get xu(t) = 0
+ J~ eA.(I-·)gu(x(s»ds.
Combining the right hand sides of these two equations, we can get a transformation of trial solutions on the stable manifold. To get a transformation of the whole stable manifold we look at trial solutions x( t, a,) where a, E E' (r) parameterizes the stable component of the initial conditions, and let F be the function space of such trial sets of solutions on the stable manifold. A transform T : F -+ F is defined by [T(x)J(t, a.) = eA.1a,
+ 101eA.(I-·)g,(x(s, a,)) ds + f~ eA.(I-·)gu(x(s, a.»
ds,
which takes one parameterized set of potential solutions on the stable manifold into another set. This transformation can be shown to be a contraction which has a fixed point which gives the stable manifold. Notice that one iterate of T uses the whole trial orbit of x(t, a.) which is pulled back with the linear flow. For this approach see Hale (1969), Chow and Hale (1982), or Kelley (1967). The proof in Irwin's book, Irwin (1980), is a mixture of these methods. For N the natural numbers, let
L:(E(r» = {q E cl(N, E(r» : q(j)
-+
0 as j
-+
oo}.
A function is then defined, F: E'(r) x C(E(r»
-+
C(E(r»,
by the formulas
F(x.. q)(O) = (X.,q(O)
+ A~l[qu(l) -
fu(q(O»]) E E'(r) x EU(r),
F(x.,q)(n) = (f.(q(n - 1»,q(n) + A~l[qu(n + 1) - fu(q(n»)]) for n > O.
It is then shown that FisC" and a contraction for each fixed x., so there exists a -+ C(E(r» with F(x., g(x.» = g(x,). Note that for this fixed function, f(g(x,)(n» = g(x.)(n + 1). Then h(x.) == g(x,)(O) is C" and its image is the stable manifold. For details see Irwin (1980).
C" function 9 : E'(r)
5.10.1 PROOF OF THE STABLE MANIFOLD THEOREM
185
Differentiability of the Stable Manifold The differentiability of the stable manifold is very important for our applications. There are several methods of proving this property. The main difficulty is that the various transformations are not contractions of the set of possible Cl manifolds so we can not directly look at a contraction map on this space. Hirsch and Pugh (1970) used the Fiber Contraction Principle. See Exercise 11.4. We use the method of cones inspired by McGehee (1973). Also compare with the argument in Moser (1973) to show that there is a subshift (horseshoe) as a subsystem of a dynamical system. Both of these references are based on work of Conley. Following these references, we use the cones to not only prove that the stable manifold is Lipschitz but also to show that it is C 1 • Cones are closely related to the idea of hyperbolicity. If L : R" ..... R" is a hyperbolic linear map, and CU is a cone of vectors roughly in the expanding direction, then L(CO.) c CU, i.e., the cone of vectors is invariant by L. (See following subsection for the definition of a cone of vectors. Some authors use the name of a sector for what we call a cone.) If the exact expanding direction is not known, then it is often easier to find an invariant cone, Cu. Using such a cone, it is possible to show that L is hyperbolic. The use of cones goes back at least to Alekseev (1968a). Sinai (1968,1970) also used the concept very early. As mentioned above they are used in McGehee (1973) and Moser (1973). Recent uses of cones include Wojtkowski (1985), Burns and Gerber (1989), and Katok and Burns (1994).
5.10.1 Proof of the Stable Manifold Theorem This subsection contains the proof of Theorem 10.1, the Stable Manifold Theorem. The proof presented here follows the method developed by Conley as given in McGehee (1973), but with some modifications which were influenced by Hirsch and Pugh (1970). Thus it is in the spirit of the Hadamard proof (more than the Perron proof) but has some differences. There is also a lot of similarity to the argument of Conley given in Moser (1973) to show there is a subshift (horseshoe) as a subsystem of a dynamical system. The arguments in both McGehee (1973) and Moser (1973) are presented in the case of two dimensions where both the stable and unstable directions are one dimensional. Some modifications need to be made to take care of the higher dimensional cases. Because the map is not assumed to be invertible, we need to indicate how we show both the stable and unstable manifolds exist.
W:
OUTLINE OF THE PROOF. To start the proof, we want to show that is the graph of a function 'P' : E'(r) ..... EU(r) which is Lipschitz with Lipschitz constant less than a (a-Lipschitz) for some fixed a. To verify that is a graph, we use the characterization of W: as points whose forward orbit remains in 8(r) = E'(r) x EU(r):
W:
W: = {x : jj(x) E 8(r) for all j ~ O}
n ri(8(r». ;=0 00
=
See Figure 10.2. (Later in Proposition 10.7, we verify that these points tend to the fixed point.) In order for to be a graph, we need for n [{x} x EU(r») to be a single point for each x E E·(r). The disk {x} x EU(r) is a trivial example of a Cl graph
W:
W:
of a function from EU(r) into E'(r) with slope less than a-I. We define an unstable disk to be the graph of a Cl function 'P : E"(r) -+ E'(r) with I/D'Pyl/ S a-I for all y E £"(r). Similarly, an a-Lipschitz ,table disk is defined to be the graph of a function '" : £'(r) ..... E"(r) with Lip("') S a. Using the bounds on the terms In the derivative
186
V. ANALYSIS NEAR FIXED POINTS AND PERIODIC ORBITS
B(r)
u
{X} X E (r) FIGURE 10.2. Intersection of the Boxes rn(B(r» of f, Lemma 10.5 shows that if D(j is an unstable disk, then Dr = f(Dg) n B(r) is an unstable disk (with slope less than a-I) and diam(Do n rl(D't)) ::; (A - m-I)-I diam(Do) = (A - m- 1 )-12r. By induction, D'J = f(D'J-I) n B(r) is an unstable disk, and n
diam(n rj(D'J}) ::; (A - m- 1 )-n2r. j=O
From this it follows that the infinite intersection is a single point which we call (x, ip"(x»:
w: = [{x} x EU(r)] n n rj(B(r» 00
[{x} x E"(r)] n
j=O
= {(x, ip·(x»}.
W:
This shows that is a graph. To show that it is a-Lipschitz, we consider the stable and unstable cones which are defined by C'(a) = {(v" v u ) E E' x EO. : IVul ::; alv,l} C"(a) = {(V., v u ) E E' x EO. : Iv,,1 ~ alv.I}.
and
See Figure 10.3. The condition that
W: n [{p} + C"(a») = {p} for all points pEW: is equivalent to the graph being a-Lipschitz. Proposition 10.6 shows that this intersection is indeed a single point by the same argument which shows that is a graph.
W:
cs FIGURE 10.3. Stable and Unstable Cones
5.10.1 PROOF OF THE STABLE MANIFOLD THEOREM
187
W;
After we have shown that is the graph of a Lipschitz function, Proposition 10.8 shows that it is C 1 and Theorem 10.10 proves that it is C" by induction on k. To carry out the above argument, we need some estimates about the effect of the map / on the stable and unstable components. Lemma 10.3 shows that the distance between the unstable components of two points p and q is increased by the action of the nonlinear map / provided the displacement from p to q lies in the unstable cone (is "mainly an unstable displacement"). Before making these nonlinear estimates, we give similar linear estimates in Lemma 10.2: the derivate D/p preserves the unstable cones and stretches the length of vectors in the unstable cone. We prove this linear case first because it is easier and is used in PropOSition 10.8 to prove that W; is Cl. REMARK 10.1. There are some differences in the proof given here from those in the references cited. In particular, if dim(E") i' I, then n
8[n ri(8(r»] \ [8E'(r) x EU(r)] )=0
need not be made up of two graphs over E'(r), as is true in Moser (1973) where dim(EU) = 1. The fact that I need not be invertible necessitates some care in the proof for the unstable manifold: 8[nj=0Ii(8(r))] n [E'(r) x {yo}] need not be in the image by I of 8[nJn;~J1(8(r»]. 1(8[nj;JJ1(8(r»)]). In general, greater care needs to be taken in sever of the arguments when I is not assumed to be invertible. To start carrying out the above outline, we first introduce some notation. It is convenient in expressing some of the ideas to use the minimum norm of a linear map which is first introduced in the Section 4.1. We also use the estimates from the Inverse Function Theorem given in Section 5.2.2. (Those estimates should be reviewed at this time.) Remember that the minimum norm of a linear map A : lE - t lE is defined by . IAvl meA) = v~o mf -Iv-I . The minimum norm is a measure of the minimum expansion of A just as IIAII is a measure of the maximum expansion. If A is invertible then meA) = IIA -III-I. For a hyperbolic fixed point 0 for which IID/anllEull < CA- n for n > 0, it follows that m(D/O'llEU) > CAn. By taking local coordinates we can assume that the fixed point is 0 in the Banach space lE and E = E' x lEu, where these subspaces come from the hyperbolic splitting at this fixed point. (This cross product is isomorphic to the original Banach space.) Let 'II'u : lE - t lEU be the projection along E' onto EU, and 7r, : E - t E' be the projection along EU onto E'. Then 1(0) = 0 and we use the splitting E' x EU to write
D' _(A..0 Auu0) ' 10 -
where Au = 7r,Dlol(E' x {OJ) and Au .. = 'II'uD/ol({O} x E"). Because the fixed point is hyperbolic, there exist 0 < J.I < 1 < A and nonns on each subspace E' and lEu so that m(A .... ) > A > 1 and IIA .. II < J.I < 1. We fix these norms, and on lE we put the norm which is the maximum of the components in the E' and E" subspaces. In any Banach space E we denote the closed ball of radius r about 0 by lEer) = {x E E : Ixl ::s r}. We define a neighborhood of the fixed point in E by taking the cross product of the closed
188
V. ANALYSIS NEAR FIXED POINTS AND PERIODIC ORBITS
balls in JEu and E' of radius r (which is also the closed ball in E because we are using the maximum of the norm in the stable and unstable subspaces): B(r) = E'(r) x JEU(r) C E.
Then I : B(r) -+ E, with 1(0) = o. We fix 0 > 0 which serves as a bound on the slopes of graphs over IE' into lEu. For all but spE'Cial cases, we can take 0 = 1. We take t > 0 small enough so that
+m +t < 1 and A - m- I - 2t > 1.
II
(These give bounds on the effects of the off diagonal terms to the expansion and contraction.) Given these 0 and t, we can find r > 0 small enough 80 that for q E B(r),
with
< II, > A, < t, < t, IIDlq - Dlall < t. IIA.. (q)1I m(Auu(q» IIA.u(q)1I IIAu.(q)1I
and
We say that I satisfies hyperbolic estimates in a neighborhood in which estimates of the above type are true. This name is justified because these estimates imply that the map contracts displacements which are primarily in the stable direction and expands displacements which are primarily in the unstable direction as is shown in Lemmas 10.3 and 10.4 below. Having fixed the notation and choice of of t, we start by proving the linear estimates as discussed above. Lemma 10.2. For p E B(r), DlpCU(o) C CU(o). PROOF. Let v
= (v •• v u) E CU(o) \ {O}. and p
E B(r). Then
11r.Dlpvl _ IA.u(p)vu + A •• (p)v.1 l1ruDlpvl - IAuu(p)vu + Au.(p)v.1 < IIA.u(p)II·lvul + IIA•• (p)1I ·lv.1 - m(Auu(p»lvul-IIAu.(p)II·lv.1 IIA.. (p)II(lv.l/lvu) + IIA.u(p)1I m(Auu(p» -IIAu.(p)II(lv.l/lvui) 110
- 1
+t
:'S A_m-l:'So But we chose
f
and
0
so that II
+ m < 1 and
-I(
II
+ fO)
A-m- I
A - m- I > 1.
-I
<0
.
o
5.lD.I PROOF OF THE STABLE MANIFOLD THEOREM
189
Lemma 10.3. Let p,q E B(r) and q E {p} + C"(a). Then the following are true: (a) 111".0 !(q) - 11". 0 !(p)1 :5 a- I (I£ + m)11I",,(q - p)l, (b) 111"" 0 !(q) - 11"" 0 !(p)1 ~ (A - m- I )11I",,(q - p)l, and (c) !(q) E {f(p)} + C"(a). PROOF. (a) The idea is to calculate how 1I".o! varies along the line segment from p to q, t/J(t) = p + t(q - p) for 0 :5 t :5 1. Using the Mean Value Theorem,
111".0 !(q) - 11". 0 !(p)1 = 11I".o! 0 t/J(l) - 11". o! 0 t/J(O)I d
:5 sup Id 1I".o! 0 t/J(t)1 O:51SI t :5 sup IIA.,,(t/J(t»1I·11I",,(q - p)1 °SIS!
+
sup IIA •• (t/J(t»1I·11I".(q - p)l oS'S!
:5 { sup IIA.,,(t/J(t»1I °SIS!
+ a-I
sup IIA •• (t/J(t»1I}11I",,(q - p)1
0S':51
:5 (£ + a- II£)11I",,(q - p)l = a- I (I£
+ ta)11I",,(q -
p)l.
This proves part (a). (b) To get an estimate of minimum expansion we would like to take the inverse of the function. To get a function from a space to Itself, we split the displacement from p to q into the component in the lE" direction and then the lE' direction, and apply the Inverse Function Theorem (Theorem 2.4) to the part that concerns the displacement in the E" direction. Let p = (P.,p,,) and q = (q.,q,,) for p and q as in the statement, i.e., p" = 1I""p and q" = 1I""q for C1 = u,s. Let z = (P.,q,,), so q - p = (q - z) + (z - p), with z - p E {OJ x E" and q - z E E' x {OJ. For y E E"(r) and x E E'(r) define h(y) = 11"" 0 !(P.,y) and g(x) = 11"" 0 !(x,O). Then g(P.) = 11"" 0 !(p" 0) = h(O). Also Dgx = A".(x, 0), so IIDgxll :5 t. By the Mean Value Theorem, tr ~ tlp.1 ~ Ig(p.)1 = Ih(O)I. Then, Dhy = A""(p.,p,, + y), which is an isomorphism with minimum norm greater than A, and IIDhy - A",,(O,O)II < t. Applying Theorem 2.4, hIE"(r) is onto lE"«A - t)r - Ih(O)l) ::,) E"«A - 2t)r) ::,) E"(r), and h has a unique inverse on this set. Also by the Inverse Function Theorem, the derivative of h- I is given by D(h-I)h(y) = (Dhy)-I which has norm less than A-I. Therefore by the Mean Value Theorem,
Ih- 1 (W2) - h-l(wl)1 :5 A- 1 1w2 - wd· Letting W2
= h(q,,) = 11"" 0 !(z)
and WI
= h(p,,) = 11"" 0 !(p), so
h- l (wll = p", we get
AI1I",,(q - p)l :5 111"" 0 !(z) - 11"" 0 !(p)l. Thrning to the displacement from z to q,
111"" 0 !(q) - 11"" 0 !(z)1 :5 sup IIA".1I·11I".(q - p)1 :5 tI1l".(q - p)l
:5 m- 1 111",,(q - p)l·
h- 1 (W2)
= q"
and
190
V. ANALYSIS NEAR FIXED POINTS AND PERIODIC ORBITS
Combining, l7Tu 0 f(q) - 7Tu 0 f(p)1 ~ l7Tu 0 f(z) - 7Tu 0 f(p)1 -17Tu 0 f(q) - 7Tu
0
f(z)1
~ (oX - m-I )17Tu (q - p)l.
This completes the proof of part (b). (c) For p and q in the statement of part (c), to show that f(q) E {/(p)} + C"(o) we need to look at the ratio of stable and unstable components. Using part (a) to estimate the numerator and part (b) to estimate the denominator, we get
11T.[/(q) - l(p)JI < 0- 1 (I-' + m) l(p)JI (oX - m-I)
l1Tu[/(q) -
< 0- 1 . This last inequality uses the fact that I-' + m < 1 and oX - m-I > 1. From the above inequality we get that f(q) E {/(p)} + CU(o). 0 From this last lemma, we can determine the effect of I-Ion displacements within the stable cone. Lemma 10.4. Assume that f(q-d = q, I(p-d = p, q E {p} points P.q,P_I,q_1 E B(r). Then (a) q_1 E {p-d + C'(o), (b) 11T.q-1 -1T.p-d ~ (" + m)-II1T.q - 7T.pl·
+ C'(o),
and all the
PROOF. (a) We use the fact that C'(o) and C"(o) are complementary cones, i.e.,
C'(o)
= E \ int(CU(o».
By Lemma 1O.3(c), q ft {p} + CU(o) implies q-l ft {p-d + CU(a). Thus q-l E + C·(o). (b) Following the method of the proof of Lemma 10.3(a), let 1{.I(t) = P-l + t(q-l p-d. Then
{p-d
11T.q -1T.pl
= 11T. 0/01{.l(1) -1T. 0/01{.l(0)1 d
:5 sup Id 1T. 0 I O~t~l
0
t
1{.I(t) 1
:5 sup IIA .. (1{.I(t))ll·I7T.q-1 - 1T.P_t! O$l~1
+ sup IIA.u(1{.I(t»1I
·17Tuq-l - 7Tup-d
0$19
:5 { sup IIA .. (1{.I(t»1I O$l~l
+ 0 sup
IIA.u(1{.I(t»II} ·17T.q-l - 7T.P_ll
0~t9
:5 (I-' + Of) ·11T.q-l - 7T.p-d· Dividing both sides of the inequality by the constant, we get the result of part (b). 0 The following lemma makes precise the induction process with unstable disks which is discussed in a descriptive fashion earlier in this section.
5.10.1 PROOF OF THE STABLE MANIFOLD THEOREM
191
Lemma 10.5. Let D~ be a C 1 unstable disk over E"(r). (a) Then D~ = I(D8) n B(r) is an unstable disk over E"(r) and diam[7I".. U- I (Df) n Dg)]
(b) Inductively let
D~
:s (~- fa- I )-12r.
= I(D~_I)nB(r). Then D~ is an unstable disk for n
n rj(Dj)] :S
2: 1 and
n
diam[7I""
(~ - fa- I )-n2r.
j=O
See Figure 10.4.
, ~
•
\ FIGURE 10.4.
Disks
D~
and
l-n(D~)
PROOF. (a) Since DC; is an unstable disk, it is the graph of a CI function 'Po: E"(r) -+ E'(r) with IID('Po»'1I :S 0- 1. Let 0'0: E"(r) -+ B(r) be the associated function defined by uo(y) = ('Po(y), y). Define h = 71".. 0 I 0 0'0. We need to determine the size of the set covered by h and on which it has a unique inverse. We obtain this by applying Theorem 2.4 for which we need to estimate IIDh)' - A.... (O)II:
IIDh)' - A".. (0) II = IID(7I"..
0
I
0
= IIA".. (uo(y»
0'0»' - A" .. (O)II
+ A ... (uo(y»D('Po»'
- A.... (O)II
:S IIA.... (uo(y» - A..,,(O) II + II A... (uo(y))ll IID('Po»' II :S (f + fa-I). By Theorem 2.4, h is a C I local diffeomorphism and has an inverse defined on
To get the estimate on Ih(O)1 and so the image under h, notice that h(O) = 71".. 0 1('Po(O),O) with l'Po(O)1 :S r, and 71".. 0 1(0,0) = O. By the Mean Value Theorem applied to 71" .. 0 I and the displacement from (0,0) to ('Po(O), 0), we get that Ih(O)1 :S 8up(IIA... IDI'P0(0) - 01 :S fro Using this estimate, we get that h has an inverse defined on h(E"(r» :::> 1E"«~ - fa-l)r -lh(O)l) :::> E"«~ - fa-I - f)r) :::> 1E"(r). Since h = 71"" 0 I 00'0 is a CI diffeomorphism which covers 1E"(r), h- I (IE"(r» C E"(r). Therefore we can define what is called the graph transform of (10, (11 =
1 0 (10 0 h-lllE"(r).
v. ANALYSIS NEAR FIXED POINTS AND PERIODIC ORBITS
192
Notice that 1I"u 00"1 = 1I"u 0/00"00 h- I = h 0 h- I = idIJEU(r). Thus EU(r). We also need an estimate on IID(h-l)yll. For Vu E EU,
ID(1I"u
0"1
is a graph over
0/ 0 O"o)yVul = IAuu(O"o(Y»vu + Au,(O"o(y»D(CPo)yvul ~ IAuu(O"o(y»vul -IAu,(O"o(y»D(cpo)yvul ~ m(Auu(O"o(y)))lvul -IIAu.(O"o(y»IIIID(cpo)yIlIVul
~ (~ - m-I)lvul.
Therefore m(Dhy)
~ ~
- m-I, IID(h-l)yll
~ (~-
m-I)-I, and
Ih- I (Y2) - h-l(y')l ~ (~- m- I )-IIY2 - yd·
IID(cp')yll ~ II A", (0"0
0
h-l(y»D(h-l)yli
~
+ IIA,,(O"o 0 h-l(y»D(cpO)h-1(y)D(h-l)yll f(~ - m-I)-I + I-'a-I(~ - m-I)-I
~
a -I (
II ~ -
+ fa
)
fa-I
Note that for this argument we need that II + m < 1 and ~ = O"I(JEU(r», we have verified the conclusions of part (a). Part (b) follows from part (a) by induction on n.
f -
Dr
m- I > 1. Letting 0
REMARK 10.2. Lemma 10.5 is the heart of the Hadamard proof of the existence of a
Lipschitz unstable manifold. We could define the function space
'H
= 'H(a, r) = {O" : JEU(r) -- B(r)
: 1I"u 00"
= id,
Lip(1I",
0 0")
~ a-I}.
By an argument like that in Lemma 10.5, we can show that for any 0" in 'H, l(image(O"» is again the graph of a Lipschitz function in the function space which we denote by 1*(0"). This map 1* on the function space 'H is called the graph transform. Then it is possible to show that 1* is a contraction on 'H with the CO-sup topology, and so has a unique fixed section O"u, for which 11", oO"u is a-I-Lipschitz. The graph of 11". OO"U or the image of O"U is the unstable manifold. For this type of approach see Hirsch and Pugh (1970), Hirsch, Pugh, and Shub (1977), and Shub (1987). Instead of following that method, we use the above lemma on C l functions (which are unstable disks) to prove the existence of a Lipschitz stable manifold. The following lemma now gives the induction step and the proof that W;' is the graph of an a-Lipschitz function. As discussed above, we use Lemma 10.5 to prove that for each x E JE'(r), n ({x} x EU(r» is exactly one point, and so is a graph. Finally, Lemma 10.3 shows that W;' is a-Lipschitz.
W:
W;
Proposition 10.6. The l()('al stable manifold, W: = n~o j-j(B(r», is a graph of an a-Lipschitz function cP' : E'(r) -- EU(r) with cp'(O) = O. PROOF. Let Sn = n;=o rj(B(r» so W; = n::,,=o Sn: the forward orbit of a point in the infinite intersection is contained in B(r) so it lies on W;.
5.10.1 PROOF OF THE STABLE MANIFOLD THEOREM
193
The first step is to use Lemma 10.5 to see that W; n [{x} x E" (r)] is at moet one point for each x E IE." (r). Fix x E IE' (r) and let D~(x) = {x} x E" (r). By Lemma 10.5 and induction on n, D~(x) I(D~_1 (x» n B(r)
=
is an unstable disk, and [{x} x F'(r)] n S"
n
= " r i (Dj(x» C Dg(x) i=O
is a nested sequence of closed and complete sets whose diameters are bounded above by (~ - (Q-I )-n2r. Since the diameters of [{x} x IE" (r)] n Sn go to zero u n goee to infinity, the intersection [{x} x JE"(r)] n W; is a single point for each x and W; is a graph over IE.". See Figure 10.5. We show below that the WI-limit set of a point in the intersection defining W; is the fixed point.
B(r)
FIGURE
10.5. Sets Sn Converging to
W;
To show that the graph is Lipschitz, let PEW;. Assume that
q E W: n [{p} + C"(a)] Because both p, q E W;, /i(p),/j(q) E B(r) for all j ~ O. By induction, applying Lemma 10.3 we get that for all j ~ 0,
Ii (q) E {Ii (p)} + C"(a) and IlI"u(fi(q) - fi(p»1 ~ (~- £a-lYllI"u(q - p)l. If q:F p, then since (~- (Q-l)i goee to infinity either fi(q) or Ji(p) must leave B(r). This contradicts the fact that both p and q are in W;. Thus q p, and there is at moet one point in W; n ({p} + C"( a». Applying this to x, y E JE" (r) with x :F y, since o-u(x) (x, 'P'(x» :F (y, 'P'(Y» o-U(y), it follows that UU(y) ;. {o-U(x)} + C"(a), o-U(y) - {o-U(x)};' C"(a), and utl(y) - {UU(x)} E C'(a). Thu8 Lip(lI"u ou') Sa. This 0 completes the proof of Proposition 10.6.
=
=
Before we prove that convergence to O.
=
W;
is Cl, we check its characterization in terms of the rate of
194
V. ANALYSIS NEAR FIXED POINTS AND PERIODIC ORBITS
Proposition 10.7. If pEW:, then IP(p}1 $ 0(1-' converges to the fixed point 0 at an exponential rate.
+ fCt}ilpl for
j ~ 0, so li(p}
W:
PROOF. Because is o-Lipschitz, l7ru (Ji(p} - 011 $ ol7r.[fJ(p} - 011 so li(p} E to} + C·(o}. Applying Lemma 10.3 to P(p} and li(O} = 0, we get that (Jl
+ fCt}IP-l(p}1
+ eo}I7r. oli-1(p}1 (11 + eo)I7r.[/ j - 1(p} - /]-1 (O}II
~ (I-'
~
~ 17r.IP (p) - P (0)11 ~
17r,
0
P (p)l·
By induction. we get that (I-'
+ eoJilpl
~ 17r,
Then the unstable component is bounded by
0
0
P(p)l·
times this amount:
Because we are using the maximum norm of the components,
o
This completes the proof of the proposition.
Proposition 10.S. The local stable manifold, W:, is C l and is tangent to]E' at the fixed point O.
W:
PROOF. We have shown that is the graph of a Lipschitz function 'P' : ]E'(r) --+ ]EU(r} with Lip('P') $ 0. We let (T' : E'(r) --+ B(r} be defined by (T'(x) = (x,'P'(x)). We proceed to show that (T' and 'P' are CI. For any pEW:. since Lip('P') $ 0, (T'(JE'(r)) C {p} + C'(o). To get the derivative of 'P' . we use the comparison of the nonlinear action of 1 on C' (0) with the linear action of D/q on C·(o).
FIGURE 10.6.
Nested Cones
Define c,·n(p) = (D/;)-IC'(O). By Lemma 10.2. Df/J(p)CU(o} C CU(o) so [DI/J(p)]-IC'(O) C C'(o}.
5.10.1 PROOF OF THE STABLE MANIFOLD THEOREM
By induction, c··n(p) C C··n-1(p) C ...
c
C,·l(p)
c
195
C'(O).
See Figure 10.6. It follows from the argument of Proposition 10.6 applied to this sequence of linear maps, that C··n(p) converges to a plane Pp (which can depend on p) which is the graph of a linear map Lp : E' -+ EU. (This means that the maximal angle in the cone C··n(p) converges to zero as n goes to infinity.) The goal is to prove that "P' is differentiable at Xo = 1T.P with D"P~o = Lp and Pp = TpW:. (Because Po = IE' x {O}, this shows that W: is tangent to E' at 0.) We show that given n, there is an T/ = T/(n) > 0 such that for Iy - Xol < T/, q'(Y) - q'(xo) E C··n(p). Define truncated cones by c,·n(p,T/) = C··n(p) n 1T; 1(IE'(T/)),
and
C'(o,T/) = C·(O)n1T;l(E·(T/». We use the differentiability of J" to get the image by f- n of some truncated cone {In(p)} + C'(o, 71') inside the truncated cone {p} + C··n-1(P. 7/). Lemma 10.9. Let pEW: and n > O. Then there exists an TJ = TJ(n) > 0 such that the following hold.
(a) IfI1T.(q - p)1 ~ TJ and J"(q) E {/n(p)} + C'(o), then q E {p} + c··n-1(p, 17). (b) If q E W: and 11T.(q - p)1 ~ 17, then q E {p} + C··n-1(p, 17). PROOF. (a) We prove the contrapositive: we prove that there is an 17 > 0 such that if 11I'.(q - p)1 ~ 17 but q f/. {p} + C··n-1(p, TJ) then J"(q) f/. {In(p)} + C'(o). We want to use the differentiability of J" to conclude the inclusion of the truncated cones. Because the kernel of DI; doc<> not intersect the complement of c,·n-l(p,TJ), C··n-1(p,TJ)C, there is a constant C > 0 such that for v E C·· n- 1(p,17)C, ID.t;vl ~ C-1Ivl. On the other hand, by the definition of the C""(p), Df;c··n-l(p)c equals Dfr-'(p)C'(o)C, which in turn is contained in the interior of C·(o)C. In fact by Lemma 10.2, D"n_.(p)(C·(o)C) n {v: Ivl = I}
is uniformly contained in the interior of C·(o)c. Therefore there is {3 > 0 such that if v E c,·n-l(p)C, v' = DI;v, and Iwl < {3lv'l, then Vi + wE C·(o)c. With these constants we can take f' > 0 such that Cf' < {3. By the differentiability of In, there is TJ > 0 such that if Ivl ~ 17, then J"(p + v) = J"(p) + Df;v + E with lEI ~ f'lvl ~ f'Clv'l < {3lv'l where v' = Df;v. Now let 11I'.(q - p)1 ~ TJ but q f/. {p} + C·· n- 1 (P,71). Using the above inequalities with v = q - p, we get that J"(q) E {r(p)} + C'(o)C, or r(q) f/. {In(p)} + C·(o). This proves part (a). For the proof of part (b), we know that if q = q'(Y) then J"(q) E {In(p)} + C·(o). By part (b), given n there is an TJ such that if 11I'.(q - p)1 ~ TJ and r(q) E {/n(p)} + C'(o), then q E {p} + c··n-l(p). Combining these two facts proves part (b). 0 Returning to the proof of Proposition 10.8, we have shown by Lemma 10.9 that for > 0 small enough and Ix - "01 < TJ,
TJ = TJ(n)
qU(x) _ qU(Xo) E C··n-1(p). Given f' > 0, for n large enough, the maximum angle between vectors in c··n-l(p} is less than t', so for Ix - "01 < 1)(n) it follows that 1"P'(x) - "P'(Xo) - L,,(x - xo)1 < f'lx - XoI·
196
V.
ANALYSIS NEAR FIXED POINTS AND PERIODIC ORBITS
Thus 'P' is differentiable at Xo and D'P'(Xo) = Lp. Because the definition of C··n+l(p) only involves the derivative of a continuously differentiable function, In+ I, this cone is a continuous function of p. More precisely, for q near p, C··n+l(q) C C··n(p). Therefore the Image of both Lq and Lp are contained inside C··n(p). By taking n large and so q near enough to p, we can insure that the image of Lq is as near as we desire to the image of p. Therefore the derivative of 'P' is continuous and 'P' is C I . Because E' is invariant by Dlo, it follows that E' c c,·n(o) for all n and so Po = E'. Therefore we have proved that is tangent to E' at the fixed point O. 0
W:
Theorem 10.10. 1£ I is C k [or k ~ 1, then W: is C li . PROOF. If k = 1, then this is a restatement of Proposition 10.8. Assume that I is C li for k ~ 2, with 1(0) = O. Define a new function F(p,v) = (f(p),Dlpv). This function F is defined for (P. v) with P E B(r) and v is a (tangent) vector at p, v E TpE ~ E. Then F is C k - I , F(O,O) = (0,0), and
so DF(o.o)(p, v)
= (~~~e) = (Dto D~o)
(e) .
Thus (0,0) is a hyperbolic fixed point for F. By induction, F has a C k - I stable manifold, W:«O, 0), F). This manifold is characterized as points (p, v) which have Pj E B(r), and IVjl ~ r for all j ~ 0, where (Pj, Vj) :; Fj(p, v). Thus p E W:(O, f). In fact, IVjl = ID/~vl goes to zero exponentially as j goes to infinity. But vectors in TpW:(O, f) have this property. By the uniqueness of W:«O, 0), F) and counting the dimensions of TpW:(O, f) and W:«O, 0), F) n ({p} x TpE) ,
Thus, TW:(O.f) is C k -
I
Theorem 10.11. I[ I is Ck .
and W:(O,f) is
Ck
ek •
o
[or k ? 1, then the local unstable manifold W;'(O, f) is
PROOF. If I is invertible, then the existence of the unstable manifold for I follows from that of the stable manifold of I-I. We will now indicate the changes needed to take care of the case when I is not invertible. We can use Lemmas 10.3 and 10.4 as given. All we need is a replacement for Lemma 10.5. Lemma 10.12. Let D' = D~ be an o-Lipschitz stable disk over E'(r). (a) Then = I-I(D') n B(r) is an o-Lipschitz stable disk over E'(r) and
Dr
diam[/(D:>] ~ (I-' + m)2r.
(b) Inductively de1ine D~ = I-I(D~_I) n B(r). Then D~ is an o-Lipschitz stable disk [or n ~ 1 and diam[f"(DJ)] ~ (I-' + m)n2r.
Dr
PROOF. The first step is to show that = I-I(D') contains exactly one point in each fiber {x} x EU(r). But p E I-I(D') n [{x} x EU(r)] if and only if I(p) E D' n I[{x} x EU(r)] and p E {x} x EU(r). By Lemma 10.5, the second set is the graph of
5.10.2 CENTER MANIFOLD
197
a Cl function '1/1 : 1E"(r) -+ E'(r) whose derivative has norm slightly less than 0-1. There is a unique point of intersection of these two sets. This fact can be _n by considering the composition 71'" 0 (7(71','1/1) which is a contraction (has derivative with norm less than one). Let z be this point of intersection: {z} = D' n f[{x} x E"(r»). By the proof of Lemma 10.5, there is a unique P E [{x} x E"(r») with f(p) = z. Thus P E r1(D')n[{x} x 1E"(r)] is unique. By Lemma 1O.4(a), the graph is a-Lipschitz and diam[f(DDI ~ (I' + w)2r. This indicates the changes in the proof of Lemma 10.5. 0 Finally we want to check the characterization of the points in W,!'. Proposition 10.13. If pEW,!', then for any past history p_ j E B(r) with Po = P and f(p-j-d = P-j, it follows that Ip-jl ~ (>. - w)-jlpl· Also, the past history is unique in W,!'.
B(r) a past history. Thus, Po E {OJ Applying induction and using Lemma 10.3 with q = 0,
PROOF. Take Po = pEW,!' and P-i E
+ C"(a).
so 17I'"p-jl ~ (>. - w)-iI7l'"pol· Thus I71'" P-jI goes to zero exponentially fast as stated. The stable component, 17I',p-jl ~ aI7l'"p-jl also goes to zero exponentially fast. To show the uniqueness, assume that P_j and q_j are both past histories for P which remain in B(r) for all j ~ O. Then Po = qo, so qo E {Po} + C'(a). By induction and using Lemma 10.4, q_j E {p_j} + C'(a) for j ~ O. Applying the estimate of Lemma 10.4, the only way that both P-j-k and q-i-k can remain in B(r) for all k ~ 0 is for P_j = q-j' This proves uniqueness in W,!'. 0
5.10.2 Center Manifold In this section, we give the modifications of the Stable Manifold Theorem for the case that allows eigenvalues (spectrum) on the unit circle. We only discuss the case of finite dimensions, although the results in Banach spaces are true with some modifications of the assumptions. First of all, the stable and unstable manifolds exist in this situation as stated in Theorem 10.14. There also exists a manifold which is tangent to the center eigenspace as stated in Theorem 10.15. Theorem 10.15 also gives the existence of so called center-stable and center-unstable manifolds. Theorem 10.14 (Stable Manifold Theorem). Let f : U c R" -+ R" be a C k map for 1 ~ k ~ 00 with f(p) = p. Then R" splits into the eigenspaces of Dfp , R" = lE" EEl lEe EEl E', which correspond to the eigenvalues of Dfp greater than one, equal to one, and less than one. There exists a neighborhood of p, V C U, such that W'(p, V, f) and W"(p, V,!) are C le manifolds tangent to E' and E", respectively, and are characterized by the exponential rate of convergence of orbits to P as foUows. Assume that 0 < I' < 1 < >. and norms on E" and IE' are chosen such that liD fplE'1i < I' and m(DfpIE") > >.. Then W'(p, V, f) = {q E V : d(fi(q), p) ~ I'id(q,p) for all j ~ O}
and W"(p, V,!)
= {q E V: d(q-i'P) ~ >.-id(q,p) for all j
~0
where {q-i }~o is some choice of a past history of q}
198
V. ANALYSIS NEAR FIXED POINTS AND PERIODIC ORBITS
REMARK 10.3. We do not give a proof of this theorem, although methods similar to those for the proof of the earlier Stable Manifold Theorem work with some modification. Most proofs of the Stable Manifold Theorem probably apply, but this theorem is certainly proved in Kelley (1967) and Hirsch, Pugh, and Shub (1977).
There is also a center-stable manilold, WC'(p, V, f), which is tangent to lEC ESlE', but it is not necessarily unique and is difficult to characterize locally. Also, if I is C k for 1 ~ k < 00, then there is a neighborhood on which the center-stable manifold is C k . See Section 11.3. However, the neighborhood can depend on k, and so if f is Coo there are examples where there is no neighborhood on which the manifold is Coo. See van Strien (1979), Carr (1981), and Exercise 5.44. Finally, the manifold WC'(p, V, f) is not strictly invariant, although all points which stay in V for all future iterates are contained in we,(p, V, f). In the same way, there is a center-unstable manilold, weu(p, V, f), which is characterized by choices of backward iterates. It is easier to characterize these manifolds by forming an extension, I, of I to all of Rn which is close to /(p) + D/p(q - p) on all of an. Once the extension is fixed, WCU(p,f) and WC'(p,f) are unique. By a translation we can assume that p = O. Let A = Dlo. Given E > 0 there is an r > 0 and an extension I: IR n .... IR n that is C k and such that IIB(O, r) = /IB(O, r), III - Allc. < E, and I = A off B(O, 2r). With this notation, the manifolds are characterized in the following theorem.
Theorem 10.15 (Center Manifold Theorem). Let I: U C IRn .... lR n be a C k map for 1 ~ k ~ 00 with /(0) = 0 and A = D 10. Let k' be chosen to be (i) k if k < 00 and (ii) some integer with 1 ~ k' < 00 if k = 00. Assume that 0 < P. < 1 < ,\ and norms on lEu and lE' are chosen such that IID/pllE'1i < p. and m(D/pllEU ) > A. Let E > 0 be small enough so that IID/pllE'li < p. - E and m(D/pllEU ) > A + E. Let r > 0 and I : Rn .... Rn be a C k map with IIB(O, r) = IIB(O, r), III - Allc' < E, and I = A off B(O,2r). 1£ r > 0 is small enough (it can depend on k'), then there exists an invariant C k ' center-stable manifold, we,(o, I), which is a graph over lEc ESlE', which is tangent to lEe ESlE' at 0, and which is characterized as follows: WC'(o,f) = {q : d(Ji(q), 0)'\ -i .... 0 as j .... oo}. This means that Ii (q) grows more slowly than :>.,i. Similarly, there exists an invariant C k ' center-unstable manifold, WCU(O, I), which is a graph over lEuESlEc, which is tangent to lEu ES lEc at 0, and which is characterized as follows: WCU(o,f) = {q : d(q_i,O)p.i .... 0 as j .... 00 where {q-i}~o is some choice of a past history of q}. This means that q-i grows more slowly than p.-i as j .... center manifold of the extension is defined as
00,
or -j ....
-00.
Then the
It is C k ' and tangent to Ee, There are also local center-stable, local center-unstable, and local center manifolds of / defined as WCO(O, B(O, r), f) = We,(O, I) n B(O, r), Weu(O, B(O, r), f) = WCU(o,f) n B(O, r), WC(O, B(O, r), f) = WC(O, I) n B(O, r),
and
5.10.3 STABLE MANIFOLD THEOREM FOR FLOWS
199
respectively. These local manifolds depend on the extension, but if Ii (q) E B(O, r) for all -00 < j < 00 then q E WC(O, I) for any extension 1 and q E We(O, B(O, r), f). REMARK 10.4. The proof that a CI center-stable and center-unstable manifolds exist is essentially the same as in the last subsection. In Section 11.3, we return to discuss why it is cr for 2 ~ r < 00. For a more thorough discussion of the Center Manifold Theorem, see Carr (1981). There are proofs in Kelley (1967), Hirsch, Pugh, and Shub (1977), and Chow and Hale (1982).
Example 10.1 (Nonunique Center Manifold). The following example illustrates the fact that the center manifold is not unique. It is attributed to Anosov in Kelley (1967), page 149. Consider the differential equations
y=
-yo
It is easy to see that the origin is a non-hyperbolic fixed point with eigenvalues -1 and O. The stable manifold of the origin is clearly the y-axis. To determine the various center manifolds, note that :: = :~ and y = Cell>: is a solution for any choice of C. Thus for any choice of C, the graph of the following function gives a center manifold: u(x,C) = { 0 Cel/>:
for x ~ 0 for x < O.
Note that as x --+ 0 with x < 0 that u(x, C) --+ O. Thus u(x, C) is continuous at x = O. With more calculations, it can be shown that u(x, C) is Coo at x = O. For any C, the graph of u(x, C) is tangent to the x-axis at x = 0 and is invariant. Therefore for any choice of C, this graph is a center manifold. Thus, far from being unique, there is a one parameter family of center manifolds.
5.10.3 Stable Manifold Theorem for Flows The statement of the Stable Manifold Theorem for a fixed point of a flow is similar to that for a diffeomorphism. We do not comment further on it except to mention that the sum of the dimensions of the stable and unstable manifolds of a single fixed point equals the total dimension of the ambient space. This is similar to a diffeomorphism but different than the stable and unstable manifolds of a periodic orbit for a flow. For a hyperbolic periodic orbit 1, it is certainly possible to get the stable manifold of a Poincare map P: E --+ E, W:(p, P), and let
W:b, ei) =
U ri(W:(p, Pl· t~O
However, this representation does not tell us much about the geometry of the local stable manifold of the periodic orbit. To explain this geometry, we need to look at the contracting and expanding splitting along the whole periodic orbit. Let cpt be a Row on a manifold M for a vector field X and with a periodic orbit 1 of period T. For any q E 1, there is a splitting TqM = IE~ Ell E~ Ell (X(q»,
200
V. ANALYSIS NEAR FIXED POINTS AND PERIODIC ORBITS
where (X(q)) is the span of the vector X(q), and E~ = Dcp~E~ for 17 = s,u if q = cpt(p). Just as we defined the normal bundle for the periodic orbit when discussing the Hartman-Grobman Theorem, we can define the stable bundle and unstable bundle of the periodic orbit by E; = U{(q,y) : q E 'Y and y E E~}, and
E~ = U{(q,y) : q E 'Y and y E E:}. Further. for
17
= S. u
and f > 0 let
If the derivative of the flow restricted to the stable bundle. Dcp~IE~. is orientation preserving (has positive determinant). then E~(f) is isomorphic to the cross product of 'Y and E~(f). If Dcp~IE~ is orientation reversing (has negative determinant). then E~(f) is not isomorphic to the cross product of'Y and E~(f) but is a twisted product. By going around the orbit twice an oriented basis of E~ can be brought back to itself (while remaining a basis of the appropriate E~ the whole way) but not after only once around 'Y. If E~ is one dimensional and the stable bundle is twisted. then E~(f) is isomorphic to a Mobius strip. In higher dimensions. it is a corresponding twisted product if the bundle is not oriented. The local stable manifold of 'Y. W.'(-y. cpt). can be represented as a graph over E~(f) for some small f > O. Thus if E~ is one dimensional. the local stable manifold is either a graph over an untwisted strip (an annulus) or a Mobius strip. Similar statements can be made about the local unstable manifold. This geometric difference of types of local stable and unstable manifolds for a periodic orbit does not arise for fixed points of flows or for diffeomorphisms. The differentiation between E~ being twisted or not is related to Floquet theory for the time periodic linear system of differential equations (first variation equation)
d
'dtv = DX
5.11 The Inclination Lemma This section concerns a result which follows from the proof of the stable manifold theorem. or at least from similar ideas. It is used in Chapter IX to develop the general theory of invariant sets with a hyperbolic structure (some contracting and some expanding directions). Let p be a hyperbolic fixed point for f. The result concerns the iterates of a disk. of the same dimension as the WU(p), which crosses W'(p}. We first need to define what we mean by a disk crossing W'(p} transversally.
5.11 THE INCLINATION LEMMA
201
Definition. Let DI and D2 be the differentiable images in Rn of two disks, D; = for j = 1,2 where 9j : Rn; -- Rn is Cle and one to one. We say that these two embedded disks are transverse at p provided either p ¢ Dl n D2 or
9j (Rn; (r
j»
We say that DI and D2 are transverse provided they are transverse at all points. In the definition, note that if DI and D2 are transverse and n1 + n2 < n, then DI n D2 = 0. Thus two transverse lines in R3 do not intersect. Next, if nl + n2 = n, and the disks are transverse, then they intersect in isolated points. (A complete proof of this fact uses the Implicit Function Theorem.) So, a line and a plane which are transverse in R3 intersect in a single point. If nl + n2 > n, then the transverse objects intersect in a curve, surface, or higher dimensional object (of dimension nl + n2 - n). So, two planes which are transverse in R3 intersect in a line. For a more complete treatment of transversality, see Abraham and Robbin (1967), Guillemin and Pollack (1974), or Hirsch (1976). Now we can state the Inclination Lemma (or Lambda Theorem). Theorem 11.1 (Inclination Lemma). Let p be a hyperbolic fixed point for a C le diffeomorphism f. Let r > 0 be small enough so that in the neighborhood {p} + (JE"(r)xJE'(r» ofp the hyperbolic estimates work which prove the Stable (and Unstable) Manifold Theorem. Let D" be an embedded disk of the same dimension as IE" and such that D" is transverse to W:(p,f). Let = f(D") n (JE"(r) x JE'(r» and D~+l = f(D~) n (JE"(r) x JE·(r». Then D~ converges to W;'(p, f) in the Cle topology (pointwise and with all its derivatives). See Figure 11.1. So given E > 0, there is no such that for all n ~ no, D~ is with E of W;'(p, f) in terms of the Ck-topology. (The latter condition means that if D~ is given as the graph of Un : W;'(p, f) -- JE", then Un and its first k derivatives are smaller than E.)
Dr
FIGURE
11.1. The Disks
D~
Converging to the Unstable Manifold
The proof of this theorem follows from the methods of the proof of the Stable Manifold Theorem. See Palis and de Melo (1982) for the details. Palis and de Melo (1982) also contains another (geometric) proof of the Hartman-Grobman Theorem using the Inclination Lemma. This type of result can also be formulated for an invariant set with a hyperbolic structure which we define in Chapter VII.
202
V. ANALYSIS NEAR FIXED POINTS AND PERIODIC ORBITS
5.12 Exercises Differentiation Assume I: U C Rk
5.1.
-+
Rm is CI and p E U. Show that 01
oxJ (p) =
.
D/pe1
where e1 = (0, ... ,0, 1,0, ... ,0) is the standard unit vector with zeroes in all but the j-th position. 5.2. Assume I : U C Rk -+ Rm and 9 : V C R m -+ an are differentiable at p and q = I(p), respectively. Using the matrix product write out the (i,j)-th entry of DgqD Ip. Discuss the relationship between the chain rule in terms of products of matrices and combinations of partial derivatives. Inverse Function Theorem 5.3. This exercise asks for a proof of the covering estimates of the Inverse Function Theorem stated in Theorem 2.4. Assume that U C IRn is an open set containing 0, and that I : U -+ IRn is a CI function with 1(0) = 0. Assume that L = Dlo is an invertible linear map (so has a bounded linear inverse). Take any /3 with 0 < /3 < 1. Let r > 0 be such that (i) 8(0, r) C U and (ii) ilL - Dlxll $ m(L)(1 - (3) for all x E 8(0, r). Define
for all XI,X2 E 8(0,r). (c) For y E 8(0,/3r), prove that 8(0, m(L)/3r). Contraction Mapping 5.4. Assume Y is a complete metric space with metric d, A is a metric space, and 0< I'i. < 1. Assume 9 : A x Y -+ Y is uniformly continuous, and for each x E A, g(x,·) is Lipschitz with Lip(g(x, .)) $ I'i.. Prove that there is a continuous map 0' : A -+ Y such that g(x, O'(x)) = O'(x) for every x E A. Hint: Apply Theorem 2.5 to show that for each x E A, there is a O'(x) E Y such that g(x,O'(x)) = O'(x). Then for x E A near Xo E A, apply the estimate in Theorem 2.5 on the fixed point O'(x) to see how near it is to O'(xo). Solutions of Differential Equations 5.5. Assume A(t) is an n x n matrix with real entries such that IIA(t)1I is bounded for all t, Le., there is a C > such that IIA(t)1I $ C for all t. Prove that the solutions of the linear system of equations x = A(t)x exist for all t. 5.6. Let Co, C II and C2 be positive constants. Assume v(t) is a continuous nonnegative real valued function on R such that
°
v(t) $ Co
+ Cdtl + C21l v(s) dsl·
5.12 EXERCISES
203
Prove that
Hint: Use
U(t) = Co 5.7. that all t. 5.8.
l
+ Cit + C'J
V(s) ds.
Let CI and C 2 be positive constants. Assume I : Rn -+ Rn is a C I function such I/(x)1 ~ C I + C 2 1xl for all x ERn. Prove that the solutions of x = I(x) exist for Hint: Use the previous exercise. Consider the differential equation depending on a parameter
x = l(x;lJ) where x is in an, IJ is in R", and I is a CI function jointly in x and Jl. Let 'P1(x;lJ) be the solution with 'P°(x; IJ) = x. Prove that 'PI (x; IJ) depends continuously on the parameter IJ. 5.9. (Differentiably flow conjugate) Let I,g : an -+ Rn be two vector fields with flows 'PI and 1/J1 respectively. (a) Assume h : Rn -+ Rn is differentiable flow conjugacy of the flows 'PI and 1/J1, i.e., h 0 'PI 0 h -I = 1/J1 for all t and h is a CI diffeomorphism. Prove that Dh,,-I(Xj/ 0 h-I(x) = g(x). In differential geometry, the vector field given by Dh,,-l (xj/ 0 h -I (x) is labeled by (h./)(x). (b) Assume I and 9 are differentiably flow equivalent, i.e., they satisfy the equation 1/J1(x) = h 0 'PQ(I,X)
0
h-I(x)
where h : Rn -+ Rn is diffeomorphism and a : R x Rn d dto(t,x) > O. Prove that
-+
R is differentiable with
Dh,,-I(Xj/ 0 h-I(x) = ,B(x)g(x) where,B: an -+ (0,00). 5.10. (Flow Box Coordinates) Consider a differential equation x = I(x) for I : Rn -+ R n a C" vector field for r ~ 1. Let a be a point for which I(a) oF O. Prove there exists a cr change of coordinates x = g(y) valid near a in terms of which the differential equations become for 2 ~ j ~ n. !JI = 1, The outline of the proof is as follows. (a) Let v = I(a). One of the coordinates k = 1. Take the hyperplane
Vk
oF O. Renumber the equations so that
s= {(al,Y2, ... ,Yn)}' Define g(y) = 'Pill (at. Y2,"" Yn) where 'PI is the How of the differential equation, Prove that VI 0 1 ( V2 Dga = : Vn
0
204
v. ANALYSIS NEAR FIXED POINTS AND PERIODIC ORBITS
(b) Using the inverse function theorem, prove that 9 defines a C r set of coordinates near a, i.e., 9 is cr and has a cr inverse. (c) Prove the differential equations in the y variables is as the form given above. Limit Sets 5.11. Let X be a metric space, and Sj C X for 1 ~ j be a sequence of nested, compact, and connected subsets: Sj ::> Sj+l. Prove that n~l Sj is connected. 5.12.
Consider the flow e'Ax for the linear differential equation
x=Ax with x ERn. (a) Prove that all the eigenvalues of A have nonzero real part (i.e., the fixed point 0 is hyperbolic) if and only if for each x ERn, either w(x) = {O} or w(x) = 0. (h) For n = 4, show there is a choice of A and a point x for which w(x) :) O(x) but O(x) is neither a fixed point nor a periodic orbit. 5.13. Let x = X(x) be a (nonlinear) differential equation in an. Assume that a trajectory 'P' (p) is bounded and each of the coordinates of the solution is monotonic for
t
~ to, i.e., ft7rj'P ' (p) does not change sign for t ~ to and 1 ~ j ~ n, where 7rj : Rn -+ R
is the projection of the j-th coordinate. Prove that w{p) is a single fixed point. 5.14. Let f : X -+ X and 9 : Y -+ Yare continuous maps on compact metric spaces. Assume h : X ..... Y is a topological conjugacy. Prove that h takes the chain recurrent set of f to the chain recurrent set of g, h(n(f» = n(g), i.e., prove that the chain recurrent set is a topological invariant. Fixed Points for Differential Equations 5.15. Find the fixed points and classify them for the system of equations
± = v,
v=
-x +wx3 , W= -w. 5.16.
Find the fixed points and classify them for the Lorenz system of equations
x=
-lOx
y=
28x - Y - xz,
.
8
+ lOy,
z="3 z + xy .
5.17.
Consider the equation of a pendulum with friction,
X=y iJ = - sin(x) - 6y with 6 > O. (a) Using a Liapunov function, prove that the w-limit set of any point qo = (xo, YO) is a single fixed point. (b) Let PI: = (b,O) be the fixed points. For a fixed point PI:, let W"(Pk)
= {q : w(q) = pI:}
5.12 EXERCISES
be the basin of attraction of Pk (or the stable manifold of Pk). Discuss how the the basins of attraction for the fixed points are located in R2. In particular, explain how W'(P2k+d separates W'(P2k) from W'(P2k+2). (c) Discuss the difference of the motion of a point q = (0,110) which lies in W'(P2k) from the motion if it lies in W'(P2k+2). In particular, how many rotations through multiple of 211" in the x-variable does each forward orbit make? 5.18. Discuss the basins of attraction of the fixed points for the following system of differential equations with fJ > 0: :i:=11
iJ 5.19.
= x - x 3 - fJ y.
Consider the equations
:i: = x(A - x - ay), iJ = y(B - bx - y) which model two competing species. Assume A, B, a, b > 0, A > aB, and B > bA. (a) Find the fixed points. Hint: Consider the isoclines where :i: = 0 and iJ = O. (b) For any point q in the interior of the first quadrant, prove, using Exercise 5.13, that the w( q) is a fixed point with both coordinates positive. Hint: Consider the various regions where :i: and iJ have fixed sign. (The isoclines are the boundaries of these regions.) 5.20. This exercise asks for an alternate proof of Theorem 5.1 (on nonlinear sinks for ordinary differential equations) using the Jordan Canonical Form approach. Assume 0 is a fixed point for the equatlons:ic = f(x) with x ERn. Also assume that there is a constant a > 0 such that all the eigenvalues ~ for A = D fa have negative real part with Re(~) < -a < o. (a) Let ( , )B be the inner product and II liB be the norm determined by an arbitrary basis B. Show that ~II (t)11 = (x(t),/(X(t)))B dt
x
B
IIx(t)IIB·
(b) Let " > 0 be small enough so that Re(~) < -a - " for all the eigenvalues of A. Let B be the basis used in the alternative proof of the linear sink theorem, so
{x, Ax)B < (-a -
£)lIxll~.
Prove that if U is a small enough neighborhood of 0, then (X,f(X»B < -allxll~·
(c) Prove that for a small enough neighborhood U of 0 and :xo E U, the solution
206
V. ANALYSIS NEAR FIXED POINTS AND PERIODIC ORBITS
(b) Prove that p is a hyperbolic fixed point of X if and only if DIp = 0 and D2/p("') is a nondegenerate bilinear form. 5.22. Assume:it = Ax and y = By are both hyperbolic linear flows on an and are differentiably conjugate, i.e., there is a C I diffeomorphism h : an -> an and a C I function T : R x Rn -> R such that h(eA1'(t,x)x) = eBth(x). Prove t.hat the eigenvalues of A are proportional to the eigenvalues of B. Periodic Points for Maps 5.23. Let h : R -> R be given by h(x) = x 3 . Find a Coo diffeomorphism I : R -> a such that h - I 0 I 0 h is not differentiable at 1. 5.24. Consider the map given in polar coordinates by l(r,B) = (1· 2,B - 0.5 sin(B)). This can be considered as a map on the two sphere by adding a fixed point at infinity. (a) Find the fixed points and classify their stability. (Include the fixed point at infinity). (h) Find the basins of attractions of all the fixed point sinks including the fixed point at infinity. 5.25. Let FAB = (A - By - x 2 ,x) be the Henon map. Find the fixed points and classify them for different values of the parameters A and B. 5.26. Prove Theorem 6.1 in the case of a fixed point. (This is the theorem that says that a map with a fixed point, all of whose eigenvalues are less than one in absolute value, is a contraction.) 5.27. Let I and gk be diffeomorphisms of a given by I(x) = x
gk(X) = x
+ ~ sin(x)
and
1
+ 2hk(x), x3
hk(X) = X - 3!
where X 2k + 1
x5
for k ~ 1.
+ 5! - ... + (-I)\2k + I)!
For k ~ I, prove that I and 9 are not topologically conjugate. 5.28. Suppose h is a conjugacy between I : an -> an and 9 : an -> an. (a) Show that p is a periodic point of I if and only if h(p) is a periodic point of g. (b) Show that if Ii (p) converges to q as j goes to infinity, then gi (h(p)) converges to h(q) as j goes to infinity. (c) Show that for any p E Rn we have h(w(p, f)) = w(h(p), g) where w(p, f) are the w-Iimit sets of p. 5.29. Assume I : Rn -> Rn is a CI diffeomorphism. Assume p is a hyperbolic periodic point for I. Given any positive integer n, prove there is a neighborhood U of p such any periodic point of I in U \ {p} has period greater than n. Hartman-Grohman Theorem 5.30. This exercise gives another proof of Theorem 7.1. Let A E L(an,an ) be an invertible hyperbolic linear map. Assume that I is a C I diffeomorphism of an with A 0 I-I - id E CI:(Rn,Rn). To solve the equation A 0 h = hoI for h = id + k with k E cg (an, Rn ), it is sufficient to solve
AO(id+k)or l =id+k, Aor l +Aokorl =id+k, A
0
f'
C(k) = A
id = k - A 0
r
l -
id,
0
k
0
f -I,
or
5.12 EXERCISES
where C(k) = k - A 0 k 0
207
rl.
(a) Prove that C is an invertible bounded linear operator on Gg(R n , Rn). This includes the fact that C preserves the space Gg(R n , Rn). (b) Prove that there is a unique solution ko to the equation C(k) = A 0 1-1 - id. (c) Let h = ko + id where ko is the solution obtained in part (b). Prove that h is (i) a homeomorphism, and (ii) a conjugacy between A and I. 5.31. Assume cpl and t/JI are two flows on Rn for which 0 is a hyperbolic fixed point sink. Show that there is a conjugacy h in a neighborhood of 0 from cpl and t/Jl (the time one maps) that is not a conjugacy of cpl and t/J I, and in fact does not take trajectories of cpl to trajectories of t/JI. Periodic Orbits for Flows 5.32. Assume 'Y is a periodic orbit for the flow cpl and (3 is a periodic orbit for the flow t/JI (both flows in n dimensional spaces). Let PIP be the Poincare map for the flow cpt for a transversal Ep at p E 'Y and P", be the Poincare map for the flow t/JI for a transversal Eq at q E (3. Assume that PIP in a neighborhood of pin Ep is topologically conjugate to P", in a neighborhood of q in Eq . Prove that the flow cpt in a neighborhood of'Y is topologically equivalent to t/J t in a neighborhood of (3. 5.33. Assume 'Y is an attracting periodic orbit for the flow cpt. Prove that cpt in a neighborhood of'Y is topologically conjugate to the linear bundle flow defined in Section 5.8 for Theorem 8.7. 5.34. Assume that two diffeomorphisms ! and 9 are topologically conjugate. Prove that their suspensions are topologically conjugate. 5.35. Let A(t) be a time dependent n by n curve of matrices. Let M(t) be a fundamental matrix solution for the system of differential equations :ic = A(t)x. Prove Liouville's Formula: det(M(tJl) = det(M(to))
5.36.
[',
110
tr(A(t)) dt.
Consider the Volterra-Lotka equations x = x(A - By - ax) = xM(x, y), iJ = y(Cx - D - by) = yN(x,y)
with all A,B,C,D,a,b > O. These equations model the populations of two species which are predator y and prey x and an increase in either population adversely affects its own growth rate (there is a crowding factor in both equations), (a) Find conditions on the constants so there is a unique fixed point p' = (XO, yO) with x' > 0 and y' > O. Hint: Look at the isoclines where x = 0 and iI = O. (b) Letting V (x, y) be the vector field for these equations, verify that (div V)(x, y) = xix + illY + xM.,
+ yNII
where M., and Nil are the partial derivatives with respect to x and y of the respective functions. Let E = {(x, y') : x ~ x'} and P: E ~ E be the Poincare map. The solution of the rest of this exercise proves that for any q in the interior of the first quadrant, w(q) = {pO}.
V. ANALYSIS NEAR FIXED POINTS AND PERIODIC ORBITS
208
(c) Verify that P'(x) =
N(x,y·) . P(x) . e"(") N(P(x), y.) x
Cor x > x· where I'(x) < O. (d) Define !(x) = N(x, y·)/x and F the antiderivative oC!. Prove that F 0 P(x) < F(x) Cor x > x·, and that pn(x) converges to x· as n goes to infinity. (e) Conclude, Cor q in the interior ofthe first quadrant that w(q) = {p.}. 5.37. Assume X is a C I vector field on T2 with no fixed points. Prove that there is a closed curve r on T2 which is a transversal to the flow. Further, prove that r can not be contracted to a point in T2. Hint: Consider the vector field Y on T2 which is everywhere perpendicular to X, e.g. iC X and Y are the lifts oC X and Y to R2 then it is possible to take Y(x, y) = (X 2 (x, y), -XI (x, where X(x, y) = (XI (x, y), X 2 (x, y». Take and point q E T2 and consider P E w(q, Y). The trajectory oC q Cor Y repeatedly comes near p. By considering a pair oC points on the trajectory, show that it can be modified to make a transversal Cor the flow oC X. 5.38. Consider a vector field X on y2 with lift X to R2 oC the form X(x, y) = (1, X 2 (x, Prove that X has a periodic orbit if and only if the Poincare map has a rational rotation number. Poincare-Bendixson Theorem 5.39. Let:it = V(x) be a differential equation in R2 with only a finite number of fixed points. Assume Po is a point whose forward orbit, O+(po), is bounded. (a) Assume PI E w(Po) and P2 E w(Ptl with V(P2) ~ O. Apply Lemma 9.4 and the argument of Lemma 9.5 to prove that PI is on a periodic orbit. (b) If PI E w(Po) is not a periodic orbit, prove that o(Ptl and w(Ptl are each single fixed points. (c) If w(Po) is not a periodic orbit, prove that w(Po) contains a finite set of fixed points and a finite set of other orbits O(q;) where O(qi) and W(qi) are each single fixed points for each qi' 5.40. Let A = 51 x la, bJ be an annulus with covering space A = R x la, bJ. (The 'angle variable' is not taken modulo 1 in the covering space.) Let X be a vector field on A with lift X = (X I ,X2 ) to A. Assume that XI(x,a) < 0 and XI(x,b) > 0 for all x, and div(X) '= 0, i.e., X is area preserving. Prove that the flow of X has a fixed point in A. Hint: Assume X has no fixed points and let Y be a nonzero vector field which is perpendicular to X everywhere in A. Prove that Y has a periodic orbit "Y in A using the Poincare-Bendixson Theorem. Get a contradiction to the area preserving assumption on X by considering X along "Y. 5.41. Consider the differential equations
y»
y».
. 4xy x=a-x--1 +x2
Ii = bX(l - -y-) 2 1+x
for a,b > O. (a) Show that x· = a/5 and y. = 1 + (x·)2 is the only fixed point. (b) Show that the fixed point is repelling for b < 3a/5 - 25/a and a > O. Hint: Show that det(DF(.,_.\I_» > 0 and tr(DF(",- .\1-» > O. (c) Let XI be the value of X where the isocline {x = O} crosses the x-axis. Let YI = 1 + (xtl2. Prove that the rectangle {(x, y) : 0 :5 x :5 XI, 0 :5 Y :5 yd is positively invariant.
1i.l2 EXERCISES
(d) Prove that there is a periodic orbit in the first quadrant for a > 0 and 0 < b < 30/5 - 25/a. Fiber Contractions (Stable Manifold Theory) 5.42. Let fln(r) C R" be the closed ball in R" of radius r > 0 about 0, and fI"(r') C R" be the closed ball in R" of radius r' > 0 about O. Assume F : fln(r) x fI"(r') -+ R" x fI"(r') is a C l map of the form F(x,y) = (J(x),g(x,y» such that (i) /(Int(fln(r») ::> l1"(r), and (Ii) /lSn(r) Is a diffeomorphism from Sn(r) onto its image. Let D 2g(x,y) : R" -+ R" be the derivative with respect to the variables in R". Let K. x = SUp{II D2g(y,x)II : y E R"}. Assume that (iii) sup{K.x : x E fln(r)} < 1. (a) Prove that for each x E fln(r) there is a unique O'(x) E fI"(r') such that (x,O'(x» has a backward orbit by F in : fI"(r) x fI"(r'). Hint: Prove that the intersection F"( {/-"(x) x fI"(r')) is a single point. (b) Prove that 0' : fI"(r) -+ fI"(r') is an invariant section, i.e.,
n::,=o
F(x,O'·(x» for all x E /-I(fI"(r». (c) Prove that map 0' : fln(r)
-+
= (J(x),O'· o/(x»
fI"(r') is continuous.
5.43. Using the notation of the previous exercise, assume F : fln(r) x R" -+ R" X R" is a CI map of the form F(x,y) = (J(x),g(x,y» such that (i) /(int(fln(r))):> fln(r), and (ii) /Ifln(r) is a diffeomorphism from fln(r) onto its image, (iii) sup{K.x : x E fln(r)} < I, and (iv) sup{K. x Ax : x E fln(r)} < I, where Ax = II(D/x)-III and K. x = SUp{IID29(y,x)1I : y E Rio}. Let 0'. : fln(r) -+ R" be the unique continuous invariant section found in the previous exercise, Prove that 0'. is CI. Hint: Construct a family of horizontal cones C(x,y) that are taken inside themselves by DF(x,y), DF(x,C7(X»C(X,C7(X» C C F (X,C7(X»'
Center Manifold 5.44. (A polynomial vector field without a Coo center manifold. This example is taken from Carr (1981) and is a modification of an example of van Strien (1979).) Consider the equations x=xy+x3
y=o =z -
i
x2.
(a) Show that the center manifold of (x, y, z) = 0 can be written as a graph z = h(Z,II) for Ixl ~ 6 and Iyl ~ 6 for small 6 > o. (b) Show that the fixed points {(O, y, 0) : Iyl ~ 6} all lie on we(o). (c) Assume that z = h(z, y) is C2n for Ixl ~ 6 and Iyl ~ 6. Take the Taylor expansion of h in x about x = 0 (with coefficients which are functions of y): 2n
h(x, y)
=L
a; (y).:d
+ o(lxI2n).
j==l
Find a relationship between the coefficients by equating i and i
=
::x + ~y.
In particular, show that (i) al(y)
(iii) (1 - jy)a;(y) = (j - 2)a;_2(Y) for j > 2.
=z-
= 0,
x2
= h(x, y) -
x2
(iI) a2(y) :f: 0, and
V. ANALYSIS NEAR FIXED POINTS AND PERIODIC ORBITS
210
(d) Show that the point (O,l/(2n),O) can not lie in the domain where h is C 2n . Hint: Show that a2i(1/(2n)) ". 0 for 1 ~ i < n, and obtain a contradiction for the relationship involving the coefficients a2n(1/(2n)) and a2n-2(1/(2n)). Remark: What makes this example work is the resonance between the eigenvalues at the fixed point (O,l/(2n), 0). The resonance forces the weak unstable manifold (for the eigenvalue 1/(2n) to be C 2n - 1 but not c2n. In turn, this manifold Is contained in the center manifold of 0 so it can not be C 2n either. (e) Show that there is no neighborhood of 0 on which the center manifold WC(O) is Coo.
5.45. Assume f : Rn x R -+ Rn is a C r function for r ~ 1. Write I>. (x) for f(x, A). The map F : Rn+l -+ Rn+l be defined by F(x, >.) = (f(x, A), >.) is also cr. Assume that Xo is a hyperbolic fixed point for fo. Let x.\ be the corresponding hyperbolic fixed point for A near O. Using the cr Center Manifold Theorem for F, prove that the stable manifold of X A for fA, W"(x A, 1>.), depends C r jointly on position and parameter A, i.e., prove that W"(x.\, fA) for IAI < E and E > 0 small can be represented as the graph of a cr function q: E~o.o x (-E,E) -+ E~.o.
Inclination Lemma 5.46.
Let A =
(00 5
~),
and consider the linear map Ax. Let S = [-r, r] x [-r, r]
be the square. Let D be the line segment be the part of the line through the point (1, 0) with slope s that lies within S. Assume sand r are chosen so that D intersects the top and the bottom of S and not the sides. Prove by a direct calculation that the n-th iterate of the line segment, An (D) n S, converges to the part of the y-axis given by {O} x [-r, rIo Prove the convergence both in terms of the points and the slope.
CHAPTER VI
Bifurcation of Periodic Points Throughout this chapter we consider a map with one parameter. These results also apply to differential equations near a periodic orbit by considering the Poincare map. We write fl'(x) = I(x,/J) where /J E R and x E RH. We proved in Theorem V.6.4 that if f"o(Xo) = Xo is a fixed point and 1 is not an eigenvalue of D(fI'O)xo' then the fixed point can be continued for values of the parameter /J near /Jo. This is a non-bifurcation result. Notice that the tool we used to show that the fixed point could be continued in this case is the Implicit Function Theorem. We repeatedly use this theorem to study the bifurcations considered in this chapter. The first type of bifurcation, called the saddle-node bifurcation, occurs where the above assumption on the eigenvalues is violated and 1 is an eigenvalue. With further assumptions on the higher derivatives of II'(x), it follows that (i) for /J on one side of Ito there are no fixed points near Xo, and (ii) for /J on the other side of /JO there are two fixed points. Of the two fixed points, one is attracting and the other is repelling (at least in one dimension, n = 1). The other types of bifurcations that we consider are ones where the fixed point persists (1 is not an eigenvalue), but the stability type of the fixed point changes as /J passes through Ito (one eigenvalue has absolute value equal to 1). Among this type of bifurcation, the second type we consider, called a period doubling bifurcation or flip bifurcation, occurs when one eigenvalue is -1. In one spatial dimension, n = 1, an attracting point of period one becomes repelling at /Lo and a stable orbit of period 2 branches off from the fixed point. The last type of bifurcation we consider is called the AndronotJ-Hopf bifurcation. It occurs when a pair of complex eigenvalues have absolute value one when /J = /JO but are not equal to ± 1. As /t passes through /Lo, the absolute value of these eigenvalues change from less than one to greater than one. For a pair of complex eigenvalues to occur, it is necessary to be considering a map in at least two dimensions. With further assumptions on the derivatives, for /L slightly bigger than /Jo there is an invariant closed curve near xo. This bifurcation is simpler for differential equations. In this case, for a differential equation in two dimensions, a fixed point changes from attracting to repelling as a pair of eigenvalues crosses the imaginary axis, and a stable periodic orbit branches off from the family of fixed points. For a more complete treatment of bifurcation theory, see Guckenheimer and Holmes (1983), Chow and Hale (1982), Wiggins (1990) and (1988), or Hale and Koc;ak (1991).
6.1 Saddle-Node Bifurcation As stated in the introduction, the first bifurcation we consider is the one that occurs when the map fails to be hyperbolic because 1 is an eigenvalue of the derivative. We first consider the case where x is a real variable. We want fl'o(xo) = Xo and I~o (xo) = 1. We also want the tangency of the graph of 11'0 to the diagonal {(x,1/) : 1/ = x} to occur in the simplest possible fashion so we assume that I~o(xo) 1: o. Finally we need that the graph of I" is moving upward or downward as the paranleter varies, 211
VI.
212
:~ (xo, JlO) f. O.
BIFURCATION OF PERIODIC POINTS
With these assumptions, the fixed point disappears on one side of 110
and two fixed points branch off on the other side. Before stating the general theorem we give an example.
Example 1.1. Let I,,(x) = Jl+x-ax 2 with a> O. For Jl < 0, I,,(x) -x = Jl-ax 2 < 0 for all x so there are no fixed points. For Jl = 0, 0 = lo(x) - x = -ax 2 has one root at x = 0, so 10 has one fixed point at x = O. For Jl > 0, 0 = I,,(x) - x = Jl - ax 2 has two roots, x± = ±('I/a)I/2. Thus I" has two fixed points. See Figure 1.1.
__________
~~--~~------------~~x
FIGURE 1.1.
Creation of Two Fixed Points
Theorem 1.1 (Saddle-Node Bifurcation). Assume that I : R2 --+ R is a C r function jointly in both variables with r ~ 2. We also write I,,(x) = l(x,Jl). Make the following further assumptions: (I) l(xo,Jlo)=xo, (2) I~o(xo) = 1, (3) 1::O(xo) f. 0, and
81
(4) --8 (xo, /lo) II
f. O.
Then there exist intervals I about Xo and N about /lo and a C r function m : I --+ N such that (i) Im(x)(x) = x, (ii) m(xo) = /lo, and (iii) the graph ofm gives all the fixed points in I x N. Moreover m'(xo) = 0 and
These fixed points are attracting on one side of Xo and repelling on the other. See Figure 1.2. PROOF. To find the fixed points of
II"
we consider the new function
G(x, /l) = I(x, /l) - x. Then fixed points of I" exactly correspond to zeroes of G. First we have G(xo, /10) = O. We want to use the Implicit FUnction Theorem to solve for nearby zeroes of G. Note that (8G/8x)(xo, /lo) = (xo) -I = 0 so we can not solve for x in terms of /l. However,
1:.0
6.2 SADDLE-NODE BIFURCATION IN HIGHER DIMENSIONS
213
x (
stable
\
unstable
FIGURE 1.2.
Bifurcation Diagram for Saddle-Node Bifurcation
so we can solve for p. in terms of x. In fact, there are intervals I about Xo and N about P.o and a C r function m : 1 -+ N such that G(x, m(x)) == 0, and these give all the zeroes of G in I x N. This construction proves the first facts about the fixed points. To calculate the derivatives of m(x) we use implicit differentiation. We use subscripts to designate partial derivatives. Thus G z = Differentiating 0= G(x,m(x» with respect to x, we get = G z + Gl£m'. Evaluating this at Xo and using the fact that Gz(xo, p.o) = 0 and GI£(XO, p.o) l' 0, we get that m'(xo) = 0. (Notice that this much of the proof only uses the fact that f(x,p.) is C 1 and does not use the second derivative.) To get the second derivative of m, we differentiate the equation 0 = G.,(x, m(x)) + GI£(X, m(x»m'(x) a second time with respect to x and get
If.
°
0= Gzz
+ 2Gl£z m' + GI£I£(m')2 + Gl£m".
Evaluating this expression at Xo and using the fact that m'(xo) = 0, we get that m"(xo)
= -G:z:z(xo, p.o) = - f:z:z(xo, p.o)
GI£(XO, p.o) fl£(xo,p.o) as claimed. (In this notation, fl£(xo, p.o) is the p.-partial derivative evaluated at the point indicated.) To find the stability of the fixed points we use the Taylor expansion of f" about (xo, p.o): of ox (x, p.) =1
02f
02 f
+ ox 2 (x - xo) + OXOI-' (I-' - p.o) + O(lx - xol 2 ) + O(lx - xol ,11-' -
Because m'(xo) = 0, it follows that m(x) - p.o = O(lx of o2f -0 (x, m(x» = 1 + -02 (x - xo) x x (Zo.,..,)
Because
:~ (xo, p.o)
oF 0,
:~ (x, m(x)) -
p.ol)
+ 0(11-' _ p.oI2).
xoI 2 ).
Therefore,
+ O(lx -
2
xol ).
1 has opposite signs on the two sides of xo,
:~ (x, m( x» is less than one on one side and greater than one on the other, and so one side has an attracting fixed point and the other side a repelling fixed point. 0
6.2 Saddle-Node Bifurcation in Higher Dimensions In the last section, we gave the saddle-node bifurcation in one spatial dimension. This section considers this same bifurcation in higher dimensions. Before stating the theorem, we consider the example of the Henon map.
214
VI. BIFURCATION OF PERIODIC POINTS
Example 2.1. Let FAB(X,y) = (A - By - x 2,x) be the Henon family of maps. The fixed points have x-coordinate given by x = _ B ; 1 ± [( B ; 1) 2
+A
r/
2,
so there are
no fixed points when A < -I(B + 1)/2J2 and two fixed points when A> -I(B + 1)/2J2. The eigenvalues of the derivative D(FAB)(z.y) are ~:I: = -x ± Ix 2 - Bj1/2. When A = -[(B + 1)/21 2 and x = -(B + 1)/2, the eigenvalues are ~+ = Band .L = 1. Thus there is a bifurcation from no fixed points to two fixed which occurs when A = -[(B + 1)/2J2, and one of the eigenvalues of this fixed point is 1 at this bifurcation value. We leave to the exercises for the reader to verify that this family satisfies the conditions of the theorem below for a saddle-node bifurcation. See Exercise 6.2. To state the theorem in higher dimensions, it is necessary to specify the derivative of the "coordinate" function along the direction of the eigenvector for the eigenvalue 1. We assume that 1 is an eigenvalue of D(f,.o)xo of multiplicity one. Let vi be the right eigenvect.or (written as a column) for the eigenvalue 1, and w be the left eigenvector (written lIS a row), D(f,.o)Xovl = vi and wD(f,..)xo = w. If v j for 2 ~ j ~ n are the other generalized right eigenvectors, then wv j = 0 for 2 ~ j ~ n. (The product wv j is zero because Aj i' 1 and wv J = (wD(f,.o)Xo)v J = w(D(f,.o)xovJ) = AjWV j .) We can now use w I,. (x) lIS the component of I,.(x) along the direction of vi. Theorem 2.1. Let I : Rn x R ~ Rn be C 2 jointly in all the variables, and write I,.(x) = I(x, Il)· Assume that I,. satisfies the following conditions: (1) I(Xo,llo) = Xo, (2) D(f,.o)xo has eigenvalues Al = 1 and Aj with IAjl i' 1 for 2 ~ j ~ n, letv l be the right eigenvector for the eigenvalue 1 of D(f,.o)xo and w be the left eigenvector for the eigenvalue 1 (written as a row), (3) W [D2(f,.o)xo (vI, vi)] i' 0, and (4) w(81/81l)(Xo,Ilo) i' O. Then it is possible to parameterize Il = m(s) and x = q(s) such that m(O) = Ilo, q(O) = xo. q'(O) = vi, and I(q(s), m(s» == q(s). REM A RK 2.1. It is possible to make a change of basis so that
D(f,.o)xo =
(o1 0• ...... 0)* .
.
...
.
'
o • ... • where the terms marked by a '.' are unspecified. In terms of this basis, the left eigenvector w = (1,0,··· ,0) = (Vl)t (the transpose to a column vector). Thus in terms of this basis, condition (3) is given by (8/t/81l)(Xo,IlO) i' 0 and condition (4) is given by (82/d8xr)(Xo,llo) i' O. REMARK 2.2. If n = 2, IA21 < 1, and the derivatives in conditions (3) and (4) have opposite signs, then one of the fixed points is a stable node (attracting fixed point with two unequal eigenvalues) and the other is a saddle. This is the reason for the name of the bifurcation. REMARK 2.3. There are two approaches to the proof. One uses the center manifold associated to F(x, Il) = (f,.(x), IJ) which is two dimensional, one spatial dimension and one parameter direction. The restriction of F to this invariant manifold satisfies the assumptions of the earlier theorem. A second approach us<'~ thp Implicit Function Theorem to do the reduction in dimensions. This method does not produce an invariant two dimensional set, but it does
6.2 SADDLE-NODE BIFURCATION IN HIGHER DIMENSIONS
215
produce a two dimensional set on which all the fixed points must be found. This second method is often called Liapunov-Schmidt reduction. We give proofs using both approaches below. PROOF USING THE CENTER MANIFOLD THEOREM. Define th map on Rn+l given by F(x, Il) = (f(x, Il), Il)· and its derivative is given in the following block form:
Take the eigenvectors of D(F",,)xo 50 that Ivll = 1, Iwl = 1, and wv l = 1. Make a change of basis to the eigenvectors of D(f"" )xo; in these coordinates, the eigenvectors of D(fl-'o)xo are the standard basis. Then w = (1,0,· .. ,0) and vi = WI. The center directions at (xo, 1-10) for the map F include the Vi and Il directions. By the Center Manifold Theorem, we can solve for an invariant manifold x = cp(8, Il) which is a graph over the span of Vi and the Il variable. (The total center manifold is (X,Il) = (CP(8,1l),Il).) The parameter s can be chosen 50 that cp(O,I-Io) = Xo and W[cp(8, Il) - XoJ = s where wz is a projection of the vector z onto the component in the direction in the vector vi. The surface given by the image of cp is tangent to • properties of cp, it follows that
Vi, 50
~Cp (O,IlO) = vi. vS
Also, from these
and With these preliminaries, define
which is the projection onto the span of vi of the difference of the iterate from Xo. We want to apply the Saddle-Node Bifurcation Theorem in one spatial dimension. Let us check the conditions of the one dimensional theorem for g: g(O, 1l0) = w[Xo - XoJ = 0, 8g 8cp 8s(S,Il) = w[D(fI-')
50
8g 88 (0,1-10) = w [D(f",,)xo v IJ = wv I = 1.
This is the first condition for a saddle-node bifurcation for g. Taking the derivative again,
8 2g 2 8cp 8cp 8s 2 (0'I-Io) = w[D (f",,)xo( 88 (0,1-10), 88 (0,1-10» We mentioned above that w~(O,I-Io)
l)2cp]
+ D(f,.o)xo 8s 2 (0,1-10) .
= 0,50 both
and
8 2cp take their values in the span of v 2 through v n , and wD(f",,}xo 8s 2 (0,1-10) that
= 0. It follows
216
VI. BIFURCATION OF PERIODIC POINTS
which is nonzero by an 8SSumption of the theorem. Finally,
by another 8SSumption of the theorem. (The second term is zero because w :~ (0, /-10) = 0.) We have checked the 8SSumptions of the saddle-node bifurcation in one spatial dimension applied to g, so we can solve for Il = m(s) such that g(s, m(s)) = s. Also, m'(O) = 0 and •
10. Next we check that this function gives us the set of fixed points that we want. First, w[/('P(s, m(s)), m(s)) - "0]
= s = w['P(s, m(s)) -
xo].
Both (f('P(s,m(s)),m(s)),m(s)) and ('P(s,m(s)),m(s)) are in the center manifold and have the same components along vi and p. Therefore they are equal, 1('P(s, m(s)), m(s)) = 'P(s, m(s)).
o
Letting q(s) = 'P(s.m(s)), we have the conclusion of the theorem.
PROOF USING THE IMPLICIT FUNCTION THEOREM. This method uses only the fact that A) f. 1 for 2 ~ j ~ n and not that IAjl 1 1. Rather than using a center manifold, we reduce the problem to one spatial dimension by means of the Implicit Function Theorem, a method called Liapunov-Schmidt reduction. Wt> take a basis so that
D(fl'<»xo =
( o~.~.*
In terms of partial derivatives, 88ft ("0,/-10) = 0 Xj
by !/J)(x,p) = h(x,ll) Then !/J("o, Po) = 0 and
Xj
for 2
~ j ~
fo~ j ~
n, where the
I/s
= ( 8/; )
( 8!/Ji ) 8x)
~.) ... *
2$i,)$"
8xj
2. Define!/J: Rn+1
Rn- I
are the coordinate functions.
_ id 2$i,j$n
--+
6.2 SADDLE-NODE BIFURCATION IN HIGHER DIMENSIONS
211
is invertible at (XU, ~). Therefore we can solve for (X2,' .. ,xn ) in terms of (XI,IJ),
{}1/J
and -a (XU,~) Xl
= 0,
arp
so -a (a,~) Xl
= 0 where a = (xuh-
Also, note that ",-1(0) is
not an invariant manifold, but we can use it in the same manner as the invariant center manifold in the previous proof. Writing a for (xoh, define
Then g(O,llO) = 0 is a fixed point, and
ag aft -a (a,~) = -a (a + a,rp(a + a,~),~) a
Xl
all
arp
vXi
Xl
+ (,,-(a + a,rp(a + a'~)'~)){-a (a + a,~)), and at (0, ~),
:!(O,~) = because aaft Xj
(XU,~)
= 0 for
i ~
1
2. Thus 9 satisfies the first two conditions for a
saddle-node bifurcation. Taking the second derivative with respect to a,
because
arp -a (a,lJo) = 0 and Xl
aft
"- (XU,~)
V~j
= 0 for J. ~ 2.
The final assumption on 9 for a saddle-node bifurcation involves the derivative with respect to IJ. But
ag
aft
aft
arp
IJ
IJ
Xi
IJ
-a (O,lJo) = -a (xu,~) + (-a (xu'~))-a (a,~) aft
= alJ (xu,~) ¥ 0 because aaft
Xi
(xu.~) = 0 for i ~ 2.
VI. BIFURCATION OF PERIODIC POINTS
218
Thus 9 has a saddle node bifurcation, and we can solve for /J., /J. = m(s) such that 9(S, m(s» == s. By the previous results, m'(O) = 0 and
This completes the proof using the Implicit Function Theorem.
o
6.3 Period Doubling Bifurcation Consider the quadratic family of maps, F,,(x) = /J.x(l- x). In Chapter II we showed that the fixed points are 0 and p" = 1 - 1//J.. We also showed that p" is attracting for 1 < /J. < 3 and repelling for 3 < /J.. The reason the stability can change at /J. = 3 is that FHp3) = -1, so the fixed point is not hyperbolic. We further showed that for 1 < /J. < 3, all points x E (0,1) have their forward orbit Ft(x) converge to P" as j goes to infinity. Thus there are no points of period two for 1 < /J. < 3. A further calculation shows that for /J. > 3, there is an orbit of period two, which bifurcates off from the fixed point. It can be shown that this orbit is attracting for 3 < /J. < 1 + 61/ 2 , and then repelling for /J. > 1 + 6 1/ 2 . In Section 3.4, we discuss the repeated period doubling bifurcations which takes place as the stable orbit increased in period through 1, 2, 4, 8, ... , 2n , ... . In this section we concentrate on one of these bifurcations, e.g. the one which occurs at /J. = 3 where a fixed point becomes repelling when its derivative equals -1 and a period two orbit is created. The following example is a model problem where the fixed point is always at x = o.
q;,
Example 3.1. Let f,,(x) = -/.LX + ax 2 + bx 3 • Notice that f{(O) = -1. We want to find the points of period two for /J. near 1. See Figure 3.1.
FIGURE 3.1.
The Graph of f~ for (a) /J. < 1, (b) /J. = 1, and (c) /J. > 1
6.3 PERIOD DOUBLING BIFURCATION
219
A calculation shows that
f;(x) = p?x
+ x 2( -al-' + a1-'2) + x 3( -bl-' -
2a 21-' - b1-'3)
+ O(X4).
We want to find zeroes of f'f,(x) - x = O. We know that I~(O) - 0 = 0, since 11'(0) = O. This is reflected in the fact that x is a factor of I~(x) - x. Since we want to find the zeroes of f~(x) - x = 0 other than 0, we define
M(x, 1-') = J'f,(x) - x
x = 1-'2 - 1 + x( -al-'
+ al-'~) + x 2( -bl-' -
2a 2 1-' - b1-'3) + O(x 3 ).
This function vanishes at (x,l-') = (0,1), M(O,I) = O. Also, both the constant term, 1-'2 - 1, and the coefficient of x, -al-' + a1-'2, vanish at (x,l-') = (0,1). Using the fact that 1-'2 - 1 = (I-' - 1)(1-' + 1) ::::: 2(1-' - 1), to lowest terms the zeroes of M(x, 1-') are approximately equal to the zeroes of 0 = 2(1-' - 1) - 2(b + a2 )x 2 , which are 1-'=1+(b+a2)x2, or for bl-' - 12 > O. b+ a2 +a In the proof of the general theorem, this is justified by applying the Implicit Function Theorem. The partial derivatives of Mat (0,1) are M",(O, 1) = -al-' + a1-'211'=1 = 0 and MI'(O,l) = 21-'11'=1 = 2 # O. The Implicit Function Theorem says that I-' can be solved for in terms of x to give zeroes of M, I-' = m(x) with M(x,m(x» == 0 so I;'(",)(x) = x. To justify the approximation made above, we can calculate the derivatives of m(x) by implicit differentiation and show that this gives the lowest order terms given above. Differentiating 0 = M(x, m(x» twice with respect to x gives
±( I-' - 1 ) 1/2
x=
+ MI'(x,m(x»m'(x) and + 2M"'l'm' + MI'I'(m,)2 + Ml'm". M",(O,I) = 0 and MI'(O,I) # 0 in the first equation
0= M",(x,m(x» 0= M"""
Using the fact that gives that m'(O) = O. We use the second equation to determine m"(O). Because M""" is the only second derivative not multiplied by m'(O) (which equals zero), this is the only one we need to calculate. Using the explicit expression for M, M"",,(O,I) = 2( -bl-' - 2a 2 1-' bI-'3)11'=1 = -4b - 4a 2. Then m"(O) = -M"",,(O, 1) - 2M"'I'(O,I)m'(0) - MI'I'(O,I)(m'(O»2 MI'(O,I) 4b+ 4a 2 -0 - 0
2 = 2(b
+ a 2 ).
Thus to get a quadratic shape to the new points of period two, we need to assume that b + a2 # o. In the general theorem, we see that the sign of b + a2 also determines the stability of the period two orbit. Note that -2(a 2 + b) is the coefficient of x 3 in Jl where 1 is the bifurcation parameter value. The above example is fairly general, but it does assume that the fixed point does not vary with the parameter, so
:~ (0,1) = O. In the theorem we allow :~ (xo, 1-'0) # 0 and
show the effect of this term. We also give a condition for the spatial derivative of 1 to vary along the curve of fixed points as the parameter varies. This condition is given in terms of derivatives of f, so it is not necessary to calculate f;(x) to apply the theorem. The bifurcation described in the following theorem is called the period doubling or flip bifurcation.
220
VI.
BIFURCATION OF PERIODIC POINTS
Theorem 3.1 (Period Doubling Bifurcation). Assume that 1 : R2 -+ R is a C r function jointly in both variables with r ~ 3, and that 1 satisfies the following conditions. (1) The point Xo is a fixed point for Il = J.'o: l(xo.J.'o) = Xo. (2) The derivative of 1"0 at Xo is minus one: I;"'(xo) = -1. Since this derivative is not equal to 1, there is a curve of fixed points X(Il) for Il near 1'0. (3) The derivative of 1~(x(Il» with respect to p. is nonzero (the derivative is varying along the family of fixed points):
(4) The graph of I'/.. has nonzero cubic term in its tangency with the diagonal (the quadratic term is zero):
Then tIl ere is a period doubling bifurcation at (xo. J.'o). More specifically, there is a differentiable curve of fixed points, X(Il), passing through Xo at 1'0. and the stability of the fixed point changes at Ilo. (Which side of J.'o is attracting depends on the sign of 0.) There is also a differentiable curve "f passing through (xo. 1'0) so that "f \ {(xo. J.'o)} is the union of hyperbolic period 2 orbits. The curve "f is tangent to the line R x {IlO} at (XO,/lo), so "f is the graph of a function of x, Il = m(x) with m'(xo) = 0 and m"(xo) = -2{3/0 '" O. The stability type of the period 2 orbit depends on the sign of {3: if {3 > 0 then the period 2 orbit is attracting, and if (3 < 0 tllen the period 2 orbit is repelling. PROOF. By the assumptions, I~o(xo) '" 1 so there is a curve of fixed points parameterized by 1-1. X(Il). Moreover,
".f'
want to translate the coordinates so that 0 is a fixed point for all nearby 1-1, so we define
g(y, Il) = I,.(y
+ X(Il»
- X(Il)·
Then, g(O./I) == O. We write g2(y, Il) when we mean g(., 1') 0 g(y, Il). The partial derivatives of 9 with respect to y are the same as those of 1 with respect to x at the corresponding points,
The value of the partial derivative with respect to the position, : : (0. Il), determines the stability of the fixed point, so
[J2 all~ (0, Il) measures the change along the curve of
6.3 PERIOD DOUBLING BIFURCATION
221
fixed points, and
829
8 8/
8"Dy (0, "0) = 8" ax (x(,,), ,,)1,.=,.0
oP/
= a,,8x (xo, /10)
82 /
= 81'8x (xo,/IO)
82 /
+ 8x 2 (xo, /IO)x 1a2 /
,
(/10)
0/
+ 2ax2 (xo, /10) a" (xo,/IO)
=0",0. This calculation shows that 0 measures the quantity described in the statement of the theorem. Let aj(") be the coefficient of yj in the Taylor expansion of 9 about y = 0, so g(y, II) =
al (,,)y
+ a2(,,)y2 + a3(,,)y3 + O(y4).
A direct calculation then shows that
where we do not exhibit the dependence of the coefficients aj on ". As in the example, we want to find points where g2(y, 1') - Y = that are different than 0. Since y = is always a solution, we divide out by y when y '" 0. So we define
°
°
g2(y, ,,) _ y { M(y,,,) =
a
for y '"
Y 2
8y (g (y, 1'»1 11 -0
Notice that the definition for y = Using the expansion of g2, M(y, 1') = (a~ - 1) + (ala2
-
°
1 for y = 0.
°
is the natural extension of the definition for y '" 0.
+ a2a?)y + (ala3 + 2ala~ + a3a~)y2 + O(y3).
In order to show that the Implicit Function Theorem applies, we note that M(O,/IO) = 0, and the partial derivatives are M\I(O, /10) = ala2
+ a2a~I,.0
= -a2("0)
+ a2(/IO)
=
° and
88g2 M,.(O, 1'0) = 8" Dy (0, ,,)1"0 8 8g 2 = 8" (8y(0,,,») I,.., 8g 8 2g = 2 Dy (0, /10) 81'8y (0, /10) = -20", 0.
Because M,.(O, /10) '" 0, the Implicit Function Theorem applies and there is a differentiable function m(y) such that M(y,m(y» == 0. By implicit differentiation, 0= Mil (0, /10) + M,.(O, /IO)m'(O), so
VI. BIFURCATION OF PERIODIC POINTS
222
To calculate the second derivative of m(y), differentiate the equation 0 = MII(y, m(y» M,,(y, m(y»m'(y) again: 0= Mill/ Evaluating these at y
+
+ 2MII"m'(y) + M",,(m,)2 + M"m".
= 0 where m'(O) = 0 gives m"(O) = -MIIII(O,~o). M,,(O,~)
Thus we need to calculate the numerator,
MIIII(O,~) = 2(ala3 + 2ala~
+ a3a~)I"o
= 2( -a3 - 2a~ - a3)IPo = -4(a3 = -4/3
+ a~)I"o
i: o.
Therefore m"(O) = 4/3 = _ 2/3 i: O. -2a a lfIg2 It can also be checked that 8y3 (0, ~o) = 3MIII/(0, ~o) = -12/3
i: O.
This leaves only the stability of the period 2 orbit to check. For this we use the 8(g2) Taylor expansion for """OJ/(y,m(y)) about y = 0 and ~ =~:
We have already calculated the value of several of these coefficients:
(by the chain rule),
8;;::) (O,~) = 0 as
is noted in the calculation of MI/(O, ~), and
so
~~~ (O,~)(m(y) -~) = M,,~mll(0)y2 + O(y3) =
M,,~C::III1)y2
= 2/3y2.
"
6.4 ANDRONOV-HOPF BIFURCATION FOR DIFFEOMORPHISMS
223
Finally, 1 (J3(g2) (1) 2 3 2--;3y3(O,Jlo) = 2 6(ala3 + 2ala2 + a3 al) = -6{3.
Combining these terms,
a~:) (y, m(y»
= 1 + 2{3y2 _ 6{3y2
= 1 - 4{3y2
+ O(y3)
+ O(y3).
Thus, if {3 > 0 the period 2 orbit is attracting, and if {3 < 0 then it is repelling. This finishes checking all the conditions of the theorem. 0
6.4 Andronov-Hopf Bifurcation for Diffeomorphisms As stated in the introduction to the chapter, the Andronov-Hopf bifurcation for a diffeomorphism occurs when a pair of eigenvalues for a fixed point changes from absolute value less than one to absolute value greater than one, i.e., the fixed point changes from stable to unstable by a pair of eigenvalues crossing the unit circle. With further conditions on derivatives, it follows that an invariant closed curve bifurcates off from the fixed point. The motion on this invariant curve is a rotation, whose rotation number starts near the value determined by the eigenvalue. The formal statement of the theorem requires some notation and preliminary change of variables. We do this before we state the theorem. Assume that ~ : R2 x R -+ R2 is a one-parameter family of C r diffeomorphisms which satisfies the following conditions. (We do not state a version which allows the fixed point to vary with the parameter.) (a) Assume r ~ 5. (b) Assume that the origin is a fixed point of ~I' for I-' near 0: ~I'(O, 0) = (0,0). (c) Assume that D(~I')(o,O) has two non-real eigenvalues, >'(1-') and X(I-'), such that d 1>'(0)1 = 1 and dl-' 1>'(1-')1 '" o. By a change of parameter, we can assume that
1>'(1-')1 = 1 + 1-'. (d) By a change of basis on R2 (which depends on 1-'), we can assume that
(e) We further assume that >.(o)m = eim/3(O) '" 1 for m = 1,2, ... ,5. This means that >'(0) is not a low root of unity (in addition to not being equal to ±1). (f) Because >'(0) is not a low root of unity, there exists a change of coordinates that bring ~I' into the form
where in polar coordinates,
We make the assumption that !teO) '" o. Thus, the radial component of the map is a nonlinear function of r. (Notice that (1 +I-')r- h(l-')r3 = r has a solution r2 = 1-'!t(1-')-1 for those I-' with 1-'!t(I-')-1 > o. Thus the normal form terms have an invariant circle for I-' on one side of I-' = 0.) The bifurcation described in the following theorem is called the Andronov-Hopf Bifurrotion for diffeomorphisms.
224
VI. BIFURCATION OF PERIODIC POINTS
Theorem 4.1 (Andronov-Hopf Bifurcation). Assume ~,,(x, y) satisfies assumptions (a) - (f). Then for all sufficiently small,.,. with ,.,.11{,.,.)-1 > 0, ~" has an invariant closed curve surrounding the fixed point (O,O) of radius approximately equal to [J.l111 (11)]1/2. FUrther, if 11 (0) > 0 then the closed curve is attracting, and jf IdO) < 0 then it is repelling. REMARK 4.1. This theorem was proved by Naimark (1967), Sacker (1964), and Ruelle and Takens (1971). Also see Marsden and McCracken (1976) and Carr (1981). The name of Andronov-Hopf Bifurcation is commonly used because of the connection with the Andronov-Hopf bifurcation for differential equations given in the next section. REMARK 4.2. The map on the invariant closed curve can vary. This map is most likely not conjugate to a rigid rotation for most parameter values. As J.I approaches 0 the rot.ation number is approximately {3(/1)/{21r). REMARK 4.3. The proof of this theorem is more difficult than those for the other bifurcations which we treat. The reason that it is harder is that we need to find a whole closed curve of points and not just a single point. For this reason we can not use the Implicit Function Theorem to prove this theorem, but must apply a contraction mapping argument to a set of potential invariant curves, and the construction is much more involved and delicate. Because of these complications we do not give the proof but refer the reader to the references. In the next section we do prove the simpler HopfAndronov bifurcation for differential equations where we can use the Poincare map in polar coordinates from () = 0 to itself and reduce the problem to finding a fixed point of this map.
6.5 Andronov-Hopf Bifurcation for Differential Equations As stated in the introduction to the chapter, the Andronov-Hopf bifurcation for a system of differential equations occurs when a pair of eigenvalues for a fixed point changes from npgative real part to positive real part, Le., the fixed point changps from stable to unstable by a pair of eigenvalues crossing the imaginary axis. With further conditions on derivatives, it follows that a periodic orbit bifurcates off from the fixed point. Notice that for a differential equation, the Andronov-Hopf Bifurcation gives a periodic orbit and not just some invariant closed curve. This difference makes the analysis much simpler in this case. We proceed to make some assumptions and constructions before we state the main bifurcation theorem of the section. \Ve consider a one parameter family of differential equations
x=
I{x,,.,.) = I,,(x)
with x E ]R2 that satisfies the following assumptions. (1) The origin is a fixed point for all values of J.I near 0: I(O,J.I) = o. (2) The eigenvalues of D(f,,)o are o{,.,.) ± i{3{,,) with 0(0) = 0, {3(0) = f30 =F 0, and 0'(0) =F 0, so the eigenvalues are crossing the imaginary axis. The last assumption of the theorem involves the Taylor expansion of the differential equations, with a condition given on a combination of the coefficients when they are expressed in polar coordinates. Therefore, after we make some preliminary change of coordinates, we indicate in a lemma the form of the equations when transformed into polar coordinates. By the Implicit Function Theorem, since 0'(0) =F 0, the parameter can be changed so that 0(11) = IJ. We use this new parameter. Then there is a change of basis on R2 such that
6.5 ANDRONOV-HOPF BIFURCATION FOR DIFFERENTIAL EQUATIONS
225
with
and
) _ (BHxl'X~'JJ)+BHxl'X~.#~)+O(IXI4)) B~(XI,X2,JJ) + B~(XI,X2,JJ) + O(lxI4)
F(x
,JJ -
where Bj(X('X2,JJ) is a homogeneous polynomial of degree j in Xl and X2. Next, we transform the equations to polar coordinates and obtain the form stated in the following lemma.
Lemma 5.1. Consider the differential equ8tiollS (.) when expressed in polar coordinates, Xl = reos(O) and X2 = rsin(O). Then
r
+ r2C3(0,JJ) + r3C4(0,JJ) + O(r4) !3(JJ) + rD 3(O, JJ) + r2 D 4(O, JJ) + O(r3)
= JJr
{J =
where Cj{-,JJ) and Dj(',JJ) are homogeneous polynomials of degree j in sin(O) and
eos( 0). In fact, C 3(0,JJ) = eos(O)B~(eos(O),sin(O),JJ)
+ sin(O)B~(cos(O),sin(O),JJ) + eos(O)B~(cos(O), sin(O), JJ),
D3(0, JJ) = - sin(O)B~ (eos(O), sin(O), JJ)
where Bj{-'" JJ) is the homogeneous term of degree j in terms of Moreover,
PROOF.
Xl
and X2 of XA;.
Taking the time derivatives of the equations which define polar coordinates,
we get
where Bj are functions of rcos(O), rsin(O), and JJ, Bj(rcos(O),rsin(O),JJ). Factoring out r from the Bj (and remembering that Bj is homogeneous of degree j), we get the form of the statement of the lemma. To check the integral of C3(O,JJ), notice that C3 (O, JJ) is a homogeneous cubic polynomial in sin(O) and cos(O) so that
226
VI.
BIFURCATION OF PERIODIC POINTS
o (3) Using the coefficients defined in Lemma 5.1, we define
and make the assumption that K I- O. The significance of assumption (3) is best understood in terms of a "normal form." With assumptions (1) and (2), there is a change of coordinates R = r + ul(r,O,p) in terms of which the differential equations become the following:
R=
pR + K R3
+ O(~)
9 = fj(p) + O(R). See Section 3.2 in Carr (1981). Thus the fact that K I- 0 means that the R equation has a nonzero cubic term and so has an invariant closed curve of approximate radius (-Ill K)1/2 (almost a circle of this radius). This situation is similar to the discussion we gave for the diffeomorphism case. In the following proof we avoid the use of the normal form (so we do not need to verify it), but merely build the necessary construction into the proof. In the following theorem we use the radius of the solution at t = 0 as the parameter and denote it by E. Thus we find T(E)-periodic solutions in time, x"(t, p(£», such that their initial conditions in polar coordinates are given by r"(O, pte»~ = e and 0"(0, pte)) = 0, for some parameter p(E) which is a function of E, and where the period T(e) is a function of E. Thus the period and the parameter value for which the periodic orbit occurs are functions of the approximate radius of the periodic solution. The bifurcation described in the following theorem is called the Andronov-Hop! bifurcation for flows.
Theorem 5.2 (Andronov-Hopf Bifurcation). Make assumptions (1) - (2) on the differential equation, x = I(x,p). (a) Then there exists an EO > 0 such that for 0 :5 £ :5 EO, there are (i) differentiable functions Il(E) and T(E) with T(O) = 27r/{30, p(O) = 0, and p'(O) = 0 and (ii) a T(E)periodic function oft, X"(t,E), that is a solution of(*) for the parameter value p = PtE) and with initial conditions in polar coordinates given by r" (0, e) = € and 0" (0, €) = O. In fact, for all t, r·(t, £) = e +0(£). (Uniqueness) Further, there are po> 0 and 60 > 0 such that any T-periodic solution x(t) of(.) with Ipl :5 PO, IT-27r/.Bo1 :5 60, and Ix(t)1 :5 60, must be x"(t,p) up to a phase shift, i.e., x(t + to) = x"(t,p) where p = 1l(lx(to)i) and to is chosen so that the polar angle 0 is zero for x(to), Otto) = O. (b) If we also make assumption (3), then not only is p'(O) = 0 but also Ill/(O) = -2K I- O. (This means the periodic solutions occur for p on one side of 0 with the side determined by the sign of K.) fUrther, the periodic solution is attracting if K < 0 and is repelling if K > O. See Figure 5.1. REMARK 5.1. Examples of this bifurcation are found in the work of Poincare. This theorem was explicitly stated and proved by Andronov (1929). Also see Andronov and Leont
6.5
ANDRONOV-HOPF BIFURCATION FOR DIFFERENTIAL EQUATIONS
227
(b)
(a)
(c)
FIGURE 5.1. The Phase Portraits of the Andronov-Hopf Bifurcation with < for (a) Il < 0, (b) Il = 0, and (c) Il >
K
°
°
REMARK 5.2. The proof given below is a combination of those found in Carr (1981) and Chow and Hale (1982). PROOF OF THEOREM 5.2(a). We first determine the rate of change of r with respect to 8 to determine the Poincare map. By using the equations in polar coordinates and dividing r by 9, we obtain dr Il d8 = /ir
+r
2
1 Il [/iC3 (8, Il) - {32 D 3 (8, Il) 1
+ r 3[1/iC4 (8, Il) - f321 C 3 (8,Il)D 3 (8,1J) - ;D4 (8,1l)
+ ;3 D3(8,1J)2j
+ 0(r 4 ). (In this and subsequent equations, 0(r4) is really a remainder term in an expansion, so it can be differentiated to give a term ofthe next lower order: (Oi /fJr i )0(r4) = 0(r4-i) for 1 ::; j ::; 4.) Let r(8, E, IJ) be the solution for the parameter value IJ with r(O, E, IJ) = E. If E = then r(8, 0, IJ) ;: 0, so this gives E = as a fixed point for r(21r, E, IJ). Since we want to find nonzero fixed points of the Poincare map, we define
°
°
=
F( E,IJ ) - r(21r, E, IJ) E
E
.
We want to apply the Implicit Function Theorem to continue a zero of F. In fact, we show that F(O,O) = 0, F.(O,O) = 0, and F,..(O,O) i' 0. To verify these conditions, we need to show that r(21r, E, Il) -E = 0(E2), and r(21r, E, 0) -E = 0(E 3). Such estimates can probably be proved using Gronwall's Inequality applied to the equation (d/dB)e-",'/fJr = 0(r2). Instead of using this approach, we scale the variable r by E, defining r = fP, and derive estimates on p. Thus we define r = EP, and let p(8,E,IJ) be the solution with p(O,f,lJ) = 1 (which corresponds to r(O,E,IJ) = f). We need to verify that p(21r,E,IJ) - 1 = 0(£). The
VI.
BIFURCATION OF PERIODIC POINTS
differl"ntial equation, dp/dO, is given by
dp I' 2 1 I' dO = ~P+EP [~C3{O,I')- tpD3{O,I')]
+ E2 P3 [1~C4{B,I') -
-;2
1 C3{B,I')D3{B,I') tp
D4{B,I')
+
;3
D 3 {B,I')2]
+ O{E 3p4). We trl"at these equations as a perturbation of the linear terms. Using the integrating factor e-,..fJ/o, the derivative of e-,..9/0 p is as follows:
Integrating from 0 to 21T, we obtain
e- 2",../O p{21T,E,I') - 1
Jor
= E
h
1 I' e-"'1I/0p{O'E'I')2[~C3{O,I') - ,62D3{O,Il)] dO
+ E212" e-,..1I/0 p{O, E,1')3 [~C4{O,I') -
-;2
D4{O, 1')
~2 C3{O,I')D3(B,Il)
+ ;3D3{O, 1')2] + O{E 3p4{O,E,I'»dO
== Eh{E, Il). Thus
F(E,I') = p{21T,E,Il)-1 = [e 2",../o -IJ + Ee 2""'/Oh{E,I'). At (E. 1') = (0,0). F(O,O) = 0 and F. (0 0) = (21T)e 2",../o _ (21T1')(0,6)e 2""'/O
"",6
tp
01'
+ E~ [e 2",../13 h{E, 1')]1 .-0.,..-0 _ _ 01' = 211"
t30
f: O.
Therefore we can solve for I' as a function of E, I'{E), such that F{E,I'{E» == O. These points are periodic orbits with 00 = 0, p{O,E,I'{E» = 1 so r{O,E,I'(t:» = E, and p{211", E,I'(E» = 1 so r(211", E,I'{E» = E. We find the derivative of I'{t:) by implicit differentiation:
F.{O, O)
+ F,..(O, 0)1"(0) = o.
6.5
ANDRONOV·HOPF BIFURCATION FOR DIFFERENTIAL EQUATIONS
229
Thus we need to calculate F.(O, 0). Note that F(E, 0) = O+Eh(E,O) 1
r2w
= E10
p(9,f,0)\'30)C3(9,0)dB r 2w
1 1 {P(9,E,0)3[(.eo)C4 (9,0) - (J3g)C3(9,0)D3(8,0)]
+E210
+ O(E3p4(9,E,0»)} d9. Using the fact that p(9,0,0) == 1,
r 1 r = fJo 10 1
F.(O, 0) = .eo 10
2w
p(9,0,0)2C3 (9,0)d9+0
21<
C 3 (9,0)d9
=0. The integral is zero because C3(9, 0) is a homogeneous polynomial of odd degree in sin(9) and cos(9). Using that 1"(0) = -F.(O,O)F,,(O,O)-l, F.(O,O) = 0, and F,,(O,O) '" 0, we get that 1"(0) = 0. This completes the proof of part (a) of the theorem. 0 PROOF OF THEOREM 5.2(b). By part (a), there is a function I'(E) such that F(f,I'(E)) == 0, 1"(0) = 0, and = F.(t,I'(E» + F,,(E,I'(f)}I"(f). Differentiating this last equation again with respect to E and evaluating at (E,I') = (0,0) gives
°
0= F.. (O, 0) + 2F." (0, 0)1"(0) + F",,(O, 0)1"(0)2 + F,,(O, 0)1'''(0), so
1'''(0)
= _ F•• (0, 0) F,,(O,O) .eo = -(2.,..)F.. (0,0).
Thus we need only show that F•• (O,O) = (4.,../.eo)K in order to verify the derivative given in the theorem. In the calculation of F.. (O, 0), we can fix I' = and differentiate F(E,O) twice. Using the expression for F(E,O) given in part (a),
°
1
F.(E,O) = (.eo)
121< 0 P(9,f,0) 2C3(9,0)dB
r
2w
E
+ (.eo) 10
r
2w
+2E 10
+ 0(E2 ), and
8p 2p(9,E,0) 8E(9,E,0)C3(9,0) dB
1 1 P(9,E,0)3 [(.eo)C4 (8, 0) - (J3g)C3 (8,0)D 3 (8,0)] dB
VI. BIFURCATION OF PERIODIC POINTS
230
because p(8, 0, 0) == 1. Thus it is enough to show the last integral, which we denote by I, is zero. Since (or IJ = 0,
dp d8 (8, e, 0)
= ( f30e ) p(8, e, 0) 2 C3(8, 0)+ O(e 2 ),
the derivative with respect to e gives (the first variation equation)
Integrating from 0 to 8 gives
{)p ({)e)(8,O,O)
1 {' = (130) Jo C3(8,O)d8
= (63 (8,0»
-
130
.
Thus the integral I is given as follows:
2
I = (f3~)
=
{2ft
Jo
263(8,O)C3(8,O)d8
(;~)[63(2'11",O)2 -
6 3(0,0)2]
=0 because C3 (8,O) is a homogeneous polynomial of odd degree in sin(8) and c08(8). This only leaves the stability of the periodic orbit to be determined. For f30 > 0, the Poincare map, P,.(e), is given by P(e,lJ) r(2'11",e,lJ) ep(2'11",e,IJ). (For f30 < 0, the Poincare map is given by P(e,lJ) r( -2'11", e,lJ), and P(e,IJ)-1 r(2'11", e,IJ). We do not indicate the changes (or this case.) The derivative with respect to e, which is used to determine the stability, is given by
=
=
=
=
The Taylor expansion in e and IJ is given by
p(21r,e,lJ)
= F(e,IJ)+ 1 = 1 + F.(O, O)e + F,.(O, O)IJ 1
2
1
2
3
+2"F«(O,O)e +F.,.(O,O)elJ+2F,.,.(O,O)1J +O(I(e,IJ)I) and
"(0) 1 p(2'11", e,lJ(e» = 1 + 0 . e + F,.(O, 0)[Te2 + 0(e 3») + 2F«(O, 0)e 2 + 0«3) 2 1 '\ 2 ( 3) ( )( -F«(O,O» =l+F,.O,O 2F,.(O,O) e +2Fulll,lJ)' +Oe
= 1 + 0(e3 ).
6.6 EXERCISES
231
Next,
and
Combining,
p,.(£)
= 1 + F•• (0, 0)£2 + 0(£3) = 1+
(~)K£2 + 0(£3).
Therefore if K < 0, then 0 < P;(£) < 1 for small £ and the orbit is attracting. Similarly, if K > 0, the orbit is repelling. (When /30 < 0, it is still the case that the orbit is attracting for K < 0 and repelling for K > O. We leave this verification to the reader.) This completes the proof of the theorem. 0
6.6 Exercises Bifurcations for Maps 6.1. Let F,,(x) = I-' + x 2 for x E R. (a) Find the point and parameter value where there is a saddle-node bifurcation of fixed points. Verify the assumptions of the theorem. (b) Find the point and parameter value where there is a period doubling bifurcation from a fixed point to an orbit of period two. Verify the assumptions of the theorem. 6.2. Let FAB(X,y) = (A - By - x 2,x) be the Henon family of maps. Prove that FAB undergoes a saddle-node bifurcation when A = -[(B + 1)/2)2.
6.3. Let f,,(x) = I-'X - x 3 for x E R. (a) Find the fixed points. Note that the bifurcation at I-' = 0 is not one we have studied. It is called the pitchfork bifurcation, and takes place naturally in systems with a symmetry 1,,( -x) = - f,,(x) for alII-'. (b) Show that there is a period doubling bifurcation at I-' = 2. (Verify the conditions of the theorem.) 6.4. Assume that f,,(x) = - f,,( -x) for alII-' where x E R. (a) Prove that 1,,(0) == 0 and 1;(0) == o. (b) Assume 1:"'(0) = 1, I;~(O) '" 0, and :/~(O)I,,=I'O '" O. Prove that f,,(x) undergoes a pitchfork bifurcation like the example in the previous exercise. 6.5. Assume I : R" x R -+ Rn is C 3 and satisfies the following conditions. (1) There is a Xo E R" and 1-'0 E R such that f(Xo, 1-'0) = Xo. (2) The derivative of fl'O at Xo has eigenvalues '\l(JLO) = -1 and .\;(1-'0) for 2 ~ j ~ n with 1.\;(1-'0)1 '" 1. Let vi be the right eigenvector for the eigenvalue .\dl-'o) = -1 of D(fI'O)"'" (3) Let x(JL) be the curve of fixed points of I,.. Let .\;(JL) be the eigenvalues of D(f,,)x(,,)' Assume d dl-'.\l (1-')11'0 '" O. (a) Prove that there is a curve of points of period 2 bifurcating off from (Xo, 1-'0) in Rn x R, i.e., there is a differentiable curve "f passing through (xO,I-'o) so that
232
VI. BIFURCATION OF PERIODIC POINTS
1 \ {( %0, I'o)} is the union of period 2 orbits. The curve 1 is tangent to the line < vI> x{l'o} at (Xo,l'o). For parts (b) through (d), assume Xo O. Let w be a left eigenvector for the eigenvalue -1 of D(f,.o)o and let 11' : IR n -+ R n- I be the projection along vI onto (v 2, ... , vn). Using coordinates with vI along the zl-axis, and I/J(x, 1') = lI'[/~(x)-x], construct cp(XI,I') such that I/J(ZI,cp(ZI,I'),I') == and let 9(ZI,I') be defined as in the proof of Theorem 2.1. (b) Prove that
=
°
::~(0'1'0)=WD2(f,.o)(VI'VI)
and
::~ (0, 1'0) = w D3(f,..)(v l , vI, vi) + 3 W D2(/,..)(v l , :28~ (0,1'0»' (c) Prove that
82~ {} 8 (0,1'0) = -[11' D 2(f,..) -
WI 88 22 (f;.)(0). %1
I,..
I;.
(d) Use parts (b) and (c) to write the conditions on the derivatives of and which insure that the orbits of period 2'are on one side of I' 1'0. 6.6. Let FAB(Z, y) (A - By - z2, %) be the Henon map. Fix B Bo. (a) Show that FAB. has a fixed point with one eigenvalue equal to -1 (and the other eigenvalue equal to -Bo) for A 3(1 + Bo)2/4. 3(1 + Bo)2/4 (b) Prove that FAB. undergoes a period doubling bifurcation at A as A varies using the conditions derived in the last exercise. (Verify at least the conditions of part (a).) Bifurcations for Differential Equations 6.7. Let
= =
=
=
=
z=y iI = -z + I'Y + ayl. (a) Show that this system satisfies the eigenvalue conditions for a Andronov-Hopf bifurcation at the origin for I' 0. (b) Find a (right handed) basis (of eigenvectors) which changes the linear terms at the origin to the system
=
= IIU - 1V v = 1U + IIV
Ii
for the appropriate choice of II and 1. Also, calculate the nonlinear equations in terms of these variables. (c) Express in polar coordinates the nonlinear equations found in part (b), where r2 u 2 + v 2 and tan 9 v/u. (d) Find the constant K (used in the statement of the Andronov-Hopf bifurcation theorem), and show that it is nonzero. Here a '" 0, but it can either be positive or negative. Hint: C3(0,9) D3(0, 9) 0. 6.8. Consider the system of differential equations given by
=
=
=
=
z =a - (1 + 6)% + Z2 y iI = 6z - ~21J.
6.6 EXERCISES
This system of equations is a model of a certain chemical reaction and is called the 'Brusselator'. (a) Prove that (a, b/a) is the unique fixed point. (b) Prove that the fixed point is stable for b < 1 + a2 and unstable for b > 1 + a 2 with a pair of complex eigenvalues crOSBing the imaginary axis at b = 1 + a2 • (It can be further shown that a Andronov-Hopf bifurcation to a stable periodic orbit occurs at b = 1 + a 2 • See Prigonine and Lefever (1968) and Lefever and Nicholis (1971).) 6.9. Consider the following systems in polar coordinates, where in each case 9 = 1. In each case, find the periodic orbits and draw the phase portrait (in Cartesian coordinates) for JI. < 0, JI. = 0, and JI. > O. Indicate which conditions of the full Andronov-Hopf Theorem are not true (if any). (a) r = r(Jl.2 - r2). (b) r = Jl.r(l + r2). (c) r = r(JI. - r2)(4J1. - r2). (d) r = r(JI. - r 4 ). 6.10. Consider the Lorenz system of differential equations, :i;
= -lOx
iJ = . Z
+ lOy,
px - y-xz,
8 = 3z+xy.
(a) Find the fixed points, 0, a±. (b) Find the eigenvalues at the fixed point O. (c) The rest of the problem deals with the eigenvalues at the fixed points a±. (The eigenvalues are the same at ±a.) In particular, part (h) asks the reader to verify that a pair of complex eigenvalues crOSB the imaginary axis (some of the conditions for a Hopf bifurcation) at a parameter value PI which is found in part (f). To start this process, we ask the reader to find the characteristic polynomial at a±. Show that the characteristic polynomial at a± is given by
ptA) = A3
41
8
160
+ ("3 )A2 + 3(10 + p)A + 3'(p - 1).
(d) Show that ptA) has a negative real root. (Note that p(O) > 0 and all the coefficients are positive.) (e) For P 2': 14, show that ptA) is a monotonically increasing function of A, so it has exactly one real root, >.0. Thus the eigenvalues are Ao and a ± iP with P .;: o. (There is a pair of complex eigenvalues for smaller P as well, but it is not as easy to show.) (f) Find a parameter value PI for which a = O. Hint: Note that .>.0 + 2a = -41/3, where 41/3 is the coefficient of A2. So find P = PI such that Ao = -41/3 is a real root. (g) Show that Ao > -41/3 for P < PI. and .>.0 < -41/3 for P > Pl. (h) Show that the complex eigenvalue a + iP crosses the imaginary axis as P increases through Pl. Note that this shows that the Lorenz equations satisfy some of the conditions for a Hopf bifurcation at the fixed point a for P = Pl.
CHAPTER VII
Examples of Hyperbolic Sets and Attractors In this chapter, we return to consider examples with complicated invariant sets. We introduce the idea of a hyperbolic invariant set and show that not only periodic orbits can have stable and unstable manifolds, but that a hyperbolic invariant set also has a family of stable manifolds. We give a number of different types of examples which are hyperbolic and give a method to show that the map is topologically transitive on the invariant set. One very important type of example arises from the intersection of stable and unstable manifolds of a saddle periodic orbit. This gives rise to an invariant set called a Smale horseshoe. It is very similar to the invariant set which we found for the quadratic map on the real line. Another important type of hyperbolic invariant set occurs where all nearby orbits tend toward the invariant set. Such an invariant set is called an attractor (with further conditions added). Thus an attractor is like a periodic sink where the invariant set itself is more complicated topologically. The final cl_ of examples we consider are those with only a finite number of periodic points, called Morse-Smale systems. For these systems we make a connection with the Lefschetz theory through the Morse-Smale inequalities.
7.1 Definition of a Manifold In this chapter, we consider diffeomorphisms (or flows) on a manifold M having a dimension greater than or equal to two. A manifold is merely a set on which there are local coordinates that make it a Euclidean space. We have already seen examples of manifolds: (i) the circle which is represented by 71": RI -+ Sl, 7I"(t) = t mod 1, and (Ii) local stable and unstable manifolds which are represented as graphs, (1' : E'(r) -+ E"(r) and W:(O) = ({x,(1'(x» : x E E·(r)}. We now give a more formal definition of a manifold. Definition. A c r n dimensional manifold M is a second countable metric space together with a collection of homeomorphisms CPo : Va eRn -+ Uo c M for Q in some index set A such that (i) cp(Vo) = U (Ii) {UO}OEA is an open cover of M, and (iii) if Uo n Up -F 0, then Q '
CPo,p = cp;;1 0 CPo : cp;;l(Uo n Up) C Va
-+
cp;;l(Uo n Up) c Vp
is a C r diffeomorphism between open subsets of Rn. One of the allowable maps CPo : Va C R" -+ Uo c M is called a coordinate chart on M. Example 1.1. For the circle, we can use the map 71" : R -+ Sl to induce homeomorphisms on open intervals 10 of length less than one. If 10 and Ip are two such open intervals with 71"(10) n 71"(113) -F 0, Uo = 71"(10)' and Up = 71"(113), then 71"0.13 = (71"1113 )-1 071": (7I"110)-1(U0 n U13 ) -+ Ip
is given by 7I"o,l3(t) = t + j for some integer j. Therefore 7I"o.P is Coo and the collection of these maps clearly satisfies the conditions of the definition of " ",anifold. 235
VII. EXAMPLES OF HYPERBOLIC SETS AND ATTRACTORS
236
REMARK 1.1. In the case of a local stable manifold, we allowed it to be the graph of a function between Banach spaces. We do not make the formal definition of a Banach manifold, but it is much the same as that given above for finite dimensional manifolds. See Lang (1967). REMARK 1.2. The local stable or unstable manifolds of a hyperbolic fixed point of a diffeomorphism can be represented as a graph so it is clearly a manifold. The global stable and unstable manifolds do not always satisfy the full conditions stated above for a manifold. The problem is that the global stable manifold can accumulate on itself. (See the examples of the Smale horseshoe and the toral An080V automorphism given in Sections 7.4 and 7.5.) More specifically, let / : M -+ M be a C'" diffeomorphism with a hyperbolic saddle fixed point p. It is always possible to define a C'" map", : JFi, -+ M such that (i) '" is one to one, (ii) '" is onto W'(p), and (iii) the derivative of", at each point is an isomorphism. (See the discussion of the derivative of maps into a manifold below.) However, the map", does not always have a continuous inverse (Le., '" is not a homeomorphism). so W'(p) is not a manifold in the full sense defined above. A C r map '" : N -+ M from one manifold into another is called an immersion provided the derivative of '" at each point is an isomorphism. The image of a one to one immersion is called an immersed submani/old. If an immersion is a homeomorphism then it is called an embedding and its image is called an embedded submanifold. (Some people require that an embedding is also proper, Le., the inverse image of a compact set is compact.) See Hirsch (1976) for discussion of these concepts. REMARK 1.3. A one dimensional manifold is usually called a curve; a two dimensional manifold is usually called a surface.
=
Example 1.2. The n-torus 1['n is the product of n copies ofthe circle. 1['n Sl x· .. X Sl. To define a map from IR n onto 1['n. we first define an equivalence on IR n by (ZI,' ..• zn) (J1l •...• lin) if ZJ = 11; mod 1 for all j. Define 11' : IR n -+ 1['n by letting 1I'(x) be the equivalence class of x under -. The reader can check that this gives 1['n the structure of a Coo manifold.
=
=
Example 1.3. Let sn {x E IR"+I : Ixl I} be the n-sphere. To show that sn is a manifold. we represent pieces of sn as graphs. Let D n be the open unit ball in IRn. For 1 :5 j :5 n. define D n -+ sn by
",t :
",1(YI •...• lin) = (111 ..... lIj-I, ±(1 - 1I~ - .. , - 1I~)1/2, lIj,' .. , lin).
"'1
Each of these maps is a homeomorphism onto a "hemisphere" of sn. We leave to the exercises the verification that these maps gives sn the structure of a Coo manifold. See Exercise 7.1.
Example 1.4. A common method to specify a manifold is as the level set of a function. Let F : lRn+l -+ lR be a C'" function for some r > 1. Assume c E IR is a value such that for each p E F-I(c), DFp -:F O. Le .• DFp has ;-ank one, or some partial derivative of F is nonzero at p. Let M F-I(c). For each p EM, the Implicit Function Theorem proves that there is a neighborhood Up of p and a C'" function up : Vp C lR" -+ IR such
=
that the graph up is onto Up. More specifically. if 88F (p) -:F 0, there there is an open set Vp in lR n and a C'" function up : Vp Up
Zj -+
IR such that
= {(1I1 •...• lIj-l. Up (lIl •... , lin). lIj,"
., lin) : (111, ...• lin) E Vp }
is a neighborhood of p in M. These graphs give M the structure of a C r manifold. This example is a generalization of the n-sphere of the last example.
7.1.1 TOPOLOGY ON SPACE OF DIFFERENTIABLE FUNCTIONS
237
See Guillemin and Pollack (1974), Hirsch (1976), Chillingworth (1976), or Lang (1967) for more details and examples of manifolds. The only explicit examples of manifolds that we use are Euclidean spaces (a trivial example), tori, and spheres. Given the definition of a differentiable manifold, we can define a differentiable map between two manifolds. Definition. Let M and N be two c r manifolds for some r ~ 1. Assume I : M - 1 N is a continuous map. We say that I is r time. rontinuousl1l differentiable, or cr, provided for each point P E M and coordinate charts tpo : Va - 1 Uo c M and tpp : Vp - 1 Up eN at P and I(p), respectively (i.e., p E Uo and I(p) E Up), tppl 0 I 0 tpo is differentiable at tp;;l(p). Note that iftpplol0tpo is C r at tp;;l(p) for one pair of coordinate charts, and tpo' : Vo' -+ Ua' C M and tpp' : Vp' - 1 Up, eN is another pair of coordinate charts at p and I(p) then tpp.1 0 I 0 tpa' is cr at tp;;}(p) because both tpo,a' and tpfJ,fJ' are cr. The set of all c r maps from M to N is denoted by cr(M,N). A C r maps from M to AI is a diffeomorphism provided it is one to one, onto, and the derivative at each point (in local coordinates) is nonsingular. The set of all C r diffeomorphisms on M is denoted by DiW(M).
7.1.1 Topology on Space of Differentiable Functions In this chapter and the next we consider the structural stability of dilfeomorphisms or differential equations on a compact manifold. The definition of structural stability uses the notion that two functions are close in the C l or cr topology. In this subsection, we give these definitions which are used later. Definition. The definition of the cr distance between functions is easier on the torus, so we start with this case. If I : 1'" - 1 1"', then there is a lift F : Rn - 1 Rn such that (i) 11' 0 F = 1011' where 11' is the projection 11' : Rn - 1 T" defined in Example 1.2, and (Ii) F(x + j) = F(x) + j for all x E Rn and j E Z". If I, 9 : 1'" - 1 1'" are two c r maps with lifts F, G : R" -+ Rn then the c r distance from I to 9 is defined to be dr(f,g) = sup{d(foll'(x),goll'(x)), IIDiFx - DiGxll : x = (X., ••• ,xn ) E R n satisfies 0 ~ Xj ~ 1
for 1 ~ j
~
n, 1 ~ i
~
r}.
In this definition, d is the distance between points on 1"'. The next easiest case to treat is functions from a compact manifold M to a Euclidean space RN. Let {tpj : Vi eRn - 1 Uj c M}f=l be a finite number of coordinate charts with uf-I Uj = M. Let Cj C Uj be compact subsets with Uf-l C j = M. If I,g : M - 1 RN are two cr functions then the cr distance from I to 9 is defined to be dr(f,g) = sup{l/(x) - g(x)I, IIDi(f otpj)OPj'(x) - Di(gotpj)OPj'(x) II :
x E Cj ' 1 ~ j
~ J,
and 1 ~ i
~
r}.
The case of maps between two manifolds is slightly more complicated. Rather than define a distance between two functions, we define a base of neighborhoods for the topology. Assume M and N are c r compact manifolds. Let {tpj : Vi C R" - 1 Uj C M}f=1 be a finite number of coordinate charts on M, and C j C Uj be compact subsets
238
VII.
EXAMPLES OF HYPERBOLIC SETS AND ATTRACTORS
as above. Let {t/Jk : Vk C Rn -+ U;., c N}:=l be a finite number of coordinate charts on N. Let f E Cr(M, N). Each Cj can be broken into finite number of compact subpieces Cj = U;~l C j •t , and an index kU,t) chosen so that f(Cj,t) c U;"(j.t) for 1 ~ t ~ L j and 1 ~ j S; J. For! > 0, let N be the neighborhood of f given by
N
= {g EC'(M, N) : g(Cj,t) C U;"(j,l)'
It/Jk(~,l)
0
f(x) - t/Jk(~.t) 0 g(x)1 < t,
IIDj(t/Jk(~,t) 0 f
0
"'A'j'(x) -
for all x E Cj,t, 1 S;
t
Dj(t/Jk(~,l) 0 9 0
"'j)
S; L j , 1 S; j S; J, 1 ~ i ~ r}.
The CO estimate It/Jk(~,t) 0 f(x) - t/Jk(},t) 0 g(x)1 < ! can be replaced by the estimate d( f (x), g( x» < l using distances between points on the manifold. The set of these base neighborhoods N generate the topology on cr(M, N). REMARK 1.4. Let M and N be manifolds with M compact. The manifold N can be embedded into some (large dimensional) Euclidean space JRL. Using this embedding, it is possible to consider cr(M,N) c cr(M,IRL), i.e., f E cr(M,N) if f E cr(M,JRL) and f takes its values in the subset N C JRL. Using this, the topology on cr(M, N) can be inherited from Cr(M, JRL). Franks (1979) uses this approach.
Definition. A C r diffeomorphism f : M -+ M is called C r structurally stable provided there is a neighborhood N in Cr(M. M) such that every 9 EN is topologically conjugate to f. A C· structurally stable diffeomorphism f is also called just structurally stable. Definition. For flows ",I, t/J t : M -+ M, the C 1 topology measures their derivatives as functions from [O,lJ x M to M. A C 1 flow ",I : M -+ M is called structurally stable provided there is a neighborhood N of ",I in the C 1 topology such that any flow t/JI E N is flow equivalent (topologically equivalent) to ",I. We give the definition of vector fields and the topology on the set of vector fields in the next subsection. For more detail on the C r topology, including the noncompact case, see Hirsch (1976).
7.1.2 Tangent Space We are interested in invariant sets which are more complicated than a single fixed point, The first example is a horseshoe. In Chapter II we saw that the quadratic map had an invariant Cantor set. In two dimensions, the horseshoe is homeomorphic to the product of two Cantor sets. It has some directions which are expanding and some which are contracting. The directions which are expanding and contracting can vary from point to point. These directions of expansion and contraction at a point p can be thought of as infinitesimal displacements at p and are called tangent vectors. A tangent vector at p is also thought of as the derivative of a curve through p: if"> : (-6,6) -+ M is a differentiable curve with ')'(0) = p, then v = ')"(0) is a tangent 'it p. In order to organize these ideas, we need to distinguish vectors at different /lOJ1110, The following definition makes thf>Se ideas precise even on a manifold. Definition. We start t '\ giving the definition of tangent vectors for Euclidean space. Fix a point pER". A tangent vector at p is a pair (p, v) where v E an. This pair (p, v) is often written vp. The set of all possible tangent vectors at p is denoted by TpRn is called the tangent space at p. The tangent space at p is a vector space where
7.1.2 TANGENT SPACE
239
(p, v) + (p, w) = (p, V + w). The disjoint union ofthe tangent vectors at different points is called the tangent bundle or the tangent space of Rn and is denoted by TRn:
TR n = {(p, v) : p = Rn x Rn.
E
Rn and v is a tangent vector at p}
Thus the tangent space of a Euclidean space Rn is isomorphic to the cross product of Rn with itself: the first copy of Rn is the set of the base points at which the vector is situated and the second copy is the set of vectors at a given point, or (p, v) is a vector at p in the direction of v. For a manifold M, we need to specify what we mean by a tangent vector at a point p .. Assume "( : ( -6, 6) c R -+ M is a Cl curve with "(0) = p. Assume <{)o : Vo -+ Uo is a coordinate chart at p. By the above definition of differentiability, <{);;l o"(t) is Cl. In the coordinate chart, the tangent vector determined by"( is given by (<{);;l o "()'(O) = v~. If <{)fJ : VfJ -+ UfJ is another coordinate chart at p, then the tangent vector determined by"( in this coordinate chart is given by (<{)/il ° "()'(O) = v~. Note that v~
= D(<{)/il 0 <{)O)pDV~
where pO = <{);;l(p). These two vector, v~ and v~, should be considered as representatives of the same vector in different coordinate charts because they are the derivative of coordinate representatives of the same curve. The derivative of the curoe on a manifold (or tangent vector to a curoe) is the equivalence class of representatives in different coordinate charts, where v~ '" v~ provided v~ = D(<{)pl 0 <{)o)pov~. A tangent vector at p is a derivative of a differentiable curve through p. The set of all tangent vectors at p is written TpM, TpM
= {vp :
vp is a derivative of a differentiable curve through pl.
For the point p fixed, the set of vectors at p, TpM, forms a vector space (using the addition in anyone of the coordinate charts at p). The disjoint union of the tangent vectors at different points gives the tangent bundle or the tangent space of M which is denoted by T M: TM = {(p, v) : p EM and v is a tangent vector at p}
=
U{p} x TpM. pEM
If M is a manifold and S is a subset of M, we denote the tangent vectors to M at points of S by TsM = {p} x TpM.
U
pES
REMARK 1.5. If M c Rn+/c is a cr manifold, then the tangent space of M at p can be thought of as a subspace of the tangent vectors of Rn+/c at p, TpM C TpRR+Ir.
Definition. With this idea of tangent vectors, if f : M -+ N is a Cl map between manifolds, we consider the derivative of I at p to be a linear map from TpM to Tf(p)N, DIp: TpM -+ Tf(p)N. If <{)o : Vo -+ Uo c M and <{)fJ : VII -+ Up c N are coordinate charts at p and f(p), respectively, then D(<{)ijl
° J 0 <{)O)pDV~ = v1(p)
240
VII. EXAMPLES OF HYPERBOLIC SETS AND ATIRACTORS
takes the representative of a vector at p in the coordinate chart (!Po, Vo , Uo ) into the representative of a vector at f(p) in the coordinate chart (!pp, Vp, Up). In fact, when we used cones of vectors to prove the stable manifold theorem, we used this idea of the derivative acting on tangent vectors at points implicitly in our description. Definition. Below, we define a hyperbolic structure in terms of expanding and contracting directions. To do that, we need to be able to determine the length of vectors. The length can be determined from an inner product on each tangent space TpM. A Riemannian metric is an inner product on each tangent space,
which is a symmetric, positive definite bilinear form. Such an inner product can be obtained by embedding M in some large Euclidean space RN and inheriting the inner product from RN. For example, the two sphere inherits a Riemannian metric from R3 because it is a subset, f{J C R3. Once the inner product is known, then Ivpl p = (vp, v p );!2
defines a Riemannian nonn on each tangent space. We are most often interested in the norm rather than the inner product. Definition. A differential equation on a manifold M is given in terms of a vector field, i.e., a function X : M - TM such that X(x) E TxM for all x EM. The vector field is cr if the representatives of X are C r in local coordinates. The set of cr vector fields on M is denoted by xr(M). Let 1 ~ r < 00. The cr topology on xr(M) is determined by local coordinate charts. Assume M is compact. Let {!pj :
~ C Rn
-
Uj C M}f=,
be a finite number of coordinate charts on M and Cj C Uj be compact. subsets with M. If X, Y E xr(M), then the cr distance from X to Y is defined to be
U;.l C} =
dr(X, Y) = sup{lxj(x) - yi(x)l, IiD'(Xi)
J
where Xi, yi : 11; _ Rn are the representatives of X and Y in the local coordinate chart. Given such cr a vector field X on M, then
p = X(p) is a differential equation on M. The flow !p' of the vector field X satisfies d
ditp'(p) = X 0 !p'(p). As before, a flow satisfies the group property.
For more details on the ideas of the tangent space, the derivative between tangent spaces, and Riemannian metrics see Guillemin and Pollack (1974), Hirsch (1976), Chillingworth (1976), or Lang (1967).
7.1.3 HYPERBOLIC INVARIANT SETS
241
7.1.3 Hyperbolic Invariant Sets We want to make more precise the concept of expanding and contracting directioDl at different points for a map. Let A be an invariant set for a diffeomorphism f. If a periodic point p for I is hyperbolic there are subspaces t; and ii. such that (i) TpM is the direct sum of n; and n; (as vector spaces), (ii) TpM II.'; e E;, and (iii) t; is expanding and Il!t. is contracting under D f;. An invariant set A is said to be hyperbolic provided that at each point p in A this same type of splitting into subspaces exists and the subspaces vary continuously as the point p varies. We give the Cormal definition oC a hyperbolic invariant set in this subsection. We give examples of hyperbolic invariant sets in the rest of the chapter which should help to make the definition more understandable.
=
Definition. An invariant set A has a la,per60lic ,t,..cture for a diffeomorphism I on M provided (i) at each point p in A the tangent space to M splits as the direct sum ofn; and lFj., TpM n; en;, (ii) the splitting is invariant under the action of the derivative map in the sense that D/p(lI.';) Ei(p) and D/p(lFj.) Ej(p)' (iii) t; and lFj. vary continuously with p, and (iv) there exist 0 < ,\ < 1 and C 2! 1 independent of p such that for all n 2! 0,
=
=
ID.t;v'l :5 c,\nlv'l IDI;nv"l :5 c,\nlv"l
Cor Cor
=
v' e F;., v" en;.
If an invariant set A has a hyperbolic structure for invariant set. REMARK
I,
and
we also say that A is a la,per6olic
1.6. Notice that if m is a positive integer such that p
ID.t;'v'l :5 plv'l IDI;mv"l:5 plv",
v' e F;., for v" e »;,
Cor
=c,\m < I, then
and
so D.t;' IIl!t. is a contraction and D.t;' 1m; is an expansion. Thus, the constant C determines the number of iterates of I which are necessary beCore the vectors get contracted (respectively, expanded) in the subbundle JI!1, (respectively, F;). Sometimes an invariant set satisfying the above conditions is said to have a uni/orm la,lper60lic ,t,..cture because the constants ,\ and C, and so the number oC iterates m, are independent oC the point p. By contrast, a diffeomorphism is sometimes said to be nonuni/orml, larperho/ic on ,orne lIet provided it has nonzero Liapunov exponents for almost all points on this set. REMARK 1.7. We use backward iterates to specify the unstable bundle because any vector with a nonzero component in the unstable subbundle expands exponentially under forward iteration. Therefore forward iteration does not characterize the vectors in n;. However the above conditions characterize these subbundles (i.e., make them unique). Just as for a fixed point, it is useful to change the norm so that the cODltant C = 1. In this case, the new norm ,. must vary Crom point to point, i.e., the norm oC a vector depends on the point which is the base point of the vector. In terms of this new norm, vectors are stretched or contracted by DIp, and not just by the derivative oC some high iterate of I, Df;. This type oC norm is called an adapted norm. The existence of an adapted norm for a hyperbolic invariant set can not be proved using the Jordan Canonical Form, but must be proved using the averaging method given for a fixed point. See the appendix oC Smale (1967) by Mather for one of the original proo'" of the existence of an adapted norm.
I;
242
VII. EXAMPLES OF HYPERBOLIC SETS AND ATTRACTORS
Theorem 1.1. Assume A is a hyperbolic invariant set for I with constants 0 < ..\ < 1 and C 2: 1 giving the hyperbolic structure. Let..\' be any number with 0 < ..\ < "\' < 1. Then there is a continuous change of norm I . I~ for points PEA such that in terms of this norm ID/pY'lj(p) ID/;lyuli_.(p)
:S :S
..\'lv'l~ ..\'lyul~
for Y' E F;, for y U E~.
and
PROOF. As stated above, we must average the norm and can not use some kind of Jordan canonical form. For pEA, any vector Y E TpM can be decomposed into components Y' En; and y U E JE;, Y = Y' + yU. Then we can let
Thus it is enough to define the norm on each of the 8ubspaces. (The maximum of two norms on subspaces defines a norm.) Let no be a positive integer 8uch that C(..\/..\,)n o < I, i.e., c..\n o < (..\,)n o. For y' En;, define "0-1
IY'I~ =
2: (..\,)-jID(fj)pY'I·
j=O We show this norm works on n;:
"0- 1 ID/pv'lj(p) =
2: (..\,)-jID(fHl)py'l
j=O no-2
= ( 2: (..\,)-jID(fHl)pY'O + (..\,)-no+lID(rO)pY'I j=O no-l
:S ..\'(
2: (..\,)-j ID(fj)py'l) + p,)-no+lc..\n oIy'l j=1
no-l
:S"\'
2: (..\,)-jID(fj)py'I, j=O
since C(,,\/,,\')n o < 1. In the same way, for
y
U
E JE; define no-l
Ivul; =
2: (..\')-i ID(ri)pyUI·
;=0
The reader can check that a similar calculation shows that
o If a set A is a hyperbolic invariant set then it can be proved that there are stable and unstable manifolds through each point in A. The following theorem states this result more precisely.
7.1.3 HYPERBOLIC INVARIANT SETS
2-43
Theorem 1.2 (Stable Manifold Theorem for a Hyperbolic Set). LetJ: M -+ M be a C k diffeomorphism. Let A be a hyperbolic invariant set for J with hyperbolic constants 0 < A < 1 and C ~ 1. Then there is an f > 0 such that for each pEA there are two C k embedded disks W:(p, f) and W.U(p, f) which are tangent to E~ and 1E~ respectively. In order to consider these disks as graphs of functions, we IdentIfy a neIghborhood of each point p with 1E~(E) x E~(f). (On a manifold, this identification can be realized by either local coordinates or the exponential map.) Using this identification, W:(p, J) is the graph of a C k function q~ : 1E~(f) -+ 1E~(f) with q~(Op) = Op and D(q~)o
= 0:
W:(p,f)
= ((q~(y),y)
:yE
1E~(f)}.
Also, the function u~ and its first k derivatives vary continuously as p varies. Similarly there is a C k function u~ : E~(f) -+ 1E~(f) with u~(Op) = Op and D(q;)o = 0 and with the function u~ and its first k derivatives varying continuously as p varies such that
W.U(p, f)
= {(x, q;(x»
Moreover, identifying a neighborhood of p with for I' > 0 small enough
W:(p,f)
:x
E
E;(f)}.
1E~(f) x 1E~(f)
and taking A < A' < I,
= {q E 1E;(f) x 1E~(f)
: Ji(q) E IEji(p)(f) x IEj,(p)(f) for j ~ O}
= {q E 1E;(f) X 1E~(f)
: Ji(q) E IEji(p)(f) x IEji(p)(f) for j ~ 0 and
d(P(q), /i(p» ~ C(A')id(q, p) for all j ~
OJ.
Similarly,
W;'(p,f) = {q E E;(f)
X
E~(f)
:
rj(q) E Ej-i(p)(f) x Ej-i(p)(f) for j ~ O} = {q E E;(f) x E~(f) :
rj(q) E IEj-J(p)(f) x IEj-J(p)(E) for j ~ 0 and d(f-i(q),ri(p»:S C(A')Jd(q,p) for all j ~ a}. The proof is very similar to that for a hyperbolic fixed point. We delay the proof until Section 9.2. After the local stable and unstable manifolds are obtained by the above theorem, the global manifolds are determined as follows:
W'(p, f) == {q EM: d(fJ(q), P(p»
= U r"(w:(p, f)
-+
0 as j
-+
co}
and
"2:0
WU(p, f)
== {q EM: d(ri(q), rj(p» =
-+
0 as j
-+
co}
U !"(W.U(p, f). "2:0
An important fact about the stable (respectively, unstable) manifold of each point is that it is an immersed copy of the linear spaces E~ (respectively, E;). This means that the map u' : E~ -+ M (respectively, qU : E; -+ M) whose image equals W'(p, f) (respectively, WU(p,f) is one to one but need not have a continuous inverse. The fact that the stable and unstable manifolds of points are immersed copy of the linear spaces implies that they can not be circles or cylinders.
244
VII. EXAMPLES OF HYPERBOLIC SETS AND ATTRACTORS
Proposition 1.3. Let A be a hyperbolic invariant set for a diffeomorphism I. Then for each pEA, W'(p, f) is an immersed copy ofE~, and W"(p, f) is an immersed copy ofE~.
W:
PROOF. By the Stable Manifold Theorem, each (p, f) is the embedded image of the closed disk E~(t) by a map O'~. Because f is a diffeomorphism, !'(W:(p, f)) is thus the embedded image of the closed disk E~(t) by the map O'~,j = Ii 0 O'~. Thus W'(p, f) is the union of these sets each of which is an embedded copy of a disk, and so W'(p, f) is the immersed copy of the linear subspace. The proof for the unstable manifold is similar. 0
In the following sections, we give examples of hyperbolic invariant sets and determine their stable and unstable manifolds. These examples should make both the definition and the meaning of the theorem more understandable. Hyperbolic Structure for Flows We end the section by mentioning the changes needed in the definitions for a How or differential equation. Definition. An invariant set A for the flow rpt has a hyperbolic structure, or A is a hyperbolic invariant set, provided (i) at each point p in A the tangent space to M splits as the direct sum ofE~, E~, and span(X(p)), TpM = E~ E9E~ E9span(X(p)),
(ii) the splitting is invariant under the action of the derivative in the sense that D(rpt)pE~ = E~.(p), D(rpt)pE~ = E~.(p),
and
D(rpt)pX(p) = X(rpl(p)). (iii) E; and E~ vary continuously with p, and (iv) there exist JJ > 0 and C that for t ~ 0 IDrp~v'l ~
IDrp;tv"l
~
~
1 such
Ce-l'tlv'l for v' E E~, and Ce-l'tlv"l for v" E E~.
REMARK 1.8. If A is a hyperbolic invariant set for '1'1 and is a single chain component for '1", then either A is a single fixed point or A does not contain any fixed points (because of the assumption that the splitting varies continuously). In Section 7.11, we consider the Lorenz equations which have an invariant set which contains a fixed point together with other non fixed points. This set can not have a standard hyperbolic structure. In that situation, we discuss the possibility of a "generalized hyperbolic structure" .
7.2 Transitivity Theorems In the following sections, we often want to prove that a map is (topologically) transitive on a set X, i.e., the (forward) orbit of some point p is dense in X. When considering the horseshoe for the quadratic map on the line, we were able to verify this condition by means of the conjugacy with the shift map and then constructing an explicit point with a dense orbit for the (one-sided) shift map. We will also be able to do this in the
7.2 TRANSITIVITY THEOREMS
245
first set of examples below which are horseshoes for a diffeomorphism in two dimensions using the conjugacy to the two-sided shift map. In other cases, this type of approach is not possible. The following theorem of Birkhoff' shows that it is enough to show that the orbit of any open set intersects every other open set. In fact, it follows from this hypothesis that there are many points (a residual set in the sense of Baire category) with dense orbits. Remember that a set A is residual in Ii space X if there is a countable number of dense open sets {Uj}~l in X such that A = n~l Uj . The Baire category theorem says that any residual subset of Ii complete metric space is dense. Theorem 2.1 (Birkhoff Transitivity Theorem). Let X be a complete metric space with countable basis and I : X --+ X a continuous map. (a) Assume that for every open set U of X, O-(U) = U.. o I"(U) is dense in X. Then there is a residual set 'R.- such that for every p E 'R.-, the-backward orbit ofp, O-(p), is dense in X. (c) Combining the first two parts, if I is a homeomorphism and every open set U has both O+(U) and O-(U) dense in X, then there is a residual subset 'R. C X such that any p E 'R. has both O+(p) and O-(p) dense in X. PROOF.
Let {V; : i E Z} be a countable basis of X. Then
is the intersection of dense open sets and so is residual. Let p E 'R.+. Then p E 0- (V;) for all i, and so O+(p) n V; '" 0. This proves that O+(p) is dense in X, and so part (a). In the proof of (b), 1 is assumed to be a homeomorphism so the backward orbit of a point is well defined. The rest of the proof is similar. Part (c) is a combination of 0 parts (a) and (b). Ergodicity and Birkhoff Ergodic Theorem In connection with a few concepts we refer to invariant measures: Liapunov exponents (Sections 3.6 and 8.2), measure theoretic entropy (Section 8.1), Sinai-RuelleBowen measure (Section 8.3), and Hausdorff' dimension (Section 8.4). In some of these situations, we refer to a system being ergodic. Because the concept of ergodicity is measure theoretic type of transitivity, and so relates to the Birkhoff' Transitivity Theorem, we introduce it in this section. A measure IJ on a space X is called a Borel measure provided it is a measure for the sigma algebra of Borel sets (generated by open sets). A measurable set A in X is said to be of full measure in X provided IJ(X 'A) = O. If the measure of X is finite, a set A is of full measure in X provided IJ(A) = ",(X). For a measure IJ on X, the support 01 the measure, supp(IJ), is the smallest closed set with full measure, or supp(lJ)
= n{C : C
is closed, and IJ(X, C)
= o}.
A measure IJ is invariant for a map I : X --+ X provided ",(I-I(A» = IJ(A) for all measurable sets A. If IJ is an invariant measure for /, / is also said to be a mea.rure preseroing trons/ormation lor IJ. The flow of a vector field with zero divergence preserves Lebesgue measure, so this gives one example of a measure preserving system. (Geodesic flows are such systems. See Katok and Hasselblatt (1994).)
246
VII. EXAMPLES OF HYPERBOLIC SETS AND ATTRACTORS
Finally, a map I : X -+ X is called ergodic with respect to invariant measure 1.1 provided IJ(X \ A) = 0 for any measurable invariant set A for I with IJ(A) > O. Thus for an ergodic map, all invariant measurable sets either have zero measure or full measure in X. If I is ergodic with respect to the measure 1.1, we also say that 1.1 is an ergodic
measure lor I. Exercises 7.9-11 make some connection. between a meaaure preaerving map and recurrence. In particular Exercise 7.10 states that if I is a measure preserving homeomorphism for an invariant measure 1.1 then the set {x EX: x E a(x) and x E w(x)} has full measure. Thus for such a homeomorphism, IJ-all every point is recurrent. In the discussion in Section 3.6, the Birkhoff Ergodic Theorem is used to show that if I : R ...... R has an invariant measure J.I. with compact support, then the Liapunov exponents exists J.I.-almost all points. We state this theorem next but leave the proof to the reference. Theorem 2.2 (Birkhoff Ergodic Theorem). Assume I : X ...... X is a measure preserving transformation for the measure J.I.. Assume 9 : X ...... R is a J.I.-integrable function. Then limn_co(lln) L;':~ go li(x) converges J.I.-almost everywhere to an integrable function g". Also g" is I invariant wherever it is defined, i.e., g"of(x) = g"(x) for J.I.-almost all x. Also, (i) if I'(X) < 00 then g"(x) dJ.l.(x) = g(x) dlJ(x), and (ii) if J.I. is an ergodic measure for f then g" is a constant J.I.-almost everywhere.
Ix
Ix
2.1. Consider the case when J.I. is an ergodic measure for f. The value is the space average of the function g; the value g"(p) is the time average of 9 along the orbit of p. Thus for any integrable function (e.g. the characteristic function of an open set), the time average along almost all orbits equals the space average of the function. This fact indicates that almost all orbits for an ergodic measure are dense in the support of the measure. REMARK
Ix g(x) dJ.l.(x)
The following theorem gives a result for ergodic maps which is similar to the Birkhoff Transitivity Theorem. Theorem 2.3. Let X be a complete metric space with countable basis. Assume f : X ...... X is an ergodic map with respect to the measure J.I.. Assume that J.I. is positive on open sets and J.I.(X) < 00. Then there is a set R of full measure such that the forward orbit of every point in 'R is dense in X. PROOF. Let {Y:, : j E Z} be a countable basis of X. For each Vj • O-CVj) is a measurable invariant set of positive measure, so it has full measure, i.e., X \ 0- (\Ij) has zero measure. Therefore
has zero measure and has full measure. For any p E 'R +, 0+ (p) is dense in X just as in the proof of the Birkhoff Transitivity Theorem, 0 For details and an introduction to ergodic theory, see Walters (1982). For further discussion of ergodic theory in the context of smooth dynamical systems see Katok and Hasselblatt (1994).
7.3.1 SUBSHIFTS FOR NONNEGATIVE MATRICES
247
7.3 Two Sided Shift Spaces
•
In Chapters II and III, symbolic dynamics is used to specify the forward itinerary for the orbit of a noninvertible map, and so one sided shifts and subshifts are used. In this chapter, symbolic dynamics is used to specify both the forward and backward Itinerary for the orbit of an invertible map, and 80 two aided ahlft spac.. are Introduced. A point In a two sided shift space Is given by B = ( ... , S-I, SO,SI, ..• ) which includes a symbol Sj for all j E Z. If Section 3.2 has not be read previously, it should be read at this time. Definition. Let n be an integer that is greater than one. Let En={l, ... ,n}Z = {a = ( ... 8-1, So, SI,"')
: Sj E
{I, ... , n} for all j E Z}.
We define the distance d on En:
~
des, t) = L
.5(Sj, tj)
41jl
.
j=-oo
where .5(a,b)=
{ o1
ifa=b ifa¥b.
AB in the case of the one sided shift, this makes En into a complete metric space. Exercise 7.12 proves that making the factor in the denominator greater than 3 makes the cylinder sets into open balls in terms of the metric. There is also a shift map (1 defined on En by (1(s) = t where tj = Sj+!' The space En together with the map (1 is called the full two sided n-shift, or sometimes simply the n-shift. The shift space is called two sided because the index is both positive and negative. It is called the full n-shift because all possible sequences are allowed on n symbols. Notice that this C1 is one to one on this En and so is invertible, while the one sided shift map is n to one. As for one sided shifts, we can also define subshifts of finite type. If A = (aij) is an n x n matrix with each aij E to, I} then EA C En is the subspace of all points 8 = (8j) for which a"'H' = 1 for all j. The shift map restricted to EA is designated by C1A = C1IE A. The map C1 A on EA Is called a 8ubshift of finite type. As for one sided shifts, the number of fixed points of C1~ is given by the trace of Ak, # Fix(C1~) = tr(A k ). Also C1 A has a dense forward orbit in EA if and only if A is irreducible (where irreducible is defined as before).
7.3.1 Subshifts for Nonnegative Matrices In this subsection we extend the definition of a subshift of finite type by associating a shift space with a matrix with nonnegative integer entries. For FI' = 1'X(1 - x) with JI. > 4, there are two subintervals 11 ,12 C [0,1) such that FI'(I,,)::> It uh Using the itinerary map for these intervals, we get sequences in E2, i.e., for the transition matrix
(! !).
However, the one interval I = [0, 1] crosses itself twice, so we could a.sssocate
the matrix (2). There are several advantages to this extension. First, the matrix which gives the symbolic dynamics can be made to have smaller size. In the above example we get the 1 x 1 matrix (2) rather than the 2 x 2 matrix
(! !).
Second, if A is a
248
VII. EXAMPLES OF HYPERBOLIC SETS AND ATTRACTORS
transition matrix. then we can from the subshift of finite type for A'" for any positive power. For example. if A
= (~ ~)
then A:l
= (~ ~)
also has a subshift of finite
type associated to it. Finally, when we consider Markov partitions for hyperbolic toral automorphisms later in the chapter, the matrix for the subshift of finite type associated to a hyperbolic toral automorphism can be taken to be the same as the matrix which induces the hyperbolic toral automorphism even when it has entries which are greater than one. We call A an adjacency matrix provided A = (ajj) is an n x n matrix with entries aij
E N. For example, (2) and
(i
~)
are adjacency matrices. We call A a transition
matrix provided A = (ajj) is an n x n matrix with entries ajj E {0,1}. Now we turn to understanding how an adjacency matrix corresponds to a subshift. The previous subshift constructed for a matrix with entries of O's and 1's we call the vertex subshift. For an adjacency matrix, we construct an edge subshift. Let S = {VI, V2 , .•. , Vn } be a set with n elements. We use the adjacency matrix A to from an oriented graph G A with vertices the elements of S. If the entry ajj > 0, then we draw alj directed edges from vertex V; to vertex V;. For example, if aij = 2, then there are
two edges from V; to Vj . Figure 3.1 gives the graphs for the matrices (2) and
(~ ~).
Thus there are N = Ej. j ajj edges in the graph GA. Notice that the adjacency matrix tells which vertices are adjacent to each other in the sense that there is a path in the graph directly from the first vertex to the second. Let e = {E I ,E2 , •..• EN} be the set of all edges. If A is a {O,l}-matrix, then we get the same graph which we have considered before with either no edge or one edge from vertex V; to vertex V;.
vl~~ FIGURE 3.1. Graphs for the Matrices (2) and
n V
2
(~ ~ ) .
Next, we form the transition matrix T = (tjj) on e. This is an N x N matrix whose entries are O's or l's. For the edge Ej E e, let b(Ej ) E S be the beginning vertex of edge E) and e(Ej) E S be the end vertex, i.e., E j is an edge from b(Ej } to e(Ej ). The entries of T are defined as follows: tjj
=
{
1 if e(Ej ) = b(Ej ) 0 if e(Ej ) ' " b(Ej ).
Thus the transition on e from E j to Ej is allowable provided the end of the edge E j is at the vertex which is the beginning of the edge Ej • Using the transition matrix T, we can form the vertex subshift ET. This process takes an adjacency matrix A and constructs a subshift UT : ET --+ ET called the edge subshift lor A. (This induced subshift on N symbols is often labeled as UA : EA --+ EA but we do not do this. We always label the subshift by the transition matrix which induces it.) An allowable sequence 8 E ET can be visualized as a curve in the graph GA. An allowable curve in the graph G A is a continuous function "y : R --+ G A such that for each
7.4 GEOMETRIC HORSESHOE
249
i, -y(i) is a vertex and -y([i, i + 1]) is an edge from -y(i) to -y(i + 1). Thus an allowable sequence s E ET can be visualized as the allowable curve -y in G,4 with -y(i) = 8, E E. Notice for the adjacency matrix A = (2), its transition matrix is T =
(~
!). (Ei-
ther of the two edges can follow each other.) Therefore the 1 x 1 matrix (2) corresponds to the full two-shift. In the same way the adjacency matrix A = (n) corresponds to the full n-shift. One case that is interesting is when A is already has all its entries aij E to, I}. Let T be the transition matrix formed above. (The matrix T is usually a different size than A.) We leave to Exercise 7.16 to show that vertex subshlft 0',4 : E,4 -+ EA is topologically conjugate to the edge subshift formed from A, O'T : ET -+ ET. This result shows that the edge subshift is a generalization of the vertex subshift for matrices with nonnegative integer entries. The adjacency matrix A is called irreducible provided that for each pair 1 :S i,i :S n, there is a positive integer k = k(i,j) such that (A")ij > 0. (This is really the same definition as for a to, 1}-matrix.) We leave to Exercise 7.15 to verify that if A Is an irreducible adjacency matrix and T is its induced transition matrix then T is irreducible. Thus if A Is irreducible, then O'T is topologically transitive. For further development subshifts of finite type, see Boyle (1993) and Franks (1982).
7.4 Geometric Horseshoe In this section, we give an example of a diffeomorphism I : S2 -+ S2 (or from R2 to itself) that has an invariant set which is a Cantor set. This example is closely related to the map I,,(x) = lJX(l-x) on R for IJ. > 4 discussed in Chapter II. It was introduced by Smale and is one of the important early examples with complicated (chaotic) dynamics on an invariant set. It is called the Smale horseshoe or geometric model horseshoe. It has been at the heart of much of modern Dynamical Systems. See Smale (1965, 1967). In the next subsection, we show how a horseshoe arises for a polynomial map, the Henon map. The horseshoe also leads to a better understanding of the dynamics of points near a transverse homoclinic point as we discuss in the second subsection below. Finally, we show how transverse homoclinic points and so horseshoes can be proven to exist for small parameter time periodic perturbations of differential equations (Melnikov method). Let S = [0, IJ x [0, IJ be the unit square. Let H j for i = 1,2 be two disjoint horizontal substrips, Hj = {(x,V) : 0 :S x :S 1,y{ :S V :S ~} with 0 :S vI < v~ < vl < y~ :S 1. Similarly, let V; for i = 1,2 be two disjoint vertical substrips, V; = {(x, y) : x{ :S x :S ~,O:S Y :S I} with 0 :S xl < x~ < x¥ < x~ :S 1. We assume that I is a diffeomorphism such that I(Hj ) = V; for i = 1,2, S n F-I(S) = HI U H 2 , and for p E HI U H2
D/p=(ao b~) with lapl = IJ. < 1/2 and Ibpl = A > 2. See Figure 4.1. This map I can be extended to all of S2 as follows. Let A be the semidisk of radius ~ on the bottom of S, B be the semidisk of radius ~ on the top of S, and N = SuAuB be the topological disk which is the union of these three region. Let G be the horizontal gap between HI and H2, H3 be the top horizontal strip In S above H 2 , and Ho be the bottom horizontal strip in S below HI. We extend I so that I(G) c B and the Image arcs from the top of VI to the top of V2 . This part of the map can be taken 80 that I :~ (q)1 = IJ. for q E G. See Figure 4.2. For the extension, the image of Ho
u H3 Is
250
VII. EXAMPLES OF HYPERBOLIC SETS AND ATTRACfORS
H3 H2 G
VI
V 2
HI HO
A
FIGURE 4.1. Horizontal and Vertical Strips in 8 contained in Ai the first coordinate function of f on HouH3 is a contraction by a factor of 11-; the second coordinate function of f on Ho U H3 changes from an expansion by ,\ on the boundary with HI U H2 to a contraction at the boundary with Au B. We take f so that f(A) C A. and that f is a contraction on A, so f has a unique fixed point Po E A. Also./(B) c A. Thus 1 maps the topological disk N into itself. The map can be extended to all of R2 so all other points enter this topological disk under forward iteration. Finally. 1 can be extended to S2 so that it takes the point at infinity, POOl to itself. and Poo is a source for 1 on 8 2 • This map f restricted to N can be thought of as the composition of two maps. The first map. L. takes the region and stretches it out to be over twice as tall and less than half as wide. L(x, y) = (II-X, >.y). The second map 9 takes this longer and thinner rectangle and bends it in the middle and puts it down so it crosses 8 twice. The map 9 must change the image near the ends so that 1 = 9 0 L is a contraction on A and I(B) C A. The image I(N) is drawn in Figure 4.2 together with the region N itself. Since I(N) C N, it follows that j2(N) C f(N) c N. Notice that L 0 f(N) is a longer and thinner version of I(N) with two vertical strips. Then j2(N) = goLof(N) is bent again in the middle and has its image inside f(N). Thus the part of the image in 8, f2(N) n 8, is four vertical strips of width 11-2. See Figure 4.2. Next, we consider the chain recurrent set of f. If q E Au B then w(q) = {po}, so 'R.(f) n (A U B) = {Po}. If q E 8 2 \ N then o(q) = {Poo}, so n(f) n (82 \ N) = {Pool. Finally, if q E 8 n n(f) then Ii (q) must be in 8 for all j E Z (or else it wanders forward to Po or backward to Pool. Combining, 'R.(f)
c A U {Po} U {Pool
where
11.=
n
fi(8).
iEZ
Below we prove that fill. is conjugate to the shift map on a symbol space. As a corollary of this result, A C 'R.(f) so 'R.(f) = A U {Po} U {Pool. Now we turn to the analysis of A. We define sets in a manner similar to the con-
7.4 GEOMETRIC HORSESHOE
251
FIGURE 4.2. First and Second Image of the Neighborhood N for the Ge0metric Horseshoe struction of the Cantor set for the quadratic map on the line:
n"
S;:' =
/)(S).
j-m
By the description above, SJ is the union of the two vertical strips VI and V2 of width 1'. AJ5 in the one dimensional case,
So = /(S;;-I) n S
Vd U [/(SO-I) n V2 ] 1 = 1(80- n H.) U 1(~-1 n H2). = [/(SO-I) n
In particular for n = 2,
sg =
[!(SJ) n Vd u [/(SJ) n V2J n H.) u /([V1 U V2) n H 2 ).
= /([V1 u V2J
Then for k = 1 or 2, /(SJ) n VA: = /((Vl U V2 ) n HA:) is the union of 2 vertical strips of width 1'2. Therefore, sfi is the union of 22 vertical strips of width 1'2. By induction
is the union of 2" vertical strips of width 1'''. Taking the infinite intersection,
n ".0 00
sgo =
~ = C1
X
[0,1]
is a Cantor set of vertical line segments. If q e 88" , then q e f1 (S) and /- i (q) all j ~ O. Thus S8" is the set of points whose backward iterates stay in S.
e S for
252
VII. EXAMPLES OF HYPERBOLIC SETS AND ATTRACTORS
Considering the sets ~m' ~I = HI U H2 is the union of two horizontal strips of height ,\-1. Then, ~2 is the union of four horizontal strips of height ,\ -2. Continuing by induction, S~m is the union of 2m horizontal strips of height ,\ -m, and
n 00
~oo
=
~m
= [O,lJ
X
C2
msO
is a Cantor set of horizontal line segments. If q E S~oo' then for all j ~ 0, q E f- j (8) and f1 (q) E 8. Thus ~oo is the set of points whose forward iterates stay in 8. Intersecting these two sets, we get that A = S~oo
=sO'n~oo = C I x C2
is the product of two Cantor sets, and A is the set of points such that both the forward and backward iterates stay in 8. We want to show that A has the three properties of a Cantor set in the line: A is perfect and its connected components are points. It is perfect because both the sets Cj are perfect. From the fact that the set S~n is the union of 22n rectangles of dimensions p'n by ,\ -n, it follows that the connected components of A are points. Thus A has the three properties of a Cantor set in the line. There is in fact a theorem which says that such a set is homeomorphic to a Cantor set in the line. (See Hocking and Young (1961), page 97.) We now turn to considering the stable and unstable manifolds of points. We first consider the set
n S~m = 00
~oo
=
[O,lJ
X
C2 ·
m=O
A point q is in S~oo if and only if q E J3(8) for all j ~ 0, if and only if f1(q) E 8 for all j ~ o. Thus such a point q is in the stable manifold of the whole set A. In fact if q E [O,lJxC2 and p E C I xC2 = A have the same y coordinate, then 1f1(q)-J3(p)1 ~ p.j and q E compp(W'(p) n 8). Thus these horizontal line segments are the local stable manifolds of points in A. Similarly for pEA, compp(WU(p) n 8) is the vertical line segment through p. The union of the local unstable manifolds for all points in A, W/~c(A), is thus the set we labeled SO'. The global unstable manifold of A, WU(A), is the forward orbit of So. Because the forward images of the larger neighborhood N are nested, J3(N) c f"(N) for j > k. it is not hard to see that WU(A) is the intersection of these images, 00
WU(A)
= U P(SO') j=O
n 00
=
fj(N).
j=O
This set winds around and accumulates on itself. In fact, it is a continuum which can not be written as the union of two proper subcontinua. For this reason it is an example of an
7.4 GEOMETRIC HORSESHOE
indecomposable continuum. This particular example is called the Knaster continuum. See Barge (1986) for more discussion of this property. To introduce the symbolic dynamics of the map 1 on A, we need to consider the two sided shift on two symbols because 1 involves both forward and backward iterates to determine A. The next theorem proves that IIA is topologically conjugate to (1 on 1:2. Theorem 4.1. Let 1:2 be the full two sided two-shift space with shift map (1. Define h : A -+ 1:2 by h(q) = s where P(q) E H' j for all j. Then h is a topological conjugacy from IIA to (1 on 1: 2 , PROOF. Let h(q)
=s
= t. Then p+l(q) E H'j+, but also li+l(q) = = tj, and (1(s) = t or (1(h(q» = h(f(q)). This proves
and h(f(q»
po/(q) E HI,. Therefore 8i+l
the first property of a conjugacy. Next we show that h is continuous. Let h(q) = s. A neighborhood of 8 is given by
N
= {t : tj = 8j for
- no ~ j ~ no}.
With no fixed, the continuity of 1 insures that there is a Ii > 0 such that for pEA and Ip - ql ~ Ii, Ij(p) E H., for -no ~ j ~ no. Thus, if t = h(p) and Ip - ql ~ Ii then tEN. This proves the continuity of h. To show that h is one to one, assume that h(p) h(q) 8. Then for all j, both rj(p) and rj(q) are in H._ i , so p,q E P(H._,). Letting j run from 1 to Infinity, we see that p,q E n~l Ji(H._,} so they are in the same vertical line segment. Letting j run from minus infinity to 0, we see that p, q E n~_-oo Ii (H._,), so they are in the same horizontal line segment. Using all j from minus infinity to infinity we see that p = q. This proves h is one to one. Finally, we check that h is onto. We apply induction on n to show that n;=l Ii (H._ j ) is a vertical strip of width I-'n for all strings of symbols s E 1:2. Let 8 E 1:2. For n = I, this set is just I(HL . ) = V._. which is a vertical strip of width 1-'. Then
=
n n
n
Ij(H._,) =
j=l
is a strip of width I-'R since infinity,
f(n
=
li- 1 (H._,» n I(H._.)
i=2
n;=2Ii - 1 (H._ i )
is a strip of width IJn-l. Letting n go to
n li(H._ j=l 00
i )
is a vertical line segment. Similarly, n~_-oo P (H._ j ) is a horizontal line segment, and n~-oo P(H L , ) is a (single) point, q. In particular, the intersection is nonempty. For this q, h(q) = s, and h is onto. This completes the proof that h is a conjugacy. 0 It is possible to have other horseshoes which are conjugate to subshifts of finite type. Example 4.1. Let the region N in the plane be made up of three disks A, B, and C and two strips 8 1 and 82 connecting them as in Figure 4.3. The map 1 takes N inside itself. The disks A, B, and C are permuted with I(A) C B, I(B) c C and I(C) c A. The map on these sets is a contraction and there is a unique attracting orbit of period three, {Pl,P2,P3} with Pi E A, P2 E B, and Pa E C. The strip 8 1 is stretched across 81t B, and 82 with a contraction in the vertical direction as indicated. Finally 8 2 is stretched across 8 1 . This map can be extended to S2 so it has a fixed point source
254
VII.
EXAMPLES OF HYPERBOLIC SETS AND ATTRACTORS
at infinity, Poo. As in the geometric horseshoe above it can be shown that the chain recurrent set is given by where
n 00
11.=
li(Sl U S2).
j=-oo
The map I can be taken so that it has a hyperbolic structure on A with one expanding direction and one contracting direction. Because of the manner in which I maps the strips. 1111. is not conjugate to the the full two shift, but is conjugate to the two sided subshift of finite type E8 for the transition matrix
There are restrictions on what combinations of periodic orbits and subshifts of finite type which can be realized on a manifold such as the two sphere. For more details see Franks (1982).
FIGURE 4.3. Horseshoe for a Subshift of Finite Type
7.4.1 HORSESHOE FOR THE HENON MAP
255
7.4.1 Horseshoe for the Henon Map Let FAB : ]R2
-+ ]R2
be given by
FAB(X,y) = (A-By-x 2,x) -1
FAB(x, y)
= (y,
A - x - y2
B
so
).
We will often write F for FAB in this section. This map is called the Henon map. It was written down by Henon to realize the Smale horseshoe for a specific function which could be iterated on the computer, Henon (1976). He also observed what appeared to be an attractor. We return to this aspect of the map in Section 7.10. Notice that
-2X det (DFAB)(x,lI) = det ( 1
-B) 0
= B
is the amount that F changes area. If B > 0 then F preserves orientation, and if B < 0 then it reverses orientation. The usual paramf'tf'rs discussed are A = 1.4 and B = -0.3 for which computer iteration indicates there is a "strange attractor". Notice that for these parameter values FAB decreases area and reverses orientation. It is still unproven that there is a "transitive" attractor for these parameter values. We will discuss this more fully when we come to examples of attractors. The following theorem is the main result of the section. It proves that FAB has a horseshoe for larger A, namely A = 5 and B = ±0.3 or, more generally, A ~ (5 + 2v'5)(1 + IBI)2/4. Note that for B = 0, FAO is conjugate to the quadratic map F,,(x) = Jlx(1 - x). Therefore the following theorem is analogous to Theorem 11.4.1. Theorem 4.2. Let B '" 0 and A ~ (5 + 2J5)(1 + IBI)2/4. In particular, B = ±0.3 and A = 5 are allowed. Let R = + IBI + [(1 + IBI)2 + 4Aj1/2} and let the square
HI
S
= {(x,y) E]R2 : Ixl $
Rand
Iyl
$ R}.
Let A = n~-<x> Fh(S). Then, (a) all points which are non wandering are contained inside S, (b) FAB has a hyperbolic structure on A, (c) A is a Cantor set in the plane, and (d) FABIA is topologically conjugate to the two sided shift on two symbols. Thus for these parameter values, the nonwanderiag set of FAB is a horseshoe. REMARK
4.1. The proof given below is based on Devaney and Nitecki (1979).
PROOF. To make the calculations somewhat easier, we take A = 5 and B = 0.3. Most of the calculations are very similar with B = -0.3. We also take S = {(x, y) : Ixl $ 3,lyl $ 3} which is a slight enlargement of the size given in the general definition in the statement of the theorem. A direct analysis of the effect of the map F on points in different regions outside S shows that for p ~ S, P is wandering with IFj(p)1 going to infinity as j goes to either 00 or -00. See Devaney and Nltecki (1979). To find the image of S by F we look at the image of the corners, the middle vertical
256
VII. EXAMPLES OF HYPERBOLIC SETS AND ATTRACTORS
line, and horizontal lines: F(±3, 3)
= (5 -~: -
9) = ( -;39),
F(±3 -3)- (5+0.9-9) _ (-3.1) , ±3 ±3 ' F(O ,II ) -- (5 - 00.3 11 ) '
and
z F( Z,II0 )_(5-0.31/Oz
2) .
These images show that the four corners go to points to the left of the box S. Since 5 - 0.311 > 3 for -3 ~ II ~ 3, the middle vertical line is mapped to the right of the box. Finally, each horizontal line goes to a parabola. See Figure 4.4. F(d)
F(c)
b F(O .y)
F(b)
F(.)
FIGURE
4.4.
•
Image of the Square by the Henon Map
Thus F(S) n S has two horizontal strips. Similarly, F-l(S) n S has two vertical strips, and F-l(S) n S n F(S) has four components. By induction, Fj(S) has 4n components. With this much information we can define a semi-conjugacy h : A -+ {I, 2}Z which is onto. The next step is to show that A has a hyperbolic structure so it is possible to prove that the components of A are points. This will enable us to conclude that A is a Cantor set and that the semi-conjugacy h is one to one and so is a conjugacy. To prove that the hyperbolic structure exists, we define cones (or sectors) at each point that are mapped into each other. This follows the ideas that we used for the proof of the stable manifold theorem. Also see the discussion in Moser (1973). We define the cones CU(p) and C'(p) by
ni=-n
CU(p)
= He, '1) E TpJR 2 :
C'(p) = He, '1) E TpJR 2
:
I'll ~ A-llell, I'll ~ AleIl·
For ease of estimations. we use the norm which measures the larger component of a vector: I(e. '1)1. = max{lel.I'IIl· We find A > 1 that make the cones invariant. (For A 5 and B ±0.3. A can be taken to be 1.7.)
=
=
=
=
Lemma 4.3. There is a A> 1 (which can be taken to be 1.7 (or A 5 and B ±0.3) for which the following two statements are true. (a) ForalJp E SnF-l(S) and v E CU(p), DFpv e CU(F(p» andlDFpvl. ~ Alvl •. (b) For alJ pES n F(S) and v e C'(p), DF;lv E C'(F-l(p» and IDF;lvl. ~ Alvl.· PROOF.
In the proof. we need some estimates which we prove first in a sublemma.
7.4.1 HORSESHOE FOR THE HENON MAP
267
Ixl > 1 and 21xl -IBI ~ 1.7 = A> 1. Iyl > 1. PROOF. Let (XI, yd = F(x, y). If Iyl $ 3 and Ixd $ 3 then 5 - IBly - x' $ 3, or x 2 ~ 2 - IBI(3) = 1.1. Thus Ixl > 1. The second estimate follows from the first: 21xl -IBI ~ 2 - 0.3 = 1.7. Let (x-Ioy-d = F-I(X,y). Then (x,y) = F(x_I,y_I), so Iyl = Ix-d > 1 by part W. 0 Sublemma 4.4. (a) 1£ (x. y), F(x, y) E S then (b) If(x,y),F-I(x,y) E S then
Now take
p=
(x,y) with
we have that I(e, '1)1. =
Then
lei·
p,F(p) E S and (~)
1'111 = lei and led = 1- 2xe -
Therefore
(~:)
E
E C"(p).
Because
lei ~ AI'll> I'll,
The image of the vector by the derivative is given by
B'II ~
12xllel-IBII'I1 ~ (2Ixl-IBI)lel
~
Aiel
~
AI'III·
C"(F(p)), and I(e .. '1dl. =
lell
~
Aiel
=
Al(e, '1)1.·
This proves the first part of the lemma.
p, F-I(p) E Sand (~) E C·(p). Because I'll· The image of the vector by the derivative
For the second part, take p = (x, y) with
lei < Aiel $ I'll, we have that of the inverse is given by
Then
1'1-11
~
I(e, '1)1.
(l2yl-1)IBI- 1 1'11
~
=
(2 -1)IBI- I I'I1
e-I) ( 'I-I
E
~
AI'll
=
Ale-II.
Therefore
C'(F-I(p)),
and This completes the proof of the lemma.
o
Now to consider the hyperbolic structure on A, the lemma shows that for each pEA,
n DF~_;(p)c"'(rj(p)) n
j=O
is a nested set of cones at p so the infinite intersection is a nonempty cone:
n DF~_j(p)C"(rj(p)) :/= 0. 00
j=O
Similar statements are true for the stable cones: -00
n DFt_'(pp'(rj(p)) '" 0.
j=O
258
VII. EXAMPLES OF HYPERBOLIC SETS AND ATTRACTORS
Since Lemma 4.3 proves the expansion and contraction in these two sets of intersections, to complete the proof that the set is hyperbolic, all we need to show is that this intersection is a line for each pEA. In fact the maximal angle between two vectors in the intersection n
n
DFt_'(p)C"(ri(p»
j-O
goes to zero as n goes to infinity as the following calculation shows. Take
with ~ =~' = I,
17]1,17]'1:::; A-I,
and q = F-j(p) for some j > O. Then
and ~I
= -2x~ - B7] = -2x - B7]
~; = -2x~'
- B7]'
= -2x - B7]'.
Thus
17]1 ~I
7]11 -
1
1
1
~1 = -2x - B7] - -2x - B7]' =
1
1
B( 7] - 7]') 1 (-2x - B7])( -2x - B7]')
< IBII7] - 7]'1 -
A2
< IBI I7]
7]'1
-~ ~-f
since I - 2x - B7]I, I - 2x - B7]'1 ~ A and I~I = I~'I = 1. Therefore the angle between two vectors is contracted by IBIA -2 and so the intersection of the cones goes to a single line in the tangent space at each point p:
n DF~_i(pPU(F-j(p» 00
= lE~.
j-O
Notice that the line EU(p) can depend on p even though the original cones did not, because its definition involves the derivative of F along the backward orbit of p. Similarly, -00
n DF~_i(pp'(rj(p»
= lE~
j=O
is a line in the tangent space at p. This completes the proof of the hyperbolic structure on A, or part (b) of the theorem.
7.4.2 HORSESHOE FROM A HOMOCLINIC POINT
259
For pEA, W'(p) has to start inside the cone C'(p) + {pl. In fact, for q E W'(p) we have TqW'(p) C C'(q). Let HI and H2 be the two (horizontal) components of S n F(S). Then by using the estimates of Lemma 4.3. we see that for each iterate by F. comp p (W'(p) nHi ) is shrunk by a factor of A-I. In this way we see that
nn n
~Bf length {compp [W'(p)
Fi(S)] } ~ 6,\-n
J=-n
which goes to zero as n goes to infinity. Similarly
n n
max length {comp p [W"(p) n ~A
.
Fi(S)] } ~ 6,\-n
)=-n
which also goes to zero. Therefore for q E A. compq(nj=_n Fi(S» have diameters which go to zero and so converge to the single point {q}. Therefore the connected components of A are points. This shows that A is a Cantor set. Also, by arguments as before. we can prove that the semiconjugacy of FIA to E2 is one to one. and so is a topological conjugacy. This completes the proof of the theorem. 0 REMARK 4.2. Fix B. For A < -(B + 1)2/4. FAB has (no periodic points and) empty nonwandering set. For A> (5 + 2v'5)(1 + IBI)2/4. FAB has a horseshoe. Therefore as A varies. FAB forms a horseshoe. There are many bifurcations which take place as this horseshoe is formed. Many people have studied this process but it is not yet completely understood. See Newhouse (1979). Mallet-Paret and Yorke (1982). Robinson (1983), Holmes and Whitley (1984). Holmes (1984). Yorke and Alligood (1985). Easton (1986. 1991). and Patterson and Robinson (1988).
7.4.2 Horseshoe from a Homoclinic Point In the last two sections. we have analyzed the geometric model horseshoe and have shown how it arises in the Henon map for certain parameter values. In this section. we show how it arises from a transverse intersection of the stable and unstable manifolds of a periodic point. Such an intersection is called a homoclinic point. Definition. Let p be a hyperbolic periodic point of period n for a diffeomorphism J. Let n-I
W<7(O(p» =
U W<7(Ji(p))
and
W<7(O(p)) = W<7(O(p» \ O(p)
for q = s, u. A point q E W'(O(p» n W"(O(p» is called a homoclinic point Jor p. A point q is called a tmnsverse homoclinic point provided the manifolds W'(O(p)) and W"(O(p» have a nonempty transverse intersection at q. The following theorem proves that the existence of a transverse homoclinic point implies the existence of a Smale horseshoe. Figure 4.7 indicates why the invariant set of part (a) is called a horseshoe. By analogy. we also call any invariant set like the one constructed in part (b) a horseshoe.
260
VII. EXAMPLES OF HYPERBOLIC SETS AND ATTRACTORS
Theorem 4.5. Suppose that q is a transverse homoclinic point for a hyperbolic periodic point p for a diffeomorphism f. (a) For each neighborhood U of {p, q}, there is a positive integer n such that has a hyperbolic invariant set A C U with p, q E A and on which is topologically conjugate to the two sided shift map on two symbols, tT on E 2 . Thus A C c1(Per(f» C n(f) and q E c1(Per(f». (b) Let V be a neighborhood ofO(p) U O(q). Then there is a smaller neighborhood 8 8 i ofO(p) U O(q) with n ~ 2, 8 C V, such that As niEzJi (8) C V is a hyperbolic invariant set for f and flAs is topologically conjugate to the shift map tT A on a transitive two sided subshift of finite type on n symbols, EA C En. The conjugacy h : As -+ EA is the itinerary function given by hex) = 8 where Ii (x) E H,; for all j. Notice that q E O(p) U O( q) C As C c1(Per(f» C n(f), 80 q E c1(Per(f». If p is a fixed point then the transition matrix A = (aiJ) for the subshift of finite type is given by
r
r
= U?=,
=
for j
= 1,2
for j '" 1,2 for j = i + 1, 2 ~ i for j '" i + 1, 2 ~ i for j 1
=
forj",l. See Remark 4.8 for the transition matrix when p is not a fixed point. REMARK 4.3. The theorem is most often stated as in part (a). See Smale (1965, 1967). There are many other references including Guckenheimer and Holmes (1983), pages 252-3; Newhouse (1980), pages 13-24; Moser (1973); and Wiggins (1990), pages 470-483.
=
REMARK 4.4. Given a set 8, As niEZ Ii (8) is the maximal invariant set in 8, i.e., the largest invariant set contained in 8. (See Section 9.3 for further discussion of the maximal invariant set.) In part (b), the set 8 needs to be chosen carefully 80 that this maximal invariant set is conjugate to a relatively simple subshift of finite type. REMARK 4.5. One way to think of the subshift of finite type in part (b) is in terms of l-chains. Consider the case when p is a fixed point. For big enough N, and N2, A~ {p,f-N'(q), I-N,+l(q), ... ,fN'(q)} is a periodic l-chain. Let n 2 + N, + N2 , q, p, and Qj f-N,-Hi(q) for 2 ~ j ~ n. The subshift of finite type given in p to q, or the theorem corresponds to allowing the following transitions: (i) from q, N• (q), (ii) from Qj to Qj+' for 2 $ j < n, and (iii) from qn IN. (q) to q, q2 p. The space EA C En for this subshift is a Cantor set with dense periodic points. (The proof is the same as for the section on one sided subshifts since one of the n symbols has more than one option.) The fact that any sequence of symbols in EA can be realized by an actual orbit follows from shadowing. We prove below that the set Aq = O(p) U O( q) has a hyperbolic structure. By the Shadowing Theorem IX.3.1, any l-chain in A' C Aq can be shadowed by an orbit of I. The sets Hi are taken to be disjoint and neig~borhoods of the points qi E A~; the orbit which shadows the l-chain {xi} C A~ can be taken with Ii (x) in the same H,; as Xj. In particular, the Shadowing Theorem proves that the homoclinic point q is in the closure of the periodic points. However, we want to prove that the maximal invariant set As in this neighborhood 8 of Aq i8 conjugate to a 8ubshift of finite type EA. (This is Smale'. main contribution.) If we prove the theorem u8ing the Shadowing
= = =r
=
=
=
=
=
7.4.2 HORSESHOE FROM A HOMOCLINIC POINT
261
Theorem, we would need to prove that the set of all these orbits which shadow the f-chains in Aq is the maximal invariant set in B. This last step is not obvious from the statement of the Shadowing Theorem and involves shadowing all f-chains in A~ at once. For this reason, we give a proof below which is independent of the Shadowing Theorem and duplicates some of its proof. REMARK 4.6. Notice that in addition to the transverse homoclinic point q, the theorem allows that f can have nontransverse homoclinic points for p at points other than q. PROOF. First, we give an outline of the proof. Let Aq = O(p)UO(q). This invariant set is closed because o(q) = w(q) = O(p). The first step is to prove that Aq has a hyperbolic structure. The second step is to prove that if V is a small enough neighborhood of Aq then the maximal invariant set in V, Av = niEZ fi(V), also has a hyperbolic structure. For the third step, the proofs of both parts (a) and (b) use boxes contained in the ambient space. These hoxes are similar to those used for the geometric horseshoe and the Henon map. For part (a), we find two disjoint boxes VI and V2 contained in V and an n > 0 such that each of the images r(V;) crosses both VI and V2 • It follows that r restricted to A = fmn(V1 U V2 ) is conjugate to the full two sided subshift on two symbols. For part (b), the construction uses disjoint closed boxes Bi C V such that the union B = U Bi is a neighborhood of Aq • This neighborhood B is contained inside V so A8 = mEZ fm(B) C Av has a hyperbolic structure. One difference between this case and part (a) (or the geometric horseshoe) is that the image of each f(Bi) does not intersect all the other B j • However, each nonempty intersection f(Bi) n B j completely crosses B j , so A8 can be shown to be conjugate to a subshift of finite type. Throughout much of the proof we assume p is a fixed point. (The notation and labeling becomes a little simpler in this case). The reader can make the necessary changes if it is not fixed. We start the proof as if we were proving part (b), although the first part of the proofs is the same for both parts. We indicate where the bifurcation in the proof of the two parts takes place and the reader can then choose which part to read first. As indicated in the outline, the first step is to show that Aq has a hyperbolic structure. Let qm = fm(q). For each point x E Aq, let ~ = Tx(W"(p» for u = S,u. This is clearly a continuous splitting on O( q) and p separately. We must check that it is continuous as O(q) approaches p. Consider qm as m goes to 00, so qm approaches p. The stable bundle E~_ = Tq_(W·(p» approaches E~ because the stable manifold W·(p) is C I • The unstable bundle E~_ = Tq_(WU(p» approaches E~ by the linear estimates in the Inclination Lemma. The argument as m goes to -00 is is similar. Therefore we have a continuous splitting on Aq • Next we check that vectors in E"IAq are uniformly contracted. We can take an adapted metric so that on E~ the derivative is an immediate contraction, IIDfpIE~1I < A < 1. By continuity, there is a neighborhood W of p such that liD fq_IE~_1I < .A for all qm E W. Since there are only finitely many qm E Aq \ W, there is a C ~ 1 such that if fi(qm) ¢ W for 0 ~ i < k then IIDf~_IE~_1I ~ C Ai for 0 < i ~ k. For any qm E Aq \ {p}, there is at most one string of iterates for which f(qm) ¢ W: there are i l and i2 such that fi(qm) ¢ W for il ~ i < i2 and f(qm) E W for i < i l or i ~ i2' Combining the estimates on and off W, II D f~_IE~m II ~ C Ai for any qm E Aq and all 0< i. The estimates for EUIAq are similar interchanging f and This proves the hyperbolicity of Aq . Now we turn to step 2 which proves that there is a small neighborhood V of Aq 8uch that the maximal invariant set Av = n.EZ f'(V) has a hyperbolic structure. Note that there is no neighborhood V of Aq for which Aq = Av. (This might not be obvious but is true from the conclusion of this theorem or by the ShadOWing Theorem IX.3.1.) This
n:=-oo
n
rl.
262
VII. EXAMPLES OF HYPERBOLIC SETS AND ATTRACTORS
means that Aq is not an isolated invariant set. (An invariant set A is called isolated provided there is a neighborhood V for which A = Av with Av defined as above. See Section 9.3 for further discussion of isolated invariant sets.) For simplicity below, we take an adapted metric on Aq and extend it to a (perhaps smaller) neighborhood V. (The adapted metric implies that f is an immediate contraction and expansion on Aq.) We use cones to extend the splitting from Aq to a neighborhood. For x E V, using the adapted metric let
and CU(x) = {(~, '7) E E~ EB E~ : I~I
::; ~1'71}
some 0 < ~ < 1. By the hyperbolicity on Aq and continuity, there is a (perhaps smaller) neighborhood V of Aq such that DfxCU(x) Df;IC'(X)
c c
C"(J(x))
provided x, f(x)
C'U-1(x))
IDf:'v"l ~ .x-mlv"l IDf;mv'l ~ .x-mlv'l
E
V,
provided x,rl(x) E V, provided fi(X) E V for 0 ::; i < m, and provided ri(x) E V for 0 ::; i < m
for any V U E C"(x) and v' E C'(x). Let Av = neighborhood V and x E Av, define
nez fi(V)
:::) Aq as before. For this
i;:::O
E: = n Dfp(x)C'(t(x)). i;:::O
Because of the expansion esimates, these are subspaces for each x E Av which depend continuously on x, have the usual invariance properties, and satisfy E~ EB E~ = TxM. By the inequalities for all vectors in the cones, vectors in E~ and E~ are expanded and contracted, respectively, so Av is a hyperbolic invarinat set. This completes the proof of step 2. The invariant set Av defined above is probably not be conjugate to a simple subshift of finite type. To get such an invariant set, we carefully construct a smaller neighborhood of Aq, B c V, which is a finite union of "boxes". Each box Bi corresponds to a symbol in the subshift. The transitions which are allowed in the subshift are exactly those for which f(B;) n B j -F 0. The images of the boxes Bi are correctly aligned so that a string of symbols is allowable if and only if there is a point whose orbits goes through this sequence of boxes. Since B C V, AB C Av, so AB has a hyperbolic structure. We take coordinates near p induced by the hyperbolic splitting. In fact identifying IE~ with a subspace for (J" = 8, U, we can take coordinates so a neighborhood can be considered as a subset of E~ x E~, and the local stable and unstable manifolds are disks in the subspaces given by the splitting, W:(p) = E~(r) x {O} and W;'(p) = {O} x E~(r), and identify these two local stable and unstable manifolds with E~(r) and E~(r), respectively. For 0" Ou > 0, we let D' = WI. (p) and D" = Wl. (p). We also consider D' = IE~(O,) and D" = 1E~(ou) and take their cross product in local coordinates near p. We can take 6"0,, > 0 and k > 0 such that q E int[r"(D') \ r(k-I)(D')]
q E int[fk(D U )
\
f("-I)(D U )],
7.4.2 HORSESHOE FROM A HOMOCLINIC POINT
263
where the interiors are taken relative to W'(p) and WU(p), respectively. See Figure 4.5. We can also insure that D' x DU c V (resp. U) where V (resp. U) is the neighborhood of Aq (resp. {p, q}) given in the statement of part (b) (resp. of part (a». For the same k to work both forward and backward, the relative sizes of 6. and 6u need to be adjusted.
p
FIGURE 4.5. Images of the Disks D' and DU Having fixed k, take iI ?: 0 such that for i ?: ii, fk(DU) crosses f-k(D' x rj(DU» transversally in the components of the intersection containing q and p. By transversally, we mean that it hits transversally each "horizontal fiber" f-k(D' x {y}) once and only once for each y E f-j(DU). Thus fk(DU) is a "vertical disk" through q. See Figure 4.6.
For i ?: i .. the set
is a thin neighborhood of fk(DU) by the Inclination Lemma. Therefore for j ?: iI large (DU» crosses (D' x j transversally In the components enough, fk+ j (D' x ofthe intersection containing q and p. In particular, each "fiber" fk+i( {x} x crosses each rk(D' x {y}) once and only once for each xED' and y E f-j(DU). See Figure 4.7. Notice the similarity with the figure for the geometric horseshoe.
r'
r II
r (DU»
r'(DU»
264
VII. EXAMPLES OF HYPERBOLIC SETS AND A'ITRACTORS
Fix an j
~
11 large enough to satisfy the above conditions. Let n = 2k BI = D' x rj(D U ),
+ j,
and
V = rk(Bd.
Letting comp.(B) be the connected component of B containing z, set VI = compp(Vn r(V» V2 = compq(Vnr(V»,
Bi =
r-
c
BI ,
and
l - k - J (V2 )
for 2 ~ i ~ n. See Figures 4.7 and 4.8. The set of boxes {VI, V2} is used in the proof of part (a), and the set of boxes {Bi : 1 ~ i ~ n} is used in the proof of part (b). At this point we are in position to prove either part (a) or (b) of the theorem. The reader can choose which to read first. PROOF OF PART (b). Let B = UI:5i:5n B;. For j large enough Be V. Finally, let
As = nfi(B), iEZ
so As is the maximal invariant set in B, As C Av c V, and As has a hyperbolic structure. We show that the (nonlinear) boxes Bi can be used as symbols so As is conjugate to a subshift of finite type. Because ofthe construction, f-Ic(Bd and flc+i(Bd = r(V) cross V2 , but ft(Bd n t.'2 = 0 for -k < t < k + j. It follows that (i) f(Bd crosses BI and k -j+1(V2) = B2 but not B; for i > 2, (ii) flc(V2) = f 0 f 2Ic+j-I-Ic-i(V2) = f(Bn) crosses B 1 , and (iii) f t (V2) n BI = 0 for -k - j < t < k, so B; n BI = 0 for 2 ~ i ~ n. Thus we have constructed the desired disjoint boxes for the symbols. The first symbol, B 1 , goes to either itself or B 2 • The other symbols can only go to the next symbol: B; goes to Bi+1 for 2 ~ i < n, and Bn goes to B 1• Therefore the transition matrix for the subshift is given as in the statement of the theorem. (Note that we have assumed that p is a fixed point. The form of the transition matrix when p is not fixed is given in Remark 4.8 below.) In the construction of these boxes, they are correctly aligned: each of these boxes can be assigned coordinates so that the image of an unstable disk in Bi crosses Bi+1 in the unstable direction, and the inverse images of an stable disk cross Bi crosses B'_ 1 in the stable direction.
r
7.~.2
HORSESHOE FROM
A
HOMOCLINIC POINT
,- -,,
D -+--+-+., - - - - - ,
FIGURE 4.8. Choice ofthe Boxes Bit ... , Bn The conditions on the boxes are similar to the properties of a Markov partition for a hyperbolic invariant set which we define in Section 7.5.1. One difference is that the Markov partition is made up of "boxes" which are subsets of the hyperbolic invariant set while these boxes are diffeomorphic to "Euclidean boxes" and are neighborhoods in the ambient space. Because of this difference, we say the set of ambient boxes satisfies the Markov properly rather than calling them a Markov partition. Define h : As -+ EA as the itinerary function by hex) = 8 provided li(x) E B., for all i E Z. Because the boxes are disjOint, h is well defined. Fix any symbol 8 E EA. For any m ~ 0, because the images of the boxes have the correct topological alignment, li(BL , ) is a nonempty nonlinear sub-box of B.o which stretches all the way across the unstable direction. Similarly, Ji(B._;} is a nonempty nonlinear sub-box of B.o which stretches all the way across the stable direction. Therefore, ~:-m li(B._.> and iEZ li(B._;} are nonempty. Thus h is onto EA. By an argument like we used before, ho/lAs = uoh so h is a semiconjugacy. (This much ofthe argument does not use that As has a hyperbolic structure, but can be made to work if there is a "topologically transverse" intersection. See Burns and Weiss (1994).) The fact that h is one to one follows from the fact that I has a hyperbolic structure on As. The contraction and expansion implies that for any symbol sequence 8 E EA there is only one point in the intersection niEzJi(B._.), i.e., there is only one point x E A such that rex) is in the box V•• for all i, i.e., h is one to one. 0
n::o
n?=-m
n
PROOF OF PART (a). Let U be an open set of p and q which is contained in the set V defined above. Now fix k, j, n = 2k + j, 6., and 6u as above. By the above choices, VI and V2 are two correctly aligned sets, and VI U V2 C U. See Figure 4.7. By the fact that the images of VI and V2 by In stretch vertically across V,
n lin(V1 U V nlin(v)
m-l
Sr:- I =
m
2)
=
i=O
has 2m components each of which stretches vertically across V. Similarly -I
S::..
=
n pn(V1 U V
2)
0
=
n pn(v) i=-m
has 2m components each of which stretches horizontally across V transverse to the
266
VII. EXAMPLES OF HYPERBOLIC SETS AND ATTRACTORS
vertical fibers. Combining, m-l
n /in(V1
S~,;; 1 =
U V2 )
i=-m
has 22m components. (So far we have not shown that the maximum of the diameters of these components goes to zero as m goes to infinity.) By arguments like those for the geometric horseshoe, it follows that there is a semiconjugacy h : A -+ E 2 , where
n /in(V1 00
A=
U V2 ),
ia:-oo
which is onto and such that h 0 riA = (10 h. The fact that h is one to one follows from the fact that has a hyperbolic structure on A: the contraction and expansion implies that for anyone symbol sequence 8 there is only one point x E A such that /in(x) is in the box V., for all i. 0
r
4.7. It might seem that the invariant set for I, A B • should be the orbit of oft he O(A, f). However, O(A,f) is not the largest natural invariant set invariant set for in a neighborhood of Aq , because O(A. f) only has periodic points which are multiples of n; A8 has points of all periods larger than n (if p is a fixed point). Another way to see the difference is that flAB is topologically mixing (if p is a fixed point) while flO(A) is topologically transitive but not topologically mixing. Still Another way to characterize the difference is in terms of the topological entropy which we define Section 8.1, which is a measure of the complexity. The characteristic polynomial of the matrix A of Theorem 4.5(b) is p(>.) = >.n - >.n-l -1. Since p(21/n) = 2_2(n-l)/n_1 < 0, the largest eigenvalue of A is larger than 21/n. In Section 8.1 we show that the logarithm of this eigenvalue is equal the topological entropy of flAB. On the other hand, riA has the same entropy as 0"21E2which equals log(2). By further results in Section 8.1. flO(A) has entropy (lin) log(2). Therefore flAB has more entropy than /lO(A) and so has more complex dynamics.
REMARK
r,
REMARK 4.8. If p is not a fixed point but has period p then the subshift has a cycle of period p rather than a fixed point. Therefore ai,j = 1 in the following cases:
i 2
~ i
= 1 and i = 2,p + 1, < P and i = i + 1,
i = P and i = 1, P + 1 ~ i < n and i = i + 1. and i=nandi=1.
For all other (i,i).
ai.j
= O.
A=
Thus the transition matrix is 0
0
0
0 0
0 0 1 0 0 0
0 0 0 0 0 0 0 1
0 0 0 0 0 0
0 0 0 0 1 0
0 0 0 0 0 0 0 0 0
1 0 0 1 0 0
7.4.2 HORSESHOE FROM A HOMOCLINIC POINT
267
We leave the details to the reader. In the next subsection, we show how a transverse homoclinic point arises from a time periodic perturbation of a differential equation with a nontransverse homoclinic connection. In the remainder of this subsection, we discuss a more geometric construction of a perturbation which changes a nontransverse homocllnic connection into a transverse homoclinic point. Example 4.1. We start by giving the construction of a diffeomorphism in R2 with a nontransverse homoclinic point. In fact, our example has one branch of the stable manifold coinciding with one branch of the unstable manifold for a saddle fixed point. The simplest construction of such a diffeomorphisrns is by means of the flow of a system of differential equations. Let ",' be the flow of the system of differential equations XI
= X2
X2=XI-X~,
The origin is a saddle fixed point for ",'. The real valued function H (x) = x~ - xV2 + xf!3 is an integral of motion, H(x) ;;;; o. ( The analysis of the next subsection derives this function as the sum of the kinetic and potential energy.) Using the level sets of H, it can be seen that W"(O, ",') n {x :
XI
> O}
= WU(O, ",') n {x : XI x2
= {x : X2 = ±( 21 See Figure 4.9. Let
> O}
x3 1/2
- ;)
1 be the time one flow of ",', I(x) = ",I (x).
W"(O,f)
n {x: XI > O}
= WU(O,f) n {x: = {x:
X2
X2
= ±( 21
XI
,XI>
O}.
Then
> O} x3 1/2
- 31 )
,Xl>
O},
so 1 has a nontransverse homoclinic connection for the fixed point O.
FIGURE 4.9. Homoclinic Connection for Example 4.1 The next step in the construction is to perturb 1 to a new diffeomorphism 9 which has a transverse homoclinic point. Let x· = (1.5,0) be the point where the homoclinic connection crosses the Xl-axis. Let U be a relatively small neighborhood of x· which satisfies the following properties: (i) I-I(U) n U = 0 and I(u) n U = 0, and (ii) letting 1 = W'(O, f) n U, Un Ij(1) = 0.
U
jy.o
268
VII. EXAMPLES OF HYPERBOLIC SETS AND ATTRACTORS
Let U' CUbe a smaller neighborhood of x·. Let ,6(x) be a nonnegative real valued bump function such that
{o
,6() x =
forx~U
and
1 for x E U'.
We use the bump function to define the perturbation k. by k.(x) = x
+ e,6(x) ( ~2 )
,
and the new diffeomorphism g. by g.(x) = k. o/(x).
(Notice that k. is a sheer near x·.) We defer to Exercise 7.19 the verification ofthe following statements for small enough e > 0: (a) g. is a diffeomorphism, is a saddle fixed point for g., (b) (c) letting I = W"(O, f) n U,
°
00
Uli(I) c W"(O,g.), i=O -00
U li(1) c WU(O,g.),
and
j=-I
k.(I)
c
WU(O, g.),
and (d) x· is a transverse homoclinic point for g•. Notice that the perturbation by composition with k. changes the unstable manifold in U but leaves the stable manifold unchanged in U. (The stable manifold in l-l(U) become I-I 0 k;I(I).) Part (d) shows that g. has a transverse homoclinic point which is the desired result. REMARK 4.9. The Kupka-Smale Theorem states that any diffeomorphism
I
can be
C'" -approximated by a diffeomorphism 9 for which (i) all the periodic points of 9 are hyperbolic and
(ii) for any pair of periodic points p and q for g, W"(p, g) is transverse to WU(q, g). The above example (and Exercise 7.19) gives an explicit example of the construction which makes condition (ii) true for a perturbation. See Section 10.1 for a discussion of the Kupka-Smale Theorem.
7.4.3 Melnikov Method for Homoclinic Points In the last section we showed how a horseshoe arises from a transverse homoclinic point for a hyperbolic periodic point. In this section, we give one way to verify that a certain type of differential equations has a transverse homoclinic point. This approach goes back to Poincare, but its recent use starts with Melnikov (1963). Some people call this the Poincare..Melnikov-Arnold method. For a more complete treatment of this type of result see Wiggins (1988) or (1990).
7.4.3 MELNIKOV METHOD FOR HOMOCLlN1C POINTS
269
The class of differential equations which we consider is the time periodic perturbations of a Hamiltonian system. We introduce the concept of a Hamiltonian system on a Euclidean space (or a Euclidean space cross a torus). Assume H(ql,' .. , qn, Ph'" ,p,.) is a real valued function of the 2n variables. The Hamiltonian differential equations generated by H are given by ·
8H
qj=-, 8pj
·
8H
P--) 8qj
for j = 1, ... , n. The corresponding Hamiltonian vector field is given by
Notice that H is conserved along a trajectory (is a weak Liapunov function):
Example 4.2. One simple example is where H is the sum of "kinetic energy", E j plus a "potential energy" which depends only on the positions q, V(q),
p2
H(q,p) =
L ~ + V(q). j
Then the equations of motion are given by
qj = Pi' · 8V Pj = --8 . q; For example, with n = 1 the equations
q=p, p = q -l = J(q) are Hamiltonian with potential energy
V(q) = -
!
q2 J(q)dq = - -
2
q"
+4
riJ/2,
270
VII. EXAMPLES OF HYPERBOLIC SETS AND ATTRACTORS
FIGURE 4.10.
Graph of the Potential Function p
-ffir----t-""*-t----ttt-q
FIGURE 4.11.
Phase Portrait for Example 4.2
and Hamiltonian function H(q,p) = p2/2 + V(q). The potential function V(q) is a quartic polynomial with two minima and one local maximum. See Figure 4.10. The phase portrait of the equations has one saddle point and two centers. See Figure 4.11. We next want to consider the perturbation of these equations by adding a periodic forcing term and a "frictional" term:
q =p, p= q -
q3
+ E"}'cos(wt)
- t6p.
We analyze this example below. This last form of the equations, with the periodic forcing term and a "frictional" term added, is of the form ( : ) = XH ( : ) +tY(q,p,t)
where Y has period T in t. We can add another variable independent of time:
T
to make the equations
7.4.3 MELNIKOV METHOD FOR HOMOCLINIC POINTS
271
where the variable T is taken modulo T. (In the explicit example above we could take T WI and T Wit rather than T t.) For the rest of the section we assume that n 1, 80 P and q are each real variables (or a single angle variable). We assume that XH has a hyperbolic saddle fixed point (qO,Po), 80 equations (.) have a closed orbit 'Yo for ( O. BecaulII! the eigenvaluee (characteristic multipliers) of 'Yo are not equal to one, for ( '" 0 but small, there persists a closed orbit 'y.. Next, we assume that XH has a homoclinie orbit for the fixed point, i.e., a point
=
=
=
=
=
(q, p)
e [W'«qO, po), XH) n WU«qO,po), XH») \
{(qO, Po)}.
In the (q,p, T)-space, the homoclinic orbit of XH becomes a homoclinie surface for 'Yo for the equations (.), f
= {(q, p, T) : (q,p) is on a homoclinic orbit for XH}.
For t > 0, the closed orbit r. remains hyperbolic and its stable and unstable manifolds vary smoothly with t on compact subsets. For each Zo E f, let a'(zo,t) be the point where W'(-}'., i.) intersects the normal to f through zo. By the smooth dependence on f, z'(zo, 0) Zo and z'(zo, t) is a smooth function of t. Similarly define ZU(zo, f). We want to measure the separation of the stable and unstable manifolds in the directions orthogonal to f, i.e., the separation ZU(zo, t) from z'(zo, f). The function H is a good measure of a displacement in these directions (since the gradient of H is nonzero at points in f), 80 we want to measure G(zo, () H(zb(ao, f» - H(a'(zo, (». Since G(zo, 0) == 0, it is possible to write
=
=
A zero of G(zo, t) corresponds to a homoclinie point. Since we want to measure the rate of separation with respect to t (the infinitesimal separation), we define M(zo)
:t
= H(zU(zo, €»I.=o =G(zo,O),
:€ H(z'(zo, f))l.=o
which is called the Melnikov function. The function M is considered a function from r to the real numbers. Then G(zo, f) equals M(zo) plus terms involving (. A zero of M corresponds to a place where infinitesimally the stable and unstable manifold continue to intersect. In fact the following theorem, which is a direct consequence of the implicit function theorem applied to G, gives a criterion that the manifolds actually intelllect for ( '" 0, (See Melnikov (1963), Holmes (1980), or Marsden (1984) for a proof.) The use of this function to prove the existence of a transverse homoclinic point and a horseshoe is referred to as applying the Melnikov method.
Theorem 4.6. Suppose Zo is a point on f with M(zo)
a;:
= 0 and BOrne directional
(zo) '" 0 (or v tangent to f. Then (or small enough € '" 0, r. bas a derivative transverse homoclinic intersection near Zo, In (act tbe point of transverse homoclinic intersection varies smoothly with (.
As a consequence of this theorem, i. has a hyperbolic horseshoe near Zo of the type indicated. In order for this result to be useful we need a method of calculating M. The next theorem gives just such a result.
272
VII. EXAMPLES OF HYPERBOLIC SETS AND ATI'RACI'ORS
i:
Theorem 4.1. The Melnikov function is given by the following improper integral: M(ZO) = where 1P0 is the flow of XH for PROOF.
f
DH...o(I ••o)Y(lPo(t, ZO)) dt
= 0 and X. = XH + fY.
We need to calculate I. H(z"(zo,()) for
! ft!
(1
=
U,S.
To do this, we calculate
H ° lP(t, z"(zo, f), ()I.-o
along the whole trajectory, and in fact
H ° lP(t, z"(zo, f), f)I.=o.
From now on, all the derivatives with respect to ( are evaluated at ( = 0 even though this is not explicitly noted. Then using several rules of differentiation (including the chain rule and a Lelbniz rule)
~
!H
0IP(t,z"(zo,e),f) = =
:f~H olP(t,Z"(Zo,f),f)
a ae [DH. (XH + eY)l"'(I,.~(.o.'}.
= (DH . Y) ... (I •• ~(SO.O).O)
a
+ af [DH . XH l. . (I •• ~(.o •• ) •• ) = (DH·
Y) .... (I.SO)
because DH· XH ;: 0 at all points and z"(ZO,O) = ZOo Now integrating between two times Tl and T2 we get
a H ° IP(T2 , z"(zo,e),e) - -8a H °IP(Tloz"(zo,e),f) = h~ (DH. Y) -8 e
T , " ' . (I
e
) .110
dt.
For (1 = S, we use Tl = 0 and let T2 = T2 go to infinity; for (1 = U, we use T2 = 0 and let Tl = Ti go to minus infinity. Substituting these into the definition of M(ZO) we get T.'
M(zo) =
r iT;
2
(DH· Y)
(I
~o t~
)
dt
+ :e H ° IP(T;., ZU(zo, f), e) -
!
H ° IP(T~, z'(zo, e), e).
By the chain rule,
!H OIP(~,Z'(Zo,f),f) = DH...(T.....O) !IP(T3,Z'(ZO,f),f) As T2 goes to infinity, the point IP(T2,zo, 0) goes to the fixed point P of XH in (q,p)space where DHp = 0, so DH...(T•• so.O) converges to the zero row vector. This quantity acts on :e lP (T2,z·(zo,e),e). The point at which this is evaluated, IP(T:l,z'(zo,e),f),
!
goes to 1., so the limit of IP(~, z'(zo, f),e) goes to the infinitesimal movement of the closed orbit which is bounded. (This uses the smooth dependence of the stable manifold on the parameter.) Combining, we get that I.H olP(~,Z'(ZO,f),f) goes to zero as T2 goes to infinity. Similarly, I.H 0 IP(T:z,ZU(ZO,f),f) goes to zero as T{ goes to minus infinity. Thus, letting Ti go to minus infinity and T2 go to infinity, we have proven that the integral converges absolutely and the equality given in the theorem is true. 0
7.4.3 MELNIKOV METHOD FOR HOMOCLINIC POINTS
273
We end the section by giving the calculation for the example given above. Example 4.2 (Continued). We consider the equations
q=p p = q -l + t"(C08(T) - €6p t
=W.
The Hamiltonian function is H(q,p) = p'l/2_ q2/2+p4/4. The point (q,p) = (0,0) = 0 is a saddle fixed point with a homoclinic connection on two sides. A parameterized form of these two connections is given by
q±(t») ( ±2!sech(t) ) ( pff(t) = =F2t sech(t) tanh(t) T(t) TO + tw as can be verified by differentiation. To apply the theorem we use one of the two homoclinic orbits. We use the positive side and drop the label '+'. Other parameterizations of homoclinic orbits are given by (qo(t - to),Po(t - to),T(t» where to is the time at which p = O. The two variables (to, TO) parameterize the points on r. We actually only need to calculate M on some cross section. Either to = 0 or TO = 0 is a reasonable cross section. For now we keep both variables. Letting s = t - to in the integral,
M(to. TO) =
1: (-
qo(t - to)
+ qg(t -
to),Po(t - to» x
( "( COS(TO + tw)0 - 6Po(t - to) )
=
=
-61: -61:
Po(s)2 ds + "(
1:
Po(s) COS(TO
Po(s)2 ds + "(C08(TO
- "(sin(To + toW)
+ tow)
1:
& + tow + ws) ds
1:
Po(S)C08(WS) ds
Po(s)sin(ws) ds.
The integrand of the second integral is an odd function of s because Po(s) is an odd function and C08(WS) is an even function. Therefore the second integral is zero. Substituting in for Po(s) in the first integral,
as can be directly calculated by a substitution. The third integral is given as follows: -1 sin (TO
+ tow)
1:
Po(s) sinews) ds
1:
= 1 sin(To
+ tow)21
= "(sin(To
+ tow)2 1",wsech( "';)
sech(s) tanh(s) sinews) dB
274
VII. EXAMPLES OF HYPERBOLIC SETS AND ATIRACTORS
as can be calculated by means of residues (after the substitution z = ea.) Therefore
M(to, TO) = Now we take the cross section
'TrW 3-40 + 1'2 ! 7rwsech("2) sin(To + tow).
TO
= 0, and set
This function has a nondegenerate zero as long as
The meaning of this inequality is that as long as there is not too much friction in comparison with the periodic forcing, there is a transverse homoclinic orbit.
7.4.4 Fractal Basin Boundaries A certain perspective on Dynamical Systems places most of the emphasis on the attracting sets, i.e., sets A such that {q: w(q) C A} is an open neighborhood of A. (See Section 7.6 for the actual definition.) If we are mainly interested in attracting sets, then in what way are horseshoes important? One answer to this question is that the stable manifolds of a horseshoe can form the separating set between two different basins of attraction. A good introduction to this type of result is contained in Alligood and Yorke (1989). Also see Grebogi and Yorke (1987). We start with an example. Example 4.3. Instead of the horseshoe which gives the full two shift, we consider the horseshoe which gives the full three shift. Figure 4.12 shows a neighborhood Nand its image feN) by a diffeomorphism f on the two sphere. We assume that f(A) C A and feB) c B and there is a fixed point sink in each of these regions, ql E A and q2 E B. The square middle region S contains a hyperbolic horseshoe, A, such that flA is conjugate to the two sided shift map on three symbols. Considering the diffeomorphism on S2 we add a source at infinity, qoo. If we are careful in the construction, then R(f) = AU {QI,q2,qoo}' B
A
FIGURE 4.12
7.5 HYPERBOLIC TORAL AUTOMORPHISMS
275
The sphere can be split up into the stable manifolds of the invariant sets,
8 2 = W'(qt) U W'(q2) U W'(A) U {q",,}. The attracting sets for this example are merely the two fixed point sinks. Their basins are dense in the sphere. The stable manifold of the horseshoe separates them. The reader should determine which points in the gaps 8 \ l-t(8) are in W'(qt} and which are in W'(q2). In this case the boundary of the basin W'(q;) for either fixed point sink is W'(A) U {qoo}' This set has the structure of a Cantor set across the stable manifolds and so is not a smooth manifold. Such sets are called fractal because they have fractional Hausdorff (or box) dimension. We discuss these concepts in Section 8.4. If q is a fixed point sink for a diffeomorphism I, let B be the boundary of its oRSin of attraction. This is the boundary in the sellse of point set topology, B = cl(W'(q» \ W'(q). This set B is called the basin boundary. In studying a basin boundary, an important feature is which points are accessible from within W·(q). A point x E B is called accessible provided there is a curve 'Y starting in W'( q) and x is the first point on 'Y which is not in W·(q). In the above example, not all points of W'(A) are accessible. If PI, P2, and P3 are the three saddle fixed points in the corresponding vertical strips, then the accessible points in this example are W'(pd U W'(P3) U {q",,}. To understand this fact, let L be a vertical line segment in 8 from top to bottom. Then W'(A) n L is a Cantor set. The only accessible points in W'(A) n L from within L are the end points of the Cantor set (at a finite stage in its formation). But these end points are exactly the points on the stable manifolds of Pt and P3 (and not the middle fixed point P2). Alligood and Yorke (1989) discusses the question of showing that the basin boundary contains a periodic point. Also see Barge and Gillette (1991). There has also been a number of papers on understanding how the basin boundary changes from a smooth set to a fractal set under a deformation of the diffeomorphism. See Alligood and Yorke (1989), Grebogi, Ott, and Yorke (1987), Hammel and Jones (1989), and Alligood, Tedeschini-Lalli, and Yorke (1991).
7.5 Hyperbolic Toral Automorphisms The horseshoe is an example of an invertible map with infinitely many periodic points. It is created on a piece of Euclidean space so can occur on any manifold. In this section we consider another type of example with infinitely many periodic points which is more global in nature; the dynamics occur on the whole torus. As such, this type of dynamics is not as prevalent as the horseshoe, but it is an important construction of a more global example. We use it later in the chapter to construct an attractor called the DA attractor. The hyperbolic toral automorphisms were introduced about the same time as the horseshoe examples. R. Thom is credited with suggesting them to Smale as examples with infinitely many periodic points. They are special cases of systems called Anosov diffeomorphisms or flows. Anosov (1967) showed that the geodesic flow on a manifold with negative curvature are examples of Anosov flows. In this section we define a general Anosov diffeomorphism but only analyze the hyperbolic toral automorphisms. To prove that a horseshoe has infinitely many periodic points, we show it is conjugate to a subshift of finite type. The proof in this section that a hyperbolic toral automorphism has infinitely many periodic points is more straight forward and a1gebraic. In Section 9.4, we prove that any Anosov diffeomorphism has infinitely many periodic points by a more analytic or geometric argument. We start with the definitions.
276
VII. EXAMPLES OF HYPERBOLIC SETS AND ATTRACTORS
Definition. A diffeomorphism f : M -+ M is called an Anosov diffeomorphism provided f has a hyperbolic structure on all of M. It is ca.1led a toral Anosov diffeomorphism provided in addition M is a torus, 1"". Construction of hyperbolic toral automorphisms. The examples of toral Anosov diffeomorphisms which we introduce are induced by a linear map on Rn. Let A = (aij) be an n by n matrix with all the aij integers, det(A) = ±l, and A hyperbolic (no eigenvalue of absolute value one). Let LA : Rn -+ Rn be the induced linear map on Rn. Because A has all integer entries, LA takes the integer lattice zn (points with all integer components) into itself. Because in addition the determinant of A is ± I, A-I has integer entries and (LA)-I = LA-I a.1so takes zn into itself, so LA(zn) = zn. Let 11': R n -+ 11"" be the projection which takes a point x = (Xl, ... ,xn ) in Rn to the point in the torus by taking each component modulo one. Because LA(zn) = zn, LA induces a map fA from the torus 1"" to itself such that fA 01l'(x) = 11' 0 LA (x): if 1I'(x) = 1I'(x') then x = x' + m with m E zn, so 11' 0 LA(x) = 1I'oL A(x') + 1I'oLA(m) = 11' oLA(x') and fA 0 (x) is well defined. Because det(A) = ±l, the inverse A-I is again a matrix with integer entries, so fA is a diffeomorphism with fA I = fA-I. These maps are ca.1led hyperbolic toral automorphisms. Example 5.1. Let
A=
C~)
and A2 =
C ~).
These are both hyperbolic matrices. The eigenvalues of A are ,x± = (1 ± 5 1/ 2 )/2, ,X + ::::: 1.618 and ,X - ::::: -0.618, so I,X + 1> 1 and I,X -I < 1. The eigenvectors are
for ,X - and ,X + respectively. Notice that both eigenvectors have irrationa.1 slope. It can be shown that this must be the case for hyperbolic integer matrices of determinant one. Notice that A reverses orientation and has one negative eigenvalue, while A2 preserves orientation, with two positive eigenvalues, (,X±)2. The eigenvectors for A2 are the same as those for A. With the above example as a model, we can now state the main theorem. Theorem 5.1. Let fA be a hyperbolic toral automorphism. Then the following are true. (a) The periodic points are dense in yn. In particular, there are an infinite number of periodic points. Also, the nonwandering set of fA is all of yn. (b) The toral automorphism fA has a hyperbolic structure on all ofyn: (i) E" is the space of generalized stable eigenvectors of A, (ii) EU is the space of generalized unstable eigenvectors of A, (iii) E~ is the translation ofE" to TpT", (iv) E~ is the translation ofEu to TpT", (v) fA is an Anosov diffeomorphism, and (vi) if1l'(p) = p, then W"(p) = 1I'(p + E") and WU(p) = 1I'(p + EU). (c) The toral automorphism fA is topologically transitive. (d) The toral automorphism fA is expansive and so it has sensitive dependence on initial conditions. (e) The toral automorphism fA is structurally stable.
7.5 HYPERBOLIC TORAL AUTOMORPHISMS
277
PROOF. (a) For fixed positive integer k, let Rat(k) be the rational points in T" with denominators k: Rat(k) = 1I"{(it!k, ... ,inlk) : i j E Z} cT".
Then, LA({(it!k, ... ,inlk) : i j E Z}) c {(it!k, ... ,inlk) : i j E Z}, 80 fA(Rat(k» c Rat(k). This set has a finite number of points (kn points) and fA is one to one, 80 fAIRat(k) is a permutation of Rat(k) and every point in this set is periodic. Finally, Uk Rat(k) is dense in the torus, so the periodic points are dense. The other two statememts in part (a) easily follow from the first. (b) Because 11" : IR n -+ Tn is onto, these give global coordinates. If U c T" is an open set and 1{il,l{i2 : U -+ Rn are two sets of local coordinates on U which are inverses of 11", then 1{i2 0 1{i,1 (x) = X + m where m E Therefore, for PET", the tangent space at p can be thought of as {p} x Rn. In the local coordinates, the map fA is given by LA. AB a map on Rn , the derivative of LA at a point p is equal to the matrix A (or the linear map LA), D(LA)fI = A. Therefore, D(fA)p = A as a map from TpT" = {p} x Rn to T, .. (p1"" = {f(p)} x Rn. In dimension two, the span of the eigenvectors gives the stable and unstable directions for A, IE" = {tv" : t E R} and lEu = {tv" : t E R}. In higher dimensions they are the generalized stable and unstable eigenspaces of A respectively as stated in the theorem. There is a C ~ 1,0 < " < I, and ~ > 1 such that IIA"IE"II ~ C ,," and IIA-"IEull ~ C ~ -" for k ~ 1. For any point p E 1"", define the subspaces at p to be the translates of the stable and unstable eigenspaces of A, E~ = {p} x IE" and IE~ = {p} x lEu. For p E Tn and v E IE~, ID(f~)pvl = IAkvl ~ C "klvl where 0 < " < 1 and C ~ 1 are the constants given above. The bound C ,,"Ivl goes to zero as k goes to infinity, so IE~ is made up of stable vectors. A similar argument holds for v E IE~ as k goes to minus infinity. This proves the hyperbolic structure. Because fA has a hyperbolic structure on all of Tn and all points are nonwandering, fA is an Anosov diffeomorphism. Next we turn to the stable manifold for a point p = 1I"(p), For t > 0 and q = 7r(q), let B(q,f) C Rn be the ball of radius E and U(q,E) = 7r(B(ii,f» C T". For E small enough fAI(U(f(q), E» does not wrap around the handles ofthe torus, 80
zn.
fAI(U(f(q), E» n U(q, E) = 7r[L:4 1 (B(LA(q), E)) n B(q, E)l. Then the local stable manifold is given as follows: n~O
n L:4n(B(L~(p),t»] 1I"[P+ n A-n(B(O,t»].
= 7r[
n~O
=
n~O
By the properties of the linear map, 7r[p + 1E"(C-1E)] Therefore the global stable manifold is given as stated:
c W:(P,fA) c 7r[p + E"(E)].
n~O
The result about the unstable manifold is proved similarly. This proves part (b) of the theorem.
278
VII. EXAMPLES OF HYPERBOLIC SETS AND ATTRACTORS
Corollary 5.2. For any point p, W'(p) and W"(p) are each dense in T" and so are their intersections, which are transverse homoclinic points. PROOF. For n = 2, the slopes of the lines lE' and IE" are irrational in ]R2, so the lines W'(p) = 1T(iHIE') and W"(p) = 1T(p+lE") are each dense in 'f2, and their intersections are dense. For n > 2, we replace the lines with subspaces. The stable and unstable manifolds of LA of a point q are given by W'(q, LA) = q + lE' and W"(q, LA) = q + lEu. Let q be a periodic point of period m with lift q. Then W"(O, LA) intersects W'(q,L A ) in a point z (because these two affine subspaces are complementary dimensions and not at all parallel). Letting z = 1T(i), z E W"(1T(0).!A) n W'(q, fA) and d(fA(z),/A(q)) goes to zero as n goes to infinity. Therefore W"(11'(0),fA) accumulates on q at the points fmn(z). Because the periodic points are dense in 'f", W"(11'(0).!A) is dense in 'f". Similarly, W'(1T(0),fA) accumulates on q and so W'(11'(0).!A) is dense in 'f". Because W"(1T(0),fA) and W'(1T(0), fA) are projections of complementary subspaces, they intersect transversely arbitrarily near q, so the transverse homoclinic points are dense in 'f". For an arbitrary point p in 'f", W" (p, fA) and W' (p, fA) are translates of the manifolds of 0, so they are each dense in 'f", and the homoclinic intersections for p are dense 0 in 'f". PROOF OF THEOREM 5.1 CONTINUED. (c) To prove that fA is topologically transitive we verify the hypothesis of the Birkhoff Transitivity Theorem. Let U and V be any two open sets in T". The stable manifold of the origin, W'(1T(0)), is dense in 'f", so it intersects U in a point q. Let J = 1T(q + lE"(r)) where r > 0 is small enough so that J cU. If A is a lower bound on the unstable eigenvalues, then the k-th iterate of the disk, f~(J), contains a disk of radius at least least C-1Akr in W"(f~(q)). As k increases, the "radius" of f~(J) becomes larger, so f~(J) accumulates on compact pieces of W"(1T(0)), and so it must intersect V. For this iterate,
o~ f~(J) n V c
f~(U) n V.
Thus 0+ (U) n V ~ 0. Similarly, 0- (U) n V ~ 0. By the Birkhoff Transitivity Theorem, fA is topologically transitive. (d) Given any point p and q E W"(p), the distance d(fk(p),fk(q)) ~ Akd(p,q) as long as the distance stays less than one. The quantity Akd(p, q) grows, so the distance gets bigger than 1/4. This proves that fA is expansive with expansive constant 1/4, which is part (d). (e) The proof of structural stability uses the proof of the Hartman-Grobman Theorem, Section 5.7. After looking at the lifts of the diffeomorphisms to Rn, the main change is that a little extra checking needs to be done in the proof that the conjugacy is one to one. The map LA is the lift of fA to a map on Rn. Let 9 be a C l perturbation of fA· It is possible to choose the lift G : Rn -+ Rn for which G(O) is near 0 = LA(O). Let = G - LA. Then for any lattice point w E zn, LA(x + w) - LA(x) = LA(w), and G(x + w) - G(x) = LA(w). (This latter equality can be proved by taking a homotopy 9t with 90 = fA and 91 = g. The lift G t will have Gt(x + w) - Gt(x) a lattice point for each 0 $ t $ I, so G(x + w) - G(x) = Go(x + w) - Go(x) = LA(w).) Then, a(x + w) = a(x). Because of the periodicity of a, it is C l small on all of Rn. This shows that G is C t close to LA on all of Rn. To conjugate LA and G, we solve for H id + v. The map v should be a bounded function on Rn and periodic. As before, let Gg(Rn) be the space of all bounded contin-
a
=
7.5.1 MARKOV PARTITIONS FOR HYPERBOLIC TORAL AUTOMORPHISMS
279
uous maps from Rn to itself. Now we let cg,per(R") = {v E cg(R") : vex + w) = vex) for all w E Z",x E R"}
and Cl,per(R n ) = cg.per(R n ) n CI(Rn).
Then, if G E Cl,per(R"). Because ofthe periodicity, there is a uniform bound on IIDGxll for all x ERn. For G E Cl,per(R") and v E cg,per(R"), as in the proof of Hartman-Grobman we let where
C(v) = (id - (LA)*)V = v - LA ovoLAI. A direct check shows that 8(G,·) preserves C::,per(R n ). (We leave this verification to the exercises. See Exercise 7.22.) Exactly as in the proof of the Hartman-Grobman Theorem, if Lip(G) is small relative to the distance of the contraction and expansion rates away from one, 8(G, -) has a fixed point v~ E C::,per(R"). Letting HG = id + v~ we get that C(v~) = Go (id + v~)
HG
0
LAI
= id + v~ = LA 0
LAI
= LA
+ GoHG 0 LAI
0
HG oLAI
+ LA 0 va 0 LAI + C(va)
=GoHGoLAI on IRn. For W E zn, HG(x + w) = HG(x) + w so HG induces a map hg on 1'" that satisfies go hg 0 fA I = hg. Next we check that hg is one to one. If hg(x) = hg(Y), x is a lift of x, and y is a lift of y, then HG(x) = HG(Y) + w = HG(Y + w) for some w E Z". Replacing Y with y' = y + w we get another lift of y with Ha(x) = HG(Y')' Because Ha is one to one, x = y' and x = y. (The proof that HG is one to one uses the fact that LA is expansive: HG 0 L:4(x) = HG 0 L:4(y') for all n so x = y'.) Thus hg is one to one. By invariance of domain, hg(T") is open in 1"'. Since it is also close, hg(T") = T", and hg is onto. This completes the proof that hg is a homeomorphism, that fA is structurally stable, and the proof of the theorem. 0 REMARK 5.1. In Section 9.7 we prove that all Anosov diffeomorphisms are structurally stable. Manning (1974) proved that any Anosov diffeomorphism on a torus is topologically conjugate to a hyperbolic toral automorphism. One conjecture which is still unknown is whether being Anosov implies that all points are nonwandering (or chain recurrent) .
7.5.1 Markov Partitions for Hyperbolic Toral Automorphisms We want to connect the dynamics of a hyper l,. ; ,.1 automorphism, f : T" -+ T", with that of a subshift of finite type, i.e., to see how symbolic dynamics can be applied to a hyperbolic toral automorphism. We need to find (and define) the replacements for the geometric boxes of the horseshoe which are used to define the symbol sequences. The theory which we give is for all dimensions, but the examples are all in two dimensions where the situation is simpler.
280
VII. EXAMPLES OF HYPERBOLIC SETS AND ATTRACTORS
Example 5.2. We introduce the ideas of rectangles, a Markov partition, and the semiconjugacy using the toral automorphism fA induced by the matrix A =
(~ ~ ).
We
want to subdivide the total space into rectangles (which can be taken to be actual parallelepipeds in two dimensions but not higher dimensions). The eigenvalues are A" = (1 + 5 1/ 2 )/2 with eigenvector v" = (2,5 1/2 - 1) and A. = (1 - 5 1/ 2 )/2 with eigenvector v' = (2, _5 1/ 2 - 1). Note that Au + A. = 1 = tr(A) > O. Thus A" = tr(A) - A., and the fact that the trace of A is a positive integer insures that Au > O. Then A"A. = det(A) = -1 < 0, 80 this insures that A. < O. Also, v" has positive slope and v' has negative slope. To form the rectangles for A, we look in the covering space, R2. From the origin and other lattice points take the part of the unstable manifold of this point in R2 that crosses the fundamental domain above and to the right of the lattice point. See Figure 5.1. Next, extend the stable manifold from the lattice point downward to the point a where it hits the part of the unstable line segment drawn above. Similarly, extend the stable manifold upward from a lattice point to the point b where it hits the part of the unstable manifold drawn above. Finally, extend the unstable manifold to the point c where it hits the line segment [a, bl. in the stable manifold. These line segments, [a, bl. in W'(O) and [0, cl" in W"(O) (and their translates in R2), define two rectangles RI and R2 in T2. See Figure 5.1.
FIGURE 5.1. Rectangles for Example 5.2 To find the images of the rectangles, we first consider the images of the points a, b, and c: fA(a) = b, fA(b) = c, and fA(C) E [0, bl., where [x,yl. is a line segment in the stable manifold from x to y. See Figure 5.1. Using these images, it follows that fA(Rd crosses R} and R 2 , f A(R2 ) crosses R I •
See Figure 5.2. The pair of rectangles {R}, R 2 } have the properties of a Markov partition for fA: (i) the collection of rectangles covers y2, (ii) the interiors of R} and R2 are
7.S.1 MARKOV PARTITIONS FOR HYPERBOLIC TORAL AUTOMORPHISMS
FIGURE 5.2.
281
Images of Rectangles for Example 5.2
disjoint, and (iii) if IA(int(~» n int(Rj ) # 0, then IA(~) reaches all the way across R j in the unstable direction and does not cross the edges of Rj is the stable direction. (There is a fourth condition which we only discuss implicitly below in terms of the semi-conjugacy.) We give the general definition below. We define a transition matrix which indicates which itineraries for the orbit of a point are allowable: for a transition from rectangle ~ to R j to be allowable, it must be possible for an orbit of a point to pass from the interior of ~ to the interior of R j • (We disregard the fact that the image of the boundary of R'l hits the boundary of R'l') In this example the transition matrix is given by
B=G
~).
Notice that this transition matrix B is the same matrix as the original matrix A which induced the toral automorphism. The shift space for B is the two sided subshift of finite type E8 = {s : Z -+ {I,2} : b•••• +1 = I} with shift map (T8 = (TIE8' To define the symbolic dynamics, we can not get a continuous map (conjugacy or semiconjugacy) h from T2 to E8 because T'l is connected and E8 is a totally disconnected Cantor set. Also for a point p E 8(~) there are at least two choices of rectangles to which p belongs. Therefore, there is no way to assign a unique symbol sequence to points on the boundary of a rectangle. Instead, we define a map going the other direction, h: E8 -+ T2. We want h to be a semiconjugacy (continuous, onto, and IA 0 h = h 0 (T8). To do this we define h : E8 -+ y2 by
n n l .. n
00
h(s) =
cI(
n-O
j(int(R.J »).
j_-n
We take the images of the interiors because RI n IA'(R2) does not always equal cI(int(RI) n IAI (int(R2» but can have extra points whose images are on the boundary
282
VII. EXAMPLES OF HYPERBOLIC SETS AND ATTRACTORS
of R 2 . (We must put up with the annoyance to be able to use fewer rectangles.) Using the general theory, Theorem 5.3 proves that this h is a semi-conjugacy. In fact, it proves that h is at most four to one. In order to give the precise definitions of rectangle and Markov partition, it is necessary to indicate what we mean by the component of a stable or unstable manifold for a point in a rectangle. As we have done before, we use the notation of comp.(S} to be the connected component of the set S containing the point z. We think of W"(z, R} as equal to comp.(RnW"(z}} for (1 = u,s and Rone of the rectangles (ifit is connected). However, this definition does not quite work, because even in the rectangles for Example 5.2 there is a difficulty: in '1'2, RI touches itself along the projection of the line segment from 0 to c, lI'([O,cl.}. When the total ambient manifold is a torus, a better definition of the stable and unstable manifolds in a rectangle uses the covering space R2 as follows. Let R he one lift of a rectangle R in '1'2 to a rectangle in )R2, so 11' : R -+ R is onto and one to one in the interior. Let z be a lift of z, z E Rand lI'(z} = z. For (1 = U, s, define W" (z, R) = lI'(W" (z) n R).
Note in Example 5.2, for z = 0 and rectangle R\, there are two choices for the lift RI which touches the origin 0 in R2. (There is one choice above and to the right of 0 and one below and to the left.) Making either of these choices, W"(O, Rd = lI'(W"(O, RI » is a proper subset of compo(RI n W"(O}}. In fact, compo(R I n W"(O)} is the union of the two choices for W"(O, Rd. We now use the motivation of the rectangles defined above for the specific example to give a general definition of both a rectangle and a Markov partition. Definition. For a hyperbolic toral automorphism on the n-torus, 'lr', we proceed as follows. Let R be a subset of 'lr' and z E R. Let R is a lift of R to Rn and z E R be a lift of z, i.e., 11' : R c Rn -+ R is a homeomorphism, and 1I'(z} = z. If R is connected then R should be taken to be connected; if R is not connected, then care must be taken to choose the points in (1I')-I(R) in a reasonable manner, e.g. R should be in one fundamental region of 11' : Rn -+ 'lr'. For (1 = U, s, let W"(z, R) = lI'(W"(z) n R).
We do not give a completely precise definition for the general case of a hyperbolic invariant set A. An isolated hyperbolic invariant set has a property called a local product structure provided for t > 0 small enough, there is a () > 0 such that if d(x,y} < () for x.y E A, then W.U(x} n W:(y) is a single point in A. Let A be a hyperbolic invariant set with a local product structure, let R be a subset of A that has diameter less than 6, and let z E R. Then W"(z, R} == R n W:(z) using the local stable and unstable manifolds of size t. This general case was considered by Bowen (1970a, 1975). Also see Section 9.6. Definition. Let f be a diffeomorphism with a hyperbolic invariant set with a local product structure. (This includes the case where f is a hyperbolic toral automorphism.) A nonempty set R of T" (or of A) is a (proper) rectangle provided (i) R = cl(int(R}} (where the interior is relative to A) so that it is closed, and (ii) p, q E R implies that W'(p, R} n WU(q, R) is exactly one point, and this point is in R. If we are considering a hyperbolic tOrai automorphism, then the same lift must be used for R to determine both W'(p, R) and W"(q, R).
7.5.1 MARKOV PAIUITIONS FOR HYPERBOLIC TORAL AUTOMORPHISMS
283
REMARK 5.2. In his general definition, Bowen defines W:(p) n W:'(q) == [p,q). He then demands that for p, q E R that [p, q) is exactly one point, and that this point is in R. Note, if we use Bowen's definition then RI is not a rectangle in Example 5.2 because there are points p and q in RI near 0, for which W.·(p) n W:'(q) is in R2 and not in R I • Using the fact that the manifold is a torus and our definition ofthe subsets of the stable and unstable manifolds using lifts, the sets RI and R2 given in the above example are indeed rectangles.
Below, we define a collection of rectangles (a Markov partition) which have the properties needed to use them to define symbolic dynamics. The definitions use the notion of the interior and boundary of a rectangle. A point pER is a boundary point 01 R if arbitrarily near to p there is a point q in A such that q rt R. (This is the usual pointset boundary of a subset.) If p is a boundary point of R, it follows for such q, that either W'(p) n WU(q) or W'(q) n WU(p) is not in R. Let 8(R) be the set of all boundary point.s of R, and the interior of R be the complement of 8(R) in R, int(R) = R \ 8(R). Definition. Assume that I : M -+ M is a diffeomorphism which has an isolated hyperbolic invariant set A with a local product structure. (This includes the case where I is a hyperbolic toral automorphism with A = M.) A Markov partition for I is a finite collection of rectangles, 'R = {R j } J= I' that satisfies the following four conditions. (All interiors are taken relative to A.) (i) The collection of rectangles cover A, A = U;"=1 R j • (ii) If i ~ j then int(R;) n int(R j ) = 0 (so int(R;) n Rj = 0). (iii) If z E int(R;) and I(z) E int(R j ) then I(WU(z, R;» :) WU(f(z), R j
)
c
).
I(W'(z, R;»
W'(f(z), R j
(iv) (The rectangles are small enough.) If z E int(R;) n
and
r
1 (int(Rj
int(Rj)n/(WU(z,int(R;») = WU(j(z),int(Rj int(R;) n
r
l
where WO' (Zl , int( R,,» rectangle R".
» then
»
and
(W'(f(z), int(R j ») = W'(z, int(R;»
= WO' (Zl , int( R,,» n int( R,,) for = 'U, s, any point Z/, (1
and
Definition. Once we have a Markov partition. we want to set up the symbolic dynamics of the subshift of finite type by means of a transition matrix. Given a Markov partition 'R = {R j }j!.I' the transition matrix B = (b;j) is defined by b.. = {I '] 0
if int(f(R;» n int(Rj ) ~ 0 if int(f(R;» n int(Rj ) = 0.
The shift space lor B Is defined as E8
= {s: Z -+ {I, ...• m}
: b"' H1
= I}.
Letting (1 be the shift map on the full m-shift, Em = {I, ...• m}z. define 0'8 = 0'IE8 : E8 -+ E8·
284
VII. EXAMPLES OF HYPERBOLIC SETS AND ATTRACTORS
Example 5.3. For the geometric horseshoe, let R.;
= Hi n A for i = 1,2.
rectangles form a Markov partition with transition matrix
(! !).
These two
For the hyperbolic invariant set created for a homoclinic point, the sets R.; = A n Ai form a Markov partition with transition matrix B given in the proof of Theorem 4.4(b). REMARK 5.3. Notice that we do not demand that a rectangle be connected, although the examples we give for hyperbolic toral automorphisms are connected. There are examples where a rectangle has countably many components even for a Markov partition of a total space which is connected. In general, for a Markov partition of a hyperbolic invariant set, the total space is often not connected or even locally connected, so a rectangle certainly could not be connected in this case. REMARK 5.4. Let 8(R.;) be the boundary of R.; relative to A. Conditions (i) and (ii) in the definition of a Markov partition imply that 8(R;) = {p E R, : pER) for some j ." i}. This holds because clearly int(Ri) n {p E R.; : p E R j for some j ." i} = 0, so 8(Ri) :::J {p E R; : pER] for some j ." i}. Next, if p E 8(Ri ), then there are qk E Rj> with jk ." i and qk converging to p. Because there are a finite number of rectangles, by taking a subsequence we can take all the jk = j to be the same. Because R j is closed it follows that p E Rj • This proves that 8(R;) C {p E R; : p E R j for some j ." i}. REMARK 5.5. Condition (iii) in the definition of a Markov partition insures that if the image of a rectangle hits the interior of another rectangle, then it goes all the way across in the unstable direction and is a subset in the stable direction (goes all the way across in the stable direction when looking at the inverse). Note that if a point z is on the boundary of a rectangle R.;, then the image of R.; can abut on another rectangle R j without even going into the interior of rectangle R j • (Thus the condition (iii) does not necessarily hold for the points on the boundary.) REMARK 5.6. Condition (iv) is not included in Bowen's definition because he only used small rectangles. It is added to our list to make the point determined by a sequence of rectangles allowed by the transition matrix well defined. This condition prohibits the image of a rectangle R.; from crossing a rectangle R j twice. Note that it does allow the image to intersect the boundary a second time. (See f(Rd and R2 in Figure 5.2.) We could strength Condition (iv) to the following assumption: (iv)' for z E int(R;) n f-l(int(R j »),
Rj nf(W"(z,R.;») = W"(f(z),R j )
R.; n
r
1 (W"(f(z),
and
R j ») = W'(z, R.;).
This condition does not allow the image of a rectangle R.; to cross the rectangle R j once and then intersect the boundary a second time. Therefore the partition constructed in Exanlple 5.2 satisfies assumption (iv) but not assumption (ivY. The advantage of assumption (iv)' over (iv) is that the definition of the conjugacy in Theorem 5.3 without taking interiors and closures. See Remark 5.10. For some purposes, people allow the image of a rectangle to cross more than one time. If multiple crossings are allowed, then (1) Condition (iv) is not included in the definition and (2) the transition matrix must be allowed to have integer entries which are larger than one, i.e., we get an adjacency matrix as defined in Section 7.3.1 on subshifts for matrices with nonnegative integer entries. More precisely, assume there is a partition by rectangles {R.;}:'..l which satisfies conditions (i-iii) for a Markov partition but not necessarily condition (iv). To such a partition, we can associate an adjacency matrix A = (aij) where the entry ail equals the number of times that the image !(R i ) crosses
7.5.1
MARKOV PARTITIONS FOR HYPERBOLIC TORAL AUTOMORPHISMS
285
the rectangle R J • Thus if aij = 2, then f(R;) crosses R j twice. We do not pursue this connection. See Franks (1982). REMARK 5.7. Adler and Weiss (1970) gave a method of constructing simple Markov partitions for hyperbolic toral automorphisms on T2. Assume A is a 2 x 2 adjacency matrix with all positive entries and which induces a hyperbolic toral automorphism on y2. Then, there is always has a partition by two rectangles, {Rb R2}, such that (1) the partition satisfies all the properties of a Markov partition except (iv), and (2) the image f(R;) has aij geometric crossings of R j . The recent theses by Snavely (1990) and Rykken (1993) give more details on constructing such a Markov partition. REMARK 5.8. It should be noted however that even for Markov partitions for hyperbolic toral automorphisms in T" with n ~ 3, the boundaries of the rectangles are not smooth. Thus the "rectangles" are much different than the simple two dimensional example leads one to believe. See Bowen (1978b). REMARK 5.9. Bowen (1970a) proved that any hyperbolic invariant set with a local product structure has a Markov partition. We prove this result in Section 9.6. In this chapter, we restrict ourselves to finding Markov partitions for hyperbolic toral automorphisms on y2 and the solenoid which is defined in Section 7.7. We can now state the main result.
Theorem 5.3. Let 'R = {Rj}j=l be a Markov partition for a hyperbolic toral automorpllism on T2. Let (EB,O'B) be tile shift space and h: EB -+ y2 be defined by h(s) =
n cl( n rj(int(R.;»). 00
n
n-O
j--n
Then h is a finite to one semiconjugacy from O'B to f. In fact h is at most m 2 to one where m is the number of rectangles in the partition.
REMARK 5.10. If we used assumption (iv)' given in Remark 5.6 above, then we could just use the intersection of the images f-j(R. J ) to define h,
n rj(R. 00
h(s) =
J ).
j=-oo
This latter intersection is usually used to define the conjugacy. The problem is that f(int(R;» n int(Rj ) can be nonempty and f(Ri) abut on the boundary of R j at points for which there are no nearby interior points, so cl (J(int(R;» n int(Rj
»i:
f(R;) n R j •
See Example 5.2. We allow such intersections on the boundary in order to find Markov partitions with fewer rectangles. This forces us to use this slightly more complicated definition of h given above. REMARK 5.11. This theorem is used in Section VIII.1.2 to prove that the topological entropy of FA can be calculated by the largest eigenvalue of B. PROOF. By condition (iv), c1(int(R•• ) n f- 1 (int(R'.+I))) is a nonempty subrectangle that reaches all the way across in the stable direction. By induction, k+i
cl (
n rj(int(R.;»)
j-"
286
VII.
EXAMPLES OF HYPERBOLIC SETS AND ATTRACTORS
is a nonempty subrectangle that reaches all the way across in the stable direction for any k E Z and i E N. The width of this set in the unstable direction decreases exponentially at the rate given by the inverse of the minimum expansion constant. Thus
n (n n
00
cl
n-O
ri(int(R.;)) = W·(p .. R. o)
j-O
for some P. E R.o ' Similarly,
n n 0
-00
ri(int(R.;))) = WU(Pu,R. o)
cl(
n-O
j=n
for some Pu E R.o . Therefore,
n n n
00
cl(
n=-oo
ri(int(R.;)))
= W·(p.,R'o)nWU(pu,R. o)
)=-n
is a unique point p = h(s). This shows that h is a well defined map. By arguments like those used for the horseshoe, h is continuous, onto, and a semiconjugacy. If Ji (p) E int( R.;) for all j, then h -I (p) is a unique symbol sequence, s, because Ji(p) rI Rk for k '" si' Thus, h is one to one on the residual subset (in the sense of Baire category)
Next we show that h is at most m 2 to one, where m is the number of partitions. Let p = h(s). As we showed above we only have to worry if J"(p) is on the boundary of some rectangle Ri . We want to distinguish the boundary points of a rectangle R which are on the edge of an unstable manifold in the rectangle, WU(z, R), and those which are on the edge of a stable manifold, W'(z, R). Let 8'(R)
= {x E 8(R) : x rI int(WU(x, R))}
and
8U (R) = {x E 8(R) : x ¢ int(W·(x,R))}.
Here int(WU(x, R)) is the interior relative to a compact part of the manifold WU(x, R). Similarly for int(W'(x, R)). Then 8'(R) is the union of stable manifolds W'(z, R), and au (R) is the union of such unstable manifolds. If J"(p) E 8'(R'n) then li(p) E 8·(R.;) for j ~ n. There are at most m choices for Sn. (The reader can check that for a hyperbolic toral automorphism on y2, there are at most 4 choices.) Since the transitions of interiors are unique, a choice for Sn determines the choices of Sj for j ~ n. Similarly if J"' (p) E 8 U(R. n , ) then a choice for Sn' determines the choices of Sj for j $ n'. Combining, there are at most m 2 choices as claimed. 0 . ,'oill, let A2
Example 5.4. As a second example of a hyperboh
(~
!). As we noted above, if A = (!
~),
then A2
=
=
A2. The rectangles Rl
and R2 from Example 5.2 are still rectangles for this matrix. However the image of RI
7.5.1 MARKOV PAIITITIONS FOR HYPERBOLIC TORAL AUTOMORPHISMS
287
by I A. crosses RI twice. This partition satisfies conditions (i)-(iii) and has A2 as an adjacency matrix. If we want to get a transition matrix with only O's and l's, we must subdivide the rectangles (split symbols) by taking components of RI n I A. (R.): let the rectangle
Ria = comp (11'(0), cl(int(R.) n IA.(int(R.»» = 1I'(RI
n LA. (R.»
where LA. is the map on R2, and
These rectangles can also be formed by extending the unstable manifold of the origin until it intersects the stable line segment [0, bl. at the point e = I(c). See Figures 5.3 and 5.1. The reader can check that
IA.(Rla ) IA.(RlI,) IA.(R 2 ) Thus the transition matrix is
crosses crosses crosses
B=(i ~
Ria, Rib and R2, Ria, Rib and R 2 , Rib and R2.
D·
This transition matrix has characteristic polynomial p(~) = _~(~2 - 3~ + I), and eigenvalues 0, (~ .. )2, and (~.)2 where ~ .. , and ~. are the eigenvalues of A. Thus the eigenvalues of B are those of A2 together with O. We do not prove it, but the eigenvalues of the transition matrix are always the eigenvalues of the original matrix A together with possibly 0 and/or roots of unity. See Snavely (1990).
FIGURE 5.3. Markov Partition for Example 5.4
288
VII. EXAMPLES OF HYPERBOLIC SETS AND ATTRACTORS
7.5.2 The Zeta Function for Hyperbolic Toral Automorphisms As we mentioned in Section 3.3, the zeta function for a map
f is defined by
with N} = #(Fix(fi». In that section, we prove that if (J A is the subshift of finite type for the matrix A, then (t) = Idet(l- tA)]-1 which is a rational function of t. The zeta function has been proved to be a rational function for many more diffeomorphisms. In this subsection. we prove this is true for a toral Anosov diffeomorphism. The fact that the zeta function is rational means that the number of all the periodic points of all periods can be determined by the finite number of invariants given by the coefficients of the rational function.
<"A
Theorem 5.4. (a) Assume that f : T" --+ T" is an Anosc)V diffeomorphism. Then the zeta function of f is rational. ( b) Assume that fA : 11'2 --+ 11'2 is a hyperbolic toral automorphism for the matrix A, and Au the the unstable eigenvalue. Then
(1 - t)2 det(l- tAl (1 + t)2 det(I + tAl
(I - t 2 ) det(l- tAl (1 - t 2 ) det(l + tAl
if Au > 0 and det{A) > 0, if Au < 0 and det{A) > 0, if Au > 0 and det(A) < 0, and if Au < 0 and det(A) < O.
REMARK 5.11. For the proof to be valid f could be any Anosov diffeomorphism which induces the map A on the first homology group, HI (11'2, JR) = JR ffi JR. REMARK 5.12. There are two types of proof of the rationality of the zeta function. Manning (1971) gave a proof using Markov partitions. Also see Bowen (1978a). Manning proved that the zeta function is a rational function whenever the chain recurrent set has a hyperbolic structure. (Actually he uses the slightly weaker hypothesis that the nonwandering set is hyperbolic and is equal to the closure of the periodic points.) We discuss this type of proof in the exercises. See Exercise 7.30. In this section, we use the proof using the Lefschetz index and the Lefschetz Fixed Point Theorem. This proof goes back to Smale {1967} in the case of a toral Anosov diffeomorphism. The zeta function for a diffeomorphism on a compact manifold was later proved to be a rational function using this type of proof by Williams (1968) for an attractor and by Guckenheimer (1970) whenever the chain recurrent set has a hyperbolic structure. Finally, Fried (1987) proved that the zeta function is rational whenever the diffeomorphism is expansive on the chain recurrent set. For more discussion of zeta functions, see Franks (1982).
Before starting the proof, we need to define the· Lefschetz number of a fixed point and state the Lefschetz Fixed Point Theorem. We only give these definitions in the case when no periodic point has an eigenvalue which is a root of unity.
7.5.2 TilE ZETA FUNCTION FOR HYPERBOLIC TORAL AUTOMORPHISMS
289
Definition. Let 1 : M -- M be a C l diffeomorphism on a compact manifold M of dimension n. If p is a fixed point, the Lelschetz index 01 p, Ip(f), is defined to be the sign( det(I - Dip» (where the derivative and determinant are calculated using some local coordinates). If p is a hyperbolic fixed point, then an easy analysis shows that
where u = dim(E~), and A = 1 provided D/pIE~ -- E~ preserves orientation and A = -1 provided this linear map reverses orientation. (See Exercise 7.29.) Let I.k be the induced map on the k·th homology group, Hk(M, R). The Lelschetz number ollis defined as follows: n
= ~)-I)ktr(f.k)
L(f)
Ic=O
where n is the total dimension of M. Theorem 5.5 (Lefschetz). Let manifold M. Then
I:
L(f)
M -- M be a C l diffeomorphism on a compact
L
=
Ip(f).
pEFix(f)
See Dold (1972).
To prove the main theorem using the Lefschetz number, we need make a connection between the Lefschetz numbers of the iterates of 1 and the induced map on homology. We start by defining a zeta function using the Lefschetz numbers of the iterates of I which can be related to the induced map on homology by means of the Lefschetz Fixed Point Theorem. Definition. Let I : M -- M be a map for which none of the periodic points have eigenvalues which are roots of unity. The homology zeta junction, Z,(t), is defined 88 follows:
where K j = L(fi). We first relate the homology zeta function with the regular zeta function. Proposition 5.6. Let I : T" -- 1'" be a C I Anosov toral diffeomorphism and u If D/IE" preserves orientation, then
dim(E~).
If DIllE" reverses orientation, then
REMARK 5.13. For a Anosov diffeomorphism on
u {p}
pETn
1"', the bundle
x E~
=
290
VII. EXAMPLES OF HYPERBOLIC SETS AND ATIRACTORS
is oriented, and the derivative
either preserves orientation for all points p or reverses orientation for all points p. Therefore the statement of the proposition takes care of all cases. This is the only use we make of the manifold being a torus except for the case of the two dimension hyperbolic toral automorphism. PROOF. Remember that K j = L(fj) and Nj = # Fix(fj). If D/IEu preserves orientation, then the indices of all the periodic points are (_l)U, and K j = (-l)UNj or N j = (-l)UKj . Therefore,
Similarly, if D !lEU reverses the orientation, then D P lEU reverses the orientation for j odd and preserves the orientation for j even. Therefore the index of any fixed point of Ii is (-l}U(-lF, N j = (-l)U(-l)jKj, and
o By the above proposition, to prove the theorem it is enough to prove that the homology zeta function is rational. The next proposition relates the homology zeta function with the linear maps on the homology groups which are induced by I. Proposition 5.1. Let I : M -+ M be a C 1 diffeomorphism on a n dimensional manifold M. for which none of the periodic points have eigenvalues which are roots of unity. Then Zf(t) =
fI
det(I - tl.k)<-t)"+I.
10=0
The proof of the proposition uses the following lemma Lemma 5.B. For an arbitrary matrix B (which we take as some I.k below).
7.5.2 THE ZETA FUNCTION FOR HYPERBOLIC TORAL AUTOMORPHISMS PROOF. The infinite series - log(l - tB), so
'£';.1 Biti Ii
281
is the formal power series for the function
ti . (L "'7BJ) = (/ - tB)-I. 00
exp
j=1
J
Applying Liouville's Formula, we get
L ti Bj)) = det ( exp ( L ti Bi)) 00
exp ( tr (
00
"'7
i=1
"'7
J
i=1 J
= det ((l- tB)-I) = (det(l- tB)) -1.
o PROOF OF PROPOSITION 5.7. By the Lefschetz Fixed Point Theorem, n
Kj
= LUi) = L(-I)"tr(f!,,).
"-0 Substituting this expression for K j in the definition ofthe homology zeta function Z,(t) as follows:
n
=
II det(l- tf!,Y-I)"+J, 10:=0
o
where the last equality uses Lemma 5.8.
PROOF OF THEOREM 5.4. Since the homology zeta function is rational by Proposition 5.7, the usual zeta function is rational by Proposition 5.6. For a hyperbolic toral automorphism induced by A on T2, f.o = 1, f.l = A, and f.2 = sign(det(A)) . 1. Therefore if det(A) > 0 then Z ( ) = det(l- tA) , t (1 _ t)2
by Proposition 5.7. Similarly if if det(A) < 0 then Z,(t) =
(~e~(:)0 t:~) det(l- tA) 1 - t2
Since u = dim(E~) = 1, using the relationship between the homology zeta function and the usual zeta function givpn in Proposition 5.6, we get the four cases as stated in the theorem. 0
292
VII.
EXAMPLES OF HYPERBOLIC SETS AND ATTRACTORS
7.6 Attractors There are various definitions of an attractor. The main difference involves which points in a neighborhood of the attractor have to approach the set. We demand that this holds for a whole neighborhood. There is also the question of whether the dynamics are "indecomposahle" on the attractor itself. We demand that the map (or flow) is chain transitive on the attractor but this requirement changes from author to author quite drastically. We write the definitions for a diffeomorphism I : M -+ M but only slight changes are needed to apply for a flow. Definitions. A compact region N c M is called a trapping region for I provided I(N) C int(N). A set A is called an attracting set (or an attractor by the terminology of Conley) provided there is a trapping region N such that A = nk>olk(N). A set A is called an attractor provided it is an attracting set and fill. is chain transitive, so A C R(f). (Sometimes we might want to assume that 1111. is topologically transitive.) An invariant set A is called a chaotic attractor provided it is an attrador and I has sensitive dependence on initial conditions on A. (Sometimes people require I to have a positive Liapunov exponent on A instead of sensitive dependence. See Section 8.2 for the definition of Liapunov exponents for a map in several dimensions. Compare with the discussion on chaos in Chapter III.) Finally, an attractor with a hyperbolic structure is called a hyperbolic attractor. REMARK 6.1. Some authors do not require A attracts a whole neighborhood but only it attracts a set of positive measure in order to call it an attractor, however this leads to a very different concept which I might call the core 01 the limit set. See Milnor (1985) for further discussion along these lines. Also See Section 9.1 for a discussion of Conley's theory.
It can easily be checked that an attracting set is both negatively and positively invariant and closed. (These properties hold because it is the intersection of the nested sets Ik(N).) It is useful to give a definition of an attracting set which is given more in terms of properties of the invariant set A, and not determined by intersections of a trapping set. The following example shows that for A to be an attracting set, it is not enough that there is one neighborhood V of A such that w(p) C A for all p E V. Also see the example of Vinograd given in Example V.5.3. Example 6.1. Consider the equations
iJ
=
r=
e2 (211" - ef r(1 - r2)
where e is an angular variable modulo 211". See Figure 6.1 for the phase portrait. In these polar coordinates, let B = {(O,I)} and V = {(e, r) : Ir - 11 < fl. Then, V is a trapping set which is a neighborhood of Band w(p) = B for all p E V. However, the attracting set and attractor for V is the circle A = {( e, 1) : 0 ~ e ~ 211"}. (This is an example where the chain recurrent set is larger than the limit set.) (The set B is an attractor in the sense of Milnor.) Taking into consideration the above example, a compact attracting set can be characterized by w-limit sets of points in arbitrarily small neighborhoods as given in the following proposition.
7.6 A'ITRACTORS
FIGURE
293
6.1. Phase Portrait of Example 6.1
Proposition 6.1. Let A be a compact invariant set in a finite dimensional manifold. Then A is an attracting set if and only if there are arbitrarily small neighborhoods V of A such that (i) V is positively invariant and {ii} w(p} C A for all p E V. We leave the proof of this result to the exercises. (See Exercise 7.31.) We also have the following result about the unstable manifolds for points in an attracting sets. Theorem 6.2. Let A be an attracting set for /. Assume either that (i) pEA is a hyperbolic periodic point or (ii) A has a hyperbolic structure and pEA. Then WU(p,f) C A. PROOF. The set A is contained in the interior of a trapping region N so there is an > 0 such that W.u(fk(p» C N for all k E Z. Therefore for any k ~ 0,
f
i~O
Taking the intersection for k ~ 0 of /k(N}, we get that WU(p}
c
A.
o
In the next few sections, we give examples of attractors which are locally the cross product of the unstable manifold by a Cantor set. It is often useful to talk about the dimension of an attractor or other hyperbolic invariant set. Since such sets are often not manifolds we need another concept. What we use is the topological dimension, which is always an integer. In other contexts, the fractal or Hausdorff dimension is useful, which is a non-negative real number. We postpone discussion of this concept until Section 8.4. Definition. The definition of topological dimension is given inductively. A set A has topological dimension zero provided for each point pEA, there is an arbitrarily small neighborhood U of p such that 8(U} n A = 0. (It is not always possible to take U as a ball as the example of Antoine's necklace shows.) Then inductively, a set A is said to have dimension n > 0 provided for each point pEA, there is an arbitrarily small neighborhood U of p such that 8(U}nA has dimension n -1. See Hurewicz and Wallman (1941) or Edgar (1990) for a more complete discussion of topological dimension. Definition. Using the concept of topological dimension, we say that a hyperbolic attractor A is an expanding attractor provided the topological dimension of A is equal to the dimension of the unstable splitting. (Since WU(p) C A, we always have that the topological dimension is greater than or equal to the dimension of the unstable splitting.} See Williams (1967, 1974).
294
VII. EXAMPLES OF HYPERBOLIC SETS AND A'ITRACTORS
7.7 The Solenoid Attractor In this section, we introduce an example of a hyperbolic attractor which can be given by a specific map, the solenoid. The solenoid was known to people studying topology, see Hocking and Young (1961), but was introduced as an example in Dynamical Systems by Smale (1967). Using this example as a model, R. Williams developed the theory of one dimensional attractors, of which we see further examples in the sections on the Plykin attractors and the DA-attractor. See Williams (1967, 1974). Let D2 = {z E C : Izl ::; I}. We think of D2 as a subset of R2 even though we use complex notation for a point (and complex multiplication). Let Sl = {t E R modulo I} be the circle. The neighborhood of the attractor is the solid torus given by N = Sl X D2. We define the embedding I of N into itself by means of a map 9 on SI, 9 : SI -+ SI, given by g(t) = 2t mod 1. (The circle can also be thought of as a subset of C. Using complex notation, g(z) = z2 for Izl = 1.) The map 9 is called the doubling map (or squaring map). Using g, the embedding I : N -+ N (into N, not onto) is defined by
I(t, z)
= (g(t),
1 2 I' 41 z + 2e .. ').
(Below, we callI a diffeomorphism even though it is not onto N.) The constants 1/4 and 1/2 in the definition are somewhat arbitrary as long as the first is small enough to make lone to one and the combination ofthe two insures that I(N) C N: 1/2 -1/4 > 0 and 1/2 + 1/4 < 1. Geometrically, the map can be described as stretching the solid torus out to be twice as long in the SI direction and wrapping it twice around the SI direction. The image is thinner across in the D2 direction by a factor of 1/4. See Figure 7.1.
FIGURE 7.1.
Image of Neighborhood N inside of N
Let D(t) = {t} x D2
be the fiber with fixed 'angle' t. The map I is a bundle map which takes a fiber D(t) into the fiber D(2t) and is a contraction by a factor of 1/4 on each such fiber. We also use the notation D([t),t2]) = U{D(t) : t E [tl,t2]}'
Theorem 1.1. Let A = n~o Ik(N). Then A is a hyperbolic expanding attractor for I of topological dimension one, called the solenoid. We give various other properties of the solenoid throughout the section as we prove the theorem and further analyze this example. First note that N is a trapping region for I, so A is an attracting set. We start with the following proposition.
7.7
THE SOLENOID ATTRACTOR
295
•• FIGURE 7.2.
f(N) n D(to) for to = 0 and 0 < to < 1/2
Proposition 1.2. For each fixed to, An D(to) is a Cantor set. PROOF. If f(t, z) E D(to), then g(t) = to mod 1. so t is to/2 or to/2 + 1/2. The image f(N) n D(to) is shown in Figure 7.2 for to = 0 /Lad 0 < to < 1/2. Notice that
f(D(~))=(to,~D2+{~e"IOi}) f(D(t o ; 1)) = (to,
~D2
_
and
{~e"loi})
since e"'oi+'" = -e"lo'. Thus the two images are in the same fiber, but are reflections of each other through the origin in D(to). They are disjoint because 1/2 - 1/4 > O. Both images are in D(to) because 1/2 + 1/4 < 1, so f(N) c N. Now let
n fi(N) k
Nk ==
= fk(N).
i=O
Claim 1.3. For all t E Sl, the set Nk nD(t) is the union of2 k disks of radius (1/4)k. PROOF. The claim is trivially true for k = 0, and for k = I, it follows from the image of N as discussed above. See Figures 7.1 and 7.2. Next, Nk
n D(t) =
f(Nk- 1 n D(t/2»)
u f(Nk- 1 n D(t/2 + 1/2»).
By induction, N k- J nD(t/2) and N k- I nD(t/2+1/2) are each the unionof2 k - 1 disks of radius (1/4)k-l. It follows from the fact that f is a contraction on fibers by a factor of 1/4 that f(Nk- 1 n D(t/2» and f(Nk- 1 n D(t/2 + 1/2)) are each the union of 2k - 1 disks of radius (1/4)k. Together, they are the union of 2k disks of the stated radius. 0 Now A = n';oJi(N) = n;'oNj, so An D(to) is a Cantor set as in the earlier example of a horseshoe. This proves Proposition 7.2. 0 Next, we give several other topological properties of A. Proposition 1.4. The set A has the following properties: (a) A is connected, (b) A is not locally connected, (c) A is not path connected, and (d) the topological dimension of A is one. PROOF. (a) The sets Nk are compact, connected, and nested so their intersection A is connected. (See Exercise 5.ll.)
296
VII. EXAMPLES OF HYPERBOLIC SETS AND ATTRACTORS
(b) For 0 < t2 - t\ < 1, D([tl' t2]) n Nle is the union of 21e twisted tubes. For any neighborhood U of a point p in A, there is a choice of tl and t2 and large k such that U contains two of these tubes. Since each tube contains some points of A, this shows that A is not locally connected. (c) Fix p = (to, zo) E A. By induction, for k ~ 1, there is a point qle E An D(to) such that (i) for k ~ 2, qle is in the same component of N Ie - 1 n D(to) as qle-I, and (ii) any path from p to qle in N" must go around SI at least 2" -I times. (This specifies the component of N" n D(to) in which qle lies.) By construction, the sequence {q" }f-l is Cauchy. Let q be the limit point of the qle, which is a point of A since A is closed. We claim that there is no continuous path in A from p to q. This limit point q is in the same component of N" n D(to) as q", and any path from p to q in N/c must intersect D(to) at least 2"-1 times, going around SI between each of these intersections. Thus, if there were a continuous path in A from p to q it would have to interest D(to) infinitely many times (at least 21e - 1 times for arbitrarily large k), going around SI between each of these intersections. A continuous path can not go around SI an infinite number of times, so this contradicts the assumption that there is a continuous path in A from p to q, and proves that A is not path connected. (d) In each fiber An D(to) is totally disconnected, so has topological dimension zero. In a segment of N, An D([tl' t21> is homeomorphic to the product of An D(t.) and the interval [tl' t2l, so has topological dimension one. This completes the proof of the proposition. 0 Next, we state several properties of Ion A which together imply Theorem 7.1. Proposition 7.5. The map
I
restricted to A has the following properties:
(a) the periodic points of I are dense in A, (b) I is topologically transitive on A, and (c) I has a hyperbolic structure on A. PROOF. As a first step in the proof of part (a), we give the following lemma about g.
Lemma 7.6. The periodic points of 9 are dense in SI. PROOF. The point to is fixed by gle, gle(to) = to, if and only if 2kto = to + j for some integer j. Thus to = j/(2/c - 1). For k fixed, let tie.; = j/(2/c - 1) for 0 ~ j ~ 2k - 2. These points are evenly spaced on the circle with separation 1/(2k - 1). As k goes to infinity, it follows that the set of all periodic points is dense. 0
Now if gk(to) = to, 1"(D(to» c D(to). The set D(to) is a disk, and Ik takes it into itself with a contraction factor of 4-", so lie has a fixed point in D(to). By the lemma, it follows that the fibers with a periodic point for I are dense in the set of all fibers. We want to show the periodic points are actually dense in A. Take a point pEA and a neighborhood U of p. There is a choice of k and tl, t2 such that the tube, Ik(D([tl,t2]) cU. We showed above that I has a periodic point in D([tl,t2]) and so in (D( [t I, t2]) cU. This proves the first part ofthe proposition. (b) We need to verify on A the hypothesis of the Birkhoff Transitivity Theorem, Theorem 2.1. Let U and V be two open subsets of A. Thus there are open set in SI x D2, U ' and V', such that U ' n 11.= U and V' n 11.= V. There exist kEN, 0 < t2 - tl < 1, and 0 < t; - tl < 1 such that lle(D([tl, t2]) C U' and lle(D([t1,t;])) c V'. Then there is a j > 0 such that Ji(D([tl,t2]) n D([tl,t;D ,,0. In fact, since we can take j such that Ij(D([tl,t2])) goes all the way across D([tl,t~]), we can require that
r
7.7 THE SOLENOID ATTRACTOR
297
Thus
Ij(fk(D([tl' t2]» n A) n Ik(D([t;, t~])) n A # 0 P(U) n V = li(U' n A) n [V' n AJ
# 0.
By the Birkhoff Transitivity Theorem, Theorem 2.1, proved part (b) of the proposition. (c) In terms of the coordinates on Sl x D'l,
IIA is transitive, and
where 12 is the identity matrix on C or R2. Let E~ = {O}
DIp
(~)
=
(lOv ),
and
X
we have
R2. Then for (0, v) E E~,
and
DI:(~)=(tv) which goes to zero as k goes to infinity. Therefore, this is indeed the stable bundle at each point pEA. To find E~, it is necessary to use cones. Let
Step 1.
DlpC: C C'J(p)'
PROOF.
Let
(~~) E C;;, and
Then,
Iv~1 = 21vlI = ~(4Ivll) 1
1
1
1
> 2(7r1vd + 21v1 1) ~
2(7r1vd + 41v2 1)
~ ~lv~l. This last inequality shows that
o
(~~)
E Cj(p)' completing the proof of the first step.
VII. EXAMPLES OF HYPERBOLIC SETS AND ATTRAGrORS
298
n Df7-.(p) 00
Step 2. The intersection
C;_.(p)
= IE: is a line in the tangent space.
k-O
PROOF. By Step 1, the finite intersections k
n
Df;-J(p) CJ-;(p)
= Df7-·(p) CJ-.(p)
j-=O
are nested. To prove that the intersection is a line, we prove that the angle between two vectors in these finite intersections goes to zero as k goes to infinity. Let VI) ' (WI) ( V2 W2
with
Vt. WI
" E C,_.(p)
> 0,
(:D =
D fj
•(p)
:~),
(
and
Then
Iv~
_ w~
VI
WI
I= l7rie
2 .. li vl
(:D
(:~) .
+ i V 2_ 7rie2"li wl + iW21
2vI
=
= Dfj-.(p)
2wI
!IV2 _ w21,
8 VI WI which shows there is a contraction on the difference of the slopes. By induction on k, 1 21, IV2 _ W21=(_)kIV2_ v~ w~ 8 VI WI k
k
W
which goes to zero as k goes to infinity. Since the difference of slopes goes to zero, the cones converge to a line. 0 Step 3. The derillBtive of f restricted to IE~, PROOF. Since
IE~
DfpllE~,
is a graph over TISI x {OJ, let
is an expansion.
I ( :~) I.
=
Ivd.
This is a norm on
the cone. Then,
Thus D fp is an expansion on vectors in this bundle in terms of this norm, and hence the standard norm. This completes the proof of Step 3, part (c) of the proposition, and the proposition. 0 REMARK 7.1. Given the bundle of vectors which expand and contract, it is easy to see that W"(p) :) {t} X D2 if p = (t,z). The unstable manifold, W"(p), winds around through 1\.. Each W"(p) is an immersed line. (Since f is one to one and an expansion on W"(p), it can not be a circle which is the only other one dimensional possibility.) The unstable manifold W" (p) hits D( t) in a countable number of points. Since I\. n D( t) is uncountable, there are many other points in I\.nD(t) which are not in WU(p). These are points q for which there is no curve in I\. from p to q. REMARK 7.2. As an exercise, we ask the reader to construct Markov partitions for and the doubling map g. See Exercises 7.28 and 7.29.
f
7.7.1 CONJUGACY OF THE SOLENOID TO AN INVERSE LIMIT
299
7.7.1 Conjugacy of the Solenoid to an Inverse Limit Williams introd uced the idea of representing certain attractors (expanding attractors) as inverse limits. See Williams (1967, 1974). As before N is the natural numbers, {O, 1, 2, 3, ... }. Let g( t) = 2t mod 1 as before. Let E = {s E (SI)N : g(Bj+d = Bj}. Define the shift map, a, on E by a(s) = t if tj
=
{
Bj_1 g(BO)
ifj~l
if j
= O.
If SEE, then g(Bj+1) = 8j so Bj+1 E g-I(8;) is one of the two preimages of Bj. The pair (E, a) is called the inverse limit 01 g. A point p = (t, z) E A is determined by a sequence of descending disks in D(t) as we saw above. These disks are in turn determined by the preimages of t by the map g.
When a map is expanding (like the one dimensional horseshoe) we use forward images of a point p to determine p. Because 1 contracts on fibers, we use backward images. Define h: A -+ (SI)N by h(p) = s where rj(p) E D(sj) with Bj E 8 1 for j = 0,1, .... Theorem 7.7. The map h defined above is a conjugacy from limit of g, a on E.
1 on A to
the inverse
PROOF.
Step 1.
heAl c
E.
Let h(p) = s. Then rj(p) E D(sj) and r j - l (p) E D(sj+1)' Therefore the intersection I(D(sj+d) n D(s;) -F 0, so f(D(Bj+1)) C D(sj). Thus g(s;+1) = s; for all j and sEE. 0 PROOF.
Step 2. h 0 f = a 0 h. PROOF. For pEA, let h(p) = s and h(f(p» = t. f-(j+1l(f(p» = f-;(p) and so is in both D(tj+d and D(Bj). Therefore tj+1 = Sj for all j ~ O. Similarly, f(p) is in both D(to) and f(D(so» so to = g(Bo). This proves that a(s) = t as required. 0
Step 3. The map h is one to one. PROOF. If h(p) = h(q) = s, then p,q E n~_ofj(D(sj)). This is a nested sequence of disks whose radii go to zero. Therefore there is only one point in the intersection and
P=~
0
Step 4. The map h is onto E. PROOF.
Take 8 E E. Then g(sj+l) = Sj so f(D(sj+1» C D(sj), and
fj(D(sj»
c f j - I (D(s;-I» c ... C D(so).
Therefore n~=oJi(D(sj)) is a nested sequence of disks with nonempty intersection, hence 00
n 1;(D(s;» -F 0.
j-O
If p is a point in this intersection, then h(p) = s. This completes the proof of the fourth
step and the theorem.
0
300
VII. EXAMPLES OF HYPERBOLIC SETS AND ATTRACTORS
7.8 The DA Attractor The next example of an attractor we consider is constructed by modifying a toral Anosov diffeomorphism on the two dimensional torus. For this reason it is called the Derived-from-Anosov-diffeomorphismor the DA-diffeomorphism. It was first introduced by Smale (1967). Let 9 : y2
-+
T2 be the Anosov diffeomorphism induced by the linear matrix
(~ ~ ).
(Any other example with a fixed point would work as well.) Let Po be a fixed point of corresponding to 0 in R2. Let v" and v' be the unstable and stable eigenvectors of the matrix and use coordinates UIV" + U2V' in a (relatively small) neighborhood, U, of Po· Let TO > 0 be small enough so the ball of radius TO about Po is contained in U. Let 6(x) be a bump function of a single variable such that 0 ~ 6(x) ~ 1 for all x, and g
0 for x ~ TO 6(x} = { 1 for x ~ To/2. Consider the differential equations UI = 0 U2 = u26(1(UI' U2)J).
Let cpt be the flow of these differential equations, CPt(UI,U2) = (UI,CP~(UI,U2»' Then the support, SUpp(cpl - id} c U. Also the derivative of the flow at Po in terms of the (uI,u2)-coordinates is
Define' = cpT 0 9 for a fixed T > 0 such that eTA. > 1 where A. is the stable eigenvalue. The map' is called the DA-diffeomorphism. Note that in the (Ulo u2)-coordinates the derivative of , at Po is
so Po is a source. Theorem 8.1. The DA-dilfeomorphism , described above has nu) = {Po} U A where Po is a fixed point source and A is an expanding attractor of topological dimension one. The map' is transitive on A and the periodic points are dense in A. PROOF. Because the neighborhood U can be taken arbitrarily small, , can be made arbitrarily CO near g, but not C l near 9 since eT can not be arbitrarily small. Also note that the flow cpl preserves each stable manifold of a point for g, W'(q, g), because of the form of the differential equations. Therefore, , preserves each W' (q, g). The new map' has three fixed points on W'(pO,g), Po and two new fixed points PI and P2. This fact can be seen to be true because '(Po) = Po is a source and outside U the slope of the graph of , on W' (q, g) is still less than one. Therefore there must be a fixed point on each side of Po along W·(q,g). See Figure 8.1. We claim that both
PI and P2 are saddles. To see this, note that in U, D/q =
(all 0), with all a:n
a22
= Au
7.8 THE DA ATTRACTOR
•
u
301
..
FIGURE 8.1. Graph of IIW'(Po,g)
FIGURE 8.2. Image of Open Set V for all q, and with 0 < a22 < 1 at PI and P2 because of the nature of the graph of IIW'(Po,g) indicated in Figure 8.1. Let V be a neighborhood of Po (not containing PI and P2) contained in U such that (i) a22 > 1 for q E V (f is an expansion along E' in V), (ii) 0 < a22 < 1 for q ~ I(V) (f is a contraction along E' outside of V), and (iii) I{v) ::> V. See Figure 8.2. (We leave as an exercisp. the existence of such a neighborhood V. See Exercise 7.36.) Clearly, V C WU(Po,J) so it is the local unstable manifold of Po and WU(Po,J) == ~o Ji(V). Let N == '1'2 \ V. Then N is a trapping region because I{v) ::> V. Let A = ni_OP(N). This is an attracting set, and A = T2 \ WU(Po,f). The unstable manifold of Po for I, WU(po, f), is a "thickened" version of the unstable manifold for g. In Step 4 below, we prove that WU(Po,f) is still dense in T2, so A has empty interior. We proceed to prove Theorem 8.1 through a series of steps. Step 1. The map
I has a hyperbolic structure on A.
PROOF. In terms of the splitting E~(g) e E~(g), the derivative of I, Dlq = (aji), is lower triangular in U and diagonal outside U (a12 = 0 everywhere and a21 = 0 outside U). The unstable term all == Au > 1 everywhere and 0 < a22 < 1 outside fey) so on A. Because of the form of the derivative, E~(f) = E~(g) is an invariant bundle and every vector in this bundle is contracted by Dlq for q EA. Therefore this is the stable bundle on A. Let C be a bound on la:l1l everywhere, define L == C(~u - ~.)-l and take the cones = {(Vb V2) E E~(g) e E~(g) : IV21 ~ Llvll}·
c::
Then It can be checked using the lower triangular nature of the derivative of I that the
302
VII. EXAMPLES OF HYPERBOLIC SETS AND ATTRACTORS
cones are invariant and
00
E~(J)
=
n Dfl-;(qPJ-J(q)
j=o
is an invariant bundle on which the derivative is an expansion for points q E A. Thus this gives the unstable bundle on A and so the hyperbolic splitting. 0 Step 2. For j = 1,2, PI,P2 E A and W"(Pj,f) C A. PROOF.
The fact that PI, P2 E A follows because PI, P2 ¢ V and they are fixed points,
so
00
PI,P2 ¢
UP(V) =
11'2 \
A.
j=O
0
The fact about the unstable manifolds we proved for general attracting sets. Step 3. The stable manifolds of f satisfy the following:
W'(PII f) U W'(P2, f) = W'(PO,g) \ {Po} and W'(q, f) = W'(q,g) forq ¢ W'(Po,g). Thus W'(q, f) is dense in 11'2 for all q E A. PROOF.
f(W'(q,g» = W'(J(q),g) and W'(q,g) is tangent to R'(J). Also for q E A,
I is a contraction on WI:"'(q, g). Therefore W1:"'(q, f) = W1:"'(q, g), and W'(q, f) c W'(q,g). A line segment I in W'(q,g) which does not end in V (goes all the way across V if it intersects it) is lengthened by I-I. Any line segment whose end stays in V for all inverse images must be a subset of W'(pO,g) and have an end in ",,~c(PO,g). Thus, if q ¢ W'(Po.g) then W'(q,f) = W'(q,g). Also, this implies W'(Pj'/) is one component of W'(Po, g) \ {Po} for j = 1,2. The fact that the W'(q, f) are dense in 11'2 follows because these are lines with irrational slope. 0 Step 4. The unstable manifold of I at Po, W"(po, f), is an open dense set in
11'2.
PROOF. By construction, 11'2 = A u U~o Ii (V), so we need only prove that W" (Po, f) accumulates on A. Let pEA, and Zp be an arbitrarily small neighborhood of P in 11'2. Let I = compp(W'(p, f) n Zp). As long as l-j(I) does not intersect I(V) it is lengthened by I-I by a uniform amount. But there is a uniform bound on the length of comp:a[W'(z, g) \ I(V)]. Therefore for large j,
l-i(I) n I(v) # 0, rj(Zp) n I(v) Zp n IHI(V)
# 0,
# 0, # 0.
and
Zp n W"(Po, f)
Since Zp is an arbitrarily small neighborhood, it follows that W"(Po, f) is dense at p.
o
Step 5. For j = 1.2, W"(Pj'/) is dense in A. PROOF. Let P E AnW'(q, g) where q has period k for g. By Step 4, P E c1(W"(po'/»\ W"(Po. f), so P E 8(W"(po,/». In the proof of Step 4, when rjk(I) intersects V, it must cross W"(PI, f) U W"(P2, f). (It crosses from one side to the other.) Therefore,
[W"(PI. f) [W"(PI,f)
U U
W"(P2, f)] n rik(Zp) W"(P2.f)] n Zp # 0
#0
and
7.8.1 THE BRANCHED MANIFOLD
303
because the unstable manifolds are invariant by f. This shows that the union of the two unstable manifolds WU(Pt, f) U W U(P2, f) is dense in A. We have shown that the union of the two manifolds is dense in A, and we need to show that each manifold is dense by itself. W"(Pt, f) is dense in T2 so it must intersect WU(P2, f). Because these are tangent to the bundles E" and E" the intersections (which are on A) are transverse. By the Inclination Lemma, it follows that W" (1'2, f) accumulates on WU(PI, J), cl(W"(P2,f) ::> W"(PI, f), and cl(W"(Pl, J) = A. SimUarly, cl(WU(PI,J) = A. 0 Step 6. The topological dimension of A is one. PROOF. By Step 4, WU(Po, f) is dense in T2, so A has empty interion and must have topological dimension at most one. The manifolds WU(Pj, f) for j = 1,2 are contained 0 in A, so it must have topological dimension at least one.
Step 1. For j = 1, 2, {q E A : q is
8
transverse homoclinic point for Pj} is dense in A.
If x E A, Steps 3 and 5 imply that both WU(Pj, f) and W'(Pj, f) come arbitrarily near x for j equal either 1 or 2. The existence of a hyperbolic structure in Step 1, implies that WU(Pj,J) and W"(Pj,f) intersect transversally arbitrarily near x for j equal either 1 or 2. 0 PROOF.
Step 8. The set A is transitive. PROOF.
o
This follows from Step 7 and the Birkhoff Transitivity Theorem, Theorem 2.1.
Step 9. The periodic points of f are dense in A. PROOF.
points.
This follows from Step 7 and the horseshoe theorem for transverse homoclinic 0
Together, all these steps prove the theorem.
o
7.S.1 The Branched Manifold A Markov partition for the Anosov automorphism 9 is given in Figure 8.3. The map f pushes outward from Po in the stable direction. If we form equivalence classes of points in comp.(W"(z,f) \ V) and collapse these to points we get the branched manifold, K, indicated in Figure 8.4. This quotient space has the differential structure of a one dimensional manifold except there are branch points. The fact that there is a ct structure on the quotient space is reflected in the picture by the fact that the three curves coming into a brancli point all have the same tangent line. There is a map defined on the quotient space (the branched manifold), g. : K -+ K. See Williams (1967) for the definition of a one dimensional branched manifold or Williams (1974) for the definition in any dimension. This map is an expanding map (because we quotiented out the contracting directions and left the expanding directions), and has the following images: g.(A) = B g.(B) = BCB g.(C) = CAC.
In fact, these line segments A, B, and C can be oriented so the map preserves the orientation. This map takes the role of the doubling map for the solenoid. It can be proved that f on A is topologically conjugate to the inverse limit of g. on K. See Williams (1970a). We leave the details to the reader and references.
304
VII. EXAMPLES OF HYPERBOLIC SETS AND ATTRACTORS
FIGURE 8.3.
Markov Partition for DA-Diffeomorphism
FIGURE 8.4. Branched Manifold for the DA-Diffeomorphism
7.9 Plykin Attractors in the Plane Let A be a hyperbolic attractor in the plane with trapping region N. Thus N must be diffeomorphic to a disk with some (or no) holes removed. Plykin (1974) proved that if A is not just a periodic orbit, then N must have at least three holes removed (four holes on the two sphere 8 2 ). Theorem 9.1. Let N C R2 be a trapping region for f. Assume the associated attracting set, A = o fk(N), has a uniform hyperbolic structure for which the expanding bundle has dimension one, dim(E~) = 1 for pEA. (So A is not just the orbit of a periodic sink.) Then N must have at least three holes.
n:.
REMARK 9.1. For the general proof see Plykin (1974). We give a sketch of a proof that N must have at least two holes. Because of this theorem, any nontrivial attractor (not a periodic orbit) in the plane or sphere is called a Plykin attractor.
PROOF. The hyperbolic splitting on A can be extended to a small neighborhood U of A. (This extension is not hard if it is not assumed that the splitting is invariant off A. It is possible to extend it so it is invariant but this is harder and we do not need this property.) For large k, fk(N) C U, so there is a splitting on fk(N). The neighborhood Jk(N) has the same topological type as N, so we can assume that the splitting is on the entire trapping neighborhood N. Assume that the extension of the bundle E; to all points pEN is orientable on N. Then it is possible to take X(p) E E; that is a nonvanishing vector field. Take pEA. ThE.'n thE.' intt'gral curve of p is one side of W"(p), W"(p)+. By the Poincar~ Bendlxson TheorE.'m, W"(p)+ accumulates on a closed orbit..., for X (because w(p, X) has no fixed points since X is nonvanishing). But w(p, X) c A because A is closed. Thus..., c A is a
7.9 PLYKIN ATTRACTORS IN THE PLANE
closed curve which is an unstable manifold. But unstable manifolds can not be closed curves (they are immersed lines). This contradiction shows that the extension E; can not be orientable on N. If N has no holes (and so is a disk), then the extended bundle E" must be orientable on N. The above argument shows that this is impossible, so N must have at least one hole. Next assume that N is a disk with one hole removed (an annular region). By the above argument the extension E" must not be orientable on N. In this case, it is possible to take a double cover N of N on which there is an orientable bundle E" which covers EU on N. Again, N is an annular region. It is also possible to define a map Ion N which covers f. But this leads to a contradiction as above, so N must have at least two holes. As stated above, Plykin has an argument that N can not have just two holes. This can also be proved using the theory of "pseudo-An08Ov diffeomorphisms" of Thurston. We do not give these arguments. 0 Example 9.1. It is possible to describe a geometric model of a map f which has a planar region with three holes, N, as a trapping region. See Figure 9.1. Consider the map f for which the image of N is as indicated in Figure 9.2. This map takes each of the line segments drawn in Figure 9.1 into (subsets of) another one of these line segments. These line segments are pieces of the stable manifolds. The map f stretches in the direction across the line segments. Also f(N) c N. The attracting Bet A= f"(N) has a hyperbolic structure.
n;::.o
...
,, I
If'
I
I'
'
I, ' ' " '.
I ' I'
:: ,: ,/,,' ,i'
:f:/,/ C' ........ . , ...... _.... .. D
;.::- . .
II'" .. .. ,'",,, . ,I I ' \ .......
,'I' \ .. ,'I" ,I", . . .... " I' ' : ~ '. '- '", , ,I
I
I'
I
.
,
\
I
FIGURE 9.1. Neighborhood N If we make equivalence classes of points which are in the sarne components of stable manifolds in N, q'" p if q E compp(W'(p) n N), then we can form the quotient space K = N/ "'. For this example this quotient is Indicated in Figure 9.3. Williams showed that to an expanding attractor there is associated a bmnched manifold, Williams (1970a, 1974). In Section 11.2, we show that the tangent lines, E! to the various TxW'(p), depend in a C· fashion on x. This differentiability can be used to show that the quotient space can be given a smooth structure. There is a map defined on the quotient space, 9 : K --+ K. This map is an expanding map (because we quotiented out the contracting directions). This map takes the role of the doubling map for the solenoid. In the example being discussed, g(C) ::> A, g(A) ::> B, g(B) ::> C, and g(D) ::> C. It can be proved that f on A is topologically conjugate to the inverse limit of 9 on K. See Williams (1967) and (1970a). Also see Barge (1988) for the connection between inverse limits and attractors for diffeomorphisrns in the plane.
306
VII. EXAMPLES OF HYPERBOLIC SETS AND ATTRACTORS
FIGURE 9.2.
Image of N inside of N
c D
FIGURE 9.3. Branched Manifold for Plykin Example REMARK 9.2. Some progress has been made to understand hyperbolic attractors which are not expanding attractors, i.e., hyperbolic attractors for which the topological dimension is greater than the dimension of the unstable manifolds. See Wen {1992}.
7.10. Attractor for the Henon Map Again we consider the Henon map, FA.B(X,y} = (A - By - x2,x). In the earlier section, we showed that for large values of A FA,B has a horseshoe, e.g. B = ±0.3 and A = 5. In this section, we consider smaller values of A for which FA,B has a trapping region. Henon introduced this map and discussed this map for B = -0.3 and A = 1.4 for which there is a trapping region N which is topologically a disk (a region with no holes). See Henon (1976). For the following discussion, let F = F1. 4 .- O.3 • Since it is a trapping region, A = Fk{N} is an attracting set. Numerical iteration indicates that F is topologically transitive on A because the iteration of a (generic) point appears to have a dense orbit in A. By the discussion of Plykin theory in the last section, F can not have a (uniform) hyperbolic structure on A because A has arbitrarily small neighborhoods which have no hole (topologically disks), Fk(N}. It is still possible that there exists a point with a dense orbit in A. This point also might have a positive Liapunov exponent. (That is, there might be some point p and a vector v for which IDF;vl grows at an exponential rate, lim infk_oo(l/k) 10g(IDF;vi) > D.} In spite of the lack of rigorous proof, an attracting set A for any map FA,B in the Henon family such that FA,BIA appears to be transitive is called a Henon attmctor. See Figure 10.1 for A = 1.4 and B = -0.3. (Also see the comments below about the results of Benedicks and Carleson.) Since A is an attracting set, A must contain all the unstable manifolds of hyperbolic periodic points in A. The following proposition shows that A is the closure of the unstable manifold of the fixed point for the Henon map for many B < O.
n:,o
7.10. ATTRACTOR FOR THE HENON MAP
FIGURE 10.1.
307
Henon Attractor
Proposition 10.1. (a) Let 1 : R2 -+ R2 be a diffeomorphism with a fixed point p. Assume there is a bounded region 0 C R2 which is p06itively invariant and 8(0) C L U u L' where LU is contained in a compact piece of WU(p) and L' is contained in a compact piece ofW'(p). Further assume 1 decreases area on 0, i.e., there is 8 0 < p < 1 such that I det(D/x)1 :5 p. Then A = cl(
U f"(L
U
».
n~O
Ifp is in the interior of the £U as a subset ofWU(p) then
A = cl(WU(p}. (b) For the Henon map F"..B there is a set S of values {A, B} with B < 0 for which part (a) applJes. This set includes {1.4, -0.3} as well as values with 1.4 < A < 2 and B small enough. REMARK 10.1. The region 0 is not a trapping region since the part of the boundary in WU{p) is usually in the image I{O}. See Figure 10.2. In the case of the Henon map the region can be enlarged to make it a trapping region U, but then WU{p} is in the interior of U. PROOF. (b) We do not give the proof of part (b) but mere indicate a region which works. A choice of 0 is the shaded region in Figure 10.2 whose boundary is made up of the pieces of WU(p} and W'(p). This region is not a trapping region because the part of the boundary contained in WU(p} is on the boundary of the image of O. By slightly enlarging 0 along the part ofthe boundary contained in WU(p}, it is pOBSible to make a trapping region.
FIGURE 10.2. Invariant Region 0 for the Henon Attractor is Shaded
308
VII. EXAMPLES OF HYPERBOLIC SETS AND ATIRACTORS
(a) Because
f decreases area by p, the absolute value of the stable eigenvalue at IA,I < p. Since L' is contained in a compact piece of W'(p), there
p is a k > 0 for which flc(L') C W:(p). For n ~ k, the diameter of /"(L') is less than 2fpn-lc. Given '1 > 0, there is an nl such that this diameter is less than '1/2 for n ~ nl. Now take q E A. Take '1 > 0 as above. Since the area(fn(O» $ pnarea(O), there is an n2 ~ nl such that D(q, '1/2) rt. /"'(0), D(q, '1/2) n 8(fn·(o» '" 0, and D(q, '1/2) n (fn'(Lu) U /"' (L'» '" 0. Because the diameter of /"'(L') is less than '1/2, d(q,/"'(LU» $ TJ. (The ends of 8(fn·(o» n /"'(L') lie in /"·(LU).) Because TJ is arbitrary, q is in the closure of the union of the forward iterates of LU, and A is the closurE' of U .. >o f"(LU) as claimed. If p is in the interior of the LU, then Un>O f"(LU) = WU(p), flO A is t.he clollurf' of WU(p) lIS claimed. 0 is less than p,
EVf'n though the attracting set for the Henon map is the closure of the unst.ahle manifold, it ill not l)('cesRnrily topologically transitive. In fact for sollie parameter values, we argue below that it contains some periodic sinks. However, when we look more closely at the attracting set A, the invariant set looks like a Cantor set of curves which seem to be the unstahle manifolds of points in A. See Figure 10.3. If we look at the attracting set in a smaller box (at a smaller scale), the set still looks like a Cantor set of curves. However, between the curves which reach all the way across the box there are curves which turn around part way across the box and come out the same edge they entered. These latter curves look like hooks among the other curves which are relatively straight. If all points in A had stable and unstable manifolds, then these hooks in the unstable manifolds would be tangent to the stable manifold of some other point in the attracting set. As the parameter A varies, these hooks move in the attracting set. For many parameter values, it would seem likely that there are homoclinic tangencies (tangencies of the stable and unstable manifolds of the fixed point or some periodic point). At other parameter values the tangencies of stable and unstable manifolds may be only be for non periodic points. In any case these tangencies prevent A from having a uniform hyperbolic structure. A numerical study by means of computer graphics seems to indicate that there is a tangency for B = -0.3 and A about 1.392. However, it is known that for parameter values near a homoclinic tangency, the attracting set is not transitive but contains infinitely many periodic sinks. This follows from the work of Newhouse (1979) on infinitely many periodic sinks. Also see Robinson (1983). It is still conceivable that the placement of the hooks could be controlled enough to avoid all homoclinic tangencies. It would be hoped that for such a parameter value that most points could be proved to have a positive Liapunov exponent.
FIGURE 10.3.
Enlargement of Piece of Henon Attractor
7.11 LORENZ ATIRACTOR
309
Recently Benedicks and Carleson (1991) have shown that there are other parameter values for which FA,8 has a transitive attractor with positive Liapunov exponent. (The map FA,B still can not have a uniform hyperbolic structure.) In fact, there is a set S c {A: 1.0 < A < 2.0} of positive measure such that for A E S and B < 0 small enough the attracting set is topologically transitive and has a positive Liapunov exponent. Their proof uses a perturbation argument from the one dimensional case of the quadratic map, i.e., from FA,o. For the one dimensional map, the "hooks" are the images of the critical point, and can be controlled well enough to make the map transitive with an invariant ergodic measure. This one dimensional result was first proved by Jakobson (1981). There have hern many refinements of the proof, including Benedicks and Carleson (1985). Benf'llickll nnd Cnrl<'ll
1.11 Lorenz Attractor As stated in Chapter I, Lorenz (1963) introduced the equations :i:
= -ux +uy,
iJ = px i
y - xz,
= -{3z + xy
as a model for fluid flow of the atmosphere (weather). See Guckenheimer and Holmes (1983) or Sparrow (1982) for further discussion of the way these equations model atmospheric movement. The parameter values which have some physical significance occur near p = I, but Lorenz discovered some very unusual dynamics for the parameter values (T = 10, p = 28, and {3 = 8/3. In our treatment of the dynamics, we fix (T = 10 and {3 = 8/3 for the rest of the section and discuss the situation for various values of p, but always taking p > 1. Before turning to the detailed discussion, note that the equations are invariant under the substitution of (-x, -y, z) for (x, y, z). Therefore, the solutions have this type of symmetry: if (x(t), y(t), z(t)) is a solution then (-x(t), -y(t), z(t}} is also a solution. The fixed points of this system of equations are easy to find, and are at 0 = (0,0,0), p+
= ([8(p -
1}/3JI/2, [8(p -1}/3J 1/ 2,p - I), and
p- = (-[8(p-I}/3J 1/ 2,_[8(p-l}/3J 1/ 2,p-l).
The eigenvalues at the origin are all real, -8/3, -11/2 ± [121 + 40(p - I}J 1/ 2/2. Thus there is one unstable eigenvalue
Au = -11/2 + [121 + 40(p - I)JI/2/2 and two stable eigenvalues
A. = -8/3
and
A•• = -11/2 - [121 +40(p-l}]1/2/2.
310
VII. EXAMPLES OF HYPERBOLIC SETS AND ATTRACTORS
For P = 28, Au ::;:: 11.83 and A•• ::;:: -22.83. The unstable manifold of the origin is one dimensional and has two branches WU(O)*. These two branches are related to each other by the symmetry noted above. For small values of p, the positive branch of the I!Estable manifold, WU(O)+ stays on one side of the stable manifold W·(O). In fact, numerical integration indicates that it goes to p+. See Figure 11.1. For P = Po ::;:: 13.93, the unstable manifold for the origin Is seen to connect to the stable manifold of the origin forming a homoclinic loop, WU(O) C W'(O). (Remember, if one branch forms a homoclinic loop for a parameter value than the other branch also forms a homoclinic loop by the symmetry.) See Figure 11.2. For P > Po each half of the unstable manifolds for the origin (going out only one direction from the origin), WU(O)±, crosses from one side of W'(O) to the other side. See Figure 11.3. For P near Po, WU(O)+ falls into W'(p-) and WU(O)- falls into W·(p+). For P > PI = 470/19::;:: 24.74 this is no longer the case as we discuss further below.
FIGURE 11.1.
Unstable Manifold ofthe Origin, 1 < P < Po
FIGURE 11.2. Unstable Manifold of the Origin, P = Po We have already referred to the stable manifolds of the fixed points p±. We now turn to a discussion of the stability type of these fixed points. As is verified in Exercise 6.10, the characteristic equation for the fixed points p± is p(A)
8
160
= A3 + (41/3)A 2 + 3(P + IO)A + 3(P -
1)
= O.
For P ~ 14, p(A) has one real negative root and two complex roots. There is a bifurcation value at PI = 470/19 ::;:: 24.74. The two complex roots have a negative real part for
7.11 LORENZ ATTRACTOR
311
FIGURE 11.3. Unstable Manifold of the Origin, Po < P < PI
FIGURE 11.4. Two Views of the Unstable Manifold of the Origin for P = 28 14 $ P < PI, and positive real part P > PI. Thus these fixed points are stable for 14 $ P < PI, and they become unstable at PI = 470/19 R: 24.74. In fact a Hopf bifurcation takes place for this parameter value. See Sparrow (1982). It is a subcritical Hopf bifurcation where two unstable periodic orbits disappear at PI. For P < PI, the fixed points p± are sinks, while for P > PI, these fixed points push outward in a two dimensional subspace. The eigenvalue is complex so the trajectories spiral around in this two dimensional surface as they move outward. The existence of this two dimensional expanding subspace for the fixed points pushes the unstable manifolds. WU(O)± away from p±. In fact for P = 28, which is greater than PI! WU(O)+ crosses from one side of W"(O) to the other and back again, as indicated in Figure 11.4. (None of the facts stated above are obvious, but follow by detailed calculations. Some of these calculations are contained in Exercise 6.10, and others are referred to in Sparrow (1982).) From now on we focus our attention on the behavior for P = 28 and fix this value. There is a trapping region N containing the origin and not containing the other two fixed points. In fact, there are two holes in the region where these two fixed points are located. See Figure 11.5. Let A be the maximal invariant set in N,
A=
n,l(N). t~O
Thus A is an attracting set. The unstable manifold of the origin, WU(O), must be completely contained in the trapping region N, and so in A. The flow can not have a hyperbolic structure on A in the usual sense because A contains a fixed point at 0: at the fixed point 0, Ef, has dimension two and q has
312
VII. EXAMPLES OF HYPERBOLIC SETS AND ATTRACTORS
, ,, I I \ \
,
,
, FIGURE 11.5.
Trapping Region
dimension one. while at other points x E A the splitting would be of the form
IE: $E~ $IE~ where each of the subspaces would have dimension one and E~ would be spanned by the vector field at x. However. we could consider a generalized hyperbolic structure where the splitting varies as above and the two dimensional subspace E~ $E~ for x E A \ {OJ converges to E& as x converges to o. Numerical integration indicates that the equations have this type of generalized hyperbolic structure but this is a "global" question and has not been verified analytically.
7.11.1 Geometric Model for the Lorenz Equations Guckenheimer (1976) introduced a geometric model of the Lorenz equations which is compatible with the observed numerical integration of the actual equations. This model has been analyzed in Williams (1977. 1979. 1980). Guckenheimer and Williams (1980). Rand (1978). and Robinson (1989. 1992). See Sparrow (1982) and Guckenheimer and Holmes (1983) for a more complete discussion of the model than we give in this section. To understand the model. first consider the flow of the actual equations near the origin. By a nonlinear change of coordinates the equations are differentiably conjugate to the linearized equations in a neighborhood of the origin. i: = ax
y=-by i = -cz where a = Au ~ 11.83. b = -A.. ~ 22.83, and C = -A. = 8/3. (The differentiable conjugacy follows by the result of Sternberg (1958). Also see Hartman (1964).) Therefore 0 < c < a < b. The solution of the linearized equations is given by x(t) = eBtxo, y(t) = e-btyo, and z(t) = e-ctzo. We want to follow the solutions as they flow past the fixed point. so from the time when z(t) equals to some fixed Zo until x(t) equals to some fixed ±x1. Consider the two cross sections E = {(x,y, ZQ) :
E'
=E\
Ixl,lyl :5 a},
{(x,y,ZQ) : x
S± = {(±x\. y, z)
= OJ,
: lyl,lzl :5 .a}.
and
7.11.1 GEOMETRIC MODEL FOR THE LORENZ EQUATIONS
313
Then for (x,y,zo) E I;', the time r such that
eGTlxl = XI eT =
Then the Poincare map P, : E'
-+
(XI) '/0. x
S is given by
PI(x,y) = (y(r),z(T)) = (e-In- y, e- CT zo) = (yxblox;blo,xc/ozox;c/o).
Notice that if x> 0 then P,(x, y) E S+ and if x < 0 then Pdx, y) E S-. For eigenvalues at the fixed point equal to those of the real Lorenz equations, b/a = I~ .. / ~"I :<:: 1.93 > 1 and cia = I~./~"I :<:: 0.23 < 1. Therefore, a square region {(x,y, zo) : 0 < x ~ a, Iyl ~ a} in E' comes out in a cusp shaped region in S+. See Figure 11.6. The geometric model assumes that the Poincare map P2 from S± back to E takes the horizontal lines z = z\ into lines x = Xl' This compatibility insures that there is a coherent set of contracting directions. More specifically, we assume that
Let P = P2
0
PI. Then P: E'
-+
E and the image of I;' by P is as in Figure 11.7.
FIGURE 11.6. Flow of E Past Fixed Point
FIGURE 11.7. Image of E' by Poincare Map
314
VII. EXAMPLES OF HYPERBOLIC SETS AND ATTRACTORS
Because (1) PI takes a line segment with all the same x values to a line segment with all the same z values and (2) P2 takes a line segment with all the same z values back to a line segment with all the same x values, the x value of the image of a point by P is determined solely by its x value. Therefore P is of the form P(x, y) = U(x),g(x, 1/)). For a fixed xo, the map g(xo, y) is a contraction in the y-direction:
for 0 < 11 < 1, and If'(x)1 ~ 21/ 2 > 1 for all x with Ixl :5 a. Thus for the Poincare map, there is a hyperbolic splitting E; e E~, with E~ = {(O, V2)} and E; mainly in the x-direction. Because of the form of the Poincare map, it has an invariant stable foliation W'(q, P) made up of curves with constant value of x on E. One of these line segments, W'(q, P) for q E E', is taken into another such line segment, W"(P(q), P), with most likely a different value of x. We make an equivalence class of points on E that lie on the same stable line segment, W"(q, Pl. By collapsing equivalence classes to points, we get a map 7r : E -+ [-a, a]. (In the above situation 1T(q) just gives the x-value of q.) Because P takes an equivalence class into an equivalence class, P and 1T induce a map I: [-a, a] \ {OJ -+ R. This description of the map 1 is more coordinate free than given above where we wrote P(x, y) = U(x), g(x, y)) but it represents the same function.
FIGURE 11.8.
Graph of 1
The assumptions on the map 1 are more specifically as follows. (1) The symmetry ofthe differential equations implies that I(-x) = -/(x). (2) The map 1 has a single discontinuity at x = o. (3) The limit of I(x) as x approaches 0 from the left side is A > 0, 1(0-) = A, and the limit of I(x) as x approaches 0 from the right side is -A, 1(0+) = -A < o. Also 0 < 12(A) < I(A) < A and so 0 > 12( -A) > I( -A) > -A. Thus I: [-A, A] \ {OJ (4)
-+
[-A,A].
1 is nonuniformly continuously differentiable on [-A, AJ -{OJ, with !,(x) >
21/2
for all x ~ O. (5) The limit of I'(x) is infinity as x approaches 0 from either side. (6) Each of the two branches of the inverse of 1 extends to a C Ha function for some a> 0 on [/( -A), A] or [-A,/(A)I. (Thus the derivative of the extension of the inverse is a-Holder for some a > 0.) The graph of 1 is given in Figure 11.8. The properties of 1 follow mainly from the form of the Poincare map of the How past the fixed point.
7.11.1 GEOMETRIC MODEL FOR THE LORENZ EQUATIONS
315
Let cpt be the flow for the geometric model. Because P(E') C E, cpt has an attracting set A. The dynamics of the flow cpt on A are determined by the two dimensional Poincare map P. It can be shown that the dynamics of the two dimensional Poincare map are determined by the one dimensional Poincare map f. The fact that P has a coherent set of contracting directions can be used to show that there is a bundle IE:: for pEA and a complementary plane of directions E~ that are taken into themselves by D(cpt)p, D( cpt )pE;." = IE~~ (p) D(cpt)plE~ = 1E~,(p)
and the
IE~·
is contracted more strongly than anything in the
IE~
directions,
This last condition implies that cones about the IE~" are taken into themselves by the derivative of the flow, D(cpt)p. The vector field X(p) for the differential equation is in the center direction IE~. The center direction IE~ is also more or less "tangent" to the "sheets" in A, while the strong stable direction IE~" points transverse to the attracting set A. There is then a stable manifold theorem which says that there are curves W:"(p, cpt) that are tangent to the IE~· directions which are taken into themselves by the flow,
For small
f
> 0, N' ==
U W:"(p,cp) c N. pEA
We can form equivalence classes of points in the same strong stable manifold W:"(p, cp), and get a projection from N' to a branched manifold L. See Figure 11.9. See Williams (1974) for the general definition of a branched manifold or Williams (1977) for the branched manifold of the Geometric Model of the Lorenz attractor. The flow on N' induces a semi-flow 'ljit on L. Only a semi-flow and not a flow is induced on L because there are two choices of the backward trajectory at the branch set. The Poincare map from E' = '/l'(E) to itself for 'Iji is ,. (See the references for details.)
FIGURE 11.9. The Branched Manifold The fact that the flow cpt has the above properties is preserved under (fl perturba.tions. That is the content of the papers Robinson (1981, 1984).
316
VII.
EXAMPLES OF HYPERBOLIC SETS AND ATTRACTORS
Theorem 11.1. Let <{il be a flow with the properties of the Geometric Model of the Lorenz equations as given above. In particular, assume the one dimensional Poincare map 1 satisfies properties (1) - (6). Then there is a neighborhood N in the C 2 topology such that if tjJ! is a flow in N with the symmetry properties of <{it, then tjJl has a bundle of strong stable directions (and so do the Poincare maps). Forming the quotient by the strong stable manifolds for tjJl on the trapping region N', there is induced a seml-Bow on the quotient space L which is a brandied manifold. The semi-Bow tjJl has a Poincare map P: E \ W'(O,tjJ) -+ E. The quotient map taking W"(p, tjJ) to points, induces a one dimensional Poincare map j that satisfies all the properties (1) - (6).
In particular, Williams was able to show j is transitive by the following argument. We first of all make the following definition for one dimensional maps.
Definition. Let I c R be an interval and 1 : I -+ I be a continuous map. We say that is locally eventually onto provided for any nonempty (small) open interval K C I, there is an n > 0 such that I"(K) :> I. If 1 is locally eventually onto, then it is transitive on I by the Birkhoff Transitivity Theorem.
1
Theorem 11.2. Assume I: [-A,O) U (O,Aj-+ [-A,Aj satisfies assumptions (1) - (6) gh'en above. (a) Then 1 is locally event ually onto for the interval ( - A, A). (b) Therefore 1 is transitive on (-A, A) and nu) = [-A, Aj. PROOF. Let I = [-A,Aj. In the proof, we repeatedly throwaway points whose iterates
hit 0 and thus do not have well-defined forward orbits. Given an interval K C I, define
Ko = {Kthe longest component of K \ {O}
if 0 ¢ K if 0 E K.
By induction, define Ki for i 2: 0 by K
_ {/(K;) the longest component of !(Ki )
1+1 -
Let
\
{OJ
if 0 ¢ I(Ki ) if 0 E I(/(i)'
.x = zE/\{O} inf {f'(x)}.
By the assumptions ,x > 21/2. Note that by construction, 0 ¢ int(/(i). Therefore, tU(K i 2: Af(K,) where t(J) is the length of an interval J. Because of the choice of
»
eK
> {Af(K;)
( Hd -
~t(K,)
if 0 ¢ /(K i ) if 0 E I(K,).
Similarly, l(I<
) > { Af(KH1 ) jl(KHd
,+2 -
if 0 ¢ /(Ki +1) if 0 E I(KHd.
Thus if 0 ¢ I(Ki ) or 0 r;.1(K i +1) (0 ¢ I(Ki ) n I(KH1 ,x2 I(KH2 ) ~ 2"l(Ki ).
», then
Thus every two iterates of the interval makes the length of Ki increase by a factor .x2 /2 > 1 until there is some n for which 0 E I(Kn- 2 ) n/(Kn-d. Thus, we have proved that 0 E I(Kn-2) n I(Kn-tl for some n 2: 2.
7.11.1 GEOMETRIC MODEL FOR THE LORENZ EQUATIONS
317
Claim. For the above choice ofn, Kn = (-A,O) or [O,A).
°
°
°
The point E I(Kn -2) so E 8(Kn_.}, i.e., K n- l abuts on 0. Let b > be such that I(±b) = 0. Then the fact that E I(Kn -.} implies that b or -b are in K n - il so K n - l :> [-b,O) or (0, b). Thus, I(Kn -.} contains either 1([-b,O» = [0, A) or /«O,b]) = (-A,OJ. 0 PROOF.
°
Let c = /(A). From the claim, it follows that /(Kn) = (-c,A) = (-c,OJ U [O,A) or (-A, c) = (-A, OJ U [0, c). Note that since ICc) = peA) > = I(b), it follows that c > b. Then in the first case,
°
/2(Kn) :>
/« -c, 0]) U /([0, A)) /«
:> -b, 0)) U 1([0, A)) :> [O,A)U(-A,c) :> (-A, A).
A similar argument holds when I(Kn) = (-A, 0] U [0, c). This completes the proof of (a). As mentioned before the theorem, the Birkhoff Transitivity Theorem implies that / is transitive and so all points are nonwandering. 0 The next theorem makes some connections between
I
and P.
Theorem 11.3. (a) There is a one to one correspondence between the periodic points of / and the periodic points of P. (b) Let Ap = nn~o pn(E'). Then Ap = O(P). PROOF. If P"(xo, YO) = (xo, Yo) then clearly /"(xo) = Xo. Conversely, assume that /"(xo) = Xo· Then P(x,y) = (f(x),g(x,y)) so P"(xo.y) E {xo} x I for the interval I of values of y. The map P is a contraction by a factor I-' < 1 on fibers W"(q, P), so
or P"(xo.·) : I --+ I is a contraction by 1-'''. Therefore P"(XO,y) has a unique fixed point Yo in I, and P has a unique point (xo, Yo) of period k corresponding to the point Xo of period k for /. This proves part (a). For part (b). note that E has the property that peE) C int(E). (If this Is not the case, then enlarging E slightly in the x-direction makes this true.) Then Ap Is an attracting set. so O(P) cAp. We need to show that O(P) :> A p . Take (xo. Yo) E Ap and let U be a neighborhood in E. Then there is a small interval J containing Xo and K = [YO - f, Yo + E) containing Yo such that J x K c U. Take m > such that I-'m[(I) < E. There exists a point (Uo.Vo) E Ap such that pm(Uo.Vo) = (xo, Yo). Since / Is locally eventually onto. there Is a point Xl E J such that r(x.} = Uo. Take any Yl E K, so (xily!l E U. and let pn(Xil y.) = (uo. vd. Then
°
IPR+m(XI.YI) - (xo,Yo)1 = IPm(Uo,v.) - pm(Uo.vo)1 $l-'ml VI - Vol
$
E.
Therefore (Xily.},pn+m(XI.y.) E U. This Is true for any neighborhood of (xo,Yo) 80 (xo, Yo) E O(P). This shows Ap c O(P), completing the proof. 0
318
VII. EXAMPLES OF HYPERBOLIC SETS AND ATTRACTORS
The reduction to the one dimensional Poincare map can also be used to prove that the flow is not structurally stable. Small changes in the flow can make it have a homoclinic orbit or not, i.e., f"(A) = 0 for some n, or Ji(A) ~ 0 for all j > o. These two different types of flows are not conjugate or even flow equivalent. However, Guckenheimer and Williams have analyzed much of the topological structure of the attracting set for the Geometric Model of the Lorenz equations. See Guckenheimer (1976), Guckenheimer and Williams (1980), and Williams (1977, 1979, 1980).
Birman and Williams (1983a, 1983b) and Williams (1983) have studied the type of knots that occur as periodic orbits for the Geometric Model of the Lorenz equations. The branched manifold plays an important part of this analysis. The branch manifold is modified by removing the fixed point and get what is called a template. In particular the periodic orbits on the template correspond to the periodic orbits in R3. The reference sited above show that the knots which appear are prime. Also see Holmes (1988) for a good introduction into the theory of knots in Dynamical Systems.
1.11.2 Homoclinic Bifurcation to a Lorenz Attractor More recently, Rychlik (1990) proved that an attroctor 01 Lorenz type could occur for specific cubic differential equations in R3. The papers by Robinson (1989, 1992) contain a further discussion of this type of bifurcation and connections with stable manifold theory. Also see Ushiki, Oka, and Kokubu (1984). Recently Dumortier, Kokubu, and Oka (1992) has determined the various codimension two bifurcations of a homoclinic connection for a fixed point of a differential equations in R3. For the Lorenz equations, after the homoclinic bifurcation at Po there is an invariant suspension of a horseshoe and not an attractor. Later at the Hopf bifurcation value, it appears that the gap of the horseshoe disappears and an attractor is formed. In the equations studied by Rychlik and Robinson there are more parameters than in the Lorenz equations so these two bifurcations can be compressed into one. With certain (codimension two) assumptions on the parameters, it is possible to bifurcate directly from the homoclinic connection to an attractor. Since the homoclinic connection involves only two orbits, it is possible to analyze completely the properties of the flow at this parameter value and prove that a strong stable direction is preserved after the bifurcation. By controlling the expansion rates in comparison to the distance of the unstable manifold to the stable manifold, it is possible to show that the one dimensional Poincare map is like the one given for the Geometric Model of the Lorenz equations. Again this control of the expansion rates requires the extra parameter of the equations studied in these papers which is not present in the Lorenz equations. This work for the Lorenz equations is somewhat analogous to the comparison of the reliults of Benedicks and Carleson (1991) for the Henon map for small B to the observed results for the parameter values A = 1.4 and B = -0.3.
1.12 Morse-Smale Systems The examples of the horseshoe, toral Anosov, and solenoid all have infinitely many periodic orbits. In this section we consider a class of systems with only finitely many periodic orbits and no other chain recurrent points (or no other nonwandering points). If we let nu) be the set of chain recurrent points, and PerU) be the set of periodic points, then we are assuming that Per(f) = n(f) and that there are finitely many orbits in Per(f). We also want this system to be structurally stable. Therefore we require that all the periodic points are hyperbolic, so they persist under small perturbations of 1 and the dynamics near the periodic points do not change.
7.12 MORSE-SMALE SYSTEMS
319
Assume 1 is a diffeomorphism with PerU) = 'R.U). For any q, a(q),w(q) C PerU), so there must be two periodic points PltP2 with a(q) = O(p.) and w(q) = 0(1)2). Thus for any q there must be PI,P2 E PerU) with q E W"(pd n W"(P2). Since the periodic points are hyperbolic, for 9 sufficiently near to I, there will be nearby periodic points Pl(g),P2(g) E Per(g). For the system 1 to be structurally stable, it is necessary that for 9 sufficiently near to I, W"(Pl(g),g) n W"(P2(g),g) '" 0. Thus we need that any intersection of stable and unstable manifolds can not be destroyed. The property which assures that these intersections can not be broken is traru;;versality, which we now define. Definition. Two submanifolds V and W in M are transverse (in M) provided for any point q E V n W, we have that TqV + TqW = TqM. (This allows for the possibility that VnW=0.) Notice that for the two submanifolds V, W to intersect transversally at a point q, it is necessary for dimTqV + dimTqW ~ dimTqM. In R2, if V and Ware two curves then V being transverse to W at q means that the two tangent lines Tq V and Tq W are not colinear. In R3, two planes are transverse at q if they intersect in a line through q. The definition of a Morse-Smale system can now be given. Notice that the definition involves conditions on the whole phase portrait and how orbits go between various periodic points. For this reason we only define it when the phase space is compact. Instead of R", we need to add the point at infinity to get a system on S", or work with some other compact phase space such as yn. Definition. A diffeomorphism 1 (or a flow !pI) on a compact manifold M is called Morse-Smale provided (1) the chain recurrent set is a finite set of periodic orbits, each of which is hyperbolic, and (2) each pair of stable and unstable manifolds of periodic points is transverse, i.e., if PltP2 E PerU) then W"(p.) is transverse to W"(P2). Notice that in the case of flows, we allow periodic orbits and not just fixed points. We now give a number of examples of Morse-Smale diffeomorphisms. At the end of the section, we return to some examples of flows, and highlight some of the special aspects of Morse-Smale flows. Example 12.1. On 8 1 we consider the system 1(8) = 8 + £ sin(27rk8)
mod I,
for 0 < 27rk£ < 1. The lift F of 1 has F'(8) = 1 +£cos(27rk8). This derivative is positive with the assumption on £, so 1 is a diffeomorphism. This has 2k fixed points, {;k g~o 1 . Half of the fixed points are attracting, {(2 j 2; 1) H:J, and the other half are repelling,
{t}J:J.
All other trajectories are in the stable manifold of one of these sinks and the unstable manifold of one of the sources. Thus the system is Morse-Smale. See Figure 12.1.
Example 12.2. Again on SI we consider the system 1(8) = 8 +
I
+ £sin(27rk8)
mod I,
320
VII. EXAMPLES OF HYPERBOLIC SETS AND ATTRACTORS
(a) k = 1 (b) k = 3
FIGURE 12.1.
for 0 < 2rrkl < 1. With the assumption on l, f is a diffeomorphism as in the previous example. This system has two periodic orbits of period k,
{ l}k-I k )=0
and
{ (2j
+ 1) }k-I
2k
)=0'
The first orbit is repelling and the second is attracting.
Example 12.3. On
y2,
take 8 1 and 82 as variable modulo 1. Let
We assume that 0 < 2rrl < 1 so that both coordinate functions are one to one and f is a diffeomorphism. This diffeomorphism has fixed points at PI = (0,0), P2 = (1/2,0), P3 = (0,1/2), and P4 = (1/2,1/2). The point PI is a source, P2 and P3 are saddles, and P4 is a sink. Then W"(Pj) C WU(pd and WU(Pj) C W"(P4) for j = 2,3,
WU(pd C W'(P2) U W'(P3) U W"(P4), W"(P4) C WU(pd U W U(P2) U W U(P3).
and
It is easily checked that all these intersections are transverse and the diffeomorphism is Morse-Smale. See Figure 12.2.
FIGURE 12.2.
Example 12.3 on Torus
Example 12.4. On S'l. it is possible to have a Morse-Smale diffeomorphism with one fixed point source at the north pole and one fixed point sink at the south pole. See Figure 12.3.
7.12 MORSE-SMALE SYSTEMS
FIGURE
12.3.
321
North Pole - South Pole Diffeomorphism 00 S'l
FIGURE
12.4.
Example 12.5
Example 12.5. On S'l, it is p088ible to have a Morse-Smale diffeomorphism with one BOurce at the north pole (infinity of the plane), two saddles, and three sinu, all fixed points. See Figure 12.4. The next lemma proves that for a Morse-Smale diffeomorphism, there are restrictions on the manner in which the stable and unstable manifolds can intersect. We first define a cycle of periodic points. The lemma then states that a Morse-Smale system can have no cycles among the periodic orbits. Definition. If P is a periodic point of f, for
tT
=u,., let
W"(O(p»
=UW"(JJ(p»,
W"(O(p»
=W"(O(p» ' O(p).
and
A collection of periodic points Po, ... , Pt-I E PerC!) is a k-c,cle provided WU(O(pj»n W'(O(PH.) :I " for j = 0, ... , k - 1 where we let Pt = PO.
Lemma 12.1. (a) If PI, P2 E Per(J) and q E WU(p.) n W'(P2), then there is an (-chain from PI to q and then to P2. (b) Ifpo, ... Pt E Per(J) and q; E WU(pj) n W'(PJ+I) for j = 0, ... , k - I, then there is an (-chain from Po to Pt which p _ through all tbe Pi and tbe q;. (c) If there is a k-cycle Po, ... Pt-l E PerC!) and q; E WU(O(PJ» n W'(O(PJ+I» for j 0, ... , k - 1, then q; E X(J) for j 0, ... , k - 1 and there are chain recurrent points which are not periodic points. (d) If, is Morse-Smale, then there can be no cycles among the period points of ,.
=
=
We leave the proof as an exercise for the reader. See Exercise 7.46. Next we state the theorem about the structural stability of Morse-Smale dift'eomorphisms and flows.
322
VII. EXAMPLES OF HYPERBOLIC SETS AND ATTRACTORS
Theorem 12.2. Let M be a compact manifold, and phism (or flow). Then J is CI structurally stable.
J
a C l Morse-Smale diffeomor-
The proof in the case of a diffeomorphism on SI is simple because there can only be periodic sources and sinks and no periodic saddle points. The proof in this case is similar to the examples treated in Section 2.6. For the proof in the general case, see Palis and de Melo (1982). The original proof is found in Palis and Smale (1970). Morse-Smale Flows We now turn to flows. The definition of a Morse-Smale flow is the same as for a diffeomorphism but with periodic orbits replaced by either fixed points or periodic orbits or some of each. There are some differences in the implications of the assumptions of transversality. If q E WU(-ytl n W"(-Y2) where 'YI and 'Y2 are either fixed points or periodic orbits, then the whole orbit O(q) c WU(-ytl n W"(-Y2). Thus for transversality we need that dim(Wu(-Yd) + dim(W"(-Y2» ?: dim(M) + 1. In particular, on a surface (dimension two), if 'YI and 'Y2 are each fixed point saddles for a Morse-Smale flow, then WU('Yd n W"(2) = 0. Example 12.6. On 'f2, consider the equations (written on
01 = 02 =
£1
sin(27r9tl
£2
sin(27r92)
a 2)
for 0 < £1, £2 < 1/(27r). Taking all the variables modulo 1, there are four fixed points: (0,0), (1/2,0), (0,1/2), and (1/2,1/2). This is very much like Example 12.3 given above for a diffeomorphism and has one source, two saddles, and one sink. See Figure 12.5. This example is a Morse-Smale flow.
FIGURE 12.5.
Example 12.6: Differential Equations on Torus
If
Definition. A flow
~
-'VV""(p)'
7.12 MORSE-SMALE SYSTEMS
323
We also say that X(p) = -VV'P'(p) is a gradient vector field. We want to define a gradient flow on a surface or a manifold. A surface can be embedded in JR3. In general, a manifold M can be embedded in a large dimensional Euclidean space JR n. If this is done, then a real valued function V : M -+ R is called differentiable if it can be extended to a neighborhood U of M in JR n, V : U ..... R with VIM V. With these facts we can define a gradient system on a manifold M.
=
Definition. Let a manifold M be embedded in a Euclidean space Rn. A vector field X(p) on M is called a gradient vector field provided there is a real valued function V : U -+ JR where U is a neighborhood of M in JRn such that
where 1I"p : TpJRn -+ TpM is the orthogonal projection of vectors at pin R n to tangent vectors to M. Also a flow 11" on M is called a gradient flow provided there is a gradient vector field X(p) such that I,V"(p) X(V"(p». As an alternative definition of a gradient vector field on a manifold (and more intrinsic to the manifold itself), we can assume that there is a Riemannian metric on M, i.e., for each p E M there is an inner product on TpM, a positive definite symmetric bilinear form ( , )p, which varies smoothly with p. (An embedding of M into a Euclidean space is merely one way to get such an inner product.) Given a tangent vector vp at p. (vp, )p is something which acts on a tangent vector at p and gives a real number, and the map sending wp to (vp, wp)p is linear in wp, so (vp, )p is in the dual space to TpM. Thus using the inner product, for each p EM. there is a map from the tangent vectors at p to the dual space, J p : TpM ..... (TpM)". The fact that the inner product is positive definite implies that this map is an isomorphism. At each point p, the derivative for V, DVp, is a linear map waiting to act on a vector and the outcome is a real number, thus DVp E L(TpM,JR) ~ (TpM)". Combining these two constructions, we define the gradient 01 V by grad(V)p J;l DVp.
=
=
Finally, a flow 11" is called a gradient flow provided there is a real valued function V : M -+ JR such that 'P' is the flow for the gradient vector field -grad(V), !V"(p) = -grad(V)'P'(p). Theorem 12.3. Let V : M ..... JR be a C' (unction such that each critical point is nondegenerate, i.e., at each point x where grad(V)" = 0, the matrix o( second partial derivatives, (1J1J21JV (x»), has nonzero determinant. Let X(x) Zi Zj
= -grad(V)"
be the
gradient vector field (or V. Then (a) all the fixed points are hyperbolic and (b) the chain recurrent set (or X equals the set o( fixed points (or X (X has no periodic points). PROOF. We leave the proof of part (a) to the exercises. See Exercise 7.37. Since the function V is strictly decreasing off the fixed points and all the fixed points are isolated, all points which are not fixed are not in the chain recurrent set. This proves part (b).
o
REMARK 12.1. Let V : M ..... JR be a C' function such that each critical point is nondegenerate. By Theorem 12.3, then -grad(V) is a Morse-Smale vector field if and only if the stable and unstable manifolds are transverse.
324
VII. EXAMPLES OF HYPERBOLIC SETS AND ATTRACTORS
Example 12.7. For a gradient vector field on 'f2, let V : 1'2 --+ IR be the height function when 1'2 is stood up on its end. This defines a gradient vector field with four fixed points: a source at the maximum Pmor, a sink at the minimum Pm'n, and two saddles PI and P2 with V(P2) > V(pd. See Figure 12.6. If the maximum, minimum, and saddles are nondegenerate, then the fixed points are hyperbolic. If the torus is "straight up and down", then there is a saddle connection from P2 to PI, WU(P2)nW'(pd '" 0. Thus this example is not Morse-Smale. If the torus is slightly tipped so that WU(P2)nW'(pd = 0, then the vector field becomes Morse-Smale.
FIGURE 12.6. Gradient Flow on 'f2
We want to make the connection between gradient systems and Liapunov functions, so we first define (global) Liapunov functions and then state and prove the theorem. Also see Conley (1978). Definition. A C l function L : M --+ R is called a weak Liapunov function for a flow
0
for all x
rt R( 0,
i.e., L(x) < 0 for all x rt R(
7.12 MORSE-SMALE SYSTEMS
325
Theorem 12.4. Let X(p) = -grad(L)p be a gradient vector field on a manifold M with flow
.
d
L(p) = diL(
d
I
= DL""(p)di
But DLpvp = (grad(L)p, v p) for any tangent vector vp E TpM, so L(p) = (grad(L)p, X(p)) = (grad(L)p, -grad(L)p)
= -lgrad(L)pI2.
Thus i(p) $ O. It equals zero if and only if grad(L)p and only if p is a fixed point of the How.
= 0,
if and only if DLp
= 0,
if 0
Corollary 12.5. The minima of L are asymptotically stable fixed points, and the saddles and maxima of L are not Liapunov stable. Corollary 12.6. The trajectories of a gradient system for L : M -- R cross tbe level sets of L orthogonally at points wbicb are not fixed points. If the real valued function L on M has only finitely many nondegenerate critical points, then the gradient vector field for L, X, has finitely many fixed points all of which are hyperbolic and no other periodic orbits. Thus R(X) = Fix(X) = Per(X). It can be shown that by perturbing any such gradient vector field slightly to Y, all the stable and unstable manifolds of Y can be made transverse, and hence Y would be 8 Morse-Smale vector field. Example 12.7 with the torus straight on the end is an example where R(X) = Fix(X) = Per(X) but the system is not Morse-Smale. The perturbation of the gradient system caused by slightly tipping the torus is an example of how such 8 system can be made Morse-Smale.
There are constraints on the possible dynamics in terms of the topology of the phase space. We give the result for gradient Hows or gradient-like flows. If p is a fixed point, then the index at p is defined to be the dimension of the unstable subspace, index(p) = dim(E~). Let Ck = #( critical points of index k) and /310 = dimH.(M,R). Then the Morse inequalities are as follows: Ck -Ck-I
+ ... ±ec ~ /310 -
/310-1
+ ... ±/Jo
for k = 0, ... ,dim(M). Also for k = dim(M), dim(M)
E
j=O
dim(M) (_1)dlm(MH cj
=
E
(_1)dim(MH/3j
= X(M),
j=O
where X(M) is the Euler characteristic of M. See Franks (1982) for more details including some results for diffeomorphisms. Also see Palis and de Melo (1982) for further extension of the index to more complicated invariant sets.
326
VII. EXAMPLES OF HYPERBOLIC SETS AND ATIRACTORS
7.13 Exercises Manifolds Prove that in Example 1.3.
7.1.
sn has the structure of a Coo n-manifold.
Hint: Use the graphs given
7.2. Let F : Rn+1 - R be a C r function for some r > 1. Assume c E R Is a value such that for each p E F-I(c), DFp "# 0, i.e., DFx has r~k one, or some partial derivative of F is nonzero at p. Prove that F-I(C) has the structure of a C r n-manifold. 7.3. Let F : Rn+k - Rk be a C r function for some r 2: 1. Assume C E Rk is a value such that for each p E F-1(c), DFp has rank k. Prove that F-I(C) has the structure of a C r n-manifold. Hint: Use Theorem V.2.2.
7.4. Let AI and N be manifolds and I : M - N a C l diffeomorphism (a C l homeomorphism with a C l inverse). Prove that the map DIp: TpM - Tf(p)M is an isomorphism for every p EM. 7.5.
Let X = R and consider the two coordinate charts 11'1,11'2 : R - X given by = x and 1I'2(X) = x 3. For i = 1,2, let Mi denote the manifold consisting of the metric space X together with the one coordinate chart lI'i' (a) Prove that the identity map on X is not a diffeomorphism. (b) Prove that there is a diffeomorphism from MI to M2.
II'dx)
7.6. Let M be a manifold and X : M - TM be a vector field. Prove that X is C r for r 2: 1 if and only if for each cr+ 1 function I : M - R the function X (f) defined by X(f)(p) = DJpX(p) is cr. n-ansitivity Theorems 7.7. A homeomorphism I : X - X is called topologically mixing if for any two open sets U, V C X there is an N > 0 such that n 2: N implies J"(U) n V "# 0. (a) Prove that no homeomorphism of 1= [0,1) is mixing. (b) Give an example of a homeomorphism of a compact connected metric space which has a point with a dense orbit but which is not mixing. 7.8.
Let
T () x = {
2x
for x ::; 1/2
2 - 2x
for x 2: 1/2
be the tent map. (a) Using the Birkhoff Transitivity Theorem, prove that T has a point with a dense orbit. (b) Using the fact that F4(X) = 4x(1- x) is conjugate to T (see Example 11.6.2), prove that F4 has a point with a dense orbit. 7.9. (Poincare Recurrence Theorem) Assume Il is a finite measure on a space X, J.I(X) < 00. Assume J : X - X is a one to one map which preserves the measure Il, i.e., Il(f-I(A» = J.I(A) for every measurable set A. Assume S is a measurable set. Let So = S, Sn = {x E S,,_I : li(x) E Sn-I for some j = Sn-I
n
2: I}
Uri(Sn_ll i2:1
for n 2: I, and Soo = r1n>0 Sn' Prove that Soo is measurable, Il(Soo) = Il(S), and for x E Soo, fJ(x) returns to S an infinite number oftimes. Hint: Prove that Il(Sn) = Il(S) for n 2: 1.
7.13 EXERCISES
327
7.10. Assume X is a separable metric space and I" is a finite Borel measure on X (so all open sets are measurable). Assume that i : X --+ X is a homeomorphism on X which preserves the measure 1". Using the previous exercise, prove that
Y
= {x EX:
x E o(x) and x E w(x)}
has full measure, i.e., p(Y) = p(X). Hint: Let {Xk}f:1 be a countable dense set in X and consider the open balls B(Xk, 6) for 6 > O. 7.11. Assume X is a complete separable metric space and I" is a finite Borel measure on X which is positive on every open set. Assume that i : X --+ X is a homeomorphism on X which preserves the measure p. Prove that
Y
= {x EX:
x E o(x) and x E w(x)}
is a residual set ill X and so is dense in X. Hint: Let {Xk}f:1 be a countable dense set in X. For 6> 0, let U(xk,6,O) = B(xk,6) and U(xk,6,n) = {XE U(xk,6,n-l): ii(x) E U(xk,6,n-1) forsomej 2: I}
for n 2: I, Prove that each U(Xk, 6, n) is dense and open in B(Xk, 6). Two Sided Shift Spaces 7.12. Let EN be the full two sided shift space on N symbols with shift map >. > I, let p). be the metric on EN defined by p).(s, t)
=
(IN.
For
~ 6(si,ti) L..J >.lil j=-oo
where
Given tEEN and k 2: 0, prove that
{s:
Sj
= tj for Ijl ~ k}
is an open ball in terms of the metric p). if and only if >.
> 3.
7.13. Let EN be the full two sided shift space on N symbols with shift map (IN. Let A be an N x N transition matrix and EA C EN be the subshift of finite type determined by A and (lA = (lNIE A . (a) Prove that (lA is topologically transitive on EA using the Birkhoff Transitivity Theorem. (b) Describe a symbol sequence s· E EA that has both a dense forward orbit and a dense backward orbit in EA' Remark: Compare with Theorem 11.5.3. 7.14. Let (I: E2 --+ E2 be the full two-sided two-shift. Define r : E2 --+ E2 by rea) = b where bj = a_j-I, and let S = (lor. Prove that ror = id, sos = id, and (I = sor. Subshifts for Nonnegative Matrices 7.15. Let A be an n x n adjacency matrix and N = Ei,j aij. Let T be the N x N transition matrix on the edges induced by A as defined in Section 7.3.1. (a) Prove that there are (Ak)ij T-allowable words w of length k + 1 with b(w) = i and e(w) = j. (b) Prove that A is irreducible if and only if T is irreducible.
328
VII. EXAMPLES OF HYPERBOLIC SETS AND ATTRACTORS
(c) Prove that #(Fix(u}»
= tr(A"). =
7.16. Let A be an nxn adjacency matrix and N EiJ aij. Assume aij E {O, I} for all 1 ~ i,j ~ n. Let T be the N x N transition matrix on the edges induced by A &8 defined in Section 7.3.1. Prove that the vertex subshift u A : EA -+ EA is topologically conjugate to the edge subshift formed from A, UT : ET -+ ET. Hint: Define h : ET -+ EA by h(s) = a where Bj = 6(aj) for all j, aj E E and Bj E V. Prove h is a conjugacy. Horseshoes 7.17. Consider the map I: S2 -+ S2 which gives the Geometric Horseshoe. Let S be the square &8 given in the chapter. Draw the inverse image of S by I, I-I(S). 7.18. Consider the Geometric Horseshoe Map. Let PI be the fixed point in HI n VI, L' compp,(W'(p!)nS), and LU compp,(WU(PI)nS). Draw the images of LU by P and L' by 1- 3 , P(LU) and 3 (L'). 7.19. Let I,k"g. : a' -+ a' be diffeomorphisms &8 in Example 4.1 (Section 7.4.2). Let U be the neighborhood of the nontransverse homoclinic point x· for f. Prove that for f > 0 small enough that following statements: (a) g, is a diffeomorphism, (b) is a saddle fixed point for g" (c) letting 1 W'(O, f) n u,
=
=
r
°
=
00
UI
j
(1)
c
W'(O,g.),
j=O -00
U Ij(1) c WU(O,g,),
and
j=-I
k.(1) C WU(O, g.), and (d) x· is a transverse homoclinic point for g•. 7.20. When B = 0, show that the family of Henon maps FA,O is conjugate to the family of quadratic maps F,. on a. (Unfortunately, the labeling of the two families of functions is very similar.) By a conjugacy in this case we mean a continuous map h : a -+ a' which is a homeomorphism onto the image of FA,o and such that h 0 F,. FA,o 0 h. Anosov Diffeomorphisms 7.21. Let / : 1r' -+ 1r2 be the map on the two torus induced by the matrix
=
acting on a'. (Note that I is not invertible.) Prove that the periodic points are dense. 7.22. Let IA be a hyperbolic toral automorphism on 1r" with lift LA to a". Let 9 be a small CI perturbation of IA with lift G to a". Finally, let G = G- LA. Let Cf,per(R"), Cl,per(a"), and 9(0, v) be defined &8 in the proof of Theorem 5.1. . I (a) Prove that G E C.,Plr(lR"). (b) Prove that 9(0,·) preserves C~,per(a"). 7.23. (a) Give an example of a hyperbolic toral automorphism on the three torus T3. (b) Given an example of a hyperbolic toral automorphism on the n-torus 1r" for n > 3.
7.13 EXERCISES
1.24.
Let
329
I : y2 -+ y2 be the hyperbolic tora! automorphism with matrix
Prove that I has sensitive dependence on initial conditions. Markov PartioDs for Hyperbolic Toral Automorphisms 1.25. Let I". : y2 -+ y' be the diffeomorphism induced by the matrix
discussed in Example 5.4. Let Ria be the rectangle used in the Markov partition for this diffeomorphism. Let 9 11. and
=
n gi(Rla)' 00
A
=
j=-oo
Prove that 9 : A -+ A is topologically conjugate to the two-sided full two-shift E2. 1.26. Let I: y2 -+ y2 be the diffeomorphism induced by the matrix
D" :
E2
-+
Form a Markov partition with three rectangles by using the line segment [a. b]. as in the text, and an unstable line segment [g, c]u where 9 is determined by extending the unstable manifold of the origin through the origin 80 that it terminates at a point g E [a,O].. Thus 0 is within the unstable segment (g, c]u. Determine the transition matrix B for this partition. Determine the three eigenvalues for the transition matrix. How do the eigenvalues compare with the eigenvalues for A? 1.21. Let I : y2 -+ y2 be a hyperbolic toral automorphism and A the transition matrix for a Markov partition. Let h : E" -+ y2 be the semi-conjugacy. Prove that s is periodic for IT" if and only if h(s) is periodic for I". Explain why the periods could be different. 1.28. Let I" : y2 -+ y2 be a hyperbolic toral automorphism and B the transition matrix for a Markov partition. Using the fact that I" is topologically transitive, prove that B is irreducible. Zeta FunctioD for a Hyperbolic Toral Automorphism 1.29. Let I : yn -+ yn be a C l diffeomorphism with a hyperbolic fixed point p. Prove that the Lefschetz index of I at p is given as follows:
=
=
where u dim(E;) and .:l 1 provided DlplE; -+ - ; preserves orientation and .:l -1 provided this linear map reverses orientation. 1.30. (Zeta function via Markov partitions) Let I : y2 -+ y2 be a hyperbolic tora! automorphism and A the transition matrix for a Markov partition M. Let h : E" -+ T' be the semi-conjugacy. The zeta function of D"" is rational by Theorem 111.3.2. A. Manning (1911) proved that (,(t) is a rational function by relating it to (I7A' Bowen
=
330
VII. EXAMPLES OF HYPERBOLIC SETS AND ATTRACTORS
(1978a) sketched the following modification of Manning's proof. Let Pk be the collection of families of k indices of distinct rectangles {i l , ... , ik} such that each R;j E M and R;, n ... n R;. i: 0. For each such family, fix an ordering i = (i" . .. ,ik). For i,j E Pk, write i -+ j provided there is a permutation T of {I, ... , k} so that Ai"iT(') = 1 for 1 $ t $ k. Define the #(Pk) x #(Pk) matrix by I { A(k)IJ = -1
o
ifi-+j and if i -+ j and otherwise.
T
T
is an even permutation, is an odd permutation, and
Notice that A(l) = A and A(k) = 0 for k > 4 because J is a diffeomorphism on a two dimensional manifold. The parts of the problem below ask you to prove that N,(f) = ~) _1)k+1 tr(A(k)j). k?1
(a) Prove that if p E int(R;) for some Ri E M has period t for J, then h-I(p) has period t for U A. Prove that it does not correspond to a periodic point for U A(k) for any k > 1. (Alternatively, h-I(p) does not contribute to the trace of A(k)i for any k > 1 and j ~ 1.) Prove that in this case, h-I(p) contributes 1 to the right hand side of (.) for the j which are multiples of land 0 for other j. (b) Assume pERi, n R;2' p is not in any combination of three rectangles in M, and p is a fixed point for J. Prove that i -+ i where i = (iI, i2)' (i) If the permutation T for i is even, prove that h-I(p) contains exactly two points, each of which is fixed by U A. Prove that A(2)i,1 = 1 and p does not correspond to a periodic point of UA(k) for a k > 2. In this case, prove that the points in h-I(p) contribute 1 to the right hand side of (.) for all j. (ii) If the permutation T is odd, prove that h-'(p) contains exactly two points, which is an orbit of period two for UA. Prove that A(2)L = (-I)i and p does not correspond to a periodic point for U A(k) for a k > 2. Prove that the points in h-I(p) contribute 1 to the right hand side of (*) for all j. (c) Assume that p is a fixed point for J. Prove that the points ill h-I(p) contribute 1 to the right hand side of (.) for all j. (d) Assume that p has period l for J. Prove that the points in h - 1 (p) contribute 1 to the right hand side of (.) for the j which are multiples of f and 0 for other j, Conclude that (.) is true for all j. (e) Using part (d), prove that <J(t) is rational. Attractors 7.31. (Theorem 6.1) Let J : M -+ M be a diffeomorphism on a finite dimensional manifold M. Let A be a compact invariant set for J. Prove that A is an attracting set if and only if there are arbitrarily small neighborhoods V of A such that (I) V is positively invariant and (ii) w(p) C A for ail p E V. (Note that the neighborhoods V can be taken to be compact since both M and A are compact.) 7.32. (Theorem 6.2) Let A be an attracting set for J. Assume either that (i) pEA is a hyperbolic periodic point or (ii) A has a hyperbolic structure and pEA. Prove that WU(p,f) c A. Solenoids 7.33. Construct a Markov partition for the map g : Sl -+ SI defined by g(8) = 28 mod 1. Note in this case there is no contracting direction so WU (p, R;) = Ri and W' (p, Ri ) =
7.13 EXERCISES
331
{pl. Thus the fourth condition for a Markov partition becomes the following: ifint(Rt)n g-l(int(R j » 1: 0 and p E int(Rt) n g-l(int(R j )) then g(int(Rt)) n int(Rj ) = int(Rj ) and g-I(g(p» n int(Rt) = {pl. 7.34. (a) Construct a Markov partition for the solenoid. (b) Let h : EA ~ A be the map from the subshift of finite type determined by the transition matrix A for the Markov partition to the attractor A for the solenoid diffeomorphism defined by
n n n
00
h(s) =
cl(
n=O
rj(int(R. j
»).
j=-n
Prove that h is at most two to one. Let compp(S) denote the connected component of S containing p. Let N = SI xD2 and f : N ~ N be the solenoid map. For p = (0, z) show that compp(W'(p, f)n N) = D(O), and for 01 < 0 < O2 that 7.35.
compp(WU(p, f) n D([OI, 02])) =
n compp(D([Ol, 02]) n r(N». n2:1
Here D([OI, O2 ]) = (0 1 ,02 ) x D2. Hint: Show the inclusion both ways. You may assume that the left side is the one-to-one differentiable image of an interval.
DA Attractor 7.36. Prove that these exists a neighborhood V for the DA-diffeomorphism as specified in the proof of Theorem 8.1: f(V) :::> V, a22 > 1 for q E V, and a22 < 1 for q rt f(V). Morse-Smale Diffeomorphisms 7.37. Let V : 1'" ~ 1'" be a C2 function such that each critical point is nondegenerate, i.e., at each point x where grad(V)x = 0, the matrix of second partial derivatives,
a::':x (x), has nonzero determinant. j
Prove that all the fixed points of the gradient
vector field of V are hyperbolic. 7.38. Let L: M ~ R be weak Liapunov function for the flow 'PI. If x E 'R('P 1), prove that L 0 'PI(X) is a constant function of time. 7.39. (a) Let 'PI be a Morse-Smale flow with only fixed points and no periodic orbits. Prove that f(x) = 'PI (x) is a Morse-Smale diffeomorphism. (b) Let 'PI be a Morse-Smale flow that has a periodic orbit. Prove that j(x) = 'PI (x) is not a Morse-Smale diffeomorphism. (c) Let j : M ~ M be a Morse-Smale diffeomorphism. Let 'P' be the suspension of j. Prove that 'P' is a Morse-Smale flow without any fixed points and only periodic orbits. 7.40. Consider the set of C l diffeomorphisms on SI with the C l topology, Diffl(SI). Prove that any f E Diffl(SI) can be approximated by a Morse-Smale diffeomorphism, i.e., prove that the set of Morse-Smale diffeomorphisms is dense in Diff1(SI). 7.41. Prove that the set of Morse-Smale diffeomorphisms is not dense in Diffl(T2) or Diffl(S2) where S2 is the two sphere. Hint: Consider the examples of diffeomorphisms we have given with infinitely many periodic points. 7.42. Consider the two torus, T2. The Beti numbers of T2 are /30 = f32 = 1, and = 2. Assume that there is a diffeomorphism f on R2 with one source, C2 = 1, and
{31
332
VII.
EXAMPLES OF HYPERBOLIC SETS AND ATTRACTOR..<;
two sinks, Co = 2. Using the Morse inequalities determine how many saddles have.
I
must
7.43. Which of the following diffeomorphisms are structurally stable? Prove your answer. (a) I: SI -+ SI defined by 1(0) = 0 + 0 mod(271'), where 0/71' is irrational. (b) g: SI -+ SI defined by 1(0) = 0 + 0 mod(271'), where 0/71' is rational. (c) ho : SI -+ SI defined by ho(O) = 0 + 0 sin(O) mod(271'), where 0 E (0,0.2). (Is ho structurally stable for all these values of 0, some, or none?) 7.44. Assume that for each t E [a, bl, II : M -+ M is a diffeomorphism which is structurally stable. Prove that la and Ib are topologically conjugate. 7.45. Let I : R -+ R be defined by I(x) = x + 1. Show there is an f > 0 such that whenever 9 is a C l map which satisfies I/(x) - g(x)1 I!'(x) - g'(x)1
and
for all x E R, then I and g are differentiably conjugate. That is, there is a Cl diffeomorphism h : R -+ R such that I 0 h = hog. 7.46. Prove Lemma 12.1.
CHAPTER VIII
Measurement of Chaos in Higher Dimensions In this chapter we continue the discussion of measurements of chaos started in Chapter III. As mentioned there, topological entropy is a quantity which is used in the mathematical study of dynamical systems. A newcomer to the subject often has difficulty understanding exactly what is being measured from the definition of topological entropy itself. We calculate the entropy of a few simple situations. We also relate the entropy to horseshoes caused by homoclinic points and Markov partitions for Anosov diffeomorphisms. Hopefully, after seeing these examples, the reader will gain an understanding of the concept. The next section returns to the concept of Liapunov exponents. These are defined in one dimension where the situation is fairly simple. In this chapter we give the definition in higher dimensions. In this discussion, we merely give the definitions and statements of theorems. The goal is for the reader to be aware of the concept. We leave a full treatment to the references. The next section discusses the Sinai-Ruelle-Bowen measure for an attractor. Our treatment in this book of invariant measures in general and this measure in particular is very sketchy. Again, we defer to the references for a more complete treatment. The last measurement of chaos relates to the geometry of the invariant set. We define fractal dimension (box dimension, or Hausdorff dimension) and calculate it for a few examples. Again the goal is for the reader to be aware of the concept without giving all the theory or details.
8.1 Topological Entropy As stated when discussing chaos in Section 3.5, topological entropy is a quantitative measurement of how chaotic the map is. In fact, it is determined by how many "different orbits" there are for a given map (or flow). To get an intuitive idea of the concept, assume that you can not distinguish points which are closer together than a given resolution e. Then, two orbits of length n can be distinguished provided there is some iterate between 0 and n for which they are distance greater than e apart. Let r(n, e, f) be the number of such orbits of length n that can be 80 distinguished. The entropy for a given e, h(e, f), is the growth rate of r(n, f, f) as n goes to infinity. The limit of h(f, f) as e goes to zero is the entropy of I, h(J). The key idea in this sequence of limits is the growth rate of the number of orbits of length n that are at least e apart. Now we give a more careful definition.
Definition. Let I : X -> X be a continuous map on the space X with metric d. A set SeX is called (n, e)-separated lor I for n ~ positive integer and e> 0 provided for every pair of distinct points x, yES, x '" y, there is at least one k with 0 $ k < n such that d(J"(x),f"(y» > e. Another way of expressing this concept is to introduce the distance dn.,(x,y) = sup d(Ji(x),li(y». °Sj
334
VIII. MEASUREMENT OF CHAOS IN HIGHER DIMENSIONS
Using this distance, a set SeX is (n, t)-separated for f provided dn./(x, y) > t for every pair of distinct points x, YES, x # y. The number of different orbits of length n (as measured by t) is defined by
r(n,t,f) = max{#(S) : SeX is a (n,t)-separated set for
n,
where #(S) is the number (cardinality) of elements in S. We want to measure the growth rate of r(n, t, f) as n increases, so we define log(r(n, t, f» · h( t, f) = IImsup . n-oo n If r(n,t,f) = e nT , then h(t,f) = T; thus h(t,f) measures the "exponent" of the manner in which r(n,l, f) grows with respect to n. Note that r( n, t, f) 2: 1 for any pair (n, t), so 0 ~ h( t, f) ~ 00. We show in Lemma 1.10 that on a compact metric space, h(t,f) < 00. Finally, we consider the way that h(t, f) varies as t goes to zero, and define the topological entropy of f as h(f) = lim h(t, f) . • -0,.>0
In the next subsection we introduce another way of counting orbits, so we distinguish the notation for the number of (n, t)-separated orbits by a subscript, i.e., we use the notation r.ep(n,l,f) for r(n,t,f), h •• p(t,f) for h(t,f), and h •• p(f) for h(f) 1.1. Note for 0 < t2 < tl, r(n,f2,f) 2: r(n, tit f), so h(t,f) is a monotone function of t, h(t2' f) 2: h(ll' f). the limit defining h(f) exists, and 0 ~ h(l, f) ~ h(f) ~ 00 for all t > O. If f is Cion a compact space, then it has been proved that h(f) < 00. See Bowen (1971, 1978a). REMARK
REMARK 1.2. The concept of topological entropy was originally introduced by Adler, Konheim, and McAndrew (1965) using a very different definition involving covers of the set by open sets. This definition is given in the next subsection. The definition we use was introduced by Bowen (1970b). Good general references for topological entropy are Bowen (1978a, 1970b), Walters (1982), and Alseda, Llibre, and Misiurewicz (1993). This last reference treats the case of the entropy of a one dimensional map quite extensively. It also treats both the definition we have given with separated sets and the original definition using refinements of open sets. REMARK 1.3. It is also possible to define a measure theoretic entropy hj.l(f) for an invariant measure JI.. Then under the correct hypothesis, h(f) = sup{hj.l(f)} where the supremum is taken over all invariant measures JI.. See Bowen (1975b) or Walters (1982) for a discussion of this type of entropy and its connection with topological entropy.
It is hard to get a very good idea of what entropy means directly from the above definition. Throughout the section, we determine the entropy of a few examples which helps give some feeling for its meaning. The first example is the "doubling map" which is easy to show that it has entropy equal to log(2). We state this result as a proposition. Proposition 1.1. Let f : SI
-+
SI have a covering map F : IR
-+
IR given by
F(x) = 2x. This mllp is (,IIJled the doubling map. The distance on SI is the one inherited from R by taking x to x mod 1. Thus points near 1 are close to points near O. (This ('an also
8.1 TOPOLOGICAL ENTROPY
335
be considered as the map g(z) = z2 on the circle where we use complex notation. If this representation is used, the map is often called the squaring map.) Then h(f) = log(2). PROOF. Two points x and y stay within l of each other for n - 1 iterates of / if and only if Ix - yl ~ t2- n+ 1 (as points in R). If we put points exactly distance t2- n +l apart in [0.1). then there are at most [l-12 n- I j points. However the last point to the right is close to the first point on the left when considered modulo 1 in SI. so there are [t- 12n- I j - 1 points in SI. These points can be spread apart slightly to make these points actually (n.f)-separated. Therefore. r(n.f.f) = [t- 12n - I j - 1 where raj Is the integer part of a. Then
log([f-12n-tj - 1) · h( E. I) = I Imsup n-oo n log(t-t) + (n - 1) log(2) . I = tmsup n-(X) n = log(2) for any
l
> O. so h(f)
o
= log(2) as claimed.
We give a few basic results which are used to help calculate the entropy for specific maps. The first such result relates the entropy of a map I with a power lie of I. Theorem 1.2. Assume I is uniformly continuous (or X is compact) and k is an integer with k ~ 1. Then the entropy of lie is equal to k times the entropy of /. h(fle) = k h(f). PROOF. Because the points of the orbits considered for r(n. t. lie) constitute a subset of those considered for r(nk. f. f).
so any (n.f)-separated set for lie is also an (nk.t)-separated set for I. and we have that r(n.l. I") ~ r(nk.l. f). By uniform continuity. given.,., > 0 there is l > 0 such that if d(x. y) ~ l then d(fj(x). Ji(y)) ~ .,., for 0 ~ j < k. Therefore any (nk •.,.,)-separated set for I Is also a (n.f)-separated set for I". or r(n.f./") ~ r(nk •.,.,.f) where.,., is uniform in n. Combining these two inequalities. 1
-r(nk.f.f) n
~
1
Ie
-r(n.l.1 ) n
~
1
-r(nk • n
.,.,.n
and taking the limits in n and then f kh(f.f) ~ h(f.I") ~ kh(.,.,. f). kh(f) ~ h(f") ~ k h(f).
This proves the theorem.
o
1.4. We leave to the exercises to prove that if I Is a homeomorphism. then h(r t) = h(f). See Exercise 8.9. From equality it follows that h(f") = Ikl h(f) for any integer k. The next two results relate the entropy of a map with the entropy on invariant subsets: the first result Is in terms of disjoint invariant sets and the second one is in terms of the nonwandering set. REMARK
336
VIII. MEASUREMENT OF CHAOS IN HIGHER DIMENSIONS
Theorem 1.3. Let f be a continuous map on X. A&5ume X = Xl U··· U X,. is a decomposition into disjoint closed invariant subsets which are a positive distance apart. Then h(f) = maxh(flXi) .
•
PROOF.
If t is smaller than the distance between the subsets Xi. then
r(n.t.fl = Lr(n.t.fIXi). i
Thus for each n and each j. we must have
r(n.t.fIXj ) $ r(n.t.fl $ kmaxr(n.f.fIXi) .
•
In passing to the limit in calculating h(f.!). for each j we have
h(t.fIXj ) $ h(t.!) . _lo.:::.g(,. . T.:. .n.(. :. . t...:..,,--,fl'-'.) = Ilmsup n-oo n
. log(k maxi r(n. t. fiX;) $ I1m su p --='-'-----'---'--'-:..:...!---::..:.. R-OO n $ max heft fIXi )•
•
so h(t.fl = maxj h(t.fIXj zero. one jo must have for infinitely many of the
).
li.
By taking a countable number of ti > 0 converging to
so
h(f) = h(fIXjo )' Since h(fIXj ) $ h(f) for all j. this proves the theorem.
o
The next result of Bowen (1970b) says that all the entropy is contained in the nonwandering set. i.e .• the wandering orbits do not contribute to the entropy.
Theorem 1.4. Let f : X -+ X be a continuous map on a compact metric space X. Let 0 c X be the nonwandering points of f. Then the entropy of f equals the entropy of f restricted to its nonwandering set, h(f) = h(fIO). The proof of this theorem is delayed until the next subsection because of its length and the relative greater complexity of its argument; it also uses a slightly different definition of topological entropy in addition to that of separated sets. We mention that this result shows that any Morse-Smale diffeomorphism, or any map whose nonwandering set is a finite set of points. has zero entropy as stated in the following result.
Theorem 1.5. Let X be a compact metric space and f : X -+ X be a continuous map for which O(f) is a finite number of periodic orbits. (For example, f could be a Morse-Smale diffeomorphism.) Then the entropy of f is zero, h(f) = O. First. h(f) = h(JIO) by Theorem 1.4. Then by Theorem 1.3. h(fIO) is the maximum of the entropy on the individual periodic orbits. However. the entropy of a 0 single periodic orbit is zero. PROOF.
8.1 TOPOLOGICAL ENTROPY
337
REMARK 1.5. Let F,,(x) = #lX(1 - x), and #lie be taken in the range where F". has one attracting periodic orbit of period 21e and repelling periodic orbits of periods 2; for o ~ j < k. Since #lie < 4, F". preserves the interval I = [O,IJ. Theorem 1.5 implies that the entropy h(F".II) = O. In Theorem 1.6 below, we give a direct proof of this fact for 1 < I' < 3 without using Theorem 1.5 (and so without the proof of Theorem 1.4). We include the proof of this special case in addition to the proof of the general case given in Theorem 1.4 because its proof is more concrete. In the calculation of entropy it is necessary to calculate the number of (n, f)-separated wandering orbits which start near the repelling fixed point 0 and end up near the attracting fixed point PI" Because these orbits can be partitioned by the iterate when they leave a neighborhood of 0, the number of these orbits grows linearly in n. Because linear growth adds nothing to the entropy, h(F,,) = O. REMARK 1.6. In the proof of Theorem 1.4 we show that the wandering orbits contribute at most a factor which grows in a polynomial fashion in the length n of the orbits considered. A term which grows polynomially in the length n does not contribute to the entropy, so this proves the theorem. The reason that the growth is possibly polynomial rather than linear is that an orbit can make several transitions from a neighborhood of the nonwandering set. For example for the horseshoe on 8 2 , an orbit could start near the source at 00 and proceed near the horseshoe itself and finally leave and go near the fixed point sink. Because there are two times at which these orbits leave a neighborhood of the nonwandering set (the time it leaves a neighborhood of 00 and the time it leaves a neighborhood of the horseshoe), the number of such orbits can grow quadratically in n. In the general proof, we do not use any decomposition of the nonwandering set. (However see Theorems IX.1.3 and IX.4.4.) Because we proceed without any special knowledge of the decomposition of the nonwandering set, we must consider wandering orbits which make several transitions between a neighborhood of the nonwandering set. The proof shows that each of these transitions contributes a possible factor of n to the growth rate of r(n, f, f) and so the the total number of these (n, f)-separated grows at a polynomial rate. Such a growth rate of r(n, E, f) contributes nothing to h(J). REMARK 1.7. On the whole real line r(n,E,F,,) = 00 (because if x,y < 0 then the orbits F;(x) and Fi(y) diverge as j goes to infinity) so h(F,,) = 00. This illustrates the fact that the entropy is not a good measurement of the chaotic nature of a map on a noncompact set. In fact, let id : R -+ R be the identity map, id(x) = x for all x. The identity map is certainly not chaotic, but r(n, E, id) = 00 for all n and f > 0, so h(id) = 00.
As stated above, we give a direct proof that h(F,,) = 0 in the special case when F,. when it has only two fixed points and no other nonwandering points. This proof is more concrete and less complex than the general proof of Theorem 1.4, and we also obtain a better upper bound on the number of (n, f)-separated orbits. Proposition 1.6. LetF,.(x) = JlX(I-x) forI O.
< #l < 3. Then the entropy h(F,.i[O, 1]) =
PROOF. For 1 < #l < 3, F,. has a single repelling fixed point at 0, a single attracting fixed point PI' = 1 - 1/#l, and no other periodic points or nonwandering points. We write F for F", and P for PI" Let a = F(0.5) be the image of the critical point and J = [O,aJ. Notice that F([O,I]) = J. The reader can easily verify that h(FIIO,I]) = h(FIJ). (See Exercise 8.13.) We use FIJ because it has unique inverse images of points near O. Let H = FI} to simplify notation.
338
VIII. MEASUREMENT OF CHAOS IN HIGHER DIMENSIONS
°
The point p is attracting, so there exists eo > such that if x and yare two points within a distance eo of p then for all j 2: 0, (i) Hi(x) and Hi (y) stay within a distance eo of p and (ii) IHi(x) - Hi(y)1 ::; Ix - yl. (Take eo> such that IH'(z)1 < 1 for all points z E (p - eo,p + eo), and apply the Mean Value Theorem.) For eo > possibly smaller, if x and yare within eo of then for all j ::; 0, (i) Hi (x) and Hi (y) stay within a distance EO of 0 and (ii) IHi(x) - Hj(y)1 ::; Ix - YI. (Notice that this would not be true as stated for FI[O, 1].) The idea of the proof is that as n grows, the (n, EO)-separated orbits which make the transition from near 0 to near p can be counted by the iterate at which they start to make the transition. Therefore the number of these orbits grows at most like a multiple of n. Since the number of orbits which remain near either 0 or p is bounded, r(n, EO, H) grows linearly in n and h(Eo,H) = O. We proceed to make the above idea precise. Now fix a 0 < E ::; EO. Let No = H«p - E/2,p + E/2) be the open interval about the attracting fixed point p, and let Nr = H-1 ([0, f /2)) be the open (in J) interval about the repelling fixed point O. Let U = N r U No and ue = J \ U. Then V = [0, E/2] \ N r is a fundamental domain of the unst.able manifold of 0. Next, there is a positive integer no such that if Hi(x) E UC for 0 ::; j < n, then n ::; no. In particular, if x E V then Hj(x) E No for j 2: no, and U7!;;i Hi(V) :J UC. Because U C is compact and all points are wandering, there is a {3 > 0 with 2{3 ::; E such that Hj([y - {3, y + {3]) n [y - {3, y + {3] = 0 for all y E U e and all j 2: l. Let E dense ({3, V) C V be a set such that
°
°
°
Ro-l
U H i (Eden •• ({3, V))
E dense ({3, UC) =
;=0
(i) is {3-dense in U C and (ii) for any x E U e, there is ayE E dens .({3,U e) such that dno.H(X,y) < {3. Thus for these x and y, IGi(x) - Gi(y)1 < {3 as long as Gi(x) E ue. Finally, let G= (O,p}UEden •• ({3,UC). To estimate r(n, E, J), we define a map
'Pn : J
--+
Gn
as follows. Case (i): If x E No, then Hi(x) E No for all i 2: 0, so we let 'Pn(x) = (p, ... ,pl· Case(ii): If x E N r , then there is a j ::; n such that Hi(X) E N r for 0 ::; i < j and Hl(X) E V if j < n. If j < n, let Yj be a choice of a point in Eden.e({3, V) such that dno,H(Hi (x), Yi) < {3. Let 'Pn(x) = (Yo, ... , Yn-d where
Yi =
{
o Hi-j(Yj) p
forO::;i<j for j ::; i < min{n,j + no} for min{n,j + no} ::; i < n.
Case(iii): If x E UC, then let Yo be a choice of a point in Edenoe({3, ue) such that dno.H(x, YO) < {3. Let 'Pn(x) = (Yo, ... , Yn-d where
Yi = {Hi(yO) p
for 0 ::; i ~ no for no ::; t < n.
We leave to the reader to check the following claim.
S.l TOPOLOGICAL ENTROPY
339
Claim 1. If x E J and rpn(x) = (Yo, ... , Yn-l), then IHi(x) - Yil < f/2 for 0 $ i < n. Take an n with n > no. Let E(n,f) be a set with the maximal number of (n,f)separated points in J, #(E(n, f» = r(n, f, J, H). Claim 2. The map 'PnIE(n, t) is one to one. PROOF.
Assume rpn(x)
= rpn(z) = (yo, ... , Yn-d
for x, Z
for 0 $ i < n. Since E(n, E) is (n, f)-separated, x =
E
E(n, E). Then
o
Z.
Claim 3. The cardinality of 'Pn(E(n, E» is Jess than (n + 1) #(Eden •• U1, UC». PROOF.
An n-tuple (yo, ... , Yn-d
Yi =
{
E
rpn(E(n, f» has the form
o
forO$i<j for j $ i < j + k forj+k$i
Yj p
where k = min{n - j,no} if j ~ I, and k = min{n - j,no} = no or 0 if j = O. To count these n-tuples, we divide them up by the number k. If k = min{ n, j + no} > 0, then there are at most #(Eden •• {{3, U C» choices for Yj. As j varies, 0 $ j < n, we at most n #(Ed.n •• (f3, UC» such n-tuples. The only other cases are when j = n or j + k = O. These contribute two more n-tuples, (0, ... ,0) and (p, ... ,p). Thus
»+ 2
#(rpn(E(n, E))) $ n #(Eden.e{{3, U C
»
$ (n + 1)#(Eden.e(l1, U C
o
as claimed. To calculate the entropy, r(n,f,J,H) = #(E(n,E» = #(rpn(E(n, E)))
$ (n
+ 1)#(Eden.e(f3, U C ».
This quantity grows linearly with n so does not contribute to the entropy: log(n + 1) + log(#(Eden •• {/3, UC))) · h( f, H) < _ IImsup R-OO n = O. Therefore, h(t, H) = 0 and h(F) = h(H) = 0 as claimed in the theorem.
o
The next two theorems show that the topological entropy of two maps which are topologically conjugate are equal. In fact, the second result is for maps which are only semi-conjugate by a map which is uniformly finite to one.
340
VIII.
MEASUREMENT OF CHAOS IN HIGHER DIMENSIONS
Theorem 1.1. Let X and Y be metric spaces with metrics d and d' respectively. Let F: X -- X and f : Y -- Y be semi-conjugate by k: X -- Y. (a) Assume X and Yare compact and the semi-conjugacy k is onto. Then h(F) ~ h(f). (b) If k is one to one (but not necessarily onto), then h(F) ~ h(f).
°
°
PROOF. By uniform continuity of k, given t > there is a 6 > such that d(XI' X2) ~ 6 whenever d'(k(xd,k(xd) ~ t. Let E(n,t,f) c Y be a maximal (n,t)-separated set for f. i.e., one with #(E(n, t, f) = r(n, t. f). Form the set E(n, 6, F) c X by taking one x E k-I(y) for each y E E(n,t,f). Thus #(E(n,6,F» = #(E(n,t,f). Then E(n, 6, F) is a (n,6)-separated set for F by the property of uniform continuity of k mentioned above. Therefore
r(n,6,F)
~
#(E(n,6,F»
= #(E(n,t, f) = r(n.t.f).
From this it follows that h(6, F) ~ h(t. f) and h(F) (a). We leave part (b) to the reader.
~
h(f) as desired. This proves part 0
REMARK 1.8. If two maps f and 9 are conjugate on invariant compact subsets (or conjugate by a uniformly continuous homeomorphism), then their topological entropies on these subsets are equal. Thus the topological entropy of the shift map (12 on 1:2 is equal to that of FI' on AI' for #J > 4. We show below that h«(12) = log(2) 50 we can deduce the value of the entropy for the quadratic map on AI' for #J > 4.
The next result gives a criterion for entropies of F and f to be equal when F is semiconjugate to f by a map k. We say that k is uniformly finite to one provided k- I (y) has a finite number of points for each y and there is a bound C on the number of elements in k -I (y) which is independent of y. The theorem says that if k is a uniformly finite to one semi-conjugacy from F to f, each of which are defined on compact sets, then the entropies of F and f are equal. This result can be used to calculate the entropy of F4 • This theorem is due to Bowen (1971). See de Melo and Van Strien (1993). Theorem 1.8. Assume F : X -- X and f : Y -+ Y are continuous maps where X and Y are compact metric spaces with metries d and d' respectively. Assume k : X -+ Y is a semi-conjugacy from F to f that is onto and uniformly finite to one. Then h(F) = h(f). We delay the proof to the next subsection, and at this time apply it calculate the entropy of F 4 i!0. 11. Example 1.1. There is a semi-conjugacy k from the doubling map D(y) = 2y mod 1 to F 4i!O.lj which is two to one. (This is shown in Example II.6.2.) The entropy of Dis log(2) by Proposition 1.1 so by the above theorem the entropy of F4i!O,lj is also log(2). We end the section by determining the entropy of a subshift of finite type. The first part of the theorem express the entropy of any subshift (and not just a subshift of finite type) in terms of the growth rate of the number of words of length n as n goes to infinity. Theorem 1.9. (a) Let (1 : 1:N -+ 1:N be the full shift on N symbols (either one or two sided). Assume X C 1: N is a closed invariant subset, so (1IX is a subshift. Let Wn be the number of words of length n in X, i.e., Wn
=
#{(so,.·. sn-d :
Sj
=
Xj
for
°
~ j
< n for some x EX}.
Then h«(1IX) = lim sup log(wn ). ft-OO n
8.1 TOPOLOGICAL ENTROPY
341
(b) Let A be a transition matrix on N symbols, so A is N x N. Let 0' A : EA -+ EA be the associated subshift of finite type (either one or two sided). Then h(O'A) = log(~l) where ~I is the rcal eigenvalue of A such that ~I ~ I~il for all the other eigenvalues ~j of A. (a) We need to consider the number of (n,f)-separated points for various (. First, take ( = 2- 1 . Two points s, t E X are within 2- 1 if and only if So = to. For the first n - 1 iterates, O'i(s) is within 2- 1 of 0'1(t) for 0 :5 j < n if and only if S1 = tj for 0:5 j < n. There are Wn choices of blocks (so, ... , sn-.) (by the definition of w n ), so r(n, 2- 1, O'IX) = W n . Thus PROOF.
h(2- 1,O'IX) = limsup log(wn ).
n
n-oo
Next, we need to consider other values of f. Since h(e, O'IX) is monotonically increasing as E decreases, it is enough to calculate the value for E = 2- 13-11:. By Exercises 2.12, d(s, t) :5 2- 13-11: if and only if 5i = tj for 0 :5 j :5 k. Thus d(O'i(S),O'i(t» > 2- 13-11:
for some 0 :5 i < n if and only if Sj
'"
tj for some 0
:5 j < n + k. Therefore
and h(2-13-k,O'IX) = lim sup log(r(n,2- 13- k ,O'IX» R-OO n . log(r(n+k,2-1,O'IX» I
= 1m sup --=;.:....:~-'---'--=--!.:;. n
R-OO
Ii
~_s!P -n-
= =
(n + k) log(r(n + k, 2- 1, O'IX» n+k
h(2- 1 , O'IX).
Since we have shown that h(2- 13- k ,O'IX) = h(2- 1, O'IX) for any positive k and h(f,O'IX) is monotone in e, h(O'IX) = h(2- 1 ,O'IX)
log(wn) = Iimsup---. n-oo
n
This proves part (a). (b) We prove the case for A irreducible using the Perron-Frobenius Theorem and leave to the exercises the proof ofthe general case. See Exercise 8.14. (The general case uses Proposition III.2.10.) We first take the case where A is eventually positive, Aj is positive for j ~ m. Later, we discuss the proof of the general irreducible case. By Lemma 111.2.2, for a subshift of finite type with transition matrix A, Wn is the sum of all the entries in An-1 which we denote by #(An-1), Wn
=
L l~i~N,ISi~N
= #(A n- 1 ).
(An-1li,i
342
VIII. MEASUREMENT OF CHAOS IN HIGHER DIMENSIONS
Therefore to calculate the entropy, we need to estimate #(An-I). Letting e be the column vector with all entries being one, e = (1, ... ,1)tr,
so
Applying the case of the Perron-Frobenius Theorem given in Theorem IV.9.lD, (i) there is a positive real eigenvalue 'xl and corresponding eigenvector vi with all positive entries such that 'xl > I'xjl for all the other eigenvalues'xj of A, (ii) An-le/IAn-lel converges to Vi Ilvll as n goes to infinity, and (iii) there are positive constants C I , C2 such that CI,X~-I ::; (An-Ie), ::; C2,X~-1 for 1 ::; i ::; Nand n > m. (Remember that Aj is positive for j ~ m.) Summing on i, we get the estimate
Because constant multiples do not affect the exponential growth rate, ') I' log(N)+log(Cd+(n-1)log(,Xd Iog ("I = 1m n-oo n = lim 10g(NCI,X~-I) n-oo n ::; lim sup log(w n ) n-oo n
. I1m < - R-oo
log(NC2,X~-I)
n
::; 10g(,Xd, so
. log(wn ) h(UA) = hmsup - - n_oo n = log('xd· This completes the proof of the theorem in the case when A is eventually positive. Finally, consider the case when A is merely irreducible. It is proved in Gantmacher (1959) that by means of a permutation A can be put into the following 'cyclic' form:
A= ( :
Au
o
o
o
o
where the blocks along the diagonal are square. By taking the k-th power (where k is the number of blocks), AA: = di"g(B It ... , BA:) where B j = A),HI'" An-I,nAn,1 ... Aj_I,j is eventually positive for each j. Thus a general irreducible matrix is a "combination" of a cyclic permutation of blocks of symbols and an eventually positive return map AA: on each of these blocks of symbols. All
S.l.l PROOF OF TWO THEOREMS ON TOPOLOGICAL ENTROPY
343
the B j have the same real eigenvalue A~, such that A~ > I~l,jl for the other eigenvalues of Bj. In fact Ale 2d/ k for 0 $ t < k are simple eigenvalues of A; in particular, Al 2: IAil for the other eigenvalues Aj of A. (See Gantmacher (1959) for details.) From this form, it follows that EA is the union of k subsets which are invariant under u~, and u~ on the j subset is UB,' By Theorem 1.3, h(u~) = maxi h(UBJ) = 10g(A~). Then the entropy of U A is as follows: h(u A) = =
1
k h(u~) Ie k1 10g(AI)
= 10g(Ad· This shows how the result for a general irreducible subshift follows from the general 0 Perron-Frobenius Theorem. REMARK 1.9. In the exercises, we ask the reader to use this last theorem to show that the entropy of the full shift on N symbols is 10g(N). See Exercise 8.4. We also use it to calculate the entropy of several subshifts of finite type.
8.1.1 Proof of Two Theorems on Topological Entropy This subsection contains the proofs of Theorems 1.4 and 1.8. We apply Theorem 1.8 to toral automorphisms using their Markov partitions in the next subsection. In Chapter IX we apply it to general hyperbolic invariant sets. The main use of Theorem 1.4 is for Morse-Smale systems which we also discuss in the next subsection. The proofs of Theorems 1.4 and 1.8 use another method of counting orbits in addition to (n, e)-separated sets called (n, e)-spanning sets. We give the definition in a slightly more general situation where we allow the initial points to be restricted to a subset K that is not necessarily invariant. The following definition makes these ideas precise.
f : X - t X be a continuous map on the space X with metric d. Let X be a subset. For a positive integer q, let
Definition. Let K
c
dq,f(W, z) = sup d(fi{w),li{z)) o:Sj<9
as we defined earlier. A set S c K is said to (n, f)-span K for n a positive integer and f > 0 provided for each x E K there exists ayE S such that dn.f{x,y) $ E. Then the number r.pan{n, E, K, f) is defined to be the smallest number of elements in any set S which {n, f)-span K, and · log(r.pan(n, e, K, f)) h.pan (K E, , I) = IImsup . n-oo
n
It is easily checked that h.pan(E, K, f) is monotonically decreasing in E (O < EI < e2 implies that h.pan{fl, K, f) 2: h.pan{f2, K, f)) so the limit as f goes to zero of h.pan{f, K, f) exists, h.pan(K, f) = lim h.pan(E, K, f), (-0, (>0
and for any f > 0, h.pan(f, K, f) $ h.pan{K, f). For any integer n, f > 0, and subset K::> X, we let E.pan{n,f,K) to be a minimal {n, f)-spanning set, so #(E.pan{m, f, S)) = r.pan{m, E, S, f).
344
VIII.
MEASUREMENT OF CHAOS IN HIGHER DIMENSIONS
As in the last section, r •• p(n, 1', K, f) is the maximal number of elements in any set S c K which are (n,E)-separated. The definitions for h •• p(E,K,f) and h •• p(K,f) are similar. Again, we let E •• p(m,E,S) be a maximal (m,E)-separated set for K, so #(E•• p(m,E,S)) = r ••p(m,E,S,f). If K = X then we usually drop the specification of the set K in the notation. The following lemma shows that h •• p(K, f) = h.pon(K, f) so the limit is the same for separating and spanning sets (and either one can be used to define entropy). After proving the lemma, we denote either quantity by h(K, f). Lemma 1.10. Let K be a subset of X. For I' > 0 and n a positive integer,
and
h ••p(2E, K, f) ::; h.pon(E, K, f) ::; h ••p(E, K, f). Therefore h •• p(K, f) = h.pon(K, f). If we further assume that the space X is compact, then h•• p(E, K, f) < 00. PROOF. Let E •• p ( n, 1', K) be a maximal (n, I' )-separated set for K and let x E K. There is some y E E •• p(n,E,K) such that dn,/(x,y) ::; 1', because otherwise E •• p(n,E,K) U {x} would be an (n,E)-separated set for K and Eup(n,E,K) would not be maximal. Therefore E •• p(n,E,K) (n,E)-spans K, and
r•• p(n,E,K,f) = #(Eup(n,E,K» 2: r.pon(n,E,K,f).
Let E •• p(n,2E,K) be a maximal (n,2E)-separated set for K, and E.pon(n,f',K) be a minimal (n,E)-spanning set for K. Using the fact that E.pon(n,E,K) spans, we are going to define a map T : E •• p(n, 21', K) -+ E.pon(n, 1', K). For x E E •• p(n, 21', K) there is a y = T(x) E E.pon(n, 1', K) with dn./(x,y) ::; I' because E.pon(n, 1', K) spans. If T(xiJ = T(X2) for XI,X2 E E •• p(n,2f',K), then dn,/(XI, X2)
::;
dn,/(XI, y)
+ dn./(y, X2)
Because E •• p(n, 21', K) is an (n, 2E)-separated set, one, and so
Xl
=
X2'
::;
21'.
This shows that T is one to
r •• p(n,2E,K,f) = #(E•• p(n,2E,K» ::; #(E.pon(n, 1', K» = r.pon(n,E,K,f).
By taking the growth rates as n goes to infinity, we get the inequalities h ••p(2E, K, f) ::; h.pon(E, K, f) ::; h ••p(E, K,J). By letting I' go to zero, h •• p(K, f) = h •• p(K, f). If X is compact, there is a finite number, N., such that N. is the maximal number of disjoint balls of radius E. It follows from this that the maximal number of elements of a (n, f)-separated set is bounded by N:. (There can not be two orbits with dn./(x, y) 2: I' and Ji(x) and fi(y) in the same f'-balls forO::; j < n.) Thereforer•• p(n,f,K,f)::; N:, and h •• p(E, K, f) ::; 10g(N,) < 00. 0 We next give the definition of topological entropy in terms of open covers. We do not use this definition, so it can be skipped. However the reader may note some similarity between this definition and a construction in the proof of Theorem 1.4.
8.1.1 PROOF OF TWO THEOREMS ON TOPOLOGICAL ENTROPY
345
Definition. A collection A is called an open cover 01 X provided (i) each A E A is an open subset of X and (ii) UtA E A} = X. A subcollection 8 c A is called a subcover provided U{ A E 8} = X. For an open cover A, let
{n
n-I
An =
n
n-I ri(A j
) :
j==O
Aj E A and
rj(A j ) =F 0}.
Let N(A) be the minimal cardinality of a subcover 8 cA. Denote the growth rate of the number of elements in (a minimal subcover of) An by h(A, I) -- I'Imsup log(N(An» . n-oo n Finally, the entropy of I is given by
h(f) = sup{h(A,f) : A is an open cover of X}. Note that this definition also growth rate of a number of objects determined by iterates of I. We do not prove this fact, but the above definition is equivalent to the definitiolUl in terms of (n,E)-separated or spanning sets. PROOF OF THEOREM 1.4. Because 0 C X, we always have that h(fIO) ~ h(f). What we need to prove is the reverse inequality. We use an (m, E)-spanning set of 0 to estimate the size of an (n,2E)-separated set on all of X. We fix an integer m ~ 1 and E > 0 for quite a while in the proof. Take the set E.pon(m,E,O) to be a minimal (m,E)-spanning set for 110. Let
U = {x EX: dm,/(x,y) < E for some y
E
Em(E,O)}.
Since the orbits in E.pon (m, f, 0) also span orbits of points near 0 in X, U is an open neighborhood of 0 in X. Since UC = X\ U is compact and all points in UC are wandering, there exists a uniform {3 with 0 < {3 ~ E such that the forward orbit of the ball of radius {3 about any y E UC, B(y,{3), never intersects itself, Ji(B(y,{3»nB(y,{3) = 0 for all j ~ 1. Now take a set E.pon(m, (3, UC) which is a minimal (m, {3)-spanning set for J with points starting in UC, so
Let G.pon(m) = E.pon(m,E,O)uE.pon(m,{3, UC). The set G.pon(m) is clearly an (m, E)spanning set for X, so #(G.pon(m» ~ r.pon(m,E,X,f). Let t be a positive integer. To estimate r .ep( n, 2E, X, f), we define a map
by IPt(x) = (Yo, ... , Yt-1l where (i) y. E E.pon(m, f, 0) and dm,/(rm(x), Y.) < f if rm(x) E U, and (ii) y. E E.pon(m,{3,UC) and dm,/(rm(x),y.) < {3 if rm(x) E UC, Because E.pon(m,E,O) is an (m,f)-spanning set for U and E.pon(m,{3,U") is an (m,{3)spanning set for UC, it is always possible to make these choices to define IPt.
346
VIII. MEASUREMENT OF CHAOS IN HIGHER DIMENSIONS
Claim 1. Assume (Yo, ... ,Yt-d = 'Pt(x) for some x E X. E,pon (m, {3, U C ) can not be repeated in this i-tuple.
Then a point Y. E
PROOF. This claim follows because the balls B(y.. {3) are wandering for and choice of y. E E.pon(m,{3,UC). 0
Now we take n > mr,pon(m,/3,Uc,j) be an integer. Let E •• ,,(n,2E, X) be a maximal (n, 2t)-separated set. For such an n, let i be the positive integer with (i-l)m < n ~ im. The following claim shows that 'Pt is one to one on E,e,,(n, 2£, X), so we can estimate r ••,,(n, 2£, X, f) by means of #('PI(Eoe,,(n, 2£, X»). Claim 2, The map 'PI is one to one on E. e,,(n,2£,X). PROOF.
Assume that 'Pt(x) = 'Pt(z) = (Yo, ... ,Yt-d for x,z E E. e,,(n,2£,X). For and 0 '5 s < i,
o '5 t < m
+ dm,J(Y"
d(f'm+t(x), j"m+t(z» ~ dm,JU·m(x), Y.)
j"m(z»
< £ + £ = 2£. The integer l is chosen so that tm ~ n, so we get that dn,J(x, z) < 2£, Since the set E.e,,(n, 2E, X) is (n, 2E)-separated, x = z. 0 Claim 3. Let q=r.pon(m,{3,Uc,f) andp=r.pon(m,E,n,f). Then
PROOF.
Let I) be the subset of i-tuples in 'Pt(E.e,,(n, 2£, X» such that there are exactly
j of the Y. that are in E.pon(m,/3,UC). Because the y. E E.pon(m,{3,U C) can not be repeated in 'P1(X), we must have j '5 q. (Notice that this bound is independent of n
or
t. Also, n > mq so t > q.) For Ii' there are
(J)
ways of picking these j points
Y. E E.pan(m,{3,UC); there are t· (t- 1)··· (i- j
+ 1) =
i'
- - '.(i-])I
ways of arranging these choices among the positions in the ordered i-tuples; finally, there are at most r.pan(m,E,O,f)t- i = pt- i ~ l ways of picking the remaining y. from E.pan(m,E, 0). Thus
#(Ii)~
i! (q) (i-j)!P, t
j
and q
#('Pt(Eoe,,(n, 2E, X))) =
L #(Ii ) i=O
'5
~(q) L.,
,-0
j
i! t (i _ j)! P .
8.1.1 PROOF OF TWO THEOREMS ON TOPOLOGICAL ENTROPY
To estimate this summation, note that
(~)
347
$ ql and
~
- - = t· (t- 1)··· (t- J'
(t- j)!
,
+ 1) <- tJ <- f!l.
Thus q
#(IPt(E.ep (n,2E,X))) $ Lq!f!lpt j=O
$ (q + I)!f!ll
o
as claimed.
REMARK 1.10. Notice how crudely we made the count of possible i-tuples in the set q
IPt(E,ep(n, 2E, X». The estimate L
q!f!l $ (q
+ I)!f!l gives a
bound on the number of
j=O
wandering orbits. Because this quantity only grows polynomially in i, it does not contribute to the entropy. A better bound is possible by paying attention to the dynamics of points which wander but it would still have polynomial growth contributed by the wandering orbits. We could also form the open cover of sets V made up of the open sets
{x EX; dm,/(X,y.) < E} for y.
E
E.pan(m,E,n) and
{x EX; dm,/(x,y,) < t1} for y. E E.pan(m,t1,UC). Let V t be the refinement as in the definition of the entropy by open sets. Then there is a very close connection between V t and IPt(E.ep(n,2E, X». Since the growth rate of the number of open sets in V t as t goes to infinity give the entropy for this open cover, it is not surprising that it can be used to get It. bound on the growth rate of (n, 2E)-separated orbits. See Alseda, Llibre, and Misiurewicz (1993) for a proof of Theorem 1.4 using the definition in terms of open covers. By Claims 2 and 3, r.ep(n, 2E, X, f) = #(IPt(E.ep(n,2E, X)))
$ (q+l)!f!ll, where q
= r.pan(m,t1,Uc,f) and p = r.pan(m,f,n,f).
Then
h. ep (2E,X,f) = limsup.!. log(r. ep (n, 2f,X, f) R-OO n . log«q + I)!) + q log(f) I $ l~~P (i _ I)m
$ log(p) m = log(r'fH!n(m, E, n, f). m
+ f log(p)
348
VIII. MEASUREMENT OF CHAOS IN HIGHER DIMENSIONS
Next, we let m vary and go to infinity, and we obtain the bound h ..,(2f, X,!) :5 h."..n(f, n,!)
:5 hen,!).
o
Finally, letting f go to zero, we get h(X,!) :5 h(n,!) as desired.
PROOF OF THEOREM 1.8. Because k is onto, h(F) ~ h(f). Thus what we need to show is that h(F) :5 h(f). The obvious attempts at proofs do not work. Using the uniform continuity of k, it can be shown that given f > 0 there is a 6 > 0 such that if E •• ,(n,6,!) c Y is (n,6)-separated for I then k- 1 (E..,(n,6,/» is (n,f)-separated for F. However k- 1 (E•• ,(n,6,/» is not necessarily the maximal (n,f)-separated set for F, so this fact does not give an upper bound for r ••,(n,f,F) in terms of r •• ,(n,6,!). On the other hand, the inverse image of a spanning set for I is not necessarily a spanning set for F. The proof below shows that given f > 0, there is a 13 > 0 such that there is an upper bound on the maximal size of a (n,2f)-separating set for F, r ..,(n, 2f, F), in terms of the minimal size of a (n,13)-spanning set for I, r."..n(n,13,!). Fix an f > O. Let ~ 1 be a bound on the number of points in k-1(y),
e
for all y E Y. Let m ~ 1 be an integer. With some work we show that the following bound for the size of an (n, 2f)-separating set for F is true: h••p (2f, F) :5 h."..n (13,!)
1
1
+ -m loge e) :5 h(f) + -m loge e),
for correctly chosen 13 > O. Since m is an arbitrarily large integer, this implies that h •• p (2f, F) :5 h(f) and h(F) :5 h(f). To start this proof, for y E Y, let Uy
= Uy •m,< = {w EX:
dm.F(W,Z)
< f for some Z E k-l(y)}.
By the continuity of F, Uy is an open neighborhood of k-1(y). By the continuity of k and compactness of X, there is an open neighborhood Wy C Y of y such that k-1(Wy ) C Uy . The set Y is compact, so there exists a finite cover {Wy " ••. , Wyp }. Let {3 > 0 be the Lebesgue number of this finite cover, i.e., if y E Y there is some WYi such that the closed 13 ball about y is contained in Wy;, cl(B(y, 13» C WYi" Note that given a small f > 0, we have determined a small 13 > O. This is the 13 that works for the (n,13)-spanning set for I mentioned above. We want to show that 1 1 -log(r••p(n, 2f, F» :5 -log(r."..n(n,13,/» n n
1
1
+ -log(e) + -log(e). m n
In the proof below, we need to break up an orbit of F into segments of length m. To this end, for an integer n let t be the integer such that
(t - l)m < n :5 tm. Let E •• ,(n,2f,F) C X be a maximal (n,2f)-separating set for F with
#(E•• p (n,2E,F» = r •• ,(n,2E,F),
8.1.1 PROOF OF TWO THEOREMS ON TOPOLOGICAL ENTROPY
and E.pon(n, {J, j)
c
349
Y be a minimal (n, {J)-spanning set for f with
For y E Eopon(n,{J, f), let q(j,y) E {y" ... y,,} be chosen so that
c1(B(fi(y), {J»
c
Wq{j,y)'
To get an estimate on roe,,(n, 2£, F) in terms of r.pon(n, {J, F), we define the map
'PI : E •• ,,(n, 2£, F)
--+
E.pon(n, {J, f)
X
X,
by 'P1(X) = (Yi"o, ... ,xl-d where (i) d~,/(y,k(x» ~ {J and y E E.pon(n,{J,j), and
(ii) x. E k-I(q(sm,y» satisfies dm.dF·m(x), x.)
k
0
Fom(x)
< £ for 0 ~ s < t. Because
= I'm 0 k(x) E c1(B(f'm(y), {J» c
Wq(.m,y),
F·m(x) E k-I(Wq(.m.y» C Uq(.m,y),m .• and it is always possible to choose the x •.
Claim. The map 'PI is one to one on E ••,,( n, 2£, F). PROOF. If 'P1(W) = 'Pt(z) = (Yi "0, ... , xl-d, then
d(Fom+l(w), F·m+l(z» ~ d m,F(F6m(W), ~ £ + l = 2£.
x.»
+ dm,F(X., Fom(z»
for 0 ~ t < m, and 0 ~ B ~ t. Since ml ~ n, we get that dn,F(Z, w) E •• ,,(n, 2l, F) is (n, 2l)-separated, we get that w = z.
~
2£. Since 0
Now we use the claim to finish the proof of the theorem. For y E E.pon (n, {J, f) fixed,
t-I #('PI(E. e,,(n,2l,F»n({y} x xt» ~
11 #(k-I(q(sm,y»
0=0
Because there are r.pon(n, {J, f) choices of y E E.pon(n,{J, f),
r"l'(n, 2£, F)
= #('PI(E••,,(n, 2£, F))) ~ r.pon(n,{J,f)C' .
Therefore, 1 l I t -log(r••,,(n, 2l, F» ~ -log(r.pon(n, {J, f) + - 10g(C ) n n n 1 tm = -log(r.pon(n,{J,f» + -log(C) n nm 1 n+m ~ -log(r.pon(n,{J,f) + --log(C) n nm 1 1 1 ~ - log(r.pon(n, {J, f) + -log(C) + -log(C). n m n
350
VIII.
MEASUREMENT OF CHAOS IN HIGHER DIMENSIONS
The second to the last inequality is obtained using the definition of i. Taking the limsup as n goes to 00, we get h.
1
+ -log(C) ::; m
h(f)
1
+ -log(C). m
Since this last inequality is true for all m ~ 1, we get that h..p(2~, F) $ h(f). Letting E > 0 go to zero we get the desired Inequality, h(F) $ h(f). (Notice that the bound on the number of pointli in the invllTae im~1l of" point contributllcl \og(C)/m to the bound on the entropy, which is nonzero if C > 1. It is only as m goes to infinity that this bound can be neglected.) 0 There is a generalization of Theorem 1.8 that does not assume k is uniformly finite to one. In this case,
REMARK 1.11.
h(f) ::; h(F) $ h(f)
+ sup h(k-I(y), F). yEY
Notice with the assumptions of Theorem 1.8, k-I(y) is finite so that the second term is zero, SUPyEY h(k-I(y), F) = O. Therefore the specific result of Theorem 1.8 follows from this more general result. The proof of the more general result is basically the same as that given above but the number m varies with the point y E Y. This form was also proved by Bowen (1971). See de Melo and Van Strien (1993) for a proof.
8.1.2 Entropy of Higher Dimensional Examples In this subsection, we apply the theorems about entropy to some of the higher dimensional examples discussed in Chapter VII. We start with Morse-Smale diffeomorphisms. Because such a diffeomorphism has only finitely many points in its chain recurrent set, the dynamics are simple and it has zero topological entropy.
Theorem 1.11. Let M be a compact manifold, and f a CI Morse-Smale diffeomorphism. Then the topological entropy of f is zero, h(f) = O. PROOF.
set of
Theorem 1.4 shows that h(f)
= h(fIO) where 0 = O(f) is the nonwandering
f. Since f is Morse-Smale, O(f) is a finite set of points. Therefore,
h(f) =
h(fIO) = O.
0
As a second example, we consider an Anosov diffeomorphism and connect the entropy to the entropy of the subshift of finite type induced by a Markov partition.
Theorem 1.12. Let fA be a hyperbolic toral automorphism on ']['2. Assume'R. {Rj}j~1 is a Markov partition with transition matrix B. Let Al be the eigenvalue the transition matrix B with largest absolute value. (This is also the absolute value the largest eigenvalue of the original matrix A.) Then the topological entropy of fA
= of of is
10g(Ad· PROOF. By Theorem VII.5.3, there is a finite to one semiconjugacy h : EB -+ ']['2. Since h is a finite to one semi-conjugacy, the entropies of f and (fB are equal by Theorem 1.8. The topological entropy of a two sided shift is the logarithm of the largest eigenvalue 0 by Theorem 1.9. Combining these facts gives the result. REMARK 1.10. Bowen (1971) has also related the topological entropy of a hyperbolic toral automorphism on T" to a summation involving all the expandinp; eigenvalUes:
h(fA) =
L
10g(lAil)
1.\;1>1
where Ai are the eigenvalues of A. Also see Bowen (1978a).
8.2 LIAPUNOV EXPONENTS
351
Example 1.2. As a last example consider a transverse homoc1inic point for a diffeomorphism f. By Theorem VII.4.5(a), has an invariant set A which is conjugate to 172 on the full two shift. The orbit of A by f is just n sets which are homeomorphic to A and for x E A, fj (x) returns to A every n iterates. The entropy of riA is log(2) and the entropy of f on the or bit of A is (1/ n) log (n). On the other hand if we consider directly invariant set A' for f which Theorem VII.4.5(b) shows is conjugate to a subshift of finite type for the matrix B given in the proof. As noted in Remark VII.4.6, the characteristic polynomial of B is p(A) = An - An-I - 1. Since p(21/n) = 2 - 2(n-I)/n - 1 < 0, the largest eigenvalue of B is larger than 21/ n , i.e., the entropy of the subshift of finite type 1781E8 and so flA' is greater than the entropy of fIO(A).
r
REMARK 1.11. Assume f : M -+ M is a C l diffeomorphism on a compact manifold M and f has a hyperbolic chain recurrent set (or hyperbolic nonwandering set). Let Nn(f) = #(Fix(fk» be the number of points whose least period divides n. Bowen (1970b) proved that 10g(Nn(f» · h(f) = IImsup , R-OO n i.e., the entropy is equal to the growth rate of the number periodic points. Also see Bowen (197811.). Bowen (197811.) also introduces the connection between the entropy and the induced maps on the homology groups. See Yomdin (1987) for a proof of the conjecture.
8.2 Liapunov Exponents Most of this book concerns examples of systems with uniformly hyperbolic invariant sets. In this section we define Liapunov exponents for diffeomorphisms and flows in all dimensions, extending the treatment in Section 3.6. We give only a brief introduction to the ideas. For more details see Ruelle (198911.), Mane (198711.), Katok (1980), and Walters (1982). Just as in one dimension, the exponents exist almost everywhere in terms of an invariant measure. If these exponents are nonzero almost everywhere on an invariant set, then it has a nonuni/ormly hyperbolic structure (which may in fact be a uniformly hyperbolic structure in some cases). There are examples which have a nonuniformly hyperbolic structure; in fact, the Benon map for A < 2 and IBI small can not have a uniform hyperbolic structure by the theorem of Plykin, but Benedicks and Carleson (1991) proved that it does have nonzero Liapunov exponents. Therefore the Benon map for these parameter values is a.n example with a nonuniformly hyperbolic structure. For A = 1.4 and B = -0.3, numerical simulation indicates that it has an invariant set with a nonuniformly hyperbolic structure but this is unproven. With this introduction we turn to the definitions for a diffeomorphism. Those for flows are similar, but we leave the small differences to the reader. Definition. Let f : M -+ M be a. diffeomorphism on a manifold of dimension m. Let I . I be the norm on tangent vectors induced by a Riemannian metric (inner product on tangent vectors) on M. For each x E M and v E TxM let A(x, v) = lim -k110g(ID/;vl) k_oo
wh·",·,·ef this limit exists. Note that
lim sup -k1 10g(IDf;vJ) k-oo
354
VIII. MEASUREMENT OF CHAOS IN HIGHER DIMENSIONS
Remember that for a single linear map L on am. most vectors v have some component in the direction of the eigenvector corresponding to the largest eigenvalue J.Lm. so
In the present situation. most tangent vectors v at x lie in Vxm Am(X)
\
V:,-I. so
= k-oo lim -k1 10g(IDf:~vl).
Thus we can calculate Am(X) by taking an arbitrary tangent vector v E TxM (and assume that it does not lie in Vxm- I ) and determine Am{X) by means of A(X. v) (if the limit converges). We can not easily find a vector v E Vxm- I to calculate Am_l(X). However. for most pairs of vectors v m • v m - 1 E TxM. the plane spanned by v m and v m - 1 is complementary to Vxm - 2 • Therefore. there is a vector w m- 1 E Vxm - 1 such that w m- I = v m- I + QV m for some scalar Q. Under application of D I:'. the area of the parallelogram determined by DI:'v m and DI!v m- 1 equals the area of the parallelogram determined by DI!v m and D l!w m- 1 • and so grows at a rate like ek(>'m(xl+~m-dx)). (This requires some justification and consideration of angles between V,T(x) and Vpc;.\.) Let Iv /\ wi be the area of the parallelogram determined by v and w. Then. for most v m, v m- I E TxM,
This gives a means of calculating Am-I (x) once we know Am (x). Taking j-tuples of tangent vectors vm •...• v m- HI E TxM, most choices span a subspace which is complementary to VJ. so
where Iv m/\ .. ·/\vm-j+ll is the j-dimensional volume of the j-dimensional parallelepiped determined by v m•...• v m- HI. Thus it is possible to determine all the Liapunov exponents by induction. In interpreting the Liapunov exponents, Aj(X) is like the logarithm of the eigenvalue of a fixed point of a map. Therefore Aj(X) > 0 corresponds to an expanding direction, Aj (x) < 0 corresponds to a contracting direction, and Aj (x) = 0 corresponds to a neutral direction (at least as far as exponential growth rates are concerned). For the rest of the section, we assume that Aj(X) 1: 0 for all j and almost all points x E An B, where A is an invariant set for f and B, is defined by Theorem 2.1. This assumption is analogous to a hyperbolic structure on A, and is what we called above a nonuniform hyperbolic structure on A. Let r(x) be the integer function such that Aj(X) < 0 for 1 $ j $ r(x) and Aj(X) > 0 for r(x) < j $ s{x). Let 1E~ = V,;(x). This is the stable subspace at x. The unstable subspace at x can be determined using I-I, 1E~. The splitting of the tangent space by means of the stable and unstable subspaces, TxM = E~ eE~ for x E AnB" is measurable in x but not necessarily continuous. This type of structure is called a nonunifonnly hyperbolic structure on A. Pesin {1976} proved that a ll(Jlllillear map f on a nonuniformIy hyperbolic invariant set has local stable and unstable manifolds for points x E A which are invariant and tangent to 1E~ and E~ respectively. In this case, the diameter of the local manifolds varies
8.2 LIAPUNOV EXPONENTS
355
with the base point x, so the situation is much more complicated than the uniformly hyperbolic case. Also see Pugh and Shub (1989) for a proof more like the one given in this book for uniformly hyperbolic sets. Ruelle (1979) has another proof. Katok (1980) used these ideas to prove the following theorem which gives a connection between positive Liapunov exponents and uniformly hyperbolic sets which have positive topological entropy.
Theorem 2.2. Let f : M --+ M be a C 2 diffeomorphism on a compact manifold M. Assume that (i) f is ergodic for some invariant Borel probability measure J.'() EMU) where J.'() is not concentrated on a single periodic orbit (J.'() is a non-atomic measure), and (ii) the Liapunov exponents are nonzero J.'()-almost everywhere. Then f has a closed invariant uniformly hyperbolic subset, A, (a "horseshoe") such that (i) flA is topologically conjugate to a subshi£t of finite type and (ii) the topological entropy of f on A is positive, h(fIA) > O. REMARK 2.3. With the assumptions of this theorem, it follows that there must be at least one positive Liapunov exponent J.'()-alm06t everywhere. Thus, this theorem proves that a positive Liapunov exponent implies positive topological entropy and so chaos. This theorem also says that the condition on individual orbits about Liapunov exponents implies an aggregate condition on a hyperbolic invariant set. In principle, it is easier to show the existence of nonzero Liapunov exponents than the existence of a uniform hyperbolic structure. The theorem implies that there is not as much difference between these two conditions than one might think or fear. An earlier theorem of Pesin (1977) implies that if a diffeomorphism f (i) preserves a measure 1'0 EMU) which is equivalent to the volume for the lliemannian metric, and (ii) at least one of the Liapunov exponents is positive on a set of positive J.'() measure, then the topological entropy is positive. In fact, there is a lower bound on the topological entropy in terms of an integral of the sum of the positive Liapunov exponents. Let kj(x) be the multiplicity of Aj(x), kj(x) = dim(V1) - dim(Vl- 1 ), and ,(xl
L
X"(x) =
kj(x)Aj(x),
j=r(x)+l
then h(f)
~ 1M X"(X) dl'o.
This theorem requires the measure to be smooth, but does not assume that all the exponents are nonzero. In fact what Pesin proved was that the measure theoretic entropy with respect to J.'() is greater than or equal to the integral,
Earlier, Margulis (1969) proved the other inequality
so hl'o(f) =
1M X"(X) dl'o·
See Katok (1980) or MRne (1987a) for further discussion of this type of result.
356
VIII.
MEASUREMENT OF CHAOS IN HIGHER DIMENSIONS
Several recent papers have used a field of invariant cones to prove the existence of a positive Liapunov exponent and the ergodicity of the system. See Wojtkowski (1985), Burns and Gerber (1989), and Katok and Burns (1994) for references dealing with the ergodicity of geodesic flows. See Sinai (1970), Chernov and Sinai (1987), Katok and Streleyn (1986), and Liverani and Wojtkowski (1993) for references dealing with the ergodicity of billiards problems. (The last reference gives a good introduction and many other references.)
8.3 Sinai-Ruelle-Bowen Measure for an Attractor In this section, I : AI -+ M is a C 2 diffeomorphism, and A is a uniformly hyperbolic attractor. (Since we assume that all the points of an attractor are chain recurrent and have a hyperbolic structure, the periodic points are dense in A. See Theorem IX.4.l.) Our goal in this section is to explain why the forward orbit of most points in the basin of attraction of A tend to be dense in A: the computer picture generated by the forward orbit of most points x is a picture of the attractor. We denote the basin of attraction of A by W"(A). Let U be a compact neighborhood of A in W·(A). For x E U, a probability measure, II;, can be associated to the partial forward orbit of length n, {x,/(x), ... ,r-I(x)}: n-I
.!. L
II;:
=
R,
.!. L tp 0 Ii (x) converges to
6'i(xl
n i=O where 6y is the atomic measure at the point y. The Sinai-Ruelle-Bowen Theory says that there exist (i) a subset U' C U of full Lebesgue measure and (ii) a measure /.S with support on A such that for any x E U' the measures II; converge weakly to /.S, i.e., for n-I
any continuous function tp : U
J
tp(y) d/.S(y). This n i=O measure /t is called the Sinai-Ruelle-Bowen measure 01 the attmctor, or just the SRB measure 01 the attmctor. The fact that the average of the evaluation of the function tp along the forward orbit of x converges to the integral means that the forward orbit is uniformly spread out over A in terms of the measure /.S. The measure Il is characterized by the fact that (i) there is a positive Liapunov exponent Il-a.e. and (ii) Il has absolutely continuous conditional measures on the unstable manifolds for points of A relative to the Riemannian measure on the unstable manifolds. See Young (1993) for an introduction to this theory and other related measure theoretic results. Also see Sinai (1972), Bowen (1975b), Ruelle (1976), and Bowen and Ruelle (1975). We mentioned above that a uniformly hyperbolic attractor always has a SRB measure. The only situation where a SRB measure has been shown to exist for a nonunifonnJy hyperbolic invariant set is the Henon attractor for B very small and A just less than 2, Benedicks and Young (1993). This is the same situation where Benedicks and Carleson (1991) proved there is a transitive attractor. See Young (1993) for a discussion of this result. For A = 1.4 and B = -0.3, the Henon map appears to have a SRB measure (the orbits of most points appear to be uniformly dense in the whole attracting set), but this result has not been verified mathematically. The same remark about an apparent SRB measure that has not been verified mathematically applies to the Lorenz attractor for p = 28. -+
8.4 Fractal Dimension Another measurement of the complicated or chaotic nature of the dynamics is the dimension of the invariant set. In Section 7.6 we have defined the topological dimension of a set. This dimension is always an integer and is useful in some considerations
8.4 FRACTAL DIMENSION
357
but does not relate to the chaotic nature of the system. In this section, we discu88 some dimensions which are nonnegative real numbers and do not have to be integers. The definitions of these concepts go back to Hausdorff, but recent interest has been stimulated by Mandelbrot. Generally, these types of dimensions are called fractal dimensions. We mainly consider the box dimension (also known as capacity dimension), but define Hausdorff dimension at the end of the section. We give references to other sources which introduce some of the other types of dimension: information dimension, correlation dimension, and Liapunov dimension. Chapter 5 in Broer, Dumortier, van Strien, and Takens (1991) discusses the sense in which the box dimension is related to chaotic nature of the system. In doing this it also defines both the box dimension and the entropy for a time series of a system. Because the time series can either be generated by an explicit map (or differential equation) or by an experiment, this indicates how the ideas can be applied to experimental situations. Definition. III our discussion of box dimension, we only consider compact subsets A of some Euclidean space Rn. (These definitions also make sense in a metric space. Since a manifold M can be embedded in some Euclidean space Rn , our definitions apply to compact manifolds.) For t > 0, consider the subdivision of Rn into boxes or cubes of sides of length t: for (jl, ... ,in) E zn, let Rj" ... ,jn = {(x), ... x n ): i,t:5 x,
< (i, + l)t for 1:5 i:5 n}.
A box of this kind is said to be a box from the t-grid. Let N(t, A) be the number of boxes Rj among all the choices of j E zn such that An Rj '" 0. To motivate the definition of box dimension, we consider the number of boxes from the (-grid, N(t, A), that are needed to cover various objects. For a line segment, N{f, A) is roughly t-I times the length. For a rectangle in a plane, N{e, A) is roughly c 2 times the area. It is not as obvious, but for a curve, N(t, A) also is roughly e- I times the length, and for a piece ofsurface, N{t, A) is roughly t- 2 times the area. Next, consider a compact submanifold A of Rn of dimension d. (The dimension is here used in the sense of the number of variables in local coordinate charts.) For such a manifold, N{t, A)fd is roughly equal to the d-dimensionaI volume, or N{f, A) is proportional to Cd; more precisely, there are two constants C),C2 > 0 such that C I :5 N{f,A)f d :5 C2 or C l t- d :5 N{f,A):5 C 2 t- d • Taking logarithms of the inequality, we get that
so 10g{N{f, A» - log{C2 ) < d < log{N(t, A» - Jog{CJ) log{c l ) log{c l )
and
d = lim 10g(N{t, A)) . • -0 log(c l )
Notice that for real numbers 0 :5 p < d < q and 0 < t < I, P
=
q
= 0,
lim N{t, A)e .-0 lim N{t, A)t .-0
00
tP
> fd > t q so
and
Thus for a compact submanifold A of dimension d and 0 :5 p < d < q, the growth rate of the of the p dimensional volume of boxes which cover A is infinite, the growth rate
VIII. MEASUREMENT OF CHAOS IN HIGHER DIMENSIONS
358
of the of the q dimensional volume of boxes which cover A is zero, and the growth rate of the d dimensional volume of the boxes which cover A is a finite number. Thus the dimension d is characterized as that number p at which the lim._oN(e,A)e P changes from being infinite to being zero. If the limit exists for p = d, then this limit is the d-dimensional measure of A. Thus the dimensions can both be defined in terms of the growth rate of N(e, A) in terms of e- I , and as the number for which the d-dimensional measure makes sense. We use the first characterization in our definition of box dimension and the second in our definition of Hausdorff dimension. We now turn to organizing these ideas into precise definitions.
Definition. For a general compact subset A c Rn, we define the box dimension of A, dimb(A), by · (A) - I' . f log(N(e, A» d 1mb - 1m In Iog (_\) <-0 e This dimension is often called the capacity dimension of A or limit capacity of A. Because the word capacity has other meanings in certain discussions of complex dynamics, term box dimension is becoming the standard term for this measurement. If we use the limsup instead of the liminf, we get the upper box dimension of A which is denoted by dimB(A), d' (A) - l' log(N(e, A)) 1mB - lI~~~P log(c 1) REMARK
4.1. Clearly dimb(A)
~
dimB(A)
~
n where A eRn.
REMARK 4.2. The box dimension of a set A is the same as the box dimension of its closure. This is the reason we restrict to closed sets. We take the sets to be compact so there are at most a finite number of boxes from the e-grid which intersect A. REMARK 4.3.
We use an exercise to show that liminfN(e,A)e P = <-0
{OO 0
and limsupN(e,A)eP = { • -0
00
0
for 0 ~ p < dimb(A) for dimb(A) < p < 00 for 0 ~ p < dimB(A) for dimB(A) < p < 00 .
See Exercise 8.30. Instead of using a fixed set of boxes from the e-grid, it is possible to cover the set A by a finite set of closed cubes with length e on a side and sides parallel to the axes (without fixing the grid of hyperplanes which form the surfaces). Let N'(e, A) be the minimum number of such cubes of size e which cover A. The following result shows that N'(e, A) can be used to calculate the box dimension instead of N(e, A).
Proposition 4.1. Let A c Rn be a compact subset. The box dimension of A can be calculated by the following limit: . . . log(N'(e, A» dlmb(A) = hmlnf Iog (-1) <-0 f
where N'(e, A) is defined above as the minimum number of boxes without fixing the grid. PROOF. Clearly N'(e,A) ~ N(e,A) since it is possible to choose the cover from boxes from the e-grid. Also any cube of size e is contained in 2n boxes from the e-grid, so
8.4 FRACTAL DIMENSION
359
N{E, A) :5 2n N'{E, A), Taking logarithms and taking the limit, we get the following result:
.-0
' (A) - I' ' f log(N(t, A)) d 1mb - 1m In Iog (t I) ' ' f nlog(2)+log(N'(t,A)
A» o
It is also possible to cover the set A with a ball of diameter t in terms of the Euclidean metric or any metric equivalent to the Euclidean metric, A metric d is said to be equivalent to the Euclidean metric provided there are constants Cit C2 > 0 such that Cllx - yl :5 d{x,y) :5 C 2 1x - yl for all x,y ERn, Using a metric equivalent to the Euclidean metric, let N;'(A) be the minimum number of closed balls of diameter t which cover A, We defer to Exercise 8,31 to verify the following proposition, Proposition 4.2. Let A
c
Rn be a compact subset, The bwc dimension of A is given
by the following limit:
.-0
' (A) -I' ' f log(N"{E, A» d 1mb - 1m In Iog (E- I)
where N" (t, A) is defined above as the minimum number of closed balls of diameter t which cover A, It is also useful to calculate the box dimension using a sequence of diameters going to zero and not having to use all possible t, To that end we have the following result, Proposition 4.3. Let A
c Rn
be a compact sub6et, Assume 0 < r < 1. Then
' (A) I' ' f log{N'(rj , A» d 1mb = ImIn , )-00 J'1og (-I) r PROOF,
For any E > 0, there is a J such that r HI < t :5 r j , In particular, r- I r- j
> t - I 2: r
r- j
- I ,
log{r-I) + j log{r-l) > log(t- I ) 2: log(r) + (j + 1) log(r- I ),
and
[Iog(r-I) + j log(r- I WI < [log(E- 1WI :5 [Iog(r) + (j + 1) log(r- I WI, Since N'{t, A) 2: N'(D, A) whenever t < D, N'(r j , A) :5 N'(E, A)
:5 N'(ri+l, A),
360
VIII,
MEASUREMENT OF CHAOS IN HIGHER DIMENSIONS
Therefore,
. f 10g(N'(ri , A)) I' . f 10g(N'(ri , A» I·Imm. = Imm . i-oo Jlog(r- I ) i-oo 10g(r-I)+Jlog(r-l) · . f 10g(N'(t, A)) < IImm log(t- I ) = dimb(A) .. 10g(N'(rHI , A» < IImmf , - J-OO log(r) + (J + 1) log(r I) · . f 10g(N'(rHI , A» = I1m m ...,..:::.-'--:-'-:----:-'----:"'" i-oo (J+1)log(r- l ) = lim inf 10g(N'(ri , A)). J-OO J log(r- I )
- .-0
Bt'Cause the first and last entries are equal, they must equal the box dimension.
0
Example 4.1. Let C be the middle-a Cantor set in the line. Let /3 be chosen so that 2/3 + a = 1, so that 0 < /3 < 1/2. In the formation of the Cantor set, there are 2i intervals of length /3J which cover C, so N'(/Ji) = 2J. Therefore, · (C)-I' . f log (N'(/J1,C» d1mb - Imm J-OO J'Iog (/3-1)
I' . f log(2J) = 1m m J, Iog (/3-1 ) J-OO log(2) 10g(/3-I) .
First of all note that 0 < dimb(C) < 1. Thus these Cantor sets have non integral box dimension. Also the dimension depends on /3 and hence a. In fact any number between o and 1 can be realized as the box dimension by the proper choice of a. It is also possible to construct a Cantor set in the line (which is not a middle-a Cantor set) with positive Lebesgue measure so dimb(C) = 1. Example 4.2. For 0 < {3 < 1/2, a solenoid can be formed using the map
!(t, z) = (g(t),
~z + (3 e 21r1i )
where g(t) = 2t mod 1. This map! takes the neighborhood N = SI X D2 into itself. !"(N) be the attractor as in the usual construction of the solenoid. The Let A = set Sic = t'rN) is the finite intersection. Let D(t) = {t} x D2 be a single fiber as before. For each t, SIc n D(t) is the union of 2/c disks of diameter 2{3/c. Therefore
n,,>o
d · (AnD(t» =1' . f log (N'(2{3/c,AnD(t») 1mb 'k~: k 10g({3-I)
· . f k log(2), = IImIn /c-oo k 10g({3 I)
log(2) = log({3-I)"
8.4 FRACTAL DIMENSION
381
We do not give the details but . log(2) dm1b(A) = 1 + log(,B-l) with the other dimension coming from the expanding direction. Notice that by picking
,B close to 1/2, dimb(A n D(t» can be made almost equal to one and dimb(A) can be made almost equal to two. REMARK 4.4. If we define Cantor sets of the type of A n D(t) in the above example but with different rates of contraction in different directions, it is often impossible to calculate the box dimension exactly.
Hausdorff Dimension We mentioned above for the box dimension that for 0 ::; p < dimb(A) for dimb(A) < p < 00.
P {OO 0
liminfN(e.A)e = <-0
The definition of Hausdorff dimension uses a characterization like this one. Definition. If U is a nonempty subset of Rn, let
lUI =
sup{lx - yl : x,y E U}
denote the diameter of U. If A C Ui Ui where each Ui is a ball of diameter less than or equal to 6, then lUi} is called a 6-cover 0/ A. Let 0 ::; p. We want to define a p measure of a Borel set A. To do this first take 6 > 0 and define 00
1t:(A) = inf
L lUi I", i=-l
where the infimum is take over all countable 6-covers. Notice in this summation, all the diameters of the different sets do not have to be equal; this contrasts with the definition for box dimension. Next let 6 go to zero and define 1t"(A) = lim 1t~(A). 6-0
Because 1t~(A) increases as 6 decreases, the limit exists but can equal infinity. This quantity, 1t"(A), is called the Hausdorff p-dimensional measure of A. If A eRn, then 1tn(A) is the Lebesgue measure of A. The Hausdorff dimension of A is defined to be d if forO::;p
4.5. For a compact set A
c
Rn,
In many ways the Hausdorff dimension is defined in a manner more similar to Lebesgue measure than the box dimension. Cantor subsets ofR which are defined defined in terms
VIII.
362
MEASUREMENT OF CHAOS IN HIGHER DIMENSIONS
of a map in a manner like that for F,x(x) = AX(l.X) are called dynamically defined. For a dynamically defined Cantor set dimH(A) = dimb(A) = dimB(A) ~ n. See Palis and Takens (1993), Takens (1988), or Manning and McCluskey (1983). For a further discussion of these fractal dimensions see Edgar (1990) or Falconer (1990). For an introduction to dimension in terms of dynamical systems see Farmer, Ott, and Yorke (1983) or Palis and Takens (1993). For a connection with measures and entropy see Young (1982). REMARK 4.6. In Palis and Takens (1993), they define dynamically defined Cantor sets which include the middle-a Cantor sets, Co, and also ones A" determined by F,,(x) = Jlx(1 - x) for Jl > 4. They prove that if C c R is a dynamically defined Cantor set then the Hausdorff dimension equals the upper box dimension and so also the box dimension. In particular, dimH(Co ) = dimb(Co) = log(2)/log(2/(1 - a» for the middle-a Cantor set.
8.5 Exercises Topological Entropy 8.1. Let I,,(x) = JlX mod I, for Jl > 1. Calculate the entropy of I", h(f,,). 8.2. Let f be a diffeomorphism on the circle, SI. Prove that the topological entropy of f is zero, h(f) = O. 8.3. Let d > 1 be an integer. Let f : SI --+ SI have a covering map F : lR --+ R given by F(x) = dx. Prove that the entropy of I is log(d), h(f) = log(d). 8.4. Let aN : EN --+ EN be the full shift on N symbols. Prove that the entropy of aN is 10g(N), h(aN) = 10g(N). 8.5.
Let A =
(~ ~).
Find the entropy of the subshift of finite type with transition
matrix A. 8.6. Use Theorem 1.9(a) (but not Theorem 1.9(b» to calculate the entropy of the subshifts of finite type with the follow transition matrices A, h(aA)' Note: These examples illustrate again that (i) the growth rate of the number of the wandering orbits does not contribute to the entropy and (ii) that the entropy is the maximum of the entropy on disjoint invariant pieces of the nonwandering set.
(a)LetA=On· (b) Let A =
(~
i D·
1 11 1)
~ ~:1 8.7. Let A be an N by N transition matrix and EA C EN be the corresponding subshift of finite type. Prove that the entropy h(aA) ~ 10g(N). 8.8. Let I : M --+ M and 9 : N --+ N be two maps on compact metric spaces. Consider the map f x 9 : M x N --+ M x N defined by f x g(x,y) = (f(x),g(y». Prove that h(f x g) = h(f) + h(g).
8.5 EXERCISES
363
8.9. Assume /: X -+ X is a homeomorphism. Prove that h(f-I) = h(f). 8.10. Let /w : Sl -+ Sl be a rotation by w. Prove that the entropy h(fw) = O. 8.11. Assume / : [0,1) -+ [0,1) is continuous and has two subinterval h,12 C [0,1) such that /(Id ~ II U 12 and /(12) ~ II U h Prove that the entropy of / is at least log(2), h(f) ~ log(2). 8.12. Assume / : [0,1) -+ [0,1) is continuous and has a periodic point of a period which is not a power of two. Prove that / has positive entropy. Hint: Use the Stefan cycle to get intervals which cover each other. 8.13. Assume / : X -+ X is a continuous map on a metric space X and leX) = Y. Prove that h(f) = h(fIY). 8.14. Use Proposition III.2.IO to prove Theorem 1.9(b) in the case when A is reducible. 8.15. Let A : L2 -+ L2 be the bdding machine map: A(SOSI82 .•• )
=
(808182 ... )
+ (1000 ... ) mod 2,
i.e., (1000 ... ) is added to (808182 ... ) mod 2 with carrying. (a) Prove that h(A) = O. Hint: Take f = 3-k+ 1 2- 1 so the closed balls of radius of radius f are cylinder sets given in Exercise 2.13. Then show that A permutes these cylinder sets. (b) Let /00 : [0,1) -+ [0,1) be the map defined in Example m.1.3. Prove that h(foo) = O. 8.16. Let / be the solenoid diffeomorphism given in Section 7.7. Using a Markov partition, prove that h(f) = log(2). 8.17.
Let 9 be the hyperbolic toral automorphism for the matrix
(~ ~ ).
Prove that
the the entropy h(g) = (3 + 8.18. Let / be the DA-diffeomorphism formed from the hyperbolic toral automorphism 5 1/ 2 )/2.
9 for the matrix
(i ~ ).
Using the result of Bowen given in Remark 1.11, prove that
the entropies of / and 9 are equal, h(f) = h(g). 8.19. Let / : T2 -+ T2 be the noninvertible toral map induced by the matrix
(1 -1) 1
1
.
Calculate the entropy of /. Hint: Find a power /Ie for which it is easy to calculate the entropy and use Theorem 1.2 relating the entropy of /Ie to the entropy of /. 8.20. Let / : [0,3) -+ [0,3) be the piecewise linear function such that /(0) = 3, /(1) = 2, /(2) = 3, and /(3) = O. (The map / is linear between each pair of adjacent integers.) (a) Find a Markov partition for / and its transition matrix. (b) Find the topological entropy of /. Liapunov Exponents 8.21. Let / be the solenoid map given in Section 7.7. Prove that the Liapunov exponents satisfy A2 = A3 = log(4- 1 ) and Al ~ log(2). 8.22. (Generalized Baker's map) Assume 0 < 1'1 < 1'2 < 1 satisfy 1'1 + 1'2 < 1. Define the function /I-II.".(x,y) =
{
(2x,l'llI) (2x - 1.1 - 1'2
+ 1'2Y)
for 0 $ x < 0.5, 0 $11 $ 1 for 0.5 $ x $ 1, 0 $ Y $ 1
364
VIII.
MEASUREMENT OF CHAOS IN HIGHER DIMENSIONS
on the square S = [0,11 x [0,11. (1\) Show that the Liapunov exponents satisfy AI(X,y) = log(2) and
for all (x, y) E S. (b) Notice that the first coordinate function of 1",1.",2 is essentially (except for x = 1) the doubling map D(x) = 2x mod 1 and so is ergodic with respect to Lebesgue measure on [0, IJ. (You do not need to prove this fact.) Using the Birkhoff Ergodic Theorem, prove that A2(X, y) = 0.5[log(l-Id + log(1-I2)] almost everywhere with respect to Lebesgue measure on S. (c) Find three different periodic orbits (of any period) for which A2(X, y) equals log(l-Id, log(1-I2), and [log(l-Id + log(1-I2)J/2 respectively. 8.23. Let 1 : M -+ M be a C l diffeomorphism. Assume x E M and v E TxM are a point and a vector for which the Liapunov exponent A(X, v) exists. (a) Let be a real number. Prove that
°
A(X,OV) = A(X,V). (b) Prove that
A(f(X), D/xv) = A(X, v). Fractal Dimension 8.24. Let S = to} U {11k: k is a positive integer }. (a) Prove that dimb(S) = 1/2. (b) Prove that dimH(S) = O. 8.25. Let S = to} U {2-k : k is a positive integer }. (a) Prove that dimB(S) = O. (b) Prove that dimH(S) = O. 8.26. that
Let F",(x) = I-Ix(l-x) and A", = nn~O~([O, 1]). Let A", = (1-1 2 _41-1)1/2. Prove
for 1-1 > 2(1 + 2 1/2). 8.27. Construct a Cantor set in the line with box dimension equal to one. 8.28. Let N = SI X D2 and I(t, z) = (g(t),
~z + (3 e2..ti )
where g(t) = 4t mod 1. Let A = nk>o Ik(N). (a) Prove for 0 < (3 < 2- 112 , that is an embedding of N into itself. (b) Let D(t) = {t} x D2. Prove that
f
. log(4) dlmb(AnD(t)) = 10g({3-I)' Also prove for correction choice of (3, that dimb(AnD(t)) > 1. Note that AnD(t) is a totally disconnected set that has box dimension greater than one.
8.5 EXERCISES
365
(c) Prove that . dlmb(A) 8.29. 8.30.
Calculate the box dimension of the invariant set A for the Geometric horseshoe. Let A c Rn be a compact set. Prove that lim inf N(f, A)fP = {
00
.-0
and
log(4) I)'
= 1 + log(.B
.
hmsupN(f,A)fP = • _0
0
{oo0
for 0 ::; p < dimb(A) for dimb(A) < p < 00 for 0 ::; p < dimB(A) for dimB(A) < p < 00 .
8.31. Let A c Rn be a compact subset. Using a metric equivalent to the Euclidean metric, let N;'(A) be the minimum number of balls of diameter E which cover A. Prove that · (A) - I' . f log(N"(E, A)) d 1mb - Imm Iog (E- I)
.-0
8.32. Let 1/10,./10. be the generalized Baker's map defined in Exercise 8.21. (a) Prove there is a unique positive number d such that 1 = J.l1 + J.I~. (b) Let d be the number given in part (a). Let
n1::,.".([0, 00
A=
IJ x [0,1]).
n=O
Prove that the Hausdorff dimension of A n ({ O} x [0, 1]) is less than or equal to d. Remark: Theorem 6.3.12 in Edgar (1990) proves that the Hausdorff dimension of A n ({O} x [0,1]) is actually equal to d. The Hausdorff dimension of A is then equal to 1 + d. Because the set is dynamically defined, the box dimension of An ({O} x [0, '1]) is also d, so the box dimension of A is 1 + d.
CHAPTER IX Global Theory of Hyperbolic Systems In this chapter, we take the ideas introduced in the last chapter and make them into a more complete theory. The first section shows that in some sense any map can be decomposed into a map on chain recurrent pieces and a gradient-like map (or flow) between the pieces. This is a very general theorem of Conley and can be proved without using much of the material introduced earlier in this book except the definition of chain recurrent. The next section indicates how the proof of the stable manifold theorem for a fixed point can be modified to prove the case for a hyperbolic invariant set. Using these results, we prove the possibility of shadowing near a hyperbolic invariant set, the n-stability of diffeomorphisms with a hyperbolic chain recurrent set, and the structural stability of diffeomorphisms which satisfy a "transversality" condition in addition to having a hyperbolic chain recurrent set. These theorems form the heart of the theory of hyperbolic diffeomorphisms (ones for which the chain recurrent set is hyperbolic) which was articulated by Smale (1967) and carried out in the following years by Smale and other researchers.
9.1 Fundamental Theorem of Dynamical Systems In this section, we study a flow
for all t > 0 and for all T >. I} = {x : x E n+(x)} = {x : x E W(x)}. 367
368
IX. GLOBAL THEORY OF HYPERBOLIC SYSTEMS
The definitions for a flow are the same once an t-chain is defined, which is done in Section 5.4. In order to give an interpretation of the fundamental theorem below, we form equivalence classes of points in 'R.('l) for which there are chains from one point to another and back to the the first point. This concept is made precise in the following definition. Definition. We define a relation'" on 'R.(
The proof of the theorem requires proving the connection between the chain recurrent set and the collection of all pairs of attracting and repelling sets. We start by giving these definitions for a flow which are much the same as for a diffeomorphism except that the ''time" is continuous rather than discrete. Definition. Let 0 such that o
and the repelling set A"}.
9.1 FUNDAMENTAL THEOREM OF DYNAMICAL SYSTEMS
A set V is called a wed trapping region for the flow '(" provided there is some T such that c1e'('T(V» C inteV), i.e., V is not neceesarily positively invariant.
>0
REMARK 1.2. The sets A and A·, which we call an attracting set and a repelling set respectively, Conley calls an attractor and a dual repeller respectively. We reserve the word "attractor" for an attracting set which is transitive. REMARK 1.3. Proposition 1.10 proves that if V is a weak trapping region with A = n,>o '("~eV) and A· n O. Thus, these two propositions give two different ways in which the definition of a trapping region can be changed. The following proposition gives a couple of useful properties of attracting sets.
=
Proposition 1.2. (aJ Any attracting set or repelling set is closed. (b) Both A and A· are positively and negatively invariant. PROOF. (a) Because of the property that '('T(c\(U» C int(U), it follows that A is also the intersection of closed sets and so is closed: A n>o '("( cl(U». The proof for A· is similar. (b) The proofs that attracting sets and repelling sets are invariant are similar, 80 we only look at A. Let x E A and fix any real s. Then x E '("~eU) for all t ~ 0, 80 '("(x) E ,(,'+'(U). Therefore, '("(x) e n'~I'1 '('I+'(U) A. Since s is arbitrary, A is both positively and negatively invariant. 0
=
=
The following theorem gives the connection between the chain recurrent set and the set of all attracting-repelling pairs, and is used to prove the existence of a Liapunov function which is strictly decreasing off X('(") as given in Theorem 1.1. Theorem 1.3. Let '(" be a !Iowan a compact metric space M. Let l'
Then l'
=n{A U A· : (A,A·) E A}.
= X('(").
Before starting the proof we give an example. Example 1.1. Let '(" be the gradient flow on T'l with one sink Po, two saddles PI and P2, and one source P3. If Uo then Ao and Ao T2. Next, let UI be an attracting neighborhood of the sink po. Then AI = {po} and Ai = {pa}UW'(P2)UW'(PI). If U2 is a positively invariant set that contains po and PI but not P2 or Pa, then A2 {Po} U WU(PI) and A; {ps}UW'(P2). If U3 is a positively invariant set that contains Po and P2 but not PI or P3, then As = {po} U WU(P2) and Ai {Pal U W'(PI). If U4 is a set such that T2 \ U4 is a repelling neighborhood of Pa, then A4 = {po} U WU(PI)UWU(P2) By taking the and A. = {ps}. Finally, if U5 = T'l, then A5 = T'l and As = intersection of these sets, l' :::> nO~i~5Ai U Aj {Pi: 0 $ j $ 3}. By the remarks preceding the example, we see that the other inclusion is also true and so we have equality (as stated in Theorem 1.3).
=e
=e
=
=
=
=
=
e.
The following lemma proves Xe'(") :::> l' which is one direction of the inclusion of the two sets in Theorem 1.3.
IX. GLOBAL THEORY OF HYPERBOLIC SYSTEMS
370
FIGURE 1.1.
Neighborhoods UI. U2, and U4 for Example 1.1
Lemma 1.4. Ify rt R(cp') then there exists an (A,AO) E A such that y Therefore, R(cp') :) P.
rt
Au AO.
=
PROOF. Take y rI: R(cp'). Then there is an ( > 0 such that y rI: 0i(y). Let U 0i(y). Then cpl(cI(U» C int(U) (or else it would be p068ible to (-chain out of U and the set would be bigger). Let A cp'(U) be the attracting set for U and A° <0 cp' (M \ U) be the dual repelling set. Since y rI: u, y rI: A. From the definition -of 0i, it follows that there is aT> 0 such that cpT(y) E U 80 cpT(y) rI: AO. Because AO is negatively invariant, y rI: AO. This proves y rt AU AO. 0
= n,>o
= n,
REMARK 1.4. To prove the second inclusion, that R( cp') C P, first note that for any attracting-repelling pair (A,AO), all the fixed points and periodic points must be contained in Au AO. Thus Fix(cp') U Per(cp') C P. In the following lemma, we prove that given any attracting-repelling pair (A, AO) there is a Liapunov function which is strictly decreasing off AUAo. Then in Lemma 1.6 below, we use this Liapunov function to show that R(cp') c P.
Lemma 1.5. Let (A, AO) E A. Then there is a continuous function L : M ..... Ii such 0, LIAo 1, L(M \ (A U AO» C (0,1), and L is strictly decreasing off that LIA AUAo.
=
PROOF.
M
-+
=
Since A and AO are disjoint closed sets in a metric space, the function V :
[0, 1] given by Vex) _ d(x, A) - d(x, A) + d(x, AO)
is continuous, VIA = 0, VIAO = 1, and V(M\(AUAO» c (0,1). (The existence ofsuch a function in a normal space which is not a metric space is given by the Tietze-Urysohn Theorem.) This function V is not necessarily decreasing or even non-increasing. Let VO(x) = sup{V(cp'(x» : t ~ O}.
=
=
For x E A, cp'(x) E A so VO(x) O. Similarly, if x E AO then VO(x) 1. If x rt A U AO then cp'(x) approaches A as t goes to plus infinity. (See Exercise 9.7.) Therefore, if x rt AO, then V(cp'(x» goes to zero and the maximum of V(cp'(x» is attained. Moreover, there is a neighborhood W and a bounded time interval [0, td such that for yEW the maximum of V (cp' (y» for t ~ 0 is attained for some t E [0, t 11; it follows that VO is continuous. Finally, VO(cp'(x» ~ VO(x) for t ~ 0 by construction. Note that the inequality is not necessarily strict. To make a strictly decreasing function, define L to be a weighted average of VO over the forward orbit:
L(x)
=10
00
e-' V·
0
cp'(x) ds.
9.1 FUNDAMENTAL THEOREM OF DYNAMICAL SYSTEMS
This function is easily seen to be continuous. Then for t L(lPt(x»
~
= fo'JO e-' V* (lP·+t (x»
371
0, ds
$fo'>O e-' V*(lP'(x)) ds =L(x). The second inequality follows because V*(rp·+t(x» $ V*(lP'(x» for t ~ O. Thus L is non-increasing along orbits. Now take t > O. If L(lPt(x» = L(x) then V*(lP,+t(x» = V*(rp'(x)) for all s > O. In particular, taking s = nt, we get that V*(lPnt(x» = V*(x) for all n > O. However, if x ~ Au A* then lPnt(x) goes to A so V*(lPnt(x)) goes to zero. Thus it is impossible since V*(lPHt(X» = V*(lP'(x» for all s > O. Therefore we have shown that L(x) is strictly decreasing off A U A * as required. 0 Lemma 1.6. (a) Let (A, A*) E A bean attracting-repelling pair. Then R(rpt) (b) FUrther, R(lPt) c P.
c
AuA*.
PROOF. Part (b) easily follows from part (a). To prove part (a), we take p ~ Au A* and show p ~ R(lPt). Let L be the Liapunov function which is decreasing off A U A* which is given by Lemma 1.5. Let Co = L(p) and CI = L(lPI(p)). Since L is strictly decreasing at all points of L-I([C),Co]), there is a 6> with 6 < (Co - cI)/2 such that if q E L-I([cI,cI + 6]) then lPI(q) E L-I([O,CI])' For each q E L-I([O,cd) there is an f > such that for q' within f of q, L(q') < CI + 6. By compactness there is one f that works for all points at once. Now if {p = Po, ... , Pnj t), ... , t n } is an f-chain with tAo ~ 1, then L(lPt,(Po)) $ L(rpl(Po)) = CI' Then L(pd $ CI + 6 by the choice of f. Next, L(lPt'(pd) $ L(lPI(pd) $ CI. Again, L(P2) $ CI + 6 by the choice of f. Continuing by induction L(pA:) $ CI + 6 for 1 $ k $ n. Thus the chain can never get back to p. This proves that p is not chain recurrent. 0
°
°
Together, Lemmas 1.4 and 1.6 prove Theorem 1.3. To prove Theorem 1.1, we combine the Liapunov functions for different attractingrepelling pairs to prove that there is a Liapunov function which is strictly decreasing off P (which equals R(lPt». To carry out this construction, we need to know that the number of such pairs is at most countable. Lemma 1.1. The set A is at most countable (i.e., finite or countable). PROOF. We want a neighborhood to uniquely determine the attracting set. Since there are often proper subsets of attracting sets which are attracting sets, we use the pair A x A* which is the unique pair that is oontained in int(U) x int(M \ U). Since M x M is a compact metric space, it has a countable basis. Therefore, there is at most a 0 countable number of such pairs A x A * and A is countable.
Example 1.2. The Morse-Smale examples all have finitely many attracting-repelling pairs. The following is an example of a function on the one torus (or circle), having countably many attracting-repelling pairs. Let TI = {x mod I} be the one torus. Let the equations on ']['1 be given by
x=
f(x)
= x 2 sin(l/x)
for - 2/(311') $
X
$ 2/11'.
Then f(x) can be extended outside this interval to have no more zeroes and be periodic. The fixed points are x = 0, 1/(n1l') for all integers n. Since
!,(x)
= 2x sin(l/x) -
oos(l/x),
372
IX. GLOBAL THEORY OF HYPERBOLIC SYSTEMS
f'(I/(mr» = (-1)"+1. Therefore for any integer k, x = 1/(2k7r) is a fixed point sink and x = 1/( (2k + 1 )11-) is a source. Thus there are countably many fixed point attracting sets. There are other attracting sets made up of intervals of the form A = [1/(2j1l"), 1/(2k7r)], where j and k are integers and either j < 0 < k, or j and k have the same sign and 0 < Ikl < Ijl. The dual repelling set is also an interval, if j < 0 < k then A· = yl \ (1/((2j - 1)11"), 1/((2k - 1)11"). (There are other attracting sets made up of intervals which do not contain 0.) This gives the different types of attracting sets. Notice that the set {O} is not an attracting set, but is the intersection of attracting sets. Theorem 1.8. Assume cpl is a flow on a compact metric space M. Then there is a Liapunov function L : M -+ R which is strictly decreasing off P and such that L(P) is a nowhere dense subset of R. PROOF. By Lemma 1.7, A is at most countable, i.e., A = {(AJ' A;)}~l' (The case when A is finite requires small modifications.) For each j there is a weak Liapunov function L J given by Lemma 1.5 which is strictly decreasing off A J UAj. Also, Lj(M) c [D. 1]. Let 00
L(x) = 2
L 3- Lj(x). ,_I j
The sum is absolutely convergent, so L is continuous. Also L(M) C [0.1]. The function is a combination of non-increasing functions and so is non-increasing along trajectories. Next, if x rt P. then x rt A" U Ak for some k. Then for t > 0, Lj(cp'(X» $ Lj(x) for all j and L,,(cpl(X» < L,,(x), so L(cpt(x» < L(x). This proves that L is strictly decreasing off P as desired. Finally, for any point pEP, each Lj(p) is equal to either 0 or I, so L(p) is a number whose ternary expansion has only zeroes or two's. (Note the factor of 2 in the definition of L.) Therefore L(P) is contained in the nowhere dense Cantor set made up of points whose ternary expansion has only zeroes or two's. 0 Theorems 1.3 and 1.8 combine to prove Theorem 1.1. We end this section with three results which follow from the above results or are closely related. Proposition 1.9. Any attracting set A has a neighborhood U which is a trapping region for A and which is positively invariant in the strong sense that cpl(c1(U» C int(U) for all t > O. PROOF. Let L be the Liapunov function given by Lemma 1.5 and let U = L-1([O,f» for small f > D. Then cl(U) C L-l([D,f]) and cpt(c1(U» C cpl(L-l[D,f]) C L-1([O,f» for t > o. 0 The following proposition proves that it is enough to assume there is a weak trapping region. Proposition 1.10. Let V be a weak trapping region for the pair (A, A·) of attracting and repelling sets. Then there is an trapping region U such that A = n,>o cpt(U) and A· = nt$ocpl(M \ U). PROOF. Let V be a weak trapping region and T the time such that cl(cpT(V» C int(V). Let U= cpt (int(V».
n
O$l$T
9.1 FUNDAMENTAL THEOREM OF DYNAMICAL SYSTEMS
373
We leave to Exercise 9.5 the verification that U is open using the continuity of the flow on initial conditions. For 0 :5 s = jT + s' with 0 :5 s' < T. ",'(U) = [
n
",'+iT+.' (int(V))]
n[
n
",,+jT+.' (int(V))]
cU. For s = T, ",T(c1(U»
c
n ",' ° ",T(c1(V)) n ",'(int(V»
O~'
C
O~I
cU. This proves that U is a trapping region. It is also easily checked that the attractingrepelling pair for U is still equal to (A,A"). 0 The next proposition says that the chains from a point x E 'R.( ",') to itself can actually be take inside 'R.(",'). Proposition 1.11. Let ",I be 8 flow on 8 compact metric space M. Then the the map restricted to the chain recurrent set has all points chain recurrent. 'R.(",'I'R.(",')) = 'R.(",'). REMARK 1.5. This result is not true if the ambient space is not compact. Note also that the analogous theorem is not true in general for the nonwandering set. i.e .• there exist examples where 0(",'10(",'» '" 0(",'). Remark 4.2 later in this chapter gives one such example. PROOF. Let P E 'R.(",'). For each n > 0 there is 8 periodic lin-chain through p: {p = P~ .... ,p~ = P;tl = 1..... tA: = I}. It is easily seen to be possible to take all these tj = 1. Let Cn = {pj }jEZ be a periodic lin-chain through p. Then the set Cn is a compact subset of M. In the Hausdorff metric on compact subsets of M. there is a subsequence C n ; that converges to some compact set C c M. We show below that for any q E C and t > 0 there is a periodic t-chain through q, {q = Xo .... ,XA:; 1.... , I} with Xj E C. It follows that q E 'R.(",'IC) c 'R.(",'). This is true for any q E C, so C c 'R.(",'). Next p E C, so P E 'R.(",'IC) c R(",'). This argument applies to any p E 'R.(",'), so R(",') c R(",'IC) and so they are equal. We have only to show that there is a periodic t-chain through q with Xj E C. By uniform continuity of the flow ",' on M, there is a 6 = 6(tI3) > 0 such that for dCa, b) < 6. d(",l(a). ",l(b)) < t13. We can also take 6 < E/3. Because the Cni converge to C, we can find an n = nk such that lin < E/3 and the distance from Cn to C in the Hausdorff metric is less than 6. Assume {pf} has period j so pj = p8 = p. We can extend pf periodically to all i E Z. so pr-rj = pf for all i. For each pf take Xi E C with d(Xi. pf) < 6, Xi+j = Xi for all i. and Xi = q for some i. Then d(",l(Xi).Xi+1) < d(",l(Xi).",l(pf» + d(",l(pf).pf+1) + d(Pr-rl,XHd < E. Therefore, Xi is a periodic t-chain through q. 0
374
IX. GLOBAL THEORY OF HYPERBOLIC SYSTEMS
9.1.1 Fundamental Theorem for a Homeomorphism Let f : N -- N be a homeomorphism on a compact metric space. We can form the suspension of f and obtain a flow 'P' on M = (N x R)/ -. By the Fundamental Theorem for flows, there is a Liapunov function L : M ~ R which is strictly decreasing off the chain recurrent set of
9.2 Stable Manifold Theorem for a Hyperbolic Invariant Set The last section gave some of the basic results about the chain recurrent set using only ideas from point set topology. The other results which we give use the hyperbolic structure on the chain recurrent set, or at least on the closure of the periodic points. One of the key results about systems with a hyperbolic structure on the chain recurrent set is the existence of stable and unstable manifolds as stated in Section 7.1.3. In this section, we restate the theorem and sketch the proof. In the statement of the theorem, we use coordinates near points in the invariant set A. In a Euclidean space, a neighborhood Vp(t) of p can be taken as {p} +1E;(t) x 1E~(t). In a manifold, a neighborhood can be taken of the same form if we use local coordinates. However, it is not always possible to take one set of coordinates for all the different points of A. Another approach is to use the exponential map from tangent vectors at p into the manifold: exp p : TpM -> M. In a Euclidean space, expp(vp) = p+vp' In a manifold, the point expp(vp) is obtained by going along a geodesic (distance minimizing curve) starting at p in the direction vp. If the manifold is embedded in some larger Euclidean space R N, then exp p (vp) can be visualized as taking the point p + vp which is probably not in the manifold M and then projecting this point into M along the subspace (TpM).l of vectors perpendicular to TpM. See Franks (1979) for a more complete explanation of this latter method. In any case the neighborhood of p can be taken as
where Bp(t)
= 1E;(t) x E~(t).
In the proofs below, we use the fact that the D(expp)op = id, so IID(expp)vp - idll is small for vp E Bp(t), although we do not explicitly isolate the fact that D(expp)op is different from the identity.
9.2
STABLE MANIFOLD THEOREM FOR A HYPERBOLIC INVARIANT SET
375
Theorem 2.1 (Stable Manifold Theorem for a HyperboUc Set). Letf: M -+ M be a C" diffeomorphism. Let A be a compact invariant set with a hyperbolic structure for f with hyperbolic constants 0 < ~ < 1 < oX and C ~ 1. Then there is an f. > 0 such that for each pEA there are two C" embedded disks W:(p, f) and W;'(p, f) which are tangent to IE~ and IE~ respectively. (p, f) can be represented as the graph Using the exponential map discussed above, of a C" function (1~ : 1E~(l) -+ E~(f.) with (1~(Op) = Op and D({1~)op = 0:
W:
Also, the function {1~ and its first k derivatives vary continuously as p varies. Similarly, there is a C" function (1~ : E~(t) -+ E~(f.) with (1~(Op) = Op and D({1~)op = 0 and with the function (1~ and its first k derivatives varying continuously as p varies such that
Moreover for l
> 0 small enough and using the neighborhoods Vp(l) defined above,
W:(p, f)
= {q E Vp (€) : =
fi(q) E VfJ(p)(t) for j ~ O}
{q E Vp(l) : fi(q)
E V/;(p)(t) for j ~ 0 and
d(fi(q),fi(p» $ C~id(q,p) for all j ~
OJ.
Similarly,
W.U(p,f)
= {q E Vp(t)
: ri(q) E V/-;(p)(t) for j ~ O}
= {q E Vp(t) : ri(q) E V/-J(p)(t) for j ~ 0 and
d(ri(q),ri(p» $ CoX-id(q,p) for all j ~
OJ.
REMARK 2.1. The idea of the proof we sketch is to repeat the argument for a single fixed point adding the fact that points near p get mapped to points near f(p). It is possible to prove the Stable Manifold Theorem for a hyperbolic set as a corollary of the Stable Manifold Theorem for a fixed point in a Banach space. See Hirsch and Pugh (1970) or Shub (1987). These references also contain a complete proof (rather than a sketch of a proof which we give for our approach). PROOF. We also assume that the constant C ~ 1 in the definition of hyperbolic structure is taken to be one. This is always possible by taking an adapted norm as we proved in Theorem VII.l.l of Section 7.1.3. For each point pEA, we take Bp(f) C TpM and Vp(f) = expp(Bp(t» as defined above. The map from a neighborhood of p to a neighborhood of I(p) induces a C" map Fp : Bp(r) -+ B/(p)(Cor) for some Co ~ oX defined by Fp(vp)
= eXP/(~) 01 0 expp(vp)'
For a small v p, expp(vp) is a point in M near p, f 0 exPp(vp) is a point in M near f(p), and eXP/C~) of oexpp(vp) is a relatively small vector in T/Cp)M. Notice that Fp(Op)
= eXP/C~) of(p) = O/(p).
376
IX. GLOBAL THEORY OF HYPERBOLIC SYSTEMS
Also D(Fp)op = DJp because D(expp)op = id and D(exp/(p»op = id. Therefore DJp is a linear approximation for the nonlinear map Fp in Bp(E). In order to consider all these maps for different p at once, we use the bundle structure of T"M to keep the different neighborhoods disjoint. The set B,,(r) = {vp E Bp(r) ; pEA}
c
T"M
is a bundle over A with all the sets Bp(r) contained In distinct tangent spaces TpM and so do not intersect. (The neighborhoods Vq(E) and Vp(E) do intersect for q near p.) Since A is often not a manifold (e.g. a Cantor set), this space is a metric space but not a manifold. The map F; B,,(r) ..... B,,(Cor) is continuous and Ck on each fixed fiber Bp(r). Note that since A is compact, Co can be taken independent of p. Let 0 < I' < I < A be bounds on the derivatives on E~ and E~; IIDJpIE~1I < I' and m(DJpIE~) > A for all pEA. (Remember that for a linear map A, m(A) is the minimum norm of A. See Section 4.1.) Take E > 0 small enough so that I' + 2f < 1 and A - 2E > 1. Given such an f > 0, there is an r > 0 small enough so that for q E Bp(r), A"(q) D(Fp)q = ( A~'(q)
A'U(q»)
A~U(q)
with IIA~'(q)1I < 1', m(A~U(q» > A, IIA~U(q)1I < E, and IIA~'(q)1I < E. In this expression, the derivative of Fp is taken along the fiber with p fixed. (We have used the fact that D(expp)q is near the identity.) We let
n Fj,(p)(B/i(p)(r» 00
W:(p) =
and
j=O
W:(p) = expp(W:(p»,
where we think of the local stable manifold W;(p) as being represented in the local coordinates given by the linear space TpM. We want to show that it is I-Lipschitz. As before, we consider the stable and unstable cones (but now at different points); C~ = {(v~, v~) E E~ x E~ ; Iv~1 :::; Iv~l}
= {vp E TpM ; 11r~vpl :::; 11r~vpl},
c; =
and
{vp E TpM ; 11r~vpl ~ 11r~vpl}·
As before, the condition that
W:(p) n [{q} for all points q E W;(p) is equivalent to The following facts are proved just as point. 1. For q E Bp(r), D(Fp)qC; C Cj(p)" 2. Let ql,q2 E Bp(r) with q2 E {qd l1rj(p)Fp(q2) - 1rj(p)Fp(qdl ~ (A -
+'c;J =
{q}
the graph being I-Lipschitz for each fixed p. in the proof of the stable manifold of a single
+~. Then Fp(Q2) E {Fp(q2)} E)\1r~(q2 - qdl·
+ Cj(p)'
and
9.3 SHADOWING AND EXPANSIVENESS
377
3. Let ql,q2 E Bp(r) with q2 E {qd + C;. Then F;I(q:/) E {F;I(q2)} + Ci-1(p). and l7rj-l(p)F;I(q2) - 7rj_l(p)F;I(ql)1 ~ (I-' + l)-II7r;(q2 - ql)l. 4. Let D8. p be an unstable disk in Bp(r), i.e., the image of a Cl function t/J : E~(r) -+ E;(r) with Lip(t/J) ~ 1. Let D~,/"(p) = F/"-I(p)(D~_I,fft-l(p»
n B/ft(p){p)
be defined by induction. Then for n ~ I, D:,/"(p) is an unstable disk in Bf"(p)(r) and F-n(D~,J.(p» C F-n+l(D~_I./"_I(P» C ... c Do.p is a nested set of unstable disks in Bp(r) with n
diam[n P-j(Dj./j(p»] ~ (.\ - i)-n2r. j=O
5. The manifold W:(p) is a graph of a I-Lipschitz function "'; : E;(r) --+ E~(r) with ",;(Op) = Op. 6. Ifq E W:(p), then IFt(q)-O/J(p)1 ~ (I-'+l)ilq-Opl for j ~ O. Ifq E W:(p) c M is a point in the stable manifold in M, then d(fi(q),p(p» ~ (I-' + l)id(q,p) converges to zero at an exponential rate. 7. For p fixed, the manifolds W:(p) and W:(p) are Cli W:(p) is tangent to at Op and W:(p) is tangent to E; at p. 8. For p fixed, the manifolds W:(p) and W:(p) are Ck if f is Ck.
E;
u;
The fact that the function and its derivatives vary continuously as p varies follows easily from the fact that this is true about Fp and its derivative along fibers. Thus the construction for p and that for p' are close to each other if p is near p'. Since we are assuming that f is invertible (a diffeomorphism), the corresponding facts about W;'(p) follow from looking at rl. 0 In the next section, we outline a modification of the above proof which shows that a 6-chain can be i-shadowed near a hyperbolic invariant set. We later give another variation which proves the structural stability of Anosov diffeomorphisms.
9.3 Shadowing and Expansiveness In this section, we prove that it is possible to shadow in a neighborhood of a hyperbolic invariant set and that a diffeomorphism is expansive on a hyperbolic invariant set. R. Bowen carried over the idea of shadowing to hyperbolic invariant sets, based on the work of D. Anosov (1967) and Ya. Sinai (1972) for Anosov systems. See Bowen (1975a) and (1975b). In the next section, we show that if'R.(f) has a hyperbolic structure then it breaks up into a finite number of pieces, and the results of this section show that we can i-shadow an 6-chain by an orbit in one of these pieces. We start by defining shadowing precisely. We also want a condition on the invariant set which allows us to conclude that the orbit which shadows a 6-chain lies in a given invariant set. The necessary assumption is that the invariant set is isolated, which we define next. After these definitions, we state the result about existence of shadowing. Definition. Let f : M --+ M be a homeomorphism. Let {Xj}~!.it be a 6-chain for f. (In this section, often we add the requirement that either it = -00 or h = 00 or both.) A point y E M i-shadows {Xj}~!.it provided d(fi(y),xi) < f for it ~ j ~ h.
378
IX. GLOBAL THEORY OF HYPERBOLIC SYSTEMS
Definition. A closed invariant set A is said to be isolated provided there is a neighborhood U of A such that A C int(U) and
A=
n
r(U)·
nEZ
The neighborhood U is called the isolating neighborhood. Notice that the isolated invariant set is the maximal invariant set contained in its isolating neighborhood, i.e., any invariant set A contained entirely inside the neighborhood U is a subset of A. The geometric horseshoe given in Section 7.3 is an example of an isolated invariant set which is not an attractor. The set S given in that section is an isolating neighborhood. When we defined an isolating neighborhood (trapping region) for an attracting set we required that U is positively invariant. With this assumption on U, the intersection of r(U) for all integers n (given above) is the same as the intersection for all positive integers n (used for attracting sets). For an invariant set which is not an attracting set, we only require the set A be locally isolated in U. There can be other points x in the neighborhood U such that the orbit of x leaves U and later returns to U. These points could be either periodic, nonwandering, or chain recurrent.
Theorem 3.1 (Shadowing). Let A be a compact hyperbolic invariant set. Given t > 0, there exist 6 > 0 and TJ > 0 such that if {XjH~iI is a 6-chain for f with d(xj, A) < TJ for il ~ i ~ h, then there is a y which t-shadows {Xj};~iI' If the 6-chain is periodic, then y is periodic. Moreover, if il = -00 and h = 00 for the 6-chain, then y is unique. If h = -it = 00 and A is an isolated invariant set (or has a local product structure), then the unique point YEA. REMARK 3.1. R. Bowen's proof of this theorem assumes that the set has a local product structure and uses the conclusion of the stable manifold theorem and then uses point set topology ideas. Instead, we use the proof of the stable manifold theorem as given in the last section to prove the result directly. REMARK 3.2. Meyer (1987) and Meyer and Sell (1989) give a proof of the shadowing theorem using the implicit function theorem. At SOD;le general level, their proof is really the same as the one given here but it appears very different. We present the ideas geometrically, while their proof is cleaner analytically. Meyer and Hall (1992) give a complete exposition of this proof in the case when the invariant set is a subset of a Euclidean space. Also see Palmer (1984) and Chow, Lin, and Palmer (1989) for another proof. REMARK 3.3. Grebogi, Hammel, and Yorke (1988) have extended this result to show that systems without a uniform hyperbolic structure can often shadow 6-chains for long intervals of time. PROOF. Throughout the proof, we take 0 < r < E. First extend the splitting E~ x E~ from on A to a neighborhood V C M. Take TJ small enough so the rrneighborhood of A is inside V. Let Bp(r) = E;(r) x E~(r) C TpM and Vp(r) = expp (8p (r)) be as in the last section. The box Bp(r) is a subset of TpM while Vp(r) is the comparable neighborhood of pin M. Let {Xj}1!.j, be a 6-chain for f. If 6 > 0 and TJ > 0 are small enough, then f(VxJ (r)) C VX;+l (Cor) for some Co > A. We introduce the map Fj : Bx; (r) -+ BXH , (Cor) which should be considered as the map f represented in local coordinates at Xj and Xj+!
9.3 SHADOWING AND EXPANSIVENESS
379
Note since f(xj) is not necessarily equal to Xj+1, IFj(Oxj) -OxHII is small, say less than 6, but is not necessarily equal to zero. However for small enough 6, the estimates for this sequence of maps are similar to those in the last section. Here we use v > 0 instead of f to measure the change in the derivatives because we are using f for something else. One difference between the two proofs is that in order to know that an unstable disk has an image which is an unstable disk, we need to take into consideration the fact that Fj(Ox) -:;, OXJ+I: the requirement becomes (>.-v)r-6 ~ r or (>.-v-l)r ~ 6, so that the jumps measured by 6 are bounded in terms of quantities determined by the hyperbolicity. Similarly in the consideration of stable disks, we need that (p + v)-I r - 6 ~ r or [(p + v)-I - llr ~ D. Thus we first take the neighborhood V of A in M and 0 < r < f small enough so that the estimates on the derivative of Fp = eXP/(~) of 0 expp : Bp(r)
-+
Bf(p)(COr)
are true for all p E V in tenns of the extended splitting; for q E Bp(r), A"(q) D(Fp)q = ( A~'(q)
A'''(q»)
A~"(q)
the entries satisfy the following estimates: IIA~'(q)1I < p, m(A~"(q» > >., IIA~"(q)1I < v/2, and IIA~'(q)1I < v/2. Next take 6 > 0 such that the same estimates are true for Fj constructed from a D-chain {xi }~!.jl provided Xi E V for all j. Assume that jl = -00. To shorten the notation, for k > 0 write F! for Fm+k-I 0 ···0 Fm. Using this notation,
is an unstable disk in BXJ (r) and Dj(r) = eXPXj (Dj(r» is an unstable disk in M near Xj' (The fact that Fj(Oxj) -:;, Ox,+! means that the disks Dj(r) do not necessarily go through OXJ and the Dj(r) do not necessarily go through Xj but are only nearby.) If q E D~(r) and k ~ 0, then q E F~k(Bx_.(r» so (F~k)-l(q) E Bx_.(r). In terms of D(;(r), if y E D~(r), then rk(y) E Vx_. (r) and the backward orbit of y stays within r of the D-chain. Similarly, if 32 = 00 we can find stable disks. For k negative, let F! = F':;;~k 0 ..• 0 F':;;~I. Then
is an stable disk in BXj (r) and Di(r) = eXPXj (DJ(r» is a stable disk in M near Xi. If q E D~(r), then for all k ~ 0, q E F~,,(Bx_.(r» so F::;(q) E Bx_.(r). In terms of
D&(r), ify E D~(r), then f''''(y) E Vxl.l(r) and the forward orbit ofy stays within r of the chain. Because of the slopes of these two disks, 00
n FJ-A:(Bxj_.(r»
k--oo
is a single point Qj E Bxj(r) and eXPx;(Qj) = Yj E VXj(r). Because of the nature of the intersection defining Yo, P(Yo) stays within r of Xj for each j. Since r < t, Yo
380
IX. GLOBAL THEORY OF HYPERBOLIC SYSTEMS
(-shadows the chain. The fact that Yo is a single point shows that the point which (-shadows the chain is unique when h = -iI = 00. Also the uniqueness shows that Fj(lJ.i) = lJ.i+l, or f(Yj) = Yj+1 as points of M. If the chain is It periodic, then the uniqueness shows that F~(qo) qk qo, so qo is a periodic point for F, or Yo E M is a periodic point for f. If the chain is not infinite in one direction or the other, then the intersection is a nonempty strip but not a point. This gives the existence of shadowing but not the uniqueness. Now in the case that A is also an isolated invariant set, let U be an isolating neighborhood and take V and r small enough so that Vp(r) C U for all p E V. Then, P (yo) = Yj E Vx, (r) CU. Therefore Ij (Yo) E U for all j, and so Yo E A since A is the maximal invariant set in U. 0
= =
The next result compares the points with w-limit sets in A with the stable manifolds of points. It shows that if a point approaches the whole set A then it has to approach the orbit of some point in A. First we give a definition of the stable manifold of the invariant set. Definition. If A is an invariant set, the stable manifold of A, W'(A), is defined to be all points q such that w(q) C A. Notice that often the stable manifold is not a manifold. Even for the horseshoe it is a Cantor set of curves. Similarly, the unstable manifold of A, WU(A), is defined to be all points q such that a(q) C A. A point q in the stable manifold of a single point p in an invariant set has a forward orbit which approaches the forward orbit of p in phase, d(fk(q),Jk(p» goes to zero as It goes to infinity. If the invariant set is not hyperbolic then a point q might approach the invariant set and be in the stable manifold of the invariant set, without being in phase to anyone point in the invariant set. The following theorem proves that a point which approaches a compact isolated hyperbolic invariant set necessarily is in phase with some point in the invariant set. Corollary 3.2 (10 Phase). Let A be a compact isolated hyperbolic invariant set.
Then W'(A) =
U W'(x) U WU(x).
and
xEA
WU(A) =
xEA
PROOF. Let Y E W·(A). Then there is an N > 0 such that d(fj(y), A) < v for j;:: N. Take Xj E A such that d(fj(y),Xj) < v for j;:: N. By uniform continuity of I, Xj is & 6-chain if v is small enough. Let Xj P-N(XN) E A for j ~ N. This 6-chain can be uniquely (-shadowed by a point x E A. Thus for j ;:: N,
=
d(fj (y),Jj (x» ~ d(fi (y), Xj) ~ 6+(.
+ d(xj, Ij(x»
Since the forward orbit of y stays near the forward orbit of x, the Stable Manifold Theorem implies that y E W·(x). The proof for W"(A) is similar. 0
9.4
ANOSOV CWSING LEMMA
381
Corollary 3.3 (Expansiveness). Let A be a compact hyperbolic invariant set. There exists a fJ > 0 such that ifx,y E A with d(fi(x),li(y» ~ fJ for all j E Z then x = y. REMARK 3.4. A diffeomorphism with the property of this corollary is called expansive. Any diffeomorphism that is expansive also has sensitive dependence on initial conditions. (See Section 3.5 for the definition.) Therefore, a diffeomorphism restricted to a hyperbolic invariant set has sensitive dependence on initial conditions. PROOF. Let xi = J1(x) and Yj = J1(y). Both of these are infinite c5-chains. Each is orbit which c5-shadows the other. By uniqueness of shadowing, x = y if c5 is small enough. 0 !\II
9.4 Anosov Closing Lemma In this section we prove that if A is either the set limit set L(f) or the chain recurrent set 'R(f) and A hIlS a hyperbolic structure, then A = c1(Per(f». Thus if (i) A is one of these two sets and is hyperbolic, and (ii) there there are transverse intersections of stable and unstable manifolds for some points in A, then we can conclude that there are transverse intersections of stable and unstable manifolds for periodic points. Sometimes we can even get a homoclinic point for the same periodic point. Since a transverse homoc1inic intersections for a periodic point implies the existence of a horseshoe, this result can be used to get an invariant set nearby with additional periodic points. We proceed with the statement and proof of the An060v Closing Lemma.
Theorem 4.1 (Ano80v Closing Lemma). Assume I: M -+ M is a C 1 diffeomorphism (or flow) on a compact manifold M. (a) Assume that the chain recurrent set of I, 'R(f), has a hyperbolic structure. Then the periodic points are dense in the chain recurrent set, and cl(Per(f» = 'R(f) = L(f) = O(f). (b) Assume that the limit set of I, L(f), has a hyperbolic structure. Then the periodic points are dense in the limit set, c1(Per(f» = L(f). (c) Assume that the non wandering set O(f) is hyperbolic. Then the periodic points are dense in the nonwandering set of the map restricted to the nonwandering set, c1(Per(f» = O(fIO(f)), i.e., cl(Per(f» = O(F) where F = fIO(l). REMARK 4.1. In part (c), if 0(1) is hyperbolic it is not true that cl(Per(l» =.O(f). A. Dankner (1978) gives an example of a diffeomorphism with a hyperbolic nonwandering set for whic1I c1(Per(f» '" O(f). The problem is that there are a point x E O(f) and a neighborhood U ofx such that ifz, I"(z) E U with k ~ 1 then {fj(z) : 0 ~ j ~ k} is not contained in a small neighborhood of O(f). Thus the orbit segment {fj (z) : 0 ~ j ~ k} does not stay in the region of M where I is hyperbolic and so the periodic f-chain generated by {fi(z) : 0 ~ j ~ k} can not be shadowed by a periodic orbit. REMARK 4.2. There are examples where L(f) is hyperbolic but L(f) '" O(f) c'R(f). Figure 4.1 gives the phase portrait of an example where the limit set is hyperbolic, but the nonwandering set is not hyperbolic. The point q of non-transverse intersection is nonwandering but not a limit point, q E O(f) \ L(f). It follows that the whole orbit of q are nonwandering points whic1I are not lim1t points. Also, q E O(f) \ 0(f10(f». Exercise 9.14 asks the reader to prove these facts about q. PROOF. (a) We showed in the section on Conley's Theorem that 'R(fI'R(f» = 'R(f). This means that given x E 'R(f), there is a periodic fJ-cbain {xi} with Xo = x, Xj+/< = Xi for all j, and Xi E 'R(f). By the Shadowing Theorem, there is a periodic point y E 'R(f)
382
IX. GLOBAL THEORY OF HYPERBOLIC SYSTEMS
+
-t FIGURE 4.1.
Phase Portrait for Remark 4.2
which f-shadows the chain. The point y is within 6 + f of x. Since this is arbitrarily small, x is in the closure of the periodic points. Since cl(Per(f)) C L(f) c n(f) c n(f), and the outer two are equal, we have that cl(Per(f)) = L(f) = n(f) = n(f). (b) Since L(f) = cl (U. w(z)) , it is sufficient to prove that the periodic points are dense in an arbitrary w(z). Let x E w(z). Since M is compact, w(z) is compact. In this circumstance, we proved earlier that d(fi(z),w(z)) goes to zero as j goes to infinity. Therefore given 6 > 0, we can find k and n such that d(f/c (z), fk+n (z» ~ {), d(f/c(z),x) < 6, and d(P(z),w(z)) ~ {) for k ~ i ~ k + n. Let zi be the n-periodic 6-chain with Zj = fi(z) for k ~ j < k + nand Z/c+n = Z/c. By construction this whole periodic 6-chain is within 6 of w(z) and so of L(f). This chain can be f-shadowed by a periodic point y. Then y is within 6 + f of x. The sum 6 + f can be made arbitrarily small, so x is in the closure of the periodic points. We leave the proof of part (c) to the exercises. See Exercise 9.15. 0
9.S Decomposition of Hyperbolic Recurrent Points Conley's Fundamental Theorem of Dynamical Systems gives a decomposition of the chain recurrent set into invariant sets. In this section we show that if f has a hyperbolic structure on nu) then nu) has a finite number of chain components, each of which is an isolated invariant set. This conclusion is not true without the added assumption of hyperbolicity. In other books, it is often assumed that f satisfies a condition in terms of the nonwandering set called Axiom A. A diffeomorphism of flow I is said to satisfy Axiom A provided it has a hyperbolic structure on n(J) and cl(Per(f)) = n(f). See Smale (1967). We mentioned in the last section that the second condition of Axiom A does not follow from the first condition. Rather than assume n(J) has a hyperbolic structure or I satisfies Axiom A, we often only assume that f has a hyperbolic structure cl(Per(f)). This latter condition is a weaker assumption than either of the other two assumptions and isolates what is actually necessary to make the theorem true. A second part of the main theorem assume the limit set L(J) has a hyperbolic structure. Again this assumption is implied by the assumption that either (I) n(J) has a hyperbolic structure or (ii) I satisfies Axiom A
9.5 DECOMPOSITION OF HYPERBOLIC RECURRENT POINTS
383
and so is a weaker assumption. This weaker assumption is exactly what is needed to make the theorem work. The treatment given in this section closely follows the lecture notes by Newhouse(1980) who first isolated the actual assumptions which we use. Many of the original theorems are due to Smale. See Smale (1967). Through this section, / is a C 1 diffeomorphism on a compact manifold M. The results are also true for a flow on a compact manifold but we do not state the results using this terminology. As indicated in the introduction above, we use several types of recurrent sets in this section: limit set, nonwandering set, and chain recurrent set. We review the notation for these concepts in the following definition. Definition. As we have defined before, n(f) is the set of all nonwandering points and R(f) is the set of all chain recurrent points. As a third type of recurrent points, the limit set 0/ / is defined to be the following set: L(f)=cI(
U w(p)ua(p»). pEM
Using the definitions, it is easy to check that Per(f) C L(f) c n(f) c R(f). In the theorems of this section, we often assume that cI(Per(f», L(f), or R(f) has a hyperbolic structure. We could also assume that n(f) has a hyperbolic structure, but then we have to add the assumption that cI(Per(f» = n(f), i.e., that / satisfies Axiom A .. To break up the periodic points into pieces, we form equivalence classes of periodic points using the relation given in the following definition. Definition. We let H(f) be the set of all hyperbolic periodic points of ,. If q,p E H(f), then we say that p is heteroclinically related to q, or p is h-related to q, provided WU(O(p» has a nonempty transverse intersection with W'(O(q» and W"(O(q» has a nonempty transverse intersection with W·(O(p». (Newhouse calls this property homoclinically related.) We form equivalence classes of h-related points and write p "" q if P is h-related to q. For p E H(f), let Hp={qEH(f): p"'q}. The set Hp is called the h-closs 01 p. Note if p is h-related to q then dimWU(p) = dimWU(q) since there is a transverse intersection of the stable manifold of p and the unstable manifold q and vice versa. Proposition 5.1. Being h-related is an equivalence relation on H(f). PROOF. It is clear that p "" p and that p '" q if and only if q "" p.
What we need to check is transitivity: assume p "" q and q '" r and show that p '" r. Let n1, n:h and n3 be the periods of p, q, and r respectively. Since the manifolds for the orbits are transverse, we can find p' E O(p) such that W"(p') intersects W'(q) transversally at a point x. By replacing x by fiRIR·(X) for some j ~ 1, we can assume that x lies in the local stable manifold of q. Let ~u be a small disk in WU(p') through x. By the Inclination Lemma, fiR·(a U) accumulates on W,~(q) in a C1 manner. Next, because q is h-related to r, there is a r' E OCr) such that W"(q') intersects W'(r') transversally at some point y. By replacing y by r;R,RI(y) for some j ~ I, we can assume that y lies in the local unstable manifold of q. Let~' be a small disk in W'(r') through y. See Figure 5.1. By the Inclination Lemma,l-jR'R'(~') accumulates on W,~(q) in a C 1 manner. Since W,~(q) and W,~(q) cr06S transversally at q, and fiR,R·(a U) and l-jR.n'(a') accumulate on them in a C1 manner, it follows that
384
IX. GLOBAL THEORY OF HYPERBOLIC SYSTEMS
p'
I X
\
r'
q
FIGURE 5.1.
Disks
~u
and
~.
pnln2(~U) and l-jn2n.(~") cross transversally for large enough j. Thus WU(O(p)) has a nonempty transverse intersection with W"(O(r)). A similar argument shows that WU(O(r)) has a nonempty transverse intersection with W·(O(p)). Combining, p is h-related to r. 0
Next, we show that the closure of a single class Hp, cl(Hp), is topologically transitive using the Birkhoff Transitivity Theorem. Proposition 5.2. For any p E H(f), the set cl(Hp) is closed, invariant under topologically transitive for I.
I,
and
PROOF. Let X = cl(Hp). Then X is a complete metric space with countable basis. Clearly X is closed. It is the closure of an invariant set and so is invariant. (See Exercise 9.4.) We need to show that the Birkhoff Transitivity Theorem applies. Take any two open sets U1 and U2 • It is sufficient to show that O(U.) n U2 '" 0. Take 1,2. The orbit O(r.) C O+(Ul) so W.U(O(rd) C O+(U.). any r J E Uj n Hp for j By iteration, we get that WU(O(r.)) C O+(U.). Because r1 is h-related to r2 (both are in Hp), WU(O(rd) intersects W'(0(r2)) transversally at some point y. We have shown that y E O+(U.). By an argument as above applied to r2, Y E 0-(U2 ) and so there is some k > 0 such that I"(Y) E U2 . Because O+(U.) is positively invariant, I"(Y) E O+(U.). Thus O+(U.) n U2 '" 0. 0
=
Example 5.1. Let I be the map on the standard horseshoe A. Any two periodic points are h-related. Also A = cl(Hp). By the above theorem it follows that I is topologically transitive on A. (The fact that I is topologically transitive on A also follows from the conjugacy to the two sided shift.) Example 5.2. The fact that the solenoid, or any other connected hyperbolic attracting set, is topologically transitive follows from the following proposition. Proposition 5.S. (a) Let N be compact and connected and I : N -+ N be a C 1 diffeomorphism into N. Assume that N is a trapping region, so that l(cl(N)) C int(N). Let A = nji!O Ji (N) be the attracting set. Assume that I has a hyperbolic structure on A and the periodic points are dense in A. Then I is topologically transitive on A. (b) In fact, if A is a connected hyperbolic attracting set for a diffeomorphism f then f is topologically transitive on A. PROOF. A connected attracting set has a connected trapping region, so part (b) follows from part (a).
9.5 DECOMPOSITION OF HYPERBOLIC RECURRENT POINTS
385
To simplify the notation in the proof, we write H to mean the hyperbolic periodic points in A, H(f) n A. Because there is a hyperbolic structure on A, there is an f > 0 such that if p, q E H with d(p, q) < f then p is h-related to q. Thus, each Hp is open in H. Take Up ::> Hp open in M with Up contained within a distance of f/3 of Hp. Since Hp = H, and H is dense in A, UPEH Up ::> A. Therefore we have an open
U
pEH
cover of A. Next, we claim that two Up and Uq either coincide or are disjoint. If x E Up n Uq, there is a p' E Up and q' E Uq such that d(x, p') < f/3 and d(x, q') < f/3. Then d(p',q') ~ d(p',x) + d(x,q') < (2/3)f < E. By the choice of f, it follows that p' is h-related to q', and so p is h-related to q. Thus Hp = Hq and Up = Uq. Therefore the only time that Up and Uq can intersect is for them to coincide. We have shown that the Up form a cover of the set A by disjoint open sets. Since each teeN) is connected, A is the intersection of the connected sets Ik(N), and A is connected. (See Exercise 5.11.) Therefore, there is only one set Up and so only one equivalence class Hp, H = Hp, and A = cl(Hp). By Proposition 5.2, it follows that I is topologically transitive on A. 0 The next theorem shows that the closure of the periodic points can be split up into a finite union of invariant sets that are topologically transitive if cl(Per(f» has a hyperbolic structure. In some sense, Proposition 5.3 is a special case of this theorem where there is one piece. The theorem also proves that each invariant set has a local product structure which is defined next. Definition. Let A be an hyperbolic invariant set for a diffeomorphism I. We say that A has a local product structure provided there is an r > 0 such that for every p, q E A, W;'(p) n W:(q) c A. Let A be a hyperbolic invariant set. The continuity of the stable and unstable manifolds for points of A implies that there is an ro > 0 such that for any 0 < r ~ ro there is an f = f(r} > 0 for which W;'(p) n W:(q) is a single point for any p,q E A with d(p, q) ~ E, i.e., this intersection is nonempty and a single point. Thus if A is a hyperbolic invariant set with a local product structure and d(p, q) < E, then W;'(p) n W:(q) = {z} c A. Using this fact, it can be shown that given pEA, for small r > 0 a neighborhood ofp in A is homeomorphic to [W:,(p)nA) x [W:(p)nA) by the map h: [W;'(p) n A) x [W:(p) n A) - A h(x,y) = W;'(x) n W:(y). Thus locally the invariant set is homeomorphic to a product, hence the name local product structure. Now we give the Spectral Decomposition Theorem. Theorem 5.4 (Spectral Decomposition). Let M be compact and I : M - M be a C 1 diffeomorphism. (a) Assume that I has a hyperbolic structure on cl(Per(f». Then there are a finite number of sets, AI, ... ,AN, each of which is the closure of disjoint h-classes, such that cl(Per(f)) = Al u··· U AN. Thus each Aj is closed, invariant by I, the periodic points are dense, and I is topologically transitive on A j • EUrther, each Aj has a local product structure. (b) If we assume that L(f) has a hyperbolic structure then M = Uj W'(A j } Uj WU(A j ) where WU(A j ) = {q: a(q) C Aj} and W'(A j } = {q: w(q) C Aj }.
=
IX. GLOBAL THEORY OF HYPERBOLIC SYSTEMS
386
Definition. Assume that cI(Per(f» has a hyperbolic structure for!. Then by the Spectral Decomposition Theorem cI(Per(f» = Al U '" U AN. The sets Ai are called basic sets. REMARK 5.1. By Theorem 4.1, if we assume either (i) A = R(f) has a hyperbolic structure, (ii) A = L(f) has a hyperbolic structure, or (iii) A = n(J) and f satisfies Axiom A, then I\. = c1(Per(J». Therefore both parts of the Spectral Decomposition Theorem are true if we assume that ! satisfies any of the above three assumptions. Since it is possible for n(f) to be hyperbolic and n(f) -F L(J) = cI(Per(f», if we state the theorem in terms of the nonwandering set we need to assume Axiom A and not just that n(f) has a hyperbolic structure. REMARK 5.2. It is not hard to prove that each basic set Ai can be further split up into pieces for which a power of ! is topologically mixing: Ai = U::!I Xi,i with (i) the sets X i .i = cI(WU(p) n W"(p» for some periodic point p E Xi,i' (ii) the sets X i .i are pairwise clL'ljoint, (iii) !(Xj •i ) = X j •HI for 1 ~ i < nj and !(X j •n ,) = X i . I , and (iv) the j IXi,i is topologically n;-power of ! restricted to each Xi,i is topologically mixing, mixing for each j and i. See Bowen (1975b). (See Section 2.5 for the definition of topologically mixing and the difference from topologically transitive.)
r
PROOF. As in the proof of Proposition 5.3, there is an f > 0 such that if p, q E H(f) with d(p, q) < f then p is h-related to q. Therefore if cI(Hp) n c1(Hq) -F 0 then Hp = Hq. That is, the closure of h-classes either agree or are diSjoint. Also c1(Hp) is closed, invariant, and topologically transitive by Proposition 5.2. As in the proof of Proposition 5.3, we can take sets Up :) Hp which are open in M with Up contained within f/3 of Hp. These open sets are disjoint (or coincide) and cover cI(Per(f». Since cI(Per(f» is compact, a finite number of these Up cover. Therefore there are only a finite number of distinct classes Hp.
z'k
FIGURE 5.2. WU(Yk)
Local Product Structure: W:(Xk), W,!'(Xk) , W:(Yk), and
Finally, we prove the local product structure of each Ai = cI(Hpj ). Let X,Y E Ai' and choose Xk, Yk E H pi n Per(f) so that Xk converges to x and Yk converges to y. By the continuity ofthe local stable and unstable manifolds both W,!'(Xk) n W:(Yk) = {zA:} and W,!'(Yk) n W:(XIt) = {zAJ are a single point at a transverse intersection. By the Inclination Lemma, WU(Yk) accumulates on W,!'(Xk), and so WU(Yk) has a transverse homoclinic intersection with W:(Yk) arbitrarily near the point Zk. See Figure 5.2. Each of these transverse homoclinic intersections for a periodic point is in the closure of the
9.5
DECOMPOSITION OF HYPERBOLIC RECURRENT POINTS
387
periodic points, so Zk E cl(PerU)) = Al U ... U AN. Because the Ai are at a finite distance apart, Zk E Ai if l is small enough. As Xk approaches x and Yk approaches y, Zk approaches some Z E W;'(x) n W:(y). Since Ai is closed, it follows that Z E Aj as was to be proved. This completes the proof of part (a). For the proof of part (b) of Theorem 5.4, let p E M. Then w(p) c LU) = Al U··· U AN. The Ai are disjoint and a bounded distance apart, so there exists an v > 0 such that they are distance apart greater than v. We want to show that w(p) must be contained in a single Ai' We assume the opposite and get a contradiction, i.e., we assume that w(p) n Aj f= 0 and w(p) n Ak f= 0 for j -f= k. Let D( Ai, v /3) be the v /3 neighborhood of Ai. Because the sets Ai are invariant, there are neighborhoods Ui of each Ai, such that cl(U,) c D(A"v/3) and !(cl(U,» n D(A t ,v/3) = 0 for t f= i. With the above assumptions on p, there is an increasing sequence of iterates with fn .. (p) E Uj and In .. +! (p) E Uk. Let m, be the largest integer m such that n2, < m < n2i+1 with Im-I(p) E cl(Uj). Then Im,(p) rt Uj by the choice of m,. On the other hand, Im,-I(p) E cl(U]) so Im.(p) = 10 Im.-I(p) rt D(A t ,v/3) :::> Ut for t f= j. Therefore 1m , (p) E M\Ut Ut • By taking a subsequence ofthe mi, we get that Im,(p) accumulates on a point q that is not in any of the At. We have a contradiction because q is in w(p). Thus we have shown that w(p) is contained in a single Aj . Therefore p E W·(A j ). The proof that Q(p) C Ak for some k is similar, so p E W"(Ak). This completes the proof of the theorem. 0
n,
Corollary 5.5. Ifnu) has a hyperbolic structure, then I has a ilrute number of chain components. PROOF. The the Spectral Decomposition Theorem implies that nu) = Al U· .. UAN is a finite decomposition. Since I is topologically transitive on each of the Aj • each of these Aj is a chain component. Therefore. there are a finite number of chain components. 0 The following proposition shows that the stable manifold of a single periodic orbit is dense in the stable manifold of the whole invariant set. This fact clarifies the meaning of a cycle as defined below. Proposition 5.6. Assume A is a basic set for I. (We could assume that A is an isolated invariant set with PerU) dense in A and all the period points of A h-related.) Let pEA n PerU). Then W'(O(p» is dense in W·(A). Similarly, WU(O(p» is dense in WU(A). REMARK 5.3. If I is topologically mixing on a basic set and pEA n PerU), then the stable manifold ofthe single point p, W'(p), is dense in W'(A) and not just WU(O(p». If I is not topologically mixing, then W'(p) is not dense in W'(A) and it is necessary to take the stable manifold of the orbit of p. PROOF. Let pEA n PerU) be as in the statement of the theorem have period n. Let q be any other periodic point in A with period m. By the assumptions q is h-related to p. Thus there is a PI E O(p) such that W·(ptl has a transverse intersection with W"(q). Using the Inclination Lemma applied to Imn, W·(ptl accumulates on the local stable manifold of q, and so by iteration of I mn it accumulates on all of W·(q). Thus W'(O(p» accumulates on the stable manifold of any periodic point. Because the periodic points are dense in A, it follows from the continuous dependence of the stable manifolds on the point that U{W'(q) : q E A n PerU)} is dense in U{W'(q) : q E A} = W·(A). Since W'(O(p» is dense in U{W'(q) : q E AnPerU)}. it follows that W'(O(p» is dense in W·(A). This completes the proof of the theorem 0 for stable manifolds. The proof for unstable manifolds is similar.
388
IX. GLOBAL THEORY OF HYPERBOLIC SYSTEMS
Definition. Assume that c1(Per(f», L(f), or R(f) has a hyperbolic structure for f. (If we assume that n(f) has a hyperbolic structure, then we also have to assume that . c1(Per(f» = n(f).) Let Al u··· U AN be the basic sets given by the Spectral Decomposition Theorem, and define WU(A j ) = WU(Aj ) \ Aj . Define a partial ordering on the basic sets by declaring Aj « Ak if WU(Aj) n W'(Ak) --F 0. A k-cycle is a sequence of basic sets Aj " ... ,Aj • with Ail « Ai> « ... « Ai. « Aj ,. Thus a I-cycle is a basic set Aj with WU(A j ) n W'(A j ) --F 0, i.e., the stable and unstable manifolds intersect off A}. REMARK 5.4. If L(f) (or n(f) or R(f» is hyperbolic then a cycle contains some non-transverse intersections because otherwise all the periodic points are h-related and intersections are all in L(f). The example given above with L(f) --F n(f) is an example for which the limit set is hyperbolic with a 3-cycle, but the nonwandering set is not hyperbolic, i.e., the points of non-transverse intersection are nonwandering but not limit points.
Definition. Assume that LU) or R(f) has a hyperbolic structure. (Alternatively, assume that n(f) has a hyperbolic structure and c1(Per(f» = n(f).) Then / is said to have no cycles or satisfy the no cycle property provided there are no cycles among the basic sets formed from L(f) (or n or R(f». Theorem 5.1. If R(f) is hyperbolic then
f satisfies the no cycle property.
Assume {AjJ~=1 is a k-cycle for /. It is easy to check that all the points ofthe intersections ~VU(Aj;) n W'(Aj;+I) are in R(f). (See Exercise 9.2 dealing with periodic points. Also compare with Exercise 9.18.) Thus these points are not outside the basic sets, so there is no cycle but merely one larger basic set. 0 PROOF.
REMARK 5.5. If we assume that L(f) or nu) has a hyperbolic structure, then the fact that / has no cycles among the basic sets is an additional assumption.
9.6 Markov Partitions for a Hyperbolic Invariant Set Throughout this section, f : M -+ M is a C l diffeomorphism on a manifold M, A is a compact, isolated, hyperbolic invariant set for f with a local product structure. The set A could be a basic set from the spectral decomposition but this is not necessary. We take an adapted norm on tangent vectors at points of A, and d a compatible distance on M. With this distance, two points in a stable manifold get closer together under forward iteration and two points in an unstable manifold get closer together under backward iteration. Therefore /-I(W:(f(p») :> W:(p) and f(W:'(f-l(p») :> W.U(p). Since A has a local product structure, there are f 2: 0' > 0 such that W:(p) n W:'(q) a single point in A whenever p, q E A satisfy d(p, q) ~ 0'. However by taking f 2: 0' > 0 smaller, we can make sure that
whenever p, q E A satisfy d(p, q) ~ 0', and so the intersection of the manifolds on the left hand side is also a single point. Note that the the two sets which are intersected on the left hand side of the equality contain the respective sets on the right hand side. We fix f and 0' with these properties. For this general situation, we need to define a rectangle and then the stable and unstable manifolds in a rectangle. Note that throughout this section the interiors of all subsets of A are taken as subsets of A and not as a subset of the ambient manifold M; thus int(R) means the interior of R in A.
9.6 MARKOV PARTITIONS FOR A HYPERBOLIC INVARIANT SET
389
Definition. A set RcA is a rectangle provided (i) it has diameter less than 0, where 0 is as above, and (ii) p,q E R implies that W:(p) n W;'(q) E R. The rectangle is call proper if in addition (iii) R = cl(int(R)) 80 that it is closed. If R is a rectangle then for pER let W'(p, R) = W:(p) n R
and
W"(p, R) = W.U(p) n R.
Note the comparison of the definition of W'(p, R) with what is used for hyperbolic toral automorphisms in Chapter VII. With theRe definitions we can define a Markov partition as before. Definition. Assume that A has a local product structure for J as above. A Markov partition oj A for J is a finite collection of proper rectangles, 'R = {Rj lj"-l' that satisfy the following four conditions: (i) 11.= Uj:l R j , (ii) if i 1: j then inteR;) n int(Rj ) = 0, (so inteR;) n R j = 0), (iii) ifz E inteR;) nJ-l(int(Rj )) then J(WU(z, R;)) :) WU(J(z), Rj )
and
J(W'(z, R;)) C W'(J(z), Rj ),
and (iv) if z E inteR;) n J-l(int(Rj
))
then
int(Rj ) n J(WU(z, R;) n int(R;)) = WU(J(z), Rj ) n int(R j )
inteR;) n rl(WB(Rj) n int(Rj
))
and
= W'(z, R;) n int(R;).
REMARK 6.1. In the context of the toral Anosov automorphisms, condition (iv) is
needed to insure that the image of a rectangle only crosses another rectangle once; this fact enables large rectangles to be used and still to be able to get a single orbit which passes through the prescribed sequence of rectangles, i.e., to get symbolic dynamics. In the present context, the rectangles found are small. Their small size is used to show that condition (iii) implies condition (iv). (Note the added condition on the size of the local stable manifolds and iterates of the map which we imposed above.) Because the small rectangles automatically satisfy condition (iv), Bowen and other authors do not add a condition like this one to the de~nition of a Markov partition. We can now state the main result of this section which is due to Bowen (1970a). We follow the treatment of Bowen (1975). Theorem 6.1. Let A be a hyperbolic invariant set with a local product structure for a diffeomorphism J. Then there exists a Markov partition of A for I with rectangles arbitrarily small (diameter less than 0). REMARK 6.2. Once there is a Markov partition, then it is possible to define a subshift of finite type EA as for a toral Anosov automorphism and a semi-oonjugacy h : EA -+ A that is finite to one. By Theorem VIII.l.S, the entropy of J/A, h(JIA). is equal to
390
IX. GLOBAL THEORY OF HYPERBOLIC SYSTEMS
the entropy of C1 A, h(ul1: A). By Theorem VIII.1.9, h(ul1: A ) = largest eigenvalue of A.
log(~d
where
~l
is the
PROOF. Let E > 0 and a> 0 be small as above. Let {3 > 0 be such that any {3-chain can
be (a/2)-shadowed in A because A is isolated. Next take -y > 0 with -y '5: min{{3/2, a/2}, that if d(x, y) < -y then d(f(x), f(y» < (3/2 and d(f-l(x), rl(y» < {3/2. Let P = {Pl, ... , Pr} be a finite set of -y-dense points in A. Let B = (b'j) be the transition matrix with b _ d(f(Pi),Pj) < (3 'J 0 d(f(Pi),Pj) ~ {3.
so
{I
Because of the choice of 'Y, for each i, there is at least one j such that bij = 1. Let 1:8 be the two sided subshift of finite type determined by B. Then the cylinder sets Cj = {s E 1:8 : So
= j}
form a Markov partition of 1:8. Remember that for s E 1: 8 , WI~(S,C18)
WI~c(S,C18)
If s, s' E 1:8 with So
= {tE1:8 = {tE1:8
: ti
= Si for i
:ti
= Si fori '5: O}.
~
O}
and
= s6, then ~~(S,C18)
n WI~(S',U8) = {SO}
with s· E 1:8 where S!
,
= {Si
s;
for i ~ 0 and for i '5: o.
We use this partition for 1:8 to construct a partition for A. We define a map 8: 1:8
-+
A
which we use to take a rectangle in 1:8 to a rectangle in A. For each s E 1:8, let 8(8) be the point z E A which (a/2)-shadows the ,B-chain {P'; }~-oo. The main properties of 8 are contained in the following lemma. Lemma 6.2. The map 8: 1:8 -+ A is continuous and onto. Further, ifs,s' E 1:8 have 80 (are in the same cylinder set of the partition of 1:8 ), then
So =
d(8(s), 8(8') < a
c W:(8(8),f), c W:(8(s'),f), and 8(WI~(s, (8) n WI~(S',C18» = W:(8(s), f) n W:(8(s'), f). 8(WI~(S,C18»
8(~~(8',C18»
PROOF. To show 8 is onto, let Z E A. For each j let P'; be chosen within -y of
Then d(f(p';)'P'j+,) '5: d(f(p.;),fi+l(z» +d(fj+l(Z),V.,+,)
'5: {3/2 + 'Y '5: {3,
Ji(z).
9.6 MARKOV PARI'ITIONS FOR A HYPERBOLIC INVARIANT SET
391
so {p'j} is a /1-chain. The orbit of z 1'-shadows so it (O'/2)-shadows this /1-chain; thus lI( s) = z and II is onto. If two sequences s,s' E E8 have So = then lI(s) and lI(s') are both within 0'/2 of P' o so are within a of each other. Note that two symbol sequences s and a' are close if Sj = sj for -n :s j :s n for some large n. Thus. and .' correspond to two /1-chalns which agree for a large number of points. By an argument like we have given in earlier sections, lI(s) and lI(s') are nearby points. This shows that II is continuous. For the properties about stable manifolds stated in the lemma, take a" E WI~(S,U8). Then ill = Sj for j ~ O. Then both 8(a) and 8(a") (O'/2)-shadow the same forward /1chain, so d(fj 0 lI(a"), Ij o8(s» :s a :s f
so'
for j
~
O. It follows that
lI(s') E W!(lI(s),J) 8(WI~c(S,0'8» C
c
W.'(lI(s),J)
or
W:(8(s),f).
The proof that lI(WI~(S,0'8»
is similar using j :s o. Finally, assume let s, s' E E8 with So
where
S~ =
W:(lI(s),J)
= so.
{Sj
•
c
Then
for j ~ 0 and for j :s O.
sj
By above 8(s") E W:(8(s), f) 8(WI~c(s, 0'8)
n W:(8(s'), f)
= W:(lI(s), I) n W,"(8(s'), J),
or
n WI~c(S', 0'8» = W:(8(s), f) n W." (lI(s'), f)
o
as claimed.
We check in Lemma 6.3 below that the sets Tj = 8(Cj
)
= {8(s) : s E E8 and So = j}
are rectangles in A for 1 :s j :s r. Since 8 is continuous, each of the Tj is closed. We do not know that these rectangles are proper. Also, the collection of these rectangles might not have disjoint interiors. But, they d,· ..[ 11 cover of A, and Lemma 6.3 also checks the first condition on the stable and unstable manifolds in the definition of a Markov partition.
Lemma 6.3. The collection of sets {Tj (a) Each Tj is a rectangle in A. (b) The {Tj } cover A, A = U;_I Tj • (c) I£x = 8(s) for a E E8, then I(W'(x,T,o» I(W"(x, T. o
c
:
1 :s j
:s r} satisfy the following conditions.
W·(f(x),T,.)
»:::> W"(f(x), T•• ).
and
388
IX. GLOBAL THEORY OF HYPERBOLIC SYSTEMS
Definition. Assume that cl(Per(f)), L(f), or n(f) has a hyperbolic structure for f. (If we assume that n(f) has a hyperbolic structure, then we also have to assume that cl(Per(f» = n(f).) Let Al u··· U AN be the basic sets given by the Spectral Decomposition Theorem, and define WU(A j ) = WU(A j ) \ Aj . Define a partial ordering on the basic sets by declaring Aj « Ale if W"(A j ) n W'(AIe) =I 0. A k-cyc/e is a sequence of basic sets AJt , ... ,Aj • with AJt « A;. « ... « Aj • « Ail' Thus a I-cycle is a basic set Aj with W"(Aj) n ~V'(Aj) =I 0, i.e., the stable and unstable manifolds intersect off AJ . REMARK 5.4. If L(f) (or n(f) or n(f)) is hyperbolic then a cycle contains some non-transverse intersections because otherwise all the periodic points are h-related and intersections are all in L(f). The example given above with L(f) =I n(f) is an example for which the limit set is hyperbolic with a 3-cycle, but the nonwandering set is not hyperbolic, i.e., the points of non-transverse intersection are nonwandering but not limit points.
Definition. Assume that L(f) or n(f) has a hyperbolic structure. (Alternatively, assume that nUl has a hyperbolic structure and cl(Per(f) = nu).) Then f is said to have no cycles or satisfy the no cycle property provided there are no cycles among the basic sets formed from L(f) (or n or n(f». Theorem 5.7. If n(f) is hyperbolic then f satisfies the no cycle property. PROOF. Assume {AjJ~=1 is a k-cycle for f. It is easy to check that all the points of the intersections ~VU(Aj;) n W'(Aj;+I) are in n(f). (See Exercise 9.2 dealing with periodic points. Also compare with Exercise 9.18.) Thus these points are not outside the basic 0 sets, so there is no cycle but merely one larger basic set. REMARK 5.5. If we assume that L(f) or n(f) has a hyperbolic structure, then the fact that f has no cycles among the basic sets is an additional assumption.
9.6 Markov Partitions for a Hyperbolic Invariant Set Throughout this section, f : M -+ M is a C l diffeomorphism on a manifold M, A is a compact, isolated, hyperbolic invariant set for f with a local product structure. The set A could be a basic set from the spectral decomposition but this is not necessary. We take an adapted norm on tangent vectors at points of A, and d a compatible distance on /If. With this distance, two points in a stable manifold get closer together under forward iteration and two points in an unstable manifold get closer together under backward iteration. Therefore rl(W:U(p») :) W:(p) and f(W.u(f-I(p») :) W."(p). Since A has a local product structure, there are t ~ 0 > 0 such that W:(p) n W."(q) a single point in A whenever p, q E A satisfy d(p, q) ~ o. However by taking t ~ 0 > 0 smaller, we can make sure that
whenever p, q E A satisfy d(p, q) ~ 0, and so the intersection of the manifolds on the left hand side is also a single point. Note that the the two sets which are intersected on the left hand side of the equality contain the respective sets on the right hand side. We fix t and 0 with these properties. For this general situation, we need to define a rectangle and then the stable and unstable manifolds in a rectangle. Note that throughout this section the interiors of all subsets of A are taken as subsets of A and not as a subset of the ambient manifold M; thus int(R) means the interior of R in A.
9.6 MARKOV PAIUITIONS FOR A HYPERBOLIC INVARIANT SET
389
Definition. A set RCA is a rectangle provided (i) it has diameter less than 0, where 0 is as above, and (H) p, q E R implies that W:(p) n W;'(q) E R. The rectangle is call proper if in addition (iii) R = cl(int(R» so that it is closed. If R is a rectangle then for pER let W'(p,R)
= W:(p)nR
and
W"(p, R) = w:'(p) n R. Note the comparison of the definition of W· (p, R) with what is used for hyperbolic toral automorphisms in Chapter VII. With these definitions we can define a Markov partition as before. Definition. Assume that A has a local product structure for 1 as above. A Markov partition 01 A for 1 is a finite collection of proper rectangles, 'R = {Rj }j'=l' that satisfy the following four conditions: (i) A = U7'=l R j , (H) if i # j then int(~) n int(R j ) = 0, (so int(~) n Rj = 0), (iii) if z E inteR;) n ,-l(int(Rj )) then J(WU(z, ~» :::> WU(f(z), Ri ) I(W'(z,~)) C
W'(f(z), Rj
and
),
and (iv) if z E inteR;) n 1-1(int(Rj » then int(Rj ) n J(WU(z,~) n int(~» int(~)
n r1(W'(R j ) n int(Rj
))
= WU(f(z), R j ) n int(Rj ) = W'(z,~) n int(~).
and
REMARK 6.1. In the context of the toral Anosov automorphisms, condition (iv) is
needed to insure that the image of a rectangle only crosses another rectangle once; this fact enables large rectangles to be used and still to be able to get a single orbit which passes through the prescribed sequence of rectangles, i.e., to get symbolic dynamics. In the present context, the rectangles found are small. Their small size is used to show that condition (iii) implies condition (iv). (Note the added condition on the size of the local stable manifolds and iterates of the map which we imposed above.) Because the small rectangles automatically satisfy condition (iv), Bowen and other authors do not add a condition like this one to the de~nition of a Markov partition. We can now state the main result of this section which is due to Bowen (1970a). We follow the treatment of Bowen (1975). Theorem 6.1. Let A be a hyperbolic invariant set with a local product structure for a diffeomorphism J. Then there exists a Markov partition of A for f with rectangles arbitrarily small (diameter less than 0). REMARK 6.2. Once there is a Markov partition, then it is possible to define a sub6hift of finite type E" as for a toral Anosov automorphism and a semi-conjugacy h : E" - A that is finite to ODe. By Theorem VIII.l.8, the entropy of !lA. h(!lA), is equal to
390
IX. GLOBAL THEORY OF HYPERBOLIC SYSTEMS
the entropy of OA. h(O'IEA)' By Theorem VIII.1.9. h(O'IEA) largest eigenvalue of A.
= 10g(At} where Ai
is the
f > 0 and a > 0 be small as above. Let {3 > 0 be such that any {3-chain can be (a/2)-shadowed in A because A is isolated. Next take "{ > 0 with "{ $ min{{3/2. a/2}. so that if d(x.y) < "( then d(f(x),j(y» < (3/2 and d(f-i (X),j-i (y)) < {3/2. Let l' = {Plo ...• Pr} be a finite set of "{-dense points in A. Let B = (bij) be the transition matrix with d(f(Pi), Pi) < (3 ij = o d(f(Pi), Pi) ~ {3.
PROOF. Let
b {I
Because of the choice of "{. for each i, there is at least one j such that bii = 1. Let EB be the two sided subshift of finite type determined by B. Then the cylinder sets Cj
= {s E EB : So = j}
form a Markov partition of EB. Remember that for
If s.s' E EB with
S E
EB,
WI~(S'O'B)
= {t E EB : ti = Si for i
~
WI~(S,O'B)
= {t E EB
$ OJ.
So
=
: ti
= Si for i
O}
and
so, then
with s· E EB where
S~ = {Si 1
S;
for i ~ 0 and for i $ O.
We use this partition for EB to construct a partition for A. We define a map 8: EB
-+
A
which we use to take a rectangle in EB to a rectangle in A. For each s E EB, let 8(s) be the point Z E A which (a/2)-shadows the {3-chain {P' j }~-oo. The main properties of 8 are contained in the following lemma. Lemma 6.2. The map 8 : EB -+ A is cO'ntinuO'us and O'ntO'. Further, if s, s' E EB have So = So (are in the same cylinder set O'f the partitiO'n O'fEB), then d(8(s),8(s') < a 8(WI~(s. O'B» C W~(8(s), I),
8(WI~(S"O'B))
8(WI~(S'O'B)
c
W:(8(s'),I),
and
n WI~(S'.O'B» = W:(8(s), I) n w;'(8(s'). I).
PROOF. To show 8 is onto, let Z E A. For each j let P'j be chosen within "( of /i(z).
Then d(f(p'j)' P'HI) $ d(f(p.j ), /i+i(z» + d(fi+1 (:1:). P.,+,) $ {3/2+,,{ $ {3,
9.6 MARKOV PARTITIONS FOR A HYPERBOLIC INVARIANT SET
391
so {p,,} is a .a-chain. The orbit of z -y-shadows so it (0/2)-shadows this .a-chain; thus 9(s) = z and 9 is onto. If two sequences s, s' E EB have So = then 9(s) and 9(s') are both within 0/2 of Pso so are within or of each other. Note that two symbol sequences s and s' are close if Sj = sj for -n ~ j ~ n for some large n. Thus s and s' correspond to two .6-chains which agree for a large number of points. By an argument like we have given in earlier sections, 9(s) and 9(s') are nearby points. This shows that 9 is continuous. For the properties about stable manifolds stated in the lemma, take s· E W,!.c(S,O"B). Then = Sj for j 2: o. Then both 9(s) and 9(s·) (or/2)-shadow the same forward .6chain, so d(fj 0 9(s·), Ij 0 9(s» ~ 0 ~ f
so'
s;
for j 2: O. It follows that 9(s·) E
W~(9(s), f)
c
W.'(9(s), I)
or
9(W,!.c(s, UB» C W~(9(s), f).
The proof that 9(W,~(S,UB»
is similar using j ~
o.
Finally, assume let s,s' E EB with So =
where
S! = {Sj •
c
W~(9(s),f)
so. Then
for j 2: 0 and for j ~ o.
sj
By above 9(s·) E W~(9(s), f) n W~(9(S/), f) = W:(9(s), f) n W.U(9(s'), f), 9(W,~c(s, UB)
or
n ~~(s', UB» = W:(9(s), f) n W;'(9(s'), f)
o
as claimed.
We check in Lemma 6.3 below that the sets Tj
= e(Cj ) = {e(s) : s E EB and So = j}
are rectangles in A for 1 ~ j ~ T. Since 9 is continuous, each of the T j is closed. We do not know that these rectangles are proper. Also, the collection of these rectangles might not have disjoint interiors. But, they J... .., 11 cover of A, and Lemma 6.3 also checks the first condition on the stable and unstable manifolds in the definition of a Markov partition. Lemma 6.3. The collection of sets {Tj (a) Each Tj is a rectangle in A. (b) The {Tj } cover A, A = U;~l Tj • (c) Ifx = 9(s) for s E EB, then
I(W'(x, T. o f(WU(x,T. o
:
1 ~ j ~ T} satisfy the following conditioDS.
»c W'(f(x), T•• )
»::> WU(f(x),T•• ).
and
392
IX. GLOBAL THEORY OF HYPERBOLIC SYSTEMS
PROOF. (a) The diameter ofTj is less than 0, because any two points in a Tj are within 0/2 of the same point Pj. Thus condition (i) in the definition of a rectangle is true. Let x, y E Tj with 8(8) = x and 8(8') = y. Then 8 and 8' are in the same cylinder set Cj , 8'
= Wk,.,(8,O'B) n WI~(S"O'B) E Cj , = 8(s*) E Tj •
and
W:(8(s), I) n W.U(8(s'), f)
Thus condition (ii) in the definition of a rectangle is true, and the Tj are rectangles. (b) Since 8 is onto, the {Tj } cover A. (c) If y E WO(x, Too) c W:(x, I), then y = W:(x, I) n W.U(y, I). On the other hand, if y = 8(8') define 8'
• Sj
= Wk..,(8,O'B) n WI~(8"O'B)' {Sj
= sj
for j ~ 0 for j :S O.
and
By Lemma 6.2,
8(s') = W:(8(s), I) n W:'(8(8') , I), so 8(s') = y. Also (i) 0'8(S')j = 0'8(S)j = Sj+l for j ~ 0 and 8(0'8(S» = f(x) so 8(0'8(S'» E W:(f(x), I), and (ii) 0'8(S')j = 0'8(8')j = Sj+l for j :S 0 and 8(0'8(S'» = f(y) so 8(0'8(S'» E W.u(f(y),f): 8(0'8(S'» E W:(f(x), I) n W:'(f(y), I) = {J(y)}, 8(0'8(S'» = f(y).
But 8(O'B(S'» E To .. so f(y) E W'(f(x), To.), f(W'(x, Too» are done. A similar argument proves the last inclusion,
c
W'(f(x), To.), and we
o To get the other two properties of the Markov partition, we need to subdivide the Tj's. The subdivision uses the different parts of the boundary of the rectangles. Therefore, before making the refinements of the covering, we characterize the different parts of the boundary of a rectangle in the following lemma. Lemma 6.4. Let R be a closed rectangle of A. The boundary of R (relative to A) can be written as the union of two subsets, 8(R) = 8"(R) u 8 U (R), where
8"(R) = {x E R : x aU(R) = {x E R : x
rt int(WU(x, R))} rt int(WO(x,R))}.
and
(The interiors of W·(x. R) and WU(x. R) in the above definitions of 8"(R) and 8 U(R) are as subsets of W:(x) n A and W.U(x) n A respectively.) REMARK 6.3. The notation is chosen so that 8"(R) is made up of pieces of stable manifolds and 8 U (R) is made up of pieces of unstable manifolds. See Figure 6.1. PROOF. It is sufficient to prove that int(R) is equal to the set R \ (8"(R) U OU(R».
9.6 MARKOV PARrITIONS FOR A HYPERBOLIC INVARIANT SET
393
x lwU(x) WS(y) . ...•......... ........ y
FIGURE 6.l. The Points x E ~(R) and Y E 8"(R)
U
If x E int(R) then W"(x, R) ,." R n W:(x) is a neighborhood of x in W:(x) n A for = 8, u. This proves that int(R) C R \ (~(R) U 8"(R». Conversely, assume
x x
E R \ (~(R) U 8"(R))
or
n int(W"(x, R».
E int(W'(x, R»
For yEA near to x,
y' == W:(x) n W."(y) E int(W'(x, R»
and
y" == W:(y) n W;'(x) E int(W"(x, R»
since the intersections depend continuously on y. See Figure 6.2. Thus
y = W:(y") n W."(y') for y' E int(W'(x, R» and y" E int(W"(x, R». The set of such points forms a neighborhood of x in A, so x E int(R). 0
· •
U
•
y . ....:y.......... :..... ~
·:X
~
·.· :
s
.... ~ •.•.....•• :)(....
·· ·
FIGURE 6.2.
.
Determination of Points y' and y"
For x E A, we associate elements of the cover T = {T1' ... , Tr T(x) = {Tj E T : x E 1j} T"(x) = {TAo E T : TAo n Tj
of A by letting and
"# 0 for some 1j
Since T is a cover of A, r
Z = A\
}
U 8(T;) j .. 1
E T(x)}.
394
IX. GLOBAL THEORY OF HYPERBOLIC SYSTEMS
is open and dense in A. We also need to remove the extensions of the a'(Tj ) by stable manifolds and the extensions of the au Cr,) by unstable manifolds, so we define the set Z' = {x E A : W:(x) n 8'(T,,) = 0 and W.U(x) naU(T,,) = 0 for all T" E T·(x)}.
This set is also open and dense by an argument like Lemma 6.4. Now we fix Tj E T and subdivide it for T" with T j n T" "i 0 as follows: T}." = T j nT" = {x E Tj : WU(x, Tj ) n T"
Tl~" = {x E Tj : WU(x, Tj ) n T" Tl." = {x E Tj : WU(x, Tj ) n T"
Tf." =
{x E Tj
:
"i 0 and "i 0 and = 0 and
W'(x, T j ) n T"
"i 0}
W'(x, Tj ) n T" = 0} W'(x, Tj ) n T"
"i 0}
WU(x, Tj ) n T" = 0 and W'(x, T1 ) n T" = 0}.
See Figure 6.3. With these definitions, T],j = Tj and TL = 1j~j = TL = 0. Each of the I;,,, is a rectangle as follows from the following observation: if x, y E T j and z = U"(x, T1 ) n WU(y, Tj ), then W'(z, Tj ) = W'(x,Tj ) and WU(z,Tj) = WU(y, Tj ). I
4
T.k j.
- --3 -T j.'k
I
2
: T'j. k I
~
W
W
u
S
T.j.lk Tk
FIGURE 6.3. Subdivision of Rectangles For x
E
Z', we take intersections of the T}:" to define a rectangle at x,
Since there are only a finite number of sets involved, R(x) is open, and nonempty since x E R(x). Since each of the I;,,, is a rectangle, each int(Tj~") is a rectangle, so R(x) is a rectangle. A finite collection of the c1(R(x)), R, form the Markov partition claimed in the theorem. The following lemma proves properties of the R(x)'s used to verify the conditions of a Markov partition in the sequence of lemmas which follow. Notice that R is a refinement of the cover T. Lemma 6.5. (a) For x, y E Z', either R(x) = R(y) or R(x) n R(y) = 0. (b) For x E Z', a(R(x)) n Z· = 0. (c) For x E Z', int(c1(R(x))) = R(x), i.e., the rectangles c1(R(x)) are proper. PROOF. AssumethatR(x)nR(y) "i 0forx,y E Z·. Thereisaz E R(x)nR(y)nZ*. By the definition of R(x), if T j E T(x) then z E 1~I.J = 1j, so Tj E T(z) and T(z) ::) T(x). On the other hand, if Tj r1. T(x), then Tj n R(x) = 0, so z ~ Tj and T(z) C T(x).
9.6 MARKOV PARTITIONS FOR A HYPERBOLIC INVARIANT SET
395
Combining, we get T(z) = T(x). Similarly, T(y) = T(z) so T(y) = T(x). Thus, if R(x) n R(y) 1= 0 then R(x) = R(y). This proves part (a). Assume y E o(R(x» n Z·. Since R(y) is a neighborhood of y in A and y E o(R(x», it must be that R(y) n R(x) = 0 or else R(x) = R(y) and y is not a boundary point. But also this neighborhood R(y) must not intersect R(x) so y can not be a boundary point. This is a contradiction and shows that o(R(x» n Z· = 0. This proves part (b). Since Z· is dense in A and o(R(x»nZ· = 0, int(o(R(x») = 0, and so int(cl(R(x))) = R(x). 0 There are only a finite number of different R(x) because there are only finitely many Tj and T?k' Let n = {cl(R(x» : x E Z·} = {R), ... ,R",} be an enumeration of this finite collection of rectangles. Lemma 6.5(c) proves that the rectangles are proper. Since the collection of the interiors, {int(Rd, ... , int(R",)}, cover Z· which is dense in A, the collection 'R satisfies the first condition for a Markov m
partition,
UR
j
= A. By Lemma 6.5(a,c), condition (ii) for a Markov partition is
j=1
true: int(Rj ) n inteR,,) = 0 for j 1= k. We are only left to show that the stable and unstable manifolds relative to the rectangles behave properly, conditions (iii) and (iv) for a Markov partition. These conditions follow from the lemmas which follow. Lemma 6.6. The rectangles in 'R satisfy condition (iii) in the definition of a Markov Partition: if x E int(~) n 1- 1(int(R j then
»,
I(W'(x,~» I(WU(x,~»
c
W"(f(x),R j ) :;) WU(f(x), Rj ).
and
We prove only the inclusion for the stable manifolds, and remark how the case for the unstable manifolds follows. The first step in the proof of this lemma is the following sublemma. Sublemma 6.1. Assume x, y E Z· n 1- 1 (ZO), R(x) = R(y), and y E W:(x). Then, R(f(x» = R(f(y». PROOF. The first step is to show that T(f(x» = T(f(y». Assume that I(x) E T j • Then there is a s E E8 with x = 6(8) and 81 = j. Let s' be the point in E8 with 8i1 = 80 and y = 6(s'). If we let
. = {8;,
Si
8;
for i ~ 0 for i $ 0,
and
then y = 6(s·) as we argued before. Since y = 8(sO) and 8j = j, I(y) E 1';. Thus T(f(x» c T(f(y». The other inclusion is proved by reversing the roles of x and y, 80 T(f(x») = T(f(y». Now assume I(x), f(y) E Tj and Tj n T" 1= 0. We need to show that I(x) and I(y) are in the same 1'?". Note that W"(f(x), Tj ) = W"(f(y),1';), so both these stable manifolds either intersect T" or neither intersects T". If I(x) and f(y) are not in the
396
IX. GLOBAL THEORY OF HYPERBOLIC SYSTEMS
tw
,, : ,
f(y)
-
~f(x)
.............•. :.......
,
-- ---
u
WS
If(z)
Tj
Tk
FIGURE
6.4
same T;\, then one ofthe unstable manifolds WU(f(x), 1j) and WU(f(y), 1j) intersects Tk and the other does not. Assume that J(z) Ewu(f(X), T j ) n Tk i: 0 W U(f(y),1j) n Tk = 0.
and
See Figure 6.4. Let S E E8 be such that J(x) = 80 o'(s) with 81 = j. In Lemma 6.3 we proved that it follows that WU(f(x), T j ) c f(WU(x, Too)), so J(z) E f(WU(x, Too)) and z E WU(x,T.o)' Let s· E E8 be such that z = 8(s*) with 8i = k. Thus T. o E T(x) = T(y) and z E Too n T.; i: 0. Since z E WU(x, T. o) n T.; and R(x) = R(y) (so they are in the same r.:,.oo)' the intersection WU(y, T. o ) n T.; i: 0. In fact, there is a point
z' = W:(z) n W.U(y) = W'(z, Tao) n WU(y, Too)'
See Figure 6.5. Therefore f(z') = W:(f(z)) n W.u(f(y)) = W°(f(z), Tk)
n WU(f(y),1j).
This contradicts the fact that WU(f(y), T j ) n Tk = 0. Therefore f(x) and J(y) are in the same 1}:k' This is true for an arbitrary Tk, so R(f(x)) = R(f(y)). 0
·:Y
,: ,I:: X
:
,:
-
.... ·t- ................ ,.~ ....... .
WS
-rzi - - +::-z-ir----, ···t·········
··
..... . ..
.~
FIGURE
6.5
9.6 MARKOV PARTITIONS FOR A HYPERBOLIC INVARIANT SET
397
PROOF OF LEMMA 6.6. Let x E Z· n r'(Z·). The set
W'(x, R(x» n Z· n r'(Z·) is open and dense in' W"(x, cl(R(x» by an argument like Lemma 6.4. By Lemma 6.7 and continuity, f(W'(x,cl(R(x» C cl(R(f(x») n W:(f(x» C W'(f(x), cl(R(f(x))). If int(R;)nrl(int(Rj» '" 0, then this open subset of A contains an x E ZOnf-'(ZO) with R; = cl(R(x» and R j = cl(R(f(x))). Therefore for such x,
f(W"(x,R;» C W·(f(x),R j ). For any y E inteR;) n f-'(int(Rj
»,
f(W'(y, R;» = f{W:(y) n W.U(z) : z E W'(x, R i }} = {/(W:(y» n f(W.U(z» : z E W'(x, R;)} C {/(W:(y» n W:'(z') : z' E W"(f(x), R j }} C W'(f(y), R j ). This proves the lemma for the stable manifolds. The result for the unstable manifolds follows by applying the above argument to (or modifying it for the unstable manifolds directly).
,-I. 0
Lemma 6.S. The rectangles in R satisfy a strong version of condition (iv) in the definition of a Markov Partition: i(x E int(R;) n r'(int(Rj then
»
W'(x, R i ) = R; n r'(W"(f(x), R j » W"(f(x), R j ) = f(WU(x, R;» n R j •
and
PROOF. By Lemma 6.6,
W'(x,R;) c R; n r1(W'(f(x),Rj» C R; n r1(W:(f(x))) = W'(x, R;). The last equality follows by the choices of f for the size of the stable and unstable manifolds made early in the section. Therefore
W"(x, R;) = R; n r1(W"(f(x), R j » as claimed. The argument for the unstable manifolds is similar. This proves the lemma and completes the proof of the theorem. 0
398
IX. GLOBAL THEORY OF HYPERBOLIC SYSTEMS
9.7 Local Stability and Stability of Anosov Diffeomorphisms In this section we prove the semi-stability and structural stability of Anosov dlffeomorphisms and also the stability of a single basic set. The proofs given are in the spirit of that of Anosov, and repeat the construction of stable and unstable manifolds. We start with an Anosov diffeomorphism. Throughout the section M is a compact manifold (or at least the invariant set A is a compact invariant set). Definition. Consider the set of homeomorphisms on a compact manifold M. If we lise the usual CO-sup topology on functions, then the set of homeomorphisms are not open and so not a complete space. Therefore to study perturbations 9 of a homeomorphism I which we want to remain a homeomorphism, we require that both do(f,g) and do(f-I,g-I) to be small, i.e., we use the metric dhomeo(f,g) = max{do(f,g),do(rl,g-In = sup{d(f(x),g(x)),d(rl(x),g-I(x)) : x EM}.
The space of homeomorphisms is complete in terms of the metric dhomeo. Let I : M -+ M be a homeomorphism (or diffeomorphism). We say that I is semistable provided there exists a £ > 0 such that for every homeomorphism 9 : M -+ M with the CO distance from I to 9 and from I-I to g-I less than £, dhomeo(f,g) < £, there exists a h : M -+ M which is a semi-conjugacy from 9 to I, with h continuous, onto, and hog = I 0 h. These conditions imply that the dynamics of 9 are at least as complicated as those of I. An example of two diffeomorphisms which are semi-conjugate is where I is a hyperbolic toral automorphism on 11'2 and 9 the DA-diffeomorphism constructed from I. In fact, this I is semi-stable as the following theorem of Walters (1970) proves. Theorem 7.1 (Semi.Stability of Anosov Diffeomorphisrns). Assume M is a compact manifold and that I : M -+ M is an Anosov diffeomorphism on a compact (f has a hyperbolic structure on all of M). Then I is semi-stable. REMARK 7.1. The proof we give is in the spirit of that given by Anosov, although the ideas are filtered through the concept of shadowing. A different type of proof was given by Moser (1969). This latter proof solves a functional equation using a contraction mapping. PROOF. Let 9 be a homeomorphism within t > 0 of I, dhomeo(f,g) < t. Then for each x E M, {gJ(x)}~_oo is an t-chain for I. Given." > 0, there is an t > 0 such that all t-chains can be uniquely .,,-shadowed. Let y = h(x) be the point that .,,-shadows gi(x), d(P 0 h(x), gJ (x)) < .". We now check the properties of h. Because the point which shadows is unique, h is a well defined function.
Claim 1. The map h is continuous. PROOF. Let Vp(r) = expp(Bp(r)) be the neighborhoods of p in M defined before in the proof of shadowing. The proof of the shadowing result shows that
n 00
h(x) =
li(Vg-;(x)(r)).
i=-oo
Given." > 0, there is a N such that all points in the finite intersection
n N
i=-N
li(Vg-l(x)(r))
9.7 LOCAL STABILITY AND STABILITY OF ANOSOV DIFFEOMORPHISMS
399
are within TJ/2 of hex). Then for y near enough to x, all points in the finite intersection N
n
li(Vg-i(y)(r»
i=-N
are within TJ of hex). Since
n
N
00
hey) =
P(Vg-i(y)(r)) c
n
li(Vg-i(y)(r»,
j--N
we have that d(h(x),h(y» < TJ. This proves the continuity of h.
o
Claim 2. hog = loh. PROOF. Using the point
d(gi
0
g(x) to TJ-shadow,
g(x), po h 0 g(x» = d(gHI(X), p+!
0
rio h 0 g(x» < TJ.
But also
d(gi+!(x), Ii+! By the uniqueness of h(x), I-I
0
h 0 g(x)
0
hex»~
< TJ.
= h(x), or h 0 g(x) = 10 hex).
o
Claim 3. The map h is onto. PROOF. We argue that h is onto using ideas from algebraic topology. (In the proof of Theorem 7.3 below, h is one to one and there is another proof which uses the Invariance of Domain Theorem.) The map h is within TJ of the identity map and can be continuously deformed into the identity (h is homotopic to the identity). The top homology group of M, Hn(M) where n = dim(M), is nontrivial and the identity induces an isomorphism on this group. Because h is homotopic to the identity, it also induces an isomorphism on Hn(M). Since any map which induces an isomorphism on Hn(M) is onto, h is onto.
o
These claims combine to prove Theorem 7.1.
0
REMARK 7.2. Let 9 be the DA-diffeomorphism constructed from the hyperbolic toral automorphism I. Then the semi-conjugacy h from 9 to I is not one to one. In fact, h takes the line segments in W'(q,J) between W'(Pltg) and W'(Jl2,g) and collapses them to points.
Theorem 7.2 (Openness of Anosov Dift'eomorphisms). Assume M is a compact manifold. The set of Anosov diffeomorphisms on M is open in the CI topology. We leave the proof of this theorem as an exercise. See Exercise 9.27. (The proof uses cones.)
Theorem 7.3 (Structural Stability of Anosov Diffeomorphisms). Assume that M is a compact manifold and I : M -+ M is an An080v diffeomorphism. (The map I has a hyperbolic structure on all of M.) Then I is structurally stable (under all small C I perturbations). REMARK 7.3. This result was first proved by An080v (1967). Also see Moser (1969) and the appendix by Mather in Smale (1967). PROOF. The previous result showed there is a semi-conjugacy, h, with hog = 10 h. We only need to show that h is one to one. Assume that hex) = hey). Then hogi(x) =
IX. GLOBAL THEORY OF HYPERBOLIC SYSTEMS
400
=
=
Ii 0 hex) Ii 0 hey) h 0 gi(y). Therefore d(gi(x),gi(y» < 2'1. Since 9 is An08Ov, it is expansive. Therefore if '1 is small enough, x = y. (The expansive constant can be shown to be uniform for a neighborhood of /.) In this case there are alternative proofs that h is onto that do not use the induced maps on the top homology group. One such proof is given below in the case of a hyperbolic invariant set. Another proof uses the fact that h is one to one. By the Invariance of Domain Theorem, h is an open map so heM) is open in M. Because M is compact, heM) is compact and so closed. If M is connected, the the image heM) both open and closed in M and 80 is all of M, and h is onto. (This is the same type of proof that we used for the Hartman-Grobman Theorem, Lemma V.7.S.) If M is not connected, then the above argument can be applied to the connected components 80 show that h is onto. 0 Finally, we want to give a version of this last theorem for a hyperbolic i80lated invariant set. See Hirsch and Pugh (1970). Theorem 7.4 (Stability of an Hyperbolic Invariant Set). Let A, be a compact hyperbolic i80lated invariant set for / : M -+ M with isolating neighborhood U. There is an ( > 0 such that if 9 is a C l map within ( of /, then 9 has a hyperbolic structure on A, n{gi(U) : j E Z}, and there exists a homeomorphism h : A, -+ A, (onto) that gives a topological conjugacy.
=
PROOF. Given '1 > 0 there exists a 6 > 0 such that any 6-chain within 6 of A, can be uniquely rrshadowed by an orbit in A,. Take N such that
n N
lieU)
c
{q : d(q,A,)
< 6/2}.
i=-N
There exists a CO neighborhood
n
N' of / such that for
9 in
N'
N
gi(U) C {q: d(q,A,) < 6/2},
i=-N
=
and gi(p) is a 6-chain for / for any p E ntgi(U): - N $ j $ N}. Let A, n{gi(U) : j E Z}. Using cones, it can be shown that if N c N' is a small enough neighborhood of / in the Cl topology, then for 9 E N, 9 has a hyperbolic structure on A,. (See Exercise 9.28.) Take gEN. For PEA" gi(p) is a 6-chain for /. Therefore there is a unique h(p) E A, such that d(fi oh(p),gi(p» < '1. This defines h: A, -+ A,. Just as q in the case of an An080v diffeomorphism, the fact that the shadowing is unique proves that hog /0 h. The map h is continuous, just as for an An080v diffeomorphism. Because 9 has a hyperbolic structure on A" we can define a map Ie : A, -+ A, such that Ie 0 / go Ie. In fact, if YEA" then Ii (y) is an 6-chain for g. Because 9 has a hyperbolic structure on A" this chain can be uniquely shadowed by gi (py) for Py Ie(y) with d(gi(py),Ji(y» small. Just as for h, Ie is continuous and satisfies leo/ gole. The following claim proves that Ie is the inverse for h, 80 h is one to one and onto. This claim completes the proof of Theorem 7.4. 0
=
=
=
=
=
Claim. The map h is a homeomorphism between A, and A, and 80 is a conjugacy. In fact, Ie is the inverse of h as a map between A, and A,. PROOF. For pEA" d(fi 0 h(p),gi(p» < '1 is small for all j and the 9 orbit of p p. Thus if h(pt} h(P2) then PI shadows the / orbit of h(p), 80 Ie 0 h(p) Ie 0 h(Pl) = Ie 0 h(P2) = P2, 80 h is one to one.
=
=
=
9.8 STABILITY OF ANOSOV FLOWS
401
Next, for y E A" d(g1 0 k(y), p(y» is small for all i, and the f orbit of y shadows the 9 orbit of k(y), so h 0 k(y) = y. Thus h is onto A" We showed above that h is continuous. 0
9.8 Stability of Anosov Flows The results of the preceding section are also true for flows. We restrict ourselves to proving the structural stability of Anosov flows. We use the proof of this theorem to discuss expansiveness for a flow (flow expansiveness) and to introduce some constructions which are used in the proof of the global structural stability theorem for flows. (The global structural stability theorem for diffeomorphisms is stated in the next section.) The reader should also notice that we solve a slightly different functional equation in this section than that which is implicitly used in the last section by means of shadowing. The statement of the theorem is similar to before. Theorem 8.1 (Structural Stability of An080V Flows). Let M be a compact manifold and ",I be a C· Anosov flow on M, i.e., ",I has a hyperbolic structure on all of M and R(",I) = M. Then ",I is structurally stable, i.e., ",I is topologically equivalent to any flow t/JI which is C· near to ",I for -2 ::; t ::; 2. REMARK 8.1. Remember that for ",I to be topologically equivalent to t/JI it is permissible to reparameterize either t/JI or ",I.
PROOF. The main difference in the proof is that the flow does not expand or contract along the direction of the flow. To keep the trajectory of the perturbation from running ahead of the trajectory for the original flow we construct a reparameterization of t/JI that keeps its trajectory in a transversal of ",I(X). For each point x E M, let E(x) be a small transversal at x. These can be taken 80 the vary differentiably with x. For 1/ > 0, let E(x,1/) be the subset of E(x) made up of points within 1/ of x. As is done to define the Poincare map, there are " > 0 and a differentiable function r(t,x,y) with r(O,x,y) = 0 and such that for -2 ::; t ::; 2 and y E E(x, 1/), t/JT(t.x,)')(y) E E(",I(X».
Then we can use this "reparameterization" to define
The flow FI is on a subset of M x M which can be thought of as a bundle over M. (We could let E(x) = expx(t(x» where t(x) c TxM is a disk in a subspace which is a complement to the line spanned by the vector field for ",I. In the construction of the stable and unstable disks below, we would first construct disks in E(x) and then exponentiate them to get disks in M. In this discussion we do not include these steps explicitly. See Robinson (1975a).) The first point x gives the base point in M and the second point y gives the point in the fiber (which is the transversal at x). Thus for
-2::; t::; 2, FI:
U {x} x E(x, 1/) xEM
-+
U {x} x E(x). xEM
This flow can be extended for all times for which t/J1'C I.X ,y)(y) the stays in the transversal E(",I(X». As in the case for diffeomorphisms DU(x, 1/) =
nFI(",-I(x), E(",-I(X), 1/» I~O
402
IX. GLOBAL THEORY OF HYPERBOLIC SYSTEMS
is an unstable disk at x, D'(x,1/) =
n
F'(cp-'(X), E(cp-I(X), 1/»
I:SO
is an unstable disk at x, and DU(x, 7]) n D'(x, 7]) =
n
F'(cp-'(X), E(cp-I(X), 7]» IER == (x,h(x»
is a single point. By uniqueness of this point (cpl(X), !J1T(I.X.h(x»(h(x))) = F'(X, h(x)) = (cp'(X), h
so
0
cpl(X»,
h 0 cpl(X) = !J1T(I,X,h(xJ)(h(x».
This formula has built into it the reparameterization of!J1. The function T(t, x, h(x» is monotone function of t, so it has an inverse O'(s,x). Then 0' can be used to reparameterize cpl, h 0 cp"(,,x)(x) = !J1·(h(x)). The fact that h is onto and continuous is the same as for diffeomorphisms. The reparameterization is continuous by the fact that both hand T (and so 0') are continuous. Lastly, we need to check that h is one to one. Assume h(xIl = h(X2)' Then
h 0 cp"(,,xtl(xd = !J1'(h(xd) = tjI'(h(X2» = h 0 cp"(,,x')(X2), so
d(cp,,(·,xtl(xd, cp,,(.,x.) (X2» < 27]
for all t. In fact, by taking tjlt nearer to cpt we can make this as small as we desire. The following proposition gives the result about expansiveness of flows.
Proposition 8.2 (Flow Expansiveness). Let A be a compact hyperbolic invariant set for a Bow cpl. Given E > 0, there exists a 6 > 0 such that if Xl, X2 E A with
for all t where lim O'(t) = ±oo,
t-±oo
then X2 = cp'(xd for some
lsi
~ E.
We defer the proof of the proposition to Exercise 9.30. RETURNING TO THE PROOF OF THEOREM 8.1. By taking tjI' near enough to cpt, we can insure that the two trajectories are within 6. Because transversals for nearby points on the same trajectory are disjoint, we must have that Xl = X2, and h is one to one. This completes the proof of Theorem 8.1. 0
9.9 GLOBAL STABILITY THEOREMS
403
The above proof uses the How FI on a bundle of transversals. This construction works fine away from fixed points of the How. Since we are considering Anosov Hows (or possibly flows on a hyperbolic invariant set without fixed points) this causes no problem in the present situation. However, in the proof of the global stability theorem for flows, it is necessary to allow fixed points. As other points limit on a fixed point, the transversals do not have a continuous extension to the fixed point. In the proof of the Hartman-Grobman Theorem near a fixed point for a flow, no reparameterization is needed and the variation of the nonlinear How from the linear flow in all directions (not just those transverse to the How lines) is used to construct the conjugacy. In the proof of the global stability theorem for Hows, it is necessary to make a transition from (i) reparameterizations of the flow and the conjugating function taking values in a transversal for most points to (ii) 110 reparameterization of the flow and the conjugating function allowing displacements in all directions near fixed points. One way to accomplish this is given in Robinson (1975a). Although in this section we only consider flows without fixed points, we introduce some of the ideas used in the more general situation. It is possible to extend the flow FI used above so that it is defined for all pairs (x, y) with y near x (not just in the transversal). As before, there are I) > 0 and a differentiable function r( t, x, y) such that for -2 ::S t ::S 2 and d(x, y) < I),
.,pT(I,x,y)(y) E E(!p/(X». In this context, r(O, x, y) is not necessarily equal to O. Next, let J!(t,x,y) = r(t,x,y) - e-Olr(O,x,y) where a > 0 is small enough so that J!'(t,x,y) = r'(t,x,y)
+ ae-otr(O,x,y) > O.
(This last condition gives monotonicity of the reparameterization by J!.) Notice that J!(O, x, y) = O. Let F'(X, y) = (!p/(X), .,pT(t,X,Y) (y» as before, but is defined on a larger space. It can be checked that F' satisfies the group property, FI 0 F' = F'+'. The flow FI preserves the bundle of transversals on which we previously defined the flow, We have altered the flow so that Ft is hyperbolic in all the "fiber" directions: it contracts in the direction pointing off the transversal (in the flow direction) in the fiber. Applying the construction as before, we get D"(x,l7) and D'(x,l7) with DU(x, I) C E(x) so h(x) E E(x). (The disk D'(x, I) includes the direction along the flow lines of .,pI in each fiber.) We refer the reader to Robinson (1975a) for details. In the present context, there is no real advantage to extending the How F' as indicated above. However, this construction allows the style of proof discussed above to be used to apply to the global stability theorem where fixed points are allowed. We present these ideas in the present context where they are somewhat simple and can be understood without the complicated induction construction needed for the general global stability theorem.
9.9 Global Stability Theorems In this section we give the global stability theorems. The first result gives the conjugacy only on the chain recurrent set. Because the original version of this theorem gave the conjugacy only on the nonwandering set, nUl, we call the result the n-stability theorem. Remember that in the last section we proved the existence of a conjugacy on each basic set.
404
IX.
GLOBAL THEORY OF HYPERBOLIC SYSTEMS
Definition. Let I: M ..... M be a C l diffeomorphism on a compact manifold M. Then I is 'R-stable provided there exists a neighborhood N of I in the CI topology such that for 9 E N there is a homeomorphism h : 'RU) ..... 'R(g) (onto 'R(g» such that hoI = go h. Similarly, I is called O-stable provided there is a homeomorphism h from 0U) onto O(g) with h 0 f = go h.
Theorem 9.1 (O-Stability Theorem). Assume that f: M -+ M [or M compact is a CI diffeomorphism [or which 'RU) has a hyperbolic structure. Then f is'R-stable. Alternatively using the non wandering set to express the assumptions, assume that 0U) has a hyperbolic structure, 0U) = cl(PerU», and f has no cycles. Then f is O-stable. REMARK 9.1. This theorem was originally given in Smale (1970). REMARK 9.2. We give the proof with the assumptions on the chain recurrent set. With the assumptions on the nonwandering set it can be proved that I has a filtration. The rest of the proof is similar to the one given. See Shub (1987). PROOF. By the assumptions, 'RU) = cl(PerU)), so by the Spectral Decomposition Theorem 'R(f) = Al U ... U AN is the finite union of basic sets. Since we are using the chain recurrent set, each Ale has an isolating neighborhood Ule such that Ale = n~1 P(UIe). In this context, there are only a finite number of attracting-repelling pairs. See Exercise 9.16. By Conley's Fundamental Theorem, there is a Liapunov function V : M ..... R that is strictly decreasing off of 'R(f). Because each pair of Aj and Ale can be put in different attracting-repelling pairs, V has different values on each of the Aj . In fact the Aj can be renumbered and V can be modified so that V(Aj) = j. For his modified V, V : M -+ [1, NJ. Exercise 9.32 asks the reader to carry out the details of the above modification of the Liapunov function V. This puts a total ordering on the Aj that is compatible with the partial ordering with Aj « Ale if W"(Aj) n WU(AIe) "I: 0. (In this ordering the orbits flow down the ordering. In many of my papers, I used the ordering for which the index increases along a forward orbit.) Using the modified Liapunov function V, we can define subsets of M by
These sets form what are called a filtration. They have the following properties which characterize a filtration: for each j such that 1 ::; j ::; N, (1) M = MN :) MN-I :) ... :) MI :) Mo = 0, (2) f(Mj ) C int(Mj ), so each M j is a trapping region, (3) A) C int(Mj \ M j _.}, (4) Aj = fle(Mj \ M j _.}, and (5)
n;;:-oo
n 00
fle(Mj ) =
UWU(Ai) iSj
1e=0
=
Ucl(WU(Ai». iSj
We leave it to Exercise 9.33 to check these properties. Note that we used the existence of a Liapunov function to prove the existence of a filtration. From now on we use the
9,9
GLOBAL STABILITY THEOREMS
filtration and no longer use thfl Liapunov function. It is possible to prove the existence of the filtration without using the Liapunov function. Properties 1-4 are used in the proof of the n-Stability Theorem, while Property 5 is also used in the proof of the Structural Stability Theorem. Note in Property 5 that cl(WU(A j » is not usually equal to W" (Ai) but can have points in unstable manifolds of basic sets lower in the filtration. Given the filtration, using the Local Stability Theorem of the last section, it is possible to take the isolating neighborhoods Uj = Mj \ Mj -I. Then there exists a neighborhood N of f such that for 9 EN, (i) each of the Mj \ Mj _ 1 is an isolating neighborhood for Aj(g) = n%:_oogk(Mj\Mj-d, (ii) thereexistshj : Aj(f) -+ Aj(g) that is a conjugacy, and (iii) g(Mj ) C int(Mj ). We define h : U:ml Aj(f) -+ U:=l Aj(g). This gives a conjugacy on these sets. We show below that for 9 E N, R(g) = U:,.l Aj(g). Therefore h is a R-conjugacy from f to g. It remalns to show that R(g) = U:.l A; (g) which we do by means of the following two claims. Claim 1. R(g) ::>
U:.l Aj(g).
PROOF. The periodic points of! are dense in R(f) and so in each Aj(f). The conjugacy h j takes these periodic points of f into periodic points for g. Therefore the periodic points of 9 are dense in each Aj(g), and R(g) ::> cl(Per(g» ::> Aj(g). Taking the union over j we get the result. 0 Claim 2. R(g) C
U:=I Aj(g).
PROOF. Take Y E R(g). Then Y E M j \ M j - 1 for some j. We want to show that Y E Aj . The first step is to show that gi(y) rt Mj _ 1 for all i > O. Assume to the contrary that gk(y) E Mj _ 1 for some k > O. Since g(Mj-d C int(Mj_d. gi(y) E int(Mj _.) for all i ~ k. Also for f > 0 small enough, any t-chain {Yi} with Yo = y has YA: E M j _ 1 by the continuity of g. Next, g(Yk) E g(Mj _.) C int(Mj _.). For E > 0 small enough, Yk+1 E Mj _ 1 since d(Yk+1,g(Yk» < t. Continuing by induction Yi E Mj- 1 for i ~ k. This contradicts the fact that Y is chain recurrent. Therefore g;(y) rt Mj - 1 for all i ~ O. Because g(Mj ) C Mj and Y E Mj , gi(y) E gi(Mj ) C Mj for all i ~ O. Combining, g'(y) E M j \ Mj _ 1 for all i ~ O. A similar argument applied to backward iterates shows that g;(y) E Mj \ M j _ 1 for all i ::; O. Combining the results for positive and negative i, gi(y) E M j \ M j _ 1 for all i E Z, I.e., Y E Aj(g). 0 Combining the claims, we have completed the proof of Theorem 9.1.
o
In the exercises, we ask the reader to give a direct proof of the n-stability theorem for a flow on the two sphere 52 with one source, one sink, and no other chain recurrent points. See Exercise 9.31. Definition. Assume f is a diffeomorphism with a hyperbolic structure on R(J) and = Al u··· U AN where the Aj are basic sets. We say that I satisfies the transversality condition provided for every p,q E R(f). WU(p,J) is transverse to W'(q,J). (This condition allows the fact that some unstable manifolds do not intersect other stable manifolds.) Note that the condition is that the stable and unstable manifolds of points are transverse and not just that the stable and unstable manifolds of basic sets are transverse. Alternatively, we could assume that L(f) (resp. n(f» has a hyperbolic structure, and ! satisfies the transversality condition with respect to L(f) (resp. n(f», I.e., WU(p, J) and W'(q, J) are transverse for all p, q E L(f) (resp. n(f». Note that if L(J) (resp. nu)) has a hyperbolic structure and ! satisfies the transveraaUty condition with respect to L(f) (resp. n(J» then! can be shown to have DO cyclee.
RU)
406
IX. GLOBAL THEORY OF HYPERBOLIC SYSTEMS
(Remember that we proved that if R(f) has a hyperbolic structure then f has no cycles, so certainly it has no cycles if R(f) has a hyperbolic structure and f satisfies the transversality condition.)
Theorem 9.2 (Structural Stability Theorem). Assume that M is a compact manifold, I : M --+ M is a C l diffeomorphism, (i) I has a hyperbolic structure on R(f), and (ii) f satisties the transversaJity condition with respect to R(f). Then f is structurally stable. That is, there exists a neighborhood N of I in the set of C l diffeomorphisms such that if 9 E N then 9 is topologically conjugate to f on all of M. Assumptions (i) and (ii) and be replaced with either of the following alternatives: (a) (i) I has a hyperbolic structure on L(f) and (ii) I satisties the transversaJity conditioll with respect to L(f), or (b) (i) I has a hyperbolic structure on n(f) and n(f) = c1(Per(f)) and (ii) I satisties the transversality condition with respect to n(f). REMARK 9.3. This theorem was proved for several special cases before the general proof was given. The case when R(f) = M was proved by Anosov (1967) and Moser (1969). (This case is the Anosov Stability Theorem, Theorems 7.3 and VII.5.l(e).) These proofs applied to both diffeomorphisms or flows (although Moser's proof for flows had to be somewhat modified from what he gave). The case when R(f) is a finite number of (periodic) points (so I is Morse-Smale) was proved by Palis and Smale (1970). This proof applies to either diffeomorphisms or flows. Also see Palis and de Melo (1982). The case when I is a C2 diffeomorphism (but the neighborhood is still in the CI topology) was proved by Robbin (1971). This is the first proof of the general theorem stated above. The case when I is a CI diffeomorphism was first proved in Robinson (1976a). (This result has weaker hypothesis than the result of Robbin referred to above.) The general case of the theorem when J' is a CI flow was proved by Robinson (1975a) after first proving it for C 2 vector fields in Robinson (1974). REMARK 9.4. Besides the original proofs referred to above, also see Robinson (1975b, 1976b, and 1977) for sketch of the proof and discussion of the ideas for the full theorem. REMARK 9.5. The proof in one dimension is especially easy because there can only be sources and sinks. In Section 2.6, we treated some examples on the line. The one complication comes from the fact that the line is not compact. We also considered some examples which have critical points which is more complicated. These methods can be applied to show that Morse-Smale diffeomorphisms on Sl are structurally stable. See Exercises 9.38 and 9.39 for some special cases. We conclude by giving a few comments about the construction. The conjugacy is built up in neighborhoods of successive basic sets. It is necessary to extend the map onto the stable and unstable manifolds. We give the following definition.
Definition. Let A be a hyperbolic basic set. A fundamental domain lor the stable manifold 01 A is a closed set D' C W'(A) \ A such that there exists a set D" with D' = cl(D") and Ij(D") n D" = 0 for all integers j l' O. Note that D' n A = 0. A fundamental domain for the urutable manifold of A is defined similarly.
9.10 EXERCISES
407
REMARK 9.6. One way to show the existence of a fundamental domain is to use a Liapunov function. Let V : M ~ R be a Liapunov function with A. C V-I(i) and W'(A.) C V-I([i,oo)). Let S = V-I([O,i + ED n W·(A.) be the "local stable manifold" of A. for some choice of f, 0 < f < 0.5. Let D" = S \ I(S) and and D' = d(D·'). This is a fundamental domain for the stable manifold. See Exercise 9.3S.
The proof is not that difficult in the case when 1 has only one repeller and one attractor. We consider the even easier case of a north pole south pole diffeomorphism of 8", i.e., 1 has a single fixed point source X2 and a single fixed point sink XI and no other chain recurrent points. Let D' be a fundamental domain for the sink XI of I. Also that D' is constructed as above with the upper edge equal to V-I(1.5). Let 9 be a small C l perturbation which is R-conjugate (so it has only two fixed points, YI and Y2). Assume 9 is near enough to 1 so that g(V-I(l.5)) C V-I([l, 1.5)), i.e., 9 still move this level set down in terms of V. Let ho(x) = X on V-I(1.5). On the image of 1(V-1(1.5», define ~ by ~(x) = go ho 0 rl(x). Using a bump function ho can be filled in to define a function ho : D' ~ 8". (This is similar to the construction in Section 2.6.) Since D' is a fundamental domain, ho can be extended to a function h defined on sn\ {Xl. X2} by hex) = giohoori(x) where j is chosen so that I-'(x) ED'. Since ho is continuous on D', this extension is continuous on 8" \ {x I , X2}. Also define h(xt} = YI and h(X2) = Y2. A little checking shows that h is continuous at these points as well. This completes the sketch of why 1 is structurally stable in this case.
9.10 Exercises Fundamental Theorem 9.1. Let 1 : S2 ~ S2 be the diffeomorphism with one source, one sink, and a hyperbolic (saddle) invariant set that is a Cantor set, i.e., 1 is the horseshoe map on S2. Find enough pairs of attracting-repelling pairs to show that their intersection as in Theorem 1.3 is equal to R(f). 9.2. Let 1 be a diffeomorphism, and all the Pj listed below are periodic points for I. (a) Assume that q E WU(O(p.)) n W'(O(J)2)) =1= 0. Show that for all E > 0, there is an E-chain from PI to q and then to P2. (b) Assume that Qj E WU(O(pj))nW'(O(PH.)) =1= 0 for j = 0, ... , nand Pn+1 = Po. Show that all the Qj are chain recurrent. 9.3. Let X,Y E R and assume Y ~ {l+(x). Prove there is a (A, AO) E A such that X E A and Y E AO. Hint: Let U = {li(x) for small f, and for the corresponding attracting-repelling pair, (A, AO) E A, show that x E A and yEA". 9.4. Prove that if Y is an invariant set for 1 and X = dey) then X is an invariant set for I. 9.5. Let ",' be a continuous flow that is defined on a space X for all t, e.g. X is compact. Assume V is an open set in X and let U = r1o
408
IX. GLOBAL THEORY OF HYPERBOLIC SYSTEMS
Shadowing and Expansiveness 9.9. Let D(x) = 2x mod 1 be the doubling map on SI. Let {Xj}~o be a 6-chain for D. (a) Show that the inverse iterates D-i(xj) can be chosen so that
(b) Choosing the inverse iterates 88 in part (a), let y" = Iimj_OO D-H"(Xj). Prove that D(y,,) = Yk+l. (c) Prove that Yo 6-shadows the 6-chain {Xj}~o. 9.10. Let EA be a one-sided subshift of finite type with metric d as defined in Chapter II. Let 6 = 0.5, Assume {s(j) E EA}~O is a O.S-chain for O'A on EA. Specify the point t E EA which 0.5-shadows the 0.5-chain. 9.11. Prove that a hyperbolic toral automorphism on y2 is expansive without using the result about shadowing. 9.12. Assume A is a compact hyperbolic invariant set. Prove that A is isolated if and only if it has a local product structure. 9.13. Assume A is a compact hyperbolic invariant set that is not isolated. Prove that if V is a small enough neighborhood of A then the maximal invariant set in V, Av = nnEZ t(V), h88 a hyperbolic structure. Hint: See the proof of Theorem VlI.4.5, the existence of a horseshoe for a transverse homoclinic point. Anosov Closing Lemma 9.14. Consider the example given in Remark 4.2 and Figure 4.1. (a) Explain why the point labeled q is in n(f) but not L(f). (b) Explain why the point labeled q is in n(f) but not in n(fln(f», so n(fln(f» '" n(f). 9.15. Prove Theorem 4.1(c), i.e., assume that the nonwandering set n(f) is hyperbolic, and prove that c1(Per(f» = n(fln(f». Decomposition of Recurrent Points 9.16. Assume M is a compact manifold and f : M -+ M is a diffeomorphism with a hyperbolic structure on the chain recurrent set, R(f). Let A be the set of attractingrepelling pairs for f. Let {AI," . ,AN} be the collection of basic sets. (a) Let (A,AO) E A. If Aj n A '" 0, prove that Aj C A and WU(A j ) C A. If Aj n AO '" 0, prove that Aj C AO and W'(A j ) C AO. (b) Let (A, AO) E A. Prove that
A = U{WU(Aj ) : Aj n A", 0} AO
= U{W'(A i ) :
and
Ai n AO '" 0}.
(c) For a diffeomorphism with a hyperbolic structure on the chain recurrent set, prove that there are a finite number of distinct attracting-repelling pairs in A. 9.17. Give an example of an attracting set which is not an attractor. 9.18. Assume that (i) the periodic points are dense in two hyperbolic basic sets Aj, IUId Aj,' (ii) there exist ql E Ai, and q2 E Ai, such that WU(q.) h88 a non-empty transverse inters«tion with W'(q2), and (iii) there exist qi E A" and qi E A" such that \P(qi) has a OO[H!mpty transverse intersection with W·(q't). Prove that all the
9.10 EXERCISES
409
points of intersection, W"(A,.) n W'(A h ) and W"(Aj,) n W'(A i .), are in cl(Per(J» and so in n(J). 9.19. Assume that M is compact, , : M -+ M has a hyperbolic structure on the nonwandering set n(f), and , has a cycle. Prove that that nonwandering set is not equal to the chain recurrent set, n(f) :/: 'R(f). 9.20. Assume that, : M -+ M has a hyperbolic structure on the chair recurrent set n(J) and M is compact. Assume A, is a basic set for which int(W'(A i "# e. Prove that Ai is an attractor.
»
9.21. Assume that' : M -+ M has a hyperbolic structure on the chair recurrent set n(J) and M is compact. Let
Prove that S is open and dense in M. 9.22. Let,: M ..... M be an Anosov diffeomorphism on a connected manifold. (It is not assumed that , is a hyperbolic toral automorphism.) (a) Let p E Per(f) be a periodic point. Prove that W'(p) is dense in M. Noted that this is the stable manifold of p and not the stable manifold of the orbit of p. Hint: Prove that cl(W'(p» is open in M. (b) Let q E M be any point. Prove that W'(p) is dense in M. (c) Prove that, is topologically mixing. 9.23. Show on any compact two dimensional manifold M that there exists a diffeomorphism' for which (i) n(f) has a hyperbolic structure and (ii) , has infinitely many periodic points. Hint: Take a Morse-Smale diffeomorphism on M with a fixed point sink. Replace a neighborhood of the sink with the Smale horseshoe on the disk N (in S2).
Assume that' : M -+ M has a hyperbolic structure on the chair recurrent set is compact. Let 'R(f) = AI U ... U AN be the spectral decomposition into basic sets. Prove that each basic set Ai can be decomposed into subsets
9.24.
nu) and M
nJ
Ai
= UX,,; i:al
with (i) (ii) (iii) (iv)
the following properties. The sets Xi,; = cl(W"(p) n W'(p» for some periodiC point p E Xi,;' The sets Xi,; are pairwise disjoint. The sets Xi,; are permuted, ,(Xi,;) = Xi,HI for 1 ~ i < ni and ,(Xi,nj) = Xi,l' The n,-power of, restricted to each Xi,; is topologically mixing, i IXj ,; is topologically mixing for each j and i.
r
Markov Partitions 9.25. Let E8 be a two sided subshift of finite type with shift map 0'8. Find Markov partitions of arbitrarily small diameter. (Note that there is no differential structure, so this map does not have a true hyperbolic structure but it does have stable and unstable manifolds which is all that is needed,) 9,26, Let f : R2 -+ R2 be the diffeomorphism which has the geometric horseshoe A as an invariant set, Find Markov partitions of arbitrarily small diameter.
IX. GLOBAL THEORY OF HYPERBOLIC SYSTEMS
410
Local Stability 9.27. Prove that the set of Anosov diffeomorphisms is open by proving the following steps. Let f be all Anosov diffeomorphism and define cones C; using the splitting lE~ E9lE~ such that Dff-'(p)Cj-,(p) C C~. (a) Show that if 9 is Cl near enough, then Dgg-'(pp;_,(p) C Cj; C TpM. (b) Prove that
n
Dg;-n(p)C;-n(p) C TpM
n~O
is a subspace E~·g that is near to the subspace lE~ for f. Similarly, get the subspace
E~g =
n
Dg;n(pp;n(p) C TpM,
n~O
where C~ = c1(Tq \ C:). (c) Since the subspaces lE~·g and lE:;g are near the subspaces lE~ and lE~, argue that TpM = lE~,g E9E~,g is a direct sum decomposition. (d) Since the decomposition for 9 is near the decomposition for f, argue that this is a hyperbolic structure: Dgp expands vectors in lE~,g and contracts vectors in lE:;g. 9.28. Assume that Af be a hyperbolic isolated invariant set for f : M -+ M with is0lating neighborhood U. Prove that if f > 0 is small enough and 9 is a C 1 diffeomorphism within f of f, then 9 has a hyperbolic structure on Ag = n{gj(U) : j E Z}. 9.29. Assume that f : M -+ M has a hyperbolic chain recurrent set on a compact manifold M. Prove each of the basic sets Aj is an isolated ill'" lilt set. Anosov Flows 9.30. Prove Proposition 8.2 on Flow Expansiveness. Global Stability Theorems 9.31. Let cpl be the flow on the two sphere, 8 2 , with one sink and one source and no other recurrent points. This flow is often called the north-pole south-pole flow. Prove that cpl is 'R.-stable using only the local stability near the fixed points (i.e., without using the fI-Stability Theorem). 9.32. Assume that f : M -+ M is a Cl diffeomorphism for which M compact and 'R.(f) has a hyperbolic structure. Prove that the Aj can be renumbered and there exists a modified Liapunov function V: M -+ [I, NJ which has the following properties: (i) V is decreasing off 'R.(f), and (ii) V(Aj) = j.
9.33. Assume that f : M -+ M is a CI diffeomorphism for which M compact and 'R.(f) has a hyperbolic structure. Let Aj for 1 ~ j ~ N. Let V be a Liapunov function that is strictly decreasing off of 'R.(f) with V(Aj) = j. Let
Prove these sets have the five properties of a filtration listed in Section 9.9. 9.34. Assume that f : M -+ M is a Cl diffeomorphism for which M compact and 'R.(f) has a hyperbolic structure. Prove there is at least one basic set which is an attractor and at least one basic set which is a repeller. 9.35. Assume that f : M -+ M is a CI diffeomorphism for which M compact. Also assume that f has finitely many periodic orbits, all of which are hyperbolic. Prove that
9.10 EXERCISES
411
there can be no cycle between these periodic points for which all the intersections of stable and unstable manifolds are transverse. 9.36. Assume that fJ : M j -+ M j is a Cl diffeomorphism on a compact manifold M j for j = 1,2. Let F: Ml x M2 -+ Ml X M2 be defined by F(x,y) = (h(x), h(y». (a) If fJ has a hyperbolic structure on the chair recurrent set 'R.(fJ) for j = 1,2, prove that F has a hyperbolic structure on 'R.(F). (b) If in addition to the assumptions of part (a), if fJ satisfies the transversality condition for j = 1,2, prove that F satisfies the transversality condition and so is structurally stable. 9.37. Let A be a basic set for a diffeomorphism f with a hyperbolic chain recurrent set on a compact manifold M. Let V : M -+ R be a Liapunov function with A c V-l(i). Let S = V-I([O, i + f» n W'(A) for some choice of f > 0, and D' = c1(S\f(S». Prove for good choices of f that D' is a fundamental domain. 9.38. Let f be the diffeomorphism on Sl whose lift F : R -+ R is given by F(8) = 8 + fsin(211'k8)
mod 1,
for 0 < 211'kf < 1. Prove that f is structurally stable. 9.39. Let f be the diffeomorphism on Sl whose lift F : R
F(t) = t
-+
R is given by
+ lIn + f sin(211'nt)
for n a positive integer and 0 < f < 1/(211'n). Prove that f is structurally stable. 9.40. Prove that the set constructed in Remark 9.6 is a fundamental domain for the basic set.
CHAPTER X
Generic Properties In Chapter IX, systems with hyperbolic chain recurrent sets are shown to be 'Rr stable. Certain systems are also shown to be structurally stable: (i) Anosov systems, (ii) Morse-Smale systems, and (iii) systems with a hyperbolic chain recurrent set which also satisfy the transversality condition. Although we have given examples of such systems, we have not discussed the prevalence of such systems. At one time it was hoped that any system could be approximated by a structurally stable system. By now, many counter examples have been constructed. For these examples, there is a whole open set of systems that are not structurally stable or even !l-stable. However, it is possible to prove that there are certain properties which are generic in the sense of Baire category, i.e., any system can be approximated by another for which these properties are true and the condition is open at least in a weak sense. This chapter considers several of the basic generic properties. It also gives a counter-example to the density of structurally stable systems. The proofs of the genericity use methods from transversality theory, so a section develops these ideas.
10.1 Kupka-Smale Theorem A generic property is one which holds for most functions in the function space under consideration. We make this precise in the following definition. Definition. Let F be a topological space. A subset n c F is called a residual subset (in the sense of Baire category) provided n contains the countable intersection of dense open subsets, more precisely, 'R. :::> Sj where each Sj is open and dense in F. In a complete metric space, a residual subset of X is always dense. A topological space X is called a Baire space provided any residual subset of X is dense in X. A property is generic in a function space F which is a Baire space provided the property is true for functions in a residual subset of F. If M is a compact manifold and 1 ~ k ~ 00, then CIe(M,M), Diffle(M), and the set of C le vector fields XIe(M) are all Baire spaces. See Hirsch (1976). If M is noncompact and these function spaces are given the Whitney topology (or strong topology as defined in Hirsch (1976» then they are also Baire spaces.
n;:l
The first generic property which we consider is the hyperbolicity of periodic points and the transversality of the stable and unstable manifolds. Because the statement and proof of the theorem can be expressed in terms of the set of all periodic points and the set of hyperbolic periodic points, we give some notation for these sets. For any diffeomorphism' on M, let Per(k,f) be the set of all periodic points with period less than or equal to n, Per(n, f)
= {p EM:
Ji(p)
=p
for some j ~ n},
PerU) be the set of all periodic points, 00
PerU) =
U Per(n, f), n=l
413
X. GENERIC PROPEJITIES
414
Perh(n, f) be the set of all hyperbolic periodic points with period less than or equal to n, Perh(n, f) = {p E Per(n, f) : p is a hyperbolic periodic point }, and Perh(f) be the set of all hyperbolic periodic points, Perh(f) = {p e Per(f) : p ill a hyperbolic periodic point }.
(Notice that this usage ofthe notation for Per(n, f) does not agree with how it is used in the rest of the book where it is the set of all periodic points with least period exactly n.) Note that all the periodic points of period less than or equal to n are hyperbolic if and only if Per(n,f) = Perh(n,f). Using this fact, we define
'Hn
= {f E Diff"(M):
Per(n,f)
= Perh(n,f)},
and
n=1
Therefore, f E 'H if and only if all the periodic points of f are hyperbolic. The second half of the theorem deals with the transversality of the stable and unstable manifolds. We let
KS(M) = {f
E
'H : W'(p, f) is transverse to WU(q, f) for all p, q
E Per(f)}.
Theorem 1.1 (Kupka-Smale). Assume M is a compact manifold and 1::; k::; 00. (a) The set 'Hn defined above is dense and open in Diff"(M) , and 'H is a residual subset of Diff"(M). (b) The set KS(M) is a residual subset of DiffA:(M). REMARK 1.1. This theorem is also true for vector fields (or Hows). We require that both all the fixed points and all the periodic orbits are hyperbolic in the definition of 'H(X). In the definition of KS(X), we use the stable and unstable manifolds of periodic orbits and not just the stable and unstable manifolds of individual points in the periodic orbits: we require that W'btl is transverse to WU (2) where /'1 and/' vary over all fixed points and periodic orbits. There are two aspects which are different about Hows or vector fields than diffeomorphisms. First, there are both fixed points and closed orbits. This difference is minor. Second, the periods of the periodic orbits can be any positive real number so the induction on the period is slightly more cumbersome to implement. Even with these differences, the main ideas of the proof are the same. REMARK 1.2. The Kupka-Smale Theorem was proved independently by Kupka (1963) and Smale (1963). A nice proof for the case of vector fields is given in Peixoto (1966) which includes the case of noncompact manifolds. Palis and de Melo (1982) and Abraham and Robbin (1967) also give proofs for vector fields.
1.3. We delay the proof (for diffeomorphisms) until Section 10.3 because it uses the transversality theorems which we discuss in Section 10.2. The idea of the proof is that a periodic point can be approximated by a hyperbolic periodic point. Since the hyperbolicity of a single periodic point is an open condition, the set 'Hn is both dense and open. Similarly, a nontransverse intersection of stable and unstable manifolds can be approximated by a transverse one which implies that. KS(M) is dense. The transversality of the intersections on compact pieces is an open condition and the manifolds can be represented as the countable union of compact subsets, so it can be shown that KS(M) is residual. REMARK
10.1 KUPKA·SMALE THEOREM
415
Definition. A diffeomorphism (respectively flow) which satisfies the properties of the set KS(M) in Theorem l.l(b) is called a Kupka·Smale diffeomorphism (respectively flow). The next set of results concerns the genericity of the condition that the closure of the periodic points equals the nonwandering set. The first step is the possibility of approximating a nonwandering point by a periodic orbit, the Closing Lemma. This lemma was first thought to be obvious (hence the title of a lemma), which it is in the CO topology. Its proof is very difficult in the C 1 topology and unknown in the C 2 topology. In other words, the approximating diffeomorphism 9 with the periodic orbit can be taken to be Coo but is only Cl near to the original f. The reason that the proof is complicated is that one localized perturbation is not enough to change an orbit which returns near to itself into a periodic orbit. The proof for an approximation in the Cl topology uses many localized perturbations to accomplish the feat. A proof for an approximation in the C 2 topology (if and when it is given) will probably not use a "localized perturbation", but will have to control the effects of an orbit passing several times through a single perturbation. Theorem 1.2 (Closing Lemma of Pugh). Assume M is a compact manifold, fa a neighborhood of I in Diffl(M), and p E flU). Then there is agE N such that p is a periodic point for g.
c 1 diffeomorphism, N
REMARK 1.4. This result is also true in the spaces of Cl flows or Cl vector fields. It was originally proved in Pugh (1967a) for flows. That paper claims to prove the result for C 1 vector fields but there is a technical difficulty in the smoothness of the perturbed vector field (which do not arise when considering the space of flows). These difficulties were discovered by Pugh and are corrected (by Pugh) in Pugh and Robinson (1983). The theorem is also proved for many other function spaces in this latter paper: Hamiltonian and volume preserving diffeomorphisms and flows. Further papers on this result include Liao (1979), Mai (1986), and Wen (1991). For an intuitive discussion of the proof and its difficulties see Robinson (1978). The next lemma is much easier than the Closing Lemma. It shows that once a periodic orbit has been produced the diffeomorphism (vector field, or flow) can be approximated by a new diffeomorphism (vector field, or flow) with a hyperbolic periodic point.
Lemma 1.3. Let p be a periodic point of period j for a C,. diffeomorphism 9 : M -+ M, for 1 ~ k ~ 00. Then 9 can be approximated arbitrarily closely in the Ck topology by a diffeomorphism g' such that p is a hyperbolic periodic point of the same period.
PROOF. Let 'P : V -+ U be a coordinate chart at p with 'P(O) = p. Let r > 0 be small enough so that gi(B(p, r»nB(p, r) = 0 for 1 ~ i < j where B(p, r) = 'P( {x: Ixl ~ r}). Let {3 : Rn -> R be a Coo bump function such that {3(x) = 1 for Ixl ~ r/2 and {3(x) = 0 for Ixl ~ r. Finally, let g.(q) = g(q) for q ~ B(p, r) and g.
0
'P(x) = 9 0 'P(x + E{3(X)X)
for x E 'P-1(B(p,r» = {x: Ixl ~ r}. Then D('P- 1 0 y!.
0
'P)o = D('P- 1 0
gi 0 'P)o(l + E)I.
Because of the form of this derivative, g. has a hyperbolic periodic point at p for arbitrarily small E > O. Clearly, g. converges to 9 in the C,. topology as E goes to zero.
o
A corollary of the Closing Lemma is the General Density Theorem which was also originally proved by Pugh (1967b). This result is only true in the C 1 topology because it uses the Closing Lemma.
416
X. GENERIC PROPERI'IES
Theorem 1.4 (General Density Theorem). Assume M is a compact manifold. Let 9 = {J E Diffl(M) : c1(Perh(f)) = n(f)}. Then 9 is residual in Diff1(M). Before starting the proof of the General Density Theorem, we give some definitions and results about semi-continuous set valued functions which are used in the proof. Let M be a complete metric space, and CM be the collection of all compact subsets of M. The Hausdorff metric on CM is defined as follows: for A, BE CM, d(A, B)
=sup{d(a, B), d(b, A) : a E A, bE B},
where d(b, A) = inf{d(b, a) : a E A}.
(Note: This metric has nothing to do with the Hausdorff property for a general topological lipace.) If M is a complete metric space, then CM is a complet.e metric splICe with the metric d defined above. (This result is an exercise in many of the books on topology.) The proof also uses the concept of a semi-continuous set valued fUlIction. (The functions we use take f to Perh(n, f).) Let An E CM for n ::?: 1. Define liminf An = {y EM: there exist Yn E An for n::?: 1 n-<XI
such that Y = lim Yn}. n-oo
If a point Y is contained in all the An for n ::?: N for some N, then Y E liminf n _ oo An. Thus the set lim infn _ oo An is the set of points which are "essentially" contained by each of the An for large n. (In fact, y only has to be the limit of points in the An and not actually lie in the sets.) It is easy to check that if An E CM then lim inf n _ oo An is compact so is in CM. Let X be a topological space and M a complete metric space. A set valued function r : X ..... CM is called lower semi-continuous at x provided
f(x) C lim inff(xn ) n-oo
for every sequence Xn E X converging to x. This means that any point in r(x) can be approached by a sequence of points Yn E r(xn ). In an intuitive sense, the set f(x) can be smaller than the r(Xn) for nearby Xn but can not be bigger. The set valued function r is called lower semi-continuous provided it is lower semi-continuous at all points x EX. Lower semi-continuity can also be expressed in terms of a nonreftexive "semi-metric" on CM defined by d.ub(A, B) sup{d(a, B) : a E A}.
=
For B compact, d.ub(A, B) = 0 if and only if A C B. (In the Hausdorff metric, d(A, B) = 0 if and only if A = B.) Then r : X ..... CM is lower semi-continuous at x provided lim{d.ub(Gamma(x),r(y» : y converges to xinX} = O. Exercise 10.3 asks the reader to prove the following result. Assume r n : X ..... CM are lower semi-continuous set valued functions for n ::?: 1, and define r : X ..... CM by [(xl = cl(Un r n(x)). Then f is a lower semi-continuous set valued function. Finally, assume r : X ..... CM is a lower semi-continuous set valued function. Let R. c X be the points of continuity of f. Then R. is residual, i.e., a semi-continuous function is continuous at a residual subset. See Choquet (1969).
10.2
TRANSVERSALITY
417
PROOF OF THEOREM 1.4 (GENERAL DENSITY THEOREM). Define rn,r: Dur1(M)-+ by
eM
r nU) =
Perh(n, f).
ru) = c1(PerhU», where Perh(n, f) and PerhU) are the set of hyperbolic periodic points of period less than or equal to n and of all periods respectively. The sets Perh (n, f) can easily seen to be closed and so compact. For n ~ I, the map r n is lower semi-continuous because a hyperbolic periodic point persists under perturbations. (See Theorem V.6.4.) These functions are not continuous at all I: if 10 is a diffeomorphism with a periodic point Xo with eigenvalue of absolute value one, then under arbitrarily small perturbations of 10 to 9 the periodic point Xo(g) can become hyperbolic (or disappear). Thus the set r n(g) can get bigger for small C l perturbations of an I (Xo(g) E r n(g» but can not get smaller, so r n is lower semi-continuous but not continuous at 10' Because each of the set valued functions r n are lower semi-continuous, the function ro = c1(Un r n('» is also lower semi-continuous by Exercise 10.3. Let n c Diffl (M) be the points of continuity of r. As mentioned above, a sernicontinuous set valued function is continuous at a residual subset, so n is residual in Diffl(M). Claim. The set neg where 9 is the set defined in the theorem, so 9 is residual. PROOF. Suppose lEn \ g. Therefore, r is continuous at I, but there is a point p E nu) \ c1(Per(f». By the Closing Lemma, there are gi E Diffl(M) converging to I such that p E Per(gi). By Lemma 1.3, each can be approximated by E Diffl(M) such that p E Perh(g;) and the g: still converge to I. Therefore p E r(g;) and
g.
lim{d(r(g:),rU» : i
-+
g:
oo} ~ d(p,ru»
F 0,
which contradicts the fact that I is a continuity point of r. This contradiction proves that neg which prove the claim and finishes the proof of the theorem. 0
10.2 Transversality We have already defined what it means for two submanifolds to be transverse in an ambient manifold. In this chapter we need to consider functions which are transverse to a submanifold in the space where the function takes its values. After giving the definition of this concept, we present several transversality theorems which state that most functions are transverse to a given submanifold. Definition. Let M and N be differentiable manifolds, V C N be a submanifold, and M be a subset. A C l function I : M -+ N is transverse to V at P E M provided that either (i) I(p) 'i. V, or (ii) Tf(p)N is spanned by the two subspaces D/pTpM and Tf(p) V whenever I(p) E V, i.e., J( C
Tf(p)N = DlpTpM
+ Tf(p) V.
A C l function I : M -+ N is transverse to V along K provided it is transverse to V at all points p E K. We use the notation / rhK V to indicate that I is transverse to V along K. If K = M, then we say that I is transverse to V, and denote it by I rh V. We also use the notion of the codimension of a submanifold. Assume N is a manifold and V C N a submanifold. The codimension 01 V in N is defined to be dim(N) dim(V), and is denoted by codim(V). The following theorem gives a generalization of the fact that the inverse image of a regular value is a submanifold.
418
X. GENERIC PROPERTIES
Theorem 2.1. A~ume M and N manifolds and V c N is a submanifold. (In our use of the terminology a submanifold is an embedded submanifold.) Assume that I : M --+ N is a C r map for r ~ 1, and it is transverse to V. Then I-'(V) is a C r submanifold of M. Morem'f'I', the codimension of 1-' (V) in M is the same as the codimension of V in N. See Hirsch (1976) for a proof. Theorem 2.2 (Thom Transversality Theorem). Let M and N be manifolds with M compact, and let V C N be a closed submanifold. Assume that k ~ 1. The set of functions Tk(M, V) = {f E Ck(M, N) : I rh V} is dense and open in Ck(M, N). Again, see Hirsch (1976) for a proof. The difficulty in using the Thorn Transversality Theorem is that we do not always have the use of all the functions in Ck(M, N). In the proof of the Kupka-Smale Theorem for I E Diff' (M), we consider the maps Pn(f) = (id, f") : M --+ M x M where n is a positive integer. We want to conclude that most diffeomorphism I have Pn(f) transverse to the diagonal in M x M. Thus we do not have all functions in C'(M, M x M) at our disposal but only those arising as Pn(f). The Thorn Transversality Theorem certainly implies that an open set of I have this property. (See Theorem 2.4 below.) To prove the density of the I which satisfy the property, we must know that the function space Diffl (M) is large enough to be able to make a perturbation of I for which Pn(f} is t.ransverse to the diagonal. There are various ways around this difficulty. Palis and de Melo (1982) and Peixoto (1966) use only the Thorn Transversality Theorem to prove density. They both build into the proof the necessary step to make the Thorn Transversality Theorem applicable to construct the perturbation of the diffeomorphism I itself and not just of Pn(f}. Abraham and Robbin (1967) proves a very general form of the transversality theorem which is directly applicable to representations like Pn. The difficulty is that this proof of the Kupka-Smale Theorem requires proving that Diff' (M) is a Banach manifold which we do not want to verify. (It is true that Diffk(M) is a Banach manifold and p~v : Diffk(M} x M --+ M x M is C k . See Franks (1979), or Hirsch (1976).) We give a proof more like Abraham and Robbin but only require that Diffl (AI) contains finite dimensional subsets of functions on which Pn is differentiable. Thus. we lise the Parametric Transversality Theorem stated below. Also see Abraham and Robbin (1967) for many related results. First we give one definition which we use repeatedly. Definition. Let M and N be manifolds, :F a topological space, and P : :F --+ Ck(M, N} a continuous map for some k ~ O. The evaluation 01 p, is defined to be the map P' v : :F x M ..... N given by p'V(f,x} = p(f)(x).
Theorem 2.3 (Parametric Transversality Theorem). Assume Dq is an open su~ set of some IRq, M and N are manifolds, K C M compact, V C N is a closed submanifold, and k > max{O, dim(M} - codim(V)}. Assume P : Dq --+ Ck(M, N) satisfies the following two conditions: (i) P is continuous, (ii) the evaluation map p. v : Dq x M --+ N is C k , and (iii) p' v is transverse to V along ~ x K. Then T(p, K, V) == {t E Dq : p(t} rhK V} is dense and open in Dq. REMARK 2.1. The differentiability assumption on p'v is that it is C k where k is greater than the dimension of M minus the codimension of V in N. Note that there is a misprint in this assumption in the Parametric Transversality Theorem given in Hirsch (1976).
10.3 PROOF OF THE KUPKA-SMALE THEOREM
419
REMARK 2.2. The transversality 88Sumption on the evaluation map means that there are enough parameters with which to make the necessary perturbations at one point at a time. The conclusion of the theorem is that the function can be approximated by one which is transverse at all points. REMARK
2.3. See Hirsch (1976) for a proof.
The following theorem proves the openness of the set of maps which are transverse. Theorem 2.4. Assume tbat :F is a topological space, M and N are manifolds, K c M compact, V C N is a closed submanifold, and 1 :5 k :5 00. Assume the map p::F -> Ck(M,N) is continuous wbere Ck(M,N) is given tbe compact open topology on the first k derivatives. (If M is compact, then the sup topology on tbe first k derivatives can be used. Hirsch calls the compact open topology the wealc topology.) Tben T(p, K, V) = {J E :F : p(J) thK V} is open in :F. SKETCH OF THE PROOF. The idea is that Tk(K, V) = {J E Ck(M, N) : f thK V} is open in Ck(M,N) and p is continuous. Therefore p-t(Tk(K, V)) = T(p,K, V) is open. See Hirsch (1976) for details. 0
REMARK 2.4. See Abraham and Robbin (1967) or Hirsch (1976) for a more detailed discussion of transversality theorems and their applications.
10.3 Proof of the Kupka-Smale Theorem In the proof of the theorem, we need to consider not only whether a periodiC point are hyperbolic but also whether it satisfies a condition connected with a transversality condition (or the Implicit Function Theorem). A periodic point p of I of period n is called elementary provided 1 is not an eigenvalue of DI;. Thus a hyperbolic periodic point is elementar), but not all elementary periodic points are hyperbolic. The condition of being elementary is the natural first step in the proof below. As mentioned in the last section on transversality, we consider the maps
defined by Pn(J)(x) = (x, r(x». Let tl. be the diagonal in M x M, tl. = {(y,y) : y EM}. Clearly, tl. is a closed submanifold of M x M. Also Pn(J)(p) E tl. if and only if p is fixed by i.e., p has period n in the weak sense of the word. The following lemma characterizes the condition that Pn (f) is transverse to tl. as being equivalent to the condition that all fixed points of In are elementary.
r,
Lemma 3.1. (a) The point p is a fixed point of I if and only if Pt(J)(p) E tl.. (b) Assume pd/)(p) E tl.. Then, pdf) thp tl. if and only ifp is an elementary fixed point. if and only if Pn(J)(p) E tl.. (c) Tbe point p is a fixed point of (d) Assume Pn(J)(p) E tl.. Then, Pn(J) thp tl. if and only jfp is an elementary fixed point of
r
r.
(a) and (c) These statements follow easily from the definitions. (b) Assume Pt(J)(p) E tl.. For v E TpM,
PROOF.
420
X. GENERIC PROPElITIES
The tangent space to the diagonal is clearly given as follows: T(p.p)~ =
If PI (f) rnp ~, then for any
UI, U2 E
{(W, W) : wE TpM}.
TpM, we can solve the set of equations V+W
=
UI
Dlpv+W=U2 for v. WE TpM. But this means W = UI - V, so we can solve Dlpv - v = U2 - UI for v. Dip - I is onto, and 1 is not an eigenvalue of Dip. Conversely, assume 1 is not an eigenvalue of Dip. Then for any UI, U2 E TpM, we can solve Dlpv - v = U2 - UI for v, and set W = UI - v. Thus we can solve the set of equations =
UI
Dlpv + W =
U2
V+W
for v. wE TpM. and PI(f) rnp~. (d) The case for higher period is proved similarly to part (b).
o
We use the following notation for a disk in the proof: for r > 0 and a positive integer J. DJ(r) is the open ball of radius r centered at 0 in R J , DJ(r) = {x E RJ : Ixl < r}. The first step in the proof of the theorem is to prove that the set of diffeomorphisms all of whose fixed points are elementary is open and dense.
Lemma 3.2. The set
is open alld dellSe in Diffk(M). PROOF. The openness of T(pI'~) follows from Theorem 2.2 since PI is clearly continuous. (The openness of T(PI'~) is also related to Theorem V.6.4.) We fix a I E Diffk(M) for the rest of the proof. To prove that I is in the closure of T(pI.~) in Diffk(M). we apply the Parametric Transversality Theorem 2.3 with a finite dimensional subspace of perturbations. We construct a map ( : DLm(r) ---> Diffk(M) and apply Theorem 2.3 to PI 0 (. We have assumed that M is compact. Clearly, ~ c M x At is a closed submanifold. The differentiability required to apply Theorem 2.3 is k > dim(M) - dim(M x M) + dim(~) = m - 2m + m = 0,
or k ~ 1. Finally, we must construct a large enough space of perturbations of the given I so that (PI 0 ()eu is transverse to ~. Fix IE Diffk(M). To construct the perturbations, we use local coordinates (although it is possible to use more global constructions). Cover M by a finite number of open coordinate charts Uj , {'Pi: V; C Rm - Uj C M}f=l' with compact subsets K j C Uj which cover M, U:=I Ki = M. Let Vj be open sets in M with Kj.1 c Vj C c1(Vj) c Uj. Let Kj.o = {x E K; : I(x) ~ Vi}, and K j • l = c1(K, \Kj •o). Thus, each Kj is union of two compact subsets, Kj = Ki.OuKj.t. such that I(K,.o)nKj = 0 and !(K•. d C c1(V,) CUi' With this decomposition, there can not possibly be any fixed points in any of the sets K •. o: PI (f)(K,.o) n ~ = 0 so PI(f) rnK •.• ~. On the other hand for each i, we need to construct a finite dimensional set of perturbations of I. (, : Dm(r;) - Diffk(M), so
421
10.3 PROOF OF THE KUPKA-SMALE THEOREM
that (PI 0(;)"" is transverse to ~ along to} x K i•1 C Rm x M. The transversality means that this finite dimensional subset of diffeomorphisms, {(i(Dm(ri))}, is large enough to perturb I so that it makes all the fixed points in K i•1 elementary. We proceed to construct the perturbations along K i•l . Let VI be open sets in M with Ki,l c V; c cl(V;) C Vi' Let Pi : M ..... R be a Coo bump function with PilVI ;: 1 and cl( {x : Pi(X) "I O}) C Vi' The set cl({x : Pi(X) "I O}) is called the support 01 P and is denoted by supp({3). For ri > 0, define (i, (i : Dm(ri) ..... Diff"(M) by (;(v)(x) = { :i(rpjl(X)
forx¢Vi E Vi,
+ Pi(X)V) for x
and
(;(v)(x) = (i(V) o/(x). Because the set of diffeomorphisms is open, for small enough ri > 0, the image of Dm(ri) by ( is in Diff"(M). Clearly, (PI 0 (i)e" : Dm(ri) x M ..... M x Mise" for k ~ 1 88 required. For p E Ki,l n (pd/»-l(~), p = f(p) E Ki,l C VI, Pi 0/(p) = 1, and (PI 0 (i)'''(V, p) = (p, rpi(rpj I(p)
+ v)).
Differentiating with respect to v for such a p,
D(pi o (i),8,p)(v,Op) = (OP,D(rpi)op;-l(p)v) D(pi 0 (i)i8,p)(R m x {Op}) = top} x TpM.
and
As noted above, T(p.p)~ =
{(w, w) : WE TpM}.
Together top} x TpM and {(w, w) : wE TpM} span T(p.p)(M x M) = TpM x TpM. Thus, Before applying Theorem 2.3, we combine all the (i into a single map to get transversality along all of M in one step. Let P = Dm(rl) x ... x Dm(r/)
and ( : p ..... Diff"(M) be given by «VI. ... , VI) = (I(VJ) 0·" 0 (/(V/) 0 I. Then
D(pi o()io.p)(O, ... ,O,Vi'O, ... ,O, ... ,O,Op) = D(pi o (i)(O,p)(Vi,Op), and (PI ooe" is transverse to ~ along to} x U:=l K;.I. As mentioned above, PI(f) is transverse to ~ along U:=I K i •o, so (PI 0 ()et! is transverse to ~ along to} x M. By openness of transverse intersection, there is some open neighborhood P' C P of 0 such that (PI 0 ()e" is transverse to ~ along P' x M. By Theorem 2.3, T(pl o(,~);: {t E P': PI o«t) ttI~}
is dense in P'. Because ( is continuous, Diffk(M). Since
!
= «0) is in the closure of «T(pi 0 (,~» in
422
X. GENERIC PROPERI'IES
I is in the closure of T(PI, t.) in DiffA:(M). This completes the proof of the lemma. 0 PROOF THAT 'HI IS OPEN AND DENSE. For I E T(PI,t.), Lemma 3.1 shows that if PI (f)(p) E t. then p is a elementary fixed point. Because the elementary fixed points are isolated, each I E T(plo t.) has only finitely many fixed points. (The set (PI(f))-I(t.) is a manifold of the same codimension as t. which is n, so it is a 0 dimensional manifold, i.e., isolated points.) By Lemma 1.3, each I E T(PI, t.) can be approximated by R diffeomorphism 9 for which each fixed point of I becomes a hyperbolic fixed point of g. In fact, there are an open neighborhood U of all the fixed points of I and a perturbation 9 of I such that Per(l,gIU) = Per(I,fIU). Also the nonexistence of fixed points on the compact set M \ U is an open condition, i.e.,
{g E DiffA:(M) : PI (g)(M \ U) n t. = 0} is open in DiffA:(M). Combining the consideration on and off U, the 9 which approximates I can be taken so that Per( 1, g) = Per( 1, f) and each fixed point of 9 is hyperbolic, so 9 E 'HI' This proves that 'HI is dense in DiffA:(M). By the openness of T(PI, t.), and the openness of the hyperbolicity of one particular fixed point, it follows that 'HI is open. 0 To prove that 'Hn is dense in DiffA:(M), we can not just take pn : DiffA:(M) --+ CA:(M, M x M) and proceed to construct perturbations for an arbitrary I E DiffA:(M). The difficulty is that if p is a fixed point for I for which -1 is an eigenvalue with eigenvector Vi for DIp, then for any perturbations of I, (: DJ(r) --+ Diffk(M),
D({J2 0 ()(O,p)(v, 0) does not span the direction corresponding (0, vi). This difficulty is overlooked by many books giving the proof. The correct proof proceeds by induction on n and considers Pn : 'Hn-I --+ CA:(M,M x M), i.e., only constructs perturbations for I E 'Hn- I. The idea is that for I E 'H n - I , if P has period less than n and r(p) = p (so p has period n/j for some integer j), then D(r)p is hyperbolic so we do not need to construct any perturbations.
Lemma 3.3. Assume 'Hn-I is dense and open in Diffk(M). Then,
is open and dense in Diffk(M).
PROOF. By the openness of transverse intersection, T(Pn, t.) n 'Hn- I is open in 'Hn-I, and so in DiffA:(M). To prove the density in DiffA:(M), it is enough to prove density in 'Hn-I, so we fix IE 'H n - I . Let {'P; : Vi --+ U; C M}f=1 be open coordinate charts which cover M as before. Let U be an open neighborhood of Per(n - 1, f) in M and N be an open neighborhood of I in 'Hn-I such that for 9 E N, Per(n, g) n c1(U) = Per(n - l,g) C U and all the periodic points in Per(n - l,g) are hyperbolic. If p E c1(U) and Pn(f)(p) E t., then p has least period less than n, p is a hyperbolic periodic point, p is a hyperbolic fixed point of and Pn(f) rhp t.. Therefore Pn(f) rhcl(U) t.. Let K; C U; \ U be compact subsets such that uf=1 K; = M \ U. (Note that K; = 0 is allowed.) Also, K; n Per(n - 1, f) c K; n U = 0. To proceed as in the case for n = 1, we need to divide K; into subsets; however for n > 1, we may need more than two subsets because the orbit of a point x E K; can pass through U; for some intermediate
r,
423
10.3 PROOF OF THE KUPKA-SMALE THEOREM
iterate, and we must construct a perturbation 9 of I such that the orbit of a point in Ki goes only once through the set {x : g(x) :1= I(x)}. In particular, each Ki can be written as the union of a finite number of compact subsets, Ki = Ki,j, such that (i) f"(Ki,o) n Ki = 0, (ii) f"(Ki,j) C Ui for 1 ~ j ~ Li , and (iii) ft(K"j) n Ki,j = 0 for 0 < l < nand 1 ~ j ~ L i . (Note that Li = 0 is allowed.) For 1 ~ j ~ Li and 1 ~ i ~ I, let U;,j and U;:j be open sets of M such that (i) K"j C U:,j C c1(U:,j) C U::j C cl(U::j ) CUi, (ii) f"(U:) CUi, and (iii) It(U:) n U::j = 0 for 0 < l < n. For 1 ~ j ~ Li and 1 ~ i ~ I, let {3i,j : M ~ R be a Coo bump function with (3i,jlUL == 1 and SUPP({3i,j) c U::j • For ri,j > 0, define (i,j'(i,j : Dm(ri,j) ~ DiIf"(M) by
uf,:,o
,
{ x
(i,j(V)(X) = (i,j(V)(X)
= (.,j(v) 0
for x '" for x E
+ (3i,j(X)V)
CPi (cpjl (x)
U:: U::
j
and
j ,
f(x).
For ri,j > 0 small enough, the image (i,j(Dm(ri,j)) is in DiIf"(M). Fix p E Ki,j with Pn(J)(p) E A. Let Iv = (i,j(V), For 1 ~ l < n, I!(p) f/. U::j so I!(p) = It(p). The nih iterate, I::(p) = CPi(cpjl
0
r(p)
+ (3i,j 0
r(p)v)
= cpi(cpjl(p) + v), or As for the case n = I,
(Pn
0
(i,j )e" rIl{O} x K . A. ',)
Let L = 2::=1 Lit Pi = Dm(ri,d x ... x Dm(ri,L.), and P = PI Define ( : P ~ Diff"(M) by
«vl.l, .. · ,V/,L,) = (1,I(VI,d 0'"
0
(/,L, (v I,L,)
As for the case n = I, (Pn 00·" is transverse to A along {O} x Pn(J) is transverse to A along cl(U) U U:=I Ki,o. Because M = cl(U) U
I
L.
tal
j=l
0
X... XPI c n. Lm . f.
U:=I UJ':'I Ki,j'
Also,
U (Ki,O U U K"j)'
(Pn o()"" is transverse to A along {O} x M. By the openness of transverse intersection, there is an open neighborhood pi C P of 0 in RLm such that (Pn 0 ()"" is transverse to A along pi x M. By Theorem 2.3, T(Pn oC,A)
= {t E P':
is dense in P'. Because ( is continuous, Since
Pno«t) rIl A}
f is in the closure of «T(Pn 0(, A)) in Diff"(M).
«T(Pn 0 (,A)) c T(Pn, A) n 'H n - I == {g E 'Hn-I : Pn(g) rIl A},
X. GENERIC PROPEIUIES
424
f is in the closure of T(Pn,~) n 1tn -1 in Dilfk(M). This completes the proof of the 0
lemma. PROOF THAT 1tn IS OPEN AND DENSE. For
f
E
T(Pn, ~), each point of least period
n can be made hyperbolic using Lemma 1.3, so 1tn is dense in T(Pn, ~). The set of dilfeomorphisms for which a particular periodic point is hyperbolic is open, so 1tn is
0
open in Dilfk(M).
1t IS A RESIDUAL SUBSET. Since 1t = n::"=I1tn is the countable intersection of open dense subsets of Dilfk(M), it is residual. 0 KS(M) IS A RESIDUAL SUBSET. We define a countable collection of sets of dilfeomorph isms K:(n, R,), such that (i) the intersection of all the K(n, R,) is equal to KS(M) and (ii) we can prove each set K:(n, R;) is open and dense in Dilfk(M). For a hyperbolic fixed point P E Perh(f), let W~(p, f) be the points q in W'(p, f) for which there is a curve {-r(t) : 0 ::; t ::; I} such that (i) the length of "( is less than or equal to R, (ii) "((0) = q and "((1) = p, and (iii) "((t) E W'(p, f) for 0 ::; t ::; 1. Similarly, define W}l(p, f). We say that W}l(PI,f) is transverse to W~(P2,f) as a shorter way of saying that W"(PI,f) is transverse to W'(p2,f) at points of W}l(PI,f) n Wk(P2' f). For a positive integer nand R > 0, let K:(71,R)
= {f E 1t n
:
Wil(PI,f) is transverse to WR(P2,f)
for all PI, P2 E Per( 71, I)}· Then KS(M) =
n n
K:(n,R).
ISn
If each K(71,R) is dense and open in 1tn then KS(M) is residual in Dilfk(M). (It is possible to take only a countable number of R > 0 which go to infinity.) The fact that K(n, R) is open is not difficult. For I E 1tn , there are only finitely many points in Per(n, f). By the arguments which prove the openness of 1tn , there is a neighborhood N of I such that for 9 E N, (i) the cardinality of Per(n, g) is the same as the cardinality of Per(n,f) and (ii) each periodic point of gin Per(n,g) is hyperbolic, so N c 1t n . For 9 EN, let {p,(g)}[=1 be the periodic points in Per(n,g). The stable and unstable maps depend continuously in the C k topology on compact subsets. For each 1 ::; i ::; I and 9 E N, there are maps O':(g) : R" -+ M such that (i) the image of O':(g) is W'(p,(g),g) and (ii) the image of the close ball jj"(R) is WR(p,(g),g), a:(g)(jj"(R)) = W~(Pi(g),g). Similarly, O'r(g) : R'" -+ M gives W"(Pi(g),g). The continuity of the stable and unstable manifolds can be expressed by saying that the maps N -+ C t (R", M) and O'r : N -+ C k (Ru" M) are continuous if C k (R", M) and C k (R"', M) are given the compact open topology on the first k derivatives. For i and j fixed, we can combine these into a map ai,j : N -+ Ck(R" x a";, M x M) by
0': :
a',j(g)(x,y) = (aj(g)(x),aj'(g)(y)). The reader can check that W~(P.(g),g) is transverse to W1Hpj(g),g) if and only if O",j(g) is transverse to ~ along jj"'(R) x jju;(R), The openness of the transverse intersection of W~(P.(g),g) and W1Hpj(g),g) follows from Theorem 2.4. By using all pairs 1 ::; i, j $ I, it follows that K(n, R) is open in N, and so in 1tn and Dilfk(M). To complete the proof, we only need to prove that K:(n, R) is dense at functions f E 1t n . We prove this density by applying Theorem 2.3 to a.,)' If we can show for each pair (i,j), that the set of 9 for which 0"'; (g) is transverse to ~ is dense in N, then
10.4 NECESSARY CONDITIONS FOR STRUCTURAL STABILITY
425
clearly it follows that K:(n,R) is dense in N. We show that we can make W~(Pi(g),g) is transverse to W;l(Pj(g),g) at a single point q E W~(Pi(g),g) n WJHpj(g),g). The transversality at all points can then be obtained by combining these perturbations in a way similar to that used to prove 'HI and 'Hn are dense. (This argument uses the fact that D8'(R) x D"j(R) is compact.) Assume ai,j(g)(Xo,YO) = (q,q) E tl.. Since the stable and unstable manifolds are transverse at the periodic points, q '" Pi, Pj' Let
for x rfc UtI E U",
+ (3(x)v) for x
(i,j(V)(X) = (i,j(V) 0 g(x). The two periodic points Pi, Pj rfc utI so they remain hyperbolic periodic points. Also, g(q) ¢ U", so (i,j(V)(q) = g(q). Similarly, gn(q) rfc UtI for n > 0, so (i,j(V)R(q) = gn(q), q E W"(Pi(i,j(V», (i,j(V», and
at
0
(i,;(V)(Xo) = q
is still the same point in M. For the unstable manifold, a similar argument shows that g-I(q) E W"(Pj(i,;(V»,(i,j(V». However (i,j(V)(g-l(q» = (i,j(V)(q) =
so
+ v),
a'J 0 (i,j(V)(Yo) =
Therefore the perturbation (;,j(v) of 9 changes the unstable manifold at q but not the stable manifold, Le., the perturbation can move W"(Pj, (;,j(v» off W8(Pi,
Because this image and T(q,q)tl. span TqM x TqM, (ai,; 0 (i,j)'" is transverse to tl. at (0, Xo,Yo), A finite number ofthe sets ofthe type of U' cover W~(pi,g)nwJHpj,g), so perturbations in these various U' can be combined to define a
10.4 Necessary Conditions for Structural Stability Chapter IX gives sufficient conditions for R-stability and structural stability, In this section, we consider the necessity of some of these conditions, After several people made contributions to this question Mane (1978b, 1987b) and Liao (1980) eventually proved that if f is a C l 'R-stable diffeomorphism on a compact manifold, then f has a hyperbolic structure on 'R(f), The proof of this result is very difficult and beyond the score of this book, but we discuss some of the easier aspects. We start with the hyperbolicity of fixed points.
X. GENERIC PROPERTIES
426
Theorem 4.1. Let M be a compact manifold and f : M ..... M a diffeomorphism. Assume f is CI 'R.-stable. Then all the periodic points of f are hyperbolic. REMARK 4.1. This theorem was proved by Franks (1971) and Pliss (1971, 1972). The
papers by Pliss also obtained some further results related to the theorem of Mane and Liao. REMARK 4.2. Note that if
f
is structurally stable then it satisfies the assumptions of
the theorem.
f is CI 'R.-stable but it is true if 1. See Robinson (1973).
REMARK 4.3. We prove this theorem assuming that
f is
cr structurally stable for any r
~
The first step in the proof of Theorem 4.1 is to show that a diffeomorphism can be C I approximated by a new diffeomorphism which is equal to its linear part in a neighborhood of a periodic point.
Lemma 4.2. Assume p is a periodic point for f of period n. Let <.p : V ..... U be a coordinate chart at p with <.p(0) = p and A = D(<.p-Io/"o<.p)O. LetN bea neighborhood of f in the C I topology. (a) Tllen there are r > 0 and 9 E N such that (i) p has period n for 9 and (ii) <.p-I og" o <.p(x) = Ax forxE Vn{xE R": Ixl:S r}. (b) If liB - All is small enough, then there are r > 0 and 9 E N such that (i) p has period n for 9 and (ii) <.p-I 0 g" 0 <.p(x) = Bx for x E V n {x E R" : Ixl :S r}. REMARK 4.4. Lemma 4.2 is not true in the C 2 topology. PROOF. (a) We only consider the case of a fixed point. The reader can supply the changes for higher period. We also leave the proof of part (b) to the reader. Let (J : R" ..... R be a Coo bump function with (i) 0 :S (J(x) :S 1 for all x, (ii) (J(x) = 1 for Ixl :S 1, and (iii) .B(x) = 0 for Ixl ~ 2. For r > 0, let .Br : lR" -+ lR be defined by (Jr(x) = .B(x/r). Thus.Br is a bump function which is equal to 1 for Ixl <" , and is equal to 0 for Ixl ~ 2r. The estimates on .Br and its derivative depend on r ill the following fashion: SUp{.Br(X)} = sup{.B(x)} = 1, and
where Co = sup{IID«(J)xll}· Using the bump function (Jr, we can define a perturbation gr. First we write terms of its linear and nonlinear terms: let <.p-I
0
f
f in
o <.p(x) = Ax + j(x)
where j(O) = 0 and D(j)o = 0, Thus IID(f)xll = o(lxIO) and Ij(x)1 = o(lxl), i.e., IID(ilxll and Ij(x)I/lxl both go to zero as Ixl goes to zero. We only consider r > 0 for which SUpp«(Jr) c {x: Ixl :S 2r} C V. Let gr be equal to f outside of U, and <.p-l
0
gr
0
<.p(x) = (Jr(x)Ax + (1 - (Jr(X»<.p-l
0
f
0
<.p(x)
= Ax + (1 - .Br(X))j(X). for x E V. Note that <.p-l 0 gr 0 <.p(x) = <.p-l 0 f 0 <.p(x) for Ixl ~ 2r. On the other hand for Ixl :S r, (Jr(x) = 1 and <.p-l 0 gr 0 <.p(x) = Ax.
\0.4
NECESSARY CONDITIONS FOR STRUCTURAL STABILITY
427
To check that 9r is near / for small r, we need to calculate the derivative of 9r: D(cp-I 09r 0 cp)x = A
+ (1 -
+ i(x)D(IJr)x IJr(x)Dix + i(x)D(IJr)x.
IJr(x»Dix
= D(cp-I 0/0 cp)x -
For Ixl ~ 2r, D(cp-I 0 9r 0 cp)x = D(cp-I 0/ ° cp)x so we only need to consider x with Ixl :5 2r. For Ixl :5 2r, IID(cp-' 09r 0 cp)x-D(cp-I 0/ ° cp)xll
:5 IJr(x)IID(i>xll + li(x)l· IID(IJr)xll = o(ro ) + o(r)(CO ) r = o(ro).
In this last calculation, we used the estimates given above for i(x) and Dix and that sup{IID(IJr)xll} = Coiro From the estimate, there derivative of 9r approaches that of / as r goes to O. Since 9r(P) = P = /(p), the Mean Value Theorem proves that the CO distance from 9r to / goes to zero also. Therefore for r > 0 small enough, 9r E N, the CI neighborhood of /. 0 PROOF OF THEOREM 4.1. Since / is 'R-stable, there is an open neighborhood N of / such that any 9 E N is 'R.-conjugate to / and so 9 is also 'R.-stable itself. Assume that p is a nonhyperbolic periodic point of period n. By Lemma 4.2(a), there is a 91 E N such that in local coordinates 9f is linear, cp-I o9f 0 cp(x) = Ax for Ixl < r. Since A is nonhyperbolic, it can be approximated by another matrix B that has an eigenvalue'\' which is a j-th root of unity with eigenvector v. By Lemma 4.2(b), 91 can be approximated by 92 EN such that in local coordinates '1'-1 0 9~ ° V'(x) = Bx for Ixl < r. Then '1'-1 o~n 0V'(av) = Biav = ,\,iav = av, for lavl < r, so ~n has a curve of fixed points which are nonisolated, i.e., 92 has nonisolated periodic points of a given period. By the Kupka-Smale Theorem, there is a 93 E N with only hyperbolic periodic points, so the periodic points of any given period are isolated. But the two diffeomorphisms 93 and 92 can not be 'R.-conjugate. This contradicts the assumptions on /, and so all the periodic points of / must be hyperbolic. 0 We end the section by stating a result about the transversaiity of stable and unstable manifolds. We give the result in two dimensions because part (a) is easier to state in this case. Parts (b) and (c) are true in any dimensions. We leave the proof of this theorem to the exercises. See Exercise 10.11. Theorem 4.3. Assume M is a compact surface (two dimens.ional manifold) and /,9 : M -+ M are CI diffeomorprusms. (a) Assume / is a Kupka-Smale diffeomorphism. Assume 9 has two periodic points p and q such that W"(P,9) is tangent to W'(q,9) at z and W"(p, 9) is on one side ofW'(q,9) locally near z. (The two manifolds are not topologically transverse at z.) Then / and 9 are not conjugate. (b) If / is a structurally stable then / is Kupka-Smale. (c) Assume / is structurally stable and that 'R(f) has a hyperbolic structure. Then / satisfies the transversality condition. REMARK 4.5. As stated in the opening paragraph of this section, Mane (l978b, 1987b) and Liao (1980) proved a much stronger theorem. If / is a CI 'R.-stable diffeomorphism
428
X. GENERIC PROPERrIES
on a compact manifold, then f has a hyperbolic structure on 'R(f). Hayashi (1992) has some improvements in the end of the proof. Also see Aoki (1991, 1992). If f is C l structurally stable then it follows that f also satisfies the transversality condition. By Theorem 4.3 (or Exercise 10.11(c)), it then follows that 1 also satisfies the transversality condition. The proof of the result of Mane and Liao for diffeomorphisms implies the same result for flows without fixed points. Hu (1994) proved a comparable theorem for flows on three dimensional compact manifolds even when fixed points are allowed. The result of Mane and Liao for flows with fixed points on manifolds of dimension greater than 3 is still unproven. .
10.5 Nondensity of Structural Stability As we have remarked elsewhere, the set of structurally stable diffeomorphisms (or O-stable diffeomorphisms) are not dense in Diffl(M) (unless M = 8 1 ). There were a sequence of papers dealing with this problem including Smale (1966), Abraham and Smale (1970), Williams (1970a), Newhouse (1970a), and Simon (1972). Later examples contradicted further conjectures of genericity of various forms of stability. Below we describe the the example in Williams (1970a). This example is simple once the the DA-diffeomorphism is understood. See Section 7.7 for the description of the DAdiffeomorphism. Theorem 5.1. There is an open set of dilfeomorphisms N N is structurally stable.
c
Diffl(1I"2) such that no
9E
PROOF. We start with the DA-diffeomorphism Ii on 11"2 with a fixed point source Po and a hyperbolic invariant set A. The DA-diffeomorphism Ii can be constructed from a hyperbolic toral automorphism 9 with Ii I(T2 \ U) = 91(11"2 \ U) where U c 11"2 is an open set. As we saw in Section 7.7,
n 00
A=
If(1I"2 \ U).
n=O
We now modify Ii to form a 12 with 11I(T2 \ U) = hl(T2 \ U), so 12 still has A as a hyperbolic invariant set. Inside U, we replace the single fixed point source with two fixed point sources ql and q2 and a fixed point saddle Po. The construction can be made so that See Figure 5.1.
FIGURE 5.1
10.5 NONDENSITY OF STRUCTURAL STABILITY
429
FIGURE 5.2
Finally.
12 can
be modified a third time to obtain a diffeomorphism
h
Let
be a fundamental domain with h(a) = b. and let c = h(b). Let ,8(x) be a bump function such that (i) it is zero away from [b. e] and on the part of W"(Po. h) from Po to b. (ii) it equals one in a neighborhood of the midpoint of [b. e]. and (iii) the forward orbit by 12 of supp(,8) does not intersect supp(,8). 00
supp(,8) n
U 12'(supp(,8»
= 0.
n=l
Let v be a vector transverse to W"(Po.h). For small £ > O. let h(x) = { h(x) h(x)
+ £,8 0
h(x)v
for ,8 0 h(x) = 0 for,8 0 h(x) '" O.
(Here we write the sum as if it makes sense on T2. This should really be written in the coordinates of the covering space R2.) Because of the support of ,8. the part of the unstable manifold W"(po.h) from Po to b remains part of W"(Po./s); in particular. [a.b] C W"(Po.h). so also h([a.b]) C W"(Po.h). However h([a,b]) is no longer equal to the line segment [b. e]. but it bends outward from the stable manifold W"(Pl.h). See Figure 5.2. Because the perturbation from h to 13 does not affect the forward orbits of points in supp(.8). these points stay on the stable manifolds of the same points in A. and 13 has a tangency between W"(Po./3) and W"(z. h) for some z(f3) E A. For a small enough Cl neighborhood N of h. every lEN has (i) a hyperbolic invariant set A(f). (ii) a saddle fixed point Po(f). (iii) two fixed point sources ql(f) and q2(f). (iv) R(f) = A(f) U {Po(f). qdf). q2(f)}. and (v) a tangency between W"(Po(f). f) and W"(z(f). f) for some z(f) E A(f). In the Cl topology (as opposed to the C2 topology). I may have more than one tangency. However. if we use the extremal tangency. then the stable manifold W"(z(f). f) is well defined. (Note that the point z(f) is not itself well defined.) This neighborhood N is the one indicated in the statement of the theorem. By means of a bump function near the point of tangency. the stable manifold on which the tangency occurs, W"(z(f).f). can be moved within W"(A,f). The periodic points of I in A(f) are dense. so the union of stable manifolds of periodic points,
U xEPer(f)
W"(x.n
430
x. GENERIC PROPERTIES
is dense in W"(A(f), f). Therefore the set N, of fEN for which W"(z(f), f) contains a periodic point in A is dense in N. By the Kupka-Smale Theorem, the set N2 = {I' E N: f is Kupka-Smale} is dense in N. For fEN, and f' E N 2, f is not conjugate to f' by Theorem 4.3, since (i) f has a tangency of stable and unstable manifolds of periodic points where locally the stable manifold is on one side of the unstable manifold nnd (il) f' Is KupkR-Smale. (The proof of Theorem 4.3 18 an exercise. See Exercise 10.11.) Since (i) N, and N2 are both dense in Nand (ii) no fEN, is conjugate to any f' E N 2 , no diffeomorphism in N is structurally stable. 0 REMARK 5.1. The idea of the counterexamples to O-stability is to make the tangency between the stable and unstable manifolds take place at a point which is nonwandering. Abraham and Smale (1970) do this for an example on S2 x y2. Simon (1972) does this for a three dimensional manifold. Newhouse (1970a) does this on a two dimensional manifold but nE'eds to use the C 2 topology.
10.6 Exercises 'I'cansversali ty 10.1. Let M and N be compact submanifolds of Rk with dim(M) Prove that {a E Rk : (M + a) n N = 0} is dense and open.
+ dim(N) <
k.
10.2. Let M be a submanifold of JRn. Fix k < n and let B be the set of all n x k matrices A such that A has rank k. Note that for A E B, A(Rk) is a k-plane in Rn. The set B is an open subset of all n x k matrices, so B is a manifold. (You do not need to prove this fact.) Show that for a generic set of A E B, the k-plane A(JRk ) is transverse toM. Kupka-Smale Theorem 10.3. Let F be a topological space, M a compact metric space, and C be the collection of all compact subsets of M with the Hausdorff metric. Assume r n : F -+ C is lower semi-continuous for all n ~ 1. Define r(f) = d(Un r n(f). Prove that r : F -+ C is lower semi-continuous. 10.4. Let MS(S') be the set of Morse-Smale diffeomorphisms on S'. (a) Prove that MS(S') is open and dense in Diff'(S'). (b) Prove that if f E MS(S') then there is a positive integers nand k such that f has j periodic sinks of period nand j periodic sources of period n and no other periodic points. 10.5. Assume M is a compact manifold and L : M -+ JR is a C 2 Morse function, i.e., all the critical points of L are nondegenerate (or V(L) has only hyperbolic fixed points). Assume X be a C' vector field such that L is constant along the trajectories of X (e.g. X could be the Hamiltonian vector field for L). (a) Prove that X can be C' approximated by a vector field Y for which n(Y) is a finite set of hyperbolic fixed points. (b) Prove that X can be C' approximated by a vector field Y that is Morse-Smale and has no periodic orbits. 10.6. Assume M is a compact manifold. Let Xk(M) be the set of C le vector fields on M. Fix k ~ 1. Let 1i~ = {X E Xk(M): all the fixed points of X are hyperbolic }. Let Z = fOp : p EM} be the zero section of TM. (a) Let X E Xk(M). Prove that X : M -+ TM is transverse to Z if and only if each fixed point of X does not have 0 as an eigenvalue. (b) Prove that Tk(M, Z) = {X E Xk(M) : X rtl Z} is dense and open in Xk(M). (c) Prove that 1i~ is dense and open in Xk(M).
10.6 EXERCISES
431
10.7. Let KS(S2) be the set of Kupka-Smale vector fields on the two sphere. Prove that KS(S2) is open in Xl(S2). 10.8. Prove that the suspension of a Kupka-Smale diffeomorphism is a Kupka-Smale flow. 10.9. Prove that the set of Kupka-Smale diffeomorphisms is not open in Diff1(M). Hint: Given an example with nu) '" cl(PerU» for which 1 is Kupka-Smale. Necessary Conditions for Structural Stability 10.10. Let M be a compact manifold and X E Xl(M) a structurally stable vector field on M. Prove that all the fixed points of X are hyperbolic. 10.11. (Proof of Theorem 4.3.) Assume M is a compact surface and I,g : M -+ M are C 1 diffeomorphisms. (a) Assume 1 is a Kupka-Smale diffeomorphism and 9 has two periodic points p and q such that W"(p, g) has a point of nontransverse intersection with W'(q,g) where W"(p, g) locally lies on one side of W'(q, g). Prove that 1 and 9 are not conjugate. (b) Prove that if 1 is a structurally stable then 1 is Kupka-Smale. (c) Assume 1 is structurally stable and that 'R(f) has a hyperbolic structure. Prove that 1 satisfies the transversality condition.
CHAPTER XI
Smoothness of Stable Manifolds and Applications In Chapters V and VIII, we proved various results about stable manifolds for a fixed point and a hyperbolic invariant set. In this chapter we present some general results about the existence and differentiability of an invariant section for a fiber contraction. These results generalize the parameterized version of the Contraction Mapping Theorem which is given in Exercise 5.4. Later in the chapter, we give applications of this theory to prove (i) the differentiability of the Center Manifold, (ii) the differentiability of the hyperbolic splitting for an Anosov diffeomorphism on y2, and (iii) the persistence and differentiability of a "normally contracting" invariant submanifold.
11.1 Differentiable Invariant Sections for Fiber Contractions In this section, we consider maps F : X x Yo ..... X x Yo of the form F(x,y) (f(x),g(x,y)) where X is a metric space and Yo is a closed ball in a Banach space Y. Such a map is called a bundle map, X is called the base space, and Yo or Y is called the fiber. In applications, the set of points in the base space that we want to consider is not always invariant. A subset Xo C X is called overflowing for! provided !(Xo) :::> Xo. The bundle map F(·,·) = (f(-), g(., .)) is called a fiber controction over Xo provided Xo is overflowing for f and there is a ,.. with 0 < ,.. < 1 such that g(x, .) : Yo ..... Yo is Lipschitz and Lip(g(x,·» ::; ,.. for each x E X o, i.e.,
for all x E Xo and Yt, Y2 E Yo. An invariant section over Xo of such a map is a function u· : Xo ..... Yo such that F(x, u·(x» = (f(x),u· o/(x» for all x E Xo n ,-t(Xo), i.e., the graph of u· is invariant (or overflowing) by F. If F is a fiber contraction, then Theorem 1.1 proves that there is a continuous invariant section. After proving this result, we give conditions on F which imply that the invariant section is differentiable. In the case when F is a fiber contraction and I(x) = x for all x, the first theorem is merely the result about fixed points for a parameterized contraction mapping. See Exercise 5.4. The results not only apply to product spaces of the form X x Yo C X x Y but also to subsets of vector bundles. In the applications several of the spaces are of the form IE = UXEX{x} x Ex where the space Ex is a vector space that varies with the point x, and the space X is at least a metric space and often a manifold. With a few assumption on how Ex varies, such a space E is a vector bundle over X. (In the proof of Theorem 1.2, the induction step uses a bundle map on a vector bundle even if the original space is a product.) The space X is called the base space and Ex is called the fiber over x. The map 11' : IE ..... X taking {x} x Ex to x is called the projection. The tangent space of a manifold is an example of a vector bundle. See Hirsch (1976) for a formal definition of a vector bundle. 433
434
XI. SMOOTHNESS OF STABLE MANIFOLDS AND APPLICATIONS
We say that a metric d on a vector bundle E is an admissible metric provided (1) it induces a norm on each fiber Ex for x E X (so we write Ivx -wxl for d(vx, w x», (2) there is a complementary bundle E' over X such that (i) the sum of the two bundles E$E' =U{x} x Ex x ~ x is isomorphic to a product bundle X x Y, where Y is a Banach space, (ii) the product metric on X x Y induces the metric d on E, and (iii) the projection {x} x Y onto Ex along E~ is of norm 1. In a vector bundle with an admissible metric, the disk bundle 01 radius R is the subset E(R) = U {x} x Ex(R) xEX
where Ex(R) is the closed ball (or disk) of radius R in Ex. A map F: IE(R) -+ E(R) is called a bundle map provided 11" 0 F( {xl x Ex(R» takes on a unique value I(x). Therefore a bundle map F induces a map 1 : X -+ X such that 11" 0 F = 1011". A bundle map on a vector space IE with an admissible metric induces a map F : X x Y(R) -+ X x Y(R) on the associated product space by means of the composition of (i) the isomorphism from X x Y(R) to (E $ E')(R), (ii) the projection from (E$E')(R) to E(R), (iii) the map F from E(R) to E(R), (iv) the inclusion ofE(R) in (IE $ E')(R), and finally (v) the isomorphism from (E $ E')(R) to X x Y(R). By using this construction, the theorems which we state for a map on a product spaces are valid for a bundle map on a disk bundle. See Hirsch and Pugh (1970) or Shub (1987) for details. Now we can state the first result.
Theorem 1.1 (Invariant Section Theorem). Assume X isametricspace, Xo C X, ll1ld Yo is a closed bounded ball in a Banach space Y. Assume F : Xo x Yo -+ X x Yo is a continuous fiber contraction over Xo with F(x, y) = (J(x), g(x, y». (Or F is a bundle map on a disk bundle over Xo ll1ld is a contraction on each fiber.) Assume that (i) 1 is overflowing on X o, and (ii) IIXo is one to one. Then there is a unique invarill1lt section over X o, u· : Xo -+ Yo, ll1ld u· is continuous. PROOF. Consider the spaces of bounded functions and continuous functions given by
g = {u : Xo gO = {u : Xo
and
-+
Yo}
-+
Yo : u is continuous}.
Note that because Yo is a bounded ball, g is the space of bounded functions. On g and go put the CO-sup topology, lIuI - u2110 = sup lUI (x) - u2(x)l. xEX.
This norm makes both g and gO complete normed linear spaces since Yo is complete and the sections are bounded. See Dieudonne (1960), 7.1.3 and 7.2.I. There is an induced map on the sections rF : {i -+ (i such that the graph of rF(U) is a subset of the image by F of the graph of u. (These two graphs are not necessarily equal because Xo is overflowing but not necessarily invariant for J.) This map r F is called the graph translorm 01 F. In fact it is p068ible to give a formula for rF(U). Let x E Xo. Then I-I(X) E Xo is well defined because 1 is one to one on XOi the value of u
11.1 DIFFERENTIABLE INVARIANT SECTIONS FOR FIBER CONTRACTIONS
435
at I-I(x} is given byuorl(x} and is in the fiber over I-I(x}; the imageofuo/-l(x} by F must be (X,rF(U)(X}), so (X,rF(U)(X}) = F(rl(x},u 0 rl(x)) = (x,g(rl(x},uorl(x))) = (x,gouorl(x}) where u(x} = (x,u(x}). Thus
We claim that r F is Lipschitz on {i with Lip(r F} ~ "', where 0 < '" < 1 is the fiber contraction constant for F. Let UI, C12 E {i, and &1, U2 : Xo -+ Xo x Yo be induced as above. Then IIrF(C1t}-rF(C12}1I0= sup Igo&lorl(x}-gO&20rl(X}1 xEX.
~
sup Igo&I(P}-goU2(P}1 pEX.
~
sup ",luI(p} - u2(p}1 pEX.
= "'lIuI - u2110. Because r
F
is a contraction on the complete metric space
g,
it has a unique fixed point
F has a unique bounded invariant section over Xo. Because rF preserves the closed subspace gO, u* E go and the invariant section is continuous. 0
u*, or
We want conditions whicli imply that the invariant section is differentiable. It is not enough for F to be a fiber contraction. The following example gives a function which contracts more along the base space X than along the fibers Y and there is no differentiable invariant section. Example 1.1. We take the base space as Sl = {x E IR mod I} and fiber space R. Let I : Sl -+ Sl be a Coo diffeomorphism with two fixed points, an attracting fixed point at and a repelling fixed point at 1/2. FUrther, assume that I(x} = x/3 for -1/3 ~ x ~ 1/3. (This is really the form of the function on the covering space R.) To define g, let (3 : [0, I] -+ [0, I] C R be a Coo bump function with supp({3) = [1/9.1/3] and (3(2/9) = 1. The map 9 : SI x R -+ R is defined by g(x, y) = (3(x) + y/2. Notice that I contracts more strongly on the base SI (by a factor of 1/3) than 9 contracts on the fiber (by a factor of 1/2). Let F = (/,g) be the bundle map on SI x R, which takes SI x [-3,3] into its interior. Since F is a fiber contraction, it has a unique continuous section u· , go u* I-I = u· . By the iterative process used in the proof of Theorem 1.1, u·(x} = for 1/3 ~ x ~ 1. For 1/9 ~ x ~ 1/3, rl(x} E [1/3.1/2] (because 1(1/2) = 1/2 and I is monotone}. so u· 0 I-I (x) = and
°
°
°
u·(x} = go u·
0
rl(x} = g(rl(x},O} = (3o rl(x) = 0.
However for 1/33 ~ x ~ 1/32, rl(x) = 3x E (1/9,1/31 and
u·(x) = gou'(3x) = g(lx.O) = 8(3x).
436
XI. SMOOTHNESS OF STABl-E MANIFOLDS AND APPLICATIONS
In particular, (7.(2/33 ) = 1 and (7*(1/3 3 ) = (7.(1/3 2 ) = 1/32+1, f-I(x) = 3x E [1/33 ,1/3 21 and
o.
Next for 1/33 +1 ~ x ~
(7*(x) =gou*(3x) = g(3x, p(32x» =
1
2 P(3 2 x).
In particular, (7.(2/3 4 ) = 1/2 and (7*(1/3 4 ) = (7*(1/3 3 ) = o. By induction on n, for 1/33 +n ~ X ~ 1/32+n with n ~ 1, f-I(x) = 3x E [1/3 2+n , 1/3 1+n l and
(7*(x) = go u*(3x) 1 = g(3x, 2n - 1 t1(3 n+1 x » = ..!..t1(3n+1 X ).
2n
In particular, (7*(2/33+ n ) = 1/2n and (7*(1/33 +n ) = (7*(1/32+ n ) = O. Since (7*(0) = 0 and (7·(2/33+ n ) = 1/2n , the difference quotient
(7'(2/33+ n ) - (7·(0) 33+n (2/33+ n ) - 0 = 2n+1 does not converge as n goes to infinity. Therefore, (7* is not differentiable at x = o. By the calculation of the form of (7* for x '" 0, it is differentiable at all other points. The reader can check using the Mean Value Theorem that there are points in the interval [1/33+n, 2/33+ n l at which the derivative is at least 3n +3 /2 n , and this quantity is unbounded as n goes to infinity. This completes the analysis of this example. To insure that the invariant section is C l we need to assume that the contraction on fibers is stronger than the contraction in the base space. The following theorem makes this precise.
Theorem 1.2 (C r Section Theorem). Assume X is a manifold, Yo is a closed bounded baIl in a Banach space Y, and Xo C X is a subset for which Xo = cl(int(Xo» (so differentiation makes sense on Xo). Assume F : Xo x Yo --+ X x Yo is a C l fiber contraction on Xo with uniformly bounded derivatives on Xo x Yo and F(x,y) = (f(x), g(x, (Or F is a bundle map on a disk bundle over Xo.) Assume that (i) f is overflowing on X o, and (ii) flXo is a diffeomorphism from Xo to its image I(Xo ) J Xo. For each x E Xo, let Ax = II(Dlx)-1 II = l/m(Dfx). (Note that Ax measures the greatest expansion of /-1 at I(x), and 1/ >'x measures the minimum expansion or greatest contraction of I at x.) Let D2 g(x.y) : Y --+ Y be the derivative with respect to the variables in Y. For each x E Xo, let
y».
The map F is a fiber contraction so sup
Itx
< 1.
xE/-'(Xo)
(a) Assume further that 1
~ T ~ 00,
F is cr, and
sup xE/-'(Xo)
ItxA~
<1
1LJ DIFFERENTIABLE INVARIANT SECTIONS FOR FIBER CONTRACTIONS
437
for 1 :5 j :5 r. Then the unique invariant section is cr. (b) Assume that 0 :5 r :5 00, a > 0, F is e r +a with uniformly bounded Holder constant, sup /£xA~ < 1 xE/-'(Xo)
for 1 :5 j :5 r, and sup /£xA~+a xE/-'(Xo)
< 1.
Tllen the unique invariant section is e r +a , i.e., the roth derivative is a-Holder. REMARK 1.1. When r = 0 and a
> 0, the theorem is true when Yo is merely a metric
space. REMARK 1.2. This theorem is essentially in Hirsch and Pugh (l970). For part (b), they make the comparisons of derivatives at different points:
[ sup ItxA~1 [ sup A~I < 1. xE/-1(Xo) xE/-'(Xo) Also see Hirsch, Pugh, and 8hub (1977). REMARK 1.3. Hurder and Katok (1990) prove the result with the purely pointwise assumption that sup /£xA~+Q < 1. xE/-'(Xo)
We refer to these references for the proof in the case of Holder continuous derivatives. Also see Exercises 11.2 and 11.3. REMARK 1.4. Below we let gl be the e l sections in go. The graph transform rF preserves gl but it is not a contraction on gl with the e l topology, so we can not prove that a· E gl by that means. We do not proceed exactly like the proof of the stable manifold, but instead show that the graph transform preserves Lipschitz sections. After this step, we show that the derivative takes a family of appropriate cones of vectors into itself. Note the assumption that sUPxE/-1(Xo) ItxAx < 1 is just the condition needed to verify this invariance of cones. PROOF FOR r = 1. Let go = {a : Xo -+ Yo : a is continuous} and rF : go -+ go be as in the proof of Theorem 1.1, rF(a) = go (id, a) 0,-1. By Theorem 1.1, there is a unique invariant section a· E gO. We first prove that the invariant section a· is Lipschitz. Let
gl gl(L)
= {a: Xo
-+
= {a E gl:
Yo: a is
e l },
sup IIDaxll :5 L}, .,EXo
and gLiP(L) be the closure of gl(L) in go in terms ofthe CO-sup topology. We justify the notation for gLip(L) with the following lemma.
Lemma 1.3. Every a E gLip(L) is Lipschitz with Lip(a) :5 L, PROOF. By the Mean Value Theorem, every a E gl(L) is Lipschitz with Lip(a) If a i E gl(L) converges to a E gLip(L) then
la(x) - a(y)1
= j-ao lim lai(x) :5 .lim Ld(x,y) ,-ao
=Ld(x,y),
ai(Y)1
:5 L,
438
XI. SMOOTHNESS OF STABLE MANIFOLDS AND APPLICATIONS
o
which proves that CT has a Lipschitz constant as claimed.
REMARK 1.5. The proof of the above lemma actually shows that the set of maps whose
Lipschitz constant is bounded by L is closed in go. Also gLiP(L) is actually the set of all Lipschitz sections with Lipschitz constant less than or equal to L, but we do not need this fact. We merely use that gLip(L} is closed in gO and gl(L) is dense in QLip(L). Let p.
=
sup
Kx
~x.
xEf-I(X.)
The derivatives of 9 are uniformly bounded so we can take C > 0 such that
for all (x. y) E Xo x Yo. where D I 9(x.),) : TxX -- Y is the derivative with respect to the variables in X. Take Lo = C/(l - p.} > 0, i.e., Lo > 0 so that C + p. Lo = Lo. With these constants we can show that r F preserves gLip(L o}. Lemma 1.4. With Lo as chosen above, PROOF. For a E
rF take gLip(L o} into itself.
gl(Lo) and x E Xo,
IID(rdCT»xll
= IID(9 o o- o r
' )xll ~ IID I 9"oJ-'(x)D(r l )xll
+ IID29"of-l(x)DCTf-'(x)D(r')xll
C + Kf-I(X) Lo Af-I(x) ~ C + p.Lo = Lo· ~
Therefore rdCT) E gl(Lo). Because gl(Lo) is dense in gLiP(Lo) and gLiP(Lo ) is closed,
rF(QLip(Lo}}
c gLip(L o}.
o Take any a O E Ql(Lo). Then for n 2: 1, r~(CTO} E gl(Lo), r~(CTO) converges to CT> as n goes to infinity, QLip(Lo) is closed, and so CT> E gLiP(Lo}. This proves that the invariant section is Lipschitz. To prove that the invariant section is Cl, consider the cones C~(2
Lo)
= {(v, w) E TxX
x Y : Iwl ~ 2 Lo Ivl}·
The calculation of Lemma 1.4 shows that
We need to measure the maximum opening of a cone. If CU TxX then the angle of opening of CU is defined to by ..((CU ) = sup{ 11:11 : (v, w) E C U
c Tx
c
C;:(2 Lo) is a cone over
x Y}.
The following lemma proves the related fact that the angle of opening of the cone decreases by a factor of p..
11.1 DIFFERENTIABLE INVARIANT SECTIONS FOR FIBER CONTRACTIONS
Lemma 1.5. IfC" C C:(2 Lo) has L(CU) = {3 > 0, then L(DF(x.y)C") PROOF.
439
~ 1I{3.
Assume
(v, w), (v, w') E DF(x,y)CU
c
T/(x)X x Y,
then (v, w) = DF(x.y) (v, w) (v, w') = DF(x.y) (v, w').
The vector v = Dfxv, so v = (Dfx)-I v (so v is the same for both pairs of vectors), Ivl ~ Ax lvi, and Ivl- I ~ Ax lvi-I. To estimate the components in Y, w - w' = D I 9(x.y)V
+ D 2 9(x.y)W -
D I 9(x.y)V - D 2 9(x.y)W'
= D 2 9(x.y)[w - w'I,
so
Iw - w'l
~
Itx
Iw - w'l·
Thus the angle of the opening of the cone DF(x.y)CU is estimated as follows: L(DF(x.y)C") = sup{ ~
Iw-w'l Ivl : (v, w), (v, w')
sup{1tx Ax
E DF(x.y)C"}
Iw Ivl - w'l : (v, .•w), (v, .., w)E
C"}
~ 1-'{3.
This proves the bound on the angle of the opening which is claimed.
o
By induction on n, L(DF;.o/-n(xp;-n(x)(2 Lo) ~ I-'n L(Ci-n(x)(2 Lo» = I-'n4L o· As in the proof of the stable manifold, it follows that
n DF;'o/-n(x)Ci-n(x)(2Lo) 00
Px =
n_O
is a linear subspace which is a graph over TxX of slope less than or equal to Lo. By using an argument like that for Proposition V.IO.S, we get that u" is CI. This completes the proof of Theorem 1.2 for r = 1. 0 r ;?: 2. By induction, u" is cr- I , so at least CI. Let A; = Du;. Then Du; : TxX -+ Y is linear for each x, so Du; E L(TxX, Y) where L(V1t V2 ) is the space of bounded linear maps between Banach spaces. We put the norm on L(Tx, Y) given by the norm of a linear operator. (Notice that this definition uses a Riemannian norm on TX.) We form a space of possible derivatives of such sections. Let £(X) be the bundle PROOF FOR
£(X) =
U {x} x L(TxX, Y). xEX
440
XI.
SMOOTHNESS OF STABLE MANIFOLDS AND APPLICATIONS
Let C(Xo) = C(X)IXo be the bundle restricted to fibers over points in X o, and C x = {x} x L(TxX, Y). Next we want to define a bundle map on C(Xo) that is compatible with the transformations of derivatives by the action of rF on gl. If u E gl, then rF(U) is C l and D(r F( u»x = Dgao/ -' (x) (id, Du1-' (x) )D(f-I)x by the Chain Rule. Using this a motivation, we define a bundle map \If C(Xo) -+ C(X) by
(f, t/J)
\If (x, S) = (f(x), t/J(x, S» = (f(x), Dgao(x)
0
(id, S)
0
D(f-I )/C x».
Note that in this definition, we consider (id, S) as taking values at tangent vectors at (x, u·(x». Thus we only consider the space of possible derivatives along a·. The map \If is C r - I because F is C r and u· is C r - I . Lemma 1.6 shows that \If is a fiber contraction over x by a factor of /3x = Itx Ax :$; 11 < 1. By the definitions, sup /3x Ai :$; 11 < 1 xEXo
for 1 :$; j :$; map
T -
1. The invariant section u· is C l by the induction hypothesis, so the
is an invariant section of \If, i.e., a fixed point of r 1jI. By applying this theorem for T - 1 to \If. A· is a C r - I invariant section. The fact that A· is C r - l implies that u· is cr. This completes the induction step in the proof except for the following lemma. 0
Lemma 1.6. The map \If : C(Xo) ItxAx·
-+
C(X) is a contraction on the fiber over x by
PROOF. The map \If is a bundle map by its definition. For (x, S), (x, S') E cx, 1It/J(x,S) - t/J(x,S')1I = IIDgao(x)((id,S) - (id,S')]D(rl)/(x)1I :$; IID 2 ga(x)(S - S']IIIID(f-I)/(x)1I :$; II:x Ax liS - S'II
as claimed.
o
REMARK 1.6. Note that the proof of Lemma 1.6 is essentially the same as the proof of Lemma 1.5. Also, it is very important that both S and S' take there values at the same point a·(x), so that the derivative of 9 is calculated at the same point for both terms. REMARK 1. 7. Note that the proof of the induction step (the case T ~ 2) does not apply to prove the case for T = 1. The reason is that without using the proof for T = 1 we do not know that the invariant section A· for \If is in fact the derivative of the invariant section u· for F.
11.2 DIFFERENTIABILITY OF INVARIANT SPLITTING
441
11.2 Differentiability of Invariant Splitting As an application of the cr Section Theorem, we consider an Anosov diffeomorphism on T2 and show that its splitting is CI. Theorem 2.1. Assume f : T2 -+ T2 is a C'l An060v diffeomorphism. Then the sulr bundles IE" and IE" in the hyperbolic splitting depend in a Cl fashion on the base point. PROOF. We prove that the unstable bundle is CI. For this proof, we use that the stable bundle has one dimensional fibers but not that the unstable bundle is one dimensional. For this reason to prove that IE" is C I , we assume that I is a C2 Anosov diffeomorphism on yn with IE" one dimensional. In the proof we write the derivative as if I were a map on Rn , i.e., we confuse I with its lift to Rn. By assumptions there is a continuous splitting IE" e E". This splitting can be approximated by a C I splitting F" e F" which is almost invariant. In terms of the splitting IF" elF",
Dfx =
(~: ~:)
where Ax : IF~ -+ lFj(x) , Bx : IF~ -+ lFj(x) , etc. Because the splitting is near the hyperbolic invariant splitting, there are A > 1, 0 < p.' < p. < 1, and E > 0 such that A - E > 1, p. + f < 1, A-I + f + 2f/P.' < I, m(Ax) ~ A, p.' $ m(Dx) = IIDxll $ p., and IIBxll, IICxll $ f. (Note that the bundle IF" is one dimensional so the norm of Dx is really an absolute value.) The bundle IE~ is the graph of a linear map F,! -+ IF~. Let ex be the space of all linear maps from IF~ to IF~) with norm less than or equal to 1,
Let
e be the disk bundle over yn of all the ex, e = U {x} x ex' xET"
e
The natural transformation on is given in terms of the graph transform induced by the derivative of I, II! = (f, "') : e -+ e where "'x: ex -+ e,(x)' To derive the formula for "'x, let O'x E ex and v E F,!. Then D/x(v,O'xv) = ((Ax
+ BxO'x)v, (Cx + DxO'x}v)
= (w, "'x(O'x}w).
So, v = (Ax
+ BxO'x)-I W and "'x((7x)w = (Cx + DxO'x}{Ax "'x((7x) = (Cx + DxO'x}{Ax
+ Bx(7x)-l w + Bx(7x)-I.
or
To check that Ax + B x(7x is invertible note that
m(Ax + BxO'x) ~ m(Ax} - IIBxllllO'xll ~ A-f >1.
(See Exercise 11.6.) Since m(Ax + BxO'x) > 1 > 0, Ax + B"O'x is invertible. Since I is C2, II! : -+ is CI. Also, II! covers the map I : T" -+ T". The following lemma shows that II! is a fiber contraction.
e e
442
XI. SMOOTHNESS OF STABLE MANIFOLDS AND APPLICATIONS
Lemma 2.2. (a) The map ill is a fiber contraction by a [actor o[K. x = (IIDxll/m(Ax»+
2t on ex' (b) Let e = sup{lI1/Jx(Ox)1I : x E T"}. The bundle map ill preserves the disk bundle o[radius Lo = e/(l- K.) in e where K. = SUPxETn K.x. PROOF. (a) If u!.u~ E ex. then by adding and subtracting the same term and applying the triangle inequality. we get
l11/Jx(U!) -1/Jx(u~)11
= II (ex +
Dxu!)(Ax + BxU!)-l - (ex + Dxu~)(Ax + Bxu~)-III
~ lI(ex + Dxu!) - (ex + Dxu~)IIII(Ax + BxU!)-11i
+ lI(ex + Dxu~)lIII(Ax + BxU!)-1 - (Ax + BxU~)-III· Now we estimate each term in the right hand side above. The first term has the following estimate: lI(ex + Dxu!) - (ex + Dxu~)11
= IIDx[u! -
u~JII
~ IIDxllllu! - u~lI· Next. II(Ax + Bxu~)-III
= [m(Ax + BxU~>rl ~ [m(Ax) -IIBxll'lIu~WI ~ [m(Ax) - tr 1
and lI(ex + Dxu~)11 ~ lI(exll + IIDxllllu~)1I ~ t
+ Il.
To estimate the next term. note that lIa- 1 - b-III = lIa-Ibb- 1 - a-lab-III ~ lIa-11lllb - all lib-III.
so
II(Ax + Bxu!)-I - (Ax + B"u~)-III ~ II(Ax + Bxu!)-IIIIIBxllllu! - £1~IIII(Ax + BxU~)-lll ~ t (oX - t)-2I1u! - £1~II.
using the fact that II(Ax + Bx£1!)-III.II(A x + BxU~)-11i ~ (oX - t)-l and IIBxll ~ t. Combining these estimates we get that l\1/Jx(£1!) - 1/Jx(£1~}11
~ IID"lIlIu! - £1~1I [m(Ax} - tr l + (f + 1l)t{oX - f)-2I1u! - £1~1I
< [IIDxll + 2t]II£1 1 _ £1211 -
m(Ax)
x
x
11.2 DIFFERENTIABILITY OF INVARIANT SPLITTING
as claimed. (In the last inequality we used that IIDxll [m(Ax) -fj-I ~ [IIDxII Im(Ax)] +f and (E + J.L)E (A - E)-2 ~ t.) (b) Assume U x E .cx(Lo). Then
IItPx(ux)1I
~
~
IItPx(ux) - tPx(Ox)1I + IItPx(Ox)1I "x lIuxll +C
~"Lo+C
= Lo·
Therefore tPx(ux) E .cf(x)(Lo).
0
Since III covers the map /, we need the estimate on II(D/x)-III: Ax = II(D/x)-III ~ II(Dx)-11i + E = IIDxll-1 + E. (This uses the fact that IF! is one dimensional.) Then the fiber contraction times the maximum expansion of /-1 is less than 1: sup "x Ax ~ sup [(IIDxll Im(Ax)) + 2f][IIDxll- 1 + fj
xET"
xET" ~ m(Ax)-1 + f + 2f/llDxll ~A-l+E+2f/J.L'
<1.
It is important that the product of the rates is taken before the supremum is taken so that the factors II Dx II and II Dx 11- I multiply to give 1. Thus the contraction on fibers .c(Lo) is stronger than the contraction within T" and the invariant section is CI. Since the invariant section has E: as a graph, the unstable bundle is CI. The proof for the stable bundle is similar provided that the unstable bundle has one dimensional fiber. 0 REMARK 2.1. The bundles are not necessarily C2 even if the diffeomorphism is C3. This is because IIAx ll-I[IIDx ll- 1 +fj is not necessarily less than 1. However, various rigidity results have been proven. If an Anosov diffeomorphism / : 1'2 --+ y2 is Coo, area preserving, and has C2 invariant bundles, then the bundles are Coo. See Hurder and Katok (1990). Also see de la Llave, Marco, and Moriyon (1986) and de la Llave (1987). REMARK 2.2. In higher dimensions the bundles are not necessarily CI but they are Holder. See Hirsch and Pugh (1970). Their treatment uses the CO Section Theorem of the last section. The same type of argument can be used to prove that the stable bundle is CI for a hyperbolic attractor with dim(E:) = 1. Theorem 2.3. Assume / : M --+ M is C2 and has a hyperbolic attractor A with dim(E:) = 1. Let E! = Tx W"(x) [or x in a neighborhood o[ A. Then the stable bundle is CI. PROOF. The proof is like above, but the map /-1 is overflowing on a neighborhood U of A. The estimates similar to those of Theorem 2.1 but applied to D(f-I)x give a fiber contraction on the space of linear maps whose graphs could potentially give E!. See Hirsch and Pugh (1970) for more details. 0
444
XI. SMOOTHNESS OF STABLE MANIFOLDS AND APPLICATIONS
11.3 Differentiability of the Center Manifold The existence of a cr Center Manifold is stated in Section 5.10.2. In this section, we indicate how this result follows from the cr Section Theorem. What is needed is to show that the center-unstable manifold, WCU(O,f), and the center-stable manifold, WC'(O, f), are (The center manifold is the intersection of these two manifolds.) We concentrate on showing that WC"(O, f) is C r because the prooffor WC'(O, f) is similar. The proof in Section 5.10.2 (using the methods of Section 5.10.1) shows that WCU(O, f) is C l . We indicate the induction argument which proves that it is Let 2 ::; T < 00 be a fixed level of differentiability. (The proof for T = 1 is done.) We assume I : U c Rn --> Rn is a C r map with 1(0) = o. Also the tangent space at 0 splits, Rn = IE" (fl EC (fl E', where these subspaces are labeled in the usual manner. Let 0< J' < 1 be such that IIDlolE'1i < 1'. Let ~ > 1 be chosen so that JJ~r < 1. A basis can he chosen so that in terms of the norm in this basis II D 10 I lIEu (flEe II < ~. (Notice that for a fixed ~ > 1, it is not possihle to satisfy this inequality for all T ~ 1. This is {'Ssentially the reason the manifold can not be proven to he Coo. Also if D lollE c is diagonalizahle, thl'n it is pO.'l.'!iblc t.o make IIDlolllE" (fl1E'1i ::; 1. ) Next. we want to get global estimates and not just at o. We do this by extending I using a bump function to all of Rn so it is uniformly near the derivative at 0. Let !3 : R n -> R be a bump function with supp(!3) = B(2,O) and !3IB(I,O) = 1. Define !3.(x) = !3(fX), so supp(!3.) = B(2f,0) and !3.IB(f,O) = 1. Let A = Dlo (in the basis indicated above) and
cr.
cr.
F.(x) = !3.(x)/(x) + (1 - !3.(x))Ax. As f goes to 0, F. converges to the linear map Ax in the C l topology. (See Proposition V.7.5 and the proof of Lemma X.4.2.) In particular, if f > 0 is small enough then IID(F.)xllE'li < I' and IID(F.);lIIEU (flEcll < ~ for all x ERn. We fix this f and write F for F.. (Notice that even if DlollEc is diagonalizable, it is not possible to make IIDI; I lIEu (flEc" ::; 1 for all x ERn.) Let C(Rn) be the bundle and III : C(Rn) --> C(Rn) be the bundle map as defined in Section 11.1 for the proof for T ~ 2. Lemma 1.6 proves that III is a fiber contraction by a factor of JJ~. The construction above gives that (I' ~)~r-l < 1, so the the invariant section is C r - l . Let o' : IE" (flEc --> IE' be the C l invariant section whose graph gives WCU(O, F). The map A; = Do: is an invariant section for III. It follows that Do: is C r - l and u· is cr. This completes the proof.
11.4 Persistence of Normally Contracting Manifolds In this section, we assume that we are given a C r diffeomorphism on a manifold, I: M --> M with an invariant compact C l submanifold V c M, I(v) = V. The main theorem gives conditions for an invariant manifold to persist for perturbations of f which are C l small. The first step is to define a condition on the invariant 8ubmanlfold called normally contracting for f at V. To make the definitions, we need the notion of a normal bundle of a submarlifold. At each point x E V, it is possible to pick a subspace N x of the tangent space TxM which is a complementary subspace to Tx V, TxM = Tx V (fl Nx . These subspaces can be chosen so they vary differentiably on the point x E V, so together they form a vector bundle ovt'r V. N = {xl x N x·
U
xEV
11.4 PERSISTENCE OF NORMALLY CONTRACTING MANIFOLDS
445
This vector bundle N is called a nOTmal bundle to V (in M). For each point x E V, there are two projections defined: 11'~ : TxM -+ TxV, the projection along N x onto TxV, and 11'~ : TxM -+ N x , the projection along TxV onto N x . Definition. A diffeomorphism I : M -+ M is called nOTmally controcting at V provided V is a compact invariant submanifold for I and there are constants C ~ 1 and 0 < Il < 1 such that
111I'f,(x)DI!INxll $ 111I'f,cx)DI:INxll $
CIl"
and
CIl" m(DI:ITx V)
for all x E V and k ~ 1. These conditions mean that I contracts toward V and the rate of contraction toward V is stronger than any contraction within V. (The term m( D ITx V) measures any possible contraction within V.) To get n higher degree of smoothness of the invariant manifold of a perturbation, we need to make further assumptions on the rate of contractions toward V relative to the contradions within V. For r ~ 1, a diffeomorphism I : M -+ M is called r-noTmally contmcting at V provided V is a compact invariant submanifold for I, I is and there are constants C ~ 1 and 0 < I" < 1 such that
I:
cr,
111I'f,(x)DI!INxll for all 0 $ j $ r, x E V, and k
~
$ CI""m(DI!ITxVp
L
There is a generalizatioJl,-of r-normally contracting to r-normally hyperbolic invariant manifolds; see Hirsch, Pugh, and Shub (1977) and Fenichel (1971). This latter condition allows there to be both contracting and expanding directions within the normal bundle.
REMARK 4.1.
In our definition. we did not require that the normal bundle be invariant by the derivative map. However, if I is normally contracting along V, then it is possible to choose another normal bundle that is invariant. Proposition 4.1. Assume I: M -+ M is normally contracting at V. (a) There is a continuous choice of the normal bundle that is invariant by tbe deriv-
ative of /. D/x(Nx) = Nf(x)' (b) By changing the Riemannian norm of M, it is possible to take C = 1 in the definition of normally contracting at V. We leave the proof of this proposition to the exercises. See Exercise 11.9. In the rest of this section we take the invariant normal bundle and adapted norm given by this proposition. Once the normal bundle is invariant and C = 1, we can leave the projection out of the conditions on the rates of contraction, and write that IIDlxlNxll $l"m(DI!ITxV)i for 0 $ j :5 r. This condition that DI contracts vectors in N more strongly than vectors tangent to V implies that the derivative of I preserves a family of cones of vectors which point more along V than in the normal direction. This fact is crucial in the main theorem of this section which we state next. Theorem 4.2. Let I : M -+ M be a cr diffeomorpbism, r ~ 1. Assume I is r-normally contracting at V, where V is a compact Cl submanifold of M. If g: M -+ M is a cr diffeomorphism which is Cl near I, then 9 bas a invariant r-normally contracting submanifold Vg which is Cl near V.
cr
REMARK 4.2. A similar theorem is true for flows with very little change in the definitions, statement, or proof.
446
XI. SMOOTHNESS OF STABLE MANIFOLDS AND APPLICATIONS
REMARK 4.3. If 9 is C r near
I,
then its invariant manifold Vg is C r near V.
REMARK 4.4. This theorem has a long history. See the remarks in Hale (1969) and Hirsch, Pugh, and Shub (1977). Sacker (1967), Fenichel (1971), and Hirsch, Pugh, and Shub (1977) have recent results in this direction and beyond. REMARK 4.5. This theorem has many applications in Dynamical Systems. The proof of the Andronov-Hopf Theorem for diffeomorphisms is related to this theorem. For this bifurcation result, the full nonlinear map is considered as a perturbation of a normal form. The normal form of the map trivially has an invariant circle which is normally contracting. As the parameter goes to the bifurcation value, the extent of contraction toward the invariant circle goes to one. However, the perturbation effects of the true nonlinear map from the normal form is small enough so the nonlinear map also has an invariant closed curve. See Ruelle and Takens (1971). Before starting the proof of the theorem, we discuss some constructions and results related to the theorem. We start by showing that V is necessarily a C r manifold under the hypothesis of the theorem. This same proposition shows that Vg is C r once we have shown that it is C 1 •
Proposition 4.3. Assume I is C r and r-normally contracting at V, where V is a compact invariant C 1 submanifold of M. Then V is a C r submanifold of M. REMARK 4.6. There are examples of I, which are r-normally contracting at V but not (r+l)-normally contracting, such that V is c r but not Cr+l; in fact, this is the generic situation. See Mane (1978a). PROOF. The proof is very similar to that of Theorem 2.1. Again we approximate the invariant splitting TV EEl.N by a differentiable splitting IFv EElIF N , in terms of which
E > 0 with IJ + E < 1, because the splitting is near the invariant splitting, IIBxll, IICxll ~ E, and IIDxll ~ IJm(Ax)m(DlxITxV)j for 0 ~ j ~ r - 1. Let 'Dx be the space of all linear maps from IF:; to lFit with norm less than or equal to 1,
Given
Let 'D be the disk bundle over V of all the 'Dx, 'D =
U{x} x 'Dx·
xEV Let \II = (j,1/» : 'D
-+
'D be the graph transform induced by the derivative of
I,
for U x E 'Dx. The proof of Lemma 2.2 shows that \II is a fiber contraction by a factor of II Dx II m(Ax)-1 + 2t. The map on the base space V has,xx = m(DlxITxV)-I. By the conditions above, \II satisfies the assumptions of the C r - I Section Theorem. Thus the invariant section, whose image is TV, is cr- I and so V is cr. 0
11.4 PERSISTENCE OF NORMALLY CONTRACTING MANIFOLDS
447
Definition. Next, we introduce the notion of a tubular neighborhood of a submanifuld. The idea is to identify a point in a neighborhood of V with a point in V and a displacement in the normal direction which is represented by a vector in N. To state this more carefully, we use the disk bundle in N. For a > 0, let Nx(a) be the vectors in N x with length less than or equal to a; thus Nx(a) is a closed disk in N x . Then let N(a) be the bundle of all these disks, N(a) = {x} x Nx(a).
U
xEV
The Tubular Neighborhood Theorem says that there is a embedding cp from N(a) for small a onto a neighborhood of V in M. If M is a Euclidean space then cp can be taken to be given by cp(v x ) = x + v x . In a manifold, cp(vx) = expx(vx ) works but may be only C r - I if V is cr. With a little care, cp can be made to be cr. See Hirsch (1976) for a more complete discussion. PROOF OF THEOREM 4.2. The bundle N is defined at all points of the tubular neighborhood cp(N(a» by taking tangent spaces to cp(Nx ) at points in the image of a fiber. We continue to call this bundle N. The tangent space to V can be extended to this neighborhood to be differentiable. (We do not give the details.) We denote the fibers of this bundle by Tx. Rather than use the family of cones of vectors which point more along V than in the normal direction, we use the cones which point more in the normal direction. For x E cp(N(a», let
Because I is normally contracting at V (with C = 1 from an adapted norm), if a is small enough, then these cones are invariant under the action of the derivative of I-I: D(r1)x(Cx) c C,-I(X)' We leave the details to the reader. See Exercise 11.10. Next let Do be a "vertical" disk in the tubular neighborhood cp(N(a» which is the same dimension as a fiber of the normal bundle, Ny, and whose tangent space TyDo is contained in the cone Cy . We also assume that the boundary of Do is in the boundary of cp(N(a» and Do goes all the way across cp(N(a»: in local coordinates we could assume that Do is the graph of a function from Nx(a) into TxV for some point x E V. Because of the invariance of the bundles under I-I, l-n(Do) n cp(N(a» is a disk with the same properties. As in the proof of the stable manifold theorem,
Dn = f"(rn(Do) n cp(N(a») c Do is a nested set of disks which converge to a single point. (Here cp(N(a» is the tubular neighborhood which is the image of the normal disk bundle over points of V.) This point is the unique point in Do which stays in cp(N(a» for all backward iterates. If Do = cp(Nx ) for x E V, then x E DonV stays in
V =
u n f"(rn(cp(Nx»n cp(N(a») xEVn~O
Now if 9 is a C r diffeomorphism which is C l near to the above set of cones:
I,
then 9- 1 will also preserve
448
XI. SMOOTHNESS OF STABLE MANIFOLDS AND APPLICATIONS
Again. if Do is a "vertical" disk of the same type as above. then
n gn(g-n(Do) n cp(N(a))) R2:0
is a single point. Thus
Vg =
un
gn(g-n(cp(Nx }} n cp(N(a)))
xEV n2:0
is a graph over V in the sense that cp-I(Vg) C N(a) can be represented as the graph of a function t/Jg : V --+ N(a) with t/Jg(x) E Nx(a). This function is also Lipschitz because it has a unique point in cp(Cx ) for x E V. (This follows because any vertical disk has a unique point in cp-I(Vg) just as in the proof of the Stable Manifold Theorem.) An argument like that for the stable manifold proves that Vg is a C1 graph over V. Just as the tangent space to V is an invariant section for the graph transform \II as defined in the proof of Proposition 4.3. the tangent bundle to Vg is a fixed section for a similar graph transform \IIg induced by the derivative of g. This map \IIg is CO near \II provided 9 is C 1 near f. so the sections are CO near. The fact that the tangent space to Vg is CO near t.he tangent space to V implies that Vg is C I near V. We leave the details to the reader. If 9 is C1 near enough to f. then 9 is r-normally contracting at Vg • so by Proposition 4.3 Vg is CT. 0
11.5 Exercises Fiber Contractions 11.1. Let F: Xo x Yo --+ X x Yo be as in Theorem 1.1 (a) Prove that for each x E X o•
n 00
FR( {f-n(x)} x Yo)
n=O
is a unique point (x.u·(x». (Do not use the conclusion of Theorem 1.1.) (b) Prove that the map u· : Xo --+ Y defined in part (a) is continuous. 11.2. Consider Theorem 1.2 for r = 0 and 0 < 0< < 1. Let
90(L) = {u E gO : lu(x) - u(x/)1 ~ L d(x. X/)O for all x. x' E Xo}. (a) Prove that 90(L) is closed in go. (b) Prove that for Lo > 0 large enough. fF preserves 90(£o). (c) Prove that the invariant section u· E gO(Lo). 1l.3.
Consider Theorem 1.2 for r = 1 and 0
< 0<
~
1. Let
gI+O(C I • L) = {u ECI(Xo. yo) : IDuxl ~ C1 and
IDux - DUx'l ~ Ld(x,x')O for all x.x' E Xo}. Henry (1981) in Lemma 6.1.6 proves that gI+O(CI. L) is closed in gO for any 0 < 0< ~ 1. (Note this is not true for 0< = 0.) In fact he proves with the suitable definition that gT+O(C 1 • L) is closed in go for any integer r ~ 0 and 0 < 0< ~ 1.
11.5 EXERCISES
449
(a) Prove that for C.,£o > 0 large enough, rF preserves g1+0(C., £0). (b) Prove that the invariant section u· E g1+0(C., Lo). (You may use the result of Henry.) 11.4. (Fiber Contraction Theorem) Assume X and Y are metric spaces with Y complete. Assume F : X x Y -+ X x Y is a uniformly continuous fiber contraction on X with F(x,y) = (f(x),g(x,y)), Le., there exists a 0 < ~ < 1 such that
for all x E X and YI,Y2 E Y. Assume there is ax· E X such that for any x E X, d(f"(x), x·) goes to zero as n goes to 00. (Such a fixed point is called attractive. Let y. be the fixed point of g(x·,·) : Y -+ Y. Let (x,y) be a point in X x Y. Let 11'2: X x Y -+ Y be the projection onto Y. Note that d(7r2
0
F"(x,y),y·) ~
d(7r2 0 F"(x, y), 11'20 F"(x, y.)) n-I
+ L d(7r2 ° F"-I-;(p+I(X), g(f;(x),y·)), ;=0
(a) Show that d(7r2 oF"(X,y),1r2 oF"(x,y·)) goes to zero as n goes to infinity. (b) Let 6; = d(g(Ji(x),y·),y·). Prove 6; goes to zero as j goes to infinity. (c) Estimate
(d) Splitting the sum from 0 to n -1 into the two sums from 0 to k -1 and from k to n - I, show that d(1r2 ° F"(x, y), y.) goes to zero as n goes to infinity. Thus prove for any (x, y) E X x Y, d( F n (x, y), (x· , y.)) goes to zero as n goes to infinity, and (x·, y.) is an attractive fixed point. B.5. Assume F : Xo x Yo -+ X x Yo is as in Theorem 1.2 for some r ~ 1. Let C(X) be as defined in the proof. Write Yo €a C(X) for
u
{x} x Yo x L(TxX, Y).
xEX
Define 9 : Yo €a C(Xo)
-+
Yo €a C(X) by 9(x,y, S) = (f(x),g(x,y),1/J(x,y, S)), where
1/J(x,y,S) = Dg(x,y)
0
(id, S) ° D(rl)px).
(Note the Similarity to 1/J defined in Section 11.1.) (a) Show that 9 is a cr- I fiber contraction over Xo. (b) Note that the continuous sections of Yo €a C(Xo) can be written as gO x where go is as before and 110 are continuous sections of C(Xo). Let re be the graph transform of 9 on go x Show that re(u, S) = (rF(U), r ... (u, S)) where
no
no.
r ... (u, S)(x) = Dg/to/-'(x) 0
(id, S/-'(x»)
0
D(f-I)x.
450
XI. SMOOTHNESS OF STABLE MANIFOLDS AND APPLICATIONS
(c) Using the previous exercise. show that fa has a unique fixed point (11°. AO). Conclude that if 11 E gl then fa(l1. D(1) converges to (11°. AO) and 11° is e l . (d) Prove that 11· is cr. 11.6. Let A : r;: -+ r,:,. 11 : 1F~ -+ r;:. and Bx : lF~ -+ lF~, be linear maps between Banach spaces. Prove that m(A + B(1) ~ m(A) -IIBIIIII1I1. Differentiability of an Invariant Splitting 11.7. Setup the bundle map for the proof of Theorem 2.2. Show that it is overflowing on the base space and a fiber contraction with a stronger contraction on the fiber than the contraction on the base space. Differentiability of the Center Manifold 11.8. Let I be a e r diffeomorphism on Rn for 1 < r < 00. Assume 0 is a fixed point that has a center (some eigenvalues with absolute value 1). Assuming that the local unstable manifold WU(O. f) is e l • prove that it is e r . (Use the theorems of this chapter. but do not use the theorems of Section 5.10.2.) Normally Contracting Manifolds 11.9. (Proposition 4.1) Assume I: M -+ M is normally contracting at V. (a) Prove that there is a continuous choice of the normal bundle that is invariant by the derivative of I. D/x(Nx) = Nf(x)' (b) Prove that it is possible to take = 1 in the definition of normally contracting at V by changing the Riemannian norm of M. 11.10. Assume I : M -+ M is normally contracting at V. Let {ex} be the family of cones as defined in the proof of Theorem 4.2. Prove that these cones are invariant under the action of the derivative of I-I:
e
11.11. Assume I : (-fO. fO) x M -+ M is e l and I(t,·) : M -+ M is a diffeomorphism for each f E (-fO. fO). Assume 1(0,·) has a I-normally contracting invariant manifold Vo. Theorem 4.2 proves that there is an fl > 0 such that I(f,·) has a I-normally contracting invariant manifold V. which is e l for If I ~ fl. Define the map F : (-fO, EO) x M -+ (-EO.fO) x M by F(£,x) = (E,/(E,X». (a) By constructing an invariant set of cones for F, prove that the set
v=
{(f,X) : x E V. for
lEI
~
Ed
is a Lipschitz manifold in (-fO, fO) x M. This says that the normally contracting invariant manifold varies in a Lipschitz manner. This manifold V could be called the "Center Manifold" for the normally contracting invariant manifold Vo. (b) Prove that V is a e l manifold in (-EO,EO) x M. This says that the normally contracting invariant manifold varies differentiably.
References Abraham, R. and Marsden, J. (1978), Foundation 0/ Mechanics (2nd Edition), BenjaminCummings Publ. Co., Reading MA. Abraham, R. and Robbin, J. (1967), Transversal Mappings and Flows, Benjamin, Reading MA. Abraham, R. and Smale, S. (1970), Nongenericity o/n-stability, Proc. Symp. in Pure Math., Amer. Math. Soc. 14, 5-8. Adler, R. and Weiss, B. (1970), Similarity 0/ automorphisms 0/ the torus, Memoirs of Amer. Math. Soc. 98. Alekseev, V. M. (1968a), Quasimndom dynamical systems, I, Math. USSR-Sb. 5, 73-128. Alekseev, V. M. (1968b), Quasimndom dynamical systems, II, Math. USSR-Sb. 6, 505-560. Alekseev, V. M. (1969), Quasimndom dynamical systems, Ill, Math. USSR·Sb. 1, 1-43. Alekseev, V. M. (1981), Quasimndom oscillations and qualitative questions in celestial mechanics, Amer. Math. Soc. Translations (Three Papers on Dynamical Systems) 166 (Series 2),97-169. Alligood, K. and Yorke, J. (1989), fractal basin boundaries and chaotic attmctors, Proc. of Sympoeia in Applied Mathematics 39, 41-55. Alligood, K., Tedeschini-Lalli, L., and Yorke, J. {1991}, Metamorphoses: sudden jumps in basin boundaries, Commun. Math. Physics 141, 1-8. AIsedA, L., Llibre, J., and Misiurewicz, M. (1993), Combinatorial Dynamics and Entropy in Dimension One, Advanced Series in Nonlinear Dynamics, vol. 5, World Scientific Publ., River Edge NJ & Singapore. Aoki, N. (1991), The set 0/ Axiom A diffeomorphisrns with no cycles, in Collection: Dynamical Systems and related topics, World Scientific Publ, River Edge, NJ, pp. 2(}-35. Aoki, N. (1992), The set 0/ Axiom A diffeomorphisms with no cycles, Bol. Soc. Brasil Mat. 23,21-65. Andronov, A. A. (1929), Application 0/ Poincare's theorem on bifurcation points and change in stability to simple autooscillatory systems, C. R. Acad. Sci. (Paris) 189, 559-561. Andronov, A. A. and Leontovich-Andronova, E. A. (1939), Some cases o/the dependence 0/ periodic motions on a pammeter, Ucenye Zapiski Gorki Gosudarstvenny Univ. 6, 3. Andronov, A. A. and Pontryagjn, L. (1937), Systemes Grossiers, Dok!. Akad. Nault. USSR 14, 247-251. Andronov, A. A., Leontovich, E. A., Gordon, I. I., and Maier, A. G. (1973), Theory 0/ Bifurcation 0/ Dynamical Systems on the Plane, Wiley, New York. Anosov, D. V. (1967), Geodesic flows on closed Riemannian mani/olds oj negative curvature, Proc. Steklov Inst. 90, Amer. Math. Soc. (trans!. 1969). Apostol, T. (1974), Mathematical Analysis, Addison·Wesley Publ. Co., New York and Reading, MA. Arnold, V. I. (1961), On the mapping 0/ the circle into itself, Izvestia Akad. Nault. USSR Math. 25,21-86. Arnold, V. I. (1971), Ordinary Differential Equations, M.I.T. Press, Cambridge MA. Arnold, V. I. (1978), Mathematical Methods 0/ Classical Mechanics, Springer-Verlag, New York, Heidelberg, Berlin. Arnold, V. I. (1983), Geometric Methods in the Theory 0/ Ordinary Differential Equations, Springer-Verlag, New York, Heidelberg, Berlin. Arrowsmith, D. K. and Place, C. M. (1990), An Introduction to Dynamical Systerns, Cambridge University Press, Cambridge, New York. Artin, M. and Mazur B. (1965), On periodic points, Ann. Math. 81,82-89. 451
452
REFERENCES
Banks, J., Brooks, J., Cairns, G., Davis, G., and Stacey, P. (1992), On Devaney's definition of chaos, Amer. Math. Monthly 99, 332-334. Barge, M. (1986), Horseshoe maps and inverse limits, Pacific J. Math. 121,29-39. Barge, M. (1988), A method for constructing attractors, Ergodic Theory Dynamical Systems 8,331-349. Barge, M. and Gillette, R. (1990), Rotation and periodicity in plane separating continua, Ergodic Theory Dynamical Systems 11, 619-631. Barge, M. and Martin, J. (1990), Construction of global attractors in the plane, Proc. Amer. Math. Soc. 110, 523-525. Belitskii, G. R. (1973), f\inctional equations and conjugacy of local diffeomorphism of a finite smoothness class, Functional Analysis and Its Applications 7, 268-277. Benedicks, M. and Carleson, L. (1985), On iterates of x ...... 1 - ax 2 on (0,1), Ann. Math. 122. 1-25. A.. n..
REFERENCES
453
Broer, H., Dumortier, F., van Strien, S., and Takens, F. (1991), Structures in Dvnamics; Finite Dimensional Deterministic Studies, North-Holland, Ellievier Sci. Publ., Amsterdam. Brunovsky, P. (1971), On one parameter families 0/ diffeomorphisms 1/; generic branching in higher dimensions, Comm. Math. Univ. Carolinae 12, 765-784. Burns, K. and Gerber, M. (1989), Continuous invariant cone families and ergodicity 0/ flows in dimension three, Ergodic Theory Dynamical Systems 9, 19-25. Burns, K. and Weiss, H. (1994), A geometric criterion lor positive topological entropy, Preprint. Carleson, L. and Gamelin, T. (1993), Complex Dynamics, Springer-Verlag, New York, Heidelberg, Berlin. Carr, J. (1981), Applications 0/ Centre Mani/old Theory, Springer-Verlag, New York, Heidelberg, Berlin. Cartwright, M. and Littlewood, J. (1945), On nonlinear differential equations 0/ the second order: J. The equation ii-k(1-y2)y+y = b"\kcos(..\t+Q), k large, J. London Math. Soc. 20, 180-189. Cesari, L. (1959), Asymptotic Behavior and Stability Problems in Ordinary Differential Equations, Springer-Verlag, New York, Heidelberg, Berlin. Chernov, N.1. and Sinai, Ya. G. (1987), Ergodic properties 0/ some systems 0/2-dimensional discs and 9-dimensional spheres, RUS8. Math. Surveys 42, 181-207. Chillingworth, D. (1976), Differential Topology with a View to Applications, Pitman Publ., London, San Francisco, Melbourne. Choquet, G. (1969), Lectures on Analysis, I, Benjamin, New York. Chow, S.-N. and Hale, J. (1982), Methods 0/ Bifurcation Theory, Springer-Verlag, New York, Heidelberg, Berlin. Chow, S.-N., Lin, X. B., and Palmer, K. (1989), A shadowing lemma with applications to semilinear parabolic equations, SIAM J. Math. Anal. 20, 547-557. Collet, P. and Eckmann, J. P. (1980), Iterated Maps on the Interval as Dynamical Systems, Progress on Physics, vol. 1, Birkhill5er, Boston. Conley, C. (1978), Isolated Invariant Sets and Morse Index, Amer. Math. Soc., CBMS 3S. Coullet, C. and '!Teaser, C. (1978), Iteration d'endomorphismes et groupe de renormalisation, J. de Physique Colloque 39, C5-C25. Croom, F. (1989), Principles 0/ Topology, Saunders College Pub., Philadelphia. Curry, J. H. (1979), On the Henon trans/ormation, Comm. Math. Phys. 6S, 129-140. Dankner, A. (1978), On Smale's Axiom A dvnamical systems, Ann. Math. 107,517-553. Denjoy, A. (1932), Sur les courbes dlfinies par les equations differentielles Ii la sUrface du tore, J. Math. Pure Appl. 11,333-375. Devaney, R. (1989), Chaotic Dvnamical Systems, Addison-Wesley Publ. Co., New York &t Reading, MA. Devaney, R. and Nitecki, Z. (1979), Shift automorphism in the Henon mapping, Comm. Math. Phys. 67, 137-148. Dieudonne, J. (1960), Foundations 0/ Modem Analysis, Academic Press, New York. Dold, A. (1972), Lectures on Algebraic Topology, Springer-Verlag, New York, Berlin, Heidelberg. Douady, A. and Hubbard, J. H. (1985), On the dynamics o/polynomial-like mapping8, Ann. Sc. E. N. S. 4 seril§ IS, 287-343. Dugundji, J. (1966), Topology, Allyn and Bacon, Boston, London, Sydney, Toronto. Dumortier, F., Kokubu, H., and Oka, H. (1992), On degenerate singularities that generate geometric Lorenz attractors, Preprint. Easton, R. (1986), 7rellises formed by stable and unstable manifolds, '!Tans. Amer. Math. Soc. 294, 719-731.
454
REFERENCES
Easton, R. (1991), Transport through chaos, Nonlinearity 4, 583-590. Edgar, G. (1990), Measure, Topology, and Fractal Geometry, Springer-Verlag. New York. Berlin, Heidelberg. Edwards. C. H. and Penney. D. E. (1990). Calculus and Analytical Geometry, Prentice Hall. Englewood Cliffs, NJ. Falconer, K. (1990). Fractal Geometry, John Wiley & Sons. New York. Farmer, D., Ott, E., and Yorke. J. (1983). The dimension of chaotic at tractors. Physica D 3, 153-180. Feigenbaum, M. (1978), Quantitative universality for a class of non-linear transformations, J. Stat. Phys. 21,25--52. Fen ichel , N. (1971), Persistence and smoothness of invariant manifolds for flows, Indiana Univ. Math. J. 21. 193-226. Fenichel, N. (1974), Asymptotic stability with rate conditions. Indiana Univ. Math. J. 23. 1109-1137. Fenichel. N. (1977), Asymptotic stability with rate conditions, II, Indiana Univ. Math. J. 26,81-93. Franks, J. (1969), Anosov diffeomorphisms on tori. Trans. Amer. Math. Soc. 145. 117-124. Franks. J. (1970). Anosov diffeomorphisms. Global Analysis. Proc. Sympos. Pure Math .• Amer. Math. Soc. 14,61-94. Franks, J. (1971). Necessary conditions for stability of diffeomorphisms. Trans. Amer. Math. Soc. 158,301-308. Franks. J. (1972). Differentiably {l-stable diffeomorphisms. Topology 11. 107-114. Franks, J. (1973). Absolutely structurally stable diffeomorphisms. Proc. Amer. Math. Soc. 37, 293-296. Franks, J. (1974). Time dependent structural stability. Inven. Math. 24. 163-172. Franks, J. (1977a), Constructing structurally stable diffeomorphisms. Ann. Math. 105. 343-359. Franks, J. (1977b). Invariant sets of hyperbolic toral automorphisms. Amer. J. Math. 99. 1089--1095. Franks, J. (1979), Manifolds ofer mappings and applications to differentiable dynamical systems, Adv. Math. Suppl. Studies (Studies in Analysis) 4. 271-290. Franks, J. (1982). Homology and Dynamical Systems. Amer. Math. Soc .• CBMS 49. Providence, RI. Franks, J. (1988). A variation on the Poincare-Birkhoff Theorem. Contemporary Math. 81, 111-117. Franks, J. and Shub, M. (1981). The existence of Morse Smale diffeomorphisms. Topology 20. 273-290. Fried, D. (1982), Geometry of cross-sections to flows, Topology 21. 353-371. Fried, D. (1987). Rationality for isolated expansive sets. Adv. Math. 65. 35--38. Frobenius, G. (1912), Uber Matrizen aua nicht negativen Elementen. S.-B. Deutsch. Akad. Wiss. Berlin Math.-Nat. Kl.. 456-77. Gantmacher, F.R. (1959), The Theory of Matrices, Volume I, II. Chelsea Publ. Co .• New York. Grebogi, C .• Hammel. S .• and Yorke. J. (1988). Numerical orbits of chaotic processes represent true orbits. Bull. Amer. Math. Soc. 19. 465--469. Grebogi, C., Ott, E .• and Yorke. J. (1987), Basin boundary metamorphoses: changes in accessible boundary orbits. Physica D 24, 211 Guckenheimer. J. (1970). Axiom A + no cycles ,minKS <,(t) rational. Bull. Amer. Math. Soc. 76. 592-595. Guckenheimer, J. (1972), Absolutely {l-stable diffeomorphisms, Topology 11, 195--197.
REFERENCES
455
Guckenheimer. J. (1976). A strange, strange attractor, in The Hopr Bifurcation and Its Applications by Marsden and McCracken. Springer-Verlag, New York. Heidelberg, Berlin. Guckenheimer. J. (1979). Sensitive dependence to initial conditions for one dimensional maps, Comm. Math. Phys. TO, 133-160. Guckenheimer. J. (1980), Bifurcations of dynamical systems, in Dynamical Systems, Progress in Math. #8, Birkhauser. Boston. Basel. Stuttgart, pp. 11&-232. Guckenheimer. J. and Holmes. P. (1983). Nonlinear Oscillations, Dynamical Systems and Bifurcations of Vector Fields. Springer-Verlag, New York, Heidelberg. Berlin. Guckenheimer. J. and Williams R. (1980), Structural stability of the Lorenz attractor, Publ. Math. I.H.E.S. 50.73-100. Guillemin. V. and Pollack A. (1974). Differential Topology, Prentice-Hall, Englewood Cliffs, NJ. Hadamard. J. (1901). Sur l'iteration it les solutions asymptotiques des equations differentielln. Rill I. SO<". Math. France 29. 224-228. Hahn. W. (1967), Stability of Motion. Springer-Verlag, New York. Heidelberg. Berlin. Hale. J. (1969), Ordinary Differential Equations. Wiley. New York. Hale. J. and KOI;ak, H. (1991). Dynamics and Bifurcation. Springer-Verlag. New York. Heidelberg, Berlin. Hammel, S. and Jones. C. (1989). Jumping stable manifolds for dissipative maps of the plane, Physica D 35, 87-106. Hartley. R. and Hawkes, T. (1970), Rings, Modules, and Linear Algebra, Chapman and Hall, New York and London. Hartman. P. (1960), On local homeomorphisms of Euclidean spaces, 801. Soc. Mat. Mexicana 5. 220- 241. Hartman, P. (1964). Ordinary Differential Equations, Wiley, New York. Hayyashi, S. (1992), Diffeomorphisms in Fl(M) satisfy Axiom A. Ergodic Theory Dynamical Systems 12. 233-253. Henon, M. (1976). A f1l'n·dimensional mapping with a strange attractor, Comm. Math. Phys. 50, 69- 7 • Henry. B. R. (1973). Escape from the unit interval under the transformation x ..... Ax(l x), Proc. Amer. Math. Soc. 41, 146-150. Henry. D. (1981). Geometric Theory of Semilinear Parabolic Equations (Lecture Notes in Math.). vol. 840, Springer-Verlag, New York, Heidelberg, Berlin. Herman, M. (1979), Sur la conjugation differentiable des diffeomorphisms du cercle des rotations, Publ. Math. I.H.E.S. 49, &-233. Hille. E. (1962), Analytic Function Theory" vol. II, Chelsea Publ. Co .• New York. Hirsch. M. (1976), Differential Topology, Springer-Verlag, New York, Heidelberg, Berlin. Hirsch. M. (1982), Systems of differential equations that are competitive or cooperative, I: limit sets, SIAM J. Math. Anal. 13, 167-79. Hirsch, M. (1984), The dynamical systems approach to differential equations, Bull. Amer. Math. Soc. 11, 1-64. Hirsch, M. (1985), Systems of differential equations that are competitive or cooperative, II: conve1!Jence almost everywhere, SIAM J. Math. Anal. 16, 423-39. Hirsch, M. (1988), Systems of differential equations that are competitive or cooperative, III: competitive species, Nonlinearity 1, 51-71. Hirsch, M. (1990), Systems of differential equations that are competitive or cooperative, IV: structural stability in 3-dimensional systems, SIAM J. Math. Anal. 21, 1225-34. Hirsch, M. and Pugh. C. (1970), Stable manifolds and hyperbolic sets, Proc. Symp. Pure Math. 14, 133-163. Hirsch, M., Pugh, C., and Shub, M. (1977), Invariant Manifolds (Lecture Notes in Math.), vol. 583, Springer-Verlag, New York, Heidelberg, Berlin.
a
456
REFERENCES
Hirsch, M. and Smale, S. (1974), Differential Equations, Dynamical Systems, and Linear Algebra, Academic Press, New York. Hocking, J. G. and Young, G. S. (1961), Topology, Addison-Wesley Publ. Co., New York & Reading, MA. Hoffman, K. and Kunze, K. (1961), Linear Algebra, Prentice-Hall, Englewood Cliffs, NJ. Holmes, P. (1980), Averaging and chaotic motions in forced oscillations, SIAM J. Appl. Math. 38, 65-80. Holmes, P. (1984), Bifurcation sequences in horseshoe maps: infinitely many routes to chaos, Phys. Lett. A 104, 299-302. Holmes, P. (1988), Knots and orbit genealogies in nonlinear oscillations, in New Directions in Dynamical Systems, London Math. Soc. Lecture Notes, vol. 127, Cambridge University Press, Cambridge, New York, pp. 150-191. Holmes, P. and Whitley, D. (1984), Bifurcation in one and two dimensional maps, Phil. Trans. Roy. Soc. London (Ser. A) 1515,43-102. Hofbauer, J. and Sigmund, K. (1988), The Theory of Evolution and Dynamical Systems, Cambridge University Press, Cambridge, New York. Hopf, E. (1942), Abzweigung einer periodischen Losung von einer stationiiren Losung eines Differentialsystems, Ber. Math. Phys. Siichsische Akad. der Wissen. Leipzig 94, 1-22 (See Marsden and McCracken (1976) for an English translation). Hoppensteadt, F. C. (1982), Mathematical Methods of Population Biology, Cambridge University Press, Cambridge, New York. Hu, S. (1994), A proof of C 1 stability conjecture for 3-dimensional flows, Transact. Amer. Math. Soc. 342, 753-772. Hubbard, J. and West, B. (1992), MacMath: A Dynamical Systems Software Package for the Macintosh, Springer-Verlag, New York, Heidelberg, Berlin. Hurder, S. and Katok, A. (1990), Differentiability, rigidity, and Godbillon- Vey Classes for Anosov flows, Pub!. Math. I.H.E.S. 12,5-61. Hurewicz, W. (1958), Lectures on Onlinary Differential Equations, MIT Press, Cambridge, MA. Hurewicz, W. and Wallman, H. (1941), Dimension Theory, Princeton University Press, Princeton, NJ. Hurley, M. (1991), Chain recurrence and attraction in noncompact spaces, Ergodic Theory Dynamical Systems 11, 709-729. Hurley, M. (1992), Chain recurrence and attraction in noncompact spaces, II, Proc. ofthe Amer. Math. Soc. lIS, 1139-1148. Irwin, M. C. (19708), A classification of elementary cycles, Topology 9, 35-47. Irwin, M. C. (1970b), On the stable manifold theorem, Bull. London Math. Soc. 2, 196-198. Irwin. M. C. (1972), On the smoothness of the composition map, Q. J. Math. Oxford Ser. 23, 113-133. Irwin, M. C. (1980), Smooth Dynamical Systems, Academic Press, New York. Jakobson, M. (1971), On smooth mappings of the circle into itself, Math. USSR Sb. 14, 161-185. Jakobson, M. (1981), Absolutely continuous invariant measures for one-parameter families of one-dimensional maps, Comm. Math. Phys. 81, 39--88. Katok, A. (1980), Lyapunov exponents, entropy, and periodic orbits for diffeomorphisms, Publ. Math. I.H.E.S. 51, 137-173. Katok, A. and Burns, K. (1994), Infinitesimal Lyapunov functiOns, invariant cone families and stochastic properties of smooth dynamical systems, Ergodic Theory Dynamical Systems 14, 757-786. Katok, A. and Hasselblatt, B. (1994), Introduction to the Modern Theory of Smooth Dynamical Systems, Cambridge University Press, Cambridge, New York.
REFERENCFS
457
Katok, A. and Streicyn, J.-M. (1986), Invariant manifolds, entropy, and billiards; smooth maps with singularities (Lecture Notes in Mathematics), vol. 1986, Springer-VerlatJ, New York, Heidelberg, Berlin. Kelley, A. (1967), The stable, center-stable, center center-unstable, and unstable manifolds (Appendix C), in Transversal Mappinp and Flows by R. Abraham and J. Robbin .. Benjamin, Reading, MA. K~, H. (1986, revised 1989), Differential and Difference Equations through Computer Experiments, Springer-Verlag, New York, Heidelberg, Berlin. Kupka, I. (1963), Contribution d la theorie des champs generiques, Contrib. to Dill. Eq. 2, 457-484. Lanford, O. (1982), A computer-assisted proof of the Feigenbaum conjecture, Bull. Amer. Math. Soc. 6, 427--434. Lanford, O. (1984), A shorter proof of the existence of Feigenbaum fixed point, Commun. Math. Phys. 96, 521-538. Lanford, O. (1986), Computer-assisted proofs in analysis, Proc. Int. Congress of Math. 1,2, 1385-1394. Lang, S. (1967), Introduction to Differentiable Manifolds, Interscience Publ. (Wiley &. Sons, Inc.), New York/London/ Sydney. Lang, S. (1968), Analysis I, Addison-Wesley Pub\. Co., New York &. Reading, MA. Lefever, F. and Nicholls, C. (1971), Chemical instabilities and sustained oscillations, J. Theoretical Biology 30, 267-284. Levi, M. (1981), Qualitative analysis of the periodically forced relaxation oscillator, Memoirs Amer. Math. Soc. 32. Levinson, N. (1949), A second order differential equation with singular solutions, Ann. Math. 50, 127-153. Li, T. and Yorke, J. (1975), Period three implies chaos, Amer. Math. Monthly 82, 985-992. Liao, S. T. (1979), An extension of the C' closing lemma, Acta Scientiarum Naturalium Universitatis Pekinensls 3, 1-14. (Chinese) Liao, S. T. (1980), Chinese Ann. Math. 1,9--30. Liapunov, A. (1907), Probleme gene,lll de La stabilite du mouvement, Ann. Fac. Sci. Univ. Toulouse 9, 203--475 [reproduced in Ann. Math. Study (17) Princeton (1947)]. Lichtenberg, A. and Lieberman, M. (1983), Regular and Stochastic Motion, Springer-VerlatJ, New York, Heidelberg, Berlin. Liverani, C. and Wojtkowski, M. (1993), Ergodicity in Hamiltonian systems, Dyuamies Reported (to appear). de la Llave, R. (1987), Invariants for smooth conjugacy of hyperbolic dynamical .y.tems II, Commun. Math. Phys. 109, 369--378. de la Llave, R., Marco, J. M., and Moriyon, R. (1986), Canonical perturbation theory of Anosov systems and regularity results for the Livsic cohomology equation, Ann. Math. 123, 537~11. Lorenz, E. N. (1963), Deterministic non-periodic flow, J. Atmos. Sci. 20, 13(}-141. Mai, J. (1986), A simpler proof of closing lemma, Scientia Sinica (Series A) 29 (10), 1021-1031. Mallet-Paret, J. and Yorke, J. (1982), Snakes: oriented families of periodic orbits, their sources, sinks, and continuation, J. Dill. Eq. 43, 411).-450. Mane R. (19788), Persistence manifolds are normally hyperbolic, Trans. Amer. Math. Soc. 246,261-283. Mane R. (1978b), Contributions to the stability conjecture, Topology 17,383-396. Mane R. (1987a), Ergodic Theory and Differentiable Dynamics, Springer-VerlatJ, New York, Heidelberg, Berlin. Mane R. (1987b), A proof of the C' stability conjecture, Pub\. Math. I.H.E.S. 66, 161-210.
c'
458
REFERENCES
Manning. A. (1971). Axiom A diffeomorphisms have rational zeta junctions. Bull. London Math. Soc. 3. 215-220. Manning. A. (1974). There are no new Anosov diffeomorphisms on tori., Amer. J. Math. 96. 422-429. Manning. A. and McCluskey. H. (1983). Hausdorff dimension for horseshoes. Ergodic Theory Dynamical Systems 3, 251-261. Marek. M. and Schreiber. I. (1991). Chaotic Behaviour of Deterministic Dissipative Systems. Cambridge University Press. Cambridge. New York. Marsden. J. (1974). Elementary Classical Analysis. Freeman and Co .• New York. Marsden. J. (1984). Chaos in dynamical systems by the Poincare-Melnikov-Amold method. in Nonlinear Dynamical Systems (edited by J. Chandra), SIAM. Philadelphia. pp. 1931. Marsden. J. and McCracken. M. (1976). The Hopf Bifurcation and Its Applications. SpringerVerlag. New York. Heidelberg. Berlin. Margulis. G. A. (1969). On some applications of ergodic theory to the study of manifolds of negative curvature. Funct. Analy. Appl. 3. 89-90. May. R. (1975). Stability and Complexity in Model Ecosystems, 2nd edn .• Princeton Univ. Press. Princeton. NJ. McGehee. R. (1973). A stable manifold theorem for degenerate fixed points with applications to celestial mechanics. J. Diff. Eq. 14. 70-88. Melnikov. V. K. (1963). On the stability of the center for time periodic perturbations. Trans. Moscow Math. Soc. 12. 1-57. de Melo. W. (1973). Structural stability of diffeomorphisms on two manifolds, Inven. Math. 21.233-246. de Melo. W. (1989). Lectures on One-Dimensional Dynamics. Instituto de Matematica Pura e Aplicada. Rio de Janeiro. Brazil. de Melo. W. and Van Strien. S. J. (1993). One-Dimensional Dynamics. Springer-Verlag. New York. Heidelberg. Berlin. Meyer. K. (1987). An analytic proof of the shadowing lemma, Funkcialaj Ekvacioj 30. 127-133. Meyer. K. and Hall. G. H. (1992). Introduction to Hamiltonian Dynamical Systems and the N-Body Problem. Springer-Verlag. New York, Heidelberg. Berlin. Meyer. K. and Sell. G. (1989). Trans. Amer. Math. Soc. 314, 63-105. Milnor. J. (1985). On the concept of attract or. Comm. Math. Phys. 99.177-195. Misiurewicz. M. (1981). Absolutely continuous measures for certain maps of an interval. Publ. Math. I.H.E.S. 53. 17-51. Mora. L. and Viana. M. (1993). Abundance of strange at tractors, Acta Math. 171, 1-71. Moeer. J. (1962). On invariant curves of area preserving mappings of an annulus, Nachr. Akad. Wiss. Gottingen. Math. Phys. KI. 1-20. Moser. J. (1969). On a theorem of Anosov. J. Diff. Eq. 5. 411-440. Moser. J. (1973). Stable and Random Motions in Dynamical Systems, Princeton University Press. Princeton. NJ. Munkres. J. (1975). Topology, A First Course. Prentice-Hall, Englewood Cliffs. NJ. Naimark. J. (1967). Motions close to doubly asymptotic motions, Soviet Math. Dok!. 8, 228-231. Newhouse. S. (1970a). Nondensity of Axiom A (a) on S2. Proc. Symp. in Pure Math., Amer. Math. Soc. 14. 191-202. Newhouse. S. (1970b). On codimension one Anosov diffeomorphisms. Amer. J. Math. 92. 761-770. Newhouse. S. (1972). Hyperbolic limit sets. Trans. Amer. Math. Soc. 167. 125-150.
REFERENCES
459
Newhouse, S. (1979), The abundance of wild hyperbolic sets and non-smooth stable sets for diffeomorphism, Pub!. Math. I.H.E.S. 50, 101-151. Newhouse, S. (1980), Lectures on Dynamical Systems, in Dynamical Systems, Progress in Math. #8, Birkhauser, Boston, Basel, Stuttgart, pp. 1-114. Nitecki, Z. (1971), Differentiable Dynamics, MIT Press, Cambridge MA. Nitecki, Z. (1971), On semi-stability for diffeomorphisms, Inven. Math. 14,83-122. Oseledec, V. I. (1968), A multiplicative ergodic theorem. Liapunov characteristic numbers for dynamical systems, Trans. Moscow Math. Soc. 19, 197-221. Ott, E. (1993), Chaos in Dynamical Systems, Cambridge University Press, Cambridge, New York. Palis, J. (1968), On the local structure of hyperbolic fixed points in Banach space, Anais Acad. Brasil Ciencais 40, 263-266. Palis, J. and de Melo, W. (1982), Geometric Theory of Dynamical Systems, An Introduction, Springer-Verlag, New York, Heidelberg, Berlin. Palis, J. and Smale, S. (1970), Structural Stability Theorems, Proc. Symp. in Pure Math., Amer. Math. Soc. 14, 223-232. Palis, J. and Takens, F. (1993), Hyperbolicity and Sensitive Chaotic Dynamics at Homoclinic Bifurcations, Cambridge University Press, Cambridge, New York. Palmer, K. (1984), Exponential dichotomies and transversal homoclinic points, J. Ditr. Eq. 55, 225-256. Patterson, S. and Robinson, C. (1988), Basins for general nonlinear Henon attracting sets, Proc. Amer. Math. Soc. 103, 615-623. Peixoto, M. (1962), Structurally stability on two dimensional manifolds, Topology I, 101120. Peixoto, M. (1966), On an appraximation theorem of Kupka and Smale, J. Ditr. Eq. 3, 214-227. Perron, O. (1907), Uber Matrizen, Math. Ann. 64, 248-63. Perron, O. (1929), Uber stabilitiit und asymptotishes Vemalten der Losungen eines Systemes endlicher Differenzengleichungen, J. Reine Angew. Math. 161,41-64. Pesin, Ja. (1976), Families of invariant manifolds corresponding to nonzero characteristic exponents, Math. USSR-Izvestia 10, 1261-1305. Pesin, Ja. (1977), Characteristic Lyapunov exponents and smooth ergodic theory, Russian Math. Surveys 32, 55-114. Pliss, V. A. (1971), Properties of solutions of a sequence of second-order periodic system with small nonlinearities, Ditr. Eq. 7,501-508. Pliss, V. A. (1972), A hypothesis due to Smale, Ditr. Eq. 8, 203-214. Plykin, R. V. (1974), Sources and sinks for A-diffeomorphisms of surfaces, Mathematics of the USSR, Sbomik 23, 233-253. Pollicott, M. (1993), Lectures on Ergodic Theory and Pesin Theory on Compact Manifolds, London Math. Soc. Lecture Note Series No. 180, Cambridge University Press, Cambridge, New York. Prigongine,l. and Lefever, R. (1968), Symmetry breaking instabilities in dissipative systems II, J. Chem. Phys. 48, 1695-1700. Pugh, C. (1967a), The Closing Lemma, Amer. J. Math. 89, 956-1009. Pugh, c. (1967b), An improved Closing Lemma and a General Density Theorem, Amer. J. Math. 89, 1010-1021. Pugh, C. (1969), On a theorem of Hartman, Amer. J. Math. 91, 363-367. Pugh, C. and Robinson, C. (1983), The C' Closing Lemma, including Hamiltonians, Ergodic Theory Dynamical Systems 3, 261-313. Pugh, C. and Shub, M. (19708), Linearization of normally hyperbolic diffeomorphisms and flows, Inven. Math. 10, 187-198.
460
REFERENCES
Pugh, C. and Shub, M. (1970b), O-stability for flows, Inven. Math. 11, 150-158. Pugh, C. and Shub, M. (1989), Ergodic attractors, Transact. Amer. Math. Soc. 312, 1-54. Rand, D. (1978), The topological classification of the Lorenz attractor, Math. Proc. Camb. Phil. Soc. 83, 451-460. Rasband, N. (1990), Chaotic Dynamics of Nonlinear Systems, Wiley-Interscience Publ., New York. Riesz, Z. and Nagy, B. (1955), i'Unctional Analysis, Fredrick Ungar Publ. Co., New York. Robbin, J. (1971), A structural stability theorem, Ann. Math. 94, 447-493. Robinson, C. (1973). C r structural stability implies Kupka-Smale, in Dynamical Systems (edited by M. M. Peixoto), Academic Press. New York, pp. 443-449. Robinson, C. (1974), Structural stability of vector fields, Ann. Math. 99,154-174. Robinson, C. (1975a), Structural stability of l flows. in Lecture Notes in Math., vol. 468, Springer-Verlag, New York, Heidelberg, Berlin, pp. 262-277. Robinson, C. (1975b), The geometry of the structural stability proof using unstable disks, 801. Soc. Brasil. Math. 6, 129-144. Robinson, C. (1976a), Structural stability ofC I diJJeomorphisms, J. Dilf. Eq. 22, 28-73. Robinson, C. (1976b), Structural stability theorems, in Dynamical Systems, Vol. II, Academic Press, New York, pp. 33-36. Robinson, C. (1977), Stability theorems and hyperbolicity in dynamical systems, Rocky Mountain J. Math. 1,425-437. Robinson. C. (1978), Introduction to the Closing Lemma, in The Structure of Attractors in Dynamical Systems (edited by Markley, Martin, and Perrizo) Lecture Notes in Math., vol. 668. Springer-Verlag, New York. Heidelberg, Berlin, pp. 225-230. Robinson, C. (1981), Differentiability of the stable foliation of the model Lorenz equations, in Dynamical Systems and Thrbulence (edited by Rand and Young) Lecture Notes in Math., vol. 898, Springer-Verlag, New York, Heidelberg, Berlin, pp. 302-315. Robinson, C. (1983), Bifurcation to infinitely many sinks, Commun. Math. Phys. 90, 433459. Robinson, C. (1984), 1hlnsitivity and invariant measures for geometric model of the Lorenz equations, Ergodic Theory Dynamical Systems 4, 605-611. Robinson, C. (1985), Phase analysis using the Poincare map, Nonlinear Anal. Theory Methods Appl. 9, 1159-1164. Robinson, C. (1989), Homoclinic bifurcation to a transitive attractor of Lorenz type, Nonlinearity 2, 495-518. Robinson, C. (1992). Homoclinic bifurcation to a transitive attractor of Lorenz type, 11, SIAM J. Math. Anal. 23, 1255-1268. Rudin, W. (1964), Principles of Mathematical Analysis (1964), McGraw-Hill, New York. Ruelle, D. (1976), A measure associated with Axiom A attractor. Amer. J. Math. 98, 619--654. Ruelle, D. (1979), Ergodic theory of differentiable dynamical systems, Publ. Math. I.H.E.S. 50,27-58. Ruelle, D. (1989a). Chaotic Evolution and Strange Attractors, Cambridge University Press, Cambridge, New York. Ruelle, D. (1989b), Element8 of Differentiable Dynamics and Bifurcation Theory, Academic Press, New York. Ruelle, D. and Takens. F. (1971), On the nature of turbulence. Comm. Math. Phys. 20, 167-192. Rychlik, M. (1990), Lorenz attractors through Sil'nikov-type bifurcation, Part I, Ergodic Theory Dynamical Systems 10, 793-822. Rykken, E. (1993). Markov partitions and the expanding factor for pseudo-Anosov homeomorphisms, Thesis, Northwestern University.
c
REFERENCES
461
Sacker, R. (1964), On invariant su.rfaces and bifurcation of periodic solutions of ordinary differential equations, New York Univ. IMM-NYU 333. Sacker, R. (1967), A Perturbation for invariant Riemannian manifolds, in Proc. Symp. at Univ. of Puerto Rico (edited by J. Hale and J. LaSalle), Academic Press, New York, pp.43-54. Sharkovskii, A. N. (1964), Coexistence of cycles of a continuous map of a line into itself, Ukrainian Math. J. 16,61-71. Shub, M. (1987), Global Stability of Dynamical Systems, Springer-Verlag, New York, Heidelberg, Berlin. Simon, C. (1972), Instability in Diffr(T3) and the non-genericity of rational zeta functions, Trans. Amer. Math. Soc. 114,217-242. Sinai, Ya. (1968), Markov partitions and C-diffeomorphisms, f\mc. Anal. AppJ. 2, 64-89. Sinai, Ya. (1970), Dynamical systems with elastic reflections, Russ. Math. Surveys 25, 137-189. Sinai, Ya. (1972), Gibbs measures in ergodic theory, RUBS. Math. Surveys 166, 21~9. Sitnikov, K. (1960), Existence of oscillating motions for the three body problem, Ookl. Akad. Nauk. 133, 303-306. Smale, S. (1963), Stable manifolds for differential equations and diffeomorphisms, Ann. Scuo1a Normale Piss 18, 97-116. Smale, S. (1965), Diffeomorphisms with many periodic points, Differential and Combinatorial Topology, Princeton University Press, Princeton, NJ, 63-80. Smale, S. (1966), Structurally stable systems are not dense, Amer. J. Math. 88, 491-496. Smale, S. (1967), Differentiable dynamical systems, Bull. Amer. Math. Soc. 13,747-817. Smale, S. (1970), The n-Stability Theorem, Proc. Symp. Pure Math. 14,289-297. Smale, S. (1980), The Mathematics of Time: Essays on Dynamical Systems, Economic Processes, and Related Topics, Springer-Verlag, New York, Heidelberg, Berlin. Smith, K. (1971), Primer of Modem Analysis, Bogden & Quigley, Inc., Publishers, Tarrytownon-Hudson, New York, Belmont CA. Snavely M. (1990), Markov Partitions for Hyperbolic Automorphisms of the Two-Dimensional Torus, Thesis, Northwestern University. Sotomayor, J. (1973a), Generic one parameter families of vector fields on two manifolds, Pub!. Math. Inst. Hautes Etudes Scientifiques 43, 5-46. Sotomayor, J. (1973b), Generic bifurcations of dynamical II/Items, in Dynamical Systems (edited by M. Peixoto), Academic Press, New York, pp. 561-582. Sparrow, C. (1982), The Lorenz Equations: Bifurcations, Chaos, and Strange Attroctors, Springer-Verlag, New York, Heidelberg, Berlin. Stefan, P. (1977), A theorem 01 Sarkovskii on the existence olperiodic orbits of continuous endomorphisms 01 the real line, Comm. Math. Phys. 54, 237-248. Sternberg, S. (1958), On the structure of local homeomorphisms 01 Euclidean n-space, II, Amer. J. Math. 81, 623-631. van Strien, S. (1979), Center manilolds are not Coo, Math. Z. 166, 143-145. van Strien, S. (1981), On the bifurcation creating horseshoes, in Dynamical Systems and Thrbulence (edited by Rand and Young) Lecture Notes in Math., vol. 898, Springer-Verlag, New York, Heidelberg, Berlin, pp. 316-351. Strogatz, S. (1994), Nonlinear Dynamics and Chaos, Addison-Wesley Pub!. Co., New York & Reading, MA. Takens, F. (1974), Singularities 01 vector fields, Publ. Math. Inst. Hautes Etudes Scientiflquee 43,47-100. Takens, F. (1975), Tolerance stability, in Dynamical Systems - Warwick 1974, Lecture Notes in Math., vo!. 468, Springer-Verlag, New York, Heidelberg, Berlin, pp. 293-304.
462
REFERENCES
Takens, F. (1988), Limit capacity and Hausdorff dimension of dynamically defined Cantor sets, Lecture Notes in Math. (Dynamical Systems) 1331, Springer-Verlag, New York, Heidelberg, Berlin, 196-212. Thorn, R. (1973), Stabilite Structurelle et Morphogenese, Addison-Wesley Publ. Co., New York and Reading, MA; English (1975). Ushiki, S., Oka, H., and Kokubu, H. (1984), Existence of strange attractors in the unfolding of a degenerate singularity of a translation invariant vector field, C. R. Acad. Ac. Paris, Ser. I Math. 298, 39-42. Vinograd, R. E. (1957), The inadequacy of the method of characteristic exponents for the study of nonlinear differential equations, Mat. Sbornik 41, 431-438. Walters, P. (1970), Anosov diffeomorphisms are topologically stable, Topology 9, 71-78. Walters, P. (1982), An Introduction to Ergodic Theory, Springer-Verlag, New York, Heidelberg, Berlin. Waltman, P. (1983), Competition Models in Population Biology, Society for Industrial and Applied Mathematics, Philadelphia, PA. Wen, L. (1991), The C· closing lemma for non-singular endomorphisms, Ergodic Theory Dynamical Systems 11, 393-412. Wen, L. (1992), Anosov endomorphisms on branched surfaces, J. Complexity 8, 239-264. Wiggins, S. (1988), Global Bifurcation and Chaos, Springer-Verlag, New York, Heidelberg, Berlin. Wiggins, S. (1990), Introduction to Applied Nonlinear Dynamical Systems and Chaos, Springer-Verlag, New York, Heidelberg, Berlin. Williams, R. (1967), One-dimensional nonwandering sets, Topology 6, 473-487. Williams, R. (1968), The zeta function of an attractor, in Conference on the Topology of Manifolds (J. C. Hocking, ed.), Prindle Weber and Smith, Boston MA, pp. 155-161. Williams, R. (19708), The 'DA' maps of Smale and structural stability, Global Analysis, Proc. Sympos. Pure Math., Amer. Math. Soc. 14, 329-334. Williams, R. (1970b), The zeta function in global analysis, Global Analysis, Proc. Sympos. Pure Math., Amer. Math. Soc. 14, 335-340. Williams, R. (1974), Expanding attractors, Publ. Math. I.H.E.S. 43, 169-203. Williams, R. (1977), The structure of Lorenz attractors, in Thrbulence Seminar (edited by P. Bernard) Lect. Notes in Math., vol. 615, Springer-Verlag, New York, Heidelberg, Berlin. Williams, R. (1979), The bifurcation space of the Lorenz attractor, in Bifurcation Theory and Its Applications in Scientific Disciplines, (edited by O. Gurel and o. E. ROesler) Ann. NY Acad. Sci., vol. 316, pp. 393-399. Williams, R. (1980), Structure of Lorenz attractors, Publ. Math. I.H.E.S. 50,59-72. Williams, R. (1983), Lorenz knots are prime, Ergodic Theory Dynamical Systems 4,147-163. Wilson. W. (1967), The structure of the level surfaces of a Lyapunov function, J. Dift'. Eq. 3,323-329. Wojtkowski, M. (1985), Invariant families of cones and Lyapunov exponents, Ergodic Theory Dynamical Systems 5, 145-161. Yomdin, Y. (1987), Volume growth and entropy, Israel J. Math. 57, 285-318. Yorke, J. and Alligood, K. (1985), Period doubling cascades of attractors: a prerequisite for horseshoes, Commun. Math. Phys. 101,305-321. Young, L. S. (1982), Dimension, entropy, and Lyapunov exponents, Ergodic Theory Dynamical Systems 2, 109-124. Young, L. S. (1993), Ergodic theory of chaotic dynamical systems, Lecture Notes in Math. (From Topology to Computation: Proceedings of the Smalefest, editors Hirsch, Marsden, and Shub), Springer-Verlag, New York, Heidelberg, Berlin. Zeeman, E. C. (1980), Population dynamics from game theory, Lecture Notes in Math. 819, Springer-Verlag, New York, Heidelberg, Berlin, 471-497.
Index bounded continuous maps 158 bounded linear map 158 box dimension 358 branched manifold 303, 305, 315 Brusselator 233 bump functions 163 bundle map 433
A-graph 66 a Cantor set 26
a-limit point 23, 146 a-limit set 23, 146 a-recurrent 25, 148 accessible points 275 adapted metric 181 adapted norm 108, 116, 151, 156, 181, 241 adding machine 89 adjacency matrix 248 admissible metric 434 affine conjugacy 41 Andronov-Hopf Bifurcation for diffeomorphisrns 223 Andronov-Hopf bifurcation for flows 226 Ano80v Closing Lemma 381 Ano80v diffeomorphism 276 asymptotic 16 asymptotically stable 16, 106, 156 attracting 16, 106, 110, 149, 156, 167 attracting fixed point 116 attracting set 292, 368 attractor 292 attractor of Lorenz type 318 Axiom A 382
CO-sup-norm 141 C l 15,132 or 15, 134 C r Section Theorem 436 or structurally stable 238 or topology 238 or topology on vector fields 240 or-conjugacy 41 or-diffeomorphism 134 or -distance between functions 44
or(M,N) 237
Cantor set 26 capacity dimension 358 center eigenspace 111, 120 center 104, 149 center manifold 198 center-stable manifold 197 center-unstable manifold 197 chain components 26, 149,368 chain equivalent 26 chain limit set 25, 368 chain recurrent set 25, 149, 368 chain transitive 26 chaos 83, 355 chaotic attractor 292 chaotic map 83 characteristic multipliers 167 closed orbit 146, 165 codimension 417 cones 185, 186, 256, 297 conjugacy 40 conjugate flows 113 conjugate maps 40
backward orbit 16 Baker's map 59 Baker's map, generalized 363 base space 433 basic sets 386 basin boundary 275 basin of attraction 149, 156, 204, 275 Benedicks and Carleson 308 Birkhoff Ergodic Theorem 86, 246, 353 Birkhoff Transitivity Theorem 245 Borel measure 245 Borel probability measure 352 Borel sets 352 41>1
464
Conley's Fundamental Theorem 368 continuation of a fixed point 157 Continuity of solutions with respect to initial conditions 143 continuously differentiable 15, 132, 134 Cont(n, R) 121 contracting linear map 116 contraction mapping 139 Contraction Mapping Theorem 139 coordinate chart 235 covering estimate 138 critical point 13 cross section to a flow 166 cycle 321, 388 Diffr(M) 237 Diffr(Rk) 134 DA-diffeomorphism 300 Denjoy's Example 55 Denjoy's Theorem 55 derivative 132 diffeomorphism 15, 134, 237 differentiable 132, 134 differentiable linearization 154, 157 differentiable map between manifolds 237 differentiably conjugacy 41 differentiably conjugate 332 differential equation on a manifold 240 dimension, box 358 dimension, capacity 358 dimension, Hausdorff 361 dimension, topological 293 disk bundle disk 434 double of a map 71 doubling map (squaring map) 42, 335 dynamically defined 362
Ee 111, 156 E' Ill, 156 EU 111, 156 E~ 241, 354 241, 354 e-chain 25, 149 f-shadow 60, 377 edge subshift 248 eigenvalues of a periodic orbit 167 embedded submanifold 236 embedding 236 entropy, topological 334 equilibrium 146
E;
INDEX
equivalent flows 113 ergodic 246, 352 eventually periodic 15 eventually positive matrix 74, 123 Existence of Solutions of Differential Equations 141 expanding attractor 293 expanding linear map 116 expansion 110 expansive 83, 381 I-covers 64 family of quadratic maps 20 Feigenbaum constant 80 fiber contraction 433 fiber 433 filtration 404 first ret urn map 166 First Variation Equation 145 Fix(f) 15 fixed point 15, 146, 156 flip bifurcation 219 Floquet theory 200 Flow Box Coordinates 203 flow conjugate 203 flow 140 flow equivalent 113, 203 flow expansiveness 402 flow of a vector field 240 focus 105 forward orbit 16 Frechet derivative 132 full n-shift 247 fundamental domain 44, 45, 46, 48, 114, 117,406 fundamental matrix solution 100 generalized Baker's map 363 generalized eigenvector 94 generalized tent map 58 geometric horseshoe 249 geometric model of Lorenz equations 312 global stable manifold 183 global stable manifolds of points 243 global unstable manifold 183 global unstable manifolds of points 243 gradient flow 323 gradient vector field 323, 368 gradient-like 324 gradient-like vector fields 368
INDEX
graph for the partition 66 graph transform 183, 434 Gronwall's inequality 142 Hamiltonian differential equations 269 Hamiltonian vector field 269 Hartman-Grobman Theorem 152, 157, 170 Hausdorff dimension 361 Hausdorff metric 416 Henon attractor 306 Henon map 6, 231, 255, 306 Holder 133 heteroclinkally related 383 higher derivatives 134 homeomorphism 15 homoclinic point 259 homoclinically related 383 homology zeta function 289 horseshoe 6, 249, 259 h-related 383 hyperbolic 111,120,149,157,167,181 hyperbolic estimates 188 hyperbolic fixed point 149 hyperbolic invariant set 241, 244 hyperbolic linear differential equation III hyperbolic linear map 120, 181 hyperbolic periodic point 157 hyperbolic splitting III hyperbolic structure 241, 244 hyperbolic toral automorphisms 276 immersed submanifold 236 immersion 236 Implicit Function Theorem 135, 136 In Phase Theorem 380 Inclination Lemma 200 index of fixed point 325 Intermediate Value Theorem 14 invariant measure 81, 352 invariant section 433 Invariant Section Theorem 434 invariant set 23 Inverse Function Theorem 138 inverse limit 299 irreducible 247 irreducible matrix 74, 123 isoclines 173, 205 isolated invariant set 262, 378 isolating neighborhood 368, 378
465
itinerary map 38 Jakobson 81, 309 Jordan canonical form 94
k-cycle 321, 388 knots 318 L(Rk,Rn) 94
Lambda Lemma 200 least period 15 Lefschetz Fixed Point Theorem 289 Lefschetz index of a fixed point 289 Lefschetz number of a diffeomorphism 289 Liapunov exponent 86 Liapunov exponents 84, 352 Liapunov function 155, 324, 368 Liapunov stable 16, 106, 156 Lienard equations 171, 172 lift 49 limit cycle 147,179 limit set 383 linear conjugacy 41 linear contraction 116 linear expansion 116 linear sink 116 linear source 116 Liouville's formula 78,101,177,207 Lip(f) 132 Lipschitz 132 local product structure 282, 385, 388 local stable manifold 181 local unstable manifold 181 locally eventually onto 316 Lorenz attractor 312 Lorenz equations 8, 309 L-stable 16 m(A) 94, 138, 158 manifold 235 Markov partition 283, 389 Markov property for ambient boxes 265 maximal interval of definition of the solution 143 maximal invariant set 260, 378 Mean Value Theorem 14 measure 352 Melnikov function 271 Melnikov method 7, 271 middle-a Cantor set 26
466
middle-third Cantor 28 minimal set 24 minimum norm 94, 138, 158, 187 mixing 39 monotonically decreasing 57 monotonically increasing 57 Morse inequalities 325 Morse-Smale diffeomorphism or flow 319 Morse-Smale flow 322 Multiplicative Ergodic Theorem 352
N30 (n, t )-separated 334 (n, t)-spanning set 343 negatively invariant set 23 no cycle property 388 node 105 nondegenerate critical point 13, 323 nonhomogeneous linear equations 115 nonnegative matrix 123 nonuniformly hyperbolic structure 351, 354 nonwandering 149 nonwandering point 25 nonwandering set 25 normal bundle 445 normal bundle of a closed orbit 170 normally contracting 445 normally hyperbolic 445 nowhere dense 26 n-shift 37, 247 n-sphere 236
INDEX
Perh(f) 414 Perh(k, f) 414 Per(k, f) 414 Per(n,f) 15 perfect 26 period doubling bifurcation 219 period doubling route to chaos 79 period IS, 146, 156 periodic attractor 167 periodic orbit IS, 146, 165 periodic point IS, 146, 156, 165 periodic repeller 167 periodic sink 167 periodic source 167 Perron method for stable manifolds 183 Perron-Frobenius Theorem 123, 341 persistence of a fixed point 157 phase portrait 2, 102, 148 phase space 102 pitchfork bifurcation 231 Plykin attractor 304 Poincare map 166 Poincare metric 35 Poincare norm 35 Poincare Recurrence Theorem 326 Poincare-Bendixson Theorem 179 positive matrix 74, 123 positively invariant set 23 predator-prey 178 proper 163 quadratic map 20
w-limit point 23, 146 w-limit set 23, 146 w-recurrent 25, 148 O-Stability Theorem 404 O-stable 47, 404 one-sided shift 37 openness of Anosov diffeomorphisms 399 operator norm 94, 158 orbit 16, 144 orbitally Liapunov stable 169 orientation preserving, reversing 117 overflowing 433
'R-stable 404 p-distance 33 r-continuously differentiable 134 rectangle 282, 389 repelling 16, 1l0, 149, 157, 167 repelling fixed point 116 repelling set 368 residual subset 245 Riemannian metric 240 Riemannian norm 240 rigid rotation of 8 1 50 rooftop map 42 rotation number 49 rotation of 8 1 50
partial ordering of basic sets 388 past history 181 Per(f) 414
saddle Ill, 149, 157 saddle linear differential equation III Saddle-Node Bifurcation 212
INDEX
Schwarz Lemma 34 second derivative 133 sectors 185 semi-conjugacy 40 semi-stability of Anosov Diffeomorph isms 398 semi-stable 398 sensitive dependence on initial conditions 82, 381 separated set 334 shadow 60, 377 Sharkovskii ordering 65 Sharkovskii's Theorem 66 shift space 37, 247, 283 Sinai-Ruelle-Bowen measure 356 sink 16, 110, 149, 157, 167 Smale horseshoe 249, 259 solenoid 294 source 16, 110, 149, 157, 167 spanning set 343 Spectral Decomposition Theorem 385 spectrum of a linear map 181 sphere 236 spiral 105 squaring map (doubling map) 335 SRB measure 356 stability of an hyperbolic invariant set 400 stable bundle of a periodic orbit 200 stable cones 186 stable disk 185 stable eigenspace 111, 120 stable focus 105 stable manifold 183 stable manifold of an invariant set 380 Stable Manifold Theorem 182, 197 Stable Manifold Theorem for a hyperbolic set 242, 374 stable node 105 stable set 16 stable spiral 105 stable subbundle 241 stable subspaces 181 stair step method 18 Stefan cycle 67 Stefan graph 67 Sternberg linearization 154, 157 strongly gradient-like 368 structural stability 44, 238 structural stability of Anosov diffeomorphisms 399
467
structural stability of Anosov flows 401 structural stability of hyperbolic toral automorphisms 276 structural stability of Morse-Smale diffeomorphisms 321 Structural Stability Theorem 406 structurally stable 9, 44, 238 subadditive 61 subshift 73 subshift of finite type 73, 248, 254 subshifts of finite type 247 sup-norm 94, 141, 158 support 421 surface 236 suspension of a map 171 symbol space 37 symbolic dynamics 4, 37, 247, 253, 283 tangent bundle 238 tangent space 238 tangent vector 238 Taylor's expansion 134 template 318 tent map 42,57,58,60 topological conjugacy 38, 40 topological dimension 293 topological entropy 84, 334 topologically conjugate flows 113 topologically equivalent flows 113 topologically mixing 39, 266 topologically transitive 6, 39, 244 toral Anosov diffeomorphism 276 torus 236 totally disconnected 26 transition matrix 73, 248, 283 transitive 39, 244 transversal to a flow 166 transversality condition 405 transverse 200, 319 transverse homoclinic point 259 trapping region 292, 368 tubular neighborhood 447 two shift 37 two sided shift space 247 uniform hyperbolic structure 241 Uniqueness of Solutions of Differential Equations 141 unstable cones 186 unstable disk 185 unstable eigenspace 111, 120