Birkhäuser Advanced Texts
Edited by
Herbert Amann, Zürich University
Steven G. Krantz, Washington University, St. Louis
Shrawan Kumar, University of North Carolina at Chapel Hill
Jan Nekovář, Université Pierre et Marie Curie, Paris

Pavel Drábek
Jaroslav Milota

Methods of Nonlinear Analysis
Applications to Differential Equations
Birkhäuser Basel · Boston · Berlin
Authors:
Pavel Drábek, Department of Mathematics, Faculty of Applied Sciences, University of West Bohemia in Pilsen, Univerzitní 8, Plzeň, Czech Republic
Jaroslav Milota, Department of Mathematical Analysis, Faculty of Mathematics and Physics, Charles University in Prague, Sokolovská 83, Praha, Czech Republic

Bibliographic information published by Die Deutsche Bibliothek: Die Deutsche Bibliothek lists this publication in the Deutsche Nationalbibliografie; detailed bibliographic data is available on the Internet.

Birkhäuser Verlag AG, Basel · Boston · Berlin. This work is subject to copyright. All rights are reserved, whether the whole or part of the material is concerned, specifically the rights of translation, reprinting, re-use of illustrations, recitation, broadcasting, reproduction on microfilms or in other ways, and storage in data banks. For any kind of use permission of the copyright owner must be obtained.

Birkhäuser Verlag AG, Basel · Boston · Berlin, Basel, Switzerland. Part of Springer Science+Business Media. Printed on acid-free paper produced of chlorine-free pulp. TCF ∞. Printed in Germany.

www.birkhauser.ch
Dedicated to the memory of Svatopluk Fučík
Contents

Preface

1 Preliminaries
  1.1 Elements of Linear Algebra
  1.2 Normed Linear Spaces

2 Properties of Linear and Nonlinear Operators
  2.1 Linear Operators
  2.2 Compact Operators
  2.3 Contraction Principle

3 Abstract Integral and Differential Calculus
  3.1 Integration of Vector Functions
  3.2 Differential Calculus in Normed Linear Spaces
    3.2A Newton Method

4 Local Properties of Differentiable Mappings
  4.1 Inverse Function Theorem
  4.2 Implicit Function Theorem
  4.3 Local Structure of Differentiable Maps, Bifurcations
    4.3A Differentiable Manifolds, Tangent Spaces and Vector Fields
    4.3B Differential Forms
    4.3C Integration on Manifolds
    4.3D Brouwer Degree

5 Topological and Monotonicity Methods
  5.1 Brouwer and Schauder Fixed Point Theorems
    5.1A Fixed Point Theorems for Noncompact Operators
  5.2 Topological Degree
    5.2A Global Bifurcation Theorem
    5.2B Topological Degree for Generalized Monotone Operators
  5.3 Theory of Monotone Operators
    5.3A Browder and Leray–Lions Theorem
  5.4 Supersolutions, Subsolutions, Monotone Iterations
    5.4A Minorant Principle and Krein–Rutman Theorem
    5.4B Supersolutions, Subsolutions and Topological Degree

6 Variational Methods
  6.1 Local Extrema
  6.2 Global Extrema
    6.2A Ritz Method
    6.2B Supersolutions, Subsolutions and Global Extrema
  6.3 Relative Extrema and Lagrange Multipliers
    6.3A Contractible Sets
    6.3B Krasnoselski Potential Bifurcation Theorem
  6.4 Mountain Pass Theorem
    6.4A Pseudogradient Vector Fields in Banach Spaces
    6.4B Lusternik–Schnirelmann Method
  6.5 Saddle Point Theorem
    6.5A Linking Theorem

7 Boundary Value Problems for Partial Differential Equations
  7.1 Classical Solution, Functional Setting
  7.2 Classical Solution, Applications
  7.3 Weak Solutions, Functional Setting
  7.4 Weak Solutions, Application of Fixed Point Theorems
  7.5 Weak Solutions, Application of Degree Theory
    7.5A Application of the Degree of Generalized Monotone Operators
  7.6 Weak Solutions, Application of Theory of Monotone Operators
    7.6A Application of Leray–Lions Theorem
  7.7 Weak Solutions, Application of Variational Methods
    7.7A Application of the Saddle Point Theorem

Summary of Methods
Typical Applications
Comparison of Bifurcation Results
List of Symbols
Index
Bibliography
Preface Motto: Real world problems are in essence nonlinear. Hence methods of nonlinear analysis became important tools of modern mathematical modeling.
There are many books and monographs devoted to methods of nonlinear analysis and their applications. Typically, such a book is either dedicated to a particular topic and treats details which are difficult for a student to understand, or it deals with an application to complicated nonlinear partial differential equations in which a lot of technicalities are involved. In both cases it is very difficult for a student to get oriented in this kind of material and to pick up the ideas underlying the main tools for treating the problems in question. The purpose of this book is to describe the basic methods of nonlinear analysis and to illustrate them on simple examples. Our aim is to motivate each method considered, to explain it in a general form but in the simplest possible abstract framework, and finally, to show its application (typically to boundary value problems for elementary ordinary or partial differential equations). To keep the text free of technical details and make it accessible also to beginners, we did not formulate some key assertions and illustrative examples in the most general form. The exposition of the book is at two levels, visually differentiated by different font sizes. The basic material is contained in the body of the seven chapters. The more advanced material is contained in appendices to a number of sections and is presented in a smaller font size. The basic material is independent of the advanced material, is self-contained, and can be read by students new to the subject. It should prepare an undergraduate student in mathematics to read scientific papers in nonlinear analysis and to understand applications of the methods presented to more complex problems. Each chapter contains a number of exercises that should provoke the reader’s creativity and help develop his or her own style of approaching problems. However, the exercises play an additional role. 
They carry some of the technical material that was omitted in simplifying some of the basic proofs. They are thus an organic
part of the exposition for graduate students who already have experience with the methods of nonlinear analysis and are interested in generalizations.

We have organized the material in this book as follows. In Chapters 1–3, we introduce some necessary notions and basic assertions from linear algebra (Section 1.1) and linear functional analysis (Sections 1.2–2.2), and we also present some preliminaries concerning the Contraction Principle and differential and integral calculus in normed linear spaces (Sections 2.3–3.2). In this part of the text we give proofs of the results which are closely related to the nonlinear part of the book. On the other hand, several very important statements of linear functional analysis are left without proofs. In Chapter 4, local properties of differentiable mappings are treated. In particular, it includes topics such as the Inverse Function Theorem, the Implicit Function Theorem together with the Rank Theorem and the notion of the differentiable manifold. Results such as the Lyapunov–Schmidt Reduction and the Morse Theorem are used to prove the Local Bifurcation Theorem of Crandall and Rabinowitz. Chapter 5 is devoted to the topological and monotonicity methods of nonlinear analysis. We focus on the Brouwer and Schauder Fixed Point Theorems, the Sard Theorem and the analytic approach to the degree of a mapping, monotone operators and the method of monotone iterations based on the notions of super- and sub-solutions. In Chapter 6, basic variational methods are presented. We start with local and global extrema and then continue with the method of Lagrange Multipliers with applications to eigenvalue problems (Courant–Fischer and Courant–Weinstein Principles), the Mountain Pass Theorem and the Saddle Point Theorem. Abstract results from Chapters 4–6 are accompanied by examples of various boundary value problems for ordinary differential equations.
Since these applications are spread over a large number of pages, we add a brief account of examples of boundary value problems for both ordinary and partial differential equations together with the methods used at the end of the book. The reader will also find there a short guide on the bifurcation results presented in the book. Chapter 7 deals with several applications of the preceding methods to boundary value problems for elementary nonlinear partial differential equations. We present and discuss the notions of classical and weak solutions and try to minimize the technical difficulties connected with the formulation of the problems. All this material represents a self-contained introduction to the methods of nonlinear analysis with simple applications to elementary differential equations. More advanced material is presented in appendices which are attached to a number of sections. In particular, Appendix 3.2A explores an abstract Newton Method as an application of the Contraction Principle and differential calculus in Banach spaces. Appendices 4.3A to 4.3D are devoted to the analysis on manifolds (vector fields, differential forms and integration on manifolds). The main results presented
in these appendices are an abstract version of the Stokes Theorem (and its applications) and the construction of the Brouwer degree by means of differentiable forms. Some fixed point theorems for noncompact operators are presented in Appendix 5.1A. As an application of the Leray–Schauder degree theory we consider global bifurcation theorems in Appendix 5.2A while Appendix 5.2B is devoted to the generalization of the Leray–Schauder degree to generalized monotone mappings. Appendix 5.3A deals with the generalization of the theory of monotone operators to a more general functional setting and to operators which are monotone only in the principal part. In Appendix 5.4A we give the proof of the famous Krein–Rutman Theorem which itself falls within the linear theory but plays an essential role in the study of nonlinear problems. Appendix 5.4B illustrates the connection between the method of supersolutions and subsolutions and the topological degree. The Ritz Method is presented in Appendix 6.2A as an application of an abstract variational principle. Appendix 6.2B illustrates the connection between the method of supersolutions and subsolutions and the existence of global extrema. Appendix 6.3A has an auxiliary character for establishing the “potential” bifurcation theorem in Appendix 6.3B. In Appendix 6.4A we generalize the so-called Deformation Lemma and present the Mountain Pass Theorem in a more general setting. The generalization of the Lagrange Multiplier Method is carried out in Appendix 6.4B. Appendix 6.5A is dedicated to the generalization of the Saddle Point Theorem. Appendices 7.5A, 7.6A and 7.7A are devoted to applications of the degree of generalized monotone mappings, the Leray–Lions Theorem and the Saddle Point Theorem, respectively, to boundary value problems for elementary partial differential equations. 
This more advanced part contains several generalizations of the methods presented in the basic part and the beginner in the subject who is reading the book can skip it. On the other hand, there are (a few) places in appendices where we have to refer to the basic text and the notions we refer to are contained in the forthcoming chapters or sections. This, however, corresponds to our “philosophy of two levels” of the text and, in our opinion, does not impair the “smoothness” of reading. In order to make the text self-contained, we decided to comment on several notions and statements in footnotes. To place the material from the footnotes in the text could disturb a more advanced reader and make the exposition more complicated. In order to emphasize the role of the statements in our exposition we identify them as Theorem, Proposition, Lemma or Corollary. However, the reader should be aware of the fact that this by no means expresses the importance of the statement within the whole of mathematics. So, several times, we call important theorems Propositions, Lemmas or Corollaries. Although the book should primarily serve as a textbook for students on the graduate level, it can be a suitable source for scientists and engineers who have need of modern methods of nonlinear analysis.
At this point we would like to include a few words about our good friend, colleague and mentor Svatopluk Fučík to whom we dedicate this book. His work in the field of nonlinear analysis is well recognized and although he died in 1979 at the age of 34, he ranks among the most important and gifted Czech mathematicians of the 20th century. We would like to thank Marie Benediktová and Jiří Benedikt for an excellent typesetting of this book in LaTeX 2ε, excellent figures and illustrations as well as for their valuable comments which improved the quality of the text. Our thanks belong also to Eva Fašangová, Gabriela Holubová, Eva Kaspříková and Petr Stehlík for their careful reading of the manuscript and useful comments which have decreased the number of our mistakes and made this text more readable. Our special thanks belong to Jiří Jarník for correction of our English, Ralph Chill and Herbert Leinfelder for their improvements of the text and methodological advice. Both authors appreciate the support of the research projects of the Ministry of Education, Youth and Sports of the Czech Republic MSM 4977751301 and MSM 0021620839.

Plzeň–Praha, November 2006

Pavel Drábek
Jaroslav Milota
Chapter 1
Preliminaries

1.1 Elements of Linear Algebra

This section is rather brief since we suppose that the reader already has some knowledge of linear algebra. Therefore, it should be viewed mainly as a source of concepts and notation which will be frequently used in the sequel. There are plenty of textbooks on this subject. As we are interested in applications to analysis, we recommend the classical book of Halmos [64] to the reader who wants more detailed information.

A decisive part of analysis concerns the study of various sets of numbers such as $\mathbb{R}$, $\mathbb{C}$, $\mathbb{R}^M$, ..., sets of functions (continuous, integrable, differentiable), and mappings between them. These sets usually allow some algebraic operations, mainly addition and multiplication by scalars. We will denote the set of scalars by $\mathbb{K}$ and usually have in mind either the set of real numbers $\mathbb{R}$ or that of complex numbers $\mathbb{C}$.

Definition 1.1.1. A set $X$ on which two operations, addition and multiplication by scalars, are defined is called a linear space over a field $\mathbb{K}$ if the following conditions are fulfilled:
(1) $X$ with respect to the operation $x, y \in X \mapsto x + y \in X$ forms a commutative group with a unit element denoted by $o$ and the inverse element to $x \in X$ denoted by $-x$.
(2) The operation $a \in \mathbb{K},\ x \in X \mapsto ax \in X$ satisfies
    (a) $a(bx) = (ab)x$, $a, b \in \mathbb{K}$, $x \in X$,
    (b) $1x = x$, $x \in X$, where $1$ is the multiplicative unit of the field $\mathbb{K}$.
(3) For the two operations the distributive laws hold, i.e., for $a, b \in \mathbb{K}$, $x, y \in X$, we have
    (a) $(a + b)x = ax + bx$,
    (b) $a(x + y) = ax + ay$.
If $\mathbb{K} = \mathbb{R}$ or $\mathbb{C}$, then $X$ is called a real or complex linear space, respectively. If a subset $Y \subset X$ itself is a linear space with respect to the operations induced by those defined on $X$, then $Y$ is said to be a (linear) subspace of $X$.

In the rest of this section the character $X$ always denotes a linear space over $\mathbb{K}$. If $\mathbb{K}$ is not specified, then it always means that a definition or a statement holds for an arbitrary field $\mathbb{K}$.

For $x_1, \dots, x_n \in X$, $a_1, \dots, a_n \in \mathbb{K}$ the sum $\sum_{i=1}^{n} a_i x_i$ is well defined and determines an element $x \in X$ which is called a linear combination of $x_1, \dots, x_n$ (with coefficients $a_1, \dots, a_n$). Notice that only finite linear combinations are defined since infinite sums cannot be defined without any topology on $X$. If $A$ is a subset of $X$, then the set of all linear combinations of elements of $A$ is denoted by $\operatorname{Lin} A$ and is called the span of $A$. A span is always a subspace of $X$.

We can ask whether $x \in \operatorname{Lin}\{x_1, \dots, x_n\}$ can be expressed in a unique way as a linear combination of $x_1, \dots, x_n$. This uniqueness holds if and only if $x_1, \dots, x_n$ are linearly independent, i.e., the condition

$$\sum_{i=1}^{n} a_i x_i = o \iff a_1 = \dots = a_n = 0$$

is satisfied. More generally, we have the following definition.

Definition 1.1.2. A set $A \subset X$ is said to be linearly independent if every finite subset of $A$ is linearly independent. A set $A \subset X$ is called a basis¹ of $X$ provided $A$ is linearly independent and $\operatorname{Lin} A = X$.

Theorem 1.1.3. Every linear space $X \neq \{o\}$ has a basis. If $A$, $B$ are two bases of $X$, then there is a bijective (i.e., injective and surjective) mapping from $A$ onto $B$.

We will give the proof of the existence part since it contains a very important method which is frequently used. To see the idea of the proof, notice that a basis is a linearly independent set which is maximal in the sense that, by adding an element, it will cease to be linearly independent. The question why such a maximal set has to exist concerns generally mathematical philosophy. There are several equivalent statements of set theory which guarantee this existence result. As the most useful we have found the following one.²

Theorem 1.1.4 (Zorn's Lemma). Let $(A, \prec)$ be an ordered set in which every chain has the lowest upper bound. Then for any $a \in A$ there is a maximal $m \in A$ such that $a \prec m$.³

¹ It is sometimes called a Hamel basis in order to emphasize the distinction from a Schauder or orthonormal basis in a Banach space or a Hilbert space, respectively (see Section 1.2).
² It can be viewed also as an axiom of set theory.
³ A binary relation $\prec$ on $A \times A$ is called an ordering if (1) $x \prec x$ for all $x \in A$, (2) if $x \prec y$ and $y \prec x$, then $x = y$, (3) if $x \prec y$ and $y \prec z$, then $x \prec z$.
We now return to Theorem 1.1.3.

Proof of Theorem 1.1.3. Let $\mathcal{A}$ be the collection of all linearly independent subsets of $X$ and define $A \prec B$ for $A, B \in \mathcal{A}$ if $A$ is a subset of $B$. Choose $A \in \mathcal{A}$ ($\mathcal{A} \neq \emptyset$ since $X \neq \{o\}$) and let $M$ be a maximal element of $\mathcal{A}$, $A \subset M$, whose existence is guaranteed by Zorn's Lemma (if $\mathcal{B}$ is a chain in $\mathcal{A}$, then $\sup \mathcal{B} = \bigcup_{B \in \mathcal{B}} B$). Then $\operatorname{Lin} M = X$: otherwise, for $x \in X \setminus \operatorname{Lin} M$ the set $M \cup \{x\}$ would be linearly independent and strictly larger than $M$, contradicting the maximality of $M$.

The proof of the latter part of Theorem 1.1.3 is more involved (the construction of an injection of $A$ into $B$ is also based on the application of Zorn's Lemma) and it is omitted.

Definition 1.1.5. Let $X$ be a linear space and let $A$ be a basis of $X$. Then the cardinality of $A$ is called the dimension of $X$.

Example 1.1.6.
(i) Assume that $A$ is a basis of a linear space $X$. Then there is a set $\Gamma$ (the so-called index set) and a bijection $\gamma \in \Gamma \mapsto e_\gamma \in A$ of $\Gamma$ onto $A$. We will also say that $\{e_\gamma\}_{\gamma \in \Gamma}$ is a basis of $X$. For any $x \in X$ there is a finite subset $K \subset \Gamma$ and scalars $\{\alpha_\gamma\}_{\gamma \in K}$ such that

$$x = \sum_{\gamma \in K} \alpha_\gamma e_\gamma.$$

These scalars are uniquely determined and will be called the coordinates of $x$ with respect to the basis $\{e_\gamma\}$.
(ii) The space $\mathbb{R}^M$ of real $M$-tuples with the usual operations is a real linear space and the elements

$$e_k = (0, \dots, 0, 1, 0, \dots, 0), \quad k = 1, \dots, M$$

($1$ is at the $k$th position), form a basis of $\mathbb{R}^M$. It will be called the standard basis of $\mathbb{R}^M$. If $x = (x_1, \dots, x_M) \in \mathbb{R}^M$, then $x_1, \dots, x_M$ are the coordinates of $x$ with respect to the standard basis.

³ (continued) If $(A, \prec)$ is an ordered set, then $B \subset A$ is called a chain if for every $x, y \in B$ we have either $x \prec y$ or $y \prec x$. An element $b \in A$ is called the lowest upper bound of a subset $B \subset A$ ($b = \sup B$) if (1) $a \in B \implies a \prec b$; (2) if $a \prec c$ for all $a \in B$, then $b \prec c$. Similarly, we call $d \in A$ the greatest lower bound of a subset $B \subset A$ ($d = \inf B$) if (1) $a \in B \implies d \prec a$; (2) if $c \prec a$ for all $a \in B$, then $c \prec d$. An element $m \in A$ is called a maximal element of $A$ if $m \prec x$ for an $x \in A$ implies $m = x$.
(iii) Similarly, $\mathbb{C}^M$ is the space of complex $M$-tuples and the set of elements $e_1, \dots, e_M$ defined as above is the standard basis of $\mathbb{C}^M$. More generally, if $X$ is a real linear space and $iX$ is defined by $iX := \{ix : x \in X\}$ where $i^2 = -1$, then $X_{\mathbb{C}} := X + iX$ ($= \{x + iy : x, y \in X\}$) is the complexification of $X$. The equality $x + iy = o$ holds if and only if $x = y = o$. The operations in $X_{\mathbb{C}}$ are defined as follows:

$$(x_1 + iy_1) + (x_2 + iy_2) := (x_1 + x_2) + i(y_1 + y_2), \quad x_1, x_2, y_1, y_2 \in X,$$
$$(a + ib)(x + iy) := (ax - by) + i(bx + ay), \quad a, b \in \mathbb{R},\ x, y \in X.$$

It is easy to verify that $X_{\mathbb{C}}$ is a complex linear space.
(iv) Let $\mathcal{P}$ be the family of all polynomials of one variable with real or complex coefficients. Then $\mathcal{P}$ is respectively a real or complex linear space and the polynomials $P_k(z) = z^k$, $k = 0, 1, \dots$, form a basis of $\mathcal{P}$.
(v) The space $C[0, 1]$ of all real (complex) continuous functions on the interval $[0, 1]$ is a real (complex) linear space. According to Theorem 1.1.3, $C[0, 1]$ has a basis but it is uncountable (this is not so easy to prove). We will not distinguish among different infinite cardinals and refer to spaces like $\mathcal{P}$ and $C[0, 1]$ as infinite dimensional spaces and use (incorrectly) the symbol $\dim = \infty$.
(vi) We can consider $\mathbb{R}$ as a linear space over the field $\mathbb{Q}$ of rational numbers. For example, the elements $1$ and $\sqrt{2}$ are linearly independent in this case. In this case a basis is uncountable, and serves as a tool for the constructions of "pathological" examples in analysis, like a noncontinuous (or, equivalently, non-measurable) solution $f$ of the functional equation

$$f(x + y) = f(x) + f(y), \quad x, y \in \mathbb{R}.$$
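The operations on the complexification in Example 1.1.6(iii) can be tried out numerically. The following is only a minimal sketch, assuming $X = \mathbb{R}^M$ with vectors stored as tuples; the names `Complexified`, `add`, `scale` and `scaled_by` are hypothetical helpers introduced for this illustration and are not part of the text.

```python
from dataclasses import dataclass
from typing import Tuple

Vector = Tuple[float, ...]  # an element of the real space X = R^M (assumption)


def add(x: Vector, y: Vector) -> Vector:
    """Componentwise sum in X."""
    return tuple(a + b for a, b in zip(x, y))


def scale(a: float, x: Vector) -> Vector:
    """Multiplication of x in X by the real scalar a."""
    return tuple(a * t for t in x)


@dataclass(frozen=True)
class Complexified:
    """An element x + iy of X_C, stored as the pair (x, y) with x, y in X."""
    re: Vector
    im: Vector

    def __add__(self, other: "Complexified") -> "Complexified":
        # (x1 + i y1) + (x2 + i y2) := (x1 + x2) + i (y1 + y2)
        return Complexified(add(self.re, other.re), add(self.im, other.im))

    def scaled_by(self, c: complex) -> "Complexified":
        # (a + ib)(x + iy) := (ax - by) + i (bx + ay)
        a, b = c.real, c.imag
        return Complexified(
            add(scale(a, self.re), scale(-b, self.im)),
            add(scale(b, self.re), scale(a, self.im)),
        )


u = Complexified((1.0, 2.0), (0.0, -1.0))  # the element (1, 2) + i(0, -1)
v = u.scaled_by(1j)                        # multiply by the scalar i
w = v.scaled_by(1j)                        # multiply by i once more: w equals -u
```

Multiplying twice by $i$ returns $-u$, which is the property $i^2 = -1$ built into the definition; the associativity $(c_1 c_2)u = c_1(c_2 u)$ can be checked in the same way for sample scalars.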
Remark 1.1.7. In the sequel we will use the symbol ∞ to warn the reader that the statement in question is true only in linear spaces of finite dimension.

Next we state a corollary of Theorem 1.1.3.

Corollary 1.1.8. Let $X$ be a linear space and let $Y$ be a subspace of $X$. Then there exists a subspace $Z$ of $X$ with the following properties:
(i) for every $x \in X$ there are unique $y \in Y$, $z \in Z$ such that $x = y + z$;
(ii) $Y \cap Z = \{o\}$.
Notation. $X = Y \oplus Z$ and $X$ is called the direct sum of $Y$, $Z$, and $Z$ a direct complement of $Y$ in $X$.

Proof. Let $A$ be a basis of $Y$ and

$$\mathcal{A} = \{B \text{ linearly independent subset of } X : A \subset B\}.$$

By Zorn's Lemma, $\mathcal{A}$ has a maximal element $\tilde{A}$. Put $C = \tilde{A} \setminus A$ (the set complement). It is easy to see that $Z := \operatorname{Lin} C$ satisfies both (i) and (ii).

Notice that the elements $y \in Y$, $z \in Z$ are uniquely determined by $x$ in (i). If $\{o\} \neq Y$ and $Y \neq X$, then $Z$ is not uniquely determined. A simple example can be given in $\mathbb{R}^2$ and the reader is invited to provide one!

Definition 1.1.9. Let $X$ and $Y$ be linear spaces over the same field $\mathbb{K}$. A mapping $A : X \to Y$ is said to be a linear operator if it possesses the following properties:
(1) $A(x + y) = Ax + Ay$ for all $x, y \in X$;
(2) $A(\alpha x) = \alpha Ax$ for every $\alpha \in \mathbb{K}$, $x \in X$.
The collection of all linear operators from $X$ into $Y$ is denoted by $L(X, Y)$. We will use the simpler notation $L(X)$ if $X = Y$.

Remark 1.1.10.
(i) A linear operator $A \in L(X, Y)$ is uniquely determined by its values on the elements of a basis $\{e_\gamma\}_{\gamma \in \Gamma}$ of $X$. Indeed, let $f_\gamma := Ae_\gamma$, $\gamma \in \Gamma$, and

$$x = \sum_{\gamma \in K} \alpha_\gamma e_\gamma, \quad K \subset \Gamma \text{ finite}.$$

If $A$ is linear, then $Ax$ has to be equal to $\sum_{\gamma \in K} \alpha_\gamma f_\gamma$. On the other hand, if $\{f_\gamma\}_{\gamma \in \Gamma}$ are given, then

$$Ax := \sum_{\gamma \in K} \alpha_\gamma f_\gamma \quad \text{for} \quad x = \sum_{\gamma \in K} \alpha_\gamma e_\gamma$$

satisfies (1) and (2) from Definition 1.1.9.
(ii) [∞] Assume that both $X$ and $Y$ are finite dimensional spaces and $\{e_1, \dots, e_M\}$ and $\{f_1, \dots, f_N\}$ are bases of $X$ and $Y$, respectively. If $A \in L(X, Y)$, then

$$Ae_j = \sum_{i=1}^{N} a_{ij} f_i, \quad j = 1, \dots, M, \tag{1.1.1}$$

for some scalars $a_{ij}$.
[∞] These scalars form a matrix

$$\mathbf{A} = (a_{ij})_{\substack{i = 1, \dots, N \\ j = 1, \dots, M}}$$

($N$ rows and $M$ columns; the $j$th column consists of the coordinates of $Ae_j$). This matrix $\mathbf{A}$ is called the matrix representation of the linear operator $A$ with respect to the bases $\{e_1, \dots, e_M\}$ and $\{f_1, \dots, f_N\}$. On the other hand, if $\{e_1, \dots, e_M\}$ and $\{f_1, \dots, f_N\}$ are bases of $X$ and $Y$, respectively, and $\mathbf{A}$ is an $N \times M$ matrix, then the formula (1.1.1) determines a linear operator $A \in L(X, Y)$.
(iii) If $A, B \in L(X, Y)$ have matrix representations $\mathbf{A}$ and $\mathbf{B}$ (with respect to the same bases), then

$$\mathbf{A} + \mathbf{B} := (a_{ij} + b_{ij})_{\substack{i = 1, \dots, N \\ j = 1, \dots, M}}$$

is the matrix representation of $A + B : x \mapsto Ax + Bx$. Similarly,

$$\alpha\mathbf{A} := (\alpha a_{ij})_{\substack{i = 1, \dots, N \\ j = 1, \dots, M}}$$

is the matrix representation of $\alpha A : x \mapsto \alpha Ax$. It is obvious that $L(X, Y)$ is a linear space (over the same scalar field $\mathbb{K}$) under these definitions of $A + B$, $\alpha A$. This is true without any restrictions on the dimensions of $X$ and $Y$.
(iv) If $X$, $Y$, $Z$ are linear spaces over the same scalar field $\mathbb{K}$ and $A \in L(X, Y)$, $B \in L(Y, Z)$, then

$$BA : x \mapsto B(Ax), \quad x \in X,$$

is a linear operator from $X$ into $Z$. [∞] Moreover, if $X$, $Y$, $Z$ are finite dimensional spaces and $\mathbf{A} = (a_{ij})_{\substack{i = 1, \dots, N \\ j = 1, \dots, M}}$, $\mathbf{B} = (b_{ki})_{\substack{k = 1, \dots, P \\ i = 1, \dots, N}}$ are matrix representations of $A$ and $B$, respectively, then

$$\mathbf{B}\mathbf{A} := \Bigl(\sum_{i=1}^{N} b_{ki} a_{ij}\Bigr)_{\substack{k = 1, \dots, P \\ j = 1, \dots, M}}$$

is the matrix representation of $BA$. This product of operators is non-commutative in general, even in the case $X = Y = Z$.
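Remark 1.1.10 can be illustrated on polynomials with the monomial basis of Example 1.1.6(iv). The sketch below is an illustration under stated assumptions, not part of the original text: polynomials are stored as coefficient lists, the operators are differentiation and multiplication by $z$, and the helper names `matrix_of`, `matmul`, `differentiate` and `times_z` are hypothetical.

```python
from fractions import Fraction


def matrix_of(op, dim_in, dim_out):
    """Matrix (a_ij) of a linear map acting on coordinate lists.

    Column j holds the coordinates of op(e_j), as in formula (1.1.1).
    """
    columns = []
    for j in range(dim_in):
        e_j = [Fraction(int(k == j)) for k in range(dim_in)]
        image = op(e_j)
        assert len(image) == dim_out
        columns.append(image)
    # Turn the list of columns into a list of rows.
    return [[columns[j][i] for j in range(dim_in)] for i in range(dim_out)]


def matmul(B, A):
    """Entry (k, j) of BA is the sum over i of b_ki * a_ij."""
    return [[sum(B[k][i] * A[i][j] for i in range(len(A)))
             for j in range(len(A[0]))] for k in range(len(B))]


# Coordinates (c0, c1, c2) refer to the basis 1, z, z^2 (degree <= 2);
# coordinates (c0, c1, c2, c3) refer to 1, z, z^2, z^3 (degree <= 3).
def differentiate(c):
    """Derivative: degree <= 3 -> degree <= 2."""
    return [k * c[k] for k in range(1, len(c))]


def times_z(c):
    """Multiplication by z: degree <= 2 -> degree <= 3."""
    return [Fraction(0)] + list(c)


D = matrix_of(differentiate, 4, 3)   # 3 rows, 4 columns
Z = matrix_of(times_z, 3, 4)         # 4 rows, 3 columns

# Matrices of the composed operators, computed directly from the operators.
DZ = matrix_of(lambda c: differentiate(times_z(c)), 3, 3)  # diag(1, 2, 3)
ZD = matrix_of(lambda c: times_z(differentiate(c)), 4, 4)  # diag(0, 1, 2, 3)
```

Here `matmul(D, Z)` agrees with `DZ` and `matmul(Z, D)` agrees with `ZD`, as item (iv) predicts. The two compositions differ (they even act on different spaces), which reflects the remark that the order of the factors matters.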
For $A \in L(X, Y)$ we denote by

$$\operatorname{Ker} A := \{x \in X : Ax = o\}$$

the kernel of $A$, and by

$$\operatorname{Im} A := \{Ax : x \in X\}$$

the image of $A$. Evidently, $\operatorname{Ker} A$ and $\operatorname{Im} A$ are linear subspaces of $X$ and $Y$, respectively.

Definition 1.1.11. A linear operator $A \in L(X, Y)$ is said to be
(1) injective if $\operatorname{Ker} A = \{o\}$,
(2) surjective if $\operatorname{Im} A = Y$,
(3) an isomorphism if $A$ is both injective and surjective.

Remark 1.1.12.
(i) If $A \in L(X, Y)$ is injective and $e_1, \dots, e_n$ are linearly independent elements of $X$, then $Ae_1, \dots, Ae_n$ are linearly independent elements of $Y$. Further, $A \in L(X, Y)$ is an isomorphism if and only if $\{Ae_\gamma\}_{\gamma \in \Gamma}$ is a basis of $Y$ whenever $\{e_\gamma\}_{\gamma \in \Gamma}$ is a basis of $X$. In other words: linear spaces $X$, $Y$ (over the same scalar field $\mathbb{K}$) have the same dimension if and only if there is an isomorphism $A \in L(X, Y)$.
(ii) Assume that $A \in L(X, Y)$ is an isomorphism and put $A^{-1}y = x$ where $y = Ax$. Then $A^{-1} \in L(Y, X)$ and

$$AA^{-1} = I_Y, \quad A^{-1}A = I_X$$

where $I_X$ and $I_Y$ denote the identity maps on $X$ and $Y$, respectively. $A^{-1}$ is called the inverse of $A$. If $X = Y$ and $\mathbf{A}$ is a matrix representation of $A$, then $A^{-1}$ has the inverse matrix $\mathbf{A}^{-1}$ as the representation in the same bases.
(iii) [∞] (Transformation of coordinates in a finite dimensional space) Let $E = \{e_1, \dots, e_M\}$ and $F = \{f_1, \dots, f_M\}$ be two bases of a linear space $X$. There are two questions:
    (a) What is the relation between the coordinates of a given $x \in X$ with respect to these bases?
    (b) Let $A \in L(X)$ have matrix representations $\mathbf{A}_E$ and $\mathbf{A}_F$ with respect to these bases. What is the relation between $\mathbf{A}_E$ and $\mathbf{A}_F$?
The answer to the first question is easy: Put

$$Te_j = f_j, \quad j = 1, \dots, M,$$

and extend $T$ to a linear operator on $X$. Then $T$ is an isomorphism. Denote by $\mathbf{T} = (t_{ij})_{i, j = 1, \dots, M}$ its matrix representation with respect to the basis $E$, i.e.,

$$Te_j = \sum_{i=1}^{M} t_{ij} e_i, \quad j = 1, \dots, M.$$
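For $x = \sum_{j=1}^{M} \eta_j f_j$ we have

$$x = \sum_{j=1}^{M} \eta_j \sum_{i=1}^{M} t_{ij} e_i = \sum_{i=1}^{M} \Bigl(\sum_{j=1}^{M} t_{ij} \eta_j\Bigr) e_i.$$

This means that the column vector $\xi = (\xi_1, \dots, \xi_M)^{\mathsf{T}}$ of the coordinates of $x$ in the basis $E$ is given by

$$\xi = \mathbf{T}\eta \quad \text{where} \quad \xi_i = \sum_{j=1}^{M} t_{ij} \eta_j.$$

The second question can be answered by the same method but a certain caution in computation is desirable. Write

$$Af_j = A\Bigl(\sum_{k=1}^{M} t_{kj} e_k\Bigr) = \sum_{k,i=1}^{M} t_{kj} a^{(E)}_{ik} e_i = \sum_{k=1}^{M} a^{(F)}_{kj} Te_k = \sum_{k,i=1}^{M} a^{(F)}_{kj} t_{ik} e_i.$$

This equality can be expressed in matrix notation as $\mathbf{A}_E \mathbf{T} = \mathbf{T} \mathbf{A}_F$. Since the matrix $\mathbf{T}$ has an inverse, we get

$$\mathbf{A}_F = \mathbf{T}^{-1} \mathbf{A}_E \mathbf{T}. \tag{1.1.2}$$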
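Formula (1.1.2) can be checked on a small case. The sketch below assumes $X = \mathbb{Q}^2$ with $E$ the standard basis and $F = \{f_1, f_2\}$, $f_1 = (1, 1)$, $f_2 = (1, 2)$; the concrete matrices and the helper names `matmul`, `inverse_2x2` and `frac` are chosen only for this illustration.

```python
from fractions import Fraction


def matmul(P, Q):
    """Matrix product PQ for matrices given as lists of rows."""
    return [[sum(P[i][k] * Q[k][j] for k in range(len(Q)))
             for j in range(len(Q[0]))] for i in range(len(P))]


def inverse_2x2(M):
    """Inverse of a 2 x 2 matrix by the explicit determinant formula."""
    (a, b), (c, d) = M
    det = a * d - b * c
    if det == 0:
        raise ValueError("singular matrix: its columns do not form a basis")
    return [[d / det, -b / det], [-c / det, a / det]]


def frac(rows):
    """Convert integer entries to exact fractions."""
    return [[Fraction(v) for v in row] for row in rows]


# Column j of T holds the E-coordinates of f_j = T e_j.
T = frac([[1, 1],
          [1, 2]])
# Matrix of some operator A with respect to E (an arbitrary example).
A_E = frac([[2, 1],
            [0, 3]])

# Formula (1.1.2): A_F = T^{-1} A_E T, here equal to [[3, 2], [0, 2]].
A_F = matmul(inverse_2x2(T), matmul(A_E, T))

# Coordinates: eta are the F-coordinates of x = f_1 + f_2, xi = T eta its E-coordinates.
eta = [[Fraction(1)], [Fraction(1)]]
xi = matmul(T, eta)
```

The first column of `A_F` is $(3, 0)$, which matches the direct computation $Af_1 = (3, 3) = 3f_1$. Applying $A$ in either coordinate system gives the same vector: `matmul(A_E, xi)` equals `matmul(T, matmul(A_F, eta))`.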
Example 1.1.13.
(i) Let $X = Y \oplus Z$. Define

$$Px = y \quad \text{where } x = y + z, \quad y \in Y, \quad z \in Z.$$

Then $P$ is the so-called projection of $X$ onto $Y$ and has the following properties:
    (a) $P^2 := PP = P$,
    (b) $\operatorname{Ker} P = Z$.
It is easy to see that the properties (a), (b) determine uniquely the projection $P$ and hence also the decomposition $X = Y \oplus Z$ ($Y = \operatorname{Im} P$).
(ii) Let $Y$ be a subspace of $X$. For $x \in X$ put

$$[x] := x + Y = \{x + y : y \in Y\}.$$
If $x, y \in X$, then either $[x] = [y]$ ($\Leftrightarrow x - y \in Y$) or $[x] \cap [y] = \emptyset$. Define

$$[x] + [y] := [x + y], \quad \alpha[x] := [\alpha x] \quad \text{for } x, y \in X,\ \alpha \in \mathbb{K}.$$

These operations are well defined and endow the set $X|_Y := \{[x] : x \in X\}$ with the structure of a linear space. The space $X|_Y$ is called a factor space or simply a $Y$-factor. Put

$$\kappa : x \mapsto [x], \quad x \in X.$$

Then $\kappa$ (the so-called canonical embedding of $X$ onto $X|_Y$) is a linear, surjective operator from $X$ onto $X|_Y$, and $\operatorname{Ker} \kappa = Y$. If $x = y + z$ where $y \in Y$, $z \in Z$ and $X = Y \oplus Z$, then the mapping $j : [x] \mapsto z$ is an isomorphism of $X|_Y$ onto $Z$. In particular, $X|_Y$ and $Z$ have the same dimension. The dimension of $X|_Y$ is sometimes called the codimension of $Y$ ($\operatorname{codim} Y$) and

$$\dim X = \dim Y + \operatorname{codim} Y. \tag{1.1.3}$$

Warning. If $X$ is an infinite dimensional space, then the sum on the right-hand side is the sum of infinite cardinal numbers!

Proposition 1.1.14. Let $A \in L(X, Y)$ and let $\kappa$ be the canonical embedding of $X$ onto $X|_{\operatorname{Ker} A}$. If $\hat{A}[x] := Ax$, then $\hat{A}$ is injective and the diagram in Figure 1.1.1 is commutative, i.e., $A = \hat{A}\kappa$.

[Figure 1.1.1. Commutative diagram with the arrows $\kappa : X \to X|_{\operatorname{Ker} A}$, $\hat{A} : X|_{\operatorname{Ker} A} \to Y$ and $A : X \to Y$.]

Proof. The assertion is obvious but do not forget to prove that $\hat{A}$ is well defined.

Corollary 1.1.15. Let $A \in L(X, Y)$. Then

$$\dim X = \dim \operatorname{Ker} A + \dim \operatorname{Im} A. \tag{1.1.4}$$

[∞] In particular, if $X = Y$ and $\dim X < \infty$, then $A \in L(X, Y)$ is injective if and only if it is surjective.
Proof. We have codim Ker A = dim X|Ker A = dim Im Â = dim Im A since Â is an isomorphism of X|Ker A onto its image. Equality (1.1.4) follows immediately from (1.1.3). If A is injective, then dim X = dim Im A, and this implies (only in the case of X and Y having the same finite dimension) that Y = Im A. If Im A = Y, then (finite dimensions!) dim Ker A = 0, i.e., A is injective.
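The dimension formula (1.1.4) can be checked on a concrete matrix. The following is only a minimal sketch: it assumes a matrix with rational entries stored as a list of rows, and the helper names `rref` and `kernel_basis` are illustrative, not taken from the text. The kernel and the image are computed separately, so that the equality is actually tested rather than assumed.

```python
from fractions import Fraction

def rref(matrix):
    """Reduced row echelon form over the rationals; returns (rows, pivot columns)."""
    rows = [[Fraction(v) for v in row] for row in matrix]
    pivots = []
    r = 0  # index of the next pivot row
    for col in range(len(rows[0])):
        pivot = next((i for i in range(r, len(rows)) if rows[i][col] != 0), None)
        if pivot is None:
            continue  # no pivot in this column: it is a free column
        rows[r], rows[pivot] = rows[pivot], rows[r]
        lead = rows[r][col]
        rows[r] = [v / lead for v in rows[r]]
        for i in range(len(rows)):
            if i != r and rows[i][col] != 0:
                factor = rows[i][col]
                rows[i] = [a - factor * b for a, b in zip(rows[i], rows[r])]
        pivots.append(col)
        r += 1
    return rows, pivots

def kernel_basis(matrix):
    """One basis vector of Ker A for every free column of the reduced form."""
    rows, pivots = rref(matrix)
    n_cols = len(matrix[0])
    basis = []
    for free in (c for c in range(n_cols) if c not in pivots):
        v = [Fraction(0)] * n_cols
        v[free] = Fraction(1)
        for r, p in enumerate(pivots):
            v[p] = -rows[r][free]
        basis.append(v)
    return basis

A = [[1, 2, 3],
     [2, 4, 6],
     [1, 0, 1]]
dim_image = len(rref(A)[1])        # number of pivots = dim Im A
dim_kernel = len(kernel_basis(A))  # number of free columns = dim Ker A
print(dim_kernel, dim_image, len(A[0]))
```

For this particular matrix the kernel is one-dimensional and the image two-dimensional, so the two numbers add up to dim X = 3.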
Example 1.1.16. Let X be the space of bounded (real) sequences l^∞(ℕ) and define the right-shift S_R : x = (x_1, x_2, …) → (0, x_1, x_2, …) and the left-shift S_L : x = (x_1, x_2, …) → (x_2, x_3, …). Then S_R is injective but not surjective and S_L is surjective but not injective. Moreover, S_L S_R x = x for every x ∈ X. What is S_R S_L?
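The two shifts can be experimented with numerically. In the sketch below a sequence is modelled, as an assumption of this illustration, by a Python function n → x_n defined for n ≥ 1; the names `shift_right`, `shift_left` and `first_terms` are hypothetical. Only finitely many terms are inspected, so this is a plausibility check and not a proof.

```python
def shift_right(x):
    """S_R: (x1, x2, ...) -> (0, x1, x2, ...)."""
    return lambda n: 0 if n == 1 else x(n - 1)

def shift_left(x):
    """S_L: (x1, x2, ...) -> (x2, x3, ...)."""
    return lambda n: x(n + 1)

def first_terms(s, k=5):
    """The first k terms of the sequence s as a list."""
    return [s(n) for n in range(1, k + 1)]

x = lambda n: 1 / n  # a bounded sequence, x_n = 1/n

print(first_terms(x))
print(first_terms(shift_left(shift_right(x))))  # agrees with x
print(first_terms(shift_right(shift_left(x))))  # differs from x in the first term
```

The last line suggests what to expect from the composition in the opposite order: the first coordinate is lost.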
The following special case of linear operators plays an important role both in the theory of linear spaces and in applications.

Definition 1.1.17. Let X be a linear space over a field of scalars. A linear operator from X into this field is called a linear form. The linear space of all linear forms on X is called the (algebraic) dual space of X and is denoted by X^#.

Example 1.1.18. (i) Let {e_1, …, e_M} be a basis of X, i.e., for every x there is a unique M-tuple of scalars (ξ_1, …, ξ_M) (coordinates of x) such that x = Σ_{i=1}^{M} ξ_i e_i. The mapping

e^i : x → ξ_i

is a linear form (the ith coordinate form). It is straightforward to show that e^1, …, e^M are linearly independent and

Lin{e^1, …, e^M} = X^#,

i.e., {e^1, …, e^M} is a basis of X^# (the so-called dual basis of X^#, dual to {e_1, …, e_M}).
(ii) If f ∈ X^# \ {o}, then codim Ker f = 1. To see this choose x_0 ∈ X such that f(x_0) = 1. Then

x = (x − f(x)x_0) + f(x)x_0 ∈ Ker f ⊕ Lin{x_0}.

On the other hand, if Y is a subspace of X of codimension 1,⁴ then X = Y ⊕ Lin{x_0} for an x_0 ∈ X. For x = y + αx_0, y ∈ Y, we put f(x) = α. Then f ∈ X^# \ {o} and Ker f = Y.

Moreover, if f, g ∈ X^# are such that Ker f = Ker g, then there is a scalar α for which f = αg. This fact has the following generalization, which will be used in Section 6.3, more precisely in the proof of Theorem 6.3.2.

Proposition 1.1.19. Let f_1, …, f_n, g be linear forms on X. Then

g ∈ Lin{f_1, …, f_n} if and only if ⋂_{i=1}^{n} Ker f_i ⊂ Ker g.
Proof. The “only if” part is obvious. For the “if” part notice that the assertion g ∈ Lin{f_1, …, f_n} can be interpreted as the existence of a linear form λ on the space of n-tuples of scalars such that

g = λ ∘ F where F(x) := (f_1(x), …, f_n(x)). (1.1.5)

Decompose the space of n-tuples of scalars as Im F ⊕ Y (Corollary 1.1.8). If α = β + γ, β = F(x), γ ∈ Y, then the mapping λ(α) := g(x) is a well-defined linear form (by assumption). This means that (1.1.5) holds.

Definition 1.1.20. Let A ∈ L(X, Y) and g ∈ Y^#. Then the linear form f(x) := g(Ax) is denoted by f = A^# g and A^# is called the adjoint operator to A.

Remark 1.1.21. (i) A^# ∈ L(Y^#, X^#).
(ii) If A has a matrix representation A = (a_ij), i = 1, …, N, j = 1, …, M, with respect to bases E = {x_1, …, x_M} in X and F = {y_1, …, y_N} in Y, then the adjoint operator A^# has the representation A^# = (a_ji), j = 1, …, M, i = 1, …, N (i.e., A^# is the transpose of A) with respect to the dual bases.

⁴ Such a subspace is often called a hyperplane.
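The statement of Remark 1.1.21(ii) can be tested on numbers. The sketch below assumes real coordinates with respect to fixed bases and represents a linear form by its coordinates in the dual basis; the helper names `mat_vec`, `transpose` and `pairing` are illustrative. It compares g(Ax) with the value at x of the form whose coordinates are obtained by applying the transposed matrix to g.

```python
def mat_vec(A, x):
    """Matrix (list of rows) times coordinate vector."""
    return [sum(a * xi for a, xi in zip(row, x)) for row in A]

def transpose(A):
    return [list(col) for col in zip(*A)]

def pairing(g, y):
    """Value g(y) of the linear form with dual-basis coordinates g at the vector y."""
    return sum(gi * yi for gi, yi in zip(g, y))

A = [[1, 2, 0],
     [3, -1, 4]]   # N = 2 rows, M = 3 columns: an operator from R^3 into R^2
x = [2, -1, 5]     # coordinates of a vector in X
g = [7, -3]        # coordinates of a linear form on Y

lhs = pairing(g, mat_vec(A, x))              # g(Ax)
rhs = pairing(mat_vec(transpose(A), g), x)   # (A^# g)(x), with A^# taken as the transpose
print(lhs, rhs)
```

Both numbers coincide, in agreement with the definition f(x) := g(Ax) of f = A^# g.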
Warning. We will encounter different adjoint operators in the next section, and the adjoint A* with respect to a scalar product will have a different representation in a complex space!

Now we turn our attention to a system of linear equations

\[ \sum_{j=1}^{M} a_{ij} x_j = b_i, \qquad i = 1, \dots, N. \tag{1.1.6} \]

This system can be written in a more “compact” form, namely as

Ax = b (1.1.7)

where A is a matrix representation of the linear operator A from X into Y. By choosing fixed bases E = {e_1, …, e_M} in X and F = {f_1, …, f_N} in Y (also Y = R^N or C^N), A is defined by its matrix representation A = (a_ij), i = 1, …, N, j = 1, …, M, with respect to these bases. In order to formulate results on solvability of (1.1.6) (or, equivalently, of (1.1.7)) the following notation will be useful.

Notation. If U is a subset of X (not necessarily a subspace of X), then

U^⊥ := {f ∈ X^# : x ∈ U ⇒ f(x) = 0}.

Similarly,

W_⊥ := {x ∈ X : f ∈ W ⇒ f(x) = 0} for W ⊂ X^#.
Proposition 1.1.22.
(i) (U^⊥)_⊥ = Lin U for every U ⊂ X.
(ii) If dim X < ∞, then (W_⊥)^⊥ = Lin W for all W ⊂ X^#.

Proof. We include the proof because it contains a construction which should be compared with an analogous one in Section 2.1 (see Proposition 2.1.27 and its proof).
(i) We can assume U to be a subspace of X since U^⊥ = (Lin U)^⊥. The inclusion U ⊂ (U^⊥)_⊥ is obvious. To prove the reverse inclusion let us suppose by contradiction that there is an element x_0 ∈ (U^⊥)_⊥ \ U. By the method of proof of Theorem 1.1.3, a subspace Y of X can be found such that

X = Lin{x_0} ⊕ Y and U ⊂ Y.

According to Example 1.1.18(ii) there exists f ∈ X^# with Ker f = Y. In particular, f ∈ U^⊥ and f(x_0) ≠ 0, which contradicts the choice of x_0.
(ii) This part follows from (i) by replacing X by X^#. To repeat the proof we need that (X^#)^# can be identified with X. We note that this is possible because dim X < ∞.
The main idea, separation of a point from a subspace by a linear form (i.e., by a hyperplane), can be substantially generalized.

Definition 1.1.23. A subset C of a (real or complex) linear space X is called convex if for every x, y ∈ C, t ∈ [0, 1], the point tx + (1 − t)y belongs to C.

Proposition 1.1.24. Let X be a real linear space, ∅ ≠ C a convex subset of X with a nonempty algebraic interior

C^0 := {a ∈ C : ∀y ∈ X ∃t_0 > 0 such that a + ty ∈ C for all t ∈ [0, t_0)}.

Let x_0 ∈ X \ C. Then there is f ∈ X^# such that

f(x) ≤ f(x_0) for all x ∈ C.

Proof. It needs a special tool for the treatment of convex sets and a considerably more sophisticated extension procedure,⁵ and, therefore, it is omitted. See, e.g., Rockafellar [109, § 11] where the interested reader can find also applications to convex optimization, and also Corollary 2.1.18.

Theorem 1.1.25. For A ∈ L(X, Y) we have
(i) Im A = (Ker A^#)_⊥,
(ii) Im A^# = (Ker A)^⊥.
(iii) If, moreover, dim X = dim Y < ∞, then

dim Ker A = dim Ker A^#. (1.1.8)
Proof. (i) It is straightforward to prove both the inclusions which lead to the equality (Im A)^⊥ = Ker A^#. The result follows then from Proposition 1.1.22(i).
(ii) Let Y = Im A ⊕ Z (Corollary 1.1.8). For f ∈ (Ker A)^⊥ and y = Ax + z, z ∈ Z, put g(y) = f(x). This definition does not depend on a concrete choice of x since f ∈ (Ker A)^⊥. This proves that f = A^# g and hence the inclusion (Ker A)^⊥ ⊂ Im A^# holds. The reverse inclusion is trivial.
(iii) Observe first that (X|U)^# is isomorphic to U^⊥ for any subspace U of X, namely,

Φ(F)(x) := F([x]), F ∈ (X|U)^#,

is the desired isomorphism. If dim X < ∞, then X|U is isomorphic to (X|U)^# (both spaces have the same dimension) and, therefore, X|U is isomorphic to U^⊥. Now, we apply this observation to U = Ker A. We recall that Im A is isomorphic to X|Ker A (Proposition 1.1.14) and therefore to (Ker A)^⊥. By (ii), Im A is isomorphic to Im A^#. The equality (1.1.8) follows from Corollary 1.1.15.

⁵ See Corollary 2.1.18 for a similar process.
Remark 1.1.26. (i) Note that Theorem 1.1.25(i) is an existence result for the equation (1.1.6) (or (1.1.7)) because it can be reformulated as follows: The equation (1.1.6) has a solution for b = (b_1, …, b_N) if and only if

\[ \sum_{i=1}^{N} b_i f_i = 0 \]

for all solutions f = (f_1, …, f_N) of the adjoint homogeneous equation

\[ \sum_{i=1}^{N} a_{ji} f_i = 0, \qquad j = 1, \dots, M. \]

In particular, we have also the alternative result: Either the equation (1.1.6) has a solution for all right-hand sides or⁶ the adjoint homogeneous equation has a nontrivial solution. Theorem 1.1.25(ii) can be reformulated similarly.
(ii) If A is a matrix representation of A ∈ L(X, Y) (X and Y are finite dimensional spaces), then dim Im A is equal to the number of linearly independent columns of A and is called the rank of A. If X = Y, then A is a square matrix of the type M × M (M = dim X), and it is called a regular matrix provided M = rank A. Equivalently, A is a regular matrix if and only if its determinant det A does not vanish. By the proof of Theorem 1.1.25(iii), dim Im A = dim Im A^#. In particular, this means that the rank of A is equal to the rank of its transpose. The reader is asked to find more matrix formulations of the previous results.

We often do calculations with a matrix representation instead of the operator itself. Since there are plenty of representations of the same operator it would be convenient to work with the simplest possible form. To examine this problem we start with some notions.

Definition 1.1.27. Let X be a complex linear space and A ∈ L(X). A complex number λ is called an eigenvalue of A if there is x ≠ o such that

Ax = λx.

Such an element x is called an eigenvector of A (associated with the eigenvalue λ). The set of all eigenvalues of A is called the spectrum of A and is denoted by σ(A).

⁶ The conjunction “or” has exclusive character here. This alternative result is sometimes called a Fredholm alternative since I. Fredholm proved such a result for linear integral equations. See also Section 2.2.
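The solvability criterion of Remark 1.1.26(i) can be illustrated on a small singular system. The sketch below is written under stated assumptions: the 2 × 2 matrix and the vector f spanning the solutions of the transposed homogeneous system are chosen by hand for this example, and `rank`, `solvable` and `orthogonal_to_adjoint_kernel` are hypothetical helper names. Solvability of Ax = b is decided independently, by comparing the rank of A with the rank of the augmented matrix.

```python
from fractions import Fraction

def rank(matrix):
    """Rank via Gaussian elimination over the rationals."""
    rows = [[Fraction(v) for v in row] for row in matrix]
    r = 0
    for col in range(len(rows[0])):
        pivot = next((i for i in range(r, len(rows)) if rows[i][col] != 0), None)
        if pivot is None:
            continue
        rows[r], rows[pivot] = rows[pivot], rows[r]
        for i in range(r + 1, len(rows)):
            factor = rows[i][col] / rows[r][col]
            rows[i] = [a - factor * b for a, b in zip(rows[i], rows[r])]
        r += 1
    return r

A = [[1, 2],
     [2, 4]]
f = [2, -1]  # spans the solutions of the transposed homogeneous system (chosen by hand)

def solvable(b):
    """Ax = b has a solution iff appending b as a column does not raise the rank."""
    augmented = [row + [bi] for row, bi in zip(A, b)]
    return rank(augmented) == rank(A)

def orthogonal_to_adjoint_kernel(b):
    """The condition sum_i b_i f_i = 0 of the remark."""
    return sum(fi * bi for fi, bi in zip(f, b)) == 0

for b in ([1, 2], [3, 6], [1, 0], [0, 1]):
    print(b, solvable(b), orthogonal_to_adjoint_kernel(b))
```

For each right-hand side tried here the two tests give the same answer, as the remark predicts.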
Warning. In infinite dimensions the spectrum of a linear operator can contain also other points than the eigenvalues and is defined in a different way (see page 56)!

Remark 1.1.28. It is obvious that the following statements are equivalent in a finite dimensional complex space X:

λ ∈ σ(A) ⇐⇒ Ker (λI − A) ≠ {o} ⇐⇒ det (λI − A) = 0 ⇐⇒ rank (λI − A) < dim X

where A is a representation of A. Since P(z) := det(zI − A) is a polynomial (the so-called characteristic polynomial) of degree M = dim X, the problem of finding σ(A) is equivalent to solving an algebraic equation (the so-called characteristic equation of A)

P(z) = 0. (1.1.9)

According to the Fundamental Theorem of Algebra (see Theorem 4.3.111) there exists at least one solution of (1.1.9) in C. The reason for considering complex spaces here is the fact that (1.1.9) need not have a real solution. It is an easy consequence of the Fundamental Theorem of Algebra that the polynomial P can be written in the form

\[ P(\lambda) = (\lambda - \lambda_1)^{m_1} \cdots (\lambda - \lambda_k)^{m_k} \tag{1.1.10} \]

where σ(A) = {λ_1, …, λ_k} (λ_1, …, λ_k are different) and m_1 + ⋯ + m_k = dim X. The positive integer m_i is called the multiplicity of the eigenvalue λ_i.

Definition 1.1.29. Let A ∈ L(X).
(1) A subspace Y ⊂ X is said to be A-invariant if A(Y) ⊂ Y.
(2) An A-invariant subspace Y ⊂ X is said to reduce A if there is a decomposition X = Y ⊕ Z where Z is also A-invariant.

From now on till the end of this section we consider exclusively finite dimensional spaces.

Example 1.1.30. (i) Let X = Y ⊕ Z where both Y and Z are A-invariant. If {e_1, …, e_m} is a basis of Y and {e_{m+1}, …, e_M} is a basis of Z, then the matrix representation A of A with respect to {e_1, …, e_M} has a block form

\[ A = \begin{pmatrix} A_Y & O \\ O & A_Z \end{pmatrix} \]

where A_Y and A_Z are representations of restrictions of A to Y and Z, respectively.
(ii) Assume that there is a basis {e_1, …, e_M} of X consisting of eigenvectors of A ∈ L(X) and Ae_i = λ_i e_i, i = 1, …, M (λ_1, …, λ_M are not necessarily distinct). Then the matrix representation of A with respect to this basis is the diagonal matrix

\[ \begin{pmatrix} \lambda_1 & 0 & \cdots & 0 \\ 0 & \lambda_2 & \cdots & 0 \\ \vdots & \vdots & \ddots & \vdots \\ 0 & 0 & \cdots & \lambda_M \end{pmatrix}. \]

(iii) The matrix

\[ A = \begin{pmatrix} 1 & 1 \\ 0 & 1 \end{pmatrix} \]

is a representation of a linear operator A ∈ L(C²) which has no one-dimensional reducing subspace. Hence A has no diagonal representation.
Because of the last example we have to improve our previous idea: Choose λ ∈ σ(A) and denote

N_k := Ker (λI − A)^k.

It is obvious that N_k ⊂ N_{k+1} and they cannot be all distinct. If N_k = N_{k+1}, then N_i = N_k for all i > k. Denote by n(λ) the least such k and set

\[ N(\lambda) := \bigcup_{j=1}^{n(\lambda)} N_j = N_{n(\lambda)}, \qquad R(\lambda) := \operatorname{Im}\,(\lambda I - A)^{n(\lambda)}. \]
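The chain N_1 ⊂ N_2 ⊂ ⋯ can be followed on a small matrix. The sketch below is an illustration only: it assumes a matrix with rational entries and obtains dim N_k as M − rank (λI − A)^k, which uses Corollary 1.1.15; the names `rank`, `mat_mul` and `kernel_dimensions` are hypothetical. The returned list stops at the first index at which the dimension no longer grows, so its length is n(λ) whenever λ is an eigenvalue.

```python
from fractions import Fraction

def rank(matrix):
    """Rank via Gaussian elimination over the rationals."""
    rows = [[Fraction(v) for v in row] for row in matrix]
    r = 0
    for col in range(len(rows[0])):
        pivot = next((i for i in range(r, len(rows)) if rows[i][col] != 0), None)
        if pivot is None:
            continue
        rows[r], rows[pivot] = rows[pivot], rows[r]
        for i in range(r + 1, len(rows)):
            factor = rows[i][col] / rows[r][col]
            rows[i] = [a - factor * b for a, b in zip(rows[i], rows[r])]
        r += 1
    return r

def mat_mul(P, Q):
    """Product of two matrices stored as lists of rows."""
    return [[sum(p * q for p, q in zip(row, col)) for col in zip(*Q)] for row in P]

def kernel_dimensions(A, lam):
    """[dim N_1, dim N_2, ...] up to the first k with dim N_k = dim N_{k+1}."""
    n = len(A)
    shifted = [[(lam if i == j else 0) - A[i][j] for j in range(n)] for i in range(n)]
    power = shifted
    dims = [n - rank(power)]
    while True:
        power = mat_mul(power, shifted)
        d = n - rank(power)
        if d == dims[-1]:
            return dims
        dims.append(d)

A = [[2, 1, 0],
     [0, 2, 0],
     [0, 0, 3]]
print(kernel_dimensions(A, 2))  # dimensions of N_1, N_2 for the eigenvalue 2
print(kernel_dimensions(A, 3))  # dimension of N_1 for the eigenvalue 3
```

For the eigenvalue 2 of this matrix the chain stabilizes at the second step, for the eigenvalue 3 already at the first one.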
Lemma 1.1.31. Let A ∈ L(X) and λ ∈ σ(A).
(i) Both N(λ) and R(λ) are A-invariant subspaces and the decomposition

X = N(λ) ⊕ R(λ) (1.1.11)

holds.
(ii) Denote by A|_N and A|_R the restrictions of A to N(λ) and R(λ), respectively. Then σ(A|_N) = {λ}, σ(A|_R) = σ(A) \ {λ}. Moreover, the dimension of N(λ) is equal to the multiplicity of the eigenvalue λ.
(iii) If σ(A) = {λ_1, …, λ_k}, then

X = N(λ_1) ⊕ ⋯ ⊕ N(λ_k). (1.1.12)
Proof. (i) Since R(λ) ∩ N(λ) = {o} (by the definition of n(λ)) and dim X = dim N(λ) + dim R(λ) (Corollary 1.1.15), we deduce the decomposition (1.1.11). If y = (λI − A)^{n(λ)} x ∈ R(λ),
then

Ay = −(λI − A)y + λy = −(λI − A)^{n(λ)} (λI − A)x + λy ∈ R(λ).

The A-invariance of N(λ) is also clear.
(ii) Obviously, σ(A|_N) ⊂ σ(A). Let µ ∈ σ(A) \ {λ} and let x be a corresponding eigenvector. By (1.1.11) we have x = y + z where y ∈ N(λ), z ∈ R(λ). Further,

o = (µI − A)x = (λI − A)y + (µ − λ)y + (µI − A)z.

By virtue of the uniqueness of the decomposition we have (λI − A)y = (λ − µ)y. This implies that

o = (λI − A)^{n(λ)} y = (λ − µ)(λI − A)^{n(λ)−1} y, i.e., y ∈ Ker (λI − A)^{n(λ)−1}.

By repeating this procedure we get y ∈ Ker (λI − A) and, therefore, (λ − µ)y = o, i.e.,

y = o and x = z ∈ R(λ).

This shows that µ ∉ σ(A|_N) and µ ∈ σ(A|_R). Since N(λ) ∩ R(λ) = {o} the eigenvalue λ does not belong to σ(A|_R). The matrix representation A of A with respect to the basis formed by joining the bases of N(λ) and R(λ) has the block form

\[ A = \begin{pmatrix} A_N & O \\ O & A_R \end{pmatrix}. \]

It follows that det(zI − A) = det(zI − A_N) det(zI − A_R) and hence the characteristic polynomial of A_N is P_N(z) = (z − λ)^{m(λ)} where m(λ) is the multiplicity of the eigenvalue λ of A. Therefore dim N(λ) = m(λ).
(iii) This follows by induction with respect to the eigenvalues of A.

For a polynomial P(z) = a_n z^n + ⋯ + a_1 z + a_0 and A ∈ L(X) we put P(A) = a_n A^n + ⋯ + a_1 A + a_0 I.

Corollary 1.1.32 (Hamilton–Cayley). Let A ∈ L(X) and let P be the characteristic polynomial of A. Then P(A) = O.
Proof. Assume that P has the form (1.1.10) and x = x_1 + ⋯ + x_k is the decomposition given by (1.1.12). Since m_k ≥ n(λ_k),

\[ (A - \lambda_k I)^{m_k} x = \sum_{j=1}^{k-1} (A - \lambda_k I)^{m_k} x_j + o. \]

The result follows by induction.
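Corollary 1.1.32 is easy to verify by hand in dimension two, where the characteristic polynomial is P(z) = z² − (tr A) z + det A. The sketch below does this for one integer matrix chosen for illustration; the helper `mat_mul` and the variable names are assumptions of this example, not notation of the text.

```python
def mat_mul(P, Q):
    """Product of two matrices stored as lists of rows."""
    return [[sum(p * q for p, q in zip(row, col)) for col in zip(*Q)] for row in P]

A = [[2, 1],
     [-3, 5]]
trace = A[0][0] + A[1][1]
det = A[0][0] * A[1][1] - A[0][1] * A[1][0]
identity = [[1, 0], [0, 1]]

A_squared = mat_mul(A, A)
# P(A) = A^2 - (tr A) A + (det A) I, evaluated entry by entry
P_of_A = [[A_squared[i][j] - trace * A[i][j] + det * identity[i][j]
           for j in range(2)] for i in range(2)]
print(P_of_A)  # the zero matrix
```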
It remains to compute the representation of the restriction of λ_i I − A to N(λ_i). Notice that this restriction is nilpotent.⁷

Lemma 1.1.33. Let B ∈ L(X) be a nilpotent operator of order n. Then for any x ∈ X \ Ker B^{n−1} the elements x, Bx, …, B^{n−1}x are linearly independent and the subspace

Y = Lin{x, Bx, …, B^{n−1}x}

reduces B. The restriction B|_Y of B to Y has the representation

\[ \begin{pmatrix} 0 & 1 & \cdots & 0 \\ 0 & 0 & \ddots & \vdots \\ \vdots & \vdots & \ddots & 1 \\ 0 & 0 & \cdots & 0 \end{pmatrix} \]

with respect to the basis {B^{n−1}x, …, x}. There exists a B-invariant direct complement of Y and the restriction of B to such a complement is nilpotent of order ≤ n.

Proof. It is easy to see the linear independence of the elements x, …, B^{n−1}x. Indeed, if

\[ \sum_{j=0}^{n-1} \alpha_j B^j x = o, \]

then, by applying B^{n−1}, we get

α_0 B^{n−1}x = o, i.e., α_0 = 0.

Repetition shows that α_j = 0 for all j = 0, 1, …, n − 1. The form of representation of B|_Y is obvious. The existence of an invariant direct complement of Y can be proved by induction with respect to the order of nilpotency. We omit details and refer to, e.g., Halmos [64, § 57].

We are now ready to summarize all information to obtain the following fundamental result.
Theorem 1.1.34 (Jordan Canonical Form). Let X be a complex linear space of finite dimension and let A ∈ L(X). Assume that σ(A) = {λ_1, …, λ_k}. Then there exists a basis F of X in which A has the canonical block representation

\[ A^F = \begin{pmatrix} A_1^{(1)} & & & & O \\ & \ddots & & & \\ & & A_1^{(l_1)} & & \\ & & & \ddots & \\ O & & & & A_k^{(l_k)} \end{pmatrix} \]

where the block matrices (the so-called Jordan cells) have the form

\[ A_j^{(i)} = \underbrace{\begin{pmatrix} \lambda_j & 1 & 0 & \cdots & 0 \\ 0 & \lambda_j & 1 & \cdots & 0 \\ \vdots & & \ddots & \ddots & \vdots \\ 0 & 0 & \cdots & \lambda_j & 1 \\ 0 & 0 & \cdots & 0 & \lambda_j \end{pmatrix}}_{l_j \text{ columns}}, \qquad i = 1, \dots, l_j, \quad j = 1, \dots, k. \tag{1.1.13} \]
Remark 1.1.35. (i) We can also interpret Theorem 1.1.34 as follows. Let A^E be the representation of A with respect to the basis E. By Remark 1.1.12(iii), there is a regular transformation matrix T such that (1.1.2) holds. The canonical matrix A^F may be viewed as a representation of a B ∈ L(X) with respect to the basis E. Denote by T a linear operator represented in the basis E by the matrix T. Then one has

B = T^{−1}AT. (1.1.14)

(ii) Assume that A ∈ L(X) where X is a real linear space. The problem in the application of Theorem 1.1.34 lies in the fact that the spectrum σ(A) ∩ R is not sufficient to guarantee the decomposition (1.1.12). This obstacle can be overcome by the complexification X_C of X. Namely, A is extendable to X_C by the formula A_C(x + iy) := Ax + iAy. If λ = α + iβ, β ≠ 0, is an eigenvalue of A_C with an eigenvector u + iv, then u and v are linearly independent in X and the complex conjugate λ̄ is also an eigenvalue of A_C and u − iv is the corresponding eigenvector. Moreover, both λ and λ̄ have the same multiplicity. Rearranging the A_C-canonical basis by joining its parts which correspond to λ and λ̄ we obtain a basis of the real space X in which the representation of A has blocks of the form (blank entries are zero)

\[ \begin{pmatrix} \alpha & \beta & 1 & 0 & & & \\ -\beta & \alpha & 0 & 1 & & & \\ & & \alpha & \beta & \ddots & & \\ & & -\beta & \alpha & & 1 & 0 \\ & & & & \ddots & 0 & 1 \\ & & & & & \alpha & \beta \\ & & & & & -\beta & \alpha \end{pmatrix}. \]

We omit simple computations which confirm these statements and leave them to the reader.

⁷ An operator B ∈ L(X) is said to be nilpotent if there is an n ∈ N such that B^n = O. The least such integer n is called the order of nilpotency.
The simple canonical form is convenient for solving a system of linear differential equations with real constant coefficients. Such a system can be written in the form

\[ \dot{x} := \frac{dx}{dt} = Ax, \qquad A \in L(X). \tag{1.1.15} \]

If X = R^M and A = (a_ij) is the representation of A with respect to the standard basis e_1, …, e_M, then (1.1.15) is an abstract formulation of the system

\[ \dot{x}_i(t) = \sum_{j=1}^{M} a_{ij} x_j(t), \quad i = 1, \dots, M, \qquad \text{where } x(t) = \sum_{i=1}^{M} x_i(t) e_i. \tag{1.1.16} \]

In order to find a solution, it is convenient to transform (1.1.16) into a canonical form. If T ∈ L(X) is invertible, then x = Ty is a solution of (1.1.15) if and only if y solves the equation

ẏ = By, where By := T^{−1}ATy.

Theorem 1.1.34 says that T can be chosen in such a way that the representation of B with respect to the standard basis is the Jordan Canonical Form of A. Having this form it is easy to solve (1.1.16) (see Exercise 1.1.41).

Qualitative properties of solutions of (1.1.15) are often more interesting than an involved formula for solutions. Therefore it would be convenient to generalize the exponential function solving ẋ = ax in R to L(X). Similarly to the one-dimensional case we put

\[ e^{tA} x := \sum_{n=0}^{\infty} \frac{t^n}{n!} A^n x \]

provided the series is convergent in L(X). We postpone the question of convergence of this series (see Exercise 2.1.34) and give instead an equivalent definition of a function f(A) for A ∈ L(X) without any use of infinite series.
First we will define f(B) for B ∈ L(C^M) which has a representation in the form

\[ B = \begin{pmatrix} \lambda & 1 & & \\ & \ddots & \ddots & \\ & & \ddots & 1 \\ & & & \lambda \end{pmatrix}. \tag{1.1.17} \]

Assume that f is a polynomial P : z → a_0 z^n + ⋯ + a_n. Obviously, we define P(B) := a_0 B^n + ⋯ + a_n I. It will be convenient to rewrite P(B) in a form which is more adequate for generalization. Since

\[ P(z) = \sum_{j=0}^{n} \frac{P^{(j)}(\lambda)}{j!} (z - \lambda)^j, \]

we can write

\[ P(z) = \sum_{j=0}^{M-1} \frac{P^{(j)}(\lambda)}{j!} (z - \lambda)^j + (z - \lambda)^M R(z) \]

where R is a polynomial, possibly equal to 0. Since z → (z − λ)^M is the characteristic polynomial of B, we have (B − λI)^M = O (by Corollary 1.1.32). This means that

\[ P(B) = \sum_{j=0}^{M-1} \frac{P^{(j)}(\lambda)}{j!} (B - \lambda I)^j. \tag{1.1.18} \]

This shows that we may define

\[ f(B) := \sum_{j=0}^{M-1} \frac{f^{(j)}(\lambda)}{j!} (B - \lambda I)^j \tag{1.1.19} \]

for a function f holomorphic on a neighborhood (depending on f) of σ(B) = {λ}.⁸ We denote by H(σ(B)) the collection of such functions. It is easy to check that the formula

(fg)(B) = f(B)g(B) = g(B)f(B)

holds for f, g ∈ H(σ(B)). In particular, for w ∈ C \ {λ} and r_w(z) = (w − z)^{−1} we get

\[ r_w(B) = (wI - B)^{-1} = \sum_{j=0}^{M-1} \frac{(B - \lambda I)^j}{(w - \lambda)^{j+1}}. \tag{1.1.20} \]

⁸ A weaker assumption on f would be also sufficient but we do not try to obtain an unduly general definition. See also Lemma 1.1.37 below.
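Formula (1.1.19) can be compared numerically with the exponential series. The sketch below takes f(z) = e^{tz}, so that f^{(j)}(λ) = t^j e^{tλ}, and a real Jordan block of the form (1.1.17). The function names, the chosen values of λ and t, and the number of series terms are assumptions of this illustration, and the truncated series is only an approximation of the infinite one.

```python
import math

def mat_mul(P, Q):
    """Product of two square matrices stored as lists of rows."""
    return [[sum(p * q for p, q in zip(row, col)) for col in zip(*Q)] for row in P]

def identity(n):
    return [[1.0 if i == j else 0.0 for j in range(n)] for i in range(n)]

def exp_jordan_by_formula(lam, size, t):
    """e^{tB} from (1.1.19) with f(z) = e^{tz}, i.e. f^(j)(lam) = t^j e^{t lam}."""
    # nilpotent part B - lam*I: ones on the superdiagonal, zeros elsewhere
    N = [[1.0 if j == i + 1 else 0.0 for j in range(size)] for i in range(size)]
    result = [[0.0] * size for _ in range(size)]
    power = identity(size)  # (B - lam*I)^0
    for j in range(size):
        coeff = t ** j * math.exp(t * lam) / math.factorial(j)
        result = [[r + coeff * p for r, p in zip(rr, pr)]
                  for rr, pr in zip(result, power)]
        power = mat_mul(power, N)
    return result

def exp_jordan_by_series(lam, size, t, terms=60):
    """Partial sum of sum_n (tB)^n / n! for the same Jordan block B."""
    tB = [[t * (lam if i == j else 1.0 if j == i + 1 else 0.0)
           for j in range(size)] for i in range(size)]
    term = identity(size)
    total = identity(size)
    for n in range(1, terms + 1):
        term = [[v / n for v in row] for row in mat_mul(term, tB)]
        total = [[a + b for a, b in zip(ra, rb)] for ra, rb in zip(total, term)]
    return total

E_formula = exp_jordan_by_formula(-0.5, 3, 2.0)
E_series = exp_jordan_by_series(-0.5, 3, 2.0)
max_diff = max(abs(a - b) for ra, rb in zip(E_formula, E_series)
               for a, b in zip(ra, rb))
print(max_diff)  # small: both computations agree up to rounding
```

The finite sum needs only M terms, which is the practical advantage of (1.1.19) over the series.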
Remark 1.1.36. The following assertion yields another equivalent definition of f(B) which can be used also in a general Banach space for a linear continuous operator B : X → X (see Section 1.2 for the notions of the Banach space and the continuous linear operator). Also Theorem 1.1.38 holds in this more general setting (Dunford Functional Calculus, see Proposition 3.1.14 or Dunford & Schwartz [44]).

Lemma 1.1.37. Let γ be a positively oriented Jordan curve, σ(B) ⊂ int γ, and let f be a holomorphic function on a neighborhood of int γ. Then

\[ f(B)x = \frac{1}{2\pi i} \int_{\gamma} f(w)(wI - B)^{-1} x \, dw, \qquad x \in X.^{9} \]

Proof. By (1.1.20) we have

\[ \frac{1}{2\pi i} \int_{\gamma} f(w)(wI - B)^{-1} x \, dw = \sum_{j=0}^{M-1} \left[ \frac{1}{2\pi i} \int_{\gamma} \frac{f(w)}{(w - \lambda)^{j+1}} \, dw \right] (B - \lambda I)^j x. \]

The result follows now from the Cauchy Integral Formula.¹⁰

Let A ∈ L(C^M) have the canonical form (1.1.17), i.e., A = TBT^{−1}. Then we define f(A) by (1.1.19) replacing B by A. Notice that f(A) = Tf(B)T^{−1}. We can proceed in the same way for a general A ∈ L(X) using the decomposition (1.1.12). This leads to the following theorem.

Theorem 1.1.38 (Functional Calculus). Let X be a complex linear space and let A ∈ L(X). Then there exists a unique linear operator Φ : H(σ(A)) → L(X) with the following properties:
(i) Φ(fg) = Φ(f)Φ(g) = Φ(g)Φ(f) for f, g ∈ H(σ(A));
(ii) if P(z) = Σ_{j=0}^{n} a_j z^j, then Φ(P) = Σ_{j=0}^{n} a_j A^j;
(iii) if f(z) = 1/(w − z) for w ∉ σ(A), then Φ(f) = (wI − A)^{−1}.

⁹ Since the integrand is a function w ∈ γ → C^{M×M} (in a matrix representation), the integral is an M × M-tuple of standard curve integrals.
¹⁰ We recall the following result from the theory of functions of a complex variable: If f and γ are as in Lemma 1.1.37, then

\[ f^{(j)}(z) = \frac{j!}{2\pi i} \int_{\gamma} \frac{f(w)}{(w - z)^{j+1}} \, dw \]

holds for z ∈ int γ and j ∈ N ∪ {0}.
Remark 1.1.39. (i) A mapping Φ(f) can be computed either by Lemma 1.1.37, which is valid also for a general A, or by the formula

\[ f(A)x = \sum_{l=1}^{k} \sum_{j=0}^{m(\lambda_l)-1} \frac{f^{(j)}(\lambda_l)}{j!} (A - \lambda_l I)^j \pi_l x \]

where σ(A) = {λ_1, …, λ_k} and π_l is the projection onto N(λ_l) defined by the decomposition (1.1.12). We note that these projections are also functions of A, namely π_l = χ_l(A) where

\[ \chi_l(z) = \begin{cases} 1, & z \in B(\lambda_l; \delta), \\ 0, & z \notin B(\lambda_l; \delta) \end{cases} \]

and δ > 0 is small enough so that σ(A) ∩ B(λ_l; δ) = {λ_l}.
(ii) If X is a real linear space of finite dimension and A ∈ L(X), then we can construct a functional calculus for X_C and A_C (see Remark 1.1.35(ii)).
(iii) We deduced a functional calculus from Theorem 1.1.34. The opposite way is also possible, namely to use functional calculus for finding the canonical form. An important role is played by projections π_l giving the decomposition (1.1.12). The interested reader can find more details, e.g., in Dunford & Schwartz [44, Section VII, 1].
Exercise 1.1.40. Show that

\[ \operatorname{sgn} \det A = (-1)^p \qquad \text{where } p = \sum_{\substack{\lambda \in \sigma(A) \\ \lambda < 0}} m(\lambda) \]

for a matrix representation of A with real entries. (The sum over the empty set is defined to be zero.)
Hint. Notice that det A = λ_1^{m_1} ⋯ λ_k^{m_k}.

Exercise 1.1.41. Show that the formula (1.1.19) yields a matrix representation of e^{tB} in the form

\[ \begin{pmatrix} e^{t\lambda} & t e^{t\lambda} & \cdots & \frac{t^{M-1}}{(M-1)!} e^{t\lambda} \\ 0 & e^{t\lambda} & \cdots & \frac{t^{M-2}}{(M-2)!} e^{t\lambda} \\ \vdots & \vdots & \ddots & \vdots \\ 0 & 0 & \cdots & e^{t\lambda} \end{pmatrix} \]

whenever B has the representation (1.1.17) with respect to the same basis.

Exercise 1.1.42. Use the formula in Remark 1.1.39(i) to estimate ‖e^{tA}x‖_{C^M} for large positive t and large negative t in dependence on σ(A).
Hint. Suppose that α < Re λ < β for all λ ∈ σ(A). Show that there are constants c_1, c_2 such that

\[ \| e^{tA} x \| \le \begin{cases} c_1 e^{\beta t} \| x \| & \text{for } t \ge 0, \\ c_2 e^{\alpha t} \| x \| & \text{for } t \le 0 \end{cases} \qquad \text{for all } x \in C^M. \]

In particular, if β < 0, then all solutions of (1.1.15) tend to zero as t → +∞ (asymptotic stability).

Exercise 1.1.43. Let A ∈ L(C^M) have a regular matrix representation. Show that
(i) all matrix representations of A are regular;
(ii) there exists B ∈ L(C^M) such that e^B = A.
Is this B unique? How is σ(B) related to σ(A)?
1.2 Normed Linear Spaces

In the preface we mentioned that our main attention is focused on the properties of nonlinear mappings defined on various spaces of functions. Besides the linear structure studied in the previous section such spaces also have a topological structure. We will assume that the two structures are joined together (like multiplication and addition are joined together by the distributive law in the notion of the field). A natural requirement of continuity of linear operations leads to the notion of a linear topological space. These spaces are often too general for purposes of nonlinear analysis. For example, basic notions and results of differential calculus in such spaces are not straightforward generalizations of the corresponding notions for functions of several variables and they frequently need profound ideas. Because of that we restrict our interest to cases when a topological structure is given by a metric, especially by a norm.

Before starting with this concept we briefly introduce the main topological notions. For more information, the interested reader can consult books like Dugundji [43], Kelley [75].

A set X with a collection T of its subsets is called a topological space if T possesses the following properties:
(1) ∅, X ∈ T;
(2) an intersection of a finite number of sets of T belongs to T;
(3) a union of any subcollection of T belongs to T.
Elements of T are called open sets. A subset U ⊂ X is called a neighborhood of a point x ∈ X if there is an open set G ⊂ X such that x ∈ G ⊂ U.

An important special case of a topological space is the so-called metric space. This is a set X with a real function (metric) ϱ : X × X → [0, ∞) for which
(1) ϱ(x, y) = 0 ⇐⇒ x = y,
(2) x, y ∈ X =⇒ ϱ(x, y) = ϱ(y, x) (the so-called symmetry of the metric),
(3) x, y, z ∈ X =⇒ ϱ(x, z) ≤ ϱ(x, y) + ϱ(y, z) (the so-called triangle inequality).

If ϱ is a metric on X, then

B(x; r) := {y ∈ X : ϱ(x, y) < r}

is called the open ball centered at x ∈ X with the radius r > 0. Open sets in a metric space are defined as subsets G ⊂ X which have the following property: for every x ∈ G there is δ > 0 such that B(x; δ) ⊂ G. It is easy to prove that a metric space with this definition of open sets is also a topological space.

For the following notions and results see, e.g., Dieudonné [35]. A subset F of a topological space X is called a closed set if X \ F is open. If A ⊂ X, then the intersection of all closed sets containing A is called the closure of A and is denoted by \overline{A}, i.e.,

\[ \overline{A} = \bigcap_{\substack{A \subset F \\ F \text{ is closed}}} F. \]

A dual notion is the interior (int A) of A:

\[ \operatorname{int} A = \bigcup_{\substack{G \subset A \\ G \text{ is open}}} G. \]

The boundary ∂A of A is defined by

\[ \partial A := \overline{A} \cap \overline{X \setminus A}. \]

A subset A of X is said to be dense if \overline{A} = X. It is almost obvious that in a metric space X we have the following equivalences:
(i) x ∈ \overline{A} ⇐⇒ ∃{x_n}_{n=1}^{∞} ⊂ A : lim_{n→∞} ϱ(x_n, x) = 0,¹¹
(ii) x ∈ int A ⇐⇒ ∃δ > 0 : B(x; δ) ⊂ A.

A metric space X is said to be separable if there is a countable dense subset of X.

¹¹ We also say that the sequence {x_n}_{n=1}^{∞} is convergent to x and write lim_{n→∞} x_n = x or, more simply, x_n → x. The notion of a convergent sequence can be introduced also in topological spaces: {x_n}_{n=1}^{∞} is convergent to x if for every neighborhood U of x there is an index n_0 ∈ N such that x_n ∈ U for each n ≥ n_0.
Warning. In a topological space there need not be enough convergent sequences in order to describe a closure, etc.! See, e.g., J. von Neumann’s example in Dunford & Schwartz [44, Chapter V, 7, Example 38].
If X, Y are topological spaces and f : X → Y, then f is said to be continuous on X provided f^{−1}(G) is open in X whenever G is an open set in Y. If f is injective and surjective, and f, f^{−1} are both continuous, then f is called a homeomorphism of X onto Y. It is also possible to define continuity at a point a ∈ X with help of the notion of a neighborhood: f is continuous at a if f^{−1}(U) is a neighborhood of a whenever U is a neighborhood of f(a). A mapping f is continuous on X if and only if it is continuous at every point of X. The following equivalence holds in metric spaces X and Y:

f : X → Y is continuous at a ∈ X ⇐⇒ (x_n → a =⇒ f(x_n) → f(a)).

A very important notion is that of compactness: A topological space X is said to be compact if for every open covering {G_γ}_{γ∈Γ} of X (i.e., X = ⋃_{γ∈Γ} G_γ) there is a finite subset K ⊂ Γ such that

\[ X = \bigcup_{\gamma \in K} G_\gamma \]

(a finite subcovering). Any subset A of a topological space X is itself a topological space with the collection of open sets {(G ∩ A) : G open in X}. A subset A of a topological space X is said to be compact in X if A is a compact topological space in this induced topology. Further, A ⊂ X is said to be relatively compact provided \overline{A} is compact. In metric spaces we have the following characterization.

Proposition 1.2.1. Let X be a metric space. Then A ⊂ X is relatively compact if and only if for any sequence {x_n}_{n=1}^{∞} ⊂ A there is a convergent subsequence.¹²

Beside this proposition, the importance of compactness in analysis is obvious from the next result which will be discussed more deeply in Section 6.2.

Proposition 1.2.2. Let X be either a compact topological space or a sequentially compact topological space and let f be a continuous real function on X. Then there exist a maximal and a minimal value of f, i.e., there are x_1, x_2 ∈ X such that

f(x_1) ≤ f(x) ≤ f(x_2) for all x ∈ X.

¹² A topological space X is said to be sequentially compact if for any sequence {x_n}_{n=1}^{∞} ⊂ X there is a subsequence {x_{n_k}}_{k=1}^{∞} which is convergent to a point x ∈ X.
Warning. These two notions of compactness are different in topological spaces. To be more precise: There is a compact topological space which is not sequentially compact and there is a sequentially compact topological space which is not compact!
To find a criterion for compactness in a particular space need not be an easy task. To formulate a general result we need one more notion, the significance of which goes far beyond our present considerations. A sequence {x_n}_{n=1}^{∞} of elements of a metric space X is called a Cauchy sequence if for every ε > 0 there is n_0 ∈ N such that

ϱ(x_m, x_n) < ε for all m, n ≥ n_0.

A metric space X is said to be complete if any Cauchy sequence in X is convergent (to an element of X). We will encounter complete spaces “almost everywhere” in the subsequent text.

Proposition 1.2.3. Let X be a complete metric space. Then A ⊂ X is relatively compact if and only if for every ε > 0 there is a finite set K ⊂ X (the so-called finite ε-net for A) such that

∀a ∈ A ∃x ∈ K : ϱ(a, x) < ε.

In other words, A ⊂ ⋃_{x∈K} B(x; ε).

Proposition 1.2.4. Let X be a complete metric space and let f : [α, β) → X. If f is uniformly continuous on [α, β),¹³ then there exists lim_{x→β−} f(x) ∈ X. In particular, if β < ∞, then f can be continuously extended to [α, β].

Definition 1.2.5. A topological space X is called a connected space provided it is not possible to find two disjoint nonempty open sets G_1, G_2 such that X = G_1 ∪ G_2. For a ∈ X put

C(a) := ⋃ {A ⊂ X : a ∈ A and A is connected}.

Then C(a) is a connected set and it is called the component of the point a. If a, b ∈ X, a ≠ b, then either C(a) = C(b) or C(a) ∩ C(b) = ∅.

Proposition 1.2.6. Let X be a connected space, let f : X → Y be continuous. Then f(X) is a connected subset of Y. In particular, if γ : [0, 1] → Y is continuous, A ⊂ Y, and γ(0) ∈ A, γ(1) ∉ A, then there exists t_0 ∈ [0, 1] such that γ(t_0) ∈ ∂A.

Proposition 1.2.7. Let X be a normed linear space and let G be an open subset of X. Then G is connected if and only if for any two points a, b ∈ G there exists a continuous mapping γ : [0, 1] → G such that γ(0) = a, γ(1) = b. In particular, γ can be chosen piecewise linear.

Now we are ready to start with the main subject of this section.

Definition 1.2.8. Let X be a real or complex linear space. A function ‖·‖_X : X → R is called a norm on X if it has the following properties:

¹³ I.e., ∀ε > 0 ∃δ > 0 ∀x, y ∈ [α, β) : |x − y| < δ =⇒ ϱ(f(x), f(y)) < ε.
(1) xX = 0 ⇐⇒ x = o, (2) αxX = |α|xX for α ∈ R or C and x ∈ X, (3) x + yX ≤ xX + yX for x, y ∈ X (the so-called triangle inequality). If a linear space X is endowed with a norm, then X is called a normed linear space. In the sequel we will drop the index of the norm whenever there is no danger of confusion. It is obvious that (x, y) x − y is a metric on X. Therefore all metric notions and results are transmitted to normed linear spaces. If a normed linear space is complete in this metric, then it is called a Banach space. Any metric space can be embedded as a dense set into a complete metric space. For a normed linear space X we get a slightly stronger result: ˜ (the so-called completion of X) and a There exists a Banach space X ˜ ˜ and linear injection L : X → X such that Im L is a dense subset of X xX = L(x)X˜
x ∈ X.
for all
Example 1.2.9. Let X be an M-dimensional real linear space. Choose a basis $f_1, \dots, f_M$ of X and let $e_1, \dots, e_M$ be the standard basis of $\mathbb{R}^M$. For $x = \sum_{i=1}^{M} x_i f_i \in X$ put $\varphi(x) = \sum_{i=1}^{M} x_i e_i$. Then φ is an isomorphism of X onto $\mathbb{R}^M$. Moreover,

$$\|(x_1, \dots, x_M)\|_1 := \sum_{i=1}^{M} |x_i|, \qquad \|(x_1, \dots, x_M)\|_\infty := \max_{i=1,\dots,M} |x_i|, \qquad \|(x_1, \dots, x_M)\|_2 := \Big(\sum_{i=1}^{M} |x_i|^2\Big)^{1/2}$$

are norms on $\mathbb{R}^M$ (for indices 1, ∞ it is obvious, the triangle inequality for index 2 needs some effort – see also Proposition 1.2.30 below). These norms can be transmitted to X with help of φ, i.e.,

$$\|x\|_\alpha := \|\varphi(x)\|_\alpha, \qquad \alpha = 1, 2, \infty.$$

Similar results are true also for a complex linear space X when $\mathbb{C}^M$ is used instead of $\mathbb{R}^M$. The space X is a Banach space with respect to any of the above norms.

The classical Bolzano–Weierstrass result on the compactness of a closed bounded interval in R has the following generalization:

Let X be a finite dimensional space endowed with an α-norm (α = 1, 2, ∞). Then A ⊂ X is relatively compact if and only if it is bounded (i.e., there is a constant c such that $\|x\|_\alpha \le c$ for every x ∈ A).
We note that this result is true for any norm on X (see Corollary 1.2.11(i) below).

Proposition 1.2.10. Let X and Y be normed linear spaces and let A be a linear operator from X into Y. Then the following statements are equivalent:
(i) A is continuous on X;
(ii) A is continuous at o ∈ X;
(iii) there is a constant c such that the inequality
$$\|Ax\|_Y \le c\|x\|_X \tag{1.2.1}$$
is valid for all x ∈ X.

Proof. The easy proof is left to the reader as an exercise.
We denote the collection of all continuous linear operators from X into Y by L(X, Y) and the least possible constant c in (1.2.1) by $\|A\|_{L(X,Y)}$. This quantity has all properties of a norm on the linear space L(X, Y). We will always consider this norm (the so-called operator norm) on L(X, Y). If X = Y, we will use the shorter notation L(X) instead of L(X, X).

We now return to Example 1.2.9. It is obvious that there are positive constants $c_1$, $c_2$ such that

$$c_1\|x\|_1 \le \|x\|_2 \le c_2\|x\|_1 \quad \text{for all } x \in \mathbb{R}^M\ (\mathbb{C}^M).$$

(Here, e.g., $c_1 = \frac{1}{M}$, $c_2 = \sqrt{M}$.) Such constants exist also for the norms $\|\cdot\|_1$, $\|\cdot\|_\infty$. More generally, two norms on a linear space X are called equivalent if they satisfy such inequalities. In other words, two norms $\|\cdot\|_\alpha$, $\|\cdot\|_\beta$ on a linear space X are equivalent if the identity map from $(X, \|\cdot\|_\alpha)$ into $(X, \|\cdot\|_\beta)$ is continuous together with its inverse, i.e., it is an isomorphism.¹⁴

¹⁴ Unlike the “algebraic” isomorphism from Definition 1.1.11, here it is understood in the “topological” sense. In general, A ∈ L(X, Y) is an isomorphism if A is injective, surjective and $A^{-1} \in L(Y, X)$.

Corollary 1.2.11.
(i) Any two norms on a finite dimensional linear space X are equivalent. In particular, X is a Banach space.
(ii) Let X, Y be normed linear spaces and dim X < ∞. Then the space of all linear operators from X into Y coincides with L(X, Y), i.e., any linear operator from X into Y is continuous.

Proof. (i) Let φ be as in Example 1.2.9 and consider $\mathbb{R}^M$ (or $\mathbb{C}^M$) equipped with the $\|\cdot\|_1$-norm. Then for $x = (x_1, \dots, x_M) \in \mathbb{R}^M$ we have

$$\|\varphi^{-1}(x)\|_X = \Big\|\sum_{i=1}^{M} x_i f_i\Big\|_X \le \sum_{i=1}^{M} |x_i| \|f_i\|_X \le c \sum_{i=1}^{M} |x_i| = c\|x\|_1,$$

i.e., $\varphi^{-1}$ is continuous. Observe that for proving continuity of φ it is sufficient to show that

$$\inf\{\|\varphi^{-1}(x)\|_X : \|x\|_1 = 1\} > 0.$$

But this is true since the set $\{x \in \mathbb{R}^M : \|x\|_1 = 1\}$ is compact and $\varphi^{-1}$ is continuous: the continuous function $x \mapsto \|\varphi^{-1}(x)\|_X$ attains its infimum on this set, and the infimum is positive because $\varphi^{-1}$ is injective.

Now let $\|\cdot\|$, $\|\cdot\|_\sim$ be two norms on X, dim X = M and let ι be the identity map from X onto $\tilde{X}$ (= X with the norm $\|\cdot\|_\sim$). The result follows from the commutativity of the diagram in Figure 1.2.1. Since $\mathbb{R}^M$ and $\mathbb{C}^M$ are complete spaces with respect to the $\|\cdot\|_1$-norm (the classical Bolzano–Cauchy condition) and $\{u_n\}_{n=1}^{\infty} \subset X$ is a Cauchy sequence if and only if $\{\varphi(u_n)\}_{n=1}^{\infty}$ is a Cauchy sequence, X is a Banach space.

[Figure 1.2.1. Commutative diagram with the maps $\iota : X \to \tilde{X}$, $\varphi : X \to \mathbb{R}^M\ (\mathbb{C}^M)$ and $\tilde{\varphi}^{-1} : \mathbb{R}^M\ (\mathbb{C}^M) \to \tilde{X}$.]

(ii) It is sufficient to prove continuity with respect to the 1-norm on X. For $x = \sum_{i=1}^{M} x_i f_i \in X$ we have

$$\|Ax\|_Y \le \sum_{i=1}^{M} |x_i| \|Af_i\|_Y \le c \sum_{i=1}^{M} |x_i| = c\|x\|_1.$$
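The equivalence of the 1-norm and the 2-norm on $\mathbb{R}^M$ can be checked numerically. The following is only a minimal sketch: the helper names, the random sampling and the tolerance are choices made here for illustration, and the constants $1/M$ and $\sqrt{M}$ are valid but not sharp.

```python
import math
import random

def norm1(x):
    return sum(abs(t) for t in x)

def norm2(x):
    return math.sqrt(sum(t * t for t in x))

def norm_inf(x):
    return max(abs(t) for t in x)

def check_equivalence(M, trials=1000, seed=0):
    """Check (1/M)*||x||_1 <= ||x||_2 <= sqrt(M)*||x||_1 on random vectors of R^M."""
    rng = random.Random(seed)
    c1, c2 = 1.0 / M, math.sqrt(M)
    tol = 1e-12  # guards against floating-point rounding only
    for _ in range(trials):
        x = [rng.uniform(-1.0, 1.0) for _ in range(M)]
        n1, n2 = norm1(x), norm2(x)
        if not (c1 * n1 <= n2 + tol and n2 <= c2 * n1 + tol):
            return False
    return True

print(check_equivalence(5))
```

A random test of this kind can only fail to find a counterexample; the inequalities themselves follow from $\|x\|_2 \le \|x\|_1 \le \sqrt{M}\,\|x\|_2$.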
Example 1.2.12 (spaces of continuous functions). Let T be a compact topological space. Then any continuous real (complex) function f is bounded on T (Proposition 1.2.2), and

$$\|f\|_T := \sup\{|f(x)| : x \in T\}$$

is a norm on the space C(T) of all such functions. Convergence of a sequence in this norm is the uniform convergence on T. It follows that C(T) is a Banach space.

If T is not compact, then a continuous function on T need not be bounded. To get a topology on a family of continuous functions on T we can either restrict our attention to the space BC(T) of all bounded, continuous functions on T or assume certain properties of T which are weaker than compactness (the reader can consider $\mathbb{R}^M$ as a model of such T). As a result we wish to obtain a topology on C(T) in which convergence of a sequence is equivalent to the locally uniform convergence. This can be done as follows. Let a topological space T be a countable union of open, relatively compact subsets $T_n$.¹⁵ We leave to the reader to verify that the sum

$$\varrho(f, g) := \sum_{n=1}^{\infty} \frac{1}{2^n} \frac{\|f - g\|_n}{1 + \|f - g\|_n} \tag{1.2.2}$$

where $\|f - g\|_n := \sup\{|f(x) - g(x)| : x \in T_n\}$, defines a metric on C(T) and the convergence of a sequence in this metric is actually the locally uniform convergence, i.e., uniform convergence on any compact subset of T. Since ϱ is bounded it cannot be induced by any norm. Even more is true, namely, there is no norm on C(T) which generates the same system of open sets as the metric ϱ does (provided T itself is not compact).

¹⁵ A basic example is $\mathbb{R}^M$ or $\mathbb{C}^M$. Another example is the set N of natural numbers with the discrete metric: d(m, n) = 1 if m ≠ n and d(m, m) = 0.

We now state two fundamental results concerning spaces of continuous functions. To formulate the first we need the concept of equicontinuity: A family F ⊂ C(T) is said to be equicontinuous if for all x ∈ T and ε > 0 there is a neighborhood U of x such that

$$y \in U,\ f \in F \implies |f(y) - f(x)| < \varepsilon.$$
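The metric (1.2.2) can be explored numerically. The sketch below assumes T = R with $T_n = (-n, n)$, approximates each supremum on a uniform grid, and truncates the series after 30 terms; all three are choices made here for illustration, not part of the text. It shows that ϱ stays below 1 and is not homogeneous, which is why it cannot come from a norm.

```python
def sup_on_Tn(h, n, samples=2001):
    """Approximate sup{|h(x)| : x in (-n, n)} on a uniform grid (an approximation only)."""
    step = 2.0 * n / (samples - 1)
    return max(abs(h(-n + i * step)) for i in range(samples))

def rho(f, g, terms=30):
    """Truncated version of the series (1.2.2) for T = R, T_n = (-n, n)."""
    total = 0.0
    for n in range(1, terms + 1):
        d = sup_on_Tn(lambda x: f(x) - g(x), n)
        total += d / (1.0 + d) / 2.0 ** n
    return total

zero = lambda x: 0.0
f = lambda x: x * x              # continuous and unbounded on R

d1 = rho(f, zero)
d2 = rho(lambda x: 2.0 * f(x), zero)
print(d1, d2)                    # d2 differs from 2 * d1
```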
Theorem 1.2.13 (Arzelà–Ascoli). Let T be a topological space which is a union of a sequence of open, relatively compact subsets. Then F ⊂ C(T) is relatively compact in the ϱ-metric if and only if the following two conditions are satisfied:
(i) F is equicontinuous;
(ii) for each x ∈ T the set {f(x) : f ∈ F} is bounded in R (or C).¹⁶

Proof. We omit the proof and refer, e.g., to Dugundji [43, Section XII, 6] or Kelley [75, Chapter 7, Theorem 17] where more general results are proved.

Since a continuous function can be very strange (e.g., nowhere differentiable) it is often desirable to have an approximation procedure. The first result of this type was the famous Weierstrass Theorem on uniform approximation by polynomials. One of the characteristic features of this approximation consists in the fact that the product of two continuous functions is a continuous function and the same is true for polynomials. In algebraic terms: Both sets are not only linear spaces but also algebras.¹⁷ The following generalization of the Weierstrass Theorem is due to M.H. Stone.

Theorem 1.2.14 (Stone–Weierstrass). Let T satisfy the assumption of Theorem 1.2.13 and let $C_{\mathbb{R}}(T)$ be an algebra of all real continuous functions on T. Let $A \subset C_{\mathbb{R}}(T)$ be a subalgebra which contains constant functions and separates points of T (i.e., for any x, y ∈ T, x ≠ y, there is f ∈ A such that f(x) ≠ f(y)). Then A is dense in $C_{\mathbb{R}}(T)$ with respect to the ϱ-metric.

¹⁶ If T is compact and (i) holds, then the assumption (ii) is equivalent to the boundedness of F in C(T).
¹⁷ A linear space X with a binary operation (product) which is associative and distributive with respect to linear operations is called an algebra. Further, if X is a normed linear space and for the product the inequality $\|x \cdot y\| \le \|x\| \|y\|$ holds for every x, y ∈ X, then X is called a normed algebra and, in the case that X is complete, a Banach algebra.
Proof. The proof can be found, e.g., in Dugundji [43, XIII, 3] or Kelley [75, Chapter 7, Exercise T].

We note that Theorem 1.2.14 can be easily extended to the space of complex continuous functions. In this case, A is assumed to possess the following additional property: If f ∈ A, then also $\bar{f} \in A$.¹⁸

The reader can ask why certain additional properties are needed for compactness in infinite dimensional spaces like C(T) in contrast to finite dimensional spaces. The following theorem explains not only this situation but also the technical difficulties which one meets in the calculus of variations (see Chapter 6).

Proposition 1.2.15 (F. Riesz). Let X be a normed linear space. Then the closed unit ball

$$B(o; 1) := \{x \in X : \|x\| \le 1\}$$

is compact (in the norm topology) if and only if X has finite dimension.

Proof. Sufficiency is obvious (see Example 1.2.9 and Corollary 1.2.11(i)). It remains to prove necessity. We proceed by contradiction. Assume that dim X = ∞. Choose 0 < ε < 1 and suppose that we have $x_1, \dots, x_n \in B(o; 1)$ such that

$$\|x_i - x_j\| > 1 - \varepsilon \quad \text{for all } 1 \le i < j \le n.$$

We shall show that we can find another element $x_{n+1} \in B(o; 1)$ such that $\{x_1, \dots, x_{n+1}\}$ has the same property. Since $X_n = \operatorname{Lin}\{x_1, \dots, x_n\} \ne X$ there is $y \in X \setminus X_n$. Denote

$$d := \inf\{\|y - x\| : x \in X_n\}.$$

Observe that d > 0 since $X_n$ is a closed subspace.¹⁹ By the definition of the greatest lower bound, there exists $\tilde{x} \in X_n$ such that

$$d \le \|y - \tilde{x}\| < d(1 + \varepsilon).$$

For $x_{n+1} := \frac{y - \tilde{x}}{\|y - \tilde{x}\|} \in B(o; 1)$ and $x \in X_n$ we get

$$\|x_{n+1} - x\| = \frac{1}{\|y - \tilde{x}\|} \big\|y - (\tilde{x} + \|y - \tilde{x}\| x)\big\| \ge \frac{1}{d(1 + \varepsilon)}\, d > 1 - \varepsilon.$$

Thus an infinite sequence $\{x_n\}_{n=1}^{\infty} \subset B(o; 1)$ with no convergent subsequence has been constructed, which contradicts compactness of B(o; 1).

¹⁸ If z = x + iy, x, y ∈ R, then its complex conjugate $\bar{z}$ is defined by $\bar{z} := x - iy$.
¹⁹ Every finite dimensional subspace Y ⊂ X is complete, and therefore closed in X.

Example 1.2.16 (spaces of integrable functions). Let Ω be a Lebesgue measurable subset of $\mathbb{R}^M$ and let dx denote the Lebesgue measure in $\mathbb{R}^M$.
For p ∈ [1, ∞) we denote

$$\mathcal{L}^p(\Omega) := \Big\{ f : \Omega \to \mathbb{R}\ (\text{or } \mathbb{C}) : f \text{ is measurable and } |f|_{\mathcal{L}^p(\Omega)} := \Big(\int_\Omega |f(x)|^p \, dx\Big)^{1/p} < \infty \Big\}. \tag{1.2.3}$$

The Minkowski inequality

$$|f + g|_{\mathcal{L}^p(\Omega)} \le |f|_{\mathcal{L}^p(\Omega)} + |g|_{\mathcal{L}^p(\Omega)} \tag{1.2.4}$$

implies that $\mathcal{L}^p(\Omega)$ is a linear space. Observe that $|\cdot|_{\mathcal{L}^p(\Omega)}$ is not a norm since $|f|_{\mathcal{L}^p(\Omega)} = 0$ implies only f = o almost everywhere (abbreviation: a.e.) in Ω. Put

$$N(\Omega) = \{f : \Omega \to \mathbb{C} : f = o \text{ a.e. in } \Omega\}.$$

Then N is a linear subspace of $\mathcal{L}^p$ and the factor space

$$L^p(\Omega) := \mathcal{L}^p(\Omega)|_N$$

is a normed linear space with the norm

$$\|[f]\|_{L^p(\Omega)} = |f|_{\mathcal{L}^p(\Omega)} \quad \text{for any } f \in [f].^{20}$$

For the sake of simplicity we will use the notation f instead of the superfluous [f] for an element of $L^p(\Omega)$ and will call it simply a function.

It is also convenient to introduce the space $L^\infty(\Omega)$ of all (classes of) essentially bounded measurable functions. We recall that f is said to be essentially bounded on Ω if there is a constant c such that |f(x)| ≤ c for a.e. x in Ω. The least possible c is denoted by $\|f\|_{L^\infty(\Omega)}$. Again $\|\cdot\|_{L^\infty(\Omega)}$ is a norm on $L^\infty(\Omega)$.

We mention another important inequality – the so-called Hölder inequality:

If 1 ≤ p ≤ ∞ and p′ is the conjugate exponent ($\frac{1}{p} + \frac{1}{p'} = 1$ where $\frac{1}{\infty}$ is here defined to be 0) and $f \in L^p(\Omega)$, $g \in L^{p'}(\Omega)$, then $fg \in L^1(\Omega)$ and

$$\|fg\|_1 \le \|f\|_p \|g\|_{p'}. \tag{1.2.5}$$
Proposition 1.2.17. $L^p(\Omega)$ is a Banach space for any 1 ≤ p ≤ ∞.

Proof. We give the proof for p = 1 (some small modifications are needed for 1 < p < ∞, while the proof for p = ∞ is similar to the one of completeness of C(T), cf. Example 1.2.12). Let $\{f_n\}_{n=1}^{\infty}$ be a Cauchy sequence in $L^1(\Omega)$. Then for any k ∈ N there is $n_k \in \mathbb{N}$ such that

$$\|f_n - f_{n_k}\|_1 < \frac{1}{2^k} \quad \text{for all } n \ge n_k.$$

We can assume that the sequence $\{n_k\}_{k=1}^{\infty}$ is strictly increasing. Put $g_p = \sum_{k=1}^{p} |f_{n_{k+1}} - f_{n_k}|$. Since

$$\int_\Omega g_p(x) \, dx \le \sum_{k=1}^{p} \int_\Omega |f_{n_{k+1}}(x) - f_{n_k}(x)| \, dx \le \sum_{k=1}^{p} \frac{1}{2^k},$$

the Monotone Convergence Theorem²¹ gives that $g = \lim_{n\to\infty} g_n$ has a finite integral over Ω and therefore g is finite a.e. in Ω. This means that $\sum_{k=1}^{\infty} |f_{n_{k+1}}(x) - f_{n_k}(x)|$ is a.e. convergent, and therefore $f(x) := \lim_{k\to\infty} f_{n_k}(x)$ exists a.e. in Ω. By the Fatou Lemma²² we have

$$\int_\Omega |f(x) - f_{n_k}(x)| \, dx \le \liminf_{l\to\infty} \int_\Omega |f_{n_l}(x) - f_{n_k}(x)| \, dx \le \frac{1}{2^{k-1}}.$$

In particular,

$$f \in L^1(\Omega) \quad \text{and} \quad \lim_{k\to\infty} \|f_{n_k} - f\|_1 = 0.$$

²⁰ For the sake of simplicity we will use in the sequel the notation $\|\cdot\|_p$ instead of $\|\cdot\|_{L^p(\Omega)}$.
The rest of the proof is easy. Indeed, a Cauchy sequence which has a convergent subsequence is itself convergent.

Remark 1.2.18. The proof shows that the following statement is true:

If $\{f_n\}_{n=1}^{\infty}$ is convergent to f in the $L^p$-norm, then there is a subsequence $\{f_{n_k}\}_{k=1}^{\infty}$ which converges to f a.e., and there is $g \in L^p(\Omega)$, g ≥ 0, such that

$$|f_{n_k}(x)| \le g(x) \quad \text{for a.e. } x \in \Omega.$$

Warning. The whole sequence need not be a.e. convergent! To see this arrange the characteristic functions of the intervals $\big[\frac{k-1}{2^n}, \frac{k}{2^n}\big]$, $k = 1, \dots, 2^n$, $n \in \mathbb{N}$, into a sequence.

²¹ This theorem reads as follows: Let $\{g_n\}_{n=1}^{\infty}$ be an increasing sequence of nonnegative measurable functions on Ω and let $g = \lim_{n\to\infty} g_n$. Then

$$\lim_{n\to\infty} \int_\Omega g_n(x) \, dx = \int_\Omega g(x) \, dx.$$

²² The Fatou Lemma reads: Let $\{h_n\}_{n=1}^{\infty}$ be a sequence of measurable functions which are uniformly bounded below by an $h \in L^1(\Omega)$. Then

$$\int_\Omega \liminf_{n\to\infty} h_n(x) \, dx \le \liminf_{n\to\infty} \int_\Omega h_n(x) \, dx.$$

The statement holds for lim sup with the reverse inequality for a sequence bounded above by an integrable function. Put here $h_l = |f_{n_l} - f_{n_k}|$.
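The warning about a.e. convergence can be illustrated with characteristic functions of dyadic subintervals of [0, 1], enumerated level by level as [0, 1/2], [1/2, 1], [0, 1/4], [1/4, 1/2], … The sketch below is an illustration only: the enumeration, the function names and the sample point x = 1/3 are choices made here. The $L^1$-norms tend to 0, yet at the fixed point the sequence returns to the value 1 once on every level.

```python
from fractions import Fraction

def typewriter(j):
    """Return the j-th interval (j = 0, 1, 2, ...) in the enumeration
    [0,1/2], [1/2,1], [0,1/4], [1/4,1/2], ... of dyadic intervals in [0, 1]."""
    n = 1
    while j >= 2 ** n:       # skip the 2^n intervals of level n
        j -= 2 ** n
        n += 1
    return Fraction(j, 2 ** n), Fraction(j + 1, 2 ** n)

def value(j, x):
    """Value at x of the characteristic function of the j-th interval."""
    a, b = typewriter(j)
    return 1 if a <= x <= b else 0

def l1_norm(j):
    a, b = typewriter(j)
    return b - a             # integral of the characteristic function

x = Fraction(1, 3)
hits = [j for j in range(62) if value(j, x) == 1]   # levels 1 to 5
print(float(l1_norm(61)), hits)
```

Exact rational arithmetic is used so that the membership tests at interval endpoints are not affected by rounding.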
Approximations of integrable functions by more regular functions, like continuous or differentiable ones, are often desirable.

Proposition 1.2.19 (Density Theorem). For any p ∈ [1, ∞) the subset $C(\Omega) \cap L^p(\Omega)$ is dense in $L^p(\Omega)$.

Proof. It is based on the application of the Luzin Theorem.²³ See also Proposition 1.2.21 below.

We now show another type of approximation which is more constructive and therefore often more convenient in applications. If f, g are measurable functions on $\mathbb{R}^M$, then we define their convolution f ∗ g as

$$(f * g)(x) := \int_{\mathbb{R}^M} f(x - y) g(y) \, dy \quad \text{for all } x \in \mathbb{R}^M \tag{1.2.6}$$

for which the integral exists. We note that the properties of the convolution follow from the Fubini Theorem provided measurability of the function $(x, y) \mapsto f(x - y) g(y)$ is established. For details see, e.g., Folland [52], Gripenberg, Londen & Staffans [62, Chapters 2–4], and also Example 2.1.28. The following assertion is a basic result on convolutions.

Proposition 1.2.20. Let $f \in L^1(\mathbb{R}^M)$.
(i) If $g \in L^p(\mathbb{R}^M)$, 1 ≤ p ≤ ∞, then $f * g \in L^p(\mathbb{R}^M)$ and $\|f * g\|_p \le \|f\|_1 \|g\|_p$.
(ii) If $g \in L^\infty(\mathbb{R}^M)$, then f ∗ g is bounded and uniformly continuous on $\mathbb{R}^M$.
(iii) If $g \in L^p(\mathbb{R}^M)$ and $\frac{\partial g}{\partial x_i} \in L^p(\mathbb{R}^M)$, then $\frac{\partial}{\partial x_i}(f * g) = f * \frac{\partial g}{\partial x_i}$ a.e. in $\mathbb{R}^M$.
(iv) If φ is a nonnegative, measurable function with $\int_{\mathbb{R}^M} \varphi(x) \, dx = 1$ (the so-called mollifier) and $\varphi_n(x) := n^M \varphi(nx)$, then $\varphi_n * g$ converge to g in the $L^p$-norm for any $g \in L^p(\mathbb{R}^M)$, 1 ≤ p < ∞.

If T is a topological space and f : T → R (C), then the support of f (abbreviation supp f) is the set $\overline{\{x \in T : f(x) \ne 0\}}$. If $\Omega \subset \mathbb{R}^M$ is an open set, then D(Ω) denotes the set of all infinitely differentiable functions on Ω (i.e., their derivatives of arbitrary order are continuous in Ω) which have compact support lying in Ω.

²³ Roughly speaking, the Luzin Theorem says that a bounded measurable function is continuous with respect to sets, measures of which are arbitrarily close to the measure of Ω provided the latter is finite. For a more general formulation and the proof of the Luzin Theorem the reader can consult, e.g., Rudin [113, § 2.23].
We show that D(Ω) contains enough functions. Put

$$\omega(x) = \begin{cases} e^{-\frac{1}{1 - x^2}}, & |x| < 1, \\ 0, & |x| \ge 1. \end{cases}$$

It is a matter of simple calculation to prove that ω ∈ D(R). If a ∈ Ω, then B(a; δ) ⊂ Ω for a δ > 0 small enough and the function

$$\varphi(y) := \omega\Big(\frac{2}{\delta} \|y - a\|_{\mathbb{R}^M}\Big)$$

belongs to D(Ω). However, much more is true.

Proposition 1.2.21. Let Ω be an open set in $\mathbb{R}^M$ and let p ∈ [1, ∞). Then D(Ω) is dense in $L^p(\Omega)$.

Proof. The just defined function φ multiplied by an appropriate constant satisfies the assumptions of Proposition 1.2.20(iv). There is a strictly increasing sequence of compact subsets $C_m$ of Ω such that

$$\bigcup_{m=1}^{\infty} C_m = \Omega.$$

Extend $f \in L^p(\Omega)$ by zero outside Ω and put $f_m = \chi_m f$ where $\chi_m$ is the characteristic function of the set $C_m$. Then $f_m \to f$ in the $L^p$ norm. By Proposition 1.2.20, $\varphi_n * f_m \in D(\Omega)$ for $n \ge n_m$ and

$$\|\varphi_n * f_m - f\|_p \le \|\varphi_n * (f_m - f)\|_p + \|\varphi_n * f - f\|_p \le \|f_m - f\|_p + \|\varphi_n * f - f\|_p.$$

The result follows from Proposition 1.2.20(iv).
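The smoothing effect of mollifiers can be observed numerically in one dimension. The sketch below assumes M = 1, takes the normalized bump function as mollifier, uses the characteristic function of [0, 1] as g, and replaces all integrals by midpoint sums; the function names and step counts are choices made here for illustration.

```python
import math

def omega(x):
    """The bump function: exp(-1/(1 - x^2)) for |x| < 1, and 0 otherwise."""
    return math.exp(-1.0 / (1.0 - x * x)) if abs(x) < 1.0 else 0.0

def midpoint(h, a, b, steps):
    """Midpoint rule for the integral of h over [a, b]."""
    w = (b - a) / steps
    return w * sum(h(a + (i + 0.5) * w) for i in range(steps))

C = 1.0 / midpoint(omega, -1.0, 1.0, 2000)    # normalizes the integral of phi to 1

def phi_n(n, x):
    return n * C * omega(n * x)                # phi_n(x) = n * phi(n x) for M = 1

def g(x):
    return 1.0 if 0.0 <= x <= 1.0 else 0.0     # characteristic function of [0, 1]

def mollified(n, x, steps=200):
    # (phi_n * g)(x) = integral of phi_n(y) g(x - y) over |y| < 1/n
    return midpoint(lambda y: phi_n(n, y) * g(x - y), -1.0 / n, 1.0 / n, steps)

def l1_error(n, steps=600):
    """Approximate L^1 distance between phi_n * g and g."""
    return midpoint(lambda x: abs(mollified(n, x) - g(x)), -1.0, 2.0, steps)

errors = [l1_error(n) for n in (2, 4, 8)]
print(errors)
```

The printed errors shrink as n grows, in line with the $L^p$-convergence of $\varphi_n * g$ to g for p = 1.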
Remark 1.2.22. If meas Ω < ∞ and $1 \le \tilde{p} < p \le \infty$, then, by the Hölder inequality,

$$\|f\|_{\tilde{p}} \le (\operatorname{meas} \Omega)^{\frac{1}{\tilde{p}} - \frac{1}{p}} \|f\|_p, \qquad f \in L^p(\Omega). \tag{1.2.7}$$

This means that the identity map of $L^p(\Omega)$ into $L^{\tilde{p}}(\Omega)$ is continuous. We will denote this fact by $L^p(\Omega) \subset L^{\tilde{p}}(\Omega)$ and say that $L^p(\Omega)$ is continuously embedded into $L^{\tilde{p}}(\Omega)$.

Warning. Simple examples show that this is not true if meas Ω = ∞!

The following assertion is an analogue of Theorem 1.2.13.

Proposition 1.2.23 (A.N. Kolmogorov). Let Ω be an open set in $\mathbb{R}^M$. Then $M \subset L^p(\Omega)$, p ∈ [1, ∞), is relatively compact if and only if the following conditions are satisfied:
(i) M is bounded in $L^p(\Omega)$,
(ii) $\forall \varepsilon > 0\ \exists \delta > 0\ \forall f \in M$: $\int_\Omega |f(x + y) - f(x)|^p \, dx < \varepsilon$ for all $\|y\|_{\mathbb{R}^M} < \delta$,²⁴
(iii) $\forall \varepsilon > 0\ \exists \eta > 0\ \forall f \in M$: $\int_{\{x \in \Omega : \|x\|_{\mathbb{R}^M} \ge \eta\}} |f(x)|^p \, dx < \varepsilon$.

Proof. For the proof based on Proposition 1.2.3 see Yosida [135, Chapter 10, § 1].

Remark 1.2.24. All results from 1.2.16–1.2.23 also hold in spaces of sequences

$$l^p := \Big\{ x = \{x_n\}_{n=1}^{\infty} : \|x\|_p := \Big(\sum_{n=1}^{\infty} |x_n|^p\Big)^{1/p} < \infty \Big\}$$
which can be regarded as $L^p(\mathbb{N})$ equipped with the counting measure µ (µ(A) = card A).

Example 1.2.25 (spaces of differentiable functions). We can consider either classical derivatives (defined as limits of relative differences) or weak derivatives. We start with the former case. Let $\alpha = (\alpha_1, \dots, \alpha_M)$ be a multiindex, i.e., $\alpha_i \in \mathbb{N} \cup \{0\}$, i = 1, …, M, and $|\alpha| := \alpha_1 + \dots + \alpha_M$. For a function f on an open set $\Omega \subset \mathbb{R}^M$ we put

$$D^\alpha f(x) := \frac{\partial^{|\alpha|} f(x)}{\partial x_1^{\alpha_1} \cdots \partial x_M^{\alpha_M}}$$

and say that $f \in C^n(\Omega)$ if $D^\alpha f$ are continuous for all multiindices α for which |α| ≤ n. We can use the metric ϱ given by (1.2.2) to define

$$\varrho_\alpha(f, g) := \varrho(D^\alpha f, D^\alpha g)$$

for a multiindex α and set

$$\varrho_n(f, g) := \sum_{|\alpha| \le n} \varrho_\alpha(f, g).$$

Then $\varrho_n$ is a metric on $C^n(\Omega)$ and the convergence in this metric is the locally uniform convergence of all derivatives $D^\alpha$, 0 ≤ |α| ≤ n ($D^o f = f$).

Another possibility is to consider only such functions $f \in C^n(\Omega)$ for which $D^\alpha f$ is bounded in Ω for all 0 ≤ |α| ≤ n. We denote the collection of such functions by $C^n(\overline{\Omega})$²⁵ and put

$$\|f\|_{C^n(\overline{\Omega})} := \sum_{|\alpha| \le n} \sup_{x \in \Omega} |D^\alpha f(x)|.$$

This is a norm, $C^n(\overline{\Omega})$ is a Banach space, and the convergence of a sequence $\{f_k\}_{k=1}^{\infty} \subset C^n(\overline{\Omega})$ to f in this norm means that

$$D^\alpha f_k \rightrightarrows D^\alpha f \quad \text{uniformly on } \Omega \text{ for all } |\alpha| \le n.$$

²⁴ If $x + y \notin \Omega$, then we set $f(x + y) := 0$.
²⁵ In connection with this notation observe that for a relatively compact set Ω all derivatives $D^\alpha f$, |α| ≤ n − 1, are uniformly continuous, and therefore continuously extendable to $\overline{\Omega}$.
It is sometimes convenient to have a finer scale of spaces of differentiable functions. We can achieve that by introducing the Hölder continuous functions: A function f : Ω → R (or C) is called γ-Hölder continuous (0 < γ ≤ 1) if there is a constant c such that the inequality

$$|f(x) - f(y)| \le c\|x - y\|^\gamma$$

holds for all x, y ∈ Ω.²⁶ The quantity

$$\|f\|_{C^{0,\gamma}(\overline{\Omega})} := \sup_{x \in \Omega} |f(x)| + \sup_{\substack{x, y \in \Omega \\ x \ne y}} \frac{|f(x) - f(y)|}{\|x - y\|^\gamma}$$

is a norm on the space $C^{0,\gamma}(\overline{\Omega})$ of γ-Hölder continuous, bounded functions on Ω. The space $C^{n,\gamma}(\overline{\Omega})$ is defined similarly. We note that $C^{n,\gamma}(\overline{\Omega})$ is a Banach space with respect to the above norm (cf. Exercise 7.1.4).

Now we turn our attention to weak derivatives on an open set $\Omega \subset \mathbb{R}^M$. Let $f \in L^1_{loc}(\Omega)$ (this means that $f \in L^1(K)$ for every compact subset K ⊂ Ω), and let α be a multiindex. A function $g \in L^1_{loc}(\Omega)$ is called an α-weak derivative of f if

$$\int_\Omega f(x) D^\alpha \varphi(x) \, dx = (-1)^{|\alpha|} \int_\Omega g(x) \varphi(x) \, dx \quad \text{for every } \varphi \in D(\Omega). \tag{1.2.8}$$

We will denote $g = D^\alpha_w f$ and omit w when there is no danger of ambiguity.

Warning. Even in the one-dimensional case the ordinary derivative existing almost everywhere need not be the weak derivative! For example, the Heaviside function

$$H(x) = \begin{cases} 1, & x \ge 0, \\ 0, & x < 0, \end{cases}$$

satisfies

$$H'(x) = 0 \quad \text{for } x \in \mathbb{R} \setminus \{0\}$$

but the weak derivative does not exist. The distributional derivative of H²⁷ is the Dirac measure.

²⁶ If γ = 1, then it is more common to say that f is a Lipschitz continuous function. We note that the inequality is satisfied for a γ > 1 only if f is a constant function (cf. Exercise 7.1.6).
²⁷ A linear form Φ on the linear space D(Ω) is called a distribution (this notion is due to L. Schwartz) if it has the following continuity property: If $\varphi_n \in D(\Omega)$ have their supports in the same compact set K ⊂ Ω and $D^\alpha \varphi_n \rightrightarrows D^\alpha \varphi$ uniformly on Ω for all multiindices α, then $\Phi(\varphi_n) \to \Phi(\varphi)$.
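The defining identity (1.2.8) can be tested numerically for a single test function. The sketch below assumes Ω = (−2, 2), |α| = 1, the shifted bump $\varphi(x) = \omega(x - 0.3)$ as test function, and midpoint quadrature; these are choices made here for illustration. For f(x) = |x| the function sign x satisfies the identity, whereas for the Heaviside function the left-hand side equals −φ(0), a point evaluation that no locally integrable g reproduces for all φ.

```python
import math

def omega(t):
    return math.exp(-1.0 / (1.0 - t * t)) if abs(t) < 1.0 else 0.0

def d_omega(t):
    # classical derivative of omega on (-1, 1), zero elsewhere
    return omega(t) * (-2.0 * t) / (1.0 - t * t) ** 2 if abs(t) < 1.0 else 0.0

SHIFT = 0.3                                  # makes phi(0) nonzero and phi non-symmetric
phi = lambda x: omega(x - SHIFT)
d_phi = lambda x: d_omega(x - SHIFT)

def integral(h, a=-2.0, b=2.0, steps=40000):
    """Midpoint rule on (a, b); the point 0 is a cell boundary for these defaults."""
    w = (b - a) / steps
    return w * sum(h(a + (i + 0.5) * w) for i in range(steps))

sign = lambda x: (x > 0) - (x < 0)
heaviside = lambda x: 1.0 if x >= 0 else 0.0

lhs_abs = integral(lambda x: abs(x) * d_phi(x))     # integral of f * phi'
rhs_abs = -integral(lambda x: sign(x) * phi(x))     # -(integral of g * phi), g = sign
lhs_H = integral(lambda x: heaviside(x) * d_phi(x)) # integral of H * phi'
print(lhs_abs, rhs_abs, lhs_H, -phi(0.0))
```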
We note that an absolutely continuous function f on an interval I ⊂ R has a derivative a.e. (the Lebesgue Theorem, see Rudin [113]), and

$$f(x) = f(a) + \int_a^x f'(y) \, dy \quad \text{for } a, x \in I.$$

This implies that $D_w f = f'$. The situation in higher dimensions is not so simple since there are several non-equivalent definitions of absolutely continuous functions.

Having a definition of weak derivatives we can define Sobolev spaces $W^{k,p}(\Omega)$ for an open set $\Omega \subset \mathbb{R}^M$ as follows:

$$W^{k,p}(\Omega) := \{f \in L^p(\Omega) : \text{derivatives } D^\alpha_w f \text{ exist and belong to } L^p(\Omega) \text{ for all } |\alpha| \le k\}$$

with the norm

$$\|f\|_{W^{k,p}(\Omega)} := \sum_{|\alpha| \le k} \|D^\alpha_w f\|_p.^{28} \tag{1.2.9}$$

Similarly to the definition of $L^p$ spaces, classes of functions are considered here. Since $L^p(\Omega)$ is a Banach space, $W^{k,p}(\Omega)$ is a Banach space, too. As we will see later in this book, Sobolev spaces play an important role in the study of boundary value problems. For this purpose the following assertions are important.

Theorem 1.2.26 (Sobolev Embedding Theorem). Let k ∈ N and let p ∈ [1, ∞).
(i) If $k < \frac{N}{p}$, then $W^{k,p}(\mathbb{R}^N) \subset L^{p^*}(\mathbb{R}^N)$ for $\frac{1}{p^*} = \frac{1}{p} - \frac{k}{N}$.²⁹
(ii) If $k = \frac{N}{p}$, then $W^{k,p}(\mathbb{R}^N) \subset L^r(\mathbb{R}^N)$ for all r ∈ [p, ∞) and $W^{k,p}(\mathbb{R}^N) \subset L^r_{loc}(\mathbb{R}^N)$ for all r ≥ 1.
(iii) If $k > \frac{N}{p}$, then $W^{k,p}(\mathbb{R}^N) \subset C^{0,\gamma}(\overline{\mathbb{R}^N})$ for all $0 \le \gamma < k - \frac{N}{p}$.³⁰

²⁷ (continued) Note that any $f \in L^1_{loc}(\Omega)$ (and even a regular Borel measure on Ω, see, e.g., Rudin [113]) yields a distribution $\Phi_f$ by the formula $\Phi_f(\varphi) = \int_\Omega f(x) \varphi(x) \, dx$ for any φ ∈ D(Ω). The distributional derivative $D^\alpha$ of a distribution Φ is defined as $D^\alpha \Phi(\varphi) := (-1)^{|\alpha|} \Phi(D^\alpha \varphi)$, φ ∈ D(Ω). It is easy to prove that $D^\alpha \Phi$ is again a distribution, and an α-weak derivative of $f \in L^1_{loc}(\Omega)$ is actually equal to the distributional derivative $D^\alpha \Phi_f$. As the Heaviside function shows the converse is not true.
²⁸ Similarly as for the Lebesgue norm we will use in the sequel the notation $\|\cdot\|_{k,p}$ instead of $\|\cdot\|_{W^{k,p}(\Omega)}$ for the Sobolev norm.
²⁹ The exponent $p^* := \frac{pN}{N - kp}$ is sometimes called the critical Sobolev exponent.
³⁰ This means that any function $f \in W^{k,p}(\mathbb{R}^N)$ can be changed on a set of measure zero in such a way that the new function $\tilde{f}$ is γ-Hölder continuous and $\|\tilde{f}\|_{C^{0,\gamma}(\overline{\mathbb{R}^N})} \le c\|f\|_{W^{k,p}(\mathbb{R}^N)}$. The symbol $\overline{\mathbb{R}^N}$ means that functions from $C^{0,\gamma}(\overline{\mathbb{R}^N})$ are bounded and uniformly γ-Hölder continuous on the whole $\mathbb{R}^N$.
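The case distinction of Theorem 1.2.26 is pure arithmetic in k, p and N, which the following small helper makes explicit. It is a sketch only: the function names and the returned strings are hypothetical, and exact rational arithmetic is used so that the borderline case kp = N is detected reliably (pass p as an integer or a `Fraction`, not as a float).

```python
from fractions import Fraction

def critical_exponent(k, p, N):
    """p* = pN / (N - kp), defined only when kp < N."""
    k, p, N = Fraction(k), Fraction(p), Fraction(N)
    if k * p >= N:
        raise ValueError("p* is defined only for k < N/p")
    return p * N / (N - k * p)

def embedding_case(k, p, N):
    """Return which case of Theorem 1.2.26 applies to W^{k,p}(R^N)."""
    kp, dim = Fraction(k) * Fraction(p), Fraction(N)
    if kp < dim:
        return "(i): L^{p*} with p* = %s" % critical_exponent(k, p, N)
    if kp == dim:
        return "(ii): L^r for all r in [p, infinity)"
    return "(iii): C^{0,gamma} for 0 <= gamma < %s" % (Fraction(k) - dim / Fraction(p))

print(embedding_case(1, 2, 3))
print(embedding_case(1, 2, 2))
print(embedding_case(1, 4, 2))
```

For example, k = 1, p = 2, N = 3 gives p* = 6, in agreement with 1/p* = 1/2 − 1/3.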
Proof. Proofs of these statements are quite involved and also have a long history. The interested reader can consult, e.g., Adams [2], Kufner, John & Fučík [82], Maz’ja [93], Stein [123, Chapters V, VI]. For a readable account of Sobolev spaces we recommend Evans [48, Chapter 5]. Spaces with fractional derivatives which extend the class of Sobolev spaces can be also defined, e.g., Triebel [128], [129].

Remark 1.2.27. The situation for an open set Ω with a nonempty boundary ∂Ω (in particular, for a bounded Ω) is even more complicated because some techniques from harmonic analysis, like Fourier transform, are not available. One possibility is to extend $f \in W^{k,p}(\Omega)$ to a function $\tilde{f} \in W^{k,p}(\mathbb{R}^N)$. This is possible if the boundary ∂Ω possesses certain smoothness properties. To explain this more precisely we would need some facts about manifolds (see Section 4.3 and Appendix 4.3A). So we omit details and just state that Theorem 1.2.26 is true provided ∂Ω is locally Lipschitz (see Section 7.3 for details).

Theorem 1.2.28 (Rellich–Kondrachov). Let Ω be a bounded open set in $\mathbb{R}^N$ with a locally Lipschitz boundary, k ∈ N, p ∈ [1, ∞).
(i) Let $k < \frac{N}{p}$ and $q \in [1, p^*)$ where
$$p^* := \frac{pN}{N - kp}. \tag{1.2.10}$$
Then the embedding $W^{k,p}(\Omega)$ into $L^q(\Omega)$ is compact.³¹
(ii) If $k = \frac{N}{p}$, then $W^{k,p}(\Omega) \subset\subset L^q(\Omega)$ for all q ∈ [1, ∞).
(iii) If $0 \le \gamma < k - \frac{N}{p}$, then $W^{k,p}(\Omega) \subset\subset C^{0,\gamma}(\overline{\Omega})$.

Proof. For the proof see references given above.

Now, we turn our attention to abstract spaces. Proposition 1.2.15 has pointed out the difference between finite dimensional spaces and (infinite dimensional) function spaces. Another difference between the finite and infinite dimension lies in the notion of a basis. It can be shown that any algebraic basis in an infinite dimensional Banach space X has to be uncountable, and therefore the representation of a point by its coordinates can hardly be of any use. This observation leads to the necessity of expressing an element of X by an infinite sum. A sequence $\{e_n\}_{n=1}^{\infty} \subset X$ is called a Schauder basis of X if for each x ∈ X there is a (uniquely determined) sequence $\{\xi_n\}_{n=1}^{\infty}$ of numbers (real or complex according to whether X is real or complex) such that

$$x = \sum_{n=1}^{\infty} \xi_n e_n. \tag{1.2.11}$$

³¹ We will use the notation ⊂⊂ for compact embeddings. An embedding of X into Y is compact if a ball in X is relatively compact in Y.
There are several imperfections in this definition. Namely, there are separable³² Banach spaces which do not possess a Schauder basis. Moreover, the convergence of the sum in (1.2.11) can be understood in several non-equivalent meanings. These problems do not appear in a special class of spaces with an additional structure which is connected with the norm and allows measuring angles.

Definition 1.2.29. Let X be a real (or complex) linear space. A mapping $(\cdot, \cdot)_X : X \times X \to \mathbb{R}$ (or C) is called a scalar product on X if the following conditions are satisfied:
(1) for any y ∈ X the mapping $x \mapsto (x, y)_X$ is linear;
(2) $(x, y)_X = (y, x)_X$ for all x, y ∈ X in the real case and $(x, y)_X = \overline{(y, x)_X}$ in the complex case;
(3) $(x, x)_X \ge 0$ for every x ∈ X and $(x, x)_X = 0$ if and only if x = o.

Proposition 1.2.30. Let (·, ·) be a scalar product on a linear space X. Then
(i) the so-called Schwarz inequality
$$|(x, y)|^2 \le (x, x)(y, y) \tag{1.2.12}$$
holds for all x, y ∈ X;
(ii) the mapping $\|\cdot\| : x \mapsto [(x, x)]^{1/2}$ is a norm on X.

Proof. Assertion (i). For x, y ∈ X there exists c ∈ C, |c| = 1, such that for $\hat{y} = cy$ we have $(x, \hat{y}) \in \mathbb{R}$. Hence it suffices to prove (1.2.12) for the real space X. For any α ∈ R we have

$$0 \le (x + \alpha y, x + \alpha y) = (x, x) + 2\alpha(x, y) + |\alpha|^2 (y, y),$$

i.e., the discriminant $4|(x, y)|^2 - 4(x, x)(y, y)$ is nonpositive. Hence (1.2.12) follows.

In assertion (ii), only the triangle inequality has to be checked. For x, y ∈ X we get³³

$$\|x + y\|^2 = (x + y, x + y) = (x, x) + 2\operatorname{Re}(x, y) + (y, y) \le \|x\|^2 + 2|(x, y)| + \|y\|^2$$

and the Schwarz inequality completes the proof.

If X is a linear space with a scalar product we will always consider the norm on X induced by this scalar product. If X is complete with respect to this norm, then X is called a Hilbert space and will be usually denoted by H. We note that if X is not complete there exists a Hilbert space $\tilde{H}$ which is a completion of X.

³² If a space X has a Schauder basis, then X is separable. This is not a serious drawback since most function spaces used in analysis are separable.
³³ Notice here a typical procedure with the norm induced by a scalar product, namely using the second power of the norm in calculation.
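Both assertions of Proposition 1.2.30 can be sanity-checked on $\mathbb{C}^M$ with the standard scalar product. The sketch below is only a random test, not a proof; the dimension M = 4, the sampling and the rounding tolerance are choices made here.

```python
import random

def inner(x, y):
    """Scalar product on C^M: linear in x, conjugate-linear in y."""
    return sum(a * b.conjugate() for a, b in zip(x, y))

def norm(x):
    return abs(inner(x, x)) ** 0.5

def random_vector(rng, M):
    return [complex(rng.uniform(-1, 1), rng.uniform(-1, 1)) for _ in range(M)]

rng = random.Random(1)
ok = True
for _ in range(1000):
    x, y = random_vector(rng, 4), random_vector(rng, 4)
    # inequality (1.2.12): |(x, y)|^2 <= (x, x)(y, y)
    schwarz = abs(inner(x, y)) ** 2 <= (inner(x, x) * inner(y, y)).real + 1e-12
    # triangle inequality for the induced norm
    triangle = norm([a + b for a, b in zip(x, y)]) <= norm(x) + norm(y) + 1e-12
    ok = ok and schwarz and triangle
print(ok)
```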
Example 1.2.31.
(i) $\mathbb{R}^M$ with the scalar product

$$(x, y) = \sum_{i=1}^{M} \xi_i \eta_i, \qquad x = \sum_{i=1}^{M} \xi_i e_i, \quad y = \sum_{i=1}^{M} \eta_i e_i,$$

($e_1, \dots, e_M$ the standard basis) is a Hilbert space. Similarly, $\mathbb{C}^M$ is a Hilbert space with respect to the scalar product $(x, y) = \sum_{i=1}^{M} \xi_i \overline{\eta_i}$.

(ii) The norm on $L^2(\Omega)$ given by (1.2.3) is induced by the scalar product

$$(f, g)_{L^2(\Omega)} = \int_\Omega f(x) \overline{g(x)} \, dx \tag{1.2.13}$$

(in the complex case). Similarly, for p = 2 the norm (1.2.9) is equivalent to the norm induced by the scalar product

$$(f, g)_{W^{k,2}(\Omega)} = \sum_{|\alpha| \le k} (D^\alpha f, D^\alpha g)_{L^2(\Omega)}.$$

(iii) The “sup norm” on BC(Ω) is not induced by any scalar product. This can be seen from the parallelogram identity

$$\|x + y\|^2 + \|x - y\|^2 = 2\|x\|^2 + 2\|y\|^2, \qquad x, y \in X \tag{1.2.14}$$

which is valid only in such a space X the norm of which is induced by a scalar product. Indeed, if a norm satisfies (1.2.14), then (in the real case)

$$(x, y) = \frac{1}{4}\big(\|x + y\|^2 - \|x - y\|^2\big) \tag{1.2.15}$$

(polarization identity) has all properties of a scalar product, and the induced norm coincides with $\|\cdot\|$. It is not difficult to show that the “sup norm” does not satisfy (1.2.14). Even more is true, namely, the “sup norm” is not equivalent to any norm on BC(Ω) induced by a scalar product. Since $C[0, 1] \subset L^2(0, 1)$, the scalar product (1.2.13) is also a scalar product on C[0, 1]. But the space C[0, 1] is not complete in the $L^2$-norm and, therefore, the $L^2$-norm on C[0, 1] cannot be equivalent to the “sup norm”; only the inequality $\|f\|_{L^2(0,1)} \le \|f\|_{C[0,1]}$ holds. Observe that $L^2(0, 1)$ is a completion of C[0, 1] with respect to the integral norm given by (1.2.3).

The most useful concept in spaces with a scalar product is the following one.
Definition 1.2.32. Let X be a linear space with a scalar product (·, ·).
(1) Subsets A, B ⊂ X are said to be orthogonal (and denoted by A ⊥ B) if (a, b) = 0 for every a ∈ A, b ∈ B.
(2) A system $\{x_\gamma\}_{\gamma \in \Gamma} \subset X$ is said to be orthonormal if
$$(x_\gamma, x_{\tilde{\gamma}}) = \begin{cases} 0, & \gamma \ne \tilde{\gamma}, \\ 1, & \gamma = \tilde{\gamma}. \end{cases}$$
(3) A sequence $\{e_n\}_{n=1}^{\infty} \subset X$ is called an orthonormal basis of X if $\{e_n\}_{n=1}^{\infty}$ is both an orthonormal system and a Schauder basis of X.

Suppose that $x_1, \dots, x_n$ are linearly independent elements of a space X with a scalar product (·, ·). Put $e_1 = \frac{x_1}{\|x_1\|}$ and if orthonormal elements $e_1, \dots, e_k$ (k < n) are constructed in such a way that

$$\operatorname{Lin}\{x_1, \dots, x_k\} = \operatorname{Lin}\{e_1, \dots, e_k\},$$

then define

$$y_{k+1} = x_{k+1} - \sum_{j=1}^{k} (x_{k+1}, e_j) e_j, \qquad e_{k+1} = \frac{y_{k+1}}{\|y_{k+1}\|}.$$

It is obvious that

$$(e_j, e_{k+1}) = 0, \quad j = 1, \dots, k, \qquad \|e_{k+1}\| = 1$$

and $\operatorname{Lin}\{x_1, \dots, x_{k+1}\} = \operatorname{Lin}\{e_1, \dots, e_{k+1}\}$. This procedure is called the Schmidt orthogonalization. For any $x \in Y := \operatorname{Lin}\{x_1, \dots, x_n\}$ we have $x = \sum_{k=1}^{n} \alpha_k e_k$. Taking the scalar product with $e_j$, we get

$$(x, e_j) = \sum_{k=1}^{n} \alpha_k (e_k, e_j) = \alpha_j,$$

and also

$$\|x\|^2 = \Big(\sum_{j=1}^{n} (x, e_j) e_j, \sum_{k=1}^{n} (x, e_k) e_k\Big) = \sum_{k=1}^{n} |(x, e_k)|^2.$$
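The Schmidt orthogonalization translates directly into a short routine. The sketch below assumes the Euclidean scalar product on $\mathbb{R}^M$ and uses a small threshold to detect (numerically) dependent input; the function names and the threshold are choices made here for illustration.

```python
import math

def dot(x, y):
    return sum(a * b for a, b in zip(x, y))

def schmidt(vectors):
    """Orthonormalize linearly independent vectors of R^M (Euclidean scalar product)."""
    basis = []
    for x in vectors:
        # y_{k+1} = x_{k+1} - sum_j (x_{k+1}, e_j) e_j
        y = list(x)
        for e in basis:
            c = dot(x, e)
            y = [yi - c * ei for yi, ei in zip(y, e)]
        length = math.sqrt(dot(y, y))
        if length < 1e-12:
            raise ValueError("vectors are (numerically) linearly dependent")
        basis.append([yi / length for yi in y])   # e_{k+1} = y_{k+1} / ||y_{k+1}||
    return basis

e = schmidt([[1.0, 1.0, 0.0], [1.0, 0.0, 1.0], [0.0, 1.0, 1.0]])
gram = [[round(dot(a, b), 10) for b in e] for a in e]
print(gram)   # Gram matrix of the result, the identity up to rounding
```

With the orthonormal family `e` at hand, the coefficients of any x in the span are simply `dot(x, e[k])`, and the sum of their squares equals `dot(x, x)`.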
Assume now that X ≠ Y and let us look for an approximation of a $y \in X \setminus Y$ by an element $x = \sum_{j=1}^{n} \alpha_j e_j$:

$$\begin{aligned} \|y - x\|^2 &= \Big(y - \sum_{j=1}^{n} \alpha_j e_j,\ y - \sum_{j=1}^{n} \alpha_j e_j\Big) \\ &= \|y\|^2 - \sum_{j=1}^{n} \overline{\alpha_j}\,(y, e_j) - \sum_{j=1}^{n} \alpha_j \overline{(y, e_j)} + \sum_{j=1}^{n} |\alpha_j|^2 \\ &= \|y\|^2 + \sum_{j=1}^{n} |\alpha_j - (y, e_j)|^2 - \sum_{j=1}^{n} |(y, e_j)|^2 \\ &\ge \|y\|^2 - \sum_{j=1}^{n} |(y, e_j)|^2 = \Big\|y - \sum_{j=1}^{n} (y, e_j) e_j\Big\|^2. \end{aligned} \tag{1.2.16}$$

Two consequences follow from this inequality. First, the best approximation of y ∈ X by an element of Y is

$$P_n y := \sum_{j=1}^{n} (y, e_j) e_j.^{34}$$

Observe also that $(y - P_n y) \perp Y$. Second,

$$\sum_{j=1}^{n} |(y, e_j)|^2 \le \|y\|^2 \quad \text{for all } y \in X.$$

Since n is arbitrary (in an infinite dimensional space) we have obtained the so-called Bessel inequality:

If $\{e_n\}_{n=1}^{\infty}$ is an orthonormal system in X, then

$$\sum_{n=1}^{\infty} |(y, e_n)|^2 \le \|y\|^2 \quad \text{for all } y \in X. \tag{1.2.17}$$

In particular, the sum $\sum_{j=1}^{\infty} |(y, e_j)|^2$ is always convergent.

³⁴ We note that this result, namely the linearity of the operator $P_n$ of the best approximation, is typical for spaces with scalar products. In a general normed linear space X and a finite dimensional subspace Y the best approximation of an arbitrary x ∈ X by elements of Y exists (by a compactness argument) but a special property of the norm is needed for the uniqueness of the best approximation. Linearity of the best approximation on all subspaces of dimension 2 implies that the norm is induced by a scalar product. More details can be found in the monograph of Singer [120].
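The two consequences of (1.2.16) can be observed on a small example. The sketch below assumes $X = \mathbb{R}^4$ with the Euclidean scalar product and a hand-picked orthonormal pair; the vectors, the sampling range and the tolerance are choices made here for illustration.

```python
import math
import random

def dot(x, y):
    return sum(a * b for a, b in zip(x, y))

def project(y, basis):
    """P_n y = sum_j (y, e_j) e_j for an orthonormal family e_1, ..., e_n."""
    p = [0.0] * len(y)
    for e in basis:
        c = dot(y, e)
        p = [pi + c * ei for pi, ei in zip(p, e)]
    return p

s = 1.0 / math.sqrt(2.0)
basis = [[s, s, 0.0, 0.0], [0.0, 0.0, 1.0, 0.0]]    # orthonormal in R^4
y = [3.0, 1.0, 2.0, 5.0]
Py = project(y, basis)
residual = [a - b for a, b in zip(y, Py)]

bessel_sum = sum(dot(y, e) ** 2 for e in basis)     # left-hand side of the Bessel inequality
best = dot(residual, residual)                      # ||y - P_n y||^2

# No sampled element of Y = Lin{e_1, e_2} is closer to y than P_n y.
rng = random.Random(0)
never_closer = True
for _ in range(1000):
    a1, a2 = rng.uniform(-5, 5), rng.uniform(-5, 5)
    x = [a1 * u + a2 * v for u, v in zip(*basis)]
    d = [yi - xi for yi, xi in zip(y, x)]
    never_closer = never_closer and dot(d, d) >= best - 1e-9
print(Py, bessel_sum, dot(y, y), never_closer)
```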
Proposition 1.2.33. Let $X$ be a linear space with a scalar product, let $X$ be separable.³⁵ Then there exists an orthonormal basis in $X$.

Proof. Let $\{x_1, x_2, x_3, \dots\}$ be a dense set in $X$. Put

\[
Y_n = \operatorname{Lin}\{x_1, \dots, x_n\}, \qquad Y = \bigcup_{n=1}^{\infty} Y_n .
\]

Then $\overline{Y} = X$. By omitting linearly dependent elements we can assume that $\dim Y_n = n$. According to the Schmidt orthogonalization there exists an orthonormal sequence $\{e_n\}_{n=1}^{\infty}$ such that $Y_n = \operatorname{Lin}\{e_1, \dots, e_n\}$. Let $x \in X$ and let $\{y_n\}_{n=1}^{\infty}$ be a sequence such that $y_n \in Y_n$ and $\lim_{n \to \infty} y_n = x$ (the density of $Y$ in $X$). By the inequality (1.2.16),

\[
\|x - y_n\| \geq \Big\| x - \sum_{j=1}^{n} (x, e_j) e_j \Big\| .
\]

This means that $x = \sum_{j=1}^{\infty} (x, e_j) e_j$.

To prove uniqueness, suppose that $x = \sum_{j=1}^{\infty} \alpha_j e_j$. Since the scalar product is continuous, we have

\[
(x, e_k) = \lim_{n \to \infty} \Big( \sum_{j=1}^{n} \alpha_j e_j,\; e_k \Big) = \alpha_k .
\]
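The Schmidt orthogonalization used in the proof can be sketched as follows. This is a minimal illustration, assuming vectors in $\mathbb{R}^n$ with the standard scalar product; the function name `gram_schmidt` and the tolerance `tol` are hypothetical, and the loop uses the "modified" variant (projections are subtracted from the running remainder), which is a common numerically safer choice rather than anything prescribed by the text.

```python
import math

def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

def gram_schmidt(vectors, tol=1e-12):
    """Return an orthonormal list e_1, e_2, ... spanning the same space as `vectors`.

    Vectors that are (numerically) linearly dependent on the previous ones
    are skipped, mirroring "omitting linearly dependent elements" in the proof.
    """
    basis = []
    for x in vectors:
        r = list(x)
        # Remove the components of x along the orthonormal vectors found so far.
        for e in basis:
            c = dot(r, e)
            r = [ri - c * ei for ri, ei in zip(r, e)]
        length = math.sqrt(dot(r, r))
        if length > tol:
            basis.append([ri / length for ri in r])
    return basis
```

Calling `gram_schmidt` on a list whose second vector is a multiple of the first simply drops that vector, so that $\operatorname{Lin}\{e_1, \dots, e_n\}$ always agrees with the span of the independent vectors processed so far.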
In order to obtain some useful properties which guarantee that an orthonormal sequence is a basis we need to use completeness. We start with a general approximation result.

Theorem 1.2.34. Let $H$ be a Hilbert space and let $C$ be a closed convex subset of $H$. Then for any $x \in H$ there exists a unique $y \in C$ such that

\[
\|x - y\| = \inf \{ \|x - z\| : z \in C \}. \tag{1.2.18}
\]

This best approximation $y$ is characterized by the following property: $y \in C$ and

\[
\operatorname{Re}(x - y, y - z) \geq 0 \quad \text{for all } z \in C \tag{1.2.19}
\]

(see Figure 1.2.2 ³⁶).

³⁵ The assumption on separability is redundant. Without separability an orthonormal basis $\{e_\gamma\}_{\gamma \in \Gamma}$ still exists but $\Gamma$ is uncountable. Moreover, if $x \in X$, then $(x, e_\gamma) = 0$ for all but countably many $\gamma$.

³⁶ For $A \subset H$ we denote $A^{\perp} := \{x \in H : a \in A \implies (a, x) = 0\}$.
Figure 1.2.2. [Sketch with the labels $x$, $y$, $z$, $C$, $o$, the vectors $x - y$ and $y - z$, the subspace $\{x - y\}^{\perp}$ and the hyperplane $y + \{x - y\}^{\perp}$.]
Proof. Step 1 (Existence). Denote the right-hand side in (1.2.18) by $d$. If $d = 0$, then $x \in C$ ($C$ is closed) and $y = x$. Suppose that $d > 0$. Then there are $\{z_n\}_{n=1}^{\infty} \subset C$ such that

\[
d \leq \|x - z_n\| < d + \frac{1}{n} .
\]

By (1.2.14) we get

\[
\begin{aligned}
\|z_n - z_m\|^2 &= \|x - z_m - (x - z_n)\|^2 \\
&= 2\big( \|x - z_m\|^2 + \|x - z_n\|^2 \big) - 4 \Big\| x - \frac{z_n + z_m}{2} \Big\|^2 \\
&< 2\Big( d + \frac{1}{n} \Big)^2 + 2\Big( d + \frac{1}{m} \Big)^2 - 4d^2
\end{aligned}
\]

(notice that $\frac{z_n + z_m}{2} \in C$ since $C$ is convex). This implies that $\{z_n\}_{n=1}^{\infty}$ is a Cauchy sequence, and therefore it is convergent to a $y \in C$. Obviously, $\|x - y\| = d$.
Step 2 (Uniqueness). Assume that $\|x - y\| = \|x - \tilde{y}\| = d$ for $y, \tilde{y} \in C$. Using (1.2.14) as above we get $y = \tilde{y}$.

Step 3 (Characterization). Let $y$ be the best approximation of $x$ and let $z \in C$. Then $z_t := tz + (1 - t)y \in C$ for $t \in (0, 1)$ ($C$ is convex) and

\[
\|x - z_t\|^2 = \|x - y + t(y - z)\|^2 = \|x - y\|^2 + t^2 \|y - z\|^2 + 2t \operatorname{Re}(x - y, y - z) \geq \|x - y\|^2 ,
\]

i.e.,

\[
t \|y - z\|^2 + 2 \operatorname{Re}(x - y, y - z) \geq 0 ,
\]

and taking the limit for $t \to 0+$, the inequality (1.2.19) follows. If (1.2.19) is satisfied, then

\[
\|x - z\|^2 = \|x - y + y - z\|^2 = \|x - y\|^2 + \|y - z\|^2 + 2 \operatorname{Re}(x - y, y - z) \geq \|x - y\|^2 ,
\]

and therefore $y$ is the best approximation of $x$.
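The characterization (1.2.19) can be observed on a simple closed convex set. The sketch below assumes $H = \mathbb{R}^3$ with the standard scalar product and takes $C$ to be the box $[0,1]^3$, for which the best approximation is obtained by clamping each coordinate; the set, the point `x` and the function names are hypothetical choices made only for illustration.

```python
import math

def project_onto_box(x, lower, upper):
    """Best approximation of x in the box C = {z : lower <= z_i <= upper for all i}."""
    return [min(max(xi, lower), upper) for xi in x]

def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

def dist(u, v):
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(u, v)))

def variational_gap(x, y, z):
    """The quantity (x - y, y - z) from (1.2.19); the space is real, so Re is omitted."""
    return dot([a - b for a, b in zip(x, y)], [b - c for b, c in zip(y, z)])
```

For instance, with `x = [2.0, -1.0, 0.5]` the projection is `[1.0, 0.0, 0.5]`, and `variational_gap(x, y, z)` reduces to $(1 - z_1) + z_2$, which is nonnegative for every $z$ in the box.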
Corollary 1.2.35. Let $H$ be a Hilbert space and $M$ a closed linear subspace of $H$, $M \neq H$, $M \neq \{o\}$. Then there exists a unique subspace $M^{\perp}$ (the so-called orthogonal complement of $M$) such that

\[
H = M \oplus M^{\perp}, \qquad M \perp M^{\perp} .
\]

Moreover, if $P$ denotes the projection to $M$ given by this direct sum³⁷ (the so-called orthogonal projection), then $P \in \mathcal{L}(H)$, $\|P\|_{\mathcal{L}(H)} = 1$ and

\[
(Px, y) = (x, Py) \quad \text{for all } x, y \in H .
\]

Proof. A closed linear subspace $M$ is a closed convex set. Denote by $Px \in M$ the best approximation of $x \in H$ in $M$. Choose $w \in M$ and put $z = Px - w \in M$. By (1.2.19) we get $\operatorname{Re}(x - Px, w) \geq 0$ and also $\operatorname{Re}(x - Px, iw) \geq 0$ (by taking $z = Px - iw$), i.e., $\operatorname{Im}(x - Px, w) \geq 0$. Since also $-w \in M$, we finally have

\[
(x - Px, w) = 0 \quad \text{for all } w \in M . \tag{1.2.20}
\]

It is easy to see that (1.2.20) is a characterization of $Px$. By using (1.2.20) for $\alpha x$, $x_1 + x_2$, we see that $P$ is a linear operator. Since $P^2 = P$, $P$ is a projection onto $M$. The identity (1.2.20) also shows that

\[
\operatorname{Ker} P = M^{\perp} := \{ y \in H : x \in M \implies (x, y) = 0 \} .
\]

By the orthogonality of $Px$ and $x - Px$ we have

\[
\|x\|^2 = \|Px\|^2 + \|x - Px\|^2 \geq \|Px\|^2 , \quad \text{i.e.,} \quad \|P\|_{\mathcal{L}(H)} \leq 1 .
\]

Since $Px = x$ for $x \in M$, $\|P\|_{\mathcal{L}(H)} = 1$. By (1.2.20) we get

\[
(x, Py) = (x - Px + Px, Py) = (Px, Py) = (Px, Py - y + y) = (Px, y) .
\]

³⁷ Cf. Example 1.1.13(i).
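In finite dimensions the properties of the orthogonal projection listed in Corollary 1.2.35 can be verified directly on a matrix. The following sketch assumes $H = \mathbb{R}^3$ and a hypothetical subspace $M = \operatorname{Lin}\{e_1, e_2\}$ given by an orthonormal pair; the helper names are illustrative only.

```python
def projection_matrix(basis):
    """Matrix of P x = sum_k (x, e_k) e_k for an orthonormal family e_k in R^n."""
    n = len(basis[0])
    return [[sum(e[i] * e[j] for e in basis) for j in range(n)] for i in range(n)]

def apply_matrix(matrix, x):
    return [sum(row[j] * x[j] for j in range(len(x))) for row in matrix]

def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

# Hypothetical subspace M = Lin{e1, e2} of R^3 (e1, e2 are orthonormal).
e1 = [1.0, 0.0, 0.0]
e2 = [0.0, 0.6, 0.8]
P = projection_matrix([e1, e2])
```

With this `P` one can check, for arbitrary test vectors, that $P(Px) = Px$, that $(Px, y) = (x, Py)$, that $x - Px$ is orthogonal to $M$, and that $\|Px\| \leq \|x\|$ with equality for $x \in M$.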
Corollary 1.2.36. Let $H$ be a Hilbert space and let $\{e_n\}_{n=1}^{\infty}$ be an orthonormal sequence in $H$. Then the following statements are equivalent:
(i) $\{e_n\}_{n=1}^{\infty}$ is an orthonormal basis;
(ii) if $(x, e_n) = 0$ for all $n$, then $x = o$;
(iii) $\operatorname{Lin}\{e_1, e_2, \dots\}$ is dense in $H$;
(iv) the Parseval equality

\[
\|x\|^2 = \sum_{n=1}^{\infty} |(x, e_n)|^2 \quad \text{is valid for all } x \in H . \tag{1.2.21}
\]

Proof. The implication (i)⇒(ii) is obvious and follows from the definition of the orthonormal basis.

The implication (ii)⇒(iii): Denote $Y = \operatorname{Lin}\{e_1, e_2, \dots\}$. Assume that $Y$ is not dense, i.e., $\overline{Y} \neq H$. By Corollary 1.2.35 there exists $x \in \overline{Y}^{\perp} \setminus \{o\}$. In particular, $(x, e_n) = 0$ for all $n$, a contradiction.

The implication (iii)⇒(iv): The proof of Proposition 1.2.33 shows that the sequence $s_n := \sum_{k=1}^{n} (x, e_k) e_k$ converges to $x$ for all $x \in H$. Moreover, $s_n \perp (x - s_n)$, and hence

\[
\|x\|^2 = \|s_n\|^2 + \|x - s_n\|^2 = \sum_{k=1}^{n} |(x, e_k)|^2 + \|x - s_n\|^2 .
\]

By taking the limit, the Parseval equality follows.

The implication (iv)⇒(i): Let $x \in H$ be arbitrary. For $s_n$ defined as above and $m > n$ we have

\[
\|s_m - s_n\|^2 = \sum_{k=n+1}^{m} |(x, e_k)|^2 .
\]

Since the series in (1.2.21) is convergent, the sequence $\{s_n\}_{n=1}^{\infty}$ is Cauchy, and therefore it is convergent to a $y \in H$ since $H$ is complete. Moreover, $(y, e_n) = (x, e_n)$, and by the Parseval equality

\[
\|x - y\|^2 = \sum_{n=1}^{\infty} |(x - y, e_n)|^2 = 0 .
\]
Remark 1.2.37. Let $H$ be a Hilbert space and $\{e_n\}_{n=1}^{\infty}$ an orthonormal basis in $H$. The proof of the last implication shows that for an arbitrary sequence $\{\alpha_n\}_{n=1}^{\infty} \subset \mathbb{R}$ (or $\mathbb{C}$ depending on whether $H$ is a real or complex space) for which $\sum_{n=1}^{\infty} |\alpha_n|^2$ is convergent, the series $\sum_{n=1}^{\infty} \alpha_n e_n$ is convergent in $H$ to an $x \in H$ and $(x, e_n) = \alpha_n$. Moreover, the operator

\[
U : x \in H \mapsto \{(x, e_n)\}_{n=1}^{\infty} \in l^2(\mathbb{N})
\]

(see footnote 38) is a unitary operator (i.e., $(Ux, Uy)_{l^2(\mathbb{N})} = (x, y)$, $x, y \in H$) which is surjective. It implies also that all infinite dimensional separable Hilbert spaces over the same field of scalars are unitarily equivalent. This statement is known as the Riesz–Fischer Theorem. Having this result we can ask why not restrict our attention only to a single abstract separable Hilbert space. The reason is that in a special function space like $W^{k,2}(\Omega)$ one has more ways of computation since its elements are functions.

Example 1.2.38.
(i) The space $L^2(-\pi, \pi)$ is a Hilbert space. It is separable since continuous $2\pi$-periodic functions are dense in $L^2(-\pi, \pi)$ and any such function can be approximated by trigonometric polynomials of the type $\sum_{k=-n}^{n} a_k e^{ikt}$ (either the classical Weierstrass Approximation Theorem or Theorem 1.2.14). It is easy to see that

\[
e_n : t \mapsto \frac{1}{\sqrt{2\pi}} e^{int}, \quad t \in (-\pi, \pi), \; n \in \mathbb{Z},
\]

form an orthonormal system in $L^2(-\pi, \pi)$. By Corollary 1.2.36(iii) it is also an orthonormal basis.³⁹
(ii) Functions

\[
H_n(t) e^{-\frac{t^2}{2}} \quad \text{where} \quad H_n(t) = (-1)^n e^{t^2} \frac{d^n e^{-t^2}}{dt^n}
\]

(the so-called Hermite polynomials) form (after normalization) an orthonormal basis in $L^2(\mathbb{R})$. For the proof and relevant results in harmonic analysis we recommend the classical book Kaczmarz & Steinhaus [70]. We note that there are many different orthonormal bases in $L^2$-spaces. We will present one general method of their construction in Theorem 2.2.16.

³⁸ $l^2(\mathbb{N})$ is the space of all (generally complex) sequences $x = \{\xi_n\}_{n=1}^{\infty}$ such that $\sum_{n=1}^{\infty} |\xi_n|^2$ is convergent. The scalar product on $l^2(\mathbb{N})$ is given by $(x, y)_{l^2(\mathbb{N})} = \sum_{n=1}^{\infty} \xi_n \overline{\eta_n}$ for $x = \{\xi_n\}_{n=1}^{\infty}$, $y = \{\eta_n\}_{n=1}^{\infty}$ (see also Remark 1.2.24).

³⁹ Here this means that

\[
f(t) = \sum_{n=-\infty}^{+\infty} \hat{f}(n) e^{int} \quad \text{where} \quad \hat{f}(n) = \frac{1}{\sqrt{2\pi}} (f, e_n)_{L^2(-\pi,\pi)} = \frac{1}{2\pi} \int_{-\pi}^{\pi} f(t) e^{-int} \, dt
\]

and the series is convergent in the $L^2$-norm for arbitrary $f \in L^2(-\pi, \pi)$. It is worth noting that the series is actually a.e. convergent to $f$ but this by no means follows from the norm convergence. This result is due to L. Carleson and it is one of the most difficult and profound results in analysis.
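For the trigonometric system of Example 1.2.38(i) the Parseval equality (1.2.21) can be checked numerically. The sketch below is an illustration only: the test function `f` is a hypothetical trigonometric polynomial, and the integrals are approximated by the rectangle rule on a uniform grid over one period, which reproduces them up to rounding for trigonometric polynomials of such low degree.

```python
import cmath
import math

def inner_with_e_n(f, n, samples=1024):
    """Approximate (f, e_n) in L^2(-pi, pi), where e_n(t) = exp(i n t) / sqrt(2 pi)."""
    h = 2 * math.pi / samples
    total = 0j
    for k in range(samples):
        t = -math.pi + k * h
        total += f(t) * cmath.exp(-1j * n * t)   # f(t) times the conjugate of exp(i n t)
    return total * h / math.sqrt(2 * math.pi)

def norm_squared(f, samples=1024):
    """Approximate the squared L^2(-pi, pi) norm of f."""
    h = 2 * math.pi / samples
    return sum(abs(f(-math.pi + k * h)) ** 2 for k in range(samples)) * h

def f(t):
    # Hypothetical test function: a real trigonometric polynomial of degree 2.
    return 1 + 2 * math.cos(t) + 3 * math.sin(2 * t)

# Only the coefficients with |n| <= 2 are nonzero for this f.
parseval_sum = sum(abs(inner_with_e_n(f, n)) ** 2 for n in range(-2, 3))
```

Both `parseval_sum` and `norm_squared(f)` agree with the exact value $2\pi + 4\pi + 9\pi = 15\pi$ up to rounding.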
Proposition 1.2.39. Let $\{e_n\}_{n=1}^{\infty}$ be an orthonormal basis in a Hilbert space $H$. Then a bounded set $M \subset H$ is relatively compact if and only if for any $\varepsilon > 0$ there is $k \in \mathbb{N}$ such that

\[
\sum_{n=k}^{\infty} |(x, e_n)|^2 < \varepsilon \quad \text{for all } x \in M .
\]

Proof. The statement follows from Proposition 1.2.3.
Theorem 1.2.40 (Riesz Representation Theorem). Let $H$ be a Hilbert space and let $F$ be a continuous linear form on $H$. Then there is a unique $f \in H$ such that

\[
F(x) = (x, f) \quad \text{for all } x \in H .
\]

Moreover, $\|F\| = \|f\|$ where $\|F\| = \|F\|_{\mathcal{L}(H, \mathbb{R})}$ or $\|F\| = \|F\|_{\mathcal{L}(H, \mathbb{C})}$ depending on whether $H$ is a real or a complex space.

Proof. Suppose that $H$ is a complex Hilbert space. If $F = o$, then $f = o$. Suppose that $F \neq o$. The idea of constructing $f$ is that $f$ has to be orthogonal to $\operatorname{Ker} F$ which is a closed subspace of $H$. By Corollary 1.2.35, $H = \operatorname{Ker} F \oplus (\operatorname{Ker} F)^{\perp}$. Take $x_0 \in (\operatorname{Ker} F)^{\perp}$, $\|x_0\| = 1$, and put $f = \alpha x_0$ where $\alpha$ will be determined later. Let $x = y + \beta x_0$, $y \in \operatorname{Ker} F$, $\beta \in \mathbb{C}$ be arbitrary. Then

\[
(x, f) = \beta \overline{\alpha}, \qquad F(x) = \beta F(x_0) .
\]

Choose now $\alpha = \overline{F(x_0)}$. If there is another $g \in H$ such that $F(x) = (x, g)$, $x \in H$, then $0 = (x, f - g)$ for all $x \in H$, in particular, for $x = f - g$. Therefore $f = g$. By the Schwarz inequality (1.2.12) we obtain

\[
|F(x)| = |(x, f)| \leq \|x\| \|f\| , \quad \text{i.e.,} \quad \|F\| \leq \|f\| .
\]

Since $F(f) = \|f\|^2$, we have $\|F\| \geq \|f\|$. This shows that $\|F\| = \|f\|$.
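In the finite dimensional space $\mathbb{C}^n$ the representing element can be written down explicitly, which makes the role of the complex conjugate visible. The following sketch assumes the scalar product $(x, f) = \sum_i x_i \overline{f_i}$ and a hypothetical linear form $F(x) = \sum_i c_i x_i$; all names and the coefficient vector `c` are illustrative.

```python
import math

def inner(x, f):
    """Scalar product on C^n: linear in the first slot, conjugate-linear in the second."""
    return sum(a * b.conjugate() for a, b in zip(x, f))

def F(x, c):
    """The linear form F(x) = sum_i c_i x_i."""
    return sum(ci * xi for ci, xi in zip(c, x))

def riesz_representer(c):
    """Return f with F(x) = (x, f) for all x, i.e. f_i = conj(c_i)."""
    return [ci.conjugate() for ci in c]

c = [1 + 2j, -3j, 0.5 + 0j]          # hypothetical coefficients of F
f = riesz_representer(c)
norm_f = math.sqrt(inner(f, f).real)  # equals the norm of F by the theorem
```

Evaluating `F(f, c)` returns $\|f\|^2$, which is the step of the proof showing $\|F\| \geq \|f\|$.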
The following variant of the Riesz Representation Theorem is often used in the functional analysis approach to differential equations (see, e.g., Evans [48]).

Proposition 1.2.41 (Lax–Milgram). Let $H$ be a complex Hilbert space and let $B : H \times H \to \mathbb{C}$ be a mapping with the following properties:
(i) The mapping $x \mapsto B(x, y)$ is linear for any $y \in H$.
(ii) $B(x, \alpha_1 y_1 + \alpha_2 y_2) = \overline{\alpha_1} B(x, y_1) + \overline{\alpha_2} B(x, y_2)$ for every $x, y_1, y_2 \in H$, $\alpha_1, \alpha_2 \in \mathbb{C}$.
(iii) There is a constant $c$ such that $|B(x, y)| \leq c \|x\| \|y\|$ for every $x, y \in H$.
Then there is $A \in \mathcal{L}(H)$, $\|A\|_{\mathcal{L}(H)} \leq c$, such that

\[
B(x, y) = (x, Ay), \quad x, y \in H .
\]

Moreover,
(iv) if there is a positive constant $d$ such that $B(x, x) \geq d \|x\|^2$ for each $x \in H$, then $A$ is invertible, $A^{-1} \in \mathcal{L}(H)$ and $\|A^{-1}\|_{\mathcal{L}(H)} \leq \frac{1}{d}$.

Proof. The existence of $A$ follows from (i), (iii) and the Riesz Representation Theorem. The property (ii) yields the linearity of $A$. Since

\[
\|Ay\|^2 = (Ay, Ay) = B(Ay, y) \leq c \|Ay\| \|y\| ,
\]

we have $\|Ay\| \leq c \|y\|$, i.e., $A \in \mathcal{L}(H)$ and $\|A\|_{\mathcal{L}(H)} \leq c$. The property (iv) means that

\[
d \|y\|^2 \leq B(y, y) = (y, Ay) \leq \|y\| \|Ay\| , \quad \text{i.e.,} \quad \|Ay\| \geq d \|y\| \quad \text{for all } y \in H . \tag{1.2.22}
\]

In particular, $A$ is injective. Moreover, $\operatorname{Im} A$ is a closed subspace of $H$. Indeed, let $Ay_n \to z \in \overline{\operatorname{Im} A}$. By (1.2.22), $\{y_n\}_{n=1}^{\infty}$ is a Cauchy sequence, and hence it is convergent to a $y \in H$. By continuity of $A$, $Ay = z$, i.e., $z \in \operatorname{Im} A$. In fact, $\operatorname{Im} A = H$. Indeed, if $w \in (\operatorname{Im} A)^{\perp}$, then

\[
d \|w\|^2 \leq B(w, w) = (w, Aw) = 0 \quad \text{and} \quad w = o .
\]

So $\operatorname{Dom}(A^{-1}) = \operatorname{Im} A = H$ and (1.2.22) implies that $\|A^{-1}\|_{\mathcal{L}(H)} \leq \frac{1}{d}$.
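A small finite dimensional instance may help to see what the constants $c$ and $d$ control. The sketch below works in the real space $\mathbb{R}^2$ (the proposition is stated for complex $H$; the real analogue is assumed here) with a hypothetical nonsymmetric matrix `M` and the form $B(x, y) = x^{\mathsf T} M y$, so that $A = M$ and $B(x, x) = 2\|x\|^2$, i.e., $d = 2$.

```python
import math

M = [[2.0, 1.0], [-1.0, 2.0]]   # hypothetical matrix defining B(x, y) = x^T M y

def B(x, y):
    return sum(x[i] * M[i][j] * y[j] for i in range(2) for j in range(2))

def solve(g):
    """Solve A u = g, where A = M is the operator with B(x, y) = (x, A y) (Cramer's rule)."""
    det = M[0][0] * M[1][1] - M[0][1] * M[1][0]
    return [(g[0] * M[1][1] - M[0][1] * g[1]) / det,
            (M[0][0] * g[1] - g[0] * M[1][0]) / det]

def norm(x):
    return math.sqrt(sum(t * t for t in x))

d = 2.0  # coercivity constant: the skew-symmetric part of M drops out of B(x, x)
```

For a right-hand side `g`, the vector `u = solve(g)` satisfies $B(x, u) = (x, g)$ for all $x$, and the bound $\|A^{-1}\| \leq 1/d$ shows up as $\|u\| \leq \|g\| / d$.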
Exercise 1.2.42. Let $\{F_\alpha\}_{\alpha \in A}$ be a system of closed subsets of a compact space $M$. Prove the finite intersection property:

If $\bigcap_{\alpha \in K} F_\alpha \neq \emptyset$ for any finite $K \subset A$, then $\bigcap_{\alpha \in A} F_\alpha \neq \emptyset$.

(This property characterizes compact spaces.)
Hint. Suppose not. Then $\{M \setminus F_\alpha\}_{\alpha \in A}$ is an open covering of $M$.
Exercise 1.2.43. Prove that $F \subset C[a, b]$ is relatively compact if and only if $F$ is bounded in $C[a, b]$ and the following equicontinuity condition is satisfied:

\[
\forall \varepsilon > 0 \;\; \exists \delta > 0 \;\; \forall f \in F : \quad x, y \in [a, b], \; |x - y| < \delta \implies |f(x) - f(y)| < \varepsilon .
\]

Hint. Use Proposition 1.2.3. Obviously, this statement is also a special case of Theorem 1.2.13.
Exercise 1.2.44. Let $\{e_n\}_{n=1}^{\infty}$ be an orthonormal basis in a Hilbert space $H$. Define

\[
f(x) =
\begin{cases}
n & \text{if } x = e_n , \\
n (1 - 2 \|x - e_n\|) & \text{if } \|x - e_n\| < \frac{1}{2} , \\
0 & \text{otherwise.}
\end{cases}
\]

Show that $f$ is a well-defined continuous functional on $H$ which is not bounded on the closed unit ball.

Exercise 1.2.45. Let $\emptyset \neq M \subset X$ be a subset of a normed linear space $X$. For $x \in X$ set

\[
\operatorname{dist}(x, M) := \inf \{ \|x - y\| : y \in M \} .
\]

Prove that for any $x_1, x_2 \in X$ we have

\[
| \operatorname{dist}(x_1, M) - \operatorname{dist}(x_2, M) | \leq \|x_1 - x_2\| .
\]

Hint. Assume $\operatorname{dist}(x_1, M) \geq \operatorname{dist}(x_2, M)$. For any $\varepsilon > 0$ there exists $x_\varepsilon \in M$ such that $\|x_2 - x_\varepsilon\| < \operatorname{dist}(x_2, M) + \varepsilon$. Use the triangle inequality for $\|x_1 - x_\varepsilon\|$.

Exercise 1.2.46.⁴⁰ Let $\Omega$ be a bounded open set in $\mathbb{R}^M$. For $p \in [1, \infty)$ and $k \in \mathbb{N}$ define $W_0^{k,p}(\Omega)$ to be the closure of $\mathcal{D}(\Omega)$ with respect to the $W^{k,p}(\Omega)$-norm (1.2.9).
(i) Prove that $W_0^{k,p}(\Omega) \subset W^{k,p}(\Omega)$ and $W_0^{k,p}(\Omega)$ need not be dense in $W^{k,p}(\Omega)$ (compare it with the statement of Theorem 1.2.28(iii); see also the Trace Theorem (Theorem 7.3.1)).
(ii) Prove the Poincaré inequality: There exists a constant $c_p$ such that for all $u \in W_0^{1,p}(\Omega)$ the inequality

\[
\int_\Omega |u(x)|^p \, dx \leq c_p \int_\Omega \|\nabla u(x)\|^p \, dx
\]

holds.⁴¹
Hint. It suffices to prove the assertion for $u \in \mathcal{D}(\Omega)$. Consider first $\Omega = (0, 1)$ and use the Mean Value Theorem. Then suppose (without loss of generality) $\Omega \subset \tilde{\Omega} := (0, d) \times \mathbb{R}^{M-1}$ and notice that $\mathcal{D}(\Omega) \subset \mathcal{D}(\tilde{\Omega})$.
(iii) Use the Poincaré inequality to prove that

\[
|u|_{W_0^{1,p}(\Omega)} := \Big( \int_\Omega \|\nabla u(x)\|^p \, dx \Big)^{\frac{1}{p}}
\]

is an equivalent norm on $W_0^{1,p}(\Omega)$ with the norm

\[
\|u\|_{W_0^{1,p}(\Omega)} := \Big( \int_\Omega |u(x)|^p \, dx \Big)^{\frac{1}{p}} + \Big( \int_\Omega \|\nabla u(x)\|^p \, dx \Big)^{\frac{1}{p}} .
\]

⁴⁰ Supplement to Example 1.2.25.

⁴¹ Finding the smallest possible value of the constant $c_p$ is a much more difficult problem. See also Exercise 6.3.19 and Example 7.4.4. Here $\nabla u(x) = \big( \frac{\partial u}{\partial x_1}, \dots, \frac{\partial u}{\partial x_M} \big)$, where $\frac{\partial}{\partial x_i}$, $i = 1, \dots, M$, are weak derivatives (see (1.2.8)), is the gradient of $u$.
Exercise 1.2.47. Let $u \in W^{1,p}(0, 1)$, $1 \leq p < \infty$. Prove that functions

\[
u^{+}(x) := \max \{u(x), 0\}, \qquad u^{-}(x) := \max \{-u(x), 0\}
\]

also belong to $W^{1,p}(0, 1)$. We remark that the corresponding result is false for $W^{k,p}(0, 1)$, $k \geq 2$.
Chapter 2
Properties of Linear and Nonlinear Operators

2.1 Linear Operators

In this section we point out some fundamental properties of linear operators in Banach spaces. The key assertions presented are the Uniform Boundedness Principle, the Banach–Steinhaus Theorem, the Open Mapping Theorem, the Hahn–Banach Theorem, the Separation Theorem, the Eberlein–Smulyan Theorem and the Banach Theorem. We recall that the collection of all continuous linear operators from a normed linear space $X$ into a normed linear space $Y$ is denoted by $\mathcal{L}(X, Y)$, and $\mathcal{L}(X, Y)$ is a normed linear space with the norm

\[
\|A\|_{\mathcal{L}(X,Y)} = \sup \{ \|Ax\|_Y : \|x\|_X \leq 1 \} .
\]

Proposition 2.1.1. Let $Y$ be a Banach space. Then $\mathcal{L}(X, Y)$ is a Banach space, too. In particular, the space $X^*$ of all linear continuous forms on $X$ is complete.

Proof. Let $\{A_n\}_{n=1}^{\infty}$ be a Cauchy sequence in $\mathcal{L}(X, Y)$. Then for any $\varepsilon > 0$ there is $n_0 \in \mathbb{N}$ such that for all $n, m \geq n_0$ and $x \in X$,

\[
\|A_n x - A_m x\| \leq \|A_n - A_m\| \|x\| \leq \varepsilon \|x\| .
\]

Since $Y$ is complete, the sequence $\{A_n x\}_{n=1}^{\infty}$ is convergent to a point in $Y$ that can be denoted by $Ax$. Obviously $A$ is a linear operator from $X$ into $Y$ and

\[
\|Ax - A_m x\| = \lim_{n \to \infty} \|A_n x - A_m x\| \leq \varepsilon \|x\| , \quad m \geq n_0 , \; x \in X .
\]

This implies (Proposition 1.2.10) that $A \in \mathcal{L}(X, Y)$ and $\|A - A_m\| \to 0$.

The importance of this result can be seen from the following statement.
Proposition 2.1.2. Let $X$ be a Banach space and $A \in \mathcal{L}(X)$. If $\|A\| < 1$, then the operator $I - A$ is continuously invertible and

\[
(I - A)^{-1} = \sum_{n=0}^{\infty} A^n
\]

where the sum is convergent in the $\mathcal{L}(X)$-norm.

Proof. First we prove the convergence. Let $\varepsilon > 0$ be arbitrary. Put $S_k = \sum_{n=0}^{k} A^n$. Then

\[
\|S_l - S_k\| = \Big\| \sum_{n=k+1}^{l} A^n \Big\| \leq \sum_{n=k+1}^{l} \|A^n\| \leq \sum_{n=k+1}^{l} \|A\|^n < \varepsilon \quad \text{for } l > k
\]

provided $k$ is sufficiently large (see footnote 1 for the second inequality). By Proposition 2.1.1, the limit of $S_k$ exists in the $\mathcal{L}(X)$-norm. Denote $B := \lim_{k \to \infty} S_k = \sum_{n=0}^{\infty} A^n$. We have

\[
(I - A) B = \lim_{k \to \infty} (I - A) \sum_{n=0}^{k} A^n = \lim_{k \to \infty} \Big( \sum_{n=0}^{k} A^n - \sum_{n=1}^{k+1} A^n \Big) = \lim_{k \to \infty} (I - A^{k+1}) = I
\]

since $\lim_{n \to \infty} A^n = O$. Similarly,

\[
B (I - A) = I , \quad \text{i.e.,} \quad B = (I - A)^{-1} .
\]
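The series in Proposition 2.1.2 (often called the Neumann series) can be summed numerically for a small matrix. The sketch below is only an illustration: the $2 \times 2$ matrix `A` is a hypothetical example whose absolute row sums are at most $0.5$, so its operator norm induced by the maximum norm is below $1$; the number of terms is an arbitrary choice.

```python
def matmul(a, b):
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)] for i in range(n)]

def identity(n):
    return [[1.0 if i == j else 0.0 for j in range(n)] for i in range(n)]

def neumann_partial_sum(a, terms=60):
    """Partial sum S_k = I + A + ... + A^k with k = terms, approximating (I - A)^{-1}."""
    n = len(a)
    total = identity(n)
    power = identity(n)
    for _ in range(terms):
        power = matmul(power, a)          # power now holds the next power of A
        total = [[total[i][j] + power[i][j] for j in range(n)] for i in range(n)]
    return total

A = [[0.2, 0.3], [0.1, 0.4]]              # hypothetical operator with norm < 1
B = neumann_partial_sum(A)
I_minus_A = [[1.0 - A[0][0], -A[0][1]], [-A[1][0], 1.0 - A[1][1]]]
```

Multiplying `I_minus_A` and `B` in either order returns the identity matrix up to rounding, in line with $(I - A)B = B(I - A) = I$.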
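If $X$ is a complex Banach space and $A \in \mathcal{L}(X)$, we denote

\[
\varrho(A) := \{ \lambda \in \mathbb{C} : \lambda I - A \text{ is continuously invertible in } \mathcal{L}(X) \}
\]

(the so-called resolvent set of $A$) and $\sigma(A) := \mathbb{C} \setminus \varrho(A)$ (the so-called spectrum of $A$).² The operator-valued function

\[
\lambda \mapsto (\lambda I - A)^{-1} , \quad \lambda \in \varrho(A) ,
\]

is called the resolvent of $A$.

¹ If $A \in \mathcal{L}(X, Y)$, $B \in \mathcal{L}(Y, Z)$, then $BA \in \mathcal{L}(X, Z)$ and $\|BA\|_{\mathcal{L}(X,Z)} \leq \|B\|_{\mathcal{L}(Y,Z)} \|A\|_{\mathcal{L}(X,Y)}$.

² The reason for considering only complex spaces consists in the fact that $\sigma(A) \neq \emptyset$ for all $A \in \mathcal{L}(X)$ in this case. This will be proved later in this section (see the discussion following Example 2.1.20).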
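Corollary 2.1.3. Let $X$ be a complex Banach space and $A \in \mathcal{L}(X)$. Then $\varrho(A)$ is an open set and $\{ \lambda : |\lambda| > \|A\| \} \subset \varrho(A)$.

Proof. If $|\lambda| > \|A\|$, then

\[
\lambda I - A = \lambda \Big( I - \frac{A}{\lambda} \Big)
\]

and $\big( I - \frac{A}{\lambda} \big)^{-1} \in \mathcal{L}(X)$ according to Proposition 2.1.2. Hence we have

\[
(\lambda I - A)^{-1} = \sum_{n=0}^{\infty} \frac{A^n}{\lambda^{n+1}} .
\]

(See footnote 3.) Similarly, if $\lambda_0 \in \varrho(A)$, then

\[
\lambda I - A = (\lambda - \lambda_0) I + (\lambda_0 I - A) = (\lambda_0 I - A) \big[ I - (\lambda_0 - \lambda)(\lambda_0 I - A)^{-1} \big] .
\]

For a parameter $\lambda$ such that $\|(\lambda_0 - \lambda)(\lambda_0 I - A)^{-1}\| < 1$, the inverse operator

\[
B = \big[ I - (\lambda_0 - \lambda)(\lambda_0 I - A)^{-1} \big]^{-1}
\]

exists and $(\lambda I - A)^{-1} = B (\lambda_0 I - A)^{-1}$.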
The next theorem together with Theorems 2.1.8 and 2.1.13 is one of the most significant results in linear functional analysis. For the proofs the interested reader can consult textbooks on functional analysis, e.g., Conway [28], Dunford & Schwartz [44], Rudin [112], Yosida [135].

Theorem 2.1.4 (Uniform Boundedness Principle). Let $X$ be a Banach space and $Y$ a normed linear space. If $\{A_\gamma\}_{\gamma \in \Gamma} \subset \mathcal{L}(X, Y)$ is such that the sets $\{ \|A_\gamma x\|_Y : \gamma \in \Gamma \}$ are bounded for all $x \in X$, then $\{ \|A_\gamma\|_{\mathcal{L}(X,Y)} : \gamma \in \Gamma \}$ is also bounded.

This result is the quintessence of several results on approximation of functions in classical analysis and can be used for "modern" proofs of such results. The following example is typical.

Example 2.1.5. There exists a periodic continuous function the Fourier series of which is divergent at zero.⁴ To see this we recall that the $n$th partial sum of the Fourier series of a function $f$ at $0$ is given by

\[
s_n(f)(0) = \frac{1}{2\pi} \int_{-\pi}^{\pi} D_n(0 - t) f(t) \, dt \quad \text{where} \quad D_n(t) = \frac{\sin \big( n + \frac{1}{2} \big) t}{\sin \frac{t}{2}} , \quad 0 < |t| < \pi
\]

(the $n$th Dirichlet kernel). Since $\sigma_n : f \mapsto s_n(f)(0)$ are continuous linear forms on the space $C[-\pi, \pi]$, the sequence of their norms $\big\{ \|\sigma_n\|_{\mathcal{L}(C[-\pi,\pi], \mathbb{R})} \big\}_{n=1}^{\infty}$ should be bounded provided $\sigma_n(f)$ is convergent for all $f \in C[-\pi, \pi]$ (Theorem 2.1.4). One can calculate that

\[
\|\sigma_n\| = \frac{1}{2\pi} \int_{-\pi}^{\pi} |D_n(t)| \, dt ,
\]

and a careful estimate shows that $\|\sigma_n\|$ is like $\log n$ for large $n$.

³ This series actually converges for $\lambda$ such that $|\lambda| > r(A) := \sup \{ |\mu| : \mu \in \sigma(A) \}$ but its proof is more involved. The quantity $r(A)$ is called the spectral radius of $A$.

⁴ Even divergent at uncountably many points but always of measure zero. The set of such "bad" functions is dense in $C[-\pi, \pi]$.
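The logarithmic growth of $\|\sigma_n\|$ can be observed numerically. The sketch below approximates $\frac{1}{2\pi} \int_{-\pi}^{\pi} |D_n(t)| \, dt$ by the midpoint rule; the function names and the number of sample points are hypothetical choices, and the computation is an illustration rather than the "careful estimate" mentioned in the example.

```python
import math

def dirichlet_kernel(n, t):
    """D_n(t) = sin((n + 1/2) t) / sin(t / 2) for 0 < |t| < pi."""
    return math.sin((n + 0.5) * t) / math.sin(0.5 * t)

def sigma_norm(n, samples=20000):
    """Approximate (1 / (2 pi)) * integral over (-pi, pi) of |D_n(t)| dt.

    |D_n| is even, so the integral is twice the one over (0, pi); the midpoint
    rule never evaluates the kernel at the removable singularity t = 0.
    """
    h = math.pi / samples
    total = sum(abs(dirichlet_kernel(n, (k + 0.5) * h)) for k in range(samples))
    return total * h / math.pi
```

For $n = 1$ the kernel is $1 + 2 \cos t$ and the integral can be evaluated by hand, giving $\frac{1}{3} + \frac{2\sqrt{3}}{\pi}$; the values for $n = 10$ and $n = 100$ keep increasing, with a difference of a little less than one, as one expects from growth proportional to $\log n$.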
As indicated in the previous example, Theorem 2.1.4 is essentially an approximation result. This is clearer from its next variant.

Corollary 2.1.6 (Banach–Steinhaus). Let $X$ and $Y$ be Banach spaces and let $\{A_n\}_{n=1}^{\infty} \subset \mathcal{L}(X, Y)$. Then the limits $\lim_{n \to \infty} A_n x$ exist for every $x \in X$ if and only if the following conditions are satisfied:
(i) There is a dense set $M \subset X$ such that $\lim_{n \to \infty} A_n x$ exists for each $x \in M$.
(ii) The sequence of norms $\{ \|A_n\| \}_{n=1}^{\infty}$ is bounded.
Moreover, under these conditions $Ax := \lim_{n \to \infty} A_n x$ exists for all $x \in X$ and $A \in \mathcal{L}(X, Y)$.⁵

The following proposition is also often useful.

Proposition 2.1.7. Let $X$ be a Banach space and $Y$ a normed linear space. If $B : X \times X \to Y$ is a bilinear operator (i.e., linear in both variables) and
(i) for every $y \in X$ the mapping $x \mapsto B(x, y)$ belongs to $\mathcal{L}(X, Y)$;
(ii) for every $x \in X$ the mapping $y \mapsto B(x, y)$ belongs to $\mathcal{L}(X, Y)$,
then there exists a constant $c$ such that

\[
\|B(x, y)\|_Y \leq c \|x\|_X \|y\|_X , \quad x, y \in X .
\]

In particular, if $x_n \to x$, $y_n \to y$, then $B(x_n, y_n) \to B(x, y)$.

Proof. Denote $B_y : x \mapsto B(x, y)$. By (i), $B_y \in \mathcal{L}(X, Y)$ for all $y \in X$, $\|y\| \leq 1$. By (ii), $\|B_y(x)\| \leq c(x)$. The Uniform Boundedness Principle implies the existence of a constant $c$ such that

\[
\sup_{\|x\| \leq 1} \sup_{\|y\| \leq 1} \|B(x, y)\| \leq c .
\]

Theorem 2.1.8 (Open Mapping Theorem). Let $X$, $Y$ be Banach spaces, let $A \in \mathcal{L}(X, Y)$ and let $A$ have a closed range $\operatorname{Im} A$. Then for any open set $G \subset X$ its image $A(G)$ is an open set in $\operatorname{Im} A$. In particular, if $A$ is, in addition, injective and surjective, then $A^{-1} \in \mathcal{L}(Y, X)$.

⁵ This type of convergence is the so-called convergence in the strong operator topology. It is weaker than the norm convergence.
When applied to linear equations $Ax = y$, Theorem 2.1.8 says that the continuous dependence of a solution on the right-hand side is a consequence of the existence and uniqueness result. Such continuous dependence is important for any reasonable numerical approximation. Theorem 2.1.8 can be also used in a "negative" sense:

Example 2.1.9. Denote by

\[
\hat{f}(n) = \frac{1}{2\pi} \int_{-\pi}^{\pi} f(t) e^{-int} \, dt
\]

the $n$th Fourier coefficient of $f \in L^1(-\pi, \pi)$. Since $\hat{f}(n) \to 0$ for $|n| \to \infty$ for all trigonometric polynomials which are dense in $L^1(-\pi, \pi)$, we have

\[
\hat{f}(n) \to 0 \quad \text{for all } f \in L^1(-\pi, \pi)
\]

(the so-called Riemann–Lebesgue Lemma). In other words, $A : f \mapsto \hat{f}(\cdot)$ is a continuous linear operator from $L^1(-\pi, \pi)$ into

\[
c_0(\mathbb{Z}) := \Big\{ \{a_n\}_{n \in \mathbb{Z}} : \lim_{|n| \to \infty} |a_n| = 0 \Big\} , \qquad \|\{a_n\}\|_{c_0(\mathbb{Z})} = \sup |a_n| .
\]

Applications of Fourier series to various problems in analysis (like convolution equations, differential equations, ...) would be much easier if $A$ were a mapping onto $c_0(\mathbb{Z})$. Theorem 2.1.8 shows that this cannot be true for then $A^{-1}$ would be bounded, i.e.,

\[
\|f\|_{L^1(-\pi,\pi)} \leq c \sup |\hat{f}(n)| \quad \text{for all } f \in L^1(-\pi, \pi) .
\]

If $\{D_k\}_{k=1}^{\infty}$ is the sequence of Dirichlet kernels (Example 2.1.5), then

\[
\hat{D}_k(n) =
\begin{cases}
1, & |n| \leq k , \\
0, & |n| > k ,
\end{cases}
\qquad \text{and} \qquad \|D_k\|_{L^1(-\pi,\pi)} \sim \log k ,
\]

a contradiction.
Theorem 2.1.8 also yields a sufficient condition for a linear operator to be continuous. To formulate it we need the notion of a closed operator: Let $X$, $Y$ be normed linear spaces. A linear operator $A : \operatorname{Dom} A \subset X \to Y$ is said to be closed if

\[
\{x_n\}_{n=1}^{\infty} \subset \operatorname{Dom} A , \quad x_n \to x , \quad A x_n \to y
\]

implies that

\[
x \in \operatorname{Dom} A \quad \text{and} \quad Ax = y .
\]
Equivalently, $A$ is a closed operator if and only if the graph of $A$, i.e.,

\[
G(A) := \{ (x, Ax) : x \in \operatorname{Dom} A \} ,
\]

is a closed linear subspace of $X \times Y$.

Corollary 2.1.10 (Closed Graph Theorem). Let $X$, $Y$ be Banach spaces and let $A$ be a closed operator from $\operatorname{Dom} A = X$ into $Y$. Then $A$ is continuous.

Proof. If $G(A)$ denotes the graph of $A$, then put $T(x, Ax) = x$. By Theorem 2.1.8, $T^{-1}$ is continuous, and therefore $A = \pi_2 \circ T^{-1}$ is continuous as well ($\pi_2$ is the projection of $X \times Y$ onto the second component $Y$).

Example 2.1.11. Many differential operators are either closed or have closed extensions. If they are viewed as operators from $X$ into $X$, then they are only densely defined. A very simple example:

\[
X = C[0, 1] , \quad Ax = \dot{x} , \quad \operatorname{Dom} A = \{ x \in X : \dot{x}(t) \text{ exists for all } t \in [0, 1] \text{ and } \dot{x} \in X \} .
\]

A well-known classical result says that $A$ is a closed operator. But $A$ is not continuous. For $x_n(t) = t^n$ we have $\|x_n\| = 1$, $\|\dot{x}_n\| = n$.

Example 2.1.12. Let $X$ be a Banach space and $M$ a linear subspace of $X$. Let $N$ be an (algebraic) complement of $M$ and let $P$ be the corresponding projection onto $M$. Then $P$ is continuous if and only if both $M$ and $N$ are closed. The sufficiency part follows from the Closed Graph Theorem and from an observation that $P$ is closed whenever $M$ and $N$ are closed subspaces. The necessity part is obvious since

\[
M = \operatorname{Ker}(I - P) , \qquad N = \operatorname{Ker} P .
\]

This statement should be compared with the Hilbert space case (Corollary 1.2.35). An important special case is $\operatorname{codim} M < \infty$. By definition, this means that an algebraic direct complement $N$ has a finite dimension ($\operatorname{codim} M := \dim N$) and therefore $N$ is closed (Corollary 1.2.11(i)). If $M$ is closed as well, then any projection onto $M$ is continuous. We postpone the case of $\dim M < \infty$ to Remark 2.1.19. We note that if $X$ is a Banach space such that there exists a continuous projection $P$, $\|P\|_{\mathcal{L}(X)} \leq 1$, onto every closed subspace of $X$, then $X$ has an equivalent norm induced by the scalar product on $X$ (see Kakutani [71]).
Now we turn our attention to the dual space $X^*$ of all continuous linear forms on a normed linear space $X$. In Section 1.1 we have seen the importance of linear forms. Namely, they allowed us to define an algebraic adjoint operator $A^{\#}$ and formulate Theorem 1.1.25. The dual space $X^*$ is even more important for a normed linear space $X$ since another topology can be introduced on $X$ with help of $X^*$ which in a certain sense has better properties (Theorem 2.1.25 below). Surprisingly, the following basic result does not need any topology.

Theorem 2.1.13 (Hahn–Banach). Let $X$ be a real linear space and let $Y$ be a linear subspace of $X$. Assume that $f$ is a linear form on $Y$ which is dominated by a sublinear functional $p$.⁶ Then there exists $F \in X^{\#}$ such that
(i) $F(y) = f(y)$ for all $y \in Y$ (extension);
(ii) $F(x) \leq p(x)$ for all $x \in X$ (dominance).

Proof. The proof is based on an extension of $f$ to a subspace whose dimension is larger by 1 and such that this extension is dominated by the same $p$, and the use of Zorn's Lemma as an inductive argument, similarly as in the proof of Theorem 1.1.3.

Remark 2.1.14. If $X$ is a complex linear space, then we need $p$ to satisfy a stronger condition than (2) in footnote 6, namely
(2′) $p(\alpha x) = |\alpha| p(x)$, $\alpha \in \mathbb{C}$, $x \in X$.
In this case $p$ is called a semi-norm.⁷ The dominance also has to be stronger: $|f(x)| \leq p(x)$. The extension result follows from Theorem 2.1.13 by considering $\operatorname{Re} f$ and $\operatorname{Im} f$ and observing that $\operatorname{Re} f(ix) = -\operatorname{Im} f(x)$.

Corollary 2.1.15. Let $X$ be a normed linear space and let $Y$ be a linear subspace of $X$ (not necessarily closed). If $f \in Y^*$, then there exists $F \in X^*$ such that
(i) $F(y) = f(y)$ for $y \in Y$;
(ii) $\|F\|_{X^*} = \|f\|_{Y^*}$.

Proof. Put $p(x) = \|f\| \|x\|$, $x \in X$, and apply Theorem 2.1.13 or Remark 2.1.14, respectively.

Corollary 2.1.16 (Dual Characterization of the Norm). Let $X$ be a normed linear space. Then

\[
\|x\|_X = \max \{ |f(x)| : f \in X^* \text{ with } \|f\|_{X^*} \leq 1 \} . \tag{2.1.1}
\]

⁶ A mapping $p : X \to \mathbb{R}$ is called sublinear if (1) $p(x + y) \leq p(x) + p(y)$ for any $x, y \in X$; (2) $p(\alpha x) = \alpha p(x)$ for any $x \in X$ and $\alpha \geq 0$.

⁷ The difference between a norm and a semi-norm is that a semi-norm need not satisfy the condition: $p(x) = 0 \implies x = o$.
Proof. Put $g_0(\alpha x) = \alpha \|x\|$, $\alpha \in \mathbb{R}$ (or $\alpha \in \mathbb{C}$). Then $g_0$ is a continuous linear form on $\operatorname{Lin}\{x\}$ and its norm is 1 (provided $x \neq o$). Let $f_0$ be its extension from Corollary 2.1.15. Then

\[
f_0(x) = \|x\| , \quad \|f_0\| = 1 , \quad \text{i.e.,} \quad \|x\| \leq \sup \{ |f(x)| : f \in X^* \text{ with } \|f\| \leq 1 \} .
\]

The converse inequality follows from the definition of $\|f\|$.

Remark 2.1.17.
(i) If $X$ is a Hilbert space, then the equality (2.1.1) can be obtained immediately from the Riesz Representation Theorem (Theorem 1.2.40). This theorem can be often used in Hilbert spaces instead of the Hahn–Banach Theorem.
(ii) A slightly weaker form of (2.1.1) is often used: If $f(x) = 0$ for all $f \in X^*$, then $x = o$. The equivalent assertion reads as follows: $X^*$ separates points of $X$.

Corollary 2.1.18 (Separation Theorem). Let $X$ be a normed linear space and let $C$ be a nonempty, closed, convex set. If $x_0 \notin C$, then there exists $F \in X^*$ such that

\[
\sup \{ \operatorname{Re} F(x) : x \in C \} < \operatorname{Re} F(x_0) . \tag{2.1.2}
\]

Proof. It is sufficient to give the proof for a real space $X$ and under the additional assumption $o \in C$. In particular, this assumption means that $x_0 \neq o$. We wish to extend the form $f$ defined on $\operatorname{Lin}\{x_0\}$ by $f(\alpha x_0) = \alpha$, $\alpha \in \mathbb{R}$. To do that we need a suitable dominating functional. Since $d := \operatorname{dist}(x_0, C) > 0$, there exists a convex neighborhood of $C$ which does not contain $x_0$, e.g.,

\[
K = \Big\{ x + y : x \in C , \; \|y\| < \frac{d}{2} \Big\} .
\]

Put

\[
p_K(z) := \inf \Big\{ \alpha > 0 : \frac{z}{\alpha} \in K \Big\} \quad \text{for } z \in X .
\]

(See footnote 8.) It is a matter of simple calculation to show that $p_K$ is sublinear, $p_K(x_0) > 1$, and $p_K(z) \leq 1$ for $z \in K$. Let $F$ be an extension of $f$ given by Theorem 2.1.13. Since $o \in C$, we have

\[
F(\pm y) \leq p_K(\pm y) \leq 1 \quad \text{for } \|y\| < \frac{d}{2} .
\]

This shows that

\[
\|F\| \leq \frac{2}{d} , \quad \text{i.e.,} \quad F \in X^* .
\]

⁸ $p_K$ is the so-called Minkowski functional of the convex set $K$.
The inequality (2.1.2) follows from domination: namely, we have

\[
F(x) + F(y) \leq p_K(x + y) \leq 1 \quad \text{for } x \in C \text{ and all } \|y\| < \frac{d}{2} ,
\]

i.e.,

\[
F(x) \leq 1 - \sup \Big\{ F(y) : \|y\| < \frac{d}{2} \Big\} < 1 = F(x_0) .
\]

Remark 2.1.19. If $C$ from Corollary 2.1.18 is a closed linear subspace of $X$ and $F \in X^*$ satisfies (2.1.2), then $F(x) = 0$ for all $x \in C$. Notice that $F(x_0) = 1$ for $F$ which has been constructed in the proof. This observation yields the existence of a continuous projection onto a finite dimensional subspace $Y$ of $X$. Namely, suppose that $\{y_1, \dots, y_n\}$ is a basis of $Y$, and denote by $Y_k$ the span of $y_1, \dots, y_{k-1}, y_{k+1}, \dots, y_n$. Then $Y_k$ is a closed linear subspace of $X$ and $y_k \notin Y_k$. Let $F_k \in X^*$ be such that

\[
F_k(y_j) =
\begin{cases}
1, & j = k , \\
0, & j \neq k ,
\end{cases}
\qquad j = 1, \dots, n .
\]

Then

\[
Px = \sum_{k=1}^{n} F_k(x) y_k
\]

is a continuous projection onto $Y$.

Warning. It is not true that every projection onto $Y$ is continuous even if $\dim Y = 1$ but the construction (i.e., the construction of a noncontinuous linear form) is not obvious!

Example 2.1.20.
(i) By Corollary 1.2.11(ii), $(\mathbb{R}^M)^* = (\mathbb{R}^M)^{\#}$. This means that $(\mathbb{R}^M)^*$ can be identified with $\mathbb{R}^M$.
(ii) Let $K$ be a compact subset of $\mathbb{R}^M$. Then for any $F \in [C(K)]^*$ there exists a unique complex Borel measure $\mu$ on $K$ such that

\[
F(f) = \int_K f(x) \, d\mu(x) \quad \text{for every } f \in C(K) ,
\]

and $\|F\|_{[C(K)]^*} = |\mu|(K)$ where $|\mu|$ is the total variation of $\mu$. A similar statement holds under a more general assumption on $K$; for details and the corresponding notions see Dunford & Schwartz [44, Section IV, 6] or Rudin [113, Chapter 6] and, especially, Bourbaki [14]. In the last book the integration theory is developed on the basis of this representation theorem.
(iii) Let $\Omega$ be an open subset of $\mathbb{R}^M$ and let $p \in [1, \infty)$. Then the dual space $[L^p(\Omega)]^*$ can be identified with $L^{p'}(\Omega)$ ($p'$ is the conjugate exponent, i.e.,
$\frac{1}{p} + \frac{1}{p'} = 1$) in the following sense. For any $F \in [L^p(\Omega)]^*$ there exists a unique $\varphi \in L^{p'}(\Omega)$ such that

\[
F(f) = \int_\Omega f(x) \varphi(x) \, dx \quad \text{for every } f \in L^p(\Omega) .
\]

Moreover, $\|F\|_{[L^p(\Omega)]^*} = \|\varphi\|_{L^{p'}(\Omega)}$. Details can be found in books cited above.
Warning. The dual space $[L^\infty(\Omega)]^*$ is much larger than $L^1(\Omega)$!
(iv) The dual spaces to Sobolev spaces $W^{k,p}(\mathbb{R}^M)$ can be identified with special subspaces of tempered distributions for example via the Fourier transform. We omit details since their description is beyond the scope of this book.

The reader can ask why we are so interested in continuous linear forms. One of the reasons is the following. Suppose that $\varphi$ is a vector-valued function (i.e., a mapping from $\mathbb{R}$ or $\mathbb{C}$ into a normed linear space $X$). For any $f \in X^*$ the composition $f \circ \varphi$ is a real or complex function of a real or complex variable and therefore results of classical analysis can be applied to $f \circ \varphi$. To be more specific, consider the resolvent (introduced after Proposition 2.1.2) of $A \in \mathcal{L}(X)$

\[
R(\lambda) x := (\lambda I - A)^{-1} x , \quad \lambda \in \varrho(A) ,
\]

which is an $X$-valued function for every $x \in X$. Then for any $F \in X^*$, the complex function

\[
\varphi(\lambda) = F[(\lambda I - A)^{-1} x]
\]

is holomorphic in $\varrho(A)$. For $|\lambda| > \|A\|$ we also have

\[
|\varphi(\lambda)| \leq \|F\|_{X^*} \|(\lambda I - A)^{-1}\|_{\mathcal{L}(X)} \|x\|_X = \|F\| \|x\| \Big\| \sum_{n=0}^{\infty} \frac{A^n}{\lambda^{n+1}} \Big\| \leq \|F\| \|x\| \sum_{n=0}^{\infty} \frac{\|A\|^n}{|\lambda|^{n+1}} ,
\]

and so

\[
\lim_{|\lambda| \to \infty} |\varphi(\lambda)| = 0 .
\]

If $\varrho(A) = \mathbb{C}$, $\varphi$ would be identically zero (by the Liouville Theorem from the complex functions theory). Since this should be true for all $F \in X^*$, we get $(\lambda I - A)^{-1} x = o$ for all $x \in X$, a contradiction. Therefore, the spectrum $\sigma(A)$ is nonempty for each $A \in \mathcal{L}(X)$. This is a generalization of the existence of an eigenvalue of a linear operator in a finite dimensional space and therefore also a generalization of the Fundamental Theorem of Algebra (cf. page 15). It is worth mentioning that the Jordan Canonical Form (Theorem 1.1.34) is based on this result.
Warning. It is not true that any $A \in \mathcal{L}(X)$, $\dim X = \infty$, has an eigenvalue! A simple example is $X = C[0, 1]$, $Ax(t) := t x(t)$.

Our main reason for considering dual spaces comes from an attempt to find a weaker topology on a normed linear space in which bounded sets would be relatively compact. The importance of this fact will become clear in Chapter 6. We also ask the reader to return to Proposition 1.2.2 for motivation.

Definition 2.1.21. Let $\{x_n\}_{n=1}^{\infty}$ be a sequence of elements in a normed linear space $X$. We say that $\{x_n\}_{n=1}^{\infty}$ converges weakly to $x \in X$ (notation $x_n \rightharpoonup x$ or $\text{w-}\lim_{n \to \infty} x_n = x$) if

\[
\lim_{n \to \infty} f(x_n) = f(x) \quad \text{for every } f \in X^* .
\]
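A standard illustration of weak convergence without norm convergence is the sequence $x_n(t) = \sin nt$ in $L^2(0, \pi)$. By the Riesz Representation Theorem every continuous linear form there is $x \mapsto (x, g)$ for some $g \in L^2(0, \pi)$, so weak convergence to $o$ means $(x_n, g) \to 0$ for every such $g$. The sketch below tests this for a single hypothetical choice $g(t) = t$ only, using the midpoint rule; it is meant as a numerical hint, not as a proof.

```python
import math

def pairing(n, g, samples=20000):
    """Approximate the L^2(0, pi) scalar product of x_n(t) = sin(n t) with a real g."""
    h = math.pi / samples
    return sum(math.sin(n * (k + 0.5) * h) * g((k + 0.5) * h) for k in range(samples)) * h

def norm_squared(n, samples=20000):
    """Approximate the squared L^2(0, pi) norm of x_n(t) = sin(n t)."""
    h = math.pi / samples
    return sum(math.sin(n * (k + 0.5) * h) ** 2 for k in range(samples)) * h

def g(t):
    # Hypothetical test element of L^2(0, pi).
    return t
```

Integration by parts gives $(x_n, g) = \pi (-1)^{n+1} / n \to 0$, while $\|x_n\|^2 = \pi / 2$ for every $n$, so the sequence cannot converge to $o$ in the norm.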
Proposition 2.1.22.
(i) (uniqueness) If $x_n \rightharpoonup x$ and $x_n \rightharpoonup y$, then $x = y$.
(ii) If $\lim_{n\to\infty} \|x_n - x\| = 0$, then $x_n \rightharpoonup x$ (footnote 9).
(iii) A weakly convergent sequence is bounded. Moreover, if $x_n \rightharpoonup x$, then $\|x\| \le \liminf_{n\to\infty} \|x_n\|$.
(iv) If $X$ is a uniformly convex Banach space (footnote 10), $x_n \rightharpoonup x$ and $\|x_n\| \to \|x\|$, then $\{x_n\}_{n=1}^{\infty}$ converges to $x$ in the norm topology.

Footnote 9. Warning. The converse statement is not true in general (see Exercise 2.1.37)!

Footnote 10. A Banach space $X$ is said to be uniformly convex if for every $\varepsilon > 0$ there is $\delta > 0$ such that
$$x, y \in X,\ \|x\| = \|y\| = 1,\ \|x - y\| \ge \varepsilon \implies 1 - \left\|\frac{x+y}{2}\right\| \ge \delta.$$
Every uniformly convex space is reflexive, see Yosida [135, Chapter V, 2]. Hilbert spaces, $L^p(\Omega)$-spaces and $W^{1,p}(\Omega)$-spaces ($1 < p < \infty$) are uniformly convex (for a Hilbert space this follows from the parallelogram identity (1.2.14), for the other two cases see, e.g., Adams [2, Corollary 2.29 and Theorem 3.5]).

Proof. Assertion (i) follows immediately from Remark 2.1.17(ii) since in this case $f(x) = f(y)$ for every $f \in X^*$. Assertion (ii) is obvious. Assertion (iii) is basically a consequence of Theorem 2.1.4, but certain preliminaries are needed: Since $X^*$ is a normed linear space, its dual $X^{**} := (X^*)^*$ is defined. Put
$$\kappa(x) : f \mapsto f(x), \quad f \in X^*.$$
Then $\kappa$ (the so-called canonical embedding) is a linear continuous operator from $X$ into $X^{**}$, and
$$\|\kappa(x)\|_{X^{**}} = \sup_{\|f\|_{X^*} \le 1} |f(x)| = \|x\|_X$$
(Corollary 2.1.16; see also footnote 11). Since the space $X^*$ is always complete (Proposition 2.1.1), Theorem 2.1.4 can be applied to the sequence $\{\kappa(x_n)\}_{n=1}^{\infty}$. This shows that $\{x_n\}_{n=1}^{\infty}$ is bounded. If $x_n \rightharpoonup x$, we choose $f \in X^*$ such that
$$\|f\| = 1 \quad \text{and} \quad f(x) = \|x\|$$
(Corollary 2.1.16). Then
$$\|x\| = f(x) = \lim_{n\to\infty} f(x_n) \le \liminf_{n\to\infty} \|x_n\|.$$

Assertion (iv) is obvious for $x = o$. If $x \ne o$, then we may assume that also $x_n \ne o$ and put $y := \frac{x}{\|x\|}$ and $y_n := \frac{x_n}{\|x_n\|}$. Since $x_n \rightharpoonup x$ and $\|x_n\| \to \|x\|$, we have
$$f(y_n) = \frac{1}{\|x_n\|} f(x_n) \to \frac{1}{\|x\|} f(x) = f(y) \quad \text{for any } f \in X^*,$$
i.e., $y_n \rightharpoonup y$. If we prove that $\|y_n - y\| \to 0$, then
$$\|x_n - x\| = \big\| y_n \|x_n\| - y \|x\| \big\| \le \|x_n\|\,\|y_n - y\| + \|y\|\,\big|\|x_n\| - \|x\|\big| \to 0$$
due to the assumption $\|x_n\| \to \|x\|$. To prove $y_n \to y$ we proceed by contradiction using the uniform convexity of $X$. Suppose that there is $\varepsilon > 0$ such that $\|y_n - y\| \ge \varepsilon$ for infinitely many $n$. Then, by the uniform convexity of $X$, $\|y_n + y\| \le 2(1 - \delta)$ for these $n$. Let us choose $f_0 \in X^*$, $\|f_0\| = 1$, $f_0(y) = \|y\| = 1$ (see Corollary 2.1.16). Then, taking the upper limits along these $n$,
$$2(1 - \delta) \ge \limsup_{n\to\infty} \|y_n + y\| \ge \limsup_{n\to\infty} f_0(y_n + y) = 2 f_0(y) = 2,$$
a contradiction.
Remark 2.1.23. The weak convergence is the convergence in the weak topology. It is convenient to define this topology by systems of neighborhoods of points. We say that $U \subset X$ is a weak neighborhood of a point $x \in X$ if there are $f_1, \dots, f_n \in X^*$ such that
$$\{y \in X : |f_i(y) - f_i(x)| < 1 \text{ for } i = 1, \dots, n\} \subset U.$$
A subset $G \subset X$ is weakly open (i.e., open in the weak topology) provided it is a weak neighborhood of each of its points. It is easy to see that a weakly open set is also open in the norm topology. The converse is generally true only in finite dimensional spaces.

Footnote 11. It is not generally true that $\kappa$ is surjective. A Banach space $X$ is said to be reflexive if $\kappa$ is surjective. Every Hilbert space and the spaces $L^p(\Omega)$, $1 < p < \infty$, are reflexive (the Riesz Representation Theorem and Example 2.1.20(iii)). The spaces $L^1(\Omega)$, $L^\infty(\Omega)$ and $C(\Omega)$ are not reflexive.
As we have mentioned, our aim is to find compact sets in the weak topology.

Remark 2.1.24. The weak topology in an infinite dimensional space is not metrizable. Therefore two concepts of compactness, namely the sequential and the covering one (see footnote 12 on page 26), are in principle different. It is surprising that they coincide for weak topologies in Banach spaces. This very deep result is known as the Eberlein–Smulyan Theorem (see Dunford & Schwartz [44, Chapter 5]).

Theorem 2.1.25 (Eberlein–Smulyan). Let $X$ be a reflexive space. Then any bounded sequence contains a weakly convergent subsequence.

Proof. We present a simple proof for the case that $X$ is a Hilbert space. A proof for an arbitrary reflexive space can be found, e.g., in Dunford & Schwartz [44], Fabian et al. [49], Yosida [135].

Let $\{x_n\}_{n=1}^{\infty} \subset X$ be a bounded sequence, and put $Y = \overline{\operatorname{Lin}\{x_1, x_2, \dots\}}$ (the closure is taken in the norm topology). Since the sequence of scalar products $\{(x_1, x_n)\}_{n=1}^{\infty}$ is a bounded sequence of numbers (real or complex), there is a subsequence, say $\{x_n^{(1)}\}_{n=1}^{\infty}$, such that $\{(x_1, x_n^{(1)})\}_{n=1}^{\infty}$ converges. For the same reason there is a subsequence $\{x_n^{(2)}\}_{n=1}^{\infty}$ of $\{x_n^{(1)}\}_{n=1}^{\infty}$ such that $\{(x_2, x_n^{(2)})\}_{n=1}^{\infty}$ converges, etc. Put $y_k = x_k^{(k)}$ (the diagonal choice). Then $\lim_{k\to\infty} (x_j, y_k)$ exists for all $j \in \mathbb{N}$, and therefore $\lim_{k\to\infty} (x, y_k)$ exists for each $x \in \operatorname{Lin}\{x_1, x_2, \dots\}$.

Since the sequence of linear forms $f_k : x \mapsto (x, y_k)$ is bounded in $Y^*$, the Banach–Steinhaus Theorem (Corollary 2.1.6) implies the existence of $f \in Y^*$ such that
$$\lim_{k\to\infty} f_k(x) = f(x) \quad \text{for all } x \in Y.$$
Let $P$ be the orthogonal projection onto $Y$. Put
$$g(x) = f(Px) \quad \text{for } x \in X.$$
Then $g \in X^*$ and by the Riesz Representation Theorem there is $y \in X$ such that
$$g(x) = (x, y) \quad \text{for } x \in X.$$
Moreover, since $y_k \in Y$,
$$\lim_{k\to\infty} (x, y_k) = \lim_{k\to\infty} (Px, y_k) = f(Px) = (x, y) \quad \text{for all } x \in X.$$
This means that $y_k \rightharpoonup y$.
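The gap between weak and norm convergence can be illustrated numerically. The following is only a sketch under stated assumptions: the space $\ell^2$ is truncated to a hypothetical finite dimension `DIM`, the test element $x_k = 1/k$ is our choice, and the helper names are not taken from the text. It shows that the standard unit vectors $e_n$ pair to zero against a fixed element while staying a fixed distance apart, so the bounded sequence $\{e_n\}$ has a weakly convergent subsequence (as the theorem guarantees in a Hilbert space) but no norm convergent one.

```python
import math

def inner(x, y):
    """Real scalar product of two finite sequences (truncated l^2 vectors)."""
    return sum(a * b for a, b in zip(x, y))

def unit_vector(n, dim):
    """n-th standard unit vector e_n (1-based index) in R^dim."""
    return [1.0 if i == n - 1 else 0.0 for i in range(dim)]

DIM = 2000  # truncation dimension, an assumption of this sketch
# A fixed square-summable element: x_k = 1/k, truncated to DIM terms.
x = [1.0 / k for k in range(1, DIM + 1)]

indices = (1, 10, 100, 1000)
# (e_n, x) = 1/n tends to 0 ...
pairings = [inner(unit_vector(n, DIM), x) for n in indices]
# ... although every e_n has norm 1.
norms = [math.sqrt(inner(unit_vector(n, DIM), unit_vector(n, DIM))) for n in indices]

# Two distinct unit vectors are sqrt(2) apart, so no subsequence is norm-Cauchy.
e10, e100 = unit_vector(10, DIM), unit_vector(100, DIM)
gap = math.sqrt(sum((a - b) ** 2 for a, b in zip(e10, e100)))

print(pairings, norms, gap)
```

Testing against a single $x$ is of course not a proof of weak convergence; the general statement is the content of Exercise 2.1.37.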
Remark 2.1.26. Weak convergence in a dual space $X^*$ is more confusing since two approaches can be used. We say that a sequence $\{f_n\}_{n=1}^{\infty} \subset X^*$
(i) converges weakly to $f \in X^*$ (notation $f_n \rightharpoonup f$ or $\text{w-}\lim_{n\to\infty} f_n = f$) if
$$\lim_{n\to\infty} F(f_n) = F(f) \quad \text{for every } F \in X^{**};$$
(ii) converges weak star to $f \in X^*$ (notation $f_n \overset{*}{\rightharpoonup} f$ or $\text{w}^*\text{-}\lim_{n\to\infty} f_n = f$) if
$$\lim_{n\to\infty} f_n(x) = f(x) \quad \text{for every } x \in X.$$

Criteria for weak convergence in $L^p$-spaces can be found, e.g., in Dunford & Schwartz [44, Chapter IV, 8]. The weak convergence in $X^*$ has obviously the same properties as that in $X$. Because of the continuous embedding $\kappa : X \to X^{**}$ (see the proof of Proposition 2.1.22(iii)) the w-convergence implies the w$^*$-convergence. The converse is true if $X$ is a reflexive space, i.e., $\kappa(X) = X^{**}$. Since the w$^*$-topology is generally weaker than the w-topology there can exist more w$^*$-compact sets than w-compact ones. In fact, the following result (the Alaoglu–Bourbaki Theorem, see Conway [28], Dunford & Schwartz [44], Fabian et al. [49]) holds:

If $X$ is a normed linear space, then any closed ball in $X^*$ is w$^*$-compact. If, moreover, $X$ is separable, then the ball is also sequentially w$^*$-compact.

For example, this theorem can be applied to balls in $L^p(\Omega)$, $1 < p \le \infty$.

In the rest of this section we will examine adjoint operators. Suppose that $X$ and $Y$ are normed linear spaces and $A \in L(X, Y)$. If $g \in Y^*$, then
$$A^* g := g \circ A \in X^*.$$
The operator $A^* : Y^* \to X^*$ is obviously linear, and it is also continuous since
$$|A^* g(x)| = |g(Ax)| \le \|g\|_{Y^*} \|Ax\|_Y \le \|g\|_{Y^*} \|A\|_{L(X,Y)} \|x\|_X.$$
If $H_1$, $H_2$ are Hilbert spaces and $A \in L(H_1, H_2)$ we have another approach to the definition of an adjoint operator, namely the one based on the Riesz Representation Theorem: For $y \in H_2$ the mapping $f : x \mapsto (Ax, y)_{H_2}$ is a continuous linear form on $H_1$, and hence there is $z \in H_1$ for which $f(x) = (x, z)_{H_1}$. This $z$ is uniquely determined by $y$, and we denote for a moment $z = A^+ y$, i.e.,
$$(Ax, y)_{H_2} = (x, A^+ y)_{H_1}.$$
There is a very slight difference between $A^*$ and $A^+$, e.g., $(\alpha A)^* = \alpha A^*$ and $(\alpha A)^+ = \bar{\alpha} A^+$ (see also Example 2.1.28 below). So we will use the same notation, namely $A^*$, for both concepts.

Symmetric matrices have certain special properties (e.g., their canonical forms are diagonal). The same can be expected for their generalization in the Hilbert space setting which is defined as follows: An operator $A \in L(H)$ is said to be self-adjoint if $A = A^*$, i.e.,
$$(Ax, y) = (x, Ay) \quad \text{for all } x, y \in H.$$

In order to generalize Theorem 1.1.25 to continuous linear operators on infinite dimensional normed linear spaces we will use the same notation but with a slightly different meaning:

If $M \subset X$, then $M^{\perp} := \{f \in X^* : x \in M \Rightarrow f(x) = 0\}$.
If $N \subset X^*$, then $N_{\perp} := \{x \in X : f \in N \Rightarrow f(x) = 0\}$.

We invite the reader to compare this symbol with that for orthogonal complements in Hilbert spaces.

Proposition 2.1.27. Let $X$, $Y$ be normed linear spaces and let $A \in L(X, Y)$. Then
(i) if $x_n \rightharpoonup x$, then $Ax_n \rightharpoonup Ax$;
(ii) if $A$ is, moreover, continuously invertible, then $A^*$ is also continuously invertible and $(A^*)^{-1} = (A^{-1})^*$;
(iii) $\operatorname{Ker} A = (\operatorname{Im} A^*)_{\perp}$;
(iv) $\overline{\operatorname{Im} A} = (\operatorname{Ker} A^*)_{\perp}$.

Proof. (i) It is easy with the use of $A^*$: for $g \in Y^*$ we have $g(Ax_n) = (A^* g)(x_n) \to (A^* g)(x) = g(Ax)$.
(ii) It is sufficient to show that $(A^{-1})^* A^* = I_{Y^*}$ and $A^* (A^{-1})^* = I_{X^*}$. This follows from the more general result $(AB)^* = B^* A^*$ which is easily verified.
(iii) The inclusion $\subset$ is obvious from the definition, for the converse inclusion $\supset$ it is sufficient to use the fact that $Y^*$ separates the points of $Y$.
(iv) It is easy to see that $(\operatorname{Im} A)^{\perp} = \operatorname{Ker} A^*$. To get (iv) it suffices to prove that $(M^{\perp})_{\perp} = \overline{\operatorname{Lin} M}$ for $M \subset X$. If $x_0$ belonged to $(M^{\perp})_{\perp} \setminus \overline{\operatorname{Lin} M}$, $x_0$ would be separated from $\overline{\operatorname{Lin} M}$ by a linear form $f \in X^*$ (Corollary 2.1.18). Since $\overline{\operatorname{Lin} M}$ is a subspace of $X$, this separating $f$ would be in $(\operatorname{Lin} M)^{\perp} = M^{\perp}$. Therefore $f(x_0) = 0$, and a contradiction is obtained. The converse inclusion $\overline{\operatorname{Lin} M} \subset (M^{\perp})_{\perp}$ is obvious.
Notice that the statement (iv) is not a sufficient condition for solvability of the equation $Ax = y$ since only the closure of $\operatorname{Im} A$ is characterized. There are many operators the range of which is not closed. A simple example is
$$Ax(t) = \int_0^t x(s)\,ds$$
considered either in $C[0,1]$ or in $L^2(0,1)$. It is not an easy task to decide whether an operator has a closed range or not. The following statement is useful in applications.

If $X$, $Y$ are Banach spaces and $A \in L(X, Y)$ is injective, then $\operatorname{Im} A$ is closed if and only if there is a positive constant $c$ such that
$$\|Ax\| \ge c\|x\| \quad \text{for all } x \in X.$$

Sufficiency is easy, the necessity part follows from the Open Mapping Theorem.

There is an important subclass of operators with a closed range, namely the so-called Fredholm operators. An operator $A \in L(X)$ is said to be Fredholm if
$$\dim \operatorname{Ker} A < \infty, \quad \operatorname{Im} A \text{ is closed}, \quad \text{and} \quad \operatorname{codim} \operatorname{Im} A < \infty$$
(i.e., the dimension of any direct complement of $\operatorname{Im} A$ is finite). We note that $\operatorname{codim} \operatorname{Im} A = \dim \operatorname{Ker} A^*$ (this is basically Proposition 2.1.27(iv)). We define
$$\operatorname{ind} A := \dim \operatorname{Ker} A - \dim \operatorname{Ker} A^*$$
and call it the index of the Fredholm operator. A special class of Fredholm operators will be examined in the next section.

We have not yet introduced any sufficiently broad family of continuous linear operators. The next example fills this gap.

Example 2.1.28 (Integral operators). Let $\Omega$ and $\tilde{\Omega}$ be open subsets of $\mathbb{R}^M$ and $\mathbb{R}^{\tilde{M}}$, respectively. Assume that $k : \tilde{\Omega} \times \Omega \to \mathbb{C}$ is a measurable function for which there are constants $c_1$, $c_2$ such that
$$\int_{\Omega} |k(t,s)|\,ds \le c_1 \ \text{ for a.a. } t \in \tilde{\Omega}, \qquad \int_{\tilde{\Omega}} |k(t,s)|\,dt \le c_2 \ \text{ for a.a. } s \in \Omega.$$
Then the operator $A$ defined by
$$Ax(t) = \int_{\Omega} k(t,s)x(s)\,ds \tag{2.1.3}$$
is a linear bounded operator from $L^p(\Omega)$ into $L^p(\tilde{\Omega})$ for $1 \le p \le \infty$ (footnote 12).

Footnote 12. Conditions on the kernel $k$ which guarantee that $A \in L(L^p(\Omega), L^r(\tilde{\Omega}))$ can be found in Dunford & Schwartz [44, Chapter VI, 11A].
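To prove this assertion we have to show that $Ax(t)$ exists for a.a. $t \in \tilde{\Omega}$, is measurable on $\tilde{\Omega}$ and belongs to $L^p(\tilde{\Omega})$ (footnote 13). For $1 \le p < \infty$, by the Hölder inequality, we get for $\frac{1}{p'} = 1 - \frac{1}{p}$:
$$|Ax(t)| \le \int_{\Omega} |k(t,s)|^{\frac{1}{p'}} |k(t,s)|^{\frac{1}{p}} |x(s)|\,ds \le c_1^{\frac{1}{p'}} \left( \int_{\Omega} |k(t,s)|\,|x(s)|^p\,ds \right)^{\frac{1}{p}}.$$
Set
$$\varphi(t) := \left( \int_{\Omega} |k(t,s)|\,|x(s)|^p\,ds \right)^{\frac{1}{p}}.$$
Since the measurable function $(t,s) \mapsto |k(t,s)|\,|x(s)|^p$ can be approximated by step functions (consider first $\tilde{\Omega} \times \Omega$ bounded), the function $t \mapsto \varphi(t)$ is measurable on $\tilde{\Omega}$. The Fubini Theorem yields
$$\int_{\tilde{\Omega}} |\varphi(t)|^p\,dt = \int_{\tilde{\Omega}} \left( \int_{\Omega} |k(t,s)|\,|x(s)|^p\,ds \right) dt = \int_{\Omega} \left( \int_{\tilde{\Omega}} |k(t,s)|\,dt \right) |x(s)|^p\,ds \le c_2 \|x\|^p_{L^p(\Omega)}.$$
In particular, $\varphi$ is finite a.e. Since $t \mapsto Ax(t)$ is measurable on $\tilde{\Omega}$ (by the same argument as above), we also have
$$\|Ax\|_{L^p(\tilde{\Omega})} = \left( \int_{\tilde{\Omega}} |Ax(t)|^p\,dt \right)^{\frac{1}{p}} \le c_1^{\frac{1}{p'}} c_2^{\frac{1}{p}} \|x\|_{L^p(\Omega)}.$$

The Fubini Theorem also yields (we identify $g \in [L^p(\tilde{\Omega})]^*$, $1 \le p < \infty$, with a function from $L^{p'}(\tilde{\Omega})$, see Example 2.1.20(iii))
$$\int_{\tilde{\Omega}} Ax(t)g(t)\,dt = \int_{\tilde{\Omega}} \left( \int_{\Omega} k(t,s)x(s)\,ds \right) g(t)\,dt = \int_{\Omega} \left( \int_{\tilde{\Omega}} k(t,s)g(t)\,dt \right) x(s)\,ds = \int_{\Omega} (A^* g)(s)x(s)\,ds,$$
i.e.,
$$A^* g : s \mapsto \int_{\tilde{\Omega}} k(t,s)g(t)\,dt, \quad g \in L^{p'}(\tilde{\Omega}).$$
We note that the adjoint operator to $A$ for $p = 2$ in the sense of the Riesz Representation Theorem is of the form
$$A^* g(s) = \int_{\tilde{\Omega}} \overline{k(t,s)}\,g(t)\,dt.$$
In particular, $A$ is self-adjoint if $\Omega = \tilde{\Omega}$ and $k(t,s) = \overline{k(s,t)}$. We will continue the study of integral operators in the next section (Example 2.2.5).

Footnote 13. The case $p = \infty$ is left to the reader.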
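The norm estimate $\|Ax\|_{L^p} \le c_1^{1/p'} c_2^{1/p} \|x\|_{L^p}$ can be checked on a finite analogue. The sketch below is an illustration under our own assumptions, not part of the example: the sets $\Omega$, $\tilde{\Omega}$ are replaced by finite index sets with counting measure (the Hölder and Fubini steps of the proof carry over verbatim), so that $k$ becomes a matrix, $c_1$ its largest absolute row sum and $c_2$ its largest absolute column sum. The matrix, the vector and all function names are hypothetical.

```python
def apply_kernel(k, x):
    """(Ax)(t) = sum_s k[t][s] * x[s], the discrete analogue of (2.1.3)."""
    return [sum(k_ts * x_s for k_ts, x_s in zip(row, x)) for row in k]

def p_norm(v, p):
    """l^p norm of a finite sequence, 1 <= p < infinity."""
    return sum(abs(a) ** p for a in v) ** (1.0 / p)

def schur_bound(k, p):
    """c1^(1 - 1/p) * c2^(1/p) with c1 = max row sum and c2 = max column sum of |k|."""
    c1 = max(sum(abs(a) for a in row) for row in k)
    c2 = max(sum(abs(row[s]) for row in k) for s in range(len(k[0])))
    return c1 ** (1.0 - 1.0 / p) * c2 ** (1.0 / p)

# Hypothetical data: a 2 x 3 kernel and a test vector.
k = [[1.0, -2.0, 0.5],
     [0.0, 3.0, -1.0]]
x = [1.0, -1.0, 2.0]
p = 3.0

lhs = p_norm(apply_kernel(k, x), p)       # ||Ax||_p
rhs = schur_bound(k, p) * p_norm(x, p)    # c1^(1/p') c2^(1/p) ||x||_p
print(lhs, rhs, lhs <= rhs)
```

A single vector only illustrates the inequality; the bound itself holds for every $x$ by the argument above.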
In Example 2.1.11 we have mentioned that differential operators on a function space are not continuous and are only densely defined. Therefore we wish to extend the notion of the adjoint operator to this case.

Assume that $A$ is a linear operator defined on a dense subspace $\operatorname{Dom} A$ of $X$ with values in $Y$. Put
$$D^* = \{g \in Y^* : \text{the linear form } x \in \operatorname{Dom} A \mapsto g(Ax) \text{ has a continuous extension } f \text{ to the whole of } X\}.$$
Obviously, $D^*$ is a linear subspace of $Y^*$ containing $o$ and the extension $f$ is uniquely determined by $g$. We denote
$$A^* g := f, \quad \operatorname{Dom}(A^*) = D^*$$
and call $A^*$ the adjoint operator to $A$.

Example 2.1.29. The simplest differential operator is defined by $Ax(t) = \dot{x}(t)$. This relation can be considered in various function spaces and also with different domains. If we are interested in its adjoint we should have a good representation of the dual space. This leads to an observation that spaces of integrable functions would be more convenient than spaces of continuous functions. Therefore let $X = L^p(0,1)$, $1 \le p < \infty$, and $\operatorname{Dom} A = C^1[0,1]$. Consider $A : \operatorname{Dom} A \subset X \to X$. We wish to compute $A^*$. Assume $g \in \operatorname{Dom}(A^*) \subset L^{p'}(0,1)$ and $A^* g = f$, i.e.,
$$g(Ax) = \int_0^1 \dot{x}(t)g(t)\,dt = \int_0^1 x(t)f(t)\,dt = A^* g(x) \quad \text{for all } x \in \operatorname{Dom} A.$$
In particular, for
$$x \in V = \{x \in \operatorname{Dom} A : x(1) = 0\} \quad \text{and} \quad F(t) = \int_0^t f(s)\,ds,$$
the integration by parts (footnote 14) yields
$$\int_0^1 x(t)f(t)\,dt = x(t)F(t)\Big|_0^1 - \int_0^1 \dot{x}(t)F(t)\,dt = -\int_0^1 \dot{x}(t)F(t)\,dt.$$
Since the restriction $A|_V$ of $A$ to $V$ has a dense range in $L^p(0,1)$ ($\operatorname{Im} A|_V = C[0,1]$), we have $F + g = o$ in $L^{p'}(0,1)$. This means that $g$ can be changed on a set of measure zero to have
$$g \text{ absolutely continuous and } \dot{g} = -f \in L^{p'}(0,1), \quad \text{i.e.,} \quad g \in W^{1,p'}(0,1).$$

Footnote 14. If you are not familiar with integration by parts for the Lebesgue integral (notice that $f \in L^{p'}(0,1) \subset L^1(0,1)$), you can approximate $f$ by a continuous function to get a standard situation for integration by parts.
Moreover, $g(0) = -F(0) = 0$. Taking $F(t) = -\int_t^1 f(s)\,ds$ we see that also $g(1) = 0$. This proves that
$$\operatorname{Dom}(A^*) \subset \{g \in W^{1,p'}(0,1) : g(0) = g(1) = 0\} = W_0^{1,p'}(0,1) \quad \text{(footnote 15)}$$
and $A^* g = -\dot{g}$. Integration by parts yields also the converse inclusion, i.e.,
$$\operatorname{Dom}(A^*) = W_0^{1,p'}(0,1).$$
Notice that $\operatorname{Im} A$ is dense in $L^p(0,1)$ but not closed while $A^*$ is injective and
$$\operatorname{Im} A^* = \left\{ f \in L^{p'}(0,1) : \int_0^1 f(t)\,dt = 0 \right\}$$
is closed but not dense in $L^{p'}(0,1)$. Notice also that $\rho(A) = \rho(A^*) = \emptyset$ and any $\lambda \in \mathbb{C}$ is an eigenvalue of $A$. To the contrary, $A^*$ has no eigenvalues.

A more general result (due to S. Banach) is stated in the following proposition (see, e.g., Yosida [135]).

Proposition 2.1.30. Let $X$, $Y$ be Banach spaces and let $A$ be a closed densely defined linear operator from $X$ into $Y$. Then $\operatorname{Im} A$ is closed if and only if $\operatorname{Im} A^*$ is closed. Moreover,
$$\overline{\operatorname{Im} A} = (\operatorname{Ker} A^*)_{\perp} \quad \text{and} \quad \operatorname{Ker} A = (\operatorname{Im} A^*)_{\perp}.$$

Nevertheless, notice that $A$ is not closed in our example. Proposition 2.1.30 can be applied to $A^*$ ($A^*$ is always closed); $\operatorname{Dom}(A^{**}) = W^{1,p}(0,1)$, $A^{**} x = \dot{x}$ (footnote 16). This simple example shows how the domain of a (linear) noncontinuous operator affects its properties.

Footnote 15. The last equality should be proved. A deeper insight into these Sobolev spaces will be given in Chapter 7, cf. also Exercise 1.2.46.

Footnote 16. Notice that $A^{**}$ is an extension of $A$ and, moreover, the graph of $A^{**}$ is the closure of the graph of $A$ (it is also said that $A^{**}$ is the closure of $A$).

Example 2.1.31. Put
$$Ax = -\ddot{x} \quad \text{with} \quad \operatorname{Dom} A = \{x \in C^2(a,b) : x(a) = x(b) = 0\}.$$
If the equation $Ax = \lambda x$
has a nonzero solution $w$ ($\in \operatorname{Dom} A$), then $\lambda$ is called an eigenvalue and $w$ a corresponding eigenfunction of $A$. Simple calculation shows that $\frac{k^2\pi^2}{(b-a)^2}$, $k \in \mathbb{N}$, are all eigenvalues of $A$ (footnote 17), and $\sin\frac{k\pi}{b-a}(t-a)$ are the corresponding eigenfunctions. Consider now the boundary value problem
$$\begin{cases} -\ddot{x}(t) = \lambda x(t) + f(t), & t \in (a,b), \\ x(a) = x(b) = 0. \end{cases} \tag{2.1.4}$$
Let $\varphi_1$, $\varphi_2$ be a fundamental system for the differential equation $-\ddot{x} - \lambda x = 0$. The Variation of Constants Formula shows that
$$x(t) = c_1\varphi_1(t) + c_2\varphi_2(t) + \int_a^t \frac{\varphi_1(s)\varphi_2(t) - \varphi_1(t)\varphi_2(s)}{W(s)} f(s)\,ds \tag{2.1.5}$$
is a solution to $-\ddot{x} - \lambda x = f$. Here $W$ is the Wronski determinant of $\varphi_1$, $\varphi_2$ (notice that for this equation we always can choose $\varphi_1$, $\varphi_2$ such that $W \equiv 1$). We wish to find constants $c_1$, $c_2$ such that $x$ given by (2.1.5) satisfies the boundary conditions $x(a) = x(b) = 0$. The number $\lambda$ is not an eigenvalue if and only if
$$\det \begin{pmatrix} \varphi_1(a) & \varphi_2(a) \\ \varphi_1(b) & \varphi_2(b) \end{pmatrix} \ne 0.$$
In this case the formula (2.1.5) shows that for any $f \in C[a,b]$ the problem (2.1.4) has a unique solution in $\operatorname{Dom} A$ (footnote 18) which is called a classical solution. This means that $\lambda \in \rho(A)$. Suppose now that $\lambda$ is an eigenvalue. Then we can take $\varphi_1$ as a corresponding eigenfunction and get
$$x(a) = c_2\varphi_2(a), \quad \text{i.e.,} \quad c_2 = 0$$
($\varphi_2(a) \ne 0$ since $\varphi_1$, $\varphi_2$ are linearly independent), and
$$x(b) = \varphi_2(b) \int_a^b \varphi_1(s)f(s)\,ds = 0, \quad \text{i.e.,} \quad \int_a^b \varphi_1(s)f(s)\,ds = 0 \tag{2.1.6}$$
since $\varphi_2(b) \ne 0$ (by the same argument as above). Notice that (2.1.6) is also a necessary condition for solvability of (2.1.4). We will return to this example in the next section (see Example 2.2.17).

Footnote 17. The minus sign in the definition of $A$ is conventional; it is introduced to obtain positive eigenvalues.

Footnote 18. If $f \in L^p(a,b)$, then it is possible to show that the function $x = x(t)$ given by (2.1.5) belongs to $W^{2,p}(a,b)$, $x(a) = x(b) = 0$, and the equation in (2.1.4) is satisfied a.e. in $(a,b)$. Such a solution is called a strong solution.

Example 2.1.32. Linear differential operators of the second order with nonconstant coefficients are more complicated. To simplify our exposition we consider a differential expression
$$Lx := p_0\ddot{x} + p_1\dot{x} + p_2 x$$
where $\ddot{p}_0$, $\dot{p}_1$, $p_2$ are continuous functions on a closed bounded interval $[a,b]$ and $p_0 < 0$ on this interval (the so-called regular case). Let $X = L^p(a,b)$, $1 \le p < \infty$, and
$$D = \{x \in W^{2,p}(a,b) : x(a) = x(b) = 0\}.$$
Put
$$Ax = Lx, \quad x \in D = \operatorname{Dom} A,$$
and consider
$$A : \operatorname{Dom} A \subset X \to X.$$
A solution of $Ax = f$ is therefore a strong solution of
$$\begin{cases} Lx(t) = f(t), & t \in (a,b), \\ x(a) = x(b) = 0. \end{cases}$$
It can be proved that $A$ is injective provided $p_2 > 0$ in $[a,b]$. (Assume by contradiction that $\operatorname{Ker} A \ne \{o\}$ and show that there is $x_0 \in \operatorname{Ker} A$ which has a negative minimum at an interior point $c \in (a,b)$. Deduce that $Lx_0(c) < 0$.) The Variation of Constants Formula shows that the operator $A$ is also surjective and $A^{-1}$ is an integral operator
$$A^{-1}f(t) = \int_a^b G(t,s)f(s)\,ds \tag{2.1.7}$$
where $G$ is the so-called Green function of $L$. The Green function is nonnegative on $[a,b] \times [a,b]$ and satisfies the estimates from Example 2.1.28. Therefore $A^{-1} \in L(X)$.

In order to calculate the adjoint $A^*$ it is convenient to consider the so-called formal adjoint expression to $L$, i.e.,
$$My = (p_0 y)^{\cdot\cdot} - (p_1 y)^{\cdot} + p_2 y$$
which is obtained by integrating by parts in the integral $\int_a^b Lx(t)y(t)\,dt$ and omitting the boundary terms. Put
$$By = My \quad \text{for } y \in D = \operatorname{Dom} B.$$
The same integration as above shows that $B \subset A^*$. The proof of the equality $A^* = B$ needs a more careful calculation. The interested reader can consult the books Coddington & Levinson [27, Chapter 9], Edmunds & Evans [46] or Dunford & Schwartz [45], in particular Chapter XIII, for details and also for more complicated singular cases which are important in applications, e.g., in Quantum Mechanics (the Schrödinger equation).
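The representation (2.1.7) can be made concrete in the simplest constant-coefficient case. The following sketch rests on our own choices, none of which come from the text: $p_0 = -1$, $p_1 = 0$, $p_2 = 1$ on $(a,b) = (0,1)$, so $Lx = -\ddot{x} + x$, for which the Green function is $G(t,s) = \sinh(\min\{t,s\})\sinh(1 - \max\{t,s\})/\sinh 1$. The integral is approximated by the midpoint rule and compared with the closed-form solution of $-\ddot{x} + x = 1$, $x(0) = x(1) = 0$.

```python
import math

def green(t, s):
    """Green function of Lx = -x'' + x on (0, 1) with x(0) = x(1) = 0."""
    lo, hi = min(t, s), max(t, s)
    return math.sinh(lo) * math.sinh(1.0 - hi) / math.sinh(1.0)

def solve(f, t, n=2000):
    """Midpoint-rule approximation of (A^{-1} f)(t) = int_0^1 G(t, s) f(s) ds."""
    h = 1.0 / n
    return h * sum(green(t, (i + 0.5) * h) * f((i + 0.5) * h) for i in range(n))

def exact(t):
    """Closed-form solution of -x'' + x = 1, x(0) = x(1) = 0."""
    return 1.0 - math.cosh(t - 0.5) / math.cosh(0.5)

approx = solve(lambda s: 1.0, 0.3)
print(approx, exact(0.3))
```

In this special case $G$ is visibly nonnegative and symmetric, in line with the properties quoted above; for nonconstant coefficients the Green function has to be built from a fundamental system instead.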
Exercise 2.1.33. Let $X$, $Y$ be Banach spaces. If $A \in L(X, Y)$ has a continuous inverse $A^{-1} \in L(Y, X)$ and $B \in L(X, Y)$ is such that
$$\|B - A\| < \frac{1}{\|A^{-1}\|},$$
then $B$ is also continuously invertible and
$$\|B^{-1}\| \le \frac{\|A^{-1}\|}{1 - \|A^{-1}\|\,\|B - A\|}, \qquad \|B^{-1} - A^{-1}\| \le \frac{\|A^{-1}\|^2\,\|B - A\|}{1 - \|A^{-1}\|\,\|B - A\|}.$$
Hint. Examine the proof of Corollary 2.1.3 and write $A^{-1}B = A^{-1}(B - A) + I$.

Exercise 2.1.34. Show that
$$e^{tA} = \sum_{n=0}^{\infty} \frac{t^n A^n}{n!}$$
is well defined for all $t \in \mathbb{R}$, $A \in L(X)$, provided $X$ is a Banach space, and, moreover, the vector function $\varphi : t \mapsto e^{tA}x_0$ solves the differential equation $\dot{x}(t) = Ax(t)$ and satisfies the initial condition $\varphi(0) = x_0$. (See also the end of Section 1.1, in particular Exercise 1.1.41.)

Exercise 2.1.35. Let $K$ be a continuous real function on $[a,b] \times [a,b]$ and let $h \in C[a,b]$ be fixed. Let
$$M = \max_{(t,\tau) \in [a,b] \times [a,b]} |K(t,\tau)|$$
and let $\lambda \in \mathbb{R}$ be such that
$$|\lambda| < \frac{1}{M(b-a)}.$$
Prove that the integral equation
$$x(t) = \lambda \int_a^b K(t,\tau)x(\tau)\,d\tau + h(t)$$
has a unique solution $x \in C[a,b]$.

Exercise 2.1.36. Let $\{x_n\}_{n=1}^{\infty}$, $\{y_n\}_{n=1}^{\infty}$ be sequences in a Hilbert space $H$ such that $x_n \rightharpoonup x$, $y_n \to y$. Then $(x_n, y_n) \to (x, y)$.
Hint. Use Proposition 2.1.22(iii).
Exercise 2.1.37. Let $\{e_n\}_{n=1}^{\infty}$ be an orthonormal sequence in a Hilbert space. Show that $e_n \rightharpoonup o$.
Hint. Use the Bessel inequality (1.2.17).

Exercise 2.1.38. Prove assertion (iv) of Proposition 2.1.22 for a Hilbert space $X$.
Hint. Use the relation between the scalar product and the norm in $X$.

Exercise 2.1.39. Show that a convex set (in particular a subspace) of a normed linear space is weakly closed if and only if it is closed in the norm topology.
Hint. Suppose by contradiction that $C$ is a norm-closed convex set which is not weakly closed. Then there is $x_0 \in \overline{C}^{\,w} \setminus C$. Use the Separation Theorem (Corollary 2.1.18) to obtain a contradiction.

Exercise 2.1.40. Prove that actually $\|A^*\|_{L(Y^*, X^*)} = \|A\|_{L(X,Y)}$.
Hint. The inequality $\|A^*\| \le \|A\|$ follows from the calculation after Remark 2.1.26. For the converse inequality use the dual characterization of the norm $\|Ax\|$.
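The smallness condition $|\lambda| < \frac{1}{M(b-a)}$ in Exercise 2.1.35 makes the map $x \mapsto \lambda \int_a^b K(\cdot,\tau)x(\tau)\,d\tau + h$ a contraction on $C[a,b]$, so successive approximations converge. The sketch below illustrates this on a hypothetical instance chosen by us: $K(t,\tau) = t\tau$, $h \equiv 1$, $\lambda = \frac12$ on $[0,1]$ (so $M = 1$), for which a direct computation gives the solution $x(t) = 1 + 0.3\,t$. The midpoint discretization and the function name `picard` are assumptions of the sketch, not part of the exercise.

```python
def picard(kernel, h, lam, a, b, n=200, iterations=40):
    """Iterate x <- lam * int_a^b K(., tau) x(tau) dtau + h on a midpoint grid."""
    step = (b - a) / n
    grid = [a + (i + 0.5) * step for i in range(n)]
    x = [h(t) for t in grid]  # start the iteration from h
    for _ in range(iterations):
        x = [lam * step * sum(kernel(t, tau) * x_tau for tau, x_tau in zip(grid, x)) + h(t)
             for t in grid]
    return grid, x

# Hypothetical instance: K(t, tau) = t * tau, h = 1, lambda = 1/2 on [0, 1].
grid, x = picard(lambda t, tau: t * tau, lambda t: 1.0, 0.5, 0.0, 1.0)

# Largest deviation from the closed-form solution x(t) = 1 + 0.3 t on the grid.
max_error = max(abs(x_i - (1.0 + 0.3 * t_i)) for t_i, x_i in zip(grid, x))
print(max_error)
```

The remaining error comes from the quadrature rule, since the iteration itself contracts with factor at most $|\lambda| M (b-a) = \frac12$ per step.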
2.2 Compact Operators

In this section we present a class of continuous linear operators the properties of which are closely related to the properties of finite dimensional linear operators. The key assertions presented concern the Riesz–Schauder Theory and the Hilbert–Schmidt Theorem.

Definition 2.2.1. Let $X$ and $Y$ be normed linear spaces. A linear operator $A \in L(X, Y)$ is called a compact operator if the image of a ball in $X$ is relatively compact in $Y$. The set of all compact operators from $X$ into $Y$ is denoted by $\mathcal{C}(X, Y)$.

Remark 2.2.2.
(i) Every compact linear operator is continuous.
(ii) The compactness condition is mostly used in the following equivalent form: For any bounded sequence $\{x_n\}_{n=1}^{\infty} \subset X$ there is a subsequence $\{x_{n_k}\}_{k=1}^{\infty}$ such that $Ax_{n_k}$ converge in the norm topology of $Y$.
(iii) Replacing the norm topology in $Y$ by the weak topology a weakly compact operator can be defined. If either $X$ or $Y$ is reflexive, then any $A \in L(X, Y)$ is weakly compact. This follows from the Eberlein–Smulyan Theorem (Remark 2.1.24) and the observation that $A \in L(X, Y)$ maps a weakly convergent sequence into a weakly convergent one (cf. Proposition 2.1.27(i)).
Example 2.2.3.
(i) If $A \in L(X, Y)$ and $\dim \operatorname{Im} A < \infty$ (the so-called operator of finite rank), then $A \in \mathcal{C}(X, Y)$.
(ii) Let $\{e_n\}_{n=1}^{\infty}$ be an orthonormal basis in a Hilbert space $H$. Put $Ae_n = \lambda_n e_n$ and extend $A$ by linearity to the dense set $D := \operatorname{Lin}\{e_1, \dots\}$ in $H$. The operator $A$ is bounded on $D$ (and therefore it can be uniquely extended to a continuous operator on $H$) if and only if $\{\lambda_n\}_{n=1}^{\infty}$ is a bounded sequence. In addition,
$$\|A\| = \sup_n |\lambda_n|.$$
This follows immediately from the identity
$$\|Ax\|^2 = \sum_n |\lambda_n|^2 |(x, e_n)|^2 \quad \text{for every } x \in H.$$
Moreover, $A$ is a compact operator on $H$ if and only if
$$\lim_{n\to\infty} \lambda_n = 0.$$
This is an easy consequence of Proposition 1.2.39.
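The role of the condition $\lambda_n \to 0$ can be seen by truncating the diagonal operator. If $A_n$ keeps only the first $n$ diagonal entries, then $A - A_n$ is again diagonal and the norm formula above gives $\|A - A_n\| = \sup_{j > n} |\lambda_j|$. The sketch below evaluates this quantity for two hypothetical sequences of our choosing, cut off at a finite length `N` (an assumption of the sketch): $\lambda_j = 1/j$, where the finite rank truncations converge to $A$ in norm, and $\lambda_j = 1$ (the identity), where they do not.

```python
def truncation_error(lams, n):
    """Operator norm of A - A_n for the diagonal operator A e_j = lams[j] e_j,
    where A_n keeps only the first n diagonal entries (rank at most n)."""
    tail = [abs(l) for l in lams[n:]]
    return max(tail) if tail else 0.0

N = 1000  # finite cut-off of the sequence, an assumption of this sketch
decaying = [1.0 / j for j in range(1, N + 1)]   # lambda_j -> 0
constant = [1.0] * N                            # lambda_j = 1, i.e. A = I

errors = [truncation_error(decaying, n) for n in (1, 10, 100)]
stuck = [truncation_error(constant, n) for n in (1, 10, 100)]
print(errors)  # sup of the tail 1/(n+1), 1/(n+2), ...
print(stuck)
```

In the first case the errors equal $1/(n+1)$ and tend to zero; in the second they stay equal to 1 for every $n$ below the cut-off.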
Proposition 2.2.4. Let $X$, $Y$ and $Z$ be normed linear spaces. Then
(i) if $A \in \mathcal{C}(X, Y)$, $B \in L(Y, Z)$, then $BA \in \mathcal{C}(X, Z)$;
(ii) if $A \in \mathcal{C}(Y, Z)$, $B \in L(X, Y)$, then $AB \in \mathcal{C}(X, Z)$;
(iii) if $A \in \mathcal{C}(X, Y)$ and a sequence $\{x_n\}_{n=1}^{\infty} \subset X$ converges weakly to $x \in X$, then
$$\lim_{n\to\infty} \|Ax_n - Ax\| = 0;$$
(iv) assume that $Y$ is a Banach space and a sequence $\{A_n\}_{n=1}^{\infty} \subset \mathcal{C}(X, Y)$ converges to $A \in L(X, Y)$ in the norm operator topology. Then $A \in \mathcal{C}(X, Y)$.

Proof. The assertions (i) and (ii) are obvious. To prove (iii) assume by contradiction that there is a subsequence $\{x_{n_k}\}_{k=1}^{\infty}$ such that
$$\|Ax_{n_k} - Ax\| \ge c > 0.$$
The sequence $\{x_n\}_{n=1}^{\infty}$ is bounded (Proposition 2.1.22(iii)), and hence there exists a subsequence $\{x_{n_{k_l}}\}_{l=1}^{\infty}$ and $y \in Y$ such that
$$\|Ax_{n_{k_l}} - y\| \to 0.$$
Since
$$f(Ax_n) = A^* f(x_n) \to A^* f(x) = f(Ax) \quad \text{for every } f \in Y^*,$$
we have $y = Ax$, and hence a contradiction.
(iv) Let $B(o; 1)$ be the unit ball. By Proposition 1.2.3 it suffices to show that for any $\varepsilon > 0$ there is a finite $\varepsilon$-net of $A(B(o; 1))$. We choose $n$ such that $\|A_n - A\| < \frac{\varepsilon}{2}$, and a finite $\frac{\varepsilon}{2}$-net for $A_n(B(o; 1))$. By the triangle inequality, this is the desired $\varepsilon$-net for $A(B(o; 1))$.

Example 2.2.5.
(i) Let $k$ be a continuous function on the Cartesian product $[a,b] \times [a,b]$. Then the operator
$$Ax : t \in [a,b] \mapsto \int_a^b k(t,s)x(s)\,ds$$
is compact as an operator from $C[a,b]$ into itself (footnote 19). We give two proofs of this assertion.

The first is based on the use of the Arzelà–Ascoli Theorem (Theorem 1.2.13). Its assumptions are satisfied for $\mathcal{F} = A(B(o; 1))$ where $B(o; 1)$ is the unit ball in $C[a,b]$. The equicontinuity of $\mathcal{F}$ follows from the uniform continuity of $k$ on $[a,b] \times [a,b]$.

The second proof uses Proposition 2.2.4(iv). Put
$$\mathcal{A} = \operatorname{Lin}\{(t,s) \mapsto x(t)y(s) : x, y \in C[a,b]\}.$$
It is easy to see that $\mathcal{A}$ is a subalgebra of $C([a,b] \times [a,b])$ which satisfies the assumptions of the real or complex Stone–Weierstrass Theorem (Theorem 1.2.14). Hence there are functions $k_n \in \mathcal{A}$,
$$k_n(t,s) = \sum_{j=1}^{m_n} q_{n,j}(t)\,r_{n,j}(s), \quad q_{n,j}, r_{n,j} \in C[a,b],$$
such that $k_n(t,s) \rightrightarrows k(t,s)$ uniformly in $[a,b] \times [a,b]$. In particular, this means that the operators
$$A_n x : t \mapsto \sum_{j=1}^{m_n} q_{n,j}(t) \int_a^b r_{n,j}(s)x(s)\,ds$$
converge in the operator norm to $A$. Since $\operatorname{Im} A_n \subset \operatorname{Lin}\{q_{n,1}, \dots, q_{n,m_n}\}$, all $A_n$ are compact and, therefore, $A$ is compact.

(ii) Let $\Omega$ be a measurable subset of $\mathbb{R}^M$ and let $k \in L^2(\Omega \times \Omega)$. Then the operator
$$Ax(t) = \int_{\Omega} k(t,s)x(s)\,ds$$
(the so-called Hilbert–Schmidt operator) is compact as an operator from $L^2(\Omega)$ into itself. We present again two proofs of this statement. The first will be a typical Hilbert space proof, the second will use the reflexivity of $L^2(\Omega)$ and we will show how it could be used to get compactness of an integral operator on $L^p(\Omega)$.

Footnote 19. This is true under more general assumptions, e.g., if the interval $[a,b]$ is replaced by a compact topological space $K$, $\mu$ is a Borel measure on $K$ and $A$ is defined by $Ax(t) = \int_K k(t,s)x(s)\,d\mu(s)$.
The first proof is based on the following observation: Let $\{e_k\}_{k=1}^{\infty}$, $\{f_k\}_{k=1}^{\infty}$ be two orthonormal bases in a separable Hilbert space $H$. Let $B \in L(H)$. By the Parseval equality we have
$$n^2(B) := \sum_{k,n=1}^{\infty} |(Be_k, f_n)|^2 = \sum_{k=1}^{\infty} \|Be_k\|^2 = \sum_{n=1}^{\infty} \|B^* f_n\|^2 \le \infty.$$
This shows that the quantity $n(B)$ depends only on $B$ and not on the particular choice of bases. Moreover, if $n(B) < \infty$, then $B \in \mathcal{C}(H)$. To see this take $n_\varepsilon \in \mathbb{N}$ such that $\sum_{n=n_\varepsilon+1}^{\infty} \|B^* f_n\|^2 < \varepsilon$ and define
$$B_\varepsilon x = \sum_{n=1}^{n_\varepsilon} (Bx, f_n) f_n.$$
Then $\dim \operatorname{Im} B_\varepsilon < \infty$ and
$$\|B_\varepsilon x - Bx\|^2 = \sum_{n=n_\varepsilon+1}^{\infty} |(Bx, f_n)|^2 \le \|x\|^2 \sum_{n=n_\varepsilon+1}^{\infty} \|B^* f_n\|^2 \le \varepsilon \|x\|^2.$$
The compactness of $B$ follows from Proposition 2.2.4(iv).

In order to apply this statement to the Hilbert–Schmidt operator choose an orthonormal basis $\{e_n\}_{n=1}^{\infty}$ in $L^2(\Omega)$ and notice that
$$\varphi_{m,n}(t,s) := e_m(t)\,\overline{e_n(s)}$$
is an orthonormal set in $L^2(\Omega \times \Omega)$. Notice that $\{\varphi_{m,n}\}_{m,n=1}^{\infty}$ is an orthonormal basis (use Corollary 1.2.36). Since
$$(Ae_n, e_m)_{L^2(\Omega)} = (k, \varphi_{m,n})_{L^2(\Omega \times \Omega)},$$
the finiteness of $n(A)$ follows from the Bessel inequality.

Now we give the second proof. Let $\{x_n\}_{n=1}^{\infty}$ be a bounded set in $L^2(\Omega)$. Since $L^2(\Omega)$ as a Hilbert space is reflexive, there is a subsequence, denote it again by $\{x_n\}_{n=1}^{\infty}$, which is weakly convergent to an $x$ in $L^2(\Omega)$. In particular,
$$\int_{\Omega} k(t,s)x_n(s)\,ds \to \int_{\Omega} k(t,s)x(s)\,ds \quad \text{for a.a. } t \in \Omega$$
(the Fubini Theorem shows that $k(t,\cdot) \in L^2(\Omega)$ for a.a. $t \in \Omega$). Since
$$|Ax_n(t) - Ax(t)| \le \int_{\Omega} |k(t,s)|\,|x_n(s) - x(s)|\,ds \le \|x_n - x\|_{L^2(\Omega)} \left( \int_{\Omega} |k(t,s)|^2\,ds \right)^{\frac12} \le c \left( \int_{\Omega} |k(t,s)|^2\,ds \right)^{\frac12},$$
the Lebesgue Dominated Convergence Theorem yields $\|Ax_n - Ax\|_{L^2(\Omega)} \to 0$.
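The basis independence of $n(B)$ is easy to observe in finite dimensions, where $n^2(B)$ is the sum of the squared matrix entries. The sketch below is an illustration with hypothetical data chosen by us (a real $2 \times 2$ matrix `B`, the standard basis and a rotated orthonormal basis); it evaluates $\sum_{k,n} |(Be_k, f_n)|^2$ for three combinations of bases.

```python
import math

def matvec(b, v):
    """Matrix-vector product for a matrix given as a list of rows."""
    return [sum(r_j * v_j for r_j, v_j in zip(row, v)) for row in b]

def hs_norm_squared(b, basis_e, basis_f):
    """n^2(B) = sum_{k,n} |(B e_k, f_n)|^2 for real orthonormal bases e, f."""
    total = 0.0
    for e in basis_e:
        be = matvec(b, e)
        for f in basis_f:
            total += sum(x * y for x, y in zip(be, f)) ** 2
    return total

B = [[1.0, 2.0], [3.0, 4.0]]                      # hypothetical operator on R^2
standard = [[1.0, 0.0], [0.0, 1.0]]
theta = 0.7                                       # arbitrary rotation angle
rotated = [[math.cos(theta), math.sin(theta)],
           [-math.sin(theta), math.cos(theta)]]

value_standard = hs_norm_squared(B, standard, standard)
value_mixed = hs_norm_squared(B, rotated, standard)
value_rotated = hs_norm_squared(B, rotated, rotated)
print(value_standard, value_mixed, value_rotated)
```

All three values agree with $1 + 4 + 9 + 16 = 30$ up to rounding, as the Parseval argument predicts.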
Proposition 2.2.6. Let $H$ be a Hilbert space and $A \in L(H)$. Then $A$ is a compact operator if and only if there is a sequence $\{A_n\}_{n=1}^{\infty} \subset L(H)$ of operators of finite rank which converges to $A$ in the operator norm topology.

Proof. Because of Proposition 2.2.4 only the necessity part is left to be proved. Let $B(o; 1)$ be the unit ball in $H$. Since $\overline{A(B(o; 1))}$ is compact, it is a separable metric space, and therefore $Y = \overline{\operatorname{Lin} A(B(o; 1))}$ is a separable Hilbert space. Let $\{e_n\}_{n=1}^{\infty}$ be an orthonormal basis in $Y$. Put
$$A_n x = \sum_{k=1}^{n} (Ax, e_k) e_k.$$
Then $A_n$ have finite rank and
$$\|A_n x - Ax\|^2 = \sum_{k=n+1}^{\infty} |(Ax, e_k)|^2 < \varepsilon \quad \text{for every } x \in B(o; 1)$$
provided $n$ is sufficiently large (Proposition 1.2.39).

Remark 2.2.7. The proof of the preceding proposition indicates that the result holds also in a Banach space $X$ with a Schauder basis $\{e_n\}_{n=1}^{\infty}$ (see page 40). The famous conjecture of S. Banach was that any separable Banach space has a Schauder basis. The first counterexample was constructed by P. Enflo. He found a compact operator in a separable Banach space which cannot be approximated by operators of finite rank. We notice that separable Banach spaces of functions like $C(\Omega)$, $L^p(\Omega)$, $W^{k,p}(\Omega)$ ($1 \le p < \infty$) have a Schauder basis.

One of our goals in this section is to generalize the Fredholm alternative (see footnote 6 on page 14). As we have seen in Section 1.1 the notion of the adjoint operator is very important.

Proposition 2.2.8 (Schauder). Let $X$, $Y$ be Banach spaces and assume that $A \in L(X, Y)$. Then $A$ is compact if and only if $A^*$ is compact.

Proof. Step 1 (the "only if" part). Suppose that $A \in \mathcal{C}(X, Y)$ and $\{g_n\}_{n=1}^{\infty} \subset Y^*$, $\|g_n\|_{Y^*} \le 1$. It is easy to verify the assumptions of the Arzelà–Ascoli Theorem (Theorem 1.2.13) for the sequence of functions
$$g_n : K := \overline{A(B(o; 1))} \to \mathbb{R} \ (\text{or } \mathbb{C})$$
($B(o; 1)$ is the unit ball in $X$). By this theorem there is a subsequence $\{g_{n_k}\}_{k=1}^{\infty}$ which is uniformly convergent on $K$. Since
$$|A^* g_{n_k}(x) - A^* g_{n_l}(x)| \le \sup_{y \in K} |g_{n_k}(y) - g_{n_l}(y)| \quad \text{for each } x \in B(o; 1)$$
and $X^*$ is complete, the sequence $\{A^* g_{n_k}\}_{k=1}^{\infty}$ is convergent in $X^*$.
Step 2 (the "if" part). Assume now that $A^* \in \mathcal{C}(Y^*, X^*)$. We embed $X$ into $X^{**}$ and $Y$ into $Y^{**}$ with help of the canonical isometrical embeddings $\kappa_X$ and $\kappa_Y$ (see the proof of Proposition 2.1.22(iii)). Since $A^*$ is compact, $A^{**}$ is compact by the first part of the proof. It suffices to show that
$$\kappa_Y(Ax) = A^{**}\kappa_X(x) \quad \text{for } x \in X$$
and we leave that to the reader.

If $A \in \mathcal{C}(X, Y)$, then the equation
$$Ax = y \tag{2.2.1}$$
is scarcely ever well posed (footnote 20) as follows from the first part of the next theorem. This is the reason why we are interested rather in equations of the type
$$x - Ax = y. \tag{2.2.2}$$

Footnote 20. An equation (2.2.1) is said to be well-posed if $A$ is injective and $A^{-1}$ is continuous. If $A$ is an integral operator, then (2.2.1) is called an integral equation of the first kind. The equation (2.2.2) is called an integral equation of the second kind. The research of these equations carried out by I. Fredholm is supposed to be one of the starting points in the development of functional analysis.

Theorem 2.2.9 (Riesz–Schauder Theory). Let $X$ be a Banach space and $A \in \mathcal{C}(X)$. Then
(i) if $\operatorname{Im} A$ is closed, then $\dim \operatorname{Im} A < \infty$;
(ii) $\dim \operatorname{Ker}(I - A) < \infty$;
(iii) $\operatorname{Im}(I - A)$ is closed;
(iv) (the Fredholm alternative) $\operatorname{Im}(I - A) = X$ if and only if $\operatorname{Ker}(I - A) = \{o\}$;
(v) $\dim \operatorname{Ker}(I - A) = \dim \operatorname{Ker}(I^* - A^*)$.

Proof. (i) If $Y = \operatorname{Im} A$ is closed, then $A : X \to Y$ is an open mapping (Theorem 2.1.8). This means that a certain ball $B(o; \delta)$ in $Y$ is contained in the relatively compact set $A(B(o; 1))$, i.e., $B(o; \delta)$ itself is relatively compact. By Proposition 1.2.15, $\dim Y < \infty$.

(ii) For the rest of the proof we put
$$T := I - A \quad \text{and} \quad Y := \operatorname{Ker} T.$$
Then the restriction of $A$ to the Banach space $Y$ maps $Y$ onto $Y$ (since $Ay = y$ for $y \in Y$). By (i), $\dim Y < \infty$.

(iii) Because of (ii) there exists a continuous projection $P$ of $X$ onto $Y$ (Remark 2.1.19). Denote
$$Z := \operatorname{Ker} P, \quad \text{i.e.,} \quad X = Y \oplus Z$$
and both $Y$ and $Z$ are Banach spaces. Since $T$ is injective on $Z$, $\operatorname{Im} T$ is closed provided there is a positive constant $c$ such that
$$\|Tz\| \ge c\|z\| \quad \text{for each } z \in Z,$$
see page 70. Suppose by contradiction that such $c$ does not exist, i.e., there are $z_n \in Z$ such that
$$\|z_n\| = 1 \quad \text{and} \quad \|Tz_n\| < \frac{1}{n}\|z_n\|.$$
Then one can find a subsequence $\{z_{n_k}\}_{k=1}^{\infty}$ for which $Az_{n_k}$ converges to a $y$. Since $Tz_{n_k} \to o$, we have $\lim_{k\to\infty} z_{n_k} = y \in Z$. This means that
$$Ty = o, \quad \text{i.e.,} \quad y \in Y \cap Z, \quad \text{and thus} \quad y = o.$$
This is a contradiction since $z_{n_k} \to y$ implies that $\|y\| = 1$.

(iv) We will prove the necessity part by way of contradiction, i.e., we assume that $\operatorname{Im} T = X$ and $\operatorname{Ker} T \ne \{o\}$. Put $Y_k := \operatorname{Ker} T^k$. Then
$$Y_1 \subsetneq Y_2 \subsetneq \cdots \subsetneq Y_k \subsetneq \cdots$$
since for $x_1 \in \operatorname{Ker}(I - A)$, $x_1 \ne o$, there is $x_2$ such that $x_1 = Tx_2$, i.e., $x_2 \in Y_2 \setminus Y_1$, etc. It follows from the construction in the proof of Proposition 1.2.15 that there are $y_k \in Y_k$, $\|y_k\| = 1$, such that $\operatorname{dist}(y_{k+1}, Y_k) \ge \frac12$. For $k > l$ we have
$$\|Ay_k - Ay_l\| = \|y_k - (y_l - Ty_l + Ty_k)\| \ge \operatorname{dist}(y_k, Y_{k-1}) \ge \frac12.$$
This means that there is no convergent subsequence of $\{Ay_k\}_{k=1}^{\infty}$, a contradiction.

The sufficiency part is now easy: It follows from Proposition 2.2.8 and the previous part (iii) that $\operatorname{Im} T^*$ is closed. Assume that $\operatorname{Ker} T = \{o\}$. By Proposition 2.1.27(iii), $\operatorname{Im} T^* = (\operatorname{Ker} T)^{\perp} = X^*$. According to the first part of this proof, $\operatorname{Ker} T^* = \{o\}$ and, again by (iii) and Proposition 2.1.27(iv), $\operatorname{Im} T = (\operatorname{Ker} T^*)_{\perp} = X$.

(v) As in the proof of (iii), $X = Y \oplus Z$ and the corresponding projection $P$ of $X$ onto $Y$ is continuous. It can be shown that a direct complement $W$ of $\operatorname{Im} T$ in $X$ is isomorphic to $\operatorname{Ker} T^*$ (footnote 21). This means that $\dim W = \dim \operatorname{Ker} T^* < \infty$.

Footnote 21. This is clear for $X$ being a Hilbert space, since $\operatorname{Im} T$ is closed and the orthogonal complement $(\operatorname{Im} T)^{\perp}$ is equal to $\operatorname{Ker} T^*$ (Proposition 2.1.27(iv)). In a general Banach space we can use the factor space $X|_{\operatorname{Im} T}$ which is algebraically isomorphic to a direct complement $W$ of $\operatorname{Im} T$ and for $g \in (X|_{\operatorname{Im} T})^*$ put $f(x) = g([x])$. It remains to show that the correspondence $g \mapsto f$ is an (isometric) isomorphism onto $(\operatorname{Im} T)^{\perp} = \operatorname{Ker} T^*$.
Denote
$$\dim \operatorname{Ker} T = n \quad\text{and}\quad \dim \operatorname{Ker} T^* = n^*.$$
We shall prove that n = n*. Assume that n > n*. In particular, this means that there is a surjective linear operator Φ ∈ L(Y, W). Such Φ cannot be injective (see Corollary 1.1.15), i.e., there is x_0 ∈ Y, x_0 ≠ o, for which Φ(x_0) = o. Put now B := A + ΦP. Since P ∈ C(X), we have B ∈ C(X) and
$$Bx_0 = Ax_0 + o = x_0, \quad\text{i.e.,}\quad \operatorname{Ker}(I - B) \ne \{o\}.$$
By the Fredholm alternative (iv), Im (I − B) ≠ X. But
$$(I - B)(Z) = \operatorname{Im} T \quad\text{and}\quad (I - B)(Y) = \Phi(Y) = W,$$
i.e., Im (I − B) = Im T + W = X, a contradiction. This proves the inequality n ≤ n*. By interchanging T and T* we similarly obtain n* ≤ n. [Footnote 22: We recommend that the reader do this carefully to see that no reflexivity of X is needed.]

Remark 2.2.10. The proof of the following statement is similar to that of Lemma 1.1.31(i). If A ∈ C(X) and 1 ∈ σ(A), then there is k ∈ N such that
$$X = \operatorname{Ker}(I - A)^k \oplus \operatorname{Im}(I - A)^k.$$
Moreover, both the spaces on the right-hand side are A-invariant, and dim Ker (I − A)^k < ∞. [Footnote 23: This dimension is called the multiplicity of the eigenvalue 1.]

Remark 2.2.11. Theorem 2.2.9 can be generalized to operators A ∈ L(X) for which there is k ∈ N such that A^k ∈ C(X). Another way of generalization is connected with perturbations of Fredholm operators. Notice that the statement (v) of Theorem 2.2.9 says that I − A is a Fredholm operator of index zero provided A ∈ C(X). The following theorem states the stability of the index.

Theorem 2.2.12. Let X, Y be Banach spaces and let A ∈ L(X, Y) be a Fredholm operator. Then
(i) if B ∈ C(X, Y), then A + B is Fredholm and
$$\operatorname{ind} A = \operatorname{ind}(A + B); \tag{2.2.3}$$
(ii) the set of Fredholm operators in L(X, Y) is an open subset of L(X, Y); furthermore, ind is a continuous function on this open set.
Proof. The proofs and further results can be found, e.g., in Kato [73, § IV.5.].
Corollary 2.2.13. Let X be a complex Banach space and let A ∈ C(X). Then
(i) σ(A) \ {0} is a countable set of eigenvalues of finite multiplicity;
(ii) if dim X = ∞, then 0 ∈ σ(A), and if λ is an accumulation point of σ(A), then λ = 0.

Proof. (i) If λ ≠ 0, then λI − A = λ(I − A/λ) and Theorem 2.2.9 can be applied. In particular, if such λ belongs to σ(A), then λ is an eigenvalue of finite multiplicity. It remains to show that for any r > 0 the set {λ ∈ σ(A) : |λ| > r} is finite. Assume by way of contradiction that this set contains a sequence of mutually different points {λ_n}_{n=1}^∞, and let x_n be the corresponding nonzero eigenvectors. Put W_n = Lin{x_1, ..., x_n}. It is easy to see by induction that x_1, ..., x_n are linearly independent. So we can find y_{n+1} ∈ W_{n+1} such that
$$\|y_{n+1}\| = 1 \quad\text{and}\quad \operatorname{dist}(y_{n+1}, W_n) \ge \tfrac{1}{2}.$$
Now for k > l we have
$$\|Ay_k - Ay_l\| = \|\lambda_k y_k - [(\lambda_k I - A)y_k - (\lambda_l I - A)y_l + \lambda_l y_l]\| \ge |\lambda_k| \operatorname{dist}(y_k, W_{k-1}) \ge \frac{r}{2}$$
(the vector in the square brackets belongs to W_{k−1}), and this contradicts the compactness of A.

(ii) The statement on accumulation points follows immediately from the proof of (i). To see that 0 is a point of σ(A) provided dim X = ∞ it is sufficient to realize that σ(A) cannot be a finite set of nonzero numbers λ_1, ..., λ_n. Indeed, with help of Remark 2.2.10 we get
$$X = \operatorname{Ker}(\lambda_1 I - A)^{k_1} \oplus \cdots \oplus \operatorname{Ker}(\lambda_n I - A)^{k_n} \oplus V \tag{2.2.4}$$
where V is a nontrivial closed A-invariant subspace of X. Therefore the spectrum σ(A|_V) of the restriction A|_V of A to V is a subset of σ(A). Since σ(A|_V) ≠ ∅ (see the discussion following Example 2.1.20), we have {λ_1, ..., λ_n} ≠ σ(A).
Example 2.2.14. Consider
$$Ax(t) := \int_0^t x(s)\, ds \quad\text{on the space } L^2(0, 1).$$
This is a special case of the operators examined in Example 2.2.5(ii), with the kernel
$$k(t, s) = \begin{cases} 1 & \text{for } 0 \le s \le t \le 1, \\ 0 & \text{for } 0 \le t < s \le 1. \end{cases}$$
Therefore A ∈ C(L²(0, 1)).
If λ ≠ 0 were an eigenvalue of A with an eigenfunction x, then
$$x(t) = \frac{1}{\lambda}\int_0^t x(s)\, ds,$$
i.e., x is absolutely continuous and
$$\dot{x} = \frac{1}{\lambda}x, \qquad x(0) = 0.$$
This implies that x = o in [0, 1]. Since σ(A) cannot be empty, σ(A) = {0}, and 0 is not an eigenvalue of A. We notice that the same statement (with a more complicated proof) is valid for any Volterra integral operator
$$Ax(t) = \int_0^t k(t - s)x(s)\, ds, \qquad x \in L^2(0, 1),$$
provided, e.g., k ∈ L²(0, 1). See also Example 2.3.7.
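The claim σ(A) = {0} in Example 2.2.14 can be made plausible numerically. The following minimal sketch (an illustration only, not part of the original text) applies a trapezoid-rule discretization of A repeatedly to the constant function 1, for which A^n 1 = t^n/n! exactly. The grid size and the helper names `volterra` and `sup_norms_of_powers` are assumptions made for this illustration, and the sup norm of a single function is tracked rather than the operator norm on L²(0, 1), so the output illustrates the factorial decay but proves nothing.

```python
import math

def volterra(x, h):
    """Cumulative trapezoid rule approximating (Ax)(t) = integral of x over [0, t]."""
    out = [0.0]
    for k in range(1, len(x)):
        out.append(out[-1] + 0.5 * h * (x[k - 1] + x[k]))
    return out

def sup_norms_of_powers(n_max=8, grid=1000):
    """Return [sup |A^n 1| for n = 1..n_max] on a uniform grid of [0, 1]."""
    h = 1.0 / grid
    x = [1.0] * (grid + 1)          # the constant function 1
    norms = []
    for _ in range(n_max):
        x = volterra(x, h)
        norms.append(max(abs(v) for v in x))
    return norms

norms = sup_norms_of_powers()
# n-th roots of the norms; for the exact operator these equal (1/n!)^(1/n).
roots = [norm ** (1.0 / n) for n, norm in enumerate(norms, start=1)]
for n, (norm, root) in enumerate(zip(norms, roots), start=1):
    print(n, norm, 1.0 / math.factorial(n), root)
```

The printed n-th roots decrease towards zero, which is the behaviour one expects from an operator whose spectrum reduces to {0}.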
Corollary 2.2.13 can be significantly strengthened in the case that X is a Hilbert space and A is a compact, self-adjoint operator. To see this we need some technicalities.

Proposition 2.2.15. Let H be a Hilbert space and A a self-adjoint continuous operator on H. Then
(i) ‖A‖ = sup_{‖x‖=1} |(Ax, x)|;
(ii) m := inf_{‖x‖=1} (Ax, x) and M := sup_{‖x‖=1} (Ax, x) belong to the spectrum of A;
(iii) ‖A‖ = sup {|λ| : λ ∈ σ(A)};
(iv) σ(A) ⊂ R;
(v) if Ax = λx, Ay = µy, λ ≠ µ, then (x, y) = 0.

Proof. (i) Denote the right-hand side by α. Obviously α ≤ ‖A‖. To prove the converse inequality take o ≠ x ∈ H, y = Ax. Then for any t > 0, using (1.2.14), we have
$$\begin{aligned}
\|Ax\|^2 &= \Big(A(tx), \tfrac{1}{t}Ax\Big) = \Big(A(tx), \tfrac{1}{t}y\Big) \\
&= \tfrac{1}{4}\Big[\Big(A\big(tx + \tfrac{1}{t}y\big), tx + \tfrac{1}{t}y\Big) - \Big(A\big(tx - \tfrac{1}{t}y\big), tx - \tfrac{1}{t}y\Big)\Big] \\
&\le \tfrac{\alpha}{4}\Big[\big\|tx + \tfrac{1}{t}y\big\|^2 + \big\|tx - \tfrac{1}{t}y\big\|^2\Big] = \tfrac{\alpha}{2}\Big[t^2\|x\|^2 + \tfrac{1}{t^2}\|y\|^2\Big].
\end{aligned}$$
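Now we choose t such that
$$t^2\|x\|^2 + \frac{1}{t^2}\|y\|^2 = 2\|x\|\|y\|, \quad\text{i.e.,}\quad t = \Big(\frac{\|y\|}{\|x\|}\Big)^{1/2} \quad\text{and}\quad \Big(t\|x\| - \frac{1}{t}\|y\|\Big)^2 = 0.$$
Hence ‖Ax‖² ≤ α‖x‖‖y‖ follows. Since y = Ax, this gives ‖Ax‖ ≤ α‖x‖, i.e., ‖A‖ ≤ α.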
(ii) By taking A + ‖A‖I instead of A, we can assume that 0 ≤ m ≤ M = ‖A‖ (the last equality follows from (i)). Let {x_n}_{n=1}^∞ be a sequence such that
$$\|x_n\| = 1 \quad\text{and}\quad \lim_{n\to\infty}(Ax_n, x_n) = M.$$
Then
$$\limsup_{n\to\infty}\|Ax_n - Mx_n\|^2 = \limsup_{n\to\infty}\,[(Ax_n, Ax_n) - 2M(Ax_n, x_n) + M^2] \le \limsup_{n\to\infty}\,[2M^2 - 2M(Ax_n, x_n)] = 0.$$
If M ∈ ρ(A), then there is a constant c > 0 such that ‖Ax − Mx‖ ≥ c‖x‖. The previous calculation shows that this cannot be true. The assertion on m is obtained by replacing A by −A.

(iii) This is a consequence of (i) and (ii) and Corollary 2.1.3.

(iv) Let λ = α + iβ, β ≠ 0. A simple calculation yields that
$$\|\lambda x - Ax\|^2 \ge |\beta|^2\|x\|^2 \quad\text{for every } x \in H.$$
This inequality shows that both
$$\lambda I - A \quad\text{and}\quad \bar{\lambda}I - A^* = \bar{\lambda}I - A$$
are injective and Im (λI − A) is closed. By Proposition 2.1.27(iv) and Corollary 1.2.35,
$$\operatorname{Im}(\lambda I - A) = [\operatorname{Ker}(\lambda I - A)^*]^{\perp} = [\operatorname{Ker}(\bar{\lambda}I - A)]^{\perp} = H.$$
Therefore λ ∈ ρ(A).

(v) We have λ(x, y) = (Ax, y) = (x, Ay) = (x, µy) = µ(x, y) (by (iv), µ ∈ R). Since λ ≠ µ, we conclude that (x, y) = 0.
Theorem 2.2.16 (Hilbert–Schmidt). Let H be a separable Hilbert space and A a self-adjoint compact operator. Then there exists an orthonormal basis {e_n}_{n=1}^∞ where e_n are the eigenvectors of A. If
$$Ae_n = \lambda_n e_n \quad\text{and}\quad x = \sum_{n=1}^{\infty}(x, e_n)e_n,$$
then
$$Ax = \sum_{n=1}^{\infty}\lambda_n (x, e_n)e_n.$$

Proof. Let {λ_n}_{n=1}^∞ be the sequence of all nonzero and pairwise distinct eigenvalues of A. Choose an orthonormal basis e_1^{(k)}, ..., e_{n_k}^{(k)} of N_k := Ker (λ_k I − A). Remember that N_k ⊥ N_{k+1} (Proposition 2.2.15(v)). Let us align the collection
$$\bigcup_k \{e_1^{(k)}, \dots, e_{n_k}^{(k)}\}$$
into a sequence {e_1, e_2, ...}. This sequence is an orthonormal basis of H_1 := \overline{Lin}{e_1, e_2, ...}, the closed linear hull of the sequence. If H_1 = H, the proof is complete. Assume therefore that H ≠ H_1. The orthogonal complement H_1^⊥ is A-invariant. This means that the restriction B := A|_{H_1^⊥} is a self-adjoint operator on the Hilbert space H_1^⊥. Since σ(B) ⊂ σ(A), σ(B) cannot contain any nonzero number (Corollary 2.2.13(i)). As σ(B) ≠ ∅, we have σ(B) = {0} and, by Proposition 2.2.15(iii),
$$B = O \quad\text{on } H_1^{\perp}.$$
Hence 0 is an eigenvalue of B as well as of A. By adding an orthonormal basis of H_1^⊥ to {e_1, e_2, ...} we obtain an orthonormal basis of H.

Example 2.2.17. [Footnote 24: A continuation of Example 2.1.32.] We have found that the inverse operator to
$$Ax = -(p\dot{x})\dot{} + qx, \qquad x \in \operatorname{Dom} A = \{x \in W^{2,2}(a, b) : x(a) = x(b) = 0\},$$
[Footnote 25: This operator is called a Sturm–Liouville operator.] exists provided ṗ, q ∈ C[a, b] and p, q > 0 on [a, b]. Moreover, A⁻¹ is an integral operator
$$A^{-1}f(t) = \int_a^b G(t, s)f(s)\, ds$$
where G is the Green function of the differential expression. From the construction of G it follows that G ∈ C([a, b] × [a, b]), in particular, G ∈ L²((a, b) × (a, b)), and G is a real symmetric function (G(t, s) = G(s, t)), see, e.g., Walter [131]. By Example 2.2.5(ii), A⁻¹ is a compact, self-adjoint operator in the real space L²(a, b) and Theorem 2.2.16 can be applied to obtain an orthonormal basis of L²(a, b) formed by the eigenfunctions {e_n}_{n=1}^∞ of A⁻¹, i.e., by the eigenfunctions of A. [Footnote 26: We restrict our attention to a special differential operator A, in contrast to the general operator from Example 2.1.32, in order to get a self-adjoint inverse A⁻¹.] Since
$$(Ax, x)_{L^2(a,b)} = \int_a^b [p(t)\dot{x}^2(t) + q(t)|x(t)|^2]\, dt > 0 \quad\text{for all } x \in \operatorname{Dom} A,\ x \ne o,$$
all eigenvalues are positive. If λ is an eigenvalue of A (equivalently, 1/λ is an eigenvalue of A⁻¹), then
$$\dim \operatorname{Ker}(\lambda I - A) = 1$$
since the equation −(pẋ)˙ + (q − λ)x = 0 cannot have two linearly independent solutions satisfying the initial condition x(a) = 0. Let the eigenvalues λ_n of A be arranged into a sequence so that 0 < λ_1 < λ_2 < ⋯. From the properties of compact operators (Corollary 2.2.13) it follows that λ_n → ∞. It is sometimes important to know how quickly λ_n tend to infinity. A simple estimate can be obtained with help of the quantity n(A⁻¹) (Example 2.2.5(ii)), namely
$$\sum_{n=1}^{\infty}\frac{1}{\lambda_n^2} = n^2(A^{-1}) < \infty.$$
However, this result is far from being optimal. We remark here that a variational approach to an eigenvalue problem for compact, self-adjoint operators will be briefly described in Section 6.3.

Consider now the equation
$$Ax = \lambda x + f \tag{2.2.5}$$
or, equivalently (cf. Exercise 2.2.23),
$$\sum_{n=1}^{\infty}(\lambda_n - \lambda)(x, e_n)e_n = \sum_{n=1}^{\infty}(f, e_n)e_n, \quad\text{i.e.,}\quad (\lambda_n - \lambda)(x, e_n) = (f, e_n) \quad\text{for } n \in \mathbb{N}.$$
If λ is not an eigenvalue of A, then inf_n |λ_n − λ| > 0 (since λ_n → ∞) and
$$x = \sum_{n=1}^{\infty}\frac{(f, e_n)}{\lambda_n - \lambda}e_n$$
is the unique solution of (2.2.5). (Notice that this series is convergent.) If λ = λ_n, then the condition (f, e_n) = 0 is a necessary and sufficient condition for solvability of (2.2.5) (see also Example 2.1.31).
If we examined singular differential operators, e.g., on the interval [0, ∞), we would meet with many difficulties arising for example from the fact that A⁻¹ is not compact and, therefore, its spectrum is more complicated. The interested reader can consult the book Dunford & Schwartz [45].

Remark 2.2.18. The Hilbert–Schmidt Theorem allows us to introduce a functional calculus for compact, self-adjoint operators, similarly to what has been done for matrices in Theorem 1.1.38: Let A be a compact, self-adjoint operator on a Hilbert space H. Then there exists a unique mapping Φ : C(σ(A)) → L(H) with the following properties:
(i) Φ is an algebra homomorphism;
(ii) Φ is a continuous mapping from C(σ(A)) into L(H) with the operator topology;
(iii) if P(x) = Σ_{k=0}^m a_k x^k, then Φ(P) = Σ_{k=0}^m a_k A^k;
(iv) if w ∉ σ(A) and f(x) = 1/(w − x), then Φ(f) = (wI − A)⁻¹;
(v) σ(Φ(f)) = f(σ(A)) for every f ∈ C(σ(A)).
[Footnote 27: If σ(A) = {0} ∪ {λ_n}_{n=1}^∞, then f ∈ C(σ(A)) if and only if lim_{n→∞} f(λ_n) = f(0).]
If Ax = Σ_{n=1}^∞ λ_n (x, e_n)e_n, then it is easy to verify properties (i)–(v) for
$$\Phi(f)x := \sum_{n=1}^{\infty} f(\lambda_n)(x, e_n)e_n.$$
We omit the proof of uniqueness. It is worth mentioning that we can introduce a functional calculus for a linear operator A which has a compact, self-adjoint resolvent (λ_0 I − A)⁻¹. We leave this easy construction to the interested reader. Example 2.2.17 shows a class of such operators.

Exercise 2.2.19. Suppose that A ∈ L(X, Y) maps a weakly convergent sequence into a strongly convergent one. Prove that A is compact provided X is reflexive.

Exercise 2.2.20. Prove the assertion from Remark 2.2.10 and the decomposition (2.2.4).
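To make the formula Φ(f)x = Σ f(λ_n)(x, e_n)e_n from Remark 2.2.18 concrete, the following sketch evaluates it in a finite-dimensional stand-in. The matrix A = [[2, 1], [1, 2]], its hand-computed eigenpairs, the value w = 5 and all function names are assumptions chosen for illustration; they do not come from the text. The sketch checks property (iii) for the polynomial P(t) = t² + 1 and property (iv) for w = 5.

```python
import math

# Hypothetical finite-dimensional stand-in: A = [[2, 1], [1, 2]] on R^2.
# Its eigenpairs are known in closed form: eigenvalues 1 and 3.
s = 1.0 / math.sqrt(2.0)
eigenpairs = [(1.0, (s, -s)), (3.0, (s, s))]   # (lambda_n, e_n)

def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

def phi(f, x):
    """Phi(f)x = sum_n f(lambda_n) (x, e_n) e_n."""
    result = [0.0, 0.0]
    for lam, e in eigenpairs:
        c = f(lam) * dot(x, e)
        result = [r + c * ei for r, ei in zip(result, e)]
    return result

def apply_A(x):
    return [2.0 * x[0] + x[1], x[0] + 2.0 * x[1]]

x = [1.0, 0.0]

# Property (iii): Phi(P) = P(A) for the polynomial P(t) = t^2 + 1.
lhs = phi(lambda t: t * t + 1.0, x)
Ax = apply_A(x)
rhs = [a + b for a, b in zip(apply_A(Ax), x)]      # A^2 x + x

# Property (iv): Phi(1/(w - t)) inverts wI - A for w = 5, which is not in sigma(A).
w = 5.0
y = phi(lambda t: 1.0 / (w - t), x)
back = [w * yi - ai for yi, ai in zip(y, apply_A(y))]   # (wI - A) y, should equal x

print(lhs, rhs, back)
```

In infinite dimensions the sum is a series and the continuity requirement of footnote 27 matters; the finite sum here sidesteps both points.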
Exercise 2.2.21. Consider a special case of the Sturm–Liouville operator Ax = −ẍ in the space L²(0, π) with the boundary conditions
(i) x(0) = x(π) = 0 (Dirichlet boundary conditions),
(ii) ẋ(0) = ẋ(π) = 0 (Neumann boundary conditions),
(iii) α_0 x(0) + β_0 ẋ(0) = 0, α_1 x(π) + β_1 ẋ(π) = 0 (mixed or Newton–Robin boundary conditions),
(iv) x(0) = x(π), ẋ(0) = ẋ(π) (periodic conditions).
Find Green functions, eigenvalues and eigenfunctions. What follows from the Hilbert–Schmidt Theorem? Compare this result with that of Example 1.2.38(i).

Exercise 2.2.22. Define e^{tA} for the operator A from Exercise 2.2.21 (see Remark 2.2.18). Take x ∈ Dom A and show that the function
$$u(t, \xi) := \big(e^{tA}x\big)(\xi), \qquad t \ge 0,$$
is a solution to the heat equation
$$\frac{\partial u}{\partial t} = \frac{\partial^2 u}{\partial \xi^2}$$
satisfying the initial condition u(0, ·) = x(·) and the boundary conditions given by u(t, ·) ∈ Dom A. Do not forget to define the notion of a solution.

Exercise 2.2.23. Let A be as in Example 2.2.17. Prove that
$$\operatorname{Dom} A = \Big\{x = \sum_{n=1}^{\infty}(x, e_n)e_n : \sum_{n=1}^{\infty}|\lambda_n|^2 |(x, e_n)|^2 < \infty\Big\}$$
and
$$Ax = \sum_{n=1}^{\infty}\lambda_n (x, e_n)e_n.$$
2.3 Contraction Principle

The previous four sections have been devoted to some basic facts in the linear theory. It is now time to start with nonlinear problems, especially with the solution of the nonlinear equation
$$f(x) = a \quad\text{for } f : X \to X. \tag{2.3.1}$$
The basic assertions in this section are fixed point theorems for contractive and non-expansive mappings. If X is a linear space, (2.3.1) is equivalent to the equation
$$F(x) := a - f(x) + x = x.$$
The solution of this equation is called a fixed point of F. In the case that
$$f(x) = x - Ax \qquad (F(x) = Ax + a)$$
where A ∈ L(X), we succeeded in solving this equation in Section 2.1 (cf. Proposition 2.1.2) by applying the iteration process
$$x_0 = a, \qquad x_n = a + Ax_{n-1} \qquad\text{provided } \|A\| < 1.$$
This idea can be easily generalized to the following result which is often attributed to S. Banach.

Theorem 2.3.1 (Contraction Principle). Let M be a complete metric space and let F : M → M be a contraction, i.e., there is q ∈ [0, 1) such that
$$\varrho(F(x), F(y)) \le q\,\varrho(x, y) \quad\text{for every } x, y \in M.$$
Then there exists a unique fixed point x̃ of F in M. Moreover, if
$$x_0 \in M, \qquad x_n = F(x_{n-1}),$$
then the sequence {x_n}_{n=1}^∞ converges to x̃ and the estimates
$$\varrho(x_n, \tilde{x}) \le \frac{q^n}{1 - q}\,\varrho(x_1, x_0) \quad\text{(a priori estimate)}, \tag{2.3.2}$$
$$\varrho(x_n, \tilde{x}) \le \frac{q}{1 - q}\,\varrho(x_n, x_{n-1}) \quad\text{(a posteriori estimate)} \tag{2.3.3}$$
hold.
Proof. We prove that {x_n}_{n=1}^∞ is a Cauchy sequence. Indeed, for m > n we have
$$\begin{aligned}
\varrho(x_m, x_n) &\le \varrho(x_m, x_{m-1}) + \cdots + \varrho(x_{n+1}, x_n) \\
&= \varrho(F(x_{m-1}), F(x_{m-2})) + \cdots + \varrho(F(x_n), F(x_{n-1})) \\
&\le q\,[\varrho(x_{m-1}, x_{m-2}) + \cdots + \varrho(x_n, x_{n-1})] \\
&\le (q^{m-1} + \cdots + q^n)\,\varrho(x_1, x_0) \le \frac{q^n}{1 - q}\,\varrho(x_1, x_0).
\end{aligned}$$
Since q < 1, the right-hand side is arbitrarily small for sufficiently large n. The Cauchy sequence {x_n}_{n=1}^∞ has a limit x̃ in the complete space M, and for this limit the estimate (2.3.2) holds. Being a contraction, F is a continuous mapping, and therefore
$$\tilde{x} = \lim_{n\to\infty} x_n = \lim_{n\to\infty} F(x_{n-1}) = F\Big(\lim_{n\to\infty} x_{n-1}\Big) = F(\tilde{x}).$$
Uniqueness of a fixed point is even easier: If x̃ = F(x̃), ỹ = F(ỹ), then
$$\varrho(\tilde{x}, \tilde{y}) = \varrho(F(\tilde{x}), F(\tilde{y})) \le q\,\varrho(\tilde{x}, \tilde{y}), \quad\text{i.e.,}\quad \varrho(\tilde{x}, \tilde{y}) = 0 \quad (q < 1).$$
The a posteriori estimate also follows from the above estimate of ϱ(x_m, x_n).
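The iteration of Theorem 2.3.1 and its two error estimates can be sketched in a few lines. The example below is only an illustration under stated assumptions: the mapping F = cos on M = [0, 1] with the metric |x − y| and the contraction constant q = sin 1 are choices made here, and the function name `fixed_point` is hypothetical. The loop stops once the a posteriori bound (2.3.3) drops below a tolerance.

```python
import math

def fixed_point(F, x0, q, tol=1e-12, max_iter=10_000):
    """Iterate x_n = F(x_{n-1}) until the a posteriori bound (2.3.3) is below tol."""
    x_prev = x0
    x = F(x_prev)
    first_step = abs(x - x_prev)            # rho(x_1, x_0)
    n = 1
    while q / (1.0 - q) * abs(x - x_prev) > tol and n < max_iter:
        x_prev, x = x, F(x)
        n += 1
    a_priori = q ** n / (1.0 - q) * first_step           # bound (2.3.2)
    a_posteriori = q / (1.0 - q) * abs(x - x_prev)       # bound (2.3.3)
    return x, n, a_priori, a_posteriori

# cos maps [0, 1] into itself and |cos'(x)| = |sin x| <= sin(1) < 1 there,
# so q = sin(1) is an admissible contraction constant (by the mean value theorem).
q = math.sin(1.0)
x_tilde, n, a_priori, a_posteriori = fixed_point(math.cos, 1.0, q)
print(x_tilde, n, a_priori, a_posteriori)
```

In this run the a priori bound is far more pessimistic than the a posteriori one, because near the fixed point the mapping contracts faster than the global constant q suggests.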
The fixed point of F, the existence of which has just been established, often depends on a parameter. The following result is useful in investigating this dependence.

Corollary 2.3.2. Let M be a complete metric space and A a topological space. Assume that F : A × M → M possesses the following properties:
(i) There is q ∈ [0, 1) such that
$$\varrho(F(a, x), F(a, y)) \le q\,\varrho(x, y) \quad\text{for all } a \in A \text{ and } x, y \in M.$$
(ii) For every x ∈ M the mapping a ↦ F(a, x) is continuous on A.
Then for each a ∈ A there is a unique ϕ(a) := x̃ such that F(a, x̃) = x̃. Moreover, ϕ is continuous on A.

Proof. The existence of ϕ follows directly from Theorem 2.3.1. The estimates
$$\begin{aligned}
\varrho(\varphi(a), \varphi(b)) &= \varrho(F(a, \varphi(a)), F(b, \varphi(b))) \\
&\le \varrho(F(a, \varphi(a)), F(b, \varphi(a))) + \varrho(F(b, \varphi(a)), F(b, \varphi(b))) \\
&\le \varrho(F(a, \varphi(a)), F(b, \varphi(a))) + q\,\varrho(\varphi(a), \varphi(b))
\end{aligned}$$
yield
$$\varrho(\varphi(a), \varphi(b)) \le \frac{1}{1 - q}\,\varrho(F(a, \varphi(a)), F(b, \varphi(a))),$$
and the continuity of ϕ follows.
Remark 2.3.3. Notice that ϕ is Lipschitz continuous provided a ↦ F(a, x) is Lipschitz continuous uniformly with respect to x (and, of course, A is a metric space).

There is an enormous number of applications of the Contraction Principle. The proof of the existence theorem for the initial value problem for ordinary differential equations belongs to standard applications. However, the historical development went in the opposite direction. The following theorem had been proved (by iteration) about thirty years before the Contraction Principle was formulated in its full generality. Another application will be given in Section 4.1.

Theorem 2.3.4 (Picard). Let G be an open set in R × R^N and let f : G → R^N, f = f(t, x_1, ..., x_N), be continuous and locally Lipschitz continuous with respect to the x-variables, i.e., for every (s, y) ∈ G there exist δ > 0, δ̂ > 0, L > 0 such that
$$\|f(t, x_1) - f(t, x_2)\| \le L\|x_1 - x_2\| \quad\text{whenever } |t - s| < \delta,\ \|x_i - y\| < \hat{\delta},\ i = 1, 2.$$
Then for any (t_0, ξ_0) ∈ G there exists δ > 0 such that the equation
$$\dot{x} = f(t, x) \tag{2.3.4}$$
has a unique solution on the interval (t_0 − δ, t_0 + δ) satisfying the initial condition
$$x(t_0) = \xi_0. \tag{2.3.5}$$
Proof. First we rewrite the initial value problem (2.3.4), (2.3.5) into an equivalent fixed point problem for an integral operator F defined by
$$F(x) : t \mapsto \xi_0 + \int_{t_0}^{t} f(s, x(s))\, ds, \qquad t \in (t_0 - \delta, t_0 + \delta). \tag{2.3.6}$$
[Footnote 28: If t < t_0, then we define ∫_{t_0}^{t} f(s, x(s)) ds = −∫_{t}^{t_0} f(s, x(s)) ds, and ∫_{t_0}^{t_0} f(s, x(s)) ds = 0.]
This equivalence is easy to establish (by integration and by differentiation with respect to t). Therefore we wish to solve the equation F(x) = x in a complete metric space M. We choose M to be a closed subset of the Banach space C[t_0 − δ, t_0 + δ] for a certain small δ > 0. We need two properties of F and M, namely that F maps M into M and F is a contraction on M. Choose first δ_1, δ̂_1 such that
$$R_1 := [t_0 - \delta_1, t_0 + \delta_1] \times \{x \in \mathbb{R}^N : \|x - \xi_0\| \le \hat{\delta}_1\} \subset G.$$
This set R_1 is compact, and therefore f is bounded and uniformly Lipschitz continuous on it, i.e., there are constants K, L such that
$$\|f(s, x)\| \le K, \qquad \|f(s, x) - f(s, y)\| \le L\|x - y\| \quad\text{for } (s, x), (s, y) \in R_1.$$
Put
$$M = \{x \in C[t_0 - \delta, t_0 + \delta] : \|x(t) - \xi_0\| \le \hat{\delta}_1 \ \ \forall t \in [t_0 - \delta, t_0 + \delta]\}$$
for a δ ≤ δ_1. Then
$$\sup_{t \in I_\delta}\|F(x)(t) - \xi_0\| \le \delta K, \qquad \sup_{t \in I_\delta}\|F(x)(t) - F(y)(t)\| \le \delta L \sup_{t \in I_\delta}\|x(t) - y(t)\|$$
where I_δ := [t_0 − δ, t_0 + δ]. If we choose δ so small that δK ≤ δ̂_1 and δL ≤ 1/2, then F maps M into itself (the first condition) and is a contraction with q = 1/2 (the second condition). By the Contraction Principle, F has a unique fixed point y in M and this is a solution of (2.3.4), (2.3.5) on the interval (t_0 − δ, t_0 + δ). If x̃ is a solution of (2.3.4), (2.3.5) on the interval (t_0 − δ, t_0 + δ), then x̃ ∈ M (prove it!), i.e., y = x̃, and the uniqueness follows.
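The successive approximations x_n = F(x_{n−1}) used in the proof of Theorem 2.3.4 can be carried out exactly in simple cases. The sketch below does so for the scalar problem ẋ = x, x(0) = 1, which is an example chosen here for illustration and not taken from the text. Polynomials are stored as coefficient lists, and the names `picard_step` and `evaluate` are hypothetical helpers.

```python
from fractions import Fraction
import math

def picard_step(coeffs):
    """One Picard step for x' = x, x(0) = 1: x -> 1 + integral_0^t x(s) ds.

    A polynomial is stored as its coefficient list [c_0, c_1, ...].
    """
    # Termwise integration from 0 to t shifts every coefficient up by one degree.
    integral = [Fraction(0)] + [c / (k + 1) for k, c in enumerate(coeffs)]
    integral[0] += 1            # add the initial value xi_0 = 1
    return integral

def evaluate(coeffs, t):
    return sum(float(c) * t ** k for k, c in enumerate(coeffs))

x = [Fraction(1)]               # x_0(t) = xi_0 = 1
for _ in range(10):
    x = picard_step(x)

print(x)                        # coefficients 1/k! of the Taylor polynomial of exp
print(evaluate(x, 0.5), math.exp(0.5))
```

After n steps the iterate is the Taylor polynomial of degree n of the exponential function. For this linear equation the iterates converge on every bounded interval, although the proof above only guarantees convergence on a short interval where δL ≤ 1/2.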
Remark 2.3.5. The mapping F defined by (2.3.6) actually depends not only on x but also on t_0, ξ_0. By taking a smaller δ we can prove that F is also Lipschitz continuous with respect to the initial conditions, and Corollary 2.3.2 yields that the solution x(t; t_0, ξ_0) of (2.3.4), (2.3.5) is also Lipschitz continuous with respect to the initial conditions.

Remark 2.3.6. If we apply Theorem 2.3.4 (i.e., the Contraction Principle) to a system of linear differential equations
$$\dot{x} = A(t)x + g(t)$$
with a continuous matrix A and a continuous vector function g on an interval (a, b), we need an extra effort to prove that a solution exists on the whole interval (a, b). Namely, Theorem 2.3.4 gives only local existence, and in the continuation process (take (t_0 + δ, x(t_0 + δ)) as a new initial condition) there is no a priori evidence that δ could not be smaller and smaller. [Footnote 29: This situation does not occur for the equation ẋ = f(t, x) provided, e.g., that G = R^{N+1} and there exists L > 0 such that for all (t, x), (t, y) ∈ R^{N+1} the inequality ‖f(t, x) − f(t, y)‖ ≤ L‖x − y‖ holds.] It is therefore sometimes more convenient not to refer to the Contraction Principle but to prove the convergence of iterations directly. The following example demonstrates this approach.

Example 2.3.7. Let k be a bounded measurable function on the set M = {(s, t) ∈ R² : 0 ≤ s ≤ t ≤ 1}. Then for any f ∈ L¹(0, 1) and λ ≠ 0 there is a unique solution to the integral equation
$$x(t) - \lambda\int_0^t k(t, s)x(s)\, ds = f(t). \tag{2.3.7}$$
To prove this assertion, denote
$$Ax(t) = \int_0^t k(t, s)x(s)\, ds.$$
Then A ∈ L(L¹(0, 1)) (Example 2.1.28). Put
$$x_0 = f, \qquad x_n = f + \lambda Ax_{n-1}.$$
Due to the completeness of L¹(0, 1) the sequence {x_n}_{n=1}^∞ is convergent in L¹(0, 1) provided the sum Σ_{n=1}^∞ ‖x_n − x_{n−1}‖_{L¹(0,1)} is convergent. We have
$$x_n - x_{n-1} = \lambda^n A^n f \quad\text{and}\quad A^n f(t) = \int_0^t k_n(t, s)f(s)\, ds$$
where
$$k_1 = k \quad\text{and}\quad k_n(t, s) = \int_s^t k_{n-1}(t, \sigma)k(\sigma, s)\, d\sigma$$
(check this relation).
It is easy to prove by induction that
$$|k_n(t, s)| \le \|k\|_{L^\infty(M)}^n \frac{(t - s)^{n-1}}{(n - 1)!}, \qquad (t, s) \in M,$$
and hence
$$\|x_n - x_{n-1}\|_{L^1(0,1)} \le |\lambda|^n \int_0^1 \Big|\int_0^t k_n(t, s)f(s)\, ds\Big|\, dt \le |\lambda|^n \int_0^1 |f(s)| \int_s^1 |k_n(t, s)|\, dt\, ds \le \frac{|\lambda|^n \|k\|_{L^\infty(M)}^n}{n!}\|f\|_{L^1(0,1)}.$$
Since the series Σ_{n=1}^∞ a^n/n! is convergent for any a ∈ R, the limit lim_{n→∞} x_n = x̃ ∈ L¹(0, 1) exists and x̃ is a solution to (2.3.7). In fact x̃ is a unique solution (see Exercise 2.3.18). Moreover, x̃ depends continuously on f, which means that σ(A) = {0}. [Footnote 30: Actually, we have proved that C \ {0} ⊂ ρ(A), i.e., σ(A) ⊂ {0}. Since A ∈ L(L¹(0, 1)), σ(A) ≠ ∅, we have σ(A) = {0}.] This result holds also for k ∈ C(M) in the space C[0, 1]. The proof is the same.

Example 2.3.8. Find sufficient conditions for the existence of a classical solution (cf. Example 2.1.31) of the boundary value problem
$$\ddot{x}(t) = \varepsilon f(t, x(t)), \quad t \in (0, 1), \qquad x(0) = x(1) = 0. \tag{2.3.8}$$
Theorem 2.3.4 suggests the assumption that f is continuous with respect to t and Lipschitz continuous with respect to the x-variable on a certain rectangle [0, 1] × [−r, r]. Denote a Lipschitz constant on this rectangle by L(r). We wish to rewrite the problem (2.3.8) as a fixed point problem. To reach this goal suppose that we have a solution y and denote g(t) := εf(t, y(t)). Then y solves also the equation ÿ = g and satisfies y(0) = y(1) = 0. It is easy to see that this problem has exactly one solution which is given by
$$y(t) = \int_0^1 G(t, s)g(s)\, ds := \int_0^t (t - 1)s\,g(s)\, ds + \int_t^1 t(s - 1)g(s)\, ds$$
(G is the Green function; see Example 2.1.32). Therefore, we are looking for a continuous function x which solves the integral equation
$$x(t) = \varepsilon\int_0^1 G(t, s)f(s, x(s))\, ds. \tag{2.3.9}$$
Denote
$$F(\varepsilon, x) := \varepsilon\int_0^1 G(t, s)f(s, x(s))\, ds.$$
We can solve (2.3.9) by applying the Contraction Principle in
$$M := \{x \in C[0, 1] : \|x\| \le r\} \quad\text{for an appropriate choice of } r.$$
For x ∈ M we have
$$|f(s, x(s))| \le |f(s, 0)| + |f(s, x(s)) - f(s, 0)| \le K + L(r)r$$
where K > 0 is a constant such that |f(s, 0)| ≤ K, s ∈ [0, 1], and
$$\|F(\varepsilon, x)\| \le \frac{|\varepsilon|}{8}(K + L(r)r).$$
This estimate shows that F maps M into itself if
$$q := \frac{|\varepsilon|}{8}L(r) < 1 \quad\text{and}\quad r \ge \frac{|\varepsilon|K}{8}\,\frac{1}{1 - q}.$$
Then F is also a contraction on M with the constant q. We can conclude that for a given r there is ε_0 > 0 such that for |ε| ≤ ε_0 both the above conditions are satisfied and (2.3.9) has a solution. [Footnote 31: Notice that for a fixed ε these conditions are antagonistic, namely the first requires small r and the other large r. This situation is typical in applications of the Contraction Principle.]
Now we have to show that a continuous solution x of (2.3.9) is actually a classical solution of the boundary value problem (2.3.8). Since we know the explicit form of the Green function G, it is obvious that x(0) = x(1) = 0 and it is also easy to differentiate twice the right-hand side of (2.3.9) (taking into account that x is continuous).
We remark that we have not used all properties of the integral operator with the kernel G. In particular, such an operator is compact (Example 2.2.5(i)) and this property has not been used. This property will be significant in Section 5.1.

The a posteriori estimate (2.3.3) shows that the convergence of iterations may be rather slow. It can be sometimes desirable to have faster convergence at the expense of more restrictive assumptions. The classical Newton Method for solving an equation f(x) = 0, f : R → R, is illustrated in Figure 2.3.1. In order to generalize this method we need the notion of a derivative of f : X → X. This will be the main subject of the next chapter.
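Returning to Example 2.3.8, the fixed point iteration x ↦ F(ε, x) for the integral equation (2.3.9) can be sketched numerically. Everything specific below is an assumption made for illustration: the right-hand side f(t, x) = 1 + sin x (so K = 1 and L(r) = 1 for every r), the value ε = 1 (so q = 1/8), the grid of 51 points, the trapezoid rule, and the helper names `green` and `apply_F`.

```python
import math

def green(t, s):
    """Green function of x'' = g, x(0) = x(1) = 0, as in Example 2.3.8."""
    return (t - 1.0) * s if s <= t else t * (s - 1.0)

def apply_F(eps, f, x, grid):
    """Trapezoid approximation of F(eps, x)(t) = eps * int_0^1 G(t, s) f(s, x(s)) ds."""
    n = len(grid) - 1
    h = 1.0 / n
    values = [f(s, xs) for s, xs in zip(grid, x)]
    out = []
    for t in grid:
        total = 0.0
        for j, s in enumerate(grid):
            weight = 0.5 * h if j in (0, n) else h
            total += weight * green(t, s) * values[j]
        out.append(eps * total)
    return out

# Hypothetical data: f(t, x) = 1 + sin x, so K = 1 and L(r) = 1 for every r.
eps, n = 1.0, 50
grid = [i / n for i in range(n + 1)]
f = lambda t, x: 1.0 + math.sin(x)

x = [0.0] * (n + 1)               # starting point x_0 = 0 in M
steps = []                        # sup-norm distance between successive iterates
for _ in range(30):
    x_new = apply_F(eps, f, x, grid)
    steps.append(max(abs(a - b) for a, b in zip(x_new, x)))
    x = x_new

print(steps[:5])
print(max(abs(v) for v in x))
```

With these choices the two conditions of the example read q = 1/8 < 1 and r ≥ 1/7, so the iterates should stay in the ball of radius 1/7 and successive differences should shrink by at least the factor 8.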
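[Figure 2.3.1. The graph y = f(x) and its tangent y = f′(x_n)(x − x_n) + f(x_n) at the point x_n; the tangent meets the x-axis at x_{n+1}, which lies between x_n and the root x̃.]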
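The Newton iteration pictured in Figure 2.3.1 replaces f by its tangent at x_n and takes the zero of the tangent as x_{n+1}. A minimal scalar sketch follows; the test function f(x) = x − cos x, the starting point and the function name `newton` are assumptions chosen for illustration, not taken from the text.

```python
import math

def newton(f, df, x0, tol=1e-14, max_iter=50):
    """Newton iteration x_{n+1} = x_n - f(x_n) / f'(x_n); returns all iterates."""
    xs = [x0]
    for _ in range(max_iter):
        x = xs[-1]
        x_next = x - f(x) / df(x)     # zero of the tangent at x
        xs.append(x_next)
        if abs(x_next - x) < tol:
            break
    return xs

# The zero of f(x) = x - cos x is the fixed point of cos.
iterates = newton(lambda x: x - math.cos(x), lambda x: 1.0 + math.sin(x), 1.0)
for k, xk in enumerate(iterates):
    print(k, xk)
```

Only a handful of steps are needed here, whereas plain iteration of cos from the same starting point takes dozens of steps to reach comparable accuracy. The price is that f′ must exist and stay away from zero near the root.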
There are many generalizations of the Contraction Principle. One of them concerns the assumption q < 1. A mapping F : M → M is called non-expansive if
$$\varrho(F(x), F(y)) \le \varrho(x, y) \quad\text{for all } x, y \in M.$$
A simple example F(x) = x + 1, x ∈ R, shows that F may have no fixed point. This can be caused by the fact that F does not map any bounded set into itself. However, there are non-expansive mappings which map the unit ball into itself and do not possess any fixed point either. See the following example or Exercise 2.3.17.

Example 2.3.9 (Beals). Let M be the space of all sequences with zero limit with the sup norm (this space is usually denoted by c_0) and let
$$F(x) = (1, x_1, x_2, \dots) \quad\text{for } x = (x_1, x_2, \dots) \in M.$$
Then F is a non-expansive map of the unit ball into itself without any fixed point.

This example indicates that some special properties of the space are needed. We formulate the following assertion in a Hilbert space and use the Hilbert structure essentially in its proof. The statement is true also in uniformly convex spaces but the proof is more involved (see, e.g., Goebel [60]). Let us note an interesting fact that the validity of Proposition 2.3.10 in a reflexive Banach space is an open problem.

Proposition 2.3.10 (Browder). Let M be a bounded closed and convex set in a Hilbert space H. Let F be a non-expansive mapping from M into itself. Then there is a fixed point of F in M. Moreover, if
$$x_0 \in M, \qquad x_n = F(x_{n-1}) \quad\text{and}\quad y_n = \frac{1}{n}\sum_{k=0}^{n-1} x_k,$$
then the sequence {y_n}_{n=1}^∞ is weakly convergent to a fixed point.
Proof. The existence result is not difficult to prove. [Footnote 32: It is possible to assume that o ∈ M. For any t ∈ (0, 1) the mapping F_t(x) := tF(x) is a contraction. Letting t → 1 we obtain a sequence {x_n}_{n=1}^∞ ⊂ M for which x_n − F(x_n) → o. Therefore it is sufficient to show that (I − F)(M) is closed. This needs a trick which is typical for monotone operators (Section 5.3). Notice that I − F is monotone provided F is non-expansive.] So we will prove a more interesting result which yields also a numerical method for finding a fixed point. The proof consists of four steps, the last one is crucial and has a variational character.

Step 1. Since M is bounded, closed and convex, y_n ∈ M and there is a subsequence {y_{n_k}}_{k=1}^∞ weakly convergent to an x̃ ∈ M (Theorem 2.1.25 and Exercise 2.1.39). Fix such a weakly convergent subsequence {y_{n_k}}_{k=1}^∞ and its weak limit x̃ ∈ M.

Step 2. We have lim_{n→∞} ‖F(y_n) − y_n‖ = 0. Indeed,
$$\|F^k(x_0) - F(y_n) + F(y_n) - y_n\|^2 = \|F^k(x_0) - F(y_n)\|^2 + \|F(y_n) - y_n\|^2 + 2\operatorname{Re}(F^k(x_0) - F(y_n), F(y_n) - y_n)$$
where F^k(x_0) = F(F^{k−1}(x_0)). Summing up this equality from k = 0 to k = n − 1 and dividing by n we get
$$\begin{aligned}
\frac{1}{n}\sum_{k=0}^{n-1}\|F^k(x_0) - y_n\|^2 &= \frac{1}{n}\sum_{k=0}^{n-1}\|F^k(x_0) - F(y_n)\|^2 + \|F(y_n) - y_n\|^2 + 2\operatorname{Re}(y_n - F(y_n), F(y_n) - y_n) \\
&= \frac{1}{n}\sum_{k=0}^{n-1}\|F^k(x_0) - F(y_n)\|^2 - \|F(y_n) - y_n\|^2.
\end{aligned}$$
Since F is non-expansive, we conclude from this equality that
$$\begin{aligned}
\|F(y_n) - y_n\|^2 &\le \frac{1}{n}\sum_{k=1}^{n-1}\|F^{k-1}(x_0) - y_n\|^2 + \frac{1}{n}\|x_0 - F(y_n)\|^2 - \frac{1}{n}\sum_{k=0}^{n-1}\|F^k(x_0) - y_n\|^2 \\
&= \frac{1}{n}\|x_0 - F(y_n)\|^2 - \frac{1}{n}\|F^{n-1}(x_0) - y_n\|^2 \to 0
\end{aligned}$$
(all sequences belong to M, and hence they are bounded).

Step 3. The element x̃ is a fixed point of F. To see this, observe that the inequality
$$(z - F(z) - (y_{n_k} - F(y_{n_k})), z - y_{n_k}) = (z - y_{n_k}, z - y_{n_k}) - (F(z) - F(y_{n_k}), z - y_{n_k}) \ge \|z - y_{n_k}\|^2 - \|z - y_{n_k}\|^2 = 0$$
holds for any z ∈ M. By Exercise 2.1.36 and Step 2, the limit of the left-hand side is (z − F(z), z − x̃), i.e., the inequality
$$(z - F(z), z - \tilde{x}) \ge 0 \tag{2.3.10}$$
is also true. Now take t ∈ (0, 1) and put
$$z = (1 - t)\tilde{x} + tF(\tilde{x}) \qquad (z \in M).$$
For t → 0, the inequality (2.3.10) divided by t yields ‖x̃ − F(x̃)‖² ≤ 0.

Step 4. If x is a fixed point of F, then
$$\|x_n - x\|^2 = \|F(x_{n-1}) - F(x)\|^2 \le \|x_{n-1} - x\|^2$$
and, therefore, the limit ϕ(x) := lim_{n→∞} ‖x_n − x‖² exists. By Step 3, x̃ is also a fixed point, and we get
$$\varphi(\tilde{x}) \le \|\tilde{x} - x_k\|^2 = \|\tilde{x} - v\|^2 + \|v - x_k\|^2 + 2\operatorname{Re}(\tilde{x} - v, v - x_k) \quad\text{for any } v \in H.$$
Summing up from k = 0 to k = n − 1 and dividing by n we arrive at
$$\varphi(\tilde{x}) \le \|\tilde{x} - v\|^2 + \frac{1}{n}\sum_{k=0}^{n-1}\|v - x_k\|^2 + 2\operatorname{Re}(\tilde{x} - v, v - y_n). \tag{2.3.11}$$
Let v be a weak limit of a subsequence {y_{n_l}}_{l=1}^∞ ⊂ {y_n}_{n=1}^∞, possibly different from {y_{n_k}}_{k=1}^∞. Then v is a fixed point of F by virtue of the previous steps. Set n = n_k and take the limit for k → ∞ in (2.3.11), recalling that y_{n_k} converges weakly to x̃. We finally obtain
$$\varphi(\tilde{x}) \le \varphi(v) - \|\tilde{x} - v\|^2.$$
[Footnote 33: Observe that lim_{l→∞} (1/n_l) Σ_{j=0}^{n_l−1} ‖x − x_j‖² = lim_{n→∞} ‖x − x_n‖².]
Interchanging the roles of x̃ and v (and using the subsequence {y_{n_l}}) gives ϕ(v) ≤ ϕ(x̃) − ‖x̃ − v‖² as well, and v = x̃ follows. In particular, the limit of any weakly convergent subsequence of {y_n}_{n=1}^∞ coincides with x̃, and therefore the whole sequence {y_n}_{n=1}^∞ weakly converges to x̃.

Remark 2.3.11. We have mentioned in footnote 32 that I − F is a monotone operator whenever F is non-expansive. The converse statement is not true even in R. Consider, e.g., F(x) = −2x. Proposition 2.3.10 should be compared with Theorem 5.3.4.

In the following three exercises we briefly show other modifications of the Contraction Principle.
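The averaging in Proposition 2.3.10 matters because the plain iterates of a non-expansive map need not converge. The following sketch illustrates this with an example chosen here, not taken from the text: F is the rotation of the plane by 90 degrees, which maps the closed unit disc into itself and has the origin as its only fixed point. The helper names `rotate` and `cesaro_means` are hypothetical, and in R² weak and norm convergence coincide.

```python
import math

def rotate(p):
    """Rotation of the plane by 90 degrees: an isometry, hence non-expansive."""
    return (-p[1], p[0])

def cesaro_means(F, x0, n_max):
    """Return the last iterate x_{n_max - 1} and the mean y_n = (1/n) sum_{k<n} x_k for n = n_max."""
    total = [0.0, 0.0]
    x = x0
    last = x0
    for _ in range(n_max):
        total = [total[0] + x[0], total[1] + x[1]]
        last = x
        x = F(x)
    return last, (total[0] / n_max, total[1] / n_max)

last_iterate, mean = cesaro_means(rotate, (1.0, 0.0), 1001)
print(last_iterate, mean)
```

The iterates cycle through four points on the unit circle and never approach the fixed point, while the means y_n tend to the origin at the rate 1/n.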
Exercise 2.3.12. If M is a complete metric space, F : M → M and there is a function V : M → R⁺ ∪ {+∞} such that
$$V(F(x)) + \varrho(x, F(x)) \le V(x), \qquad x \in M, \tag{2.3.12}$$
then for arbitrary
$$x_0 \in M, \qquad x_n = F(x_{n-1}),$$
the sequence {x_n}_{n=1}^∞ is convergent in M to an x̃. Moreover, if the graph of F is closed in M × M, then F(x̃) = x̃.
Hint. Show that {V(x_n)}_{n=1}^∞ is a decreasing sequence; this implies that {x_n}_{n=1}^∞ is a Cauchy sequence.

Remark 2.3.13. The condition (2.3.12) is suitable for a vector-valued mapping F and plays an important role in game theory. For details see, e.g., Aubin & Ekeland [10, Chapter VI].

Exercise 2.3.14. Let M be a complete metric space and let F : M → M. If there is n ∈ N such that F^n is a contraction, then F has a unique fixed point in M.
Hint. Let x̃ be a fixed point of G := F^n,
$$\tilde{x} = \lim_{k\to\infty} G^k(x_0).$$
Estimate ϱ(F(G^k(x_0)), G^k(x_0)). It is possible to show that x̃ = lim_{k→∞} F^k(x_0).
Remark 2.3.15. The power $n \in \mathbb{N}$ need not be the same for all $x, y \in M$, i.e., if there is $q \in [0, 1)$ and for every $x \in M$ there exists $n(x) \in \mathbb{N}$ such that
\[ \varrho(F^{n(x)}(x), F^{n(y)}(y)) \le q \varrho(x, y) \quad \text{for all } y \in M, \]
then $F$ also has a unique fixed point (Sehgal [119]). The proof is similar to the previous one.

Exercise 2.3.16 (Edelstein). Let $M$ be a compact metric space and let $F \colon M \to M$ satisfy the condition
\[ \varrho(F(x), F(y)) < \varrho(x, y) \quad \text{for all } x, y \in M,\ x \ne y. \]
Then $F$ has a unique fixed point in $M$.

Hint. Only existence has to be proved: By compactness there is a convergent subsequence $F^{n_k}(x_0) \to \tilde x$. Now show that the sequence $\alpha_n := \varrho(F^n(x_0), F^{n+1}(x_0))$ is decreasing and
\[ \lim_{n\to\infty} \alpha_n = \varrho(\tilde x, F(\tilde x)) = \varrho(F(\tilde x), F^2(\tilde x)), \]
i.e., $F(\tilde x) = \tilde x$.
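A small numerical sketch may help to see what the hint of Exercise 2.3.16 describes. The map `F(x) = x - x**2/2` on the compact interval $[0, 1]$ is a hypothetical choice, not taken from the text: it satisfies $|F(x) - F(y)| = |x - y|\,|1 - (x + y)/2| < |x - y|$ for $x \ne y$, but it is not a contraction because the ratio tends to $1$ near $0$. The quantities `alphas[n]` play the role of $\alpha_n$.

```python
def F(x):
    """Hypothetical map on [0, 1]: strictly shrinks distances, fixed point 0."""
    return x - 0.5 * x * x

x = 1.0          # starting point x_0
alphas = []      # alphas[n] = |F^n(x_0) - F^{n+1}(x_0)|
for _ in range(1000):
    fx = F(x)
    alphas.append(abs(x - fx))
    x = fx

# After the loop x holds F^1000(x_0); it approaches the fixed point 0 only slowly.
is_decreasing = all(later < earlier for earlier, later in zip(alphas, alphas[1:]))
```

The slow, sublinear decay of the iterates reflects the missing contraction constant $q < 1$; compactness is what replaces it in the existence argument.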
Exercise 2.3.17. Let
\[ K = \{x \in C[0, 1] : 0 \le x(t) \le 1,\ x(0) = 0,\ x(1) = 1\}, \quad F \colon x(t) \mapsto t x(t). \]
Then $F(K) \subset K$, $F$ is non-expansive and there is no fixed point of $F$ in $K$! Prove these facts and explain their relation to Proposition 2.3.10.

Exercise 2.3.18. Let $x \in L^1(0, 1)$ be a solution of
\[ x(t) = \lambda \int_0^t k(t, s) x(s)\, ds \]
where $\lambda$ and $k$ are as in Example 2.3.7. Prove that $x = 0$ a.e. in $(0, 1)$.

Hint. First show that $x \in L^\infty(0, 1)$. From the equation we have
\[ \|x\|_{L^\infty(0,t)} \le |\lambda|\, t\, \|k\|_{L^\infty(M)} \|x\|_{L^\infty(0,t)}, \quad t \in (0, 1). \]
Now deduce $x = 0$ a.e. in $(0, 1)$.

Exercise 2.3.19. Prove Corollary 2.1.3 using Theorem 2.3.1.

Exercise 2.3.20. Let $f \in C[0, 1]$. Prove that there exists $\varepsilon_0 > 0$ such that for any $\varepsilon \in [0, \varepsilon_0]$ the boundary value problem
\[ \ddot x(t) - x(t) + \varepsilon \arctan x(t) = f(t), \quad t \in (0, 1), \qquad x(0) = x(1) = 0, \]
has a unique solution $x \in C^2[0, 1]$.

Exercise 2.3.21. Let $K$ be a continuous real function on $[a, b] \times [a, b] \times \mathbb{R}$ and assume there exists a constant $N > 0$ such that for any $t, \tau \in [a, b]$, $z_1, z_2 \in \mathbb{R}$, we have
\[ |K(t, \tau, z_1) - K(t, \tau, z_2)| \le N |z_1 - z_2|. \]
Let $h \in C[a, b]$ be fixed and let $\lambda \in \mathbb{R}$ be such that
\[ |\lambda| < \frac{1}{N(b - a)}. \]
Prove that the integral equation
\[ x(t) = \lambda \int_a^b K(t, \tau, x(\tau))\, d\tau + h(t) \]
has a unique solution $x \in C[a, b]$.

Exercise 2.3.22. Let $A \colon (a, b) \to \mathbb{R}^{M \times M}$ be a continuous matrix-valued function and let $\alpha \in (a, b)$, $\xi \in \mathbb{R}^M$.
(i) Modify the procedure from Example 2.3.7 to prove that the initial value problem
\[ \dot x(t) = A(t) x(t), \quad x(\alpha) = \xi, \]
has a unique solution which is defined on $(a, b)$.

(ii) Prove that the equation
\[ \dot x(t) = A(t) x(t) \tag{2.3.13} \]
has $M$ linearly independent solutions $\varphi_1, \dots, \varphi_M$ on the interval $(a, b)$ and any solution of (2.3.13) is a linear combination of $\varphi_1, \dots, \varphi_M$. The matrix $\Phi = (\varphi_i^j)_{i,j=1,\dots,M}$ is called a fundamental matrix of (2.3.13).

(iii) Let $A$ be continuous on $\mathbb{R}$ and $T$-periodic ($T > 0$). Denote $C = \Phi(T)$ where $\Phi$ is a fundamental matrix, $\Phi(0) = I$. Suppose that $B$ is a solution of the equation $e^{TB} = C$ (see Exercise 1.1.42). Prove that $Q(t) := \Phi(t) e^{-tB}$ is regular for all $t \in \mathbb{R}$ and $T$-periodic. Moreover, $x$ is a solution to (2.3.13) if and only if $y(t) := Q^{-1}(t) x(t)$ is a solution of the equation $\dot y = By$ which has constant coefficients. Find a condition in terms of $\sigma(C)$ for the existence of a nontrivial $kT$-periodic solution to (2.3.13) ($k \in \mathbb{N}$).

(iv) Let $f \colon \mathbb{R} \to \mathbb{R}^M$ be a continuous and $T$-periodic mapping. Is there any relation between the existence of a nontrivial $T$-periodic solution to (2.3.13) and the existence of a $T$-periodic solution to the equation $\dot x(t) = A(t) x(t) + f(t)$?

Hint. Use the Variation of Constant Formula and (iii).
Chapter 3

Abstract Integral and Differential Calculus

3.1 Integration of Vector Functions

This short section is devoted to the integration of mappings which take values in a Banach space $X$. We will consider two types of domains of such mappings: either compact intervals or measurable spaces. For scalar functions the former case leads to the Riemann integral and the latter to the Lebesgue integral with respect to a measure.

Definition 3.1.1. Let $f \colon [a, b] \to X$. Let there exist $x \in X$ with the following property: For every $\varepsilon > 0$ there is $\delta > 0$ such that for all divisions $D := \{a = t_0 < \dots < t_n = b\}$ for which $|D| := \max_{i=1,\dots,n} (t_i - t_{i-1}) < \delta$ and for all choices $\tau_i \in [t_{i-1}, t_i]$, $i = 1, \dots, n$, the inequality
\[ \Bigl\| \sum_{i=1}^{n} f(\tau_i)(t_i - t_{i-1}) - x \Bigr\|_X < \varepsilon \tag{3.1.1} \]
is satisfied. Then $x$ is called the Riemann integral of $f$ over $[a, b]$ and it is denoted by $\int_a^b f(t)\, dt$.

A basic existence theorem is a straightforward generalization of the classical (Riemann's) result.

Theorem 3.1.2 (Graves). Let $X$ be a Banach space and let $f \colon [a, b] \to X$ be continuous. Then the Riemann integral $\int_a^b f(t)\, dt$ exists.
Proof. Since $f$ is continuous on the compact interval $[a, b]$, $f$ is uniformly continuous on it. Take an equidistant division $D_n = \{a = t_0^n < \dots < t_n^n = b\}$ of the interval $[a, b]$, i.e.,
\[ t_i^n = a + \frac{i}{n}(b - a), \quad i = 1, \dots, n, \quad \text{and} \quad |D_n| = \frac{b - a}{n}. \]
Then
\[ s_n = \sum_{i=1}^{n} f(t_i^n)(t_i^n - t_{i-1}^n) \]
forms a Cauchy sequence (by the uniform continuity of $f$). Let $x = \lim_{n\to\infty} s_n$. It is easy to see, again by the uniform continuity of $f$, that condition (3.1.1) is satisfied whenever $|D|$ is sufficiently small.

Since the Riemann integral is linear and the estimate$^{1}$
\[ \Bigl\| \int_a^b f(t)\, dt \Bigr\|_X \le \int_a^b \|f(t)\|_X\, dt \le \|f\|_{C([a,b],X)} (b - a) \tag{3.1.2} \]
holds for each $f \in C([a, b], X)$, the integral is a linear continuous operator. Its commutativity with linear operators is important.

Proposition 3.1.3. Let $X$, $Y$ be Banach spaces and let $f \colon [a, b] \to X$ be Riemann integrable.

(i) If $A \in L(X, Y)$, then $Af$ is also integrable and
\[ A \int_a^b f(t)\, dt = \int_a^b A f(t)\, dt \quad \text{holds.} \tag{3.1.3} \]

(ii) If $A \colon X \to Y$ is a linear closed operator and $Af$ is Riemann integrable, then
\[ \int_a^b f(t)\, dt \in \operatorname{Dom} A \]
and (3.1.3) is true.

Proof. The verification of (i) is straightforward. For (ii) we choose a sequence $\{s_n\}_{n=1}^{\infty}$ of Riemann sums such that
\[ \lim_{n\to\infty} s_n = \int_a^b f(t)\, dt \quad \text{and} \quad \lim_{n\to\infty} A s_n = \int_a^b A f(t)\, dt. \]
The statement follows from the definition of a closed operator.

With the help of this generalization of the Riemann integral we can also prove a basic result on the existence and uniqueness of a solution of a differential equation in a Banach space.

1 Here $\|f\|_{C([a,b],X)} = \max_{t \in [a,b]} \|f(t)\|_X$.
Assume that $f \colon I \times G \to X$ where $I$ is an open interval in $\mathbb{R}$, $G$ is an open subset of a Banach space $X$. By a solution of a differential equation
\[ \dot x = f(t, x) \tag{3.1.4} \]
we mean a mapping $x \colon J \to X$ where $J$ is an open interval, $J \subset I$, such that $x(J) \subset G$ and for every $t \in J$ the limit
\[ \dot x(t) := \lim_{\tau \to 0} \frac{x(t + \tau) - x(t)}{\tau} \]
exists and $\dot x(t) = f(t, x(t))$.

Theorem 3.1.4. Let $I$ be an open interval in $\mathbb{R}$ and let $G$ be an open subset of a Banach space $X$. Assume that $f \colon I \times G \to X$ is continuous and locally satisfies the Lipschitz condition with respect to the second variable, i.e., for every $s \in I$, $y \in G$ there are $\delta > 0$, $\hat\delta > 0$, $L > 0$ such that
\[ \|f(t, x_1) - f(t, x_2)\| \le L \|x_1 - x_2\| \]
whenever $|t - s| < \delta$ and $\|x_i - y\| < \hat\delta$, $i = 1, 2$. Then for each $t_0 \in I$, $x_0 \in G$ there exists $h > 0$ such that the equation (3.1.4) has a unique solution on the interval $J = (t_0 - h, t_0 + h)$ which satisfies the initial condition
\[ x(t_0) = x_0. \tag{3.1.5} \]

The proof of this theorem is based on the use of the Contraction Principle for the equivalent integral equation (see also the proof of Theorem 2.3.4)
\[ x(t) = x_0 + \int_{t_0}^{t} f(s, x(s))\, ds,^{2} \quad t \in J, \tag{3.1.6} \]
where the integral is the Riemann integral. The equivalence of (3.1.4), (3.1.5) and (3.1.6) is established in the following lemma.

Lemma 3.1.5. Suppose that $f$ is continuous on $I \times G$ and $(t_0, x_0) \in I \times G$. Then a continuous function $x \colon J \to G$ is a solution of (3.1.4) on an interval $J \subset I$ and satisfies the condition (3.1.5) if and only if $t_0 \in J$ and $x$ solves on $J$ the integral equation (3.1.6).

2 Recall that $\int_{t_0}^{t} g(s)\, ds = -\int_{t}^{t_0} g(s)\, ds$ for $t < t_0$ (see footnote 28 on page 94).
Proof. Step 1. Assume first that $x$ is a solution of (3.1.4). Then $x$ as well as the mapping $t \in J \mapsto f(t, x(t))$ are continuous on $J$. Choose $\tau \in J$ and integrate both sides of (3.1.4) over the interval $[t_0, \tau]$ (or $[\tau, t_0]$). Notice that both sides are Riemann integrable (Theorem 3.1.2). Moreover,
\[ \varphi\Bigl( \int_{t_0}^{\tau} \dot x(t)\, dt \Bigr) = \int_{t_0}^{\tau} \varphi(\dot x(t))\, dt = \int_{t_0}^{\tau} \frac{d}{dt} \varphi(x(t))\, dt = \varphi(x(\tau)) - \varphi(x_0) \]
for all $\varphi \in X^*$ (the last equality follows from the so-called Basic Theorem of Calculus). By the Hahn–Banach Theorem, in particular Remark 2.1.17(ii), we have
\[ \int_{t_0}^{\tau} \dot x(t)\, dt = x(\tau) - x_0, \]
i.e., $x$ satisfies (3.1.6).

Step 2. Suppose now that $x \colon J \to X$ is a continuous solution of (3.1.6). Then $x$ satisfies (3.1.5) and it remains to check that
\[ \frac{d}{dt} \int_{t_0}^{t} f(s, x(s))\, ds = f(t, x(t)). \]
This can be done by copying the proof for the scalar case.

Proof of Theorem 3.1.4. Choose $\delta > 0$, $\hat\delta > 0$ small enough and $K > 0$ such that
\[ \|f(s, x_1)\| \le K, \quad \|f(s, x_1) - f(s, x_2)\| \le L \|x_1 - x_2\| \]
for $s \in [t_0 - \delta, t_0 + \delta]$, $\|x_i - x_0\| \le \hat\delta$, $i = 1, 2$. Let $0 < h \le \min\bigl\{ \frac{\hat\delta}{K}, \frac{1}{2L}, \delta \bigr\}$ and
\[ M_h = \{x \in C([t_0 - h, t_0 + h], X) : \|x(s) - x_0\| \le \hat\delta \text{ for } s \in [t_0 - h, t_0 + h]\}. \]
Then $M_h$ is a complete metric space (with respect to the induced metric) and the operator
\[ F(x) \colon t \mapsto x_0 + \int_{t_0}^{t} f(s, x(s))\, ds, \quad t \in [t_0 - h, t_0 + h], \]
is well defined on $M_h$, $F(M_h) \subset M_h$ (by (3.1.2)), and
\[ \|F(x_1) - F(x_2)\| = \sup_{t \in [t_0 - h, t_0 + h]} \Bigl\| \int_{t_0}^{t} [f(s, x_1(s)) - f(s, x_2(s))]\, ds \Bigr\| \le L h \|x_1 - x_2\| \le \frac{1}{2} \|x_1 - x_2\| \]
for $x_1, x_2 \in M_h$.
By the Contraction Principle (Theorem 2.3.1), there is a unique x ∈ Mh such that F (x) = x. Using Lemma 3.1.5 we conclude that x is a solution of (3.1.4), (3.1.5) on the interval J = (t0 − h, t0 + h).
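The successive approximations behind this existence proof can be computed exactly in a simple scalar case. The sketch below assumes the hypothetical data $X = \mathbb{R}$, $f(t, x) = x$, $t_0 = 0$, $x_0 = 1$ (so $L = 1$ and $h \le 1/2$ is admissible); none of this is prescribed by the text. Polynomials are stored as coefficient lists, and `picard_step` applies the operator $F(x)(t) = 1 + \int_0^t x(s)\, ds$.

```python
from fractions import Fraction

def picard_step(coeffs):
    """Map x(t) = sum_j c_j t^j to F(x)(t) = 1 + integral_0^t x(s) ds."""
    # Integrating c_j t^j gives c_j t^(j+1) / (j+1); the constant term is x_0 = 1.
    return [Fraction(1)] + [c / (j + 1) for j, c in enumerate(coeffs)]

def picard(n_iter):
    """Return the n_iter-th iterate F^n(x_0), starting from the constant function 1."""
    x = [Fraction(1)]
    for _ in range(n_iter):
        x = picard_step(x)
    return x

def evaluate(coeffs, t):
    """Evaluate the polynomial with the given coefficients at t."""
    return sum(float(c) * t ** j for j, c in enumerate(coeffs))

third_iterate = picard(3)                 # coefficients 1, 1, 1/2, 1/6
value_at_half = evaluate(picard(15), 0.5)  # close to exp(0.5)
```

In this case the iterates are the Taylor polynomials of $e^t$, so the fixed point of $F$ on $M_h$ is the restriction of the exponential function, in agreement with Lemma 3.1.5.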
Let $y$ be another solution of the same problem on the interval $J$. Then $y \in M_k$ for a $k \le h$. Because of the uniqueness in the Contraction Principle, $y(t) = x(t)$ for $t \in [t_0 - k, t_0 + k]$. Taking $(t_0 \pm k, x(t_0 \pm k))$ as new initial conditions we can extend the equality $y(t) = x(t)$ to a larger closed interval $[t_0 - \tilde k, t_0 + \tilde k]$, i.e., $y \in M_{\tilde k}$. This argument shows that $y(t) = x(t)$ for $t \in J$.

Corollary 3.1.6. Let $f$ satisfy the assumptions of Theorem 3.1.4 where $I = (\alpha, \infty)$, $G = X$. If, moreover, $f$ is bounded on $I \times X$, then for each $t_0 \in I$ and $x_0 \in X$, (3.1.4) has a unique solution satisfying the initial condition (3.1.5) which is defined on the whole interval $I$.

Proof. The problem (3.1.4), (3.1.5) has a solution $x_\gamma$ on an interval $(\beta, \gamma) \subset I$. Such an interval exists due to Theorem 3.1.4 and the solution $x_\gamma$ is unique on this interval by a similar argument as in the proof of uniqueness. Denote
\[ \tilde\gamma = \sup \{\gamma > \beta : \text{there is a solution } x_\gamma \text{ on } (\beta, \gamma)\}. \]
If $\gamma_1 < \gamma_2$ and $x_{\gamma_i}$ is a corresponding solution on $(\beta, \gamma_i)$, $i = 1, 2$, then
\[ x_{\gamma_1}(t) = x_{\gamma_2}(t) \quad \text{for } t \in (\beta, \gamma_1) \]
(by uniqueness). This allows us to define the solution $x = x(t)$ on the entire interval $(\beta, \tilde\gamma)$. Since
\[ \|x(t) - x(s)\| \le \Bigl\| \int_s^t f(\sigma, x(\sigma))\, d\sigma \Bigr\| \le |t - s| \sup_{(\tau, y) \in I \times X} \|f(\tau, y)\| \]
for any $\beta < s < t < \tilde\gamma$, the solution $x$ is uniformly continuous on $(\beta, \tilde\gamma)$ and, therefore, continuously extendable at $\tilde\gamma$ provided $\tilde\gamma < \infty$ (see Proposition 1.2.4). The local Theorem 3.1.4 allows us to continue $x$ as a solution beyond the value of $\tilde\gamma$, a contradiction. Hence $\tilde\gamma = \infty$. Similarly, we prove $\inf \beta = \alpha$.

Remark 3.1.7. Under the assumptions of Corollary 3.1.6 the solution $x$ depends continuously on the initial data. In order to formulate this result we denote by $x = x(\cdot\,; t_0, x_0)$ the solution of (3.1.4), (3.1.5) on the interval $I$. The continuous dependence now reads as follows: For any compact interval $J \subset I$, $t_0 \in J$, and any $\varepsilon > 0$ there is $\delta > 0$ such that
\[ \|x(t; t_0, x_0) - x(t; t_1, x_1)\| < \varepsilon \quad \text{for all } t \in J \]
provided $t_1 \in J$, $|t_0 - t_1| < \delta$ and $\|x_0 - x_1\| < \delta$. Cf. Remark 2.3.5.

Remark 3.1.8. Another existence theorem for the scalar differential equation (3.1.4) (i.e., $X = \mathbb{R}^N$) is based on the continuity of $f$ only (cf. Proposition 5.1.13).
Warning. A generalization to an infinite dimensional space does not hold, e.g.,
\[ f(x) = \Bigl( |x_n|^{\frac{1}{2}} + \frac{1}{n + 1} \Bigr)_{n=1}^{\infty} \quad \text{for } x = (x_1, x_2, \dots) \in c_0 \]
where $c_0$ is the space of sequences which converge to zero. As a norm on $c_0$ the "sup norm" is taken. It is not difficult to see that the equation $\dot x = f(x)$ has no solution satisfying the initial condition $x(0) = o$!

We now turn to the integration of vector functions defined on a measurable space $(M, \Sigma, \mu)$ where $\Sigma$ is a $\sigma$-algebra of subsets of $M$ and $\mu$ is a (positive) measure defined on $\Sigma$. A generalization of the abstract Lebesgue integral$^{3}$ can be done in two different ways: Either by integrating $\varphi \circ f$ over $M$ for all $\varphi \in X^*$ or by approximating $f$ by step functions for which the integral is naturally defined, and then passing to the limit. The former approach leads to a weak integral and the latter one to the so-called Bochner integral. Since an existence theorem for the weak integral (i.e., the existence of $x \in X$ such that $\varphi(x) = \int_M \varphi(f)\, d\mu$ for all $\varphi \in X^*$) is complicated, we only briefly describe the less general Bochner integral.

Definition 3.1.9. Let $(M, \Sigma, \mu)$ be a measurable space and let $X$ be a Banach space.

(1) A function $s \colon M \to X$ is called a step function if there are pairwise disjoint sets $M_1, \dots, M_n$ in $\Sigma$ with $\mu(M_k) < \infty$, $k = 1, \dots, n$, such that $s$ is constant (say, equal to $x_k$) on each $M_k$ and
\[ s(t) = o \quad \text{for } t \in M \setminus \bigcup_{k=1}^{n} M_k. \]
The integral of $s$ is defined by
\[ \int_M s\, d\mu = \sum_{k=1}^{n} x_k \mu(M_k). \]

(2) A function $f \colon M \to X$ is said to be strongly measurable if there is a sequence $\{s_m\}_{m=1}^{\infty}$ of step functions such that
\[ \lim_{m\to\infty} s_m(t) = f(t) \quad \text{exists for } \mu\text{-a.a. } t \in M. \]

(3) A strongly measurable function $f \colon M \to X$ is said to be Bochner integrable if there is a sequence $\{s_m\}_{m=1}^{\infty}$ of step functions which converges to $f$ $\mu$-a.e. and
\[ \lim_{m\to\infty} \int_M \|f - s_m\|_X\, d\mu = 0. \tag{3.1.7} \]
In this case we put
\[ \int_M f\, d\mu = \lim_{m\to\infty} \int_M s_m\, d\mu. \tag{3.1.8} \]

3 The reader who is not acquainted with measure theory and the abstract Lebesgue integral can assume that $M$ is an open subset of $\mathbb{R}^N$, $\mu$ is a Lebesgue measure and $\Sigma$ is a collection of all Lebesgue measurable subsets of $M$.
Remark 3.1.10. In order to show that this definition is correct we need to prove that the norm of a strongly measurable function is a measurable function (this is obvious) and, therefore, the condition (3.1.7) makes sense. From (3.1.7) it also immediately follows that the limit in (3.1.8) does not depend on any special choice of $\{s_n\}_{n=1}^{\infty}$.

The following statement offers a very useful criterion for Bochner integrability.

Proposition 3.1.11 (Bochner). Let $X$ be a Banach space and let $(M, \Sigma, \mu)$ be a measurable space. A strongly measurable vector function $f \colon M \to X$ is Bochner integrable if and only if the norm $\|f\|_X$ is Lebesgue integrable. Moreover,
\[ \Bigl\| \int_M f\, d\mu \Bigr\|_X \le \int_M \|f\|_X\, d\mu. \tag{3.1.9} \]

Proof. Step 1. Let $f$ be Bochner integrable and let $\{s_n\}_{n=1}^{\infty}$ be a sequence of step functions from the definition. Then $\bigl\{ \int_M \|s_n\|\, d\mu \bigr\}_{n=1}^{\infty}$ is a Cauchy sequence (by (3.1.7)), in particular, its limit, say $\alpha \in \mathbb{R}$, exists. Then
\[ \Bigl\| \int_M f\, d\mu \Bigr\| = \lim_{n\to\infty} \Bigl\| \int_M s_n\, d\mu \Bigr\| \le \lim_{n\to\infty} \int_M \|s_n\|\, d\mu = \alpha. \]
It is easy to see that $\alpha$ does not depend on any special choice of $\{s_n\}_{n=1}^{\infty}$ from the definition and, moreover,
\[ \int_M \|f\|\, d\mu \le \int_M \|f - s_n\|\, d\mu + \int_M \|s_n\|\, d\mu, \quad \text{i.e.,} \quad \int_M \|f\|\, d\mu \le \alpha. \]

Step 2. Suppose now that $\|f\|$ is Lebesgue integrable and that $\{s_n\}_{n=1}^{\infty}$ is a sequence of step functions from the definition of strong measurability. Put
\[ \sigma_n(t) = \begin{cases} s_n(t) & \text{if } \|s_n(t)\| \le \bigl(1 + \frac{1}{n}\bigr) \|f(t)\|, \\ o & \text{otherwise.} \end{cases} \]
Then $\sigma_n \to f$ $\mu$-a.e. and, by the Lebesgue Dominated Convergence Theorem,
\[ \int_M \|\sigma_n - f\|\, d\mu \to 0 \quad \text{since} \quad \|\sigma_n(t) - f(t)\| \le \Bigl(2 + \frac{1}{n}\Bigr) \|f(t)\|. \]
It follows from the independence of $\alpha$ of the special choice of approximating step functions that $\alpha \le \int_M \|f\|\, d\mu$. This proves the inequality (3.1.9).
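For a step function both sides of (3.1.9) are finite sums, so the inequality can be checked directly. The following sketch uses hypothetical data that do not come from the text: $X = \mathbb{R}^2$ with the Euclidean norm, three values `values[k]` playing the role of $x_k$, and numbers `measures[k]` playing the role of $\mu(M_k)$ for pairwise disjoint sets $M_k$.

```python
import math

# Hypothetical step function s: s = values[k] on M_k, with mu(M_k) = measures[k].
values = [(1.0, 0.0), (-1.0, 2.0), (0.0, -3.0)]
measures = [0.5, 1.5, 2.0]

def norm(x):
    """Euclidean norm on R^2."""
    return math.hypot(x[0], x[1])

# Integral of s as in Definition 3.1.9(1): sum over k of x_k * mu(M_k).
integral = tuple(sum(x[i] * m for x, m in zip(values, measures)) for i in range(2))

lhs = norm(integral)                                       # norm of the integral
rhs = sum(norm(x) * m for x, m in zip(values, measures))   # integral of the norm
```

The gap between `lhs` and `rhs` comes from cancellation between the values $x_k$; the two sides agree only when all nonzero values point in the same direction.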
Proposition 3.1.12. Let $X$, $Y$ be Banach spaces, let $(M, \Sigma, \mu)$ be a measurable space and let $f \colon M \to X$ be Bochner integrable.

(i) If $A \in L(X, Y)$, then $Af$ is also Bochner integrable and
\[ \int_M A f\, d\mu = A \int_M f\, d\mu. \tag{3.1.10} \]

(ii) If $A$ is a closed linear operator from $X$ into $Y$ and $Af$ is Bochner integrable, then
\[ \int_M f\, d\mu \in \operatorname{Dom} A \]
and (3.1.10) holds.

Proof. The proof of statement (i) is straightforward. To prove (ii) let $Z = \{(x, Ax) : x \in \operatorname{Dom} A\}$ be equipped with the graph norm $\|(x, Ax)\|_Z := \|x\|_X + \|Ax\|_Y$. Since $A$ is closed and $X$, $Y$ are Banach spaces, $Z$ is a Banach space as well. The crucial point of the proof is to show that
\[ g(t) := (f(t), A f(t)), \quad t \in M, \]
is strongly measurable. Achieving this$^{4}$ the rest of the proof is easy: By Proposition 3.1.11, $g$ is Bochner integrable and
\[ \int_M g\, d\mu = \Bigl( \int_M f\, d\mu, \int_M A f\, d\mu \Bigr) \in Z \]
(since $g$ maps into $Z$, its integral has to belong to $Z$, too). In particular, $\int_M f\, d\mu \in \operatorname{Dom} A$ and (3.1.10) holds.

4 We sketch the proof of this result: Let $\varphi \in Z^*$. According to the Hahn–Banach Theorem there is an extension $\Phi = (\Phi_1, \Phi_2)$ of $\varphi$ to $(X \times Y)^*$. Since $f$ and $Af$ are strongly measurable, we conclude that $t \mapsto \varphi(g(t)) = \Phi_1(f(t)) + \Phi_2(A f(t))$ is measurable. It can also be shown that there is $N \subset M$, $\mu(M \setminus N) = 0$ such that $g(N)$ is separable. The result now follows from the Pettis Theorem (see, e.g., Dunford & Schwartz [44, Chapter III, 6], Yosida [135]): A function $g \colon M \to Z$ (Banach space) is strongly measurable if and only if the following two conditions are satisfied:
(i) For every $\varphi \in Z^*$ the function $t \mapsto \varphi(g(t))$ is a measurable function.
(ii) There is $N \subset M$ such that $\mu(M \setminus N) = 0$ and $g(N)$ is a separable subspace of $Z$.
Remark 3.1.13. If $f \colon M \to X$ is a Bochner integrable function and $\varphi \in X^*$, then, by the previous proposition, $\varphi(f) \colon M \to \mathbb{R}$ (or $\mathbb{C}$) is integrable (in this case the Bochner and the Lebesgue integrals coincide). This shows that the Bochner integral is a restriction of any notion of a weak integral.

We now return to the functional calculus given for matrices (see Theorem 1.1.38). Let $B \in L(X)$ and let $H(\sigma(B))$ be a collection of holomorphic functions on a neighborhood of $\sigma(B)$ (this neighborhood can depend on a function). If $f \in H(\sigma(B))$, then there exists a positively oriented Jordan curve $\gamma$ such that $\sigma(B) \subset \operatorname{int} \gamma$ and $f$ is holomorphic on a neighborhood of $\operatorname{int} \gamma$. Hence the integral
\[ f(B)x := \frac{1}{2\pi i} \int_\gamma f(w)(wI - B)^{-1} x\, dw, \quad x \in X, \]
exists. Its properties are collected in the following assertions.

Proposition 3.1.14 (Dunford Functional Calculus). Let $X$ be a complex Banach space and let $B \in L(X)$. There exists a unique linear mapping $\Phi \colon H(\sigma(B)) \to L(X)$ with the following properties:

(i) $\Phi(fg) = \Phi(f)\Phi(g) = \Phi(g)\Phi(f)$ for $f, g \in H(\sigma(B))$;

(ii) if $P(w) = \sum_{j=0}^{n} a_j w^j$, then $P(B) = \sum_{j=0}^{n} a_j B^j$;

(iii) if $f(w) = \frac{1}{\lambda - w}$ for $w \ne \lambda$ and $\lambda \notin \sigma(B)$, then $f(B) = (\lambda I - B)^{-1}$;

(iv) if $\{f_n\}_{n=1}^{\infty} \subset H(\sigma(B))$, $f_n \rightrightarrows f$ on a neighborhood of $\operatorname{int} \gamma$, then we have $f_n(B) \to f(B)$ in the norm topology;

(v) if $f \in H(\sigma(B))$, then $\sigma(f(B)) = f(\sigma(B))$.

Proof. The proof can be found, e.g., in Dunford & Schwartz [44, Section VII.3].

Suppose that $\lambda_0 \in \sigma(B)$ is an isolated point of the spectrum of $B$, i.e., there exist disjoint neighborhoods $U_0$ of $\lambda_0$ and $U$ of $\sigma(B) \setminus \{\lambda_0\}$. The function
\[ f(w) = \begin{cases} 1 & \text{for } w \in U_0, \\ 0 & \text{for } w \in U, \end{cases} \]
belongs to the collection $H(\sigma(B))$ and the operator $P_0 := f(B)$ is a projection of $X$ onto the subspace $X_0 := P_0(X)$ since $P_0^2 = P_0$. The operator $P_1 := I - P_0$ is a projection onto the complementary subspace $X_1$, i.e., $X = X_0 \oplus X_1$. Denote by $B_0$ and $B_1$ the restrictions of $B$ onto $X_0$ (i.e., $B_0 \in L(X_0)$) and $X_1$, respectively. Proposition 3.1.14(v) implies that
\[ \sigma(B_0) = \{\lambda_0\}, \quad \sigma(B_1) = \sigma(B) \setminus \{\lambda_0\}. \]
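In finite dimensions the projection $P_0$ can be approximated by discretizing the contour integral. The sketch below is an illustration under assumptions that are not in the text: the symmetric $2 \times 2$ matrix `B` (eigenvalues $1$ and $3$), the choice $\lambda_0 = 1$, the circle of radius `r = 0.5` around $\lambda_0$ and the trapezoidal rule with `n` nodes are all hypothetical choices.

```python
import cmath

B = ((2.0, 1.0), (1.0, 2.0))   # hypothetical matrix with eigenvalues 1 and 3
lam0, r, n = 1.0, 0.5, 400     # circle around lam0 = 1 and number of nodes

def resolvent(w):
    """Return (wI - B)^{-1} for the 2x2 matrix B via the adjugate formula."""
    a, b_ = w - B[0][0], -B[0][1]
    c, d = -B[1][0], w - B[1][1]
    det = a * d - b_ * c
    return ((d / det, -b_ / det), (-c / det, a / det))

def spectral_projection():
    """Approximate P0 = (1/(2 pi i)) * integral over the circle of (wI - B)^{-1} dw."""
    P = [[0j, 0j], [0j, 0j]]
    for k in range(n):
        theta = 2 * cmath.pi * k / n
        w = lam0 + r * cmath.exp(1j * theta)
        dw = 1j * r * cmath.exp(1j * theta) * (2 * cmath.pi / n)
        R = resolvent(w)
        for i in range(2):
            for j in range(2):
                P[i][j] += R[i][j] * dw / (2j * cmath.pi)
    return P

def matmul(A, C):
    """Product of two 2x2 matrices given as nested sequences."""
    return [[sum(A[i][k] * C[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

P0 = spectral_projection()
P0_squared = matmul(P0, P0)    # should agree with P0 up to quadrature error
```

For this matrix the eigenvector belonging to $\lambda_0 = 1$ is proportional to $(1, -1)$, so the exact projection has entries $\pm\frac{1}{2}$, which the quadrature reproduces to high accuracy.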
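Put
\[ \gamma_0 := \{\lambda_0 + r e^{i\varphi} : \varphi \in [0, 2\pi]\} \quad \text{for a (small) positive } r. \]
Using Proposition 3.1.14 we get (see Exercise 3.1.16)
\[ \begin{aligned} (\mu I_0 - B_0)^{-1} x_0 &= \frac{1}{2\pi i} \int_{\gamma_0} \frac{(\lambda I_0 - B_0)^{-1} x_0}{\mu - \lambda}\, d\lambda \\ &= \frac{1}{2\pi i} \int_{\gamma_0} \sum_{n=0}^{\infty} \frac{1}{\mu - \lambda_0} \Bigl( \frac{\lambda - \lambda_0}{\mu - \lambda_0} \Bigr)^{n} (\lambda I_0 - B_0)^{-1} x_0\, d\lambda \\ &= \frac{x_0}{\mu - \lambda_0} + \sum_{n=1}^{\infty} \frac{(-1)^n}{(\mu - \lambda_0)^{n+1}} (\lambda_0 I_0 - B_0)^n x_0 \end{aligned} \tag{3.1.11} \]
for $x_0 \in X_0$, $|\mu - \lambda_0| > r$. The Taylor series for the function $\mu \mapsto (\mu I_1 - B_1)^{-1} x_1$ has the form (see Exercise 3.1.16)
\[ \begin{aligned} (\mu I_1 - B_1)^{-1} x_1 &= \sum_{n=0}^{\infty} \frac{(\mu - \lambda_0)^n}{n!} \frac{d^n}{d\lambda^n} (\lambda I_1 - B_1)^{-1} x_1 \Big|_{\lambda = \lambda_0} \\ &= \sum_{n=0}^{\infty} (-1)^n (\mu - \lambda_0)^n (\lambda_0 I_1 - B_1)^{-(n+1)} x_1, \end{aligned} \tag{3.1.12} \]
$x_1 \in X_1$, $|\mu - \lambda_0| < r_0 := \|(\lambda_0 I_1 - B_1)^{-1}\|^{-1}$.

Proposition 3.1.15. If $\lambda_0$ is an isolated point of the spectrum $\sigma(B)$, $B \in L(X)$, then there exist operators $A_n \in L(X)$, $n \in \mathbb{Z}$, and $r > 0$ such that
\[ (\mu I - B)^{-1} x = \sum_{n=-\infty}^{+\infty} (\mu - \lambda_0)^n A_n x \tag{3.1.13} \]
for all $x \in X$ and $0 < |\mu - \lambda_0| < r$. Moreover, if $k \in \mathbb{N}$ is such that
\[ A_{-n} = O \quad \text{for every } n > k \quad \text{and} \quad z := A_{-k} x \ne o, \]
then $Bz = \lambda_0 z$. On the other hand, if $\lambda_0$ is a nonzero eigenvalue of a compact operator $B$, then $\lambda_0$ is a pole of the resolvent of $B$, i.e., there is $k \in \mathbb{N}$ such that
\[ A_{-n} = O \quad \text{for all } n > k. \]

Proof. Let $\lambda_0$ be an isolated point of $\sigma(B)$ and $B \in L(X)$. If $P_0$, $P_1$ are the above projections onto $X_0$, $X_1$, then
\[ (\mu I - B)^{-1} x = (\mu I - B_0)^{-1} x_0 + (\mu I - B_1)^{-1} x_1, \quad P_0 x = x_0, \quad P_1 x = x_1, \]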
and (3.1.13) follows from (3.1.11) and (3.1.12). Since $A_{-(k+1)} x = (B - \lambda_0 I) A_{-k} x$, the second statement holds as well.

Suppose now that $B$ is compact and $\lambda_0 \ne 0$ is an eigenvalue of $B$. By Corollary 2.2.13, $\lambda_0$ is an isolated point of $\sigma(B)$. Since the restriction $B_0$ of $B$ onto the subspace $X_0$ has the spectrum $\sigma(B_0)$ consisting of $\lambda_0$ only, $B_0^{-1}$ exists and is continuous. Therefore, the unit ball $B(x_0; 1) = B_0(B_0^{-1}(B(x_0; 1)))$ is a compact set. Proposition 1.2.15 says that $M = \dim X_0$ is finite. It follows from Lemma 1.1.31 that $X_0 = \operatorname{Ker} (\lambda_0 I_0 - B_0)^k$ for a certain $k \in \mathbb{N}$ and (see (1.1.20))
\[ (\mu I_0 - B_0)^{-1} x_0 = \sum_{n=0}^{k-1} \frac{(-1)^n}{(\mu - \lambda_0)^{n+1}} (\lambda_0 I_0 - B_0)^n x_0 = \sum_{n=1}^{k} \frac{A_{-n} x}{(\mu - \lambda_0)^n} \]
where $P_0 x = x_0$. The proof is complete.

Exercise 3.1.16. Give details to confirm the formulae of resolvent (3.1.11) and (3.1.12).

Hint. For (3.1.11) interchange the sum and the integral and use Proposition 3.1.14(ii). For (3.1.12) use the resolvent identity
\[ (\lambda I - B)^{-1} - (\mu I - B)^{-1} = (\mu - \lambda)(\lambda I - B)^{-1} (\mu I - B)^{-1} \]
and induction.

Exercise 3.1.17. Compare the functional calculus from Proposition 3.1.14 with that of Remark 2.2.18. More precisely, show that for a compact, self-adjoint operator $B$ the functional calculus given in Remark 2.2.18 is an extension of that of Proposition 3.1.14.

Exercise 3.1.18. Let $X$ be a Banach space. Assume that $f \colon [a, b] \to X$ has the Riemann integral over the interval $[a, b]$. Show that then the Bochner integral $\int_a^b f(t)\, dt$ also exists and the two integrals are equal. In particular, Proposition 3.1.3 is a special case of Proposition 3.1.12. However, the proof of Proposition 3.1.3(ii) is much simpler.

Exercise 3.1.19. Let $A \colon \operatorname{Dom} A \subset H \to H$ be a densely defined linear operator on a Hilbert space $H$. Assume that $A$ has a compact resolvent that is also self-adjoint.

(i) Extend the functional calculus (Remark 2.2.18 and Proposition 3.1.14) to such $A$. In particular, show that the formula for $\Phi(f)x$ still holds provided that
\[ \sum_{n=1}^{\infty} |f(\lambda_n)|^2 |(x, e_n)|^2 < \infty \]
(here $\sigma(A) = \{\lambda_1, \dots\}$). Notice that $\Phi(f)$ need not be bounded if $\dim H = \infty$ since $\sigma(A)$ is unbounded in this case. Also, $\Phi(f)$, $\Phi(g)$ do not commute in general.

(ii) Suppose that $\sigma(A)$ is bounded above. Show that the function
\[ u(t) := e^{tA} x_0 \quad \text{with } x_0 \in \operatorname{Dom} A \]
is a continuous solution to the initial value problem
\[ \dot x(t) = A x(t), \quad x(0) = x_0. \tag{3.1.14} \]

(iii) Prove that
\[ \int_0^t e^{sA} x\, ds \in \operatorname{Dom} A \quad \text{for all } x \in H \]
and
\[ A \int_0^t e^{sA} x\, ds = e^{tA} x - x. \]
In other words, $e^{tA} x$ is a continuous solution of the integral form of (3.1.14).

(iv) Prove that
\[ \int_0^\infty e^{-\lambda t} e^{tA} x\, dt = (\lambda - A)^{-1} x \quad \text{for all } x \in H \]
and sufficiently large $\operatorname{Re} \lambda$ (actually for $\operatorname{Re} \lambda > \sup\{(Ax, x) : x \in \operatorname{Dom} A,\ \|x\| = 1\}$).

(v) Let $g \colon [0, \infty) \to H$ be a continuous mapping and $u \colon [0, \infty) \to H$ a solution to the initial value problem
\[ \dot x(t) = A x(t) + g(t), \quad x(0) = x_0. \]
Show that
\[ u(t) = e^{tA} x_0 + \int_0^t e^{(t-s)A} g(s)\, ds \quad \text{for } t \ge 0. \]

(vi) Find conditions on a continuous mapping $h \colon H \to H$ such that the existence of a continuous solution to the integral equation
\[ x(t) = e^{tA} x_0 + \int_0^t e^{(t-s)A} h(x(s))\, ds \]
follows from the Contraction Principle. Such a solution is called a mild solution of the problem
\[ \dot x(t) = A x(t) + h(x(t)), \quad x(0) = x_0. \tag{3.1.15} \]
If −A is as in Exercise 2.2.21(i), (ii), (iv), or, more generally, A = ∆ with suitable boundary conditions, then (3.1.15) is a semilinear partial differential equation of parabolic type. Exercise 3.1.20. Let B be a compact, self-adjoint operator on a Hilbert space H and let λ0 ∈ σ(B) \ {0}. Compute An in the expression (3.1.13).
3.2 Differential Calculus in Normed Linear Spaces

We suppose that the reader is acquainted with partial derivatives and the differential of functions of two real variables. Our goal in this section is to extend these notions to mappings between normed linear spaces. Many infinite dimensional spaces differ from $\mathbb{R}^N$ by the lack of any natural basis. In particular this means that there is no way of generalizing partial derivatives. We define a directional derivative instead.

Definition 3.2.1. Let $X$, $Y$ be normed linear spaces (both over the same scalar field) and let $f \colon X \to Y$. If for $a, h \in X$ the limit (in the norm of $Y$)
\[ \lim_{\substack{t \to 0 \\ t \in \mathbb{R}}} \frac{f(a + th) - f(a)}{t} \]
exists, then its value is called the derivative of $f$ at the point $a$ and in the direction $h$ (or directional derivative or Gâteaux variation) and is denoted by $\delta f(a; h)$. If $\delta f(a; h)$ exists for all $h \in X$ and the mapping $Df(a) \colon h \mapsto \delta f(a; h)$ is linear and continuous, then $Df(a)$ is called the Gâteaux derivative of $f$ at the point $a$.$^{5}$

Remark 3.2.2. Simple examples of functions of two variables show that the directional derivative need not be linear in $h$ and not even the existence of $Df(a)$ guarantees the continuity of $f$ at the point $a$.

Example 3.2.3. Consider the standard bases $e_1^M, \dots, e_M^M$ and $e_1^N, \dots, e_N^N$ of $\mathbb{R}^M$ and $\mathbb{R}^N$, respectively. Then we can write $f \colon \mathbb{R}^M \to \mathbb{R}^N$ in the form
\[ f(x) = \sum_{i=1}^{N} f^i(x) e_i^N \quad \text{(or briefly } f = (f^1, \dots, f^N)\text{)}. \]
It is easy to see that $\delta f(a; h)$ exists if and only if $\delta f^i(a; h)$ exists for all $i = 1, \dots, N$. In particular, for $h = e_j^M$, the directional derivative $\delta f^i(a; h)$ is nothing else than $\frac{\partial f^i}{\partial x_j}(a)$. This means that the Gâteaux derivative $Df(a)$ has the matrix representation with respect to the standard bases in the form
\[ \begin{pmatrix} \dfrac{\partial f^1}{\partial x_1}(a) & \dots & \dfrac{\partial f^1}{\partial x_M}(a) \\ \vdots & \ddots & \vdots \\ \dfrac{\partial f^N}{\partial x_1}(a) & \dots & \dfrac{\partial f^N}{\partial x_M}(a) \end{pmatrix}. \]
This matrix is called the Jacobi matrix of $f$ at the point $a$. If $M = N$, then its determinant is denoted by
\[ J_f(a) = \frac{\partial(f^1, \dots, f^M)}{\partial(x_1, \dots, x_M)} \]
and is called the Jacobian of $f$ at $a$.

5 The terminology concerning Gâteaux differentiability is not fixed. Some authors do not assume linearity of $Df(a)$.
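The description of the Jacobi matrix through directional derivatives in the directions $e_j^M$ translates directly into a numerical approximation. The following sketch rests on assumptions that are not in the text: the map `f` on $\mathbb{R}^2$ is a hypothetical example, and $\delta f(a; h)$ is replaced by a central difference quotient with a small step `t`.

```python
import math

def f(x):
    """Hypothetical map f = (f^1, f^2) from R^2 to R^2."""
    return (x[0] ** 2 * x[1], math.sin(x[0]) + x[1])

def directional_derivative(func, a, h, t=1e-6):
    """Central difference approximation of delta f(a; h)."""
    plus = func(tuple(ai + t * hi for ai, hi in zip(a, h)))
    minus = func(tuple(ai - t * hi for ai, hi in zip(a, h)))
    return tuple((p - m) / (2 * t) for p, m in zip(plus, minus))

def jacobi_matrix(func, a):
    """Entry (i, j) approximates the partial derivative of f^i with respect to x_j at a."""
    M = len(a)
    # Column j is the directional derivative in the direction of the j-th basis vector.
    cols = [directional_derivative(func, a,
                                   tuple(1.0 if k == j else 0.0 for k in range(M)))
            for j in range(M)]
    N = len(cols[0])
    return [[cols[j][i] for j in range(M)] for i in range(N)]

a = (1.0, 2.0)
J = jacobi_matrix(f, a)
jacobian = J[0][0] * J[1][1] - J[0][1] * J[1][0]   # determinant, since M = N = 2
```

For this `f` the exact Jacobi matrix at $a = (1, 2)$ has rows $(4, 1)$ and $(\cos 1, 1)$, so the Jacobian equals $4 - \cos 1$.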
Example 3.2.4. Suppose that $H$ is a Hilbert space and $f \colon H \to \mathbb{R}$ (or $\mathbb{C}$) has a Gâteaux derivative $Df(a)$ at $a \in H$. Then, by the Riesz Representation Theorem, there exists a unique point $\nabla f(a) \in H$ such that $Df(a)h = (h, \nabla f(a))_H$. The element $\nabla f(a)$ is called the gradient of $f$ at $a$. Notice that the gradient $\nabla f$ is a mapping from $H$ into itself.

Remark 3.2.5. One of the most important applications of the notion of derivative is in extremal problems of classical analysis. The well-known theorem (due to Fermat) asserts that the derivative is zero at an extremal point provided this derivative exists. The same result obviously holds for $f \colon X \to \mathbb{R}$ also in an infinite dimensional space $X$.$^{6}$

The previous remark indicates the use of the notion of derivative for solving the equation $F(x) = o$ for $F \colon H \to H$. Namely, suppose that there is a functional $f \colon H \to \mathbb{R}$ such that $\nabla f = F$. Then it is sufficient to show that $f$ has a local maximum or minimum. However, it is a very nontrivial problem to find such $f$ (the so-called potential of $F$) or to find conditions to ensure its existence. See Chapter 6 for more details. A discussion of the finite dimensional case ($H = \mathbb{R}^M$) is given in Appendix 4.3B (Remark 4.3.62 and Theorem 4.3.64). We postpone examples since various properties of the derivative will be needed to introduce them.

6 A simple reason for this observation comes from the fact that the directional derivative $\delta f(a; h)$ describes the behavior of the functional $f$ along the straight line $\{a + th : t \in \mathbb{R}\}$, i.e., the behavior of the real function $t \mapsto f(a + th)$ near zero.
Theorem 3.2.6 (Mean Value Theorem). Let $X$ be a normed linear space and $Y$ a Banach space. Let $f \colon X \to Y$ have the directional derivative at all points of the segment joining points $a, b \in X$ in the direction of this segment, i.e., $\delta f(a + t(b - a); b - a)$ exists for all $t \in [0, 1]$. If the mapping $t \mapsto \delta f(a + t(b - a); b - a)$ is continuous on $[0, 1]$, then
\[ f(b) - f(a) = \int_0^1 \delta f(a + t(b - a); b - a)\, dt. \tag{3.2.1} \]

Proof. Take a $\varphi \in Y^*$ and denote
\[ g(t) = \varphi(f(a + t(b - a))), \quad t \in [0, 1]. \]
By the definition of the directional derivative, we have $g'(t) = \varphi[\delta f(a + t(b - a); b - a)]$ and $g'$ is continuous on $[0, 1]$. It follows from the Basic Theorem of Calculus that
\[ \int_0^1 \varphi[\delta f(a + t(b - a); b - a)]\, dt = g(1) - g(0) = \varphi(f(b)) - \varphi(f(a)). \]
The Riemann integral $\int_0^1 \delta f(a + t(b - a); b - a)\, dt$ exists (see Theorem 3.1.2) and, by Proposition 3.1.12(i), we get
\[ \int_0^1 \varphi[\delta f(a + t(b - a); b - a)]\, dt = \varphi\Bigl( \int_0^1 \delta f(a + t(b - a); b - a)\, dt \Bigr). \]
Since $\varphi \in Y^*$ has been chosen arbitrarily, the Hahn–Banach Theorem (in particular, Remark 2.1.17(ii)) implies the equality (3.2.1).

The following result offers another possible formulation.

Theorem 3.2.7 (Mean Value Theorem). Let $X$, $Y$ be normed linear spaces and let $f \colon X \to Y$. If for given $a, b \in X$ the directional derivative $\delta f(a + t(b - a); b - a)$ exists for all $t \in [0, 1]$, then
\[ \|f(b) - f(a)\|_Y \le \sup_{t \in [0,1]} \|\delta f(a + t(b - a); b - a)\|_Y \tag{3.2.2} \]
and
\[ \|f(b) - f(a) - \delta f(a; b - a)\|_Y \le \sup_{t \in [0,1]} \|\delta f(a + t(b - a); b - a) - \delta f(a; b - a)\|_Y. \tag{3.2.3} \]
Moreover, if $Df(a + t(b - a))$ exists for all $t \in [0, 1]$, then
\[ \|f(b) - f(a)\|_Y \le \sup_{t \in [0,1]} \|Df(a + t(b - a))\|_{L(X,Y)} \|b - a\|_X. \tag{3.2.4} \]
Proof. An idea similar to the previous proof is used. By the dual characterization of the norm (Corollary 2.1.16) there is $\varphi \in Y^*$, $\|\varphi\| = 1$, such that $\|f(b) - f(a)\| = \varphi(f(b) - f(a))$. Define now
\[ g(t) = \varphi(f(a + t(b - a))), \quad t \in [0, 1]. \]
Then $g'(t) = \varphi(\delta f(a + t(b - a); b - a))$ and, therefore, the function $g$ satisfies all assumptions of the classical Mean Value Theorem. Consequently, if $X$ is a real space, we get
\[ \|f(b) - f(a)\| = g(1) - g(0) = g'(\vartheta) = \varphi(\delta f(a + \vartheta(b - a); b - a)) \le \|\delta f(a + \vartheta(b - a); b - a)\| \quad \text{for a } \vartheta \in (0, 1). \]
If $X$ is a complex space, we consider $\operatorname{Re} g$ and obtain
\[ \|f(b) - f(a)\| \le \sup_{\vartheta \in (0,1)} |g'(\vartheta)| \]
(see the next remark) and the assertion also follows. The proof of (3.2.3) is similar and (3.2.4) is an easy consequence of (3.2.2).

Remark 3.2.8. The Mean Value Theorem for functions from $\mathbb{R} \to \mathbb{R}$ is often stated in the following form: There is $\vartheta \in (0, 1)$ such that
\[ f(b) - f(a) = f'(a + \vartheta(b - a))(b - a) \]
provided $f$ is continuous on the interval $[a, b]$ and $f'(x)$ exists for every $x \in (a, b)$.

Warning. This equality does not hold even for $f \colon \mathbb{R} \to \mathbb{C}$ ($\sim \mathbb{R}^2$) (e.g., $f(x) = e^{ix}$, $a = 0$, $b = 2\pi$)!

Example 3.2.9. Differentiability of the norm is connected with the properties of the corresponding space (see, e.g., Fabian et al. [49, Chapter 5]). As a simple example we will show the relation between the uniqueness of the supporting hyperplane at a given point $a \in X$, $\|a\| = 1$, and the Gâteaux differentiability of the norm at the point $a$. We recall that by Corollary 2.1.16 there is $\varphi \in X^*$, $\|\varphi\| = 1$, such that
\[ \varphi(a) = \|a\| = 1 \quad \text{and} \quad \operatorname{Re} \varphi(x) \le 1 \quad \text{for all } x \in X,\ \|x\| \le 1.^{7} \]

7 The hyperplane $M = \{x \in X : \varphi(x) = 1\}$ is then called a supporting hyperplane to the unit ball of $X$ at the point $a$. Such a $\varphi \in X^*$ need not be uniquely determined.
Put $f(x) = \|x\|$. Fix $h \in X$ and let $g(t) = \|a + th\|$, $t \in \mathbb{R}$. The function $g$ is a convex real function, and therefore there exist the right and the left derivatives at zero and $g'_-(0) \le g'_+(0)$. Further, we have
\[ \frac{g(t) - g(0)}{t} \ge \frac{\varphi(a + th) - \varphi(a)}{t} = \varphi(h) \quad \text{for } t > 0. \]
In particular, $g'_+(0) \ge \varphi(h)$ and similarly $g'_-(0) \le \varphi(h)$. This means that $\varphi$ is uniquely determined provided the directional derivative of the norm exists at $a$ for all $h \in X$. In particular,
\[ \delta f(a; h) = \varphi(h), \]
i.e., the norm is Gâteaux differentiable at $a$.

The converse is also true. Indeed, suppose by contradiction that $\delta f(a; h)$ does not exist for an $h$, i.e., $g'_+(0) > g'_-(0)$. Choose $\alpha \in [g'_-(0), g'_+(0)]$ and define $\psi(\gamma a + th) = \gamma + t\alpha$ for scalars $\gamma$, $t$. Then
\[ \alpha \le g'_+(0) \le \frac{g(t) - g(0)}{t} = \frac{\|a + th\| - \|a\|}{t} \quad \text{for } t > 0, \]
and therefore $\psi(a + th) = 1 + t\alpha \le \|a + th\|$. The same inequality holds for $t \le 0$. As an easy consequence we get
\[ |\psi(\gamma a + th)| \le \|\gamma a + th\|, \quad \psi(a) = 1. \]
This means that
\[ \psi \in Y^*, \quad \|\psi\| = 1 \quad \text{where} \quad Y = \operatorname{Lin}\{a, h\}. \]
The Hahn–Banach Theorem yields an extension $\varphi$ of $\psi$ which determines a supporting hyperplane. Since for a different $\alpha$ we get a different $\varphi$ there is no uniqueness of supporting hyperplanes at $a$ and the duality mapping$^{8}$ is not single-valued at $a$.

Similarly to partial derivatives, the Gâteaux derivative is also unsuitable for the Chain Rule for differentiability. We recommend to the reader to construct examples of $f \colon \mathbb{R}^2 \to \mathbb{R}$, $g \colon \mathbb{R} \to \mathbb{R}^2$ such that $f(g)$ has no derivative at $o$ in spite of the fact that
\[ Df(o) = 0, \quad g(0) = o, \quad g'(0) = (0, 0). \]
For this purpose a stronger notion of differentiability is needed. The following definition is a straightforward generalization of the differential of a function of two variables.

8 The map $\kappa \colon X \to \exp X^* \colon \kappa(x) := \{f \in X^* : \|f\|_{X^*} = \|x\|_X,\ f(x) = \|x\|_X^2\}$ is called the duality mapping. It is a multivalued mapping and belongs to the fundamental concepts in the Banach space theory.
Chapter 3. Abstract Integral and Differential Calculus
Definition 3.2.10. Let $X$, $Y$ be normed linear spaces (both over the same scalar field). A mapping $f : X \to Y$ is said to be Fréchet differentiable at a point $a \in X$ if there exists $A \in L(X, Y)$ such that
$$\lim_{h \to o} \frac{\|f(a + h) - f(a) - Ah\|_Y}{\|h\|_X} = 0. \tag{3.2.5}$$
In this case $A$ is called the Fréchet derivative of $f$ at the point $a$ and is denoted by $f'(a)$.
Remark 3.2.11. (i) If $f'(a)$ exists, then also $Df(a)$ exists. Moreover,
$$f'(a)h = Df(a)h \quad \text{for all } h \in X.$$
(ii) Suppose that a linear operator $A : X \to Y$ has the property (3.2.5). It is easy to see that $A$ is continuous if and only if $f$ is continuous at $a$.
(iii) A basic analytical approach to the investigation of nonlinear problems involves their approximation by simpler objects. Among them linear approximations are more appropriate from the local point of view. The classical notion of the derivative as the best local linear approximation is the most transparent confirmation of this phenomenon (e.g., the Fermat Theorem for local extremal points). The notion of Fréchet derivative is a genuine generalization to infinite dimensional spaces.
Theorem 3.2.12 (Chain Rule). Let $X$, $Y$, $Z$ be normed linear spaces and let there exist $\delta g(a; h)$ for $g : X \to Y$. If $g(a) = b$ and for $f : Y \to Z$ the Fréchet derivative $f'(b)$ exists, then
$$\delta(f \circ g)(a; h) = f'(b)[\delta g(a; h)]. \tag{3.2.6}$$ ⁹
Proof. Choose $\varepsilon > 0$ and $h \in X$. By (3.2.5) there is $\eta > 0$ such that
$$\|f(b + k) - f(b) - f'(b)k\|_Z \le \varepsilon \|k\|_Y \quad \text{for } \|k\|_Y < \eta.$$
Put $\omega(t) \triangleq g(a + th) - g(a) - t\,\delta g(a; h)$. By Definition 3.2.1, there is $\hat\delta > 0$ such that $\|\omega(t)\|_Y \le \varepsilon |t|$ for $|t| < \hat\delta$. For $k \triangleq g(a + th) - g(a) = g(a + th) - b$ we have
$$\|k\|_Y \le |t| \|\delta g(a; h)\|_Y + \|\omega(t)\|_Y \le |t| [\|\delta g(a; h)\|_Y + \varepsilon].$$
We may choose $\hat\delta$ so small that the right-hand side in this inequality is less than $\eta$ whenever $|t| < \hat\delta$. Using all the information and $\delta g(a; h) = \frac{k - \omega(t)}{t}$ we obtain
$$\left\| \frac{f(g(a + th)) - f(g(a))}{t} - f'(b)[\delta g(a; h)] \right\|_Z = \left\| \frac{f(b + k) - f(b) - f'(b)k}{t} + f'(b)\frac{\omega(t)}{t} \right\|_Z$$
$$\le \frac{\varepsilon \|k\|_Y}{|t|} + \|f'(b)\|_{L(Y,Z)} \frac{\|\omega(t)\|_Y}{|t|} \le \varepsilon [\varepsilon + \|\delta g(a; h)\|_Y + \|f'(b)\|_{L(Y,Z)}]$$
for $0 < |t| < \hat\delta$. The formula (3.2.6) follows.
⁹ For more transparent notation we will often use the symbol $f \circ g$ instead of $f(g)$ for the composition of $f$ and $g$.
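The formula (3.2.6) can be checked numerically in finite dimensions. The following Python sketch is an illustration only: the maps `g` and `f`, the point `a` and the direction `h` are hypothetical choices that do not come from the text, and the derivatives `dg` and `fprime` were computed by hand for these particular maps. It compares the difference quotient of $f \circ g$ with $f'(b)[\delta g(a; h)]$ for decreasing $t$.

```python
import math

def g(x):
    # hypothetical smooth map g : R^2 -> R^2
    return (x[0] * x[1], math.sin(x[0]) + x[1] ** 2)

def dg(a, h):
    # directional derivative delta g(a; h), computed by hand for the g above
    return (a[1] * h[0] + a[0] * h[1],
            math.cos(a[0]) * h[0] + 2.0 * a[1] * h[1])

def f(y):
    # hypothetical smooth map f : R^2 -> R
    return math.exp(y[0]) + y[0] * y[1]

def fprime(b, k):
    # Frechet derivative f'(b) applied to k, computed by hand for the f above
    return (math.exp(b[0]) + b[1]) * k[0] + b[0] * k[1]

def quotient(a, h, t):
    # difference quotient (f(g(a + t h)) - f(g(a))) / t
    at = (a[0] + t * h[0], a[1] + t * h[1])
    return (f(g(at)) - f(g(a))) / t

a, h = (0.5, -1.0), (1.0, 2.0)          # assumed point and direction
chain = fprime(g(a), dg(a, h))          # right-hand side of (3.2.6)
errors = [abs(quotient(a, h, 10.0 ** -k) - chain) for k in range(1, 6)]
print(chain, errors)
```

Under these assumptions the printed errors should shrink as $t$ decreases, which is what (3.2.6) predicts; the sketch does not replace the proof, since it tests a single direction at a single point.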
Corollary 3.2.13. Let the hypotheses of Theorem 3.2.12 be satisfied. If, moreover, $Dg(a)$ exists, then also $D(f \circ g)(a)$ does exist and the analogue of (3.2.6) is true. A similar assertion is true for $(f \circ g)'(a)$ provided $g'(a)$ exists.
Proof. The assertion on $D(f \circ g)(a)$ follows from (3.2.6). The proof for $(f \circ g)'(a)$ is similar to that given above.
Corollary 3.2.14. Let $A \in L(Y, Z)$ and let $\delta f(a; h)$ exist for $f : X \to Y$. Then $\delta(Af)(a; h) = A\,\delta f(a; h)$ and similarly for $D(Af)(a)$ and $(Af)'(a)$.
Proof. It is sufficient to show that
$$A'(y) = A \quad \text{for all } y \in Y,$$
but this follows immediately from the definition.
The verification of the degree of linear approximation needed in (3.2.5) is not always an easy task. The following condition can be of use in such situations.
Proposition 3.2.15. Let $Df(x)$ exist for all $x$ in a neighborhood of a point $a \in X$. If $x \mapsto Df(x)$ is continuous at $a$ (as a mapping $X \to L(X, Y)$), then $f'(a)$ exists.
Proof. According to the estimate (3.2.3) we have for small $h$,
$$\|f(a + h) - f(a) - Df(a)h\|_Y \le \sup_{t \in [0,1]} \|Df(a + th) - Df(a)\|_{L(X,Y)} \|h\|_X$$
and the continuity of $Df$ yields (3.2.5).
Definition 3.2.16. Let $G$ be an open set in $X$ and let $f : X \to Y$. If the Gâteaux derivative $Df : G \to L(X, Y)$ is continuous on $G$ (or equivalently, $f'$ is continuous on $G$), then we write $f \in C^1(G)$.
One of the convenient conditions for the existence of the differential (i.e., Fréchet derivative) of $f : \mathbb{R}^2 \to \mathbb{R}$ is the continuity of partial derivatives. These can be interpreted also as derivatives with respect to one-dimensional subspaces. A generalization leads to the following definition.
Definition 3.2.17. Let $f : X \to Y$ where $X = X_1 \times X_2$ and $X_1$, $X_2$, $Y$ are normed linear spaces.¹⁰ Let $a_2 \in X_2$ and let $f_1 : x_1 \mapsto f(x_1, a_2)$. If $f_1$ has the Gâteaux (or Fréchet) derivative at $a_1 \in X_1$, then $Df_1(a_1)$ (or $f_1'(a_1)$) is called the partial Gâteaux (or Fréchet) derivative of $f$ at $(a_1, a_2)$ with respect to the first variable and is denoted by $D_1 f(a_1, a_2)$ (or $f_1'(a_1, a_2)$). Similarly the partial derivative with respect to the second variable ($D_2 f$ or $f_2'$) is defined.
If $Df(a_1, a_2)$ exists, then also $D_1 f(a_1, a_2)$, $D_2 f(a_1, a_2)$ exist and
$$Df(a_1, a_2)(h_1, h_2) = D_1 f(a_1, a_2)h_1 + D_2 f(a_1, a_2)h_2. \tag{3.2.7}$$
For the converse assertion we need more assumptions:
Proposition 3.2.18. Assume that $D_2 f$ exists on a neighborhood of a point $(a_1, a_2)$ and the mapping $D_2 f : X_1 \times X_2 \to L(X_2, Y)$ is continuous at $(a_1, a_2)$. Assume, moreover, that $D_1 f$ exists at the point $(a_1, a_2)$. Then $Df(a_1, a_2)$ exists and (3.2.7) holds.
Proof. Choose sufficiently small $h_1$, $h_2$. Then, by (3.2.3),
$$\|f(a_1 + th_1, a_2 + th_2) - f(a_1, a_2) - tD_1 f(a_1, a_2)h_1 - tD_2 f(a_1, a_2)h_2\|$$
$$\le \|f(a_1 + th_1, a_2 + th_2) - f(a_1 + th_1, a_2) - tD_2 f(a_1 + th_1, a_2)h_2\|$$
$$\quad + \|D_2 f(a_1 + th_1, a_2) - D_2 f(a_1, a_2)\|\,|t|\,\|h_2\| + \|f(a_1 + th_1, a_2) - f(a_1, a_2) - tD_1 f(a_1, a_2)h_1\|$$
$$\le \sup_{0 \le \tau \le 1} \|D_2 f(a_1 + th_1, a_2 + t\tau h_2) - D_2 f(a_1 + th_1, a_2)\|\,|t|\,\|h_2\|$$
$$\quad + \|D_2 f(a_1 + th_1, a_2) - D_2 f(a_1, a_2)\|\,|t|\,\|h_2\| + \left\| \frac{f(a_1 + th_1, a_2) - f(a_1, a_2)}{t} - D_1 f(a_1, a_2)h_1 \right\| |t| = o(|t|)$$
as $t \to 0$, and the result follows.
Remark 3.2.19. If, in addition to the assumptions of Proposition 3.2.18, $f_1'(a_1, a_2)$ exists, then $f'(a_1, a_2)$ exists, too. The proof then follows the same lines as that above.
Corollary 3.2.20. Let $G$ be an open subset of $X = X_1 \times X_2$ and $f : X \to Y$. Then $f \in C^1(G)$ if and only if both $f_1$, $f_2$ belong to $C^1(G)$ (i.e., the partial derivatives $f_1'$, $f_2'$ exist and are continuous on $G$).
¹⁰ Then $X$ is a normed linear space, too. A norm on $X$ is, for example, defined by $\|x\|_X = \|x_1\|_{X_1} + \|x_2\|_{X_2}$, $x = (x_1, x_2) \in X_1 \times X_2$.
Example 3.2.21. One of the most important nonlinear mappings is the so-called Nemytski operator which is sometimes also called the substitution (or superposition) operator. As the latter term indicates it arises by the substitution of a function $\varphi : G \subset \mathbb{R}^M \to \mathbb{R}$ into the function $f : G \times \mathbb{R} \to \mathbb{R}$. This leads to a new operator
$$F : \varphi \mapsto f(\cdot, \varphi(\cdot))$$
which acts on a space $X$ of functions $\varphi$. We wish to find conditions on $f$ for $F$ to be a mapping from $X$ into $X$ and to have some derivatives.
We start with the case $X = C[0, 1]$. It is clear that the continuity of $f$ on $[0, 1] \times \mathbb{R}$ is sufficient to guarantee that $F : X \to X$. Since $f$ is uniformly continuous on compact sets of the form
$$\{(x, y) \in [0, 1] \times \mathbb{R} : |y - \varphi(x)| \le 1\} \quad \text{for } \varphi \in C[0, 1],$$
$F$ is also continuous on $X$. Suppose now that the partial derivative $\frac{\partial f}{\partial y}$ is continuous on $[0, 1] \times \mathbb{R}$. For $\varphi, h \in X$ we have, by the classical Mean Value Theorem,
$$\frac{f(x, \varphi(x) + th(x)) - f(x, \varphi(x))}{t} = \frac{\partial f}{\partial y}(x, \varphi(x) + \vartheta(t, x)th(x))h(x)$$
for a $\vartheta(t, x) \in (0, 1)$ and
$$\sup_{x \in [0,1]} \left| \frac{f(x, \varphi(x) + th(x)) - f(x, \varphi(x))}{t} - \frac{\partial f}{\partial y}(x, \varphi(x))h(x) \right|$$
$$\le \sup_{x \in [0,1]} \sup_{0 \le \vartheta \le 1} \left| \frac{\partial f}{\partial y}(x, \varphi(x) + \vartheta th(x)) - \frac{\partial f}{\partial y}(x, \varphi(x)) \right| |h(x)| \le \varepsilon \|h\|_{C[0,1]}$$
for all sufficiently small $|t|$ (again by uniform continuity of $\frac{\partial f}{\partial y}$ on compact sets). This means that the Gâteaux derivative $DF(\varphi)$ exists and
$$DF(\varphi)h : x \mapsto \frac{\partial f}{\partial y}(x, \varphi(x))h(x).$$
Moreover, $DF$ is continuous as a mapping $X \to L(X)$ (again by the uniform continuity of $\frac{\partial f}{\partial y}$) and, by Proposition 3.2.15, $F'(\varphi)$ exists for any $\varphi \in X$.
Warning. It is not always true that the existence of $\frac{\partial f}{\partial y}$ implies the existence of $DF$!
For example, let
$$X = \{\varphi \in C[0, \infty) : \varphi(x)e^{-x} \text{ is bounded on } [0, \infty)\}$$
with the norm
$$\|\varphi\| \triangleq \sup_{x \in [0,\infty)} |\varphi(x)e^{-x}|$$
and let $f(y) = \sin y$. Since $f$ is Lipschitz continuous with constant 1 we obtain $\|F(\varphi_1) - F(\varphi_2)\| \le \|\varphi_1 - \varphi_2\|$. In particular, $F$ is a continuous mapping from $X$ into itself. But
$$\delta F(0; h) \ne \sin'(0)h, \quad h \in X,$$
as could be erroneously supposed by analogy. Namely, for $h(x) = e^x \in X$,
$$\sup_{x \in [0,\infty)} e^{-x} \left| \frac{\sin(te^x) - 0}{t} - e^x \right| = \sup_{y \in [t,\infty)} \left| \frac{\sin y}{y} - 1 \right| \ge 1 \quad \text{for any } t > 0.$$ ¹¹
Similar calculations yield that $\delta F(0; h)$, $h \ne o$, does not exist at all.
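The contrast between the two spaces in Example 3.2.21 can be observed numerically. The Python sketch below is an illustration under stated assumptions: it takes $f(y) = \sin y$, $\varphi = 0$ and $h(x) = e^x$ as in the example, but the grids, the cut-off `x_max` and the function names are hypothetical choices. A finite grid can only give a lower bound for the supremum over $[0, \infty)$.

```python
import math

def gap_unweighted(t, n=2001):
    """Approximate sup over x in [0, 1] of |sin(t h(x))/t - h(x)| with h(x) = e^x."""
    worst = 0.0
    for i in range(n):
        h = math.exp(i / (n - 1))
        worst = max(worst, abs(math.sin(t * h) / t - h))
    return worst

def gap_weighted(t, x_max=40.0, n=40001):
    """Lower bound for sup over x >= 0 of e^{-x} |sin(t e^x)/t - e^x|.

    With y = t e^x the expression equals |sin(y)/y - 1|; only the grid
    points in [0, x_max] are inspected (x_max is an assumed cut-off).
    """
    worst = 0.0
    for i in range(n):
        y = t * math.exp(x_max * i / (n - 1))
        worst = max(worst, abs(math.sin(y) / y - 1.0))
    return worst

steps = [1e-1, 1e-2, 1e-3]
unweighted = [gap_unweighted(t) for t in steps]   # norm of C[0, 1]
weighted = [gap_weighted(t) for t in steps]       # weighted norm of the example
print(unweighted)
print(weighted)
```

In the space $C[0, 1]$ the quotient error should tend to zero with $t$, while in the weighted norm the computed lower bounds should stay at or above 1, in agreement with the estimate in the example.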
¹¹ The lack of differentiability of the Nemytski operators in weighted spaces causes big problems in the use of the Implicit Function Theorem.
The study of the Nemytski operator in spaces of integrable functions is much more complicated. First it has to be proved that $F(\varphi)$ is a measurable function on $\Omega$ for $\varphi \in L^p(\Omega)$. The following notion is crucial for this purpose.
Definition 3.2.22. Let $\Omega$ be an open set in $\mathbb{R}^N$. A function $f : \Omega \times \mathbb{R} \to \mathbb{R}$ is said to have the Carathéodory property (notation: $f \in CAR(\Omega \times \mathbb{R})$) if
(M) for all $y \in \mathbb{R}$ the function $x \mapsto f(x, y)$ is (Lebesgue) measurable on $\Omega$;
(C) for a.a. $x \in \Omega$ the function $y \mapsto f(x, y)$ is continuous on $\mathbb{R}$.
Proposition 3.2.23. Let $\Omega$ be an open set in $\mathbb{R}^N$. Then
(i) if $f : \Omega \times \mathbb{R} \to \mathbb{R}$ is continuous on $\Omega \times \mathbb{R}$, then $f \in CAR(\Omega \times \mathbb{R})$;
(ii) if $f \in CAR(\Omega \times \mathbb{R})$ and $\varphi : \Omega \to \mathbb{R}$ is (Lebesgue) measurable on $\Omega$, then
$$F(\varphi) : x \mapsto f(x, \varphi(x)), \quad x \in \Omega,$$
is a measurable function on $\Omega$.
Proof. (i) Since a continuous function $f(\cdot, y)$ is Lebesgue measurable, the assertion is obvious.
(ii) Let $\varphi$ be a measurable function on $\Omega$. Then there is a sequence $\{s_n\}_{n=1}^{\infty}$ of step functions which converge to $\varphi$ a.e. in $\Omega$. If
$$s(x) = \sum_{i=1}^{k} \alpha_i \chi_{\Omega_i}(x)$$
is a step function on $\Omega$, i.e., there are pairwise disjoint $\Omega_1, \dots, \Omega_k$ which are measurable,
$$\Omega = \bigcup_{i=1}^{k} \Omega_i \quad \text{and} \quad \chi_{\Omega_i}(x) = \begin{cases} 1, & x \in \Omega_i, \\ 0, & x \notin \Omega_i, \end{cases}$$
then
$$f(x, s(x)) = \sum_{i=1}^{k} f(x, \alpha_i)\chi_{\Omega_i}(x)$$
is a measurable function (property (M) in Definition 3.2.22). By property (C),
$$\lim_{n \to \infty} f(x, s_n(x)) = f(x, \varphi(x)) \quad \text{for a.a. } x \in \Omega,$$
i.e., $F(\varphi)(x) = f(x, \varphi(x))$ is measurable.
Having measurability of $F(\varphi)$ we can ask when $F(\varphi) \in L^q(\Omega)$. It is plausible that a certain growth condition for $f$ is needed.
Theorem 3.2.24. Let $f \in CAR(\Omega \times \mathbb{R})$ and $p, q \in [1, \infty)$. Let there exist $g \in L^q(\Omega)$ and $c \in \mathbb{R}$ such that
$$|f(x, y)| \le g(x) + c|y|^{\frac{p}{q}} \quad \text{for a.a. } x \in \Omega \text{ and all } y \in \mathbb{R}. \tag{3.2.8}$$
Then
(i) $F(\varphi) \in L^q(\Omega)$ for all $\varphi \in L^p(\Omega)$;¹²
(ii) $F$ is a continuous mapping from $L^p(\Omega)$ into $L^q(\Omega)$;
(iii) $F$ maps bounded sets in $L^p(\Omega)$ into bounded sets in $L^q(\Omega)$.
Proof. The proof of (i) is based on Proposition 3.2.23 and the use of the Minkowski inequality (Example 1.2.16) and it is straightforward. The proof of (ii) is quite involved and its crucial step consists in the fact that $F$ maps sequences converging in measure into sequences with the same property. We omit details (see, e.g., Krasnoselski [78, § I.2] or Appell & Zabreiko [8]). The property (iii) follows from the growth condition (3.2.8).
Remark 3.2.25. The Carathéodory property can be generalized to functions $f : \Omega \times \mathbb{R}^M \to \mathbb{R}$. Proposition 3.2.23 and Theorem 3.2.24 hold similarly for
$$F(\varphi_1, \dots, \varphi_M)(x) \triangleq f(x, \varphi_1(x), \dots, \varphi_M(x)).$$
Remark 3.2.26. Let $\Omega \subset \mathbb{R}^N$ be an open subset of $\mathbb{R}^N$ and $f : \Omega \times \mathbb{R}^{N+1} \to \mathbb{R}$ satisfy the Carathéodory property. Assume, moreover, there exist $g \in L^q(\Omega)$ and $c \in \mathbb{R}$ such that
$$|f(x, y)| \le g(x) + c \sum_{i=0}^{N} |y_i|^{\frac{p}{q}}$$
for a.a. $x \in \Omega$ and all $y = (y_0, y_1, \dots, y_N) \in \mathbb{R}^{N+1}$. Then $F$ defined by
$$F(u)(x) \triangleq f(x, u(x), \nabla u(x))$$
is a continuous mapping from $W^{1,p}(\Omega)$ into $L^q(\Omega)$ and maps bounded sets into bounded sets.
¹² Actually, it can be proved that this property implies that (3.2.8) is satisfied for $g \in L^q(\Omega)$ and $c \in \mathbb{R}$, cf. Appell & Zabreiko [8].
The growth condition with respect to $y_0$ can be relaxed according to the Embedding Theorem for $W^{1,p}(\Omega)$ (cf. Fučík & Kufner [54]).
Now we turn our attention to the directional derivative of the Nemytski operator in the space $L^2(\Omega)$. The exponents $p = q = 2$ are considered for simplicity only. In accordance with the computation in Example 3.2.21 we could expect
$$\delta F(\varphi; h)(x) = \frac{\partial f}{\partial y}(x, \varphi(x))h(x) \tag{3.2.9}$$
provided the right-hand side belongs to $L^2(\Omega)$. This is true if $\frac{\partial f}{\partial y}(\cdot, \varphi(\cdot)) \in L^\infty(\Omega)$, e.g., whenever $\frac{\partial f}{\partial y}$ is a bounded continuous function on $\Omega \times \mathbb{R}$. But this is not the whole story since we have to show that
$$\int_\Omega \left| \frac{f(x, \varphi(x) + th(x)) - f(x, \varphi(x))}{t} - \frac{\partial f}{\partial y}(x, \varphi(x))h(x) \right|^2 dx \to 0 \quad \text{for } t \to 0.$$
For a.a. $x \in \Omega$ the function under the integral sign can be estimated by the Mean Value Theorem (the formula (3.2.1)):
$$\left| \frac{f(x, \varphi(x) + th(x)) - f(x, \varphi(x))}{t} - \frac{\partial f}{\partial y}(x, \varphi(x))h(x) \right| \le \frac{1}{|t|} \sup_{0 \le \vartheta \le 1} \left| \frac{\partial f}{\partial y}(x, \varphi(x) + t\vartheta h(x)) - \frac{\partial f}{\partial y}(x, \varphi(x)) \right| |t|\,|h(x)|.$$ ¹³
The right-hand side converges to zero for $t \to 0$ for a.a. $x \in \Omega$ (by the continuity of $\frac{\partial f}{\partial y}$). In order to justify the use of the Lebesgue Dominated Convergence Theorem we need a square integrable majorant. In particular, boundedness of $\frac{\partial f}{\partial y}$ on $\Omega \times \mathbb{R}$ is sufficient.¹⁴
In the case when $F$ depends also on the gradient of $\varphi$ the situation is only technically slightly more complicated. Nemytski operators appear often under the integral (see Chapters 6 and 7). Since the integral is a continuous linear form, in particular, it is Fréchet differentiable, we can use the Chain Rule to get
$$D\Phi(\varphi)h = \int_\Omega \left[ \frac{\partial f}{\partial y_0}(x, \varphi(x), \nabla\varphi(x))h(x) + \sum_{i=1}^{N} \frac{\partial f}{\partial y_i}(x, \varphi(x), \nabla\varphi(x)) \frac{\partial h}{\partial x_i}(x) \right] dx \tag{3.2.10}$$
for
$$\Phi(\varphi) \triangleq \int_\Omega f(x, \varphi(x), \nabla\varphi(x))\,dx,$$
under appropriate assumptions on $f$.
¹³ It is worth noticing how the classical Mean Value Theorem is used here: to avoid problems with measurability of $x \mapsto \frac{\partial f}{\partial y}(x, \varphi(x) + \vartheta(x)th(x))$ "the inequality form" of the theorem is employed.
¹⁴ The reader should notice problems in finding conditions which ensure that (3.2.9) is also the Fréchet derivative. The situation is even much worse than one would expect. The function $f$ has to be linear for $F : L^2(\Omega) \to L^2(\Omega)$ to be Fréchet differentiable (see, e.g., Ambrosetti & Prodi [6, Chapter 1, Proposition 2.8]). See also Exercise 3.2.41 and Remark 3.2.42.
Now we turn our attention to higher derivatives. We restrict our attention to the second derivatives and believe that the reader will be able to define the third and higher order derivatives as well. Higher order derivatives of functions are defined by induction. We will do the same for abstract mappings. Let $f : X \to Y$, and $a, h, k \in X$. Put $g(t, s) = f(a + th + sk)$. Then
$$\frac{\partial g}{\partial t}(0, s) = \delta f(a + sk; h),$$
which is a mapping from $\mathbb{R}$ (of variable $s$) into $Y$ and can be differentiated again:
$$\frac{\partial^2 g}{\partial t\,\partial s}(0, 0) \triangleq \frac{\partial}{\partial s}\left( \frac{\partial g}{\partial t}(0, s) \right)\Big|_{s=0}.$$
If these derivatives exist, then
$$\delta^2 f(a; h, k) \triangleq \frac{\partial}{\partial s}\left( \frac{\partial g}{\partial t}(0, s) \right)\Big|_{s=0}$$
is called the second directional derivative (in the directions $h$, $k$). Notice that generally
$$\delta^2 f(a; h, k) \ne \delta^2 f(a; k, h).$$
(Find an example for $f : \mathbb{R}^2 \to \mathbb{R}$!) It is easy to see that for $f : \mathbb{R}^M \to \mathbb{R}$ we have
$$\delta^2 f(a; e_i, e_j) = \frac{\partial^2 f}{\partial x_i\,\partial x_j}(a)$$
if $e_i$, $e_j$ are the unit coordinate vectors in $\mathbb{R}^M$.
It may occur that the operator $(h, k) \in X \times X \mapsto \delta^2 f(a; h, k)$ is linear in both variables (i.e., it is the so-called bilinear operator) and is continuous on $X \times X$.¹⁵ In that case $\delta^2 f(a; \cdot, \cdot)$ is called the second Gâteaux derivative and is denoted by $D^2 f(a)$.
¹⁵ Equivalently, it is continuous at the point $(o, o)$ if there is a constant $c$ such that
$$\|\delta^2 f(a; h, k)\|_Y \le c\|h\|_X \|k\|_X \quad \text{for all } h, k \in X.$$
(See a similar assertion in Proposition 1.2.10 for a linear operator.) Denoting the space of all continuous bilinear operators from $X \times X$ into $Y$ by $B_2(X, Y)$ we see that the least possible constant $c$ in the above inequality is a norm on $B_2(X, Y)$. See also the important Proposition 2.1.7.
Proposition 3.2.27 (Taylor Formula). Let $X$ be a normed linear space and $Y$ a Banach space. Assume that $a, h \in X$ and that $\delta^2 f(x; h, h)$ exists for all $x \in M \triangleq \{a + th : t \in [0, 1]\}$ and is continuous as a mapping from $M$ into $Y$. Then
$$f(a + h) = f(a) + \delta f(a; h) + \int_0^1 (1 - t)\,\delta^2 f(a + th; h, h)\,dt. \tag{3.2.11}$$ ¹⁶
Proof. Put $g(t) = (1 - t)\,\delta f(a + th; h)$. Then we have
$$g'(t) = -\delta f(a + th; h) + (1 - t)\,\delta^2 f(a + th; h, h), \quad t \in [0, 1].$$
Since both terms on the right-hand side are continuous we get
$$g(1) - g(0) = \int_0^1 g'(t)\,dt = -\int_0^1 \delta f(a + th; h)\,dt + \int_0^1 (1 - t)\,\delta^2 f(a + th; h, h)\,dt.$$
Using Theorem 3.2.6 we obtain (3.2.11).
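The Taylor formula (3.2.11) can be tested on a concrete function of two real variables. The following Python sketch is only a numerical illustration: the function `f`, the point `a`, the increment `h` and the midpoint quadrature with `n` nodes are hypothetical choices, and the directional derivatives `d1`, `d2` were computed by hand for this particular `f`.

```python
import math

def f(x):
    # hypothetical smooth function f : R^2 -> R
    return math.exp(x[0]) * math.sin(x[1])

def d1(x, h):
    # first directional derivative delta f(x; h) for the f above
    e, s, c = math.exp(x[0]), math.sin(x[1]), math.cos(x[1])
    return e * s * h[0] + e * c * h[1]

def d2(x, h):
    # second directional derivative delta^2 f(x; h, h) for the f above
    e, s, c = math.exp(x[0]), math.sin(x[1]), math.cos(x[1])
    return e * s * h[0] ** 2 + 2.0 * e * c * h[0] * h[1] - e * s * h[1] ** 2

def taylor_rhs(a, h, n=2000):
    # right-hand side of (3.2.11); the integral is approximated by the midpoint rule
    total = 0.0
    for i in range(n):
        t = (i + 0.5) / n
        x = (a[0] + t * h[0], a[1] + t * h[1])
        total += (1.0 - t) * d2(x, h)
    return f(a) + d1(a, h) + total / n

a, h = (0.2, 0.5), (0.3, -0.4)           # assumed point and increment
lhs = f((a[0] + h[0], a[1] + h[1]))
rhs = taylor_rhs(a, h)
print(lhs, rhs, abs(lhs - rhs))
```

The two sides should agree up to the quadrature error, which for the midpoint rule decreases like $1/n^2$ for smooth integrands.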
¹⁶ If for $n \in \mathbb{N}$ the $n$th directional derivative $\delta^n f(a; h, \dots, h)$ exists for all $h \in X$, then the mapping $h \mapsto f(a) + \sum_{k=1}^{n} \frac{1}{k!}\,\delta^k f(a; \underbrace{h, \dots, h}_{k\text{-times}})$ is called the Taylor polynomial of the degree $n$ of $f$ at the point $a$.
If we wanted to define the second Fréchet derivative also by induction we should differentiate $f' : X \to L(X, Y)$ at $a \in X$ to obtain $f''(a) \in L(X, L(X, Y))$. But this space seems to be rather strange and the space $L(X, L(X, L(X, Y)))$ (for $f'''(a)$) really awkward. Because of that we identify $L(X, L(X, Y))$ with the space of continuous bilinear operators $B_2(X, Y)$ (see footnote 15) and define the second Fréchet derivative $f''(a)$ to be an element of $B_2(X, Y)$ with the approximation property
$$\lim_{h \to o} \frac{\|f'(a + h) - f'(a) - f''(a)(h, \cdot)\|_{L(X,Y)}}{\|h\|_X} = 0. \tag{3.2.12}$$
The careful reader can ask why we have written $f''(a)(h, \cdot)$ and not $f''(a)(\cdot, h)$ in (3.2.12). The reason is that the mapping $(h, k) \mapsto f''(a)(h, k)$ is actually symmetric.
Proposition 3.2.28. If $f''(a)$ exists, then
$$f''(a)(h, k) = f''(a)(k, h) \quad \text{for all } h, k \in X.$$
Proof. Similarly to the proof of the classical result on mixed partial derivatives we express the difference
$$f(a + h + k) - f(a + h) - f(a + k) + f(a),$$
which is equal to $g_i(1) - g_i(0)$ for
$$g_1(t) \triangleq f(a + th + k) - f(a + th), \quad t \in [0, 1],$$
$$g_2(s) \triangleq f(a + h + sk) - f(a + sk), \quad s \in [0, 1].$$
Since $f''(a)$ exists, both the mappings $f$ and $f'$ are defined on a neighborhood $U$ of $a$. Elements $h$, $k$ are chosen so small that all variables belong to $U$. We can express
$$g_i(1) - g_i(0) = A_i + g_i'(0) \quad \text{where } A_i \triangleq g_i(1) - g_i(0) - g_i'(0),$$
and
$$g_1'(0) = e_1(h, k) + f''(a)(k, h), \qquad e_1(h, k) \triangleq f'(a + k)h - f'(a)h - f''(a)(k, h),$$
$$g_2'(0) = e_2(k, h) + f''(a)(h, k), \qquad e_2(k, h) \triangleq f'(a + h)k - f'(a)k - f''(a)(h, k).$$
Since $g_1(1) - g_1(0) = g_2(1) - g_2(0)$, we have
$$f''(a)(h, k) - f''(a)(k, h) = A_1 - A_2 + e_1(h, k) - e_2(k, h). \tag{3.2.13}$$
Now we estimate all terms on the right-hand side of this equality. By Theorem 3.2.7,
$$\|A_i\| \le \sup_{t \in [0,1]} \|g_i'(t) - g_i'(0)\|.$$
Since $f''(a)$ is bilinear, we have
$$g_1'(t) - g_1'(0) = f'(a + th + k)h - f'(a + th)h - f'(a + k)h + f'(a)h$$
$$= [f'(a + th + k)h - f'(a)h - f''(a)(th + k, h)] - [f'(a + th)h - f'(a)h - f''(a)(th, h)] - [f'(a + k)h - f'(a)h - f''(a)(k, h)]. \tag{3.2.14}$$
Choose now $\varepsilon > 0$ and $\delta > 0$ corresponding to the definition of $f''(a)$ such that
$$\|f'(a + u)v - f'(a)v - f''(a)(u, v)\| \le \varepsilon \|u\| \|v\| \quad \text{for } \|u\| < \delta \text{ and any } v \in X.$$
Then every term on the right-hand side of (3.2.14) is bounded by $\varepsilon(\|h\| + \|k\|)^2$ provided $\|h\|, \|k\| < \delta$. The same estimate holds for $e_1(h, k)$ and similarly also for $g_2'(t) - g_2'(0)$, $e_2(k, h)$. By (3.2.13) we obtain
$$\|f''(a)(h, k) - f''(a)(k, h)\| \le \|A_1\| + \|A_2\| + \|e_1(h, k)\| + \|e_2(k, h)\| \le 8\varepsilon[\|h\| + \|k\|]^2 \tag{3.2.15}$$
provided $\|h\|, \|k\| < \delta$. Choose $h_0, k_0 \in X$ and put
$$h = \alpha h_0, \qquad k = \alpha k_0.$$
For a sufficiently small $\alpha$ the estimate (3.2.15) holds. Because of the bilinearity of $f''(a)$ both sides can be divided by $\alpha^2$ and we get
$$\|f''(a)(h_0, k_0) - f''(a)(k_0, h_0)\| \le 8\varepsilon[\|h_0\| + \|k_0\|]^2 \quad \text{for any } \varepsilon > 0.$$
This completes the proof.
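Both the symmetry asserted in Proposition 3.2.28 and the earlier remark that second directional derivatives need not commute in general can be observed with finite differences. The Python sketch below is an illustration under assumptions: the step sizes `eps` and `s`, the test point and the directions are hypothetical choices, `smooth` is an arbitrary $C^2$ function, and `mixed` is one classical textbook function whose second Fréchet derivative does not exist at the origin.

```python
import math

def ddir(f, x, h, eps=1e-7):
    """Central-difference approximation of the directional derivative delta f(x; h)."""
    xp = (x[0] + eps * h[0], x[1] + eps * h[1])
    xm = (x[0] - eps * h[0], x[1] - eps * h[1])
    return (f(xp) - f(xm)) / (2.0 * eps)

def second_dir(f, a, h, k, s=1e-3):
    """Approximation of delta^2 f(a; h, k), i.e. d/ds of delta f(a + s k; h) at s = 0."""
    ap = (a[0] + s * k[0], a[1] + s * k[1])
    am = (a[0] - s * k[0], a[1] - s * k[1])
    return (ddir(f, ap, h) - ddir(f, am, h)) / (2.0 * s)

def smooth(x):
    # hypothetical C^2 function: second derivative exists, so symmetry is expected
    return math.exp(x[0]) * math.sin(x[1])

def mixed(x):
    # classical example with unequal mixed second derivatives at the origin
    r2 = x[0] ** 2 + x[1] ** 2
    return 0.0 if r2 == 0.0 else x[0] * x[1] * (x[0] ** 2 - x[1] ** 2) / r2

a, h, k = (0.3, 0.7), (1.0, 2.0), (-1.0, 0.5)     # assumed point and directions
sym_gap = abs(second_dir(smooth, a, h, k) - second_dir(smooth, a, k, h))

o, e1, e2 = (0.0, 0.0), (1.0, 0.0), (0.0, 1.0)
d12 = second_dir(mixed, o, e1, e2)   # close to -1
d21 = second_dir(mixed, o, e2, e1)   # close to +1
print(sym_gap, d12, d21)
```

For `mixed` one has $\frac{\partial f}{\partial x}(0, y) = -y$ and $\frac{\partial f}{\partial y}(x, 0) = x$, so the two iterated derivatives at the origin are $-1$ and $+1$; the finite-difference values only approximate these numbers.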
Remark 3.2.29.
(i) It is not difficult to see that the existence of $f''(a)$ implies the existence of $D^2 f(a)$ and the equality $f''(a)(h, k) = D^2 f(a)(h, k)$. It is also possible to prove that the continuity of $D^2 f$ on an open set $G \subset X$ (as a mapping from $G$ into $B_2(X, Y)$) is equivalent to the continuity of $f''$ on $G$. In this case we write $f \in C^2(G)$.
(ii) If $X = \mathbb{R}^M$, $Y = \mathbb{R}$ and $D^2 f(a)$ exists for $f : \mathbb{R}^M \to \mathbb{R}$, then it is sufficient to know the values
$$D^2 f(a)(e_i, e_j), \quad i, j = 1, \dots, M,$$
to determine $D^2 f(a)$. This means that $D^2 f(a)$ (and also $f''(a)$) can be represented by the matrix (the so-called Hess matrix)
$$\left( \frac{\partial^2 f}{\partial x_i\,\partial x_j}(a) \right).$$
Exercise 3.2.30. Let $A \in L(X, Y)$ and $B \in B_2(X, Y)$. Compute $A'$, $B'$ and $B''$!
Exercise 3.2.31. Let $f : X \to Y$ be injective on an open set $G \subset X$. Denote $(f|_G)^{-1} = g$. Suppose that $f'(a)$ and $g'(b)$ exist for an $a \in G$, $f(a) = b$. Is it true that $g'(b) = [f'(a)]^{-1}$? (For conditions which guarantee the existence of $g'(b)$ see Section 4.1.)
Exercise 3.2.32. Put $\Phi(A) = A^{-1}$ for an invertible $A \in L(X, Y)$ (here $X$, $Y$ are Banach spaces). Show that
$$\Phi'(A)(H) = -A^{-1}HA^{-1}, \quad H \in L(X, Y), \quad \text{for all } A \in \operatorname{Dom} \Phi.$$
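Hint. Use the same method as in Exercise 2.1.33.
Exercise 3.2.33. Let $X$ be either $C[0, 1]$ or $L^p(0, 1)$, $1 \le p < \infty$. Compute $\delta f(x, h)$, $Df(x)$ and $f'(x)$ for $f(x) = \|x\|$, $x \in X$.
Exercise 3.2.34.
(i) Compute the duality mapping for the space $L^p(\Omega)$, $1 \le p < \infty$.
(ii) Show that the duality mapping for the space $C[0, 1]$ need not be single-valued.
Exercise 3.2.35. Let $p > 1$, $\Omega \subset \mathbb{R}^N$ and let
$$f(u) = \frac{1}{p} \int_\Omega \|\nabla u(x)\|^p\,dx, \qquad g(u) = \frac{1}{p} \int_\Omega |u(x)|^p\,dx$$
be functionals defined on $W^{1,p}(\Omega)$ (here $\|\nabla u(x)\| \triangleq \left( \sum_{i=1}^{N} \left| \frac{\partial u(x)}{\partial x_i} \right|^2 \right)^{\frac{1}{2}}$). Prove that $f$ and $g$ are Fréchet differentiable at each $u \in W^{1,p}(\Omega)$, and
$$f'(u)v = \int_\Omega \|\nabla u(x)\|^{p-2}(\nabla u(x), \nabla v(x))\,dx, \qquad g'(u)v = \int_\Omega |u(x)|^{p-2}u(x)v(x)\,dx.$$
Hint. Let
$$\varphi(t) = |t|^{p-2}t, \quad t \ne 0, \qquad \varphi(0) = 0.$$
Then $\varphi$ is continuous and $\frac{d}{dt}\left( \frac{1}{p}|t|^p \right) = \varphi(t)$, $t \in \mathbb{R}$. Similarly, for $y \in \mathbb{R}^N$, $\|y\| = \left( \sum_{i=1}^{N} y_i^2 \right)^{\frac{1}{2}}$, set
$$\psi(y) = \|y\|^{p-2}y, \quad y \ne o, \qquad \psi(o) = o.$$
Then $\nabla\left( \frac{1}{p}\|y\|^p \right) = \psi(y)$ for all $y \in \mathbb{R}^N$.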
Exercise 3.2.36. Find conditions on $k$ and $f$ for the so-called Hammerstein operator
$$H\varphi(t) = \int_a^b k(t, s)f(s, \varphi(s))\,ds$$
to map $L^2(a, b)$ into itself, and then differentiate $H$!
Exercise 3.2.37. Differentiate the following operators:
(i) $F(\varphi) = \int_0^1 \left( \int_0^t |\varphi(s)|^2\,ds \right)^3 dt$, $\varphi \in C[0, 1]$ or $\varphi \in L^2(0, 1)$;
(ii) $F(\varphi)(t) = \left( \int_0^t \varphi(s)\,ds \right)^2$ as $F : L^1(0, 1) \to L^1(0, 1)$, $F : C[0, 1] \to C[0, 1]$, $F : C[0, 1] \to C^1[0, 1]$.
Exercise 3.2.38. Let $f : [0, 1] \times \mathbb{R} \to \mathbb{R}$ and
$$F(\varphi) = \int_0^1 f(t, \varphi(t))\,dt.$$
Under which conditions on $f$ do there exist $D^2 F(\varphi)$ and $F''(\varphi)$ if we consider
$$F : C[0, 1] \to \mathbb{R} \quad \text{and} \quad F : L^2(0, 1) \to \mathbb{R}?$$
Remark 3.2.39. The following assertion is due to I.V. Skrypnik:
If $\frac{\partial^2 f}{\partial y^2}$ is continuous and bounded on $(0, 1) \times \mathbb{R}$, then $F \in C^2(L^2(0, 1))$ ($F$ is defined as in Exercise 3.2.38) if and only if
$$f(t, y) = a(t) + b(t)y + c(t)y^2 \quad \text{where } a, b, c \in L^\infty(0, 1).$$
It is not too difficult to prove that by contradiction.
Exercise 3.2.40. Let $f : [0, 1] \times \mathbb{R} \times \mathbb{R} \to \mathbb{R}$ and
$$F(\varphi) = \int_0^1 f(t, \varphi(t), \varphi'(t))\,dt, \quad \varphi \in C^1[0, 1].$$
Under which conditions does $D^2 F(\varphi)$ exist?
Exercise 3.2.41. Suppose that $\Omega$ is a bounded open subset of $\mathbb{R}^N$, and that the function $f : \Omega \times \mathbb{R} \to \mathbb{R}$ and its partial derivative $\frac{\partial f}{\partial y}$ are continuous on $\Omega \times \mathbb{R}$ (or both satisfy the Carathéodory property). Let $p > 2$ and let there exist constants $a$, $b$ such that
$$\left| \frac{\partial f}{\partial y}(x, y) \right| \le a + b|y|^{p-2}, \quad x \in \Omega,\ y \in \mathbb{R}.$$
If $f(\cdot, 0) \in L^{p'}(\Omega)$ where $p' = \frac{p}{p-1}$ (the conjugate exponent) and $F(\varphi)(x) = f(x, \varphi(x))$, show the following facts:
(i) $F$ maps $L^p(\Omega)$ into $L^{p'}(\Omega)$.
Hint. Integrate $\frac{\partial f}{\partial y}$ and use the above estimate and Theorem 3.2.24.
(ii) $\delta F(\varphi)h : x \mapsto \frac{\partial f}{\partial y}(x, \varphi(x))h(x)$ for all $h \in L^p(\Omega)$.
Hint. Proceed similarly to the main text. Use the Hölder inequality to show that $F_y(\varphi)(x) \triangleq \frac{\partial f}{\partial y}(x, \varphi(x))$ maps $L^p(\Omega)$ into $L^q(\Omega)$, $q = \frac{p}{p-2}$.
(iii) The Fréchet derivative $F'(\varphi)$ exists for all $\varphi \in L^p(\Omega)$.
Remark 3.2.42. If differentiability properties of $F$ are also needed for $p \le 2$, one should replace $L^p(\Omega)$ by more sophisticated spaces like Besov or Triebel–Lizorkin ones. See, e.g., Runst & Sickel [115, Chapter 5].
3.2A Newton Method
The Contraction Principle offers a very effective method for solving nonlinear equations, either to prove the existence of a solution or to find it numerically. Since the speed of convergence is not always satisfactory, various modifications have appeared. One of these modifications is even much older than the Contraction Principle itself and goes back to I. Newton. An idea of this method can be seen from Figure 3.2.1 where the iterations for solving the equation $f(x) = o$ are shown.
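[Figure 3.2.1. Graph of $f(x)$ with the tangent line $f'(a)(x - a) + f(a)$ at the point $a$; the points $x_1$, $x_2$, $y_1$ and the solution $\tilde{x}$ are marked.]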
Suppose that we have found an approximate solution $a$. We wish to construct a correction $\tilde{y}$ such that $f(a + \tilde{y}) = o$. By the Taylor expansion,
$$f(a) = f(a + \tilde{y}) - f'(a + \tilde{y})\tilde{y} + r(\tilde{y}) = -f'(a + \tilde{y})\tilde{y} + r(\tilde{y}),$$
i.e.,
$$\tilde{y} = -[f'(a + \tilde{y})]^{-1}[f(a) - r(\tilde{y})] = -[f'(a + \tilde{y})]^{-1}[f(a + \tilde{y}) - f'(a + \tilde{y})\tilde{y}] \triangleq F(\tilde{y}) \tag{3.2.16}$$
provided $[f'(a + \tilde{y})]^{-1}$ exists. The idea is to solve the equation
$$\tilde{y} = F(\tilde{y}) \tag{3.2.17}$$
in a certain closed ball centered at $o$ by iterations
$$y_{n+1} = F(y_n), \qquad y_0 = o.$$
Denoting $x_n = a + y_n$ we can rewrite these iterations in the form
$$x_{n+1} - a = -[f'(x_n)]^{-1}f(x_n) + x_n - a, \tag{3.2.18}$$
which are exactly the iterations from Figure 3.2.1. If the sequence of iterations $\{y_n\}_{n=1}^{\infty}$ converges to $\tilde{y}$, then $f(a + \tilde{y}) = o$ as follows from (3.2.16). Our goal is to show:
(A1) There is $\delta > 0$ such that $F$ maps $B(o; \delta)$ into itself and it is a contraction on this ball.
(A2) The convergence of $\{x_n\}_{n=1}^{\infty}$ is faster than the convergence of iterations given by the Contraction Principle (cf. Theorem 2.3.1), actually there is a constant $c$ such that
$$\|x_{n+1} - x_n\| \le c\|x_n - x_{n-1}\|^2. \tag{3.2.19}$$ ¹⁷
We apparently need some assumptions to reach this goal. We assume that $X$ is a Banach space, $f : X \to X$, and, moreover:
(H1) There is a ball $B(a; \hat\delta)$ such that $f \in C^1(B(a; \hat\delta))$ and $f'$ satisfies the Lipschitz condition on this ball: there exists $L$ such that
$$\|f'(x) - f'(y)\|_{L(X)} \le L\|x - y\|_X \quad \text{for } x, y \in B(a; \hat\delta).$$
(H2) The value $\|f(a)\|$ is sufficiently small.¹⁸
(H3) The derivative $f'(a)$ has a continuous inverse $[f'(a)]^{-1} \in L(X)$.
The proof of (A1), (A2) will be done in several steps. For the sake of simplicity we denote
$$A(x) \triangleq f'(x), \qquad A \triangleq f'(a), \qquad \alpha \triangleq \|[f'(a)]^{-1}\|.$$
Step 1. If $\delta < \frac{1}{\alpha L}$, $\delta \le \hat\delta$, then $A^{-1}(x)$ exists and
$$\|A^{-1}(x)\| \le \frac{\alpha}{1 - \alpha L\delta} \quad \text{for } x \in B(a; \delta).$$
Indeed, we can write
$$A(x) = A[I + A^{-1}(A(x) - A)].$$
Since $\|A(x) - A\| \le L\|x - a\|$, we get $\|A^{-1}(A(x) - A)\| \le \alpha L\|x - a\|$ and $A^{-1}(x)$ exists for $x \in B(a; \delta)$ (by Proposition 2.1.2), and
$$A^{-1}(x) = \sum_{n=0}^{\infty} (-1)^n [A^{-1}(A(x) - A)]^n A^{-1}.$$
The estimate for $\|A^{-1}(x)\|$ follows.
Step 2. If $w, x \in B(a; \delta)$, then
$$\|A^{-1}(w) - A^{-1}(x)\| \le \left( \frac{\alpha}{1 - \alpha L\delta} \right)^2 L\|w - x\|.$$
This estimate follows from the identity
$$A^{-1}(w) - A^{-1}(x) = A^{-1}(w)[A(x) - A(w)]A^{-1}(x)$$
and Step 1.
¹⁷ Compare this quadratic estimate (which yields an exponential one for $\|\tilde{x} - x_n\|$) with an estimate from the Contraction Principle $\|\tilde{x} - x_n\| \le q^n\|x_1 - x_0\|$ for a $0 < q < 1$.
¹⁸ This assumption means that we actually need a good approximation of a solution of the equation $f(x) = o$ (see Step 4 for the estimate of $\|f(a)\|$).
Step 3. We have
$$\|r(y)\| \le \frac{L}{2}\delta^2 \quad \text{and} \quad \|r(y) - r(z)\| \le 3L\delta\|y - z\| \quad \text{for } y, z \in B(o; \delta)$$
where $r(y) \triangleq f(a) - f(a + y) + A(a + y)y$ (see (3.2.16)). Indeed, by Theorem 3.2.6, we get
$$r(y) = \int_0^1 [A(a + y) - A(a + (1 - t)y)]y\,dt$$
and
$$r(y) - r(z) = f(a + z) - f(a + y) + A(a + y)y - A(a + z)z$$
$$= \int_0^1 [A(a + y + t(z - y)) - A(a + y)](z - y)\,dt + [A(a + y) - A(a + z)]z.$$
The estimates now follow from (H1) and Step 2.
Step 4. The assertion (A1) holds. Indeed, we have
$$F(y) - F(z) = A^{-1}(a + y)[r(y) - r(z)] + [A^{-1}(a + z) - A^{-1}(a + y)][f(a) - r(z)].$$
From (H1) and Steps 1–3 we get
$$\|F(y) - F(z)\| \le c(\delta + \|f(a)\|L)\|y - z\|$$
with a $c$ which is a bounded function of $\delta \in [0, \delta_0]$ ($\delta_0$ small enough). This means that we can choose $\delta$ and the estimate of $\|f(a)\|$ in (H2) such that
$$\|F(y) - F(z)\| \le q\|y - z\|, \quad y, z \in B(o; \delta), \quad \text{for a } q \in (0, 1).$$
Moreover,
$$\|F(y)\| \le \|F(y) - F(o)\| + \|F(o)\| \le q\delta + \alpha\|f(a)\| \le \delta,$$
provided $\|f(a)\|$ is sufficiently small.
Step 5. We can now prove the assertion (A2). By (3.2.18) and Theorem 3.2.6,
$$f(x_n) = f(x_n) - f(x_{n-1}) - f'(x_{n-1})(x_n - x_{n-1}) = \int_0^1 [f'(x_{n-1} + t(x_n - x_{n-1})) - f'(x_{n-1})](x_n - x_{n-1})\,dt.$$
Hence
$$\|f(x_n)\| \le \frac{L}{2}\|x_n - x_{n-1}\|^2,$$
and also
$$\|x_{n+1} - x_n\| \le \|A^{-1}(x_n)\|\,\|f(x_n)\| \le c\|x_n - x_{n-1}\|^2.$$
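The quadratic estimate (3.2.19) can be watched in a small computation. The following Python sketch runs the iterations (3.2.18) for a map $f : \mathbb{R}^2 \to \mathbb{R}^2$; the map, the starting point `x` (playing the role of the approximate solution $a$) and the threshold used to discard steps dominated by rounding are hypothetical choices, and the linear system with the Jacobian is solved by Cramer's rule for this $2 \times 2$ case only.

```python
import math

def f(x):
    # hypothetical map f : R^2 -> R^2 with a zero at (sqrt(2), sqrt(2))
    return (x[0] ** 2 + x[1] ** 2 - 4.0, x[0] - x[1])

def newton_step(x):
    """Solve f'(x) d = -f(x) by Cramer's rule; f'(x) = [[2 x0, 2 x1], [1, -1]]."""
    j11, j12, j21, j22 = 2.0 * x[0], 2.0 * x[1], 1.0, -1.0
    det = j11 * j22 - j12 * j21
    r1, r2 = f(x)
    d1 = (-r1 * j22 + r2 * j12) / det
    d2 = (-j11 * r2 + j21 * r1) / det
    return (d1, d2)

x = (1.0, 2.0)            # assumed approximate solution a
steps = []                # norms ||x_{n+1} - x_n||
for _ in range(6):
    d = newton_step(x)
    x = (x[0] + d[0], x[1] + d[1])
    steps.append(math.hypot(d[0], d[1]))

# ratios ||x_{n+1} - x_n|| / ||x_n - x_{n-1}||^2, skipping steps near rounding level
ratios = [steps[n] / steps[n - 1] ** 2
          for n in range(1, len(steps)) if steps[n - 1] > 1e-4]
residual = max(abs(v) for v in f(x))
print(steps, ratios, residual)
```

If the ratios stay bounded, as they should here, the steps satisfy an estimate of the form (3.2.19) with $c$ equal to the largest ratio; the sketch does not verify the hypotheses (H1) to (H3), it only displays the behavior they guarantee.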
Remark 3.2.43. The drawback of the iteration procedure (3.2.18) consists in the requirement to compute the inverse to the derivative at each step. This is the price for fast convergence. One can expect that by replacing $[f'(x)]^{-1}$ by the fixed inverse $[f'(a)]^{-1}$ we should avoid this disadvantage. This idea is also due to I. Newton. Conditions for convergence of these iterations were found by Kantorovich (see Kantorovich [72]).
Serious problems appear when the derivative $f'(x)$ is injective but not continuously invertible. In applications, e.g., to nonlinear partial differential equations, we have many possibilities of the choice of Banach spaces $X_\alpha$, $Y_\alpha$ such that $f : X_\alpha \to Y_\alpha$ (see, e.g., Example 1.2.25 and Example 2.1.29). It can happen that
$$[f'(x)]^{-1} \in L(Y_\alpha, X_\beta) \quad \text{where } X_\alpha \subset X_\beta.$$
This means that the equation
$$f'(x_n)(x - x_n) = -f(x_n)$$
(see (3.2.18)), which has to be solved to obtain the $(n + 1)$st iteration $x_{n+1}$, has a solution in a larger space $X_\beta$ provided $x_n \in X_\alpha$. Therefore the iterations belong to larger and larger spaces and, after a finite number of steps, there is no solution at all. This can be also expressed by an observation that $x_{n+1}$ is less smooth than $x_n$, or that "derivatives are lost" during iterations.
One way to overcome these difficulties consists in the approximation of $[f'(x)]^{-1}$ by a "better" operator $L(x)$ in the sense that $\|f'(x)L(x) - I\|_{L(Y_\alpha)}$ is smaller and smaller when $x$ approaches a solution of $f(x) = o$. Precise conditions under which new iterations
$$w_{n+1} = w_n - L(w_n)f(w_n)$$
converge to a solution can be found, e.g., in Moser [97, pp. 265–315 and 499–535]. A similar idea appeared earlier in Nash [98]. See also Remark 4.1.6 for a slightly different explanation.
Exercise 3.2.44. Let $f \in C^1(\mathbb{R})$ be a convex real function.
(i) Using only the results of elementary calculus prove the convergence of Newton approximations under appropriate assumptions. Give the recurrence formula for
$$f(x) = x^2 - A, \quad A > 0.$$
(ii) The same as in (i) for the Kantorovich approximations (Remark 3.2.43).
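The trade-off described in Remark 3.2.43 between the iterations (3.2.18) and the iterations with the fixed inverse $[f'(a)]^{-1}$ can be seen in a scalar experiment. The Python sketch below is an illustration only: the test function, the starting point $a = 2$, the tolerance and the iteration cap are hypothetical choices, and the flag `update_derivative` is an invented name that switches between the two variants.

```python
def f(x):
    # hypothetical test function with a simple real root near 2.0946
    return x ** 3 - 2.0 * x - 5.0

def fprime(x):
    return 3.0 * x ** 2 - 2.0

def iterate(a, update_derivative, tol=1e-12, max_iter=100):
    """Return (approximate root, number of iterations needed to reach |f(x)| < tol).

    update_derivative=True  : x_{n+1} = x_n - f(x_n) / f'(x_n)   (iterations (3.2.18))
    update_derivative=False : x_{n+1} = x_n - f(x_n) / f'(a)     (fixed inverse)
    """
    x, slope = a, fprime(a)
    for n in range(max_iter):
        if abs(f(x)) < tol:
            return x, n
        if update_derivative:
            slope = fprime(x)
        x -= f(x) / slope
    return x, max_iter

root_newton, n_newton = iterate(2.0, update_derivative=True)
root_fixed, n_fixed = iterate(2.0, update_derivative=False)
print(root_newton, n_newton)
print(root_fixed, n_fixed)
```

Both variants should reach the same root from this starting point, the fixed-derivative variant with more but cheaper iterations, since it converges only linearly.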
Chapter 4
Local Properties of Differentiable Mappings
4.1 Inverse Function Theorem
In this section we are looking for conditions which allow us to invert a map $f : X \to Y$, especially $f : \mathbb{R}^M \to \mathbb{R}^N$. The simple case of a linear operator $f$ indicates that a reasonable assumption is that $M = N$. Let us start with the simplest case $M = N = 1$. The well-known theorem says that if $f$ is continuous and strictly monotone on an open interval $I$, then $f$ is injective and $f(I)$ is an open interval $J$. Moreover, the inverse function $f^{-1}$ is continuous on $J$. It is not clear how to generalize the monotonicity assumption to $\mathbb{R}^M$ (cf. Section 5.3), and without it the theorem is not true even in $\mathbb{R}$. Since the monotonicity of a differentiable function $f : \mathbb{R} \to \mathbb{R}$ is a consequence of the sign of the derivative of $f$, we take into consideration also $f'$. The example $f(x) = x^2$ where $f$ is not injective in any neighborhood of the origin shows that we have to assume $f'(x) \ne 0$. In fact, if $f$ is continuous on an open interval $I$, $f'(x)$ exists (possibly infinite) at all points of $I$, and $f'$ does not vanish at any point of $I$, then $f$ is injective (actually strictly monotone since $f'$ is either strictly positive or strictly negative in $I$), and $f^{-1}$ is continuous and differentiable on the open interval $f(I)$.
Therefore, we are looking for a generalization of the assumption $f'(x) \ne 0$ for maps $f : \mathbb{R}^M \to \mathbb{R}^M$. Since we are interested in a (unique) solution of the equation $f(x) = y$, the case of a linear function $f : \mathbb{R}^M \to \mathbb{R}^M$ (then $f'(x) = f$) suggests assuming that $f'(x)$ is either an injective or, equivalently because of the finite dimension, a surjective linear map. In both cases, $f'(x)$ is an isomorphism of $\mathbb{R}^M$ onto $\mathbb{R}^M$ (for the case of Banach spaces see Theorem 2.1.8).
However, there is still one more problem: Let
$$f(r, \vartheta) = (r\cos\vartheta, r\sin\vartheta), \quad (r, \vartheta) \in (0, \infty) \times \mathbb{R}, \qquad \text{or} \qquad g(z) = e^z, \quad z \in \mathbb{C}.$$
Both functions are infinitely many times differentiable on their domains,
$$\det f'(r, \vartheta) = r \ne 0, \qquad g'(z) \ne 0,$$
and $f(r, \cdot)$ is $2\pi$-periodic and $g$ is $2\pi i$-periodic, i.e., $f$ and $g$ are not injective. Therefore, we cannot expect more than only local invertibility. The philosophy of that is simple. Since the notion of derivative is a local one, we can deduce only local information from it.
After these preliminary considerations we can state the main theorem. Since there is no simplification in the case of finite dimension, we formulate it for general Banach spaces.
Theorem 4.1.1 (Local Inverse Function Theorem). Let $X$, $Y$ be Banach spaces, $G$ an open set in $X$, $f : X \to Y$ continuously differentiable on $G$. Let the derivative $f'(a)$ be an isomorphism of $X$ onto $Y$ for $a \in G$. Then there exist neighborhoods $U$ of $a$, $V$ of $f(a)$ such that $f$ is injective on $U$, $f(U) = V$. If $g$ denotes the inverse to the restriction $f|_U$, then $g \in C^1(V)$.
Proof. We will solve the equation $f(x) = y$ for a fixed $y$ near the point $b = f(a)$ by the iteration process. To do that we have to rewrite the equation $f(x) = y$ as an equation in $X$. We denote by $A$ the inverse map $[f'(a)]^{-1} \in L(Y, X)$. Then
$$f(x) = y \iff F_y(x) \triangleq x - A[f(x) - y] = x. \tag{4.1.1}$$
The simplest condition for the convergence of iterations is given by the Contraction Principle (see Theorem 2.3.1). We have

$$\|F_y(x_1) - F_y(x_2)\| = \|x_1 - x_2 - A[f(x_1) - f(x_2)]\| \le \|A\| \, \|f(x_2) - f(x_1) - f'(a)(x_2 - x_1)\| \le \|A\| \sup_{\xi \in B(a;r)} \|f'(\xi) - f'(a)\| \, \|x_1 - x_2\|, \quad x_1, x_2 \in B(a; r).$$

(Here we have used the Mean Value Theorem (see formula (3.2.3)).) In other words, we can choose $r > 0$ so small that

$$\|F_y(x_1) - F_y(x_2)\| \le \frac{1}{2}\|x_1 - x_2\| \tag{4.1.2}$$

for $x_1, x_2 \in B(a; r) \subset G$, $y \in Y$. Further,

$$\|F_y(x) - a\| = \|F_y(x) - F_y(a) + F_y(a) - a\| \le \frac{1}{2}\|x - a\| + \|A\| \, \|b - y\|.$$

If $\delta > 0$ is such that $\|A\|\delta \le \frac{r}{2}$, then $F_y(x) \in B(a; r)$ provided $x \in B(a; r)$, $y \in B(b; \delta)$. By the Contraction Principle, the equation (4.1.1) has a unique solution
in $B(a; r)$,

$$x := g(y) \in B(a; r) \quad \text{for any } y \in B(b; \delta).$$

Moreover, if $g(y_i) = x_i$, $i = 1, 2$, then

$$\|g(y_1) - g(y_2)\| = \|F_{y_1}(x_1) - F_{y_2}(x_2)\| \le \|F_{y_1}(x_1) - F_{y_1}(x_2)\| + \|F_{y_1}(x_2) - F_{y_2}(x_2)\| \le \frac{1}{2}\|x_1 - x_2\| + \|A\| \, \|y_1 - y_2\|,$$

i.e.,

$$\|g(y_1) - g(y_2)\| \le 2\|A\| \, \|y_1 - y_2\|. \tag{4.1.3}$$

In particular, $g$ is a Lipschitz continuous map on $B(b; \delta)$. To prove the differentiability of $g$, fix a $y \in B(b; \delta)$ and choose $\hat\delta > 0$ so small that $B(y; \hat\delta) \subset B(b; \delta)$. A candidate for $g'(y)$ is the inverse $C(x) := [f'(x)]^{-1}$ for $x = g(y)$. By (4.1.3), $x \in B(a; r)$ and

$$\|f'(x) - f'(a)\| \le \frac{1}{2\|[f'(a)]^{-1}\|}.$$

This means that $C(x)$ exists and $C(x) \in L(Y, X)$ (cf. Exercise 2.1.33). So we wish to estimate the expression

$$\alpha(k) := g(y + k) - g(y) - C(x)k \quad \text{for } k \in Y, \ \|k\| < \hat\delta.$$

Put

$$h = g(y + k) - g(y), \quad \text{i.e.,} \quad k = f(x + h) - f(x).$$

We have

$$\alpha(k) = h - C(x)k = -C(x)[f(x + h) - f(x) - f'(x)h].$$

By the definition of the Fréchet derivative, for any $\varepsilon > 0$ there is $\eta > 0$ such that

$$\|f(x + h) - f(x) - f'(x)h\| \le \varepsilon\|h\| \quad \text{provided } \|h\| < \eta.$$

But (see (4.1.3)) $\|h\| \le 2\|A\| \, \|k\|$. This means that

$$\|\alpha(k)\| = o(\|k\|), \quad \text{i.e.,} \quad g'(y) = C(x) = [f'(x)]^{-1}.$$

This also implies the continuity of $g'(y)$ since the inverse $[f'(x)]^{-1}$ depends continuously on $x$ (see Exercise 2.1.33). To complete the proof it remains to put

$$V = B(b; \delta) \quad \text{and} \quad U = f^{-1}(V) \cap B(a; r).$$
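The iteration used in the proof can be tried out numerically. The following is only a minimal sketch under stated assumptions: it takes the polar map $f(r, \vartheta) = (r\cos\vartheta, r\sin\vartheta)$ from the beginning of this section, picks the base point $a = (2, 0.5)$ and a nearby right-hand side $y$ (both chosen here purely for illustration), and iterates $x \mapsto F_y(x) = x - A[f(x) - y]$ with $A = [f'(a)]^{-1}$ frozen at $a$. The helper names (`jacobian`, `inverse_2x2`, `apply`, `local_inverse`) are hypothetical and do not come from the text.

```python
import math

def f(x):
    """Polar-to-Cartesian map f(r, theta) = (r cos theta, r sin theta)."""
    r, th = x
    return (r * math.cos(th), r * math.sin(th))

def jacobian(x):
    """Jacobi matrix f'(r, theta), stored row by row; its determinant is r."""
    r, th = x
    return ((math.cos(th), -r * math.sin(th)),
            (math.sin(th), r * math.cos(th)))

def inverse_2x2(m):
    """Inverse of a regular 2x2 matrix."""
    (p, q), (s, t) = m
    det = p * t - q * s
    return ((t / det, -q / det), (-s / det, p / det))

def apply(m, v):
    """Matrix-vector product for a 2x2 matrix."""
    return (m[0][0] * v[0] + m[0][1] * v[1],
            m[1][0] * v[0] + m[1][1] * v[1])

def local_inverse(y, a, steps=60):
    """Iterate x <- F_y(x) = x - A[f(x) - y], with A = [f'(a)]^{-1} fixed."""
    A = inverse_2x2(jacobian(a))
    x = a
    for _ in range(steps):
        fx = f(x)
        corr = apply(A, (fx[0] - y[0], fx[1] - y[1]))
        x = (x[0] - corr[0], x[1] - corr[1])
    return x

a = (2.0, 0.5)                    # illustrative base point (assumption)
b = f(a)
y = (b[0] + 0.05, b[1] - 0.03)    # a point near b = f(a)
x = local_inverse(y, a)
# For this particular f the local inverse is known in closed form.
exact = (math.hypot(y[0], y[1]), math.atan2(y[1], y[0]))
print(x, exact)
```

Note that the derivative is inverted only once, at $a$; this is what distinguishes the iteration of the proof from the Newton iteration, at the price of linear rather than quadratic convergence.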
Corollary 4.1.2. Let $X$, $Y$ be Banach spaces, $G$ an open subset of $X$, $f \in C^1(G, Y)$. If $f'(x)$ is an isomorphism of $X$ onto $Y$ for all $x \in G$, then $f(G)$ is an open subset of $Y$.

Proof. Use the definition of an open set and Theorem 4.1.1.
Example 4.1.3. If $f \in C^k(G)$, $k \in \mathbb{N}$, then $g \in C^k(V)$. This follows easily from the formula

$$g'(y) = [f'(x)]^{-1}, \quad x = g(y),$$

the Chain Rule and Exercise 3.2.32.

Definition 4.1.4. Let $X$, $Y$ be Banach spaces. Then $f : X \to Y$ is called a diffeomorphism of $G \subset X$ (or a diffeomorphism of $G$ onto $H = f(G)$) if the following conditions are satisfied:
(1) $G$ is an open set in $X$, $f \in C^1(G)$,
(2) $f(G) = H$ is an open set in $Y$,
(3) $f$ is injective on $G$ and the inverse $g = (f|_G)^{-1}$ belongs to $C^1(H)$.
If, moreover, $f \in C^k(G)$ for some $k \in \mathbb{N}$, and (therefore) $g \in C^k(H)$, then $f$ is called a $C^k$-diffeomorphism.

A diffeomorphism in $\mathbb{R}^M$ can be viewed as a nonlinear generalization of a linear invertible operator $A : \mathbb{R}^M \to \mathbb{R}^M$. Such $A$ yields a linear transformation of coordinates $y = Ax$. If $\varphi$ is a diffeomorphism of $G$ onto $H$ and $a \in G$, we can suppose without loss of generality that $\varphi(a) = o$ (if this is not true consider a new diffeomorphism on $G$: $\tilde\varphi(x) = \varphi(x) - \varphi(a)$). Then the Cartesian coordinates $y_1, \dots, y_M$ of $y = \varphi(x)$ can be taken as (generalized or nonlinear or non-Cartesian) coordinates of a point $x$ in the neighborhood $G$ of $a$. Such coordinates play an important role in problems where we have to work on non-flat domains (e.g., on nonlinear manifolds, see Appendix 4.3A).

Notice that we can also interpret Theorem 4.1.1 in the finite dimensional case as follows: The Cartesian coordinates $(y_1, \dots, y_M)$ of the point $y = f(x)$ are nonlinear coordinates of the point $x$. In these nonlinear coordinates the diffeomorphism $f$ of $U$ is equal to the identity map.

Example 4.1.5. Standard examples of nonlinear coordinates:

(i) Polar coordinates in $\mathbb{R}^2$:

$$x = r\cos\varphi, \quad y = r\sin\varphi$$

($\psi(r, \varphi) = (x, y)$ is a diffeomorphism of $(0, \infty) \times (\alpha, \alpha + 2\pi)$ onto $\mathbb{R}^2$ without a half line);

(ii) Spherical coordinates in $\mathbb{R}^3$:

$$x = r\cos\varphi_1\cos\varphi_2, \quad y = r\sin\varphi_1\cos\varphi_2, \quad z = r\sin\varphi_2;$$
(iii) Spherical coordinates in $\mathbb{R}^M$:

$$\begin{aligned}
x_1 &= r\cos\varphi_1\cos\varphi_2 \cdots \cos\varphi_{M-1},\\
x_2 &= r\sin\varphi_1\cos\varphi_2 \cdots \cos\varphi_{M-1},\\
x_3 &= r\sin\varphi_2\cos\varphi_3 \cdots \cos\varphi_{M-1},\\
&\ \ \vdots\\
x_{M-1} &= r\sin\varphi_{M-2}\cos\varphi_{M-1},\\
x_M &= r\sin\varphi_{M-1}.
\end{aligned}$$

Before using the Local Inverse Function Theorem we have to check that the functions

$$\psi_i(r, \varphi_1, \dots, \varphi_{M-1}) = x_i, \quad i = 1, \dots, M,$$

have continuous partial derivatives (obvious) and their Jacobi matrix is regular. Equivalently, the determinant $J_\psi$ of the Jacobi matrix is nonzero at a point $(\tilde r, \tilde\varphi_1, \dots, \tilde\varphi_{M-1})$, $\psi(\tilde r, \tilde\varphi_1, \dots, \tilde\varphi_{M-1}) = a$. Here

$$J_\psi = r^{M-1}\prod_{k=1}^{M-2}\cos^k\varphi_{k+1}, \quad M \ge 2.^{1}$$

¹ We use the notation $\prod_{j=1}^{p} a_j := a_1 \cdots a_p$ ($\prod_{\emptyset} := 1$).

Example 4.1.6. The following question concerning the assumptions of Theorem 4.1.1 naturally arises: "What happens if $f'(a)$ is not an isomorphism?"

In the case of finite dimension, $f'(a)$ cannot be an isomorphism for $f : \mathbb{R}^M \to \mathbb{R}^N$ whenever $M \neq N$. If $M > N$, i.e., the number of equations is smaller than the number of variables, then we can expect (we recommend to consider the case of a linear $f$) to compute some of the variables. The simplest case is solved in the next Section 4.2 (the Implicit Function Theorem). If $M < N$, then $f(G)$ will probably be a "thin" subset of $\mathbb{R}^N$. This case leads to the notion of a (differentiable) manifold (see the first part of Section 4.3 (Definition 4.3.4) and Appendix 4.3A).

If both $X$ and $Y$ have infinite dimension, it can occur that $f'(a)$ is injective but $\operatorname{Im} f'(a)$ is a dense subset of $Y$, different from $Y$. In this case, $A = [f'(a)]^{-1}$ exists but it is not continuous into $X$. We can also explain this situation as follows. If there is a constant $c > 0$ such that

$$\|f'(a)h\|_Y \ge c\|h\|_X \quad \text{for all } h \in X, \tag{4.1.4}$$

then $f'(a)$ is injective, $Y_1 := \operatorname{Im} f'(a)$ is a closed subspace of $Y$. Moreover, if we know that $Y_1$ is dense in $Y$, then $Y_1 = Y$ and Theorem 4.1.1 can be applied. But sometimes we are able to prove only a weaker estimate, namely that there is a constant $c > 0$ such that

$$\|f'(a)h\|_Y \ge c\|h\|_{\tilde X} \quad \text{for all } h \in X$$
where $\|\cdot\|_{\tilde X}$ is a weaker norm than $\|\cdot\|_X$. By this we mean that only the estimate

$$\|h\|_{\tilde X} \le d\|h\|_X \quad \text{holds for all } h \in X$$

(e.g., $X = C^1[0, 1]$, $\|h\|_X = \sup_{t \in [0,1]} |h(t)| + \sup_{t \in [0,1]} |h'(t)|$, $\|h\|_{\tilde X} = \sup_{t \in [0,1]} |h(t)|$). Then $A := [f'(a)]^{-1}$ maps $Y$ continuously into the completion of $X$ with respect to the norm $\|\cdot\|_{\tilde X}$ (remember that we need complete spaces for the Contraction Principle). In the above example "one derivative is lost" in an iteration. An idea how to overcome this problem is to use an approximation of $A$ and a more rapid iteration process (e.g., the Newton iteration, see Appendix 3.2A) to compensate errors in the approximations of $A$ (results of this type are the so-called "Hard" Local Inverse/Implicit Function Theorems), see, e.g., Deimling [34], Hamilton [65], Moser [97], Nash [98] or Nirenberg [100].

We now turn to a global version of the Inverse Function Theorem.

Theorem 4.1.7 (Global Inverse Function Theorem). Let $X$, $Y$ be Banach spaces and let $f : X \to Y$ be continuously differentiable on $X$. Suppose that $f'(x)$ is continuously invertible for all $x \in X$ and there is a constant $c > 0$ such that

$$\|[f'(x)]^{-1}\|_{L(Y,X)} \le c \quad \text{for all } x \in X.$$
Then $f$ is a diffeomorphism of $X$ onto $Y$.

Proof. It is sufficient to prove that $f$ is injective and surjective. The statement on the diffeomorphism follows then from Theorem 4.1.1. Fix an $a \in X$ and denote $b = f(a)$.

Step 1. The map $f$ is surjective, i.e., $f(X) = Y$. To see this choose $y \in Y$ and put

$$\varphi(t) = (1 - t)b + ty, \quad t \in [0, 1].$$

We wish to show that there is a curve $\psi : [0, 1] \to X$ such that

$$f(\psi(t)) = \varphi(t), \quad \text{in particular,} \quad y = \varphi(1) = f(\psi(1)).$$

Since $f$ is locally invertible at $a \in X$ (Theorem 4.1.1), there is a neighborhood $U$ of $a$ and $\delta > 0$ such that $\psi(t) = (f|_U)^{-1}\varphi(t)$ is well defined for $t \in [0, \delta)$ and $\psi \in C^1([0, \delta), X)$. Let

$$\mathcal{A} := \{\tau \in [0, 1] : \exists\, \omega \in C^1([0, \tau], X),\ f(\omega(t)) = \varphi(t),\ t \in [0, \tau]\}, \tag{4.1.5}$$

and $\alpha = \sup \mathcal{A}$. Notice that $\omega$ is uniquely determined by (4.1.5) (this follows from the local invertibility of $f$), and therefore there is $\omega \in C^1([0, \alpha), X)$ such that $f(\omega(t)) = \varphi(t)$, $t \in [0, \alpha)$. Since we have

$$\|\omega(t_1) - \omega(t_2)\| \le \sup_{t \in [t_1, t_2]} \|\omega'(t)\| \, |t_1 - t_2| \le c\|y - b\| \, |t_1 - t_2|$$
for all $t_1, t_2 \in [0, \alpha)$, the mapping $\omega$ is uniformly continuous on the interval $[0, \alpha)$, hence

$$\lim_{t \to \alpha-} \omega(t) =: \omega(\alpha)$$

exists ($X$ is a complete space) and the equality (4.1.5) holds for all $t \in [0, \alpha]$. Now we are ready to prove that $\alpha = 1$. Indeed, if $\alpha < 1$, then we can apply Theorem 4.1.1 at the point $\omega(\alpha)$ to obtain a contradiction with the definition of $\alpha$.

Step 2. The map $f$ is injective. Suppose by contradiction that there are different $x_1, x_2 \in X$ for which $f(x_1) = f(x_2)$. Put

$$y := f(x_2), \quad \psi_i(t) = (1 - t)a + tx_i, \quad \varphi_i(t) = f(\psi_i(t)), \quad t \in [0, 1],\ i = 1, 2.$$

By a slight modification of the above procedure it is possible to prove the existence of a mapping $G : [0, 1] \times [0, 1] \to X$ such that

$$f(G(t, s)) = (1 - s)\varphi_1(t) + s\varphi_2(t), \quad (t, s) \in [0, 1] \times [0, 1].$$

Then

$$f(G(1, s)) = (1 - s)f(x_1) + sf(x_2) = y \quad \text{for every } s \in [0, 1].$$

This contradicts the local invertibility of $f$ at $x_1$ ($\neq x_2$).
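Step 1 of the proof is constructive in spirit: the curve $\omega$ is obtained by following the segment $\varphi(t) = (1 - t)b + ty$ back through $f$. The sketch below imitates this numerically for a scalar example chosen here as an assumption, $f(x) = x + \frac{1}{2}\sin x$, for which $f'(x) \ge \frac{1}{2}$ and hence $|[f'(x)]^{-1}| \le 2$ on all of $\mathbb{R}$. Each step makes an Euler prediction for $\omega'(t) = [f'(\omega(t))]^{-1}(y - b)$ and then corrects with a few Newton steps; this predictor-corrector scheme is a swapped-in numerical technique, not the argument of the proof, and the names `lift`, `n` and `newton_steps` are hypothetical.

```python
import math

def f(x):
    # Illustrative map with f'(x) >= 1/2 everywhere (assumption, not from the text).
    return x + 0.5 * math.sin(x)

def df(x):
    return 1.0 + 0.5 * math.cos(x)

def lift(y, a=0.0, n=200, newton_steps=5):
    """Follow phi(t) = (1 - t) b + t y, b = f(a), keeping f(omega(t)) = phi(t)."""
    b = f(a)
    x = a
    for i in range(1, n + 1):
        t = i / n
        target = (1.0 - t) * b + t * y
        # Predictor: Euler step for omega'(t) = (y - b) / f'(omega(t)).
        x += (y - b) / df(x) / n
        # Corrector: Newton steps pull x back onto f(x) = phi(t).
        for _ in range(newton_steps):
            x -= (f(x) - target) / df(x)
    return x

y = 10.0
x = lift(y)
print(x, f(x))
```

The uniform bound on $[f'(x)]^{-1}$ is what keeps the step length of the lifted curve under control; without it the curve could run off to infinity before $t$ reaches 1.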
Exercise 4.1.8. A complex function $f : \mathbb{C} \to \mathbb{C}$ is called holomorphic in an open set $G \subset \mathbb{C}$ if $f'(z)$ exists for every $z \in G$. If $f'(a) \neq 0$ for an $a \in G$, then $f$ is locally invertible (Theorem 4.1.1). Prove that $(f|_U)^{-1}$ is holomorphic and apply this result to $f(z) = e^z$ to obtain a power series expression of a continuous branch of the "multivalued function" log. (For the "complex function" proof see, e.g., Rudin [113, Theorem 10.30].)

Exercise 4.1.9. Let

$$f(x) = x + 2x^2\sin\frac{1}{x}, \quad x \neq 0, \qquad f(0) = 0.$$

Show that $f$ is not injective on any neighborhood of zero. Which assumption of Theorem 4.1.1 is not satisfied?

Hint. If $U$ is a neighborhood of 0, show that $f'(x) = 0$ has a solution in $U$ and $f''(x) \neq 0$ at any such solution. Hence $f$ is not injective on $U$. Note also that $f'$ is not continuous at 0.
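A quick numerical look supports the hint of Exercise 4.1.9. The sketch below (an illustration only, with the sample points $x = 1/(2\pi n)$ and $x = 1/((2n + 1)\pi)$ chosen here) evaluates $f'(x) = 1 + 4x\sin\frac{1}{x} - 2\cos\frac{1}{x}$ and shows that it takes values near $-1$ and near $3$ arbitrarily close to the origin, although $f'(0) = 1$. So $f'$ changes sign in every neighborhood of $0$ and is not continuous there.

```python
import math

def df(x):
    """Derivative of f(x) = x + 2 x^2 sin(1/x), f(0) = 0."""
    if x == 0.0:
        return 1.0  # limit of f(h)/h = 1 + 2 h sin(1/h) as h -> 0
    return 1.0 + 4.0 * x * math.sin(1.0 / x) - 2.0 * math.cos(1.0 / x)

def sample_points(n):
    """Two points close to 0 where cos(1/x) is 1 and -1, respectively."""
    x_neg = 1.0 / (2.0 * math.pi * n)          # f'(x_neg) is close to -1
    x_pos = 1.0 / ((2.0 * n + 1.0) * math.pi)  # f'(x_pos) is close to 3
    return x_neg, x_pos

samples = []
for n in (1, 10, 100, 1000):
    x_neg, x_pos = sample_points(n)
    samples.append((x_neg, df(x_neg), x_pos, df(x_pos)))

for row in samples:
    print(row)
```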
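Exercise 4.1.10. Find the form of the Laplace operator

$$\Delta u := \frac{\partial^2 u}{\partial x^2} + \frac{\partial^2 u}{\partial y^2}$$

in the polar coordinates in the set

$$G = \{(x, y) \in \mathbb{R}^2 : x^2 + y^2 > 0\} \quad \text{for } u \in C^2(G).$$

Hint. If $v(r, \varphi) = u(r\cos\varphi, r\sin\varphi)$, then

$$(\Delta u) \circ \Phi = \frac{\partial^2 v}{\partial r^2} + \frac{1}{r^2}\frac{\partial^2 v}{\partial \varphi^2} + \frac{1}{r}\frac{\partial v}{\partial r}$$

where

$$\Phi(r, \varphi) = (r\cos\varphi, r\sin\varphi)$$

is the transformation. Note that we have

$$\left(\frac{\partial u}{\partial x}, \frac{\partial u}{\partial y}\right) = \left(\frac{\partial v}{\partial r}, \frac{\partial v}{\partial \varphi}\right) \circ (\Phi')^{-1}.$$

It is more comfortable to use this formula once again to compute $\frac{\partial^2 u}{\partial x^2}$, $\frac{\partial^2 u}{\partial y^2}$.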
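The polar form of the Laplace operator stated in the hint can be sanity-checked numerically before it is derived. The following sketch assumes a test function $u(x, y) = x^3 y + y^2$ (chosen here only for illustration, with $\Delta u = 6xy + 2$) and approximates the derivatives of $v = u \circ \Phi$ by central differences; the step size `h` and the evaluation point are arbitrary choices.

```python
import math

def u(x, y):
    # Illustrative test function (assumption): u = x^3 y + y^2.
    return x**3 * y + y**2

def laplace_u(x, y):
    # Computed by hand: u_xx + u_yy = 6 x y + 2.
    return 6.0 * x * y + 2.0

def v(r, phi):
    """v = u composed with Phi(r, phi) = (r cos phi, r sin phi)."""
    return u(r * math.cos(phi), r * math.sin(phi))

def polar_laplacian(r, phi, h=1e-3):
    """v_rr + v_phiphi / r^2 + v_r / r via central finite differences."""
    v_rr = (v(r + h, phi) - 2.0 * v(r, phi) + v(r - h, phi)) / h**2
    v_pp = (v(r, phi + h) - 2.0 * v(r, phi) + v(r, phi - h)) / h**2
    v_r = (v(r + h, phi) - v(r - h, phi)) / (2.0 * h)
    return v_rr + v_pp / r**2 + v_r / r

r, phi = 1.5, 0.7
lhs = laplace_u(r * math.cos(phi), r * math.sin(phi))
rhs = polar_laplacian(r, phi)
print(lhs, rhs)
```

Agreement at a few sample points is of course no proof; it only guards against sign or factor slips in the hand computation.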
Exercise 4.1.11. Show that the estimate

$$\|[f'(x)]^{-1}\|_{L(Y,X)} \le c + d\|x\|_X$$

is sufficient in Theorem 4.1.7.

Hint. Use the Gronwall Lemma (Exercise 5.1.16) to estimate $\omega'(t)$.
4.2 Implicit Function Theorem

Let us start with a simple example of $f : \mathbb{R}^2 \to \mathbb{R}$, e.g., $f(x, y) := x^2 + y^2 - 1$. Denote $M = \{(x, y) \in \mathbb{R}^2 : f(x, y) = 0\}$, i.e., $M$ is the unit circle in $\mathbb{R}^2$. We would like to solve the equation $f(x, y) = 0$ for the unknown variable $y$ or to express $M$ as the graph of a certain function $y = \varphi(x)$. We immediately see that for any $x \in (-1, 1)$ there is a pair of $y$'s ($y_{1,2} = \pm\sqrt{1 - x^2}$) such that $(x, y) \in M$. In particular, $M$ is not a graph of any function $y = \varphi(x)$. We can only obtain that $M$ is locally a graph, i.e., for $(a, b) \in M$, $a \in (-1, 1)$,
there is a neighborhood $U$ of $(a, b)$ such that $M \cap U$ is the graph of a function $y = \varphi(x)$. On the other hand, for $x = \pm 1$ there is a unique $y$ ($y = 0$) for which $(x, y) \in M$. But there is no neighborhood $U$ of $(1, 0)$ such that $M \cap U$ is the graph of a function $y = \varphi(x)$. What is the difference between these two cases?

In the former case the tangent line to $M \cap U$ exists at the point $(a, b)$ with the slope $\varphi'(a)$. Since

$$f(x, \varphi(x)) = 0 \quad \text{for } x \in (a - \delta, a + \delta),$$

we have (formally by the Chain Rule)

$$\frac{\partial f}{\partial x}(a, b) + \frac{\partial f}{\partial y}(a, b)\varphi'(a) = 0, \tag{4.2.1}$$

i.e., $\varphi'(a) = -\frac{a}{b}$, since $\frac{\partial f}{\partial y}(a, b) = 2b \neq 0$.
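The former case can be illustrated numerically. The sketch below is only an illustration under assumptions made here: it takes the point $(a, b) = (0.6, 0.8)$ on the unit circle, computes $y = \varphi(x)$ for $x$ near $a$ by iterating $y \mapsto y - f(x, y) / \frac{\partial f}{\partial y}(a, b)$ (a fixed point of this map solves $f(x, y) = 0$), and compares a finite-difference slope with $-\frac{a}{b}$. The function name `implicit_phi` and the step `h` are hypothetical.

```python
def f(x, y):
    """Defining function of the unit circle."""
    return x * x + y * y - 1.0

def implicit_phi(x, a, b, steps=50):
    """Solve f(x, y) = 0 for y near b, for x near a (requires b != 0)."""
    fy_ab = 2.0 * b  # partial derivative of f with respect to y at (a, b)
    y = b
    for _ in range(steps):
        y = y - f(x, y) / fy_ab
    return y

a, b = 0.6, 0.8
h = 1e-6
# Central difference approximation of the slope of phi at a.
slope = (implicit_phi(a + h, a, b) - implicit_phi(a - h, a, b)) / (2.0 * h)
print(implicit_phi(a, a, b), slope, -a / b)
```

With $b = 0$ the division by `fy_ab` breaks down, which is the numerical face of the latter case.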
In the latter case, where $(a, b) = (1, 0)$, we have $\frac{\partial f}{\partial y}(1, 0) = 0$, and $\varphi'(1)$ cannot be determined from (4.2.1). The tangent line to $M$ at the point $(1, 0)$ is parallel to the $y$-axis, which indicates some problems with determining a solution, i.e., the (implicit) function $\varphi$. The reader is invited to sketch a figure. This discussion shows the importance of the assumption

$$\frac{\partial f}{\partial y}(a, b) \neq 0.$$

How can this assumption be generalized to $f : \mathbb{R}^{M+N} \to \mathbb{R}^N$? A brief inspection of the linear case leads to the observation that we can compute the unknowns $y_{M+1}, \dots, y_{M+N}$ from the equations

$$f_i(y_1, \dots, y_{M+N}) = \sum_{j=1}^{M+N} a_{ij}y_j = 0, \quad i = 1, \dots, N,$$

uniquely as functions of $y_1, \dots, y_M$ if and only if

$$\det\,(a_{ij})_{\substack{i = 1, \dots, N \\ j = M+1, \dots, M+N}} \neq 0.$$

Nevertheless, $a_{ij} = \frac{\partial f_i}{\partial y_j}$, and the condition on the regularity of the matrix $(a_{ij})_{i = 1, \dots, N;\ j = M+1, \dots, M+N}$ means that the partial (Fréchet) derivative of $f$ (see Definition 3.2.17) with respect to the last $N$ variables is an isomorphism of $\mathbb{R}^N$.

Theorem 4.2.1 (Implicit Function Theorem). Let $X$, $Y$, $Z$ be Banach spaces, $f : X \times Y \to Z$. Let $(a, b) \in X \times Y$ be such a point that $f(a, b) = o$. Let $G$ be an open set in $X \times Y$ containing the point $(a, b)$. Let $f \in C^1(G)$ and let the partial Fréchet derivative $f_2'(a, b)$ be an isomorphism of $Y$ onto $Z$.
Then there are neighborhoods $U$ of $a$ and $V$ of $b$ such that for any $x \in U$ there exists a unique $y \in V$ for which $f(x, y) = o$. Denote this $y$ by $\varphi(x)$. Then $\varphi \in C^1(U)$. Moreover, if $f \in C^k(G)$, $k \in \mathbb{N}$, then $\varphi \in C^k(U)$.

Proof. We denote $A := [f_2'(a, b)]^{-1}$ and define

$$F(x, y) = (x, Af(x, y)), \quad (x, y) \in G.$$

Then $F : X \times Y \to X \times Y$, $F \in C^1(G)$ and

$$F'(a, b)(h, k) = (h, Af'(a, b)(h, k)).$$

One can verify that $F'(a, b)$ is an isomorphism of $X \times Y$ onto itself. Hence we can apply Theorem 4.1.1 to get neighborhoods $U \times V$ of $(a, b)$ and $\tilde U \times \tilde V$ of $(a, o)$ such that for any $\xi \in \tilde U$ and $\eta = o \in \tilde V$ there exists a unique $(x, y) \in U \times V$ such that

$$F(x, y) = (x, Af(x, y)) = (\xi, o), \quad \text{i.e.,} \quad x = \xi, \quad U = \tilde U,$$

and, denoting $y = \varphi(x)$, $f(x, \varphi(x)) = o$. This means that

$$F^{-1}(x, o) = (x, \varphi(x)).$$

Since the inverse $F^{-1}$ is differentiable, by Theorem 4.1.1 we conclude that $\varphi \in C^1(U)$.

Remark 4.2.2. We can also deduce a formula for $\varphi'(x)$: Indeed, since

$$f(x, \varphi(x)) = o \quad \text{for every } x \in U$$

and both the functions $f$ and $\varphi$ are differentiable, we get from the Chain Rule

$$f_1'(x, \varphi(x)) + f_2'(x, \varphi(x)) \circ \varphi'(x) = o,$$

and therefore

$$\varphi'(x) = -[f_2'(x, \varphi(x))]^{-1} \circ f_1'(x, \varphi(x)) \quad \text{for } x \in U_1 \tag{4.2.2}$$

where $U_1 \subset U$ may be smaller if necessary in order to guarantee the existence of the inverse $[f_2'(x, \varphi(x))]^{-1}$ (see Exercise 2.1.33).

Remark 4.2.3. The statement of Theorem 4.2.1 is by no means the best one. If we had used the Contraction Principle directly we would obtain the existence of a solution $y = \varphi(x)$ under weaker assumptions. Namely, $f(x, y) = o$ is equivalent to

$$y = y - Af(x, y)$$
and since $x$ is a parameter here we do not need to assume the differentiability with respect to $x$ if we content ourselves just with the existence of $\varphi$ (and give up its differentiability). We recommend that the reader use the Contraction Principle directly to obtain the following statement:

Let $X$ be a normed linear space, $Y$, $Z$ be Banach spaces and let $f : X \times Y \to Z$ be continuous at the point $(a, b)$ where $f(a, b) = o$. Assume that the partial Fréchet derivative $f_2'(a, b)$ is an isomorphism of $Y$ onto $Z$ and $f_2' : X \times Y \to L(Y, Z)$ is continuous at $(a, b)$. Then there are neighborhoods $U$ of $a$ and $V$ of $b$ such that for any $x \in U$ there is a unique $y = \varphi(x) \in V$ for which $f(x, \varphi(x)) = o$. Moreover, $\varphi$ is continuous at $a$.

It is also possible to avoid partly the requirement of invertibility of $f_2'(a, b)$ (see Remark 4.1.6 and references given there). There are many examples in Calculus where the Implicit Function Theorem is used. We give one in Exercise 4.2.9, see also exercises in Dieudonné [35]. Our attention is turned mainly towards more theoretical applications.

Example 4.2.4. Let $P(z) = z^n + a_{n-1}z^{n-1} + \cdots + a_0$ be a polynomial with real or complex coefficients $a_0, \dots, a_{n-1}$. The famous Fundamental Theorem of Algebra says that if $n \ge 1$, then the equation $P(z) = 0$ has at least one solution $\tilde z \in \mathbb{C}$ and actually $n$ solutions if all of them are counted with their multiplicity. This means that $P$ can be factorized as follows:

$$P(z) = (z - z_1)^{k_1} \cdots (z - z_l)^{k_l}, \quad k_1 + \cdots + k_l = n,$$

where $z_1, \dots, z_l$ are different. A natural question arises: How do these solutions $z_1, \dots, z_l$ depend on the coefficients $a_0, \dots, a_{n-1}$ of $P$? Let

$$F(z, y_0, \dots, y_{n-1}) = z^n + y_{n-1}z^{n-1} + \cdots + y_0 : \mathbb{C} \times \mathbb{C}^n \to \mathbb{C}.$$

Then

$$F(z_1, a_0, \dots, a_{n-1}) = P(z_1) = 0 \quad \text{and} \quad F \in C^\infty(\mathbb{C} \times \mathbb{C}^n).$$

If $z_1$ is a simple root, i.e., $k_1 = 1$, then

$$\frac{\partial F}{\partial z}(z_1, a_0, \dots, a_{n-1}) \neq 0,$$
and the Implicit Function Theorem says that $z_1$ depends continuously on $a_0, \dots, a_{n-1}$ (also in the real case). But what happens if $k_1 > 1$? Notice that the cases of real and complex roots are different. In the former case, the real root can disappear ($x^2 + \varepsilon = 0$ for $\varepsilon > 0$), and in the latter case, the uniqueness can be lost. Since the solution $z_1$ ramifies or bifurcates at $a_0, \dots, a_{n-1}$, this phenomenon is called a bifurcation. We postpone a basic discussion of this very important nonlinear phenomenon till the end of the next section.

Example 4.2.5 (dependence of solutions on initial conditions). Suppose that $f : \mathbb{R} \times \mathbb{R}^N \to \mathbb{R}^N$ is continuous in an open set $G \subset \mathbb{R} \times \mathbb{R}^N$ and has continuous partial derivatives with respect to the last $N$ variables in $G$. Denote by $\varphi(\cdot; \tau, \xi)$ a (unique) solution of the initial value problem

$$\dot x = f(t, x), \quad x(\tau) = \xi$$

(see Theorem 2.3.4). We are now interested in the properties of $\varphi$ with respect to the variables $(\tau, \xi) \in G$, cf. Remark 2.3.5. Let us define

$$\Phi(\tau, \xi, \varphi)(t) := \xi + \int_\tau^t f(s, \varphi(s))\,ds - \varphi(t). \tag{4.2.3}$$

For a fixed $(t_0, x_0) \in G$ the solution $\varphi(\cdot; t_0, x_0)$ of $\Phi(t_0, x_0, \varphi) = o$ is defined on an open interval $J$. Choose a compact interval $I \subset J$ such that $t_0 \in \operatorname{int} I$. Then the mapping $\Phi$ given by (4.2.3) is defined on a certain open subset $H \subset \mathbb{R} \times \mathbb{R}^N \times C(I, \mathbb{R}^N)$ and takes its values in $C(I, \mathbb{R}^N)$. Further, $\Phi(t_0, x_0, \varphi(\cdot; t_0, x_0)) = o$ and

$$\begin{aligned}
{[\Phi_1'(\tau, \xi, \varphi)]}(t) &= -f(\tau, \varphi(\tau)),\\
{[\Phi_2'(\tau, \xi, \varphi)\eta]}(t) &= \eta,\\
{[\Phi_3'(\tau, \xi, \varphi)\psi]}(t) &= \int_\tau^t f_2'(s, \varphi(s))\psi(s)\,ds - \psi(t),
\end{aligned}
\qquad t \in I,\ \eta \in \mathbb{R}^N,\ \psi \in C(I, \mathbb{R}^N)$$

whenever $(\tau, \xi, \varphi) \in H$.

Since these partial Fréchet derivatives are continuous, $\Phi \in C^1(H)$ (see Proposition 3.2.18). The crucial assumption of the Implicit Function Theorem is the continuous invertibility of $\Phi_3'(t_0, x_0, \varphi(\cdot; t_0, x_0))$ in the space $C(I, \mathbb{R}^N)$. Put

$$B\psi(t) = \int_{t_0}^t f_2'(s, \varphi(s; t_0, x_0))\psi(s)\,ds, \quad \psi \in C(I, \mathbb{R}^N).$$

We have proved in Example 2.3.7 that $\sigma(B) = \{0\}$. In particular,

$$B - I = \Phi_3'(t_0, x_0, \varphi(\cdot; t_0, x_0))$$
is continuously invertible. By Theorem 4.2.1, there exist neighborhoods $U$ of $(t_0, x_0)$ and $V$ of $\varphi(\cdot; t_0, x_0)$ such that for any $(\tau, \xi) \in U$ there is a unique $\varphi \in V$ such that $\Phi(\tau, \xi, \varphi) = o$. Moreover, this $\varphi$ is continuously differentiable with respect to $\tau$ and $\xi$, and for the continuous mappings

$$\Theta(\cdot) := \frac{\partial \varphi}{\partial \tau}(\cdot; t_0, x_0) \quad \text{and} \quad \Xi(\cdot) := \frac{\partial \varphi}{\partial \xi}(\cdot; t_0, x_0)$$

we have, by Remark 4.2.2,

$$\begin{aligned}
-f(t_0, x_0) + \int_{t_0}^t f_2'(s, \varphi(s; t_0, x_0))\Theta(s)\,ds - \Theta(t) &= o,\\
\eta + \int_{t_0}^t f_2'(s, \varphi(s; t_0, x_0))\Xi(s)\eta\,ds - \Xi(t)\eta &= o, \quad \eta \in \mathbb{R}^N.
\end{aligned}$$

This means that $\Theta$ and $\Xi$ solve the so-called equation in variations

$$\dot y(t) = f_2'(t, \varphi(t; t_0, x_0))y(t) \tag{4.2.4}$$

(this is a system of $N$ linear equations for $\Theta$ and a system of $N \times N$ equations for $\Xi$) and fulfil the initial conditions

$$\Theta(t_0) = -f(t_0, x_0), \quad \Xi(t_0) = I. \tag{4.2.5}$$

In particular, $\Xi(\cdot)$ is a fundamental matrix of (4.2.4).
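The equation in variations can be checked on a scalar example. The sketch below assumes the autonomous equation $\dot x = x^2$ with $t_0 = 0$ (chosen here because $\varphi(t; 0, \xi) = \xi/(1 - \xi t)$ is known in closed form) and integrates $x$ together with $\Xi$, which solves $\dot\Xi = 2x\,\Xi$, $\Xi(0) = 1$, by a classical Runge-Kutta scheme. The result is compared with a finite-difference derivative of the flow with respect to $\xi$. The function names, the step count and the choice of integrator are assumptions of this illustration.

```python
def rhs(state):
    """Right-hand side of the coupled system for (x, Xi): x' = x^2, Xi' = 2 x Xi."""
    x, xi = state
    return (x * x, 2.0 * x * xi)

def rk4(state, t_end, n):
    """Classical fourth-order Runge-Kutta integration from t = 0 to t_end."""
    h = t_end / n
    for _ in range(n):
        k1 = rhs(state)
        k2 = rhs(tuple(s + 0.5 * h * k for s, k in zip(state, k1)))
        k3 = rhs(tuple(s + 0.5 * h * k for s, k in zip(state, k2)))
        k4 = rhs(tuple(s + h * k for s, k in zip(state, k3)))
        state = tuple(s + h / 6.0 * (a + 2.0 * b + 2.0 * c + d)
                      for s, a, b, c, d in zip(state, k1, k2, k3, k4))
    return state

def flow(x0, t_end=1.0, n=2000):
    """Return (phi(t_end; 0, x0), Xi(t_end)) with Xi(0) = 1."""
    return rk4((x0, 1.0), t_end, n)

x0 = 0.5
x1, Xi = flow(x0)
eps = 1e-6
# Finite-difference derivative of the flow with respect to the initial value.
fd = (flow(x0 + eps)[0] - flow(x0 - eps)[0]) / (2.0 * eps)
print(x1, Xi, fd)
```

For this example the closed form gives $\varphi(1; 0, 0.5) = 1$ and $\frac{\partial\varphi}{\partial\xi}(1; 0, 0.5) = (1 - 0.5)^{-2} = 4$, which is what both computations should reproduce up to discretization error.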
As an application of differentiability with respect to initial conditions we briefly sketch the approach to orbital stability of periodic solutions.

Example 4.2.6. Assume that we know a non-constant $T$-periodic solution $\varphi_0$ of an autonomous system

$$\dot x = f(x),$$

and that we are interested in the behavior of other solutions which start at time $t = 0$ near $\varphi_0(0) = x_0$. We assume that $f \in C^1(G)$, $G$ is an open set in $\mathbb{R}^N$, and denote by $\varphi(\cdot, \xi)$ the solution satisfying $\varphi(0, \xi) = \xi$. Let

$$M = \{x \in \mathbb{R}^N : (x - x_0, f(x_0))_{\mathbb{R}^N} = 0\}.$$

In order to show that a solution $\varphi(\cdot, \xi)$ exists on such an interval $[0, t(\xi)]$ that it meets $M \cap U$ ($U$ is a neighborhood of $x_0$) for the first positive time $t(\xi)$ near $T$ ($T$ is the period of $\varphi_0$), see Figure 4.2.1, we can solve the equation

$$\Phi(t, \xi) := (\varphi(t, \xi) - x_0, f(x_0)) = 0$$

in the vicinity of the point $(T, x_0)$. We have

$$\Phi_1'(T, x_0) = (f(x_0), f(x_0)) > 0$$
($f(x_0) \neq 0$ since $\varphi_0$ is non-constant) and

$$\Phi_2'(t, x_0)\eta = \left(\frac{d\varphi}{d\xi}(t, x_0)\eta, f(x_0)\right) = (\Xi(t, x_0)\eta, f(x_0))$$

(see the previous example) where $\Xi(t, x_0)$ is a fundamental matrix of the linear $T$-periodic equation

$$\dot y(t) = f'(\varphi_0(t))y(t)$$

(cf. (4.2.4)). So, we may use the Implicit Function Theorem to get a function $t(\xi)$ such that

$$\Phi(t(\xi), \xi) = 0, \quad t(x_0) = T, \quad \xi \in U(x_0).$$

Figure 4.2.1. (Labels in the figure: $f(x_0)$, $M$, $\varphi_0(\cdot)$, $\mathbb{R}^N$, $\varphi(\cdot, \xi)$, $x_0$, $\varphi(t(\xi), \xi)$, $\xi$.)

By (4.2.2) we also have

$$\frac{dt}{d\xi}(T, x_0)\eta = -\frac{1}{\|f(x_0)\|^2}(\Xi(t, x_0)\eta, f(x_0)), \quad \eta \in \mathbb{R}^N.$$

This allows us to investigate the behavior of the so-called Poincaré mapping

$$P(\xi) := \varphi(t(\xi), \xi), \quad \xi \in U \cap M.$$

The asymptotic orbital stability of $\varphi_0$ can be defined by the requirement

$$\lim_{n \to \infty} P^n(\xi) = x_0.$$

For more detail the interested reader can consult, e.g., Amann [4, Section 23].
We are often interested in asymptotic behavior of solutions of a system of ordinary differential equations (linear or nonlinear), e.g., boundedness of solutions or their convergence to some special solutions (constant, periodic, etc.). In the following example we briefly sketch a method which can be used.

Example 4.2.7. Consider the equation

$$\dot x = Ax + f \tag{4.2.6}$$

where $A$ is a constant $N \times N$ matrix and $f : \mathbb{R} \to \mathbb{R}^N$ is bounded and continuous on $\mathbb{R}$ ($f \in BC(\mathbb{R}, \mathbb{R}^N)$). We are interested in bounded solutions of (4.2.6) on $\mathbb{R}$. Let us assume $\sigma(A) \cap i\mathbb{R} = \emptyset$. With the help of Functional Calculus (Theorem 1.1.38, in particular, Remark 1.1.39(i), (ii)) we can construct two projections $P^+$, $P^-$ onto complementary subspaces $X^+$, $X^-$ of $\mathbb{R}^N$ which commute with $A \in L(\mathbb{R}^N)$ (the matrix $A$ is the representation of this operator in the standard basis) and such that

$$\sigma(A^+) = \sigma(A) \cap \{z \in \mathbb{C} : \operatorname{Re} z > 0\}, \quad \sigma(A^-) = \sigma(A) \cap \{z \in \mathbb{C} : \operatorname{Re} z < 0\}$$

($A^+$, $A^-$ are the restrictions of $A$ to $X^+$, $X^-$, respectively). With the help of the Variation of Constants Formula it can be proved that for any $f \in BC(\mathbb{R}, \mathbb{R}^N)$ there is a unique solution $x$ of (4.2.6) in the space $BC(\mathbb{R}, \mathbb{R}^N)$, and this solution is given by the formula

$$x(t) = \int_{-\infty}^{t} e^{(t-s)A^-}P^-f(s)\,ds - \int_{t}^{+\infty} e^{(t-s)A^+}P^+f(s)\,ds.^{2} \tag{4.2.7}$$

If we are interested in bounded solutions only on $\mathbb{R}^+ := [0, \infty)$, a similar computation shows that all such solutions for $f \in BC(\mathbb{R}^+, \mathbb{R}^N)$ are given by

$$x(t) = e^{tA^-}x^- + \int_0^t e^{(t-s)A^-}P^-f(s)\,ds - \int_t^\infty e^{(t-s)A^+}P^+f(s)\,ds \tag{4.2.8}$$

where $x^-$ is an arbitrary point in $X^-$. Both formulae (4.2.7) and (4.2.8) may be used for finding bounded solutions to a semilinear equation

$$\dot x = Ax + f(x) \quad \text{where } f(o) = o,\ f'(o) = o,\ f \in C^1(U) \tag{4.2.9}$$

($U$ is a neighborhood of $o \in \mathbb{R}^N$). To do that we solve the corresponding nonlinear equations (4.2.7), (4.2.8) where $f(\cdot)$ is replaced by $g(x(\cdot))$ where $g$ is bounded and $g(y) = f(y)$ in a neighborhood of $o$. For details see Hale [63, Sections III.6 and IV.3]. A solution in (4.2.8) depends on the parameter $x^-$, so we have the equation

$$\Phi(\xi, \varphi)(t) := \varphi(t) - e^{tA^-}\xi - \int_0^t e^{(t-s)A^-}P^-g(\varphi(s))\,ds + \int_t^\infty e^{(t-s)A^+}P^+g(\varphi(s))\,ds = o$$

with $\Phi : X^- \times BC(\mathbb{R}^+, \mathbb{R}^N) \to BC(\mathbb{R}^+, \mathbb{R}^N)$ (check it; you have to use the estimates given in footnote 2). This formulation is suitable for the use of the Implicit Function Theorem. We have left details to the interested reader. The graph of the mapping

$$\kappa : \xi \in X^- \mapsto P^+\varphi(0, \xi)$$

is the so-called local stable manifold $W^s_{\mathrm{loc}}(x_0)$³ of the equation (4.2.9) ($\varphi(\cdot, \xi)$ is a solution of $\Phi(\xi, \varphi) = o$). It follows from the formula (4.2.2) that

$$\kappa'(o) = o,$$

i.e., $W^s_{\mathrm{loc}}(x_0)$ is tangent to the stable manifold $X^-$ of the linear equation $\dot x = Ax$, see Figure 4.2.2.

² The interested reader can check this formula and also (4.2.8) as an exercise on the use of the Variation of Constants Formula. Hint. Use the estimates $\|e^{tA}x\| \le ce^{-\alpha t}\|x\|$ for $x \in X^-$, $t > 0$, and $\|e^{tA}x\| \le ce^{\alpha t}\|x\|$ for $x \in X^+$, $t < 0$, where the positive constants $\alpha$, $c$ are independent of $t$ and $x$, $\alpha$ is such that $\sigma(A) \cap \{\lambda \in \mathbb{C} : |\operatorname{Re}\lambda| \le \alpha\} = \emptyset$ and $c$ depends on $\alpha$ only. These estimates follow from Functional Calculus (see Exercise 1.1.42) and they ensure that integrals in (4.2.7) do exist. Apply $P^+$ to both sides of the Variation of Constants Formula and send $t \to \infty$ to obtain

$$P^+x(t) = -\int_t^\infty e^{(t-s)A^+}P^+f(s)\,ds$$

provided $x$ is a bounded solution. Similarly $P^-x(t)$.
Remark 4.2.8. It is sometimes convenient to define a solution of nonlinear, in particular, partial differential equations, more generally, not assuming that a solution has all classical derivatives which appear in the equation (see Chapters 6 and 7). Actually, we have seen one such possibility in the reformulation of a differential equation as an integral equation $x = F(x)$ where $F$ is given by the formula (2.3.6). Having a more general notion of solution a natural question arises: Under which conditions is this solution smoother, in particular, is it a "classical" solution? Such results are known as regularity assertions. The Implicit Function Theorem can be occasionally used to prove such statements. See Theorem 6.1.14.

³ The so-called stable manifold $W^s(x_0)$ of the stationary point $x_0$ of the equation $\dot x = g(x)$ ($g(x_0) = o$) is defined as follows: Let $\psi(\cdot, \xi)$ be a solution of this differential equation satisfying the initial condition $\psi(0, \xi) = \xi$. Then the stable manifold is

$$W^s(x_0) = \left\{\xi : \lim_{t \to \infty} \psi(t, \xi) = x_0\right\}$$

and a local stable manifold $W^s_{\mathrm{loc}}(x_0)$ is defined by

$$\{\xi \in W^s(x_0) : \psi(t, \xi) \in U \text{ for } t \ge 0\}$$

where $U$ is a neighborhood of $x_0$. Notice the crucial assumption $\sigma(A) \cap i\mathbb{R} = \emptyset$ (i.e., $o$ is a so-called hyperbolic stationary point of the equation (4.2.9)) in the above argument. Figure 4.2.2 shows also the distinction between stable and local stable manifolds. It is worth mentioning that a similar approach cannot be used in the case $\sigma(A) \cap i\mathbb{R} \neq \emptyset$. Since there can exist eigenvalues on the imaginary axis of the multiplicity greater than 1, we cannot expect a manifold consisting of bounded solutions. To get the so-called central manifold we are forced to solve a nonlinear version of the equations (4.2.7) in a weighted space instead of $BC(\mathbb{R}, \mathbb{R}^N)$. However, this problem is more difficult due to the lack of differentiability of the Nemytski operator (see footnote 11 on page 126). For details see, e.g., Chow, Li & Wang [25, Chapter 1] and references given there.
Figure 4.2.2. (Labels in the figure: $W^s(o)$, $W^s_{\mathrm{loc}}(o)$, $X^+$, $X^-$, $\mathbb{R}^N$, $o$, $\xi$, $\kappa(\xi)$, $\varphi(o, \xi)$.)
Exercise 4.2.9. Let $f : \mathbb{R}^M \to \mathbb{R}^N$ and let $\Phi$ be a diffeomorphism defined on a neighborhood $U$ of the graph of $f$ onto $V \subset \mathbb{R}^{M+N}$. Write

$$\Phi^{-1}(\xi, \eta) = (\psi^1(\xi, \eta), \psi^2(\xi, \eta)) \quad \text{for } (\xi, \eta) \in V.$$

This means the graph of $f$ is isomorphic to

$$\Gamma = \{(\xi, \eta) \in \mathbb{R}^{M+N} : \psi^2(\xi, \eta) - f(\psi^1(\xi, \eta)) = o\}.$$

The Implicit Function Theorem yields conditions for $\Gamma$ to be the graph of a function $\eta = g(\xi)$.
(i) Formulate these conditions!
(ii) Express the derivative of $f$ in terms of the derivative of $g$.
Hint. $f'(\Phi^{-1}) = [(\psi^2)_2'\,g' + (\psi^2)_1'] \circ [(\psi^1)_1' + (\psi^1)_2'\,g']^{-1}$. Control question: Have you checked that the second term on the right-hand side is an isomorphism of $\mathbb{R}^M$ onto $\mathbb{R}^M$?
(iii) Without using the general result from (ii) transform the equation

$$\frac{dy}{dx} = f(x, y)$$

into polar coordinates!

Exercise 4.2.10. Let $M$ be a metric space and $f : M \times \mathbb{R} \to \mathbb{R}$ a continuous map. Let $c > 0$ be such that for all $x \in M$, $y_1, y_2 \in \mathbb{R}$, we have

$$(f(x, y_1) - f(x, y_2))(y_1 - y_2) \ge c|y_1 - y_2|^2.$$
Prove that for any $x \in M$ there exists a unique $y(x) \in \mathbb{R}$ such that $f(x, y(x)) = 0$ and, moreover, $y : x \mapsto y(x)$ is a continuous map from $M$ into $\mathbb{R}$.

Hint. Use the properties of real functions of one real variable.

Exercise 4.2.11. Let $M$ be a normed linear space, let $f$ be as in Exercise 4.2.10 and, moreover, $f \in C^k(M \times \mathbb{R})$ with some $k \in \mathbb{N}$. Then the implicit function $y = y(x)$ from Exercise 4.2.10 is of the class $C^k(M)$. Prove it!

Hint. Use Theorem 4.2.1.

Exercise 4.2.12. Give details which are omitted in Example 4.2.7.

Exercise 4.2.13. Let $A$ be a densely defined linear operator in a Hilbert space. Assume that $A$ has a compact self-adjoint resolvent. Extend the construction of the local stable manifold (Example 4.2.7) to the equation (4.2.6). See Exercise 3.1.19 for the properties of this equation.

Exercise 4.2.14. Assume that

$$f(x, y) = \sum_{j,k=0}^{\infty} a_{jk}(x - x_0)^j(y - y_0)^k, \quad |x - x_0| < \alpha, \quad |y - y_0| < \beta.$$

Moreover, let $a_{00} = 0$, $a_{01} \neq 0$. Apply the Implicit Function Theorem and show that the implicit function $y(x)$ is the sum of a power series in a neighborhood of $x_0$. Note that for complex variables the result follows directly from the properties of holomorphic functions and Theorem 4.2.1. In the real case one has to prove that the formal power series for $y(x)$ has a positive radius of convergence.
4.3 Local Structure of Differentiable Maps, Bifurcations

We now revert to the topic of Remark 4.1.6, i.e., to the case when the assumptions of the Local Inverse Function Theorem (Theorem 4.1.1) are violated. In particular, it was mentioned there that the assumptions of the Local Inverse Function Theorem are never satisfied for $f : \mathbb{R}^M \to \mathbb{R}^N$ provided $M \neq N$. In the first part we will study local behavior of such mappings. In the second part we stress the main idea of the Lyapunov–Schmidt Reduction and the approach to bifurcation phenomena (Crandall–Rabinowitz Bifurcation Theorem).

Definition 4.3.1. Let $f : X \to Y$ be a differentiable map in a neighborhood of a point $a \in X$. If $f'(a)$ is neither injective nor surjective, then $a$ is called a singular point of $f$.
The following proposition deals with the first non-singular case for the mapping $f : \mathbb{R}^M \to \mathbb{R}^N$, $M < N$. For the second one see Proposition 4.3.8.

Proposition 4.3.2. Let $f : \mathbb{R}^M \to \mathbb{R}^N$ be a differentiable map on an open set $G \subset \mathbb{R}^M$. Let $a \in G$ and let $f'(a)$ be injective. Let $Q$ be a (linear) projection of $\mathbb{R}^N$ onto $Y_1 := \operatorname{Im} f'(a)$. Then there exist neighborhoods $U$ of $a$, $V$ of $Qf(a)$ in $Y_1$, a diffeomorphism $\varphi$ of $U$ onto $V$ and a differentiable map $g : V \to \mathbb{R}^N$ such that

$$f = g \circ \varphi$$

(see Figure 4.3.1).

Figure 4.3.1. (Labels in the figure: $\mathbb{R}^N = Y_1 \oplus Y_2$, $Y_1 = \operatorname{Im} f'(a)$, $Y_2$, $f(a)$, $Qf(a)$, $V$, $o$, $a$, $U \subset G$, $G \subset \mathbb{R}^M$, and the maps $f$, $Q$, $g$, $\varphi$.)

Proof. The proof is almost obvious from Figure 4.3.1. Put $\varphi = Q \circ f$. Then $\varphi'(a) = Qf'(a)$ is an isomorphism of $\mathbb{R}^M$ onto $Y_1$. Since $\dim Y_1 = M$ is finite, $Y_1$ is a Banach space (as a closed subspace of the Banach space $\mathbb{R}^N$) and, by Theorem 4.1.1, $\varphi$ is a diffeomorphism of a neighborhood $U$ of $a$ onto a neighborhood (in $Y_1$) $V$ of $Qf(a)$. It suffices to put $g = f \circ \varphi^{-1}$.

Remark 4.3.3.
(i) We have used the finite dimension of $Y = \mathbb{R}^N$ to ensure both the existence of a continuous linear projection $Q$ and the closedness of the range $\operatorname{Im} f'(a)$. If $f : X \to Y$, $X$, $Y$ are Banach spaces, then neither of these two conditions has to be satisfied. It follows from the proof that Proposition 4.3.2 holds under these two additional assumptions. We notice that these assumptions are superfluous provided $X$ has a finite dimension (see Remark 2.1.19).
(ii) It is also easy to prove that
    Ψ(y) := g(Qy) − (I − Q)y − Qf(a)
is a diffeomorphism of a neighborhood W of b = f(a) onto a neighborhood W̃ of o in R^N. Indeed,
    Ψ(b) = o,   Ψ'(b)k = ϕ'(a)h − (I − Q)k,
where h ∈ R^M is such that Qk = ϕ'(a)h. Moreover, y ∈ f(G) ∩ W if and only if there is an x ∈ G such that
    y = f(x)   and   (I − Q)Ψ(f(x)) = o.
This means that there exists a local (nonlinear) transformation of coordinates in W (given by Ψ) such that f(G) ∩ W is expressed by z_{M+1} = ··· = z_N = 0 in these new coordinates.

(iii) An interpretation similar to (ii) follows:
    (I − Q)f = (I − Q)g(ϕ) = (I − Q)g(Qf) =: Φ(Qf).
This means that after a linear transformation of coordinates the last N − M components of f (i.e., (I − Q)f) depend (via Φ) on the first M components of f in a neighborhood of a. Compare this local nonlinear result to the linear one for the equation
    Ax = y,   A ∈ L(R^M, R^N).

Figure 4.3.2. Immersion
Figure 4.3.3. Injective immersion
(iv) A map f which satisfies the assumptions of Proposition 4.3.2 at each point a ∈ G is often called an immersion of G into R^N. An injective immersion which is also a homeomorphism of G onto f(G) (in the topology induced from R^N) is called an embedding. Some examples of immersions which are not embeddings are shown in Figures 4.3.2 and 4.3.3. We note that we have already used the term embedding for an injective continuous linear operator.

Further examination of Proposition 4.3.2 leads to the following definition of a differentiable manifold. This notion is basic for differential geometry and global nonlinear analysis. In this textbook we will mostly use it for purposes of terminology only. Some basic facts on manifolds are given in Appendix 4.3A and will be used for developing the notion of degree (Appendix 4.3D).

Definition 4.3.4. A differentiable manifold of dimension M and of the class C^k is a subset ℳ of R^N (N ≥ M) with the following property: For each x ∈ ℳ there is a neighborhood W of x (in R^N) and a C^k-diffeomorphism ψ of W into R^N such that
    ψ(ℳ ∩ W) = {y = (y_1, ..., y_N) ∈ R^N : y_{M+1} = ··· = y_N = 0} ∩ ψ(W).
A relative neighborhood W ∩ ℳ together with ψ is called a (local) chart at the point x ∈ ℳ. The first M coordinates (y_1, ..., y_M) are called the local coordinates of x on ℳ. The collection of all charts of ℳ is called an atlas of ℳ.

Example 4.3.5.
(i) An open subset G ⊂ R^M is an M-dimensional differentiable manifold of the class C^k for any k ∈ N (i.e., of the class C^∞).
(ii) The graph of a function f : R^M → R, f ∈ C^k(G), G an open subset of R^M, is an M-dimensional differentiable manifold of the class C^k in R^N, N ≥ M + 1.
(iii) Let S^2 = {(x, y, z) ∈ R^3 : x^2 + y^2 + z^2 = 1} be the 2-dimensional sphere. Then S^2 is a 2-dimensional differentiable manifold of the class C^∞ in R^N, N ≥ 3. Indeed, a chart for the upper open half-sphere can be constructed as follows: let
    ψ(x, y, z) = (x, y, z − √(1 − x^2 − y^2)),   W = {(x, y, z) ∈ R^3 : x^2 + y^2 < 1, z > 0}.
Then ψ is a diffeomorphism of W into R^3 and
    ψ(W ∩ S^2) = {(u, v, w) ∈ R^3 : u^2 + v^2 < 1, w = 0}.
We will see a more comfortable proof in Example 4.3.10.
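The chart of the upper half-sphere can be tried out numerically. The following is only a minimal sketch: the function names, the sampling radius 0.9 and the random seed are illustrative choices, not part of the text. It checks that ψ sends sampled points of the upper half-sphere to points with vanishing third coordinate and that the obvious inverse recovers them.

```python
import math
import random

def psi(x, y, z):
    """Chart of Example 4.3.5(iii): subtract the height of the upper half-sphere."""
    return (x, y, z - math.sqrt(1.0 - x * x - y * y))

def psi_inverse(u, v, w):
    """Inverse of psi on its image: add the half-sphere height back."""
    return (u, v, w + math.sqrt(1.0 - u * u - v * v))

def random_upper_half_sphere_point(rng):
    """Pick (x, y) in a disc of radius 0.9 and lift it to the sphere (z > 0)."""
    while True:
        x, y = rng.uniform(-0.9, 0.9), rng.uniform(-0.9, 0.9)
        if x * x + y * y < 0.81:
            return (x, y, math.sqrt(1.0 - x * x - y * y))

rng = random.Random(0)  # fixed seed, purely for reproducibility
points = [random_upper_half_sphere_point(rng) for _ in range(100)]

# Third chart coordinate on the sphere: should vanish.
max_third_coordinate = max(abs(psi(*p)[2]) for p in points)

# Round trip psi_inverse(psi(p)) == p, measured in the maximum norm.
max_roundtrip_error = max(
    max(abs(a - b) for a, b in zip(psi_inverse(*psi(*p)), p)) for p in points
)
print(max_third_coordinate, max_roundtrip_error)
```

Both printed numbers should be zero up to rounding, which is the statement ψ(W ∩ S^2) ⊂ {w = 0} for the sampled points.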
Definition 4.3.6. Let X, Y be Banach spaces, f : X → Y a differentiable map on a neighborhood of a point a ∈ X. If f'(a) is a surjective map onto Y, then the point a is called a regular point. If a is not a regular point, then it is called a critical point. A value b ∈ Y is called a critical value of f provided the set
    f^{-1}(b) := {x ∈ X : f(x) = b}
contains a critical point. In the other case, b is a regular value.

Remark 4.3.7. There is a difference between the notion of a singular point (Definition 4.3.1) and a critical point. For example, if f : R^M → R^N, M < N, then all points in R^M are critical (but some of them can be non-singular). The importance of the notion of a critical point will be more apparent in connection with the Sard Theorem (Theorem 5.2.3) and its applications.

Proposition 4.3.8. Let G be an open subset of R^M, f : G → R^N, f ∈ C^k(G). Let a ∈ G be a regular point of f. Then there are neighborhoods U of o ∈ R^M, V of a, and a diffeomorphism ϕ ∈ C^k of U onto V such that
    {x ∈ V : f(x) = f(a)} = ϕ(U ∩ Ker f'(a))
(see Figure 4.3.4).

Figure 4.3.4. The level set {x ∈ V : f(x) = f(a)} as the image under ϕ of U ∩ X1, where R^M = X1 ⊕ X2, X1 = Ker f'(a), and A : X2 → R^N is the restriction of f'(a).
Proof. By Remark 4.3.7, M ≥ N. If M = N, then Theorem 4.1.1 can be applied. Therefore, we assume that M > N. Denote by P a (linear continuous) projection of X = R^M onto X1 := Ker f'(a) and by X2 the complementary subspace given by X2 = Im (I − P). If A is the restriction of f'(a) to X2, then A is an isomorphism of X2 onto R^N (A is both injective and surjective). Denote by A^{-1} the inverse isomorphism of R^N onto X2 (A^{-1} is also called a right inverse of f'(a)). We can rewrite f in the following way:
    f(x) = f(a) + f'(a)[A^{-1}(f(x) − f(a)) + P(x − a)].
Let us denote
    ψ(x) = A^{-1}(f(x) − f(a)) + P(x − a).
A simple calculation shows that
    ψ'(a)h = A^{-1}f'(a)h + Ph = (I − P)h + Ph = h   for any h ∈ X.
Since ψ(a) = o, ψ is a diffeomorphism of a neighborhood V ⊂ G of a onto a neighborhood U of o (Theorem 4.1.1). Further, x ∈ {y ∈ V : f(y) = f(a)} if and only if x ∈ V and ψ(x) = P(x − a), i.e., ψ(x) ∈ U ∩ Ker f'(a).
The desired diffeomorphism ϕ is the inverse of ψ.
Remark 4.3.9. (i) Proposition 4.3.8 together with its proof also holds for f : X → Y, X, Y Banach spaces, provided there exists a linear continuous projection P of X onto Ker f'(a). The continuity of A^{-1} follows in this case from the Open Mapping Theorem (Theorem 2.1.8). The existence of such a projection P can be shown in two important cases, namely, when Y has finite dimension (and therefore Ker f'(a) has finite codimension – Example 2.1.12) or Ker f'(a) has finite dimension (Remark 2.1.19).

(ii) Notice that ϕ can be viewed as a local (nonlinear) transformation of coordinates in which f is a linear map, namely
    f(ϕ(y)) = f(a) + f'(a)y,   y ∈ U.
This formula also shows that all points in V are regular. Moreover, if z is sufficiently close to b = f(a), then
    y = A^{-1}(z − b) ∈ U   and   f(ϕ(y)) = z.
This shows that f(G) is an open set in R^N provided all points of G are regular.

(iii) In the terms of differentiable manifolds (Definition 4.3.4) the statement of Proposition 4.3.8 can be formulated as follows: If f : R^M → R^N is a differentiable map in an open set G ⊂ R^M, b ∈ R^N, then the set {x ∈ G : f(x) = b} is a differentiable manifold (either empty or of dimension M − N) provided b is a regular value of f.

(iv) Proposition 4.3.8 imposes certain restrictions on the set {x ∈ R^M : f(x) = f(a)}. In Figures 4.3.5–4.3.7 there are some cases in which a is not a regular point (i.e., it is a critical point). The value f(a) is critical in all cases.
Figures 4.3.5, 4.3.6 and 4.3.7. Three sets passing through a critical point a; one of them is labeled as a cusp.
Example 4.3.10. The sphere S^2 is a C^∞-differentiable manifold. To see this it is sufficient to use Remark 4.3.9(iii) for
    f(x, y, z) = x^2 + y^2 + z^2 − 1,   b = 0.
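The hypothesis of Remark 4.3.9(iii) used here is that b = 0 is a regular value. For a real-valued f this means that the gradient does not vanish at any point of the level set. A small sketch of this check follows; the spherical-angle grid and the function names are illustrative assumptions.

```python
import math

def grad_f(x, y, z):
    """Gradient of f(x, y, z) = x^2 + y^2 + z^2 - 1; it represents f'(x, y, z)."""
    return (2.0 * x, 2.0 * y, 2.0 * z)

def sphere_point(theta, phi):
    """Point of S^2 given by spherical angles (a hypothetical sampling helper)."""
    return (
        math.sin(theta) * math.cos(phi),
        math.sin(theta) * math.sin(phi),
        math.cos(theta),
    )

# f'(p) : R^3 -> R is surjective exactly when the gradient at p is non-zero.
min_grad_norm = min(
    math.sqrt(sum(c * c for c in grad_f(*sphere_point(0.1 * i, 0.2 * j))))
    for i in range(32)
    for j in range(32)
)
print(min_grad_norm)
```

On the sphere the gradient has norm 2, so every point of f^{-1}(0) is regular. The only critical point of f is the origin, where f = −1, so −1 is the only critical value.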
The assertions of the last two propositions are part of the following more general result.

Theorem 4.3.11 (Rank Theorem). Let f : R^M → R^N be a differentiable map on an open subset G ⊂ R^M and let the dimension of Im f'(x) be constant for x ∈ G (and equal to L ∈ N). Then for any a ∈ G there exist neighborhoods U of a, W of b = f(a), cubes C in R^M, D in R^N and diffeomorphisms Φ : C → U, Ψ : W → D such that the map F defined by F = Ψ ∘ f ∘ Φ has the form
    F(z_1, ..., z_M) = (z_1, ..., z_L, 0, ..., 0)   for all z = (z_1, ..., z_M) ∈ C
(see Figure 4.3.8).

Proof. Denote X2 = Ker f'(a), P a (linear) projection in R^M onto X2, X1 = Ker P and, similarly, Y1 = Im f'(a), Q a (linear) projection in R^N onto Y1, Y2 = Ker Q. Then the restriction A of f'(a) to X1 is an isomorphism of X1 onto Y1. Let A^{-1} be the inverse isomorphism, A^{-1} : Y1 → X1. By the proof of Proposition 4.3.8,
    α(x) = A^{-1}Q(f(x) − f(a)) + P(x − a)
is a diffeomorphism of a neighborhood U of a ∈ R^M onto a neighborhood Ũ of o ∈ R^M. Denote by ϕ the inverse to α. For h_1 ∈ X1 we have
    α'(x)h_1 = A^{-1}Qf'(x)h_1.
This implies that f'(x) is injective on X1 (α'(x) has this property). Since
    dim X1 = dim Im f'(x) = L,
the restriction of f'(x) to X1 is an isomorphism of X1 onto Im f'(x). We can express this fact in the commutative diagram (Figure 4.3.9).

Figure 4.3.8. The maps Φ, Ψ, F, T_C, T_D, ϕ, ψ and g̃ relating the cubes C ⊂ R^M = R^L × R^{M−L} and D ⊂ R^N = R^L × R^{N−L} to the neighborhoods U, Ũ in R^M = X1 ⊕ X2 and W, W̃ in R^N = Y1 ⊕ Y2, with X2 = Ker f'(a) and Y1 = Im f'(a).

Figure 4.3.9. Commutative diagram: α'(x) : X1 → X1 equals f'(x) : X1 → Im f'(x) followed by A^{-1}Q : Im f'(x) → X1 (an isomorphism).
Using the decomposition R^M = X1 ⊕ X2, we write
    u = u_1 + u_2,   u_i ∈ X_i, i = 1, 2,   for u ∈ Ũ,
and define
    g(u_1, u_2) = f(ϕ(u_1 + u_2)).
Now, we show that g actually depends on the first variable only. To see this we compute the derivative of g with respect to the second variable:
    g'_2(u_1, u_2)h_2 = f'(ϕ(u))ϕ'(u)h_2.
For k := ϕ'(u)h_2 and ϕ(u) = x we have
    h_2 = α'(x)k = A^{-1}Qf'(x)k + Pk.
Since h_2 and Pk belong to X2 while A^{-1}Qf'(x)k belongs to X1, this means that A^{-1}Qf'(x)k = o. Since A^{-1}Q is an isomorphism of Im f'(x) onto X1 (see Figure 4.3.9), we have f'(x)k = o, i.e.,
    g'_2(u_1, u_2)h_2 = o   for any h_2 ∈ X2.
The Mean Value Theorem (Theorem 3.2.7) implies that
    g(u_1, u_2) = g(u_1, o)   for (u_1, u_2), (u_1, o) ∈ Ũ.⁴
This result is shown in Figure 4.3.8 by shaded areas. Put g̃(u_1) := g(u_1, o). We employ Proposition 4.3.2, in particular Remark 4.3.3(ii), to complete the proof. Replacing f there by g̃, we obtain a diffeomorphism ψ of a neighborhood W of b = f(a) onto a neighborhood W̃ of o ∈ R^N such that
    (I − Q)ψ(f(U) ∩ W) = o
(see the right lower corner of Figure 4.3.8). We get cubes C and D by diffeomorphisms T_C, T_D in R^M, R^N, respectively, which transform non-Cartesian coordinates α in X1 ⊕ X2 or ψ in Y1 ⊕ Y2 into Cartesian coordinates in R^M = R^L × R^{M−L} (T_C(X1) = R^L), or in R^N = R^L × R^{N−L}, respectively (see the upper part of Figure 4.3.8).

Remark 4.3.12. The assertion of the Rank Theorem can be formulated in a slightly less informative way as follows: Under the hypotheses of Theorem 4.3.11, f(G) is a differentiable manifold of dimension L.

Definition 4.3.13. Functions f_1, ..., f_N : R^M → R are said to be independent in an open set G ⊂ R^M if any point x ∈ G is regular for f = (f_1, ..., f_N). In the other case, the functions are called dependent.

The following assertion explains the notions of dependent and independent functions. Suppose the assumptions of the Rank Theorem are satisfied for f = (f_1, ..., f_L, f_{L+1}, ..., f_N) : R^M → R^N where the functions f_1, ..., f_L are independent in a neighborhood of a point a ∈ R^M. Then there is a smooth function G : R^L → R^{N−L} such that
    (f_{L+1}(x), ..., f_N(x)) = G(f_1(x), ..., f_L(x))
for x in a certain neighborhood of a.

4 In fact, the use of Theorem 3.2.7 requires the segment joining (u_1, o) to (u_1, u_2) to lie in Ũ. Taking a smaller Ũ if necessary we can assume that Ũ is convex. Notice that we have got a similar result at the end of the proof of Proposition 4.3.8 where we have considered only one fiber, namely {x : f(x) = f(a)}.
To prove this assertion notice first that Im f'(a) is an L-dimensional subspace of R^N and can be identified with R^L × {0}. This means that
    Qf(x) = H_1(x) := (f_1(x), ..., f_L(x))
and, in the notation of the proof of Theorem 4.3.11,
    f(x) = g̃(u_1)   where   u_1 = A^{-1}(H_1(x) − H_1(a)).
In particular, f_{L+1}, ..., f_N are smooth functions of f_1, ..., f_L.

The notion of independent functions plays an important role also in the theory of ordinary differential equations. Indeed, let ẋ = v(x) be a system of M differential equations. A smooth non-constant function f : R^M → R is called the first integral of this system in an open set G ⊂ R^M if for any a ∈ G there is an interval I_a such that the solution ϕ(·, a) of the system with ϕ(0, a) = a satisfies
    ϕ(t, a) ∈ G   and   d/dt f(ϕ(t, a)) = 0   for t ∈ I_a.
It has been proved in the theory of ordinary differential equations that a system ẋ = v(x) (v : G ⊂ R^M → R^M is smooth) has M − 1 independent first integrals f_1, ..., f_{M−1} in a neighborhood U of any non-stationary point a ∈ G. A smooth function g : U → R is the first integral if and only if g, f_1, ..., f_{M−1} are dependent on U.

We remark that the knowledge of the first integrals reduces the original system. For example, if f_1, ..., f_{M−1} are independent first integrals in a neighborhood U of a non-stationary point, then the transformation of coordinates
    y_i = f_i(x),  i = 1, ..., M − 1,   y_M = x_M
leads to a new system
    ẏ_i = 0,  i = 1, ..., M − 1,   ẏ_M = w(y_M)   for a function w,
and after rescaling in time, to
    ẏ_i = 0,  i = 1, ..., M − 1,   ẏ_M = 1.
For another interpretation and a generalization of the notion of the first integral see Exercise 4.3.26 and the end of Appendix 4.3A.
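The definition of a first integral can be illustrated on a planar example. This is only a sketch under stated assumptions: the rotation field v(x) = (x_2, −x_1), the candidate f(x) = x_1^2 + x_2^2, the step size and the Runge-Kutta integrator are all choices made here for illustration, not taken from the text. By the chain rule d/dt f(ϕ(t, a)) = ∇f(ϕ(t, a)) · v(ϕ(t, a)), so both the pointwise product and the value of f along a computed trajectory are checked.

```python
def v(x):
    """Right-hand side of x' = v(x): a planar rotation field (illustrative)."""
    return (x[1], -x[0])

def f(x):
    """Candidate first integral: squared distance from the origin."""
    return x[0] ** 2 + x[1] ** 2

def grad_f(x):
    return (2.0 * x[0], 2.0 * x[1])

def derivative_along_flow(x):
    """Chain rule: d/dt f(phi(t, a)) at the point x equals grad f(x) . v(x)."""
    return sum(g * w for g, w in zip(grad_f(x), v(x)))

def rk4_step(x, dt):
    """One classical Runge-Kutta step for x' = v(x)."""
    def shifted(base, direction, scale):
        return tuple(b + scale * d for b, d in zip(base, direction))
    k1 = v(x)
    k2 = v(shifted(x, k1, dt / 2))
    k3 = v(shifted(x, k2, dt / 2))
    k4 = v(shifted(x, k3, dt))
    return tuple(
        xi + dt * (a + 2 * b + 2 * c + d) / 6
        for xi, a, b, c, d in zip(x, k1, k2, k3, k4)
    )

samples = [(0.3, -1.2), (2.0, 0.5), (-0.7, -0.1)]
pointwise = [derivative_along_flow(p) for p in samples]

x = (1.0, 0.5)
values = [f(x)]
for _ in range(1000):
    x = rk4_step(x, 0.01)
    values.append(f(x))
drift = max(values) - min(values)
print(pointwise, drift)
```

The pointwise derivatives vanish and f stays constant along the numerical trajectory up to the integration error. Here M = 2, so a single independent first integral is all that can be expected near a non-stationary point.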
Remark 4.3.14. A result similar to the Rank Theorem holds also for a differentiable map f : X → Y where X, Y are Banach spaces. The delicate question is the existence of continuous linear projections P of X (onto Ker f'(a)) and Q of Y (onto Im f'(a)). Such projections exist provided f'(a) is a Fredholm operator, i.e., Ker f'(a) has finite dimension and Im f'(a) is a closed subspace of finite codimension in Y (see page 70). Notice that the equation
    f(x) = y
can be solved by the following procedure which is often called the Lyapunov–Schmidt Reduction: The equation f(x) = y is equivalent to the pair of equations
    y_1 := Qy = Qf(x_1 + x_2),   y_2 := (I − Q)y = (I − Q)f(x_1 + x_2)
where
    x = x_1 + x_2,   x_2 = Px.
Suppose that the first equation may be solved⁵ for x_1 assuming x_2 to be fixed (looking at x_2 as a parameter). We obtain x_1 = g(y_1, x_2). The second equation is now an equation (it is called the bifurcation equation or the alternative problem) of the form
    (I − Q)f(x_2 + g(y_1, x_2)) = y_2   for an unknown x_2.
If f'(a) is a Fredholm map, then this equation is an equation in finite dimensional spaces: x_2 ∈ Ker f'(a), y_2 ∈ Y2, dim Ker f'(a) < ∞, dim Y2 = codim Im f'(a) < ∞. Notice that the Implicit Function Theorem ensures a unique local solution to the first equation for y sufficiently close to b = f(a). In this situation we also obtain g'_2(b_1, a_2) = o, i.e., the point a_2 is a critical point for
    F(x_2) := (I − Q)f(x_2 + g(b_1, x_2)) − b_2.
The simplest case for the local study of F is that
    codim Im f'(a) = 1,   i.e.,   F : X2 = Ker f'(a) → R
(see Example 4.3.20). Notice that dim X2 is finite for f'(a) being a Fredholm map.

5 E.g., by the Implicit Function Theorem (Theorem 4.2.1) in the vicinity of a known solution b = f(a) since f'(a) is an isomorphism of X1 onto Y1 or, more generally, by an iteration process.
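The two steps of the reduction can be followed on a toy map in the plane. The sketch below rests on assumptions made only for illustration: the map f(x_1, x_2) = (x_1 + x_2^2, x_1 x_2 − x_2^3), the point a = (0, 0), the right-hand side y = (0.02, 0) and all function names are hypothetical. Here f'(a) has kernel spanned by the second basis vector and range spanned by the first one, so Q picks the first component and I − Q the second.

```python
def f(x1, x2):
    """Toy map R^2 -> R^2 with f'(0, 0) = [[1, 0], [0, 0]] (illustrative)."""
    return (x1 + x2 ** 2, x1 * x2 - x2 ** 3)

def solve_first_equation(y1, x2, iterations=20):
    """Solve Q f(x1, x2) = y1 for x1 by Newton's method, x2 being a parameter."""
    x1 = 0.0
    for _ in range(iterations):
        residual = f(x1, x2)[0] - y1
        x1 -= residual / 1.0  # the x1-derivative of the first component is 1
    return x1

def bifurcation_function(x2, y1, y2):
    """(I - Q) f(g(y1, x2), x2) - y2: the left side of the bifurcation equation."""
    x1 = solve_first_equation(y1, x2)
    return f(x1, x2)[1] - y2

y1, y2 = 0.02, 0.0
# For this toy map g(y1, x2) = y1 - x2^2, so the bifurcation equation reads
# y1 * x2 - 2 * x2^3 = y2, with roots x2 = 0 and x2 = +-0.1 for the chosen y.
candidates = [-0.1, 0.0, 0.1]
residuals = [bifurcation_function(x2, y1, y2) for x2 in candidates]
solutions = [(solve_first_equation(y1, x2), x2) for x2 in candidates]
print(residuals, solutions)
```

Each pair in `solutions` solves the full equation f(x) = y, although only the scalar bifurcation equation had to be examined for several roots. The first equation has exactly one solution x_1 for every value of the parameter x_2.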
Example 4.3.15. As an application we will investigate the existence of a solution of the following boundary value problem for a system of ordinary differential equations
    ẋ(t) = f(t, x(t)),  t ∈ (0, 1),
    x(0) = x(1).                                          (4.3.1)
We suppose (see Theorem 2.3.4) that f together with its partial derivatives with respect to the variables x = (x_1, ..., x_N) is continuous on [0, 1] × R^N. We know that any solution starting at t = 0 satisfies the integral equation
    x(t) − x(0) = ∫_0^t f(s, x(s)) ds
for all t from the interval of its existence. This means that x satisfies the boundary value problem (4.3.1) if and only if
    G(x_0) := ∫_0^1 f(s, x(s, x_0)) ds = o.
Here x(·, x_0) denotes a (unique) solution of ẋ(t) = f(t, x(t)) such that x(0, x_0) = x_0. The problem of solving the equation
    G(x_0) = o   for G : R^N → R^N
is a nontrivial topological task which we will deal with in Chapter 5. Notice that we cannot use the Implicit Function Theorem directly since there is no parameter in (4.3.1). Therefore we modify the problem by adding a multiplicative parameter ε to (4.3.1), i.e., we investigate the problem
    ẋ(t) = εf(t, x(t)),  t ∈ (0, 1),
    x(0) = x(1).                                          (4.3.2)
Notice that for ε = 0 any N-dimensional constant a solves (4.3.2). To be able to use the abstract approach described above we rewrite (4.3.2) in an operator form. To do this we define Banach spaces
    X = {x ∈ C([0, 1], R^N) : x(0) = x(1)},   Y = {y ∈ C([0, 1], R^N) : y(0) = o}
and operators L, N : X → Y:
    Lx : t ↦ x(t) − x(0),   N(x) : t ↦ ∫_0^t f(s, x(s)) ds,   t ∈ [0, 1].
Then the system (4.3.2) is equivalent to the operator equation
    G(x, ε) := Lx − εN(x) = o.                            (4.3.3)
The operator L is linear and continuous, therefore differentiable:
    L'(x)h = Lh   for h ∈ X.
The operator N is also continuously differentiable and
    N'(x)h : t ↦ ∫_0^t f'_2(s, x(s))h(s) ds,   t ∈ [0, 1], h ∈ X.
Check this expression yourself, see also Example 3.2.21. This means that G'_1(a, 0)h = Lh is not injective and X2 := Ker L consists of N-dimensional constant functions. Moreover,
    Y1 := Im L = {y ∈ Y : y(1) = y(0) = o}.
There are continuous linear projections P, Q onto closed subspaces X2 and Y1, respectively, given by
    Px : t ↦ x(0),   Qy : t ↦ y(t) − ty(1).
Having the decompositions
    X = X1 ⊕ X2,   Y = Y1 ⊕ Y2,
we can use the Lyapunov–Schmidt Reduction, i.e.,
    x = x_1 + a,   x_1 ∈ X1,   a ∈ X2,
solves (4.3.3) if and only if it solves the pair of equations
    G^1(x_1, a, ε) := Lx_1 − εQN(x_1 + a) = o,            (4.3.4)
    G^2(x_1, a, ε) := (I − Q)N(x_1 + a) = o.              (4.3.5)
Since G^1(o, a, 0) = o and
    (G^1)'_1(o, a, 0)h = Lh
is an isomorphism of X1 onto Y1 (it is both injective and surjective), the inverse is continuous by the Open Mapping Theorem (Theorem 2.1.8). The Implicit Function Theorem yields a solution x_1 := ϕ(b, ε) of (4.3.4) in a neighborhood of (a, 0) for a given a ∈ X2. We also have
    ϕ(a, 0) = o   and   ϕ'_1(a, 0) = o
(check it again). This means that it is sufficient to solve
    H(b, ε) := (I − Q)N(ϕ(b, ε) + b) = o
with respect to b. Since dim X2 = dim Y2 = N < ∞ and H : X2 × R → Y2 we can try to use the Implicit Function Theorem once more. To this end we need an ã ∈ X2 for which
    (I − Q)N(ã) := t ∫_0^1 f(s, ã) ds = o,   i.e.,   ∫_0^1 f(s, ã) ds = o,
and the equation
    (I − Q)N'(ã)d := t (∫_0^1 f'_2(s, ã) ds) d = tc
has a unique solution for every c ∈ R^N. The last requirement means that the N × N matrix ∫_0^1 f'_2(s, ã) ds has to be regular.
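The two requirements just derived can be tested on a concrete scalar problem. Everything specific in the following sketch is an assumption made for illustration: the case N = 1, the nonlinearity f(t, x) = 1 + cos 2πt − x^2, the constant ã = 1, the value ε = 0.1, and the numerical tools (midpoint rule, classical Runge-Kutta, bisection), none of which come from the text. The code checks both conditions and then looks for an initial value x_0 with x(1) = x(0) for (4.3.2).

```python
import math

def f(t, x):
    """Illustrative scalar nonlinearity (N = 1)."""
    return 1.0 + math.cos(2.0 * math.pi * t) - x * x

def df_dx(t, x):
    """Partial derivative of f with respect to x."""
    return -2.0 * x

def integrate_01(func, n=1000):
    """Composite midpoint rule on [0, 1]."""
    h = 1.0 / n
    return h * sum(func((k + 0.5) * h) for k in range(n))

def flow_at_one(x0, eps, steps=200):
    """Classical RK4 for x' = eps * f(t, x) on [0, 1]; returns x(1)."""
    h = 1.0 / steps
    t, x = 0.0, x0
    for _ in range(steps):
        k1 = eps * f(t, x)
        k2 = eps * f(t + h / 2, x + h * k1 / 2)
        k3 = eps * f(t + h / 2, x + h * k2 / 2)
        k4 = eps * f(t + h, x + h * k3)
        x += h * (k1 + 2 * k2 + 2 * k3 + k4) / 6
        t += h
    return x

a_tilde = 1.0
mean_f = integrate_01(lambda s: f(s, a_tilde))          # should vanish
mean_dfdx = integrate_01(lambda s: df_dx(s, a_tilde))   # should be non-zero

def periodic_initial_value(eps, lo=0.5, hi=1.5, iterations=60):
    """Bisection on x0 -> x(1; x0) - x0, which changes sign on [lo, hi] here."""
    def gap(x0):
        return flow_at_one(x0, eps) - x0
    for _ in range(iterations):
        mid = (lo + hi) / 2
        if gap(lo) * gap(mid) <= 0:
            hi = mid
        else:
            lo = mid
    return (lo + hi) / 2

eps = 0.1
x0 = periodic_initial_value(eps)
print(mean_f, mean_dfdx, x0, flow_at_one(x0, eps) - x0)
```

For this choice the first integral vanishes and the second equals −2, and the computed x_0 stays close to ã = 1, as expected for small ε.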
To summarize the considerations of the previous example, we get the following conclusion.

Proposition 4.3.16. Let f = (f^1, ..., f^N) : [0, 1] × R^N → R^N be continuous and have continuous partial derivatives ∂f^i/∂x_j (i, j = 1, ..., N). Let the function f satisfy the conditions
    ∫_0^1 f(s, ã) ds = o,   det(∫_0^1 (∂f^i/∂x_j)(s, ã) ds) ≠ 0
for a certain constant ã ∈ R^N. Then there exist δ > 0 and a differentiable map ε ↦ x(·, ε), |ε| < δ, such that x(·, 0) = ã and the functions x(·, ε) satisfy the boundary value problem (4.3.2).

Remark 4.3.17. Let us make some remarks on this result. If the function f in (4.3.1) is 1-periodic in the variable t, then x is a solution of (4.3.1) if and only if
    x̃(t) = x(t − n),   n = [t], t ∈ R,
is a 1-periodic solution of ẋ = f(t, x). Only technical difficulties appear when one generalizes the just described approach to a more general equation ẋ(t) = A(t)x + εf(t, x) with more general boundary conditions Bx(0) − Cx(1) = o (B, C are N × N matrices). Notice also that having a result for a system of differential equations we can investigate boundary value problems for second order equations. For example, we put
    x = (y, ẏ)^T,   B = [b_1, b_2; 0, 0],   C = [0, 0; c_1, c_2],   f(t, x) = (0, g(t, y, ẏ))^T
(matrices are written row by row, with rows separated by semicolons) to rewrite
    ÿ(t) = a(t)y(t) + εg(t, y(t), ẏ(t)),   t ∈ (0, 1),
    b_1 y(0) + b_2 ẏ(0) = 0,   c_1 y(1) + c_2 ẏ(1) = 0
into the form
    ẋ(t) = [0, 1; a(t), 0] x(t) + εf(t, x(t)),   t ∈ (0, 1),
    Bx(0) + Cx(1) = o.
Many other examples of the use of the Implicit Function Theorem can be found in Vejvoda et al. [130]. We will return to the problem (4.3.1) in Example 5.2.18.

We now turn to the study of the behavior of a differentiable function in the vicinity of a critical point. We recommend that the reader considers the cases
    f(x) = x^n,  n > 1,   and   f(x) = Σ_{i,j=1}^n a_ij x_i x_j,  a_ij = a_ji,
first.

Definition 4.3.18. Let G be an open set in a Banach space X, f : X → R, f ∈ C^2(G). A critical point a ∈ G of f is said to be non-degenerate if for any h ∈ X, h ≠ o, the linear form f''(a)(h, ·) does not vanish.

The following basic result holds also in a Hilbert space but its finite dimensional version is more transparent.

Theorem 4.3.19 (Morse). Let G be an open set in R^M, f : R^M → R, f ∈ C^2(G). Let a ∈ G be a non-degenerate critical point of f. Then there exists a diffeomorphism ϕ of a neighborhood U of a onto a neighborhood V of o ∈ R^M such that for x ∈ U, y = ϕ(x), the function f can be expressed in the form
    f(x) = f(a) + (1/2) Σ_{i=1}^M λ_i y_i^2
where λ_1, ..., λ_M are the eigenvalues of the symmetric matrix f''(a).

Proof. We identify a bilinear operator with its matrix representation in the standard basis in R^M (Remark 3.2.29(ii)) and denote the collection of all M × M matrices by ℳ. Then we can write
    B(x)(x − a, x − a) := (B(x)(x − a), x − a)_{R^M}.
We choose a norm ‖·‖_{M×M} on ℳ and keep it fixed throughout the proof. The subset of ℳ consisting of symmetric matrices is denoted by S. We also denote by F and F_S the sets of all bounded continuous maps of G into ℳ and S, respectively. The space F equipped with the norm
    ‖A‖_F := sup_{x∈G} ‖A(x)‖_{M×M}
is a Banach space and F_S is its closed subspace. Without loss of generality we can assume that G is a convex neighborhood of the point a so small that f'' is bounded on G. After these preliminaries we start with the proof. Since f'(a) = o, the Taylor Formula (Proposition 3.2.27) gives
    f(x) = f(a) + ∫_0^1 (1 − t) f''(a + t(x − a))(x − a, x − a) dt = f(a) + B(x)(x − a, x − a)
with
    B(x) := ∫_0^1 (1 − t) f''(a + t(x − a)) dt
(the Riemann integral of a function with values in R^{M×M}). Note that we have B(·) ∈ F_S. Our aim is to show that we can choose C(·) ∈ F such that
    B(x) = C*(x)JC(x)
where J is the canonical form of B(a) = (1/2) f''(a), i.e.,
    J = (1/2) diag(λ_1, ..., λ_M).⁶
Here C* stands for the adjoint matrix to C, i.e., C* = (c_ji) provided C = (c_ij). The transformation of coordinates y = C(x)(x − a) then yields
    f(x) = f(a) + (J(y), y)_{R^M} = f(a) + (1/2) Σ_{i=1}^M λ_i y_i^2.
To achieve this goal we will use the Implicit Function Theorem (Theorem 4.2.1). We put
    Φ(B, C) = C*(x)JC(x) − B(x) : F_S × F → F_S.
In particular,
    Φ(B(a), T) = T*JT − B(a) = o,
provided T is a unitary matrix which transforms B(a) into its canonical form J. Put A := JT. The partial differential of Φ with respect to the second variable has the form
    Φ'_2(B, C)M : x ↦ M*(x)JC(x) + C*(x)JM(x),   x ∈ G.
Then
    Ker Φ'_2(B(a), T) = {M ∈ F : M*(·)A + A*M(·) = o}
and
    Q : M ↦ (1/2)(M − (A*)^{-1}M*A)
is a continuous linear projection of F onto Ker Φ'_2(B(a), T). By the assumption on the point a, J is injective. Further, T is a unitary matrix, i.e., T* = T^{-1}. This means that (A*)^{-1} = J^{-1}T exists. It can be seen that I − Q is a projection onto F_1 := (A*)^{-1}(F_S). The partial differential Φ'_2(B(a), T) is an isomorphism of F_1 onto F_S. Namely,
    M := (1/2) J^{-1}TS ∈ F_1   and   Φ'_2(B(a), T)M = S ∈ F_S.
We can now apply the Implicit Function Theorem to Φ : F_S × F_1 → F_S (T ∈ F_1) and obtain positive numbers ε and δ such that for any B ∈ F_S, ‖B(·) − B(a)‖_F < ε, there is a unique C ∈ F_1, ‖C(·) − T‖_F < δ, for which
    Φ(B, C) = C*(x)JC(x) − B(x) = o   for all x ∈ G.
To finish the proof we have to show that there is a neighborhood U of a such that
    ‖B(x) − B(a)‖_F < ε   for all x ∈ U.
By the definition of B,
    ‖B(·) − B(a)‖_F = sup_{x∈G} ‖∫_0^1 (1 − t)[f''(a + t(x − a)) − f''(a)] dt‖_{M×M} ≤ (1/2) sup_{x∈G} ‖f''(x) − f''(a)‖_{M×M}.
This means that we can find the desired neighborhood U.

6 A symmetric matrix has a diagonal canonical form – see Proposition 6.3.8.
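The normal form of the Morse Theorem can be checked by hand in the simplest saddle case. The sketch below assumes, for illustration only, the quadratic function f(x_1, x_2) = x_1 x_2 with critical point a = o. Its second derivative is the matrix with rows (0, 1) and (1, 0), whose eigenvalues are 1 and −1 with normalized eigenvectors (1, 1)/√2 and (1, −1)/√2. Since f is quadratic, B(x) is constant and the matrix C(x) of the proof can be taken constant, so ϕ is linear here; for a general f it is not.

```python
import math
import random

def f(x1, x2):
    """Illustrative function with a non-degenerate critical point at the origin."""
    return x1 * x2

LAMBDAS = (1.0, -1.0)  # eigenvalues of f''(o) = [[0, 1], [1, 0]]

def phi(x1, x2):
    """Coordinates of x in the orthonormal eigenvector basis of f''(o)."""
    s = math.sqrt(2.0)
    return ((x1 + x2) / s, (x1 - x2) / s)

def morse_normal_form(y):
    """f(a) + (1/2) * sum(lambda_i * y_i^2) with f(a) = 0."""
    return 0.5 * sum(lam * yi * yi for lam, yi in zip(LAMBDAS, y))

rng = random.Random(1)  # fixed seed, purely for reproducibility
test_points = [(rng.uniform(-1, 1), rng.uniform(-1, 1)) for _ in range(200)]
max_error = max(
    abs(f(x1, x2) - morse_normal_form(phi(x1, x2))) for x1, x2 in test_points
)
print(max_error)
```

The printed error is zero up to rounding, i.e., f(x) = (1/2)(y_1^2 − y_2^2) in the new coordinates.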
Example 4.3.20. Let X, Y be Banach spaces and let f : X → Y. Consider the equation
    f(x) = o                                              (4.3.6)
in the vicinity of a known solution x = a. Let f be a C^2-mapping in a neighborhood of a. Suppose that f'(a) is a Fredholm operator (Remark 4.3.14) and, moreover, that the above equation can be reduced to the bifurcation equation
    (I − Q)f(g(x_2) + x_2) = o.
Here Q is a projection of Y onto Im f'(a), X = X1 ⊕ Ker f'(a) and g(x_2) is a (unique) solution of
    Qf(x_1 + x_2) = o   for x_2 ∈ Ker f'(a),
and x_2 is in a neighborhood of a_2 ∈ Ker f'(a) (a = a_1 + a_2). We also assume that this g is given by the Implicit Function Theorem. In particular, this means that g'(a_2) = o. Suppose now that
    codim Im f'(a) = 1,
i.e., I − Q is a projection onto a 1-dimensional subspace Y2 of Y. Let Y2 = Lin{y_2}. By Corollary 2.1.18 and Remark 2.1.19, there is ϕ ∈ Y*, ϕ ∈ [Im f'(a)]^⊥, and we may assume that ϕ(y_2) = 1. In other words,
    (I − Q)y = ϕ(y)y_2,
and the bifurcation equation has the form
    F(x_2) := ϕ(f(g(x_2) + x_2)) = 0.
We have
    F'(a_2)h = ϕ[f'(a)(g'(a_2)h + h)] = 0,   h ∈ Ker f'(a),
i.e., a_2 is a critical point of F. Further,
    F''(a_2)(h, k) = ϕ[f''(a)(g'(a_2)h + h, g'(a_2)k + k)] + ϕ[f'(a)(g''(a_2)(h, k))] = ϕ[f''(a)(h, k)]
since
    ϕ ∘ f'(a) = 0   and   g'(a_2) = o.
If, for example,
    dim Ker f'(a) = 2
(this can occur for f : R^{N+1} → R^N) and the matrix of F''(a_2) is regular, i.e., a_2 is a non-degenerate critical point of F, then after a suitable transformation of coordinates we get
    F(x_2) = (1/2)(λ_1 ξ^2 + λ_2 η^2)
(the Morse Theorem) and the following conclusion: If sgn λ_1 = sgn λ_2, then the equation (4.3.6) has an isolated solution x = a; if sgn λ_1 = − sgn λ_2, then there are two curves of solutions given by
    ξ = ± √(−λ_2/λ_1) η.
The previous example can be generalized. The following problem is a standard one in the bifurcation theory: A differentiable map f : R × X → Y is given where X, Y are Banach spaces.⁷ A smooth curve x = α(λ), λ ∈ (−δ, δ), of solutions of the equation
    f(λ, x) = o                                           (4.3.7)
is known. After the transformation ξ = x − α(λ), we can suppose that
    f(λ, o) = o                                           (4.3.8)
for λ in a neighborhood of (e.g.) 0 ∈ R.

Definition 4.3.21. Let (4.3.8) be satisfied for the equation (4.3.7). The point (0, o) ∈ R × X is called a bifurcation point provided in any neighborhood of (0, o) there is a solution (λ_0, x_0) of (4.3.7) such that x_0 ≠ o.

Notice that whenever f is differentiable in a neighborhood U of (0, o) and f'_2(0, o) is an isomorphism, then (0, o) is not a bifurcation point (the Implicit Function Theorem). In order to find a sufficient condition for bifurcation suppose that f ∈ C^2(U) and A = f'_2(0, o) is not an isomorphism. More precisely, let Ker A be nontrivial, i.e., let 0 be an eigenvalue of A. The simplest case occurs when 0 is a simple eigenvalue, i.e.,
    Ker A = Lin{x_0},   x_0 ≠ o.
The following result is a classical one (see Crandall & Rabinowitz [29]).

Theorem 4.3.22 (Local Bifurcation Theorem). Let X, Y be Banach spaces, f : R × X → Y a twice continuously differentiable map on a neighborhood of (0, o). Let f satisfy the assumptions
(i) f(λ, o) = o for all λ ∈ (−δ, δ) for some δ > 0,
(ii) dim Ker f'_2(0, o) = codim Im f'_2(0, o) = 1,
(iii) if f'_2(0, o)x_0 = o, x_0 ≠ o, then f''_{1,2}(0, o)(1, x_0) ∉ Im f'_2(0, o).
Denote by X1 the topological complement⁸ of Ker f'_2(0, o) in X. Then there is a C^1-curve (ϕ, ψ) : (−η, η) → R × X1 (for some η > 0) such that
    ϕ(0) = 0,   ψ(0) = o,   f(ϕ(t), t(x_0 + ψ(t))) = o.
Moreover, there is a neighborhood U of (0, o) in R × X such that
    f(λ, x) = o   for (λ, x) ∈ U
if and only if either x = o or
    λ = ϕ(t),   x = t(x_0 + ψ(t))   for a certain t
– see Figure 4.3.10. Such a picture is called a bifurcation diagram.

Figure 4.3.10. Bifurcation diagram in R × X: the trivial branch x = o and the curve (ϕ(t), t(x_0 + ψ(t))) crossing at (0, o) inside the neighborhood U.

7 The set of parameters R can be replaced by a normed linear space in general.
Proof. We will give two proofs. The first one for a finite dimensional case when X = Y = R^M is based on the Morse Theorem. The second one which is due to M. Crandall and P. Rabinowitz is based on the Implicit Function Theorem and will be only sketched.

The first proof. We choose ω ∈ Y* = R^M, ω ≠ o, such that
    y ∈ Im f'_2(0, o)   if and only if   ω(y) = 0.
Using the Lyapunov–Schmidt Reduction (Remark 4.3.14) we obtain a map g(λ, t) : R^2 → X1 such that the equation f(λ, x) = o is locally equivalent to the equation
    F(λ, t) := ω[f(λ, tx_0 + g(λ, t))] = 0.
We now show that (0, 0) ∈ R^2 is a non-degenerate critical point of F. To do this we need to compute F'(0, 0) and F''(0, 0). Since
    f'_1(λ, o) = o,   f''_{1,1}(λ, o) = o   for λ ∈ (−δ, δ)
(assumption (i)) and
    g(λ, 0) = o,   g'_2(0, 0) = o
(see Remark 4.3.14), we have
    F'(0, 0) = 0   and also   F''_{1,1}(0, 0) = 0.
Further,
    F'_2(λ, 0) = ω[f'_2(λ, o)(x_0 + g'_2(λ, 0))].
Therefore, by (iii),
    β := F''_{1,2}(0, 0) = F''_{2,1}(0, 0) = ω[f''_{1,2}(0, o)(1, x_0) + f'_2(0, o)g''_{1,2}(0, 0)] = ω[f''_{1,2}(0, o)(1, x_0)] ≠ 0
since ω(z) = 0 for every z ∈ Im f'_2(0, o). If we denote α := F''_{2,2}(0, 0), we obtain the matrix representation of F''(0, 0) in the form
    [0, β; β, α]
(first row 0, β; second row β, α). This matrix has eigenvalues of different signs. The rest of the proof follows by applying the Morse Theorem (see also Example 4.3.20).

The second proof proceeds by using the Implicit Function Theorem for the function Φ : R × R × X1 → Y defined by
    Φ(λ, t, x_1) = (1/t) f(λ, t(x_0 + x_1))   for t ≠ 0,
    Φ(λ, t, x_1) = f'_2(λ, o)(x_0 + x_1)      for t = 0.

8 Let X = X1 ⊕ X2 and let the corresponding projection P of X onto X1 be continuous. Then X1 is called a topological complement of X2 and vice versa.
Notice that Φ(0, 0, o) = o, and
    (λ, h) ↦ Φ'_1(0, 0, o)λ + Φ'_3(0, 0, o)h
is an isomorphism of R × X1 onto Y (assumptions (ii) and (iii)). For details see Crandall & Rabinowitz [29].

Example 4.3.23. The following two functions offer very simple illustrative examples:
    f(λ, x) = λx − x^2,   g(λ, x) = λx − x^3.
Their bifurcation diagrams are shown in Figures 4.3.11 and 4.3.12. We use these functions to point out typical examples of a change of stability for a differential equation when a so-called non-hyperbolic stationary point is crossed.⁹ In these figures branches of stationary solutions of the equations
    ẋ = f(λ, x),   ẋ = g(λ, x)
are shown with an indication of their stability (s for stable, u for unstable).

9 A stationary point a ∈ R^M is called hyperbolic for the equation ẋ = f(x) provided f(a) = o and σ(f'(a)) ∩ iR = ∅. See also footnote 3 on page 154.
Figure 4.3.11. Transcritical bifurcation (branches of stationary solutions in the (λ, x)-plane crossing at (0, 0), marked s and u).
Figure 4.3.12. Pitchfork bifurcation (branches of stationary solutions in the (λ, x)-plane meeting at (0, 0), marked s and u).
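The stability labels in both diagrams can be recomputed from the sign of the derivative with respect to x at each stationary point: negative means stable, positive means unstable. The following sketch does this for λ = −1 and λ = 1; the helper names and the chosen parameter values are illustrative assumptions. In this one-dimensional setting with x_0 = 1, assumption (iii) of Theorem 4.3.22 reduces to a non-zero mixed derivative with respect to λ and x at (0, 0), which equals 1 for both functions.

```python
import math

def f(lam, x):
    """Transcritical example of Example 4.3.23."""
    return lam * x - x ** 2

def g(lam, x):
    """Pitchfork example of Example 4.3.23."""
    return lam * x - x ** 3

def df_dx(lam, x):
    return lam - 2.0 * x

def dg_dx(lam, x):
    return lam - 3.0 * x ** 2

def stationary_points_f(lam):
    """Zeros of f(lam, .): x = 0 and x = lam."""
    return [0.0, lam] if lam != 0 else [0.0]

def stationary_points_g(lam):
    """Zeros of g(lam, .): x = 0 and, for lam > 0, x = +-sqrt(lam)."""
    if lam > 0:
        r = math.sqrt(lam)
        return [-r, 0.0, r]
    return [0.0]

def stability(derivative_value):
    """Label a hyperbolic stationary point by the sign of the linearization."""
    if derivative_value < 0:
        return "s"
    if derivative_value > 0:
        return "u"
    return "non-hyperbolic"

def classify(points, derivative, lam):
    return [(x, stability(derivative(lam, x))) for x in points]

transcritical = {
    lam: classify(stationary_points_f(lam), df_dx, lam) for lam in (-1.0, 1.0)
}
pitchfork = {
    lam: classify(stationary_points_g(lam), dg_dx, lam) for lam in (-1.0, 1.0)
}
print(transcritical)
print(pitchfork)
```

In both cases the trivial branch x = o is stable for λ < 0 and unstable for λ > 0, and the nontrivial branch carries the opposite label on each side of λ = 0 where it exists.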
Example 4.3.24. We wish to find a nontrivial 2π-periodic solution of the nonlinear pendulum equation
    ẍ(t) + λ sin x(t) = 0.                                (4.3.9)
We put
    f(λ, x) : t ↦ ẍ(t) + λ sin x(t),
and
    X = {x ∈ C^2(R) : x is 2π-periodic},   ‖x‖_X = max_{t∈[0,2π]} |x(t)| + max_{t∈[0,2π]} |ẋ(t)| + max_{t∈[0,2π]} |ẍ(t)|,
    Y = {y ∈ C(R) : y is 2π-periodic},   ‖y‖_Y = max_{t∈[0,2π]} |y(t)|.
It is easy to show that
    f'_2(λ, o)h = ḧ(t) + λh
and, therefore, Ker f'_2(λ, o) is nontrivial if and only if λ = n^2, n ∈ N ∪ {0}, and
    Ker f'_2(0, o) = {constant functions},
    Ker f'_2(n^2, o) = Lin{sin nt, cos nt}   for n ∈ N.
In the former case, i.e., n = 0, we can apply Theorem 4.3.22. Since
    Im f'_2(0, o) = {y ∈ Y : ∫_0^{2π} y(s) ds = 0}
and
    f''_{1,2}(0, o)(1, c) = c   for c ∈ Ker f'_2(0, o),
the assumptions of Theorem 4.3.22 are satisfied. What can we do in the latter case when n ∈ N and the dimension of Ker f'_2(n^2, o) is equal to 2? In spite of the fact that Theorem 4.3.22 cannot be used we still may proceed with the Lyapunov–Schmidt Reduction: Denote A := f'_2(n^2, o), Y = Im A ⊕ Z,
    (I − Q)y : t ↦ (sin nt / π) ∫_0^{2π} y(s) sin ns ds + (cos nt / π) ∫_0^{2π} y(s) cos ns ds,   y ∈ Y.
Chapter 4. Local Properties of Differentiable Mappings
Then $I - Q$ is the projection onto $Z$ such that $\operatorname{Ker}(I - Q) = \operatorname{Im} A$. Similarly, let
\[ X = \operatorname{Ker} A \oplus V \quad \text{where} \quad V = \{v \in X : (I - Q)v = o\}. \]
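
The action of the projection $I - Q$ can be checked numerically. The sketch below is an illustration only (the names `project` and `apply_A`, the sample count and the test functions are assumptions, not part of the text): it computes the two coefficients of $(I - Q)y$ with respect to $\sin nt$ and $\cos nt$ by the rectangle rule on a periodic grid, and confirms on one example that functions of the form $Ax = \ddot{x} + n^2 x$ are annihilated.

```python
import math


def project(y, n, samples=2048):
    """Coefficients (a, b) such that ((I - Q)y)(t) = a*sin(n t) + b*cos(n t)."""
    h = 2 * math.pi / samples
    grid = [k * h for k in range(samples)]
    # Rectangle rule on a periodic grid; exact (up to rounding) for
    # trigonometric polynomials of degree below the sample count.
    a = sum(y(s) * math.sin(n * s) for s in grid) * h / math.pi
    b = sum(y(s) * math.cos(n * s) for s in grid) * h / math.pi
    return a, b


def apply_A(x, x_second, n):
    """Return t -> x''(t) + n^2 x(t); the second derivative is supplied by the caller."""
    return lambda t: x_second(t) + n * n * x(t)


# Example with n = 2: only the sin(2t) and cos(2t) components survive.
y = lambda t: 3 * math.sin(2 * t) - math.cos(2 * t) + 1 + math.sin(5 * t)
coeffs = project(y, 2)

# An element of Im A (here A applied to x(t) = sin(3t)) is mapped to zero.
image = apply_A(lambda t: math.sin(3 * t), lambda t: -9 * math.sin(3 * t), 2)
zero = project(image, 2)
```

Here `coeffs` is approximately `(3, -1)` and `zero` is approximately `(0, 0)`, in line with $\operatorname{Ker}(I - Q) = \operatorname{Im} A$.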
The operator $f$ can be expressed by
\[ f(\mu + n^2, u + v) = Av + \mu(u + v) + h(\mu, u, v), \qquad u \in \operatorname{Ker} A, \ v \in V, \]
where
\[ h(\mu, u, v) = \mu[\sin(u + v) - (u + v)] + n^2[\sin(u + v) - (u + v)]. \]
Because of this special form of $h$ we will try to find a solution of (4.3.9) in the form $x = \mu(u + v)$. The equality $f(\mu + n^2, \mu(u + v)) = 0$ holds if and only if
\[ Av + \mu(u + v) + \mu^2 g(\mu, u, v) = 0 \tag{4.3.10} \]
where
\[ g(\mu, u, v) = \begin{cases} \dfrac{h(\mu, \mu u, \mu v)}{\mu^3}, & \mu \neq 0, \\[2mm] -\dfrac{n^2}{6}(u + v)^3, & \mu = 0. \end{cases} \]
For solving (4.3.10) we use the Lyapunov–Schmidt Reduction. According to it, the equation (4.3.10) is equivalent to the following pair of equations:
\[ 0 = Av + \mu Q(u + v) + \mu^2 Q g(\mu, u, v) \quad (= Av + \mu v + \mu^2 Q g(\mu, u, v)), \]
\[ 0 = (I - Q)(u + v) + \mu (I - Q) g(\mu, u, v) \quad (= u + \mu (I - Q) g(\mu, u, v)). \]
By the Implicit Function Theorem, the first equation has a unique solution $v = \varphi(\mu, u)$ in a neighborhood of the point $(0, u^*, o)$ for any $u^*$. We insert $\varphi$ into the bifurcation equation obtaining
\[ \Phi(\mu, u) = u + \mu (I - Q) g(\mu, u, \varphi(\mu, u)) = 0. \tag{4.3.11} \]
Since $\Phi(0, u^*) = u^*$, we take $u^* = o$ and solve (4.3.11) in a neighborhood of $(0, o)$. This can be done with help of the Implicit Function Theorem since $\Phi'_2(0, o)$ is an isomorphism of $\operatorname{Ker} A$ onto $V$. Denoting this solution by $u = \omega(\mu)$ we can come to the following conclusion: Any point $(n^2, o)$ is a bifurcation point of the equation (4.3.9) and a nontrivial branch of $2\pi$-periodic solutions of (4.3.9) has the form
\[ x = \mu(\omega(\mu) + \varphi(\mu, \omega(\mu))), \qquad \mu \in (-\delta, \delta). \]
The reader is invited to generalize this procedure to obtain sufficient conditions for a bifurcation for the equation $f(\lambda, x) = o$ assuming that
\[ f(\lambda, o) = o \ \text{ for } |\lambda - \lambda^*| < \delta, \qquad f \in C^2(U), \qquad \dim \operatorname{Ker} f'_2(\lambda^*, o) = \operatorname{codim} \operatorname{Im} f'_2(\lambda^*, o) = 2 \]
where $U$ is a neighborhood of $(\lambda^*, o)$. We notice that no uniqueness of the bifurcation branch was proved even in our concrete example. Compare this with the assertion given in Theorem 4.3.22. This is due to our special choice of the form of the bifurcation branch, namely $x = \mu(u + v)$.

Example 4.3.25 (Application of Theorem 4.3.22). We will study the bifurcation points of the periodic problem
\[ \begin{cases} \ddot{x}(t) + \lambda x(t) + g(\lambda, t, x(t), \dot{x}(t)) = 0, & t \in (0, 2\pi), \\ x(0) = x(2\pi), \quad \dot{x}(0) = \dot{x}(2\pi). \end{cases} \tag{4.3.12} \]
In this example we will concentrate on the point $\lambda = 0$ which is an eigenvalue of the associated eigenvalue problem
\[ \begin{cases} \ddot{x}(t) + \lambda x(t) = 0, & t \in (0, 2\pi), \\ x(0) = x(2\pi), \quad \dot{x}(0) = \dot{x}(2\pi), \end{cases} \tag{4.3.13} \]
of multiplicity 1. We consider the same function spaces $X$, $Y$ as in the previous example (Example 4.3.24). Let us define $F : \mathbb{R} \times X \to Y$ by
\[ F(\lambda, x)(t) = \ddot{x}(t) + \lambda x(t) + g(\lambda, t, x(t), \dot{x}(t)) \]
where the function $g = g(\lambda, t, x, y)$ satisfies the following hypotheses:
(i) $g$ is $2\pi$-periodic in $t$ and continuous with respect to all four variables (as a function from $\mathbb{R}^4$ into $\mathbb{R}$);
(ii) the derivatives of $g$ with respect to $x$, $y$, $\lambda$ up to the order $p$ ($p \geq 2$) are continuous functions from $\mathbb{R}^4$ into $\mathbb{R}$;
(iii) $g(\lambda, t, 0, 0) = 0$ for all $t, \lambda \in \mathbb{R}$;
(iv) $g'_3(\lambda, t, 0, 0) = g'_4(\lambda, t, 0, 0) = 0$ for all $t, \lambda \in \mathbb{R}$.
It follows from (iii) that $F(\lambda, o) = o$ for all $\lambda \in \mathbb{R}$. Moreover, thanks to (iv) we have
\[ F'_2(\lambda, o)w = \ddot{w} + \lambda w, \]
and so we conclude
\[ \dim \operatorname{Ker} F'_2(0, o) = 1. \]
It follows from (ii) that $F \in C^p(\mathbb{R} \times X, Y)$. By Proposition 2.1.27(iv),
\[ Y_1 = \operatorname{Im} F'_2(0, o) = \Big\{ w \in Y : \int_0^{2\pi} w(t)\, dt = 0 \Big\} \]
is a closed subspace of $Y$ of codimension 1. Set
\[ x_0 = 1, \qquad X_1 = \operatorname{Lin}\{1\}, \qquad X_2 = \Big\{ x \in X : \int_0^{2\pi} x(t)\, dt = 0 \Big\}. \]
Since
\[ F''_{1,2}(0, o)1 = 1 \quad \text{and} \quad 1 \notin \operatorname{Im} F'_2(0, o), \]
the condition (iii) in Theorem 4.3.22 is verified, too. It follows from the Crandall–Rabinowitz Bifurcation Theorem (see Theorem 4.3.22) that $(0, o)$ is a bifurcation point of (4.3.12). In particular, the point $(0, o) \in \mathbb{R} \times X$ belongs to the branch of trivial solutions $(\lambda, o)$, but also to the branch
\[ \Gamma = \{(s + x(s), \hat{\lambda}(s)) : s \in (-\varepsilon, \varepsilon)\}, \qquad x(0) = o, \quad \dot{x}_s(0) = o, \quad \hat{\lambda}(0) = 0. \]
Hence for any $s \in (-\varepsilon, \varepsilon)$, $s \neq 0$, the nontrivial solution $s + x(s)$ is the sum of a constant function (with respect to $t$) and the perturbed function $x(s)$ (which depends on $t$) such that $x(s)$ belongs to $X_2$.

Exercise 4.3.26. (i) Let $G$ be an open subset of $\mathbb{R}^M$ and let $v \in C^1(G, \mathbb{R}^M)$. Assume that $a \in G$ is not a stationary point of $v$, i.e., $v(a) \neq o$. Prove that there exists a diffeomorphism $F$ of a neighborhood $U$ of $a$ onto a neighborhood $V$ of $o$ such that $F$ maps solutions to the equation
\[ \dot{x}(t) = v(x(t)) \tag{4.3.14} \]
which lie in $U$ to solutions in $V$ of the system of equations
\[ \dot{y}_1(t) = 1, \qquad \dot{y}_i(t) = 0, \quad i = 2, \dots, M. \]
Hint. Choose a subspace $Y$ of $\mathbb{R}^M$ for which $\mathbb{R}^M = Y \oplus \operatorname{Lin}\{v(a)\}$. Define
\[ G(z) = \varphi(t; a + y) \quad \text{for } z = y + t v(a), \ y \in Y, \]
where $\varphi(t; \xi)$ is a solution to (4.3.14) such that $\varphi(0; \xi) = \xi$. Prove that $G$ is a local diffeomorphism and $F = G^{-1}$ has the desired property.
(ii) Deduce from (i) that the equation (4.3.14) has $M - 1$ independent first integrals in a neighborhood of a non-stationary point.
(iii) Is there any relation between the first integral of (4.3.14) and the linear partial differential equation
\[ v_1(x) \frac{\partial u}{\partial x_1} + \cdots + v_M(x) \frac{\partial u}{\partial x_M} = 0? \]

Exercise 4.3.27. Apply Theorem 4.3.22 to the (Dirichlet) boundary value problem
\[ \begin{cases} \ddot{x}(t) + \lambda x(t) + g(\lambda, t, x(t), \dot{x}(t)) = 0, & t \in (0, \pi), \\ x(0) = x(\pi) = 0, \end{cases} \]
and show that every $(k^2, o)$, $k \in \mathbb{N}$, is a bifurcation point!

Exercise 4.3.28. Replace the Dirichlet boundary condition in Exercise 4.3.27 by the Neumann boundary condition
\[ \dot{x}(0) = \dot{x}(\pi) = 0 \]
and prove that every $(k^2, o)$, $k \in \mathbb{N} \cup \{0\}$, is a bifurcation point!

Exercise 4.3.29. Why cannot the approach used in Example 4.3.25 be applied to prove that the points $(k^2, o)$, $k \in \mathbb{N}$, are bifurcation points of (4.3.12) even if $k^2$ is an eigenvalue of the associated eigenvalue problem (4.3.13)? Can you modify the method from Example 4.3.24?

Exercise 4.3.30. Apply Theorem 4.3.22 to the boundary value problem
\[ \begin{cases} \dfrac{d^4 x}{dt^4}(t) - \lambda x(t) + g(\lambda, t, x(t), \dot{x}(t), \ddot{x}(t), \dddot{x}(t)) = 0, & t \in (0, \pi), \\ x(0) = \ddot{x}(0) = x(\pi) = \ddot{x}(\pi), \end{cases} \]
and show that, under appropriate assumptions on $g$, every $(k^2, o)$, $k \in \mathbb{N}$, is a bifurcation point.
4.3A Differentiable Manifolds, Tangent Spaces and Vector Fields

We have defined a differentiable manifold in the basic text (Definition 4.3.4) and have also shown some examples (Examples 4.3.5 and 4.3.10). In the following Appendices 4.3A–4.3D we will provide more information about this object in order to develop a geometric approach to one of the most powerful tools of nonlinear analysis, namely to the Brouwer degree (cf. Section 5.2 for an alternative approach).

There is no doubt as to the importance of the notion of the derivative (or differential) for local study of functions of one or more variables. Therefore a notion of the differential will be a rudiment of analysis on differentiable manifolds, too. We have learned that it is convenient to define the differential of $f : \mathbb{R}^M \to \mathbb{R}$ at a point $a \in \mathbb{R}^M$
as a linear form $f'(a)$ on $\mathbb{R}^M$ which approximates $f$ locally at $a$ in a given precise way. To extend such an approach to functions on differentiable manifolds we have to say what is a linear form on a (nonlinear) manifold. This is done by the important notion of the tangent space $T_a\mathcal{M}$ of a differentiable manifold $\mathcal{M}$ at a point $a \in \mathcal{M}$. Roughly speaking, $T_a\mathcal{M}$ is the collection of all tangent vectors to $\mathcal{M}$ at the point $a$.

We can imagine a tangent vector $v$ at the point $a \in \mathcal{M}$ with help of the following physical interpretation. Consider a force $F$ which acts on a material point $P$. Suppose that there are certain rigid constraints which make $P$ move along a smooth curve $\gamma$ which lies on a manifold $\mathcal{M} \subset \mathbb{R}^N$ (this manifold is determined by these constraints). As a concrete example, imagine that we move on the globe (because of gravitation) which can be approximated by the smooth two-dimensional sphere $S^2$. Let the point $P$ be at a certain instant, say $t = 0$, at the position $a = \gamma(0) \in \mathcal{M}$, and let the force $F$ and also all constraints stop operating suddenly at this time. What will happen then? According to Newton's First Law the point $P$ will continue to move with a constant speed
\[ v = |\dot{\gamma}(0)| \qquad \Big( \dot{\gamma}(0) := \frac{d\gamma}{dt}(0) \Big) \]
along the line with the directional vector $\dot{\gamma}(0)$. This vector is the tangent vector to the curve $\gamma$ at the point $a$. The collection of all these tangent vectors (which are given by all possible motions through a fixed point $a \in \mathcal{M}$) forms the tangent space $T_a\mathcal{M}$. More precisely, we give the following definition.

Definition 4.3.31. Let $\mathcal{M}$ be an $M$-dimensional differentiable manifold in $\mathbb{R}^N$ and let $a \in \mathcal{M}$. The tangent space $T_a\mathcal{M}$ is the collection of tangent vectors $\dot{\gamma}(0)$ to all smooth curves
\[ \gamma \in \Gamma_a := \{\gamma : \mathbb{R} \to \mathbb{R}^N : \text{there is an open interval } I_\gamma \ni 0 \text{ such that } \gamma \in C^1(I_\gamma), \ \gamma(I_\gamma) \subset \mathcal{M}, \ \gamma(0) = a\}. \]

The method for computation of tangent vectors to a "parametrized" $M$-dimensional manifold $\mathcal{M} \subset \mathbb{R}^N$ is based on the use of local coordinates: Let $a \in \mathcal{M}$, let $\psi$ be a diffeomorphism of a neighborhood $\mathcal{W} \subset \mathbb{R}^N$ of the point $a$ into $\mathbb{R}^N$ (see Definition 4.3.4). If $\mathcal{V} = \mathcal{W} \cap \mathcal{M}$, then $(\mathcal{V}, \psi)$ is a local chart of $\mathcal{M}$ at the point $a \in \mathcal{M}$. Denote the inverse of the restriction of $P\psi$ to $\mathcal{V}$ by $\varphi$ where $Py = (y_1, \dots, y_M) \in \mathbb{R}^M$ for $y = (y_1, \dots, y_N) \in \mathbb{R}^N$. Then $\varphi$ maps the neighborhood $\mathcal{U} = P\psi(\mathcal{V})$ of the point $b = P\psi(a) \in \mathbb{R}^M$ into $\mathcal{M}$ and we will consider $\varphi$ also as an embedding (see Remark 4.3.3(iv)) of $\mathcal{U}$ into $\mathbb{R}^N$. We call $\varphi$ the local parametrization of $\mathcal{V} \subset \mathcal{M}$. The main reason for introducing $\varphi$ is that $\varphi$ can be differentiated, but $\psi|_{\mathcal{V}}$ cannot, and the whole $\psi$ does not describe $\mathcal{M}$. See Figure 4.3.13.

Consider now a smooth curve $\gamma \in \Gamma_a$ (see Figure 4.3.13). We can choose $I_\gamma$ so small that $\gamma(I_\gamma) \subset \mathcal{V}$. Then $\kappa(t) = (P\psi)(\gamma(t))$
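is a smooth curve in $\mathcal{U} \subset \mathbb{R}^M$.

[Figure 4.3.13. Manifold: the chart neighborhood $\mathcal{V} = \mathcal{W} \cap \mathcal{M}$ in $\mathbb{R}^N$ with the curve $\gamma$ through $a$ and its tangent vector $\dot{\gamma}(0)$; the maps $P\psi$ and $\varphi$ relate it to $\mathcal{U} \subset \mathbb{R}^M$ with the curve $\kappa$ through $b$ and its tangent vector $\dot{\kappa}(0)$.]

We have $\varphi \circ \kappa = \gamma$ and, consequently,
\[ \dot{\gamma}_j(0) = \sum_{i=1}^{M} \frac{\partial \varphi_j}{\partial y_i}(b)\, \dot{\kappa}_i(0),^{10} \qquad j = 1, \dots, N, \]
or, more briefly,
\[ \dot{\gamma}(0) = \varphi'(b)(\dot{\kappa}(0)). \tag{4.3.15} \]
Since also
\[ \dot{\kappa}(0) = P\psi'(a)(\dot{\gamma}(0)), \]
there is a correspondence between the tangent vector $v = (\dot{\gamma}_1(0), \dots, \dot{\gamma}_N(0)) \in T_a\mathcal{M}$ and the tangent vector $w = (\dot{\kappa}_1(0), \dots, \dot{\kappa}_M(0))$ to a curve
\[ \kappa = (P\psi) \circ \gamma \quad \text{at} \quad b = \kappa(0) = P\psi(a). \]
Obviously, for any $w = (w_1, \dots, w_M) \in \mathbb{R}^M$ there is a smooth curve $\kappa$ (e.g., $\kappa(t) = b + t \sum_{i=1}^{M} w_i e_i$; $e_1, \dots, e_M$ is the standard coordinate basis in $\mathbb{R}^M$) such that $w = \dot{\kappa}(0)$. This means that
\[ T_a\mathcal{M} = \operatorname{Im} \varphi'(b) \]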
10 The Einstein summation convention is often used in differential geometry. According to it, the sum is taken with respect to all indices which appear simultaneously in upper and lower positions. For example, if $e_1, \dots, e_M$ is a basis in $\mathbb{R}^M$, then the coordinates of a point $x \in \mathbb{R}^M$ with respect to this basis should be denoted by $x^1, \dots, x^M$, since $x = \sum_{i=1}^{M} x^i e_i =: x^i e_i$ by this convention. Similarly, $\varphi : \mathbb{R}^M \to \mathbb{R}^N$ has components $\varphi^1, \dots, \varphi^N$ and values $\varphi(x^1, \dots, x^M)$ $(= \varphi(x))$ and partial derivatives $\frac{\partial \varphi^j}{\partial x^i}$. Moreover, $\varphi'(a)(h^i e_i) = \frac{\partial \varphi^j}{\partial x^i}(a) h^i e_j$ $\big(= \sum_{j=1}^{N} \sum_{i=1}^{M} \frac{\partial \varphi^j}{\partial x^i}(a) h^i e_j\big)$. Since this notation is not too common in analysis we do not use it.
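and the linear operations in $\mathbb{R}^M$ induce those in $T_a\mathcal{M}$. Therefore, $T_a\mathcal{M}$ is a linear space of dimension $M$ ($M = \dim \mathcal{M}$) and $\varphi'(b)e_i$, $i = 1, \dots, M$, form a basis of $T_a\mathcal{M}$. Since $(y^1, \dots, y^M) := (\psi^1(x), \dots, \psi^M(x))$ can be viewed as local (nonlinear) coordinates of a point $x \in \mathcal{V}$, the vector $\varphi'(b)e_i$ is also denoted by $\frac{\partial}{\partial y_i}$. This means that
\[ \dot{\gamma}(0) = \varphi'(b)(\dot{\kappa}(0)) = \varphi'(b)\Big( \sum_{i=1}^{M} \dot{\kappa}_i(0) e_i \Big) = \sum_{i=1}^{M} \dot{\kappa}_i(0) \frac{\partial}{\partial y_i}. \tag{4.3.16} \]

Example 4.3.32. Let us compute the tangent space to the 2-dimensional sphere
\[ S^2 = \{(x, y, z) \in \mathbb{R}^3 : x^2 + y^2 + z^2 = 1\} \quad \text{at the point} \quad a = \Big( \frac{1}{2}, \frac{1}{2}, \frac{\sqrt{2}}{2} \Big). \]
As local coordinates we choose the spherical coordinates
\[ x = \cos\alpha \cos\beta, \qquad y = \sin\alpha \cos\beta, \qquad z = \sin\beta, \]
i.e., $(x, y, z) = \varphi(\alpha, \beta)$, $b = \big( \frac{\pi}{4}, \frac{\pi}{4} \big)$ and $\varphi\big( \frac{\pi}{4}, \frac{\pi}{4} \big) = a$. Then
\[ \frac{\partial}{\partial \alpha} = \frac{\partial \varphi}{\partial \alpha}\Big( \frac{\pi}{4}, \frac{\pi}{4} \Big) = \Big( -\frac{1}{2}, \frac{1}{2}, 0 \Big), \qquad \frac{\partial}{\partial \beta} = \frac{\partial \varphi}{\partial \beta}\Big( \frac{\pi}{4}, \frac{\pi}{4} \Big) = \Big( -\frac{1}{2}, -\frac{1}{2}, \frac{\sqrt{2}}{2} \Big) \]
is a basis of $T_a S^2$. Choosing a perpendicular vector $v$ to both $\frac{\partial}{\partial \alpha}$ and $\frac{\partial}{\partial \beta}$, e.g., $v = (1, 1, \sqrt{2})$, we get the following expression for $T_a S^2$:
\[ T_a S^2 = \{(x, y, z) \in \mathbb{R}^3 : x + y + \sqrt{2} z = 0\}. \]
(If you have drawn a picture, you get a slightly better insight.)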
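
The computation in Example 4.3.32 can be double-checked numerically. The following sketch is illustrative only (the helper names `phi`, `partial` and `dot`, and the step size, are assumptions): it approximates the two basis vectors by central differences of the spherical parametrization and verifies that both are orthogonal to $(1, 1, \sqrt{2})$.

```python
import math


def phi(alpha, beta):
    """Spherical parametrization of S^2 used in Example 4.3.32."""
    return (
        math.cos(alpha) * math.cos(beta),
        math.sin(alpha) * math.cos(beta),
        math.sin(beta),
    )


def partial(f, point, index, h=1e-6):
    """Central-difference approximation of the partial derivative of f at point."""
    plus, minus = list(point), list(point)
    plus[index] += h
    minus[index] -= h
    return tuple((p - m) / (2 * h) for p, m in zip(f(*plus), f(*minus)))


def dot(u, v):
    return sum(a * b for a, b in zip(u, v))


b = (math.pi / 4, math.pi / 4)
d_alpha = partial(phi, b, 0)   # approximately (-1/2, 1/2, 0)
d_beta = partial(phi, b, 1)    # approximately (-1/2, -1/2, sqrt(2)/2)
normal = (1.0, 1.0, math.sqrt(2.0))
```

Both `dot(d_alpha, normal)` and `dot(d_beta, normal)` vanish up to discretization error, so the two vectors span the plane $x + y + \sqrt{2}z = 0$.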
It was shown in Remark 4.3.9(iii) that a manifold can also be given implicitly, i.e., as the set of solutions of the equation $f(x) = o$.

Proposition 4.3.33. Let $f : \mathbb{R}^M \to \mathbb{R}^N$ have continuous partial derivatives in an open set $G \subset \mathbb{R}^M$ and let $o$ be a regular value of $f$ (Definition 4.3.6). Then
\[ \mathcal{M} = \{x \in G : f(x) = o\} \]
is an $(M - N)$-dimensional differentiable manifold provided $\mathcal{M}$ is not empty, and for $a \in \mathcal{M}$ the tangent space $T_a\mathcal{M}$ is equal to $\operatorname{Ker} f'(a)$.

Proof. The first part is exactly Proposition 4.3.8 and Remark 4.3.9(iii). If a map $\gamma : I_\gamma \to \mathcal{M}$ is a smooth curve, $\gamma(0) = a$, then
\[ f(\gamma(t)) = o \ \text{ for } t \in I_\gamma \quad \text{and} \quad f'(a)\dot{\gamma}(0) = o, \quad \text{i.e.,} \quad \dot{\gamma}(0) \in \operatorname{Ker} f'(a). \]
Since $T_a\mathcal{M} \subset \operatorname{Ker} f'(a)$ and both the spaces have the same (finite) dimension (the assumption on regularity of $o$), we have $T_a\mathcal{M} = \operatorname{Ker} f'(a)$.
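Since the same geometric object $\mathcal{M}$ can be viewed as a manifold with different local parametrizations or as solutions of different equations, we would like to know how the notion of the tangent space (and other notions to be introduced later on) depends on the way it is introduced. As the implicit definition of manifold leads to local parametrizations (see the proof of Proposition 4.3.8) we can consider only the definition given by parametrizations. First of all we should say when two atlases of $\mathcal{M}$ define the same structure on $\mathcal{M}$.

Definition 4.3.34. Two $C^k$-atlases $(\mathcal{V}_\alpha, \psi_\alpha)_{\alpha \in A}$, $(\tilde{\mathcal{V}}_\beta, \tilde{\psi}_\beta)_{\beta \in B}$ of $\mathcal{M}$ are said to be equivalent if for every $a \in \mathcal{M}$ and any $\alpha \in A$, $\beta \in B$ for which $a \in \mathcal{V}_\alpha \cap \tilde{\mathcal{V}}_\beta$ the mapping
\[ \Phi = (P\tilde{\psi}_\beta) \circ \varphi_\alpha \]
is a $C^k$-diffeomorphism of $(\varphi_\alpha)^{-1}(\mathcal{V}_\alpha \cap \tilde{\mathcal{V}}_\beta)$ onto $(\tilde{\varphi}_\beta)^{-1}(\mathcal{V}_\alpha \cap \tilde{\mathcal{V}}_\beta)$ (see Figure 4.3.14).

[Figure 4.3.14. Two overlapping chart neighborhoods $\mathcal{V}$, $\tilde{\mathcal{V}}$ of $a$ with parametrizations $\varphi$, $P\tilde{\psi}$ and the transition map $\Phi = (P\tilde{\psi}) \circ \varphi$ from $\mathcal{U}_1 := \varphi^{-1}(\mathcal{V} \cap \tilde{\mathcal{V}}) \subset \mathcal{U}$ to $\tilde{\mathcal{U}}_1 := \tilde{\varphi}^{-1}(\mathcal{V} \cap \tilde{\mathcal{V}}) \subset \tilde{\mathcal{U}}$, sending $b$ to $\tilde{b}$.]

Example 4.3.35. Let $(\mathcal{V}, \psi)$, $(\tilde{\mathcal{V}}, \tilde{\psi})$ be two local charts at a point $a \in \mathcal{M}$ which belong to equivalent atlases of $\mathcal{M}$. Let $v \in T_a\mathcal{M}$, $v = \dot{\gamma}(0)$ for a smooth curve $\gamma$. Then
\[ \gamma = \varphi \circ \kappa = \tilde{\varphi} \circ \tilde{\kappa} \quad \text{where} \quad \tilde{\kappa} = \Phi \circ \kappa \]
and $\Phi$ is defined as in Definition 4.3.34. It follows that
\[ \varphi'(b)(\dot{\kappa}(0)) = \tilde{\varphi}'(\tilde{b})(\Phi'(b)\dot{\kappa}(0)). \]
In particular, for $e_i = \dot{\kappa}(0)$, denoting $\frac{\partial}{\partial y_i} := \varphi'(b)e_i$ (as above) and $\frac{\partial}{\partial z_j} := \tilde{\varphi}'(\tilde{b})e_j$, we get the transformation rule for the tangent vectors
\[ \frac{\partial}{\partial y_i} = \tilde{\varphi}'(\tilde{b})\Big( \sum_{j=1}^{M} \frac{\partial \Phi_j}{\partial y_i}(b) e_j \Big) = \sum_{j=1}^{M} \frac{\partial \Phi_j}{\partial y_i}(b) \frac{\partial}{\partial z_j}, \qquad i = 1, \dots, M. \tag{4.3.17} \]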
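We will now examine a more general situation. Let $\mathcal{M}$, $\tilde{\mathcal{M}}$ be differentiable manifolds in $\mathbb{R}^N$ and $\mathbb{R}^{\tilde{N}}$, respectively. Suppose that $g : \mathcal{M} \to \tilde{\mathcal{M}}$ and let $(\mathcal{V}, \psi)$, $(\tilde{\mathcal{V}}, \tilde{\psi})$ be local charts in $a \in \mathcal{M}$ and $\tilde{a} = g(a) \in \tilde{\mathcal{M}}$. Put
\[ G = (\tilde{P}\tilde{\psi}) \circ g \circ \varphi \]
(see Figure 4.3.15). Then $G$ is called a local realization of $g$.

[Figure 4.3.15. The mapping $g$ from $g^{-1}(\tilde{\mathcal{V}}) \cap \mathcal{V} \subset \mathcal{M} \subset \mathbb{R}^N$ into $\tilde{\mathcal{V}} \subset \tilde{\mathcal{M}} \subset \mathbb{R}^{\tilde{N}}$, sending $a$ to $\tilde{a}$, together with the charts $P\psi$, $\tilde{P}\tilde{\psi}$, the parametrizations $\varphi$, $\tilde{\varphi}$, and the local realization $G = (\tilde{P}\tilde{\psi}) \circ g \circ \varphi$ from $\mathcal{U} \subset \mathbb{R}^M$ to $\tilde{\mathcal{U}} \subset \mathbb{R}^{\tilde{M}}$, sending $b$ to $\tilde{b}$.]

We say that a mapping $g : \mathcal{M} \to \tilde{\mathcal{M}}$ is of the class $C^k$ (or $k$-times continuously differentiable) if $G \in C^k(\mathcal{U}, \tilde{\mathcal{U}})$ for all charts $(\mathcal{V}, \psi)$ and $(\tilde{\mathcal{V}}, \tilde{\psi})$. In this case $g$ maps a smooth curve $\gamma \in \Gamma_a$ onto a smooth curve $g \circ \gamma \in \Gamma_{\tilde{a}}$. Namely,
\[ g \circ \gamma = \tilde{\varphi} \circ G \circ \kappa \quad \text{where} \quad \gamma = \varphi \circ \kappa, \]
and
\[ \frac{d}{dt}(g \circ \gamma)(0) = \tilde{\varphi}'(\tilde{b})(G'(b)\dot{\kappa}(0)). \tag{4.3.18} \]
We say that $g$ "pushes forward" the tangent vector $\dot{\gamma}(0) \in T_a\mathcal{M}$ to the tangent vector
\[ \frac{d}{dt}(g \circ \gamma)(0) \in T_{g(a)}\tilde{\mathcal{M}} \]
which is denoted by $g_*(\dot{\gamma}(0))$ and which is called a push-forward. In particular, $g$ pushes forward the tangent vector $\frac{\partial}{\partial y_i}$ to $g_*\big( \frac{\partial}{\partial y_i} \big)$ where
\[ g_*\Big( \frac{\partial}{\partial y_i} \Big) = \sum_{j=1}^{\tilde{M}} \frac{\partial G_j}{\partial y_i}(b) \frac{\partial}{\partial z_j}; \qquad \frac{\partial}{\partial z_j} = \tilde{\varphi}'(\tilde{b})e_j. \tag{4.3.19} \]
We wish to point out that $g_*$ is the generalization of $g'(a)$ for $g : \mathbb{R}^M \to \mathbb{R}^{\tilde{M}}$. The transformation rule (4.3.17) is a special case of (4.3.18) where $g = I$.

An important special case of a smooth mapping is a differentiable function on a manifold. For such a function we define the notion of the differential.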
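Definition 4.3.36. A function $f : \mathcal{M} \to \mathbb{R}$ is said to be differentiable at $a \in \mathcal{M}$ if
\[ f \circ \gamma \in C^1(I_\gamma) \quad \text{for all } \gamma \in \Gamma_a. \]
The differential $df(a)$ of $f$ at the point $a$ is defined by the relation
\[ df(a)(\dot{\gamma}(0)) := \frac{d}{dt}(f \circ \gamma)(0) \quad \text{for all } \gamma \in \Gamma_a. \]
The (algebraic) dual space to $T_a\mathcal{M}$ will be denoted by $(T_a\mathcal{M})^*$ (it is sometimes called the cotangent space) and the dual basis to $\frac{\partial}{\partial y_1}, \dots, \frac{\partial}{\partial y_M}$ is denoted by $dy^1, \dots, dy^M$, i.e.,
\[ dy^j\Big( \frac{\partial}{\partial y_i} \Big) = \delta_{ij} = \begin{cases} 1 & \text{for } i = j, \\ 0 & \text{for } i \neq j. \end{cases} \]

Remark 4.3.37. (i) From the definition of $df(a)$ it is obvious that $df(a) \in (T_a\mathcal{M})^*$ and its values can be expressed in local coordinates as follows: If $(\mathcal{V}, \psi)$ is a local chart at $a \in \mathcal{M}$, $F = f \circ \varphi : \mathcal{U} \subset \mathbb{R}^M \to \mathbb{R}$, then
\[ f \circ \gamma = F \circ \kappa \ \text{ for } \gamma = \varphi \circ \kappa \in \Gamma_a, \quad \text{i.e.,} \quad \frac{d}{dt}(f \circ \gamma)(0) = \sum_{i=1}^{M} \frac{\partial F}{\partial y_i}(b)\dot{\kappa}_i(0). \]
In other words,
\[ df(a) = \sum_{i=1}^{M} \frac{\partial F}{\partial y_i}(b)\, dy^i. \tag{4.3.20} \]
In particular, for $f(x) = \psi^i(x)$ we have
\[ d\psi^i(a) = dy^i, \qquad i = 1, \dots, M. \]
Observe that the formula (4.3.20) allows us to define continuity of the mapping $x \in \mathcal{M} \mapsto df(x) \in (T_x\mathcal{M})^*$ by the requirement that all $F$ (corresponding to all charts of $\mathcal{M}$) have continuous partial derivatives.11

(ii) Suppose now that there are two local charts $(\mathcal{V}, \psi)$, $(\tilde{\mathcal{V}}, \tilde{\psi})$ at a point $a \in \mathcal{M}$. Let $f : \mathcal{M} \to \mathbb{R}$ be differentiable at $a \in \mathcal{M}$. Put
\[ F := f \circ \varphi, \qquad \tilde{F} := f \circ \tilde{\varphi}, \quad \text{i.e.,} \quad F = \tilde{F} \circ \Phi \]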
11 It is possible to define the structure of a differentiable manifold on the collection $T\mathcal{M} := \{(x, v) : x \in \mathcal{M}, v \in T_x\mathcal{M}\}$ of all tangent spaces together with their "base" points. This set $T\mathcal{M}$ is called a tangent bundle of $\mathcal{M}$. The structure of a differentiable manifold on $T\mathcal{M}$ is given by the local charts which are constructed as follows: If $(\mathcal{V}, \psi)$ is a local chart at $a \in \mathcal{M}$, then $\mathcal{V}_T = \{\{x\} \times T_x\mathcal{M} : x \in \mathcal{V}\} \subset \mathbb{R}^{N+M}$ and $\psi_T(x, w) = (\psi(x), P\psi'(x)w) \in \mathbb{R}^{N+M}$. In a similar way the cotangent bundle is the collection $T^*\mathcal{M} := \{(x, \kappa) : x \in \mathcal{M}, \kappa \in (T_x\mathcal{M})^*\}$. The continuity of $df$ which has just been defined is then the continuity of $df : \mathcal{M} \to T^*\mathcal{M}$.
where $\Phi$ is defined in Definition 4.3.34. By virtue of (4.3.20) we have
\[ df(a) = \sum_{i=1}^{M} \frac{\partial F}{\partial y_i}(b)\, dy^i = \sum_{i=1}^{M} \sum_{j=1}^{M} \frac{\partial \tilde{F}}{\partial z_j}(\tilde{b}) \frac{\partial \Phi_j}{\partial y_i}(b)\, dy^i = \sum_{j=1}^{M} \frac{\partial \tilde{F}}{\partial z_j}(\tilde{b}) \Big( \sum_{i=1}^{M} \frac{\partial \Phi_j}{\partial y_i}(b)\, dy^i \Big) = \sum_{j=1}^{M} \frac{\partial \tilde{F}}{\partial z_j}(\tilde{b})\, dz^j. \tag{4.3.21} \]
The equality
\[ dz^j = \sum_{i=1}^{M} \frac{\partial \Phi_j}{\partial y_i}(b)\, dy^i \tag{4.3.22} \]
follows from this calculation applied to the $j$th coordinate function
\[ z^j = \tilde{\psi}^j(x), \qquad x \in \mathcal{V} \cap \tilde{\mathcal{V}}, \]
and Remark 4.3.37(i).12

According to the last remark, the existence of the differential $df(a)$ does not depend on the choice of local chart. The fact that the differentiability of functions, and similarly other notions, does not depend on a particular choice of local coordinates is crucial for differential geometry and global analysis. Namely, these parts of mathematics study objects in their own geometric nature. The invariance of geometric objects with respect to various groups of transformations (in our case this group is the group of diffeomorphisms) plays a very important role in various applications, mainly in physics (e.g., in general relativity). It is also worth mentioning that the emphasis on invariance is a philosophy in a certain sense dual to that of Descartes. Analytic geometry and classical differential geometry transform geometric properties into the language of analysis. Local coordinates which introduce analytic tools into geometry are used mainly for computations.

Remark 4.3.38. The transformation rule (4.3.21) can be generalized. Consider a function $f : \tilde{\mathcal{M}} \to \mathbb{R}$ and suppose that a mapping $g : \mathcal{M} \to \tilde{\mathcal{M}}$ is given. We can look at $g$ as a "generalized" transformation (we do not assume that $g$ is injective). We are interested in the relation between $df(\tilde{a})$ and the differential of the "transformed" function $f \circ g : \mathcal{M} \to \mathbb{R}$. The desired chain rule can be obtained again with help of local charts $(\mathcal{V}, \psi)$ at $a \in \mathcal{M}$ and $(\tilde{\mathcal{V}}, \tilde{\psi})$ at $\tilde{a} = g(a) \in \tilde{\mathcal{M}}$ (see Figure 4.3.15). This means that we investigate $F \circ G$ instead of $f \circ g$. Here
\[ G = (\tilde{P}\tilde{\psi}) \circ g \circ \varphi \quad \text{and} \quad F = f \circ \tilde{\varphi}. \]
According to (4.3.19)–(4.3.21) we have
\[ d(f \circ g)(a)\Big( \frac{\partial}{\partial y_i} \Big) = \frac{\partial (F \circ G)}{\partial y_i}(b) = \sum_{j=1}^{\tilde{M}} \frac{\partial F}{\partial z_j}(\tilde{b}) \frac{\partial G_j}{\partial y_i}(b) = df(\tilde{a})\Big( g_* \frac{\partial}{\partial y_i} \Big) =: g^*(df(\tilde{a}))\Big( \frac{\partial}{\partial y_i} \Big). \tag{4.3.23} \]
12 Notice the difference between the transformation rules (4.3.17) and (4.3.22). The reader who is acquainted with tensor analysis can realize that tangent vectors are transformed as contravariant tensors and differentials as covariant ones.
The linear form $g^*(df(\tilde{a})) \in (T_a\mathcal{M})^*$ is called a pull-back of the linear form $df(\tilde{a}) \in (T_{\tilde{a}}\tilde{\mathcal{M}})^*$. This operation will play an important role in the definition of the degree (see Proposition 4.3.116). We will return to pull-back in Exercise 4.3.71.

Remark 4.3.39. The notion of the differentiable manifold can be generalized in such a way that it is not a priori assumed that $\mathcal{M}$ is a subset of $\mathbb{R}^N$. In fact, we have needed in Definition 4.3.4 that $\mathcal{M}$ has a topological structure (inherited from $\mathbb{R}^N$) such that the neighborhood $\mathcal{V} = \mathcal{W} \cap \mathcal{M}$ is homeomorphic via $P\psi|_{\mathcal{V}}$ ($(P\psi|_{\mathcal{V}})^{-1} = \varphi$) with $\mathcal{U} \subset \mathbb{R}^M$. A differential structure on $\mathcal{M}$ is introduced with help of differentiability properties of the mappings $\Phi_{1,2} := \varphi_2^{-1} \circ \varphi_1$ for different neighborhoods $\mathcal{V}_1$, $\mathcal{V}_2$ of $\mathcal{M}$ (cf. Definition 4.3.34). This is sufficient for correctness of the definition of the smooth function $f : \mathcal{M} \to \mathbb{R}$. (It is smooth provided $f \circ \varphi : \mathcal{U} \to \mathbb{R}$ is smooth for all $\varphi$. Namely,
\[ f(x) = (f \circ \varphi_1)(y) = (f \circ \varphi_2)[\Phi_{1,2}(y)] \quad \text{for } x \in \mathcal{V}_1 \cap \mathcal{V}_2.) \]
These considerations allow us to say that the topological space $\mathcal{M}$13 is called the $M$-dimensional differentiable manifold if $\mathcal{M}$ is locally homeomorphic to open sets of $\mathbb{R}^M$ in such a way that all composite mappings $\Phi_{1,2}$ belong to the class $C^k$.14

Remark 4.3.40. We can define an infinite dimensional differentiable (e.g., of the class $C^k$) manifold by replacing $\mathbb{R}^M$ by a Banach space $X$. As an important example consider a mapping $f \in C^k(X, \mathbb{R})$, $k \geq 1$, and define $\mathcal{M} = \{x \in X : f(x) = 1\}$. If $\mathcal{M} \neq \emptyset$ and all its points are regular (i.e., $f'(x) \neq o$ for all $x \in \mathcal{M}$), then $\mathcal{M}$ is an infinite dimensional manifold of the class $C^k$ (this follows from Proposition 4.3.8 and Remark 4.3.9(i)). Moreover, the tangent space $T_a\mathcal{M}$ is equal to $\operatorname{Ker} f'(a)$ for $a \in \mathcal{M}$. Indeed, the inclusion $T_a\mathcal{M} \subset \operatorname{Ker} f'(a)$ can be proved as in Proposition 4.3.33. To get the reverse inclusion let $h \in \operatorname{Ker} f'(a)$ and let $\varphi$ be the diffeomorphism from Proposition 4.3.8. For $\gamma(t) := \varphi(th)$ there exists $\delta > 0$ such that $\gamma(t) \in \mathcal{M}$ for $|t| < \delta$ and $\dot{\gamma}(0) = \varphi'(o)h = h$ (cf. the proof of Proposition 4.3.8), i.e., $h \in T_a\mathcal{M}$.

In order to define the differential $df(a)$ for $f : \mathcal{M} \to \mathbb{R}$ we need a generalization of the notion of the tangent space $T_a\mathcal{M}$ in this more general setting. For $a \in \mathcal{V} \subset \mathcal{M}$ we define
\[ \Gamma_a := \{\gamma : \mathbb{R} \to \mathcal{M} : \exists \text{ open interval } I_\gamma \ni 0 : \gamma(0) = a \in \mathcal{V}, \ \varphi^{-1} \circ \gamma \in C^1(I_\gamma), \text{ where } \varphi : \mathcal{U} \to \mathcal{V} \text{ is a local parametrization of } \mathcal{V}\}. \]
Similarly as above, $T_a\mathcal{M}$ is the collection of all vectors $\frac{d}{dt}(\varphi^{-1} \circ \gamma)\big|_{t=0}$ with a linear structure of $\mathbb{R}^M$. (Actually, $T_a\mathcal{M}$ now coincides with $\mathbb{R}^M$. We remark that previously $T_a\mathcal{M}$ was an $M$-dimensional subset of $\mathbb{R}^N$.) Further,
\[ df(a) : T_a\mathcal{M} \to \mathbb{R} \]
is defined as
\[ df(a)v := \frac{d}{dt}(f \circ \gamma)\Big|_{t=0} = \frac{d}{dt}(F \circ \kappa)\Big|_{t=0} \quad \text{where } v = \dot{\kappa}(0), \ \kappa = \varphi^{-1} \circ \gamma, \ F = f \circ \varphi, \]
see Figure 4.3.16.

13 If $\mathcal{M} = \bigcup_\alpha \mathcal{V}_\alpha$ is a set such that there are injective and surjective mappings $\varphi_\alpha : \mathcal{U}_\alpha \to \mathcal{V}_\alpha$ where $\mathcal{U}_\alpha$ are open subsets of $\mathbb{R}^M$, then the sets $\mathcal{V}_\alpha$ form a subbase of a topology in $\mathcal{M}$.

14 The deep theorem due to H. Whitney roughly says that a connected $M$-dimensional differentiable manifold can be embedded (Remark 4.3.3(iv)) into $\mathbb{R}^{2M+1}$ (see, e.g., Whitney [133, Chapter IV], Aubin [11, Theorem 1.22] or Sternberg [124, Theorem 2.4.4]). This means that our previous approach was not too restrictive.
[Figure 4.3.16. The curve $\gamma : I_\gamma \to \mathcal{V} \subset \mathcal{M}$ through $a$ (with $0 \in I_\gamma$), its coordinate representation $\kappa = \varphi^{-1} \circ \gamma$ in $\mathcal{U} \subset \mathbb{R}^M$, and the functions $f$ and $F$ with values in $\mathbb{R}$.]

In Appendix 6.3B we consider the level sets of a function $\psi : X \to \mathbb{R}$, $\psi \in C^k$, $k \geq 1$. If 0 is the regular value of $\psi$ and
\[ \mathcal{M} = \{x \in X : \psi(x) = \psi(a)\} \neq \emptyset, \]
then $\mathcal{M}$ is a $C^k$-differentiable manifold (with a "parameter space" $X_1 = \operatorname{Ker} \psi'(a)$, see Remark 4.3.9(i) and the proof of Proposition 4.3.8). In this case $T_a\mathcal{M}$ can be identified with $X_1$ (the analogue of Proposition 4.3.33).

The following notion is also useful in nonlinear analysis.

Definition 4.3.41. A vector field on a differentiable manifold $\mathcal{M}$ is such a mapping $v : \mathcal{M} \to T\mathcal{M}$ that $v(x) \in T_x\mathcal{M}$ for all $x \in \mathcal{M}$.

A vector field $v$ determines the differential equation
\[ \dot{x} = v(x). \tag{4.3.24} \]
A solution of (4.3.24) is a curve $\gamma : I_\gamma \to \mathcal{M}$ ($I_\gamma$ is an open interval in $\mathbb{R}$) such that
\[ \dot{\gamma}(t) = v(\gamma(t)) \quad \text{for all } t \in I_\gamma. \]
If we choose a chart $(\mathcal{V}, \psi)$ at a point $a \in \mathcal{M}$, we can try to find a solution $\gamma$ of (4.3.24) which passes through the point $a$, i.e., we are looking for a curve $\gamma : I_\gamma \to \mathcal{M}$, $0 \in I_\gamma$, which is a solution of (4.3.24) and $\gamma(0) = a$. If
\[ v(x) = \sum_{i=1}^{M} v^i(x) \frac{\partial}{\partial y_i} \quad \text{for } x \in \mathcal{V}, \]
then the equation (4.3.24) has the form of a system
\[ \dot{y}_i = v^i(\varphi \circ y), \qquad i = 1, \dots, M. \tag{4.3.25} \]
The local existence (and uniqueness) theorem for the system (4.3.25) can be used assuming that the vector field $v$ is continuous (and all partial derivatives $\frac{\partial (v^i \circ \varphi)}{\partial y_j}$ are continuous). A standard continuation process then yields a solution $\gamma$ which is defined on a maximal "time" interval $I_a$ ($\gamma(0) = a$). It is well known that even very simple differential equations in $\mathbb{R}$ need not have any solution defined on the whole of $\mathbb{R}$ (e.g., $\dot{x} = x^2 + 1$). The situation is better in the case of a compact differentiable manifold $\mathcal{M}$ (i.e., when $\mathcal{M}$ is a compact subset of $\mathbb{R}^N$). If $v$ is continuous on $\mathcal{M}$, $a \in \mathcal{M}$, then there exists a solution $\gamma$ of (4.3.24), $\gamma(0) = a$, which is defined on the whole of $\mathbb{R}$. Because of the compactness of $\mathcal{M}$ there is a finite number of charts $(\mathcal{V}_i, \psi_i)$, $i = 1, \dots, k$, such that $\mathcal{M} = \bigcup_{i=1}^{k} \mathcal{V}_i$. By the continuity of $v$ there is also a constant $K$ for which
\[ |v(x)| \leq K \quad \text{for all } x \in \mathcal{M}. \]
Any solution $\gamma$ is therefore uniformly continuous on $I_\gamma$. If $I_\gamma \neq \mathbb{R}$, then the limits (in $\mathcal{M}$) of $\gamma$ at the terminal point(s) of $I_\gamma$ exist, and $\gamma$ could be continued. The reader is invited to fill in all details of this proof using local coordinates.

The global existence of solutions means that the map
\[ \sigma : \mathbb{R} \times \mathcal{M} \to \mathcal{M} : \quad \sigma(t, a) = \gamma_a(t), \]
where $\gamma_a$ is a solution of (4.3.24) on $\mathbb{R}$ which satisfies $\gamma_a(0) = a$, is a smooth (provided $v$ is smooth) dynamical system on $\mathcal{M}$. On the other hand, with any smooth dynamical system $\sigma$ on $\mathcal{M}$ we can associate a smooth vector field $v$ on $\mathcal{M}$. The reader who is interested in dynamical systems on manifolds can consult, e.g., Chillingworth [24], Ruelle [114] for brief information, Palis & De Melo [101] or Katok & Hasselblatt [74] (this is actually much more than an introduction).

We have mentioned on page 165 the role of the first integrals for (autonomous) systems of ordinary differential equations. The notion of the first integral is connected with partial differential equations, since $f : \mathcal{M} \to \mathbb{R}$ is the first integral of (4.3.24) if $f$ satisfies (in local coordinates) the linear partial differential equation of the first order
\[ \frac{d}{dt} f(\gamma(t)) = \frac{d}{dt} F(\kappa(t)) = \sum_{i=1}^{M} \frac{\partial F}{\partial y_i}(\kappa(t))\, v^i(\gamma(t)) = df(x)(v(x)) = 0 \]
where $x = \gamma(t)$ and $\gamma(t) = \varphi(\kappa(t))$.
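
As a concrete illustration of a globally defined flow on a compact manifold together with a first integral, one may take the rotation field $v(x, y, z) = (-y, x, 0)$ on the sphere $S^2$, for which $f(x, y, z) = z$ satisfies $df(x)(v(x)) = 0$. The sketch below is an assumption-laden illustration and not part of the original text: the field, the function names `v`, `rk4_step`, `flow`, and the use of a classical Runge–Kutta scheme are all choices made here for the example.

```python
import math


def v(p):
    """Rotation field on S^2; it is tangent since <p, v(p)> = 0."""
    x, y, z = p
    return (-y, x, 0.0)


def rk4_step(field, p, h):
    """One classical fourth-order Runge-Kutta step for p' = field(p)."""
    def shifted(k, c):
        return tuple(pi + c * ki for pi, ki in zip(p, k))

    k1 = field(p)
    k2 = field(shifted(k1, h / 2))
    k3 = field(shifted(k2, h / 2))
    k4 = field(shifted(k3, h))
    return tuple(
        pi + h / 6 * (a + 2 * b + 2 * c + d)
        for pi, a, b, c, d in zip(p, k1, k2, k3, k4)
    )


def flow(field, p, t, steps=2000):
    """Approximate sigma(t, p) by integrating from 0 to t in equal steps."""
    h = t / steps
    for _ in range(steps):
        p = rk4_step(field, p, h)
    return p


a = (0.6, 0.0, 0.8)          # a point of S^2
q = flow(v, a, 10.0)         # approximation of sigma(10, a)
```

Along the computed trajectory the value of $f(q) = z$ stays equal to $0.8$ and $q$ remains (up to discretization error) on the sphere; the exact solution is $(0.6\cos t, 0.6\sin t, 0.8)$.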
The solutions of (4.3.24) are called the characteristics of this partial differential equation for an unknown function $F$. We obtain a system of partial differential equations by considering a family of vector fields. Let $v_1, \dots, v_k$ be vector fields on a manifold $\mathcal{M}$ and denote by
\[ V(x) = \operatorname{Lin}\{v_1(x), \dots, v_k(x)\} \]
the subspace of $T_x\mathcal{M}$. The first integral of the system $v_1, \dots, v_k$ (or the collection of subspaces $\{V(x)\}_{x \in \mathcal{M}}$) is a function $f : \mathcal{M} \to \mathbb{R}$ for which
\[ df(x)(v_i(x)) = 0, \qquad i = 1, \dots, k, \quad x \in \mathcal{M}, \tag{4.3.26} \]
or equivalently, $df$ annihilates all $V(x)$, $x \in \mathcal{M}$, i.e., $L_w f = o$ for all vector fields $w$ such that $w(x) \in V(x)$. ($L_w f$ is the so-called Lie derivative, see Exercise 4.3.46.) From this formulation it is clear that we can suppose that the vector fields $v_1, \dots, v_k$ are linearly independent at each $x \in \mathcal{M}$. Contrary to the case of one equation, the system (4.3.26) need not have a solution.

The following problem is similar to the preceding one: Let $G$ be an open subset of $\mathbb{R}^M$ and let $g = (g^1, \dots, g^M)$ be a smooth mapping from $G \times \mathbb{R}$ into $\mathbb{R}^M$. Since $\mathbb{R}^M$ can be also interpreted as the dual space to $\mathbb{R}^M$, the mapping $g$ determines a system of partial differential equations
\[ u'(x) = g(x, u(x)), \qquad x \in G \subset \mathbb{R}^M. \tag{4.3.27} \]
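Expressing the Fréchet derivative (i.e., the differential) $u'(x)$ in terms of partial derivatives we get the system
\[ \frac{\partial u}{\partial x_i}(x) = g^i(x, u(x)), \qquad i = 1, \dots, M, \quad x \in G. \]
If this system has a solution $u$, then $u \in C^2(G)$ (since $g$ is supposed to be smooth), which implies a necessary condition for the existence of a solution given by the equality of mixed derivatives ($\frac{\partial^2 u}{\partial x_i \partial x_j} = \frac{\partial^2 u}{\partial x_j \partial x_i}$, $i, j = 1, \dots, M$):
\[ \frac{\partial g^i}{\partial x_j} + \frac{\partial g^i}{\partial u} g^j = \frac{\partial g^j}{\partial x_i} + \frac{\partial g^j}{\partial u} g^i, \qquad i, j = 1, \dots, M. \tag{4.3.28} \]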
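
The necessary condition (4.3.28) is easy to test pointwise for a given right-hand side. The following sketch is only an illustration (the function names `partial` and `integrability_defect`, the two sample systems and the use of central differences are assumptions made here): it evaluates the largest difference between the two sides of (4.3.28) at a chosen point $(x, u)$.

```python
def partial(fn, args, index, h=1e-5):
    """Central-difference partial derivative of fn with respect to args[index]."""
    up, down = list(args), list(args)
    up[index] += h
    down[index] -= h
    return (fn(*up) - fn(*down)) / (2 * h)


def integrability_defect(g, x, u):
    """Largest |LHS - RHS| of (4.3.28) over all i, j at the point (x, u).

    g is a list of M functions g[i](x_1, ..., x_M, u).
    """
    M = len(g)
    args = list(x) + [u]
    worst = 0.0
    for i in range(M):
        for j in range(M):
            lhs = partial(g[i], args, j) + partial(g[i], args, M) * g[j](*args)
            rhs = partial(g[j], args, i) + partial(g[j], args, M) * g[i](*args)
            worst = max(worst, abs(lhs - rhs))
    return worst


# du/dx1 = u, du/dx2 = u has the solutions u = C*exp(x1 + x2): condition holds.
solvable = [lambda x1, x2, u: u, lambda x1, x2, u: u]
# du/dx1 = x2, du/dx2 = 0 violates equality of mixed derivatives.
unsolvable = [lambda x1, x2, u: x2, lambda x1, x2, u: 0.0]

good = integrability_defect(solvable, (0.3, -0.2), 1.5)
bad = integrability_defect(unsolvable, (0.3, -0.2), 1.5)
```

For the first system the defect is zero up to rounding, while for the second it equals 1, so no solution can exist there.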
It is a question how to formulate this integrability condition for the system (4.3.26). The system (4.3.26) is said to be completely integrable in $\mathcal{M}$ if for any $x \in \mathcal{M}$ there is a submanifold $\mathcal{N}(x)$ (the integral manifold) of $\mathcal{M}$ containing $x$ such that
\[ i_*(T_y\mathcal{N}(x)) = V(y) \quad \text{for all } y \in \mathcal{N}(x) \]
($i$ is the natural embedding of $\mathcal{N}(x)$ into $\mathcal{M}$). Notice that for one vector field $v \neq o$ (i.e., $\dim V(x) = 1$) the integral manifold is the integral curve of the system $\dot{x} = v(x)$ and, in general, it contains the integral curves of all equations
\[ \dot{x} = v_i(x), \qquad i = 1, \dots, k. \]
The gluing of all integral curves need not be a manifold. A possible problem is shown in Remark 4.3.43 below.

The basic result on complete integrability is the next theorem (Frobenius Theorem) which we state without proof. This theorem is an important tool in differential geometry, and the reader can find its proof in textbooks on this subject, e.g., Aubin [11], Sternberg [124, § 3.5]. To formulate the theorem in a compact form we need15 the notion of Lie brackets (they are sometimes, mainly in applications to Hamiltonian mechanics, called Poisson brackets). If $v$, $w$ are two smooth vector fields with local representations
\[ v = \sum_{i=1}^{M} v^i \frac{\partial}{\partial y_i}, \qquad w = \sum_{j=1}^{M} w^j \frac{\partial}{\partial y_j}, \]
then $[v, w]$ is the vector field with the local representation
\[ [v, w] = \sum_{j=1}^{M} \sum_{i=1}^{M} \Big( v^j \frac{\partial w^i}{\partial y_j} - w^j \frac{\partial v^i}{\partial y_j} \Big) \frac{\partial}{\partial y_i}. \tag{4.3.29} \]

15 We will give another formulation of this theorem at the end of Appendix 4.3B.
For another interpretation of this operation see Remark 4.3.43 below or Exercise 4.3.47.

Theorem 4.3.42 (Frobenius). Let $v_1, \dots, v_k$ be smooth vector fields on a manifold $\mathcal{M}$. Then this system is completely integrable if and only if
\[ [v_i(x), v_j(x)] \in V(x) \quad \text{for all } x \in \mathcal{M}, \quad i, j = 1, \dots, k. \]
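
The local formula for the Lie bracket can be evaluated numerically in a single chart. The following sketch is an illustration under stated assumptions (fields are given as Python functions of the coordinate vector $y$, derivatives are approximated by central differences, and the names `jacobian` and `lie_bracket` are chosen here); it computes the components $\sum_j \big( v^j \partial w^i / \partial y_j - w^j \partial v^i / \partial y_j \big)$.

```python
def jacobian(field, y, h=1e-5):
    """rows[j][i] approximates d field_i / d y_j by central differences."""
    rows = []
    for j in range(len(y)):
        up, down = list(y), list(y)
        up[j] += h
        down[j] -= h
        rows.append([(a - b) / (2 * h) for a, b in zip(field(up), field(down))])
    return rows


def lie_bracket(v, w, y):
    """Components of [v, w] at y in the basis d/dy_1, ..., d/dy_M."""
    dv, dw = jacobian(v, y), jacobian(w, y)
    vy, wy = v(y), w(y)
    M = len(y)
    return [
        sum(vy[j] * dw[j][i] - wy[j] * dv[j][i] for j in range(M))
        for i in range(M)
    ]


point = [0.3, -0.7]
# v = d/dy_1 and w = y_1 d/dy_2 do not commute: [v, w] = d/dy_2.
b1 = lie_bracket(lambda y: [1.0, 0.0], lambda y: [0.0, y[0]], point)
# The rotation field and the radial field commute: the bracket vanishes.
b2 = lie_bracket(lambda y: [-y[1], y[0]], lambda y: [y[0], y[1]], point)
```

Here `b1` is approximately `[0, 1]` and `b2` is approximately `[0, 0]`.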
Remark 4.3.43. Suppose that two smooth vector fields $v$, $w$ are given on a (compact) manifold $\mathcal{M}$ and let $\sigma_v$, $\sigma_w$ be the corresponding dynamical systems on $\mathcal{M}$. There is no reason to expect that these systems commute, i.e.,
\[ \sigma_v(t, \sigma_w(s, x)) = \sigma_w(s, \sigma_v(t, x)), \]
and it is not difficult to construct a counterexample which confirms that (see Figure 4.3.17).

[Figure 4.3.17. Starting at $x$: the points $\sigma_v(t, x) = y_1$ and $\sigma_w(s, x) = y_2$, followed by $\sigma_w(s, y_1)$ and $\sigma_v(t, y_2)$, which need not coincide.]

It can be shown that a necessary and sufficient condition for commutativity is that $[v, w] = o$ (see (4.3.29)).

Exercise 4.3.44. We say that a function $f : \mathcal{M} \to \mathbb{R}$ is an integral of the equation (4.3.24) if $f(\gamma(\cdot))$ is constant for any solution $\gamma$ of (4.3.24). If this is true only locally, $f$ is called a local integral. Suppose that $\dim \mathcal{M} = 2$ and $(\mathcal{V}, \psi)$ is a local chart of $\mathcal{M}$, $f$ is the integral of (4.3.24) and $df(x) \neq 0$ for all $x \in \mathcal{V}$. Prove the following assertions.
(i) There is no stationary point of (4.3.24) in $\mathcal{V}$ ($a \in \mathcal{V}$ is a stationary point of (4.3.24) if $\gamma(t) = a$, $t \in \mathbb{R}$, is a solution of (4.3.24)).
(ii) Take a function $g : \mathcal{U} \to \mathbb{R}$, $\varphi(\mathcal{U}) = \mathcal{V}$, such that

$$z = \Phi(y) = (F(y), g(y)), \quad \text{where } F = f \circ \varphi,$$

is a diffeomorphism of $\mathcal{U}$ onto $\tilde{\mathcal{U}}$ (why does such a function exist?). Then there exists $h : \tilde{\mathcal{U}} \to \mathbb{R}$ such that the vector field $v$ has the form

$$v(x) = 0 \cdot \frac{\partial}{\partial z_1} + h(z) \frac{\partial}{\partial z_2}$$

in these new local coordinates $z = (z_1, z_2)$ (use the transformation rule (4.3.17) and the fact that $f$ is an integral of (4.3.24)). Notice that $h(z) \neq 0$ for all $z \in \tilde{\mathcal{U}}$.

(iii) Put

$$H(z_1, z_2) = \int_0^{z_2} \frac{d\eta}{h(z_1, \eta)}$$

(what is the relation of $H$ to a solution of (4.3.24) in the $z$-coordinates?). Consider another transformation of coordinates $\xi = (z_1, H(z_1, z_2))$. Then the vector field $v$ has the form

$$v(x) = 0 \cdot \frac{\partial}{\partial \xi_1} + 1 \cdot \frac{\partial}{\partial \xi_2}$$
in the local coordinates $\xi = (\xi_1, \xi_2)$. Cf. Exercise 4.3.26. Can you formulate the result in terms of ordinary differential equations? Can you generalize this result to higher dimensional manifolds?

Exercise 4.3.45. Let $v_1, \dots, v_k$ be smooth vector fields on a differentiable manifold $\mathcal{M}$ which are linearly independent on a neighborhood $\mathcal{U}$ of $a \in \mathcal{M}$. Assume that

$$[v_i, v_j] = o, \quad i, j = 1, \dots, k, \quad \text{on } \mathcal{U}.$$

Prove that there exist local coordinates $(y_1, \dots, y_M)$ such that

$$v_i = \frac{\partial}{\partial y_i}, \quad i = 1, \dots, k,$$

in a neighborhood of $a$.
Hint. Cf. Exercises 4.3.44 and 4.3.26.

Exercise 4.3.46. Let $\mathcal{M}$ be a differentiable manifold of the class $C^\infty$ and let $X$ be the set of all $C^\infty$ functions on $\mathcal{M}$. Let $D : X \to X$ satisfy
(D1) $D$ is linear,
(D2) $D(fg) = g\,Df + f\,Dg$ for all $f, g \in X$ (pointwise multiplication).
Show that there is a vector field $v$ on $\mathcal{M}$ such that

$$L_v f(x) := df(x)(v(x)) = Df(x), \quad x \in \mathcal{M}, \quad f \in X.$$
Here $L_v f$ is the directional derivative (in the direction of the vector field $v$) which is also called the Lie derivative (cf. page 192).¹⁶
Hint. Put

$$v = \sum_{i=1}^{M} a_i \frac{\partial}{\partial y_i} \quad \text{where } a_i = D y_i.$$

Show that $Df - L_v f = o$ holds for polynomials of degree ≥ 1 on $\mathcal{U}$. Then use the Taylor polynomial. It remains to extend this result from local charts to the whole manifold – use a partition of unity. See Definition 4.3.74 and Theorem 4.3.76. The converse statement, i.e., the fact that $L_v$ satisfies (D1), (D2) for smooth $v$, is easy to prove. (Do that.) Is there any difference between the differential and the Lie derivative?

Exercise 4.3.47. Let $v$, $w$ be two smooth vector fields on a differentiable manifold. Define the vector field $[v, w]$, the so-called commutator (or Lie bracket) of $v$, $w$, by the formula

$$L_{[v,w]} f = L_v(L_w f) - L_w(L_v f) \quad \text{for every } f \in X$$

(see (4.3.29) and Exercise 4.3.46). Show that this definition is correct, i.e., $[v, w]$ is a vector field, and show that the Jacobi identity

$$[u, [v, w]] + [v, [w, u]] + [w, [u, v]] = o$$

holds for any smooth vector fields $u$, $v$, $w$.¹⁷
4.3B Differential Forms

Before starting with the notion of a differential form we need to summarize some basic facts from multilinear algebra. Let $X$ be a (real) linear space. A bilinear form is a map $A : X \times X \to \mathbb{R}$ which is linear in both variables. A typical example of a bilinear form is the scalar product in a real Hilbert space.¹⁸ A $p$-linear form $A : \underbrace{X \times \dots \times X}_{p} \to \mathbb{R}$ is defined in a similar way.

¹⁶ Sophus Lie was one of the promoters of geometric methods in analysis. A topological group (i.e., a group with a topological structure such that the group operations are continuous) with the structure of a differentiable manifold (e.g., $S^N$, groups of regular or orthogonal matrices) is called a Lie group.

¹⁷ Let $A$ be a set with two binary operations $+$, $\cdot$ such that
(A1) $A$ with operations $+$, $\cdot$ is a ring,
(A2) $a \cdot a = o$ for all $a \in A$,
(A3) $(a \cdot b) \cdot c + (b \cdot c) \cdot a + (c \cdot a) \cdot b = o$ for all $a, b, c \in A$.
Then $A$ is said to be a Lie ring. If $A$ is, moreover, a linear space, then $A$ is called a Lie algebra. (If $A$ is an associative ring and $[a, b] = a \cdot b - b \cdot a$, then $(A, +, [\cdot, \cdot])$ is a Lie ring.) For more information see, e.g., Adams [1], Bourbaki [15], Bröcker & Dieck [17], Helgason [66].

¹⁸ This is not true for a complex Hilbert space since $(x, \alpha y) = \bar{\alpha}(x, y)$ for $\alpha \in \mathbb{C}$.
Definition 4.3.48. A $p$-linear form $A$ is said to be skew-symmetric if

$$A(x_{\pi(1)}, \dots, x_{\pi(p)}) = \operatorname{sgn} \pi \, A(x_1, \dots, x_p)$$

holds for any permutation $\pi$ of the set $\{1, \dots, p\}$ and all $x_1, \dots, x_p \in X$. Here $\operatorname{sgn} \pi = 1$ if the number of sign changes in $\pi$ (a sign change occurs whenever $i < j$ and $\pi(i) > \pi(j)$) is even and $\operatorname{sgn} \pi = -1$ if this number is odd. The collection of all skew-symmetric $p$-linear forms is denoted by $\Lambda^p(X)$.

Remark 4.3.49.
(i) Let $e_1, \dots, e_M$ be a basis of $X$ and $f^1, \dots, f^M$ its dual basis, i.e., a basis of the space $X^*$ of all linear forms on $X$ for which

$$f^i(e_j) = \delta^i_j = \begin{cases} 1, & i = j, \\ 0, & i \neq j. \end{cases}$$

Then any element $x \in X$ can be expressed in the form

$$x = \sum_{i=1}^{M} f^i(x) e_i.$$

For $A \in \Lambda^p(X)$, $p \le M = \dim X$, we have

$$\begin{aligned}
A(x_1, \dots, x_p) &= \sum_{i_1, \dots, i_p = 1}^{M} f^{i_1}(x_1) \cdots f^{i_p}(x_p) \, A(e_{i_1}, \dots, e_{i_p}) \\
&= \sum_{1 \le i_1 < \dots < i_p \le M} \Biggl[ \sum_{\pi} \operatorname{sgn} \pi \, f^{i_{\pi(1)}}(x_1) \cdots f^{i_{\pi(p)}}(x_p) \Biggr] A(e_{i_1}, \dots, e_{i_p}) \\
&= \sum_{1 \le i_1 < \dots < i_p \le M} \det \bigl( f^{i_j}(x_k) \bigr)_{j,k=1,\dots,p} \, A(e_{i_1}, \dots, e_{i_p}),
\end{aligned}$$

where the inner sum runs over all permutations $\pi$ of $\{1, \dots, p\}$. In particular, if $p = M$, then

$$A(x_1, \dots, x_M) = \det \bigl( f^i(x_k) \bigr)_{i,k=1,\dots,M} \, A(e_1, \dots, e_M), \tag{4.3.30}$$

i.e., $\dim \Lambda^M(X) = 1$. Notice also that $\Lambda^p(X) = \{o\}$ for $p > M$.

(ii) Elements $x_1, \dots, x_p$ of $X$ are linearly dependent if and only if

$$A(x_1, \dots, x_p) = 0 \quad \text{for all } A \in \Lambda^p(X).$$

This follows easily from the formula given above.

The product operation can be defined in the family of skew-symmetric forms.
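Definition 4.3.50. Let $A \in \Lambda^p(X)$, $B \in \Lambda^q(X)$ be skew-symmetric forms. Then their exterior product $A \wedge B$ is the skew-symmetric $(p+q)$-form defined by the formula

$$A \wedge B(x_1, \dots, x_{p+q}) = \sum_{\substack{\pi \\ \pi(1) < \dots < \pi(p) \\ \pi(p+1) < \dots < \pi(p+q)}} \operatorname{sgn} \pi \, A(x_{\pi(1)}, \dots, x_{\pi(p)}) \, B(x_{\pi(p+1)}, \dots, x_{\pi(p+q)}),$$

where the sum runs over the permutations $\pi$ of $\{1, \dots, p+q\}$ satisfying the two monotonicity conditions.

Remark 4.3.51.
(i) The exterior product of three or more skew-symmetric forms is defined by induction and the associative law holds, i.e.,

$$A \wedge B \wedge C := (A \wedge B) \wedge C = A \wedge (B \wedge C).$$

(ii) The exterior product is not commutative. Namely,

$$B \wedge A = (-1)^{pq} A \wedge B \quad \text{for } A \in \Lambda^p(X), \ B \in \Lambda^q(X).$$

Example 4.3.52.
(i) If $A$, $B$ are one-forms (i.e., linear forms), then

$$A \wedge B(x_1, x_2) = A(x_1) B(x_2) - A(x_2) B(x_1).$$

More generally, by induction,

$$A_1 \wedge \dots \wedge A_n(x_1, \dots, x_n) = \det \bigl( A_i(x_j) \bigr)_{i,j=1,\dots,n} \quad \text{for one-forms } A_1, \dots, A_n.$$

(ii) If $e_1, \dots, e_M$ and $f^1, \dots, f^M$ are mutually dual bases of $X$ and $X^*$, respectively, then

$$A = A(e_1, \dots, e_M) \, f^1 \wedge \dots \wedge f^M \quad \text{for any } A \in \Lambda^M(X).$$

In other words, $f^1 \wedge \dots \wedge f^M$ generates $\Lambda^M(X)$ ($\dim X = M$). More generally, the products $(f^{i_1} \wedge \dots \wedge f^{i_n})_{1 \le i_1 < \dots < i_n \le M}$ form a basis of $\Lambda^n(X)$, i.e., any $A \in \Lambda^n(X)$ can be written as

$$A = \sum_{1 \le i_1 < \dots < i_n \le M} a_{i_1, \dots, i_n} \, f^{i_1} \wedge \dots \wedge f^{i_n} \tag{4.3.31}$$

where $a_{i_1, \dots, i_n} = A(e_{i_1}, \dots, e_{i_n})$ (see Remark 4.3.49(i)).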
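The sum in Definition 4.3.50 runs over permutations that are increasing on the first $p$ and on the last $q$ positions, so each such permutation is determined by the choice of its first block. The following sketch enumerates these choices directly and checks the determinant formula of Example 4.3.52(i) on sample data. All names and the three sample one-forms on $\mathbb{R}^3$ are our own illustrative assumptions.

```python
from itertools import combinations


def sign(perm):
    """Sign of a permutation (given as a tuple) via the parity of its inversions."""
    inversions = sum(
        1
        for i in range(len(perm))
        for j in range(i + 1, len(perm))
        if perm[i] > perm[j]
    )
    return -1 if inversions % 2 else 1


def wedge(A, p, B, q):
    """Exterior product of a p-form A and a q-form B (forms are callables)."""
    def product(*xs):
        n = p + q
        total = 0
        # Each choice of the first block fixes one permutation with
        # pi(1) < ... < pi(p) and pi(p+1) < ... < pi(p+q).
        for first in combinations(range(n), p):
            rest = tuple(k for k in range(n) if k not in first)
            total += (
                sign(first + rest)
                * A(*(xs[k] for k in first))
                * B(*(xs[k] for k in rest))
            )
        return total
    return product


def one_form(coeffs):
    """The linear form x -> sum_i coeffs[i] * x[i]."""
    return lambda x: sum(c * xi for c, xi in zip(coeffs, x))


def det3(m):
    """Determinant of a 3x3 matrix given as a list of rows."""
    return (
        m[0][0] * (m[1][1] * m[2][2] - m[1][2] * m[2][1])
        - m[0][1] * (m[1][0] * m[2][2] - m[1][2] * m[2][0])
        + m[0][2] * (m[1][0] * m[2][1] - m[1][1] * m[2][0])
    )


# Hypothetical sample data: three one-forms and three vectors in R^3.
A, B, C = one_form((1, 2, 0)), one_form((0, 1, 3)), one_form((2, 0, 1))
x1, x2, x3 = (1, 0, 2), (0, 1, 1), (3, 1, 0)

AB = wedge(A, 1, B, 1)
ABC = wedge(AB, 2, C, 1)
matrix = [[F(x) for x in (x1, x2, x3)] for F in (A, B, C)]
print(AB(x1, x2), A(x1) * B(x2) - A(x2) * B(x1))
print(ABC(x1, x2, x3), det3(matrix))
```

Each printed pair should consist of two equal numbers. Integer data are used so that the comparison is exact.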
The main goal of this appendix is to investigate skew-symmetric forms on manifolds which are continuous (smooth) with respect to the topological (differential) structure of the manifold. The basic definition is the following:

Definition 4.3.53. Let $\mathcal{M}$ be a differentiable manifold of dimension $M$. A mapping

$$\omega : x \in \mathcal{M} \mapsto \omega(x) \in \Lambda^p(T_x \mathcal{M})$$

is said to be a $p$-differential form on $\mathcal{M}$ if $\omega$ is continuous (or smooth) in the following sense: Let $(\mathcal{V}, \psi)$ be a local chart of $\mathcal{M}$ and let

$$\omega(x) = \sum_{1 \le i_1 < \dots < i_p \le M} a_{i_1, \dots, i_p}(x) \, dy^{i_1} \wedge \dots \wedge dy^{i_p} \tag{4.3.32}$$

be the representation of $\omega$ in this chart (see (4.3.31)). Then all functions $a_{i_1, \dots, i_p}$ are continuous (or smooth) in $\mathcal{V}$.

Remark 4.3.54.
(i) A smooth function $f : \mathcal{M} \to \mathbb{R}$ is sometimes called a differential form of order 0. Its differential $df$ is the one-form with the local representation

$$df(x) = \sum_{i=1}^{M} \frac{\partial (f \circ \varphi)(y)}{\partial y_i} \, dy^i, \quad \varphi(y) = x$$
(see (4.3.20)).

(ii) Let $\omega$ be a $p$-differential form in $\mathbb{R}^N$ with the representation

$$\omega(x) = \sum_{1 \le i_1 < \dots < i_p \le N} a_{i_1, \dots, i_p}(x) \, df^{i_1} \wedge \dots \wedge df^{i_p}$$

($f^1, \dots, f^N$ is the dual basis to the standard one $e_1, \dots, e_N$ in $\mathbb{R}^N$). In accordance with the notation of coordinates in $\mathbb{R}^N$ we can also write

$$\omega(x) = \sum_{1 \le i_1 < \dots < i_p \le N} a_{i_1, \dots, i_p}(x) \, dx_{i_1} \wedge \dots \wedge dx_{i_p}.$$

If $\mathcal{M}$ is a differentiable manifold in $\mathbb{R}^N$ of dimension $M \ge p$, then $\omega$ can be restricted to $\mathcal{M}$. Since $T_x \mathcal{M} \subset \mathbb{R}^N$ (see (4.3.16)) we have

$$\omega(x) \left( \frac{\partial}{\partial y_{j_1}}, \dots, \frac{\partial}{\partial y_{j_p}} \right) = \sum_{1 \le i_1 < \dots < i_p \le N} a_{i_1, \dots, i_p}(x) \det \left( f^{i_k} \left( \frac{\partial}{\partial y_{j_l}} \right) \right)_{k,l=1,\dots,p}.$$

(iii) If $(\mathcal{V}, \psi)$ and $(\tilde{\mathcal{V}}, \tilde{\psi})$ are local charts at the same point, then we have two representations for a $p$-differential form $\omega$ in $\mathcal{V} \cap \tilde{\mathcal{V}}$:

$$\omega(x) = \sum_{1 \le i_1 < \dots < i_p \le M} f_{i_1, \dots, i_p} \, dy^{i_1} \wedge \dots \wedge dy^{i_p}, \qquad \omega(x) = \sum_{1 \le j_1 < \dots < j_p \le M} g_{j_1, \dots, j_p} \, dz^{j_1} \wedge \dots \wedge dz^{j_p}.$$

(Here $dy^1, \dots, dy^M$ is the basis of $(T_x \mathcal{M})^*$ with respect to the local chart $(\mathcal{V}, \psi)$ and similarly $dz^1, \dots, dz^M$ for $(\tilde{\mathcal{V}}, \tilde{\psi})$.) The Transformation Rule (4.3.22) yields a relation between the coefficients $f_{\dots}$ and $g_{\dots}$. This relation is simple for $M$-forms ($M = \dim \mathcal{M}$), namely

$$\omega(x) = g(x) \, dz^1 \wedge \dots \wedge dz^M = g(x) \det \left( \frac{\partial \Phi_j}{\partial y_i}(\psi(x)) \right) dy^1 \wedge \dots \wedge dy^M$$

where $\left( \frac{\partial \Phi_j}{\partial y_i}(y) \right)_{i,j=1,\dots,M}$ is the Jacobi matrix of the transformation $z = \Phi(y)$ of local coordinates (see Figure 4.3.14). The determinant of the Jacobi matrix will be called the Jacobian and denoted by $J_\Phi$ (Example 4.1.5).

This Transformation Rule can be generalized to mappings between manifolds in a way similar to (4.3.19) and (4.3.23). If $g : \mathcal{M} \to \tilde{\mathcal{M}}$ is a smooth map and $\omega$ is a $p$-differential form on $\tilde{\mathcal{M}}$, then the formula

$$g^* \omega(x)(v_1, \dots, v_p) = \omega(g(x))(g_* v_1, \dots, g_* v_p), \tag{4.3.33}$$

$x \in \mathcal{M}$, $v_1, \dots, v_p \in T_x \mathcal{M}$, where $g_* v_i$ is the push-forward of the tangent vector $v_i$ (see (4.3.19)), defines the pull-back of $\omega$. To obtain a local representation of the type (4.3.33) we choose local coordinates at $x$, put $v_k = \frac{\partial}{\partial y_{j_k}}$ and use the Transformation Rule (4.3.19). However, the final formula is rather cumbersome and we will not need it with the exception of the case when $\dim \mathcal{M} = \dim \tilde{\mathcal{M}} = M$ and $\omega$ is an $M$-form, $\omega(z) = f(z) \, dz^1 \wedge \dots \wedge dz^M$. Then

$$g^* \omega(x) = f(g(x)) \, J_G(\psi(x)) \, dy^1 \wedge \dots \wedge dy^M$$
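where $G$ is the local realization of $g$ (see Figure 4.3.15). An important special case is $\varphi^* \omega$ where $\varphi$ is a coordinate mapping $\mathcal{U} \subset \mathbb{R}^M \to \mathcal{V} \subset \mathcal{M}$. The next example shows how to compute the pull-back of $(M-1)$-forms for small $M$. These formulae are often used in vector calculus – see also special cases of integration in Appendix 4.3C.

Example 4.3.55.
(i) Let $\omega(x, y) = f(x, y) \, dx + g(x, y) \, dy$ be a 1-form in $\mathbb{R}^2$ and $\gamma = (\gamma_1, \gamma_2) : (a, b) \to \mathbb{R}^2$ a smooth curve. Then

$$\gamma^* \omega(t) \left( \frac{\partial}{\partial t} \right) = \omega(\gamma(t)) \left( \gamma_* \frac{\partial}{\partial t} \right), \quad \text{i.e.,} \quad \gamma^* \omega(t) = f(\gamma_1(t), \gamma_2(t)) \, \dot{\gamma}_1(t) \, dt + g(\gamma_1(t), \gamma_2(t)) \, \dot{\gamma}_2(t) \, dt.$$

(ii) Let $\omega(x, y, z) = f(x, y, z) \, dy \wedge dz + g(x, y, z) \, dz \wedge dx + h(x, y, z) \, dx \wedge dy$ be a 2-form in $\mathbb{R}^3$ and $\varphi : (u, v) \in \mathcal{U} \subset \mathbb{R}^2 \to \mathbb{R}^3$ a smooth parametrization of a surface $S$ in $\mathbb{R}^3$. Then

$$[\varphi^* \omega(u, v)](e_1, e_2) = \omega(\varphi(u, v)) \bigl( \underbrace{\varphi'(u, v) e_1}_{\frac{\partial}{\partial u}}, \underbrace{\varphi'(u, v) e_2}_{\frac{\partial}{\partial v}} \bigr)$$

and, if

$$\frac{\partial}{\partial u} = u_1 \frac{\partial}{\partial x} + u_2 \frac{\partial}{\partial y} + u_3 \frac{\partial}{\partial z}, \qquad \frac{\partial}{\partial v} = v_1 \frac{\partial}{\partial x} + v_2 \frac{\partial}{\partial y} + v_3 \frac{\partial}{\partial z}$$

(here $\frac{\partial}{\partial x}$ is actually the first vector of the standard basis in $\mathbb{R}^3$ – see Remark 4.3.54(ii)), then

$$dy \wedge dz \left( \frac{\partial}{\partial u}, \frac{\partial}{\partial v} \right) = u_2 v_3 - u_3 v_2, \quad \text{etc.},$$

and eventually

$$\varphi^* \omega \left( \frac{\partial}{\partial u}, \frac{\partial}{\partial v} \right) = \left( (f, g, h), \frac{\partial}{\partial u} \times \frac{\partial}{\partial v} \right)_{\mathbb{R}^3}$$

where the brackets $(\cdot, \cdot)_{\mathbb{R}^3}$ denote the scalar product in $\mathbb{R}^3$ and $\frac{\partial}{\partial u} \times \frac{\partial}{\partial v}$ is the so-called cross (or vector) product of the vectors $\frac{\partial}{\partial u} = (u_1, u_2, u_3)$, $\frac{\partial}{\partial v} = (v_1, v_2, v_3)$ in $\mathbb{R}^3$, i.e.,

$$\frac{\partial}{\partial u} \times \frac{\partial}{\partial v} = (u_2 v_3 - u_3 v_2, \ u_3 v_1 - u_1 v_3, \ u_1 v_2 - u_2 v_1).$$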
Remark 4.3.56. The reader can ask why it is necessary (or reasonable) to introduce differential forms even though vectors and vector fields have been defined. Actually, there is only a technical difference for one-forms, since $T_a \mathcal{M}$ is isomorphic to its dual $(T_a \mathcal{M})^*$. For example, $df(a) \in (T_a \mathcal{M})^*$ and therefore it can be represented by a scalar product in $T_a \mathcal{M}$. Since $T_a \mathcal{M}$ is a linear subspace of $\mathbb{R}^N$ (for $\mathcal{M} \subset \mathbb{R}^N$) we may define the scalar product in $T_a \mathcal{M}$ as

$$(v, w)_{T_a \mathcal{M}} := (v, w)_{\mathbb{R}^N} \quad \text{for } v, w \in T_a \mathcal{M}.^{19}$$

In particular, this means that there is a vector $\nabla f(a)$ – the so-called gradient of $f$ – such that $df(a)(v) = (v, \nabla f(a))_{T_a \mathcal{M}}$. If

$$df(a) = \sum_{i=1}^{M} f_i(a) \, dy^i, \quad \text{then} \quad f_i(a) = \left( \frac{\partial}{\partial y_i}, \nabla f(a) \right)_{T_a \mathcal{M}}.^{20}$$

The reason for distinguishing between differential forms and vector fields lies in the richer structure of the collection of all differential forms – there are operations like the exterior product and the exterior differential (Definition 4.3.57). Moreover, the differential forms

$$\omega^1 = f \, dx + g \, dy + h \, dz \quad \text{and} \quad \omega^2 = f \, dy \wedge dz + g \, dz \wedge dx + h \, dx \wedge dy$$

can be attached to the vector field

$$F = f \frac{\partial}{\partial x} + g \frac{\partial}{\partial y} + h \frac{\partial}{\partial z}.$$

We will see in Appendix 4.3C that the integral of $\omega^1$ along a curve $\gamma$ can be interpreted as work done by the force field $F$ along $\gamma$ and the integral of $\omega^2$ along a surface $S$ has the meaning of the rate at which a fluid flow represented by the velocity field $F$ crosses $S$. Another reason consists in a simplification of various notions and results of classical vector analysis and differential geometry. Examples like orientation, elementary volume and the Stokes Theorem will be shown in Appendix 4.3C.

¹⁹ In this connection see footnote 27 on page 214.

²⁰ We warn that the vectors $\frac{\partial}{\partial y_1}, \dots, \frac{\partial}{\partial y_M}$ need not be orthogonal in $T_a \mathcal{M}$!
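Definition 4.3.57. Let $\mathcal{M}$ be a differentiable manifold of dimension $M$ and let $\omega$ be a smooth $p$-differential form on $\mathcal{M}$ which has the local representation (4.3.32). Then the differential of $\omega$ is the $(p+1)$-differential form $d\omega$ with the local representation

$$d\omega(x) = \sum_{1 \le i_1 < \dots < i_p \le M} da_{i_1, \dots, i_p}(x) \wedge dy^{i_1} \wedge \dots \wedge dy^{i_p}. \tag{4.3.34}$$

Example 4.3.58.
(i) If $f : \mathcal{M} \to \mathbb{R}$ is a differentiable function, i.e., a 0-form, then the differential $df$ given by Remark 4.3.54(i) is the same as that in (4.3.34).

(ii) Let $\omega(x) = f_1(x) \, dx_1 + f_2(x) \, dx_2 + f_3(x) \, dx_3$ be a 1-form on an open set $G$ in $\mathbb{R}^3$ and $(x_1, x_2, x_3)$ the Cartesian coordinates of a point $x$. If $f_1$, $f_2$, $f_3$ are smooth functions on $G$, then

$$\begin{aligned}
d\omega(x) &= df_1(x) \wedge dx_1 + df_2(x) \wedge dx_2 + df_3(x) \wedge dx_3 \\
&= \frac{\partial f_1}{\partial x_1} \underbrace{dx_1 \wedge dx_1}_{=0} + \frac{\partial f_1}{\partial x_2} \, dx_2 \wedge dx_1 + \frac{\partial f_1}{\partial x_3} \, dx_3 \wedge dx_1 + \cdots \\
&= \left( \frac{\partial f_2}{\partial x_1} - \frac{\partial f_1}{\partial x_2} \right) dx_1 \wedge dx_2 + \left( \frac{\partial f_3}{\partial x_2} - \frac{\partial f_2}{\partial x_3} \right) dx_2 \wedge dx_3 + \left( \frac{\partial f_1}{\partial x_3} - \frac{\partial f_3}{\partial x_1} \right) dx_3 \wedge dx_1.
\end{aligned}$$

If we interpret the components $(f_1, f_2, f_3)$ of the form $\omega$ as those of a vector field $v$ (Remark 4.3.56), then the components of $d\omega$, more precisely

$$\left( \frac{\partial f_3}{\partial x_2} - \frac{\partial f_2}{\partial x_3}, \ \frac{\partial f_1}{\partial x_3} - \frac{\partial f_3}{\partial x_1}, \ \frac{\partial f_2}{\partial x_1} - \frac{\partial f_1}{\partial x_2} \right),$$

form the so-called curl of $v$ (notation $\operatorname{curl} v$ or $\nabla \times v$; for the cross product see Example 4.3.55(ii)).

Remark 4.3.59. By computing the differentials $df_{i_1 \dots i_p}$ as in the previous example and rearranging the sum in (4.3.34) to get rid of the zero terms $dy^{i_1} \wedge \dots \wedge dy^{i_{p+1}}$ where two indices coincide, we obtain, e.g., for an $(M-1)$-form

$$\omega(x) = \sum_{i=1}^{M} f_i(x) \, dy^1 \wedge \dots \wedge \widehat{dy^i} \wedge \dots \wedge dy^M$$

(here $\widehat{dy^i}$ means that $dy^i$ is missing),

$$d\omega(x) = \sum_{i=1}^{M} (-1)^{i-1} \frac{\partial (f_i \circ \varphi)}{\partial y_i}(\psi(x)) \, dy^1 \wedge \dots \wedge dy^M.$$

Here $\varphi$, $\psi$ are given by a local chart at $x \in \mathcal{M}$.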
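The coefficients of $d\omega$ for a one-form in $\mathbb{R}^3$ (Example 4.3.58(ii)) are easy to approximate numerically. The sketch below computes them with central differences, listed with respect to $dx_2 \wedge dx_3$, $dx_3 \wedge dx_1$, $dx_1 \wedge dx_2$, i.e., as the components of the curl. The function names and the two sample coefficient fields are our own assumptions, chosen only for illustration.

```python
import math


def d_one_form(f, x, eps=1e-5):
    """Coefficients of d(omega) at x for omega = f1 dx1 + f2 dx2 + f3 dx3,
    listed with respect to dx2^dx3, dx3^dx1, dx1^dx2 (the curl of f)."""
    def partial(i, j):
        # Central difference for d f_i / d x_j (0-based indices).
        plus, minus = list(x), list(x)
        plus[j] += eps
        minus[j] -= eps
        return (f(plus)[i] - f(minus)[i]) / (2 * eps)

    return [
        partial(2, 1) - partial(1, 2),
        partial(0, 2) - partial(2, 0),
        partial(1, 0) - partial(0, 1),
    ]


def rotation_field(x):
    # Hypothetical example: omega = -x2 dx1 + x1 dx2.
    return [-x[1], x[0], 0.0]


def gradient_field(x):
    # Hypothetical example: omega = dF with F = x1^2 x2 + sin(x3).
    return [2 * x[0] * x[1], x[0] ** 2, math.cos(x[2])]


point = [0.4, -0.2, 1.1]
print(d_one_form(rotation_field, point))  # close to [0, 0, 2]
print(d_one_form(gradient_field, point))  # close to [0, 0, 0]
```

The second output illustrates, for one sample function, that the differential of a differential vanishes (the curl of a gradient is zero).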
Example 4.3.58(i) leads to the following question: Are all one-differential forms differentials of smooth functions? In other words, has any (continuous) one-form $\omega$ a "primitive function" $f$, i.e., is there $f$ such that $df = \omega$? A short speculation on one-forms in $\mathbb{R}^2$ suggests obstacles caused by mixed partial derivatives. We investigate this problem in a more general way.

Proposition 4.3.60.
(i) Let $\omega$ be a differential form of the class $C^2$. Then $d^2 \omega := d(d\omega) = 0$.
(ii) Let $\omega$ and $\kappa$ be a $p$-differential form and a $q$-differential form, respectively. Then

$$d(\omega \wedge \kappa) = (d\omega) \wedge \kappa + (-1)^p \, \omega \wedge (d\kappa).$$

Proof. An easy proof is left to the reader. Notice however that the exchangeability of mixed partial derivatives of $C^2$-functions is the crucial point in the statement (i).

Definition 4.3.61. A differential form $\omega$ is said to be
(1) closed if $d\omega = 0$,
(2) exact if there is a differential form $\kappa$ such that $\omega = d\kappa$.

Remark 4.3.62. The concept of exact differential forms is a generalization of the classical notion of the potential of a mapping $f : \mathbb{R}^M \to \mathbb{R}^M$: A function $F : G \to \mathbb{R}$ is called a potential of $f$ in an open set $G \subset \mathbb{R}^M$ if

$$F'(x) h = (f(x), h)_{\mathbb{R}^M}, \quad x \in G, \quad h \in \mathbb{R}^M.$$

In particular, if $F$ is a potential of a $C^1$-function $f$, then

$$\frac{\partial f_i}{\partial x_j}(x) = \frac{\partial f_j}{\partial x_i}(x), \quad i, j = 1, \dots, M, \quad x \in G.$$

The following example shows that this necessary condition is not sufficient.

Example 4.3.63. Let $G = \mathbb{R}^2 \setminus \{(0, 0)\}$ and let

$$\omega(x, y) = -\frac{y}{x^2 + y^2} \, dx + \frac{x}{x^2 + y^2} \, dy$$

be a 1-form in $G$. This form is closed in $G$. Suppose now that $\omega$ is exact, i.e., there is a function $f : G \to \mathbb{R}$ such that $df = \omega$, in particular,

$$\frac{\partial f}{\partial x} = -\frac{y}{x^2 + y^2}, \qquad \frac{\partial f}{\partial y} = \frac{x}{x^2 + y^2} \quad \text{in } G.$$

Integrating, we obtain

$$f(x, y) = \begin{cases} -\arctan \dfrac{x}{y} + C(y) & \text{for } (x, y) \in G, \ y \neq 0, \\[2mm] \arctan \dfrac{y}{x} + D(x) & \text{for } (x, y) \in G, \ x \neq 0. \end{cases}$$
Since $\arctan z + \arctan \frac{1}{z} = \pm \frac{\pi}{2}$ for $z \neq 0$ (with the sign of $z$), we have $C(y) - D(x) = \pm \frac{\pi}{2}$, i.e., $C$ and $D$ are constant functions in all quadrants. Taking limits for $x \to 0^{\pm}$, $y \to 0^{\pm}$ we arrive at a contradiction, i.e., $\omega$ is not exact.

The reader can ask how we have found this example. The problem is more transparent if $\mathbb{R}^2$ is identified with the complex plane. If

$$F(z) = \frac{1}{z}, \quad z = x + iy,$$

then

$$\operatorname{Re} F(x + iy) = \frac{x}{x^2 + y^2}, \qquad \operatorname{Im} F(x + iy) = -\frac{y}{x^2 + y^2}.$$

It is well known that there is no (holomorphic) function $\Phi$ such that

$$\Phi'(z) = \frac{1}{z} \quad \text{for all } z \in \mathbb{C} \setminus \{0\}$$

(consider $\int_{S^1} \frac{dz}{z}$). In the theory of functions of a complex variable a primitive function can be constructed by a curve integral. We will use the same approach in constructing a "primitive form" to a differential form. This is the main idea of the proof of the following basic result.
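The obstruction in Example 4.3.63 can also be seen numerically: by the pull-back formula of Example 4.3.55(i), the integral of $\omega$ over a circle around the origin is the integral of $f \dot{\gamma}_1 + g \dot{\gamma}_2$ over $[0, 2\pi]$, and for an exact form this would have to vanish. The sketch below (function names and sample points are our own assumptions) approximates this integral by the midpoint rule and checks the closedness condition $\partial g / \partial x = \partial f / \partial y$ by central differences.

```python
import math


def omega(x, y):
    """Coefficients (f, g) of omega = f dx + g dy from Example 4.3.63."""
    r2 = x * x + y * y
    return (-y / r2, x / r2)


def circle_integral(radius, n=20000):
    """Midpoint-rule value of the integral of omega over the circle of the given radius."""
    h = 2 * math.pi / n
    total = 0.0
    for k in range(n):
        t = (k + 0.5) * h
        x, y = radius * math.cos(t), radius * math.sin(t)
        dx, dy = -radius * math.sin(t), radius * math.cos(t)  # gamma'(t)
        f, g = omega(x, y)
        total += (f * dx + g * dy) * h
    return total


def closedness_defect(x, y, eps=1e-5):
    """Central-difference value of dg/dx - df/dy, which vanishes for a closed form."""
    dg_dx = (omega(x + eps, y)[1] - omega(x - eps, y)[1]) / (2 * eps)
    df_dy = (omega(x, y + eps)[0] - omega(x, y - eps)[0]) / (2 * eps)
    return dg_dx - df_dy


print(closedness_defect(0.3, -0.7))             # close to 0
print(circle_integral(1.0), 2 * math.pi)        # both close to 6.2832
print(circle_integral(0.25))                    # the same value for another radius
```

The integral equals $2\pi$ for every radius, so no function $f$ with $df = \omega$ can exist on the whole punctured plane.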
Theorem 4.3.64 (H. Poincaré). Any closed differential form on a differentiable manifold is locally exact.

Proof. Let $\omega$ be a closed $p$-form on an $M$-dimensional manifold $\mathcal{M}$ ($1 \le p \le M$). We choose a local chart $(\mathcal{V}, \psi)$ such that $P\psi(\mathcal{V}) = \mathcal{U}$ is an open ball in $\mathbb{R}^M$ with center at the origin. The pull-back (see Remark 4.3.54(iii)) $\Omega := \varphi^* \omega$ ($\varphi = (P\psi)^{-1}$) is a $p$-form in $\mathcal{U}$. We define a $(p-1)$-form $\sigma$ on $\mathcal{U}$ by the formula

$$\sigma(y)(v_1, \dots, v_{p-1}) = \int_0^1 t^{p-1} \, \Omega(ty)(y, v_1, \dots, v_{p-1}) \, dt$$

for $y \in \mathcal{U}$, $v_1, \dots, v_{p-1} \in T_y \mathcal{U}$.²¹ We have to show that
(i) the integral exists (this fact follows from the continuity of $t \mapsto \Omega(ty)$);
(ii) $\sigma$ is a $(p-1)$-differential form on $\mathcal{U}$ (the skew-symmetry of $\sigma$ follows from the same property of $\Omega$);
(iii) $d\sigma(y) = \Omega(y)$ for $y \in \mathcal{U}$.

Verification of the last statement is technically complicated. The case $p = 1$ is more transparent, and therefore we will give the computation only for this case. For the induction step the reader can consult, e.g., Sternberg [124, Theorem III.4.1], Cartan [21, Theorem II.3.2.12.1] or Taylor [127, Theorem 1.13.2]. Suppose that $\Omega$ has in $\mathcal{U}$ the form

$$\Omega(y) := (\varphi^* \omega)(y) = \sum_{i=1}^{M} g_i(y) \, df^i, \quad y \in \mathcal{U},$$

where $f^1, \dots, f^M$ is the dual basis to the standard one $e_1, \dots, e_M$ in $\mathbb{R}^M$. We wish to show that the function

$$\sigma(y) := \sum_{i=1}^{M} \int_0^1 g_i(ty) \, y_i \, dt, \quad y = (y_1, \dots, y_M) \in \mathcal{U}, \tag{4.3.35}$$

has the differential $d\sigma(y) = \Omega(y)$, i.e.,

$$\frac{\partial \sigma}{\partial y_j}(y) = g_j(y), \quad j = 1, \dots, M.$$

By differentiating the integral (4.3.35) with respect to the parameter $y_j$ we obtain

$$\frac{\partial \sigma}{\partial y_j}(y) = \int_0^1 g_j(ty) \, dt + \sum_{i=1}^{M} \int_0^1 \frac{\partial g_i}{\partial y_j}(ty) \, t y_i \, dt = \int_0^1 g_j(ty) \, dt + \sum_{i=1}^{M} \int_0^1 \frac{\partial g_j}{\partial y_i}(ty) \, t y_i \, dt.$$

For the last equality we have used the assumption $d\omega = 0$ and Exercise 4.3.71(iv):

$$d\Omega = d(\varphi^* \omega) = \varphi^*(d\omega) = 0$$

and, consequently,

$$\frac{\partial g_i}{\partial y_j} = \frac{\partial g_j}{\partial y_i}, \quad i, j = 1, \dots, M.$$

Using integration by parts we get

$$\int_0^1 g_j(ty) \, dt = \bigl[ t g_j(ty) \bigr]_{t=0}^{t=1} - \int_0^1 t \frac{d}{dt} g_j(ty) \, dt = g_j(y) - \sum_{i=1}^{M} \int_0^1 t \frac{\partial g_j}{\partial y_i}(ty) \, y_i \, dt.$$

Combining the last two formulae gives $\frac{\partial \sigma}{\partial y_j}(y) = g_j(y)$. If we put $f(x) = \sigma(y)$ for $x = \varphi(y)$, then $df = \omega$.

²¹ Here we identify the point $y \in \mathcal{U}$ with the vector $y \in T_y \mathcal{U} = \mathbb{R}^M$.
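The case $p = 1$ of the proof is constructive: the potential is $\sigma(y) = \sum_i \int_0^1 g_i(ty)\, y_i\, dt$, cf. (4.3.35). The following sketch evaluates this integral with the composite Simpson rule for one sample closed form on $\mathbb{R}^2$ and compares the numerical gradient of $\sigma$ with $g$. The coefficient functions, the names and the choice of quadrature rule are our own assumptions.

```python
def simpson(func, a, b, n=200):
    """Composite Simpson rule with an even number n of subintervals."""
    h = (b - a) / n
    total = func(a) + func(b)
    for k in range(1, n):
        total += (4 if k % 2 else 2) * func(a + k * h)
    return total * h / 3


def potential(g, y):
    """sigma(y) = sum_i int_0^1 g_i(t y) y_i dt, cf. (4.3.35)."""
    def integrand(t):
        values = g([t * yi for yi in y])
        return sum(gi * yi for gi, yi in zip(values, y))

    return simpson(integrand, 0.0, 1.0)


def g(y):
    # Hypothetical closed one-form on R^2: dg_1/dy_2 = dg_2/dy_1 = 2 y_1.
    return [2 * y[0] * y[1], y[0] ** 2 + 3 * y[1] ** 2]


def gradient(func, y, eps=1e-5):
    """Central-difference gradient of a scalar function of a list argument."""
    grad = []
    for j in range(len(y)):
        plus, minus = list(y), list(y)
        plus[j] += eps
        minus[j] -= eps
        grad.append((func(plus) - func(minus)) / (2 * eps))
    return grad


y0 = [0.5, -1.2]
print(potential(g, y0), y0[0] ** 2 * y0[1] + y0[1] ** 3)  # sigma and y1^2 y2 + y2^3
print(gradient(lambda y: potential(g, y), y0), g(y0))      # d(sigma) and g
```

For this particular $g$ the integral can be computed by hand and gives $\sigma(y) = y_1^2 y_2 + y_2^3$, which is what the first printed pair compares.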
Remark 4.3.65. The proof of the case $p = 1$ shows that there exists a potential $\sigma$ of a smooth mapping $g = (g^1, \dots, g^M) : \mathcal{U} \to \mathbb{R}^M$ in a ball $\mathcal{U}$ provided the symmetry conditions

$$\frac{\partial g^i}{\partial y_j} = \frac{\partial g^j}{\partial y_i}, \quad i, j = 1, \dots, M,$$

hold. Example 4.3.63 suggests that certain topological properties of $\mathcal{U}$ are necessary if $\mathcal{U}$ is not a ball. In the proof of the previous theorem the potential $\sigma$ was defined by the curve integral²²

$$\sigma(y) = \int_0^1 (g(\gamma(t)), \dot{\gamma}(t))_{\mathbb{R}^M} \, dt =: \int_{\gamma_{o,y}} \Omega$$

along the curve $\gamma_{o,y} = \{ty : t \in [0, 1]\}$. The crucial point in the direct computation of the Fréchet derivative of $\sigma$ is an estimate of the difference $\sigma(y + h) - \sigma(y)$. If the curve integral depends only on the initial and terminal points and not on the path which joins these points, then

$$\sigma(y + h) - \sigma(y) = \int_{\gamma_{y,y+h}} \Omega = \int_0^1 (g(y + th), h)_{\mathbb{R}^M} \, dt = (g(y), h)_{\mathbb{R}^M} + o(\|h\|_{\mathbb{R}^M})$$

provided $g$ is continuously differentiable. Example 4.3.63 can be easily adapted to show that the independence of the curve integral on the path in $\mathcal{U}$ implies that $\mathcal{U}$ is not punctured.

There is another way to express this observation. It consists in considering the obstacles preventing a closed form from being exact. Assume that $\mathcal{M}$ is a connected manifold and denote the group (with respect to pointwise addition) of closed $p$-differential forms on $\mathcal{M}$ by $Z^p(\mathcal{M})$ and the subgroup of exact $p$-forms by $B^p(\mathcal{M})$. The quotient

$$H^p(\mathcal{M}) := Z^p(\mathcal{M}) \,|\, B^p(\mathcal{M})$$

is called the $p$-(de Rham) cohomology group of $\mathcal{M}$. If $H^1(\mathcal{M})$ is trivial, i.e., any closed one-form in $\mathcal{M}$ is exact, then $\mathcal{M}$ is said to be simply connected. More details on the role of cohomology groups in the study of differentiable manifolds can be found, e.g., in Whitney [133, Chapter IV]. The calculation of cohomology groups is by no means trivial.

Example 4.3.66.
(i) Let $f : \mathcal{M} \to \mathcal{N}$ be a smooth map. If $\omega \in Z^p(\mathcal{N})$, then

$$d(f^* \omega) = f^*(d\omega) = 0, \quad \text{i.e.,} \quad f^* \omega \in Z^p(\mathcal{M})$$

(Exercise 4.3.71(iv)). Similarly, if $\omega \in B^p(\mathcal{N})$, then $f^* \omega \in B^p(\mathcal{M})$. This means that $f$ induces a linear map $f^* : H^p(\mathcal{N}) \to H^p(\mathcal{M})$. In particular, if $f$ is a diffeomorphism of $\mathcal{M}$ onto $\mathcal{N}$, then $H^p(\mathcal{M})$ is isomorphic to $H^p(\mathcal{N})$.

(ii) Using the previous example we can show that $H^1(S^1)$ is isomorphic to $\mathbb{R}$. Instead of $H^1(S^1)$ it is sufficient to compute $H^1(\mathbb{R}|\mathbb{Z})$: Denote by $i$ the natural projection of $\mathbb{R}$ onto $\mathbb{R}|\mathbb{Z}$ and consider a closed one-form $\omega$ on $\mathbb{R}|\mathbb{Z}$. Then $f(x) \, dx := i^* \omega(x)$ where $f$ is a 1-periodic function. Define

$$\vartheta(\omega) = \int_0^1 f(x) \, dx.$$

It is easy to see that $\vartheta(\omega) = 0$ if and only if $\omega$ is exact, and also that $\vartheta$ maps $Z^1(\mathbb{R}|\mathbb{Z})$ onto $\mathbb{R}$. This shows that $\vartheta$ induces an isomorphism of $H^1(\mathbb{R}|\mathbb{Z})$ onto $\mathbb{R}$.

²² The definition of the curve integral and of the integral of a differential form is given in the next Appendix 4.3C.
Now we explain the notion of a simply connected domain in another way which will be important in the sequel (e.g., in the degree theory – Appendix 4.3D).

Definition 4.3.67. Let $X$, $Y$ be metric (topological) spaces. Continuous maps $f, g : X \to Y$ are called homotopic if there exists a continuous map $\Phi : X \times [0, 1] \to Y$ such that

$$\Phi(\cdot, 0) = f(\cdot), \qquad \Phi(\cdot, 1) = g(\cdot).$$

Such $\Phi$ is said to be a homotopy between $f$ and $g$.

Remark 4.3.68. The relation of being homotopic is clearly an equivalence relation on continuous maps. The set of all continuous maps $C(X, Y)$ is therefore divided into disjoint classes of mutually homotopic maps. We denote the class containing $f$ by $[f]$. Here we are using the homotopy concept mainly for curves. The reader can imagine that curves $\gamma_0, \gamma_1 : [0, 1] \to \mathcal{M}$ are homotopic if $\gamma_0$ may be continuously deformed (in $\mathcal{M}$!) into $\gamma_1$. A curve $\gamma$ is called null-homotopic if $\gamma$ is homotopic to a constant curve (point) $\tilde{\gamma} : t \in [0, 1] \mapsto a \in \mathcal{M}$. In particular, this is important for closed curves ($\gamma(0) = \gamma(1)$). To see this, choose a fixed point $a \in \mathcal{M}$ and define

$$H_1(\mathcal{M}) := \{[\gamma] : \gamma : [0, 1] \to \mathcal{M} \text{ is continuous}, \ \gamma(0) = \gamma(1) = a\}.$$

$H_1(\mathcal{M})$ forms a group – the so-called fundamental group of $\mathcal{M}$ – under "multiplication" $\gamma = \gamma_2 \cdot \gamma_1$ defined by

$$\gamma(t) = \begin{cases} \gamma_1(2t), & t \in \left[0, \tfrac{1}{2}\right], \\ \gamma_2(2t - 1), & t \in \left[\tfrac{1}{2}, 1\right] \end{cases}$$

(notice that the definition $[\gamma_2] \cdot [\gamma_1] := [\gamma_2 \cdot \gamma_1]$ is correct). If $\mathcal{M}$ is path-connected, i.e., for any $a, b \in \mathcal{M}$ there exists a continuous curve $\gamma$ in $\mathcal{M}$ such that $\gamma(0) = a$, $\gamma(1) = b$, then the fundamental group of $\mathcal{M}$ does not depend on the choice of the point $a$. Whenever the fundamental group is trivial, i.e., any closed curve can be continuously deformed into a point, then there are no holes in $\mathcal{M}$ and the integral of a closed one-form along a closed curve is zero, i.e., this integral does not depend on the path (Remark 4.3.65).

Cohomology and homotopy groups belong to the main tools in algebraic topology. The reader who is interested in these techniques can consult the corresponding textbooks, e.g., Adams [1], Dold [36], Greenberg [61], Kosniowski [77] (here, in Chapter 26, you can find applications of fundamental groups to the classification of two-dimensional compact connected surfaces; for example, such a surface is simply connected if and only if it is homeomorphic to $S^2$), Spanier [122].
At the end of this appendix we link differential one-forms and systems of differential equations and continue the discussion from Appendix 4.3A. To simplify it we assume that

$$\alpha = \sum_{i=1}^{M} \gamma_i \, dx_i$$

is a non-vanishing smooth one-form in an open set $G \subset \mathbb{R}^M$. The form $\alpha$ is uniquely determined by its kernel up to a multiplication factor. Let $v^1, \dots, v^{M-1}$ be a basis of this kernel, i.e., $v^1, \dots, v^{M-1}$ are vector fields on $G$ which are linearly independent at each point $x \in G$ and annihilate $\alpha$. The equation

$$\alpha = 0 \tag{4.3.36}$$

is called the exterior differential equation in $G$ and its solution is a mapping $T : x \in G \mapsto$ a subspace $T(x) \subset T_x G$ such that

$$\alpha(v) = 0 \quad \text{for all } v \in T(x).$$
A submanifold $S$ of $G$ is said to be an integral manifold for the equation (4.3.36) if

$$\dim S = M - 1 \quad \text{and} \quad \varphi^* \alpha(y) = 0 \quad \text{for all } y \in S$$

where $\varphi$ is a local parametrization of $S$ in a neighborhood of $y$. This integral manifold is the same object as that at the end of Appendix 4.3A, i.e.,

$$i_*(T_y S) = \operatorname{Lin}\{v^1(y), \dots, v^{M-1}(y)\}$$

where $i : S \to G$ is an embedding. The notion of the exterior differential equation can be generalized to a system

$$\alpha^j = 0, \quad j = 1, \dots, k, \tag{4.3.37}$$

where $\alpha^1, \dots, \alpha^k$ are differential forms on a manifold $\mathcal{M}$ not necessarily of the same order. In the special case when $\alpha^1, \dots, \alpha^k$ are linearly independent one-forms (then (4.3.37) is the so-called Pfaff system), the intersection of their kernels has a basis formed by $M - k$ vector fields. The Frobenius Theorem for the existence of an integral manifold for (4.3.37) has the following form.

Theorem 4.3.69 (Frobenius, the differential forms version). Let $\alpha^1, \dots, \alpha^k$ be smooth differential one-forms on a differentiable manifold $\mathcal{M}$. The necessary and sufficient condition for the existence of an integral manifold in a vicinity of any point of $\mathcal{M}$ is that

$$d\alpha^i \wedge \alpha^1 \wedge \dots \wedge \alpha^k = 0, \quad i = 1, \dots, k.$$
Proof. The equivalence of Theorem 4.3.42 and Theorem 4.3.69 is not difficult to prove, and the case $k = 1$, $\dim \mathcal{M} = 3$ is rather instructive. In this case a connection to the Poincaré Theorem 4.3.64 and its proof should also be recognized.

Exercise 4.3.70. Denote by $O(N)$ the set of all regular linear mappings $A : \mathbb{R}^N \to \mathbb{R}^N$ for which $A^{-1} = A^*$ and by $SO(N)$ the set $\{A \in O(N) : \det A = 1\}$.
(i) As a subset of $\mathbb{R}^{N \times N}$ the set $O(N)$ is a differentiable manifold. Find its dimension.
Hint. Consider $A \mapsto A^* A$. The dimension is $\frac{N(N-1)}{2}$.
(ii) Show that $SO(N)$ is the component of $O(N)$ containing the identity.
(iii) Show that $A \in O(N)$ induces a mapping of $S^{N-1}$ into itself.
(iv) Let $\omega$ be a one-form on $S^2$ which is invariant under $SO(3)$, i.e.,

$$A^* \omega = \omega \quad \text{for all } A \in SO(3).$$

Prove that $\omega = 0$.
(v) Does a result analogous to (iv) hold for a two-form $\omega$ on $\mathbb{R}^3$?

Exercise 4.3.71. Prove the following properties of the pull-back operation:
(i) $g^*(\omega \wedge \kappa) = (g^* \omega) \wedge (g^* \kappa)$,
(ii) $(h \circ g)^* \omega = g^*(h^* \omega)$,
(iii) $g^*(df) = d(f \circ g)$ for $f : \mathcal{M} \to \mathbb{R}$,²³
(iv) $g^*(d\omega) = d(g^* \omega)$
where $d$ denotes the differential.

²³ If we interpret $f$ as a 0-form, then the notation $g^* f$ instead of $f \circ g$ is more agreeable.
Exercise 4.3.72. Let $\mathcal{M}$ be the open unit ball in $\mathbb{R}^2$ without its origin. Show that $H_1(\mathcal{M})$ is isomorphic to $\mathbb{Z}$. Is it also true in $\mathbb{R}^3$?

Exercise 4.3.73. Show that $H_1(S^1)$ is isomorphic to $\mathbb{Z}$.
Hint. Use an approach similar to that in Example 4.3.66(ii), and instead of the mapping $\vartheta$ show that there is a lifting $\tilde{\gamma}$ of a continuous closed curve $\gamma : [0, 1] \to \mathbb{R}|\mathbb{Z}$, i.e., $\tilde{\gamma} : [0, 1] \to \mathbb{R}$ continuous such that $i(\tilde{\gamma}) = \gamma$, $\tilde{\gamma}(0) = 0$. Now consider $\tilde{\gamma}(1)$ (actually this is the degree of $\gamma$ – see Appendix 4.3D). For details see, e.g., Kosniowski [77, Chapter 16].
4.3C Integration on Manifolds

We have met the curve integral in the previous Appendix 4.3B. There are two objects which can be integrated along curves: functions and differential one-forms. The situation with functions is simple. If $\mathcal{M}$ is an $M$-dimensional differentiable manifold in $\mathbb{R}^N$, $f : \mathcal{M} \to \mathbb{R}$ is a continuous function and $\gamma : [a, b] \to \mathcal{M}$ is a smooth curve, then we define

$$\int_\gamma f \, d\gamma := \int_a^b f(\gamma(t)) \, \|\dot{\gamma}(t)\| \, dt. \tag{4.3.38}$$

The Euclidean norm $\|\dot{\gamma}(\cdot)\|$ of tangent vectors expresses here a quantity which could be viewed as the "infinitesimal" length of $\gamma$. Recall in this context the formula for the length of $\gamma$:

$$l(\gamma) := \int_a^b \|\dot{\gamma}(t)\| \, dt.$$

We will return to the length and area of a nonlinear object later in this appendix. The integral on the right-hand side of (4.3.38) is the Riemann integral and consequently it has reasonable properties. It could be generalized to some noncontinuous functions (via the Lebesgue integral) and/or to certain non-smooth curves (piecewise smooth or with bounded variation via the Riemann–Stieltjes integral). Since we are not interested in these generalizations we always assume that all objects are as smooth as we need (manifolds at least of the class $C^1$, functions, vector fields, differential forms at least continuous).

The situation with integration of one-forms is different. Namely, differential forms are defined only on manifolds (recall that an open subset of $\mathbb{R}^M$ is also a manifold) and curves need not be manifolds (see Figure 4.3.2). There are two possibilities to avoid these obstacles: either to assume that $\gamma$ lies on a manifold where the one-form is defined or to restrict integration to curves which are themselves manifolds. We now examine the first possibility and postpone the other one to Definition 4.3.86. For the definition of the integral of a one-form given in Remark 4.3.54 we have assumed that the whole curve $\gamma$ lies in one chart to get the same representation of the form at all points of $\gamma$. If more charts are needed to cover the curve we have to be careful not to integrate over some parts of the curve several times. To eliminate this risk the following tool is very useful. In order to build it up we need a topological interlude.
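Formula (4.3.38) and the length formula are ordinary Riemann integrals, so they can be approximated by any quadrature rule. The following sketch uses the midpoint rule for one turn of a helix in $\mathbb{R}^3$. The curve, the integrand and all names are our own illustrative assumptions, and the derivative of the curve is supplied by hand.

```python
import math


def curve_integral(f, gamma, gamma_dot, a, b, n=2000):
    """Midpoint-rule approximation of int_a^b f(gamma(t)) * |gamma'(t)| dt."""
    h = (b - a) / n
    total = 0.0
    for k in range(n):
        t = a + (k + 0.5) * h
        speed = math.sqrt(sum(c * c for c in gamma_dot(t)))  # Euclidean norm
        total += f(gamma(t)) * speed * h
    return total


def helix(t):
    # Hypothetical sample curve: one turn of a helix in R^3.
    return [math.cos(t), math.sin(t), t]


def helix_dot(t):
    # Derivative of the sample curve, computed by hand.
    return [-math.sin(t), math.cos(t), 1.0]


length = curve_integral(lambda x: 1.0, helix, helix_dot, 0.0, 2 * math.pi)
height = curve_integral(lambda x: x[2], helix, helix_dot, 0.0, 2 * math.pi)
print(length, 2 * math.pi * math.sqrt(2))      # length of the curve
print(height, 2 * math.pi ** 2 * math.sqrt(2)) # integral of the third coordinate
```

Taking $f \equiv 1$ recovers the length $l(\gamma)$. Since $\|\dot{\gamma}\| = \sqrt{2}$ along the helix, both integrals can be computed in closed form, which gives the reference values printed next to the approximations.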
Definition 4.3.74. Let (Vn , ψn )n∈N be an atlas of a differentiable manifold M . Let {αn }n∈N be a collection of smooth (often C ∞ ) nonnegative functions on M which have the following properties: (1) for all n ∈ N the support of αn defined by supp αn {x ∈ M : αn (x) = 0} is a compact subset of Vn ; ∞ αn (x) = 1 for all x ∈ M . (2) n=1
Then {αn }n∈N is said to be a partition of unity subordinate to {Vn }n∈N .24 Since M ⊂ RN is separable in the induced topology a countable atlas always exists. It is also possible to construct a sequence {Gn }∞ n=1 of open subsets of M such that Gn ⊂ int Gn+1 , 25
Gn
is compact,
and
M =
∞
Gn .
n=1
For example, $G_n$ can be chosen as the intersection of $\mathcal{M}$ with the open ball centered at $o$ with radius $n$. For the construction of a partition of unity the following topological device is convenient. We will need various types of balls so $B(a; r)$ will denote the open ball in $\mathbb{R}^M$ ($M = \dim \mathcal{M}$).
Lemma 4.3.75. Let $\{W_\alpha\}_{\alpha\in A}$ be an open covering of an $M$-dimensional manifold $\mathcal{M}$ in $\mathbb{R}^N$. Then there is a countable open covering $\{V_m\}_{m\in\mathbb{N}}$ of $\mathcal{M}$ with the following properties:
(i) $\{V_m\}_{m\in\mathbb{N}}$ is subordinate to $\{W_\alpha\}_{\alpha\in A}$, i.e., for each $m \in \mathbb{N}$ there is an index $\alpha_m \in A$ such that $V_m \subset W_{\alpha_m}$;
(ii) there are smooth mappings $\varphi_m : B(o; 2) \to V_m$ such that $\bigcup_{m=1}^{\infty} \varphi_m(B(o; 1)) = \mathcal{M}$;
(iii) the collection $\{V_m\}_{m\in\mathbb{N}}$ forms a locally finite system, i.e., any point $x \in \mathcal{M}$ has a neighborhood which intersects only a finite number of the sets $V_m$.
Proof. Choose a sequence $\{G_n\}_{n=1}^{\infty}$ of open subsets of $\mathcal{M}$ which has the property stated prior to this lemma. Put in addition $G_0 = G_{-1} = \emptyset$. The main idea behind the forthcoming construction is that the compact sets
\[ K_n := \overline{G_n} \setminus G_{n-1}, \qquad n \in \mathbb{N}, \]
cover $\mathcal{M}$ and the larger open sets
\[ H_n := G_{n+1} \setminus \overline{G_{n-2}}, \qquad n \in \mathbb{N}, \]
24 A partition of unity is defined in topology in a more general way; see the corresponding textbooks, e.g., Dugundji [43, Chapter VIII].
25 We wish to point out that topological notions (like interior) are taken here with respect to the topology of $\mathcal{M}$, i.e., $G \subset \mathcal{M}$ is open provided there is an open set $H \subset \mathbb{R}^N$ such that $G = \mathcal{M} \cap H$.
Chapter 4. Local Properties of Differentiable Mappings
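[Figure 4.3.18. Construction of the map $\varphi^n_x$: the ball $B(o; 2)$ is mapped by $z \mapsto y + \frac{\delta}{2} z$ into $B(y; \delta)$ and then by $(P\psi)^{-1}$ into $H_n^\alpha$; the image of $B(o; 1)$ is $V_x^n$.]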
form a locally finite system. Fix $n \in \mathbb{N}$. Let $(V, \psi)$ be a local chart at $x \in H_n^\alpha := H_n \cap W_\alpha$. Put $y = P\psi(x)$. There is a ball $B(y; \delta) \subset P\psi(V \cap H_n^\alpha)$. We now shift the center $y$ to the origin and expand the ball appropriately, namely we put
\[ \varphi_x^n(z) = (P\psi)^{-1}\left(y + \frac{\delta}{2} z\right) \qquad \text{for } z \in B(o; 2). \]
With the help of these smooth maps $\varphi_x^n$ we return back to the manifold by setting $V_x^n = \varphi_x^n(B(o; 1))$ (see Figure 4.3.18). Notice that $V_x^n$ is open in $\mathcal{M}$. The open sets $\{V_x^n\}_{x \in H_n^\alpha,\, \alpha \in A}$ cover the compact set $K_n$. We choose a finite subcovering $V_{x_1}^n, \dots, V_{x_{k_n}}^n$. The collection $\{V_{x_j}^n\}_{j=1,\dots,k_n,\ n\in\mathbb{N}}$ covers $\mathcal{M}$, and $\{\varphi_{x_j}^n(B(o; 2))\}_{j=1,\dots,k_n,\ n\in\mathbb{N}}$ is the desired locally finite countable system $\{V_m\}_{m\in\mathbb{N}}$.
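Theorem 4.3.76. For any atlas $(W_\alpha, \psi_\alpha)_{\alpha\in A}$ of a manifold $\mathcal{M}$ there exists a subordinate partition of unity.
Proof. According to the previous Lemma 4.3.75 we choose a locally finite subordinate covering $\{V_k\}_{k\in\mathbb{N}}$ of $\mathcal{M}$ and the corresponding functions $\{\varphi_k\}_{k=1}^{\infty}$. It is easy to show that the function
\[ \eta(y) = \begin{cases} e^{-\frac{1}{1-\|y\|^2}}, & \|y\| < 1, \\ 0, & \|y\| \ge 1, \end{cases} \qquad y \in \mathbb{R}^M, \]
is a $C^\infty$-function in $\mathbb{R}^M$. Put
\[ \beta_k(x) = \begin{cases} \eta(y), & x \in V_k,\ x = \varphi_k(y), \\ 0, & x \in \mathcal{M} \setminus V_k. \end{cases} \]
Then $\beta_k$ is smooth (of the same order as $\mathcal{M}$) and
\[ \beta_k(x) > 0 \qquad \text{for } x \in \varphi_k(B(o; 1)). \]
Since $\{V_k\}_{k\in\mathbb{N}}$ is a locally finite system, the series $\sum_{k=1}^{\infty} \beta_k(x)$ has only a finite number of nonzero terms. Moreover, $\sum_{k=1}^{\infty} \beta_k(x) > 0$ for all $x \in \mathcal{M}$ due to $\bigcup_{k=1}^{\infty} \varphi_k(B(o; 1)) = \mathcal{M}$. It is now sufficient to put
\[ \alpha_k(x) = \frac{\beta_k(x)}{\sum_{n=1}^{\infty} \beta_n(x)}, \qquad k \in \mathbb{N}, \]
to obtain the desired partition of unity.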
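The normalization step of this proof can be tried out numerically in the simplest case $M = 1$. The following is only a sketch under assumptions that are not in the text: the manifold is replaced by the interval $[-3, 3]$, and the maps $\varphi_k(y) = c_k + Ry$ with centers `CENTERS` and radius `R` are hypothetical choices made for illustration.

```python
import math

def eta(y):
    """Bump function of the proof for M = 1: exp(-1/(1 - y^2)) if |y| < 1, else 0."""
    if abs(y) >= 1.0:
        return 0.0
    return math.exp(-1.0 / (1.0 - y * y))

# Hypothetical data: phi_k(y) = c_k + R*y maps B(0; 2) onto V_k = (c_k - 2R, c_k + 2R),
# and the smaller images phi_k(B(0; 1)) = (c_k - R, c_k + R) cover the interval [-3, 3].
R = 0.5
CENTERS = [k * R for k in range(-8, 9)]

def beta(k, x):
    """beta_k(x) = eta(phi_k^{-1}(x)); it vanishes outside phi_k(B(0; 1))."""
    return eta((x - CENTERS[k]) / R)

def alphas(x):
    """Normalized functions alpha_k(x) = beta_k(x) / sum_n beta_n(x)."""
    b = [beta(k, x) for k in range(len(CENTERS))]
    total = sum(b)  # positive on [-3, 3] because the smaller images cover it
    return [v / total for v in b]

if __name__ == "__main__":
    for x in (-3.0, -0.3, 0.0, 1.25, 2.9):
        a = alphas(x)
        # print the point, the sum of all alpha_k(x), and the number of nonzero terms
        print(x, sum(a), sum(1 for v in a if v > 0.0))
```

At every sample point only finitely many (here at most two) of the functions are nonzero, which mirrors the local finiteness used in the proof.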
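We can now return to the definition of the integral of a one-form $\omega$. If $\{\alpha_n\}_{n\in\mathbb{N}}$ is a partition of unity which is subordinate to a covering $\{V_n\}_{n\in\mathbb{N}}$ of $\mathcal{M}$ where $(V_n, \psi_n)_{n\in\mathbb{N}}$ is an atlas of $\mathcal{M}$, then
\[ \omega(x) = \sum_{n=1}^{\infty} \alpha_n(x)\,\omega(x), \qquad x \in \mathcal{M}. \]
Notice that $\alpha_n \omega$ is a one-form and $\operatorname{supp} \alpha_n \omega \subset V_n$. This decomposition of $\omega$ allows us to define the integral locally. If $\gamma$ is a smooth curve in $\mathcal{M}$ and $\gamma(t) \in V_n$, then $\dot\gamma(t) \in T_{\gamma(t)}\mathcal{M}$ and it can be written in the form
\[ \dot\gamma(t) = \sum_{i=1}^{M} \dot\gamma_i(t)\, \frac{\partial}{\partial y_i}. \]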
Definition 4.3.77. Let $\mathcal{M}$ be an $M$-dimensional differentiable manifold and denote by $(V_n, \psi_n)_{n\in\mathbb{N}}$ its atlas. Let $\{\alpha_n\}_{n\in\mathbb{N}}$ be a partition of unity subordinate to $\{V_n\}_{n\in\mathbb{N}}$. Let $\gamma : I \to \mathcal{M}$ be a smooth curve and $\omega$ a one-form on $\mathcal{M}$. If
\[ \omega(x) = \sum_{i=1}^{M} f_i(x)\, dy^i, \qquad x \in V_n, \]
then we define
\[ \int_\gamma \omega := \sum_{n=1}^{\infty} \int_\gamma \alpha_n \omega := \sum_{n=1}^{\infty} \int_I \alpha_n(\gamma(t)) \sum_{i=1}^{M} f_i(\gamma(t))\, \dot\gamma_i(t)\, dt \tag{4.3.39} \]
provided the integrals on the right-hand side exist and the sum is absolutely convergent.
Remark 4.3.78.
(i) If $\gamma$ is a smooth curve defined on a compact interval $I = [a, b]$ and $\{V_n\}_{n\in\mathbb{N}}$ is a locally finite covering of $\mathcal{M}$, then $\gamma$ meets only a finite number of the sets $V_n$. If, moreover, the form $\omega$ is continuous, then the integrals in (4.3.39) exist and the sum is absolutely convergent, since it contains only finitely many nonzero terms. We require absolute convergence of the series because we do not want the value of the integral to depend on the arrangement of charts $(V_n, \psi_n)_{n\in\mathbb{N}}$ into a sequence.
(ii) It can be proved (do it as an exercise!) that the formula (4.3.39) does not depend on the choice of partition of unity. It should also be proved that the right-hand side in (4.3.39) is the same for all equivalent atlases on $\mathcal{M}$ (see Definition 4.3.34). This follows from the transformation rules for tangent vectors (4.3.17) and for differential forms (4.3.22).
Remark 4.3.79. We can interpret the local coordinates $(f_1, \dots, f_M)$ of a one-form $\omega$ as the local coordinates of a vector field
\[ F(x) = \sum_{i=1}^{M} f_i(x)\, \frac{\partial}{\partial x_i}, \qquad x \in V_n \]
(and vice versa, see Remark 4.3.56). If we define
\[ \int_\gamma F := \int_\gamma \omega, \]
then the integral $\int_\gamma F$ expresses the work done by the vector field $F$ along the curve $\gamma$.
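In a single chart the right-hand side of (4.3.39) is an ordinary integral over the parameter interval, so it can be approximated directly. The sketch below is an illustration only: the form $\omega = -y\,dx + x\,dy$ on $\mathbb{R}^2$, the unit circle as the curve, and the helper name `integrate_one_form` are assumptions, not part of the text. The reversed parametrization is included as a sanity check on the sign.

```python
import math

def integrate_one_form(f, gamma, dgamma, a, b, n=20000):
    """Midpoint-rule approximation of the integral over [a, b] of sum_i f_i(gamma(t)) * gamma_i'(t)."""
    h = (b - a) / n
    total = 0.0
    for k in range(n):
        t = a + (k + 0.5) * h
        coeffs = f(gamma(t))      # (f_1, ..., f_M) at the point gamma(t)
        velocity = dgamma(t)      # components of the tangent vector gamma'(t)
        total += sum(c * v for c, v in zip(coeffs, velocity)) * h
    return total

# Illustrative choice: omega = -y dx + x dy on R^2, i.e. the vector field F = (-y, x).
def omega(p):
    x, y = p
    return (-y, x)

def circle(t):
    return (math.cos(t), math.sin(t))

def d_circle(t):
    return (-math.sin(t), math.cos(t))

# The same circle traversed backwards: t -> circle(2*pi - t).
def reversed_circle(t):
    return circle(2.0 * math.pi - t)

def d_reversed_circle(t):
    dx, dy = d_circle(2.0 * math.pi - t)
    return (-dx, -dy)

work = integrate_one_form(omega, circle, d_circle, 0.0, 2.0 * math.pi)
work_reversed = integrate_one_form(omega, reversed_circle, d_reversed_circle,
                                   0.0, 2.0 * math.pi)

if __name__ == "__main__":
    print(work, work_reversed)
```

For this particular field the integrand is constant along the circle, so the two values should be close to $2\pi$ and $-2\pi$ respectively.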
The special cases $\mathcal{M} = \mathbb{R}^2, \mathbb{R}^3$ of this work integral are known from introductory courses in mechanics (see, e.g., Kittel, Knight & Ruderman [76, Chapter 5]).
Remark 4.3.80. Figures 4.3.2 and 4.3.3 show that a smooth curve need not be a differentiable manifold in $\mathbb{R}^N$. In order to avoid such cases it is sufficient to assume that the curve $\gamma$ has a parametrization which is an embedding (Remark 4.3.3(iv)). If, moreover, $\gamma$ lies on a manifold $\mathcal{M}$, then it is a so-called submanifold of $\mathcal{M}$ in the following sense: A subset $\mathcal{P}$ of a differentiable manifold $\mathcal{M}$ is said to be a $P$-dimensional submanifold of $\mathcal{M}$ if there is an atlas $(V_n, \psi_n)_{n\in\mathbb{N}}$ of $\mathcal{M}$ such that
\[ \psi_n(x) = (y_1, \dots, y_P, 0, \dots, 0) \in \mathbb{R}^N \qquad \text{for all } x \in V_n \cap \mathcal{P}. \]
The proof of Proposition 4.3.2 shows that the image of an embedding is a submanifold.
In order to integrate functions over a surface in $\mathbb{R}^3$, or more generally over a manifold, we need to generalize the notion of area of a parallelogram to non-flat domains. Let
us recall here the definition of the multiple Riemann integral. The notion of (normalized) area or volume is based on the fact that the unit cube
\[ C := \left\{ \sum_{i=1}^{M} x_i e_i : 0 \le x_i \le 1 \right\} \]
($e_1, \dots, e_M$ is the standard basis in $\mathbb{R}^M$) has the $M$-dimensional volume (i.e., the Lebesgue measure) equal to 1. Let $A$ be a parallelepiped in $\mathbb{R}^M$ spanned by vectors $v_1, \dots, v_M$, i.e.,
\[ A := \left\{ \sum_{j=1}^{M} \alpha_j v_j : 0 \le \alpha_j \le 1 \right\}. \]
Then the volume $V(A)$ of $A$ is defined by
\[ V(A) := \int_A 1\, dx. \]
This integral can be calculated with the help of the linear operator $T : \mathbb{R}^M \to \mathbb{R}^M$ which sends the vectors $e_1, \dots, e_M$ of the standard basis to $v_1, \dots, v_M$ ($T e_j = v_j = \sum_{i=1}^{M} t_{ij} e_i$). It is well known that
\[ \int_A 1\, dx = \int_C |\det T|\, dy = |\det T| = \left|\det (t_{ij})_{i,j=1,\dots,M}\right|.\ {}^{26} \tag{4.3.40} \]
There is a problem with the generalization to a manifold since a manifold is bent. Nonetheless, a manifold can be supposed to be locally flat provided it is smooth. This basic principle of analysis allows us to define an infinitesimal area or volume via these notions for flat tangent spaces. Another problem now arises since there is no natural unit cube in $T_a\mathcal{M}$. To overcome this obstacle we want to express the $M$-volume of the parallelepiped, spanned by the coordinate vectors $\frac{\partial}{\partial y_1}, \dots, \frac{\partial}{\partial y_M}$, without using the standard basis. This can be done for the parallelepiped $A$ given above with the help of the scalar product (in which the standard basis is an orthonormal basis). If $G(v_1, \dots, v_M)$ is the so-called Gram matrix of vectors $v_1, \dots, v_M$, i.e.,
\[ G(v_1, \dots, v_M) = \begin{pmatrix} (v_1, v_1)_{\mathbb{R}^N} & \cdots & (v_1, v_M)_{\mathbb{R}^N} \\ \vdots & \ddots & \vdots \\ (v_M, v_1)_{\mathbb{R}^N} & \cdots & (v_M, v_M)_{\mathbb{R}^N} \end{pmatrix}, \]
then the formula
\[ V(A) = [\det G(v_1, \dots, v_M)]^{\frac{1}{2}} \tag{4.3.41} \]
holds (see Exercise 4.3.103).
26 More generally: If $T$ is a (nonlinear) diffeomorphism which maps $C$ onto $A := T(C)$, then the formula $V(A) := \int_A 1\, dx = \int_C |\det T'(y)|\, dy$ holds for the Lebesgue measure $V(A)$ of $A$. The proof of this nonlinear version is based on (4.3.40).
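Formulas (4.3.40) and (4.3.41) can be compared on a concrete parallelepiped. The sketch below is illustrative only; the helper names `det`, `gram` and `volume` and the sample vectors are assumptions. It also evaluates (4.3.41) for two vectors in $\mathbb{R}^3$, where no square matrix $T$ is available and only the Gram determinant applies.

```python
import math

def det(m):
    """Determinant by Laplace expansion along the first row (adequate for small matrices)."""
    if len(m) == 1:
        return m[0][0]
    total = 0.0
    for j, entry in enumerate(m[0]):
        minor = [row[:j] + row[j + 1:] for row in m[1:]]
        total += (-1) ** j * entry * det(minor)
    return total

def gram(vectors):
    """Gram matrix of scalar products (v_i, v_j)."""
    return [[sum(a * b for a, b in zip(u, v)) for v in vectors] for u in vectors]

def volume(vectors):
    """Volume of the parallelepiped spanned by the vectors, following (4.3.41)."""
    return math.sqrt(det(gram(vectors)))

# Three sample vectors in R^3: here (4.3.40) applies as well, with T built from the v_j.
vs = [[1.0, 2.0, 0.0], [0.0, 1.0, 3.0], [2.0, 0.0, 1.0]]
vol_gram = volume(vs)
vol_det = abs(det(vs))  # det of the transpose equals det T

# Two sample vectors in R^3: the area of a parallelogram via the Gram determinant.
area = volume([[1.0, 0.0, 0.0], [1.0, 1.0, 1.0]])

if __name__ == "__main__":
    print(vol_gram, vol_det, area)
```

For the three sample vectors both computations should give 13, and the parallelogram area should equal $\sqrt{2}$, the norm of the cross product of the two vectors.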
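Since $T_a\mathcal{M} \subset \mathbb{R}^N$, the scalar product in $\mathbb{R}^N$ can be restricted to $T_a\mathcal{M}$ to get a natural scalar product in $T_a\mathcal{M}$.27 This justifies the next definition. We wish to point out that the Gram matrix $G\left(\frac{\partial}{\partial y_1}, \dots, \frac{\partial}{\partial y_M}\right)$ consists of scalar products of vectors $\frac{\partial}{\partial y_i}, \frac{\partial}{\partial y_j}$ in $\mathbb{R}^N$ (the differential structure of $\mathcal{M}$ is inherited from $\mathbb{R}^N$).
Definition 4.3.81. Let $\mathcal{M}$ be a differentiable manifold with an atlas $(V_n, \psi_n)_{n\in\mathbb{N}}$ and let
\[ U_n = P_n \psi_n(V_n), \qquad \varphi_n = (\psi_n|_{V_n})^{-1}. \]
Let $\{\alpha_n\}_{n\in\mathbb{N}}$ be a partition of unity in $\mathcal{M}$ subordinate to $\{V_n\}_{n\in\mathbb{N}}$. If $f$ is a continuous function on $\mathcal{M}$, then we define
\[ \int_{\mathcal{M}} f\, dV := \sum_{n=1}^{\infty} \int_{U_n} (\alpha_n f)(\varphi_n(y)) \left[\det G\left(\frac{\partial}{\partial y_1}, \dots, \frac{\partial}{\partial y_M}\right)\right]^{\frac{1}{2}} dy_1 \cdots dy_M \tag{4.3.42} \]
provided the right-hand side exists and the sum is absolutely convergent.
Remark 4.3.82. It is possible to show that this definition does not depend on the partition of unity and on the choice of an atlas. The right-hand side in (4.3.42) exists whenever $f$ has compact support or, in particular, if $\mathcal{M}$ is a compact manifold.
Example 4.3.83. Compute the surface area $V(S^2)$ of the unit sphere $S^2$ in $\mathbb{R}^3$. It is obvious that the two-dimensional surface area of the "Greenwich meridian $G$" is zero. The rest $S^2 \setminus G$ is covered by one chart with
\[ \varphi(\alpha, \vartheta) = (\cos\alpha \cos\vartheta, \sin\alpha \cos\vartheta, \sin\vartheta), \qquad \alpha \in (0, 2\pi),\ \vartheta \in \left(-\frac{\pi}{2}, \frac{\pi}{2}\right). \]
Since $n = 1$, $U_1 = (0, 2\pi) \times \left(-\frac{\pi}{2}, \frac{\pi}{2}\right)$, $\alpha_1 = 1$ and $\det G\left(\frac{\partial}{\partial\alpha}, \frac{\partial}{\partial\vartheta}\right) = \cos^2\vartheta$, where
\[ \frac{\partial}{\partial\alpha} = (-\sin\alpha \cos\vartheta, \cos\alpha \cos\vartheta, 0), \qquad \frac{\partial}{\partial\vartheta} = (-\cos\alpha \sin\vartheta, -\sin\alpha \sin\vartheta, \cos\vartheta), \]
we have
\[ V(S^2) = \int_{S^2} 1\, dS = \int_{(0,2\pi)\times(-\frac{\pi}{2},\frac{\pi}{2})} \cos\vartheta\, d\alpha\, d\vartheta = 4\pi. \]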
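The computation of this example can be reproduced numerically straight from (4.3.42): evaluate the coordinate vectors, form the Gram determinant, and integrate its square root over the chart domain. The following is a rough sketch using a midpoint rule; the function names and the grid size `n` are illustrative assumptions.

```python
import math

def tangent_vectors(alpha, theta):
    """Coordinate vectors d/d(alpha) and d/d(theta) of the chart of Example 4.3.83."""
    d_alpha = (-math.sin(alpha) * math.cos(theta),
               math.cos(alpha) * math.cos(theta),
               0.0)
    d_theta = (-math.cos(alpha) * math.sin(theta),
               -math.sin(alpha) * math.sin(theta),
               math.cos(theta))
    return d_alpha, d_theta

def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

def sqrt_det_gram(u, v):
    """Square root of the 2x2 Gram determinant (u,u)(v,v) - (u,v)^2."""
    g = dot(u, u) * dot(v, v) - dot(u, v) ** 2
    return math.sqrt(max(g, 0.0))  # guard against tiny negative rounding errors

def sphere_area(n=200):
    """Midpoint rule for the integral of sqrt(det G) over (0, 2*pi) x (-pi/2, pi/2)."""
    h_alpha = 2.0 * math.pi / n
    h_theta = math.pi / n
    total = 0.0
    for i in range(n):
        alpha = (i + 0.5) * h_alpha
        for j in range(n):
            theta = -0.5 * math.pi + (j + 0.5) * h_theta
            u, v = tangent_vectors(alpha, theta)
            total += sqrt_det_gram(u, v) * h_alpha * h_theta
    return total

if __name__ == "__main__":
    print(sphere_area(), 4.0 * math.pi)
```

The numerical value should agree with $4\pi$ up to the discretization error of the midpoint rule.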
(It is more common to denote the integration symbol in the two-dimensional case by $dS$ instead of $dV$.)
It follows from Definition 4.3.77 (see also Exercise 4.3.100) that the integral of a one-form along a curve $\gamma$ depends on the orientation of $\gamma$. Namely, if
\[ \tilde\gamma(t) = \gamma(1 - t), \qquad t \in (0, 1), \]
then
\[ \dot{\tilde\gamma}(t) = -\dot\gamma(1 - t), \qquad \text{and hence} \qquad \int_{\tilde\gamma} \omega = -\int_\gamma \omega. \]
27 This scalar product leads to uniformly distributed mass or currents in physical applications but it is sometimes unrealistic. To cover further applications in a more realistic manner we can consider different scalar products at different points of a manifold. Since any positive definite symmetric bilinear form in $\mathbb{R}^M \times \mathbb{R}^M$ determines a scalar product in $\mathbb{R}^M$, we can introduce a metric structure on a manifold $\mathcal{M}$ by a (smooth) mapping $g : x \in \mathcal{M} \mapsto S_2^+(T_x\mathcal{M})$ (positive definite symmetric bilinear forms on $T_x\mathcal{M}$). Such $g$ is called a Riemann metric on $\mathcal{M}$.
This dependence on an orientation is crucial for the generalization of the curve integral to an integral of a differential form over a manifold. What can the orientation of a manifold be? Let us start with simple examples like $\mathbb{R}$, $\mathbb{R}^2$, $\mathbb{R}^3$. It is the common understanding that the "standard" (equivalently "positive") orientation on $\mathbb{R}$ is from the left to the right, in $\mathbb{R}^2$ anticlockwise and by the right-thumb rule in $\mathbb{R}^3$. These slightly vague formulations can be made precise by taking fixed bases in $\mathbb{R}$, $\mathbb{R}^2$, $\mathbb{R}^3$, e.g., the standard bases. Then all bases are divided into two disjoint classes according to the sign of the determinant of the transformation matrix $T$ which sends the fixed (e.g., standard) basis into a new one. We say that $\tilde e_1, \dots, \tilde e_M$ is a positive basis if $\det T > 0$. We want to remind the reader that
\[ \det T = f^1 \wedge \cdots \wedge f^M(\tilde e_1, \dots, \tilde e_M) \]
if $T e_i = \tilde e_i$, $i = 1, \dots, M$, and $f^1, \dots, f^M$ is the dual basis to $e_1, \dots, e_M$ (Example 4.3.52(i)). This indicates that the choice of a fixed nowhere-vanishing continuous $M$-form $\omega$ (i.e., $\omega(x) \neq 0$ for all $x \in \mathcal{M}$) on the $M$-dimensional manifold $\mathcal{M}$ makes it possible to introduce an orientation on $\mathcal{M}$. If $(V, \psi)$ is a local chart at a point $x \in \mathcal{M}$, then the basis $\frac{\partial}{\partial y_1}, \dots, \frac{\partial}{\partial y_M}$ of $T_x\mathcal{M}$ is said to be a positive basis of $T_x\mathcal{M}$ provided
\[ \omega(x)\left(\frac{\partial}{\partial y_1}, \dots, \frac{\partial}{\partial y_M}\right) > 0. \]
It can be proved that a continuous non-vanishing form exists on $\mathcal{M}$ if and only if there is an atlas $(V_n, \psi_n)$ of $\mathcal{M}$ such that for any two charts $(V, \psi)$, $(\tilde V, \tilde\psi)$ of this atlas the transformation matrix $((P\tilde\psi) \circ \varphi)'(y)$ (see (4.3.17)) has a positive determinant for all $y \in \psi(V \cap \tilde V)$ (provided $V \cap \tilde V \neq \emptyset$).
Definition 4.3.84. A differentiable manifold $\mathcal{M}$ of dimension $M$ is said to be orientable if there exists a continuous nowhere-vanishing $M$-form on $\mathcal{M}$. If such a form $\omega$ is fixed, then $(\mathcal{M}, \omega)$ is called an oriented manifold.
Example 4.3.85.
(i) Suppose that $\mathcal{M}$ is a two-dimensional orientable connected manifold in $\mathbb{R}^3$ (i.e., a surface) and $\omega$ is a nowhere-vanishing two-form on $\mathcal{M}$. The question is how these orientations of $T_x\mathcal{M}$ cohere with the natural orientation of $\mathbb{R}^3$. To find an answer we choose a point $x \in \mathcal{M}$ and local coordinates at $x$ such that $\frac{\partial}{\partial y_1}, \frac{\partial}{\partial y_2}$ form a positive basis in $T_x\mathcal{M}$. It is obvious that there is a vector $n \in \mathbb{R}^3$ which is perpendicular to $T_x\mathcal{M} \subset \mathbb{R}^3$ and such that
\[ n,\ \frac{\partial}{\partial y_1},\ \frac{\partial}{\partial y_2} \]
is a positive basis of $\mathbb{R}^3$. The vector $\frac{n}{\|n\|}$ is called a (unit) outer normal vector to $\mathcal{M}$ at the point $x$. It is easy to prove that
\[ n = \frac{\partial}{\partial y_1} \times \frac{\partial}{\partial y_2}. \]
For the definition of the cross product see Example 4.3.55(ii).
(ii) The Möbius strip $S \subset \mathbb{R}^3$ is an example of a non-orientable manifold. An argument to prove that can be based on the above consideration. Choose a point $a \in S$, a basis $\frac{\partial}{\partial y_1}, \frac{\partial}{\partial y_2}$ in $T_a S$ and find the outer normal vector $n_a$. Now move the point
$a$ together with the basis $\left(n, \frac{\partial}{\partial y_1}, \frac{\partial}{\partial y_2}\right)$ along the whole strip to come back to the initial position. The vector $n_a$ will end at $-n_a$ (see Figure 4.3.19).
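[Figure 4.3.19. The Möbius strip $S$ with the basis $\left(n, \frac{\partial}{\partial y_1}, \frac{\partial}{\partial y_2}\right)$ drawn at the points $a$ and $x$.]
Definition 4.3.86. Let $(\mathcal{M}, \omega)$ be an oriented $M$-dimensional differentiable manifold. Let $(V_n, \psi_n)_{n\in\mathbb{N}}$ be an atlas of $\mathcal{M}$ for which the coordinate vectors $\frac{\partial}{\partial y_1}, \dots, \frac{\partial}{\partial y_M}$ form a positive basis of $T_x\mathcal{M}$ for all $x \in V_n$ and all $n \in \mathbb{N}$. Let $\{\alpha_n\}_{n\in\mathbb{N}}$ be a partition of unity subordinate to this atlas. If $\omega$ is a continuous $M$-form with the local representation
\[ \omega(x) = f_n(x)\, dy^1 \wedge \cdots \wedge dy^M, \qquad x \in V_n, \]
then we define the integral of $\omega$ over $\mathcal{M}$ as
\[ \int_{\mathcal{M}} \omega := \sum_{n=1}^{\infty} \int_{\mathcal{M}} \alpha_n \omega := \sum_{n=1}^{\infty} \int_{U_n} \varphi_n^*(\alpha_n \omega)(y) = \sum_{n=1}^{\infty} \int_{U_n} (\alpha_n f_n)(\varphi_n(y))\, dy_1 \cdots dy_M\ {}^{28} \tag{4.3.43} \]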
provided the right-hand side exists and the sum is absolutely convergent.
Remark 4.3.87.
(i) If a form $\omega$ has compact support, in particular, if $\mathcal{M}$ is a compact set in $\mathbb{R}^N$, then the integral $\int_{\mathcal{M}} \omega$ exists.
28 For a continuous $M$-form $\eta(y) = g(y)\, df^1 \wedge \cdots \wedge df^M$ on a measurable set $U \subset \mathbb{R}^M$ (cf. Remark 4.3.54(ii)) we define $\int_U g(y)\, df^1 \wedge \cdots \wedge df^M := \int_U g(y)\, dy_1 \cdots dy_M$.
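(ii) Definition 4.3.86 does not depend on the concrete choice of an atlas and on a partition of unity. If the coordinate vectors $\frac{\partial}{\partial y_1}, \dots, \frac{\partial}{\partial y_M}$ determine a negative basis in $T_x\mathcal{M}$, then we change the order, e.g., to
\[ \frac{\partial}{\partial y_2}, \frac{\partial}{\partial y_1}, \frac{\partial}{\partial y_3}, \dots, \frac{\partial}{\partial y_M}, \]
to get a positive basis. Notice that
\[ \omega(x) = f(x)\, dy^1 \wedge \cdots \wedge dy^M = -f(x)\, dy^2 \wedge dy^1 \wedge dy^3 \wedge \cdots \wedge dy^M. \]
(iii) Definition 4.3.86 is also independent of a transformation of coordinates in the following sense: Let $g$ be a diffeomorphism of an oriented manifold $\mathcal{M}$ onto a manifold $\tilde{\mathcal{M}}$.29 Then $g$ induces an orientation on $\tilde{\mathcal{M}}$ 30 and with respect to this orientation the equality
\[ \int_{\tilde{\mathcal{M}}} \omega = \int_{\mathcal{M}} g^*\omega \tag{4.3.44} \]
holds for any continuous $M$-form $\omega$ on $\tilde{\mathcal{M}}$.
(iv) If a curve $\gamma$ is itself a differentiable manifold (Remark 4.3.80) with the orientation induced from $\mathbb{R}$, then $\int_\gamma \omega$ defined by (4.3.39) is the same as in (4.3.43).
(v) If $\mathcal{M}$ is an oriented manifold and $\omega_V$ is the $M$-form given in local coordinates by
\[ \omega_V = \left[\det G\left(\frac{\partial}{\partial y_1}, \dots, \frac{\partial}{\partial y_M}\right)\right]^{\frac{1}{2}} dy^1 \wedge \cdots \wedge dy^M \]
where $\frac{\partial}{\partial y_1}, \dots, \frac{\partial}{\partial y_M}$ is a positive basis, then the volume $V(\mathcal{M}) := \int_{\mathcal{M}} dV$ (see (4.3.42)) is given by
\[ V(\mathcal{M}) = \int_{\mathcal{M}} \omega_V. \]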
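Example 4.3.88. Let $\mathcal{M}$ be the part of the unit sphere given by
\[ \mathcal{M} := \left\{ (x, y, z) \in \mathbb{R}^3 : x^2 + y^2 + z^2 = 1,\ z > -\frac{1}{2} \right\} \]
and
\[ \omega(x, y, z) = y^2\, dx \wedge dy + yz\, dx \wedge dz + x^2\, dy \wedge dz. \]
We choose the spherical coordinates
\[ \varphi:\quad x = \cos\alpha \cos\vartheta, \quad y = \sin\alpha \cos\vartheta, \quad z = \sin\vartheta, \qquad (\alpha, \vartheta) \in U := (-\pi, \pi) \times \left(-\frac{\pi}{6}, \frac{\pi}{2}\right) \]
29 We have not defined this notion yet, but it is almost evident how to generalize the well-known case $\mathbb{R}^M \to \mathbb{R}^M$. One has to overcome certain difficulties which are caused by the local definition of a manifold and the global notion of diffeomorphism.
30 Let $(V_n, \psi_n)_{n\in\mathbb{N}}$ be an atlas of $\mathcal{M}$ such that the coordinate vectors $\frac{\partial}{\partial y_1}, \dots, \frac{\partial}{\partial y_M}$ form a positive basis of $T_x\mathcal{M}$ for $x \in V_n$. Then $g_*\frac{\partial}{\partial y_1}, \dots, g_*\frac{\partial}{\partial y_M}$ determine a positive basis of $T_{g(x)}\tilde{\mathcal{M}}$.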
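and the orientation such that $\frac{\partial}{\partial\alpha}, \frac{\partial}{\partial\vartheta}$ is a positive basis. We wish to compute the integral $\int_{\mathcal{M}} \omega$. The curve $\gamma = \left\{ (-\cos\vartheta, 0, \sin\vartheta) : \vartheta \in \left(-\frac{\pi}{6}, \frac{\pi}{2}\right) \right\}$ has two-dimensional surface measure equal to zero, and therefore
\[ \int_{\mathcal{M}} \omega = \int_U \varphi^*\omega. \]
We have (see Example 4.3.55(ii) or (4.3.33) and (4.3.19))
\[ \varphi^*(dx \wedge dy) = \sin\vartheta \cos\vartheta\, d\alpha \wedge d\vartheta, \quad \varphi^*(dy \wedge dz) = \cos\alpha \cos^2\vartheta\, d\alpha \wedge d\vartheta, \quad \varphi^*(dz \wedge dx) = \sin\alpha \cos^2\vartheta\, d\alpha \wedge d\vartheta, \]
i.e.,
\[ \varphi^*\omega = \cos^3\alpha \cos^4\vartheta\, d\alpha \wedge d\vartheta. \]
An easy computation gives
\[ \int_{\mathcal{M}} \omega = \int_{(-\pi,\pi)\times(-\frac{\pi}{6},\frac{\pi}{2})} \cos^3\alpha \cos^4\vartheta\, d\alpha\, d\vartheta = 0. \]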
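The pull-back in this example can be checked without any symbolic algebra: the coefficient of $d\alpha \wedge d\vartheta$ in $\varphi^*(dx \wedge dy)$ is the $2 \times 2$ minor of the Jacobian of $\varphi$ built from the rows of $x$ and $y$, and similarly for the other two basic forms. The sketch below assembles the coefficient of $\varphi^*\omega$ this way and integrates it over $U$ by a midpoint rule; the function names and the grid size are illustrative assumptions.

```python
import math

def pullback_coefficient(alpha, theta):
    """Coefficient of d(alpha)^d(theta) in the pull-back of
    omega = y^2 dx^dy + y*z dx^dz + x^2 dy^dz under the spherical chart."""
    ca, sa = math.cos(alpha), math.sin(alpha)
    ct, st = math.cos(theta), math.sin(theta)
    x, y, z = ca * ct, sa * ct, st
    # Partial derivatives of (x, y, z) with respect to alpha and theta.
    xa, xt = -sa * ct, -ca * st
    ya, yt = ca * ct, -sa * st
    za, zt = 0.0, ct
    # 2x2 minors of the Jacobian give the pull-backs of the basic two-forms.
    dx_dy = xa * yt - xt * ya
    dx_dz = xa * zt - xt * za
    dy_dz = ya * zt - yt * za
    return y * y * dx_dy + y * z * dx_dz + x * x * dy_dz

def integral_over_chart(n=200):
    """Midpoint rule over U = (-pi, pi) x (-pi/6, pi/2)."""
    h_alpha = 2.0 * math.pi / n
    h_theta = (math.pi / 2.0 + math.pi / 6.0) / n
    total = 0.0
    for i in range(n):
        alpha = -math.pi + (i + 0.5) * h_alpha
        for j in range(n):
            theta = -math.pi / 6.0 + (j + 0.5) * h_theta
            total += pullback_coefficient(alpha, theta) * h_alpha * h_theta
    return total

if __name__ == "__main__":
    print(pullback_coefficient(0.4, 0.2), math.cos(0.4) ** 3 * math.cos(0.2) ** 4)
    print(integral_over_chart())
```

The first line of output compares the assembled coefficient with $\cos^3\alpha \cos^4\vartheta$ at one sample point; the integral should vanish up to rounding because $\cos^3\alpha$ integrates to zero over a full period.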
It is evident that the computation of $\int_{\mathcal{M}} \omega$ need not be an easy task when several charts have to be used to cover the support of $\omega$. The main reason is that a partition of unity must be constructed and this is technically difficult. Because of that, we would like to have such useful tools as the Fubini Theorem and the Fundamental Theorem of Calculus. The latter theorem, i.e.,
\[ \int_a^b f'(x)\, dx = f(b) - f(a), \]
can be interpreted in the manifold language as follows. The closed interval $[a, b]$ is a manifold $\mathcal{M}$ positively oriented from the left to the right. The one-form $f'(x)\, dx$ is the differential of the zero-form (i.e., the function $f$), and the (oriented) boundary of $\mathcal{M}$ consists of the points $b$, $a$ (in this order). The Fundamental Theorem of Calculus reduces the integral of the form $df(x) = f'(x)\, dx$ over $\mathcal{M}$ to an "integral" of $f$ over $\partial\mathcal{M}$. This observation is essential for the generalization to manifolds with boundaries. To do that we have to define the boundary of $\mathcal{M}$ first, and then to show how this boundary inherits the orientation of $\mathcal{M}$.
Definition 4.3.89. Let $\mathcal{N}$ be an $M$-dimensional differentiable manifold in $\mathbb{R}^N$. A closed subset $\mathcal{M}$ of $\mathcal{N}$ is said to be an $M$-dimensional differentiable manifold with boundary31 if $\overline{\operatorname{int} \mathcal{M}} = \mathcal{M}$ (the interior and closure are taken in the topology of $\mathcal{N}$) and for any point $x \in \mathcal{M}$ there is a chart $(V, \psi)$ of an atlas of $\mathcal{N}$ such that either
31 This boundary can be an empty set (see, e.g., Remark 4.3.90(i)). If this boundary is nonempty, then the manifold $\mathcal{M}$ is not a differentiable manifold in the sense of Definition 4.3.4. See also Remark 4.3.90(ii).
(i) $V \subset \mathcal{M}$ or
(ii) $P\psi(x) = (0, y_2, \dots, y_M)$ and $P\psi(V \cap \mathcal{M}) = \{(y_1, \dots, y_M) \in \mathbb{R}^M : y_1 \le 0\} \cap P\psi(V)$.
A point $x$ is called an interior point of $\mathcal{M}$ in the case (i). If $x$ is not an interior point, then $x$ is called a boundary point of $\mathcal{M}$. The collection of all boundary points is called the boundary of $\mathcal{M}$ and denoted by $\partial\mathcal{M}$ (see Figure 4.3.20).
[Figure 4.3.20. A chart $(V, \psi)$ at a boundary point $x \in \partial\mathcal{M}$: $P\psi$ maps $V \cap \mathcal{M}$ into the half-space $\{y_1 \le 0\}$ of $\mathbb{R}^M$ and the point $x$ onto the hyperplane $\{(0, y_2, \dots, y_M)\}$.]
Remark 4.3.90.
(i) A manifold can have empty boundary. The sphere $S^2$ in $\mathbb{R}^3$ is an example of this fact. Such a manifold is also called a manifold without boundary.
(ii) If $\mathcal{M}$ is a manifold with nonempty boundary $\partial\mathcal{M}$, then the boundary $\partial\mathcal{M}$ is itself a differentiable manifold of dimension $M - 1$ in $\mathbb{R}^N$. An atlas is given by the restriction of the original atlas. We notice that $\partial\mathcal{M}$ has empty boundary, i.e., $\partial(\partial\mathcal{M}) = \emptyset$.
(iii) Let $\mathcal{M}$ be a manifold with boundary. The tangent space $T_a\mathcal{M}$ for an interior point $a \in \mathcal{M}$ is defined as in Appendix 4.3A and $T_a\mathcal{M} = T_a\mathcal{N}$. If $a \in \partial\mathcal{M}$, then we take all smooth curves in $\mathbb{R}^M_-$ ($\mathbb{R}^M_- := \{(y_1, \dots, y_M) \in \mathbb{R}^M : y_1 < 0\}$) through the point $b = \psi(a)$ and transfer them into $\mathbb{R}^N$ by applying $\varphi = (P\psi)^{-1}$. The tangent vectors at the point $a$ of these transferred curves form the tangent space $T_a(\partial\mathcal{M})$. If $\omega : (-1, 1) \to \mathbb{R}^M$ is a smooth curve such that
\[ \omega(0) = b, \qquad \omega(t) \in \begin{cases} \mathbb{R}^M_- & \text{for } t < 0, \\ \mathbb{R}^M_+ & \text{for } t > 0, \end{cases} \]
then $\varphi'(b)[\dot\omega(0)]$ is the so-called outer vector to $\partial\mathcal{M}$. The outer normal $n$ to $\partial\mathcal{M}$ at the point $a \in \partial\mathcal{M}$ is the unit outer vector which is perpendicular in $\mathbb{R}^N$ to $T_a(\partial\mathcal{M})$ (see Figure 4.3.21).
[Figure 4.3.21. The tangent spaces $T_a\mathcal{M}$ and $\{a\} + T_a\partial\mathcal{M}$, the outer normal $n$ at $a \in \partial\mathcal{M}$, and the curve $\omega$ through $b$ in the chart image $P\psi(V)$ together with its image $\varphi(\omega)$.]
(iv) If $\mathcal{M} \subset \mathcal{N}$ is a differentiable manifold with boundary and $(\mathcal{N}, \omega)$ is an oriented manifold, then $\omega$ induces an orientation on $\partial\mathcal{M}$ as follows. Choose $a \in \partial\mathcal{M}$ and
\[ v^1 := \left.\frac{d}{dt}\, \varphi(t, b_2, \dots, b_M)\right|_{t=0} \]
(a special case of an outer vector), and define
\[ \omega^\partial(a)(v^2, \dots, v^M) = \omega(a)(v^1, v^2, \dots, v^M). \]
The form $\omega^\partial$ defines the induced orientation on $\partial\mathcal{M}$. In other words: $v^2, \dots, v^M$ is a positive basis of $T_a(\partial\mathcal{M})$ provided $v^1, \dots, v^M$ is a positive basis of $T_a\mathcal{N}$.
Example 4.3.91.
(i) Let $\mathcal{M}$ be the closed ring $\{(x, y) \in \mathbb{R}^2 : 1 \le x^2 + y^2 \le 4\}$. Then $\mathcal{M} \subset \mathbb{R}^2$ is a two-dimensional manifold with boundary $\partial\mathcal{M} = S_1^1 \cup S_2^1$ ($S_r^1$ denotes the circle with radius $r$ and center at the origin). As the local coordinates we take the polar coordinates $(r, \vartheta)$. The (standard) orientation on $\mathcal{M}$ is given either by the standard Euclidean basis $e_1 = (1, 0)$, $e_2 = (0, 1)$ in $T_a\mathcal{M} = \mathbb{R}^2$ or by the form
\[ \omega(a) = f^1 \wedge f^2 := dx \wedge dy = r(a)\, dr \wedge d\vartheta. \]
The special outer vector $v^1$ mentioned in Remark 4.3.90(iv) is
\[ v^1 = \begin{cases} -\dfrac{\partial}{\partial r} & \text{at the point } a_1 = (1, \vartheta_1), \\[2mm] \dfrac{\partial}{\partial r} & \text{at the point } a_2 = (2, \vartheta_2). \end{cases} \]
The completion to a positive basis in $T_a\mathbb{R}^2$ is shown in Figures 4.3.22 and 4.3.23.
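[Figure 4.3.22. The ring $\mathcal{M}$ in $\mathbb{R}^2$ with the boundary circles $S_1^1$ and $S_2^1$ and the bases $(v^1, v^2)$ at the points $a_1$ and $a_2$.]
[Figure 4.3.23. The same bases in the polar coordinates $(r, \vartheta)$: $v^1 = -\frac{\partial}{\partial r}$, $v^2 = -\frac{\partial}{\partial\vartheta}$ at $a_1$ and $v^1 = \frac{\partial}{\partial r}$, $v^2 = \frac{\partial}{\partial\vartheta}$ at $a_2$.]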
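The form $\omega^\partial$ on $S_1^1$ is given (in the polar coordinates $r$, $\vartheta$) by
\[ \omega^\partial(a)\left(\frac{\partial}{\partial\vartheta}\right) = r(a)\, dr \wedge d\vartheta\left(-\frac{\partial}{\partial r}, \frac{\partial}{\partial\vartheta}\right) = -r(a). \]
(ii) Let $B$ be the closed unit ball in $\mathbb{R}^3$. Then $B$ is a three-dimensional manifold (included in $\mathbb{R}^3$) with boundary $\partial B = S^2$ (the two-dimensional sphere). The standard orientation in $\mathbb{R}^3 = T_a B$ gives the orientation in $\operatorname{int} B$. The induced orientation on $\partial B$ is obtained by Remark 4.3.90(iv). Namely, we take a normal vector $n = \frac{\partial}{\partial r}$ at a point $a \in \partial B$ and independent vectors $v^2, v^3 \in T_a(\partial B)$ in the order given by the right-thumb rule for $(n, v^2, v^3)$ (see Figure 4.3.24).32 Notice that the orientation on $B$ is given, e.g., by $\omega = dx \wedge dy \wedge dz = (r^2 \cos\vartheta)\, dr \wedge d\alpha \wedge d\vartheta$ in the spherical coordinates. Similarly,
\[ \omega^\partial(v^2, v^3) = \omega\left(\frac{\partial}{\partial r}, v^2, v^3\right). \]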
The next theorem is a basic result on differential forms and it is the promised generalization of the Fundamental Theorem of Calculus.
Theorem 4.3.92 (Stokes, abstract version). Let $\mathcal{M}$ be an $M$-dimensional oriented manifold with boundary $\partial\mathcal{M}$. Let $\omega$ be a smooth $(M - 1)$-differential form on $\mathcal{M}$ with compact support. Then
\[ \int_{\mathcal{M}} d\omega = \int_{\partial\mathcal{M}} i^*\omega\ {}^{33} \]
provided $\partial\mathcal{M}$ has the induced orientation and $i : \partial\mathcal{M} \to \mathcal{M}$ is the canonical injection.
32 More precisely, for $n := v^2 \times v^3$ (the cross product is defined in Example 4.3.55(ii)) the vectors $n, v^2, v^3$ form a positive basis in $\mathbb{R}^3$ and $n$ is perpendicular to $v^2, v^3$.
33 Here $\int_{\mathcal{M}} d\omega$ is defined as in Definition 4.3.86 where $U_n$ need not be an open set; see footnote 28 on page 216.
[Figure 4.3.24. The unit ball $B$ in $\mathbb{R}^3$ with the normal vector $n$ and the vectors $v^2, v^3 \in T_a S^2$ at a point $a \in \partial B$.]
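Proof. Let $(V_n, \psi_n)_{n\in\mathbb{N}}$ be an atlas of $\mathcal{M}$ such that the coordinate vectors given by $\frac{\partial}{\partial y_1}, \dots, \frac{\partial}{\partial y_M}$ form positive bases in the corresponding tangent spaces. Let $\{\alpha_n\}_{n\in\mathbb{N}}$ be a partition of unity subordinate to $\{V_n\}_{n\in\mathbb{N}}$. Since $\omega$ has compact support, we have
\[ \int_{\mathcal{M}} d\omega = \sum_{j=1}^{k} \int_{\mathcal{M}} \alpha_j\, d\omega \qquad \text{where } \operatorname{supp}(\alpha_j\, d\omega) \subset V_j. \]
Therefore, it is sufficient to prove the Stokes Theorem only for the case when $\operatorname{supp} \omega$ is contained in one coordinate neighborhood, say $V$. Suppose that $\omega$ has the representation
\[ \omega(x) = \sum_{i=1}^{M} (-1)^{i-1} f_i(y)\, dy^1 \wedge \cdots \wedge \widehat{dy^i} \wedge \cdots \wedge dy^M, \qquad x = \varphi(y) \in V, \]
where the hat denotes that the term $dy^i$ is missing. Then
\[ d\omega(x) = \sum_{i=1}^{M} \frac{\partial f_i}{\partial y_i}(y)\, dy^1 \wedge \cdots \wedge dy^M. \]
There are two cases for $V$:
Case 1 ($V \cap \partial\mathcal{M} = \emptyset$). Then $\int_{\partial\mathcal{M}} i^*\omega = 0$ and
\[ \int_{\mathcal{M}} d\omega = \sum_{i=1}^{M} \int_U \frac{\partial f_i}{\partial y_i}(y)\, dy_1 \dots dy_M = 0 \]
by the Fubini Theorem, since $f_i = 0$ outside of a compact subset of $U$.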
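Case 2 ($V \cap \partial\mathcal{M} \neq \emptyset$). According to the definition of the boundary we can assume that $V$ and $\psi(V) = U$ have the form as in Figure 4.3.20. Then
\[ i^*\omega(x) = f_1(y)\, dy^2 \wedge \cdots \wedge dy^M \qquad \text{for } x = \varphi(y) \in V \cap \partial\mathcal{M} \]
and
\[ \int_{\partial\mathcal{M}} i^*\omega = \int_{\partial\mathcal{M}} f_1(y)\, dy^2 \wedge \cdots \wedge dy^M = \int_{U \cap \mathbb{R}^{M-1}} f_1(0, y_2, \dots, y_M)\, dy_2 \dots dy_M. \]
On the other hand,
\[ \int_{\mathcal{M}} d\omega = \sum_{i=1}^{M} \int_U \frac{\partial f_i}{\partial y_i}\, dy_1 \dots dy_M = \int_{U \cap \mathbb{R}^{M-1}} \left( \int_{-\infty}^{0} \frac{\partial f_1}{\partial y_1}(y)\, dy_1 \right) dy_2 \dots dy_M = \int_{U \cap \mathbb{R}^{M-1}} f_1(0, y_2, \dots, y_M)\, dy_2 \dots dy_M \]
since the integrals for $i = 2, \dots, M$ vanish because of the compact support of the restriction of $f_i$ to $U \cap \mathbb{R}^{M-1}$.
This completes the proof.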
Several special cases of the Stokes Theorem are worth mentioning. We say that a curve $\gamma : [a, b] \to \mathbb{R}^N$ is simple if
\[ \gamma(t_1) = \gamma(t_2), \quad t_1 < t_2, \qquad \text{implies} \qquad a = t_1, \quad b = t_2. \]
Corollary 4.3.93 (Green). Let $\Omega$ be a bounded open subset of $\mathbb{R}^2$ the boundary of which is the image of a simple closed smooth curve $\gamma$ which is oriented so that $\Omega$ is on the left-hand side when we move along the curve. Let $F = (f, g) : \mathbb{R}^2 \to \mathbb{R}^2$ be a $C^1$-mapping in a neighborhood of the closure $\overline{\Omega}$. Then
\[ \int_\Omega \left( \frac{\partial g}{\partial x} - \frac{\partial f}{\partial y} \right) dx\, dy = \int_\gamma (f\, dx + g\, dy). \tag{4.3.45} \]
(See Definition 4.3.77 for the integral on the right-hand side.)
Proof. Notice that $\overline{\Omega}$ is an oriented manifold (the positive orientation of $\mathbb{R}^2$) with boundary $\partial\Omega$ (smoothness of $\gamma$) and the above mentioned orientation of $\gamma$ agrees with the induced orientation on $\partial\Omega$. Put
\[ \omega = f\, dx + g\, dy \]
and use the Stokes Theorem.
Remark 4.3.94. For $F(x, y) = \frac{1}{2}(-y, x)$ the formula (4.3.45) gives
\[ V(\Omega) = \frac{1}{2} \int_\gamma (-y\, dx + x\, dy), \]
which can be used for computation of the area of planar sets.
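As a quick illustration of this area formula, the sketch below evaluates $\frac{1}{2}\int_\gamma (-y\, dx + x\, dy)$ for an ellipse traversed anticlockwise. The semi-axes 3 and 2, the helper name `area_by_green` and the number of subintervals are assumptions chosen only for the example.

```python
import math

def area_by_green(gamma, dgamma, a, b, n=10000):
    """Midpoint-rule approximation of (1/2) * integral of (-y dx + x dy) along gamma."""
    h = (b - a) / n
    total = 0.0
    for k in range(n):
        t = a + (k + 0.5) * h
        x, y = gamma(t)
        dx, dy = dgamma(t)
        total += 0.5 * (-y * dx + x * dy) * h
    return total

# Illustrative curve: the ellipse with semi-axes 3 and 2, oriented anticlockwise
# so that the enclosed region lies on the left-hand side.
def ellipse(t):
    return (3.0 * math.cos(t), 2.0 * math.sin(t))

def d_ellipse(t):
    return (-3.0 * math.sin(t), 2.0 * math.cos(t))

ellipse_area = area_by_green(ellipse, d_ellipse, 0.0, 2.0 * math.pi)

if __name__ == "__main__":
    print(ellipse_area, 6.0 * math.pi)
```

The result should match the classical value $\pi \cdot 3 \cdot 2 = 6\pi$; traversing the curve clockwise would flip the sign.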
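Corollary 4.3.95 (Gauss–Ostrogradski). Let $\Omega$ be a bounded open subset of $\mathbb{R}^3$ such that $\overline{\Omega}$ is a differentiable manifold with boundary $\partial\Omega$. Assume that $F = (f^1, f^2, f^3) : \mathbb{R}^3 \to \mathbb{R}^3$ is a smooth mapping in a neighborhood of $\overline{\Omega}$. Then
\[ \int_\Omega \sum_{i=1}^{3} \frac{\partial f^i}{\partial x_i}\, dx_1\, dx_2\, dx_3 = \int_{\partial\Omega} (F, n)\, dS \tag{4.3.46} \]
where $n$ is the unit outer normal vector to $\partial\Omega$ and $(F, n)$ is the scalar product in $\mathbb{R}^3$. The integral on the right-hand side is defined in Definition 4.3.81.
Proof. Put
\[ \omega = f^1\, dy \wedge dz + f^2\, dz \wedge dx + f^3\, dx \wedge dy \]
and choose local coordinates $(u, v)$ on $\partial\Omega$ such that $n, \frac{\partial}{\partial u}, \frac{\partial}{\partial v}$ forms a positive orthonormal basis in $\mathbb{R}^3$. The pull-back $i^*\omega$ has been computed in Example 4.3.55(ii):
\[ i^*\omega = \left( F, \frac{\partial}{\partial u} \times \frac{\partial}{\partial v} \right) du \wedge dv. \]
The cross product $\frac{\partial}{\partial u} \times \frac{\partial}{\partial v}$ is the unit vector which is perpendicular in $\mathbb{R}^3$ to $\frac{\partial}{\partial u}, \frac{\partial}{\partial v}$. So, $\frac{\partial}{\partial u} \times \frac{\partial}{\partial v} = n$.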
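Formula (4.3.46) can be tested numerically on the unit ball. The sketch below is only an illustration under assumed data: the field $F = (x^3, y^3, z^3)$ is a hypothetical choice, both sides are approximated by a midpoint rule in spherical coordinates (volume element $r^2 \cos\vartheta\, dr\, d\alpha\, d\vartheta$, surface element $\cos\vartheta\, d\alpha\, d\vartheta$), and on the unit sphere the outer normal is taken to be the position vector.

```python
import math

def F(x, y, z):
    """Hypothetical smooth vector field used for the check."""
    return (x ** 3, y ** 3, z ** 3)

def div_F(x, y, z):
    """Divergence of F computed by hand: 3x^2 + 3y^2 + 3z^2."""
    return 3.0 * (x * x + y * y + z * z)

def point(r, alpha, theta):
    """Spherical coordinates with latitude theta in (-pi/2, pi/2)."""
    return (r * math.cos(alpha) * math.cos(theta),
            r * math.sin(alpha) * math.cos(theta),
            r * math.sin(theta))

def volume_integral(n=40):
    """Left-hand side of (4.3.46) over the unit ball."""
    hr, ha, ht = 1.0 / n, 2.0 * math.pi / n, math.pi / n
    total = 0.0
    for i in range(n):
        r = (i + 0.5) * hr
        for j in range(n):
            alpha = (j + 0.5) * ha
            for k in range(n):
                theta = -0.5 * math.pi + (k + 0.5) * ht
                total += div_F(*point(r, alpha, theta)) * r * r * math.cos(theta) * hr * ha * ht
    return total

def flux_integral(n=200):
    """Right-hand side of (4.3.46) over the unit sphere."""
    ha, ht = 2.0 * math.pi / n, math.pi / n
    total = 0.0
    for j in range(n):
        alpha = (j + 0.5) * ha
        for k in range(n):
            theta = -0.5 * math.pi + (k + 0.5) * ht
            normal = point(1.0, alpha, theta)  # outer unit normal on the unit sphere
            field = F(*normal)
            total += sum(a * b for a, b in zip(field, normal)) * math.cos(theta) * ha * ht
    return total

if __name__ == "__main__":
    print(volume_integral(), flux_integral(), 12.0 * math.pi / 5.0)
```

For this field both sides equal $12\pi/5$ when computed by hand, so the two numerical values should agree with each other and with that number up to discretization error.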
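Corollary 4.3.96 (Stokes, classical version). Let $\mathcal{M}$ be a bounded two-dimensional oriented manifold in $\mathbb{R}^3$ (i.e., a surface) with boundary which is described by a simple smooth curve $\gamma$. Let $F$ be a $C^1$-vector field on $\mathcal{M}$. Then
\[ \int_{\mathcal{M}} (\operatorname{curl} F, n)\, dS = \int_\gamma F \]
where $\gamma$ has the orientation induced from $\mathcal{M}$, the vector $\operatorname{curl} F$ is defined in Example 4.3.58(ii) and $n$ is the unit outer normal vector to $\mathcal{M}$ in $\mathbb{R}^3$ (if $\frac{\partial}{\partial y_1}, \frac{\partial}{\partial y_2}$ is a positive basis at $a \in \mathcal{M}$, then $n$ is perpendicular in $\mathbb{R}^3$ to $T_a\mathcal{M}$ and $\left(n, \frac{\partial}{\partial y_1}, \frac{\partial}{\partial y_2}\right)$ is a positive basis in $\mathbb{R}^3$).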
Proof (a hint). Rewrite the abstract Stokes theorem for this special case using the corresponding definitions of integrals and Example 4.3.58(ii).
Remark 4.3.97. Considering $F$ in Corollary 4.3.95 as a velocity field of a fluid flow (Remark 4.3.56) we can interpret the right-hand side in (4.3.46) as the amount of the fluid which flows out of a region $\Omega$ per unit time. In particular, if the divergence of $F$ ($\operatorname{div} F := \sum_{i=1}^{3} \frac{\partial f^i}{\partial x_i}$) vanishes everywhere in $\Omega$, then this amount is zero for any subregion of $\Omega$. In other words, the fluid is incompressible in this case.
Remark 4.3.98. Using an "infinitesimal" ball $\Omega$ centered at $a$ in Corollary 4.3.95 it is possible to interpret the value of the function $\operatorname{div} F = \sum_{i=1}^{3} \frac{\partial f^i}{\partial x_i}$ at the point $a$ physically. From the mathematical point of view it is more interesting that this is a starting point for the generalization of basic differential operators to non-flat domains. We now briefly describe this procedure. Let $F$ be a vector field on a manifold $\mathcal{M}$ and let $\omega$ be an $M$-form. Define an $(M - 1)$-form $\omega F$ by
\[ (\omega F)(v_1, \dots, v_{M-1}) := \omega(F, v_1, \dots, v_{M-1}). \]
If $(\mathcal{M}, \omega)$ is an oriented manifold, then $d(\omega F)$ has to be a multiple of $\omega$:
\[ d(\omega F) := (\operatorname{div} F)\, \omega. \]
We strongly recommend that the reader computes $\operatorname{div} F$, e.g., on the sphere $S^2$. One of the most important partial differential operators is the Laplacian $\Delta$: if $G$ is an open set in $\mathbb{R}^N$, $f : G \to \mathbb{R}$ is smooth, then
\[ \Delta f := \sum_{i=1}^{N} \frac{\partial^2 f}{\partial x_i^2} \]
and it is easy to see that
\[ \Delta f = \operatorname{div}(\nabla f) \qquad \text{where} \qquad \nabla f := \left( \frac{\partial f}{\partial x_1}, \dots, \frac{\partial f}{\partial x_N} \right) \]
is the gradient of $f$. Since the notion of the gradient has been defined for functions on manifolds (Remark 4.3.56), we are able to generalize the Laplacian to functions defined on a manifold $\mathcal{M}$: $\Delta_{\mathcal{M}} f := \operatorname{div}(\nabla f)$. This operator $\Delta_{\mathcal{M}}$ is often called the Laplace–Beltrami operator. For more information on the significance of this operator the reader can consult, e.g., Chavel [23], Davies & Safarov [31], Robinson [108] or Rosenberg [110].
Remark 4.3.99. In weak formulations of boundary problems for elliptic differential equations (see Chapter 7) the following Green formula is frequently used: Let a manifold $\mathcal{M}$ satisfy the assumptions of the abstract Stokes Theorem and let $f, g \in C^2(\mathcal{M})$. Then
\[ \int_{\mathcal{M}} (\Delta_{\mathcal{M}} f)\, g\, \omega + \int_{\mathcal{M}} (\nabla f, \nabla g)_{T_x\mathcal{M}}\, \omega = \int_{\partial\mathcal{M}} g\, (\nabla f, n)_{T_x(\partial\mathcal{M})}\, \omega^\partial. \tag{4.3.47} \]
The proof is based on a generalization of (4.3.46):
\[ \int_{\mathcal{M}} (\operatorname{div} F)\, \omega = \int_{\partial\mathcal{M}} (F, n)\, \omega^\partial \]
which follows from the abstract Stokes Theorem. Another ingredient is the formula for the divergence of the product of a function and a vector field
\[ \operatorname{div}(gF) = g \operatorname{div} F + (\nabla g, F). \]
This formula follows from the definition of the divergence by computation.
Exercise 4.3.100. Prove that the definition of $\int_\gamma \omega$ does not depend on the choice of an atlas and a partition of unity. What can be said about the dependence on a parametrization of $\gamma$?
Exercise 4.3.101. Let $\omega$ be an exact one-form and $f$ its "primitive function", i.e., $df = \omega$. Compute $\int_\gamma \omega$! What is the result if $\gamma$ is a closed curve?
Exercise 4.3.102. Let $\omega$ be a closed one-form in $\mathcal{M}$. Show that $\omega$ is exact if and only if $\int_\gamma \omega = 0$ for any smooth closed curve $\gamma$ in $\mathcal{M}$.
Hint. Consult the proof of Theorem 4.3.64.
Exercise 4.3.103. Prove (4.3.41)!
Hint. Use induction. To prove the induction step it is convenient to compute the distance $\delta$ of $v_M$ from the span of $v_1, \dots, v_{M-1}$. Show that
\[ \delta^2 = \frac{\det G(v_1, \dots, v_M)}{\det G(v_1, \dots, v_{M-1})} \]
provided $v_1, \dots, v_{M-1}$ are linearly independent.
Exercise 4.3.104. Check that the definition (4.3.38) is a special case of (4.3.42)! Express the length of a curve $\gamma \subset \mathbb{R}^3$ in spherical coordinates!
Exercise 4.3.105. Let a surface $S$ be determined by the graph of a smooth function $f : U \subset \mathbb{R}^2 \to \mathbb{R}$. Find a formula for the area of $S$!
Hint. $\int_U \sqrt{1 + \left(\frac{\partial f}{\partial x}\right)^2 + \left(\frac{\partial f}{\partial y}\right)^2}\, dx\, dy$
Exercise 4.3.106. Let $\mathcal{M}$ be the graph of a smooth function $f : \mathbb{R}^M \to \mathbb{R}$ defined on an open set $U \subset \mathbb{R}^M$. Show that $\mathcal{M}$ is an orientable manifold!
Exercise 4.3.107. Let $f : \mathbb{R}^N \to \mathbb{R}$ be a smooth mapping for which $o$ is a regular value. Show that $\mathcal{M} = \{x \in \mathbb{R}^N : f(x) = 0\}$ is an orientable manifold of dimension $N - 1$ (provided $\mathcal{M} \neq \emptyset$)!
Exercise 4.3.108. Deduce a version of the Cauchy Theorem, i.e.,
\[ \int_\gamma f\, dz = 0, \]
from the Green Theorem (Corollary 4.3.93)!
Hint. Interpret $f(z)\, dz$ as the couple of differential 1-forms $(g\, dx - h\, dy,\ h\, dx + g\, dy)$.
Exercise 4.3.109. Find the formula34 for $\Delta f$
(i) in polar coordinates;
Hint. You should get
\[ \Delta f = \frac{\partial^2 f}{\partial r^2} + \frac{1}{r} \frac{\partial f}{\partial r} + \frac{1}{r^2} \frac{\partial^2 f}{\partial\vartheta^2}; \]
(ii) in spherical coordinates in $\mathbb{R}^3$;
34 Such formulae are convenient if one is looking for a solution with some symmetries, i.e., invariant with respect to some group actions.
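Hint. You should get
\[ \Delta f = \frac{\partial^2 f}{\partial r^2} + \frac{2}{r} \frac{\partial f}{\partial r} + \frac{1}{r^2 \cos\vartheta} \left( \frac{1}{\cos\vartheta} \frac{\partial^2 f}{\partial\alpha^2} + \cos\vartheta\, \frac{\partial^2 f}{\partial\vartheta^2} - \sin\vartheta\, \frac{\partial f}{\partial\vartheta} \right); \]
(iii) on $S^2$;
(iv) on the Riemann manifold (see footnote 27 on page 214).
Exercise 4.3.110. Let $\mathcal{M}$ be a connected differentiable manifold.
(i) Prove that there exists a Riemann metric $g$ on $\mathcal{M}$.
Hint. $\mathcal{M}$ is embedded into $\mathbb{R}^N$.
(ii) Prove that any two points $x, y \in \mathcal{M}$ can be connected by a $C^1$-curve, i.e., there is
\[ \gamma : [a, b] \to \mathcal{M}, \qquad \gamma \in C^1, \qquad \gamma(a) = x,\ \gamma(b) = y. \]
Hint. Use the assumption that $\mathcal{M}$ is connected.
(iii) Define the length of a $C^1$-curve $\gamma : [a, b] \to \mathcal{M}$ by the formula
\[ l_g(\gamma) := \int_a^b \sqrt{g_{\gamma(t)}(\dot\gamma(t), \dot\gamma(t))}\, dt. \]
Put $\varrho_g(x, y) = \inf\{l_g(\gamma) : \gamma : [a, b] \to \mathcal{M},\ \gamma(a) = x,\ \gamma(b) = y\}$ and show that $\varrho_g$ is a metric on $\mathcal{M}$.
(iv) How is the topology on $\mathcal{M}$ given by $\varrho_g$ related to the topology induced from $\mathbb{R}^N$?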
4.3D Brouwer Degree
We will establish the main properties of the degree of a mapping $f : \mathbb{R}^M \to \mathbb{R}^M$ in the basic text (Section 5.2). In this appendix we will present another definition of the Brouwer degree, prove its main properties and give some of its topological applications. This goal can be achieved in different ways but all of them are lengthy and contain intricate calculations. Here we choose the treatment based on the integration of differential forms, mainly to introduce the interested reader to a "geometric world" and to relate the notion of the degree to classical results of the theory of functions of a complex variable. Since C.F. Gauss was one of the fathers of this theory we will start with his approach to the so-called Fundamental Theorem of Algebra.
Theorem 4.3.111 (Fundamental Theorem of Algebra). Let $P$ be a polynomial with complex coefficients of degree at least 1. Then there exists $z_0 \in \mathbb{C}$ such that $P(z_0) = 0$. Equivalently, $P : \mathbb{C} \to \mathbb{C}$ is a surjective mapping.
We will give two rather different proofs, the ideas of which go back to Gauss. The former is purely geometric with a little analysis for introducing a differentiable structure on the sphere $S^2$. The latter uses the theory of functions of a complex variable and demonstrates the above mentioned connection to the degree.
Proof (see, e.g., Milnor [95]). We will regard the sphere $S^2$ as a compactification of the complex plane and endow it with the structure of a differentiable manifold by two charts (see Example 4.3.5(iii)) given by the stereographic projections $\pi_+$, $\pi_-$ of $S^2 \setminus \{N\}$ where $N$ is the north pole, and $S^2 \setminus \{S\}$ where $S$ is the south pole, respectively, onto $\mathbb{R}^2$. See Figure 4.3.25.

[Figure 4.3.25. The stereographic projections $\pi_+(x)$ and $\pi_-(x)$ of a point $x \in S^2$ onto $\mathbb{R}^2$; $N$ and $S$ are the north and south poles.]
Let $P$ be a non-constant polynomial. We define
\[ f(x) := \begin{cases} \pi_+^{-1} \circ P \circ \pi_+(x), & x \in S^2 \setminus \{N\}, \\ N, & x = N. \end{cases} \]
One can prove that $f\colon S^2 \to S^2$ is continuously differentiable at all points. To prove differentiability at $N$ it is sufficient to verify another formula for $f$, namely
\[ f = \pi_-^{-1} \circ Q \circ \pi_- \quad\text{where}\quad Q(z) = \frac{1}{\overline{P(\bar z^{-1})}}, \quad Q(0) = 0 \]
($\bar z$ is the complex conjugate of $z$). The calculation of $f_*$ (see (4.3.19)) shows that $x_0 \in S^2$ is a singular point of $f$ (i.e., $f_*(T_{x_0} S^2) \ne T_{f(x_0)} S^2$) if and only if
\[ P'(z_0) = 0, \quad z_0 = \pi_+(x_0). \]
Since the last equation has only a finite number of solutions, the set $A$ of singular values of $f$ is finite. This is the main point where we have used that $P$ is a polynomial. Consider now a point $y \in S^2 \setminus A$, i.e., $y$ is a regular value of $f$. Then the set $f^{-1}(y)$ is finite (possibly empty) since a polynomial takes any value only finitely many times. Let $\varphi(y)$ denote the number of points, i.e., the cardinality, of $f^{-1}(y)$. The main part of the proof is to show that $\varphi$ is a constant function on $B := S^2 \setminus A$. To prove that we consider two kinds of regular values, $B_1 = \{y : f^{-1}(y) = \emptyset\}$, $B_2 = B \setminus B_1$. Since $B_1 = S^2 \setminus f(S^2)$ and $S^2$ is compact, $B_1$ is open in $S^2$. For $y \in B_2$ we have $f^{-1}(y) = \{x_1, \dots, x_k\}$
and there are disjoint open neighborhoods $U(x_1), \dots, U(x_k)$ on which $f$ is a diffeomorphism (see Local Inverse Function Theorem 4.1.1). Put $V_i := f(U(x_i))$. It is easy to see that $\varphi$ is constant on the set
\[ \bigcap_{i=1}^{k} V_i \setminus f\Bigl(S^2 \setminus \bigcup_{i=1}^{k} U(x_i)\Bigr), \]
which is a neighborhood of the point $y$. Since $S^2$ is connected and $A$ is finite, the set $B$ is also connected. The function $\varphi$, being locally constant, is constant on the whole of $B$. Moreover, $\varphi$ cannot vanish on $B$, i.e., $B_1 = \emptyset$. This shows that $f$ actually maps $S^2$ onto $S^2$, i.e., $P\colon \mathbb{C} \to \mathbb{C}$ is surjective as well. In particular, there exists $z_0 \in \mathbb{C}$ such that $P(z_0) = 0$.

The following result generalizes the above fact that $f$ is surjective (see, e.g., Sternberg [124, Theorem 3.4.3]).

Proposition 4.3.112. Let $M_1$, $M_2$ be two oriented manifolds of the same dimension, and let $M_2$ be a connected space. Suppose that $f\colon M_1 \to M_2$ is a proper^35 differentiable mapping such that its realization (see Figure 4.3.15) has a nonnegative Jacobian at any point. Then either the Jacobian vanishes everywhere or $f(M_1) = M_2$.

Remark 4.3.113. Write $f\colon \mathbb{C} \to \mathbb{C}$ as $f(z) = g(x, y) + ih(x, y)$
where $z = x + iy$ and $g, h\colon \mathbb{R}^2 \to \mathbb{R}$. If $f$ is a holomorphic function in an open set $G$, then the Cauchy–Riemann conditions
\[ \frac{\partial g}{\partial x} = \frac{\partial h}{\partial y}, \qquad \frac{\partial g}{\partial y} = -\frac{\partial h}{\partial x} \]
hold for $z = x + iy \in G$, and the Jacobian of $(g, h)\colon \mathbb{R}^2 \to \mathbb{R}^2$ satisfies $J_{(g,h)}(x, y) = |f'(z)|^2$. In particular, $J_{(g,h)} \ge 0$ and if $f$ is a polynomial, then $f$ is proper (why?), and so Proposition 4.3.112 can be used to get the Fundamental Theorem of Algebra.

The idea of the latter proof is based on the notion of the index of a point $a$ with respect to a curve. If $\gamma$ is a closed $C^1$-curve in $\mathbb{C}$, $a \notin \gamma$,^36 then the index, $\operatorname{Ind}_\gamma a$, is defined by the formula
\[ \operatorname{Ind}_\gamma a := \frac{1}{2\pi i} \int_\gamma \frac{dz}{z - a}. \]
We say that $\gamma$ is positively oriented if $\operatorname{Ind}_\gamma a \ge 0$ for all $a \notin \gamma$.^37
35 A mapping $f$ is said to be proper if $f^{-1}(K)$ is compact whenever $K$ is a compact set.
36 Not to complicate matters by different notation we identify the curve, i.e., a mapping $\gamma\colon$ interval $I \to \mathbb{C}$, with its image.
37 This definition, common in the theory of functions of a complex variable, coincides with our definition of an oriented manifold.
In particular, if $a = 0$ and $\gamma(t) = re^{it}$, $t \in [0, 2n\pi]$, $n \in \mathbb{Z}$, then $\operatorname{Ind}_\gamma 0 = n$ and it can be interpreted as the number of revolutions of $\gamma$ and also as the increment of the argument along $\gamma$ divided by $2\pi$.

Proposition 4.3.114 (Rouché). Let $\gamma$ be a simple, closed, positively oriented $C^1$-curve in an open set $\Omega$, and let $G := \{z \in \Omega \setminus \gamma : \operatorname{Ind}_\gamma z \ne 0\}$. If $f$ is a holomorphic function in $\Omega$ for which $0 \notin f(\gamma)$, then the number $N_f(G)$ of solutions of the equation $f(z) = 0$ that belong to $G$ is equal to
\[ N_f(G) = \frac{1}{2\pi i} \int_\gamma \frac{f'(z)}{f(z)}\, dz = \operatorname{Ind}_{f \circ \gamma} 0\,{}^{38} \tag{4.3.48} \]
provided the solutions are counted with their multiplicity.^39 If $g$ is another holomorphic function in $\Omega$ such that
\[ |f(z) - g(z)| < |f(z)|, \quad z \in \gamma, \tag{4.3.49} \]
then $N_f(G) = N_g(G)$.

Proof. The proof is based on the Residue Theorem, see, e.g., Rudin [113, Theorem 10.43].

We wish to point out that the condition (4.3.49) is a quantitative description of stability of the number of solutions with respect to perturbations of $f$.

The second proof of Theorem 4.3.111. Our second proof of the Fundamental Theorem of Algebra follows easily from the previous proposition: Suppose that $P(z) = z^n + a_1 z^{n-1} + \dots + a_n$ and put $f(z) = z^n$. Let $R > 0$ be such that
\[ |a_1 z^{n-1} + \dots + a_n| < |z^n| = R^n \quad\text{for}\quad |z| = R \]
(why does such an $R$ exist?). Proposition 4.3.114 implies that $n = N_f(G) = N_P(G)$ where $G$ is the open ball $B(0; R)$.
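The counting argument of the second proof can be checked numerically. The following is only a sketch under stated assumptions: the cubic `P`, the radii and the sampling resolution are hypothetical choices that do not come from the text, and the winding number $\operatorname{Ind}_{f\circ\gamma} 0$ is approximated by summing increments of the argument along a sampled circle rather than by evaluating the contour integral (4.3.48).

```python
import cmath
import math


def winding_number(f, radius, samples=4096):
    """Approximate Ind_{f∘γ} 0 for the circle γ(t) = radius * e^{it}, t in [0, 2π].

    The increments of the argument of f along γ are summed and divided by 2π.
    Assumes f has no zero on the circle and that `samples` is large enough
    for every single increment to stay well below π in absolute value.
    """
    total = 0.0
    prev = f(complex(radius, 0.0))
    for k in range(1, samples + 1):
        z = radius * cmath.exp(2j * math.pi * k / samples)
        cur = f(z)
        total += cmath.phase(cur / prev)  # argument increment in (-π, π]
        prev = cur
    return round(total / (2 * math.pi))


def P(z):
    # Hypothetical example polynomial of degree n = 3 (not taken from the text).
    return z**3 - 2 * z + 5


def f(z):
    # Comparison function f(z) = z^n from the second proof, here with n = 3.
    return z**3


R = 3.0  # on |z| = 3 we have |-2z + 5| <= 11 < 27 = R^3, so (4.3.49) holds
n_f = winding_number(f, R)
n_P = winding_number(P, R)
```

With these choices both winding numbers should agree with the degree of the polynomial, in line with $n = N_f(G) = N_P(G)$. On a smaller circle the count drops to the number of zeros actually enclosed.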
38 This quantity is also called the winding number of $f$ at 0 with respect to $\gamma$.
39 A solution $z_0$ has multiplicity $k$ if $f(z_0) = \dots = f^{(k-1)}(z_0) = 0$, $f^{(k)}(z_0) \ne 0$. Notice that $k$ is finite provided $f$ is not identically zero.

The connection between the winding number and the degree $\deg(f, \Omega, p)$ as the latter is defined in Definition 5.2.1 for a regular value $p$ is given by the following result. For a holomorphic function $f$ and its regular value $p$ the degree $\deg(f, \Omega, p)$ is defined as the number of solutions in $\Omega$ of the equation $f(z) = p$. By identification of $f\colon \mathbb{C} \to \mathbb{C}$, $f(z) = g(x, y) + ih(x, y)$ with $(g, h)\colon \mathbb{R}^2 \to \mathbb{R}^2$,
this definition coincides with Definition 5.2.1.

Lemma 4.3.115. Let $\Omega$ be an open, bounded set whose boundary is the image of a simple, closed, positively oriented $C^1$-curve $\gamma$. Assume that $f\colon \mathbb{C} \to \mathbb{C}$ is holomorphic in a certain neighborhood of $\overline{\Omega}$ and $p \notin f(\partial\Omega)$ is a regular value of $f$ (i.e., $f'(z) \ne 0$ whenever $f(z) = p$). Then
\[ \deg(f, \Omega, p) = \frac{1}{2\pi i} \int_\gamma \frac{f'(z)}{f(z) - p}\, dz. \tag{4.3.50} \]

Proof. Denote $A = \{z \in \Omega : f(z) = p\}$. If $A = \emptyset$, then both sides vanish (use the Cauchy Theorem for the right-hand side). If $A \ne \emptyset$, then $A$ is finite since $A \cap \partial\Omega = \emptyset$, and $f$ is holomorphic. Both sides in (4.3.50) are equal to the cardinality of $A$. This follows for the left-hand side from the definition of the degree and for the right-hand side by the Residue Theorem.

We would like to point out here that the formula (4.3.50) indicates a way to remove the assumption on regularity of $p$. Namely, the integral exists for any holomorphic function $f$ provided $p \notin f(\partial\Omega)$. Put
\[ d := \inf_{z \in \partial\Omega} |f(z) - p|, \]
and let $A = \{z_1, \dots, z_k\}$ be the same as in the proof of Lemma 4.3.115. Assume that the solution $z_j$ is of multiplicity $m_j$, and denote $m := m_1 + \dots + m_k$. According to Proposition 4.3.114, the number of solutions (including multiplicity) of the equation $f(z) = q$ is also $m$ provided $|p - q| < d$. In this neighborhood of the point $p$ there exists a regular value of $f$. This follows either from the Sard Theorem (see Theorem 5.2.3) or, in this special case more easily, from the properties of holomorphic functions – see, e.g., Rudin [113, Theorem 10.32]. If the degree has the property of stability with respect to the point (Property (vi) or (viii) in Theorem 5.2.7 or Theorem 4.3.124), the equality (4.3.50) will also hold for singular values.

In order to motivate the next result, we note that the integral in (4.3.50) can be rewritten as an integral over $\Omega$ (by the Green Theorem). We avoid such an uninteresting and intricate calculation, and put aside the special case of holomorphic functions. Instead we will consider the general case of mappings $\mathbb{R}^M \to \mathbb{R}^M$. For the rest of this appendix we will suppose that
\[ \text{(H)} \quad \begin{cases} \Omega \text{ is a bounded open set in } \mathbb{R}^M, \\ f\colon \overline{\Omega} \to \mathbb{R}^M \text{ is continuous on } \overline{\Omega}, \\ p \in \mathbb{R}^M \setminus f(\partial\Omega). \end{cases} \]
Proposition 4.3.116. In addition to (H) let $f \in C^1(\Omega, \mathbb{R}^M)$ and let $p$ be a regular value of $f$. Then there exist a neighborhood $U$ of the point $p$ and a smooth function $\alpha\colon \mathbb{R}^M \to \mathbb{R}$ supported in $U$ with $\int_U \alpha(y)\, dy = 1$ such that
\[ \int_\Omega f^*\omega = \sum_{x \in \Omega,\ f(x) = p} \operatorname{sgn} J_f(x) \quad (:= \deg(f, \Omega, p)). \tag{4.3.51} \]
Here $f^*\omega$ is the pull-back of the form $\omega(y) = \alpha(y)\, dy^1 \wedge \dots \wedge dy^M$ (see Remark 4.3.54(iii)).

Proof.^40 By the definition of the pull-back we have
\[ f^*\omega(x) = \alpha(f(x)) J_f(x)\, dx^1 \wedge \dots \wedge dx^M \quad\text{and}\quad \int_\Omega f^*\omega = \int_\Omega \alpha(f(x)) J_f(x)\, dx \]
where $J_f(x)$ is the Jacobian of $f$ at $x$. We consider two complementary cases:

Case 1 ($p \notin f(\overline{\Omega})$). Then there is a neighborhood $U$ of $p$ which is a subset of $\mathbb{R}^M \setminus f(\overline{\Omega})$. Choose a smooth function $\alpha$ on $\mathbb{R}^M$ with its support in $U$ and such that $\int_U \alpha(y)\, dy = 1$. Then $\int_\Omega f^*\omega = 0$ since $\alpha|_{f(\overline{\Omega})} = 0$. The right-hand side of (4.3.51) also vanishes by definition ($\sum_{\emptyset} = 0$).
Case 2 ($p \in f(\Omega)$). Since $p$ is a regular value, the set $A := \{x \in \Omega : f(x) = p\}$ is finite. Let $A = \{x_1, \dots, x_k\}$. Then for any $x_i \in A$ there is a neighborhood $V_i \subset \Omega$ of $x_i$ such that $f|_{V_i}$ is a diffeomorphism of $V_i$ onto a neighborhood $U_i$ of $p$. These neighborhoods $V_1, \dots, V_k$ can be chosen mutually disjoint. Take a neighborhood $U \subset \bigcap_{i=1}^{k} U_i$ of $p$ such that
\[ f^{-1}(U) \subset \bigcup_{i=1}^{k} V_i \]
(why does such a $U$ exist?). Choose a smooth function $\alpha$ with its support in $U$ and normalize $\alpha$ so that $\int_U \alpha(y)\, dy = 1$. Then
\[ \int_\Omega f^*\omega = \sum_{i=1}^{k} \int_{V_i} f^*\omega = \sum_{i=1}^{k} \int_{V_i} \alpha(f(x)) J_f(x)\, dx = \sum_{i=1}^{k} \operatorname{sgn} J_f(x_i) \int_{V_i} \alpha(f(x)) |J_f(x)|\, dx \]
\[ = \sum_{i=1}^{k} \operatorname{sgn} J_f(x_i) \int_{f(V_i)} \alpha(y)\, dy = \sum_{i=1}^{k} \operatorname{sgn} J_f(x_i). \]

40 The proof based on a more explicit construction of the form $\omega$ is given by Mawhin [92]. There the reader can also find a different approach to the homotopy invariance property of the degree (see Lemma 4.3.117 below).
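The right-hand side of (4.3.51) is easy to evaluate once the preimages of a regular value are known. The sketch below is illustrative only: the two planar maps, the point $p = (0.25, 0)$ and the helper names are hypothetical, and the Jacobian is approximated by central differences instead of being computed symbolically.

```python
def jacobian_det_2d(f, x, y, h=1e-6):
    """Central-difference approximation of the Jacobian determinant J_f(x, y)."""
    fx_plus, fx_minus = f(x + h, y), f(x - h, y)
    fy_plus, fy_minus = f(x, y + h), f(x, y - h)
    a = (fx_plus[0] - fx_minus[0]) / (2 * h)  # d f1 / dx
    b = (fy_plus[0] - fy_minus[0]) / (2 * h)  # d f1 / dy
    c = (fx_plus[1] - fx_minus[1]) / (2 * h)  # d f2 / dx
    d = (fy_plus[1] - fy_minus[1]) / (2 * h)  # d f2 / dy
    return a * d - b * c


def degree_from_preimages(f, preimages):
    """Sum of sgn J_f over the supplied preimages of a regular value."""
    total = 0
    for x, y in preimages:
        total += 1 if jacobian_det_2d(f, x, y) > 0 else -1
    return total


def square(x, y):
    # The map z -> z^2 written in real coordinates (orientation preserving).
    return (x * x - y * y, 2 * x * y)


def conj_square(x, y):
    # The map z -> conj(z)^2 written in real coordinates (orientation reversing).
    return (x * x - y * y, -2 * x * y)


# In the unit disc both maps take the value p = (0.25, 0) exactly at (±0.5, 0).
preimages = [(0.5, 0.0), (-0.5, 0.0)]
deg_square = degree_from_preimages(square, preimages)
deg_conj_square = degree_from_preimages(conj_square, preimages)
```

The two maps have the same preimages but opposite Jacobian signs, so the sums differ in sign. The caller must supply all preimages in $\Omega$; the function does not search for them.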
We define the degree $\deg(f, \Omega, p)$ for $f \in C^1(\Omega, \mathbb{R}^M)$ and $p$ a regular value of $f$ by the right-hand side of (4.3.51). This definition coincides with Definition 5.2.1. We also want to point out that Case 2 of the proof contains the essence of the use of differential forms for the definition of the degree. The formula (4.3.51) can hardly be used for computation of the degree but its advantage consists in the fact that the integral exists also if $p$ is a singular value. However, we should be careful and examine whether this integral does not depend on the choice of an $M$-form $\omega$ also in the case when $p$ is a singular value.

Lemma 4.3.117. Let $\omega$ be a smooth $M$-form in $\mathbb{R}^M$ with its support in a certain open cube $Q$. If $\int_Q \omega = 0$, then there exists a smooth $(M - 1)$-form $\eta$ with support in $Q$ such that $d\eta = \omega$, i.e., $\omega$ is exact.

Proof. It is similar to that of the Poincaré Theorem (Theorem 4.3.64).
Corollary 4.3.118. Suppose (H) and $f \in C^1(\Omega, \mathbb{R}^M)$. Let $\omega$, $\tilde\omega$ be two smooth $M$-forms with supports in a cube $Q \subset \mathbb{R}^M \setminus f(\partial\Omega)$ such that $\int_Q \omega = \int_Q \tilde\omega$. Then
\[ \int_\Omega f^*\omega = \int_\Omega f^*\tilde\omega. \]

Proof. According to Lemma 4.3.117 there exists an $(M - 1)$-form $\eta$ for which $d\eta = \omega - \tilde\omega$. Then
\[ f^*(\omega - \tilde\omega) = f^*(d\eta) = d(f^*\eta) \]
(see Exercise 4.3.71(iv)). Since the support of the form $f^*\eta$ is a compact subset of $\Omega$, the Stokes Theorem (Theorem 4.3.92) implies that $\int_\Omega d(f^*\eta) = 0$.
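For $M = 1$ the independence of $\int_\Omega f^*\omega$ from the choice of $\omega$, and the fact that the integral still makes sense at a singular value, can be observed numerically. This is a sketch under assumptions that are not in the text: the cubic $f(x) = x^3 - x$ on $\Omega = (-2, 2)$, the particular bump function, and the midpoint quadrature are all illustrative choices.

```python
import math


def bump(s):
    """Unnormalized smooth bump supported in (-1, 1)."""
    return math.exp(-1.0 / (1.0 - s * s)) if abs(s) < 1.0 else 0.0


def midpoint(func, a, b, n):
    """Composite midpoint rule for the integral of func over [a, b]."""
    h = (b - a) / n
    return h * sum(func(a + (i + 0.5) * h) for i in range(n))


BUMP_MASS = midpoint(bump, -1.0, 1.0, 20000)


def alpha(y, p, r):
    """Smooth density supported in (p - r, p + r) whose integral is 1."""
    return bump((y - p) / r) / (r * BUMP_MASS)


def degree_integral(f, df, a, b, p, r, n=100000):
    """Approximate the integral over (a, b) of f^*ω for ω = α(y) dy, M = 1."""
    return midpoint(lambda x: alpha(f(x), p, r) * df(x), a, b, n)


def f(x):
    # Hypothetical example: zeros at -1, 0, 1 with derivative signs +, -, +.
    return x**3 - x


def df(x):
    return 3 * x * x - 1


p_singular = 2.0 / (3.0 * math.sqrt(3.0))  # local maximum value of f

wide = degree_integral(f, df, -2.0, 2.0, 0.0, 0.3)
narrow = degree_integral(f, df, -2.0, 2.0, 0.0, 0.1)
at_singular_value = degree_integral(f, df, -2.0, 2.0, p_singular, 0.2)
```

Both forms supported near the regular value $0$ give approximately $1 = (+1) + (-1) + (+1)$, and the same value is obtained at the singular value because the contributions near the local maximum cancel. The supports $(p - r, p + r)$ are chosen to avoid $f(\partial\Omega) = \{-6, 6\}$.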
Let $p$ be a singular value of $f$ and $B := B(p; r)$ be such a ball that $B \cap f(\partial\Omega) = \emptyset$. According to the Sard Theorem (see Corollary 5.2.4) there is a regular value $q$ of $f$ in $B$. We want to define
\[ \deg(f, \Omega, p) := \deg(f, \Omega, q) = \int_\Omega f^*\omega, \]
where $\omega$ is an $M$-form as in Proposition 4.3.116 (for the regular value $q$) supported in a cube $Q$.

To be sure that this definition does not depend on the choice of the point $q$ we choose another regular value $\tilde q \in B$. It is obvious that a diffeomorphism $h$ of $B$ (onto $B$) can be constructed such that $h(q) = \tilde q$.
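Let $\tilde\omega$ be a differential form constructed in the proof of Proposition 4.3.116 supported on a cube $\tilde U$, $\tilde q \in \tilde U \subset B$. Then by the definition of pull-back
\[ \int_{\tilde U} \tilde\omega = \int_{U} h^*\tilde\omega. \]
The form $\tilde\omega$ can be chosen in such a way that $h^*\tilde\omega$ is supported by the same cube $Q$ as the form $\omega$ (explain why!). By Corollary 4.3.118,
\[ \int_Q \omega = \int_Q h^*\tilde\omega = \int_{\tilde U} \tilde\omega. \]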
This shows that the assumption on regularity of $p$ can be omitted in the definition of the degree. Now we may drop the assumption on smoothness of $f$. However, it will need a certain further effort.

Lemma 4.3.119. Let $\Omega$ be a bounded open set in $\mathbb{R}^M$ and let $H\colon [0, 1] \times \overline{\Omega} \to \mathbb{R}^M$ be such that the mapping $t \in [0, 1] \mapsto H(t, \cdot) \in C^1(\Omega, \mathbb{R}^M) \cap C(\overline{\Omega}, \mathbb{R}^M)$ is continuous. If $p \in \mathbb{R}^M \setminus H([0, 1] \times \partial\Omega)$, then $\deg(H(t, \cdot), \Omega, p)$ is constant on the interval $[0, 1]$.

Proof. First note that $H$ is continuous as a mapping considered on $[0, 1] \times \overline{\Omega}$, and therefore $H([0, 1] \times \partial\Omega)$ is compact and there is an open cube $Q \subset \mathbb{R}^M \setminus H([0, 1] \times \partial\Omega)$ which is a neighborhood of $p$. Choose a smooth $M$-form $\omega$ with its support in $Q$. By definition,
\[ \deg(H(t, \cdot), \Omega, p) = \int_\Omega (H(t, \cdot))^*\omega. \]
This shows that the function t → deg (H(t, ·), Ω, p) is continuous on [0, 1]. Taking only integer values it has to be constant.
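Corollary 4.3.120. Suppose (H) and denote $d := \operatorname{dist}(p, f(\partial\Omega))$. If $g$, $h$ are mappings from $C^1(\Omega, \mathbb{R}^M) \cap C(\overline{\Omega}, \mathbb{R}^M)$ such that
\[ \|f - g\|_{C(\overline{\Omega}, \mathbb{R}^M)} := \sup_{x \in \overline{\Omega}} |f(x) - g(x)| < d, \qquad \|f - h\|_{C(\overline{\Omega}, \mathbb{R}^M)} < d, \]
then $\deg(g, \Omega, p) = \deg(h, \Omega, p)$.

Proof. Put $H(t, x) := (1 - t)g(x) + th(x)$, $t \in [0, 1]$, $x \in \overline{\Omega}$. The assertion now follows from Lemma 4.3.119.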
The last step in our approach to the definition of the degree consists in the approximation of a continuous mapping by smooth ones.
Lemma 4.3.121. Let $\Omega$ be a bounded open set in $\mathbb{R}^M$ and let $f\colon \overline{\Omega} \to \mathbb{R}^M$ be continuous. Then there exist mappings $f_n \in C^1(\Omega, \mathbb{R}^M) \cap C(\overline{\Omega}, \mathbb{R}^M)$ that converge uniformly on $\overline{\Omega}$ to $f$.

Proof. Observe first that it is sufficient to prove the statement for individual components of $f$. So we will assume that $f\colon \overline{\Omega} \to \mathbb{R}$ is continuous. There are many ways to prove the density of smooth functions in the space of continuous functions. See the discussion on page 31. We will present the approach based on convolution approximations (see Proposition 1.2.20 and the proof of Proposition 1.2.21). Extend $f$ to a continuous bounded function $g$ on $\mathbb{R}^M$ (such an extension exists by the Tietze Theorem^41). Choose a nonnegative $C^\infty$-function $\varphi\colon \mathbb{R}^M \to \mathbb{R}$ with compact support and $\int_{\mathbb{R}^M} \varphi(x)\, dx = 1$. Put
\[ \varphi_n(x) = n^M \varphi(nx). \]
Then the convolutions
\[ (\varphi_n * g)(x) := \int_{\mathbb{R}^M} \varphi_n(x - y) g(y)\, dy, \quad x \in \mathbb{R}^M, \]
are $C^\infty$-functions which converge to $g$ locally uniformly on $\mathbb{R}^M$, in particular, uniformly to $f$ on $\overline{\Omega}$. Proposition 1.2.20(iii) implies the smoothness of $\varphi_n * g$. To see the convergence it is convenient to visualize the form of $\varphi_n$ (see Figure 4.3.26 for $M = 1$).^42 The convergence is obtained as follows:
\[ |(\varphi_n * g)(x) - g(x)| \le \int_{\mathbb{R}^M} \varphi_n(x - y) |g(y) - g(x)|\, dy \]
\[ = \int_{U(x)} \varphi_n(x - y) |g(y) - g(x)|\, dy + \int_{\mathbb{R}^M \setminus U(x)} \varphi_n(x - y) |g(y) - g(x)|\, dy < \delta + 0. \]
The first integral is arbitrarily small for a sufficiently small neighborhood $U(x)$ of $x$ by the continuity of $g$ at $x$; the second integral is zero for sufficiently large $n$ since $\varphi_n(x - \cdot)$ vanishes on $\mathbb{R}^M \setminus U(x)$ for such $n$.

41 The Tietze Theorem says: Assume that $g$ is a bounded, continuous, real function on a closed non-void subset $A$ of a metric space $X$ equipped with a metric $\rho$. Then there exists a continuous extension $G\colon X \to \mathbb{R}$ such that
\[ \sup_{x \in A} g(x) = \sup_{x \in X} G(x), \qquad \inf_{x \in A} g(x) = \inf_{x \in X} G(x). \]
This theorem permits generalization (nontrivial) to normal topological spaces (see, e.g., Dugundji [43, Section VII.5]). The proof for metric spaces is quite easy. Indeed, without loss of generality we can suppose that $1 \le g \le 2$. Put $G(x) = \inf_{y \in A} \frac{\rho(x, y) g(y)}{\operatorname{dist}(x, A)}$ for $x \in X \setminus A$, and $G(x) = g(x)$ for $x \in A$. It is not difficult to show that $G$ has all the required properties.
42 It is also said that $\varphi_n$ converge to the Dirac measure (this is true in the sense of distributions) or that $\{\varphi_n\}_{n=1}^{\infty}$ is the so-called approximate unit (the space $L^1(\mathbb{R}^M)$ with the convolution multiplication is a Banach algebra without a unit, and the convergence $\varphi_n * g \to g$ takes place in the $L^1$-norm for all $g \in L^1(\mathbb{R}^M)$).
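[Figure 4.3.26. Graphs of $\varphi = \varphi_1$, $\varphi_2$ and $\varphi_5$ for $M = 1$ over the $x$-axis, with supports $[-a, a]$, $[-\frac{a}{2}, \frac{a}{2}]$ and $[-\frac{a}{5}, \frac{a}{5}]$, respectively.]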
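The convolution approximation from the proof of Lemma 4.3.121 can be tried out for $M = 1$. The following sketch rests on illustrative assumptions: $g(x) = |x|$ stands in for the extended function, $\varphi$ is one particular bump supported in $(-1, 1)$, and the convolution integral is replaced by a midpoint sum after the substitution $y = x - s/n$.

```python
import math


def bump(s):
    """Unnormalized smooth bump supported in (-1, 1), playing the role of φ."""
    return math.exp(-1.0 / (1.0 - s * s)) if abs(s) < 1.0 else 0.0


def mollify(g, n, quad_points=2000):
    """Return x -> (φ_n * g)(x) for M = 1, where φ_n(x) = n φ(n x)."""
    h = 2.0 / quad_points
    nodes = [-1.0 + (i + 0.5) * h for i in range(quad_points)]
    weights = [bump(s) for s in nodes]
    mass = sum(weights)
    weights = [w / mass for w in weights]  # discrete weights sum to 1

    def smoothed(x):
        # (φ_n * g)(x) = ∫ φ(s) g(x - s/n) ds after substituting y = x - s/n
        return sum(w * g(x - s / n) for w, s in zip(weights, nodes))

    return smoothed


def sup_error(g, n, grid):
    """Largest deviation |(φ_n * g)(x) - g(x)| over the sample grid."""
    g_n = mollify(g, n)
    return max(abs(g_n(x) - g(x)) for x in grid)


grid = [-1.0 + k / 50 for k in range(101)]  # includes the kink of |x| at 0
errors = {n: sup_error(abs, n, grid) for n in (1, 2, 5, 20)}
```

Since $|x|$ is Lipschitz with constant 1 and $\varphi_n$ is supported in $[-1/n, 1/n]$, each error is bounded by $1/n$, and the computed errors should decrease as $n$ grows, which matches the shrinking supports in Figure 4.3.26.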
Corollary 4.3.122. Suppose (H) and let $\{f_n\}_{n=1}^{\infty}$ be a sequence the existence of which follows from Lemma 4.3.121. Then
\[ \lim_{n \to \infty} \deg(f_n, \Omega, p) \]
exists and its value does not depend on $\{f_n\}_{n=1}^{\infty}$ whenever this sequence possesses the properties from Lemma 4.3.121.

Proof. Since $\operatorname{dist}(p, f(\partial\Omega))$ is positive, we conclude that $p \notin f_n(\partial\Omega)$ for all sufficiently large $n$ and the degrees $\deg(f_n, \Omega, p)$ are defined. Corollary 4.3.120 shows that the sequence $\{\deg(f_n, \Omega, p)\}_{n=1}^{\infty}$ is eventually constant. This corollary also yields the independence of the limit of the sequence of degrees on the choice of $\{f_n\}_{n=1}^{\infty}$.

The previous corollary shows how the definition of the degree $\deg(f, \Omega, p)$ is extended to all triples $(f, \Omega, p)$ satisfying (H).

Remark 4.3.123. The approach which has just been described may be extended to a mapping $f\colon M \to \tilde M$ between manifolds of the same dimension. We need to assume that both $M$ and $\tilde M$ are oriented (in order for the integral of an $M$-form to be defined, cf. Proposition 4.3.116 where instead of $J_f(x)$ we take the Jacobian of a realization of $f$). A set $\Omega \subset M$ is supposed to be an open set with compact closure in the topology of $M$. We consider the degree of $f \in C(\overline{\Omega}, \tilde M)$ with respect to a point $p \in \tilde M \setminus f(\partial\Omega)$. Here $\partial\Omega$ is again the boundary of $\Omega$ in the topology of $M$. $M$-forms have their supports in some coordinate neighborhoods $V$ of $p$ which are disjoint with $f(\partial\Omega)$. An analogue of Lemma 4.3.117 still holds. There are problems with an analogue of Corollary 4.3.120 – we have to use a topology on the space of mappings $M \to \tilde M$ since we have not defined any metric on a manifold. The existence of an approximating sequence similar to that of Lemma 4.3.121 is not clear, either. These obstacles can be overcome but special tools are required. Since they are beyond the scope of this book we only refer the interested reader to, e.g., the book Hirsch [67, Chapters 2 and 5].
We are now able to prove the main properties of the degree. See also Proposition 5.2.2 and Theorem 5.2.7. Notice that the following theorem is also true in the case of manifolds.

Theorem 4.3.124. There exists a mapping deg which sends any triple $(f, \Omega, p)$ satisfying (H) into $\mathbb{Z}$ and has the following properties:
(i) (normalization property) If $f$ is the identity map, $p \in \Omega$, then $\deg(f, \Omega, p) = 1$.
(ii) (additivity property) If $\Omega_1$, $\Omega_2$ are disjoint open subsets of $\Omega$ and the point $p$ is such that $p \notin f(\overline{\Omega} \setminus (\Omega_1 \cup \Omega_2))$, then $\deg(f, \Omega, p) = \deg(f, \Omega_1, p) + \deg(f, \Omega_2, p)$.
(iii) (continuity property) Let $\{f_n, \Omega, p\}$, $n = 0, 1, \dots$, satisfy (H). If the sequence $\{f_n\}_{n=1}^{\infty}$ converges uniformly to $f_0$ on $\overline{\Omega}$, then $\lim_{n \to \infty} \deg(f_n, \Omega, p) = \deg(f_0, \Omega, p)$.
(iv) (translation invariance property) $\deg(f, \Omega, p) = \deg(f - p, \Omega, o)$.
(v) (solution property) If $\deg(f, \Omega, p) \ne 0$, then there exists an $x \in \Omega$ such that $f(x) = p$.
(vi) (homotopy invariance property) If $H\colon [0, 1] \times \overline{\Omega} \to \mathbb{R}^M$ is continuous and (H) is satisfied for all $(H(t, \cdot), \Omega, p)$, $t \in [0, 1]$, then $\deg(H(t, \cdot), \Omega, p)$ is constant on $[0, 1]$.
(vii) (boundary values dependence property) If $(f, \Omega, p)$, $(g, \Omega, p)$ satisfy (H) and $f$ and $g$ coincide on $\partial\Omega$, then $\deg(f, \Omega, p) = \deg(g, \Omega, p)$.
(viii) (point dependence property) The mapping $p \mapsto \deg(f, \Omega, p)$ is constant on every component of $\mathbb{R}^M \setminus f(\partial\Omega)$.
(ix) (multiplication property) Let $\Omega$ be a bounded open set in $\mathbb{R}^M$ and let the mapping $f\colon \overline{\Omega} \to \mathbb{R}^M$ be continuous. Denote by $U_1, \dots$ all bounded components of $\mathbb{R}^M \setminus f(\partial\Omega)$. Suppose that the mapping $g\colon \mathbb{R}^M \to \mathbb{R}^M$ is continuous on $f(\overline{\Omega})$ and $p \notin g(f(\partial\Omega))$. Then
\[ \deg(g \circ f, \Omega, p) = \sum_i \deg(f, \Omega, U_i) \deg(g, U_i, p)\,{}^{43} \tag{4.3.52} \]
where the sum contains only a finite number of nonzero terms.

43 Because of property (viii), $\deg(f, \Omega, U) := \deg(f, \Omega, q)$ for a $q \in U$ is well defined for any component $U$ of $\mathbb{R}^M \setminus f(\partial\Omega)$.
Proof. The degree is defined in Definition 5.2.1 for $f \in C(\overline{\Omega}, \mathbb{R}^N) \cap C^1(\Omega, \mathbb{R}^N)$ and a regular value $p \notin f(\partial\Omega)$ and has been extended above by the procedure started with Proposition 4.3.116. It follows from this construction that it is sufficient to prove all parts of Theorem 4.3.124 for $f \in C(\overline{\Omega}, \mathbb{R}^N) \cap C^1(\Omega, \mathbb{R}^N)$. Another proof is given on pages 270–271 (the proof of Proposition 5.2.2).
(i) This follows immediately from the definition.
(ii) This is a consequence of Proposition 4.3.116 since an $M$-form $\omega$ can be chosen in such a way that its support is disjoint with $f(\overline{\Omega} \setminus (\Omega_1 \cup \Omega_2))$. Then
\[ \int_\Omega f^*\omega = \int_{\Omega_1} f^*\omega + \int_{\Omega_2} f^*\omega. \]
(iii) It is obtained directly from Corollary 4.3.122.
(iv) It follows from Proposition 4.3.116.
(v) This is slightly tricky: Suppose by contradiction that $f^{-1}(p) \cap \overline{\Omega} = \emptyset$, and choose four mutually disjoint, nonempty open subsets $\Omega_1, \dots, \Omega_4$ of $\Omega$. Then, by the additivity property, we have
\[ \deg(f, \Omega, p) = \deg(f, \Omega_1, p) + \deg(f, \Omega_2, p) = \deg(f, \Omega_3, p) + \deg(f, \Omega_4, p), \]
and also
\[ \deg(f, \Omega, p) = \deg(f, \Omega_1 \cup \Omega_2, p) + \deg(f, \Omega_3 \cup \Omega_4, p) = \dots = 2 \deg(f, \Omega, p). \]
This contradicts the inequality $\deg(f, \Omega, p) \ne 0$.
(vi) It follows from the construction – see Lemma 4.3.119.
(vii) It is sufficient to apply property (vi) to $H(t, x) = tf(x) + (1 - t)g(x)$.
(viii) Choose an $M$-form $\omega$ supported in an open set $U$ where $U \cap f(\partial\Omega) = \emptyset$. Then, by definition,
\[ \deg(f, \Omega, p) = \int_\Omega f^*\omega \quad\text{for all}\quad p \in U. \]
This means that the degree is locally constant on $\mathbb{R}^M \setminus f(\partial\Omega)$, and therefore it is constant on every component of $\mathbb{R}^M \setminus f(\partial\Omega)$.
(ix) If the equation $g(f(x)) = p$ has no solution in $\Omega$, then the left-hand side of (4.3.52) vanishes (by the solution property). For the same reason all products on the right-hand side are also equal to zero. Suppose therefore that $f(\Omega) \cap g^{-1}(p) \ne \emptyset$. There is exactly one unbounded component $U_0$ of $\mathbb{R}^M \setminus f(\partial\Omega)$. Regardless of whether $U_0 \cap g^{-1}(p) = \emptyset$ or not, $\deg(f, \Omega, U_0) = 0$ (by (v) and (viii)). Since $g^{-1}(p) \cap f(\overline{\Omega})$
is compact there is only a finite number of bounded components of $\mathbb{R}^M \setminus f(\partial\Omega)$, say $U_1, \dots, U_k$, which contain some points of $g^{-1}(p)$ (different components are disjoint). According to property (ii) we have
\[ \deg(g \circ f, \Omega, p) = \sum_{i=1}^{k} \deg(g \circ f, \Omega_i, p) \tag{4.3.53} \]
where $\Omega_i = f^{-1}(U_i)$. By definition, there is a neighborhood $V$ of $p$, $V \cap g(f(\partial\Omega)) = \emptyset$, and an $M$-form $\omega$ supported in $V$ with $\int_V \omega = 1$ for which
\[ \deg(g \circ f, \Omega, p) = \int_\Omega (g \circ f)^*\omega. \]
First consider a component $U_i$ such that
\[ \deg(g, U_i, p) = \int_{U_i} g^*\omega \ne 0. \]
In this case put
\[ \kappa_i := \frac{1}{\deg(g, U_i, p)}\, g^*\omega|_{U_i}. \]
Then $\kappa_i$ is an admissible $M$-form for the definition of $\deg(f, \Omega_i, U_i)$, and we have
\[ \deg(g \circ f, \Omega_i, p) = \int_{\Omega_i} (g \circ f)^*\omega = \deg(g, U_i, p) \int_{\Omega_i} f^*\kappa_i = \deg(g, U_i, p) \deg(f, \Omega_i, U_i). \tag{4.3.54} \]
Now consider a component $U_i$ such that
\[ \deg(g, U_i, p) = \int_{U_i} g^*\omega = 0. \]
Let
\[ g^*\omega(y) = \kappa(y)\, dy^1 \wedge \dots \wedge dy^M. \]
If $\kappa$ does not vanish on $U_i$, then there are disjoint open sets $U_i^+$ and $U_i^-$ which carry the positive part $\kappa^+$ and the negative part $\kappa^-$, respectively. Let $\Omega_i^\pm = f^{-1}(U_i^\pm)$. Then
\[ \deg(f, \Omega_i, U_i) = \deg(f, \Omega_i^+, U_i^+) = \frac{\int_{\Omega_i^+} f^*(g^*\omega)}{\int_{U_i^+} \kappa^+(y)\, dy} = \deg(f, \Omega_i^-, U_i^-) = -\frac{\int_{\Omega_i^-} f^*(g^*\omega)}{\int_{U_i^-} \kappa^-(y)\, dy}. \]
Notice that both denominators are nonzero since $\int_{U_i} g^*\omega = 0$ and $g^*\omega$ does not vanish everywhere. This means that we have
\[ \deg(g \circ f, \Omega_i, p) = \int_{\Omega_i} f^*(g^*\omega) = \int_{\Omega_i^+} f^*(g^*\omega) + \int_{\Omega_i^-} f^*(g^*\omega) \]
\[ = \deg(f, \Omega_i, U_i) \Bigl( \int_{U_i^+} \kappa^+(y)\, dy - \int_{U_i^-} \kappa^-(y)\, dy \Bigr) = \deg(f, \Omega_i, U_i) \deg(g, U_i, p), \]
i.e., (4.3.54) holds in this case as well. Summing up and using (4.3.53) we complete the proof.
Remark 4.3.125. It can be proved (see Amann & Weiss [5]) that the properties (i)–(iv) determine the degree uniquely.

We will conclude this appendix with some topological applications of the degree. Applications to differential equations are shown in Section 5.2. The well-known and basic Jordan Separation Theorem asserts that a Jordan curve $\gamma$ divides the complex plane into two open components and exactly one of them is bounded (the interior domain of $\gamma$). This theorem has the following generalization.

Theorem 4.3.126 (Generalized Jordan Separation Theorem). Let $K$ be a compact set in $\mathbb{R}^M$ such that $\mathbb{R}^M \setminus K$ has a finite number, say $k$, of components. If $f$ is a continuous injection of $K$ into $\mathbb{R}^M$, then $\mathbb{R}^M \setminus f(K)$ has exactly $k$ components.

Proof (a sketch). Notice first that $f$ is actually a homeomorphism of $K$, i.e., $f^{-1}$ is continuous on the compact set $f(K)$. Applying the Tietze Theorem (see the footnote on page 236) to each coordinate $f^i$ of $f = (f^1, \dots, f^M)$ and $f^{-1}$ we conclude that there are continuous extensions $g$ and $h$ of $f$ and $f^{-1}$, respectively, which are defined in $\mathbb{R}^M$. Denote by $G_0, \dots, G_{k-1}$ the components of $\mathbb{R}^M \setminus K$ where $G_0$ is the unique unbounded component. Similarly, let $U_0, \dots, U_m$ ($m \in \mathbb{N} \cup \{\infty\}$) denote the components of $\mathbb{R}^M \setminus f(K)$ where $U_0$ is the unique unbounded component. The idea of the proof consists in the application of the multiplication property of the degree (Theorem 4.3.124(ix)): Show that
\[ \deg(h \circ g, G_j, p) = \delta_{ij} \ \text{ for } \ p \in G_i, \qquad \deg(g \circ h, U_i, q) = \delta_{ij} \ \text{ for } \ q \in U_j, \qquad i = 1, \dots, m, \ j = 1, \dots, k - 1. \]
Define matrices
\[ A := \bigl(\deg(g, G_j, U_i)\bigr)_{i = 1, \dots, m;\ j = 1, \dots, k-1}, \qquad B := \bigl(\deg(h, U_i, G_j)\bigr)_{j = 1, \dots, k-1;\ i = 1, \dots, m} \]
and show that $AB = I_m$. This means that $m \le k - 1$. Similarly, $BA = I_{k-1}$. The equality $m = k - 1$ follows.

Corollary 4.3.127 (Invariance of domain). Let $G \subset \mathbb{R}^M$ be an open set and $f\colon G \to \mathbb{R}^M$ a continuous injection. Then $f(G)$ is an open set.
Proof (a sketch). Show first that it is sufficient to prove the assertion for the case when $G$ is an open ball $B$ and $f$ is continuous and injective on $\overline{B}$. According to Theorem 4.3.126, $\mathbb{R}^M \setminus f(\partial B)$ has exactly two components $U_0$, $U_1$. If $U_0$ denotes the unbounded component, then $\mathbb{R}^M \setminus f(\overline{B}) \subset U_0$ (again by Theorem 4.3.126), i.e., $U_1 \subset f(B)$. To show the opposite inclusion recall that $f(B)$ is bounded and connected.

Remark 4.3.128. If $M < N$ and $f\colon \mathbb{R}^M \to \mathbb{R}^N$ is a continuous injection, then it can be proved (not too easily) that the complement of $f(\mathbb{R}^M)$ is a dense set in $\mathbb{R}^N$. We want to recall the famous "Peano curve", i.e., a continuous (but not injective) map from the interval $[0, 1]$ onto the square $[0, 1] \times [0, 1]$.^44 This map can be used to construct a continuous surjection of $\mathbb{R}^M$ onto $\mathbb{R}^N$ for any $M < N$. The existence of such a surjection for $M \ge N$ is trivial.

44 This example (for the construction see, e.g., Dugundji [43, Section IV.4]) has played an important role in developing the notion of a curve.

As we have mentioned in Section 4.3, our main interest in the degree theory consists in applying it to solving equations, i.e., in using the solution property (Theorem 4.3.124(v)). This requires computing the degree, which is by no means an easy task. Fortunately, we do not need to know the exact value of the degree. It will be sufficient to show that it is not equal to zero. For this purpose the following mapping property is very important.

Definition 4.3.129.
(1) A nonempty subset $A$ of a linear space $X$ is said to be symmetric if for every $x \in A$ we have $-x \in A$.
(2) A mapping $f\colon A \subset X \to Y$, $X$, $Y$ linear spaces, is called an odd mapping on a symmetric set $A$ if $f(-x) = -f(x)$ for each $x \in A$.
Theorem 4.3.130 (Borsuk Antipodal Theorem). Let $\Omega$ be a bounded, open, symmetric subset of $\mathbb{R}^M$ and $o \in \Omega$. Let $f\colon \overline{\Omega} \to \mathbb{R}^M$ be a continuous mapping whose restriction to $\partial\Omega$ is odd. If $o \notin f(\partial\Omega)$, then $\deg(f, \Omega, o)$ is an odd integer. In particular, $\deg(f, \Omega, o) \ne 0$, and there is a solution of the equation $f(x) = o$ in $\Omega$.
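The statement of the Borsuk Antipodal Theorem can be illustrated in the plane. The sketch below relies on an assumption that goes beyond Lemma 4.3.115: for a continuous (not necessarily holomorphic) map of a closed disc into $\mathbb{R}^2 \cong \mathbb{C}$ we use the standard fact that the degree with respect to $o$ equals the winding number of the boundary image around $o$. The odd map `f` and the radii are hypothetical choices.

```python
import cmath
import math


def boundary_values(f, radius, samples=4096):
    """Values of f at equally spaced points of the circle |z| = radius."""
    return [f(radius * cmath.exp(2j * math.pi * k / samples))
            for k in range(samples)]


def winding_about_origin(values):
    """Winding number about 0 of the closed polygon through `values`.

    Assumes no value is 0 and that consecutive values differ in argument
    by well under π, so that the summed phase increments are exact turns.
    """
    total = 0.0
    for cur, nxt in zip(values, values[1:] + values[:1]):
        total += cmath.phase(nxt / cur)
    return round(total / (2 * math.pi))


def f(z):
    # Hypothetical odd, non-holomorphic map of the plane: f(-z) = -f(z).
    return z**3 + 2 * z.conjugate()


deg_small_disc = winding_about_origin(boundary_values(f, 1.0))
deg_large_disc = winding_about_origin(boundary_values(f, 2.0))
```

On the unit circle the term $2\bar z$ dominates and on the circle of radius 2 the term $z^3$ dominates, so the two winding numbers differ, yet both should be odd, as the theorem predicts for an odd boundary map with $o \notin f(\partial\Omega)$.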
There are several proofs of this important topological theorem. For the proof based on algebraic machinery see, e.g., Dugundji [43, Section XVI.6]. Here we present the main steps of an "analytic" proof which is taken from Schwartz [118] (see also Nirenberg [100] or Rothe [111] or Krawcewicz & Wu [80]).

Proof of Theorem 4.3.130. The assertion is obvious for $M = 1$, therefore, we assume that $M > 1$. The main idea of this proof is quite simple. First observe that $\deg(f, \Omega, o)$ does not depend on a continuous extension of $f$ from $\partial\Omega$ into $\Omega$ (see Theorem 4.3.124(vii)). Since $o \in \Omega$, there is a small open ball $B := B(o; \delta)$ inside $\Omega$, and there is a continuous mapping $g\colon \overline{\Omega} \to \mathbb{R}^M$ such that
\[ g(x) = \begin{cases} f(x) & \text{for } x \in \partial\Omega, \\ x & \text{for } x \in \overline{B} \end{cases} \]
(by the Tietze Theorem). Part (ii) of Theorem 4.3.124 implies that
\[ \deg(g, \Omega, o) = \deg(g, \Omega \setminus \overline{B}, o) + \deg(g, B, o) = \deg(g, \Omega \setminus \overline{B}, o) + 1. \]
If $g$ is constructed in such a way that $g \in C^1(\Omega \setminus \overline{B})$ and $o$ is a regular value of $g$, then
\[ \deg(g, \Omega \setminus \overline{B}, o) = \sum_{x \in S} \operatorname{sgn} J_g(x) \quad\text{where}\quad S = \{x \in \Omega \setminus \overline{B} : g(x) = o\}. \]
If, moreover, $g$ is odd, then $S$ is either the empty set or a symmetric set and $g'(-x) = g'(x)$, $\deg(g, \Omega \setminus \overline{B}, o)$ is an even integer, and the proof is completed. Unfortunately, it is not known whether such a $g$ exists. Therefore, we will show that all the required properties of $g$ are not actually needed. The demand for regularity of $o$ can be replaced by the assumption that $g$ does not vanish on a part of a hyperplane, for instance on $H := \{(x_1, \dots, x_{M-1}, 0) \in \overline{\Omega} \setminus B\}$. Indeed, in this case we have
\[ \deg(g, \Omega \setminus \overline{B}, o) = \deg(g, H^+, o) + \deg(g, H^-, o) \]
where
\[ H^\pm := \{(x_1, \dots, x_M) \in \Omega \setminus \overline{B} : x_M > 0 \text{ or } x_M < 0\}. \]
Moreover,
\[ \deg(g, H^+, o) = \int_{H^+} g^*\omega = \int_{H^-} g^*\omega = \deg(g, H^-, o) \]
since $J_g(x) = J_g(-x)$ and the mapping $x \in H^+ \mapsto -x$ is a diffeomorphism. It can be shown, by smooth approximation, that the equality $\deg(g, H^+, o) = \deg(g, H^-, o)$ holds also for continuous $g$. Therefore, the core of the proof is the following substantial strengthening of the Tietze Theorem for odd mappings.

Lemma 4.3.131. Let $D$ be a bounded, open, symmetric subset of $\mathbb{R}^M$ and $o \notin \overline{D}$. Let $f$ be a continuous mapping from $\partial D$ into $\mathbb{R}^M$ which is odd and nowhere zero on $\partial D$. Then there exists a continuous odd extension $g\colon \overline{D} \to \mathbb{R}^M$ such that
\[ g(x) \ne o \quad\text{for all}\quad x \in H := \{(x_1, \dots, x_M) \in \overline{D} : x_M = 0\}. \]
To elucidate the problems of construction we note that the requirement of oddness of $g$ does not cause difficulties. The crucial point is that we need $g$ to be nowhere zero on $H$. The following simple example shows that the existence of such an extension is not obvious. Let $D = (-1, 1)$, $f(-1) = -1$, $f(1) = 1$. Then any continuous extension of $f$ has a zero point in $D$.

Proof of Lemma 4.3.131. We assume again $M > 1$. The notation $H^\pm$ is used similarly as above. The key point is an odd, nowhere zero, continuous extension of $f$ from $\partial D \cap H$ to $\tilde f\colon H \to \mathbb{R}^M$. See the above example and notice the distinction, namely that $\dim H = M - 1$ and $f$ is a map into $\mathbb{R}^M$. Having such an extension, the rest of the proof is an application of the Tietze Theorem:
\[ g(x) = \begin{cases} f(x), & x \in \partial D, \\ \tilde f(x), & x \in H, \\ \tilde g(x), & x \in H^+, \\ -\tilde g(-x), & x \in H^-, \end{cases} \]
where $\tilde g$ is the Tietze extension of $f$ and $\tilde f$. The existence of an extension $\tilde f$ follows from the following slightly more general assertion:
Let $G$ be a bounded, open, symmetric subset of $\mathbb{R}^N$, $o \notin \overline{G}$, and let the mapping $f\colon \partial G \to \mathbb{R}^M$ be continuous, odd and nowhere zero. If $N < M$, then there exists a continuous, odd and nowhere zero extension $\varphi\colon \overline{G} \to \mathbb{R}^M$.

We will prove this assertion by induction with respect to the dimension $N$. If $N = 1$, then
\[ G \subset [-\beta, -\alpha] \cup [\alpha, \beta] \quad\text{for some}\quad 0 < \alpha < \beta < \infty, \]
and we need to find a continuous extension $\varphi$ to $[\alpha, \beta]$ which is nowhere zero, define $\varphi$ on $[-\beta, -\alpha]$ to be odd and restrict $\varphi$ to $\overline{G}$. The induction step is done similarly: First use the induction hypothesis for an extension to $\overline{G} \cap \mathbb{R}^{N-1}$, and then for extending it into the upper half-space. In order to show that such an extension actually exists (also for $N = 1$) we need the following key result:

Let $K$ be a compact subset of $\mathbb{R}^N$ and let $f$ be a continuous nowhere-zero mapping from $K$ into $\mathbb{R}^M$ where $N < M$. Then for any compact set $L \supset K$ there exists a continuous $\varphi\colon L \to \mathbb{R}^M$ which extends $f$ and is nowhere-zero on $L$.

Recall again the above example to see the obstacles in the proof. Denote
\[ c := \min_{x \in K} |f(x)| \]
and choose $\varepsilon \in \bigl(0, \frac{c}{2}\bigr)$. First we prove the existence of a smooth approximation which is defined on a neighborhood of $L$. By the Tietze Theorem, there is a continuous extension $f_1\colon L \to \mathbb{R}^M$. This $f_1$ can be smoothly approximated on $L$ (the proof of Lemma 4.3.121): Take $\varphi_1 \in C^1(U)$ where $U$ is a neighborhood of $L$ such that
\[ \|f_1 - \varphi_1\|_{C(L)} < \frac{\varepsilon}{2}. \]
Put
\[ \tilde\varphi_1(x_1, \dots, x_M) = \varphi_1(x_1, \dots, x_N) \quad\text{for}\quad x = (x_1, \dots, x_N, \dots, x_M) \in U \times \mathbb{R}^{M-N}. \]
In particular, this means that $\det \tilde\varphi_1'(x) = 0$, i.e., all points of $\varphi_1(U) = \tilde\varphi_1(U \times \mathbb{R}^{M-N})$ are critical values of $\tilde\varphi_1$. According to the Sard Theorem (Theorem 5.2.3)
\[ \operatorname{meas} \varphi_1(U) = 0 \]
(meas is the Lebesgue measure in $\mathbb{R}^M$), and $\mathbb{R}^M \setminus \varphi_1(U)$ is dense.^45 Therefore, there is a point $y_0 \in \mathbb{R}^M \setminus \varphi_1(U)$ such that $\|y_0\| < \frac{\varepsilon}{2}$. Put $\varphi_2 = \varphi_1 - y_0$. Then $\varphi_2(x) \ne o$ for every $x \in L$ and $\|f - \varphi_2\|_{C(K)} < \varepsilon$. Moreover,
\[ \|\varphi_2(x)\| \ge \frac{c}{2} \quad\text{for}\quad x \in K. \]
We can assume that the last inequality holds for all $x \in L$. Otherwise, we multiply $\varphi_2$ by the function $\dfrac{c}{2\|\varphi_2(\cdot)\|}$ outside the set where $\|\varphi_2(x)\| \ge \frac{c}{2}$. Since the Tietze extension retains upper and lower bounds, we can extend $f - \varphi_2$ from the set $K$ to a continuous mapping $\psi$ on $L$ for which $\|\psi(x)\| < \varepsilon$, $x \in L$. It remains to put

$$ \varphi(x) = \varphi_2(x) + \psi(x), \quad x \in L. $$
The above proof is cumbersome and seems to be endless. So we recommend that the reader go through the main steps once again, not checking all the technicalities but concentrating on the main ideas.

Corollary 4.3.132 (Borsuk–Ulam). Let $f$ be a continuous mapping from the $M$-dimensional sphere $S^M \subset \mathbb{R}^{M+1}$ into $\mathbb{R}^M$. Then there is a point $x_0 \in S^M$ such that $f(x_0) = f(-x_0)$.

Proof. Extend $\varphi(x) = f(x) - f(-x)$ to a mapping from the unit ball $B(o; 1) \subset \mathbb{R}^{M+1}$ into $\mathbb{R}^M$ which is viewed as a subset $\mathbb{R}^M \times \{0\}$ of $\mathbb{R}^{M+1}$. If $o \in \varphi(S^M)$, the proof is complete. For the case $o \notin \varphi(S^M)$, the application of the Borsuk Theorem yields $\deg(\varphi, B(o; 1), o) \neq 0$. By Theorem 4.3.124 (viii),

$$ \deg(\varphi, B(o; 1), o) = \deg(\varphi, B(o; 1), y) \quad \text{where} \quad y = (0, \dots, 0, 1) \in \mathbb{R}^{M+1}, $$
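⁴⁵ In fact we do not need here the whole strength of the Sard Theorem. The following much weaker result is sufficient: Let $G \subset \mathbb{R}^M$ be an open set and $\psi : G \to \mathbb{R}^M$ a $C^1$-mapping on $G$. If $A \subset G$ has Lebesgue measure zero, so has $\psi(A)$. Indeed, every point of $A$ belongs to a ball $B \subset G$ on which $\psi'$ is bounded. By the Mean Value Theorem, there is a constant $K$ such that
$$ \|\psi(x) - \psi(y)\| \le K \|x - y\|, \quad x, y \in B. \tag{$*$} $$
Since $\mathbb{R}^M$ is separable, the set $A$ can be covered by countably many balls $\{B_j\}_{j \in \mathbb{N}}$, i.e., $A = \bigcup_{j=1}^{\infty} (A \cap B_j)$, and $\psi(A) = \bigcup_{j=1}^{\infty} \psi(A \cap B_j)$. To complete the proof we show that $\operatorname{meas}(\psi(A \cap B_j)) = 0$, $j \in \mathbb{N}$. So take $\eta > 0$. Since $\operatorname{meas}(A \cap B_j) = 0$, there are countably many cubes (or balls) $\{Q_k\}_{k \in \mathbb{N}}$ with $A \cap B_j \subset \bigcup_{k=1}^{\infty} Q_k$, $\sum_{k=1}^{\infty} \operatorname{meas} Q_k < \eta$, and the estimate ($*$) holds for all $Q_k$ with the same constant $K$. This implies
$$ \operatorname{meas}(\psi(A \cap B_j)) \le \sum_{k=1}^{\infty} \operatorname{meas} \psi(Q_k) \le c \sum_{k=1}^{\infty} \operatorname{meas} Q_k < c\eta $$
where the constant $c$ depends only on the dimension $M$ and on $K$. Since $\eta > 0$ is arbitrary, $\operatorname{meas}(\psi(A \cap B_j)) = 0$ for all $j \in \mathbb{N}$. To obtain the result required in the proof above take $A = U \times \{0, \dots, 0\}$, where $\{0, \dots, 0\}$ stands for the zero $(M - N)$-tuple.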
i.e., the equation ϕ(x) = y has a solution in B(o; 1). However, this is impossible since ϕ(B(o; 1)) ⊂ RM × {0}.
For more information in this direction see Schwartz [118].

Exercise 4.3.133. Deduce the classical Jordan Separation Theorem from Theorem 4.3.126! Hint. A Jordan curve is homeomorphic to $S^1$.

Exercise 4.3.134. Show that there is no continuous injection of $\mathbb{R}^M$ into $\mathbb{R}^N$ whenever $M > N$! Hint. Assume by contradiction that $\varphi$ is such a mapping and put $f(x) = (\varphi(x), 0, \dots, 0)$. Apply Corollary 4.3.127. For another proof see Exercise 4.3.137.

Exercise 4.3.135. Let $\Omega$ be a ball in $\mathbb{C}$ with sufficiently large radius and let $P$ be a polynomial of degree $n \ge 1$. Show that $\deg(P, \Omega, 0) = n$. Hint. For $P(x) = \sum_{k=0}^{n} a_k x^k$, $a_n \neq 0$, use the homotopy

$$ H(t, x) = tP(x) + (1 - t) a_n x^n \quad \text{on} \quad \Omega. $$
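The assertion of Exercise 4.3.135 can be checked numerically in the plane. The sketch below relies on the standard fact that, for a continuous map of a disc into $\mathbb{C}$ which does not vanish on the boundary circle, the degree with respect to $0$ equals the winding number of the image of that circle around $0$. The helper `winding_number`, the sample polynomials and the radius are illustrative assumptions, not taken from the text.

```python
import cmath
import math

def winding_number(coeffs, radius, samples=4000):
    """Number of turns of P(z) around 0 as z runs once around |z| = radius.

    coeffs[k] is the coefficient a_k of z**k (hypothetical helper).
    The radius is assumed large enough that all zeros of P lie inside.
    """
    def P(z):
        # Horner scheme, starting from the leading coefficient
        value = 0j
        for a in reversed(coeffs):
            value = value * z + a
        return value

    total = 0.0
    prev = P(complex(radius, 0.0))
    for j in range(1, samples + 1):
        z = radius * cmath.exp(2j * math.pi * j / samples)
        cur = P(z)
        # phase increment between consecutive samples, small for a fine grid
        total += cmath.phase(cur / prev)
        prev = cur
    return round(total / (2 * math.pi))

# P(z) = z^3 - 2z + 1 on the circle of radius 10
n = winding_number([1, -2, 0, 1], radius=10.0)
print(n)
```

The computed winding number agrees with the degree of the polynomial, which is what the homotopy to $a_n x^n$ in the hint predicts.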
Exercise 4.3.136. Let $f$ be an odd mapping from $S^M = \partial B(o; 1) \subset \mathbb{R}^{M+1}$ into $\mathbb{R}^{M+1} \setminus \{o\}$. Show that there is no continuous extension of $f$ to

$$ \varphi : B(o; 1) \to \mathbb{R}^{M+1} \setminus \{o\}. $$

(This is the original Borsuk's formulation of Theorem 4.3.130.)

Exercise 4.3.137. Deduce the assertion of Exercise 4.3.134 from the Borsuk–Ulam Theorem (Corollary 4.3.132).

Exercise 4.3.138. Prove the following result due to Lusternik and Schnirelmann: Let $F_1, \dots, F_{M+1}$ be closed sets which cover $S^M$. Then at least one $F_i$ contains a pair of antipodal points (i.e., $x, -x \in F_i$). Hint. Let

$$ \varphi(x) := -x, \quad x \in S^M, $$

and suppose that

$$ \varphi(F_i) \cap F_i = \emptyset \quad \text{for} \quad i = 1, \dots, M. $$

There are continuous functions $f^i : S^M \to [0, 1]$ such that

$$ f^i(F_i) = \{0\}, \quad f^i(\varphi(F_i)) = \{1\} $$

(this consequence of the Tietze Theorem is known in a normal topological space as the Urysohn Lemma). Put $f = (f^1, \dots, f^M)$ and apply the Borsuk–Ulam Theorem to obtain a point $x_0$. Show that $x_0 \in F_{M+1} \cap \varphi(F_{M+1})$.
Exercise 4.3.139. Prove the following complement of the above covering result of Lusternik and Schnirelmann: There exist closed sets F1 , . . . , FM +2 which cover S M , and such that no Fi contains a pair of antipodal points. Hint. Proceed by induction with respect to M . The assertion is obviously true for M = 1. Let M = 2. Cover the equator of S 2 with three closed sets E1 , E2 and E3 , with Ej ∩(−Ej ) = ∅, j = 1, 2, 3. Then choose a latitude L on the southern hemisphere and extend the cover of the equator to a covering A1 , A2 and A3 of the set of all points which lie to the north of L, including those of latitude L (see Figure 4.3.27).
Figure 4.3.27. (Labels in the figure: the sets A1, A2, A3, A4, the sets E1, E2, E3 on the equator, and the latitude L.)
Here Aj consists of all great circle arcs from latitude L to the north pole which contain a point of Ej . Finally, let A4 consist of all points lying to the south of L, including those of latitude L . Then A1 , A2 , A3 and A4 is the desired covering of S 2 . Continue the argument for M ≥ 3. Exercise 4.3.140. Prove the following Bread–Ham–Cheese Theorem: If B1 , . . . , BM are bounded measurable sets in RM with M ≥ 1, then there is an (M − 1)-dimensional plane which divides all the sets Bj into two parts of the same measure. This assertion can be reformulated in three dimensions as follows: Suppose we have a sandwich of bread, ham and cheese with ham and cheese piled attractively but irregularly on the bread. Then the sandwich can be cut in two with one straight slash of a knife in such a way that each of two persons gets an identical share of bread, ham and cheese. Hint. Let M = 2 and let d ∈ S 1 determine a direction in R2 . Take a perpendicular line to d and move it from −∞ to +∞ (see Figure 4.3.28).
Figure 4.3.28. (Labels in the figure: the set B1, the origin o, the direction d, and the line Hd.)
Take the first and the last perpendicular (they need not be distinct) which splits the set $B_1$ into two parts of the same measure. The perpendicular $H_d$ which is half-way between these two has the equation $(x, d) = a(d)$ where $a : \mathbb{R}^2 \to \mathbb{R}$ satisfies

$$ a(-d) = -a(d). $$

In order to find $d$ for which $H_d$ also splits $B_2$ into two parts of the same measure, we set

$$ f(d) := \operatorname{meas} \{x \in B_2 : (x, d) > a(d)\}. $$

Then $f : S^1 \to \mathbb{R}$ is continuous and $f(d) + f(-d) = \operatorname{meas} B_2$. By the Borsuk–Ulam Theorem, there is a point $d$ such that $f(d) = f(-d)$. Thus the corresponding $H_d$ divides $B_2$ into two parts of the same measure. If $M \ge 3$, then construct functions $f_2, \dots, f_M$ corresponding to $B_2, \dots, B_M$.
Chapter 5

Topological and Monotonicity Methods

5.1 Brouwer and Schauder Fixed Point Theorems

One of the most frequent problems in analysis, especially in its applications, consists in solving the equation $F(x) = y$ where $F$ is a mapping from a Banach space $X$ into a Banach space $Y$.¹ Such an equation can be reduced to the equation $F(x) = o$, or, provided $X \subset Y$, to the equation

$$ F(x) = x. \tag{5.1.1} $$

In this section we present two basic results on the solvability of (5.1.1) in a special case, namely, for a continuous mapping $F$ and a finite dimensional $X$, and a compact mapping $F$ in a general Banach space of infinite dimension: the Brouwer and the Schauder Fixed Point Theorems.

¹ Spaces $X$, $Y$ are assumed to have linear and topological structure since we discuss problems of analysis and not only of algebra or topology. Banach space structures are supposed mainly for simplification here, but sometimes they can be crucial.

We start with the finite dimensional case. A brief inspection of $F : \mathbb{R} \to \mathbb{R}$ indicates that reasonable assumptions on $F$ are continuity on a closed interval $I$ and $F(I) \subset I$. Moreover, the interval $I$ should also be bounded. The Intermediate Value Theorem from Calculus applied to $g(x) = F(x) - x$ says that there is a solution of (5.1.1) in $I$ provided these assumptions are satisfied. Notice that these assumptions are too weak to say anything about the number of solutions. Having no appropriate ordering in $\mathbb{R}^2$, standard proofs of the above result fail in $\mathbb{R}^2$ and, therefore, a generalization is far from being simple.
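The one-dimensional argument above can be turned into a small numerical sketch: bisection applied to $g(x) = F(x) - x$ on $I = [a, b]$ locates a solution of (5.1.1). The function name `fixed_point_on_interval` and the sample map $\cos$ on $[0, 1]$ are illustrative assumptions; bisection is one convenient way to exploit the sign change of $g$, not a method prescribed by the text.

```python
import math

def fixed_point_on_interval(F, a, b, tol=1e-12):
    """Approximate a solution of F(x) = x on [a, b].

    Assumes F is continuous on [a, b] and F([a, b]) is contained in [a, b],
    so that g(x) = F(x) - x satisfies g(a) >= 0 >= g(b).
    """
    def g(x):
        return F(x) - x

    lo, hi = a, b
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        # keep the invariant g(lo) >= 0 >= g(hi)
        if g(mid) >= 0.0:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

# cos maps [0, 1] into [cos 1, 1], a subset of [0, 1]
x_star = fixed_point_on_interval(math.cos, 0.0, 1.0)
print(x_star, math.cos(x_star))
```

The sketch finds one fixed point only; as noted above, the assumptions say nothing about how many there are.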
Instead of an interval we consider the closed unit ball $B := B(o; 1)$ in $\mathbb{R}^N$, $N \ge 2$, and a continuous mapping $F : B \to B$. Suppose that the equation (5.1.1) has no solution in $B$. Define a map $G$ as indicated in Figure 5.1.1, i.e.,

$$ G(x) = \tau(x) F(x) + (1 - \tau(x)) x $$

where $\tau(x) \le 0$ is a solution of the quadratic equation

$$ \|\tau F(x) + (1 - \tau) x\|^2 = 1. $$

Figure 5.1.1. (Labels in the figure: the ball B and the points x, F(x), G(x).)

The mapping $G$ is well defined (remember our assumption that $F(x) \neq x$ for $x \in B$), it maps $B$ continuously onto the unit sphere $S^{N-1}$ in $\mathbb{R}^N$ and

$$ G(x) = x \quad \text{for} \quad \|x\| = 1. $$
However, this seems to be impossible as our experience says that if the ball is continuously deformed (by $G$) onto the sphere, this ball has to be punctured. Nevertheless, a rigorous proof of this fact is far from being obvious. Such a proof could be based on introducing certain topological notions which are preserved under continuous deformation. If we show that some of these notions are different for the ball and its boundary, we obtain a contradiction with the existence of $G$. Algebraic topology is devoted to the study of topological invariants of an algebraic nature (homotopy groups, homology groups, etc.). However, such methods are beyond the scope of this book.

Instead we will give an analytic proof of the existence of a fixed point of a continuous mapping $F : B \to B$. This proof, which is due to Milnor [95], is based on the idea of approximating a "bad" nonlinear mapping $F$ by a simpler one. A smooth approximation is possible by the Weierstrass Approximation Theorem (see Theorem 1.2.14 and the discussion there). Suppose, by contradiction, that $F$ has no fixed point in $B$. For any $\varepsilon > 0$ there are polynomials $P_1, P_2, \dots, P_N$ of $N$ variables, such that for $P = (P_1, \dots, P_N)$ we have

$$ \sup_{\|x\| \le 1} \|F(x) - P(x)\| < \varepsilon $$
and also that $\tilde P := \dfrac{P}{1 + \varepsilon} : B \to B$. Moreover, $\tilde P$ has no fixed points in $B$, either. This follows from the estimate

$$ \|F(x) - x\| \le \|F(x) - P(x)\| + (1 + \varepsilon) \left\| \tilde P(x) - x + \frac{\varepsilon}{1 + \varepsilon} x \right\| < 2\varepsilon + (1 + \varepsilon) \|\tilde P(x) - x\| $$

and the fact that $\inf_{x \in B} \|F(x) - x\| > 0$, since $B$ is compact. Now we construct the mapping $G : B \to S^{N-1}$ corresponding to $\tilde P$ as has been shown in Figure 5.1.1 for $F$. We put

$$ H(t, x) := (1 - t) x + t G(x), \quad x \in B. $$

The most important properties of $H$ are given in the next lemma.

Lemma 5.1.1.
(i) $H(t, \cdot)$ maps $B$ into itself for every $t \in [0, 1]$.
(ii) $H(t, \cdot)$ maps $\operatorname{int} B$ into itself for every $t \in [0, 1)$.
(iii) The partial Fréchet derivative $H_2'(t, x)$ exists on $[0, 1] \times \operatorname{int} B$ and is bounded on this set.
(iv) For small $t \ge 0$ the mapping $H(t, \cdot)$ is a diffeomorphism of $\operatorname{int} B$ onto itself.

Proof. The first two statements are obvious, the third follows from differentiation of $\tau(x)$ (see Exercise 5.1.18). Let us prove the fourth statement. For a small positive $t$ the derivative

$$ H_2'(t, x) = (1 - t) I + t G'(x) $$

is an isomorphism (Proposition 2.1.2) and, by the Local Inverse Function Theorem, $H(t, \cdot)$ is a local diffeomorphism. Since, by the Mean Value Theorem, applied to $G$,

$$ \|H(t, x) - H(t, y)\| \ge (1 - t) \|x - y\| - t \sup_{\|z\| < 1} \|G'(z)\| \, \|x - y\| \ge (1 - ct) \|x - y\| $$

for a constant $c$ and all $x, y \in \operatorname{int} B$, the mapping $H(t, \cdot)$ is injective for small $t$ and hence it is also a diffeomorphism on the whole $\operatorname{int} B$. It remains to prove that

$$ H(t, \cdot)(\operatorname{int} B) = \operatorname{int} B. $$

Notice that $\operatorname{int} B$ is a connected set and thus it is sufficient to show that $\mathcal{M} := H(t, \cdot)(\operatorname{int} B)$ is open and relatively closed in $\operatorname{int} B$. The former property is a consequence of the local continuous invertibility of $H(t, \cdot)$. To see the latter property we point out that $\mathcal{M} = H(t, \cdot)(B) \cap \operatorname{int} B$ ($\|H(t, x)\| = 1$ for every $\|x\| = 1$). Since $B$ is compact, $H(t, \cdot)(B)$ is also compact and therefore closed, i.e., $\mathcal{M}$ is relatively closed in $\operatorname{int} B$.
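The formulas for $\tau$, $G$ and $H$ can be evaluated directly, which may help in checking the elementary properties used above. The following is only a sketch under explicit assumptions: since every continuous self-map of the ball does have a fixed point, the formula for $G$ is evaluated only at points where the sample map satisfies $F(x) \neq x$. The map `F_sample` (a rotation by a right angle scaled by $1/2$) and all helper names are hypothetical.

```python
import math

def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

def norm(u):
    return math.sqrt(dot(u, u))

def tau(x, Fx):
    """Nonpositive root of ||tau*F(x) + (1 - tau)*x||^2 = 1.

    Written as a*tau^2 + 2*b*tau + c = 0 with d = F(x) - x;
    requires F(x) != x and ||x|| <= 1.
    """
    d = [f - xi for f, xi in zip(Fx, x)]
    a = dot(d, d)            # positive because F(x) != x
    b = dot(x, d)
    c = dot(x, x) - 1.0      # nonpositive inside the ball
    return (-b - math.sqrt(b * b - a * c)) / a

def G(x, F):
    Fx = F(x)
    t = tau(x, Fx)
    return [t * f + (1.0 - t) * xi for f, xi in zip(Fx, x)]

def H(t, x, F):
    gx = G(x, F)
    return [(1.0 - t) * xi + t * g for xi, g in zip(x, gx)]

def F_sample(x):
    # hypothetical map of the unit disc into itself; its only fixed point is o
    return [-0.5 * x[1], 0.5 * x[0]]

inner = [0.3, 0.4]      # a point with norm 0.5
boundary = [0.6, 0.8]   # a point on the unit circle
print(norm(G(inner, F_sample)), G(boundary, F_sample), norm(H(0.5, inner, F_sample)))
```

In this sample, $G$ sends the interior point to the unit circle, leaves the boundary point fixed, and $H(t, \cdot)$ stays inside the ball, in line with Lemma 5.1.1(i).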
Having Lemma 5.1.1 we continue in the proof of our main statement on the existence of a fixed point of $F$. To reach a contradiction with the assumption of non-existence, we will use the substitution theorem for the Lebesgue integral: If $\operatorname{meas} A$ denotes the $N$-dimensional Lebesgue measure of $A \subset \mathbb{R}^N$, we have

$$ \operatorname{meas}(\operatorname{int} B) = \int_{\operatorname{int} B} dx = \int_{\operatorname{int} B} \det H_2'(t, y) \, dy \tag{5.1.2} $$

for small positive $t$. Notice that $H_2'(0, y) = I$ and thus $\det H_2'(t, y)$ is positive for small $t > 0$. The second equality follows from the substitution $x = H(t, y)$ (Lemma 5.1.1(iv)). The last integral in (5.1.2) is defined for all $t \in [0, 1]$, and it is a polynomial $Q(t)$ of the variable $t \in [0, 1]$. Since $Q(t)$ is a constant for small $t$, we also obtain that $Q(1) = \operatorname{meas}(\operatorname{int} B)$. The substitution $G(y) := H(1, y) = z$ yields

$$ Q(1) = \int_{\operatorname{int} B} \det G'(y) \, dy = \int_{G(\operatorname{int} B)} dz = \operatorname{meas} G(\operatorname{int} B).^{2} $$

But $G(B) = S^{N-1}$ and $\operatorname{meas} S^{N-1} = 0$. Hence

$$ 0 = Q(1) = Q(0) = \operatorname{meas}(\operatorname{int} B), $$

a contradiction.

² Notice that for this substitution we do not need $G$ to be a diffeomorphism.

In order to get a fixed point theorem in reasonable generality we prove the following simple topological result.

Lemma 5.1.2. Let $K$ be a convex, closed and bounded subset of $\mathbb{R}^N$ which contains at least two different points. Then $K$ is homeomorphic to the unit ball $B^M := B(o; 1)$ in $\mathbb{R}^M$ for some $M \le N$.

Proof. Choose linearly independent elements $x_1, \dots, x_M$ of $K$ such that $X := \operatorname{Lin}\{x_1, \dots, x_M\}$ contains $K$. The existence of $x_0 \in K$ such that for any $x \in X$ there is $\alpha > 0$ such that $x_0 + \frac{1}{\alpha} x \in K$ can be proved by induction with respect to the dimension of $X$. For the sake of simplicity we assume that $x_0 = o$, and define the Minkowski functional of $K$, i.e.,

$$ p(x) = \inf \left\{ \alpha > 0 : \frac{1}{\alpha} x \in K \right\}, \qquad \varphi(x) = p(x) \frac{x}{\|x\|_{\mathbb{R}^N}}, \quad x \in X \setminus \{o\}, \qquad \varphi(o) = o. $$

It is not difficult to prove that $\varphi$ is a homeomorphism of $K$ onto $B^N \cap X$. Since $X$ with the induced $\mathbb{R}^N$-norm is isomorphic to $\mathbb{R}^M$ (Corollary 1.2.11(i)), $K$ is also homeomorphic to $B^M$.
The first main result of this section is the following Brouwer Fixed Point Theorem.

Theorem 5.1.3 (Brouwer Fixed Point Theorem). Let $K$ be a nonempty, convex, closed and bounded subset of $\mathbb{R}^N$. Assume that $F : K \to K$ is continuous. Then $F$ has a fixed point in $K$.

Proof. If $K$ has exactly one point then the statement is obvious. In other cases choose a homeomorphism $\Phi$ of $B^M = B(o; 1) \subset \mathbb{R}^M$ onto $K$ (Lemma 5.1.2). According to the above discussion, the mapping

$$ \Phi^{-1} \circ F \circ \Phi : B^M \to B^M $$

has a fixed point $\tilde x \in B^M$. Then $F(\Phi(\tilde x)) = \Phi(\tilde x) \in K$.
The following example shows an interesting application of the Brouwer Fixed Point Theorem in linear algebra.

Example 5.1.4. Let $A = (a_{ij})_{i,j=1,\dots,N}$ be a matrix such that

$$ a_{ij} \ge 0 \quad \text{for all } i, j = 1, \dots, N. $$

Then there exists a nonnegative eigenvalue $\lambda$ of $A$ with an eigenvector $x = (x_1, \dots, x_N)$ having all its components

$$ x_i \ge 0, \quad i = 1, \dots, N. $$

Indeed, consider the $l^1$-norm on $\mathbb{R}^N$, i.e.,

$$ \|x\|_1 = \sum_{i=1}^{N} |x_i|, $$

and let

$$ D := \{x \in \mathbb{R}^N : \|x\|_1 = 1, \ x_i \ge 0, \ i = 1, \dots, N\}. $$

Then $D$ is a nonempty, closed, convex and bounded subset of $\mathbb{R}^N$. Let $A : \mathbb{R}^N \to \mathbb{R}^N$ be a linear operator with the representation in the standard basis given by the matrix $A$. If $A$ vanishes at an $x \in D$, then such $x$ is an eigenvector for the eigenvalue $\lambda = 0$. If this is not the case, put

$$ f(x) = \frac{Ax}{\|Ax\|_1} \quad \text{for } x \in D. $$

Since $f$ maps $D$ continuously into $D$, it has a fixed point $x$ in $D$. Then

$$ Ax = \lambda x \quad \text{where } \lambda = \|Ax\|_1. $$
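The mapping $f(x) = Ax / \|Ax\|_1$ from Example 5.1.4 is easy to experiment with. The sketch below simply iterates $f$ starting from the barycentre of $D$. This is an assumption beyond the text: the Brouwer theorem guarantees that a fixed point exists, but not that iteration finds it. For the hypothetical sample matrix used here, which has strictly positive entries, the iteration coincides with the classical power method and does converge.

```python
def l1_norm(v):
    return sum(abs(c) for c in v)

def matvec(A, x):
    return [sum(a * c for a, c in zip(row, x)) for row in A]

def nonneg_eigenpair(A, iterations=200):
    """Iterate f(x) = Ax / ||Ax||_1 on D, starting from the barycentre of D.

    Returns (lambda, x) with lambda = ||Ax||_1 for the last iterate x.
    """
    n = len(A)
    x = [1.0 / n] * n
    for _ in range(iterations):
        y = matvec(A, x)
        s = l1_norm(y)
        if s == 0.0:
            # A vanishes at x: x is an eigenvector for lambda = 0
            return 0.0, x
        x = [c / s for c in y]
    return l1_norm(matvec(A, x)), x

A = [[2.0, 1.0],
     [1.0, 3.0]]        # hypothetical sample matrix with nonnegative entries
lam, x = nonneg_eigenpair(A)
print(lam, x)
```

The returned vector stays in $D$ throughout, since $A$ preserves nonnegativity and each iterate is rescaled to unit $l^1$-norm.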
Let us mention now the standard application of the Brouwer Fixed Point Theorem to the existence of periodic solutions of ordinary differential equations. The basic idea goes back to H. Poincaré: Denote by $x(\cdot; \xi)$ a solution of the initial value problem

$$ \dot x(t) = f(t, x(t)), \quad x(0) = \xi. \tag{5.1.3} $$
Assume that $f$ satisfies conditions which ensure the existence and uniqueness of a solution of (5.1.3) (see, e.g., Theorem 2.3.4) and, moreover, that $f(\cdot, x)$ is $T$-periodic. Then $x(\cdot; \xi)$ is a $T$-periodic solution of (5.1.3) if and only if $x(\cdot; \xi)$ is defined on the interval $[0, T]$ and

$$ P\xi := x(T; \xi) = \xi $$

($P$ is called the Poincaré mapping). Since $x(\cdot; \xi)$ depends continuously on the initial condition $\xi$ (under reasonable assumptions on $f$, see Remark 2.3.5 and Example 4.2.5), the Poincaré mapping is continuous and its fixed points can be found with help of the Brouwer Fixed Point Theorem as the following example suggests.

Example 5.1.5. Assume in addition that there exists $r > 0$ such that

$$ (x, f(t, x))_{\mathbb{R}^N} \le 0 \quad \text{for all } t \in [0, T] \text{ and } \|x\|_{\mathbb{R}^N} \le r. $$

Then there exists a $T$-periodic solution of (5.1.3). To be able to apply the Brouwer Fixed Point Theorem to the Poincaré mapping $P$ it is sufficient to show that $P$ maps the closed ball $B(o; r)$ into itself. Since

$$ \frac{d}{dt} \frac{1}{2} \|x(t)\|^2 = (x(t), f(t, x(t))) \le 0 \quad \text{whenever } \|x(t)\| \le r, $$

the function $t \mapsto \|x(t)\|$ is decreasing provided $\|x(0)\| \le r$. In particular, $P$ is well defined (i.e., a solution $x(\cdot; x_0)$ exists on the interval $[0, T]$ provided $\|x(0)\| < r$) and $P$ maps $B(o; r)$ into itself.

Example 5.1.6. Assume that the right-hand side in (5.1.3) is asymptotically linear in $\mathbb{R}^N$, i.e., there exist a $T$-periodic continuous matrix $A(t)$ and a function $g : \mathbb{R} \times \mathbb{R}^N \to \mathbb{R}^N$ continuous, $T$-periodic in $t$, and locally Lipschitz with respect to the $x$-variables such that for

$$ f(t, x) = A(t) x + g(t, x) \tag{5.1.4} $$

the following condition is satisfied:

$$ \text{(H)} \quad \forall \varepsilon > 0 \ \exists b > 0 : \quad \|g(t, x)\| \le b + \varepsilon \|x\| \quad \text{for all } t \in \mathbb{R}, \ x \in \mathbb{R}^N.^{3} $$

³ Roughly speaking: $g$ has a uniformly (with respect to $t$) "vanishing derivative at infinity".

We are again interested in periodic solutions to (5.1.3) with $f$ given by (5.1.4). First we have to show that the Poincaré mapping is well defined for all $\xi \in \mathbb{R}^N$. Denote by $\Phi(t, s) := \Phi(t) \Phi^{-1}(s)$ where $\Phi(t)$ is the fundamental matrix of the linear equation

$$ \dot x = A(t) x \tag{5.1.5} $$
such that $\Phi(s, s) = I$.⁴ Then the solution of (5.1.3) satisfies the integral equation

$$ x(t; \xi) = \Phi(t, 0) \xi + \int_0^t \Phi(t, s) g(s, x(s; \xi)) \, ds \tag{5.1.6} $$

(the Variation of Constants Formula) whenever it exists on the interval $[0, t]$.

⁴ This means that $x : t \mapsto \Phi(t, s) \xi$ is a (unique) solution of the equation (5.1.5) which satisfies $x(s) = \xi$.

The fundamental matrix $\Phi$ is continuous on $[0, T] \times [0, T]$ and therefore it is bounded:

$$ \|\Phi(t, s)\|_{\mathbb{R}^{N \times N}} \le K, \quad (t, s) \in [0, T] \times [0, T]. $$

By (H), we get the estimate

$$ \|x(t; \xi)\| \le K \|\xi\| + K b T + K \varepsilon \int_0^t \|x(s; \xi)\| \, ds $$

and, with help of the Gronwall inequality (see Exercise 5.1.16),

$$ \|x(t; \xi)\| \le K (bT + \|\xi\|) e^{K \varepsilon T} =: L_1 + L_2 \|\xi\| \tag{5.1.7} $$

whenever $x(\cdot; \xi)$ is defined on the interval $[0, t] \subset [0, T]$. If the maximal interval of the existence of the solution $x(\cdot; \xi)$ is $[0, \tau)$ with $\tau \le T$, then the boundedness of $x(\cdot; \xi)$, (5.1.7) and the condition (H) imply that $x(\cdot; \xi)$ is uniformly continuous on $[0, \tau)$, and therefore it can be extended to a larger interval (see Proposition 1.2.4 and cf. a similar idea in Corollary 3.1.6). This implies that $x(\cdot; \xi)$ is defined on $[0, T]$ (actually on $\mathbb{R}$) and the mapping $P$ is defined for all $\xi \in \mathbb{R}^N$.

To apply the Brouwer Fixed Point Theorem (the problem is to show that $P$ maps a ball into itself) we assume that 1 is not a Floquet multiplier of the linear equation (5.1.5), i.e., $1 \notin \sigma(\Phi(T, 0))$ or, equivalently, the equation (5.1.5) possesses only the trivial $T$-periodic solution. Then the equation $P(\xi) = \xi$ is equivalent to the equation

$$ F(\xi) := [I - \Phi(T, 0)]^{-1} [x(T; \xi) - \Phi(T, 0) \xi] = \xi. $$

From (5.1.6) and (5.1.7) we obtain

$$ \|F(\xi)\| \le \|[I - \Phi(T, 0)]^{-1}\| \, [K b + K \varepsilon (L_1 + L_2 \|\xi\|)] T =: c_1(\varepsilon) + c_2 \varepsilon \|\xi\| $$

where $c_2$ does not depend on $\varepsilon$. Choose $\varepsilon$ small enough to satisfy $c_2 \varepsilon < 1$. Keeping such $\varepsilon$ fixed there is $r > 0$ such that $c_1(\varepsilon) + c_2 \varepsilon r \le r$. It follows that $F$ maps the ball $B(o; r) \subset \mathbb{R}^N$ of the radius $r$ into itself and the Brouwer Fixed Point Theorem yields a $T$-periodic solution of (5.1.4).

The Brouwer Fixed Point Theorem is a very strong device for solving finite dimensional nonlinear equations. Unfortunately, it does not hold in infinite dimensions as the following example shows.
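Example 5.1.7 (Kakutani). Let $H$ be a separable Hilbert space with an orthonormal basis $\{e_n\}_{n=1}^{\infty}$. Denote by $A \in \mathcal{L}(H)$ the right shift given by

$$ A e_n = e_{n+1}, \quad \text{i.e.,} \quad A\left( \sum_{n=1}^{\infty} x_n e_n \right) = \sum_{n=1}^{\infty} x_n e_{n+1}, $$

and

$$ F(x) = (1 - \|x\|^2)^{\frac{1}{2}} e_1 + A x. $$

Then $F$ is continuous and

$$ \|F(x)\|^2 = 1 - \|x\|^2 + \|Ax\|^2 = 1 \quad \text{for} \quad \|x\| \le 1. $$

If $x = \sum_{n=1}^{\infty} x_n e_n$ is a fixed point of $F$, then

$$ x_n = x_{n+1} \quad \text{and} \quad x_1 = (1 - \|x\|^2)^{\frac{1}{2}}. $$

The series $\sum_{n=1}^{\infty} x_n^2$, with $x_n = x_{n+1}$, is convergent only if $x_n = 0$ for all $n$, i.e., $x = 0$. Then $x_1 = 1$, a contradiction.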
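The norm identity in Example 5.1.7 can be observed on finitely supported sequences. In the sketch below a vector $x = \sum_{n \le k} x_n e_n$ is stored as a Python list of its first $k$ coordinates, so the right shift lengthens the list by one entry. The function names and the sample vector are illustrative assumptions.

```python
import math

def norm(x):
    return math.sqrt(sum(c * c for c in x))

def kakutani_map(x):
    """F(x) = sqrt(1 - ||x||^2) e_1 + A x for a finitely supported x, ||x|| <= 1.

    A shifts every coordinate one place to the right, so the first
    coordinate of F(x) is sqrt(1 - ||x||^2) and the rest is x itself.
    """
    sq = sum(c * c for c in x)
    return [math.sqrt(max(0.0, 1.0 - sq))] + list(x)

x = [0.5, -0.3, 0.2]                 # hypothetical sample point of the unit ball
Fx = kakutani_map(x)
padded = list(x) + [0.0]             # x viewed with one more coordinate
gap = norm([a - b for a, b in zip(Fx, padded)])
print(norm(Fx), gap)
```

Every image lies on the unit sphere, and for this sample point the distance between $F(x)$ and $x$ is strictly positive.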
Notice that in the previous example the apparently simple linear operator $A$ is perturbed by a nonlinear operator with a one-dimensional range. Continuous operators with the range in a finite dimensional subspace form an important special subclass of the so-called (nonlinear) compact operators.

Definition 5.1.8. Let $X$, $Y$ be normed linear spaces and let $M \subset X$. A mapping $F : M \to Y$ is called a compact operator on $M$ into $Y$ if $F$ is continuous on $M$ ($M$ being a metric space with the metric induced by the norm of $X$) and $F(M \cap K)$ is a relatively compact set in $Y$ for any bounded set $K \subset X$. The set of all compact operators from $M$ into $Y$ is denoted by $\mathcal{C}(M, Y)$. If the range of $F \in \mathcal{C}(M, Y)$ is a subset of a finite dimensional subspace of $Y$, then we say that $F$ is a finite dimensional operator and write $F \in \mathcal{C}_f(M, Y)$.

We recall that linear compact operators have been investigated in Section 2.2.

Warning. In contrast to the linear case the continuity of a nonlinear operator $F$ is not a consequence of the fact that $F$ maps bounded sets onto relatively compact ones! A simple example can be constructed for $F : \mathbb{R} \to \mathbb{R}$.

Our interest in compact operators arises from the observation that they are close to finite dimensional ones. The precise formulation follows.

Theorem 5.1.9. Let $X$ be a normed linear space, $Y$ a Banach space and let $M$ be a bounded subset of $X$.

(i) If $F \in \mathcal{C}(M, Y)$, then there is a sequence $\{F_n\}_{n=1}^{\infty} \subset \mathcal{C}_f(M, Y)$ which converges to $F$ uniformly on $M$.
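(ii) If $\{F_n\}_{n=1}^{\infty} \subset \mathcal{C}(M, Y)$ and $\lim_{n \to \infty} F_n(x) = F(x)$ uniformly for $x \in M$, then $F \in \mathcal{C}(M, Y)$.

Proof. (i) Since $\overline{F(M)}$ is compact there is a finite $\frac{1}{n}$-net $y_1, \dots, y_m \in F(M)$ of $F(M)$ (Proposition 1.2.3). Functions

$$ \varphi_k(x) := \max \left\{ 0, \frac{1}{n} - \|F(x) - y_k\| \right\} $$

are continuous on $M$ and $\sum_{k=1}^{m} \varphi_k(x) > 0$ for every $x \in M$. Therefore the functions

$$ \mu_k(x) := \frac{\varphi_k(x)}{\sum_{k=1}^{m} \varphi_k(x)}, \quad k = 1, \dots, m, $$

form a continuous partition of unity on $M$. Put

$$ F_n(x) = \sum_{k=1}^{m} \mu_k(x) y_k, \quad x \in M. $$

Then $F_n \in \mathcal{C}_f(M, Y)$ and

$$ \|F(x) - F_n(x)\| \le \sum_{k=1}^{m} \mu_k(x) \|F(x) - y_k\| < \frac{1}{n} \quad \text{for every } x \in M. $$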
(ii) If we literally translate the classical proof for real functions to vector functions we see that $F$ is continuous on $M$. Let $n \in \mathbb{N}$ be such that

$$ \sup_{x \in M} \|F(x) - F_n(x)\| < \varepsilon $$

and $y_1, \dots, y_k$ is an $\varepsilon$-net for $F_n(M)$. Then it is also a $2\varepsilon$-net for $F(M)$. Since $Y$ is a Banach space, Proposition 1.2.3 shows that $\overline{F(M)}$ is compact.

Remark 5.1.10. The assertion (i) of Theorem 5.1.9 obviously holds for linear compact operators, but generally we cannot guarantee linearity of the approximating sequence $\{F_n\}_{n=1}^{\infty}$ (see Remark 2.2.7).

The following theorem is a generalization of the Brouwer Fixed Point Theorem into the infinite dimensional setting.

Theorem 5.1.11 (Schauder Fixed Point Theorem). Let $K$ be a nonempty, closed, convex and bounded subset of a normed linear space $X$. Assume that $F \in \mathcal{C}(K, X)$ and $F(K) \subset K$. Then there is a fixed point of $F$ in $K$.

Proof. Let $\{F_n\}_{n=1}^{\infty}$ be the sequence constructed in the proof of Theorem 5.1.9(i). Denote $X_n := \operatorname{Lin}\{y_1, \dots, y_m\}$. Since $y_1, \dots, y_m \in F(K)$ and $K$ is convex, we have

$$ F_n(K) \subset K \cap X_n. $$

The restriction of $F_n$ to $K \cap X_n$ satisfies the assumptions of the Brouwer Fixed Point Theorem and hence there is $x_n \in K \cap X_n$ such that

$$ F_n(x_n) = x_n. $$

By the compactness of $F$ there is a subsequence $\{F(x_{n_k})\}_{k=1}^{\infty}$ which converges to an $x \in \overline{F(K)} \subset \overline{K} = K$. The estimate

$$ \|F(x_{n_k}) - x_{n_k}\| = \|F(x_{n_k}) - F_{n_k}(x_{n_k})\| < \frac{1}{n_k} $$

implies that also $\lim_{k \to \infty} x_{n_k} = x$. Since $F$ is continuous, we conclude that

$$ \lim_{k \to \infty} F(x_{n_k}) = F(x) \quad \text{and} \quad F(x) = x. $$
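The finite dimensional approximation from the proof of Theorem 5.1.9(i), which drives the proof above, can be sketched in a few lines. The example is a toy one under stated assumptions: $M = [0, 1]$, $Y = \mathbb{R}^2$, the map `F` traces an arc of the unit circle, and the net consists of values of `F` on a uniform grid, which is an $\varepsilon$-net because this `F` is Lipschitz with constant 1. All names are hypothetical.

```python
import math

def norm(v):
    return math.sqrt(sum(c * c for c in v))

def diff(u, v):
    return [a - b for a, b in zip(u, v)]

def finite_dim_approx(F, net, eps):
    """Return x -> sum_k mu_k(x) y_k, where net = [y_1, ..., y_m] is an eps-net of F(M)."""
    def F_eps(x):
        Fx = F(x)
        # phi_k(x) = max(0, eps - ||F(x) - y_k||)
        phi = [max(0.0, eps - norm(diff(Fx, y))) for y in net]
        total = sum(phi)          # positive because net is an eps-net of F(M)
        mu = [p / total for p in phi]
        return [sum(m * y[i] for m, y in zip(mu, net)) for i in range(len(Fx))]
    return F_eps

def F(x):
    # hypothetical sample map from [0, 1] into R^2
    return [math.cos(x), math.sin(x)]

eps = 0.1
net = [F(0.05 * k) for k in range(21)]      # values of F on a grid of step 0.05
F_eps = finite_dim_approx(F, net, eps)
errors = [norm(diff(F(j / 1000.0), F_eps(j / 1000.0))) for j in range(1001)]
print(max(errors))
```

Each value of `F_eps` is a convex combination of net points lying within `eps` of `F(x)`, which is exactly why the uniform error stays below `eps`.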
Remark 5.1.12. The above proof of Theorem 5.1.11 is based on the approximation of $F$ by $F_n \in \mathcal{C}_f(K, X)$. The construction in the proof of Theorem 5.1.9(i) is surely not unique. We recommend that the reader think about a possible simplification when $F$ acts on a separable Hilbert space. Another possibility occurs when $K$ is a compact convex set. We obtain a typical situation as soon as $X$ is a reflexive Banach space and $K$ is a closed, convex and bounded subset of $X$. Then $K$ is compact in the weak topology (Theorem 2.1.25)⁵ and the continuity of $F : K \to K$ in the weak topology (it sends weakly convergent sequences into weakly convergent ones) is sufficient to justify application of Theorem 5.1.11. A slightly more general statement was proved by A.N. Tikhonov (for a proof see, e.g., Dugundji [43, Appendix 1] and Deimling [34, § 10.3]).

⁵ We also have to use the fact that a convex set which is closed in the norm topology is also weakly closed (cf. Exercise 2.1.39).

We now show how the Schauder Fixed Point Theorem can be applied to differential equations. To avoid technical details we restrict ourselves to ordinary differential equations. Their solutions are generally smooth which suggests a relation to compact operators.

Proposition 5.1.13. Let $G$ be an open subset of $\mathbb{R}^{N+1}$ and let $f : G \to \mathbb{R}^N$ be continuous on $G$. Then for any $(t_0, x_0) \in G$ there exists $\delta > 0$ such that the equation

$$ \dot x = f(t, x) $$

has a solution on the interval $(t_0 - \delta, t_0 + \delta)$ which satisfies the condition $x(t_0) = x_0$.

Proof. It has been shown in Lemma 3.1.5 that the initial value problem is equivalent to the integral equation

$$ F(x)(t) := x_0 + \int_{t_0}^{t} f(s, x(s)) \, ds = x(t) \tag{5.1.8} $$

in the space $C[t_0 - \delta, t_0 + \delta]$. Choose $\delta > 0$, $r > 0$ such that

$$ \mathcal{M} = [t_0 - \delta, t_0 + \delta] \times B(x_0; r) \subset G $$
($B(x_0; r)$ is the closed ball in $\mathbb{R}^N$ of radius $r$ centered at $x_0$). Then $\mathcal{M}$ is a compact set in $\mathbb{R}^{N+1}$, and therefore $f$ is bounded on $\mathcal{M}$, say

$$ \|f(t, x)\|_{\mathbb{R}^N} \le c \quad \text{for} \quad (t, x) \in \mathcal{M}. $$

Then

$$ \|F(x) - x_0\|_{C[t_0 - \delta, t_0 + \delta]} \le c\delta \le r \quad \text{for} \quad x \in K := \{y \in C[t_0 - \delta, t_0 + \delta] : \|y - x_0\|_{C[t_0 - \delta, t_0 + \delta]} \le r\} $$

provided $\delta$ is sufficiently small. This proves that $F(K) \subset K$. Since $f$ is also uniformly continuous on $\mathcal{M}$, the operator $F$ is continuous on $K$ (the convergence on $K$ is the uniform convergence). Further, for $t, s \in [t_0 - \delta, t_0 + \delta]$, $t < s$, $x \in K$, we have

$$ \|F(x)(t) - F(x)(s)\|_{\mathbb{R}^N} \le \int_t^s \|f(\sigma, x(\sigma))\|_{\mathbb{R}^N} \, d\sigma \le c |s - t|. $$

This means that $F(K)$ is equicontinuous. By Theorem 1.2.13, $F(K)$ is relatively compact in $C[t_0 - \delta, t_0 + \delta]$. It follows from Theorem 5.1.11 that the equation (5.1.8) has a solution.

Our second example concerns a boundary value problem for an ordinary differential equation.

Example 5.1.14. Let $f$ be a continuous function on $[0, 1] \times \mathbb{R}$. We wish to solve the equation

$$ \ddot x(t) = f(t, x(t)) \tag{5.1.9} $$

with the Dirichlet boundary conditions

$$ x(0) = x(1) = 0. \tag{5.1.10} $$
We have dealt with this problem already in Example 2.3.8. It has been proved there that $y$ is a solution of this problem if and only if it is continuous and satisfies the integral equation ($f$ is assumed to be continuous)

$$ F y(t) := \int_0^1 G(t, s) f(s, y(s)) \, ds = y(t) \tag{5.1.11} $$

where the Green function $G$ is given by

$$
G(t, s) =
\begin{cases}
s(t - 1), & 0 \le s \le t \le 1, \\
t(s - 1), & 0 \le t \le s \le 1.
\end{cases}
$$

The operator $F$ maps $C[0, 1]$ into itself (actually into $C^2[0, 1]$) and is compact.
This can be proved by two types of argument:

(i) For any $R > 0$ there is $c(R)$ such that

$$ |f(s, y)| \le c(R) \quad \text{for } s \in [0, 1], \ |y| \le R. $$

Since

$$ \frac{d^2}{dt^2} F(y)(t) = f(t, y(t)), $$

$F$ maps the ball $B(o; R)$ in $C[0, 1]$ into the set of functions which have uniformly bounded second derivatives. Thus $F(B(o; R))$ is relatively compact in $C[0, 1]$ (see Theorem 1.2.13).

(ii) The operator $F$ is a composition of a linear integral operator and a Nemytski operator (see Example 3.2.21). The Nemytski operator $\Phi : y \mapsto f(\cdot, y(\cdot))$ is continuous from $C[0, 1]$ into itself, and the integral operator

$$ K : x \mapsto \int_0^1 G(\cdot, s) x(s) \, ds $$

is compact from $C[0, 1]$ into itself (Example 2.2.5). Therefore $F = K \circ \Phi$ is also compact.

It remains to prove that $F$ maps a ball $B(o; R) \subset C[0, 1]$ into itself. For this purpose some growth assumptions on $f$ are needed. If

$$ |f(s, y)| \le a + b|y| \quad \text{for } s \in [0, 1], \ y \in \mathbb{R}, $$

then

$$ \|F(y)\|_{C[0,1]} \le [a + b\|y\|_{C[0,1]}] \sup_{t \in [0,1]} \int_0^1 |G(t, s)| \, ds \le \frac{a + b\|y\|_{C[0,1]}}{8}. $$

Whenever $b < 8$, $R \ge \frac{a}{8 - b}$, then $F$ maps $B(o; R)$ into itself and (5.1.11) has a solution.
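The integral operator (5.1.11) lends itself to a small numerical experiment. The sketch below discretizes $F$ by the trapezoidal rule and iterates $y \mapsto F(y)$ from $y = 0$. Iteration is a swapped-in technique: the Schauder theorem gives existence only. It converges here because the hypothetical sample right-hand side $f(t, y) = 1 + \sin y$ is Lipschitz in $y$ with constant $1 < 8$, so that $F$ is a contraction in the sup-norm. Grid size, iteration count and all names are assumptions.

```python
import math

def green(t, s):
    """Green function of x'' = h with x(0) = x(1) = 0."""
    return s * (t - 1.0) if s <= t else t * (s - 1.0)

def apply_F(f, y, grid):
    """Trapezoidal approximation of (F y)(t) = int_0^1 G(t, s) f(s, y(s)) ds on the grid."""
    n = len(grid) - 1
    h = 1.0 / n
    values = [f(s, ys) for s, ys in zip(grid, y)]
    result = []
    for t in grid:
        total = 0.0
        for j, s in enumerate(grid):
            w = 0.5 if j in (0, n) else 1.0      # trapezoidal weights
            total += w * green(t, s) * values[j]
        result.append(h * total)
    return result

def solve_bvp(f, n=50, iterations=40):
    grid = [i / n for i in range(n + 1)]
    y = [0.0] * (n + 1)
    for _ in range(iterations):
        y = apply_F(f, y, grid)
    return grid, y

def f(t, y):
    # hypothetical right-hand side with |f| <= 2, i.e. a = 2, b = 0
    return 1.0 + math.sin(y)

grid, y = solve_bvp(f)
n = len(grid) - 1
h = 1.0 / n
# discrete counterpart of sup_t int_0^1 |G(t, s)| ds, attained at t = 1/2
bound = max(h * sum((0.5 if j in (0, n) else 1.0) * abs(green(t, s))
                    for j, s in enumerate(grid)) for t in grid)
# residual of the central-difference form of x'' = f(t, x)
residual = max(abs((y[i - 1] - 2.0 * y[i] + y[i + 1]) / h ** 2 - f(grid[i], y[i]))
               for i in range(1, n))
print(bound, residual, max(abs(v) for v in y))
```

The printed bound reproduces the constant $1/8$ from the estimate above, and the computed $y$ stays in the ball of radius $a/(8 - b) = 1/4$.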
Exercise 5.1.15. If $f$ has a sublinear growth in $y$, i.e., there is $\alpha \in [0, 1)$ such that

$$ |f(s, y)| \le a + b|y|^{\alpha}, $$

then no restriction on $b$ is needed. Prove this fact!

Exercise 5.1.16. Prove the Gronwall inequality: Let $f$ be a nonnegative continuous function on an interval $[a, b]$ and let $A$, $B$ be nonnegative reals. Assume that

$$ f(t) \le A + B \int_a^t f(s) \, ds, \quad t \in [a, b]. \tag{5.1.12} $$

Then

$$ f(t) \le A e^{B(t - a)}, \quad t \in [a, b]. $$
Hint. Denote the right-hand side of (5.1.12) by $g$ and notice that

$$ \dot g(t) = B f(t) \le B g(t). $$

Remark 5.1.17. More general integral and differential inequalities can be investigated in a similar way. Let us mention $\dot x(t) \le f(t, x(t))$ as an example.

Exercise 5.1.18. Let $\tau$ be as in the proof of Lemma 5.1.1. Prove that

$$ x \mapsto \tau(x), \quad x \in \operatorname{int} B, $$

has a bounded Fréchet derivative. Hint. Use the Implicit Function Theorem for

$$ \Phi(\tau, x) := \|\tau \tilde P(x) + (1 - \tau) x\|^2 - 1. $$

Exercise 5.1.19. Regard the operator $F$ given by (5.1.11) as an operator on a space of integrable functions. Repeat the argument from Example 5.1.14.

Exercise 5.1.20. Let $f$ in (5.1.9) depend also on $\dot x(t)$, i.e., $f = f(t, x(t), \dot x(t))$. Formulate assumptions on $f(x, y, z)$ to get the existence of a solution of the boundary value problem (5.1.9), (5.1.10). See also Example 5.2.16.

Exercise 5.1.21. Let $K$ be a bounded continuous real function on $[a, b] \times [a, b] \times \mathbb{R}$ and let $h \in C[a, b]$. Prove that the integral equation

$$ x(t) = \int_a^b K(t, \tau, x(\tau)) \, d\tau + h(t) $$

has at least one solution $x \in C[a, b]$.
5.1A Fixed Point Theorems for Noncompact Operators

There are many generalizations of the Schauder Fixed Point Theorem. We mention here one which shows that the assumption of compactness of the operator can be relaxed. However, having in mind Example 5.1.7 this must be done carefully and more than continuity of the operator must be required. To this purpose we need a tool which will measure "how much noncompact" the operator actually is.

Definition 5.1.22. Let $M$ be a bounded set in a metric space $(X, \varrho)$. The Kuratowski measure of noncompactness $\chi(M)$ is defined to be the infimum of the set of all numbers $d > 0$ with the property that

(KM) $M$ can be covered by finitely many sets, each of whose diameters⁶ is less than or equal to $d$.

⁶ The diameter of $M$ is defined as $\operatorname{diam} M := \sup \varrho(x, y)$ where the supremum is taken over all $x, y \in M$.

If $X$ is complete, then it follows from Proposition 1.2.3 that $M$ is relatively compact if and only if (KM) holds for every $d > 0$. Therefore $\chi(M) = 0$ is equivalent to relative compactness of $M$. If the value of $\chi(M)$ increases, $M$ deviates more strongly (in the sense of condition (KM)) from relatively compact sets.
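Proposition 5.1.23 (Properties of the Kuratowski measure of noncompactness). Let $X$ be a (real or complex) Banach space. Then for all bounded subsets $M, M_1, \dots, M_n, N$ of $X$ the following assertions hold:

(i) $\chi(\emptyset) = 0$;
(ii) $\chi(M) = 0 \iff M$ is relatively compact;
(iii) $0 \le \chi(M) \le \operatorname{diam} M$;
(iv) $M \subset N \implies \chi(M) \le \chi(N)$;
(v) $\chi(M + N) \le \chi(M) + \chi(N)$;⁷
(vi) $\chi(\beta M) = |\beta| \chi(M)$ for all $\beta \in \mathbb{R}$ (or $\mathbb{C}$);
(vii) $\chi(M) = \chi(\overline{M})$;
(viii) $\chi\left( \bigcup_{i=1}^{n} M_i \right) = \max\{\chi(M_1), \dots, \chi(M_n)\}$;
(ix) $\chi(M) = \chi(\operatorname{Co} M)$.

⁷ $M + N := \{z = x + y : x \in M, \ y \in N\}$.

Proof. The properties (i)–(vii) follow directly from Definition 5.1.22, and so the proof is left to the reader. Let us prove (viii). Set

$$ M = \bigcup_{i=1}^{n} M_i \quad \text{and} \quad a = \max\{\chi(M_1), \dots, \chi(M_n)\}. $$

Then it follows from $M_i \subset M$ and from (iv) that $\chi(M_i) \le \chi(M)$, so $a \le \chi(M)$. To prove the equality, choose $\varepsilon > 0$ and a covering $\{M_1^i, M_2^i, \dots, M_{m_i}^i\}$ of $M_i$ with

$$ \operatorname{diam} M_j^i \le \chi(M_i) + \varepsilon \le a + \varepsilon. $$

All of these $M_j^i$ form a covering of $M$, so that

$$ \chi(M) \le a + \varepsilon, \quad \text{i.e.,} \quad \chi(M) \le a. $$

Hence, $\chi(M) = a$ and (viii) is proved.

Finally, we prove (ix). It follows from $M \subset \operatorname{Co} M$ and (iv) that $\chi(M) \le \chi(\operatorname{Co} M)$. Conversely, we show that $\chi(\operatorname{Co} M) \le \chi(M)$. This will be done in three steps.

Step 1. We prove inequality (5.1.13) below. For every $\varepsilon > 0$ there exists a covering $M \subset \bigcup_{i=1}^{N} M_i$ with $\operatorname{diam} M_i \le \chi(M) + \varepsilon$. Since $\operatorname{diam}(\operatorname{Co} M_i) = \operatorname{diam} M_i$,⁸ we may assume that $M_i$ are all convex. Let

$$ \Lambda := \left\{ \lambda = (\lambda_1, \dots, \lambda_N) \in \mathbb{R}^N : \sum_{i=1}^{N} \lambda_i = 1, \ \lambda_i \ge 0 \text{ for all } i \right\} $$

and

$$ A(\lambda) := \sum_{i=1}^{N} \lambda_i M_i \quad \text{for all} \quad \lambda = (\lambda_1, \dots, \lambda_N) \in \Lambda. $$

⁸ The reader is invited to prove this equality.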
5.1A. Fixed Point Theorems for Noncompact Operators
263
Now it follows from (iv), (v) and (vi) that χ(A(λ)) ≤
Step 2. We show that the union
N
λi χ(Mi ) ≤ χ(M) + ε.
(5.1.13)
i=1
A(λ) is a convex set. Indeed, let
λ∈Λ
x=
N
λi x i ,
y=
N
i=1
t ∈ [0, 1]
µi yi ,
z = tx + (1 − t)y
and
i=1
where λ, µ ∈ Λ and xi , yi ∈ Mi for all i. The point z can be represented in the form ⎧ N ⎨t λi , for ξ > 0, i ξi ξi zi where ξi = tλi + (1 − t)µi , zi = i xi + (1 − i )yi , i = z= ⎩0, for ξ = 0. i=1
i
By definition of ξ we have 0 ≤ i ≤ 1. The set Mi is convex, so zi ∈ Mi . Moreover, ξ ∈ Λ, by the convexity of Λ. Hence z ∈ A(ξ). Step 3. We prove that χ(Co M) ≤ χ(M) + 3ε. Since the set Λ is compact, for a given ε > 0 we can find finitely many points λ(1) , . . . , ' ( N (j) (j) λ(m) ∈ Λ such that for any x = λi xi ∈ A(λ) there exists λ(j) = λ1 , . . . , λN for i=1
which
N ε (j) (j) λ x x − k=ε i ≤ max -λi − λi - max |xi | ≤ i i=1,...,N i=1,...,N k i=1
where k > 0 is a common bound for all sets Mi . Therefore,
A(λ) ⊂
λ∈Λ
So, by Step 2, we have Co M ⊂ χ(Co M) ≤ χ
A(λ)
m
' ( A λ(j) + B(o; ε).
j=1
A(λ) and by the other statements and (5.1.13),
λ∈Λ
≤χ
m
'
A λ
(j)
(
+ B(o; ε)
j=1
λ∈Λ
≤
m
' ' (( χ A λ(j) + 2ε
j=1
≤ χ(M) + 3ε, i.e., since ε > 0 is arbitrary, χ(Co M) ≤ χ(M).
Example 5.1.24. Let $B(o; 1) \subset X$ be the open unit ball in a Banach space $X$. If $\dim X < \infty$, then
\[ \chi(B(o; 1)) = \chi(\overline{B(o; 1)}) = \chi(\partial B(o; 1)) = 0 \]
(see Proposition 1.2.3). On the other hand, if $\dim X = \infty$, then
\[ \chi(B(o; 1)) = \chi(\overline{B(o; 1)}) = \chi(\partial B(o; 1)) = 2. \tag{5.1.14} \]
The proof of this fact is not trivial. Since the diameter of $B(o; 1)$ is equal to 2, we know that $\chi(B(o; 1)) \le 2$. In order to prove (5.1.14) we show that $\chi(\partial B(o; 1)) \ge 2$. Assume the contrary. Then there exist sets $M_i$ with
\[ \partial B(o; 1) = \bigcup_{i=1}^n M_i \]
and the diameter of every $M_i$ is strictly less than 2. We may take all the $M_i$ to be closed. Let $X_n \subset X$ be a subspace of $X$ such that $\dim X_n = n$. Then we have
\[ \partial B(o; 1) \cap X_n = \bigcup_{i=1}^n (M_i \cap X_n). \]
The sets $M_i \cap X_n$, $i = 1, 2, \dots, n$, cover the closed unit sphere $\partial B(o; 1) \cap X_n$ in $X_n$. By the result of Lusternik and Schnirelmann (see Exercise 4.3.138) there exists $M_j$ such that $M_j \cap X_n$ contains an antipodal pair $\{x, -x\}$. Consequently,
\[ 2 \le \operatorname{diam}(M_j \cap X_n) \le \operatorname{diam} M_j, \]
which is a contradiction. Finally, by (iv) and (vii) of Proposition 5.1.23 we have (5.1.14).
In the next definition we will consider a special class of continuous and bounded operators.

Definition 5.1.25. Let $T : \mathcal{M} \subset X \to X$ be a bounded operator⁹ from a Banach space $X$ into itself. The operator $T$ is called a $k$-set contraction if there is a number $k \ge 0$ such that
\[ \chi(T(M)) \le k \chi(M) \]
for all bounded sets $M$ in $\mathcal{M}$. The bounded operator $T$ is called condensing if
\[ \chi(T(M)) < \chi(M) \]
for all bounded sets $M$ in $\mathcal{M}$ with $\chi(M) > 0$.

Obviously, every $k$-set contraction with $0 \le k < 1$ is condensing. Every compact map $T$ is a $k$-set contraction with $k = 0$. A typical example of a $k$-set contraction with $0 \le k < 1$ is the following one.

Example 5.1.26. Let $K, C : D \subset X \to X$ be operators on a Banach space $X$. Let $K$ be $k$-contractive, i.e., there exists $k \in [0, 1)$ such that
\[ \|K(x) - K(y)\| \le k \|x - y\| \quad \text{for all } x, y \in D, \tag{5.1.15} \]
and let $C$ be compact. Then $K + C$ is a $k$-set contraction. Indeed, let $M \subset D$ be a bounded set. By Definition 5.1.22 it follows from (5.1.15) that $\chi(K(M)) \le k \chi(M)$. By (ii) of Proposition 5.1.23 we have $\chi(C(M)) = 0$. Set $T := K + C$. Now (iv) and (v) of Proposition 5.1.23 imply
\[ \chi(T(M)) \le \chi(K(M) + C(M)) \le \chi(K(M)) + \chi(C(M)) \le k \chi(M). \]

The following assertion is a generalization of the Schauder Fixed Point Theorem (note that every compact operator is condensing).

⁹ The operator $T$ is said to be bounded on $\mathcal{M}$ if $T(\mathcal{M} \cap A)$ is a bounded set provided $A$ is a bounded set.
Theorem 5.1.27 (Darbo). Let us suppose that
(i) $\mathcal{M}$ is a nonempty, closed, bounded and convex subset of a Banach space $X$;
(ii) an operator $T : \mathcal{M} \subset X \to \mathcal{M}$ is condensing and continuous on $\mathcal{M}$.
Then $T$ has a fixed point in $\mathcal{M}$.

Proof. The idea of the proof is to find a suitable subset $A$ of $\mathcal{M}$ which is mapped into itself by $T$ in such a way that the Schauder Fixed Point Theorem can be applied to the restriction $T : A \to A$. The resulting fixed point is then trivially a fixed point of the original mapping $T : \mathcal{M} \to \mathcal{M}$. The set $A$ is constructed in the following way. Choose a point $m \in \mathcal{M}$ and let $\Sigma$ denote the system of all closed, convex subsets $K$ of $\mathcal{M}$ for which $m \in K$ and $T(K) \subset K$. Set
\[ A = \bigcap_{K \in \Sigma} K \quad \text{and} \quad C = \overline{\operatorname{Co}}\, \{T(A) \cup \{m\}\}.^{10} \]
Since $m \in A$ and $T(A) \subset A$, it follows that $C \subset A$. This implies $T(C) \subset T(A)$. Obviously $T(A) \subset C$, i.e., $T(C) \subset C$, which means that $C \in \Sigma$. So, $A \subset C$. We have proved that $A = C$. Now, (vii), (viii) and (ix) of Proposition 5.1.23 imply that
\[ \chi(A) = \chi(C) = \chi(T(A)). \tag{5.1.16} \]
Since $T$ is condensing, $\chi(A) = 0$. Since $A$ is also closed, $A$ is a compact set. The restriction of $T$ to $A$ is thus a compact operator. Consequently, the Schauder Fixed Point Theorem can be applied to the mapping $T : A \to A$.

Corollary 5.1.28. Let $K, C : \mathcal{M} \subset X \to X$ be operators in a Banach space $X$ such that $(K + C)(\mathcal{M}) \subset \mathcal{M}$, let $\mathcal{M}$ be a nonempty, closed, bounded and convex set in $X$, let $K$ be $k$-contractive ($0 \le k < 1$) and $C$ compact. Then $K + C$ has a fixed point in $\mathcal{M}$.

Proof. The proof follows immediately from Example 5.1.26 and Theorem 5.1.27.
¹⁰ Observe that $\Sigma \ne \emptyset$ because $\mathcal{M} \in \Sigma$, and $A \ne \emptyset$ because $m \in A$.

The following assertion generalizes the existence part of Theorem 3.1.4 and follows from the previous Corollary 5.1.28, cf. the statement with the example on page 110. Let us consider the initial value problem
\[ \dot{x} = f(t, x) + g(t, x), \qquad x(t_0) = x_0 \tag{5.1.17} \]
in a Banach space $Y$. For fixed positive numbers $a$ and $b$ define
\[ \mathcal{R} := [t_0 - a, t_0 + a] \times \{x \in Y : \|x - x_0\| \le b\}. \]

Proposition 5.1.29. Let us assume that
(i) the map $f : \mathcal{R} \to Y$ is continuous and also Lipschitz continuous with respect to the second variable, i.e., there exists $L > 0$ such that
\[ \|f(t, x) - f(t, y)\| \le L \|x - y\| \quad \text{for all } (t, x), (t, y) \in \mathcal{R}; \]
(ii) the map $g : \mathcal{R} \to Y$ is compact;
(iii) the sum $f + g$ is bounded, i.e., there exists $B > 0$ such that
\[ \|f(t, y) + g(t, y)\| \le B \quad \text{for all } (t, y) \in \mathcal{R}; \]
(iv) the number $c > 0$ is chosen such that
\[ c \le a, \qquad cL < 1, \qquad Bc \le b. \]
Then the problem (5.1.17) has a solution $x = x(t)$ defined on $(t_0 - c, t_0 + c)$.

Proof. It follows from Lemma 3.1.5 that the problem (5.1.17) is equivalent to the integral equation
\[ x(t) = x_0 + \int_{t_0}^t [f(s, x(s)) + g(s, x(s))]\, ds. \tag{5.1.18} \]
Let
\[ X = C([t_0 - c, t_0 + c], Y) \quad \text{and} \quad \mathcal{M} = \{x \in X : \|x - x_0\|_X \le b\}.^{11} \]
Then (5.1.18) can be regarded as the operator equation
\[ x = K(x) + C(x), \qquad x \in \mathcal{M}, \tag{5.1.19} \]
where
\[ K(x)(t) = x_0 + \int_{t_0}^t f(s, x(s))\, ds, \qquad C(x)(t) = \int_{t_0}^t g(s, x(s))\, ds. \]
Similarly to the proof of Theorem 3.1.4 we obtain that $(K + C)(\mathcal{M}) \subset \mathcal{M}$. Furthermore, the operator $K$ is $k$-contractive with $k = Lc$ and the operator $C$ is compact. So, Corollary 5.1.28 yields the existence of a solution of (5.1.19), hence of (5.1.18), and thus, ultimately, of (5.1.17).

A similar approach can also be used for functional differential equations. In the following example we describe a simple situation. For a more general treatment of evolution equations see, e.g., Milota & Petzeltová [96].

Example 5.1.30. Consider a system of ordinary functional differential equations
\[ \dot{x}(t) = A(t) x(t) + \int_0^t f(t, s, x(s))\, ds, \qquad x(0) = x_0, \tag{5.1.20} \]
where $A$ is an $N \times N$-matrix with continuous entries on the interval $[0, T]$, $f : M := ([0, T] \times [0, T] \times \mathbb{R}^N) \to \mathbb{R}^N$ is continuous and locally $\alpha$-Hölder continuous with respect to the first variable and locally satisfies the Lipschitz condition with respect to the third variable, i.e., for any $(t_0, s_0, x_0) \in M$ there are a neighborhood $U$ of this point and constants $c > 0$, $L > 0$, $\alpha \in (0, 1)$ such that
\[ |f(t_1, s, x_1) - f(t_2, s, x_2)| \le c |t_1 - t_2|^{\alpha} + L |x_1 - x_2| \quad \text{for } (t_i, s, x_i) \in U, \ i = 1, 2. \]
Instead of (5.1.20) we consider the equivalent integral equation
\[ x(t) = H(x)(t) := \Phi(t, 0) x_0 + \int_0^t \Phi(t, s) \int_0^s f(s, \sigma, x(\sigma))\, d\sigma\, ds \]
where $\Phi(t, s)$ is the fundamental matrix of the equation $\dot{x}(t) = A(t) x(t)$ (with $\Phi(s, s) = I$). We put
\[ F(s, x) := \int_0^s f(s, \sigma, x(\sigma))\, d\sigma, \qquad s \in [0, T], \quad x \in C([0, T], \mathbb{R}^N) \]
and
\[ G_1(x)(t) = \int_0^t \Phi(t, s) [F(s, x) - F(t, x)]\, ds, \qquad G_2(x)(t) = \left( \int_0^t \Phi(t, s)\, ds \right) F(t, x). \]
It is not difficult to show that there are $r > 0$, $\tau > 0$ small enough such that $H$ maps the set
\[ Q(r, \tau) := \left\{ y \in C([0, \tau], \mathbb{R}^N) : \sup_{t \in [0, \tau]} |y(t) - x_0| \le r \right\} \]
into itself, $G_1$ is a compact mapping (using the Arzelà–Ascoli Theorem) on $Q(r, \tau)$ and $G_2$ is a contraction on $Q(r, \tau)$. The local existence of a solution of (5.1.20) follows now from Corollary 5.1.28. This local solution can be continuously extended. To keep the time step $\tau$ fixed it is sufficient to assume that $f$ satisfies the global Lipschitz condition with respect to the $x$-variable on the whole domain $M$.

¹¹ Here $x_0 \in X$ is understood to be a constant function defined on $[t_0 - c, t_0 + c]$ with value in $Y$.

Exercise 5.1.31. Let $H$ and $Q(r, \tau)$ be as in Example 5.1.30. Prove that $H$ maps the set $Q(r, \tau)$ into itself.

Exercise 5.1.32. Let $G_1$, $G_2$ and $Q(r, \tau)$ be as in Example 5.1.30. Prove that $G_1$ is a compact mapping on $Q(r, \tau)$ and $G_2$ is a contraction on $Q(r, \tau)$.

Exercise 5.1.33. Prove Proposition 5.1.23(i)–(vii).

Exercise 5.1.34. Consider the boundary value problem
\[ \ddot{x}(t) = f(t, x(t), \dot{x}(t)), \quad t \in (0, 1), \qquad x(0) = x(1) = 0, \tag{5.1.21} \]
where $f : \mathbb{R}^3 \to \mathbb{R}$ is a real function. Find conditions on $f$ and apply Corollary 5.1.28 to prove the existence of a solution to (5.1.21).
Hint. Look for conditions which guarantee that the Nemytski operator given by $f$ is a sum of a contraction and a compact operator.
5.2 Topological Degree

In this section we stress the basic properties of the Brouwer degree of a continuous map in finite dimensional spaces and of the Leray–Schauder degree of a compact perturbation of the identity in general Banach spaces. We start with some elementary considerations in one dimension. The reader can find another motivation from the theory of functions of a complex variable in Appendix 4.3D. In the previous section we have dealt with a solution of the operator equation
\[ F(x) = 0. \]
Now we are asking what happens with its solution if $F : \mathbb{R} \to \mathbb{R}$ is slightly perturbed. Figures 5.2.1–5.2.3 show that the situation can change considerably, namely, either a solution may disappear (if a perturbation takes place in the solid arrow direction) or the number of solutions may vary (in the dashed arrow direction).

[Figures 5.2.1, 5.2.2 and 5.2.3: graphs of $F$ near a solution $x_0$, together with a perturbed (dashed) curve $G$; the solutions of $G(x) = 0$ in Figure 5.2.2 are labelled $x_1$ and $x_2$.]
A closer examination indicates that this can happen since a solution $x_0$ is either on the boundary (Figure 5.2.1) or the derivative $F'$ vanishes at $x_0$, i.e., $x_0$ is a critical point of $F$ (Figures 5.2.2 and 5.2.3). We expect that a small perturbation of $F$ does not cause any alteration provided the cases just described do not occur.

There is another point which should be mentioned, namely the distinction between perturbations of $F$ in the direction of one or the other arrow in Figures 5.2.2 and 5.2.3. The number of solutions changes by two, being even in Figure 5.2.2 (0 is even by definition), and being odd in Figure 5.2.3. Is there any way to describe this phenomenon? Look at the dashed curve $G$ in Figure 5.2.2 and assume that $G \in C^1$. We have
\[ G'(x_1) < 0, \qquad G'(x_2) > 0. \]
These signs remain the same in some neighborhoods $U_1$, $U_2$ of $x_1$ and $x_2$. In particular, $G$ is injective (actually a diffeomorphism) on these neighborhoods and can be regarded as a local transformation of the $x$-coordinate. This transformation changes the orientation at $x_1$ and does not do so at $x_2$. The sum of the signs of $G'$ at the solutions of $G(x) = 0$ is zero (more generally, even) in Figure 5.2.2 and odd for the dashed curve in Figure 5.2.3.

This observation can be generalized to higher dimensions: If $A : \mathbb{R}^N \to \mathbb{R}^N$ is a linear transformation of coordinates (i.e., $A$ is injective and surjective), then we say that $A$ does not change the orientation in $\mathbb{R}^N$ provided $\det \mathbf{A} > 0$ where $\mathbf{A}$ is the matrix representation of $A$ (this does not depend on the choice of basis in which the representation is taken). This concept can also be used locally for a nonlinear $C^1$-transformation $G : \mathbb{R}^N \to \mathbb{R}^N$ by substituting $G'(a)$ for $A$. Then
the sign of the matrix representation of $G'(a)$ is the sign of its Jacobian $J_G(a)$. This idea leads to the following preparatory definition.

Definition 5.2.1. Let $\Omega$ be an open and bounded subset of $\mathbb{R}^N$ and let $F \in C(\overline{\Omega}, \mathbb{R}^N) \cap C^1(\Omega, \mathbb{R}^N)$. Assume that $y_0 \in \mathbb{R}^N \setminus F(\partial\Omega)$ and $y_0$ is a regular value of $F$.¹² Then we define the Brouwer degree of $F$ as
\[ \deg(F, \Omega, y_0) = \sum_{x \in F_{-1}(y_0) \cap \Omega} \operatorname{sgn} J_F(x).^{13} \tag{5.2.1} \]
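For $N = 1$ the sum in (5.2.1) can be evaluated by hand, and a short script makes the bookkeeping explicit. The following is only an illustrative sketch: the function $F(x) = x^3 - x$, the interval $(-2, 2)$ and all helper names are assumptions chosen for the example, and the comparison with the boundary values uses a standard one-dimensional fact that is not stated in the text.

```python
def sign(v):
    """Return -1, 0 or +1 according to the sign of v."""
    return (v > 0) - (v < 0)


def degree_from_roots(dF, roots):
    """Formula (5.2.1) for N = 1: sum of sgn F'(x) over the solutions of F(x) = y0.

    `roots` must list every solution of F(x) = y0 inside the interval, and
    F' must not vanish at any of them (y0 is a regular value).
    """
    signs = [sign(dF(x)) for x in roots]
    if 0 in signs:
        raise ValueError("y0 is not a regular value: F' vanishes at a solution")
    return sum(signs)


def degree_from_boundary(F, a, b, y0):
    """Degree on (a, b) read off from the boundary values only (valid for N = 1)."""
    sa, sb = sign(F(a) - y0), sign(F(b) - y0)
    if sa == 0 or sb == 0:
        raise ValueError("y0 lies in F(boundary); the degree is undefined")
    return (sb - sa) // 2


# Illustrative choice (an assumption of this sketch):
# F(x) = x^3 - x on Omega = (-2, 2), y0 = 0.
F = lambda x: x**3 - x
dF = lambda x: 3 * x**2 - 1
roots = [-1.0, 0.0, 1.0]  # all solutions of F(x) = 0 in (-2, 2)

deg_sum = degree_from_roots(dF, roots)              # signs +1, -1, +1
deg_bdry = degree_from_boundary(F, -2.0, 2.0, 0.0)  # uses F(-2) < 0 < F(2)
print(deg_sum, deg_bdry)
```

Both computations agree here, which reflects the fact that in one dimension the degree depends only on the values of $F$ on $\partial\Omega$.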
We point out that the sum in (5.2.1) is finite. Indeed, otherwise the set
\[ F_{-1}(y_0) \cap \Omega := \{x \in \Omega : F(x) = y_0\} \]
has an accumulation point $\tilde{x} \in \overline{\Omega}$. By the continuity of $F$, $F(\tilde{x}) = y_0$, and since $y_0 \notin F(\partial\Omega)$, $\tilde{x} \in \Omega$. By the Local Inverse Function Theorem (Theorem 4.1.1), $F$ is injective in a neighborhood $U$ of $\tilde{x}$. But $U$ contains points of $F_{-1}(y_0)$ different from $\tilde{x}$, a contradiction. Notice that in this argument we have used all assumptions of Definition 5.2.1.

¹² The definition of a regular value is given in Definition 4.3.6.
¹³ Here $\sum_{\emptyset} = 0$ as usual.

Proposition 5.2.2. Let $\Omega$ be an open bounded subset of $\mathbb{R}^N$. The degree defined in Definition 5.2.1 has the following properties ($I$ is the identity map):
(i) $\deg(I, \Omega, y_0) = \begin{cases} 1 & \text{if } y_0 \in \Omega, \\ 0 & \text{if } y_0 \notin \overline{\Omega}. \end{cases}$
Suppose that $F \in C(\overline{\Omega}, \mathbb{R}^N) \cap C^1(\Omega, \mathbb{R}^N)$ and $y_0 \in \mathbb{R}^N \setminus F(\partial\Omega)$ is a regular value of $F$. Then
(ii) $\deg(F, \Omega, y_0) \in \mathbb{Z}$;
(iii) $\deg(F, \Omega, y_0) = \deg(F - y_0, \Omega, o)$;
(iv) if $\deg(F, \Omega, y_0) \ne 0$, then the equation $F(x) = y_0$ has a solution in $\Omega$;
(v) if $\Omega_1$ is an open subset of $\Omega$ and $y_0 \notin F(\overline{\Omega} \setminus \Omega_1)$, then
\[ \deg(F, \Omega, y_0) = \deg(F, \Omega_1, y_0). \]
More generally, if $\Omega_1, \dots, \Omega_k$ are pairwise disjoint open subsets of $\Omega$ and $y_0 \notin F\left(\overline{\Omega} \setminus \bigcup_{j=1}^k \Omega_j\right)$, then
\[ \deg(F, \Omega, y_0) = \sum_{j=1}^k \deg(F, \Omega_j, y_0). \tag{5.2.2} \]
(vi) For all $y \in \mathbb{R}^N$ which are sufficiently close to $y_0$,
\[ \deg(F, \Omega, y) = \deg(F, \Omega, y_0) \tag{5.2.3} \]
holds.
(vii) For all $G \in C(\overline{\Omega}, \mathbb{R}^N) \cap C^1(\Omega, \mathbb{R}^N)$ which are sufficiently close to $F$ in the $C^1$-topology,¹⁴
\[ \deg(G, \Omega, y_0) = \deg(F, \Omega, y_0) \tag{5.2.4} \]
is valid.
Proof. The properties (i)–(v) follow immediately from Definition 5.2.1. To prove (vi) let $F_{-1}(y_0) = \{x_1, \dots, x_k\}$ and let $F$ be a diffeomorphism of an open neighborhood $U_j$ of $x_j$ onto a neighborhood $V_j$ of $y_0$ (the Local Inverse Function Theorem (Theorem 4.1.1)). Denote
\[ d := \inf \left\{ \|F(x) - y_0\| : x \in \overline{\Omega} \setminus \bigcup_{j=1}^k U_j \right\} > 0. \]
If $\|y - y_0\| < d$, then there is no solution of $F(x) = y$ in $\overline{\Omega} \setminus \bigcup_{j=1}^k U_j$, and if $y \in \bigcap_{j=1}^k V_j$ (the neighborhood of $y_0$), then there is exactly one $\tilde{x}_j \in U_j$ such that $F(\tilde{x}_j) = y$. Moreover, $\operatorname{sgn} J_F(\tilde{x}_j) = \operatorname{sgn} J_F(x_j)$. This completes the proof of (5.2.3).¹⁵

¹⁴ I.e., there exists $\varepsilon > 0$ such that
\[ \|F - G\|_{C^1(\Omega, \mathbb{R}^N)} := \sup_{x \in \Omega} \|F(x) - G(x)\|_{\mathbb{R}^N} + \sup_{x \in \Omega} \|F'(x) - G'(x)\|_{L(\mathbb{R}^N)} < \varepsilon. \]
¹⁵ Notice that this proof is correct also for $F_{-1}(y_0) = \emptyset$.

To prove (vii) we use the same notation as above. Let $G$ differ a little from $F$ in the $C^1$-topology, say
\[ \|F - G\|_{C^1(\Omega, \mathbb{R}^N)} < \varepsilon. \]
The quantity $\varepsilon$ will be specified later. Put
\[ H(t, x) = (1 - t) F(x) + t G(x), \qquad x \in \overline{\Omega}, \quad t \in (-\delta, 1 + \delta) \text{ for } \delta > 0. \tag{5.2.5} \]
Choose a fixed neighborhood $U_j$ as above. Notice that we can take $U_j$ so small that
\[ c_1 := \sup_{x \in U_j} \|F'(x)\|_{L(\mathbb{R}^N)} < \infty \quad \text{and} \quad c_2 := \sup_{x \in U_j} \|[F'(x)]^{-1}\|_{L(\mathbb{R}^N)} < \infty, \]
and the determinant $\det F'(x)$ has a constant sign in $U_j$. We have
\[ \|H(t, x) - y_0\| \ge \|F(x) - y_0\| - |t| \|F(x) - G(x)\| \ge d - |t| \varepsilon > 0 \]
for every $x \in \overline{\Omega} \setminus \bigcup_{j=1}^k U_j$ and $t \in (-\delta, 1 + \delta)$, and for sufficiently small $\varepsilon > 0$. In particular, $\deg(H(t, \cdot), U_j, y_0)$ is well defined¹⁶ and, by (v),
\[ \deg(H(t, \cdot), \Omega, y_0) = \sum_{j=1}^k \deg(H(t, \cdot), U_j, y_0). \]
We wish to prove that this degree is constant on the interval $[0, 1]$. We will study the set
\[ M := \{(t, x) \in [0, 1] \times U_j : H(t, x) = y_0\} \]
with help of the Implicit Function Theorem at the point $(0, x_j)$. This is possible since
\[ \|H_2'(t, x) - F'(x)\|_{L(\mathbb{R}^N)} = |t| \|F'(x) - G'(x)\|_{L(\mathbb{R}^N)} \le |t| \varepsilon < \frac{1}{c_2}, \qquad (t, x) \in (-\delta, 1 + \delta) \times U_j, \]
for small $\varepsilon > 0$. This estimate implies that $[H_2'(t, x)]^{-1}$ exists (Exercise 2.1.33). The Implicit Function Theorem implies that $M$ has the form
\[ \{(t, \varphi(t)) : t \in [0, \beta)\} \]
in a certain neighborhood of $(0, x_j)$ ($F(x_j) = y_0$) where $\varphi \in C^1([0, \beta), \mathbb{R}^N)$ and
\[ \|\dot{\varphi}(t)\|_{L(\mathbb{R}^N)} = \left\| [H_2'(t, \varphi(t))]^{-1} H_1'(t, \varphi(t)) \right\|_{L(\mathbb{R}^N)} \le \frac{c_2 \varepsilon}{1 - c_2 \varepsilon}, \]
see again Exercise 2.1.33. In particular, $\varphi$ is uniformly continuous and, if necessary, it can be continued at least until $t = 1$.¹⁷ Therefore $M$ is a graph of $\varphi$ on the interval $[0, 1]$, i.e.,
\[ \{x \in U_j : G(x) = H(1, x) = y_0\} = \{\varphi(1)\}, \]
and, consequently,
\[ \deg(G, U_j, y_0) = \deg(F, U_j, y_0). \]
¹⁶ A homotopy $H(t, x)$ for which the degree is well defined is called an admissible homotopy.
¹⁷ If $\beta \le 1$, then $\lim_{t \to \beta-} \varphi(t) = \tilde{x}$ exists and $\tilde{x} \in U_j$ (see Proposition 1.2.4).

One of our main goals is to show that the degree is homotopically invariant, i.e., if for $H$ given by (5.2.5) we have
\[ y_0 \ne H(t, x) \quad \text{for all } t \in [0, 1], \ x \in \partial\Omega, \]
then $\deg(H(t, \cdot), \Omega, y_0)$ is constant on $[0, 1]$. In particular,
\[ \deg(H(0, \cdot), \Omega, y_0) = \deg(H(1, \cdot), \Omega, y_0) \tag{5.2.6} \]
provided at least one side in (5.2.6) is defined. The problem in proving this property can be seen from Figures 5.2.2 and 5.2.3. Namely, if the dashed curve $G$ is moving up, then it is equal to $F$ in one instance, $o$ is not a regular value for $F$, and so $\deg(F, \Omega, o)$ is not yet defined. To overcome this obstacle we approximate a critical value by a regular one. Such an approximation is based on the so-called Sard Theorem. Its special case stated below will be sufficient for our purposes.

Theorem 5.2.3 (Sard). Let $\Omega$ be an open subset of $\mathbb{R}^N$ and assume that $F \in C^1(\Omega, \mathbb{R}^N)$. Then the Lebesgue measure of the set of critical values of $F$ is zero.

Proof. Since $\mathbb{R}^N$ can be covered by a sequence of bounded open sets and a countable union of sets of measure zero also has measure zero, we can suppose that $\Omega$ is bounded. Choose now an open subset $G \subset \Omega$ such that $\overline{G} \subset \Omega$. Let $S$ be the set of critical points of $F$ in $\overline{G}$. By the same argument as above, it is sufficient to show that
\[ \operatorname{meas}_N F(S) = 0 \]
where $\operatorname{meas}_N$ is the Lebesgue measure in $\mathbb{R}^N$. Since $\overline{G}$ is compact, $d := \operatorname{dist}(\overline{G}, \mathbb{R}^N \setminus \Omega) > 0$ and $\overline{G}$ can be covered by a finite number of closed cubes $C_1, \dots, C_k$ with sides parallel to the coordinate hyperplanes and edges of length $a$. If $a < d / \sqrt{N}$, then $\bigcup_{i=1}^k C_i \subset \Omega$ and
\[ c := \sup_{x \in \bigcup_{i=1}^k C_i} \|F'(x)\|_{L(\mathbb{R}^N)} < \infty. \]
Again it is sufficient to show that
\[ \operatorname{meas}_N F(C_i \cap S) = 0, \qquad i = 1, \dots, k. \]
Choose one of these cubes and denote it by $C$. By the Mean Value Theorem (Theorem 3.2.7),
\[ \|F(y) - F(x)\| \le c \|x - y\| \quad \text{and} \quad \|F(y) - L_x y\| \le \omega(\|x - y\|) \|x - y\|, \qquad x, y \in C, \]
where $\lim_{r \to 0+} \omega(r) = 0$ (uniform continuity of $F'$ on the compact set $C$) and
\[ L_x y := F(x) + F'(x)(y - x). \]
Divide now the cube $C$ into $m^N$ small cubes with edges $\frac{a}{m}$, and consider a small cube $\tilde{C}$ which contains a critical point $x$. Denote the diameter of $\tilde{C}$ by $\tilde{r}$ ($\tilde{r} = \sqrt{N} \frac{a}{m}$). Since $L_x(\tilde{C})$ lies in an $(N - 1)$-dimensional hyperplane ($x$ is a critical point!),
\[ \operatorname{meas}_{N-1} L_x(\tilde{C}) \le (c \tilde{r})^{N-1} \quad \text{and} \quad \operatorname{dist}(F(y), L_x(\tilde{C})) \le \omega(\tilde{r}) \tilde{r} \text{ for } y \in \tilde{C}, \]
we have
\[ \operatorname{meas}_N F(\tilde{C}) \le c^{N-1} \omega(\tilde{r}) \tilde{r}^N. \]
The number of such small cubes which contain critical points is $m^N$ at most. Therefore,
\[ \operatorname{meas}_N F(S \cap C) \le c^{N-1} \omega(\tilde{r}) \tilde{r}^N m^N \le c_1 \omega(\tilde{r}) \]
with a constant $c_1$ independent of $m$. Since $\omega(\tilde{r}) \to 0$ for $m \to \infty$, $\operatorname{meas}_N F(S \cap C) = 0$.
Corollary 5.2.4. Under the hypotheses of Theorem 5.2.3 the set of regular values of $F : \Omega \to \mathbb{R}^N$ is dense in $\mathbb{R}^N$.

Proof. The complement of the set of regular values, i.e., $F(\Omega \cap S)$, cannot have an interior point and zero Lebesgue measure simultaneously.

Remark 5.2.5. A more general Sard Theorem concerns $F : \mathbb{R}^M \to \mathbb{R}^N$. It is surprising that the assertion
\[ \operatorname{meas}_N F(S \cap C) = 0 \]
needs more smoothness of $F$, namely
\[ F \in C^r(\Omega, \mathbb{R}^N) \quad \text{where} \quad r > \max\{0, M - N\}. \]
The proof is more involved (see, e.g., Hirsch [67, Chapter 3, Theorem 1.3] for the $C^\infty$-case and comments given there, or Sard [117], or Sternberg [124, Theorem II.3.1]). If $r \le \max\{0, M - N\}$, then there exists $F \in C^r(\Omega, \mathbb{R}^N)$ such that $\operatorname{int} F(\Omega \cap S) \ne \emptyset$, see Whitney [132]. The statement on the Lebesgue measure can be strengthened by considering the finer Hausdorff measure or dimension. The following result also holds: If $F : \mathbb{R}^M \to \mathbb{R}$ is analytic, then $F(\Omega \cap S)$ is even countable. For more detail see, e.g., Fučík et al. [56, Chapter IV and Appendix IV]. There is also a generalization for mappings $F : X \to Y$, $X$, $Y$ Banach spaces: If $F'(x)$ is a Fredholm operator for all $x \in \Omega$ and $F$ is sufficiently smooth, then $F(\Omega \cap S)$ is nowhere dense in $Y$. Sharper results can be proved for functionals (i.e., $Y = \mathbb{R}$) (see the book Fučík et al. [56] cited above).

Now we return to the degree $\deg(F, \Omega, y_0)$ where $y_0 \in \mathbb{R}^N \setminus F(\partial\Omega)$ and it is a critical value of $F$. According to Corollary 5.2.4 there is a sequence $\{y_n\}_{n=1}^\infty$ of regular values of $F$ such that
\[ \lim_{n \to \infty} y_n = y_0, \qquad y_n \notin F(\partial\Omega). \]
In particular, $\deg(F, \Omega, y_n)$ is well defined by Definition 5.2.1. Part (vi) of Proposition 5.2.2 allows us to presume that the sequence $\{\deg(F, \Omega, y_n)\}_{n=1}^\infty$ is eventually constant and does not depend on the choice of the sequence $\{y_n\}_{n=1}^\infty$ of regular values. To see this we need to extend Proposition 5.2.2(vi) to guarantee that $\deg(F, \Omega, y)$ is constant on any open connected set $G \subset \mathbb{R}^N \setminus F(\partial\Omega \cup S)$. This can be done due to the fact that any two different points in an open connected subset of $\mathbb{R}^N$ can be connected by a smooth curve in this subset (see Proposition 1.2.7). We leave details to the reader. He or she should also be convinced that all statements of Proposition 5.2.2 are still valid for this more general definition of the degree.

For the definition of the degree $\deg(F, \Omega, y_0)$ it is not necessary to assume that $F \in C^1(\Omega, \mathbb{R}^N)$ since any $F \in C(\overline{\Omega}, \mathbb{R}^N)$ can be approximated by smooth mappings. This is a consequence of the Stone–Weierstrass Theorem (see Theorem 1.2.14).¹⁸ To show that $\deg(G, \Omega, y_0)$ is the same for all $G \in C(\overline{\Omega}, \mathbb{R}^N) \cap C^1(\Omega, \mathbb{R}^N)$ which are close to $F$ in the $C(\overline{\Omega}, \mathbb{R}^N)$-norm we need the following extension of Proposition 5.2.2(vii).

Proposition 5.2.6. Let $\Omega$ be a bounded open subset of $\mathbb{R}^N$ and let $F$, $G$ be mappings from $C(\overline{\Omega}, \mathbb{R}^N) \cap C^1(\Omega, \mathbb{R}^N)$. Put
\[ H(t, x) = (1 - t) F(x) + t G(x), \qquad t \in [0, 1], \quad x \in \overline{\Omega}. \]
Assume that $y_0 \in \mathbb{R}^N \setminus \{H(t, x) : t \in [0, 1], \ x \in \partial\Omega\}$. Then
\[ \deg(F, \Omega, y_0) = \deg(G, \Omega, y_0). \tag{5.2.7} \]
Proof. As has been stated above, Proposition 5.2.2(vii) holds for an arbitrary $y_0 \in \mathbb{R}^N \setminus F(\partial\Omega)$, in particular for $H(t, \cdot)$ and $t$ small. Put
\[ t_0 = \sup\{t \in [0, 1] : \deg(H(t, \cdot), \Omega, y_0) = \deg(F, \Omega, y_0)\}. \]
By the same statement ($\deg(H(t, \cdot), \Omega, y_0)$ is defined for all $t \in [0, 1]$),
\[ \deg(H(t_0, \cdot), \Omega, y_0) = \deg(H(t, \cdot), \Omega, y_0) \quad \text{for } t \in (t_0 - \delta, t_0 + \delta) \cap [0, 1], \]
i.e., $t_0 = 1$, and the equality (5.2.7) follows.

¹⁸ For the set $A$ in Theorem 1.2.14 take restrictions of polynomials of $N$ variables to $\overline{\Omega}$ and approximate separately every $F_i$, $i = 1, \dots, N$, where $F = (F_1, \dots, F_N)$. For another proof see Lemma 4.3.121.

The following result summarizes the previous exposition.
Theorem 5.2.7. Let $\Omega$ be a bounded open set in $\mathbb{R}^N$. There exists a mapping
\[ \deg : C(\overline{\Omega}, \mathbb{R}^N) \times \mathbb{R}^N \to \mathbb{Z} \]
defined for all $F \in C(\overline{\Omega}, \mathbb{R}^N)$ and $y_0 \in \mathbb{R}^N \setminus F(\partial\Omega)$ which has the properties (i)–(vii) from Proposition 5.2.2 and from Proposition 5.2.6.¹⁹ If, moreover, $F \in C^1(\Omega, \mathbb{R}^N)$ and $y_0$ is a regular value of $F$, then the formula (5.2.1) holds.

Remark 5.2.8. The function "deg" from the previous theorem is unique. The reader can consult, e.g., Amann & Weiss [5] or Deimling [34] to get more information.

Example 5.2.9 (Brouwer). Let $F$ be a continuous mapping from the closed unit ball $B := \overline{B(o; 1)}$ into itself. Then $F$ has a fixed point in $B$. Indeed, if there is $x \in \partial B$ such that $x - F(x) = o$, then the statement is true. In the other case put $H(t, x) = x - t F(x)$. Then $H(1, x) \ne o$ for $\|x\| = 1$ and
\[ \|H(t, x)\| \ge \|x\| - t \|F(x)\| \ge 1 - t > 0 \quad \text{for } t \in [0, 1), \ \|x\| = 1. \]
By the homotopy invariance property of the degree,
\[ 1 = \deg(I, \operatorname{int} B, o) = \deg(x - F(x), \operatorname{int} B, o). \]
By property (iv), the equation $x - F(x) = o$ has a solution in $\operatorname{int} B$.
Example 5.2.10. Let $B := \overline{B(o; 1)}$ be the closed unit ball in $\mathbb{R}^N$ and let $A$ be a linear injective operator from $\mathbb{R}^N$ into $\mathbb{R}^N$.²⁰ Then
\[ \deg(A, \operatorname{int} B, o) = (-1)^p \quad \text{where} \quad p = \sum_{\substack{\lambda \in \sigma(A) \\ \lambda < 0}} m(\lambda) \]
and $m(\lambda)$ is the multiplicity²¹ of the eigenvalue $\lambda$ of $A$. This follows immediately from Definition 5.2.1 and Exercise 1.1.40.

¹⁹ Mappings $F$, $G$ are now supposed to be continuous only.
²⁰ Actually, $A$ maps $\mathbb{R}^N$ onto $\mathbb{R}^N$ and $o \notin A(\partial B)$.
²¹ See footnote 23 on page 84; $m(\lambda)$ is often called the algebraic multiplicity.

Notice that the same result is true when $A$ is replaced by a $C^1$-mapping $F : \mathbb{R}^N \to \mathbb{R}^N$ which has an isolated zero at $x_0 \in \mathbb{R}^N$, $B(o; 1)$ is replaced by a sufficiently small ball $B(x_0; r)$ (such that the equation $F(x) = o$ has a unique solution in this closed ball, namely $x_0$) and $F'(x_0)$ is injective. Under these hypotheses we have
\[ \deg(F, B(x_0; r), o) = (-1)^p \quad \text{where} \quad p = \sum_{\substack{\lambda \in \sigma(F'(x_0)) \\ \lambda < 0}} m(\lambda). \tag{5.2.8} \]
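For a linear injective map, Definition 5.2.1 gives $\deg(A, \operatorname{int} B, o) = \operatorname{sgn} \det A$, so the formula $(-1)^p$ of Example 5.2.10 can be checked against the determinant. The sketch below does this for $2 \times 2$ matrices only; the helper names, the tolerance `tol` and the sample matrices are assumptions made for the illustration.

```python
import cmath


def eigenvalues_2x2(a, b, c, d):
    """Both eigenvalues (with multiplicity) of the matrix [[a, b], [c, d]]."""
    tr, det = a + d, a * d - b * c
    root = cmath.sqrt(tr * tr - 4.0 * det)
    return (tr + root) / 2.0, (tr - root) / 2.0


def degree_linear_2x2(a, b, c, d, tol=1e-12):
    """(-1)^p, where p counts the negative real eigenvalues with multiplicity."""
    if abs(a * d - b * c) < tol:
        raise ValueError("A is not injective; the degree at o is undefined")
    p = sum(1 for lam in eigenvalues_2x2(a, b, c, d)
            if abs(lam.imag) < tol and lam.real < 0)
    return (-1) ** p


def sign_det_2x2(a, b, c, d):
    """Sign of the determinant, i.e. the value given by formula (5.2.1)."""
    det = a * d - b * c
    return (det > 0) - (det < 0)


# Sample matrices (a, b, c, d), chosen for illustration only.
examples = {
    "reflection": (1.0, 0.0, 0.0, -1.0),
    "rotation by 90 degrees": (0.0, -1.0, 1.0, 0.0),
    "minus identity": (-1.0, 0.0, 0.0, -1.0),
    "saddle": (-2.0, 0.0, 0.0, 3.0),
}
for name, m in examples.items():
    print(name, degree_linear_2x2(*m), sign_det_2x2(*m))
```

Complex eigenvalues come in conjugate pairs and do not affect the sign of the determinant, which is why only the negative real eigenvalues are counted.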
The value $\deg(F, B(x_0; r), o)$ is also called the index of an isolated solution $x_0$ of the equation $F(x) = o$.

Example 5.2.11. Let $\Omega$ be a bounded open subset of $\mathbb{R}^N$ and let $Y$ be a linear subspace of $\mathbb{R}^N$. Suppose that $f \in C(\overline{\Omega}, Y)$ is such that $o \notin f(\partial\Omega)$. Denote by $\pi$ a projection of $\mathbb{R}^N$ onto $Y$, by $\tilde{f}$ the restriction of $\pi f$ onto $\overline{\Omega} \cap Y$ and
\[ g(x) := \tilde{f}(\pi x) + (I - \pi) x. \]
Then
\[ \deg(g, \Omega, o) = \deg(\tilde{f}, \Omega \cap Y, o). \tag{5.2.9} \]
To see this notice first that
\[ \partial(\Omega \cap Y) = \partial\Omega \cap Y \quad \text{and} \quad g_{-1}(o) = \tilde{f}_{-1}(o). \]
This means that both sides of (5.2.9) are defined. By the construction of the degree, it suffices to prove the equality under the following additional assumptions: $f \in C^1(\Omega)$ and $o$ is a regular value. Since
\[ g'(y)(h + k) = \tilde{f}'(y) h + k \quad \text{for } y \in Y \cap \Omega, \quad h \in Y, \quad k \in \operatorname{Ker} \pi, \]
we get
\[ \det g'(y) = \det \tilde{f}'(y),^{22} \]
and (5.2.9) follows.

²² The reader is asked to check this equality.

Our next aim is to generalize the notion of the degree to infinite dimensional spaces. Since the Brouwer Theorem is a corollary of the homotopy invariance property of the degree and this theorem does not hold even in an infinite dimensional Hilbert space (Example 5.1.7), we cannot expect a meaningful generalization of the Brouwer degree which would be valid for all continuous mappings. As in the Schauder Fixed Point Theorem, we restrict our attention to operators which are well approximated by finite dimensional ones, i.e., to compact operators.

One more remark is desirable. One of the main consequences of the notion of $\deg(F, \Omega, y_0)$ is a sufficient condition for the solvability of the equation
\[ F(x) = y_0 \]
in the set $\Omega$. If $F$ is a compact operator, then $F(\Omega)$ is rather small in an infinite dimensional space. Therefore it is much better to solve either
\[ x - F(x) - y_0 = o \]
(recall the Fredholm theory in Section 2.2) or, more generally,
\[ Ax - F(x) = o \quad \text{for a suitable } A. \]
The Leray–Schauder degree concerns operators of the type $I - F$ where $F : X \to X$ is a compact operator. Let $\Omega$ be a bounded open set in a Banach space $X$, $o \in X \setminus (I - F)(\partial\Omega)$. Using the compactness of $F$, it is easy to prove that $(I - F)(\partial\Omega)$ is closed, and hence,
\[ d := \operatorname{dist}(o, (I - F)(\partial\Omega)) > 0. \]
Let $\{F_n\}_{n=1}^\infty$ be a sequence in $\mathcal{C}_f(\overline{\Omega}, X)$ which converges to $F$ uniformly on $\overline{\Omega}$ (Theorem 5.1.9). Denote
\[ X_n = \operatorname{Lin} F_n(\overline{\Omega}), \qquad \Omega_n = \Omega \cap X_n, \]
and
\[ G_n(x) = x - F_n(x) \quad \text{for } x \in \overline{\Omega_n}. \]
If $n$ is sufficiently large, then $o \notin G_n(\partial\Omega_n)$ and $\deg(G_n, \Omega_n, o)$ is well defined.

Lemma 5.2.12. Under the above stated hypotheses, the sequence of integers
\[ \{\deg(G_n, \Omega_n, o)\}_{n=1}^\infty \]
is constant for large $n$, and its limit does not depend on the choice of the approximating sequence $\{G_n\}_{n=1}^\infty$.

Proof. For a given $\varepsilon > 0$ there is $n_0$ such that
\[ \sup_{x \in \overline{\Omega}} \|F(x) - F_n(x)\| < \varepsilon \quad \text{for } n \ge n_0. \]
Choose some $n, m \ge n_0$ and put
\[ \tilde{X} = X_n + X_m, \qquad \tilde{G}_k(x) = x - F_k(x), \quad x \in \overline{\Omega \cap \tilde{X}}, \quad k = n, m. \]
Consider the homotopy
\[ H(t, x) = t \tilde{G}_m(x) + (1 - t) \tilde{G}_n(x), \qquad t \in [0, 1], \quad x \in \overline{\Omega \cap \tilde{X}}. \]
For $x \in \partial\Omega \cap \tilde{X} = \partial(\Omega \cap \tilde{X})$ we have
\[ \|H(t, x)\| = \|x - F(x) - t[F_m(x) - F(x)] - (1 - t)[F_n(x) - F(x)]\| \ge \|x - F(x)\| - t \|F_m(x) - F(x)\| - (1 - t) \|F_n(x) - F(x)\| \ge d - 2\varepsilon > 0 \]
for small $\varepsilon > 0$. By Theorem 5.2.7,
\[ \deg(\tilde{G}_m, \Omega \cap \tilde{X}, o) = \deg(\tilde{G}_n, \Omega \cap \tilde{X}, o). \]
Using Example 5.2.11 we get
\[ \deg(\tilde{G}_k, \Omega \cap \tilde{X}, o) = \deg(G_k, \Omega_k, o), \qquad k = m, n, \]
i.e., $\deg(G_n, \Omega_n, o)$ is constant for $n \ge n_0$. If $\Phi_n$ is another approximating sequence of $F$, then the homotopy joining the restrictions of $I - F_n$ and $I - \Phi_n$ to the span $Z$ of $\operatorname{Im} F_n + \operatorname{Im} \Phi_n$ can be defined. The same procedure as above yields
\[ \deg(x - F_n(x), \Omega \cap Z, o) = \deg(x - \Phi_n(x), \Omega \cap Z, o). \]

We are now able to define the Leray–Schauder degree
\[ \deg(I - F, \Omega, y_0) := \deg(I - (F + y_0), \Omega, o) \]
as the limit of $\deg(G_n, \Omega_n, o)$ for any approximating sequence $\{G_n, \Omega_n\}_{n=1}^\infty$. This construction also shows that the Leray–Schauder degree inherits its properties from the Brouwer degree.

Theorem 5.2.13. Let $\Omega$ be a bounded open subset of a Banach space $X$. There exists a mapping $\deg(I - F, \Omega, y_0)$ defined for all $F \in \mathcal{C}(\overline{\Omega}, X)$ and $y_0 \in X$ such that $x - F(x) \ne y_0$ for all $x \in \partial\Omega$. This mapping has the following properties:
(i) $\deg(I, \Omega, y_0) = \begin{cases} 1 & \text{if } y_0 \in \Omega, \\ 0 & \text{if } y_0 \notin \overline{\Omega}. \end{cases}$
(ii) $\deg(I - F, \Omega, y_0) = \deg(I - F - y_0, \Omega, o)$.
(iii) If $\deg(I - F, \Omega, y_0) \ne 0$, then the equation $x - F(x) = y_0$ has a solution in $\Omega$.
(iv) If $\Omega_1, \dots, \Omega_k$ are pairwise disjoint open subsets of $\Omega$ and $x - F(x) \ne y_0$ for each $x \in \overline{\Omega} \setminus \bigcup_{j=1}^k \Omega_j$, then
\[ \deg(I - F, \Omega, y_0) = \sum_{j=1}^k \deg(I - F, \Omega_j, y_0). \]
(v) If $F, G \in \mathcal{C}(\overline{\Omega}, X)$ and
\[ \sup_{x \in \overline{\Omega}} \|F(x) - G(x)\|_X < \inf_{x \in \partial\Omega} \|x - F(x) - y_0\|_X, \]
then
\[ \deg(I - F, \Omega, y_0) = \deg(I - G, \Omega, y_0). \]
(vi) (homotopy invariance property) If $F, G \in \mathcal{C}(\overline{\Omega}, X)$ and
\[ H(t, x) = (1 - t) F(x) + t G(x), \qquad t \in [0, 1], \quad x \in \overline{\Omega}, \]
are such that
\[ x - H(t, x) \ne y_0 \quad \text{for every } x \in \partial\Omega \text{ and } t \in [0, 1], \]
then $\deg(I - H(t, \cdot), \Omega, y_0)$ is constant on $[0, 1]$. In particular,
\[ \deg(I - F, \Omega, y_0) = \deg(I - G, \Omega, y_0). \]

Proof. Finite dimensional approximations and the corresponding properties of the Brouwer degree are used to prove all the statements. We only give details for (iii). We can assume that $y_0 = o$ (by (ii)). Let $\{F_n\}_{n=1}^\infty \subset \mathcal{C}_f(\overline{\Omega}, X)$ be a sequence of finite dimensional approximations that converges to $F$ uniformly on $\overline{\Omega}$. We know, by construction of the degree, that
\[ \deg(I - F, \Omega, o) = \deg(I_n - F_n, \Omega_n, o) \ne 0 \quad \text{for all } n \text{ large}. \]
By Theorem 5.2.7 there are $x_n \in \Omega_n \subset \Omega$ such that
\[ F_n(x_n) = x_n. \]
Since $F$ is compact there exists a subsequence $\{F(x_{n_k})\}_{k=1}^\infty$ converging to a $z \in X$. It follows from the uniform convergence of $F_n$ that $\lim_{k \to \infty} F_{n_k}(x_{n_k}) = z$, too. This means that also $\lim_{k \to \infty} x_{n_k} = z \in \overline{\Omega}$ and, therefore,
\[ F(z) = z. \]
But $z$ cannot belong to $\partial\Omega$, i.e., $z \in \Omega$.
Example 5.2.14 (Rothe's version of the Schauder Fixed Point Theorem). Assume that $F$ is a compact operator from the closed unit ball $\overline{B(o; 1)}$ of a Banach space $X$ into $X$. If
\[ F(\partial B(o; 1)) \subset \overline{B(o; 1)}, \]
then $F$ has a fixed point in $\overline{B(o; 1)}$. Indeed, suppose not and consider
\[ H(t, x) := x - t F(x). \]
By the homotopy invariance property,
\[ \deg(I - F, B(o; 1), o) = \deg(I, B(o; 1), o) = 1, \]
a contradiction with property (iii) of Theorem 5.2.13.
Example 5.2.15 (Schaefer). Let $F \in \mathcal{C}(X, X)$ and let
\[ \Sigma := \{x \in X : \exists\, t \in [0, 1] \text{ such that } x - tF(x) = o\} \]
be bounded.²³ Then $F$ has a fixed point. To prove this choose an $r > 0$ such that $\Sigma \subset B(o; r)$ and put $\Omega = B(o; r)$ (open ball). The homotopy invariance property of the degree can be applied to
\[ H(t, x) = (1 - t)x + t(x - F(x)), \]
and the result follows.
²³ This assumption is often called an a priori estimate. Notice that it is not assumed that the equation $x - tF(x) = o$ has any solution. However, if a solution exists, then it belongs to a certain ball the radius of which is independent of $t$. The result given in this example is also called the Leray–Schauder Continuation Method.

The next example shows that finding an a priori estimate need not be a trivial task.

Example 5.2.16. Consider the boundary value problem
\[ \ddot{x}(t) = f(t, x(t), \dot{x}(t)), \quad t \in (0, 1), \qquad x(0) = x(1) = 0, \tag{5.2.10} \]
where $f : [0, 1] \times \mathbb{R}^2 \to \mathbb{R}$ is a continuous function. We know (Example 2.3.8 and also Example 5.1.14) that a solution of (5.2.10) is also a solution of the integral equation
\[ F(x)(t) := \int_0^1 G(t, s) f(s, x(s), \dot{x}(s)) \, ds = x(t) \]
where the Green function $G(t, s)$ is defined as follows:
\[ G(t, s) = \begin{cases} s(t - 1), & 0 \le s \le t \le 1, \\ t(s - 1), & 0 \le t < s \le 1. \end{cases} \]
Notice the difference between this example and Example 5.1.14. Here the operator $F$ depends also on the derivative $\dot{x}$, and it is thus defined only on a dense subset of the space $C[0, 1]$. The notion of the degree cannot be used for $F$ in this space. Therefore, we have to work either in $C^1$ or in the space
\[ X = \{x \in C^2[0, 1] : x(0) = x(1) = 0\} \]
which our solution has to belong to. In both the spaces the problem of an a priori estimate of a possible solution to (5.2.10) occurs (see Step 2 of this example). We will work in $X$.
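As a quick numerical sanity check of the Green function in Example 5.2.16, the following Python sketch approximates $\int_0^1 G(t, s) y(s)\, ds$ by the midpoint rule for $y \equiv 1$ and compares it with $x(t) = t(t-1)/2$, the solution of $\ddot{x} = 1$, $x(0) = x(1) = 0$. The names `green`, `psi`, `ones`, the quadrature rule and the sample points are illustrative assumptions, not part of the text.

```python
def green(t, s):
    """Green function of x'' = y, x(0) = x(1) = 0, as defined in Example 5.2.16."""
    return s * (t - 1.0) if s <= t else t * (s - 1.0)


def psi(y, t, n=2000):
    """Midpoint-rule approximation of the integral of G(t, s) * y(s) over s in [0, 1]."""
    h = 1.0 / n
    return h * sum(green(t, (k + 0.5) * h) * y((k + 0.5) * h) for k in range(n))


def ones(s):
    """Right-hand side y(s) = 1 (an illustrative choice)."""
    return 1.0


def exact(t):
    """Solution of x'' = 1 with x(0) = x(1) = 0."""
    return 0.5 * t * (t - 1.0)


# Largest deviation between the quadrature and the closed-form solution
# on a few sample points, including both end points.
sample_points = [0.0, 0.1, 0.25, 0.5, 0.75, 0.9, 1.0]
max_err = max(abs(psi(ones, t) - exact(t)) for t in sample_points)
```

The kernel is piecewise linear in $s$ with a single kink at $s = t$, so the midpoint rule is accurate up to one cell; this is why a modest number of nodes already suffices in this sketch.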
Step 1. First we show that $F$ is a compact operator on $X$. To prove this notice that $F$ is the composition $\Psi \circ \Phi$ where $\Phi : X \to C[0, 1]$ is a Nemytski type operator
\[ \Phi(x) : t \mapsto f(t, x(t), \dot{x}(t)) \]
and $\Psi : C[0, 1] \to X$ is the linear integral operator
\[ \Psi(y)(t) = \int_0^1 G(t, s) y(s) \, ds. \]
The Nemytski operator $\Phi$ is also a composition of a compact embedding of $X$ into $C^1$ (the Arzelà–Ascoli Theorem) and a continuous operator from $C^1[0, 1]$ into $C[0, 1]$. It is sufficient to show that $\Psi$ is a continuous linear operator from $C[0, 1]$ into $X$. Indeed, since $\Psi(y)$ is a solution of the boundary value problem
\[ \ddot{x}(t) = y(t), \quad t \in (0, 1), \qquad x(0) = x(1) = 0 \]
(Example 2.3.8), we have $x = \Psi(y) \in X$. Because of the boundary conditions there is $t_0 \in (0, 1)$ such that $\dot{x}(t_0) = 0$ (the classical theorem due to Rolle). This allows us to write
\[ \dot{x}(t) = \int_{t_0}^{t} y(s) \, ds, \qquad t \in [0, 1], \]
and to get the estimate $|\dot{x}(t)| \le \|y\|_{C[0,1]}$. Since
\[ |x(t)| = \left| \int_0^t \dot{x}(s) \, ds \right| \le \|y\|_{C[0,1]}, \]
we have
\[ \|x\|_X := \sup_{t \in [0,1]} |x(t)| + \sup_{t \in [0,1]} |\dot{x}(t)| + \sup_{t \in [0,1]} |\ddot{x}(t)| \le 3\|y\|_{C[0,1]}. \]
Step 2. In order to establish an a priori estimate we have to require estimates on the behavior of $f(t, x, y)$ for large $x$ and $y$:

(H1) There is $M_0$ such that $x f(s, x, 0) > 0$ for $s \in [0, 1]$ and $|x| \ge M_0$.

(H2) There are $c_1$, $c_2$ such that $|f(s, x, y)| \le c_1 y^2 + c_2$ for $s \in [0, 1]$, $|x| \le M_0$, $y \in \mathbb{R}$.²⁴

Suppose that (H1) and (H2) hold. Let there exist $x \in X$, $x \neq o$, and $\lambda \in (0, 1]$ such that $x = \lambda F(x)$.

²⁴ The condition (H1) is sometimes called the sign condition and (H2) the Nagumo-type condition.
First we will estimate $\|x\|_{C[0,1]}$. There is $t_0 \in (0, 1)$ such that $|x(t_0)| = \|x\|_{C[0,1]}$ and we assume $x(t_0) > 0$. Then $\dot{x}(t_0) = 0$ and $\ddot{x}(t_0) \le 0$. Since
\[ 0 \ge x(t_0)\ddot{x}(t_0) = \lambda x(t_0) f(t_0, x(t_0), 0), \]
we have $x(t_0) \le M_0$ according to (H1). Similarly for $x(t_0) < 0$, i.e., $\|x\|_{C[0,1]} \le M_0$. To get an estimate of $\dot{x}$ we consider the function
\[ g(t) := \log\left[c_1 (\dot{x}(t))^2 + c_2\right]. \]
Since
\[ \dot{g}(t) = 2c_1 \frac{\dot{x}(t)\ddot{x}(t)}{c_1 (\dot{x}(t))^2 + c_2} = 2\lambda c_1 \frac{\dot{x}(t) f(t, x(t), \dot{x}(t))}{c_1 (\dot{x}(t))^2 + c_2}, \]
we obtain, by (H2), $|\dot{g}(t)| \le 2c_1 |\dot{x}(t)|$. Let $G = \{t \in [0, 1] : \dot{x}(t) \neq 0\}$. Then
\[ G = \bigcup_j J_j \]
where $\overline{J_j}$ is a closed interval such that $\dot{x}(t) \neq 0$ for $t \in J_j$ and $\dot{x}$ vanishes at one end point of $\overline{J_j}$ (say $\tau_j$) at least. Now, we have
\[ 0 \le \log \frac{c_1 \dot{x}^2(t) + c_2}{c_1 \cdot 0 + c_2} = g(t) - g(\tau_j) = \int_{\tau_j}^{t} \dot{g}(s) \, ds \le \left| \int_{\tau_j}^{t} |\dot{g}(s)| \, ds \right| \le 2c_1 \left| \int_{\tau_j}^{t} |\dot{x}(s)| \, ds \right| = 2c_1 |x(t) - x(\tau_j)|, \]
since $\dot{x}$ does not change its sign in $J_j$. Hence
\[ \log \frac{c_1 \dot{x}^2(t) + c_2}{c_2} \le 4c_1 M_0. \]
This inequality shows that
\[ |\dot{x}(t)| \le M_1 := \sqrt{\frac{c_2}{c_1}}\, e^{2c_1 M_0}, \qquad t \in [0, 1]. \]
If $M_2 := \sup\{|f(t, x, y)| : t \in [0, 1],\ |x| \le M_0,\ |y| \le M_1\}$, then $\|\ddot{x}\|_{C[0,1]} \le M_2$. These estimates of $\|x\|_{C[0,1]}$, $\|\dot{x}\|_{C[0,1]}$, $\|\ddot{x}\|_{C[0,1]}$ show that the set $\Sigma$ from Example 5.2.15 is bounded and therefore the proof of existence of a solution of (5.2.10) under the hypotheses (H1) and (H2) is complete.

The reader can imagine that (H1), (H2) are not the only sufficient conditions for solving (5.2.10). However, the direct use of the Schauder Fixed Point Theorem leads to more restrictive assumptions on $f$. A survey of results until 1980 can be found in the monograph Fučík [53].
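The passage from the logarithmic inequality to the bound $M_1$ in Step 2 of Example 5.2.16 is short; spelled out, it might read as follows. This is a routine calculation under the assumption $c_1, c_2 > 0$, which (H2) does not state explicitly; the constant $4c_1 M_0$ comes from $2c_1 |x(t) - x(\tau_j)| \le 2c_1 \cdot 2M_0$, and at points where $\dot{x}(t) = 0$ the bound is trivial.

```latex
\begin{align*}
\log\frac{c_1\dot{x}^2(t)+c_2}{c_2} \le 4c_1M_0
 &\;\Longrightarrow\; c_1\dot{x}^2(t) + c_2 \le c_2\, e^{4c_1M_0} \\
 &\;\Longrightarrow\; \dot{x}^2(t) \le \frac{c_2}{c_1}\bigl(e^{4c_1M_0}-1\bigr)
      \le \frac{c_2}{c_1}\, e^{4c_1M_0} \\
 &\;\Longrightarrow\; |\dot{x}(t)| \le \sqrt{\frac{c_2}{c_1}}\; e^{2c_1M_0} = M_1 .
\end{align*}
```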
It is clear that the above stated procedure can be used for solving a more general equation
\[ Ax = F(x) \qquad \text{where } F \in \mathcal{C}(\overline{\Omega}, X) \tag{5.2.11} \]
and $A$ is a linear operator with a bounded inverse. In that case (5.2.11) is equivalent to $x = A^{-1}F(x)$ with a compact operator $A^{-1}F$. More interesting questions arise for a non-invertible $A$. Since many differential operators (both ordinary and partial) are Fredholm operators we will suppose that $A$ is a linear closed Fredholm operator²⁵ of index zero, and proceed as in Remark 4.3.14 with the exception that $A$ is not assumed to be continuous. We denote $X_1 := \operatorname{Ker} A$, $Y_2 := \operatorname{Im} A$, and choose topological complements $X_2$, $Y_1$ such that
\[ X = X_1 \oplus X_2, \qquad Y = Y_1 \oplus Y_2. \]
These closed complements exist because $X_1$ has a finite dimension and $Y_2$ a finite co-dimension (Example 2.1.12 and Remark 2.1.19). By the assumption on the index of $A$ there is also a homeomorphism $\Lambda$ of $Y_1$ onto $X_1$. Denote by $P$ and $Q$ the linear continuous projections onto $X_1$ and $Y_1$ with kernels $X_2$ and $Y_2$, respectively. Then the restriction of $A$ to $X_2 \cap \operatorname{Dom} A$ is an injective operator with a bounded inverse $B$.²⁶ The equation (5.2.11) is equivalent to the pair of equations
\[ Ax_2 = (I - Q)F(x_1 + x_2), \qquad o = QF(x_1 + x_2), \qquad x_1 \in X_1, \quad x_2 \in X_2 \cap \operatorname{Dom} A, \]
(see Figure 5.2.4) or
\[ x_2 = B(I - Q)F(x_1 + x_2), \qquad x_1 = x_1 + \Lambda QF(x_1 + x_2).^{27} \tag{5.2.12} \]
This pair of equations is equivalent to the equation
\[ G(x) = x \qquad \text{where} \qquad G(x) := Px + \Lambda QF(x) + B(I - Q)F(x). \]

²⁵ A linear closed operator $A$ is said to be Fredholm if $\dim \operatorname{Ker} A < \infty$, $\operatorname{Im} A$ is closed and $\operatorname{codim} \operatorname{Im} A < \infty$. The index of such an operator is defined as $\operatorname{ind} A := \dim \operatorname{Ker} A - \operatorname{codim} \operatorname{Im} A$. See the definition on page 70.

²⁶ Indeed, $B$ is a closed operator (as an inverse to a closed operator) defined on the Banach space $Y_2$. The continuity of $B$ follows now from the Closed Graph Theorem (Corollary 2.1.10). The operator $B(I - Q)$ is called a generalized (or a right) inverse to $A$. It is characterized by the following two properties: (i) $AB(I - Q) = I - Q$ (the reason for calling it the right inverse); (ii) $B(I - Q)Ax = (I - P)x$ for $x \in \operatorname{Dom} A$.

²⁷ Since $F$ is nonlinear, $\Lambda$ need not be taken as linear. It is actually only essential that $\Lambda$ maps $Y_1$ into $X_1$ and $\Lambda^{-1}(o) = \{o\}$.
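[Figure 5.2.4. The decompositions $X = X_1 \oplus X_2$ with $X_1 = \operatorname{Ker} A$ and $Y = Y_1 \oplus Y_2$ with $Y_2 = \operatorname{Im} A$, together with the projections $P$, $Q$ and the maps $B$ and $\Lambda$.]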
If $\Omega$ is a bounded open subset of $X$, $Ax - F(x) \neq o$ for each $x \in \partial\Omega$ and $G$ is compact on $\overline{\Omega}$, then the Leray–Schauder degree $\deg(I - G, \Omega, o)$ is well defined. It is called the coincidence degree of the couple $(A, F)$. It can be proved that this definition does not depend on the choice of the projections $P$, $Q$ and the class of $\Lambda$'s which do not change the orientations in $X_1$ and $Y_1$. The coincidence degree was introduced by J. Mawhin (see, e.g., Mawhin [91] or Gaines & Mawhin [57]). J. Mawhin also proved the following theorem which generalizes the statement of Example 5.2.15.

Theorem 5.2.17 (J. Mawhin). Let $A : \operatorname{Dom} A \subset X \to X$ be a Fredholm operator of index zero, $\Omega$ a bounded open subset of a Banach space $X$, and let $B(I - Q)F \in \mathcal{C}(\overline{\Omega}, X)$ where $B(I - Q)$ is a generalized inverse to $A$. Assume further that

(i) $Ax - \lambda F(x) \neq o$ for $x \in \partial\Omega \cap \operatorname{Dom} A$, $\lambda \in (0, 1)$,²⁸

(ii) $\deg(\Lambda QF|_{\operatorname{Ker} A \cap \overline{\Omega}}, \operatorname{Ker} A \cap \Omega, o) \neq 0$.

Then the equation (5.2.11) has a solution in $\overline{\Omega}$.

Proof. The proof is based on the observation that the coincidence degree can be reduced with help of the homotopy invariance property and the Product Formula to the Brouwer degree of the restriction of $\Lambda QF$ to $\operatorname{Ker} A \cap \Omega$; for details see the references given above.

²⁸ Notice that injectivity of $A$ is actually not needed, hence the assumption (i) is not assumed for $\lambda = 0$.

Example 5.2.18. Consider the equation
\[ \dot{x} = f(t, x) \tag{5.2.13} \]
together with the periodic boundary condition
\[ x(0) = x(1) \tag{5.2.14} \]
where $f \in C([0, 1] \times \mathbb{R}, \mathbb{R})$. See also Example 4.3.15.
We denote
\[ \operatorname{Dom} A = \{x \in C^1[0, 1] : x(0) = x(1)\} \]
and put $Ax = \dot{x}$ for $x \in \operatorname{Dom} A$. Further, let $F$ be a Nemytski operator defined by
\[ F(x)(t) = f(t, x(t)), \qquad t \in [0, 1], \quad x \in X := C[0, 1]. \]
Then $A$ is a Fredholm operator of index zero. We choose projections
\[ Px = x(0) \quad \text{onto } \operatorname{Ker} A \qquad \text{and} \qquad Qy = \int_0^1 y(s) \, ds \quad \text{onto the complement of } \operatorname{Im} A. \]
Then the generalized inverse to $A$ is given by
\[ B(I - Q)y(t) = \int_0^t y(s) \, ds - t \int_0^1 y(s) \, ds \]
and it is an isomorphism of $\operatorname{Im} A = \operatorname{Ker} Q$ onto $X_2 \cap \operatorname{Dom} A$ where $X_2 = \operatorname{Ker} P$. Moreover, $B(I - Q)F$ is a compact operator on $X$. To verify conditions (i), (ii) of Theorem 5.2.17 we suppose

(H) there are functions $f_+, f_- \in C[0, 1]$ such that
\[ \lim_{x \to -\infty} f(t, x) = f_-(t), \qquad \lim_{x \to +\infty} f(t, x) = f_+(t) \]
uniformly with respect to $t \in [0, 1]$ and
\[ f_-(t) < f(t, x) < f_+(t), \qquad t \in [0, 1], \quad x \in \mathbb{R}.^{29} \]
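The formula for the generalized inverse $B(I - Q)$ in Example 5.2.18 can be checked numerically. The sketch below takes $y(s) = s$ (an arbitrary illustrative choice) and verifies that $x = B(I - Q)y$ satisfies $x(0) = x(1)$, $x(0) = 0$ (so that $Px = 0$) and $\dot{x} = y - Qy$. The helper names `integrate`, `Q`, `gen_inverse` and the finite-difference step are assumptions made for this illustration.

```python
def integrate(y, a, b, n=1000):
    """Composite midpoint rule for the integral of y over [a, b]."""
    h = (b - a) / n
    return h * sum(y(a + (k + 0.5) * h) for k in range(n))


def Q(y):
    """Projection Qy = integral of y over [0, 1] (a constant, returned as a number)."""
    return integrate(y, 0.0, 1.0)


def gen_inverse(y, t):
    """B(I - Q)y(t) = int_0^t y(s) ds - t * int_0^1 y(s) ds, as in Example 5.2.18."""
    return integrate(y, 0.0, t) - t * Q(y)


def y(s):
    """Illustrative right-hand side y(s) = s."""
    return s


def x(t):
    """Candidate solution x = B(I - Q)y."""
    return gen_inverse(y, t)


# Periodicity x(0) = x(1) and the normalization x(0) = 0 (x lies in Ker P).
periodic_gap = abs(x(0.0) - x(1.0))

# Central difference for x'(t) at a sample point, compared with (I - Q)y(t).
t, d = 0.3, 1e-3
derivative = (x(t + d) - x(t - d)) / (2 * d)
residual = abs(derivative - (y(t) - Q(y)))
```

For this particular $y$ the exact result is $x(t) = t^2/2 - t/2$, which makes both checks easy to confirm by hand.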
Step 1. Verification of condition (i): For any solution $x$ of (5.2.13)–(5.2.14) we have, by integration,
\[ \int_0^1 f_-(t) \, dt < 0 = \int_0^1 f(t, x(t)) \, dt < \int_0^1 f_+(t) \, dt. \tag{5.2.15} \]
Take an $\varepsilon > 0$ such that
\[ \int_0^1 f_-(t) \, dt < -\varepsilon, \qquad \int_0^1 f_+(t) \, dt > \varepsilon. \]
Then there exists $r > 0$ such that
\[ 0 < f(t, x) - f_-(t) < \frac{\varepsilon}{2} \quad \text{for } t \in [0, 1],\ x < -r, \]
and
\[ 0 < f_+(t) - f(t, x) < \frac{\varepsilon}{2} \quad \text{for } t \in [0, 1],\ x > r. \]

²⁹ Conditions similar to (H) are called conditions of the Landesman–Lazer type. See, e.g., Landesman & Lazer [83], Fučík [53], Mawhin [91] or Drábek [39], and also Section 7.5.
This implies that
\[ \int_0^1 f(t, x) \, dt < -\frac{\varepsilon}{2} \quad \text{for } x < -r, \qquad \int_0^1 f(t, x) \, dt > \frac{\varepsilon}{2} \quad \text{for } x > r. \]
It means that for any solution $x$ of (5.2.13)–(5.2.14) there exists $t_0 \in [0, 1]$ such that $|x(t_0)| \le r$. Since
\[ x(t) = x(t_0) + \int_{t_0}^{t} f(s, x(s)) \, ds, \]
we get
\[ \|x\|_{C[0,1]} \le M := r + \max\{\|f_-\|_{C[0,1]}, \|f_+\|_{C[0,1]}\}. \]
If we take $\Omega$ to be a ball $B(o; R)$ of radius $R > M$, then the condition (i) from Theorem 5.2.17 is satisfied for $\lambda = 1$. The same is also true for the solution of
\[ Ax = \lambda F(x), \qquad \lambda \in (0, 1). \]
Step 2. Verification of condition (ii): For $x \in \operatorname{Ker} A$ we have
\[ \Lambda QF(x) = \int_0^1 f(t, x) \, dt \]
provided we have taken $\Lambda$ as the identity map. Moreover,
\[ \int_0^1 f(t, R) \, dt > 0 > \int_0^1 f(t, -R) \, dt. \tag{5.2.16} \]
From the construction of the Brouwer degree it follows that we can assume that
\[ \Phi(x) := \int_0^1 f(t, x) \, dt \]
is smooth, and $0$ is a regular value of $\Phi$. By (5.2.16), the number of zero points of $\Phi$ is odd. This means that
\[ \deg(\Phi, (-R, R), 0) = \sum_{\Phi(x) = 0} \operatorname{sgn} \det \Phi'(x) = 1 = \deg(\Lambda QF|_{\operatorname{Ker} A \cap \overline{B(o; R)}}, \operatorname{Ker} A \cap B(o; R), o) \]
and, therefore, condition (ii) is also satisfied.
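The one-dimensional degree computation in Step 2 can be illustrated numerically. The sketch below uses the hypothetical choice $\Phi(x) = \tanh(x^3 - x)$, which is negative at $-R$, positive at $R$ and has the three simple zeros $-1, 0, 1$; the degree is computed as the sum of the signs of $\Phi'$ at the zeros, detected here as sign changes on a grid. The function names and the grid-based zero detection are assumptions of this illustration, not the method of the text.

```python
import math


def phi(x):
    """Illustrative stand-in for Phi(x) = int_0^1 f(t, x) dt, with limits -1 and +1."""
    return math.tanh(x ** 3 - x)


def degree_1d(func, a, b, n=1000):
    """Return (degree, number of zeros) of func on (a, b) with respect to 0.

    The degree is the number of upward sign changes minus the number of
    downward ones, i.e. the sum of sgn func'(x) over simple zeros.
    Assumes func(a) != 0, func(b) != 0 and that all zeros are simple and
    separated by more than (b - a) / n.
    """
    h = (b - a) / n
    degree, zeros = 0, 0
    prev = func(a)
    for k in range(1, n + 1):
        cur = func(a + k * h)
        if cur == 0.0:
            # Exact hit on a zero: wait for the next nonzero value to see the sign change.
            continue
        if prev < 0.0 < cur:
            degree += 1
            zeros += 1
        elif prev > 0.0 > cur:
            degree -= 1
            zeros += 1
        prev = cur
    return degree, zeros


result = degree_1d(phi, -2.0, 2.0)
```

In this example the number of zeros is odd and the signed count equals 1, in agreement with the argument based on (5.2.16).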
These considerations show that the problem (5.2.13)–(5.2.14) has a solution provided (H) is fulfilled. Notice that we have shown that
\[ \int_0^1 f_-(t) \, dt < 0 < \int_0^1 f_+(t) \, dt \]
is also a necessary condition for the solvability of (5.2.13)–(5.2.14) under the assumption (H) (by (5.2.15)).

Remark 5.2.19. It is not necessary to consider only projections $P$, $Q$ on the “small” $\operatorname{Ker} A$ and a complement of $\operatorname{Im} A$. For example, suppose that there are projections $\{Q_n\}$ converging to the identity in a certain sense, and $Q_n Q = Q$ (e.g., $Q_n$ can be the partial sums of the Fourier series of the elements of $Y = C[0, 1]$ for the periodic problem). If we can take projections $P_n$ so that $A(\operatorname{Im}(I - P_n)) = \operatorname{Im}(I - Q_n)$, then there is a chance to solve the first equation in (5.2.12) for a fixed $x_1$ by the Contraction Principle even if $F$ is only locally Lipschitz. This idea belongs to L. Cesari (see, e.g., his survey in Cesari [22]). Using this approach he proved the existence of a $2\pi$-periodic solution of the equation
\[ \ddot{x} + x^3 = \sin t. \]
Notice a significant difference in the sign of the nonlinear term here and in (H1) in Example 5.2.16, and the fact that the growth of the nonlinear term is faster here than in (H2).

At the end of this section we turn our attention to the bifurcations of solutions. As in Section 4.3 we consider the equation
\[ f(\lambda, x) = o \]
where $f : \mathbb{R} \times X \to X$ is continuous on $J \times U$, $J$ is an open interval and $U$ is a neighborhood of $o$ in a Banach space $X$. We suppose that
\[ f(\lambda, o) = o \qquad \text{for all } \lambda \in J \]
and desire to find conditions under which the point $\lambda_0 \in J$ is a bifurcation point according to Definition 4.3.21. In Section 4.3 we have used a method based on the Implicit Function Theorem and now we want to employ the topological approach based on the degree theory. Notice that the definition of the index of an isolated solution (Example 5.2.10) can be literally used also in an infinite dimensional space.
Proposition 5.2.20. Let $h(\lambda, \cdot) : X \to X$ be a compact operator on the neighborhood $U$ of zero in a Banach space $X$ for all $\lambda \in J$. Let $o$ be an isolated solution of
\[ f(\lambda, x) := x - h(\lambda, x) = o \quad \text{in } U \quad \text{for all } \lambda \in J \setminus \{\lambda_0\}. \]
Put $i(\lambda) = \deg(f(\lambda, \cdot), U, o)$. If
\[ \lim_{\lambda \to \lambda_0-} i(\lambda) \neq \lim_{\lambda \to \lambda_0+} i(\lambda), \tag{5.2.17} \]
then $(\lambda_0, o)$ is a bifurcation point of $f$.

Proof. Suppose not. Then there is a neighborhood $V = \tilde{J} \times \tilde{U}$ of $(\lambda_0, o)$ such that $(\lambda, o)$ are the only solutions of
\[ f(\lambda, x) = o \quad \text{in } V. \]
This means that for any $\lambda \in \tilde{J}$ the index $i(\lambda) = \deg(I - h(\lambda, \cdot), \tilde{U}, o)$ is defined and, by the general homotopy invariance property of the degree (Exercise 5.2.29), the index $i(\lambda)$ is constant, a contradiction to (5.2.17).

The use of Proposition 5.2.20 is restricted to the problems of computing the index $i(\lambda)$. The following classical result (Theorem 5.2.23) is based on a special form of $f$ which is often met in applications. For the proof we need two prerequisites which are of independent interest.

Proposition 5.2.21. Let $\Omega$ be an open set in a Banach space $X$ and let $F \in \mathcal{C}(\Omega, X)$. If the Fréchet derivative $F'(x_0)$ exists for an $x_0 \in \Omega$, then $F'(x_0)$ is a (linear) compact operator.

Proof. If $F'(x_0)$ is not compact, then one can find $\varepsilon_0 > 0$ and a sequence $\{y_n\}_{n=1}^{\infty} \subset X$ such that $\|y_n\| \le 1$ and
\[ \|F'(x_0)y_k - F'(x_0)y_l\| \ge \varepsilon_0 \qquad \text{for } k \neq l. \]
By the definition of Fréchet derivative, there is $\delta > 0$ such that
\[ \|F(x_0 + h) - F(x_0) - F'(x_0)h\| \le \frac{\varepsilon_0}{4}\|h\| \qquad \text{provided } \|h\| < \delta \ (\le 1). \]
Choose $\tau$ such that
\[ \|\tau y_k\| < \delta \quad \text{and} \quad x_0 + \tau y_k \in \Omega \qquad \text{for all } k \in \mathbb{N}. \]
Then
\[ \|F(x_0 + \tau y_k) - F(x_0 + \tau y_l)\| \ge \|F'(x_0)(\tau y_k - \tau y_l)\| - \|F(x_0 + \tau y_k) - F(x_0) - F'(x_0)\tau y_k\| - \|F(x_0 + \tau y_l) - F(x_0) - F'(x_0)\tau y_l\| \ge \frac{\varepsilon_0 \tau}{2}. \]
But this means that $F$ is not compact on $\Omega$, a contradiction.
Proposition 5.2.22 (Leray–Schauder Index Formula). Let $\Omega$ be an open bounded set in a Banach space $X$ and let $F \in \mathcal{C}(\overline{\Omega}, X)$. Let $x_0 \in \Omega$ be a unique solution in $\overline{\Omega}$ of the equation $x = F(x)$. Assume that the Fréchet derivative $F'(x_0)$ exists and $I - F'(x_0)$ is continuously invertible. Then
\[ \deg(I - F, \Omega, o) = (-1)^{\beta} \qquad \text{where} \qquad \beta = \sum_{\substack{\lambda \in \sigma(F'(x_0)) \cap \mathbb{R} \\ \lambda > 1}} m(\lambda) \tag{5.2.18} \]
and $m(\lambda)$ is the multiplicity³⁰ of the eigenvalue $\lambda$ of the operator $F'(x_0)$.

Proof. First we recall that $F'(x_0)$ is a compact operator (Proposition 5.2.21) and, therefore, $\beta$ is a finite number (Corollary 2.2.13). Choose such a small ball $B(o; \varepsilon)$ that $x_0 + B(o; \varepsilon) \subset \Omega$, and put
\[ H(t, y) = \frac{F(x_0 + ty) - F(x_0)}{t}, \qquad t \in (0, 1], \quad y \in \overline{B(o; \varepsilon)}, \]
and
\[ H(0, y) = F'(x_0)y, \qquad y \in \overline{B(o; \varepsilon)}. \]
The ball $B(o; \varepsilon)$ can be chosen such that the equation $y = H(t, y)$ has a unique solution in $\overline{B(o; \varepsilon)}$, namely $y = o$. Indeed,
\[ \|H(t, y) - y\| = \left\| \frac{1}{t}\left[F(x_0 + ty) - F(x_0) - F'(x_0)(ty)\right] + F'(x_0)y - y \right\| \ge \|F'(x_0)y - y\| - \frac{1}{t}\left\|F(x_0 + ty) - F(x_0) - F'(x_0)(ty)\right\| \ge c\|y\| - \frac{c}{2}\|y\| \]
provided $\|y\|$ is small enough (by the definition of $F'(x_0)$ and the assumption on $I - F'(x_0)$).

³⁰ For the definition of multiplicity see footnote 23 on page 84.
By the homotopy invariance property (in a more general setting, see Exercise 5.2.29), $\deg(I - H(t, \cdot), B(o; \varepsilon), o)$ is constant on the interval $[0, 1]$. In particular,
\[ \deg(I - F, \Omega, o) = \deg(I - F'(x_0), B(o; \varepsilon), o). \]
Put
\[ X_1 := \bigoplus_{\substack{\lambda \in \sigma(F'(x_0)) \\ \lambda > 1}} \ \bigcup_{p=1}^{\infty} \operatorname{Ker}\left[\lambda I - F'(x_0)\right]^p. \]
As we have mentioned above, $\dim X_1 = \beta < \infty$. Moreover, there exists a topological complement $X_2$ to $X_1$ in $X$ which is $F'(x_0)$-invariant (see the decomposition (2.2.4)). This decomposition of $X$ allows us to use the Product Formula for the degree (Exercise 5.2.28) provided balls $B_i \subset X_i$, $i = 1, 2$, are chosen such that $B_1 \times B_2 \subset B(o; \varepsilon)$. Hence we obtain
\[ \deg(I - F'(x_0), B(o; \varepsilon), o) = \deg(F_1, B_1, o) \deg(F_2, B_2, o) \]
where $F_i$ denotes the restriction of $I - F'(x_0)$ to $X_i$, $i = 1, 2$. To compute $\deg(F_2, B_2, o)$ we introduce the homotopy
\[ H_2(t, y) = y - tF'(x_0)y, \qquad t \in [0, 1], \quad y \in X_2. \]
Assume that $H_2(t, y) = o$ for a $y \neq o$. Then $t \in (0, 1)$ and $\frac{1}{t} \in \sigma(F'(x_0))$ and, by the definition of $X_1$, $y \in X_1$. Since $X_1 \cap X_2 = \{o\}$ we arrive at a contradiction. This consideration shows that we may apply the homotopy invariance property to $H_2$ to get
\[ \deg(F_2, B_2, o) = \deg(I, B_2, o) = 1. \]
The degree $\deg(F_1, B_1, o)$ is the Brouwer degree of the linear operator $F_1$ in the finite dimensional space $X_1$, and it was computed in Example 5.2.10. Notice that $A$ is here $(I - F'(x_0))|_{X_1}$, and thus the negative eigenvalues $\mu$ of $A$ correspond, via $\mu = 1 - \lambda$, to the eigenvalues $\lambda > 1$ of $F'(x_0)$. This shows that $\deg(F_1, B_1, o) = (-1)^{\beta}$.
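In finite dimensions the Index Formula reduces to the statement that $\operatorname{sgn} \det(I - A) = (-1)^{\beta}$ for a matrix $A$ without eigenvalue 1 and with real eigenvalues, where $\beta$ counts the eigenvalues greater than 1 with multiplicity. The sketch below checks this on two small matrices whose eigenvalues are known by hand; the matrices, the helper names and the elimination-based determinant are illustrative assumptions.

```python
def det(matrix):
    """Determinant by Gaussian elimination with partial pivoting."""
    a = [row[:] for row in matrix]
    n = len(a)
    d = 1.0
    for c in range(n):
        p = max(range(c, n), key=lambda r: abs(a[r][c]))
        if a[p][c] == 0.0:
            return 0.0
        if p != c:
            a[c], a[p] = a[p], a[c]
            d = -d  # a row swap flips the sign of the determinant
        d *= a[c][c]
        for r in range(c + 1, n):
            factor = a[r][c] / a[c][c]
            for k in range(c, n):
                a[r][k] -= factor * a[c][k]
    return d


def sign_det_identity_minus(A):
    """sgn det(I - A), the Brouwer degree of the invertible linear map I - A at o."""
    n = len(A)
    m = [[(1.0 if i == j else 0.0) - A[i][j] for j in range(n)] for i in range(n)]
    d = det(m)
    return (d > 0) - (d < 0)


def index_formula(real_eigenvalues):
    """(-1)**beta with beta = sum of multiplicities of real eigenvalues > 1.

    real_eigenvalues is a list of (eigenvalue, multiplicity) pairs.
    """
    beta = sum(mult for lam, mult in real_eigenvalues if lam > 1)
    return (-1) ** beta


# Companion matrix of lambda^2 - 5*lambda + 6: eigenvalues 2 and 3, so beta = 2.
A1 = [[0.0, 1.0], [-6.0, 5.0]]
# Upper triangular matrix: eigenvalues 3, 0.5, -1, so beta = 1.
A2 = [[3.0, 1.0, 0.0], [0.0, 0.5, 2.0], [0.0, 0.0, -1.0]]

check_A1 = (sign_det_identity_minus(A1), index_formula([(2.0, 1), (3.0, 1)]))
check_A2 = (sign_det_identity_minus(A2), index_formula([(3.0, 1), (0.5, 1), (-1.0, 1)]))
```

Eigenvalues below 1 (here $0.5$ and $-1$) contribute positive factors $1 - \lambda$ to $\det(I - A)$, which is why only the eigenvalues greater than 1 enter $\beta$.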
Theorem 5.2.23 (Krasnoselski Local Bifurcation Theorem). Let $U$ be a neighborhood of $o$ in a Banach space $X$, and let
\[ f(\lambda, x) = x - \lambda Ax - G(\lambda, x), \qquad \lambda \in J, \quad x \in U, \]
where $J$ is an open interval in $\mathbb{R}$, $A$ is a linear compact operator on $X$, $G(\lambda, \cdot) : U \to X$ is a compact operator and
\[ \lim_{x \to o} \frac{G(\lambda, x)}{\|x\|} = o \qquad \text{for all } \lambda \in J. \]
If $\lambda_0 \in J$ is such that $\frac{1}{\lambda_0}$ is an eigenvalue of $A$ of odd multiplicity, then $(\lambda_0, o)$ is a bifurcation point of $f$.
Proof. Suppose that $(\lambda_0, o)$ is not a bifurcation point. Then there is a neighborhood $\tilde{J} \times V$ of $(\lambda_0, o)$ such that the equation $f(\lambda, x) = o$ has for every $\lambda \in \tilde{J}$ a unique solution in $V$, namely $x = o$. We may assume that $0 \notin \tilde{J}$, $\left\{\lambda : \frac{1}{\lambda} \in \sigma(A)\right\} \cap \tilde{J} = \{\lambda_0\}$, and also that $\lambda_0 > 0$ (for $\lambda_0 < 0$ consider $\tilde{f}(\lambda, x) = f(-\lambda, x)$). The degree $\deg(f(\lambda, \cdot), V, o)$ is defined and it is given by the Leray–Schauder Index Formula (Proposition 5.2.22):
\[ \deg(f(\lambda, \cdot), V, o) = (-1)^{\beta} \qquad \text{where} \qquad \beta = \sum_{\substack{\mu \in \sigma(\lambda A) \cap \mathbb{R} \\ \mu > 1}} m(\mu) = \sum_{\substack{\kappa \in \sigma(A) \cap \mathbb{R} \\ \kappa > \frac{1}{\lambda}}} m(\kappa). \]
Since $m\left(\frac{1}{\lambda_0}\right)$ is odd, the degree $\deg(f(\lambda, \cdot), V, o)$ changes sign at $\lambda_0$. A contradiction follows now from Proposition 5.2.20.

The Krasnoselski Theorem 5.2.23 is of a local nature and does not say anything about the global behavior of a “branch” of nontrivial solutions of the equation $f(\lambda, x) = o$. The so-called global bifurcation theorems describe these branches (see Appendix 5.2A). The interested reader can also consult, e.g., Rabinowitz [103, pages 11–36], Izé [69], Nirenberg [100, Chapter 3], Krasnoselski & Zabreiko [79], Krawcewicz & Wu [80]. There are also methods depending on other topological tools. See, e.g., Alexander [3, pages 457–483] or Fitzpatrick [50] and references given there.

Remark 5.2.24 (Comparison of Theorems 4.3.22 and 5.2.23). Let $\frac{1}{\lambda_0}$ be an eigenvalue of $A$ and
\[ \dim \operatorname{Ker}\left(\frac{1}{\lambda_0} I - A\right) = 1. \]
Denote by $x_0$, $\|x_0\| = 1$, an eigenvector of $A$ associated with $\frac{1}{\lambda_0}$. Let us compare the assumptions of Theorems 4.3.22 and 5.2.23. One of the essential differences consists in the smoothness assumptions: while Theorem 4.3.22 applied to $f(\lambda, x) = x - \lambda Ax - G(\lambda, x)$ requires $G$ being a $C^2$-mapping, Theorem 5.2.23 demands $G$ compact (and so continuous). The assumption $G(\lambda, x) = o(\|x\|)$, $x \to o$, yields
\[ f_2'(\lambda_0, o) = I - \lambda_0 A, \qquad f_{1,2}''(\lambda_0, o) = -A. \tag{5.2.19} \]
Theorem 4.3.22 requires
\[ \operatorname{Im}(I - \lambda_0 A) = \overline{\operatorname{Im}(I - \lambda_0 A)}, \tag{5.2.20} \]
\[ \operatorname{codim} \operatorname{Im}(I - \lambda_0 A) = 1, \tag{5.2.21} \]
\[ f_{1,2}''(\lambda_0, o)(1, x_0) \notin \operatorname{Im}(I - \lambda_0 A). \tag{5.2.22} \]
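The compactness of $A$ (Theorem 5.2.23) implies that $I - \lambda_0 A$ is a Fredholm operator of index 0, hence (5.2.20) holds and also
\[ \operatorname{codim} \operatorname{Im}(I - \lambda_0 A) = \dim \operatorname{Ker}(I - \lambda_0 A) = 1, \]
i.e., (5.2.21) follows. The last assumption is closely connected with the multiplicity $m\left(\frac{1}{\lambda_0}\right)$ of $\frac{1}{\lambda_0}$ as follows from the assertion: The assumption (5.2.22) is verified if and only if
\[ m\left(\frac{1}{\lambda_0}\right) = \dim \bigcup_{k=1}^{\infty} \operatorname{Ker}(I - \lambda_0 A)^k = 1. \tag{5.2.23} \]
First, let us prove (5.2.23) ⇒ (5.2.22). Assume the contrary:
\[ f_{1,2}''(\lambda_0, o)(1, x_0) \in \operatorname{Im}(I - \lambda_0 A). \]
According to (5.2.19) it means $-Ax_0 \in \operatorname{Im}(I - \lambda_0 A)$. Since $x_0 = \lambda_0 Ax_0$, we have $x_0 \in \operatorname{Im}(I - \lambda_0 A)$ as well. Then there exists $w \in X$ such that $x_0 = w - \lambda_0 Aw$. But $x_0 \in \operatorname{Ker}(I - \lambda_0 A)$, i.e., $w \in \operatorname{Ker}(I - \lambda_0 A)^2$. Since $x_0 \neq o$, we have $w \notin \operatorname{Ker}(I - \lambda_0 A)$, which implies
\[ \dim \operatorname{Ker}(I - \lambda_0 A)^2 > \dim \operatorname{Ker}(I - \lambda_0 A), \qquad \text{i.e.,} \qquad m\left(\frac{1}{\lambda_0}\right) > 1, \]
a contradiction. Now, let us prove (5.2.22) ⇒ (5.2.23). Take
\[ w \in \operatorname{Ker}(I - \lambda_0 A)^2 \qquad \text{and set} \qquad u = (I - \lambda_0 A)w. \]
Then $(I - \lambda_0 A)u = (I - \lambda_0 A)^2 w = o$, which implies $u \in \operatorname{Ker}(I - \lambda_0 A)$. Since $\operatorname{Ker}(I - \lambda_0 A)$ is generated by $x_0$, there exists $a \in \mathbb{R}$ such that $u = ax_0$. Simultaneously, $u = (I - \lambda_0 A)w \in \operatorname{Im}(I - \lambda_0 A)$. For $a \neq 0$ we have
\[ -Ax_0 = -\frac{1}{\lambda_0} x_0 = -\frac{1}{\lambda_0 a} u \in \operatorname{Im}(I - \lambda_0 A), \]
a contradiction with (5.2.19) and (5.2.22). Hence $a = 0$ and
\[ u = (I - \lambda_0 A)w = o, \qquad \text{i.e.,} \qquad w \in \operatorname{Ker}(I - \lambda_0 A). \]
This proves $\operatorname{Ker}(I - \lambda_0 A)^2 \subset \operatorname{Ker}(I - \lambda_0 A)$.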
Since the opposite inclusion is evident, we have proved $\operatorname{Ker}(I - \lambda_0 A)^2 = \operatorname{Ker}(I - \lambda_0 A)$. By induction on the power $n$ we now easily prove that
\[ \operatorname{Ker}(I - \lambda_0 A)^{n+1} = \operatorname{Ker}(I - \lambda_0 A)^n \qquad \text{for any } n \in \mathbb{N} \]
(do it in detail!).

Exercise 5.2.25. Prove the following assertion: Let $f$, $A$ and $G$ be as in Theorem 5.2.23. Let $\lambda_0 \neq 0$ be a bifurcation point of $f$. Then $\frac{1}{\lambda_0}$ is an eigenvalue of $A$.
Hint. If $\lambda_0$ is a bifurcation point of $f$, there are
\[ o \neq x_n \to o, \qquad \lambda_n \to \lambda_0, \qquad f(\lambda_n, x_n) = o. \]
Set $v_n := \frac{x_n}{\|x_n\|}$. Then
\[ v_n = \lambda_n Av_n + \frac{G(\lambda_n, x_n)}{\|x_n\|}. \tag{5.2.24} \]
Since $\{v_n\}_{n=1}^{\infty}$ is bounded, $A$ is compact, and passing to a subsequence if necessary we may assume that $v_n \to v$ for a $v \in X$, $v \neq o$. From (5.2.24) we obtain that $v = \lambda_0 Av$.

Exercise 5.2.26. Let $F \in \mathcal{C}(\overline{\Omega}, X)$ where $\Omega \subset X$ is an open, bounded, symmetric with respect to $o \in X$, and nonempty set in a Banach space $X$, $F(x) \neq o$ for all $x \in \partial\Omega$. Assume that
\[ F(x) = -F(-x) \qquad \text{for any } x \in \partial\Omega. \]
Then the Leray–Schauder degree $\deg(I - F, \Omega, o)$ is an odd number.
Hint. Use finite dimensional approximations as in the construction of the Leray–Schauder degree (see pages 277–278) and Theorem 4.3.130.

Exercise 5.2.27. Modify the proof of (5.2.9) to obtain the so-called Product Formula:
\[ \deg(g, \Omega, y_0) = \deg(f_1, \Omega_1, y_{1,0}) \deg(f_2, \Omega_2, y_{2,0}) \]
where $g = (f_1, f_2) : \overline{\Omega} \to \mathbb{R}^{N_1 + N_2}$, $\Omega = \Omega_1 \times \Omega_2$, $y_0 = (y_{1,0}, y_{2,0})$, $\Omega_i \subset \mathbb{R}^{N_i}$, $f_i \in C(\overline{\Omega_i}, \mathbb{R}^{N_i})$, $y_{i,0} \notin f_i(\partial\Omega_i)$, $i = 1, 2$.

Exercise 5.2.28. By repeating the construction of the Leray–Schauder degree show that the Product Formula and the boundary dependence (Theorem 4.3.124(vii)) of the degree also hold for the Leray–Schauder degree.
Exercise 5.2.29. Prove the following general homotopy invariance property: Let $\Omega$ be an open bounded set in a Banach space $X$ and assume that $h = h(t, x) \in \mathcal{C}([0, 1] \times \overline{\Omega}, X)$ and
\[ x - h(t, x) \neq y_0 \qquad \text{for every } x \in \partial\Omega \text{ and } t \in [0, 1]. \]
Then $\deg(I - h(t, \cdot), \Omega, y_0)$ is constant with respect to $t \in [0, 1]$.

The following two exercises use an idea similar to that of Example 5.2.14.

Exercise 5.2.30. Let $H$ be a Hilbert space and $F$ a compact operator on the closure $\overline{\Omega}$ of a bounded open set $\Omega \subset H$ into $H$. Assume that $o \in \Omega$ and
\[ (F(x), x) \le \|x\|^2 \qquad \text{for each } x \in \partial\Omega. \]
Prove that $F$ has a fixed point in $\overline{\Omega}$.
Hint. Suppose not and show that there is $t_0 \in [0, 1]$, $x_0 \in \partial\Omega$ such that $x_0 = t_0 F(x_0)$. By assumption, $t_0 = 1$.

Exercise 5.2.31. Let $F$ be a compact operator from the closed unit ball $\overline{B(o; 1)}$ of a Banach space $X$ into $X$ and, moreover, let
\[ \|x - F(x)\|^2 \ge \|F(x)\|^2 - \|x\|^2 \qquad \text{for } x \in \partial B(o; 1). \]
Prove that $F$ has a fixed point in $\overline{B(o; 1)}$.

Exercise 5.2.32. Let $f$ be continuous and satisfy the following growth conditions: There are $K > 0$ and $0 < \gamma < 1$ such that the inequality
\[ |f(t, x, y)| \le K(1 + |x|^{\gamma} + |y|^{\gamma}) \]
holds for $t \in [0, 1]$, $x, y \in \mathbb{R}$. Then the boundary value problem (5.2.10) has a solution. Prove that!
Hint. Proceed similarly to Example 5.2.16. Use the equation to estimate $x \in \Sigma$ and compute $\dot{x}$ with help of a special form of the kernel $G$.

Exercise 5.2.33. Apply Theorem 5.2.23 to the Dirichlet boundary value problem
\[ \ddot{x}(t) + \lambda x(t) + g(\lambda, t, x(t)) = 0, \quad t \in (0, \pi), \qquad x(0) = x(\pi) = 0, \]
and show that every point $(k^2, o)$, $k = 1, 2, \dots$, is a bifurcation point.
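For Exercise 5.2.33 the relevant linear problem is $-\ddot{x} = \lambda x$, $x(0) = x(\pi) = 0$, whose eigenvalues are $k^2$ with eigenfunctions $\sin kt$; each is simple, hence of odd multiplicity. The following Python sketch illustrates this with the standard three-point finite-difference discretization, for which $\sin(k t_i)$ is an exact discrete eigenvector with eigenvalue $(2 - 2\cos kh)/h^2 \to k^2$. The discretization, the grid size and the function names are assumptions of this illustration, not part of the exercise.

```python
import math


def discrete_dirichlet_eigenvalue(k, n):
    """k-th eigenvalue of the 3-point discretization of -x'' on (0, pi), n interior nodes."""
    h = math.pi / (n + 1)
    return (2.0 - 2.0 * math.cos(k * h)) / h ** 2


def eigen_residual(k, n):
    """Max-norm residual of -D2 v = mu v for v_i = sin(k * t_i).

    D2 is the 3-point second difference; the end values v_0 and v_{n+1}
    vanish (up to rounding), which encodes the Dirichlet conditions.
    """
    h = math.pi / (n + 1)
    v = [math.sin(k * i * h) for i in range(n + 2)]
    mu = discrete_dirichlet_eigenvalue(k, n)
    return max(
        abs(-(v[i - 1] - 2.0 * v[i] + v[i + 1]) / h ** 2 - mu * v[i])
        for i in range(1, n + 1)
    )


# Discrete eigenvalues for k = 1, ..., 4 on a grid with 200 interior nodes;
# they approximate the candidate bifurcation values lambda = k^2.
approx = [discrete_dirichlet_eigenvalue(k, 200) for k in range(1, 5)]
```

In the setting of Theorem 5.2.23 the operator $A$ would be the inverse of $-d^2/dt^2$ with Dirichlet conditions, so its eigenvalues are $1/k^2$, which matches the bifurcation points $(k^2, o)$ claimed in the exercise.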
5.2A Global Bifurcation Theorem

In this appendix we study the bifurcation equation
\[ f(\lambda, x) := x - \lambda Ax - G(\lambda, x) = o. \tag{5.2.25} \]
The following result is due to Rabinowitz [103, pp. 11–36], Rabinowitz [104].

Theorem 5.2.34 (Rabinowitz Global Bifurcation Theorem). Let $X$ be a Banach space, $\Omega$ an open set in $\mathbb{R} \times X$, $(\lambda_0, o) \in \Omega$, $\lambda_0 \neq 0$. Let us assume:

$A$ is a compact linear operator from $X$ into $X$, (5.2.26)

$G$ is a compact (nonlinear) operator from $\Omega$ into $X$, (5.2.27)

for any bounded set $M \subset \{v \in \mathbb{R} : (v, o) \in \Omega\}$ we have $G(\lambda, x) = o(\|x\|)$, $x \to o$, uniformly for $\lambda \in M$, (5.2.28)

$\frac{1}{\lambda_0}$ is an eigenvalue of $A$ of odd multiplicity. (5.2.29)

Denote by $S$ the closure of all solutions of (5.2.25) with $x \neq o$, i.e.,
\[ S = \overline{\{(\lambda, x) \in \Omega : x \neq o,\ f(\lambda, x) = o\}}. \]
Then $S$ contains the point $(\lambda_0, o)$.³¹ Let $C$ be a component of $S$ which contains $(\lambda_0, o)$. Then at least one of the following assertions holds:

(i) $C$ is not a compact set in $\Omega$.

(ii) $C$ contains an even number of points $(\lambda, o)$ where $\frac{1}{\lambda}$ is an eigenvalue of $A$ of odd multiplicity.

Proof. We shall follow the proof of Izé [69]. The idea is the following. We will assume that $C$ is compact, and prove that it contains an even number of points described in (ii). Since $C$ is compact, it contains only a finite number of points $(\lambda, o)$ where $\lambda \neq 0$ and $\frac{1}{\lambda}$ is an eigenvalue of the compact operator $A$ (see Figure 5.2.5): We shall denote them by $(\lambda_0, o), \dots, (\lambda_{k-1}, o)$. Since $C$ is a component of $S$ in $\Omega$ and $S$ is closed, there exists an open bounded set $\tilde{\Omega}$ such that $C \subset \tilde{\Omega}$ and $S \cap \partial\tilde{\Omega} = \emptyset$. We prove that $\tilde{\Omega}$ can be chosen in such a way that

“$(\lambda_j, o) \in \tilde{\Omega}$, $j = 0, 1, \dots, k - 1$, but $(\lambda, o) \notin \tilde{\Omega}$ for $\frac{1}{\lambda} \in \sigma(A)$, $\lambda \neq \lambda_j$, $j = 0, 1, \dots, k - 1$”

(see Figure 5.2.6). Indeed, let $U$ be a $\delta$-neighborhood of $C$ such that $\overline{U} \setminus C$ does not contain any point $(\lambda, o)$, $\lambda \neq 0$, $\frac{1}{\lambda} \in \sigma(A)$. The set $K = \overline{U} \cap S$ is then compact,³² and obviously $C \cap (\partial U \cap S) = \emptyset$. By Deimling [34, Lemma 29.1] there exist compact disjoint sets $K_1, K_2 \subset K$ such that
\[ K = K_1 \cup K_2, \qquad C \subset K_1, \qquad \partial U \cap S \subset K_2. \]

³¹ I.e., $(\lambda_0, o)$ is a bifurcation point in the sense of Definition 4.3.21.

³² The reader is invited to prove it using the compactness of $A$ and $G$.
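[Figure 5.2.5. The set $S$ and the component $C$ over the $\lambda$-axis, with the points $(0, o)$ and $\lambda_0, \lambda_1, \lambda_2, \lambda_3$ marked.]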
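[Figure 5.2.6. The $\delta$-neighborhood $U$ of $C$, the sets $K_1$, $K_2$ and $\partial U \cap S$, the $\varepsilon_0$-neighborhood $\tilde{\Omega}$ of $K_1$, and the sets $U_0(\varepsilon, \varepsilon), \dots, U_3(\varepsilon, \varepsilon)$ around $\lambda_0, \dots, \lambda_3$.]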
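Hence $\tilde{\Omega}$ can be chosen as an $\varepsilon_0$-neighborhood of $K_1$ with
\[ \varepsilon_0 < \min\{\operatorname{dist}(K_1, K_2), \operatorname{dist}(K_1, \partial U), \delta\}. \]
For any $r > 0$ define $f_r : \tilde{\Omega} \to \mathbb{R} \times X$ as
\[ f_r(\lambda, x) = (\|x\|^2 - r^2, f(\lambda, x)). \tag{5.2.30} \]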
Then obviously
\[ f_r(\lambda, x) = o \quad \Longleftrightarrow \quad f(\lambda, x) = o \ \text{ and } \ \|x\| = r. \]
(In other words, the function $f_r$ “considers” the solutions of $f(\lambda, x) = o$ which belong to the sphere $\|x\| = r$.) Then thanks to the choice of $\tilde{\Omega}$ and the homotopy invariance property of the degree (Theorem 5.2.13(vi)), we conclude that
\[ \deg(f_r, \tilde{\Omega}, o) \]
is well defined and independent of $r > 0$. The rest of the proof consists in the calculation of this degree for sufficiently large $r$ and for sufficiently small $r$.

Step 1 (sufficiently large $r$). The boundedness of $\tilde{\Omega}$ implies that there exists $C > 0$ such that $\|x\| < C$ for any $(\lambda, x) \in \tilde{\Omega}$. Then for $r > C$ the equation $f_r(\lambda, x) = o$ has no solution in $\tilde{\Omega}$, and so, according to Theorem 5.2.13(iii), we have
\[ \deg(f_r, \tilde{\Omega}, o) = 0. \]

Step 2 (sufficiently small $r$). For $j = 0, 1, \dots, k - 1$ set
\[ U_j(\varepsilon, r) := \{(\lambda, x) : \|x\|^2 + |\lambda - \lambda_j|^2 < r^2 + \varepsilon^2\}, \]
and choose $\varepsilon > 0$ so small that the sets $U_j(\varepsilon, \varepsilon)$ are pairwise disjoint, all belong to $\tilde{\Omega}$, and do not contain $(0, o)$ (see Figure 5.2.6). We prove first that there exists $r > 0$ ($r \le \varepsilon$) such that
\[ x - \lambda Ax - tG(\lambda, x) \neq o \tag{5.2.31} \]
for all $t \in [0, 1]$, $(\lambda, x) \in \tilde{\Omega}$, $0 < \|x\| \le r$, $|\lambda - \lambda_j| \ge \varepsilon$, $j = 0, 1, \dots, k - 1$. Indeed, assume via contradiction that such $r > 0$ does not exist. Then there exist $t_n \in [0, 1]$ and $(\lambda_n, x_n) \in \tilde{\Omega}$, $n \in \mathbb{N}$, $o \neq x_n \to o$, $|\lambda_n - \lambda_j| \ge \varepsilon$, $j = 0, 1, \dots, k - 1$, not satisfying (5.2.31), i.e.,
\[ x_n - \lambda_n Ax_n - t_n G(\lambda_n, x_n) = o. \tag{5.2.32} \]
We can assume, without loss of generality, that $\lambda_n \to \tilde{\lambda}$. It follows from the construction of $\tilde{\Omega}$ that $\frac{1}{\tilde{\lambda}} \notin \sigma(A)$. On the other hand, it follows from (5.2.32) that (setting $y_n = \frac{x_n}{\|x_n\|}$)
\[ y_n - \lambda_n Ay_n - t_n \frac{G(\lambda_n, x_n)}{\|x_n\|} = o. \tag{5.2.33} \]
Now, the compactness of $A$ and (5.2.28) imply that for a $y \neq o$ ($y_{n_k} \to y$ for a subsequence) we have
\[ y - \tilde{\lambda} Ay = o, \]
a contradiction. We shall write $U_j = U_j(\varepsilon, r)$ for simplicity. It follows from Theorem 5.2.13(iv) that
\[ \deg(f_r, \tilde{\Omega}, o) = \sum_{j=0}^{k-1} \deg(f_r, U_j, o). \tag{5.2.34} \]
Let $\lambda_j$ be fixed. It follows from the choice of $\varepsilon > 0$ that for $0 < |\lambda - \lambda_j| \le \varepsilon$ we have
\[ \frac{1}{\lambda} \notin \sigma(A). \]
Then for any such $\lambda$ the degree $\deg(I - \lambda A, B(o; r), o)$ is well defined. Moreover, the homotopy invariance property of the degree implies that it is locally constant with respect to $\lambda$. Denote
\[ i_j^- = \deg(I - (\lambda_j - \varepsilon)A, B(o; r), o), \qquad i_j^+ = \deg(I - (\lambda_j + \varepsilon)A, B(o; r), o). \]
It follows from Lemma 5.2.35 below that
\[ \deg(f_r, U_j, o) = i_j^- - i_j^+. \tag{5.2.35} \]
If $m_j$ is the multiplicity of $\frac{1}{\lambda_j}$, then Proposition 5.2.22 yields
\[ i_j^+ = (-1)^{m_j} i_j^-. \]
Hence for $m_j$ even we obtain
\[ \deg(f_r, U_j, o) = i_j^- - i_j^+ = 0, \tag{5.2.36} \]
while for $m_j$ odd we have
\[ \deg(f_r, U_j, o) = 2i_j^-. \tag{5.2.37} \]
It follows from (5.2.34)–(5.2.37) that
\[ \deg(f_r, \tilde{\Omega}, o) = 2 \sum_{\substack{j=0 \\ m_j \text{ odd}}}^{k-1} i_j^-. \]
Since this degree is independent of $r$, it must be equal to zero (see Step 1 of this proof). Each $i_j^-$ equals $\pm 1$ by Proposition 5.2.22, so the sum can vanish only if it has an even number of terms. Hence there must be an even number of eigenvalues of odd algebraic multiplicity among $\lambda_0, \dots, \lambda_{k-1}$.

Now we prove an analogue of the Leray–Schauder Index Formula (see Proposition 5.2.22).

Lemma 5.2.35. Let $f_r$, $U_j$, $i_j^-$, $i_j^+$ be as above. Then
\[ \deg(f_r, U_j, o) = i_j^- - i_j^+. \]
Proof. We will connect $f_r$ with a simpler mapping using a suitable homotopy. Let us define this homotopy in the following way: for every $t \in [0, 1]$,
\[ f_{t,r} : \overline{U_j} \to \mathbb{R} \times X : \quad f_{t,r}(\lambda, x) = (\tau_t, y_t), \]
\[ \tau_t = t(\|x\|^2 - r^2) + (1 - t)(\varepsilon^2 - (\lambda - \lambda_j)^2), \qquad y_t = x - \lambda Ax - tG(\lambda, x). \]
We prove that for any $t \in [0, 1]$
\[ o \notin f_{t,r}(\partial U_j). \]
Assume the contrary, i.e., there exist $t \in [0, 1]$ and $(\lambda, x) \in \partial U_j$ such that $f_{t,r}(\lambda, x) = o$. The fact that $(\lambda, x) \in \partial U_j$ implies $\|x\|^2 + (\lambda - \lambda_j)^2 = r^2 + \varepsilon^2$. At the same time, from
\[ 0 = \tau_t = t(\|x\|^2 + (\lambda - \lambda_j)^2) - t(r^2 + \varepsilon^2) + \varepsilon^2 - (\lambda - \lambda_j)^2 \]
we obtain $\lambda = \lambda_j \pm \varepsilon$, and so $\|x\| = r$. This together with $y_t = o$ contradicts (5.2.31). The homotopy invariance property of the degree then implies
\[ \deg(f_r, U_j, o) = \deg(f_{0,r}, U_j, o). \]
The mapping $f_{0,r}$ is now easier to deal with. Indeed, the point $o$ has two preimages, $(\lambda_j - \varepsilon, o)$ and $(\lambda_j + \varepsilon, o)$, with respect to the mapping
\[ f_{0,r}(\lambda, x) = (\varepsilon^2 - (\lambda - \lambda_j)^2, x - \lambda Ax). \]
At both points the Fréchet differential $f_{0,r}'$ is injective:
\[ f_{0,r}'(\lambda, o)(\bar{\lambda}, u) = (-2(\lambda - \lambda_j)\bar{\lambda}, u - \lambda Au). \]
Let us choose sufficiently small neighborhoods of points $(\lambda_j \pm \varepsilon, o)$ in the following way: Let $V^{\pm}$ be small neighborhoods of points $\lambda_j \pm \varepsilon$ and let $U$ be a small neighborhood of $o$ in $X$ such that
\[ U^{\pm} := V^{\pm} \times U \subset U_j. \]
We have
\[ \deg(f_{0,r}, U_j, o) = \deg(f_{0,r}, U^-, o) + \deg(f_{0,r}, U^+, o) \]
and, by the Product Formula (Exercise 5.2.28) and Proposition 5.2.22,
\[ \deg(f_{0,r}, U^-, o) = \deg(I - (\lambda_j - \varepsilon)A, U, o) \cdot \deg(\varepsilon^2 - (\cdot - \lambda_j)^2, V^-, o) = i_j^- \cdot 1. \]
Similarly, we get
\[ \deg(f_{0,r}, U^+, o) = i_j^+ \cdot (-1). \]
This completes the proof.
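The two scalar factors $+1$ and $-1$ in the proof of Lemma 5.2.35 come from a one-dimensional Brouwer degree, which at a regular zero is the sign of the derivative. Writing $\varphi$ for the scalar component (the name $\varphi$ is introduced here only for illustration), the computation can be sketched as:

```latex
\[
\varphi(\lambda) := \varepsilon^2 - (\lambda - \lambda_j)^2,
\qquad
\varphi'(\lambda) = -2(\lambda - \lambda_j),
\]
\[
\varphi'(\lambda_j - \varepsilon) = 2\varepsilon > 0
\;\Longrightarrow\; \deg(\varphi, V^-, 0) = +1,
\qquad
\varphi'(\lambda_j + \varepsilon) = -2\varepsilon < 0
\;\Longrightarrow\; \deg(\varphi, V^+, 0) = -1 .
\]
```

Combined with the Product Formula this gives $\deg(f_{0,r}, U_j, o) = i_j^- \cdot 1 + i_j^+ \cdot (-1) = i_j^- - i_j^+$.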
Corollary 5.2.36. If $\Omega = \mathbb{R} \times X$ in Theorem 5.2.34, then the first possibility (i) reduces to “$C$ is unbounded in $\mathbb{R} \times X$” and (ii) remains unchanged.

Proof. Let $(\lambda, x) \in C$. Then $x = \lambda Ax + G(\lambda, x)$. This implies that if $C$ is bounded in $\mathbb{R} \times X$, it is also relatively compact because $T(\lambda, x) = \lambda Ax + G(\lambda, x)$ is a compact operator. But $C$ is closed, and so it is compact. We have thus proved that if $C$ is bounded, it is also compact.
We will now discuss the special case when $\frac{1}{\lambda_0}$ is an eigenvalue of $A$ the multiplicity of which is equal to 1. If this is the case in Theorem 5.2.34 and $C$ is the component containing the point $(\lambda_0, o)$, then $C$ consists of two connected sets $C^\pm$ which near $(\lambda_0, o)$ meet only in $(\lambda_0, o)$. More precisely, the next assertion holds (see Deimling [34, Corollary 29.1]).

Corollary 5.2.37. Under the hypotheses of Theorem 5.2.34 suppose, in addition, that the multiplicity of $\frac{1}{\lambda_0}$ is 1. Then the component $C$ containing $(\lambda_0, o)$ consists of two connected sets $C^+$ and $C^-$, $C = C^+ \cup C^-$, such that
$$C^+ \cap C^- \cap B((\lambda_0, o); \varepsilon) = \{(\lambda_0, o)\} \quad\text{and}\quad C^\pm \cap \partial B((\lambda_0, o); \varepsilon) \neq \emptyset$$
for sufficiently small $\varepsilon > 0$.

The meaning of $C^\pm$ is the following. Let us assume that $(\lambda_n, x_n) \in C^\pm$, $\lambda_n \to \lambda_0$ and $x_n \to o$. Then similarly to Exercise 5.2.25 we prove that $\frac{x_n}{\|x_n\|} \to \pm v_0$ where $v_0 \neq o$ is a normalized eigenvector associated with the eigenvalue $\lambda_0$. In other words, the sets $C^\pm$ describe the "branches" of nontrivial solutions which bifurcate in the direction of the eigenvectors $\pm v_0$ (see Figure 5.2.7 for the projections of $C^\pm$ into the space $X$).
Figure 5.2.7. (Projections of $C^+$ and $C^-$ into $X$, leaving $o$ in the directions $v_0$ and $-v_0$.)

The global properties of $C^\pm$ were studied by Dancer [30]. The main result of this paper can be formulated as follows.

Theorem 5.2.38 (Dancer Global Bifurcation Theorem). The sets $C^+$ and $C^-$ are either both unbounded, or $C^+ \cap C^- \neq \{(\lambda_0, o)\}$.

Example 5.2.39 (Application of the Dancer Global Bifurcation Theorem). Let us consider the Dirichlet boundary value problem
$$\ddot{x}(t) + \lambda x(t) = g(\lambda, t, x(t)), \quad t \in (0, \pi), \qquad x(0) = x(\pi) = 0. \tag{5.2.38}$$
We assume that $g = g(\lambda, t, s)$ is a continuous function from $\mathbb{R} \times [0, \pi] \times \mathbb{R}$ into $\mathbb{R}$ and, given any bounded interval $I \subset \mathbb{R}$,
$$\lim_{s \to 0} \frac{g(\lambda, t, s)}{s} = 0 \tag{5.2.39}$$
holds uniformly with respect to $t \in [0, \pi]$ and $\lambda \in I$. In particular,
$$g(\lambda, t, 0) = 0, \qquad t \in [0, \pi],\ \lambda \in \mathbb{R},$$
and so (5.2.38) has a trivial solution. In this example we discuss the existence and properties of nontrivial weak solutions of (5.2.38). Let $X := W_0^{1,2}(0, \pi)$ and define operators $A$, $G$ with values in $X$ as follows:
$$(Ax, y) = \int_0^\pi x(t) y(t)\,dt, \qquad (G(\lambda, x), y) = \int_0^\pi g(\lambda, t, x(t)) y(t)\,dt \quad\text{for any } x, y \in X.$$
The existence of a weak solution of (5.2.38) is equivalent to the existence of a solution of the operator equation (5.2.25), i.e., $x - \lambda A x - G(\lambda, x) = o$, cf. Example 5.3.11. Moreover, (5.2.39) implies (5.2.28) (the reader is invited to check it). Set $\lambda_0 = n^2$ where $n \in \mathbb{N}$ is fixed. Then $\frac{1}{\lambda_0} = \frac{1}{n^2}$ is an eigenvalue of $A$ of multiplicity 1. It follows from the above results (Theorems 5.2.34 and 5.2.38) that there is a component $C$ of $S$ which contains nontrivial solutions of (5.2.25), and such that
$$C = C^+ \cup C^-, \qquad (n^2, o) \in C^+ \cap C^-,$$
$C^\pm$ are either both unbounded, or $C^+ \cap C^- \neq \{(n^2, o)\}$. We show that the latter case cannot occur if $g = g(\lambda, t, s)$ is locally Lipschitz continuous with respect to the third variable $s$ (cf. page 93). To prove this fact the properties of the initial value problem (with $\lambda$ fixed)
$$\ddot{x}(t) + \lambda x(t) - g(\lambda, t, x(t)) = 0, \qquad x(t_0) = x_0, \quad \dot{x}(t_0) = x_1, \quad t_0 \in [0, \pi], \tag{5.2.40}$$
play a crucial role. In particular, we use the uniqueness of the solution to (5.2.40), which in turn implies that (5.2.40) with $x_0 = x_1 = 0$ has only the trivial solution. The regularity result (cf. Remark 5.3.10 and Exercise 5.3.26) for weak solutions of (5.2.38) yields that for any $(\lambda, x) \in C^\pm$ we have $x \in C^2[0, \pi]$, and the above mentioned uniqueness result for (5.2.40) also implies that any such $x$ has only a finite number of nodes in $(0, \pi)$. Let
$$v_0(t) = \frac{1}{n} \sqrt{\frac{2}{\pi}}\, \sin nt$$
be a normalized eigenfunction associated with the eigenvalue $\frac{1}{n^2}$ of $A$. Consider $(\lambda_k, x_k) \in C^+$ such that $\lambda_k \to n^2$ and $x_k \to o$. Then
$$x_k - \lambda_k A x_k - G(\lambda_k, x_k) = o.$$
The reflexivity of $X$, (5.2.39) and the compactness of $A$ imply that
$$v_k := \frac{x_k}{\|x_k\|} \to v_0 \quad\text{in } X.$$
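As a quick sanity check of the normalization of $v_0$, the following Python sketch evaluates the relevant integrals numerically. It assumes that the scalar product on $X = W_0^{1,2}(0, \pi)$ is $(x, y) = \int_0^\pi \dot{x}(t)\dot{y}(t)\,dt$ (an assumption of this illustration, not restated in the passage), and the choice `n = 3`, the test function $y(t) = \sin nt$ and the helper `integrate` are likewise hypothetical.

```python
import math


def integrate(f, a, b, m=2000):
    """Composite midpoint rule on [a, b] with m subintervals."""
    h = (b - a) / m
    return h * sum(f(a + (i + 0.5) * h) for i in range(m))


n = 3  # hypothetical choice of the fixed index n


def v0(t):
    return (1.0 / n) * math.sqrt(2.0 / math.pi) * math.sin(n * t)


def dv0(t):
    return math.sqrt(2.0 / math.pi) * math.cos(n * t)


# Squared norm of v0, assuming (x, y) = integral of x'(t) y'(t) over (0, pi).
norm_sq = integrate(lambda t: dv0(t) ** 2, 0.0, math.pi)

# Weak eigenvalue relation (A v0, y) = (1/n^2)(v0, y), tested with y(t) = sin(n t):
# left side  = integral of v0 * y, right side = (1/n^2) * integral of v0' * y'.
lhs = integrate(lambda t: v0(t) * math.sin(n * t), 0.0, math.pi)
rhs = integrate(lambda t: dv0(t) * n * math.cos(n * t), 0.0, math.pi) / n ** 2
print(norm_sq, lhs, rhs)
```

Under the stated assumption the computed squared norm is 1 up to rounding, and the two sides of the eigenvalue relation agree.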
The embedding $X = W_0^{1,2}(0, \pi) \subset C[0, \pi]$ and the fact that
$$v_k = \lambda_k A v_k + \frac{G(\lambda_k, x_k)}{\|x_k\|} \tag{5.2.41}$$
then yield that $v_k \to v_0$ even in $C^2[0, \pi]$. In particular, it means that for large enough $k$, the functions $x_k$ share the nodal properties of $v_0$. More precisely, let
$$A^+ := \{(\lambda, x) \in C^+ : x \text{ has exactly } (n-1) \text{ nodes in } (0, \pi) \text{ and } \dot{x}(0) > 0\},$$
$$A^- := \{(\lambda, x) \in C^- : x \text{ has exactly } (n-1) \text{ nodes in } (0, \pi) \text{ and } \dot{x}(0) < 0\}.$$
Then there exists $\varepsilon_0 > 0$ such that
$$C^\pm \cap B((n^2, o); \varepsilon) = A^\pm \quad\text{for any}\quad 0 \le \varepsilon \le \varepsilon_0.$$
In particular, $A^\pm \neq \emptyset$. We show that $A^\pm$ is closed and open in $C^\pm$. Let us consider $C^+$; the case of $C^-$ is similar. Recall that $C^+$ is a connected set with respect to the topology induced by the topology on $\mathbb{R} \times X$. For a given $(\hat\lambda, \hat{x}) \in C^+$ the convergence $(\lambda_k, x_k) \to (\hat\lambda, \hat{x})$ in this topology means that
$$\lambda_k \to \hat\lambda \quad\text{in } \mathbb{R} \qquad\text{and}\qquad x_k \to \hat{x} \quad\text{in } X.$$
The above mentioned regularity result and the embedding $X = W_0^{1,2}(0, \pi) \subset C[0, \pi]$ then imply that $x_k \to \hat{x}$ in $C^2[0, \pi]$. Let us assume that
$$(\lambda_k, x_k) \in A^+, \qquad (\lambda_k, x_k) \neq (\lambda, o), \qquad (\lambda_k, x_k) \to (\hat\lambda, \hat{x}) \in C^+.$$
The fact $x_k \to \hat{x}$ in $C^2[0, \pi]$ then yields that $(\hat\lambda, \hat{x}) \in A^+$, i.e., $A^+$ is closed in $C^+$. On the other hand, if $(\hat\lambda, \hat{x}) \in A^+$, $(\hat\lambda, \hat{x}) \neq (n^2, o)$, then there exists $\hat\varepsilon > 0$ such that
$$C^+ \cap B((\hat\lambda, \hat{x}); \hat\varepsilon) \subset A^+,$$
for otherwise there would be $\lambda_k \to \hat\lambda$ and $x_k \to \hat{x}$ in $C^2[0, \pi]$, $(\lambda_k, x_k) \notin A^+$, $(\lambda_k, x_k) \in C^+$, a contradiction. Hence $A^+$ is open in $C^+$. We have just proved that $A^\pm = C^\pm$ and so the sets $C^+$ and $C^-$ do not have any common point besides $(n^2, o)$. According to Theorem 5.2.38 both $C^+$ and $C^-$ are unbounded in $\mathbb{R} \times X$. Let us emphasize that this means that $C^\pm$ are unbounded either with respect to $x$, or with respect to $\lambda$ (or with respect to both $x$ and $\lambda$!). Some further properties of $g$ might provide more information about the sets $C^\pm$ (e.g., boundedness with respect to $x$, if there are a priori estimates for all solutions, and unboundedness with respect to $\lambda$; or vice versa, boundedness with respect to $\lambda$ and unboundedness with respect to $x$).

Exercise 5.2.40. Consider the boundary value problem (5.2.38) and apply Theorem 5.2.34 to get conclusions about the bifurcation branches. Formulate further assumptions on $g$ which will imply unboundedness of the branches with respect to $x$ and $\lambda$, respectively.

Exercise 5.2.41. Consider the Neumann boundary value problem
$$\ddot{x}(t) + \lambda x(t) = g(\lambda, t, x(t)), \quad t \in (0, \pi), \qquad \dot{x}(0) = \dot{x}(\pi) = 0. \tag{5.2.42}$$
Find conditions on $g = g(\lambda, t, s)$ and $\lambda$ making it possible to apply Theorem 5.2.34.
Exercise 5.2.42. Modify assumptions from Exercise 5.2.41 on $g$ so as to make it possible to apply Theorem 5.2.38 to (5.2.42) and to exclude the situation $C^+ \cap C^- \neq \{(n^2, o)\}$.

Exercise 5.2.43. Consider the periodic problem
$$\ddot{x}(t) + \lambda x(t) = g(\lambda, t, x(t)), \quad t \in (0, 2\pi), \qquad x(0) = x(2\pi), \quad \dot{x}(0) = \dot{x}(2\pi). \tag{5.2.43}$$
Find conditions on $g = g(\lambda, t, s)$ and $\lambda$ making it possible to apply Theorem 5.2.34, cf. Example 4.3.25.
5.2B Topological Degree for Generalized Monotone Operators

Let $X$ be a reflexive real Banach space and $X^*$ its dual. We will consider the operator
$$T : X \to X^*. \tag{5.2.44}$$
The purpose of this appendix is to inform the reader about a possible method for extending the Leray–Schauder degree theory to mappings of the type (5.2.44). The following definition is the key to the theory presented in this appendix.

Definition 5.2.44. The operator $T : X \to X^*$ is said to satisfy the $(S_+)$ condition if the assumptions
$$u_n \rightharpoonup u_0 \ \text{(weakly) in } X \qquad\text{and}\qquad \limsup_{n \to \infty} \langle T(u_n), u_n - u_0 \rangle \le 0 \ ^{33}$$
imply
$$u_n \to u_0 \ \text{(strongly) in } X.$$

³³ Here and in the sequel we denote by $\langle f, u \rangle := f(u)$ the value of the linear form $f \in X^*$ for an element $u \in X$. If $X$ is a Hilbert space, then according to the Riesz Representation Theorem, $\langle f, x \rangle = (x, f)$.

Remark 5.2.45. The topological degree for generalized monotone operators was independently introduced by Browder [19] and Skrypnik [121]. The notation $(S_+)$ is brought from Browder [19] while the same condition is called $\alpha(X)$ in Skrypnik [121]. The $(S_+)$ condition is a kind of compactness condition and plays an essential role in the construction of the degree for $T : X \to X^*$. This construction is based on the Brouwer degree and finite dimensional approximations, as is the construction of the Leray–Schauder degree, and mappings satisfying the $(S_+)$ condition then play a role similar to that of compact perturbations of the identity. The following assertion illustrates this fact. Its proof is a straightforward consequence of Definition 5.2.44.

Lemma 5.2.46. Let $T : X \to X^*$ satisfy the $(S_+)$ condition and let $K : X \to X^*$ be a compact operator. Then the sum $T + K : X \to X^*$ satisfies the $(S_+)$ condition.

The following assertion is an analogue of Theorem 5.2.13 and of Exercise 5.2.26.

Theorem 5.2.47 (I. V. Skrypnik [121]). Let $T : X \to X^*$ be a bounded and demicontinuous³⁴ operator satisfying the $(S_+)$ condition. Let $D \subset X$ be an open, bounded and nonempty set with the boundary $\partial D$ such that $T(u) \neq o$ for $u \in \partial D$. Then there exists an integer $\deg(T, D, o)$ (called the degree of the mapping $T$) such that
(i) $\deg(T, D, o) \neq 0$ implies that there exists an element $u_0 \in D$ such that $T(u_0) = o$.
(ii) If $D$ is symmetric with respect to the origin and $T$ satisfies $T(u) = -T(-u)$ for any $u \in \partial D$, then $\deg(T, D, o)$ is an odd number (and thus different from zero).
(iii) (Homotopy invariance property) Let $T_\lambda$ be a family of bounded and demicontinuous mappings which satisfy the $(S_+)$ condition and which depend continuously on a real parameter $\lambda \in [0, 1]$, and let $T_\lambda(u) \neq o$ for any $u \in \partial D$ and $\lambda \in [0, 1]$. Then $\deg(T_\lambda, D, o)$ is constant with respect to $\lambda \in [0, 1]$. In particular, we have $\deg(T_0, D, o) = \deg(T_1, D, o)$.

³⁴ We say that $T : X \to X^*$ is demicontinuous if $T$ maps strongly convergent sequences in $X$ to weakly convergent sequences in $X^*$.

The following assertion combined with Theorem 5.2.47(i) is a crucial tool in proving the existence of a solution or the existence of a bifurcation branch (see Appendix 7.5A).

Proposition 5.2.48 (I. V. Skrypnik [121]). Let $T : X \to X^*$ be a bounded, demicontinuous mapping satisfying the $(S_+)$ condition, $o \in D \setminus \partial D$, $T(u) \neq o$ for $u \in \partial D$, $D$ being as in Theorem 5.2.47. Let for $u \in \partial D$ the inequality
$$\langle T(u), u \rangle \ge 0$$
be valid.
Then $\deg(T, D, o) = 1$.

Let $u_0 \in X$ be an isolated solution of the equation
$$T(u) = o. \tag{5.2.45}$$
Similarly to the finite dimensional case (and in the case of the Leray–Schauder degree) we define the index of an isolated solution $u_0$ as
$$i(T, u_0) = \lim_{r \to 0^+} \deg(T, B(u_0; r), o).$$
Then we have the following useful property of the degree.

Proposition 5.2.49 (I. V. Skrypnik [121]). Let $T$ and $D$ be as in Theorem 5.2.47. Let $T(u) = o$ have only isolated solutions in $D$ and let $T(u) \neq o$ for $u \in \partial D$. Then there is only a finite number of solutions of (5.2.45) in $D$, $u_i$, $i = 1, \dots, n$, and the equality
$$\deg(T, D, o) = \sum_{i=1}^{n} i(T, u_i)$$
holds.
The last assertion connects the properties of the functionals and the degree of their Fréchet derivatives.

Proposition 5.2.50 (I. V. Skrypnik [121]). Assume that a real functional $F : X \to \mathbb{R}$ has a local minimum at $u_0 \in X$ and its Fréchet derivative $F' : X \to X^*$ is a bounded and demicontinuous mapping which satisfies the $(S_+)$ condition. Let, moreover, $u_0$ be an isolated solution of $F'(u) = o$. Then $i(F', u_0) = 1$.

Example 5.2.51. Let us consider the boundary value problem
$$-(|\dot{x}(t)|^{p-2}\dot{x}(t))\dot{} - g(x(t)) = f(t), \quad t \in (0, 1), \qquad x(0) = x(1) = 0, \tag{5.2.46}$$
where $p > 1$, $f \in L^{p'}(0, 1)$, $p' = \frac{p}{p-1}$, and $g : \mathbb{R} \to \mathbb{R}$ is a continuous function. The differential operator
$$x \mapsto (|\dot{x}|^{p-2}\dot{x})\dot{}\ ^{35}$$
is the so-called one-dimensional $p$-Laplacian (or half-linear differential operator of the second order). The parameter $\lambda \in \mathbb{R}$ for which there is a nontrivial weak solution (cf. Remark 5.3.10) $\varphi = \varphi(t)$ (i.e., not identically equal to zero in $(0, 1)$) of the problem
$$-(|\dot{x}(t)|^{p-2}\dot{x}(t))\dot{} - \lambda |x(t)|^{p-2}x(t) = 0, \quad t \in (0, 1), \qquad x(0) = x(1) = 0, \tag{5.2.47}$$
is called an eigenvalue of the eigenvalue problem (5.2.47) and the function $\varphi$ an eigenfunction associated with the eigenvalue $\lambda$. It is known (see, e.g., Elbert [47]) that the problem (5.2.47) has a countable set of simple eigenvalues
$$0 < \lambda_1 < \lambda_2 < \cdots, \qquad \lim_{n \to \infty} \lambda_n = \infty$$
(cf. Appendix 6.4B for the case $p > 2$) and the values of $\lambda_n$, $n = 1, 2, \dots$, can be explicitly calculated in terms of $p$ and $\pi$. The eigenfunction $\varphi_n$ associated with $\lambda_n$ is continuously differentiable and has exactly $n - 1$ zero points in $(0, 1)$. In particular, we can choose $\varphi_1(t) > 0$, $t \in (0, 1)$. (See Elbert [47], Došlý [37], Drábek, Krejčí & Takáč [41] and references given there.) However, the concrete values of $\lambda_n$ are not important in this example. Let us assume that
$$\lim_{s \to \pm\infty} \frac{g(s)}{|s|^{p-2}s} = \lambda \quad\text{where}\quad \lambda_n < \lambda < \lambda_{n+1} \quad\text{for an } n = 1, 2, \dots. \tag{5.2.48}$$
The problem (5.2.46) is then called a nonresonance problem (cf. Remark 7.5.5). Put $X := W_0^{1,p}(0, 1)$ with the norm
$$\|x\| = \left( \int_0^1 |\dot{x}(t)|^p\,dt \right)^{\frac{1}{p}}.$$
Let us define operators $J, G : X \to X^*$ and an element $f^* \in X^*$ by
$$\langle J(x), y \rangle = \int_0^1 |\dot{x}(t)|^{p-2}\dot{x}(t)\dot{y}(t)\,dt,\ ^{36} \qquad \langle G(x), y \rangle = \int_0^1 g(x(t))y(t)\,dt,$$
$$\langle f^*, y \rangle = \int_0^1 f(t)y(t)\,dt \quad\text{for any } x, y \in X.$$

³⁵ The right-hand side is defined by $\varphi(\dot{x})\dot{}$ where $\varphi(s) = |s|^{p-2}s$, $s \neq 0$, $\varphi(0) = 0$ for $p > 1$.
³⁶ By the Hölder inequality the integral exists and defines, for a fixed $x \in X$, a continuous linear form on $X$.
If we set $T = J - G$, then the operator equation
$$T(x) = f^* \tag{5.2.49}$$
is equivalent to the requirement that the integral identity
$$\int_0^1 |\dot{x}(t)|^{p-2}\dot{x}(t)\dot{y}(t)\,dt - \int_0^1 g(x(t))y(t)\,dt = \int_0^1 f(t)y(t)\,dt \tag{5.2.50}$$
holds for all $y \in X$. The function $x \in X$ satisfying (5.2.50) is then a weak solution of (5.2.46) (cf. Remark 5.3.10). It follows that the existence of a weak solution of (5.2.46) is equivalent to the existence of a solution of the operator equation (5.2.49). Our plan is to use the degree argument to prove the existence of a solution of (5.2.49). First we sketch the properties of the operators $J$ and $G$. The operator $J$ satisfies
$$\langle J(x), x \rangle = \|x\|^p. \tag{5.2.51}$$
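Moreover, $J$ is an odd mapping, $(p-1)$-homogeneous,³⁷ it is bounded, continuous (and so demicontinuous) and satisfies the $(S_+)$ condition. Indeed, let $x_n \rightharpoonup x_0$ in $X$ and
$$\limsup_{n \to \infty} \langle J(x_n), x_n - x_0 \rangle \le 0.$$
Then $\lim_{n \to \infty} \langle J(x_0), x_n - x_0 \rangle = 0$, and so
$$0 \ge \limsup_{n \to \infty} \langle J(x_n) - J(x_0), x_n - x_0 \rangle = \limsup_{n \to \infty} \int_0^1 \left[ |\dot{x}_n(t)|^{p-2}\dot{x}_n(t) - |\dot{x}_0(t)|^{p-2}\dot{x}_0(t) \right] (\dot{x}_n(t) - \dot{x}_0(t))\,dt$$
$$\ge \limsup_{n \to \infty} \left[ \int_0^1 |\dot{x}_n|^p\,dt - \left( \int_0^1 |\dot{x}_n|^p\,dt \right)^{\frac{1}{p'}} \left( \int_0^1 |\dot{x}_0|^p\,dt \right)^{\frac{1}{p}} - \left( \int_0^1 |\dot{x}_0|^p\,dt \right)^{\frac{1}{p'}} \left( \int_0^1 |\dot{x}_n|^p\,dt \right)^{\frac{1}{p}} + \int_0^1 |\dot{x}_0|^p\,dt \right]$$
$$= \limsup_{n \to \infty} \left[ \|x_n\|^{p-1} - \|x_0\|^{p-1} \right] \left[ \|x_n\| - \|x_0\| \right] \ge 0,$$
where the second inequality is the Hölder inequality (with $\frac{1}{p'} = \frac{p-1}{p}$) and the last inequality follows from the fact that $s \mapsto |s|^{p-1}$ is strictly increasing on $(0, \infty)$. Hence $\|x_n\| \to \|x_0\|$, and due to the uniform convexity of $X$ we have $x_n \to x_0$ in $X$.³⁸ The operator $J$ is also invertible and its inverse is continuous.³⁹ The operator $G$ is compact. This follows immediately from the compact embedding $X = W_0^{1,p}(0, 1) \subset\subset C[0, 1]$ and from the continuity of $g$ (the reader is invited to prove it in detail). Hence, due to Lemma 5.2.46 the operator $T$ satisfies the $(S_+)$ condition.

³⁷ I.e., $J(tx) = t^{p-1}J(x)$ for any $t > 0$, $x \in X$.
³⁸ See Proposition 2.1.22(iv).
³⁹ See Exercise 5.2.53.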
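The chain of inequalities above rests on the estimate $\langle J(x) - J(y), x - y \rangle \ge (\|x\|^{p-1} - \|y\|^{p-1})(\|x\| - \|y\|)$, which can be tested numerically. The following Python sketch is an illustration only: it replaces the integrals over $(0, 1)$ by midpoint sums, and the test functions $x(t) = \sin \pi t$, $y(t) = t(1 - t)$, the exponents and the helper names `phi` and `check` are assumptions chosen for the example.

```python
import math


def phi(s, p):
    """phi(s) = |s|^(p-2) s, with phi(0) = 0 (valid for p > 1)."""
    return math.copysign(abs(s) ** (p - 1), s)


def check(dx, dy, p, m=2000):
    """Return (lhs, rhs) of
        <J(x) - J(y), x - y>  >=  (|x|^(p-1) - |y|^(p-1)) (|x| - |y|),
    with integrals over (0, 1) replaced by midpoint sums.
    dx and dy are the derivatives of x and y."""
    h = 1.0 / m
    ts = [(i + 0.5) * h for i in range(m)]
    lhs = h * sum((phi(dx(t), p) - phi(dy(t), p)) * (dx(t) - dy(t)) for t in ts)
    nx = (h * sum(abs(dx(t)) ** p for t in ts)) ** (1.0 / p)
    ny = (h * sum(abs(dy(t)) ** p for t in ts)) ** (1.0 / p)
    rhs = (nx ** (p - 1) - ny ** (p - 1)) * (nx - ny)
    return lhs, rhs


# Hypothetical test functions in W_0^{1,p}(0, 1): x(t) = sin(pi t), y(t) = t (1 - t).
dx = lambda t: math.pi * math.cos(math.pi * t)
dy = lambda t: 1.0 - 2.0 * t

results = {p: check(dx, dy, p) for p in (1.5, 2.0, 4.0)}
for p, (lhs, rhs) in results.items():
    print(p, lhs, rhs, lhs >= rhs)
```

Because the Hölder inequality also holds for weighted finite sums, the discrete inequality is valid for any sample functions, so the check is not sensitive to the particular choice made here.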
Let us define an operator $S : X \to X^*$ by
$$\langle S(x), y \rangle = \int_0^1 |x(t)|^{p-2}x(t)y(t)\,dt \quad\text{for any } x, y \in X.$$
Then $S$ is $(p-1)$-homogeneous and compact (use $X = W_0^{1,p}(0, 1) \subset\subset L^p(0, 1)$). We define a homotopy
$$T_\tau(x) = J(x) - (1 - \tau)G(x) - \tau\lambda S(x) + (\tau - 1)f^* \quad\text{for } \tau \in [0, 1],\ x \in X,$$
and show that there exists $R > 0$ (large enough) such that this homotopy is admissible with respect to the ball $B(o; R) \subset X$. The usual way to prove it relies on an indirect argument. Assume by contradiction that for any $k \in \mathbb{N}$ there exist $\tau_k \in [0, 1]$ and $x_k \in X$, $\|x_k\| \ge k$, such that $T_{\tau_k}(x_k) = o$, i.e.,
$$J(x_k) - (1 - \tau_k)G(x_k) - \tau_k\lambda S(x_k) + (\tau_k - 1)f^* = o. \tag{5.2.52}$$
We divide (5.2.52) by $\|x_k\|^{p-1}$, denote $v_k := \frac{x_k}{\|x_k\|}$ and use that $J$ and $S$ are $(p-1)$-homogeneous to get
$$J(v_k) - (1 - \tau_k)\frac{G(x_k)}{\|x_k\|^{p-1}} - \tau_k\lambda S(v_k) + (\tau_k - 1)\frac{f^*}{\|x_k\|^{p-1}} = o. \tag{5.2.53}$$
Due to the reflexivity of $X$ and the compactness of the interval $[0, 1]$, we may assume that $v_k \rightharpoonup v$ in $X$ and $\tau_k \to \tau \in [0, 1]$. Using the compactness of the embedding $X \subset\subset L^p(0, 1)$, the facts that $G$ and $S$ are continuous as operators from $L^p(0, 1)$ into $L^{p'}(0, 1)$, and using the assumption (5.2.48) we obtain
$$(1 - \tau_k)\frac{G(x_k)}{\|x_k\|^{p-1}} \to (1 - \tau)\lambda S(v) \quad\text{in } X^*,$$
$$\tau_k\lambda S(v_k) \to \tau\lambda S(v) \quad\text{in } X^*,$$
$$(\tau_k - 1)\frac{f^*}{\|x_k\|^{p-1}} \to o \quad\text{in } X^*$$
as $k \to \infty$ (the reader is invited to justify all in detail!). Passing to the limit in (5.2.53) we thus get
$$J(v_k) \to (1 - \tau)\lambda S(v) + \tau\lambda S(v) \quad\text{in } X^* \text{ as } k \to \infty,$$
i.e.,
$$v_k \to J^{-1}(\lambda S(v)) \quad\text{in } X.$$
Since at the same time $v_k \rightharpoonup v$ in $X$, we have $v_k \to v$ in $X$ and
$$J(v) - \lambda S(v) = o \quad\text{in } X^*. \tag{5.2.54}$$
Since $\|v_k\| = 1$ for all $k = 1, 2, \dots$, we have $\|v\| = 1$, and so (5.2.54) contradicts the fact that $\lambda$ is not an eigenvalue of (5.2.47). This proves that the homotopy $T_\tau$ is admissible with respect to the ball $B(o; R)$ if $R$ is large. Applying Theorem 5.2.47(iii) we arrive at
$$\deg(J - G - f^*, B(o; R), o) = \deg(J - \lambda S, B(o; R), o),$$
but the value of the degree on the right-hand side is an odd number according to Theorem 5.2.47(ii). Hence $\deg(J - G - f^*, B(o; R), o) \neq 0$, and the existence of at least one solution $x \in X$ of (5.2.49) which satisfies $\|x\| < R$ follows from Theorem 5.2.47(i).

Remark 5.2.52. It is possible to solve the problem (5.2.46) by means of the Leray–Schauder degree theory as well. In that case instead of solving the operator equation $J(x) - G(x) = f^*$ one has to deal with
$$x = J^{-1}(f^* + G(x))$$
(cf. Exercise 5.2.54). Due to the properties of $J^{-1}$ (cf. Exercise 5.2.53) this approach is more or less equivalent to that presented in Example 5.2.51. However, in more complicated applications (equations of higher order, partial differential equations, etc., see, e.g., Appendix 7.5A) the use of the degree presented in Theorem 5.2.47 can be of essential advantage!

Exercise 5.2.53. Let $J$ be the operator from Example 5.2.51. Prove that there exists an inverse operator $J^{-1}$ which is bounded and continuous.

Hint. The strict monotonicity of $s \mapsto |s|^{p-2}s$ implies that
$$\langle J(u) - J(v), u - v \rangle > 0 \quad\text{for } u \neq v.$$
Hence $J$ is injective. Using the Hölder inequality prove that
$$\langle J(x) - J(y), x - y \rangle \ge (\|x\|^{p-1} - \|y\|^{p-1})(\|x\| - \|y\|) \tag{5.2.55}$$
(cf. the proof of the $(S_+)$ condition in Example 5.2.51). The boundedness of $J^{-1}$ then follows. To prove that $J^{-1}$ is continuous proceed via contradiction. Suppose it is not, i.e., there exists a sequence $\{f_n\}_{n=1}^\infty$, $f_n \to f$ in $X^*$ and
$$\|J^{-1}(f_n) - J^{-1}(f)\| \ge \delta \quad\text{for a } \delta > 0.$$
Let $x_n := J^{-1}(f_n)$, $x = J^{-1}(f)$. It follows that
$$\|f_n\|\|x_n\| \ge \langle f_n, x_n \rangle = \langle J(x_n), x_n \rangle = \|x_n\|^p, \quad\text{i.e.,}\quad \|x_n\|^{p-1} \le \|f_n\|.$$
We may then assume $x_n \rightharpoonup \tilde{x}$ in $X$ due to the reflexivity of $X$. Hence
$$\langle J(x_n) - J(\tilde{x}), x_n - \tilde{x} \rangle = \langle J(x_n) - J(x), x_n - \tilde{x} \rangle + \langle J(x) - J(\tilde{x}), x_n - \tilde{x} \rangle \to 0 \tag{5.2.56}$$
since $J(x_n) \to J(x)$ in $X^*$. It follows from (5.2.55) (with $x := x_n$, $y := \tilde{x}$) and (5.2.56) that $\|x_n\| \to \|\tilde{x}\|$. Hence $x_n \to \tilde{x}$ follows due to the fact that $X$ is a uniformly convex Banach space (see page 65 or Adams [2, Theorem 3.5]). Since $J$ is continuous and injective, $\tilde{x} = x$, a contradiction.
Exercise 5.2.54. Consider the boundary value problem (5.2.46) with $g$ satisfying (5.2.48). Prove the existence of at least one weak solution of (5.2.46) using the Leray–Schauder degree theory.

Hint. Prove that $J^{-1} \circ G$ is a compact operator from $X$ into itself and then use the homotopy invariance property of the Leray–Schauder degree to prove that $x = J^{-1}(f^* + G(x))$ has at least one solution in $X$. Compare your proof with the method presented in Example 5.2.51.

Exercise 5.2.55. Consider the problem
$$-(|\dot{x}(t)|^{p-2}\dot{x}(t))\dot{} = h(t, x(t), \dot{x}(t)), \quad t \in (0, 1), \qquad x(0) = x(1) = 0, \tag{5.2.57}$$
where $p > 1$. Formulate conditions on $h = h(t, x, s)$ which guarantee the existence of a weak solution of (5.2.57) (see Remark 5.3.10).
5.3 Theory of Monotone Operators

The motivation for the methods presented in this section can be described by the following simple example of a real function of one real variable $f : \mathbb{R} \to \mathbb{R}$. We would like to find conditions on $f$ which guarantee that for any $y \in \mathbb{R}$ the equation $f(x) = y$ has a (unique) solution $x$. One possible way to solve this first semester calculus problem is to consider $f$ which is continuous, (strictly) monotone and satisfies $\lim_{|x| \to \infty} |f(x)| = \infty$ (see Figure 5.3.1).

Figure 5.3.1. (Graph of $y = f(x)$.)

If $f$ is replaced by an operator $T : H \to H$ from a real Hilbert space $H$ (with a scalar product $(\cdot, \cdot)$ and the induced norm $\|\cdot\|$) into itself and the same question is posed, then similar conditions appear to be appropriate to prove that for any $h \in H$ the equation
$$T(u) = h$$
has a unique solution $u \in H$. It is clear how to reformulate the first condition on $f$ in the case of a general operator $T$. The third condition motivates the following definition.

Definition 5.3.1. Let $H$ be a real Hilbert space. An operator $T : H \to H$ satisfying
$$\lim_{\|u\| \to \infty} \|T(u)\| = \infty$$
is called weakly coercive.

In order to reformulate the second condition we should first note that a real function of one real variable is increasing (decreasing) if and only if
$$(f(x) - f(y))(x - y) \ge 0 \quad (\le 0) \quad\text{for any } x, y \in \mathbb{R}.$$

Definition 5.3.2. Let $H$ be a real Hilbert space. An operator $T : H \to H$ satisfying
$$(T(u) - T(v), u - v) \ge 0 \quad\text{for any } u, v \in H \tag{5.3.1}$$
is called a monotone operator. An operator $T$ is called strictly monotone if for $u \neq v$ the strict inequality holds in (5.3.1). An operator $T$ is called strongly monotone if there exists $c > 0$ such that
$$(T(u) - T(v), u - v) \ge c\|u - v\|^2 \quad\text{for any } u, v \in H.$$
Remark 5.3.3. It is clear that a strongly monotone operator is strictly monotone and, therefore, monotone. Also, every strongly monotone operator is weakly coercive. Indeed, $T$ being strongly monotone implies
$$(T(u) - T(o), u) \ge c\|u\|^2. \tag{5.3.2}$$
The Schwarz inequality (see Proposition 1.2.30(i)) yields
$$(T(u) - T(o), u) \le [\|T(u)\| + \|T(o)\|]\|u\|. \tag{5.3.3}$$
Putting (5.3.2) and (5.3.3) together we get
$$\|T(u)\| \ge c\|u\| - \|T(o)\|,$$
and the weak coercivity follows.

The following theorem is the basic assertion of this section.

Theorem 5.3.4. Let $H$ be a real Hilbert space and let $T : H \to H$ be continuous, monotone and weakly coercive. Then $T(H) = H$. If, moreover, $T$ is strictly monotone, then for any $h \in H$ the equation
$$T(u) = h \tag{5.3.4}$$
has a unique solution.
Proof. The uniqueness of the solution is a direct consequence of the strict monotonicity of $T$. The existence of a solution to (5.3.4) for any $h \in H$ is proved in two steps:

Step 1. Assume for a while that the assertion of the theorem holds if $T$ is continuous and strongly monotone. We prove this fact later, in Proposition 5.3.5. Since $T_n : H \to H$, $n \in \mathbb{N}$, defined by
$$T_n : u \mapsto \frac{1}{n}u + T(u)$$
is strongly monotone (prove it!) for any $n \in \mathbb{N}$, we claim that given $h \in H$ there exists $u_n \in H$ such that
$$T_n(u_n) = h. \tag{5.3.5}$$

Step 2. Let us prove that $\{u_n\}_{n=1}^\infty$ is a bounded sequence in $H$. Assume the contrary, i.e., there exists a subsequence, which will be denoted by $\{u_n\}_{n=1}^\infty$ again, such that
$$\lim_{n \to \infty} \|u_n\| = \infty.$$
It follows from the monotonicity of $T$ that
$$\|h\| \ge \left( h, \frac{u_n}{\|u_n\|} \right) = \frac{1}{n}\|u_n\| + \frac{1}{\|u_n\|}(T(u_n) - T(o), u_n) + \frac{1}{\|u_n\|}(T(o), u_n) \ge \frac{1}{n}\|u_n\| - \|T(o)\|,$$
i.e., $\left\{ \frac{1}{n}u_n \right\}_{n=1}^\infty$ is a bounded sequence (and therefore weakly sequentially compact; see Theorem 2.1.25 and note that $H$ is reflexive). Hence there exists a subsequence $\left\{ \frac{1}{n_k}u_{n_k} \right\}_{k=1}^\infty \subset \left\{ \frac{1}{n}u_n \right\}_{n=1}^\infty$ which is weakly convergent, i.e.,
$$\frac{1}{n_k}u_{n_k} \rightharpoonup w.$$
According to (5.3.5),
$$T(u_{n_k}) \rightharpoonup h - w.$$
This implies that $\{T(u_{n_k})\}_{k=1}^\infty$ is a bounded sequence (Proposition 2.1.22(iii)), which contradicts the weak coercivity of $T$. This proves the boundedness of $\{u_n\}_{n=1}^\infty$. In particular,
$$\frac{1}{n}u_n \to o \quad\text{and}\quad T(u_n) \to h.$$
By Theorem 2.1.25, there is a subsequence $\{u_{m_k}\}_{k=1}^\infty \subset \{u_n\}_{n=1}^\infty$ such that
$$u_{m_k} \rightharpoonup u_0.$$
We prove that $T(u_0) = h$. Indeed, for any $v \in H$ and $k \in \mathbb{N}$ we have
$$(T(u_{m_k}) - T(v), u_{m_k} - v) \ge 0.$$
Passing to the limit for $k \to \infty$ we obtain
$$(h - T(v), u_0 - v) \ge 0 \quad\text{for any } v \in H.^{40}$$
Set $v = u_0 + \lambda w$, $\lambda > 0$, $w \in H$. Then
$$(h - T(u_0 + \lambda w), w) \le 0 \quad\text{holds for any } \lambda > 0 \text{ and } w \in H. \tag{5.3.6}$$
Passing to the limit for $\lambda \to 0^+$ in (5.3.6) and using the continuity of $T$ and of the scalar product in $H$, we get
$$(h - T(u_0), w) \le 0 \quad\text{for any } w \in H. \tag{5.3.7}$$
Since (5.3.7) holds simultaneously for any $w$ and $-w$, we actually have
$$(h - T(u_0), w) = 0 \quad\text{for any } w \in H, \quad\text{i.e.,}\quad T(u_0) = h.$$
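The regularization used in Step 1 can be observed on the real line. The sketch below is only an illustration under stated assumptions: it takes $H = \mathbb{R}$ and the hypothetical operator $T(u) = u^3$, which is continuous, strictly monotone and weakly coercive but not strongly monotone, solves $T_n(u_n) = \frac{1}{n}u_n + T(u_n) = h$ by bisection (a numerical device chosen here, not part of the proof), and lets $n$ grow.

```python
def T(u):
    """Continuous, strictly monotone, weakly coercive, but not strongly monotone."""
    return u ** 3


def solve_regularized(h, n, tol=1e-13):
    """Solve u / n + T(u) = h on the real line by bisection."""
    # The regularized map is increasing and changes sign on this bracket.
    lo, hi = -(abs(h) + 1.0), abs(h) + 1.0
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if mid / n + T(mid) < h:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


h = 8.0                                    # hypothetical right-hand side
ns = (1, 10, 100, 10 ** 4, 10 ** 6)
approximations = [solve_regularized(h, n) for n in ns]
errors = [abs(u - 2.0) for u in approximations]
print(approximations)                      # values approach the solution u0 = 2 of T(u) = h
```

In this one-dimensional setting the sequence $u_n$ is bounded and converges to the solution of $T(u) = h$; in an infinite dimensional Hilbert space the proof only extracts a weakly convergent subsequence.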
Now, it remains to justify the assumption made in Step 1. For this purpose we prove the following assertion.

Proposition 5.3.5. Let $H$ be a real Hilbert space and $S : H \to H$ a continuous and strongly monotone operator. Then $S(H) = H$.

Proof. The idea of the proof is easy. Since $H$ is a connected metric space, it is enough to prove (see Lemmas 5.3.6 and 5.3.8) that $S(H)$ is both open and closed in $H$. Then $S(H) = H$ because the only nonempty subset of $H$ which is both open and closed is the entire space $H$. First we prove that $S(H)$ is closed.

Lemma 5.3.6. Let $D$ be a closed set in $H$, let $S : D \to H$ be a continuous and strongly monotone operator. Then $S(D)$ is a closed set in $H$.

Proof. Let $\{u_n\}_{n=1}^\infty \subset D$ be such that $S(u_n) \to h$. Since $S$ is strongly monotone, we have
$$(S(u_n) - S(u_m), u_n - u_m) \ge c\|u_n - u_m\|^2,$$
and using the Schwarz inequality we obtain
$$\frac{1}{c}\|S(u_n) - S(u_m)\| \ge \|u_n - u_m\|.$$
Hence $\{u_n\}_{n=1}^\infty$ is a Cauchy sequence, and there exists $u_0 \in D$ such that $u_n \to u_0$. The continuity of $S$ implies that
$$S(u_n) \to S(u_0), \quad\text{i.e.,}\quad S(u_0) = h.$$

⁴⁰ Here we use that $x_n \rightharpoonup x$ and $y_n \to y$ imply $(x_n, y_n) \to (x, y)$. See Exercise 2.1.36.
To prove that $S(H)$ is an open set is more tricky. For this purpose we need an auxiliary assertion about an extension of Lipschitz continuous operators.

Lemma 5.3.7. Let $D$ be a subset of a real Hilbert space $H$, let $V : D \to H$ be an operator satisfying
$$\|V(u) - V(v)\| \le \|u - v\| \quad\text{for any } u, v \in D.$$
Then there exists an operator $W : H \to H$ such that
$$\|W(u) - W(v)\| \le \|u - v\| \quad\text{for any } u, v \in H \tag{5.3.8}$$
and, moreover,
$$W(u) = V(u) \quad\text{for any } u \in D.$$

Proof. It follows from Zorn's Lemma (see Theorem 1.1.4) that there exists a maximal extension $W$ of the operator $V$, the domain of which satisfies
$$\operatorname{Dom} W \subset H, \qquad D \subset \operatorname{Dom} W,$$
and for any $u, v \in \operatorname{Dom} W$ the inequality (5.3.8) holds. Our aim is to prove $\operatorname{Dom} W = H$. Assume the contrary, i.e., there exists $u_0 \in H \setminus \operatorname{Dom} W$. In order to reach a contradiction it is enough to prove the existence of $v_0 \in H$ such that
$$\|v_0 - W(u)\| \le \|u_0 - u\| \quad\text{for any } u \in \operatorname{Dom} W. \tag{5.3.9}$$
Indeed, setting
$$\tilde{W} : u \mapsto \begin{cases} v_0, & u = u_0, \\ W(u), & u \in \operatorname{Dom} W, \end{cases}$$
we obtain an operator
$$\tilde{W} : \operatorname{Dom} W \cup \{u_0\} \to H$$
satisfying (5.3.8) for any $u, v \in \operatorname{Dom} W \cup \{u_0\}$. This will be a contradiction with the maximality of the extension $W$. So, in the rest of the proof we concentrate on the existence of $v_0$ satisfying (5.3.9).

Let $B$ be a finite subset of $\operatorname{Dom} W$. Denote by $A_B$ the set of all $v_0 \in H$ satisfying (5.3.9) for any $u \in B$. Let $A$ denote the set of all $v_0 \in H$ satisfying (5.3.9) for all $u \in \operatorname{Dom} W$. Let $\mathcal{B}_n$ be the system of all finite subsets $B$ of $\operatorname{Dom} W$ which belong to the closed ball $\{u \in H : \|u\| \le n\}$, $n \in \mathbb{N}$. Set
$$A_n = \bigcap_{B \in \mathcal{B}_n} A_B.$$
Clearly, we have
$$A = \bigcap_{n=1}^{\infty} A_n, \qquad A_{n+1} \subset A_n \subset A_1.$$
We wish to prove that $A \neq \emptyset$. Observe first that $A_B$ and $A_n$ are weakly compact sets (they are bounded and weakly closed⁴¹). If $A_B \neq \emptyset$ for any finite subset $B \subset \operatorname{Dom} W$, then $A_n \neq \emptyset$ for any $n \in \mathbb{N}$ by Exercise 1.2.42. Applying this procedure again we finally obtain $A \neq \emptyset$ and the proof will be complete.

Assume now that there exists $B = \{u_1, \dots, u_m\} \subset \operatorname{Dom} W$ such that $A_B = \emptyset$. We want to reach a contradiction which will complete the proof. Denote
$$H_f = \operatorname{Lin}\{u_1 - u_0, \dots, u_m - u_0, W(u_1), \dots, W(u_m)\}.$$
Then $H_f$ is a subspace of $H$ and $\dim H_f \le 2m$. For any $w \in H_f$ set
$$h(w) = \max_{1 \le j \le m} \frac{\|w - W(u_j)\|}{\|u_0 - u_j\|}.$$
If there exists $v_0 \in H_f$ such that $h(v_0) \le 1$, then $v_0 \in A_B$, a contradiction. So, assume that $h(w) > 1$ for any $w \in H_f$. Note that the real function $h$ is continuous on $H_f$ and
$$\lim_{\substack{\|w\| \to \infty \\ w \in H_f}} h(w) = \infty.$$
Hence, there exists $w_0 \in H_f$⁴² such that
$$h(w_0) = \min_{w \in H_f} h(w) = \lambda > 1.$$
Let us re-enumerate $u_1, \dots, u_m$ in such a way that
$$\frac{\|w_0 - W(u_j)\|}{\|u_0 - u_j\|} = \lambda > 1, \quad 1 \le j \le k, \qquad \frac{\|w_0 - W(u_j)\|}{\|u_0 - u_j\|} < \lambda, \quad k + 1 \le j \le m. \tag{5.3.10}$$
We prove that $w_0$ belongs to the convex hull $M = \operatorname{Co}\{W(u_1), \dots, W(u_k)\}$ of $\{W(u_1), \dots, W(u_k)\}$.⁴³ Let us assume the contrary. Then we can find $w_1 \in H_f$ such that
$$\frac{\|w_1 - W(u_j)\|}{\|u_0 - u_j\|} < \frac{\|w_0 - W(u_j)\|}{\|u_0 - u_j\|} = \lambda, \quad 1 \le j \le k, \qquad \frac{\|w_1 - W(u_j)\|}{\|u_0 - u_j\|} < \lambda, \quad k + 1 \le j \le m,$$
(see Figure 5.3.2 for $m = 5$ and $k = 2$).⁴⁴ Hence $h(w_1) < h(w_0)$, a contradiction.

⁴¹ Since the weak topology is not metrizable the fact that $A_B$ is weakly closed has to be shown with the help of weak neighborhoods (Remark 2.1.23). But this is simple due to the Dual Characterization of the Norm (Corollary 2.1.16).
⁴² Recall that bounded sets in a finite dimensional space $H_f$ are relatively compact.
⁴³ The convex hull of the set $A$ is the least convex set containing $A$.
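Figure 5.3.2. $U_j = \left\{ w \in H_f : \frac{\|w - W(u_j)\|}{\|u_0 - u_j\|} < \lambda \right\} = B(W(u_j); \lambda\|u_0 - u_j\|) \cap H_f$. (The figure shows, in $H_f$, the points $W(u_1), \dots, W(u_5)$, the boundaries $\partial U_1, \dots, \partial U_5$, the set $M = \operatorname{Co}\{W(u_1), W(u_2)\}$ and the points $w_0$, $w_1$, $w_M$.)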
Consequently, there are $c_1, \dots, c_k$ such that
$$w_0 = \sum_{j=1}^{k} c_j W(u_j), \qquad c_j \ge 0, \qquad \sum_{j=1}^{k} c_j = 1.$$
Set $z_j = w_0 - W(u_j)$, $\hat{z}_j = u_0 - u_j$, $1 \le j \le k$. Then
$$\sum_{j=1}^{k} c_j z_j = o, \qquad \|\hat{z}_j\|^2 < \|z_j\|^2, \quad 1 \le j \le k. \tag{5.3.11}$$

⁴⁴ In other words, $\bigcap_{1 \le j \le m} U_j \neq \emptyset$ where $U_j = \left\{ w \in H_f : \frac{\|w - W(u_j)\|}{\|u_0 - u_j\|} < \lambda \right\}$ (see Figure 5.3.2). Indeed, $\bigcap_{k+1 \le j \le m} U_j$ is a nonempty open subset of $H_f$ and the latter $(m - k)$ inequalities in (5.3.10) imply $w_0 \in \bigcap_{k+1 \le j \le m} U_j$. Using the convexity and compactness of $M$, the reader is invited to show that $\bigcap_{1 \le j \le k} U_j$ contains the segment $\{tw_0 + (1 - t)w_M : 0 < t < 1\}$ where $\|w_0 - w_M\| = \operatorname{dist}(w_0, M)$. Consequently, $\bigcap_{1 \le j \le k} U_j$ is a nonempty set, too, and $w_0$ belongs to its boundary.
For $1 \le j, n \le k$ we also have
$$\|z_j - z_n\|^2 = \|W(u_n) - W(u_j)\|^2 \le \|u_n - u_j\|^2 = \|\hat{z}_j - \hat{z}_n\|^2,$$
i.e.,
$$\|z_j\|^2 + \|z_n\|^2 - 2(z_j, z_n) \le \|\hat{z}_j\|^2 + \|\hat{z}_n\|^2 - 2(\hat{z}_j, \hat{z}_n). \tag{5.3.12}$$
We conclude from (5.3.11) and (5.3.12) that
$$(\hat{z}_j, \hat{z}_n) < (z_j, z_n), \quad 1 \le j, n \le k,$$
and thus
$$\sum_{j,n=1}^{k} c_j c_n (\hat{z}_j, \hat{z}_n) < \sum_{j,n=1}^{k} c_j c_n (z_j, z_n).$$
However,
$$\sum_{j,n=1}^{k} c_j c_n (z_j, z_n) = \Big\| \sum_{j=1}^{k} c_j z_j \Big\|^2 = 0, \qquad \sum_{j,n=1}^{k} c_j c_n (\hat{z}_j, \hat{z}_n) = \Big\| \sum_{j=1}^{k} c_j \hat{z}_j \Big\|^2,$$
i.e.,
$$\Big\| \sum_{j=1}^{k} c_j \hat{z}_j \Big\|^2 < 0,$$
a contradiction. This proves that there exists $v_0 \in H_f$ such that $h(v_0) \le 1$, i.e., $A_B \neq \emptyset$ for any finite set $B \subset \operatorname{Dom} W$, and the proof is complete.

Now, we are ready to prove that $S(H)$ is also an open set in $H$.

Lemma 5.3.8. Let $D \subset H$ be an open set, let $S : D \to H$ be continuous and strongly monotone. Then $S(D)$ is an open subset of $H$.

Proof. It is enough to prove this lemma for $S$ satisfying the strong monotonicity assumption with $c = 1$ (explain why!). Let us denote $R := S(D)$. We are going to construct a continuous mapping $Z : H \to H$, $\operatorname{Dom} Z = H$, such that $Z^{-1}(D) = R$, and this will imply that $R$ is open. The operator $S$ is injective in $D$ and $S^{-1}$ is continuous on $R$ (see Exercise 5.3.12). So we intend to construct $Z$ as an extension of $S^{-1}$. For this purpose set
$$F(u) = S(u) - u.$$
Then for $u, u_1 \in D$ we have
$$(F(u) - F(u_1), u - u_1) \ge 0,$$
i.e., $F$ is monotone. For $v \in R$ set
$$K(v) = S^{-1}(v) - F(S^{-1}(v)).$$
Let $v, v_1 \in R$ be such that $v = S(u)$, $v_1 = S(u_1)$. Then
$$\|K(v) - K(v_1)\|^2 = \|u - u_1\|^2 + \|F(u) - F(u_1)\|^2 - 2(F(u) - F(u_1), u - u_1),$$
$$\|v - v_1\|^2 = \|F(u) - F(u_1)\|^2 + \|u - u_1\|^2 + 2(F(u) - F(u_1), u - u_1).$$
The monotonicity of $F$ implies that for any $v, v_1 \in R$,
$$\|K(v) - K(v_1)\| \le \|v - v_1\|.$$
It follows from Lemma 5.3.7 that there exists a continuous extension $K_1$ of $K$ which is defined on the whole $H$ and for any $v, v_1 \in H$ we have
$$\|K_1(v) - K_1(v_1)\| \le \|v - v_1\|.$$
For $v \in H$ set
$$Z(v) = \frac{1}{2}(v + K_1(v)).$$
If $v \in R$ and $v = S(u)$, then
$$v + K_1(v) = v + K(v) = 2u, \quad\text{i.e.,}\quad Z|_R = S^{-1} \quad\text{and}\quad R \subset Z^{-1}(D).$$
The inclusion
$$Z^{-1}(D) \subset R \tag{5.3.13}$$
will imply that $Z^{-1}(D) = R$ and, by the continuity of $Z$, the set $R = S(D)$ is open. To prove (5.3.13) it is enough to show that for any $v \in Z^{-1}(D)$ we have $v = S(Z(v))$. Assume by contradiction that there is $v \in Z^{-1}(D)$ such that for $u = Z(v)$ we have
$$\|v - S(u)\| > 0. \tag{5.3.14}$$
The continuity of $S$ implies the existence of $d > 0$ such that $B(u; d) \subset D$ and for $u_1 \in B(u; d)$ we have
$$\|S(u) - u - S(u_1) + u_1\| \le \frac{1}{2}\|v - S(u)\|.$$
Let us choose $t > 0$ so small that $\|t(v - S(u))\| < d$. Set
$$u_1 = u + t(v - S(u)), \qquad v_1 = S(u_1).$$
Then $\|u - u_1\| < d$, and so
$$(S(u_1) - u_1 - v + u, t(v - S(u))) = (v_1 - Z(v_1) - v + Z(v), Z(v_1) - Z(v))$$
$$= \frac{1}{4}(v_1 - K_1(v_1) - v + K_1(v), v_1 + K_1(v_1) - v - K_1(v)) = \frac{1}{4}\left( \|v - v_1\|^2 - \|K_1(v) - K_1(v_1)\|^2 \right) \ge 0.$$
Furthermore,
\[
\begin{aligned}
(S(u_1) - u_1 - S(u) + u, u - u_1) &= (S(u_1) - u_1 - v + u, -t(v - S(u))) + (v - S(u), -t(v - S(u))) \\
&\le (v - S(u), -t(v - S(u))) = -t\|v - S(u)\|^2,
\end{aligned}
\]
and so
\[ t\|v - S(u)\|^2 \le (S(u_1) - u_1 - S(u) + u, t(v - S(u))) \le t\|S(u_1) - u_1 - S(u) + u\|\,\|v - S(u)\| \le \tfrac{1}{2} t\|v - S(u)\|^2. \]
Since $t > 0$ this contradicts (5.3.14). This proves (5.3.13) and the proof is complete.

Let us point out that for operator equations with strongly monotone operators we obtain the continuous dependence of the solution on the right-hand side.

Corollary 5.3.9. Let $H$ be a real Hilbert space and $T : H \to H$ a continuous and strongly monotone operator. Then for any $h \in H$ the equation
\[ T(u) = h \]
has a unique solution. Let $T(u_1) = h_1$ and $T(u_2) = h_2$. Then
\[ \|u_1 - u_2\| \le \frac{1}{c}\|h_1 - h_2\| \]
with $c > 0$ from Definition 5.3.2, i.e., $T^{-1}$ is Lipschitz continuous.

Proof. The existence part follows from Proposition 5.3.5. Uniqueness is obvious. For $T(u_1) = h_1$ and $T(u_2) = h_2$ we have (using the Schwarz inequality)
\[ c\|u_1 - u_2\|^2 \le (T(u_1) - T(u_2), u_1 - u_2) \le \|u_1 - u_2\|\,\|h_1 - h_2\|, \]
which completes the proof.
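As a quick illustration of the estimate in Corollary 5.3.9, here is a one-dimensional worked example. The operator $T(u) = 2u + \sin u$ on $H = \mathbb{R}$ is not taken from the text; it is chosen here only because its monotonicity constant can be read off directly from $|\sin u - \sin v| \le |u - v|$.

```latex
\[
\begin{aligned}
(T(u) - T(v))(u - v) &= 2(u - v)^2 + (\sin u - \sin v)(u - v) \\
                     &\ge 2(u - v)^2 - |u - v|^2 = (u - v)^2 ,
\end{aligned}
\]
so $T$ is strongly monotone with $c = 1$, and for $T(u_1) = h_1$, $T(u_2) = h_2$
\[
|u_1 - u_2| \le |h_1 - h_2| .
\]
```

In this example the inverse $T^{-1}$ is therefore Lipschitz continuous with constant 1, although no explicit formula for $T^{-1}$ is available.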
Remark 5.3.10. Let $h : [0,1] \times \mathbb{R} \times \mathbb{R} \to \mathbb{R}$ be a real function. Consider the boundary value problem
\[
\begin{cases}
-\ddot x(t) = h(t, x(t), \dot x(t)), & t \in (0,1), \\
x(0) = x(1) = 0.
\end{cases}
\tag{5.3.15}
\]
Assume that $h$ is continuous and $x \in C^2[0,1]$ is a solution of (5.3.15). Let us multiply the equation in (5.3.15) by a function $y \in W_0^{1,2}(0,1)$ and then integrate the equation from 0 to 1. Applying the Integration by Parts Formula on the left-hand side, we obtain
\[ \int_0^1 \dot x(t)\dot y(t)\,dt = \int_0^1 h(t, x(t), \dot x(t))\,y(t)\,dt. \tag{5.3.16} \]
This identity makes sense for a more general $x$ than that from $C^2[0,1]$ and also for a more general function $h$. We discuss this issue in Section 7.3 in detail. If we assume that $h$ is such that the integral on the right-hand side of (5.3.16) exists for any $x, y \in W_0^{1,2}(0,1)$ (see the following Example 5.3.11), then the function $x \in W_0^{1,2}(0,1)$ is called the weak solution of (5.3.15) if the integral identity (5.3.16) holds for any $y \in W_0^{1,2}(0,1)$.

Once we succeed in finding a weak solution of (5.3.15), a natural question arises whether it belongs to a "better" space than $W_0^{1,2}(0,1)$, e.g., the continuity of the first and second derivatives of $x$ can be of interest. This is the so-called regularity problem. It is a very delicate issue in the theory of partial differential equations. On the other hand, for an ordinary differential equation, it is not. For instance, if $h$ is a continuous function, independent of $\dot x$, and $x \in W_0^{1,2}(0,1)$ is a weak solution of (5.3.15), then $x \in C^2[0,1]$ is a classical solution of (5.3.15), i.e., the equation in (5.3.15) holds pointwise in $(0,1)$.

Example 5.3.11. Let us consider the boundary value problem
\[
\begin{cases}
-\ddot x(t) + g(x(t)) = f(t), & t \in (0,1), \\
x(0) = x(1) = 0
\end{cases}
\tag{5.3.17}
\]
where $g : \mathbb{R} \to \mathbb{R}$ is a continuous function and $f \in L^2(0,1)$ is a given function. Reformulate (5.3.17) as an operator equation. Put $H := W_0^{1,2}(0,1)$ and define operators $J, G : H \to H$ and an element $f^* \in H$ by
\[
(J(x), y) = \int_0^1 \dot x(t)\dot y(t)\,dt, \qquad
(G(x), y) = \int_0^1 g(x(t))\,y(t)\,dt, \qquad
(f^*, y) = \int_0^1 f(t)\,y(t)\,dt
\]
for any $x, y \in H$. We will work with the scalar product
\[ (x, y) = \int_0^1 \dot x(t)\dot y(t)\,dt, \qquad x, y \in H, \]
and with the norm
\[ \|x\| = \Big( \int_0^1 |\dot x(t)|^2\,dt \Big)^{1/2}, \]
cf. Exercise 1.2.46. The reader is invited to prove that the operators $J$ and $G$ as well as the element $f^*$ are well defined and that $J$ is linear. Set $S = J + G$. Then the operator equation
\[ S(x) = f^* \]
is equivalent to the requirement that the integral identity
\[ \int_0^1 \dot x(t)\dot y(t)\,dt + \int_0^1 g(x(t))\,y(t)\,dt = \int_0^1 f(t)\,y(t)\,dt \tag{5.3.18} \]
holds for all $y \in H$. This is the weak formulation of (5.3.17), and $x \in H$ satisfying (5.3.18) for any $y \in H$ is a weak solution of (5.3.17).

Let us prove that $S$ is a continuous operator. This fact follows from the continuity of $J$ and $G$. By the definition of $J$ and of the scalar product in $H$, $J$ is the identity on $H$, and so it is a continuous operator. Assume now that $x_n \to x$. The embedding of $H = W_0^{1,2}(0,1)$ into $C[0,1]$ (see Theorem 1.2.26) implies that $x_n \rightrightarrows x$ uniformly in $[0,1]$. It follows then from the continuity of $g$ on $\mathbb{R}$ that $g \circ x_n \rightrightarrows g \circ x$ uniformly in $[0,1]$ (justify this statement carefully!). Then using the Dual Characterization of the Norm and the Hölder inequality, we conclude that
\[
\begin{aligned}
\|G(x_n) - G(x)\| &= \sup_{\|w\| \le 1} |(G(x_n) - G(x), w)|
 = \sup_{\|w\| \le 1} \Big| \int_0^1 [g(x_n(t)) - g(x(t))]\,w(t)\,dt \Big| \\
&\le \Big( \int_0^1 |g(x_n(t)) - g(x(t))|^2\,dt \Big)^{1/2} \sup_{\|w\| \le 1} \|w\|_{L^2(0,1)} \\
&\le c \Big( \int_0^1 |g(x_n(t)) - g(x(t))|^2\,dt \Big)^{1/2} \to 0 \quad \text{as } n \to \infty.
\end{aligned}
\]
Hence $G$ is also a continuous operator. Next we prove that $S$ is a strongly monotone operator provided $g$ is an increasing function. Indeed, for any $x, y \in H$ we have
\[
(S(x) - S(y), x - y) = \int_0^1 |\dot x(t) - \dot y(t)|^2\,dt + \int_0^1 [g(x(t)) - g(y(t))][x(t) - y(t)]\,dt \ge \|x - y\|^2.\,{}^{45}
\]
It follows from Corollary 5.3.9 that the problem (5.3.17) has a unique weak solution for any $f \in L^2(0,1)$. If $f_n \to f$ in $L^2(0,1)$, then $f_n^* \to f^*$ strongly in $H$ (prove it in detail!), and according to Corollary 5.3.9 the corresponding weak solutions $x_n \in H$ satisfy $\|x_n - x\| \to 0$. In particular, this means that a weak solution of (5.3.17) depends continuously on the right-hand side $f \in L^2(0,1)$.

${}^{45}$ Here we use that $(g(r) - g(s))(r - s) \ge 0$.
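The continuous dependence in Example 5.3.11 can be observed numerically. The following is only a sketch under stated assumptions, none of which come from the text: the problem is replaced by a standard finite-difference discretization, the nonlinearity is $g(s) = s^3$, the two right-hand sides are arbitrary, and Newton's method with a tridiagonal (Thomas) solver is used to compute the discrete solutions. The helper names (`solve_tridiagonal`, `weak_solution`, `h_norm`, `dual_norm`) are hypothetical. For the discrete problem the same argument as in Corollary 5.3.9 (with $c = 1$) gives `h_norm(x1 - x2) <= dual_norm(f1 - f2)`.

```python
import math


def solve_tridiagonal(lower, diag, upper, rhs):
    """Thomas algorithm; lower[i], upper[i] multiply x[i-1], x[i+1] in row i."""
    n = len(diag)
    cp = [0.0] * n
    dp = [0.0] * n
    cp[0] = upper[0] / diag[0]
    dp[0] = rhs[0] / diag[0]
    for i in range(1, n):
        denom = diag[i] - lower[i] * cp[i - 1]
        cp[i] = upper[i] / denom
        dp[i] = (rhs[i] - lower[i] * dp[i - 1]) / denom
    x = [0.0] * n
    x[n - 1] = dp[n - 1]
    for i in range(n - 2, -1, -1):
        x[i] = dp[i] - cp[i] * x[i + 1]
    return x


def apply_laplacian(x, h):
    """Discrete version of -d^2/dt^2 with zero boundary values."""
    n = len(x)
    out = []
    for i in range(n):
        left = x[i - 1] if i > 0 else 0.0
        right = x[i + 1] if i < n - 1 else 0.0
        out.append((2.0 * x[i] - left - right) / h**2)
    return out


def weak_solution(f_vals, g, dg, newton_steps=50):
    """Newton iteration for the discrete system A x + g(x) = f."""
    n = len(f_vals)
    h = 1.0 / (n + 1)
    off = [-1.0 / h**2] * n
    x = [0.0] * n
    for _ in range(newton_steps):
        ax = apply_laplacian(x, h)
        residual = [ax[i] + g(x[i]) - f_vals[i] for i in range(n)]
        diag = [2.0 / h**2 + dg(xi) for xi in x]
        step = solve_tridiagonal(off, diag, off, residual)
        x = [xi - si for xi, si in zip(x, step)]
    return x


def h_norm(x):
    """Discrete analogue of the norm (int |x'|^2 dt)^(1/2) on H."""
    h = 1.0 / (len(x) + 1)
    ax = apply_laplacian(x, h)
    return math.sqrt(h * sum(a * b for a, b in zip(ax, x)))


def dual_norm(d):
    """Discrete norm of the functional y -> int d(t) y(t) dt on H."""
    n = len(d)
    h = 1.0 / (n + 1)
    off = [-1.0 / h**2] * n
    z = solve_tridiagonal(off, [2.0 / h**2] * n, off, d)
    return math.sqrt(h * sum(a * b for a, b in zip(d, z)))


n = 49
grid = [(i + 1) / (n + 1) for i in range(n)]
g = lambda s: s**3            # assumed nonlinearity: continuous and increasing
dg = lambda s: 3.0 * s**2
f1 = [50.0 * math.sin(math.pi * t) for t in grid]
f2 = [a + math.cos(3.0 * math.pi * t) for a, t in zip(f1, grid)]
x1 = weak_solution(f1, g, dg)
x2 = weak_solution(f2, g, dg)
diff_x = h_norm([a - b for a, b in zip(x1, x2)])
diff_f = dual_norm([a - b for a, b in zip(f1, f2)])
print(diff_x, diff_f)
```

The printed pair compares the distance of the two discrete solutions with the distance of the two right-hand sides, measured in the discrete counterparts of the norms used in the example.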
Exercise 5.3.12. Let $H$ be a real Hilbert space and let $S : H \to H$ be a strongly monotone operator. Prove that $S$ is injective and $S^{-1}$ is Lipschitz continuous.

Exercise 5.3.13. Let $\varepsilon > 0$ and $T : B(o; R + \varepsilon) \subset \mathbb{R}^N \to \mathbb{R}^N$ be a monotone operator. Prove that $T(B(o; R))$ is a bounded set.

Exercise 5.3.14. Let $B(o; 1) \subset \mathbb{R}^N$, $N \ge 2$. Prove that there exists a strongly monotone operator $T : B(o; 1) \to \mathbb{R}^N$ for which $T(B(o; 1))$ is an unbounded set.
Hint. Let $\{x_n\}_{n=1}^{\infty} \subset \mathbb{R}^N$, $\|x_n\| = 1$, $x_n \ne x_m$ for $n \ne m$, $x_n \to x_0$. Set
\[
T : x \mapsto
\begin{cases}
x & \text{for } x \in B(o; 1),\ x \ne x_m,\ m = 1, 2, \dots, \\
x_n + n x_n & \text{for } x = x_n,\ n = 1, 2, \dots.
\end{cases}
\]
Prove that $T(B(o; 1))$ is unbounded and $T$ is strongly monotone.

Exercise 5.3.15. Define
\[
f_n : t \mapsto
\begin{cases}
0 & \text{for } t \le \tfrac{1}{2}, \\
nt - \tfrac{n}{2} & \text{for } t > \tfrac{1}{2}.
\end{cases}
\]
For $x = (x_1, x_2, \dots) \in l^2$ set $Tx = (f_1(x_1), f_2(x_2), \dots) + (x_1, x_2, \dots)$. Prove that
\[ (T(x) - T(y), x - y)_{l^2} \ge \|x - y\|_{l^2}^2 \quad \text{for any } x, y \in l^2 \]
and $T(B(o; 1))$ is an unbounded set.

Exercise 5.3.16. Let $T : \mathbb{R}^N \to \mathbb{R}^N$ be a monotone operator and $T(\mathbb{R}^N) = \mathbb{R}^N$. Prove that $T$ is weakly coercive.
Hint. Assume the contrary, i.e., there exist $M > 0$ and a sequence $\{u_n\}_{n=1}^{\infty}$ such that
\[ \|u_n\| \to \infty \text{ as } n \to \infty \quad \text{and} \quad \|T(u_n)\| \le M. \]
Choose a subsequence of $\big\{ \frac{u_n}{\|u_n\|} \big\}_{n=1}^{\infty}$ which is convergent to $w$. Since $T(\mathbb{R}^N) = \mathbb{R}^N$ there is $u \in \mathbb{R}^N$ such that $T(u) = (M + 1)w$. By the monotonicity of $T$ we have
\[ \Big( T(u_n), \frac{u_n - u}{\|u_n\|} \Big) \ge \Big( T(u), \frac{u_n - u}{\|u_n\|} \Big). \]
Taking lim sup on both sides we obtain a contradiction.

Exercise 5.3.17. Let $f : \mathbb{R} \to \mathbb{R}$ be defined as follows:
\[
f : x \mapsto
\begin{cases}
x & \text{for } x < 0, \\
x + 1 & \text{for } x \ge 0.
\end{cases}
\]
For any $(x, y) \in \mathbb{R}^2$ set
\[ T(x, y) = (y + f(x), -x). \]
Prove that $T$ is an injective monotone operator, $T(\mathbb{R}^2) = \mathbb{R}^2$ and $T$ is not continuous. Can there exist an injective, monotone function $T : \mathbb{R} \to \mathbb{R}$ which is not continuous and $T(\mathbb{R}) = \mathbb{R}$?

Exercise 5.3.18. Let $H$ be a real Hilbert space and $T : H \to H$ a strongly monotone and Lipschitz continuous operator, i.e., there exist numbers $m > 0$, $M > 0$, $M > m$ such that
\[ (T(u) - T(v), u - v) \ge m\|u - v\|^2, \qquad \|T(u) - T(v)\| \le M\|u - v\| \]
hold for all $u, v \in H$. Prove that the equation $T(u) = h$ has precisely one solution for every $h \in H$, and it is possible to construct this solution by iterations.
Hint. Let $h \in H$, $\varepsilon > 0$. Define an operator $A_\varepsilon(u) = u - \varepsilon(T(u) - h)$. Prove that for $u, v \in H$,
\[ \|A_\varepsilon(u) - A_\varepsilon(v)\|^2 \le (1 - 2\varepsilon m + \varepsilon^2 M^2)\|u - v\|^2, \]
and show that for $\varepsilon < \frac{2m}{M^2}$ the operator $A_\varepsilon$ is contractive. Apply the Contraction Principle (Theorem 2.3.1).
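The iteration from the hint of Exercise 5.3.18 can be tried out in $H = \mathbb{R}^2$. The sketch below is an illustration only: the operator $T(u) = Au + (\arctan u_1, \arctan u_2)$ with $A = \begin{pmatrix} 2 & 1 \\ 1 & 2 \end{pmatrix}$, the constants $m = 1$, $M = 4$, the right-hand sides and the function names are all assumptions made here, not data from the text. Since the eigenvalues of $A$ are 1 and 3 and $\arctan$ is increasing and 1-Lipschitz, these constants satisfy both inequalities of the exercise.

```python
import math


def T(u):
    """Assumed example operator: T(u) = A u + arctan(u), A = [[2, 1], [1, 2]]."""
    return [2.0 * u[0] + u[1] + math.atan(u[0]),
            u[0] + 2.0 * u[1] + math.atan(u[1])]


def norm(u):
    """Euclidean norm in R^2."""
    return math.hypot(u[0], u[1])


def solve_by_iteration(h, m, M, tol=1e-12, max_iter=10000):
    """Iterate u <- A_eps(u) = u - eps * (T(u) - h) until the residual is small."""
    eps = m / M**2            # any 0 < eps < 2 m / M^2 makes A_eps contractive
    u = [0.0, 0.0]
    for _ in range(max_iter):
        r = [a - b for a, b in zip(T(u), h)]
        if norm(r) < tol:
            break
        u = [a - eps * b for a, b in zip(u, r)]
    return u


m, M = 1.0, 4.0               # strong monotonicity and Lipschitz constants of T
h1, h2 = [1.0, -2.0], [1.5, -1.0]
u1 = solve_by_iteration(h1, m, M)
u2 = solve_by_iteration(h2, m, M)
print(u1, u2)
```

With $\varepsilon = m/M^2$ the bound from the hint gives the contraction factor $\sqrt{1 - m^2/M^2}$; the two computed solutions can also be compared with the estimate $\|u_1 - u_2\| \le \frac{1}{m}\|h_1 - h_2\|$ of Corollary 5.3.9.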
Exercise 5.3.19. Let $H$ be a Hilbert space and $T : H \to H$ a contraction. Prove that $I - T$ is a monotone operator.

Exercise 5.3.20. A function $x \in W^{1,2}(0,1)$ satisfying (5.3.16) for any $y \in W^{1,2}(0,1)$ is called a weak solution of the Neumann problem
\[
\begin{cases}
-\ddot x(t) = h(t, x(t), \dot x(t)), & t \in (0,1), \\
\dot x(0) = \dot x(1) = 0.
\end{cases}
\tag{5.3.19}
\]
Prove that any weak solution $x$ of (5.3.19) such that $x \in C^2[0,1]$ satisfies the equation in (5.3.19) and $\dot x(0) = \dot x(1) = 0$, i.e., it is a classical solution of (5.3.19).
Hint. Taking $y \in D(0,1)$ in (5.3.16) show that the equation in (5.3.19) holds pointwise in $(0,1)$. Then take arbitrary $y \in C^2[0,1]$ in (5.3.16) and integrate by parts.

Exercise 5.3.21. Find conditions on $\lambda$, $g = g(t, x)$ and $f$ for the Neumann problem
\[
\begin{cases}
-\ddot x(t) + \lambda x(t) + g(t, x(t)) = f(t), & t \in (0,1), \\
\dot x(0) = \dot x(1) = 0,
\end{cases}
\]
to have a unique weak solution.
Hint. Use Corollary 5.3.9.
5.3A Browder and Leray–Lions Theorem

In this appendix we will discuss generalizations of the previous assertions from the basic text. We will present two assertions: one attributed to F.E. Browder, the other named after J. Leray and J.-L. Lions.

Theorem 5.3.22 (Browder). Let $X$ be a reflexive real Banach space. Moreover, let $T : X \to X^*$ be an operator satisfying the conditions
(i) $T$ is bounded;
(ii) $T$ is demicontinuous;
(iii) $T$ is coercive, i.e.,
\[ \lim_{\|u\| \to \infty} \frac{\langle T(u), u \rangle}{\|u\|} = +\infty, \]
cf. Definition 6.2.17 in the Hilbert space setting;
(iv) $T$ is monotone on the space $X$, i.e., for all $u, v \in X$ we have
\[ \langle T(u) - T(v), u - v \rangle \ge 0, \tag{5.3.20} \]
cf. Definition 5.3.2 in the Hilbert space setting.
Then the equation
\[ T(u) = f^* \tag{5.3.21} \]
has at least one solution $u \in X$ for every $f^* \in X^*$. If, moreover, the inequality (5.3.20) is strict for all $u, v \in X$, $u \ne v$, then the equation (5.3.21) has precisely one solution $u \in X$ for every $f^* \in X^*$.

The second assertion is more general since the monotonicity condition (iv) is replaced by a set of weaker conditions.

Theorem 5.3.23 (Leray–Lions). Let $X$ be a reflexive real Banach space. Let $T : X \to X^*$ be an operator satisfying the conditions
(i) $T$ is bounded;
(ii) $T$ is demicontinuous;
(iii) $T$ is coercive.
Moreover, let there exist a bounded mapping $\Phi : X \times X \to X^*$ such that
(iv) $\Phi(u, u) = T(u)$ for every $u \in X$;
(v) for all $u, w, h \in X$ and any sequence $\{t_n\}_{n=1}^{\infty}$ of real numbers such that $t_n \to 0$, we have $\Phi(u + t_n h, w) \rightharpoonup \Phi(u, w)$;
(vi) for all $u, w \in X$ we have $\langle \Phi(u, u) - \Phi(w, u), u - w \rangle \ge 0$ (the so-called condition of monotonicity in the principal part);
(vii) if $u_n \rightharpoonup u$ and
\[ \lim_{n \to \infty} \langle \Phi(u_n, u_n) - \Phi(u, u_n), u_n - u \rangle = 0, \]
then we have $\Phi(w, u_n) \rightharpoonup \Phi(w, u)$ for arbitrary $w \in X$;
(viii) if $w \in X$, $u_n \rightharpoonup u$, $\Phi(w, u_n) \rightharpoonup z$, then
\[ \lim_{n \to \infty} \langle \Phi(w, u_n), u_n \rangle = \langle z, u \rangle. \]
Then the equation
\[ T(u) = f^* \]
has at least one solution $u \in X$ for every $f^* \in X^*$.

The conditions (iv)–(viii) of Theorem 5.3.23 are somewhat unintuitive at first glance. We will try to clarify these conditions in Appendix 7.6A where an application to boundary value problems for partial differential equations is given. Next we will discuss the main steps of the proof of Theorem 5.3.22. The proof of Theorem 5.3.23 is similar, nonetheless it is technically more demanding (see, e.g., Leray & Lions [85]).

Proof of Theorem 5.3.22. We divide the proof into eight steps.

Step 1. Observe that the operator $T_{f^*}(u) := T(u) - f^*$ also satisfies all the conditions of Theorem 5.3.22. Hence it suffices to prove that the equation
\[ T(u) = o \tag{5.3.22} \]
has at least one solution.

Step 2. We construct an "approximation of the infinite dimensional equation (5.3.22) by an equation in a space of finite dimension". More precisely: Let $\Lambda$ be the family of all subspaces of finite dimension in the space $X$. If $F \in \Lambda$, define the operator $j_F : F \to X$ by $j_F(u) = u$. Obviously $j_F$ is linear and continuous on $F$. Let $j_F^*$ be the adjoint operator to $j_F$ (see Section 2.1). Then $j_F^* : X^* \to F^*$ and for $u \in F$, put
\[ T_F(u) := j_F^*(T(u)). \]
This defines a mapping $T_F$ from the space $F$ into the space $F^*$ (see Figure 5.3.3).

Step 3. Since a continuous linear operator maps a weakly convergent sequence to a weakly convergent one (see Proposition 2.1.27(i)) and the weak convergence coincides with the strong convergence on the subspace $F$ of finite dimension (cf. Remark 2.1.23), it follows from (ii) that $T_F$ is continuous. Put
\[ c(r) := \inf_{\substack{u \in X \\ \|u\| = r}} \frac{\langle T(u), u \rangle}{\|u\|}. \]
[Figure 5.3.3. Diagram of the mappings $j_F : F \to X$, $T : X \to X^*$, $j_F^* : X^* \to F^*$ and $T_F : F \to F^*$, with $T_F(u) = j_F^*(T(u))$.]
By condition (iii), we have
\[ \lim_{r \to \infty} c(r) = \infty, \]
i.e.,
\[ \langle T_F(u), u \rangle = \langle T(u), j_F(u) \rangle = \langle T(u), u \rangle \ge c(\|u\|)\|u\| \quad \text{holds for all } u \in F. \tag{5.3.23} \]

Step 4. In Exercise 5.3.26 it is proved that the equation
\[ T_F(u) = o_{F^*} \tag{5.3.24} \]
has at least one solution $u_F \in F$.

Step 5 (a priori estimate). There exists an $r_0 > 0$ such that $\|u_F\| \le r_0$ holds for arbitrary $F \in \Lambda$ and for every solution $u_F \in F$ of the equation (5.3.24). Indeed, if such an $r_0$ did not exist, there would be a sequence $\{u_n\}_{n=1}^{\infty}$ of solutions of the equation (5.3.24) with $F = F_n$, $n = 1, 2, \dots$, such that
\[ \lim_{n \to \infty} \|u_n\| = \infty, \qquad \lim_{n \to \infty} c(\|u_n\|) = \infty. \]
This would lead to a contradiction in view of the inequality (5.3.23):
\[ c(\|u_n\|)\|u_n\| \le 0 = \langle T_{F_n}(u_n), u_n \rangle. \]

Step 6. Let $F_0 \in \Lambda$ and
\[ \Lambda_{F_0} = \{F \in \Lambda : F_0 \subset F\}. \]
We denote by $U_{F_0}$ the set of all elements $u \in X$ which are solutions of the equation (5.3.24) for an $F \in \Lambda_{F_0}$. Furthermore, let $\overline{U_{F_0}}^{\,w}$ be the weak closure of the set $U_{F_0}$.${}^{46}$ Note that $\overline{U_{F_0}}^{\,w} \subset B(o; r_0)$ for any $F_0 \in \Lambda$ (cf. Exercise 2.1.39 and the fact that $U_{F_0} \subset B(o; r_0)$ for all $F_0 \in \Lambda$). Let $\Lambda_f \subset \Lambda$ be any finite subset of $\Lambda$. Then
\[ \bigcap_{F_0 \in \Lambda_f} \overline{U_{F_0}}^{\,w} \ne \emptyset. \]
Indeed, let $\Lambda_f = \{F_0^i \in \Lambda : \dim F_0^i < \infty,\ i = 1, \dots, n\}$. Then each of the sets $U_{F_0^i}$ contains all solutions $u \in X$ of the equation (5.3.24) in $\bigcup_{i=1}^{n} F_0^i$. Since $B(o; r_0)$ is a compact topological space with respect to the weak topology (notice that $X$ is reflexive), it follows from the result of Exercise 1.2.42 that
\[ \bigcap_{F_0 \in \Lambda} \overline{U_{F_0}}^{\,w} \ne \emptyset. \]
Hence there exists
\[ u_0 \in \bigcap_{F_0 \in \Lambda} \overline{U_{F_0}}^{\,w}. \]
In the next two steps we prove that $u_0$ is the desired solution of (5.3.22).

${}^{46}$ In other words, $\overline{U_{F_0}}^{\,w}$ is the least weakly closed set which contains $U_{F_0}$.

Step 7. Let $v \in X$. Choose $F_0 \in \Lambda$ such that $v \in F_0$, and let $F \in \Lambda_{F_0}$. If $u_F \in F$ is a solution of the equation (5.3.24), then condition (iv) implies that
\[
\begin{aligned}
0 \le \langle T(v) - T(u_F), v - u_F \rangle &= \langle T(v), v - u_F \rangle - \langle T(u_F), v - u_F \rangle \\
&= \langle T(v), v - u_F \rangle - \langle T(u_F), j_F(v - u_F) \rangle \\
&= \langle T(v), v - u_F \rangle - \langle T_F(u_F), v - u_F \rangle = \langle T(v), v - u_F \rangle.
\end{aligned}
\]
Thus
\[ \langle T(v), v - u \rangle \ge 0 \quad \text{holds for arbitrary } u \in U_{F_0}. \tag{5.3.25} \]
By the definition of weak topology (see Remark 2.1.23), (5.3.25) is valid even for arbitrary $u \in \overline{U_{F_0}}^{\,w}$. In particular, we then have
\[ \langle T(v), v - u_0 \rangle \ge 0. \tag{5.3.26} \]

Step 8 (the Minty trick). Choose $w \in X$, $t > 0$, and put $v = u_0 + tw$ in the inequality (5.3.26). Then
\[ 0 \le \langle T(u_0 + tw), tw \rangle = t\langle T(u_0 + tw), w \rangle, \quad \text{i.e.,} \quad 0 \le \langle T(u_0 + tw), w \rangle. \]
By passing to the limit as $t \to 0^+$, we obtain (applying condition (ii), the demicontinuity of $T$) the inequality
\[ \langle T(u_0), w \rangle \ge 0 \tag{5.3.27} \]
which is valid for all $w \in X$. Replacing the element $w$ in (5.3.27) by the element $-w$, we have
\[ 0 \le \langle T(u_0), -w \rangle = -\langle T(u_0), w \rangle, \tag{5.3.28} \]
and thus
\[ \langle T(u_0), w \rangle = 0 \quad \text{for every } w \in X, \quad \text{i.e.,} \quad T(u_0) = o. \]
Example 5.3.24. Let us consider the boundary value problem
\[
\begin{cases}
-\big(|\dot x(t)|^{p-2}\dot x(t)\big)^{\cdot} + g(x(t)) = f(t), & t \in (0,1), \\
x(0) = x(1) = 0
\end{cases}
\tag{5.3.29}
\]
where $p > 1$, $f \in L^{p'}(0,1)$ with $p' = \frac{p}{p-1}$, and $g$ is as in Example 5.3.11 (continuous and increasing). Put $X := W_0^{1,p}(0,1)$ with the norm
\[ \|x\| = \Big( \int_0^1 |\dot x(t)|^p\,dt \Big)^{1/p} \]
and define operators $J, G : X \to X^*$ and an element $f^* \in X^*$ as in Example 5.2.51. Set $T = J + G$. Then the operator equation
\[ T(x) = f^* \tag{5.3.30} \]
is equivalent to the requirement that the integral identity
\[ \int_0^1 |\dot x(t)|^{p-2}\dot x(t)\dot y(t)\,dt + \int_0^1 g(x(t))\,y(t)\,dt = \int_0^1 f(t)\,y(t)\,dt \tag{5.3.31} \]
holds for all $y \in X$. So, as in Example 5.2.51, to find a weak solution $x$ of (5.3.29) (i.e., $x$ satisfying (5.3.31)) is equivalent to finding a solution of (5.3.30). We have, by the Hölder inequality,
\[
\begin{aligned}
\|J(x_n) - J(x)\| &= \sup_{\|y\| \le 1} |\langle J(x_n) - J(x), y \rangle|
 = \sup_{\|y\| \le 1} \Big| \int_0^1 \big[ |\dot x_n(t)|^{p-2}\dot x_n(t) - |\dot x(t)|^{p-2}\dot x(t) \big] \dot y(t)\,dt \Big| \\
&\le \sup_{\|y\| \le 1} \Big( \int_0^1 \big| |\dot x_n(t)|^{p-2}\dot x_n(t) - |\dot x(t)|^{p-2}\dot x(t) \big|^{p'}\,dt \Big)^{1/p'} \Big( \int_0^1 |\dot y(t)|^p\,dt \Big)^{1/p} \\
&\le \Big( \int_0^1 \big| |\dot x_n(t)|^{p-2}\dot x_n(t) - |\dot x(t)|^{p-2}\dot x(t) \big|^{p'}\,dt \Big)^{1/p'}.
\end{aligned}
\tag{5.3.32}
\]
The last integral tends to zero as $\|x_n - x\| \to 0$ due to the continuity of the Nemytski operator $\Phi(x)(t) = \varphi(\dot x(t))$ from $L^p(0,1)$ into $L^{p'}(0,1)$ with $\varphi(s) = |s|^{p-2}s$, $s \ne 0$, $\varphi(0) = 0$, $p > 1$ (see Theorem 3.2.24). Using the Hölder inequality we also have
\[
\begin{aligned}
\|G(x_n) - G(x)\| &= \sup_{\|y\| \le 1} |\langle G(x_n) - G(x), y \rangle|
 = \sup_{\|y\| \le 1} \Big| \int_0^1 \big[ g(x_n(t)) - g(x(t)) \big] y(t)\,dt \Big| \\
&\le \sup_{\|y\| \le 1} \Big( \int_0^1 |g(x_n(t)) - g(x(t))|^{p'}\,dt \Big)^{1/p'} \Big( \int_0^1 |y(t)|^p\,dt \Big)^{1/p} \\
&\le c \Big( \int_0^1 |g(x_n(t)) - g(x(t))|^{p'}\,dt \Big)^{1/p'} \to 0
\end{aligned}
\tag{5.3.33}
\]
as $\|x_n - x\| \to 0$ (cf. Example 5.3.11 and the continuous embedding $W_0^{1,p}(0,1) \subset C[0,1]$ for $p > 1$). It follows from (5.3.32) and (5.3.33) that $T$ is continuous and hence demicontinuous. The boundedness of $T$ follows from estimates similar to (5.3.32), (5.3.33) (the reader is invited to do it in detail!). We also have
\[
\begin{aligned}
\langle T(x) - T(y), x - y \rangle
&= \int_0^1 \big[ |\dot x(t)|^{p-2}\dot x(t) - |\dot y(t)|^{p-2}\dot y(t) \big] (\dot x(t) - \dot y(t))\,dt
 + \int_0^1 [g(x(t)) - g(y(t))](x(t) - y(t))\,dt \\
&\ge \int_0^1 |\dot x(t)|^p\,dt - \int_0^1 |\dot y(t)|^{p-2}\dot y(t)\dot x(t)\,dt
 - \int_0^1 |\dot x(t)|^{p-2}\dot x(t)\dot y(t)\,dt + \int_0^1 |\dot y(t)|^p\,dt \\
&\ge \int_0^1 |\dot x(t)|^p\,dt - \Big( \int_0^1 |\dot y(t)|^p\,dt \Big)^{1/p'} \Big( \int_0^1 |\dot x(t)|^p\,dt \Big)^{1/p} \\
&\qquad - \Big( \int_0^1 |\dot x(t)|^p\,dt \Big)^{1/p'} \Big( \int_0^1 |\dot y(t)|^p\,dt \Big)^{1/p} + \int_0^1 |\dot y(t)|^p\,dt \\
&= \big[ \|x\|^{p-1} - \|y\|^{p-1} \big] \big[ \|x\| - \|y\| \big] \ge 0
\end{aligned}
\]
with strict inequality for $x \ne y$, since $s \mapsto |s|^{p-1}$ is a strictly increasing function on $(0, \infty)$. Hence the monotonicity of $T$ follows. Finally,
\[
\begin{aligned}
\langle T(x), x \rangle &= \int_0^1 |\dot x(t)|^p\,dt + \int_0^1 g(x(t))\,x(t)\,dt \\
&= \|x\|^p + \int_0^1 [g(x(t)) - g(0)](x(t) - 0)\,dt + \int_0^1 g(0)\,x(t)\,dt \ge \|x\|^p - |g(0)|\,\|x\|,\,{}^{47}
\end{aligned}
\]
i.e., $T$ is coercive. It follows then from Theorem 5.3.22 that there is a unique solution of (5.3.30) (which in turn is a unique weak solution of (5.3.29)). The advantage of the Browder Theorem is more transparent in the case of partial differential equations when the embedding $W_0^{1,p}(\Omega) \subset C(\Omega)$ does not hold in general, and so only the demicontinuity of $T$ can be proved. An application of the more general Theorem 5.3.23 is postponed to the last chapter, Appendix 7.6A.

${}^{47}$ We have used $\|x\|_{L^1(0,1)} \le \|x\|_{W_0^{1,p}(0,1)}$. Prove it!

Exercise 5.3.25. Prove that the unique weak solution $x = x(t)$ of (5.3.29) belongs to $C^1[0,1]$, $|\dot x|^{p-2}\dot x$ is absolutely continuous and the equation
\[ -\big(|\dot x|^{p-2}\dot x\big)^{\cdot} + g(x(t)) = f(t) \]
holds a.e. in $(0,1)$.
Hint. Integrating by parts in (5.3.31), we obtain that
\[ \int_0^1 \Big[ |\dot x(t)|^{p-2}\dot x(t) - \int_0^t \big( g(x(\tau)) - f(\tau) \big)\,d\tau \Big] \dot y(t)\,dt = 0 \tag{5.3.34} \]
for every $y \in D(0,1)$. Set
\[ M(t) = |\dot x(t)|^{p-2}\dot x(t) - \int_0^t \big( g(x(\tau)) - f(\tau) \big)\,d\tau. \]
It follows from Lemma 6.1.9 that
\[ M(t) = c \quad \text{a.e. in } (0,1). \tag{5.3.35} \]
The assertion now follows from (5.3.35) as in the proof of Theorem 6.1.14.

Exercise 5.3.26. Prove the following assertion: Let $T$ be a continuous mapping defined on a Banach space $X$ of finite dimension with values in $X^*$. Assume that there exists a real function $c = c(r)$, defined on the interval $(0, \infty)$, such that
\[ \lim_{r \to \infty} c(r) = \infty \quad \text{and that} \quad \langle T(u), u \rangle \ge c(\|u\|)\|u\| \quad \text{holds for all } u \in X. \]
Then $T(X) = X^*$, i.e., the equation $T(u) = f^*$ has at least one solution in the space $X$ for arbitrary $f^* \in X^*$.
Hint. Let $f^* \in X^*$. In the case when $X = X^* = \mathbb{R}^N$ and $\langle u, v \rangle = (u, v)_{\mathbb{R}^N}$ is the scalar product in $\mathbb{R}^N$, the operator $F : \mathbb{R}^N \to \mathbb{R}^N$ defined by the relation $F(u) = T(u) - f^*$ satisfies the assumption
\[ (F(u), u)_{\mathbb{R}^N} > 0 \quad \text{for } u \in \partial B(o; R) \text{ with } R > 0 \text{ large enough.} \tag{5.3.36} \]
Apply the homotopy invariance property of the Brouwer degree and show that (5.3.36) implies that there exists $u_0 \in B(o; R)$ such that
\[ F(u_0) = o, \quad \text{i.e.,} \quad T(u_0) = f^*. \]
In the general case Remark 1.1.12(i) must be employed.

Exercise 5.3.27. Consider the problem
\[
\begin{cases}
-\big(|\dot x(t)|^{p-2}\dot x(t)\big)^{\cdot} = h(t, x(t), \dot x(t)), & t \in (0,1), \\
x(0) = x(1) = 0,
\end{cases}
\tag{5.3.37}
\]
where $p > 1$. Formulate conditions on $h = h(t, x, s)$ which guarantee the existence of a weak solution of (5.3.37).
Hint. Apply Theorem 5.3.22.

Exercise 5.3.28. How do the conditions on $h$ change if we replace the homogeneous Dirichlet boundary conditions in (5.3.37) by the Neumann ones?
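The monotonicity estimate for the operator $J$ in Example 5.3.24, namely $\langle J(x) - J(y), x - y \rangle \ge (\|x\|^{p-1} - \|y\|^{p-1})(\|x\| - \|y\|)$, can be checked numerically on grid functions. The sketch below is an illustration under assumptions that are not part of the text: derivatives are replaced by forward differences on a uniform grid, $p = 3$ is chosen arbitrarily, the test functions are random, and the helper names are hypothetical. The discrete inequality holds for the same reason as the continuous one (the Hölder inequality for sums).

```python
import random


def phi(s, p):
    """phi(s) = |s|^(p-2) s for s != 0, phi(0) = 0."""
    return 0.0 if s == 0.0 else abs(s) ** (p - 2.0) * s


def derivative(x, h):
    """Forward differences of a grid function extended by zero at both ends."""
    padded = [0.0] + list(x) + [0.0]
    return [(padded[i + 1] - padded[i]) / h for i in range(len(padded) - 1)]


def w_norm(x, h, p):
    """Discrete analogue of (int |x'|^p dt)^(1/p)."""
    return (h * sum(abs(d) ** p for d in derivative(x, h))) ** (1.0 / p)


def pairing_J(x, y, h, p):
    """Discrete analogue of <J(x), y> = int |x'|^(p-2) x' y' dt."""
    return h * sum(phi(a, p) * b
                   for a, b in zip(derivative(x, h), derivative(y, h)))


random.seed(0)
n, p = 40, 3.0
h = 1.0 / (n + 1)
checks = []
for _ in range(200):
    x = [random.uniform(-1.0, 1.0) for _ in range(n)]
    y = [random.uniform(-1.0, 1.0) for _ in range(n)]
    diff = [a - b for a, b in zip(x, y)]
    lhs = pairing_J(x, diff, h, p) - pairing_J(y, diff, h, p)
    nx, ny = w_norm(x, h, p), w_norm(y, h, p)
    rhs = (nx ** (p - 1.0) - ny ** (p - 1.0)) * (nx - ny)
    # allow a small relative tolerance for floating-point rounding
    checks.append(lhs >= rhs - 1e-9 * (1.0 + abs(lhs)))
print(all(checks))
```

A check of this kind does not prove anything, but it is a cheap way to catch a wrong exponent or sign when the estimate is adapted to another operator.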
5.4 Supersolutions, Subsolutions, Monotone Iterations

In this section we deal with another possibility of extending the notion of a monotone function to operators between Banach spaces of infinite dimension. Instead of characterizing an increasing function $f : \mathbb{R} \to \mathbb{R}$ in terms of the inequality
\[ (f(x) - f(y))(x - y) \ge 0 \quad \text{for any } x, y \in \mathbb{R} \]
(cf. Section 5.3), we use the usual "first semester calculus" definition:
\[ \text{for any } x, y \in \mathbb{R} \text{ satisfying } x \le y \text{ we have } f(x) \le f(y). \tag{5.4.1} \]
However, to generalize the implication (5.4.1) to the case of general operators we have to introduce an inequality relation for Banach spaces which can be used analogously to the inequality relation for the set of real numbers.

Definition 5.4.1. Let $X$ be a real Banach space and let $K$ be a subset of $X$. Then $K$ is called an order cone if
(1) $K$ is closed, nonempty, and $K \ne \{o\}$;
(2) $a, b \in \mathbb{R}$, $a, b \ge 0$, $x, y \in K$ implies $ax + by \in K$;
(3) $x \in K$ and $-x \in K$ implies $x = o$.
On this basis we define
\[
\begin{aligned}
x \le y \quad &\text{provided} \quad y - x \in K, \\
x < y \quad &\text{provided} \quad x \le y \text{ and } x \ne y, \\
x \ll y \quad &\text{provided} \quad y - x \in \operatorname{int} K.
\end{aligned}
\tag{5.4.2}
\]
The set $[x, y] = \{z \in X : x \le z \le y\}$ is called an order interval in $X$. Note that $x \ge y$ means $x - y \in K$, and similarly for "$>$" and "$\gg$".

Remark 5.4.2. Condition (2) is equivalent to saying that $K$ is convex, and if $x \in K$ and $a \ge 0$, then $ax \in K$.

Definition 5.4.3. By an ordered Banach space we mean a real Banach space together with an order cone.

Remark 5.4.4. The reader should notice the difference between order cones and cones. A subset $C$ of the Banach space $X$ is called a cone if $x \in C$ and $a > 0$ implies $ax \in C$. So, every order cone is a cone, but the converse is not true in general.

Example 5.4.5. Let $X = \mathbb{R}^N$. We set
\[ \mathbb{R}^{N,+} = \{(\xi_1, \dots, \xi_N) \in \mathbb{R}^N : \xi_i \ge 0 \text{ for all } i = 1, \dots, N\}. \]
Then $K = \mathbb{R}^{N,+}$ is an order cone (see Figure 5.4.1).
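We have
\[
\begin{aligned}
(\xi_1, \dots, \xi_N) \le (\eta_1, \dots, \eta_N) \quad &\text{if and only if} \quad \xi_i \le \eta_i \text{ for all } i = 1, \dots, N, \\
(\xi_1, \dots, \xi_N) \ll (\eta_1, \dots, \eta_N) \quad &\text{if and only if} \quad \xi_i < \eta_i \text{ for all } i = 1, \dots, N.
\end{aligned}
\]

Example 5.4.6. The set $C$ in Figure 5.4.2 is a cone in $\mathbb{R}^2$ but it is not an order cone.

[Figure 5.4.1. The order cone $K = \mathbb{R}^{2,+}$ in the $(x, y)$-plane, with labeled points $(\xi_1, \xi_2)$ and $(\eta_1, \eta_2)$.]

[Figure 5.4.2. A cone $C$ in $\mathbb{R}^2$ with vertex $o$.]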
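The order relations of (5.4.2) for the cone $K = \mathbb{R}^{N,+}$ of Example 5.4.5 reduce to componentwise comparisons, which can be sketched in a few lines. The function names below are hypothetical and chosen only for this illustration.

```python
def leq(x, y):
    """x <= y: y - x lies in the cone K = R^{N,+}."""
    return all(a <= b for a, b in zip(x, y))


def less(x, y):
    """x < y: x <= y and x != y."""
    return leq(x, y) and tuple(x) != tuple(y)


def much_less(x, y):
    """x << y: y - x lies in the interior of K (all components positive)."""
    return all(a < b for a, b in zip(x, y))


def in_order_interval(z, x, y):
    """z belongs to the order interval [x, y] = {z : x <= z <= y}."""
    return leq(x, z) and leq(z, y)


print(less((0, 1), (1, 1)), much_less((0, 1), (1, 1)))
print(leq((0, 2), (1, 1)), leq((1, 1), (0, 2)))
```

The second line of output concerns the pair $(0, 2)$, $(1, 1)$, for which neither $x \le y$ nor $y \le x$ holds: unlike the order on $\mathbb{R}$, the order induced by a cone is in general only partial.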
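Example 5.4.7. Let $X = C(\Omega)$ for a bounded set $\Omega \subset \mathbb{R}^N$. We set
\[ C^+(\Omega) = \{f \in C(\Omega) : f(x) \ge 0 \text{ for every } x \in \Omega\}. \]
Then $K = C^+(\Omega)$ is an order cone in $X$, and we have
\[
\begin{aligned}
f \le g \quad &\text{if and only if} \quad f(x) \le g(x) \text{ for all } x \in \Omega, \\
f \ll g \quad &\text{if and only if} \quad f(x) < g(x) \text{ for all } x \in \Omega.
\end{aligned}
\]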
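The following assertion summarizes the basic properties of ordering in a Banach space $X$.

Proposition 5.4.8. For all $u, x, x_n, y, y_n, z \in X$ and all $a, b \in \mathbb{R}$, we have
\[
\begin{aligned}
&x \le x, \\
&x \le y \ \text{ and } \ y \le x \quad \text{imply} \quad x = y, \\
&x \le y \ \text{ and } \ y \le z \quad \text{imply} \quad x \le z.
\end{aligned}
\]
Furthermore, we have
\[
\begin{aligned}
&x \le y \ \text{ and } \ 0 \le a \le b \quad \text{imply} \quad ax \le by, \\
&x \le y \ \text{ and } \ u \le z \quad \text{imply} \quad x + u \le y + z, \\
&x_n \le y_n \ \text{ for all } n \quad \text{implies} \quad \lim_{n \to \infty} x_n \le \lim_{n \to \infty} y_n \,{}^{48}
\end{aligned}
\]
provided the limits exist. For the symbol "$\ll$", the following implications hold:
\[
\begin{aligned}
&x \ll y \ \text{ and } \ y \le z \quad \text{imply} \quad x \ll z, \\
&x \le y \ \text{ and } \ y \ll z \quad \text{imply} \quad x \ll z, \\
&x \ll y \ \text{ and } \ a > 0 \quad \text{imply} \quad ax \ll ay.
\end{aligned}
\]

${}^{48}$ The limits are understood in the norm topology of $X$.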
Proof. Use (5.4.2) and the properties of $K$. For example, if $x_n \le y_n$ for all $n$, then $y_n - x_n \in K$. Since $K$ is closed and the limits $x$, $y$ of $\{x_n\}_{n=1}^{\infty}$, $\{y_n\}_{n=1}^{\infty}$ exist, we conclude that $y - x \in K$, i.e., $x \le y$.

Definition 5.4.9. The order cone $K$ is called normal if there is a number $c > 0$ such that for all $x, y \in X$, $o \le x \le y$ we have $\|x\| \le c\|y\|$.

Example 5.4.10. For $X = \mathbb{R}^N$, $K = \mathbb{R}^{N,+}$ is a normal order cone in $\mathbb{R}^N$. Similarly, $C^+(\Omega)$ is a normal order cone in $C(\Omega)$.

Lemma 5.4.11. If an order cone is normal, then every order interval $[x, y]$ is bounded in the norm.

Proof. If $x \le w \le y$, then $o \le w - x \le y - x$, and hence
\[ \|w\| \le \|w - x\| + \|x\| \le c\|x - y\| + \|x\|. \]

Now we can introduce the definition of a monotone increasing operator between ordered Banach spaces.

Definition 5.4.12. Let $X$ and $Y$ be ordered Banach spaces. An operator $T : \operatorname{Dom} T \subset X \to Y$ is said to be monotone increasing if
\[ x < y \quad \text{implies} \quad T(x) \le T(y) \quad \text{for any } x, y \in \operatorname{Dom} T. \]
An operator $T$ is said to be strictly or strongly monotone increasing if the symbol "$\le$" is replaced by "$<$" or "$\ll$", respectively. Similarly we define (strictly, strongly) monotone decreasing operator. The operator $T$ is said to be positive if $T(o) \ge o$ and for all $x \in \operatorname{Dom} T$,
\[ x > o \quad \text{implies} \quad T(x) \ge o. \]
As above, the operator is strictly or strongly positive if the symbol "$\ge$" is replaced by "$>$" or "$\gg$", respectively.

Example 5.4.13. Let $X = Y = \mathbb{R}$, $K = \mathbb{R}^+$. Then for a real function $f : \operatorname{Dom} f \subset \mathbb{R} \to \mathbb{R}$ the concepts of (strictly) monotone increasing (or decreasing) above coincide with the usual definitions. Because of the equivalence of $x \ll y$ and $x < y$ on $\mathbb{R}$ there is no difference here between strongly monotone increasing (decreasing) and strictly monotone increasing (decreasing).
Example 5.4.14. For a linear operator $T$, the concepts of (strictly, strongly) positive are the same as those of (strictly, strongly) monotone increasing. Indeed, let $T$ be positive, for example. Then we have the following sequence of implications:
\[ x < y \implies o < y - x \implies o \le T(y - x) \implies o \le T(y) - T(x) \implies T(x) \le T(y), \]
i.e., $T$ is monotone increasing. The other proofs are similar.
Let $T : X \to X$ be an operator on a Banach space $X$. We consider the operator equation
\[ u = T(u) \tag{5.4.3} \]
and apply an iterative method to solve it. For this purpose we consider the iterations (successive approximations)
\[ u_{n+1} = T(u_n) \quad \text{and} \quad v_{n+1} = T(v_n), \qquad n = 0, 1, 2, \dots. \tag{5.4.4} \]
We illustrate the idea of approximations in Figure 5.4.3.

[Figure 5.4.3. Fixed points $u$, $v$ of $T$. Points marked along the axis, in order: $u_0, u_1, u_2, \dots, u, v, \dots, v_2, v_1, v_0$.]

The next definition is a basic definition of the existence theory for operator equations in ordered Banach spaces.
Definition 5.4.15. A point $u$ is called a supersolution of (5.4.3) (or of the operator $T$) if $T(u) \le u$. The prefix "super" is replaced by "sub" when the respective inequalities are reversed. For example, $u \in X$ satisfying $u \le T(u)$ is a subsolution of (5.4.3).${}^{49}$

The following assertions justify the general principle of super- and subsolutions. This principle can be formulated as follows: If there is a super- and subsolution, then a solution can be obtained by the convergent iterative method (5.4.4). Namely, we have the following results.

Theorem 5.4.16. Let $T : X \to X$ be a compact monotone increasing operator on a real Banach space $X$ with a normal order cone $X^+$ and $u_0$ be a subsolution ($v_0$ a supersolution) of (5.4.3). Then $\{u_n\}_{n=1}^{\infty}$ ($\{v_n\}_{n=1}^{\infty}$) in (5.4.4) converges if and only if this sequence is bounded above (below).${}^{50}$ In the case of convergence, the limit point $u$ is the smallest fixed point of $T$ with $u_0 \le u$ ($v$ is the greatest fixed point of $T$ with $v \le v_0$).${}^{51}$

Proof. We will consider the case of a subsolution. The case of a supersolution is very similar. Since $T$ is monotone increasing, we have the sequence of implications
\[ u_0 \le T(u_0) = u_1 \implies u_1 \le u_2 \implies \cdots, \quad \text{i.e.,} \quad u_0 \le u_1 \le u_2 \le \cdots. \]
Now $u_n \to u$ implies that $u_n \le u$ for all $n$. Consequently, the sequence $\{u_n\}_{n=1}^{\infty}$ is bounded above, if it is convergent.

Conversely, the sequence $\{u_n\}_{n=1}^{\infty}$ is convergent if it is bounded above. Indeed, suppose $u_n \le v$ for all $n$. By Lemma 5.4.11, the sequence $\{u_n\}_{n=1}^{\infty}$ is bounded in the norm. Since $u_n = T(u_{n-1})$ and since $T$ is compact, the set of all $u_n$ is relatively compact. Thus there exist a convergent subsequence $\{u_{n_k}\}_{k=1}^{\infty}$ and $u$ such that $u_{n_k} \to u$. Since the sequence $\{u_n\}_{n=1}^{\infty}$ is monotone, all convergent subsequences have the same limit point. Therefore, the whole sequence $\{u_n\}_{n=1}^{\infty}$ converges to $u$ as well. Since $u_{n+1} = T(u_n)$, letting $n \to \infty$ shows that $u = T(u)$.

Let $w$ be a solution of (5.4.3) with $u_0 \le w$. Then $u_1 = T(u_0) \le T(w) = w$, etc., so that $u_n \le w$ for all $n$. Hence $u \le w$.

The intuitive meaning of the following assertion is demonstrated in Figure 5.4.3.

${}^{49}$ The terminology is not fixed. Instead of super- and subsolution the notions upper and lower solutions have been also used.
${}^{50}$ The set $M \subset X$ is bounded above if there is $m \in X$ such that $u \le m$ for any $u \in M$.
${}^{51}$ Note that the concepts of "smallest" and "greatest" are used in the usual sense, e.g., a smallest fixed point $u$ in $X$ is characterized by $u \le w$ for all other fixed points $w \ge u_0$.
Corollary 5.4.17 (Monotone Iterative Method). Let $X$ be a real Banach space with a normal order cone and $T : X \to X$. Assume that $u_0$ and $v_0$ are a subsolution and a supersolution of (5.4.3), respectively, and $u_0 \le v_0$. If $T$ is a compact monotone increasing operator on the order interval $[u_0, v_0]$, then both the iterative sequences $\{u_n\}_{n=1}^{\infty}$ and $\{v_n\}_{n=1}^{\infty}$ from (5.4.4) are defined, converge, and
\[ u = \lim_{n \to \infty} u_n \quad \text{and} \quad v = \lim_{n \to \infty} v_n \]
is the smallest fixed point and the largest fixed point, respectively, of $T$ in $[u_0, v_0]$. Furthermore, we have the error estimates
\[ u_n \le u \le v \le v_n \quad \text{for all } n = 0, 1, \dots \]
(Figure 5.4.3).

Proof. Since $u_0 \le T(u_0)$, $T(v_0) \le v_0$ and $u_0 \le v_0$ together imply that
\[ u_0 \le u_1 \le v_0 \quad \text{and similarly that} \quad u_n \le v_0 \text{ for all } n, \]
it follows that $\{u_n\}_{n=1}^{\infty}$ is bounded above. The proof then follows from Theorem 5.4.16.
Example 5.4.18 (Integral Equation). Let $\Omega$ be a bounded domain in $\mathbb{R}^N$, $G : \Omega \times \Omega \to \mathbb{R}$ be a continuous and nonnegative function and $f : \Omega \times \mathbb{R} \to \mathbb{R}$ be a continuous and increasing function in the second variable. Consider the integral equation
\[ u(x) = \int_\Omega G(x, y) f(y, u(y))\,dy \tag{5.4.5} \]
in the space $C(\Omega)$. We write this equation in the form
\[ u = T(u), \qquad u \in C(\Omega). \]
Let us consider the normal order cone $C^+(\Omega)$ from Example 5.4.7. Then the operator $T : C(\Omega) \to C(\Omega)$ is compact and monotone increasing.${}^{52}$ Considering subsolutions and supersolutions now means that we replace "$=$" by "$\le$" and "$\ge$", respectively, in the integral equation (5.4.5). Corollary 5.4.17 implies that if $u_0 \in C(\Omega)$ is a subsolution and $v_0 \in C(\Omega)$ is a supersolution with $u_0 \le v_0$ on $\Omega$, then for $n \to \infty$ the iterative method
\[ u_{n+1}(x) = \int_\Omega G(x, y) f(y, u_n(y))\,dy, \qquad n = 0, 1, 2, \dots, \]
converges uniformly on $\Omega$ to a solution $u \in C(\Omega)$ of the integral equation with $u_0 \le u \le v_0$ on $\Omega$. Here $u$ is the smallest solution with this property. If, instead, the iterative method starts with $v_0$, then we obtain the greatest solution $v$ with $u_0 \le v \le v_0$. The most difficult task in solving (5.4.5) is to find at least one subsolution $u_0$ and/or supersolution.

${}^{52}$ The compactness of $T$ has been proved in Example 5.1.14; $f = f(y, u)$ increasing in $u$ immediately implies that $T$ is monotone increasing.
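The monotone iterations of Example 5.4.18 can be followed numerically. The sketch below is an illustration under assumptions that are not taken from the text: $\Omega = (0, 1)$, the kernel $G$ is the Green function of $-u''$ with zero boundary values, $f(y, u) = 1 + \arctan u$, the integral is replaced by the midpoint rule, and the starting points are $u_0 \equiv 0$ and $v_0 \equiv 1$. Since $G \ge 0$ and $f$ is increasing in $u$, the discretized operator is again monotone increasing in the componentwise order.

```python
import math


def green(x, y):
    """Assumed kernel: Green function of -u'' with u(0) = u(1) = 0 (nonnegative)."""
    return x * (1.0 - y) if x <= y else y * (1.0 - x)


def f(y, u):
    """Assumed nonlinearity: continuous and increasing in u."""
    return 1.0 + math.atan(u)


def apply_T(u, nodes, weight):
    """Midpoint-rule version of (T u)(x) = int_0^1 G(x, y) f(y, u(y)) dy."""
    return [weight * sum(green(x, y) * f(y, uy) for y, uy in zip(nodes, u))
            for x in nodes]


n = 50
weight = 1.0 / n
nodes = [(j + 0.5) * weight for j in range(n)]

lower = [0.0] * n     # u_0 = 0 is a subsolution, because T(u_0) >= 0
upper = [1.0] * n     # v_0 = 1 is a supersolution, because T(v_0) stays below 1
ordered = True
for _ in range(60):
    new_lower = apply_T(lower, nodes, weight)
    new_upper = apply_T(upper, nodes, weight)
    # check u_n <= u_{n+1} <= v_{n+1} <= v_n up to rounding
    ordered = ordered and all(
        a <= b + 1e-12 and b <= c + 1e-12 and c <= d + 1e-12
        for a, b, c, d in zip(lower, new_lower, new_upper, upper))
    lower, upper = new_lower, new_upper

gap = max(v - u for u, v in zip(lower, upper))
print(ordered, gap)
```

In this particular example the two sequences meet, so the smallest and the greatest solution in $[u_0, v_0]$ coincide; in general the corollary only guarantees $u \le v$.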
Example 5.4.19 (Differential Equation). Let us consider the Dirichlet boundary value problem
\[
\begin{cases}
-\ddot x(t) = f(t, x(t)), & t \in (0,1), \\
x(0) = x(1) = 0
\end{cases}
\tag{5.4.6}
\]
where $f : [0,1] \times \mathbb{R} \to \mathbb{R}$ is a function continuous in the first variable and continuously differentiable in the second one. Suppose $u_0, v_0 \in C^2[0,1]$ are such that
\[
\begin{aligned}
&-\ddot v_0(t) \ge f(t, v_0(t)), \quad t \in (0,1), \qquad -\ddot u_0(t) \le f(t, u_0(t)), \quad t \in (0,1), \\
&u_0(0) \le 0, \quad u_0(1) \le 0, \quad v_0(0) \ge 0, \quad v_0(1) \ge 0.\,{}^{53}
\end{aligned}
\tag{5.4.7}
\]
We will show that $u_0$, $v_0$ is a subsolution and a supersolution, respectively, for the operator $w = T(z)$ defined by a solution of the problem
\[
\begin{cases}
-\ddot w(t) + c\,w(t) = f(t, z(t)) + c\,z(t) =: F(t, z), & t \in (0,1), \\
w(0) = w(1) = 0,
\end{cases}
\]
where $c > 0$ is chosen in such a way that
\[ \frac{\partial f}{\partial s}(t, s) + c > 0 \quad \text{for } t \in [0,1] \text{ and } s \in I_0 := \Big[ \min_{t \in [0,1]} u_0(t), \max_{t \in [0,1]} v_0(t) \Big]. \]
Notice that $0 \in I_0$. The map $T$ is correctly defined because the Dirichlet problem
\[
\begin{cases}
-\ddot w(t) + c\,w(t) = g(t), & t \in (0,1), \\
w(0) = w(1) = 0
\end{cases}
\tag{5.4.8}
\]
has a unique solution for any fixed $g \in C[0,1]$. Then $T : C[0,1] \to C[0,1]$ is a compact operator. This follows from the fact that $T$ is composed from the Nemytski operator $N : z(t) \mapsto f(t, z(t)) + c\,z(t)$ which is continuous, and a compact linear operator $A^{-1}$ where
\[ A(w(t)) = -\ddot w(t) + c\,w(t), \qquad \operatorname{Dom} A = \{w \in C^2[0,1] : w(0) = w(1) = 0\} \]
(cf. Example 2.2.17), i.e., $T = A^{-1} \circ N$.

We will prove that $T : C[0,1] \to C[0,1]$ is a monotone increasing operator. Indeed, let $z_1, z_2 \in C[0,1]$, $z_1 \le z_2$. By definition,
\[
\begin{cases}
-(T(z_i))^{\cdot\cdot}(t) + c\,T(z_i)(t) = f(t, z_i(t)) + c\,z_i(t), & t \in (0,1), \\
(T(z_i))(0) = (T(z_i))(1) = 0
\end{cases}
\quad \text{for } i = 1, 2.
\]

${}^{53}$ The functions $u_0$, $v_0$ are called a subsolution and supersolution of the boundary value problem (5.4.6), respectively.
Putting $w = T(z_2) - T(z_1)$ we get
\[
\begin{cases}
-\ddot{w}(t) + c\,w(t) = f(t, z_2(t)) - f(t, z_1(t)) + c\,(z_2(t) - z_1(t)), & t \in (0,1),\\
w(0) = w(1) = 0.
\end{cases}
\]
However, the function $F : (t, s) \mapsto f(t, s) + cs$ is increasing in $s$ on the interval $I_0$ by the choice of $c$. Hence for $z_1 \le z_2$, $z_1(t), z_2(t) \in I_0$ for every $t \in [0,1]$, we have
\[
0 \le F(t, z_2) - F(t, z_1) = f(t, z_2(t)) - f(t, z_1(t)) + c\,(z_2(t) - z_1(t)) = -\ddot{w}(t) + c\,w(t).
\]
Therefore
\[
-\ddot{w}(t) + c\,w(t) \ge 0, \qquad w(0) = w(1) = 0.
\tag{5.4.9}
\]
Assume that there is $t \in (0,1)$ such that $w(t) < 0$. Then there is $t_0 \in (0,1)$ such that
\[
0 > w(t_0) = \min_{t \in [0,1]} w(t).
\]
But then $\ddot{w}(t_0) \ge 0$, so that $-\ddot{w}(t_0) + c\,w(t_0) < 0$, a contradiction with the inequality (5.4.9). Hence $w(t) \ge 0$ in $(0,1)$, i.e., $T(z_1) \le T(z_2)$.54

We now prove that $v_0 \ge T(v_0)$, i.e., $v_0$ is a supersolution of $T$. Set $v_1 = T(v_0)$. We get
\[
\begin{cases}
-\ddot{v}_1(t) + c\,v_1(t) = f(t, v_0(t)) + c\,v_0(t), & t \in (0,1),\\
v_1(0) = v_1(1) = 0,
\end{cases}
\]
therefore
\[
-(v_1(t) - v_0(t))^{\cdot\cdot} + c\,(v_1(t) - v_0(t)) = f(t, v_0(t)) + c\,v_0(t) + \ddot{v}_0(t) - c\,v_0(t) \le 0
\]
for $t \in (0,1)$. The same argument as above yields that $v_1(t) \le v_0(t)$, $t \in [0,1]$. Analogously we prove that $u_0 \le T(u_0)$, i.e., $u_0$ is a subsolution of $T$. If, moreover, $u_0 \le v_0$, then Corollary 5.4.17 can be used (cf. Exercise 5.4.22).

Exercise 5.4.20. Let $T : \mathbb{R}^N \to \mathbb{R}^N$. Then the equation $x = T(x)$, $x \in \mathbb{R}^N$, in (5.4.3) describes a system of nonlinear equations
\[
x_i = T_i(x_1, \dots, x_N), \qquad i = 1, \dots, N.
\]
Consider the order cone $\mathbb{R}^{N,+}$ from Example 5.4.5. Translate all the assumptions and conclusions of Theorem 5.4.16 and Corollary 5.4.17 to this system.

54 The argument used to prove $w(t) \ge 0$ in $(0,1)$ is a special version of the more general Maximum Principle (see, e.g., Protter & Weinberger [102]). The monotonicity of $T$ can also be shown by proving that the Green function corresponding to the operator $A$ (Example 2.2.17) is nonnegative.
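The construction of Example 5.4.19 can be illustrated by a small numerical experiment. What follows is a sketch under stated assumptions, not a method taken from the text: the right-hand side $f(t,x) = e^{-x}$, the subsolution $u_0 \equiv 0$, the supersolution $v_0(t) = t(1-t)/2$, the shift $c = 2$ and the finite-difference discretization are all hypothetical choices made for illustration. With them, $\partial f/\partial s + c > 0$ on $I_0 = [0, 1/8]$.

```python
import numpy as np

def monotone_iterates(f, c, start, n_iter=30):
    """Return the iterates z_{k+1} = T(z_k) on the interior grid points."""
    m = start.size
    h = 1.0 / (m + 1)
    t = np.linspace(h, 1.0 - h, m)
    # Finite-difference matrix for A(w) = -w'' + c*w with w(0) = w(1) = 0.
    A = ((2.0 / h**2 + c) * np.eye(m)
         - (np.eye(m, k=1) + np.eye(m, k=-1)) / h**2)
    seq = [start.copy()]
    for _ in range(n_iter):
        z = seq[-1]
        # One step w = T(z): solve -w'' + c*w = f(t, z) + c*z.
        seq.append(np.linalg.solve(A, f(t, z) + c * z))
    return seq

m = 99
t = np.linspace(0.01, 0.99, m)               # interior grid points, h = 0.01
f = lambda t, x: np.exp(-x)                  # hypothetical right-hand side

us = monotone_iterates(f, 2.0, np.zeros(m))              # from u_0 = 0
vs = monotone_iterates(f, 2.0, 0.5 * t * (1.0 - t))      # from v_0 = t(1-t)/2
```

The sequence `us` increases and `vs` decreases; in this particular example both converge to the same grid function, which approximates a solution of (5.4.6) lying between $u_0$ and $v_0$.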
Exercise 5.4.21. Formulate conditions on a nonlinear function $f : \Omega \times \mathbb{R} \to \mathbb{R}$ which guarantee that there exist a subsolution $u_0 \in C(\Omega)$ and a supersolution $v_0 \in C(\Omega)$ of the operator $T$ from Example 5.4.18 such that $u_0 \le v_0$ on $\Omega$. Then formulate the corresponding existence result for the integral equation (5.4.5).

Exercise 5.4.22. Formulate conditions on a function $f : [0,1] \times \mathbb{R} \to \mathbb{R}$ which guarantee that there exist a subsolution $u_0 \in C^2[0,1]$ and a supersolution $v_0 \in C^2[0,1]$ of the operator $T$ from Example 5.4.19 such that $u_0 \le v_0$ in $[0,1]$. Then formulate the corresponding existence result for the boundary value problem (5.4.6).

Exercise 5.4.23. Replace in (5.4.6) the homogeneous Dirichlet boundary conditions by the Neumann ones. Modify the definitions of a subsolution and a supersolution in such a way that Corollary 5.4.17 could be applied. Formulate conditions on $f = f(t, x)$ which guarantee the existence of a solution of the corresponding Neumann problem.
Hint. Use Corollary 5.4.17.

Exercise 5.4.24. Consider the Dirichlet boundary value problem
\[
\begin{cases}
-\ddot{x}(t) = h(t, x(t), \dot{x}(t)), & t \in (0,1),\\
x(0) = x(1) = 0.
\end{cases}
\tag{5.4.10}
\]
Formulate conditions on $h = h(t, x, s)$ which guarantee the existence of a solution of (5.4.10).
Hint. Use Corollary 5.4.17.

Exercise 5.4.25. How do the conditions on $h$ change if we replace the homogeneous Dirichlet boundary conditions in (5.4.10) by the Neumann ones?
5.4A Minorant Principle and Krein–Rutman Theorem

In this appendix we study the eigenvalue problem
\[
T(x) = \lambda x,
\tag{5.4.11}
\]
and the corresponding inhomogeneous equation
\[
\lambda x - T(x) = y, \qquad y > o,
\tag{5.4.12}
\]
on a real Banach space $X$ with an order cone $K := X^{+}$.

Definition 5.4.26. By a positive solution $(x, \lambda)$ of (5.4.11), we mean a solution of $T(x) = \lambda x$ with $x > o$ and $\lambda > 0$. If we replace “=” by “≥”, then we speak about a positive subsolution.

Although we present mainly statements about linear problems, the following results play a central role in the investigation of nonlinear problems, for example, in the bifurcation theory, variational principles, etc. The essential tools for investigating (5.4.11) are the Minorant Principle and the Separation Theorem for convex sets (see Corollary 2.1.18). Set
\[
K_r = \{x \in K : \|x\| \le r\}
\]
for a fixed, given $r > 0$.
The key is to find a suitable minorant $M$ for $T$, so that
\[
T(x) \ge M(x) \quad \text{for all } x \in K_r,
\tag{5.4.13}
\]
and which satisfies appropriate conditions. Furthermore, it is important to know a subsolution $x_0$, i.e.,
\[
M(x_0) \ge c\,x_0, \qquad c > 0, \quad x_0 > o.
\tag{5.4.14}
\]
The general Minorant Principle (if we know a subsolution of (5.4.11), then we can obtain a positive eigenvalue with a positive eigenvector of (5.4.11)) is formulated precisely in the following two theorems.

Theorem 5.4.27 (Krasnoselski). Suppose that
(i) $X$ is a real Banach space with an order cone $K$;
(ii) an operator $T : K_r \subset X \to X$ is compact and (5.4.13) holds;
(iii) a linear operator $M : K \to X$ is positive, and there are an $x_0 > o$ and a positive real number $c$ such that (5.4.14) holds.
Then for every $\varrho$ with $0 < \varrho < r$ the problem (5.4.11) has a positive solution $(x, \lambda)$ satisfying $\|x\| = \varrho$.

Theorem 5.4.28 (Zeidler). Let us set
\[
\alpha(x) = \sup\{t \ge 0 : x \ge t x_0\} \quad \text{for fixed } x_0 > o \text{ and all } x \in K.
\]
The conclusion of Theorem 5.4.27 still holds if we replace (iii) by the following condition:
(iii′) suppose that $M : K \subset X \to K$ is an operator, not necessarily linear, for which there is an $x_0 > o$ and there are real numbers $s$ with $0 < s \le 1$ and $c > 0$ such that
\[
M(x) \ge (\alpha(x))^{s}\, c\, x_0 \quad \text{for all } x \in K_r.
\tag{5.4.15}
\]

Theorem 5.4.27 is a special case of Theorem 5.4.28. Indeed, since $x \ge \alpha(x) x_0$ for $x \in K_r$, we have
\[
M(x) \ge \alpha(x) M(x_0) \ge \alpha(x)\, c\, x_0 \quad \text{for all } x \in K_r.
\]
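Thus, (5.4.14) implies (5.4.15) with $s = 1$.

Proof of Theorem 5.4.28. We will use a regularization method and the Schauder Fixed Point Theorem (Theorem 5.1.11). Let us first solve an auxiliary problem
\[
\lambda_n x_n = T_n(x_n), \qquad \lambda_n > 0, \quad x_n > o, \quad \|x_n\| = \varrho,
\tag{5.4.16}
\]
where
\[
T_n(x) := T(x) + \frac{x_0}{n}, \qquad n = 1, 2, \dots, \quad 0 < \varrho \le r.
\]
Let $n$ be fixed. Set
\[
z(x) = \frac{\varrho\, x}{\|x\|} \quad \text{for } x \ne o \qquad \text{and} \qquad z(o) = o.
\]
For $x \in K_\varrho$ we set
\[
S(x) = \|x\|\, T_n(z(x)) + \frac{(\varrho - \|x\|)\, x_0}{n}.
\]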
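Then $S$ is compact on $K_\varrho$ (explain why!), and by (5.4.13), (5.4.16)
\[
S(x) \ge \frac{\|x\|\, x_0}{n} + \frac{(\varrho - \|x\|)\, x_0}{n} = \frac{\varrho\, x_0}{n} > o \quad \text{for all } x \in K_\varrho.
\]
So there is an $a_n > 0$ such that
\[
\|S(x)\| \ge a_n \quad \text{for all } x \in K_\varrho.
\]
It follows from the boundedness of $S(K_\varrho)$ that there exists a number $b_n > 0$ such that
\[
0 < a_n \le \|S(x)\| \le b_n \quad \text{for all } x \in K_\varrho.
\tag{5.4.17}
\]
By (5.4.17),
\[
V(x) = \frac{\varrho\, S(x)}{\|S(x)\|}
\]
is well defined on $K_\varrho$. Furthermore, the operator $V : K_\varrho \to K_\varrho$ is compact on the closed, bounded, and convex set $K_\varrho$ (why?). By Theorem 5.1.11 (the Schauder Fixed Point Theorem) there is an $x_n \in K_\varrho$ such that
\[
x_n = V(x_n).
\]
In particular, $\|x_n\| = \|V(x_n)\| = \varrho$, so $z(x_n) = x_n$. Therefore
\[
x_n = \frac{\varrho^2\, T_n(x_n)}{\|S(x_n)\|},
\]
which means that $x_n$ is a solution of (5.4.16) with $\lambda_n = \dfrac{\|S(x_n)\|}{\varrho^2}$.

Before we pass to the limit for $n \to \infty$, we estimate the value of $\lambda_n$. Namely, we will show that there exist numbers $a, b > 0$ such that
\[
0 < a \le \lambda_n \le b \quad \text{for all } n \in \mathbb{N}.
\tag{5.4.18}
\]
It follows from (5.4.16) that
\[
\varrho\, \lambda_n \le \|T(x_n)\| + \|x_0\|, \qquad \text{so} \qquad b := \sup_{n \in \mathbb{N}} \lambda_n < \infty.
\]
On the other hand, $x_n \ge \alpha(x_n) x_0$ implies that there exists $\gamma$ such that
\[
\gamma := \sup_{n \in \mathbb{N}} \alpha(x_n) < \infty.
\]
Indeed, otherwise there would be a subsequence, again denoted by $\{x_n\}_{n=1}^{\infty}$, with $\alpha(x_n) \to \infty$ as $n \to \infty$, contradicting
\[
o < x_0 \le \frac{x_n}{\alpha(x_n)} \to o \quad \text{as } n \to \infty.
\]
Now (5.4.16), (5.4.13) and (5.4.15) imply that
\[
\lambda_n x_n = T(x_n) + \frac{x_0}{n} \ge M(x_n) + \frac{x_0}{n} \ge \frac{x_0}{n}.
\]
Therefore, $\alpha(x_n) > 0$, and furthermore
\[
\lambda_n x_n \ge M(x_n) \ge (\alpha(x_n))^{s}\, c\, x_0,
\]
i.e., the definition of $\alpha(x_n)$ implies
\[
\alpha(x_n) \ge \frac{(\alpha(x_n))^{s}\, c}{\lambda_n}, \qquad \text{so} \qquad \lambda_n \ge \alpha(x_n)^{s-1} c \ge \gamma^{s-1} c =: a.
\]
This proves (5.4.18).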
Now, we pass to the limit $n \to \infty$ in (5.4.16). Using (5.4.18) and $\|x_n\| = \varrho$, we can find convergent subsequences, again denoted by $\{\lambda_n\}_{n=1}^{\infty}$ and $\{T(x_n)\}_{n=1}^{\infty}$, with $\lambda_n \to \lambda$ and $T(x_n) \to y$ strongly in $X$. By (5.4.18), $\lambda > 0$. Then we have also strong convergence in $X$ for
\[
x_n = \lambda_n^{-1}\Bigl(T(x_n) + \frac{x_0}{n}\Bigr) \to x.
\]
Hence $\lambda x = T(x)$ and $x \in K$, $\|x\| = \varrho$.

Example 5.4.29. We will consider the nonlinear system of equations
\[
\lambda \xi_i = f_i(\xi_1, \dots, \xi_N), \qquad i = 1, \dots, N,
\tag{5.4.19}
\]
with $x = (\xi_1, \dots, \xi_N)$ and $x \in K := \mathbb{R}^{N,+}$, $\|x\| = \varrho$, $\lambda > 0$. The following assertion (the Generalized Perron Theorem) is a consequence of Theorem 5.4.27:

Suppose that $f_i : K \to (0, \infty)$ is continuous for $i = 1, \dots, N$ and that there is a fixed $r > 0$ for which
\[
f_i(\xi_1, \dots, \xi_N) \ge \sum_{j=1}^{N} \mu_{ij}\, \xi_j \quad \text{holds for all } x \in K \text{ with } \|x\| \le r,
\tag{5.4.20}
\]
and $i = 1, \dots, N$. Assume that all the real numbers $\mu_{ij}$ are nonnegative, and that
\[
\min_{1 \le i \le N} \sum_{j=1}^{N} \mu_{ij} > 0.
\]
Then (5.4.19) has a positive solution for every $\varrho$ with $0 < \varrho \le r$.

Indeed, we can write (5.4.19) as $\lambda x = T(x)$ and apply Theorem 5.4.27 with $X = \mathbb{R}^N$, $X^{+} = \mathbb{R}^{N,+}$, $x_0 = (1, \dots, 1)$, and
\[
M(x) := (\eta_1, \dots, \eta_N) \quad \text{where} \quad \eta_i = \sum_{j=1}^{N} \mu_{ij}\, \xi_j.
\]
Example 5.4.30. We will consider the nonlinear integral equation
\[
\lambda x(t) = \int_a^b A(t, s)\, f(s, x(s))\, ds
\tag{5.4.21}
\]
on a finite interval $[a, b]$ with $\lambda > 0$. This time, for fixed $\mu > 0$ and $r > 0$, the key condition (substituting (5.4.20)) is
\[
f(s, x) \ge \mu x \quad \text{for all } (s, x) \in [a, b] \times [0, r].
\tag{5.4.22}
\]
Applying Theorem 5.4.27 we have the following assertion (the Generalized Jentzsch Theorem):

Suppose $A : [a, b] \times [a, b] \to \mathbb{R}$ is continuous, nonnegative, and
\[
\min_{t \in [a, b]} \int_a^b A(t, s)\, ds > 0.
\]
Let $f : [a, b] \times \mathbb{R} \to \mathbb{R}$ be continuous and let (5.4.22) be satisfied. Then for every $\varrho$ with $0 < \varrho \le r$, (5.4.21) has a positive solution $x \in C^{+}[a, b]$ with $\|x\| = \varrho$.
Indeed, we write (5.4.21) as $\lambda x = T(x)$ and apply Theorem 5.4.27 with $X = C[a, b]$, $X^{+} = C^{+}[a, b]$, $x_0(t) \equiv 1$ and
\[
M(x)(t) := \mu \int_a^b A(t, s)\, x(s)\, ds.
\]
Proposition 5.4.31. Let $T : X \to X$ be a compact linear positive operator on a real ordered Banach space $X$. Then there exists a positive solution of (5.4.11) if and only if (5.4.11) has a positive subsolution.

Proof. This assertion is an immediate consequence of the Minorant Principle (Theorem 5.4.27 with $M = T$).

Our goal is to sharpen this result. Let $T : X \to X$ be a linear operator and let $r(T)$ denote the spectral radius of the complexification of $T$.55 We call $\lambda$ a simple eigenvalue of $T$ if its multiplicity $m(\lambda)$ is equal to 1.56 Recall that this means
\[
\dim \operatorname{Ker}(\lambda I - T) = 1 \quad \text{and} \quad \operatorname{Ker}(\lambda I - T)^2 = \operatorname{Ker}(\lambda I - T).
\]
Let $K^{*} := X^{*,+}$ denote the set of all positive functionals $x^{*} \in X^{*}$, i.e.,
\[
\langle x^{*}, x \rangle \ge 0 \quad \text{for all } x \in K := X^{+}.
\]
We write $x^{*} \ge o$ if $x^{*}$ is positive. Furthermore, $x^{*} > o$ means
\[
x^{*} \ge o \quad \text{and} \quad \langle x^{*}, x \rangle > 0 \quad \text{for a certain } x \in K.
\]
We call $x^{*}$ strictly positive if
\[
x > o \quad \text{always implies} \quad \langle x^{*}, x \rangle > 0.
\]
A cone $K := X^{+} \subset X$ is called total if $\operatorname{Lin}(K)$ is dense in $X$. Then $K$ being total implies that $K^{*}$ is an order cone (cf. Exercise 5.4.43). In this case we call $K^{*}$ the dual order cone of $K$. In particular, $K$ is total if $\operatorname{int} K \ne \emptyset$ (cf. Exercise 5.4.41). For $X = \mathbb{R}^N$, $K = \mathbb{R}^{N,+}$ we have $X^{*} = X$, $K^{*} = K$ (explain why!).

Proposition 5.4.32 (Krein–Rutman). Let $X$ be a real Banach space with a total order cone $K$. Suppose that $T : X \to X$ is linear, compact, and positive, with $r(T) > 0$. Then $r(T)$ is an eigenvalue of both $T$ and $T^{*}$ with eigenvectors in $K$ and $K^{*}$, respectively.

If $T$ is strongly positive, we get a sharper version of the previous assertion.

55 If $X$ is a real Banach space, then by the complexification of $T$ we mean the operator $T_{\mathbb{C}} : X_{\mathbb{C}} \to X_{\mathbb{C}}$ defined by $T_{\mathbb{C}}(x + iy) = T(x) + iT(y)$, $x, y \in X$, where $X_{\mathbb{C}}$ is the complexification of $X$ in the sense of Example 1.1.6(iii).

56 The significance of simple eigenvalues, roughly speaking, is that their behavior is very stable under perturbations of the operator (cf. Example 4.2.4 and, in more detail, Kato [73]). For this reason, simple eigenvalues also play a special role in the bifurcation theory.
Theorem 5.4.33 (Krein–Rutman). Let $X$ be a real Banach space with an order cone $K$ having nonempty interior. Then any linear, compact, and strongly positive operator $T : X \to X$ has the following properties:
(i) $T$ has exactly one eigenvector with $x > o$ and $\|x\| = 1$. The corresponding eigenvalue is $r(T)$ and it is algebraically simple. Furthermore, $x \gg o$.
(ii) If $\lambda \in \sigma(T)$, $\lambda \ne r(T)$, then $|\lambda| < r(T)$.
(iii) The dual operator $T^{*}$ has $r(T)$ as an algebraically simple eigenvalue with a strictly positive eigenvector $x^{*}$.

Remark 5.4.34. Recall that by the Riesz–Schauder theory (see Theorem 2.2.9), the spectrum of $T$ consists of at most countably many nonzero eigenvalues of finite multiplicity which can accumulate only at the origin, and $0 \in \sigma(T)$ whenever $\dim X = \infty$. The spectra of $T$ and $T^{*}$ coincide ($X$ is a real space).

Now, we will give proofs of Proposition 5.4.32 and Theorem 5.4.33.

Proof of Proposition 5.4.32. Let us consider $T$ on the complexification $X_{\mathbb{C}} = X + iX$. By the Riesz–Schauder theory (see Theorem 2.2.9), all of the nonzero points of the spectrum of $T$ are eigenvalues of finite multiplicity. The same holds for $T^{*}$. Note that
\[
\sigma(T) \cap \{\lambda : |\lambda| = r(T)\} \ne \emptyset.
\]
We consider the eigenvalues $\lambda$ of $T$ satisfying $|\lambda| = r(T)$, and distinguish three cases.

Case 1 ($\lambda_0 = r(T)$ is an eigenvalue). Our goal is to construct an $x > o$ and an $x^{*} > o$ such that $T(x) = \lambda_0 x$ and $T^{*}(x^{*}) = \lambda_0 x^{*}$. From footnote 3 on page 57 we have
\[
(\lambda I - T)^{-1} u = \sum_{j=0}^{\infty} \frac{T^{j} u}{\lambda^{j+1}} \quad \text{for } \lambda > r(T),
\]
and, therefore,
\[
(\lambda I - T)^{-1} u \ge o \quad \text{for } \lambda > \lambda_0 \text{ and } u \ge o.
\]
Since $T$ is compact, $\lambda_0 \ne 0$ is an eigenvalue of finite multiplicity (Remark 2.2.10) and in the Laurent series
\[
(\lambda I - T)^{-1} = \sum_{n=-\infty}^{+\infty} (\lambda - \lambda_0)^{n} A_n,
\tag{5.4.23}
\]
there is an index $k$ such that
\[
A_{-n} = O \quad \text{for all } n > k \quad \text{and} \quad A_{-k} \ne O
\]
(Proposition 3.1.15). So, there is $u > o$ such that $x := A_{-k} u \ne o$ (otherwise $A_{-k} = O$ since $K$ is total). It follows from Proposition 3.1.15 that $T x = \lambda_0 x$.
Moreover, by (5.4.23) and its proof (cf. page 114)
\[
x = A_{-k} u = \lim_{\lambda \to \lambda_0+} (\lambda - \lambda_0)^{k} (\lambda I - T)^{-1} u \ge o, \quad \text{i.e.,} \quad x > o.
\]
Let us construct the element $x^{*}$. By the previous step, $A_{-k}(K) \subset K$. We choose a $u^{*} \in K^{*}$ with $\langle u^{*}, x \rangle > 0$. This is possible by Exercise 5.4.42. We set $x^{*} = A_{-k}^{*} u^{*}$. Then $v \ge o$ implies
\[
\langle x^{*}, v \rangle = \langle u^{*}, A_{-k} v \rangle \ge 0 \quad \text{and} \quad \langle x^{*}, u \rangle = \langle u^{*}, x \rangle > 0.
\]
Thus $x^{*} > o$. Passing to the dual operator in (5.4.23), we obtain $\lambda_0 x^{*} = T^{*}(x^{*})$ analogously as above.

Case 2 (there is an eigenvalue $\lambda_0 \in \mathbb{C}$ of $T$ with $|\lambda_0| = r(T)$ and $\lambda_0^{n} > 0$ for an $n \in \mathbb{N}$, i.e., $\operatorname{Arg} \lambda_0$ is a rational multiple of $2\pi$ 57). Now $T^{n}$ has a positive eigenvalue $\lambda_0^{n}$ which lies on the spectral circle of $T^{n}$, so by Case 1 there exists a $u > o$ with $T^{n}(u) = \lambda_0^{n} u$. If we set
\[
x = |\lambda_0|^{n-1} u + |\lambda_0|^{n-2} T(u) + \dots + T^{n-1}(u),
\]
then $x > o$ and $T(x) = |\lambda_0| x$. Analogously we construct an $x^{*}$ for $T^{*}$.

Case 3 (none of the eigenvalues of $T$ with $|\lambda| = r(T)$ has the property from Case 2). We show that this is impossible. So, let $\lambda_0$ be an eigenvalue of $T$ with $|\lambda_0| = r(T)$ and with the greatest possible real part. We set
\[
T_\varepsilon = T + \varepsilon T^2 \quad \text{for } \varepsilon > 0.
\]
By the Spectral Mapping Theorem (see Proposition 3.1.14(v)), all eigenvalues of $T_\varepsilon$ are of the form $\lambda + \varepsilon \lambda^2$ where $\lambda$ is an eigenvalue of $T$. One can check that $\lambda_1 = \lambda_0 + \varepsilon \lambda_0^2$ and $\overline{\lambda_1}$ are the eigenvalues of $T_\varepsilon$ of greatest absolute value (the reader is asked to justify it!). There is a sequence $\{\varepsilon_k\}_{k=1}^{\infty}$, $\varepsilon_k \to 0$, such that $\operatorname{Arg} \lambda_1^{k}$ is a rational multiple of $2\pi$ where $\lambda_1^{k} = \lambda_0 + \varepsilon_k \lambda_0^2$ (explain why!). According to Case 2, there is $n \in \mathbb{N}$ such that $(\lambda_1^{k})^{n} > 0$. Since
\[
\lim_{k \to \infty} (\lambda_1^{k})^{n} = \lambda_0^{n} > 0,
\]
we get a contradiction.

Before we prove Theorem 5.4.33 we need the following geometrical result.
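Lemma 5.4.35. Let $X$ be a real Banach space with an order cone $K := X^{+}$ containing an interior point. Let $u \gg o$. Then for every $v \notin K$ there is a uniquely determined number $\alpha_u(v) > 0$ such that
(i) $0 \le \alpha \le \alpha_u(v)$ implies $u + \alpha v \ge o$;
(ii) $\alpha > \alpha_u(v)$ implies $u + \alpha v \notin K$.
In particular,
\[
u + \alpha v \gg o \quad \text{and} \quad \alpha > 0 \quad \text{imply} \quad \alpha < \alpha_u(v).
\tag{5.4.24}
\]

57 Any complex number $\lambda \ne 0$ can be written in the form $\lambda = |\lambda| e^{i \operatorname{Arg} \lambda}$.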
Proof. Consider the ray $\{u + \alpha v : \alpha \ge 0\}$. For small $\alpha \ge 0$ we have $u + \alpha v \in \operatorname{int} K$, and for large $\alpha \ge \alpha_0$ we have $u + \alpha v \notin K$. Otherwise $u + nv \in K$ for large $n \in \mathbb{N}$, and $\tfrac{u}{n} + v \in K$. Passing to the limit for $n \to \infty$, we obtain a contradiction $v \in K$. Set
\[
\alpha_u(v) := \sup\{\alpha > 0 : u + \alpha v \in \operatorname{int} K\}.
\]
It is easy to show that $\alpha_u(v)$ has the desired properties.

Proof of Theorem 5.4.33. We proceed in six steps.
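Step 1 (existence of a positive solution). We choose an $x > o$. Since $T$ is strongly positive, $T(x) \gg o$, so $T(x) \in \operatorname{int} K$. Thus $T(x) - \gamma x \in K$ for small $\gamma > 0$, so $T(x) \ge \gamma x$. By Proposition 5.4.31, there exists a positive solution $(e, \lambda_0)$:
\[
T(e) = \lambda_0 e \quad \text{with} \quad e > o \quad \text{and} \quad \lambda_0 > 0.
\]
Since $T(e) \gg o$, we have also $e \gg o$.

Step 2. We show: If $T(x) = \lambda x$, $x > o$ and $\lambda \in \mathbb{R}$, then $x = \gamma e$ for a positive $\gamma$ and $\lambda = \lambda_0$. To begin with, $T(x) \gg o$, so $\lambda > 0$ and $x \gg o$. We consider two identities
\[
T(e - \beta x) = \lambda_0 (e - \beta \lambda_0^{-1} \lambda x),
\tag{5.4.25}
\]
\[
T(x - \gamma e) = \lambda (x - \gamma \lambda^{-1} \lambda_0 e),
\tag{5.4.26}
\]
and choose
\[
\beta = \alpha_e(-x) \quad \text{and} \quad \gamma = \alpha_x(-e).
\]
Then $x = \gamma e$. Otherwise, $x - \gamma e > o$. This implies $T(x - \gamma e) \gg o$, and hence $\lambda^{-1} \lambda_0 < 1$ by (5.4.26) and (5.4.24). On the other hand, $e - \beta x \ge o$ immediately implies $T(e - \beta x) \ge o$, and (5.4.25) yields the contradiction $\lambda_0^{-1} \lambda \le 1$.

Step 3. We show: If $T(x) = \lambda x$ and $x \ne o$, $\lambda \in \mathbb{R} \setminus \{0\}$ as well as $x \ne \alpha e$ for all $\alpha \in \mathbb{R}$, then $|\lambda| < \lambda_0$. By Proposition 5.4.32, $\lambda_0 = r(T)$ now follows, and with respect to Step 2, $\dim \operatorname{Ker}(\lambda_0 I - T) = 1$. By Step 2, $\pm x \notin K$. We consider
\[
T(e \pm \beta_{\pm} x) = \lambda_0 (e \pm \beta_{\pm} \lambda_0^{-1} \lambda x)
\tag{5.4.27}
\]
and set $\beta_{\pm} = \alpha_e(\pm x)$. Since $e \pm \beta_{\pm} x \ne o$, we have $e \pm \beta_{\pm} x > o$, so $T(e \pm \beta_{\pm} x) \gg o$. Then (5.4.27) and (5.4.24) immediately imply $\lambda_0^{-1} |\lambda| < 1$.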
Step 4. We now consider the complexification $X_{\mathbb{C}} = X + iX$ and $T : X_{\mathbb{C}} \to X_{\mathbb{C}}$ (see footnote 55 on page 342). In this step we show: If $\lambda$ is a complex eigenvalue of $T$, then $|\lambda| < \lambda_0$. Let $\lambda = \sigma + i\tau$, $\sigma, \tau \in \mathbb{R}$, be an eigenvalue of $T$ and $z = x + iy$, $x, y \in X$, the corresponding eigenvector, i.e., according to the definition of $T$, we have
\[
T(x + iy) = (\sigma + i\tau)(x + iy),
\]
which is equivalent to
\[
T(x) = \sigma x - \tau y, \qquad T(y) = \tau x + \sigma y.
\tag{5.4.28}
\]
Our goal is to show that (5.4.28) implies
\[
|\lambda| = (\sigma^2 + \tau^2)^{1/2} < \lambda_0.
\]
The reader is invited to prove that if $\lambda$ is not real and (5.4.28) holds, then $x$ and $y$ are linearly independent elements of $X$ (cf. Remark 1.1.35(ii)). In particular, $x \ne o$ and $y \ne o$. Let $P$ be a two-dimensional plane in $X$ which consists of elements $\xi x + \eta y$, $\xi, \eta \in \mathbb{R}$. Then $P$ is an invariant subspace of the operator $T$, i.e., $T(P) \subset P$. Let $\tilde{T}$ be the restriction of $T$ onto $P$. Since also $T(K) \subset K$, the cone $\tilde{K} = K \cap P$ is invariant with respect to $\tilde{T}$, i.e., $\tilde{T}(\tilde{K}) \subset \tilde{K}$. We want to prove that
\[
\tilde{K} = \{o\}.
\tag{5.4.29}
\]
Assume the opposite, then $\tilde{K}$ is an order cone in $P$ and $\tilde{T} : P \to P$ is strongly positive since $T$ is strongly positive. According to Step 1, there exists a positive eigenvector $\tilde{e} \in P$ of $\tilde{T}$ (and hence also of $T$) such that $\tilde{e} \in \tilde{K}$. According to Step 2 we necessarily have $\tilde{e} = \gamma e$ for a certain $\gamma \ne 0$, $\gamma \in \mathbb{R}$. But this fact combined with (5.4.28) implies that $x$ and $y$ are linearly dependent, which is a contradiction, i.e., (5.4.29) is proved.

It now follows from (5.4.29) that no elements $\xi x + \eta y$ with $|\xi| + |\eta| > 0$ belong to $K$. In particular, $x \notin K$. Since $\operatorname{int} K \ne \emptyset$ implies that $K$ is total, there exist nonzero elements $x' \in \operatorname{int} K$ and $x'' \in \operatorname{int}(K)$ such that $x = x' - x''$.58 There exists $\beta > 0$ such that $T(x'') \le \beta e$. Indeed, since $e \in \operatorname{int} K$ we find $\beta > 0$ large enough to satisfy
\[
e - \frac{1}{\beta} T(x'') \in K.
\]

58 Indeed, if $v_0 \in \operatorname{int} K$, $v_0 \ne o$ and $\varepsilon > 0$ is small enough, then $u_0 = v_0 + \varepsilon x \in \operatorname{int} K$, $u_0 \ne o$. Hence $x = \tfrac{1}{\varepsilon} u_0 - \tfrac{1}{\varepsilon} v_0$.
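So, we have
\[
T(x) = T(x') - T(x'') \ge -T(x'') \ge -\beta e, \quad \text{i.e.,} \quad e + \frac{1}{\beta} T(x) \in K.
\]
It follows from (5.4.28) that $\psi = e + \tfrac{1}{\beta} T(x)$ can be written in the form
\[
\psi = \xi x + \eta y + e, \qquad |\xi| + |\eta| > 0.
\tag{5.4.30}
\]
Let $A$ be the set of all elements of the form (5.4.30) which belong to $K$. We have just shown that $A \ne \emptyset$. Let us consider a continuous function of two variables $f : A \to \mathbb{R}$ which with every $\psi \in A$ associates the number $\xi^2 + \eta^2$. Since $x \notin K$, $y \notin K$, the function $f$ must be bounded. It follows from the Extreme Value Theorem ($K$ is closed) that there is
\[
\psi_0 = \xi_0 x + \eta_0 y + e \in A \quad \text{such that} \quad f(\psi_0) = \max_{\psi \in A} f(\psi) =: M.
\]
It follows from the strict positivity of $T$ that there exists $\delta > 0$ such that $T(\psi_0) \ge \delta e$. Indeed, $\psi_0 \in K$, $\psi_0 \ne o$, implies $T(\psi_0) \in \operatorname{int} K$. We then can find $\delta > 0$ small enough to satisfy $T(\psi_0) - \delta e \in K$.

Let us assume without loss of generality that $\tfrac{\delta}{\lambda_0} < 1$. Let us rewrite $T(\psi_0) \ge \delta e$ as
\[
\Bigl(1 - \frac{\delta}{\lambda_0}\Bigr) \lambda_0 e + (\xi_1 x + \eta_1 y) \ge o \quad \text{where} \quad \xi_1 x + \eta_1 y = T(\xi_0 x + \eta_0 y).
\tag{5.4.31}
\]
Using (5.4.28), we have
\[
T(\xi_0 x + \eta_0 y) = (\xi_0 \sigma + \eta_0 \tau) x + (-\xi_0 \tau + \eta_0 \sigma) y
\]
and hence
\[
\xi_1 = \xi_0 \sigma + \eta_0 \tau, \qquad \eta_1 = -\xi_0 \tau + \eta_0 \sigma.
\tag{5.4.32}
\]
Then
\[
\xi_1^2 + \eta_1^2 = (\xi_0^2 + \eta_0^2)(\sigma^2 + \tau^2) = M |\lambda|^2.
\]
It follows from (5.4.31) that
\[
\psi_1 = e + \frac{\xi_1}{\bigl(1 - \frac{\delta}{\lambda_0}\bigr) \lambda_0}\, x + \frac{\eta_1}{\bigl(1 - \frac{\delta}{\lambda_0}\bigr) \lambda_0}\, y
\]
is an element of the form (5.4.30). Hence
\[
M \ge \Bigl(\frac{\xi_1}{\lambda_0 - \delta}\Bigr)^2 + \Bigl(\frac{\eta_1}{\lambda_0 - \delta}\Bigr)^2 = \frac{M |\lambda|^2}{(\lambda_0 - \delta)^2},
\]
which implies $|\lambda| < \lambda_0$.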
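Step 5. We show that $\lambda_0$ is simple. Since $\dim \operatorname{Ker}(\lambda_0 I - T) = 1$ (Step 3), it is enough to prove
\[
\operatorname{Ker}(\lambda_0 I - T)^2 = \operatorname{Ker}(\lambda_0 I - T).
\]
Let $(\lambda_0 I - T)^2 (x) = o$. By Step 2, this implies $(\lambda_0 I - T) x = \lambda_0 \gamma e$ for a certain $\gamma \in \mathbb{R}$. We want to show that $\gamma = 0$. Suppose $\gamma \ne 0$. We may assume that $\gamma > 0$, for otherwise we pass to $-x$. Set $\mu_0 = \lambda_0^{-1}$. Now $x = \mu_0 T(x + \gamma e)$ implies
\[
x + \gamma e = \mu_0 T(x + 2\gamma e) \quad \text{and} \quad x = \mu_0^2 T^2 (x + 2\gamma e).
\]
It follows by induction that $x = \mu_0^{n} T^{n}(x + n\gamma e)$, so
\[
\frac{x}{n} = \mu_0^{n} T^{n}\Bigl(\gamma e + \frac{x}{n}\Bigr) \quad \text{for all } n \in \mathbb{N}.
\tag{5.4.33}
\]
Since $e \in \operatorname{int} K$, we have $\gamma e + \tfrac{x}{n} \ge o$ for large $n$. By (5.4.33) and the positivity of $T$, we have $\tfrac{x}{n} \ge o$. Furthermore, from (5.4.33) and $\mu_0^{n} T^{n}(e) = e$ we immediately conclude
\[
\frac{x}{n} - \gamma e = \mu_0^{n} T^{n}\Bigl(\frac{x}{n}\Bigr) \ge o.
\]
Passing to the limit for $n \to \infty$ we get $-\gamma e \ge o$, so $\gamma = 0$, contradicting $\gamma > 0$.

Step 6 (examination of $T^{*}$). By Proposition 5.4.32 there exists $e^{*} > o$ such that $T^{*}(e^{*}) = \lambda_0 e^{*}$. We show that
\[
\langle e^{*}, x \rangle > 0 \quad \text{provided} \quad x > o,
\tag{5.4.34}
\]
i.e., $e^{*}$ is strictly positive. Indeed, let $x > o$. Then $T(x) \gg o$ and by Exercise 5.4.44, $\langle e^{*}, T(x) \rangle > 0$. So
\[
\lambda_0 \langle e^{*}, x \rangle = \langle T^{*}(e^{*}), x \rangle = \langle e^{*}, T(x) \rangle > 0.
\]
According to the Riesz–Schauder Theory (see Theorem 2.2.9),
\[
\dim \operatorname{Ker}(\lambda_0 I^{*} - T^{*}) = \dim \operatorname{Ker}(\lambda_0 I - T),
\]
which is equal to 1 by Steps 2 and 3. To prove that $\lambda_0$ is an algebraically simple eigenvalue of $T^{*}$ choose $x^{*} \in \operatorname{Ker}(\lambda_0 I^{*} - T^{*})^2$. Let $y^{*} = \lambda_0 x^{*} - T^{*} x^{*}$. Then $y^{*} = \alpha e^{*}$ for an $\alpha \in \mathbb{R}$. For any $x > o$ we have
\[
\alpha \langle e^{*}, x \rangle = \langle y^{*}, x \rangle = \langle x^{*}, \lambda_0 x - T x \rangle.
\]
In particular, taking $x = e$ we obtain $\alpha = 0$, i.e., $y^{*} = o$ and $x^{*} \in \operatorname{Ker}(\lambda_0 I^{*} - T^{*})$. This proves
\[
\operatorname{Ker}(\lambda_0 I^{*} - T^{*})^2 = \operatorname{Ker}(\lambda_0 I^{*} - T^{*}).
\]
This completes the proof of Theorem 5.4.33.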
The authors want to point out that another proof of the Krein–Rutman Theorem can be found in, e.g., Takáč [126].
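Corollary 5.4.36. Let $X$ and $T$ be as in Theorem 5.4.33. For every $y > o$, (5.4.12) has exactly one solution $x > o$ if $\lambda > r(T)$, and no such solution if $\lambda \le r(T)$. Moreover,
\[
\lambda x - T(x) = \mu y \quad \text{and} \quad x > o,\ y > o \quad \text{imply} \quad \operatorname{sgn}(\mu) = \operatorname{sgn}(\lambda - r(T)).
\]
Here $\lambda$ and $\mu$ are real numbers.

Proof. The resolvent $R_\lambda$ exists for $\lambda > r(T)$ and thus the equation
\[
\lambda x - T(x) = y
\tag{5.4.35}
\]
has a unique solution for any $y \in X$. Since $R_\lambda : K \to K$ by the proof of Proposition 5.4.32, $y > o$ implies $x > o$. On the other hand, if $\lambda \le r(T)$ and there is a positive solution $x$ of (5.4.35) for $y > o$, then choosing $e^{*} \in X^{*}$ as in Step 6 of the proof of Theorem 5.4.33 we arrive at
\[
(\lambda - r(T)) \langle e^{*}, x \rangle = \langle e^{*}, \lambda x - T(x) \rangle = \langle e^{*}, y \rangle > 0,
\]
a contradiction. Finally, let $x > o$, $y > o$ and
\[
\lambda x - T(x) = \mu y \quad \text{for a certain } \mu \in \mathbb{R}.
\]
Then
\[
(\lambda - r(T)) \langle e^{*}, x \rangle = \langle e^{*}, \lambda x - T(x) \rangle = \mu \langle e^{*}, y \rangle, \quad \text{i.e.,} \quad \operatorname{sgn}(\lambda - r(T)) = \operatorname{sgn} \mu.
\]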
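In the finite-dimensional setting $X = \mathbb{R}^2$, $K = \mathbb{R}^{2,+}$, the statement of Corollary 5.4.36 can be checked directly. The following sketch uses a hypothetical positive matrix `T` and right-hand side `y` chosen only for illustration; it solves $\lambda x - T(x) = y$ once above and once below $r(T)$ and evaluates the identity used in the proof with a left eigenvector playing the role of $e^{*}$.

```python
import numpy as np

# Hypothetical strongly positive operator on R^2 (all entries positive).
T = np.array([[2.0, 1.0],
              [1.0, 3.0]])
r = float(np.max(np.abs(np.linalg.eigvals(T))))   # spectral radius, about 3.618

def solve(lam, y):
    """Solve lam*x - T x = y."""
    return np.linalg.solve(lam * np.eye(2) - T, y)

y = np.array([1.0, 1.0])
x_above = solve(4.0, y)    # lam > r(T): the solution is positive
x_below = solve(3.0, y)    # lam < r(T): the solution is not positive

# Left eigenvector of T for r(T), normalized to be positive (plays the role of e*).
w, V = np.linalg.eig(T.T)
e_star = np.real(V[:, np.argmax(np.real(w))])
e_star = e_star * np.sign(e_star.sum())

lhs = (4.0 - r) * (e_star @ x_above)   # (lam - r(T)) <e*, x>
rhs = e_star @ y                       # <e*, y>
```

Here `x_above` equals $(2, 3)$ and `x_below` equals $(-1, -2)$, and `lhs` agrees with `rhs` up to rounding.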
Corollary 5.4.37 (Comparison Principle). Let $X$ and $T$ be as in Theorem 5.4.33. If $S : X \to X$ is a compact linear operator with
\[
S(x) \ge T(x) \quad \text{for all } x \ge o,
\]
then $r(S) \ge r(T)$. If $S(x) > T(x)$ for all $x > o$, then $r(S) > r(T)$.

Proof. Let
\[
S(x) \ge T(x) \quad \text{for all } x \ge o.
\]
Choose $e > o$ such that $T(e) = r(T) e$. Then $S(e) \ge T(e) = r(T) e$. By Proposition 5.4.31, $r(S) \in \sigma(S)$ and therefore $r(S) \ge r(T)$. In order to prove the second part of the statement we choose $x > o$ with $S(x) = r(S) x$ (see Proposition 5.4.32). We now set $A := S - T$ and choose $e^{*}$ as in Step 6 of the proof of Theorem 5.4.33. Then
\[
r(S) \langle e^{*}, x \rangle = \langle e^{*}, A(x) \rangle + \langle e^{*}, T(x) \rangle = \langle e^{*}, A(x) \rangle + \langle T^{*}(e^{*}), x \rangle = \langle e^{*}, A(x) \rangle + r(T) \langle e^{*}, x \rangle.
\]
By (5.4.34), we have $\langle e^{*}, x \rangle > 0$ and also $\langle e^{*}, A(x) \rangle > 0$, and thus $r(S) > r(T)$.
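For matrices with positive entries the Comparison Principle can be observed numerically. The matrices below are hypothetical examples chosen for illustration: `S_weak` dominates `T` on the cone $\mathbb{R}^{3,+}$, while `S_strict` satisfies $S(x) > T(x)$ for every $x > o$.

```python
import numpy as np

def spectral_radius(A):
    """Largest modulus of an eigenvalue of the square matrix A."""
    return float(np.max(np.abs(np.linalg.eigvals(A))))

# Hypothetical positive matrix; its eigenvalues are 2 + 2*sqrt(3), 2, 2 - 2*sqrt(3).
T = np.array([[1.0, 2.0, 1.0],
              [3.0, 1.0, 2.0],
              [1.0, 1.0, 4.0]])

S_weak = T.copy()
S_weak[0, 0] += 1.0        # S_weak x >= T x for all x >= 0
S_strict = T + 0.5         # every entry larger: S_strict x > T x for all x > 0

r_T = spectral_radius(T)
r_weak = spectral_radius(S_weak)
r_strict = spectral_radius(S_strict)
```

In this sketch `r_T` is $2 + 2\sqrt{3}$, and both `r_weak` and `r_strict` are at least as large, with `r_strict` strictly larger.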
Example 5.4.38. Let $X = \mathbb{R}^N$ and $X^{+} = \mathbb{R}^{N,+}$. Further, let $T$ be a real $(N \times N)$ matrix of positive elements only. Then $T : X \to X$ is linear, compact, and strongly positive. The conclusions of Theorem 5.4.33 coincide with those of the classical Perron Theorem.

Example 5.4.39. Let $\Omega$ be a bounded domain in $\mathbb{R}^N$. We set $X = C(\Omega)$, $X^{+} = C^{+}(\Omega)$ (cf. Example 5.4.7) and consider the integral equation
\[
\lambda x(t) = \int_{\Omega} A(t, s)\, x(s)\, ds \quad \text{for all } t \in \Omega,
\tag{5.4.36}
\]
with a positive continuous kernel $A : \Omega \times \Omega \to \mathbb{R}$. If we write (5.4.36) in the form
\[
\lambda x = T(x), \qquad x \in X,
\]
then Theorem 5.4.33 is the classical Jentzsch Theorem.
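The finite-dimensional case of Example 5.4.38 lends itself to a quick numerical illustration. The sketch below uses a hypothetical positive matrix and plain power iteration (a standard numerical device, not the argument used in the proofs above) to approximate the eigenpair described in Theorem 5.4.33(i) and to compare it with the remaining eigenvalues as in (ii).

```python
import numpy as np

def perron_pair(T, n_iter=200):
    """Approximate the dominant eigenpair of a positive matrix by power iteration."""
    x = np.ones(T.shape[0])
    for _ in range(n_iter):
        y = T @ x
        x = y / np.linalg.norm(y)     # keep the Euclidean norm equal to 1
    lam = float(x @ (T @ x))          # eigenvalue estimate for a unit vector x
    return lam, x

# Hypothetical matrix with positive entries only.
T = np.array([[1.0, 2.0, 1.0],
              [3.0, 1.0, 2.0],
              [1.0, 1.0, 4.0]])

lam, x = perron_pair(T)
moduli = np.sort(np.abs(np.linalg.eigvals(T)))   # ascending moduli of all eigenvalues
```

For this matrix `lam` approximates $r(T) = 2 + 2\sqrt{3}$, the vector `x` has only positive components, and the other eigenvalues have strictly smaller modulus.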
In the next example we use some facts from the forthcoming Chapter 7. The reader who is not acquainted with the properties of the Laplace operator can skip this example or consider the one-dimensional case and replace the Laplace operator by the second derivative. Example 5.4.40. Let us consider the eigenvalue problem for the Laplace operator subject to the homogeneous Dirichlet boundary conditions
\[
\begin{cases}
-\Delta u(x) = \mu u(x) & \text{in } \Omega,\\
u(x) = 0 & \text{on } \partial\Omega,
\end{cases}
\tag{5.4.37}
\]
where $\Omega$ is a bounded domain in $\mathbb{R}^N$ and $\partial\Omega$ is its boundary (cf. Remark 7.2.2). Then (5.4.37) can be written in the form (5.4.36) with $\lambda = \frac{1}{\mu}$ where $A = A(t, s)$ is the Green function associated with the Laplace equation with the homogeneous Dirichlet boundary conditions. Since $A$ is a positive continuous function $A : \Omega \times \Omega \to \mathbb{R}$ (see, e.g., Gilbarg & Trudinger [59]), we can apply the result of Example 5.4.39. Multiplying the equation in (5.4.37) by $u = u(x)$ ($u$ is a real function) and using the Green Formula (cf. footnote 7 on page 479), we find
\[
\int_{\Omega} |\nabla u(x)|^2\, dx = \mu \int_{\Omega} u^2(x)\, dx,
\]
which shows that (5.4.37) has only positive real eigenvalues. It then follows from Example 5.4.39 (and hence from the Krein–Rutman Theorem) that (5.4.37) has the least eigenvalue $\mu_1 > 0$ which is simple and which is the only eigenvalue of (5.4.37) having a positive eigenfunction $\varphi_1(x) > 0$, $x \in \Omega$.

Exercise 5.4.41. Show that if $\operatorname{int} K \ne \emptyset$, then $K$ is a total cone and construct an example of a cone which is not total.
Hint. If $y \in \operatorname{int} K$, then $y \pm \alpha x \in K$ for every $x \in X$ with $\alpha > 0$ sufficiently small. Thus $X = K - K$ because
\[
x = \frac{(y + \alpha x) - (y - \alpha x)}{2\alpha}.
\]
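The conclusions of Example 5.4.40 can be observed in the one-dimensional case $\Omega = (0,1)$, where the Laplace operator is the second derivative. The sketch below is an illustration under stated assumptions: the operator is replaced by the standard second-order finite-difference matrix `L` on a uniform grid, and its inverse `G` serves as a discrete stand-in for the Green function.

```python
import numpy as np

m = 99                         # number of interior grid points
h = 1.0 / (m + 1)
# Finite-difference approximation of -u'' with u(0) = u(1) = 0.
L = (2.0 * np.eye(m) - np.eye(m, k=1) - np.eye(m, k=-1)) / h**2

mu, phi = np.linalg.eigh(L)    # eigenvalues in ascending order
mu1 = mu[0]
phi1 = phi[:, 0] * np.sign(phi[:, 0].sum())   # fix the sign of the eigenvector

G = np.linalg.inv(L)           # discrete Green operator
r_G = float(np.max(np.abs(np.linalg.eigvals(G))))
```

In this discretization `mu1` is close to $\pi^2$, the first eigenvector is positive at every grid point while the second one changes sign, all entries of `G` are positive, and `r_G` equals `1 / mu1` up to rounding, in agreement with the relation $\lambda = 1/\mu$.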
Exercise 5.4.42. Show that for every $x \in K \setminus \{o\}$, there exists an $x^{*} \in X^{*}$ such that $\langle x^{*}, x \rangle > 0$.
Hint. Since $-x \notin K$ and $K$ is closed, $-x$ is an exterior point of $K$. Consequently, there is an open convex neighborhood $U$ of $-x$ which is disjoint from $K$. By the Separation Theorem for convex sets,59 there is an $x^{*} \in X^{*}$ with $x^{*}(K) \ge 0$ and $x^{*}(U) < 0$. Hence $\langle x^{*}, x \rangle > 0$.

Exercise 5.4.43. Show that if $K$ is total, then $K^{*}$ is an order cone on $X^{*}$.
Hint. $K \ne \{o\}$ implies $K^{*} \ne \{o\}$ by Exercise 5.4.42. Suppose $\pm x^{*} \in K^{*}$. We have to show that $x^{*} = o$. Indeed, $\langle x^{*}, \pm x \rangle \ge 0$ for all $x \in K$ implies $\langle x^{*}, x \rangle \ge 0$ for all $x \in X$, because $K$ is total. Hence $x^{*} = o$.

Exercise 5.4.44. Let $x^{*} \in X^{*}$. Show that if $x^{*} > o$ (i.e., $x^{*} \ge o$ and $\langle x^{*}, y \rangle > 0$ for a $y > o$), then $\langle x^{*}, x \rangle > 0$ for all $x \in \operatorname{int} K$.
Hint. Suppose $\langle x^{*}, x \rangle = 0$ for an $x \in \operatorname{int} K$. Then $x \pm \alpha y \in K$ for sufficiently small $\alpha > 0$. Hence $\langle x^{*}, x \pm \alpha y \rangle \ge 0$, so $\langle x^{*}, y \rangle = 0$. This is a contradiction.

Exercise 5.4.45. Prove that the functional $v \mapsto \alpha_u(v)$ from Lemma 5.4.35 is continuous.

Exercise 5.4.46. Apply the Krein–Rutman Theorem to the problems in Examples 2.1.32 and 2.2.17.
5.4B Supersolutions, Subsolutions and Topological Degree

In this appendix we show the connection between the supersolution and subsolution on the one hand and the topological degree on the other. We consider the quasilinear boundary value problem
\[
\begin{cases}
-\bigl(|\dot{x}(t)|^{p-2} \dot{x}(t)\bigr)^{\cdot} = f(t, x(t)), & t \in (0,1),\\
x(0) = x(1) = 0,
\end{cases}
\tag{5.4.38}
\]
as a model example. A special case of it was studied in Examples 5.2.51 and 5.3.24. However, in this appendix we work in different function spaces. Here $p > 1$ is a real number and $f : [0,1] \times \mathbb{R} \to \mathbb{R}$ is a function the properties of which will be specified later. By a solution of (5.4.38) we understand a function $x \in C^1[0,1]$ with $x(0) = x(1) = 0$ such that $|\dot{x}|^{p-2} \dot{x}$ is absolutely continuous and the equation in (5.4.38) holds a.e. in $(0,1)$. Clearly, the problem (5.4.38) formally coincides with (5.4.6) if $p = 2$.

Definition 5.4.47. A function $u_0 \in C^1[0,1]$ with $|\dot{u}_0|^{p-2} \dot{u}_0$ absolutely continuous is called a subsolution of (5.4.38) if
\[
u_0(0) \le 0, \qquad u_0(1) \le 0
\]
and
\[
-\bigl(|\dot{u}_0(t)|^{p-2} \dot{u}_0(t)\bigr)^{\cdot} \le f(t, u_0(t)) \quad \text{for a.e. } t \in (0,1).
\]
In an analogous way we define a supersolution $v_0$ of (5.4.38).

We write $x \prec y$ if and only if
\[
x(t) < y(t), \qquad t \in (0,1),
\]
and either
\[
x(0) < y(0) \qquad \text{or} \qquad x(0) = y(0) \ \text{and} \ \dot{x}(0) < \dot{y}(0),
\]
and the same alternatives hold at 1.60

59 This is a minor supplement of Corollary 2.1.18.

Definition 5.4.48. A subsolution $u_0$ of (5.4.38) is said to be strict if every possible solution $x$ of (5.4.38) such that $u_0 \le x$ on $[0,1]$ satisfies $u_0 \prec x$. In an analogous way we define a strict supersolution of (5.4.38).

Let us formulate (5.4.38) as a “fixed point” operator equation. Assume that for any $y \in C_0^1[0,1] := \{x \in C^1[0,1] : x(0) = x(1) = 0\}$ we have
\[
f(t, y(t)) \in L^{\infty}(0,1).
\]
We denote by $T : C_0^1[0,1] \to C_0^1[0,1]$ the solution operator of
\[
\begin{cases}
-\bigl(|\dot{x}(t)|^{p-2} \dot{x}(t)\bigr)^{\cdot} = f(t, y(t)), & t \in (0,1),\\
x(0) = x(1) = 0,
\end{cases}
\tag{5.4.39}
\]
i.e., for $x, y \in C_0^1[0,1]$, $x = T(y)$ if and only if the equation in (5.4.39) holds a.e. in $(0,1)$. For any fixed $y \in C_0^1[0,1]$ it follows by integration of (5.4.39) and the injectivity of $\varphi(s) = |s|^{p-2} s$ that the operator $T$ is well defined. Clearly, the problem (5.4.38) has a solution $x$ if and only if $x = T(x)$, i.e., $x$ is a fixed point of $T$.

Let $f$ be a Carathéodory function and for any $r > 0$ let there exist a constant $h_r > 0$ such that for a.e. $t \in (0,1)$ and for all $s \in (-r, r)$,
\[
|f(t, s)| < h_r.
\tag{5.4.40}
\]
This condition is satisfied if, e.g., f (t, x(t)) = h(t) − g(x(t)) where h ∈ L∞ (0, 1) and g : R → R is a continuous function (cf. Examples 5.2.51 and 5.3.24). We prove that the operator T is compact. To this purpose we express T in the integral form. By the Rolle Theorem for any x = T (y) there exists tx ∈ [0, 1] such that x(t ˙ x ) = 0, i.e., -p −2 tx - tx f (τ, y(τ )) dτ -f (τ, y(τ )) dτ (5.4.41) x(t) ˙ =t
t - x(t) = -
and
0
where p =
p . p−1
t
-p −2 tx f (τ, y(τ )) dτ --
σ
f (τ, y(τ )) dτ
dσ
(5.4.42)
σ
If yn → y0 in C01 [0, 1], then the continuity of the Nemytski operator y → f (·, y)
60 Here
tx
(5.4.43)
x(0) ˙ and x(1) ˙ mean the derivative from the right and from the left, respectively.
5.4B. Supersolutions, Subsolutions and Topological Degree
353
from $C[0,1]$ into $C[0,1]$, and (5.4.41), (5.4.42) imply that $x_n \to x_0$ in $C_0^1[0,1]$ where $x_n = T(y_n)$, $x_0 = T(y_0)$, i.e., $T$ is continuous. Let $M \subset C_0^1[0,1]$ be a bounded set. To prove the compactness of $T$ we have to show that $T(M)$ is relatively compact. Let $\{x_n\}_{n=1}^\infty \subset T(M)$ be an arbitrary sequence, $x_n = T(y_n)$, $y_n \in M$. It follows from the compact embedding $C_0^1[0,1] \subset\subset C[0,1]$ (see Theorem 1.2.13) that there exists a subsequence $\{y_{n_k}\}_{k=1}^\infty \subset \{y_n\}_{n=1}^\infty$ which converges uniformly on $[0,1]$. But the continuity of the Nemytski operator (5.4.43) and (5.4.41), (5.4.42) imply that $\{x_{n_k}\}_{k=1}^\infty$ converges in $C_0^1[0,1]$, i.e., $T(M)$ is relatively compact. Hence the compactness of $T$ is proved.

The following assertion is referred to as the well-ordered case of a supersolution and a subsolution.

Theorem 5.4.49 (well-ordered case). Let $f$ be a Carathéodory function satisfying (5.4.40). Assume that $u_0$ and $v_0$ are a subsolution and a supersolution of (5.4.38), respectively, with $u_0 \le v_0$ (see Figure 5.4.4). Then the problem (5.4.38) has at least one solution $x$ satisfying
\[
u_0 \le x \le v_0 \quad \text{in } [0,1].
\]
If, moreover, $u_0$ and $v_0$ are strict and satisfy $u_0 \prec v_0$, then there exists $R_0 > 0$ such that for all $R > R_0$,
\[
\deg(I - T, \Omega_1, o) = 1
\]
where
\[
\Omega_1 := \{x \in C_0^1[0,1] : u_0 \prec x \prec v_0\} \cap B(o; R)
\]
is an open set in $C_0^1[0,1]$ (cf. Exercise 5.4.53).
Figure 5.4.4. Well-ordered case

Proof. Set
\[
\tilde f(t, y) := \begin{cases}
f(t, y) & \text{if } u_0(t) \le y \le v_0(t),\\
f(t, u_0(t)) & \text{if } y \le u_0(t),\\
f(t, v_0(t)) & \text{if } y \ge v_0(t).
\end{cases}
\]
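The truncation above simply evaluates $f$ at the projection of $y$ onto the interval $[u_0(t), v_0(t)]$. A minimal Python sketch of this construction follows; the names `truncate`, `f_tilde` and the sample data `f`, `u0`, `v0` are illustrative assumptions, not part of the text.

```python
from typing import Callable

Func = Callable[[float], float]            # t -> value, e.g. u0 or v0
Nonlin = Callable[[float, float], float]   # (t, y) -> f(t, y)


def truncate(f: Nonlin, u0: Func, v0: Func) -> Nonlin:
    """Return f_tilde, which freezes f outside the strip u0(t) <= y <= v0(t)."""
    def f_tilde(t: float, y: float) -> float:
        lo, hi = u0(t), v0(t)
        if lo > hi:
            raise ValueError("well-ordered case requires u0(t) <= v0(t)")
        y_clamped = min(max(y, lo), hi)    # project y onto [u0(t), v0(t)]
        return f(t, y_clamped)
    return f_tilde


# Hypothetical data, chosen only for illustration.
f = lambda t, y: t - y ** 3
u0 = lambda t: -1.0
v0 = lambda t: 1.0
f_tilde = truncate(f, u0, v0)
```

Inside the strip `f_tilde` agrees with `f`; outside it no longer depends on `y`, which is why the modified right-hand side is bounded by a function of `t` alone.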
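Every solution of
\[
\begin{cases}
-\bigl(|\dot x(t)|^{p-2}\dot x(t)\bigr)^{\cdot} = \tilde f(t, x(t)), & t \in (0,1),\\
x(0) = x(1) = 0,
\end{cases} \tag{5.4.44}
\]
is a solution of (5.4.38). Indeed, assume that $x$ solves (5.4.44) and $x > v_0$ in an interval $I_+ \subset (0,1)$ and $x = v_0$ on $\partial I_+$. Then
\[
\int_0^1 \left|\frac{dx(t)}{dt}\right|^{p-2} \frac{dx(t)}{dt}\, \frac{d}{dt}\bigl(x(t) - v_0(t)\bigr)^{*}\,dt = \int_0^1 f(t, v_0(t))\bigl(x(t) - v_0(t)\bigr)^{*}\,dt \tag{5.4.45}
\]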
Chapter 5. Topological and Monotonicity Methods
where
\[
\bigl(x(t) - v_0(t)\bigr)^{*} = \begin{cases}
x(t) - v_0(t) & \text{on } I_+,\\
0 & \text{on } [0,1] \setminus I_+.
\end{cases}
\]
Since $v_0$ is a supersolution, we have
\[
\int_0^1 \left|\frac{dv_0(t)}{dt}\right|^{p-2} \frac{dv_0(t)}{dt}\, \frac{d}{dt}\bigl(x(t) - v_0(t)\bigr)^{*}\,dt \ge \int_0^1 f(t, v_0(t))\bigl(x(t) - v_0(t)\bigr)^{*}\,dt. \tag{5.4.46}
\]
Hence, combining (5.4.45) and (5.4.46), we obtain
\[
\int_{I_+} \bigl(|\dot x(t)|^{p-2}\dot x(t) - |\dot v_0(t)|^{p-2}\dot v_0(t)\bigr)\bigl(\dot x(t) - \dot v_0(t)\bigr)\,dt \le 0.
\]
This is a contradiction,61 which proves that
\[
x(t) \le v_0(t), \quad t \in (0,1).
\]
The same argument shows that
\[
x(t) \ge u_0(t), \quad t \in (0,1).
\]
Now, denote by $\tilde T(y)$ a solution of the boundary value problem
\[
\begin{cases}
-\bigl(|\dot x(t)|^{p-2}\dot x(t)\bigr)^{\cdot} = \tilde f(t, y(t)), & t \in (0,1),\\
x(0) = x(1) = 0
\end{cases}
\]
for $y \in C_0^1[0,1]$. Then $\tilde T\colon C_0^1[0,1] \to C_0^1[0,1]$ is compact62 and the solutions of (5.4.44) are in a one-to-one correspondence with the fixed points of $\tilde T$. The definition of $\tilde f$ ensures that there exists a constant $R_0 > 0$ such that for any $y \in C_0^1[0,1]$ we have
\[
\|\tilde T(y)\|_{C_0^1[0,1]} < R_0 \tag{5.4.47}
\]
(see (5.4.41), (5.4.42)). By the Schauder Fixed Point Theorem $\tilde T$ has a fixed point $x$ in $B(o; R_0)$, i.e., $x$ is a solution of (5.4.44). It follows from the above considerations that $u_0 \le x \le v_0$, and so $x$ is also a desired solution of (5.4.38).

The proof of the second part follows from the fact that due to (5.4.47), we can construct an admissible homotopy
\[
H(\tau, \cdot) := I - \tau \tilde T, \quad \tau \in [0,1],
\]
which shows that
\[
\deg(I - \tilde T, B(o; R_0), o) = \deg(I, B(o; R_0), o) = 1.
\]
Since $u_0$ and $v_0$ are strict and there is no solution $x$ of the equation $x - \tilde T(x) = o$ for which either $x(t) < u_0(t)$ or $x(t) > v_0(t)$ for a $t \in (0,1)$, it follows from Theorem 5.2.13(iv) that
\[
\deg(I - \tilde T, \Omega_1, o) = \deg(I - \tilde T, B(o; R_0), o) = 1.
\]
The assertion now follows from the fact that $T$ and $\tilde T$ coincide in $\Omega_1$.
61 Note that $s \mapsto |s|^{p-2}s$ is a strictly increasing function!
62 The proof of this fact is the same as that for $T$.
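The map $s \mapsto |s|^{p-2}s$ from footnote 61 is strictly increasing, and its inverse is the same map with $p$ replaced by the conjugate exponent $p' = p/(p-1)$; this is what produces the powers $p'-2$ in (5.4.41) and (5.4.42). A small Python sketch that checks both facts on sample values is given below. The function names `phi` and `conjugate` and the choice $p = 3$ are assumptions made for illustration.

```python
def phi(s: float, xi: float) -> float:
    """phi_s(xi) = |xi|^(s-2) * xi for xi != 0, and phi_s(0) = 0 (s > 1)."""
    if s <= 1:
        raise ValueError("s must be greater than 1")
    if xi == 0.0:
        return 0.0
    return abs(xi) ** (s - 2) * xi


def conjugate(p: float) -> float:
    """Conjugate exponent p' = p / (p - 1)."""
    return p / (p - 1)


p = 3.0                                        # illustrative choice of p
samples = [-2.0, -0.5, 0.0, 0.3, 1.0, 4.0]     # increasing sample points
images = [phi(p, s) for s in samples]          # should be increasing as well
round_trip = [phi(conjugate(p), phi(p, s)) for s in samples]
```

Here `images` is again an increasing list and `round_trip` reproduces `samples` up to rounding, in line with the monotonicity used in the contradiction argument.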
The next assertion is referred to as the non-well-ordered case of a supersolution and a subsolution.

Theorem 5.4.50 (non-well-ordered case). Let $f$ be a Carathéodory function which satisfies the following assumption: there are $c_i > 0$, $i = 1, 2$, such that
\[
|f(t, s)| \le c_1 + c_2 |s|^{p-1} \quad \text{for a.e. } t \in (0,1) \text{ and for all } s \in \mathbb{R} \tag{5.4.48}
\]
and, moreover,
\[
\lim_{|s| \to \infty} \frac{f(t, s)}{|s|^{p-2}s} = \lambda_1.^{63} \tag{5.4.49}
\]
Assume that $u_0$ and $v_0$ are a subsolution and a supersolution of (5.4.38), respectively, and there exists $t_0$ such that $u_0(t_0) > v_0(t_0)$ (see Figure 5.4.5).
Figure 5.4.5. Non-well-ordered case

Then (5.4.38) has at least one solution in the closure (with respect to the $C^1$-norm) of the set
\[
S := \{x \in C_0^1[0,1] : \exists\, t_1, t_2 \in (0,1),\ x(t_1) < u_0(t_1),\ x(t_2) > v_0(t_2)\}.
\]
Set $\Omega_2 := S \cap B(o; R)$ and assume that there is no solution of (5.4.38) on $\partial\Omega_2$. Then there exists $R_0 > 0$ such that for all $R > R_0$,
\[
\deg(I - T, \Omega_2, o) = -1.
\]
Proof. If (5.4.38) has a solution on $\partial S$, we are done. Let us assume in the sequel that (5.4.38) does not have a solution on $\partial S$. For $r > 0$ let us define
\[
f_r(t, y) = \begin{cases}
f(t, y) & \text{if } |y| \le r,\\
(1 + r - |y|)\, f(t, y) & \text{if } r < |y| < r + 1,\\
0 & \text{if } |y| \ge r + 1.
\end{cases}
\]
63 Here $\lambda_1$ is the first eigenvalue of (5.2.47), see Example 5.2.51.
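The cut-off $f_r$ keeps $f$ unchanged for small $|y|$, switches it off for large $|y|$, and interpolates with the linear weight $1 + r - |y|$ in between. The following Python sketch mirrors the three cases; the helper name `cutoff` and the sample nonlinearity are hypothetical, and the endpoint values $|y| = r$ and $|y| = r + 1$ are assigned by continuity.

```python
from typing import Callable

Nonlin = Callable[[float, float], float]   # (t, y) -> f(t, y)


def cutoff(f: Nonlin, r: float) -> Nonlin:
    """Return f_r: f for |y| <= r, zero for |y| >= r + 1, linear weight between."""
    if r <= 0:
        raise ValueError("r must be positive")

    def f_r(t: float, y: float) -> float:
        a = abs(y)
        if a <= r:
            return f(t, y)
        if a < r + 1:
            return (1 + r - a) * f(t, y)   # weight decreases from 1 to 0
        return 0.0
    return f_r


f = lambda t, y: 2.0 * y + t               # hypothetical nonlinearity
f_2 = cutoff(f, 2.0)
```

Since `f_2` vanishes for `|y| >= 3`, large constants become natural candidates for a subsolution and a supersolution of the modified problem, which is how the cut-off is used later in the proof.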
Next we show that there is $K > 0$ such that for any $r > 0$ and for any possible solution of
\[
\begin{cases}
-\bigl(|\dot x(t)|^{p-2}\dot x(t)\bigr)^{\cdot} = f_r(t, x(t)), & t \in (0,1),\\
x(0) = x(1) = 0,
\end{cases} \tag{5.4.50}
\]
the following a priori estimate holds:
\[
\|x\|_{C_0^1[0,1]} \le K. \tag{5.4.51}
\]
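To prove this fact we argue by contradiction, and thus we assume that for any $k \in \mathbb{N}$ there are $r_k > 0$, $x_k \in S$ solving
\[
\begin{cases}
-\bigl(|\dot x_k(t)|^{p-2}\dot x_k(t)\bigr)^{\cdot} = f_{r_k}(t, x_k(t)), & t \in (0,1),\\
x_k(0) = x_k(1) = 0,
\end{cases} \tag{5.4.52}
\]
and satisfying $\|x_k\| \ge k$. Set $y_k := \frac{x_k}{\|x_k\|}$ and divide (5.4.52) by $\|x_k\|^{p-1}$ to obtain
\[
\begin{cases}
-\bigl(|\dot y_k(t)|^{p-2}\dot y_k(t)\bigr)^{\cdot} = \dfrac{f_{r_k}(t, x_k(t))}{\|x_k\|^{p-1}}, & t \in (0,1),\\
y_k(0) = y_k(1) = 0.
\end{cases}
\]
By integration we find that $\{y_k\}_{k=1}^\infty$ equivalently satisfies
\[
\dot y_k(t) = \varphi_{p'}\!\left( \varphi_p(\dot y_k(0)) - \int_0^t \frac{f_{r_k}(\sigma, x_k(\sigma))}{\|x_k\|^{p-1}}\,d\sigma \right) \tag{5.4.53}
\]
and
\[
y_k(t) = \int_0^t \varphi_{p'}\!\left( \varphi_p(\dot y_k(0)) - \int_0^\tau \frac{f_{r_k}(\sigma, x_k(\sigma))}{\|x_k\|^{p-1}}\,d\sigma \right) d\tau, \quad t \in [0,1], \tag{5.4.54}
\]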
where for $s > 1$ we set $\varphi_s(\xi) = |\xi|^{s-2}\xi$ if $\xi \ne 0$ and $\varphi_s(0) = 0$. Now, since $\|y_k\| = 1$, by passing to a subsequence if necessary, we have
\[
y_k \to y \quad \text{in } C_0[0,1] := \{x \in C[0,1] : x(0) = x(1) = 0\} \quad \text{for a } y \in C_0[0,1].^{64}
\]
But then (5.4.53) yields
\[
y_k \to y \quad \text{in } C_0^1[0,1]
\]
(note that without loss of generality we may also assume that $\{\dot y_k(0)\}_{k=1}^\infty$ forms a convergent sequence!). It follows from (5.4.54), (5.4.48), (5.4.49) and the Lebesgue Dominated Convergence Theorem that $y$ solves the problem
\[
\begin{cases}
-\bigl(|\dot y(t)|^{p-2}\dot y(t)\bigr)^{\cdot} = \lambda_1 |y(t)|^{p-2} y(t), & t \in (0,1),\\
y(0) = y(1) = 0.
\end{cases}
\]
Since $\|y\| = 1$, it follows that $y$ is a nonzero multiple of the first eigenfunction $\varphi_1(t) > 0$ in $(0,1)$ (see Example 5.2.51). If $y > 0$ in $(0,1)$, then we find that $x_k(t) \to \infty$ for any $t \in (0,1)$, which contradicts $x_k \in S$. Also $y < 0$ in $(0,1)$ leads to a contradiction. Hence the a priori estimate (5.4.51) is proved.
64 This is a consequence of the Arzelà–Ascoli Theorem.
Now choose $R > R_0 = \max\{K, \|u_0\|_{C[0,1]}, \|v_0\|_{C[0,1]}\} + 1$ and consider (5.4.50) with $r = R$ and $x_k = x$, i.e.,
\[
\begin{cases}
-\bigl(|\dot x(t)|^{p-2}\dot x(t)\bigr)^{\cdot} = f_R(t, x(t)), & t \in (0,1),\\
x(0) = x(1) = 0.
\end{cases} \tag{5.4.55}
\]
Obvious modifications of the definition of a strict subsolution and supersolution of (5.4.38) lead to the same notions associated with (5.4.55). Then $\alpha = -R-2$ and $\beta = R+2$ are a subsolution and a supersolution, respectively, associated with (5.4.55). Both are actually strict. Indeed, assume, e.g., that $x$ is a solution of (5.4.55), $x(t) \ge -R-2$ and $x(t_0) = -R-2$ for a certain $t_0 \in (0,1)$. Then $x(t_0) = \min_{\tau \in (0,1)} x(\tau)$, i.e., $\dot x(t_0) = 0$ and there exists $\eta > 0$ such that $x(t) < -R-1$ for $t \in [t_0, t_0 + \eta)$. But $f_R(t, x(t)) = 0$ by definition, so $x(t) \equiv -R-2$ in $(t_0, t_0 + \eta]$. Since this implies that $x(t) \equiv -R-2$ in $(t_0, 1]$, we obtain a contradiction. The same argument applies to $R+2$. Notice also that $\alpha \prec v_0$ and $u_0 \prec \beta$.

Now, let us define $T_R\colon C_0^1[0,1] \to C_0^1[0,1]$ by $x := T_R(y)$ where $x$ is a solution of the problem
\[
\begin{cases}
-\bigl(|\dot x(t)|^{p-2}\dot x(t)\bigr)^{\cdot} = f_R(t, y(t)), & t \in (0,1),\\
x(0) = x(1) = 0,
\end{cases}
\]
and define the sets
\[
S_{\alpha\beta} := \{x \in C_0^1[0,1] : \alpha \prec x \prec \beta\}, \quad
S_{u_0\beta} := \{x \in C_0^1[0,1] : u_0 \prec x \prec \beta\}
\]
and
\[
S_{\alpha v_0} := \{x \in C_0^1[0,1] : \alpha \prec x \prec v_0\}
\]
(see Figure 5.4.6).

Figure 5.4.6. (Graphs of $\alpha = -R-2$, $u_0$, $v_0$ and $\beta = R+2$ over $[0,1]$.)
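By definition, $T_R$ and $T$ coincide in the ball $B(o; R)$. Applying Theorem 5.4.49 and Theorem 5.2.13(iv) we obtain
\[
\begin{aligned}
1 &= \deg(I - T_R, B(o; R) \cap S_{\alpha\beta}, o)\\
&= \deg(I - T_R, B(o; R) \cap S_{\alpha v_0}, o) + \deg(I - T_R, B(o; R) \cap S_{u_0\beta}, o) + \deg(I - T_R, \Omega_2, o)\\
&= 2 + \deg(I - T_R, \Omega_2, o),
\end{aligned}
\]
i.e., $\deg(I - T, \Omega_2, o) = \deg(I - T_R, \Omega_2, o) = -1$, which completes the proof.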
Remark 5.4.51. There are several applications of Theorems 5.4.49 and 5.4.50. Generalizations of these results to the case of partial differential equations can also be found in the literature, see, e.g., Drábek, Girg & Manásevich [40].

In the next assertion we present one application of Theorems 5.4.49 and 5.4.50 which under suitable assumptions on $f$ yields the multiplicity of solutions of (5.4.38).

Theorem 5.4.52. Let $f$ be as in Theorem 5.4.50 and let $u_0^i$ and $v_0^i$, $i = 1, 2$, be subsolutions and supersolutions of (5.4.38), respectively, which satisfy
\[
u_0^1 \prec v_0^1, \quad u_0^2 \prec v_0^2,
\]
and let there exist $t_0 \in (0,1)$ such that $u_0^2(t_0) > v_0^1(t_0)$ (see Figure 5.4.7). Then the problem (5.4.38) has at least three distinct solutions.

Figure 5.4.7. (Graphs of $u_0^1$, $v_0^1$, $u_0^2$, $v_0^2$ and of the solutions $x_1$, $x_2$, $x_3$ over $[0,1]$.)

Proof. It follows from Theorem 5.4.49 that there are solutions $x_i = x_i(t)$, $i = 1, 2$, of (5.4.38) which satisfy
\[
u_0^1 \prec x_1 \prec v_0^1, \quad u_0^2 \prec x_2 \prec v_0^2.
\]
Now, let us apply Theorem 5.4.50 with a subsolution $u_0^2$ and a supersolution $v_0^1$. We get another solution $x_3 = x_3(t)$ of (5.4.38). Clearly, all $x_i$, $i = 1, 2, 3$, are mutually different.
Exercise 5.4.53. Prove that $\Omega_1$ from Theorem 5.4.49 is an open set in $C_0^1[0,1]$.

Exercise 5.4.54. Formulate conditions on $f = f(t, x)$ which guarantee that the problem (5.4.38) has a pair of well-ordered supersolution and subsolution.

Exercise 5.4.55. Formulate conditions on $f = f(t, x)$ which guarantee that the problem (5.4.38) has a pair of non-well-ordered supersolution and subsolution.

Exercise 5.4.56. Formulate conditions on $f = f(t, x)$ which guarantee that the problem (5.4.38) has two pairs of supersolutions and subsolutions which satisfy the assumptions from Theorem 5.4.49.
Chapter 6
Variational Methods

6.1 Local Extrema

In this section we present necessary and/or sufficient conditions for local extrema of real functionals. The most famous ones are the Euler and Lagrange necessary conditions and the Lagrange sufficient condition. We also present the brachistochrone problem, one of the oldest problems in the calculus of variations, and discuss regularity of the point of a local extremum.

The methods presented in this section are motivated by the equation
\[
f(x) = 0 \tag{6.1.1}
\]
where $f$ is a continuous real function defined in $\mathbb{R}$. The solution of this equation can be transformed to the problem of finding a local extremum of the integral $F$ of $f$ (i.e., $F'(x) = f(x)$, $x \in \mathbb{R}$). Indeed, if there exists a point $x_0 \in \mathbb{R}$ at which the function $F$ has its local extremum, then the derivative $F'(x_0)$ necessarily vanishes due to a familiar theorem of first-semester calculus. The problem of finding solutions of (6.1.1) can thus be transformed to the problem of finding local extrema of the function $F$. On the other hand, one should keep in mind that the equation (6.1.1) may have a solution which is not a local extremum of $F$.

In what follows we will deal with real functionals
\[
F\colon X \to \mathbb{R}
\]
where $X$ is a normed linear space with the norm $\|\cdot\|$.

Definition 6.1.1. We say that $F$ has a local minimum (maximum) at a point $a \in X$ if there exists a neighborhood $U$ of $a$ such that for all $x \in U \setminus \{a\}$ we have
\[
F(x) \ge F(a) \quad (F(x) \le F(a)).
\]
If the inequalities are strict, we speak about a strict local minimum (strict local maximum). If the functional $F$ has a (strict) local minimum or (strict) local maximum at $a$, we say that it has a (strict) local extremum at $a$. In Figure 6.1.1 the critical point $a$ is not a point of extremum of $F$.

Figure 6.1.1.
The fundamental assertion is the following Euler (or Fermat) Necessary Condition.

Proposition 6.1.2 (Euler Necessary Condition). Let $F\colon X \to \mathbb{R}$ have a local extremum at $a \in X$. If for $v \in X$ the derivative $\delta F(a; v)$ exists, then
\[
\delta F(a; v) = 0.
\]
Proof. Set
\[
g(t) = F(a + tv), \quad t \in \mathbb{R}.
\]
Then $g$ attains a local extremum at $t = 0$, thus $0 = g'(0) = \delta F(a; v)$.
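The proof reduces the statement to the one-variable function $g(t) = F(a + tv)$. As a rough numerical illustration on $X = \mathbb{R}^N$, the sketch below approximates $g'(0)$ by a central difference. The helper `gateaux`, the step size `h` and the two sample functionals are assumptions chosen for this example only.

```python
from typing import Callable, Sequence


def gateaux(F: Callable[[Sequence[float]], float],
            a: Sequence[float], v: Sequence[float], h: float = 1e-6) -> float:
    """Central-difference approximation of g'(0) for g(t) = F(a + t v)."""
    plus = [ai + h * vi for ai, vi in zip(a, v)]
    minus = [ai - h * vi for ai, vi in zip(a, v)]
    return (F(plus) - F(minus)) / (2 * h)


# Hypothetical functional on R^2 with a minimum at a = (1, -3).
F = lambda x: (x[0] - 1.0) ** 2 + 2.0 * (x[1] + 3.0) ** 2
a = [1.0, -3.0]
at_minimum = gateaux(F, a, [0.7, -0.2])        # approximately 0
away = gateaux(F, [2.0, -3.0], [1.0, 0.0])     # approximately 2

# G(x) = x^3 has a critical point at 0 that is not an extremum.
G = lambda x: x[0] ** 3
at_inflection = gateaux(G, [0.0], [1.0])       # approximately 0
```

The last value shows that a vanishing directional derivative is only a necessary condition, in the spirit of Figure 6.1.1.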
Definition 6.1.3. If
\[
\delta F(a; v) = 0 \quad \text{for all } v \in X,
\]
then $a$ is called a critical point of the functional $F$.1

The more precise Lagrange Necessary Condition distinguishes between local minima and maxima, but requires the existence of the second derivative in the given direction.

Proposition 6.1.4 (Lagrange Necessary Condition). Let $F\colon X \to \mathbb{R}$ have a local minimum (maximum) at $a \in X$. If for $v \in X$ the second derivative $\delta^2 F(a; v, v)$ exists, then
\[
\delta^2 F(a; v, v) \ge 0 \quad (\delta^2 F(a; v, v) \le 0).
\]
1 Cf. Definition 4.3.6.
Proof. Let $g$ be as in the proof of Proposition 6.1.2. Then $g''(0) = \delta^2 F(a; v, v)$. Now we can apply the Lagrange necessary condition for local extrema of the real function $g$ of one real variable to get the conclusion.

Contrary to Propositions 6.1.2 and 6.1.4, the Lagrange Sufficient Condition provides the information when a critical point of $F$ is a point of its local minimum or local maximum.

Theorem 6.1.5 (Lagrange Sufficient Condition). Let $a \in X$ be a critical point of $F\colon X \to \mathbb{R}$. Let there exist a neighborhood $U$ of $a$ such that the mapping $x \mapsto D^2F(x)$ is continuous in $U$. If there exists $\alpha > 0$ such that
\[
D^2F(a)(v, v) \ge \alpha \|v\|^2 \quad (D^2F(a)(v, v) \le -\alpha \|v\|^2) \quad \text{for any } v \in X,
\]
then $F$ has a strict local minimum (maximum) at $a$.

Proof. Let $v \in X$ be such that $a + v \in U$. Then according to Proposition 3.2.27 we have
\[
F(a + v) - F(a) = \int_0^1 (1 - t)\, D^2F(a + tv)(v, v)\,dt.^{2} \tag{6.1.2}
\]
On the other hand,
\[
\begin{aligned}
D^2F(a + tv)(v, v) &\ge D^2F(a)(v, v) - |D^2F(a + tv)(v, v) - D^2F(a)(v, v)|\\
&\ge \bigl[\alpha - \|D^2F(a + tv) - D^2F(a)\|_{B_2(X,\mathbb{R})}\bigr] \|v\|^2.
\end{aligned}
\]
The continuity of $D^2F(x)$ in $U$ implies that there is $\delta > 0$ so small that for $\|v\| < \delta$, $t \in [0,1]$,
\[
\|D^2F(a + tv) - D^2F(a)\|_{B_2(X,\mathbb{R})} < \alpha, \tag{6.1.3}
\]
i.e., for $0 < \|v\| < \delta$ we have (due to (6.1.2) and (6.1.3))
\[
F(a + v) > F(a).
\]
The proof for a strict local maximum is similar.

Let us illustrate the general statements at first on a function of several real variables $F\colon \mathbb{R}^N \to \mathbb{R}$.

Example 6.1.6. Let $F\colon \mathbb{R}^N \to \mathbb{R}$ have all partial derivatives of the first order at a point $a \in \mathbb{R}^N$ and, moreover, let the function $F$ have a local extremum at $a$. Then Proposition 6.1.2 states that
\[
\frac{\partial F}{\partial x_1}(a) = \frac{\partial F}{\partial x_2}(a) = \cdots = \frac{\partial F}{\partial x_N}(a) = 0. \tag{6.1.4}
\]
2 We can assume that $U$ is convex. Then $D^2F(a + tv)$ exists and is continuous for all $t \in [0,1]$.
On the other hand, it is well known that (6.1.4) does not imply that $F$ has a local extremum at the point $a$. To check that this is the case we can apply Theorem 6.1.5. If $F$ has continuous second partial derivatives in a neighborhood of $a$, then we should investigate the quadratic form
\[
D^2F(a)(v, v) = \sum_{i,j=1}^{N} \frac{\partial^2 F}{\partial x_i \partial x_j}(a)\, v_i v_j. \tag{6.1.5}
\]
To prove that $F$ has, e.g., a local minimum at $a$, it is enough to show that there exists $\alpha > 0$ such that for any $v \in \mathbb{R}^N$, $\|v\| = 1$,
\[
D^2F(a)(v, v) \ge \alpha. \tag{6.1.6}
\]
(Here we have used the fact that the quadratic form is homogeneous.) Since we are in finite dimension, the unit sphere in $\mathbb{R}^N$ is a compact set. Then (6.1.6) holds with an $\alpha > 0$ whenever
\[
D^2F(a)(v, v) > 0 \quad \text{for all } \|v\| = 1.^{3} \tag{6.1.7}
\]
The reader is invited to justify that (6.1.7) implies (6.1.6) and to explain why this is not the case when $\mathbb{R}^N$ is replaced by a space of infinite dimension. It follows from linear algebra4 that for any quadratic form on $\mathbb{R}^N$ there exists a basis $\{u_1, \dots, u_N\}$ of $\mathbb{R}^N$ and numbers $\lambda_1, \dots, \lambda_N$ such that for any $v$ of the form
\[
v = \sum_{i=1}^{N} \xi_i u_i
\]
we have
\[
D^2F(a)(v, v) = \sum_{i=1}^{N} \lambda_i \xi_i^2.
\]
The inequality (6.1.7) holds if and only if all $\lambda_i$, $i = 1, \dots, N$, are positive, and so according to Theorem 6.1.5 the function $F$ has a strict local minimum at $a$. If there is at least one positive and at least one negative number among $\lambda_i$, $i = 1, \dots, N$, then according to Proposition 6.1.4 the function $F$ does not have a local extremum at $a$.

Before we give an application in an infinite dimensional space, we prove the following assertion for convex functionals.
3 Here we use the fact that a positive continuous function on a compact set achieves its minimal value which has to be positive.
4 See also Corollary 6.3.9. (Remember that $\frac{\partial^2 F}{\partial x_i \partial x_j}(a) = \frac{\partial^2 F}{\partial x_j \partial x_i}(a)$.)
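For $N = 2$ the numbers $\lambda_1, \lambda_2$ in Example 6.1.6 can be taken as the eigenvalues of the symmetric Hessian matrix, which have a closed form. The following Python sketch classifies a critical point from the three second partial derivatives; the function names and the sample Hessians are illustrative assumptions.

```python
import math
from typing import Tuple


def eigenvalues_sym2(a: float, b: float, c: float) -> Tuple[float, float]:
    """Eigenvalues (ascending) of the symmetric matrix [[a, b], [b, c]]."""
    mean = 0.5 * (a + c)
    radius = math.hypot(0.5 * (a - c), b)
    return mean - radius, mean + radius


def classify(a: float, b: float, c: float) -> str:
    """Classify a critical point from the Hessian entries F_xx, F_xy, F_yy."""
    lo, hi = eigenvalues_sym2(a, b, c)
    if lo > 0:
        return "strict local minimum"    # Theorem 6.1.5
    if hi < 0:
        return "strict local maximum"    # Theorem 6.1.5 applied to -F
    if lo < 0 < hi:
        return "no local extremum"       # Proposition 6.1.4
    return "inconclusive"                # an eigenvalue vanishes


# Hessians at the origin of some sample functions (hypothetical examples).
case_min = classify(2.0, 1.0, 2.0)       # F = x^2 + x y + y^2
case_saddle = classify(2.0, 0.0, -2.0)   # F = x^2 - y^2
case_flat = classify(0.0, 0.0, 0.0)      # F = x^4 + y^4
```

The "inconclusive" branch corresponds to the situation where neither Theorem 6.1.5 nor Proposition 6.1.4 decides the question, as for $x^4 + y^4$, which does have a minimum at the origin.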
Definition 6.1.7. Let $M \subset X$ be a convex set. A functional $F\colon X \to \mathbb{R}$ is said to be convex in $M$ if for any $u, v \in M$ and $t \in [0,1]$ we have
\[
F(tu + (1 - t)v) \le tF(u) + (1 - t)F(v).
\]
The functional $F$ is said to be strictly convex in $M$ if for any $u, v \in M$, $u \ne v$ and $t \in (0,1)$ we have
\[
F(tu + (1 - t)v) < tF(u) + (1 - t)F(v).
\]
Proposition 6.1.8. Let $F\colon X \to \mathbb{R}$ be a convex functional on a normed linear space $X$. Then every critical point of $F$ in $X$ is a point of minimum of $F$ over $X$.

Proof. Without loss of generality we can assume that
\[
F(o) = 0 \quad \text{and} \quad \delta F(o; v) = 0 \quad \text{for any } v \in X
\]
(i.e., $o \in X$ is a critical point). Assume that $F$ does not achieve the minimum value over $X$ at $o \in X$. Then there exists $u \in X$ for which $F(u) = \alpha < 0$. The convexity of $F$ implies that
\[
F(tu + (1 - t)o) \le t\alpha \quad \text{for any } t \in (0,1),
\]
i.e.,
\[
\frac{F(tu) - F(o)}{t} \le \alpha < 0. \tag{6.1.8}
\]
But (6.1.8) implies $\delta F(o; u) \le \alpha < 0$, which is a contradiction.
The following result will be needed several times in the further text.

Lemma 6.1.9 (Fundamental Lemma in Calculus of Variations). Let $I$ be an open interval and $f \in L^1_{\mathrm{loc}}(I)$. If
\[
\int_I f(x)\varphi'(x)\,dx = 0 \quad \text{for any } \varphi \in D(I),^{5} \tag{6.1.9}
\]
then $f = \text{const.}$ a.e. in $I$.

Proof. Let $J$ be a compact subinterval of $I$ and $\varphi$ a mollifier, $\varphi \in D(\mathbb{R})$, $\operatorname{supp} \varphi \subset [-1, 1]$ (see Proposition 1.2.20(iv)). For
\[
g(x) = \begin{cases} f(x), & x \in J,\\ 0, & x \in \mathbb{R} \setminus J, \end{cases}
\]
we have $g \in L^1(\mathbb{R})$, and thus
\[
\lim_{n \to \infty} g * \varphi_n = g \quad \text{in the } L^1(\mathbb{R})\text{-norm}^{6}
\]
and (passing to a subsequence, cf. Remark 1.2.18) also a.e. in $\mathbb{R}$. Since
\[
(g * \varphi_n)'(y) = \int_{\mathbb{R}} g(x)\varphi_n'(y - x)\,dx = \int_I f(x)\varphi_n'(y - x)\,dx
\]
whenever $\left[y - \frac{1}{n}, y + \frac{1}{n}\right] \subset J$, by the assumption (6.1.9), $(g * \varphi_n)(y)$ is constant for all such $y$. The convergence of $g * \varphi_n$ to $g$ means that $g$ is constant a.e. in $J$, i.e., $f = \text{const.}$ a.e. in $I$.

One of the oldest problems in the calculus of variations is studied in detail in the following example.

Example 6.1.10 (Brachistochrone Problem). The problem is formulated as follows: "For two given points $A$ and $B$ in a vertical plane find a curve connecting $A$ and $B$ which is optimal among all other such curves in the following sense. The point $P$ of unit mass which starts from $A$ with zero velocity and moves along this curve only due to the gravitational force will reach the point $B$ in a minimal time."7

In order to find a suitable mathematical model we shall assume that the points $A = (0, 0)$ and $B = (a, b)$, $b \ge 0$, are situated in a vertical plane with the coordinate system chosen as in Figure 6.1.2. The reader is invited to verify that such a position of $A$ and $B$ can be considered without loss of generality. We shall concentrate first only on curves which are graphs of nonnegative functions $y = u(x)$ which belong to the space $C^1[0, a]$.

The point $P$ moves according to Newton's Second Law. The resulting force is a composition of the gravitational force and the reaction force of the constraint (the point $P$ moves along the given curve). The resulting direction is given by the tangent line of the curve, see Figure 6.1.2. Newton's Second Law says that for the velocity $v$ of the point $P$ the following identity holds:
\[
m\dot v = F = mg \cos\alpha
\]
(see Figure 6.1.2). Multiplying this identity by $v$ and taking into account that $\dot x = v \cos\alpha$, we obtain
\[
\left(\tfrac{1}{2} v^2\right)^{\cdot} = g v \cos\alpha = g \dot x,
\]
i.e.,
\[
\tfrac{1}{2} v^2 = g x \tag{6.1.10}
\]
(the Principle of Conservation of Energy).
5 See page 35 for the definition of $D(I)$.
6 $\varphi_n$ is defined in Proposition 1.2.20(iv).
7 This problem was posed by Johann Bernoulli (see Berkovitz [12]).
Figure 6.1.2. The x-axis is oriented in the (downward) direction of the gravitational force.
Since the point $P$ moves along the graph of $u = u(x)$, its trajectory $s = s(t)$ is given by
\[
s(t) = \int_0^{x(t)} \sqrt{1 + (u'(x))^2}\,dx.^{8} \tag{6.1.11}
\]
Hence
\[
v(t) = \frac{ds(t)}{dt} = \frac{ds(t)}{dx}\frac{dx}{dt} = \sqrt{1 + (u'(x(t)))^2}\;\dot x(t).
\]
Using (6.1.10) and the strict monotonicity of $x$ we have
\[
\frac{dt}{dx} = \frac{\sqrt{1 + (u'(x))^2}}{\sqrt{2gx}}.
\]
Therefore the time needed to get from $A$ to $B$ is given by
\[
\tilde F(u) = \int_0^a \frac{\sqrt{1 + (u'(x))^2}}{\sqrt{2gx}}\,dx. \tag{6.1.12}
\]
We wish to apply Proposition 6.1.2 to the functional $\tilde F$. However, $\tilde F$ is not defined on a linear space ($u(a) = b \ne 0$). To avoid this obstacle we change the variable $u$ for this moment by the substitution
\[
w(x) = u(x) - \frac{b}{a}x.
\]
8 We use the formula for the length of a curve given by the graph of $u = u(x)$: $s = \int_0^{x_0} \sqrt{1 + (u'(x))^2}\,dx$.
So, we can write (6.1.12) as
\[
F(w) := \tilde F\!\left(w + \frac{b}{a}x\right) = \int_0^a \frac{\sqrt{1 + \left(w'(x) + \frac{b}{a}\right)^2}}{\sqrt{2gx}}\,dx
\]
where $w \in C_0^1[0, a] := \{w \in C^1[0, a] : w(0) = w(a) = 0\}$. We equip $C_0^1[0, a]$ with the norm
\[
\|u\|_{C_0^1[0,a]} = \left(\int_0^a |u'(x)|^2\,dx\right)^{\frac{1}{2}}.
\]
For a given $h \in C_0^1[0, a]$ we have (see Corollary 3.2.14 and Example 3.2.21)
\[
\delta F(w; h) = \int_0^a \frac{w'(x) + \frac{b}{a}}{\sqrt{2gx\left[1 + \left(w'(x) + \frac{b}{a}\right)^2\right]}}\,h'(x)\,dx.
\]
The Euler Necessary Condition (Proposition 6.1.2) for the original variable $u$ reads
\[
\int_0^a \frac{u'(x)}{\sqrt{2gx[1 + (u'(x))^2]}}\,h'(x)\,dx = 0 \quad \text{for all } h \in C_0^1[0, a]. \tag{6.1.13}
\]
Let us denote
\[
M(x) = \frac{u'(x)}{\sqrt{2gx[1 + (u'(x))^2]}}, \quad x \in (0, a).
\]
Applying Lemma 6.1.9 we obtain that there is a constant $K \in \mathbb{R}$ such that $M(x) = K$ a.e. in $(0, a)$. However, the continuity of $M$ actually implies that
\[
\frac{u'(x)}{\sqrt{2gx[1 + (u'(x))^2]}} = K \quad \text{for all } x \in (0, a). \tag{6.1.14}
\]
We will find a solution of the Euler equation (6.1.14). Note that $K = 0$ implies $b = 0$, and so in this case $u = 0$ is a unique solution of (6.1.14). Assume now that $b > 0$, and write $K$ as $\pm\frac{1}{\sqrt{4gc}}$ with a $c > 0$. The equation (6.1.14) then implies
\[
\left(1 - \frac{x}{2c}\right)(u'(x))^2 = \frac{x}{2c}, \quad x \in [0, a]. \tag{6.1.15}
\]
Hence $0 \le \frac{x}{2c} < 1$. After the change of variables $x = c(1 - \cos\tau)$, $\tau \in [0, \tau_0]$ (here $\tau_0 < \pi$ is such that $a = c(1 - \cos\tau_0)$) we obtain
\[
\frac{du}{d\tau} = \frac{du}{dx}\, c \sin\tau
\]
and (6.1.15) is transformed into
\[
\left(\frac{du}{d\tau}\right)^2 = c^2(1 - \cos\tau)^2.
\]
Hence
\[
u(\tau) = \pm c(\tau - \sin\tau), \quad \tau \in [0, \tau_0].
\]
(Notice that the integration constant is zero since $u(0) = 0$, and only the sign plus corresponds to our problem.) Hence the parametric equations of the graph of $u$ are given by
\[
x = c(1 - \cos\tau), \quad y = c(\tau - \sin\tau), \quad \tau \in [0, \tau_0].
\]
This is a part of the cycloid, and we have to determine parameters $c$ and $\tau_0$ so that $B$ is the end point of this curve. This means
\[
\frac{b}{a} = \frac{\tau_0 - \sin\tau_0}{1 - \cos\tau_0}, \quad \tau_0 \in (0, \pi). \tag{6.1.16}
\]
Since the function
\[
\tau \mapsto \frac{\tau - \sin\tau}{1 - \cos\tau}, \quad \tau \in (0, \pi),
\]
is strictly increasing with the supremum (over $(0, \pi)$) equal to $\frac{\pi}{2}$, we have that for $0 \le \frac{b}{a} < \frac{\pi}{2}$ the functional $F$ has a unique critical point $v \in C_0^1[0, a]$ such that the graph of the function $u(x) = v(x) + \frac{b}{a}x$ has parametric equations
\[
x = a\,\frac{1 - \cos\tau}{1 - \cos\tau_0}, \quad y = a\,\frac{\tau - \sin\tau}{1 - \cos\tau_0}, \quad \tau \in [0, \tau_0], \tag{6.1.17}
\]
where $\tau_0$ is given by (6.1.16).

On the other hand, for $\frac{b}{a} \ge \frac{\pi}{2}$ the functional $F$ does not have critical points in $C_0^1[0, a]$. However, this does not mean that the original problem has no solution at all! The restriction we made during the formulation of the mathematical model (considering only curves which are graphs of functions $y = u(x)$) does not fit with the real situation if $\frac{b}{a} \ge \frac{\pi}{2}$! In this case one has to parametrize the curves $x = x(\tau)$, $y = y(\tau)$ and to investigate the functional
\[
\tilde F(x, y) = \int_0^{\tau_0} \frac{\sqrt{\left(\frac{dx}{d\tau}(\tau)\right)^2 + \left(\frac{dy}{d\tau}(\tau)\right)^2}}{\sqrt{2g\,x(\tau)}}\,d\tau.
\]
An analogous procedure leads to the solution of two differential equations for $x$ and $y$ and one can prove the existence of a unique critical point.9
9 The reader is invited to prove it in detail as an exercise.
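For $0 < \frac{b}{a} < \frac{\pi}{2}$ the parameter $\tau_0$ in (6.1.16) has no closed form, but since the left-hand function is strictly increasing it can be found by bisection. The Python sketch below does this and compares the descent time along the cycloid with the time along the straight line $u(x) = \frac{b}{a}x$. The two time formulas are our own short computations, not quoted from the text: along the cycloid $dt = \sqrt{c/g}\,d\tau$, so the time is $\tau_0\sqrt{c/g}$, and for the line (6.1.12) integrates to $\sqrt{2a(1 + (b/a)^2)/g}$. The function names, the value $g = 9.81$ and the data $a = b = 1$ are assumptions for illustration.

```python
import math


def solve_tau0(ratio: float, tol: float = 1e-12) -> float:
    """Solve (tau - sin tau) / (1 - cos tau) = ratio on (0, pi) by bisection."""
    q = lambda tau: (tau - math.sin(tau)) / (1.0 - math.cos(tau))
    lo, hi = 1e-3, math.pi                 # stay away from the 0/0 limit at tau = 0
    if not (q(lo) < ratio < q(hi)):
        raise ValueError("ratio b/a must lie in (0, pi/2), not too close to 0")
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if q(mid) < ratio:                 # q is strictly increasing
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


def descent_times(a: float, b: float, g: float = 9.81):
    """Return (tau0, c, time along the cycloid, time along the straight line)."""
    tau0 = solve_tau0(b / a)
    c = a / (1.0 - math.cos(tau0))         # from a = c (1 - cos tau0)
    t_cycloid = tau0 * math.sqrt(c / g)
    t_line = math.sqrt(2.0 * a * (1.0 + (b / a) ** 2) / g)
    return tau0, c, t_cycloid, t_line


tau0, c, t_cycloid, t_line = descent_times(a=1.0, b=1.0)
```

For these data the cycloid time comes out smaller than the straight-line time, as the minimality of the critical point suggests.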
Let us return to the case $\frac{b}{a} < \frac{\pi}{2}$. It still remains to show that the solution (6.1.17) is a global minimum of $F$ over $C_0^1[0, a]$. This follows from Proposition 6.1.8. Indeed, the function
\[
z \mapsto \sqrt{1 + z^2}
\]
is convex in $\mathbb{R}$. This immediately implies that the functional $F$ is convex on $C_0^1[0, a]$ (the reader is invited to prove both facts in detail). Hence the unique critical point of $F$ in $C_0^1[0, a]$ must be the point of its global minimum.

Let us now consider a more general situation. Namely, let
\[
M = \{u \in C^1[a, b] : u(a) = u_1,\ u(b) = u_2\},
\]
and let us introduce the functional
\[
F(u) = \int_a^b f(x, u(x), u'(x))\,dx, \quad u \in M,
\]
where $f = f(x, y, z)$ is a function defined on $[a, b] \times \mathbb{R}^2$ with continuous second partial derivatives with respect to all its variables. This assumption will hold throughout the rest of this section. Applying the Euler Necessary Condition (Proposition 6.1.2) we get the following assertion.

Proposition 6.1.11. Let $u_0 \in M$ be a local extremum of $F$ with respect to $M$. Then the function
\[
x \mapsto \frac{\partial f}{\partial z}(x, u_0(x), u_0'(x)) \tag{6.1.18}
\]
is continuously differentiable on $[a, b]$ and
\[
\frac{\partial f}{\partial y}(x, u_0(x), u_0'(x)) - \frac{d}{dx}\frac{\partial f}{\partial z}(x, u_0(x), u_0'(x)) = 0 \tag{6.1.19}
\]
for all $x \in [a, b]$.

Proof. Let us first assume $u_1 = u_2 = 0$. Let $w \in C_0^1[a, b]$. Since
\[
0 = \delta F(u_0; w) = \int_a^b \left[\frac{\partial f}{\partial y}(x, u_0(x), u_0'(x))\,w(x) + \frac{\partial f}{\partial z}(x, u_0(x), u_0'(x))\,w'(x)\right] dx,
\]
we get, by integrating by parts,
\[
\int_a^b \left[\frac{\partial f}{\partial z}(x, u_0(x), u_0'(x)) - \int_a^x \frac{\partial f}{\partial y}(\xi, u_0(\xi), u_0'(\xi))\,d\xi\right] w'(x)\,dx = 0. \tag{6.1.20}
\]
Using Lemma 6.1.9 we get from (6.1.20) that there is $c \in \mathbb{R}$ such that
\[
\frac{\partial f}{\partial z}(x, u_0(x), u_0'(x)) - \int_a^x \frac{\partial f}{\partial y}(\xi, u_0(\xi), u_0'(\xi))\,d\xi = c \tag{6.1.21}
\]
for all $x \in [a, b]$. This equality shows that the function (6.1.18) is continuously differentiable and (6.1.19) holds for all $x \in [a, b]$.

In a general case we can consider $u - \frac{u_2 - u_1}{b - a}(x - a) - u_1$ instead of $u$ and apply the previous result on the transformed functional.

Remark 6.1.12. Equation (6.1.19) is the Euler Equation of the functional $F$. Taking the "formal" derivative of the second term in (6.1.19) we obtain
\[
\begin{aligned}
\frac{d}{dx}\frac{\partial f}{\partial z}(x, u_0(x), u_0'(x)) ={}& \frac{\partial^2 f}{\partial x \partial z}(x, u_0(x), u_0'(x)) + \frac{\partial^2 f}{\partial y \partial z}(x, u_0(x), u_0'(x))\,u_0'(x)\\
&+ \frac{\partial^2 f}{\partial z^2}(x, u_0(x), u_0'(x))\,u_0''(x).
\end{aligned}
\]
Hence (6.1.19) indicates that $u_0''(x)$ should exist. This motivates the following assertion.

Theorem 6.1.13 (Regularity of the "classical solution"). Let $u_0 \in M$ be a local extremum of $F$ with respect to $M$, and let $x_0 \in (a, b)$ be such that
\[
\frac{\partial^2 f}{\partial z^2}(x_0, u_0(x_0), u_0'(x_0)) \ne 0.
\]
Then there exists $\delta > 0$ such that $u_0 \in C^2(x_0 - \delta, x_0 + \delta)$.

Proof. For $x \in [a, b]$ and $z \in \mathbb{R}$ define a function $\varphi$ by
\[
\varphi(x, z) = \frac{\partial f}{\partial z}(x, u_0(x), z) - \int_a^x \frac{\partial f}{\partial y}(\xi, u_0(\xi), u_0'(\xi))\,d\xi - c
\]
where $c$ is the constant from the proof of Proposition 6.1.11. The Implicit Function Theorem (see Theorem 4.2.1) implies that there exist $\delta_1 > 0$, $\hat\delta > 0$ with the following properties: for any $x \in (x_0 - \delta_1, x_0 + \delta_1)$ there exists a unique $z(x) \in (u_0'(x_0) - \hat\delta, u_0'(x_0) + \hat\delta)$ such that
\[
\varphi(x, z(x)) = 0.
\]
Moreover, $z \in C^1(x_0 - \delta_1, x_0 + \delta_1)$. The continuity of $u_0'$ and the uniqueness of $z$ imply the existence of $\delta \in (0, \delta_1)$ such that
\[
u_0'(x) = z(x) \quad \text{for } x \in (x_0 - \delta, x_0 + \delta).
\]
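As a simple illustration of the Euler Equation (6.1.19) and of the nondegeneracy condition in Theorem 6.1.13, consider the integrand $f(x, y, z) = \frac{1}{2}z^2 + \frac{1}{2}y^2 - q(x)\,y$. This integrand and the function $q \in C^2[a, b]$ are hypothetical choices made here for illustration; they are not taken from the text. A direct computation gives:

```latex
\[
\frac{\partial f}{\partial y}(x, y, z) = y - q(x), \qquad
\frac{\partial f}{\partial z}(x, y, z) = z, \qquad
\frac{\partial^2 f}{\partial z^2}(x, y, z) = 1 \neq 0,
\]
so that (6.1.19) becomes
\[
u_0(x) - q(x) - \frac{d}{dx}\,u_0'(x) = 0,
\qquad\text{i.e.,}\qquad
-u_0''(x) + u_0(x) = q(x), \quad x \in [a, b].
\]
```

Here the function (6.1.18) is just $u_0'$, so Proposition 6.1.11 already yields that $u_0'$ is continuously differentiable, in agreement with the conclusion of Theorem 6.1.13.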
It is more convenient to look for critical points of $F$ on "greater" sets than $M$ in several situations. As we will see later (Section 6.2) this is mainly connected with the fact that the space of continuously differentiable functions $C^1[a, b]$ is not reflexive and it does not possess a Hilbert structure, either. For this purpose it is more convenient to work in the Sobolev space $W^{1,2}(a, b)$ and to look for extrema of $F$ on the set
\[
N = \{u \in W^{1,2}(a, b) : u(a) = u_1,\ u(b) = u_2\}.
\]
Notice that it is not obvious whether the functional $F$ is well defined on the set $N$. We have to assume that $f$ satisfies certain growth conditions (see Theorem 3.2.24 and Remark 3.2.25; the Carathéodory property is guaranteed by the continuity of $f$ and its derivatives). In this case we have

Theorem 6.1.14 (Regularity of the "weak solution"). Let $h \in L^2(a, b)$, $c_1 \ge 0$ be such that for a.a. $x \in [a, b]$ and for all $(y, z) \in \mathbb{R}^2$,
\[
|f(x, y, z)| \le h(x) + c_1(y^2 + z^2), \tag{6.1.22}
\]
\[
\left|\frac{\partial f}{\partial y}(x, y, z)\right| \le h(x) + c_1(|y| + |z|), \tag{6.1.23}
\]
\[
\left|\frac{\partial f}{\partial z}(x, y, z)\right| \le h(x) + c_1(|y| + |z|). \tag{6.1.24}
\]
Let $u_0 \in W^{1,2}(a, b)$ be a local minimum of $F$ on $N$. For $x \in [a, b]$ and $z \in \mathbb{R}$ set
\[
\psi(x, z) = \frac{\partial f}{\partial z}(x, u_0(x), z).
\]
Assume that $\frac{\partial \psi}{\partial z} > 0$ on $[a, b] \times \mathbb{R}$ and that for every fixed $x \in [a, b]$ the function $z \mapsto \psi(x, z)$ maps $\mathbb{R}$ onto $\mathbb{R}$. Then $u_0 \in C^2[a, b]$.

Proof. First, let us assume that $u_1 = u_2 = 0$. Conditions (6.1.22)–(6.1.24) guarantee that $F$ is well defined on $W_0^{1,2}(a, b)$ and that $\delta F(u_0; v)$ exists for any $v \in W^{1,2}(a, b)$.10 It follows from Proposition 6.1.2 that for any $w \in W_0^{1,2}(a, b)$,
\[
\delta F(u_0; w) = \int_a^b \left[\frac{\partial f}{\partial z}(x, u_0(x), u_0'(x))\,w'(x) + \frac{\partial f}{\partial y}(x, u_0(x), u_0'(x))\,w(x)\right] dx = 0.
\]
If we proceed literally as in the proof of Proposition 6.1.11 we arrive at (6.1.21) which now holds for a.a. $x \in [a, b]$. The function
\[
g(x, z) = \psi(x, z) - c - \int_a^x \frac{\partial f}{\partial y}(\xi, u_0(\xi), u_0'(\xi))\,d\xi
\]
is continuous on $[a, b] \times \mathbb{R}$. Hence, by the assumptions on the function $\psi$, for any $x \in [a, b]$ the equation
\[
g(x, z) = 0
\]
has a unique solution $z = z(x)$. Moreover, by the Implicit Function Theorem (see Remark 4.2.3, not Theorem 4.2.1!), the function $x \mapsto z(x)$ is continuous on $(a, b)$. It can be shown (Exercise 6.1.21) that it is continuous also at the end points $a$, $b$. So, it follows from (6.1.21) that
\[
u_0'(x) = z(x) \quad \text{for a.a. } x \in [a, b], \quad \text{i.e.,} \quad u_0(x) = \int_a^x z(y)\,dy.
\]
Hence $u_0 \in C_0^1[a, b]$ and it is a local minimum of $F$ in the space $C_0^1[a, b]$. The assertion now follows from Theorem 6.1.13.

In the general case, we consider again
\[
u - \frac{u_2 - u_1}{b - a}(x - a) - u_1
\]
instead of $u$ and apply the previous result on the transformed functional.
10 The reader is invited to check these facts in detail, see Remark 3.2.25.
Exercise 6.1.15. Consider a function of two real variables
\[
F(x, y) = \sin x + \sin y - \sin(x + y)
\]
on the set
\[
M = \left(-\frac{\pi}{2}, \frac{3\pi}{2}\right) \times \left(-\frac{\pi}{2}, \frac{3\pi}{2}\right).
\]
Prove that $F$ has a local maximum at $\left(\frac{2\pi}{3}, \frac{2\pi}{3}\right)$, a local minimum at $\left(\frac{4\pi}{3}, \frac{4\pi}{3}\right)$, and there is no extremum at the critical point $(0, 0)$. For the graph of $F$ see Figure 6.1.3.

Exercise 6.1.16. Find local and global11 extrema of the functional
\[
F\colon C[0, 1] \to \mathbb{R}\colon \quad F(u) = \int_0^1 \bigl[|u(t)|^2 + u(t)v(t) + w(t)\bigr]\,dt
\]
where $v, w \in C[0, 1]$ are given functions.

Exercise 6.1.17. Use Theorem 6.1.5 to prove that the solution of the Euler equation (6.1.14) is a local minimum of $F$ from Example 6.1.10.
Hint. Show that
\[
\delta^2 F(v; h, h) \ge \frac{(2c - a)^{\frac{3}{2}}}{4c\sqrt{gca}} \int_0^a |h'(x)|^2\,dx.
\]
Exercise 6.1.18. Prove that the functional
\[
F(u) = \int_0^\pi |u(x)|^2\bigl[1 - |u'(x)|^2\bigr]\,dx
\]
has in $C_0^1[0, \pi]$ a unique local minimum at $u = 0$.
11 The functional $F\colon X \to \mathbb{R}$ reaches its global minimum over $M \subset X$ if there exists $u_0 \in M$ such that $F(u) \ge F(u_0)$ for all $u \in M$. Global maximum is defined similarly. See Section 6.2 for more detail on the existence of global extrema.
Figure 6.1.3. Graph of F
Exercise 6.1.19 (Weierstrass Example). Prove that the functional
\[
F(u) = \int_{-1}^{1} x^2 |u'(x)|^2\,dx
\]
does not have its global minimum over the set
\[
M = \{u \in C^1[-1, 1] : u(-1) = -1,\ u(1) = 1\}.
\]
Hint. Set
\[
u_n(x) = \frac{\arctan nx}{\arctan n}
\]
and prove that $\lim_{n \to \infty} F(u_n) = 0$.

Exercise 6.1.20. Prove that the functional
\[
F(u) = \int_{-1}^{1} x^{\frac{2}{5}} |u'(x)|^2\,dx
\]
does not have its global minimum over the set $M$ from Exercise 6.1.19.
Hint. The corresponding Euler equation has no solution.

Exercise 6.1.21. Prove the following statement: Let $g\colon [a, b] \times \mathbb{R} \to \mathbb{R}$ be a function and assume that for any $x \in [a, b]$ the equation
\[
g(x, z) = 0
\]
has a solution denoted by $z = z(x)$ (not necessarily unique). If
\[
\frac{\partial g}{\partial z}(x, z) > 0 \quad \text{on } [a, b] \times \mathbb{R},
\]
then this solution is unique. If, moreover, $g$ and $\frac{\partial g}{\partial z}$ are continuous on $[a, b] \times \mathbb{R}$, then $z = z(x)$ is continuous on $[a, b]$ as well.
Hint. For the continuity of z = z(x) use the Implicit Function Theorem in the form of Remark 4.2.3 and notice that usage of the Contraction Principle is also possible at the end points a, b.
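The hint to Exercise 6.1.19 (the Weierstrass example) can be explored numerically before it is proved. The sketch below evaluates $F(u_n)$ for $u_n(x) = \arctan(nx)/\arctan n$ with a midpoint rule; the function name, the quadrature rule and the number of cells are assumptions of this illustration, and the computation is of course no substitute for the proof.

```python
import math


def weierstrass_energy(n: int, cells: int = 200_000) -> float:
    """Midpoint-rule value of F(u_n) = int_{-1}^{1} x^2 |u_n'(x)|^2 dx
    for u_n(x) = arctan(n x) / arctan(n)."""
    h = 2.0 / cells
    total = 0.0
    for k in range(cells):
        x = -1.0 + (k + 0.5) * h
        du = n / ((1.0 + (n * x) ** 2) * math.atan(n))   # u_n'(x)
        total += x * x * du * du
    return total * h


values = [weierstrass_energy(n) for n in (1, 10, 100)]
```

The computed values decrease as `n` grows, which is consistent with the limit claimed in the hint.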
6.2 Global Extrema

In contrast with the previous section we focus now on points of global extrema. The key assertions deal with weakly coercive and weakly sequentially lower semicontinuous functionals.

Let us consider a differentiable function of one real variable, $F\colon \mathbb{R} \to \mathbb{R}$. It is not difficult to give an example which shows that local extrema of $F$ need not be its global extrema, see Figures 6.2.1, 6.2.2.

Figure 6.2.1. F attains neither its maximum nor its minimum on R.

Figure 6.2.2. F attains its extrema on [a, b] at a and b, respectively.
It is quite natural to ask: What properties of $F$ guarantee the existence of the point of global extremum of $F$? First of all let us note that we can look for global minima only because global maxima of $F$ are global minima of $-F$ and vice versa.

Let us consider the following very simple model example of a function $F\colon \mathbb{R} \to \mathbb{R}$ which is continuous in a bounded interval $[a, b]$. Then there exists a point $x_0 \in [a, b]$ such that
\[
F(x_0) = \min_{x \in [a, b]} F(x),
\]
i.e., the minimum of $F$ over $[a, b]$ is at the point $x_0$ (see Figure 6.2.3). The proof of this fact is typical for this section. Assume that $\{x_n\}_{n=1}^\infty$ is a minimizing sequence for $F$ on $[a, b]$, i.e.,
\[
F(x_n) \to \inf_{x \in [a, b]} F(x).^{12} \tag{6.2.1}
\]
12 Note that, for a general $M$, we set $\inf M = -\infty$ if $M$ is not bounded below.
Figure 6.2.3.

The compactness of $[a, b]$ implies that there is a subsequence $\{x_{n_k}\}_{k=1}^\infty \subset \{x_n\}_{n=1}^\infty$ and a point $x_0 \in [a, b]$ such that $x_{n_k} \to x_0$. The continuity of $F$ then implies that
\[
F(x_0) = \inf_{x \in [a, b]} F(x).
\]
The reader should notice that a property weaker than the continuity of $F$ is sufficient to get this conclusion, namely
$$F(x_0) \le \liminf_{k \to \infty} F(x_{n_k}) \tag{6.2.2}$$
(cf. Definition 6.2.1 below). It follows now from (6.2.1) and (6.2.2) that
$$F(x_0) = \inf_{x \in [a,b]} F(x)$$
(see Figure 6.2.4). If, moreover,
$$F(a) > \inf_{x \in (a,b)} F(x), \qquad F(b) > \inf_{x \in (a,b)} F(x), \tag{6.2.3}$$
then $x_0$ is also a local minimum of $F$ (see Figures 6.2.3 and 6.2.4).
Assume in the sequel in this section that $F : X \to \mathbb{R}$ is a functional on a (infinite dimensional) Banach space $X$. It is quite natural to ask if a similar result as above holds if $[a, b]$ is replaced by a closed and bounded set $D \subset X$ and (6.2.3) is substituted by
$$\inf_{u \in \partial D} F(u) > \inf_{u \in \operatorname{int} D} F(u).$$
[Figure 6.2.4: graph of F over [a, b]; the points a, x0, b are marked on the horizontal axis and the values F(a), F(b), F(x0) on the vertical axis.]
Unfortunately, the answer is no in general (see Exercise 6.2.23). The reason lies in the fact that the compactness of the bounded and closed interval $[a, b]$ is the crucial property which plays the essential role in the proof. In fact, one can imitate the proof above to get the following result:
Let $F$ be a lower semi-continuous real functional on a compact set $K \subset X$. Then $F$ has a minimum in $K$.
However, this assertion has practically no applications because compact subsets of the infinite dimensional Banach space $X$ are “too thin” (see Proposition 1.2.15). For instance, for any compact set $K \subset X$ we have $\operatorname{int} K = \emptyset$. Because of this fact we have to look for a different (weaker; why?) topology on $X$ than that induced by the norm. We would like to find a new topology on $X$ with respect to which any bounded (in the norm) set $D \subset X$ is relatively compact. The lower semi-continuity of a functional $F$ with respect to this topology will then allow us to prove the above assertion with $K$ substituted by a bounded and closed set $D$ with respect to this new topology. These problems gave an impulse for the study of weak convergence introduced in Definition 2.1.21. The reader should notice that we will discuss weak sequential continuity of functionals instead of weak continuity (these are different concepts since weak topology is not metrizable in general). The reason is quite practical: weak sequential (semi-) continuity is easier to prove for a concrete (e.g., integral) functional.
In order to make the exposition in this section as clear as possible we will restrict our attention to real Hilbert spaces $H$. The reader should have in mind that the following notions can also be defined in any Banach space.
Definition 6.2.1. Let $F : H \to \mathbb{R}$ be a functional, $M \subset H$. Then $F$ is said to be weakly sequentially lower semi-continuous at a point $u_0 \in M$ relative to $M$ if for any sequence $\{u_n\}_{n=1}^{\infty} \subset M$ such that $u_n \rightharpoonup u_0$ we have
$$F(u_0) \le \liminf_{n \to \infty} F(u_n).$$
We say that $F$ is weakly sequentially lower semi-continuous in $M \subset H$ if it is weakly sequentially lower semi-continuous at every point $u \in M$ relative to $M$.
Example 6.2.2. The norm $\|\cdot\|$ on $H$ is a weakly sequentially lower semi-continuous functional in $H$ as follows immediately from Proposition 2.1.22(iii).
Example 6.2.3. Let $L : H \to \mathbb{R}$ be a continuous linear form. Then $L$ is weakly sequentially lower semi-continuous in $H$. Indeed, it follows from the Riesz Representation Theorem (Theorem 1.2.40) that there is $v \in H$ such that
$$L(u) = (u, v) \quad \text{for all } u \in H.$$
Hence $u_n \rightharpoonup u_0$ implies $L(u_n) \to L(u_0)$,$^{13}$ in particular,
$$L(u_0) \le \liminf_{n \to \infty} L(u_n).$$
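Examples 6.2.2 and 6.2.3 can be illustrated numerically. The following is only a sketch under assumptions that are not part of the text: we take $H = L^2(0,1)$, the orthonormal system $e_n(t) = \sqrt{2}\sin(n\pi t)$ (which converges weakly to $o$), the hypothetical test element $g(t) = t(1-t)$, and a midpoint rule for the integrals. The pairings $(e_n, g)$ tend to $0$, in agreement with Example 6.2.3, while $\|e_n\| = 1 > 0 = \|o\|$, so the inequality in Definition 6.2.1 can be strict for the norm.

```python
import math

def inner(u, v, n_pts=20000):
    """Midpoint-rule approximation of the L^2(0,1) scalar product (u, v)."""
    h = 1.0 / n_pts
    return h * sum(u((k + 0.5) * h) * v((k + 0.5) * h) for k in range(n_pts))

def e(n):
    """n-th element of the orthonormal system sqrt(2) * sin(n * pi * t)."""
    return lambda t: math.sqrt(2.0) * math.sin(n * math.pi * t)

def g(t):
    """A fixed element of L^2(0,1); a hypothetical choice for illustration."""
    return t * (1.0 - t)

indices = (1, 11, 51)
# L(e_n) = (e_n, g): these values shrink towards 0 = L(o) as n grows.
pairings = [inner(e(n), g) for n in indices]
# ||e_n||: these values stay equal to 1, although the weak limit has norm 0.
norms = [math.sqrt(inner(e(n), e(n))) for n in indices]
```

Printing `pairings` and `norms` side by side shows the two behaviours: the linear form is weakly sequentially continuous, the norm is only weakly sequentially lower semi-continuous.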
The following assertion is a counterpart of Proposition 1.2.2 which is known as the Extreme Value Theorem for $H = \mathbb{R}$.
Theorem 6.2.4 (Extreme Value Theorem). Let $M$ be a weakly sequentially compact nonempty subset of $H$ and let $F$ be a weakly sequentially lower semi-continuous functional in $M$. Then $F$ is bounded below in $M$, and there exists $u_0 \in M$ such that
$$F(u_0) = \min_{u \in M} F(u).$$
Proof. Let $\{u_n\}_{n=1}^{\infty}$ be a minimizing sequence for $F$ relative to $M$, i.e.,
$$\{u_n\}_{n=1}^{\infty} \subset M \quad \text{and} \quad F(u_n) \searrow \inf_{u \in M} F(u).$$
Since $M$ is weakly sequentially compact there exist $u_0 \in M$ and a subsequence $\{u_{n_k}\}_{k=1}^{\infty} \subset \{u_n\}_{n=1}^{\infty}$ such that $u_{n_k} \rightharpoonup u_0$. The assumption on $F$ implies
$$\inf_{u \in M} F(u) \le F(u_0) \le \liminf_{k \to \infty} F(u_{n_k}) = \lim_{n \to \infty} F(u_n) = \inf_{u \in M} F(u),$$
i.e., $F(u_0) = \inf_{u \in M} F(u) > -\infty$.
Corollary 6.2.5. Let $M \subset H$, $F : H \to \mathbb{R}$, and $u_0$ be as in Theorem 6.2.4. Assume, moreover, that $u_0 \in \operatorname{int} M$. If $\delta F(u_0; v)$ exists for a $v \in H$, then $\delta F(u_0; v) = 0$.
13 If $u_n \rightharpoonup u_0$ implies $F(u_n) \to F(u_0)$, then the functional $F$ is called weakly sequentially continuous at $u_0$.
Proof. The assumption $u_0 \in \operatorname{int} M$ implies that $F$ attains also its local minimum at $u_0$. The assertion now follows from Proposition 6.1.2.
Example 6.2.6. Let us consider the boundary value problem for the second order ordinary differential equation
$$\begin{cases} -\ddot{x}(t) + x^3(t) = f(t), & t \in (0, 1), \\ x(0) = x(1) = 0, \end{cases} \tag{6.2.4}$$
where $f \in L^2(0, 1)$ is a given function. Put $H := W_0^{1,2}(0, 1)$ with the norm
$$\|x\| = \left( \int_0^1 |\dot{x}(t)|^2 \, dt \right)^{1/2}.$$
A weak solution$^{14}$ of (6.2.4) is a function $x \in H$ for which the integral identity
$$\int_0^1 \dot{x}(t)\dot{y}(t) \, dt + \int_0^1 x^3(t) y(t) \, dt = \int_0^1 f(t) y(t) \, dt$$
holds for any function $y \in H$. Let us define a functional $F : H \to \mathbb{R}$ by
$$F(x) = \frac{1}{2} \int_0^1 |\dot{x}(t)|^2 \, dt + \frac{1}{4} \int_0^1 |x(t)|^4 \, dt - \int_0^1 f(t) x(t) \, dt, \quad x \in H.^{15}$$
Then for $x, y \in H$ we have
$$\delta F(x; y) = \int_0^1 \dot{x}(t)\dot{y}(t) \, dt + \int_0^1 x^3(t) y(t) \, dt - \int_0^1 f(t) y(t) \, dt,$$
and any critical point of $F$, i.e., $x \in H$ satisfying
$$\delta F(x; y) = 0 \quad \text{for an arbitrary } y \in H,$$
is a weak solution of (6.2.4) and vice versa. We will show that Corollary 6.2.5 applies to $F$ and a suitably chosen set $M \subset H$. First let us prove that $F$ is a weakly sequentially lower semi-continuous functional on $H$. Consider an arbitrary $z \in H$ and $\{x_n\}_{n=1}^{\infty} \subset H$ such that $x_n \rightharpoonup z$ in $H$. Due to the compact embedding (Theorem 1.2.28(iii)) $H = W_0^{1,2}(0, 1) \subset\subset C[0, 1]$, we have that $x_n \to z$ in $C[0, 1]$ (Proposition 2.2.4(iii)).
14 For a detailed discussion of the notion of a weak solution see Remark 5.3.10. Note that this weak solution x0 minimizes the energy functional F, i.e., it corresponds to the state with the minimal energy of the system.
15 This functional can represent the energy of a certain system. For this reason it is often called the energy functional.
This implies
$$\int_0^1 |x_n(t)|^4 \, dt \to \int_0^1 |z(t)|^4 \, dt, \qquad \int_0^1 f(t) x_n(t) \, dt \to \int_0^1 f(t) z(t) \, dt. \tag{6.2.5}$$
The weak sequential lower semi-continuity of the Hilbert norm $\|\cdot\|$ (Example 6.2.2) implies
$$\liminf_{n \to \infty} \|x_n\|^2 \ge \|z\|^2. \tag{6.2.6}$$
We obtain from (6.2.5) and (6.2.6) that
$$\liminf_{n \to \infty} F(x_n) \ge F(z).$$
To find a suitable set $M$ we first note that
$$\|x\|_{L^2(0,1)} \le \|x\|_{W_0^{1,2}(0,1)}.^{16} \tag{6.2.7}$$
Due to this fact we can estimate $F$ using the Hölder inequality as follows:
$$F(x) \ge \frac{1}{2} \|x\|^2 - \|f\|_{L^2(0,1)} \|x\|_{L^2(0,1)} \ge \frac{1}{2} \|x\| \left( \|x\| - 2\|f\|_{L^2(0,1)} \right). \tag{6.2.8}$$
It is clear that for $\|x\| > 2\|f\|_{L^2(0,1)}$ we have $F(x) > 0$, and at the same time $F(o) = 0$. So, taking
$$M = \{x \in H : \|x\| \le 2\|f\|_{L^2(0,1)} + 1\},$$
the assumptions of Corollary 6.2.5 are fulfilled, since a closed ball in a Hilbert space is weakly sequentially compact (see Theorem 2.1.25 and Proposition 2.1.22(iii)). Indeed, a minimizer $x_0$ of $F$ over $M$ satisfies $F(x_0) \le F(o) = 0$, hence $\|x_0\| \le 2\|f\|_{L^2(0,1)}$ and $x_0 \in \operatorname{int} M$. We then conclude that there exists at least one weak solution $x_0 \in H$ of the boundary value problem (6.2.4).
From (6.2.8) it is easy to see that the functional $F$ from the previous example satisfies
$$\lim_{\|x\| \to \infty} F(x) = \infty.$$
This motivates the following general definition.
Definition 6.2.7. A functional $F : H \to \mathbb{R}$ is said to be weakly coercive on $H$ if
$$\lim_{\|u\| \to \infty} F(u) = \infty.$$
16 This follows by a direct calculation using the Hölder inequality for $x(t) = \int_0^t \dot{x}(s) \, ds$.
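The direct calculation behind (6.2.7) can be written out as follows. This is a sketch which uses only that a function $x \in W_0^{1,2}(0,1)$ is absolutely continuous with $x(0) = 0$, and the norm $\|x\|$ introduced in Example 6.2.6:

```latex
\begin{align*}
|x(t)| &= \left| \int_0^t \dot{x}(s)\,ds \right|
        \le \int_0^1 |\dot{x}(s)|\,ds
        \le \left( \int_0^1 |\dot{x}(s)|^2\,ds \right)^{1/2}
        = \|x\|, \qquad t \in [0,1], \\
\|x\|_{L^2(0,1)}^2 &= \int_0^1 |x(t)|^2\,dt \le \|x\|^2 .
\end{align*}
```

The second inequality in the first line is the Hölder inequality on an interval of length one; taking square roots in the second line gives (6.2.7).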
This notion together with Corollary 6.2.5 leads to the following global result.
Theorem 6.2.8. Let $F : H \to \mathbb{R}$ be a weakly sequentially lower semi-continuous and weakly coercive functional. Then $F$ is bounded below on $H$, and there exists $u_0 \in H$ such that
$$F(u_0) = \min_{u \in H} F(u).$$
Moreover, if $\delta F(u_0; v)$ exists for a $v \in H$, then $\delta F(u_0; v) = 0$.
Proof. Let $d > \inf_{u \in H} F(u)$. There exists $R > 0$ such that for $u \in H$, $\|u\| \ge R$, we have $F(u) \ge d$. Hence
$$\inf_{\|u\| \le R} F(u) = \inf_{u \in H} F(u).$$
Now, we apply Theorem 6.2.4 with $M = \{u \in H : \|u\| \le R\}$. The assertion on a directional derivative follows from Corollary 6.2.5.
From the point of view of applications, it is convenient to have sufficient conditions “in the language of the topology on H induced by the norm” which guarantee that the set $M$ is weakly sequentially compact and that the functional $F$ is weakly sequentially lower semi-continuous in $M$.$^{17}$ We recall the results from Chapter 2 which state that every closed, convex and bounded set $M \subset H$ is weakly sequentially compact (see Exercise 2.1.39, Theorem 2.1.25 and Remark 2.1.24). Concerning the desired property of $F$ we need the following auxiliary assertion.
Lemma 6.2.9. Let $M \subset H$. Then $F : H \to \mathbb{R}$ is weakly sequentially lower semi-continuous in $M$ if and only if for every $a \in \mathbb{R}$ the set
$$E(a) = \{u \in M : F(u) \le a\}$$
is weakly sequentially closed in $M$.$^{18}$
17 Not every continuous functional is weakly sequentially lower semi-continuous (cf. Exercise 6.2.31).
18 The set $E \subset M$ is called weakly sequentially closed in $M$ if for any $\{x_n\}_{n=1}^{\infty} \subset E$, $x_n \rightharpoonup x \in M$, we have $x \in E$.
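Proof. Let $F$ be a weakly sequentially lower semi-continuous functional in $M$, $a \in \mathbb{R}$, $\{u_n\}_{n=1}^{\infty} \subset E(a)$, $u_n \rightharpoonup u_0$, $u_0 \in M$. Then
$$F(u_0) \le \liminf_{n \to \infty} F(u_n) \le a, \quad \text{i.e.,} \quad u_0 \in E(a).$$
Hence $E(a)$ is weakly sequentially closed in $M$.
On the other hand, assume that for every $a \in \mathbb{R}$ the set $E(a)$ is weakly sequentially closed in $M$. Let $\{u_n\}_{n=1}^{\infty} \subset M$, $u_n \rightharpoonup u_0 \in M$ and denote
$$\gamma = \liminf_{n \to \infty} F(u_n).$$
Then there is a subsequence $\{u_{n_k}\}_{k=1}^{\infty}$ such that $F(u_{n_k}) \to \gamma$. For any $\varepsilon > 0$ we have $u_{n_k} \in E(\gamma + \varepsilon)$ for $k$ sufficiently large. Since $E(\gamma + \varepsilon)$ is weakly sequentially closed in $M$, $u_0 \in E(\gamma + \varepsilon)$. Since $\varepsilon > 0$ was arbitrary, $u_0 \in E(\gamma)$, i.e.,
$$F(u_0) \le \liminf_{n \to \infty} F(u_n).$$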
Proposition 6.2.10. Let $F$ be a convex and continuous functional defined in a convex set $M \subset H$. Then $F$ is weakly sequentially lower semi-continuous in $M$.
Proof. It follows from the convexity of $F$ that the set $E(a) = \{u \in M : F(u) \le a\}$ is convex. The continuity of $F$ implies that $E(a)$ is closed in $M$. It follows from Exercise 2.1.39 and Remark 2.1.24 that it is also weakly sequentially closed in $M$. The result now follows from Lemma 6.2.9.
These results combined with Theorems 6.2.4 and 6.2.8 allow us to formulate the following assertions, very often used in applications.
Theorem 6.2.11. Let $M$ be a closed, convex, bounded and nonempty subset of $H$. Let $F : H \to \mathbb{R}$ be a convex and continuous functional on $M$. Then $F$ is bounded below on $M$ and there exists $u_0 \in M$ such that
$$F(u_0) = \inf_{u \in M} F(u).$$
If, moreover, $F$ is strictly convex, then $u_0$ is the unique point with this property.$^{19}$
19 The reader is invited to prove the uniqueness of u0!
Theorem 6.2.12. Let $F : H \to \mathbb{R}$ be continuous, convex and weakly coercive on $H$. Then $F$ is bounded below on $H$, and there exists $u_0 \in H$ such that
$$F(u_0) = \inf_{u \in H} F(u).$$
If $\delta F(u_0; v)$ exists for a $v \in H$, then $\delta F(u_0; v) = 0$. If, moreover, $F$ is strictly convex, then $u_0$ is uniquely determined.
Example 6.2.13. For any real continuous linear form $L : H \to \mathbb{R}$ there exists $u \in H$ such that $\|u\| = 1$ and $\|L\| = L(u)$. Indeed, the set $M = \{u \in H : \|u\| \le 1\}$ and the functional $F = -L$ satisfy the assumptions of Theorem 6.2.11. Hence there exists $u_0 \in M$ such that
$$-L(u_0) = \inf_{u \in M} (-L(u)).$$
By the linearity of $L$ and the symmetry of $M$ we have
$$-\inf_{u \in M} (-L(u)) = \sup_{u \in M} |L(u)|, \quad \text{i.e.,} \quad L(u_0) = \sup_{u \in M} |L(u)| = \|L\|.$$
Assume that $L \ne 0$ and $\|u_0\| < 1$. Then there exists $t > 1$ such that $\|t u_0\| = 1$, i.e., $t u_0 \in M$, and
$$L(t u_0) = t L(u_0) = t \|L\| > \sup_{u \in M} |L(u)|,$$
a contradiction. Note that this assertion can be proved directly using the Riesz Representation Theorem (Theorem 1.2.40).
Example 6.2.14. Let us consider the boundary value problem (6.2.4) and the energy functional
$$F(x) = \frac{1}{2} \int_0^1 |\dot{x}(t)|^2 \, dt + \frac{1}{4} \int_0^1 |x(t)|^4 \, dt - \int_0^1 f(t) x(t) \, dt, \quad x \in H := W_0^{1,2}(0, 1),$$
associated with (6.2.4). We have actually proved in Example 6.2.6 that $F$ is weakly coercive on $H$. The continuity of $F$ on $H$ follows from the continuity of the norm in $H$, the continuity of the embedding $H = W_0^{1,2}(0, 1) \subset L^4(0, 1)$ and from the continuity of the linear form
$$x \mapsto \int_0^1 f(t) x(t) \, dt \quad \text{on } H$$
under the assumption $f \in L^2(0, 1)$. The strict convexity of $F$ follows from the strict convexity of the real functions
$$t \mapsto t^2, \qquad t \mapsto t^4,$$
and the convexity of the linear form. We conclude (see Theorem 6.2.12) that there exists a unique $x_0 \in H$ such that
$$F(x_0) = \min_{x \in H} F(x).$$
It follows then from Proposition 6.1.8 that $x_0$ is the unique weak solution of (6.2.4).
Remark 6.2.15. The reader should compare Examples 6.2.6 and 6.2.14. In the latter one we have used Theorem 6.2.12 which enables us to avoid verifying the assumption of the weak sequential lower semi-continuity of $F$. This might be a difficult task in general (it cannot always be done so easily by means of the compact embedding as in Example 6.2.6). The reader should also notice that the continuity of $F$ without any additional assumptions does not imply the weak sequential lower semi-continuity of $F$ (see Exercise 6.2.31).
In the last part of this section we show another possibility for finding critical points of $F$ under the assumption that $F$ is differentiable. First we need two auxiliary assertions.
Lemma 6.2.16. Let $F$ be a functional defined on $H$ and $\nabla F$ its gradient.$^{20}$ Let $\nabla F : H \to H$ be a monotone operator. Then $F$ is weakly sequentially lower semi-continuous on $H$.
Proof. Let $u, v \in H$. According to the Mean Value Theorem applied to the real function
$$\varphi : s \mapsto F(v + s(u - v)), \quad s \in [0, 1],$$
there exists $t \in (0, 1)$ such that
$$F(u) - F(v) = (\nabla F(v + t(u - v)), u - v) = (\nabla F(v), u - v) + (\nabla F(v + t(u - v)) - \nabla F(v), u - v) \ge (\nabla F(v), u) - (\nabla F(v), v).^{21} \tag{6.2.9}$$
The inequality holds because, by the monotonicity of $\nabla F$, $(\nabla F(v + t(u - v)) - \nabla F(v), t(u - v)) \ge 0$ and $t > 0$.
Let $\{v_n\}_{n=1}^{\infty}$ be a sequence in $H$ such that $v_n \rightharpoonup v$ in $H$; in particular, $(\nabla F(v), v_n) \to (\nabla F(v), v)$. It follows from (6.2.9) that
$$\liminf_{n \to \infty} F(v_n) \ge F(v) - (\nabla F(v), v) + \lim_{n \to \infty} (\nabla F(v), v_n) = F(v).$$
20 Remember that according to the Riesz Representation Theorem (Theorem 1.2.40), Gâteaux derivative DF(u) is identified with an element of H which is denoted by ∇F(u) and called a gradient of F at u. Remember also that ∇F is a mapping from H into itself. (Cf. Example 3.2.4.)
21 Since the monotonicity of ∇F implies that ϕ′ is increasing, ϕ is a convex function, i.e., F is convex.
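Definition 6.2.17. Let $T : H \to H$ be an operator from $H$ into itself. We say that $T$ is coercive if
$$\lim_{\|u\| \to \infty} \frac{(T(u), u)}{\|u\|} = \infty.$$
Lemma 6.2.18. Let $F : H \to \mathbb{R}$ be a functional and $\nabla F : H \to H$ its gradient. Let $\nabla F$ be a coercive and bounded operator. Then $F$ is weakly coercive.
Proof. Since
$$\frac{d}{dt} F(tu) = (\nabla F(tu), u),$$
we obtain by integration
$$F(u) = F(o) + \int_0^1 (\nabla F(tu), tu) \, \frac{dt}{t} = F(o) + \int_0^{\|u\|} \left( \nabla F\left( \tau \frac{u}{\|u\|} \right), \tau \frac{u}{\|u\|} \right) \frac{d\tau}{\tau}$$
for any $u \in H$, $u \ne o$. The coercivity of $\nabla F$ implies that there exists $r > 0$ such that
$$\frac{1}{\tau} \left( \nabla F\left( \tau \frac{u}{\|u\|} \right), \tau \frac{u}{\|u\|} \right) \ge 1 \quad \text{for any } \tau \ge r \text{ and } u \in H,\ u \ne o.$$
The boundedness of $\nabla F$ implies
$$m := \sup_{\substack{\tau \in [0, r] \\ u \in H,\ u \ne o}} \left\| \nabla F\left( \tau \frac{u}{\|u\|} \right) \right\| < \infty.$$
Consequently, we obtain
$$F(u) = F(o) + \int_0^r \left( \nabla F\left( \tau \frac{u}{\|u\|} \right), \tau \frac{u}{\|u\|} \right) \frac{d\tau}{\tau} + \int_r^{\|u\|} \left( \nabla F\left( \tau \frac{u}{\|u\|} \right), \tau \frac{u}{\|u\|} \right) \frac{d\tau}{\tau} \ge F(o) - rm + \|u\| - r$$
for any $u \in H$, $\|u\| > r$.
The last inequality yields the weak coercivity of $F$.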
Remark 6.2.19. Let $F : H \to \mathbb{R}$, $h \in H$. Assume that the gradient $\nabla F(u)$ of $F$ exists at any point $u \in H$. Then the following equivalence obviously holds true: There exists $u_0 \in H$ such that $\nabla F(u_0) = h$ if and only if there exists $u_0 \in H$ such that $\nabla G(u_0) = o$ where
$$G : u \mapsto F(u) - (u, h). \tag{6.2.10}$$
Theorem 6.2.20. Let $F : H \to \mathbb{R}$ and let $\nabla F : H \to H$ be the gradient of $F$. Let $\nabla F$ be a monotone, coercive and bounded operator. Then $\nabla F(H) = H$.$^{22}$
Proof. It follows from Remark 6.2.19 that it is enough to prove that for any $h \in H$, the functional $G$ defined by (6.2.10) has a critical point. But Lemmas 6.2.16 and 6.2.18 yield that $G$ is weakly sequentially lower semi-continuous and weakly coercive. The existence of a critical point of $G$ follows from Theorem 6.2.8.
22 Compare this result with Theorem 5.3.4.
Example 6.2.21. Let us consider again the boundary value problem (6.2.4) and the associated energy functional
$$F(x) = \frac{1}{2} \int_0^1 |\dot{x}(t)|^2 \, dt + \frac{1}{4} \int_0^1 |x(t)|^4 \, dt - \int_0^1 f(t) x(t) \, dt.$$
Then
$$(\nabla F(x), y) = \int_0^1 \dot{x}(t)\dot{y}(t) \, dt + \int_0^1 x^3(t) y(t) \, dt - \int_0^1 f(t) y(t) \, dt$$
is the Gâteaux derivative of $F$ in $H := W_0^{1,2}(0, 1)$. We verify the assumptions of Theorem 6.2.20. Using the continuous embedding $H = W_0^{1,2}(0, 1) \subset C[0, 1]$ (Theorem 1.2.26) we prove the boundedness (and even continuity!) of $\nabla F$ in the space $H$. Since $s \mapsto s^3$ is monotone we have
$$(\nabla F(x_1) - \nabla F(x_2), x_1 - x_2) = \int_0^1 |\dot{x}_1(t) - \dot{x}_2(t)|^2 \, dt + \int_0^1 (x_1^3(t) - x_2^3(t))(x_1(t) - x_2(t)) \, dt \ge \|x_1 - x_2\|^2 \tag{6.2.11}$$
for $x_1, x_2 \in H$ and the monotonicity of $\nabla F$ follows. Finally, we have
$$\frac{(\nabla F(x), x)}{\|x\|} = \|x\| + \frac{\|x\|_{L^4(0,1)}^4}{\|x\|} - \frac{1}{\|x\|} \int_0^1 f(t) x(t) \, dt \ge \|x\| - \|f\|_{L^2(0,1)} \frac{\|x\|_{L^2(0,1)}}{\|x\|}.$$
Using the inequality (6.2.7) we get
$$\frac{(\nabla F(x), x)}{\|x\|} \ge \|x\| - \|f\|_{L^2(0,1)},$$
i.e., $\nabla F$ is coercive. We conclude from Theorem 6.2.20 that
$$\nabla F(H) = H,$$
in particular, there exists $x_0 \in H$ such that
$$\nabla F(x_0) = o.$$
Hence $x_0$ is a weak solution of (6.2.4). The estimate (6.2.11) then implies the uniqueness of $x_0$.$^{23}$
23 The reader is invited to apply Theorem 5.3.4 to get the same result.
Remark 6.2.22. Most of the previous results hold true when the Hilbert space $H$ is replaced by a real reflexive Banach space $X$ and the scalar product $(\cdot, \cdot)$ is replaced by the duality pairing $\langle \cdot, \cdot \rangle$ between $X^*$ and $X$, i.e., for $f \in X^*$ and $x \in X$ we write $\langle f, x \rangle := f(x)$. However, the proofs are technically more involved and the gradient $\nabla F$ has to be replaced by the Gâteaux derivative $DF$.
Exercise 6.2.23. Let $\{e_n\}_{n=1}^{\infty}$ be an orthonormal basis in a Hilbert space $H$. Put
$$D_n = \left\{ x \in H : \|x - e_n\| \le \frac{1}{2} \right\}$$
and define a functional
$$f(x) = \begin{cases} \|x\| & \text{for } x \notin \bigcup_{n=1}^{\infty} D_n, \\ \|x\| + \dfrac{2(n - 1)}{n} \left( \dfrac{1}{2} - \|x - e_n\| \right) & \text{for } x \in D_n. \end{cases}$$
Show that $f$ is continuous on $H$,
$$\sup_{\|x\| \le \frac{3}{2}} f(x) = 2,$$
but $f$ does not have maximum on the ball $\left\{ x \in H : \|x\| \le \frac{3}{2} \right\}$.
Exercise 6.2.24. The mapping $U : \mathbb{R}^2 \to \mathbb{R}^2$ is defined by $U : (x, y) \mapsto (y, -x)$. Prove that $U$ is monotone and satisfies
$$\lim_{\|(x,y)\| \to \infty} \|U(x, y)\| = \infty$$
but is not coercive.
Exercise 6.2.25. Prove that any coercive map $F : H \to H$ satisfies
$$\lim_{\|u\| \to \infty} \|F(u)\| = \infty.$$
Exercise 6.2.26. Prove that the same conclusion as in Example 6.2.14 holds true also if $f \in L^1(0, 1)$.
Hint. Use the embedding $W_0^{1,2}(0, 1) \subset L^{\infty}(0, 1)$.
Exercise 6.2.27. Prove that the norm on $H$ and linear forms on $H$ are convex functionals.
Exercise 6.2.28. Prove that in Theorem 6.2.12 the weak coercivity of $F$ can be substituted by a weaker assumption: For any $u \in H$ there exists $r > 0$ such that for all $v \in H$, $\|v\| \ge r$, we have $F(v) > F(u)$.
Exercise 6.2.29. Let $M$ be an open convex subset of a real Hilbert space $H$, let $F : H \to \mathbb{R}$ be a functional such that for any $u \in M$ there exists the second Gâteaux derivative $D^2 F(u)$. Prove that
$$\text{(a)} \implies \text{(b)} \implies \text{(c)}$$
where
(a) $D^2 F(u)(h, h) \ge 0$ for $u \in M$, $h \in H$;
(b) $(\nabla F(u) - \nabla F(v), u - v) \ge 0$ for $u, v \in M$;
(c) $F$ is convex on $M$.
Hint. Use the Mean Value Theorem (see Theorem 3.2.6) as for real functions.
Exercise 6.2.30. Prove that for any $n \in \mathbb{N}$ and $f \in L^2(0, 1)$ the boundary value problem
$$\begin{cases} -\ddot{x}(t) + x^{2n+1}(t) = f(t), & t \in (0, 1), \\ x(0) = x(1) = 0 \end{cases}$$
has a unique weak solution.
Exercise 6.2.31. Let $f$ be the functional from Exercise 6.2.23. Prove that $f$ is not weakly sequentially upper semi-continuous (i.e., $-f$ is not weakly sequentially lower semi-continuous).
Hint. Remember that $e_n \rightharpoonup o$.
6.2A Ritz Method
In this part of the text we want to address one fundamental numerical approach to finding the global minimum of a real functional on a real Banach space. In applications such a minimum corresponds to a solution of a certain boundary value problem and the general method we will discuss below is a starting point for many numerical methods. Let us mention the Galerkin Method, the Finite Elements Method, the Katchanov–Galerkin Method, etc., which are powerful tools in the numerical solution of differential equations.
Let $X$ be a real Banach space and $F$ a real functional defined on $X$. An element $u_0 \in X$ satisfying
$$F(u_0) = \inf_{u \in X} F(u) \tag{6.2.12}$$
will be called a solution of the variational problem (6.2.12). We will discuss the Ritz Method which actually yields directly an algorithm for finding a solution of the variational problem. The basic idea of the Ritz Method is rather simple: Instead of looking for the minimum of the functional $F$ on the entire space $X$, we look for its minimum on suitable subspaces of the space $X$ in which we know how to solve the variational problem.
Let us now formulate this idea precisely: To every $n \in \mathbb{N}$, let a closed subspace $X_n$ of the space $X$ be assigned. The problem of finding an element $u_n \in X_n$ such that
$$F(u_n) = \inf_{u \in X_n} F(u) \tag{6.2.13}$$
holds is called the Ritz approximation of the problem (6.2.12) and the element $u_n \in X_n$ is called a solution of the problem (6.2.13). The following two fundamental problems immediately present themselves:
(a) the problem of the existence and uniqueness of a solution of the problem (6.2.13);
(b) the relation between the solutions of the problems (6.2.12) and (6.2.13).
Problem (a) has already been solved by Theorem 6.2.12 in the framework of Hilbert spaces. It follows from Remark 6.2.22 that the same assertion can be proved in a reflexive Banach space $X$. Since a closed subspace $X_n$ of a reflexive Banach space $X$ is also a reflexive Banach space, we have the following assertion which follows directly from Theorem 6.2.12 and Remark 6.2.22.
Proposition 6.2.32. Let $X$ be a reflexive Banach space, and let a functional $F$ defined on the space $X$ be continuous, strictly convex and weakly coercive on $X$. Then each of the problems (6.2.12) and (6.2.13) has precisely one solution $u_0$ and $u_n$, respectively.
We now focus our effort on problem (b). We investigate under what condition
$$\lim_{n \to \infty} \|u_0 - u_n\| = 0 \tag{6.2.14}$$
is true. If (6.2.14) is valid, then we say that the Ritz Method converges for the problem (6.2.12) and the solutions $u_n$ of the problems (6.2.13) approximate the solution of the problem (6.2.12) in the sense of the norm of the space $X$.
Proposition 6.2.33. Let $F$ be a continuous functional on a normed linear space $X$ and let $\{X_n\}_{n=1}^{\infty}$ be a sequence of closed subspaces of $X$ such that for every $v \in X$ there exist elements $v_n \in X_n$, $n \in \mathbb{N}$, such that
$$\lim_{n \to \infty} \|v - v_n\| = 0. \tag{6.2.15}$$
Let $u_n$ be such an element of $X_n$ that (6.2.13) holds. Then $\{u_n\}_{n=1}^{\infty}$ is a minimizing sequence for the functional $F$ on $X$, i.e.,
$$\lim_{n \to \infty} F(u_n) = \inf_{u \in X} F(u). \tag{6.2.16}$$
Proof. Let $\{\alpha_k\}_{k=1}^{\infty}$ be a sequence such that
$$\alpha_k \searrow \inf_{u \in X} F(u).$$
Then there exist elements $v^{(k)} \in X$ for which
$$F\left(v^{(k)}\right) < \alpha_k.$$
By the assumption (6.2.15) we can find $w_n^{(k)} \in X_n$ satisfying $w_n^{(k)} \to v^{(k)}$ for $n \to \infty$. Hence
$$\inf_{u \in X} F(u) \le F(u_n) \le F\left(w_n^{(k)}\right).$$
By the continuity of $F$ we get
$$\limsup_{n \to \infty} F(u_n) \le \lim_{n \to \infty} F\left(w_n^{(k)}\right) = F\left(v^{(k)}\right) < \alpha_k.$$
Letting $k \to \infty$ and using $F(u_n) \ge \inf_{u \in X} F(u)$, this implies that
$$\lim_{n \to \infty} F(u_n) = \inf_{u \in X} F(u).$$
The assertion on the convergence of the Ritz Method for the problem (6.2.12) is the following proposition.
Proposition 6.2.34 (Ritz Method). Let $H$ be a real Hilbert space,$^{24}$ and let $F$ be a continuous functional on the space $H$ which has the second Gâteaux derivative $D^2 F(u) \in B_2(H, \mathbb{R})$.$^{25}$ Assume, further, that there exists a constant $c > 0$ such that for all $u, v \in H$ we have
$$D^2 F(u)(v, v) \ge c \|v\|^2. \tag{6.2.17}$$
Let subspaces $H_n$ of the space $H$ satisfy condition (6.2.15). Then
(i) there exists precisely one solution $u_0 \in H$ of problem (6.2.12);
(ii) for every $n \in \mathbb{N}$ there exists precisely one solution $u_n \in H_n$ of problem (6.2.13);
(iii) the Ritz Method converges for problem (6.2.12), i.e.,
$$\lim_{n \to \infty} \|u_0 - u_n\| = 0.$$
24 We will state and prove Proposition 6.2.34 in the Hilbert space setting. The generalization to the Banach space setting can be obtained (cf. Remark 6.2.22). The reader can find details in specialized literature (see, e.g., Saaty [116]).
25 See Section 3.2.
Proof. It follows from the Taylor Formula (Proposition 3.2.27) that
$$F(u + v) = F(u) + DF(u)(v) + \int_0^1 (1 - t) D^2 F(u + tv)(v, v) \, dt. \tag{6.2.18}$$
Choosing $u = o$, we have due to (6.2.17)
$$F(v) = F(o) + DF(o)(v) + \int_0^1 (1 - t) D^2 F(tv)(v, v) \, dt \ge F(o) - \|DF(o)\| \|v\| + \frac{c}{2} \|v\|^2,$$
so that $F$ is weakly coercive. Similarly, (6.2.17) and (6.2.18) yield for $u, v \in H$, $v \ne o$,
$$F(u + v) - F(u) - DF(u)(v) \ge \frac{c}{2} \|v\|^2 > 0.$$
In particular, for $u = t w_1 + (1 - t) w_2$, $w_1 \ne w_2$, $t \in (0, 1)$, we have
$$F(w_1) - F(u) > (1 - t) DF(u)(w_1 - w_2), \qquad F(w_2) - F(u) > -t DF(u)(w_1 - w_2).$$
Multiplying the first inequality by $t$, the second by $(1 - t)$ and adding both of them, we obtain that $F$ is strictly convex on $H$ (and also on $H_n$ for arbitrary $n$). The assertions (i) and (ii) now follow from Theorem 6.2.12.
It remains to prove assertion (iii). Let $u_0$ and $u_n$ be a solution of (6.2.12) and (6.2.13), respectively. Set $u := u_0$ and $v := u_n - u_0$ in (6.2.18). From (6.2.17) and (6.2.18) we obtain
$$F(u_n) \ge F(u_0) + DF(u_0)(u_n - u_0) + \frac{c}{2} \|u_n - u_0\|^2.$$
Since $u_0 \in H$ is the minimum point for $F$ on $H$, it follows from Theorem 6.2.12 that $DF(u_0)(u_n - u_0) = 0$, i.e.,
$$F(u_n) \ge F(u_0) + \frac{c}{2} \|u_n - u_0\|^2 \tag{6.2.19}$$
holds for arbitrary $n \in \mathbb{N}$. On the other hand, due to Proposition 6.2.33, the elements $u_n$, $n \in \mathbb{N}$, constitute a minimizing sequence for $F$ on $H$, i.e.,
$$\lim_{n \to \infty} F(u_n) = \inf_{u \in H} F(u) = F(u_0). \tag{6.2.20}$$
It follows from (6.2.19) and (6.2.20) that
$$\lim_{n \to \infty} \|u_0 - u_n\| = 0$$
and the proof is complete.
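The last step of the proof can be spelled out in one line. The following display is a sketch that only combines (6.2.19) and (6.2.20); the resulting bound on $\|u_n - u_0\|$ in terms of the energy gap is a direct rearrangement of (6.2.19), not an additional statement of the text (the arrow notation assumes the amsmath package):

```latex
\[
0 \le \frac{c}{2}\,\|u_n - u_0\|^2 \le F(u_n) - F(u_0) \xrightarrow[n \to \infty]{} 0,
\qquad\text{hence}\qquad
\|u_n - u_0\| \le \sqrt{\frac{2}{c}\,\bigl(F(u_n) - F(u_0)\bigr)} \to 0 .
\]
```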
So far, we have answered theoretically problems (a) and (b) formulated at the beginning of this appendix. However, from the point of view of practical (numerical) calculations the most interesting problems start right now.
The most frequent and most important case arises in practice when the spaces $H_n$ are of finite dimension, e.g., $\dim H_n = N$. If $e_1, \dots, e_N$ is a basis of $H_n$ and
$$F_n(c_1, \dots, c_N) := F\left( \sum_{i=1}^{N} c_i e_i \right),$$
then the problem (6.2.13) means to find $\tilde{c} = (\tilde{c}_1, \dots, \tilde{c}_N) \in \mathbb{R}^N$ such that
$$F_n(\tilde{c}_1, \dots, \tilde{c}_N) = \inf_{(c_1, \dots, c_N) \in \mathbb{R}^N} F_n(c_1, \dots, c_N). \tag{6.2.21}$$
If the assumptions of Proposition 6.2.34 are satisfied, then the function $F_n$ is continuous, strictly convex on the space $\mathbb{R}^N$, satisfies
$$\lim_{\|c\| \to \infty} F_n(c) = \infty,$$
and then the vector $\tilde{c}$ is a solution of problem (6.2.21) if and only if all partial derivatives of the first order of the function $F_n$ vanish at $\tilde{c}$ (cf. Theorem 6.2.12). Thus the problem of finding a solution of problem (6.2.21) is equivalent to the problem of finding a solution of the system
$$\frac{\partial F_n}{\partial c_1}(c_1, \dots, c_N) = 0, \quad \dots, \quad \frac{\partial F_n}{\partial c_N}(c_1, \dots, c_N) = 0. \tag{6.2.22}$$
The system (6.2.22) is a system of $N$ algebraic equations which are generally nonlinear. However, note that if the functional $F$ is quadratic, then the system (6.2.22) is a system of linear algebraic equations.
Remark 6.2.35. We have not been concerned with the question which is fundamental from the practical point of view: “How do we solve system (6.2.22) numerically?” A vast literature dedicated to numerical methods deals with this problem. Just for an illustration we mention one minimization method. Choose arbitrarily a vector $c^0 = (c_1^0, \dots, c_N^0) \in \mathbb{R}^N$. Let us present an algorithm for the construction of a sequence $\{c^m\}_{m=1}^{\infty}$ which converges under appropriate assumptions on $F_n$ to the solution of system (6.2.22). If we know the vector $c^m = (c_1^m, \dots, c_N^m) \in \mathbb{R}^N$, we calculate the components of the vector $c^{m+1} = (c_1^{m+1}, \dots, c_N^{m+1}) \in \mathbb{R}^N$ as follows: Let the function
$$\xi \mapsto F_n(c_1^{m+1}, \dots, c_{i-1}^{m+1}, \xi, c_{i+1}^m, \dots, c_N^m)$$
of the variable $\xi$ on $\mathbb{R}$ assume its minimum at the point $\tilde{c}_i^{m+1}$. Put, then,
$$c_i^{m+1} := c_i^m + \omega(\tilde{c}_i^{m+1} - c_i^m) \quad \text{where} \quad 0 < \omega \le 2.$$
Here $\omega$ is the so-called relaxation parameter. If we choose $\omega = 1$ and if $F$ is a quadratic functional, we obtain the so-called Gauss–Seidel Iterative Method (see, e.g., Stoer & Bulirsch [125]). Nowadays there are plenty of packages available in Mathematica, Maple, Matlab, etc., which offer different solvers of system (6.2.22).
From the practical point of view it is important that the system (6.2.22) be as simple as possible. The form of the system (6.2.22) depends in an essential way on the actual choice of the subspaces $H_n$. One special choice depends on the notion and the properties of the Schauder basis.
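The relaxation scheme of Remark 6.2.35 can be sketched in a few lines for the quadratic case. The code below is a minimal illustration under assumptions that are not in the text: $F_n(c) = \frac{1}{2} c^{T} A c - b^{T} c$ with a symmetric positive definite matrix `A`, so that the one-dimensional minimizer $\tilde{c}_i^{m+1}$ is available in closed form; the names `relaxation_minimize`, `A`, `b`, `omega` and `sweeps` are hypothetical. For $\omega = 1$ the loop is the Gauss–Seidel iteration for the linear system $Ac = b$.

```python
def relaxation_minimize(A, b, omega=1.0, sweeps=200):
    """Coordinate-wise relaxation for F_n(c) = 0.5 * c^T A c - b^T c.

    A is assumed symmetric positive definite (list of lists), b a list.
    Each sweep updates c_1, ..., c_N in turn, always using the newest values.
    """
    n = len(b)
    c = [0.0] * n                      # starting vector c^0 (here: zero)
    for _ in range(sweeps):
        for i in range(n):
            # Minimizer of xi -> F_n(c_1, ..., c_{i-1}, xi, c_{i+1}, ..., c_N):
            # the i-th partial derivative vanishes at this value of xi.
            s = sum(A[i][j] * c[j] for j in range(n) if j != i)
            c_tilde = (b[i] - s) / A[i][i]
            # Relaxed update c_i := c_i + omega * (c_tilde - c_i).
            c[i] = c[i] + omega * (c_tilde - c[i])
    return c


# Small hypothetical test problem; the exact minimizer is (1, 1, 1).
A = [[2.0, -1.0, 0.0], [-1.0, 2.0, -1.0], [0.0, -1.0, 2.0]]
b = [1.0, 0.0, 1.0]
c_gs = relaxation_minimize(A, b, omega=1.0)    # Gauss-Seidel
c_sor = relaxation_minimize(A, b, omega=1.5)   # over-relaxation
```

For a nonquadratic $F_n$ the closed-form expression for `c_tilde` would have to be replaced by a one-dimensional minimization in the variable $\xi$.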
Let $\{e_i\}_{i=1}^{\infty}$ be a Schauder basis (see Section 1.2) of a Hilbert space $H$ (not necessarily orthonormal) and define the subspace $H_n$ as the set of all elements $u \in H$ which are of the form
$$u = c_1 e_1 + \dots + c_n e_n.$$
It follows from the definition of the Schauder basis that $\{H_n\}_{n=1}^{\infty}$ satisfies condition (6.2.15).
Example 6.2.36. Let $H := W_0^{1,2}(0, 1)$, $f \in L^1(0, 1)$ and
$$F(x) := \frac{1}{2} \int_0^1 |\dot{x}(t)|^2 \, dt + \frac{1}{4} \int_0^1 |x(t)|^4 \, dt - \int_0^1 f(t) x(t) \, dt, \quad x \in H. \tag{6.2.23}$$
Then $F$ is the energy functional associated with the Dirichlet problem
$$\begin{cases} -\ddot{x}(t) + x^3(t) = f(t), & t \in (0, 1), \\ x(0) = x(1) = 0 \end{cases} \tag{6.2.24}$$
(cf. Example 6.2.6). We have
$$DF(x)(y) = \int_0^1 \dot{x}(t)\dot{y}(t) \, dt + \int_0^1 x^3(t) y(t) \, dt - \int_0^1 f(t) y(t) \, dt,$$
$$D^2 F(x)(y, y) = \int_0^1 |\dot{y}(t)|^2 \, dt + 3 \int_0^1 |x(t)|^2 |y(t)|^2 \, dt,$$
and the assumptions of Proposition 6.2.34 are satisfied.$^{26}$ The sequence of functions $e_i$, $i = 1, 2, \dots$, which are defined by $e_i(t) := t^i (1 - t)$, constitutes a Schauder basis of the space $H$ (see, e.g., Michlin [94]). Thus, if we construct the subspaces $H_n$ as above, the condition (6.2.15) will be satisfied. If we rewrite the system (6.2.22) for this particular case, we obtain the system of nonlinear equations for unknowns $c_1, \dots, c_n$,
$$\sum_{k=1}^{n} c_k \int_0^1 \dot{e}_k(t) \dot{e}_j(t) \, dt + \int_0^1 \left( \sum_{k=1}^{n} c_k e_k(t) \right)^3 e_j(t) \, dt = \int_0^1 f(t) e_j(t) \, dt, \tag{6.2.25}$$
$j = 1, \dots, n$, where $e_k(t) = t^k (1 - t)$.
26 Note that we consider $\|x\| = \left( \int_0^1 |\dot{x}(t)|^2 \, dt \right)^{1/2}$ as the norm on $H$.
In each of the equations of system (6.2.25), all unknowns $c_1, \dots, c_n$ appear; this fact is rather unpleasant from the computational point of view! The question then arises whether it is possible to choose the spaces $H_n$ so that each of the equations of the system (6.2.22) depends on only a “small number” of unknowns. This is one of the fundamental questions of numerical mathematics. Such a choice of $H_n$ is possible, there are different ways to do it and each of them leads to a particular numerical method. Below we indicate one possible choice of $H_n$ which is different from the previous one and which meets the above mentioned requirements.
Let $n \in \mathbb{N}$, and put $t_i = \frac{i}{n}$ for $i = 0, 1, \dots, n$ and $I_j = [t_j, t_{j+1}]$ for $j = 0, 1, \dots, n - 1$. We define the spaces $H_n$ as follows: $H_n$ is the set of functions $x = x(t)$ continuous on the interval $[0, 1]$ which are linear on every interval $[t_i, t_{i+1}]$ and for which $x(0) = x(1) = 0$. Let $e_i \in H_n$, $i = 1, \dots, n - 1$, be functions such that
$$e_i(t_j) = \begin{cases} 1 & \text{for } i = j, \\ 0 & \text{for } i \ne j, \end{cases} \qquad j = 0, \dots, n.$$
It is easily established that the set $\{e_i\}_{i=1}^{n-1}$ constitutes a basis of the space $H_n$ and that for all $y \in H_n$ we have
$$y(t) = \sum_{j=1}^{n-1} y(t_j) e_j(t), \quad t \in [0, 1].$$
The system (6.2.22) constructed for this basis will now be itself a system for the unknown values $x_n(t_j)$ of the solution of problem (6.2.13). The crucial point in this construction is the fact that the functions $e_i(t)$ vanish outside the interval $I_{i-1} \cup I_i = \left[ \frac{i-1}{n}, \frac{i+1}{n} \right]$. We then have
$$e_i(t) e_j(t) = \dot{e}_i(t) \dot{e}_j(t) = 0 \quad \text{for } i, j = 1, \dots, n - 1,\ |i - j| > 1$$
at every point $t \in [0, 1]$ (with the obvious exception, for derivatives, of the points $t_1, \dots, t_{n-1}$, which constitute a set of measure zero). Therefore, in each of the equations
$$\sum_{i=1}^{n-1} c_i \int_0^1 \dot{e}_i(t) \dot{e}_j(t) \, dt + \int_0^1 \left( \sum_{i=1}^{n-1} c_i e_i(t) \right)^3 e_j(t) \, dt = \int_0^1 f(t) e_j(t) \, dt, \tag{6.2.26}$$
$j = 1, \dots, n - 1$, of system (6.2.22) only the unknowns $c_{j-1}$, $c_{j+1}$ appear apart from $c_j$. If we compute a solution $c_1, \dots, c_{n-1}$ from these equations, and if we put
$$u_n(t) = c_1 e_1(t) + \dots + c_{n-1} e_{n-1}(t), \quad t \in [0, 1],$$
we obtain a solution of problem (6.2.13). Now, we wish to know whether
$$\lim_{n \to \infty} \|u_n - u_0\| = 0,$$
where $u_0$ is the solution of problem (6.2.12). By Proposition 6.2.34, it suffices to show that the spaces $H_n$ satisfy condition (6.2.15). Let $y \in H$ and $\varepsilon > 0$. We shall show that there exist $n \in \mathbb{N}$ and $y_n \in H_n$ such that
$$\|y - y_n\| < \varepsilon. \tag{6.2.27}$$
Indeed, the set $D(0, 1)$ is dense in $H$ (see Exercise 1.2.46). Hence there exists $w \in D(0, 1)$ such that
$$\|y - w\| < \frac{\varepsilon}{2}. \tag{6.2.28}$$
Let $n \in \mathbb{N}$ be arbitrary, and let us construct a function $y_n \in H_n$ such that
$$y_n(t_i) = w(t_i) \quad \text{for all } i = 0, \dots, n.$$
Then we have (due to the Mean Value Theorem):
$$\|w - y_n\|^2 = \sum_{i=0}^{n-1} \int_{t_i}^{t_{i+1}} |\dot{w}(t) - \dot{y}_n(t)|^2\, dt \le \sum_{i=0}^{n-1} \max_{t \in [0,1]} |\ddot{w}(t)|^2\, \frac{1}{n^2} (t_{i+1} - t_i) = \frac{1}{n^2} \max_{t \in [0,1]} |\ddot{w}(t)|^2.$$
This implies that for sufficiently large $n \in \mathbb{N}$ we have
$$\|w - y_n\| < \frac{\varepsilon}{2}. \tag{6.2.29}$$
The desired inequality (6.2.27) now follows from (6.2.28) and (6.2.29).
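The interpolation estimate used in this argument can be checked numerically. The following is a minimal sketch, not part of the original text: it assumes the test function $w(t) = \sin(\pi t)$ (smooth and vanishing at the endpoints, though not compactly supported in $(0,1)$), the norm $\|x\| = (\int_0^1 |\dot{x}|^2\, dt)^{1/2}$ on $H$, and a simple midpoint quadrature; the function name `interpolation_error` is hypothetical.

```python
from math import sin, cos, pi, sqrt

def interpolation_error(w, dw, n, quad_points=200):
    """Approximate ||w - y_n|| = (int_0^1 |w'(t) - y_n'(t)|^2 dt)^(1/2),
    where y_n is the piecewise linear interpolant of w at the nodes t_i = i/n."""
    h = 1.0 / n
    total = 0.0
    for i in range(n):
        a = i * h
        slope = (w(a + h) - w(a)) / h        # derivative of y_n on [t_i, t_{i+1}]
        sub = h / quad_points
        for k in range(quad_points):         # composite midpoint rule on the subinterval
            t = a + (k + 0.5) * sub
            total += (dw(t) - slope) ** 2 * sub
    return sqrt(total)

# Assumed test function (not from the text): w(t) = sin(pi t), max |w''| = pi^2.
w = lambda t: sin(pi * t)
dw = lambda t: pi * cos(pi * t)
max_second_derivative = pi ** 2

for n in (4, 8, 16, 32):
    # The computed error should stay below the bound max|w''| / n.
    print(n, interpolation_error(w, dw, n), max_second_derivative / n)
```

Doubling $n$ roughly halves the computed error, in line with the factor $\frac{1}{n}$ in the bound.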
Remark 6.2.37. (i) Let us point out that to get system (6.2.25) it was not essential that an equidistant division of the interval $[0, 1]$ has been selected. Nonetheless, the norm of the division (i.e., the maximal distance between two consecutive points) must approach zero.

(ii) The spaces $H_n$ are the simplest which could be chosen for the given example. It is also possible to choose spaces of $C^1$-functions which are polynomials of higher degree on every interval $I_i$. For instance, one can choose $H_n = \{y \in C^1[0, 1] : y(0) = y(1) = 0,\ y|_{I_i} \text{ is a polynomial of the third degree for all } i = 0, \dots, n-1\}$.^{27} There exists a basis of this space whose dimension is $2n$ which consists of the functions $e_1, \dots, e_{n-1}, \psi_0, \dots, \psi_n$ such that
$$e_i(t_j) = \begin{cases} 1 & \text{for } i = j, \\ 0 & \text{for } i \neq j, \end{cases} \qquad \dot{e}_i(t_j) = 0, \qquad i = 1, \dots, n-1,\ j = 0, \dots, n;$$
$$\psi_i(t_j) = 0, \qquad \dot{\psi}_i(t_j) = \begin{cases} 1 & \text{for } i = j, \\ 0 & \text{for } i \neq j, \end{cases} \qquad i, j = 0, \dots, n,$$
see Figure 6.2.5. Every function $y \in H_n$ can be written in the form
$$y(t) = \sum_{j=1}^{n-1} y(t_j) e_j(t) + \sum_{j=0}^{n} \dot{y}(t_j) \psi_j(t), \qquad t \in [0, 1].$$
^{27} These functions are called cubic splines (see, e.g., de Boor [32]).

Figure 6.2.5. The basis functions $e_i$ and $\psi_i$ on the division $0 = t_0 < \dots < t_{i-1} < t_i < t_{i+1} < \dots < t_n = 1$.

(iii) From the computational point of view the question of how rapidly the solutions $u_n$ of problem (6.2.13) converge to a solution of problem (6.2.12) is very important. This question is closely related to the regularity of solutions of equations. If, e.g., $f \in C^0[0, 1]$, then $u_0 \in C^2[0, 1]$ (cf. Proposition 6.1.11 and Theorem 6.1.13) and using this it can be proved that there exists a constant $c > 0$ such that for all $n \in \mathbb{N}$ we have
$$\|u_0 - u_n\| \le \frac{c}{n}.$$
If, e.g., $u_0 \in C^4[0, 1]$, then we even have
$$\|u_0 - u_n\| \le \frac{c}{n^3}.$$
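The rate $\frac{c}{n}$ for piecewise linear elements can be observed on a simplified model. The sketch below is an illustration under stated assumptions, not the method of the text: it drops the cubic term of (6.2.26) and solves the linear problem $-\ddot{u} = 1$, $u(0) = u(1) = 0$ (exact solution $u(t) = \frac{t(1-t)}{2}$) with the hat functions $e_i$, for which $\int_0^1 \dot{e}_i \dot{e}_i\, dt = 2n$, $\int_0^1 \dot{e}_i \dot{e}_{i \pm 1}\, dt = -n$ and $\int_0^1 e_j\, dt = \frac{1}{n}$. All function names are hypothetical.

```python
from math import sqrt

def solve_tridiagonal(lower, diag, upper, rhs):
    """Thomas algorithm; lower[0] and upper[-1] are not used."""
    m = len(diag)
    c_prime = [0.0] * m
    d_prime = [0.0] * m
    c_prime[0] = upper[0] / diag[0]
    d_prime[0] = rhs[0] / diag[0]
    for i in range(1, m):
        denom = diag[i] - lower[i] * c_prime[i - 1]
        c_prime[i] = upper[i] / denom
        d_prime[i] = (rhs[i] - lower[i] * d_prime[i - 1]) / denom
    x = [0.0] * m
    x[m - 1] = d_prime[m - 1]
    for i in range(m - 2, -1, -1):
        x[i] = d_prime[i] - c_prime[i] * x[i + 1]
    return x

def ritz_nodal_values(n):
    """Nodal values c_0 = 0, c_1, ..., c_{n-1}, c_n = 0 of the Ritz solution
    for the assumed linear model problem -u'' = 1 (requires n >= 2)."""
    m = n - 1
    interior = solve_tridiagonal([-float(n)] * m, [2.0 * n] * m,
                                 [-float(n)] * m, [1.0 / n] * m)
    return [0.0] + interior + [0.0]

def energy_error(n):
    """||u - u_n|| in the norm (int_0^1 |x'|^2 dt)^(1/2); here u'(t) = 1/2 - t.
    Simpson's rule is exact because the integrand is quadratic on each interval."""
    h = 1.0 / n
    u_n = ritz_nodal_values(n)
    total = 0.0
    for i in range(n):
        a = i * h
        slope = (u_n[i + 1] - u_n[i]) / h
        g = lambda t: (0.5 - t - slope) ** 2
        total += h / 6.0 * (g(a) + 4.0 * g(a + 0.5 * h) + g(a + h))
    return sqrt(total)

for n in (5, 10, 20, 40):
    print(n, energy_error(n), n * energy_error(n))   # n * error stays constant
```

For this model problem the error equals $\frac{1}{n\sqrt{12}}$, so the constant $c$ can be read off directly; the behaviour for the nonlinear problem of Example 6.2.36 is not tested by this sketch.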
Remark 6.2.38 (Finite Elements Method). Similarly to Example 6.2.36 we could proceed even in the case $H := W_0^{k,2}(\Omega)$, $\Omega \subset \mathbb{R}^N$, $\Omega \in C^{0,1}$. The situation then corresponds to the boundary value problem for partial differential equations; see Chapter 7 for more details. Suppose that we can divide the set $\Omega$ into a finite number of open subsets $\Omega_i$, $i = 1, \dots, k$, such that their diameter
$$\operatorname{diam} \Omega_i = \sup_{x, y \in \Omega_i} \|x - y\| < \frac{1}{n}$$
and such that
$$\overline{\Omega} = \bigcup_{i=1}^{k} \overline{\Omega_i}, \qquad \Omega_i \cap \Omega_j = \emptyset \ \text{ for } i \neq j.$$
Each of the sets $\Omega_i$ is called a finite element. The space $H_n$ will consist of functions whose restrictions to $\Omega_i$ are smooth functions, for instance polynomials in $N$ variables, and satisfy certain conditions on the common boundary of the sets $\Omega_i$ and $\Omega_j$ ($i \neq j$). For simplicity and greater intuitive appeal we will consider $\Omega$ to be a polygon in $\mathbb{R}^2$ and for every $n \in \mathbb{N}$ we perform a triangulation $T_n$ of the set $\Omega$, i.e., we put
$$\overline{\Omega} = \bigcup_{i=1}^{k} \overline{K_i},$$
where $K_i$ are open triangles such that
$$\operatorname{diam} K_i \le \frac{1}{n}, \qquad i = 1, \dots, k,$$
see Figure 6.2.6. Assume that precisely one of the following situations arises for the mutual position of triangles $K_i, K_j \in T_n$ ($i \neq j$):
(a) the closures of two distinct triangles have no common point;
(b) the closures of two distinct triangles have only one vertex in common;
(c) the closures of two distinct triangles have an entire side in common.
The spaces $H_n$ will be sets of continuous functions whose restrictions to $K_i$ are polynomials of the $k$th order. Below, we give examples of the spaces $H_n$ for the case $k = 1$ and $k = 3$. The continuity of a function $v \in H_n$ is ensured on the set $\Omega$ by choosing the values of parameters (used for the construction of the function) to be equal at the common vertices. The reader will find more details in specialized literature on the Finite Elements Method (see, e.g., Brenner & Scott [16], Křížek & Neittaanmäki [81], Rektorys [107]).
Figure 6.2.6. A triangulation of the polygon $\Omega$ with one of its triangles $K_i$.

Example 6.2.39 ($k = 1$). Let $\Omega$ be a polygon in $\mathbb{R}^2$. Let $K$ be an open triangle with vertices $Q_1, Q_2, Q_3$. Let $P_1(K)$ be the set of all polynomials of the first degree defined on $K$, i.e., $P \in P_1(K)$ if
$$P(x, y) = \alpha_0 + \alpha_1 x + \alpha_2 y, \qquad (x, y) \in K.$$
It is easily shown that any function $P(x, y) \in P_1(K)$ is uniquely determined by its values at the vertices $Q_1, Q_2, Q_3$. The values $P(Q_1), P(Q_2), P(Q_3)$ serve as parameters by means of which the function $P(x, y)$ is constructed. The function $P \in P_1(K)$ for which
$$P(Q_i) = v(Q_i), \qquad i = 1, 2, 3,$$
is called the Lagrange interpolation of the function $v \in C(\overline{K})$. The function $P(x, y)$ constructed in this way is denoted by $\Pi_K v$. Clearly, $\Pi_K$ is a linear operator from the space $C(\overline{K})$ into $P_1(K)$ and
$$\|v - \Pi_K v\|_{W^{1,2}(K)} \le c h_K \|v\|_{W^{2,2}(K)} \tag{6.2.30}$$
holds for arbitrary functions $v \in W^{2,2}(K)$ (here $h_K = \operatorname{diam} K$ and $c > 0$ is a constant independent of $v$ and $h_K$).^{28} Define the space $H_n$ as follows:
$$H_n := \{v \in C(\overline{\Omega}) : v|_{K_i} \in P_1(K_i) \text{ for all } K_i \in T_n\}.$$
Obviously,
$$H_n \subset H := W^{1,2}(\Omega).$$
Let $v \in W^{2,2}(\Omega)$. Construct a function $v_n \in H_n$ in the following way:
$$v_n|_{K_i} = \Pi_{K_i} v.$$
Applying inequality (6.2.30), we obtain
$$\|v - v_n\| \le \frac{c}{n} \|v\|_{W^{2,2}(\Omega)}.$$
Thus, the function $v_n$ is arbitrarily close to the function $v$ provided $n$ is a sufficiently large nonnegative integer. Hence, making use of the fact that the space $W^{2,2}(\Omega)$ is dense in the space $H$ (explain why!), we conclude that the spaces $H_n$, $n \in \mathbb{N}$, satisfy condition (6.2.15). We can construct the basis functions $e_1, \dots, e_k$ of $H_n$ just as in Example 6.2.36. If $\{Q_i\}_{i=1}^{m}$ are all vertices of all triangles of the triangulation $T_n$, then
$$e_i(Q_j) = \begin{cases} 1 & \text{for } i = j, \\ 0 & \text{for } i \neq j, \end{cases} \qquad j = 1, \dots, m.$$

^{28} The reader is invited to prove it in detail!

Example 6.2.40 ($k = 3$). Let $K$ be an open triangle with vertices $Q_1, Q_2, Q_3$ and with the center of gravity $Q_0$. Let $P_3(K)$ be the set of polynomials of the third degree defined on $K$, i.e., $P \in P_3(K)$ if
$$P(x, y) = \alpha_0 + \alpha_1 x + \alpha_2 x^2 + \alpha_3 x^3 + \alpha_4 xy + \alpha_5 xy^2 + \alpha_6 x^2 y + \alpha_7 y + \alpha_8 y^2 + \alpha_9 y^3, \qquad (x, y) \in K.$$
A function $P(x, y) \in P_3(K)$ is uniquely determined by its values at the vertices and at the center of gravity and by the values of the first partial derivatives at the vertices of the triangle $K$. A function $\Pi_K v \in P_3(K)$ for which
$$\Pi_K v(Q_i) = v(Q_i), \qquad i = 0, 1, 2, 3;$$
$$\frac{\partial \Pi_K v(Q_i)}{\partial x} = \frac{\partial v(Q_i)}{\partial x}, \qquad \frac{\partial \Pi_K v(Q_i)}{\partial y} = \frac{\partial v(Q_i)}{\partial y}, \qquad i = 1, 2, 3,$$
is called the Hermite interpolation of the function $v \in C^1(\overline{K})$. Just as in the preceding example, the inequality
$$\|v - \Pi_K v\|_{W^{3,2}(K)} \le c h_K \|v\|_{W^{4,2}(K)}$$
holds for all $v \in W^{4,2}(K)$. If we put
$$H_n := \{v \in C^1(\overline{K}) : v|_{K_i} \in P_3(K_i) \text{ for every triangle } K_i \in T_n\},$$
then $H_n \subset H := W^{3,2}(\Omega)$ and the spaces $H_n$, $n \in \mathbb{N}$, again satisfy condition (6.2.15) since the set $W^{4,2}(\Omega)$ is dense in the space $H$.

Exercise 6.2.41. Apply the spaces $H_n$ described in Remark 6.2.37(ii) to Example 6.2.36.
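The Lagrange interpolation $\Pi_K v$ of Example 6.2.39 amounts to solving a $3 \times 3$ linear system for $\alpha_0, \alpha_1, \alpha_2$. A minimal sketch follows, assuming a nondegenerate triangle given by its vertex coordinates; the function name `p1_coefficients` and the sample data are hypothetical and only serve as illustration.

```python
def p1_coefficients(vertices, values):
    """Return (alpha0, alpha1, alpha2) of P(x, y) = alpha0 + alpha1*x + alpha2*y
    with P(Q_i) = values[i] at the three vertices Q_i of a triangle."""
    (x1, y1), (x2, y2), (x3, y3) = vertices
    v1, v2, v3 = values
    # Twice the signed area of the triangle; zero means the vertices are collinear.
    det = (x2 - x1) * (y3 - y1) - (x3 - x1) * (y2 - y1)
    if det == 0:
        raise ValueError("degenerate triangle: vertices are collinear")
    alpha1 = ((v2 - v1) * (y3 - y1) - (v3 - v1) * (y2 - y1)) / det
    alpha2 = ((x2 - x1) * (v3 - v1) - (x3 - x1) * (v2 - v1)) / det
    alpha0 = v1 - alpha1 * x1 - alpha2 * y1
    return alpha0, alpha1, alpha2

# Assumed sample data: v(x, y) = 2 + 3x - y on the reference triangle.
triangle = [(0.0, 0.0), (1.0, 0.0), (0.0, 1.0)]
v = lambda x, y: 2.0 + 3.0 * x - y
coefficients = p1_coefficients(triangle, [v(x, y) for x, y in triangle])
print(coefficients)   # a first-degree polynomial is reproduced by its interpolation
```

The determinant test reflects the statement that $P$ is uniquely determined by its vertex values exactly when the three vertices are not collinear.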
6.2B Supersolutions, Subsolutions and Global Extrema

In this appendix we show the connection between the supersolutions and subsolutions (see Section 5.4) on the one hand and the existence of global minima (see Section 6.2) on the other. We will illustrate it on the Dirichlet boundary value problem
$$\begin{cases} \ddot{x}(t) = f(t, x(t)), & t \in (0, 1), \\ x(0) = x(1) = 0, \end{cases} \tag{6.2.31}$$
where $f$ is a continuous function on $[0, 1] \times \mathbb{R}$ (cf. Example 5.4.19). Put $H := W_0^{1,2}(0, 1)$. The functional
$$\psi(x) := \int_0^1 \int_0^{x(t)} f(t, s)\, ds\, dt$$
defined on $H$ is of the class $C^1(H, \mathbb{R})$ and
$$\psi'(x)(h) = \int_0^1 f(t, x(t)) h(t)\, dt, \qquad x, h \in H.^{29} \tag{6.2.32}$$
Then
$$F(x) = \int_0^1 \left[ \frac{1}{2} |\dot{x}(t)|^2 + \int_0^{x(t)} f(t, s)\, ds \right] dt$$
is of the class $C^1(H, \mathbb{R})$ and its critical points correspond to weak solutions of (6.2.31). A regularity argument applied to (6.2.31) (similar to that from Theorem 6.1.13) implies that every weak solution is a classical solution in the sense that $x \in C_0^2[0, 1] := \{x \in C^2[0, 1] : x(0) = x(1) = 0\}$ and the equation in (6.2.31) holds at every point $t \in (0, 1)$. The link between the method of supersolutions and subsolutions on the one side and the method of finding the global minimizer on the other side is that the existence of a well-ordered pair of a subsolution and supersolution $u_0$ and $v_0$, respectively, implies that the functional $F$ has a minimum on the convex but noncompact set
$$M = \{x \in H : u_0(t) \le x(t) \le v_0(t) \text{ for all } t \in [0, 1]\}.$$
This minimum then solves (6.2.31). Namely, we have the following assertion.

Theorem 6.2.42. Let $u_0$ and $v_0$ be a subsolution and supersolution of (6.2.31) such that $u_0(t) \le v_0(t)$, $t \in [0, 1]$, $E := \{(t, x) \in [0, 1] \times \mathbb{R} : u_0(t) \le x \le v_0(t)\}$, and let $f : E \to \mathbb{R}$ be a continuous function. Then the functional $F$ has a global minimum on $M$, i.e., there exists $x_0 \in M$ such that
$$F(x_0) = \min_{\substack{x \in H \\ u_0 \le x \le v_0}} F(x).$$
Moreover, $x_0$ is a solution of (6.2.31).

Proof. Let $\gamma(t, x) := \max\{u_0(t), \min\{x, v_0(t)\}\}$ and consider the modified problem
$$\begin{cases} \ddot{x}(t) = f(t, \gamma(t, x(t))), & t \in (0, 1), \\ x(0) = x(1) = 0. \end{cases} \tag{6.2.33}$$
Define the energy functional associated with this modified problem by
$$\tilde{F}(x) = \int_0^1 \left[ \frac{1}{2} |\dot{x}(t)|^2 + \int_0^{x(t)} f(t, \gamma(t, s))\, ds \right] dt.$$

^{29} Cf. Section 3.2 in order to prove these facts.
Then $\tilde{F} \in C^1(H, \mathbb{R})$ and its critical points correspond to the solutions of (6.2.33). It is easy to prove (the reader should do it as an exercise) that $\tilde{F}$ is weakly sequentially lower semicontinuous and weakly coercive. It then follows from Theorem 6.2.8 that $\tilde{F}$ has a global minimum on $H$ at $x_0 \in H$, $\tilde{F}'(x_0) = o$. This $x_0$ is a weak solution of (6.2.33) and it is regular, i.e., $x_0 \in C^2[0, 1]$, by Theorem 6.1.14. We shall show that $u_0(t) \le x_0(t) \le v_0(t)$. Indeed, assume by contradiction that
$$\min_{t \in [0,1]} (x_0(t) - u_0(t)) < 0$$
and define
$$t_0 := \max \left\{ t \in [0, 1] : x_0(t) - u_0(t) = \min_{s \in [0,1]} (x_0(s) - u_0(s)) \right\}.$$
From the definition of a subsolution $u_0$ and of $\gamma$ we obtain that $t_0 < 1$, and for $t \ge t_0$, $t$ close to $t_0$, we have
$$\dot{x}_0(t) - \dot{u}_0(t) = \int_{t_0}^{t} [\ddot{x}_0(s) - \ddot{u}_0(s)]\, ds = \int_{t_0}^{t} [f(s, u_0(s)) - \ddot{u}_0(s)]\, ds \le 0.$$
This contradicts the definition of t0 . Hence x0 (t) ≥ u0 (t), t ∈ [0, 1]. Similarly we prove x0 (t) ≤ v0 (t), t ∈ [0, 1]. Notice that if x is such that u0 (t) ≤ x(t) ≤ v0 (t), then γ(t, x(t)) = x(t), i.e., x0 is a minimizer for F on M and F (x0 ) = o. Example 6.2.43. Consider the problem x ¨(t) = λf (t, x(t)),
t ∈ (0, 1),
x(0) = x(1) = 0,
(6.2.34)
where f is continuous on [0, 1] × R, f (t, 0) = 0, f (t, R) ≥ 0 for an R > 0 and there exists w ∈ H W01,2 (0, 1), 0 ≤ w(t) ≤ R, t ∈ [0, 1], such that 1 w(t) f (t, s) ds dt < 0. 0
0
Then there exists Λ ≥ 0 such that for all λ ≥ Λ, (6.2.34) has, besides the trivial solution, at least one nontrivial nonnegative solution. Indeed, u0 ≡ 0 is a subsolution and v0 ≡ R is a supersolution, and according to Theorem 6.2.42 there exists x0 ∈ M {x ∈ H : 0 ≤ x(t) ≤ R} which solves (6.2.34) and minimizes the energy functional F on M. Moreover, taking λ large enough, we have w(t) 1 1 2 |w(t)| ˙ +λ f (t, s) ds dt < 0, F (w) = 2 0 0 and so F (x0 ) = min F (x) ≤ F (w) < 0 = F (o). x∈M
e
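The threshold behaviour in $\lambda$ described in Example 6.2.43 can be illustrated numerically. The sketch below is only an illustration under assumed data that do not come from the text: $f(t, s) = s^2 - s$, $R = 1$ and $w(t) = \sin(\pi t)$, for which $f(t, 0) = 0$, $f(t, R) = 0 \ge 0$ and $\int_0^{w} f(t, s)\, ds = \frac{w^3}{3} - \frac{w^2}{2} < 0$ for $0 < w \le 1$. The function `energy` is a hypothetical helper using the midpoint rule.

```python
from math import sin, cos, pi

def energy(lam, w, dw, potential, m=2000):
    """Midpoint-rule approximation of
    F(w) = int_0^1 [ 0.5*|w'(t)|^2 + lam * P(t, w(t)) ] dt,
    where P(t, x) = int_0^x f(t, s) ds is supplied in closed form."""
    h = 1.0 / m
    total = 0.0
    for k in range(m):
        t = (k + 0.5) * h
        total += (0.5 * dw(t) ** 2 + lam * potential(t, w(t))) * h
    return total

# Assumed data (hypothetical): f(t, s) = s**2 - s, so P(t, x) = x**3/3 - x**2/2.
potential = lambda t, x: x ** 3 / 3.0 - x ** 2 / 2.0
w = lambda t: sin(pi * t)
dw = lambda t: pi * cos(pi * t)

for lam in (0.0, 10.0, 20.0, 30.0):
    print(lam, energy(lam, w, dw, potential))   # the sign changes between 20 and 30
```

For these data $F(w)$ is an affine, decreasing function of $\lambda$, so once $F(w) < 0 = F(o)$ for some $\lambda$ the same holds for all larger values, and the minimizer $x_0$ on $M$ cannot be the trivial solution.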
Remark 6.2.44. The same results as in Theorem 6.2.42 and Example 6.2.43 hold if the continuity of $f$ is relaxed to $f \in \mathrm{CAR}([0, 1] \times \mathbb{R})$ and for all $r > 0$ there exists $h \in L^1(0, 1)$ such that for a.e. $t \in (0, 1)$ and all $s \in \mathbb{R}$, $|s| \le r$, we have $|f(t, s)| \le h(t)$. The reader is invited to verify all the previous steps as an exercise. The reader who wants to learn more is referred to De Coster & Habets [33] where also the relation between non-well-ordered supersolutions and subsolutions on the one hand and the minimax method on the other is discussed.

Exercise 6.2.45. How does the proof of Theorem 6.2.42 change if the homogeneous Dirichlet boundary conditions in (6.2.31) are replaced by the Neumann ones?

Exercise 6.2.46. Consider the problem
$$\begin{cases} \left( |\dot{x}(t)|^{p-2} \dot{x}(t) \right)^{\cdot} = f(t, x(t)), & t \in (0, 1), \\ x(0) = x(1) = 0, \end{cases} \tag{6.2.35}$$
where $p > 1$ and
$$F(x) = \int_0^1 \left[ \frac{1}{p} |\dot{x}(t)|^p + \int_0^{x(t)} f(t, s)\, ds \right] dt.$$
Prove the analogue of Theorem 6.2.42 for (6.2.35).

Exercise 6.2.47. Find conditions on a continuous function $f : [0, 1] \times \mathbb{R} \to \mathbb{R}$ which guarantee that the problem (6.2.35) has a subsolution $u_0$ and a supersolution $v_0$ satisfying
$$u_0(t) \le v_0(t) \quad \text{for all } t \in [0, 1].$$
Hint. Look for $u_0$ and $v_0$ constant on $[0, 1]$.
6.3 Relative Extrema and Lagrange Multipliers

In this section we will investigate the local minima or maxima of a real function $f$ on a smooth manifold $M$ (in particular, on a surface in $\mathbb{R}^3$). Such a manifold is often determined by various constraints which are given by certain equations like $\Phi(x) = o$ (cf. Remark 4.3.9). The key assertions of this section are the Lagrange Multiplier Method, the Courant–Fischer and Courant–Weinstein Variational Principles.

Definition 6.3.1. Let $X$ be a metric (or, more generally, topological) space, $M \subset X$. We say that a function $f : M \to \mathbb{R}$ has a local minimum (maximum) at a point $a \in M$ with respect to $M$ (or a constrained minimum on $M$) if there is a neighborhood $U$ of $a$ such that
$$f(x) \ge f(a) \quad (f(x) \le f(a)) \quad \text{for all } x \in M \cap U.$$
We will suppose that $M$ is given as the zero set of a map $\Phi : X \to Y$, i.e., $M = \{x \in X : \Phi(x) = o\}$. The way of investigating the behavior of $f$ in a relative neighborhood $U \cap M$ of a point $a \in M$ is simple and transparent. It consists in expressing $M \cap U$ as the graph of a map $\varphi : Z \to X$ and subsequently studying $f \circ \varphi$. This is always possible if $M$ is a differentiable manifold in $X = \mathbb{R}^N$ (Definition 4.3.4) or if $M$ is given by $\Phi$ as above and $\Phi$ satisfies certain regularity conditions (Proposition 4.3.8 and Remark 4.3.9(i)).

Theorem 6.3.2 (Lagrange Multiplier Method). Let $X$ be a Banach space, $f : X \to \mathbb{R}$, $\Phi = (\Phi_1, \dots, \Phi_N) : X \to \mathbb{R}^N$. Let $f$ have a local minimum or maximum with respect to $M = \{x \in X : \Phi(x) = o\}$ at a point $a \in M$. Let there be a neighborhood $U$ of $a$ in $X$ such that $f, \Phi \in C^1(U)$ and let $a$ be a regular point of $\Phi$ (i.e., $\Phi'(a)$ is a surjective map onto $\mathbb{R}^N$). Then there exist numbers $\lambda_1, \dots, \lambda_N$^{30} such that
$$\left( f - \sum_{i=1}^{N} \lambda_i \Phi_i \right)'(a) = o. \tag{6.3.1}$$

Proof. Proposition 4.3.8 and Remark 4.3.9(i) yield a diffeomorphism $\varphi$ of a neighborhood $U$ of $o \in X$ onto a neighborhood $V$ of $a$ such that
$$\varphi(U \cap \operatorname{Ker} \Phi'(a)) = M \cap V, \qquad \varphi(o) = a.$$
If $\varphi_1$ denotes the restriction of $\varphi$ to $U \cap \operatorname{Ker} \Phi'(a)$, then $f \circ \varphi_1$ has a local minimum (or maximum) at $o$ and therefore $(f \circ \varphi_1)'(o) = o$. Since $\varphi_1'(o) h = h$ for any $h \in \operatorname{Ker} \Phi'(a)$ (see the proof of Proposition 4.3.8), it follows that $\operatorname{Ker} \Phi'(a) \subset \operatorname{Ker} f'(a)$. The use of Proposition 1.1.19 completes the proof.
Remark 6.3.3. (i) The main significance of Theorem 6.3.2 consists in reducing a (difficult) problem of finding the constrained extremal points to an easier task of finding the local ones for a function
$$f - \sum_{i=1}^{N} \lambda_i \Phi_i$$
with unknown coefficients $\lambda_1, \dots, \lambda_N$ (they have to be determined in the course of calculation; see Example 6.3.4).

^{30} The numbers $\lambda_1, \dots, \lambda_N$ are called Lagrange multipliers.
(ii) For an infinite number of constraints (i.e., $\Phi : X \to Y$, $Y$ is a Banach space of infinite dimension) the proof of Theorem 6.3.2 still holds provided there exists a continuous projection of $X$ onto $\operatorname{Ker} \Phi'(a)$. It is interesting that the statement (now $(f - F \circ \Phi)'(a) = 0$ for a certain $F \in Y^*$) is true without the assumption on existence of a projection (the so-called Lusternik Theorem), but the proof is more difficult (see Lusternik & Sobolev [90]).

Example 6.3.4. Find the minimal and maximal values of
$$f(x, y, z) = x^2 y + x y^2 + z^2$$
on the set $M = \{(x, y, z) \in \mathbb{R}^3 : x^2 + y^2 + z^2 = 1\}$.

Notice first that all points of $M$ are regular. The necessary condition given by Theorem 6.3.2 for extremal points requires solving the following four equations:
$$2xy + y^2 - 2\lambda x = 0, \tag{6.3.2}$$
$$x^2 + 2xy - 2\lambda y = 0, \tag{6.3.3}$$
$$2z - 2\lambda z = 0, \tag{6.3.4}$$
$$x^2 + y^2 + z^2 = 1. \tag{6.3.5}$$
We have either $z = 0$ or $\lambda = 1$ from the third equation. Adding $x^2$ and $y^2$ to (6.3.2) and (6.3.3) we obtain
$$x^2 + 2\lambda x = y^2 + 2\lambda y, \quad \text{i.e.,} \quad (x - y)(x + y + 2\lambda) = 0.$$
Case 1 ($z = 0$). If $x = y$, then
$$x = y = \pm \frac{\sqrt{2}}{2} \quad \text{and} \quad f\left( \pm \frac{\sqrt{2}}{2}, \pm \frac{\sqrt{2}}{2}, 0 \right) = \pm \frac{\sqrt{2}}{2}.$$
If $x + y = -2\lambda$, then (6.3.2) and (6.3.5) imply $xy = -\frac{1}{3}$ and from equation (6.3.5) we find
$$x + y = \pm \frac{\sqrt{3}}{3} \quad \text{and hence} \quad f(x, y, 0) = xy(x + y) = \mp \frac{\sqrt{3}}{9}.$$
Case 2 ($\lambda = 1$). Again we have either $x = y$ or $x + y = -2$. Putting $x = y$ into (6.3.2) we find
$$x = y = 0,\ z = \pm 1 \quad \text{or} \quad x = y = \frac{2}{3},\ z = \pm \frac{1}{3}$$
and
$$f(0, 0, \pm 1) = 1, \qquad f\left( \frac{2}{3}, \frac{2}{3}, \pm \frac{1}{3} \right) = \frac{19}{27}.$$
If $x + y = -2$, then (from (6.3.2) and (6.3.3))
$$x^2 + 2x - 4 = y^2 + 2y - 4 = 0.$$
Summing these equations we get $0 = x^2 + y^2 - 4 - 8$, i.e., there cannot exist $z$ such that $x^2 + y^2 + z^2 = 1$. We have found several points in $M$ for which the necessary condition is satisfied. Since $M$ is a compact set in $\mathbb{R}^3$ and $f$ is continuous, the maximum and the minimum of $f$ on $M$ have to exist. Comparing the values of $f$ at points at which the necessary condition is satisfied we find that
$$\max_M f = f(0, 0, \pm 1) = 1, \qquad \min_M f = f\left( -\frac{\sqrt{2}}{2}, -\frac{\sqrt{2}}{2}, 0 \right) = -\frac{\sqrt{2}}{2}.$$
If we were interested in local minima/maxima of $f$ with respect to $M$, we would need some sufficient conditions. Since we are able to reduce the problem of constrained minima/maxima to that of local ones (see the proof of Theorem 6.3.2), we might employ the sufficient condition which uses the second differential (Theorem 6.1.5). Cf. Exercise 6.3.17.

Example 6.3.5 (Existence of the principal eigenvalue). Let $p > 1$ be a real number, $X := W_0^{1,p}(0, 1)$.^{31} Consider the eigenvalue problem
$$\begin{cases} -\left( |\dot{x}(t)|^{p-2} \dot{x}(t) \right)^{\cdot} = \lambda |x(t)|^{p-2} x(t), & t \in (0, 1), \\ x(0) = x(1) = 0 \end{cases} \tag{6.3.6}$$
with a real parameter $\lambda$. This problem is linear for $p = 2$ and nonlinear for $p \neq 2$. We say that $\lambda \in \mathbb{R}$ is an eigenvalue of (6.3.6) if there is a weak solution $x \in X$, $x \neq o$, of (6.3.6), i.e.,
$$\int_0^1 |\dot{x}(t)|^{p-2} \dot{x}(t) \dot{y}(t)\, dt = \lambda \int_0^1 |x(t)|^{p-2} x(t) y(t)\, dt \tag{6.3.7}$$
holds for every $y \in X$. The corresponding $x$ is then called an eigenfunction associated with the eigenvalue $\lambda$.^{32}
^{31} We will work with the norm $\|x\| = \left( \int_0^1 |\dot{x}(t)|^p\, dt \right)^{\frac{1}{p}}$.

^{32} To see the analogue to the linear case the reader should notice that for $p = 2$ such a function $x$ is an eigenvector (Definition 1.1.27) and $\lambda$ is an eigenvalue of the linear operator $Bx = \ddot{x}$, $\operatorname{Dom} B = \{x \in W_0^{1,2}(0, 1) : x(0) = x(1) = 0\} \subset L^2(0, 1)$. The identity (6.3.7) can be interpreted (for $p = 2$) as the operator equation $x = \lambda A x$ where $A$ is defined by the equality $(Ax, y)_{W_0^{1,2}(0,1)} = (x, y)_{L^2(0,1)}$. The eigenvalues of (6.3.6) are then reciprocal values of the eigenvalues of $A$.
Since (6.3.7) must also hold for $y = x$, we obtain
$$\lambda = \frac{\int_0^1 |\dot{x}(t)|^p\, dt}{\int_0^1 |x(t)|^p\, dt},$$
which implies that $\lambda > 0$ for any eigenvalue $\lambda$. We will prove that the value
$$\lambda_1 = \inf_{\substack{x \in X \\ x \neq o}} \frac{\int_0^1 |\dot{x}(t)|^p\, dt}{\int_0^1 |x(t)|^p\, dt}, \tag{6.3.8}$$
i.e.,
$$\lambda_1 = \inf_{x \in X} \left\{ \int_0^1 |\dot{x}(t)|^p\, dt : \int_0^1 |x(t)|^p\, dt = 1 \right\},$$
is attained and use the Lagrange Multiplier Method to show that $\lambda_1$ is the least eigenvalue (principal eigenvalue) of (6.3.6). Let us prove that the infimum in (6.3.8) is achieved at an $x_1 \in X$ with
$$\int_0^1 |x_1(t)|^p\, dt = 1.$$
Indeed, there exists a minimizing sequence $\{x_n\}_{n=1}^{\infty} \subset X$ such that
$$\int_0^1 |x_n(t)|^p\, dt = 1 \quad \text{and} \quad \int_0^1 |\dot{x}_n(t)|^p\, dt \to \lambda_1.$$
In particular, this means that the sequence $\{x_n\}_{n=1}^{\infty}$ is bounded in $X$. By the reflexivity of $X$ and the compact embedding $X = W_0^{1,p}(0, 1) \subset\subset L^p(0, 1)$ (see Theorem 1.2.28 and Exercise 1.2.46(i)) there exists a subsequence $\{x_{n_k}\}_{k=1}^{\infty} \subset \{x_n\}_{n=1}^{\infty}$ and a function $x_1 \in X$ such that
$$x_{n_k} \rightharpoonup x_1 \ \text{ in } X, \qquad x_{n_k} \to x_1 \ \text{ in } L^p(0, 1).$$
Hence
$$\int_0^1 |x_1(t)|^p\, dt = 1 \quad \text{and} \quad \|x_1\|^p \le \liminf_{n \to \infty} \|x_n\|^p = \lambda_1,$$
i.e.,
$$\int_0^1 |\dot{x}_1(t)|^p\, dt = \lambda_1.$$
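The quotient in (6.3.8) is easy to evaluate for concrete trial functions, which gives upper bounds for $\lambda_1$. A minimal sketch, not part of the original text: it uses midpoint quadrature and the trial functions $\sin(\pi t)$ and $t(1 - t)$, both chosen here only for illustration. For $p = 2$ the first eigenvalue of the linear problem on $(0, 1)$ is $\pi^2$, and the value for $\sin(\pi t)$ reproduces it; for $p \neq 2$ the code only yields upper bounds. The name `rayleigh_quotient` is hypothetical.

```python
from math import sin, cos, pi

def rayleigh_quotient(x, dx, p, m=4000):
    """Midpoint-rule approximation of
    int_0^1 |x'(t)|^p dt / int_0^1 |x(t)|^p dt."""
    h = 1.0 / m
    numerator = 0.0
    denominator = 0.0
    for k in range(m):
        t = (k + 0.5) * h
        numerator += abs(dx(t)) ** p * h
        denominator += abs(x(t)) ** p * h
    return numerator / denominator

# Trial functions vanishing at t = 0 and t = 1 (assumed for illustration).
sine = (lambda t: sin(pi * t), lambda t: pi * cos(pi * t))
parabola = (lambda t: t * (1.0 - t), lambda t: 1.0 - 2.0 * t)

print(rayleigh_quotient(*sine, p=2))       # close to pi**2
print(rayleigh_quotient(*parabola, p=2))   # an upper bound for lambda_1 when p = 2
print(rayleigh_quotient(*sine, p=3))       # an upper bound for lambda_1 when p = 3
```

Every admissible trial function gives a value not smaller than $\lambda_1$, which is exactly the content of the infimum in (6.3.8).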
Now we apply Theorem 6.3.2 with
$$f(x) = \int_0^1 |\dot{x}(t)|^p\, dt \quad \text{and} \quad g(x) = \int_0^1 |x(t)|^p\, dt - 1.$$
The Fréchet derivatives of $f$ and $g$ at $x_1$ (in the space $X$) are given by
$$f'(x_1) y = p \int_0^1 |\dot{x}_1(t)|^{p-2} \dot{x}_1(t) \dot{y}(t)\, dt, \qquad g'(x_1) y = p \int_0^1 |x_1(t)|^{p-2} x_1(t) y(t)\, dt \qquad \text{for any } y \in X$$
(cf. Exercise 3.2.35). Since $x_1 \neq o$, we also have $g'(x_1) \neq o$, and so the assumptions of Theorem 6.3.2 are fulfilled. Hence there exists $\lambda \in \mathbb{R}$ such that $f'(x_1) = \lambda g'(x_1)$, which is equivalent to
$$\int_0^1 |\dot{x}_1(t)|^{p-2} \dot{x}_1(t) \dot{y}(t)\, dt = \lambda \int_0^1 |x_1(t)|^{p-2} x_1(t) y(t)\, dt \tag{6.3.9}$$
for any $y \in X$. Setting $y = x_1$ in (6.3.9) we get $\lambda = \lambda_1$. Now it follows from (6.3.7) and (6.3.8) that $\lambda_1$ is the least eigenvalue of (6.3.6).

Remark 6.3.6. Let us emphasize that Theorem 6.3.2 provides a necessary condition only. It means that not every point $a \in M$ for which
$$f'(a) - \sum_{i=1}^{N} \lambda_i \Phi_i'(a) = o \quad \text{with some } \lambda_i \in \mathbb{R},\ i = 1, \dots, N,$$
need be a point of local extremum of $f$ relative to $M$! On the other hand, to find all local extrema of $f$ relative to $M$ one has to start with finding all $\lambda_i \in \mathbb{R}$, $i = 1, \dots, N$, such that the functional $f - \sum_{i=1}^{N} \lambda_i \Phi_i$ has a critical point $a \in M$. It
is a well-known fact from the calculus of several real variables (when X = RN ) that the set of all such a’s is “almost always” finite (see, e.g., Example 6.3.4). Hence a very natural and deep question arises: “How many points a do we have if dim X = ∞?” Remark 6.3.7. Let us denote by Λ ⊂ R the set of all λ ∈ R such that f − λg has a critical point a ∈ M . If X is a Hilbert space of infinite dimension, then in
Krasnoselski [78, Chapter 6] the reader can find the proof of the assertion that the set $\Lambda$ contains a sequence of nonzero numbers $\lambda_n \neq 0$ such that $\lambda_n \to 0$. The same assertion for a Banach space $X$ can be found in Citlanadze [26], Browder [18], Fučík & Nečas [55]. Actually, the whole Chapter 6 of the lecture notes by Fučík et al. [56] is devoted to this problem. As for more recent references the reader can confer Zeidler [136] and the bibliography therein. Let us emphasize that in all above results the authors prove that the cardinality of the set $\Lambda$ is equal to infinity. The question: "When is $\Lambda$ a countable set?" is much more involved. Some partial results in this direction can be found in Fučík et al. [56]. The proofs are based on a stronger version of the Morse Theorem and go beyond the scope of this book.

Proposition 6.3.8. Let $H$ be an $N$-dimensional Hilbert space and let $A$ be a self-adjoint operator in $H$. Then $A$ has $N$ real eigenvalues $\lambda_1, \dots, \lambda_N$ (if they are counted with their multiplicities), and the corresponding eigenvectors $e_1, \dots, e_N$ form an orthonormal basis in $H$.

Proof. Consider two functions $f, \varphi_1 : H \to \mathbb{R}$ defined by
$$f(x) = (Ax, x), \qquad \varphi_1(x) = (x, x) - 1, \qquad x \in H.$$
Then the set $M_1 = \{x \in H : \varphi_1(x) = 0\}$ (the unit sphere in $H$) is a compact subset of $H$ and the continuous function $f$ assumes its maximum in $M_1$ at a point $e_1 \in M_1$. By Theorem 6.3.2, there is a $\lambda_1 \in \mathbb{R}$ such that $f'(e_1) - \lambda_1 \varphi_1'(e_1) = o$. A simple calculation shows that $f'(e_1) h = 2(Ae_1, h)$, $\varphi_1'(e_1) h = 2(e_1, h)$. Therefore
$$(Ae_1 - \lambda_1 e_1, h) = 0 \quad \text{for all } h \in H, \quad \text{i.e.,} \quad Ae_1 = \lambda_1 e_1.$$
Taking $h = e_1$ we also get
$$\lambda_1 = (Ae_1, e_1) = \max_{x \in M_1} (Ax, x).$$
In particular, $\lambda_1$ is the largest (equivalently, first) eigenvalue. To find the second eigenvalue we add another constraint $\varphi_2(x) := (x, e_1) = 0$ (remember that eigenvectors of a symmetric matrix are pairwise orthogonal). The function $f$ has again a maximum with respect to
$$M_2 = \{x \in H : \varphi_1(x) = \varphi_2(x) = 0\}$$
and thus there are $e_2 \in M_2$, $\lambda_2, \tilde{\lambda}_2 \in \mathbb{R}$ such that
$$f'(e_2) h - \lambda_2 \varphi_1'(e_2) h - \tilde{\lambda}_2 \varphi_2'(e_2) h = (2Ae_2 - 2\lambda_2 e_2 - \tilde{\lambda}_2 e_1, h) = 0 \tag{6.3.10}$$
for all $h \in H$. In particular, for $h = e_1$ we get
$$0 = (2Ae_2, e_1) - \tilde{\lambda}_2 \|e_1\|^2 = 2(e_2, Ae_1) - \tilde{\lambda}_2 = 2\lambda_1 (e_2, e_1) - \tilde{\lambda}_2,$$
and consequently $\tilde{\lambda}_2 = 0$. The equality (6.3.10) hence yields $Ae_2 = \lambda_2 e_2$ and, similarly as above,
$$\lambda_2 = \max_{\substack{\|x\| = 1 \\ (x, e_1) = 0}} (Ax, x).$$
It is obvious that we can proceed by induction to obtain all eigenvalues $\lambda_1, \dots, \lambda_N$ and to show that the corresponding eigenvectors $e_1, \dots, e_N$ are orthonormal and form a basis of $H$.

Corollary 6.3.9. Let $A = (a_{ij})_{i,j=1,\dots,N}$ be a symmetric matrix ($a_{ij} = a_{ji}$ for $i, j = 1, \dots, N$). Then there exist real numbers $\lambda_1, \dots, \lambda_N$ and a basis $e_1, \dots, e_N$ of $\mathbb{R}^N$ such that
$$\sum_{i,j=1}^{N} a_{ij} x_i x_j = \sum_{i=1}^{N} \lambda_i \xi_i^2, \quad \text{where} \quad x = (x_1, \dots, x_N), \quad x = \sum_{i=1}^{N} \xi_i e_i.$$
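The inductive construction in the proof of Proposition 6.3.8 (maximize $(Ax, x)$ on the unit sphere, then repeat under the additional constraints $(x, e_1) = \dots = (x, e_{k-1}) = 0$) can be imitated numerically. The sketch below is only one possible realization and not the method of the text: it uses power iteration with projection onto the orthogonal complement of the eigenvectors already found, which maximizes $(Ax, x)$ only under the extra assumption that $A$ is positive definite. The sample matrix and all function names are hypothetical.

```python
from math import sqrt

def matvec(A, x):
    return [sum(a * xi for a, xi in zip(row, x)) for row in A]

def dot(x, y):
    return sum(a * b for a, b in zip(x, y))

def project_out(x, basis):
    """Remove from x its components along the orthonormal vectors in basis."""
    for e in basis:
        c = dot(x, e)
        x = [xi - c * ei for xi, ei in zip(x, e)]
    return x

def normalize(x):
    norm = sqrt(dot(x, x))
    return [xi / norm for xi in x]

def constrained_maximizer(A, previous, iterations=500):
    """Approximate the maximizer of (Ax, x) over ||x|| = 1, (x, e) = 0 for e in previous,
    by projected power iteration (assumes A symmetric positive definite)."""
    x = normalize(project_out([1.0] * len(A), previous))
    for _ in range(iterations):
        x = normalize(project_out(matvec(A, x), previous))
    return dot(matvec(A, x), x), x

def eigenpairs(A):
    values, vectors = [], []
    for _ in range(len(A)):
        lam, e = constrained_maximizer(A, vectors)
        values.append(lam)
        vectors.append(e)
    return values, vectors

# Assumed sample matrix with eigenvalues 11, 2 and 1.
A = [[2.0, 0.0, 0.0],
     [0.0, 3.0, 4.0],
     [0.0, 4.0, 9.0]]
values, vectors = eigenpairs(A)
print(values)
```

The starting vector $(1, \dots, 1)$ is an arbitrary choice; the iteration would stall if it happened to be orthogonal to the eigenvector being sought, which is not the case for the sample matrix.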
Remark 6.3.10. The procedure explored in the proof of Proposition 6.3.8 has a disadvantage, namely, to find the $k$th eigenvalue $\lambda_k$ it is necessary to know the first $k - 1$ eigenvectors $e_1, \dots, e_{k-1}$. Because of that it can be convenient to have another expression for $\lambda_k$. We will now prove that
$$\lambda_k = \min_{y_1, \dots, y_{k-1}} \max \left\{ \frac{(Ax, x)}{\|x\|^2} : (x, y_1) = \dots = (x, y_{k-1}) = 0 \text{ and } x \neq o \right\} \tag{6.3.11}$$
provided $\dim H \ge k$. Expression (6.3.11) is called the Minimax Principle. Let $e_1, \dots, e_k$ be eigenvectors corresponding to the first $k$ eigenvalues $\lambda_1 \ge \dots \ge \lambda_k$. Take $y_1, \dots, y_{k-1} \in H$ and let $N = \{x \neq o : (x, y_1) = \dots = (x, y_{k-1}) = 0\}$. There is an $\tilde{x} \in N \cap \operatorname{Lin}\{e_1, \dots, e_k\}$, say $\tilde{x} = \sum_{i=1}^{k} \alpha_i e_i$. A simple argument to see this consists in the observation that the linear operator $\Phi : \mathbb{R}^k \to \mathbb{R}^{k-1}$ (or $\mathbb{C}^k \to \mathbb{C}^{k-1}$) given by
$$\Phi \alpha = \left( \sum_{i=1}^{k} \alpha_i (e_i, y_j) \right)_{j = 1, \dots, k-1}$$
must have a nontrivial kernel. For such an $\tilde{x}$ we have
$$(A\tilde{x}, \tilde{x}) = \left( \sum_{i=1}^{k} \alpha_i \lambda_i e_i, \sum_{j=1}^{k} \alpha_j e_j \right) = \sum_{i=1}^{k} \lambda_i |\alpha_i|^2 \ge \lambda_k \sum_{i=1}^{k} |\alpha_i|^2 = \lambda_k \|\tilde{x}\|^2.$$
This shows that the maximum in (6.3.11) (denoted by $m(y_1, \dots, y_{k-1})$) is not less than $\lambda_k$ and therefore
$$\inf_{y_1, \dots, y_{k-1}} m(y_1, \dots, y_{k-1}) \ge \lambda_k,$$
too. But the above calculation yields that $m(e_1, \dots, e_{k-1}) = \lambda_k$.

Remark 6.3.11. This method of finding eigenvalues of a self-adjoint continuous operator $A$ cannot be extended to infinite dimensional Hilbert spaces. The reason is rather simple: such an operator need not have any eigenvector (Example: $Ax(t) = tx(t)$, $x \in L^2(0, 1)$). On the other hand, if we assume that $A$ is, in addition to self-adjointness, also compact, then a similar result holds.

Theorem 6.3.12 (Courant–Fischer Principle). Let $A : H \to H$ be a compact, self-adjoint and positive^{33} linear operator from an (infinite dimensional) separable real Hilbert space $H$ into itself. Then all eigenvalues of $A$ are positive reals and there exists an orthonormal basis of $H$ which consists of eigenvectors of $A$. If, moreover,
$$\lambda_1 \ge \lambda_2 \ge \lambda_3 \ge \dots > 0, \qquad \lambda_n \to 0 \quad (n \to \infty),$$
denote the eigenvalues of $A$, then
$$\lambda_1 = \max \{(Au, u) : \|u\| = 1\}$$
and
$$\lambda_{k+1} = \min_{v_1, \dots, v_k} \max \{(Au, u) : \|u\| = 1,\ (u, v_1) = \dots = (u, v_k) = 0\}, \qquad k = 1, 2, \dots.^{34}$$

Proof. Set
$$F(u) = (Au, u), \qquad \varphi_1(u) = \|u\|^2 - 1 \quad \text{for } u \in H,$$
and $M_1 = \{u \in H : \varphi_1(u) = 0\}$.

^{33} A linear self-adjoint operator $A$ is said to be positive if $(Au, u) > 0$ for all $u \neq o$.

^{34} The reader should compare this assertion and its proof with the Hilbert–Schmidt Theorem (Theorem 2.2.16).
Let $\{u_n\}_{n=1}^{\infty}$ be a maximizing sequence for $F$ subject to $M_1$, i.e., $\|u_n\| = 1$, $n = 1, \dots$, and
$$\lim_{n \to \infty} F(u_n) = \sup \{F(u) : u \in M_1\}.$$
The boundedness of $M_1$ and the compactness of $A$ imply (Proposition 2.2.4(iii)) that we can pass to a subsequence (denoted again as $\{u_n\}_{n=1}^{\infty}$) for which
$$u_n \rightharpoonup e_1 \quad \text{and} \quad Au_n \to Ae_1 \quad \text{in } H$$
with an $e_1 \in H$. Then
$$|(Au_n, u_n) - (Ae_1, e_1)| \le |(Au_n - Ae_1, u_n)| + |(Ae_1, u_n - e_1)| \to 0$$
since both terms on the right-hand side approach zero. So $F(e_1) = \sup \{F(u) : u \in M_1\}$. In particular, we have
$$F(e_1) > 0 \quad \text{and} \quad e_1 \neq o.$$
Let us prove that $\|e_1\| = 1$. Indeed, we have
$$\|e_1\| \le \liminf_{n \to \infty} \|u_n\| = 1.$$
Assume that $\|e_1\| < 1$. Then there exists $t > 1$ such that for $\tilde{e}_1 = t e_1$ we have $\|\tilde{e}_1\| = 1$, i.e., $\tilde{e}_1 \in M_1$. Also
$$F(\tilde{e}_1) = (A(te_1), te_1) = t^2 (Ae_1, e_1) = t^2 F(e_1) > \sup \{F(u) : u \in M_1\},$$
a contradiction. Hence
$$\lambda_1 = F(e_1) = \max \{F(u) : u \in M_1\}.$$
Applying Theorem 6.3.2 we prove exactly as in Proposition 6.3.8 that $\lambda_1$ is an eigenvalue of $A$ and $e_1$ is the corresponding eigenvector. Now, we proceed by induction using
$$M_n = \{u \in H : \|u\| = 1 \text{ and } (u, e_1) = \dots = (u, e_{n-1}) = 0\}$$
as above to get the sequence of eigenvalues
$$\lambda_1 \ge \lambda_2 \ge \dots > 0 \tag{6.3.12}$$
and the sequence of the corresponding eigenvectors $e_1, e_2, \dots$^{35} which are pairwise orthogonal. The infinite dimension of $H$ causes that the above sequences are infinite in general.

^{35} The reader should perform this part of the proof in detail.
Suppose now that there is $w \in H$ such that
$$\|w\| = 1 \quad \text{and} \quad (w, e_n) = 0 \quad \text{for all } n \in \mathbb{N}.$$
Then
$$w \in \bigcap_{n=1}^{\infty} M_n, \quad \text{and thus} \quad (Aw, w) \le \lambda_n \quad \text{for } n = 1, 2, \dots.$$
Since $\lambda_n \to 0$ (Corollary 2.2.13), we have $(Aw, w) = 0$. The assumption on the positivity of $A$ implies $w = o$, a contradiction. This result shows that $\{e_n\}_{n=1}^{\infty}$ is an orthonormal basis of $H$ (Corollary 1.2.36). Moreover, the sequence (6.3.12) contains all eigenvalues of $A$. Indeed, if
$$Aw = \lambda w \quad \text{for } w = \sum_{n=1}^{\infty} \alpha_n e_n \neq o,$$
then
$$\lambda_n \alpha_n = \lambda \alpha_n \quad \text{for } n = 1, 2, \dots.$$
Therefore $\alpha_n = 0$ provided $\lambda_n \neq \lambda$. The "min max" characterization of $\lambda_n$'s follows as in the finite dimensional case (Remark 6.3.10).

Remark 6.3.13. It is remarkable that the Minimax Principle holds even without the assumption on the continuity of $A$ in the sense that
$$\inf_{y_1, \dots, y_{k-1}} \sup \{(Ax, x) : x \in \operatorname{Dom} A,\ \|x\| = 1,\ (x, y_1) = \dots = (x, y_{k-1}) = 0\}$$
yields either the $k$th eigenvalue or an upper bound of the essential spectrum of a linear self-adjoint operator $A$ provided $A$ is bounded above. For details see, e.g., Reed & Simon [106]. There is also a dual characterization of the eigenvalues of $A$ called the Courant–Weinstein Variational Principle.

Theorem 6.3.14 (Courant–Weinstein Variational Principle). Let $H$ be a real separable Hilbert space, $A : H \to H$ a positive compact self-adjoint linear operator. Assume that the eigenvalues $\lambda_n$ of $A$ form a decreasing sequence
$$\lambda_1 \ge \lambda_2 \ge \lambda_3 \ge \dots \ge \lambda_n \ge \dots > 0, \qquad \lambda_n \to 0 \quad (n \to \infty)$$
(cf. Theorem 6.3.12), and the multiplicity of an eigenvalue $\lambda$ indicates how many times this $\lambda$ repeats in the above sequence. Then for any $n \in \mathbb{N}$,
$$\lambda_n = \sup_{\substack{X \subset H \\ \dim X = n}} \ \inf_{\substack{u \in X \\ \|u\| = 1}} (Au, u).$$
(Here $X$ is an arbitrary linear subspace of $H$ of dimension equal to $n$.)
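Proof. Keeping the notation from Theorem 6.3.12, in particular, $Ae_n = \lambda_n e_n$, we denote for $n \in \mathbb{N}$ fixed
$$\tilde{\lambda}_n = \sup_{\substack{X \subset H \\ \dim X = n}} \ \inf_{\substack{u \in X \\ \|u\| = 1}} (Au, u).$$
Our aim is to prove $\tilde{\lambda}_n = \lambda_n$.

Step 1. We prove that $\tilde{\lambda}_n \ge \lambda_n$. Set $X_0 = \operatorname{Lin}\{e_1, \dots, e_n\}$. Then $X_0$ is a linear subspace of $H$, $\dim X_0 = n$, and clearly
$$\tilde{\lambda}_n \ge \min_{\substack{u \in X_0 \\ \|u\| = 1}} (Au, u).$$
However, we can estimate the minimum of the quadratic form on the right-hand side in terms of $\lambda_n$. For $u \in X_0$, $\|u\| = 1$ we have
$$u = \sum_{i=1}^{n} x_i e_i, \qquad \sum_{i=1}^{n} x_i^2 = 1.$$
Then
$$(Au, u) = \left( \sum_{i=1}^{n} x_i \lambda_i e_i, \sum_{j=1}^{n} x_j e_j \right) = \sum_{i=1}^{n} \lambda_i x_i^2 \ge \lambda_n, \quad \text{i.e.,} \quad \tilde{\lambda}_n \ge \lambda_n.$$

Step 2. We prove $\tilde{\lambda}_n \le \lambda_n$. Set $Y = \operatorname{Lin}\{e_i\}_{i=n}^{\infty}$. Then $\operatorname{codim} Y = n - 1$. Let $X$ be an arbitrary linear subspace of $H$, $\dim X = n$. Then necessarily $\dim (X \cap Y) > 0$, and the space $X \cap Y$ must contain an element $w \neq o$. We can assume $\|w\| = 1$. Since $w \in Y$, we have
$$w = \sum_{i=n}^{\infty} x_i e_i, \qquad \sum_{i=n}^{\infty} x_i^2 = 1.$$
The estimate of the quadratic form $(Au, u)$ on the unit sphere in $X$ yields
$$\min_{\substack{u \in X \\ \|u\| = 1}} (Au, u) \le (Aw, w) = \sum_{i=n}^{\infty} \lambda_i x_i^2 \le \lambda_n \sum_{i=n}^{\infty} x_i^2 = \lambda_n.$$
Since $X$ is arbitrary, the inequality $\tilde{\lambda}_n \le \lambda_n$ follows.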
Example 6.3.15 (Higher eigenvalues). Let $p = 2$ in (6.3.6), i.e., let us consider the eigenvalue problem
$$\begin{cases} \ddot{x}(t) + \lambda x(t) = 0, & t \in (0, 1), \\ x(0) = x(1) = 0. \end{cases} \tag{6.3.13}$$
The eigenvalues of the linear problem (6.3.13) can be calculated in an elementary way. On the other hand, if we set $H := W_0^{1,2}(0, 1)$ and define a positive and compact operator $A : H \to H$ by
$$(Ax, y)_{W_0^{1,2}(0,1)} = \int_0^1 x(t) y(t)\, dt,^{36}$$
then $\mu \neq 0$ is an eigenvalue of $A$ if and only if $\lambda = \frac{1}{\mu}$ is an eigenvalue of (6.3.13) (cf. footnote 32 on page 404). It follows from Theorem 6.3.14 that
$$\frac{1}{\lambda_n} = \sup_{\substack{X \subset H \\ \dim X = n}} \ \min_{\substack{x \in X \\ \|x\| = 1}} \int_0^1 |x(t)|^2\, dt.$$
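The eigenvalues of (6.3.13) are $\lambda_k = k^2 \pi^2$ with eigenfunctions $\sin(k \pi t)$, and a discrete analogue makes the connection with the Ritz spaces $H_n$ of hat functions visible. The sketch below rests on assumptions not made in the text: it uses the stiffness matrix $K = n \cdot \operatorname{tridiag}(-1, 2, -1)$ of the hat functions together with the lumped mass matrix $\frac{1}{n} I$ (mass lumping is a simplification chosen here), for which the eigenpairs are known in closed form. The function names are hypothetical.

```python
from math import sin, pi

def discrete_eigenpair(n, k):
    """k-th eigenpair of K v = lam * M v with K = n * tridiag(-1, 2, -1) and the
    lumped mass matrix M = (1/n) * I (an assumption of this sketch)."""
    lam = 4.0 * n * n * sin(k * pi / (2 * n)) ** 2
    v = [sin(k * pi * j / n) for j in range(1, n)]
    return lam, v

def residual(n, lam, v):
    """Max-norm of K v - lam * M v, with zero boundary values."""
    m = len(v)
    worst = 0.0
    for j in range(m):
        left = v[j - 1] if j > 0 else 0.0
        right = v[j + 1] if j < m - 1 else 0.0
        kv = n * (2.0 * v[j] - left - right)
        worst = max(worst, abs(kv - lam * v[j] / n))
    return worst

for n in (10, 100, 1000):
    lam1, v1 = discrete_eigenpair(n, 1)
    print(n, lam1, pi ** 2 - lam1)   # the gap to pi**2 shrinks as n grows
```

The discrete eigenvalues approach $k^2 \pi^2$ as $n \to \infty$, so their reciprocals approximate the eigenvalues $\mu_k = \frac{1}{\lambda_k}$ of the operator $A$.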
The following two exercises show the relation between the local (global) extremum subject to a constraint and the local (global) extremum of the functional depending on a parameter (without the constraint).

Exercise 6.3.16. Prove the following assertion: Let $f$, $\Phi$ be two real functionals defined on a real Hilbert space $H$. Let the functional $f - \lambda\Phi$ ($\lambda \in \mathbb{R}$) have a local (global) extremum at a point $x_0 \in H$. Then the functional $f$ has a local (global) extremum subject to the constraint $\{x \in H : \Phi(x) = \Phi(x_0)\}$ at the point $x_0$.

Exercise 6.3.17. Prove the following assertion: Let $f, \Phi : X \to \mathbb{R}$ satisfy the assumptions of Theorem 6.3.2 and let $x_0 \in X$, $\lambda \in \mathbb{R}$ be such that $f'(x_0) - \lambda\Phi'(x_0) = 0$. Assume, moreover, that there exist $D^2 f(x_0; h, h)$, $D^2\Phi(x_0; h, h)$. Then $x_0$ is a local minimum of $f - \lambda\Phi$ (without the constraint) provided the quadratic form
\[ h \mapsto D^2 f(x_0; h, h) - \lambda D^2\Phi(x_0; h, h), \quad h \in X, \]
is positive definite in $X$.

$^{36}$ By Example 2.2.17 the operator $A$ is also defined as $(Ax)(t) = \int_0^1 G(t, s) x(s)\, ds$, and the compactness of $A$ follows.
Chapter 6. Variational Methods
Exercise 6.3.18. Show that the first eigenvalue of
\[ \ddot x(t) + \lambda x(t) = 0, \quad t \in (0, \pi), \qquad x(0) = x(\pi) = 0 \]
is simple and equal to 1, and that given $\lambda > -1$ there exists $c = c(\lambda) > 0$ such that for any $x \in W_0^{1,2}(0, \pi)$,
\[ \int_0^\pi |\dot x(t)|^2\, dt + \lambda \int_0^\pi |x(t)|^2\, dt \ge c \int_0^\pi |\dot x(t)|^2\, dt. \]

Exercise 6.3.19. Prove that for all $x \in W_0^{1,2}(0, \pi)$ the inequality
\[ \int_0^\pi |x(t)|^2\, dt \le \int_0^\pi |\dot x(t)|^2\, dt \]
holds true.

Hint. Use Exercise 6.3.18.
6.3A Contractible Sets

This appendix has solely an auxiliary character and will be used in the proof of the Krasnoselski Potential Bifurcation Theorem in Appendix 6.3B. The proofs of the assertions from this appendix rely on the Brouwer Fixed Point Theorem (Theorem 5.1.3).

Definition 6.3.20. Let $A$ and $B$ be subsets of a topological space $Y$. Then by definition $A$ is contractible into $B$ in the space $Y$, briefly
\[ A \prec B \quad \text{in} \quad Y, \]
if there exists a homotopy $h \in C([0, 1] \times A, Y)$ such that for any $u \in A$,
\[ h(0, u) = u, \qquad h(1, u) \in B. \]

The next assertion shows that “$\prec$” is a transitive relation.

Lemma 6.3.21. Let $A$, $B$ and $C$ be subsets of $Y$. If $A \prec B$ and $B \prec C$ in $Y$, then also $A \prec C$ in $Y$.

Proof. Let us assume that $A \prec B$ and $B \prec C$ by means of homotopies $h$ and $g$. Define a homotopy $f$ by
\[ f(t, u) = \begin{cases} h(2t, u), & 0 \le t \le \frac12,\ u \in A, \\ g(2t - 1, h(1, u)), & \frac12 < t \le 1,\ u \in A. \end{cases} \]
The two branches agree at $t = \frac12$ because $g(0, h(1, u)) = h(1, u)$. Then $f \in C([0, 1] \times A, Y)$ and Definition 6.3.20 yields $A \prec C$.

Let $H_1$ and $H_2$ be two closed subspaces of a Hilbert space $H$ such that $H = H_1 \oplus H_2$.
Let $P_i : H \to H_i$, $i = 1, 2$, be projections (cf. Example 1.1.13(i)), and assume that $\dim H_1 < \infty$. Set
\[ R = \{x \in H : P_1 x \ne o\}. \]
The set $R$ equipped with the metric induced by the norm in $H$ is a metric space.

Lemma 6.3.22. The set $S_{1,r} := \partial B(o; r) \cap H_1$ is not contractible to a point in $R$.$^{37}$

Proof. It is enough to prove this assertion for the sphere with radius $r = 1$. Let us denote it by $S_1$. We proceed in two steps. We prove first that if $S_1$ were contractible to a point in $R$, then it would have to be contractible to a point in $S_1$. In the second step we show that this fact contradicts the Brouwer Fixed Point Theorem (Theorem 5.1.3).

Step 1. If $S_1$ is contractible to a point in $R$, then there exists a continuous mapping $f : [0, 1] \times S_1 \to R$ and $x_0 \in R$ such that
\[ f(0, x) = x, \qquad f(1, x) = x_0 \qquad \text{for all } x \in S_1. \]
For $t \in [0, 1]$, $x \in S_1$ set
\[ g(t, x) = \frac{P_1 f(t, x)}{\|P_1 f(t, x)\|}. \]
Then $g$ deforms the set $S_1$ continuously to the point $\frac{P_1 x_0}{\|P_1 x_0\|}$ in $S_1$.

Step 2. Let the unit sphere $S_1 \subset H_1$ be contractible to a point in $S_1$, i.e., there exists a continuous map $g : [0, 1] \times S_1 \to S_1$ and a point $x_0 \in S_1$ such that
\[ g(0, x) = x, \qquad g(1, x) = x_0 \qquad \text{for all } x \in S_1. \]
Now, we define $h : B(o; 1) \cap H_1 \to B(o; 1) \cap H_1$ by
\[ h : x \mapsto \begin{cases} -g\left(1 - \|x\|, \dfrac{x}{\|x\|}\right) & \text{for } x \ne o, \\ -x_0 & \text{for } x = o. \end{cases} \]
Then $h$ is continuous. Since $\dim H_1 < \infty$, the Brouwer Fixed Point Theorem (Theorem 5.1.3) implies that there exists $y \in B(o; 1) \cap H_1$ such that $h(y) = y$. Since $h$ assumes only values from $S_1$, we have $y \in S_1$, $\|y\| = 1$. On the other hand, $h(y) = -g(0, y) = -y$, which is a contradiction.

$^{37}$ I.e., there is no $x \in R$ such that $S_{1,r} \prec \{x\}$ in $R$.

Lemma 6.3.23. Let $\mathcal{F}$ be a subset of $R$. If there exists $x_0 \in H_1$, $\|x_0\| = 1$, such that
\[ P_1(\mathcal{F}) \cap \{y \in H_1 : y = a x_0,\ a \in \mathbb{R}\} = \emptyset, \]
then $\mathcal{F}$ is contractible to a point in $R$.

Proof. Define $f : [0, 1] \times \mathcal{F} \to R$ as
\[ f(t, x) = \begin{cases} x + 2t x_0 [1 - (x, x_0)] & \text{for } t \in \left[0, \frac12\right],\ x \in \mathcal{F}, \\ x_0 + 2(1 - t)[x - (x, x_0) x_0] & \text{for } t \in \left[\frac12, 1\right],\ x \in \mathcal{F}. \end{cases} \]
The mapping $f$ is continuous and deforms $\mathcal{F}$ to the point $x_0 \in R$. It is sufficient to verify that for any $t \in [0, 1]$, $x \in \mathcal{F}$ we have $P_1 f(t, x) \ne o$. Indeed, for any $t \in \left[0, \frac12\right]$ we have
\[ P_1 f(t, x) = 2t[1 - (x, x_0)] x_0 + P_1 x, \]
and for $t \in \left[\frac12, 1\right]$ we have
\[ P_1 f(t, x) = [1 - 2(1 - t)(x, x_0)] x_0 + 2(1 - t) P_1 x. \]
For $t \in [0, 1)$ we have then $P_1 f(t, x) \ne o$ due to the assumption $P_1(\mathcal{F}) \cap \operatorname{Lin}\{x_0\} = \emptyset$. For $t = 1$ we have $P_1 f(t, x) = x_0 \ne o$.
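The deformation used in the proof of Lemma 6.3.23 can be tried out in a small finite-dimensional model. The sketch below is only an illustration under stated assumptions: $H = \mathbb{R}^3$, $H_1 = \operatorname{span}\{e_1, e_2\}$, $x_0 = e_1$, and a random finite sample of points whose projection onto $H_1$ lies outside $\operatorname{Lin}\{x_0\}$. None of these choices come from the text.

```python
import numpy as np

def P1(x):
    """Orthogonal projection of R^3 onto H1 = span{e1, e2} (illustrative choice)."""
    return np.array([x[0], x[1], 0.0])

x0 = np.array([1.0, 0.0, 0.0])            # unit vector in H1

def f(t, x):
    """The two-stage deformation from the proof of Lemma 6.3.23."""
    c = np.dot(x, x0)
    if t <= 0.5:
        return x + 2.0 * t * (1.0 - c) * x0
    return x0 + 2.0 * (1.0 - t) * (x - c * x0)

rng = np.random.default_rng(1)
points = rng.standard_normal((50, 3))
points = points[np.abs(points[:, 1]) > 1e-3]   # keep P1 x outside Lin{x0}

ts = np.linspace(0.0, 1.0, 201)
# Smallest norm of P1 f(t, x) over the sample: it must stay positive.
min_norm = min(np.linalg.norm(P1(f(t, x))) for x in points for t in ts)
starts_at_x = all(np.allclose(f(0.0, x), x) for x in points)
ends_at_x0 = all(np.allclose(f(1.0, x), x0) for x in points)
# The two branches of f must agree at t = 1/2.
glue_gap = max(np.linalg.norm(f(0.5, x) - f(0.5 + 1e-12, x)) for x in points)
```

The positivity of `min_norm` reflects the fact that the deformation never leaves $R$.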
6.3B Krasnoselski Potential Bifurcation Theorem

Let us recall the definition of a potential operator.

Definition 6.3.24. Let $O$ be an open subset of a real Hilbert space $H$, $f : O \to H$. We say that $f$ has a potential (in $O$) if there exists a functional $F : O \to \mathbb{R}$ which is Fréchet differentiable in $O$, and for any $x \in O$ we have
\[ f(x) = F'(x). \tag{6.3.14} \]

Remark 6.3.25. Let us recall how to interpret the equality (6.3.14). The Fréchet derivative $F'(x)$ is a continuous linear operator from $H$ into $\mathbb{R}$. It follows from the Riesz Representation Theorem (see Theorem 1.2.40) that there is a unique point $z := z(x) \in H$ such that
\[ F'(x) y = (y, z), \qquad \|z\| = \|F'(x)\| \]
for any $y \in H$. In what follows we will identify $F'(x)$ with $z(x) \in H$ and study bifurcation points of the equation
\[ \lambda x - F'(x) = o. \tag{6.3.15} \]
The main objective of this appendix is to prove that (under the assumptions $F(o) = 0$, $F'(o) = o$ and some assumptions concerning the smoothness of $F$) every point $(\lambda, o)$ where $\lambda$ is a nonzero eigenvalue of $F''(o) : H \to H$ is a bifurcation point of (6.3.15).
Theorem 6.3.26 (Krasnoselski Potential Bifurcation Theorem). Let $F$ be a (nonlinear) functional on a Hilbert space $H$. Assume that
\[ F \text{ is twice differentiable in a certain neighborhood } U(o) \text{ of } o \in H, \tag{6.3.16} \]
\[ F' \text{ is compact on } U(o), \tag{6.3.17} \]
\[ F'' : U(o) \to L(H) \text{ is continuous at } o, \tag{6.3.18} \]
\[ F(o) = 0, \qquad F'(o) = o. \tag{6.3.19} \]
Then $(\lambda_0, o)$ where $\lambda_0 \ne 0$ is a bifurcation point of
\[ \lambda x - F'(x) = o \tag{6.3.20} \]
if and only if $\lambda_0$ is an eigenvalue of the operator $A := F''(o)$.

Remark 6.3.27. Note that the equation (6.3.20) is a special case of the equation $o = \lambda x - Ax + G(\lambda, x)$ from Theorem 5.2.23. Indeed, the left-hand side of (6.3.20) can be written as
\[ \lambda x - F''(o) x + [F''(o) x - F'(x)] \]
where $F''(o)$ is a compact linear operator (see Proposition 5.2.21), and
\[ F''(o) x - F'(x) = o(\|x\|), \qquad \|x\| \to 0. \]

Note first that the implication

if $\lambda_0 \ne 0$ and $(\lambda_0, o)$ is a bifurcation point of (6.3.20), then $\lambda_0$ is an eigenvalue of $A$,

follows from Exercise 5.2.25. So we will concentrate on the proof of the reversed implication. Roughly speaking, we know that the “linearization of (6.3.20)”, i.e., the equation $(\lambda I - F''(o)) x = o$, has a nontrivial solution, and we want to show that there is also a nontrivial solution of the “close” but nonlinear equation (6.3.20). The basic idea of the proof consists in the fact that (6.3.20) is a necessary condition for $x$ to be a critical point of $F$ subject to the sphere
\[ \partial B(o; r) = \left\{x \in H : J(x) = \tfrac12 r^2\right\} \qquad \text{where} \quad J(x) = \tfrac12 \|x\|^2. \]
Here we use the fact that the identity is the differential of the functional $J$, and the Lagrange Multiplier Method. Later we will prove the existence of a sufficiently large number of critical points of $F$ on $\partial B(o; r)$. If we restrict ourselves to spheres with sufficiently small radii ($B(o; r) \subset U(o)$ at least), we get critical points converging to zero. The last part of the proof consists in showing that the corresponding Lagrange multipliers can be chosen close to $\lambda_0$.
Let us assume that $\lambda_0 \ne 0$ is an eigenvalue of the operator $A$. The assumption (6.3.18) guarantees that $F''(o)$ is a linear self-adjoint operator (see Proposition 3.2.28). We can assume, without loss of generality, that $\lambda_0 > 0$. Let us start with a geometrical interpretation of the points $x \in \partial B(o; r)$ such that
\[ \lambda x = F'(x). \tag{6.3.21} \]
In this case the differential $F'(x)$ is perpendicular (recall that $F'(x) \in H$ in our interpretation) to the sphere $\partial B(o; r)$ at $x$. Then $x$ can be looked for as a limit of those points of the sphere $\partial B(o; r)$ at which the tangent projections (see (6.3.22) below and Figure 6.3.1) of $F'(x)$ converge to zero. More precisely, we have

[Figure 6.3.1: a point $z$ of the sphere, the vector $F'(z)$, its radial part $P(z) = \frac{(F'(z), z)}{(z, z)} z$, the tangent space $\{y : (z, y) = 0\}$, and the tangential projection $D(z)$.]

Lemma 6.3.28. For $z \in H$, $z \ne o$, set
\[ D(z) = F'(z) - \frac{(F'(z), z)}{(z, z)} z \tag{6.3.22} \]
($D(z)$ is the orthogonal projection of $F'(z)$ to the tangent space of $\partial B(o; \|z\|)$ at $z$$^{38}$). Let $y_n \in \partial B(o; r)$, $y_n \rightharpoonup x_0$, and let $F'$ be continuous, and
\[ \lim_{n \to \infty} F'(y_n) = y \ne o, \qquad \lim_{n \to \infty} D(y_n) = o.^{39} \tag{6.3.23} \]
Then $y_n \to x_0$, $y = F'(x_0)$, $x_0 \ne o$, and
\[ \lambda x_0 - F'(x_0) = o \qquad \text{where} \qquad \lambda = \frac{1}{r^2} (F'(x_0), x_0). \tag{6.3.24} \]

Proof. From the weak convergence $y_n \rightharpoonup x_0$ and from (6.3.23) we obtain
\[ (F'(y_n), y_n) \to (y, x_0) \qquad \text{and hence} \qquad \frac{(F'(y_n), y_n)}{r^2} y_n \rightharpoonup \frac{(y, x_0)}{r^2} x_0. \]

$^{38}$ This tangent space is equal to $\{x \in H : (x, z) = 0\}$ – see Remark 4.3.40.
$^{39}$ Both limits are considered with respect to the norm in $H$.
At the same time, from the definition of $D(y_n)$ and (6.3.23) we have
\[ \frac{(F'(y_n), y_n)}{r^2} y_n = F'(y_n) - D(y_n) \to y. \]
Hence
\[ y = \frac{1}{r^2} (y, x_0) x_0. \]
Since $y \ne o$, we have $x_0 \ne o$ and also $(y, x_0) \ne 0$. The definition of $D(y_n)$ and the fact that $D(y_n) \to o$ yield
\[ y_n = r^2 \frac{F'(y_n) - D(y_n)}{(F'(y_n), y_n)} \to r^2 \frac{y}{(y, x_0)} = x_0. \]
Continuity of $F'$ at $x_0$ then implies $y = F'(x_0)$, i.e.,
\[ F'(x_0) = \frac{(y, x_0)}{r^2} x_0 = \frac{(F'(x_0), x_0)}{r^2} x_0. \]
We will look for a curve on the sphere $\partial B(o; r)$ which starts at a fixed point $x$; the values of $F$ along this curve do not decrease, and after a finite time (even if large) we “almost” reach the critical point of $F$. In other words, we are looking for a curve $k = k(t, x)$, $t \in [0, \infty)$, $x \in \partial B(o; r)$ such that
\[ k(0, x) = x, \tag{6.3.25} \]
and for all $t \in (0, \infty)$ we require
\[ k(t, x) \in \partial B(o; r), \qquad \text{i.e.,} \qquad \|k(t, x)\|^2 = r^2. \]
The last relation implies $\frac{d}{dt} \|k(t, x)\|^2 = 0$, which is equivalent to
\[ \left(\frac{d}{dt} k(t, x), k(t, x)\right) = 0 \qquad \text{for all } t \in (0, \infty). \tag{6.3.26} \]
The equality (6.3.26) states that for all $t \in (0, \infty)$ the element $\frac{d}{dt} k(t, x)$ is perpendicular to $k(t, x)$. This will be satisfied if we look for a solution of the initial value problem
\[ \begin{cases} \dfrac{d}{dt} k(t, x) = D(k(t, x)), & t \in (0, \infty), \\ k(0, x) = x. \end{cases} \tag{6.3.27} \]
The assumption (6.3.18) implies that $F'$ is Lipschitz continuous in a neighborhood of $o$. Hence, for $r > 0$ sufficiently small, $D$ is Lipschitz continuous. Then, by virtue of Corollary 3.1.6, there exists a unique solution of (6.3.27) which is defined on the whole interval $(0, \infty)$. It follows from Remark 3.1.7 that this solution depends continuously on the initial condition $x \in \partial B(o; r)$.
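To see the initial value problem (6.3.27) at work, one may integrate it numerically in a finite-dimensional model. The sketch below is a hedged illustration, not the construction of the text: it assumes $H = \mathbb{R}^3$, the functional $F(x) = \frac12 (Ax, x) + \frac14 \sum_i x_i^4$ with a diagonal matrix $A$ playing the role of $F''(o)$, and a classical Runge-Kutta integrator; the names `flow`, `dF` and all numerical parameters are illustrative.

```python
import numpy as np

A = np.diag([3.0, 2.0, 1.0])              # illustrative F''(o); eigenvalues 3, 2, 1

def F(x):
    return 0.5 * x @ A @ x + 0.25 * np.sum(x**4)

def dF(x):
    """Gradient of F, identified with an element of H = R^3."""
    return A @ x + x**3

def D(z):
    """Tangential projection (6.3.22) of F'(z) at the point z."""
    g = dF(z)
    return g - (g @ z) / (z @ z) * z

def flow(x, t_end=40.0, dt=0.01):
    """Integrate k' = D(k), k(0) = x, by the classical Runge-Kutta scheme."""
    k = x.copy()
    values = [F(k)]
    for _ in range(int(round(t_end / dt))):
        s1 = D(k)
        s2 = D(k + 0.5 * dt * s1)
        s3 = D(k + 0.5 * dt * s2)
        s4 = D(k + dt * s3)
        k = k + dt / 6.0 * (s1 + 2 * s2 + 2 * s3 + s4)
        values.append(F(k))
    return k, np.array(values)

r = 0.1
x = r * np.array([1.0, 1.0, 1.0]) / np.sqrt(3.0)
k_end, values = flow(x)

norm_drift = abs(np.linalg.norm(k_end) - r)     # the curve should stay on the sphere
residual = np.linalg.norm(D(k_end))             # tangential part of F' at the end point
multiplier = (dF(k_end) @ k_end) / r**2         # the multiplier of (6.3.24)
```

In this model the computed multiplier lies close to the largest eigenvalue of the assumed matrix $A$, and it moves closer as the radius `r` is decreased.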
Let $k$ be a solution of the initial value problem (6.3.27). Then it has the following important properties:

(i) For any $t \in (0, \infty)$ we have $\|k(t, x)\| = \|x\|$.

(ii) For any $t \in (0, \infty)$ we have
\[ \frac{d}{dt} F(k(t, x)) = (F'(k(t, x)), D(k(t, x))) = \|D(k(t, x))\|^2 \ge 0. \]
In other words, the values of the functional $F$ increase along $k$ regardless of the choice of $x \in \partial B(o; r)$.

(iii) For any $t \in (0, \infty)$ we have
\[ F(k(t, x)) = F(x) + \int_0^t \|D(k(\tau, x))\|^2\, d\tau. \]
Since $F$ is bounded on $\partial B(o; r)$ (by the Mean Value Theorem and (6.3.19)), there exists a sequence $\{t_i\}_{i=1}^{\infty} \subset (0, \infty)$ such that
\[ \lim_{i \to \infty} D(k(t_i, x)) = o. \]

(iv) Since $\{k(t_i, x)\}_{i=1}^{\infty}$ is bounded, we can select a weakly convergent subsequence.$^{40}$

Summarizing, we have

Lemma 6.3.29. For any $x \in \partial B(o; r)$ there exist a sequence $\{t_i\}_{i=1}^{\infty} \subset (0, \infty)$ and $x_0 \in H$ such that
\[ k(t_i, x) \rightharpoonup x_0, \tag{6.3.28} \]
\[ D(k(t_i, x)) \to o, \tag{6.3.29} \]
\[ \{F(k(t_i, x))\}_{i=1}^{\infty} \text{ is an increasing sequence.} \tag{6.3.30} \]

It follows from (6.3.28) and (6.3.17) that $F'(k(t_i, x)) \to y$. If we prove that $y \ne o$, then the assumptions of Lemma 6.3.28 are verified with $y_n = k(t_n, x)$, and so the existence of a solution $x_0$ of (6.3.20) with $\lambda$ described by (6.3.24) will be proved. By an appropriate choice of the initial condition $x \in \partial B(o; r)$, we show that the above convergence takes place and that $\lambda$ given by (6.3.24) is sufficiently close to $\lambda_0$.

Recall that $A = F''(o)$ is a compact linear self-adjoint operator in the Hilbert space $H$ (see Proposition 5.2.21). Its spectrum consists of a countable set of real eigenvalues with one possible limit point $\lambda = 0$. We split the set of all eigenvalues to the parts $\lambda \ge \lambda_0$ and $\lambda < \lambda_0$, respectively. We denote by $H_1$ and $H_2$, respectively, the corresponding closed linear subspaces generated by the eigenvectors (see Theorem 2.2.16). Note that $\lambda_0 > 0$ implies that $\dim H_1 < \infty$. The eigenspace associated with $\lambda_0$ will be denoted by $H_0$. Let $P_1$, $P_2$ be the orthogonal projections of $H$ onto $H_1$, $H_2$, respectively (see Figure 6.3.2).

$^{40}$ The reader is invited to justify (i)–(iv).
[Figure 6.3.2: the orthogonal decomposition $H = H_1 \oplus H_2$ with the projections $P_1$, $P_2$, where $H_0 \subset H_1$.]
Let us denote $S_1 = \{x \in H_1 : \|x\| = r\}$.

Lemma 6.3.30. There exists $r_0 > 0$ such that $\partial B(o; r_0) \subset U(o)$ (see (6.3.16)), and for all $0 < r < r_0$ we have

(i) there is no $t \in [0, \infty)$ for which the set $k(t, S_1)$ is contractible to a point (see Definition 6.3.20) in $R = \{x \in H : P_1 x \ne o\}$,

(ii) for any $t \in [0, \infty)$ there exists $x_t \in S_1$ such that
\[ P_1 k(t, x_t) \in H_0, \qquad \text{i.e.,} \qquad k(t, x_t) \in H_0 \oplus H_2. \]

Proof. Lemma 6.3.23 and (i) imply (ii) (see Exercise 6.3.33). Hence we prove only (i). According to Lemma 6.3.21 it is sufficient to prove that for any $t$ the set $S_1$ is contractible into $k(t, S_1)$ in $R$. Indeed, according to Lemma 6.3.22 the set $S_1$ is not contractible to a point in $R$. Since $k$ is a continuous function of both variables, it is sufficient to prove that it assumes only values from $R$: we want to prove that
\[ P_1 k(t, x) \ne o \qquad \forall t \in [0, \infty),\ x \in S_1. \]
We have
\[ F(k(0, x)) = F(x) \ge \frac12 (F''(o) x, x) - \varepsilon(\|x\|) \|x\|^2 \ge \left(\frac12 \lambda_0 - \varepsilon(\|x\|)\right) \|x\|^2 \tag{6.3.31} \]
where $\varepsilon(r) \to 0$ as $r \to 0$ (see (6.3.19) and Proposition 3.2.27). Note that the last inequality holds due to $x \in H_1$. Since $F(k(t, x))$ is increasing in $t$, we conclude herefrom that
\[ F(k(t, x)) \ge \left(\frac12 \lambda_0 - \varepsilon(r)\right) r^2. \tag{6.3.32} \]
On the other hand, we have an estimate from above (we write $k$ instead of $k(t, x)$ for the sake of brevity):
\[ F(k) = \frac12 (F''(o) k, k) + F(k) - \frac12 (F''(o) k, k) \le \frac12 (F''(o) P_1 k, P_1 k) + \frac12 (F''(o) P_2 k, P_2 k) + \varepsilon(\|k\|) \|k\|^2 \]
(note that $(F''(o) P_1 k, P_2 k) = 0$ due to $H_1 \perp H_2$).
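Denote
\[ \mu = \max \{\lambda : \lambda \in \sigma(F''(o))\}, \qquad \nu = \sup \{\lambda \in \sigma(F''(o)) : \lambda < \lambda_0\}. \]
Then
\[ F(k) \le \frac{\mu}{2} \|P_1 k\|^2 + \frac{\nu}{2} \|P_2 k\|^2 + \varepsilon(\|k\|) \|k\|^2 = \frac{\nu}{2} \|k\|^2 + \frac{\mu - \nu}{2} \|P_1 k\|^2 + \varepsilon(\|k\|) \|k\|^2.^{41} \]
Hence, due to the fact that $\|k\| = r$, we have
\[ F(k) \le \frac{\nu}{2} r^2 + \frac{\mu - \nu}{2} \|P_1 k\|^2 + \varepsilon(r) r^2. \tag{6.3.33} \]
It follows from (6.3.32) and (6.3.33) that
\[ \|P_1 k(t, x)\|^2 \ge \frac{\lambda_0 - \nu}{\mu - \nu} r^2 - \frac{4}{\mu - \nu} \varepsilon(r) r^2. \]
This implies the existence of $r_0$ such that
\[ \|P_1 k(t, x)\|^2 \ge a r^2 \qquad \text{for any } r \le r_0 \qquad \text{where} \quad a = a(r_0) > 0. \tag{6.3.34} \]
This completes the proof of Lemma 6.3.30.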
Proof of Theorem 6.3.26.

Step 1. Let $t_n \to \infty$ be an arbitrary sequence of positive numbers. Let $x_n$ be a point from $S_1$ for which $P_1 k(t_n, x_n) \in H_0$ (its existence follows from (ii) of Lemma 6.3.30). Since $S_1$ is compact, we can select a strongly convergent subsequence (denoted again by $\{x_n\}_{n=1}^{\infty}$) such that
\[ \lim_{n \to \infty} x_n = \tilde x. \tag{6.3.35} \]

Step 2. It follows from Lemma 6.3.29 that there is a sequence $\{\tau_i\}_{i=1}^{\infty}$ such that
\[ k(\tau_i, \tilde x) = y_i \rightharpoonup x_0 \qquad \text{in } H, \]
and at the same time also $D(y_i) \to o$.

Step 3. The compactness of $F'$ implies that (passing again to a subsequence if necessary) there exists $y \in H$ such that
\[ \lim_{i \to \infty} F'(y_i) = y. \]
We show that $y \ne o$. Indeed, we have
\[ (F'(y_i), P_1 y_i) \to (y, P_1 x_0). \]
Also, for all $i \in \mathbb{N}$, we have the estimate
\[ (F'(y_i), P_1 y_i) = (F''(o) y_i, P_1 y_i) + (F'(y_i) - F''(o) y_i, P_1 y_i) \ge \lambda_0 \|P_1 y_i\|^2 - \varepsilon(\|y_i\|) \|y_i\|^2 \ge \frac12 \lambda_0 a r^2 \]
for all $r$ small enough due to (6.3.34). This immediately implies
\[ (y, P_1 x_0) \ne 0, \qquad \text{and so} \qquad y \ne o,\ x_0 \ne o. \]

$^{41}$ We use the identity $\|P_1 k\|^2 + \|P_2 k\|^2 = \|k\|^2$.
Step 4. We have just verified the assumptions of Lemma 6.3.28. Hence $y_i \to x_0$ in $H$, and $x_0$ solves (6.3.20) with $\lambda$ given by (6.3.24):
\[ \lambda x_0 - F'(x_0) = o, \qquad \lambda = \frac{1}{r^2} (F'(x_0), x_0). \]

Step 5. The last step consists in proving the fact that for $r > 0$ small enough $\lambda$ is arbitrarily close to $\lambda_0$. Let us estimate
\[ |\lambda - \lambda_0| = \frac{1}{r^2} |(F'(x_0), x_0) - \lambda_0 (x_0, x_0)| \]
\[ \le \frac{1}{r^2} \left[ |(F'(x_0) - F''(o) x_0, x_0)| + 2 \left| \frac12 (F''(o) x_0, x_0) - F(x_0) \right| + |2F(x_0) - \lambda_0 (x_0, x_0)| \right] \]
\[ = \frac{1}{r^2} |2F(x_0) - \lambda_0 (x_0, x_0)| + \varepsilon(r). \]
Since $\varepsilon(r) \to 0$ as $r \to 0$, it suffices to estimate
\[ \frac{1}{r^2} |2F(x_0) - \lambda_0 (x_0, x_0)|. \]
The continuity of $F$ implies
\[ F(x_0) = \lim_{i \to \infty} F(y_i). \tag{6.3.36} \]
Since $F$ is increasing along $k$, we also have
\[ F(y_i) = F(k(\tau_i, \tilde x)) \ge F(\tilde x). \tag{6.3.37} \]
Since $\tilde x \in S_1$,
\[ \frac12 (F''(o) \tilde x, \tilde x) \ge \frac{\lambda_0}{2} r^2. \tag{6.3.38} \]
Then (6.3.36)–(6.3.38) imply an estimate from below:
\[ F(x_0) \ge \left(\frac{\lambda_0}{2} - \varepsilon(r)\right) r^2. \tag{6.3.39} \]
Now we derive an estimate from above for $F(x_0)$. Since $x_n \to \tilde x$ and $k(\tau_i, \cdot)$ is continuous with respect to the second variable, for fixed $i \in \mathbb{N}$ we have
\[ k(\tau_i, x_n) \to k(\tau_i, \tilde x) = y_i. \]
The continuity of $F$ implies that for fixed $i \in \mathbb{N}$ and $r > 0$ there exists $n_0 \in \mathbb{N}$ such that for all $n \ge n_0$ we have
\[ F(y_i) \le F(k(\tau_i, x_n)) + r^3. \tag{6.3.40} \]
However, for any fixed $i \in \mathbb{N}$ we find $n_i \ge n_0$ such that $t_{n_i} > \tau_i$, and the monotonicity of $F$ along $k$ then implies
\[ F(k(\tau_i, x_{n_i})) \le F(k(t_{n_i}, x_{n_i})). \tag{6.3.41} \]
The choice of $x_n$ from Step 1 guarantees that $k(t_{n_i}, x_{n_i}) \in H_0 \oplus H_2$, and so (writing $k_i$ instead of $k(t_{n_i}, x_{n_i})$) we have the estimate
\[ F(k_i) \le \frac12 (F''(o) k_i, k_i) + \varepsilon(\|k_i\|) \|k_i\|^2 \le \left(\frac{\lambda_0}{2} + \varepsilon(r)\right) r^2. \tag{6.3.42} \]
However, (6.3.36), (6.3.40) and (6.3.41) reduce (6.3.42) to
\[ F(x_0) \le \left(\frac{\lambda_0}{2} + \varepsilon(r)\right) r^2. \tag{6.3.43} \]
Both the estimates (6.3.39) and (6.3.43) yield that
\[ \frac{1}{r^2} |2F(x_0) - \lambda_0 (x_0, x_0)| \to 0 \qquad \text{as} \quad r \to 0. \]
This completes the proof of Theorem 6.3.26.
Remark 6.3.31. It follows from the Krasnoselski Potential Bifurcation Theorem that every point $(\lambda_0, o)$ where $\lambda_0$ is a nonzero eigenvalue of the operator $A$ is a bifurcation point. But there is no guarantee that there is a curve (or continuum) of nontrivial solutions which departs from $(\lambda_0, o)$. In fact, there are counterexamples even in finite dimensions which prove that such a curve need not exist. Böhme [13] gave an example of a real function of two independent real variables, $F \in C^\infty(\mathbb{R}^2)$, for which $(\lambda_0, (0, 0))$ is a bifurcation point of
\[ f(z, \lambda) = \lambda z - F'(z) = o, \qquad z = (x, y) \in \mathbb{R} \times \mathbb{R}, \quad \lambda \in \mathbb{R} \tag{6.3.44} \]
and there is no continuous curve of nontrivial solutions of (6.3.44) which contains the point $(\lambda_0, (0, 0))$.

Example 6.3.32 (Application of the Krasnoselski Potential Bifurcation Theorem). We will consider a periodic problem similar to that studied in Example 4.3.25:
\[ \ddot x(t) + \lambda x(t) + g(\lambda, t, x(t)) = 0, \quad t \in (0, 2\pi), \qquad x(0) = x(2\pi), \quad \dot x(0) = \dot x(2\pi). \tag{6.3.45} \]
The difference between (6.3.45) and (4.3.12) consists in the fact that now we do not allow $g$ to depend on $\dot x$. The reason for this restriction is that the boundary value problem (4.3.12) cannot be written in the form (6.3.51) if $g$ depends on $\dot x$. We simplify the situation even more and write $g$ in the form
\[ g(\lambda, t, s) = (\lambda + 1) \tilde g(t, s). \]
Set
\[ \tilde G(t, s) = \int_0^s \tilde g(t, \tau)\, d\tau = \int_0^1 \tilde g(t, s\sigma) s\, d\sigma, \]
i.e., $\tilde G$ is the primitive of $\tilde g$ with respect to the second variable $s$. Put
\[ F(x) = \int_0^{2\pi} \left[\frac12 |x(t)|^2 + \tilde G(t, x(t))\right] dt. \tag{6.3.46} \]
We work in the Hilbert space
\[ H := \{x \in W^{1,2}(0, 2\pi) : x(0) = x(2\pi)\} \tag{6.3.47} \]
with the scalar product on $H$ given by
\[ (x, y) = \int_0^{2\pi} [\dot x(t) \dot y(t) + x(t) y(t)]\, dt, \qquad x, y \in H. \]
Then
\[ (F'(x), y) = \int_0^{2\pi} [x(t) y(t) + \tilde g(t, x(t)) y(t)]\, dt \qquad \text{for any } x, y \in H. \tag{6.3.48} \]
A weak solution of the periodic problem is a function $x \in H$ which satisfies the integral identity
\[ \int_0^{2\pi} [\dot x(t) \dot y(t) - \lambda x(t) y(t) - (\lambda + 1) \tilde g(t, x(t)) y(t)]\, dt = 0 \tag{6.3.49} \]
for any $y \in H$. The last equality (6.3.49) can be written as
\[ \int_0^{2\pi} [\dot x(t) \dot y(t) + x(t) y(t) - (\lambda + 1) x(t) y(t) - (\lambda + 1) \tilde g(t, x(t)) y(t)]\, dt = 0. \tag{6.3.50} \]
The integral identity (6.3.50) can be written for $\lambda \ne -1$ as the operator equation
\[ \mu x - F'(x) = o \qquad \text{where} \quad \mu = \frac{1}{\lambda + 1}. \tag{6.3.51} \]
Let us define an operator $B : H \to H$ by
\[ (B(x), y)_H = \int_0^{2\pi} x(t) y(t)\, dt. \]
It follows easily that $B$ is a bounded linear operator and the compact embedding $H \subset\subset Y := \{x \in C[0, 2\pi] : x(0) = x(2\pi)\}$ (see Theorem 1.2.28) yields that $B$ is compact. Since $n^2$ is an eigenvalue of
\[ \ddot x(t) + \lambda x(t) = 0, \quad t \in (0, 2\pi), \qquad x(0) = x(2\pi), \quad \dot x(0) = \dot x(2\pi), \]
$\mu = \frac{1}{n^2 + 1}$ is an eigenvalue of $B$. We make the following assumptions:
\[ \tilde g,\ \frac{\partial \tilde g}{\partial s} : \mathbb{R} \times \mathbb{R} \to \mathbb{R} \text{ are continuous functions,} \tag{6.3.52} \]
\[ \tilde g(t, 0) = 0, \qquad \frac{\partial \tilde g}{\partial s}(t, 0) = 0 \qquad \text{for all } t \in \mathbb{R}. \tag{6.3.53} \]
Now we prove that $F$ verifies the assumptions of Theorem 6.3.26. Note that $F$ can be written as
\[ F(x) = \frac12 \int_0^{2\pi} |x(t)|^2\, dt + \int_0^{2\pi} \int_0^1 \tilde g(t, s x(t)) x(t)\, ds\, dt. \tag{6.3.54} \]

(i) $F(o) = 0$ is an immediate consequence of (6.3.53).
(ii) Differentiability of $F$ follows directly from (6.3.54).
(iii) Compactness of $F'(x)$. This is a consequence of the compactness of the embedding $H \subset\subset Y$ (cf. Exercise 6.3.35).
(iv) $F'(o) = o$ is a consequence of (6.3.53).
(v) $F''(o) = B$ and $F''$ is continuous at $o$ (cf. Exercise 6.3.35).

Theorem 6.3.26 now implies that every point $\left(\frac{1}{n^2 + 1}, o\right)$ is a bifurcation point of the equation $\mu x - F'(x) = o$. In other words, for any $n = 0, 1, \dots$ we have the following assertion:

Under the assumptions (6.3.52) and (6.3.53), for an arbitrarily small neighborhood $U$ of the point $(n^2, o) \in \mathbb{R} \times H$ there exists $(\lambda, x) \in U$ such that $x \ne o$ is a weak solution of the periodic problem
\[ \ddot x(t) + \lambda x(t) + (\lambda + 1) \tilde g(t, x(t)) = 0, \quad t \in (0, 2\pi), \qquad x(0) = x(2\pi), \quad \dot x(0) = \dot x(2\pi). \]
Note that the continuity of $\tilde g$ and the regularity argument imply that every such nontrivial solution satisfies $x \in C^2[0, 2\pi]$ and $\dot x(0) = \dot x(2\pi)$.

Exercise 6.3.33. Prove that Lemma 6.3.23 and Lemma 6.3.30(i) imply the statement of Lemma 6.3.30(ii).
Hint. Argue by contradiction.

Exercise 6.3.34. Prove that $H$ defined in Example 6.3.32 by (6.3.47) is a closed subspace of $W^{1,2}(0, 2\pi)$, i.e., $H$ is a Hilbert space.

Exercise 6.3.35. Prove that $F$ from Example 6.3.32 is twice Fréchet differentiable, $F'$ is compact and $F''$ is continuous at $o$.

Exercise 6.3.36. Apply Theorem 6.3.26 to the Dirichlet and the Neumann boundary value problem.
6.4 Mountain Pass Theorem

One of the most efficient tools to prove that a given functional having a local extremum at a point possesses another critical point is the Mountain Pass Theorem. In order to motivate the main ideas of this section we will consider a real function of two real independent variables
\[ F : \mathbb{R} \times \mathbb{R} \to \mathbb{R} \]
which is continuously differentiable and satisfies the following condition: There exist $r > 0$, $e \in \mathbb{R}^2$, $\|e\| > r$ such that
\[ \inf_{\|x\| = r} F(x) > F(o) \ge F(e). \tag{6.4.1} \]
The graph of such a function is sketched in Figure 6.4.1. The Extreme Value Theorem and the first inequality in (6.4.1) immediately imply that $F$ has a local minimum and thus a critical point in the set $\{x \in \mathbb{R}^2 : \|x\| < r\}$.

[Figure 6.4.1: graph of a function $F$ satisfying (6.4.1), with the points $(o, F(o))$ and $(e, F(e))$ marked.]
Hiker’s experience suggests the idea that $F$ should have another critical point different from that local minimum. Indeed, if the values of $F$ are interpreted as mountains on the plastic map, then the valley (containing the origin) is surrounded by mountains. At the same time the altitude of every place the distance of which from the origin is equal to $r$ is greater than that of the origin itself. So, there should be an “optimal pass” through the mountain range. Practical experience even suggests how to find such a critical point. Let us consider all continuous finite paths which lie on the graph of $F$ and which connect the points $(o, F(o))$ and $(e, F(e))$. On every curve we have at least one “highest” point. It seems that if we select the “highest” point with the “lowest” altitude, we have found a critical point of $F$. If we formulate precisely the considerations made above, then the “lowest” altitude of the “highest” points corresponds to the value
\[ c := \inf_{\gamma \in \Gamma} \max_{t \in [0, 1]} F(\gamma(t)) \tag{6.4.2} \]
where
\[ \Gamma = \{\gamma \in C([0, 1], \mathbb{R}^2) : \gamma(0) = o,\ \gamma(1) = e\}. \]
If $c$ is a critical value of $F$, then there exists $x_c \in \mathbb{R}^2$ such that
\[ F(x_c) = c \qquad \text{and} \qquad \nabla F(x_c) = o. \]
However, the value $c$ defined above need not be a critical value of $F$! An example which illustrates this phenomenon is rather elementary.
Example 6.4.1 (Brézis–Nirenberg). Let
\[ F(x, y) = x^2 + (1 - x)^3 y^2 \qquad \text{and} \qquad r = \frac12, \quad e = (2, 2). \]
Then
\[ F(o) = F(e) = 0, \qquad \inf_{\|(x, y)\| = r} F(x, y) = \min_{\|(x, y)\| = r} F(x, y) > 0, \]
and so the value $c$ defined by (6.4.2) is positive. Since
\[ \frac{\partial F}{\partial x}(x, y) = 2x - 3(1 - x)^2 y^2, \qquad \frac{\partial F}{\partial y}(x, y) = 2(1 - x)^3 y, \]
the origin is the only critical point of $F$ and obviously $F(o) < c$. The reader is invited to sketch the level sets of the function $F$.

It is natural to ask why this happens. Such a situation corresponds, roughly speaking, to the fact that the altitude of the “highest” points approaches the value of $c$ but the distance of these points from the origin diverges to infinity. More precisely, if $x_n \in \mathbb{R}^2$ are such that
\[ F(x_n) = \max_{t \in [0, 1]} F(\gamma_n(t)) \quad \text{for } \gamma_n \in \Gamma \qquad \text{and} \qquad F(x_n) \to c, \tag{6.4.3} \]
then
\[ \|x_n\| \to \infty. \tag{6.4.4} \]
It follows that the existence of $r > 0$ and $e \in \mathbb{R}^2$ satisfying (6.4.1) is not sufficient to guarantee the existence of a critical point which is different from the local minimum in $\{x \in \mathbb{R}^2 : \|x\| \le r\}$. On the other hand, we will prove later that (6.4.1) guarantees the existence of a sequence $\{x_n\}_{n=1}^{\infty} \subset \mathbb{R}^2$ such that
\[ F(x_n) \to c \qquad \text{and} \qquad \nabla F(x_n) \to o. \tag{6.4.5} \]
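The claim of Example 6.4.1 that the origin is the only critical point of $F(x, y) = x^2 + (1 - x)^3 y^2$ follows from a short case distinction, written out here as a worked computation based only on the partial derivatives stated in the example:

```latex
\begin{aligned}
\frac{\partial F}{\partial y}(x, y) = 2(1 - x)^3 y = 0
  &\iff y = 0 \ \text{ or } \ x = 1, \\
y = 0: \quad \frac{\partial F}{\partial x}(x, 0) = 2x = 0
  &\iff x = 0, \\
x = 1: \quad \frac{\partial F}{\partial x}(1, y) = 2 - 3 \cdot 0 \cdot y^2 = 2
  &\ne 0 .
\end{aligned}
```

Hence $\nabla F(x, y) = o$ only at $(0, 0)$. The value at the second endpoint is checked in the same way: $F(2, 2) = 4 + (-1)^3 \cdot 4 = 0$.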
Now, let us assume for a moment that $F$ satisfies the following condition:

$(\widetilde{\mathrm{PS}})$ Let $\{x_n\}_{n=1}^{\infty} \subset \mathbb{R}^2$ be such that $\{F(x_n)\}_{n=1}^{\infty}$ is a bounded sequence in $\mathbb{R}$ and $\nabla F(x_n) \to o$. Then $\{x_n\}_{n=1}^{\infty}$ is a bounded sequence in $\mathbb{R}^2$.

Then the situation described in (6.4.3) and (6.4.4) cannot occur. Moreover, (6.4.5) together with $(\widetilde{\mathrm{PS}})$ already implies that $c$ is a critical value of $F$. Indeed, let $\{x_n\}_{n=1}^{\infty}$ satisfy (6.4.5). According to $(\widetilde{\mathrm{PS}})$ there exists a subsequence $\{x_{n_k}\}_{k=1}^{\infty} \subset \{x_n\}_{n=1}^{\infty}$ such that $x_{n_k} \to x_c$. The continuous differentiability of $F$ and (6.4.5) imply that $\nabla F(x_c) = o$.

Let us consider a more general situation, namely
\[ F : H \to \mathbb{R} \]
where $H$ is a real Hilbert space with a scalar product $(\cdot, \cdot)$ and the induced norm $\|\cdot\|$. In order to simplify the proofs we will also require more smoothness of $F$, i.e., let $F \in C^2(H, \mathbb{R})$. For the sake of brevity we will denote
\[ F^d := F^{-1}((-\infty, d]). \]
The key assertion of this section is the following Quantitative Deformation Lemma. The reader should have in mind that it can be proved under more general assumptions ($H$ a Banach space, $F \in C^1(H, \mathbb{R})$) – see also footnotes 44, 45 and 46 on pages 430–432.

Lemma 6.4.2 (Quantitative Deformation Lemma). Let $H$ be a real Hilbert space and $F$ a $C^2$-functional, $c \in \mathbb{R}$, $\varepsilon > 0$. Assume that
\[ \|\nabla F(u)\| \ge 2\varepsilon \qquad \text{for any } u \in F^{-1}([c - 2\varepsilon, c + 2\varepsilon]). \]
Then there exists $\eta \in C(H, H)$ such that

(i) $\eta(u) = u$ for any $u \notin F^{-1}([c - 2\varepsilon, c + 2\varepsilon])$,
(ii) $\eta(F^{c + \varepsilon}) \subset F^{c - \varepsilon}$.

Proof. Let us introduce closed sets
\[ A = F^{-1}([c - 2\varepsilon, c + 2\varepsilon]), \qquad B = F^{-1}([c - \varepsilon, c + \varepsilon]) \]
(see Figure 6.4.2) and a functional
\[ \psi(u) = \frac{\operatorname{dist}(u, H \setminus A)}{\operatorname{dist}(u, H \setminus A) + \operatorname{dist}(u, B)}.^{42} \]

[Figure 6.4.2: the sets $B \subset A \subset H$.]

$^{42}$ Recall that $\operatorname{dist}(u, C) := \inf\{\|u - v\|_X : v \in C\}$ for $u \in X$, $C \subset X$.
Then
\[ \psi = \begin{cases} 1 & \text{on } B, \\ 0 & \text{on } H \setminus A, \end{cases} \qquad 0 \le \psi \le 1, \]
and $\psi$ is locally Lipschitz continuous.$^{43}$ Let us define a vector field
\[ f(u) = \begin{cases} -\psi(u) \dfrac{\nabla F(u)}{\|\nabla F(u)\|^2} & \text{for } u \in A, \\ o & \text{for } u \in H \setminus A. \end{cases} \]
Then $f$ is also locally Lipschitz continuous$^{44}$ and for any $u \in H$ we have
\[ \|f(u)\| \le \frac{1}{2\varepsilon}. \]
Indeed, for $u \in H \setminus A$ we have $f(u) = o$ and for $u \in A$ we have
\[ \|f(u)\| \le |\psi(u)| \frac{\|\nabla F(u)\|}{\|\nabla F(u)\|^2} \le \frac{1}{\|\nabla F(u)\|} \le \frac{1}{2\varepsilon}. \]
Consider the Cauchy problem
\[ \dot\sigma = f(\sigma), \qquad \sigma(0) = u. \tag{6.4.6} \]
It follows from Corollary 3.1.6 that (6.4.6) has a unique solution, denoted by $\sigma(\cdot, u)$, which is defined on $\mathbb{R}$ for any $u \in H$, and for any $t > 0$, $\sigma(t, \cdot) : H \to H$ is continuous (continuous dependence on the initial data – see Remark 3.1.7). Let us define
\[ \eta(u) = \sigma(2\varepsilon, u), \qquad u \in H. \]
We will prove that $\eta$ satisfies (i) and (ii). Property (i) follows from the fact that $f(u) = o$ for $u \in H \setminus A$. Let us prove that (ii) is also satisfied. Since
\[ \frac{d}{dt} F(\sigma(t, u)) = \left(\nabla F(\sigma(t, u)), \frac{d}{dt} \sigma(t, u)\right) = (\nabla F(\sigma(t, u)), f(\sigma(t, u))) = -\psi(\sigma(t, u)), \tag{6.4.7} \]
the function $t \mapsto F(\sigma(t, u))$ is decreasing. Let $u \in F^{c + \varepsilon}$, i.e., $F(u) \le c + \varepsilon$. We have to show that $F(\sigma(2\varepsilon, u)) \le c - \varepsilon$. If there is $t \in [0, 2\varepsilon]$ such that $F(\sigma(t, u)) \le c - \varepsilon$, then also $F(\sigma(2\varepsilon, u)) \le c - \varepsilon$ and (ii) is satisfied. If, on the other hand,
\[ c - \varepsilon < F(\sigma(t, u)) \le c + \varepsilon \quad \text{for all } t \in [0, 2\varepsilon], \qquad \text{i.e.,} \quad \psi(\sigma(t, u)) = 1, \]
then we obtain from (6.4.7)
\[ F(\sigma(2\varepsilon, u)) = F(u) + \int_0^{2\varepsilon} \frac{d}{dt} F(\sigma(t, u))\, dt = F(u) - \int_0^{2\varepsilon} \psi(\sigma(t, u))\, dt \le c + \varepsilon - 2\varepsilon = c - \varepsilon, \]
a contradiction, and so (ii) is also satisfied.

$^{43}$ Cf. Exercise 6.4.8.
$^{44}$ Here the assumption $F \in C^2(H, \mathbb{R})$ is essentially used (cf. Exercise 6.4.9).
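The construction in the proof of the Quantitative Deformation Lemma can be followed step by step in a toy case. The sketch below is a hedged illustration, not part of the text: it assumes $H = \mathbb{R}^2$, the linear functional $F(u) = u_1$ (so that $\|\nabla F\| = 1 \ge 2\varepsilon$ for $\varepsilon = 0.25$), explicit distance formulas valid only for this choice, and an explicit Euler scheme for the Cauchy problem (6.4.6).

```python
import numpy as np

c, eps = 0.0, 0.25                        # level and epsilon; |grad F| = 1 >= 2 * eps

def F(u):
    return u[0]                           # illustrative functional on R^2

grad_F = np.array([1.0, 0.0])             # constant gradient of this F

def psi(u):
    """Cut-off psi for the strips A = {|F - c| <= 2 eps} and B = {|F - c| <= eps}."""
    d = abs(F(u) - c)
    dist_to_complement_of_A = max(0.0, 2 * eps - d)
    dist_to_B = max(0.0, d - eps)
    return dist_to_complement_of_A / (dist_to_complement_of_A + dist_to_B)

def eta(u, steps=2000):
    """Approximate sigma(2 * eps, u) for sigma' = f(sigma) by explicit Euler steps."""
    sigma = np.array(u, dtype=float)
    dt = 2 * eps / steps
    for _ in range(steps):
        sigma = sigma - dt * psi(sigma) * grad_F / (grad_F @ grad_F)
    return sigma

inside = eta([c + eps, 1.0])              # starts on the level c + eps
outside = eta([c + 3 * eps, 1.0])         # starts outside A and should not move
```

Up to discretization error, the point starting on the level $c + \varepsilon$ is carried down to the level $c - \varepsilon$, while the point outside $A$ stays fixed, in agreement with properties (ii) and (i).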
The Quantitative Deformation Lemma provides a tool for proving the existence of “almost critical” points of functionals which have the so-called mountain pass type geometry (see (6.4.8) below).

Proposition 6.4.3. Let $F \in C^2(H, \mathbb{R})$, $e \in H$ and $r > 0$ be such that $\|e\| > r$ and
\[ b := \inf_{\|u\| = r} F(u) > F(o) \ge F(e). \tag{6.4.8} \]
Let
\[ c := \inf_{\gamma \in \Gamma} \max_{t \in [0, 1]} F(\gamma(t)) \qquad \text{and} \qquad \Gamma := \{\gamma \in C([0, 1], H) : \gamma(0) = o,\ \gamma(1) = e\}. \]
Then for each $\varepsilon > 0$ there exists $u \in H$ such that

(i) $c - 2\varepsilon \le F(u) \le c + 2\varepsilon$,
(ii) $\|\nabla F(u)\| < 2\varepsilon$.

Proof. Let $\gamma \in \Gamma$ be arbitrary. Then (6.4.8) implies
\[ b \le \max_{t \in [0, 1]} F(\gamma(t)), \qquad \text{and so} \qquad b \le c \le \max_{t \in [0, 1]} F(te). \]
Without loss of generality, we can restrict ourselves to $\varepsilon$ small, satisfying
\[ c - 2\varepsilon > F(o) \ge F(e). \tag{6.4.9} \]
Suppose that the conclusion of the proposition is not satisfied for an $\varepsilon > 0$, i.e., for each $u \in H$ satisfying (i) condition (ii) is violated. We apply Lemma 6.4.2 to get a contradiction. By the definition of $c$, there exists $\gamma \in \Gamma$ such that
\[ \max_{t \in [0, 1]} F(\gamma(t)) < c + \varepsilon. \tag{6.4.10} \]
Consider $\beta(t) = \eta(\gamma(t))$ where $\eta$ is from Lemma 6.4.2. Using Lemma 6.4.2(i), $\gamma(0) = o$, $\gamma(1) = e$, and (6.4.9) we conclude
\[ \beta(0) = \eta(o) = o \qquad \text{and} \qquad \beta(1) = \eta(e) = e. \]
Hence $\beta \in \Gamma$, i.e., $c \le \max_{t \in [0, 1]} F(\beta(t))$. It follows from Lemma 6.4.2(ii) and (6.4.10) that
\[ \max_{t \in [0, 1]} F(\beta(t)) \le c - \varepsilon, \]
a contradiction.
In order to prove that $c$ is a critical value, our functional $F$ has to satisfy a “compactness” condition of type $(\widetilde{\mathrm{PS}})$. However, in infinite dimensions we have to strengthen its formulation.

Definition 6.4.4. Let $F \in C^2(H, \mathbb{R})$ and $c \in \mathbb{R}$. The functional $F$ satisfies the Palais–Smale condition on the level $c$ (shortly $(\mathrm{PS})_c$) if any sequence $\{u_n\}_{n=1}^{\infty} \subset H$ such that
\[ F(u_n) \to c, \qquad \nabla F(u_n) \to o \tag{6.4.11} \]
has a convergent (in the norm of $H$) subsequence.$^{45}$

Now we are ready to formulate the Mountain Pass Theorem which is the simplest and one of the most useful “variational” theorems. It is one of the most efficient tools to prove the existence of at least two critical points of a given functional (see, e.g., Example 6.4.7).

Theorem 6.4.5 (Mountain Pass Theorem). Let the assumptions of Proposition 6.4.3 be satisfied. Let $F$ satisfy $(\mathrm{PS})_c$. Then $c$ is a critical value of $F$.$^{46}$

Proof. It follows from Proposition 6.4.3 that there is a sequence $\{u_n\}_{n=1}^{\infty} \subset H$ satisfying (6.4.11). By $(\mathrm{PS})_c$ there exist $\{u_{n_k}\}_{k=1}^{\infty} \subset \{u_n\}_{n=1}^{\infty}$ and $u_0 \in H$ such that $u_{n_k} \to u_0$. But $F \in C^1(H, \mathbb{R})$ implies that
\[ F(u_0) = c \qquad \text{and} \qquad \nabla F(u_0) = o. \]

Remark 6.4.6. Theorem 6.4.5 actually states that there exists a critical point $u_0 \ne o$ of $F$ since
\[ c \ge \inf_{\|u\| = r} F(u) > F(o). \]

Example 6.4.7. Let us consider the boundary value problem
\[ -\ddot x(t) + \lambda x(t) = |x(t)|^{p-2} x(t), \quad t \in (0, \pi), \qquad x(0) = x(\pi) = 0, \tag{6.4.12} \]
where $p > 2$ is a given real number and $\lambda \in \mathbb{R}$ is a parameter. Notice that the function identically equal to zero is a solution. We will prove that problem (6.4.12) has also a positive $C^2$-solution on $(0, \pi)$ if and only if
\[ \lambda > -1. \tag{6.4.13} \]
Let us prove that (6.4.13) is a necessary condition. Let $x \in C^2[0, \pi]$ be a positive solution of (6.4.12). Multiply the equation in (6.4.12) by $\sin t$ and integrate by parts:
\[ \int_0^\pi |x(t)|^{p-2} x(t) \sin t\, dt = \lambda \int_0^\pi x(t) \sin t\, dt - \int_0^\pi \ddot x(t) \sin t\, dt, \qquad \int_0^\pi \ddot x(t) \sin t\, dt = -\int_0^\pi x(t) \sin t\, dt. \]
Since $x$ and $\sin t$ are positive on $(0, \pi)$, this gives
\[ (\lambda + 1) \int_0^\pi x(t) \sin t\, dt = \int_0^\pi |x(t)|^{p-2} x(t) \sin t\, dt > 0. \]

$^{45}$ Definition 6.4.4 in a more general setting with $H$ replaced by a Banach space $X$ and $F \in C^1(X, \mathbb{R})$ is due to Brézis, Coron, Nirenberg. Cf. Corollary 6.4.19.
$^{46}$ The assertion of Theorem 6.4.5 where $H$ is replaced by a Banach space $X$ and $F \in C^1(X, \mathbb{R})$ is due to Ambrosetti, Rabinowitz, see Theorem 6.4.24.
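Hence (6.4.13) follows.

Next we show that (6.4.13) is also a sufficient condition. Let us define the following two functions from $\mathbb{R}$ into $\mathbb{R}$:
\[
g(s) := \begin{cases} 0 & \text{for } s \leq 0, \\ s^{p-1} & \text{for } s > 0, \end{cases}
\qquad
G(s) := \begin{cases} 0 & \text{for } s \leq 0, \\ \frac{1}{p} s^{p} & \text{for } s > 0. \end{cases}
\]
Then $G \in C^2(\mathbb{R})$ and $G'(s) = g(s)$ for all $s \in \mathbb{R}$ (remember that $p > 2$). Put $H := W_0^{1,2}(0, \pi)$ and define
\[
F(x) := \frac{1}{2} \int_0^\pi |\dot{x}(t)|^2 \, dt + \frac{\lambda}{2} \int_0^\pi |x(t)|^2 \, dt - \int_0^\pi G(|x(t)|) \, dt, \qquad x \in H.
\]
Then $F \in C^2(H, \mathbb{R})$ (see Exercise 6.4.10). We shall verify the assumptions of Theorem 6.4.5. Note that for $\lambda > -1$ the expression
\[
|||x||| := \left( \int_0^\pi |\dot{x}(t)|^2 \, dt + \lambda \int_0^\pi |x(t)|^2 \, dt \right)^{1/2}
\]
satisfies
\[ c_1 \|x\| \leq |||x||| \leq c_2 \|x\| \quad \text{for any } x \in H \tag{6.4.14} \]
where $c_i > 0$, $i = 1, 2$, are constants independent of $x$ and
\[ \|x\| = \left( \int_0^\pi |\dot{x}(t)|^2 \, dt \right)^{1/2} \]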
(cf. Exercise 6.3.18). Let us show that the functional $F$ has a mountain pass type geometry. It follows from the Sobolev Embedding Theorem (Theorem 1.2.26) that
\[
\left( \int_0^\pi |x(t)|^p \, dt \right)^{1/p} \leq c_p \left( \int_0^\pi |\dot{x}(t)|^2 \, dt \right)^{1/2}. \tag{6.4.15}
\]
Hence combining (6.4.14) and (6.4.15) we obtain
\[
F(x) = \frac{1}{2} \int_0^\pi |\dot{x}(t)|^2 \, dt + \frac{\lambda}{2} \int_0^\pi |x(t)|^2 \, dt - \frac{1}{p} \int_0^\pi |x(t)|^p \, dt
\geq \frac{1}{2} |||x|||^2 - \frac{1}{p} \left( \frac{c_p}{c_1} \right)^{p} |||x|||^p
= |||x|||^2 \left( \frac{1}{2} - \frac{1}{p} \left( \frac{c_p}{c_1} \right)^{p} |||x|||^{p-2} \right).
\]
So, because $p > 2$, due to (6.4.14) there exists $r > 0$ (small enough) such that
\[ b := \inf_{\|x\| = r} F(x) > 0 = F(o). \]
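Let $x \in H$, $x > 0$ in $(0, \pi)$. Then for $s > 0$ we have
\[
\frac{1}{s^p} F(sx) = \frac{s^{2-p}}{2} \left( \int_0^\pi |\dot{x}(t)|^2 \, dt + \lambda \int_0^\pi |x(t)|^2 \, dt \right) - \frac{1}{p} \int_0^\pi |x(t)|^p \, dt.
\]
For $s > 0$ set $e = sx$. Since $p > 2$, the first term on the right-hand side tends to zero as $s \to \infty$, so for $s$ large we obtain $\|e\| > r$ and $F(e) \leq 0$.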
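The mountain pass geometry just verified can be observed numerically. The following is a minimal sketch, not part of the original argument: it assumes the particular test function $x(t) = \sin t$ and the illustrative parameter values $\lambda = 0$, $p = 4$, and the helper names `integrate` and `energy_along_ray` are hypothetical. It evaluates $F(sx)$ along the ray $s \mapsto sx$ with a composite Simpson rule.

```python
import math

def integrate(f, a, b, n=2000):
    """Composite Simpson rule on [a, b] with n (even) subintervals."""
    h = (b - a) / n
    total = f(a) + f(b)
    for k in range(1, n):
        total += (4 if k % 2 else 2) * f(a + k * h)
    return total * h / 3

def energy_along_ray(s, lam=0.0, p=4.0):
    """F(s*x) for the assumed test function x(t) = sin t on (0, pi)."""
    kinetic = integrate(lambda t: (s * math.cos(t)) ** 2, 0.0, math.pi)
    potential = integrate(lambda t: (s * math.sin(t)) ** 2, 0.0, math.pi)
    nonlinear = integrate(lambda t: abs(s * math.sin(t)) ** p, 0.0, math.pi)
    return 0.5 * kinetic + 0.5 * lam * potential - nonlinear / p

# F is positive on small spheres and becomes negative far out along the ray.
for s in (0.5, 1.0, 1.5, 2.0, 3.0):
    print(f"s = {s:3.1f}   F(s*sin) = {energy_along_ray(s):+.4f}")
```

For these parameter values the energy along the ray has the closed form $F(s \sin) = \frac{\pi}{4} s^2 - \frac{3\pi}{32} s^4$, so the sign change between small and large $s$ seen in the output is exactly the behaviour used to choose $r$ and $e$.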
It remains to verify that $F$ satisfies the $(PS)_c$ condition. Actually, we will verify that $F$ satisfies even a stronger version of $(PS)_c$. Namely, we will prove that any sequence $\{x_n\}_{n=1}^\infty \subset H$ satisfying
\[ d := \sup_n F(x_n) < \infty, \qquad \nabla F(x_n) \to o, \tag{6.4.16} \]
contains a convergent subsequence.⁴⁷ A typical scheme of the proof is the following: In Step 1 we prove that $\{x_n\}_{n=1}^\infty$ is a bounded sequence. In Step 2 we pass to a weakly convergent subsequence and show that it converges strongly as well.

Step 1. For $n$ large enough, we have by (6.4.16)⁴⁸
\[
d + \|x_n\| \geq F(x_n) - \frac{1}{p} (\nabla F(x_n), x_n)
= \left( \frac{1}{2} - \frac{1}{p} \right) \left( \int_0^\pi |\dot{x}_n(t)|^2 \, dt + \lambda \int_0^\pi |x_n(t)|^2 \, dt \right)
\geq c_1^2 \left( \frac{1}{2} - \frac{1}{p} \right) \|x_n\|^2.
\]
It follows from this quadratic inequality that $\|x_n\|$ is bounded.

Step 2. Passing to a subsequence if necessary, we can assume that $x_n \rightharpoonup x$ in $H$.
By the compact embedding $H = W_0^{1,2}(0, \pi) \subset\subset C[0, \pi]$ (see Theorem 1.2.13) we have $x_n \to x$ in $C[0, \pi]$, and so $g(x_n) \to g(x)$ in $C[0, \pi]$. Observe that we have
\[
|||x_n - x|||^2 = (\nabla F(x_n) - \nabla F(x), x_n - x) + \int_0^\pi (g(x_n(t)) - g(x(t)))(x_n(t) - x(t)) \, dt. \tag{6.4.17}
\]
It is clear that
\[ (\nabla F(x_n) - \nabla F(x), x_n - x) \to 0 \quad \text{as } n \to \infty \]
(cf. (6.4.16); the term with $\nabla F(x_n)$ tends to zero because $\{x_n - x\}$ is bounded, the term with $\nabla F(x)$ by the weak convergence). The uniform convergence of $\{x_n\}_{n=1}^\infty$ and $\{g(x_n)\}_{n=1}^\infty$ implies that also
\[ \int_0^\pi (g(x_n(t)) - g(x(t)))(x_n(t) - x(t)) \, dt \to 0 \quad \text{as } n \to \infty. \]
Thus it follows from (6.4.17) that $|||x_n - x||| \to 0$ as $n \to \infty$, i.e., $x_n \to x$ in $H$ due to (6.4.14).

It follows from Theorem 6.4.5 that there exists a critical point $x_0 \in H$ of $F$ (and hence a weak solution of (6.4.12)) with $F(x_0) = c \geq b > 0$. In particular, $x_0 \neq o$. We prove that $x_0 > 0$ in $(0, \pi)$. Indeed,
\[
\int_0^\pi \dot{x}_0(t) \dot{y}(t) \, dt + \lambda \int_0^\pi x_0(t) y(t) \, dt = \int_0^\pi g(x_0(t)) y(t) \, dt
\]
holds for any $y \in H$. Taking $y = x_0^-$,⁴⁹ we get
\[
\int_0^\pi \left| \frac{d x_0^-(t)}{dt} \right|^2 dt + \lambda \int_0^\pi |x_0^-(t)|^2 \, dt = 0. \tag{6.4.18}
\]
Hence $|||x_0^-||| = 0$, i.e., $x_0(t) \geq 0$ for all $t \in [0, \pi]$ due to (6.4.14). An argument similar to that used in Section 6.1 yields that $x_0 \in C^2[0, \pi]$ (cf. Exercise 6.4.11). Now, if there were $t_0 \in (0, \pi)$ such that $x_0(t_0) = 0$, then, due to $x_0^- \equiv 0$, also $\dot{x}_0(t_0) = 0$ would hold. However, the uniqueness theorem for the second order initial value problem
\[
\begin{cases}
-\ddot{x}(t) = -\lambda x(t) + |x(t)|^{p-2} x(t), \\
x(t_0) = \dot{x}(t_0) = 0
\end{cases}
\]
implies that $x_0 \equiv 0$, i.e., a contradiction to $x_0 \neq o$. Hence $x_0 > 0$ in $(0, \pi)$.

⁴⁷ The reader should justify that $(PS)_c$ in the sense of Definition 6.4.4 is satisfied as well.
⁴⁸ Due to (6.4.16) we can actually assume that $\left| \frac{1}{p} (\nabla F(x_n), x_n) \right| \leq \|x_n\|$.
⁴⁹ Here $x^- = \max\{0, -x\}$. One can prove that for $x \in W_0^{1,2}(0, \pi)$ we have $x^- \in W_0^{1,2}(0, \pi)$ (cf. Exercise 1.2.47 and the embedding of $W_0^{1,2}(0, \pi)$ into $C[0, \pi]$).

Exercise 6.4.8. Prove that $\psi$ defined in the proof of Lemma 6.4.2 is a locally Lipschitz continuous functional on $H$.
Hint. For $u_1$, $u_2$ from a bounded set we have
\[ \operatorname{dist}(u_i, H \setminus A) + \operatorname{dist}(u_i, B) \geq \delta, \quad i = 1, 2, \]
with a $\delta > 0$. Using the triangle inequality prove that
\[
|\psi(u_2) - \psi(u_1)| \leq \frac{\operatorname{dist}(u_2, H \setminus A)}{\delta^2} \, |\operatorname{dist}(u_1, B) - \operatorname{dist}(u_2, B)| + \frac{\operatorname{dist}(u_2, B)}{\delta^2} \, |\operatorname{dist}(u_1, H \setminus A) - \operatorname{dist}(u_2, H \setminus A)|,
\]
and then apply Exercise 1.2.45.
Exercise 6.4.9. Prove that $f$ defined in the proof of Lemma 6.4.2 is a locally Lipschitz continuous map from $H$ into itself.
Hint. Use the facts that $\psi$ is locally Lipschitz continuous (Exercise 6.4.8) and $F \in C^2(H, \mathbb{R})$.

Exercise 6.4.10. Prove that the functional $F$ from Example 6.4.7 satisfies $F \in C^2(H, \mathbb{R})$.
Hint. Use the fact that $G$ defined in Example 6.4.7 belongs to $C^2(\mathbb{R})$ if $p > 2$.

Exercise 6.4.11. Prove that $x_0 \in C^2[0, \pi]$ for any weak solution $x_0$ of (6.4.12).
Hint. Look at the proof of Proposition 6.1.11.

Exercise 6.4.12. Consider the boundary value problem
\[
\begin{cases}
-\ddot{x}(t) - \lambda x(t) = g(t, x(t)), & t \in (0, \pi), \\
x(0) = x(\pi) = 0.
\end{cases}
\tag{6.4.19}
\]
Formulate conditions on $\lambda$ and $g$ which guarantee that the energy functional associated with (6.4.19) has a geometry corresponding to the Mountain Pass Theorem.

Exercise 6.4.13. Consider the Neumann boundary value problem
\[
\begin{cases}
-\ddot{x}(t) = h(t, x(t)), & t \in (0, \pi), \\
\dot{x}(0) = \dot{x}(\pi) = 0.
\end{cases}
\tag{6.4.20}
\]
Formulate conditions on $h = h(t, x)$ which guarantee the existence of a weak solution of (6.4.20).

Exercise 6.4.14. Consider the Dirichlet boundary value problem for the fourth order equation
\[
\begin{cases}
\dfrac{d^4 x(t)}{dt^4} = h(t, x(t)), & t \in (0, \pi), \\
x(0) = \dot{x}(0) = x(\pi) = \dot{x}(\pi) = 0.
\end{cases}
\tag{6.4.21}
\]
Formulate conditions on $h = h(t, x)$ which guarantee the existence of a weak solution of (6.4.21).
6.4A Pseudogradient Vector Fields in Banach Spaces

The aim of this appendix is to show how to extend the Quantitative Deformation Lemma (Lemma 6.4.2) to continuously differentiable functionals defined on a Banach space. For this purpose the notion of the pseudogradient introduced by Palais is crucial.

Definition 6.4.15. Let $M$ be a separable metric space, $X$ a normed linear space and $h \colon M \to X^* \setminus \{o\}$ a continuous mapping. A pseudogradient vector field for $h$ on $M$ is a locally Lipschitz continuous map $g \colon M \to X$ such that for every $u \in M$,
\[
\|g(u)\|_X \leq 2 \|h(u)\|_{X^*}, \qquad \langle h(u), g(u) \rangle_X \geq \|h(u)\|_{X^*}^2.
\]
Lemma 6.4.16. For any $h$ as above there exists a pseudogradient vector field for $h$ on $M$.

Proof. For every $v \in M$ there exists $x \in X$ such that⁵⁰
\[ \|x\| = 1 \quad \text{and} \quad \langle h(v), x \rangle > \tfrac{2}{3} \|h(v)\|. \]
Define $y := \tfrac{3}{2} \|h(v)\| x$. Then
\[ \|y\| < 2 \|h(v)\| \quad \text{and} \quad \langle h(v), y \rangle > \|h(v)\|^2. \]
Since $h$ is continuous, there exists an open neighborhood $U(v) \subset M$ such that
\[ \|y\| \leq 2 \|h(u)\| \quad \text{and} \quad \langle h(u), y \rangle \geq \|h(u)\|^2 \quad \text{for every } u \in U(v). \tag{6.4.22} \]
The family $\mathcal{U} := \{U(v) : v \in M\}$ is an open covering of $M$. Since $M$ is a separable metric space, there exists a locally finite open covering $\mathcal{M} := \{M_i : i \in \mathbb{N}\}$ of $M$ which is subordinate to $\mathcal{U}$ (cf. Lemma 4.3.75), i.e., for each $i \in \mathbb{N}$ there exists $v \in M$ such that $M_i \subset U(v)$.⁵¹ Hence there exists $y = y_i$ such that (6.4.22) is satisfied for every $u \in M_i$. Define, on $M$,
\[
\varrho_i(u) := \operatorname{dist}(u, M \setminus M_i) \quad \text{and} \quad g(u) := \sum_{i \in \mathbb{N}} \frac{\varrho_i(u)}{\sum_{j \in \mathbb{N}} \varrho_j(u)} \, y_i.^{52}
\]
It is now straightforward to verify that $g$ is the desired pseudogradient vector field for $h$ on $M$ (cf. Exercise 6.4.8).

The following generalization of Lemma 6.4.2 was proved by Willem [134].

Lemma 6.4.17 (Quantitative Deformation Lemma). Let $X$ be a Banach space, and let $F \colon X \to \mathbb{R}$, $F \in C^1(X, \mathbb{R})$, $S \subset X$, $S \neq \emptyset$, $c \in \mathbb{R}$, $\varepsilon, \delta > 0$ be such that for any $u \in F^{-1}([c - 2\varepsilon, c + 2\varepsilon]) \cap S_{2\delta}$ we have⁵³
\[ \|F'(u)\|_{X^*} \geq \frac{8\varepsilon}{\delta}. \tag{6.4.23} \]
Then there exists $\eta \in C([0, 1] \times X, X)$ such that
(i) $\eta(t, u) = u$ if $t = 0$ or $u \notin F^{-1}([c - 2\varepsilon, c + 2\varepsilon]) \cap S_{2\delta}$,
(ii) $\eta(1, F^{c+\varepsilon} \cap S) \subset F^{c-\varepsilon}$,⁵⁴
(iii) for any $t \in [0, 1]$, $\eta(t, \cdot)$ is a homeomorphism of $X$,
(iv) for any $u \in X$ and any $t \in [0, 1]$, $\|\eta(t, u) - u\|_X \leq \delta$,
(v) for any $u \in X$, $F(\eta(\cdot, u))$ is decreasing,
(vi) for any $u \in F^c \cap S_\delta$ and any $t \in (0, 1]$, $F(\eta(t, u)) < c$.

⁵⁰ Note that for any $v \in M$ we have $h(v) \neq o \in X^*$!
⁵¹ Separability of $M$ is actually not necessary. In the case of a general metric space $M$ its paracompactness is used instead of separability (see Dugundji [43], Zeidler [136]).
⁵² Note that the sums contain only a finite number of nonzero terms.
⁵³ Here $S_{2\delta} := \{u \in X : \operatorname{dist}(u, S) \leq 2\delta\}$.
⁵⁴ Recall that $F^{c \pm \varepsilon} := F^{-1}((-\infty, c \pm \varepsilon])$.
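Proof.⁵⁵ By Lemma 6.4.16 there exists a pseudogradient vector field $g$ for $F'$ on $M := \{u \in X : F'(u) \neq o\}$. Let us define sets
\[
A := F^{-1}([c - 2\varepsilon, c + 2\varepsilon]) \cap S_{2\delta}, \qquad B := F^{-1}([c - \varepsilon, c + \varepsilon]) \cap S_{\delta}
\]
and a functional
\[
\psi(u) := \frac{\operatorname{dist}(u, X \setminus A)}{\operatorname{dist}(u, X \setminus A) + \operatorname{dist}(u, B)}.
\]
Then $\psi$ is locally Lipschitz continuous (see Exercise 6.4.8) and
\[ \psi = \begin{cases} 1 & \text{on } B, \\ 0 & \text{on } X \setminus A. \end{cases} \]
Let us define a vector field
\[
f(u) = \begin{cases} -\psi(u) \dfrac{g(u)}{\|g(u)\|^2} & \text{for } u \in A, \\ o & \text{for } u \in X \setminus A. \end{cases}
\]
Then $f$ is also locally Lipschitz continuous (cf. Exercise 6.4.9) and by assumption (6.4.23) and by Definition 6.4.15, for any $u \in X$ we have
\[ \|f(u)\| \leq \frac{\delta}{8\varepsilon}. \tag{6.4.24} \]
Consider the Cauchy problem
\[ \begin{cases} \dot{\sigma} = f(\sigma), \\ \sigma(0) = u. \end{cases} \tag{6.4.25} \]
It follows from Corollary 3.1.6 and Remark 3.1.7 that (6.4.25) has a unique solution $\sigma(\cdot, u)$ which is defined on the whole $\mathbb{R}$ and $\sigma$ is continuous on $\mathbb{R} \times X$. Let us define $\eta \colon [0, 1] \times X \to X$ by
\[ \eta(t, u) := \sigma(8\varepsilon t, u). \]
It follows from Definition 6.4.15, assumption (6.4.23) and from (6.4.24) that for $t \geq 0$ the inequalities
\[
\|\sigma(t, u) - u\| = \left\| \int_0^t f(\sigma(\tau, u)) \, d\tau \right\| \leq \int_0^t \|f(\sigma(\tau, u))\| \, d\tau \leq \frac{\delta t}{8\varepsilon} \tag{6.4.26}
\]
and
\[
\frac{d}{dt} F(\sigma(t, u)) = \left\langle F'(\sigma(t, u)), \frac{d}{dt} \sigma(t, u) \right\rangle = \langle F'(\sigma(t, u)), f(\sigma(t, u)) \rangle \leq -\frac{1}{4} \psi(\sigma(t, u)) \leq 0 \tag{6.4.27}
\]
hold. To verify (i), (iii), (iv), (v) and (vi) is a matter of straightforward calculation (cf. Exercise 6.4.25).

⁵⁵ Cf. the proof of Lemma 6.4.2.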
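In order to verify (ii), let $u \in F^{c+\varepsilon} \cap S$. If there is $t \in [0, 8\varepsilon]$ such that $F(\sigma(t, u)) < c - \varepsilon$, then $F(\sigma(8\varepsilon, u)) < c - \varepsilon$ by (6.4.27) and (ii) is satisfied. If, on the other hand, we have $\sigma(t, u) \in F^{-1}([c - \varepsilon, c + \varepsilon])$ for any $t \in [0, 8\varepsilon]$, we obtain from (6.4.26) that $\sigma(t, u) \in B$ and hence (6.4.27) yields
\[
F(\sigma(8\varepsilon, u)) = F(u) + \int_0^{8\varepsilon} \frac{d}{dt} F(\sigma(t, u)) \, dt \leq F(u) - \frac{1}{4} \int_0^{8\varepsilon} \psi(\sigma(t, u)) \, dt \leq c + \varepsilon - 2\varepsilon = c - \varepsilon
\]
(recall that $\psi = 1$ on $B$), and (ii) is also satisfied.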
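The mechanism behind (ii) can be illustrated in a finite-dimensional model. The sketch below makes several simplifying assumptions that are not in the text: $X = \mathbb{R}^2$ with the Euclidean norm, the model functional $F(u) = |u|^2$, the cut-off $\psi \equiv 1$, and the gradient itself as pseudogradient (in a Hilbert space $g = \nabla F$ satisfies both inequalities of Definition 6.4.15). With these choices $\frac{d}{dt} F(\sigma(t, u)) = -1$ exactly, so flowing for time $2\varepsilon$ lowers $F$ by $2\varepsilon$; in the lemma the weaker rate $-\frac{1}{4}\psi$ is compensated by the longer time $8\varepsilon$. The function names are hypothetical.

```python
def F(u):
    """Model functional F(u) = |u|^2 on R^2 (an illustrative choice)."""
    return u[0] ** 2 + u[1] ** 2

def grad_F(u):
    return (2.0 * u[0], 2.0 * u[1])

def f(u):
    """Descent field f(u) = -g(u)/|g(u)|^2, assuming g = grad F and psi = 1."""
    g = grad_F(u)
    norm_sq = g[0] ** 2 + g[1] ** 2
    return (-g[0] / norm_sq, -g[1] / norm_sq)

def flow(u, t_end, steps=1000):
    """Integrate sigma' = f(sigma), sigma(0) = u, with the classical RK4 scheme."""
    h = t_end / steps
    x, y = u
    for _ in range(steps):
        k1 = f((x, y))
        k2 = f((x + 0.5 * h * k1[0], y + 0.5 * h * k1[1]))
        k3 = f((x + 0.5 * h * k2[0], y + 0.5 * h * k2[1]))
        k4 = f((x + h * k3[0], y + h * k3[1]))
        x += h * (k1[0] + 2 * k2[0] + 2 * k3[0] + k4[0]) / 6
        y += h * (k1[1] + 2 * k2[1] + 2 * k3[1] + k4[1]) / 6
    return (x, y)

u0 = (2.0, 0.0)      # starting point, F(u0) = 4
eps = 0.5
u1 = flow(u0, 2 * eps)
print("F before:", F(u0), " F after:", F(u1), " drop:", F(u0) - F(u1))
```

The flow is only followed while it stays away from the critical point $u = o$, where the field $f$ is singular; in the lemma this is guaranteed by the lower bound (6.4.23) on the set $A$.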
A special case of the Ekeland Variational Principle is considered to be the first application of Lemma 6.4.17.

Theorem 6.4.18 (Ekeland Variational Principle). Let $X$ be a Banach space, let $F \in C^1(X, \mathbb{R})$ be bounded below, and let $\varepsilon, \delta > 0$ be arbitrary. If
\[ F(v) \leq \inf_{u \in X} F(u) + \varepsilon \quad \text{for some } v \in X, \]
then there exists $u_0 \in X$ such that
\[
F(u_0) \leq \inf_{u \in X} F(u) + 2\varepsilon, \qquad \|u_0 - v\|_X \leq 2\delta, \qquad \text{and} \qquad \|F'(u_0)\|_{X^*} < \frac{8\varepsilon}{\delta}.
\]

Proof. We apply Lemma 6.4.17 with
\[ S := \{v\} \quad \text{and} \quad c := \inf_{u \in X} F(u). \]
We proceed via contradiction. Assume that there exist $\varepsilon$ and $\delta$ such that
\[ \|F'(u)\| \geq \frac{8\varepsilon}{\delta} \]
for every $u \in F^{-1}([c, c + 2\varepsilon]) \cap S_{2\delta}$. Then $\eta(1, v) \in F^{c-\varepsilon}$ by (ii) of Lemma 6.4.17. However, the definition of $c$ implies $F^{c-\varepsilon} = \emptyset$, a contradiction.

Corollary 6.4.19. Let $F \in C^1(X, \mathbb{R})$ be bounded below. If $F$ satisfies the $(PS)_c$ condition with $c := \inf_{u \in X} F(u)$, then every minimizing sequence for $F$ contains a converging subsequence. In particular, there exists $u_0 \in X$ such that
\[ F(u_0) = \min_{u \in X} F(u). \]
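Proof. Let $\{v_n\}_{n=1}^\infty \subset X$ be a minimizing sequence for $F$. We apply Theorem 6.4.18 with
\[ \varepsilon_n := \max\left\{ \frac{1}{n}, F(v_n) - c \right\} \quad \text{and} \quad \delta_n := \sqrt{\varepsilon_n}. \]
Then there exists a sequence $\{u_n\}_{n=1}^\infty \subset X$ such that
\[ F(u_n) \to c, \qquad F'(u_n) \to o \qquad \text{and} \qquad \|u_n - v_n\| \to 0. \]
Indeed, Theorem 6.4.18 gives $F(u_n) \leq c + 2\varepsilon_n$, $\|u_n - v_n\| \leq 2\sqrt{\varepsilon_n}$ and $\|F'(u_n)\| < 8\sqrt{\varepsilon_n}$, and $\varepsilon_n \to 0$. The assertion follows now from $(PS)_c$.

Another application of Lemma 6.4.17 is the following result.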
Theorem 6.4.20 (Brézis and Nirenberg). Let $F \in C^1(X, \mathbb{R})$. If
\[ c := \liminf_{\|u\| \to \infty} F(u) \in \mathbb{R}, \]
then for every $\varepsilon, \delta > 0$, $R > 2\delta$, there exists $u \in X$ such that
(i) $c - 2\varepsilon \leq F(u) \leq c + 2\varepsilon$,
(ii) $\|u\| > R - 2\delta$,
(iii) $\|F'(u)\| < \dfrac{8\varepsilon}{\delta}$.

Proof. We proceed by contradiction similarly to the proof of Theorem 6.4.18. Suppose that the assertion does not hold. Then there exist $\varepsilon$, $\delta$ and $R$ such that for any $u \in X$ satisfying (i) and (ii) the inequality in (iii) is false. Hence we can apply Lemma 6.4.17 with $S := X \setminus B(o; R)$. By the definition of $c$, $F^{c+\varepsilon} \cap S$ is an unbounded set and $F^{c-\varepsilon} \subset B(o; r)$ for $r > 0$ large enough. By (ii) and (iv) of Lemma 6.4.17,
\[ \eta(1, F^{c+\varepsilon} \cap S) \subset F^{c-\varepsilon} \quad \text{and} \quad F^{c+\varepsilon} \cap S \subset B(o; r + \delta), \]
a contradiction.
Corollary 6.4.21 (Shujie Li). Let $F \in C^1(X, \mathbb{R})$ be bounded below. If for arbitrary $d \in \mathbb{R}$ every sequence $\{u_n\}_{n=1}^\infty \subset X$ such that
\[ F(u_n) \to d, \qquad F'(u_n) \to o \]
is bounded, then
\[ \lim_{\|u\| \to \infty} F(u) = \infty. \]

Proof. We proceed again by contradiction. Assume the assertion does not hold. Then
\[ c := \liminf_{\|u\| \to \infty} F(u) \in \mathbb{R}. \]
By Theorem 6.4.20 there exists a sequence $\{u_n\}_{n=1}^\infty \subset X$ such that
\[ F(u_n) \to c, \qquad F'(u_n) \to o \qquad \text{and} \qquad \|u_n\| \to \infty, \]
a contradiction.
Let us present now the most important application of Lemma 6.4.17, the General Minimax Principle.

Theorem 6.4.22. Let $X$ be a Banach space. Let $M_0$ be a subset of a metric space $M$ and $\Gamma_0 \subset C(M_0, X)$. Define
\[ \Gamma := \{\gamma \in C(M, X) : \gamma|_{M_0} \in \Gamma_0\}. \]
If $F \in C^1(X, \mathbb{R})$ satisfies
\[
a := \sup_{\gamma_0 \in \Gamma_0} \sup_{u \in M_0} F(\gamma_0(u)) < c := \inf_{\gamma \in \Gamma} \sup_{u \in M} F(\gamma(u)) < \infty, \tag{6.4.28}
\]
then for every $\varepsilon \in \left( 0, \frac{c - a}{2} \right)$, $\delta > 0$ and $\gamma \in \Gamma$ such that
\[ \sup_{u \in M} F(\gamma(u)) \leq c + \varepsilon, \tag{6.4.29} \]
there exists $u_0 \in X$ such that
(i) $c - 2\varepsilon \leq F(u_0) \leq c + 2\varepsilon$,
(ii) $\operatorname{dist}(u_0, \gamma(M)) \leq 2\delta$,
(iii) $\|F'(u_0)\| < \dfrac{8\varepsilon}{\delta}$.

Proof. Suppose, by contradiction, that the assertion is false. Then there exist $0 < \varepsilon < \frac{c - a}{2}$, $\delta > 0$ and $\gamma \in \Gamma$ such that (6.4.29) holds and for any $u \in X$ satisfying (i) and (ii), the inequality in (iii) is false. Hence Lemma 6.4.17 can be applied with $S = \gamma(M)$. Define $\beta(u) := \eta(1, \gamma(u))$. Since $c - 2\varepsilon > a$, we obtain from (6.4.28) that
\[ \beta(u) = \eta(1, \gamma(u)) = \gamma(u) \quad \text{for every } u \in M_0, \quad \text{so that } \beta \in \Gamma. \]
It follows from (6.4.29) and Lemma 6.4.17 that
\[ \sup_{u \in M} F(\beta(u)) = \sup_{u \in M} F(\eta(1, \gamma(u))) \leq c - \varepsilon, \]
contradicting the definition of $c$.

We now have the following consequence.
Corollary 6.4.23. Let the assumptions of Theorem 6.4.22 be fulfilled. Then there exists a sequence $\{u_n\}_{n=1}^\infty \subset X$ satisfying
\[ F(u_n) \to c, \qquad F'(u_n) \to o. \]
In particular, if $F$ satisfies the $(PS)_c$ condition, then $c$ is a critical value of $F$.

The special choices of $M$, $M_0$, $\Gamma$ and $\Gamma_0$ in Theorem 6.4.22 can yield the Mountain Pass Theorem (see below) and the Saddle Point Theorem (see Appendix 6.5A) under more general assumptions than in Sections 6.4 and 6.5, respectively.

Theorem 6.4.24 (Mountain Pass Theorem, Ambrosetti and Rabinowitz). Let $X$ be a Banach space and let $F \in C^1(X, \mathbb{R})$, $e \in X$ and $r > 0$ be such that $\|e\| > r$ and
\[ \inf_{\substack{u \in X \\ \|u\| = r}} F(u) > F(o) \geq F(e). \]
If $F$ satisfies the $(PS)_c$ condition with
\[ c := \inf_{\gamma \in \Gamma} \max_{t \in [0, 1]} F(\gamma(t)) \quad \text{where} \quad \Gamma := \{\gamma \in C([0, 1], X) : \gamma(0) = o, \ \gamma(1) = e\}, \]
then $c$ is a critical value of $F$.

Proof. It suffices to apply Corollary 6.4.23 with $M = [0, 1]$, $M_0 = \{0, 1\}$, $\Gamma_0 = \{\gamma_0\}$ where $\gamma_0(0) = o$ and $\gamma_0(1) = e$.

Exercise 6.4.25. Verify that (i), (iii), (iv), (v) and (vi) of Lemma 6.4.17 hold true. Let $g$ and $f$ be from the proof of Lemma 6.4.17. Explain why local Lipschitz continuity of $g$ implies that $f$ is locally Lipschitz continuous. Compare this with the proof of Lemma 6.4.2.
Hint. Use (6.4.26) and (6.4.27).
Exercise 6.4.26. Consider the boundary value problem
\[
\begin{cases}
-\left( |\dot{x}(t)|^{p-2} \dot{x}(t) \right)^{\cdot} + \lambda x(t) = |x(t)|^{r-2} x(t), & t \in (0, 1), \\
x(0) = x(1) = 0,
\end{cases}
\tag{6.4.30}
\]
where $p > 1$ and $r > p$. Let $\lambda_1$ be the first eigenvalue of (5.2.47). Prove that the problem (6.4.30) has a positive solution on $(0, 1)$ provided $\lambda > -\lambda_1$.
Hint. Follow the idea of proving the “sufficient condition” in Example 6.4.7 and apply Theorem 6.4.24.

Exercise 6.4.27. Consider the problem
\[
\begin{cases}
-\left( |\dot{x}(t)|^{p-2} \dot{x}(t) \right)^{\cdot} - \lambda |x(t)|^{p-2} x(t) = g(t, x(t)), & t \in (0, 1), \\
x(0) = x(1) = 0,
\end{cases}
\tag{6.4.31}
\]
where $p > 1$. Formulate conditions on $\lambda$ and $g$ which guarantee that the energy functional associated with (6.4.31) has a geometry corresponding to the Mountain Pass Theorem (Theorem 6.4.24).

Exercise 6.4.28. Consider the Dirichlet boundary value problem
\[
\begin{cases}
\left( |\dot{x}(t)|^{p-2} \dot{x}(t) \right)^{\cdot} = h(t, x(t)), & t \in (0, 1), \\
x(0) = x(1) = 0,
\end{cases}
\tag{6.4.32}
\]
where $p > 1$. Formulate conditions on $h = h(t, x)$ which guarantee that (6.4.32) has a weak solution.
6.4B Lusternik–Schnirelmann Method

The purpose of this appendix is to extend the results presented in Section 6.3, namely, we concentrate on the Lusternik–Schnirelmann Method which generalizes the Courant–Fischer and Courant–Weinstein Principles.

In order to motivate the topic let us consider the unit circle $S^1$ in the plane and a continuous function $\varphi$ defined on it. The Extreme Value Theorem implies that this function has to attain its maximum and minimum values. If the function $\varphi$ is the restriction of a non-vanishing linear function of two independent variables to $S^1$, then $\varphi$ has exactly one maximum at $M$ and one minimum at $m$ as in Figure 6.4.3. The set $S^1$ can be covered by two closed sets which are contractible to a point in $S^1$ (see Figure 6.4.4).

As another example consider a two-dimensional torus $T^2$ and identify it with the quotient space $\mathbb{R}^2 / \mathbb{Z}^2$ (see Figure 6.4.5). Let us consider a function $\varphi \in C^1(T^2, \mathbb{R})$⁵⁶ having a maximum at $M$ and a minimum at $m$. We also assume that the level sets of $\varphi$ are the curves indicated in Figure 6.4.5. The function $\varphi$ has three critical points on the torus: the maximum at $M$, the minimum at $m$ and a saddle point at $S$. In Figure 6.4.6 we give a covering of the torus $T^2$ by three closed sets which are contractible to a point in $T^2$ (see Appendix 6.3A for the notion of a contractible set).

[Figure 6.4.3: the circle $S^1$ with the maximum $M$ and the minimum $m$ marked.]
[Figure 6.4.4: covering of $S^1$ by the closed sets $A_1$, $A_2$.]

⁵⁶ For the notion of the differentiability of functions defined on manifolds, like $S^1$, $T^2$, see Definition 4.3.36.
Definition 6.4.29. We define the Lusternik–Schnirelmann category $\operatorname{cat}_Y(A)$ of a closed nonempty subset $A$ of a topological space $Y$ as the least integer $n$ such that there exists a covering of $A$ by $n$ closed sets contractible to a point in $Y$.⁵⁷

The essential idea of the Lusternik–Schnirelmann method is the following one: The number of critical points of a $C^1$-functional $\varphi$ defined on a compact manifold $Y$ is greater than or equal to $\operatorname{cat}_Y(Y)$. The corresponding critical values are given by
\[ c_k := \inf_{A \in \mathcal{A}_k} \sup_{u \in A} \varphi(u) \]
where
\[ \mathcal{A}_k := \{A \subset Y : A \text{ closed}, \ \operatorname{cat}_Y(A) \geq k\}. \]
Let us prove some elementary properties of the Lusternik–Schnirelmann category.

Lemma 6.4.30. Let $A$ and $B$ be closed subsets of $Y$. Then we have
(i) (normalization) $\operatorname{cat}_Y(\emptyset) = 0$,
(ii) (subadditivity) $\operatorname{cat}_Y(A \cup B) \leq \operatorname{cat}_Y(A) + \operatorname{cat}_Y(B)$,
(iii) (monotonicity) if $A \prec B$,⁵⁸ then $\operatorname{cat}_Y(A) \leq \operatorname{cat}_Y(B)$.

Proof. Properties (i) and (ii) follow directly from Definition 6.4.29. Let us prove (iii). Assume that $A \prec B$ by means of homotopy $h$, and let $\{B_1, \dots, B_n\}$ be a covering of $B$ corresponding to $n := \operatorname{cat}_Y(B)$ according to Definition 6.4.29. Define sets
\[ A_j := \{u \in A : h(1, u) \in B_j\}, \quad j = 1, \dots, n. \]
Then
\[ A = \bigcup_{j=1}^{n} A_j, \qquad A_j \prec B_j, \quad j = 1, \dots, n. \]
According to Lemma 6.3.21, $\operatorname{cat}_Y(A) \leq n$.

Definition 6.4.31. A metric space $Y$ is an absolute neighborhood extensor if for every metric space $E$, every closed subset $F \subset E$ and every continuous mapping $f \colon F \to Y$ there exists a continuous extension of $f$ defined on a neighborhood of $F$ in $E$.⁵⁹

⁵⁷ See Definition 6.3.20.
⁵⁸ See Definition 6.3.20 for the relation $\prec$.
⁵⁹ The terminology is not fixed in the literature. We follow here the books Willem [134] and Zeidler [136]. On the other hand, the same objects are called “absolute neighborhood retract” in Deimling [34] and Dugundji [43].
[Figure 6.4.5: the torus $T^2$ represented as a quotient of the plane, with level sets of $\varphi$ and the points $M$, $m$, $S$, $p$, $q$ marked.]

[Figure 6.4.6: covering of the torus $T^2$ by the closed sets $A_1$, $A_2$, $A_3$.]
Remark 6.4.32. Note that every normed linear space is an absolute neighborhood extensor (see, e.g., the Tietze–Dugundji Theorem in Zeidler [136, Prop. 2.1]).

Proposition 6.4.33. Let $A$ be a closed subset of an absolute neighborhood extensor $Y$. Then there exists a closed neighborhood $B$ of $A$ in $Y$ such that $\operatorname{cat}_Y(B) = \operatorname{cat}_Y(A)$.

Proof. The reader should realize that it is sufficient to consider the case $\operatorname{cat}_Y(A) = 1$ (cf. Lemma 6.4.30(ii) and (iii)). Let $h$ be the corresponding homotopy which contracts $A$ to a point. The set
\[ N := ([0, 1] \times A) \cup (\{0, 1\} \times Y) \quad \text{is closed in} \quad M := [0, 1] \times Y. \]
Let $u_0 \in A$ be fixed. The map $f \colon N \to Y$ defined by
\[
f(t, u) := \begin{cases}
h(t, u), & t \in [0, 1], \ u \in A, \\
u, & t = 0, \ u \in Y, \\
h(1, u_0), & t = 1, \ u \in Y,
\end{cases}
\]
is continuous. The fact that $Y$ is an absolute neighborhood extensor implies that there exists a continuous extension $g$ of $f$ defined on a neighborhood $U$ of $N$. The compactness of $[0, 1]$ implies the existence of a closed neighborhood $B$ of $A$ such that $[0, 1] \times B \subset U$. However, then $B$ is contractible to a point in $Y$, i.e., $\operatorname{cat}_Y(B) = 1$.

Our aim is now to prove the Quantitative Deformation Lemma which will be the key tool for proving the existence of critical points on manifolds. In the following considerations we will always assume that $X$ is a real separable Banach space, $\psi \in C^2(X, \mathbb{R})$,
\[ V := \{v \in X : \psi(v) = 1\} \neq \emptyset \quad \text{and} \quad \psi'(v) \neq 0 \quad \text{for every } v \in V. \]
The reader should be aware of the fact that some of these assumptions can be relaxed and more general results parallel to those from this section can be proved (see, e.g., Ghoussoub [58]).

The set $V$ is a differentiable manifold of the class $C^2$ (cf. Remark 4.3.39 or Deimling [34, § 27], Zeidler [136, Chapter 43]). The norm on $X$ induces a metric on $V$ and so $V$ becomes a metric manifold, i.e., a metric space and a manifold. It can be proved that $V$ is an absolute neighborhood extensor (see, e.g., Deimling [34, Proposition 27.6]). We denote by $T_v V$ its tangent space at $v$ (see Remark 4.3.40), i.e.,
\[ T_v V := \{y \in X : \langle \psi'(v), y \rangle_X = 0\}. \]
Let $\varphi \in C^1(X, \mathbb{R})$ be given. The norm of the restriction of the derivative $\varphi'(v)$ to $T_v V$ is given by
\[ \|\varphi'(v)\|_* := \sup_{\substack{y \in T_v V \\ \|y\|_X = 1}} \langle \varphi'(v), y \rangle_X. \]
The point $v$ is a critical point of the restriction of $\varphi$ to $V$ if the restriction of $\varphi'(v)$ to $T_v V$ is equal to $o$. We define
\[ \varphi^d := \{v \in V : \varphi(v) \leq d\}. \]
We will use the following Duality Lemma.
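Lemma 6.4.34 (Duality Lemma). If $f, g \in X^*$, then
\[ \sup_{\substack{\langle g, y \rangle = 0 \\ \|y\| = 1}} \langle f, y \rangle = \min_{\lambda \in \mathbb{R}} \|f - \lambda g\|. \]

Proof. For every $\lambda \in \mathbb{R}$ we have
\[
\sup_{\substack{\langle g, y \rangle = 0 \\ \|y\| = 1}} \langle f, y \rangle
= \sup_{\substack{\langle g, y \rangle = 0 \\ \|y\| = 1}} \langle f - \lambda g, y \rangle
\leq \sup_{\|y\| = 1} \|f - \lambda g\| \, \|y\| = \|f - \lambda g\|.
\]
By the Hahn–Banach Theorem (Corollary 2.1.15) there is a continuous linear functional $\tilde{f}$ on $X$, $\tilde{f}|_{\operatorname{Ker} g} = f$, such that
\[ \sup_{\substack{\langle g, y \rangle = 0 \\ \|y\| = 1}} \langle f, y \rangle = \|\tilde{f}\|. \]
Since $\operatorname{Ker} g \subset \operatorname{Ker}(f - \tilde{f})$, there exists $\lambda \in \mathbb{R}$ such that $f - \tilde{f} = \lambda g$ (see Proposition 1.1.19). Hence we obtain $\tilde{f} = f - \lambda g$.

The above lemma immediately yields the following assertion.

Proposition 6.4.35. If $\varphi \in C^1(X, \mathbb{R})$ and $u \in V := \{v \in X : \psi(v) = 1\}$, then
\[ \|\varphi'(u)\|_* = \min_{\lambda \in \mathbb{R}} \|\varphi'(u) - \lambda \psi'(u)\|. \]
In particular, $u$ is a critical point of $\varphi|_V$ if and only if there exists $\lambda \in \mathbb{R}$ such that $\varphi'(u) = \lambda \psi'(u)$.⁶⁰

Next we define a tangent pseudogradient vector field on $V$.

Definition 6.4.36.⁶¹ Let $\varphi \in C^1(V, \mathbb{R})$. A tangent pseudogradient vector field for $\varphi$ on $\mathcal{M} := \{u \in V : \|\varphi'(u)\|_* \neq 0\}$ is a locally Lipschitz continuous vector field $g \colon \mathcal{M} \to X$ such that $g(u) \in T_u V$ and
\[ \|g(u)\| \leq 2 \|\varphi'(u)\|_*, \qquad \langle \varphi'(u), g(u) \rangle \geq \|\varphi'(u)\|_*^2 \qquad \text{for every } u \in \mathcal{M}. \]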
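The Duality Lemma (Lemma 6.4.34), on which Proposition 6.4.35 rests, is easy to check by hand in a Euclidean space. The sketch below is an illustration under assumptions that are not in the text: $X = \mathbb{R}^3$ with the Euclidean norm (so functionals are identified with vectors), $f$ is not a multiple of $g$, and the function names are hypothetical. In this setting the constrained supremum is attained at the normalized projection of $f$ onto $\operatorname{Ker} g$, and the minimizing multiplier is $\lambda = \langle f, g \rangle / \langle g, g \rangle$.

```python
import math

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def norm(a):
    return math.sqrt(dot(a, a))

def constrained_sup(f, g):
    """sup of <f, y> over unit vectors y with <g, y> = 0 (Euclidean case).

    Assumes f is not a multiple of g, so the projection below is nonzero.
    """
    lam = dot(f, g) / dot(g, g)                  # minimizing multiplier
    r = [fi - lam * gi for fi, gi in zip(f, g)]  # projection of f onto Ker g
    y = [ri / norm(r) for ri in r]               # maximizing unit vector in Ker g
    return dot(f, y), lam, y

def min_over_multipliers(f, g, lo=-10.0, hi=10.0, n=200001):
    """Brute-force minimum of |f - lam*g| over a fine grid of multipliers lam."""
    best = float("inf")
    for k in range(n):
        lam = lo + (hi - lo) * k / (n - 1)
        best = min(best, norm([fi - lam * gi for fi, gi in zip(f, g)]))
    return best

f_vec = [1.0, 2.0, 3.0]
g_vec = [1.0, 0.0, 1.0]
sup_value, lam_star, y_star = constrained_sup(f_vec, g_vec)
print("sup over Ker g :", sup_value)
print("min over lambda:", min_over_multipliers(f_vec, g_vec))
print("multiplier     :", lam_star)
```

In the Banach space setting of the lemma no projection is available, which is why the proof goes through the Hahn–Banach Theorem instead.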
Lemma 6.4.37.⁶² Let $\varphi \in C^1(X, \mathbb{R})$. Then there exists a tangent pseudogradient vector field for $\varphi$ on $\mathcal{M}$.

Proof. For every $v \in \mathcal{M}$ there exists $x \in T_v V$ such that
\[ \|x\| = 1 \quad \text{and} \quad \langle \varphi'(v), x \rangle > \tfrac{2}{3} \|\varphi'(v)\|_*. \]
There is also $z \in X$ such that
\[ \langle \psi'(v), z \rangle = 1. \]
Set $y := \tfrac{3}{2} x \|\varphi'(v)\|_*$ and for $u \in V$ such that $\langle \psi'(u), z \rangle \neq 0$, set
\[ g_v(u) := y - \frac{\langle \psi'(u), y \rangle}{\langle \psi'(u), z \rangle} \, z. \]
Since $\langle \psi'(v), y \rangle = 0$, we have $g_v(v) = y$ and
\[ \|g_v(v)\| < 2 \|\varphi'(v)\|_*, \qquad \langle \varphi'(v), g_v(v) \rangle > \|\varphi'(v)\|_*^2. \]
Since $\varphi'$ and $g_v$ are continuous, there exists an open neighborhood $N(v)$ of $v$ such that
\[ \|g_v(u)\| < 2 \|\varphi'(u)\|_*, \qquad \langle \varphi'(u), g_v(u) \rangle > \|\varphi'(u)\|_*^2 \qquad \text{for every } u \in N(v). \]
The family $\mathcal{N} := \{N(v) : v \in \mathcal{M}\}$ is an open covering of $\mathcal{M}$. Since $\mathcal{M}$ is a metric space, there exists a locally finite open covering $\{M_i : i \in \mathbb{N}\}$ of $\mathcal{M}$ which is subordinate to $\mathcal{N}$, i.e., such that for every $i \in \mathbb{N}$ there exists $v \in V$ satisfying $M_i \subset N(v)$ (cf. the proof of Lemma 6.4.16). For any $i \in \mathbb{N}$ choose one such $v := v_i$ and define
\[
g_i(u) := \begin{cases} g_{v_i}(u), & u \in N(v_i), \\ o, & u \notin N(v_i), \end{cases}
\]
and
\[ \varrho_i(u) := \operatorname{dist}(u, X \setminus M_i), \qquad g(u) := \sum_{i \in \mathbb{N}} \frac{\varrho_i(u) \, g_i(u)}{\sum_{j \in \mathbb{N}} \varrho_j(u)}.^{63} \]
It is now straightforward to verify that $g$ is a tangent pseudogradient vector field for $\varphi$ on $\mathcal{M}$. (The interested reader is invited to check it in detail and realize that the fact $\psi \in C^2(X, \mathbb{R})$ is used!)

The proof of the following version of the Quantitative Deformation Lemma follows the lines of the proof of Lemma 6.4.17 (cf. Exercise 6.4.50).

Lemma 6.4.38 (Quantitative Deformation Lemma). Let $\varphi \in C^1(X, \mathbb{R})$, $S \subset V$, $c \in \mathbb{R}$, $\varepsilon, \delta > 0$ be such that
\[ \|\varphi'(u)\|_* \geq \frac{8\varepsilon}{\delta} \quad \text{for any } u \in \varphi^{-1}([c - 2\varepsilon, c + 2\varepsilon]) \cap S_{2\delta} \cap V. \]
Then there exists $\eta \in C([0, 1] \times V, V)$ such that
(i) $\eta(t, u) = u$ if $t = 0$ or $u \notin \varphi^{-1}([c - 2\varepsilon, c + 2\varepsilon]) \cap S_{2\delta} \cap V$,
(ii) $\eta(1, \varphi^{c+\varepsilon} \cap S) \subset \varphi^{c-\varepsilon}$,
(iii) $\varphi(\eta(\cdot, u))$ is decreasing for any $u \in V$.

⁶⁰ Cf. Theorem 6.3.2.
⁶¹ Cf. Definition 6.4.15.
⁶² Cf. Lemma 6.4.16.
⁶³ Note that the sums contain only a finite number of nonzero terms. Note also that separability of $X$ can be dropped and substituted by paracompactness.
We are now ready to prove the General Minimax Principle on the manifold $V$. We assume that $\varphi \in C^1(X, \mathbb{R})$ is bounded below on $V$. For $j \geq 1$, $j \in \mathbb{N}$, we define
\[
\mathcal{A}_j := \{A \subset V : A \text{ is closed}, \ \operatorname{cat}_V(A) \geq j\}, \qquad c_j := \inf_{A \in \mathcal{A}_j} \sup_{u \in A} \varphi(u). \tag{6.4.33}
\]

Theorem 6.4.39 (General Minimax Principle). Assume that $\varphi$ and $c_j$ are as above. If
\[ c := c_k = c_{k+1} = \dots = c_{k+m}, \tag{6.4.34} \]
then for every $\varepsilon > 0$, $\delta > 0$, $A \in \mathcal{A}_{k+m}$ and $B \subset V$ closed such that
\[ \sup_{u \in A} \varphi(u) \leq c + \varepsilon, \qquad \operatorname{cat}_V(B) \leq m, \tag{6.4.35} \]
there exists $u_0 \in V$ such that
(i) $c - 2\varepsilon \leq \varphi(u_0) \leq c + 2\varepsilon$,
(ii) $\operatorname{dist}(u_0, A \setminus \operatorname{int} B) \leq 2\delta$,
(iii) $\|\varphi'(u_0)\|_* \leq \dfrac{8\varepsilon}{\delta}$.

Proof. We assume by contradiction that there exist numbers $\varepsilon > 0$, $\delta > 0$, and closed sets $A \in \mathcal{A}_{k+m}$, $B \subset V$ such that (6.4.35) holds and for all $u \in V$ satisfying (i) and (ii)⁶⁴ the inequality (iii) is false. We apply Lemma 6.4.38 with $S := A \setminus \operatorname{int} B$. We obtain by virtue of Lemma 6.4.38(ii) that $(A \setminus \operatorname{int} B) \prec \varphi^{c-\varepsilon}$. It follows from Lemma 6.4.30(ii), (iii) and from the definition of $c_k$ that
\[
k + m \leq \operatorname{cat}_V(A) \leq \operatorname{cat}_V(A \setminus \operatorname{int} B) + \operatorname{cat}_V(B) \leq \operatorname{cat}_V(\varphi^{c-\varepsilon}) + m \leq k - 1 + m,
\]
a contradiction.

Definition 6.4.40. A functional $\varphi$ satisfies the Palais–Smale condition $(PS)_c$ on $V$ if any sequence $\{u_n\}_{n=1}^\infty \subset V$ such that
\[ \varphi(u_n) \to c, \qquad \|\varphi'(u_n)\|_* \to 0, \]
has a convergent subsequence.

Theorem 6.4.41. Let $\varphi$ be bounded below on $V$, satisfy $(PS)_c$ on $V$, and let (6.4.34) hold. Let
\[ K_c := \{u \in V : \varphi(u) = c, \ \|\varphi'(u)\|_* = 0\}. \]
Then $\operatorname{cat}_V(K_c) \geq m + 1$.

Proof. Assume that $\operatorname{cat}_V(K_c) \leq m$. Then Proposition 6.4.33 implies the existence of a closed neighborhood $B$ of $K_c$ in $V$ such that $\operatorname{cat}_V(B) \leq m$.⁶⁵

⁶⁴ Cf. Exercise 6.4.55.
⁶⁵ Note that the fact that $V$ is an absolute neighborhood extensor is used here, cf. page 446.
By Theorem 6.4.39 for $A = V$ there exists a sequence $\{u_n\}_{n=1}^\infty \subset V$ satisfying
\[ \varphi(u_n) \to c, \qquad \operatorname{dist}(u_n, V \setminus \operatorname{int} B) \to 0, \qquad \|\varphi'(u_n)\|_* \to 0. \]
It then follows from the $(PS)_c$ condition on $V$ that $K_c \cap (V \setminus \operatorname{int} B) \neq \emptyset$, a contradiction with the definition of $B$.

Theorem 6.4.42. Let $\varphi$ be bounded below on $V$, $d \geq \inf_{u \in V} \varphi(u)$ and $\varphi$ satisfy $(PS)_c$ on $V$ for any $c \in \left[ \inf_{u \in V} \varphi(u), d \right]$. Then $\varphi|_V$ has a minimum and $\varphi^d$ contains at least $\operatorname{cat}_V(\varphi^d)$ critical points of $\varphi|_V$.

Proof. If $n := \operatorname{cat}_V(\varphi^d)$, then
\[ \inf_{u \in V} \varphi(u) = c_1 \leq c_2 \leq c_3 \leq \dots \leq c_n \leq d \]
where $c_i$, $i = 1, \dots, n$, are given by (6.4.33). The critical points corresponding to different critical levels are mutually different. If some levels coincide, we apply Theorem 6.4.41 to get the assertion.

Remark 6.4.43. Note that
(i) $\operatorname{cat}_X(B(o; 1)) = 1$ for the closed ball $B(o; 1)$ in $X$ where $X$ is a Banach space (see Figure 6.4.7);

[Figure 6.4.7: the closed ball $B(o; 1) \subset X$.]

(ii) $\operatorname{cat}_{S^{N-1}}(S^{N-1}) = 2$ for the unit sphere $S^{N-1} = \partial B(o; 1) \subset \mathbb{R}^N$, $N \geq 1$. Indeed, Figure 6.4.4 suggests that $\operatorname{cat}_{S^{N-1}}(S^{N-1}) \leq 2$. On the other hand, it follows from Lemma 6.3.22 that $\operatorname{cat}_{S^{N-1}}(S^{N-1}) > 1$.

Definition 6.4.44. Let $S^{N-1} \subset \mathbb{R}^N$ be the unit sphere. Then
\[ P^{N-1} = \{(u, -u) : u \in S^{N-1}\} \]
is called an $(N-1)$-dimensional projective space.

The geometrical interpretation of $P^{N-1}$ is the following: the $(N-1)$-dimensional projective space $P^{N-1}$ results from $S^{N-1}$, $N \geq 1$, by identifying antipodal points (see Figure 6.4.8). The following identity is the key to the proof of existence of a sequence of eigenvalues of nonlinear problems:
\[ \operatorname{cat}_{P^{N-1}}(P^{N-1}) = N. \tag{6.4.36} \]
To see that $\operatorname{cat}_{P^{N-1}}(P^{N-1}) \leq N$ we can proceed by induction as follows. $S^1$ can be covered by two closed symmetric sets which are contractible to a point in $P^1$ (see Figure 6.4.9).

[Figure 6.4.8: antipodal points $u$ and $-u$ on the sphere, with center $o$.]
[Figure 6.4.9: covering of $S^1$ by two closed symmetric sets $A_1$, $A_2$, contractible to a point in $P^1$.]

The closed strip along the equator on $S^2$ can be covered by two closed symmetric sets which are contractible to a point in $P^2$ as well. If we add the closed north and south caps, we get a covering of $S^2$ by three closed symmetric sets which are contractible to a point in $P^2$, etc. (see Figure 6.4.10).

[Figure 6.4.10: covering of $S^2$ by three closed symmetric sets $A_1$, $A_2$, $A_3$.]
To prove the reversed inequality we proceed by contradiction. Assume that $\operatorname{cat}_{P^{N-1}}(P^{N-1}) < N$. Then according to Exercise 6.4.51 there exist $M < N$ and closed symmetric sets $A_i$, $i = 1, \dots, M$, such that
\[ S^{N-1} = \bigcup_{i=1}^{M} A_i, \qquad A_i = \tilde{A}_i \cup (-\tilde{A}_i), \qquad \tilde{A}_i \cap (-\tilde{A}_i) = \emptyset. \]
Then $\tilde{A}_1, \dots, \tilde{A}_M, (-\tilde{A}_1) \cup \dots \cup (-\tilde{A}_M)$ is a covering of $S^{N-1}$ by $M + 1$ closed sets and none of them contains antipodal points. This contradicts the covering result of Lusternik and Schnirelmann (if $M = N - 1$, we can apply directly Exercise 4.3.138; if $M < N - 1$, we complete the above covering by $N - 1 - M$ empty sets and apply again Exercise 4.3.138).

Similarly to Definition 6.4.44 we can define an infinite dimensional projective space
\[ P^\infty := \{(u, -u) : u \in S\} \]
where $S = \partial B(o; 1)$ is the boundary of the unit ball $B(o; 1)$ in an infinite dimensional Banach space. Then (6.4.36) immediately yields that $\operatorname{cat}_{P^\infty}(P^\infty) = \infty$.

Example 6.4.45. Let $f \colon \mathbb{R}^N \to \mathbb{R}$ be a continuously differentiable function. Since $S^{N-1}$ is compact, it follows from the Extreme Value Theorem that there exists
\[ d > \sup_{u \in S^{N-1}} f(u). \]
It follows then from Theorem 6.4.42 that the number of critical points on $S^{N-1}$ is greater than or equal to $\operatorname{cat}_{S^{N-1}}(S^{N-1}) = 2$. However, this result is trivial. On the other hand, if $f$ is even, we can think of $f$ as a continuous mapping from $P^{N-1}$ into $\mathbb{R}$. Then by (6.4.36) and Theorem 6.4.42, $f$ has at least $N$ critical points in $P^{N-1}$ to which $N$ pairs $(-u, u)$ of critical points of $f$ on $S^{N-1}$ correspond. This is a nontrivial result.

If we combine this example and Theorem 6.4.42, we get the following assertion.

Theorem 6.4.46. Let $H$ be a real (separable) Hilbert space, $\dim H = \infty$, let the functional $\varphi \in C^1(H, \mathbb{R})$ be bounded below, even and let it satisfy $(PS)_c$ on $\partial B(o; 1) \subset H$ for any $c \geq \inf_{u \in \partial B(o; 1)} \varphi(u)$. Then $\varphi|_{\partial B(o; 1)}$ possesses infinitely many distinct pairs of critical points.

Proof. Since
\[ \partial B(o; 1) = \{u \in H : \psi(u) = 1\} \quad \text{where} \quad \psi(u) = (u, u) \]
is of the class $C^2(H, \mathbb{R})$, we can apply Theorem 6.4.42. Indeed, since $\varphi$ and $\psi$ are even, we can identify the antipodal points and define
\[ X := \{x = (u, -u) : u \in H\}, \qquad V := \{x \in X : \psi(u) = 1\}. \]
Since $V = P^\infty$, we have
\[ \operatorname{cat}_V(V) = \infty. \]
This completes the proof.
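The finite-dimensional statement of Example 6.4.45 can be checked by hand for $N = 2$. The sketch below is only an illustration: the even function $f(u) = a u_1^2 + b u_2^2 + 2c\,u_1 u_2$ and the coefficient values are assumptions, as are the function names. Parametrizing $S^1$ by $u = (\cos\theta, \sin\theta)$ gives $\frac{d}{d\theta} f = (b - a)\sin 2\theta + 2c \cos 2\theta$, and the code counts the sign changes of this derivative on a uniform grid.

```python
import math

def f_prime(theta, a=1.0, b=3.0, c=0.5):
    """d/dtheta of f(u) = a*u1^2 + b*u2^2 + 2*c*u1*u2 at u = (cos theta, sin theta)."""
    return (b - a) * math.sin(2 * theta) + 2 * c * math.cos(2 * theta)

def count_critical_points(n=1000):
    """Count sign changes of f' on a uniform grid of [0, 2*pi]."""
    values = [f_prime(2 * math.pi * k / n) for k in range(n + 1)]
    return sum(1 for v, w in zip(values, values[1:]) if v * w < 0)

critical_points = count_critical_points()
print("critical points on S^1:", critical_points)
print("antipodal pairs       :", critical_points // 2)
```

For a quadratic form the critical points on the sphere are the unit eigenvectors of its matrix, so the count of $N = 2$ antipodal pairs agrees with the lower bound $\operatorname{cat}_{P^{1}}(P^{1}) = 2$ from (6.4.36).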
Now we illustrate the connection between the critical points of functionals on manifolds in Banach spaces and the nonlinear eigenvalue problems. We present this fact by a simple example. Example 6.4.47. Set X W01,p (0, 1), p ≥ 2, 1 p ϕ(x) |x(t)| ˙ dt, ψ(x) 0
1
|x(t)|p dt,
x ∈ X.
0
Then ϕ and ψ satisfy all the above assumptions. The functional ϕ is bounded below on V by λ1 (see Example 6.3.5) and satisfies (PS)c on V for any level c ≥ λ1 . Indeed, let ϕ(xn ) → c,
ϕ (xn )∗ → 0 {xn }∞ n=1
for
xn ∈ V .
(6.4.37)
The first convergence in (6.4.37) implies that is a bounded sequence in X. Then the reflexivity of X implies that without loss of generality we can assume xn x in X,
6.4B. Lusternik–Schnirelmann Method
453
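and by the compact embedding $X = W_0^{1,p}(0, 1) \subset\subset L^p(0, 1)$ also $x_n \to x$ in $L^p(0, 1)$. But then $x \in V$, i.e., $x \ne o$. It follows from (6.4.37) that
$$\left\langle \left(\frac{\varphi}{\psi}\right)'(x_n), w \right\rangle = \langle \varphi'(x_n), w \rangle - \varphi(x_n) \langle \psi'(x_n), w \rangle \to 0 \tag{6.4.38}$$
uniformly for all $w \in X$, $\|w\| \le R$ (cf. Exercise 6.4.53). We can take $w := x_n - x$ in (6.4.38) (note that $\{x_n\}_{n=1}^\infty$ is bounded in $X$). Hence
$$\int_0^1 |\dot{x}_n(t)|^{p-2} \dot{x}_n(t) (\dot{x}_n(t) - \dot{x}(t)) \, dt - \varphi(x_n) \int_0^1 |x_n(t)|^{p-2} x_n(t) (x_n(t) - x(t)) \, dt \to 0.$$
Since also
$$\frac{1}{p} \langle \varphi'(x), x_n - x \rangle = \int_0^1 |\dot{x}(t)|^{p-2} \dot{x}(t) (\dot{x}_n(t) - \dot{x}(t)) \, dt \to 0$$
by the weak convergence $x_n \rightharpoonup x$ in $X$ and
$$\int_0^1 |x_n(t)|^{p-2} x_n(t) (x_n(t) - x(t)) \, dt \to 0$$
by the compact embedding $X = W_0^{1,p}(0, 1) \subset\subset L^p(0, 1)$, we obtain
$$\int_0^1 \left( |\dot{x}_n(t)|^{p-2} \dot{x}_n(t) - |\dot{x}(t)|^{p-2} \dot{x}(t) \right) (\dot{x}_n(t) - \dot{x}(t)) \, dt \to 0.$$
However, for $p \ge 2$ we have
$$\int_0^1 \left( |\dot{x}_n(t)|^{p-2} \dot{x}_n(t) - |\dot{x}(t)|^{p-2} \dot{x}(t) \right) (\dot{x}_n(t) - \dot{x}(t)) \, dt \ge c_p \int_0^1 |\dot{x}_n(t) - \dot{x}(t)|^p \, dt$$
with a constant $c_p > 0$ depending only on $p$ (one may take $c_p = 2^{2-p}$), i.e., $x_n \to x$ in $X$.
It follows from Theorem 6.4.46 that $\varphi$ has infinitely many distinct pairs of critical points $(x_i, -x_i)$, $i = 1, 2, \dots$, $x_i \in V$. It follows from Proposition 6.4.35 that there exist $\lambda_i$, $i = 1, 2, \dots$, such that
$$\varphi'(x_i) = \lambda_i \psi'(x_i).$$
But, since $\langle \psi'(x_i), x_i \rangle = p\,\psi(x_i)$ and $\langle \varphi'(x_i), x_i \rangle = p\,\varphi(x_i)$, we have
$$\varphi(x_i) = \lambda_i, \qquad i = 1, 2, \dots,$$
i.e., the critical values of $\varphi|_V$ are the eigenvalues of the problem
$$\int_0^1 |\dot{x}(t)|^{p-2} \dot{x}(t) \dot{y}(t) \, dt = \lambda \int_0^1 |x(t)|^{p-2} x(t) y(t) \, dt. \tag{6.4.39}$$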
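From the proof of Theorem 6.4.42 we have
$$\lambda_i = \inf_{A \in \mathcal{A}_i} \sup_{x \in A} \varphi(x) \tag{6.4.40}$$
where
$$\mathcal{A}_i := \{A \subset V : A \text{ is closed, symmetric, } \operatorname{cat}_V(A) \ge i\}.$$
We prove that
$$\lim_{i \to \infty} \lambda_i = \infty. \tag{6.4.41}$$
For this purpose we have to exclude the following two cases:
Case 1. There exists $n \in \mathbb{N}$ such that $\lambda_m = \lambda_n$ for any $m \ge n$.
Case 2. Case 1 does not occur but there exists $\Lambda \in \mathbb{R}$ such that $\lambda_i \nearrow \Lambda$.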
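If Case 1 occurs, then necessarily $\operatorname{cat}_{\varphi^{\lambda_m + 1}}(K_{\lambda_m}) = \infty$ by Theorem 6.4.41. However, the (PS)$_{\lambda_m}$ condition implies that $K_{\lambda_m}$ is a compact set and hence $\operatorname{cat}_{\varphi^{\lambda_m + 1}}(K_{\lambda_m}) < \infty$ (see Exercise 6.4.52), i.e., Case 1 is excluded.
If Case 2 occurs, then we argue as follows. Let $\varepsilon > 0$ be specified later and denote
$$K := \{x \in V : \lambda_1 \le \varphi(x) \le \Lambda + \varepsilon, \ \|\varphi'(x)\|_* = 0\}.$$
By the (PS)$_c$ condition we know that $K$ is compact, and hence
$$j := \operatorname{cat}_V(K) < \infty.$$
According to Proposition 6.4.33 there exists a closed neighborhood $\tilde{K}$ of $K$ in $V$ such that $\operatorname{cat}_V(\tilde{K}) = j$. In particular, if we set
$$S := \varphi^{\Lambda + \varepsilon} \setminus \tilde{K},$$
we can apply Lemma 6.4.38. Indeed, choose $\varepsilon$ and $\delta$ such that the assumptions of Lemma 6.4.38 are satisfied with $c = \Lambda$, and let $m$ be the smallest integer such that $\lambda_m > \Lambda - \varepsilon > \lambda_1$. Choose $A \in \mathcal{A}_{j+m}$ such that $\sup_{x \in A} \varphi(x) \le \Lambda + \varepsilon$ (this is possible due to the variational characterization of $\lambda_{j+m}$, see (6.4.40)) and set
$$B := \overline{A \setminus \tilde{K}}$$
(the closure is taken in the topology of $V$). Then according to Lemma 6.4.30(ii),
$$\operatorname{cat}_V(B) \ge m, \quad\text{i.e.,}\quad B \in \mathcal{A}_m.$$
It follows from Lemma 6.4.30(iii) that
$$\operatorname{cat}_V(\eta(1, B)) \ge \operatorname{cat}_V(B) \ge m, \quad\text{i.e.,}\quad \eta(1, B) \in \mathcal{A}_m.$$
But then according to Lemma 6.4.38(ii),
$$\lambda_m \le \sup_{x \in B} \varphi(\eta(1, x)) \le \Lambda - \varepsilon < \lambda_m,$$
a contradiction. Hence Case 2 is also excluded, and (6.4.41) is proved.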
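For $p = 2$ the problem (6.4.39) is linear and the sines $\sin(k\pi t)$ are eigenfunctions on $(0,1)$, so the growth of the critical values can be checked by hand. The following Python sketch is only an illustration under that assumption; the helper names `simpson` and `rayleigh_quotient` are ours and do not come from the text. Since $\varphi$ and $\psi$ are both $p$-homogeneous, the quotient $\varphi(x)/\psi(x)$ equals the value of $\varphi$ at the normalized point $x/\psi(x)^{1/p} \in V$.

```python
import math

def simpson(func, a, b, n=2000):
    """Composite Simpson rule on [a, b] with an even number n of subintervals."""
    if n % 2:
        n += 1
    h = (b - a) / n
    total = func(a) + func(b)
    for i in range(1, n):
        total += (4 if i % 2 else 2) * func(a + i * h)
    return total * h / 3.0

def rayleigh_quotient(k, p=2.0):
    """phi(x) / psi(x) for the trial function x(t) = sin(k*pi*t) on (0, 1)."""
    phi = simpson(lambda t: abs(k * math.pi * math.cos(k * math.pi * t)) ** p, 0.0, 1.0)
    psi = simpson(lambda t: abs(math.sin(k * math.pi * t)) ** p, 0.0, 1.0)
    return phi / psi

# For p = 2 the quotient should reproduce (k*pi)^2, which is unbounded in k.
for k in range(1, 5):
    print(k, rayleigh_quotient(k), (k * math.pi) ** 2)
```

For $p \ne 2$ the sines are in general no longer eigenfunctions, so the same quotient only gives an upper bound for $\lambda_1$ (with $k = 1$), not the eigenvalues themselves.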
Remark 6.4.48. Using the technique of ordinary differential equations it is possible to prove that the set $\{\lambda_n\}_{n=1}^\infty$ represents all eigenvalues of (6.4.39) and that every $\lambda_n$, $n = 1, 2, \dots$, is simple (see, e.g., Elbert [47], Došlý [37] and references therein). The same approach as above can be used to prove the existence of an infinite sequence of eigenvalues, approaching infinity, of the $p$-Laplacian in more dimensions. Contrary to the one-dimensional case it is not clear if such a sequence exhausts all the eigenvalues or not. This has been a long standing and challenging open problem of nonlinear analysis. Note that the assumption $p \ge 2$ can be relaxed to $p > 1$. However, $V$ is not a manifold of the class $C^2$ for $1 < p < 2$ and so a more general approach must be employed (see, e.g., Ghoussoub [58]).
Remark 6.4.49. Similar and more general minimax arguments can be found in the literature where instead of the (Lusternik–Schnirelmann) category a more general concept of the relative category is used.
One can develop an abstract index theory where the index of a set (an analogue of the category) satisfies certain axioms and similar results to those in this section can be proved. The reader can find various kinds of indices: Krasnoselski genus, $S^1$-index of Benci, cohomological index of Fadell–Rabinowitz, etc. (see, e.g., Zeidler [136] and references therein). The notion of a category is, in a certain sense, a maximal function satisfying the key properties of Lemma 6.4.30 (cf. Exercise 6.4.54).
Exercise 6.4.50. Give the proof of Lemma 6.4.38 in detail. Watch carefully for the moment when the assumption $\psi \in C^2(X, \mathbb{R})$ is essential.
Exercise 6.4.51. Every closed set $A^* \subset P^{N-1}$ corresponds to a symmetric closed set $A \subset S^{N-1}$ and vice versa as follows:
$$x \in A \quad\text{if and only if}\quad (x, -x) \in A^*.$$
Prove that $\operatorname{cat}_{P^{N-1}}(A^*) = 1$ if and only if there exists $\tilde{A}$ such that
$$A = \tilde{A} \cup (-\tilde{A}) \quad\text{and}\quad \tilde{A} \cap (-\tilde{A}) = \emptyset.$$
Hint. $\operatorname{cat}_{P^{N-1}}(A^*) = 1$ if and only if there exist an odd continuous mapping $f : [0, 1] \times S^{N-1} \to S^{N-1}$ and a point $a \in S^{N-1}$ such that for $(x, -x) \in A^*$ we have
$$(f(0, x), f(0, -x)) = (-x, x) \quad\text{and}\quad (f(1, x), f(1, -x)) = (a, -a);$$
take $\tilde{A} = \{x \in S^{N-1} : f(1, x) = a\}$.
Exercise 6.4.52. Prove that if $K$ is a compact subset of a manifold $V$ of the class $C^2$, then $\operatorname{cat}_V(K) < \infty$.
Hint. For any $u \in K$ there exists $B(u; R(u))$ such that $B(u; R(u)) \cap V$ is contractible to a point in $V$ (use Remark 6.4.43(i) and the fact that $V$ is a manifold of the class $C^2$); $\bigcup_{u \in K} (B(u; R(u)) \cap V)$ is an open covering of $K$, choose $\bigcup_{i=1}^n B(u_i; R(u_i))$ a finite subcovering of $K$, use Lemma 6.4.30(ii) to show $\operatorname{cat}_V(K) \le n$.
Exercise 6.4.53. Prove (6.4.38).
Hint. Split $w := t_n x_n + y_n$, $t_n \in \mathbb{R}$, $\langle \psi'(x_n), y_n \rangle = 0$. Using the facts that $\{x_n\}_{n=1}^\infty$ is bounded in $X$ and $x_n \to x$ in $L^2(0, 1)$ prove that for any $R > 0$ there exists $\hat{R} > 0$ such that $\|w\| \le R$ implies $\|y_n\| \le \hat{R}$ for all $n \in \mathbb{N}$. Now, take also into account $\psi(x_n) = 1$, $\langle \varphi'(x_n), x_n \rangle = p\,\varphi(x_n)$, $\langle \psi'(x_n), x_n \rangle = p$, in order to get
$$\left\langle \left(\frac{\varphi}{\psi}\right)'(x_n), w \right\rangle = \frac{\langle \varphi'(x_n), w \rangle \psi(x_n) - \varphi(x_n) \langle \psi'(x_n), w \rangle}{\psi^2(x_n)}$$
$$= p\,t_n \varphi(x_n) + \langle \varphi'(x_n), y_n \rangle - p\,t_n \varphi(x_n) - \varphi(x_n) \langle \psi'(x_n), y_n \rangle = \langle \varphi'(x_n), y_n \rangle \to 0$$
uniformly with respect to $w \in X$, $\|w\| \le R$.
Exercise 6.4.54. Prove the following assertion: Let $\Phi_Y$ be a function defined on the class $\mathcal{A}$ of closed subsets $A$ of $Y$. If $\Phi_Y$ possesses properties (i)–(iii) of Lemma 6.4.30 and $\Phi_Y(A) = 1$ when $A$ consists of a single point, then
$$\Phi_Y(A) \le \operatorname{cat}_Y(A) \quad\text{for all}\quad A \in \mathcal{A}.$$
Hint. Let $\operatorname{cat}_Y(A) = 1$. Since $A$ is contractible to a point $u_0 \in Y$, property (iii) of Lemma 6.4.30 yields $\Phi_Y(A) \le \Phi_Y(\{u_0\}) = 1$. Now the assertion follows by using a covering and (ii) of Lemma 6.4.30.
Exercise 6.4.55. Prove that the set of all $x \in V$ satisfying (i) and (ii) from Theorem 6.4.39 is not empty.
6.5 Saddle Point Theorem
The main assertion in this section, the Saddle Point Theorem, is a useful tool to prove the existence of a critical point which is neither a local minimum nor a local maximum of a given functional. Let us start again by considering a real function of two independent variables
$$F : \mathbb{R} \times \mathbb{R} \to \mathbb{R}$$
which is continuously differentiable and satisfies the following condition: There exists $\varrho > 0$ such that
$$\inf_{y \in \mathbb{R}} F(0, y) > \max \{F(-\varrho, 0), F(\varrho, 0)\}. \tag{6.5.1}$$
[Figure 6.5.1]
The graph of such a function is sketched in Figure 6.5.1. The impression one can get from the graph is that
$$c = \inf_{\gamma \in \Gamma} \max_{t \in [-\varrho, \varrho]} F(\gamma(t))$$
where
$$\Gamma = \{\gamma \in C([-\varrho, \varrho], \mathbb{R}^2) : \gamma(-\varrho) = (-\varrho, 0), \ \gamma(\varrho) = (\varrho, 0)\}$$
is a critical value of the functional $F$. The following example, however, shows that this is not the case in general.
Example 6.5.1. Let
$$F(x, y) = 2e^{-x^2} + e^y$$
(see Figure 6.5.2). Set $\varrho = 1$. Then we have
$$\inf_{y \in \mathbb{R}} F(0, y) = 2 > \max \{F(-1, 0), F(1, 0)\} = \frac{2}{e} + 1,$$
i.e., (6.5.1) is satisfied.
[Figure 6.5.2]
On the other hand, since
$$\frac{\partial F}{\partial x}(x, y) = -4x e^{-x^2}, \qquad \frac{\partial F}{\partial y}(x, y) = e^y,$$
there is no critical point of $F$.
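The failure in Example 6.5.1 can be made concrete numerically. The sketch below is an illustration only: the polygonal path and the helper names `path_max` and `grad_norm` are our own choices. It evaluates $F$ along a path from $(-1,0)$ to $(1,0)$ that dips to depth $y = -L$, and it evaluates the gradient along the points $(0,-n)$. The minimax level is approached as $L$ grows while the gradient tends to zero along a sequence that escapes to infinity, which is exactly the behaviour a Palais–Smale type condition rules out.

```python
import math

def F(x, y):
    return 2.0 * math.exp(-x * x) + math.exp(y)

def grad_norm(x, y):
    """Euclidean norm of the gradient of F at (x, y)."""
    gx = -4.0 * x * math.exp(-x * x)
    gy = math.exp(y)
    return math.hypot(gx, gy)

def path_max(depth, samples=2001):
    """Max of F along the path (-1,0) -> (-1,-depth) -> (1,-depth) -> (1,0)."""
    best = -math.inf
    for i in range(samples):
        s = i / (samples - 1)
        # vertical legs x = -1 and x = 1, with y running from 0 down to -depth
        best = max(best, F(-1.0, -depth * s), F(1.0, -depth * s))
        # horizontal leg at height y = -depth, with x running from -1 to 1
        best = max(best, F(-1.0 + 2.0 * s, -depth))
    return best

print("boundary value:", max(F(-1.0, 0.0), F(1.0, 0.0)))
for depth in (1.0, 5.0, 10.0):
    print(depth, path_max(depth), grad_norm(0.0, -depth))
```

Every admissible path crosses the line $x = 0$, where $F > 2$, so the level $c = 2$ is a lower bound that these paths approach but never attain.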
The reason why the geometric condition (6.5.1) is not sufficient to guarantee the existence of a critical point of $F$ is the same as in the previous section. If we introduce the assumption (PS)$_c$ (see page 428), then the value $c$ is a critical value of $F$ provided $F$ satisfies (PS)$_c$.
Let us consider a more general situation
$$F : H \to \mathbb{R}$$
where $H$ is a real Hilbert space. We will use the Quantitative Deformation Lemma (Lemma 6.4.2) and prove the following analogue of Proposition 6.4.3.
Proposition 6.5.2. Let $F \in C^2(H, \mathbb{R})$ and let $H = Y \oplus Z$ where $\dim Y < \infty$ and $Z$ is a closed subspace of $H$. Moreover, assume that there is $\varrho > 0$ such that, denoting
$$M = \{u \in Y : \|u\| \le \varrho\}, \qquad M_0 = \{u \in Y : \|u\| = \varrho\},$$
we have
$$\inf_{u \in Z} F(u) > \max_{u \in M_0} F(u). \tag{6.5.2}$$
Let
$$c := \inf_{\gamma \in \Gamma} \max_{u \in M} F(\gamma(u)) \quad\text{where}\quad \Gamma := \{\gamma \in C(M, H) : \gamma|_{M_0} = I\}.$$
Then for each $\varepsilon > 0$ there exists $u \in H$ such that
(a) $c - 2\varepsilon \le F(u) \le c + 2\varepsilon$,
(b) $\|\nabla F(u)\| < 2\varepsilon$.
Proof. First of all we will show that
$$\tilde{c} := \inf_{u \in Z} F(u) \le c.$$
To establish this inequality it is sufficient to prove that for any $\gamma \in \Gamma$ there is a point $\tilde{u} \in M$ for which $\gamma(\tilde{u}) \in Z$. Let $P$ be a continuous projection of $H$ into $Y$ such that $\operatorname{Ker} P = Z$.
With this $P$ we wish to find a solution in $M$ of the equation
$$P\gamma(u) = o. \tag{6.5.3}$$
To do that we will use the Brouwer degree. Since $P\gamma|_{M_0} = I$, the homotopy invariance property (Proposition 5.2.6 and Theorem 5.2.7) yields that
$$\deg(P\gamma, \operatorname{int} M, o) = \deg(I, \operatorname{int} M, o) = 1.$$
Therefore (6.5.3) has a solution in $M$ (again Theorem 5.2.7).
Suppose that the conclusion of this proposition does not hold, i.e., assume that $\varepsilon > 0$ is so small that
$$\max_{u \in M_0} F(u) < c - 2\varepsilon,$$
and for all $u \in H$ satisfying (a) the condition (b) is violated. By the definition of $c$ there exists $\gamma \in \Gamma$ such that⁶⁶
$$\max_{u \in M} F(\gamma(u)) \le c + \varepsilon. \tag{6.5.4}$$
Consider $\beta(u) = \eta(\gamma(u))$ where $\eta$ is from Lemma 6.4.2. Using Lemma 6.4.2(i) we conclude that for $u \in M_0$ we have
$$\beta(u) = \eta(\gamma(u)) = \eta(u) = u, \quad\text{i.e.,}\quad \beta \in \Gamma, \quad\text{i.e.,}\quad c \le \max_{u \in M} F(\beta(u)).$$
On the other hand, it follows from Lemma 6.4.2(ii) and (6.5.4) that
$$\max_{u \in M} F(\beta(u)) \le c - \varepsilon,$$
a contradiction.
Similarly to the previous section, employing the (PS)$_c$ condition, we have the following assertion called the Saddle Point Theorem.
Theorem 6.5.3 (Saddle Point Theorem). Let the assumptions of Proposition 6.5.2 be satisfied. Let $F$ satisfy (PS)$_c$. Then $c$ is a critical value of $F$.
Remark 6.5.4. The reader should have in mind that the Saddle Point Theorem was also proved under more general assumptions when $H$ is a Banach space and $F \in C^1(H, \mathbb{R})$. In this more general form it is attributed to Rabinowitz (see Theorem 6.5.12).
Example 6.5.5. Let us consider the boundary value problem
$$-\ddot{x}(t) - x(t) = f(t) + g(x(t)), \quad t \in (0, \pi), \qquad x(0) = x(\pi) = 0, \tag{6.5.5}$$
where $f \in L^2(0, \pi)$ is a given function and $g : \mathbb{R} \to \mathbb{R}$ is a continuous function having finite limits $\lim_{s \to \pm\infty} g(s) = g(\pm\infty)$ and such that
$$g(-\infty) < g(s) < g(+\infty) \quad\text{for all } s \in \mathbb{R}.$$
We will prove that the problem (6.5.5) has a weak solution if and only if
$$-g(+\infty) < \frac{1}{2} \int_0^\pi f(t) \sin t \, dt < -g(-\infty). \tag{6.5.6}$$
First let us prove that (6.5.6) is a necessary condition for the solvability of (6.5.5). Assume that $x \in H := W_0^{1,2}(0, \pi)$⁶⁷ is a weak solution of (6.5.5), i.e.,
$$\int_0^\pi (\dot{x}(t) \dot{y}(t) - x(t) y(t)) \, dt = \int_0^\pi f(t) y(t) \, dt + \int_0^\pi g(x(t)) y(t) \, dt$$
for any $y \in H$. Take $y = \sin t$, then
$$\int_0^\pi f(t) \sin t \, dt = -\int_0^\pi g(x(t)) \sin t \, dt.$$
However, an easy calculation yields
$$2g(-\infty) < \int_0^\pi g(x(t)) \sin t \, dt < 2g(+\infty).$$
To prove that (6.5.6) is also a sufficient condition we apply Theorem 6.5.3. Define
$$F(x) = \frac{1}{2} \int_0^\pi |\dot{x}(t)|^2 \, dt - \frac{1}{2} \int_0^\pi |x(t)|^2 \, dt - \int_0^\pi \int_0^{x(t)} g(s) \, ds \, dt - \int_0^\pi f(t) x(t) \, dt, \qquad x \in H.$$
Let us verify that $F$ has a suitable geometry which corresponds to the Saddle Point Theorem. Let $Y = \operatorname{Lin}\{\sin t\}$, $Z = Y^\perp$.⁶⁸ For $z \in Z$ we have
$$\int_0^\pi |z(t)|^2 \, dt \le \frac{1}{4} \int_0^\pi |\dot{z}(t)|^2 \, dt \tag{6.5.7}$$
(cf. Exercise 6.5.6). From (6.5.7) we get that $F$ is weakly coercive on $Z$. Namely, we have
$$F(z) \ge \frac{3}{8} \int_0^\pi |\dot{z}(t)|^2 \, dt - \int_0^\pi \left| \int_0^{z(t)} g(s) \, ds \right| dt - \|f\|_{L^2(0,\pi)} \|z\|_{L^2(0,\pi)} \ge \frac{3}{8} \|z\|^2 - c \|z\|_{L^2(0,\pi)} \ge \|z\| \left( \frac{3}{8} \|z\| - \frac{c}{2} \right),$$
and so $F(z) \to \infty$ for $\|z\| \to \infty$, $z \in Z$. The functional $F$ is weakly sequentially lower semi-continuous on $Z$ (the argument is the same as that used in Example 6.2.6 and the reader should check it carefully). Then it follows from Theorem 6.2.8 that there exists $z_0 \in Z$ such that
$$-\infty < F(z_0) = \min_{z \in Z} F(z).$$
66 Note that the “max” exists due to the assumption $\dim Y < \infty$.
67 We consider the norm $\|x\| = \left( \int_0^\pi |\dot{x}(t)|^2 \, dt \right)^{1/2}$ on $H$.
68 $H = Y \oplus Z$. Notice that $Y$ and $Z$ are orthogonal to each other also in the $L^2$-scalar product.
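On the other hand,
$$F(\varrho \sin t) = \underbrace{\frac{\varrho^2}{2} \int_0^\pi \cos^2 t \, dt - \frac{\varrho^2}{2} \int_0^\pi \sin^2 t \, dt}_{=0} - \int_0^\pi \int_0^{\varrho \sin t} g(s) \, ds \, dt - \varrho \int_0^\pi f(t) \sin t \, dt.$$
Denote
$$\int_0^\sigma g(s) \, ds = G(\sigma).$$
Then
$$F(\varrho \sin t) = -\varrho \left[ \int_0^\pi \frac{G(\varrho \sin t)}{\varrho} \, dt + \int_0^\pi f(t) \sin t \, dt \right].$$
Since, by l'Hospital's Rule,
$$\lim_{\varrho \to \pm\infty} \frac{G(\varrho \sin t)}{\varrho} = \lim_{\varrho \to \pm\infty} g(\varrho \sin t) \sin t = g(\pm\infty) \sin t,$$
the Lebesgue Dominated Convergence Theorem and (6.5.6) yield
$$\lim_{|\varrho| \to \infty} F(\varrho \sin t) = -\infty. \tag{6.5.8}$$
Taking $\varrho_0$ large enough we then have $F(\pm\varrho_0 \sin t) < F(z_0)$, i.e., the assumptions of Theorem 6.5.3 are satisfied with
$$M = \{\varrho \sin t : \varrho \in [-\varrho_0, \varrho_0]\}, \qquad M_0 = \{-\varrho_0 \sin t, \ \varrho_0 \sin t\}.$$
It remains to prove that $F$ satisfies (PS)$_c$. Similarly to Example 6.4.7 we will prove that $F$ satisfies even a stronger version of (PS)$_c$:
Any sequence $\{x_n\}_{n=1}^\infty \subset H$ such that $\{F(x_n)\}_{n=1}^\infty$ is bounded in $\mathbb{R}$ and $\nabla F(x_n) \to o$ contains a convergent subsequence.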
To prove it we follow the usual scheme.
Step 1. We will first show that $\{x_n\}_{n=1}^\infty$ is bounded in $H$. To do that, decompose $x_n = y_n + z_n$ where $y_n \in Y$ (i.e., $y_n(t) = \varrho_n \sin t$) and $z_n \in Z = Y^\perp$. First we prove that $\{z_n\}_{n=1}^\infty$ is bounded in $H$. To see this consider $(\nabla F(x_n), z_n)_H$. We have⁶⁹
$$(\nabla F(x_n), z_n) = \int_0^\pi \dot{x}_n(t) \dot{z}_n(t) \, dt - \int_0^\pi x_n(t) z_n(t) \, dt - \int_0^\pi z_n(t) g(x_n(t)) \, dt - \int_0^\pi f(t) z_n(t) \, dt$$
$$= \|z_n\|^2 - \|z_n\|_{L^2(0,\pi)}^2 - \int_0^\pi z_n(t) g(x_n(t)) \, dt - \int_0^\pi f(t) z_n(t) \, dt$$
$$\ge \frac{3}{4} \|z_n\|^2 - k \|z_n\|_{L^2(0,\pi)} \ge \frac{3}{4} \|z_n\|^2 - \frac{k}{2} \|z_n\|$$
with a positive constant $k$ (for the last two inequalities we have used (6.5.7)). Since we have assumed that $\nabla F(x_n) \to o$ we know that
$$\|\nabla F(x_n)\| \le 1 \quad\text{for all sufficiently large } n.$$
In particular, this means that for all these $n$ the inequalities
$$(\nabla F(x_n), z_n) \le \|z_n\|$$
hold. Hence
$$\frac{3}{4} \|z_n\|^2 - \frac{k}{2} \|z_n\| \le \|z_n\|,$$
and the boundedness of $\{z_n\}_{n=1}^\infty$ is shown. For the investigation of $y_n$ we will use the boundedness of $F(x_n)$. We have
$$F(x_n) = F(y_n + z_n) = \frac{1}{2} \|y_n\|^2 + \frac{1}{2} \|z_n\|^2 - \frac{1}{2} \|y_n\|_{L^2(0,\pi)}^2 - \frac{1}{2} \|z_n\|_{L^2(0,\pi)}^2$$
$$\quad - \int_0^\pi \int_0^{y_n(t)} g(s) \, ds \, dt - \int_0^\pi \int_{y_n(t)}^{y_n(t) + z_n(t)} g(s) \, ds \, dt - \int_0^\pi f(t) y_n(t) \, dt - \int_0^\pi f(t) z_n(t) \, dt$$
$$= F(y_n) + \frac{1}{2} \|z_n\|^2 - \frac{1}{2} \|z_n\|_{L^2(0,\pi)}^2 - \int_0^\pi \int_{y_n(t)}^{y_n(t) + z_n(t)} g(s) \, ds \, dt - \int_0^\pi f(t) z_n(t) \, dt.$$
By the boundedness of $\{z_n\}_{n=1}^\infty$, $g$ and $\{F(x_n)\}_{n=1}^\infty$, we obtain that $\{F(y_n)\}_{n=1}^\infty$ is bounded. Since $y_n(t) = \varrho_n \sin t$ and $\lim_{|\varrho| \to \infty} F(\varrho \sin t) = -\infty$ (see (6.5.8)), $\{\varrho_n\}_{n=1}^\infty$ and also $\|y_n\|$ have to be bounded.
69 See footnote 68.
Step 2. Passing to a subsequence if necessary we may assume that $x_n \rightharpoonup x_0$ in $H$ and $x_n \to x_0$ in $C[0, \pi]$. Since $\nabla F(x_n) \to o$ we get
$$(\nabla F(x_n) - \nabla F(x_0), x_n - x_0) \to 0 \quad\text{as } n \to \infty.$$
This means that
$$\int_0^\pi |\dot{x}_n(t) - \dot{x}_0(t)|^2 \, dt - \int_0^\pi |x_n(t) - x_0(t)|^2 \, dt - \int_0^\pi (g(x_n(t)) - g(x_0(t)))(x_n(t) - x_0(t)) \, dt - \int_0^\pi f(t)(x_n(t) - x_0(t)) \, dt \to 0.$$
However, the last three integrals tend to zero, i.e.,
$$\|x_n - x_0\| \to 0 \quad\text{as } n \to \infty.$$
Hence $x_n \to x_0$ in $H$.
Exercise 6.5.6. Prove the inequality (6.5.7) for $z \in Z$.
Hint. Remember that $\left\{ \sqrt{\tfrac{2}{\pi}} \sin nt \right\}_{n=1}^\infty$, $\left\{ \sqrt{\tfrac{2}{\pi}} \cos nt \right\}_{n=0}^\infty$ form an orthonormal basis in $L^2(0, \pi)$. For $z \in Z$ one has
$$z(t) = \sum_{k=2}^\infty a_k \sin kt$$
and (the Parseval equality)
$$\|z\|_{L^2(0,\pi)}^2 = \frac{\pi}{2} \sum_{k=2}^\infty a_k^2.$$
A similar argument with $\dot{z}$ and integration by parts leads to (6.5.7).
Exercise 6.5.7. Find conditions on $\Phi : [0, 1] \times \mathbb{R} \to \mathbb{R}$ such that the procedure for proving the existence of a solution of (6.5.5) could be used for the boundary value problem
$$-\ddot{x}(t) = \frac{\partial \Phi}{\partial x}(t, x(t)), \quad t \in (0, 1), \qquad x(0) = x(1) = 0.$$
Exercise 6.5.8. Consider the boundary value problem
$$-\ddot{x}(t) - \lambda x(t) = g(t, x(t)), \quad t \in (0, \pi), \qquad x(0) = x(\pi) = 0. \tag{6.5.9}$$
Formulate conditions on $\lambda$ and $g$ which guarantee that the energy functional associated with (6.5.9) has a geometry corresponding to the Saddle Point Theorem.
Exercise 6.5.9. Consider the Dirichlet boundary value problem
$$-\ddot{x}(t) = h(t, x(t)), \quad t \in (0, \pi), \qquad x(0) = x(\pi) = 0. \tag{6.5.10}$$
Formulate conditions on $h = h(t, x)$ which guarantee that (6.5.10) has a weak solution.
Exercise 6.5.10. Consider the Neumann boundary value problem
$$-\ddot{x}(t) = h(t, x(t)), \quad t \in (0, \pi), \qquad \dot{x}(0) = \dot{x}(\pi) = 0. \tag{6.5.11}$$
Formulate conditions on $h = h(t, x)$ which guarantee that (6.5.11) has a weak solution.
Exercise 6.5.11. Consider the Dirichlet boundary value problem
$$-\ddot{x}(t) - n^2 x(t) = f(t) + g(x(t)), \quad t \in (0, \pi), \qquad x(0) = x(\pi) = 0, \tag{6.5.12}$$
where $n \in \mathbb{N}$, $f \in L^2(0, \pi)$, $g : \mathbb{R} \to \mathbb{R}$ is a continuous function having finite limits $\lim_{s \to \pm\infty} g(s) = g(\pm\infty)$ and such that $g(-\infty) < g(s) < g(+\infty)$ for all $s$. Prove that (6.5.12) has a weak solution if and only if
$$g(-\infty) \int_0^\pi (\sin nt)^- \, dt - g(+\infty) \int_0^\pi (\sin nt)^+ \, dt < \int_0^\pi f(t) \sin nt \, dt < g(+\infty) \int_0^\pi (\sin nt)^- \, dt - g(-\infty) \int_0^\pi (\sin nt)^+ \, dt$$
where $(\sin nt)^+$ and $(\sin nt)^-$ are the positive and the negative part of $\sin nt$, respectively.
Hint. Modify the estimates from Example 6.5.5.
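The solvability condition (6.5.6) of Example 6.5.5 is easy to test for concrete data. The following Python sketch is an illustration under our own assumptions: the nonlinearity $g(s) = \arctan s$ (so $g(\pm\infty) = \pm\pi/2$), the right-hand sides $f(t) = a \sin t$, and the helper names `simpson` and `solvability_margin` are not taken from the text. It computes $\frac12\int_0^\pi f(t)\sin t\,dt$ by Simpson's rule and reports whether the value lies strictly between $-g(+\infty)$ and $-g(-\infty)$.

```python
import math

def simpson(func, a, b, n=1000):
    """Composite Simpson rule on [a, b] with an even number n of subintervals."""
    if n % 2:
        n += 1
    h = (b - a) / n
    total = func(a) + func(b)
    for i in range(1, n):
        total += (4 if i % 2 else 2) * func(a + i * h)
    return total * h / 3.0

def solvability_margin(f, g_minus_inf, g_plus_inf):
    """Return (value, ok) with value = (1/2) * int_0^pi f(t) sin t dt
    and ok = True when condition (6.5.6) holds for these limits of g."""
    value = 0.5 * simpson(lambda t: f(t) * math.sin(t), 0.0, math.pi)
    return value, (-g_plus_inf < value < -g_minus_inf)

# Illustrative choice g(s) = arctan(s): g(-inf) = -pi/2, g(+inf) = pi/2.
g_lo, g_hi = -math.pi / 2, math.pi / 2
print(solvability_margin(lambda t: math.sin(t), g_lo, g_hi))
print(solvability_margin(lambda t: 3.0 * math.sin(t), g_lo, g_hi))
```

For $f(t) = a \sin t$ the integral equals $a\pi/4$, so with this $g$ the condition holds exactly when $|a| < 2$; components of $f$ orthogonal to $\sin t$ in $L^2(0,\pi)$ do not affect it.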
6.5A Linking Theorem
The aim of this appendix is to apply the General Minimax Principle (see Theorem 6.4.22) from Appendix 6.4A and to generalize the assertion of Theorem 6.5.3. Namely, we start with the Saddle Point Theorem (cf. Remark 6.5.4).
Theorem 6.5.12 (Saddle Point Theorem, Rabinowitz). Let $X = Y \oplus Z$ be a Banach space with $Z$ closed in $X$ and $\dim Y < \infty$. For $\varrho > 0$ define
$$M := \{u \in Y : \|u\| \le \varrho\}, \qquad M_0 := \{u \in Y : \|u\| = \varrho\}.$$
Let $F \in C^1(X, \mathbb{R})$ be such that
$$b := \inf_{u \in Z} F(u) > a := \max_{u \in M_0} F(u).$$
If $F$ satisfies the (PS)$_c$ condition with
$$c := \inf_{\gamma \in \Gamma} \max_{u \in M} F(\gamma(u)) \quad\text{where}\quad \Gamma := \{\gamma \in C(M, X) : \gamma|_{M_0} = I\},$$
then $c$ is a critical value of $F$.
Proof. We set $\Gamma_0 = \{I\}$ and apply Theorem 6.4.22 and Corollary 6.4.23. For this purpose it is enough to verify that $c \ge b$. Let us prove that $\gamma(M) \cap Z \ne \emptyset$ for every $\gamma \in \Gamma$. Denote by $P$ the projection onto $Y$ such that $PZ = \{o\}$. If $\gamma(M) \cap Z = \emptyset$, then the map
$$u \mapsto \varrho \frac{P\gamma(u)}{\|P\gamma(u)\|}$$
is a retraction⁷⁰ of the ball $M$ onto its boundary $M_0$ in the space $Y$. But this is impossible since $\dim Y < \infty$. Hence, for every $\gamma \in \Gamma$,
$$\max_{u \in M} F(\gamma(u)) \ge \inf_{u \in Z} F(u), \quad\text{i.e.,}\quad c \ge b.$$
This completes the proof.
We postpone the application of Theorem 6.5.12 to Appendix 7.7A. We prove now the Linking Theorem.
Theorem 6.5.13 (Linking Theorem, Rabinowitz). Let $X = Y \oplus Z$ be a Banach space with $Z$ closed in $X$ and $\dim Y < \infty$. Let $\varrho > r > 0$ and let $z \in Z$ be such that $\|z\| = r$. Define
$$M := \{u = y + \lambda z : \|u\| \le \varrho, \ \lambda \ge 0, \ y \in Y\}, \qquad N := \{u \in Z : \|u\| = r\},$$
$$M_0 := \{u = y + \lambda z : y \in Y, \ \|u\| = \varrho \text{ and } \lambda \ge 0, \text{ or } \|u\| \le \varrho \text{ and } \lambda = 0\}.$$
Let $F \in C^1(X, \mathbb{R})$ be such that
$$b := \inf_{u \in N} F(u) > a := \max_{u \in M_0} F(u).$$
If $F$ satisfies the (PS)$_c$ condition with
$$c := \inf_{\gamma \in \Gamma} \max_{u \in M} F(\gamma(u)) \quad\text{where}\quad \Gamma := \{\gamma \in C(M, X) : \gamma|_{M_0} = I\},$$
then $c$ is a critical value of $F$.
70 For the notion of the retraction see Exercise 6.5.16.
Proof. We set $\Gamma_0 = \{I\}$ and apply Theorem 6.4.22 and Corollary 6.4.23. As in the previous proof it is sufficient to verify that $c \ge b$. Now, we prove that $\gamma(M) \cap N \ne \emptyset$ for every $\gamma \in \Gamma$. Denote by $P$ the projection onto $Y$ such that $PZ = \{o\}$, and by $R$ the retraction of $(Y \oplus \mathbb{R}z) \setminus \{z\}$ to $M_0$,⁷¹ see Figure 6.5.3. If $\gamma(M) \cap N = \emptyset$, then the map
$$u \mapsto R\left( P\gamma(u) + \frac{1}{r} \|(I - P)\gamma(u)\| z \right)$$
is well defined on $M$ and hence it is a retraction of $M$ to its boundary $M_0$. This is impossible since $M$ is homeomorphic to a finite dimensional ball (see Exercise 6.5.16). Hence for every $\gamma \in \Gamma$ we obtain
$$\max_{u \in M} F(\gamma(u)) \ge \inf_{u \in N} F(u), \quad\text{i.e.,}\quad c \ge b.$$
[Figure 6.5.3]
71 By $\mathbb{R}z$ we denote the set $\{tz : t \in \mathbb{R}\} \subset Z$.
As an application we give the following example.
Example 6.5.14. Let us consider the boundary value problem
$$-\ddot{x}(t) + a(t) x(t) = f(t, x(t)), \quad t \in (0, 1), \qquad x(0) = x(1) = 0, \tag{6.5.13}$$
where $a = a(t)$ is a continuous function on $[0, 1]$ and $f = f(t, s)$ is a continuous function on $[0, 1] \times \mathbb{R}$ satisfying some additional hypotheses formulated below. It follows from the Sturm–Liouville theory for linear ordinary differential equations of the second order (see Example 2.2.17 and, e.g., Walter [131]) that the eigenvalues $\{\lambda_n\}_{n=1}^\infty$ of
$$-\ddot{x}(t) + a(t) x(t) = \lambda x(t), \quad t \in (0, 1), \qquad x(0) = x(1) = 0$$
form a strictly increasing sequence where each eigenvalue is simple and $\lim_{n \to \infty} \lambda_n = \infty$. If we denote by $e_1, e_2, \dots, e_n, \dots$ the corresponding eigenfunctions, then they are mutually orthogonal in $L^2(0, 1)$. Suppose that there exists $k \in \mathbb{N}$ such that
$$\lambda_1 < \lambda_2 < \dots < \lambda_k < 0 < \lambda_{k+1} < \lambda_{k+2} < \cdots.$$
Let us assume that $f$ satisfies the following assumptions:
(f1) there exist $p \in (1, 2)$ and $c > 0$ such that $|f(t, s)| \le c \left( 1 + |s|^{p-1} \right)$ for any $t \in [0, 1]$, $s \in \mathbb{R}$;
(f2) there exist $\alpha > 2$ and $R > 0$ such that for $t \in [0, 1]$ and $|s| > R$ we have $0 < \alpha \int_0^s f(t, \sigma) \, d\sigma \le s f(t, s)$;
(f3) $f(t, s) = o(|s|)$ as $|s| \to 0$ uniformly on $[0, 1]$;
(f4) $\lambda_k \dfrac{s^2}{2} \le \int_0^s f(t, \sigma) \, d\sigma$ for all $s \in \mathbb{R}$ and $t \in [0, 1]$.
We now consider the functional
$$F(x) := \int_0^1 \left[ \frac{1}{2} |\dot{x}(t)|^2 + \frac{1}{2} a(t) |x(t)|^2 - \int_0^{x(t)} f(t, s) \, ds \right] dt$$
on $H := W_0^{1,2}(0, 1)$.⁷² The critical points of $F$ correspond to weak solutions of (6.5.13). Our plan is to apply the Linking Theorem (Theorem 6.5.13) so as to prove that $F$ has a critical point. Then the existence of a solution of (6.5.13) will follow from a regularity argument similar to that from Theorem 6.1.13. Denote by
$$\psi(x) := \int_0^1 \int_0^{x(t)} f(t, s) \, ds \, dt$$
the functional defined on the Sobolev space $H$. Then $\psi$ is of the class $C^1(H, \mathbb{R})$, and
$$(\psi'(x), h) = \int_0^1 f(t, x(t)) h(t) \, dt \tag{6.5.14}$$
(cf. Section 3.2). The fact $\psi \in C^1(H, \mathbb{R})$ implies immediately that $F \in C^1(H, \mathbb{R})$ as well. Let us define
$$Y := \operatorname{Lin}\{e_1, \dots, e_k\}, \qquad Z := \left\{ x \in H : \int_0^1 x(t) y(t) \, dt = 0, \ y \in Y \right\}.$$
72 Note that we use the norm $\|x\| = \|\dot{x}\|_{L^2(0,1)}$.
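Then
$$\delta := \inf_{\substack{x \in Z \\ \|x\| = 1}} \int_0^1 \left( |\dot{x}(t)|^2 + a(t) |x(t)|^2 \right) dt > 0. \tag{6.5.15}$$
Indeed, by definition, on $Z$ we have (see Remark 6.3.13)
$$\int_0^1 \left( |\dot{x}(t)|^2 + a(t) |x(t)|^2 \right) dt \ge \lambda_{k+1} \int_0^1 |x(t)|^2 \, dt.$$
Since $\inf_{\|x\| = 1} \|x\|_{L^2(0,1)} = 0$ we need some more work in order to establish (6.5.15). Consider a minimizing sequence $\{x_n\}_{n=1}^\infty \subset Z$:
$$\|x_n\| = 1, \qquad \int_0^1 \left( |\dot{x}_n(t)|^2 + a(t) |x_n(t)|^2 \right) dt \to \delta.$$
Going to a subsequence if necessary, we may assume $x_n \rightharpoonup x$ in $H$, i.e., $x_n \to x$ in $L^2(0, 1)$ by the compact embedding $H = W_0^{1,2}(0, 1) \subset\subset L^2(0, 1)$ (cf. Theorem 1.2.28). The continuity of $a = a(t)$ in $[0, 1]$ then implies that
$$\int_0^1 a(t) |x_n(t)|^2 \, dt \to \int_0^1 a(t) |x(t)|^2 \, dt.$$
Since $Z$ is weakly closed and the norm on $H$ is weakly lower semicontinuous, we obtain
$$\delta = 1 + \int_0^1 a(t) |x(t)|^2 \, dt \ge \int_0^1 |\dot{x}(t)|^2 \, dt + \int_0^1 a(t) |x(t)|^2 \, dt \ge \lambda_{k+1} \int_0^1 |x(t)|^2 \, dt.$$
If $x = o$, we have $\delta = 1$, and if $x \ne o$, we have
$$\delta \ge \lambda_{k+1} \int_0^1 |x(t)|^2 \, dt > 0$$
and so (6.5.15) is proved. Using (f1) and (f3), we obtain that for any $\varepsilon > 0$ there exists $c_\varepsilon > 0$ such that
$$\left| \int_0^s f(t, \sigma) \, d\sigma \right| \le \varepsilon |s|^2 + c_\varepsilon |s|^p. \tag{6.5.16}$$
It follows from (6.5.15) and (6.5.16) that on $Z$ we have
$$F(x) \ge \frac{\delta}{2} \|x\|^2 - \int_0^1 \left( \varepsilon |x(t)|^2 + c_\varepsilon |x(t)|^p \right) dt = \frac{\delta}{2} \|x\|^2 - \varepsilon \|x\|_{L^2(0,1)}^2 - c_\varepsilon \|x\|_{L^p(0,1)}^p.$$
For $\varepsilon > 0$ small enough, by virtue of the inequality $1 < p < 2$ and the embedding $H = W_0^{1,2}(0, 1) \subset L^q(0, 1)$ for any $q > 1$, there exists $r > 0$ such that
$$b := \inf_{\substack{\|x\| = r \\ x \in Z}} F(x) > 0.$$
By (f4) we have
$$F(x) \le \int_0^1 \left[ \lambda_k \frac{|x(t)|^2}{2} - \int_0^{x(t)} f(t, s) \, ds \right] dt \le 0 \quad\text{for}\quad x \in Y. \tag{6.5.17}$$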
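It follows from the first inequality in (f2) that there exist $c_1, c_2 > 0$ such that
$$\int_0^s f(t, \sigma) \, d\sigma \ge c_1 |s|^\alpha - c_2 \quad\text{for all } s \in \mathbb{R}, \ t \in [0, 1] \tag{6.5.18}$$
(cf. Exercise 6.5.17). Hence for $x \in H$ we have
$$F(x) \le \frac{1}{2} \left( \|x\|^2 + \|a\|_{C[0,1]} \|x\|_{L^2(0,1)}^2 \right) - c_1 \|x\|_{L^\alpha(0,1)}^\alpha + c_2. \tag{6.5.19}$$
Set $z := r \dfrac{e_{k+1}}{\|e_{k+1}\|}$ with $r > 0$ given above. All norms are equivalent on the finite dimensional space $Y \oplus \mathbb{R}z$. In particular, there is a constant $c > 0$ such that
$$\|x\| \le c \|x\|_{L^\alpha(0,1)} \quad\text{for any}\quad x \in Y \oplus \mathbb{R}z.$$
Since $\alpha > 2$ we obtain from (6.5.19) that
$$\lim_{\substack{\|x\| \to \infty \\ x \in Y \oplus \mathbb{R}z}} F(x) = -\infty. \tag{6.5.20}$$
Define
$$M := \{x = y + \lambda z : y \in Y, \ \lambda \ge 0, \ \|x\| \le \varrho\},$$
$$M_0 := \{x = y + \lambda z : y \in Y, \ \|x\| = \varrho \text{ and } \lambda \ge 0, \text{ or } \|x\| \le \varrho \text{ and } \lambda = 0\}.$$
Since $F(z) \ge b > 0$, (6.5.17) and (6.5.20) imply that there is $\varrho > r$ such that
$$a := \max_{x \in M_0} F(x) \le 0.$$
It remains to verify that $F$ satisfies the (PS)$_c$ condition. This will be the case if we show that any sequence $\{x_n\}_{n=1}^\infty \subset H$ such that
$$d := \sup_n F(x_n) < \infty, \qquad F'(x_n) \to o, \tag{6.5.21}$$
contains a converging subsequence. We will prove it in two steps.
Step 1. First we prove that $\{x_n\}_{n=1}^\infty$ is bounded in $H$. Let $\beta \in \left( \frac{1}{\alpha}, \frac{1}{2} \right)$. For $n$ large enough we have for some $c_3, c_4 > 0$, by using (f2),
$$d + \|x_n\| \ge F(x_n) - \beta (F'(x_n), x_n) = \int_0^1 \left[ \left( \frac{1}{2} - \beta \right) \left( |\dot{x}_n(t)|^2 + a(t) |x_n(t)|^2 \right) + \beta f(t, x_n(t)) x_n(t) - \int_0^{x_n(t)} f(t, s) \, ds \right] dt$$
$$\ge \left( \frac{1}{2} - \beta \right) \left( \delta \|z_n\|^2 + \lambda_1 \|y_n\|_{L^2(0,1)}^2 \right) + (\alpha\beta - 1) \int_0^1 \int_0^{x_n(t)} f(t, s) \, ds \, dt - c_3$$
$$\ge \left( \frac{1}{2} - \beta \right) \left( \delta \|z_n\|^2 + \lambda_1 \|y_n\|_{L^2(0,1)}^2 \right) + c_1 (\alpha\beta - 1) \|x_n\|_{L^\alpha(0,1)}^\alpha - c_4 \tag{6.5.22}$$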
where $x_n = y_n + z_n$, $y_n \in Y$, $z_n \in Z$, $\delta$ is from (6.5.15), and we have also used
$$\|y_n\|^2 + \int_0^1 a(t) |y_n(t)|^2 \, dt \ge \lambda_1 \|y_n\|_{L^2(0,1)}^2$$
and the fact that⁷³
$$\int_0^1 [\dot{y}_n(t) \dot{z}_n(t) + a(t) y_n(t) z_n(t)] \, dt = 0.$$
Since $\dim Y < \infty$, the norms $\|\cdot\|$ and $\|\cdot\|_{L^2(0,1)}$ are equivalent on $Y$, and (6.5.22) implies that $\{x_n\}_{n=1}^\infty$ is bounded (cf. Exercise 6.5.18).
Step 2. In the second step we prove that $\{x_n\}_{n=1}^\infty$ contains a convergent subsequence. Going to a subsequence if necessary, we can assume that $x_n \rightharpoonup x$ in $H$. By the Rellich–Kondrachov Theorem (Theorem 1.2.28), $x_n \to x$ in $C[0, 1]$. Observe that
$$\|x_n - x\|^2 = (F'(x_n) - F'(x), x_n - x) + \int_0^1 \left[ (f(t, x_n(t)) - f(t, x(t)))(x_n(t) - x(t)) - a(t) |x_n(t) - x(t)|^2 \right] dt.$$
The boundedness of $\{x_n\}_{n=1}^\infty$ and (6.5.21) imply
$$(F'(x_n) - F'(x), x_n - x) \to 0,$$
$x_n \to x$ in $C[0, 1]$ implies
$$\int_0^1 a(t) \left( |x_n(t)|^2 - |x(t)|^2 \right) dt \to 0,$$
and the continuity of $f$ implies
$$\left| \int_0^1 (f(t, x_n(t)) - f(t, x(t)))(x_n(t) - x(t)) \, dt \right| \le \|f(\cdot, x_n(\cdot)) - f(\cdot, x(\cdot))\|_{C[0,1]} \|x_n - x\|_{C[0,1]} \to 0 \quad\text{as}\quad n \to \infty.$$
Thus we have proved that
$$\|x_n - x\| \to 0, \qquad n \to \infty.$$
Remark 6.5.15. If $\lambda_1 > 0$ (this is the case if, e.g., $a(t) \ge 0$ in $[0, 1]$), it suffices to use the Mountain Pass Theorem instead of the Linking Theorem. The interested reader is invited to carry out the proof in detail as an exercise.
Exercise 6.5.16. A retraction of a topological space $X$ to a subspace $Y$ is a continuous map $r : X \to Y$ such that
$$r(y) = y \quad\text{for every}\quad y \in Y.$$
Prove that there is no retraction of $\overline{B(o; 1)} \subset \mathbb{R}^N$ to $S^{N-1} = \partial B(o; 1)$.
73 Note that $\int_0^1 [\dot{e}_j(t) \dot{z}(t) + a(t) e_j(t) z(t)] \, dt = \lambda_j \int_0^1 e_j(t) z(t) \, dt = 0$ for $e_j$, $j = 1, \dots, k$, $z \in Z$.
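Hint. Assume, by contradiction, that there is a retraction $r : \overline{B(o; 1)} \to S^{N-1}$. Using the homotopy $H(t, u) := (1 - t)u + t\,r(u)$ we obtain (see Theorem 5.2.7)
$$\deg(r, B(o; 1), o) = \deg(I, B(o; 1), o) = 1,$$
i.e., there is $u_0 \in B(o; 1)$ such that $r(u_0) = o$. This contradicts $r(u) \in S^{N-1}$ for any $u \in B(o; 1)$!
Exercise 6.5.17. Prove that the condition (f2) implies (6.5.18).
Hint. Let $s > R$. It follows from (f2) that
$$\frac{\alpha}{s} \le \frac{f(t, s)}{\int_0^s f(t, \sigma) \, d\sigma}$$
and integrating over $[R, s]$ yields
$$\alpha \log s - \alpha \log R \le \log \int_0^s f(t, \sigma) \, d\sigma - \log \int_0^R f(t, \sigma) \, d\sigma.$$
Taking the exponential of both sides, we obtain
$$\left( \frac{s}{R} \right)^\alpha \le \frac{\int_0^s f(t, \sigma) \, d\sigma}{\int_0^R f(t, \sigma) \, d\sigma}, \quad\text{i.e.,}\quad \frac{\int_0^R f(t, \sigma) \, d\sigma}{R^\alpha} s^\alpha \le \int_0^s f(t, \sigma) \, d\sigma.$$
Since $f$ is continuous, we have $\left| \int_0^s f(t, \sigma) \, d\sigma \right| \le \text{const.}$ for $t \in [0, 1]$, $s \in [0, R]$. Hence
$$\int_0^s f(t, \sigma) \, d\sigma \ge c_1 s^\alpha - c_2.$$
Similarly for $s < 0$.
Exercise 6.5.18. Prove the boundedness of the sequence $\{x_n\}_{n=1}^\infty$ from Step 1 in Example 6.5.14.
Hint. Write $x_n = y_n + z_n$, $y_n \in Y$, $z_n \in Z$. Since $Y$ and $Z$ are $L^2$-orthogonal, we have $\|y_n\|_{L^2(0,1)}^2 = \|x_n\|_{L^2(0,1)}^2 - \|z_n\|_{L^2(0,1)}^2$. Write (6.5.22) in the equivalent form
$$d + \|x_n\| \ge \left( \frac{1}{2} - \beta \right) \left( \delta \|z_n\|^2 - \lambda_1 \|z_n\|_{L^2(0,1)}^2 \right) + c_1 (\alpha\beta - 1) \|x_n\|_{L^\alpha(0,1)}^\alpha + \lambda_1 \left( \frac{1}{2} - \beta \right) \|x_n\|_{L^2(0,1)}^2 - c_4$$
$$\ge \left( \frac{1}{2} - \beta \right) \left( \delta \|z_n\|^2 - \lambda_1 \|z_n\|_{L^2(0,1)}^2 \right) + \|x_n\|_{L^\alpha(0,1)}^2 \left[ c_1 (\alpha\beta - 1) \|x_n\|_{L^\alpha(0,1)}^{\alpha - 2} + c_5^2 \lambda_1 \left( \frac{1}{2} - \beta \right) \right] - c_4$$
(where the inequality $\|x_n\|_{L^2(0,1)} \le c_5 \|x_n\|_{L^\alpha(0,1)}$ and $\lambda_1 < 0$ are used) and get the boundedness of $\{x_n\}_{n=1}^\infty$.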
Exercise 6.5.19. Consider the Dirichlet boundary value problem
$$-\left( |\dot{x}(t)|^{p-2} \dot{x}(t) \right)^{\boldsymbol{\cdot}} - \lambda |x(t)|^{p-2} x(t) = g(t, x(t)), \quad t \in (0, 1), \qquad x(0) = x(1) = 0, \tag{6.5.23}$$
where $p > 1$. Formulate conditions on $\lambda$ and $g = g(t, x)$ which guarantee that (6.5.23) has a geometry corresponding to
(i) the Saddle Point Theorem,
(ii) the Linking Theorem.
Exercise 6.5.20. How do the conditions on $\lambda$ and $g$ change if the homogeneous Dirichlet conditions in (6.5.23) are replaced by the Neumann ones?
Chapter 7
Boundary Value Problems for Partial Differential Equations
7.1 Classical Solution, Functional Setting
In this section we will explain the notion of the classical solution of a semilinear problem with the Laplace operator and explain what is the “right” functional setting for it. Let $\Omega$ be an open bounded subset of $\mathbb{R}^N$ and let $u : \Omega \to \mathbb{R}$ be a real smooth function. We will denote by
$$\Delta u(x) := \frac{\partial^2 u(x)}{\partial x_1^2} + \frac{\partial^2 u(x)}{\partial x_2^2} + \dots + \frac{\partial^2 u(x)}{\partial x_N^2}, \qquad x = (x_1, \dots, x_N) \in \Omega,$$
the Laplace operator defined in $\Omega$. Let
$$g : \Omega \times \mathbb{R} \to \mathbb{R}$$
be a continuous real function. We will study the Dirichlet boundary value problem
$$-\Delta u(x) = g(x, u(x)) \quad\text{in } \Omega, \qquad u = 0 \quad\text{on } \partial\Omega \tag{7.1.1}$$
and look for its classical solution. Following the definition of the classical solution for the ordinary differential equation it should be a function $u \in C^2(\Omega) \cap C(\overline{\Omega})$ such that $u(x) = 0$ for every $x \in \partial\Omega$ and the equation $-\Delta u(x) = g(x, u(x))$ is satisfied at every point $x \in \Omega$. Let us explain why this is not a suitable definition of the solution for partial differential equations.
In order to apply the methods of nonlinear functional analysis we need an operator representation of the Laplace operator subject to the Dirichlet boundary conditions. For this purpose we need the fact that for any $f \in C(\overline{\Omega})$ the linear problem
$$-\Delta u(x) = f(x) \quad\text{in } \Omega, \qquad u = 0 \quad\text{on } \partial\Omega, \tag{7.1.2}$$
has a unique solution $u \in C^2(\Omega) \cap C(\overline{\Omega})$. However, not for every $f \in C(\overline{\Omega})$ does such a solution exist in general! This fact is nontrivial and can be found in the book Gilbarg & Trudinger [59, Chapter 4]. So, we have to look for a different concept of the classical solution.
To motivate the definition of the classical solution of (7.1.1) we treat first the linear problem (7.1.2). In order to simplify the notation, in this chapter, denote by $|\cdot|$ the norm in $\mathbb{R}^N$ for any $N \ge 1$. For $\gamma \in (0, 1)$ let us consider the space of $\gamma$-Hölder continuous functions
$$C^{0,\gamma}(\overline{\Omega}) := \{u \in C(\overline{\Omega}) : \exists K > 0 \ \forall x, y \in \overline{\Omega} : |u(x) - u(y)| \le K |x - y|^\gamma\}$$
(cf. Example 1.2.25 and Exercise 7.1.4). Let $\alpha = (\alpha_1, \dots, \alpha_N)$ be a multiindex of length
$$|\alpha| = \sum_{i=1}^N \alpha_i,$$
and for the sake of brevity denote
$$D^\alpha u = \frac{\partial^{|\alpha|} u}{\partial x_1^{\alpha_1} \partial x_2^{\alpha_2} \dots \partial x_N^{\alpha_N}} \quad \left( = \left( \frac{\partial}{\partial x_1} \right)^{\alpha_1} \cdots \left( \frac{\partial}{\partial x_N} \right)^{\alpha_N} u \right)$$
(cf. Section 1.2). Set
$$C^{2,\gamma}(\overline{\Omega}) := \{u \in C^2(\overline{\Omega}) : \forall |\alpha| = 2, \ D^\alpha u \in C^{0,\gamma}(\overline{\Omega})\}$$
(cf. Exercise 7.1.5). In what follows we will assume that $\Omega$ is a bounded domain (an open and connected set) of the class $C^{k,\gamma}$ ($k \in \mathbb{N} \cup \{0\}$, $\gamma \in (0, 1]$), i.e., at each point $x_0 \in \partial\Omega$ there is a ball $B(x_0; \varrho)$ and a one-to-one mapping $\psi$ from $B(x_0; \varrho)$ onto $B(o; 1) \subset \mathbb{R}^N$ such that
(1) $\psi(B(x_0; \varrho) \cap \Omega) \subset \mathbb{R}^N_+$;¹
(2) $\psi(B(x_0; \varrho) \cap \partial\Omega) \subset \partial\mathbb{R}^N_+$;
(3) $\psi \in C^{k,\gamma}(B(x_0; \varrho))$, $\psi^{-1} \in C^{k,\gamma}(B(o; 1))$,
see Figure 7.1.1 (cf. Definition 4.3.89). For the sake of brevity we write $\Omega \in C^{k,\gamma}$. In particular, if $\Omega \in C^{0,1}$, then $\Omega$ will be called the domain with a Lipschitz boundary.
1 $\mathbb{R}^N_+ = \{x \in \mathbb{R}^N : x = (x_1, x_2, \dots, x_N), \ x_N > 0\}$.
The following deep result is crucial for the classical setting of the problem (7.1.1).
Theorem 7.1.1. Let $f \in C^{0,\gamma}(\overline{\Omega})$, $\gamma \in (0, 1)$, $\Omega \in C^{2,\gamma}$. Then there exists a unique $u \in C^{2,\gamma}(\overline{\Omega})$ such that $u(x) = 0$, $x \in \partial\Omega$, and the equation in (7.1.2) is satisfied at every point $x \in \Omega$. Moreover, there exists $c > 0$ (which depends only on $\Omega$ and $\gamma \in (0, 1)$) such that
$$\|u\|_{C^{2,\gamma}(\overline{\Omega})} \le c \|f\|_{C^{0,\gamma}(\overline{\Omega})}. \tag{7.1.3}$$
[Figure 7.1.1]
Proof. The proof of this assertion can be found in Gilbarg & Trudinger [59, Chapter 6]. Note that (7.1.3) is a special case of more general Schauder estimates (see, e.g., Gilbarg & Trudinger [59]).
Set
$$X := \{u \in C^{2,\gamma}(\overline{\Omega}) : u(x) = 0, \ x \in \partial\Omega\} \quad\text{and}\quad Y := C^{0,\gamma}(\overline{\Omega}).$$
Then $X$ and $Y$ are Banach spaces (see Exercises 7.1.4 and 7.1.5) and the Arzelà–Ascoli Theorem (see Theorem 1.2.13) implies that the following compact embedding holds true:
$$X \subset\subset Y. \tag{7.1.4}$$
Let us define an operator $L : X \to Y$ by
$$(Lu)(x) := -\Delta u(x), \qquad u \in X. \tag{7.1.5}$$
Then $L$ is a linear and bounded operator (Exercise 7.1.7). It follows from Theorem 7.1.1 that $L^{-1} : Y \to X$ is a well-defined linear and bounded operator. Actually, the best constant (i.e., the least one) in (7.1.3) is nothing but $c = \|L^{-1}\|_{\mathcal{L}(Y, X)}$.
It follows from the compact embedding (7.1.4) that²
$$L^{-1} : Y \to Y$$
is a compact operator. Notice, however, that $\|L^{-1}\|_{\mathcal{L}(Y)} \ne \|L^{-1}\|_{\mathcal{L}(Y, X)}$ in general, because
$$\|L^{-1}\|_{\mathcal{L}(Y)} \le c_{\mathrm{emb}} \|L^{-1}\|_{\mathcal{L}(Y, X)}$$
where $c_{\mathrm{emb}}$ is the constant of the embedding of $X$ into $Y$, i.e., the least constant $c$ for which $\|u\|_Y \le c \|u\|_X$ holds for all $u \in X$. This constant depends on $\Omega$ and $\gamma \in (0, 1)$.
Let us simplify the situation and assume that the nonlinear function $g = g(x, u)$ is given in the special form $g(x, u(x)) = g(u(x)) + f(x)$ (the “variables” $u$ and $x$ are “separated”). In order to find a suitable operator representation for
$$-\Delta u(x) = g(u(x)) + f(x) \quad\text{in } \Omega, \qquad u = 0 \quad\text{on } \partial\Omega, \tag{7.1.6}$$
we have to find conditions on $g$ which guarantee that $G : Y \to Y$ defined by
$$G(u)(x) := g(u(x)) \tag{7.1.7}$$
is a correctly defined operator (see also Example 3.2.21). The following nontrivial assertion provides a necessary and sufficient condition on $g$ guaranteeing that the Nemytski operator $G$ is continuous from $Y$ into $Y$.
Lemma 7.1.2 (Drábek [38]). The operator $G$ defined by (7.1.7) maps $Y$ continuously into $Y$ if and only if $g \in C^1(\mathbb{R})$.³
Definition 7.1.3. Let $f \in Y$ and $g \in C^1(\mathbb{R})$. Then $u \in X$ is a classical solution of (7.1.6) if the equation in (7.1.6) holds at every point of $\Omega$.
Let us give an operator representation of (7.1.6). Denote
$$G_f(u)(x) := g(u(x)) + f(x).$$
2 To be precise, we should write $I \circ L^{-1} : Y \to Y$ where $I$ is the compact embedding of $X$ into $Y$. However, for the sake of brevity of notation we drop it.
3 Note that a similar assertion can be proved also for a more general Nemytski operator $G(u)(x) = g(x, u(x))$. However, the conditions on $g$ are more complicated in that case (see, e.g., Appell & Zabreiko [8]).
Then $G_f : Y \to Y$ is a continuous operator by Lemma 7.1.2 and it maps bounded sets onto bounded sets (see Exercise 7.1.8). The problem (7.1.6) is then equivalent to
$$L(u) = G_f(u) \quad\text{or}\quad u = L^{-1}(G_f(u)).$$
The operator $T = L^{-1} \circ G_f : Y \to Y$ is compact (explain why!). The problem (7.1.6) thus can be written as a fixed point problem
$$u = T(u). \tag{7.1.8}$$
The equation (7.1.8) can be solved by applying some of the methods presented in the previous chapters. We will show some applications in the next section.

Exercise 7.1.4. Prove that for $\gamma \in (0,1)$ the set $C^{0,\gamma}(\Omega)$, $\Omega$ a bounded domain in $\mathbb{R}^N$, is a Banach space equipped with the norm
$$\|u\|_{C^{0,\gamma}(\Omega)} = \sup_{x \in \Omega} |u(x)| + \sup_{\substack{x,y \in \Omega\\ x \neq y}} \frac{|u(x) - u(y)|}{|x - y|^{\gamma}}.$$

Exercise 7.1.5. Prove that for $\gamma \in (0,1)$ the set $C^{2,\gamma}(\Omega)$, $\Omega$ a bounded domain in $\mathbb{R}^N$, is a Banach space equipped with the norm
$$\|u\|_{C^{2,\gamma}(\Omega)} = \|u\|_{C^{2}(\Omega)} + \sum_{|\alpha| = 2}\ \sup_{\substack{x,y \in \Omega\\ x \neq y}} \frac{|D^{\alpha}u(x) - D^{\alpha}u(y)|}{|x - y|^{\gamma}}.$$

Exercise 7.1.6. Prove that for $\gamma > 1$ the following equivalence holds true: $u \in C^{0,\gamma}(\Omega)$ if and only if $u$ is constant on $\Omega$. (Cf. footnote 26 on page 38.)

Exercise 7.1.7. Prove that $L : X \to Y$ defined by (7.1.5) is a linear and bounded operator.

Exercise 7.1.8. Prove that $G_f$, defined above, maps bounded sets in $Y$ onto bounded sets in $Y$.
7.2 Classical Solution, Applications

In this section we will deal with the existence (and uniqueness) of the classical solution of the Dirichlet problem
$$\begin{cases} -\Delta u(x) = g(u(x)) + f(x) & \text{in } \Omega,\\ u = 0 & \text{on } \partial\Omega, \end{cases} \tag{7.2.1}$$
under the assumptions on $\Omega$, $g$ and $f$ introduced in Section 7.1. Namely, we assume that $\Omega$ is a domain of the class $C^{2,\gamma}$, $\gamma \in (0,1)$, $f \in Y = C^{0,\gamma}(\Omega)$ and $g \in C^{1}(\mathbb{R})$.
Let us start with a direct application of the Schauder Fixed Point Theorem.

Theorem 7.2.1. Let
$$\sup_{s \in \mathbb{R}} |g(s)| < \infty \quad\text{and}\quad \sup_{s \in \mathbb{R}} |g'(s)| < \frac{1}{\|L^{-1}\|_{\mathcal{L}(Y)}} \tag{7.2.2}$$
where $L^{-1}$ was introduced in Section 7.1. Then for any $f \in Y$ the problem (7.2.1) has at least one classical solution.

Proof. We rewrite the problem (7.2.1) into the operator form (7.1.8) with $T$, $L^{-1}$ and $G_f$ introduced in Section 7.1. Due to our assumptions we find a ball $B(o;r) \subset Y$ with the property $T(B(o;r)) \subset B(o;r)$. The existence of at least one solution will then follow from the Schauder Fixed Point Theorem (Theorem 5.1.11). For $f \in Y$ fixed we get
$$\begin{aligned}
\|T(u)\| &= \|L^{-1}(G_f(u))\| \le \|L^{-1}\|\,\bigl[\|f\| + \|G(u)\|\bigr]\\
&\le \|L^{-1}\| \Biggl[\|f\| + \sup_{x \in \Omega} |g(u(x))| + \sup_{\substack{x,y \in \Omega\\ x \neq y}} \frac{|g(u(x)) - g(u(y))|}{|x - y|^{\gamma}}\Biggr]\\
&\le \|L^{-1}\|\,\|f\| + \|L^{-1}\| \sup_{s \in \mathbb{R}} |g(s)| + \|L^{-1}\| \sup_{s \in \mathbb{R}} |g'(s)| \sup_{\substack{x,y \in \Omega\\ x \neq y}} \frac{|u(x) - u(y)|}{|x - y|^{\gamma}}\\
&\le \|L^{-1}\|\,\|f\| + \|L^{-1}\| \sup_{s \in \mathbb{R}} |g(s)| + \|L^{-1}\| \sup_{s \in \mathbb{R}} |g'(s)|\,\|u\|
\end{aligned}$$
for any $u \in Y$. Note that the first two terms in the last sum are constants independent of $u \in Y$. So, taking $r > 0$ large enough, the assumption (7.2.2) will guarantee that
$$\|T(u)\| < r \quad\text{for any } u \in B(o;r) \subset Y.{}^{4}$$
This completes the proof.
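The choice of $r$ in the proof can be made explicit. The following is only a sketch under the estimate just derived; the abbreviations $a$ and $b$ are introduced here for illustration and do not appear in the text, and all operator norms are taken in $\mathcal{L}(Y)$.

```latex
% Abbreviations (ours, for illustration only):
%   a := \|L^{-1}\| ( \|f\| + \sup_{s} |g(s)| ),   b := \|L^{-1}\| \sup_{s} |g'(s)|.
% Assumption (7.2.2) says exactly that 0 \le b < 1.
\[
  \|T(u)\| \;\le\; a + b\,\|u\| \;\le\; a + b\,r
  \qquad \text{whenever } \|u\| < r ,
\]
\[
  a + b\,r < r
  \;\Longleftrightarrow\;
  r > \frac{a}{1-b}
  = \frac{\|L^{-1}\|\bigl(\|f\| + \sup_{s}|g(s)|\bigr)}
         {1 - \|L^{-1}\|\,\sup_{s}|g'(s)|}.
\]
```

So every radius strictly above the threshold $a/(1-b)$ is admissible for this estimate, and the threshold grows without bound as $\|L^{-1}\| \sup |g'|$ approaches $1$.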
Remark 7.2.2. The above theorem actually says that the problem (7.2.1) is solvable in the classical sense if the smooth nonlinearity $g$ is uniformly bounded on $\mathbb{R}$ and, moreover, its derivative is uniformly bounded by a certain constant which depends on $\Omega$ and $\gamma \in (0,1)$.${}^{5}$ This constant may be rather difficult to calculate but its estimate from above can be given for special domains $\Omega$ and exponents $\gamma \in (0,1)$ (see Gilbarg & Trudinger [59] for more details).

${}^{4}$ The reader is invited to calculate the least value of such $r$ in terms of $\|L^{-1}\|$, $\|f\|$, $\sup_{s \in \mathbb{R}} |g(s)|$ and $\sup_{s \in \mathbb{R}} |g'(s)|$!
${}^{5}$ Note that $\|L^{-1}\|$ depends on $\Omega$ and $\gamma \in (0,1)$. See also Exercise 7.2.5.
It is quite natural to ask under which conditions on $g$ the classical solution from Theorem 7.2.1 is uniquely determined. For this purpose let us consider the eigenvalue problem for the Laplace operator subject to the homogeneous Dirichlet boundary conditions
$$\begin{cases} -\Delta u(x) = \lambda u(x) & \text{in } \Omega,\\ u = 0 & \text{on } \partial\Omega. \end{cases} \tag{7.2.3}$$
The problem (7.2.3) has only real eigenvalues. More precisely, there are only real numbers $\lambda$ for which (7.2.3) has nonzero classical solutions. There exists the so-called principal eigenvalue $\lambda_1 > 0$ having the property that
$$\int_{\Omega} |\nabla u(x)|^{2}\,dx \ge \lambda_1 \int_{\Omega} |u(x)|^{2}\,dx \quad\text{for all } u \in Y{}^{6} \tag{7.2.4}$$
(see Example 7.5.1 below). Now we can formulate the following existence and uniqueness result.

Theorem 7.2.3. Let
$$\sup_{s \in \mathbb{R}} |g(s)| < \infty \quad\text{and}\quad \sup_{s \in \mathbb{R}} |g'(s)| < \min\left\{\frac{1}{\|L^{-1}\|_{\mathcal{L}(Y)}},\ \lambda_1\right\}. \tag{7.2.5}$$
Then for any $f \in Y$ the problem (7.2.1) has exactly one classical solution.

Proof. Let $f \in Y$ be arbitrary. The existence of at least one solution follows from Theorem 7.2.1. To prove that this solution is unique we proceed via contradiction. Let $u_1 \neq u_2$, $u_i \in X$, $i = 1, 2$, be two solutions of (7.2.1) for a given $f \in Y$. Then
$$\begin{cases} -\Delta u_i(x) = g(u_i(x)) + f(x) & \text{in } \Omega,\\ u_i = 0 & \text{on } \partial\Omega, \end{cases} \qquad i = 1, 2.$$
Multiply both equations by the difference $u_1 - u_2$, use the Green Formula${}^{7}$ and the boundary conditions, and then subtract the first expression from the second. Thus we get
$$\int_{\Omega} |\nabla u_1(x) - \nabla u_2(x)|^{2}\,dx = \int_{\Omega} \bigl(g(u_1(x)) - g(u_2(x))\bigr)\bigl(u_1(x) - u_2(x)\bigr)\,dx. \tag{7.2.6}$$

${}^{6}$ This inequality is the Poincaré inequality (see Exercise 1.2.46 and Remark 7.4.5).
${}^{7}$ The Green Formula reads: Let $\Omega$ be a domain with a Lipschitz boundary (see Section 7.1) and assume that $w, v \in C^{2}(\Omega)$. Then the relation
$$\int_{\Omega} \Delta w(x)\,v(x)\,dx = -\int_{\Omega} (\nabla w(x), \nabla v(x))\,dx + \int_{\partial\Omega} (\nabla w(x), n(x))\,v(x)\,dS$$
holds where $n$ is the unit vector of the outward normal to $\partial\Omega$ and $dS$ indicates integration with respect to the surface measure on $\partial\Omega$. For more details see Appendix 4.3C, in particular, Remark 4.3.99.
Now, by virtue of (7.2.4) we have
$$\int_{\Omega} |\nabla u_1(x) - \nabla u_2(x)|^{2}\,dx \ge \lambda_1 \int_{\Omega} |u_1(x) - u_2(x)|^{2}\,dx, \tag{7.2.7}$$
and (7.2.5) implies that
$$\int_{\Omega} \bigl(g(u_1(x)) - g(u_2(x))\bigr)\bigl(u_1(x) - u_2(x)\bigr)\,dx < \lambda_1 \int_{\Omega} |u_1(x) - u_2(x)|^{2}\,dx. \tag{7.2.8}$$
It follows from (7.2.6)–(7.2.8) that
$$\lambda_1 \int_{\Omega} |u_1(x) - u_2(x)|^{2}\,dx \le \int_{\Omega} |\nabla u_1(x) - \nabla u_2(x)|^{2}\,dx < \lambda_1 \int_{\Omega} |u_1(x) - u_2(x)|^{2}\,dx,$$
a contradiction.
Remark 7.2.4. One can ask why not prove the existence (and uniqueness) of a classical solution to (7.2.1) applying directly the Contraction Principle. The reason consists in the fact that the key assumption in this case would be the contractivity of $T$. Due to the linearity of $L^{-1}$ this is equivalent to the Lipschitz continuity of the Nemytski operator $G : Y \to Y$. However, according to Appell & Zabreiko [8, Theorem 7.8], $G$ satisfies the global Lipschitz condition if and only if $g$ is of the form $g(u) = a + bu$ where $a, b \in \mathbb{R}$, i.e., $g$ is a linear function. According to Appell & Zabreiko [8, Theorem 7.9], $G$ satisfies the local Lipschitz condition provided $g$ is locally Lipschitz continuous. In other words, this means that the assumptions would be too restrictive, and so the above functional setting is not suitable for direct application of the Contraction Principle.

Exercise 7.2.5. Prove that the statement of Theorem 7.2.1 holds provided $g$ is a uniformly Lipschitz continuous function on $\mathbb{R}$ with a constant $K < \frac{1}{\|L^{-1}\|}$. Generalize also Theorem 7.2.3.
Hint. Follow the proof of Theorem 7.2.1 and for an estimate of $g(u)$ use
$$|g(u(\cdot))| \le |g(u(\cdot)) - g(0)| + |g(0)| \le K|u(\cdot)| + |g(0)|.$$

Exercise 7.2.6. Let $u$ be a solution of the problem (7.2.1) where $\Omega$ is a domain of the class $C^{2,\gamma}$, $\gamma \in (0,1)$ and $f \in C^{0,\gamma}(\Omega)$. The maximum principle (see, e.g., Protter & Weinberger [102]) states that
$$f \ge 0 \ \text{in } \Omega \quad\text{implies}\quad u \ge 0 \ \text{in } \Omega.$$
Use this fact to generalize the result of Example 5.4.19 to the problem
$$\begin{cases} -\Delta u(x) = f(x, u(x)) & \text{in } \Omega,\\ u = 0 & \text{on } \partial\Omega. \end{cases} \tag{7.2.9}$$

Exercise 7.2.7. Formulate conditions on $f = f(x,u)$ which guarantee that there is a pair of a subsolution $u_0$ and a supersolution $v_0$ of (7.2.9) satisfying
$$u_0 \le v_0 \quad\text{in } \Omega.$$
Hint. Look for $u_0$ and $v_0$ constant on $\Omega$, cf. Exercise 6.2.47.
7.3 Weak Solutions, Functional Setting

As we have mentioned in the previous section the concept of the classical solution is not suitable for application of many abstract results of nonlinear analysis presented in the previous chapters. Let us list the major drawbacks connected with this fact: the spaces of Hölder continuous functions do not possess Hilbert structure and are not reflexive; to prove that the Nemytski operator between spaces of Hölder continuous functions has a certain property requires too strong assumptions about the nonlinearity.

To be more specific, the fact that $C^{k,\gamma}(\Omega)$ are not reflexive spaces prevents us from applying variational methods which strongly depend on the selection of weakly convergent sequences from bounded ones (which is usually guaranteed by reflexivity) and sometimes also on the Hilbert structure of the function space. Concerning the restrictive assumptions on nonlinearity let us recall Remark 7.2.4. That is why it is important to look for a different concept of “solution” than that introduced in Section 7.1.

Let us point out that there are also more practical reasons for the introduction of a different concept of solution. Many equations and boundary value problems are derived from globally formulated laws (coming from physics, chemistry, biology, sociology, economics, ...). To give a very simple example, let $G$ be the primitive of $g$, i.e., $G'(s) = g(s)$, and let the functional
$$E(u) = \frac{1}{2}\int_{\Omega} |\nabla u(x)|^{2}\,dx - \int_{\Omega} G(u(x))\,dx - \int_{\Omega} f(x)u(x)\,dx \tag{7.3.1}$$
represent the energy of a certain system. Following the well-known law, a physicist will be interested in finding functions $u$ defined on $\Omega$, $u = 0$ on $\partial\Omega$, for which $E(u)$ attains its minimal value. The task for mathematicians is to find the corresponding formalism in the framework of which the existence of such functions is guaranteed (and they can be calculated).
Let us assume that $u$ belongs to a normed linear space $H$ of functions defined on $\Omega$ and let us specify the properties of $H$ later on. Let us also understand the following calculations formally and assume that $u$, $g$, $f$ and $\Omega$ have all properties necessary in order to perform the calculations. Assume that $u_0 \in H$, $u_0 = 0$ on $\partial\Omega$, is the point of minimum of $E$, i.e.,
$$E(u_0) = \min_{u \in H} E(u).$$
Then
$$\delta E(u_0; v) = 0 \quad\text{for any } v \in H$$
(see Proposition 6.1.2), i.e.,
$$\int_{\Omega} (\nabla u_0(x), \nabla v(x))\,dx = \int_{\Omega} g(u_0(x))v(x)\,dx + \int_{\Omega} f(x)v(x)\,dx \tag{7.3.2}$$
for any $v \in H$. Note that (7.3.2) is nothing else than the Euler necessary condition for (7.3.1). Now, if we assume that $u_0 \in C^{2}(\Omega)$, $v \in C^{1}(\Omega)$, $v = 0$ on $\partial\Omega$, and apply the Green Formula to the left-hand side of (7.3.2) (using the fact that $v = 0$ on $\partial\Omega$), we arrive at
$$\int_{\Omega} (-\Delta u_0(x))v(x)\,dx = \int_{\Omega} g(u_0(x))v(x)\,dx + \int_{\Omega} f(x)v(x)\,dx \tag{7.3.3}$$
for any $v \in H \cap C^{1}(\Omega)$. If $g \in C(\mathbb{R})$, $f \in C(\Omega)$ and $H$ contains “enough” functions (e.g., $\{v \in C^{1}(\Omega) : v = 0 \text{ on } \partial\Omega\} \subset H$), then (7.3.3) implies
$$\begin{cases} -\Delta u_0(x) = g(u_0(x)) + f(x) & \text{in } \Omega,\\ u_0 = 0 & \text{on } \partial\Omega. \end{cases} \tag{7.3.4}$$
On the other hand, if we find $u_0 \in H \cap C^{2}(\Omega)$ satisfying (7.3.4), then we can pass, using the Green Formula, easily “back” to (7.3.2). However, looking carefully at (7.3.2), we immediately realize that all the expressions in (7.3.2) make sense under more general assumptions on $u_0$ ($\Omega$, $g$ and $f$) than the expressions in (7.3.4) do. One can immediately see that for (7.3.2) to hold we do not need to assume the existence of second partial derivatives of $u_0$ at all. On the other hand, (7.3.4) does not make any sense without them if we understand it in the classical sense.

It also makes good sense from the physical point of view to consider integral identity (7.3.2) as a starting point for the definition of the “solution” $u_0$.${}^{8}$ The advantage of the weak solution of (7.3.4) consists in the fact that the equation and the boundary conditions are not satisfied pointwise but in a more general sense which can correspond better to the real situation described by problems of the type (7.3.4).

${}^{8}$ The usual scheme for deriving basic equations of mathematical physics is the following: First the “global formulation” in terms of an integral identity is derived from the conservation law, balance law, etc., and then the “local formulation” is derived in terms of differential equations. The second step, however, requires some extra assumptions about the solution (e.g., smoothness, etc.).

Let us point out here that the notion of the weak solution is a generalization of the notion of the classical solution. We shall show later that not every weak solution is a classical solution. We will also see later that more methods of nonlinear analysis from the previous chapters are applicable to get a weak solution instead of a classical one. On the other hand, it makes good sense to ask whether a weak solution to some problem has some better properties (continuity, Hölder continuity, differentiability, etc.). Very often this is the case and it depends on $\Omega$, $g$ and $f$ how regular (i.e., smooth) the weak solution is. For general partial differential equations, however, this is a difficult problem and the regularity theory which deals with these questions is an important part of basic research in mathematics.

Let us consider now a bit more general situation when $g = g(x,u)$, and for $u \in H$ let us investigate the identity
$$\int_{\Omega} (\nabla u(x), \nabla v(x))\,dx = \int_{\Omega} g(x, u(x))v(x)\,dx, \qquad v \in H, \tag{7.3.5}$$
in more detail and try to make clear how, under “weak” assumptions concerning the functions $u$, $v$ and $g$, the expressions appearing in (7.3.5) make sense. Note that the Lebesgue integral is used in (7.3.5). So, in particular, the functions under the integral sign must be measurable. As for $\nabla u$, $\nabla v$ and $v$ this will be guaranteed by the requirement that $u$ and $v$ belong to a suitable space of integrable functions. The measurability of the composite function $g(x, u(x))$ will depend on the properties of the function $g$ itself. As was mentioned in Section 3.2 (Definition 3.2.22) the composite function $h(x) = g(x, u(x))$ is measurable provided $u$ is measurable and $g$ fulfils the Carathéodory conditions. For the sake of brevity we denote this fact as follows: $g \in CAR(\Omega \times \mathbb{R})$.

Let us return to the right-hand side of (7.3.5). Assume that there exist $r \in L^{2}(\Omega)$ and $c > 0$ such that for a.a. $x \in \Omega$ and for all $s \in \mathbb{R}$,
$$|g(x,s)| \le r(x) + c|s|. \tag{7.3.6}$$
Let $u, v \in L^{2}(\Omega)$. Then, according to Theorem 3.2.24, $g(x, u(x)) \in L^{2}(\Omega)$ and the Hölder inequality yields that $g(x, u(x))v(x) \in L^{1}(\Omega)$. Now, let us take care of the left-hand side of (7.3.5). For this purpose we have to employ the Sobolev spaces introduced in Section 1.2.
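The integrability of the right-hand side of (7.3.5) can be made quantitative. The short computation below is a sketch that uses only the growth condition (7.3.6), the Hölder inequality with exponent 2 and the Minkowski inequality in $L^{2}(\Omega)$; it is not quoted from the text.

```latex
\[
  \int_{\Omega} |g(x,u(x))\,v(x)|\,dx
  \;\le\; \|g(\cdot,u)\|_{L^{2}(\Omega)}\,\|v\|_{L^{2}(\Omega)}
  \;\le\; \bigl\| r + c\,|u| \bigr\|_{L^{2}(\Omega)}\,\|v\|_{L^{2}(\Omega)}
  \;\le\; \bigl( \|r\|_{L^{2}(\Omega)} + c\,\|u\|_{L^{2}(\Omega)} \bigr)\,\|v\|_{L^{2}(\Omega)}
  \;<\; \infty .
\]
```

In particular, for fixed $u \in L^{2}(\Omega)$ the right-hand side of (7.3.5) is a bounded linear functional of $v \in L^{2}(\Omega)$.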
Recall that the Sobolev space $W_0^{1,p}(\Omega)$, $p > 1$, was defined in Exercise 1.2.46. It follows from the Poincaré inequality (see Exercise 1.2.46(ii), (iii)) that the expression
$$\|u\| = \left(\int_{\Omega} |\nabla u(x)|^{p}\,dx\right)^{\frac{1}{p}}$$
defines an equivalent norm on $W_0^{1,p}(\Omega)$. For functions $u \in W_0^{1,p}(\Omega)$, $v \in W_0^{1,p'}(\Omega)$ (with $\frac{1}{p} + \frac{1}{p'} = 1$), we can apply the Hölder inequality to get
$$\int_{\Omega} |(\nabla u(x), \nabla v(x))|\,dx \le \left(\int_{\Omega} |\nabla u(x)|^{p}\,dx\right)^{\frac{1}{p}} \left(\int_{\Omega} |\nabla v(x)|^{p'}\,dx\right)^{\frac{1}{p'}}. \tag{7.3.7}$$
Let us point out that not every function from $W_0^{1,p}(\Omega)$ is continuous in general (cf. Theorem 1.2.26). This fact depends on the values of $p > 1$ and $N$ (notice that $\Omega \subset \mathbb{R}^N$) and on the properties of the boundary of $\Omega$. For domains of the class $C^{0,1}$ (see page 474), i.e., domains with a Lipschitz boundary, we can define the space $L^{p}(\partial\Omega)$ with the norm
$$\|u\|_{L^{p}(\partial\Omega)} = \left(\int_{\partial\Omega} |u(x)|^{p}\,dS\right)^{\frac{1}{p}}$$
(see Definition 4.3.81 for the meaning of integral, and, e.g., Kufner, John & Fučík [82], Nečas [99]). The following assertion is called the Trace Theorem and it is the key to understanding the notion of “boundary values” of functions from $W^{1,p}(\Omega)$.

Theorem 7.3.1 (Trace Theorem). Let $\Omega \in C^{0,1}$. There exists one and only one continuous linear operator $T$ which assigns to every function $u \in W^{1,p}(\Omega)$ a function $Tu \in L^{p}(\partial\Omega)$ and has the following property:
“For $u \in C^{\infty}(\Omega)$ we have $Tu = u|_{\partial\Omega}$.”
The following identity holds:
$$W_0^{1,p}(\Omega) = \{u \in W^{1,p}(\Omega) : Tu = o \text{ in } L^{p}(\partial\Omega)\}$$
(see, e.g., Kufner, John & Fučík [82] and Nečas [99]).

This assertion offers another way of understanding the space $W_0^{1,p}(\Omega)$. Instead of $Tu = o$ in $L^{p}(\partial\Omega)$ we can say
“$u = o$ on $\partial\Omega$ in the sense of traces”
but we usually write
$$u = 0 \quad\text{on } \partial\Omega.$$
Now, if we turn back to the integral identity (7.3.5), we can immediately see, by applying the Hölder inequality (7.3.7) with $p = 2$, that the integral on the left-hand side is finite if $u, v \in W^{1,2}(\Omega)$. If, moreover, $u \in W_0^{1,2}(\Omega)$, then $u$ satisfies the boundary condition in the sense of traces. These facts motivate the following definition.

Definition 7.3.2. Let $g \in CAR(\Omega \times \mathbb{R})$ and let $\Omega \in C^{0,1}$ be a bounded domain in $\mathbb{R}^N$. By a weak solution of the Dirichlet problem (with homogeneous boundary conditions)
$$\begin{cases} -\Delta u(x) = g(x, u(x)) & \text{in } \Omega,\\ u = 0 & \text{on } \partial\Omega, \end{cases}$$
we understand a function $u \in W_0^{1,2}(\Omega)$ such that the integral identity
$$\int_{\Omega} (\nabla u(x), \nabla v(x))\,dx = \int_{\Omega} g(x, u(x))v(x)\,dx \tag{7.3.8}$$
holds for every $v \in W_0^{1,2}(\Omega)$.${}^{9}$

In order to simplify the notation in the sequel, we write
$$\nabla u(x)\nabla v(x) := (\nabla u(x), \nabla v(x)).$$
So, (7.3.8) reads as
$$\int_{\Omega} \nabla u(x)\nabla v(x)\,dx = \int_{\Omega} g(x, u(x))v(x)\,dx,$$
which is a more common form used in literature.

Remark 7.3.3. The function $v$ is called a test function. More general operators as well as nonhomogeneous boundary conditions are often dealt with in literature. The notion of a weak solution is then defined in a similar way. However, the definition is technically more complicated (see, e.g., Gilbarg & Trudinger [59], Fučík & Kufner [54]).

Remark 7.3.4. It was shown above that every classical solution is also a weak solution. The converse is not always true as follows, e.g., from the fact that the weak solution need not possess the second partial derivatives at all! Roughly speaking, the weak solutions are looked for in larger spaces than the classical ones. That is why the chance to prove the existence of a weak solution is usually bigger than the chance to prove the existence of a classical solution. Hence, some “ill-posed” problems in the framework of classical solutions can appear to be “well-posed” in the framework of weak solutions.

In the forthcoming sections we will work exclusively with bounded domains $\Omega \in C^{0,1}$ without any further specification.

${}^{9}$ Since $\mathcal{D}(\Omega)$ is dense in $W_0^{1,2}(\Omega)$ this is equivalent to the validity of (7.3.8) for all $v \in \mathcal{D}(\Omega)$. We use $v \in W_0^{1,2}(\Omega)$ to get the scalar product in $W_0^{1,2}(\Omega)$ on the left-hand side of (7.3.8).
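Exercise 7.3.5. A function $u \in W^{1,2}(\Omega)$ satisfying (7.3.8) for any $v \in W^{1,2}(\Omega)$ is called a weak solution of the Neumann problem
$$\begin{cases} -\Delta u(x) = g(x, u(x)) & \text{in } \Omega,\\[2pt] \dfrac{\partial u}{\partial n} = 0 & \text{on } \partial\Omega. \end{cases} \tag{7.3.9}$$
Here $\frac{\partial u}{\partial n}$ denotes the derivative of $u$ with respect to the outer normal $n$. Prove that any weak solution $u$ of (7.3.9) such that $u \in C^{2}(\Omega)$ satisfies
$$\frac{\partial u}{\partial n} = 0 \quad\text{at every point of } \partial\Omega.$$
Hint. Use an argument similar to that in Exercise 5.3.20. Notice that
$$\|u\| = \left(\int_{\Omega} |\nabla u(x)|^{2}\,dx\right)^{\frac{1}{2}}$$
is only a semi-norm on $W^{1,2}(\Omega)$!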
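The last remark of the hint can be illustrated with the constant function; the following two lines are an illustrative sketch, not part of the exercise. On a bounded domain the function $u \equiv 1$ belongs to $W^{1,2}(\Omega)$, and it is also an admissible test function in the Neumann setting.

```latex
% (a) The gradient expression vanishes on a nonzero function, so it is not a norm:
\[
  u \equiv 1 \ \text{on } \Omega
  \quad\Longrightarrow\quad
  \Bigl( \int_{\Omega} |\nabla u(x)|^{2}\,dx \Bigr)^{\frac12} = 0
  \quad\text{although } u \neq o .
\]
% (b) Taking v \equiv 1 in (7.3.8) gives a necessary condition on any weak
%     solution u of the Neumann problem (7.3.9):
\[
  0 = \int_{\Omega} \nabla u(x)\,\nabla 1\,dx = \int_{\Omega} g(x,u(x))\,dx .
\]
```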
7.4 Weak Solutions, Application of Fixed Point Theorems

In this section we give an application of the fixed point theorems to proving the existence of a weak solution to the problem
$$\begin{cases} -\Delta u(x) = g(x, u(x)) & \text{in } \Omega,\\ u = 0 & \text{on } \partial\Omega. \end{cases} \tag{7.4.1}$$
For a fixed $u \in W_0^{1,2}(\Omega)$ and $g \in CAR(\Omega \times \mathbb{R})$ satisfying (7.3.6) it is easy to see that
$$\hat{L}_u : v \mapsto \int_{\Omega} \nabla u(x)\nabla v(x)\,dx, \qquad \hat{S}_u : v \mapsto \int_{\Omega} g(x, u(x))v(x)\,dx$$
are continuous linear functionals on the space $W_0^{1,2}(\Omega)$. Since $W_0^{1,2}(\Omega)$ is a Hilbert space, by the Riesz Representation Theorem (Theorem 1.2.40) there exist uniquely determined elements $Lu, S(u) \in W_0^{1,2}(\Omega)$ such that
$$(Lu, v)_{W_0^{1,2}(\Omega)} = \hat{L}_u(v), \qquad (S(u), v)_{W_0^{1,2}(\Omega)} = \hat{S}_u(v) \tag{7.4.2}$$
for all $v \in W_0^{1,2}(\Omega)$. To prove that the Dirichlet problem (7.4.1) has at least one weak solution it is necessary and sufficient to prove that the operator equation
$$Lu = S(u) \tag{7.4.3}$$
has at least one solution in the space $W_0^{1,2}(\Omega)$. Let us therefore investigate the properties of the operators $L$ and $S$.
There are several equivalent inner products defined on $W_0^{1,2}(\Omega)$. If we choose
$$(u, v)_{W_0^{1,2}(\Omega)} := \int_{\Omega} \nabla u(x)\nabla v(x)\,dx,{}^{10}$$
then $L$ defined by (7.4.2) is just an identity on $W_0^{1,2}(\Omega)$. On the other hand, for any $u \in W_0^{1,2}(\Omega)$ we also have $u \in L^{2}(\Omega)$, and Theorem 3.2.24 then implies that $g(x,u) \in L^{2}(\Omega)$. Let us assume that $g$ is Lipschitz continuous with respect to the second variable, i.e., there exists a constant $c > 0$ such that for a.a. $x \in \Omega$ and for any $s_1, s_2 \in \mathbb{R}$,
$$|g(x, s_1) - g(x, s_2)| \le c|s_1 - s_2|.$$
Then for $u_1, u_2 \in W_0^{1,2}(\Omega)$ we have
$$\begin{aligned}
\|S(u_1) - S(u_2)\| &= \sup_{\|v\| \le 1} |(S(u_1) - S(u_2), v)|\\
&= \sup_{\|v\| \le 1} \left|\int_{\Omega} [g(x, u_1(x)) - g(x, u_2(x))]\,v(x)\,dx\right|\\
&\le \sup_{\|v\| \le 1} \int_{\Omega} c\,|u_1(x) - u_2(x)|\,|v(x)|\,dx\\
&\le \sup_{\|v\| \le 1} c\,\|u_1 - u_2\|_{L^{2}(\Omega)}\,\|v\|_{L^{2}(\Omega)}\\
&\le \sup_{\|v\| \le 1} c\,c_{\mathrm{emb}}^{2}\,\|u_1 - u_2\|\,\|v\| = c\,c_{\mathrm{emb}}^{2}\,\|u_1 - u_2\|.{}^{11}
\end{aligned} \tag{7.4.4}$$
Hence (7.4.3) is equivalent in $W_0^{1,2}(\Omega)$ to the operator equation
$$u = S(u) \tag{7.4.5}$$
where $S$ is a contraction if $c_{\mathrm{emb}}^{2}\,c < 1$. We can apply the Contraction Principle to get the following result.

Theorem 7.4.1. Let $g \in CAR(\Omega \times \mathbb{R})$ be Lipschitz continuous with respect to the second variable with a constant $c > 0$, $c < c_{\mathrm{emb}}^{-2}$, where $c_{\mathrm{emb}}$ is the constant of the embedding of $W_0^{1,2}(\Omega)$ into $L^{2}(\Omega)$. Then there is a unique fixed point $u \in W_0^{1,2}(\Omega)$ of the operator $S$, i.e., $u$ is a unique weak solution of (7.4.1).

${}^{10}$ Cf. the Poincaré inequality (7.2.4).
${}^{11}$ There exists a constant $c_{\mathrm{emb}} > 0$ such that $\|u\|_{L^{2}(\Omega)} \le c_{\mathrm{emb}}\|u\|$ for any $u \in W_0^{1,2}(\Omega)$ (cf. Theorem 1.2.26 and Exercise 1.2.46). It can be shown that the best value of $c_{\mathrm{emb}}$ is $\frac{1}{\sqrt{\lambda_1}}$ (cf. Remark 7.4.5 and also (7.2.4)).

Another possibility is to apply the Schauder Fixed Point Theorem. Let us assume that $g \in CAR(\Omega \times \mathbb{R})$ and
$$|g(x,s)| \le r(x) \quad\text{for a.a. } x \in \Omega \text{ and all } s \in \mathbb{R}$$
with a fixed $r \in L^{2}(\Omega)$. Then for any $u \in W_0^{1,2}(\Omega)$ we have
$$\begin{aligned}
\|S(u)\| &= \sup_{\|v\| \le 1} |(S(u), v)| = \sup_{\|v\| \le 1} \left|\int_{\Omega} g(x, u(x))v(x)\,dx\right|\\
&\le \sup_{\|v\| \le 1} \left(\int_{\Omega} |g(x, u(x))|^{2}\,dx\right)^{\frac{1}{2}} \left(\int_{\Omega} |v(x)|^{2}\,dx\right)^{\frac{1}{2}} \le c_{\mathrm{emb}}\,\|r\|_{L^{2}(\Omega)}.
\end{aligned} \tag{7.4.6}$$
It follows from (7.4.6) that the operator $S$ maps the closure of the ball $B(o;R) \subset W_0^{1,2}(\Omega)$ with radius $R = c_{\mathrm{emb}}\|r\|_{L^{2}(\Omega)}$ into itself.

We will show that $S$ is compact. Indeed, let $M \subset W_0^{1,2}(\Omega)$ be a bounded set and $\{w_n\}_{n=1}^{\infty} \subset S(M)$ an arbitrary sequence. Let $\{u_n\}_{n=1}^{\infty} \subset M$ be such that $S(u_n) = w_n$. The reflexivity of $W_0^{1,2}(\Omega)$ implies that $u_n \rightharpoonup u$ in $W_0^{1,2}(\Omega)$ at least for a subsequence. It follows from Theorem 1.2.28 that $u_n \to u$ in $L^{2}(\Omega)$. Estimates similar to those from (7.4.4) with $u_1$ replaced by $u_n$ and $u_2$ replaced by $u$ yield
$$\|S(u_n) - S(u)\|_{W_0^{1,2}(\Omega)} \le c_{\mathrm{emb}}\,\|g(\cdot, u_n) - g(\cdot, u)\|_{L^{2}(\Omega)}. \tag{7.4.7}$$
The right-hand side approaches zero as follows from the continuity of the Nemytski operator from $L^{2}(\Omega)$ into $L^{2}(\Omega)$ (see Theorem 3.2.24), i.e.,
$$w_n \to S(u) \quad\text{in } W_0^{1,2}(\Omega)$$
(at least for a subsequence). This proves the compactness of $S(M)$. Note that (7.4.4) implies the continuity of $S$ as well, i.e., $S$ is a compact operator. Thus we have the following assertion.

Proposition 7.4.2. Let $g \in CAR(\Omega \times \mathbb{R})$ and let there exist $r \in L^{2}(\Omega)$ such that $|g(x,s)| \le r(x)$ for all $s \in \mathbb{R}$ and a.a. $x \in \Omega$. Then there is at least one fixed point $u \in W_0^{1,2}(\Omega)$ of $S$ (and so a weak solution of (7.4.1)) such that
$$\|u\|_{W_0^{1,2}(\Omega)} \le c_{\mathrm{emb}}\,\|r\|_{L^{2}(\Omega)}.$$
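The contraction argument behind Theorem 7.4.1 can be tried out numerically. The sketch below is an illustration under assumptions that are ours, not the text's: the one-dimensional model domain $\Omega = (0, \pi)$ (for which $\lambda_1 = 1$, hence $c_{\mathrm{emb}} = 1$), a standard finite-difference discretization, and the sample nonlinearity `g_example`, which is Lipschitz in the second variable with constant $c = 0.5 < c_{\mathrm{emb}}^{-2}$. The helper names `solve_dirichlet` and `picard` are hypothetical. Each step solves $-w'' = g(x, u_k)$ with zero boundary values, which is the discrete counterpart of $u_{k+1} = S(u_k)$.

```python
import math

def solve_dirichlet(rhs, h):
    """Solve the finite-difference system for -w'' = rhs with w = 0 at both
    endpoints, using the Thomas algorithm for tridiagonal matrices."""
    n = len(rhs)
    off = -1.0 / h**2          # sub- and super-diagonal entries
    diag = 2.0 / h**2          # diagonal entries
    c_prime = [0.0] * n
    d_prime = [0.0] * n
    c_prime[0] = off / diag
    d_prime[0] = rhs[0] / diag
    for i in range(1, n):      # forward elimination
        denom = diag - off * c_prime[i - 1]
        c_prime[i] = off / denom
        d_prime[i] = (rhs[i] - off * d_prime[i - 1]) / denom
    w = [0.0] * n
    w[-1] = d_prime[-1]
    for i in range(n - 2, -1, -1):   # back substitution
        w[i] = d_prime[i] - c_prime[i] * w[i + 1]
    return w

def picard(g, n=199, tol=1e-10, max_iter=200):
    """Iterate u_{k+1} = S(u_k) on n interior grid points of (0, pi).
    Returns the grid, the approximate fixed point and the iteration count."""
    h = math.pi / (n + 1)
    x = [(i + 1) * h for i in range(n)]
    u = [0.0] * n
    for k in range(1, max_iter + 1):
        u_next = solve_dirichlet([g(xi, ui) for xi, ui in zip(x, u)], h)
        step = max(abs(p - q) for p, q in zip(u_next, u))
        u = u_next
        if step < tol:
            return x, u, k
    raise RuntimeError("Picard iteration did not converge")

def g_example(x, s):
    """Sample nonlinearity (an assumption for this sketch): |dg/ds| <= 0.5."""
    return 0.5 * math.sin(s) + 1.0
```

Calling `picard(g_example)` returns a grid function that approximates the unique weak solution for this sample problem; replacing `0.5` by a factor above $1$ leaves the hypothesis of Theorem 7.4.1 and the iteration is no longer guaranteed to converge.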
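It is not hard to see that the assumptions of Proposition 7.4.2 are unnecessarily strong. In order to find a ball $B(o;R) \subset W_0^{1,2}(\Omega)$ which is mapped by $S$ into itself one can assume that $g$ has sublinear growth with respect to the second variable. More precisely, we assume that $g \in CAR(\Omega \times \mathbb{R})$ and there exist $r \in L^{2}(\Omega)$, $c > 0$ and $\delta \in (0,1)$ such that for all $s \in \mathbb{R}$ and a.a. $x \in \Omega$ we have
$$|g(x,s)| \le r(x) + c|s|^{\delta}. \tag{7.4.8}$$
Then for any $u \in W_0^{1,2}(\Omega)$, similarly to (7.4.6), we have
$$\begin{aligned}
\|S(u)\| &\le \sup_{\|v\| \le 1} \left(\int_{\Omega} |g(x, u(x))|^{2}\,dx\right)^{\frac{1}{2}} \left(\int_{\Omega} |v(x)|^{2}\,dx\right)^{\frac{1}{2}}\\
&\le c_{\mathrm{emb}} \left(\int_{\Omega} \bigl|r(x) + c|u(x)|^{\delta}\bigr|^{2}\,dx\right)^{\frac{1}{2}}\\
&\le c_{\mathrm{emb}} \left[\left(\int_{\Omega} |r(x)|^{2}\,dx\right)^{\frac{1}{2}} + c \left(\int_{\Omega} |u(x)|^{2\delta}\,dx\right)^{\frac{1}{2}}\right]
\end{aligned} \tag{7.4.9}$$
where the last estimate is due to the Minkowski inequality (1.2.4) for $p = 2$. Applying the Hölder inequality we have
$$\left(\int_{\Omega} |u(x)|^{2\delta}\,dx\right)^{\frac{1}{2}} \le \left(\int_{\Omega} |u(x)|^{2}\,dx\right)^{\frac{\delta}{2}} (\operatorname{meas} \Omega)^{\frac{1-\delta}{2}} \le c_{\mathrm{emb}}^{\delta}\,(\operatorname{meas} \Omega)^{\frac{1-\delta}{2}}\,\|u\|^{\delta}. \tag{7.4.10}$$
Now, (7.4.9) and (7.4.10) yield
$$\|S(u)\| \le \underbrace{c_{\mathrm{emb}}\,\|r\|_{L^{2}(\Omega)}}_{C} + \underbrace{c\,c_{\mathrm{emb}}^{1+\delta}\,(\operatorname{meas} \Omega)^{\frac{1-\delta}{2}}}_{D}\,\|u\|^{\delta}. \tag{7.4.11}$$
It follows from (7.4.11) that for any $u \in B(o;R)$ we get
$$\|S(u)\| < R \quad\text{provided}\quad C + DR^{\delta} < R \tag{7.4.12}$$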
and hence $S$ maps $B(o;R)$ into itself if $R$ is large enough. The map $S$ is compact as well (no growth restrictions on $g$ are needed to prove the compactness of $S$). Therefore we have a more general result than that in Proposition 7.4.2.

Theorem 7.4.3. Let $g \in CAR(\Omega \times \mathbb{R})$ satisfy (7.4.8) for $\delta < 1$. Then there is at least one weak solution $u$ of (7.4.1).

In Remark 7.4.5 we will show that it is not possible in general to allow the linear growth of $g = g(x,s)$ with respect to $s$. We will need the result from the next example.

Example 7.4.4. Let $\lambda_1 \in \mathbb{R}$ be defined as
$$\lambda_1 = \inf_{\substack{u \in W_0^{1,2}(\Omega)\\ u \neq o}} \frac{\int_{\Omega} |\nabla u(x)|^{2}\,dx}{\int_{\Omega} |u(x)|^{2}\,dx}$$
or equivalently as
$$\lambda_1 = \inf\left\{\int_{\Omega} |\nabla u(x)|^{2}\,dx : \int_{\Omega} |u(x)|^{2}\,dx = 1,\ u \in W_0^{1,2}(\Omega)\right\}.$$
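This is equivalent to the following characterization of $\lambda_1$:
$$\frac{1}{\lambda_1} = \sup_{\substack{u \in W_0^{1,2}(\Omega)\\ u \neq o}} \frac{\int_{\Omega} |u(x)|^{2}\,dx}{\int_{\Omega} |\nabla u(x)|^{2}\,dx}$$
or
$$\frac{1}{\lambda_1} = \sup\left\{\int_{\Omega} |u(x)|^{2}\,dx : \int_{\Omega} |\nabla u(x)|^{2}\,dx = 1,\ u \in W_0^{1,2}(\Omega)\right\}.$$
The operator $A : W_0^{1,2}(\Omega) \to W_0^{1,2}(\Omega)$ defined by
$$(A(u), v)_{W_0^{1,2}(\Omega)} = \int_{\Omega} u(x)v(x)\,dx, \qquad u, v \in W_0^{1,2}(\Omega),$$
is positive, self-adjoint and compact.${}^{12}$ It follows then from Theorem 6.3.12 that the supremum $\frac{1}{\lambda_1}$ is achieved and the function $u_1 \in W_0^{1,2}(\Omega)$ such that
$$\frac{1}{\lambda_1} = \int_{\Omega} |u_1(x)|^{2}\,dx, \qquad \|u_1\|_{W_0^{1,2}(\Omega)} = 1,$$
is an eigenvector of $A$ (corresponding to the eigenvalue $\frac{1}{\lambda_1}$), cf. Example 6.3.15. Then
$$\varphi_1 = \sqrt{\lambda_1}\,u_1$$
satisfies
$$\|\varphi_1\|_{L^{2}(\Omega)} = 1 \quad\text{and}\quad \lambda_1 = \int_{\Omega} |\nabla \varphi_1(x)|^{2}\,dx.$$
Moreover, we have that
$$\int_{\Omega} \nabla \varphi_1(x)\nabla v(x)\,dx = \lambda_1 \int_{\Omega} \varphi_1(x)v(x)\,dx$$
for any $v \in W_0^{1,2}(\Omega)$.${}^{13}$ We will then call $\lambda_1$ the principal (least) eigenvalue and $\varphi_1$ the corresponding principal (normalized) eigenfunction of the eigenvalue problem
$$\begin{cases} -\Delta u(x) = \lambda u(x) & \text{in } \Omega,\\ u = 0 & \text{on } \partial\Omega \end{cases} \tag{7.4.13}$$
(cf. page 479). According to Example 5.4.40 the function $\varphi_1$ can be chosen positive in $\Omega$.

${}^{12}$ Use the compact embedding $W_0^{1,2}(\Omega) \subset\subset L^{2}(\Omega)$ (Theorem 1.2.28).
${}^{13}$ The reader is invited to prove this equality using the Lagrange Multiplier Method (Theorem 6.3.2).

Remark 7.4.5. Note that it follows from the definition of the principal eigenvalue $\lambda_1$ that
$$\|u\|_{L^{2}(\Omega)} \le \frac{1}{\sqrt{\lambda_1}}\,\|u\|_{W_0^{1,2}(\Omega)} \quad\text{for any } u \in W_0^{1,2}(\Omega)$$
and $\frac{1}{\sqrt{\lambda_1}}$ is the least constant with this property. Hence
$$c_{\mathrm{emb}} = \frac{1}{\sqrt{\lambda_1}}.$$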
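The characterization of $\lambda_1$ as an extremal Rayleigh quotient, and the identity $c_{\mathrm{emb}} = 1/\sqrt{\lambda_1}$, can be checked numerically in a simple case. The sketch below assumes the one-dimensional model domain $\Omega = (0, \pi)$ (our choice; there the eigenfunctions are $\sin(kx)$ and $\lambda_1 = 1$) and a finite-difference discretization. It applies power iteration to the discrete analogue of the operator $A$ from Example 7.4.4, i.e. it repeatedly solves $-w'' = u$ with zero boundary values. The function names are hypothetical.

```python
import math

def solve_dirichlet(rhs, h):
    """Solve the finite-difference system for -w'' = rhs with w = 0 at both
    endpoints (Thomas algorithm); this plays the role of w = A(rhs)."""
    n = len(rhs)
    off = -1.0 / h**2
    diag = 2.0 / h**2
    c_prime = [0.0] * n
    d_prime = [0.0] * n
    c_prime[0] = off / diag
    d_prime[0] = rhs[0] / diag
    for i in range(1, n):
        denom = diag - off * c_prime[i - 1]
        c_prime[i] = off / denom
        d_prime[i] = (rhs[i] - off * d_prime[i - 1]) / denom
    w = [0.0] * n
    w[-1] = d_prime[-1]
    for i in range(n - 2, -1, -1):
        w[i] = d_prime[i] - c_prime[i] * w[i + 1]
    return w

def rayleigh_quotient(u, h):
    """Discrete version of (integral of |u'|^2) / (integral of |u|^2),
    with the boundary values u = 0 appended at both ends."""
    padded = [0.0] + list(u) + [0.0]
    grad_sq = sum((padded[i + 1] - padded[i]) ** 2
                  for i in range(len(padded) - 1)) / h
    l2_sq = h * sum(v * v for v in u)
    return grad_sq / l2_sq

def principal_eigenpair(n=199, iterations=60):
    """Power iteration for A; returns (approximate lambda_1, eigenvector
    normalized in the discrete L2 norm, grid size h)."""
    h = math.pi / (n + 1)
    u = [1.0] * n
    for _ in range(iterations):
        w = solve_dirichlet(u, h)
        norm = math.sqrt(h * sum(v * v for v in w))
        u = [v / norm for v in w]
    return rayleigh_quotient(u, h), u, h
```

For the default grid the computed value is close to $1$, so the approximate embedding constant $1/\sqrt{\lambda_1}$ is close to $1$ as well. Any other admissible function, for instance $x(\pi - x)$, yields a Rayleigh quotient that is at least as large (its exact value is $10/\pi^{2}$), in line with the infimum in Example 7.4.4.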
Let us now consider the problem (7.4.1) where $g$ is of the form
$$g(x,u) = \lambda_1 u + f(x) \quad\text{with } f \in L^{2}(\Omega).$$
If $u \in W_0^{1,2}(\Omega)$ is a weak solution of
$$\begin{cases} -\Delta u(x) = \lambda_1 u(x) + f(x) & \text{in } \Omega,\\ u = 0 & \text{on } \partial\Omega, \end{cases} \tag{7.4.14}$$
then
$$\int_{\Omega} \nabla u(x)\nabla v(x)\,dx = \lambda_1 \int_{\Omega} u(x)v(x)\,dx + \int_{\Omega} f(x)v(x)\,dx$$
for any $v \in W_0^{1,2}(\Omega)$. In particular, choosing $v = \varphi_1$ where $\varphi_1$ is the principal eigenfunction corresponding to $\lambda_1$, we have
$$0 = \int_{\Omega} \nabla u(x)\nabla \varphi_1(x)\,dx - \lambda_1 \int_{\Omega} u(x)\varphi_1(x)\,dx = \int_{\Omega} f(x)\varphi_1(x)\,dx,$$
i.e.,
$$\int_{\Omega} f(x)\varphi_1(x)\,dx = 0$$
is a necessary condition for (7.4.14) to have a weak solution (cf. Exercise 7.5.11). In other words, it is not possible to relax condition (7.4.8) to hold for $\delta = 1$ and to expect the existence of a weak solution of (7.4.1) at the same time. Checking carefully estimates similar to (7.4.9) but with $\delta = 1$, we come to the conclusion that small $c$ in (7.4.8) may compensate the linear growth and Theorem 7.4.3 will still hold true. More precisely, assuming
$$|g(x,s)| \le r(x) + c|s|, \tag{7.4.15}$$
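we have for any $u \in W_0^{1,2}(\Omega)$, similarly to (7.4.9),
$$\begin{aligned}
\|S(u)\| &\le \sup_{\|v\| \le 1} \left(\int_{\Omega} |g(x, u(x))|^{2}\,dx\right)^{\frac{1}{2}} \left(\int_{\Omega} |v(x)|^{2}\,dx\right)^{\frac{1}{2}}\\
&\le c_{\mathrm{emb}} \left(\int_{\Omega} \bigl|r(x) + c|u(x)|\bigr|^{2}\,dx\right)^{\frac{1}{2}}\\
&\le c_{\mathrm{emb}} \left[\left(\int_{\Omega} |r(x)|^{2}\,dx\right)^{\frac{1}{2}} + c \left(\int_{\Omega} |u(x)|^{2}\,dx\right)^{\frac{1}{2}}\right]\\
&\le c_{\mathrm{emb}}\,\|r\|_{L^{2}(\Omega)} + c\,c_{\mathrm{emb}}^{2}\,\|u\|.
\end{aligned}$$
If
$$c < \frac{1}{c_{\mathrm{emb}}^{2}} \quad (\le \lambda_1), \tag{7.4.16}$$
then we can choose $R > 0$ such that
$$c_{\mathrm{emb}}\,\|r\|_{L^{2}(\Omega)} + c\,c_{\mathrm{emb}}^{2}\,R < R,$$
and for any $u \in B(o;R)$ we get
$$\|S(u)\| < R. \tag{7.4.17}$$
Since $S$ is also a compact operator under the growth condition (7.4.15), we can formulate

Theorem 7.4.6. Let $g \in CAR(\Omega \times \mathbb{R})$ and let there exist $r \in L^{2}(\Omega)$ and $c > 0$ such that (7.4.15) and (7.4.16) hold true. Then problem (7.4.1) has at least one weak solution $u \in W_0^{1,2}(\Omega)$.

Remark 7.4.7. It follows directly that the closer $c$ is to the value $\frac{1}{c_{\mathrm{emb}}^{2}}$, the larger $R$ has to be taken in order to get (7.4.17). At the same time one can see that the statement does not hold if $c = \lambda_1$ (see Remark 7.4.5).

Exercise 7.4.8. Prove that if $\lambda$ is an eigenvalue of (7.4.13), then $\lambda \ge \lambda_1$.
Hint. Apply Theorem 6.3.12 similarly to Example 6.3.15.

Exercise 7.4.9. Formulate and prove an analogue of Theorems 7.4.1, 7.4.3 and 7.4.6 for the Neumann problem
$$\begin{cases} -\Delta u(x) = g(x, u(x)) & \text{in } \Omega,\\[2pt] \dfrac{\partial u}{\partial n} = 0 & \text{on } \partial\Omega. \end{cases}$$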
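Returning to the radius conditions (7.4.12) and (7.4.17) and to Remark 7.4.7, an admissible $R$ can be written down explicitly. The following sketch uses hypothetical helper names; the specific formulas are one convenient sufficient choice made here for illustration, not the least possible radius and not taken from the text. Here `C` and `D` stand for the constants of (7.4.11), and `C` in the linear case stands for $c_{\mathrm{emb}}\|r\|_{L^{2}(\Omega)}$.

```python
def radius_sublinear(C, D, delta):
    """Return some R > 0 with C + D * R**delta < R, for 0 < delta < 1
    and C, D >= 0 (cf. (7.4.12)).

    Reasoning: R > 2C gives C < R/2, and R >= (2D)**(1/(1-delta))
    gives D * R**delta <= R/2, so the sum is strictly below R.
    """
    if not 0.0 < delta < 1.0:
        raise ValueError("delta must lie strictly between 0 and 1")
    if C < 0.0 or D < 0.0:
        raise ValueError("C and D must be nonnegative")
    return 2.0 * C + (2.0 * D) ** (1.0 / (1.0 - delta)) + 1.0

def radius_linear(C, c, c_emb):
    """Return some R > 0 with C + c * c_emb**2 * R < R (cf. (7.4.17)).

    This is only possible when q = c * c_emb**2 < 1, which is (7.4.16);
    then R * (1 - q) = 2C + (1 - q) > C for the value returned below.
    """
    q = c * c_emb ** 2
    if q >= 1.0:
        raise ValueError("condition (7.4.16) is violated: c * c_emb**2 >= 1")
    if C < 0.0:
        raise ValueError("C must be nonnegative")
    return 2.0 * C / (1.0 - q) + 1.0
```

With `c_emb = 1` and `C = 1`, moving `c` from `0.9` to `0.99` increases the returned radius roughly tenfold, which is the behaviour described in Remark 7.4.7; at `c = 1` the function refuses to return a radius at all.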
7.5 Weak Solutions, Application of Degree Theory

In this section we show how to generalize assumptions (7.4.15) and (7.4.16). We apply the degree theory and prove that more general nonlinearities $g = g(x,s)$ can be considered in (7.4.1) than those satisfying (7.4.15) and (7.4.16). At the same time we must be aware of Remark 7.4.5 and avoid the situation treated there. But first we give an example concerning the higher eigenvalues of (7.4.13).

Example 7.5.1. If for a $\lambda$ there is $u \neq o$, $u \in W_0^{1,2}(\Omega)$, such that
$$\int_{\Omega} \nabla u(x)\nabla v(x)\,dx = \lambda \int_{\Omega} u(x)v(x)\,dx \quad\text{holds for any } v \in W_0^{1,2}(\Omega), \tag{7.5.1}$$
then $\lambda$ is an eigenvalue and $u$ is the corresponding eigenfunction of (7.4.13). Let $A : W_0^{1,2}(\Omega) \to W_0^{1,2}(\Omega)$ be from Example 7.4.4. Then (7.5.1) is equivalent to the eigenvalue problem
$$\mu u = Au \quad\text{where } \mu = \frac{1}{\lambda}. \tag{7.5.2}$$
It follows from Theorem 6.3.12 (cf. also the Hilbert–Schmidt Theorem, Theorem 2.2.16) that $A$ has a countable set of eigenvalues $\{\mu_n\}_{n=1}^{\infty}$ of finite multiplicity such that $\mu_n \ge \mu_{n+1} \ge \dots > 0$, $\mu_n \to 0$. Hence the eigenvalues of (7.4.13) form an increasing sequence
$$0 < \lambda_1 \le \lambda_2 \le \lambda_3 \le \cdots, \qquad \lambda_n \to \infty.$$
In fact, it is also possible to prove that $\lambda_1$ has multiplicity 1 (i.e., $\lambda_1 < \lambda_2$) and the corresponding eigenspace is spanned by a function $\varphi_1$ which is positive in $\Omega$, see Example 5.4.40. The system of all normalized eigenfunctions $\{\varphi_n\}_{n=1}^{\infty}$ forms an orthonormal basis in $W_0^{1,2}(\Omega)$ (cf. the Hilbert–Schmidt Theorem (Theorem 2.2.16)). It follows from the regularity result (which is far from being simple for $N \ge 2$, see, e.g., Gilbarg & Trudinger [59], Fučík [53], Nečas [99]) that all eigenfunctions $\varphi_k$ are continuous functions in $\Omega$.

Let us assume now that the nonlinear function $g = g(x,s)$ is of the form
$$g(x,s) = \lambda s + h(x,s) \quad\text{where}\quad \lambda \in \mathbb{R},\ h \in CAR(\Omega \times \mathbb{R}) \tag{7.5.3}$$
and there exists $r \in L^{2}(\Omega)$ such that
$$|h(x,s)| \le r(x) \quad\text{for a.a. } x \in \Omega \text{ and all } s \in \mathbb{R}. \tag{7.5.4}$$
Now we can formulate the following assertion which generalizes (in the sense of the growth of $g$ with respect to $s$) the results of the previous section.

Theorem 7.5.2. Let $h \in CAR(\Omega \times \mathbb{R})$ satisfy (7.5.4). Let, moreover, $\lambda \neq \lambda_n$, $n = 1, 2, \dots$ Then the problem
$$\begin{cases} -\Delta u(x) = \lambda u(x) + h(x, u(x)) & \text{in } \Omega,\\ u = 0 & \text{on } \partial\Omega, \end{cases} \tag{7.5.5}$$
has at least one weak solution.

Note that the assertion of Theorem 7.5.2 does not hold without the assumption $\lambda \neq \lambda_n$, $n = 1, 2, \dots$, as shown in Remark 7.4.5.

Proof of Theorem 7.5.2. We will use the Leray–Schauder degree theory to prove the result. The reader is invited to verify that the existence of a weak solution of (7.5.5) is equivalent to the existence of a solution of the operator equation
$$u = \lambda Au + S(u) \tag{7.5.6}$$
where $A$ is as in Example 7.4.4 and $S : W_0^{1,2}(\Omega) \to W_0^{1,2}(\Omega)$ is defined by
$$(S(u), v) = \int_{\Omega} h(x, u(x))v(x)\,dx \quad\text{for all } u, v \in W_0^{1,2}(\Omega).$$
Note that $S$ is a compact operator (see page 488).
Our plan is the following. The existence of at least one solution of (7.5.6) would follow from
$$\deg(I - \lambda A - S(\cdot), B(o;R), o) \ne 0 \tag{7.5.7}$$
if we found a ball $B(o;R)$ for which (7.5.7) is valid. To prove (7.5.7) we use the homotopy invariance property of the degree (Theorem 5.2.13(vi)) and "connect" the operator $I - \lambda A - S(\cdot)$ with the operator $I - \lambda A$ on the boundary of a ball $B(o;R)$ with a sufficiently large radius $R > 0$. Once this is done we finally use $\deg(I - \lambda A, B(o;R), o) \ne 0$.$^{14}$

So, to complete the proof, we have to find an admissible homotopy connecting $I - \lambda A - S(\cdot)$ and $I - \lambda A$. Probably the simplest way to do it is the following. Define
$$H(\tau, u) = u - \lambda A(u) - \tau S(u), \qquad \tau \in [0,1],\ u \in W_0^{1,2}(\Omega),$$
and prove that there exists $R > 0$ such that for all $u \in W_0^{1,2}(\Omega)$, $\|u\| = R$ and $\tau \in [0,1]$ we have
$$H(\tau, u) \ne o. \tag{7.5.8}$$
The usual way to establish (7.5.8) is an indirect proof. Assume that no such $R > 0$ exists, i.e., we can find sequences $\{u_n\}_{n=1}^{\infty} \subset W_0^{1,2}(\Omega)$ and $\{\tau_n\}_{n=1}^{\infty} \subset [0,1]$ such that $\|u_n\| \to \infty$ and
$$u_n - \lambda A(u_n) - \tau_n S(u_n) = o. \tag{7.5.9}$$
Set $v_n := \dfrac{u_n}{\|u_n\|}$ and divide (7.5.9) by $\|u_n\|$ to get
$$v_n - \lambda A(v_n) - \tau_n \frac{S(u_n)}{\|u_n\|} = o. \tag{7.5.10}$$
By the definitions of $A$ and $S$, this is equivalent to
$$\int_\Omega \nabla v_n(x) \nabla w(x)\,dx = \lambda \int_\Omega v_n(x) w(x)\,dx + \tau_n \int_\Omega \frac{h(x, u_n(x))}{\|u_n\|} w(x)\,dx$$
for any $w \in W_0^{1,2}(\Omega)$. Now, passing to suitable subsequences we can assume that $\tau_{n_k} \to \tau \in [0,1]$, $v_{n_k} \rightharpoonup v$ in $W_0^{1,2}(\Omega)$. At the same time
$$\int_\Omega \frac{|h(x, u_{n_k}(x))|}{\|u_{n_k}\|} |w(x)|\,dx \le \int_\Omega \frac{r(x)}{\|u_{n_k}\|} |w(x)|\,dx \to 0 \quad \text{as } k \to \infty.$$
To summarize, we have
$$\tau_{n_k} \frac{S(u_{n_k})}{\|u_{n_k}\|} \to o, \tag{7.5.11}$$
$$A(v_{n_k}) \to A(v) \tag{7.5.12}$$
(by the compactness of $A$, see Proposition 2.2.4(iii)).

$^{14}$ This follows from Proposition 5.2.22 or Exercise 5.2.26.
So, putting together (7.5.10)–(7.5.12) we also obtain that
$$v_{n_k} \to v^* \quad \text{in } W_0^{1,2}(\Omega).$$
But $v^* = v$ by virtue of $v_{n_k} \rightharpoonup v$. Now, passing to the limit in (7.5.10) with $n$ replaced by $n_k$, we arrive at
$$v - \lambda A(v) = o, \tag{7.5.13}$$
and $v \in W_0^{1,2}(\Omega)$ satisfies $\|v\| = 1$ (it is the strong limit of elements $v_{n_k}$ which satisfy $\|v_{n_k}\| = 1$!). However, this contradicts our assumption $\lambda \ne \lambda_n$, $n = 1, 2, \dots$. It proves that (7.5.8) holds, i.e., the homotopy $H$ is admissible. This completes the proof.

Following the scheme of the proof one can also handle nonlinearities of type (7.5.3) with $\lambda = \lambda_n$ for an $n = 1, 2, \dots$. However, the assumptions on $h$ must be strengthened in order "to help" to prove that the corresponding homotopy $H$ is admissible. The following assertion is a simple example of a generalization of Theorem 7.5.2 in this direction.

Theorem 7.5.3. Let $f \in L^2(\Omega)$ and $g : \mathbb{R} \to \mathbb{R}$ be continuous with finite limits $\lim\limits_{s \to \pm\infty} g(s) = g(\pm\infty)$ and such that for all $s \in \mathbb{R}$ we have
$$g(-\infty) < g(s) < g(+\infty).$$
Let $\lambda_1 > 0$ be the first eigenvalue of the Laplace operator subject to the Dirichlet boundary conditions. Then the problem
$$\begin{cases} -\Delta u(x) = \lambda_1 u(x) + g(u(x)) - f(x) & \text{in } \Omega,\\ u = 0 & \text{on } \partial\Omega, \end{cases} \tag{7.5.14}$$
has at least one weak solution if and only if
$$g(-\infty) < \int_\Omega f(x) \varphi_1(x)\,dx < g(+\infty) \tag{7.5.15}$$
where $\varphi_1$ is the first positive eigenfunction normalized by $\int_\Omega \varphi_1(x)\,dx = 1$.
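Condition (7.5.15) is easy to test numerically in a concrete case. The sketch below rests on assumptions that are not part of the text: $N = 1$, $\Omega = (0,\pi)$, so that $\lambda_1 = 1$ and $\varphi_1(x) = \tfrac12 \sin x$ (normalized so that $\int_\Omega \varphi_1 = 1$), and $g = \arctan$, for which $g(\pm\infty) = \pm\pi/2$. The helper names `integrate`, `phi1` and `landesman_lazer_holds` are hypothetical.

```python
import math

def integrate(func, a, b, n=20000):
    """Composite midpoint rule on [a, b]."""
    h = (b - a) / n
    return h * sum(func(a + (i + 0.5) * h) for i in range(n))

def phi1(x):
    """First Dirichlet eigenfunction on (0, pi), normalized to have integral 1."""
    return 0.5 * math.sin(x)

def landesman_lazer_holds(f, g_minus, g_plus):
    """Check g(-inf) < int f*phi1 dx < g(+inf), i.e. condition (7.5.15)."""
    moment = integrate(lambda x: f(x) * phi1(x), 0.0, math.pi)
    return g_minus < moment < g_plus

g_minus, g_plus = -math.pi / 2, math.pi / 2   # limits of arctan at -inf, +inf

print(landesman_lazer_holds(lambda x: 1.0, g_minus, g_plus))  # moment is about 1
print(landesman_lazer_holds(lambda x: 2.0, g_minus, g_plus))  # moment is about 2
```

Under these assumptions the constant right-hand side $f \equiv 1$ satisfies (7.5.15), while $f \equiv 2$ does not, since $2 > \pi/2$.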
Proof. We will follow a scheme similar to the proof of Theorem 7.5.2. For $\delta > 0$ so small that $\lambda_1 + \delta < \lambda_2$ we define the homotopy
$$H(\tau, u) := u - \lambda_1 A(u) - (1 - \tau)\delta A(u) - \tau S(u),^{15} \qquad \tau \in [0,1],\ u \in W_0^{1,2}(\Omega).$$
Performing all steps as in the proof of Theorem 7.5.2 we arrive at an analogue of (7.5.13), namely,
$$v - [\lambda_1 + (1 - \tau)\delta] A(v) = o, \qquad \|v\| = 1, \qquad \text{for a } \tau \in [0,1].$$
This is a contradiction if $\tau \ne 1$ since $\lambda_1 + (1 - \tau)\delta$ is not an eigenvalue and $v \ne o$.

$^{15}$ But now $(S(u), v) = \int_\Omega g(u(x)) v(x)\,dx$, $u, v \in W_0^{1,2}(\Omega)$.
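Let us assume $\tau = 1$, i.e., $\tau_{n_k} \to 1$. Now, however, we have no contradiction since $\lambda_1$ is an eigenvalue and $v - \lambda_1 A(v) = o$ has a solution with $\|v\| = 1$. Another step is necessary to reach a contradiction and to prove that the homotopy $H$ is admissible. We have to revise the last step when passing to the limit in
$$v_n - \lambda_1 A(v_n) - (1 - \tau_n)\delta A(v_n) - \tau_n \frac{S(u_n)}{\|u_n\|} = o$$
and employ special properties of $S$. Namely,
$$u_{n_k} - \lambda_1 A(u_{n_k}) - (1 - \tau_{n_k})\delta A(u_{n_k}) - \tau_{n_k} S(u_{n_k}) = o$$
is equivalent to the integral identity
$$\int_\Omega \nabla u_{n_k}(x) \nabla w(x)\,dx = [\lambda_1 + (1 - \tau_{n_k})\delta] \int_\Omega u_{n_k}(x) w(x)\,dx + \tau_{n_k} \int_\Omega g(u_{n_k}(x)) w(x)\,dx - \tau_{n_k} \int_\Omega f(x) w(x)\,dx \tag{7.5.16}$$
for all $w \in W_0^{1,2}(\Omega)$. Taking $w = \varphi_1$ in (7.5.16) and using the fact that
$$\int_\Omega \nabla u_{n_k}(x) \nabla \varphi_1(x)\,dx = \lambda_1 \int_\Omega u_{n_k}(x) \varphi_1(x)\,dx,$$
we obtain
$$(1 - \tau_{n_k})\delta \int_\Omega u_{n_k}(x) \varphi_1(x)\,dx + \tau_{n_k} \int_\Omega g(u_{n_k}(x)) \varphi_1(x)\,dx = \tau_{n_k} \int_\Omega f(x) \varphi_1(x)\,dx. \tag{7.5.17}$$
As above, $v_{n_k} := \dfrac{u_{n_k}}{\|u_{n_k}\|} \to v$ in $W_0^{1,2}(\Omega)$ and $v = \kappa \varphi_1$ with a $\kappa \ne 0$. Assume that $\kappa > 0$. Then (at least for a subsequence) $u_{n_k}(x) \to \infty$ a.e. in $\Omega$.$^{16}$ Passing to the limit in (7.5.17) and using $\tau_{n_k} \to 1-$ and the Lebesgue Dominated Convergence Theorem we obtain
$$\int_\Omega f(x) \varphi_1(x)\,dx \ge \lim_{k \to \infty} \int_\Omega g(u_{n_k}(x)) \varphi_1(x)\,dx = \int_\Omega g(+\infty) \varphi_1(x)\,dx = g(+\infty).$$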
This contradicts the second inequality in (7.5.15). Similarly we proceed if $\kappa < 0$ to get a contradiction with the first inequality in (7.5.15). This proves that $H$ is admissible, and so (7.5.15) is sufficient for the existence of a weak solution of (7.5.14).

$^{16}$ Indeed, $v_{n_k} \to \kappa \varphi_1$ in $W_0^{1,2}(\Omega)$ implies $v_{n_k} \to \kappa \varphi_1$ in $L^2(\Omega)$ (Theorem 1.2.26). Hence there exists a subsequence $\{v_{n_{k_l}}\}_{l=1}^{\infty} \subset \{v_{n_k}\}_{k=1}^{\infty}$ such that $v_{n_{k_l}} \to \kappa \varphi_1 > 0$ a.e. in $\Omega$ (Remark 1.2.18), i.e., $u_{n_{k_l}}(x) \to \infty$ a.e. in $\Omega$.
To prove that (7.5.15) is also necessary we proceed as follows. Let $u_0$ be a weak solution of (7.5.14), i.e., for any $w \in W_0^{1,2}(\Omega)$ let us have
$$\int_\Omega \nabla u_0(x) \nabla w(x)\,dx = \lambda_1 \int_\Omega u_0(x) w(x)\,dx + \int_\Omega g(u_0(x)) w(x)\,dx - \int_\Omega f(x) w(x)\,dx.$$
Set $w = \varphi_1$. Then
$$\int_\Omega g(u_0(x)) \varphi_1(x)\,dx = \int_\Omega f(x) \varphi_1(x)\,dx,$$
and the result follows from the fact that
$$g(-\infty) < \int_\Omega g(u_0(x)) \varphi_1(x)\,dx < g(+\infty).^{17}$$

$^{17}$ Indeed, note that $g(-\infty) < g(u_0(x)) < g(+\infty)$, $\varphi_1(x) > 0$ in $\Omega$, and $\int_\Omega \varphi_1(x)\,dx = 1$.

Remark 7.5.4. Theorem 7.5.3 holds for more general nonlinearities $g$ (see Exercises 7.5.9 and 7.5.10). Conditions of type (7.5.15) are called the Landesman–Lazer type conditions according to the paper Landesman & Lazer [83] where analogous results were proved for the first time.

Remark 7.5.5. The main difference between Theorems 7.5.2 and 7.5.3 consists in the fact that $\lambda$ is not an eigenvalue of (7.4.13) in Theorem 7.5.2 while $\lambda$ is an eigenvalue of (7.4.13) in Theorem 7.5.3. That is why we speak about a nonresonance problem in the former and about a resonance problem in the latter case. The word "resonance" is used here because the corresponding ordinary differential version of (7.4.13) describes the resonance in electric circuits when $\lambda$ is an eigenvalue.

Remark 7.5.6. A result analogous to Theorem 7.5.3 can be proved also if $\lambda_1$ is replaced by any eigenvalue $\lambda_n$, $n \ge 2$. The Landesman–Lazer type conditions then have a different form and all linearly independent eigenfunctions associated with $\lambda_n$ must be involved (see, e.g., Fučík [53]).

Remark 7.5.7. Let us mention one geometrical aspect of condition (7.5.15). It follows from the linear Fredholm alternative (see Exercise 7.5.11) that
$$\begin{cases} -\Delta u(x) = \lambda_1 u(x) - f(x) & \text{in } \Omega,\\ u = 0 & \text{on } \partial\Omega \end{cases} \tag{7.5.18}$$
with $f \in L^2(\Omega)$ has a weak solution if and only if $f$ belongs to a linear subspace $V$ of $L^2(\Omega)$ of codimension 1:
$$V = \left\{ f \in L^2(\Omega) : \int_\Omega f(x) \varphi_1(x)\,dx = 0 \right\}.$$
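If $g(-\infty) < 0 < g(+\infty)$, then the set of all $f$ for which (7.5.14) has at least one weak solution contains $V$ as a proper subset and is in fact much larger (see Figure 7.5.1).

[Figure 7.5.1. Problem (7.5.14) has a weak solution if and only if $f$ belongs to the shaded area. Labels in the figure: the space $L^2(\Omega)$, the subspace $V$, the sets $\{ f : \int_\Omega f(x) \varphi_1(x)\,dx = g(\pm\infty) \}$, the origin $o$ and the direction $\varphi_1$.]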
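Exercise 7.5.8. The same method as in the proof of Theorem 7.5.2 works if we replace (7.5.4) by a more general assumption
$$|h(x,s)| \le r(x) + |s|^{\delta} \quad \text{for a.a. } x \in \Omega \text{ and all } s \in \mathbb{R}$$
where $\delta \in (0,1)$. Modify the proof and get the corresponding generalization of Theorem 7.5.2.

Exercise 7.5.9. Let $h \in \mathrm{CAR}(\Omega \times \mathbb{R})$ be bounded and satisfy the following conditions: Let there exist limits
$$h(x, -\infty) = \lim_{s \to -\infty} h(x,s), \qquad h(x, +\infty) = \lim_{s \to +\infty} h(x,s)$$
for a.a. $x \in \Omega$. Assume that
$$h(x, -\infty) < h(x, +\infty) \qquad \big(h(x, +\infty) < h(x, -\infty)\big) \quad \text{for a.a. } x \in \Omega.$$
Prove that
$$\begin{cases} -\Delta u(x) = \lambda_1 u(x) + h(x, u(x)) & \text{in } \Omega,\\ u = 0 & \text{on } \partial\Omega \end{cases} \tag{7.5.19}$$
has at least one weak solution provided
$$\int_\Omega h(x, -\infty) \varphi_1(x)\,dx < 0 < \int_\Omega h(x, +\infty) \varphi_1(x)\,dx \qquad \left( \int_\Omega h(x, +\infty) \varphi_1(x)\,dx < 0 < \int_\Omega h(x, -\infty) \varphi_1(x)\,dx \right). \tag{7.5.20}$$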
Exercise 7.5.10. Assume that for a.a. $x \in \Omega$ and for all $s \in \mathbb{R}$ we have
$$h(x, -\infty) < h(x,s) < h(x, +\infty) \qquad \big(h(x, +\infty) < h(x,s) < h(x, -\infty)\big).$$
Prove that (7.5.20) is also a necessary condition for the existence of a weak solution of (7.5.19).

Exercise 7.5.11. Let $f \in L^2(\Omega)$. Prove that (7.5.18) has a weak solution if and only if $f \in V$, i.e.,
$$\int_\Omega f(x) \varphi_1(x)\,dx = 0.$$
(Cf. the Fredholm alternative mentioned in Theorem 2.2.9(iv).)

Hint. Let $A$ be from Example 7.4.4. Then $I - \lambda_1 A$ is a self-adjoint operator and $\operatorname{Ker}(I - \lambda_1 A) = \operatorname{Lin}\{\varphi_1\}$. Any $f \in L^2(\Omega)$ defines a continuous linear form $f^*$ on $W_0^{1,2}(\Omega)$ by
$$f^*(u) = \int_\Omega f(x) u(x)\,dx, \qquad u \in W_0^{1,2}(\Omega).$$
It follows from Proposition 2.1.27(iv) that given $f \in L^2(\Omega)$, $\int_\Omega f(x) \varphi_1(x)\,dx = 0$, then
$$f^* \in (\operatorname{Ker}(I - \lambda_1 A))^{\perp} = \overline{\operatorname{Im}(I - \lambda_1 A)}.$$
However, $\overline{\operatorname{Im}(I - \lambda_1 A)} = \operatorname{Im}(I - \lambda_1 A)$ by Theorem 2.2.9(iii). On the other hand, if (7.5.18) has a weak solution $u_0$ for a given $f \in L^2(\Omega)$, then taking $v = \varphi_1$ as a test function we arrive at
$$\int_\Omega f(x) \varphi_1(x)\,dx = 0.$$

Exercise 7.5.12. Prove an analogue of Theorem 7.5.2 for the Neumann problem
$$\begin{cases} -\Delta u(x) = \lambda u(x) + h(x, u(x)) & \text{in } \Omega,\\ \dfrac{\partial u}{\partial n} = 0 & \text{on } \partial\Omega. \end{cases}$$

Exercise 7.5.13. Prove an analogue of Theorem 7.5.3 for the Neumann problem
$$\begin{cases} -\Delta u(x) = g(u(x)) - f(x) & \text{in } \Omega,\\ \dfrac{\partial u}{\partial n} = 0 & \text{on } \partial\Omega. \end{cases}$$
7.5A Application of the Degree of Generalized Monotone Operators

In this appendix we deal with the boundary value problem
$$\begin{cases} -\Delta_p u(x) = \lambda |u(x)|^{p-2} u(x) + f(\lambda, x, u(x)) & \text{in } \Omega,\\ u = 0 & \text{on } \partial\Omega \end{cases} \tag{7.5.21}$$
where $p > 1$, $\Omega \in C^{0,1}$ is a bounded domain in $\mathbb{R}^N$, $f : \mathbb{R} \times \Omega \times \mathbb{R} \to \mathbb{R}$ is a Carathéodory function (see Remark 3.2.25) which satisfies some further conditions specified below, and $\Delta_p u := \operatorname{div}\left(|\nabla u|^{p-2} \nabla u\right)$$^{18}$ is the $p$-Laplacian. It is well known that the problem
$$\begin{cases} -\Delta_p u(x) = \lambda |u(x)|^{p-2} u(x) & \text{in } \Omega,\\ u = 0 & \text{on } \partial\Omega \end{cases}$$
has a principal eigenvalue (i.e., the least one) $\lambda_1 > 0$ which is simple, isolated and characterized variationally by
$$\lambda_1 = \inf \frac{\int_\Omega |\nabla u(x)|^p\,dx}{\int_\Omega |u(x)|^p\,dx}$$
where "inf" is taken over all $u \in W_0^{1,p}(\Omega)$, $u \ne o$; no eigenfunction associated with $\lambda_1$ changes sign in $\Omega$ and they all form a one-dimensional linear space (see, e.g., Anane [7], Lindqvist [86], or Example 6.3.5 for the case $N = 1$).

Our aim in this appendix is to show that under some appropriate assumptions on $f$ the value $\lambda_1$ is a bifurcation point of (7.5.21) in the sense that every neighborhood of $(\lambda_1, o)$ in $\mathbb{R} \times W_0^{1,p}(\Omega)$ contains at least one $(\lambda, u)$, $u \ne o$, which solves (in the sense mentioned below) the boundary value problem (7.5.21). Let us assume, for simplicity, that $f = f(\lambda, x, s)$ is uniformly bounded, i.e., there exists a constant $M > 0$ such that for any $(\lambda, s) \in \mathbb{R}^2$ and a.a. $x \in \Omega$,
$$|f(\lambda, x, s)| \le M. \tag{7.5.22}$$
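The variational characterization of $\lambda_1$ above says that the quotient $\int_\Omega |\nabla u|^p / \int_\Omega |u|^p$ of any admissible $u \ne o$ is an upper bound for $\lambda_1$. The sketch below evaluates this quotient numerically under assumptions not made in the text: $N = 1$, $\Omega = (0,1)$, and trial functions supplied together with their derivatives. For $p = 2$ the principal eigenvalue on $(0,1)$ is $\pi^2$, attained by $\sin(\pi x)$. The names `integrate` and `rayleigh_quotient` are hypothetical.

```python
import math

def integrate(func, a, b, n=20000):
    """Composite midpoint rule on [a, b]."""
    h = (b - a) / n
    return h * sum(func(a + (i + 0.5) * h) for i in range(n))

def rayleigh_quotient(u, du, p):
    """Quotient int |u'|^p / int |u|^p on (0, 1) for a trial function u with derivative du."""
    numerator = integrate(lambda x: abs(du(x)) ** p, 0.0, 1.0)
    denominator = integrate(lambda x: abs(u(x)) ** p, 0.0, 1.0)
    return numerator / denominator

# p = 2: the eigenfunction sin(pi x) attains the infimum pi^2.
q_sine = rayleigh_quotient(lambda x: math.sin(math.pi * x),
                           lambda x: math.pi * math.cos(math.pi * x), 2)
# p = 2: the polynomial x(1 - x) vanishes on the boundary and gives an upper bound.
q_poly = rayleigh_quotient(lambda x: x * (1.0 - x),
                           lambda x: 1.0 - 2.0 * x, 2)

print(q_sine, q_poly)
```

The polynomial trial function yields the value 10, slightly above $\pi^2 \approx 9.87$, as the infimum property predicts. The same function can be called with other values of $p > 1$.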
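Let us define for $\lambda \in \mathbb{R}$ operators $J, S, F_\lambda : W_0^{1,p}(\Omega) \to (W_0^{1,p}(\Omega))^*$ by
$$\langle J(u), v \rangle = \int_\Omega |\nabla u(x)|^{p-2} \nabla u(x) \nabla v(x)\,dx, \qquad \langle S(u), v \rangle = \int_\Omega |u(x)|^{p-2} u(x) v(x)\,dx,$$
$$\langle F_\lambda(u), v \rangle = \int_\Omega f(\lambda, x, u(x)) v(x)\,dx \qquad \text{for } u, v \in W_0^{1,p}(\Omega).$$
It follows from (7.5.22), Theorem 3.2.24 and Remark 3.2.26 that $J$, $S$ and $F_\lambda$ are well-defined operators (the reader should justify it in detail!). Actually, combining Theorems 1.2.26, 1.2.28, 3.2.24 and Remark 3.2.26 we have (cf. Exercise 7.5.18):$^{19}$

$^{18}$ This means
$$\Delta_p u = \frac{\partial}{\partial x_1} \left\{ \left[ \left(\frac{\partial u}{\partial x_1}\right)^2 + \cdots + \left(\frac{\partial u}{\partial x_N}\right)^2 \right]^{\frac{p-2}{2}} \frac{\partial u}{\partial x_1} \right\} + \cdots + \frac{\partial}{\partial x_N} \left\{ \left[ \left(\frac{\partial u}{\partial x_1}\right)^2 + \cdots + \left(\frac{\partial u}{\partial x_N}\right)^2 \right]^{\frac{p-2}{2}} \frac{\partial u}{\partial x_N} \right\},$$
$\Delta_p u = 0$ if $\nabla u = o$.

$^{19}$ We work with the norm $\|u\| = \left( \int_\Omega |\nabla u(x)|^p\,dx \right)^{\frac{1}{p}}$ on $W_0^{1,p}(\Omega)$, i.e., $\langle J(u), u \rangle = \|u\|^p$.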
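(a) $J$ and $S$ are bounded operators in the sense that they map bounded sets into bounded sets;
(b) $F_\lambda$ is a uniformly bounded operator in the sense that there exists a constant $c > 0$ such that for any $\lambda \in \mathbb{R}$ and $u \in W_0^{1,p}(\Omega)$ we have $\|F_\lambda(u)\|_{(W_0^{1,p}(\Omega))^*} \le c$;
(c) $J$, $S$ and $F_\lambda$ are continuous operators;
(d) for any $u, v \in W_0^{1,p}(\Omega)$,
$$\langle J(u) - J(v), u - v \rangle \ge \left( \|u\|^{p-1} - \|v\|^{p-1} \right) \left( \|u\| - \|v\| \right);$$
(e) if $u_n \rightharpoonup u$ in $W_0^{1,p}(\Omega)$ and $\lambda_n \to \lambda$ in $\mathbb{R}$, then we have $S(u_n) \to S(u)$ and $F_{\lambda_n}(u_n) \to F_\lambda(u)$ in $(W_0^{1,p}(\Omega))^*$; in particular, $S$ and $F_\lambda$ are compact.

We say that a couple $(\lambda, u) \in \mathbb{R} \times W_0^{1,p}(\Omega)$ is a weak solution of (7.5.21) if
$$\langle J(u), v \rangle = \lambda \langle S(u), v \rangle + \langle F_\lambda(u), v \rangle \quad \text{holds for any } v \in W_0^{1,p}(\Omega).$$
Thus, to find a weak solution of (7.5.21) is equivalent to finding a couple $(\lambda, u) \in \mathbb{R} \times W_0^{1,p}(\Omega)$ which satisfies the operator equation
$$J(u) = \lambda S(u) + F_\lambda(u). \tag{7.5.23}$$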
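Property (d) has a finite-dimensional counterpart that can be checked by direct computation. The sketch below is an illustration only, not the proof requested in Exercise 7.5.18: it tests the inequality $\left( |x|^{p-2} x - |y|^{p-2} y \right) \cdot (x - y) \ge \left( |x|^{p-1} - |y|^{p-1} \right) (|x| - |y|)$ for random vectors in $\mathbb{R}^3$, where it follows from the Cauchy–Schwarz inequality. The names `norm`, `a` and `monotonicity_gap` are hypothetical.

```python
import random

def norm(x):
    """Euclidean norm of a vector given as a list of floats."""
    return sum(t * t for t in x) ** 0.5

def a(x, p):
    """The map x -> |x|^(p-2) x, with value 0 at x = 0."""
    n = norm(x)
    if n == 0.0:
        return [0.0] * len(x)
    return [n ** (p - 2) * t for t in x]

def monotonicity_gap(x, y, p):
    """Left-hand side minus right-hand side of the pointwise analogue of (d)."""
    ax, ay = a(x, p), a(y, p)
    lhs = sum((s - t) * (u - v) for s, t, u, v in zip(ax, ay, x, y))
    rhs = (norm(x) ** (p - 1) - norm(y) ** (p - 1)) * (norm(x) - norm(y))
    return lhs - rhs

rng = random.Random(0)  # fixed seed so the run is reproducible
worst = min(
    monotonicity_gap([rng.uniform(-1, 1) for _ in range(3)],
                     [rng.uniform(-1, 1) for _ in range(3)], p)
    for p in (1.5, 2.0, 4.0)
    for _ in range(1000)
)
print(worst)  # nonnegative up to rounding error
```

In $W_0^{1,p}(\Omega)$ the same expansion of $\langle J(u) - J(v), u - v \rangle$ works with the Hölder inequality in place of Cauchy–Schwarz.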
Lemma 7.5.14. For any $\lambda \in \mathbb{R}$ the operator $J - \lambda S - F_\lambda$ satisfies the $(S_+)$ condition (see Definition 5.2.44).

Proof. We use the property (d) to prove that $J$ satisfies the $(S_+)$ condition (we proceed similarly to Example 5.2.51). The assertion then follows from (e) and Lemma 5.2.46.

Assume that for any $\lambda \in \mathbb{R}$ and a.a. $x \in \Omega$ we have
$$f(\lambda, x, 0) = 0. \tag{7.5.24}$$
This immediately yields $F_\lambda(o) = o$ for any $\lambda \in \mathbb{R}$. We also assume that for any bounded interval $I \subset \mathbb{R}$ the limit
$$\lim_{s \to 0} \frac{f(\lambda, x, s)}{|s|^{p-2} s} = 0 \tag{7.5.25}$$
exists uniformly for a.a. $x \in \Omega$ and $\lambda \in I$. This implies
$$\lim_{\|u\| \to 0} \frac{F_\lambda(u)}{\|u\|^{p-1}} = o \quad \text{uniformly for } \lambda \in I \tag{7.5.26}$$
(cf. Exercise 7.5.19). The fact that the eigenvalue $\lambda_1 > 0$ is isolated and the compactness of $S$ imply that there exists $\delta > 0$ such that for any $\lambda \in (\lambda_1 - \delta, \lambda_1 + \delta)$, $\lambda \ne \lambda_1$, there exists $c = c(\lambda) > 0$ such that
$$\|J(u) - \lambda S(u)\|_{(W_0^{1,p}(\Omega))^*} \ge c(\lambda) \|u\|^{p-1} \quad \text{for any } u \in W_0^{1,p}(\Omega) \tag{7.5.27}$$
(cf. Exercise 7.5.20). Denote $T_\lambda := J - \lambda S - F_\lambda$. Then (7.5.26) and (7.5.27) imply that for $\lambda \in (\lambda_1 - \delta, \lambda_1 + \delta)$, $\lambda \ne \lambda_1$, $o$ is an isolated solution of $T_\lambda(u) = o$ and its index $i(T_\lambda, o)$$^{20}$ is well defined.

$^{20}$ Here we mean the index from Appendix 5.2B, page 304.
Proposition 7.5.15. Let us assume
$$\lim_{\lambda \to \lambda_1-} i(T_\lambda, o) \ne \lim_{\lambda \to \lambda_1+} i(T_\lambda, o).$$
Then $(\lambda_1, o)$ is a bifurcation point of $T_\lambda$, i.e., there exist $\lambda_n \to \lambda_1$, $u_n \to o$, $u_n \ne o$, such that $T_{\lambda_n}(u_n) = o$.

Proof. To prove this assertion we can follow the proof of Proposition 5.2.20, but working with the index defined on page 304.

Combining (7.5.26) and (7.5.27) we obtain that
$$i(T_\lambda, o) = i(J - \lambda S, o) \quad \text{for } \lambda \in (\lambda_1 - \delta, \lambda_1 + \delta),\ \lambda \ne \lambda_1 \tag{7.5.28}$$
(cf. Exercise 7.5.21). It then follows that, to prove that $(\lambda_1, o)$ is a bifurcation point of $T_\lambda$, it suffices to prove that
$$\lim_{\lambda \to \lambda_1-} i(J - \lambda S, o) \ne \lim_{\lambda \to \lambda_1+} i(J - \lambda S, o). \tag{7.5.29}$$
We are now ready to prove the following assertion.

Theorem 7.5.16. Let $f$ satisfy all the assumptions stated above. Then $(\lambda_1, o)$ is a bifurcation point of $T_\lambda$, i.e., of (7.5.21).

Proof. Let $\delta > 0$ be such that (7.5.28) holds. Taking into account the above discussion it is enough to prove
$$i(J - \lambda S, o) = 1 \quad \text{for } \lambda \in (\lambda_1 - \delta, \lambda_1), \tag{7.5.30}$$
$$i(J - \lambda S, o) = -1 \quad \text{for } \lambda \in (\lambda_1, \lambda_1 + \delta). \tag{7.5.31}$$
It follows from the variational characterization of $\lambda_1 > 0$ that
$$\langle J(u) - \lambda S(u), u \rangle > 0 \quad \text{for } u \ne o \text{ if } \lambda \in (\lambda_1 - \delta, \lambda_1).$$
Applying Proposition 5.2.48 we immediately get (7.5.30). To prove (7.5.31) we proceed in the following way. There exists $K > 0$ large enough so that we can define a function $\psi : \mathbb{R} \to \mathbb{R}$ by
$$\psi(t) = \begin{cases} 0 & \text{for } t \le K,\\ \dfrac{2\delta}{\lambda_1}(t - 2K) & \text{for } t \ge 3K, \end{cases}$$
and such that $\psi(t)$ is continuously differentiable in $\mathbb{R}$, positive and strictly convex in $(K, 3K)$ (see Figure 7.5.2, the reader is invited to write an explicit formula for such $\psi$). We define a functional
$$\Psi_\lambda(u) := \frac{1}{p} \|u\|^p - \frac{\lambda}{p} \|u\|^p_{L^p(\Omega)} + \psi\left( \frac{1}{p} \|u\|^p \right), \qquad u \in W_0^{1,p}(\Omega).$$
Then $\Psi_\lambda$ is continuously Fréchet differentiable and its critical point $u_0 \in W_0^{1,p}(\Omega)$ corresponds to a solution of the equation
$$J(u_0) - \frac{\lambda}{1 + \psi'\left( \frac{1}{p} \|u_0\|^p \right)} S(u_0) = o.$$
[Figure 7.5.2. Graph of $\psi(t)$; the $t$-axis is marked at $0$, $K$, $2K$ and $3K$.]
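One possible explicit formula for such a $\psi$, offered as a sketch rather than as the authors' choice, is the quadratic $\psi(t) = \frac{\delta}{2\lambda_1 K}(t - K)^2$ on $[K, 3K]$. It matches both the value $\frac{2\delta K}{\lambda_1}$ and the slope $\frac{2\delta}{\lambda_1}$ of the linear branch at $t = 3K$, and has zero value and slope at $t = K$. The function name `make_psi` and the sample parameters below are hypothetical.

```python
def make_psi(K, delta, lam1):
    """Return (psi, dpsi): a C^1 cut-off that is 0 for t <= K, quadratic on
    [K, 3K] and equal to (2*delta/lam1)*(t - 2K) for t >= 3K."""
    a = delta / (2.0 * lam1 * K)    # coefficient of the quadratic branch
    slope = 2.0 * delta / lam1      # slope of the linear branch

    def psi(t):
        if t <= K:
            return 0.0
        if t >= 3.0 * K:
            return slope * (t - 2.0 * K)
        return a * (t - K) ** 2

    def dpsi(t):
        if t <= K:
            return 0.0
        if t >= 3.0 * K:
            return slope
        return 2.0 * a * (t - K)

    return psi, dpsi

# Sample parameters (assumptions for illustration only).
psi, dpsi = make_psi(K=1.0, delta=0.5, lam1=2.0)
print(psi(0.5), psi(2.0), psi(3.0), psi(4.0))
print(dpsi(1.0), dpsi(2.0), dpsi(3.0))
```

On $(K, 3K)$ the derivative $\psi'$ increases strictly from $0$ to $\frac{2\delta}{\lambda_1}$, so this $\psi$ is positive and strictly convex there.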
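However, since $\lambda \in (\lambda_1, \lambda_1 + \delta)$ and there is only one eigenvalue below $\lambda_1 + \delta$, a nonzero critical point $u_0$ of $\Psi_\lambda$ has to satisfy
$$\frac{\lambda}{1 + \psi'\left( \frac{1}{p} \|u_0\|^p \right)} = \lambda_1, \quad \text{i.e.,} \quad \psi'\left( \frac{1}{p} \|u_0\|^p \right) = \frac{\lambda}{\lambda_1} - 1. \tag{7.5.32}$$
Due to the properties of $\psi$ we necessarily have
$$\frac{1}{p} \|u_0\|^p \in (K, 3K)$$
and due to (7.5.32) and the simplicity of $\lambda_1$, either $u_0 = -u_1$ or $u_0 = u_1$ for a $u_1 > 0$ which is an eigenfunction associated with $\lambda_1$. So, we conclude that there are precisely three isolated critical points of $\Psi_\lambda$: $-u_1$, $o$, $u_1$.

The functional $\Psi_\lambda$ is weakly sequentially lower semicontinuous. Indeed, assume that $u_n \rightharpoonup v_0$ in $W_0^{1,p}(\Omega)$. Then
$$\|u_n\|^p_{L^p(\Omega)} \to \|v_0\|^p_{L^p(\Omega)} \tag{7.5.33}$$
due to the compactness of the embedding $W_0^{1,p}(\Omega) \subset\subset L^p(\Omega)$, and then
$$\liminf_{n \to \infty} \Psi_\lambda(u_n) \ge \Psi_\lambda(v_0)$$
by the fact that $\liminf\limits_{n \to \infty} \|u_n\| \ge \|v_0\|$, (7.5.33) holds, and $\psi$ is increasing. Observe that $\Psi_\lambda$ is weakly coercive, i.e.,
$$\lim_{\|u\| \to \infty} \Psi_\lambda(u) = \infty. \tag{7.5.34}$$
Indeed, we have
$$\Psi_\lambda(u) = \frac{1}{p} \|u\|^p - \frac{\lambda_1}{p} \|u\|^p_{L^p(\Omega)} + \frac{\lambda_1 - \lambda}{p} \|u\|^p_{L^p(\Omega)} + \psi\left( \frac{1}{p} \|u\|^p \right).$$
Since
$$\|u\|^p - \lambda_1 \|u\|^p_{L^p(\Omega)} \ge 0 \quad \text{for any } u \in W_0^{1,p}(\Omega), \tag{7.5.35}$$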
we also have ($\lambda > \lambda_1$)
$$\frac{\lambda_1 - \lambda}{p} \|u\|^p_{L^p(\Omega)} + \psi\left( \frac{1}{p} \|u\|^p \right) \ge \frac{\lambda_1 - \lambda}{p \lambda_1} \|u\|^p + \psi\left( \frac{1}{p} \|u\|^p \right) \ge -\frac{\delta}{p \lambda_1} \|u\|^p + \frac{2\delta}{\lambda_1} \left( \frac{1}{p} \|u\|^p - 2K \right) \to \infty \tag{7.5.36}$$
for $\|u\| \to \infty$, and (7.5.34) follows. Since $\Psi_\lambda$ is an even functional, there are precisely two critical points at which the global minimum is achieved: $-u_1$ and $u_1$. The third critical point $o$ is obviously an isolated critical point of "saddle type".$^{21}$ By virtue of Proposition 5.2.50 we have
$$i(\Psi_\lambda', -u_1) = i(\Psi_\lambda', u_1) = 1.^{22} \tag{7.5.37}$$
At the same time, due to the definition of $\psi$, we have
$$\langle \Psi_\lambda'(u), u \rangle > 0 \quad \text{for any } u \in W_0^{1,p}(\Omega),\ \|u\| = \kappa \tag{7.5.38}$$
with $\kappa > 0$ large enough (cf. Exercise 7.5.22). Hence, Proposition 5.2.48 implies that
$$\deg(\Psi_\lambda', B(o; \kappa), o) = 1. \tag{7.5.39}$$
Take $\kappa > 0$ so large that $\pm u_1 \in B(o; \kappa)$. By Proposition 5.2.49 (the additivity property), (7.5.37) and (7.5.39), we deduce
$$i(\Psi_\lambda', o) = -1. \tag{7.5.40}$$
Since $\psi$ vanishes in a small neighborhood of $0$, we also have
$$i(J - \lambda S, o) = i(\Psi_\lambda', o). \tag{7.5.41}$$
Hence (7.5.31) follows from (7.5.40) and (7.5.41).
Remark 7.5.17. Note that a global bifurcation result in the sense of Appendix 5.2A can be proved from (7.5.30) and (7.5.31) (see, e.g., Drábek, Kufner & Nicolosi [42]). We do not treat it here because other technical details have to be involved.

Exercise 7.5.18. Prove rigorously properties (a)–(e) from the beginning of Appendix 7.5A.
Hint. Use the Hölder inequality and (7.5.22) for the proofs of (a)–(c). For the proof of (d) adopt an estimate similar to the one from Example 5.3.24. For the proof of (e) use the Rellich–Kondrachov Theorem (Theorem 1.2.28) and Theorem 3.2.24.

Exercise 7.5.19. Prove that (7.5.25) implies (7.5.26).
Hint. Apply the Lebesgue Dominated Convergence Theorem.

Exercise 7.5.20. Prove (7.5.27).
Hint. The inequality (7.5.27) is equivalent to
$$\|J(u) - \lambda S(u)\|_{(W_0^{1,p}(\Omega))^*} \ge c(\lambda) \quad \text{for all } \|u\| = 1.$$
Proceed via contradiction and use the compactness of $S$ and the fact that $\lambda$ is not an eigenvalue.

$^{21}$ Consider $\Psi_\lambda$ in the direction of $u_1$ to get a local maximum at $o$ due to the variational characterization of $\lambda_1$, and, on the other hand, in the direction of $u \ne o$ satisfying $\int_\Omega |\nabla u(x)|^p\,dx - \lambda \int_\Omega |u(x)|^p\,dx > 0$ to get $\Psi_\lambda(tu) > 0$, $t \ne 0$.

$^{22}$ The reader is invited to check that the assumptions of Proposition 5.2.50 are satisfied.
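Exercise 7.5.21. Prove (7.5.28).
Hint. Use (7.5.26) and (7.5.27) and the homotopy invariance property of the degree on the ball with a sufficiently small radius.

Exercise 7.5.22. Prove (7.5.38).
Hint.
$$\begin{aligned} \langle \Psi_\lambda'(u), u \rangle &= \|u\|^p - \lambda \|u\|^p_{L^p(\Omega)} + \psi'\left( \tfrac{1}{p} \|u\|^p \right) \|u\|^p \\ &= \|u\|^p - \lambda_1 \|u\|^p_{L^p(\Omega)} + \psi'\left( \tfrac{1}{p} \|u\|^p \right) \left[ \|u\|^p - \frac{\lambda - \lambda_1}{\psi'\left( \tfrac{1}{p} \|u\|^p \right)} \|u\|^p_{L^p(\Omega)} \right] \\ &\ge \frac{2\delta}{\lambda_1} \left[ \|u\|^p - \frac{\lambda_1}{2} \|u\|^p_{L^p(\Omega)} \right] \to \infty \quad \text{for } \|u\| \to \infty, \end{aligned}$$
by the variational characterization of $\lambda_1$.

Exercise 7.5.23. Let $\lambda < \lambda_1$ and let $f \in L^{p'}(\Omega)$. Prove that there exists a unique weak solution $u \in W_0^{1,p}(\Omega)$ of the problem
$$\begin{cases} -\Delta_p u(x) = \lambda |u(x)|^{p-2} u(x) + f(x) & \text{in } \Omega,\\ u = 0 & \text{on } \partial\Omega \end{cases}$$
where $p > 1$, $p' = \dfrac{p}{p-1}$.

Exercise 7.5.24. Let $\lambda < \lambda_1$ and let $f : \Omega \times \mathbb{R} \to \mathbb{R}$ be a bounded Carathéodory function. Prove that there exists a weak solution $u \in W_0^{1,p}(\Omega)$ of the problem
$$\begin{cases} -\Delta_p u(x) = \lambda |u(x)|^{p-2} u(x) + f(x, u(x)) & \text{in } \Omega,\\ u = 0 & \text{on } \partial\Omega \end{cases}$$
where $p > 1$.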
7.6 Weak Solutions, Application of Theory of Monotone Operators

Before we apply Corollary 5.3.9 we revise our growth conditions on $g = g(x,s)$ with respect to the second variable. According to (7.3.6) and Theorem 3.2.24 we have $g(x, u(x)) \in L^2(\Omega)$ and the corresponding Nemytski operator is continuous from $L^2(\Omega)$ into $L^2(\Omega)$. The operator $S : W_0^{1,2}(\Omega) \to W_0^{1,2}(\Omega)$ defined at the beginning of Section 7.4 is then continuous as follows from the estimate (7.4.4). Our goal is to show that a growth condition more general than (7.3.6) can be considered in order to get analogous results. For this purpose, however, we have to replace the embedding
$$W_0^{1,2}(\Omega) \subset L^2(\Omega) \tag{7.6.1}$$
by a more general one. Recall that $\Omega \in C^{0,1}$ is a bounded domain. Namely, we have (see Theorem 1.2.26 or Kufner, John & Fučík [82])
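(i) $N = 1 \implies \|u\|_{C(\overline{\Omega})} \le c_1 \|u\|_{W_0^{1,2}(\Omega)}$, $u \in W_0^{1,2}(\Omega)$;
(ii) $N = 2 \implies \|u\|_{L^q(\Omega)} \le c_{2,q} \|u\|_{W_0^{1,2}(\Omega)}$, $u \in W_0^{1,2}(\Omega)$, where $q \ge 1$ is arbitrary;
(iii) $N \ge 3 \implies \|u\|_{L^q(\Omega)} \le c_{N,q} \|u\|_{W_0^{1,2}(\Omega)}$, $u \in W_0^{1,2}(\Omega)$, where $1 \le q \le \frac{2N}{N-2}$.

The estimate (7.4.4) can then be modified as follows:
$$\|S(u_1) - S(u_2)\|_{W_0^{1,2}(\Omega)} \le \|G(u_1) - G(u_2)\|_X \quad \text{where } G(u) = g(\cdot, u(\cdot))$$
and
(i) for $N = 1$, $X = L^1(\Omega)$;
(ii) for $N = 2$, $q \ge 1$ arbitrary, $X = L^{q'}(\Omega)$ where $q' = \frac{q}{q-1}$;
(iii) for $N \ge 3$, $1 \le q \le \frac{2N}{N-2}$, $X = L^{q'}(\Omega)$ where $q' = \frac{q}{q-1}$.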
The operator $S : W_0^{1,2}(\Omega) \to W_0^{1,2}(\Omega)$ is continuous provided the Nemytski operator $G$ is continuous
(i) from $L^{\infty}(\Omega)$ to $L^1(\Omega)$ for $N = 1$;
(ii) from $L^q(\Omega)$ to $L^{q'}(\Omega)$ for $N = 2$ where $q \ge 1$ is arbitrary;
(iii) from $L^q(\Omega)$ to $L^{q'}(\Omega)$ for $N \ge 3$ where $1 \le q \le \frac{2N}{N-2}$.

It follows from Theorem 3.2.24 that the following growth conditions guarantee the desired continuity of $G$:
(i) for $N = 1$:
$$|g(x,s)| \le r(x) + C(|s|) \quad \text{where } r \in L^1(\Omega) \tag{7.6.2}$$
and $C(t)$ is a nonnegative continuous function of the variable $t \ge 0$;
(ii) for $N = 2$:
$$|g(x,s)| \le r(x) + c|s|^{q-1} \tag{7.6.3}$$
where $r \in L^{\frac{q}{q-1}}(\Omega)$, $c > 0$, $q \ge 1$ is arbitrary;
(iii) for $N \ge 3$:
$$|g(x,s)| \le r(x) + c|s|^{\frac{N+2}{N-2}} \quad \text{where } r \in L^{\frac{2N}{N+2}}(\Omega),\ c > 0. \tag{7.6.4}$$

The reader should verify in detail that the growth conditions (7.6.2)–(7.6.4) generalize the condition
$$|g(x,s)| \le r(x) + c|s|, \qquad r \in L^2(\Omega),\ c > 0. \tag{7.6.5}$$
It is clear that the larger $q$ we choose in (7.6.3) the more general condition for $g$ we obtain. It is also clear that all conditions (7.6.2)–(7.6.4) generalize (7.6.5) in the sense that more nonlinearities $g$ can be taken into account in the nonlinear problem
$$\begin{cases} -\Delta u(x) = g(x, u(x)) & \text{in } \Omega,\\ u = 0 & \text{on } \partial\Omega, \end{cases} \tag{7.6.6}$$
and the definition of weak solution still makes sense. Moreover, the operator $S : W_0^{1,2}(\Omega) \to W_0^{1,2}(\Omega)$ defined by
$$(S(u), v) = \int_\Omega g(x, u(x)) v(x)\,dx \quad \text{for all } u, v \in W_0^{1,2}(\Omega)$$
is also a well-defined continuous operator.

Warning. However, $S$ is not compact in general!

Remark 7.6.1. In order to prove the compactness of $S$, we need to employ some compact embeddings of $W_0^{1,2}(\Omega)$. Namely, we use the following ones (see Theorem 1.2.28):
(i) $N = 1 \implies W_0^{1,2}(\Omega) \subset\subset C(\overline{\Omega})$;
(ii) $N = 2 \implies W_0^{1,2}(\Omega) \subset\subset L^q(\Omega)$ where $q \ge 1$ is arbitrary;
(iii) $N \ge 3 \implies W_0^{1,2}(\Omega) \subset\subset L^q(\Omega)$ where $1 \le q < \frac{2N}{N-2}$.

So, $S$ is compact if $N = 1$ and $N = 2$ provided (7.6.2) and (7.6.3), respectively, hold.$^{23}$ To get compactness also for $N \ge 3$ we need to modify the growth condition (7.6.4) as follows: there exists $\varepsilon > 0$ (arbitrarily small) such that for a.a. $x \in \Omega$ and all $s \in \mathbb{R}$ we have
$$|g(x,s)| \le r(x) + c|s|^{\frac{N+2}{N-2}(1-\varepsilon)} \quad \text{where } r \in L^{\frac{2N}{N+2}}(\Omega),\ c > 0. \tag{7.6.7}$$
Indeed, let $u_n \rightharpoonup u$ in $W_0^{1,2}(\Omega)$. Then $u_n \to u$ in $L^q(\Omega)$ for $q$ arbitrarily close to $\frac{2N}{N-2}$, e.g., $q = \frac{2N}{N-2}(1-\varepsilon)$. Then (7.6.7) and Theorem 3.2.24 imply that
$$g(\cdot, u_n) \to g(\cdot, u) \quad \text{in } L^{\frac{2N}{N+2}}(\Omega).$$
As a consequence we obtain
$$S(u_n) \to S(u) \quad \text{in } W_0^{1,2}(\Omega).^{24}$$

Let us give an application of Corollary 5.3.9.

Theorem 7.6.2. Let $g \in \mathrm{CAR}(\Omega \times \mathbb{R})$ satisfy one of the growth conditions (7.6.2)–(7.6.4) depending on $N$. Moreover, let $g(x, \cdot)$ be a decreasing function for a.a. $x \in \Omega$. Then (7.6.6) has a unique weak solution.

Proof. Set $T = L - S$, i.e.,$^{25}$
$$(T(u), v) = \int_\Omega \nabla u(x) \nabla v(x)\,dx - \int_\Omega g(x, u(x)) v(x)\,dx \quad \text{for any } u, v \in W_0^{1,2}(\Omega).$$

$^{23}$ The reader is invited to repeat the argument from the beginning of Section 7.4 to prove the compactness of $S : W_0^{1,2}(\Omega) \to W_0^{1,2}(\Omega)$ for $N = 1, 2$.

$^{24}$ The reader is invited to perform these steps in detail.

$^{25}$ For the definition of $L : W_0^{1,2}(\Omega) \to W_0^{1,2}(\Omega)$ see (7.4.2) and the definition of $S : W_0^{1,2}(\Omega) \to W_0^{1,2}(\Omega)$ see above in this section.
Then $T$ is a continuous operator from $W_0^{1,2}(\Omega)$ into itself. Moreover, for $u_1, u_2 \in W_0^{1,2}(\Omega)$ we have
$$\begin{aligned} (T(u_1) - T(u_2), u_1 - u_2) &= \int_\Omega \nabla(u_1(x) - u_2(x)) \nabla(u_1(x) - u_2(x))\,dx \\ &\quad - \int_\Omega [g(x, u_1(x)) - g(x, u_2(x))](u_1(x) - u_2(x))\,dx \\ &\ge \int_\Omega |\nabla(u_1(x) - u_2(x))|^2\,dx = \|u_1 - u_2\|^2. \end{aligned}$$
Hence $T$ is a strongly monotone operator. The operator equation $T(u) = o$ has a unique solution according to Corollary 5.3.9, i.e., (7.6.6) has a unique weak solution.

Reading carefully the proof of Theorem 7.6.2 one can easily see that the assumptions on monotonicity of $g$ could be relaxed. However, we have to pay the price for this modification (see conditions (7.6.10) and (7.6.13) below). Actually, strict monotonicity of $T$ would be enough to prove the same assertion provided we apply Theorem 5.3.4. For instance, the following assumption on $g = g(x,s)$ guarantees the strict monotonicity of $T$:
$$\int_\Omega [g(x, u_1(x)) - g(x, u_2(x))](u_1(x) - u_2(x))\,dx < \int_\Omega |\nabla(u_1(x) - u_2(x))|^2\,dx \tag{7.6.8}$$
for any $u_1, u_2 \in W_0^{1,2}(\Omega)$, $u_1 \ne u_2$. Since
$$\int_\Omega |\nabla(u_1(x) - u_2(x))|^2\,dx \ge \lambda_1 \int_\Omega |u_1(x) - u_2(x)|^2\,dx$$
by the definition of the first eigenvalue $\lambda_1$ (see (7.2.4)), the inequality (7.6.8) follows from
$$\int_\Omega [g(x, u_1(x)) - g(x, u_2(x))](u_1(x) - u_2(x))\,dx < \lambda_1 \int_\Omega |u_1(x) - u_2(x)|^2\,dx. \tag{7.6.9}$$
A sufficient condition for (7.6.9) to hold is the Lipschitz continuity of $g$ with respect to the second variable, i.e.,
$$|g(x, s_1) - g(x, s_2)| < \lambda_1 |s_1 - s_2| \tag{7.6.10}$$
for a.a. $x \in \Omega$ and any $s_1, s_2 \in \mathbb{R}$, $s_1 \ne s_2$. Then (7.6.8) is satisfied and $T$ is then a strictly monotone operator (the reader should verify it in detail!). In order to apply Theorem 5.3.4 we also need $T$ to be weakly coercive, i.e.,
$$\lim_{\|u\| \to \infty} \|T(u)\| = \infty. \tag{7.6.11}$$
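We have
$$\begin{aligned} \|T(u)\| &= \sup_{\|v\| \le 1} |(T(u), v)| = \sup_{\|v\| \le 1} \left| \int_\Omega \nabla u(x) \nabla v(x)\,dx - \int_\Omega g(x, u(x)) v(x)\,dx \right| \\ &\ge \|u\| - \sup_{\|v\| \le 1} \left| \int_\Omega g(x, u(x)) v(x)\,dx \right| \ge \|u\| - \frac{1}{\sqrt{\lambda_1}} \left( \int_\Omega |g(x, u(x))|^2\,dx \right)^{\frac{1}{2}}\,{}^{26} \\ &\ge \|u\| - \frac{1}{\sqrt{\lambda_1}} \left[ \|r\|_{L^2(\Omega)} + (\lambda_1 - \varepsilon) \|u\|_{L^2(\Omega)} \right] \end{aligned} \tag{7.6.12}$$
provided $g$ satisfies the growth condition
$$|g(x,s)| \le r(x) + (\lambda_1 - \varepsilon)|s| \quad \text{for a.a. } x \in \Omega \text{ and all } s \in \mathbb{R}. \tag{7.6.13}$$
Since
$$\frac{1}{\sqrt{\lambda_1}} (\lambda_1 - \varepsilon) \|u\|_{L^2(\Omega)} \le \frac{\lambda_1 - \varepsilon}{\lambda_1} \|u\|,$$
we get from (7.6.12) that
$$\|T(u)\| \ge \frac{\varepsilon}{\lambda_1} \|u\| - \frac{1}{\sqrt{\lambda_1}} \|r\|_{L^2(\Omega)},$$
and so (7.6.11) follows. We have just proved the following assertion.

Theorem 7.6.3. Let $g \in \mathrm{CAR}(\Omega \times \mathbb{R})$ satisfy (7.6.10) and (7.6.13). Then the boundary value problem (7.6.6) has a unique weak solution.

Exercise 7.6.4. Let $g$ be as in Theorem 7.6.2 and let $\lambda < \lambda_1$ (here $\lambda_1 > 0$ is the principal eigenvalue of the Laplacian, see Example 7.4.4). Prove that
$$\begin{cases} -\Delta u(x) = \lambda u(x) + g(x, u(x)) & \text{in } \Omega,\\ u = 0 & \text{on } \partial\Omega \end{cases}$$
has a unique weak solution $u \in W_0^{1,2}(\Omega)$.

Exercise 7.6.5. Let $g$ be as in Theorem 7.6.2 and let $\lambda < 0$. Prove that
$$\begin{cases} -\Delta u(x) = \lambda u(x) + g(x, u(x)) & \text{in } \Omega,\\ \dfrac{\partial u}{\partial n} = 0 & \text{on } \partial\Omega \end{cases}$$
has a unique weak solution $u \in W^{1,2}(\Omega)$.

$^{26}$ Note that $\frac{1}{\sqrt{\lambda_1}}$ is the best embedding constant.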
Exercise 7.6.6. Consider the problem
$$\begin{cases} -\Delta u(x) = h(x, u(x), \nabla u(x)) & \text{in } \Omega,\\ u = 0 & \text{on } \partial\Omega. \end{cases} \tag{7.6.14}$$
Formulate conditions on $h = h(x, u, \xi)$ which guarantee that (7.6.14) has a unique weak solution.

Exercise 7.6.7. Replace in (7.6.14) the homogeneous Dirichlet condition by the Neumann one.
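As a numerical illustration of Theorem 7.6.2, the following sketch approximates the weak solution of (7.6.6) in the simplest setting. Everything specific here is an assumption made for the example and not taken from the text: $N = 1$, $\Omega = (0,1)$, the decreasing nonlinearity $g(x,s) = 1 - s^3$ (which satisfies (7.6.2) with $r \equiv 1$, $C(t) = t^3$), a three-point finite-difference discretization, and Newton's method with a tridiagonal solve. The function name `solve_dirichlet_1d` is hypothetical.

```python
def solve_dirichlet_1d(g, dg, m=99, tol=1e-9, max_iter=50):
    """Newton's method for the discrete problem
    (-u[i-1] + 2*u[i] - u[i+1]) / h^2 = g(x_i, u_i), with u = 0 at x = 0 and x = 1.
    dg is the partial derivative of g with respect to its second argument."""
    h = 1.0 / (m + 1)
    xs = [(i + 1) * h for i in range(m)]
    off = -1.0 / h**2                       # off-diagonal entry of the Jacobian

    def residual(u):
        r = []
        for i in range(m):
            left = u[i - 1] if i > 0 else 0.0
            right = u[i + 1] if i < m - 1 else 0.0
            r.append((2.0 * u[i] - left - right) / h**2 - g(xs[i], u[i]))
        return r

    u = [0.0] * m
    for _ in range(max_iter):
        r = residual(u)
        if max(abs(t) for t in r) < tol:
            break
        diag = [2.0 / h**2 - dg(xs[i], u[i]) for i in range(m)]
        # Thomas algorithm for the tridiagonal system (Jacobian) * step = -r.
        c = [0.0] * m
        d = [0.0] * m
        c[0] = off / diag[0]
        d[0] = -r[0] / diag[0]
        for i in range(1, m):
            denom = diag[i] - off * c[i - 1]
            c[i] = off / denom
            d[i] = (-r[i] - off * d[i - 1]) / denom
        step = [0.0] * m
        step[-1] = d[-1]
        for i in range(m - 2, -1, -1):
            step[i] = d[i] - c[i] * step[i + 1]
        u = [a + b for a, b in zip(u, step)]
    return xs, u, max(abs(t) for t in residual(u))

xs, u, res = solve_dirichlet_1d(lambda x, s: 1.0 - s**3, lambda x, s: -3.0 * s**2)
print(max(u), res)
```

Because $g(x, \cdot)$ is decreasing, the Jacobian of the discrete system is symmetric positive definite at every iterate, which is the discrete counterpart of the strong monotonicity of $T$ used in the proof.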
7.6A Application of Leray–Lions Theorem

Let $p > 1$, let $\Omega \in C^{0,1}$ be a bounded domain in $\mathbb{R}^N$ and $g : \Omega \times \mathbb{R}^{N+1} \to \mathbb{R}$ a Carathéodory function (see Remark 3.2.25). We shall consider the Dirichlet problem
$$\begin{cases} -\Delta_p u(x) + g(x, u(x), \nabla u(x)) = f(x) & \text{in } \Omega,\\ u = 0 & \text{on } \partial\Omega. \end{cases} \tag{7.6.15}$$
Here $\Delta_p u$ is the $p$-Laplacian (see Appendix 7.5A). Assume that $1 < p < N$$^{27}$ and that $g = g(x, s, t_1, \dots, t_N)$ satisfies the following growth condition: there exist (possibly small) $\varepsilon > 0$, a constant (possibly large) $c > 0$ and a function $g_0 \in L^{(p^*)'}(\Omega)$ such that
$$|g(x, s, t_1, \dots, t_N)| \le c \left( g_0(x) + |s|^{q(s) - \varepsilon} + \sum_{i=1}^{N} |t_i|^{q(t) - \varepsilon} \right) \tag{7.6.16}$$
for a.a. $x \in \Omega$ and for all $(s, t_1, \dots, t_N) \in \mathbb{R}^{N+1}$ where
$$q(s) = \frac{p^*}{(p^*)'} = p^* - 1, \qquad q(t) = \frac{p}{(p^*)'}.^{28}$$

$^{27}$ This is a technical assumption. The case $p \ge N$ is easier since stronger embeddings are available (see Theorems 1.2.26 and 1.2.28). However, a slightly different technique must be employed.

$^{28}$ Recall that $p^*$ is the critical Sobolev exponent (see Theorem 1.2.26), $(p^*)' = \dfrac{p^*}{p^* - 1} = \dfrac{pN}{pN - N + p}$, i.e., $q(s) = \dfrac{pN - N + p}{N - p}$, $q(t) = p - 1 + \dfrac{p}{N}$ for $1 < p < N$.

We consider the Sobolev space $W_0^{1,p}(\Omega)$ with the norm
$$\|u\| = \left( \int_\Omega |\nabla u(x)|^p\,dx \right)^{\frac{1}{p}}.$$
Let us recall the continuous embedding
$$W_0^{1,p}(\Omega) \subset L^{p^*}(\Omega) \tag{7.6.17}$$
(see Theorem 1.2.26(i) and Remark 1.2.27), and the compact embedding
$$W_0^{1,p}(\Omega) \subset\subset L^q(\Omega) \quad \text{where } q \in [1, p^*) \tag{7.6.18}$$
(see Theorem 1.2.28(i)). Similarly to Appendix 7.5A we define (nonlinear) operators $J, G : W_0^{1,p}(\Omega) \to (W_0^{1,p}(\Omega))^*$ and an element $f^* \in (W_0^{1,p}(\Omega))^*$ by
$$\langle J(u), v \rangle = \int_\Omega |\nabla u(x)|^{p-2} \nabla u(x) \nabla v(x)\,dx, \qquad \langle G(u), v \rangle = \int_\Omega g(x, u(x), \nabla u(x)) v(x)\,dx,$$
$$\langle f^*, v \rangle = \int_\Omega f(x) v(x)\,dx \qquad \text{for all } u, v \in W_0^{1,p}(\Omega).$$
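The exponent identities behind the growth condition (7.6.16) can be verified with exact rational arithmetic. The sketch below assumes the standard formula $p^* = \frac{Np}{N-p}$ for the critical Sobolev exponent when $1 < p < N$; the function name `leray_lions_exponents` is hypothetical.

```python
from fractions import Fraction

def leray_lions_exponents(p, N):
    """Return (p*, (p*)', q(s), q(t)) for 1 < p < N, computed exactly."""
    p = Fraction(p)
    N = Fraction(N)
    if not (1 < p < N):
        raise ValueError("this sketch assumes 1 < p < N")
    p_star = N * p / (N - p)                 # critical Sobolev exponent
    p_star_conj = p_star / (p_star - 1)      # conjugate exponent (p*)'
    q_s = p_star / p_star_conj               # equals p* - 1
    q_t = p / p_star_conj                    # equals p - 1 + p/N
    return p_star, p_star_conj, q_s, q_t

p_star, p_star_conj, q_s, q_t = leray_lions_exponents(2, 3)
print(p_star, p_star_conj, q_s, q_t)
```

For $p = 2$, $N = 3$ this gives $p^* = 6$, $(p^*)' = \frac{6}{5}$, $q(s) = 5$ and $q(t) = \frac{5}{3}$, in agreement with the closed forms $\frac{pN - N + p}{N - p}$ and $p - 1 + \frac{p}{N}$.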
It follows from (7.6.16) and Remark 3.2.26 that $G$ is well defined (the reader is invited to justify this statement!). We have the following properties of $J$ and $G$:
(a) $J$ and $G$ are bounded operators;
(b) $J$ and $G$ are continuous operators;$^{29}$
(c) for any $u, v \in W_0^{1,p}(\Omega)$,
$$\langle J(u) - J(v), u - v \rangle \ge \left( \|u\|^{p-1} - \|v\|^{p-1} \right) \left( \|u\| - \|v\| \right);$$
(d) if $u_n \rightharpoonup u$ in $W_0^{1,p}(\Omega)$, then
$$G(u_n) \to G(u) \quad \text{in } (W_0^{1,p}(\Omega))^*.$$

$^{29}$ To prove boundedness and continuity of $G$ we have to use the embedding $W_0^{1,p}(\Omega) \subset L^{p^*}(\Omega)$ and the continuity of the Nemytski operator given by $g$ from $L^{p^*}(\Omega)$ into the dual space of $L^{p^*}(\Omega)$.

Theorem 7.6.8. Let $1 < p < N$ and assume that $g : \Omega \times \mathbb{R}^{N+1} \to \mathbb{R}$ is a Carathéodory function satisfying (7.6.16) and that for all $(s, t_1, \dots, t_N) \in \mathbb{R}^{N+1}$ and almost all $x \in \Omega$ we have
$$s\,g(x, s, t_1, \dots, t_N) \ge 0. \tag{7.6.19}$$
Then (7.6.15) has at least one weak solution.

Proof. Set $T := J + G$. Then the operator equation
$$T(u) = f^* \tag{7.6.20}$$
is equivalent to the validity of the integral identity
$$\int_\Omega |\nabla u(x)|^{p-2} \nabla u(x) \nabla v(x)\,dx + \int_\Omega g(x, u(x), \nabla u(x)) v(x)\,dx = \int_\Omega f(x) v(x)\,dx$$
for all $v \in W_0^{1,p}(\Omega)$. This fact shows that the solutions of (7.6.20) correspond one-to-one to the weak solutions of (7.6.15). Next we verify the assumptions (i)–(viii) of Theorem 5.3.23 to prove that there is a solution of (7.6.20). Assumptions (i) and (ii) follow directly from (a) and (b). The assumption (iii), i.e., the coercivity of $T$, is a direct consequence of (7.6.19):
$$\lim_{\|u\| \to \infty} \frac{1}{\|u\|} \langle T(u), u \rangle = \lim_{\|u\| \to \infty} \frac{1}{\|u\|} \left[ \int_\Omega |\nabla u(x)|^p\,dx + \int_\Omega g(x, u(x), \nabla u(x)) u(x)\,dx \right] \ge \lim_{\|u\| \to \infty} \|u\|^{p-1} = \infty.$$
Let us define an operator $\Phi : W_0^{1,p}(\Omega) \times W_0^{1,p}(\Omega) \to (W_0^{1,p}(\Omega))^*$ by
\[ \langle \Phi(u, w), v \rangle := \langle J(u), v \rangle + \langle G(w), v \rangle \quad \text{for all } u, w, v \in W_0^{1,p}(\Omega). \]
It is straightforward to verify the assumption (iv). In order to verify the assumption (v), let $u, w, h \in W_0^{1,p}(\Omega)$ and $t_n \to 0$. Then
\[ \Phi(u + t_n h, w) = J(u + t_n h) + G(w) \to J(u) + G(w) = \Phi(u, w) \]
by continuity of $J$ (see (b)). The validity of the assumption (vi) follows directly from (c). In order to verify the assumption (vii) let us assume that $u_n \rightharpoonup u$ in $W_0^{1,p}(\Omega)$ and
\[ \lim_{n \to \infty} \langle \Phi(u_n, u_n) - \Phi(u, u_n), u_n - u \rangle = 0, \]
i.e.,
\[ \lim_{n \to \infty} \langle J(u_n) - J(u), u_n - u \rangle = 0. \tag{7.6.21} \]
But (7.6.21) together with (c) implies that $\|u_n\| \to \|u\|$. The last fact and the fact that $W_0^{1,p}(\Omega)$ is a uniformly convex Banach space (see footnote 10 on page 65 and check details) together with the weak convergence imply
\[ u_n \to u \quad \text{in } W_0^{1,p}(\Omega) \]
(see Proposition 2.1.22(iv)). Now,
\[ \Phi(w, u_n) = J(w) + G(u_n) \to J(w) + G(u) = \Phi(w, u) \quad \text{for arbitrary } w \in W_0^{1,p}(\Omega). \]
Finally, to verify the assumption (viii), let $w \in W_0^{1,p}(\Omega)$ and $u_n \rightharpoonup u$ in $W_0^{1,p}(\Omega)$. Then $G(u_n) \to G(u)$ in $(W_0^{1,p}(\Omega))^*$ by (d), and so
\[ \langle \Phi(w, u_n), u_n \rangle = \langle J(w) + G(u_n), u_n \rangle \to \langle J(w), u \rangle + \langle G(u), u \rangle = \langle \Phi(w, u), u \rangle. \]
Since also $u_n \rightharpoonup u$ in $W_0^{1,p}(\Omega)$ implies that $\Phi(w, u_n) \to J(w) + G(u)$, the last assumption of Theorem 5.3.23 is verified.

The advantage of the Leray–Lions Theorem becomes more transparent when one deals with partial differential equations of higher order. The reader is asked to see, e.g., Zeidler [136] for more advanced but also technically more involved problems.

Exercise 7.6.9. Modify the assumptions on $g$ in such a way that Theorem 5.3.22 could be applied to get at least one weak solution of (7.6.15).

Exercise 7.6.10. Prove the implication (d) from page 511.
Hint. Use Theorem 1.2.28(i) and Remark 3.2.26.

Exercise 7.6.11. Prove the following assertion: Let $p \ge 2$. Then for all $x_1, x_2 \in \mathbb{R}^N$,
\[ |x_2|^p \ge |x_1|^p + p|x_1|^{p-2} x_1 (x_2 - x_1) + \frac{|x_2 - x_1|^p}{2^{p-1} - 1}.\,{}^{30} \tag{7.6.22} \]
30 Recall that $xy := (x, y)_{\mathbb{R}^N}$.
Hint (Lindqvist [86]). The strict convexity of $x \mapsto |x|^p$ implies that for any $x_1, x_2 \in \mathbb{R}^N$, $p > 1$,
\[ |x_2|^p > |x_1|^p + p|x_1|^{p-2} x_1 (x_2 - x_1). \tag{7.6.23} \]
Then writing $\frac{x_2 + x_1}{2}$ instead of $x_2$ in (7.6.23) we get
\[ \left| \frac{x_1 + x_2}{2} \right|^p \ge |x_1|^p + \frac{1}{2} p |x_1|^{p-2} x_1 (x_2 - x_1). \]
Using the Clarkson inequality (see, e.g., Adams [2, Theorem 2.28]) for $p \ge 2$,
\[ |x_1|^p + |x_2|^p \ge 2 \left| \frac{x_1 + x_2}{2} \right|^p + 2 \left| \frac{x_1 - x_2}{2} \right|^p, \tag{7.6.24} \]
we arrive at
\[ |x_2|^p \ge |x_1|^p + p|x_1|^{p-2} x_1 (x_2 - x_1) + 2 \left| \frac{x_1 - x_2}{2} \right|^p. \tag{7.6.25} \]
This is actually (7.6.22) but with $2^{1-p}$ in place of $\frac{1}{2^{p-1} - 1}$. Repeating this procedure, starting again with (7.6.24) but now using (7.6.25) instead of (7.6.23), we get the constant improved to $2^{1-p} + 4^{1-p}$. By iteration one obtains the constant
\[ 2^{1-p} + 4^{1-p} + 8^{1-p} + \cdots = \frac{1}{2^{p-1} - 1} \]
in (7.6.22).

Exercise 7.6.12. Prove the following assertion: Let $1 < p < 2$. Then for all $x_1, x_2 \in \mathbb{R}^N$,
\[ |x_2|^p \ge |x_1|^p + p|x_1|^{p-2} x_1 (x_2 - x_1) + c(p) \frac{|x_1 - x_2|^2}{(|x_1| + |x_2|)^{2-p}}. \tag{7.6.26} \]
Hint (Lindqvist [86]). Fix $x_1, x_2$ and expand the real function $f(t) = |x_1 + t(x_2 - x_1)|^p$ using the Taylor formula
\[ f(1) = f(0) + f'(0) + \int_0^1 (1 - t) f''(t)\,dt. \]
Then, provided $f(t) \ne 0$ for all $0 \le t \le 1$,
\[ |x_2|^p = |x_1|^p + p|x_1|^{p-2} x_1 (x_2 - x_1) + \int_0^1 (1 - t) f''(t)\,dt. \tag{7.6.27} \]
(In the case when there exists $t$, $0 \le t \le 1$, such that $|x_1 + t(x_2 - x_1)| = 0$ it is easily checked that (7.6.26) holds!) At the same time
\[ f''(t) = p(p-2)|x_1 + t(x_2 - x_1)|^{p-4} \left[ (x_1 + t(x_2 - x_1))(x_2 - x_1) \right]^2 + p|x_1 + t(x_2 - x_1)|^{p-2} |x_2 - x_1|^2, \]
and the Schwarz inequality yields
\[ f''(t) \ge p(p-1)|x_1 + t(x_2 - x_1)|^{p-2} |x_2 - x_1|^2. \tag{7.6.28} \]
Returning to (7.6.27) we estimate
\[ \int_0^1 (1 - t) f''(t)\,dt \ge \frac{3}{4} \int_0^{1/4} f''(t)\,dt \tag{7.6.29} \]
and since $|x_1 + t(x_2 - x_1)| \le |x_1| + |x_2|$, we use (7.6.28), (7.6.29) and arrive at (7.6.26) with $c(p) = \frac{3}{16} p(p-1)$.

Exercise 7.6.13. Prove that the operator $J$ defined on page 511 is strictly monotone31 for $1 < p < 2$ and strongly monotone32 for $p \ge 2$.
Hint. For $u, v \in W_0^{1,p}(\Omega)$ we have
\[ \langle J(u) - J(v), u - v \rangle = \int_\Omega \left( |\nabla u(x)|^{p-2} \nabla u(x) - |\nabla v(x)|^{p-2} \nabla v(x) \right) (\nabla u(x) - \nabla v(x))\,dx \]
\[ = \int_\Omega |\nabla u(x)|^{p-2} \nabla u(x) (\nabla u(x) - \nabla v(x))\,dx - \int_\Omega |\nabla v(x)|^{p-2} \nabla v(x) (\nabla u(x) - \nabla v(x))\,dx = I_1 + I_2. \]
For $p \ge 2$, it follows from Exercise 7.6.11 that
\[ I_1 + I_2 \ge \frac{2}{p(2^{p-1} - 1)} \int_\Omega |\nabla u(x) - \nabla v(x)|^p\,dx = c\|u - v\|^p. \]
For $1 < p < 2$, it follows from Exercise 7.6.12 that
\[ I_1 + I_2 \ge \frac{2c(p)}{p} \int_\Omega \frac{|\nabla u(x) - \nabla v(x)|^2}{(|\nabla u(x)| + |\nabla v(x)|)^{2-p}}\,dx > 0 \]
provided $u, v \in W_0^{1,p}(\Omega)$, $u \ne v$.
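The hint rests on the two pointwise inequalities (7.6.22) and (7.6.26). Before attempting a proof it can be reassuring to spot-check them numerically. The following Python sketch evaluates "left side minus right side" at random points of $\mathbb{R}^3$; the helper names (`gap_p_ge_2`, `gap_p_lt_2`, `worst_gap`), the sampling box and the dimension are illustrative assumptions, not part of the text. A random test can support the inequalities but of course does not prove them.

```python
import math
import random


def dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def norm(a):
    return math.sqrt(dot(a, a))


def tangent_value(x1, x2, p):
    """|x1|^p + p |x1|^(p-2) x1.(x2 - x1); taken as 0 when x1 = 0."""
    n1 = norm(x1)
    if n1 == 0.0:
        return 0.0
    diff = [b - a for a, b in zip(x1, x2)]
    return n1 ** p + p * n1 ** (p - 2) * dot(x1, diff)


def gap_p_ge_2(x1, x2, p):
    """Left side minus right side of (7.6.22); expected >= 0 for p >= 2."""
    diff = [b - a for a, b in zip(x1, x2)]
    return norm(x2) ** p - tangent_value(x1, x2, p) - norm(diff) ** p / (2 ** (p - 1) - 1)


def gap_p_lt_2(x1, x2, p):
    """Left side minus right side of (7.6.26) with c(p) = 3 p (p - 1) / 16."""
    diff = [b - a for a, b in zip(x1, x2)]
    c = 3.0 * p * (p - 1.0) / 16.0
    weight = (norm(x1) + norm(x2)) ** (2 - p)
    return norm(x2) ** p - tangent_value(x1, x2, p) - c * norm(diff) ** 2 / weight


def worst_gap(gap, p, dim=3, trials=2000, seed=0):
    """Smallest gap found over random pairs (x1, x2) in the box [-2, 2]^dim."""
    rng = random.Random(seed)
    worst = float("inf")
    for _ in range(trials):
        x1 = [rng.uniform(-2, 2) for _ in range(dim)]
        x2 = [rng.uniform(-2, 2) for _ in range(dim)]
        worst = min(worst, gap(x1, x2, p))
    return worst


if __name__ == "__main__":
    for p in (2.0, 3.0, 4.5):
        print("p =", p, "worst gap in (7.6.22):", worst_gap(gap_p_ge_2, p))
    for p in (1.2, 1.5, 1.9):
        print("p =", p, "worst gap in (7.6.26):", worst_gap(gap_p_lt_2, p))
```

For $p = 2$ the inequality (7.6.22) is an identity, so the computed gap is zero up to rounding; any comparison should therefore allow a small negative tolerance.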
Exercise 7.6.14. Prove that the weak solution from Exercise 7.5.23 is unique.

Exercise 7.6.15. Let $\lambda < \lambda_1$ and let $f : \Omega \times \mathbb{R} \to \mathbb{R}$ be a Carathéodory function which is decreasing with respect to the second variable, i.e.,
\[ f(x, s_1) \ge f(x, s_2) \quad \text{for a.a. } x \in \Omega \text{ and } s_1, s_2 \in \mathbb{R},\ s_1 \le s_2. \]
Assume, moreover, that there exist $f_0 \in L^{p'}(\Omega)$, $p' = \frac{p}{p-1}$, $p > 1$ and $c > 0$ such that
\[ |f(x, s)| \le f_0(x) + c|s|^{p-1}. \]
Prove that there is a unique weak solution $u \in W_0^{1,p}(\Omega)$ of the problem
\[ \begin{cases} -\Delta_p u(x) = \lambda |u(x)|^{p-2} u(x) + f(x, u(x)) & \text{in } \Omega, \\ u = 0 & \text{on } \partial\Omega. \end{cases} \]
31 I.e., for any $u, v \in W_0^{1,p}(\Omega)$, $u \ne v$: $\langle J(u) - J(v), u - v \rangle > 0$, cf. Definition 5.3.2.
32 I.e., there exists $c > 0$ such that for any $u, v \in W_0^{1,p}(\Omega)$: $\langle J(u) - J(v), u - v \rangle \ge c\|u - v\|^p$, cf. Definition 5.3.2.
7.7 Weak Solutions, Application of Variational Methods

Let us illustrate an application of Theorem 6.2.8 (the existence of a global minimizer) to the energy functional associated with the boundary value problem
\[ \begin{cases} -\Delta u(x) = g(u(x)) + f(x) & \text{in } \Omega, \\ u = 0 & \text{on } \partial\Omega. \end{cases} \tag{7.7.1} \]
We will assume that $g = g(s)$ is a continuous function and, moreover,
(i) $f \in L^1(\Omega)$ if $N = 1$;
(ii) $f \in L^{\frac{q}{q-1}}(\Omega)$ and there exists $c > 0$ such that $|g(s)| \le c|s|^{q-1}$ where
\[ \begin{cases} q \in [1, \infty) \text{ is arbitrary} & \text{if } N = 2, \\ q \in \left[1, \frac{2N}{N-2}\right) & \text{if } N \ge 3. \end{cases} \]
The energy functional33 associated with (7.7.1) is defined as follows:
\[ E(u) := \frac{1}{2} \int_\Omega |\nabla u(x)|^2\,dx - \int_\Omega G(u(x))\,dx - \int_\Omega f(x) u(x)\,dx,\,{}^{34} \quad u \in W_0^{1,2}(\Omega), \]
where
\[ G(t) = \int_0^t g(s)\,ds. \]
We will assume that $g$ satisfies the sign condition:
\[ s g(s) \le 0 \quad \text{for any } s \in \mathbb{R}. \tag{7.7.2} \]
Then $E$ is a weakly coercive functional on $W_0^{1,2}(\Omega)$. Indeed, the condition (7.7.2) immediately implies $G \le 0$, and thus
\[ \int_\Omega G(u(x))\,dx \le 0 \quad \text{for any } u \in W_0^{1,2}(\Omega), \]
and so
\[ E(u) \ge \frac{1}{2} \|u\|^2_{W_0^{1,2}(\Omega)} - \|f\|_X \|u\|_{X'}, \]
where $X = L^1(\Omega)$, $X' = C(\overline{\Omega})$ if $N = 1$, and $X = L^{\frac{q}{q-1}}(\Omega)$, $X' = L^q(\Omega)$ if $N \ge 2$. Hence there is a constant $c_1 > 0$ such that
\[ E(u) \ge \frac{1}{2} \|u\|^2_{W_0^{1,2}(\Omega)} - c_1 \|u\|_{W_0^{1,2}(\Omega)}. \]
Hence
\[ \lim_{\|u\| \to \infty} E(u) = \infty. \]
33 Here $f$ represents the influence of external forces, $g$ a nonlinear damping or restoring force, and $\frac{1}{2}|\nabla u|^2$ the kinetic energy, respectively. Hence $E(u)$ corresponds to the total energy of the system (cf. Examples 6.2.6 and 6.2.14); that is where the expression "energy functional" comes from. For more details cf. Hlaváček & Nečas [68].
34 The reader is invited to check that if $u_0 \in W_0^{1,2}(\Omega)$ satisfies $\delta E(u_0; v) = 0$ for any $v \in W_0^{1,2}(\Omega)$, then it is a weak solution of (7.7.1), cf. Remark 5.3.10.
Next we prove that $E$ is weakly sequentially lower semicontinuous. Clearly, for any $u_n \rightharpoonup u_0$ in $W_0^{1,2}(\Omega)$ we have
\[ \frac{1}{2} \int_\Omega |\nabla u_0(x)|^2\,dx \le \liminf_{n \to \infty} \frac{1}{2} \int_\Omega |\nabla u_n(x)|^2\,dx, \tag{7.7.3} \]
\[ \int_\Omega f(x) u_0(x)\,dx = \lim_{n \to \infty} \int_\Omega f(x) u_n(x)\,dx. \tag{7.7.4} \]
Our assumptions imply that for $N \ge 2$ the function $G(s)$ satisfies the estimate
\[ |G(s)| \le \frac{c}{q} |s|^q. \]
So, the Nemytski operator defined by $N_G(u)(x) = G(u(x))$ is continuous from $L^q(\Omega)$ into $L^1(\Omega)$ (see Theorem 3.2.24). Then this fact and the compact embedding $W_0^{1,2}(\Omega) \subset\subset L^q(\Omega)$ imply that $u_n \to u_0$ in $L^q(\Omega)$, and thus
\[ G(u_n(\cdot)) \to G(u_0(\cdot)) \quad \text{in } L^1(\Omega). \tag{7.7.5} \]
Since $u_n \to u_0$ in $C(\overline{\Omega})$ for $N = 1$ we obtain (7.7.5) easily in this case, too. Summarizing (7.7.3)–(7.7.5) we obtain
\[ E(u_0) \le \liminf_{n \to \infty} E(u_n). \]
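Weak coercivity and weak sequential lower semicontinuity are exactly the two properties that yield a global minimizer of $E$. A discrete analogue may make this concrete. The sketch below is only an illustration under assumptions that are not in the text: $N = 1$, $\Omega = (0, 1)$, $g(s) = -s^3$ (which satisfies (7.7.2)), $f \equiv 1$, a uniform finite-difference grid, and Newton's method applied to the discrete Euler–Lagrange equation. All function names are hypothetical.

```python
# Discrete analogue of minimizing
#   E(u) = ∫ (1/2)|u'|^2 dx - ∫ G(u) dx - ∫ f u dx  on (0, 1), u(0) = u(1) = 0,
# with the ASSUMED data g(s) = -s**3 (so G(s) = -s**4 / 4) and f = 1.


def solve_tridiagonal(lower, diag, upper, rhs):
    """Thomas algorithm; lower[i] multiplies x[i-1], upper[i] multiplies x[i+1]."""
    n = len(diag)
    c = [0.0] * n
    d = [0.0] * n
    c[0] = upper[0] / diag[0]
    d[0] = rhs[0] / diag[0]
    for i in range(1, n):
        denom = diag[i] - lower[i] * c[i - 1]
        c[i] = upper[i] / denom
        d[i] = (rhs[i] - lower[i] * d[i - 1]) / denom
    x = [0.0] * n
    x[-1] = d[-1]
    for i in range(n - 2, -1, -1):
        x[i] = d[i] - c[i] * x[i + 1]
    return x


def residual(u, h, f):
    """Discrete Euler-Lagrange residual -u'' - g(u) - f at the interior nodes."""
    n = len(u)
    out = []
    for i in range(n):
        left = u[i - 1] if i > 0 else 0.0
        right = u[i + 1] if i < n - 1 else 0.0
        out.append((2.0 * u[i] - left - right) / h**2 + u[i] ** 3 - f[i])
    return out


def energy(u, h, f):
    """Discrete energy; its gradient equals h times the residual above."""
    total = 0.0
    prev = 0.0
    for ui, fi in zip(u, f):
        total += 0.5 * (ui - prev) ** 2 / h      # (1/2)|u'|^2 on one cell
        total += h * (0.25 * ui**4 - fi * ui)    # -G(u) - f u at one node
        prev = ui
    return total + 0.5 * prev**2 / h             # last cell, where u(1) = 0


def minimize_energy(n=99, tol=1e-10, max_iter=50):
    """Newton's method for the discrete Euler-Lagrange equation, started at u = 0."""
    h = 1.0 / (n + 1)
    f = [1.0] * n
    u = [0.0] * n
    off = [-1.0 / h**2] * n
    for _ in range(max_iter):
        r = residual(u, h, f)
        if max(abs(v) for v in r) < tol:
            break
        diag = [2.0 / h**2 + 3.0 * ui**2 for ui in u]
        step = solve_tridiagonal(off, diag, off, [-v for v in r])
        u = [ui + si for ui, si in zip(u, step)]
    return u, h, f


if __name__ == "__main__":
    u, h, f = minimize_energy()
    print("energy at the computed minimizer:", energy(u, h, f))
    print("energy at u = 0:", energy([0.0] * len(u), h, f))
```

Because this particular $g$ is decreasing, the discrete energy is strictly convex, so the critical point found by Newton's method is the unique global minimizer of the discrete problem; for a general $g$ satisfying only (7.7.2) a critical point found this way need not be a minimizer.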
We have proved the following assertion.

Theorem 7.7.1. Let $g$ be a continuous function satisfying (7.7.2) and (ii) (for $N \ge 2$), let $f$ satisfy (i) (for $N = 1$) or (ii) (for $N \ge 2$). Then the boundary value problem (7.7.1) has at least one weak solution $u_0 \in W_0^{1,2}(\Omega)$.

The growth assumptions on $g$ stated above for $N \ge 3$ are not optimal and can be relaxed if we assume the monotonicity of $g$ and apply Theorem 6.2.12. Indeed, let $G(s) = \int_0^s g(\tau)\,d\tau$ be a concave function.35 Then the energy functional $E$ is a strictly convex functional. This functional is continuous even if the growth assumption (ii) for $N \ge 3$ is relaxed to
\[ q \in \left[1, \frac{2N}{N-2}\right], \]
i.e., the value $q = \frac{2N}{N-2}$ is admissible, too.36 Hence we have the following assertion.

35 In particular, if $g$ is decreasing, then $G$ is concave.

Theorem 7.7.2. Let $f$ and $g$ be as in Theorem 7.7.1. If $g$ is decreasing, then (7.7.1) has a unique weak solution. The assertion remains true even if $g$ satisfies (ii) with $q = \frac{2N}{N-2}$ for $N \ge 3$.

Let us now consider the Dirichlet problem
\[ \begin{cases} -\Delta u(x) + \lambda u(x) = |u(x)|^{p-2} u(x) & \text{in } \Omega, \\ u = 0 & \text{on } \partial\Omega, \end{cases} \tag{7.7.6} \]
and look for nonnegative solutions $u \ge 0$, $u \ne 0$ a.e. in $\Omega$ (cf. Example 6.4.7 for $N = 1$). For future purposes we denote by $2^*$ an arbitrary value greater than or equal to 1 for $N = 2$ and
\[ 2^* = \frac{2N}{N-2} \quad \text{for } N \ge 3. \]
The real numbers $\lambda$ and $p$ in (7.7.6) are parameters. We will apply the Mountain Pass Theorem to prove the following assertion about nonnegative solutions of (7.7.6).

Theorem 7.7.3 (Willem [134]). Let $N \ge 2$, $2 < p < 2^*$. Then (7.7.6) has at least one nonnegative nontrivial weak solution if and only if $\lambda > -\lambda_1$ where $\lambda_1 > 0$ is the first eigenvalue of $-\Delta$ subject to the homogeneous Dirichlet boundary conditions on $\partial\Omega$ (see Example 7.4.4).

Proof. The necessary part is simple (cf. Example 6.4.7). Indeed, let $u \in W_0^{1,2}(\Omega)$ be a weak solution of (7.7.6), $u \ge 0$, $u \ne 0$ a.e. in $\Omega$. Then taking $v = \varphi_1$ (see Example 7.4.4) as a test function in
\[ \int_\Omega \nabla u(x) \nabla v(x)\,dx + \lambda \int_\Omega u(x) v(x)\,dx = \int_\Omega |u(x)|^{p-2} u(x) v(x)\,dx \]
we obtain
\[ (\lambda + \lambda_1) \int_\Omega u(x) \varphi_1(x)\,dx = \int_\Omega |u(x)|^{p-2} u(x) \varphi_1(x)\,dx \tag{7.7.7} \]
due to
\[ \int_\Omega \nabla u(x) \nabla \varphi_1(x)\,dx - \lambda_1 \int_\Omega u(x) \varphi_1(x)\,dx = 0. \]
36 Use Theorem 3.2.24 and explain why!

Since $\varphi_1 > 0$ in $\Omega$, we get from (7.7.7) that
\[ \lambda + \lambda_1 > 0, \quad \text{i.e.,} \quad \lambda > -\lambda_1. \]
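The sufficiency part is more involved and we will apply the Mountain Pass Theorem (Theorem 6.4.5) to prove it. So, we assume $\lambda > -\lambda_1$. Let us start with the observation that the expression
\[ |u| = \left( \int_\Omega |\nabla u(x)|^2\,dx + \lambda \int_\Omega |u(x)|^2\,dx \right)^{\frac{1}{2}} \quad \text{for } u \in W_0^{1,2}(\Omega) \]
satisfies $c_1 \|u\| \le |u| \le c_2 \|u\|$ with constants $c_i > 0$, $i = 1, 2$, independent of $u \in W_0^{1,2}(\Omega)$ where
\[ \|u\| = \left( \int_\Omega |\nabla u(x)|^2\,dx \right)^{\frac{1}{2}}. \]
Indeed, (7.2.4) yields
\[ \int_\Omega |\nabla u(x)|^2\,dx + \lambda \int_\Omega |u(x)|^2\,dx \ge d \int_\Omega |\nabla u(x)|^2\,dx \tag{7.7.8} \]
for any $u \in W_0^{1,2}(\Omega)$ where $d = 1 + \min\left\{0, \frac{\lambda}{\lambda_1}\right\}$.37 Then
\[ d^{\frac{1}{2}} \|u\| \le |u| \le \left( 1 + \frac{|\lambda|}{\lambda_1} \right)^{\frac{1}{2}} \|u\| \quad \text{for any } u \in W_0^{1,2}(\Omega) \tag{7.7.9} \]
by (7.2.4) and (7.7.8).

37 The reader should prove (7.7.8) in detail!

Let us define $F(u) = \frac{|u^+|^p}{p}$. Then the functional
\[ E(u) = \int_\Omega \left[ \frac{|\nabla u(x)|^2}{2} + \lambda \frac{|u(x)|^2}{2} - F(u(x)) \right] dx, \quad u \in W_0^{1,2}(\Omega), \]
is the energy functional associated with the problem
\[ \begin{cases} -\Delta u(x) + \lambda u(x) = |u^+(x)|^{p-1} & \text{in } \Omega, \\ u = 0 & \text{on } \partial\Omega. \end{cases} \tag{7.7.10} \]
To prove the existence of a weak solution of (7.7.10) (which possibly changes sign in $\Omega$) we apply the Mountain Pass Theorem (see Theorem 6.4.5). For this purpose we verify that $E$
(i) has the mountain pass type geometry (see Proposition 6.4.3),
(ii) satisfies the (PS)$_c$ condition (see Definition 6.4.4).
We have $E(o) = 0$ and
\[ E(u) \ge \frac{1}{2} |u|^2 - \frac{1}{p} \int_\Omega |u(x)|^p\,dx \ge \frac{1}{2} c_1^2 \|u\|^2 - \frac{c_{\mathrm{emb}}^p}{p} \|u\|^p = \|u\|^2 \left( \frac{c_1^2}{2} - \frac{c_{\mathrm{emb}}^p}{p} \|u\|^{p-2} \right).\,{}^{38} \]
Hence there exists $r > 0$ small enough and such that
\[ b := \inf_{\|u\| = r} E(u) > 0 = E(o). \]
On the other hand, taking $w_0 > 0$ in $\Omega$ fixed, $w_0 \in W_0^{1,2}(\Omega)$, then
\[ E(t w_0) \le \frac{c_2^2 t^2}{2} \|w_0\|^2 - \frac{t^p}{p} \int_\Omega |w_0(x)|^p\,dx \quad \text{for } t \ge 0. \]
So, there exists $t > 0$ (large enough) such that for $e = t w_0 \in W_0^{1,2}(\Omega)$ we have both
\[ \|e\| > r \quad \text{and} \quad E(e) < 0. \]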
(Remember that $p > 2$.)

In order to verify the (PS)$_c$ condition we proceed as follows. Let $\{u_n\}_{n=1}^\infty \subset W_0^{1,2}(\Omega)$ be a sequence satisfying
\[ E(u_n) \to c, \quad \nabla E(u_n) \to o \quad \text{with a } c \in \mathbb{R}. \]
(For $\nabla E$ see Exercise 7.7.6.) For $n$ large enough, we also have
\[ \|\nabla E(u_n)\| \le 1. \tag{7.7.11} \]
Since
\[ (\nabla E(u), v) = \int_\Omega \nabla u(x) \nabla v(x)\,dx + \lambda \int_\Omega u(x) v(x)\,dx - \int_\Omega f(u(x)) v(x)\,dx \]
where $f(u) = (u^+)^{p-1}$, we have
\[ (\nabla E(u), u) = |u|^2 - p \int_\Omega F(u(x))\,dx. \]
Since also
\[ E(u) = \frac{1}{2} |u|^2 - \int_\Omega F(u(x))\,dx, \]
we get due to (7.7.11),
\[ p E(u) = \left( \frac{p}{2} - 1 \right) |u|^2 + (\nabla E(u), u) \ge \left( \frac{p}{2} - 1 \right) c_1^2 \|u\|^2 - \|u\|. \]
Put $u := u_n$ to see that $\{u_n\}_{n=1}^\infty$ is a bounded sequence.

38 Note that by the Sobolev Embedding Theorem and (7.7.9) there exists a constant $c_{\mathrm{emb}} > 0$ such that $\|u\|_{L^p(\Omega)} \le c_{\mathrm{emb}} \|u\|_{W_0^{1,2}(\Omega)}$ for any $u \in W_0^{1,2}(\Omega)$.
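Now, passing to a subsequence if necessary, we can assume that $u_n \rightharpoonup u$ in $W_0^{1,2}(\Omega)$ for a $u \in W_0^{1,2}(\Omega)$. By the Rellich–Kondrachov Theorem we have $u_n \to u$ in $L^p(\Omega)$. It follows from the continuity of the Nemytski operator (see Theorem 3.2.24) that
\[ f(u_n) \to f(u) \quad \text{in } L^{p'}(\Omega), \quad p' = \frac{p}{p-1}. \]
Observe that
\[ |u_n - u|^2 = (\nabla E(u_n) - \nabla E(u), u_n - u) + \int_\Omega (f(u_n(x)) - f(u(x)))(u_n(x) - u(x))\,dx. \tag{7.7.12} \]
By the assumption $\nabla E(u_n) \to o$ and by $u_n \rightharpoonup u$ in $W_0^{1,2}(\Omega)$ we have
\[ (\nabla E(u_n) - \nabla E(u), u_n - u) \to 0 \quad \text{as } n \to \infty, \]
and by the Hölder inequality we conclude that
\[ \left| \int_\Omega (f(u_n(x)) - f(u(x)))(u_n(x) - u(x))\,dx \right| \le \left( \int_\Omega |f(u_n(x)) - f(u(x))|^{p'}\,dx \right)^{\frac{1}{p'}} \left( \int_\Omega |u_n(x) - u(x)|^p\,dx \right)^{\frac{1}{p}} \to 0 \]
as $n \to \infty$. Hence (7.7.12) implies that
\[ |u_n - u| \to 0 \quad \text{as } n \to \infty, \quad \text{i.e.,} \quad u_n \to u \quad \text{in } W_0^{1,2}(\Omega). \]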
Consequently, $E$ satisfies the (PS)$_c$ condition. It follows from Theorem 6.4.5 and Remark 6.4.6 that $E$ has a critical point $u_0 \in W_0^{1,2}(\Omega)$, $u_0 \ne o$. Since $u_0$ is also a weak solution of (7.7.10) we have
\[ \int_\Omega \nabla u_0(x) \nabla v(x)\,dx + \lambda \int_\Omega u_0(x) v(x)\,dx = \int_\Omega |u_0^+(x)|^{p-1} v(x)\,dx \tag{7.7.13} \]
for any $v \in W_0^{1,2}(\Omega)$. Taking $v = u_0^-$ in (7.7.13) we arrive at
\[ |u_0^-|^2 = \int_\Omega |\nabla u_0^-(x)|^2\,dx + \lambda \int_\Omega |u_0^-(x)|^2\,dx = 0,\,{}^{39} \]
hence $u_0^- = 0$ a.e. in $\Omega$. This proves that $u_0$ is a nonnegative weak solution of (7.7.10). Since $u_0 \ge 0$ in $\Omega$, $u_0 \not\equiv 0$ in $\Omega$, we have $u_0^+ = u_0$ in $\Omega$, and so $u_0$ is a nonnegative nontrivial weak solution of (7.7.6).

39 The implication $u \in W_0^{1,2}(\Omega) \Rightarrow u^- \in W_0^{1,2}(\Omega)$ is nontrivial if $\Omega \subset \mathbb{R}^N$, $N \ge 2$, and it is not true in general if we replace $W_0^{1,2}(\Omega)$ by $W_0^{k,2}(\Omega)$ with $k \ge 2$! For the case $N = 1$ see Exercise 1.2.47. It follows from Gilbarg & Trudinger [59, Section 7.4] (or Leinfelder & Simander [84, Appendix], Ziemer [137, Corollary 2.1.8 and Theorem 2.1.11]) that
\[ \nabla u^+ = \begin{cases} \nabla u & \text{if } u > 0, \\ 0 & \text{if } u \le 0, \end{cases} \qquad \nabla u^- = \begin{cases} 0 & \text{if } u \ge 0, \\ \nabla u & \text{if } u < 0, \end{cases} \qquad \text{and} \qquad \nabla |u| = \begin{cases} \nabla u & \text{if } u > 0, \\ 0 & \text{if } u = 0, \\ -\nabla u & \text{if } u < 0, \end{cases} \]
for $u \in W_0^{1,2}(\Omega)$.

Remark 7.7.4. The nonlinearity $u \mapsto |u|^{p-2} u$ in (7.7.6) has the so-called subcritical growth due to the inequality $p < 2^*$. The reader should notice that some existence results for (7.7.6) are known also in the case of the critical growth $N \ge 3$, $p = 2^*$ (see, e.g., Willem [134]). The proofs are based on the Concentration Compactness Principle which is attributed to Lions (see Lions [87], Lions [88], Lions [89]). These techniques go beyond the limits of this book and the reader can consult, e.g., the book of Flucher [51] to get more information in this direction.

Now we show an application of the Saddle Point Theorem. Let us consider the Dirichlet boundary value problem
\[ \begin{cases} -\Delta u(x) = \lambda u(x) + h(x, u(x)) & \text{in } \Omega, \\ u = 0 & \text{on } \partial\Omega. \end{cases} \tag{7.7.14} \]
If $h$ is bounded and $\lambda$ is not an eigenvalue of (7.4.13), the existence of a solution of (7.7.14) follows from Theorem 7.5.2. We prove the following assertion.

Theorem 7.7.5. Let $\lambda$ be an eigenvalue of (7.4.13) and $h$, $\frac{\partial h}{\partial s}$ be bounded and continuous. If, moreover,
\[ H(x, s) = \int_0^s h(x, \tau)\,d\tau \rightrightarrows \infty \quad \text{as } |s| \to \infty \text{ uniformly for } x \in \Omega, \tag{7.7.15} \]
then (7.7.14) possesses a weak solution.40

Proof. Let $\lambda = \lambda_k < \lambda_{k+1}$ for a $k \in \mathbb{N}$ where $\lambda_k$, $k \in \mathbb{N}$, are the eigenvalues of (7.4.13). The energy functional associated with (7.7.14),
\[ E(u) = \frac{1}{2} \int_\Omega |\nabla u(x)|^2\,dx - \frac{\lambda_k}{2} \int_\Omega |u(x)|^2\,dx - \int_\Omega H(x, u(x))\,dx, \tag{7.7.16} \]
$u \in W_0^{1,2}(\Omega)$, has the property $E \in C^2(W_0^{1,2}(\Omega), \mathbb{R})$ due to the assumptions on $h$, $\frac{\partial h}{\partial s}$ (Exercise 7.7.8). Let $\{\varphi_j\}_{j=1}^\infty$ be an orthonormal basis of $W_0^{1,2}(\Omega)$ consisting of the eigenfunctions associated with the eigenvalues $\{\lambda_j\}_{j=1}^\infty$, $0 < \lambda_1 \le \lambda_2 \le \cdots$ (see Example 7.5.1). In particular,
\[ \lambda_j \int_\Omega |\varphi_j(x)|^2\,dx = \int_\Omega |\nabla \varphi_j(x)|^2\,dx = 1 \quad \text{holds for all } j \in \mathbb{N}. \tag{7.7.17} \]
40 The assertion can be proved under weaker assumptions on $h$, cf. Rabinowitz [105] and Remark 6.5.4.
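Let
\[ Y := \operatorname{Lin}\{\varphi_1, \varphi_2, \dots, \varphi_k\}, \quad Z = \left\{ u \in W_0^{1,2}(\Omega) : \int_\Omega u(x) v(x)\,dx = 0,\ v \in Y \right\}, \]
i.e.,
\[ W_0^{1,2}(\Omega) = Y \oplus Z, \quad \dim Y < \infty \]
and $\{\varphi_j\}_{j \ge k+1}$ forms an orthonormal basis of $Z$.

Step 1. We prove that $E$ has a geometry of the Saddle Point Theorem. If $u \in Z$, then $u = \sum_{j=k+1}^\infty a_j \varphi_j$ and (see (7.7.17))
\[ \int_\Omega \left[ |\nabla u(x)|^2 - \lambda_k |u(x)|^2 \right] dx = \sum_{j=k+1}^\infty a_j^2 \left( 1 - \frac{\lambda_k}{\lambda_j} \right) \ge \left( 1 - \frac{\lambda_k}{\lambda_{k+1}} \right) \|u\|^2. \tag{7.7.18} \]
Let $M := \sup_{x \in \Omega,\, s \in \mathbb{R}} |h(x, s)|$. Then
\[ \left| \int_\Omega H(x, u(x))\,dx \right| \le M \int_\Omega |u(x)|\,dx \le M_1 \|u\| \tag{7.7.19} \]
for all $u \in W_0^{1,2}(\Omega)$, by the Hölder and Poincaré inequalities. Combining (7.7.18) and (7.7.19) shows that $E$ is bounded below on $Z$, i.e.,
\[ \inf_{u \in Z} E(u) > -\infty. \tag{7.7.20} \]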
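The estimate (7.7.18) can be checked directly in the simplest setting. The sketch below assumes, for illustration only, $N = 1$ and $\Omega = (0, \pi)$, where $\lambda_j = j^2$ and $\varphi_j(x) = \sqrt{2/\pi}\,\sin(jx)/j$ satisfy the normalization (7.7.17); the function names, the choice $k = 2$ and the coefficients $a_j$ are hypothetical. It compares the integral on the left of (7.7.18), computed by quadrature, with the series in the middle and the lower bound on the right.

```python
import math


def midpoint(func, a, b, n=2000):
    """Composite midpoint rule for the integral of func over (a, b)."""
    h = (b - a) / n
    return h * sum(func(a + (i + 0.5) * h) for i in range(n))


def phi(j, x):
    """Eigenfunction of -u'' = lambda u on (0, pi) with ∫ |phi_j'|^2 dx = 1."""
    return math.sqrt(2.0 / math.pi) * math.sin(j * x) / j


def dphi(j, x):
    """Derivative of phi_j."""
    return math.sqrt(2.0 / math.pi) * math.cos(j * x)


def form_by_quadrature(coeffs, k):
    """∫ (|u'|^2 - lambda_k u^2) dx for u = sum_j a_j phi_j, coeffs = {j: a_j}."""
    def integrand(x):
        u = sum(a * phi(j, x) for j, a in coeffs.items())
        du = sum(a * dphi(j, x) for j, a in coeffs.items())
        return du * du - k**2 * u * u
    return midpoint(integrand, 0.0, math.pi)


def form_by_series(coeffs, k):
    """Middle expression in (7.7.18): sum_j a_j^2 (1 - lambda_k / lambda_j)."""
    return sum(a * a * (1.0 - k**2 / j**2) for j, a in coeffs.items())


def lower_bound(coeffs, k):
    """Right-hand side of (7.7.18): (1 - lambda_k / lambda_{k+1}) ||u||^2."""
    return (1.0 - k**2 / (k + 1) ** 2) * sum(a * a for a in coeffs.values())


K = 2
COEFFS = {3: 1.0, 4: -0.5, 7: 2.0}  # u lies in Z because every index exceeds K

if __name__ == "__main__":
    print(form_by_quadrature(COEFFS, K), form_by_series(COEFFS, K), lower_bound(COEFFS, K))
```

The gap factor $1 - \lambda_k / \lambda_{k+1}$ is what makes the quadratic part of $E$ positive definite on $Z$; it degenerates as $\lambda_{k+1}$ approaches $\lambda_k$.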
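Next, if $u \in Y$, then $u = u^0 + \hat{u}$ where
\[ u^0 \in Y^0 := \operatorname{Lin}\{\varphi_j : \lambda_j = \lambda_k\} \quad \text{and} \quad \hat{u} \in \hat{Y} := \operatorname{Lin}\{\varphi_j : \lambda_j < \lambda_k\}.\,{}^{41} \]
Then for $u \in Y$, $u = \sum_{j=1}^k a_j \varphi_j$,
\[ E(u) = \frac{1}{2} \sum_{j : \lambda_j < \lambda_k} a_j^2 \left( 1 - \frac{\lambda_k}{\lambda_j} \right) - \int_\Omega H(x, u^0(x))\,dx - \int_\Omega \left[ H(x, u^0(x) + \hat{u}(x)) - H(x, u^0(x)) \right] dx. \tag{7.7.21} \]
There is a constant $M_2 > 0$ such that
\[ \frac{1}{2} \sum_{j : \lambda_j < \lambda_k} a_j^2 \left( 1 - \frac{\lambda_k}{\lambda_j} \right) \le -M_2 \|\hat{u}\|^2, \tag{7.7.22} \]
and
\[ \left| \int_\Omega \left[ H(x, u^0(x) + \hat{u}(x)) - H(x, u^0(x)) \right] dx \right| = \left| \int_\Omega \left( \int_0^{u^0(x) + \hat{u}(x)} h(x, s)\,ds - \int_0^{u^0(x)} h(x, s)\,ds \right) dx \right| \]
\[ = \left| \int_\Omega \int_{u^0(x)}^{u^0(x) + \hat{u}(x)} h(x, s)\,ds\,dx \right| \le M \int_\Omega |\hat{u}(x)|\,dx \le M_1 \|\hat{u}\|. \tag{7.7.23} \]
It follows from (7.7.21)–(7.7.23) that
\[ E(u) \le -M_2 \|\hat{u}\|^2 - \int_\Omega H(x, u^0(x))\,dx + M_1 \|\hat{u}\|. \]
This implies that
\[ E(u) \to -\infty \quad \text{as } \|u\| \to \infty,\ u \in Y. \tag{7.7.24} \]
Indeed, either $\|\hat{u}\| \to \infty$ or $\|u^0\| \to \infty$, and then (7.7.24) follows from the assumption (7.7.15). It follows from (7.7.20) and (7.7.24) that $E$ verifies the hypotheses of Proposition 6.5.2, i.e., $E$ has the geometry corresponding to the Saddle Point Theorem.

41 Note that the multiplicity of $\lambda_k$ need not be equal to 1 in general but it is finite.

Step 2. Now, we prove that $E$ satisfies the (PS)$_c$ condition. Let us assume that $|E(u_m)| \le K$ for some $K > 0$ and $\nabla E(u_m) \to o$.42 Let us write
\[ u_m = u_m^0 + \hat{u}_m + \tilde{u}_m \quad \text{where } u_m^0 \in Y^0,\ \hat{u}_m \in \hat{Y},\ \tilde{u}_m \in Z. \]
For large $m$, we have
\[ \|\tilde{u}_m\| \ge |(\nabla E(u_m), \tilde{u}_m)| = \left| \int_\Omega \left[ \nabla u_m(x) \nabla \tilde{u}_m(x) - \lambda_k u_m(x) \tilde{u}_m(x) - h(x, u_m(x)) \tilde{u}_m(x) \right] dx \right|, \tag{7.7.25} \]
and the same for $\hat{u}_m$. On the other hand, since $Z = Y^\perp$, by (7.7.25), (7.7.18) and the boundedness of $h$ we obtain
\[ \left| \int_\Omega \left[ \nabla u_m(x) \nabla \tilde{u}_m(x) - \lambda_k u_m(x) \tilde{u}_m(x) - h(x, u_m(x)) \tilde{u}_m(x) \right] dx \right| \ge \left( 1 - \frac{\lambda_k}{\lambda_{k+1}} \right) \|\tilde{u}_m\|^2 - M_1 \|\tilde{u}_m\|. \tag{7.7.26} \]
42 $E(u_m) \to c$, $c \in \mathbb{R}$, implies that there exists $K > 0$ such that $|E(u_m)| \le K$.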
From (7.7.25) and (7.7.26) we obtain
\[ \|\tilde{u}_m\| \ge \left( 1 - \frac{\lambda_k}{\lambda_{k+1}} \right) \|\tilde{u}_m\|^2 - M_1 \|\tilde{u}_m\|, \]
which shows that $\{\tilde{u}_m\}_{m=1}^\infty$ is bounded. Similarly, we prove that $\{\hat{u}_m\}_{m=1}^\infty$ is bounded, too. Finally, we claim that $\{u_m^0\}_{m=1}^\infty$ is bounded. To verify the claim, observe that
\[ K \ge |E(u_m)| = \left| \frac{1}{2} \int_\Omega \left[ |\nabla \tilde{u}_m(x)|^2 + |\nabla \hat{u}_m(x)|^2 - \lambda_k \left( |\tilde{u}_m(x)|^2 + |\hat{u}_m(x)|^2 \right) \right] dx - \int_\Omega \left[ H(x, u_m(x)) - H(x, u_m^0(x)) \right] dx - \int_\Omega H(x, u_m^0(x))\,dx \right|. \]
By what has already been shown, the first integral on the right-hand side is bounded independently of $m$. Therefore $\int_\Omega H(x, u_m^0(x))\,dx$ is bounded.43 In order to show that $\{u_m^0\}_{m=1}^\infty$ is bounded it is sufficient to prove that
\[ \int_\Omega H(x, v(x))\,dx \to \infty \quad \text{as } \|v\| \to \infty \text{ for } v \in Y^0. \tag{7.7.27} \]
By (7.7.15) for any $l > 0$, there is $d_l$ such that
\[ H(x, s) \ge l \quad \text{if } |s| \ge d_l \text{ for all } x \in \Omega. \]
Let $v \in Y^0$, $v \ne o$, and write
\[ v = t\varphi \quad \text{where } \varphi \in \partial B(o; 1) := \{w \in Y^0 : \|w\| = 1\}. \]
Then
\[ \int_\Omega H(x, t\varphi(x))\,dx \ge \int_{\Omega_{tl}(\varphi)} H(x, t\varphi(x))\,dx - M_0 \tag{7.7.28} \]
where
\[ \Omega_{tl}(\varphi) = \{x \in \Omega : |t\varphi(x)| \ge d_l\} \quad \text{and} \quad M_0 \ge (\operatorname{meas} \Omega) \left| \inf_{x \in \Omega,\, s \in \mathbb{R}} H(x, s) \right|. \]
For any $\psi \in \partial B(o; 1)$ we find an open neighborhood $U(\psi) \subset \partial B(o; 1)$ of $\psi$, $x = x(\psi) \in \Omega$ and $r = r(\psi) > 0$ with the following property: for an arbitrary $l \in \mathbb{N}$ there exists $t_l(\psi)$ such that
\[ B(x(\psi); r(\psi)) := \{x \in \Omega : |x - x(\psi)| < r(\psi)\} \subset \Omega_{tl}(\varphi)\,{}^{44} \]
for any $t \ge t_l(\psi)$ and $\varphi \in U(\psi)$.

43 The reader should justify it using an estimate similar to (7.7.23).
44 Here we use the fact that the eigenfunctions of (7.4.13) are continuous in $\Omega$ (cf. Example 7.5.1).
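Then (7.7.28) implies that for any $l \in \mathbb{N}$ we have
\[ \int_\Omega H(x, t\varphi(x))\,dx \ge l \operatorname{meas} B(x(\psi); r(\psi)) - M_0 \tag{7.7.29} \]
for any $t \ge t_l(\psi)$, $\varphi \in U(\psi)$. The system $\{U(\psi) : \psi \in \partial B(o; 1)\}$ is an open covering of $\partial B(o; 1)$. The compactness of $\partial B(o; 1)$ implies that there exists a finite subcovering $\{U(\psi_i) : \psi_i \in \partial B(o; 1)\}$, $i = 1, \dots, n$, of $\partial B(o; 1)$. Let
\[ c := \min_{i=1,\dots,n} \{\operatorname{meas} B(x(\psi_i); r(\psi_i))\}, \quad t_l := \max_{i=1,\dots,n} \{t_l(\psi_i)\} \quad \text{for any } l \in \mathbb{N}. \]
Then from (7.7.29) we obtain that
\[ \int_\Omega H(x, t\varphi(x))\,dx \ge cl - M_0 \quad \text{for any } \varphi \in \partial B(o; 1) \text{ and } t \ge t_l, \]
i.e., (7.7.27) holds for $\|v\| \to \infty$, $v \in Y^0$.

So, we have proved that $\{u_m\}_{m=1}^\infty \subset W_0^{1,2}(\Omega)$ is a bounded sequence. Passing to a subsequence if necessary, we may assume that $u_m \rightharpoonup u$ in $W_0^{1,2}(\Omega)$ and $u_m \to u$ in $L^2(\Omega)$. Then we have
\[ (\nabla E(u_m) - \nabla E(u), u_m - u) \to 0. \tag{7.7.30} \]
But we also have
\[ \int_\Omega |u_m(x) - u(x)|^2\,dx \to 0, \quad \int_\Omega [h(x, u_m(x)) - h(x, u(x))](u_m(x) - u(x))\,dx \to 0. \]
These facts together with (7.7.30) imply
\[ \int_\Omega |\nabla u_m(x) - \nabla u(x)|^2\,dx \to 0, \quad \text{i.e.,} \quad u_m \to u \quad \text{in } W_0^{1,2}(\Omega). \]
This proves that $E$ verifies the (PS)$_c$ condition, and the proof of Theorem 7.7.5 is complete.

Exercise 7.7.6. Let
\[ E(u) = \frac{1}{2} \int_\Omega |\nabla u(x)|^2\,dx + \frac{\lambda}{2} \int_\Omega |u(x)|^2\,dx - \frac{1}{p} \int_\Omega |u^+(x)|^p\,dx, \quad \lambda \in \mathbb{R},\ p > 2. \]
Prove that $E \in C^2(W_0^{1,2}(\Omega), \mathbb{R})$ and
\[ (\nabla E(u), h) = \int_\Omega \nabla u(x) \nabla h(x)\,dx + \lambda \int_\Omega u(x) h(x)\,dx - \int_\Omega |u^+(x)|^{p-1} h(x)\,dx. \]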
Exercise 7.7.7. Compare the assertion and the proof of Theorem 7.7.3 with Example 6.4.7. Point out the differences between the one-dimensional and higher dimensional cases.
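One way to get a feeling for the mountain pass geometry used in the proof of Theorem 7.7.3 is to restrict $E$ to a single ray $t \mapsto E(t w_0)$, $t \ge 0$. The sketch below assumes, purely for illustration, the one-dimensional setting $\Omega = (0, 1)$ with $\lambda = 0$, $p = 4$ and $w_0(x) = \sin(\pi x)$; these choices and all names are hypothetical and not taken from the text. Along the ray the energy reduces to $\frac{t^2}{2} a - \frac{t^p}{p} b$ with $a = \int_\Omega (|w_0'|^2 + \lambda |w_0|^2)\,dx$ and $b = \int_\Omega |w_0^+|^p\,dx$.

```python
import math


def midpoint(func, a, b, n=2000):
    """Composite midpoint rule for the integral of func over (a, b)."""
    h = (b - a) / n
    return h * sum(func(a + (i + 0.5) * h) for i in range(n))


P = 4.0    # exponent p > 2 (assumed value)
LAM = 0.0  # parameter lambda > -lambda_1 (assumed value)


def w0(x):
    return math.sin(math.pi * x)


def dw0(x):
    return math.pi * math.cos(math.pi * x)


# a = ∫ |w0'|^2 + lambda |w0|^2 dx,  b = ∫ (w0^+)^p dx
A = midpoint(lambda x: dw0(x) ** 2 + LAM * w0(x) ** 2, 0.0, 1.0)
B = midpoint(lambda x: max(w0(x), 0.0) ** P, 0.0, 1.0)


def ray_energy(t):
    """E(t w0) = (t^2 / 2) a - (t^p / p) b for t >= 0."""
    return 0.5 * t * t * A - t ** P * B / P


# The ray energy increases up to T_STAR and decreases afterwards.
T_STAR = (A / B) ** (1.0 / (P - 2.0))

if __name__ == "__main__":
    for t in (0.0, 1.0, T_STAR, 10.0):
        print(t, ray_energy(t))
```

The profile starts at zero, rises to a positive maximum at `T_STAR` and becomes negative for large $t$ because $p > 2$. This is the one-dimensional shadow of the two facts checked in the proof: $E$ is positive on a small sphere and negative at some distant point $e$.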
Exercise 7.7.8. Prove that the functional $E(u)$ from (7.7.16) has the property $E \in C^2(W_0^{1,2}(\Omega), \mathbb{R})$.
Hint. First prove that the second Gâteaux derivative is given by
\[ D^2 E(u)(w, z) = \int_\Omega \nabla w(x) \nabla z(x)\,dx - \lambda_k \int_\Omega w(x) z(x)\,dx - \int_\Omega \frac{\partial h}{\partial s}(x, u(x)) w(x) z(x)\,dx. \]
Then show that $D^2 E(u)$ is continuous in $u$ (Remark 3.2.29). To prove continuity of the third term in $D^2 E(u)$ use the boundedness of $\frac{\partial h}{\partial s}$, the embedding $W_0^{1,2}(\Omega) \subset L^r(\Omega)$, $r > 2$ (Remark 1.2.24) and the continuity of the Nemytski operator from $L^r(\Omega)$ into $L^{\frac{r}{r-2}}(\Omega)$ (Theorem 3.2.24(ii)).

Exercise 7.7.9. Let $X \subset W_0^{1,2}(\Omega)$, $\dim X < \infty$. Why are the norms
\[ \|u\| = \left( \int_\Omega |\nabla u(x)|^2\,dx \right)^{\frac{1}{2}}, \quad \|u\|_{L^{2^*}(\Omega)} = \left( \int_\Omega |u(x)|^{2^*}\,dx \right)^{\frac{1}{2^*}}, \quad \|u\|_{L^p(\Omega)} = \left( \int_\Omega |u(x)|^p\,dx \right)^{\frac{1}{p}}, \quad 1 < p < 2^*, \]
equivalent on $X$? Why are these norms not equivalent on the whole space $X = W_0^{1,2}(\Omega)$?
Hint. Cf. Corollary 1.2.11.

Exercise 7.7.10. Replace (7.7.15) by the assumption
\[ H(x, s) \to -\infty \quad \text{as } |s| \to \infty \]
and prove the assertion of Theorem 7.7.5.

Exercise 7.7.11. Consider the boundary value problem
\[ \begin{cases} -\Delta u(x) + \lambda u(x) = g(x, u(x)) & \text{in } \Omega, \\ u = 0 & \text{on } \partial\Omega. \end{cases} \tag{7.7.31} \]
Formulate conditions on $\lambda$ and $g = g(x, s)$ which guarantee that the energy functional associated with (7.7.31)
(i) is coercive,
(ii) is weakly coercive,
(iii) has a geometry corresponding to the Mountain Pass Theorem,
(iv) has a geometry corresponding to the Saddle Point Theorem.
7.7A Application of the Saddle Point Theorem

In this appendix we will give another application of Theorem 6.5.12. Consider the existence of weak solutions of the boundary value problem
\[ \begin{cases} -\Delta_p u(x) = \lambda_1 |u(x)|^{p-2} u(x) + f(x, u(x)) - h(x) & \text{in } \Omega, \\ u = 0 & \text{on } \partial\Omega, \end{cases} \tag{7.7.32} \]
where $p > 1$, $\Omega \in C^{0,1}$ is a bounded domain in $\mathbb{R}^N$, $f : \Omega \times \mathbb{R} \to \mathbb{R}$ is a bounded Carathéodory function and $h \in L^{p'}(\Omega)$, $p' = \frac{p}{p-1}$. As in Appendix 7.5A, let $\lambda_1 > 0$ be the principal eigenvalue of $-\Delta_p$ on $\Omega$ with zero Dirichlet boundary conditions, and let us denote by $\varphi_1$ the positive (in $\Omega$) eigenfunction associated with $\lambda_1$ normalized by
\[ \|\varphi_1\| = \left( \int_\Omega |\nabla \varphi_1(x)|^p\,dx \right)^{\frac{1}{p}} = 1. \]
We will suppose that $f$ satisfies the following condition: for a.a. $x \in \Omega$ there exist limits
\[ \lim_{s \to -\infty} f(x, s) = f_{-\infty}(x), \quad \lim_{s \to +\infty} f(x, s) = f_{+\infty}(x). \]
It is well known that under this condition the problem (7.7.32) need not have solutions (cf. Exercise 7.7.13). The following result extends the classical result of Landesman & Lazer [83].

Theorem 7.7.12. Suppose that either
\[ \int_\Omega f_{+\infty}(x) \varphi_1(x)\,dx < \int_\Omega h(x) \varphi_1(x)\,dx < \int_\Omega f_{-\infty}(x) \varphi_1(x)\,dx \tag{7.7.33} \]
or else
\[ \int_\Omega f_{-\infty}(x) \varphi_1(x)\,dx < \int_\Omega h(x) \varphi_1(x)\,dx < \int_\Omega f_{+\infty}(x) \varphi_1(x)\,dx. \tag{7.7.34} \]
Then there exists at least one weak solution $u \in W_0^{1,p}(\Omega)$ of the problem (7.7.32).

Proof. We follow the proof from Arcoya & Orsina [9]. Let us introduce the energy functional $E : W_0^{1,p}(\Omega) \to \mathbb{R}$ associated with (7.7.32):
\[ E(u) := \frac{1}{p} \int_\Omega |\nabla u(x)|^p\,dx - \frac{\lambda_1}{p} \int_\Omega |u(x)|^p\,dx - \int_\Omega F(x, u(x))\,dx + \int_\Omega h(x) u(x)\,dx \tag{7.7.35} \]
where
\[ F(x, s) = \int_0^s f(x, t)\,dt \quad \text{for a.a. } x \in \Omega \text{ and } s \in \mathbb{R}. \]
Then $E \in C^1(W_0^{1,p}(\Omega), \mathbb{R})$ (cf. Exercise 7.7.14) and its critical points correspond to the weak solutions of (7.7.32). We proceed in three steps.
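Step 1. Let $\{u_n\}_{n=1}^\infty \subset W_0^{1,p}(\Omega)$ be such that there exists $c > 0$ such that
\[ |E(u_n)| \le c \quad \text{for any } n \in \mathbb{N} \tag{7.7.36} \]
and there exists a strictly decreasing sequence $\{\varepsilon_n\}_{n=1}^\infty$, $\lim_{n \to \infty} \varepsilon_n = 0$, such that
\[ |\langle E'(u_n), v \rangle| \le \varepsilon_n \|v\| \quad \text{for any } n \in \mathbb{N} \text{ and any } v \in W_0^{1,p}(\Omega).\,{}^{45} \tag{7.7.37} \]
Then we will prove that $\{u_n\}_{n=1}^\infty$ contains a subsequence which converges strongly in $W_0^{1,p}(\Omega)$.

Let us begin by proving that the sequence $\{u_n\}_{n=1}^\infty$ is bounded in $W_0^{1,p}(\Omega)$. Suppose, by contradiction, that $\|u_n\| \to \infty$, and define $v_n = \frac{u_n}{\|u_n\|}$. Thus $\{v_n\}_{n=1}^\infty$ is bounded in $W_0^{1,p}(\Omega)$ and hence, at least its subsequence, converges to a function $v_0$ weakly in $W_0^{1,p}(\Omega)$ and strongly in $L^p(\Omega)$. Dividing (7.7.35) with $u = u_n$ by $\|u_n\|^p$, we get, due to (7.7.36),
\[ \limsup_{n \to \infty} \left[ \frac{1}{p} - \frac{\lambda_1}{p} \int_\Omega |v_n(x)|^p\,dx - \int_\Omega \frac{F(x, u_n(x))}{\|u_n\|^p}\,dx + \int_\Omega h(x) \frac{u_n(x)}{\|u_n\|^p}\,dx \right] \le 0. \]
Since
\[ \lim_{n \to \infty} \left[ \int_\Omega \frac{F(x, u_n(x))}{\|u_n\|^p}\,dx + \int_\Omega h(x) \frac{u_n(x)}{\|u_n\|^p}\,dx \right] = 0 \]
by the hypotheses on $f$, $h$, and $\{u_n\}_{n=1}^\infty$ while
\[ \lim_{n \to \infty} \int_\Omega |v_n(x)|^p\,dx = \int_\Omega |v_0(x)|^p\,dx, \]
we have
\[ \lambda_1 \int_\Omega |v_0(x)|^p\,dx \ge 1. \]
Using the weak lower semicontinuity of the norm and the variational characterization of $\lambda_1$ (see Appendix 7.5A), we get
\[ 1 \le \lambda_1 \int_\Omega |v_0(x)|^p\,dx \le \int_\Omega |\nabla v_0(x)|^p\,dx \le \liminf_{n \to \infty} \int_\Omega |\nabla v_n(x)|^p\,dx = 1. \]
Thus
\[ \|v_0\| = 1 \quad \text{and} \quad \int_\Omega |\nabla v_0(x)|^p\,dx = \lambda_1 \int_\Omega |v_0(x)|^p\,dx. \]
This implies, by the definition of $\varphi_1$, that $v_0 = \pm\varphi_1$.46 Now we write (7.7.36) and (7.7.37) with $v = u_n$ in the equivalent form
\[ -cp \le \int_\Omega |\nabla u_n(x)|^p\,dx - \lambda_1 \int_\Omega |u_n(x)|^p\,dx - p \int_\Omega F(x, u_n(x))\,dx + p \int_\Omega h(x) u_n(x)\,dx \le cp, \]
\[ -\varepsilon_n \|u_n\| \le -\int_\Omega |\nabla u_n(x)|^p\,dx + \lambda_1 \int_\Omega |u_n(x)|^p\,dx + \int_\Omega f(x, u_n(x)) u_n(x)\,dx - \int_\Omega h(x) u_n(x)\,dx \le \varepsilon_n \|u_n\|. \]
45 It will be convenient to express the assumption $E'(u_n) \to 0$ in this form.
46 Note that we have proved that $\|v_n\| \to \|v_0\|$ and so by the uniform convexity of $W_0^{1,p}(\Omega)$, $v_n \to \pm\varphi_1$, too.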
Summing up and dividing by $\|u_n\|$, we obtain
\[ \left| \int_\Omega \left[ f(x, u_n(x)) v_n(x) - p\,g(x, u_n(x)) v_n(x) + (p-1) h(x) v_n(x) \right] dx \right| \le \frac{cp}{\|u_n\|} + \varepsilon_n \]
where
\[ g(x, s) = \begin{cases} \dfrac{F(x, s)}{s} & \text{if } s \ne 0, \\ f(x, 0) & \text{if } s = 0. \end{cases} \tag{7.7.38} \]
Letting $n$ tend to infinity and supposing that $v_n$ converge to $+\varphi_1$ (for example), we obtain
\[ \lim_{n \to \infty} \int_\Omega \left[ f(x, u_n(x)) v_n(x) - p\,g(x, u_n(x)) v_n(x) \right] dx = (1 - p) \int_\Omega h(x) \varphi_1(x)\,dx. \]
Since $v_n$ converge to $\varphi_1$, we have $\lim_{n \to \infty} u_n(x) = \infty$ for a.a. $x \in \Omega$, and so
\[ f(x, u_n(x)) \to f_{+\infty}(x) \quad \text{for a.a. } x \in \Omega, \qquad g(x, u_n(x)) \to f_{+\infty}(x) \quad \text{for a.a. } x \in \Omega. \]
The properties of $f$ and $F$ and the Lebesgue Theorem then imply
\[ \lim_{n \to \infty} \int_\Omega \left[ f(x, u_n(x)) v_n(x) - p\,g(x, u_n(x)) v_n(x) \right] dx = (1 - p) \int_\Omega f_{+\infty}(x) \varphi_1(x)\,dx, \]
and so, since $p > 1$,
\[ \int_\Omega f_{+\infty}(x) \varphi_1(x)\,dx = \int_\Omega h(x) \varphi_1(x)\,dx, \]
which contradicts both (7.7.33) and (7.7.34).

Thus $\{u_n\}_{n=1}^\infty$ is bounded. This implies that there exists $u \in W_0^{1,p}(\Omega)$ such that, at least its subsequence, $u_n$ converges to $u$ weakly in $W_0^{1,p}(\Omega)$ and strongly in $L^p(\Omega)$. Choosing $v = u_n - u$ in (7.7.37), we obtain
\[ \left| \int_\Omega |\nabla u_n(x)|^{p-2} \nabla u_n(x) (\nabla u_n(x) - \nabla u(x))\,dx - \lambda_1 \int_\Omega |u_n(x)|^{p-2} u_n(x) (u_n(x) - u(x))\,dx \right. \]
\[ \left. - \int_\Omega f(x, u_n(x)) (u_n(x) - u(x))\,dx + \int_\Omega h(x) (u_n(x) - u(x))\,dx \right| \le \varepsilon_n \|u_n - u\|. \]
Since $u_n \to u$ in $L^p(\Omega)$ and, by the hypotheses on $f$ and $h$,
\[ \lim_{n \to \infty} \int_\Omega |u_n(x)|^{p-2} u_n(x) (u_n(x) - u(x))\,dx = 0, \]
\[ \lim_{n \to \infty} \int_\Omega f(x, u_n(x)) (u_n(x) - u(x))\,dx = 0, \]
\[ \lim_{n \to \infty} \int_\Omega h(x) (u_n(x) - u(x))\,dx = 0, \]
we have
\[ \lim_{n \to \infty} \int_\Omega |\nabla u_n(x)|^{p-2} \nabla u_n(x) (\nabla u_n(x) - \nabla u(x))\,dx = 0. \]
Subtracting
\[ \int_\Omega |\nabla u(x)|^{p-2} \nabla u(x) (\nabla u_n(x) - \nabla u(x))\,dx \]
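(which converges to zero as $n$ tends to infinity since $u$ belongs to $W_0^{1,p}(\Omega)$), we conclude that
\[ 0 = \lim_{n \to \infty} \int_\Omega \left( |\nabla u_n(x)|^{p-2} \nabla u_n(x) - |\nabla u(x)|^{p-2} \nabla u(x) \right) (\nabla u_n(x) - \nabla u(x))\,dx \ge \lim_{n \to \infty} \left( \|u_n\|^{p-1} - \|u\|^{p-1} \right) (\|u_n\| - \|u\|) \ge 0,\,{}^{47} \]
which implies $\|u_n\| \to \|u\|$. The uniform convexity of $W_0^{1,p}(\Omega)$ yields that $u_n$ converges strongly to $u$ in $W_0^{1,p}(\Omega)$. This completes the proof of Step 1. Note that it follows from Step 1 that $E$ satisfies (PS)$_c$ on any level $c \in \mathbb{R}$.

47 Cf. computation on page 328.

Step 2. Note also that, in the proof of the Palais–Smale condition, we have proved that if $\{E(u_n)\}_{n=1}^\infty$ is a sequence bounded above with $\|u_n\| \to \infty$, then (at least its subsequence) $v_n = \frac{u_n}{\|u_n\|} \to \pm\varphi_1$ in $W_0^{1,p}(\Omega)$ (see footnote 46 on page 528). Using this fact, it is easy to prove that $E$ is weakly coercive provided (7.7.33) holds. Otherwise, it is possible to choose a sequence $\{u_n\}_{n=1}^\infty$ such that
\[ \|u_n\| \to \infty, \quad E(u_n) \le c \quad \text{and} \quad v_n = \frac{u_n}{\|u_n\|} \to \pm\varphi_1 \quad \text{in } W_0^{1,p}(\Omega). \]
Assume (for example) that $v_n \to \varphi_1$; arguing as in the previous proof we get
\[ \int_\Omega h(x) \varphi_1(x)\,dx - \int_\Omega f_{+\infty}(x) \varphi_1(x)\,dx = \lim_{n \to \infty} \left[ \int_\Omega h(x) v_n(x)\,dx - \int_\Omega \frac{F(x, u_n(x))}{\|u_n\|}\,dx \right] \le \limsup_{n \to \infty} \frac{E(u_n)}{\|u_n\|} \le \lim_{n \to \infty} \frac{c}{\|u_n\|} = 0, \]
which contradicts (7.7.33). The weak coerciveness of $E$ and the weak sequential lower semicontinuity (cf. Exercise 7.7.15) are enough in order to prove that $E$ attains its infimum (see Theorem 6.2.8 and Remark 6.2.22), so that (7.7.32) has at least one weak solution.

Step 3. If (7.7.34) holds, then $E$ has the geometry of the Saddle Point Theorem. Indeed, splitting $W_0^{1,p}(\Omega)$ as the direct sum of
\[ Y = \operatorname{Lin}\{\varphi_1\} \quad \text{and} \quad Z := \left\{ u \in W_0^{1,p}(\Omega) : \int_\Omega u(x) \varphi_1(x)\,dx = 0 \right\}, \]
we see that there exists $\bar{\lambda} > \lambda_1$ such that
\[ \int_\Omega |\nabla u(x)|^p\,dx \ge \bar{\lambda} \int_\Omega |u(x)|^p\,dx \quad \text{for all } u \in Z \]
(cf. Exercise 7.7.16). Thus, by the Hölder inequality and by the properties of $F$, there exists $c > 0$ such that for every $u$ in $Z$,
\[ E(u) \ge \frac{1}{p} \left( 1 - \frac{\lambda_1}{\bar{\lambda}} \right) \int_\Omega |\nabla u(x)|^p\,dx - \int_\Omega F(x, u(x))\,dx + \int_\Omega h(x) u(x)\,dx \]
\[ \ge \frac{1}{p} \left( 1 - \frac{\lambda_1}{\bar{\lambda}} \right) \int_\Omega |\nabla u(x)|^p\,dx - \frac{1}{\bar{\lambda}^{\frac{1}{p}}} \left( c\,(\operatorname{meas} \Omega)^{\frac{1}{p'}} + \|h\|_{L^{p'}(\Omega)} \right) \left( \int_\Omega |\nabla u(x)|^p\,dx \right)^{\frac{1}{p}}. \]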
Hence, $E$ is weakly coercive on $Z$, so that
\[ B_Z = \min_{u \in Z} E(u) > -\infty. \]
Observe that we have not yet used the fact that (7.7.34) holds. On the other hand, for every $t \in \mathbb{R}$ we have
\[ \int_\Omega |\nabla (t\varphi_1)(x)|^p\,dx - \lambda_1 \int_\Omega |t\varphi_1(x)|^p\,dx = 0 \]
as follows from the definition of $\lambda_1$ and $\varphi_1$. Thus,
\[ E(t\varphi_1) = t \int_\Omega h(x) \varphi_1(x)\,dx - \int_\Omega F(x, t\varphi_1(x))\,dx = t \left[ \int_\Omega h(x) \varphi_1(x)\,dx - \int_\Omega g(x, t\varphi_1(x)) \varphi_1(x)\,dx \right] \]
where $g$ has been defined by (7.7.38). Using the positivity of $\varphi_1$ and the hypotheses on $f$, it is easy to see that
\[ \lim_{t \to +\infty} g(x, t\varphi_1(x)) \varphi_1(x) = f_{+\infty}(x) \varphi_1(x) \quad \text{for a.a. } x \in \Omega. \]
Furthermore, there exists $c > 0$ such that $|g(x, t\varphi_1(x)) \varphi_1(x)| \le c\varphi_1(x) \in L^1(\Omega)$, so that the Lebesgue Theorem implies
\[ \lim_{t \to +\infty} \left[ \int_\Omega h(x) \varphi_1(x)\,dx - \int_\Omega g(x, t\varphi_1(x)) \varphi_1(x)\,dx \right] = \int_\Omega [h(x) - f_{+\infty}(x)] \varphi_1(x)\,dx, \]
and the limit is negative by (7.7.34). Analogously, if $t$ tends to $-\infty$, we have the same result with $f_{+\infty}$ replaced by $f_{-\infty}$, so that the limit is positive by (7.7.34). In both the cases we have
\[ \lim_{t \to \pm\infty} E(t\varphi_1) = -\infty. \]
Thus, there exists $R > 0$ such that if $|t| = R$, we have
\[ E(t\varphi_1) < B_Z \le E(u) \quad \text{for all } u \in Z. \]
Hence $E$ satisfies the hypotheses of Theorem 6.5.12, and so there exists a critical point for $E$, that is, a weak solution of (7.7.32).

Exercise 7.7.13. Let $p = 2$, let $f(x, \cdot)$ be strictly increasing (decreasing). Then a necessary condition for the existence of a solution of (7.7.32) is that
\[ \int_\Omega f_{-\infty}(x) \varphi_1(x)\,dx < \int_\Omega h(x) \varphi_1(x)\,dx < \int_\Omega f_{+\infty}(x) \varphi_1(x)\,dx \]
\[ \left( \int_\Omega f_{+\infty}(x) \varphi_1(x)\,dx < \int_\Omega h(x) \varphi_1(x)\,dx < \int_\Omega f_{-\infty}(x) \varphi_1(x)\,dx \right). \]
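Hint. Assume that $u$ is a solution of (7.7.32), i.e.,
\[ \int_\Omega \nabla u(x) \nabla v(x)\,dx = \lambda_1 \int_\Omega u(x) v(x)\,dx + \int_\Omega f(x, u(x)) v(x)\,dx - \int_\Omega h(x) v(x)\,dx \]
for any $v \in W_0^{1,2}(\Omega)$. Choose $v = \varphi_1$ and use the fact that
\[ \int_\Omega \nabla u(x) \nabla \varphi_1(x)\,dx = \lambda_1 \int_\Omega u(x) \varphi_1(x)\,dx. \]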
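For $p = 2$ the conditions (7.7.33) and (7.7.34) are easy to test numerically in a concrete case. The sketch below assumes, for illustration only, $\Omega = (0, \pi)$ with $\varphi_1(x) = \sin x$ (left unnormalized, since a positive multiple of $\varphi_1$ does not change the strict inequalities) and $f(x, s) = \arctan s$, so that $f_{\pm\infty} = \pm\pi/2$; the function `which_condition` and the constant choices of $h$ are hypothetical.

```python
import math


def midpoint(func, a, b, n=4000):
    """Composite midpoint rule for the integral of func over (a, b)."""
    step = (b - a) / n
    return step * sum(func(a + (i + 0.5) * step) for i in range(n))


def which_condition(f_minus, f_plus, h, phi1=math.sin, a=0.0, b=math.pi):
    """Return '(7.7.33)' or '(7.7.34)' if that condition holds, otherwise None.

    f_minus, f_plus and h are functions of x; phi1 is a positive first
    eigenfunction on (a, b). All integrals are approximated by quadrature.
    """
    int_minus = midpoint(lambda x: f_minus(x) * phi1(x), a, b)
    int_plus = midpoint(lambda x: f_plus(x) * phi1(x), a, b)
    int_h = midpoint(lambda x: h(x) * phi1(x), a, b)
    if int_plus < int_h < int_minus:
        return "(7.7.33)"
    if int_minus < int_h < int_plus:
        return "(7.7.34)"
    return None


if __name__ == "__main__":
    half_pi = math.pi / 2
    # f(x, s) = arctan(s): limits -pi/2 and +pi/2, tested with h = 1 and h = 2.
    print(which_condition(lambda x: -half_pi, lambda x: half_pi, lambda x: 1.0))
    print(which_condition(lambda x: -half_pi, lambda x: half_pi, lambda x: 2.0))
```

With these assumed data a constant $h$ satisfies (7.7.34) exactly when $|h| < \pi/2$, because all three integrals share the common factor $\int_0^\pi \sin x\,dx = 2$.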
Exercise 7.7.14. Prove that the functional E(u) defined by (7.7.35) belongs to the space C 1 (W01,p (Ω), R). Hint. Use an approach similar to that in Exercise 7.7.8. Exercise 7.7.15. Prove that E is a weakly sequentially lower semicontinuous functional on W01,p (Ω). Hint. Use the weak sequential lower semicontinuity of the norm in W01,p (Ω), the compact embedding W01,p (Ω) ⊂⊂ Lp (Ω), and the continuity of the Nemytski operator u → F (·, u) from Lp (Ω) to L1 (Ω). ¯ > λ1 such that Exercise 7.7.16. Prove that there exists λ ¯ |∇u(x)|p dx ≥ λ |u(x)|p dx Ω Ω u(x)ϕ1 (x) dx = 0 . for all u ∈ Z u ∈ W01,p (Ω) : Ω
Hint. Assume by contradiction that there exist $\varepsilon_n \to 0$ and $u_n \in Z$, $\|u_n\| = 1$ such that
$$1 = (\lambda_1 + \varepsilon_n)\int_\Omega |u_n(x)|^p\,dx.$$
Pass to a subsequence $u_n \rightharpoonup u$ in $W_0^{1,p}(\Omega)$, $u_n \to u$ in $L^p(\Omega)$ and show $u \ne o$,
$$\int_\Omega |\nabla u(x)|^p\,dx \le \lambda_1 \int_\Omega |u(x)|^p\,dx.$$
This contradicts $u \in Z$ due to the simplicity of $\lambda_1$.

Exercise 7.7.17. Consider the boundary value problem
$$\begin{cases} -\Delta_p u(x) + \lambda|u(x)|^{p-2}u(x) = g(x,u(x)) & \text{in } \Omega,\\ u = 0 & \text{on } \partial\Omega, \end{cases} \tag{7.7.39}$$
where $p > 1$. Formulate conditions on $\lambda$ and $g = g(x,s)$ which guarantee that the energy functional associated with (7.7.39)
(i) is coercive,
(ii) is weakly coercive,
(iii) has a geometry corresponding to the Mountain Pass Theorem,
(iv) has a geometry corresponding to the Saddle Point Theorem.
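As a starting point for Exercise 7.7.17, it may help to write the energy functional down explicitly. The sketch below assumes that $g$ is a Carathéodory function with growth mild enough for $E$ to be well defined and of class $C^1$ on $W_0^{1,p}(\Omega)$, uses the norm $\|u\| = \left(\int_\Omega |\nabla u|^p\,dx\right)^{1/p}$, and writes $\lambda_1$ for the principal eigenvalue of $-\Delta_p$. The primitive $G$, the bound $c$ and the constants $c_0$, $C$ are names introduced here for illustration. The last estimate shows one sufficient set of conditions only, not the general answer the exercise asks for.

```latex
% Energy functional associated with (7.7.39); G is the primitive of g (notation assumed here).
\[
  E(u) = \frac1p\int_\Omega |\nabla u|^p\,dx + \frac{\lambda}{p}\int_\Omega |u|^p\,dx
         - \int_\Omega G(x,u)\,dx ,
  \qquad G(x,s) = \int_0^s g(x,\tau)\,d\tau .
\]
% Critical points of E are exactly the weak solutions of (7.7.39):
\[
  \langle E'(u), v\rangle
  = \int_\Omega |\nabla u|^{p-2}\nabla u\,\nabla v\,dx
  + \lambda\int_\Omega |u|^{p-2}u\,v\,dx
  - \int_\Omega g(x,u)\,v\,dx = 0
  \quad\text{for all } v \in W_0^{1,p}(\Omega).
\]
% One sufficient (far from necessary) situation: \lambda > -\lambda_1 and |g(x,s)| \le c.
% Then \int |u|^p \le \lambda_1^{-1}\|u\|^p and |G(x,s)| \le c|s| give
\[
  E(u) \ge \frac{c_0}{p}\,\|u\|^p - C\,\|u\| ,
  \qquad c_0 = \min\Bigl\{1,\ 1 + \frac{\lambda}{\lambda_1}\Bigr\} > 0 ,
\]
% so that E(u)/\|u\| \to \infty as \|u\| \to \infty, because p > 1.
```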
Summary of Methods Presented in This Book

Fixed Point Methods
Contraction Principle (Theorem 2.3.1)
Browder Theorem for non-expansive mappings (Proposition 2.3.10)
Brouwer Fixed Point Theorem (Theorem 5.1.3)
Schauder Fixed Point Theorem (Theorem 5.1.11, Example 5.2.14)
Fixed Point Theorem for condensing mappings (Theorem 5.1.27)

Local Differentiability Methods
Local Inverse Function Theorem (Theorem 4.1.1)
Implicit Function Theorem (Theorem 4.2.1)
Crandall–Rabinowitz Local Bifurcation Theorem (Theorem 4.3.22)

Topological Methods
Brouwer degree (Theorems 5.2.7 and 4.3.124)
Leray–Schauder degree (Theorem 5.2.13)
Topological degree of Browder and Skrypnik (Theorem 5.2.47)
Krasnoselski Local Bifurcation Theorem (Theorem 5.2.23)
Rabinowitz and Dancer Global Bifurcation Theorems (Theorems 5.2.34 and 5.2.38)

Monotonicity Methods
Monotone operators theory in Hilbert space (Theorem 5.3.4)
Monotone operators theory in Banach space (Theorem 5.3.22)
Leray–Lions Theorem for operators which are monotone in the principal part (Theorem 5.3.23)
Methods in Ordered Spaces
Monotone iterations, subsolutions and supersolutions (Theorem 5.4.16)
Supersolutions, subsolutions and topological degree (Theorems 5.4.49 and 5.4.50)
Supersolutions, subsolutions and global extrema (Theorem 6.2.42)

Variational Methods
Necessary and sufficient conditions for local extrema (Propositions 6.1.2 and 6.1.4, Theorem 6.1.5)
Global extrema (Theorems 6.2.4, 6.2.11 and 6.2.20)
Relative extrema and Lagrange Multiplier Method (Theorem 6.3.2)
Krasnoselski Potential Bifurcation Theorem (Theorem 6.3.26)
Mountain Pass Theorem (Theorem 6.4.5 in the Hilbert space setting, Theorem 6.4.24 in the Banach space setting)
Lusternik–Schnirelmann Method (Theorems 6.4.42 and 6.4.46)
Saddle Point Theorem (Theorem 6.5.3 in the Hilbert space setting, Theorem 6.5.12 in the Banach space setting)
Linking Theorem (Theorem 6.5.13)

Approximative Methods
Contraction Principle (Theorem 2.3.1)
Newton Method (Appendix 3.2A)
Ritz Method (Proposition 6.2.34)
Finite Element Method (Theorem 6.2.38)
Typical Applications

Most of the methods presented in this book are illustrated on boundary value problems for both ordinary and partial differential equations. For the reader’s convenience we stress here some typical boundary value problems and the methods illustrated by them.
Semilinear Problems – Ordinary Differential Equations

$\dot x(t) = f(t, x(t)), \quad x(0) = x(1)$
Lyapunov–Schmidt Reduction and Implicit Function Theorem (Example 4.3.15)
Coincidence degree (Example 5.2.18)

$\ddot x(t) = f(t, x(t)), \quad x(0) = x(1) = 0$
Contraction Principle (Example 2.3.8)
Schauder Fixed Point Theorem (Example 5.1.14)
Supersolutions and subsolutions (Example 5.4.19)
Supersolutions and subsolutions combined with global extrema (Appendix 6.2B)

$-\ddot x(t) + g(x(t)) = f(t), \quad x(0) = x(1) = 0$
Monotone operators (Example 5.3.11)
Extreme Value Theorem (special case $g(s) = s^3$, Examples 6.2.6, 6.2.14 and 6.2.21)
Saddle Point Theorem (Example 6.5.5)
$-\ddot x(t) + \lambda x(t) = |x(t)|^{p-2}x(t), \quad x(0) = x(\pi) = 0$
Mountain Pass Theorem (Example 6.4.7)

$-\ddot x(t) + a(t)x(t) = f(t, x(t)), \quad x(0) = x(1) = 0$
Linking Theorem (Example 6.5.14)

$\ddot x(t) + \lambda x(t) = 0, \quad x(0) = x(1) = 0$
Courant–Weinstein Variational Principle (Example 6.3.15)

$-(p(t)\dot x(t))^{\cdot} + q(t)x(t) = \lambda x(t), \quad x(a) = x(b) = 0$
Hilbert–Schmidt Theorem (Example 2.2.17)
Krein–Rutman Theorem (Exercise 5.4.46)

$\ddot x(t) + \lambda \sin x(t) = 0, \quad x(0) = x(2\pi), \quad \dot x(0) = \dot x(2\pi)$
Lyapunov–Schmidt Reduction (Example 4.3.24)

$\ddot x(t) + \lambda x(t) + g(\lambda, t, x(t)) = 0, \quad x(0) = x(\pi) = 0$
Krasnoselski Local Bifurcation Theorem (Exercise 5.2.33)
Dancer Global Bifurcation Theorem (Example 5.2.39)
$\ddot x(t) + \lambda x(t) + g(\lambda, t, x(t)) = 0, \quad x(0) = x(2\pi), \quad \dot x(0) = \dot x(2\pi)$
Krasnoselski Potential Bifurcation Theorem (Example 6.3.32)

$\ddot x(t) = f(t, x(t), \dot x(t)), \quad x(0) = x(1) = 0$
Leray–Schauder degree (Example 5.2.16)

$\ddot x(t) + \lambda x(t) + g(\lambda, t, x(t), \dot x(t)) = 0, \quad x(0) = x(2\pi), \quad \dot x(0) = \dot x(2\pi)$
Crandall–Rabinowitz Bifurcation Theorem (Example 4.3.25)
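
For orientation, the linear eigenvalue problem $\ddot x + \lambda x = 0$, $x(0) = x(1) = 0$ from the list above can be solved by hand. The computation below is a standard worked example and does not reproduce Example 6.3.15. The link to the variational characterization is stated only for the first eigenvalue.

```latex
% Eigenvalues of  \ddot x + \lambda x = 0,  x(0) = x(1) = 0.
% For \lambda \le 0 the boundary conditions force x \equiv 0, so take \lambda > 0:
\[
  x(t) = A\cos(\sqrt{\lambda}\,t) + B\sin(\sqrt{\lambda}\,t), \qquad
  x(0) = 0 \;\Rightarrow\; A = 0, \qquad
  x(1) = 0 \;\Rightarrow\; \sin\sqrt{\lambda} = 0 .
\]
% Hence the eigenvalues and eigenfunctions are
\[
  \lambda_k = k^2\pi^2, \qquad x_k(t) = \sin(k\pi t), \qquad k = 1, 2, \dots
\]
% Variational characterization of the first eigenvalue (Rayleigh quotient):
\[
  \lambda_1 = \pi^2 = \min_{\substack{x \in W_0^{1,2}(0,1) \\ x \ne o}}
  \frac{\int_0^1 |\dot x(t)|^2\,dt}{\int_0^1 |x(t)|^2\,dt} ,
\]
% with the minimum attained at x_1(t) = \sin(\pi t).
```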
Semilinear Problems – Partial Differential Equations

$-\Delta u(x) = \mu u(x)$ in $\Omega$, $\quad u = 0$ on $\partial\Omega$
Krein–Rutman Theorem (Example 5.4.40)

$-\Delta u(x) = g(u(x)) + f(x)$ in $\Omega$, $\quad u = 0$ on $\partial\Omega$
Schauder Fixed Point Theorem – existence of a classical solution (Theorem 7.2.1)
Global extrema – existence of a weak solution (Theorem 7.7.1)

$-\Delta u(x) = g(x, u(x))$ in $\Omega$, $\quad u = 0$ on $\partial\Omega$
Contraction Principle – existence of a weak solution (Theorem 7.4.1)
Schauder Fixed Point Theorem – existence of a weak solution (Proposition 7.4.2, Theorems 7.4.3 and 7.4.6)
Monotone operators – existence of a weak solution (Theorem 7.6.2)

$-\Delta u(x) = \lambda u(x) + h(x, u(x))$ in $\Omega$, $\quad u = 0$ on $\partial\Omega$
Leray–Schauder degree – existence of a weak solution (Theorems 7.5.2 and 7.5.3, Exercises 7.5.8, 7.5.9, 7.5.10 and 7.5.11)
Saddle Point Theorem – existence of a weak solution (Theorem 7.7.5)

$-\Delta u(x) + \lambda u(x) = |u(x)|^{p-2}u(x)$ in $\Omega$, $\quad u = 0$ on $\partial\Omega$
Mountain Pass Theorem – existence of a positive weak solution (Theorem 7.7.3)
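
The last problem in this list is the standard model for the mountain pass geometry. The sketch below makes assumptions that are not read off from Theorem 7.7.3: $2 < p < 2^*$ (subcritical growth), $\lambda > -\lambda_1$ with $\lambda_1$ the principal eigenvalue of $-\Delta$, and the norm $\|u\| = \left(\int_\Omega |\nabla u|^2\,dx\right)^{1/2}$ on $W_0^{1,2}(\Omega)$. The constants $c_0$ and $C_p$ are names introduced here.

```latex
% Energy functional of  -\Delta u + \lambda u = |u|^{p-2}u  in \Omega,  u = 0 on \partial\Omega.
\[
  E(u) = \frac12\int_\Omega |\nabla u|^2\,dx + \frac{\lambda}{2}\int_\Omega u^2\,dx
         - \frac1p\int_\Omega |u|^p\,dx , \qquad E(o) = 0 .
\]
% (a) A "mountain ridge" around the origin.  With c_0 = \min\{1, 1+\lambda/\lambda_1\} > 0
%     and the Sobolev embedding \|u\|_{L^p} \le C_p\|u\| (assumed constant C_p):
\[
  E(u) \ge \frac{c_0}{2}\,\|u\|^2 - \frac{C_p^{\,p}}{p}\,\|u\|^p > 0
  \qquad\text{for } \|u\| = r \text{ with } r > 0 \text{ small, since } p > 2 .
\]
% (b) A point beyond the ridge with negative energy.  For fixed u \ne o and t \to \infty:
\[
  E(tu) = \frac{t^2}{2}\Bigl(\|u\|^2 + \lambda\int_\Omega u^2\,dx\Bigr)
          - \frac{t^p}{p}\int_\Omega |u|^p\,dx \;\longrightarrow\; -\infty .
\]
```

Together with a compactness condition of Palais–Smale type, (a) and (b) are the geometric ingredients that a mountain pass argument requires.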
Quasilinear Problems – Ordinary Differential Equations

$\bigl(|\dot x(t)|^{p-2}\dot x(t)\bigr)^{\cdot} + \lambda|x(t)|^{p-2}x(t) = 0, \quad x(0) = x(1) = 0$
Lagrange Multiplier Method (Example 6.3.5)
Lusternik–Schnirelmann Method (Example 6.4.47)

$-\bigl(|\dot x(t)|^{p-2}\dot x(t)\bigr)^{\cdot} + g(x(t)) = f(t), \quad x(0) = x(1) = 0$
Topological degree of Browder and Skrypnik (Example 5.2.51)
Browder Theorem (Example 5.3.24)

$-\bigl(|\dot x(t)|^{p-2}\dot x(t)\bigr)^{\cdot} = f(t, x(t)), \quad x(0) = x(1) = 0$
Supersolutions and subsolutions combined with the topological degree (Appendix 5.4B)

$-\bigl(|\dot x(t)|^{p-2}\dot x(t)\bigr)^{\cdot} + \lambda x(t) = |x(t)|^{r-2}x(t), \quad x(0) = x(1) = 0$
Mountain Pass Theorem (Exercise 6.4.26)
Quasilinear Problems – Partial Differential Equations

$-\Delta_p u(x) + g(x, u(x), \nabla u(x)) = f(x)$ in $\Omega$, $\quad u = 0$ on $\partial\Omega$
Leray–Lions Theorem – existence of a weak solution (Appendix 7.6A)

$-\Delta_p u(x) = \lambda_1|u(x)|^{p-2}u(x) + f(x, u(x)) - h(x)$ in $\Omega$, $\quad u = 0$ on $\partial\Omega$
Saddle Point Theorem – existence of a weak solution (Appendix 7.7A)

$-\Delta_p u(x) = \lambda|u(x)|^{p-2}u(x) + f(\lambda, x, u(x))$ in $\Omega$, $\quad u = 0$ on $\partial\Omega$
Topological degree of Browder and Skrypnik – bifurcation result (Appendix 7.5A)
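
For the quasilinear problems above, the notion of weak solution rests on the divergence form of the $p$-Laplacian. The following lines recall the standard definition and the resulting weak formulation of the second problem in the list. They are a reminder under the usual assumptions ($u, v \in W_0^{1,p}(\Omega)$, $p > 1$, all integrals finite), not a quotation from Appendix 7.7A.

```latex
% Standard definition of the p-Laplacian (p > 1):
\[
  \Delta_p u = \operatorname{div}\bigl(|\nabla u|^{p-2}\nabla u\bigr),
  \qquad \Delta_2 u = \Delta u .
\]
% Weak formulation of
%   -\Delta_p u = \lambda_1 |u|^{p-2}u + f(x,u) - h(x)  in \Omega,  u = 0 on \partial\Omega:
% multiply by a test function v, integrate over \Omega and use the Green formula.
\[
  \int_\Omega |\nabla u|^{p-2}\nabla u\,\nabla v\,dx
  = \lambda_1\int_\Omega |u|^{p-2}u\,v\,dx
  + \int_\Omega f(x,u)\,v\,dx - \int_\Omega h\,v\,dx
  \qquad\text{for all } v \in W_0^{1,p}(\Omega).
\]
% For p = 2 this reduces to the identity used in the hint to Exercise 7.7.13.
```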
Comparison of Bifurcation Results Presented in This Book

Bifurcation results presented in this book are based on the following three basic tools:
Implicit Function Theorem (Crandall–Rabinowitz Local Bifurcation Theorem – Theorem 4.3.22),
Degree Theory (Krasnoselski Bifurcation Theorem – Theorem 5.2.23, Rabinowitz Global Bifurcation Theorem – Theorem 5.2.34, Dancer Global Bifurcation Theorem – Theorem 5.2.38),
Variational Principles (Krasnoselski Potential Bifurcation Theorem – Theorem 6.3.26).
Below we present a brief discussion of these results and point out differences and links among them.

A bifurcation result based on the Implicit Function Theorem provides very precise information about the structure of the set of nontrivial solutions near the bifurcation point – it is expressed in terms of a differentiable curve. Moreover, the result is obtained for “non potential” equations. On the other hand, to verify the assumptions, relatively strong smoothness assumptions on the nonlinearity are required and the information about the set of all nontrivial solutions has only local character. Also, the assumption that the dimension of the kernel of the linear part must be equal to 1 represents a relatively strong restriction.

Bifurcation results based on the Degree Theory do not require smoothness of the nonlinearity at all. They allow one to treat “non potential” equations as well and provide also some information about the global structure of the set of nontrivial solutions. On the other hand, the multiplicity of an eigenvalue of the “linear part” must be odd, the set of nontrivial solutions need not be a curve (even in a small neighborhood of the bifurcation point) and its global structure may be unclear if there is no additional information about “higher order terms”.

A bifurcation result based on Variational Principles holds for any multiplicity of an eigenvalue of the “linear part”. The price to be paid for that consists in the fact that the equation has to possess a potential. It provides only local information about nontrivial solutions and the structure of the set of all nontrivial solutions might be “very wild” in general.

We can summarize the above discussion in the following table.
| Bifurcation result | Basic tool to prove it | Form of the equation | Multiplicity m of an eigenvalue of the linear part | Character of the information | Smoothness required |
|---|---|---|---|---|---|
| Crandall–Rabinowitz Theorem (Theorem 4.3.22) | Implicit Function Theorem | non potential in general | m = 1 | local | twice continuously differentiable in a neighborhood |
| Krasnoselski Theorem (Theorem 5.2.23) | Degree Theory | non potential in general | m ≥ 1, m odd | local | continuous |
| Rabinowitz Theorem (Theorem 5.2.34) | Degree Theory | non potential in general | m ≥ 1, m odd | global | continuous |
| Dancer Theorem (Theorem 5.2.38) | Degree Theory | non potential in general | m = 1 | global | continuous |
| Krasnoselski Theorem (Theorem 6.3.26) | Lagrange Multiplier Method | potential only | arbitrary m ≥ 1 | local | twice continuously differentiable at the point |
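
The restrictions on the multiplicity and on the form of the equation are not merely technical. The following classical two-dimensional system, given here as an illustration that is not taken from this book, has a linear part with an eigenvalue of multiplicity 2 and a perturbation without a potential, and no bifurcation occurs. The symbol $N$ for the perturbation is a name introduced for this sketch.

```latex
% A system in R^2 whose linear part (the identity) has the eigenvalue 1 with multiplicity m = 2:
\[
  \lambda x_1 = x_1 + x_2^{3}, \qquad
  \lambda x_2 = x_2 - x_1^{3}, \qquad (x_1, x_2) \in \mathbb{R}^2 .
\]
% Multiply the first equation by x_2, the second by x_1, and subtract:
\[
  0 = \lambda x_1x_2 - \lambda x_1x_2
    = \bigl(x_1x_2 + x_2^{4}\bigr) - \bigl(x_1x_2 - x_1^{4}\bigr)
    = x_1^{4} + x_2^{4} .
\]
% Hence x_1 = x_2 = 0 for every \lambda: only the trivial solution exists,
% and \lambda = 1 is not a bifurcation point.
% The perturbation N(x) = (x_2^3, -x_1^3) is not a gradient, because
\[
  \frac{\partial N_1}{\partial x_2} = 3x_2^{2} \;\ne\; -3x_1^{2} = \frac{\partial N_2}{\partial x_1}
  \qquad\text{for } (x_1, x_2) \ne (0,0) .
\]
```

This is consistent with the table: the multiplicity is even, so the degree-theoretic results do not apply, and the equation has no potential, so the variational result does not apply either.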
List of Symbols

Sets and spaces
$M \times N$: Cartesian product of sets $M$ and $N$
$\overline{M}$: closure of the set $M$
$\overline{M}^{w}$: weak closure of the set $M$
$\partial M$: boundary of the set $M$
$\operatorname{int} M$: interior of the set $M$
$\exp M$: set of all subsets of $M$
$\sup M$: lowest upper bound (supremum) of the set $M$
$\inf M$: greatest lower bound (infimum) of the set $M$
$\mathbb{N}$: set of all positive integers, $\mathbb{N} = \{1, 2, \dots\}$
$\mathbb{Z}$: set of all integers
$\mathbb{Q}$: set of all rational numbers
$\mathbb{R}$: set of all real numbers
$\mathbb{C}$: set of all complex numbers
$\mathbb{R}^N, \mathbb{C}^N$: real, or complex space of dimension $N \in \mathbb{N}$
[…]: set of scalars (in general)
$M^{\perp}, N_{\perp}$: nullsets
$M^{\perp}$: orthogonal complement of the set $M$
$X^{\#}$: (algebraic) dual space of the linear space $X$
$X^{*}$: dual (adjoint) space of the Banach space $X$
$M \perp N$: orthogonality of the sets $M$ and $N$
$[a, b]$: order interval in $X$, $[a, b] = \{x \in X : a \le x \le b\}$
$U(x), V(x), \dots$: neighborhood of the point $x$
$\{U_n\}$: covering
$B(a; r)$: open ball centered at the point $a \in X$ with radius $r > 0$
$\partial B(a; r)$: sphere centered at the point $a \in X$ with radius $r > 0$
$S^k$: $k$-dimensional sphere in $\mathbb{R}^N$
$K, K_r$: order cone and $K_r = \{x \in K : \|x\| \le r\}$
$X^{*,+}$: set of all positive functionals
$\operatorname{Lin} M$: span of the elements of the set $M$
$\operatorname{Co} M$: convex hull of the elements of the set $M$
$M + N$: set of all $z = x + y$, $x \in M$, $y \in N$
$X \oplus Y$: algebraic direct sum of the spaces $X$ and $Y$
$\dim X$: dimension of the linear space $X$
$\operatorname{codim} X$: co-dimension of the linear subspace $X$
$X_{|Y}$: factor space $X$ over $Y$
$M$: manifold
$TM$: tangent bundle
$T^{*}M$: cotangent bundle
$H^p(M)$: cohomology group of the manifold $M$
$H_1(M)$: fundamental group of the manifold $M$
$\{\alpha_n\}$: partition of unity

Elements
$0$: zero number in $\mathbb{R}$ or $\mathbb{C}$
$o$: zero element of a (topological, . . . , Banach, Hilbert) space $X$ including $\mathbb{R}^N$ and $\mathbb{C}^N$

Special spaces and classes
$l^p$: space of all sequences $\{x_n\}_{n=1}^{\infty}$ with $\sum_{n=1}^{\infty} |x_n|^p < \infty$
$l^{\infty}$: space of all bounded sequences
$c_0$: space of all sequences with zero limit and with the sup norm
$L(X, Y)$: space of all linear operators from $X$ into $Y$
$\mathcal{L}(X, Y)$: space of all linear continuous operators from $X$ into $Y$
$B_2(X, Y)$: space of all bilinear continuous operators from $X$ into $Y$
$C(X, Y), C[a, b]$: space of all continuous maps from $X$ into $Y$, and of the closed interval $[a, b]$
$C^k(X, Y)$: space of all maps from $X$ into $Y$ with continuous derivatives up to order $k$
$C^k(\overline{M})$: space of all maps from $C^k(M)$ for which continuous derivatives up to order $k$ are bounded in $M$
$C^{k,\gamma}(\overline{M})$: space of all $\gamma$-Hölder continuous, bounded functions in $M$ with continuous derivatives up to order $k$
$C_0^k(X, Y)$: space of all maps from $X$ into $Y$ with continuous derivatives up to order $k$ and with compact supports in $X$
$\mathcal{D}(M)$: class of all infinitely differentiable functions on $M$ which have compact supports lying in $M$
$BC(X)$: space of all bounded, continuous maps on $X$
$L^p(M)$: Lebesgue space on $M$
$L^{\infty}(M)$: space of all classes of essentially bounded functions
$L^p_{\mathrm{loc}}(M)$: space of all classes of functions which are in $L^p$ on every compact subset of $M$
$W^{k,p}(M), W_0^{k,p}(M)$: Sobolev space on $M$
$T_aM$: tangent space of the differentiable manifold $M$ at the point $a$
$(T_aM)^{*}$: cotangent space of the differentiable manifold $M$ at the point $a$
$\mathrm{CAR}(M)$: class of all Carathéodory functions on $M$
$H(M)$: class of all holomorphic functions on a neighborhood of the closed set $M$
$\Gamma_a$: class of all smooth curves
$\Lambda^n(x)$: class of all skew-symmetric $n$-linear forms
$[f]$: class of mutually homotopic continuous maps containing $f$
$C^{0,1}$: class of all domains with Lipschitz boundary
$\mathcal{C}(X, Y)$: class of all compact operators from $X$ into $Y$
$\mathcal{C}_f(X, Y)$: class of all finite-dimensional compact operators from $X$ into $Y$

Maps, functions and operators
$\operatorname{Dom} f$: domain of the map $f$
$\operatorname{Im} f$: range (image) of the map $f$, set of all values $f(x)$, $x \in \operatorname{Dom} f$
$f\colon X \to Y$: $f$ is a map from $X$ into $Y$ (does not automatically mean that either $\operatorname{Dom} f = X$ or $\operatorname{Im} f = Y$)
$\operatorname{Ker} f$: kernel of the linear operator $f$, null-space of $f$
$f_{-1}(M)$: set of all preimages, $f_{-1}(M) = \{x \in \operatorname{Dom} f : f(x) \in M\}$
$f^{-1}$: inverse map to the map $f$
$f^{+}, f^{-}$: positive and negative part of $f$, respectively, $f^{+} = \max\{f, 0\}$, $f^{-} = \max\{-f, 0\}$, $f = f^{+} - f^{-}$
$f \circ g$: composition of the maps $f$ and $g$, $f \circ g = f(g)$
$O$, $O$: zero map and zero matrix, respectively, $Ox = o$, $x \in X$

Special functions and maps
$J$: Jacobi matrix
$J_f$: determinant of the Jacobi matrix of the map $f$ (Jacobian)
$f_1', f_x'$: partial Fréchet derivative (in the infinite dimension) of $f$ with respect to the first variable $x$
$\delta f(a; v)$, $\delta^k f(a; v_1, \dots, v_k)$: first (and $k$th) Gâteaux derivative of the map $f$ at the point $a$ in the direction $v$ (directions $v_1, \dots, v_k$)
$Df(a), D^k f(a)$: first (and $k$th) Gâteaux differential of the map $f$ at the point $a$
$D_w^{\alpha} f$: $\alpha$-weak derivative of the map $f$
$\nabla f$: gradient of the map $f$
$\operatorname{curl} f$: curl of the map $f$, $\operatorname{curl} f = \nabla \times f$
$\operatorname{div} f$: divergence of the map $f$, $\operatorname{div} f = \sum_{i=1}^{N} \frac{\partial f_i}{\partial x_i}$
$\Delta f$: Laplace operator of the map $f$ (Laplacian), $\Delta f = \sum_{i=1}^{N} \frac{\partial^2 f}{\partial x_i^2}$, $\Delta f = \operatorname{div}(\nabla f)$
$\Delta_p f$: $p$-Laplacian of the map $f$
$\Delta_M f$: Laplace–Beltrami operator of the map $f$ on the manifold $M$
$L_g f$: directional (Lie) derivative of the map $f$ (in the direction of the vector field $g$)
$[f, g]$: commutator (Lie bracket) of the vector fields $f$ and $g$
$\langle x, y \rangle$: duality between the Banach spaces $X^{*}$ and $X$, $\langle x, y \rangle = x(y)$, $x \in X^{*}$, $y \in X$
$(x, y)$: scalar (inner) product in the linear space $X$ or in $\mathbb{R}^N$, $\mathbb{C}^N$ ($xy := (x, y)_{\mathbb{R}^N}$ in Chapter 7), $(x, y)_{\mathbb{R}^N} = \sum_{i=1}^{N} x_i y_i$, $x, y \in \mathbb{R}^N$, and $(x, y)_{\mathbb{C}^N} = \sum_{i=1}^{N} x_i \overline{y}_i$, $x, y \in \mathbb{C}^N$, respectively
$\|x\|$, $\|x\|_{\mathbb{R}^N} \equiv \|x\|_2$: norm in the normed space $X$ and Euclidean norm in $\mathbb{R}^N$ ($|x| := \|x\|_{\mathbb{R}^N}$ in Chapter 7), respectively, $\|x\|_{\mathbb{R}^N}^2 = \sum_{i=1}^{N} x_i^2$, $x \in \mathbb{R}^N$
$x \wedge y$: exterior product in the linear space $X$
$x \times y$: cross (vector) product in $\mathbb{R}^3$
$X \subset Y$, $X \subset\subset Y$: the space $X$ is continuously (and compactly) embedded into the space $Y$
$\operatorname{meas} M$: Lebesgue measure of the set $M$
$\operatorname{dist}(x, M)$, $\operatorname{dist}(M, N)$: distance of the point $x$ from the set $M$, and of the sets $M$ and $N$, respectively, $\operatorname{dist}(M, N) = \sup_{x \in M} \inf_{y \in N} \varrho(x, y)$
$\operatorname{diam} M$: diameter of the set $M \subset (X, \varrho)$, $\operatorname{diam} M = \sup_{x, y \in M} \varrho(x, y)$
$\operatorname{Re} f$: real part of $f$
$\operatorname{Im} f$: imaginary part of $f$
$f_{*}(v)$: push-forward; the map $f$ pushes forward a tangent vector $v$
$f^{*}g$: pull-back; the map $f$ pulls back a differential form $g$
$\deg(f, M, a)$: degree of the map $f$ at the point $a$ with respect to the set $M$
$\operatorname{ind} f$: index of the Fredholm operator $f$, $\operatorname{ind} f = \dim \operatorname{Ker} f - \dim \operatorname{Ker} f^{*}$
$\operatorname{Ind}_{\gamma} a$: index of the point $a$ with respect to the curve $\gamma$
$\operatorname{supp} f$: support of the map $f$
$\sigma(f)$: spectrum (set of all eigenvalues) of the map $f$
$\varrho(f)$: resolvent set of the map $f$, $\varrho(f) = \mathbb{C} \setminus \sigma(f)$
$\omega_f$: $(M-1)$-form of the vector field $f$ and of the $M$-form $\omega$

Others
$x := y$, $y =: x$: let $x$ be $y$; $x$ is defined by $y$; denote $y$ as $x$
a.a.: almost all (in the sense of the Lebesgue measure)
a.e.: almost everywhere, almost every (in the sense of the Lebesgue measure)
$x_n \to x$: strong convergence of the sequence $\{x_n\}_{n=1}^{\infty}$ to the element $x$
$x_n \rightharpoonup x$: weak convergence of the sequence $\{x_n\}_{n=1}^{\infty}$ to the element $x$
$x_n \overset{*}{\rightharpoonup} x$: weak star convergence of the sequence $\{x_n\}_{n=1}^{\infty}$ to the element $x$
$x_n \Rightarrow x$: uniform convergence of the sequence $\{x_n\}_{n=1}^{\infty}$ to the element $x$
$p' = \frac{p}{p-1}$: exponent conjugate to $p$
$p^{*} = \frac{Np}{N-kp}$: critical Sobolev exponent
$x > o$, $x \ge o$, $x \gg o$: ordering in the Banach space $X$, $o, x \in X$
Index a posteriori estimate, 92 a priori estimate, 92, 280 absolute neighborhood extensor, 443 absolutely continuous function, 39 accumulation point, 85 adjoint operator, 12, 68, 72 algebraic, 11 admissible homotopy, 271 Alaoglu–Bourbaki theorem, 68 algebra, 31 Banach, 31 Lie, 195 normed, 31 alternative Fredholm, 14, 82 problem, 166 antipodal point, 450 theorem (Borsuk), 242 approximation theorem (Weierstrass), 250 approximative unit, 236 Arzel` a–Ascoli theorem, 31 atlas, 159 equivalent, 185 ball, open, 25 Banach algebra, 31 space, 28 ordered, 330 Banach–Steinhaus theorem, 58 basis, 2 dual, 10 Hamel, 2 orthonormal, 43 positive, 215
Schauder, 40 standard, 3 Bessel inequality, 44 bifurcation, 150 branch, 176 diagram, 175 equation, 166 global Dancer theorem, 300 Rabinowitz theorem, 295 local Crandall–Rabinowitz theorem, 174 Krasnoselski theorem, 290 pitchfork, 177 point, 174 potential (Krasnoselski) theorem, 417 transcritical, 177 bijective linear operator, 7 mapping, 2 bilinear form, 195 operator, 58, 129 Bochner integral, 110 theorem, 111 Borel measure, 39, 63 Borsuk antipodal theorem, 242 Borsuk–Ulam theorem, 245 boundary, 25 condition, 91 Dirichlet, 91 mixed, 91 Neumann, 91 periodic, 91 Lipschitz, 474
546 of manifold, 218 value problem, 96 bounded operator, 264 brachistochrone problem, 366 brackets Lie, 193 Poisson, 193 branch, 176 bread–ham–cheese theorem, 247 Brouwer degree, 238, 269, 275 fixed point theorem, 253 Browder theorem, 323 bundle cotangent, 187 tangent, 187 calculus, functional, 22, 90 Dunford, 22, 113 canonical embedding, 9, 65 Carath´eodory property, 126 Cartesian coordinates, 142 category, Lusternik–Schnirelmann, 443 Cauchy integral formula, 22 Cauchy sequence, 27 Cauchy–Riemann conditions, 230 central manifold, 154 chain rule, 121 characteristic equation, 15 of partial differential, 191 function of set, 36 chart, 159 classical solution, 74, 476 closed differential form, 202 graph theorem, 60 operator, 59 set, 25 weakly sequentially, 381 closure, 25 weak, 77 codimension, 9 coercive functional, weakly, 380 operator, 323, 385 weakly, 310 cohomology group, 205
Index coincidence degree, 284 commutator, 195 compact embedding, 40 operator linear, 77 nonlinear, 256 set, 26 relatively, 26 space, 26 sequentially, 26 compactness in C(T ), 31 in Lp (Ω), 35 comparison principle, 349 complement direct, 5 orthogonal, 47 topological, 175 complete metric space, 27 completely integrable system, 192 completion, 28 complex linear space, 2 complexification, 4 of operator, 342 component, 27 concentration compactness principle, 521 condensing operator, 264 condition boundary, 91 Dirichlet, 91 mixed, 91 Neumann, 91 periodic, 91 Cauchy–Riemann, 230 Euler necessary, 362 growth, 294, 506, 510 sublinear, 488 initial, 89 integrability, 192 Lagrange necessary, 362 sufficient, 363 Landesman–Lazer type, 285, 497 Lipschitz, 107 monotonicity in principal part, 323 Nagumo-type, 281
Index Palais–Smale ((PS)c ), 432 on manifold, 449 (S+ ), 303 sign, 281, 515 cone, 330 order, 330 dual, 342 normal, 332 total, 342 conjugate exponent, 33 connected space, 27 constant, Lipschitz, 96 constrained extremum, 401 maximum, 401 minimum, 401 continuous embedding, 36 extension, 27 linear form, 50 mapping, 26 operator, 29 contractible set, 414 contraction, 92 k-set, 264 principle, 92 convergence, 25 in strong operator topology, 58 uniform, 30 locally, 30 weak, 65, 68 star, 68 convex functional, 365 strictly, 365 hull, 314 set, 13 convolution, 35 coordinates, 3 Cartesian, 142 local, 159 nonlinear, 142 polar, 142 spherical, 142, 143 cotangent space, 187 Courant–Fischer principle, 409
547 Courant–Weinstein variational principle, 411 covering, 26 Crandall–Rabinowitz local bifurcation theorem, 174 critical growth, 521 point, 160, 362 non-degenerate, 170 Sobolev exponent, 39 value, 160 cross product, 200 cubic spline, 395 curl of vector fields, 201 curve integral of function, 208 of one-form, 212 null-homotopic, 206 oriented, positively, 230 Peano, 241 simple, 224 Dancer global bifurcation theorem, 300 Darbo theorem, 265 deformation lemma, 429, 437, 448 degree Brouwer, 238, 269, 275 coincidence, 284 generalized monotone operator, 304 Leray–Schauder, 278 demicontinuity, 303 dense set, 25 density theorem, 35 dependent functions, 164 derivative directional, 117 second, 129 distributional, 39 Fr´echet, 122 second, 130 Gˆ ateaux, 117 second, 129 Lie, 192 partial Fr´echet, 124 Gˆ ateaux, 124 weak, 38
548 diagram, bifurcation, 175 diameter, 261 diffeomorphism, 142 differentiable manifold, 159 infinite dimensional, 189 with boundary, 218 differential equation characteristic, 191 exterior, 206 in Banach space, 106 form, 197 closed, 202 exact, 202 on manifold, 197 smooth, 201 of differential form, 201 on manifold, 187 operator, 74 Dirac measure, 38 direct sum, 5 directional derivative, 117 second, 129 Dirichlet boundary condition, 91 kernel, 57 distribution, 38 distributional derivative, 39 divergence, 225 domain of class C k,γ , 474 with Lipschitz boundary, 474 dual basis, 10 characterization of norm, 61 order cone, 342 space, 10, 61 duality lemma, 447 mapping, 121 Dunford functional calculus, 22, 113 Eberlain–Smulyan theorem, 67 eigenfunction, 305, 404 principal, 405, 490 eigenvalue, 14, 305 principal, 405, 490 simple, 174, 342
Index eigenvector, 14 Ekeland variational principle, 439 element, finite, 396 embedding, 159 canonical, 9, 65 compact, 40 continuous, 36 energy functional, 379, 515 ε-net, 27 equality, Parseval, 48 equation bifurcation, 166 characteristic, 15 differential characteristics, 191 exterior, 206 in Banach space, 106 Euler, 368, 371 in variations, 151 integral first kind, 82 second kind, 82 well-posed, 82 equicontinuity, 31 equivalence of norms, 29 estimate a posteriori, 92 a priori, 92, 280 Schauder, 475 Euler equation, 368, 371 necessary condition, 362 exact differential form, 202 example Beals, 98 Br´ezis–Nirenberg, 428 Brouwer, 275 Edelstein, 101 Kakutani, 256 Weierstrass, 374 exponent conjugate, 33 Sobolev critical, 39 extensor, absolute neighborhood, 443 exterior differential equation, 206 product, 197
Index extreme value theorem, 378 extremum global, 373 local, 361 constrained, 401 strict, 362 factor space, 9 Fatou lemma, 34 Fermat necessary condition, 362 finite elements method, 396 intersection property, 51 first integral, 165 fixed point, 92 Floquet multiplier, 255 form bilinear, 195 differential closed, 202 exact, 202 on manifold, 197 smooth, 201 Jordan canonical, 19 linear, 10 continuous, 50, 61 skew-symmetric, 196 formula Cauchy integral, 22 Green, 226, 479 Leray–Schauder index, 289 product, 293 Taylor, 130 Fr´echet derivative, 122 partial, 124 second, 130 differentiability, 122 Fredholm alternative, 14, 82 operator, 70, 283 Frobenius theorem, 193, 207 function Bochner integrable, 110 continuous absolutely, 39 H¨ older, 38 Lipschitz, 38
549 uniformly, 27 differentiable on manifold, 187 essentially bounded, 33 Green, 75 holomorphic, 145 strongly measurable, 110 test, 485 vector-valued, 64 functional calculus, 22, 90, 113 coercive, weakly, 380 convex, 365 strictly, 365 energy, 379, 515 Minkowski, 62 positive, 342 sublinear, 61 weakly sequentially continuous, 378 lower semi-continuous, 377 functions dependent, 164 independent, 164 fundamental group, 206 lemma in calculus of variations, 365 matrix, 254 theorem of algebra, 15, 228 Gˆ ateaux derivative, 117 partial, 124 second, 129 differentiability, 117 variation, 117 Gauss–Ostrogradski theorem, 225 Gauss–Seidel iterative method, 392 general minimax principle, 440, 449 on manifold, 449 generalized inverse, 283 geometry, mountain pass, 431 global extremum, 373 inverse function theorem, 144 maximum, 373 minimum, 373 gradient, 52, 118, 200 Gramm matrix, 213
550 Graves theorem, 105 greatest lower bound, 3 Green formula, 226, 479 function, 75 theorem, 224 Gronwall inequality, 260 group cohomology, 205 fundamental, 206 Lie, 195 topological, 195 growth condition, 294, 506, 510 critical, 521 subcritical, 521 sublinear, 488 Hahn–Banach theorem, 61 half-linear differential operator, 305 Hamel basis, 2 Hamilton–Cayley theorem, 17 Hammerstein operator, 133 Heaviside function, 38 Hermite interpolation, 398 polynomial, 49 Hess matrix, 132 Hilbert space, 41 Hilbert–Schmidt operator, 79 theorem, 88 holomorphic function, 145 homeomorphism, 26 homotopic mappings, 205 homotopy, 205 admissible, 271 invariance property, 238, 279, 304 H¨ older continuous function, 38 inequality, 33 hull, convex, 314 hyperbolic stationary point, 154, 176 hyperplane, 11 supporting, 120 identity Jacobi, 195
Index parallelogram, 42 polarization, 42 image, 7 immersion, 159 implicit function theorem, 147 independent functions, 164 index Leray–Schauder formula, 289 of fixed point, 230 of Fredholm operator, 70 of isolated solution, 276, 304 inequality Bessel, 44 Gronwall, 260 H¨ older, 33 Minkowski, 33 Poincar´e, 52 Schwartz, 41 triangle, 25, 28 infinite dimensional differentiable manifold, 189 initial condition, 89 value problem, 93 injection, linear, 7 injective linear operator, 7 integrability condition, 192 integral Bochner, 110 curve of function, 208 of one-form, 212 equation first kind, 82 second kind, 82 first, 165 formula (Cauchy), 22 manifold, 192, 207 of differential form, 216 operator (Volterra), 86 Riemann, 105 weak, 110 interior, 25 interpolation Hermite, 398 Lagrange, 397 interval, order, 330
Index invariance of domain, 241 invariant subspace, 15 inverse, 7 function theorem global, 144 local, 140 generalized, 283 matrix, 7 right, 160, 283 isomorphism in algebraic sense, 7 in topological sense, 29 Jacobi identity, 195 matrix, 118 Jacobian, 118 Jentzsch theorem, 350 generalized, 341 Jordan canonical form, 19 cell, 19 separation theorem, generalization, 241 Kakutani counterexample, 256 kernel, 7 Krasnoselski theorem bifurcation local, 290 potential, 417 minorant, 339 Krein–Rutman proposition, 342 theorem, 343 k-set contraction, 264 Kuratowski measure of noncompactness, 261 Lagrange interpolation, 397 multiplier, 402 method, 402 necessary condition, 362 sufficient condition, 363 Landesman–Lazer type conditions, 285, 497 Laplace operator, 226, 473 p-Laplacian, 500
551 one-dimensional, 305 Laplace–Beltrami operator, 226 Lax–Milgram proposition, 50 Le Shujie theorem, 440 Lebesgue measure, 32 space, 33 lemma duality, 447 Fatou, 34 fundamental in calculus of variations, 365 quantitative deformation, 429, 437, 448 Riemann–Lebesgue, 59 Urysohn, 246 Zorn, 2 Leray–Lions theorem, 323 Leray–Schauder continuation method, 280 degree, 278 formula, 289 Lie algebra, 195 brackets, 193 derivative, 192 group, 195 ring, 195 linear form, 10 operator, 5 space, 1 linking theorem, 465 Lipschitz boundary, 474 condition, 107 constant, 96 continuous function, 38 local bifurcation theorem, 174, 290 chart, 159 coordinates, 159 extremum, 361 inverse function theorem, 140 maximum, 361 minimum, 361 parametrization, 182 stable manifold, 154
552 locally finite system, 209 Lipschitz continuous mapping, 93 lower solution, 334 lowest upper bound, 3 Lusternik theorem, 403 Lusternik–Schnirelmann category, 443 method, 443 theorem, 246 Luzin theorem, 35 Lyapunov–Schmidt reduction, 166 manifold, 40 central, 154 differentiable, 159 infinite dimensional, 189 with boundary, 218 integral, 192, 207 orientable, 215 oriented, 215 simply connected, 205 stable, 154 local, 154 submanifold, 212 mapping bijective, 2 class C k , 186 continuous, 26 locally Lipschitz, 93 uniformly Lipschitz, 94 contractive, 92 duality, 121 homotopic, 205 non-expansive, 98 odd, 242 Poincar´e, 152, 254 proper, 230 retraction, 465 set contraction, 264 matrix fundamental, 254 Gramm, 213 Hess, 132 inverse, 7 Jacobi, 118 rank of, 14 regular, 14
Index representation, 6 transpose of, 11 Mawhin theorem, 284 maximum global, 373 local, 361 constrained, 401 strict, 362 mean value theorem, 119 measurable function, strongly, 110 measure Borel, 39, 63 Dirac, 38 Lebesgue, 32 of noncompactness (Kuratowski), 261 method finite elements, 396 Gauss–Seidel iterative, 392 Lagrange multiplier, 402 Leray–Schauder continuation, 280 Lusternik–Schnirelmann, 443 monotone iterative, 335 Newton, 97, 134 Ritz, 389 metric, 24 Riemann, 214 space, 24 symmetry of, 25 mild solution, 116 minimax principle, 408 general, 440, 449 minimizing sequence, 375, 390 minimum global, 373 local, 361 constrained, 401 strict, 362 Minkowski functional, 62 inequality, 33 minorant, 339 principle, 339 Minty trick, 326 mixed boundary condition, 91 M¨ obius strip, 215 mollifier, 35 monotone
Index convergence theorem, 34 iterative method, 335 operator, 310 decreasing, 332 increasing, 332 strictly, 310 strictly decreasing, 332 strictly increasing, 332 strongly, 310 strongly decreasing, 332 strongly increasing, 332 monotonicity in principal part, 323 Morse theorem, 170 mountain pass theorem, 432 Ambrosetti–Rabinowitz, 441 type geometry, 431 multiindex, 37 multiplicity of eigenvalue, 15, 84 multiplier Floquet, 255 Lagrange, 402 Nagumo-type condition, 281 neighborhood, 24 weak, 66 Nemytski operator, 125 net (ε-net), 27 Neumann boundary condition, 91 Newton method, 97, 134 Newton–Robin boundary condition, 91 nilpotent operator, 19 non-expansive mapping, 98 nonresonance problem, 305, 497 norm, 27 equivalent, 29 induced by scalar product, 41 of linear operator, 29 on Cartesian product, 124 normal cone, 332 normed algebra, 31 linear space, 28 null-homotopic curve, 206 number, winding, 231 odd mapping, 242 open
553 ball, 25 mapping theorem, 58 set, 24 operator adjoint, 12, 68, 72 algebraic, 11 bilinear, 58, 129 bounded, 264 closed, 59 coercive, 323, 385 weakly, 310 compact finite dimensional, 256 linear, 77 nonlinear, 256 condensing, 264 continuous, 29 demicontinuous, 303 differential, 74 half-linear, 305 Fredholm, 70, 283 Hammerstein, 133 Hilbert–Schmidt, 79 integral (Volterra), 86 inverse, 7 generalized, 283 right, 160, 283 Laplace, 226, 473 p-Laplacian, 500 Laplace–Beltrami, 226 linear, 5 bijective, 7 injective, 7 isomorphism, 7 surjective, 7 monotone, 310 decreasing, 332 increasing, 332 strictly, 310 strictly decreasing, 332 strictly increasing, 332 strongly, 310 strongly decreasing, 332 strongly increasing, 332 Nemytski, 125 nilpotent, 19 norm, 29
554 of finite rank, 78 positive, 332, 409 strictly, 332 strongly, 332 projection, 8 self-adjoint, 69 shift left, 10 right, 10 Sturm–Liouville, 88 sublinear, 61 substitution, 125 superposition, 125 unitary, 49 Volterra integral, 86 order cone, 330 dual, 342 interval, 330 ordered Banach space, 330 set, 2 ordering, 2 orientable manifold, 215 orientation, 215 induced, 221 oriented curve, 230 manifold, 215 orthogonal complement, 47 projection, 47 set, 43 orthogonalization, 43 orthonormal basis, 43 system, 43 outer normal vector, 215 Palais–Smale condition ((PS)c ), 432 on manifold, 449 parallelogram identity, 42 parametrization, local, 182 Parseval equality, 48 partition of unity, 209 Peano curve, 241 periodic condition, 91 Perron theorem, 350
generalized, 341 Pettis theorem, 112 Pfaff system, 207 Picard theorem, 93 pitchfork bifurcation, 177 p-Laplacian, 305, 500 non-well-ordered case, 355 well-ordered case, 353 Poincaré inequality, 52 mapping, 152, 254 theorem, 203 point antipodal, 450 bifurcation, 174 critical, 160, 362 non-degenerate, 170 fixed, 92 regular, 160 singular, 156 stationary, 176 hyperbolic, 176 stationary hyperbolic, 154 Poisson brackets, 193 polar coordinates, 142 polarization identity, 42 polynomial characteristic, 15 Hermite, 49 Taylor, 130 positive basis, 215 functional, 342 operator, 332, 409 strictly, 332 strongly, 332 solution, 338 subsolution, 338 positively oriented curve, 230 potential, 118, 202, 416 bifurcation theorem, 417 principal eigenfunction, 405 eigenvalue, 405 principle comparison, 349 concentration compactness, 521
contraction, 92 Courant–Fischer, 409 Courant–Weinstein variational, 411 Ekeland variational, 439 minimax, 408 general, 440, 449 general on manifold, 449 minorant, 339 super- and subsolutions, 334 uniform boundedness, 57 problem boundary value, 96 brachistochrone, 366 initial value, 93 nonresonance, 305, 497 regularity, 319 resonance, 497 product cross, 200 exterior, 197 formula, 293 scalar, 41 vector, 200 projection, 8 orthogonal, 47 projective space, 450 proper mapping, 230 property Carathéodory, 126 finite intersection, 51 homotopy invariance, 238, 279, 294, 304 proposition Bochner, 111 Browder, 98 Euler necessary condition, 362 Kolmogorov, 36 Krein–Rutman, 342 Lagrange necessary condition, 362 Lax–Milgram, 50 Leray–Schauder index formula, 289 Riesz, 32 Schauder, 81 Skrypnik, 304 Taylor formula, 130 (PS)_c condition, 432 pseudogradient, 436 vector field, 436
tangent, 447 pull-back, 189 push-forward, 186 quantitative deformation lemma, 429, 437, 448 Rabinowitz global bifurcation theorem, 295 linking theorem, 465 saddle point theorem, 464 radius, spectral, 57 rank of matrix, 14 theorem, 162 real linear space, 2 reducing subspace, 15 reduction (Lyapunov–Schmidt), 166 reflexive space, 66 regular matrix, 14 point, 160 value, 160 regularity of classical solution, 371 of weak solution, 372 problem, 319 theory, 483 relatively compact set, 26 relaxation parameter, 392 Rellich–Kondrachov theorem, 40 resolvent, 56 set, 56 resonance problem, 497 retraction, 465, 470 Riemann integral, 105 metric, 214 sum, 106 Riemann–Lebesgue lemma, 59 Riesz proposition, 32 representation theorem, 50 Riesz–Fischer theorem, 49 Riesz–Schauder theory, 82 right inverse, 160, 283 Ritz method, 389 Rothe theorem, 279
Rouché theorem, 231 rule, chain, 121 (S_+) condition, 303 saddle point theorem, 459 Rabinowitz, 464 Sard theorem, 245, 272 scalar product, 41 Schauder basis, 40 estimates, 475 fixed point theorem, 257 proposition, 81 Schmidt orthogonalization, 43 Schwartz inequality, 41 self-adjoint operator, 69 semi-continuity, 377 semi-norm, 61 separable space, 25 separation theorem, 62 sequence Cauchy, 27 minimizing, 375, 390 sequentially continuous functional weakly, 378 set closed, 25 weakly sequentially, 381 compact, 26 relatively, 26 contractible, 414 contraction, 264 convex, 13 dense, 25 diameter, 261 open, 24 weakly, 66 ordered, 2 orthogonal, 43 resolvent, 56 symmetric, 242 shift left, 10 right, 10 sign condition, 281, 515 simple curve, 224 eigenvalue, 174, 342
simply connected manifold, 205 singular point, 156 skew-symmetric form, 196 Sobolev critical exponent, 39 embedding theorem, 39 space, 39 solution classical, 74, 476 lower, 334 mild, 116 of variational problem, 389 operator, 352 positive, 338 strong, 74 upper, 334 weak, 319, 483–485 regularity of, 319, 372 space Banach, 28 completion, 28 ordered, 330 compact, 26 sequentially, 26 complete, 27 connected, 27 cotangent, 187 dual, 10, 61 factor, 9 Hilbert, 41 Lebesgue, 33 linear, 1 complex, 2 normed, 28 real, 2 metric, 24 complete, 27 of bounded sequences, 10 of compact linear operators, 77 of continuous functions, 30 of continuous linear operators, 55 of differentiable functions, 37 of integrable functions, 32 of linear operators, 5 projective, 450 reflexive, 66 separable, 25
Sobolev, 39 tangent, 182 topological, 24 uniformly convex, 65 with scalar product, 41 span, 2 spectral radius, 57 spectrum, 14, 56 spherical coordinates, 142, 143 spline cubic, 395 stability, 176 stable manifold, 154 local, 154 standard basis, 3 stationary point hyperbolic, 154, 176 non-hyperbolic, 176 step function, 110 Stokes theorem, 225 abstract, 222 Stone–Weierstrass theorem, 31 strictly monotone operator, 310 strong operator topology, 58 solution, 74 strongly measurable function, 110 monotone operator, 310 Sturm–Liouville operator, 88 subcritical growth, 521 sublinear functional, 61 growth condition, 260, 488 operator, 61 submanifold, 212 subsolution, 334, 339, 351 positive, 338 strict, 334, 352 strong, 334 subspace, 2 closed linear, 60 invariant, 15 reducing, 15 substitution operator, 125 sum, direct, 5 superposition operator, 125
supersolution, 334, 351 strict, 334, 352 strong, 334 support, 35, 209 supporting hyperplane, 120 surjection, linear, 7 surjective linear operator, 7 symmetric set, 242 symmetry of metric, 25 system completely integrable, 192 locally finite, 209 orthonormal, 43 Pfaff, 207 tangent bundle, 187 pseudogradient vector field, 447 space, 182 vector, 182 Taylor formula, 130 polynomial, 130 test function, 485 theorem Alaoglu–Bourbaki, 68 Ambrosetti–Rabinowitz, 441 Arzelà–Ascoli, 31 Banach–Steinhaus, 58 bifurcation, global Dancer, 300 Rabinowitz, 295 bifurcation, local (Crandall–Rabinowitz), 174 Bochner, 111 Borsuk antipodal, 242 Borsuk–Ulam, 245 Brézis–Nirenberg, 440 bread–ham–cheese, 247 Browder, 323 chain rule, 122 closed graph, 60 contraction principle, 92 Courant–Fischer principle, 409 Courant–Weinstein variational principle, 411 Crandall–Rabinowitz, 174 Dancer, 300
Darbo, 265 density, 35 dual characterization of norm, 61 Dunford functional calculus, 113 Eberlein–Smulyan, 67 Ekeland, 439 Euler necessary condition, 362 extreme value, 378 fixed point Brouwer, 253 Schauder, 257 Frobenius, 193, 207 functional calculus, 22 fundamental lemma in calculus of variations, 365 of algebra, 15, 228 Gauss–Ostrogradski, 225 Graves, 105 Green, 224 Hahn–Banach, 61 Hamilton–Cayley, 17 Hilbert–Schmidt, 88 implicit function, 147 invariance of domain, 241 inverse function global, 144 local, 140 Jentzsch, 350 generalized, 341 Jordan canonical form, 19 separation, generalization, 241 Krasnoselski local bifurcation, 290 minorant, 339 potential bifurcation, 417 Krein–Rutman, 343 Lagrange multiplier method, 402 necessary condition, 362 sufficient condition, 363 Le Shujie, 440 Leray–Lions, 323 linking, 465 Lusternik, 403 Lusternik–Schnirelmann, 246 Luzin, 35
Mawhin, 284 mean value, 119 minimax principle, 408, 440, 449 monotone convergence, 34 iterative method, 335 Morse, 170 mountain pass, 432 Ambrosetti–Rabinowitz, 441 on non-well-ordered case, 355 on well-ordered case, 353 open mapping, 58 Perron, 350 generalized, 341 Pettis, 112 Picard, 93 Poincaré, 203 Rabinowitz, 295, 464, 465 rank, 162 regularity of classical solution, 371 of weak solution, 372 Rellich–Kondrachov, 40 Riesz representation, 50 Riesz–Fischer, 49 Riesz–Schauder theory, 82 Rothe, 279 Rouché, 231 saddle point, 459 Rabinowitz, 464 Sard, 245, 272 separation, 62 Skrypnik, 134, 303–305 Sobolev embedding, 39 Stokes, 225 abstract, 222 Stone–Weierstrass, 31 Taylor, 130 Tietze, 236 trace, 484 uniform boundedness, 57 Weierstrass, 31 approximation, 250 Willem, 517 Zeidler, 339 theory regularity, 483
Riesz–Schauder, 82 Tietze theorem, 236 topological complement, 175 group, 195 space, 24 topology strong operator, 58 weak, 66 total cone, 342 trace, 484 theorem, 484 transcritical bifurcation, 177 transpose of matrix, 11 triangle inequality, 25, 28 triangulation, 396 uniform boundedness principle, 57 convergence, 30 locally, 30 uniformly continuous function, 27 convex space, 65 Lipschitz continuous mapping, 94 unit, approximative, 236 unitary operator, 49 upper solution, 334 Urysohn lemma, 246 value critical, 160 regular, 160 variation Gâteaux, 117 variational principle, 389
vector -valued function, 64 field on manifold, 190 pseudogradient, 436 of outer normal, 220 product, 200 tangent, 182 Volterra integral operator, 86 weak closure, 77 convergence, 65, 68 star, 68 derivative, 38 integral, 110 neighborhood, 66 solution, 319, 483–485 topology, 66 weakly coercive functional, 380 operator, 310 open set, 66 sequentially closed set, 381 continuous functional, 378 lower semi-continuous functional, 377 Weierstrass example, 374 theorem, 31 well-posed equation, 82 winding number, 231 Zorn’s lemma, 2
Bibliography
[1] Adams, J.F., Lectures on Lie Groups, W.A. Benjamin, Inc., New York, 1969.
[2] Adams, R.A., Sobolev spaces, Academic Press, New York, 1975.
[3] Alexander, J.C., A primer on connectivity, pp. 455–483, in: Fixed Point Theory (Fadell, E. & Fournier, G., eds.), Lecture Notes in Math. 886, Springer Verlag, Berlin–Heidelberg–New York, 1981.
[4] Amann, H., Ordinary Differential Equations. An Introduction to Nonlinear Analysis, de Gruyter Stud. in Math. 13, de Gruyter, Berlin–New York, 1990.
[5] Amann, H. & Weiss, S., “On the uniqueness of the topological degree”, Math. Z. 130 (1973), 39–54.
[6] Ambrosetti, A. & Prodi, G., A Primer of Nonlinear Analysis, Cambridge Univ. Press, Cambridge, 1993.
[7] Anane, A., “Simplicité et isolation de la première valeur propre du p-laplacien avec poids”, C. R. Math. Acad. Sci. Paris 305 (1987), 725–728.
[8] Appell, J. & Zabreiko, P.P., Nonlinear Superposition Operators, Cambridge Univ. Press, Cambridge, 1990.
[9] Arcoya, D. & Orsina, L., “Landesman–Lazer conditions and quasilinear elliptic equations”, Nonlinear Anal. 28 (1997), 1623–1632.
[10] Aubin, J.P. & Ekeland, I., Applied Nonlinear Analysis, Wiley, New York, 1984.
[11] Aubin, T., A Course in Differential Geometry, Amer. Math. Soc., Providence, RI, 2000.
[12] Berkovitz, L.D., Optimal Control Theory, Springer Verlag, Berlin–Heidelberg–New York, 1974.
[13] Böhme, R., “Die Lösung der Verzweigungsprobleme für nichtlineare Eigenwertprobleme”, Math. Z. 127 (1972), 105–126.
[14] Bourbaki, N., Éléments de Mathématique, vol. Livre VI, séries Intégration, Hermann et Cie, Paris, 1952.
[15] Bourbaki, N., Groupes et Algèbres de Lie, Hermann et Cie, Paris, 1975.
[16] Brenner, S.C. & Scott, L.R., The Mathematical Theory of Finite Element Methods, 2nd edition, Springer Verlag, Berlin–Heidelberg–New York, 2002.
[17] Bröcker, T. & Dieck, T., Representations of Compact Lie Groups, Springer Verlag, Berlin–Heidelberg–New York, 1985.
[18] Browder, F.E., Problèmes Nonlinéaires, Université de Montréal, Montréal, 1966.
[19] Browder, F.E., “Nonlinear elliptic boundary value problems and the generalized topological degree”, Bull. Amer. Math. Soc. 76 (1970), 999–1005.
[20] Cañada, A., Drábek, P., & Fonda, A., (eds.), Handbook on Differential Equations, Ordinary Differential Equations, vol. 1, Elsevier, North-Holland, Amsterdam, 2004.
[21] Cartan, H., Calcul différentiel, Formes différentielles, Hermann et Cie, Paris, 1967.
[22] Cesari, L., Hale, J.K., & La Salle, J., (eds.), Dynamical Systems, vol. I, Academic Press, New York, 1976.
[23] Chavel, I., Eigenvalues in Riemann Geometry, Academic Press, New York, 1984.
[24] Chillingworth, D.R.J., Differential Topology with View to Applications, Pitman Publ., Harlow, 1976.
[25] Chow, S.-N., Li, C., & Wang, D., Normal Forms and Bifurcation of Planar Vector Fields, Cambridge Univ. Press, Cambridge, 1994.
[26] Citlanadze, E.S., “Existence theorems for minimax points in Banach spaces”, Tr. Mosk. Mat. Obs. 3 (1953), 235–274, in Russian.
[27] Coddington, E.A. & Levinson, N., Theory of Ordinary Differential Equations, McGraw–Hill, Inc., New York–Toronto–London, 1955.
[28] Conway, J.B., A Course in Functional Analysis, Springer Verlag, Berlin–Heidelberg–New York, 1990.
[29] Crandall, M. & Rabinowitz, P.H., “Bifurcation from simple eigenvalues”, J. Funct. Anal. 8 (1971), 321–340.
[30] Dancer, E.N., “On the structure of solutions of nonlinear eigenvalue problems”, Indiana Univ. Math. J. 23 (1974), 1069–1076.
[31] Davies, B. & Safarov, Y., (eds.), Spectral Theory and Geometry, Cambridge Univ. Press, Cambridge, 1999.
[32] de Boor, C., A Practical Guide to Splines, Appl. Math. Sci. 27, Springer Verlag, Berlin–Heidelberg–New York, 2001.
[33] De Coster, C. & Habets, P., The lower and upper solutions method for boundary value problems, in Cañada et al. [20], pp. 69–160.
[34] Deimling, K., Nonlinear Functional Analysis, Springer Verlag, Berlin–Heidelberg–New York, 1985.
[35] Dieudonné, J., Foundations of Modern Analysis, Academic Press, New York–London, 1960.
[36] Dold, A., Lectures on Algebraic Topology, Springer Verlag, Berlin–Heidelberg–New York, 1982.
[37] Došlý, O., Halflinear differential equations, in Cañada et al. [20], pp. 161–357.
[38] Drábek, P., “Continuity of Nemyckij’s operator in Hölder spaces”, Comment. Math. Univ. Carolin. 16 (1975), 1, 37–57.
[39] Drábek, P., Solvability and Bifurcations of Nonlinear Equations, Pitman Research Notes Math. Ser. 264, Longman Scientific & Technical, Harlow, 1992.
[40] Drábek, P., Girg, P., & Manásevich, R., “Generic Fredholm alternative-type results for one dimensional p-Laplacian”, NoDEA Nonlinear Differential Equations Appl. 8 (2001), 285–298.
[41] Drábek, P., Krejčí, P., & Takáč, P., Nonlinear Differential Equations, CRC Research Notes Math. 404, Chapman & Hall/CRC, Boca Raton, FL–London–New York–Washington, DC, 1999.
[42] Drábek, P., Kufner, A., & Nicolosi, F., Quasilinear Elliptic Equations with Degenerations and Singularities, de Gruyter Ser. Nonlinear Anal. Appl., de Gruyter, Berlin–New York, 1997.
[43] Dugundji, J., Topology, Brown Publ., Dubuque, IA, 1989.
[44] Dunford, N. & Schwartz, J.T., Linear Operators, vol. I. General Theory, Intersci. Publ., New York–London–Sydney–Toronto, 1958.
[45] Dunford, N. & Schwartz, J.T., Linear Operators, vol. II, Intersci. Publ., New York–London–Sydney–Toronto, 1963.
[46] Edmunds, D.E. & Evans, W.D., Spectral Theory and Differential Operators, Oxford Science Publications, Clarendon Press, Oxford, 1987.
[47] Elbert, Á., A half-linear second order differential equation, pp. 153–180, in: Qualitative Theory of Differential Equations, vol. I and II (Szeged, 1979), Colloq. Math. Soc. János Bolyai, 30, North-Holland, Amsterdam, 1981.
[48] Evans, L.C., Partial Differential Equations, Amer. Math. Soc., Providence, RI, 1998.
[49] Fabian, M., Habala, P., Hájek, P., Montesinos, V., Pelant, J., & Zizler, V., Functional Analysis and Infinite-Dimensional Geometry, CMS Books Math. / Ouvrages Math. SMC 8, Springer Verlag, Berlin–Heidelberg–New York, 2001.
[50] Fitzpatrick, P.M., “Homotopy, linearization, and bifurcation”, Nonlinear Anal. 12 (1988), 171–184.
[51] Flucher, M., Variational Problems with Concentration, Birkhäuser, Basel–Boston–Berlin, 1999.
[52] Folland, G., A Course in Abstract Harmonic Analysis, Chapman & Hall/CRC, Boca Raton, FL, 1995.
[53] Fučík, S., Solvability of Nonlinear Equations and Boundary Value Problems, D. Reidel Publ., Dordrecht, 1980.
[54] Fučík, S. & Kufner, A., Nonlinear Differential Equations, Elsevier, Amsterdam–Oxford–New York, 1980.
[55] Fučík, S. & Nečas, J., “Ljusternik–Schnirelmann theorem and nonlinear eigenvalue problems”, Math. Nachr. 53 (1972), 277–289.
[56] Fučík, S., Nečas, J., Souček, J., & Souček, V., Spectral Analysis of Nonlinear Operators, Lecture Notes in Math. 346, Springer Verlag, Berlin–Heidelberg–New York, 1973.
[57] Gaines, R.E. & Mawhin, J., Coincidence Degree and Nonlinear Differential Equations, Lecture Notes in Math. 568, Springer Verlag, Berlin–Heidelberg–New York, 1977.
[58] Ghoussoub, N., Duality and Perturbation Methods in Critical Point Theory, Cambridge Univ. Press, Cambridge, 1993.
[59] Gilbarg, D. & Trudinger, N.S., Elliptic Partial Differential Equations of Second Order, Springer Verlag, Berlin–Heidelberg–New York, 2001.
[60] Goebel, K., “An elementary proof of the fixed-point theorem of Browder and Kirk”, Michigan Math. J. 16 (1969), 381–383.
[61] Greenberg, M., Lectures in Algebraic Topology, W.A. Benjamin, Inc., New York, 1967.
[62] Gripenberg, C., Londen, S.O., & Staffans, O., Volterra Integral and Functional Equations, Cambridge Univ. Press, Cambridge, 1990.
[63] Hale, J., Ordinary Differential Equations, Wiley, New York–London–Sydney–Toronto, 1969.
[64] Halmos, P., Finite-Dimensional Vector Spaces, Van Nostrand, Princeton, NJ, 1960.
[65] Hamilton, R.S., “The inverse function theorem of Nash and Moser”, Bull. Amer. Math. Soc. 7 (1982), 65–222.
[66] Helgason, S., Differential Geometry, Lie Groups, and Symmetric Spaces, Academic Press, New York, 1978.
[67] Hirsch, M.W., Differential Topology, Springer Verlag, Berlin–Heidelberg–New York, 1976.
[68] Hlaváček, I. & Nečas, J., Mathematical Theory of Elastic and Elastoplastic Bodies: An Introduction, Elsevier, Amsterdam, 1981.
[69] Izé, J.A., Bifurcation Theory for Fredholm Operators, Mem. Amer. Math. Soc. 174, Amer. Math. Soc., Providence, RI, 1971.
[70] Kaczmarz, S. & Steinhaus, H., Theorie der Orthogonalreihen, Monografje Matematyczne, Państwo Wydawnistwo Naukowe, Warszawa–Lwow, 1935.
[71] Kakutani, S., “Some characterization of Euclidian space”, Japan. J. Math. 16 (1939), 93–97.
[72] Kantorovich, “Functional analysis and applied mathematics”, Uspekhi Mat. Nauk 3 (1948), 3, 89–185 (in Russian).
[73] Kato, T., Perturbation Theory for Linear Operators, Springer Verlag, Berlin–Heidelberg–New York, 1966.
[74] Katok, A. & Hasselblatt, B., Introduction to the Modern Theory of Dynamical Systems, Cambridge Univ. Press, 1995.
[75] Kelley, J.L., General Topology, Van Nostrand, Princeton, NJ, 1957.
[76] Kittel, Ch., Knight, W.D., & Ruderman, M.A., Mechanics, Berkeley Physics Course, vol. I, McGraw–Hill, Inc., New York–San Francisco–Toronto–London, 1965.
[77] Kosniowski, C., A First Course in Algebraic Topology, Cambridge Univ. Press, Cambridge, 1980.
[78] Krasnoselski, M.A., Topological Methods in the Theory of Nonlinear Integral Equations, Pergamon, Oxford, 1964.
[79] Krasnoselski, M.A. & Zabreiko, P.P., Geometric Methods of Nonlinear Analysis, Springer Verlag, Berlin–Heidelberg–New York, 1984.
[80] Krawcewicz, W. & Wu, J., Theory of Degrees with Applications to Bifurcations and Differential Equations, Wiley, New York, 1997.
[81] Křížek, M. & Neittaanmäki, P., Finite Element Approximation of Variational Problems and Applications, Pitman Mon. Surv. Pure Appl. Math. 50, Longman Scientific & Technical, Harlow, 1990.
[82] Kufner, A., John, O., & Fučík, S., Function Spaces, Academia and Noordhoff, Prague and Leyden, 1975.
[83] Landesman, E.N. & Lazer, A.C., “Nonlinear perturbations of linear elliptic boundary value problems at resonance”, J. Math. Mech. 19 (1970), 609–623.
[84] Leinfelder, H. & Simader, C.G., “Schroedinger operators with singular magnetic vector fields”, Math. Z. 176 (1981), 1–19.
[85] Leray, J. & Lions, J.-L., “Quelques résultats de Višik sur les problèmes elliptiques nonlinéaires par les méthodes de Minty–Browder”, Bull. Soc. Math. France 93 (1965), 97–107.
[86] Lindqvist, P., “On the equation $\operatorname{div}(|\nabla u|^{p-2}\nabla u) + \lambda|u|^{p-2}u = 0$”, Proc. Amer. Math. Soc. 109 (1990), 157–164.
[87] Lions, P.L., “The concentration–compactness principle in the calculus of variations. The locally compact case I and II”, Ann. Inst. H. Poincaré Anal. Non Linéaire 1 (1984), 109–145 and 223–283.
[88] Lions, P.L., “The concentration–compactness principle in the calculus of variations. The limit case I”, Rev. Mat. Iberoamericana 1 (1985), 145–201.
[89] Lions, P.L., “The concentration–compactness principle in the calculus of variations. The limit case II”, Rev. Mat. Iberoamericana 2 (1985), 45–121.
[90] Lusternik, L. & Sobolev, V., Elements of Functional Analysis, “Nauka”, Moscow, 1965 (in Russian; English translation published by Wiley, New York, 1974).
[91] Mawhin, J., Topological Degree Methods in Nonlinear Boundary Value Problems, Regional Conference Series in Mathematics 40, Amer. Math. Soc., Providence, RI, 1979.
[92] Mawhin, J., “A simple approach to Brouwer degree based on differential forms”, Adv. Nonlinear Stud. 4 (2004), 535–548.
[93] Maz’ja, V.G., Sobolev spaces, Springer Verlag, Berlin–Heidelberg–New York, 1985.
[94] Michlin, S.G., Variationsmethoden in der Mathematischen Physik, Deutscher Verlag der Wissenschaften, Berlin, 1960.
[95] Milnor, J., Topology from the Differentiable Viewpoint, Univ. of Virginia Press, Charlottesville, 1965.
[96] Milota, J. & Petzeltová, H., “An existence theorem for semilinear functional parabolic equations”, Časopis Pěst. Mat. 110 (1985), 274–288.
[97] Moser, J., “A rapidly convergent iteration method and nonlinear partial differential equations I, II”, Ann. Scuola Norm. Sup. Pisa Cl. Sci. 20 (1966), 265–315, 499–535.
[98] Nash, J., “The imbedding problem for Riemann manifolds”, Ann. of Math. 63 (1956), 20–63.
[99] Nečas, J., Les méthodes directes en théorie des équations elliptiques, Masson et Cie, Paris, 1967.
[100] Nirenberg, L., Topics in Nonlinear Functional Analysis, New York Univ., Courant Inst. Math. Sci., New York, 1974.
[101] Palis, J. & De Melo, W., Geometric Theory of Dynamical Systems, Springer Verlag, Berlin–Heidelberg–New York, 1982.
[102] Protter, M.H. & Weinberger, H.F., Maximum Principle in Differential Equations, Prentice Hall, Englewood Cliffs, NJ, 1967.
[103] Rabinowitz, P.H., A global theorem for nonlinear eigenvalue problems and applications, pp. 11–36, in: Contribution in Nonlinear Functional Analysis (Zarantonello, E.H., ed.), Academic Press, New York, 1971.
[104] Rabinowitz, P.H., “Some global results for nonlinear eigenvalue problems”, J. Funct. Anal. 7 (1971), 487–513.
[105] Rabinowitz, P.H., Minimax Methods in Critical Point Theory with Applications to Differential Equations, Amer. Math. Soc., Providence, RI, 1986.
[106] Reed, M. & Simon, B., Methods of Modern Mathematical Physics, vol. IV, Academic Press, New York, 1978.
[107] Rektorys, K., Variational Methods in Mathematics, Science and Engineering, 2nd edition, D. Reidel Publ., Dordrecht, 1990.
[108] Robinson, D.W., Elliptic Operators and Lie Groups, Oxford Univ. Press, Oxford, 1991.
[109] Rockafellar, R.T., Convex Analysis, Princeton Univ. Press, Princeton, NJ, 1970.
[110] Rosenberg, S., The Laplacian on a Riemann Manifold, London Math. Soc., London, 1991.
[111] Rothe, E.H., Introduction to Various Aspects of Degree Theory in Banach Spaces, Amer. Math. Soc., Providence, RI, 1986.
[112] Rudin, W., Functional Analysis, McGraw–Hill, Inc., New York, 1973.
[113] Rudin, W., Real and Complex Analysis, McGraw–Hill, Inc., New York, 1974.
[114] Ruelle, D., Elements of Differentiable Dynamics and Bifurcation Theory, Academic Press, New York, 1989.
[115] Runst, T. & Sickel, W., Sobolev Spaces of Fractional Order, Nemytskij Operators and Nonlinear Partial Differential Equations, De Gruyter Series in Nonlinear Analysis and Applications 3, de Gruyter, Berlin–New York, 1996.
[116] Saaty, T.L., Modern Nonlinear Equations, McGraw–Hill, Inc., New York–Toronto–London–Sydney, 1967.
[117] Sard, A., “The measure of the critical set values of differential mappings”, Bull. Amer. Math. Soc. 48 (1942), 883–890.
[118] Schwartz, J.T., Nonlinear Functional Analysis, Gordon & Breach, New York–London–Paris, 1969.
[119] Sehgal, “A fixed point theorem for mapping with a contractive iterate”, Proc. Amer. Math. Soc. 23 (1969), 631–634.
[120] Singer, I., Best Approximation in Normed Linear Spaces by Elements of Linear Subspaces, Springer Verlag, Berlin–Heidelberg–New York, 1970.
[121] Skrypnik, I.V., Nonlinear Elliptic Boundary Value Problems, Teubner, Leipzig, 1986.
[122] Spanier, E., Algebraic Topology, McGraw–Hill, Inc., New York, 1966.
[123] Stein, E.M., Singular Integrals and Differentiability Properties of Functions, Princeton Univ. Press, Princeton, NJ, 1970.
[124] Sternberg, S., Lectures on Differential Geometry, Prentice Hall, Englewood Cliffs, NJ, 1964.
[125] Stoer, J. & Bulirsch, R., Introduction to Numerical Analysis, 3rd edition, Texts in Applied Mathematics 12, Springer Verlag, Berlin–Heidelberg–New York, 2002.
[126] Takáč, P., “A short elementary proof of the Krein–Rutman theorem”, Houston J. Math. 20 (1994), 1, 93–98.
[127] Taylor, M., Partial Differential Equations, vol. I, series Basic Theory, Springer Verlag, Berlin–Heidelberg–New York, 1996.
[128] Triebel, H., Theory of Function Spaces, vol. I, Birkhäuser, Basel, 1983.
[129] Triebel, H., Theory of Function Spaces, vol. II, Birkhäuser, Basel, 1992.
[130] Vejvoda, O. et al., Periodic Solutions of Partial Differential Equations: Time Periodic Solutions, Sijthoff Noordhoff, The Netherlands, 1981.
[131] Walter, W., Ordinary Differential Equations, Graduate Texts in Mathematics, Springer Verlag, Berlin–Heidelberg–New York, 1998.
[132] Whitney, H., “A function not constant on a connected set of critical points”, Duke Math. J. 1 (1935), 514–517.
[133] Whitney, H., Geometric Integration Theory, Princeton Univ. Press, Princeton, NJ, 1957.
[134] Willem, M., Minimax Theorems, Birkhäuser, Boston–Basel–Berlin, 1996.
[135] Yosida, K., Functional Analysis, Springer Verlag, Berlin–Heidelberg–New York, 1965.
[136] Zeidler, E., Nonlinear Functional Analysis and Its Applications, vol. I, II/A, II/B, III and IV, Springer Verlag, Berlin–Heidelberg–New York, 1986.
[137] Ziemer, W.P., Weakly Differentiable Functions: Sobolev Spaces and Functions of Bounded Variation, Springer Verlag, Berlin–Heidelberg–New York, 1989.
Typeset by LaTeX 2ε with AMS fonts and BibTeX. Figures were sketched using PSTricks (with the aid of Mathematica) and Matlab.